Python Bioinformatics ②: Extracting a Protein Sequence from a PDB File

Environment

OS version      : Win10 x64
python_version  : Python 3.6.5      

Example code

# Map three-letter residue names to one-letter amino acid codes
aa_codes = {
    'ALA':'A', 'CYS':'C', 'ASP':'D', 'GLU':'E',
    'PHE':'F', 'GLY':'G', 'HIS':'H', 'LYS':'K',
    'ILE':'I', 'LEU':'L', 'MET':'M', 'ASN':'N',
    'PRO':'P', 'GLN':'Q', 'ARG':'R', 'SER':'S',
    'THR':'T', 'VAL':'V', 'TYR':'Y', 'TRP':'W'}
seq = ''

# SEQRES records list the residues of the structure, up to 13 per line
with open("1TLD.pdb") as pdb_file:
    for line in pdb_file:
        if line[0:6] == "SEQRES":
            columns = line.split()
            # The first four columns are the record name, serial number,
            # chain ID and residue count; residue names start at index 4
            for resname in columns[4:]:
                seq = seq + aa_codes[resname]

# Print the sequence in FASTA format, 64 residues per line
print(">1TLD")
for i in range(0, len(seq), 64):
    print(seq[i:i + 64])
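
The script above works for a single-chain entry such as 1TLD that contains only the 20 standard residues. For other files it would raise a KeyError on a modified residue (for example MSE) and would join all chains into one sequence. The following is a minimal sketch of one way to handle both cases, not part of the original post: the function names read_seqres and to_fasta, and the choice of 'X' as the placeholder for unknown residues, are assumptions made for illustration. It reads the chain ID and residue names from the fixed columns of the SEQRES record (column 12 and columns 20-70) instead of splitting on whitespace.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
AA_CODES = {
    'ALA': 'A', 'CYS': 'C', 'ASP': 'D', 'GLU': 'E',
    'PHE': 'F', 'GLY': 'G', 'HIS': 'H', 'LYS': 'K',
    'ILE': 'I', 'LEU': 'L', 'MET': 'M', 'ASN': 'N',
    'PRO': 'P', 'GLN': 'Q', 'ARG': 'R', 'SER': 'S',
    'THR': 'T', 'VAL': 'V', 'TYR': 'Y', 'TRP': 'W'}


def read_seqres(lines, unknown='X'):
    """Return {chain_id: one-letter sequence} built from SEQRES records.

    `lines` is any iterable of PDB lines (an open file works).
    Residue names missing from AA_CODES are written as `unknown`
    (hypothetical default, chosen for this sketch).
    """
    chains = {}
    for line in lines:
        if not line.startswith("SEQRES"):
            continue
        chain_id = line[11:12] or ' '      # column 12: chain identifier
        residues = line[19:70].split()     # columns 20-70: residue names
        letters = ''.join(AA_CODES.get(name, unknown) for name in residues)
        chains[chain_id] = chains.get(chain_id, '') + letters
    return chains


def to_fasta(name, seq, width=64):
    """Format one sequence as a FASTA record with `width` residues per line."""
    rows = [seq[i:i + width] for i in range(0, len(seq), width)]
    return '\n'.join(['>' + name] + rows)
```

To use it on the same file, open "1TLD.pdb", pass the file object to read_seqres, and print to_fasta("1TLD_" + chain_id, seq) for each chain in the returned dictionary.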