---
tags: python
---
Solving Bioinformatics Problems with Python (3)
===

## Exercise 1

First, download herpesvirus_genome.json:
[herpesvirus_genome.json](https://muddle2.cs.huji.ac.il/ru19/course/view.php?id=68&section=3)

### A) Find the frequency of each amino acid in the herpesvirus proteome.

```python
import os
import json
from collections import Counter

# The r prefix marks a raw string literal: backslashes are kept as-is
# instead of being interpreted as escape sequences.
DIR = r'C:\downloads'

# Use a with-block so the file is closed automatically.
with open(os.path.join(DIR, 'herpesvirus_genome.json'), 'r') as f:
    data = json.load(f)

# Divide each amino acid's count by the total number of amino acids.
def normalize(counter):
    total = sum(counter.values())
    return {key: count / total for key, count in counter.items()}

# Create an empty counter.
all_aa_counter = Counter()

# It helps to open herpesvirus_genome.json in an editor such as Notepad++ first.
# data['coding_regions'] is a list with one entry per coding region.
# Each coding region has a 'translation' key
# that stores the amino-acid sequence of that coding region.
for coding_region in data['coding_regions']:
    # Counter.update() on a string adds one count per character (amino acid).
    all_aa_counter.update(coding_region['translation'])

all_aa_freq = normalize(all_aa_counter)
print(all_aa_freq)
```
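
To check the counting and normalization logic without the real file, here is a minimal sketch on made-up data. The `toy_data` dictionary and its two short sequences are invented for illustration; only the `coding_regions` / `translation` layout follows the structure described above, and the real JSON file may contain additional keys.

```python
from collections import Counter

# Hypothetical stand-in for the parsed JSON; the sequences are made up.
toy_data = {
    'coding_regions': [
        {'translation': 'MKV'},
        {'translation': 'MKKA'},
    ]
}

def normalize(counter):
    # Convert raw counts into relative frequencies that sum to 1.
    total = sum(counter.values())
    return {key: count / total for key, count in counter.items()}

toy_counter = Counter()
for coding_region in toy_data['coding_regions']:
    # Passing a string to update() counts each character separately.
    toy_counter.update(coding_region['translation'])

toy_freq = normalize(toy_counter)

# Print amino acids from most to least frequent.
for aa, freq in sorted(toy_freq.items(), key=lambda item: item[1], reverse=True):
    print(aa, round(freq, 3))
```

The frequencies are relative to all amino acids in all coding regions combined, so the values in `toy_freq` add up to 1. The same sorting step can be applied to `all_aa_freq` to make the real output easier to read.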