# STARS 2023

https://hackmd.io/@foss/stars2023

## Links and resources

- [3Blue1Brown channel - wordle guesses](https://youtu.be/fRed0Xmc2Wg)
- [DNA Subway](https://dnasubway.cyverse.org/)
- [DNA Subway Bioinformatics](https://dnabarcoding101.org/lab/bioinformatics.html) (scroll down to Blue Line)
    - Also see: [YouTube](https://www.youtube.com/live/rMtHaA-KMss?feature=share)

## Learning more bioinformatics after the class

**Notebooks used in this course**

- Biocoding 2020 Notebooks [link](https://github.com/JasonJWilliamsNY/biocoding-2020-notebooks)
    - You can download these materials: [link](https://github.com/JasonJWilliamsNY/biocoding-2020-notebooks/archive/master.zip)

**General Coding**

- CodeCademy: [link](https://www.codecademy.com/)
- Hour of code (also in languages other than English): [link](https://code.org/learn)

**Software installations**

Be sure you have permission to install software.

- Try Ubuntu: [link](https://tutorials.ubuntu.com/tutorial/try-ubuntu-before-you-install#0)
- Python: [link](https://www.python.org/downloads/)
- Jupyter: [link](https://jupyter.org/)
- Wing IDE: [link](https://wingware.com/)
- Atom text editor: [link](https://atom.io/)

**Bioinformatics**

- Learn bioinformatics in 100 hours: [link](https://www.biostarhandbook.com/edu/course/1/)
- Rosalind bioinformatics: [link](http://rosalind.info/about/)
- Bioinformatics coursera: [link](https://www.coursera.org/learn/bioinformatics)
- Bioinformatics careers: [link](https://www.iscb.org/bioinformatics-resources-for-high-schools/careers-in-bioinformatics)

**Help**

- General software help: [link](https://stackoverflow.com/)
- Bioinformatics-specific software help: [link](https://www.biostars.org/)

### Friday pizza preferences

[Menu](https://www.southdownpizza.com/)

---

### Group Projects

**Place presentations in the [Friday Google Drive Folder](https://drive.google.com/drive/folders/115Yrti0QFvK-NSe7NH__VWGPo6Zy8BUV?usp=sharing)**

RABT (Restriction Analysis & Bacterial Transformation)

1. Madelyn Eisenberg
2. Sury Guadalupe-Pena
3. Joaquin Martin

HMS (Human Mitochondrial Sequencing)

1. Allison Hernandez
2. Isabel Larrea
3. Giovanna Petruccelli

PTC (Phenylthiocarbamide)

1. Gabrielle Jeanty
2. saydee westman
3.

BLI Intro (Dr. Fernandez Marco & Jeffry Petracca) & Barcoding (rbcL)

1. Genesis Acevedo
2. morgan jairala

Biocoding/Bioinformatics

1. Peter Ruiz II
2. Dante Del Vecchio
3. Aydan Cowan

D1S80

1. madison jairala
2. Jean Parnell Louis
3. Augustin Mingoia Murphy

SBU Trip

1. Derek Poetzsch
2. Logan Acuria-Lauer

Guest Speakers: Dr. Koo, Dr. Trotman, Dr. Dos Santos

1. Angelina Gonzalez Villalba
2. Liah-Branam-McClurkin
3. Amasa Ellis

Guest Speakers: Dr. Cheadle, Dr. Jackson, DIAS

1. Eliana Eisenberg
2. Sara Tovar

---

### Stony Brook Trip

SCHEDULE OF EVENTS

STONY BROOK UNIVERSITY (SBU)

- 10:00 AM to 10:45 AM Ms. Christina Lafaso, Undergraduate Admissions Advisor
- 10:45 AM to 11:15 AM Tour SBU Undergraduate Campus (Tour to end in front of Administration Bldg.)

TRANSPORT TO SOUTH CAMPUS

- 11:30 AM to 12:15 PM Lunch South Campus, Endeavour Hall Room 120.
- 12:20 PM to 12:50 PM Professor Erwin Cabrera, Director, SBU Simons STEM Scholars Program; Admissions Liaison, Brady Brick [provides members with full scholarships, housing, research opportunities, internship stipends, advising, mentoring to supercharge the pathway to STEM careers]
- 12:55 PM to 1:25 PM Professor Kamazima Lwiza, School of Marine & Atmospheric Sciences (SoMAS)

RENAISSANCE SCHOOL OF MEDICINE HEALTH SCIENCES CENTER (HSC)

- 1:30 PM TRANSPORT TO HSC HOSPITAL ENTRANCE: HSC L2 Room 1AB (on level 2, reserved)
- 1:40 PM to 2:00 PM Professor Jennie Williams, Admissions, Renaissance School of Medicine (RSOM)
- 2:05 PM to 2:25 PM Dr. Allison McLarty, Cardiothoracic Surgeon, Department of Surgery, RSOM
- 2:30 PM to 2:50 PM Dr. Alexandra Guillaume, Gastroenterologist, Director, Gastrointestinal Motility Center, Department of Medicine
- 3:00 PM to 4:30 PM Ms. Perrilynn Baldelli, Director, Clinical Simulation Center
- 5:00 PM DEPART SBU CAMPUS FOR DNALC. Pick-up site: Life Sciences Bus stop
- 6:00 PM Camp Dismissal, Parent pick-up

---

### STARS 2023 Student Email Addresses

1. 92sunlite13malibu@gmail.com - Derek Poetzsch
2. logan1122008@icloud.com - Logan Acuria-Lauer - loganacurialauer@gmail.com
3. yugiohpr2d2@gmail.com - Peter Ruiz II
4. Allison Hernandez - alliehernandez286@gmail.com
5. Sara Tovar - sarasofiatovar2217@gmail.com
6. Angelina Gonzalez - angelinaGonzVillalva@gmail.com
7. amasaellis31@gmail.com - Amasa
8. saydee westman - raindropsonroses427@gmail.com
9. Giovanna Petruccelli - giopetruccelli11@gmail.com
10. gabriellej2727@gmail.com - gabrielle jeanty
11. songbirdyee@gmail.com - eliana eisenberg
12. Jean Parnell Louis - JP - jeanparnellone@gmail.com
13. aydanscowan@gmail.com - Aydan Cowan
14. joaqmart21@gmail.com - Joaquin Martin
15. liahbranam@gmail.com - Liah-Branam-McClurkin
16. Augustin Mingoia Murphy - apmm2024@yahoo.com
17. ms.m.eisenberg@gmail.com
18. dantedelvecchio377@gmail.com - Dante Del Vecchio
19. sgp.student7@gmail.com - Sury Guadalupe-Pena
20. madisonjairala@gmail.com - madison jairala
21. genacevedo14@gmail.com - Genesis Acevedo
22. Isabel Larrea - Larreabell94@gmail.com
23. morganjairala@gmail.com - morgan jairala

---

## Jupyter hub accounts

- williams
- acevedo
- poetzsch
- guadalupe-pena
- hernandez
- ellis
- jairala
- martin
- jeanty
- ruiz
- eisenberg_e
- eisenberg_m
- larrea
- aange
- westman
- petruccelli
- jairalam1
- acuria-lauerl
- gonzalez
- mingoia_murphy
- johnson
- del_vecchio
- jairalamo
- rala
- cowan
- louis
- branam-mcclurkin
- tovar

---

#### Jupyter Hub Address

```
# Example of links in HTML vs. Markdown
<a href="http://www.google.com">Google Search</a>
[Google Search](http://www.google.com)
```

[JupyterHub](http://3.235.162.1:8000)

```
git clone https://github.com/JasonJWilliamsNY/biocoding-2021-notebooks.git
```

---

## Notebook 2

#### Naming variables

**Average weight of a mouse group?**

- avg_g_*group name*
- avg_w_gam
- avgWGamma
- avg_w
- avg_lb_m
- avg_lb
- avgw_
- avgW_gam
- avrgweight_m
- avg_wgt
- avg_wgt
- avg_wgh_a,avg_wgh_b,avg_wgh_g
- avgWg
- avgw_g, avgw_b, avgw_a
- avgweight
- avg_wgt_mce
- avg(gamma), avg(beta), avg(alpha)
- w_g, w_b, w_a
- alpha_weight
- avg_mass
- avg_(m)_weight

**Number of mice in a group?**

- mice_pop_*group name*
- avg_mass
- num_g
- avg_nmbr
- mice_#
- num_m_per_gr
- mice#
- avg#_miceG
- num_g, num_b, num_a
- mice_pop_#
- avg_n
- numMiceGamma
- mice_num
- WEIGHT alpha_w beta_w gamma_w
- mass alpha_g beta_g gamma_g
- micenumber
- avgn_
- num_mice
- groupnm
- num_a, num_b, num_g
- alpha_w, beta_w, gamma_w
- mice_fam
- beta_mice_w

**Challenge: In the cell below, print the alpha_id character by character in reverse**

---
JP
print(alpha_id[7],alpha_id[6],alpha_id[5],alpha_id[4],alpha_id[3],alpha_id[2],alpha_id[1],alpha_id[0])

---
alpha_id = '1738JGC'
len(alpha_id)
print(alpha_id)

--------
log
print(alpha_id[7])
print(alpha_id[6])
print(alpha_id[5])
print(alpha_id[4])
print(alpha_id[3])
print(alpha_id[2])
print(alpha_id[1])
print(alpha_id[0])

---
---
print(alpha_id[7])
print(alpha_id[6])
print(alpha_id[5])
print(alpha_id[4])
print(alpha_id[3])
print(alpha_id[2])
print(alpha_id[1])
print(alpha_id[0])

---
---
Peter
print(alpha_id[7])
print(alpha_id[6])
print(alpha_id[5])
print(alpha_id[4])
print(alpha_id[3])
print(alpha_id[2])
print(alpha_id[1])
print(alpha_id[0])

---
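
For comparison with the index-by-index answers in this section, a slice with a negative step is one compact way to reverse a string. This is only a sketch: it assumes `alpha_id` holds the 8-character ID `'CGJ28371'` used in the notebook, and `reversed_id` is a name made up for this example.

```python
# Assumption: alpha_id is the 8-character ID used in the notebook examples.
alpha_id = 'CGJ28371'

# A slice with step -1 walks the string from the last character to the first.
reversed_id = alpha_id[::-1]

# Print one character per line, matching the "character by character" wording.
for character in reversed_id:
    print(character)
```

The explicit `alpha_id[7]` ... `alpha_id[0]` answers print the same characters; the slice simply avoids typing each index and keeps working if the ID length changes.
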
print(alpha_id[7],alpha_id[6],alpha_id[5],alpha_id[4],alpha_id[3],alpha_id[2],alpha_id[1],alpha_id[0])

---
-------
print(alpha_id[7])
print(alpha_id[6])
print(alpha_id[5])
print(alpha_id[4])
print(alpha_id[3])
print(alpha_id[2])
print(alpha_id[1])
print(alpha_id[0])

------
------
print(alpha_id[7])
print(alpha_id[6])
print(alpha_id[5])
print(alpha_id[4])
print(alpha_id[3])
print(alpha_id[2])
print(alpha_id[1])
print(alpha_id[0])

---
augustin
print(len(alpha_id(-1)))

---
aydan
print(am_id[7],am_id[6],am_id[5],am_id[4],am_id[3],am_id[2],am_id[1],am_id[0])

---

#### Challenge

**Create new variables that contain the initials of the experimenter; for each mouse group; print the value of these new variables**

-------
groupA = 'CGJ28371'
groupAinitials = groupA[0:3]
print(groupAinitials)
groupB = 'SJW99399'
groupBinitials = groupB[0:3]
print(groupBinitials)
groupG = 'PWS29382'
groupGinitals = groupG[0:3]
print(groupGinitals)

print(alpha_id[0], gama_id[3])

**Create new variables that contain the ID of the experimenter; for each mouse group; print the value of these new variables**

---
aydan
#original id
am_id = "CGJ28371"
bm_id = "SJW99399"
gm_id = "PWS29382"
#id initials
am_id_initial = am_id[0:3]
bm_id_initial = bm_id[0:3]
gm_id_initial = gm_id[0:3]
#unique id number
am_id_unumber = am_id[3:8]
bm_id_unumber = bm_id[3:8]
gm_id_unumber = gm_id[3:8]
#print id initials
print(am_id_initial)
print(bm_id_initial)
print(gm_id_initial)
#print unique id number
print(am_id_unumber)
print(bm_id_unumber)
print(gm_id_unumber)

---Madelyn
alpha_exp =(alpha_id[0:3])
beta_exp = (beta_id[0:3])
gamma_exp =(gamma_id[0:3])
print(alpha_exp)
print(beta_exp)
print(gamma_exp)
alpha_exp_id = (alpha_id[3:8:])
beta_exp_id = (beta_id[3:8:])
gamma_exp_id = (gamma_id[3:8:])
print(alpha_exp_id)
print(beta_exp_id)
print(gamma_exp_id)

---
<3 JP <3 HELLO WORLD
PART ONE
alpha_id = 'CGJ28371'
beta_id = 'SJW99399'
gamma_id = 'PWS29382'
name_i = (alpha_id[0],alpha_id[1],alpha_id[2])
name_i2= (beta_id[0],beta_id[1],beta_id[2])
name_i3= (gamma_id[0],gamma_id[1],gamma_id[2])
name_bi = (alpha_id[0:3:])
name_bi2 = (beta_id[0:3:])
name_bi3 = (gamma_id[0:3:])
print(name_i)
print(name_i2)
print(name_i3)
print(name_bi)
print(name_bi2)
print(name_bi3)
PART 2
name_id = (alpha_id[3::])
name_id2 = (beta_id[3::])
name_id3 = (gamma_id[3::])
print(name_id)
print(name_id2)
print(name_id3)

---
---
Peter
alpha_id = 'CGJ28371'
beta_id = 'SJW99399'
gamma_id = 'PWS29382'
print(gamma_id[0], gamma_id[1], gamma_id[2])
print(gamma_id[3::1])
print(alpha_id[0], alpha_id[1], alpha_id[2])
print(alpha_id[3::1])
print(beta_id[0], beta_id[1], beta_id[2])
print(beta_id[3::1])

---
alpha_exp = alpha_id[0:3]
beta_exp = beta_id[0:3]
gamma_exp = gamma_id[0:3]
print(alpha_exp, beta_exp, gamma_exp)
alpha_exp_id = alpha_id[3:8]
beta_exp_id = beta_id[3:8]
gamma_exp_id = gamma_id[3:8]
print(alpha_exp_id, beta_exp_id, gamma_exp_id)

---
---
dante
groupA = 'CGJ28371'
groupA_ID = groupA[3:]
print(groupA_ID)
groupB = 'SJW99399'
groupB_ID = groupB[3:]
print(groupB_ID)
groupG = 'PWS29382'
groupG_ID = groupG[3:]
print(groupG_ID)

-------
groupA = 'CGJ28371'
groupAinitials = groupA[3:8]
print(groupAinitials)
groupB = 'SJW99399'
groupBinitials = groupB[3:8]
print(groupBinitials)
groupG = 'PWS29382'
groupGinitals = groupG[3:8]
print(groupGinitals)

---Joaquin
groupA = 'CGJ28371'
groupAvariables = groupA[4:8]
print(groupAvariables)
groupB = 'SJW99399'
groupBvariables = groupB[4:8]
print(groupBvariables)
groupG = 'PWS29382'
groupGvariables = groupG[4:8]
print(groupGvariables)

alpha_id_numbers = alpha_id[3:8]
print(alpha_id_numbers)
beta_id_numbers = beta_id[3:8]
print(beta_id_numbers)
gamma_id_numbers = gamma_id[3:8]
print(gamma_id_numbers)

-----
augustin
group_a = 'CGJ28371'
group_a_init = group_a[0:3]
print(group_a_init)
group_b = 'SJW99399'
group_b_init = group_b[0:3]
print(group_b_init)
group_c = 'PWS29382'
group_c_init = group_c[0:3]
print(group_c_init)
group_a = 'CGJ28371'
group_a_ID = group_a[3:]
print(group_a_ID)
group_b = 'SJW99399'
group_b_ID = group_b[3:]
print(group_b_ID)
group_c = 'PWS29382'
group_c_ID = group_c[3:]
print(group_c_ID)

---
groupA='CGJ28371'
groupB='SJW99399'
groupC='PWS29382'
print(beta_id[0:3:])
print(alpha_id[0:3:1])
print(gamma_id[0:3:])
#Create new variables that contain the ID of the experimenter
#for each mouse group; print the value of these new variables
print(alpha_id[3:8:])
print(beta_id[3:8:])
print(gamma_id[3:8:])

---
alpha_exp_ini= alpha_id[0:3:]
print(alpha_exp_ini)
beta_exp_ini= beta_id[0:3:]
print(beta_exp_ini)
gamma_exp_ini=gamma_id[0:3:]
print(gamma_exp_ini)
#id
alpha_exp_id= alpha_id[3:7:]
print(alpha_exp_id)
beta_exp_id = beta_id[3:7:]
print(beta_exp_id)
gamma_exp_id = gamma_id[3:7:]
print(gamma_exp_id)

**Challenge: Let's create a simple sequence in Python that will do the following**

-Madelyn
dna1 = 'AATGCGTGCGGATCATATTTTACCGGATCGGATGGCGTAAATCCGCGCTA'
print ('>sequence 001', '\n', dna1)

---
dna_string= 'AGTAGCCCGATAAGATACGGCGACATAGGTTTTTTAAGCGATGCATG'
start= '>sequence one'
print(start, '\n', dna_string)
-Allie

---
sequence_name = '> RandomSequence'
sequence_code = 'AGCTAGCTCCATGCTAGATCTTAGCTAGACGTGTCGATTAGCTGACTGCGTAGGAA'
print(sequence_name, '\n' + sequence_code)

---
------
dante
DNA = '>DNA Sequence 01'
dnaSeq = 'ACTGTTTTGGCCCATCCCATCATCATCGATCGASTCACGTGATCGTSACSCA'
print(DNA,'\n',dnaSeq)

---
Peter
s1 = '>' + 'sequence001'
s1DNA = 'ATACTCGATACTAGCTAGCTATACGTAGCTATCGATC'
print(s1, '\n', s1DNA)

print(">Dereks Sequence \n AATTGGCCAATACGTACTTTCCATTAC")

-----
print(">Logan's sequence \n AATGCCGATTAGCATTCGTATAGCCCGTAATTTGC")

------
Augustin
DNA = "Random DNA Sequence"
rand_seq = "ATGGGCCTAAATGTATAG"
print(DNA, "\n", rand_seq)

---

#### Determine and print the length of the HIV genome

-----Logan
print(len(hiv_genome))

----
---Madelyn
print(len(hiv_genome))

#### Create variables for and print the sequences for the following HIV genes

- gag
- pol
- vif
- vpr
- env

---madelyn
gag = (hiv_genome[789:2292:])
pol = ('\n' + hiv_genome[2084:5096:])
vif = ('\n' + hiv_genome[5040:5619:])
vpr = ('\n' + hiv_genome[5558:5850:])
env = ('\n' + hiv_genome[6224:8795:])
print(gag, '\n', pol, 'vif', vif, '\n', vpr, '\n', vpr, '\n', env )

---Peter
gag = hiv_genome[790:2292:]
pol = hiv_genome[2085:5096:]
vif = hiv_genome[5041:5619:]
vpr = hiv_genome[5559:5850:]
env = hiv_genome[6045:8795:]
print('HIV Gene GAG', '\n', '\n', gag)
print('\n')
print('HIV Gene POL', '\n', '\n', pol)
print('\n')
print('HIV Gene VIF', '\n', '\n', vif)
print('\n')
print('HIV Gene VPR', '\n', '\n', vpr)
print('\n')
print('HIV Gene ENV', '\n', '\n', env)

--Logan--
gag = hiv_genome[790:2292]
pol = hiv_genome[2085:5096]
vif = hiv_genome[5041:5619]
vpr = hiv_genome[5559:5850]
env = hiv_genome[6045:8795]
print(gag)
print(pol)
print(vif)
print(vpr)
print(env)

---
----
gag=hiv_genome[790:2293]
print(gag)
pol=hiv_genome[2085:5097]
print(pol)
vif=hiv_genome[5041:5620]
print(vif)
vpr=hiv_genome[5559:5851]
print(vpr)
env=hiv_genome[6225:8796]
print(env)

----
---
hiv_gag = hiv_genome[790:2293]
hiv_pol = hiv_genome[2085:5097]
hiv_vif = hiv_genome[5041:5620]
hiv_vpr= hiv_genome[559:5851]
hiv_env = hiv_genome [6225:8796]
print(hiv_gag, '\n', '\n', hiv_pol,'\n', '\n', hiv_vif,'\n','\n', hiv_vpr,'\n', '\n', hiv_env)

---
---Joaquin
# gag
groupgagvariables = hiv_genome[789:2292]
# pol
grouppolvariables = hiv_genome[2084:5096]
# vif
groupvifvariables = hiv_genome[5040:5619]
# vpr
groupvprvariables = hiv_genome[5558:5850]
# env
groupenvvariables = hiv_genome[6044:8795]
print(groupgagvariables, '\n''\n''\n', grouppolvariables, '\n''\n''\n', groupvifvariables, '\n''\n''\n', groupvprvariables, '\n''\n''\n', groupenvvariables)

____
JP
var_gag = (hiv_genome[789:2292:])
var_pol = (hiv_genome[2084:5096:])
var_vif = (hiv_genome[5040:5619:])
var_vpr = (hiv_genome[5558:5850:])
var_env = (hiv_genome[5969:8795:])
print('Gene gag:',var_gag,'\n\n\n\n','Gene pol:',var_pol,'\n\n\n\n','Gene vif:',var_vif,'\n\n\n\n','Gene vpr:',var_vpr,'\n\n\n\n','Gene env:',var_env,'\n\n\n\n')

____
---Aydan
gag = (hiv_genome[789:2135])
# pol
pol = (hiv_genome[2084:5097])
# vif
vif = (hiv_genome[5040:5620])
# vpr
vpr = (hiv_genome[5558:5851])
# env
env = (hiv_genome[6224:8796])
print(">gag" + "\n" + gag + "\n" + ">pol" + "\n" + pol + "\n" + ">vif" + "\n" + vif + "\n" + ">vpr" + "\n" + vpr + "\n" + ">env" + "\n" + env + "\n")

---
gag=print(hiv_genome[790:2292])
pol=print(hiv_genome[2085:5096])
vif= print(hiv_genome[5041:5619])
vpr=print(hiv_genome[5559:5850])
env=print (hiv_genome[6045:8795])

#### Generate the RNA sequence for each of the genes you have isolated above

---Madelyn
gag_rna = gag.replace('t','u')
print(gag_rna)
pol_rna = pol.replace('t', 'u')
print(pol_rna)
vif_rna = vif.replace('t', 'u')
print(vif_rna)
vpr_rna = vpr.replace('t', 'u')
print(vpr_rna)
env_rna = env.replace('t', 'u')
print(env_rna)

---Sury
hiv_genome = (hiv_genome.replace('t' , 'u'))
gag = (hiv_genome[789:2292])
print(gag)
pol = ('\n' + hiv_genome[2084:5096])
print(pol)
vif = ('\n' + hiv_genome[5040:5619])
print(vif)
vpr = ('\n' + hiv_genome[5558:5850])
print(vpr)
env = ('\n' + hiv_genome[6224:8795])
print(env)

---Peter
gag = hiv_genome[789:2292:]
pol = hiv_genome[2084:5096:]
vif = hiv_genome[5040:5619:]
vpr = hiv_genome[5558:5850:]
env = hiv_genome[6044:8795:]
RNAgag = gag.replace('t','u')
RNApol = pol.replace('t','u')
RNAvif = vif.replace('t','u')
RNAvpr = vpr.replace('t','u')
RNAenv = env.replace('t','u')
print('HIV Gene GAG', '\n', '\n', RNAgag)
print('\n')
print('HIV Gene POL', '\n', '\n', RNApol)
print('\n')
print('HIV Gene VIF', '\n', '\n', RNAvif)
print('\n')
print('HIV Gene VPR', '\n', '\n', RNAvpr)
print('\n')
print('HIV Gene ENV', '\n', '\n', RNAenv)

---
gag_rna = hiv_gag.replace ('t', 'u')
pol_rna = hiv_pol.replace ('t','u')
vif_rna = hiv_vif.replace ('t','u')
vpr_rna = hiv_vpr.replace ('t','u')
env_rna = hiv_env.replace ('t','u')
print(gag_rna, '\n', '\n', pol_rna, '\n', '\n', vif_rna, '\n', '\n', vpr_rna, '\n', '\n', env_rna)

---
RNA_gag = var_gag.replace('t','u')
RNA_pol = var_pol.replace('t','u')
RNA_vif = var_vif.replace('t','u')
RNA_vpr = var_vpr.replace('t','u')
RNA_env = var_env.replace('t','u')
print('RNA of gag:',RNA_gag,'\n\n\n\n','RNA of pol:',RNA_pol,'\n\n\n\n','RNA of vif:',RNA_vif,'\n\n\n\n','RNA of vpr',RNA_vpr,'\n\n\n\n','RNA of env',RNA_env)

___
---aydan
RNA_gag = gag.replace("t","u")
RNA_pol = pol.replace("t","u")
RNA_vif = vif.replace("t","u")
RNA_vpr = vpr.replace("t","u")
RNA_env = env.replace("t","u")
print(">RNA_gag" + "\n" + RNA_gag + "\n" + ">RNA_pol" + "\n" + RNA_pol + "\n" + ">RNA_vif" + "\n" + RNA_vif + "\n" + ">RNA_vpr" + "\n" + RNA_vpr + "\n" + ">RNA_env" + "\n" + RNA_env + "\n")

---

#### For each gene, generate a sum for each of the nucleotides in that gene (e.g., # of 'A', # of 'U', # of 'G', # of 'C')

___JP
print('COUNT FOR GAG in RNA')
print("Number of a",RNA_gag.count('a'))
print("Number of u",RNA_gag.count('u'))
print("Number of c",RNA_gag.count('c'))
print("Number of g",RNA_gag.count('g'))
print("\n\n")
print('COUNT FOR POL in RNA')
print("Number of a",RNA_pol.count('a'))
print("Number of u",RNA_pol.count('u'))
print("Number of c",RNA_pol.count('c'))
print("Number of g",RNA_pol.count('g'))
print("\n\n")
print('COUNT FOR VIF in RNA')
print("Number of a",RNA_vif.count('a'))
print("Number of u",RNA_vif.count('u'))
print("Number of c",RNA_vif.count('c'))
print("Number of g",RNA_vif.count('g'))
print("\n\n")
print('COUNT FOR VPR in RNA')
print("Number of a",RNA_vpr.count('a'))
print("Number of u",RNA_vpr.count('u'))
print("Number of c",RNA_vpr.count('c'))
print("Number of g",RNA_vpr.count('g'))
print("\n\n")
print('COUNT FOR ENV in RNA')
print("Number of a",RNA_env.count('a'))
print("Number of u",RNA_env.count('u'))
print("Number of c",RNA_env.count('c'))
print("Number of g",RNA_env.count('g'))
print('COUNT FOR GAG in DNA')
print('Number of a',var_gag.count('a'))
print('Number of t',var_gag.count('t'))
print('Number of c',var_gag.count('c'))
print('Number of g',var_gag.count('g'))
print("\n\n")
print('COUNT FOR POL in DNA')
print('Number of a',var_pol.count('a'))
print('Number of t',var_pol.count('t'))
print('Number of c',var_pol.count('c'))
print('Number of g',var_pol.count('g'))
print("\n\n")
print('COUNT FOR VIF in DNA')
print('Number of a',var_vif.count('a'))
print('Number of t',var_vif.count('t'))
print('Number of c',var_vif.count('c'))
print('Number of g',var_vif.count('g'))
print("\n\n")
print('COUNT FOR VPR in DNA')
print('Number of a',var_vpr.count('a'))
print('Number of t',var_vpr.count('t'))
print('Number of c',var_vpr.count('c'))
print('Number of g',var_vpr.count('g'))
print("\n\n")
print('COUNT FOR ENV in DNA')
print('Number of a',var_env.count('a'))
print('Number of t',var_env.count('t'))
print('Number of c',var_env.count('c'))
print('Number of g',var_env.count('g'))

---Madelyn
print('gag_rna_a',gag_rna.count('a'))
print('gag_rna_u',gag_rna.count('u'))
print('gag_rna_g',gag_rna.count('g'))
print('gag_rna_c',gag_rna.count('c'), '\n')
print('pol_rna_a',pol_rna.count('a'))
print('pol_rna_u',pol_rna.count('u'))
print('pol_rna_g',pol_rna.count('g'))
print('pol_rna_c',pol_rna.count('c'), '\n')
print('vif_rna_a',vif_rna.count('a'))
print('vif_rna_u',vif_rna.count('u'))
print('vif_rna_g',vif_rna.count('g'))
print('vif_rna_c',vif_rna.count('c'), '\n')
print('vpr_rna_a',vpr_rna.count('a'))
print('vpr_rna_u',vpr_rna.count('u'))
print('vpr_rna_g',vpr_rna.count('g'))
print('vpr_rna_c',vpr_rna.count('c'), '\n')
print('env_rna_a',env_rna.count('a'))
print('env_rna_u',env_rna.count('u'))
print('env_rna_g',env_rna.count('g'))
print('env_rna_c',env_rna.count('c'))

---
**--Peter--**
gag = hiv_genome[789:2292:]
pol = hiv_genome[2084:5096:]
vif = hiv_genome[5040:5619:]
vpr = hiv_genome[5558:5850:]
env = hiv_genome[6044:8795:]
RNAgag = gag.replace('t','u')
RNApol = pol.replace('t','u')
RNAvif = vif.replace('t','u')
RNAvpr = vpr.replace('t','u')
RNAenv = env.replace('t','u')
print('HIV Gene GAG Nuclotide #s', '\n', '\n', '"A" Nuclotide #', RNAgag.count('a'), '\n')
print('"U" Nuclotide #', RNAgag.count('u'), '\n')
print('"C" Nuclotide #', RNAgag.count('c'), '\n')
print('"G" Nuclotide #', RNAgag.count('g'), '\n')
print('\n')
print('HIV Gene POL Nuclotide #s', '\n', '\n', '"A" Nuclotide #', RNApol.count('a'), '\n')
print('"U" Nuclotide #', RNApol.count('u'), '\n')
print('"C" Nuclotide #', RNApol.count('c'), '\n')
print('"G" Nuclotide #', RNApol.count('g'), '\n')
print('\n')
print('HIV Gene VIF Nuclotide #s', '\n', '\n', '"A" Nuclotide #', RNAvif.count('a'), '\n')
print('"U" Nuclotide #', RNAvif.count('u'), '\n')
print('"C" Nuclotide #', RNAvif.count('c'), '\n')
print('"G" Nuclotide #', RNAvif.count('g'), '\n')
print('\n')
print('HIV Gene VPR Nuclotide #s', '\n', '\n', '"A" Nuclotide #', RNAvpr.count('a'), '\n')
print('"U" Nuclotide #', RNAvpr.count('u'), '\n')
print('"C" Nuclotide #', RNAvpr.count('c'), '\n')
print('"G" Nuclotide #', RNAvpr.count('g'), '\n')
print('\n')
print('HIV Gene ENV Nuclotide #s', '\n', '\n', '"A" Nuclotide #', RNAenv.count('a'), '\n')
print('"U" Nuclotide #', RNAenv.count('u'), '\n')
print('"C" Nuclotide #', RNAenv.count('c'), '\n')
print('"G" Nuclotide #', RNAenv.count('g'), '\n')

---aydan
RNA_gag_a = gag.count("a")
RNA_gag_u = gag.count("u")
RNA_gag_g = gag.count("g")
RNA_gag_c = gag.count("c")
RNA_pol_a = pol.count("a")
RNA_pol_u = pol.count("u")
RNA_pol_g = pol.count("g")
RNA_pol_c = pol.count("c")
RNA_vif_a = vif.count("a")
RNA_vif_u = vif.count("u")
RNA_vif_g = vif.count("g")
RNA_vif_c = vif.count("c")
RNA_vpr_a = vpr.count("a")
RNA_vpr_u = vpr.count("u")
RNA_vpr_g = vpr.count("g")
RNA_vpr_c = vpr.count("c")
RNA_env_a = env.count("a")
RNA_env_u = env.count("u")
RNA_env_g = env.count("g")
RNA_env_c = env.count("c")
print("gag a")
print(RNA_gag_a)
print("gag u")
print(RNA_gag_u)
print("gag g")
print(RNA_gag_g)
print("gag c")
print(RNA_gag_c)
print("pol a")
print(RNA_pol_a)
print("pol u")
print(RNA_pol_u)
print("pol g")
print(RNA_pol_g)
print("pol c")
print(RNA_pol_c)
print("vif a")
print(RNA_vif_a)
print("vif u")
print(RNA_vif_u)
print("vif g")
print(RNA_vif_g)
print("vif c")
print(RNA_vif_c)
print("vpr a")
print(RNA_vpr_a)
print("vpr u")
print(RNA_vpr_u)
print("vpr g")
print(RNA_vpr_g)
print("vpr c")
print(RNA_vpr_c)
print("env a")
print(RNA_env_a)
print("env u")
print(RNA_env_u)
print("env g")
print(RNA_env_g)
print("env c")
print(RNA_env_c)

---

#### For each gene, calculate the GC content (%)

- percent GC = (sum of G + sum of C) / (total number of nucleotides in a given gene) × 100

---Madelyn
#gag gene
gag_rna_g = gag_rna.count('g')
gag_rna_c = gag_rna.count('c')
print('gag rna gc concentration')
print((gag_rna_g + gag_rna_c)/len(gag)*100, '\n')
#pol gene
pol_rna_g = pol_rna.count('g')
pol_rna_c = pol_rna.count('c')
print('pol rna gc concentration')
print((pol_rna_g + pol_rna_c)/len(pol)*100, '\n')
#vif gene
vif_rna_g = vif_rna.count('g')
vif_rna_c = vif_rna.count('c')
print('vif rna gc concentration')
print((vif_rna_g + vif_rna_c)/len(vif)*100, '\n')
#vpr gene
vpr_rna_g = vpr_rna.count('g')
vpr_rna_c = vpr_rna.count('c')
print('vpr rna gc concentration')
print((vpr_rna_g + vpr_rna_c)/len(vpr)*100, '\n')
#env gene
env_rna_g = env_rna.count('g')
env_rna_c = env_rna.count('c')
print('env rna gc concentration')
print((env_rna_g + env_rna_c)/len(env)*100)

-- aydan
x = hiv_genome.count("g")
y = hiv_genome.count("c")
d = len(hiv_genome)
z = (x + y) / d
p = z * 100
print(str(p) + " %" )

---jp
x=hiv_genome.count('g')
y=hiv_genome.count('c')
d = len(hiv_genome)
z = ( x + y ) / d
p = z * 100
print(str(p), 'percent')

___
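
The GC calculations above repeat the same arithmetic for every gene, so a small helper function may be a tidier way to express it. This is a minimal sketch, not the notebook's official solution: `gc_content` and `example_gene` are hypothetical names, and the toy sequence is made up rather than sliced from `hiv_genome`.

```python
def gc_content(sequence):
    """Return the GC percentage of a nucleotide string (0.0 if it is empty)."""
    # Lowercase first so 'G'/'C' and 'g'/'c' are counted the same way.
    sequence = sequence.lower()
    if not sequence:
        return 0.0
    gc_count = sequence.count('g') + sequence.count('c')
    # (G + C) / total length, scaled to a percentage.
    return gc_count / len(sequence) * 100


# Hypothetical toy sequence, not taken from the HIV genome.
example_gene = 'atgcgcta'
print(round(gc_content(example_gene), 1), '%')
```

With the gene variables from the earlier cells, the same function could be called once per gene, for example `gc_content(gag)`, instead of repeating the count-and-divide lines. Replacing `t` with `u` does not change the result, because only `g` and `c` are counted.
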