# STARS 2023
https://hackmd.io/@foss/stars2023
## Links and resources
- [3Blue1Brown channel - wordle guesses](https://youtu.be/fRed0Xmc2Wg)
- [DNA Subway](https://dnasubway.cyverse.org/)
- [DNA Subway Bioinformatics](https://dnabarcoding101.org/lab/bioinformatics.html) # scroll down to Blue Line
- Also see: [YouTube](https://www.youtube.com/live/rMtHaA-KMss?feature=share)
## Learning bioinformatics more after the class
**Notebooks used in this course**
- Biocoding 2020 Notebooks [link](https://github.com/JasonJWilliamsNY/biocoding-2020-notebooks)
  - You can download these materials: [link](https://github.com/JasonJWilliamsNY/biocoding-2020-notebooks/archive/master.zip)
**General Coding**
- CodeCademy: [link](https://www.codecademy.com/)
- Hour of code (also in languages other than English): [link](https://code.org/learn)
**Software installations**
Be sure you have permission to install software
- Try Ubuntu: [link](https://tutorials.ubuntu.com/tutorial/try-ubuntu-before-you-install#0)
- Python: [link](https://www.python.org/downloads/)
- Jupyter: [link](https://jupyter.org/)
- Wing IDE: [link](https://wingware.com/)
- Atom text editor: [link](https://atom.io/)
**Bioinformatics**
- Learn bioinformatics in 100 hours: [link](https://www.biostarhandbook.com/edu/course/1/)
- Rosalind bioinformatics: [link](http://rosalind.info/about/)
- Bioinformatics coursera: [link](https://www.coursera.org/learn/bioinformatics)
- Bioinformatics careers: [link](https://www.iscb.org/bioinformatics-resources-for-high-schools/careers-in-bioinformatics)
**Help**
- General software help: [link](https://stackoverflow.com/)
- Bioinformatics-specific software help: [link](https://www.biostars.org/)
### Friday pizza preferences
[Menu](https://www.southdownpizza.com/)
---
### Group Projects
**Place presentations in the [Friday Google Drive Folder](https://drive.google.com/drive/folders/115Yrti0QFvK-NSe7NH__VWGPo6Zy8BUV?usp=sharing)**
RABT (Restriction Analysis & Bacterial Transformation)
1. Madelyn Eisenberg
2. Sury Guadalupe-Pena
3. Joaquin Martin
HMS (Human Mitochondrial Sequencing)
1. Allison Hernandez
2. Isabel Larrea
3. Giovanna Petruccelli
PTC (Phenylthiocarbamide)
1. Gabrielle Jeanty
2. Saydee Westman
3.
BLI Intro (Dr. Fernandez Marco & Jeffry Petracca) & Barcoding (rbcL)
1. Genesis Acevedo
2. Morgan Jairala
Biocoding/Bioinformatics
1. Peter Ruiz II
2. Dante Del Vecchio
3. Aydan Cowan
D1S80
1. Madison Jairala
2. Jean Parnell Louis
3. Augustin Mingoia Murphy
SBU Trip
1. Derek Poetzsch
2. Logan Acuria-Lauer
Guest Speakers: Dr. Koo, Dr. Trotman, Dr. Dos Santos
1. Angelina Gonzalez Villalba
2. Liah-Branam-McClurkin
3. Amasa Ellis
Guest Speakers: Dr. Cheadle, Dr. Jackson, DIAS
1. Eliana Eisenberg
2. Sara Tovar
---
### Stony Brook Trip
SCHEDULE OF EVENTS
STONY BROOK UNIVERSITY (SBU)
- 10:00 AM to 10:45 AM Ms. Christina Lafaso, Undergraduate Admissions Advisor
- 10:45 AM to 11:15 AM Tour SBU Undergraduate Campus (Tour to end in front of Administration Bldg.)
TRANSPORT TO SOUTH CAMPUS
- 11:30 AM to 12:15 PM Lunch South Campus, Endeavour Hall Room 120.
- 12:20 PM to 12:50 PM Professor Erwin Cabrera, Director, SBU Simons STEM Scholars Program; Admissions Liaison, Brady Brick [provides members with full scholarships, housing, research opportunities, internship stipends, advising, and mentoring to supercharge the pathway to STEM careers]
- 12:55 PM to 1:25 PM Professor Kamazima Lwiza, School of Marine & Atmospheric Sciences (SoMAS)
RENAISSANCE SCHOOL OF MEDICINE HEALTH SCIENCES CENTER (HSC)
- 1:30 PM TRANSPORT TO HSC HOSPITAL ENTRANCE:
HSC L2 Room 1AB (on level 2, reserved)
- 1:40 PM to 2:00 PM Professor Jennie Williams, Admissions, Renaissance School of Medicine (RSOM)
- 2:05 PM to 2:25 PM Dr. Allison McLarty, Cardiothoracic Surgeon, Department of Surgery, RSOM
- 2:30 PM to 2:50 PM Dr. Alexandra Guillaume, Gastroenterologist, Director, Gastrointestinal Motility Center, Department of Medicine
- 3:00 PM to 4:30 PM Ms. Perrilynn Baldelli, Director, Clinical Simulation Center
- 5:00 PM DEPART SBU CAMPUS FOR DNALC Pick-up site:
- Life Sciences Bus stop
- 6:00 PM Camp Dismissal, Parent pick-up
---
### STARS 2023 Student Email Addresses
1.92sunlite13malibu@gmail.com - Derek Poetzsch
2.logan1122008@icloud.com - Logan Acuria-Lauer - loganacurialauer@gmail.com
3.yugiohpr2d2@gmail.com - Peter Ruiz II
4.Allison Hernandez - alliehernandez286@gmail.com
5.Sara Tovar- sarasofiatovar2217@gmail.com
6.Angelina Gonzalez- angelinaGonzVillalva@gmail.com
7.amasaellis31@gmail.com - Amasa
8.saydee westman
raindropsonroses427@gmail.com
9. Giovanna Petruccelli~ giopetruccelli11@gmail.com
10. gabriellej2727@gmail.com - gabrielle jeanty
11.songbirdyee@gmail.com-eliana eisenberg
12.Jean Parnell Louis - JP - jeanparnellone@gmail.com
13.aydanscowan@gmail.com - Aydan Cowan
14.joaqmart21@gmail.com - Joaquin Martin
15.liahbranam@gmail.com - Liah-Branam-McClurkin
16.Augustin Mingoia Murphy apmm2024@yahoo.com
17. ms.m.eisenberg@gmail.com
18. dantedelvecchio377@gmail.com - Dante Del Vecchio
19. sgp.student7@gmail.com - Sury Guadalupe-Pena
20.madisonjairala@gmail.com - madison jairala
21. genacevedo14@gmail.com - Genesis Acevedo
22.Isabel Larrea_ Larreabell94@gmail.com
23.morganjairala@gmail.com - morgan jairala
---
## Jupyter hub accounts
williams
acevedo
poetzsch
guadalupe-pena
hernandez
ellis
jairala
martin
jeanty
ruiz
eisenberg_e
eisenberg_m
larrea
aange
westman
petruccelli
jairalam1
acuria-lauerl
gonzalez
mingoia_murphy
johnson
del_vecchio
jairalamo
rala
cowan
louis
branam-mcclurkin
tovar
---
#### Jupyter Hub Address
```
# Example of links in HTML vs. Markdown
<a href="http://www.google.com">Google Search</a>
[Google Search](http://www.google.com)
```
[JupyterHub](http://3.235.162.1:8000)
```
git clone https://github.com/JasonJWilliamsNY/biocoding-2021-notebooks.git
```
---
## Notebook 2
#### Naming variables
**Average weight of a mouse group?**
avg_g_*group name*
avg_w_gam
avgWGamma
avg_w
avg_lb_m
avg_lb
avgw_
avgW_gam
avrgweight_m
avg_wgt
avg_wgt
avg_wgh_a,avg_wgh_b,avg_wgh_g
avgWg
avgw_g, avgw_b, avgw_a
avgweight
avg_wgt_mce
avg(gamma), avg(beta), avg(alpha)
w_g, w_b, w_a
alpha_weight
avg_mass
avg_(m)_weight
**Number of mice in a group?**
mice_pop_*group name*
avg_mass
num_g
avg_nmbr
mice_#
num_m_per_gr
mice#
avg#_miceG
num_g, num_b, num_a
mice_pop_#
avg_n
numMiceGamma
mice_num
****WEIGHT
alpha_w
beta_w
gamma_w
****mass
alpha_g
beta_g
gamma_g
micenumber
avgn_
num_mice
groupnm
num_a, num_b, num_g
alpha_w, beta_w, gamma_w
mice_fam
beta_mice_w
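
Some of the suggestions above (for example `mice#` or `avg(gamma)`) would not run as variable names, because Python names may only contain letters, digits, and underscores and cannot start with a digit. A minimal sketch for checking a name before using it; `is_valid_name` is a hypothetical helper written for these notes, not something from the course notebooks:

```python
import keyword

# Candidate names taken from the lists above.
candidates = ["avg_wgt", "numMiceGamma", "mice#", "avg(gamma)", "mice_pop_#", "num_mice"]

def is_valid_name(name):
    """Return True if `name` can be used as a Python variable name."""
    return name.isidentifier() and not keyword.iskeyword(name)

for name in candidates:
    status = "ok" if is_valid_name(name) else "not a valid name"
    print(f"{name:15} {status}")
```

A name can be valid and still be hard to read, so this only catches names that Python would reject outright.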
**Challenge: In the cell below, print the alpha_id character by character in reverse**
---
JP
print(alpha_id[7],alpha_id[6],alpha_id[5],alpha_id[4],alpha_id[3],alpha_id[2],alpha_id[1],alpha_id[0])
---
alpha_id = '1738JGC'
len(alpha_id)
print(alpha_id)
-------- log
print(alpha_id[7])
print(alpha_id[6])
print(alpha_id[5])
print(alpha_id[4])
print(alpha_id[3])
print(alpha_id[2])
print(alpha_id[1])
print(alpha_id[0])
---
---
print(alpha_id[7])
print(alpha_id[6])
print(alpha_id[5])
print(alpha_id[4])
print(alpha_id[3])
print(alpha_id[2])
print(alpha_id[1])
print(alpha_id[0])
---
--- Peter
print(alpha_id[7])
print(alpha_id[6])
print(alpha_id[5])
print(alpha_id[4])
print(alpha_id[3])
print(alpha_id[2])
print(alpha_id[1])
print(alpha_id[0])
---
print(alpha_id[7],alpha_id[6],alpha_id[5],alpha_id[4],alpha_id[3],alpha_id[2],alpha_id[1],alpha_id[0])
---
-------
print(alpha_id[7])
print(alpha_id[6])
print(alpha_id[5])
print(alpha_id[4])
print(alpha_id[3])
print(alpha_id[2])
print(alpha_id[1])
print(alpha_id[0])
------
------
print(alpha_id[7])
print(alpha_id[6])
print(alpha_id[5])
print(alpha_id[4])
print(alpha_id[3])
print(alpha_id[2])
print(alpha_id[1])
print(alpha_id[0])
---
augustin
print(alpha_id[::-1])
---
aydan
print(am_id[7],am_id[6],am_id[5],am_id[4],am_id[3],am_id[2],am_id[1],am_id[0])
---
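
Most answers above type out every index by hand, which only works when the ID is exactly 8 characters long. As a sketch of two alternatives that work for any length (assuming `alpha_id` holds the example ID used in these notes):

```python
alpha_id = 'CGJ28371'  # example ID used elsewhere in these notes

# Option 1: a slice with a step of -1 walks the string backwards.
reversed_id = alpha_id[::-1]
print(reversed_id)

# Option 2: print one character per line, starting from the last index.
for i in range(len(alpha_id) - 1, -1, -1):
    print(alpha_id[i])
```

The last valid index is always `len(alpha_id) - 1`, which is why the loop starts there and not at `len(alpha_id)`.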
#### Challenge
**Create new variables that contain the initials of the experimenter; for each mouse group; print the value of these new variables**
-------
groupA = 'CGJ28371'
groupAinitials = groupA[0:3]
print(groupAinitials)
groupB = 'SJW99399'
groupBinitials = groupB[0:3]
print(groupBinitials)
groupG = 'PWS29382'
groupGinitials = groupG[0:3]
print(groupGinitials)
print(alpha_id[0], gamma_id[3])
**Create new variables that contain the ID of the experimenter; for each mouse group; print the value of these new variables**
---
aydan
#original id
am_id = "CGJ28371"
bm_id = "SJW99399"
gm_id = "PWS29382"
#id initials
am_id_initial = am_id[0:3]
bm_id_initial = bm_id[0:3]
gm_id_initial = gm_id[0:3]
#unique id number
am_id_unumber = am_id[3:8]
bm_id_unumber = bm_id[3:8]
gm_id_unumber = gm_id[3:8]
#print id initials
print(am_id_initial)
print(bm_id_initial)
print(gm_id_initial)
#print unique id number
print(am_id_unumber)
print(bm_id_unumber)
print(gm_id_unumber)
---Madelyn
alpha_exp =(alpha_id[0:3])
beta_exp = (beta_id[0:3])
gamma_exp =(gamma_id[0:3])
print(alpha_exp)
print(beta_exp)
print(gamma_exp)
alpha_exp_id = (alpha_id[3:8:])
beta_exp_id = (beta_id[3:8:])
gamma_exp_id = (gamma_id[3:8:])
print(alpha_exp_id)
print(beta_exp_id)
print(gamma_exp_id)
--- <3 JP <3 HELLO WORLD
PART ONE
alpha_id = 'CGJ28371'
beta_id = 'SJW99399'
gamma_id = 'PWS29382'
name_i = (alpha_id[0],alpha_id[1],alpha_id[2])
name_i2= (beta_id[0],beta_id[1],beta_id[2])
name_i3= (gamma_id[0],gamma_id[1],gamma_id[2])
name_bi = (alpha_id[0:3:])
name_bi2 = (beta_id[0:3:])
name_bi3 = (gamma_id[0:3:])
print(name_i)
print(name_i2)
print(name_i3)
print(name_bi)
print(name_bi2)
print(name_bi3)
PART 2
name_id = (alpha_id[3::])
name_id2 = (beta_id[3::])
name_id3 = (gamma_id[3::])
print(name_id)
print(name_id2)
print(name_id3)
---
--- Peter
alpha_id = 'CGJ28371'
beta_id = 'SJW99399'
gamma_id = 'PWS29382'
print(gamma_id[0], gamma_id[1], gamma_id[2])
print(gamma_id[3::1])
print(alpha_id[0], alpha_id[1], alpha_id[2])
print(alpha_id[3::1])
print(beta_id[0], beta_id[1], beta_id[2])
print(beta_id[3::1])
---
alpha_exp = alpha_id[0:3]
beta_exp = beta_id[0:3]
gamma_exp = gamma_id[0:3]
print(alpha_exp, beta_exp, gamma_exp)
alpha_exp_id = alpha_id[3:8]
beta_exp_id = beta_id[3:8]
gamma_exp_id = gamma_id[3:8]
print(alpha_exp_id, beta_exp_id, gamma_exp_id)
---
--- dante
groupA = 'CGJ28371'
groupA_ID = groupA[3:]
print(groupA_ID)
groupB = 'SJW99399'
groupB_ID = groupB[3:]
print(groupB_ID)
groupG = 'PWS29382'
groupG_ID = groupG[3:]
print(groupG_ID)
-------
groupA = 'CGJ28371'
groupAinitials = groupA[3:8]
print(groupAinitials)
groupB = 'SJW99399'
groupBinitials = groupB[3:8]
print(groupBinitials)
groupG = 'PWS29382'
groupGinitials = groupG[3:8]
print(groupGinitials)
---Joaquin
groupA = 'CGJ28371'
groupAvariables = groupA[3:8]
print(groupAvariables)
groupB = 'SJW99399'
groupBvariables = groupB[3:8]
print(groupBvariables)
groupG = 'PWS29382'
groupGvariables = groupG[3:8]
print(groupGvariables)
alpha_id_numbers = alpha_id[3:8]
print(alpha_id_numbers)
beta_id_numbers = beta_id[3:8]
print(beta_id_numbers)
gamma_id_numbers = gamma_id[3:8]
print(gamma_id_numbers)
-----
augustin
group_a = 'CGJ28371'
group_a_init = group_a[0:3]
print(group_a_init)
group_b = 'SJW99399'
group_b_init = group_b[0:3]
print(group_b_init)
group_c = 'PWS29382'
group_c_init = group_c[0:3]
print(group_c_init)
group_a = 'CGJ28371'
group_a_ID = group_a[3:]
print(group_a_ID)
group_b = 'SJW99399'
group_b_ID = group_b[3:]
print(group_b_ID)
group_c = 'PWS29382'
group_c_ID = group_c[3:]
print(group_c_ID)
---
groupA='CGJ28371'
groupB='SJW99399'
groupC='PWS29382'
print(beta_id[0:3:])
print(alpha_id[0:3:1])
print(gamma_id[0:3:])
#Create new variables that contain the ID of the experimenter
#for each mouse group; print the value of these new variables
print(alpha_id[3:8:])
print(beta_id[3:8:])
print(gamma_id[3:8:])
---
alpha_exp_ini= alpha_id[0:3:]
print(alpha_exp_ini)
beta_exp_ini= beta_id[0:3:]
print(beta_exp_ini)
gamma_exp_ini=gamma_id[0:3:]
print(gamma_exp_ini)
#id
alpha_exp_id= alpha_id[3:8:]
print(alpha_exp_id)
beta_exp_id = beta_id[3:8:]
print(beta_exp_id)
gamma_exp_id = gamma_id[3:8:]
print(gamma_exp_id)
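
The answers above repeat the same two slices for every group. One possible way to avoid the repetition is to keep the IDs in a dictionary and loop over it. This is only a sketch: the names `group_ids`, `initials`, and `numbers` are made up for illustration, and it assumes every ID is three initials followed by the number.

```python
# Experimenter IDs from the challenge: 3 initials followed by a 5-digit number.
group_ids = {'alpha': 'CGJ28371', 'beta': 'SJW99399', 'gamma': 'PWS29382'}

initials = {}
numbers = {}
for group, full_id in group_ids.items():
    initials[group] = full_id[:3]   # characters at index 0, 1, 2
    numbers[group] = full_id[3:]    # index 3 through the end of the string
    print(group, initials[group], numbers[group])
```

Leaving the end of the slice empty (`[3:]`) means "up to the end", so no digit is dropped if the stop index is miscounted.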
#### Challenge
**Let's create a simple sequence in Python that will do the following**
-Madelyn
dna1 = 'AATGCGTGCGGATCATATTTTACCGGATCGGATGGCGTAAATCCGCGCTA'
print ('>sequence 001', '\n', dna1)
---
dna_string= 'AGTAGCCCGATAAGATACGGCGACATAGGTTTTTTAAGCGATGCATG'
start= '>sequence one'
print(start, '\n', dna_string)
-Allie
---
sequence_name = '> RandomSequence'
sequence_code = 'AGCTAGCTCCATGCTAGATCTTAGCTAGACGTGTCGATTAGCTGACTGCGTAGGAA'
print(sequence_name, '\n' + sequence_code)
---
------ dante
DNA = '>DNA Sequence 01'
dnaSeq = 'ACTGTTTTGGCCCATCCCATCATCATCGATCGASTCACGTGATCGTSACSCA'
print(DNA,'\n',dnaSeq)
--- Peter
s1 = '>' + 'sequence001'
s1DNA = 'ATACTCGATACTAGCTAGCTATACGTAGCTATCGATC'
print(s1, '\n', s1DNA)
print(">Dereks Sequence \n AATTGGCCAATACGTACTTTCCATTAC")
-----
print(">Logan's sequence \n AATGCCGATTAGCATTCGTATAGCCCGTAATTTGC")
------
Augustin
DNA = "Random DNA Sequence"
rand_seq = "ATGGGCCTAAATGTATAG"
print(DNA, "\n", rand_seq)
---
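
Several answers above pass `'\n'` to `print()` as its own argument. Because `print()` puts a space between its arguments, that leaves a stray space at the end of the header line and at the start of the sequence line. A small sketch of two ways to avoid this, assuming the goal is a FASTA-style record (a `>` header line followed by the sequence); the header and sequence here are made up:

```python
header = '>sequence_001'          # hypothetical record name
dna = 'AATGCGTGCGGATCATATTTTACC'  # made-up example sequence

# Build one string with + so no extra spaces are added.
fasta_record = header + '\n' + dna
print(fasta_record)

# Same output, using print's sep argument in place of the default space.
print(header, dna, sep='\n')
```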
#### Determine and print the length of the HIV genome
-----Logan
print(len(hiv_genome))
----
---Madelyn
print(len(hiv_genome))
#### Create variables for and print the sequences for the following HIV genes
- gag
- pol
- vif
- vpr
- env
---madelyn
gag = hiv_genome[789:2292]
pol = hiv_genome[2084:5096]
vif = hiv_genome[5040:5619]
vpr = hiv_genome[5558:5850]
env = hiv_genome[6224:8795]
print(gag, '\n', pol, '\n', vif, '\n', vpr, '\n', env)
---Peter
gag = hiv_genome[790:2292:]
pol = hiv_genome[2085:5096:]
vif = hiv_genome[5041:5619:]
vpr = hiv_genome[5559:5850:]
env = hiv_genome[6045:8795:]
print('HIV Gene GAG', '\n', '\n', gag)
print('\n')
print('HIV Gene POL', '\n', '\n', pol)
print('\n')
print('HIV Gene VIF', '\n', '\n', vif)
print('\n')
print('HIV Gene VPR', '\n', '\n', vpr)
print('\n')
print('HIV Gene ENV', '\n', '\n', env)
--Logan--
gag = hiv_genome[790:2292]
pol = hiv_genome[2085:5096]
vif = hiv_genome[5041:5619]
vpr = hiv_genome[5559:5850]
env = hiv_genome[6045:8795]
print(gag)
print(pol)
print(vif)
print(vpr)
print(env)
---
----
gag=hiv_genome[790:2293]
print(gag)
pol=hiv_genome[2085:5097]
print(pol)
vif=hiv_genome[5041:5620]
print(vif)
vpr=hiv_genome[5559:5851]
print(vpr)
env=hiv_genome[6225:8796]
print(env)
----
---
hiv_gag = hiv_genome[790:2293]
hiv_pol = hiv_genome[2085:5097]
hiv_vif = hiv_genome[5041:5620]
hiv_vpr = hiv_genome[5559:5851]
hiv_env = hiv_genome[6225:8796]
print(hiv_gag, '\n', '\n', hiv_pol,'\n', '\n', hiv_vif,'\n','\n', hiv_vpr,'\n', '\n', hiv_env)
---
---Joaquin
# gag
groupgagvariables = hiv_genome[789:2292]
# pol
grouppolvariables = hiv_genome[2084:5096]
# vif
groupvifvariables = hiv_genome[5040:5619]
# vpr
groupvprvariables = hiv_genome[5558:5850]
# env
groupenvvariables = hiv_genome[6044:8795]
print(groupgagvariables, '\n''\n''\n', grouppolvariables, '\n''\n''\n', groupvifvariables, '\n''\n''\n', groupvprvariables, '\n''\n''\n', groupenvvariables)
____ JP
var_gag = (hiv_genome[789:2292:])
var_pol = (hiv_genome[2084:5096:])
var_vif = (hiv_genome[5040:5619:])
var_vpr = (hiv_genome[5558:5850:])
var_env = (hiv_genome[5969:8795:])
print('Gene gag:',var_gag,'\n\n\n\n','Gene pol:',var_pol,'\n\n\n\n','Gene vif:',var_vif,'\n\n\n\n','Gene vpr:',var_vpr,'\n\n\n\n','Gene env:',var_env,'\n\n\n\n')
____
---Aydan
gag = (hiv_genome[789:2135])
# pol
pol = (hiv_genome[2084:5097])
# vif
vif = (hiv_genome[5040:5620])
# vpr
vpr = (hiv_genome[5558:5851])
# env
env = (hiv_genome[6224:8796])
print(">gag" + "\n" + gag + "\n" + ">pol" + "\n" + pol + "\n" + ">vif" + "\n" + vif + "\n" + ">vpr" + "\n" + vpr + "\n" + ">env" + "\n" + env + "\n")
---
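# print() returns None, so store the slice first and print it afterwards
gag = hiv_genome[790:2292]
pol = hiv_genome[2085:5096]
vif = hiv_genome[5041:5619]
vpr = hiv_genome[5559:5850]
env = hiv_genome[6045:8795]
print(gag)
print(pol)
print(vif)
print(vpr)
print(env)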
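
The answers above differ by one at the start or end of several slices. A sketch of the conversion, assuming the gene positions are given the way sequence databases usually list them (counting from 1, with the last position included); `extract_region` is a hypothetical helper and the sequence is a made-up toy, not the HIV genome:

```python
def extract_region(sequence, start, end):
    """Return the bases from position `start` to `end`,
    where positions count from 1 and both ends are included."""
    # Python counts from 0 and excludes the stop index,
    # so only the start needs to shift down by one.
    return sequence[start - 1:end]

toy_genome = 'aaccggttaacc'   # made-up 12-base sequence

# Positions 3 through 6 (counting from 1) are c, c, g, g.
region = extract_region(toy_genome, 3, 6)
print(region, len(region))
```

A quick self-check is the length: an inclusive range from `start` to `end` should contain `end - start + 1` bases.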
#### Generate the RNA sequence for each of the genes you have isolated above
---Madelyn
gag_rna = gag.replace('t','u')
print(gag_rna)
pol_rna = pol.replace('t', 'u')
print(pol_rna)
vif_rna = vif.replace('t', 'u')
print(vif_rna)
vpr_rna = vpr.replace('t', 'u')
print(vpr_rna)
env_rna = env.replace('t', 'u')
print(env_rna)
---Sury
hiv_genome = (hiv_genome.replace('t' , 'u'))
gag = (hiv_genome[789:2292])
print(gag)
pol = ('\n' + hiv_genome[2084:5096])
print(pol)
vif = ('\n' + hiv_genome[5040:5619])
print(vif)
vpr = ('\n' + hiv_genome[5558:5850])
print(vpr)
env = ('\n' + hiv_genome[6224:8795])
print(env)
---Peter
gag = hiv_genome[789:2292:]
pol = hiv_genome[2084:5096:]
vif = hiv_genome[5040:5619:]
vpr = hiv_genome[5558:5850:]
env = hiv_genome[6044:8795:]
RNAgag = gag.replace('t','u')
RNApol = pol.replace('t','u')
RNAvif = vif.replace('t','u')
RNAvpr = vpr.replace('t','u')
RNAenv = env.replace('t','u')
print('HIV Gene GAG', '\n', '\n', RNAgag)
print('\n')
print('HIV Gene POL', '\n', '\n', RNApol)
print('\n')
print('HIV Gene VIF', '\n', '\n', RNAvif)
print('\n')
print('HIV Gene VPR', '\n', '\n', RNAvpr)
print('\n')
print('HIV Gene ENV', '\n', '\n', RNAenv)
---
gag_rna = hiv_gag.replace ('t', 'u')
pol_rna = hiv_pol.replace ('t','u')
vif_rna = hiv_vif.replace ('t','u')
vpr_rna = hiv_vpr.replace ('t','u')
env_rna = hiv_env.replace ('t','u')
print(gag_rna, '\n', '\n', pol_rna, '\n', '\n', vif_rna, '\n', '\n', vpr_rna, '\n', '\n', env_rna)
---
RNA_gag = var_gag.replace('t','u')
RNA_pol = var_pol.replace('t','u')
RNA_vif = var_vif.replace('t','u')
RNA_vpr = var_vpr.replace('t','u')
RNA_env = var_env.replace('t','u')
print('RNA of gag:',RNA_gag,'\n\n\n\n','RNA of pol:',RNA_pol,'\n\n\n\n','RNA of vif:',RNA_vif,'\n\n\n\n','RNA of vpr',RNA_vpr,'\n\n\n\n','RNA of env',RNA_env)
___
---aydan
RNA_gag = gag.replace("t","u")
RNA_pol = pol.replace("t","u")
RNA_vif = vif.replace("t","u")
RNA_vpr = vpr.replace("t","u")
RNA_env = env.replace("t","u")
print(">RNA_gag" + "\n" + RNA_gag + "\n" + ">RNA_pol" + "\n" + RNA_pol + "\n" + ">RNA_vif" + "\n" + RNA_vif + "\n" + ">RNA_vpr" + "\n" + RNA_vpr + "\n" + ">RNA_env" + "\n" + RNA_env + "\n")
---
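
Every answer above repeats the same `replace` call for each gene. One way to write it once is a small function. This is a sketch only: `to_rna` is a hypothetical name, the sequences are made up, and it assumes the DNA string is the coding strand so that swapping t for u is enough.

```python
def to_rna(dna):
    """Return the RNA version of a coding-strand DNA string,
    swapping t for u (and T for U, in case the input is uppercase)."""
    return dna.replace('t', 'u').replace('T', 'U')

toy_gene = 'atggcttag'        # made-up lowercase sequence
print(to_rna(toy_gene))
print(to_rna('ATGGCTTAG'))    # uppercase input works too
```

`replace` is case-sensitive, so `.replace('t', 'u')` on its own leaves an uppercase sequence unchanged.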
#### For each gene, generate a sum for each of the nucleotides in that gene (e.g., # of 'A', # of 'U', # of 'G', # of 'C')
___JP
print('COUNT FOR GAG in RNA')
print("Number of a",RNA_gag.count('a'))
print("Number of u",RNA_gag.count('u'))
print("Number of c",RNA_gag.count('c'))
print("Number of g",RNA_gag.count('g'))
print("\n\n")
print('COUNT FOR POL in RNA')
print("Number of a",RNA_pol.count('a'))
print("Number of u",RNA_pol.count('u'))
print("Number of c",RNA_pol.count('c'))
print("Number of g",RNA_pol.count('g'))
print("\n\n")
print('COUNT FOR VIF in RNA')
print("Number of a",RNA_vif.count('a'))
print("Number of u",RNA_vif.count('u'))
print("Number of c",RNA_vif.count('c'))
print("Number of g",RNA_vif.count('g'))
print("\n\n")
print('COUNT FOR VPR in RNA')
print("Number of a",RNA_vpr.count('a'))
print("Number of u",RNA_vpr.count('u'))
print("Number of c",RNA_vpr.count('c'))
print("Number of g",RNA_vpr.count('g'))
print("\n\n")
print('COUNT FOR ENV in RNA')
print("Number of a",RNA_env.count('a'))
print("Number of u",RNA_env.count('u'))
print("Number of c",RNA_env.count('c'))
print("Number of g",RNA_env.count('g'))
print('COUNT FOR GAG in DNA')
print('Number of a',var_gag.count('a'))
print('Number of t',var_gag.count('t'))
print('Number of c',var_gag.count('c'))
print('Number of g',var_gag.count('g'))
print("\n\n")
print('COUNT FOR POL in DNA')
print('Number of a',var_pol.count('a'))
print('Number of t',var_pol.count('t'))
print('Number of c',var_pol.count('c'))
print('Number of g',var_pol.count('g'))
print("\n\n")
print('COUNT FOR VIF in DNA')
print('Number of a',var_vif.count('a'))
print('Number of t',var_vif.count('t'))
print('Number of c',var_vif.count('c'))
print('Number of g',var_vif.count('g'))
print("\n\n")
print('COUNT FOR VPR in DNA')
print('Number of a',var_vpr.count('a'))
print('Number of t',var_vpr.count('t'))
print('Number of c',var_vpr.count('c'))
print('Number of g',var_vpr.count('g'))
print("\n\n")
print('COUNT FOR ENV in DNA')
print('Number of a',var_env.count('a'))
print('Number of t',var_env.count('t'))
print('Number of c',var_env.count('c'))
print('Number of g',var_env.count('g'))
---Madelyn
print('gag_rna_a',gag_rna.count('a'))
print('gag_rna_u',gag_rna.count('u'))
print('gag_rna_g',gag_rna.count('g'))
print('gag_rna_c',gag_rna.count('c'), '\n')
print('pol_rna_a',pol_rna.count('a'))
print('pol_rna_u',pol_rna.count('u'))
print('pol_rna_g',pol_rna.count('g'))
print('pol_rna_c',pol_rna.count('c'), '\n')
print('vif_rna_a',vif_rna.count('a'))
print('vif_rna_u',vif_rna.count('u'))
print('vif_rna_g',vif_rna.count('g'))
print('vif_rna_c',vif_rna.count('c'), '\n')
print('vpr_rna_a',vpr_rna.count('a'))
print('vpr_rna_u',vpr_rna.count('u'))
print('vpr_rna_g',vpr_rna.count('g'))
print('vpr_rna_c',vpr_rna.count('c'), '\n')
print('env_rna_a',env_rna.count('a'))
print('env_rna_u',env_rna.count('u'))
print('env_rna_g',env_rna.count('g'))
print('env_rna_c',env_rna.count('c'))
---
**--Peter--**
gag = hiv_genome[789:2292:]
pol = hiv_genome[2084:5096:]
vif = hiv_genome[5040:5619:]
vpr = hiv_genome[5558:5850:]
env = hiv_genome[6044:8795:]
RNAgag = gag.replace('t','u')
RNApol = pol.replace('t','u')
RNAvif = vif.replace('t','u')
RNAvpr = vpr.replace('t','u')
RNAenv = env.replace('t','u')
print('HIV Gene GAG Nucleotide #s', '\n', '\n', '"A" Nucleotide #', RNAgag.count('a'), '\n')
print('"U" Nucleotide #', RNAgag.count('u'), '\n')
print('"C" Nucleotide #', RNAgag.count('c'), '\n')
print('"G" Nucleotide #', RNAgag.count('g'), '\n')
print('\n')
print('HIV Gene POL Nucleotide #s', '\n', '\n', '"A" Nucleotide #', RNApol.count('a'), '\n')
print('"U" Nucleotide #', RNApol.count('u'), '\n')
print('"C" Nucleotide #', RNApol.count('c'), '\n')
print('"G" Nucleotide #', RNApol.count('g'), '\n')
print('\n')
print('HIV Gene VIF Nucleotide #s', '\n', '\n', '"A" Nucleotide #', RNAvif.count('a'), '\n')
print('"U" Nucleotide #', RNAvif.count('u'), '\n')
print('"C" Nucleotide #', RNAvif.count('c'), '\n')
print('"G" Nucleotide #', RNAvif.count('g'), '\n')
print('\n')
print('HIV Gene VPR Nucleotide #s', '\n', '\n', '"A" Nucleotide #', RNAvpr.count('a'), '\n')
print('"U" Nucleotide #', RNAvpr.count('u'), '\n')
print('"C" Nucleotide #', RNAvpr.count('c'), '\n')
print('"G" Nucleotide #', RNAvpr.count('g'), '\n')
print('\n')
print('HIV Gene ENV Nucleotide #s', '\n', '\n', '"A" Nucleotide #', RNAenv.count('a'), '\n')
print('"U" Nucleotide #', RNAenv.count('u'), '\n')
print('"C" Nucleotide #', RNAenv.count('c'), '\n')
print('"G" Nucleotide #', RNAenv.count('g'), '\n')
---aydan
# count on the RNA strings; the DNA strings contain no "u", so their "u" count is always 0
RNA_gag_a = RNA_gag.count("a")
RNA_gag_u = RNA_gag.count("u")
RNA_gag_g = RNA_gag.count("g")
RNA_gag_c = RNA_gag.count("c")
RNA_pol_a = RNA_pol.count("a")
RNA_pol_u = RNA_pol.count("u")
RNA_pol_g = RNA_pol.count("g")
RNA_pol_c = RNA_pol.count("c")
RNA_vif_a = RNA_vif.count("a")
RNA_vif_u = RNA_vif.count("u")
RNA_vif_g = RNA_vif.count("g")
RNA_vif_c = RNA_vif.count("c")
RNA_vpr_a = RNA_vpr.count("a")
RNA_vpr_u = RNA_vpr.count("u")
RNA_vpr_g = RNA_vpr.count("g")
RNA_vpr_c = RNA_vpr.count("c")
RNA_env_a = RNA_env.count("a")
RNA_env_u = RNA_env.count("u")
RNA_env_g = RNA_env.count("g")
RNA_env_c = RNA_env.count("c")
print("gag a")
print(RNA_gag_a)
print("gag u")
print(RNA_gag_u)
print("gag g")
print(RNA_gag_g)
print("gag c")
print(RNA_gag_c)
print("pol a")
print(RNA_pol_a)
print("pol u")
print(RNA_pol_u)
print("pol g")
print(RNA_pol_g)
print("pol c")
print(RNA_pol_c)
print("vif a")
print(RNA_vif_a)
print("vif u")
print(RNA_vif_u)
print("vif g")
print(RNA_vif_g)
print("vif c")
print(RNA_vif_c)
print("vpr a")
print(RNA_vpr_a)
print("vpr u")
print(RNA_vpr_u)
print("vpr g")
print(RNA_vpr_g)
print("vpr c")
print(RNA_vpr_c)
print("env a")
print(RNA_env_a)
print("env u")
print(RNA_env_u)
print("env g")
print(RNA_env_g)
print("env c")
print(RNA_env_c)
---
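
The counting answers above need four lines per gene. A shorter sketch that loops over the bases and over the genes; `count_bases` and the `genes` dictionary are hypothetical names, and the sequence is a made-up toy that stands in for the real RNA strings:

```python
def count_bases(sequence, alphabet='acgu'):
    """Return a dict mapping each base in `alphabet` to its count in `sequence`."""
    return {base: sequence.count(base) for base in alphabet}

toy_rna = 'auggcuuag'        # made-up RNA sequence
genes = {'toy': toy_rna}     # in the notebook this could hold gag, pol, vif, vpr, env

for name, seq in genes.items():
    counts = count_bases(seq)
    print(name, counts, 'total:', sum(counts.values()))
```

If the total does not equal `len(seq)`, the sequence contains characters outside the alphabet (for example a leftover `t` or a newline).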
#### For each gene, calculate the GC content (%)
- percent GC = (sum of G + sum of C) / total number of nucleotides in a given gene × 100
---Madelyn
#gag gene
gag_rna_g = gag_rna.count('g')
gag_rna_c = gag_rna.count('c')
print('gag rna gc concentration')
print((gag_rna_g + gag_rna_c)/len(gag)*100, '\n')
#pol gene
pol_rna_g = pol_rna.count('g')
pol_rna_c = pol_rna.count('c')
print('pol rna gc concentration')
print((pol_rna_g + pol_rna_c)/len(pol)*100, '\n')
#vif gene
vif_rna_g = vif_rna.count('g')
vif_rna_c = vif_rna.count('c')
print('vif rna gc concentration')
print((vif_rna_g + vif_rna_c)/len(vif)*100, '\n')
#vpr gene
vpr_rna_g = vpr_rna.count('g')
vpr_rna_c = vpr_rna.count('c')
print('vpr rna gc concentration')
print((vpr_rna_g + vpr_rna_c)/len(vpr)*100, '\n')
#env gene
env_rna_g = env_rna.count('g')
env_rna_c = env_rna.count('c')
print('env rna gc concentration')
print((env_rna_g + env_rna_c)/len(env)*100)
-- aydan
x = hiv_genome.count("g")
y = hiv_genome.count("c")
d = len(hiv_genome)
z = (x + y) / d
p = z * 100
print(str(p) + " %" )
---jp
x=hiv_genome.count('g')
y=hiv_genome.count('c')
d = len(hiv_genome)
z = ( x + y ) / d
p = z * 100
print(str(p), 'percent')
___
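
The last two answers compute GC content for the whole genome, while the challenge asks for each gene. A sketch of a reusable function applied per gene; `gc_percent` and `toy_genes` are made-up names, and the sequences are toys that stand in for the gene variables:

```python
def gc_percent(sequence):
    """Return GC content as a percentage of the sequence length."""
    seq = sequence.lower()
    if len(seq) == 0:
        return 0.0  # avoid dividing by zero on an empty string
    return (seq.count('g') + seq.count('c')) / len(seq) * 100

toy_genes = {'gene_1': 'atggcttag', 'gene_2': 'ggccggcc'}
for name, seq in toy_genes.items():
    print(name, round(gc_percent(seq), 1), '%')
```

The parentheses around the two counts matter: without them, only the C count would be divided by the length.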