# Genome Annotations
## Chloroplast genome assembly
#### Plastid genome
- gene number: 120-130
- genome size: ~160 kb
- genome structure: long single copy, short single copy, inverted repeat regions
- generally conserved between species
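
The size figure above doubles as a quick sanity check on an assembly. A minimal sketch, assuming the assembly is a FASTA file; the function names and the default 120-170 kb window are illustrative assumptions, not values from these notes.

```shell
#!/usr/bin/env bash
# Sum the sequence length of every record in a FASTA file.
# Header lines (starting with ">") are skipped; whitespace is ignored.
fasta_total_length() {
    awk '!/^>/ { gsub(/[ \t\r]/, ""); total += length($0) } END { print total + 0 }' "$1"
}

# Report whether a length falls inside a plausible plastome size window.
# The default bounds (120-170 kb) are an assumption for illustration.
check_plastome_size() {
    local len="$1" min="${2:-120000}" max="${3:-170000}"
    if [ "$len" -ge "$min" ] && [ "$len" -le "$max" ]; then
        echo "OK: ${len} bp is within ${min}-${max} bp"
    else
        echo "WARN: ${len} bp is outside ${min}-${max} bp"
        return 1
    fi
}

# Usage (assembly.fasta is a hypothetical file name):
#   check_plastome_size "$(fasta_total_length assembly.fasta)"
```

A length far outside the window usually means the assembly is fragmented or contains non-plastid contigs, so it is worth checking before annotation.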
#### Procedures
1. Sequencing (long reads and short reads)
2. Filter reads
3. Assembly
4. QC (MUMmer dot plot, short-read mapping results)
5. Genome annotation using GeSeq https://chlorobox.mpimp-golm.mpg.de/geseq.html
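
The QC step (4) could be scripted roughly as below. This is a sketch under assumptions: the notes name MUMmer for the dot plot but no read mapper, so `bwa` and `samtools` are stand-ins, and `qc_assembly`, `run`, `DRY_RUN`, and `THREADS` are hypothetical names. With `DRY_RUN=1` the commands are only printed.

```shell
#!/usr/bin/env bash
# Print each command; run it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# QC an assembly against a reference plastome and the short reads.
# Arguments: assembly FASTA, reference FASTA, R1 FASTQ, R2 FASTQ, output prefix.
qc_assembly() {
    local asm="$1" ref="$2" r1="$3" r2="$4" prefix="$5"

    # Dot plot: align the assembly to the reference with MUMmer, then plot.
    run nucmer -p "$prefix" "$ref" "$asm" || return 1
    run mummerplot --png -p "$prefix" "${prefix}.delta" || return 1

    # Short-read mapping back to the assembly (bwa + samtools are assumed).
    run bwa index "$asm" || return 1
    run bwa mem -t "${THREADS:-6}" -o "${prefix}.sam" "$asm" "$r1" "$r2" || return 1
    run samtools sort -o "${prefix}.bam" "${prefix}.sam" || return 1
    run samtools index "${prefix}.bam" || return 1

    # Summary of how many reads mapped.
    run samtools flagstat "${prefix}.bam" || return 1
}

# Usage (all file names are hypothetical):
#   DRY_RUN=1 qc_assembly assembly.fasta reference.fasta R1.fastq R2.fastq qc
```

A clean diagonal in the dot plot and even read coverage across the assembly are the things to look for.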
#### Program: GetOrganelle
Start from whole-genome sequencing reads. Align them to a reference chloroplast and discard everything that doesn't align (the nuclear genome data), then assemble the reads that did align.
`get_organelle_from_reads.py -1 /home/labuser/data/wzhou/PASTOS21/polish/P21_R1.fastq -2 /home/labuser/data/wzhou/PASTOS21/polish/P21_R2.fastq -t 6 -o P21_M -F embplant_pt -R 15`
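
In the command above, `-1`/`-2` are the paired read files, `-t` is the thread count, `-o` the output directory, `-F embplant_pt` selects the land-plant plastome mode, and `-R` caps the number of extension rounds. To reuse it across samples, a wrapper along these lines may help. It is only a sketch: `assemble_plastome`, `THREADS`, `ROUNDS`, and `DRY_RUN` are hypothetical names, and the `<sample>_R1.fastq` / `<sample>_M` naming simply follows the example above.

```shell
#!/usr/bin/env bash
# Assemble a plastome with GetOrganelle for one sample.
# Arguments: sample name, directory holding <sample>_R1.fastq and <sample>_R2.fastq.
# THREADS and ROUNDS default to the values used in the notes (6 and 15).
assemble_plastome() {
    local sample="$1" read_dir="$2" f
    local r1="${read_dir}/${sample}_R1.fastq"
    local r2="${read_dir}/${sample}_R2.fastq"

    # Build the command as the function's positional parameters.
    set -- get_organelle_from_reads.py \
        -1 "$r1" -2 "$r2" \
        -t "${THREADS:-6}" \
        -o "${sample}_M" \
        -F embplant_pt \
        -R "${ROUNDS:-15}"

    # With DRY_RUN=1, only show what would be run.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
        return 0
    fi

    # Fail early if either read file is missing.
    for f in "$r1" "$r2"; do
        [ -r "$f" ] || { echo "missing read file: $f" >&2; return 1; }
    done

    "$@"
}

# Usage:
#   DRY_RUN=1 assemble_plastome P21 /home/labuser/data/wzhou/PASTOS21/polish
```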
### Bandage
- useful tool to visualize your assembly
- runs locally on my own computer
- https://rrwick.github.io/Bandage/
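
Since Bandage runs locally, the assembly graph has to be found on the machine where the assembly ran and copied over first. A small sketch for locating graph files, assuming the assembler wrote GFA or FASTG files somewhere under its output directory; `list_assembly_graphs` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# List assembly graph files (GFA or FASTG) under an output directory,
# so they can be copied to a local machine and opened in Bandage.
list_assembly_graphs() {
    find "$1" -type f \( -name '*.gfa' -o -name '*.fastg' \) | sort
}

# Usage (P21_M is the output directory from the GetOrganelle run):
#   list_assembly_graphs P21_M
```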
### ELITE: Efficiently Locating Insertions of Transposable Elements
- mentioned in the mouse lecture