# Genome Annotations

## Chloroplast genome assembly

#### Plastid genome

- gene number: 120-130
- genome size: ~160 kb
- genome structure: large single copy (LSC), small single copy (SSC), and inverted repeat (IR) regions
- generally conserved between species

#### Procedures

1. Sequencing (long reads and short reads)
2. Filter reads
3. Assembly
4. QC (MUMmer plot, short-read mapping results)
5. Genome annotation using GeSeq: https://chlorobox.mpimp-golm.mpg.de/geseq.html

#### Program: GetOrganelle

Start from whole-genome sequencing data; align the reads to a reference chloroplast genome and discard everything that doesn't align (the nuclear genome data), then assemble the reads that did align.

    get_organelle_from_reads.py \
        -1 /home/labuser/data/wzhou/PASTOS21/polish/P21_R1.fastq \
        -2 /home/labuser/data/wzhou/PASTOS21/polish/P21_R2.fastq \
        -t 6 -o P21_M -F embplant_pt -R 15

### Bandage

- useful tool to visualize your assembly
- runs locally on my own computer
- https://rrwick.github.io/Bandage/

### ELITE: Efficiently Locating Insertions of Transposable Elements

*mentioned in mouse lecture*
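
Side note on the GetOrganelle step: the "keep what matches the chloroplast reference, toss the rest" idea can be sketched in a few lines of Python. This is only a toy illustration and not how GetOrganelle is actually implemented. The function names (`reverse_complement`, `kmers`, `filter_reads`), the k-mer size, and the sequences are all made up for the example, and a shared k-mer stands in for a real alignment.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all k-length substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def filter_reads(reads: list[str], reference: str, k: int = 8) -> list[str]:
    """Keep reads sharing at least one k-mer with the reference on either strand.

    Assumption: a shared k-mer is used here as a crude proxy for "aligns".
    """
    ref_kmers = kmers(reference, k) | kmers(reverse_complement(reference), k)
    return [read for read in reads if kmers(read, k) & ref_kmers]


if __name__ == "__main__":
    # Made-up stand-in for a chloroplast reference sequence.
    reference = "ATGGCTTCACAAGCCGTTAGGACCTTGAAC"
    reads = [
        "TCACAAGCCGTTAGG",                      # matches the reference: kept
        reverse_complement("GTTAGGACCTTGAAC"),  # matches the opposite strand: kept
        "GGGGGGGGGGGGGGG",                      # no match ("nuclear" read): dropped
    ]
    kept = filter_reads(reads, reference)
    print(f"kept {len(kept)} of {len(reads)} reads")
```

The kept reads are what would then go into the assembly step; the assembly itself is not shown here.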