# Plant mitochondria and chloroplast
- Auto-generated Table of Content
[ToC]
# 1. Running MBG
```
sh /software/team311/mu2/maptoboth2-allreads.sh
```
Example
```
# Input reads
allreads="dcPolAvic1.trimmedReads.fasta"
# Build one graph per k-mer size (1001 and 5001); variables are quoted so paths with spaces do not break
/software/team311/mu2/mbg-noKbug/MBG/bin/MBG -k 1001 -w 250 -a 5 -u 150 -i "$allreads" -o "$allreads.1k.gfa" -t 18
/software/team311/mu2/mbg-noKbug/MBG/bin/MBG -k 5001 -w 250 -a 5 -u 150 -i "$allreads" -o "$allreads.5k.gfa" -t 18
```
Have a look at the MBG parameters above: the abundance thresholds (`-a 5 -u 150`) exclude k-mers at nuclear genome coverage. We keep only high-frequency k-mers, which (usually) come from the high-copy organelle genomes (mito, chloro).
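To make the idea of abundance filtering concrete, here is a toy sketch in Python. It is not how MBG is implemented (MBG works on minimizers and unitigs, and handles both strands); the function names and the toy reads are assumptions made up for this illustration only.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer (forward strand only) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def high_abundance_kmers(reads, k, min_abundance):
    """Keep only k-mers seen at least `min_abundance` times."""
    counts = kmer_counts(reads, k)
    return {kmer for kmer, n in counts.items() if n >= min_abundance}


# Toy data (hypothetical): one "organelle" read seen 5 times,
# one "nuclear" read seen once.
toy_reads = ["ACGTAC"] * 5 + ["TTGGCC"]
kept = high_abundance_kmers(toy_reads, k=4, min_abundance=3)
print(sorted(kept))
```

With real data the organelle k-mers sit at a much higher coverage than the nuclear ones, so a threshold between the two separates them in the same way.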
## 1.1 MBG will generate graph outputs
Some graphs are pretty simple:

## 1.2 The chloroplast
We output the chloroplast in a specific orientation. The manual way of getting it (before Max's tool) was:
1-) Get the fasta for each unitig
```
for gfa in *.gfa; do /software/team311/mu2/MitoHiFi/scripts/gfa2fa "$gfa" > "${gfa%.gfa}.fa"; done
```
2-) Then open the graph in Bandage and choose your path across it. Then get the sequences for that path.
Path chosen for the chloroplast: `1+,2+,3+,2-`
Use the tool `python /software/team311/mu2/join_segments.py`:
```
python /software/team311/mu2/join_segments.py HiFiMapped.bam.filtered.3k.fa chlro.path HiFiMapped.bam.filtered.3k.gfa chloro.fa
Collecting sequences from HiFiMapped.bam.filtered.3k.fa
Reading links from HiFiMapped.bam.filtered.3k.gfa
=======================================
Processing 1+,2+,3+,2-
1+ 2+
================== Basic step
1+ 2+ 4332M
From cigar 4332
Added coordinates: [96641 - 123226)
2+ 3+
================== Basic step
2+ 3+ 4587M
From cigar 4587
Added coordinates: [123226 - 141277)
3+ 2-
================== Basic step
3+ 2- 4587M
From cigar 4587
Added coordinates: [141277 - 167607)
Writing chloro
```
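The log above suggests that each step appends the next segment minus the overlap given in the link's CIGAR (for example, `4332M` means the first 4332 bases of the next segment are already present). Below is a minimal Python sketch of that idea. It is not the code of `join_segments.py`; the function names are hypothetical, and it assumes a GFA 1 file with sequences on the `S` lines and overlaps written as a single `<n>M` CIGAR.

```python
import re

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def flip(step):
    """Flip the orientation of a step such as '2+' -> '2-'."""
    return step[:-1] + ("-" if step.endswith("+") else "+")


def parse_gfa(lines):
    """Collect segment sequences and link overlaps from GFA 1 lines."""
    segments, overlaps = {}, {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            segments[fields[1]] = fields[2]
        elif fields[0] == "L":
            match = re.fullmatch(r"(\d+)M", fields[5])
            if match is None:
                raise ValueError(f"unsupported overlap CIGAR: {fields[5]}")
            src, dst = fields[1] + fields[2], fields[3] + fields[4]
            size = int(match.group(1))
            overlaps[(src, dst)] = size
            # A link A->B also implies the reverse-strand link B'->A'.
            overlaps[(flip(dst), flip(src))] = size
    return segments, overlaps


def oriented(step, segments):
    """Sequence of a step, reverse-complemented for '-' orientation."""
    seq = segments[step[:-1]]
    return seq if step.endswith("+") else reverse_complement(seq)


def join_path(path, segments, overlaps):
    """Join the segments along a path like '1+,2+,3+,2-', trimming overlaps."""
    steps = path.split(",")
    joined = oriented(steps[0], segments)
    for prev, cur in zip(steps, steps[1:]):
        overlap = overlaps[(prev, cur)]  # KeyError if the graph has no such link
        joined += oriented(cur, segments)[overlap:]
    return joined
```

A path that uses a link missing from the graph raises a `KeyError`, which is a cheap way to catch a mistyped path before writing any sequence.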
## 1.3 Run MitoHiFi
Once you have a linear sequence, run MitoHiFi to remove the remaining redundancy at the ends of the FASTA and to annotate the sequence.
```
singularity exec --bind /lustre/:/lustre/ /software/tola/images/mitohifi-2.1.sif mitohifi_v2.py -c chloro.fa -f NC_058892.1.fasta -g NC_058892.1.gb -o 11 -t 18
```
After MitoHiFi finishes, check the annotation. Below is the chloroplast with reads mapped back and a GFF track at the bottom.

The above is the orientation we want:
- LSC: large single-copy region
- IR: inverted repeat
- SSC: small single-copy region

We want the LSC at the beginning, starting at the gene psbA, and we want the SSC finishing at the gene ycf1.
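If a circular sequence is already complete but starts at the wrong place, it can be rotated so that a chosen position becomes the first base. The sketch below is only an illustration: `rotate_circular` is a hypothetical helper, and the 0-based start coordinate is assumed to come from your own annotation (for example, the psbA coordinates in the GFF track).

```python
def rotate_circular(sequence, new_start):
    """Rotate a circular sequence so that 0-based `new_start` becomes position 0."""
    if not 0 <= new_start < len(sequence):
        raise ValueError("new_start must be a valid 0-based position in the sequence")
    return sequence[new_start:] + sequence[:new_start]


# Toy example: start the molecule at index 2 instead of index 0.
print(rotate_circular("ACGTAC", 2))
```

This only changes the start point. If the gene of interest sits on the reverse strand, the sequence would also need to be reverse-complemented, which this sketch does not do.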
# 2. The mitochondria
In the case above (red graph in 1.1) it is just a single circular molecule, but this is really unlikely for plants. For a case like that one, we just need to get the FASTA for that unitig, download a reference from a closely related species, and run MitoHiFi to circularize and annotate.
But some graphs are much more complicated:
```
/lustre/scratch124/tol/projects/darwin/sub_projects/organelles/pmpap_outputs_raw/poa_gfas
```
## 2.1 Mito genes
Different plant groups have different sets of genes. To compare your annotation with the closely related reference you used, run:
```
python /software/team311/mu2/comparing_gene.py 1.gb 2.gb
```
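For readers without access to that script, the comparison can be approximated with a few lines of Python. This is a rough sketch and not the code of `comparing_gene.py`: it only collects the `/gene="..."` qualifiers from the GenBank text with a regular expression, and the function names are made up for this example.

```python
import re
import sys
from pathlib import Path

# Matches GenBank qualifiers such as /gene="nad1"
GENE_QUALIFIER = re.compile(r'/gene="([^"]+)"')


def gene_names(genbank_text):
    """Return the set of gene names found in /gene qualifiers."""
    return set(GENE_QUALIFIER.findall(genbank_text))


def compare_gene_sets(text_a, text_b):
    """Report genes shared by both annotations and genes unique to each."""
    a, b = gene_names(text_a), gene_names(text_b)
    return {
        "shared": sorted(a & b),
        "only_first": sorted(a - b),
        "only_second": sorted(b - a),
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python compare_sketch.py 1.gb 2.gb
    report = compare_gene_sets(Path(sys.argv[1]).read_text(), Path(sys.argv[2]).read_text())
    for key, names in report.items():
        print(key, ", ".join(names))
```

Gene names are compared as written, so differences in capitalization or naming between the two annotations show up as missing genes and should be checked by eye.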
https://hackmd.io/_lbKCM_UQ1qFUIDUX-W2LQ#8-daBalNigr1
Other links:
```
To point you to the directories on the farm where these fastas now exist:
/lustre/scratch124/tol/projects/darwin/sub_projects/organelles/mitochondria/fastas
/lustre/scratch124/tol/projects/darwin/sub_projects/organelles/chloroplast/chloroplast_fastas
The raw pmpap.py output is here:
/lustre/scratch124/tol/projects/darwin/sub_projects/organelles/pmpap_outputs_raw
And the raw reads used:
/lustre/scratch124/tol/projects/darwin/sub_projects/organelles/raw_reads_to_assemble.tsv
/lustre/scratch124/tol/projects/darwin/sub_projects/organelles/pmpap_outputs_raw/poa_gfas
```
# 3. Papers
- https://doi.org/10.1016/j.mito.2020.06.002
- Turudić, A.; Liber, Z.; Grdiša, M.; Jakše, J.; Varga, F.; Šatović, Z. Towards the Well-Tempered Chloroplast DNA Sequences. Plants 2021, 10, 1360. https://doi.org/10.3390/plants10071360
- Kozik A, Rowan BA, Lavelle D, Berke L, Schranz ME, Michelmore RW, et al. (2019) The alternative reality of plant mitochondrial DNA: One ring does not rule them all. PLoS Genet 15(8): e1008373. https://doi.org/10.1371/journal.pgen.1008373