# ddAraThal4
- Auto-generated Table of Contents
[ToC]
# 1 ddAraThal4
## 1.1 MBG k=1001
```/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg```

## 1.2 Chloroplast by hand
```
(biopython) mu2@tol-head2:/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg$ python /software/team311/mu2/join_segments.py NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.fa chloro.paths NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.gfa chloro.paths.joined.fasta
```
3+, 2-, 1-, 2+
## 1.3 Chloroplast annotation
So what I did for now (on 19.1.) was:
SSC-IRb-LSC-IRa
OK, first let me rotate it so that it starts at psbA, which is annotated at:
126760..127821 (gene="psbA")
```
mu2@tol-head1:/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg/mitohifi-chloro/PGA/target-nonrotated$ python /software/team311/mu2/MitoHiFi/scripts/rotate.py -i cloro.annotation_mtDNA_contig.fasta -r 126760 > cloro.annotation_mtDNA_contig.rot.psbA.fasta
INFO: rotating each sequence record 126760 bp clockwise...
INFO: cloro sequence (154474 bp) was rotated 126760 bp.
```
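A minimal sketch of what rotating a circular sequence means, on a toy string. The function name `rotate_circular` is hypothetical, and it assumes the 1-based `start` position becomes the first base; whether `rotate.py -r 126760` uses the same off-by-one convention has not been checked here.

```python
def rotate_circular(seq: str, start: int) -> str:
    """Rotate a circular sequence so that 1-based position `start` becomes base 1."""
    if not 1 <= start <= len(seq):
        raise ValueError("start must lie within the sequence")
    offset = start - 1
    # Everything from `start` onwards moves to the front; the prefix wraps to the end.
    return seq[offset:] + seq[:offset]


toy = "AAACCCGGGTTT"
rotated = rotate_circular(toy, 7)
print(rotated)  # GGGTTTAAACCC
```

The length is unchanged by rotation, so the rotated chloroplast should still be 154474 bp.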
Lucia says the chloroplast orientation should be LSC-IR1-SSC-IR2, with the SSC oriented so that it ends with ycf1 and the LSC starting at psbA.
What she wants is:
1-,2+,3-,2-
```
(biopython) mu2@tol-head1:/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg$ python /software/team311/mu2/join_segments.py NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.fa chloro-order NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.gfa chloro-order.fa
Collecting sequences from NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.fa
Reading links from NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.gfa
=======================================
Processing 1-,2+,3-,2-
1- 2+
================== Basic step
1- 2+ 1435M
From cigar 1435
Added coordinates: [87033 - 111870)
2+ 3-
================== Basic step
2+ 3- 1455M
From cigar 1455
Added coordinates: [111870 - 131092)
3- 2-
================== Basic step
3- 2- 1455M
From cigar 1455
Added coordinates: [131092 - 155909)
Writing chloro
```
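The "Added coordinates" in the log above can be reproduced with simple overlap arithmetic: each step appends the next segment minus the overlap given by the link CIGAR. This is only a sketch of that bookkeeping, not the code of `join_segments.py`. The function names are hypothetical, and the segment lengths are inferred from the log (added length plus overlap), not read from the GFA.

```python
import re


def overlap_from_cigar(cigar: str) -> int:
    """Return the overlap length from a simple '<n>M' CIGAR such as '1435M'."""
    match = re.fullmatch(r"(\d+)M", cigar)
    if match is None:
        raise ValueError(f"unsupported CIGAR: {cigar}")
    return int(match.group(1))


def added_intervals(first_len, steps):
    """Return the half-open [start, end) interval that each joined segment adds.

    `steps` is a list of (segment_length, link_cigar) for every segment after the first.
    """
    intervals = []
    end = first_len
    for seg_len, cigar in steps:
        start = end
        end = start + seg_len - overlap_from_cigar(cigar)
        intervals.append((start, end))
    return intervals


# Assumed lengths, inferred from the log: segment 1, segment 2, segment 3.
SEG1, SEG2, SEG3 = 87033, 26272, 20677
intervals = added_intervals(SEG1, [(SEG2, "1435M"), (SEG3, "1455M"), (SEG2, "1455M")])
print(intervals)  # [(87033, 111870), (111870, 131092), (131092, 155909)]
```

The joined length of 155909 bp minus one 1435 bp overlap is 154474 bp, the length reported in the `rotate.py` log, which would be consistent with trimming a single overlap on circularization; that link is not confirmed here.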
## 1.4 Lucia's requested order
1+,2+,3-,2-
```
mu2@tol-head2:/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg$ python /software/team311/mu2/join_segments.py NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.fa chloro-order2 NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.gfa chloro-order2.fa
Collecting sequences from NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.fa
Reading links from NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.gfa
=======================================
Processing 1+,2+,3-,2-
1+ 2+
================== Basic step
1+ 2+ 1435M
From cigar 1435
Added coordinates: [87033 - 111870)
2+ 3-
================== Basic step
2+ 3- 1455M
From cigar 1455
Added coordinates: [111870 - 131092)
3- 2-
================== Basic step
3- 2- 1455M
From cigar 1455
Added coordinates: [131092 - 155909)
Writing chloro
```
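A quick check on how the paths in these notes relate, assuming they describe walks around a circular molecule: two walks are treated as the same circle if one is a rotation of the other or of its reverse complement. The helper names are hypothetical.

```python
def parse_path(text):
    """Parse '3+, 2-, 1-, 2+' into [('3', '+'), ('2', '-'), ('1', '-'), ('2', '+')]."""
    steps = []
    for token in text.split(","):
        token = token.strip()
        steps.append((token[:-1], token[-1]))
    return steps


def reverse_complement_path(path):
    """Reverse the step order and flip every orientation."""
    flip = {"+": "-", "-": "+"}
    return [(seg, flip[ori]) for seg, ori in reversed(path)]


def same_circular_walk(a, b):
    """True if b is a rotation of a, or a rotation of a's reverse complement."""
    if len(a) != len(b):
        return False
    candidates = [a, reverse_complement_path(a)]
    return any(c[i:] + c[:i] == b for c in candidates for i in range(len(c)))


by_hand = parse_path("3+, 2-, 1-, 2+")     # section 1.2
first_try = parse_path("1-,2+, 3-, 2-")    # section 1.3
requested = parse_path("1+,2+,3-,2-")      # section 1.4

print(same_circular_walk(by_hand, requested))  # True
print(same_circular_walk(by_hand, first_try))  # False
```

Under that assumption, the path in 1.4 is the hand-made path from 1.2 read on the other strand and started at segment 1, while the path from 1.3 differs in the orientation of segment 1 relative to the rest.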

# 2 Mitochondria
Going manually across the graph, following the path below:
4-,6+,9-,7-,5-,6+,8-,7-
Then used join_segments.py (Sergey's code) to merge the segments and create a FASTA:
```
(biopython) mu2@tol-head1:/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg$ python /software/team311/mu2/join_segments.py NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.fa mito_mu2 NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.gfa mito_mu2.fa
```
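As a small sanity check on the hand-picked path, the sketch below counts how often each segment is traversed. Segments visited more than once are the ones the walk treats as repeats; this is only a tally of the path as written above, not a statement about the graph itself.

```python
from collections import Counter

path = "4-,6+, 9-, 7-, 5-, 6+, 8-, 7-"
steps = [token.strip() for token in path.split(",")]

# Drop the trailing orientation sign to count by segment name.
usage = Counter(step[:-1] for step in steps)
repeated = sorted(seg for seg, n in usage.items() if n > 1)

print(len(steps))  # 8
print(repeated)    # ['6', '7']
```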
Then ran MitoHiFi for the final circularization of the ends and automated annotation.
MitoHiFi run directory:
```
/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg/mitohifi_mito_mu2
```
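The overall result is good: the final mitogenome is 368831 bp with 60 genes, against 367808 bp and 58 genes for the related mitogenome, with one frameshift flagged (rpl2):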
```
# Related mitogenome is 367808 bp long and has 58 genes
contig_id frameshifts_found genbank_file length(bp) number_of_genes
final_mitogenome rpl2 final_mitogenome.gb 368831 60
```
## 2.1 Mapping view of mitochondria
The mapping view is very clean.

Where are these files?
```
/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg/mitohifi_mito_mu2/mapping/mito_mu2_rotated.bam
```
samtools flagstat output:
19581 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
19581 + 0 mapped (100.00% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
The results above come from a mapping that used both the assembled mitochondrial and chloroplast sequences as references.
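For reading the flagstat numbers programmatically, a minimal sketch that parses an excerpt of the output above into counts. The function name `flagstat_counts` is hypothetical, and only the QC-passed column is kept. With 0 secondary and 0 supplementary records, every line in the BAM is a primary alignment, so the total equals the number of mapped reads.

```python
import re

# Excerpt of the flagstat output recorded above.
FLAGSTAT = """\
19581 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
19581 + 0 mapped (100.00% : N/A)
"""


def flagstat_counts(text):
    """Map each flagstat label to its QC-passed count, dropping any '(...)' suffix."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"(\d+) \+ (\d+) (.+?)(?: \(.*)?$", line)
        if match:
            counts[match.group(3)] = int(match.group(1))
    return counts


counts = flagstat_counts(FLAGSTAT)
primary = counts["in total"] - counts["secondary"] - counts["supplementary"]
print(primary, counts["mapped"])  # 19581 19581
```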