# ddAraThal4

- Auto-generated Table of Contents
[ToC]

# 1 ddAraThal4

## 1.1 MBG k=1001

```
/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg
```

![](https://i.imgur.com/rlXAzfe.png)

## 1.2 Chloroplast by hand

```
(biopython) mu2@tol-head2:/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg$ python /software/team311/mu2/join_segments.py NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.fa chloro.paths NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.gfa chloro.paths.joined.fasta
```

Path: 3+, 2-, 1-, 2+

## 1.3 Chloroplast annotation

What I did for now (on 19.1.) gave the order SSC-IRb-LSC-IRa.

OK, first let me rotate the sequence so that it starts at psbA: `126760..127821 gene="psbA"`

```
mu2@tol-head1:/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg/mitohifi-chloro/PGA/target-nonrotated$ python /software/team311/mu2/MitoHiFi/scripts/rotate.py -i cloro.annotation_mtDNA_contig.fasta -r 126760 > cloro.annotation_mtDNA_contig.rot.psbA.fasta
INFO: rotating each sequence record 126760 bp clockwise...
INFO: cloro sequence (154474 bp) was rotated 126760 bp.
```

Lucia says the chloroplast orientation should be LSC-IR1-SSC-IR2, with the SSC oriented so that it ends with ycf1 and the LSC starting at psbA.

What she wants is: 1-, 2+, 3-, 2-

```
(biopython) mu2@tol-head1:/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg$ python /software/team311/mu2/join_segments.py NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.fa chloro-order NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.gfa chloro-order.fa
Collecting sequences from NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.fa
Reading links from NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.gfa
=======================================
Processing 1-,2+,3-,2-
1- 2+
==================
Basic step 1- 2+ 1435M
From cigar 1435
Added coordinates: [87033 - 111870)
2+ 3-
==================
Basic step 2+ 3- 1455M
From cigar 1455
Added coordinates: [111870 - 131092)
3- 2-
==================
Basic step 3- 2- 1455M
From cigar 1455
Added coordinates: [131092 - 155909)
Writing chloro
```

## 1.4 Lucia's requested order

1+, 2+, 3-, 2-

```
mu2@tol-head2:/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg$ python /software/team311/mu2/join_segments.py NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.fa chloro-order2 NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.gfa chloro-order2.fa
Collecting sequences from NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.fa
Reading links from NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.gfa
=======================================
Processing 1+,2+,3-,2-
1+ 2+
==================
Basic step 1+ 2+ 1435M
From cigar 1435
Added coordinates: [87033 - 111870)
2+ 3-
==================
Basic step 2+ 3- 1455M
From cigar 1455
Added coordinates: [111870 - 131092)
3- 2-
==================
Basic step 3- 2- 1455M
From cigar 1455
Added coordinates: [131092 - 155909)
Writing chloro
```

![](https://i.imgur.com/xDHeSyB.png)

# 2 Mitochondria

I walked through the graph manually, following the path below:

4-, 6+, 9-, 7-, 5-, 6+, 8-, 7-

Then I used join_segments (Sergey's code) to merge the segments and create a FASTA:

```
(biopython) mu2@tol-head1:/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg$ python /software/team311/mu2/join_segments.py NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.fa mito_mu2 NC_037304.1.MZ323108.1.fasta.BOTH.HiFiMapped.bam.filtered.1k.gfa mito_mu2.fa
```

Then I ran MitoHiFi for the final circularization of the ends and automated annotation.

MitoHiFi run:

```
/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg/mitohifi_mito_mu2
```

The general result is good:

```
# Related mitogenome is 367808 bp long and has 58 genes
contig_id         frameshifts_found  genbank_file         length(bp)  number_of_genes
final_mitogenome  rpl2               final_mitogenome.gb  368831      60
```

## 2.1 Mapping view of mitochondria

The mapping view is very clean.

![](https://i.imgur.com/ukgAzCF.png)

Where are these files?

```
/lustre/scratch124/tol/projects/darwin/data/dicots/Arabidopsis_thaliana/working/mito_chloro_mbg/mitohifi_mito_mu2/mapping/mito_mu2_rotated.bam
```

`samtools flagstat` output:

```
19581 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
19581 + 0 mapped (100.00% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
```

The above results come from a mapping that used both the assembled mitochondrial and chloroplast sequences as references.
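
For reference, the path-joining step used for both organelles (oriented segments such as `1-,2+,3-,2-`, with an overlap like `1435M` dropped at each junction) can be sketched in Python. This is only an illustration of the idea, not Sergey's `join_segments.py`, whose internals are not shown here. The function names (`revcomp`, `parse_path`, `join_path`) and the layout of the `overlaps` dictionary are assumptions made for this example, and the sequences are toy data.

```python
# Illustrative sketch only: NOT the actual join_segments.py.
# All names and the `overlaps` layout below are hypothetical.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def parse_path(path):
    """Turn '1-,2+, 3-' into [('1', '-'), ('2', '+'), ('3', '-')]."""
    steps = []
    for token in path.split(","):
        token = token.strip()
        if token:
            # Last character is the orientation, the rest is the segment name.
            steps.append((token[:-1], token[-1]))
    return steps


def join_path(path, segments, overlaps):
    """Concatenate oriented segments, trimming the overlap at each junction.

    segments: dict mapping segment name -> forward-strand sequence
    overlaps: dict mapping ((name, orient), (name, orient)) -> overlap in bp,
              e.g. the number taken from an 'NM' CIGAR on a graph link
    """
    joined = ""
    previous = None
    for name, orient in parse_path(path):
        seq = segments[name] if orient == "+" else revcomp(segments[name])
        if previous is None:
            joined = seq  # first segment is kept whole
        else:
            trim = overlaps[(previous, (name, orient))]
            joined += seq[trim:]  # skip bases already covered by the overlap
        previous = (name, orient)
    return joined


# Toy example: 'a' ends with 'CG' and 'b' starts with 'CG' (2 bp overlap).
toy_segments = {"a": "AAACG", "b": "CGTTT"}
toy_overlaps = {(("a", "+"), ("b", "+")): 2}
print(join_path("a+,b+", toy_segments, toy_overlaps))  # prints AAACGTTT
```

With this scheme each step after the first adds the segment length minus the junction overlap, which is presumably why the logged coordinate ranges are half-open intervals that start where the previous one ended.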