# Plants

- Auto-generated Table of Contents

[ToC]

## 1. ucDunPrim2

Mito and chloro FastK profiles

So-called Chloroplast reads

### Chloroplast

![](https://i.imgur.com/nbLat9A.png)

---

### Mito reads - more promising!

![](https://i.imgur.com/aLtpWNj.png)

---

## 2. dhQueRobu3

### Chloroplast-reads Profiles (Oak)

Plotting Oak (dhQueRobu3) median and all k-mer profiles for chloroplast-selected reads

```/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/chloro-mitohivi_v2/r ```

![](https://i.imgur.com/JKMTBV8.png)

### Assembly graph from hifiasm

```/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/chloro-mitohivi_v2/r```

gbk.HiFiMapped.bam.filtered.assembled.p_ctg.noseq.gfa.png

gbk.HiFiMapped.bam.filtered.assembled.p_ctg.fa

- A = 17206588 (32.1%), C = 9585209 (17.9%), G = 9577275 (17.9%), T = 17187789 (32.1%), CpG = 2187430 (4.1%)
- sum = 53556861, n = 1300, mean = 41197.5853846154, largest = 109444, smallest = 20353
- N50 = 40889, L50 = 543
- N60 = 39505, L60 = 677
- N70 = 37627, L70 = 816
- N80 = 35337, L80 = 963
- N90 = 32726, L90 = 1120
- N100 = 20353, L100 = 1300

![](https://i.imgur.com/umrZRIC.png)

gbk.HiFiMapped.bam.filtered.assembled.a_ctg.noseq.gfa.png

gbk.HiFiMapped.bam.filtered.assembled.a_ctg.fa

- A = 4302158 (32.1%), C = 2372562 (17.7%), G = 2379167 (17.8%), T = 4339884 (32.4%), CpG = 450476 (3.4%)
- sum = 13393771, n = 411, mean = 32588.2506082725, largest = 85092, smallest = 13029
- N50 = 33824, L50 = 169
- N60 = 32300, L60 = 210
- N70 = 30175, L70 = 252
- N80 = 28049, L80 = 299
- N90 = 24758, L90 = 349
- N100 = 13029, L100 = 411

![](https://i.imgur.com/ZDwHafg.png)

### Chloroplast profMedian (Oak)

![](https://i.imgur.com/31dq47F.png)

### Mitochondrion (oak)

Assembly graphs

### r_utg.noseq.gfa

Haplotype-resolved raw unitig graph in GFA format (prefix.r_utg.gfa). This graph keeps all haplotype information, including somatic mutations and recurrent sequencing errors.
![](https://i.imgur.com/qTII7eb.png)

### p_utg.noseq.gfa

Haplotype-resolved processed unitig graph without small bubbles (prefix.p_utg.gfa). Small bubbles might be caused by somatic mutations or noise in data, which are not the real haplotype information.

![](https://i.imgur.com/VZ2NPW5.png)

### p_ctg.gfa

Primary assembly contig graph (prefix.p_ctg.gfa). This graph collapses different haplotypes.

![](https://i.imgur.com/6x0PwVh.png)

Profiles

![](https://i.imgur.com/32kb9x3.png)

### Mitohifi-plant for reads from 3000-15000

/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/mito-v2/mitohifi-plant-passedMedian300015000

### Common reads between mito and chloro

How come mitohifi has fished reads in common?

![](https://i.imgur.com/ITWRK5z.png)

Chloroplast reference was:

- LOCUS NC_046388 161172 bp DNA circular PLN 26-MAR-2020
- DEFINITION Quercus robur cultivar Fastigiata chloroplast, complete genome.

Mitochondrial reference was:

- LOCUS MN199236 412886 bp DNA circular PLN 04-APR-2020
- DEFINITION UNVERIFIED: Quercus variabilis mitochondrion sequence.

The two references don't have much in common.

![](https://i.imgur.com/FpDpXfv.png)

I'm re-running mitohifi to be sure I fished the reads in the correct way.

``` /lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/mito-v2/mitohifi-r-MN199236/again ```

It's confirmed: I got the same result as above.

What happens if I assemble those 3 pots separately?
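Splitting the fished reads into the 3 pots (chloroplast reference only, both references, mito reference only) comes down to set operations on read names. A minimal sketch, assuming the read IDs of each fished set have already been extracted into plain lists; the function name `partition_read_ids` and the toy read names are hypothetical and not part of the actual pipeline:

```python
def partition_read_ids(chloro_ids, mito_ids):
    """Split read IDs into chloro-only, common, and mito-only sets."""
    chloro, mito = set(chloro_ids), set(mito_ids)
    return {
        "chloro_only": chloro - mito,  # fished only by the chloroplast reference
        "common": chloro & mito,       # fished by both references
        "mito_only": mito - chloro,    # fished only by the mito reference
    }


# Toy read names (hypothetical); real IDs would come from the two fished read sets.
pots = partition_read_ids(["r1", "r2", "r3"], ["r3", "r4"])
for name, ids in pots.items():
    print(name, len(ids))
```

Each resulting set can then be written out as a name list and used to subset the reads before assembling the pots separately.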
OK, so now I'm running MitoHifi-plants and MBG on the groups of reads above:

- 65807 reads that are fished only by the ref-chloroplast

  ```/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/chloro-mitohivi_v2/uniq_chloro1```

  ![](https://i.imgur.com/r3e9cua.png)

- 48289 reads that are fished by both chloro and mito

  ```/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/mito-v2/mitohifi-r-MN199236/common_reads_mitochloro ```

  ![](https://i.imgur.com/O5I2zjZ.png)

- 40543 reads that were fished by ref-mito only

  ```/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/mito-v2/mitohifi-r-MN199236/reads_fished_only_mito/mbg```

  ![](https://i.imgur.com/iqNfStW.png)

Sergey's result

```/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/mito-v2/sergey/share/```

![](https://i.imgur.com/2D7hWbp.png)

### To dos

- [ ] Hi-C data
- [x] Assembly graph for hifiasm nuclear
- [ ] MEGAHIT
- [ ] Flye (permissive with gluing things together)
- [ ] jumboDBG (DBG-like)

## 3. drPotAnse1

Potentilla_anserina

ProfMedian Mito-reads

``` /lustre/scratch116/tol/projects/darwin/data/dicots/Potentilla_anserina/working/mito/profMedian ```

![](https://i.imgur.com/gUKVqai.png)

OK, from the image above I took the reads from 1200 to 2800. I BLASTed a few and they look like chloroplasts!

![](https://i.imgur.com/phOWmKJ.png)

Tetranucs from Claudia. Blue is the first peak above.

![](https://i.imgur.com/zGU5hdw.png)

- [ ] Running mito-plants for reads in the peak 1200-2800

  ```/lustre/scratch116/tol/projects/darwin/data/dicots/Potentilla_anserina/working/mito/mito-plants-12002800```

- [ ] Running mito-plants for reads in the peak 4500-6350
- [ ] Running profs for chloro reads

  ```/lustre/scratch116/tol/projects/darwin/data/dicots/Potentilla_anserina/working/chloro```

Chloroplast reads

![](https://i.imgur.com/0ysOgQg.png)

### drPotAnse1 MBG mito 1k

![](https://i.imgur.com/69gx82z.png)

## 4. drFilUlma1

### Chloroplast

Many different contigs annotated. None is the expected size or circularizes.

/lustre/scratch116/tol/projects/darwin/data/dicots/Filipendula_ulmaria/working/chloro/chloro-plants

Running profMedian

## 5. ucDunPrim2

```/lustre/scratch116/tol/projects/darwin/data/algae/Dunaliella_primolecta/working/mito/mito-plants```

To dos:

- [ ] profMedian mito
- [ ] blast mito and chloro
- [ ] make a Venn diagram

## 6. Notes

Papers:

- https://www.nature.com/articles/s41598-019-41377-w
- https://www.mdpi.com/2223-7747/10/7/1360/htm

## 7. drAilAlti1

### Mito

MBG -k 1001 -w 250 -a 5 -u 150

![](https://i.imgur.com/67Pc923.png)

MBG -k 5001 -w 250 -a 5 -u 150

![](https://i.imgur.com/F3MroDN.png)

- 4 - chloro
- 3 - chloro
- 2 - chloro
- 1 - chloro

### Chloro

MBG -k 3001 -w 250 -a 5 -u 150

![](https://i.imgur.com/5anaDc8.png)

## 7. dcPolAvic1

Polygonum_aviculare

Mito

k1001

![](https://i.imgur.com/PpfABI1.png)

k6001

![](https://i.imgur.com/BAE3SYt.png)

- [ ] recombination of plant mito?
  - cat mito & chloro and map to it, then select the better mapping

----------------------------------------------------------------

## 8. cbRhyLore1

### 8.1 cbRhyLore1 MBG k=1001

![](https://i.imgur.com/SgLi5Bu.png)

----------------------------------------------------------------

### 8.2 cbRhyLore1 MBG k=5001

![](https://i.imgur.com/zWKtvmg.png)

### 8.4 Chloroplast

Linear (from MBG k=1001)

![](https://i.imgur.com/ww1roHm.png)

Circular (from MBG k=1001)

![](https://i.imgur.com/YpVM7jA.png)

Chloroplast PacBio reads coverage

![](https://i.imgur.com/KTrTZ9H.png)

contigs_stats.tsv:

Related mitogenome is 125195 bp long and has 116 genes

| contig_id | frameshifts? | length(bp) | number of genes |
| -------- | -------- | -------- | -------- |
| final_mitogenome | No | 124828 | 104 |

Mitohifi run:

```/lustre/scratch116/tol/projects/darwin/data/non-vascular-plants/Rhytidiadelphus_loreus/working/mito_chloroplast/mitohifi_chloro ```

### 8.4.1 Chloroplast 10x coverage

![](https://i.imgur.com/szbDWYP.png)

![](https://i.imgur.com/OeXE83b.png)

Flagstats of mapped 10x reads:

```
flagstat chlor_rotated.bam
2845471 + 0 in total (QC-passed reads + QC-failed reads)
80651 + 0 secondary
0 + 0 supplementary
531228 + 0 duplicates
2845471 + 0 mapped (100.00% : N/A)
2764820 + 0 paired in sequencing
1383339 + 0 read1
1381481 + 0 read2
1391229 + 0 properly paired (50.32% : N/A)
2738426 + 0 with itself and mate mapped
26394 + 0 singletons (0.95% : N/A)
17598 + 0 with mate mapped to a different chr
7079 + 0 with mate mapped to a different chr (mapQ>=5)
```

### 8.5 Mitochondria

Linear (MBG k=1001)

![](https://i.imgur.com/IWj7V0L.png)

Circular (MBG k=1001)

![](https://i.imgur.com/c5AVP42.png)

mito 10x coverage

![](https://i.imgur.com/EYSO8yi.png)

```
ND5 complement(71541..72254)
/translation="E*PTPVSASIHAATMVTAGVFMIARCSPLFEYSSTALIVITFVGA
MTSFFAATTGILQNDLKRVIAYSTCSQLGYMIFACGISNYSVSVFHLMNHAFFKALLFL
SAGSVIHAMSDEQDMRKMGGLASLLPFTYAMMLIGSLSLIGFPFRTGFYSKDVILELAY
TKYTISGNFAFWLGSVSVFFTSYYSFRLLFLTFLASTNSFKRDILRCHDAPILMAIPLI
FLAFGSIFVGYVAKV*
```

ND5 seems to have a frameshift at the beginning, and the mapping of PacBio and 10x reads is close to perfect in that area.

![](https://i.imgur.com/1l8inDT.png)

![](https://i.imgur.com/av04ScI.png)

## 9 dhQueRobu3

### 9.1 dhQueRobu3 MBG k=1001

![](https://i.imgur.com/CilWF2s.png)

----------------------------------------------------------------

### 9.2 dhQueRobu3 MBG k=5001

![](https://i.imgur.com/dpjzCYl.png)

----------------------------------------------------------------

### 9.1.1 Working to construct mito versions from MBG k=1001

1+,4+,5+,8+,6-,4-,2-,3+,9-,8-,7+,3-

PacBio reads coverage:
![](https://i.imgur.com/j3N92qx.png) ![](https://i.imgur.com/QN6paKW.png) ![](https://i.imgur.com/PIbOK8Q.png)
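The signed node path in 9.1.1 (`1+,4+,5+,8+,6-,4-,2-,3+,9-,8-,7+,3-`) can be turned into a linear sequence by orienting each graph segment and joining them. The sketch below is illustrative only: the helper names (`parse_path`, `both_orientations`, `stitch`) are hypothetical, the toy segments are made up, and the single fixed `overlap` is a simplifying assumption (in a real GFA the overlap is given per edge on the `L` lines).

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def parse_path(path):
    """Turn '1+,4-' into [('1', '+'), ('4', '-')]."""
    steps = []
    for step in path.split(","):
        step = step.strip()
        steps.append((step[:-1], step[-1]))
    return steps


def both_orientations(steps):
    """List nodes that the path traverses on both strands."""
    seen = {}
    for node, strand in steps:
        seen.setdefault(node, set()).add(strand)
    return sorted(node for node, strands in seen.items() if strands == {"+", "-"})


def stitch(steps, segments, overlap=0):
    """Join oriented segments, trimming `overlap` bases from every segment after the first.

    Assumption: one fixed overlap for all edges, which is a simplification.
    """
    pieces = []
    for i, (node, strand) in enumerate(steps):
        seq = segments[node]
        if strand == "-":
            seq = reverse_complement(seq)
        pieces.append(seq if i == 0 else seq[overlap:])
    return "".join(pieces)


steps = parse_path("1+,4+,5+,8+,6-,4-,2-,3+,9-,8-,7+,3-")
print(both_orientations(steps))  # ['3', '4', '8']

# Toy segments (hypothetical), only to show the orientation handling.
toy = stitch([("a", "+"), ("b", "-")], {"a": "AAC", "b": "GGT"})
print(toy)  # AACACC
```

Nodes 3, 4 and 8 are each visited once per strand in this path, which is the pattern expected when repeats are traversed in both directions.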