# Plants
- Auto-generated Table of Contents
[ToC]
## 1. ucDunPrim2
Mito and chloro FastK profiles
So-called Chloroplast reads
### Chloroplast

---
### Mito reads - more promising!

---
## 2. dhQueRobu3
### Chloroplast-reads Profiles (Oak)
Plotting oak (dhQueRobu3) median and all k-mer profiles for chloroplast-selected reads
```/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/chloro-mitohivi_v2/r ```

### Assembly graph from hifiasm
```/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/chloro-mitohivi_v2/r```
gbk.HiFiMapped.bam.filtered.assembled.p_ctg.noseq.gfa.png
gbk.HiFiMapped.bam.filtered.assembled.p_ctg.fa
A = 17206588 (32.1%), C = 9585209 (17.9%), G = 9577275 (17.9%), T = 17187789 (32.1%), CpG = 2187430 (4.1%)
sum = 53556861, n = 1300, mean = 41197.5853846154, largest = 109444, smallest = 20353
N50 = 40889, L50 = 543
N60 = 39505, L60 = 677
N70 = 37627, L70 = 816
N80 = 35337, L80 = 963
N90 = 32726, L90 = 1120
N100 = 20353, L100 = 1300

gbk.HiFiMapped.bam.filtered.assembled.a_ctg.noseq.gfa.png
gbk.HiFiMapped.bam.filtered.assembled.a_ctg.fa
A = 4302158 (32.1%), C = 2372562 (17.7%), G = 2379167 (17.8%), T = 4339884 (32.4%), CpG = 450476 (3.4%)
sum = 13393771, n = 411, mean = 32588.2506082725, largest = 85092, smallest = 13029
N50 = 33824, L50 = 169
N60 = 32300, L60 = 210
N70 = 30175, L70 = 252
N80 = 28049, L80 = 299
N90 = 24758, L90 = 349
N100 = 13029, L100 = 411
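The Nx/Lx values above follow a simple rule: sort contigs longest first, accumulate their lengths, and report the contig length (Nx) and contig count (Lx) at which the running sum first reaches x% of the total. A minimal sketch of that rule, assuming the contig lengths are already available as a list of integers (the function name `nx_lx` and the toy lengths are illustrative, not part of the tool that produced the stats above):

```python
def nx_lx(lengths, percent):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the contig at which the cumulative sum of lengths,
    taken longest first, first reaches `percent` of the total; Lx is the
    number of contigs needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the boundary.
        if running * 100 >= total * percent:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy example (hypothetical lengths, total = 300).
toy_lengths = [100, 80, 60, 40, 20]
for pct in (50, 90, 100):
    print(pct, nx_lx(toy_lengths, pct))
```

With real data, the lengths would come from the `p_ctg.fa` or `a_ctg.fa` records.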

### Chloroplast profMedian (Oak)

### Mitochondrion (oak)
Assembly graphs
### r_utg.noseq.gfa
Haplotype-resolved raw unitig graph in GFA format (prefix.r_utg.gfa). This graph keeps all haplotype information, including somatic mutations and recurrent sequencing errors.

### p_utg.noseq.gfa
Haplotype-resolved processed unitig graph without small bubbles (prefix.p_utg.gfa). Small bubbles might be caused by somatic mutations or noise in data, which are not the real haplotype information.

### p_ctg.gfa
Primary assembly contig graph (prefix.p_ctg.gfa). This graph collapses different haplotypes.

Profiles

### Mitohifi-plant for reads from 3000-15000
/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/mito-v2/mitohifi-plant-passedMedian300015000
### Common reads between mito and chloro
Why has MitoHiFi fished out reads that are common to both the mito and the chloro sets?

The chloroplast reference was:
LOCUS NC_046388 161172 bp DNA circular PLN 26-MAR-2020
DEFINITION Quercus robur cultivar Fastigiata chloroplast, complete genome.

The mitochondrial reference was:
LOCUS MN199236 412886 bp DNA circular PLN 04-APR-2020
DEFINITION UNVERIFIED: Quercus variabilis mitochondrion sequence.

The two references don't have much in common.

I'm re-running MitoHiFi to be sure I fished the reads in the correct way.
``` /lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/mito-v2/mitohifi-r-MN199236/again ```
It's confirmed: I got the same result as above.
What happens if I assemble those 3 groups separately?
OK, so now I'm running MitoHiFi-plants and MBG on each of these groups of reads:
- 65807 reads that are fished only by the ref-chloroplast
```/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/chloro-mitohivi_v2/uniq_chloro1```

- 48289 reads that are fished by both, chloro and mito
```/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/mito-v2/mitohifi-r-MN199236/common_reads_mitochloro ```

- 40543 reads that were fished by ref-mito only
```/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/mito-v2/mitohifi-r-MN199236/reads_fished_only_mito/mbg```
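The three groups above are the regions of a two-set Venn diagram over read IDs. A minimal sketch of that split, assuming the IDs fished by each reference are available as plain lists of read names (the function name and the toy IDs are hypothetical):

```python
def partition_reads(chloro_ids, mito_ids):
    """Split read IDs into chloro-only, shared, and mito-only groups."""
    chloro, mito = set(chloro_ids), set(mito_ids)
    return {
        "chloro_only": chloro - mito,  # fished only by the chloroplast reference
        "both": chloro & mito,         # fished by both references
        "mito_only": mito - chloro,    # fished only by the mitochondrial reference
    }


# Toy example with made-up read names.
groups = partition_reads(
    chloro_ids=["r1", "r2", "r3", "r4"],
    mito_ids=["r3", "r4", "r5"],
)
for name, ids in groups.items():
    print(name, len(ids), sorted(ids))
```

Each group can then be written to its own ID list and used to subset the reads before running the assemblers separately.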

Sergey's result
```/lustre/scratch116/tol/projects/darwin/data/dicots/Quercus_robur/working/mito-v2/sergey/share/```

### To dos
- [ ] Hi-C data
- [x] Assembly graph for hifiasm nuclear
- [ ] MEGAHIT
- [ ] Flye (permissive with gluing things together)
- [ ] jumboDBG (DBG-like)
## 3. drPotAnse1
Potentilla_anserina
ProfMedian Mito-reads
``` /lustre/scratch116/tol/projects/darwin/data/dicots/Potentilla_anserina/working/mito/profMedian ```

OK, from the image above I selected the reads in the 1200 to 2800 peak. I BLASTed a few and they look like chloroplast!
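Selecting a peak comes down to keeping reads whose profile median falls inside a range. A minimal sketch, assuming the per-read medians have been exported to a two-column, tab-separated table of read name and median (this layout and the function name are assumptions, not the actual profMedian output format):

```python
def select_reads_by_median(lines, low, high):
    """Return names of reads whose median k-mer count lies in [low, high]."""
    selected = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, median = line.split("\t")[:2]
        if low <= float(median) <= high:
            selected.append(name)
    return selected


# Toy table with made-up read names and medians.
toy_table = [
    "# read\tmedian",
    "readA\t950",
    "readB\t1200",
    "readC\t2100.5",
    "readD\t2800",
    "readE\t5000",
]
print(select_reads_by_median(toy_table, 1200, 2800))
```

The same function covers the second peak by changing the bounds to 4500 and 6350.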

Tetranucleotide plots from Claudia. Blue is the first peak above.

- [ ] Running mito-plants for reads in the peak 1200-2800
```/lustre/scratch116/tol/projects/darwin/data/dicots/Potentilla_anserina/working/mito/mito-plants-12002800```
- [ ] Running mito-plants for reads in the peak 4500-6350
- [ ] Running profs for chloro reads
```/lustre/scratch116/tol/projects/darwin/data/dicots/Potentilla_anserina/working/chloro```
Chloroplast reads

### drPotAnse1 MBG mito 1k

## 4. drFilUlma1
### Chloroplast
Many different contigs were annotated. None has the expected size or circularizes.
/lustre/scratch116/tol/projects/darwin/data/dicots/Filipendula_ulmaria/working/chloro/chloro-plants
Running profMedian
## 5. ucDunPrim2
```/lustre/scratch116/tol/projects/darwin/data/algae/Dunaliella_primolecta/working/mito/mito-plants```
To dos:
- [ ] profMedian mito
- [ ] blast mito and chloro
- [ ] make a Venn diagram
## 6. Notes
Papers:
- https://www.nature.com/articles/s41598-019-41377-w
- https://www.mdpi.com/2223-7747/10/7/1360/htm
## 7. drAilAlti1
### Mito
`MBG -k 1001 -w 250 -a 5 -u 150`

`MBG -k 5001 -w 250 -a 5 -u 150`

- 4: chloro
- 3: chloro
- 2: chloro
- 1: chloro
### Chloro
`MBG -k 3001 -w 250 -a 5 -u 150`

## 7. dcPolAvic1 Polygonum_aviculare
Mito k1001

k6001

- [ ] recombination of plant mito?
- concatenate the mito and chloro sequences and map the reads to the combined reference,
  then select the better mapping
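One way to "select the better mapping" is to keep, for each read, the target with the most matching bases. A minimal sketch, assuming the alignments are in PAF format (column 1 read name, column 6 target name, column 10 number of matching bases); the target names `chloro` and `mito`, the lengths, and the choice of matching bases as the score are all assumptions for illustration:

```python
def best_target_per_read(paf_lines):
    """Map each read name to the target with the most matching bases."""
    best = {}
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip empty or malformed lines
        read, target, matches = fields[0], fields[5], int(fields[9])
        # Keep the first alignment seen, then replace it only on a strictly better score.
        if read not in best or matches > best[read][1]:
            best[read] = (target, matches)
    return {read: target for read, (target, _) in best.items()}


# Toy PAF records with made-up names and lengths.
toy_paf = [
    "read1\t1000\t0\t1000\t+\tchloro\t150000\t10\t1010\t990\t1000\t60",
    "read1\t1000\t0\t600\t+\tmito\t400000\t50\t650\t580\t600\t20",
    "read2\t800\t0\t800\t-\tmito\t400000\t5\t805\t795\t800\t60",
]
print(best_target_per_read(toy_paf))
```

Reads that score equally well on both targets stay with whichever alignment appears first, so ties may deserve a separate look given the shared reads seen in oak.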
----------------------------------------------------------------
## 8. cbRhyLore1
### 8.1 cbRhyLore1 MBG k=1001

----------------------------------------------------------------
### 8.2 cbRhyLore1 MBG k=5001

### 8.4 Chloroplast
Linear (from MBG k=1001)

Circular (from MBG k=1001)

Chloroplast Pacbio reads coverage

contigs_stats.tsv: Related mitogenome is 125195 bp long and has 116 genes

| contig_id | frameshifts? | length (bp) | number of genes |
| -------- | -------- | -------- | -------- |
| final_mitogenome | No | 124828 | 104 |

MitoHiFi run: ```/lustre/scratch116/tol/projects/darwin/data/non-vascular-plants/Rhytidiadelphus_loreus/working/mito_chloroplast/mitohifi_chloro ```
### 8.4.1 Chloroplast 10x coverage


Flagstats of mapped 10x reads:
```
samtools flagstat chlor_rotated.bam
2845471 + 0 in total (QC-passed reads + QC-failed reads)
80651 + 0 secondary
0 + 0 supplementary
531228 + 0 duplicates
2845471 + 0 mapped (100.00% : N/A)
2764820 + 0 paired in sequencing
1383339 + 0 read1
1381481 + 0 read2
1391229 + 0 properly paired (50.32% : N/A)
2738426 + 0 with itself and mate mapped
26394 + 0 singletons (0.95% : N/A)
17598 + 0 with mate mapped to a different chr
7079 + 0 with mate mapped to a different chr (mapQ>=5)
```
### 8.5 Mitochondria
Linear (MBG k=1001)

Circular (MBG k=1001)

mito 10x coverage

```
ND5
complement(71541..72254)
/translation="E*PTPVSASIHAATMVTAGVFMIARCSPLFEYSSTALIVITFVGA
MTSFFAATTGILQNDLKRVIAYSTCSQLGYMIFACGISNYSVSVFHLMNHAFFKALLFL
SAGSVIHAMSDEQDMRKMGGLASLLPFTYAMMLIGSLSLIGFPFRTGFYSKDVILELAY
TKYTISGNFAFWLGSVSVFFTSYYSFRLLFLTFLASTNSFKRDILRCHDAPILMAIPLI
FLAFGSIFVGYVAKV*
```
ND5 seems to have a frameshift near the beginning (the translation has a stop codon at its second residue), yet the PacBio and 10x read mappings are nearly perfect in that region.
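A quick consistency check on the annotation above: the feature length should equal three times the translation length, and any `*` before the last residue marks a premature stop. A minimal sketch using the translation copied from the feature (the function name is illustrative):

```python
# Translation copied from the ND5 feature in these notes.
ND5_TRANSLATION = (
    "E*PTPVSASIHAATMVTAGVFMIARCSPLFEYSSTALIVITFVGA"
    "MTSFFAATTGILQNDLKRVIAYSTCSQLGYMIFACGISNYSVSVFHLMNHAFFKALLFL"
    "SAGSVIHAMSDEQDMRKMGGLASLLPFTYAMMLIGSLSLIGFPFRTGFYSKDVILELAY"
    "TKYTISGNFAFWLGSVSVFFTSYYSFRLLFLTFLASTNSFKRDILRCHDAPILMAIPLI"
    "FLAFGSIFVGYVAKV*"
)


def internal_stops(protein):
    """Return 1-based positions of '*' that are not the final residue."""
    last = len(protein)
    return [i for i, aa in enumerate(protein, start=1) if aa == "*" and i != last]


# Coordinates from complement(71541..72254); both ends are inclusive.
feature_length = 72254 - 71541 + 1
print("feature length (bp):", feature_length)
print("translated codons x 3:", len(ND5_TRANSLATION) * 3)
print("internal stop positions:", internal_stops(ND5_TRANSLATION))
```

An internal stop this early, combined with clean read support, would point to the annotation's start or frame rather than an assembly error, though that remains to be checked.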


## 9. dhQueRobu3
### 9.1 dhQueRobu3 MBG k=1001

----------------------------------------------------------------
### 9.2 dhQueRobu3 MBG k=5001

----------------------------------------------------------------
### 9.1.1 Working to construct mito versions from MBG k=1001
1+,4+,5+,8+,6-,4-,2-,3+,9-,8-,7+,3-
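The path above lists graph nodes with their orientation (`+` forward, `-` reverse complement). A minimal sketch that parses such a path, reports which nodes are traversed more than once, and orients each node's sequence, assuming the node sequences have been loaded into a dictionary keyed by node name (the function names and the toy sequence are hypothetical). It deliberately does not join the pieces, because adjacent nodes in the graph overlap and those overlaps would need trimming first:

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_path(path):
    """Turn '1+,4-' into [('1', '+'), ('4', '-')]."""
    steps = []
    for token in path.split(","):
        token = token.strip()
        if len(token) < 2 or token[-1] not in "+-":
            raise ValueError(f"bad path step: {token!r}")
        steps.append((token[:-1], token[-1]))
    return steps


def oriented_segments(steps, segments):
    """Return each node's sequence in path order, reverse-complemented for '-'."""
    oriented = []
    for node, orientation in steps:
        seq = segments[node]
        if orientation == "-":
            seq = seq.translate(_COMPLEMENT)[::-1]
        oriented.append(seq)
    return oriented


path = "1+,4+,5+,8+,6-,4-,2-,3+,9-,8-,7+,3-"
steps = parse_path(path)
usage = Counter(node for node, _ in steps)
repeated = sorted(node for node, n in usage.items() if n > 1)
print("steps:", len(steps))
print("nodes used more than once:", repeated)
```

Nodes that appear in both orientations are candidates for repeats that mediate alternative mito configurations.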
Pacbio reads coverage:


