# Mito read filtering with FastK
slide: https://hackmd.io/YvPVQo0MQKqagtPLTY0VKQ?both
---
A few slides on dealing with mitochondrion HiFi reads
---
## A good example:

---
**Mito reads selected by mapping against a closely related mito**
sum = 7237237, n = 816,
largest = 16347, smallest = 2320, N50 = 10613

---
Make a dict of the kmer occurrences within a read:
```
1 1302
2 428
3 330
```
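
A minimal sketch of how such a dict could be built. It assumes the first column above is a kmer count and the second is the number of kmers in the read with that count, and that the per-position kmer counts of a read are already available as a list of integers (however they were extracted from the FastK profiles). The name `count_histogram` and the example values are made up for illustration.

```python
from collections import Counter


def count_histogram(profile):
    """Map each kmer count to the number of kmers in the read with that count."""
    return dict(sorted(Counter(profile).items()))


# Hypothetical per-position kmer counts for one very short read
profile = [1, 1, 2, 3, 3, 3, 2, 1]

for count, n_kmers in count_histogram(profile).items():
    print(count, n_kmers)
```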

---
Make a rule:
Filter out reads that have too many occurrences of low-coverage kmers.
In the next case:
reads are filtered out if more than 200 kmers occur 1 to 20 times in the reads
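
The rule could be sketched roughly like this. It assumes each read comes with a dict mapping a kmer count to the number of kmers in the read with that count. The function name `passes_filter` and its parameter names are hypothetical; the thresholds (1 to 20, and 200) are the ones from this slide.

```python
def passes_filter(hist, low=1, high=20, max_low_kmers=200):
    """Return True if the read is kept, False if it is filtered out.

    hist: dict {kmer count: number of kmers in the read with that count}
    A read is filtered out when more than max_low_kmers of its kmers
    have a count between low and high (inclusive).
    """
    low_kmers = sum(n for count, n in hist.items() if low <= count <= high)
    return low_kmers <= max_low_kmers


# The example dict from the earlier slide: 1302 + 428 + 330 low-count kmers
print(passes_filter({1: 1302, 2: 428, 3: 330}))  # filtered out -> False
```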
---
- Sea star mito with frameshift before filtering
- No frameshift after filtering
---
- passed_no.py
- code to get only the "passed" reads
But I don't think this is the only or the best filter. Let's look at some other cases:

---
Using the same filter for this species

---
So the best filter would be:
- separate the reads around the different peaks
(still studying: how to automatically determine the peaks, which ones, etc.)
- Then plot profiles again and remove outliers
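
A rough sketch of the first step only, under strong assumptions: the peak positions are picked by hand (automatic peak detection is still an open question here), and each read has already been summarized by a single coverage value. The names `assign_to_peaks`, `read_coverage` and the numbers are hypothetical.

```python
def assign_to_peaks(read_coverage, peaks):
    """Group read ids by the nearest coverage peak.

    read_coverage: dict {read id: one coverage value summarizing the read}
    peaks: list of peak positions, chosen by hand for this sketch
    """
    groups = {peak: [] for peak in peaks}
    for read_id, cov in read_coverage.items():
        nearest = min(peaks, key=lambda p: abs(p - cov))
        groups[nearest].append(read_id)
    return groups


# Hypothetical reads and two hand-picked peaks
read_coverage = {"r1": 100, "r2": 400, "r3": 95}
print(assign_to_peaks(read_coverage, peaks=[100, 400]))
```

Each group can then be plotted on its own, so outliers are judged against their own peak instead of the whole read set.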

---
But... how about the bee case?
https://hackmd.io/SdEMPFz0S1GTZH6ZJm3WqA?both=#
---
Plants (separating chloroplasts and mitos)
https://hackmd.io/XnCMhNDSSwiX7jtHdp2f9g
---
Further ideas:
Calculate the median of the kmer coverage in each read?
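
One way this idea could look, as a sketch only: it assumes the per-position kmer counts of each read are available as lists of integers, and the name `median_kmer_coverage` and the example reads are invented. The median is used because a few very low or very high kmer counts in a read barely move it.

```python
from statistics import median


def median_kmer_coverage(profiles):
    """Return {read id: median kmer count}, skipping reads with no kmers."""
    return {
        read_id: median(counts)
        for read_id, counts in profiles.items()
        if counts
    }


# Hypothetical profiles: r1 has one outlier kmer count, r2 is uniform
profiles = {"r1": [1, 2, 3, 100], "r2": [50, 52, 51], "r3": []}
print(median_kmer_coverage(profiles))
```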
---
All mito kmers coverage

---