# Fractional sketches
## or, some details of how sourmash and FracMinHash work!
---
Consider the overlaps between k-mers extracted from three genomes: two that share sequence and one that does not.
From these we can calculate k-mer based similarity measures (Jaccard similarity and containment).
![](https://hackmd.io/_uploads/Hk_OnVC1j.png)
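
As a rough illustration of the two measures, here is a minimal Python sketch. The helper names (`kmers`, `jaccard`, `containment`) and the toy sequences are hypothetical and chosen only for this example. It also ignores details that real tools handle, such as treating a k-mer and its reverse complement as the same k-mer.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a, b):
    """Containment of A in B: |A ∩ B| / |A|."""
    return len(a & b) / len(a) if a else 0.0


# Toy sequences (hypothetical); real analyses use much larger k, e.g. 21 or 31.
a = kmers("ACGTA", 3)  # {"ACG", "CGT", "GTA"}
b = kmers("CGTAC", 3)  # {"CGT", "GTA", "TAC"}
c = kmers("TTTTT", 3)  # {"TTT"}

print(jaccard(a, b))      # 2 shared k-mers out of 4 distinct k-mers
print(containment(a, b))  # 2 of the 3 k-mers in a are also in b
print(jaccard(a, c))      # no shared k-mers
```

Jaccard similarity is symmetric, while containment is not: `containment(a, b)` and `containment(b, a)` differ whenever the two sets have different sizes.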
---
### FracMinHash sketching compresses k-mer collections while retaining set relationships
![](https://hackmd.io/_uploads/BJ6zxBC1i.png)
This is implemented in the software [sourmash](https://sourmash.readthedocs.io/en/latest/).
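
The idea can be sketched in a few lines of Python: hash every k-mer to a 64-bit integer and keep only the hashes that fall below a fixed threshold, so that roughly 1 in `scaled` k-mers is retained. This is an illustration under stated assumptions, not sourmash's implementation: `hash_kmer` and `frac_minhash` are hypothetical names, and `blake2b` is a stand-in for whatever hash function the real software uses.

```python
import hashlib

MAX_HASH = 2**64  # size of the 64-bit hash space


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (blake2b is a stand-in hash here)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def frac_minhash(kmer_set, scaled):
    """Keep only hashes below MAX_HASH / scaled, i.e. about 1/scaled of them."""
    threshold = MAX_HASH // scaled
    return {h for h in map(hash_kmer, kmer_set) if h < threshold}
```

Because the same threshold is applied to every k-mer regardless of which genome it came from, sketching commutes with set operations: the sketch of an intersection equals the intersection of the sketches. That is why Jaccard similarity and containment can be estimated from the sketches alone.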
---
### These k-mer measures (Jaccard similarity and containment) can be translated to average nucleotide identity (ANI)
![](https://hackmd.io/_uploads/r1iPpE0ko.png)
Credits: Dr. Tessa Pierce-Ward et al.; [more info](https://github.com/sourmash-bio/sourmash/issues/1859).
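
To give a feel for the translation, here is a minimal sketch of a point estimate under a simple mutation model. It assumes each base mutates independently with probability p, so a k-mer survives unchanged with probability (1 - p)^k. Containment C is then approximately (1 - p)^k, which gives ANI = 1 - p ≈ C^(1/k). The function name `containment_to_ani` is hypothetical. This is only the basic point estimate, without any bias correction or confidence intervals.

```python
def containment_to_ani(containment, ksize):
    """Point estimate of ANI from containment, assuming independent mutations.

    If each base mutates with probability p, a k-mer is unchanged with
    probability (1 - p)**ksize, so ANI = 1 - p is about containment**(1/ksize).
    """
    if not 0.0 <= containment <= 1.0:
        raise ValueError("containment must be between 0 and 1")
    if ksize < 1:
        raise ValueError("ksize must be a positive integer")
    return containment ** (1.0 / ksize)


# Example: with k=21, a genome pair at 95% identity is expected to share
# only about 0.95**21 (roughly a third) of its k-mers.
expected_containment = 0.95 ** 21
print(containment_to_ani(expected_containment, 21))
```

The example shows why raw k-mer overlap looks much lower than the underlying sequence identity: a single mismatch disrupts every k-mer that spans it.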
---
### References:
FracMinHash and sourmash - [Lightweight compositional analysis of metagenomes with FracMinHash and minimum metagenome covers](https://www.biorxiv.org/content/10.1101/2022.01.11.475838v2), Irber et al., 2022
ANI calculations - [Debiasing FracMinHash and deriving confidence intervals for mutation rates across a wide range of evolutionary distances](https://www.biorxiv.org/content/10.1101/2022.01.11.475870v2), Hera et al., 2022