# Fractional sketches

## or, some details of how sourmash and FracMinHash work!

---

Consider overlaps between k-mers extracted from three genomes: two that share sequence, and one that does not. From these overlaps we can calculate k-mer based similarity measures (Jaccard similarity and containment).

![](https://hackmd.io/_uploads/Hk_OnVC1j.png)

---

### FracMinHash sketching compresses k-mer collections while retaining set relationships

![](https://hackmd.io/_uploads/BJ6zxBC1i.png)

This is implemented in the software [sourmash](https://sourmash.readthedocs.io/en/latest/).

---

### These Jaccard (k-mer) measures can be translated to ANI

![](https://hackmd.io/_uploads/r1iPpE0ko.png)

Credits: Dr. Tessa Pierce-Ward et al; [more info](https://github.com/sourmash-bio/sourmash/issues/1859).

---

### References:

FracMinHash and sourmash
- [Lightweight compositional analysis of metagenomes with FracMinHash and minimum metagenome covers](https://www.biorxiv.org/content/10.1101/2022.01.11.475838v2), Irber et al., 2022

ANI calculations
- [Debiasing FracMinHash and deriving confidence intervals for mutation rates across a wide range of evolutionary distances](https://www.biorxiv.org/content/10.1101/2022.01.11.475870v2), Hera et al., 2022
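
---

### Appendix: a toy FracMinHash in Python

To make the ideas in the slides above concrete, here is a minimal sketch of fractional sketching, Jaccard similarity, containment, and a containment-to-ANI point estimate. It is an illustration under stated assumptions, not sourmash's implementation: the function names are hypothetical, `blake2b` is a stand-in hash, reverse complements are ignored, and the ANI conversion assumes a simple model in which each base mutates independently (so a k-mer survives with probability ANI^k) without the bias correction or confidence intervals discussed in Hera et al.

```python
import hashlib
import random

MAX_HASH = 2**64  # size of the 64-bit hash space used in this toy example


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (blake2b is a stand-in hash; assumption)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def frac_minhash(seq, k=21, scaled=100):
    """Keep only hashes below MAX_HASH / scaled, i.e. roughly 1/scaled of all k-mers."""
    threshold = MAX_HASH // scaled
    return {h for h in map(hash_kmer, kmers(seq, k)) if h < threshold}


def jaccard(a, b):
    """|A ∩ B| / |A ∪ B|, computed on sketches."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a, b):
    """|A ∩ B| / |A|: the fraction of A that is also found in B."""
    return len(a & b) / len(a) if a else 0.0


def containment_to_ani(c, k):
    """Point estimate only: assumes independent mutations, so c ≈ ANI ** k."""
    return c ** (1.0 / k)


if __name__ == "__main__":
    rng = random.Random(1)
    genome_a = "".join(rng.choice("ACGT") for _ in range(20000))
    # genome_b shares its first half with genome_a; the second half is unrelated
    genome_b = genome_a[:10000] + "".join(rng.choice("ACGT") for _ in range(10000))

    sk_a, sk_b = frac_minhash(genome_a), frac_minhash(genome_b)
    c = containment(sk_a, sk_b)
    print("sketch sizes:", len(sk_a), len(sk_b))
    print("Jaccard:", jaccard(sk_a, sk_b))
    print("containment of A in B:", c)
    print("ANI point estimate:", containment_to_ani(c, 21))
```

Because the same hash threshold is applied to every genome, a k-mer shared by two genomes is either kept in both sketches or dropped from both, which is why set operations on the sketches approximate the same operations on the full k-mer sets.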