# Mappers: "Why do we use that mapper? I dunno 🤷‍♂️ because pop-gen do it?"

### Background

- A lot of inherited knowledge from 'older' ancient DNA fields, or just the 'tradition' within the lab
  - e.g. everyone in our group uses `bwa`... why? 🤷‍♀️
  - We _assume_ it's best, but if we ask, no-one can explain why
- This applies both to settings and to the choice of tool:
  - all existing studies focus on human pop-gen or animals: Oliva 2021, Poullet 2020
  - they only test 'established' tools (`bowtie2`/`bwa`)

### Open Questions

- Which tool to use for microbial genomes?
- What settings to use?
  - When dealing with community-diverse samples (e.g. dealing with close/distant relatives)?
  - How much does relaxing the allowed mismatches increase the number of reads recruited from environmental relatives?
- Are more modern mappers suitable for aDNA (e.g. `minimap2`)?
- Do we need 'damage aware' dedicated aDNA mappers (e.g. `mapAD`)?

### Project idea

Benchmarking of established and newer aligners, with different amounts of damage and read lengths, for aligning against microbial genomes.

Variables:

- Mapper (`bowtie2`/`bwa aln`/`bwa mem`/`minimap2`/`mapAD`)
- Microbe/Genome (bacteria/virus, with close relatives)
- Damage/Length (spanning typical aDNA ranges)
- Mapper's parameters

### References

Poullet, M., & Orlando, L. (2020). Assessing DNA Sequence Alignment Methods for Characterizing Ancient Genomes and Methylomes. Frontiers in Ecology and Evolution, 8, 105. https://doi.org/10.3389/fevo.2020.00105

Oliva, A., Tobler, R., Llamas, B., & Souilmi, Y. (2021). Additional evaluations show that specific BWA‐aln settings still outperform BWA‐mem for ancient DNA data alignment. Ecology and Evolution. https://doi.org/10.1002/ece3.8297

---

## Brainstorm

- Everyone uses `bwa aln` (more for eukaryotes; for microbes sometimes `bowtie2`)
  - But `bowtie2` is faster (multithreaded indexing), also according to Poullet and Orlando (2020)
- `minimap2`
  - Surprisingly OK with aDNA if you look at how much of the ends of reads were clipped?
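
The variables listed under "Project idea" amount to a full factorial design, which could be enumerated along these lines. This is only a minimal sketch: the mapper names come from the notes above, while the genome labels, damage levels, and read lengths are placeholder assumptions, and mapper parameters are left out because no concrete settings have been chosen yet.

```python
from itertools import product

# Mappers named in the notes above.
MAPPERS = ["bowtie2", "bwa aln", "bwa mem", "minimap2", "mapAD"]

# Everything below is a placeholder assumption, not a value from the notes.
GENOMES = ["bacterium_example", "virus_example"]
DAMAGE_LEVELS = ["none", "low", "high"]
READ_LENGTHS = [30, 50, 80]  # base pairs


def benchmark_grid():
    """Return one dict per combination of mapper, genome, damage and length."""
    return [
        {"mapper": m, "genome": g, "damage": d, "read_length": n}
        for m, g, d, n in product(MAPPERS, GENOMES, DAMAGE_LEVELS, READ_LENGTHS)
    ]


if __name__ == "__main__":
    grid = benchmark_grid()
    print(f"{len(grid)} benchmark runs")
    print(grid[0])
```

Adding per-mapper parameter sets would be one more dimension in `product`, so the number of runs grows multiplicatively with every variable included.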