# ARG project: Methods & questions

## Methods: Bioinformatic ARG detection on assembled sequences

- [Characterization of antimicrobial resistance genes and virulence factor genes in an Arctic permafrost region revealed by metagenomics](https://linkinghub.elsevier.com/retrieve/pii/S0269749121022168) (2021, Environmental Pollution)
    - **ORF prediction** (MetaGeneAnnotator with default parameters)
    - **ORF clustering** (90% coverage using CD-HIT, representative sequence = longest sequence per cluster)
    - **ARG prediction:** Align cluster representatives against CARD using DIAMOND BLASTP.
    - **Taxonomic assignments:** Align cluster representatives against the NCBI non-redundant protein sequence database (NR) using DIAMOND. Reduce false positives with stringent thresholds: E-value < 1e-5, identity > 70%, alignment length > 50
    - **Normalization/gene abundance:** Transcripts per million (TPM) calculated via Kallisto and merged with alignment results via contig ID.
    - **MGE prediction:** AMR module of PathoFact pipeline
    - **Identify potential PARBs (pathogenic ABR bacteria):** bin contigs into MAGs (MetaBAT 2 and MaxBin 2 from the MetaWRAP pipeline) and map ARGs onto them (in a seemingly unnecessarily laborious process...).
    - **Statistics on dataset parameters** in R (e.g. physicochemical parameters (pH...); for our samples maybe age, geographic location, origin (dental calculus, paleofeces...), host species...)
- [Characterization of antibiotic resistance genes in the species of the rumen microbiota](https://www.nature.com/articles/s41467-019-13118-0) (2019, Nature Communications)
    - Download microbial genomes
    - Screen with
        - **ResFinder** v2.1, cut-offs: query coverage 60%, gene identity 70%, and sequence length 60% of the reference ARG
        - **Resfams** v1.2: predict conserved ARG domains, no custom cut-offs
        - **ARG-ANNOT** v3 and **BLASTn**, cut-offs: sequence identity 70%, sequence length 60%, E-value < 1E-6.
- [Shotgun-metagenomics based prediction of antibiotic resistance and virulence determinants in Staphylococcus aureus from periprosthetic tissue on blood culture bottles](https://doi.org/10.1038/s41598-021-00383-7) (2021, Scientific Reports)
    - Assembly/binning/taxonomic classification
    - ARGs from reads: **Groot** v.1.0.2
    - ARGs from contigs: **ABRicate** v0.8
    - for both: against **NCBI** bacterial antimicrobial resistance reference gene database
    - chromosomal **point mutations associated with antimicrobial resistance**: **ResFinder** v.4.1 (PointFinder database)
    - cut-offs for all: sequence identity 90%, sequence coverage 90%
    - read cut-off: read coverage >20x
- [Global landscape of gut microbiome diversity and antibiotic resistomes across vertebrates](https://doi.org/10.1016/j.scitotenv.2022.156178)
    - **ARGs-OAP** v2.0 pipeline with **Structured Antibiotic Resistance Genes (SARG) database**
    - **RGI** v5.1.1 with **ResFam database**
    - cut-offs (probably for both): sequence identity of 70%, sequence length >70%, and an e-value < 1e-5
- [Exploring divergent antibiotic resistance genes in ancient metagenomes and discovery of a novel beta-lactamase family](https://doi.org/10.1111/1758-2229.12453) (2016, Environmental Microbiology Reports)
    - read-level: **BLASTx** with **ARGANNOT database**, cut-off: e-value 1E-10 (for reads <100 bp: 1E-05)
    - then: compare best hit against "**NCBI database**" (no further specifications)
    - **discard false positives**: compare best ARGANNOT hit against best NCBI hit, discard everything under the cut-off of 30% sequence length
- [Identification of bacterial pathogens and antimicrobial resistance directly from clinical urines by nanopore-based metagenomic sequencing](https://doi.org/10.1093/jac/dkw397) (2017, Journal of Antimicrobial Chemotherapy)
    - MinION data:
        - **LAST**: Align MinION reads against **CARD**.
        - **false positive determination**: Some sequences in CARD contain resistance-gene-flanking regions.
          They were manually visualized in **Artemis** (Sanger).
        - Consensus sequences:
            - cut-off: >80% identity over the length of a gene
            - reciprocal best BLAST hits (really strong evidence if you have a big database)
    - cultivated bacteria (Illumina)
        - Genefinder + in-house script
        - cut-offs: 90% sequence identity over the full length of the sequence (the MinION cut-off was lower at 80% because of expected read errors)
- [Culture-independent genotyping, virulence and antimicrobial resistance gene identification of Staphylococcus aureus from orthopaedic implant-associated infections](https://doi.org/10.3390%2Fmicroorganisms9040707) (2021, Microorganisms)
    - ARG prediction:
        - ARMA module of ONT's **cloud-based EPI2ME** workflows, cut-off: Q-score >= 7
        - **ResFinder** v4.1, cut-off: sequence identity 90%, sequence length 60%
- :negative_squared_cross_mark: [Identification of Genes Transcribed by Actinobacillus pleuropneumoniae in Necrotic Porcine Lung Tissue by Using Selective Capture of Transcribed Sequences](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC523062/) (2004, Infection and Immunity)
    - **[HUSAR package](https://www.dkfz.de/en/forschung/zentrale_einrichtungen/cf-omics/husar.html):** from DKFZ Heidelberg, something with BLAST (no details given).

## ARG project: Questions and strategies

### Which antibiotics/secondary metabolites are in (meta)genomes?

:arrow_right: Targeted ARG/BGC identification

- Identify ARG-containing BGCs
- Possibly screen for duplications and HGT
- e.g. ARTS: Antibiotic Resistant Target Seeker (developed in 2016, not updated since 2019) 📖 https://doi.org/10.1093%2Fnar%2Fgkx360
- "Identification of Thiotetronic Acid Antibiotic Biosynthetic Pathways by Target-directed Genome Mining" 📖 [PDF](https://pubs.acs.org/doi/pdf/10.1021/acschembio.5b00658)
- "Targeted antibiotic discovery through biosynthesis-associated resistance determinants: target directed genome mining" https://doi.org/10.1080/1040841X.2019.1590307

### How to analyse new/underexplored taxa?
:arrow_right: Taxonomic distribution of ARGs

:arrow_right: Identify gene-rich lineages, characterize them

#### Phylogeny of MAGs + annotation per MAG

Compare ARG class, amount, and/or diversity per MAG, e.g. in the paper "Biosynthetic potential of the global ocean microbiome"

![genes per MAG/genome 1](https://i.imgur.com/Zr2S3N8.png)
📖 https://doi.org/10.1038/s41586-022-04862-3

![genes per MAG/genome 2](https://i.imgur.com/jhwraAy.png)
📖 https://doi.org/10.1016/j.scitotenv.2022.156178

- Which are unexplored lineages/peculiarities?
- Do sampling materials have an influence on ARG abundance etc.? Characterize samples in general (number of genes etc.)
- (Compare metatranscriptome expression?)

➡️ new targets for discovery of natural products/biosynthetic compounds

### ARG and sample comparisons

#### Permutational Multivariate Analysis of Variance (PERMANOVA)

- How do geographic sites (in this figure: water layers) and body sites differ in their ARG class composition?
    ![sample comparison](https://i.imgur.com/nVLlj0s.png)
    📖 https://doi.org/10.1101/2021.01.20.427441
- How do ARGs overlap within a cohort/sample/site?
    ![ARG overlap](https://i.imgur.com/ZH8dTsP.png)
    📖 https://doi.org/10.1016/j.scitotenv.2022.156178
- Compare relative abundance of ARGs from several datasets
    ![ARG abundance and dataset heatmap](https://i.imgur.com/moUSMGm.png)
    📖 https://doi.org/10.1111/1758-2229.12453

### Additional datasets

- List of publicly available metagenomic datasets: https://doi.org/10.1111/1758-2229.12453 (Table 1)
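
### Sketch: applying alignment cut-offs

Most of the methods collected above filter alignment hits by identity, alignment length and E-value. The following is a minimal sketch of such a filter, not any paper's actual script. It assumes hits are available in the standard 12-column BLAST/DIAMOND tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The names `Hit`, `parse_hits` and `filter_hits` and the demo rows are made up for illustration; the default thresholds are the ones noted for the permafrost paper (identity > 70%, alignment length > 50, E-value < 1e-5).

```python
from typing import Iterable, Iterator, List, NamedTuple


class Hit(NamedTuple):
    query: str       # column 1: query ID (e.g. ORF cluster representative)
    subject: str     # column 2: database entry
    identity: float  # column 3: percent identity
    length: int      # column 4: alignment length
    evalue: float    # column 11: E-value


def parse_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse 12-column tabular alignment output, skipping blanks and comments."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        yield Hit(cols[0], cols[1], float(cols[2]), int(cols[3]), float(cols[10]))


def filter_hits(
    hits: Iterable[Hit],
    min_identity: float = 70.0,
    min_length: int = 50,
    max_evalue: float = 1e-5,
) -> List[Hit]:
    """Keep only hits that pass all three cut-offs (strict inequalities)."""
    return [
        h
        for h in hits
        if h.identity > min_identity
        and h.length > min_length
        and h.evalue < max_evalue
    ]


# Made-up demo rows, for illustration only.
DEMO = [
    "orf1\tgeneA\t95.2\t210\t10\t0\t1\t210\t1\t210\t1e-120\t400",  # passes
    "orf2\tgeneB\t65.0\t180\t63\t0\t1\t180\t1\t180\t1e-30\t150",   # identity too low
    "orf3\tgeneC\t88.0\t40\t5\t0\t1\t40\t1\t40\t1e-8\t60",         # alignment too short
    "orf4\tgeneD\t75.5\t120\t29\t0\t1\t120\t1\t120\t1e-3\t45",     # E-value too high
]

kept = filter_hits(parse_hits(DEMO))
print([h.query for h in kept])  # → ['orf1']
```

Other cut-off sets from the list (e.g. 90% identity for the ResFinder-based workflows) could be tried by passing different keyword arguments. Coverage-based cut-offs would additionally need the reference gene lengths, which this format does not carry by default.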