# ARG project: Methods & questions
## Methods: Bioinformatic ARG detection on assembled sequences
- [Characterization of antimicrobial resistance genes and virulence factor genes in an Arctic permafrost region revealed by metagenomics](https://linkinghub.elsevier.com/retrieve/pii/S0269749121022168) (2021, Environmental Pollution)
- **ORF prediction** (MetaGeneAnnotator with default parameters)
- **ORF clustering** (90% coverage using CD-HIT, representative sequence = longest sequence per cluster)
- **ARG prediction:** Align cluster representatives against CARD using DIAMOND BLASTP.
- **Taxonomic assignments:** Align cluster representatives against the NCBI non-redundant protein database (NR) using DIAMOND. Reduce false positives with stringent thresholds: E-value < 1e-5, identity > 70%, alignment length > 50.
- **Normalization/gene abundance:** Transcripts per million (TPM) calculated via Kallisto and merged with alignment results via contig ID.
- **MGE prediction:** AMR module of PathoFact pipeline
- **Identify potential PARBs (pathogenic antibiotic-resistant bacteria):** bin contigs into MAGs (MetaBAT 2 and MaxBin 2 from the MetaWRAP pipeline) and map ARGs onto them (a seemingly unnecessarily laborious process).
- **Statistics on dataset parameters** in R (e.g. physicochemical parameters such as pH; for our samples maybe age, geographic location, origin (dental calculus, paleofeces...), host species...).
- [Characterization of antibiotic resistance genes in the species of the rumen microbiota](https://www.nature.com/articles/s41467-019-13118-0) (2019, Nature Communications)
- Download microbial genomes
- Screen with
- **ResFinder** v2.1, cut-offs: query coverage 60%, gene identity 70%, and sequence length 60% of the reference ARG
- **Resfams** v1.2: predict conserved ARG domains, no custom cut-offs
- **ARG-ANNOT** v3 and **BLASTn**, cut-offs: sequence identity 70%, sequence length 60%, E-value < 1e-6
- [Shotgun-metagenomics based prediction of antibiotic resistance and virulence determinants in Staphylococcus aureus from periprosthetic tissue on blood culture bottles](https://doi.org/10.1038/s41598-021-00383-7) (2021, Scientific Reports)
- Assembly/binning/taxonomic classification
- ARGs from reads: **Groot** v.1.0.2
- ARGs from contigs: **ABRicate** v0.8
- for both: against **NCBI** bacterial antimicrobial resistance reference gene database
- chromosomal **point mutations associated with antimicrobial resistance**: **ResFinder** v.4.1 (Point Finder database)
- cut-offs for all: sequence identity 90%, sequence coverage 90%
- read cut-off: read coverage >20x
- [Global landscape of gut microbiome diversity and antibiotic resistomes across vertebrates](https://doi.org/10.1016/j.scitotenv.2022.156178)
- **ARGs-OAP** v2.0 pipeline with **Structured Antibiotic Resistance Genes (SARG) database**
- **RGI** v5.1.1 with **ResFam database**
- cut-offs (probably for both): sequence identity of 70%, sequence length > 70%, and an E-value < 1e-5
- [Exploring divergent antibiotic resistance genes in ancient metagenomes and discovery of a novel beta-lactamase family](https://doi.org/10.1111/1758-2229.12453) (2016, Environmental Microbiology Reports)
- read-level: **BLASTx** against the **ARG-ANNOT database**, cut-off: E-value 1e-10 (for reads < 100 bp: 1e-5)
- then: compare the best hit against the "**NCBI database**" (not specified further)
- **discard false positives**: compare the best ARG-ANNOT hit against the best NCBI hit and discard everything below the cut-off of 30% sequence length
- [Identification of bacterial pathogens and antimicrobial resistance directly from clinical urines by nanopore-based metagenomic sequencing](https://doi.org/10.1093/jac/dkw397) (2017, Journal of Antimicrobial Chemotherapy)
- MinION data:
- **LAST**: Align MinION reads against **CARD**.
- **false positive determination**: Some sequences in CARD contain resistance-gene-flanking regions. They were manually visualized in **Artemis** (Sanger).
- Consensus sequences:
- cut-off: >80% identity over the length of a gene
- reciprocal best BLAST hits (really strong evidence if you have a big database)
- cultivated bacteria (Illumina)
- Genefinder + in-house script
- cut-offs: 90% sequence identity over full length of sequence (MinION was lower with 80% because of expected read errors)
- [Culture-independent genotyping, virulence and antimicrobial resistance gene identification of Staphylococcus aureus from orthopaedic implant-associated infections](https://doi.org/10.3390%2Fmicroorganisms9040707) (2021, Microorganisms)
- ARG prediction:
- ARMA module of ONT's **cloud-based EPI2ME** workflows, cut-off: Q-score >= 7
- **ResFinder** v4.1, cut-off: sequence identity 90%, sequence length 60%
- :negative_squared_cross_mark: [Identification of Genes Transcribed by Actinobacillus pleuropneumoniae in Necrotic Porcine Lung Tissue by Using Selective Capture of Transcribed Sequences](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC523062/) (2004, Infection and Immunity)
- **[HUSAR package](https://www.dkfz.de/en/forschung/zentrale_einrichtungen/cf-omics/husar.html):** from DKFZ Heidelberg, something with BLAST (no details given).
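Most of the studies above end with the same post-processing step: alignment hits are filtered by identity, alignment length and E-value, and the best remaining hit per query is kept. The following is a minimal sketch of that step, assuming the hits are in the 12-column BLAST tabular format (`--outfmt 6` in DIAMOND and BLAST+). The names `Hit`, `parse_hits` and `filter_hits`, the default thresholds and the example rows are illustrative assumptions, not code or data from any of the papers.

```python
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float  # percent identity (column 3)
    length: int      # alignment length (column 4)
    evalue: float    # E-value (column 11)
    bitscore: float  # bit score (column 12)


def parse_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse tab-separated lines in the 12-column BLAST tabular layout."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        yield Hit(f[0], f[1], float(f[2]), int(f[3]), float(f[10]), float(f[11]))


def filter_hits(hits, min_identity=70.0, min_length=50, max_evalue=1e-5):
    """Return {query: best hit} for hits that pass all three thresholds."""
    best = {}
    for h in hits:
        if h.identity <= min_identity or h.length <= min_length or h.evalue >= max_evalue:
            continue  # fails at least one cut-off
        if h.query not in best or h.bitscore > best[h.query].bitscore:
            best[h.query] = h  # keep the highest bit score per query
    return best


# Made-up example rows (query and reference names are hypothetical).
example = [
    "orf1\trefA\t95.2\t280\t10\t0\t1\t280\t1\t280\t1e-150\t520.0",
    "orf1\trefB\t72.0\t120\t30\t2\t1\t120\t5\t124\t1e-20\t150.0",
    "orf2\trefC\t65.0\t200\t70\t0\t1\t200\t1\t200\t1e-30\t180.0",  # identity too low
    "orf3\trefD\t88.0\t40\t5\t0\t1\t40\t1\t40\t1e-8\t60.0",        # alignment too short
    "orf4\trefE\t80.0\t90\t18\t0\t1\t90\t1\t90\t0.5\t30.0",        # E-value too high
]
best = filter_hits(parse_hits(example))
for query, hit in best.items():
    print(query, hit.subject, hit.identity, hit.evalue)
```

The thresholds are keyword arguments so that the different cut-offs listed above (for example 90% identity for isolate data, 80% for MinION reads) can be compared on the same hit table.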
## ARG project: Questions and strategies
### Which antibiotics/secondary metabolites are in (meta)genomes?
:arrow_right: Targeted ARG/BGC identification
- Identify ARG-containing BGCs
- Possibly screen for duplications and HGT
- e.g. ARTS: Antibiotic Resistant Target Seeker (developed in 2016, not updated since 2019) https://doi.org/10.1093%2Fnar%2Fgkx360
- "Identification of Thiotetronic Acid Antibiotic Biosynthetic Pathways by Target-directed Genome Mining" [PDF](https://pubs.acs.org/doi/pdf/10.1021/acschembio.5b00658)
- "Targeted antibiotic discovery through biosynthesis-associated resistance determinants: target directed genome mining" https://doi.org/10.1080/1040841X.2019.1590307
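Identifying ARG-containing BGCs comes down to a coordinate comparison once both annotations exist. The sketch below assumes that ARG hits and BGC regions are each available as (contig, start, end, name) records; the `Region` type, the function `args_in_bgcs` and all coordinates are hypothetical and only illustrate the idea, they are not the method of ARTS or of the cited papers.

```python
from collections import defaultdict
from typing import NamedTuple


class Region(NamedTuple):
    contig: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    name: str


def args_in_bgcs(args, bgcs):
    """Map each BGC name to the ARGs lying entirely inside it on the same contig."""
    bgcs_by_contig = defaultdict(list)
    for bgc in bgcs:
        bgcs_by_contig[bgc.contig].append(bgc)

    result = defaultdict(list)
    for arg in args:
        for bgc in bgcs_by_contig.get(arg.contig, []):
            if bgc.start <= arg.start and arg.end <= bgc.end:
                result[bgc.name].append(arg.name)
    return dict(result)


# Hypothetical coordinates for illustration only.
bgcs = [
    Region("contig_1", 1000, 25000, "bgc_1"),
    Region("contig_2", 500, 9000, "bgc_2"),
]
args = [
    Region("contig_1", 3000, 3900, "arg_a"),    # inside bgc_1
    Region("contig_1", 30000, 30800, "arg_b"),  # same contig, outside any BGC
    Region("contig_3", 100, 900, "arg_c"),      # contig without a BGC
    Region("contig_2", 8500, 9500, "arg_d"),    # overlaps bgc_2 only partially
]
print(args_in_bgcs(args, bgcs))
```

Requiring full containment is a deliberately strict choice; relaxing it to any overlap, or to a flanking window around the BGC, would be an equally defensible variant.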
### How to analyse new/underexplored taxa?
:arrow_right: Taxonomic distribution of ARGs
:arrow_right: Identify gene-rich lineages, characterize them
#### Phylogeny of MAGs + annotation per MAG
Compare ARG class, amount, and/or diversity per MAG, e.g. in paper "Biosynthetic potential of the global ocean microbiome"

https://doi.org/10.1038/s41586-022-04862-3

https://doi.org/10.1016/j.scitotenv.2022.156178
- Which are unexplored lineages/peculiarities?
- Do sampling materials have an influence on ARG abundance etc.? Characterize samples in general (number of genes etc.)
- (Compare metatranscriptome expression?)
:arrow_right: New targets for discovery of natural products/biosynthetic compounds
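Comparing ARG class, amount and diversity per MAG can be sketched as a small summary table. The example assumes that each detected ARG has already been assigned to a MAG and to a resistance class; the function name `arg_profile_per_mag`, the MAG identifiers and the class labels are made up for illustration. Diversity is expressed here as the Shannon index over ARG classes, which is one possible choice among several.

```python
import math
from collections import Counter, defaultdict


def arg_profile_per_mag(assignments):
    """Summarize ARGs per MAG.

    assignments: iterable of (mag_id, arg_class) pairs, one pair per detected ARG.
    Returns {mag_id: {"n_args", "n_classes", "shannon"}}.
    """
    counts = defaultdict(Counter)
    for mag, arg_class in assignments:
        counts[mag][arg_class] += 1

    profile = {}
    for mag, class_counts in counts.items():
        total = sum(class_counts.values())
        # Shannon index H = -sum(p * ln p) over the ARG classes of this MAG.
        shannon = sum(-(n / total) * math.log(n / total) for n in class_counts.values())
        profile[mag] = {
            "n_args": total,
            "n_classes": len(class_counts),
            "shannon": shannon,
        }
    return profile


# Hypothetical assignments for illustration only.
assignments = [
    ("mag_1", "beta-lactam"),
    ("mag_1", "beta-lactam"),
    ("mag_1", "tetracycline"),
    ("mag_1", "tetracycline"),
    ("mag_2", "glycopeptide"),
    ("mag_2", "glycopeptide"),
    ("mag_2", "glycopeptide"),
]
profile = arg_profile_per_mag(assignments)
for mag, stats in sorted(profile.items()):
    print(mag, stats)
```

MAGs with many ARGs spread over several classes stand out with high values in all three columns, which is one way to flag gene-rich lineages for closer characterization.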
### ARG and sample comparisons
#### Permutational Multivariate Analysis of Variance (PERMANOVA)
- How do geographic sites (water layers in the cited figure) and body sites differ in their ARG class composition?

https://doi.org/10.1101/2021.01.20.427441
- How do ARGs overlap within a cohort/sample/site?

https://doi.org/10.1016/j.scitotenv.2022.156178
- Compare relative abundance of ARGs from several datasets

https://doi.org/10.1111/1758-2229.12453
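To make the PERMANOVA idea concrete, here is a minimal pure-Python sketch for a single grouping factor: Bray-Curtis dissimilarities between samples, the pseudo-F statistic computed from the distance matrix, and a permutation p-value obtained by shuffling group labels. The function names, the number of permutations, the seed and the abundance values are assumptions made for illustration. For real analyses an established implementation (for example `adonis2` in the R package vegan) is the safer choice.

```python
import random
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    total = sum(u) + sum(v)
    return sum(abs(a - b) for a, b in zip(u, v)) / total if total else 0.0


def pseudo_f(dist, groups):
    """Pseudo-F from a square distance matrix and one group label per sample."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in labels:
        idx = [i for i, lab in enumerate(groups) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    f_obs = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute labels, keep distances fixed
        if pseudo_f(dist, shuffled) >= f_obs - 1e-12:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)


# Hypothetical ARG class abundances (rows = samples, columns = ARG classes).
samples = [
    [10, 1, 0], [9, 2, 0], [11, 1, 1], [8, 2, 1],  # site_A
    [1, 9, 5], [0, 10, 4], [2, 8, 6], [1, 11, 5],  # site_B
]
groups = ["site_A"] * 4 + ["site_B"] * 4
dist = [[bray_curtis(u, v) for v in samples] for u in samples]
f_obs, p_value = permanova(dist, groups)
print(f"pseudo-F = {f_obs:.2f}, p = {p_value:.3f}")
```

With only four samples per group the smallest achievable p-value is limited by the number of distinct label arrangements, so small cohorts can never yield very low p-values regardless of how well the groups separate.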
### Additional datasets
- List of publicly available metagenomic datasets: https://doi.org/10.1111/1758-2229.12453 (Table 1)