<center><img src="https://i.imgur.com/rPIZUIq.png" alt="drawing" width="700"/></center>
# ACEIDHA - SV Annotation and comparative genomics of the conjugation dataset
Because we are working with ESBLs, we are curious to see which resistance genes are located on the chromosome and which are on the plasmid.
**I.** Run `staramr` on your assemblies:
* Select your genome contigs (in FASTA format). Use the hybrid assemblies for A111 and A177, and the ONT assembly for the transconjugant. In total, 3 genomes.
* Select whether or not you wish to scan your genome for point mutations conferring antimicrobial resistance using the PointFinder database. This requires you to specify the organism you are scanning.
* Can you actually use it?
* Run the tool.
Inspect the results:
- Summarize your genomes' genotypes, plasmids and AMR genes
- Which sequence type (ST) do you have?
- Are A111 and A177 ESBL-producing bacteria? What about our transconjugant? (see definition [here](https://pubmed.ncbi.nlm.nih.gov/24821872/))
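The results table can also be summarised programmatically. The following is a minimal sketch, assuming a tab-separated summary with `Isolate ID`, `Genotype`, `Plasmid` and `Sequence Type` columns (these column names are an assumption, so check the header of your own file). It flags only `blaCTX-M` genes as ESBL candidates, and the example rows are invented for illustration.

```python
import csv
import io

# Invented example table; the column names are an assumption,
# so compare them with the header of your own summary file.
EXAMPLE_TSV = (
    "Isolate ID\tGenotype\tPlasmid\tSequence Type\n"
    "sample_1\tblaCTX-M-15, tet(A)\tIncFIB\t131\n"
    "sample_2\tNone\tNone\t10\n"
)


def split_field(value):
    """Split a comma-separated field, treating 'None' or empty as no entries."""
    if value.strip() in ("", "None"):
        return []
    return [item.strip() for item in value.split(",")]


def summarise(tsv_text):
    """Return one dict per isolate with genes, plasmids, ST and an ESBL-candidate flag."""
    summary = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        genes = split_field(row["Genotype"])
        summary.append({
            "isolate": row["Isolate ID"],
            "genes": genes,
            "plasmids": split_field(row["Plasmid"]),
            "st": row["Sequence Type"],
            # Only blaCTX-M is flagged here; other ESBL families need
            # allele-level checks that this sketch does not attempt.
            "esbl_candidate": any(g.startswith("blaCTX-M") for g in genes),
        })
    return summary


for entry in summarise(EXAMPLE_TSV):
    print(entry["isolate"], entry["st"], entry["genes"], entry["esbl_candidate"])
```

A genotype flag like this is only a starting point: whether an isolate counts as ESBL-producing still has to be judged against the definition linked above.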
**II.** Search for virulence factors: run the contigs through another software tool called ABRicate; it will scan the contigs for the presence of known virulence genes.
- Find the tool
- Choose input fasta
- Choose correct database (VFDB)
- Inspect results - what do you make of it?
- Make a summary file with the results from ABRicate using `ABRicate Summary` (another tool)
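To get a feel for what a summary of this kind contains, here is a minimal sketch that builds a gene presence/absence matrix from ABRicate-style tabular reports. It assumes tab-separated input with `#FILE`, `GENE` and `%COVERAGE` columns; the function names are hypothetical and the rows are invented, not real results.

```python
import csv
import io

# Invented rows in an ABRicate-like layout (only the columns used here).
EXAMPLE_REPORT = (
    "#FILE\tGENE\t%COVERAGE\n"
    "sample_1.fasta\tfimH\t100.00\n"
    "sample_1.fasta\tiutA\t98.50\n"
    "sample_2.fasta\tfimH\t100.00\n"
)


def presence_matrix(report_text):
    """Return (sorted gene list, {file: {gene: coverage}})."""
    hits = {}
    for row in csv.DictReader(io.StringIO(report_text), delimiter="\t"):
        hits.setdefault(row["#FILE"], {})[row["GENE"]] = float(row["%COVERAGE"])
    genes = sorted({gene for per_file in hits.values() for gene in per_file})
    return genes, hits


def format_matrix(genes, hits):
    """Render the matrix as tab-separated text, with '.' for absent genes."""
    lines = ["\t".join(["#FILE"] + genes)]
    for name in sorted(hits):
        cells = [f"{hits[name][g]:.2f}" if g in hits[name] else "." for g in genes]
        lines.append("\t".join([name] + cells))
    return "\n".join(lines)


genes, hits = presence_matrix(EXAMPLE_REPORT)
print(format_matrix(genes, hits))
```

Reading the matrix column by column shows which virulence genes are shared between the genomes and which are unique to one of them.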
**III.** Use `Prokka` ([Seemann, T. 2014](https://pubmed.ncbi.nlm.nih.gov/24642063/)) to annotate A177.
- Find **Prokka** under Annotation Section
- Select the A177_hybrid_assembly to annotate
- Fill in the species name, make sure the *Multiple datasets* mode is selected for *Contigs to annotate*, and make sure that *Kingdom* is set to *Bacteria*. Adjust the outputs so that you get only the annotations in a GFF file and the statistics (otherwise you will get a large number of files).
- Press execute
**RENAME THE PROKKA FILES!**
After automatically annotating your genome, it is important to visualize your results so you can understand what your organism looks like, and then to manually refine these annotations along with any additional data you might have. This process is important, for instance, when you characterize the mechanism behind the observed AMR pattern.
**IV.** Use JBrowse with the following parameters:
* “Reference genome to display”: Use a genome from history
* “Select the reference genome”: Select the `A177 hybrid` fasta file
* “Genetic Code”: 11. The Bacterial, Archaeal and Plant Plastid Code
* In “Track Group”:
    * “Insert Track Group”
    * “Track Category”: Gene Calls
    * In “Annotation Track”:
        * “Insert Annotation Track”
        * “Track Type”: GFF/GFF3/BED Features
        * “GFF/GFF3/BED Track Data”: `A177.gff file`
When it's done, find the ESBL gene in contig 2 (which is the plasmid). The coordinates should be available from the Staramr results. Inspect its upstream and downstream regions.
1. What type of genes are around it?
2. What does this mean biologically?
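The neighbourhood inspection can also be sketched in code. Assuming a GFF3 file with the standard nine tab-separated columns, the following lists the CDS features within a window around a region of interest. The function names, the window size and the example records are all hypothetical; substitute the contig name and coordinates from your own results.

```python
# Invented GFF3 records for illustration only.
EXAMPLE_GFF = "\n".join([
    "##gff-version 3",
    "contig_2\texample\tCDS\t1000\t2200\t.\t+\t0\tID=ex_1;product=hypothetical protein",
    "contig_2\texample\tCDS\t2500\t3375\t.\t+\t0\tID=ex_2;gene=geneB;product=example enzyme",
    "contig_2\texample\tCDS\t20000\t21000\t.\t-\t0\tID=ex_3;gene=geneC",
    "contig_1\texample\tCDS\t2400\t3000\t.\t+\t0\tID=ex_4;gene=geneD",
    "##FASTA",
    ">contig_1",
    "ACGT",
])


def parse_gff_features(gff_text, feature_type="CDS"):
    """Yield (contig, start, end, strand, attributes) for features of one type."""
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # anything after this marker is sequence, not annotation
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if "=" in pair)
        yield cols[0], int(cols[3]), int(cols[4]), cols[6], attrs


def neighbours(gff_text, contig, start, end, window=5000):
    """Return features on `contig` overlapping [start - window, end + window]."""
    found = []
    for seqid, f_start, f_end, strand, attrs in parse_gff_features(gff_text):
        if seqid == contig and f_end >= start - window and f_start <= end + window:
            label = attrs.get("gene") or attrs.get("product", "unknown")
            found.append((f_start, f_end, strand, label))
    return sorted(found)


for feature in neighbours(EXAMPLE_GFF, "contig_2", 2500, 3375, window=2000):
    print(feature)
```

The labels come from the `gene` attribute when present and fall back to `product`, which is often the more informative field for mobile elements.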
**V.** Pick the names of the closest downstream genes and search for them in [TnCentral](https://tncentral.proteininformationresource.org/index.html).
What is this?
###### Different elements may be involved in the mobilization of blaCTX-M genes. ISEcp1 or ISEcp1-like insertion sequences have repeatedly been observed 42 to 266 bp upstream of ORFs encoding the CTX-M-1, CTX-M-2, CTX-M-3, CTX-M-9, CTX-M-13, CTX-M-14, CTX-M-15, CTX-M-17, CTX-M-19, CTX-M-20, and CTX-M-21 enzymes (1, 18, 29, 36, 49, 86), which is also the case for certain plasmid-mediated ampC genes (72, 98) (Fig. 2B). This insertion sequence is composed of two imperfect inverted repeats and an ORF encoding a 420-amino-acid putative transposase. Its amino acid sequence displays only 24% identity with the IS492 transposase from Bacteroides fragilis, the most closely related transposase. Stapleton (P. D. Stapleton, Abstr. 39th Intersci. Conf. Antimicrob. Agents Chemother., abstr. 1457, 1999) suggests that the ISEcp1 element is able to achieve the transfer of the downstream DNA sequence by a one-ended transposition process. Plasmid conduction experiments have confirmed the potential involvement of ISEcp1 in the mobility of blaCTX-M (26). In addition, the mapping of the blaCTX-M-17 promoter region by primer extension has revealed −35 (TTGAAA) and −10 (TACAAT) promoter sequences at the 3′ end of an ISEcp1-like sequence (nucleotides 2690 to 2719 of GenBank sequence number AY033516) which probably provides the promoter for expression of blaCTX-M genes associated with the ISEcp1 element (26, 49).
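As an illustration of the promoter description above, here is a minimal sketch that scans a sequence for the −35 (TTGAAA) and −10 (TACAAT) hexamers quoted in the text. The allowed spacer of 15 to 19 bp is an assumption based on typical bacterial promoter spacing and is not stated in the excerpt. Only the forward strand is searched, and the toy sequence is invented.

```python
def find_promoter_like(seq, minus35="TTGAAA", minus10="TACAAT",
                       min_spacer=15, max_spacer=19):
    """Return (0-based start of the -35 hexamer, spacer length) for each match.

    The spacer range is an assumed default, not a value from the source text.
    """
    seq = seq.upper()
    hits = []
    pos = seq.find(minus35)
    while pos != -1:
        for gap in range(min_spacer, max_spacer + 1):
            start10 = pos + len(minus35) + gap
            if seq[start10:start10 + len(minus10)] == minus10:
                hits.append((pos, gap))
        pos = seq.find(minus35, pos + 1)
    return hits


# Toy sequence: 3 bp of padding, the -35 hexamer, a 17 bp spacer, the -10 hexamer.
toy = "GGC" + "TTGAAA" + "A" * 17 + "TACAAT" + "GG"
print(find_promoter_like(toy))
```

Exact hexamer matching is deliberately strict; a region extracted upstream of the ESBL gene in your own plasmid may carry a variant of these motifs, in which case no hit is reported.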
#### Comparative genomics of the plasmid of Ecoli_A177 and A177.
What we wanted was to see whether the plasmid was transferred to the transconjugants. This is difficult since EcoliA111 and A111 do not have plasmids, and Ecoli_A177 is a mixture of donor and recipient. Let's practice anyway!
**VI.** Align your plasmid assemblies from donor and recipient using nucmer (from the MUMmer package). Before you can do this, you need to split the hybrid assembly of A177 using `faSplit`. Identify which contig is most likely the same as in Ecoli_A177 (look at size) and align these to each other. Download the alignment and upload it to the [Assemblytics GUI](http://assemblytics.com/) to visualise it.
*Wondering whether all the other contigs are also plasmids? Run PlasmidFinder on the six different files!*
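The splitting and size comparison in this step can be sketched in a few lines. This is a minimal illustration, not a replacement for `faSplit`: it reads a multi-record FASTA, reports which record is closest in length to a target size, and prepares one single-record FASTA text per contig. The function names, output file names and toy sequences are hypothetical.

```python
# Toy multi-record FASTA, invented for illustration.
TOY_FASTA = ">contig_1 len=12\nACGTACGTACGT\n>contig_2\nACGTAC\nGT\n"


def read_fasta(text):
    """Return {record name: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = line[1:].split()[0]
            records[name] = []
        elif line and name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def closest_by_length(records, target_length):
    """Return the record name whose length is closest to target_length."""
    return min(records, key=lambda n: abs(len(records[n]) - target_length))


def split_records(records):
    """Return {file name: FASTA text}, one single-record file per contig."""
    return {f"{name}.fa": f">{name}\n{seq}\n" for name, seq in records.items()}


records = read_fasta(TOY_FASTA)
for name, seq in records.items():
    print(name, len(seq))
print("closest to 9 bp:", closest_by_length(records, 9))
print(sorted(split_records(records)))
```

Similar length is only a hint that two contigs represent the same replicon; the alignment itself is what confirms it.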
## This is the end of the workshop. What did you think about it? Did you learn something?