<center><img src="https://i.imgur.com/rPIZUIq.png" alt="drawing" width="700"/></center>
## Discord invitation link https://discord.gg/bXcNDvRe
- To get an account on Discord, go to https://discord.com, click "Open Discord in your browser", and follow the steps that come up. If you already have an account, log in as normal.
- Then go to https://discord.gg/bXcNDvRe to join the group.
- I also recommend downloading the app for an even smoother experience.
# ACEIDHA-SV: Quality control of assemblies - ONT
###### Relies heavily on *
REMEMBER TO RENAME WHEN APPROPRIATE!
You have the following files to assess:
| Strain | Files |
| -------- | -------- |
| A177 | ONT; SR; Hybrid|
| A111 | ONT; SR; Hybrid|
| Ecoli_A177| ONT|
| Ecoli_A111| ONT|
If you don't have any assemblies, import this history: https://usegalaxy.eu/u/allarena/h/conjugation-assemblies
We expect A177 and A111 to carry plasmids, and we expect these plasmids to be present in the E. coli after conjugation.
**I. Run `Quast` on your eight assemblies.**
Visualise the assemblies of isolate A177 using Bandage (Wick et al. 2015). This tool shows what the assembly graph actually looks like and gives a sense of whether the genome was well assembled.
**II. Find the Bandage Image tool**, and choose the “Contig graph” of the A177 short-read assembly and hybrid assembly. Execute. View the output files.
**TASK**
1. How many contigs do you have in each of A177's three assemblies?
2. What is the length of your longest contig?
3. Does this look like a Klebsiella genome? (Hint: check GC% and size.)
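
To make the numbers in the Quast report less of a black box, here is a minimal Python sketch of how contig count, longest contig, total size, GC% and N50 can be computed from a FASTA file. It is an illustration only, not Quast's implementation: the function names and the toy sequences are made up for this example, and Quast applies its own filters (such as a minimum contig length), so its figures may differ.

```python
def parse_fasta(text):
    """Return a dict mapping contig name -> sequence from FASTA-formatted text."""
    contigs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the contig name
            name = (line[1:].split() or ["unnamed"])[0]
            contigs[name] = []
        elif name is not None:
            contigs[name].append(line.upper())
    return {n: "".join(parts) for n, parts in contigs.items()}


def assembly_stats(contigs):
    """Compute contig count, total length, longest contig, GC% and N50."""
    lengths = sorted((len(s) for s in contigs.values()), reverse=True)
    total = sum(lengths)
    gc = sum(s.count("G") + s.count("C") for s in contigs.values())
    # N50: length of the contig at which the running sum of lengths
    # (largest first) reaches at least half of the total assembly size
    running, n50 = 0, 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "contigs": len(lengths),
        "total_length": total,
        "longest": lengths[0] if lengths else 0,
        "gc_percent": round(100 * gc / total, 2) if total else 0.0,
        "n50": n50,
    }


# Toy example with made-up sequences (not course data)
toy = ">contig_1\nGGCCGGCCAT\n>contig_2\nATGC\n"
print(assembly_stats(parse_fasta(toy)))
```

To try it on a real assembly, download the FASTA from your Galaxy history and pass its contents to `parse_fasta` (for example via `open(path).read()`).
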
**III.** **Run Bandage on the transconjugants** (E.coli_A177_ONT and E.coli_A111_ONT). Inspect both the Quast and the Bandage output for the transconjugants.
4. There is something wrong with both transconjugants (hint: each should be an E. coli with a plasmid from one of the donors). What is wrong?
**IV.** Let's look at what has happened to the transconjugant E.coli_A177.
The program faSplit splits a multi-FASTA file into separate files.
a) **Run `faSplit`** on the E.coli_A177_ONT assembly.
b) **Run** **`Quast`** again on the resulting files.
To do this, move the FASTA files produced by faSplit from hidden to active (click "Show hidden" first, then "Unhide" next to the relevant files).
1. Describe what you got
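
The splitting done in step a) can be sketched in a few lines of Python. This is a simplified illustration of the idea (one output per FASTA record), not faSplit's own code. The function name and the `.fa` file-naming scheme are assumptions made for this example, and the Galaxy tool may name its outputs differently.

```python
def split_multifasta(text):
    """Split FASTA text into a dict of {file_name: single-record FASTA text}."""
    records = {}
    current = None
    for line in text.splitlines():
        if line.startswith(">"):
            # Hypothetical naming scheme: first word of the header plus ".fa"
            current = (line[1:].split() or ["unnamed"])[0] + ".fa"
            records[current] = [line]
        elif current is not None and line.strip():
            records[current].append(line)
    return {name: "\n".join(lines) + "\n" for name, lines in records.items()}


# Toy example with made-up sequences (not course data)
toy = ">contig_1 length=8\nACGT\nACGT\n>contig_2\nGG\n"
for file_name, content in split_multifasta(toy).items():
    print(file_name, len(content.splitlines()) - 1, "sequence line(s)")
```

Each value in the returned dict is a complete single-sequence FASTA, which is why Quast can then be run on every contig separately.
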
We want to figure out what the two biggest contigs in E.coli_A177 are. We therefore compare them to the donor A177 and the lab strain E. coli DH5 ([imported from this history](https://usegalaxy.eu/u/allarena/h/reference-genome-c-jejuni)) using FastANI again.
**V.** **Run FastANI** with the two contigs from faSplit against the imported E. coli DH5 and the donor A177 (hybrid assembly).
2. Identify what the two contigs are. What has happened here?
3. What could be the reason for this?
4. What is the biological implication for our results? (Think of what the original research idea was.)
5. Can we include this assembly in the remaining downstream analysis?
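
When reading the FastANI results, it can help to pick the best-matching reference for each contig programmatically. The sketch below assumes the usual FastANI tabular layout (query, reference, ANI, mapped fragments, total query fragments) and that the Galaxy output follows it, so check the columns of your own file first. The function name and the rows in the example are made up and do not reflect the course data.

```python
import csv
import io


def best_hits(fastani_tsv):
    """Return, per query, the reference with the highest ANI."""
    best = {}
    for row in csv.reader(io.StringIO(fastani_tsv), delimiter="\t"):
        if len(row) < 5:
            continue  # skip empty or malformed lines
        query, reference = row[0], row[1]
        ani, mapped, total = float(row[2]), int(row[3]), int(row[4])
        hit = {
            "reference": reference,
            "ani": ani,
            # Share of query fragments that aligned to the reference
            "aligned_fraction": mapped / total if total else 0.0,
        }
        if query not in best or ani > best[query]["ani"]:
            best[query] = hit
    return best


# Made-up rows for illustration only
example = (
    "query_a.fa\tref_1.fa\t99.5\t1500\t1510\n"
    "query_a.fa\tref_2.fa\t82.1\t300\t1510\n"
    "query_b.fa\tref_2.fa\t99.8\t60\t60\n"
)
for query, hit in best_hits(example).items():
    print(query, hit["reference"], hit["ani"], round(hit["aligned_fraction"], 2))
```

Look at the aligned fraction together with the ANI value: a high ANI based on only a few mapped fragments is weaker evidence than one covering most of the contig. Also note that FastANI typically reports nothing for pairs with very low identity, so a missing row is informative as well.
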
### Evaluation of assembly completeness
#### Core gene completeness with BUSCO
`BUSCO` (Benchmarking Universal Single-Copy Orthologs) provides a quantitative assessment of genome assembly completeness based on evolutionarily informed expectations of gene content. Details for this tool are here: [BUSCO website](https://busco.ezlab.org/)
**VI.** Run `BUSCO` with the following parameters:
- `Sequence to analyse`: choose the A177 hybrid and short-read assemblies
- `Mode` : Genome assemblies (DNA)
- `Use Augustus instead of Metaeuk`: `Use Metaeuk`
- Auto-detect or select lineage: `Select lineage`
- Lineage: `Enterobacterales`
- Which outputs should be generated: `short summary text` and `summary image`
**Task**
Compare the number of `BUSCO` genes identified in the ONT only and hybrid assemblies of A177. What do you observe?
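
The BUSCO short summary contains a one-line result in the notation `C:..%[S:..%,D:..%],F:..%,M:..%,n:..` (Complete, Single-copy, Duplicated, Fragmented, Missing, total BUSCOs searched). To compare several assemblies side by side, that line can be parsed with a small Python sketch like the one below. The function name and the example values are made up, and the exact line format may differ between BUSCO versions, so treat the regular expression as an assumption to verify against your own output.

```python
import re

# Pattern for the BUSCO one-line summary (assumed format, see lead-in)
BUSCO_LINE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(text):
    """Extract the C/S/D/F/M percentages and n from BUSCO summary text."""
    match = BUSCO_LINE.search(text)
    if match is None:
        raise ValueError("no BUSCO one-line summary found")
    result = {k: float(v) for k, v in match.groupdict().items() if k != "n"}
    result["n"] = int(match.group("n"))
    return result


# Made-up summaries for two hypothetical assemblies (not course results)
summaries = {
    "assembly_x": "C:98.5%[S:98.0%,D:0.5%],F:0.5%,M:1.0%,n:440",
    "assembly_y": "C:90.0%[S:89.5%,D:0.5%],F:6.0%,M:4.0%,n:440",
}
for name, line in summaries.items():
    r = parse_busco_summary(line)
    print(f"{name}: complete={r['C']}% fragmented={r['F']}% missing={r['M']}%")
```

A higher share of fragmented or missing BUSCOs in one assembly than in another of the same isolate usually points to sequence errors or gaps rather than true gene loss.
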
### Conclusion
This pipeline shows how to evaluate a genome assembly. Once you are satisfied with your genome sequence, you can start the annotation process!
###### *https://training.galaxyproject.org/training-material/topics/assembly/tutorials/assembly-quality-control/tutorial.html