<center><img src="https://i.imgur.com/rPIZUIq.png" alt="drawing" width="700"/></center>
# ACEIDHA: Quality control of assemblies - ONT
###### Relies heavily on *
REMEMBER TO RENAME WHEN APPROPRIATE!
You have the following files to assess:
| Strain | Files |
| -------- | -------- |
| A177 | ONT; SR; Hybrid|
| A111 | ONT; SR; Hybrid|
| Ecoli_A177| ONT|
| Ecoli_A111| ONT|
We expect A177 and A111 to carry plasmids, and we expect these plasmids to be present in the E. coli strains after conjugation.
Repeat steps I-III from the [Quality control of assemblies - short read](https://hackmd.io/WuBIGaHvSfCM1e1xKWLUnw?view). **Remember to rename your files when appropriate!** For the FastANI analysis, you need reference genomes. Find them as shown below (download, then upload).
<iframe src="https://scribehow.com/embed/Reference_genomes_download_and_upload__hL4OrHcDRWCm0w6s1LHCAA" width="640" height="640" allowfullscreen frameborder="0"></iframe>
*Write down any inconsistencies you did not expect.*
**I.** Use the Scrapbook option to compare the A177 assemblies based on short reads, ONT reads, and the hybrid approach. Compare the Quast and Bandage output.
**TASK**
1. How many contigs do you have in those three assemblies?
2. What is the coverage of the longest contig?
3. What is the length of your longest contig?
4. Does this look like a Klebsiella genome?
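If you want to double-check the numbers Quast reports, the contig count, longest contig, and N50 can be recomputed directly from an assembly FASTA. The following is only a minimal sketch, not how Quast itself is implemented; the function names and the tiny example sequences are made up for illustration.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {contig_name: sequence}."""
    contigs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the contig name.
            name = line[1:].split()[0]
            contigs[name] = []
        elif name is not None:
            contigs[name].append(line)
    return {n: "".join(parts) for n, parts in contigs.items()}


def assembly_stats(contigs):
    """Return contig count, total length, longest contig and N50."""
    lengths = sorted((len(seq) for seq in contigs.values()), reverse=True)
    total = sum(lengths)
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        # N50: length of the contig at which half of the assembly is covered.
        if running * 2 >= total:
            n50 = length
            break
    return {
        "contigs": len(lengths),
        "total_length": total,
        "longest": lengths[0] if lengths else 0,
        "n50": n50,
    }


# Toy example (invented sequences, not real data).
example = ">contig_1\nACGTACGTAC\n>contig_2\nACGTAC\n>contig_3\nACGT\n"
stats = assembly_stats(read_fasta(example))
print(stats)
```

For a real assembly, read the downloaded FASTA file into a string and pass it to `read_fasta`.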
**II.** Inspect both the Quast and the Bandage output from the transconjugants.
5. There is something wrong with both transconjugants (hint: each should be an E. coli carrying a plasmid from one of the donors). What is wrong?
**III.** Steps to "clean up" the transconjugant Ecoli_A177. The program faSplit splits a multi-FASTA file into separate files. Run faSplit on the assembly, then run Quast again on the resulting files. To do this, first move the FASTA files produced by faSplit from hidden to active files (click "Show hidden", then "Unhide" next to the relevant files).
1. Describe what you got
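To make clear what splitting a multi-FASTA file means, here is a small sketch that writes each record to its own file. It only mimics the idea; it is not faSplit and does not reproduce its options. The function name `split_multifasta` and the example records are hypothetical.

```python
import tempfile
from pathlib import Path


def split_multifasta(fasta_text, out_dir):
    """Write each FASTA record in fasta_text to its own file in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    handle = None
    try:
        for line in fasta_text.splitlines():
            if line.startswith(">"):
                # A new record starts: close the previous file, open a new one.
                if handle is not None:
                    handle.close()
                parts = line[1:].split()
                name = parts[0] if parts else f"record_{len(written) + 1}"
                path = out_dir / f"{name}.fasta"
                handle = path.open("w")
                written.append(path)
            if handle is not None and line.strip():
                handle.write(line + "\n")
    finally:
        if handle is not None:
            handle.close()
    return written


# Toy example (invented records, not real data).
example = ">contig_1\nACGT\nACGT\n>contig_2\nGGCC\n"
with tempfile.TemporaryDirectory() as tmp:
    files = split_multifasta(example, tmp)
    names = [p.name for p in files]
    first = files[0].read_text()
print(names)
```

Each resulting file holds one contig, so Quast can then report on the contigs one by one.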
We want to figure out what the two biggest contigs in Ecoli_A177 are. We therefore compare them to the donor A177 and to the lab strain used, DH5 (download it from [here](https://www.ncbi.nlm.nih.gov/nuccore/CP080399.1)), using FastANI again.
**IV.** Run the two contigs against the downloaded E. coli DH5 genome and the donor A177 (hybrid assembly).
2. Identify what the two contigs are. What has happened here?
3. What could be the reason for this?
4. What is the biological implication for our results (think of the original research idea)?
5. Can we include this assembly in the remaining downstream analysis?
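FastANI writes a tab-separated table with one row per query/reference pair: query, reference, ANI value, number of matched fragments, and total number of query fragments. Pairs with an ANI far below about 80% are typically not reported at all. The sketch below shows one way to pick the closest reference for each query from such a table. The file names and values in the example are invented placeholders and say nothing about your own results.

```python
import csv
import io


def best_hits(fastani_text):
    """Return, for each query, the reference with the highest ANI."""
    best = {}
    for row in csv.reader(io.StringIO(fastani_text), delimiter="\t"):
        if len(row) < 5:
            continue  # skip empty or malformed rows
        query, reference = row[0], row[1]
        ani = float(row[2])
        matched, total = int(row[3]), int(row[4])
        hit = {
            "reference": reference,
            "ani": ani,
            # Fraction of query fragments that mapped to the reference.
            "aligned_fraction": matched / total if total else 0.0,
        }
        if query not in best or ani > best[query]["ani"]:
            best[query] = hit
    return best


# Placeholder rows (hypothetical names and numbers).
example = (
    "contig_A.fasta\treference_1.fasta\t99.9\t1500\t1530\n"
    "contig_A.fasta\treference_2.fasta\t82.1\t300\t1530\n"
    "contig_B.fasta\treference_2.fasta\t99.8\t1700\t1750\n"
)
hits = best_hits(example)
for query, hit in hits.items():
    print(query, hit["reference"], hit["ani"], round(hit["aligned_fraction"], 2))
```

Look at the ANI and the aligned fraction together: a high ANI supported by only a few matched fragments is weaker evidence than a high ANI across most of the query.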
### Evaluation of assembly completeness
#### Core gene completeness with BUSCO
`BUSCO` (Benchmarking Universal Single-Copy Orthologs) provides a quantitative assessment of genome assembly completeness based on evolutionarily informed expectations of gene content. Details on the tool are available on the [BUSCO website](https://busco.ezlab.org/).
**II.** Run BUSCO with the following parameters:
- `Sequence to analyse`: all assemblies included in the conjugation dataset
- `Mode`: `Genome assemblies (DNA)`
- `Use Augustus instead of Metaeuk`: `Use Metaeuk`
- `Auto-detect or select lineage`: `Select lineage`
- `Lineage`: `Enterobacterales`
- `Which outputs should be generated`: `short summary text` and `summary image`
**Task**
Compare the number of `BUSCO` genes identified in the short read, ONT only and hybrid assemblies of A177. What do you observe?
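The BUSCO short summary contains a one-line score of the form `C:..%[S:..%,D:..%],F:..%,M:..%,n:..` (Complete, Single-copy, Duplicated, Fragmented, Missing, and the total number of BUSCO genes searched). To compare several assemblies side by side, you could extract these values as sketched below. The function name is hypothetical, and the example lines use invented numbers and placeholder labels rather than real results.

```python
import re


def parse_busco_summary(text):
    """Extract the C/S/D/F/M percentages and n from a BUSCO short summary."""
    for line in text.splitlines():
        # The score line is the one that contains "C:<number>%".
        if re.search(r"\bC:[\d.]+%", line):
            scores = {
                key: float(value)
                for key, value in re.findall(r"\b([CSDFM]):([\d.]+)%", line)
            }
            total = re.search(r"\bn:(\d+)", line)
            if total is not None:
                scores["n"] = int(total.group(1))
            return scores
    raise ValueError("no BUSCO score line found")


# Invented score lines for two placeholder assemblies.
summaries = {
    "assembly_x": "\tC:98.6%[S:98.2%,D:0.4%],F:0.5%,M:0.9%,n:440\n",
    "assembly_y": "\tC:90.0%[S:89.5%,D:0.5%],F:6.0%,M:4.0%,n:440\n",
}
table = {name: parse_busco_summary(text) for name, text in summaries.items()}
for name, scores in table.items():
    print(name, scores)
```

When comparing assemblies, pay attention to the fragmented (`F`) and missing (`M`) fractions as well as the complete (`C`) fraction.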
Although `BUSCO` is robust for widely studied species, it can be inaccurate when the newly assembled genome belongs to a taxonomic group that is poorly represented in OrthoDB. Even in a well-represented taxonomic group, bias in the choice of reference genomes used to build OrthoDB can lead to under-scoring of the newly assembled genome, depending on how the genomes have evolved. For example, in microsporidia, basal genomes have much lower scores because of the strong drive towards gene loss and gain in these organisms.
### Conclusion
This pipeline shows how to evaluate a genome assembly. Once you are satisfied with your genome sequence, you can start the annotation process!
###### *https://training.galaxyproject.org/training-material/topics/assembly/tutorials/assembly-quality-control/tutorial.html