# BUSCO and DOT plots to help QC data

## 1. DOT plots (i.e. synteny maps)

- Used to visualize a genome assembly
- Compare our genome to an existing one, such as a reference genome or the genome of a closely related species
- Parallel lines - repeat sequences
- Break in the line - insertion or deletion

### Use program MUMmer

MUMmer aligns large genomes very fast, and its most commonly used nucleotide script is nucmer.

- nucmer - alignment of 2 nucleotide sequences that are related and contain rearrangements

See example code - use it to create dot plots!

## 2. BUSCO (Benchmarking Universal Single Copy Orthologs)

Algorithm to find single-copy orthologs in our genome of interest. It uses BLAST to find single-copy genes in the genome, then Augustus delineates the precise positions where each gene starts and ends, then hmmsearch assigns a score.

What is a good genome in terms of BUSCO values?

- >90% is good for a draft genome - the assembled genome has to be >90% for publication
- You definitely don't want it to be below 80% in general - that's a bad sign :(

For the first BUSCO run, go with the BUSCO data set that is the most closely related to your species.
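
To make the thresholds above concrete, here is a minimal Python sketch that reads a BUSCO one-line score string and labels the completeness value. The function names `parse_busco_summary` and `rate_completeness`, the "borderline" label for the 80-90% range, and the example score string are all made up for illustration; the exact summary format may also differ between BUSCO versions, so treat the regular expression as an assumption.

```python
import re

# Assumed one-line BUSCO score notation, for example:
#   C:91.5%[S:89.0%,D:2.5%],F:4.5%,M:4.0%,n:255
# C = complete, S = single copy, D = duplicated, F = fragmented,
# M = missing, n = number of BUSCO genes searched.
SUMMARY_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(line):
    """Return the percentages and gene count found in a BUSCO score line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("no BUSCO score string found in: " + repr(line))
    scores = {key: float(match.group(key)) for key in "CSDFM"}
    scores["n"] = int(match.group("n"))
    return scores


def rate_completeness(complete_pct):
    """Label a completeness percentage using the thresholds from these notes."""
    if complete_pct > 90.0:
        return "good"        # >90%: good for a draft genome
    if complete_pct < 80.0:
        return "bad"         # <80%: a bad sign
    return "borderline"      # hypothetical label for the range in between


if __name__ == "__main__":
    # Invented example values, not results from a real assembly.
    example = "C:91.5%[S:89.0%,D:2.5%],F:4.5%,M:4.0%,n:255"
    scores = parse_busco_summary(example)
    print(scores["C"], rate_completeness(scores["C"]))
```

The score string can be copied from the short summary file that BUSCO writes at the end of a run; only the complete (C) percentage is compared against the thresholds here.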