---
title: Quality assessment of rDerCor1_cur20201106
tags: VGP, Genome
description: Exploration of quality in the curated version of rDerCor1, dated 2020.11.06
---

## General Stats

Stats computed with [assembly-stats](https://github.com/sanger-pathogens/assembly-stats) from the Sanger pathogens team.

| Stats | rDerCor1.pri.cur.20201106.fasta | n (scaffolds) |
|---|---:|---|
| Sum | 2,164,762,090 | 40 |
| N50 | 137,568,771 | 5 |
| N60 | 127,644,590 | 7 |
| N70 | 105,213,316 | 9 |
| N80 | 79,995,514 | 11 |
| N90 | 25,428,285 | 17 |
| N100 | 4,772 | 40 |
| Average | 54,119,052 | |
| Largest | 354,478,938 | |
| N_count | 5,594,612 | |
| Gaps | 667 | |

## DotPlot

We compared the assembly with the previous curation (pri.curation.20200406), rCheMyd1.curation20200811, and rCheMyd1_DNAZoo using [D-GENIES](http://dgenies.toulouse.inra.fr/).

<i>Cabanettes F, Klopp C. (2018) D-GENIES: dot plot large genomes in an interactive, efficient and simple way. PeerJ 6:e4958 https://doi.org/10.7717/peerj.4958</i>

<center>

![HiC_DNAZoo](https://i.imgur.com/DOL5MUc.jpg)
*rDerCor1_cur_VS_DNAZoorCheMyd1*

![rCheMyd1_cur](https://i.imgur.com/NEH3M60.jpg)
*rCheMyd1_cur_VS_rDerCor1_cur*

![rDerCor1_previous.cur](https://i.imgur.com/3o1hnGh.jpg)
*rDerCor1_cur20201106VS20200406*

##### Comparing with rCheMyd1_pri_cur_20200811

![](https://i.imgur.com/kR06zbf.png)
*Low-identity region*

![](https://i.imgur.com/XObxurp.png)
*Low-identity region*

![](https://i.imgur.com/yKGpS1p.png)
*Break and reorder*

![](https://i.imgur.com/kay1LGF.png)
*Break and low identity*

![](https://i.imgur.com/FvOQif0.png)
*Last scaffolds*

![](https://i.imgur.com/j9lVXaB.jpg)
*Last scaffolds*

![](https://i.imgur.com/7m1z7gk.png)
*Scaffolds 9, 10, 11, and 14 in rDerCor1, to check the shift*

![](https://i.imgur.com/i4MJJt5.png)
*Scaffolds 9, 10, 11, and 14 in rDerCor1, to check the shift (shifted)*

</center>

### BUSCO3

| | rDerCor1.pri.cur.20200406.fasta | % | rDerCor1.pri.cur.20200406.fasta | % |
|---|---:|---:|---|:---:|
| Complete BUSCOs (C) | 2,426 | 93.80% | 2,464 | 95.20% |
| Complete and single-copy BUSCOs (S) | 2,410 | 93.20% | 2,455 | 94.90% |
| Complete and duplicated BUSCOs (D) | 16 | 0.60% | 9 | 0.30% |
| Fragmented BUSCOs (F) | 104 | 4.00% | 77 | 3.00% |
| Missing BUSCOs (M) | 56 | 2.20% | 45 | 1.80% |
| Total BUSCO groups searched | 2,586 | -- | 2,586 | -- |

### Merqury

| Assembly | K-mers only in assembly | K-mers in assembly and reads | QV | Error rate |
|---|---:|---:|---|---|
| rDerCor1.pri.cur.20201106 | 5,705,907 | 2,159,153,338 | 38.9963 | 0.000126 |
| rDerCor1_merfin_pri_SortedSize | 5,727,562 | 2,159,914,818 | 38.9814 | 0.000126434 |
| rDerCor1_t3_primary | 18,168,030 | 2,159,644,793 | 33.9555 | 0.000402209 |

#### Copy-number spectrum analysis for rDerCor1.20201106

<center>

![](https://i.imgur.com/v8lqxC4.png)

![](https://i.imgur.com/YnR0mjE.png)

</center>

Table XX. Length, GC fraction, and N fraction for the curated 2020.11.06 version of rDerCor1

| Scaffold | Length (bp) | GC (fraction) | N (fraction) |
|---|---|---|---|
| SUPER_1 | 354,478,938 | 0.4227926371 | 0.003112108737 |
| SUPER_2 | 272,704,568 | 0.4195716223 | 0.004507460249 |
| SUPER_3 | 212,168,110 | 0.4223197303 | 0.0005436726566 |
| SUPER_4 | 146,520,673 | 0.4173785224 | 0.004935453716 |
| SUPER_5 | 137,568,771 | 0.4204562168 | 2.66E-05 |
| SUPER_6 | 131,128,815 | 0.4310519621 | 0.001805987494 |
| SUPER_7 | 127,644,590 | 0.4308359876 | 0.002006070136 |
| SUPER_8 | 109,294,577 | 0.4311128996 | 0.000574840964 |
| SUPER_9 | 105,213,316 | 0.4297130508 | 0.003656371785 |
| SUPER_10 | 86,472,384 | 0.4376877478 | 0.002140105216 |
| SUPER_11 | 79,995,514 | 0.4221339462 | 0.003428929777 |
| SUPER_12 | 44,241,090 | 0.4329135652 | 0.002098320814 |
| SUPER_13 | 41,159,716 | 0.4523335632 | 0.001191067499 |
| SUPER_14 | 40,021,378 | 0.4640961888 | 0.005266810153 |
| SUPER_15 | 33,199,653 | 0.4470593413 | 8.92E-06 |
| SUPER_16 | 26,401,370 | 0.4544061918 | 0.003022987065 |
| SUPER_17 | 25,428,285 | 0.4483935507 | 1.16E-05 |
| SUPER_18 | 23,665,181 | 0.4614426148 | 3.74E-05 |
| SUPER_19 | 20,023,515 | 0.4719584449 | 0.002042049061 |
| SUPER_20 | 19,206,509 | 0.5205470187 | 0.002162339861 |
| SUPER_21 | 18,859,425 | 0.4695203592 | 5.78E-05 |
| SUPER_22 | 18,746,278 | 0.4604031264 | 5.80E-05 |
| SUPER_23 | 17,228,317 | 0.5341474736 | 9.16E-05 |
| SUPER_24 | 16,950,461 | 0.5114341138 | 0.01037181231 |
| SUPER_25 | 16,479,532 | 0.4913683835 | 4.19E-05 |
| SUPER_26 | 16,472,168 | 0.4625251515 | 0.006196634226 |
| SUPER_27 | 16,294,212 | 0.4895787535 | 3.03E-05 |
| SUPER_28 | 6,736,449 | 0.5116992647 | 0.03263588873 |
| SUPER_11_unloc_1 | 161,456 | 0.5007494302 | 0.003084431672 |
| SUPER_11_unloc_2 | 88,916 | 0.5123262405 | 0 |
| SUPER_11_unloc_3 | 62,672 | 0.4594555782 | 0.009557697217 |
| SUPER_11_unloc_4 | 30,635 | 0.5188836298 | 0 |
| SUPER_11_unloc_5 | 27,714 | 0.522046619 | 0 |
| SUPER_11_unloc_6 | 24,225 | 0.5138493292 | 0 |
| SUPER_11_unloc_7 | 21,914 | 0.4550515652 | 0 |
| SUPER_11_unloc_8 | 12,220 | 0.4931260229 | 0 |
| SUPER_11_unloc_9 | 9,024 | 0.4655363475 | 0.01108156028 |
| SUPER_11_unloc_10 | 8,721 | 0.4346978558 | 0.01146657493 |
| SUPER_11_unloc_11 | 6,026 | 0.4412545636 | 0 |
| SUPER_11_unloc_12 | 4,772 | 0.4377619447 | 0 |

Table XX. Length, GC fraction, and N fraction for the curated 2020.11.06 version of rCheMyd1

| Scaffold | Length (bp) | GC (fraction) | N (fraction) |
|---|---|---|---|
| SUPER_1 | 348,265,484 | 0.426608139 | 0.005 |
| SUPER_2 | 262,513,884 | 0.4258510457 | 0.000 |
| SUPER_3 | 204,120,564 | 0.426853215 | 0.000 |
| SUPER_4 | 142,324,156 | 0.4234725622 | 0.002 |
| SUPER_5 | 134,428,053 | 0.4273345758 | 0.027 |
| SUPER_6 | 133,272,604 | 0.4243452315 | 0.001 |
| SUPER_7 | 123,872,898 | 0.4371868332 | 0.000 |
| SUPER_8 | 106,505,537 | 0.4369205894 | 0.000 |
| SUPER_9 | 101,620,591 | 0.4366290096 | 0.000 |
| SUPER_10 | 84,054,250 | 0.4441273463 | 0.001 |
| SUPER_11 | 79,522,289 | 0.4246977222 | 0.014 |
| SUPER_12 | 44,456,234 | 0.4770332998 | 0.004 |
| SUPER_13 | 42,953,438 | 0.4499156971 | 0.027 |
| SUPER_14 | 42,626,581 | 0.4380274599 | 0.004 |
| SUPER_15 | 33,324,608 | 0.4569506114 | 0.000 |
| SUPER_16 | 25,690,524 | 0.465054508 | 0.000 |
| SUPER_17 | 25,202,864 | 0.4575149872 | 0.000 |
| SUPER_18 | 23,386,534 | 0.471161524 | 0.000 |
| SUPER_19 | 19,769,559 | 0.4832919642 | 0.000 |
| SUPER_20 | 19,256,569 | 0.4707118386 | 0.000 |
| SUPER_21 | 19,202,859 | 0.5242525084 | 0.005 |
| SUPER_24 | 18,854,909 | 0.4800383815 | 0.000 |
| SUPER_25 | 16,433,863 | 0.5034322727 | 0.000 |
| SUPER_26 | 16,225,734 | 0.4753074961 | 0.000 |
| SUPER_27 | 16,135,183 | 0.5012314394 | 0.000 |
| SUPER_22 | 11,152,253 | 0.5202827626 | 0.024 |
| SUPER_23 | 9,977,174 | 0.4587280927 | 0.164 |
| SUPER_23_unloc_1 | 8,903,217 | 0.5200635905 | 0.042 |
| SUPER_28 | 7,864,902 | 0.4945411907 | 0.065 |
| SUPER_22_unloc_1 | 7,753,707 | 0.533713487 | 0.023 |
| scaffold_48_arrow_ctg1 | 468,323 | 0.1589821555 | 0.645 |
| scaffold_42_arrow_ctg1 | 414,693 | 0.5288683436 | 0.002 |
| scaffold_59_arrow_ctg1 | 222,063 | 0.4974219028 | 0.002 |
| scaffold_49_arrow_ctg1 | 200,098 | 0.4521284571 | 0.000 |
| scaffold_60_arrow_ctg1 | 200,034 | 0.6284581621 | 0.000 |
| scaffold_61_arrow_ctg1 | 186,729 | 0.6133059139 | 0.005 |
| scaffold_63_arrow_ctg1 | 156,755 | 0.5972568658 | 0.000 |
| scaffold_45_arrow_ctg1 | 139,352 | 0.4775747747 | 0.000 |
| scaffold_65_arrow_ctg1 | 139,178 | 0.5295592694 | 0.001 |
| scaffold_66_arrow_ctg1 | 136,476 | 0.5218206864 | 0.000 |
| scaffold_67_arrow_ctg1 | 131,618 | 0.5193742497 | 0.000 |
| scaffold_68_arrow_ctg1 | 130,366 | 0.5817621159 | 0.004 |
| scaffold_69_arrow_ctg1 | 104,581 | 0.5434256701 | 0.000 |
| scaffold_70_arrow_ctg1 | 91,510 | 0.5298546607 | 0.000 |
| scaffold_71_arrow_ctg1 | 87,187 | 0.529207336 | 0.001 |
| scaffold_72_arrow_ctg1 | 84,098 | 0.5292159148 | 0.000 |
| scaffold_76_arrow_ctg1 | 81,177 | 0.4907301329 | 0.001 |
| scaffold_77_arrow_ctg1 | 76,334 | 0.5131789242 | 0.002 |
| scaffold_78_arrow_ctg1 | 74,105 | 0.4520612644 | 0.000 |
| scaffold_80_arrow_ctg1 | 73,766 | 0.6859799908 | 0.007 |
| scaffold_79_arrow_ctg1 | 73,540 | 0.4006119119 | 0.000 |
| scaffold_83_arrow_ctg1 | 68,053 | 0.6505958591 | 0.000 |
| scaffold_84_arrow_ctg1 | 67,755 | 0.5005387056 | 0.007 |
| scaffold_86_arrow_ctg1 | 65,737 | 0.6570272449 | 0.008 |
| scaffold_87_arrow_ctg1 | 62,133 | 0.4745787263 | 0.000 |
| scaffold_88_arrow_ctg1 | 61,604 | 0.5103727031 | 0.000 |
| scaffold_89_arrow_ctg1 | 61,238 | 0.6525033476 | 0.008 |
| scaffold_90_arrow_ctg1 | 59,032 | 0.5058612278 | 0.000 |
| scaffold_91_arrow_ctg1 | 53,571 | 0.5847566781 | 0.000 |
| scaffold_92_arrow_ctg1 | 51,870 | 0.5339888182 | 0.010 |
| scaffold_93_arrow_ctg1 | 51,123 | 0.6870097608 | 0.000 |
| scaffold_95_arrow_ctg1 | 51,066 | 0.6133826812 | 0.000 |
| scaffold_94_arrow_ctg1 | 50,289 | 0.4775597049 | 0.000 |
| scaffold_96_arrow_ctg1 | 48,875 | 0.4888184143 | 0.000 |
| scaffold_97_arrow_ctg1 | 48,603 | 0.4761640228 | 0.000 |
| scaffold_98_arrow_ctg1 | 48,044 | 0.4328532179 | 0.002 |
| scaffold_100_arrow_ctg1 | 46,898 | 0.673269649 | 0.000 |
| scaffold_99_arrow_ctg1 | 46,854 | 0.4942374184 | 0.000 |
| scaffold_101_arrow_ctg1 | 45,103 | 0.494601246 | 0.000 |
| scaffold_102_arrow_ctg1 | 44,400 | 0.4560135135 | 0.000 |
| scaffold_103_arrow_ctg1 | 44,162 | 0.6547484262 | 0.000 |
| scaffold_104_arrow_ctg1 | 43,167 | 0.5399031668 | 0.000 |
| scaffold_105_arrow_ctg1 | 40,978 | 0.432207526 | 0.000 |
| scaffold_106_arrow_ctg1 | 39,936 | 0.6700971554 | 0.000 |
| scaffold_107_arrow_ctg1 | 37,745 | 0.5296065704 | 0.000 |
| scaffold_108_arrow_ctg1 | 34,665 | 0.4378191259 | 0.000 |
| scaffold_109_arrow_ctg1 | 32,845 | 0.6800121784 | 0.000 |
| scaffold_110_arrow_ctg1 | 31,630 | 0.4792285805 | 0.000 |
| scaffold_111_arrow_ctg1 | 31,051 | 0.6601719751 | 0.000 |
| scaffold_112_arrow_ctg1 | 27,483 | 0.637557763 | 0.000 |
| scaffold_113_arrow_ctg1 | 27,288 | 0.5906992084 | 0.000 |
| scaffold_114_arrow_ctg1 | 27,235 | 0.5109601616 | 0.000 |
| scaffold_12B_arrow_ctg1 | 21,116 | 0.4715381701 | 0.000 |
| scaffold_115_arrow_ctg1 | 21,067 | 0.6232021645 | 0.000 |
| scaffold_116_arrow_ctg1 | 20,277 | 0.6266212951 | 0.000 |
| scaffold_117_arrow_ctg1 | 16,618 | 0.4925382116 | 0.000 |
| scaffold_118_arrow_ctg1 | 16,270 | 0.6877688998 | 0.000 |
| scaffold_119_arrow_ctg1 | 14,461 | 0.4435377913 | 0.000 |
| scaffold_120_arrow_ctg1 | 13,486 | 0.4807207474 | 0.000 |
| scaffold_121_arrow_ctg1 | 11,258 | 0.501598863 | 0.000 |
| scaffold_122_arrow_ctg1 | 8,758 | 0.4674583238 | 0.000 |
| scaffold_123_arrow_ctg1 | 8,373 | 0.6219992834 | 0.012 |
| scaffold_124_arrow_ctg1 | 8,025 | 0.6343925234 | 0.000 |
| scaffold_125_arrow_ctg1 | 6,689 | 0.4783973688 | 0.000 |
| scaffold_126_arrow_ctg1 | 6,663 | 0.4691580369 | 0.000 |
| scaffold_127_arrow_ctg1 | 3,718 | 0.7490586337 | 0.000 |
| scaffold_128_arrow_ctg1 | 3,472 | 0.4282834101 | 0.000 |
| scaffold_129_arrow_ctg1 | 3,063 | 0.4436826641 | 0.000 |

We re-oriented (reverse-complemented) scaffolds 10 and 11.

![](https://i.imgur.com/D2Cadif.jpg)

We can see that scaffold CheMyd_12 matches scaffold DerCor_14, and CheMyd_14 matches DerCor_12.
Therefore, we request that DerCor_12 and DerCor_14 be renamed.

![](https://i.imgur.com/pwuhRbU.png)

Scaffold 19 shows a break in one of the two genomes.

![](https://i.imgur.com/fNd6sxC.png)

Scaffold rCheMyd_20 maps to rDerCor_22.

![](https://i.imgur.com/lKX4qBk.png)

Scaffold rCheMyd1_21 maps to rDerCor_24, and rCheMyd1_24 maps to rDerCor_21. Therefore, we request that rDerCor_21 and rDerCor_24 be renamed.

![](https://i.imgur.com/7PLhybv.png)

Scaffolds rCheMyd_22 and rCheMyd_22_unloc both map to rDerCor_20, so rCheMyd_22 and rCheMyd_22_unloc could perhaps be joined.

![](https://i.imgur.com/rWxlwmz.png)

After the join:

![](https://i.imgur.com/dqaJZax.png)

Scaffolds rCheMyd_23 and rCheMyd_23_unloc_RC both map to rDerCor23 and could perhaps be joined.

![](https://i.imgur.com/tjzuSpx.png)
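
The rename requests in this note (12 with 14, and 21 with 24, in rDerCor1) amount to swapping scaffold names in the FASTA headers. The following is a minimal sketch of how that could be done, not the curation team's actual procedure. It assumes the headers use the `SUPER_N` names shown in the tables; the function name `swap_scaffold_names`, the `SWAPS` mapping, and the choice to let `_unloc_` pieces follow their parent scaffold are illustrative assumptions.

```python
import re

# Hypothetical swap map derived from the rename requests in this note:
# rDerCor1 SUPER_12 <-> SUPER_14 and SUPER_21 <-> SUPER_24.
SWAPS = {
    "SUPER_12": "SUPER_14",
    "SUPER_14": "SUPER_12",
    "SUPER_21": "SUPER_24",
    "SUPER_24": "SUPER_21",
}

# Header = ">" + SUPER_<n> + optional _unloc_<k> suffix + anything else.
HEADER_RE = re.compile(r"^>(SUPER_\d+)(_unloc_\d+)?(.*)$")


def swap_scaffold_names(lines, swaps=SWAPS):
    """Yield FASTA lines, renaming header scaffolds according to `swaps`.

    Sequence lines and headers not listed in `swaps` pass through unchanged.
    """
    for line in lines:
        stripped = line.rstrip("\n")
        match = HEADER_RE.match(stripped)
        if match:
            name, unloc, rest = match.groups()
            renamed = ">" + swaps.get(name, name) + (unloc or "") + rest
            # Re-attach the original line ending, if there was one.
            line = renamed + line[len(stripped):]
        yield line


if __name__ == "__main__":
    example = [">SUPER_12\n", "ACGT\n", ">SUPER_14\n", "TTGA\n"]
    print("".join(swap_scaffold_names(example)), end="")
```

Because the function works on any iterable of lines, it can be fed an open file handle and written out line by line, which avoids loading a multi-gigabase assembly into memory.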