# Mouse ONT assembly and pangenome

Assembly of one sample (BXD24).

| Reads | Yield [bp] | N50 | Coverage | Max length | Mean length | Median length | Mean Q | Median Q |
|----|----|---|---|---|---|---|--|--|
| 4,032,662 | 81,375,782,055 | 36,289 | 30 | 669,638 | 20,179 | 13,362 | 13.34 | 14.40 |

### 1. Generate the assembly

- I use wtdbg2 to assemble the long-read nanopore sequencing data (-x ont) from the FASTQ file produced by base calling.
- I use wtpoa-cns to polish the assembly generated in the previous step.
- I use minimap2 to align the original long-read nanopore sequencing data to the polished assembly from the previous step. The command uses the long-read nanopore preset (-ax map-ont) and sets the alignment bandwidth to 2 kb (-r2k). The alignments are stored in a BAM file, which I then filter to remove secondary and supplementary alignments (specified by the "-F0x900" option).

```
sbatch -p workers -w octopus02 -c 48 --wrap 'cd /scratch && /lizardfs/flaviav/mouse_ont/assembly/as.sh'
Submitted batch job 124177
```

### 1b. Statistics on the assembly (quast)

`quast.py mouse_reads.ont.wtdbg2.asm1.cns.fa.gz`

![](https://i.imgur.com/iR6yL2Z.png)

### 1c. Assembly with canu

```
sbatch -p workers -w octopus02 -c 48 --wrap 'cd /scratch && /lizardfs/flaviav/mouse_ont/canu.sh'
submitted batch job 125166
```

### 2. Polishing the assembly with linked reads

```
bwa-0.7.17/bwa index mouse_reads.ont.wtdbg2.asm1.cns.fa
```

```
sbatch -p workers -w octopus05 -c 48 --wrap 'cd /scratch && /lizardfs/flaviav/mouse_ont/assembly/polish.sh'
Submitted batch job 125066
```

- Alignment: BWA-MEM aligns the linked reads to the ONT-based assembly of the mouse genome.
- Sorting: The SAM file from the alignment step is converted to a binary BAM file and sorted using Sambamba.
- Variant calling: Freebayes calls variants on each contig of the ONT-based assembly of the mouse genome.
- Concatenation: The variant calls from each contig are concatenated into a single BCF file using BCFtools. In `bcftools concat`, the -n flag selects naive concatenation (no recompression) and the -f flag gives the file listing the BCF files to concatenate; the normalization that follows takes the reference FASTA file, also via -f.
- Polishing: The concatenated BCF file from the previous step is used to polish the ONT-based assembly of the mouse genome. The -i, -H, and -f flags specify the variant filter expression, the haplotype (allele) to apply, and the reference FASTA file, respectively.

Error in the sorting step (sambamba): Pjotr is checking, and I'll use GATK instead of sambamba for now.
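
The five polishing steps listed above can be sketched as a short command plan. This is only an illustration, not the contents of `polish.sh`: the read file names, intermediate file names, and the filter expression are hypothetical placeholders, and `samtools sort` stands in for the Sambamba sorting step (which is currently failing). The function only prints the commands, so the plan can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch only: prints a polishing command plan instead of running it.
set -eu

REF=${REF:-mouse_reads.ont.wtdbg2.asm1.cns.fa}  # assembly indexed with bwa in the notes
R1=${R1:-linked_reads_R1.fastq.gz}              # hypothetical linked-read files
R2=${R2:-linked_reads_R2.fastq.gz}
THREADS=${THREADS:-48}

polish_plan() {
    cat <<EOF
# 1-2. Align the linked reads and sort the alignments
#      (assumption: samtools sort used in place of sambamba)
bwa mem -t $THREADS $REF $R1 $R2 | samtools sort -@ $THREADS -o aln.sorted.bam -
samtools index aln.sorted.bam
samtools faidx $REF

# 3. Call variants contig by contig, writing one BCF per contig
cut -f1 $REF.fai | while read -r ctg; do
    freebayes -f $REF --region "\$ctg" aln.sorted.bam | bcftools view -Ob -o "calls.\$ctg.bcf"
    echo "calls.\$ctg.bcf"
done > bcf_list.txt

# 4. Concatenate the per-contig BCFs, then normalize against the assembly
bcftools concat -n -f bcf_list.txt | bcftools norm -f $REF -Ob -o calls.norm.bcf
bcftools index calls.norm.bcf

# 5. Apply the filtered variants to the assembly (placeholder filter expression)
bcftools consensus -i 'QUAL>1 && (GT="AA" || GT="Aa")' -H LA -f $REF calls.norm.bcf > polished.fa
EOF
}

polish_plan
```

To try it, redirect the output to a file (for example `sh plan.sh > polish_sketch.sh`), check the paths and the filter expression, and only then run the generated script on a node where bwa, samtools, freebayes and bcftools are installed.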