This tutorial assumes you have installed Docker correctly! See https://hackmd.io/@wvdt/docker on how to do that.

Let's look at the results for Run1, barcode08. This is a positive control sample.

```
DIR=/home/$USER/workshop_data/groupA
cd $DIR

# look at the quality of the run with pycoQC
conda activate pycoQC
pycoQC --summary_file ./sequencing_summary*.txt \
    --html_outfile pycoQC_report.html

# look at the reads of this barcode with NanoPlot
conda activate QC
mkdir -p $DIR/nanoplot/barcode08
cd $DIR/nanoplot/barcode08
NanoPlot --fastq ../../fastq_pass/barcode08/F*.fastq.gz

# merge all reads into one file and drop reads shorter than 150 bases
# note: if you rerun this, delete all_reads.fastq.gz first, because it also matches *.fastq.gz
cat $DIR/fastq_pass/barcode08/*.fastq.gz > $DIR/fastq_pass/barcode08/all_reads.fastq.gz
mkdir -p $DIR/fastq_pass/barcode08/filtered
filtlong --min_length 150 $DIR/fastq_pass/barcode08/all_reads.fastq.gz \
    | gzip > $DIR/fastq_pass/barcode08/filtered/filtered_reads.fastq.gz

# create the samplesheet for nf-flu
NFDIR=$DIR/nf-flu/barcode08
mkdir -p $NFDIR
echo "sample,reads" > $NFDIR/samplesheet.csv
echo "barcode08,/home/$USER/workshop_data/groupA/fastq_pass/barcode08/filtered" >> $NFDIR/samplesheet.csv
cat $NFDIR/samplesheet.csv

# run the nf-flu pipeline
conda activate nextflow
cd $NFDIR
nextflow run CFIA-NCFAD/nf-flu --input samplesheet.csv \
    --platform nanopore -profile docker \
    --skip_irma_subtyping_report false
```

Pre-download the NCBI files to speed up future runs!

```
# Tip: download the NCBI FASTA and metadata yourself once and save them.
# That way, you can pass the file paths to the pipeline when you run it,
# so they don't have to be downloaded on every single run!
mkdir -p ~/workshop_data/references/{ncbi_influenza_fasta,ncbi_influenza_metadata}
wget https://ftp.ncbi.nih.gov/genomes/INFLUENZA/genomeset.dat.gz -O ~/workshop_data/references/ncbi_influenza_metadata/genomeset.dat.gz
wget https://ftp.ncbi.nih.gov/genomes/INFLUENZA/influenza.fna.gz -O ~/workshop_data/references/ncbi_influenza_fasta/influenza.fna.gz

NCBI_META=~/workshop_data/references/ncbi_influenza_metadata/genomeset.dat.gz
NCBI_FASTA=~/workshop_data/references/ncbi_influenza_fasta/influenza.fna.gz
```

Let's do barcode01 of the same run. We will use `medaka` for variant calling now.

Attention: when using `medaka`, you have to specify which medaka model to use for variant calling and consensus. It should match the basecalling model that Guppy used! You can find a list of available models here: https://github.com/nanoporetech/medaka/blob/master/medaka/options.py

For the r10 pore chemistry the naming scheme is:
`r[pore_chemistry]_e82_[basecall_speed]_[basecalling_mode]_[guppy_version]`

Our data was generated on an R10.4.1 pore, with 400bps Fast basecalling, and basecalled with Guppy 6.4.6 (you can find this information in the file `report_[....].html` that is generated by the MinION), so the best `medaka` model we can choose for variant calling is `r1041_e82_400bps_fast_variant_g632`.
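
As a rough illustration of the naming scheme, the model name can be assembled from its parts in the shell. This is only a sketch: the variable names are made up for this example, the extra `variant` field comes from the model name chosen above rather than from the scheme itself, and the optional check assumes that your medaka installation provides the `medaka tools list_models` command.

```shell
# Hypothetical helper variables; the values describe the run discussed above.
PORE=r1041        # R10.4.1 pore
SPEED=400bps      # 400bps
MODE=fast         # Fast basecalling
TYPE=variant      # model flavour used for variant calling
GUPPY=g632        # Guppy version tag of the chosen model

MEDAKA_MODEL="${PORE}_e82_${SPEED}_${MODE}_${TYPE}_${GUPPY}"
echo "$MEDAKA_MODEL"

# Optional: if medaka is on the PATH, check that it knows this model.
if command -v medaka >/dev/null 2>&1; then
    if medaka tools list_models | grep -qwF "$MEDAKA_MODEL"; then
        echo "model is available in this medaka installation"
    else
        echo "warning: model not listed by this medaka installation" >&2
    fi
fi
```

The resulting `$MEDAKA_MODEL` could then be passed to `--medaka_variant_model` instead of typing the name by hand.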
```
NCBI_META=~/workshop_data/references/ncbi_influenza_metadata/genomeset.dat.gz
NCBI_FASTA=~/workshop_data/references/ncbi_influenza_fasta/influenza.fna.gz
BARCODE=barcode01
DIR=/home/$USER/workshop_data/groupA

conda activate pycoQC
cd $DIR
pycoQC --summary_file ./sequencing_summary*.txt --html_outfile pycoQC_report.html

conda activate QC
mkdir -p $DIR/nanoplot/$BARCODE
cd $DIR/nanoplot/$BARCODE
NanoPlot --fastq ../../fastq_pass/$BARCODE/F*.fastq.gz

cat $DIR/fastq_pass/$BARCODE/*.fastq.gz > $DIR/fastq_pass/$BARCODE/all_reads.fastq.gz
mkdir -p $DIR/fastq_pass/$BARCODE/filtered
filtlong --min_length 100 $DIR/fastq_pass/$BARCODE/all_reads.fastq.gz \
    | gzip > $DIR/fastq_pass/$BARCODE/filtered/filtered_reads.fastq.gz

NFDIR=$DIR/nf-flu/$BARCODE
mkdir -p $NFDIR
echo "sample,reads" > $NFDIR/samplesheet.csv
echo "$BARCODE,$DIR/fastq_pass/$BARCODE/filtered" >> $NFDIR/samplesheet.csv
cat $NFDIR/samplesheet.csv

conda activate base
cd $NFDIR
nextflow run CFIA-NCFAD/nf-flu --input samplesheet.csv --platform nanopore -profile docker \
    --skip_irma_subtyping_report false \
    --variant_caller medaka \
    --medaka_variant_model r1041_e82_400bps_fast_variant_g632 \
    --ncbi_influenza_fasta $NCBI_FASTA \
    --ncbi_influenza_metadata $NCBI_META
```

Results: https://drive.google.com/drive/folders/1AotYhrseIOdEprWJT6v5nbouHHsS2zY1?usp=drive_link
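
Both runs above write the samplesheet by hand, so a typo in a path only shows up once the pipeline has started. A small check before launching `nextflow` can catch that earlier. The sketch below assumes the two-column `sample,reads` layout used above; `check_samplesheet` is a hypothetical helper written for this tutorial, not part of nf-flu.

```shell
# Hypothetical helper: verify that every "reads" path in a samplesheet exists.
# Returns 0 if all paths exist, 1 otherwise.
check_samplesheet() {
    sheet=$1
    # Skip the header line, then collect every row whose reads path is missing.
    missing=$(tail -n +2 "$sheet" | while IFS=, read -r sample reads; do
        [ -n "$sample" ] || continue          # ignore empty lines
        [ -e "$reads" ] || echo "$sample: $reads"
    done)
    if [ -n "$missing" ]; then
        echo "missing reads paths in $sheet:" >&2
        echo "$missing" >&2
        return 1
    fi
    return 0
}

# Example use with the variables defined above:
#   check_samplesheet "$NFDIR/samplesheet.csv" && echo "samplesheet looks fine"
```

The function only checks that the paths exist; it does not look inside the directories or validate the FASTQ files themselves.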