This tutorial assumes that Docker is installed and working.
See https://hackmd.io/@wvdt/docker for installation instructions.
Let's start with Run1, barcode08. This is a positive control sample.
```
DIR=/home/$USER/workshop_data/groupA
cd $DIR
# look at the quality of the run with pycoQC
conda activate pycoQC
pycoQC --summary_file ./sequencing_summary*.txt \
--html_outfile pycoQC_report.html
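# per-barcode read statistics with NanoPlot
conda activate QC
mkdir -p $DIR/nanoplot/barcode08
cd $DIR/nanoplot/barcode08
NanoPlot --fastq ../../fastq_pass/barcode08/F*.fastq.gz
# merge all reads of this barcode into a single file
# (delete all_reads.fastq.gz before re-running, otherwise the glob also matches it as input)
cat $DIR/fastq_pass/barcode08/*.fastq.gz > $DIR/fastq_pass/barcode08/all_reads.fastq.gz
# discard reads shorter than 150 bp
mkdir -p $DIR/fastq_pass/barcode08/filtered
filtlong --min_length 150 $DIR/fastq_pass/barcode08/all_reads.fastq.gz \
| gzip > $DIR/fastq_pass/barcode08/filtered/filtered_reads.fastq.gz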
# write the nf-flu samplesheet: one row per sample, pointing at the directory with its reads
NFDIR=$DIR/nf-flu/barcode08
mkdir -p $NFDIR
echo "sample,reads" > $NFDIR/samplesheet.csv
echo "barcode08,$DIR/fastq_pass/barcode08/filtered" >> $NFDIR/samplesheet.csv
cat $NFDIR/samplesheet.csv
# run the nf-flu pipeline on the filtered reads
conda activate nextflow
cd $NFDIR
nextflow run CFIA-NCFAD/nf-flu --input samplesheet.csv \
--platform nanopore -profile docker \
--skip_irma_subtyping_report false
```
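Before starting the pipeline, it can be useful to check how many reads went into the merged file and how many survived filtering. The following is a minimal sketch and not part of the original workflow: `count_reads` is a hypothetical helper name, and it assumes standard four-line FASTQ records. The demo also illustrates why the `cat` merge above works: concatenated gzip files form a valid gzip stream.

```shell
# Hypothetical helper: count the reads in a gzipped FASTQ file.
# Assumes every record spans exactly four lines.
count_reads() {
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Demo on throwaway data, so the sketch runs without the workshop files.
DEMO_DIR=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' | gzip > "$DEMO_DIR/a.fastq.gz"
printf '@read2\nTTGA\n+\nIIII\n@read3\nGGCC\n+\nIIII\n' | gzip > "$DEMO_DIR/b.fastq.gz"

# Merge the two gzip files the same way the tutorial does.
cat "$DEMO_DIR/a.fastq.gz" "$DEMO_DIR/b.fastq.gz" > "$DEMO_DIR/merged.fastq.gz"

MERGED_READS=$(count_reads "$DEMO_DIR/merged.fastq.gz")
echo "reads in merged file: $MERGED_READS"
```

On the workshop data, you would call `count_reads` once on `all_reads.fastq.gz` and once on `filtered/filtered_reads.fastq.gz` and compare the two numbers.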
Pre-download the NCBI files to speed up future runs!
```
# Tip: download the NCBI FASTA and metadata files once and keep them locally.
# You can then pass their file paths to the pipeline when you run it,
# so that they are not downloaded again on every run.
mkdir -p ~/workshop_data/references/{ncbi_influenza_fasta,ncbi_influenza_metadata}
NCBI_META=~/workshop_data/references/ncbi_influenza_metadata/genomeset.dat.gz
NCBI_FASTA=~/workshop_data/references/ncbi_influenza_fasta/influenza.fna.gz
wget https://ftp.ncbi.nih.gov/genomes/INFLUENZA/genomeset.dat.gz -O $NCBI_META
wget https://ftp.ncbi.nih.gov/genomes/INFLUENZA/influenza.fna.gz -O $NCBI_FASTA
```
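If you re-run the download commands, `wget` fetches the files again. A possible way to avoid that is sketched below. `fetch_once` is a hypothetical helper (not part of nf-flu or wget) that skips the download when the target file already exists and is non-empty, and then uses `gzip -t` to check that the file is an intact gzip archive.

```shell
# Hypothetical helper: download URL to DEST only if DEST is missing or empty,
# then verify that DEST is an intact gzip file.
# Usage: fetch_once URL DEST
fetch_once() {
    local url="$1" dest="$2"
    if [ -s "$dest" ]; then
        echo "already present, skipping download: $dest"
    else
        mkdir -p "$(dirname "$dest")"
        wget "$url" -O "$dest" || return 1
    fi
    # Non-zero exit status if the file is truncated or not gzip at all.
    gzip -t "$dest"
}

# Example calls with the paths used above (commented out because they need network access):
# fetch_once https://ftp.ncbi.nih.gov/genomes/INFLUENZA/genomeset.dat.gz \
#     ~/workshop_data/references/ncbi_influenza_metadata/genomeset.dat.gz
# fetch_once https://ftp.ncbi.nih.gov/genomes/INFLUENZA/influenza.fna.gz \
#     ~/workshop_data/references/ncbi_influenza_fasta/influenza.fna.gz
```

A failed `gzip -t` usually means an interrupted download. In that case, delete the file and call `fetch_once` again.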
Next, let's analyse barcode01 of the same run. This time we will use `medaka` for variant calling.
Attention: when using `medaka`, you have to specify which medaka model to use for variant calling and consensus. This should match the model that Guppy used for basecalling. You can find a list of available models here: https://github.com/nanoporetech/medaka/blob/master/medaka/options.py
For the R10 pore chemistry the naming scheme is as follows (variant-calling models additionally carry `_variant` before the Guppy version):
`r[pore_chemistry]_e82_[basecall_speed]_[basecalling_mode]_[guppy_version]`
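Our data was generated on an R10.4.1 pore at 400 bps with Fast basecalling, and basecalled with Guppy 6.4.6 (you can find this information in the file `report_[....].html` that is generated by the MinION). medaka only provides models for selected Guppy versions, and the usual guidance is to pick the newest model whose version is at or below your Guppy version. Here that is g632 (Guppy 6.3.2), so the best `medaka` model we can choose for variant calling is `r1041_e82_400bps_fast_variant_g632`.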
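To make the naming scheme concrete, here is a small sketch that assembles the model name from its parts. `medaka_variant_model` is a hypothetical helper written for this tutorial, not a medaka command. It only builds the string and does not check that the model actually exists.

```shell
# Hypothetical helper: assemble a medaka variant-calling model name.
# Usage: medaka_variant_model PORE_VERSION SPEED MODE MODEL_GUPPY_VERSION
medaka_variant_model() {
    local pore speed="$2" mode="$3" guppy
    # Strip the dots from the version numbers: 10.4.1 -> 1041, 6.3.2 -> 632
    pore=$(printf '%s' "$1" | tr -d '.')
    guppy=$(printf '%s' "$4" | tr -d '.')
    echo "r${pore}_e82_${speed}_${mode}_variant_g${guppy}"
}

# R10.4.1 pore, 400 bps, Fast basecalling, model trained for Guppy 6.3.2
MODEL=$(medaka_variant_model 10.4.1 400bps fast 6.3.2)
echo "$MODEL"
```

If medaka is installed locally, `medaka tools list_models` should print the names that are actually available, so you can confirm that the assembled name is among them.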
```
NCBI_META=~/workshop_data/references/ncbi_influenza_metadata/genomeset.dat.gz
NCBI_FASTA=~/workshop_data/references/ncbi_influenza_fasta/influenza.fna.gz
BARCODE=barcode01
DIR=/home/$USER/workshop_data/groupA
conda activate pycoQC
cd $DIR
pycoQC --summary_file ./sequencing_summary*.txt --html_outfile pycoQC_report.html
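# per-barcode read statistics with NanoPlot
conda activate QC
mkdir -p $DIR/nanoplot/$BARCODE
cd $DIR/nanoplot/$BARCODE
NanoPlot --fastq ../../fastq_pass/$BARCODE/F*.fastq.gz
# merge all reads of this barcode into a single file
# (delete all_reads.fastq.gz before re-running, otherwise the glob also matches it as input)
cat $DIR/fastq_pass/$BARCODE/*.fastq.gz > $DIR/fastq_pass/$BARCODE/all_reads.fastq.gz
# discard reads shorter than 100 bp
mkdir -p $DIR/fastq_pass/$BARCODE/filtered
filtlong --min_length 100 $DIR/fastq_pass/$BARCODE/all_reads.fastq.gz \
| gzip > $DIR/fastq_pass/$BARCODE/filtered/filtered_reads.fastq.gz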
NFDIR=$DIR/nf-flu/$BARCODE
mkdir -p $NFDIR
echo "sample,reads" > $NFDIR/samplesheet.csv
echo "$BARCODE,$DIR/fastq_pass/$BARCODE/filtered" >> $NFDIR/samplesheet.csv
cat $NFDIR/samplesheet.csv
conda activate base
cd $NFDIR
# run nf-flu with medaka as variant caller and the pre-downloaded NCBI files
nextflow run CFIA-NCFAD/nf-flu --input samplesheet.csv --platform nanopore -profile docker \
--skip_irma_subtyping_report false \
--variant_caller medaka \
--medaka_variant_model r1041_e82_400bps_fast_variant_g632 \
--ncbi_influenza_fasta $NCBI_FASTA \
--ncbi_influenza_metadata $NCBI_META
```
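The steps above handle one barcode at a time. Since the samplesheet holds one row per sample, several barcodes of the same run could in principle go into a single sheet. The sketch below shows one way to build it. `write_samplesheet` is a hypothetical helper, and it assumes that each barcode already has a `filtered` directory as created above and that all barcodes come from the same run (so the same medaka model applies).

```shell
# Hypothetical helper: write a samplesheet with one row per barcode.
# Barcodes without a filtered/ directory are skipped with a warning.
# Usage: write_samplesheet OUT_CSV RUN_DIR BARCODE...
write_samplesheet() {
    local out="$1" run_dir="$2" bc reads
    shift 2
    echo "sample,reads" > "$out"
    for bc in "$@"; do
        reads="$run_dir/fastq_pass/$bc/filtered"
        if [ -d "$reads" ]; then
            echo "$bc,$reads" >> "$out"
        else
            echo "skipping $bc: $reads not found" >&2
        fi
    done
}

# Demo on a throwaway directory tree that mimics the workshop layout.
DEMO=$(mktemp -d)
mkdir -p "$DEMO/fastq_pass/barcode01/filtered" "$DEMO/fastq_pass/barcode08/filtered"
write_samplesheet "$DEMO/samplesheet.csv" "$DEMO" barcode01 barcode08 barcode99
cat "$DEMO/samplesheet.csv"
```

With the workshop data, the call would look like `write_samplesheet $NFDIR/samplesheet.csv $DIR barcode01 barcode08`.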
Results:
https://drive.google.com/drive/folders/1AotYhrseIOdEprWJT6v5nbouHHsS2zY1?usp=drive_link