# Analysis of fungal ITS amplicons from Illumina sequencing with QIIME 2
MSc. Kelly J. Hidalgo Martinez
Microbiologist
Ph.D. student in Genetics and Molecular Biology
Division of Microbial Resources
Research Center for Agriculture, Biology and Chemical
University of Campinas
Brazil
Phone: +55 19 98172 1510
---
### Requirements
1. Putty [for windows users](https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html)
2. [QIIME 2](https://docs.qiime2.org/2019.4/), installed via conda in the `qiime2-2019.4` environment
3. FastQC, installed via conda in the `bioinfo` environment
4. Trimmomatic, installed via conda in the `bioinfo` environment ([manual](http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/TrimmomaticManual_V0.32.pdf))
5. Filezilla [windows users](https://filezilla-project.org/download.php?platform=win64) / [linux users](https://filezilla-project.org/download.php)

6. [Download](https://github.com/khidalgo85/qiime2/raw/master/workflow_its_qiime2.pdf) the schematic workflow
## Important tips before you start
1. When typing a file name or directory path, you can use **tab completion**
2. Press the **up arrow** to bring back commands you typed previously
3. Do not store commands in a word processing program
4. Shell commands are **case-sensitive**
5. Every command has a **help** menu. (qiime --help, qiime dada2 --help, qiime dada2 denoise-paired --help)
### 1. Join the server
```coffeescript=
## Server IP address (in FileZilla, use the IP address for the Host box and bioinfo for the Username box)
ssh -x bioinfo@143.106.82.118
## Password (In filezilla for the password box. Port box 22)
bioinfo@15
## As root
su root
## Root password
20141117
```
### 2. Working Directory
```coffeescript=
## Change directory
cd /data/treinamento/its/
## Make new directory for the raw data
mkdir 00.RawData
## Change directory
cd 00.RawData/
```
Put the samples in the `00.RawData` directory.
***Download the dataset*** (skip this step if you have your own samples)
```coffeescript=
## sample 1 Forward
wget https://github.com/USDA-ARS-GBRU/itsxpress-tutorial/raw/master/data/sample1_r1.fq.gz
## sample 1 Reverse
wget https://github.com/USDA-ARS-GBRU/itsxpress-tutorial/raw/master/data/sample1_r2.fq.gz
## sample 2 Forward
wget https://github.com/USDA-ARS-GBRU/itsxpress-tutorial/raw/master/data/sample2_r1.fq.gz
## sample 2 Reverse
wget https://github.com/USDA-ARS-GBRU/itsxpress-tutorial/raw/master/data/sample2_r2.fq.gz
```
---
## *First option: Quality control with FastQC and trimming with Trimmomatic (outside QIIME 2)*
---
### 3A. Trim primers with cutadapt
Cutadapt is installed in the `qiime2-2019.4` virtual environment.
It discards reads that do not begin with the primer sequences and removes the primers from the remaining reads.
```coffeescript=
## Go back to the working directory (the paths in the following steps are relative to it)
cd /data/treinamento/its/
## Activate the virtual environment
conda activate qiime2-2019.4
## Make new directory for the samples without primers
mkdir 01.PrimersTrim
```
The primers below correspond to the fungal ITS2 region:

* forward: `-g GCATCGATGAAGAACGCAGC`
* reverse: `-G TCCTCCGCTTATTGATATGC`

You have to run the next command once for each sample, changing the sample name in the input and output files.
```coffeescript=
parallel --link --jobs 4 \
'cutadapt \
--pair-filter any \
--no-indels \
--discard-untrimmed \
-g GCATCGATGAAGAACGCAGC \
-G TCCTCCGCTTATTGATATGC \
-o 01.PrimersTrim/sample1_r1.fq.gz \
-p 01.PrimersTrim/sample1_r2.fq.gz \
{1} {2} \
> 01.PrimersTrim/sample1_cutadapt_log.txt' \
::: 00.RawData/sample1_r1.fq.gz ::: 00.RawData/sample1_r2.fq.gz
```
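
Because the command has to be repeated for every sample, a loop can derive the sample name from each forward-read file. The following is only a minimal sketch: it assumes the reads are named `<sample>_r1.fq.gz` and `<sample>_r2.fq.gz` as in the tutorial dataset, and the function name `trim_primers_all` and the `CUTADAPT` variable are hypothetical, not part of cutadapt.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run the cutadapt command shown above once per sample.
# Assumes reads are named <sample>_r1.fq.gz / <sample>_r2.fq.gz.
# CUTADAPT may be overridden (e.g. CUTADAPT="echo cutadapt") for a dry run
# that writes the command line to the log file instead of trimming.
trim_primers_all() {
    local raw_dir="$1" out_dir="$2"
    local cutadapt_cmd="${CUTADAPT:-cutadapt}"
    local r1 r2 sample
    mkdir -p "$out_dir"
    for r1 in "$raw_dir"/*_r1.fq.gz; do
        [ -e "$r1" ] || continue              # no forward reads found
        sample=$(basename "$r1" _r1.fq.gz)    # e.g. sample1
        r2="$raw_dir/${sample}_r2.fq.gz"
        if [ ! -e "$r2" ]; then
            echo "Missing reverse reads for $sample" >&2
            continue
        fi
        # Same options and primers as the single-sample command
        $cutadapt_cmd \
            --pair-filter any \
            --no-indels \
            --discard-untrimmed \
            -g GCATCGATGAAGAACGCAGC \
            -G TCCTCCGCTTATTGATATGC \
            -o "$out_dir/${sample}_r1.fq.gz" \
            -p "$out_dir/${sample}_r2.fq.gz" \
            "$r1" "$r2" \
            > "$out_dir/${sample}_cutadapt_log.txt"
    done
}
```

With the function loaded in the shell, `trim_primers_all 00.RawData 01.PrimersTrim` would process every pair found in `00.RawData` and write one `*_cutadapt_log.txt` per sample.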
Download microbiome_helper package
```coffeescript=
git clone https://github.com/LangilleLab/microbiome_helper.git
```
Use a microbiome_helper script to summarize the cutadapt logs into a single `cutadapt_log.txt` file with the number of reads that passed primer trimming.
```coffeescript=
microbiome_helper/parse_cutadapt_logs.py -i 01.PrimersTrim/*log.txt
## See the txt file
nano cutadapt_log.txt
## Delete the intermediate log files
rm 01.PrimersTrim/*_log.txt
```
Create an .xlsx (Excel) file to track the number of reads at each step

| SampleID | Raw_reads | Length | Post_trim_primers | Difference | Post-trim | Difference | itsXpress | Difference | % Lost | Denoised | Difference | Merged | Difference | non-chimeric | Final difference | Final % lost |
| -------- | --------- | ------ | ----------------- | ---------- | --------- | ---------- | --------- | ---------- | ------ | -------- | ---------- | ------ | ---------- | ------------ | ---------------- | ------------ |

Like this! (*summary_stats.xlsx*: this template is the working material)

### 4A. Inspect read quality
```coffeescript=
## Make new directory for the quality reports
mkdir 02.FastqcReports
## FastQC is installed in the bioinfo virtual environment. Activate it!
conda activate bioinfo
## Run fastqc
fastqc -t 10 01.PrimersTrim/* -o 02.FastqcReports/
## Make one report with all the samples
multiqc 02.FastqcReports/* -o 02.FastqcReports/
```
Review the MultiQC output, `multiqc_report.html`, in `02.FastqcReports/`. Download it with FileZilla and open it in a web browser. The most important metric is the per-base quality, which should be above Q30.
### 5A. Filter out low-quality reads
```coffeescript=
## Make new directory for the samples after quality control
mkdir 03.CleanData
## Make new directory for the unpaired reads (this is trash)
mkdir unpaired
```
Run Trimmomatic (Modify the trimming parameters for your samples)
#### Parameters
* *LEADING* Removes low-quality bases from the beginning of the read. Specifies the minimum quality required to keep a base.
* *TRAILING* Removes low-quality bases from the end of the read. Specifies the minimum quality required to keep a base.
* *SLIDINGWINDOW* Performs sliding-window trimming, cutting once the average quality within the window falls below a threshold. windowSize:requiredQuality.
* *MAXINFO* Performs adaptive quality trimming, balancing the benefit of retaining longer reads against the cost of retaining bases with errors. targetLength:strictness.
* *MINLEN* Removes reads that fall below the specified minimum length.

For more details see the [manual](http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/TrimmomaticManual_V0.32.pdf)
```coffeescript=
for i in 01.PrimersTrim/*1.fq.gz
do
BASE=$(basename "$i" 1.fq.gz)
## Adjust the number of threads to your machine
trimmomatic PE -threads 12 \
"$i" "01.PrimersTrim/${BASE}2.fq.gz" \
"03.CleanData/${BASE}1_paired.fq.gz" "unpaired/${BASE}1_unpaired.fq.gz" \
"03.CleanData/${BASE}2_paired.fq.gz" "unpaired/${BASE}2_unpaired.fq.gz" \
LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MAXINFO:80:0.5 MINLEN:200
done
```
Re-check the quality
```coffeescript=
fastqc -t 10 03.CleanData/* -o 02.FastqcReports/
multiqc 02.FastqcReports/*paired* -o 02.FastqcReports/
## Download from Filezilla
```
## 6A. Importing the FASTQ files as an artifact
Create the ManifestFile.txt
```coffeescript=
nano ManifestFile.txt
## Columns separated by tab
sample-id forward-absolute-filepath reverse-absolute-filepath
sample1 $PWD/03.CleanData/sample1_r1_paired.fq.gz $PWD/03.CleanData/sample1_r2_paired.fq.gz
sample2 $PWD/03.CleanData/sample2_r1_paired.fq.gz $PWD/03.CleanData/sample2_r2_paired.fq.gz
## To exit
Ctrl + x
## To save, answer the prompt with Y (S if nano is in Portuguese)
Y
## To confirm the file name
Enter
```
Like this!
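
With many samples, typing the manifest by hand is error-prone. As an alternative, the sketch below writes the same three tab-separated columns for every forward-read file in a directory. The function name `make_manifest` and its suffix arguments are hypothetical; it assumes each forward file has a reverse partner with the matching suffix.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a tab-separated paired-end manifest for every
# <sample><r1_suffix> file found in a directory, using absolute paths.
make_manifest() {
    local read_dir="$1" r1_suffix="$2" r2_suffix="$3"
    local abs_dir r1 sample
    abs_dir=$(cd "$read_dir" && pwd) || return 1   # absolute path of the reads
    printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n'
    for r1 in "$abs_dir"/*"$r1_suffix"; do
        [ -e "$r1" ] || continue                   # directory has no matches
        sample=$(basename "$r1" "$r1_suffix")      # e.g. sample1
        printf '%s\t%s\t%s\n' "$sample" "$r1" "$abs_dir/${sample}${r2_suffix}"
    done
}
```

For the trimmed reads of this step, the call would be `make_manifest 03.CleanData _r1_paired.fq.gz _r2_paired.fq.gz > ManifestFile.txt`.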

```coffeescript=
## Deactivate the bioinfo environment to return to the qiime2-2019.4 environment
conda deactivate
## Make new directory for the artifacts
mkdir 04.ImportedReads
## Import as an artifact in QIIME 2
qiime tools import \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path ManifestFile.txt \
--output-path 04.ImportedReads/reads_trimmed.qza \
--input-format PairedEndFastqManifestPhred33V2
## Visualization
qiime demux summarize \
--i-data 04.ImportedReads/reads_trimmed.qza \
--o-visualization 04.ImportedReads/reads_trimmed.qzv
```
Open the `.qzv` visualizations with [QIIME 2 View](https://view.qiime2.org)
Generate a table with the number of sequences in each step using a microbiome_helper script
```coffeescript=
microbiome_helper/qiime2_fastq_lengths.py 04.ImportedReads/reads_trimmed.qza --proc 4 -o read_counts.tsv
## See the file
nano read_counts.tsv
```
## 7A. OPTIONAL! Extracting fungal ITS with [itsXpress](https://github.com/USDA-ARS-GBRU/q2_itsxpress) (only for fungal analysis)
This QIIME 2 plugin extracts the ITS1, ITS2 or full-length ITS region from high-throughput sequencing datasets.
```coffeescript=
## Make new directory for ItsX output
mkdir 05.ItsXpress
## for help
qiime itsxpress trim-pair-output-unmerged --help
## Run (The region depends on your sequenced region)
qiime itsxpress trim-pair-output-unmerged \
--i-per-sample-sequences 04.ImportedReads/reads_trimmed.qza \
--p-region ITS2 \
--p-taxa F \
--p-threads 10 \
--o-trimmed 05.ItsXpress/readstrimmed_itsxpress_out.qza
```
Generate a table with the number of sequences in each step using a microbiome_helper script
```coffeescript=
microbiome_helper/qiime2_fastq_lengths.py 04.ImportedReads/reads_trimmed.qza 05.ItsXpress/readstrimmed_itsxpress_out.qza --proc 4 -o read_counts.tsv
## See the file
nano read_counts.tsv
## Complete your summary_stats.xlsx file!!
```
## 8A. Denoising, joining reads and chimera removal with DADA2
If you skipped step 7A, change the input file of the next command: use `04.ImportedReads/reads_trimmed.qza` instead of `05.ItsXpress/readstrimmed_itsxpress_out.qza`.
```coffeescript=
## Run DADA2
qiime dada2 denoise-paired --i-demultiplexed-seqs 05.ItsXpress/readstrimmed_itsxpress_out.qza \
--p-trunc-len-f 0 \
--p-trunc-len-r 0 \
--output-dir 06.Dada2Output
## Convert the denoising stats from .qza to .tsv
qiime tools export --input-path 06.Dada2Output/denoising_stats.qza --output-path 06.Dada2Output
## See the file
nano 06.Dada2Output/stats.tsv
## Complete your summary_stats.xlsx file!!
```
## 9. Download and fit the database
For fungal classification, the reference database is UNITE.
You can train (fit) the classifier yourself with the commands below, or download a ready-to-use one [here](http://kronos.pharmacology.dal.ca/public_files/taxa_classifiers/qiime2-2019.7_classifiers/classifier_sh_refs_qiime_ver8_99_s_02.02.2019_ITS.qza)
```coffeescript=
## Download the database
wget https://files.plutof.ut.ee/public/orig/51/6F/516F387FC543287E1E2B04BA4654443082FE3D7050E92F5D53BA0702E4E77CD4.zip
## Change the name
mv 516F387FC543287E1E2B04BA4654443082FE3D7050E92F5D53BA0702E4E77CD4.zip unite_02_02_2019.zip
## Unzip
unzip unite_02_02_2019.zip
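## Remove the archive and the files outside the developer directory
## (patterns limited to the UNITE sh_* files so that ManifestFile.txt and other .txt files here are kept)
rm unite_02_02_2019.zip sh_*.fasta sh_*.txt
## Formatting the fasta file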
awk '/^>/ {print($0)}; /^[^>]/ {print(toupper($0))}' developer/sh_refs_qiime_ver8_dynamic_02.02.2019_dev.fasta | sed -e '/^>/!s/\(.*\)/\U\1/;s/[[:blank:]]*$//' > developer/sh_refs_qiime_ver8_dynamic_02.02.2019_uppercase.fasta
```
```coffeescript=
## Make new directory for the database
mkdir database
## Import the sequences as artifact
qiime tools import \
--type 'FeatureData[Sequence]' \
--input-path developer/sh_refs_qiime_ver8_dynamic_02.02.2019_uppercase.fasta \
--output-path database/UNITE.qza
## Import the taxonomy file as artifact
qiime tools import \
--type 'FeatureData[Taxonomy]' \
--input-path developer/sh_taxonomy_qiime_ver8_dynamic_02.02.2019_dev.txt \
--output-path database/UNITE_tax.qza \
--input-format HeaderlessTSVTaxonomyFormat
## Train (fit) the classifier on the database
qiime feature-classifier fit-classifier-naive-bayes \
--i-reference-reads database/UNITE.qza \
--i-reference-taxonomy database/UNITE_tax.qza \
--o-classifier database/UNITE_classifier.qza
```
If your analysis isn't for fungal sequences, you can download below the classifier that matches your sequenced region:
* [16S V4/V5 region](http://kronos.pharmacology.dal.ca/public_files/taxa_classifiers/qiime2-2019.7_classifiers/classifier_silva_132_99_16S_V4.V5_515F_926R.qza) (`classifier_silva_132_99_16S_V4.V5_515F_926R.qza`)
* [16S V6/V8 region](http://kronos.pharmacology.dal.ca/public_files/taxa_classifiers/qiime2-2019.7_classifiers/classifier_silva_132_99_16S_V6.V8_B969F_BA1406R.qza) (`classifier_silva_132_99_16S_V6.V8_B969F_BA1406R.qza`)
* [16S V6/V8 region targeting archaea](http://kronos.pharmacology.dal.ca/public_files/taxa_classifiers/qiime2-2019.7_classifiers/classifier_silva_132_99_16S_V6.V8_A956F_A1401R.qza) (`classifier_silva_132_99_16S_V6.V8_A956F_A1401R.qza`)
* [18S V4 region](http://kronos.pharmacology.dal.ca/public_files/taxa_classifiers/qiime2-2019.7_classifiers/classifier_silva_132_99_18S_V4_E572F_E1009R.qza) (`classifier_silva_132_99_18S_V4_E572F_E1009R.qza`)
* [Full ITS - all eukaryotes](http://kronos.pharmacology.dal.ca/public_files/taxa_classifiers/qiime2-2019.7_classifiers/classifier_sh_refs_qiime_ver8_99_s_all_02.02.2019_ITS.qza) (`classifier_sh_refs_qiime_ver8_99_s_all_02.02.2019_ITS.qza`)
## 10. Assign taxonomy to ASVs
```coffeescript=
## Make new directory for the taxonomy classification
mkdir 07.TaxonomyClassification
## Run
qiime feature-classifier classify-sklearn \
--i-classifier database/UNITE_classifier.qza \
--i-reads 06.Dada2Output/representative_sequences.qza \
--o-classification 07.TaxonomyClassification/taxonomyclassification_dynamic.qza
```
It is recommended to compare the taxonomic assignments with the top BLASTn hits for a few ASVs (~5)
```coffeescript=
qiime feature-table tabulate-seqs --i-data 06.Dada2Output/representative_sequences.qza \
--o-visualization 06.Dada2Output/representative_sequences.qzv
```
representative_sequences.qzv

## BONUS (FOR 16S analysis)
### Build tree with FastTree and MAFFT qiime2 plugins
```coffeescript=
## Make new directory for the phylogenetic tree
mkdir 12.PhylogeneticTree
## Multiple align with MAFFT
qiime alignment mafft --i-sequences 06.Dada2Output/representative_sequences.qza \
--p-n-threads 4 \
--o-alignment 12.PhylogeneticTree/rep_seqs_aligned.qza
qiime alignment mask --i-alignment 12.PhylogeneticTree/rep_seqs_aligned.qza \
--o-masked-alignment 12.PhylogeneticTree/rep_seqs_aligned_masked.qza
## FastTree
qiime phylogeny fasttree --i-alignment 12.PhylogeneticTree/rep_seqs_aligned_masked.qza \
--p-n-threads 4 \
--o-tree 12.PhylogeneticTree/rep_seqs_aligned_masked_tree.qza
# Root the tree
qiime phylogeny midpoint-root --i-tree 12.PhylogeneticTree/rep_seqs_aligned_masked_tree.qza \
--o-rooted-tree 12.PhylogeneticTree/rep_seqs_aligned_masked_tree_rooted.qza
```
## 11. Barplot
```coffeescript=
## Make new directory for the graphs
mkdir 08.Graphs
## Create a sample-metadata.txt file. Here you can put all the variables that describes your samples
nano sample-metadata.txt
sample-id SampleName SampleType Local Date
#q2:types categorical categorical categorical categorical
sample1 sample1 water SãoPaulo April-18
sample2 sample2 Soil Campinas April-19
## The second row classifies each variable according to its nature (categorical or numeric)
```
Like this! Separate by tab
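
A file that mixes spaces and tabs looks right on screen but does not parse as a table. One way to catch this before running the plots is to check that every line has the same number of tab-separated fields as the header. The following is a small sketch and the function name `check_metadata_tabs` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical check: every non-empty line of the metadata file must have the
# same number of tab-separated fields as the header line (at least two).
# Prints the offending lines and returns a non-zero status on failure.
check_metadata_tabs() {
    awk -F'\t' '
        NF == 0 { next }                       # ignore blank lines
        !seen {
            expected = NF; seen = 1
            if (NF < 2) { print "header has no tab-separated columns"; bad = 1 }
            next
        }
        NF != expected {
            printf "line %d: %d fields, expected %d\n", NR, NF, expected
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Running `check_metadata_tabs sample-metadata.txt` prints nothing when the layout is consistent.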

```coffeescript=
## Make a barplot
qiime taxa barplot \
--i-table 06.Dada2Output/table.qza \
--i-taxonomy 07.TaxonomyClassification/taxonomyclassification_dynamic.qza \
--m-metadata-file sample-metadata.txt \
--o-visualization 08.Graphs/taxa-bar-plots-dynamic.qzv
```
Barplot on qiime2

## 12. Rarefaction curves
Check the feature table from the DADA2 output to find the sequencing depth of each sample
```coffeescript=
qiime feature-table summarize \
--i-table 06.Dada2Output/table.qza \
--o-visualization 06.Dada2Output/table_summary.qzv \
--m-sample-metadata-file sample-metadata.txt
```
table_summary.qzv

```coffeescript=
## Make new directory for the rarefaction curves
mkdir 09.RarefactionCurves
## Run
qiime diversity alpha-rarefaction --i-table 06.Dada2Output/table.qza \
--p-max-depth 6782 \
--p-steps 10 \
--p-metrics shannon \
--p-metrics observed_otus \
--p-metrics simpson \
--p-metrics chao1 \
--m-metadata-file sample-metadata.txt \
--o-visualization 09.RarefactionCurves/rarefaction_curves.qzv
```
rarefaction_curves.qzv

## 13. Alpha and Beta diversity analysis
Choose the sampling depth based on the smallest library you want to keep (see table_summary.qzv); samples with fewer sequences than this value are dropped.
Phylogenetic distances are not recommended for fungal ITS analysis. For 16S analysis, use the command `qiime diversity core-metrics-phylogenetic` to include the phylogenetic analysis, with the `12.PhylogeneticTree/rep_seqs_aligned_masked_tree_rooted.qza` tree
```coffeescript=
qiime diversity core-metrics --i-table 06.Dada2Output/table.qza \
--p-sampling-depth 6089 \
--m-metadata-file sample-metadata.txt \
--p-n-jobs 4 \
--output-dir 10.AlphaBetaDiversity
qiime diversity alpha --i-table 06.Dada2Output/table.qza \
--p-metric chao1 \
--o-alpha-diversity 10.AlphaBetaDiversity/chao1_vector.qza
qiime diversity alpha --i-table 06.Dada2Output/table.qza \
--p-metric simpson \
--o-alpha-diversity 10.AlphaBetaDiversity/simpson_vector.qza
# To see all the vectors (alpha diversity indexes and estimators) in one table
qiime metadata tabulate --m-input-file sample-metadata.txt \
--m-input-file 10.AlphaBetaDiversity/shannon_vector.qza \
--m-input-file 10.AlphaBetaDiversity/observed_otus_vector.qza \
--m-input-file 10.AlphaBetaDiversity/simpson_vector.qza \
--m-input-file 10.AlphaBetaDiversity/chao1_vector.qza \
--o-visualization 10.AlphaBetaDiversity/alfadiversidade_all.qzv
```
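
The two depths used in this tutorial (6089 for `--p-sampling-depth` and 6782 for `--p-max-depth`) are the smallest and largest per-sample counts after DADA2. Instead of reading them off the visualization, they can be pulled from the exported `stats.tsv`. This is a sketch under assumptions: the function name `depth_range` is hypothetical, and it expects a tab-separated file with a header column called `non-chimeric` and an optional `#q2:types` row.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report the smallest and largest value of the
# "non-chimeric" column of an exported DADA2 stats.tsv file.
depth_range() {
    awk -F'\t' '
        NR == 1 {
            # Locate the column by name instead of assuming its position
            for (i = 1; i <= NF; i++) if ($i == "non-chimeric") col = i
            if (!col) { print "no non-chimeric column" > "/dev/stderr"; exit 1 }
            next
        }
        /^#/    { next }                 # skip the "#q2:types" row
        NF == 0 { next }                 # skip blank lines
        {
            n = $col + 0
            if (!seen || n < min) min = n
            if (!seen || n > max) max = n
            seen = 1
        }
        END { if (seen) printf "min=%d max=%d\n", min, max }
    ' "$1"
}
```

For example, `depth_range 06.Dada2Output/stats.tsv` prints one line of the form `min=... max=...`; the minimum is a candidate sampling depth if no sample should be dropped.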
## 14. Exporting
```coffeescript=
## Make new directory for the exported files
mkdir 11.ExportFiles
qiime tools export --input-path 06.Dada2Output/table.qza \
--output-path 11.ExportFiles/
## .biom to .tsv
biom convert -i 11.ExportFiles/feature-table.biom \
-o 11.ExportFiles/Otu_Table.tsv \
--to-tsv \
--table-type "OTU table"
```
Open the `Otu_Table.tsv` file and change `#OTU ID` to `OTUID`.
```coffeescript=
nano 11.ExportFiles/Otu_Table.tsv
```
Export the taxonomy
```coffeescript=
qiime tools export --input-path 07.TaxonomyClassification/taxonomyclassification_dynamic.qza \
--output-path 11.ExportFiles/taxonomy
```
Open `taxonomy.tsv`, change Feature ID to OTUID
```coffeescript=
nano 11.ExportFiles/taxonomy/taxonomy.tsv
```
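
The two header changes can also be made without opening an editor. The sketch below renames the first field of the first line that matches; the function name `rename_id_header` is hypothetical and the file is assumed to be tab-separated.

```bash
#!/usr/bin/env bash
# Hypothetical helper: replace the first column name of a tab-separated file.
# Only the first line whose first field equals <old> is changed.
rename_id_header() {
    local file="$1" old="$2" new="$3" tmp
    tmp=$(mktemp) || return 1
    awk -F'\t' -v OFS='\t' -v old="$old" -v new="$new" '
        !renamed && $1 == old { $1 = new; renamed = 1 }
        { print }
    ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back in place so the original file permissions are kept
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For the files of this step: `rename_id_header 11.ExportFiles/Otu_Table.tsv "#OTU ID" OTUID` and `rename_id_header 11.ExportFiles/taxonomy/taxonomy.tsv "Feature ID" OTUID`.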
# Final `summary_stats.xlsx`
| SampleID | Raw_reads | Length | Post_trim_primers | Difference | Post-trim | Difference | itsXpress | Difference | % Lost | Denoised | Difference | Merged | Difference | non-chimeric | Final difference | Final % lost |
| -------- | --------- | ------ | ----------------- | ---------- | --------- | ---------- | --------- | ---------- | ------ | -------- | ---------- | ------ | ---------- | ------------ | ---------------- | ------------ |
|sample1|10000|48-251|10000|0|7755|2245|6415|3585|22,45|6354|3646|6089|3911|6089|3911|39,11|
|sample2|10000|48-251|10000|0|7414|2586|7386|2614|25,86|7296|2704|6782|3218|6782|3218|32,18|
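
The difference and percentage columns are plain arithmetic on the raw count: reads lost = raw minus remaining, and % lost = reads lost / raw × 100. A small sketch for filling them in (the function name `percent_lost` is hypothetical); for example, `percent_lost 10000 6089` reproduces the final values of sample1, 3911 reads and 39.11 %.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "<reads lost><TAB><percent lost>" for one sample,
# given the raw read count and the count remaining after a step.
percent_lost() {
    awk -v raw="$1" -v final="$2" 'BEGIN {
        if (raw <= 0) { print "raw count must be positive" > "/dev/stderr"; exit 1 }
        printf "%d\t%.2f\n", raw - final, (raw - final) / raw * 100
    }'
}
```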
---
## *Second option: Quality control and trimming with Qiime2 plugins*
---
## 2B. Importing the FASTQ files as an artifact
Create the ManifestFile.txt
```coffeescript=
nano ManifestFile.txt
## Columns separated by tab
sample-id forward-absolute-filepath reverse-absolute-filepath
sample1 $PWD/00.RawData/sample1_r1.fq.gz $PWD/00.RawData/sample1_r2.fq.gz
sample2 $PWD/00.RawData/sample2_r1.fq.gz $PWD/00.RawData/sample2_r2.fq.gz
## To exit
Ctrl + x
## To save, answer the prompt with Y (S if nano is in Portuguese)
Y
## To confirm the file name
Enter
```
Like this!

```coffeescript=
## Make new directory for the artifacts
mkdir 01.ImportedReads
## Import as artefact in qiime2
qiime tools import \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path ManifestFile.txt \
--output-path 01.ImportedReads/reads_raw.qza \
--input-format PairedEndFastqManifestPhred33V2
## Visualization
qiime demux summarize \
--i-data 01.ImportedReads/reads_raw.qza \
--o-visualization 01.ImportedReads/reads_raw.qzv
```
Open the `.qzv` visualizations with [QIIME 2 View](https://view.qiime2.org/)
Download microbiome_helper package
```coffeescript=
git clone https://github.com/LangilleLab/microbiome_helper.git
```
Generate a table with the number of sequences in each step using a microbiome_helper script
```coffeescript=
microbiome_helper/qiime2_fastq_lengths.py 01.ImportedReads/reads_raw.qza --proc 4 -o read_counts.tsv
## See the file
nano read_counts.tsv
```
## 3B. Trim primers with cutadapt qiime2 plugin
Cutadapt is also available as a QIIME 2 plugin.
It discards reads that do not begin with the primer sequences and removes the primers from the remaining reads.
```coffeescript=
qiime cutadapt trim-paired \
--i-demultiplexed-sequences 01.ImportedReads/reads_raw.qza \
--p-cores 10 \
--p-front-f GCATCGATGAAGAACGCAGC \
--p-front-r TCCTCCGCTTATTGATATGC \
--p-discard-untrimmed \
--p-no-indels \
--o-trimmed-sequences 01.ImportedReads/reads_primerstrim.qza
# Visualization
qiime demux summarize \
--i-data 01.ImportedReads/reads_primerstrim.qza \
--o-visualization 01.ImportedReads/reads_primerstrim.qzv
# Generate a table with the number of sequences in each step using a microbiome_helper script
microbiome_helper/qiime2_fastq_lengths.py 01.ImportedReads/reads_raw.qza 01.ImportedReads/reads_primerstrim.qza --proc 4 -o read_counts.tsv
```
## 4B. Extracting fungal ITS with itsXpress (optional, and only for fungal analysis)
This QIIME 2 plugin extracts the ITS1, ITS2 or full-length ITS region from high-throughput sequencing datasets.
```coffeescript=
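## Installing itsXpress and its QIIME 2 plugin (skip if already installed in the qiime2-2019.4 environment)
conda install -c bioconda itsxpress
pip install q2-itsxpress
## Refresh the plugin cache so that qiime finds the new plugin
qiime dev refresh-cache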
## Make new directory for ItsX output
mkdir 02.ItsXpress
## for help
qiime itsxpress trim-pair-output-unmerged --help
## Run (The region depends on your sequenced region)
qiime itsxpress trim-pair-output-unmerged \
--i-per-sample-sequences 01.ImportedReads/reads_primerstrim.qza \
--p-region ITS2 \
--p-taxa F \
--p-threads 10 \
--o-trimmed 02.ItsXpress/readstrimmed_itsxpress_out.qza
```
Generate a table with the number of sequences in each step using a microbiome_helper script
```coffeescript=
microbiome_helper/qiime2_fastq_lengths.py 01.ImportedReads/reads_raw.qza 01.ImportedReads/reads_primerstrim.qza 02.ItsXpress/readstrimmed_itsxpress_out.qza --proc 4 -o read_counts.tsv
## See the file
nano read_counts.tsv
## Complete your summary_stats.xlsx file!!
```
## 5B. Quality control, filtering, denoising, joining reads and chimera removal with DADA2
```coffeescript=
## See the reads quality
qiime demux summarize \
--i-data 02.ItsXpress/readstrimmed_itsxpress_out.qza \
--o-visualization 02.ItsXpress/readstrimmed_itsxpress_out.qzv
```

```coffeescript=
## For help
qiime dada2 denoise-paired --help
## Run DADA2. Replace XX with the values for your samples, according to what you observed in readstrimmed_itsxpress_out.qzv
qiime dada2 denoise-paired --i-demultiplexed-seqs 02.ItsXpress/readstrimmed_itsxpress_out.qza \
--p-trunc-len-f XX \
--p-trunc-len-r XX \
--p-trim-left-f XX \
--p-trim-left-r XX \
--output-dir 03.Dada2Output
## Convert the denoising stats from .qza to .tsv
qiime tools export --input-path 03.Dada2Output/denoising_stats.qza \
--output-path 03.Dada2Output
## See the file
nano 03.Dada2Output/stats.tsv
## Complete your summary_stats.xlsx file!!
```
### Continue to step 9 of the first option.
## ATTENTION!!!
If you followed the second option, pay attention to the names of directories and files when you continue to step 9: you will have to change them to match your own outputs (for example, `03.Dada2Output` instead of `06.Dada2Output`).
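
As an alternative to editing every command, note that steps 9 to 14 read their upstream inputs only from `06.Dada2Output`. One possible shortcut, sketched here under that assumption, is to create a symbolic link with that name pointing to the second option's DADA2 directory. The function name `link_option_b_dirs` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical shortcut: make 06.Dada2Output point to 03.Dada2Output so that
# the commands of steps 9-14 can be run unchanged after the second option.
link_option_b_dirs() {
    local workdir="${1:-.}"
    (
        # Subshell: the caller's current directory is left untouched
        cd "$workdir" || exit 1
        if [ ! -d 03.Dada2Output ]; then
            echo "03.Dada2Output not found in $workdir" >&2
            exit 1
        fi
        # Do nothing if a directory or link with that name already exists
        [ -e 06.Dada2Output ] || ln -s 03.Dada2Output 06.Dada2Output
    )
}
```

Called as `link_option_b_dirs /data/treinamento/its`, it leaves the original directory in place and only adds the link.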
## IMPORTANT TOOLS FOR DOWNSTREAM ANALYSIS
* [RAW Graphs](https://rawgraphs.io/): builds graphs; you only need your formatted tables.
* [MicrobiomeAnalyst](https://www.microbiomeanalyst.ca/): online tool for microbiome analysis. You need all the QIIME 2 outputs (otu_table, tax_table, sample_metadata and, optionally, phy_tree).
* [I want hue](http://medialab.github.io/iwanthue/): generates and refines palettes of optimally distinct colors for your graphs in R.
## Some Citations
* **Qiime2:** Bolyen, Evan, et al. QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science. No. e27295v1. PeerJ Preprints, 2018.
* **FASTQC:** Andrews, Simon. "FastQC: a quality control tool for high throughput sequence data." (2010).
* **Trimmomatic:** Bolger, Anthony M., Marc Lohse, and Bjoern Usadel. "Trimmomatic: a flexible trimmer for Illumina sequence data." Bioinformatics 30.15 (2014): 2114-2120.
* **Cutadapt:** Martin, Marcel. "Cutadapt removes adapter sequences from high-throughput sequencing reads." EMBnet. journal 17.1 (2011): 10-12.
* **ITSxpress:** Rivers AR, Weber KC, Gardner TG et al. ITSxpress: Software to rapidly trim internally transcribed spacer sequences with quality scores for marker gene analysis [version 1; peer review: 2 approved]. F1000Research 2018, 7:1418 (https://doi.org/10.12688/f1000research.15704.1)
* **DADA2:** Callahan, Benjamin J., et al. "DADA2: high-resolution sample inference from Illumina amplicon data." Nature methods 13.7 (2016): 581.
* **UNITE:** Abarenkov, Kessy, et al. "The UNITE database for molecular identification of fungi–recent updates and future perspectives." New Phytologist 186.2 (2010): 281-285.
* **SILVA:** Quast, Christian, et al. "The SILVA ribosomal RNA gene database project: improved data processing and web-based tools." Nucleic acids research 41.D1 (2012): D590-D596.