---
title: nf-core/funcscan tutorial
tags: nf-core,documentation,funcscan,pipeline
---
# nf-core/funcscan tutorial
In this tutorial, we will guide you through setting up a run of nf-core/funcscan almost from scratch!
It will show you how to install Nextflow, and how to turn different screening categories (and particular tools within each category) on and off.
We will simulate having performed _de novo_ assembly of two metagenomes, which we wish to screen for antimicrobial resistance genes (ARGs) and antimicrobial peptides (AMPs).
## Prerequisites
For this tutorial you will need basic command line experience.
You will also need at a minimum the following software installed:
- A Unix-like operating system (Linux, macOS, etc.)
- [conda](https://docs.conda.io/en/latest/miniconda.html) (or [mamba](https://github.com/conda-forge/miniforge#mambaforge))
- With channels correctly configured
```bash
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
```
Please see the documentation of the respective tools as necessary.
## Software and Databases
First we will make a directory that we will run our test in, and change into it.
```bash
mkdir funcscan-run
cd funcscan-run/
```
As nf-core/funcscan uses Nextflow to run the pipeline, we will install Nextflow using conda in a separate software environment called `nf-core`
```bash
mamba create -n nf-core -c bioconda nextflow nf-core
```
> ℹ️ Replace `mamba` with `conda` if you did not install `mamba`.
Once this environment is installed, we can activate it
```bash
conda activate nf-core
```
You can check this has successfully installed Nextflow by running
```bash
nextflow -version
```
This should print the version of Nextflow. For now, deactivate the environment again with
```bash
conda deactivate
```
## Make samplesheet
Next we will download some example metagenome assemblies, and prepare our input samplesheet.
First we will make a directory called `samples/`, and download two FASTA files into it.
```bash
mkdir samples/
cd samples/
## The resulting FASTA files will total ~11MB
curl https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00575326/file/ERZ1664520_FASTA.fasta.gz -o sample1.fasta.gz
curl https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00575327/file/ERZ1664518_FASTA.fasta.gz -o sample2.fasta.gz
cd ../
```
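Download hiccups can silently truncate files, so it is worth verifying that each download is a valid gzip archive with `gzip -t` (e.g. `gzip -t samples/*.fasta.gz`). The snippet below demonstrates the check on a small throwaway file rather than the real downloads:

```bash
## gzip -t checks archive integrity without decompressing to disk;
## it exits non-zero if the file is truncated or corrupt.
## Demonstrated on a throwaway file; on the real downloads you would
## run: gzip -t samples/*.fasta.gz
printf '>demo\nACGT\n' | gzip > demo.fasta.gz
gzip -t demo.fasta.gz && echo "demo.fasta.gz is intact"
rm demo.fasta.gz
```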
Now we can use some simple bash commands to programmatically create the input CSV samplesheet.
```bash
## List the path of each input file
ls -1 samples/* > paths.txt
## Construct a sample name based on the filename
sed 's#samples/##g;s#.fasta.gz##g' paths.txt > samplenames.txt
## Create the samplesheet, adding a header and then adding the samplenames and paths
echo 'sample,fasta' > samplesheet.csv
paste -d "," samplenames.txt paths.txt >> samplesheet.csv
```
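To see what the `sed` expression is doing, you can run it on a single example path: the two substitutions delete the `samples/` prefix and the `.fasta.gz` suffix, leaving just the sample name.

```bash
## The two sed substitutions strip the directory prefix and the file suffix
echo 'samples/sample1.fasta.gz' | sed 's#samples/##g;s#.fasta.gz##g'
## prints: sample1
```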
The contents of the resulting `samplesheet.csv` should look like:
```
sample,fasta
sample1,samples/sample1.fasta.gz
sample2,samples/sample2.fasta.gz
```
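If you ever edit the samplesheet by hand, a quick sanity check is to count the comma-separated fields on each line; a well-formed two-column samplesheet prints only `2`. The mechanics are shown on an inline example here; point `awk` at `samplesheet.csv` to check the real file.

```bash
## awk -F, splits each line on commas; NF is the number of fields.
## On the real file, run: awk -F, '{ print NF }' samplesheet.csv | sort -u
printf 'sample,fasta\nsample1,samples/sample1.fasta.gz\n' \
    | awk -F, '{ print NF }' | sort -u
## prints: 2
```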
> ⚠️ The `sed` command may not work on macOS, due to differences between GNU `sed` (Linux, used here) and BSD `sed` (macOS)! Check the contents of `samplenames.txt` before continuing!
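If `sed` misbehaves on your system, `basename` is a portable alternative: it strips the directory part of a path and, if given, a trailing suffix, and behaves identically on GNU (Linux) and BSD (macOS) userlands.

```bash
## basename strips the directory prefix and, if given, a trailing suffix
basename samples/sample1.fasta.gz .fasta.gz
## prints: sample1
```

You could then rebuild the names file with a loop such as `for f in samples/*.fasta.gz; do basename "$f" .fasta.gz; done > samplenames.txt`.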
## Pipeline execution preparation
Now we can run funcscan!
As pipelines can sometimes take a long time to run, it is good practice to use a `screen` session, which lets the pipeline run in the background and gives you your terminal back to do other things while waiting.
To create a screen session we can run
```bash
screen -R funcscan-run
```
Next we load our software environment containing Nextflow
```bash
conda activate nf-core
```
And then download the pipeline code
```bash
nextflow pull nf-core/funcscan
```
## Pipeline execution
Now we can construct our pipeline run command!
> ⚠️ Important: do not execute the command until the tutorial says!
First we specify the pipeline name and the version we want to run.
```bash
nextflow run nf-core/funcscan -r 1.0.0 \
```
Then we specify which software environment system to use.
In this case we will use `conda`, as we already have it on our machine for the purposes of this tutorial.
```bash
nextflow run nf-core/funcscan -r 1.0.0 \
-profile conda \
```
Next we specify the input samplesheet we constructed, and where we want the results directory to go.
```bash
nextflow run nf-core/funcscan -r 1.0.0 \
-profile conda \
--input 'samplesheet.csv' \
--outdir './results' \
```
To specify which categories of biomolecules you wish to screen for, we turn on the respective workflows.
In this case, we _don't_ want to screen for biosynthetic gene clusters (BGCs), but we _do_ want to turn on screening for ARGs and AMPs. We do this by specifying the following `--run_*_screening` flags.
```bash
nextflow run nf-core/funcscan -r 1.0.0 \
-profile conda \
--input 'samplesheet.csv' \
--outdir './results' \
--run_amp_screening \
--run_arg_screening \
```
Once we have done this, we can customise which tools are run for each screening category. For example, say we want to skip the [HMMsearch](http://hmmer.org/) modules and [AMPlify](https://github.com/bcgsc/AMPlify) for AMPs, and [deepARG](https://bitbucket.org/gusphdproj/deeparg-ss/src/master/deeparg/) and [RGI](https://github.com/arpcard/rgi) for ARGs. We do this with the `--<category>_skip_<tool>` flags. All other tools in the two screening categories will still run.
```bash
nextflow run nf-core/funcscan -r 1.0.0 \
-profile conda \
--input 'samplesheet.csv' \
--outdir './results' \
--run_amp_screening \
--run_arg_screening \
--amp_skip_hmmsearch \
--amp_skip_amplify \
--arg_skip_deeparg \
--arg_skip_rgi
```
Now hit enter to run the command 🚀!
You should now see progress bars indicating which steps of the pipeline are being executed.
The pipeline will install the software via conda, download any required databases, and screen the contigs listed in the samplesheet with each tool!
If the pipeline is taking a long time (about 37 minutes on a laptop), you can detach from your screen session by pressing `ctrl + a` and then `d` (for detach).
This should return you to your normal terminal prompt.
To re-attach to the screen session, you can type
```bash
screen -r funcscan-run
```
to see the pipeline progress information.
Once the pipeline has completed, you should see a `Pipeline completed successfully!` message.
## Results
<!-- TODO: ~/Downloads/funcscan-run - dir is 1.6G -->
## Clean Up
Once you've explored the results and wish to finish the tutorial you can deactivate the conda environment with:
```bash
conda deactivate
```
You can then exit the screen session by typing:
```bash
exit
```
And delete the directory in which we ran the pipeline.
> ⚠️ If you wish to retain any downloaded databases to re-use them in future pipeline runs, make sure to move these to another location _before_ running the next command!
```bash
rm -r /<path>/<to>/funcscan-run/
```
> ℹ️ Optional: if you also wish to delete the conda environments we made, run `conda env list` to identify the two environments (`amrfinderplus` and `nf-core`), then remove each with `conda env remove -n <name>`.
And you should be done!