<!-- .slide: data-background="https://raw.githubusercontent.com/maxulysse/maxulysse.github.io/main/assets/img/svg/green_white_bg.svg" -->
<a href="https://www.nf-co.re"><img src="https://raw.githubusercontent.com/nf-core/logos/master/byte-size-logos/bytesize-darkbg.svg" width="65%"></a>
<br>
## Custom scripts
<img src="https://openmoji.org/data/color/svg/1F431-200D-1F4BB.svg" width=50> <font size="6">Chris Hakkaart</font> | <img src="https://openmoji.org/data/color/svg/E040.svg" width=50> <font size="6">@chris_hakk</font> | <img src="https://openmoji.org/data/color/svg/E045.svg" width=50> <font size="6">christopher-hakkaart</font>
<img src="https://github.com/seqeralabs/logos/blob/master/seqera-logo-white.png?raw=true" width=300>
---
### Custom scripts
1. Background
2. `myfirstpipeline.nf`
3. How to use a `bin/` directory
4. How to use a `templates/` directory
5. Managing dependencies
6. Summary
---
### Background
- Real-world pipelines often use custom scripts written in different languages (Bash, R, Python, and others).
- With Nextflow you can integrate any scripting language into a workflow by adding the corresponding shebang to code blocks.
- You can avoid keeping large code blocks in your main workflow by executing them as custom scripts.
---
<!-- .slide: style="font-size: 24px;" -->
## `myfirstpipeline.nf`
```bash
process MYSCRIPT {
    input:
    val STR
    output:
    stdout
    script:
    """
    echo $STR | tr '[a-z]' '[A-Z]'
    """
}
workflow {
    Channel.of('this', 'that', 'other') | MYSCRIPT | view
}
```
---
<!-- .slide: style="font-size: 24px;" -->
## `myfirstpipeline.nf`
```bash
process MYSCRIPT {
    input:
    val STR
    output:
    stdout
    script:
    """
    #!/usr/bin/env Rscript
    cat(toupper("$STR"))
    """
}
workflow {
    Channel.of('this', 'that', 'other') | MYSCRIPT | view
}
```
---
<!-- .slide: style="font-size: 24px;" -->
## `myfirstpipeline.nf`

---
<!-- .slide: style="font-size: 24px;" -->
### `/full/path/to/myfirstscript.r`
```R
#!/usr/bin/env Rscript
args = commandArgs(trailingOnly=TRUE)
cat(toupper(args[1]))
```
### `/full/path/to/myfirstpipeline.nf`
```bash
process MYSCRIPT {
    input:
    val STR
    output:
    stdout
    script:
    """
    /full/path/to/myfirstscript.r ${STR}
    """
}
workflow {
    Channel.of('this', 'that', 'other') | MYSCRIPT | view
}
```
Don't forget to make your script executable with `chmod +x myfirstscript.r`.
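---
<!-- .slide: style="font-size: 24px;" -->
### Checking the executable bit
The effect of `chmod +x` can be checked outside Nextflow. This is a minimal sketch, assuming a POSIX shell and a hypothetical stand-in script `myfirstscript.sh` (a shell equivalent of the R script, so no R installation is needed):

```bash
# Work in a throwaway directory (hypothetical location, for illustration only).
workdir=$(mktemp -d)

# Stand-in for myfirstscript.r: uppercase the first argument.
cat > "$workdir/myfirstscript.sh" <<'EOF'
#!/bin/sh
printf '%s' "$1" | tr 'a-z' 'A-Z'
EOF

# Without the executable bit, calling the script by its path fails.
if "$workdir/myfirstscript.sh" this 2>/dev/null; then
    before="ran"
else
    before="permission denied"
fi

# After chmod +x, the shebang line selects the interpreter and the call works.
chmod +x "$workdir/myfirstscript.sh"
after=$("$workdir/myfirstscript.sh" this)

echo "before chmod: $before"
echo "after chmod:  $after"
```

The same applies to the R script: its `#!/usr/bin/env Rscript` line is only used once the file is executable.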
---
<!-- .slide: style="font-size: 24px;" -->
## How to use a `bin/` directory

---
<!-- .slide: style="font-size: 24px;" -->
## How to use a `bin/` directory
### `bin/myfirstscript.r`
```R
#!/usr/bin/env Rscript
args = commandArgs(trailingOnly=TRUE)
cat(toupper(args[1]))
```
### `myfirstpipeline.nf`
```bash
process MYSCRIPT {
    input:
    val STR
    output:
    stdout
    script:
    """
    myfirstscript.r ${STR}
    """
}
workflow {
    Channel.of('this', 'that', 'other') | MYSCRIPT | view
}
```
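---
<!-- .slide: style="font-size: 24px;" -->
## How to use a `bin/` directory
Nextflow adds the pipeline's `bin/` directory to the `PATH` of each task, which is why the script can be called by name alone. Below is a rough shell sketch of that lookup, assuming a hypothetical project directory and a shell stand-in (`myfirstscript.sh`) for the R script:

```bash
# Hypothetical project layout: <project>/bin/myfirstscript.sh
project=$(mktemp -d)
mkdir "$project/bin"

# Stand-in for myfirstscript.r: uppercase the first argument.
cat > "$project/bin/myfirstscript.sh" <<'EOF'
#!/bin/sh
printf '%s' "$1" | tr 'a-z' 'A-Z'
EOF
chmod +x "$project/bin/myfirstscript.sh"

# Prepend bin/ to PATH, mimicking what Nextflow does for each task.
PATH="$project/bin:$PATH"

# The script now resolves by name, with no path prefix.
resolved=$(command -v myfirstscript.sh)
result=$(myfirstscript.sh that)

echo "$resolved"
echo "$result"
```

Scripts in `bin/` still need the executable bit and a shebang line, exactly as with a full path.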
---
<!-- .slide: style="font-size: 24px;" -->
## How to use a `templates/` directory

---
<!-- .slide: style="font-size: 24px;" -->
## How to use a `templates/` directory
### `templates/myfirstscript.r`
```R
#!/usr/bin/env Rscript
cat(toupper("$STR"))
```
### `myfirstpipeline.nf`
```bash
process MYSCRIPT {
    input:
    val STR
    output:
    stdout
    script:
    template 'myfirstscript.r'
}
workflow {
    Channel.of('this', 'that', 'other') | MYSCRIPT | view
}
```
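---
<!-- .slide: style="font-size: 24px;" -->
## How to use a `templates/` directory
In a template, Nextflow substitutes `$STR` before the script runs, so the interpreter only ever sees the final value, and any `$` meant for the script itself has to be escaped as `\$`. An unquoted shell here-document behaves in a loosely similar way and can serve as an analogy (this is not how Nextflow is implemented, and the rendered commands are illustrative only):

```bash
STR='other'

# Unquoted here-document: $STR is expanded when the text is rendered,
# while \$HOME is escaped and survives as a literal $HOME.
rendered=$(cat <<EOF
echo "$STR" | tr 'a-z' 'A-Z'
echo "\$HOME"
EOF
)

# Show the text that the interpreter would receive.
echo "$rendered"
```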
---
## Managing dependencies
- Dependencies for custom scripts are managed the same way as for other tools.
- A custom script can require one or more tools or packages.
- Multiple tools can be combined in a mulled container.
- Helper tools and documentation are available:
  - [`nf-core modules mulled`](https://nf-co.re/tools/#generate-the-name-for-a-multi-tool-container-image)
  - [`multi-package-containers`](https://github.com/BioContainers/multi-package-containers/blob/master/combinations/hash.tsv)
---
## [`modules/local/salmon_summarizedexperiment.nf`](https://github.com/nf-core/rnaseq/blob/master/modules/local/salmon_summarizedexperiment.nf)
<!-- .slide: style="font-size: 24px;" -->
```R
process SALMON_SUMMARIZEDEXPERIMENT {
    tag "$tx2gene"
    label "process_medium"
    conda (params.enable_conda ? "bioconda::bioconductor-summarizedexperiment=1.20.0" : null)
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/bioconductor-summarizedexperiment:1.20.0--r40_0' :
        'quay.io/biocontainers/bioconductor-summarizedexperiment:1.20.0--r40_0' }"
    input:
    path counts
    path tpm
    path tx2gene
    output:
    path "*.rds"       , emit: rds
    path "versions.yml", emit: versions
    when:
    task.ext.when == null || task.ext.when
    script: // This script is bundled with the pipeline, in nf-core/rnaseq/bin/
    """
    salmon_summarizedexperiment.r \\
        NULL \\
        $counts \\
        $tpm
    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        r-base: \$(echo \$(R --version 2>&1) | sed 's/^.*R version //; s/ .*\$//')
        bioconductor-summarizedexperiment: \$(Rscript -e "library(SummarizedExperiment); cat(as.character(packageVersion('SummarizedExperiment')))")
    END_VERSIONS
    """
}
```
---
<!-- .slide: style="font-size: 24px;" -->
## [`modules/local/deseq2_qc.nf`](https://github.com/nf-core/rnaseq/blob/master/modules/local/deseq2_qc.nf)
```R
process DESEQ2_QC {
    label "process_medium"
    // (Bio)conda packages have intentionally not been pinned to a specific version
    // This was to avoid the pipeline failing due to package conflicts whilst creating the environment when using -profile conda
    conda (params.enable_conda ? "conda-forge::r-base bioconda::bioconductor-deseq2 bioconda::bioconductor-biocparallel bioconda::bioconductor-tximport bioconda::bioconductor-complexheatmap conda-forge::r-optparse conda-forge::r-ggplot2 conda-forge::r-rcolorbrewer conda-forge::r-pheatmap" : null)
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/mulled-v2-8849acf39a43cdd6c839a369a74c0adc823e2f91:ab110436faf952a33575c64dd74615a84011450b-0' :
        'quay.io/biocontainers/mulled-v2-8849acf39a43cdd6c839a369a74c0adc823e2f91:ab110436faf952a33575c64dd74615a84011450b-0' }"
    input:
    path counts
    path pca_header_multiqc
    path clustering_header_multiqc
    output:
    path "*.pdf"                , optional:true, emit: pdf
    path "*.RData"              , optional:true, emit: rdata
    path "*pca.vals.txt"        , optional:true, emit: pca_txt
    path "*pca.vals_mqc.tsv"    , optional:true, emit: pca_multiqc
    path "*sample.dists.txt"    , optional:true, emit: dists_txt
    path "*sample.dists_mqc.tsv", optional:true, emit: dists_multiqc
    path "*.log"                , optional:true, emit: log
    path "size_factors"         , optional:true, emit: size_factors
    path "versions.yml"         , emit: versions
    when:
    task.ext.when == null || task.ext.when
    script:
    def args = task.ext.args ?: ''
    def args2 = task.ext.args2 ?: ''
    def label_lower = args2.toLowerCase()
    def label_upper = args2.toUpperCase()
    """
    deseq2_qc.r \\
        --count_file $counts \\
        --outdir ./ \\
        --cores $task.cpus \\
        $args
    if [ -f "R_sessionInfo.log" ]; then
        sed "s/deseq2_pca/${label_lower}_deseq2_pca/g" <$pca_header_multiqc >tmp.txt
        sed -i -e "s/DESeq2 PCA/${label_upper} DESeq2 PCA/g" tmp.txt
        cat tmp.txt *.pca.vals.txt > ${label_lower}.pca.vals_mqc.tsv
        sed "s/deseq2_clustering/${label_lower}_deseq2_clustering/g" <$clustering_header_multiqc >tmp.txt
        sed -i -e "s/DESeq2 sample/${label_upper} DESeq2 sample/g" tmp.txt
        cat tmp.txt *.sample.dists.txt > ${label_lower}.sample.dists_mqc.tsv
    fi
    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        r-base: \$(echo \$(R --version 2>&1) | sed 's/^.*R version //; s/ .*\$//')
        bioconductor-deseq2: \$(Rscript -e "library(DESeq2); cat(as.character(packageVersion('DESeq2')))")
    END_VERSIONS
    """
}
```
---
## Summary
- Nextflow can use custom scripts written in many different languages.
- Custom scripts can be stored in the `bin/` or the `templates/` directory.
- Dependencies can be managed using conda and containers.
---
