# NF-Test :rocket:
## A simple test framework for Nextflow pipelines
### [Official Docs](https://code.askimed.com/nf-test/) 📚

slide-mode: [Here](https://hackmd.io/@Fe-DlBDORpeApqPrKRTlbA/SJVrM7uHh) 🎥

---

## Agenda

* :question: What is NF-Test
* :eye: Overview of current framework using pytest
* :bar_chart: nf-core/ampliseq module tests
* :bar_chart: nf-core/ampliseq pipeline `test_single` profile tests
* :books: Additional Resources

---

## What is NF-Test

- Testing framework for Nextflow pipelines, based on Groovy
- Created in 2021 by Lukas Forer and Sebastian Schönherr.

---

# nf-test vs pytest

While nf-test :rocket: and pytest-workflow :snake: (currently popular in nf-core) are both test frameworks for Nextflow pipelines :gear:, they differ in structure, implementation, and features. Let's look at these differences and at why nf-test may offer some advantages over pytest-workflow :mag:.

----

### Structure and implementation:

| nf-test | pytest |
| -------- | -------- |
| Built on Groovy | Built on Python |
| Specific to Nextflow | General-purpose testing framework for Python |
| Provides specific assertions for testing Nextflow channels | Provides a range of assertions for testing Python code (or Nextflow indirectly). Pytest assertions are not specific to any data structure or concept like channels in Nextflow |

---

### Current Testing Framework
### NF-Core Modules

----

**NF-Core Modules pytest-workflow** :snake:

- Test cases are organized in separate test workflow files, like `main.nf` :scroll:

```groovy
include { BCFTOOLS_SORT } from '../../../../../modules/nf-core/bcftools/sort/main.nf'

workflow test_bcftools_sort {

    input = [
        [ id:'test' ], // meta map
        file(params.test_data['sarscov2']['illumina']['test_vcf'], checkIfExists: true)
    ]

    BCFTOOLS_SORT ( input )
}
```

----

- Additional tests are added through aliased imports

```groovy
include { BCFTOOLS_REHEADER } from '../../../../../modules/nf-core/bcftools/reheader/main.nf'
include { BCFTOOLS_REHEADER as BCFTOOLS_REHEADER_VCF_GZ} from '../../../../../modules/nf-core/bcftools/reheader/main.nf'
include { BCFTOOLS_REHEADER as BCFTOOLS_REHEADER_BCF_GZ} from '../../../../../modules/nf-core/bcftools/reheader/main.nf'
```

----

- Additional parameters are specified in the `nextflow.config` file :gear:

```groovy
process {
    publishDir = { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }
}
```

----

- Test outputs and appropriate checks are listed in the `test.yml` file :clipboard:

```
- name: bcftools sort test_bcftools_sort
  command: nextflow run ./tests/modules/nf-core/bcftools/sort -entry test_bcftools_sort -c ./tests/config/nextflow.config -c ./tests/modules/nf-core/bcftools/sort/nextflow.config
  tags:
    - bcftools
    - bcftools/sort
  files:
    - path: output/bcftools/test.vcf.gz
      md5sum: 4f24467157f5c7a3b336481acf0c8a65
    - path: output/bcftools/versions.yml
```

---

### Current Testing Framework
### NF-Core/Ampliseq Pipeline

----

**NF-Core Pipelines - Github CI Testing Framework**

```yaml
- name: Run pipeline with ${{ matrix.profile }} test profile
  run: |
    nextflow run ${GITHUB_WORKSPACE} -profile ${{ matrix.profile }},docker --outdir ./results

profile:
  [
    test_multi,
    test_pacbio_its,
    test_doubleprimers,
    test_iontorrent,
    test_single,
    test_fasta,
    test_reftaxcustom,
    test_novaseq,
  ]
```

----

### Current Testing Framework Limitations

🔵 Running tests locally can be challenging for newcomers: it requires understanding Python, interpreting unconventional outputs, and handling container virtualization technologies.

```
PROFILE=docker pytest --tag fastqc_single_end --symlink --keep-workflow-wd
```

----

⚠️ md5sums have to be created and maintained manually. Even though this process can be scripted, it takes considerable effort, so checksums tend to be neglected and to decay over time.

🔒 Writing custom test logic is complex and requires advanced expertise, which can be a significant barrier for beginners.

---

## **NF-Test** :rocket:

- Test cases can be organized as a suite of tests within a single test file :file_folder:
- Each test has a name, a `when` closure, and a `then` closure :bulb:
- These closures are used to describe the expected behavior of the process :mag_right:

----

- nf-test provides the `generate` :wrench: command, which creates skeleton test code for Nextflow processes or workflows. This command automatically fills in the name :label:, script :scroll:, and process :gear: of the test case and creates a skeleton for the first test method :bulb:.

----

Example - `bcftools/sort/main.nf.test`

```groovy
nextflow_process {

    name "Test Process BCFTOOLS_SORT"
    script "modules/nf-core/bcftools/sort/main.nf"
    process "BCFTOOLS_SORT"
    config "tests/modules/nf-core/bcftools/sort/nextflow.config"
    tag "bcftools_sort"

    test("Sarscov2 Illumina VCF") {

        when {
            params {
                outdir = "$outputDir"
            }
            process {
                """
                input[0] = [
                    [ id:'test' ], // meta map
                    file(params.test_data['sarscov2']['illumina']['test_vcf'], checkIfExists: true)
                ]
                """
            }
        }

        then {
            assertAll(
                { assert process.success },
                { assert snapshot(path(params.outdir).list()).match() }
            )
        }
    }
}
```

---

### Writing Assertions

----

### Power Assertions:

* Writing test cases involves making assumptions using assertions :writing_hand:.
* Groovy's power assert :muscle: offers detailed output when the boolean expression evaluates to false :x:.
* nf-test simplifies this process with several extensions and commands tailored for Nextflow channels :zap:.

----

1. **`with`** :point_right: Allows you to assert the contents of an item in a channel by index.
2. **`contains`** :mag: Lets you assert that an item is present anywhere in the channel.
3. **`assertContainsInAnyOrder`** Allows you to make assertions about the contents of a channel without considering order.
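
----

A minimal sketch of how these three helpers might look inside the `then` block of a test. It is a fragment, not a complete test file. The channel name `out_ch` and the string values are hypothetical, chosen only for illustration: they assume a process whose output channel emits three plain strings, with `"alpha"` first.

```groovy
then {
    assert process.success

    // with: scope the assertions to one channel and address items by index
    // (`out_ch` and its values are assumptions made for this sketch)
    with(process.out.out_ch) {
        assert size() == 3
        assert get(0) == "alpha"
    }

    // contains: the item may sit at any position in the channel
    assert process.out.out_ch.contains("beta")

    // assertContainsInAnyOrder: compare the whole channel, ignoring order
    assertContainsInAnyOrder(process.out.out_ch, ["gamma", "alpha", "beta"])
}
```

Index-based checks with `with` only make sense when the emission order is stable; the other two forms are the safer choice when it is not.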

---

### Files

* md5 Checksum

```groovy
assert path(process.out.out_ch.get(0)).md5 == "64debea5017a035ddc67c0b51fa84b16"
```

* JSON Files

```groovy
assert path(process.out.out_ch.get(0)).json == path('./some.json').json
assert path(process.out.out_ch.get(0)).json.key == "value"
```

----

* GZip Files

```groovy
assert path(process.out.out_ch.get(0)).linesGzip.contains("Line Content")
```

* Filter lines

```groovy
def lines = path(process.out.gzip.get(0)).linesGzip[0..5]
assert lines.size() == 6
```

----

* Grep lines

```groovy
def lines = path(process.out.gzip.get(0)).grepLinesGzip(0,5)
assert lines.size() == 6
```

---

### Snapshots

Snapshots :camera: are handy when you want to ensure your output channels or output files don't change unexpectedly :exclamation:. A typical snapshot test case :clipboard: creates a snapshot of the output channels or other objects, and then compares it to a reference snapshot file stored with the test (`*.nf.test.snap`) :file_folder:.

----

The test will fail :x: if the two snapshots don't match. This indicates either an unexpected change, or that the reference snapshot needs updating to reflect the new output of a process, workflow, pipeline, or function :gear:.

----

The `snapshot` keyword creates a snapshot of the object, and its `match` method can then be used to check whether it contains the expected data from the snap file.

```groovy
assert snapshot(process.out).match()
```

----

```groovy
assert snapshot(path("$outputDir/fastqc/test_fastqc.html")).match()
```

```
{
    "Should run without failures": {
        "content": [
            "test_fastqc.html:md5,8455303cf8e02b1083e813bb2a35d99e"
        ],
        "timestamp": "2023-05-22T23:22:36+0000"
    }
}
```

----

Example of Snapshot Failure

```bash
Test Process FASTQC

  Test [91188cdf] 'Sarscov2 Illumina Fastq'
  java.lang.RuntimeException: Different Snapshot:
  Found:
  [
      "test_fastqc.html:md5,ec30463244068beb09abee14c95c55d4"
  ]
  Expected:
  [
      "test_fastqc.html:md5,9759fe2ebad67decc1be1d80c8c0954d"
  ]
  FAILED (16.021s)

  Assertion failed:

  1 of 2 assertions failed
```

----

#### Updating Snapshots

When a snapshot test fails because of an intentional implementation change, you can use the `--update-snapshot` flag to re-generate snapshots for all failed tests.

```bash
nf-test test tests/main.nf.test --update-snapshot
```

---

### (Gitpod - Exercise)
### NF-Test Generation for NF-Core/Ampliseq

---

We will be generating tests for the modules and the pipeline of the **nf-core/ampliseq** workflow

1. **[Open nf-core/ampliseq in Gitpod](https://gitpod.io/new#https://github.com/nf-core/ampliseq) (nf-test installed)**

(OR)

2. Use your own local ampliseq repo (master branch) and install **nf-test** as shown below:

```bash
curl -fsSL https://code.askimed.com/install/nf-test | bash
# or
wget -qO- https://code.askimed.com/install/nf-test | bash
```

----

## nf-test init

The init command sets up nf-test in the current directory.

```bash
nf-test init
```

The init command creates the following files: `nf-test.config` and `tests/nextflow.config`. It also creates a folder `tests/` which is the home directory of your test code.

----

`nf-test.config` will come with config settings as below:

```groovy
config {
    testsDir "tests"
    workDir ".nf-test"
    configFile "tests/nextflow.config"
    profile ""
}
```

Change `configFile` from `tests/nextflow.config` -> `nextflow.config` to use the pipeline's main config

---

## NF-Tests for Modules (Fastqc)

```bash
nf-test generate process modules/nf-core/fastqc/main.nf
```

```bash
🚀 nf-test 0.7.3
https://code.askimed.com/nf-test
(c) 2021 - 2023 Lukas Forer and Sebastian Schoenherr

Load source file '/workspace/ampliseq/modules/nf-core/fastqc/main.nf'
Wrote process test file '/workspace/ampliseq/tests/modules/nf-core/fastqc/main.nf.test

SUCCESS: Generated 1 test files.
```

----

`tests/modules/nf-core/fastqc/main.nf.test`

```groovy
nextflow_process {

    name "Test Process FASTQC"
    script "modules/nf-core/fastqc/main.nf"
    process "FASTQC"

    test("Should run without failures") {

        when {
            params {
                // define parameters here. Example:
                // outdir = "tests/results"
            }
            process {
                """
                // define inputs of the process here. Example:
                // input[0] = file("test-file.txt")
                """
            }
        }

        then {
            assert process.success
            assert snapshot(process.out).match()
        }

    }

}
```

----

- **`name "Test Process FASTQC"`** :label: This is the name of the test suite.
- **`script "modules/nf-core/fastqc/main.nf"`** :scroll: Here you specify the script you are testing.
- **`process "FASTQC"`** :gear: Here you indicate the process you are testing.
- :clipboard: **`test("Should run without failures")`** : This is the main test case and its description.

----

- **`when`** :clock1: block: This is where you set up your test.
- **`params`** :memo: This is where you define parameters for your test. For example, you could define an output directory here.
- **`process`** :computer: This is where you define the inputs for the process you are testing. **Inputs are provided by index position (`input[0]`, `input[1]`)**.

----

- **`then`** :point_right: block: This is where you assert the expected outcomes of your test.
- **`assert process.success`** :heavy_check_mark: Here you assert that the process should be successful.
- **`assert snapshot(process.out).match()`** :camera: Here you assert that the output of the process matches the snapshot stored alongside your test file.

----

To the test file:

1. Add `profile "docker"` under `process`
2. Set the `outdir` param in the `when` block
3. Provide the following test input for the fastqc module
4. Provide the assertions below in the `then` block:

```groovy
nextflow_process {

    name "Test Process FASTQC"
    script "modules/nf-core/fastqc/main.nf"
    process "FASTQC"
    profile "docker"

    test("Sarscov2 Illumina Fastq") {

        when {
            params {
                outdir = "$outputDir"
            }
            process {
                """
                input[0] = [
                    [ id: 'test', single_end:true ],
                    [
                        file("https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/sarscov2/illumina/fastq/test_1.fastq.gz", checkIfExists: true)
                    ]
                ]
                """
            }
        }

        then {
            assertAll(
                { assert process.success },
                { assert snapshot(path("$outputDir/fastqc/test_fastqc.html")).match() }
            )
        }
    }
}
```

----

Testing the module

```bash
nf-test test tests/modules/nf-core/fastqc/main.nf.test
```

- The first time the test is run, the snapshot file is created.
- Re-run the test to check that the snapshots match

----

output:

```bash
🚀 nf-test 0.7.3
https://code.askimed.com/nf-test
(c) 2021 - 2023 Lukas Forer and Sebastian Schoenherr

Test Process FASTQC

  Test [91188cdf] 'Sarscov2 Illumina Fastq' PASSED (19.82s)

SUCCESS: Executed 1 tests in 19.84s
```

---

## NF-Tests for Pipeline (Single-End)

---

```bash
nf-test generate pipeline main.nf
```

```bash
🚀 nf-test 0.7.3
https://code.askimed.com/nf-test
(c) 2021 - 2023 Lukas Forer and Sebastian Schoenherr

Load source file '/workspace/ampliseq/main.nf'
Wrote pipeline test file '/workspace/ampliseq/tests/main.nf.test

SUCCESS: Generated 1 test files.
```

----

`tests/main.nf.test`

```groovy
nextflow_pipeline {

    name "Test Workflow main.nf"
    script "main.nf"

    test("Should run without failures") {

        when {
            params {
                // define parameters here. Example:
                // outdir = "tests/results"
            }
        }

        then {
            assert workflow.success
        }

    }

}
```

----

To the test file:

1. Add `profile "test_single,docker"` under `script`
2. Set the `outdir` param in the `when` block
3. Assert the default `workflow.success` or the following

```groovy
nextflow_pipeline {

    name "Test Workflow main.nf"
    script "main.nf"
    profile "test_single,docker"
    tag "pipeline"

    test("Illumina SE") {

        when {
            params {
                outdir = "$outputDir"
            }
        }

        then {
            assertAll(
                { assert workflow.success },
                { assert snapshot(path("$outputDir/pipeline_info/software_versions.yml")).match("software_versions") },
                { assert snapshot(path("$outputDir/overall_summary.tsv")).match("overall_summary_tsv") },
                { assert snapshot(path("$outputDir/barrnap/rrna.arc.gff"),
                                  path("$outputDir/barrnap/rrna.bac.gff"),
                                  path("$outputDir/barrnap/rrna.euk.gff"),
                                  path("$outputDir/barrnap/rrna.mito.gff")).match("barrnap") },
                { assert new File("$outputDir/barrnap/summary.tsv").exists() },
                { assert snapshot(path("$outputDir/cutadapt/cutadapt_summary.tsv")).match("cutadapt") },
                { assert new File("$outputDir/cutadapt/1a_S103_L001_R1_001.trimmed.cutadapt.log").exists() },
                { assert new File("$outputDir/cutadapt/1_S103_L001_R1_001.trimmed.cutadapt.log").exists() },
                { assert new File("$outputDir/cutadapt/2a_S115_L001_R1_001.trimmed.cutadapt.log").exists() },
                { assert new File("$outputDir/cutadapt/2_S115_L001_R1_001.trimmed.cutadapt.log").exists() },
                { assert new File("$outputDir/cutadapt/assignTaxonomy.cutadapt.log").exists() },
                { assert snapshot(path("$outputDir/dada2/ASV_seqs.fasta"),
                                  path("$outputDir/dada2/ASV_table.tsv"),
                                  path("$outputDir/dada2/ref_taxonomy.txt"),
                                  path("$outputDir/dada2/DADA2_stats.tsv"),
                                  path("$outputDir/dada2/DADA2_table.rds"),
                                  path("$outputDir/dada2/DADA2_table.tsv")).match("dada2") },
                { assert new File("$outputDir/dada2/ASV_tax.tsv").exists() },
                { assert new File("$outputDir/dada2/ASV_tax_species.tsv").exists() },
                { assert new File("$outputDir/fastqc/1a_S103_L001_R1_001_fastqc.html").exists() },
                { assert new File("$outputDir/fastqc/1_S103_L001_R1_001_fastqc.html").exists() },
                { assert new File("$outputDir/fastqc/2a_S115_L001_R1_001_fastqc.html").exists() },
                { assert new File("$outputDir/fastqc/2_S115_L001_R1_001_fastqc.html").exists() },
                { assert snapshot(path("$outputDir/input/Samplesheet_single_end.tsv")).match("input") },
                { assert snapshot(path("$outputDir/multiqc/multiqc_data/multiqc_fastqc.txt"),
                                  path("$outputDir/multiqc/multiqc_data/multiqc_general_stats.txt"),
                                  path("$outputDir/multiqc/multiqc_data/multiqc_cutadapt.txt")).match("multiqc") }
            )
        }
    }
}
```

----

- Run the test

```bash
nf-test test tests/main.nf.test
```

- Re-run the test to ensure the snapshots match

```bash
nf-test test tests/main.nf.test
```

----

```bash
🚀 nf-test 0.7.3
https://code.askimed.com/nf-test
(c) 2021 - 2023 Lukas Forer and Sebastian Schoenherr

Test Workflow main.nf

  Test [68e76581] 'Illumina SE' PASSED (261.213s)

SUCCESS: Executed 1 tests in 261.287s
```

---

### Run All Tests at once :boom:

```bash
nf-test test
```

----

```bash
🚀 nf-test 0.7.3
https://code.askimed.com/nf-test
(c) 2021 - 2023 Lukas Forer and Sebastian Schoenherr

Found 1 files in test directory.

Test Process FASTQC

  Test [91188cdf] 'Sarscov2 Illumina Fastq' PASSED (19.231s)

Test Workflow main.nf

  Test [68e76581] 'Illumina SE' PASSED (149.098s)

SUCCESS: Executed 2 tests in 168.458s
```

---

### Assertion order recommendations `* still evolving`

* Use **[assertAll()](https://code.askimed.com/nf-test/assertions/assertions/#using-assertall)** to check that none of the supplied closures throws an exception; failed closures are reported together
* Use the test dataset details in the name of the test as applicable (e.g. `Sarscov2 Illumina PE Sorted BAM`)
* The test should at least contain a `process.success` assertion

----

* Snapshot all output including `versions.yml`
* If that is not possible (as in cases of timestamps or execution folder paths in the output), snapshot output files after filtering, or grep lines that don't change
* If a snapshot of the test output is not possible, fall back to `new File(...).exists()` checks; however, the snapshot MUST contain the `versions.yml` md5

---

### Types of Assertions

- Assert on output channels

```groovy
assert process.out.versions[0] ==~ ".*/versions.yml"
with(process.out.vcf[0]) {
    assert get(0).id == "test"
    assert get(1) ==~ ".*/test.vcf.gz"
}
```

- Assert on output files

```groovy
assertAll(
    { assert process.success },
    { assert new File("$outputDir/bcftools/test.vcf.gz").exists() },
    { assert new File("$outputDir/bcftools/versions.yml").exists() }
)
```

----

- Snapshot All files in Output Directory

```groovy
assert snapshot(path(params.outdir).list()).match()
```

- Snapshot Selective files in Output Directory

```groovy
assert snapshot(path("$outputDir/bcftools/test.vcf.gz")).match()
assert new File("$outputDir/bcftools/versions.yml").exists()
```

----

- Snapshot Filtered Selective files in Output Directory

```groovy
// use a range (32..41); `32-41` would evaluate to the single index -9
assert snapshot(path("$outputDir/bcftools/test.vcf.gz").linesGzip[32..41]).match()
```

- Custom filtering of elements in snapshot

```groovy
def softwareVersions = path("$outputDir/pipeline_info/software_versions.yml").yaml
if (softwareVersions.containsKey("Workflow")) {
    softwareVersions.Workflow.remove("Nextflow")
}
assert snapshot(softwareVersions).match("software_versions")
```

---

# :100: :muscle: :tada:

---

### Wrap up

- NF-Test is quite new, exciting and rapidly evolving
- Check out the `#nf-test` channel in nf-core Slack for further discussions

----

### Resources

* [NF-Test Official Docs](https://code.askimed.com/nf-test/)
* [NF-Test Bytesize Talk By Edmund Miller](https://nf-co.re/events/2022/bytesize_nftest)
* [NF-Test Examples](https://github.com/sateeshperi/nf-test/tree/main/test-data)
* [NF-Core NF-Test Migration Notes](https://hackmd.io/@Fe-DlBDORpeApqPrKRTlbA/r1M6VpFbn)

---

### Thank you!

You can find me on

- [@GitHub](https://github.com/sateeshperi)
- Nextflow & Nf-Core Slack Workspaces