---
title: DSL2
tags: march-2021, nf-core, hackathon, talk
description: View the slide with "Slide Mode".
---
{%hackmd theme-dark %}
![](https://i.imgur.com/p9mCNxa.png)
# Hackathon March 2021
---
# DSL2
<!-- TODO optional: add images, gifs etc. -->
---
# Primary focus
<!-- TODO: State the goal of your group -->
1. Add new modules
2. Add tests to existing modules
3. Add overhead memory requirements
---
# Group members
<!-- TODO: add all group members -->
- Harshil Patel
- Kevin Menden
- Friederike Hanssen
- Gregor Sturm
- Jose Espinosa
- Michael Heuer
- Francesco Lescai
- Robert Petit
- Mark S. Hill
- Carlos Ruiz
- Batool Almarzouq
- Santiago Revale
- Aron Skaftason
- Hédia Tnani
- Anthony Fullam
- Suzanne Jin
- Ravneet Bhuller
- Kevin Brick
- Pernilla Ericsson
- Alex Peltzer
- Maxime Borry
- Yuk Kei Wan
- Nick Toda
- Edmund Miller
---
# New modules - Achievements
## Update test files for all modules
- New test_data.config
- Requires ALL modules to be updated
- Everyone involved
## New module sequenzautils/bam2seqz
- opened a new issue
- work in progress
## New module sequenzautils/gcwiggle
- PR submitted https://github.com/nf-core/modules/pull/345 on Tuesday
- all tests passed!
- work complete!
- reviewing complete! Many thanks to Harshil for the review!
- merged, see review: https://github.com/nf-core/modules/pull/345#pullrequestreview-619327844
- PR closed on Wednesday!
## New module cnvkit
- merged into nf-core/modules https://github.com/nf-core/modules/pull/173
- many thanks to all the reviewers!
- PR closed on Tuesday!
## New module picard/collectwgsmetrics
- Answered issue https://github.com/nf-core/modules/issues/264
- Most of the heavy-lifting had already been done by [hpatel](@YO3oG0hdSw2Gv5IWHc0nfA)
- PR submitted: https://github.com/nf-core/modules/pull/304
- Currently failing 1 EditorConfig lint test
- Info: md5sum checks can't be used for this picard module with pytest because the output contains a timestamp. Used `contains` checks instead.
- Info: pytest still complains about the exit code locally, even though running the nextflow command on its own seems to work fine
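
The timestamp problem noted above can be illustrated outside of Nextflow. This is a minimal Python sketch, not the module's actual test: `fake_metrics` is a hypothetical stand-in for a tool that writes its start time into the output header, and the final check mimics a content-based assertion used in place of a checksum.

```python
import hashlib


def fake_metrics(started_on: str) -> str:
    """Hypothetical stand-in for a tool that embeds a timestamp in its output."""
    return (
        f"# Started on: {started_on}\n"
        "GENOME_TERRITORY\tMEAN_COVERAGE\n"
        "1000\t30.5\n"
    )


def md5_of(text: str) -> str:
    """Return the hex MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


run_a = fake_metrics("Tue Mar 23 10:00:00 2021")
run_b = fake_metrics("Wed Mar 24 09:30:00 2021")

# The checksums differ although the metrics themselves are identical
checksums_match = md5_of(run_a) == md5_of(run_b)

# A content check ignores the timestamp line and is stable between runs
expected = "GENOME_TERRITORY\tMEAN_COVERAGE"
content_matches = expected in run_a and expected in run_b

print(checksums_match, content_matches)
```

The trade-off is that a content check only guards the strings it looks for, so it is weaker than a checksum on a fully deterministic file.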
## New module prokka
- Adds module for [Prokka](https://github.com/tseemann/prokka)
- Submitted by Robert (@rpetit3)
- Answered Issue https://github.com/nf-core/modules/issues/288
- PR Merged https://github.com/nf-core/modules/pull/298
- All the CI tests are passing
- First nf-core module PR by [rpetit3](@d6tmnqZgSaGbC-FcqZEFkg), so feedback appreciated
- Some outputs contain timestamps, so MD5s could not be used for them
- Thanks for the reviews!
## Update module shovill
- Updates the deprecated module for [Shovill](https://github.com/tseemann/shovill)
- Submitted by Robert (@rpetit3)
- Answered Issue https://github.com/nf-core/modules/issues/329
- PR Submitted https://github.com/nf-core/modules/pull/337
- Included test for each assembler
- Had to use larger test data from [nf-core/test-datasets](https://github.com/nf-core/test-datasets/tree/bacass)
- Tests failing due to memory issue with shovill_spades: https://github.com/nf-core/modules/pull/337#issuecomment-805159617
- Set SPAdes to assemble with k=31 to avoid out of memory error
- MD5s are not reproducible for some outputs because of timestamps and changing contig order
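
One possible way to compare assemblies whose contig order changes is to hash the sorted sequences instead of the raw file. The following Python sketch is an illustration only and is not what the shovill tests do. It assumes that contig names may also change with the order, so headers are ignored. `parse_fasta` and `order_independent_md5` are hypothetical helper names.

```python
import hashlib
from typing import List, Tuple


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Return (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def order_independent_md5(text: str) -> str:
    """Hash the sorted sequences so that record order does not affect the digest."""
    sequences = sorted(seq for _, seq in parse_fasta(text))
    return hashlib.md5("\n".join(sequences).encode("utf-8")).hexdigest()


# Two runs that produce the same contigs in a different order
run_a = ">contig00001\nACGT\n>contig00002\nGGCC\n"
run_b = ">contig00001\nGGCC\n>contig00002\nACGT\n"

plain_equal = (
    hashlib.md5(run_a.encode("utf-8")).hexdigest()
    == hashlib.md5(run_b.encode("utf-8")).hexdigest()
)
sorted_equal = order_independent_md5(run_a) == order_independent_md5(run_b)

print(plain_equal, sorted_equal)
```

This only helps when the sequences themselves are identical between runs. It does not address timestamps or assemblies that genuinely differ.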
## New module vcftools
- PR (https://github.com/nf-core/modules/pull/334) merged into master
- Responds to issue https://github.com/nf-core/modules/issues/214
- Notes:
    - Getting something like vcftools to work well as a module is quite awkward, so the solutions provided do not exhaustively cover all use cases. However, they can be applied flexibly, so individual users should be able to tailor them easily to their own requirements.
## New module samtools/merge
- Added samtools merge, which merges multiple sorted bam files into one bam file
- Merged PR: https://github.com/nf-core/modules/pull/296
## New module samtools/fastq
- Added samtools fastq, which converts a BAM file into a FASTQ file
- Merged PR: https://github.com/nf-core/modules/pull/316
- Then modified the module in order to produce compressed fastq.gz files
- Submitted PR: https://github.com/nf-core/modules/pull/339
## New module strelka/germline
- Adapted code from Sarek to make it fit modules
- Merged PR: https://github.com/nf-core/modules/pull/340
## New module ucsc/bed12tobigbed
- A part of the dsl2 modules used in the nanoseq dsl2 conversion (https://github.com/nf-core/nanoseq/issues/108)
- Submitted PR: https://github.com/nf-core/modules/pull/302
- Previous PR had conflicts
## New module freebayes/single
- Created the software
- Running the tests, error encountered
## New module fgbio/fastqtobam
- Added module FastqToBam from FGBIO toolkit, to be used as part of workflow to handle UMIs
- Responds to issue https://github.com/nf-core/modules/issues/189
- PR submitted as https://github.com/nf-core/modules/pull/306
- Tuesday update:
    - spent quite some time fixing a strange change in the pytest yaml and a mismatch with upstream (had to revert changes, rebase, check out a specific file, and add the missing test manually)
    - pytests are failing with different messages:
        - docker: complains that a file that should be there is missing. Running the test locally (where the pytest error message is not transparent) suggests there might be a problem with the fgbio docker image in biocontainers
        - conda: test runs just fine locally, but in the GitHub CI pytest complains that the md5sum does not match the expected value
        - singularity: test complains that the md5sum does not match the expected value (same as with conda)
- Wednesday update:
    - after further testing (all local nextflow runs worked, while pytest results varied depending on the environment) I have removed the md5sum for the output bam file
    - now both conda and singularity CI tests pass
    - further conflicts resolved
    - docker remains an issue, likely because the biocontainer runs into a library problem: this might require modifying the original container and therefore cannot be resolved within the timeframe of the hackathon.
## New module adam/markduplicates
- Ran into an issue where ``-u $(id -u):$(id -g)`` specified in `docker.runOptions` caused tests to fail; see discussion on Slack and e.g. https://github.com/nf-core/tools/pull/351#issuecomment-581320133
- PR: https://github.com/nf-core/modules/pull/308
- See also https://github.com/nf-core/modules/pull/315
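
For readers unfamiliar with the option, ``-u $(id -u):$(id -g)`` asks Docker to run the container process with the numeric user and group ID of the calling host user. The Python sketch below shows what the shell expansion produces and where the option ends up in a `docker run` call. It is not how Nextflow assembles its command line: `docker_run_command` and the image name `example/tool:1.0` are hypothetical, and `os.getuid` is only available on Unix-like systems.

```python
import os
import shlex


def docker_user_option() -> str:
    """Return the Python equivalent of the shell snippet -u $(id -u):$(id -g)."""
    return f"-u {os.getuid()}:{os.getgid()}"


def docker_run_command(image: str, command: str, run_options: str = "") -> list:
    """Assemble a `docker run` argument list; run_options may be empty."""
    return [
        "docker", "run", "--rm",
        *shlex.split(run_options),
        image,
        *shlex.split(command),
    ]


# With the option, the container process runs as the host user
with_user = docker_run_command("example/tool:1.0", "tool --version", docker_user_option())

# Without it, the container runs as whichever user the image defines
without_user = docker_run_command("example/tool:1.0", "tool --version")

print(" ".join(with_user))
print(" ".join(without_user))
```

Running as the host user keeps output files owned by that user, but the numeric ID usually has no matching account inside the image, which is one way such an option can make a tool fail.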
## New module unicycler
- Added module unicycler
- Responds to issue https://github.com/nf-core/modules/issues/290
- PR submitted as https://github.com/nf-core/modules/pull/307
- One of the missing modules for viralrecon DSL2 migration, see https://github.com/nf-core/viralrecon/issues/149
## New module: AdapterRemoval
- Added AdapterRemoval module
- Issue: https://github.com/nf-core/modules/issues/286
- PR: https://github.com/nf-core/modules/pull/309
## New module: bismark/align
- added module for alignment with bismark
## New module: bismark/report
- added module for bismark_report
- generates single-sample reports from bismark alignments
## New module: bismark/summary
- added module for bismark_summary report
- generates summary reports over many samples
## New module: allelecounter
- Added allelecounter module (@Anthony)
- Issue: https://github.com/nf-core/modules/issues/185
- PR: https://github.com/nf-core/modules/pull/313
## New module Kallisto/quant
- Adds module for [Kallisto/quant](https://github.com/nf-core/modules/pull/351)
- PR: https://github.com/nf-core/modules/pull/351
- Submitted by Batool (@BatoolMM)
- Answered Issue https://github.com/nf-core/modules/issues/359
- Also, attempted to fix the test path for salmon/index and salmon/quant in this PR https://github.com/nf-core/modules/pull/385/
## New module: DamageProfiler
- Added DamageProfiler module
- Issue: https://github.com/nf-core/modules/issues/285
- Progress here: https://github.com/maxibor/modules/tree/damageprofiler
- Problem with biocontainer (missing libfontconfig1 dependency): need to update bioconda recipe
## New module: prodigal [Gregor]
- added prodigal module for gene prediction in prokaryotes
- PR: https://github.com/nf-core/modules/pull/333
## New module: gatk4/applybqsr
- Submitted by Carlos Ruiz (@yocra3)
- Issue: https://github.com/nf-core/modules/issues/196
- PR: https://github.com/nf-core/modules/pull/331
- Merged with master branch
## New module: gatk4/baserecalibrator
- Submitted by Carlos Ruiz (@yocra3)
- Issue: https://github.com/nf-core/modules/issues/197
- PR: https://github.com/nf-core/modules/pull/327
- Awaiting review
## New module: gatk4/indexfeaturefile
- Issue: https://github.com/nf-core/modules/issues/310
- In progress (@santiagorevale)
## New module: gatk4/fastqtosam
- Added gatk4 module for converting fastq to sam files
- Merged with master branch
## New module: gatk4/genotypegvcfs
- Issue: https://github.com/nf-core/modules/issues/200
- In progress (@santiagorevale)
## New module: gatk4/markduplicates
- Submitted by John Juma (@ajodeh-juma)
- Issue: https://github.com/nf-core/modules/issues/202
- PR: https://github.com/nf-core/modules/pull/356
## New module: ensembl-VEP
- Issue: https://github.com/nf-core/modules/issues/215
- In progress (@HediaTnani)
## New module: msisensor/scan
- Added msisensor/scan module (@kevbrick)
- Issue: https://github.com/nf-core/modules/issues/207
- PR: https://github.com/nf-core/modules/pull/343
- This module was not in the original issue, but msisensor/msi requires its output
- All checks have passed
- Awaiting review
## New module: msisensor/msi
- Added msisensor/msi module (@kevbrick)
- Issue: https://github.com/nf-core/modules/issues/207
- PR: https://github.com/nf-core/modules/pull/343
- All checks have passed
- Awaiting review
## New module: kb/ref
- Submitted by (@flowuenne)
- Issue: https://github.com/nf-core/modules/issues/358
- Module for creating a reference for scrnaseq pipeline processing with kb
- In progress:
    - test data does not seem to work with indexing of kb ref
    - Need human test data. Local test with only chr21 worked. Simply need a small file with a few genes and transcripts and a subset genomic fasta.
    - Will wait for new human test data added by [@Friederike Hanssen ](@KuylkJ0AS0-limIJRUVsnw ) to create the PR
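
Subsetting a genomic FASTA to a single chromosome, as in the chr21 local test mentioned above, can be sketched in a few lines of Python. This is an illustration under assumptions, not the procedure used to build the nf-core test data: `subset_fasta` is a hypothetical helper, and the record ID is assumed to be the first whitespace-delimited token of each header line.

```python
from typing import Iterable, Iterator


def subset_fasta(lines: Iterable[str], keep: str) -> Iterator[str]:
    """Yield only the lines of the record whose ID equals `keep`."""
    writing = False
    for line in lines:
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token of the header
            tokens = line[1:].split()
            record_id = tokens[0] if tokens else ""
            writing = record_id == keep
        if writing:
            yield line


# A tiny in-memory genome standing in for a real FASTA file
genome = [
    ">chr20 example description\n",
    "AAAA\n",
    ">chr21 example description\n",
    "CCCC\n",
    "GGGG\n",
    ">chr22\n",
    "TTTT\n",
]

subset = list(subset_fasta(genome, "chr21"))
print("".join(subset), end="")
```

Because the function accepts any iterable of lines, an open file handle can be passed in place of the list, so a large genome never has to be held in memory.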
## New module: kb/count
- Submitted by (@flowuenne)
- Issue: https://github.com/nf-core/modules/issues/360
- Module for quantifying scRNA-seq data (fastq files) using kb-python
- In progress:
    - For proper tests, will need the index created by kb/ref from new test data. Local test done with an index of chromosome 21 worked.
    - Will wait for new human test data added by [@Friederike Hanssen ](@KuylkJ0AS0-limIJRUVsnw ) to create the PR
## New module: NanoPlot
- Yuk Kei (Wednesday): unlike viralrecon's NanoPlot module, which only uses the fastq option, this module also uses sequencing_summary.txt
- Submitted PR: https://github.com/nf-core/modules/pull/364
- PyTest working locally but not on GitHub checks
# Modules Testing - Achievements
## Human data set
<!-- State what you achieved here -->
- Rike (Monday): Search for a suitable human data set
- Rike (Tuesday): Download data, convert to fastqs, determine UMI Adapter, start sarek
- Rike (Wednesday): Data is finally running now to generate all the indices. Unfortunately, not enough reads mapped to chr6, so chr22 it is for now
## Nanopore SARS-CoV2 test dataset
- Yuk Kei (Tuesday & Wednesday): created a 100-read SARS-CoV2 test dataset with single-read fast5 files, fastq, bam/bai, and sequencing_summary.txt
- Merged PR (fast5, fastq, and bam/bai): https://github.com/nf-core/modules/pull/344
- Submitted PR along with NanoPlot (sequencing_summary.txt): https://github.com/nf-core/modules/pull/364
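
Cutting a FASTQ file down to a fixed number of reads is one simple way to obtain a test file of this size. The Python sketch below is not necessarily how the 100-read dataset was produced. `head_fastq` is a hypothetical helper, and it assumes each record occupies exactly four lines (no wrapped sequences).

```python
from itertools import islice
from typing import Iterable, Iterator


def head_fastq(lines: Iterable[str], n_reads: int) -> Iterator[str]:
    """Yield the lines of the first n_reads records, assuming 4 lines per record."""
    return islice(lines, 4 * n_reads)


def synthetic_fastq(n_reads: int) -> Iterator[str]:
    """Generate placeholder FASTQ records for demonstration purposes."""
    for i in range(1, n_reads + 1):
        yield f"@read_{i}\n"
        yield "ACGTACGT\n"
        yield "+\n"
        yield "IIIIIIII\n"


# Keep the first 100 reads out of 250 synthetic ones
subset = list(head_fastq(synthetic_fastq(250), 100))

print(len(subset) // 4)
```

Taking the first reads is deterministic, which suits test data, but it is not a random sample of the run.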
## Changes to test data
- Add readgroup to existing test bam files for compatibility with gatk tools
- Add recalibration table (.table) for testing gatk tools.
- Updated fastp tests to deal with non-deterministic outputs
## Remove runOptions from docker profile
- https://github.com/nf-core/modules/pull/315
- Need to investigate further whether it is strictly necessary to use ``-u $(id -u):$(id -g)`` for Docker in nf-core
# Other - Achievements
* Re-organised test data for different platforms
* New test data config (@everyone!🎉)
* Update CI tests to add linting and to bypass limit (Edmund)
* Added more docs about local installation (properly tested modules have helped!)
* Updated PR template to encourage linking the corresponding issues [[#312 by Gregor](https://github.com/nf-core/modules/pull/312)]
* Add docs for running tests locally with pytest [[#338 by Harshil](https://github.com/nf-core/modules/pull/338)]
## Modules with custom scripts
- Trying to find a way to include custom scripts in a module
- `${moduleDir}/bin/my_script.sh` works in principle but breaks the cache
- Short term: workaround with `#tools`?
- Proof-of-concept PR at https://github.com/nf-core/modules/pull/368
- Issue at https://github.com/nf-core/modules/issues/294 and open for further discussion
- Long term: Find nextflow solution: https://github.com/nextflow-io/nextflow/issues/1798
---