# Meeting 13 Dec 2023

## To discuss

1. Scope of the pipeline + general workflow
   - Germline variants
   - Somatic variants
   - Structural variants
   - CNVs
   - Single cell?
   - MSA?
   - ATAC-seq?
   - All outputs of pipelines?
3. Proposal?
4. What is the overlap with [NCbench](https://f1000research.com/articles/12-1125) and how will we handle this? (NCbench currently benchmarks germline small variants.)

## Notes

1. NCbench
2. Snakemake pipeline, dependent on the type of analysis (evaluation); includes the production of the matrices used for the benchmarking. (https://github.com/snakemake-workflows/dna-seq-benchmark)
3. GitHub Actions, independent of analysis (https://github.com/ncbench/ncbench-workflow/blob/main/.github/workflows/main.yml)
4. Reporting (https://github.com/ncbench/ncbench.github.io/blob/main/.github/workflows/static.yml)

- Use NCbench for germline and somatic variants.
- Add pipelines (Nextflow or Snakemake) to the NCbench organization for other data types.
- Possible benchmarking pipelines in Nextflow: RNA-seq variants, single-cell variants, SVs, CNVs.
- Instead of one major nf-core benchmarking pipeline, it will be easier to keep separate Nextflow pipelines per data type.
- We could try one or a couple of PoCs and see how it works.

# Benchmarking

## Variant calling

### SNVs

- `hap.py` (https://github.com/Illumina/hap.py)
- `RTGtools vcfeval` (https://github.com/RealTimeGenomics/rtg-tools)
- `bedtools jaccard` (https://bedtools.readthedocs.io/en/latest/content/tools/jaccard.html)

GA4GH standards for small variant benchmarking (https://github.com/ga4gh/benchmarking-tools/tree/master/doc/ref-impl)

### SVs

- `truvari` (https://github.com/ACEnglish/truvari)
- `SVanalyzer` (https://github.com/nhansen/SVanalyzer/blob/master/docs/svbenchmark.rst)

### Somatic Variants

- `som.py` (https://github.com/Illumina/hap.py/blob/master/doc/sompy.md)

### Normalization tools
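
No tools are listed under this heading yet. As background on why normalization matters for benchmarking, here is a minimal sketch of allele trimming: the same indel can be written with different amounts of flanking sequence, and comparing records naively would then count a mismatch. The function name `trim_variant` and the example records are hypothetical and not taken from any tool above. The sketch covers trimming only; full normalization also left-aligns indels against the reference sequence, which is not shown here.

```python
def trim_variant(pos, ref, alt):
    """Trim bases shared by REF and ALT (illustrative only, no left-alignment).

    pos is 1-based, as in VCF. Each allele keeps at least one base.
    """
    # Drop shared trailing bases first; the position does not change.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then drop shared leading bases, shifting the position right each time.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Two hypothetical spellings of the same deletion reduce to one record.
padded = trim_variant(100, "CTCC", "CCC")
minimal = trim_variant(100, "CT", "C")
print(padded, minimal, padded == minimal)
```

Trimming the suffix before the prefix keeps the record anchored at the leftmost position, which matches the usual VCF convention of a leading anchor base for indels.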