<!-- .slide: data-background="https://raw.githubusercontent.com/maxulysse/maxulysse.github.io/main/assets/img/svg/green_white_bg.svg" -->
<a href="https://www.nf-co.re"><img src="https://github.com/nf-core/logos/raw/master/nf-core-logos/nf-core-logo-darkbg.svg" width="65%"></a>
## Reproducible Pipelines for Core Facilities (and you!)
Franziska Bonath ▸ KTH | NGI | SciLifeLab
Maxime U Garcia ▸ Seqera Labs
---
## Outline
* Data Flow at NGI
* QC Steps from Sequencer to Delivery
* Pipelines and Workflow Managers
* nf-core: A Community Curated Set of Pipelines using Nextflow
* Nextflow Pipeline Case Study: Sarek
---
<!-- .slide: data-background="https://hackmd.io/_uploads/H1CZ2IH2j.jpg" -->
## NGI
#### (Stockholm node, Illumina projects only)
|<font size = 6> projects: 631</font>| |<font size = 6> samples: 31686</font>|
| -------- | -------- | -------- |
| | <font size = 7> 2022 </font> | |
|<font size = 6> bases: 1373 Gbp / day</font>| |<font size = 6> 1 human genome / 3.39 minutes</font>|
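
The two throughput figures are linked by simple arithmetic. A minimal sketch, assuming the per-genome figure counts one genome's worth of bases (roughly 1× coverage, which is an interpretation and not stated on the slide):

```python
# Figures taken from the slide above.
GBP_PER_DAY = 1373          # sequenced bases per day, in Gbp
MINUTES_PER_GENOME = 3.39   # stated time per human genome

MINUTES_PER_DAY = 24 * 60

# Sequencing rate per minute.
gbp_per_minute = GBP_PER_DAY / MINUTES_PER_DAY

# Number of bases produced in the stated time per genome.
# This is the genome size the slide implicitly assumes.
implied_genome_gbp = gbp_per_minute * MINUTES_PER_GENOME

# How many such genome equivalents fit into one day.
genomes_per_day = MINUTES_PER_DAY / MINUTES_PER_GENOME

print(f"{gbp_per_minute:.3f} Gbp per minute")
print(f"{implied_genome_gbp:.2f} Gbp per genome equivalent")
print(f"{genomes_per_day:.0f} genome equivalents per day")
```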
---
## Data flow at NGI
<img src="https://hackmd.io/_uploads/rkvgNrrhj.jpg" width="90%"/>
[comment]: <NAS = Network Attached Storage>
---
## Bioinformatics at NGI
### 1) Primary QC of flowcells
* Did the flowcell/lane get enough reads?
* Is the average quality of all reads acceptable?
    * % of reads above Phred Q30
    * PhiX error rate below 2%
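
The Q30 threshold corresponds to an error probability of 1 in 1000 per base call. A minimal sketch of the conversion and of a Q30 fraction, assuming Phred+33 encoded quality strings and computing the fraction per base (the example string is hypothetical):

```python
def phred_to_error_probability(q: int) -> float:
    """Convert a Phred quality score Q to an error probability, p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def fraction_at_or_above(quality_string: str, threshold: int = 30, offset: int = 33) -> float:
    """Fraction of bases in a FASTQ quality string with Phred score >= threshold.

    Assumes Phred+33 ASCII encoding (the offset parameter).
    """
    if not quality_string:
        return 0.0
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(s >= threshold for s in scores) / len(scores)


# Hypothetical quality string: 'I' encodes Q40, '5' encodes Q20.
example = "IIIIIIII55"
print(phred_to_error_probability(30))
print(f"{fraction_at_or_above(example):.0%} of bases at or above Q30")
```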


---
## Bioinformatics at NGI
### 2) Demultiplexing
<div style="text-align: left; float: left;"><font size=6>
* Did all samples get enough reads?
* Are there excessive amounts of undetermined reads?
* Are there valid indexes within the undetermined reads?
</font></div>
<span><img src="https://hackmd.io/_uploads/B1CDU1O3s.jpg"/></span>
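
The last question can be approached by counting the index sequences seen among undetermined reads and comparing the most frequent ones with a list of valid indexes. A minimal sketch, with hypothetical index sequences and a hypothetical catalogue (this is an illustration, not the tooling used at NGI):

```python
from collections import Counter


def summarize_undetermined(undetermined_indexes, known_indexes, top_n=3):
    """Count index sequences among undetermined reads and flag known ones.

    Returns (index, count, is_known) tuples for the top_n most frequent indexes.
    """
    counts = Counter(undetermined_indexes)
    known = set(known_indexes)
    return [(idx, n, idx in known) for idx, n in counts.most_common(top_n)]


# Hypothetical data: index sequences observed in undetermined reads,
# and a hypothetical catalogue of valid index sequences.
undetermined = ["ACGTACGT"] * 5 + ["GGGGGGGG"] * 3 + ["TTAGGCAT"] * 2
catalogue = ["ACGTACGT", "TTAGGCAT", "CCATGGAA"]

for index, count, is_valid in summarize_undetermined(undetermined, catalogue):
    print(index, count, "valid index" if is_valid else "unknown")
```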
---
## Bioinformatics at NGI
### 2) Demultiplexing
<img src="https://hackmd.io/_uploads/ry9UBvHhs.png" width="393"/>
<span><img src="https://hackmd.io/_uploads/H1_VUPH3o.png" width="500"/></span> <span><img src="https://hackmd.io/_uploads/BkjM_PHho.png" width="810"/></span>
---
## Bioinformatics at NGI
### 3) QC reports by sample
<div style="display: flex; justify-content: space-evenly; align-items:center;">
<div style="text-align: left; float: left;"><font size=5>
* Quality on sample level
    * % of reads above Phred Q30
* Contamination report (FastQ Screen)
    * mapping against the most common species
* Summary of QC report in MultiQC
</font></div><img src="https://hackmd.io/_uploads/HJXD0eOho.jpg" width="35%">
</div>
---
## Bioinformatics at NGI
### 4) "Best Practice" Analysis
<div style="text-align: left; float: left;"><font size=6>
- Analysis to control for library preparation issues
- Specific to library preparation type
- First steps of data analysis for the data type
- NGI _cannot_ do project-specific analysis
</font></div>
<div style="text-align: left;"><font size=6>
- Use of Nextflow pipelines from nf-core
- Results are summarized using MultiQC
</font></div>
<a href="https://www.nextflow.io/"><img src="https://github.com/nextflow-io/trademark/raw/master/nextflow2014_no-bg-bright.png" width="40%"/></a><a href="https://www.nf-co.re"><img src="https://github.com/nf-core/logos/raw/master/nf-core-logos/nf-core-logo-square.svg" width="10%"/></a><img src="https://github.com/ewels/MultiQC/raw/master/docs/images/MultiQC_logo_darkbg.png" width="30%"/>
---
## Bioinformatics at NGI
### 5) Generation of project reports
* Will contain:
    * General QC stats for the flowcell and each sample
    * Information on
        * Library prep
        * Sequencing setup
        * Accreditation status and deviations
---
## Bioinformatics at NGI
### 6) Deliveries

<div style="display: flex; justify-content: space-evenly; align-items:center;">
<div style="text-align: left; float: left;"><font size=5>
<p data-markdown>- For sensitive data</p>
<p data-markdown>- Hosted by UPPMAX</p>
<p data-markdown>- Requires a SNIC account</p>
</font></div>
<div style="text-align: right; float: right;"><font size=5>
<p data-markdown>- (Currently) only for non-sensitive data</p>
<p data-markdown>- Hosted by SciLifeLab Data Centre</p>
<p data-markdown>- Email with access link sent to user</p>
</font></div></div>
---
<!-- .slide: data-background="https://dog.dnr.alaska.gov/resources/images/backgrounds/pipeline.jpg" -->
# Pipelines
---
## What is a pipeline?
<img src="https://hackmd.io/_uploads/BJ0vVftns.png" width="50%">
<font size=5>
<a href="https://doi.org/10.1038/s41592-021-01254-9">10.1038/s41592-021-01254-9</a>
</font>
---
## What is a workflow manager?
<img src="https://hackmd.io/_uploads/HyEpPfY3i.png" width="70%">
<font size=5>
<a href="https://doi.org/10.1038/s41592-021-01254-9">10.1038/s41592-021-01254-9</a>
</font>
---
### Some available workflow managers
<img src="https://hackmd.io/_uploads/S1rJYzY3o.png" width="90%">
<font size=5>
<a href="https://doi.org/10.1038/s41592-021-01254-9">10.1038/s41592-021-01254-9</a>
</font>
---
<a href="https://www.nf-co.re"><img src="https://github.com/nf-core/logos/raw/master/nf-core-logos/nf-core-logo-darkbg.svg" width="65%"></a>
---
## Reproducibility is central
<a href="https://academic.oup.com/view-large/figure/118918033/giy077fig1.jpg"><img src="https://maxulysse.github.io/assets/img/slides/gigascience_giy077_fig1.jpg" width="50%"></a>
<font size=5>
<a href="https://doi.org/10.1093/gigascience/giy077">10.1093/gigascience/giy077</a>
</font>
---
# What is nf-core?
> A community effort to collect a curated set of analysis pipelines built using Nextflow.
---
# What is Nextflow?
<a href="https://www.nextflow.io/"><img src="https://maxulysse.github.io/assets/img/slides/nextflow.png" width="50%"></a>
* Workflow manager
* Data-driven language
* Portable
    * executable on multiple platforms
* Shareable and reproducible
    * with containers or virtual environments
---
## Data-driven language
In Nextflow, the execution graph depends on the input data
and is computed on the fly.
In Snakemake it is the other way around:
the execution graph depends on the final target
and is computed before launch.
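
The contrast can be illustrated with a toy sketch in plain Python. It is a deliberately simplified model of the two scheduling ideas, not how either tool is implemented, and all sample, step, and rule names are hypothetical:

```python
def data_driven(inputs, steps):
    """Push model: each input that arrives triggers the steps in order.

    The list of executed tasks is only known once the data has been seen.
    """
    executed = []
    for item in inputs:            # inputs may be any iterable, e.g. a generator
        for step in steps:
            executed.append(f"{step}({item})")
    return executed


def target_driven(target, rules):
    """Pull model: resolve what the target needs before running anything.

    `rules` maps an output name to the list of inputs it depends on.
    Returns the tasks in an order that satisfies all dependencies.
    """
    plan, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dependency in rules.get(name, []):
            visit(dependency)
        if name in rules:          # only names with a rule are tasks
            plan.append(name)

    visit(target)
    return plan


# Hypothetical sample, step, and rule names, for illustration only.
print(data_driven(["sampleA", "sampleB"], ["align", "call_variants"]))
print(target_driven("report", {
    "report": ["variants"],
    "variants": ["alignment"],
    "alignment": ["reads.fastq"],
}))
```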
---
## Portability
[www.nextflow.io/docs/latest/executor.html](https://www.nextflow.io/docs/latest/executor.html)
- <i class="fa fa-server"></i> Sun Grid Engine, SLURM, PBS/Torque...
- <i class="fa fa-cloud"></i> AWS Batch, Kubernetes, Google Life Sciences
---
## Reproducibility
<a href="https://docs.conda.io/"><img src="https://maxulysse.github.io/assets/img/svg/conda_logo.svg" width="50%"></a> | <a href="https://www.docker.com/"><img src="https://maxulysse.github.io/assets/img/svg/docker_logo.svg" width="50%"></a> | <a href="https://sylabs.io/singularity/"><img src="https://maxulysse.github.io/assets/img/svg/singularity_logo.svg" width="50%"></a>
:-:|:-:|:-:
---
# What is nf-core: community

---
# What is nf-core: for users

---
# What is nf-core: for developers

---
# What does nf-core provide
- **Pipelines**: ready-made pipelines [n=68]
- **Docs <i class="fa fa-globe"></i>**: Guidelines, tutorials, videos
- **Subworkflows <i class="fa fa-globe"></i>**: multi-tool wrappers [n=31]
- **Modules <i class="fa fa-globe"></i>**: single-tool wrappers [n=797]
- **Configs <i class="fa fa-globe"></i>**: shared infrastructure configs
- **Test datasets <i class="fa fa-globe"></i>**: test data for :point_up_2:
- **Tools <i class="fa fa-globe"></i>**: linting, template + automation for :point_up_2:
<i class="fa fa-globe"></i> provided for the larger community
---
## Pipeline requirements
[<i class="fa fa-globe"></i> nf-co.re/docs/contributing/adding_pipelines](https://nf-co.re/docs/contributing/adding_pipelines)
- Nextflow based
- Common structure
- Stable release tags
- MIT license
- Software bundled for reproducibility
- Continuous Integration testing
- _lagom_ (Swedish for "just the right amount")
---
## Sarek
[<i class="fa fa-globe"></i> nf-co.re/sarek](https://nf-co.re/sarek)
- Based on GATK Best Practices
- Alignment, Variant Calling, Annotation
- SNPs, Indels, SVs, CNVs, MSI...
- Germline, somatic, or tumor-only
---
<a href="https://nf-co.re/tools/"><img src="https://maxulysse.github.io/assets/img/svg/nf-core-tools_logo.svg" width="60%"></a>
---
## A companion tool
[<i class="fa fa-globe"></i> https://nf-co.re/tools](https://nf-co.re/tools)
- **[launch](https://nf-co.re/tools#launch-a-pipeline)** - with interactive prompts
- **[download](https://nf-co.re/tools#downloading-pipelines-for-offline-use)** - for offline use
- **[lint](https://nf-co.re/tools#linting-a-workflow)** - check code against guidelines
- **[modules](https://nf-co.re/tools/#modules)** - list, update, lint, create...
- **[subworkflows](https://nf-co.re/tools/#subworkflows)** - list, update, lint, create...
- ...
---
## Configurations
All pipelines come with a sensible default configuration for a regular-sized HPC cluster
(including UPPMAX)
---
## Configurations
[<i class="fa fa-github"></i> github.com/nf-core/configs](https://github.com/nf-core/configs/) lets pipelines share one configuration for a specific HPC cluster
* cpus, time and memory requirements
* scheduler
* queues
* environments
* paths to common reference files
* ...
---
## <i class="fa fa-laptop"></i> Training and other events
[<i class="fa fa-globe"></i> https://nf-co.re/events](https://nf-co.re/events)
<a href="https://nf-co.re/events/2020/hackathon-francis-crick-2020"><img src="https://maxulysse.github.io/assets/img/slides/nf-core_hackathon_crick2020.jpg" width="60%"></a>
[<i class="fa fa-globe"></i> nf-co.re/events/2023/training-march-2023](https://nf-co.re/events/2023/training-march-2023)
---
## Need help?
<!-- .slide: data-background="https://raw.githubusercontent.com/maxulysse/maxulysse.github.io/main/assets/img/svg/green_white_bg.svg" -->
Website: [`https://nf-co.re`](https://nf-co.re)
Chat: [`https://nf-co.re/join`](https://nf-co.re/join) <img src="https://cdn.brandfolder.io/5H442O3W/at/pl546j-7le8zk-6gwiyo/Slack_Mark.svg" width="7.5%">
<div style="margin-top:0.1em"> </div>
<p align="center">
Follow nf-core on
<a href="https://www.twitter.com/nf_core"><img src="https://openmoji.org/data/color/svg/E040.svg" width=6%></a>
<a href="https://mstdn.science/@nf_core"><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/4/48/Mastodon_Logotype_%28Simple%29.svg/216px-Mastodon_Logotype_%28Simple%29.svg.png" width=5%></a>
<a href="https://github.com/nf-core"><img src="https://openmoji.org/data/color/svg/E045.svg" width=6%></a>
<a href="https://www.youtube.com/c/nf-core"><img src="https://openmoji.org/data/color/svg/E044.svg" width=6%></a>
</p>
<a href="https://nf-co.re/" style="color: #000000; font-family:Monaco, monospace; font-weight:bold;">https://nf-co.re/</a>
<div style="display: flex; justify-content: space-evenly; align-items:center;">
<img src="https://chanzuckerberg.com/wp-content/themes/czi/img/logo.svg" width=15%>
<div style="font-style:italic; font-size: 0.5em; color: #666;">Icons:<br><a href="https://openmoji.org">openmoji.org</a></div></div>
<style>
body {
background-image: url(https://raw.githubusercontent.com/nf-core/logos/master/nf-core-logos/nf-core-logo-square.svg);
background-size: 7.5%;
background-repeat: no-repeat;
background-position: 3% 96%;
background-color: #181a1b;
}
.reveal body {
font-family: 'Roboto', sans-serif;
font-weight: 300;
color: white;
}
.reveal p {
font-family: 'Roboto', sans-serif;
font-weight: 300;
color: white;
}
.reveal h1 {
font-family: 'Roboto', sans-serif;
font-weight: 400;
color: white;
font-size: 62px;
}
.reveal h2 {
font-family: 'Roboto', sans-serif;
font-weight: 300;
color: white;
}
.reveal h3 {
font-family: 'Roboto', sans-serif;
font-style: italic;
font-weight: 300;
color: white;
}
.reveal li {
font-family: 'Roboto', sans-serif;
font-weight: 300;
color: white;
}
.reveal pre {
background-color: #272822 !important;
display: inline-block;
border-radius: 7px;
color: #aaaba9;
}
.reveal pre code {
color: #eeeeee;
background-color: #272822;
font-size: 100%;
}
.reveal code {
background-color: #272822;
font-size: 75%;
}
.reveal .progress {
color: #24B064;
}
.reveal .controls button {
color: #24B064;
}
.reveal blockquote {
display: block;
position: relative;
width: 90%;
margin: 20px auto;
padding: 5px;
background: rgba(255, 255, 255, 0.05);
box-shadow: 0px 0px 2px rgb(0 0 0 / 20%);
}
</style>
{"metaMigratedAt":"2023-06-17T19:14:43.969Z","metaMigratedFrom":"YAML","title":"Introduction to bioinformatics using NGS Data - NBIS course","breaks":true,"contributors":"[{\"id\":\"fb193497-1111-470c-a594-827d34b6f673\",\"add\":21847,\"del\":14742},{\"id\":\"5d29bb46-4e7a-46d5-91af-8540de253fce\",\"add\":8436,\"del\":2460}]"}