---
title: "nf-core/genomeassembler assembly workflow planning"
tags: pipeline,genomeassembler,planning,plans
---
# nf-core/genomeassembler assembly workflow discussion
## Desired components:
- assembles
- polishes
- scaffolds
- purges haplotypic duplicates (e.g., purge_dups)
- phases
- sub-assembly of organelles, sex chromosomes, and germline/somatic genomes (https://www.nature.com/articles/s41467-019-13427-4)
- circular DNA rotation (e.g., plasmids, organelles)
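
As a rough illustration of the circular DNA rotation component, the sketch below rotates a circular sequence so that it starts at a chosen position or at the first occurrence of an anchor motif. The function names (`rotate_circular`, `rotate_to_anchor`) are hypothetical, and the sketch assumes a plain string sequence and a forward-strand anchor only; a real implementation would likely also check the reverse complement and use a proper start gene.

```python
def rotate_circular(seq: str, start: int) -> str:
    """Rotate a circular sequence so that 0-based position `start` becomes the first base."""
    if not seq:
        return seq
    start %= len(seq)  # wrap positions beyond the sequence length
    return seq[start:] + seq[:start]


def rotate_to_anchor(seq: str, anchor: str) -> str:
    """Rotate so the sequence begins with `anchor` (forward strand only).

    The search runs over the sequence extended by its own prefix, so an
    anchor that spans the linearisation break point is still found.
    Returns the sequence unchanged if the anchor is empty or absent.
    """
    if not seq or not anchor:
        return seq
    extended = seq + seq[: len(anchor) - 1]
    idx = extended.find(anchor)
    if idx == -1:
        return seq
    return rotate_circular(seq, idx)


if __name__ == "__main__":
    print(rotate_circular("ACGTTG", 2))
    print(rotate_to_anchor("TGACGA", "ATG"))
```

Rotation does not change the sequence content, only the coordinate at which the circular molecule is linearised, which makes contigs from different assemblers easier to compare.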
## Workflow diagram draft:
Are Trycycler and SPAdes prokaryote-specific?
```mermaid
flowchart LR
Illumina-WGS --> SPAdes
Illumina-WGS --> Shovill
Illumina-WGS --> Shovill-SE
SPAdes --> Illumina-Output
Shovill --> Illumina-Output
Shovill-SE --> Illumina-Output
```
```mermaid
flowchart LR
PacBioHifi --> Canu
PacBioHifi --> HiFiAsm
PacBioHifi --> Flye
Canu --> Pacbio-Output
Flye --> Pacbio-Output
HiFiAsm --> Pacbio-Output
```
```mermaid
flowchart LR
Illumina-WGS --> Pilon
Nanopore --> Canu
Nanopore --> RedBean
Nanopore --> Raven
Nanopore --> Shasta
Nanopore --> Miniasm
Nanopore --> Flye
Canu --> Trycycler
Flye --> Trycycler
RedBean --> Trycycler
Raven --> Trycycler
Shasta --> Trycycler
Miniasm --> Trycycler
Trycycler --> Medaka
Trycycler --> Pilon
Trycycler --> NanoPolish
Nanopore-fast5 --> NanoPolish
NanoPolish --> Nanopore-Output
Pilon --> Nanopore-Output
Medaka --> Nanopore-Output
```
<!-- Felipe Almeida: I am suggesting some changes to the graph, as I believe some of the processes do not fit.
1. I believe PacBioHifi reads should not be passed to SPAdes. I believe the tools that handle them best are hifiasm, Flye and Canu.
2. I believe the graph should use Trycycler as a consensus maker, since it takes already-generated assemblies and builds a consensus from them. Instead of "Nanopore -> Trycycler" we should have "Canu -> Trycycler".
3. I don't believe Trycycler can be used for short reads.
4. Trycycler output can be polished later with Medaka or Pilon. Trycycler expects "raw" assemblies.
5. I believe naming the end point of each workflow in the graph should help with understanding it, e.g. "Pacbio-Output".
6. Maybe it is better to split this into separate graphs for cleaner visualization?
I left the lines removed from the mermaid graph below:
Illumina-WGS -\-> Trycycler
PacBioHifi -\-> SPAdes
PacBioHifi -\-> Trycycler
Nanopore -\-> SPAdes
Nanopore -\-> Trycycler
Miniasm -\-> Racon
Racon -\-> NanoPolish
Racon -\-> Medaka
-->
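
The draft graphs above can also be written down as a small routing table, which may help when discussing which branches the pipeline should enable for a given input. The following is only a sketch under stated assumptions: the dictionary names, the read-type keys (`illumina`, `pacbio_hifi`, `nanopore`, `nanopore_fast5`) and the `plan_stages` function are hypothetical and not part of any existing nf-core code. The tool lists are copied from the diagrams, with Trycycler as the consensus tool.

```python
# Assemblers per read type, copied from the draft diagrams.
ASSEMBLERS = {
    "illumina": ["SPAdes", "Shovill", "Shovill-SE"],
    "pacbio_hifi": ["Canu", "HiFiAsm", "Flye"],
    "nanopore": ["Canu", "RedBean", "Raven", "Shasta", "Miniasm", "Flye"],
}

# Only the Nanopore branch has a consensus step and polishers in the draft.
CONSENSUS = {"nanopore": "Trycycler"}
POLISHERS = {"nanopore": ["Medaka", "Pilon", "NanoPolish"]}

# Polishers that need an additional input besides the assembly (per the draft).
EXTRA_INPUT = {"Pilon": "illumina", "NanoPolish": "nanopore_fast5"}


def plan_stages(read_type, available=frozenset()):
    """Return the stages the draft graph implies for one read type.

    `available` lists extra inputs the user provides (hypothetical keys);
    polishers whose extra input is missing are skipped.
    """
    if read_type not in ASSEMBLERS:
        raise ValueError(f"unknown read type: {read_type}")
    stages = {"assemble": list(ASSEMBLERS[read_type])}
    if read_type in CONSENSUS:
        stages["consensus"] = [CONSENSUS[read_type]]
    polishers = [
        tool
        for tool in POLISHERS.get(read_type, [])
        if EXTRA_INPUT.get(tool) is None or EXTRA_INPUT[tool] in available
    ]
    if polishers:
        stages["polish"] = polishers
    return stages


if __name__ == "__main__":
    print(plan_stages("nanopore", {"illumina"}))
    print(plan_stages("pacbio_hifi"))
```

Keeping the routing in one table makes open questions, such as whether a tool is prokaryote-specific, easy to express later as an extra filter on the same data.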