---
title: "nf-core/genomeassembler input discussion"
tags: pipeline,genomeassembler,planning,plans
---
# nf-core/genomeassembler input format discussion
Genome assembly can be a complex process with
a variety of data inputs and steps.
We'd like to formalise how the workflow takes input here:
## TSV format
Aimed at simpler input for less complex workflow executions.
Proposed properties:
- Should omit any one-to-many or many-to-one relations
- Could still accept comma-separated lists where necessary
- One row per assembly
- Column names should be identical to the keys used in the YAML format
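
As a rough illustration of the properties above, the following Python sketch reads one row per assembly and splits comma-separated cells into lists. The column names (`id`, `hifi`, `hic`), the set of list-valued columns, and the helper `parse_samplesheet` are assumptions made for this example only; nothing here is a settled format.

```python
import csv
import io

# Hypothetical TSV: column names mirror the YAML keys, and list-valued
# columns (an assumption for this sketch) hold comma-separated paths.
EXAMPLE_TSV = (
    "id\thifi\thic\n"
    "My_awesome_species_1\t/path/to/reads_1,/path/to/reads_2\t/path/to/hic_reads\n"
)

# Columns treated as lists; every other column is kept as a plain string.
LIST_COLUMNS = {"hifi", "hic", "rnaseq", "isoseq"}


def parse_samplesheet(handle):
    """Return one dict per assembly row, splitting comma-separated list columns."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        parsed = {}
        for column, value in row.items():
            if column in LIST_COLUMNS:
                # Empty or missing cells become empty lists rather than [''].
                cells = (value or "").split(",")
                parsed[column] = [cell.strip() for cell in cells if cell.strip()]
            else:
                parsed[column] = value
        rows.append(parsed)
    return rows


samples = parse_samplesheet(io.StringIO(EXAMPLE_TSV))
```

Keeping each row flat in this way avoids one-to-many relations in the file itself, while still letting a sample carry several read files.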
## YAML format
Designed for more complex workflow executions.
Scope:
- Multiple samples
- Assembler execution with varying parameters
- What to subassemble/circularise?
```yaml
samples:
  - id: My_awesome_species_1
    assemblies:
      - id: assemblerX_build1
        path: /path/to/assembly
      - id: assemblerY_build1
        path: /path/to/assembly
    hic:
      - id: hic_assembly_1
        path: /path/to/reads
    hifi:
      - path: /path/to/reads
      - path: /path/to/reads
    rnaseq:
      - path: /path/to/reads
    isoseq:
      - path: /path/to/reads
tools:
  assembler1:
    options:
      - '--opt1 X --opt2 Y'
      - '--opt1 A --opt2 B --opt3 C'
  assembler2:
    options:
      - '--optx X --opty Y'
      - '--optx A --opty B --optz C'
  assembler3: true
```
- Should use the [PEP (Portable Encapsulated Projects) specification](http://pep.databio.org/en/latest/).
### Implementation details
- Tool options are configured using the `ext.args` process option. For this to work, the options must be passed to the process as part of the `meta` input map/dictionary. They can then be accessed in the configuration through a closure:
```nextflow
process {
    withName: 'ASSEMBLER_X' {
        ext.args = { meta.assembler_options }
    }
}
```
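
One way to picture how an option list from the YAML could end up in `meta.assembler_options` is to expand each sample into one `meta` map per option string. The Python sketch below only illustrates that logic and is not pipeline code. The in-memory `config` layout, the helper `expand_meta`, and the reading of `tool: true` as "run once with default options" are all assumptions.

```python
# Hypothetical in-memory form of the YAML input: one sample, one tool
# with two option strings, and one tool enabled with `true`.
config = {
    "samples": [{"id": "My_awesome_species_1"}],
    "tools": {
        "assembler2": {
            "options": ["--optx X --opty Y", "--optx A --opty B --optz C"],
        },
        "assembler3": True,
    },
}


def expand_meta(config, tool):
    """Build one meta map per (sample, option string) pair for the given tool."""
    settings = config["tools"].get(tool)
    if not settings:
        return []  # tool absent or disabled: nothing to run
    # Assumption: `tool: true` means a single run with no extra options.
    options = [""] if settings is True else settings.get("options", [""])
    return [
        {"id": sample["id"], "assembler_options": opts}
        for sample in config["samples"]
        for opts in options
    ]


metas = expand_meta(config, "assembler2")
```

Each resulting map carries its own `assembler_options` entry, so a closure such as `{ meta.assembler_options }` would resolve to a different option string for each job.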