---
title: "nf-core/genomeassembler test data set discussion"
tags: pipeline,genomeassembler,planning,plans
---
# nf-core/genomeassembler test data set discussion
nf-core requires that a workflow have both a small-scale test data set and a full-scale test data set. The small-scale test data only needs to confirm that the workflow runs to completion; the results do not need to make biological sense. A full-scale test data set, on the other hand, should be representative of the data this workflow will analyse and should give an idea of the expected running time and resource usage.
## Prokaryotic dataset needs
- Illumina WGS
- PacBio HiFi
- Oxford Nanopore
- Reference genome
A potential source is the nf-core [module test data](https://github.com/nf-core/modules/blob/master/tests/config/test_data.config):
- bacteroides_fragilis: all data except PacBio
- candidatus_portiera_aleyrodidarum: all data except PacBio
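
The gap in the candidates above can be tracked with a small coverage check. This is only a sketch in Python: the data type names are copied from the list above, the availability sets encode "all data except PacBio" as stated in these notes (not verified against `test_data.config`), and the function names `missing_types` and `coverage_report` are hypothetical helpers.

```python
# Data types the prokaryotic test set needs (copied from the list above).
REQUIRED = ["Illumina WGS", "PacBio HiFi", "Oxford Nanopore", "Reference genome"]

# Assumption: availability as recorded in these notes ("all data except PacBio"),
# not checked against the actual module test data config.
CANDIDATES = {
    "bacteroides_fragilis": {"Illumina WGS", "Oxford Nanopore", "Reference genome"},
    "candidatus_portiera_aleyrodidarum": {"Illumina WGS", "Oxford Nanopore", "Reference genome"},
}


def missing_types(available, required=REQUIRED):
    """Return the required data types a candidate lacks, in the order of `required`."""
    return [data_type for data_type in required if data_type not in available]


def coverage_report(candidates=CANDIDATES, required=REQUIRED):
    """Map each candidate name to the list of data types it is missing."""
    return {name: missing_types(available, required) for name, available in candidates.items()}


if __name__ == "__main__":
    for name, missing in coverage_report().items():
        status = ", ".join(missing) if missing else "nothing"
        print(f"{name}: missing {status}")
```

Adding a new candidate to `CANDIDATES` (or a Hi-C entry to `REQUIRED` for the eukaryotic case) shows at a glance which data would still have to be sourced elsewhere.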
## Eukaryotic dataset needs
- Illumina
- PacBio HiFi
- Oxford Nanopore
- Hi-C
- Reference genome
## Small test data set
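The data used in the genome profile analysis section of the [Galaxy Training VGP genome assembly tutorial](https://training.galaxyproject.org/training-material//topics/assembly/tutorials/vgp_genome_assembly/tutorial.html#genome-profile-analysis) might be a start.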
## Full scale test data set