---
title: "nf-core/genomeassembler data QC planning"
tags: pipeline,genomeassembler,planning,plans
---

# nf-core/genomeassembler data QC and preprocessing discussion

## Desired components:

### Quality Checks:

- Check file checksums
- FastQC / Nanoplot / etc
- Kraken2 / Mash
- K-mer histogram ( GenomeScope )
- K-mer consistency between files ( i.e. checking for biases between different files, cf. https://kat.readthedocs.io/en/latest/walkthrough.html#checking-library-consistency )
- Data quantity check
- Read chimerism

### Preprocessing:

- Adapter removal
    - [HiFiAdapterFilt](https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-022-08375-1)
- Subsampling
- Normalization (e.g. for whole-genome amplified data)
- Read filtering
    - from a list (txt file)
    - from a mapping (take read IDs from a BAM file)

## Workflow diagram draft:

```mermaid
flowchart LR
    Illumina-WGS --> FastQC
    Hi-C --> FastQC
    Nanopore --> Nanoplot
    Nanopore --> PycoQC
    PacBioHifi --> Nanoplot
    Illumina-WGS --> Kraken2/Mash
    PacBioHifi --> Kraken2/Mash
    Nanopore --> Kraken2/Mash
```
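
To make the "Check file checksums" item more concrete, here is a minimal Python sketch of what such a check could look like. It is only an illustration, not a proposed pipeline module: the notes do not say which hash algorithm or manifest format to use, so MD5 and an `md5sum`-style manifest (`<digest>  <file name>` per line) are assumptions, and the helper names `file_md5` and `verify_manifest` are hypothetical.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path):
    """Check every file listed in an md5sum-style manifest (assumed format).

    File names are resolved relative to the manifest's directory.
    Returns a dict mapping file name -> True (digest matches) or
    False (digest differs or the file is missing).
    """
    manifest = Path(manifest_path)
    results = {}
    for line in manifest.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # md5sum marks binary mode with a leading '*'
        target = manifest.parent / name
        results[name] = target.is_file() and file_md5(target) == expected.lower()
    return results
```

Calling `verify_manifest("checksums.md5")` (a hypothetical manifest name) would return one boolean per listed file, so a QC step could fail early if any value is `False`.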