# **Genome assembly: Flye software**

**Author** Diana Moreno (dmorenos@ttu.edu)

[Flye](https://github.com/fenderglass/Flye) is a de novo assembler for single molecule sequencing reads, such as those produced by PacBio and Oxford Nanopore Technologies. It is designed for a wide range of datasets, from small bacterial projects to large mammalian-scale assemblies. It takes raw PB / ONT reads as input and outputs polished contigs.

## Table of Contents

[TOC]

### Flye usage:

:::info
```
flye (--pacbio-raw | --nano-raw ) <file1> <file2...> --genome-size <int> --out-dir <PATH> --threads <int> --iterations <int>

Assembly of long and error-prone reads

optional arguments:
  -h, --help            show this help message and exit
  --pacbio-raw path [path ...]
                        PacBio raw reads
  --pacbio-corr path [path ...]
                        PacBio corrected reads
  --nano-raw path [path ...]
                        ONT raw reads
  --nano-corr path [path ...]
                        ONT corrected reads
  --subassemblies path [path ...]
                        high-quality contigs input
  -g size, --genome-size size
                        estimated genome size (for example, 5m or 2.6g)
  -o path, --out-dir path
                        Output directory
  -t int, --threads int
                        number of parallel threads [1]
  -i int, --iterations int
                        number of polishing iterations [1]
  -m int, --min-overlap int
                        minimum overlap between reads [auto]
  --asm-coverage int    reduced coverage for initial disjointig assembly [not set]
  --plasmids            rescue short unassembled plasmids
  --meta                metagenome / uneven coverage mode
  --no-trestle          skip Trestle stage
  --polish-target path  run polisher on the target sequence
  --resume              resume from the last completed stage
  --resume-from stage_name
                        resume from a custom stage
  --stop-after stage_name
                        stop after the specified stage completed
  --debug               enable debug output
  -v, --version         show program's version number and exit
```
:::

### Install flye using conda

To install flye with conda, simply run this command:

```
conda install flye
```

:::warning
<span style="color:red">****Warning****</span>
Flye is a bioconda package, therefore we need to have bioconda enabled
first. If bioconda is not enabled, do the following:

```
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
```
:::

### Run flye

Before running flye, check the available memory. For a human genome with 30x coverage, you will need ~800 GB of RAM at peak.

Flye can be run with a single command line:

- For nanopore raw sequences (i.e. not corrected)

```
flye --nano-raw <reads1.fastq reads2.fastq> --genome-size <int> --out-dir <path> --threads <int> --iterations 2
```

- For pacbio raw sequences (i.e. not corrected)

```
flye --pacbio-raw <reads1.fastq reads2.fastq> --genome-size <int> --out-dir <path> --threads <int> --iterations 2
```

A Flye run can take 1 to 2 weeks on a 2 Gb mammalian genome. If the run stops, you can restart it with the `--resume` option, which continues from the last completed stage, or with `--resume-from` to pick a specific stage (e.g. `--resume-from polishing`).

The results will be saved in the output directory.

###### tags: `Genome analysis` , `Diana`
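
Because a long run may need to be restarted, it can help to wrap the command from the "Run flye" section in a small helper. The following is only a minimal sketch: the function name `build_flye_cmd`, the file name `reads.fastq` and the directory `flye_out` are hypothetical, and treating a non-empty output directory as an interrupted run is an assumption, not something Flye itself defines. The helper only prints the command, so it can be inspected before being executed.

```shell
#!/usr/bin/env bash
# Sketch: build a Flye command line and append --resume when the output
# directory already contains files from an earlier run.
# build_flye_cmd and all example arguments are hypothetical names.

build_flye_cmd() {
    local mode="$1"         # nano-raw or pacbio-raw
    local reads="$2"        # one or more read files, space separated
    local genome_size="$3"  # e.g. 5m or 2.6g
    local out_dir="$4"
    local threads="$5"
    local cmd="flye --${mode} ${reads} --genome-size ${genome_size} --out-dir ${out_dir} --threads ${threads} --iterations 2"

    # Assumption: a non-empty output directory means a previous run stopped early.
    if [ -d "${out_dir}" ] && [ -n "$(ls -A "${out_dir}" 2>/dev/null)" ]; then
        cmd="${cmd} --resume"
    fi
    printf '%s\n' "${cmd}"
}

# Print the command for a nanopore run (nothing is executed here).
build_flye_cmd nano-raw reads.fastq 2.6g flye_out 16
```

To actually launch the assembly, the printed command can be reviewed and then run, for example with `eval "$(build_flye_cmd nano-raw reads.fastq 2.6g flye_out 16)"`.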