program commands
RepeatModeler2 identifies repeats and assembles consensus sequences from a genome assembly. It attempts a basic classification based on the Dfam database.
genome.fasta (genome sequence) -> RM2 -> genome-families.fa (TE consensi)
Example for a given genome called "genome"
-LTRStruct: enables the LTR module of RM2
genome-families.fa: output library of TE consensus sequences
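The RepeatModeler2 step described above can be sketched as the two commands below. This is a minimal sketch, not necessarily the exact invocation used by the authors: the file name genome.fasta and the database name "genome" follow the example naming, the run helper and the DRY_RUN variable are hypothetical additions, and no thread option is set because its name differs between RepeatModeler releases (check RepeatModeler -h).

```shell
#!/usr/bin/env bash
set -euo pipefail

genome="genome.fasta"   # input genome assembly (name taken from the example)
db="genome"             # database name; assumed equal to the genome name

# Hypothetical helper: prints each command, and executes it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Step 1: index the assembly into a RepeatModeler database
run BuildDatabase -name "$db" "$genome"

# Step 2: de novo repeat discovery, with the LTR module enabled
run RepeatModeler -database "$db" -LTRStruct
```

By default the script only prints the commands. Running it with DRY_RUN=0 executes them, and the consensus library should then be written as genome-families.fa.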
full documentation
RepeatMasker identifies repeats in the genome using the library built and annotated by RepeatModeler2 (genome-families.fa). The default search engine is rmblastn (a version of blastn modified for RepeatMasker).
-pa: number of parallel jobs. WARNING: with rmblastn, RepeatMasker uses 4 CPUs per job, so the total CPU count is the -pa value x 4!
-a: writes the alignments to a .align file (needed for TE landscapes)
-s: "slow" search mode, more sensitive (recommended)
-gccalc: computes the GC content
-gff: produces a GFF track
-cutoff 200: minimum alignment score to keep a hit (recommended)
-no_is: don't look for insertion sequences (prokaryotic TEs)
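Put together, the options listed above could be combined as in the sketch below. The -lib option (custom library) is a standard RepeatMasker option but is not listed above; the core budget of 16, the run helper and the DRY_RUN variable are hypothetical values chosen for illustration. The arithmetic shows how to pick -pa so that the total CPU use stays within the budget when rmblastn takes 4 CPUs per job.

```shell
#!/usr/bin/env bash
set -euo pipefail

genome="genome.fasta"       # input genome assembly
lib="genome-families.fa"    # library produced by RepeatModeler2
cores=16                    # hypothetical number of CPUs available

# With rmblastn each -pa job uses 4 CPUs, so divide the budget by 4
pa=$(( cores / 4 ))
if [ "$pa" -lt 1 ]; then pa=1; fi

# Hypothetical helper: prints each command, and executes it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

run RepeatMasker -pa "$pa" -a -s -gccalc -gff -cutoff 200 -no_is \
    -lib "$lib" "$genome"
```

With cores=16 this gives -pa 4. As written the script only prints the command; set DRY_RUN=0 to launch RepeatMasker.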