# RepeatModeler2 / RepeatMasker
###### tags: `program commands`
## RepeatModeler2
[full documentation](http://repeatmasker.org/RepeatModeler/)
RepeatModeler2 identifies repeats and assembles consensus sequences from a genome assembly. It attempts a basic classification based on the Dfam database.
genome.fa (genome sequence) --> RM2 --> genome-families.fa (TE consensus sequences)
Example for a genome called `genome`:
#### 1. Build database for RM2
```
<RepeatModelerPath>/BuildDatabase -name genome genome.fa
```
#### 2. Run RM2
```
nohup <RepeatModelerPath>/RepeatModeler -database genome -pa 20 -LTRStruct >& run.out &
```
> `-LTRStruct`: enables the LTR structural discovery module of RM2
#### 3. Output file to keep is `genome-families.fa`
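As a quick sanity check on the library, the sketch below counts the consensus sequences and tallies them by classification. It assumes the usual RepeatModeler header layout (`>name#Class/Superfamily`, optionally followed by free text). The function names `count_families` and `classification_summary` are made up for this example.

```shell
# Hypothetical helpers (not part of RepeatModeler).
# Assumes FASTA headers of the form ">rnd-1_family-0#LTR/Gypsy ( ... )".

# Number of consensus sequences = number of FASTA header lines
count_families() {
    grep -c '^>' "$1"
}

# Families per classification, most frequent first
classification_summary() {
    grep '^>' "$1" \
        | sed -e 's/^>[^#]*#//' -e 's/[[:space:]].*$//' \
        | sort | uniq -c | sort -rn
}

# Only runs when the library is present in the current directory
if [ -f genome-families.fa ]; then
    echo "families: $(count_families genome-families.fa)"
    classification_summary genome-families.fa
fi
```

A large share of `Unknown` families is common for non-model genomes and does not by itself indicate a failed run.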
## RepeatMasker
full documentation
RepeatMasker annotates repeats in the genome using the library built and classified by RepeatModeler2 (`genome-families.fa`). The default search engine is rmblastn (a version of blastn modified for RepeatMasker).
```
nohup <RepeatMaskerPath>/RepeatMasker -pa 15 -a -s -gccalc -gff -cutoff 200 -no_is -lib genome-families.fa genome.fa
```
> `-pa`: number of parallel batch jobs. **WARNING**: with `rmblastn`, each job uses 4 cores, so `-pa 15` can occupy up to 60 CPUs!
> `-a`: writes the alignments to a `.align` file (needed for TE landscapes)
> `-s`: "slow"-search mode (recommended)
> `-gccalc`: computes the GC content
> `-gff`: produces a GFF track
> `-cutoff 200`: minimum alignment score (not length) to keep a hit when using `-lib` (RepeatMasker default: 225; 200 recommended)
> `-no_is`: skips the check for bacterial insertion sequences (prokaryotic TEs)
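
Because each `rmblastn` job uses 4 cores, it can help to derive `-pa` from the number of cores you are allowed to use. A minimal sketch, assuming 4 cores per job. `pa_for_cores` is a hypothetical helper, not a RepeatMasker option.

```shell
# Hypothetical helper: turn a core budget into a -pa value,
# assuming each rmblastn batch job uses 4 cores.
pa_for_cores() {
    cores=$1
    per_job=${2:-4}
    pa=$(( cores / per_job ))
    # Always allow at least one job
    if [ "$pa" -lt 1 ]; then
        pa=1
    fi
    echo "$pa"
}

# Dry run: print the suggested value for a 60-core budget
echo "Use -pa $(pa_for_cores 60)"
```

With a 60-core budget this suggests `-pa 15`, the value used in the command above. Pass a second argument to change the cores-per-job assumption if you use a different search engine.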