# HybPiper update and Gene Trees
## HybPiper
To activate the hybpiper environment:
`conda activate hybpiper`
## Running HybPiper
Make sure you are in the hybpiper environment before running any command.
#### Initial command
`hybpiper assemble -t_dna [targetfile].fasta -r [samplefile].fastq --prefix (NameOutputFile) --bwa --cpu N --targetfile_ambiguity_codes NMRWSYKVHD`
#### Command for while read loop
`while read name
do
hybpiper assemble -r $name.R1.paired.fastq $name.R2.paired.fastq -t_dna angiosperms353.abronia.fasta --prefix $name --bwa --cpu N --targetfile_ambiguity_codes NMRWSYKVHD
done < namelist.txt`
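
Because the loop launches one assembly per line of `namelist.txt`, it can help to confirm that both read files exist for every sample before starting. The following is a minimal sketch, assuming the `<sample>.R1.paired.fastq` / `<sample>.R2.paired.fastq` naming used in the loop; `check_reads` is a hypothetical helper name, not part of HybPiper.

```shell
# Sketch: report samples in a name list whose paired read files are missing or empty.
# Assumes reads are named <sample>.R1.paired.fastq and <sample>.R2.paired.fastq
# and sit in the current directory. check_reads is a hypothetical helper.
check_reads() {
    missing=0
    while read -r name; do
        [ -n "$name" ] || continue                  # ignore blank lines
        for mate in R1 R2; do
            if [ ! -s "$name.$mate.paired.fastq" ]; then
                echo "missing or empty: $name.$mate.paired.fastq"
                missing=1
            fi
        done
    done < "$1"
    return "$missing"                               # non-zero if any file was missing
}
```

Running `check_reads namelist.txt` prints nothing and returns success when every sample has both files.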
#### Extracting supercontigs
`while read name
do
hybpiper assemble -r $name.R1.paired.fastq $name.R2.paired.fastq -t_dna angiosperms353.abronia.fasta --prefix $name --bwa --cpu N --targetfile_ambiguity_codes NMRWSYKVHD --run_intronerate --no-blast --no-distribute --no-assemble
done < namelist.txt`
#### HybPiper stats
Before running, create `namelist.txt` (for example with nano) and list in it the name of each HybPiper output directory, one per line.
`hybpiper stats -t_dna angiosperms353.abronia.fasta gene namelist.txt`
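
As an alternative to typing the names by hand, the name list can be derived from the read files. This is a sketch that assumes the `<sample>.R1.paired.fastq` naming used in the assemble loop and that the sample name was used as the `--prefix`; `make_namelist` is a hypothetical helper name.

```shell
# Sketch: print one sample name per R1 read file found in a directory.
# Assumes files are named <sample>.R1.paired.fastq; adjust the suffix if yours differ.
# make_namelist is a hypothetical helper, not a HybPiper command.
make_namelist() {
    for f in "$1"/*.R1.paired.fastq; do
        [ -e "$f" ] || continue            # skip if the glob matched nothing
        basename "$f" .R1.paired.fastq     # strip the directory and the suffix
    done
}
```

For example, `make_namelist . > namelist.txt` writes the names for the current directory.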
#### Recovering heat map
`hybpiper recovery_heatmap seq_lengths.tsv`
/scratch/bot3404/sprice/fastq/501/fastafiles
# Gene Trees
## MAFFT
To infer a phylogeny, we first need to align the sequences. MAFFT takes unaligned sequences (amino acid or nucleotide) and produces a multiple sequence alignment.
First, make a new directory named MAFFT (`mkdir MAFFT`) to hold your output files.
##### Initial Command
`mafft --preservecase --maxiterate 1000 --localpair inputfile.fasta > MAFFT/outputfile.mafft.fasta`
##### Command for parallel
`parallel "mafft --preservecase --maxiterate 1000 --localpair Sequences2/{}.fasta > MAFFT/{}.mafft.fasta" :::: namelist.txt`
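
To see what per-name command the parallel run is meant to produce, the commands can be printed without running them. This is a sketch only: it assumes inputs are named `Sequences2/<name>.fasta`, and `mafft_commands` is a hypothetical helper name.

```shell
# Sketch: print (do not run) the MAFFT command for every name in a list.
# Each name takes the place that GNU parallel's {} placeholder would take.
# Assumes inputs live in Sequences2/ as <name>.fasta (an assumption about file layout).
mafft_commands() {
    while read -r name; do
        [ -n "$name" ] || continue         # ignore blank lines
        echo "mafft --preservecase --maxiterate 1000 --localpair Sequences2/$name.fasta > MAFFT/$name.mafft.fasta"
    done < "$1"
}
```

Calling `mafft_commands namelist.txt` lists one command per name, which makes it easy to spot a wrong path before launching the real job.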
## trimAl
After alignment, the sequences will contain gaps and poorly aligned regions that should be trimmed before building the tree. trimAl removes spurious sequences and poorly aligned regions from the alignment.
First, make a new directory named TRIMAL (`mkdir TRIMAL`) to hold your trimAl output files.
##### Base Command
`trimal -in <inputfile> -out <outputfile> -(other options)`
##### Options for Trimal
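`-gt` (gap threshold) sets the minimum fraction of sequences that must have no gap at an alignment column for that column to be kept. For example, `-gt .5` removes columns in which more than half of the sequences have a gap.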
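
To make the gap threshold concrete, the sketch below computes, for each column of a small alignment, the fraction of sequences without a gap and marks the column `keep` or `drop`. It is an illustration only, not a replacement for trimAl: it assumes each sequence sits on a single line, `gap_fractions` is a hypothetical helper name, and treating a column exactly at the threshold as kept is an assumption.

```shell
# Sketch: per-column fraction of sequences WITHOUT a gap, compared to a threshold.
# $1 = aligned FASTA with one line per sequence, $2 = threshold (defaults to 0.5).
# Illustration of the -gt idea only; boundary handling (>=) is an assumption.
gap_fractions() {
    awk -v gt="${2:-0.5}" '
        /^>/ || !NF { next }                         # skip headers and blank lines
        {
            n++                                      # count sequences
            len = length($0)
            for (i = 1; i <= len; i++)
                if (substr($0, i, 1) != "-") nongap[i]++
        }
        END {
            for (i = 1; i <= len; i++) {
                frac = nongap[i] / n
                printf "%d\t%.2f\t%s\n", i, frac, (frac >= gt ? "keep" : "drop")
            }
        }' "$1"
}
```

The output has three tab-separated fields per column: column number, fraction of gap-free sequences, and the keep/drop decision.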
##### Command for one sample
`trimal -in MAFFT/$name.mafft.fasta -out TRIMAL/$name.trimal.fasta -gt .5`
##### Command using parallel
`parallel -j 10 --eta trimal -in MAFFT/{}.mafft.fasta -out TRIMAL/{}.trimal.fasta -gt .5 :::: namelist.txt`
## IQ-TREE
IQ-TREE takes a multiple sequence alignment as input and infers the maximum-likelihood phylogeny, the tree that best explains the data.
Run the command from the directory that contains your trimAl output.
##### Initial Command
`iqtree -s $name.fasta`
`-s` specifies the alignment file.
The three main output files are:
```
$name.fasta.iqtree
$name.fasta.treefile
$name.fasta.log
```
##### Command for parallel
`parallel --eta -j 10 iqtree -s fastafiles/{}_supercontig.trimal.fasta -m MFP -B 1000 :::: namelist.txt`
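## ASTRAL
ASTRAL estimates a species tree from a set of gene trees. The input file should contain all of the gene trees in Newick format, one tree per line.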
#### Base command
`java -jar /opt/apps/Software/ASTRAL/Astral/astral.5.6.3.jar -i inputfilename.tre > outputfilename.astral.tre`
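
ASTRAL reads all gene trees from a single file, so the per-gene `.treefile` outputs from IQ-TREE need to be combined first. The following is a minimal sketch; the directory and output file names are placeholders, and `collect_gene_trees` is a hypothetical helper name.

```shell
# Sketch: concatenate IQ-TREE .treefile outputs into one input file for ASTRAL.
# $1 = directory holding the .treefile files, $2 = combined output file.
# collect_gene_trees is a hypothetical helper; paths are placeholders.
collect_gene_trees() {
    : > "$2"                               # start with an empty output file
    for t in "$1"/*.treefile; do
        [ -s "$t" ] || continue            # skip missing or empty tree files
        cat "$t" >> "$2"
    done
    wc -l < "$2" | tr -d ' '               # print the number of lines (one tree per line)
}
```

For example, `collect_gene_trees fastafiles genetrees.tre` prints how many trees were collected, and the combined file can then be passed to ASTRAL with `-i`.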