# Bi278 Lab 7 Phylogenetics
### By Lee Ferenc 11/7/2023
* Whole Genome phyolgenies: `andi`
* Single Genes: `iqtree`
* Visualizing and manipulating trees: `R`
## 1. `andi` (***An***chor ***Di***stances)
Alightment-free method of genearating a phylogeny. `andi` works by idenitifying long exact matches (anchors) to use as homologous segemnets between two genomes. Then it looks at the mismatches regions and estimates a distance matrix.
### Step 0: Make a directory
### Step 1: Prepare an input files that contains all of the **genomes** to build a phylogeny with.
```
ls /courses/bi278/Course_Materials/lab_07/*.*a > andi_inputs.txt
```
### Step 2: Run `andi`
```
andi --file-of-filenames=andi_inputs.txt -t 2 --join > andi_output.dist
```
Note that it will ignore the file names after the first `.` so be wary if you use `.` and rename the files with an `_` or something else.
My example:
```
andi --file-of-filenames=andi_Kinputs.txt -t 2 --join > /home2/enfere24/lab_07/andi_output.txt
```
## 2. IQ-TREE
It estimates trees from either nucleotide or protein alignments using maximum likelihood.
### Step 1: Get a MSA
Lab example: (it combines two different fasta squences and then makes an MSA using `mafft` (see lab 4))
```
cat /courses/bi278/Course_Materials/lab_07/ncbi_sensor_kinase.txt /courses/bi278/Course_Materials/lab_07/orthofinder_18taxa/Orthogroup_Sequences/OG0002162.fa > OG0002162_plus_ncbi.fa
mafft --genafpair --maxiterate 1000 OG0002162_plus_ncbi.fa > OG0002162_mafft.faa
```
###### Here I decided to use the lab example due to it being annoying grab the specific locations of proteins from the annotated genomes. I'd like to take a gene that's across every strain. I reran the lab data for andi as well.
### Step 2: Run `iqtree`
```
iqtree -s FILE.faa
```
The output file has `.treefile` at the end and you can preview of what the tree looks like with the file+`.iqtree`
#### Lab example:
```
iqtree -s OG0002162_mafft.faa
```
## 3. Trees in `R`
### Step 1: Go to RStudio at: [bi278.colby.edu](https://bi278.colby.edu)
### Step 2: Load phylogeny packages
```
library(ape)
library(maps)
library(phytools)
```
(Maps was needed)
### Step 3: Take `andi` distance matrix and make neighbor-joining tree
```
andi.dist <- read.table("PATH/andi_output.dist", header=F, row.names=1, skip=1)
colnames(andi.dist) <- rownames(andi.dist)
# infer tree from distance matrix
andi.tree <- nj(as.dist(andi.dist))
```
### Step 4: Plot tree
```
plot(andi.tree)
```
My tree (which isn't great)

### Step 5: Clean-Up
#### Check tip labels
```
andi.tree$tip.label
```
#### Sub patterns (gets rid of things like "_nanoassembly")
```
temp <- gsub(andi.tree$tip.label, pattern="_nanoassembly", replacement="")
temp2 <- gsub(temp, pattern="_ncbi", replacement="")
```
#### Sub inidividual elements
```
temp2[1] <- "bh69"
```
#### Sub ranges
```
temp2[13:14] <- c("bmall", "bpseu")
```
#### Clean-up inconsistencies
```
temp3 <- gsub(temp2, pattern="qs", replacement="")
```
#### Replace labels
```
andi.tree$tip.label <- temp3
```
#### Clean-up temp files
```
rm(temp, temp2, temp3)
```
### Step 6: Plot again
```
plot(andi.tree)
nodelabels()
```

Don't know if that's *better*. Especially with the outgroups being at 34 and not at the root.
### Step 7: Root tree with outgroup
```
rooted.andi <- root.phylo(andi.tree, node=34, resolve.root=T)
plot(rooted.andi, cex=1, label.offset=0.005)
```

### Step 8: look at the species tree outputted by orthofinder.
#### Make tree
```
ortho.tree <- read.tree("/courses/bi278/Course_Materials/lab_07/orthofinder_18taxa/Species_Tree/SpeciesTree_rooted.txt")
plot(ortho.tree)
```

#### fix tip labels
```
ortho.tree$tip.label
temp <- gsub(ortho.tree$tip.label, pattern="_prokka", replacement="")
temp2 <- gsub(temp, pattern="qs", replacement="")
temp2[6:7] <- c("Bmallei", "Bpseudomallei")
```
#### replace labels
```
ortho.tree$tip.label <- temp2
rm(temp, temp2)
```
# root
```
plot(ortho.tree)
nodelabels()
rooted.ortho <- root.phylo(ortho.tree, node=25, resolve.root=T)
plot(rooted.ortho, cex=1, label.offset=0.005)
```


### Step 8: Rotate nodes so tips = andi tree
```
rotate.ortho <- rotateConstr(rooted.ortho, rooted.andi$tip.label)
plot(rotate.ortho)
```
### Step 9: Compare trees!
```
# compare topology (branching pattern)
all.equal(andi.tree, ortho.tree, use.edge.length=FALSE)
# plot both trees side by side
cophyloplot(rooted.andi, rotate.ortho)
```
I didn't alter all the labels in the andi tree. Howver it's not super bad, I still understand everything's the same except Sprentia and Pterrae and also the branches got flipped in the middle? No idea why considering I copied everything from the lab. Wait I didn't copy everything, I think the files are slightly different

### Step 10: Look at gene tree
#### Make the tree and check it out
```
iq.tree <- read.tree("/home2/enfere24/lab_07/OG0002162_mafft.faa.treefile")
plot(iq.tree)
nodelabels()
```
Yikes

#### Root it
```
rooted.iq <- root.phylo(iq.tree, node=64, resolve.root=T)
plot(rooted.iq, cex=0.75, label.offset=0.005)
```

### Step 11: Clean-up workspace for next time and save ur work
(did not do, I don't want these files)
```
write.tree(rooted.andi, file="PATH/andi_cleaned.tre")
write.tree(rotate.iq, file="PATH/iq_sensorkinase.tre")
```
## 4. Modify your tree with `dendextend`
### Step 0: Load packages
```
library(DECIPHER)
library(dendextend)
```
### Step 1: Read in a tree and plot
```
iq.dend <- iq.dend <- ReadDendrogram("/home2/enfere24/lab_07/OG0002162_mafft.faa.treefile")
plot(iq.dend)
```

### Step 2: Adjust plot
```
par(mar = c(8,4,2,2) + 0.1) #bottom, left, top, right; default is c(5,4,4,2)
plot(iq.dend)
```

### Step 3: Make order of taxa on dendrogram identical to what you see in plot window
```
iq.dend %>% labels
```
### Step 4: Color leaves (with circles!)
```
iq.dend %>% set("leaves_pch", 19) %>% set("leaves_cex", 1) %>% set("leaves_col", c(rep(1,15),2,2,2,2,2,3,3,3,4,4,4,4,4,4,4,4,5,5)) %>% plot(type="rectangle")
```
### Step 5: Turn sideways and add fancy colors
```
par(mar = c(4,2,2,8))
iq.dend %>% set("leaves_pch", 15) %>% set("leaves_cex", 1) %>% set("leaves_col", c(rep(1,15),2,2,2,2,2,3,3,3,4,4,4,4,4,4,4,4,5,5)) %>% plot(type="rectangle", horiz=TRUE)
```

### Step 6: Change text color
```
iq.dend %>% set("labels_colors", c(rep(1,15),2,2,2,2,2,3,3,3,4,4,4,4,4,4,4,4,5,5)) %>% plot(type="rectangle", horiz=TRUE)
```

### Step 7: Reset margins
```
par(mar = c(5,4,4,2))
```