# Exercise 4 - Lexi Hammer
### Problem 1
How many specimens are present in the `Bombus.forewings` dataset? What species are included?
- There are 99 specimens in the dataset. The species included are:
  - vag
  - imp
  - bimac
  - terri
  - tem
  - bor
  - ferv
  - san
```
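# Number of specimens recorded in the dataset
Bombus.forewings$specimen.number
# Print the full object to inspect its contents, including the species labels
Bombus.forewings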
```
#### Challenge 1
And how many specimens represent each species? (Hint: You could simply count each, but that would be tedious. The goal here is a computational solution. Try looking into the use of the R function `by`.)
```
# Pull the specimen metadata out of the dataset
BombusData <- Bombus.forewings$metadata
# List the species label assigned to each specimen
BombusData$species
```

- The specimens represent 8 different species.
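
The per-species tally the challenge asks for can be sketched as follows. This is a minimal example on a made-up data frame: `toy_metadata` and its values are hypothetical stand-ins for `Bombus.forewings$metadata`, and only base R functions (`table` and `by`) are used.

```r
# Toy stand-in for Bombus.forewings$metadata (hypothetical values, not the real data)
toy_metadata <- data.frame(
  specimen = 1:6,
  species  = c("vag", "imp", "vag", "bimac", "imp", "vag")
)

# table() tallies how many rows carry each species label
species_counts <- table(toy_metadata$species)
print(species_counts)

# by() splits the data frame by species and applies a function to each piece;
# nrow() then gives the number of specimens in each piece
counts_by <- by(toy_metadata, toy_metadata$species, nrow)
print(counts_by)
```

Assuming the real metadata has a `species` column, as the code above suggests, `table(BombusData$species)` should give the same kind of tally for the actual dataset.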
### Problem 2
Perform generalized Procrustes alignment of the `Bombus.forewings` dataset, using the methods and tools we discussed. Be aware that some data curation may be necessary. Use your discretion and check whether it is necessary to omit or reflect specimens, or to correct landmark errors made during digitization. Be sure to document any manual corrections in the data provenance.
```
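# Generalized Procrustes alignment with the default settings
fore_align <- align.procrustes(Bombus.forewings)
# Repeat the alignment with outlier analysis turned on to screen for problem specimens
fore_align <- align.procrustes(Bombus.forewings, outlier.analysis = TRUE)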
```

###### Image of the Procrustes alignment figure produced in RStudio
In your response to these questions, include an image of the aligned landmarks (the plot with gray and black dots generated by the `align.procrustes` function) and the data provenance (in markdown format) for the GPA step and any steps of data curation.
```
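# Inspect the provenance entries recorded so far
names(Bombus.forewings$provenance)
cat(Bombus.forewings$provenance$read.tps)
# Check the dimensions of the coordinate array and view the first specimen
dim(Bombus.forewings$coords)
Bombus.forewings$coords[,,1]
plot(Bombus.forewings$coords[,,1])
# Plot landmarks with the default settings, then for specimen 1, then for specimens 1 to 4
landmark.plot(Bombus.forewings)
landmark.plot(Bombus.forewings, specimen.number = 1)
landmark.plot(Bombus.forewings, specimen.number = 1:4)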
```

###### Figure of four landmark plots side by side showing how similar the placements were
### Problem 3
Use the function `procrustes.distance` to compare the shapes of the first 3 specimens in the `Bombus.forewings` dataset, after Procrustes alignment. Which two specimens are most similar?
```
procrustes.distance(Bombus.forewings$coords[,,1], Bombus.forewings$coords[,,2])
procrustes.distance(Bombus.forewings$coords[,,1], Bombus.forewings$coords[,,3])
procrustes.distance(Bombus.forewings$coords[,,2], Bombus.forewings$coords[,,3])
```
- The distance between specimens 1 and 2 is 12.28647, between 1 and 3 is 6.686372, and between 2 and 3 is 7.578541. Specimens 1 and 3 are therefore the most similar, because the distance between them is the smallest.
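
The comparison above (smallest distance means most similar) can be sketched in base R. This assumes the distance between two configurations is the square root of the summed squared coordinate differences. Whether `procrustes.distance` uses exactly this definition is an assumption, and `shape_distance` and the toy coordinates are hypothetical.

```r
# Assumed definition: square root of the summed squared coordinate differences
# between two configurations with the same landmarks (rows) and dimensions (columns)
shape_distance <- function(a, b) {
  stopifnot(all(dim(a) == dim(b)))
  sqrt(sum((a - b)^2))
}

# Hypothetical 3-landmark, 2D configurations standing in for aligned specimens
coords <- array(
  c(0, 1, 0,  0, 0, 1,       # specimen 1: x values, then y values
    0, 1, 0,  0, 0, 2,       # specimen 2
    0, 1, 0,  0, 0, 1.25),   # specimen 3
  dim = c(3, 2, 3)
)

# Fill a specimen-by-specimen distance matrix
n <- dim(coords)[3]
d <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- shape_distance(coords[, , i], coords[, , j])
  }
}
print(d)

# Ignore the zero diagonal, then locate the pair with the smallest distance
diag(d) <- NA
closest <- which(d == min(d, na.rm = TRUE), arr.ind = TRUE)[1, ]
print(closest)
```

Building the full matrix scales to any number of specimens, so the same pattern would work beyond the first three.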
#### Challenge 3
We discussed some of the considerations in choosing landmarks. The function `procrustes.jackknife` will test how variation in the location of each landmark affects the outcome of Procrustes alignment. It does this by one-by-one removing landmarks from the dataset and repeating GPA. After each iteration, the median pairwise Procrustes distance among all specimens is determined. The results are output as a table and a plot. (You can include an argument to `links` in this function too.)
Run this function on your curated coordinate data from `Bombus.forewings` (not on the GPA-aligned data). Include the resulting image in your answer. Which landmarks are more variable? What explanations exist for these results?
```
library(viridis)
procrustes.jackknife(Bombus.forewings)
```

###### Image of the Procrustes jackknife figure showing the median pairwise Procrustes distance after removal of each landmark
- Landmarks 5 and 17 are the most variable. One possible explanation is that the apparent position of these two points depends more on the angle at which the wing was photographed. This can also be seen in the figure above showing the four landmark plots, where the placements of landmarks 5 and 17 differ noticeably between specimens.
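
The jackknife idea described in the challenge (drop one landmark, repeat GPA, record the median pairwise distance) can be sketched in base R. This is a simplified illustration, not the `procrustes.jackknife` implementation: `standardize`, `rotate_onto`, `simple_gpa`, `median_pairwise_distance`, and the simulated data are all hypothetical, and the rotation step does not exclude reflections.

```r
# Centre a configuration on the origin and scale it to unit centroid size
standardize <- function(m) {
  m <- sweep(m, 2, colMeans(m))
  m / sqrt(sum(m^2))
}

# Least-squares rotation of m onto ref via SVD (may include a reflection)
rotate_onto <- function(m, ref) {
  s <- svd(t(ref) %*% m)
  m %*% s$v %*% t(s$u)
}

# Minimal GPA: standardize each specimen, then repeatedly rotate all to the mean shape
simple_gpa <- function(a, iterations = 5) {
  n <- dim(a)[3]
  for (i in seq_len(n)) a[, , i] <- standardize(a[, , i])
  for (k in seq_len(iterations)) {
    ref <- standardize(apply(a, c(1, 2), mean))
    for (i in seq_len(n)) a[, , i] <- rotate_onto(a[, , i], ref)
  }
  a
}

# Median of all pairwise distances among aligned specimens
median_pairwise_distance <- function(a) {
  pairs <- combn(dim(a)[3], 2)
  d <- apply(pairs, 2, function(p) sqrt(sum((a[, , p[1]] - a[, , p[2]])^2)))
  median(d)
}

# Simulated data: 5 landmarks, 8 specimens, with landmark 5 deliberately noisy
set.seed(1)
base_shape <- cbind(c(0, 2, 4, 3, 1), c(0, 0, 1, 3, 2))
n_specimens <- 8
toy <- array(NA_real_, dim = c(5, 2, n_specimens))
for (i in seq_len(n_specimens)) {
  noise <- matrix(rnorm(10, sd = 0.02), 5, 2)
  noise[5, ] <- rnorm(2, sd = 0.5)
  toy[, , i] <- base_shape + noise
}

# Jackknife: remove each landmark in turn, realign, and record the median distance
jackknife <- sapply(seq_len(5), function(j) {
  aligned <- simple_gpa(toy[-j, , , drop = FALSE])
  median_pairwise_distance(aligned)
})
names(jackknife) <- paste0("LM", 1:5)
print(round(jackknife, 4))
```

In this toy setup, removing the noisy landmark should give the lowest median distance, which is the pattern that marks a landmark as highly variable.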