# Exercise 5. Procrustes distance and alignment
###### tags: `exercise`
## Instructions
- :warning: **Be sure to include your name on your answers!** :warning:
- Below are Problems which should all be achievable using the information in the tutorial. Challenges will require you to find resources or answers on your own. (Hint: Try googling "how to … in R".)
- Create a HackMD note to share your answers to these prompts. Copy the prompts, add your code and explanatory text and images. Use markdown formatting to distinguish the prompts from the code and explanatory text of your answers.
- Please complete this assignment individually. Discussing the problems with others is acceptable. However, keep in mind that your goal here should be to build a skill you can use going forward. Copying code you don't understand will not help you in the long run!
- Submit your answers by sharing the note with Dr. Angelini before Friday at dawn.
---
The `borealis` package comes with several built-in datasets. The object `Bombus.forewings` contains a dataset of bumblebee forewing shapes digitized using the same landmarks you recently used. The same matrix of "links" will work to highlight the anatomy of these wings. Many functions, including `align.procrustes`, will also accept the `links` argument.
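If the idea of a "links" matrix is unfamiliar, here is a minimal sketch in base R. It assumes the common convention of a two-column matrix in which each row names a pair of landmarks to join. The objects `toy.coords` and `toy.links` are made up for illustration and are not part of `borealis`. Check the package documentation for the exact format its functions expect.

```r
# Toy landmark coordinates: 4 landmarks (rows) with x and y columns.
# These values are invented for illustration only.
toy.coords <- matrix(c(0, 0,
                       2, 0,
                       2, 1,
                       0, 1), ncol = 2, byrow = TRUE)

# Assumed convention: each row names two landmarks to join with a line segment
toy.links <- matrix(c(1, 2,
                      2, 3,
                      3, 4,
                      4, 1), ncol = 2, byrow = TRUE)

# Plot the landmarks, then draw one segment per row of the links matrix
plot(toy.coords, asp = 1, pch = 16, xlab = "x", ylab = "y")
segments(toy.coords[toy.links[, 1], 1], toy.coords[toy.links[, 1], 2],
         toy.coords[toy.links[, 2], 1], toy.coords[toy.links[, 2], 2])
```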
### Problem 1
How many specimens are present in the `Bombus.forewings` dataset?
What species are included?
### Problem 2
How many specimens represent each species? (Hint: You could simply count each, but that would be tedious. The goal here is a computational solution. Try looking into the use of the R functions `table` or `by`.)
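As a starting point, here is a small sketch of how `table` and `by` behave on a toy data frame. The data frame `toy` and its column names are hypothetical. They are not the structure of `Bombus.forewings`, which you will need to explore yourself.

```r
# Toy metadata table; the names and values are invented for illustration
toy <- data.frame(
  id      = c("a1", "a2", "a3", "a4", "a5"),
  species = c("sp.A", "sp.A", "sp.B", "sp.A", "sp.B"),
  length  = c(10.2, 11.0, 9.5, 10.7, 9.9)
)

# table() tallies how often each value occurs in a vector
counts <- table(toy$species)
print(counts)

# by() splits data by a grouping factor and applies a function to each group
mean.length <- by(toy$length, toy$species, mean)
print(mean.length)
```

`table` is usually the simpler choice when all you need is a count. `by` is more general, since it can apply any function to each group.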
### Problem 3
Perform generalized Procrustes alignment of the `Bombus.forewings` dataset, using the methods and tools we discussed. Be aware that some data curation may be necessary. Use your discretion and check whether it is necessary to omit or reflect specimens, or to correct landmark errors made during digitization. Be sure to document any manual corrections in the data provenance.
In your response to these questions, include an image of the aligned landmarks (the plot with gray and black dots generated by the `align.procrustes` function) and the data provenance (in markdown format) for the GPA step and any steps of data curation.
#### Challenge 3
After checking lab notebooks you discover that landmarks 15 and 16 were incorrectly placed in specimen FJ190827-012. Address this issue as you complete Problem 3 and include coverage of your solution in the data provenance.
### Problem 4
Use the function `procrustes.distance` to compare the shapes of the first 3 specimens in the `Bombus.forewings` dataset, after Procrustes alignment. Which two specimens are most similar?
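To build intuition for what this comparison measures, here is a rough base R sketch. It assumes that, once shapes are superimposed, the distance between two of them can be taken as the square root of the summed squared differences between corresponding landmark coordinates. The function `toy.distance` is a hypothetical helper and is not how `procrustes.distance` is necessarily implemented. The toy shapes are treated as if they were already aligned.

```r
# Two toy 2D shapes with 4 landmarks each (rows = landmarks, columns = x, y).
# Assumption: both are treated as already superimposed.
A <- matrix(c(0, 0,
              1, 0,
              1, 1,
              0, 1), ncol = 2, byrow = TRUE)
B <- A
B[3, ] <- c(1.1, 0.9)   # move one landmark slightly

# Hypothetical helper (not part of borealis): square root of the
# summed squared coordinate differences between two configurations
toy.distance <- function(s1, s2) sqrt(sum((s1 - s2)^2))

d.same <- toy.distance(A, A)   # identical shapes
d.diff <- toy.distance(A, B)   # shapes differing at one landmark
print(c(d.same, d.diff))
```

Smaller values indicate more similar shapes, so the most similar pair of specimens is the one with the smallest distance.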
#### Challenge 4
We discussed some of the considerations in choosing landmarks. The function `procrustes.jackknife` will test how variation in the location of each landmark affects the outcome of Procrustes alignment. It does this by removing landmarks from the dataset one at a time and repeating GPA. After each iteration, the median pairwise Procrustes distance among all specimens is determined. The results are output as a table and a plot. (You can pass a `links` argument to this function too.)
Run this function on your curated coordinate data from `Bombus.forewings` (not on the GPA-aligned data). Include the resulting image in your answer. Which landmarks are more variable? What explanations exist for these results?
---
**Quick Links** | [BI377 Moodle](https://moodle.colby.edu/course/view.php?id=33250) | [BI377 HackMD](https://hackmd.io/@ColbyBI377/landingpage) | [rstudio.colby.edu](https://rstudio.colby.edu/)
---