lydiafrance
# Shape analysis Bird flight

# LINKS

DATA: https://unioxfordnexus-my.sharepoint.com/:f:/r/personal/pemb4571_ox_ac_uk/Documents/AriannaAndreasLydia_Data?csf=1&web=1&e=exwu3p

CODE:

# Hawk Trajectories

### Experiments & Data

Four hawks (3 young, 1 adult) flying in a room between two perches, ~1600 flights. Four perch-perch distances: 12m, 9m, 7m, 5m. They can take any route they like between the perches, but they consistently flew very similar trajectories. When the perch-perch distance is large (12m), the trajectories are U-shaped and the hawks graze the floor. When the perch-perch distance is short, they fly a shallower U-shape and don't lose much height.

Data is from a backpack carrying a rigid pattern of markers -- the average position of these markers was used as the "mean backpack" for each frame in time. The coordinate system is based on the room and the origin is the target perch.

<img src="https://i.imgur.com/pvAPFZM.png" alt="" width="350"/>

> Flights between two perches. Change in the curves for different flight distances.

<img src="https://i.imgur.com/WxVFS9p.png" alt="" width="1400"/>

> Bootstrapped confidence interval of the mean for each individual hawk, different perch distances.
> Different shape for each individual.

* Bird names
  - Bird 1: Drogon ()
  - Bird 2: Rhaegal (?)
  - Bird 3: Ruby (biggest, least data)
  - Bird 4: Toothless (bigger than 1 & 2, not fastest)

## Research Questions

* **What variation is there for a single individual doing the same perch-perch distance over many flights?**
* **What variation is there between individuals doing the same perch-perch distance? (4 individuals)**
* **What variation is there across the same individual doing different perch-perch distances?
(4 distances)**
* **Generally, are parts of the trajectory more or less variable?**

> We know what is being optimised and we know the space of possibilities is constrained, so this is a good place to start with trying out shape analysis for trajectory/animal/movement data.

### Previous Work

We have simple simulations of their flights to find out what is feasible and to test different optimisations. The hawks are not choosing the fastest route to the perch, and they are not choosing the most energy-efficient route. They are optimising to minimise how much stall they encounter at the end -- i.e. they make sure they have plenty of speed at a particular position.

So far this work has boiled down the trajectory to a single position -- the lowest point on the curve.

# Baby Bird Learning Trajectories

### Experiments & Data

~3000 flights by ~20 zebra finches from their first flight (Day 1) to Day 100, with variable intervals of recording. 10 flights per session, where the finch is encouraged to fly to a perch that stays in the same place so that they become familiar with it. The perch has a force meter in it to measure their landing and also their weight as they grow. The birds wear a marker on their head and one on their body (so we have a body vector) and are recorded with motion capture.

This data needs some cleaning up before it is ready for analysis. Work in progress!

## Research Questions

* **Are the weight curve and the landing force curve over time related or unrelated?** They grow but they also get better at the task over time.
* **How does the trajectory shape change as they improve over time?** The fledglings start by flapping the whole way to the perch, then as they grow they start to do undulating flight (closing the wings and starting to freefall), which they then improve at as well.
  <img src="https://www.sibleyguides.com/wp-content/uploads/Flight_style_diagrams_HOSP1.jpg" alt="" width="400"/>

  > Flap-bounding/undulating flight
* **Can we see how landing forces might be influenced by the trajectory shape?** What metric are they improving in their approach?
* **What is the individual variation in flight trajectories?**

# Wings

### Experiments & Data

<img src="https://i.imgur.com/DfiLcpB.png" alt="" width="400"/>

> Marker locations on the hawk

Same hawk flights back and forth between perches. However, in addition to the backpack there are other markers on the hawks: 3 per wing and 2 on the outer edges of the tail.

This could be treated as each marker having its own trajectory... or as a low-res ~~surface~~ curve.

<img src="https://i.imgur.com/4JBKbkI.png" alt="" width="400"/>

> Some example "curves" of the wing shape from the data set. The wings are in an outstretched configuration (green and magenta) and a dropped configuration (during flapping) in red and blue.

---

![](https://i.imgur.com/y2O4AZV.png)

> Markers from hundreds of flights around the hawk during flapping -- the shape of the motion

---

![](https://i.imgur.com/1SovDGl.png)
<img src="https://i.imgur.com/DfiLcpB.png" alt="" width="400"/>

> Bootstrapped confidence interval around the median for each of the markers, the time series information. Each flap in the series is unique, but consistent across hundreds of flights.

**Additional Dataset**

45 flights at higher resolution, with 45 markers on the wings and tail, giving a polygon of the hawk outline. Each frame has a point cloud of the hawk, but the markers are not labelled/identified as trajectories over time.
## Research Questions

* **How much does a flap vary by the same individual?**
* **How much does a flap vary across individuals?**
* **Can we find the key motions in flight?**
* **A more meaningful way to classify/measure/describe/quantify wing motion?**

### Previous Work

Working on a final draft for a paper on PCA of the shape of the bird, which treats each frame as independent in time and asks what the PCs are that describe the variation in morphing wing flight. The engineers want to call this "the control space for morphing wing flight" but I'm nervous.

Trying tangent PCA...!?

### Mathematical Framework for Wing Data

Before we are able to answer the research questions above we must cast the questions in the appropriate mathematical framework. The data can be embedded in different spaces, each with different results and difficulties (mathematically and - in particular - computationally). The object we are studying is nominally a 3D object but our data consists of points parameterised by time.

**Wing Data as Curves**

There is an option to study the wing data for one flight as a family of curves in time (one for each marker), for which we have elastic metrics to compute statistics. If we have, say, 6 markers on the hawks, then each shape consists of 6 curves, and our mean would be 6 curves. Note that the shape space is that of single curves, not of 6-tuples of curves! It is conceivable that we could compute correlations between these 6 curves, but I am not sure the result can be interpreted meaningfully.

**Wing Data as a 3D object**

The second, possibly more attractive, option is to consider the points as samples of a discretised 3D shape: varifolds are suitable for this, I think. I omit a description of varifolds, as it is a little bit of a mouthful, but in essence the representation is a union of surfaces stuck together. This is much like in your animation, which is why I consider it an interesting option.
In light of my comment in the "Previous Work" section, your existing work could be greatly leveraged here and, importantly, compared against using shape analysis tools. It would be a great win for shape analysis if it clearly outperforms cruder representations! This is important because we cannot attribute meaning to tangent PCA without characterising what the fundamental objects we are studying are.

# Data Summary

The OneDrive data contains the following:

## Hawk trajectory project

`fullTrajectoryData.csv`: data for all trajectories for each perch distance: 5, 7, 9, 12m.

Columns:
- `trajx trajy trajz`: The trajectory of the flight from the backpack centroid, relative to the destination perch.
- `nadirHorz nadirVert`: the single position of the lowest point in the trajectory
- `TimeFromStart`: the time data using 0 as the beginning of flight, which coincides with the end of the jump.

The rest is aerodynamics nonsense so just ignore that.

*Note:* the trajectory information is also available in the wing data below, where it is called `backpack_XYZ`.

## Baby bird project

No data yet, as indicated in the section above.

## Wing project

### Wing Shape Data

- `DrogonMarkers_allFlights.csv`
- `RhaegalMarkers_allFlights.csv`
- `RubyMarkers_allFlights.csv`
- `ToothlessMarkers_allFlights.csv`

These have been separated by hawk individual to make the files smaller. They are concatenated in:

- `allBirdsMarkers_allFlights.csv`

Each row is a frame where one side of the hawk is visible. So a row has **all** the following markers:
- wingtip
- primary
- secondary
- tailtip

There is another column, `Left`, which says whether these markers are from the right or left of the hawk. If they are from the left, the coordinates have been mirrored.

In a perfect frame where all markers from the entire hawk are visible and labelled, there will be two rows with the same frame number: one row is "left" and the other "right".
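The left/right row convention described above can be sketched on a toy table. This is only an illustration: `frameID` and `wingtip_x` are hypothetical column names (only `Left` is named in these notes), and encoding `Left` as True/False is an assumption about the CSV files.

```python
import pandas as pd

# Toy stand-in for a few rows of a `*Markers_allFlights.csv` file.
# `frameID` and `wingtip_x` are hypothetical column names; only `Left` is
# named in the notes, and its True/False encoding is assumed here.
rows = pd.DataFrame(
    {
        "frameID": [1, 1, 2, 3, 3],
        "Left": [True, False, True, False, True],
        "wingtip_x": [0.51, 0.49, 0.47, 0.52, 0.50],
    }
)

# A frame is "perfect" when it has both a left row and a right row.
sides_per_frame = rows.groupby("frameID")["Left"].nunique()
complete_frames = sides_per_frame[sides_per_frame == 2].index

both_sides = rows[rows["frameID"].isin(complete_frames)]
one_side_only = rows[~rows["frameID"].isin(complete_frames)]
```

Splitting the table this way may help when deciding whether to treat each half-hawk row as its own shape or to keep only frames where both sides are available.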
Frames where one side was missing **any** of the markers have been removed -- so if there was a frame where all the markers are visible except for the right tailtip, there will only be a row with "left" markers. The right ones are completely excluded.

Columns:
- `HorzDistance`: the horizontal distance of the backpack centroid to the destination perch
- `VertDistance`: as above but vertical
- `bodypitch`: whole body rotation in pitch
- `wingtip_rot_xyz`: the position of the wingtip, relative to the backpack centroid, and with pitch rotation corrected.
- `primary_rot_xyz, secondary_rot_xyz, tailtip_rot_xyz`: as above
- `Left`: whether this row of markers has been mirrored (left side of the hawk) or not.

### Raw Marker Data

`fullwingdata.csv`: This is raw data where each row is an individual marker in a single frame. These can be:
- left_wingtip, right_wingtip
- left_primary, right_primary
- left_secondary, right_secondary
- left_tailtip, right_tailtip

This is complete data -- if only a `left_wingtip` was visible for a single frame in an entire flight and nothing else, that is still in this data set.

This data set is useful if you want to analyse a given marker on the hawk moving in time. The other data sets are helpful if you want to consider the markers as part of a surface/shape.

Columns:
- `XYZ`: each feather marker in the global coordinate system
- `MarkerName`: 3 per wing, 2 on the tail
- `HorzDistance`: horizontal distance to the destination perch
- `backpack_XYZ`: the centroid of the backpack of the hawk, global coordinate system
- `xyz`: the feather marker relative to the backpack centroid
- `rot_xyz`: the feather marker relative to the backpack centroid, and with whole body pitch rotation removed
- `flightPhase`: rough estimation as to which kind of flight behaviour is going on (big flapping, small flapping, gliding, landing)

Most meaningful wing data is therefore in `rot_xyz`.

---

Flights (is the data as explained in the Wings section?)
for four birds:

- `DrogonMarkers_allFlights.csv`
- `RhaegalMarkers_allFlights.csv`
- `RubyMarkers_allFlights.csv`
- `ToothlessMarkers_allFlights.csv`

`fullWingData.csv`: contains the union of the files above? If not, how is this different?

I think I am getting this wrong since these birds were mentioned in the section about trajectories. However, I do not see any perch information in these CSV files, so I must be missing something.

**Unsorted**

`allBirdsMarkers_allFlights`: not sure where this belongs? How is it different from the `{Drogon, Rhaegal, Ruby, Toothless}Markers_allflights.csv` and `fullWingData.csv`?

---

# Mathematics

Shape analysis comprises the mathematical description and evolution of shapes. There are, roughly speaking, two main aspects to this field:

1. Shape spaces (and so-called pre-shape spaces),
2. Group actions & metrics (how to "morph" the shapes).

A central challenge in this field of analysis is that shapes are nonlinear objects admitting several descriptions depending on the context (images, curves, measures, etc.). Often the key subject of our study is some equivalence class: shapes are described by some principal class of objects occupying an ambient domain (i.e. some submanifold called the *pre-shape space*) quotiented by some group of transformations under which shapes are deemed similar. The resulting space is called the *shape space*. For instance, take the unit disk in $\mathbb{R}^2$: any rotation, scaling or translation does not affect the inherent shape of the object.

> We usually know our pre-shape spaces: they are the objects we are studying e.g. curves. Sometimes the invariance is less clear (quotient space).

Examples of pre-shape spaces:

* **Images**: black & white images constitute a shape (with e.g. $L^2$ regularity)
* **Measures** can be considered as shapes.
* **Curves** (two examples since they will be central to this project):
  1.
     *Immersions*: the space $\mathrm{Imm}([0,1], \Omega) = \{c \in \mathcal{C}^\infty([0,1], \Omega), \, c'\neq 0\}$ is a pre-shape space. For instance, the shape traced out by the digit "8" occupies this space.
  2. *Embeddings*: $\mathrm{Emb}([0,1], \Omega) = \{ c \in \mathrm{Imm}([0,1], \Omega), \, c \text{ is injective} \}$ is also a pre-shape space, but the number "8" does not belong here (why?).

> The spaces above are in our case always infinite-dimensional Riemannian manifolds, and $\Omega$ is always $\mathbb{R}^2$ or $\mathbb{R}^3$.

Shapes such as curves are invariant to some transformations. For instance, a reparameterisation (i.e. *how fast* we draw parts of it) of the digit "8" does not inherently change its shape. This invariance gives rise to the group $\mathrm{Diff}^+([0,1])$, which we wish to quotient out. Taking the space of embeddings as a pre-shape space, we will consider the following shape space (set of equivalence classes) for our curves:

$\mathcal{S}([0,1], \Omega) = \mathrm{Emb}([0,1], \Omega) / \mathrm{Diff}^+([0,1])$

> $SE(n)$ (rotations and translations, to which scalings can be added) is another example of an invariance, but this group acts from the **left** and not the right!

When an adequate shape space has been selected one can study the evolution of these shapes. This is usually done by endowing the manifold (shape space) with a metric appropriate for the application. We will explain what this means later.

*Geodesic* motion is a central concept here: geodesics are the analogues of "*shortest paths*" in Riemannian geometry. A shortest path with respect to a metric is called a *geodesic*, and it is sought by solving an optimisation problem (it is even in the name "shortest" path). Mathematically, the metric defines an inner product on the tangent space (at each point on the manifold) and is the intrinsic ruler with which to measure trajectories on the manifold. Different metrics are suitable for different problems.
## Square-root velocity framework (SRVF)

The desired invariance of elements of $\mathcal{S}([0,1], \Omega)$ (shorthand $\mathcal{S}$) with respect to reparameterisations is a cumbersome constraint to account for. It turns out that there is a convenient way of handling this by, for each curve $\beta \in \mathcal{S}$, associating the normalised curve:

$q_\beta(t) = \frac{\dot{\beta}(t)}{\sqrt{|\dot{\beta}(t)|}}$.

If $\beta$ is reparameterised by $\gamma \in \mathrm{Diff}^+([0,1])$, its SRVF transforms as $q \mapsto (q \circ \gamma)\sqrt{\dot{\gamma}}$, and this action preserves the $L^2$ norm, which is what makes the reparameterisation invariance tractable. For details, see Chapter 5 ("*Shapes of Planar Curves*") of Srivastava & Klassen. The square root is taken since lengths are computed in the $L^2$ metric, i.e. the length $l(q)$ of $q$ is defined by:

$l(q)^2 = \int_0^1 \langle \frac{\partial q(\eta)}{\partial \eta}, \frac{\partial q(\eta)}{\partial \eta} \rangle \mathrm{d}\eta$

similar to the Euclidean setting. This leads us to consider the shape space:

$\mathcal{C} := \{ q \in L^2([0,1], \Omega),\, \int_0^1 |q(\eta)|^2 \mathrm{d}\eta = 1 \} \subset \mathcal{S}$.

> We restrict our attention to a subspace of $\mathcal{S}$ for which our analysis and computations shall simplify.

Next we will see how we can measure distances between elements of $\mathcal{C}$.

### Metrics on SRVF functions

We have already seen a metric for the shape space $\mathcal{C}$:

$\langle a, b \rangle_{L^2} := \int_0^1 \langle \frac{\partial a(\eta)}{\partial \eta}, \frac{\partial b(\eta)}{\partial \eta} \rangle \mathrm{d}\eta$,

where $a$ and $b$ occupy $T\mathcal{C}$, the *tangent bundle* of the shape space (details are omitted - think of tangent vectors as Riemannian generalisations of vectors measuring infinitesimal direction and magnitude). This $L^2$ metric is also called the *bending energy*. We can now define our first distance (metric) on $\mathcal{C}$:

$d_\mathcal{C}(a, b) = \inf_{f:[0,1]\rightarrow \mathcal{C},\, f(0)=a,\, f(1)=b} \int_0^1 \langle \frac{\partial f(\eta)}{\partial \eta}, \frac{\partial f(\eta)}{\partial \eta} \rangle \mathrm{d}\eta$.
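A minimal numerical sketch of the SRVF map $q_\beta$, assuming the curve is available as $T$ sampled points and approximating $\dot\beta$ by finite differences. The function names `srvf` and `l2_norm_sq` are made up for illustration (this is not the `fdasrsf` API), and the half circle is a toy curve rather than hawk data. The sketch uses the quantity $\int_0^1 |q|^2$ that appears in the definition of $\mathcal{C}$.

```python
import numpy as np


def srvf(beta, t):
    """Square-root velocity function of a sampled curve.

    beta : (T, d) array of points on the curve
    t    : (T,) array of increasing parameter values
    """
    velocity = np.gradient(beta, t, axis=0)       # finite-difference beta'
    speed = np.linalg.norm(velocity, axis=1)      # |beta'(t)|
    # A small floor avoids division by zero where the curve momentarily stops.
    return velocity / np.sqrt(np.maximum(speed, 1e-12))[:, None]


def l2_norm_sq(q, t):
    """Trapezoidal approximation of the integral of |q(t)|^2 over t."""
    integrand = np.sum(q**2, axis=1)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t)))


# Toy curve: a half circle of radius 1, so its length is pi.
t = np.linspace(0.0, 1.0, 501)
beta = np.column_stack([np.cos(np.pi * t), np.sin(np.pi * t)])

q = srvf(beta, t)
length = l2_norm_sq(q, t)        # integral of |q|^2 recovers the curve length
q_unit = q / np.sqrt(length)     # rescaled so that the integral of |q|^2 is 1

# Translating the curve leaves q unchanged, since q only depends on beta'.
q_shifted = srvf(beta + np.array([5.0, -2.0]), t)
```

Here `length` comes out close to $\pi$, and `q_unit` is the representative with unit $L^2$ norm, i.e. the normalisation used in the definition of $\mathcal{C}$.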
*NB!* The domain $[0,1]$ of $f$ *IS NOT THE SAME* as the domain of the parameterisation of curves in $\mathcal{C}$! It is rather the "artificial" time it takes for the object $f(0)=a$ to "evolve" into $f(1)=b$.

> The distance between $a$ and $b$ is attained by the path between $a$ and $b$ that requires the least amount of energy measured in the $L^2$ metric. So if $a$ is a straight line and $b$ is a "C" shape, then $f(0.5)$ would resemble a parenthesis "(" shape. This hopefully explains the new $[0,1]$ domain. Oftentimes the curves themselves are parameterised over $[0,2\pi]$ for this reason, which can make it easier to read.

### PCA, Scores, and other assorted pastries

*Purpose of this section: Arianna & Andreas had a look at how scores are defined in relation to eigenvalues, because it helps to use the same language (or to easily translate between the two).*

Suppose you have $N$ data points of size $n$, denoted by $X\in \mathbb{R}^{n\times N}$. Assuming centered data, we compute the covariance $S = \frac{1}{N-1} X X^\top$. Let $P \Sigma P^{-1}$ be an eigendecomposition of $S$ (it is not unique on the kernel of $S$, hence "a" not "the"), where $\Sigma$ is diagonal and

$$P = \begin{bmatrix} p_1 & \cdots & p_n \end{bmatrix},$$

where $p_i\in\mathbb{R}^n$ are eigenvectors.

**Definition:** the columns of $P$, i.e. the eigenvectors of $S$, are the *principal components of* $S$.

> What are "scores" then?

Scores, as used in [this Python notebook](https://github.com/LydiaFrance/HawkShapeAnalysis/blob/main/testing_PCA.ipynb), are related to eigenvalues, but take values in a different space. Scores are real numbers representing the variation of a data point as measured along an eigenvector. The $i$th score of a data point $x \in \mathbb{R}^n$ is defined as (*note the order of the indices!*):

$$s_i = \sum_{j=1}^n P_{ji} x_j = p_i\cdot x.$$

**Definition:** the $i^\text{th}$ score of $x$ is the inner product between $x$ and the $i^\text{th}$ eigenvector.
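A small numerical check of the definition, as a sketch on synthetic data (the sizes and the random data are made up; this is not the notebook's code). It uses `np.linalg.eigh` for the eigendecomposition, which is an assumption about how one might compute $P$ by hand rather than what any particular PCA library does internally.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 3, 200                                   # toy sizes, not the hawk data

# Synthetic data with a different spread along each coordinate, then centred.
X = rng.normal(size=(n, N)) * np.array([[3.0], [1.0], [0.2]])
X = X - X.mean(axis=1, keepdims=True)

S = X @ X.T / (N - 1)                           # covariance, n x n

# eigh suits symmetric matrices; it returns eigenvalues in ascending order,
# so reorder to put the largest-variance direction first.
eigvals, P = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, P = eigvals[order], P[:, order]

x = X[:, 0]                                     # one data point
scores_x = np.array([P[:, i] @ x for i in range(n)])   # s_i = p_i . x

# The same inner products for every data point at once.
all_scores = P.T @ X                            # n x N

# The sample variance of the i-th score equals the i-th eigenvalue.
score_var = all_scores.var(axis=1, ddof=1)
```

The last line is the link between scores and eigenvalues: the scores live in $\mathbb{R}$ per data point, while each eigenvalue summarises how spread out those scores are along its eigenvector.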
So this is just a measure of how accurately direction $p_i$ describes data $x$ (think of this in 2D, it's just an angle, more or less). If we want to write this for all data points $X$ in matrix notation, the $n\times N$ scores of $X\in \mathbb{R}^{n\times N}$ are computed as:

$$Y := P^\top X.$$

> This is just a change of variables into the eigenspace defined by the covariance $S$.

**Finally:** Scores in the notebook above are computed as:

```
scores = pca.transform(pca_input)
max_score = np.mean(scores, axis=0) + np.std(scores, axis=0)*2
```

Let us understand what's going on here. Mathematically, `scores` is exactly $Y^\top$ and `pca_input` is $X^\top$ (see also [this useful stackexchange post](https://stats.stackexchange.com/questions/409176/what-does-the-pca-transform-method-do)). Then, `np.mean(scores, axis=0)` is the $n$-dimensional vector representing the average scores, and `np.std(scores, axis=0)*2` is two times the standard deviation of each score. For a normal distribution, roughly 95% of the data points fall within two standard deviations of the mean, which for $N\rightarrow\infty$ is the asymptotic case (caveat: data must be identically distributed).

> Can we do this for shapes? Yes, WIP!

# Resources

## Papers & Books

* Landmark-Guided Elastic Shape Analysis of Human Character Motions https://arxiv.org/pdf/1502.07666.pdf
* Srivastava A, Klassen EP. Functional and shape data analysis. New York: Springer; 2016 Oct 3.

## Useful Links

* https://github.com/LydiaFrance/HawkShapeAnalysis
* Surface matching with varifolds (Jean Feydy's example): https://www.kernel-operations.io/keops/_auto_tutorials/surface_registration/plot_LDDMM_Surface.html?highlight=lddmm
* Karcher mean & tangent PCA code: https://fdasrsf-python.readthedocs.io/en/latest/curve_statistics.html

## GitHub Repository

https://github.com/andreasbock/shape_analysis_birds

# Roadmap

## TODO List:

1. Compute Karcher mean of the curves (Arianna & Andreas)
2. 2d vs 3d plotting of the means
3.
   Animate if time

## Trajectory

- Find out the results from the trajectories by better plotting
  - Are the different start and end points a problem?
  - 2d vs 3d plots of the principal components. Is this what `fdasrvf` is plotting? Understand how to plot these objects!
  - xyz vs xy vs yz?

**NOTE FROM LYDIA** Code for aspect ratio problem, possibly helpful...

```
# Axis Limits
minbound = -0.3
maxbound = 0.3
ax.auto_scale_xyz([minbound, maxbound],[minbound, maxbound],[minbound, maxbound])
```

- Now we have the Karcher mean curve...
  - Generate a family of mean curves for each hawk and each perch-perch distance
  - We can compare how each hawk solves the different problems
  - Look at how the curves differ quantitatively
  - How variable are the solutions?

## Lydia's Thoughts on how to show the results from PCA

- Step 1: Put the dataset into the PCA with the data shaped [n,x,3] or [n,x,2] depending on whether it's 2d or 3d, where x is the number of points for each flight and n is the number of flights (curves) [Note: I don't know how x works with your parameterised version]
- Step 2: The results of the PCA give you the components (P is the number of PCs)
  - `pca = PCA()`
  - `pca.fit(pca_input)`
  - `pca.components_`
- Step 3: Get the "scores" from the PCA by transforming the original data
  - `scores = pca.transform(pca_input)`
  - Which gives you a [n,x,P] array where n is the number of flights, x is the number of points in each flight, P is the number of PCs
- Step 4: Find the min and max scores for each PC
  - I use `±2 * stdev` (the standard deviation) in case there are some extreme outliers
  - `±2 * stdev` covers roughly 95% of the data
- Step 5: Take the Karcher mean curve, the component, and the max score for the following:
  - `new_curve = karcher_mean_curve + max_score * principal_component_1`
- Step 6: Your new curve is now a transformed version of the mean curve as though it were affected by PC1 only and by the max amount
- Step 7: Repeat for the min score
- Step 8: Plot them in two different colours next to the
  mean curve.

If you want to animate this it's not the easiest but ask Lydia :grin:

- You generate a range of scores between the min and max (say 200)
- You generate an array of 200 curves from this score array (Step 5)
- You need a plotting function that takes in "curve" and produces a 3d plot
  - `plot_curve(curve,fig,ax)`
- You have an "update" function that takes in the frame index `ii` and the array of curves:
  - `update_plot(ii,curve_array,fig,ax)`
    - `plot_curve(curve_array[ii],fig,ax)`
- You have the animation function which uses the update function
  - `anim = FuncAnimation(fig, update_plot, frames=n_frames, fargs=(curve_array, fig, ax), interval=20, repeat=True)`
  - As long as your update_plot function takes the frame index as its first argument (with `fargs` in the same order as its remaining arguments), it will know what to do...!

## Wing shapes

- Make it a tangent PCA not normal PCA out of interest

## Maths Questions 2023-01-19

- When we see two types of variation in the same PC (e.g. timing of flapping and also the length of the glide), why are they shown together in the same PC? Are they bundled randomly?
- In my wing shape PCA, I see the wings spreading and the tail also spreading. This makes sense in a physical system because the bird wants to increase its surface area. But mathematically, could the grouping of those variations be random?

## Meeting Minutes

### 19/01/23 LF, AB, ASJ Zoom 14:00 - 16:15 GMT

#### Arianna's meeting notes:

- New dataset has been made to account for asymmetry. An object is placed in the birds' path to cause turns -- banking.
- Karcher means are looking promising. We are mainly seeing variation at the start and at the end. Is this normal or an effect of the mismatch of start/end points etc? And if we didn't scale the axes, would we begin to see more variation in the middle part of the curve (i.e., the actual "flight")? Should try plotting these with different axes.
  Would also be interesting to see what the Karcher means look like if the first and last segments are completely excluded - but that would require some manual truncation. Food for thought.
- The deviation of start/end points is less of a problem with greater perch distances. Are they affecting the Karcher mean results? Could we manually amend these points before feeding the curves into the Karcher mean algorithm? Can we truncate without messing with the aims of the data? Also, would including the time variable here help?
- Can look at Karcher means and PCA in 2D. Either by feeding in 3D data (or even 4D data if we wanted backpacks and time) and then focusing on two dimensions, or by dimension reduction from the start, e.g., can feed the y and z axes into the Karcher mean code.
- Would be good to create a histogram of resolutions to see what the true differences in numbers are. Is there missing data? Can we just exclude some instances? Should we be sampling down to the lower number instead of interpolating up to the larger number?
- Could be helpful to create a plot of the alignment of all birds, but colour coded for each bird, so we can see the differences between the birds. In addition to this, individual Karcher means of different birds plotted in one plot might be interesting to look at.
- How can we compare the birds with each other? Pairwise (geodesic) distances and further clustering? Or maybe we can focus on the Karcher means of each of the individual birds, and compare those (e.g., distances between each other or between some template). Kmeans?
- Can we automatically detect anomalies (like that weird purple one) by looking at the velocity variables? And use that to exclude weird curves from the dataset.
- How can we interpret the PC results meaningfully in the tangent space?
- Lydia has updated the wing data PCA code and will push it to GitHub soon. Would be good to use this to test out tPCA on this dataset.
- There may be another exciting dataset too, this time related to the outlines of the birds rather than 4 points on the birds. Very cool - possibly something for the future.

---

### 2023-02-24 Meeting

Arianna's notes

Things to check for:
- straight vertical lines.
- different Ns (total points)
- depending on which fdasrsf func is used, whether the time axis x ever goes backwards, i.e. x(n+1)<x(n)

**Exclusion criteria for data points**

- Time start cutoff (t>0) -- this should remove the jump
- End distance cutoff (y>-0.35) -- this should remove data when the bird has grabbed the perch

**Exclusion criteria for flight sequence**

- Must be within *Ym* from the starting perch
- Must be within *Ym* from the end perch
- Must have minimum 140 data points (for 12m flights)

### 2023-03-17 Meeting

* TODO from 17MAR23:
  * ~~Code: PCA of _all_ birds together -> what does that give us? (AB)~~
  * ~~Code: Fix/remove jagged curves. (ASJ)~~
  * ~~`joint_pca.py`: a bug with data quality for 12m.~~
  * ~~Code: Add % explained to each PC plot. (ASJ)~~
  * ~~Code: subplot PCs together. (ASJ)~~
  * ~~Writing: paragraph detailing how we're plotting scores (add section to maths part in HackMD?). (ASJ & AB)~~
  * ~~Compare scores with Lydia's score (i.e., scores = pca.transform(pca_input), max_score = np.mean(scores, axis=0) + (np.std(scores, axis=0)*2) etc)~~

### 2023-05-12 Meeting (Lydia)

Possible Figures...

- Comparing k-means of trajectories
  - 4 rows (perch distance), 1 col subplots (4 lines for each bird)
  - Each subplot is a different perch-perch distance
  - Each subplot has the k-mean for each individual and also the overall k-mean (5)
  - Able to compare each individual within each subplot
  - Can compare the shape of the different perch distances across the subplots
  - The point: shows us how the individuals vary and how technique changes for the different distances
- The deformation areas comparing perch-perch distance

  ![](https://hackmd.io/_uploads/ryypEgBH3.png)

  > These are the first derivative (velocity) of the trajectories versus time.
The actual data plotted here is a mess; yours will be a huge improvement on this. You can see that from -0.5 to 0 the averages are pretty much exactly the same, then the 5m one diverges first and so on.

- deformation continued...
    - The scaled version of the all-birds k-mean 5m trajectory
    - The scaled version of the all-birds k-mean 12m trajectory
    - A "heat map", or just plotting the values directly, of where the deformation of transforming the 5m to 12m changes the curve the most (can also compare other combinations).
    - The point: shows us which part of the trajectory changes when the birds need to do a different kind of flight. E.g. the take-off part is always the same, but something is very different in the middle when it's a shorter flight
    - **Andreas' notes:** use angle/warping function! See Srivastava Fig 5.3.
- The Tangent PCA
- Helpful extra -- the "mickey mouse" figure
    - Shows the trajectory incrementally deforming into another. This is the standard way of depicting geodesics.
    - Not that helpful for this paper but v v v useful for upcoming work, so we should have some code/familiarity with it.
    - **Andreas' notes**: this is "just" an example of a geodesic between a given curve and, for instance, the mean.
- **Andreas' notes**: "distance matrix"?

### 2023-05-19 Meeting

> Andreas & Lydia

Edited the notes in [this section](#2023-05-12-Meeting) to add Andreas' notes for todo list.

#### Possible journal plans...

- Lydia & Andreas not too worried about "impact factor"
- However, the [Royal Society Journals](https://royalsociety.org/journals/authors/which-journal/) are nice because of the interdisciplinary audience; as a result they don't mind interesting structures/headings and don't specify the structure aggressively.
- We could try for [Journal of the Royal Society Interface](https://royalsocietypublishing.org/rsif/about) because lots of bird research is published there, they look for interdisciplinary stuff, and the articles look nice. Might be a bit ambitious though...
    - > Interface publishes research applying chemistry, engineering, materials science, mathematics and physics to the biological and medical sciences
- If not accepted then we can submit to [Royal Society Open Science](https://royalsocietypublishing.org/rsos/about), which is way more accepting
    - > The journal covers the entire range of science and mathematics and allows the Society to publish all the high-quality work it receives without the usual restrictions on scope, length or impact.

#### Paper writing plans

- First finalise figures as the visual version of the paper
- Andreas wants to write the raw maths part in Overleaf, then Arianna alters it for a biology audience, using Lydia as a test subject
- Lydia has written some key bullet points about how the introduction might go.
- We should write this the way it makes sense to us and an interdisciplinary audience
- If we pick a Royal Society journal, their structure is basically: Introduction, "Main Text" with any headings we want, then methods at the end. This gives us loads of freedom to do whatever

#### Figure plans

- ~~See [this section](#2023-05-12-Meeting) for a TODO list.~~ See Paper Plans below.

#### Code plans

- Need to have a meeting with Arianna, but Lydia also feels strongly we should have some kind of mini toolbox released with this paper.
    - The code should be generalisable, so that someone else with clean 2D trajectory data can run the methods on it and make the nice figures that make sense to biologists
    - Lydia can test it with finch trajectories.
    - This is something we can show off within the Turing as some open science software and get some kudos
- ~~Andreas is going to separate some of the "cleaning steps" e.g. the filters away from the algorithms so the cleaning step is separate and first, then the algorithms and figures are produced.~~ *Andreas: I had a think/closer look at this. I think there is a better way, see [Section 1.3 on "Code Plans"](#1.3-Code-Plans) below.*
- Lydia's day job is about making science code more general/open, so she will help!

#### New PCA analysis...

- Me and Andreas are calling this the grey soup PCA
- What happens if you pour all perch distances and all birds into the same PCA? Lengths of data are normalised. Making a grey soup
- What is the point of this? Basically this is a data set where the underlying variation is known (there are different individuals, there are different perch distances), but can this kind of analysis reveal these "clusters" without prior knowledge?
- That way, if we use this method on a data set where the variation within it is unknown, could it reveal the clusters there too?

### 2023-06-08 Meeting (Andreas&Arianna&Lydia)

- Andreas' workload:
    - Done the bulk of it now!
    - Read the [Brighton et al paper](https://www.pnas.org/doi/epdf/10.1073/pnas.1714532114), looking at the [Appendix](https://www.pnas.org/action/downloadSupplement?doi=10.1073%2Fpnas.1714532114&file=pnas.1714532114.sapp.pdf) to see an example of how biologists currently solve this problem. Puts our paper into context.
    - Getting those figures out and in the pdf
    - Figures more reproducible/pretty
    - Help Lydia with the code deep dive
- Arianna's workload:
    - Free from the `27th June`
    - helping Andreas with a bottleneck - the geodesics
    - making the maths explanation simplified.
- Lydia's workload:
    - Deep dive the code and make it into a toolbox suitable for biologists.
    - Try it out with a new series of curves -- weights added 9m -- as a test case.
    - Do some formal hypothesis testing which will then drive the write up for the whole paper

Planned meeting on the 22nd -- all three of us interpret the figures to write up the results and see the progress of the code.
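
As a rough illustration of the "grey soup" idea noted under "New PCA analysis" above, the sketch below pools curves from two groups, normalises their lengths by resampling, and runs a single PCA to see whether the known grouping shows up in the scores. It is only a sketch under stated assumptions: it uses plain Euclidean PCA on flattened curves (not the tangent PCA from the project code), the function names `resample_curve`, `grey_soup_pca` and `make_flight` are hypothetical, and the data are synthetic stand-ins for two perch distances.

```python
import numpy as np


def resample_curve(curve, n_points):
    """Resample an (N, d) curve to n_points samples, linear in sample index."""
    curve = np.asarray(curve, dtype=float)
    old_s = np.linspace(0.0, 1.0, len(curve))
    new_s = np.linspace(0.0, 1.0, n_points)
    return np.column_stack(
        [np.interp(new_s, old_s, curve[:, k]) for k in range(curve.shape[1])]
    )


def grey_soup_pca(curves, n_points=100, n_components=3):
    """Plain (Euclidean) PCA on length-normalised, flattened curves.

    ASSUMPTION: this is a stand-in for the tangent PCA used in the project.
    Returns the per-curve scores and the explained variance ratios.
    """
    X = np.stack([resample_curve(c, n_points).ravel() for c in curves])
    centred = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(centred, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S**2 / np.sum(S**2))[:n_components]
    return scores, explained


def make_flight(n_samples, depth, rng):
    """Synthetic 2D curve: a swoop of a given depth with small noise."""
    s = np.linspace(0.0, 1.0, n_samples)
    z = -depth * np.sin(np.pi * s) + rng.normal(0.0, 0.01, n_samples)
    return np.column_stack([s, z])


# Two synthetic groups with different sample counts and swoop depths.
rng = np.random.default_rng(0)
curves = [make_flight(60, 0.5, rng) for _ in range(10)]
curves += [make_flight(140, 1.0, rng) for _ in range(10)]
labels = np.array([0] * 10 + [1] * 10)

scores, explained = grey_soup_pca(curves)
for group in (0, 1):
    print("group", group, "PC1 scores:", scores[labels == group, 0].round(2))
print("explained variance ratios:", explained.round(3))
```

If the method behaves as hoped, the PC1 scores of the two groups should not overlap, which is the kind of "sanity check" the grey soup is meant to provide before applying the same analysis to data with unknown structure.
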

---

## 2024-02-13 Meeting

At the Turing (Andreas&Arianna&Lydia)

### Update from Lydia

There are three aims of this project:
Paper: Turing doesn't care, our priority outcome
Software Toolbox: Turing priority
Science Results: Can't write paper without this, need to check everything is working, Lydia needs to be comfortable running hypotheses herself

By writing the toolbox, Lydia is checking the science and testing with a new unseen dataset to check the reliability and generalisability of the methods. Some gremlins have appeared, so working on those.
- Problem with how we define the start and end points of the curves
- Overzealous throwing away of data based on a sole rogue datapoint

The stretch goal for the toolbox is to have mixed input options: choice of CLI, use within a notebook, or scripting directly.

Paper writing reliant on Lydia
Toolbox being written by Lydia, now Andreas wants to join in the effort too.

### Hypotheses/Science Questions

Kinds of comparisons:
- Perch distance comparison
- Individuals
- Experience
- Weight

How variable are the 5m flights by Drogon? (Do individual birds vary a lot on the same task?)
- Take the mean of the flights and covariance
- Groupwise alignment, norm of the warping functions (i.e. comparing to y=x)

If the 5m flights by Drogon vary, where in the flight (by section) do they vary?
- Warping function
- mean ± PCA score

How do the different birds fly 5m in comparison to each other?
- Visually look at the means
- Pairwise geodesic distance matrix of Karcher Means (Drogon and Rhaegal are closer together)
- Cluster PCs (5m PCA with all birds)

Is the way that the 5m flights vary comparable to the way the 12m flights vary?
What would it look like if a bird flew 12m in the style of 5m? (can't go back to original space)
- Geodesic distance comparison
- PC vectors parallel transport (squiggly man)

### TODOs:

- [ ] Send data (LAF)
- [ ] Ghost marker problem (drop a frame if it's just a frame triggering the Lipschitz filter) -- (AB)
- [ ] Circle around the end perch rather than x=-0.3 cutoff (AB)
- [ ] Have a cry about the wiggly man (AB & ASJ)
- [ ] Upload the proto-toolbox ready for help (LAF)
- [ ] Notebook for geodesic distance comparison (ASJ)
- [ ] Notebook for warping function (ASJ)
- [ ] De-mathsing (Arianna-fying) the methods (ASJ)

---

# Paper Plans

## **1. Trajectory Shape Analysis**

[GitHub link](https://github.com/andreasbock/bird_trajectory_paper)
[Overleaf link](https://www.overleaf.com/project/64609f6e0b8e408d2420bc10)

### 1.1 Paper Writing

> Research Question: How can we meaningfully quantify variation in biological trajectory data?

* [ ] Biology/Flight Biology Pitch: How to quantify individual variation and changes in flight technique/paths rather than collapsing variation down to a single point or eyeballing it
* [ ] Maths Pitch (Andreas & Arianna): *started drafting*
* [ ] [Reference this paper](https://www.nature.com/articles/s41586-022-04861-4) which uses the same data to show how trajectory and simulation data are collapsed to just a single point for each flight.
* [ ] [Reference this paper](https://www.pnas.org/doi/10.1073/pnas.1714532114) to show how comparing simulation results to real trajectories is done by a crude "distance" between (random...?) points
* [ ] Pitch it to biologists, include evolutionary biology, have a good maths explanation for biologists. Lydia deals with all the Abstract/Intro/Background/Discussion/Conclusions writing.
* [ ] Andreas & Arianna to discuss maths section. *Iterating on draft*
* [ ] Write explanations for the figures together.
* [ ] Think about including the "weighted data set".

### 1.2 Figure Plans

> *Please refer to the README.md in the repository for a detailed description of what each script produces.*

* [x] **Karcher means** 4x1 plot for each perch (`plot_karcher_mean.py`, legend needs fixing)
    * Does this make sense to us? I wonder if tPCA reveals more than the mean does (so do that)
* [x] **Tangent PCA** for each bird/perch combination, and for each perch distance across all birds (`plot_pca_bird_pirch.py` and `plot_pca_perch.py`)
* [x] **Grey soup PCA** (`plot_pca.py`): combine all birds & perch distances (after normalisation). This can check if the metric we have chosen uncovers the known underlying variation, as a "sanity check" of our method. We can do clustering on the pairwise distance between all curves! This is related to the notion of a distance matrix.
* [x] **Geodesic plot** between two curves (`plot_geodesic.py`): needs some improvement.
* [x] ~~Plot first derivatives/"deformation area" of each perch distance/bird. This is computed from each PC. I think it is computable using `calc_alphadot`/`find_basis_normal_path`, but not sure how to do this yet...~~
* [x] Rewrite/edit `plotting.py` to make the figures _shine_, appropriate for the journal/audience we have in mind. Prettify/fix legends etc, `matplotlib` tinkering basically (everyone).
* [ ] [Vertical vs Horizontal PCA](https://fdasrsf-python.readthedocs.io/en/latest/fpca_example.html?highlight=parallel#Elastic-Functional-Principal-Component-Analysis).
* [ ] **Warping function** plot instead of/in addition to "SRVF geodesic" plot (Arianna).
* [ ] **Colour/clustering** plots, k-means (Arianna).
* [ ] **Plot more than 3 PCs for grey soup (might be instructive)**
* [ ] **Parallel transport** between "global PCs" and bird PCs (Andreas).
* [x] Add all the figures we've got to the Overleaf (Andreas).

### 1.3 Code Plans

> Goal is to build some kind of mini toolbox released with this paper.
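
A minimal sketch of what a standalone cleaning step in such a toolbox might look like, using the point-level and sequence-level exclusion criteria from the 2023-02-24 notes. Several things here are assumptions rather than the project's actual code: the function name `clean_flight` and its parameters are hypothetical, both inequalities (`t > 0`, `y > -0.35`) are read as conditions a point must satisfy to be kept (flip the comparison if the axis convention is the other way round), and the "within *Ym* of the perch" criteria are left out because no value is given.

```python
import numpy as np


def clean_flight(t, y, z, t_start=0.0, y_end=-0.35, min_points=140):
    """Apply point-level cutoffs, then the sequence-level length check.

    Returns an (n, 3) array with columns t, y, z, or None if the flight
    is rejected for having too few points left after filtering.
    """
    t, y, z = (np.asarray(a, dtype=float) for a in (t, y, z))
    # ASSUMPTION: both inequalities are "keep" conditions.
    keep = (t > t_start) & (y > y_end)
    if np.count_nonzero(keep) < min_points:
        return None
    return np.column_stack([t[keep], y[keep], z[keep]])


# Synthetic example: one long flight and a truncated copy of it.
t = np.linspace(-0.2, 1.8, 201)
y = np.linspace(8.0, -0.5, 201)
z = np.zeros_like(t)

long_flight = clean_flight(t, y, z)
short_flight = clean_flight(t[:100], y[:100], z[:100])
print("long flight kept:", None if long_flight is None else long_flight.shape)
print("short flight kept:", short_flight)
```

Keeping the filter as a pure function of arrays, separate from any curve object, would let the cleaning be tested and swapped out independently of the shape analysis.
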

- [x] ~~Fix `scaling=True` bug (Arianna).~~
- [x] Fix `plot_karcher_means.py` legend/grid, axes, etc. Andreas did nothing, he hates `matplotlib` xD (Arianna).
- [x] Add `fs.curve_functions.scale_curve(curve)` when setting up the `fdacurve` object in `curve.py`. This would alleviate the need to do the same in `curve.py.plot_curves` (Andreas/Arianna).
- [ ] Start new repo (fork/copy). *Andreas: As mentioned, 90+% of the code is just cleaning, 10% pertains to curves. As a result, it is simpler to just lift what we need from the existing repository. There is a `curve.py` module which has an interface for creating the `fdacurve` object (transparent to user) and plotting the stats.*
- [ ] Lydia to test it with finch trajectories.

#### 2023-06-08 Lydia Code plans to do

- [ ] Lydia makes notes on all the code while going through it
- [ ] Test using it with new trajectories (with weights) to see how it works for an ignorant new user
- [ ] Use those learnings to figure out documentation/walkthrough
- [ ] Plan for refactoring it to a toolbox, what might be needed/rearranged/missing/generalisable

### Miscellaneous TODO list

- [ ] Show some evolution/flight biologists to get some of their opinions
- ~~Lydia is going to try and get hold of the simulated optimal trajectories too from the Nature paper~~
- [ ] Rope in Lydia's biology prof as an author for clout

## **2. PCA Wing Motion Paper**

`Research Question: What is the control space for morphing wing flight?`

Just Lydia's work here currently (all the birdy animations). This will likely be a good first paper to introduce the wing motion data as a preliminary basic analysis. Currently I'm using plain PCA to look at wing motion and trying to wrestle something that big into a paper. The principal components reduce the dimensionality of the highly complex morphing wing & tail during flight behaviours.

- [x] Problem I'm currently up to my eyeballs in -- adding a new data set to the 4 million points which adds in asymmetrical flight data.
(Is this a reasonable amount of work for just one person? Absolutely not) :scream:
    - I managed it last week :sunglasses: `2023-06-08`

2023-06-08 Update...
- Have now added turning/asymmetrical flight to the

## **3. Ambitious Shape Analysis Wing Motion Paper**

`Research question: What variation is there in wing morphing technique across flight behaviours and different individuals?`

Following the previous paper, this one will be able to ask much, much more ambitious novel questions. This will be a lot of fun! With shape analysis maths this becomes much more ambitious. Everything we find in this data set is novel and very exciting, so we have a lot of potential. The maths is much harder though.

OPTION 1: Two papers, a maths one (how we did it) and a biology one (what do the results mean). The biology one references the maths one. Splitting it might be more impactful for your maths fields, where you can explain the work more.

OPTION 2: Contain everything together, but potentially need 2 papers or more to cover everything we find.....!

---
