# MeLODy Project

---

## Project Statement

(TODO)

---

## My Conceptual Notes

* [Notation](https://jejjohnson.github.io/research_notebook/content/notes/concepts/notation.html)
* [Inverse Problems](https://jejjohnson.github.io/research_notebook/content/notes/data_assimilation/inv_problems.html)
* [Dynamical Systems](https://jejjohnson.github.io/research_notebook/content/notes/data_assimilation/dynamical_sys.html)
* [Markov Models](https://jejjohnson.github.io/research_notebook/content/notes/data_assimilation/markov_models.html)
* [Optimal Interpolation](https://jejjohnson.github.io/research_notebook/content/notes/data_assimilation/oi.html)
* [Kalman Filter](https://jejjohnson.github.io/research_notebook/content/notes/data_assimilation/kf.html)
* [Ensemble Kalman Filter](https://jejjohnson.github.io/research_notebook/content/notes/data_assimilation/enskf.html)

---

## Projects

### Neural Fields 4 SSH Interpolation

---

#### Problem Statement

We have a training set of observations, $y_\text{obs}$, which come from altimetry tracks. Each of these observations was captured at a latitude, longitude and time, i.e. the coordinates $\mathbf{x}_\phi$. However, they are sparsely scattered and assumed to be noisy. We want a parameterized function, $\boldsymbol{f_\theta}$, that learns an implicit field mapping the `lat,lon,time` coordinates, $\mathbf{x}_\phi$, to the observations, $y_\text{obs}$.

$$
y_\text{obs} = \boldsymbol{f_\theta}(\mathbf{x}_\phi) + \epsilon, \hspace{5mm} \epsilon \sim \mathcal{N}(0, \sigma^2)
$$

The standard approach is Optimal Interpolation (OI). However, it can be computationally expensive, and its kernel hyperparameters can be difficult to tune. We instead use a family of neural networks (i.e. Neural Fields, Implicit Neural Representations (INRs), Coordinate-Based Neural Networks) to learn this mapping purely from observations.

**Hypothesis**: We believe there are enough observations to train a neural network that *learns* the implicit SSH field from the coordinates and observations alone. Furthermore, we believe we can match the performance of OI and DUACS (for certain cases) on both statistical and physical metrics while remaining more scalable.

---

#### Code

+ [Main GitHub Repository](https://github.com/jejjohnson/ml4ssh)
+ [QG Data Challenge](https://github.com/jejjohnson/2022b_qg_mapping)
+ [QG Simulation Code](https://github.com/jejjohnson/torchqg/tree/package)
+ [OSE SSH Data Challenge](https://github.com/ocean-data-challenges/2021a_SSH_mapping_OSE)

---

#### Dissemination

**Talks**

* [2022/06 - M2Lines]() (**TODO**)
* [2022/06 - SWOT Meetup]() (**TODO**)

**Posters**

* [2022/10 - WOC ESA]() (**TODO**)

**Papers**

* [2022/12 - NeurIPS MLPS Workshop (Submission)](https://www.overleaf.com/1567587229mhvgpmdnkcqk)
* [Journal Article](https://www.overleaf.com/8891627777cfvwpfzcntzd) [**Writing**] - (JAMES?) | [Read-Only](https://www.overleaf.com/read/ybbbxzgmcrjh)

---

#### Datasets

* [QG Simulation]() (**TODO**)
* [OSE Data Challenge 2021b]() (**TODO**)

---

#### Algorithms

| Method | Formulation |
| -------- | -------- |
| Optimal Interpolation | $\boldsymbol{f} \sim \mathcal{GP}\left(\boldsymbol{m_\psi}(\mathbf{x}_\phi), \boldsymbol{k_\rho}(\mathbf{x}_\phi,\mathbf{x}_\phi')\right)$ |
| Neural Fields | $y_\text{obs} = \boldsymbol{f_\theta}(\mathbf{x}_\phi) + \epsilon, \hspace{5mm} \epsilon \sim \mathcal{N}(0, \sigma^2)$ |

We are traditionally interested in Optimal Interpolation (OI). It is a very good method for regression tasks with very few observations because it encodes strong priors. However, its hyperparameters are hard to fit on large amounts of data, so we look to neural networks to alleviate this problem (a code sketch of such a network follows the links below).

* [Optimal Interpolation]() (**TODO**)
* [Neural Fields]() (**TODO**)
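Below is a minimal PyTorch sketch of such a coordinate-based network: random Fourier features followed by an MLP, trained with the MSE loss from the problem statement. It is illustrative only, not the `ml4ssh` implementation; `coords`, `ssh_obs`, and all layer sizes are placeholder assumptions.

```python
# A minimal sketch (NOT the ml4ssh implementation): Fourier features + MLP.
# `coords` (N, 3) and `ssh_obs` (N, 1) are placeholder tensors of normalized
# lat/lon/time points and along-track SSH values.
import torch
import torch.nn as nn


class FourierFeatures(nn.Module):
    """Lift low-dimensional coordinates to sin/cos features."""

    def __init__(self, in_dim: int = 3, n_features: int = 128, scale: float = 10.0):
        super().__init__()
        self.register_buffer("B", scale * torch.randn(in_dim, n_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        proj = 2.0 * torch.pi * x @ self.B
        return torch.cat([torch.sin(proj), torch.cos(proj)], dim=-1)


class NeuralField(nn.Module):
    """f_theta: coordinates x_phi -> SSH observation y_obs."""

    def __init__(self, n_features: int = 128, hidden: int = 256):
        super().__init__()
        self.encoder = FourierFeatures(n_features=n_features)
        self.mlp = nn.Sequential(
            nn.Linear(2 * n_features, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.mlp(self.encoder(x))


model = NeuralField()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)


def train_step(coords: torch.Tensor, ssh_obs: torch.Tensor) -> float:
    """One gradient step on a batch of scattered observations."""
    optimizer.zero_grad()
    loss = torch.mean((model(coords) - ssh_obs) ** 2)  # standard MSE loss
    loss.backward()
    optimizer.step()
    return loss.item()
```

The Fourier-feature encoding is one common choice for letting an MLP fit the high-frequency structure that raw `lat,lon,time` inputs tend to smooth out.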
---

#### QG PDE Regularization

> Here we add a QG regularization term which we hope will constrain the solution of the NN to exhibit more of the physical attributes one would expect from sea surface height. (A code sketch of the resulting loss is given at the end of this page.)

$$
\partial_t \eta - \alpha\partial_t \nabla^2 \eta - \beta\det \boldsymbol{J}(\eta, \nabla^2 \eta) = 0
$$

<details>

where:

* $\eta$ is SSH
* $\alpha = \frac{f}{g}$
* $\beta = \frac{gL_R^2}{f}$
* $\boldsymbol{\nabla}^2$ - the Laplacian operator
* $\partial_t$ - the partial derivative w.r.t. time, $t$
* $\det \boldsymbol{J}(u,v) = \partial_x u\, \partial_y v - \partial_y u\, \partial_x v$ - the Jacobian determinant of a vector-valued function, $\boldsymbol{f}:\mathbb{R}^2 \rightarrow \mathbb{R}^2$

</details>

This will be used as a regularization term on top of the standard MSE loss.

* [PDE Formulation](https://community.inkdrop.app/note/feef7d19c5bf8bc1eba0e574b4b0d3f3/note:tunNgmDiJ)
* [PDE Simulations](https://community.inkdrop.app/note/feef7d19c5bf8bc1eba0e574b4b0d3f3/note:H0rDrVrn7)
* [Derivation](https://community.inkdrop.app/note/feef7d19c5bf8bc1eba0e574b4b0d3f3/note:Rn33ThReq)
* Non-Dimensionalized PDE - [Math](https://community.inkdrop.app/note/feef7d19c5bf8bc1eba0e574b4b0d3f3/note:d5aEeMObL)
* QG with Spherical Coordinates - [Math](https://community.inkdrop.app/note/feef7d19c5bf8bc1eba0e574b4b0d3f3/note:GNEowyBPM)
* Non-Dimensionalized QG with Spherical Coordinates - [Math](https://community.inkdrop.app/note/feef7d19c5bf8bc1eba0e574b4b0d3f3/note:pmH_qyHDH)
* [Code Demo](https://community.inkdrop.app/note/feef7d19c5bf8bc1eba0e574b4b0d3f3/note:e_WvJ0A8G)

---

#### Experiments

**QG Simulation**: Density of Sampling

**QG Simulation**: Quantity of Noise

**QG Simulation**: Effect of QG Regularization

**OSE SSH Data Challenge**: Best Methods

* [Preliminary Results](https://github.com/jejjohnson/ml4ssh/tree/main/experiments/dc21a)
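---

#### QG Loss Sketch

As referenced in the QG PDE Regularization section, here is a minimal PyTorch sketch of how the PDE residual above can be evaluated with automatic differentiation and added to the MSE data term. This is my reading of the equation, not the repository code: the `(x, y, t)` coordinate ordering, the collocation batch `colloc`, and the weight `lam` are assumptions, while `alpha` and `beta` are the coefficients $\frac{f}{g}$ and $\frac{gL_R^2}{f}$ defined above.

```python
# A sketch of the QG-regularized loss -- my reading of the PDE above, not the
# repository code. Coordinates are assumed ordered (x, y, t); `colloc` is a
# placeholder (M, 3) batch of collocation points, `lam` a hypothetical weight.
import torch


def grad(out: torch.Tensor, inp: torch.Tensor) -> torch.Tensor:
    """d(out)/d(inp), keeping the graph so we can differentiate again."""
    return torch.autograd.grad(
        out, inp, grad_outputs=torch.ones_like(out), create_graph=True
    )[0]


def qg_residual(model, colloc, alpha, beta):
    # separate leaf tensors per coordinate so autograd can give us partials
    x, y, t = (colloc[:, i : i + 1].detach().requires_grad_(True) for i in range(3))
    eta = model(torch.cat([x, y, t], dim=-1))  # SSH prediction

    eta_x, eta_y, eta_t = grad(eta, x), grad(eta, y), grad(eta, t)
    lap = grad(eta_x, x) + grad(eta_y, y)  # nabla^2 eta

    # det J(eta, nabla^2 eta) = d_x eta * d_y lap - d_y eta * d_x lap
    det_j = eta_x * grad(lap, y) - eta_y * grad(lap, x)

    # residual: d_t eta - alpha * d_t nabla^2 eta - beta * det J
    return eta_t - alpha * grad(lap, t) - beta * det_j


def total_loss(model, coords, ssh_obs, colloc, alpha, beta, lam=1e-4):
    mse = torch.mean((model(coords) - ssh_obs) ** 2)
    pde = torch.mean(qg_residual(model, colloc, alpha, beta) ** 2)
    return mse + lam * pde  # data fit + QG regularization
```

The residual is penalized at collocation points rather than at the (sparse, noisy) observation locations, which is the usual PINN-style setup for this kind of soft physical constraint.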