# 4DVarNet for space-time interpolation: ongoing developments
## 1. The 4DVarNet algorithm
Let $\mathbf{y}(\Omega)=\lbrace \mathbf{y}_k(\Omega_k) \rbrace$ denote the partial and potentially noisy observational dataset on the subdomain $\Omega=\lbrace \Omega_k \rbrace \subset \mathcal{D}$, where $\overline{\Omega}$ denotes the gappy part of the field and index $k$ refers to time $t_k$. Using a data assimilation state-space formulation, we aim to estimate the hidden state $\mathbf{x}=\lbrace \mathbf{x}_k(\Omega_k) \rbrace$ from the observations $\mathbf{y}$.
### 1.1 The variational model
Considering a variational data assimilation scheme, the state analysis $\mathbf{x}^\star$ is obtained by solving the minimization problem:
\begin{align*}
\mathbf{x}^\star= \underset{\mathbf{x}}{\arg\min} \ \mathcal{J}(\mathbf{x})
\end{align*}
where the variational cost function $\mathcal{J}(\mathbf{x})=\mathcal{J}_{\Phi}(\mathbf{x},\mathbf{y},\Omega)$ is generally the sum of an observation term and a regularization term involving an operator $\Phi$ which is typically a dynamical prior:
\begin{align*}
\mathcal{J}_{\Phi}(\mathbf{x},\mathbf{y},\Omega) & = \mathcal{J}^o(\mathbf{x},\mathbf{y},\Omega) + \mathcal{J}_{\Phi}^b(\mathbf{x})\\
& = \lambda_1 || \mathbf{y} - \mathcal{H}(\mathbf{x}) ||^2_\Omega + \lambda_2 || \mathbf{x} - \Phi(\mathbf{x}) ||^2
\end{align*}
with $\mathcal{H}$ the observation operator and $\lambda_{1,2}$ predefined or learnable scalar weights. This formulation of the functional $\cal{J}_{\Phi}(\mathbf{x},\mathbf{y},\Omega)$ directly relates to strong-constraint 4D-Var.
For inverse problems with time-related processes, the minimization of the functional $\mathcal{J}_\Phi$ usually involves iterative gradient-based algorithms; in particular, classic model-based variational data assimilation schemes, in which operator $\Phi$ identifies with a deterministic model $\mathbf{x}_{k+1}=\mathcal{M}(\mathbf{x}_{k})$, require the adjoint method:
\begin{align*}
\mathbf{x}^{(i+1)} = \mathbf{x}^{(i)} - \alpha \nabla_{\mathbf{x}}\mathcal{J}_{\Phi}(\mathbf{x}^{(i)} ,\mathbf{y},\Omega)
\end{align*}
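As an illustration, this gradient descent can be sketched on a toy 1-D problem, assuming a linear prior $\Phi(\mathbf{x}) = A\mathbf{x}$ and an identity observation operator; all names and values below are illustrative choices, not the actual 4DVarNet implementation:

```python
import numpy as np

# Toy gradient-based minimization of J_Phi, assuming a linear prior
# Phi(x) = A x and H = identity; lam1, lam2, alpha are illustrative.
rng = np.random.default_rng(0)
n = 50
A = 0.95 * np.eye(n) + 0.01 * rng.standard_normal((n, n))   # toy linear prior
x_true = rng.standard_normal(n)
mask = rng.random(n) < 0.3                                  # observed subdomain Omega
y = np.where(mask, x_true + 0.05 * rng.standard_normal(n), 0.0)
lam1, lam2, alpha = 1.0, 0.1, 0.05

def cost(x):
    """J_Phi(x) = lam1 ||y - x||^2_Omega + lam2 ||x - Phi(x)||^2."""
    r_obs = (y - x) * mask
    r_reg = x - A @ x
    return lam1 * r_obs @ r_obs + lam2 * r_reg @ r_reg

def grad(x):
    """Hand-derived gradient; autodiff tools produce the same operator."""
    return (-2 * lam1 * (y - x) * mask
            + 2 * lam2 * (np.eye(n) - A).T @ (x - A @ x))

x = np.zeros(n)                      # initialization x^(0)
costs = [cost(x)]
for _ in range(200):                 # x^(i+1) = x^(i) - alpha * grad J_Phi
    x = x - alpha * grad(x)
    costs.append(cost(x))
```

For this convex quadratic cost, any step size below the inverse Lipschitz constant of the gradient yields a monotone decrease of $\mathcal{J}_\Phi$.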
In our case, we are interested in a purely data-driven operator $\Phi$: we consider NN-based Gibbs-Energy (GENN) representations, a way of embedding Markovian priors in CNNs which has proven efficient on SSH altimetric datasets. This enables the use of deep learning automatic differentiation tools: given the architecture of operator $\Phi$, the gradient operator $\nabla_{\mathbf{x}}\mathcal{J}_{\Phi}$ is computed as a composition of operators involving tensors, convolutions and activation functions.
### 1.2 Trainable solver architecture
The proposed end-to-end architecture consists in embedding an iterative gradient-based solver derived from the considered variational representation. As inputs, we consider the observations $\mathbf{y}$, the associated observation domain $\Omega$ and some initialization $\mathbf{x}^{(0)}$. Let us denote by $\Gamma$ this iterative update operator. Following meta-learning schemes, a residual LSTM-based representation of operator $\Gamma$ is considered here, where the $i^{th}$ iterative update of the solver is given by:
\begin{align*}
\left \{\begin{array}{ccl}
g^{(i+1)}& = & LSTM \left[ \alpha \cdot \nabla_{\mathbf{x}}\mathcal{J}_{\Phi}(\mathbf{x}^{(i)} ,\mathbf{y},\Omega), h^{(i)} , c^{(i)} \right ] \\~\\
\mathbf{x}^{(i+1)}& = & \mathbf{x}^{(i)} - {\cal{T}} \left( g^{(i+1)} \right )
\end{array} \right.
\end{align*}
where $g^{(i+1)}$ is the LSTM output given the input gradient $\nabla_{\mathbf{x}}\mathcal{J}_{\Phi}(\mathbf{x}^{(i)} ,\mathbf{y},\Omega)$, $h^{(i)}$ and $c^{(i)}$ denote the internal states of the LSTM, $\alpha$ is a normalization scalar and ${\cal{T}}$ is a linear or convolutional mapping.
Note that a CNN architecture could also be used instead of the LSTM representation of $\Gamma$, and that when replacing both the LSTM cell by the identity operator and the cost function $\mathcal{J}_{\Phi}(\mathbf{x},\mathbf{y},\Omega)$ by its single regularization term $\mathcal{J}_{\Phi}^b(\mathbf{x})$, the gradient-based solver reduces to a parameter-free fixed-point version of the algorithm.
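A minimal sketch of such a fixed-point variant, where the iteration alternates between applying the prior and reinserting the observations on $\Omega$; the 3-point moving average below is only an illustrative stand-in for the trained GENN prior:

```python
import numpy as np

# Parameter-free fixed-point sketch: x <- Phi(x) on the gaps, with the
# observed values reinserted on Omega at every iteration.
rng = np.random.default_rng(1)
n = 100
truth = np.sin(np.linspace(0, 3 * np.pi, n))
mask = rng.random(n) < 0.3            # observed subdomain Omega
mask[[0, -1]] = True                  # anchor the boundaries
y = np.where(mask, truth, 0.0)        # noise-free pseudo-observations

def phi(x):
    """Toy smoothness prior: local 3-point moving average."""
    xp = np.pad(x, 1, mode="edge")
    return (xp[:-2] + xp[1:-1] + xp[2:]) / 3.0

x = y.copy()                          # initialization x^(0)
for _ in range(5000):
    x_new = np.where(mask, y, phi(x)) # fixed-point update with reinsertion
    if np.max(np.abs(x_new - x)) < 1e-10:
        break
    x = x_new
```

At convergence the gaps are filled by the prior (here, a piecewise-linear interpolation between observed points), while the observed values are preserved exactly.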
### 1.3 End-to-end joint learning scheme
Overall, let us denote by $\Psi_{\Phi,\Gamma}(\mathbf{x}^{(0)},\mathbf{y},\Omega )$ the output of the end-to-end learning scheme (see Figure below), given architectures for both NN-based operators $\Phi$ and $\Gamma$, the initialization $\mathbf{x}^{(0)}$ of state $\mathbf{x}$ and the observations $\mathbf{y}$ on domain $\Omega$.

Then, the joint learning of operators $\lbrace\Phi,\Gamma\rbrace$ is stated as the minimization of a reconstruction cost:
\begin{align*}
\arg \min_{\Phi,\Gamma} \mathcal{L}(\mathbf{x},\mathbf{x}^\star) \mbox{ s.t. }
\mathbf{x}^\star = \Psi_{\Phi,\Gamma} (\mathbf{x}^{(0)},\mathbf{y},\Omega)
\end{align*}
In the case of supervised learning, where targets are gap-free:
$\mathcal{L}(\mathbf{x},\mathbf{x}^\star)=||\mathbf{x}-\mathbf{x}^\star||^2+||\nabla\mathbf{x}-\nabla\mathbf{x}^\star||^2$, i.e. the L2 norm of the difference between state $\mathbf{x}$ and reconstruction $\mathbf{x}^\star$, with an additional term on the difference between their spatial gradients.
In the case of unsupervised learning, given the observations $\mathbf{y}$ on domain $\Omega$ and the hidden state $\mathbf{x}$, the 4DVar cost function may be used: $\mathcal{L}(\mathbf{x},\mathbf{x}^\star) = \lambda_1 || \mathbf{y} - \mathcal{H}(\mathbf{x}) ||^2_\Omega + \lambda_2 || \mathbf{x} - \Phi(\mathbf{x}) ||^2$, with weights $\lambda_1$ and $\lambda_2$ to be adapted according to the reliability of the observations.
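The two losses can be sketched as follows (numpy-based, with `np.gradient` as a generic discrete gradient operator and an identity observation operator; the weights and the `phi` argument are illustrative, and the unsupervised cost is evaluated here on the reconstruction):

```python
import numpy as np

# Sketch of the two training losses; H is taken as the identity and
# np.gradient as the discrete spatial gradient operator.
def supervised_loss(x, x_star):
    """||x - x*||^2 plus the term on the spatial gradients."""
    gx, gy = np.gradient(x)
    gxs, gys = np.gradient(x_star)
    return (np.mean((x - x_star) ** 2)
            + np.mean((gx - gxs) ** 2 + (gy - gys) ** 2))

def unsupervised_loss(x_star, y, mask, phi, lam1=1.0, lam2=0.1):
    """4DVar cost: lam1 ||y - x*||^2_Omega + lam2 ||x* - Phi(x*)||^2."""
    return (lam1 * np.mean((mask * (y - x_star)) ** 2)
            + lam2 * np.mean((x_star - phi(x_star)) ** 2))
```

Both losses vanish for a perfect reconstruction (and, for the unsupervised one, a self-consistent prior), which is a quick sanity check when wiring them into a training loop.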
## 2. The Dataset
The 4DVarNet algorithm has been successfully tested on small datasets, typically sequences of spatio-temporal images of size 11x200x200 (Nt x Ny x Nx).
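One natural way to scale up is to tile the full grid into such patches. A minimal sketch of the tiling indices, assuming non-overlapping patches (stride equal to the patch size; in practice overlapping strides would likely be used to limit border effects):

```python
# Tiling the full NATL grid (365 x 761 x 1721, time x lat x lon) into
# 11 x 200 x 200 space-time patches; strides equal the patch sizes here.
nt, ny, nx = 11, 200, 200
full_t, full_y, full_x = 365, 761, 1721

origins = [(t, j, i)
           for t in range(0, full_t - nt + 1, nt)
           for j in range(0, full_y - ny + 1, ny)
           for i in range(0, full_x - nx + 1, nx)]

# For a field stored as a (365, 761, 1721) array, one patch is then:
# patch = field[t:t + nt, j:j + ny, i:i + nx]   for (t, j, i) in origins
```

The full array is left out on purpose: at float32 precision it weighs about 1.9 GB, which is precisely why on-the-fly batch generation from the raw files matters here.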
### 2.1 Location
In this Hackathon, the main objective is to make it work efficiently on a larger dataset, available at:
```
# obs
https://s3.eu-central-1.wasabisys.com/melody/NATL/data/gridded_data_swot_wocorr/dataset_nadir_0d_swot.nc
#oi
https://s3.eu-central-1.wasabisys.com/melody/NATL/oi/ssh_NATL60_swot_4nadir.nc
#ref
https://s3.eu-central-1.wasabisys.com/melody/NATL/ref/NATL60-CJM165_NATL_ssh_y2013.1y.nc
https://s3.eu-central-1.wasabisys.com/melody/NATL/ref/NATL60-CJM165_NATL_sst_y2013.1y.nc
```
They are stored in the [NetCDF format](https://www.unidata.ucar.edu/software/netcdf/), which can be handled in Python using, for instance, the [xarray](https://xarray.pydata.org/en/stable/index.html) package.
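For instance, once a file has been downloaded it can be explored with xarray. A tiny in-memory stand-in replaces the real file here; the variable and coordinate names follow the dataset description in the next subsection:

```python
import numpy as np
import xarray as xr

# Tiny in-memory stand-in for one of the NetCDF files; with a downloaded
# file one would simply write:
#   ds = xr.open_dataset("NATL60-CJM165_NATL_ssh_y2013.1y.nc")
time = np.arange("2012-10-01", "2012-10-03", dtype="datetime64[D]")
lat = np.linspace(27.0, 27.10, 3)      # real grid: 761 points at 1/20 deg
lon = np.linspace(-79.0, -78.85, 4)    # real grid: 1721 points at 1/20 deg
ds = xr.Dataset(
    {"ssh": (("time", "lat", "lon"),
             np.zeros((time.size, lat.size, lon.size)))},
    coords={"time": time, "lat": lat, "lon": lon},
)

# Label-based spatial subsetting, e.g. to extract a study area:
box = ds.ssh.sel(lat=slice(26.99, 27.06), lon=slice(-79.01, -78.89))
```

The same `sel(lat=slice(...), lon=slice(...))` pattern extracts the GF, OSMOSIS or NATL boxes from the real files.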
### 2.2 Description
All three datasets (reference, observations, optimal interpolation) have three dimensions:
- **time** (365)
- **lat** (761)
- **lon** (1721)
All the variables are stored as 2D 761x1721 regular grids along the 365 days with the following coordinates:
- **time** 365 days: _'2012-10-01' '2012-10-02' ... '2013-09-30'_
- **lat** 761 x 1/20° : _27.0 27.05 27.1 27.15 27.2 ... 64.85 64.9 64.95 65.0_
- **lon** 1721 x 1/20° : _-79.0 -78.95 -78.9 -78.85 ... 6.9 6.95 7.0_
#### 2.2.1 Ground Truth
- The SSH Ground Truth **NATL60-CJM165_NATL_ssh_y2013.1y.nc**:
- **ssh** (Sea Surface Height): one year of daily fields provided by the [NATL60](https://meom-group.github.io/swot-natl60/science.html) state-of-the-art oceanic simulation
- The SST complementary Ground Truth **NATL60-CJM165_NATL_sst_y2013.1y.nc**:
- **sst** (Sea Surface Temperature): the sst may be used in complementary tests to improve the ssh spatio-temporal interpolation
#### 2.2.2 Pseudo-observations
- The pseudo-observations dataset **dataset_nadir_0d_swot.nc**, generated by sampling the SSH Ground Truth along realistic satellite trajectories:
- **mask** : mask (1 -> ocean, 0 -> land)
- **lag** : time deviation (in hours) from the selected day
- **flag** : satellite type (0 -> NADIR, 1 -> SWOT)
- **ssh_obs**: data with additional realistic noise
- **ssh_mod**: data without noise (=model)
#### 2.2.3 Optimal interpolation (OI)
- The state-of-the-art optimal interpolation (OI) dataset **ssh_NATL60_swot_4nadir.nc**, based on the previous pseudo-observations. This is the baseline the 4DVarNet algorithm aims at improving:
- **ssh_obs**: OI using ssh_obs in the pseudo-observations dataset
- **ssh_mod**: OI using ssh_mod in the pseudo-observations dataset
Note that the OI is currently involved when applying the 4DVarNet algorithm to the anomaly $\mathbf{x}-\overline{\mathbf{x}}$ between the raw ssh and the OI (denoted here $\overline{\mathbf{x}}$ and seen as a large-scale component of the SSH). This solution is, for now, the one giving the best results. The OI is also used as an additional input channel of the 4DVarNet algorithm and can thus be seen as an extra covariate that helps to localize the areas with large anomalies.
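The decomposition described above can be sketched as follows (array shapes match one training patch; the arrays themselves are placeholders and `oi` stands for $\overline{\mathbf{x}}$):

```python
import numpy as np

# Sketch of the OI-based decomposition: the network works on the anomaly
# x - x_bar (ssh minus OI), and the OI is stacked as an extra input channel.
rng = np.random.default_rng(2)
ssh = rng.standard_normal((11, 200, 200))   # one space-time ssh patch
oi = np.zeros_like(ssh)                     # OI baseline x_bar (placeholder)

anomaly = ssh - oi                          # field actually reconstructed
inputs = np.stack([anomaly, oi], axis=1)    # (time, channel, lat, lon)
ssh_rebuilt = anomaly + oi                  # full ssh recovered afterwards
```

After inference, adding the OI back to the reconstructed anomaly recovers the full SSH field.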
## 3 Code available on GitHub
Here is the GitHub repo with a simplified version of our code: [4DVarNet](https://github.com/CIA-Oceanix/4dvarnet-core), together with some explanations regarding its architecture: [architecture of the code](https://github.com/CIA-Oceanix/4dvarnet-core/blob/main/doc/code_archi.md)
Key components of the code:
* 4DVarNet PyTorch Lightning module
* Multi-GPU/multi-node distribution using PyTorch Lightning
* Possible on-the-fly batch generation from raw files
## 5 Actions under development
Color code specifying the status of the following tasks:
<span style="color:blue">**Done**</span>
<span style="color:green">**In progress**</span>
<span style="color:red">**To do**</span>
As you can see, there are still plenty of things to do :)
### 5.1 R&T CNES 2021
#### 5.1.1 OSSE NATL60
* <span style="color:blue">Finalize the experiments (xp1, 2 and 3), produce the appropriate set of scores, check we won the cup ;) and prepare the products (NetCDF+Notebooks) related to the OSSE Boost-SWOT data challenge:
[DC link](https://github.com/CIA-Oceanix/2021_4DVARNET_CHALLENGE_NATL60) </span> **(Mohamed)**
* <span style="color:red"> Improve the stride handling for training on the "extended" GF area, in order to avoid border effects </span> **(Quentin)**
* <span style="color:green"> Debugging of the 4DVarNet-core branch:
-> (1) start again from Quentin's branch for a first GF->GF test (forcing the training losses on the SWOT calibration to 0).
-> (2) modify/simplify Quentin's code to get back to the interpolation-only problem (i.e., no joint calibration). Check that the results are similar to the first experiment.
-> (3) add the augmented-state aspect to this code. Check that this configuration performs as well as, or better than, the current 4dvarnet results on the boost-swot repo.
</span> **(Hugo)**
* <span style="color:blue">Mid-october: Based on the Boost-SWOT data challenge, create a new repo for the "OSSE NATL data challenge", with the three areas, GF, OSMOSIS and NATL
> GF domain: [-65.,-55.,33.,43.]
> OSMOSIS domain: [-19.5,-11.5,45.,55.]
> NATL domain: [-79.,7.,27.,65.] </span>
**(Hugo & Maxime)**
* <span style="color:blue">Based on the plan provided by Maxime in Section 5.5.1, start writing the R&T report (see the [Overleaf link](https://www.overleaf.com/4781119179qvqwxtpcsjrm)) on the successful scaling up of 4DVarNet</span> **(Maxime)**
#### 5.1.2 OSSE Glorys
* **(Ronan & Maxime)** <span style="color:red">Think about the experiments we want to run at global scale with Glorys (resolution at 1/12°)
* prepare a data challenge
* prepare meeting with CLS </span>
* <span style="color:red"> Run the OSSE</span> **(Benjamin)**
### 5.2 OSTST/TOSCA
#### 5.2.1 OSE
* <span style="color:green">Run the comparison based on the c2 along-track datasets, cf BOOST-SWOT data challenge</span> **(Maxime)**
#### 5.2.2 Multi-tracer/multi-sensor synergies
* <span style="color:red"> 4DVarNet-SLA-SST based on NATL60 datasets on the whole North Atlantic basin </span>**(Hugo?)**
### 5.3 SWOT ST DIEGO
#### 5.3.1 SWOT calibration
* <span style="color:green"> SWOT robustness to noise signals</span> **(Quentin)**
### 5.3 On-going developments:
* **Pull requests**
* <span style="color:green">Clean the PRs of the GitHub repo</span> **(Maxime and Quentin)**
* <span style="color:red">Get GAN-based code (Anis) and make a PR</span> **(Hugo & Ronan)**
* <span style="color:red">Get new iterative approach (Duong) and make a PR</span> **(Quentin)**
* <span style="color:red"> Implement test method on multi-GPU </span> **(Hugo)**
* <span style="color:red"> Improve animation </span> **(Hugo)**
* **Technical issues**
* <span style="color:blue">Improve the handling of complex domains and the extraction at the center of the patch</span> **(Maxime and Quentin)**
* <span style="color:green">Add along-track nadir gradients</span>
* <span style="color:red"> Test on the full domain</span> **(Hugo)**
* <span style="color:red"> Discuss hydra/lightning </span> **(Quentin & Benjamin)**
* <span style="color:green">**4DVarNet-stochastic** </span> **(Maxime)**
### 5.4 Publications
* <span style="color:blue">**Quentin**: calibration paper (submitted)</span>
* <span style="color:green">**Maxime**: SPDE (in prep)</span>
### 5.5 Deliverables
#### 5.5.1 R&T CNES 2021
* L1.1 Replication of the OSSE results obtained on the GULFSTREAM area from nadir+SWOT data, at the scale of the North Atlantic basin
* L1.2 Replication of the OSSE results obtained on the GULFSTREAM area from nadir-only data, at the scale of the North Atlantic basin
* L1.3 Synthesis of the OSSEs using a variational formulation for the reconstruction of altimetric fields at the scale of the North Atlantic basin
* L2.1 Specifications of the OSEs for the scaling up
* L2.2 Synthesis of the OSEs using a variational formulation for the reconstruction of altimetric fields at the scale of the North Atlantic basin
* L2.3 Action plan for the follow-up of the activities related to porting the SLANet applications to the DUACS framework, in particular in terms of complementary developments towards an operational deployment.
#### 5.5.2 OSTST DUACS HR
**WP3 New Neural Networks approaches for SL mapping**
R. Fablet, M. Beauchamp, Q. Febvre, M Ballarotta, A. Pascual
* WP 3.1 Development of a benchmarking framework for learning-based SL mapping using real satellite altimeter datasets
* WP 3.2 Development and evaluation of end-to-end learning frameworks for the SL mapping
* WP 3.3 Learning-based SL mapping using multi-tracer/multi-sensor synergies
#### 5.5.3 SWOT ST DIEGO