
4DVarNet for space-time interpolation: ongoing developments

1. The 4DVarNet algorithm

Let $y(\Omega)=\{y_k(\Omega_k)\}$ denote the partial and potentially noisy observational dataset corresponding to subdomain $\Omega=\{\Omega_k\} \subset D$, where $\overline{\Omega}$ denotes the gappy part of the field and index $k$ refers to time $t_k$. Using a data assimilation state-space formulation, we aim at estimating the hidden state $x=\{x_k(\Omega_k)\}$ from the observations $y$.

1.1 The variational model

Considering a variational data assimilation scheme, the state analysis $x^\star$ is obtained by solving the minimization problem:

$$x^\star = \arg\min_x J(x)$$

where the variational cost function $J(x)=J_\Phi(x,y,\Omega)$ is generally the sum of an observation term and a regularization term involving an operator $\Phi$, which is typically a dynamical prior:

$$J_\Phi(x,y,\Omega) = J^o(x,y,\Omega) + J^b_\Phi(x) = \lambda_1 \|y - H(x)\|^2_\Omega + \lambda_2 \|x - \Phi(x)\|^2$$

where $H$ is the observation operator and $\lambda_{1,2}$ are predefined or learnable scalar weights. This formulation of functional $J_\Phi(x,y,\Omega)$ directly relates to strong constraint 4D-Var.

For inverse problems with time-related processes, the minimization of functional $J_\Phi$ usually involves iterative gradient-based algorithms. In classic model-based variational data assimilation schemes, where operator $\Phi$ is a deterministic model $x_{k+1}=M(x_k)$, computing the gradient requires the adjoint method:

$$x^{(i+1)} = x^{(i)} - \alpha \nabla_x J_\Phi(x^{(i)},y,\Omega)$$

In our case, we are interested in a purely data-driven operator $\Phi$: we consider NN-based Gibbs-Energy (GENN) representations, a way of embedding Markovian priors in CNNs which has proved efficient on SSH altimetric datasets. This makes it possible to use deep learning automatic differentiation tools: given the architecture of operator $\Phi$, the computation of the gradient operator $\nabla_x J_\Phi$ can be seen as a composition of operators involving tensors, convolutions and activation functions.
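
As a minimal sketch of this idea (not the authors' implementation), the snippet below evaluates the variational cost with PyTorch and obtains its gradient by automatic differentiation, then runs a plain gradient descent. The tiny CNN standing in for $\Phi$, the identity observation operator restricted to a mask, the weights `lam1`/`lam2` and the step size `alpha` are all assumptions chosen for illustration.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the prior operator Phi: a tiny CNN
# (the source uses GENN representations, not reproduced here).
phi = torch.nn.Sequential(
    torch.nn.Conv2d(1, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Conv2d(8, 1, kernel_size=3, padding=1),
)
for p in phi.parameters():
    p.requires_grad_(False)  # Phi is kept fixed in this sketch


def variational_cost(x, y, mask, lam1=1.0, lam2=0.1):
    """J_Phi(x, y, Omega), assuming H is the identity on observed pixels."""
    obs_term = ((mask * (y - x)) ** 2).sum()
    prior_term = ((x - phi(x)) ** 2).sum()
    return lam1 * obs_term + lam2 * prior_term


# Toy data: a random field observed on roughly 30% of the pixels.
truth = torch.randn(1, 1, 16, 16)
mask = (torch.rand_like(truth) < 0.3).float()
y = mask * truth

x = torch.zeros_like(truth, requires_grad=True)  # initialization x^(0)
alpha = 0.05                                     # assumed step size
costs = []
for _ in range(50):
    cost = variational_cost(x, y, mask)
    (grad,) = torch.autograd.grad(cost, x)       # nabla_x J_Phi via autograd
    with torch.no_grad():
        x -= alpha * grad                        # x^(i+1) = x^(i) - alpha * grad
    costs.append(cost.item())
```

No adjoint model has to be written by hand here: the gradient comes from the same graph of convolutions and activations that defines the prior.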

1.2 Trainable solver architecture

The proposed end-to-end architecture consists in embedding an iterative gradient-based solver based on the considered variational representation. As inputs, we consider an observation $y$, the associated observation domain $\Omega$ and some initialization $x^{(0)}$. Let us denote by $\Gamma$ this iterative update operator. Following meta-learning schemes, a residual LSTM-based representation of operator $\Gamma$ is considered here, where the $i$-th iterative update of the solver is given by:

$$\begin{cases} g^{(i+1)} = \mathrm{LSTM}\left[\alpha \cdot \nabla_x J_\Phi(x^{(i)},y,\Omega),\ h^{(i)},\ c^{(i)}\right] \\ x^{(i+1)} = x^{(i)} - T(g^{(i+1)}) \end{cases}$$

where $g^{(i+1)}$ is the LSTM output using as input the gradient $\nabla_x J_\Phi(x^{(i)},y,\Omega)$, $h^{(i)}$ and $c^{(i)}$ denote the internal states of the LSTM, $\alpha$ is a normalization scalar and $T$ is a linear or convolutional mapping.

Note that a CNN architecture could also be used instead of the LSTM representation of $\Gamma$. Moreover, when both the LSTM cell is replaced by the identity operator and the cost function $J_\Phi(x,y,\Omega)$ is replaced by its single regularization term $J^b_\Phi(x)$, the gradient-based solver simply leads to a parameter-free fixed-point version of the algorithm.
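
The residual LSTM update can be sketched as follows. This is an illustrative toy under several assumptions, not the code of the repository: the state is flattened to a vector so that `torch.nn.LSTMCell` can be used (a convolutional LSTM would be the natural choice on 2D fields), the mapping `T` is taken as linear, and the prior `phi`, the sizes and the number of iterations are hypothetical.

```python
import torch
from torch import nn

torch.manual_seed(0)
STATE_DIM = 64  # flattened toy state (hypothetical size)

# Hypothetical stand-in for the prior Phi.
phi = nn.Sequential(nn.Linear(STATE_DIM, STATE_DIM), nn.Tanh(),
                    nn.Linear(STATE_DIM, STATE_DIM))


def cost_fn(x, y, mask, lam1=1.0, lam2=0.1):
    """Variational cost, assuming H is the identity on observed entries."""
    return lam1 * ((mask * (y - x)) ** 2).sum() + lam2 * ((x - phi(x)) ** 2).sum()


class LSTMSolver(nn.Module):
    """Iterates g = LSTM(alpha * grad, h, c) and x <- x - T(g)."""

    def __init__(self, state_dim, hidden_dim=32, alpha=1.0, n_iter=5):
        super().__init__()
        self.alpha, self.n_iter = alpha, n_iter
        self.cell = nn.LSTMCell(state_dim, hidden_dim)
        self.T = nn.Linear(hidden_dim, state_dim)  # linear mapping T

    def forward(self, x0, y, mask):
        x = x0.detach().clone().requires_grad_(True)
        h = x.new_zeros(x.shape[0], self.cell.hidden_size)
        c = torch.zeros_like(h)
        for _ in range(self.n_iter):
            # create_graph=True keeps the gradient itself differentiable,
            # so the whole solver can be trained end to end.
            (grad,) = torch.autograd.grad(cost_fn(x, y, mask), x, create_graph=True)
            h, c = self.cell(self.alpha * grad, (h, c))  # h plays the role of g^(i+1)
            x = x - self.T(h)                            # residual update
        return x


truth = torch.randn(4, STATE_DIM)
mask = (torch.rand_like(truth) < 0.3).float()
y = mask * truth

solver = LSTMSolver(STATE_DIM)
x_hat = solver(torch.zeros_like(truth), y, mask)
loss = ((x_hat - truth) ** 2).mean()
loss.backward()  # gradients reach both the solver and the prior
```

Because the gradient is computed with `create_graph=True`, the reconstruction loss backpropagates through every solver iteration and into the parameters of both the solver and the prior.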

1.3 End-to-end joint learning scheme

Overall, let us denote by $\Psi_{\Phi,\Gamma}(x^{(0)},y,\Omega)$ the output of the end-to-end learning scheme (see Figure below), given architectures for both NN-based operators $\Phi$ and $\Gamma$, the initialization $x^{(0)}$ of state $x$ and the observations $y$ on domain $\Omega$.

[Figure: end-to-end architecture of the 4DVarNet learning scheme (image not available)]

Then, the joint learning of operators $\{\Phi,\Gamma\}$ is stated as the minimization of a reconstruction cost:

$$\arg\min_{\Phi,\Gamma} L(x,x^\star) \quad \text{s.t.} \quad x^\star = \Psi_{\Phi,\Gamma}(x^{(0)},y,\Omega)$$

In case of supervised learning, where targets are gap-free:
$L(x,x^\star) = \|x - x^\star\|^2 + \|\nabla x - \nabla x^\star\|^2$, i.e. the L2-norm of the difference between state $x$ and reconstruction $x^\star$, with an additional term related to the gradient of state $x$.
In case of unsupervised learning, where only the observations $y$ on domain $\Omega$ are available for the hidden state $x$, the 4DVar cost function may be used:
$L(x,x^\star) = \lambda_1 \|y - H(x^\star)\|^2_\Omega + \lambda_2 \|x^\star - \Phi(x^\star)\|^2$
with weights $\lambda_1$ and $\lambda_2$ to be adapted according to the reliability of the observations.
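
The two training losses can be sketched in PyTorch as below. This is a hedged illustration: the spatial gradient is approximated with forward finite differences, means are used instead of sums for the norms, $H$ is assumed to be the identity on observed pixels, and the function names are hypothetical.

```python
import torch


def spatial_gradient(field):
    """Forward finite differences along the last two axes (lat, lon)."""
    d_lat = field[..., 1:, :] - field[..., :-1, :]
    d_lon = field[..., :, 1:] - field[..., :, :-1]
    return d_lat, d_lon


def supervised_loss(x_true, x_rec):
    """Mean squared error on the state plus the same on its spatial gradient."""
    glat_t, glon_t = spatial_gradient(x_true)
    glat_r, glon_r = spatial_gradient(x_rec)
    l2_state = ((x_true - x_rec) ** 2).mean()
    l2_grad = ((glat_t - glat_r) ** 2).mean() + ((glon_t - glon_r) ** 2).mean()
    return l2_state + l2_grad


def unsupervised_loss(x_rec, y, mask, phi, lam1=1.0, lam2=1.0):
    """4DVar-like cost evaluated on the reconstruction only (no ground truth)."""
    n_obs = mask.sum().clamp(min=1)
    obs_term = ((mask * (y - x_rec)) ** 2).sum() / n_obs
    prior_term = ((x_rec - phi(x_rec)) ** 2).mean()
    return lam1 * obs_term + lam2 * prior_term


torch.manual_seed(0)
x = torch.randn(2, 11, 20, 20)                  # (batch, time, lat, lon) toy state
mask = (torch.rand_like(x) < 0.2).float()
loss_same = supervised_loss(x, x)               # identical fields
loss_shift = supervised_loss(x, x + 1.0)        # constant offset: gradient term vanishes
loss_unsup = unsupervised_loss(x, mask * x, mask, phi=lambda t: t)
```

A constant offset only affects the first term of the supervised loss, which shows what the gradient term adds: it penalizes errors in the spatial structure rather than in the mean level.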

2. The Dataset

The 4DVarNet algorithm has been successfully tested on small datasets, typically sequences of spatio-temporal images with sizes 11x200x200 (NtxNyxNx).

2.1 Location

In this Hackathon, the main objective is to make it work efficiently on a larger dataset, available at:

# obs
https://s3.eu-central-1.wasabisys.com/melody/NATL/data/gridded_data_swot_wocorr/dataset_nadir_0d_swot.nc

#oi
https://s3.eu-central-1.wasabisys.com/melody/NATL/oi/ssh_NATL60_swot_4nadir.nc


#ref
https://s3.eu-central-1.wasabisys.com/melody/NATL/ref/NATL60-CJM165_NATL_ssh_y2013.1y.nc
https://s3.eu-central-1.wasabisys.com/melody/NATL/ref/NATL60-CJM165_NATL_sst_y2013.1y.nc

They are stored in NetCDF format, which can be handled in Python using, for instance, the xarray package.
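
As a small sketch of how xarray can be used on such files, the snippet below selects a geographic box by coordinate labels. To stay self-contained it builds a coarse synthetic dataset in memory instead of reading the real files; the 1° grid, the random values and the helper name `select_domain` are assumptions, while the dimension names (`time`, `lat`, `lon`), the variable name `ssh_obs` and the GF box come from this document.

```python
import numpy as np
import xarray as xr


def select_domain(ds, lon_min, lon_max, lat_min, lat_max):
    """Return the sub-dataset covering the given lon/lat box (label-based)."""
    return ds.sel(lon=slice(lon_min, lon_max), lat=slice(lat_min, lat_max))


# For a file downloaded locally, one would typically start from:
#   ds = xr.open_dataset("dataset_nadir_0d_swot.nc")
# Here a coarse synthetic stand-in (1 degree instead of 1/20 degree) is used.
time = np.arange("2012-10-01", "2012-10-06", dtype="datetime64[D]")
lat = np.arange(27.0, 66.0, 1.0)    # 27, 28, ..., 65
lon = np.arange(-79.0, 8.0, 1.0)    # -79, -78, ..., 7
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"ssh_obs": (("time", "lat", "lon"),
                 rng.standard_normal((time.size, lat.size, lon.size)))},
    coords={"time": time, "lat": lat, "lon": lon},
)

# GF domain as listed in this document: [-65., -55., 33., 43.]
gf = select_domain(ds, -65.0, -55.0, 33.0, 43.0)
```

Label-based slicing with `sel` includes both end points, so the 10° x 10° GF box contains 11 x 11 grid points on this 1° toy grid.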

2.2 Description

All three datasets (reference, observations, optimal interpolation) have three dimensions:

  • time (365)
  • lat (761)
  • lon (1721)

All variables are stored as 2D 761x1721 regular grids for each of the 365 days, with the following coordinates:

  • time 365 days: '2012-10-01' '2012-10-02' ... '2013-09-30'
  • lat 761 x 1/20° : 27.0 27.05 27.1 27.15 27.2 ... 64.85 64.9 64.95 65.0
  • lon 1721 x 1/20° : -79.0 -78.95 -78.9 -78.85 ... 6.9 6.95 7.0
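
The coordinate description above can be checked with a few lines of NumPy, which also gives an idea of the memory footprint of one variable. The 4-byte (float32) storage used for the size estimate is an assumption; the actual files may use another type or compression.

```python
import numpy as np

# Coordinates rebuilt from the description: daily time steps, 1/20 degree grid.
time = np.arange("2012-10-01", "2013-10-01", dtype="datetime64[D]")
lat = np.linspace(27.0, 65.0, 761)
lon = np.linspace(-79.0, 7.0, 1721)

lat_step = lat[1] - lat[0]   # expected to be close to 0.05
lon_step = lon[1] - lon[0]   # expected to be close to 0.05

# Number of values in one (time, lat, lon) variable and its size in memory,
# assuming 4 bytes per value (float32).
n_values = time.size * lat.size * lon.size
size_gb = n_values * 4 / 1e9
```

Under that assumption a single variable takes close to 2 GB once loaded, which is one reason why patch-based, on-the-fly batch generation is attractive at this scale.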

2.2.1 Ground Truth

  • The SSH Ground Truth NATL60-CJM165_NATL_ssh_y2013.1y.nc:

    • ssh (Sea surface Height) One year long daily datasets provided by the NATL60 state-of-the-art oceanic simulation
  • The SST complementary Ground Truth NATL60-CJM165_NATL_sst_y2013.1y.nc:

    • sst (Sea surface Temperature): sst may be used for complementary tests to improve the ssh spatio-temporal interpolation

2.2.2 Pseudo-observations

  • The pseudo-observations dataset is generated by sampling the SSH Ground Truth with realistic satellite trajectories dataset_nadir_0d_swot.nc :
    • mask : mask (1 -> ocean, 0 -> land)
    • lag : time offset (in hours) from the selected day
    • flag : satellite type (0 -> NADIR, 1 -> SWOT)
    • ssh_obs: data with additional realistic noise
    • ssh_mod: data without noise (=model)

2.2.3 Optimal interpolation (OI)

  • The state-of-the-art optimal interpolation (OI) dataset ssh_NATL60_swot_4nadir.nc, based on the previous pseudo-observations. This is the baseline that the 4DVarNet algorithm aims to improve:
    • ssh_obs: OI using ssh_obs in the pseudo-observations dataset
    • ssh_mod: OI using ssh_mod in the pseudo-observations dataset

Note that the OI is currently involved when applying the 4DVarNet algorithm: the algorithm works on the anomaly $x - \bar{x}$ between the raw SSH and the OI (denoted here as $\bar{x}$ and seen as a large-scale component of the SSH). For now, this solution gives the best results. The OI is also used as additional input channels in the 4DVarNet algorithm and can thus be seen as an extra covariate that helps to localize the areas with large anomalies.
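
The anomaly formulation can be illustrated with a toy NumPy example. Everything here is synthetic and hypothetical (the random fields, the crude stand-in for the OI, the 10% sampling rate and the way channels are stacked); it only shows the bookkeeping between the raw SSH, the OI and the anomaly.

```python
import numpy as np

rng = np.random.default_rng(0)
nt, ny, nx = 5, 20, 20

ssh = rng.standard_normal((nt, ny, nx))                   # stand-in for the raw SSH x
# Crude "large-scale" stand-in for the OI: the daily spatial mean of the SSH.
oi = ssh.mean(axis=(1, 2), keepdims=True) * np.ones((nt, ny, nx))
mask = rng.random((nt, ny, nx)) < 0.1                     # observed pixels
obs = np.where(mask, ssh, np.nan)                         # gappy observations

# Learning target and observed anomaly, both relative to the OI.
anomaly_target = ssh - oi
anomaly_obs = np.where(mask, obs - oi, 0.0)               # zeros where unobserved

# OI stacked with the observed anomaly as extra input channels (assumed layout).
inputs = np.concatenate([oi, anomaly_obs], axis=0)


def to_ssh(anomaly_hat, oi_field):
    """Map a reconstructed anomaly back to a full SSH field."""
    return oi_field + anomaly_hat


ssh_back = to_ssh(anomaly_target, oi)                     # perfect anomaly gives the SSH
```

The network only has to recover the anomaly, while the large-scale component is added back afterwards from the OI.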

3 Code available on Github

A simplified version of our code is available in the GitHub repo 4DVarNet, together with some explanations regarding the architecture of the code.

Key components of the code:

  • 4DVarNet PyTorch Lightning module
  • Multi-GPU/multi-node distribution using PyTorch Lightning
  • Possible on-the-fly batch generation from raw files
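
On-the-fly patch generation can be sketched with a plain NumPy generator. This is not the data module of the repository: the function name, the strides and the toy array sizes are hypothetical, and only the 11x200x200 default patch size comes from this document.

```python
import numpy as np


def iter_patches(field, patch=(11, 200, 200), stride=(1, 100, 100)):
    """Yield (origin, view) for every patch fitting in a (time, lat, lon) array.

    Patches are NumPy views, so nothing is copied until a batch is assembled.
    """
    nt, ny, nx = field.shape
    pt, py, px = patch
    st, sy, sx = stride
    for t in range(0, nt - pt + 1, st):
        for j in range(0, ny - py + 1, sy):
            for i in range(0, nx - px + 1, sx):
                yield (t, j, i), field[t:t + pt, j:j + py, i:i + px]


# Toy example with a small array and a scaled-down patch size.
field = np.zeros((15, 40, 60), dtype=np.float32)
patches = list(iter_patches(field, patch=(11, 20, 20), stride=(2, 10, 20)))
origins = [origin for origin, _ in patches]
first_view = patches[0][1]
```

Overlapping strides along lat/lon give several reconstructions of each pixel, which is one way to reduce edge effects when only the center of each patch is kept.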

5 Actions under development

Color code to specify the status of the following tasks:
Done
In progress
To do
As you can see, there are still plenty of things to do :)

5.1 R&T CNES 2021

5.1.1 OSSE NATL60

  • Finalize the experiments (xp1, 2 and 3), produce the appropriate set of scores, check we won the cup ;) and prepare the products (NetCDF+Notebooks) related to the OSSE Boost-SWOT data challenge:
    DC link
    (Mohamed)

  • Improve the stride handling for training on the "extended" GF area, in order to avoid edge effects (Quentin)

  • Debugging of the 4DVarNet-core branch:
    -> (1) start again from Quentin's branch for a first GF->GF test (forcing the training losses on the SWOT calibration to 0).
    -> (2) modify/simplify Quentin's code to return to the interpolation-only problem (i.e., no joint calibration). Check that the results are similar to those of the first experiment.
    -> (3) add the augmented-state aspect to this code. Check that this configuration performs as well as, or better than, the current 4DVarNet results on the boost-swot repo.
    (Hugo)

  • Mid-October: Based on the Boost-SWOT data challenge, create a new repo for the "OSSE NATL data challenge", with the three areas GF, OSMOSIS and NATL:

GF domain: [-65.,-55.,33.,43.]
OSMOSIS domain: [-19.5,-11.5,45.,55.]
NATL domain: [-79.,7.,27.,65.]
(Hugo & Maxime)

  • Based on the plan provided by Maxime in Section 5.5.1, start writing the R&T report (see Overleaf link) on the successful scaling up of 4DVarNet (Maxime)

5.1.2 OSSE Glorys

  • (Ronan & Maxime) Think about the experiments we want to run at global scale with Glorys (resolution at 1/12°)
    • prepare a data challenge
    • prepare meeting with CLS
  • Run the OSSE (Benjamin)

5.2 OSTST/TOSCA

5.2.1 OSE

  • Run the comparison based on the c2 along-track datasets, cf BOOST-SWOT data challenge (Maxime)

5.2.2 Multi-tracer/multi-sensor synergies

  • 4DVarNet-SLA-SST based on NATL60 datasets on the whole North Atlantic basin (Hugo?)

5.3 SWOT ST DIEGO

5.3.1 SWOT calibration

  • SWOT robustness to noise signals (Quentin)

5.3 On-going developments:

  • Pull requests

    • Clean the PR of the Github repo (Maxime and Quentin)
    • Get GAN-based code (Anis) and make a PR (Hugo & Ronan)
    • Get new iterative approach (Duong) and make a PR (Quentin)
    • Implement test method on multi-GPU (Hugo)
    • Improve animation (Hugo)
  • Technical issues

    • Improve the handling of complex domains and the extraction at the center of the patch (Maxime and Quentin)
      • Add along-track nadir gradients
    • Test on the full domain (Hugo)
    • Discuss hydra/lightning (Quentin & Benjamin)
  • 4DVarNet-stochastic (Maxime)

5.4 Publications

  • Quentin: calibration paper (submitted)
  • Maxime: SPDE (in prep)

5.5 Deliverables

5.5.1 R&T CNES 2021

  • L1.1 Replication, at the scale of the North Atlantic basin, of the OSSE results obtained on the GULFSTREAM area from nadir+SWOT data
  • L1.2 Replication, at the scale of the North Atlantic basin, of the OSSE results obtained on the GULFSTREAM area from nadir-only data
  • L1.3 Synthesis of the OSSEs using a variational formulation for the reconstruction of altimetric fields at the scale of the North Atlantic basin
  • L2.1 Specifications of the OSEs for scaling up
  • L2.2 Synthesis of the OSEs using a variational formulation for the reconstruction of altimetric fields at the scale of the North Atlantic basin
  • L2.3 Action plan for the follow-up of the activities related to porting the SLANet applications to the DUACS theme, in particular in terms of complementary developments towards operational deployment.

5.5.2 OSTST DUACS HR

WP3 New Neural Networks approaches for SL mapping
R. Fablet, M. Beauchamp, Q. Febvre, M. Ballarotta, A. Pascual

  • WP 3.1 Development of a benchmarking framework for learning-based SL mapping using real satellite altimeter datasets
  • WP 3.2 Development and evaluation of end-to-end learning frameworks for the SL mapping
  • WP 3.3 Learning-based SL mapping using multi-tracer/multi-sensor synergies

5.5.3 SWOT ST DIEGO