# Python Model MVP Requirements
**Goal**: gather general requirements for a Python MVP
## Typical structure of a weather and climate model (Anurag)
Use [this](https://drive.google.com/file/d/1wdR8FPwVmZRBLeKP6Wzq2PmwaHkaDNhM/view?usp=sharing) to start the discussions
-------
**Discussion**
- Coupled = (atmosphere + land) + (ocean + sea ice)
- Atmosphere and land are usually very tightly coupled (every time step). Ocean and sea ice are coupled to each other but less tightly coupled to the atmosphere. The ocean can even run as its own model (own time stepping) in parallel, coupled to the atmosphere through a coupler (in ICON: YAC).
- [Workflow](https://drive.google.com/file/d/1wdR8FPwVmZRBLeKP6Wzq2PmwaHkaDNhM/view?usp=sharing):
1. Horizontal grid (and vertical decomposition)
2. Surface specification (EXTPAR software in ICON)
3. Vertical grid: defines the height/elevation of the K levels
4. Compute time-invariant grid related quantities (geometric factors, interpolation coefficients, ...)
5. Initialization
- Analytic data (e.g. Jablonowski-Williamson steady-state solution, ...)
- Setting up quantities from file inputs: model output from a different run, ozone, CO2, aerosols in the atmosphere
- Measurements (NWP)
6. Time loop starts:

1. For the first time steps in NWP: measurements, IAU (Incremental Analysis Update) / data assimilation (MCH doesn't use an adjoint model)
2. Dycore / dynamics (time-loop with small time steps: `dtime/ndyn_substeps`)
- *Maybe* in ICON it could be subdivided into multiple components in the future
4. Forward operators: compare real measurements with model predictions and decide how to assimilate (related to IAU? only for the first time steps?)
5. Diffusion, Tracer advection, fast physics (dtime)
6. Atmospheric physics (dtime)
- cloud microphysics
- turbulence
- ...
7. Slow physics (larger time steps than dynamics)
- radiation
8. Land model: (coupled to turbulence?)
- Within the land model, there might be different sublayers/components (e.g. snow, terrain, ...) which need to be computed in some specific order
- The lowest dycore layer is not the ground surface; the land model provides information about the actual ground, which may affect the first layers of the dycore (e.g. wind, ...)
- Land model provides some quantities like albedo related to physics
9. Coupler
- In ICON this is YAC (Yet Another Coupler: https://www.dkrz.de/pdfs/docs/yac-tutorial-1.55)
- ETH users usually require standard outputs from the atmosphere model?
- Ocean
- Initializes ocean model with some quantities from atmosphere dycore (top boundary condition for ocean)
- Ocean has its own dycore/physics which runs (usually in parallel) with different timestep so the coupler needs to sync between atmosphere and ocean
- Sea ice: depends on the user needs and how specific they want to be
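
The time-loop ordering discussed above can be sketched in a few lines of Python. This is only an illustration under assumptions: `TimeLoopConfig`, `run_time_loop`, `slow_physics_every` and the numeric values are hypothetical, only `dtime` and `ndyn_substeps` come from the notes, and land, coupler and data assimilation are left out. The function records which component would be called instead of computing anything.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TimeLoopConfig:
    """Hypothetical configuration; `dtime` and `ndyn_substeps` mirror the notes."""
    dtime: float = 300.0         # physics time step in seconds (assumed value)
    ndyn_substeps: int = 5       # dynamics substeps per physics step (assumed value)
    slow_physics_every: int = 4  # slow physics runs every N-th step (assumption)


def run_time_loop(config: TimeLoopConfig, n_steps: int) -> List[str]:
    """Return the sequence of component calls; no real computation happens."""
    calls: List[str] = []
    dt_substep = config.dtime / config.ndyn_substeps
    for step in range(n_steps):
        # Dynamics: several short substeps per physics step.
        for _ in range(config.ndyn_substeps):
            calls.append(f"dycore(dt={dt_substep})")
        # Diffusion, tracer advection and fast physics use the full step.
        for name in ("diffusion", "tracer_advection", "fast_physics"):
            calls.append(f"{name}(dt={config.dtime})")
        # Slow physics (e.g. radiation) runs only every few steps.
        if step % config.slow_physics_every == 0:
            slow_dt = config.dtime * config.slow_physics_every
            calls.append(f"radiation(dt={slow_dt})")
    return calls


calls = run_time_loop(TimeLoopConfig(), n_steps=2)
```

With the default values, each physics step contains five `dycore` calls with `dt=60.0`, and radiation is called only on the first of the two steps.
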
Christoph's ideas:
- Components with optional outputs might be implemented as a core component with a fixed set of outputs, with another component plugged in later to compute the extra outputs from the core set (provided gt4py or the compiler is able to fuse them together again to avoid performance penalties)
- different land-ocean coupling strategies (see slides 18 to 35): https://inria.hal.science/hal-02418164/document
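
The first idea (a core component with fixed outputs plus plug-in components for extra outputs) could look roughly like the sketch below. All names (`core_microphysics`, `cloud_flag`, `extend`) and the toy formula are hypothetical, fields are reduced to single floats, and the sketch only shows the composition; fusing the pieces again for performance would be up to gt4py or the compiler and is not modeled here. In the example call, `result` contains the core outputs `qc` and `qv` plus the extra output `cloudy`.

```python
from typing import Callable, Dict

State = Dict[str, float]  # stand-in for a dictionary of fields


def core_microphysics(state: State) -> State:
    """Hypothetical core component with a fixed output set (toy formula, not real physics)."""
    cloud_water = max(0.0, state["qv"] - state["qv_sat"])
    return {"qc": cloud_water, "qv": state["qv"] - cloud_water}


def cloud_flag(outputs: State) -> State:
    """Hypothetical add-on computing an extra output from the core set only."""
    return {"cloudy": 1.0 if outputs["qc"] > 0.0 else 0.0}


def extend(
    core: Callable[[State], State],
    *extras: Callable[[State], State],
) -> Callable[[State], State]:
    """Plug add-on components behind a core component without modifying the core."""
    def combined(state: State) -> State:
        outputs = dict(core(state))
        for extra in extras:
            outputs.update(extra(outputs))
        return outputs
    return combined


microphysics_with_flag = extend(core_microphysics, cloud_flag)
result = microphysics_with_flag({"qv": 0.012, "qv_sat": 0.010})
```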
-------
## General restrictions of the MVP in Python
**Goal**: agree upon what components we need for a minimal working model.
1) *Pure Python* (independent of Fortran model run)
- exceptions: e.g. domain decomposition?
2) *Global Model*: no LAM, that is no boundary conditions
3) *Only atmospheric*:
- minimal set of "granules": dycore (`dt_substep`), diffusion (`dtime`), slower time steps at multiples of
5) *Grid topology/geometry*: Icosahedron
- torus?
6) *I/O*:
- input: grid file, vertical levels,
- output: prognostics, diagnostics,
- log,
- timers
7) *Single node/multi node runs* ?
8) *Initial conditions*: only analytic/artificial (JW)
*More/Other?*
### use cases to support in 2024
- moist JW: Icosahedral, global
- Warm Bubble: Torus, initialization from analytical conditions, minimal physics, no land, no coupling, no ocean (dycore, tracer-advection, microphysics?)
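
As a way to make the restrictions and the two use cases concrete, they could be written down as validated experiment descriptions. The sketch below is an assumption-laden illustration: `ExperimentConfig` and all of its field names are hypothetical, and the component list for the moist JW case is a guess, since the notes do not fix it.

```python
from dataclasses import dataclass
from typing import Tuple

ALLOWED_GRIDS = ("icosahedral", "torus")
ALLOWED_INITIAL_CONDITIONS = ("analytic",)


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical experiment description; all field names are illustrative."""
    name: str
    grid: str
    components: Tuple[str, ...]
    initial_conditions: str = "analytic"
    limited_area: bool = False

    def __post_init__(self) -> None:
        # Enforce the MVP restrictions from the list above.
        if self.grid not in ALLOWED_GRIDS:
            raise ValueError(f"unsupported grid: {self.grid}")
        if self.initial_conditions not in ALLOWED_INITIAL_CONDITIONS:
            raise ValueError("the MVP supports analytic initial conditions only")
        if self.limited_area:
            raise ValueError("the MVP has no LAM mode (no boundary conditions)")


# Component list for moist JW is an assumption; the notes leave it open.
moist_jw = ExperimentConfig("moist_jw", "icosahedral", ("dycore", "diffusion"))
warm_bubble = ExperimentConfig(
    "warm_bubble", "torus", ("dycore", "tracer_advection", "microphysics")
)
```

Rejecting unsupported options at construction time keeps the restrictions in one place, so they can be relaxed later without touching the components.
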
### What does this mean in terms of minimal set of components that we need?
- physical components ("granules")
- dycore
- diffusion
- (warm bubble) cloud microphysics, advection
- Setup
- grid topology, geometry
- pre-computed fields
- configuration
- ..
- Infrastructure
- I/O
- ...
[This](https://unlimited.ethz.ch/display/EXCLAIM/Code+structure) is something that we (MPI and EXCLAIM) had discussed in 2022. Might be helpful to shape up the structure.
## Requirements for a Python driver model
**Goal:** understand how users work with the model, and what we want to enable?
1. *Backend* agnostic: runs on CPU, GPU
2. Runs in a Jupyter notebook as well as at scale
3. *Open/Closed* principle: open to extension, so that new components can be added or exchanged without modifying existing components.
- Example time loop:
- What type of extensions do users normally make (Anurag)? Where do they need to hook in?
- time loop, new diagnostics, new/changed physics parametrizations, ...? How would that work?
- initial conditions
- ...
4. Facilitate post-processing:
- produce output that conforms to CF conventions and can easily be used with tools that scientists know
- re-use of tooling that is used in post processing
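
One possible reading of the open/closed requirement is a driver that only knows a small component interface, so that users hook in by registering new components. The sketch below is hypothetical throughout (`Component`, `Driver`, `toy_dycore` and `step_counter` are made-up names, and the state is a plain dictionary of floats); it is meant as a starting point for the discussion, not as the intended design.

```python
from typing import Dict, List, Protocol

State = Dict[str, float]


class Component(Protocol):
    """Hypothetical interface every granule or diagnostic would implement."""
    def __call__(self, state: State, dt: float) -> None: ...


class Driver:
    """Runs registered components in order; new ones are added, not patched in."""
    def __init__(self, dt: float) -> None:
        self.dt = dt
        self.components: List[Component] = []

    def register(self, component: Component) -> "Driver":
        self.components.append(component)
        return self

    def run(self, state: State, n_steps: int) -> State:
        for _ in range(n_steps):
            for component in self.components:
                component(state, self.dt)
        return state


def toy_dycore(state: State, dt: float) -> None:
    """Stand-in for the dycore: only advances the model time."""
    state["time"] = state.get("time", 0.0) + dt


def step_counter(state: State, dt: float) -> None:
    """A user-added diagnostic; the driver and toy_dycore stay untouched."""
    state["n_steps"] = state.get("n_steps", 0.0) + 1.0


final = Driver(dt=10.0).register(toy_dycore).register(step_counter).run({}, n_steps=3)
```

The same script runs unchanged in a notebook cell, which fits the requirement to work interactively as well as at scale; backend selection is deliberately left out of this sketch.
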
### user needs
- Most users of EXCLAIM at ETH are end users: they run the model (as is, given config options) and just configure what output they need at what time interval. They don't implement any code.
- Other usages: might range from adding small granules to changing or enhancing the dycore (e.g. shallow building layer)
## notes on IO (Magdalena)
- **output**: output component should be decoupled from model.
- IO could be requested also from within a granule (in case dycore is seen as one granule). Might output the same field twice from the dycore??
- might need to interpolate quantities to regular lat/lon grid, or resample to coarser resolution.
- Look into YAC for coupling of IO system.
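
A decoupled output component could be sketched as an object that only reads the model state, collects output requests (from the driver or from within a granule), and writes each field at most once per step. Everything here is an assumption for illustration: `OutputManager`, `request`, `maybe_write` and `coarsen` are hypothetical names, fields are plain 1D lists, and block averaging stands in for real resampling or lat/lon interpolation. In the example, the duplicate request for `temperature` does not lead to duplicate output: `output.written` holds one entry each for steps 0 and 2.

```python
from typing import Dict, List, Set, Tuple

Field = List[float]


def coarsen(values: Field, factor: int) -> Field:
    """Block-average a 1D field; a toy stand-in for resampling to coarser resolution."""
    blocks = [values[i:i + factor] for i in range(0, len(values), factor)]
    return [sum(block) / len(block) for block in blocks]


class OutputManager:
    """Hypothetical output component that only reads model state."""
    def __init__(self) -> None:
        self.requests: Dict[str, Set[int]] = {}  # field name -> output intervals in steps
        self.written: List[Tuple[int, str, Field]] = []

    def request(self, name: str, every: int) -> None:
        # Repeated requests for the same field are merged into one entry.
        self.requests.setdefault(name, set()).add(every)

    def maybe_write(self, step: int, state: Dict[str, Field], coarsen_by: int = 1) -> None:
        for name, intervals in self.requests.items():
            # Write once per step even if several intervals match.
            if any(step % every == 0 for every in intervals):
                self.written.append((step, name, coarsen(state[name], coarsen_by)))


output = OutputManager()
output.request("temperature", every=2)
output.request("temperature", every=2)  # e.g. requested again from inside a granule
for step in range(4):
    output.maybe_write(step, {"temperature": [280.0, 282.0, 290.0, 294.0]}, coarsen_by=2)
```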
-----------------