# [Greenline] Python model architecture -
<!-- Add the tag for the current cycle number in the top bar -->
- Shaped by: Magdalena
- Appetite (FTEs, weeks):
- Developers: <!-- Filled in at the betting table unless someone is specifically required here -->
## Problem
<!-- The raw idea, a use case, or something we’ve seen that motivates us to work on this -->
### current status of greenline
Some components of ICON (*granules*) have been ported to Python, and a simple timestepping functionality exists. This allows us to run a Jablonowski-Williamson test case for an atmospheric dynamical core. So far the ports have been very faithful to the original, which allows us to test and verify individual granules against the Fortran code.
### motivation
To be of any real benefit, we believe that a Python version of ICON should not just reproduce the Fortran original. Rather:
- It should provide a more modular architecture that makes it easy to swap granules (for example, to run a different dycore) or to add new ones.
- It should build on the Python ecosystem. Python is adopted by many scientific communities and widely used for data-centric applications. These communities provide many libraries and tools that have proven useful, and we should use them wherever possible instead of duplicating their functionality in an awkward way.
- It should conform to standards and conventions used in the Weather & Climate community.
- It should bridge the gap between modelers and users. Python is already widely adopted by the Weather & Climate community, even if only for postprocessing and analysis. Rewriting the model in Python allows modelers and users to develop and use common tools. Even in our partner institutions (C2SM, MeteoSwiss), scientists use Python and write their own tools to work with ICON data.
We therefore want to revisit the overall architecture of the greenline model and move towards these goals.
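
To make the "swappable granules" goal concrete, here is a minimal sketch of what a shared component interface could look like. All names (`Granule`, `LinearRelaxation`, `ConstantHeating`, `run`) are hypothetical and invented for illustration; they are not part of icon4py, and the toy physics only stands in for real granules.

```python
from typing import Protocol

# Stand-in for a dictionary of model fields (real fields would be arrays).
State = dict[str, float]


class Granule(Protocol):
    """Hypothetical interface shared by all model components."""

    def step(self, state: State, dt: float) -> State: ...


class LinearRelaxation:
    """Toy 'dycore': relaxes temperature towards a reference value."""

    def __init__(self, reference: float, timescale: float) -> None:
        self.reference = reference
        self.timescale = timescale

    def step(self, state: State, dt: float) -> State:
        t = state["temperature"]
        tendency = (self.reference - t) / self.timescale
        return {**state, "temperature": t + dt * tendency}


class ConstantHeating:
    """Toy alternative component: adds a fixed heating rate."""

    def __init__(self, rate: float) -> None:
        self.rate = rate

    def step(self, state: State, dt: float) -> State:
        return {**state, "temperature": state["temperature"] + dt * self.rate}


def run(granules: list[Granule], state: State, dt: float, n_steps: int) -> State:
    """Driver that depends only on the Granule interface, not on concrete classes."""
    for _ in range(n_steps):
        for granule in granules:
            state = granule.step(state, dt)
    return state
```

Because `run` only relies on the `step` method, swapping one component for another is a change in the list passed to the driver, for example `run([ConstantHeating(rate=0.5)], {"temperature": 280.0}, dt=2.0, n_steps=3)`.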
## Appetite
<!-- Explain how much time we want to spend and how that constrains the solution -->
Full cycle, for at least one lead developer. Collaboration from the scientific side of EXCLAIM is needed for discussion and validation of ideas.
## Solution
<!-- The core elements we came up with, presented in a form that’s easy for people to immediately understand -->
The goal of this project is twofold:
1. Clarify and document what an MVP for an atmospheric model should contain:
    - granules (physics components)
    - infrastructure: IO, driver code, configuration, setup
    - important requirements for an atmospheric model that might drive the architecture and development.

    The goal of this task is to get everybody on the same page, reduce misunderstandings, and learn what we still need and how to prioritize development.
    These requirements should be gathered together with domain scientists; with this project we also want to encourage and deepen that collaboration.
2. Derive the architectural patterns that we want to follow in the future development and restructuring of the model MVP.

    The current model version does no IO. We propose to take the IO component as a first example, develop it along the principles mentioned above, and explore our ideas with it.
    Requirements for the IO system should be elaborated together with domain scientists, in more detail than for the other components.
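
As a starting point for discussion, the IO component could be sketched as a monitor that the timeloop calls every step and that only stores fields at a configurable interval, attaching CF-style metadata. This is a sketch under stated assumptions: `OutputMonitor`, `CF_METADATA` and the in-memory `records` list are hypothetical, and a real implementation would write NetCDF files instead. Only the attribute names `standard_name` and `units` and the standard name `air_temperature` come from the CF conventions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# CF-style attributes per prognostic variable (hypothetical lookup table).
CF_METADATA = {
    "air_temperature": {"standard_name": "air_temperature", "units": "K"},
}


@dataclass
class OutputMonitor:
    """Hypothetical IO component: records fields at a fixed output interval."""

    start: datetime
    interval: timedelta
    records: list[dict] = field(default_factory=list)

    def __post_init__(self) -> None:
        self._next_output = self.start

    def store(self, time: datetime, state: dict[str, list[float]]) -> None:
        """Called every timestep; keeps data only when an output time is reached."""
        if time < self._next_output:
            return
        for name, values in state.items():
            self.records.append(
                {"time": time, "name": name, "data": list(values), "attrs": CF_METADATA[name]}
            )
        # Advance past the current time, even if several intervals were skipped.
        while self._next_output <= time:
            self._next_output += self.interval


# Usage inside a toy timeloop: 10-minute steps, output every 30 minutes.
monitor = OutputMonitor(start=datetime(2024, 3, 1), interval=timedelta(minutes=30))
time = monitor.start
for _ in range(7):
    monitor.store(time, {"air_temperature": [280.0, 281.0]})
    time += timedelta(minutes=10)
```

Keeping the output decision inside the monitor means the timeloop does not need to know about output intervals or metadata, which is one possible way to keep IO swappable like any other component.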
### documents
- [CF conventions](https://cfconventions.org/)
- [CF-aware Python libraries](https://cfconventions.org/software.html)
- [UGRID conventions](http://ugrid-conventions.github.io/ugrid-conventions/)
- picture of the [domain model of ICON](https://drive.google.com/file/d/1RxNWFh_GBHXzFBE5c5WY2kF7tb3ApFX7/view?usp=sharing) from roughly a year ago, when we started on the greenline.
- Enrique's notes on [Climate and Forecast (CF) data handling in Python](https://hackmd.io/0x-YtL00Qq6G97C1oVaWkw)
- Magdalena's notes on [MVP draft](https://hackmd.io/jVzy2KlcTH-WQWLJEEJVcQ)
- [IO system requirements](https://hackmd.io/U202TVPoQveNej2xMbtUMw)
*Python model frameworks*
- [Sympl](https://github.com/mcgibbon/sympl)
- [Tasmania](https://github.com/eth-cscs/tasmania) (closed source)
## Rabbit holes
<!-- Details about the solution worth calling out to avoid problems -->
## No-gos
<!-- Anything specifically excluded from the concept: functionality or use cases we intentionally aren’t covering to fit the ## appetite or make the problem tractable -->
## Progress
<!-- Don't fill during shaping. This area is for collecting TODOs during building. As first task during building add a preliminary list of coarse-grained tasks for the project and refine them with finer-grained items when it makes sense as you work on them. -->
- [x] Model MVP requirements ([notes](https://hackmd.io/2XTIjN1BSVqiQzD30mgwmg))
- [x] [Architecture draft](https://hackmd.io/Rl0fAO8bSHa-ltMHxbNyFA)
- [x] IO for icon4py: [requirements](https://hackmd.io/U202TVPoQveNej2xMbtUMw)
- [x] Interface to EXCLAIM data platform: as of March 2024, from the platform's point of view it is fine to simply write NetCDF files, from which the platform can then take over
- [x] YAC: now has Python bindings and can be used for coupling different model runs and grid resolutions
- [x] are transformations (downsampling, regridding) always needed? -> see [IO requirements document](https://hackmd.io/U202TVPoQveNej2xMbtUMw)
- [ ] implement simple POC
- [x] metadata for prognostics
- [ ] output prognostics at configurable time interval
- [x] ICON grid matching the UGRID convention
- [ ] NetCDF data files referring to the grid file
- [ ] add to timeloop