# [WIP] icon4Py Greenline architecture

## Introduction

The so-called *greenline* development stream in [EXCLAIM](https://exclaim.ethz.ch/the-project/implementation.html) aims at porting the model driver to Python and building up a more modular and flexible architecture. This approach has several goals:

- Separate the concerns of user-facing model code and backend by using a DSL approach with GT4Py.
- With the same approach, allow users to run the model on a wide range of devices, from high-resolution runs on an HPC cluster to small exploratory runs on their laptops.
- Decouple the model into well-separated and encapsulated, physically meaningful components, so-called *granules*. This should make the model easier to handle for people who want to change or replace some components.

We use Python for the following reasons:

- With GT4Py we have a flexible DSL for performance portability, which is also developed within EXCLAIM.
- It is widely adopted in the science community and widely used in data-centric applications, which allows us to reuse many libraries and tools that have proven useful and support the approach.
- It is also widely adopted in the weather and climate (W&C) community, even if only for post-processing and analysis. We think it might be fruitful to bridge the gap between modeling and analysis, and a DSL approach to the HPC model code allows us to do so.

## Approach taken so far

Within EXCLAIM some ICON model components were ported to Python. They are examples of such granules and represent a very faithful port of the model. These are

- dycore (SolveNonHydro)
- diffusion
- microphysics

Further work on them is necessary to reach the above goals, especially in terms of composition and standardisation of those components.
## Architecture outline

We try to follow an approach similar to the one originally pioneered by [sympl](#sympl) and later used in [tasmania](#tasmania).

All physical components should follow a set of common requirements for `component`s: they should be self-explanatory or self-contained in the sense that a component declares its input fields and output fields in a way that is transparent to the user and extracts these fields from the model state at runtime.

### Model state

The model state is essentially a dictionary that contains *mutable* state: prognostic variables, diagnostics, tendency fields, etc. Components select their input fields from that dictionary.

Fields in the model state are self-descriptive and follow [CF conventions](#cf_conventions): they are not only data buffers but annotated fields with metadata such as descriptive names, units and dimensions.

It might be advisable to split the state into several parts depending on the type of field (prognostics, diagnostics, ...).

### Model Components (aka granules)

Components operate on the model state and compute updates to it or derive diagnostic quantities from it. They necessarily declare statically the input fields that they need and the output that they produce. At runtime, in the model loop, they get passed the model state and select their input fields from it.

Components need access to a couple of other infrastructure components which they need for their computation but do not change. These are objects like the `Mesh`, the `Communication` infrastructure for halo exchanges, static precomputed fields that might be shared among components, and possibly more.

When a component is called it returns the computed tendencies and diagnostics as defined in `output_properties`.

```python
class MyComponent(ModelComponent):
    def __init__(
        self,
        config: MyComponentConfig,
        mesh: IconGrid,
        comm,
        static_fields,
        # ... possibly more infrastructure objects
    ):
        ...

    def input_properties(self):
        """
        Return a dictionary of the dynamic fields that the component needs
        as input and selects from the model state upon __call__.
        """
        return {"air_pressure": {"units": "Pa"}, "normal_velocity": {"units": "m s-1"}}

    def output_properties(self):
        """Return a dictionary of the fields that the component produces."""
        return {"tendency_of_air_pressure": {"units": "Pa s-1"}}

    def __call__(self, model_state, *args):
        """
        Select the fields named in input_properties from the model state,
        do the component specific computation and return the fields
        declared in output_properties.
        """
        tendencies = ...  # component specific computation
        return tendencies
```

#### Different generic components

It might be helpful to provide a richer component hierarchy by offering a couple of generic component types, depending on what the granule does. For example, `sympl` declares several [generic component types](https://sympl.readthedocs.io/en/latest/computation.html) that differ in the kind of output they produce: there are generic `TendencyComponent`s, `DiagnosticComponent`s and `TimeStepper`s. This might ease the process and we will discover along the way what makes sense.

It is not yet clear whether the ICON granules can be cleanly separated into these different types of components or whether they all tend to produce all kinds of output.

Components can be composed into larger `CompositeComponents`.

##### IO components

One set of special components are

- [Output Components](https://hackmd.io/U202TVPoQveNej2xMbtUMw) that save state but do not contribute any change to the model state.
- `Input Components` that only create new state.

#### Configuration

Each component may have a set of configuration parameters which are passed at the initialization of the component and correspond to (some of) the namelist parameters in ICON.

Configuration uses a hierarchical model where a component's configuration is in itself independent but might reuse parts of another configuration set.
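The hierarchical configuration model described above could look roughly like the following minimal sketch based on plain Python dataclasses. All class and parameter names here (`VerticalGridConfig`, `DiffusionConfig`, `DycoreConfig`, `num_levels`, `coefficient`) are hypothetical and chosen for illustration only; they are not actual ICON namelist parameters or icon4Py classes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VerticalGridConfig:
    """Hypothetical configuration set that several components may share."""

    num_levels: int = 60


@dataclass(frozen=True)
class DiffusionConfig:
    """Hypothetical component configuration: usable on its own via defaults."""

    vertical: VerticalGridConfig = field(default_factory=VerticalGridConfig)
    coefficient: float = 0.1


@dataclass(frozen=True)
class DycoreConfig:
    """Hypothetical configuration of a second component reusing the same part."""

    vertical: VerticalGridConfig = field(default_factory=VerticalGridConfig)
    n_substeps: int = 5


# One shared sub-configuration is reused by both component configurations.
shared_vertical = VerticalGridConfig(num_levels=80)
diffusion_config = DiffusionConfig(vertical=shared_vertical, coefficient=0.05)
dycore_config = DycoreConfig(vertical=shared_vertical)
```

Each configuration class can be instantiated independently with its defaults, while passing the same `VerticalGridConfig` instance to both keeps the shared part consistent. Libraries such as hydra or omegaconf could serve the same purpose.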
```mermaid
classDiagram
    Dictionary
    ModelComponent
    ModelComponent <|-- MyComponent
    Dictionary <|-- ModelState
    Configuration
    Configuration <|-- MyComponentConfiguration
    ModelComponent: + __call__(ModelState state) Dictionary
    ModelComponent: + Dictionary input_properties
    ModelComponent: + Dictionary output_properties
    MyComponent: + __init__(MyComponentConfiguration config, Mesh mesh, ExchangeCommunicator comm, ...) MyComponent
    MyComponent: - Mesh mesh
    MyComponent: - Communicator comm
    MyComponent: - MyComponentConfiguration config
```

### Other Model Infrastructure

As mentioned above, there are several infrastructure elements of the model that can be provided to all components. These include

- horizontal grid topology
- geometry of the horizontal grid
- vertical grid
- backend, field allocators used
- model time
- communication infrastructure for halo exchanges

## Dynamic Runtime view

### Model State validation

Before starting the model run, the validity of the model state can be verified by requesting the `input_properties` and `output_properties` of all components and verifying that the tree they build up is valid.

### Time stepping / dynamical core

The dynamical core is usually the most complex component in terms of functionality and interfacing. It can be treated as a single monolithic component in the beginning and maybe later on split into a composite granule. [tasmania](#tasmania) defined a special component only for the dynamical core.

The ICON time loop runs different components at different time intervals:

![icon time stepping](https://hackmd.io/_uploads/HkD1ucsn6.png)

Each of the boxes represents a possible granule.

The `Timeloop` provides functionality to register `Components` in order at the different time intervals and subsequently run them. It is itself unaware of what the components do and which components it runs, but controls the dynamics of the system (for example the current number of substeps `n_substeps`...)
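The registration mechanism described above could be sketched as follows. This is only an illustration under several assumptions: the `every` argument (run a component on every k-th step), the `n_steps` constructor argument and the way component output is merged into the state dictionary are hypothetical simplifications and do not describe an existing icon4Py interface.

```python
class Timeloop:
    """Hypothetical sketch of a time loop that is unaware of what components do."""

    def __init__(self, n_steps: int):
        self.n_steps = n_steps
        # (interval, component) pairs, kept in registration order
        self._components = []

    def register(self, component, every: int = 1) -> None:
        """Register a callable component to run on every `every`-th step."""
        self._components.append((every, component))

    def __call__(self, model_state: dict) -> dict:
        for step in range(1, self.n_steps + 1):
            for every, component in self._components:
                if step % every == 0:
                    # Assumption: the returned fields are merged into the state.
                    model_state.update(component(model_state))
        return model_state


def counting_component(name):
    """Build a toy component that counts how often it has been called."""

    def component(state):
        return {name: state.get(name, 0) + 1}

    return component


loop = Timeloop(n_steps=4)
loop.register(counting_component("fast_calls"), every=1)
loop.register(counting_component("slow_calls"), every=2)
final_state = loop({})
```

With four steps, the component registered with `every=1` runs four times and the one registered with `every=2` runs twice. A real time loop would additionally have to handle substepping and the application of tendencies, which this sketch leaves out.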
```
classDiagram
    class Timeloop{
        - Dictionary components
        + __init__(TimeloopConfiguration config) Timeloop
        + register(Component comp, time_step)
        + __call__()
    }
```

TODO: diagram. Is the timeloop a composite component itself that takes in the whole model state and returns all of it, or is it something apart, an outside control structure?

### Resources

- [sympl](https://sympl.readthedocs.io/)
- ##### [tasmania](https://github.com/stubbiali/tasmania)
- ##### [CF conventions](https://cfconventions.org/)
- ##### [U Grid conventions](https://github.com/ugrid-conventions/ugrid-conventions)
- ##### [esmf](https://github.com/esmf-org/esmf/tree/develop), [presentation](https://docs.google.com/presentation/d/1ihScQpKtag5FyUc6SIYM736XfIWB5fV-F5HVrY66CBY/edit#slide=id.p4)
- ##### [community physics package](https://dtcenter.org/community-code/common-community-physics-package-ccpp)
- ##### [pangeo](https://pangeo.io/)

#### Related Python resources

- ##### [xarray](https://docs.xarray.dev/en/stable/)
- ##### [uxarray](https://uxarray.readthedocs.io/en/latest/)
- ##### [iris](https://scitools-iris.readthedocs.io/en/latest/)
- ##### configuration
    - [hydra](https://hydra.cc/)
    - [omegaconf](https://omegaconf.readthedocs.io/en/2.3_branch/)