# Python ICON bindings generator

###### tags: `functional cycle 11`

Appetite: full cycle

Developers: Sam, Matthias (full cycle)

### Background

Currently the bindings (from Fortran to DSL code) are generated by a C++ code generator. icon4pygen passes a metadata file to the C++ bindings generator, which generates the bindings from it. The C++ code of the bindings generator is modified dawn code and is in a state that would require heavy refactoring if it were to stay in C++. Additionally, the C++ standard string facilities are very basic, so one would have to switch to a stronger string templating library. Instead, we aim to port the bindings generator to Python.

### Goals

Port the bindings generator to Python so that no intermediate data file is needed and the necessary information can be inferred directly from the in-memory representation of the stencil code (probably the field operator AST, foast). Use eve/mako for the code generation.

### Non Goals

Do not solve the sparse representation problem or other things which are still up for discussion.

### Known Steps

* Before the port, evaluate and clean up the generated C++ code (for example, use the C++17 `static inline` feature to avoid out-of-class definitions of static class members)
* Make sure the code generator can handle fused stencil interfaces
* Ensure the Python-generated code is one-to-one the same as the C++-generated code
    * any deviation needs to be well justified
* Support the new sparse field formulation without hacks
* During development, evaluate which kind of mako/jinja templates (file template vs. inline template vs. mixing them) is the better-suited approach

### Possible Rabbit Holes

* Do not force the use of eve if using mako/jinja directly would be significantly more convenient (alternatively, change eve to support what we need?)

### ToDos & Notes

* Use workflows from compiled backends to orchestrate generation of different files (.h, .f90, .cpp) (?)
* At least .h and .f90 lend themselves very well to the eve workflow discussed with Rico (in-Python templates, use eve template injection facilities)
* f90 <-> cpp type mapping exists in code
* In a first step, keep the stringified interface. Do not write the metadata file to disk but keep it in memory
    * use a strongly typed iface in a second step
* Re-evaluate generated code for cpp (MR)
    * class really needed?
    * if the class is needed, exploit C++17 features to omit the external storage declaration for static data members
* adapt c++ bindgen
    - for all code generators (f90, gtheader, cppheader, cppimpl), pass fields and offsets into the constructor.
    - Do this for the CppHeader generator, and make use of the field class.
    - Try to model data using `Sequence[Field]`, and use simple loops to generate the header, and thus make the generator simpler.
    - improve the `Field` class, add other required methods/attributes (such as `_is_pointer`). Clean up the `__init__` method.
    - split `codegen.py` into separate modules in a `codegen` subpackage.

## Cpp Definition Parts

- [x] static chunk of code (at the start)
- [x] Includes chunk, and namespace use
- [x] template `neighbor_table_fortran`
- [x] template `neighbor_table_4new_sparse`
- [x] `get_sid` function
- [x] Stencil class
    - [x] `GpuTriMesh` struct
    - [x] private members **SK**
    - [x] `grid` function **SK**
    - [x] public members:
        - [x] `getMesh()` **MR**
        - [x] `getStream()` **MR**
        - [x] `getKsize()` **MR**
        - [x] `get_<output_field>_KSize()` **MR**
        - [x] `free` function **MR**
        - [x] `setup` function **SK**
        - [x] `<stencil>` function (constructor) **SK**
        - [x] `run` function
        - [x] `copy_pointers` function **MR**
- [x] function definitions
    - [x] `run_<stencil>` function
    - [x] `verify_<stencil>` function
    - [x] `run_and_verify_<stencil>`
    - [x] `setup_<stencil>`
    - [x] `free_<stencil>`

### Possible Refactoring/Cleanup

**Exceptions? Contracts?**

#### types.py

- [x] `types.is_valid` -> can we make it look more pythonic? (maybe further improvement later on)
- [x] `Offset.emit_strided_connectivity()` -> maybe rename?
- [x] `Offset` shorthand functions could be cleaned up? **TODO**: can be done after regression testing
- [x] `Offset.num_nbh` -> rename, e.g. `get_number_of_neighbors()`
- [x] `Offset.__init__()` -> refactor into one or more functions, returning `target_0`, `target_1`
- [x] `Field` is looking quite monolithic. We could refactor the string rendering functions into another class which is only responsible for rendering the C++ or f90 strings. This class can then be accessed inside the field using `self.renderer.<render_function>`. In this way we separate concerns between field metadata and field rendering for different codegens.
- [x] `Field.num_nbh` looks similar to `num_nbh` from `Offset`. Could this be defined only once?
- [x] `Field._get_horizontal_dimension_and_update_location()` -> what's the return type here? Is it possible to simplify further?

#### header.py

- [x] Not all templates need to be outside of the class, only the ones which are used in other modules as well.

#### cpp.py

- [x] `_get_field_data` could return a dict instead of a tuple.
- [x] maybe move `GpuTriMeshOffsetHandler` to a `codegen.utils` module?

#### Other things to consider after passing integration and regression tests

- [x] Remove `build.py`
- [x] Remove `cppbindgen` from workflow. --> Remove `cppbindgen` folder
- [x] Improve unit tests
- [x] Remove regression tests (when?)
- [x] Same definition order of class/TemplateClass in `cpp`, `f90` and `header` files.

### Refactoring / Review

- Improve decoupling between entities and renderers. For this we should get rid of the wrapped calls to the render functions. Instead, call the render function on the renderer attribute of the entity instance.
    ```
    entity = Entity(data, MyRenderer)
    entity.renderer.render_x()
    ```
- Simplify code generators by adding a `__post_init__` method to the main parent node.
- Simplify code generators further by decreasing the number of eve nodes per file.
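
A minimal sketch of the entity/renderer decoupling described above, using plain dataclasses rather than eve nodes. All names here (`Field`, `CppRenderer`, `F90Renderer`, `render_pointer`) and the rendered strings are hypothetical and only illustrate the pattern; they are not the actual generator classes or output.

```python
from dataclasses import dataclass, field
from typing import Any


class CppRenderer:
    """Hypothetical renderer: turns field metadata into C++ strings."""

    def __init__(self, entity: "Field") -> None:
        self.entity = entity

    def render_pointer(self) -> str:
        return f"{self.entity.dtype}* {self.entity.name}"


class F90Renderer:
    """Hypothetical renderer for the f90 side; the type mapping is illustrative."""

    _types = {"double": "real(c_double)", "int": "integer(c_int)"}

    def __init__(self, entity: "Field") -> None:
        self.entity = entity

    def render_pointer(self) -> str:
        f90_type = self._types[self.entity.dtype]
        return f"{f90_type}, dimension(*), target :: {self.entity.name}"


@dataclass
class Field:
    """Holds metadata only; all string rendering lives in the renderer."""

    name: str
    dtype: str
    renderer_cls: type
    renderer: Any = field(init=False, repr=False, compare=False)

    def __post_init__(self) -> None:
        # Bind the renderer once, so callers use `entity.renderer.render_x()`
        # instead of wrapper methods on the entity itself.
        self.renderer = self.renderer_cls(self)


# Modelling the data as a sequence of fields keeps generation a simple loop.
fields = [Field("vn", "double", CppRenderer), Field("mask", "int", CppRenderer)]
signature = ", ".join(f.renderer.render_pointer() for f in fields)
```

Swapping `CppRenderer` for `F90Renderer` changes the rendered strings without touching `Field`, which is the separation of concerns the first bullet asks for.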