# [Blueline] Refactor icon4pygen to make it production ready

<!-- Add the tag for the current cycle number in the top bar -->

- Shaped by: Sam, Christoph
- Appetite (FTEs, weeks):
- Developers: <!-- Filled in at the betting table unless someone is specifically required here -->

## Problem

<!-- The raw idea, a use case, or something we’ve seen that motivates us to work on this -->

`icon4pygen` was ported from C++ to Python in a one-cycle project over a year ago and has not been touched since, except to add features and fixes. It has grown quite complex and accumulated technical debt, which makes it hard to work with and modify without prior knowledge. As a result, it is not yet ready to be deployed in a production setting at MCH.

## Appetite

<!-- Explain how much time we want to spend and how that constrains the solution -->

2 people, full cycle.

## Solution

The code should be fundamentally restructured to separate concerns. In addition, there are many refactoring opportunities.

##### Restructuring

- Separate the code into the following components:
  - Language interface generation
    - Fortran -> C
    - C -> C++
    - C++ -> gtfn
  - Verification interface
  - Backend interface, allowing gtfn to be swapped for DaCe in the future
  - Additional smaller code generation parts (for example, in the generated Fortran interface file, some quantities are derived before the C interface is called)
- A possible structure could then be:
  - verification.py
  - run.py -> backend.py
  - ...

##### Refactoring ideas

###### `metadata.py`

- Use class-based encapsulation to group related functions into a class.
- Move interfaces like `StencilInfo` into a separate `interface.py` module.
- Maybe rename this module to `parse`, in alignment with liskov, since here we are parsing information from gt4py stencils.
- `provide_neighbor_table` needs more error handling; the sparse field method is brittle.

###### `cli.py`

- Shell completion seems to be broken: it uses the wrong stencil names. Need to see if this is still necessary.
- Rethink argument names, e.g. `fencil` should be `program_import_path`.

###### `icochainsize.py`

- Refactor global access; pass a dictionary mapping function names.
- Group grid functions into a class, `HexGrid` for example.

###### `backend.py`

- `transform_and_configure_fencil` could be split into different functions; it has quite low cohesion right now.

###### Other changes

- Add a `constants.py` file where we can define `H_START`, ...
- Only one of the two ways of handling neighbor tables should be supported; wait until the gt4py team decides, then refactor.

###### bindings

Generally, I don't like how most logic is hidden inside the `bindings` package, and then further down in `codegen` and `render`. I think there should be a flatter subpackage structure that is more logically aligned with the components.

- Introduce a templates subpackage.
  - It should include templates for all generated files.
  - Templates should not include any logic related to actually generating the code; this should be in a separate module, e.g. generate.py.
- Try to simplify cpp.py a bit.
- Renderers and entities must be closer together, otherwise their relationship might not be immediately apparent. Maybe the renderer can already be set in the base class.

## Rabbit holes

<!-- Details about the solution worth calling out to avoid problems -->

## No-gos

<!-- Anything specifically excluded from the concept: functionality or use cases we intentionally aren’t covering to fit the appetite or make the problem tractable -->

- Do not try to refactor the generated code; the focus should be on refactoring how the code is generated, e.g. template splitting.

## Progress

<!-- Don't fill during shaping. This area is for collecting TODOs during building. As first task during building add a preliminary list of coarse-grained tasks for the project and refine them with finer-grained items when it makes sense as you work on them. -->

- [x] Task 1 ([PR#xxxx](https://github.com/GridTools/gt4py/pulls))
  - [x] Subtask A
  - [x] Subtask X
- [ ] Task 2
  - [x] Subtask H
  - [ ] Subtask J
- [ ] Discovered Task 3
  - [ ] Subtask L
  - [ ] Subtask S
- [ ] Task 4
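
To make the "Backend interface" item under Restructuring a bit more concrete, here is a minimal sketch of what a swappable backend could look like. Everything in it is hypothetical and for illustration only: `StencilRequest`, `Backend`, `GtfnBackend`, `DaceBackend`, `BACKENDS` and `generate_code` are not existing icon4pygen or gt4py names, and the `generate` bodies are placeholders rather than real code generation.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class StencilRequest:
    """Hypothetical input handed to a backend (not an existing icon4pygen type)."""

    name: str
    source: str


class Backend(Protocol):
    """Hypothetical interface that every code generation backend would implement."""

    name: str

    def generate(self, request: StencilRequest) -> str:
        ...


class GtfnBackend:
    name = "gtfn"

    def generate(self, request: StencilRequest) -> str:
        # Placeholder: a real implementation would call into gt4py here.
        return f"// gtfn code for {request.name}\n"


class DaceBackend:
    name = "dace"

    def generate(self, request: StencilRequest) -> str:
        # Placeholder: a future DaCe-based implementation would go here.
        return f"// dace code for {request.name}\n"


# Registry keyed by backend name, so callers select a backend by string.
BACKENDS: dict[str, Backend] = {b.name: b for b in (GtfnBackend(), DaceBackend())}


def generate_code(request: StencilRequest, backend: str = "gtfn") -> str:
    """Dispatch to the selected backend, failing loudly on unknown names."""
    try:
        selected = BACKENDS[backend]
    except KeyError:
        raise ValueError(
            f"unknown backend {backend!r}; choose one of {sorted(BACKENDS)}"
        ) from None
    return selected.generate(request)
```

With a structure along these lines, the rest of the tool would only depend on the `Backend` interface, so adding a new backend would mean adding one class and one registry entry.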