GT4Py refactoring, cleanup

# GT4Py refactoring, cleanup ###### tags: `cycle 14` [TOC] - Developers: - Appetite: full cycle (4 FTE) ## TODO next cycle - FOAST lowering tests using string utils (already mentioned in the test best practices document) - exceptions -> discuss shaped task before cycle (pre-shaped a couple of cycles ago by Peter) - global configuration flags (already discussed, try to find the notes and finish the discussion) - 2nd iteration of fieldview feature test improvements ## Appetite Improving the code quality of GT4Py will reduce the current technical debt and thus have a beneficial impact in the development of new features in the future. The appetite for the refactoring is large: 4 developers from the GT4Py team (with previous experience developing GT4Py internals). ## General goals - Reduce technical debt to be able to introduce more features in the future - Improve diagnostics for users - Reduce surprises in partially implemented features - Develop/improve guidelines for development and especially testing  ## Toolchain tasks ### discuss-itir-conventions (discussion [✔], implementation: WIP) Working document: [Iterator IR: best practices](https://hackmd.io/sqPtFEroSNyKBWFPv04hOg) Discuss and agree on the preferred strategy to generate and transform IteratorIR. - Problem: There are several differences in the way that lowering and transformation passes deal with IteratorIR in parts of the IR where it is not well documented what is the correct/expected way to express some constructs. - Goal: Standardize and document the canonical way ITIR should be built and transformed. - Appetite: 4 FTE days. - Priority: ?? - No-gos: Discussions about the Iterator model or changes in the IR semantics are out-of-scope. This tasks should only focus on cleaning up the current implementation and document the unclear points. - Proposals: - don't do `ir.SymRef(id="add")`, instead maybe `ir.add` - unify `is_shift`, `is_applied_shift`? - utilities for dealing with symbol creation (currently hidden in some passes): - double-check lowering for symbol clashes (lifted_lambda construction) - double-check fuse_maps - discuss `factory-boy` vs `ffront.itir_makers` - Linked to: [redesign-lowering-tests](#redesign-lowering-tests), [redesign-transformation-tests](#redesign-transformation-tests) ### exceptions-hierarchy Design exception hierarchy for handling errors in GT4Py. - Problem: Current error handling in the gt4py toolchain is confusing (e.g. https://github.com/GridTools/gt4py/issues/1115) - Goal: Design and implement a uniform way to emit errors in the toolchain and forward them to users with informative error messages. - Appetite: 2.5 FTE days. - Priority: ?? - Proposals: - Shaped document from cycle 13: [Make exceptions great again](https://hackmd.io/F4nezNEvTB-wRiXteYL0Dg) - Partially linked to: [toolchain-customization](#toolchain-customization) ### toolchain-customization (preliminar work: PR#1224) Simple improved pass manager - Goal: Improve customization of IR transformations [#1135](https://github.com/GridTools/gt4py/issues/1135) and support for debugging passes. The enhancement of GT4Py support for debugging most likely requires to extend the workflow pattern (shaped [here](https://hackmd.io/Lp-GdaqbTf-ltj-z5GWZ2Q)) and add the possiblity to provide debugging flags from outside (verbose output, keep intermediate files (generated code)). - Appetite: ?? FTE days. - Priority: ?? - Proposals: + Standard set of configuration flags for executors which can use defaults from global configuration objectt (or environment variables, etc...). Specially releant for the `debug` flag + Decision: agreed in general, but we need to define the implementation details + Use a logging package to provide configurable verbose levels without reinventing the wheel. The `logging` module from the standard library provides a lot of functionality, customizability and it's widely supported but it requires very verbose configuration. Other option could be [loguru](https://github.com/Delgan/loguru), which is far easier to configure, has more features (like built-in online notifiers for automatically sending email or slack messages)) and it's compatible with the standard logging, but it's an extra third-party library dependency. + Although the implementation of ahead-of-time (AOT) program compilation is out of the scope for this task, the enhancements to the toolchain design should take it into account. For reference, JAX recently introduced an API to provide partial support for AOT compilation (https://jax.readthedocs.io/en/latest/aot.html, https://jax.readthedocs.io/en/latest/jax.stages.html) + Optional: consider renaming the confusing and potentially misleading (https://www.paratools.com/otf/) `otf` subpackage to some other name. - No-gos: ?? - Partially linked to: [exceptions-hierarchy](#exceptions-hierarchy) ### complete-gtfn-bindings (implementation: [✔]) Document GT4Py naming and organization conventions for tests. Add support for all missing features in the GTFN bindings. - Problem: Some features (like returning tuple of fields [#1044](https://github.com/GridTools/gt4py/issues/1044)) are not yet supported in the generated bindings. - tuples - sparse fields - (index_fields maybe not needed, see implementation of `index()` builtin) - Goal: Add all missing features to the binding generator. - Appetite: ?? FTE days. - Priority: 4 - Proposals: - Type system for the bindings (similar to what @petiaccja already started as a part of https://github.com/GridTools/gt4py/pull/1130 - Also related to other tasks dealing with index fiels: [`index()`-builtin in Field View](https://hackmd.io/U_f4p-GlTcOKF85fZ4Ck0A) - No-gos: ?? ### bare-fieldview-executor Add a bare executor for field view stencils. - Problem: Currently we always run the default set of transformation/optimization passes when starting from field view. Therefore we have no guarantee that lowering produced Iterator IR that is supported in the backends (see [#1201](https://github.com/GridTools/gt4py/issues/1201)). - Goal: Add a new field-view executor to test that ITIR directly generated from the lowering is valid and and embedded backend can handle it without requiring further postprocessing. - Appetite: ?? FTE days. - Priority: 0 - No-gos: ?? - Blocked by: [toolchain-customization](#toolchain-customization), we need a clean way to disable transformations --------------- ## Testing tasks (PR#1226, PR#1236) Working document: [Test best practices](https://hackmd.io/pjJjs9oqTySuKc0Z3sZBZw) PRs: - https://github.com/GridTools/gt4py/pull/1226 - https://github.com/GridTools/gt4py/pull/1236 ### fix test-conventions cornercases (discussion [✔], implementation: TODO) Get into the test modules and apply the test conventions more strictly by: - splitting or - merging or - renaming ### test-conventions (discussion [✔], implementation: [✔]) Document GT4Py naming and organization conventions for tests. - Goal: Complete the `Testing` section in the [CODING_GUIDELINES.md](https://github.com/GridTools/gt4py/blob/main/CODING_GUIDELINES.md) document. - Appetite: 2 FTE days. - Priority: ?? - TODO: meeting about the general strategy for tests ### lint-tests (discussion [✔], implementation: TODO) Fix code quality and style in `tests/nexts`. - Problem: current tests are not checked by linter tools. - Goal: Make sure all testing code is also checked with `flake8` and `mypy` to catch embarrasing errors in tests (e.g. [#1208](https://github.com/GridTools/gt4py/pull/1208)). - Appetite: 3 FTE days. - Priority: ?? - No-gos: achieving 100% error-free code quality may easily take a insane amount of time and require heavy refactorings. Using `#ignore` comments and skipping specially conflicting parts should be an acceptable compromise for the first step. - Proposals: + Consider using [LibCST](https://libcst.readthedocs.io/en/latest/), [Bowler](https://pybowler.io/) or [rope](https://rope.readthedocs.io/en/latest/) for automated refactorings. ### backend-parametrization-iteration-2 (discussion [✔], implementation: TODO) Apply and refine the backend parametrization from [redesign-backend-parametrized-tests](https://hackmd.io/46Oa6MspRoSTiFozHzun5A#redesign-backend-parametrized-tests-discussion-✔-implementation-✔) - Apply to all of `next_tests.integration_test.feature_tests.ffront_tests.test_execution`, wherever possible. - Check if other test utils become obsolete, remove those if so - Consolidate with other relevant test utils - Record the decisions about test utils in `CODING_GUIDELINES.md` as a first approximation of test utils guidelines. - Consider porting some of the utils to a new `gt4py.next.testing` module. This might better be shaped as a separate project, because requirements for in-library code are higher than for test utils. ### redesign-backend-parametrized-tests (discussion [✔], implementation: [✔]) Redesign the strategy for running parameterized backend tests. - Problem: Using fixtures as parameters to set the backend in the `program`/`field_operator`/`fencil` is error-prone. For example, several PRs were opened in the last weeks with related issues (links: ??). - Goal: Design a simple strategy to automatize the selection of backends in tests. - Appetite: 2 FTE days. - Priority: ?? - No-gos: ?? - Proposal: - Use a context var to select the default backend. Either default is then a backend that throws an error or the default backend is selected to be the parameter. - Linked to: [redesign-lowering-tests](#redesign-lowering-tests) ### redesign-lowering-tests (discussion [✔], implementation: WIP) Redesign the strategy to write frontend lowering tests. - Problem: Signal-to-noise ratio in current lowering tests is low and thus it is hard to quickly understand and add tests for lowering code from frontend to IR. - Goal: Design a new approch for writing lowering tests with minimal overhead. `itir makers` are nice to use, but ideally we would like to have an infrastructure for parametrizing tests that start from code and see (and/or) use the pretty-printed expected IR in the test itself, since it is much easier to understand for humans. - Appetite: 3 FTE days. - Priority: ?? - No-gos: A fully automatized solution which works for every possible case. Very specific corner cases could be tested with a different and more manual approach. - Proposals: + Evaluate helper factories: `factory-boy` vs `itir-makers` + Check if the [pytest-cases](https://smarie.github.io/python-pytest-cases/) plugin could be useful to define/collect cases. + Check if an automated worflow from declarative tests as the ones used in `mypy` and its plugins could make sense (reference: https://sobolevn.me/2019/08/testing-mypy-types) + Check if using [cog](https://nedbatchelder.com/code/cog/) to embed automatically-generated python code/data from code comments could make sense. Note that `cog` is already used in the gt4py tooling to keep the [`.pre-commit-config.yaml`](https://github.com/GridTools/gt4py/blob/main/.pre-commit-config.yaml) file in sync with `requirements-dev.txt`. - Linked to: [redesign-backend-tests](#redesign-backend-tests), [discuss-itir-conventions](#discuss-itir-conventions) - TODOs: - Make them more useful (don't hard-code the itir structure output) - maybe execute the resulting itir expression in embedded and see if the result is correct - if we have fielview embedded, we could compare execution before and after lowering ### redesign-transformation-tests (discussion: WIP) Redesign the strategy to write ITIR transformation tests. - Goal: Explore if ITIR transformations tests could be generated from pretty printed ITIR code. If possible, use a similar approach to the solution found in `redesign-lowering-tests`. - Appetite: 2 FTE days. - Priority: ?? - No-gos: ?? - Linked to: [discuss-itir-conventions](#discuss-itir-conventions)  ### itir-frontend-tests (implementation: WIP) Test all frontend features inside `map_`. - Goal: Make sure that we test all builtins both in non-mapped and mapped (`itir.map_` on lists) contexts (as well as maybe double mapped context). From the perspective of the frontend these are operations between sparse / neighbor fields. - Appetite: ?? - Priority: ?? - No-gos: ?? ---- ## Iterator IR tasks ### itir-specification (implementation: WIP) Update Iterator IR specification in concepts repository. - Goal: Update `gridtools/concepts` formal IteratorIR specification document with the result of recent discussions and cleanups. - Appetite: ?? FTE days. - Priority: 4 - Proposals: - Discuss open questions: + _does `shift()` require `OffsetLiteral`?_ [#1203](https://github.com/GridTools/gt4py/issues/1203) - Document types of builtins. - Mention connections of IteratorIR with established functional design patterns [Iterator IR, Functors, and Comonads](https://hackmd.io/x7vfhtrdQh-WQgAGEbHiZQ) - No-gos: ?? - Linked to: [discuss-itir-conventions](#discuss-itir-conventions) - **Assigned to: Enrique** ### itir-conditionals Expected behavior of conditional expressions in ITIR. - Problem: There are open questions about the expected behavior of conditionals/ternary operators in IteratorIR ([#1177](https://github.com/GridTools/gt4py/issues/1177), [#1061](https://github.com/GridTools/gt4py/issues/1061)) - Goal: Discuss & find a solution that works for the cases we care about. - Appetite: 3 FTE days. - Priority: 3 - Proposals: - No-gos: A large refactoring of the embedded Iterator IR executor (if required by the solution found after discussion). ### itir-column-globals (implementation: PR#1120) Cleanup implementation of scan using internal global variables to use `contextvars`. - Problem: The current implementation of `scan` in the embedded Iterator IR backend uses global variables to keep track of the axis and column range used in the scan. This was a temporary solution but it's ugly and bug prone (it will break when running with multiple threads). - Goal: Replace the current implementation with a cleaner approach using `contextvars` from the standard library, following the existing proof-of-concept implementation in [#1120](https://github.com/GridTools/gt4py/pull/1120). - Appetite: 2 FTE days. - Priority: ?? - No-gos: Any other cleanup in the embedded iterator IR implementation is out of scope. ### itir-typing [✔] Check that typing is ready for DaCe backend work. - Problem: The dace backend requires the dtype of all expressions. - Goal: Merge and cleanup the [PR](https://github.com/GridTools/gt4py/pull/1199) and check that it fulfills the requirements of the DaCe backend. - Appetite: 1 FTE days. - Priority: ?? - Proposals: - No-gos: ?? --------------- ## Eve tooling tasks ### eve-trees-enhancement Enhance existing tree traversals and visitor creation utilities. - Problem: The current functionality to deal with tree traversals is verbose and basic: it doesn't support pattern matching, there are not pre-configured pre and post-order visitors and it only works in DataModel classes. - Goal: Enhance the current tree traversal functionality in Eve to support traversing arbitrary hierarchies of nodes (e.g. pure dataclasses, parameterized typing annotations) Add a more general kind of visitors based on functional design pattern (Traversal optics, recursion schemes) and `singledispatch` functions, and LibCST-style Visitors and Matchers. - Appetite: 2.5 FTE weeks. - Priority: ?? - Proposals: + Add caching of dispatcher results for better perfomance to current class-based visitors (prototyped in: https://github.com/egparedes/gt4py) + Add post-order visitor hooks à la LibCST (`leave_Node()`). + Generic tree traversal mechanism based on recursion schemes and singledispatch functions (_children_getter_, _children_mapper_, and value function dispatcher) /tree/cartesian-perf-hacks). - No-gos: Removing existing functionality or migrating existing code (specially cartesian) to use the enhanced utilities is out of the scope. --------------- ## Frontend tasks ### scalar-if (implementation: PR) Adds support for regular if statements (with boolean condition). - Problem: Expressing conditionals using ternaries is impractical. - Goal: Change lowering in the existing PR https://github.com/GridTools/gt4py/pull/1079 to emit multiple `if_` statements such that the Tuple of Iterator pattern is avoided (which blocked the merge). - Appetite: 2 FTE days. - Priority: 4 - Proposals: - No-gos: Removing SSA on FOAST level. ### document-foast-stmt-to-itir-lowering Document the lowering of FOAST `stmt`s to ITIR, in particular the role of `inner_expr`, in the docstring of `FieldOperatorLowering`. __THIS task was added here during cycle 15 is not eligle to be picked up during the cycle.__ - Problem: The role of `inner_expr` is not documented. - Goal: Document it such that people not familiar with the procedure understand it. - Appetite: 0.5 FTE days. - Priority: - Proposals: - No-gos: Removing SSA on FOAST level. --------------- ## Storage ### Make sure storages work properly with `gt4py.next` (discussion [✔], implementation: WIP) - https://github.com/GridTools/gt4py/pull/1208 - https://github.com/GridTools/gt4py/pull/1202 --------------- ## Public API tasks ### redesign-public-imports Improve the user/public API by moving public functions to the correct module. - Problem: The current public API requires users to import from many different submodules, which is a cumbersome and unintuitive. - Goal: ?? - Appetite: ?? FTE days. - Priority: implement/merge late, discuss early - Proposals: - `from gt4py.next.ffront.decorator import field_operator` - `from gt4py.next.iterator.embedded import np_as_located_field` - ... - No-gos: ?? - TODO: schedule meeting early --------------- ## Miscelanea tasks - Frontend simplification: simplify the pre and post dialect parsing - Iterator IR pretty printer: add dtype of literal - https://github.com/GridTools/gt4py/issues/1200 - Remove floordiv builtin ??: - https://github.com/GridTools/gt4py/issues/1136 - Passing out tuples to programs is not supported: - https://github.com/GridTools/gt4py/issues/1045 - Type inference: simplify type constraints for arithmetic builtins (they are mostly the same) + linked to [itir-specification](itir-specification) --------------- ---------------