# [GT4Py] Prepare ITIR for CombinedIR

<!-- Add the tag for the current cycle number in the top bar -->

- Shaped by: @tehrengruber, @havogt
- Appetite (FTEs, weeks):
- Developers: <!-- Filled in at the betting table unless someone is specifically required here -->

Since the introduction of ITIR we have noticed several shortcomings of the IR. This project collects all changes needed to prepare our backends (roundtrip, gtfn) for solving these problems. We explicitly want to merge these changes into `main` in a clean way, while separating experimental work (e.g. combined IR) into another project.

## Problem

### Missing control flow

The top-level construct of ITIR, a `FencilDefinition`, is in essence a flat collection of stencil applications (together with their inputs and outputs we call these `StencilClosure`s) that are executed unconditionally. As a result, control flow can only be modeled by using the `if`-builtin function *inside of a stencil*. Besides the complex juggling we have to do in the lowering from the frontend in order to model complex if-statements, this is fundamentally incompatible with the usage of temporaries. Consider the following example:

```python
@field_operator
def foo(inp: Field[[Vertex], float]):
    if local_area:
        edge_tmp = inp(E2V[0])
        return edge_tmp(V2E[0])
    else:
        return inp
```

Since we cannot represent the if-statement at the top level, i.e. outside of a closure, everything needs to be expressed using a single stencil. Consequently we cannot extract `edge_tmp` into a temporary, regardless of whether this would be beneficial for performance.

### Missing index built-in

See https://hackmd.io/U_f4p-GlTcOKF85fZ4Ck0A

### Unclean handling of scalars

The gtfn backend expects all arguments to a stencil execution (`gtfn_ir.StencilExecution`) to be SIDs. When we introduced scalar arguments, no type inference on ITIR existed and we resorted to a hack to convert scalar arguments into SIDs.
Consider the following example.

*Frontend code*

```python
@gtx.field_operator
def testee(a: int32):
    return broadcast(a, (Vertex,))
```

*ITIR code*

```
__field_operator_testee(__sym_1, out, __out_size_0) {
  out ← (deref)(__sym_1) @ u⟨ Vertex: [0, __out_size_0) ⟩;
}
```

Now, in order to generate C++ from this, we currently promote all arguments that are used in an `itir.StencilClosure` to a SID (see [here](https://github.com/GridTools/gt4py/blob/main/src/gt4py/next/program_processors/codegens/gtfn/gtfn_module.py#L89)) using `gridtools::stencil::global_parameter`. This has the downside that a scalar argument can be used either as an input to a stencil closure or outside of a closure, but not both. Right now the only place an argument can occur outside of a closure is as a domain size argument (e.g. `__out_size_0` above is used that way). Besides being unclean, this has only been a minor issue so far, as it rarely happens that one scalar argument is used in both contexts. However, if we introduce control flow on ITIR this changes: conditions of if-statements also consume scalar arguments, and it is much more likely that these also occur inside of a stencil. Note that there is a workaround, namely passing a scalar argument twice and using each copy exclusively in one context, but this is incompatible with the blue-line.
*C++ code*

```c++
// lowered gtfn_ir.StencilExecution
make_backend(backend, gtfn::unstructured_domain(
    ::gridtools::tuple((__out_size_0 - 0)),
    ::gridtools::tuple(0),
    connectivities__...))
  .stencil_executor()()
  .arg(out)
  .arg(__sym_1)
  .assign(0_c, _fun_1(), 1_c)
  .execute();

// interface source
gridtools::stencil::global_parameter(std::forward<decltype(__sym_1)>(__sym_1))
```

<!-- The raw idea, a use case, or something we’ve seen that motivates us to work on this -->

## Dependencies

- Missing control flow
  - Option 1 - Program only: Requires new ITIR type inference
  - Option 2 - Program only: Requires adaptation of existing ITIR type inference
  - Option 3 - Coexistence of Fencil and Program: No dependencies
- Unclean handling of scalars: Requires ITIR type inference & Step 1 of the solution to missing control flow.

## Appetite

<!-- Explain how much time we want to spend and how that constrains the solution -->

## Solution

Careful: The following does not completely solve all of the above problems, but merely prepares our backends for them to be solved.

#### Steps

- Implement Step 1 of *Missing control flow*. Review & merge.
- Wait for type inference project.
- Implement *Unclean handling of scalars*. Review & merge.
- Implement Step 2 of *Missing control flow*.
- The missing index builtin is independent and can be tackled at any point.

### Missing control flow (CF)

The core idea is to eventually replace `itir.StencilClosure` by introducing statements: either an assignment-statement with an `apply_stencil` expression on the right-hand side, or an if-statement.

#### Step 1: Introduce CF after `apply_common_transforms`

In order to split the work into separately testable and reviewable units, it is proposed to first carry out all work needed in the backends. For that we

- Introduce a set of new nodes in ITIR.
  ```python
  class Stmt(Node):
      pass


  class Program(Node, ValidatedSymbolTableTrait):
      id: Coerced[SymbolName]
      params: List[Sym]
      tmps: List[Sym]
      function_definitions: List[FunctionDefinition]
      stmts: list[Stmt]

      _NODE_SYMBOLS_: ClassVar[List[Sym]] = [Sym(id=name) for name in BUILTINS]


  class Assign(Stmt):
      target: SymRef
      expr: FunCall  # note: no nested `apply_stencil` calls!
      # TODO: validate expr is an apply_stencil call and all arguments
      #  are SymRefs (or make_tuple calls of SymRefs?)


  class IfStmt(Stmt):
      condition: Expr
      true_branch: list[Stmt]
      false_branch: list[Stmt]
  ```
- Introduce a new built-in function `apply_stencil(stencil, inputs, output)` (decide on the name with the supporting person). This function is only allowed to occur on the right-hand side of an `Assign`. A short investigation into whether and how we can assert this would be useful. Note that `apply_stencil` cannot be nested (at this point). All inputs must be `itir.SymRef`s. Some consideration should be given to tuples and nested tuple arguments. In principle the inputs argument can be a simple `make_tuple` call.
- Introduce a new helper function `convert_to_new_itir_program` that takes a fencil (`FencilDefinition` or `FencilWithTemporaries`) and does a (straightforward) conversion into the `Program` node introduced above.
- Change `apply_common_transforms` to return a `Program` instead.
  ```python
  def apply_common_transforms(ir: itir.FencilDefinition, ...) -> itir.Program:
      # existing transformations on `ir` are elided here
      return convert_to_new_itir_program(ir)
  ```
- Change the embedded backend to use `itir.Program` instead of `itir.FencilDefinition`.
- Change the gtfn backend to operate on `itir.Program` instead of `itir.FencilDefinition`.
  - Introduce new nodes in `gtfn_ir`: `gtfn_ir.Program`, `gtfn_ir.Stmt`, `gtfn_ir.IfStmt`
  - Rename `gtfn_ir.StencilExecution`, `gtfn_ir.ScanExecution` to `gtfn_ir.StencilExecutionStmt`, `gtfn_ir.ScanExecutionStmt` and make them inherit from `gtfn_ir.Stmt`.
    GTFN IR is not a functional IR and we don't need to place temporaries as in ITIR, so splitting into two nodes like we did on ITIR (`StencilClosure` -> `Assign` & `apply_stencil` call) doesn't make sense.
  - Adapt ITIR -> GTFN IR lowering accordingly
  - Adapt GTFN IR to C++ codegen accordingly

#### Step 2: Remove `itir.FencilDefinition`, `itir.StencilClosure`, `itir.FencilWithTemporaries` nodes

The next step is to remove the `itir.FencilDefinition`, `itir.StencilClosure` and `itir.FencilWithTemporaries` nodes from the IR. This is in principle a straightforward cleanup with one exception: some passes rely on type inference (only the temporary pass and collapse tuple). We have 3 options here:

- Option 1: We postpone this step until a new type inference on ITIR is implemented and only then carry it out.
- Option 2: We change the existing ITIR type inference. This should not take too much time, but we have seen in the past that even small things easily escalate to a multi-day effort since debugging is so hard there.
- Option 3: We just postpone this step and let the old nodes co-exist.

##### Step 2.1: Changes in `gt4py.next.iterator`

Occurrences in `gt4py.next.iterator` and estimated effort (implementation, no review):

- symbol_ref_utils: ~minute
- PrettyParser: ~hour
- PrettyPrinter: ~minutes, straightforward
- Tracing: unclear, ~hour(s)?
- global_tmps: ~hour
- InlineCenterDerefLiftVars: ~minute
- InlineFunDefs: ~minutes
- InlineLifts: ~minute
- TraceShifts: ~hour(s)
- TypeInference: unclear, ~day?

##### Step 2.2: Changes in frontend `gt4py.next.ffront`

Besides a couple of type annotations, only the lowering from PAST to ITIR needs to be changed such that `itir.Program` & `itir.Assign` are emitted instead of `itir.FencilDefinition` & `itir.StencilClosure`.
These changes are rather minimal and need to happen [here](https://github.com/GridTools/gt4py/blob/main/src/gt4py/next/ffront/past_to_itir.py#L203) and [here](https://github.com/GridTools/gt4py/blob/main/src/gt4py/next/ffront/past_to_itir.py#L246).

### Unclean handling of scalars

A clean way to promote scalars to SIDs is to make use of type inference in the `itir` -> `gtfn_ir` lowering. If an expression in the inputs of an `apply_stencil` call is a scalar, we emit a `gridtools::stencil::global_parameter` call. In order to do that, simply introduce a new `gtfn_ir.GlobalParameterConversion` node

```python
class GlobalParameterConversion(Node):
    arg: SymRef
```

and add the respective template in the gtfn code generation.

## Outlook

## Rabbit holes

<!-- Details about the solution worth calling out to avoid problems -->

## No-gos

<!-- Anything specifically excluded from the concept: functionality or use cases we intentionally aren’t covering to fit the ## appetite or make the problem tractable -->

## Progress

<!-- Don't fill during shaping. This area is for collecting TODOs during building. As first task during building add a preliminary list of coarse-grained tasks for the project and refine them with finer-grained items when it makes sense as you work on them. -->

- [x] Task 1 ([PR#xxxx](https://github.com/GridTools/gt4py/pulls))
    - [x] Subtask A
    - [x] Subtask X
- [ ] Task 2
    - [x] Subtask H
    - [ ] Subtask J
- [ ] Discovered Task 3
    - [ ] Subtask L
    - [ ] Subtask S
- [ ] Task 4

## Scratch

```python
@field_operator
def foo(a, b):
    return a+b

lambda a, b: apply_stencil(plus)(a,b)

#within program
foo(a,b,out=out)

# cir
apply_fieldop(foo, [a,b], out=out, domain=smaller_than_out_domain)

out[smaller_than_out_domain] = foo(a,b)[smaller_than_out_domain]

write_field_to_out(foo(a,b), out, smaller_than_out_domain)  # <-- this style, better name write_to

apply_stencil(plus, [a,b], out=out, domain=...)

apply_stencil(plus, [map(plus)(a,b),c], out...)
```
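
A further scratch note on Step 1 of *Missing control flow*: the sketch below illustrates how `convert_to_new_itir_program` could turn each closure into an `Assign`, and how the TODO on `Assign` might be checked. It uses plain dataclasses, not the real `eve`-based ITIR nodes, so every class here is a simplified stand-in. `is_valid_assign` is a hypothetical helper, and the argument layout of `apply_stencil` (stencil, `make_tuple` of inputs, domain, with the output as assignment target) is an assumption, since the final signature is still to be decided.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class SymRef:
    id: str


@dataclass(frozen=True)
class FunCall:
    fun: SymRef
    args: Tuple["Expr", ...]


Expr = Union[SymRef, FunCall]


@dataclass(frozen=True)
class StencilClosure:
    """Simplified stand-in for itir.StencilClosure."""
    domain: Expr
    stencil: Expr
    output: SymRef
    inputs: Tuple[SymRef, ...]


@dataclass(frozen=True)
class FencilDefinition:
    """Simplified stand-in: a flat, unconditional list of closures."""
    id: str
    params: Tuple[SymRef, ...]
    closures: Tuple[StencilClosure, ...]


@dataclass(frozen=True)
class Assign:
    target: SymRef
    expr: FunCall


@dataclass(frozen=True)
class Program:
    id: str
    params: Tuple[SymRef, ...]
    stmts: Tuple[Assign, ...]


def convert_to_new_itir_program(fencil: FencilDefinition) -> Program:
    """Turn every closure into one Assign, preserving closure order."""
    stmts = []
    for closure in fencil.closures:
        inputs = FunCall(SymRef("make_tuple"), closure.inputs)
        # Assumed layout: apply_stencil(stencil, inputs, domain)
        call = FunCall(SymRef("apply_stencil"), (closure.stencil, inputs, closure.domain))
        stmts.append(Assign(target=closure.output, expr=call))
    return Program(id=fencil.id, params=fencil.params, stmts=tuple(stmts))


def is_valid_assign(stmt: Assign) -> bool:
    """Hypothetical check: rhs is an apply_stencil call whose inputs are all SymRefs."""
    expr = stmt.expr
    if expr.fun != SymRef("apply_stencil") or len(expr.args) != 3:
        return False
    inputs = expr.args[1]
    if not isinstance(inputs, FunCall) or inputs.fun != SymRef("make_tuple"):
        return False
    return all(isinstance(arg, SymRef) for arg in inputs.args)


closure = StencilClosure(
    domain=SymRef("dom"),
    stencil=SymRef("deref"),
    output=SymRef("out"),
    inputs=(SymRef("inp"),),
)
fencil = FencilDefinition(
    id="testee",
    params=(SymRef("inp"), SymRef("out"), SymRef("dom")),
    closures=(closure,),
)
program = convert_to_new_itir_program(fencil)
```

Because the conversion is a one-to-one mapping from closures to statements, the backends can switch to `Program` before any pass actually produces an `IfStmt`. The validity check rejects nested `apply_stencil` calls simply because they are `FunCall`s, not `SymRef`s, in the inputs.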