Simplifying `stencil_function`

# Simplifying `stencil_function` Currently `stencil_function` is a function in computer science terms, i.e. a routine. Recently we discussed a lot about pure stencil functions (as they are intended in mathematical terms). How about applying that concept to simplify the current design? It suffices to reduce its body from a full AST with statements to just an `Expr` (the return value). ``` stencil_function <name> ( { { field | variable | direction | offset } <name> }* ) -> <expression> ``` By `field` I mean either an rvalue or an lvalue field, examples below will clarify. The return "type" (cartesian field, field on cells, vertices, edges or scalar) is the same as the return expression's. `AssignmentExpr`s are disallowed within the return expression's AST. Benefits gained when dealing with a pure `stencil_function` in IIR: - No variables declared inside a function, all variables/fields accessed belong to the caller. So no need for separated metadata. - Arguments can only be input, there is no way to write into them within the function. Accesses computation simplifies to just computing the accesses of the return expression. `FunctionCallExpr` will only read fields, never write into them. - No need to check for data dependencies inside a function. - A generic visitor doesn't need to stop at a `StencilFunCallExpr`. No need to do "something special" for a function. ## DSL to IIR overview >Let's assume that the frontend wants to keep routines. The following examples are DSL to SIR translation suggestions and insights of how the IIR should look. How to simplify this function defined in (pseudo) DSL? ``` function f(field A, field B) { var tmp1 = g(A) + 1; var tmp2 = tmp1[i-1] + h(B); var tmp3 = tmp1 + q(A, B); return y(tmp1, tmp2, tmp3); } ``` (Assume g, h, q, y have already been converted to pure stencil functions) Here the user stores a temporary in tmp1 for reuse in tmp2 and tmp3 and call to y(). If we inline tmp1, tmp2 and tmp3 into the return expression we would end up with a pure stencil function, but it is extremely hard to recover tmp1 and we might loose the possibility to optimize with a temporary. What we should do, instead, when translating to SIR, is to create a stencil function to substitute each temporary: ``` stencil_function tmp1(field A) -> g(A) + 1 stencil_function tmp2(field B, field tmp1_) -> tmp1_[i-1] + h(B) stencil_function tmp3(field A, field B, field tmp1_) -> tmp1_ + q(A, B); stencil_function f(field tmp1_, field tmp2_, field tmp3_) -> y(tmp1_, tmp2_, tmp3_); ``` (SIR) In the body of the stencil, we call `f()` without the use of temporaries (all functional): ``` C = f(tmp1(A), tmp2(B, tmp1(A)), tmp3(A, B, tmp1(A))); ``` (IIR) After analyzing the tree of function calls we might decide to have the result of the call to `tmp1()` stored as temporary: ``` var tmp1_ = tmp1(A); C = f(tmp1_, tmp2(B, tmp1_), tmp3(A, B, tmp1_)); ``` or all results stored as temporaries: ``` var tmp1_ = tmp1(A); var tmp2_ = tmp2(B, tmp1_); var tmp3_ = tmp3(A, B, tmp1_); C = f(tmp1_, tmp2_, tmp3_); ``` or keep the initial functional version (= compute on the fly). #### What about routines that have side effects? ``` function f(field A, field B) { var tmp1 = g(A) + h(B); A = B; B = x(tmp1); return h(tmp1); } C = f(A, B); ``` Generate these: ``` stencil_function tmp1(field A, field B) -> g(A) + h(B) stencil_function f(field tmp1_) -> h(tmp1_) ``` and translate the statement `C = f(A, B)` to ``` // Side effects A = B; B = x(tmp1(A, B)); // Call to f() C = f(tmp1(A, B)) ``` ## Corner cases ### Variable rewrite ``` function f(field A, field B) { var tmp1 = g(A); var tmp2 = g(tmp1); tmp1 = h(B); return tmp1 + tmp2; } f(A, B); ``` need to treat `tmp1` as 2 different variables: ``` stencil_function tmp1_1(field A) -> g(A) stencil_function tmp2(field tmp1_1_) -> g(tmp1_1_) stencil_function tmp1_2(field B) -> h(B) stencil_function f(field tmp1_2_, field tmp2_) -> tmp1_2_ + tmp2_ f(tmp1_2(B), tmp2(tmp1_1(A))); ``` ### If-else ``` function fluxLimiter (field flux, field rxp, field rxm, offset dir) { double fluxCenter = flux; if(fluxCenter >= 0.0) fluxCenter *= math::min(1.0, math::min((double)rxp[dir], (double)rxm)); else fluxCenter *= math::min(1.0, math::min((double)rxp, (double)rxm[dir])); return fluxCenter; } ``` must be rewritten with a ternary operator: ``` function fluxLimiter (field flux, field rxp, field rxm, offset dir) { return flux >= 0.0 ? flux * math::min(1.0, math::min((double)rxp[dir], (double)rxm)) : flux * math::min(1.0, math::min((double)rxp, (double)rxm[dir])) ; } ``` This and the `advection_driver_n_order` functions are the only cases in clang-gridtools of if-else constructs inside a stencil function. Can be easily rewritten with the ternary operator. ## Vectors (possible generalization) Suppose the DSL supports vector fields. We could either: * represent them also in the IRs, allowing vector expressions (therefore also vector functions), or * split them component-wise when going from DSL to SIR; the following explains how. Denoting vectors as `vec_field`s. Dot notation to access components. ``` double dot(vec_field a, vec_field b) { return a.u * b.u + a.v * b.v; } vec_field myFun(vec_field a, vec_field b) { vec_field tmp; tmp.u = dot(a, b); tmp.v = a.u + b.v; return tmp; } ``` To go to SIR apply simple transformations, no analysis required. 1. expand input vectors: ``` double dot(field a_u, field a_v, field b_u, field b_v) { return a_u * b_u + a_v * b_v; } vec_field myFun(field a_u, field a_v, field b_u, field b_v) { vec_field tmp; tmp.u = dot(a_u, a_v, b_u, b_v); tmp.v = a_u + b_v; return tmp; } ``` 2. expand local variables: ``` double dot(field a_u, field a_v, field b_u, field b_v) { return a_u * b_u + a_v * b_v; } vec_field myFun(field a_u, field a_v, field b_u, field b_v) { field tmp_u, tmp_v; tmp_u = dot(a_u, a_v, b_u, b_v); tmp_v = a_u + b_v; return {tmp_u, tmp_v}; } ``` 3. assignments to functions (same as [variable rewrite](#Variable-rewrite) ) ``` dot(field a_u, field a_v, field b_u, field b_v) -> a_u * b_u + a_v * b_v tmp_u(field a_u, field a_v, field b_u, field b_v) { -> dot(a_u, a_v, b_u, b_v) tmp_v(field a_u, field b_v) { -> a_u + b_v vec_field myFun(field tmp_u_, field tmp_v_) { return {tmp_u_, tmp_v_}; } ``` 4. decompose functions returning vectors: (only showing myFun) ``` myFun_u(field tmp_u_, field tmp_v_) -> tmp_u_ myFun_v(field tmp_u_, field tmp_v_) -> tmp_v_ ``` Dawn will take care of optimizing (unused arguments). ## Implementation Described [here](https://www.dropbox.com/scl/fi/67fxq23vsrid9aorw7u6o/Stencil-Function-Refactoring-Proposal.paper?dl=0&rlkey=1pwfojvgi6xo9kkqhxhbunxzr#:uid=209739940951567334627566&h2=Simplification-to-pure-stencil)