
GTScript Syntax for Type Hinting

tags: cycle 5

Problem

Temporaries are currently always 3D, which complicates the implementation of some computations. For example, 2D temporaries are useful for vertical reduction/scan patterns, and higher-dimensional temporaries are useful in operations involving vector/matrix/tensor fields, such as finite element methods (FEM).

Ideally, to avoid declaring pseudo-temporary fields outside stencils, it should be possible to explicitly create lower- and higher-dimensional temporary fields, which requires some form of typing for temporaries.

Prior work

Some aspects of this were already shaped in a previous sprint: https://hackmd.io/7XI0eNXSTlWnXnHIx5o-Bw

Typed Temporary Variables: https://hackmd.io/qz976j3XTuWR6aLYCdXz4w

Appetite

  • 1 developer, one cycle, because the GridTools C++ implementation is trickier than originally thought and the NumPy backend needs data_dims support.

Solution

Temporary fields should remain 3D by default, which preserves backward compatibility, unless an explicit type hint is provided. This means that there is no type inference: all data dimensions and lower-dimensional fields must be declared explicitly.

One important problem addressed by this proposal is that a temporary field might be initialized with different expressions in different intervals of the same computation. Any type inference mechanism would then need complex rules that are hard for users to understand.

Example:

    def stencil(a: Field[IJ, float], b: Field[IJK, float]):
        # Complex type inference example for 'tmp'
        with computation:
            with interval(0, 1):
                tmp = 0.5
            with interval(1, 2):
                tmp = 3.0 * a
            with interval(2, -1):
                another_tmp = a + b
            with interval(-1, None):
                tmp = -0.5 * b
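To make the ambiguity concrete, here is a minimal sketch in plain Python (no GT4Py involved). It assumes a hypothetical inference rule, "a temporary spans the union of the axes it reads", which is not part of this proposal; the names `PARAM_AXES`, `TMP_ASSIGNMENTS` and `inferred_axes` are illustrative only.

```python
# Axes of the stencil parameters, as annotated in the example above.
PARAM_AXES = {"a": "IJ", "b": "IJK"}

# Fields read on the right-hand side of each assignment to `tmp`, per interval.
TMP_ASSIGNMENTS = {
    (0, 1): [],          # tmp = 0.5
    (1, 2): ["a"],       # tmp = 3.0 * a
    (-1, None): ["b"],   # tmp = -0.5 * b
}


def inferred_axes(fields):
    """Hypothetical rule: a temporary spans the union of the axes it reads."""
    used = set().union(*(PARAM_AXES[name] for name in fields))
    return "".join(axis for axis in "IJK" if axis in used)


# One candidate dimensionality per interval in which `tmp` is assigned.
CANDIDATES = {
    interval: inferred_axes(fields) for interval, fields in TMP_ASSIGNMENTS.items()
}

# More than one distinct candidate means the rule alone cannot type `tmp`.
IS_AMBIGUOUS = len(set(CANDIDATES.values())) > 1
```

Under this assumed rule the three assignments suggest a scalar, an IJ field and an IJK field respectively, so any inference mechanism would need an additional tie-breaking rule that users would have to learn.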

In the proposed solution, temporaries can be optionally declared in the function scope to define their dimensionality before their first use:

    def stencil(a, b):
        # Lower dimensional temporary
        c_2d: Field[IJ, float]
        # Lower dimensional temporary with data dimensions
        d_4d: Field[IJ, (float, (2, 2))]
        # Lower dimensional temporary with default ??
        c_2d_with_default: Field[IJ, float] = 0.

        with computation:
            with interval(0, 2):
                # c is 3d by default
                c_3d = inp_2d[0, 0]
                # c is 2d because of the explicit typing hint above
                c_2d = inp_2d[0, 0]
            with interval(2, None):
                # c is 3d by default
                c_3d = 3
                # c is 2d because of the explicit typing hint above
                c_2d = inp_2d[0, 0]

        # Optional shortcut to access data dims without spatial offset:
        #   field[[d1, d2]] == field[0, ...][d1, d2] ?
        #   example: field_a[[0, 1]] = field_b[[0, 0]]
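The optional shortcut is posed above as an open question. One possible reading can be sketched in plain Python as follows. The `FieldAccess` class and `_expand_offset` helper are hypothetical and only model the indexing semantics (the first subscript is a spatial offset, the second a data index, and a list subscript means "data index at zero offset"); they are not GT4Py API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


def _expand_offset(index, ndim=3):
    """Replace a single `...` in a spatial offset with zeros (assumes 3 axes)."""
    if Ellipsis in index:
        pos = index.index(Ellipsis)
        fill = (0,) * (ndim - len(index) + 1)
        index = index[:pos] + fill + index[pos + 1:]
    return index


@dataclass(frozen=True)
class FieldAccess:
    name: str
    offset: Optional[Tuple[int, ...]] = None
    data_index: Optional[Tuple[int, ...]] = None

    def __getitem__(self, index):
        if isinstance(index, list):
            # Shortcut form field[[d1, d2]]: data index with zero spatial offset.
            if self.offset is not None:
                raise IndexError("the shortcut cannot follow a spatial offset")
            return FieldAccess(self.name, (0, 0, 0), tuple(index))
        index = index if isinstance(index, tuple) else (index,)
        if self.offset is None:
            # First subscript: spatial offset.
            return FieldAccess(self.name, _expand_offset(index))
        # Second subscript: data index.
        return FieldAccess(self.name, self.offset, index)


field_a = FieldAccess("field_a")
SHORT = field_a[[0, 1]]
LONG = field_a[0, ...][0, 1]
```

With these assumed semantics both spellings produce the same access, which is the equivalence the comment in the example asks about.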

Implementation

Here is a summary of the main changes required to implement this proposal:

Frontend

  1. Temporary declaration
  • Make sure the ValueInliner pass replaces external symbols with their actual values inside the type hints used for temporary declarations.
  • Support Assign nodes in the CONTROL_FLOW context for the declaration of temporaries with type hints only.
  • Inside IRMaker.visit_Assign, add a new if branch for the case where the current context is CONTROL_FLOW:
    • If the statement has no assigned value (declaration only), parse the type hint and save this information for the future declaration of the temporary (the first time it is used).

    • If the assigned value is a scalar, additionally create a new computation block with the assignment/initialization using the scalar value. Example:

      c_2d_with_default: Field[IJ, float] = 1.23

      The implementation of this feature could introduce a new fully parallel with computation block containing the initial assignment:

      with computation(PARALLEL), interval(...):
          c_2d_with_default = 1.23
      
  • FieldDecl for temporaries:
    • Parse type hints
    • If temporary is lower dimensional: add only declared axes
    • If temporary is higher dimensional: add declared data_dims
  2. Statements
  • Make sure assignments to temporaries with data dimensions are supported if the temporary has been declared with data dimensions.
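The declaration-handling steps above can be sketched with Python's standard `ast` module, in which an annotated statement is an `ast.AnnAssign` node whose `value` is `None` when no value is assigned. This is a minimal illustration, not the GT4Py frontend: `parse_field_hint`, `collect_declarations` and `initial_block` are hypothetical helpers, and only `Field[axes, dtype]` and `Field[axes, (dtype, data_dims)]` hints are handled.

```python
import ast

SOURCE = '''
def stencil(a, b):
    c_2d: Field[IJ, float]
    d_4d: Field[IJ, (float, (2, 2))]
    c_2d_with_default: Field[IJ, float] = 1.23
    with computation(PARALLEL), interval(...):
        c_3d = a + b
'''


def parse_field_hint(annotation):
    """Split `Field[axes, dtype]` or `Field[axes, (dtype, data_dims)]`."""
    args = annotation.slice
    if type(args).__name__ == "Index":  # Python 3.8 wraps subscript contents
        args = args.value
    axes_node, dtype_node = args.elts
    if isinstance(dtype_node, ast.Tuple):
        dtype = dtype_node.elts[0].id
        data_dims = ast.literal_eval(dtype_node.elts[1])
    else:
        dtype, data_dims = dtype_node.id, ()
    return {"axes": axes_node.id, "dtype": dtype, "data_dims": data_dims}


def collect_declarations(source):
    """Return the hinted temporaries declared at function scope."""
    function = ast.parse(source).body[0]
    decls = {}
    for stmt in function.body:
        if isinstance(stmt, ast.AnnAssign):
            info = parse_field_hint(stmt.annotation)
            # `value` is None for a declaration-only statement.
            info["default"] = (
                None if stmt.value is None else ast.literal_eval(stmt.value)
            )
            decls[stmt.target.id] = info
    return decls


def initial_block(name, default):
    """Source of the extra block that initializes a temporary with a default."""
    return f"with computation(PARALLEL), interval(...):\n    {name} = {default!r}"


DECLS = collect_declarations(SOURCE)
```

Temporaries without a hint, such as `c_3d`, do not appear in `DECLS` and would keep the default 3D treatment.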

Toolchain / Backend

  • Passes and optimizations: only a few should be affected, and those should not be hard to fix

  • GTC backends should be supported

    • Code generation for GridTools C++ 2.x backends (check details and questions with Anton Afanasyev):
      • Higher-dimensional temporaries are declared as normal temporaries using GT_DECLARE_TMP (docs). The only difference is that the element type of the temporary is not a single scalar but a fixed-size array (for CUDA compatibility, use gridtools::array instead of std::array).
      • For lower dimensional temporaries, an actual field needs to be declared because GridTools C++ assumes all temporaries work as 3D/IJK fields. Check the code snippet at the end of document with an example of how to declare and use a 2D IJ temporary in a vertical reduction. Note that the fake temporary field is allocated using GridTools functionality (make_cached_allocator) for performance.
  • DaCe backends should be supported and will most likely work out of the box, since the entry point is the OIR, which is barely affected by the changes. Verify that it works, add tests and fix any minor issues that appear.

  • Classic backends are out of scope and don't need to be supported.
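For an array-based backend, the declared axes and data dimensions together determine how a temporary is allocated. The sketch below is an assumption about how that shape could be derived, not existing backend code: `temporary_shape` is a hypothetical helper, and halos/extents are deliberately ignored.

```python
def temporary_shape(domain, axes="IJK", data_dims=()):
    """Allocation shape of a temporary for a (ni, nj, nk) compute domain.

    Only the declared axes contribute a spatial dimension; data dimensions
    are appended as trailing dimensions. Halo sizes are not modeled here.
    """
    sizes = dict(zip("IJK", domain))
    return tuple(sizes[axis] for axis in axes) + tuple(data_dims)


DOMAIN = (4, 5, 6)
SHAPE_DEFAULT = temporary_shape(DOMAIN)              # undeclared temporary
SHAPE_2D = temporary_shape(DOMAIN, "IJ")             # Field[IJ, float]
SHAPE_4D = temporary_shape(DOMAIN, "IJ", (2, 2))     # Field[IJ, (float, (2, 2))]
```

An undeclared temporary keeps the full 3D shape, which is the backward-compatible default described in the Solution section.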

Optional: Type hints on functions

Implementing type hints for function arguments or for symbols declared inside functions should be straightforward, because a function body does not contain multiple interval/computation contexts.

    def gtfunc(a: Field[IJ, float], b):
        tmp: Field[IJ, (float, (2, 3))] = a + b
        return tmp + b

    def stencil(a, b):
        with ...:
            c = gtfunc(a, b[0, 0, 0][1])

    def gtfunc(a, b):
        return a[0, 0, 0]+b

    def stencil(a, b):
        with ...:
            c = a[0, 0, 0]+b[0, 0, 0][1]

The CallInliner pass in the frontend inlines all function calls and creates unique names for each argument or temporary symbol declared. If the declarations contain type hints, they should be parsed and used in the FieldDecl node.
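A rough sketch of how a type hint could survive the renaming step, using Python's standard `ast` module. The naming scheme (`<symbol>__<function>_<call id>`) and the helpers `SymbolRenamer` and `rename_locals` are assumptions made for illustration; the actual CallInliner may generate names differently.

```python
import ast

FUNC_SOURCE = '''
def gtfunc(a: Field[IJ, float], b):
    tmp: Field[IJ, (float, (2, 3))] = a + b
    return tmp + b
'''


class SymbolRenamer(ast.NodeTransformer):
    """Rename every `Name` node whose identifier appears in `mapping`."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node


def rename_locals(source, call_id):
    """Give the hinted locals of a function a unique name for one call site."""
    function = ast.parse(source).body[0]
    local_names = [
        stmt.target.id for stmt in function.body if isinstance(stmt, ast.AnnAssign)
    ]
    # Hypothetical unique-name scheme, one suffix per inlined call.
    mapping = {name: f"{name}__{function.name}_{call_id}" for name in local_names}
    renamed = SymbolRenamer(mapping).visit(function)
    return mapping, renamed


MAPPING, RENAMED = rename_locals(FUNC_SOURCE, call_id=0)
DECL = RENAMED.body[0]  # the renamed annotated declaration of `tmp`
```

The annotation node stays attached to the renamed declaration, so it remains available when the corresponding FieldDecl is created.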

Other issues

  • Type hinting temporaries as scalar variables is not supported: it should be the toolchain, not the user, that decides the best implementation of fields which are constant over the whole iteration domain. If the user wants to improve the readability of the code, simple symbol naming conventions can be used (_constant, _global, _sc, ... suffixes).

Rabbit holes

  • Some optimizations assume fixed axes and dtypes for temporaries, so they will not work out of the box and will break.
  • Implementation of lower dimensional temporaries with GridTools C++ might be tricky, but it should be doable following the approaches described here.

GridTools C++ 2.0 code snippet

    using axis_t = axis<1>;
    using full_t = axis_t::full_interval;

    // Vertical scan: accumulates the running minimum of `in` along k into `out`,
    // which is bound to the 2D temporary.
    struct calc_min {
        using in = in_accessor<0>;
        using out = inout_accessor<1, extent<0, 0, 0, 0, -1, 0>>;
        using param_list = make_param_list<in, out>;

        template <typename Evaluation>
        GT_FUNCTION static void apply(Evaluation eval, full_t::first_level) {
            eval(out()) = eval(in());
        }

        template <typename Evaluation>
        GT_FUNCTION static void apply(Evaluation eval, full_t::modify<1, 0>) {
            eval(out()) = std::min(eval(in()), eval(out(0, 0, -1)));
        }
    };

    // Subtracts the reduced minimum (`in`, the 2D temporary) from `out`.
    struct use_min {
        using in = in_accessor<0>;
        using out = inout_accessor<1>;
        using param_list = make_param_list<in, out>;

        template <typename Evaluation>
        GT_FUNCTION static void apply(Evaluation eval) {
            eval(out()) -= eval(in());
        }
    };

    const auto my_spec = [](auto tmp, auto out) {
        return multi_pass(execute_forward().stage(calc_min(), out, tmp),
                          execute_parallel().stage(use_min(), tmp, out));
    };

    using namespace literals;

    TEST(my_tmp, _) {
        // The 2D (IJ) "temporary" is an actual field of shape (1, 1),
        // allocated through the cached allocator.
        auto alloc = sid::make_cached_allocator(&std::make_unique<char[]>);
        auto tmp = sid::shift_sid_origin(
            sid::make_contiguous<double>(alloc, tuple_util::make<tuple>(1, 1)), tuple<>());
        double out[1][1][3] = {{{3, 2, 1}}};
        run(my_spec, naive(), make_grid(1, 1, 3), tmp, out);
        for (auto &&o : out)
            for (auto &&oo : o) {
                EXPECT_EQ(oo[0], 2);
                EXPECT_EQ(oo[1], 1);
                EXPECT_EQ(oo[2], 0);
            }
    }