# GTScript2 Syntax Proposal (Anton) ###### tags: `GTScript`,`GTScript-2` ## True Stencils ### Definition #### Stencil Let us define a stencil as a python function that takes zero or more field accessors in the given point and return the value in this point. The simplest case -- stencil without arguments: ```python def foo(): return 42 ``` This stencil being executed over some N-dimensional domain will return a field with 42 in each point of that domain. Stencil can have several field as outputs. In this case it should return a tuple: ``` python def bar(): return 3, 4.5 ``` #### Field Accessors Stencils take field accessors as parameters. Let us define an accessor as a function that takes offsets in the form and of python `kwargs` and returns the field value taken with the given offset from the current point. In other words accessors have the signature like `f(**offsets : Int). To use accessor one should call it like this: ```python f() # returns the value in the focus point f(i=-1, k=2) # returns the value taken with offset -1 in i-direction and offset 2 in k-direction ``` Q: Why can’t we use positional parameters for accessors? A: Positional parameters sucks in several different ways: * `f(0, 0, 1)` is less expressive than `f(k=1)`. *(If you in doubt try: `f(some_extra_dim=1)` vs `f(0, 0, 0, 0, 0, 0, 1)`) * Poor composability. Say you want to define stencils like: ```python def dx(f): return f(1, 0) - f(0, 0) def dy(f): return f(0, 1) - f(0, 0) ``` It is hard to factor out the common pattern here. For the named dimension it is trivial: ```python def d(dim): return lambda f: f(**{dim:1}) - f() dx = d(‘i’) dy = d(‘j’) ``` ### Composability. Lift operation Suppose we already have `dx` stencil and want to define `d2x`. The naive approach: ```python def d2x(f): return dx(dx(f)) ``` It doesn’t work - `dx` accepts a field accessor and returns a value. We need something like this: ```python def d2x(f): return dx(lift(dx)(f)) ``` Where `lift` is a high order function that takes a stencil and returns a function that takes accessors and returns an accessor. It can be several ways to implement such a function in the context of the stencil execution. We can do full inlining of this function or we can use temporary field to store intermediate parameters. We can have even more variations depending on backend. Caching etc. Inlining alternative can be implemented in python: ```python= def lift(stencil): def res(*fs): def res(**outer_offsets): def wrapper(f): def res(**offsets): for dim in outer_offsets: if not dim in offsets: offsets[dim] = 0 offsets[dim] += outer_offsets[dim] return f(**offsets) return res return stencil(*(wrapper(f) for f in fs)) return res return res ``` With the regular function composition and this `lift` abstraction we can define almost any true stencil. Here is horizontal diffusion definition as an example: ```python= def lap(f): return 4. * f() - f(i=1) - f(i=-1) - f(j=1) - f(j=-1) def flax(dim): def d(f): return f(**{dim:1}) - f() def res(l, f): res = d(l) return res if res * d(f) > 0. else 0. return res def hd(f, coeff): l = lift(lap)(f) fx = lift(flax('i'))(l, f) fy = lift(flax('j'))(l, f) return f() - coeff() * (fx() - fx(i=-1) + fy() - fy(j=-1)) ``` ### Execution Obviously to execute a stencil we need at least: * stencil definition * compute domain definition * input fields * otput field(s) Let us define computation domain as a dictionary of pairs of integers. Something like: ```python make_domain(i=(2, 8), j=(2, 8), k=(0, 10)) # (1) make_domain(i=(2, 8), extra=(0, 3)) # (2) custom domain ``` The second case defines non standard computation domain. it should be legal from the language perspective to pass such a domain for execution. The natural requirement is that the keys withing domain match the keys within output fields. The requirements for the fields are different depending on the type of the fields. Let as take NumPy array as an example. They have positional indexing hence we need to pass a map from positional to symbolic dimensions for each of the field. Execution call for horizontal diffusion could look like: ```python= # regular 3D iteration execute( hd, make_domain(i=(2, 8), j=(2, 8), k=(0,10)), output=np_field(out, ['i', 'j', 'k']), np_field(f, ['i', 'j', 'k']), np_field(coeff, ['i', 'j', 'k']), ) # custom iteration execute( hd, make_domain(i=(2, 8), j=(2, 8), a=(0,2), b=(0,3)), output=np_field(out, ['i', 'j', 'b', 'a']), # out is 4D np_field(f, ['i', 'j']), # f is an ij field np_field(coeff, ['a', 'b']), # coeff is a matrix ) ``` ### Reference (naive) implementation The good thing about GTSript2 that the stencil definition can be executed by python without parsing to AST. The recipe is the following: * iterate over the computation domain keeping track on dimension indices * for each iteration prepare field accessors using dimension indices and input fields * get output value(s) by calling stencil * place output value(s) into output field using dimension indices This kind of direct implementation can also serve as a reference implementation. Note on self assignment in execute Consider the following: ```python execute( my_stencil, make_domain(i=(0,10)) output=np_field(f, ['i']), np_field(f, ['i']), ) ``` The same field is used as input and output here. The result depends on `my_stencil`. If it defined like: ```python def my_stencil(f): return 2 * f() ``` execution is legal. But if it is like: ```python def my_stencil(f): return 2 * f(i + 1) ``` the result of execution is undefined becuase it depends on iteration order. ## Vertical Stencils ### Stencil We represent vertical stencils as python functions that takes a generator that iterate over vertical indices. This function should return a generator function that takes field accessors and yields values for all vertical range. We also need to specify what dimension is vertical and in what direction (forward or backward) we a going to yield. Example: ```python= def thomas(kk): def res(inf, diag, sup, rhs): c = [None] * len(kk) d = c.copy() for k in kk: if k == 0: c[k] = sup(k=k) / diag(k=k) d[k] = rhs(k=k) / diag(k=k) else: c[k] = sup(k=k) / (diag(k=k) - c[k - 1] * inf(k=k)) d[k] = rhs(k=k) - inf(k=k) * d[k - 1] / (diag(k=k) - c[k - 1] * inf(k=k)) out = None for k in reversed(kk): if out is None: out = d[k] else: out = d[k] - c[k] * out yield k, out return res thomas.dimension = 'k' thomas.is_backward = True ``` ### Execution Naive python execution should place vertical dimension loop the innermost. ```python k_loop = thomas(range(domain['k'][0], domain['k'][1])) # outer loops for k, val in k_loop(...): out[i, j, k] = val ``` ### Composability Let us introduce a helper: ```python def shift(**offsets): return lift(lambda f: f(**offsets)) ``` Call true stencil `a` from vertical stencil `b`: ```python def b(kk): def res(f): for k in kk: yield k, a(shift(k=k)(f)) + 42 return res ``` Call vertical stencil `a` from vertical stencil `b`: ```python def b(kk): assert(b.dimension == a.dimension) def res(f): for k, val in a(kk)(f): yield k, val + 42 return res ``` ### Complicated MeteoSwiss Example Fortran: ```python= do k=2,km-1 do i=1,im if( q(i,k,ic) < 0. ) then if ( q(i,k-1,ic) > 0. ) then ! Borrow from above dq = min ( q(i,k-1,ic)*dp(i,k-1), -q(i,k,ic)*dp(i,k) ) q(i,k-1,ic) = q(i,k-1,ic) - dq/dp(i,k-1) q(i,k ,ic) = q(i,k ,ic) + dq/dp(i,k ) endif if ( q(i,k,ic)<0.0 .and. q(i,k+1,ic)>0. ) then ! Borrow from below: dq = min ( q(i,k+1,ic)*dp(i,k+1), -q(i,k,ic)*dp(i,k) ) q(i,k+1,ic) = q(i,k+1,ic) - dq/dp(i,k+1) q(i,k ,ic) = q(i,k ,ic) + dq/dp(i,k ) endif endif enddo enddo ``` GTScript2: ```python= def f(kk): l = len(kk) kk = itertools.takewhile(lambda k : k < l - 2, kk) kk = itertools.dropwhile(lambda k : k < 2, kk) def res(q, dp): prev = q(k=0) cur = q(k=1) for k in kk: prev = cur cur = next next = q(k=k+1) if cur < 0: if prev > 0: dq = min(prev * dp(k=k-1), -cur * dp(k=k)) prev = prev - dq / dp(k=k-1) cur = cur + dq / dp(k=k) if next > 0: dq = min(next * dp(k=k+1), -cur * dp(k=k)) next = next - dq / dp(k=k+1) cur = cur + dq / dp(k=k) yield k, prev yield l - 2, cur yield l - 1, next return res f.dimension = 'k' f.is_backward = False ```