# GTScript: _for_ loops in stencil bodies
###### tags: `cycle 2`
shaped by: Enrique, Mauro, Johann
## Assigned
to Enrique, Tobias W, full cycle
## Context
GTScript currently supports no looping constructs other than the implicit IJK loops of the parallel model. Code therefore has to be either duplicated or split across multiple stencil calls. Neither is ideal.
Loops fall into a few distinct categories:
1. `for x in y` loop control flow inside the AST. This can be compile-time or run-time (e.g. `y = range(N)`).
2. Tracer loops: run the same set of computations (a stencil) over a set of inputs, moving the `for` loop from Python into the stencil. This is like a compile-time `for` loop outside a computation. It requires stencils calling stencils, and frontend support for passing tuples of fields to a stencil.
3. Vertical sub-looping inside an interval over a `+/- k` range.
This project addresses the first of these. The second is probably best addressed by enabling stencils calling stencils, in conjunction with a loop enabled by this project. Vertical sub-looping is a more advanced version of `for` loops and is out of scope here.
## Problem
The ability to loop inside stencil bodies is required to implement linear algebra operations and other patterns. Since stencil expressions act on full fields, the number of loop iterations must be independent of the iteration point, so the termination condition must **not** be a field expression.
In this project we propose to implement _for_-loops over compile-time sized lists of values, which are useful for scanning through higher-dimensional fields such as vector or matrix fields. The restriction to literals in `range` could be lifted if substituting a variable turns out to be easy. Debug-mode bounds checks may be introduced to catch out-of-bounds accesses.
## Appetite
Full cycle, 1 or 2 people. Probably 2 in any case, and definitely 2 developers if the feature has to be implemented in both the GTC and classic toolchains (although the classic toolchain might not be needed).
## Features
- Syntax for looping construct: `for` loops.
- Syntax for the specification of the iteration:
+ `range(stop)` or `range(start, stop[, step])` with the usual [Python semantics](https://devdocs.io/python~3.9/library/stdtypes#range)
+ `tuple`-like objects with literal values: `for i in (1.4, 2.3, 5.1, 7.2):`
- Using the loop index variable in scalar expressions
- Indexing of field data dimensions with variables (**not** for spatial indices):
```python
@gt.stencil()
def stencil(...):
    with computation(FORWARD), interval(...):
        for idx in range(3):
            vector_field[0][idx] = scalar_field[0,0,0] - 2.0 * idx
```
- Code generation, plus updated checks in the optimization passes so that loops cannot lead to incorrect generated code
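To make the intended semantics of the example stencil concrete, here is a pure-Python reference (not GTScript). It is only a sketch: the function name `apply_reference`, the nested-list layout `[i][j][k]`, and the parameter `n_components` are assumptions made for illustration.

```python
# Pure-Python reference for the loop in the example stencil (not GTScript).
# The nested-list layout [i][j][k] and all names are illustrative assumptions.

def apply_reference(scalar_field, n_components=3):
    """Return a field holding `n_components` values per grid point."""
    vector_field = []
    for plane in scalar_field:          # I axis
        out_plane = []
        for column in plane:            # J axis
            out_column = []
            for value in column:        # K axis, visited in FORWARD order
                # The trip count depends only on n_components,
                # never on the grid point or on a field value.
                out_column.append(
                    [value - 2.0 * idx for idx in range(n_components)]
                )
            out_plane.append(out_column)
        vector_field.append(out_plane)
    return vector_field


scalar = [[[1.0, 2.0]]]                 # shape (1, 1, 2)
result = apply_reference(scalar)
```

Every grid point runs the same three iterations, which is the property the proposal relies on: the loop bounds are known without looking at any field data.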
To keep the extent analysis working, the following limitations should be applied:
- field indexing using the loop variable should be limited to data (non-spatial) dimensions of fields
- if statements inside the _for_-loop body would trigger the creation of new stages or otherwise affect the extent analysis, the loop should be fully unrolled at compile-time
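The first limitation could be enforced with a simple syntactic check. The sketch below uses Python's standard `ast` module and assumes, as in the example stencil, that the first subscript on a field name holds the spatial offset and a second subscript holds the data-dimension index. The function name `loop_vars_in_spatial_offsets` is hypothetical, and a real check would also need to distinguish fields from other subscriptable names.

```python
import ast


def loop_vars_in_spatial_offsets(source):
    """Return loop variables that appear in a first-level subscript.

    Assumption: `name[...]` is a spatial offset, `name[...][...]` indexes
    data dimensions (the convention used by the example stencil).
    """
    offenders = set()
    for loop in ast.walk(ast.parse(source)):
        if not (isinstance(loop, ast.For) and isinstance(loop.target, ast.Name)):
            continue
        for stmt in loop.body:
            for node in ast.walk(stmt):
                # First-level subscript: the subscripted value is a plain name.
                if isinstance(node, ast.Subscript) and isinstance(node.value, ast.Name):
                    used = {n.id for n in ast.walk(node.slice) if isinstance(n, ast.Name)}
                    if loop.target.id in used:
                        offenders.add(loop.target.id)
    return sorted(offenders)


allowed = "for idx in range(3):\n    vector_field[0][idx] = scalar_field[0, 0, 0] - 2.0 * idx\n"
rejected = "for idx in range(3):\n    out[0, 0, 0] = inp[0, 0, idx]\n"
```

Here `allowed` only uses `idx` in a data dimension, while `rejected` uses it as a vertical offset, which would make the extent depend on the loop.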
Other features (such as a dedicated syntax for specifying bounds, or more sophisticated code analysis) should be left to later cycles, in case more advanced looping use cases come up. Whether they will is not yet clear.
## Implementation
The GTScript frontend will need to parse the `for` statement, supporting `range` and `tuple`-like objects as iteration specifiers.
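As a rough illustration of that parsing step, the sketch below classifies the iteration specifier of a `for` loop using only Python's standard `ast` module. It is not the GT4Py frontend: the function name `parse_iteration_spec` and the returned tuple format are assumptions chosen for this example.

```python
import ast


def parse_iteration_spec(source):
    """Classify the iteration spec of the first `for` loop in `source`.

    Returns ("range", start, stop, step) or ("values", tuple_of_literals).
    Raises ValueError for anything that is not a compile-time literal.
    """
    loop = next(n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.For))
    spec = loop.iter
    if (
        isinstance(spec, ast.Call)
        and isinstance(spec.func, ast.Name)
        and spec.func.id == "range"
    ):
        if spec.keywords or not 1 <= len(spec.args) <= 3:
            raise ValueError("range() expects 1 to 3 positional arguments")
        # literal_eval raises ValueError for non-literals such as `range(n)`.
        args = [ast.literal_eval(arg) for arg in spec.args]
        if not all(isinstance(a, int) for a in args):
            raise ValueError("range() arguments must be integer literals")
        r = range(*args)  # Python fills in the defaults and rejects step == 0
        return ("range", r.start, r.stop, r.step)
    if isinstance(spec, ast.Tuple):
        return ("values", ast.literal_eval(spec))
    raise ValueError("unsupported iteration spec")


range_spec = parse_iteration_spec("for i in range(1, 10, 2):\n    pass")
tuple_spec = parse_iteration_spec("for x in (1.4, 2.3, 5.1, 7.2):\n    pass")
```

Delegating to the built-in `range` keeps the start/stop/step defaults identical to the usual Python semantics instead of re-implementing them.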
The `For` node could follow the example of `ast.For`:
```python
class For(Expr):
    target: VarDecl
    iterator: Iterator
    body: Expr
    # orelse: Expr  # Probably not for GTScript
```
where an `Iterator` is a type of node that abstracts the iteration spec:
```python
class Iterator:
    dtype

class IndexRange(Iterator):
    start: ScalarAccess
    end: ScalarAccess
    step: ScalarAccess
```
If `iterator` is a compile-time constant, the frontend could unroll the loop itself and never generate a `For` node. However, to keep the frontend simple and to leave room for run-time values in the range specification later, the unrolling should be implemented as an optimization pass.
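The shape of such an unrolling pass can be sketched on plain Python ASTs. The example below is only an analogy for a pass over the GTScript IR: it uses the standard `ast.NodeTransformer`, the class names `UnrollLiteralRanges` and `_Substitute` are hypothetical, and `ast.unparse` requires Python 3.9 or later. Loops whose bounds are not non-negative integer literals are left untouched, mirroring the idea that run-time ranges would keep their `For` node.

```python
import ast
import copy


class _Substitute(ast.NodeTransformer):
    """Replace reads of one variable with a constant value."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def visit_Name(self, node):
        if node.id == self.name and isinstance(node.ctx, ast.Load):
            return ast.copy_location(ast.Constant(value=self.value), node)
        return node


class UnrollLiteralRanges(ast.NodeTransformer):
    """Unroll `for <name> in range(<int literals>)`; keep other loops."""

    def visit_For(self, node):
        self.generic_visit(node)  # unroll nested loops first
        it = node.iter
        is_literal_range = (
            isinstance(node.target, ast.Name)
            and not node.orelse
            and isinstance(it, ast.Call)
            and isinstance(it.func, ast.Name)
            and it.func.id == "range"
            and not it.keywords
            and 1 <= len(it.args) <= 3
            and all(
                isinstance(a, ast.Constant) and isinstance(a.value, int)
                for a in it.args
            )
        )
        if not is_literal_range:
            return node
        unrolled = []
        for value in range(*(a.value for a in it.args)):
            for stmt in node.body:
                # Copy the statement, then bake the loop index into it.
                clone = copy.deepcopy(stmt)
                unrolled.append(_Substitute(node.target.id, value).visit(clone))
        return unrolled or ast.Pass()  # an empty range leaves a no-op


source = "for idx in range(3):\n    out[idx] = inp - 2.0 * idx\n"
tree = ast.fix_missing_locations(UnrollLiteralRanges().visit(ast.parse(source)))
unrolled_source = ast.unparse(tree)
```

Because the pass is separate from parsing, the same `For` node can later carry run-time bounds without any change to the frontend.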
### No-goals
- Linear algebra operations
- Advanced optimizations: anything more complex than loop-unrolling is definitely out of scope
- Tracer loops
- Tuple inputs to stencils
## Possible time-sinks
- Run-time iterations are difficult to define in general
- Merging the feature while backends and other features are still under development. Frequent sync-ups between the teams should alleviate this problem
## Implementation notes
- GridTools C++ example showing how to perform a nested inner loop over the extra storage dimension(s), in this case projecting a field 'f' onto a finite element space:
https://github.com/GridTools/gridtools/blob/master/tests/regression/extended_4D.cpp