# Support for lower and higher dimensional fields in GTScript
###### tags: `cycle 1`
## Problem
GTScript stencils currently only support 3D scalar fields. Although this approach simplified the initial development of the toolchain, and works for the most common cases, it complicates the development of applications storing 2D or 1D data (domain dims < 3) or dealing with fields that do not contain scalar values but vector/tensor values (data dims > 1). Currently, the user needs to use workarounds to deal with these cases using 3D fields:
- tile the lower dimensional data to fit into a 3D array. This greatly affects performance.
- use several scalar 3D fields to represent a single vector field, which obfuscates the code and does not scale for large vectors or higher order fields. A critical example of higher dimensional fields appears in the CloudSC dwarf from Escape (example code for porting IFS physics to GT4Py), which contains an LU solver. Additionally, high-order FEM methods also require non-scalar datatypes, for example to store degrees of freedom.
## Appetite
Half a cycle.
## Solution
Since this project involves API and DSL changes, a small GDP could be written to collect the proposed changes and document the reasons behind the design. The actual implementation of this project during the build cycle would be used as the reference implementation of the GDP and could be approved and merged at the end, if the projects succeeds.
### For lower dimensional fields
This has been implemented for the non-GTC backends in PR [#288](https://github.com/GridTools/gt4py/pull/288).
1. Move frontend checks to new PR
2. Add support for 1D and 2D fields in the `gtir.CartesianOffset` and `gtir.FieldDecl` nodes, by either using axes masks or optional offsets in the nodes (both approaches could work).
3. Support correct indexing in backends (ask Anton and Mauro about SID concept implemetation details)
- GridTools 1.x: May require different data store types for each field axis combination. See [#288](https://github.com/GridTools/gt4py/pull/288).
- GridTools 2.x: Uses SIDs and an adapter from Buffer Protocol to SID (which it is currently possible using `rename_dimensions` ??).
4. Adapt `StencilObject.__call__() / _call_run()` checks to deal with lower dimensional / masked arrays (using old storages for now, which do not check that the axis tags are correct, just check that `ndims` matches).
### For higher dimensional fields
1. First task is to define the syntax to be used for defining and indexing higher dimensional fields and extend the frontend to support it.
- Syntax for type annotations: a symple lightweight syntax (which does not introduce new keywords) can be used for now and it could be refined letter with more powerful concepts in a syntax worskop:
```python=3.8
def stencil(field_a: Field[IJK, ((3, 3), float)]):
...
```
Following discussions in GTScript workshop and in the scientific community about the typing of ndarrays (https://github.com/ramonhagenaars/nptyping), other less preferable options could be:
```python=3.8
def stencil(field_a: Field[IJK, Tensor[float, (3, 3)]]):
...
def stencil(field_a: Field[IJK, float, (3, 3)]):
...
```
- Syntax for indexing operations: another set of brackets will be used for the indices of the data dimensions.
```python=3.8
field[0, 0, 0][2] = 3 # [0,0,0] -> domain offset, [2] -> data offset
```
To avoid ambiguities in the syntax, when indexing fields with data dimensions, the index should always be fully specified for both domain and data dimensions, even if all of them are zero (e.g.`field_a[0,0,0][0,0]`).
2. IR modifications
- gtir.FieldDecl: add support for the definition of the data dimensions
- gtir.FieldAccess: add support for the indexing offsets of the data dimensions
3. Backend modifications to support the following operations with higher dimensional fields:
- Reading from single data element (e.g. `a = field[0, 0, 0][1]`)
- Assignment to single data element (e.g. `field[0, 0, 0][1] = 3`)
- Is there something needed here to deal with higher dimensional fields in GridTools C++?
4. Support in temporaries
- Frontend will check for this syntax being used with temporaries: `tmp[0,0,0][3] = 0` will cause exception to be raised
5. Adapt code generation to deal with data dimensions. In principle, nothing should be changed here for GridTools C++ since SID does not care of the semantics of the dimensions (domain dimensions and data dimensions are the same).
6. Adapt `StencilObject.__call__() / _call_run()` checks to deal with data arrays with data dimensions.
<!--
### Typing compatibility
It seems like the existing GTScript typing does not work with
- https://github.com/GridTools/gt4py/issues/214
-->
### No-goals
- Optimizations to reduce the dimensionality of temporary fields
- This project will not support explicit typing for higher or lower dimensional storages
- Support for linear algebra operators for higher dimensional fields (vector / matrix operations or element-wise operations between data vectors).
### Time sinks
- Definition of a proper syntax to annotate higher dimensional Fields: check GTScript syntax workshop discussions and try to be conservative (more user-friendly syntatic sugar can be added later).
- Definition of a proper syntax for indexing the data dimensions of higher dimension: request a explicit syntax from the user to avoid problems. Syntatic sugar can be added later.
- Dealing with GridTools C++ data_store / SID types in the generated code: it shouldn't really be a problem in GT 2.0 with SIDs, but the SID concept and interface should be well understood by the developers.