# No-Storage implementation
###### tags: `functional cycle 9` `storage`
developer: Linus (half cycle)
## ToDo Shaping:
* What about arrays that violate the layout expected by the GT backends: do we simply not enforce it, or continue to raise?
## Description
We propose to replace the storage interface in cartesian gt4py with a new approach that does not rely on custom types but rather leverages existing standards and only adds a minimal custom interface where necessary. The new interface will be shared among cartesian and unstructured gt4py.
The full specification can be found in [this document](https://hackmd.io/kl_H-FZ5SvirNp_28qQI5Q), which is also to be finalized during the cycle.
The key points are:
* There are no Storage classes in gt4py
* There is no mechanism in gt4py copying data from CPU to GPU or vice-versa.
* We rely on existing standard interfaces to expose buffers for use in stencils rather than creating our own classes or interfaces.
* We do provide utilities `zeros`, `ones`, `full` and `empty` and their corresponding `{}_like` counterparts to allocate arrays that have an appropriate layout and alignment. Their signatures will be similar to those proposed in GDP-3. For GPU backends the result is a CuPy ndarray; for CPU backends, a NumPy ndarray.
* The semantic meaning of the dimensions of arrays passed to stencils is looked up in the following order:
    1. the `dims` attribute of the array, if present
    2. the annotation of the corresponding argument to the stencil
* While any type implementing the standard interfaces is understood by the stencils, an array can still be incompatible with a given backend, e.g. because its layout is not supported there.
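The lookup order for dimension labels described above could be sketched as follows. This is only an illustration: `resolve_dims`, `LabeledArray` and the `annotated_dims` parameter are hypothetical names and not part of gt4py; the sketch assumes that labels are plain strings and that the annotation has already been reduced to a sequence of labels.

```python
from typing import Optional, Sequence, Tuple

import numpy as np


class LabeledArray:
    """Hypothetical array-like object that carries a `dims` attribute."""

    def __init__(self, data, dims: Sequence[str]):
        self.data = np.asarray(data)
        self.shape = self.data.shape
        self.dims = tuple(dims)


def resolve_dims(array, annotated_dims: Optional[Sequence[str]] = None) -> Tuple[str, ...]:
    """Return the dimension labels to use for `array` (hypothetical helper)."""
    # 1. Prefer the `dims` attribute of the array, if present.
    dims = getattr(array, "dims", None)
    if dims is None:
        # 2. Otherwise fall back to the annotation of the stencil argument.
        dims = annotated_dims
    if dims is None:
        raise ValueError("no dimension information available for this argument")
    dims = tuple(dims)
    # Sanity check: one label per array axis, if the object exposes a shape.
    shape = getattr(array, "shape", None)
    if shape is not None and len(shape) != len(dims):
        raise ValueError(f"expected {len(shape)} dimension labels, got {len(dims)}")
    return dims


plain = np.zeros((2, 3))
labeled = LabeledArray(np.zeros((2, 3)), dims=("J", "K"))
print(resolve_dims(plain, annotated_dims=("I", "J")))    # labels from the annotation
print(resolve_dims(labeled, annotated_dims=("I", "J")))  # labels from the `dims` attribute
```

A plain NumPy array has no `dims` attribute, so it takes its labels from the annotation, while the labeled object keeps its own.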
## Tasks
For this cycle, we propose to remove the existing storage and allocation facilities and implement the allocation utilities in a common location shared between the cartesian and functional versions, similar to `eve`. Since work on the bindings is expected to proceed separately, we propose to change the StencilObject of cartesian GT4Py to support the new interface, while leaving support in functional to the bindings project.
* Remove all current storage classes
* Move allocation routines to common repository with functional
* Change return type of allocation routines to directly return NumPy or CuPy ndarrays.
* Change allocation routine signature (starting point can be [gronerl/gdp-3-implementation](https://github.com/gronerl/gt4py/tree/gdp-3-implementation))
* Adapt StencilObject to work with arbitrary buffers.
* Implement `dims` lookup, adapt `default_origin` lookup in cartesian GT4Py. (starting point can be [gronerl/gdp-3-implementation](https://github.com/gronerl/gt4py/tree/gdp-3-implementation))
* Finalize [Storage.md](https://hackmd.io/kl_H-FZ5SvirNp_28qQI5Q) and add to the [gt4py concepts wiki](https://github.com/GridTools/concepts/wiki).
* Adapt existing tests and demo scripts to work without storages.
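To illustrate what "allocation routines that directly return ndarrays" with a backend-specific layout could look like, here is a minimal NumPy-only sketch. The names `empty_with_layout` and `zeros_with_layout` and the `axis_order` parameter are assumptions made for this example; the actual signatures are meant to follow GDP-3, and alignment handling is left out entirely.

```python
import numpy as np


def empty_with_layout(shape, dtype=np.float64, *, axis_order=None):
    """Allocate a plain ndarray with a chosen memory layout (hypothetical helper).

    `axis_order` lists the axes from largest to smallest stride; the default
    is C order. This convention is an assumption made for this sketch.
    """
    shape = tuple(shape)
    if axis_order is None:
        axis_order = tuple(range(len(shape)))
    if sorted(axis_order) != list(range(len(shape))):
        raise ValueError("axis_order must be a permutation of the axes")
    # Allocate a C-contiguous buffer whose axes are sorted by decreasing stride ...
    buffer = np.empty([shape[axis] for axis in axis_order], dtype=dtype)
    # ... and hand out a transposed view that has the shape the user asked for.
    inverse = tuple(int(i) for i in np.argsort(axis_order))
    return buffer.transpose(inverse)


def zeros_with_layout(shape, dtype=np.float64, *, axis_order=None):
    """Same as `empty_with_layout`, but filled with zeros."""
    array = empty_with_layout(shape, dtype, axis_order=axis_order)
    array[...] = 0
    return array


field = zeros_with_layout((2, 3, 4), axis_order=(2, 0, 1))
print(type(field).__name__, field.shape, field.strides)
```

The result is an ordinary `numpy.ndarray`, so no custom storage class is involved. A GPU variant would presumably do the same with CuPy, which is not shown here.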
## No-Goals
Performance is not a primary concern in extracting buffers. We expect to be able to rely heavily on NumPy and CuPy's `asarray` and `from_dlpack` routines. Should those prove to introduce too much overhead, a more direct interface can be introduced later.
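As a rough sketch of how buffer extraction on the CPU side might lean on these routines: the helper name `as_cpu_array` and the order in which the protocols are tried are assumptions of this example, not a decided design. A GPU counterpart would presumably use the corresponding CuPy routines instead.

```python
import numpy as np


def as_cpu_array(obj) -> np.ndarray:
    """Obtain an ndarray for `obj` without a custom storage class (hypothetical helper)."""
    if isinstance(obj, np.ndarray):
        return obj
    # Objects implementing the DLPack protocol can be imported directly.
    # `np.from_dlpack` is only available in newer NumPy versions, hence the check.
    if (
        hasattr(obj, "__dlpack__")
        and hasattr(obj, "__dlpack_device__")
        and hasattr(np, "from_dlpack")
    ):
        return np.from_dlpack(obj)
    # Everything else (`__array_interface__`, `__array__`, buffer protocol, ...)
    # is left to `np.asarray`.
    return np.asarray(obj)


class ExposesArrayInterface:
    """Toy buffer owner that only implements `__array_interface__`."""

    def __init__(self, array: np.ndarray):
        self._array = array
        self.__array_interface__ = array.__array_interface__


owner = np.arange(6.0).reshape(2, 3)
view = as_cpu_array(ExposesArrayInterface(owner))
print(np.shares_memory(view, owner))
```

In this example the extracted array refers to the same memory as the original buffer, so no copy is made.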
## Appetite
As allocation per se is already solved, this work largely consists of removing code and adapting examples. The implementation of the `dims` interface should be straightforward, as it just requires permuting the dimensions appropriately, which was already implemented in [gronerl/gdp-3-implementation](https://github.com/gronerl/gt4py/tree/gdp-3-implementation).
We therefore expect the above tasks to be completed by **1 developer in half a cycle.**
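The permutation step mentioned above might look roughly like this. The function name `permute_to` and the use of string labels are assumptions of this sketch; the referenced branch may do it differently.

```python
import numpy as np


def permute_to(array: np.ndarray, dims, target_dims) -> np.ndarray:
    """Return a view of `array` whose axes are ordered as `target_dims` (hypothetical helper)."""
    dims, target_dims = list(dims), list(target_dims)
    if sorted(dims) != sorted(target_dims):
        raise ValueError(f"cannot map dims {dims} onto {target_dims}")
    # For each target axis, find the position it has in the incoming array.
    axes = [dims.index(dim) for dim in target_dims]
    # Transposing only changes shape and strides, no data is copied.
    return np.transpose(array, axes)


data = np.zeros((2, 3, 4))  # axes labeled ("K", "I", "J")
ijk = permute_to(data, dims=("K", "I", "J"), target_dims=("I", "J", "K"))
print(ijk.shape, np.shares_memory(ijk, data))
```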
<!--
## Resources
(old) `__gt_data_interface__` description: https://github.com/GridTools/gt4py/blob/master/docs/gt4py/GDPs/gdp-0003-duck-storage.rst
`xarray` dataset_accessor https://xarray.pydata.org/en/stable/internals/extending-xarray.html
binary cuda interface (DLPack): https://github.com/dmlc/dlpack
## Sketch of first ideas:
* Keep allocation functions like empty, ones, zeros, ..., from_array
- from_array
- no non-copying wrapping
- arguments similar to GDP-3, reduce default value lookup to backend defaults.
* provide xarray dataset_accessor like GDP-3
* no-op only wrapper class for labeling dimension / keeping reference to allocated buffer, e.g.:
field_op(inp_field, out=out_field[1:-1])
field_op(inp_field, out=out_field[2:-2])
```python
class Field:
    def __init__(self, array, dims_mapping: list[Union[str, Dimension]]):
        self._array = array
        self._dims_mapping = dims_mapping

    @property
    def __array_interface__(self):  # or __cuda_array_interface__
        ...

    @property
    def __gt_data_interface__(self):
        ...

    @classmethod
    def empty(cls, *args, dims_mapping, **kwargs):
        array = empty(*args, **kwargs)
        return cls(array, dims_mapping)

    # same for ones, etc.
```
## Tasks Shaping:
- Familiarize with GDP
- Discuss Field interface with Team/EGP
- Field should be constructable from everything that supports `__gt_data_interface__`. If such an object is passed to a field operator it should be converted implicitly.
- Shall we use a dictionary or a class (with slots)?
- expand details of above sketch
- define allocation API.
- stencil call interface / type annotations / bindings
- figure out how to share storage between functional & cartesian repo
- Do we need the `default_origin` or is everyone implementing this themselves anyway?
- Describe procedure to deduce dimensions from field op given a storage (with optional dim labels)
## Notes
* DLPack support:
* Mainly need to support `__dlpack__` in Stencil interface, already covers large part of `__gt_data_interface__` (but not xarray-style labeling of axes)
* Allocation API
## Scratchpad
```python=
class DeviceIdentifier(enum.Enum):
    CPU = 0
    GPU = 1


class DataContainer(GTDataContainerConcept):
    dims: Tuple[str, ...]
    device_identifier: DeviceIdentifier
    shape: Tuple[int, ...]
    typestr: str
    data: Tuple[int, bool]
    strides: Tuple[int, ...]

    @classmethod
    def empty(cls, *args, dims, **kwargs) -> "DataContainer":
        array = empty(*args, **kwargs)
        return Field(array, dims)


class CustomDataContainer(GTDataContainerConcept):
    ...


# construct field from "datacontainer"
data_container = DataContainer.empty(
    shape=(10, 10),
    dtype=float64,
    dims=[I, J]
)
f = Field(data_container)

# construct field from allocated buffer
# note: user is responsible
data = np.zeros(...)
f = Field(data, dims=[I, J])

#
Field.empty(shape=(10, 10), dtype=float64, dims=[I, J])
``` -->