- Shaped by: Enrique, Rico
- Appetite (FTEs, weeks): 4 weeks
- Developers: <!-- Filled in at the betting table unless someone is specifically required here -->

## Problem

This is a batch project of small tasks related to supporting direct GPU execution from the Python interface. One part continues the unfinished work of the [Support GPU backends from Python](https://hackmd.io/@gridtools/B1vHhmbN2) task from cycle 15 (see that task for the full context and motivation). The other part updates the Python/C++ bindings to support [DLPack](https://dmlc.github.io/dlpack/latest/), the zero-copy memory exchange standard that covers multiple GPU frameworks and devices.

## Solution

A batch of mini-projects to close the open issues in supporting GPU backends from Python.

### 1. Add GPU compilation and bindings to gtfn C++ backend (1 week)

Most of the required changes to the gtfn C++ backend, bindings and compilation have already been addressed in two draft PRs:

- https://github.com/GridTools/gt4py/pull/1276 (by Rico)
- https://github.com/DropD/gt4py/pull/1 (by Till)

The remaining work is to clean up the hacks and shortcuts introduced in both branches, most importantly:

1. Add a switch in the bindings generator to use either the Buffer protocol or the CUDA array interface (`cuda_as_sid`), depending on the target backend (CPU or GPU)
2. Add extra information to the compilation cache so that CPU and GPU builds of the same operator get different cache IDs
3. Refactor and clean up the CMake-related parts of the OTF compilation subsystem so that they work for GPUs

### 2. Storages (2 weeks)

1. Finish refactoring the low-level buffer allocation tools so that they work with the current API for cartesian
2. New high-level API to create fields (data buffer plus domain information), compatible with the new field-view embedded execution and with the established gt4py protocol to share domain information (`__gt_dims__`, `__gt_origin__`)

### 3. Improved bindings support and HIP (2 weeks)

#### GridTools C++

Add [DLPack](https://dmlc.github.io/dlpack/latest/) support on the GridTools C++ side using [nanobind](https://github.com/wjakob/nanobind).

#### Update bindings in gt4py.next to use nanobind

#### gt4py.cartesian

Finalize and merge Stefano Ubbiali's PRs adding HIP support to both gt4py.cartesian and gridtools-c++:

- gt4py: https://github.com/GridTools/gt4py/pull/1278
- gridtools c++ (after migration to DLPack): https://github.com/GridTools/gridtools/pull/1759

#### gt4py.next

- Add support for compiling the C++ backend with the HIP toolchain instead of CUDA nvcc

## Rabbit holes

<!-- Details about the solution worth calling out to avoid problems -->

## No-gos

<!-- Anything specifically excluded from the concept: functionality or use cases we intentionally aren’t covering to fit the appetite or make the problem tractable -->

## Implementation

### 2.2. High-level API to create fields

#### Idea 1: like cartesian

See PR.

- think about what the hello world looks like
- make sure the default allocator doesn't work for a non-default backend

#### Idea 2

- `backend` has `allocator`
- `allocator` provides `empty`, `zeros`, etc.

#### Idea 3

- single allocate function that takes different values (`None` -> empty, `1` -> fill, `Callable` -> from_function, `ndarray` -> from_array)
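
To make Idea 3 easier to discuss, here is a minimal sketch of what a single allocation entry point could look like. It is CPU-only and uses NumPy as a stand-in for the real buffer allocation tools. The name `allocate`, its parameters and the dispatch order are hypothetical and not an existing gt4py API; domain information (`__gt_dims__`, `__gt_origin__`) is left out on purpose.

```python
import numbers

import numpy as np


def allocate(shape, init=None, *, dtype=np.float64):
    """Hypothetical single entry point: the type of `init` selects the strategy.

    None     -> uninitialised buffer (empty)
    number   -> buffer filled with that value (fill)
    callable -> buffer computed from index arrays (from_function)
    ndarray  -> independent copy of the given array (from_array)
    """
    shape = tuple(shape)

    if init is None:
        return np.empty(shape, dtype=dtype)

    if isinstance(init, np.ndarray):
        if init.shape != shape:
            raise ValueError(f"expected shape {shape}, got {init.shape}")
        # np.array copies by default, so the result does not alias `init`.
        return np.array(init, dtype=dtype)

    if callable(init):
        # `init` receives one index array per dimension and must be
        # vectorised over them, as required by np.fromfunction.
        return np.asarray(np.fromfunction(init, shape), dtype=dtype)

    if isinstance(init, numbers.Number):
        return np.full(shape, init, dtype=dtype)

    raise TypeError(f"unsupported initializer type: {type(init).__name__}")


a = allocate((2, 3))                      # uninitialised contents
b = allocate((2, 3), 1)                   # every element is 1
c = allocate((2, 3), lambda i, j: i + j)  # c[i, j] == i + j
d = allocate((2, 3), c)                   # copy of c
```

One trade-off worth noting: dispatching on the type of `init` keeps the API surface small, but the behaviour is less explicit than separate `empty`/`zeros`/`from_array` functions as in Idea 2. Under Idea 2, the same dispatch could presumably live on the backend's allocator so that a GPU backend allocates device memory instead.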