# [ICON4Py] GT4Py-program interface for compile-time domains and custom backends

<!-- Add the tag for the current cycle number in the top bar -->

- Shaped by:
- Appetite (FTEs, weeks):
- Developers: <!-- Filled in at the betting table unless someone is specifically required here -->

## Problem

In ICON4Py granules, the GT4Py programs are precompiled in init, which fixes the backend and the static arguments (currently the vertical domain and configuration flags).

We would like to make this interface more flexible:
- to allow horizontal domain sizes to be set at compile time,
- to allow setting custom backend options per program.

While these cases are mostly orthogonal, they touch the same parts of the ICON4Py granule initialization and execution code.

## Appetite

<!-- Explain how much time we want to spend and how that constrains the solution -->

## Solution

The current interface looks like

```
self.apply_diffusion_to_vn = apply_diffusion_to_vn.with_backend(self._backend).compile(
    enable_jit=False,
    nudgezone_diff=[self.nudgezone_diff],
    fac_bdydiff_v=[self.fac_bdydiff_v],
    limited_area=[self._grid.limited_area],
    vertical_start=[0],
    vertical_end=[self._grid.num_levels],
    offset_provider=self._grid.connectivities,
)
```

for initialization and

```
self.apply_diffusion_to_vn(
    u_vert=self.u_vert,
    v_vert=self.v_vert,
    primal_normal_vert_v1=self._edge_params.primal_normal_vert[0],
    primal_normal_vert_v2=self._edge_params.primal_normal_vert[1],
    z_nabla2_e=self.z_nabla2_e,
    inv_vert_vert_length=self._edge_params.inverse_vertex_vertex_lengths,
    inv_primal_edge_length=self._edge_params.inverse_primal_edge_lengths,
    area_edge=self._edge_params.edge_areas,
    kh_smag_e=self.kh_smag_e,
    diff_multfac_vn=diff_multfac_vn,
    nudgecoeff_e=self._interpolation_state.nudgecoeff_e,
    vn=prognostic_state.vn,
    nudgezone_diff=self.nudgezone_diff,
    fac_bdydiff_v=self.fac_bdydiff_v,
    start_2nd_nudge_line_idx_e=self._edge_start_nudging_level_2,
    limited_area=self._grid.limited_area,
    horizontal_start=self._edge_start_lateral_boundary_level_5,
    horizontal_end=self._edge_end_local,
    vertical_start=0,
    vertical_end=self._grid.num_levels,
    offset_provider=self._grid.connectivities,
)
```

for calling the program.

The task is to improve this interface to enable the use cases mentioned in [Problem](#Problem).

### Interface Example - Done

The following is a sketch of an idea that should be refined and discussed with ICON4Py developers.

```
horizontal_sizes_for_apply_diffusion_to_vn = dict(
    start_2nd_nudge_line_idx_e=self._edge_start_nudging_level_2,
    horizontal_start=self._edge_start_lateral_boundary_level_5,
    horizontal_end=self._edge_end_local,
)
self.apply_diffusion_to_vn = compile(
    program=apply_diffusion_to_vn,
    horizontal_sizes=horizontal_sizes_for_apply_diffusion_to_vn,
    vertical_sizes={...},
    bound_args=dict(some_metric_field=some_metric_field),
    bound_static_args=dict(some_config_flag=some_config_flag),
    static_args=dict(some_runtime_switch=[some_runtime_switch1, some_runtime_switch2]),
    backend_tag=backend_tag,  # one of CPU, GPU, GTFN_CPU, GTFN_GPU, DACE_CPU, DACE_GPU
)

self.apply_diffusion_to_vn(some_runtime_switch=True, some_prognostic_field=some_prognostic_field)
```

where `compile` is an ICON4Py function that wraps the program and
- sets the backend (variant)
- passes `vertical_sizes` (and, if enabled, `horizontal_sizes`) to the GT4Py `program.compile`
- passes `bound_static_args` and `static_args` to `program.compile`
- binds `bound_static_args` and `bound_args` (`functools.partial`)
- the result is a callable that only has non-bound parameters
- (`backend_tag` allows looking up a configuration/dict that determines which backend variant to use)

### Backend selection - Done

From the GT4Py perspective, configuring a backend with an option (e.g. `inline_everything=True`) yields a new backend.

There are two obvious interface choices for making the ICON4Py backend configurable per program:
- take a GT4Py backend factory (which exists in GT4Py to construct the backend, builder pattern) and instantiate it to make it concrete
- use a backend tag and construct the backend from the tag

The second has the advantage that we could introduce `CPU` and `GPU` tags, where the granule itself (not the user) chooses the default, e.g. the user says `GPU` and we decide on GTFN for program A, DaCe for program B, etc.

With both design variants we could look up the configuration for the concrete backend from an object that provides defaults but can be customized along different dimensions, e.g.
- name of the program
- GPU/CPU architecture

## Rabbit holes

<!-- Details about the solution worth calling out to avoid problems -->

## No-gos

<!-- Anything specifically excluded from the concept: functionality or use cases we intentionally aren’t covering to fit the appetite or make the problem tractable -->

## Progress

<!-- Don't fill during shaping. This area is for collecting TODOs during building. As first task during building add a preliminary list of coarse-grained tasks for the project and refine them with finer-grained items when it makes sense as you work on them. -->

- [x] Task 1 ([PR#xxxx](https://github.com/GridTools/gt4py/pulls))
  - [x] Subtask A
  - [x] Subtask X
- [ ] Task 2
  - [x] Subtask H
  - [ ] Subtask J
- [ ] Discovered Task 3
  - [ ] Subtask L
  - [ ] Subtask S
- [ ] Task 4

## Scratch

```
class Granule:
    def __init__(..., device, backend_kind=default)

def granule_init(backend):
    ...
    backend_maker = maker_from_backend(backend)
    ...
    ... backend=make_concrete_backend(foo, backend_maker, gpu/cpu, lookup_device_arch()), ...)
    ... backend=make_concrete_backend(bar, gtfn_maker, gpu/cpu, lookup_device_arch())

@program
def foo:
    ...

Metadata:
---------
- <program_name>
  - <device arch>
    -

- foo
  - dace
  - A100
    - dace
      - blocksize: x
    - gtfn
      - fuse_all: True
  - GH200
    - gtfn
      - some_flag: False
      - blocksize: y
- bar
  - A100
    -

def general_maker(device: GPU | CPU, backend_kind = backend_kind.DACE):
    if backend_kind == backend_kind.DACE:
        return functools.partial(dace_maker, device=device)
    elif backend_kind == backend_kind.GTFN:
        return gtfn_maker

default = general_maker(backend_kind)
default()
from_database_config = {}  # get_from_json(<program>, <backend_kind>, <device>, <arch>)
concrete_backend = default(**from_database_config)
another_concrete_backend = general_maker()

def more_concrete_maker(**kwargs):
    kwargs.pop("block_size")
    return functools.partial(gtfn_maker, device=gpu, block_size=(32, 8))(**kwargs)
...
```
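
The scratch idea (a maker per backend kind, plus options looked up by program name and device architecture) can be turned into a small runnable toy for discussion. Everything here is an assumption made for illustration: `gtfn_maker` and `dace_maker` are plain functions returning dicts rather than real GT4Py backend factories, the `program -> arch -> backend kind` nesting of `METADATA` is one possible reading of the tree above, and all option values are made up.

```python
import functools
from typing import Any, Callable, Optional


# Hypothetical stand-ins for backend factories; real GT4Py factories are not used.
def gtfn_maker(device: str, **options: Any) -> dict[str, Any]:
    return {"kind": "gtfn", "device": device, **options}


def dace_maker(device: str, **options: Any) -> dict[str, Any]:
    return {"kind": "dace", "device": device, **options}


MAKERS: dict[str, Callable[..., dict[str, Any]]] = {
    "gtfn": gtfn_maker,
    "dace": dace_maker,
}

# Granule-chosen default backend kind per program (assumed values).
DEFAULT_KIND = {"foo": "dace", "bar": "gtfn"}

# Options keyed by program -> arch -> backend kind (made-up values).
METADATA: dict[str, dict[str, dict[str, dict[str, Any]]]] = {
    "foo": {
        "A100": {"dace": {"block_size": (32, 8)}, "gtfn": {"fuse_all": True}},
        "GH200": {"gtfn": {"some_flag": False, "block_size": (64, 4)}},
    },
}


def general_maker(device: str, backend_kind: str) -> Callable[..., dict[str, Any]]:
    """Bind the device; options stay open until the concrete backend is made."""
    return functools.partial(MAKERS[backend_kind], device=device)


def make_concrete_backend(
    program_name: str,
    device: str,
    arch: str,
    backend_kind: Optional[str] = None,
) -> dict[str, Any]:
    """Pick the backend kind (explicit or granule default) and apply stored options."""
    kind = backend_kind or DEFAULT_KIND.get(program_name, "gtfn")
    # Missing entries fall back to an empty option set, i.e. the maker's defaults.
    options = METADATA.get(program_name, {}).get(arch, {}).get(kind, {})
    return general_maker(device, kind)(**options)


foo_backend = make_concrete_backend("foo", "gpu", "A100")
bar_backend = make_concrete_backend("bar", "gpu", "A100")
foo_gtfn = make_concrete_backend("foo", "gpu", "GH200", backend_kind="gtfn")
print(foo_backend)
print(bar_backend)
print(foo_gtfn)
```

With this layout the user only supplies the coarse device choice, while per-program and per-architecture tuning stays in one table that could later be loaded from a file.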