# 2024-04-11 GT4Py to DaCe coordination meeting

~~The target is to get the gt4py icon dycore running and validating within the next 2 cycles (6 weeks) with DaCe.~~

For getting the icon4py dycore running and validating there are currently 3 options:

1. GT4Py -> ITIR -> SDFG
2. GT4Py -> Combined IR -> SDFG
3. GT4Py -> ??? -> SDFG

Note: the current workflow is GT4Py -> ITIR -> GTfn, with performance issues.

## Points to discuss:

- Currently from the GT4Py team, option 2 is considered the best.
    - Do we all agree that option 2 is the best option?
    - Why is option 1 not possible? Is it because ITIR before inlining is not representable in SDFG?
        - It is not possible to optimize fully inlined stencils in a reasonable time.
        - It is not possible to do fission easily.
        - This was already discussed in November.
    - If option 2: is there any work or coordination needed between GT4Py developers and DaCe developers? Is the current coordination working, or do we need to change anything?
    - Would the CombinedIR provide all the information needed for future optimizations, or will another IR refactoring be needed in a few months?
        - There will still be some modifications in the future.
        - How do we coordinate them with the transformations in SDFG?
    - Is there some work that can be started now to prepare the CombinedIR -> SDFG lowering?
    - Option 3 could include JAX? GT4Py -> JAX -> SDFG, or direct GT4Py -> SDFG.
        - Option 3 could not work because you cannot represent domain information, and it would limit the DSL frontend syntax.
    - Where are the temporaries set: gt4py

## Desired outcome

* If possible, a realistic timeline indicating projects for the current and the subsequent cycles

## Related documents

- current DaCe backend: https://github.com/GridTools/gt4py/blob/main/docs/development/ADRs/0014-DaCe_backend.md
- original design of iterator view: https://github.com/GridTools/concepts/blob/master/Iterator-View.md
- [DaCe workshop 2023-11-08](https://hackmd.io/jMdsVJmxRi-h6suXit8eEg)
- combined IR: https://hackmd.io/fCXnShnFR96kFau7lw36ew

## Hannes sketching a timeline on CombinedIR

(AD) *This is helpful. Would be good to extend this timeline to a performance goal.*

1. Cycle 21: new ITIR (-> CombinedIR) typesystem, for the reason see https://hackmd.io/lrjA8ZavQiCMvDD_8hIxwg
2. Cycle 21: (partial) CombinedIR lowering to SDFG (tests starting from CombinedIR to SDFG, all maps annotated with domain), no execution
3. Cycle 21: prepare ITIR (see https://hackmd.io/h_lELsaZSEmtJ6zQvsakaA), specifically the Fencil->Program aspect
4. Cycle 21: upgrade domain(shape)-inference pass to annotate maps with domains
5. Cycle 22?: Lowering from frontend (PAST/FOAST) to CombinedIR (field view extreme of CombinedIR)
6. Cycle 22/23?: Program -> SDFG
7. Cycle 22/23?: Temporary placement in CombinedIR
8. Cycle 23/24?: Tests on ICON4Py, optimizations

```graphviz
digraph {
    rankdir="RL"
    1 [label="typesystem"]
    2 [label="CIR2SDFG lowering from tests"]
    3 [label="Prepare ITIR for CIR"]
    4 [label="map domain inference"]
    5 [label="frontend2CIR"]
    6 [label="Program2SDFG"]
    7 [label="Temporary placement CIR", style="dashed"]
    8 [label="ICON4Py testing"]
    5 -> 1
    5 -> 3
    6 -> 5
    6 -> 4
    6 -> 2
    7 -> 5
    7 -> 4
    8 -> 7 [style=dashed]
    8 -> 6
}
```

## Next steps

- Start today on unit tests for combined IR to SDFG.
    - This allows us to start writing the translator using handwritten combined IR.
    - How to translate special cases: scan, dynamic offset? -> Start with whatever, this is not relevant for performance.
        - not clear
- Wait until we can generate SDFGs automatically before starting to work on optimization.
- Is it possible to bring domain information / hints to DaCe? Not in the short term: this will be decided once we have the dycore running.
- Todos for DaCe should be discussed at the DaCe / gt4py weekly meeting.
- What is the path to make a product on the DaCe side? Still to be discussed later. One step is to add tests.

## Scratchpad

### Performance risk

- Horizontal boundary conditions

```python
# Pseudocode from the meeting, not runnable as written.
b_vertex_tmp = a + b
i_vertex_tmp = a * b
start, stop = compute_inverse_image(E2V[0], starting_domain)
boundary_only = b_vertex_tmp[start:stop](E2V[0])
interior = i_vertex_tmp(E2V[0])
concat_where(vertex_idx == lateral_boundary, boundary_only, interior)
```
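
The boundary-condition pseudocode above can be mimicked in plain NumPy to make the data movement concrete. This is only a sketch under stated assumptions, not GT4Py or DaCe code: `vertex_range_for_edges` is a hypothetical stand-in for `compute_inverse_image`, `e2v0` stands for the first-neighbor column `E2V[0]`, and the lateral-boundary edges are assumed to be stored first, so that `concat_where` reduces to a concatenation.

```python
import numpy as np


def vertex_range_for_edges(e2v0, edge_start, edge_stop):
    """Smallest contiguous vertex range [start, stop) read by the given edges.

    Hypothetical stand-in for `compute_inverse_image` in the pseudocode.
    """
    if edge_stop <= edge_start:
        return 0, 0
    touched = e2v0[edge_start:edge_stop]
    return int(touched.min()), int(touched.max()) + 1


def edge_field_with_boundary(a, b, e2v0, n_boundary_edges):
    """Edge field using a + b on boundary edges and a * b on interior edges."""
    # Boundary temporary: computed only on the vertices that boundary edges read.
    start, stop = vertex_range_for_edges(e2v0, 0, n_boundary_edges)
    b_vertex_tmp = a[start:stop] + b[start:stop]
    # Interior temporary: computed on all vertices here, for simplicity.
    i_vertex_tmp = a * b
    # Shift vertex -> edge through the first E2V neighbor;
    # boundary indices are re-based by `start` to match the sliced temporary.
    boundary_only = b_vertex_tmp[e2v0[:n_boundary_edges] - start]
    interior = i_vertex_tmp[e2v0[n_boundary_edges:]]
    # Assumption: boundary edges come first, so concat_where is a concatenation.
    return np.concatenate([boundary_only, interior])


# Toy mesh: 5 vertices, 6 edges, the first 2 edges are lateral-boundary edges.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
e2v0 = np.array([0, 1, 1, 2, 3, 4])
n_boundary_edges = 2

result = edge_field_with_boundary(a, b, e2v0, n_boundary_edges)

# Reference: compute both temporaries everywhere, then select per edge.
edge_idx = np.arange(e2v0.size)
reference = np.where(edge_idx < n_boundary_edges, (a + b)[e2v0], (a * b)[e2v0])
```

In this sketch the boundary temporary is materialized on a small vertex slice, while the reference computes both temporaries on the full vertex set and selects afterwards. The extra work in the reference is the kind of cost the performance risk refers to.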