# GTC DaCe Backend refactor

###### tags: `cycle 7` `dace`

## Motivation

The current implementation of the gtc:dace backend was written with the requirement that there must be a path back from an OIR-level SDFG to OIR. This has led to intermixing of SDFGs with eve nodes, which has made a visitor-based implementation of the conversion to SDFG and of the expansion of library nodes impossible. Further, the partial SDFG representation does not store much of the domain-specific information, such as the correspondence between grid points and indices. This leads to cluttered code in which that information has to be inferred from shapes and subsets. The current implementation is also hard-wired to the execution model ("parallel model") that OIR imposes, namely that the k loop is outermost. Further performance improvements will require adding more options to the current expansions. Overall, the resulting code has proven hard to extend and difficult to understand.

Since we recently dropped the requirement for loopbacks into OIR, we can implement the backend with a cleaner separation of eve-based structures and SDFG representation. In the wake of this refactor, a number of improvements should be made:

* Improved parametrizability of the current expansion, i.e. for individual library nodes, set different loop orders, layouts of temporaries, tile sizes, more schemes like in stencil_benchmarks, etc.
* The possibility of a delayed library node expansion, i.e. when used in DaCe orchestration, retaining library nodes and their domain-specific information in the whole-program representation and parametrizing the expansion according to suitability in that context.

## Proposed design

To achieve this,

* There will be only one layer of SDFG, with `StencilComputation` library nodes. Each of them contains computations equivalent to those of a `VerticalLoop` OIR library node, yet without the restriction on loop order in the expansion. This is the only kind of library node. There will be no individual SDFGs for `VerticalLoopSection`s and no more `HorizontalExecutionLibraryNode`s altogether.
* Creation of a new DaCeIR during expansion with concepts equivalent to the elements of SDFGs (maps, nested SDFGs, state machines), yet retaining map ranges and subsets in a domain-specific manner (origin/halo + domain). In this way, the final SDFG can be built with a visitor that will be approachable for GTC devs.
* Build the graph of library nodes by building simple SDFGStates and fusing them afterwards. This avoids the complicated dependency analysis currently in place and the non-data edges that currently often block DaCe transformations after expansion.

## Steps / Rough Schedule

First steps in this effort have already been made, with a lot of progress in a short time. We sketch a possible timeline (single dev being Linus):

* Week 1&2: Feature-equivalence to other gtc:* backends with naive expansion (k outermost)
* Week 3&4: Different expansion schemes (Loop Order, Caches)
* Week 5: Adding Regions and expansion schemes for Regions (predicate vs. map range)
* Week 6: Debugging on FV3 DyCore
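
To illustrate two ideas from the proposed design, the following minimal Python sketch keeps ranges in a domain-specific form (halo plus domain, resolved against an origin only at the very end) and makes the loop order a parameter of the expansion. It is only a sketch under stated assumptions: `AxisInterval`, `ExpansionSpec` and `loop_nest` are hypothetical names invented for this example. They are not part of gt4py, GTC or DaCe, and the actual DaCeIR may look quite different.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AxisInterval:
    """Hypothetical range along one axis, stored relative to the compute domain.

    It covers the whole domain plus `halo_lo` extra points before it and
    `halo_hi` extra points after it. No array shape or index appears here.
    """

    axis: str
    halo_lo: int = 0
    halo_hi: int = 0

    def to_index_range(self, domain_size: int, origin: int) -> Tuple[int, int]:
        """Resolve to a half-open array index range for a given origin."""
        return origin - self.halo_lo, origin + domain_size + self.halo_hi


@dataclass(frozen=True)
class ExpansionSpec:
    """Hypothetical per-node expansion parameters (only the loop order here)."""

    loop_order: Tuple[str, ...] = ("K", "J", "I")  # outermost axis first


def loop_nest(
    intervals: List[AxisInterval],
    spec: ExpansionSpec,
    domain: Dict[str, int],
    origin: Dict[str, int],
) -> List[Tuple[str, int, int]]:
    """Return (axis, start, stop) triples, ordered from outermost to innermost."""
    by_axis = {interval.axis: interval for interval in intervals}
    return [
        (axis, *by_axis[axis].to_index_range(domain[axis], origin[axis]))
        for axis in spec.loop_order
    ]


if __name__ == "__main__":
    intervals = [AxisInterval("I", 1, 1), AxisInterval("J"), AxisInterval("K")]
    domain = {"I": 4, "J": 5, "K": 3}
    origin = {"I": 3, "J": 3, "K": 0}
    # Naive expansion with k outermost, then the same node with i outermost.
    print(loop_nest(intervals, ExpansionSpec(), domain, origin))
    print(loop_nest(intervals, ExpansionSpec(("I", "J", "K")), domain, origin))
```

The same domain-specific description yields both loop nests, so switching the loop order does not require inferring anything back from shapes or subsets.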