Issues with Domain Definition in The Iterator IR

# Issues with Domain Definition in The Iterator IR ###### tags: `IR` `Felix’s remains` ## Issues ### 1. Unknown Sizes Of Location Types Let’s look at a simple copy stencil application followed by a neighbor reduction: ``` unknown_edge_size(n_vertices, out, inp) { copy = λ(inp) → ·inp; stencil = λ(inp) → reduce(λ(x, y) → x + y, 0.0)(⟪V2Eₒ⟫((↑copy)(inp))); out ← (stencil)(inp) @ ⟨ Vertex: [0, n_vertices) ⟩; } ``` The temporary popup pass transforms this into the following: ``` # fencil with temporaries: unknown_edge_size(n_vertices, out, inp) { _gtmp_0 = temporary(domain=UNKNOWN_DOMAIN, dtype=float); # modified fencil: unknown_edge_size(n_vertices, out, inp, _gtmp_0) { _gtmp_0 ← (deref)(inp) @ UNKNOWN_DOMAIN; out ← (λ(_lift_1) → reduce(plus, 0.0)(⟪V2Eₒ⟫(_lift_1)))(_gtmp_0) @ ⟨ Vertex: [0, n_vertices) ⟩; }; # call to modified fencil unknown_edge_size(n_vertices, out, inp, _gtmp_0); } ``` It is clear from the shift `⟪V2Eₒ⟫` that the temporary field `_gtmp_0` lives on edges. But there is no way to get the required domain size (that is, number of edges), as the user only passed the number of vertices. ### 2. Proper Extent Analysis Let’s look at the 1D Poisson $\nabla^2 u=g$, discretized with standard 2nd-order finite differences: $\frac{1}{\Delta x}(u_{i + 1} + u_{i - 1} - 2 u_i)=g_i$. The $k$-th Jacobi iteration then looks like $u^{k+1}_i=\frac{1}{2\Delta x}(g_i - u^k_{i+1} - u^k_{i-1})$. In iterator IR, for simplicity assuming $\Delta x=\frac{1}{2}$, this gives: ``` jacobi_step(n, u_kp1, u_k, g) { step = λ(u, g) → ·g - ·⟪Iₒ, 1ₒ⟫(u) - ·⟪Iₒ, -1ₒ⟫(u); u_kp1 ← (step)(u_k, g) @ ⟨ IDim: [0, n) ⟩ } ``` If we now want to incorporate Dirichlet boundary conditions, we could get: ``` jacobi_step(n, u_kp1, u_k, g, i, u_left, u_right) { step = λ(u, g) → ·g - ·⟪Iₒ, 1ₒ⟫(u) - ·⟪Iₒ, -1ₒ⟫(u); step_with_bc = λ(u, g, i, u_l, u_r) → if ·i == 0 then ·u_l else if ·i == n - 1 then ·u_r else step(u, g); u_kp1 ← (step_with_bc)(u_k, g, i, u_left, u_right) @ ⟨ IDim: [0, n) ⟩; } ``` This works flawlessly, but problems arise as soon as we want to fuse multiple steps. Consider the following: ``` jacobi_steps(n, u_kp3, u_k, g, i, u_left, u_right) { # … (same as above) multiple_steps = λ(u, g, i, u_l, u_r) → step_with_bc((↑step_with_bc)((↑step_with_bc)(u, g, i, u_l, u_r), g, i, u_l, u_r), g, i, u_l, u_r); u_kp3 ← (multiple_steps)(u_k, g, i, u_left, u_right) @ ⟨ IDim: [0, n) ⟩; } ``` By the global temporaries pass, this could (conceptually) be transformed back to: ``` jacobi_steps(n, u_kp3, u_k, g, i, u_left, u_right) { _gtmp_0 = temporary(domain=UNKNOWN_DOMAIN, dtype=float); _gtmp_1 = temporary(domain=UNKNOWN_DOMAIN, dtype=float); jacobi_steps(n, u_kp3, u_k, g, i, u_left, u_right, _gtmp_0, _gtmp_1) { _gtmp_0 ← step_with_bc(u, g, i, u_l, u_r) @ UNKNOWN_DOMAIN; _gtmp_1 ← step_with_bc(_gtmp_0, g, i, u_l, u_r) @ UNKNOWN_DOMAIN; u_kp3 ← step_with_bc(_gtmp_1, g, i, u_l, u_r) @ ⟨ IDim: [0, n) ⟩; }; } ``` But now, because there is no known relation of the positional field `i` and the domain size `n`, there is no way to derive the correct domains (unaltered `⟨ IDim: [0, n) ⟩`) for `_gtmp_0` and `_gtmp_1`. Even worse, naive domain extension (à la GridTools) may lead to out-of-bounds accesses. ## Possible Solution Attempts ### Additional Builtins For Domain Extension ```haskell= domain_extend :: Domain, Offsets… → Domain domain_union :: Domains… → Domain ``` Then, for the first example above, we would get the following: ``` # fencil with temporaries: unknown_edge_size(n_vertices, out, inp) { _gtmp_0 = temporary( domain=domain_extend(⟨ Vertex: [0, n_vertices) ⟩, V2Eₒ), dtype=float ); # modified fencil: unknown_edge_size(n_vertices, out, inp, _gtmp_0) { _gtmp_0 ← (deref)(inp) @ domain_extend(⟨ Vertex: [0, n_vertices) ⟩, V2Eₒ); out ← (λ(_lift_1) → reduce(plus, 0.0)(⟪V2Eₒ⟫(_lift_1)))(_gtmp_0) @ ⟨ Vertex: [0, n_vertices) ⟩; }; # call to modified fencil unknown_edge_size(n_vertices, out, inp, _gtmp_0); } ``` Note however that there is a strange ‘feature’ of the iterator IR: if we define a compute domain — like in this example `⟨ Vertex: [0, n_vertices) ⟩` — this compute domain knows much more than you might think. That is, it has a ‘global’ knowledge of connectivities and knows that `domain_extend(⟨ Vertex: [0, n_vertices) ⟩, V2Eₒ)` gives a new compute domain on a different location type. This additionally means that there is only one unique location type ‘Vertex’ allowed per fencil belonging to some global domain structure. Additionally, the `domain` and `named_range` builtins could be removed if we enforce out-of-IR domain definitions. So, the example would look like this: ``` # fencil with temporaries: unknown_edge_size(vertex_dom, out, inp) { _gtmp_0 = temporary( domain=domain_extend(vertex_dom, V2Eₒ), dtype=float ); # modified fencil: unknown_edge_size(vertex_dom, out, inp, _gtmp_0) { _gtmp_0 ← (deref)(inp) @ domain_extend(vertex_dom, V2Eₒ); out ← (λ(_lift_1) → reduce(plus, 0.0)(⟪V2Eₒ⟫(_lift_1)))(_gtmp_0) @ vertex_dom; }; # call to modified fencil unknown_edge_size(vertex_dom, out, inp, _gtmp_0); } ``` The disadvantage here is, that the domains are also abstract to the optimization passes. For example, if the user passes multiple domains with the same location type or even the same size, there is no way to detect that within the IR and perform respective optimizations (e.g. loop fusion). Another attempt could be to have two additional abstract types: a _compute_ domain (as already available) and some complete domain description that can not be built from within the IR. If we provide the following interface: ```haskell= global_domain_size :: GlobalDomain, Dimension -> Int compute_domain_on :: GlobalDomain, (Dimension, Start, Stop)… → ComputeDomain compute_domain_extend :: ComputeDomain, Offsets… → ComputeDomain compute_domain_union :: ComputeDomains… → ComputeDomain ``` But: what’s the global domain in the cartesian case? … ### Additional Builtins for Location Querying Probably we need some builtins to check an iterator position relative to the compute domain. Is `in_domain` enough? ``` jacobi_step(n, u_kp1, u_k, g, u_left, u_right) { step = λ(u, g) → ·g - ·⟪Iₒ, 1ₒ⟫(u) - ·⟪Iₒ, -1ₒ⟫(u); step_with_bc = λ(u, g, u_l, u_r) → if ¬in_domain(⟪Iₒ, -1ₒ⟫(u)) then ·u_l else if ¬in_domain(⟪Iₒ, 1ₒ⟫(u)) then ·u_r else step(u, g); u_kp1 ← (step_with_bc)(u_k, g, u_left, u_right) @ ⟨ IDim: [0, n) ⟩; } ``` But also here, we got some problems: - `in_domain` for unstructured: is it always `True`? Or is it the same as `can_deref`? - Do now all iterators have to track their indices? Or do we have special iterators for `in_domain` checks? Or do we resolve all `in_domain` checks to some other if conditions? - Dependency analysis is still super complex…