Iterator IR, Functors, and Comonads

--- tags: 'model,study' --- # Iterator IR, Functors, and Comonads [TOC] ## Introduction Iterator IR was mostly designed based on experience with GridTools and its SID concept: an abstraction for multidimensional strided pointers and pointer arithmetic. As large parts of Iterator IR are purely functional, it also makes sense to look at existing theories and approaches from functional languages. This document tries to be a small introduction to concepts used in type theory that are readily available in Haskell. ## Functors A type `f` is a Functor if it provides a function `fmap` which, given any types `a` and `b`, lets you apply any function from `(a → b)` to turn an `f a` into an `f b`. In Haskell, this looks like: ```haskell class Functor f where fmap :: (a → b) → f a → f b ``` Further (but not enforcable using Haskell type classes), the following rules must hold: ```haskell fmap id == id fmap (f . g) == fmap f . fmap g ``` There are many examples of functors, a simple example are lists. An appropriate definition for lists is trivial: ```haskell instance Functor [a] where fmap = map ``` ## Comonads A comonad type `c` is an extension to a functor, requiring the following three additional functions: ```haskell class Functor c ⇒ Comonad c where extract :: c a → a duplicate :: c a → c (c a) extend :: (c a → b) → c a → c b ``` Further, the following rules have to hold: ```haskell extend f c == (fmap f . duplicate) c duplicate c == extend id c fmap f c == extend (f . extract) c ``` A comonad can be seen as some read-only data container. A simple example is one, that always returns a constant value (the instance definition could be shortened, but expressiveness was chosen over briefness): ```haskell data Constant a = Constant a instance Functor Constant where fmap f (Constant x) = Constant (f x) instance Comonad Constant where extract (Constant x) = x duplicate (Constant x) = Constant (Constant x) extend f x = Constant (f x) ``` ## The Store Comonad The store comonad (also called costate comonand as dual of the state monad) is a comonad, that resembles a container with an index type `i` and a value type `v`: ```haskell data Store i v = Store (i → v) i instance Comonad (Store i) where extract (Store f i) = f i duplicate (Store f i) = Store (Store f) i extend g (Store f i) = Store (g . Store f) i ``` The store comonad is quite similar to a C-style pointer: the function `extract` allows element access, that is, dereferencing (`*ptr` in C). The function `duplicate` is the contrary, it adds another level of indirection, like referencing in C (`&ptr`). The store comonad is useful for modeling partial data access of a larger structure (e.g., like a member of a class in an object-oriented language or a single element in a container). For example, the following store models a pointer to the element `42` of the list `[0, 42, 37]`: ```haskell listElementPtr = Store ([0, 42, 37]!!) 1 ``` `extract listElementPtr` thus gives `42`. Now, these are the basic concepts on which we build on, but certain important aspects like shifting are missing to construct our IR’s iterator types. Luckily, there are some slightly more complex extensions, which allow us to implement exactly these missing parts. These are introduced in the following section. ## Indexed Functors, Comonads, and Store Comonads A (type-)indexed functor is almost the same as a ‘normal’ functor, just that the type `f` is parametrized by one or more types. In our case, we require a 2-argument indexed functor, which can be defined in Haskell as follows, where `x` and `y` are the additional type indices: ```haskell class IndexedFunctor f where ifmap :: (a → b) → f x y a → f x y b ``` There are no additional requirements on the types `x` and `y`. Similar to the functor, instead of the basic comonad, we also require a 2-argument indexed comonad. Its definition is: ```haskell class IndexedFunctor c ⇒ IndexedComonad c where iextract :: c x x a → a iduplicate :: c x z a → c x y (c y z a) iextend :: (c y z a → b) → c x z a → c x y b ``` Note that the type index requirements are now a bit more complex than before on the indexed functor. This definition is based on Edward Kmett’s ‘lens’ library (https://github.com/ekmett/lens) which uses dualized indexed comonads of the indexed monads introduced in “Parameterized Notions of Computation” by Bob Atkey. But what’s important: this thing looks exactly like what we need for iterators! Let’s define our iterator very similar to the store comonad, just with an additional type parameter: ```haskell data Iterator i j v = It (j → v) i instance IndexedFunctor Iterator where ifmap g (It f i) = It (g . f) i instance IndexedComonad Iterator where iextract (It f i) = f i iduplicate (It f i) = It (It f) i iextend g (It f i) = It (g . It f) i ``` > Note: the infix operator `.` in Haskell stands for function composition. The iterator has the following type parameters: - `i`: the type of the index of the location to which the iterator is currently pointing. - `j`: the type of the index of the location on which the iterator is dereferencable. - `v`: the value type of the iterator. Now, looking at the definition of `iextract` above, it’s clear that `iextract` is exactly our `deref`: it’s only applicable when the iterator points to the same location type as it’s defined on and there it returns the stored value. And did you realize what `iextend` is? It’s our `lift`! It takes a function that has an iterator as argument and returns a value (that is, a `stencil` in our jargon) and return a function that takes an iterator and returns an iterator. What’s still missing are shifts. Here, we have to introduce another comonad, the indexed store comonad: ```haskell class IndexedComonad s ⇒ IndexedComonadStore s where ipos :: s x y a → x ipeek :: y → s x y a → a ipeeks :: (x → y) → s x y a → a iseek :: y → s x z a → s y z a iseeks :: (x → y) → s x z a → s y z a iexperiment :: Functor f => (x → f y) → s x y a → f a ``` This gives us all required functions (and even some more). A valid implementation for our iterators looks like the following: ```haskell instance IndexedComonadStore Iterator where ipos (It _ a) = a ipeek b (It g _) = g b ipeeks f (It g a) = g (f a) iseek a (It g _) = It g a iseeks f (It g a) = It g (f a) iexperiment f (It g a) = fmap g (f a) ``` So `ipos` returns the current index of the iterator; such a builtin is actually _not_ available in the current Iterator IR. `ipeek` dereferences the iterator at a given absolute index; such a builtin is neither available in iterator IR. `ipeeks` dereferences the iterator after a transformation of the index based on the current index; depending on the passed transformation, this can be interpreted as a composition of `shift` and `deref`. `iseek` returns an iterator shifted to a new absolute index; no correspondence in current iterator IR. `iseeks` returns a shifted iterator, relative to the current index; this is exactly our `shift`. We will find out in a moment, what our connectivities must be to ensure that) `iexperiment` does something fancy with a functor and we will see that this actually matches our derefed shift to neighbors! So a short summary: our iterators are indexed store comonads and Iterator IR builtins map _exactly_ to the methods defined in the respective type classes. The mapping is as follows: ```haskell deref = iextract -- from IndexedComonad lift = iextend -- from IndexedComonad shift = iseeks -- from IndexedComonadStore derefedNbShift = iexperiment -- from IndexedComonadStore ``` ## Offsets and Connectivities A remaining question is, what are shift offsets and connectivities? Let’s look at `shift` aka. `iseeks`. The type signature is: ```haskell shift :: (x → y) → Iterator x z v → Iterator y z v ``` So a shift offset is just a function mapping from a location to another location! But let’s look at `derefedNbShift` aka. `iexperiment`: ```haskell derefedNbShift :: Functor f ⇒ (x → f y) → Iterator x y v → f v ``` This is actually quite vague as the functor `f` is a free type variable. We can easily fix it to something suitable like a list! So, we get the following: ```haskell derefedNbShift :: (x → [y]) → Iterator x y v → [v] ``` The interesting point here is, that `shift` and `derefedNbShift` take a _different_ kind/type of offset argument. The difference is that `derefedNbShift` needs to know the number of neighbors. This also shows the difference of the general shifts which work both in Cartesian and unstructured cases and the shifts required for neighborhood reductions, only available in the unstructured case. So, if we define a connectivity as a function mapping from some location index to a list of neighbor indices (like `x → [y]`), it’s trivial to convert this connectivity to a basic offset function which selects a specific neighbor: ```haskell nb :: (x → [y]) → Int → x → y nb c i = (!! i) . c ``` > A note about the syntax above: Haskell’s infix-operator `!!` selects an element of a list, that is, `foo !! i` selects the i-th element of list ‘foo’. Parentheses can convert an infix operator to a prefix operator, that is, `(!!) foo i` is equivalent to `foo !! i`. Partial binding of arguments to infix operators is also possible, so `foo !! i`, `(!!) foo i`, `(foo !!) i`, and `(!! i) foo` are actually all equivalent. ## Neighbor Reductions Reductions over a neighborhood can easily be implemented using the previously introduced types and functions. The `reduce` function that we currently have in Iterator IR is basically the composition of a fold and deref: ```haskell reduce f x = foldl f x . deref ``` Note that – like for the current `reduce` builtin – the return value of `reduce f x` is a liftable stencil. ## Multi-Iterator Cases In the current design of Iterator IR, some functions are variadic. Other functional languages (like Haskell, OCaml, and also modern and fancy ones like Agda) don’t support variadic functions. Typically, the functions are defined in a fully curried way, that is, each functions takes exactly one argument. Multi-argument functions are actually just single-argument functions returning another function. Supporting variadic functions in this scenario is impossible: the function would have to know beforehand if the user wants to add another argument or already get the final result. See the following example: ```haskell -- Basic variadic function in a curried form, -- computing the sum of an arbitrary number of ints intSum :: Int → Int … → Int -- But what’s the type of ‘x’ here? x = intSum 1 2 -- Could of course be Int and then x == 3 -- BUT maybe we actually want to add more arguments and expect: x 3 4 == 10 ``` So without usage context it’s actually impossible to determine the correct type! Haskell actually allows some very limited ‘fake’ variadic functions using type classes, this trick is for example used in its standard library’s `printf` (see e.g., [this Stackoverflow answer](https://stackoverflow.com/a/7828634) for a description how this works), but is not generalizable to arbitrary use cases. Also F# has some limited support for .NET ‘parameter arrays’, but they are only usable in some ‘uncurried’ contexts. The type inference algorithm for Iterator IR uses uncurried functions together with variadic tuples and special language-specific type constraints to make things work. In Haskell, it’s common to use different function names instead. So, for example for lift, we have: ```haskell lift1 :: (Iterator y z v → w) → Iterator x z v → Iterator x y w lift1 = iextend -- from IndexedComonad lift2 :: (Iterator y z v → Iterator y z' v' → w) → Iterator x z v → Iterator x z' v' → Iterator x y w lift2 g (It f a) (It f' a') = It (\x → g (It f x) (It f' x)) a -- Note: a and a' must be identical, check omitted for simplicity ``` However, none of these multi-argument versions of Iterator IR’s current builtins is available in the previously introduced comonadic types. So, still working on that stuff… ## Proof-of-Concept Implementation An implementation for further inspection is available on [Github](https://github.com/fthaler/iterator-ir-haskell).