# Recursion schemes, categorically! ## 2. Lists We started with the tree example, but there is a simpler recursive data structure, a list. You may think of a list as a tree with the branching ratio of one. ```haskell= data List c = Nil | Cons c (List c) ``` ```scala= sealed trait List[C] case class Nil[C]() extends List[C] case class Cons[C](head: C, tail: List[C]) extends List[C] ``` The `Nil` constructor creates an empty leaf, and the `Cons` constructor, besides containing a payload of type `C`, acts like a node that has only one child. This is a recursive data structure, so we should be able to apply our algebraic methods to factorize it. Step one is to define a functor in which we replace the recursive part with an arbitrary type `R` ```haskell= data ListF c r = NilF | ConsF c r ``` ```scala= sealed trait ListF[C, R] case class NilF[C, R]() extends ListF[C, R] case class ConsF[C, R](head: C, tail: R) extends ListF[C, R] ``` The first argument describes the payload, which we'll keep around as a type parameter. `ListF` is indeed a functor ```haskell= instance Functor (ListF c) where fmap f NilF = NilF fmap f (ConsF c r) = ConsF c (f r) ``` ```scala= implicit def functorForListF[A]: Functor[ListF[A, *]] = new Functor[ListF[A, *]] { def map[B, C](fa: ListF[A, B])(f: B => C): ListF[A, C] = fa match { case NilF() => NilF() case ConsF(head, tail) => ConsF(head, f(tail)) } } ``` In category theory we would describe this functor as a sum (coproduct) of the unit (terminal object) and a pair (product) of $C$ and $R$ $$F (R) = 1 + C \times R$$ An algebra for this functor ```haskell= Algebra (ListF c) a ``` ```scala= Algebra[ListF[C, *], A] ``` is defined by a choice of a carrier `A` and an evaluator of the type ```haskell= eval :: ListF c a -> a ``` ```scala= def eval[C, A]: ListF[C, A] => A = ??? ``` Let's see an example of one such evaluator, replacing both `C` and `A` with `Int` ```haskell= evalSum :: ListF Int Int -> Int evalSum NilF = 0 evalSum (ConsF x a) = x + a ``` ```scala= val evalSum: Algebra[ListF[Int, *], Int] = l => l match { case NilF() => 0 case ConsF(x, a) => x + a } ``` You may think of this evaluator as adding the payload `x` to the accumulator `a`, which is the result of evaluating the tail of the list. An empty list gets assigned a zero value. You can guess that the catamorphism for this algebra will produce the sum of the elements of a list. A list is a fixed point of this functor ```haskell= type L c = Fix (ListF c) ``` ```scala type L[C] = Fix[ListF[C, *]] ``` This definition is equivalent to the original recursive definition of a list. To create values of this type, we can define helper functions or *smart constructors* ```haskell= nil :: L c nil = Fix NilF cons :: c -> L c -> L c cons c cs = Fix (ConsF c cs) ``` ```scala= def nil[C](): L[C] = Fix(NilF()) def cons[C](c: C, l: L[C]): L[C] = Fix(ConsF(c, l)) ``` For instance, the list of integers `[1, 2, 3]` can be defined as ```haskell= lst :: Fix (ListF Int) lst = cons 1 (cons 2 (cons 3 nil)) ``` ```scala= val lst: Fix[ListF[Int, *]] = cons(1, cons(2, cons(3, nil[Int]))) ``` We can apply a catamorphism to it to sum its elements ```haskell= cata evalSum lst ``` ```scala= cata(evalSum).apply(lst) ``` Since lists are so ubiquitous in functional programming, they deserve special syntax. And because list catamorphisms are very useful, they are defined in libraries as folds. Notice that a list algebra splits into two components: one is the value of type `A` corresponding to `NilF`, and the other is a function that combines the two values stored in `ConsF x a`. This function has the signature ```scala (C, A) ⇒ A ``` You may recognize these two components in the definition of [`foldRight`](https://www.scala-lang.org/api/current/scala/collection/immutable/List.html#foldRight[B](z:B)(op:(A,B)=%3EB):B) ```scala def foldRight[B](z: B)(op: (A, B) ⇒ B): B ``` This is because `foldRight` is a list catamorphism. ## Example We are going to use the algebraic methods to implement quicksort. This will require the use of algebras and coalgebras, so we'll build it step by step. The first step is to be able to partion a list in two parts according to a predicate (a yes/no function returning `Bool`). The result of the partitioning will be a pair of lists ```haskell= type TwoLst a = ([a], [a]) ``` ```scala= type TwoLst[A] = (List[A], List[A]) ``` For simplicity, we've encoded the result type using the built-in implementation of the list. But we'll use the fixed point of `ListF` as the argument for recursion. We'll start by defining the algebra with the carrier `TwoLst a` ```haskell= Algebra (ListF a) (TwoLst a) ``` ```scala= Algebra[ListF[A], TwoLst[A]] ``` We'll generate this algebra from a predicate. Since `ListF` has two constructors, we have to deal with two cases. The `NifF` case produces the starting empty accumulator. The `ConsF` case adds a new value `x` to the accumulator, placing it on the left or on the right, depending on the predicate ```haskell= splitAlg :: (a -> Bool) -> Algebra (ListF a) (TwoLst a) splitAlg pred NilF = ([], []) splitAlg pred (ConsF x twoLst) = select pred x twoLst where select :: (a -> Bool) -> a -> TwoLst a -> TwoLst a select pred x (l, r) = if pred x then (x: l, r) else (l, x: r) ``` ```scala= def splitAlg[A](pred: A => Boolean): Algebra[ListF[A, *], TwoLst[A]] = { case NilF() => (Nil, Nil) case ConsF(x, (l, r)) => if (pred(x)) (x :: l, r) else (l, x :: r) } ``` We can now pass this algebra to a catamorphism and apply it to an arbitrary list ```haskell= partition :: Ord a => (a -> Bool) -> Fix (ListF a) -> TwoLst a partition pred = cata (splitAlg pred) ``` ```haskell= l1 :: Fix (ListF Int) l1 = cons 2 (cons 8 (cons 1 (cons 7 nil))) p1 = partition' (< 5) l1 > ([2,1],[8,7]) ``` ```scala= val l1: Fix[ListF[Int, *]] = cons(2, cons(8, cons(1, cons(7, nil[Int])))) def lessThan5: Int => Boolean = _ < 5 partition(lessThan5).apply(l1) ``` Since we are operating on lists, the same thing can be implemented using `foldr` ```haskell= partition p xs = foldr (select p) ([],[]) xs where select :: (a -> Bool) -> a -> TwoLst a -> TwoLst a select p x (ts,fs) | p x = (x:ts, fs) | otherwise = (ts, x:fs) ``` ```scala= def select[A](p: A => Boolean, a: A, x: TwoLst[A]): TwoLst[A] = if (p(a)) (a :: x._1, x._2) else (x._1, a :: x._2) def partition2[A](l: List[A])(p: A => Boolean): TwoLst[A] = l.foldRight((List.empty[A], List.empty[A]))((a, b) => select(p, a, b)) ``` Here, the second argument passed to `foldr` is the `NilF` part of the algebra, and the function `select` deals with the `ConsF` case, combining the new value with the old acumulator to produce the new acumulator. You can find code for this blog post [here](https://scastie.scala-lang.org/oli-kitty/CC3s9g6eQQGX6YperjkLqA/3)