# Category Theory in Computer Science (see also [Category Theory - Axiomatic Theory of Structure](https://hackmd.io/@alexhkurz/H1OxumxRP)) My aim is to give undergraduate students who happen to pass by here a first idea on what category theory is about. Category theory can be described in many different ways, as in the well-known metaphor of the [elephant](https://en.wikipedia.org/wiki/Blind_men_and_an_elephant), which gave the title to the important monograph [Sketches of an Elephant](https://ncatlab.org/nlab/show/Sketches+of+an+Elephant). One of the most common ways to explain category theory is to say that it describes structures, or objects, not by how to build them but by how they interact among themselves. From the point of view of a computer scienctist, we may say that category theory shifts the attention from implementation to specification. But there is more. I want to say it as follows. Category theory is an axiomatic theory of structure. What **is** geometry? What **are** the real numbers? Mathematicians like to answer such philosophical questions by setting out a list of axioms. Any structure that satisfies the axioms of geometry is a geometry. Any structure that satisfies the axioms of the real numbers can stand in for the real numbers. But what **is** structure? One way to answer this question is category theory. ## An Example of Structure I want to give an example of **structure** and introduce the idea of a **structure preserving** map. Let me start with a question: What is a clock? We could think about what a clock is made of. There is something that changes periodically, like a pendulum, or an Unruh, or a quartz. There is also something like hands, or a digital display. The disadvantage of approaching a clock from this engineering point of view is that such a definition is dependent on implementation details that may change from model to model. Let us try something more abstract. A clock is a mapping from time to a display. Let us say that we model time by the whole numbers $\mathbb Z$ (counting hours) and the display by arithmetic modulo 12, written as $\mathbb Z_{12}$, modelling the hour-hand on a 12-hour clock.[^arithmetic_modulo] In mathematical notation, such a clock then can be understood as a function $$ clock: \mathbb Z\to \mathbb Z_{12}.$$ Moreover, a clock that is neither fast nor late is the unique such function that preserves addition, that is, that has the property $$ clock(n+m)=clock(n)+clock(m).$$ In case you have seen any category theory already, you know that we like to write equations as diagrams. The equation above can be written as the following diagram where the horizontal arrows are given by the function $clock$ and the vertical arrows are given by addition. $$ \begin{array}{} \mathbb Z\times \mathbb Z & \longrightarrow & \mathbb Z_{12}\times \mathbb Z_{12} \\ \downarrow & & \downarrow \\ \mathbb Z & \longrightarrow & \mathbb Z_{12} \end{array} $$ (Take the time to interpret the equation by the diagram and vice versa.) To summarise, in this example I take the point of view that - time is the mathematical structure of integers with addition; - the display is the integers modulo 12, again with addition as structure; - a clock is a structure preserving map from time to the display. The point was to given an example of structure (integers with addition) and structure preserving map (a function preserving addition). ## The Notion of a Category The idea of category theory is to study structure abstractly. In other words, category theory should apply to whatever structure means in the concrete situation at hand. In the example, structure was addition. We then *defined* structure preserving maps as those functions that preserve addition. In the axiomatic approach we turn this around. We start from "maps" and interpret them as structure preserving, without saying explicitely what the structure is. Instead of defining structure explicitely and then define maps in terms of structure, we start with the maps and then recover the structure as that what is preserved by the maps (for example through representation theorems). To summarize, we axiomatise the notion of "structure preserving map" directly, without reference to any particular kind of structure. So what are the minimal requirements we should impose on "maps" in general? The answer: Maps should form a category. Essentially, this only says that we can compose maps. That is, if we have maps $$ A \stackrel f \longrightarrow B \stackrel g \longrightarrow C$$ then we can compose $f$ and $g$ to one map denoted by $f;g$ or $g\circ f$ $$ A\stackrel{f; g}\longrightarrow C.$$ Maps are also called "arrows" or "morphisms". At this level of generality, I prefer "arrows" because arrows in a category do not need to be what we usually think of when we say "map". $A,B,C$ are called "objects", or sometimes "types". Here are some examples of categories. - |Category|Objects|Arrows| |:---:|:---:|:---:| |**Set** | sets | functions | |**Rel** | sets | relations | |**Group** | groups | structure preserving maps | |**Hask** | types | [(Haskell) programs](https://math.andrej.com/2016/08/06/hask-is-not-a-category/)[^Hask] | |**...** | ... | structure preserving maps | We have seen groups in the example of the clock: Numbers with addition are typical examples of groups. The "..." indicate that whatever your notion of mathematical structure at hand, there will also be an associated category of structure preserving maps. - Programming languages typically have a fragment that forms a category where the objects are the types (bool, int, float, ...) and the arrows are the programs. - A logic gives rise to a category in which the objects are the theorems and the arrows are the proofs. - Every ordered set, or lattice, is a category. This observation can be extended to metric spaces. ## Universal Properties So far I have indicated that the definition of a category admits a wide range of examples, many of which can be thought of as structures with structure preserving maps. But the notion of a category is so general that it is not immediately obvious that one can develop this into an interesting theory with powerful theorems, methods, and software tools. The key to understanding how the notion of category enables a general theory of structure is the notion of universal property. This informal introduction is not the place to give a mathematical precise treatment of universal properties. But if I may assume that the reader knows the notion of a cartesian product, I can give an example. To visualise the product, take the example of two numberlines, one called the x-axis and the other called the y-axis. Then the product is the usual [cartesian coordinate plane](https://www.khanacademy.org/math/algebra/x2f8bb11595b61c86:foundation-algebra/x2f8bb11595b61c86:algebra-overview-history/v/descartes-and-cartesian-coordinates) in which we always draw our graphs. More mathematically, the cartesian product $A\times B$ of two sets $A$ and $B$ is defined as the set $$A\times B = \{(a,b) \mid a\in A, b\in B\}$$ of pairs $(a,b)$ of elements $a\in A$ and $b\in B$. As the curly braces indicate, we are using the notation of set theory above.[^set-theory] This notation indicates how to build the product as a set of pairs. This defines the cartesian product via a particular implementation (think of set theory as a "programming language" for mathematics). Instead of the set-theoretic implementation of the product, there is also a category theoretic definition. Instead of saying how to build the product from elements, we describe how the product interacts with other sets. We say that an arbitrary set[^arbitrary] $A\times B$ with two functions $$ A\stackrel {\ \ p_1} \longleftarrow A\times B \stackrel {p_2\ } \longrightarrow B $$ is a product of $A$ and $B$ if for all objects $C$ and all arrows $f_1:C\to A$ and $f_2:C\to B$ there is a unique arrow $h:C\to A\times B$ such that $h;p_1=f_1$ and $h;p_2=f_2$.[^exercise] <!-- https://q.uiver.app/#q=WzAsNCxbMSwxLCJBXFx0aW1lcyBCIl0sWzAsMiwiQSJdLFsyLDIsIkIiXSxbMSwwLCJDIl0sWzAsMSwicF8xIiwyXSxbMCwyLCJwXzIiXSxbMywxLCIiLDAseyJjdXJ2ZSI6Mn1dLFszLDIsIiIsMix7ImN1cnZlIjotMn1dLFszLDAsIiIsMSx7InN0eWxlIjp7ImJvZHkiOnsibmFtZSI6ImRhc2hlZCJ9fX1dXQ== --> <iframe class="quiver-embed" src="https://q.uiver.app/#q=WzAsNCxbMSwxLCJBXFx0aW1lcyBCIl0sWzAsMiwiQSJdLFsyLDIsIkIiXSxbMSwwLCJDIl0sWzAsMSwicF8xIiwyXSxbMCwyLCJwXzIiXSxbMywxLCIiLDAseyJjdXJ2ZSI6Mn1dLFszLDIsIiIsMix7ImN1cnZlIjotMn1dLFszLDAsIiIsMSx7InN0eWxlIjp7ImJvZHkiOnsibmFtZSI6ImRhc2hlZCJ9fX1dXQ==&embed" width="356" height="332" style="border-radius: 8px; border: none;"></iframe> [^arbitrary]: It is important to note that in this paragraph $A\times B$ stands in for an arbitrary set that satisfies the universal property. The previously defined $A\times B$ as a set of pairs is just one example of a categorical product. The category theoretic definition does not define the product uniquely. But, as it will turn out, all categorical products are "isomorphic" and that is good enough. Btw, the reason to abuse notation and to use $A\times B$ with two different meanings is a practical one: Both in set theory and in category theory the notation is used. And, in practice, the difference in meaning is slight and rarely relevant (because, as we said, different products are ismorphic). Most definitions in category theory can be phrased by following a similar pattern. Such definitions are called definitions by **universal property**. The qualifier *universal* here stems from the fact that the definition above quantifies over *all* objects $C$ of the category and that the definition makes sense for *all* categories. ## An Axiomatic Theory of Structure If you see the definition of product via its universal property for the first time, this definition may seem a very round-about way to define what a product is. But it has a crucial advantage: It only uses the language of arrows in a category: *This same definition makes sense in any category, whatever notion of structure the category happens to implement.* So now we can begin to see how an axiomatic theory of structure becomes possible. Instead of starting with a particular category, such as the category **Set** of sets and functions, and then describe constructions like the product in terms of the available structure (such as $\in$, $\{\dots\}$, etc), one proceeds axiomatically. We can now start by saying "*Let $\mathcal C$ be a category with products ...*" and then go on to prove theorems for *all* categories $\mathcal C$ satisfying our assumptions. This approach to mathematics was invented by [Eilenberg and Mac Lane](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=author%3Aeilenberg+author%3Amaclane+%22General+theory+of+natural+equivalences%22&btnG=) in order to simplify and refactor certain mathematical proofs. This often happens in maths: Good abstractions make it easier to prove theorems because good abstractions only keep the relevant facts and throw away everything that is a distraction. ## Applications to Computer Science There are several reasons why category theory is of interest in computer science. - The right level of abstraction is as important (and as difficult) to find in software engineering as it is in mathematics. - Category theory has a lot to say about [logic in computer science](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=%22category+theory%22+%22computer+science%22&btnG=).[^logic] - Type theory, which is an important part of theoretical computer science, is essentially category theory from a computational point of view. - Category theory has a native notion of "duality"[^duality], which can be used to formalise various phenomena arising in mathematics and computer science. ## Further Reading - [Category Theory - An Axiomatic Theory of Structure](https://hackmd.io/@alexhkurz/H1OxumxRP) - Applied Category Theory is a young and active research area. For a first entry point see [appliedcategorytheory.org](http://www.appliedcategorytheory.org/). [^arithmetic_modulo]: Arithmetic modulo 12 is the arithmetic on the clock. For example, $11+2=1$ as in 11am + 2h = 1pm. [^set-theory]: When I went to school, we started with set theory in the first grade. Venn diagrams, and the like. Khanacademy has a basic introduction to [set-theoretic notation](https://www.khanacademy.org/math/statistics-probability/probability-library/basic-set-ops/v/intersection-and-union-of-sets?modal=1) hidden in a course statistics and probability. [^exercise]: An important exercise at this point would be to show that the cartesian product of sets satisfies this property. And that all products are isomorphic. But this brief introduction is not a course with exercises and theorems. [^logic]: Categories typically come with their own "internal" logic. Different logics can be classified according to categorical structure. For example, the correspondence of products and equational logic is the lowest rung on a beautiful and rich ladder of logics. This is studied in categorical logic. [^duality]: The dual category is defined simply by turning around all arrows. [^Hask]: Btw, I am not fully convinced by the arguments that Hask is not a category. If one wants to define the category Hask, one is under no obligation to include all features of Haskell in the definition.