Category Theory in Computer Science

My aim here is to give undergraduate students who happen to pass by here a first idea on what category theory is about.

Category theory can be described in many different ways, as in the well-known metaphor of the elephant, which gave the title to the important monograph Sketches of an Elephant.

One of the most common ways to explain category theory is to say that it describes structures, or objects, not by how to build them but by how they interact among themselves.

From the point of view of a computer scienctist, we may say that category theory shifts the attention from implementation to specification.

But there is more. I want to say it as follows.

Category theory is an axiomatic theory of structure.

What is geometry? What are the real numbers? Mathematicians like to answer such philosophical questions by setting out a list of axioms. Any structure that satisfies the axioms of geometry is a geometry. Any structure that satisfies the axioms of the real numbers can stand in for the real numbers.

But what is structure? One way to answer this question is category theory.

An Example of Structure

I want to give an example of structure and introduce the idea of a structure preserving map.

Let me start with a question: What is a clock?

We could think about what a clock is made of. There is something that changes periodically, like a pendulum, or an Unruh, or a quartz. There is also something like hands, or a digital display. The disadvantage of approaching a clock from this engineering point of view is that such a definition is dependent on implementation details that may change from model to model.

Let us try something more abstract. A clock is a mapping from time to a display. Let us say that we model time by the whole numbers

Z

(counting hours) and the display by arithmetic modulo 12, written as

Z_{12}

, modelling the hour-hand on a 12-hour clock.^[1] In mathematical notation, such a clock then can be understood as a function

c l o c k : Z \to Z_{12} .

Moreover, a clock that is neither fast nor late is the unique such function that preserves addition, that is, that has the property

c l o c k (n + m) = c l o c k (n) + c l o c k (m) .

In case you have seen any category theory already, you know that we like to write equations as diagrams. The equation above can be written as the following diagram where the horizontal arrows are given by the function

c l o c k

and the vertical arrows are given by addition.

\begin{matrix} Z \times Z & ⟶ & Z_{12} \times Z_{12} \\ ↓ & ↓ \\ Z & ⟶ & Z_{12} \end{matrix}

(Take the time to interpret the equation by the diagram and vice versa.)

To summarise, in this example I take the point of view that

time is the mathematical structure of integers with addition;
the display is the integers modulo 12, again with addition as structure;
a clock is a structure preserving map from time to the display.

The point was to given an example of structure (integers with addition) and structure preserving map (a function preserving addition).

The Notion of a Category

The idea of category theory is to study structure abstractly. In other words, category theory should apply to whatever structure means in the concrete situation at hand.

In the example, structure was addition. We then defined structure preserving maps as those functions that preserve addition.

In the axiomatic approach we turn this around. We start from "maps" and interpret them as structure preserving, without saying explicitely what the structure is. Instead of defining structure explicitely and then define maps in terms of structure, we start with the maps and then recover the structure as that what is preserved by the maps (for example through representation theorems).

To summarize, we axiomatise the notion of "structure preserving map" directly, without reference to any particular kind of structure.

So what are the minimal requirements we should impose on "maps" in general?

The answer: Maps should form a category.

Essentially, this only says that we can compose maps. That is, if we have maps

A \overset{f}{⟶} B \overset{g}{⟶} C

then we can compose

f

and

g

to one map denoted by

f; g

g \circ f

A \overset{f; g}{⟶} C .

Maps are also called "arrows" or "morphisms". At this level of generality, I prefer "arrows" because arrows in a category do not need to be what we usually think of when we say "map".

A, B, C

are called "objects", or sometimes "types".

Here are some examples of categories.

Category Objects Arrows

Set sets functions

Rel sets relations

Group groups structure preserving maps

… … structure preserving maps

We have seen groups in the example of the clockw: Numbers with addition are typical examples of groups. The "…" indicate that whatever your notion of mathematical structure at hand, there will also be an associated category of structure preserving maps.
Programming languages typically have a fragment that forms a category where the objects are the types (bool, int, float, …) and the arrows are the programs.
A logic gives rise to a category in which the objects are the theorems and the arrows are the proofs.
Every ordered set, or lattice, is a category. This observation can be extended to metric spaces.

Category	Objects	Arrows
Set	sets	functions
Rel	sets	relations
Group	groups	structure preserving maps
…	…	structure preserving maps

Universal Properties

So far I have indicated that the definition of a category admits a wide range of examples, many of which can be thought of as structures with structure preserving maps.

But the notion of a category is so general that it is not immediately obvious that one can develop this into an interesting theory with powerful theorems, methods, and software tools.

The key to understanding how the notion of category enables a general theory of structure is the notion of universal property.

This informal introduction is not the place to give a mathematical precise treatment of universal properties. But if I may assume that the reader knows the notion of a cartesian product, I can give an example.

To visualise the product, take the example of two numberlines, one called the x-axis and the other called the y-axis. Then the product is the usual cartesian coordinate plane in which we always draw our graphs.

More mathematically, the cartesian product

A \times B

of two sets

A

and

B

is defined as the set

A \times B = {(a, b) ∣ a \in A, b \in B}

of pairs

(a, b)

of elements

a \in A

and

b \in B

As the curly braces indicate, we are using the notation of set theory above.^[2] This notation indicates how to build the product as a set of pairs.

This defines the cartesian product via a particular implementation (think of set theory as a "programming language" for mathematics).

Instead of the set-theoretic implementation of the product, there is also a category theoretic definition. Instead of saying how to build the product from elements, we describe how the product interacts with other sets.

We say that an arbitrary set^[3]

A \times B

is a product of

A

and

B

A \times B

comes equipped with to arrows

p_{1} : A \times B \to A, p_{2} : A \times B \to B

such that for all objects

C

and all arrows

f_{1} : C \to A

and

f_{2} : C \to B

there is a unique arrow

h : C \to A \times B

such that

h; p_{1} = f_{1}

and

h; p_{2} = f_{2}

.^[4]

Most definitions in category theory can be phrased by following a similar pattern. Such definitions are called definitions by universal property. The qualifier universal here stems from the fact that the definition above quantifies over all objects

C

of the category.

An Axiomatic Theory of Structure

If you see the definition of product via its universal property for the first time, this definition may seem a very round-about way to define what a product is. But it has a crucial advantage: It only uses the language of arrows in a category:

This same definition makes sense in any category, whatever notion of structure the category happens to implement.

So now we can begin to see how an axiomatic theory of structure becomes possible.

Instead of starting with a particular category, such as the category Set of sets and functions, and then describe constructions like the product in terms of the available structure (such as

\in

{\dots}

, etc), one proceeds axiomatically.

We can now start by saying "Let

$C$ be a category with products …" and then go on to prove theorems for all categories

C

satisfying our assumptions.

This approach to mathematics was invented by Eilenberg and Mac Lane in order to simplify certain mathematical proofs. This often happens in maths: Good abstractions make it easier to prove theorems because good abstractions only keep the relevant facts and throw away everything that is a distraction.

Applications to Computer Science

There are several reasons why category theory is of interest in computer science.

The right level of abstraction is as important (and as difficult) to find in software engineering as it is in mathematics.
Category theory has a lot to say about logic in computer science.^[5]
Type theory, which is an important part of theoretical computer science, is essentially category theory from a computational point of view.
Category theory has a native notion of "duality"^[6], which can be used to formalise various phenomena arising in mathematics and computer science.

Category Theory in Computer Science

An Example of Structure

The Notion of a Category

Universal Properties

An Axiomatic Theory of Structure

Applications to Computer Science

Further Reading

Read more

Create, Detect, Lift, Preserve, Reflect (Co)Limits

A Very Short Introduction to Monads

Category Theory - Axiomatic Theory of Structure

Natural Deduction