(see also Category Theory - Axiomatic Theory of Structure)
My aim here is to give undergraduate students who happen to pass by here a first idea on what category theory is about.
Category theory can be described in many different ways, as in the well-known metaphor of the elephant, which gave the title to the important monograph Sketches of an Elephant.
One of the most common ways to explain category theory is to say that it describes structures, or objects, not by how to build them but by how they interact among themselves.
From the point of view of a computer scienctist, we may say that category theory shifts the attention from implementation to specification.
But there is more. I want to say it as follows.
Category theory is an axiomatic theory of structure.
What is geometry? What are the real numbers? Mathematicians like to answer such philosophical questions by setting out a list of axioms. Any structure that satisfies the axioms of geometry is a geometry. Any structure that satisfies the axioms of the real numbers can stand in for the real numbers.
But what is structure? One way to answer this question is category theory.
I want to give an example of structure and introduce the idea of a structure preserving map.
Let me start with a question: What is a clock?
We could think about what a clock is made of. There is something that changes periodically, like a pendulum, or an Unruh, or a quartz. There is also something like hands, or a digital display. The disadvantage of approaching a clock from this engineering point of view is that such a definition is dependent on implementation details that may change from model to model.
Let us try something more abstract. A clock is a mapping from time to a display. Let us say that we model time by the whole numbers
Moreover, a clock that is neither fast nor late is the unique such function that preserves addition, that is, that has the property
In case you have seen any category theory already, you know that we like to write equations as diagrams. The equation above can be written as the following diagram where the horizontal arrows are given by the function
(Take the time to interpret the equation by the diagram and vice versa.)
To summarise, in this example I take the point of view that
The point was to given an example of structure (integers with addition) and structure preserving map (a function preserving addition).
The idea of category theory is to study structure abstractly. In other words, category theory should apply to whatever structure means in the concrete situation at hand.
In the example, structure was addition. We then defined structure preserving maps as those functions that preserve addition.
In the axiomatic approach we turn this around. We start from "maps" and interpret them as structure preserving, without saying explicitely what the structure is. Instead of defining structure explicitely and then define maps in terms of structure, we start with the maps and then recover the structure as that what is preserved by the maps (for example through representation theorems).
To summarize, we axiomatise the notion of "structure preserving map" directly, without reference to any particular kind of structure.
So what are the minimal requirements we should impose on "maps" in general?
The answer: Maps should form a category.
Essentially, this only says that we can compose maps. That is, if we have maps
then we can compose
Maps are also called "arrows" or "morphisms". At this level of generality, I prefer "arrows" because arrows in a category do not need to be what we usually think of when we say "map".
Here are some examples of categories.
Category | Objects | Arrows |
---|---|---|
Set | sets | functions |
Rel | sets | relations |
Group | groups | structure preserving maps |
… | … | structure preserving maps |
We have seen groups in the example of the clockw: Numbers with addition are typical examples of groups. The "…" indicate that whatever your notion of mathematical structure at hand, there will also be an associated category of structure preserving maps.
Programming languages typically have a fragment that forms a category where the objects are the types (bool, int, float, …) and the arrows are the programs.
A logic gives rise to a category in which the objects are the theorems and the arrows are the proofs.
Every ordered set, or lattice, is a category. This observation can be extended to metric spaces.
So far I have indicated that the definition of a category admits a wide range of examples, many of which can be thought of as structures with structure preserving maps.
But the notion of a category is so general that it is not immediately obvious that one can develop this into an interesting theory with powerful theorems, methods, and software tools.
The key to understanding how the notion of category enables a general theory of structure is the notion of universal property.
This informal introduction is not the place to give a mathematical precise treatment of universal properties. But if I may assume that the reader knows the notion of a cartesian product, I can give an example.
To visualise the product, take the example of two numberlines, one called the x-axis and the other called the y-axis. Then the product is the usual cartesian coordinate plane in which we always draw our graphs.
More mathematically, the cartesian product
of pairs
As the curly braces indicate, we are using the notation of set theory above.[2] This notation indicates how to build the product as a set of pairs.
This defines the cartesian product via a particular implementation (think of set theory as a "programming language" for mathematics).
Instead of the set-theoretic implementation of the product, there is also a category theoretic definition. Instead of saying how to build the product from elements, we describe how the product interacts with other sets.
We say that an arbitrary set[3]
Most definitions in category theory can be phrased by following a similar pattern. Such definitions are called definitions by universal property. The qualifier universal here stems from the fact that the definition above quantifies over all objects
If you see the definition of product via its universal property for the first time, this definition may seem a very round-about way to define what a product is. But it has a crucial advantage: It only uses the language of arrows in a category:
This same definition makes sense in any category, whatever notion of structure the category happens to implement.
So now we can begin to see how an axiomatic theory of structure becomes possible.
Instead of starting with a particular category, such as the category Set of sets and functions, and then describe constructions like the product in terms of the available structure (such as
We can now start by saying "Let
This approach to mathematics was invented by Eilenberg and Mac Lane in order to simplify certain mathematical proofs. This often happens in maths: Good abstractions make it easier to prove theorems because good abstractions only keep the relevant facts and throw away everything that is a distraction.
There are several reasons why category theory is of interest in computer science.
Arithmetic modulo 12 is the arithmetic on the clock. For example,
When I went to school, we started with set theory in the first grade. Venn diagrams, and the like. Khanacademy has a basic introduction to set-theoretic notation hidden in a course statistics and probability. ↩︎
It is important to note that in this paragraph
An important exercise at this point would be to show that the cartesian product of sets satisfies this property. And that all products are isomorphic. But this brief introduction is not a course with exercises and theorems. ↩︎
Categories typically come with their own "internal" logic. Different logics can be classified according to categorical structure. For example, the correspondence of products and equational logic is the lowest rung on a beautiful and rich ladder of logics. ↩︎
The dual category is defined simply by turning around all arrows. ↩︎