What is Abstraction?

"I don't know; I don't want to know."

Rich Hickey, author of Clojure, in his entertaining presentation Simple Made Easy (2011), which I recommend to all future software engineers.

Abstraction plays a major role in mathematics and software engineering, as it does in every day life.

I remember my little boy, many years ago, riding on my shoulders and pointing at the moon saying "ball". One reaction is to be amused at the child getting it wrong. Another is, to recognize that he understood the abstract concept of "round".

In its nature, the question "What is abstraction?" is a philosphical question that cannot be answered once and for all. Nevertheless, we can try to mathematically formalise abstraction. ^[1]

In this lecture, I want to explain one powerful way of formalising abstraction mathematically. It was devised in the late 19th century and its usefulness played an important role why set theory was widely adopted by the mathematical community as a foundation of mathematics. It is known as the technique of quotienting by an equivalence relations.

But let us first look at examples of abstraction in software engineering.

Examples: A Network of Abstractions

On the lowest level we have quantum physics. In quantum physics, measurements assume discrete values.
On the next level, we are dealing with circuits in which voltages assume continues values. Note that no measurement device is accurate enough to actually return a continuity of values. Continuity is an abstraction used for a large number of discrete values. ^[2]
Next, with the help of some sophisticated engineering, we turn the continous voltages into discrete bits, arriving at the level of chips, addresses, hex. ^[3]
Then we have microprogramming …
… and assembly …
… and LLVM …
… and … until we arrive at our favourite programming language, say,
Haskell.

At this point, things get really complicated. What looked like a hierarchy in the beginning now turns into a network. For example, we want to design algorithms and solve problems independently of any particular programming language such as Haskell. So we need to find new levels of abstraction. Moreover, the right level of abstraction depends on the problem to be solved …

Methods for Constructing Abstractions

So we need a wide variety of methods to construct useful levels of abstractions:

Logic
Rewriting
Context-free grammars
Automata, graphs, transition systems, …
Virtual Machines
Domain Specific Languages
UML
Category Theory
…

One could easily spend a whole semester exploring specific examples and pointing out connections and interdependencies. One overarching theme of these methods is that they should all lead to formal methods and algorithms that can be turned into software tools supporting designers and developers.

After having set the scene, I want to introduce one particular way of construction abstractions, namely abstraction by equivalence relation.

Equivalence Relations

One of the most powerful and simple ways of constructing an abstraction to consider things equal that are different.

To come back to the introductory example, my little boy was not confused about the meaning of "ball" when he pointed to the moon, rather he took the meaning of "ball" to be the set of all objects that are round. Then he used the name of one of the objects in this set as a name for the set itself.

Equivalence relations allow us to formalise this mathematically.

Let me choose an example of colours. We usually pick from the wide range of possible colours a small discrete set. Let me start with a chart that has the following colours:

I imagine I walk through a garden collecting flowers and fruits in a basket. Then I sit down and classify

Object	Colour
dandelion	yellow
daffodil	yellow
lemon	yellow
orange	orange
rose	red
poppy	red
violet	purple
pansy	purple
	violet
	blue
	blue-green
	green

Mathematically, this classification is a function

f : O b j e c t s \to C o l o u r s

Like all functions, this function induces an equivalence relation on objects.

The dandelion and lemon are equivalent because the are both yellow. Similarly, the rose and the poppy are equivalent. Etc.

Symbolically, we write this as

\begin{aligned} d a n d e l i o n & \equiv l e m o n \\ d a n d e l i o n & \equiv d a f f o d i l \\ r o s e & \equiv p o p p y \\ v i o l e t & \equiv p a n s y \end{aligned}

This relation

\equiv

, the equivalence relation induced by

f

, can be formally defined:

a \equiv b \overset{d e f}{⟺} f (a) = f (b)

Note that this definition makes sense for any function

f

We say that

\begin{matrix} {d a n d e l i o n, l e m o n, d a f f o d i l} \\ {o r a n g e} \\ {r o s e, p o p p y} \\ {v i o l e t, p a n s y} \end{matrix}

are the equivalence classes of the equivalence relation

\equiv

These equivalence classes can be themselves collected in a set:

\begin{matrix} {{d a n d e l i o n, l e m o n, d a f f o d i l}, {o r a n g e}, {r o s e, p o p p y}, {v i o l e t, p a n s y}} \end{matrix}

This set is called the quotient of the set

O b j e c t s

by the equivalent relation

\equiv

and written in symbolical notation as

O b j e c t s / \equiv

The terminology "quotient" will be explained later.

Notice how

O b j e c t s / \equiv

is an abstraction of

O b j e c t s

. It formalises the idea that as far as our colours are concerned, we cannot distinguish the rose from the poppy: both are red. And as long as this is the only observation available, we cannot make out how they are different. From this more abstract point of view, the rose is equal to the poppy.

Before we apply this methodology to more interesting examples, I want to introduce one more notation. We write

[r o s e]_{\equiv}

for the equivalence class that contains "rose". That is,

[r o s e]_{\equiv} = {r o s e, p o p p y}

. We turn this into a general definition of equivalence class:

[a]_{\equiv} \overset{d e f}{=} {b ∣ b \equiv a}

a

is called a represenatitive of the equivalence class

[a]_{\equiv}

Notice that

a \equiv b ⟺ [a]_{\equiv} = [b]_{\equiv}

Informally, we can interpret this as a way to make precise in which sense

a, b

are simultaneously different and equal, depending on whether we take the concrete (on the left) or abstract (on the right) point of view.

Applications

Every list of applications of such a wide ranging concept as quotienting by equivalence relations suffers from the fact that no individual example can do justice to the importance of the general concept.

I limit myself to a few examples.

The first example is a bit artificial but it shows why one talks about "quotienting".
Sets and lists, which point towards more applications to data structures and algorithms.
Some answers to what-is questions in mathematics.
Prices, which show that examples are not limited to mathematics and computer science.

Quotienting finite sets

The following should be pretty obvious if you draw a picture.

Exercise: Let

A

be a finite set with

n

elements and let

\equiv

be an equivalence relation on

A

such that each equivalence class has

m

elements. Then

A / \equiv

has

n / m

elements.

In group theory, this is known as Lagrange's theorem.

The data type of sets

Mathematicians define sets axiomatically. But for finite sets we can also obtain them as equivalence classes of lists and this gives a common way to implement sets algorithmically.

Let us define ≡ as the smallest equivalence relation containing the following for all elements x and y and all lists xs.

x:y:xs ≡ y:x:xs
x:x:xs ≡ x:xs

The first equation says that the order does not matter and the second equation says that duplicates can be eliminated.

Mathematically, this defines a set as an equivalence class of lists.

From a programming point of view, this means that any library that implements operations on lists can serve as a library for sets as long as we make sure to only choose operations that respect the two equations above.

Exercise: Concatenation of lists corresponds to what operation on sets?

One general idea to take from here is that in programming we often need to represent data with detail that is irrelevant to the specification of the problem. In these cases, the points of view of the implementation and the specification, or the concrete and the abstract, are related by an equivalence relation.

Answers to some what-is questions in maths

What is an integer? An integer is a pair

(n, m) \in N \times N

modulo ^[4]

(a, b) \equiv (c, d) \overset{d e f}{\Leftrightarrow} a - b = c - d

What is a fraction? An fraction is a pair

(n, m) \in Z \times N_{+}

modulo

(a, b) \equiv (c, d) \overset{d e f}{\Leftrightarrow} a \cdot d = b \cdot c

What is a real number? An equivalence class of Cauchy sequences. And on it goes.

Price

(Disclaimer: The purpose of this example is only to illustrate some general ideas, not to present some actual results from the field of economics. On the other hand there are writings of 19th century economists that suffer from the fact that at the time quotienting by an equivalence relation was not yet invented.)

Imagine a function

p r i c e : G o o d s \to P r i c e s

But what if the economy we want to model has no money? Is there another way to define what the price of a good is?

Exercise: Assume that we can observe the relation

a \equiv b

where

a, b

are goods and

a \equiv b

is defined as

a

can be traded for

b

. Assume further that

\equiv

is an equivalence relation. Is there a way to define

P r i c e s

and the function

p r i c e s

Appendix: Modular Arithmetic

Let us model time by the integers

Z

. Time point 0 is an arbitrarily chosen starting time. The difference between two consecutive integers models one hour in real time. A clock for us is a device that measures hours on a scale of

{0, 1, \dots 11}

. More formally a clock is an implementation of the function

f : Z \to {0, 1, \dots 11}

mapping an integer

n

to the smallest non-negative number

m

such that there is an integer k with

n = 12 k + m

Exercise: Show that the equivalence relation

a \equiv b \overset{d e f}{⟺} f (a) = f (b)

satisfies

a \equiv b \Leftrightarrow \exists k . b - a = 12 k

Exercise: Explain why the equivalence classes of

\equiv

are also denoted by

12 Z + k 0 \leq k < 12

The set

Z / \equiv

of all these equivalence classes is also denoted by

Z / 12 Z .

What is Abstraction?

Examples: A Network of Abstractions

Methods for Constructing Abstractions

Equivalence Relations

Applications

Quotienting finite sets

The data type of sets

Answers to some what-is questions in maths

Price

Appendix: Modular Arithmetic

Further Reading

Read more

Create, Detect, Lift, Preserve, Reflect (Co)Limits

A Very Short Introduction to Monads

Category Theory - Axiomatic Theory of Structure

Category Theory in Computer Science