# What is Abstraction? "I don't know; I don't want to know." [Rich Hickey](https://en.wikipedia.org/wiki/Rich_Hickey), author of Clojure, in his entertaining presentation [Simple Made Easy](https://www.infoq.com/presentations/Simple-Made-Easy/) (2011), which I recommend to all future software engineers. --- Abstraction plays a major role in mathematics and software engineering, as it does in everyday life. I remember my little boy, many years ago, riding on my shoulders and pointing at the moon saying "ball". One reaction is to be amused at the child getting it wrong. Another is, to recognize that he understood the abstract concept of "round". In its nature, the question "What is abstraction?" is a philosphical question that cannot be answered once and for all. Nevertheless, we can try to mathematically formalise abstraction. [^philosophy] In this lecture, I want to explain one powerful way of formalising abstraction mathematically. It was devised in the late 19th century and its usefulness played an important role in why set theory was widely adopted by the mathematical community as a foundation of mathematics. It is known as the technique of quotienting by an equivalence relations. But let us first look at examples of abstraction in software engineering. ## Examples: A Network of Abstractions - On the lowest level we have quantum physics. In quantum physics, measurements assume discrete values. - On the next level, we are dealing with circuits in which voltages assume continous values. Note that no measurement device is accurate enough to actually return a continuity of values. Continuity is an abstraction used for a large number of discrete values. [^Newton] - Next, with the help of some sophisticated engineering, we turn the continous voltages into discrete bits, arriving at the level of chips, addresses, hex. [^chips] - Then we have microprogramming ... - ... and assembly ... - ... and LLVM ... - ... and ... until we arrive at our favourite programming language, say, in my case, - Haskell. At this point, things get really complicated. What looked like a hierarchy in the beginning now turns into a network. For example, we want to design algorithms and solve problems independently of any particular programming language such as Haskell. So we need to find new levels of abstraction. Moreover, the right level of abstraction depends on the problem to be solved ... ## Methods for Constructing Abstractions So we need a wide variety of methods to construct useful levels of abstractions: - Logic - Rewriting - Context-free grammars - Automata, graphs, transition systems, ... - Virtual Machines - Domain Specific Languages - UML - Category Theory - ... One could easily spend a whole semester exploring specific examples and pointing out connections and interdependencies. One overarching theme of these methods is that they should all lead to formal methods and algorithms that can be turned into software tools supporting designers and developers. After having set the scene, I want to introduce one particular way of construction abstractions, namely abstraction by equivalence relation. ## Equivalence Relations One of the most powerful and simple ways of constructing an abstraction is to consider things equal that are different. To come back to the introductory example, my little boy was not confused about the meaning of "ball" when he pointed to the moon, rather he took the meaning of "ball" to be the set of all objects that are round. Then he used the name of one of the objects in this set as a name for the set itself. Equivalence relations allow us to formalise this mathematically. Let me choose an example of colours. We usually pick from the wide range of possible colours a small discrete set. Let me start with a chart that has the following colours: ![](https://i.imgur.com/CsmS87b.png =400x) I imagine I walk through a garden collecting flowers and fruits in a basket. Then I sit down and classify |Object | Colour | |:---:|:---: |dandelion | yellow |daffodil | yellow |lemon | yellow |orange | orange | rose | red |poppy | red |violet | purple |pansy | purple | | violet | | blue | | blue-green | | green Mathematically, this classification is a function $$f: Objects \to Colours$$ Like all functions, this function induces an equivalence relation on objects. The dandelion and lemon are equivalent (from the point of view of $f$) because they are both yellow. Similarly, the rose and the poppy are equivalent. Etc. Symbolically, we write this as \begin{align} \rm dandelion & \equiv \rm lemon\\ \rm dandelion & \equiv \rm daffodil\\ \rm rose & \equiv \rm poppy\\ \rm violet & \equiv \rm pansy \end{align} This relation $\equiv$, the **equivalence relation induced by** $f$, can be formally defined: $$ a\equiv b \ \stackrel{\rm def}\Longleftrightarrow \ f(a)=f(b)$$ Note that this definition makes sense for any function $f$. We say that \begin{gather} \{\rm dandelion, lemon, daffodil\}\\ \{\rm orange\}\\ \{\rm rose,poppy\}\\ \{\rm violet,pansy\} \end{gather} are the **equivalence classes** of the equivalence relation $\equiv$. These equivalence classes can be themselves collected in a set: \begin{gather} \{ \{\rm dandelion, lemon, daffodil\}, \{\rm orange\}, \{\rm rose,poppy\}, \{\rm violet,pansy\} \} \end{gather} This set is called the quotient of the set $Objects$ by the equivalent relation $\equiv$ and written in symbolical notation as $$Objects / {\equiv}$$ The terminology "quotient" will be explained later. Notice how $Objects / {\equiv}$ is an abstraction of $Objects$. It formalises the idea that as far as our colours are concerned, we cannot distinguish the rose from the poppy: both are red. And as long as this is the only observation available, we cannot make out how they are different. From this more abstract point of view, the rose is equal to the poppy. Before we apply this methodology to more interesting examples, I want to introduce one more notation. We write $$[\rm rose]_\equiv$$ for the equivalence class that contains "rose". That is, $[\rm rose]_\equiv = \{\rm rose, poppy\}$. We turn this into a general definition of **equivalence class**: $$[a]_\equiv \ \stackrel{\rm def}= \ \{b \mid b\equiv a\}$$ $a$ is called a **represenatitive** of the equivalence class $[a]_\equiv$. Notice that $$a\equiv b \ \Longleftrightarrow \ [a]_\equiv = [b]_\equiv$$ Informally, we can interpret this as a way to make precise in which sense $a,b$ are simultaneously different and equal, depending on whether we take the concrete (on the left) or abstract (on the right) point of view. ## Applications Every list of applications of such a wide ranging concept as quotienting by equivalence relations suffers from the fact that no individual example can do justice to the importance of the general concept. I limit myself to a few examples. - The first example is a bit artificial but it shows why one talks about "quotienting". - Sets and lists, which point towards more applications to data structures and algorithms. - Some answers to what-is questions in mathematics. - Prices, which show that examples are not limited to mathematics and computer science. ### Quotienting finite sets The following should be pretty obvious if you draw a picture. **Exercise:** Let $A$ be a finite set with $n$ elements and let $\equiv$ be an equivalence relation on $A$ such that each equivalence class has $m$ elements. Then $A/{\equiv}$ has $n/m$ elements. In group theory, this is known as [Lagrange's theorem](https://en.wikipedia.org/wiki/Lagrange%27s_theorem_(group_theory)). ### The data type of sets Mathematicians define sets axiomatically. But for finite sets we can also obtain them as equivalence classes of lists and this gives a common way to implement sets algorithmically. Let us define `≡` as the smallest equivalence relation containing the following for all elements `x` and `y` and all lists `xs`. x:y:xs ≡ y:x:xs x:x:xs ≡ x:xs The first equation says that the order does not matter and the second equation says that duplicates can be eliminated. Mathematically, this defines a set as an equivalence class of lists. From a programming point of view, this means that any library that implements operations on lists can serve as a library for sets as long as we make sure to only choose operations that respect the two equations above. **Exercise:** Concatenation of lists corresponds to [what](https://www.youtube.com/watch?v=vZ9myHhpS9s#t=32s) operation on sets? One general idea to take from here is that in programming we often need to represent data with detail that is irrelevant to the specification of the problem. In these cases, the points of view of the implementation and the specification, or the concrete and the abstract, are related by an equivalence relation. ### Answers to some what-is questions in maths What is an integer? An integer is a pair $(n,m)\in\mathbb N\times \mathbb N$ modulo [^integer] $$(a,b)\equiv(c,d)\ \stackrel{\rm def}\Leftrightarrow \ a-b=c-d$$ [^integer]: As an exercise define arithmetic on these integers. What is a fraction? An fraction is a pair $(n,m)\in\mathbb Z\times \mathbb N_+$ modulo $$(a,b)\equiv(c,d)\ \stackrel{\rm def}\Leftrightarrow \ a\cdot d =b\cdot c $$ What is a real number? An equivalence class of Cauchy sequences. And on [it goes](https://www.newyorker.com/magazine/1969/05/17/dresden). ### Price (Disclaimer: The purpose of this example is only to illustrate some general ideas, not to present some actual results from the field of economics. On the other hand there are writings of 19th century economists that suffer from the fact that at the time quotienting by an equivalence relation was not yet invented.) Imagine a function $$price:Goods\to Prices$$ But what if the economy we want to model has no money? Is there another way to define what the price of a good is? **Exercise:** Assume that we can observe the relation $a\equiv b$ where $a,b$ are goods and $a\equiv b$ is defined as $a$ can be traded for $b$. Assume further that $\equiv$ is an equivalence relation. Is there a way to define $Prices$ and the function $prices$? ## Appendix: Modular Arithmetic Let us model time by the integers $\mathbb Z$. Time point 0 is an arbitrarily chosen starting time. The difference between two consecutive integers models one hour in real time. A clock for us is a device that measures hours on a scale of $\{0,1,\ldots 11\}$. More formally a clock is an implementation of the function $$f:\mathbb Z \to \{0,1,\ldots 11\}$$ mapping an integer $n$ to the smallest non-negative number $m$ such that there is an integer k with $n=12k+m$. **Exercise:** Show that the equivalence relation $$ a\equiv b \ \stackrel{\rm def}\Longleftrightarrow \ f(a)=f(b)$$ satisfies $a\equiv b \Leftrightarrow \exists k.b-a=12k$. **Exercise:** Explain why the equivalence classes of $\equiv$ are also denoted by $$12\mathbb Z + k \quad\quad 0\le k <12$$ The set $\mathbb Z/{\equiv}$ of all these equivalence classes is also denoted by $$\mathbb Z/12\mathbb Z.$$ ## Further Reading I put a few more technicalities in my write-up on - [Equivalence Relations](https://hackmd.io/@alexhkurz/SJ1cc-dDr#Equivalence-Relations). The history of equivalence relation is investigated in - Amir Asghari: [Equivalence: an attempt at a history of the idea](https://link.springer.com/article/10.1007/s11229-018-1674-2). 2018. A typical application of these ideas in computer science is to - [Operational and Denotational Semantics](https://hackmd.io/@alexhkurz/Hkf6BTL6P). The notion of an equivalence relation is closely related to - Leibniz's Principle of [The Identity of Indiscernibles](https://plato.stanford.edu/entries/identity-indiscernible/). A new mathematical theory of equality that allows us to keep distinct objects distinct and simultaneously treat them as equal is [Homotopy Type Theory](https://homotopytypetheory.org/). My favourite first introduction to the topic is - Daniel Grayson: [An introduction to univalent foundations for mathematicians](https://arxiv.org/pdf/1711.01477). 2018. [^philosophy]: Some of the greatest advances in mathematics and physics can be understood as answers to what-is questions about geometry (Euclid, Klein), instantaneous change (Newton), numbers (Dedekind), heat (Boltzmann), chaos (Poincare), proof (Hilbert, Gentzen), gravity (Einstein), computation (Church, Turing, Goedel), algebra (Noether), games (von Neumann), probability (Kolmogorov), information (Shannon), concurrency (Milner), causality (Pearl), names (Gabbay and Pitts), equality (Martin-L&ouml;f, Voevodsky), etc. [^Newton]: The insight that by abstracting from a large number of discrete values to an infinity of continuous values we can actually simplify the mathematics took the genius of a Newton. [^chips]: This is an area I know very little about. I'd guess that one divide this again in several layers of abstractions.