Monoids

Automata and Lists

(not in final form)

Part 1 of a short course on monads. See also the introductory video on What is algebraic structure?

We start with monoids as an example of algebraic structures, then study the structure of a particular monoid, the free monoid (=monoid of finite words over an alphabet). This structure will later turnout to be a monad. For now, we will be happy to study this structure via its implementation in Haskell.

In the first lecture we concentrated on the sections labelled Examples, Axioms, and Homework. We did a quick review of the remaining material at the beginning of the second lecture. For a first short route through the material of this lecture

watch the viedo on monoids and free monoids,
do the homework on monoids introduced in the video on the list-monad,
do the homework on an error monad introduced in this video.

Programs: monoids, monoidshs, wrap.

Examples

Deterministic Finite Automata

Finite state machines or deterministic finite automata (DFA) are given by a function

I n p u t S y m b o l s \times S t a t e s \to S t a t e s

Σ \times Q \to Q

Notation: "sigma" reminds of "symbol" and "Q" for the set of states is traditional.

Σ

is also called the alphabet. It is common to denote elements of

Σ

a, b, c,

etc. For more background I refer to Chapter 2 of the influential textbook Introduction to Automata Theory, Languages,and Computation. For most of the following, I will ignore initial and final states, which are also part of the definition of a DFA.

At this point I recommend to have a quick look at my Very Short Introduction to Automata and Haskell. The following assumes that the reader is somewhat familiar with DFAs.

We can think of each letter

a \in Σ

as a function

\overset{―}{a} : Q \to Q

These functions compose: For any two letters

a, b

there is a word of two letters

a b

which corresponds to another function

\overset{―}{a b} : Q \to Q

and maps a state

q \in Q

(\overset{―}{b} \circ \overset{―}{a}) (q) = \overset{―}{b} (\overset{―}{a} (q)))

Notation: I apologize for switching the order of

a

and

b

. In particular, the equation

\overset{―}{a b} = \overset{―}{b} \circ \overset{―}{a}

is annoying, but, at some point, unavoidable because it stems from the clash of two traditions:

Mathematicians write composition of functions

$A \overset{f}{\to} B \overset{g}{\to} C$

as
$g \circ f$ because
$g \circ f$ is defined as

$g \circ f (x) = g (f (x)) .$
Computer scientists read a word
$a b$ from left to right as "first input
$a$ , then input
$b$ ".

A reasonable compromise seems the following. We introduce to the mathematical notation a new symbol "

;

" which we define as

f; g = g \circ f

Here

;

reminds us of sequential composition in programming languages "first

f

, then

g

We can push this even further and write

x . f

instead of

f (x)

, reminding us of the notation x.input_a() in OO-languages such as Java. With this in mind, I will sometimes write

x . f = f (x)

Note that if we put everything together we have

\begin{aligned} x . (f; g) & = (x . f) . g \\ \overset{―}{a b} & = \overset{―}{a}; \overset{―}{b} \\ q . \overset{―}{a} . \overset{―}{b} & = q . (\overset{―}{a}; \overset{―}{b}) \end{aligned}

where I dropped, for better readability, one pair of parentheses on the left-hand side of the last equation.

As we will see later on, catgory theorists love to write mathematics in diagrammatic form. We can already see why this is a good idea. After all, the long discussion on notation above was only about a diagram of the form

∙ \to ∙ \to ∙

Words, Strings, Lists, …

Monoids are closely related to a "data type" the elements of which are variously known as words, strings, lists, tuples, or sequences. As this data type is of fundamental importance in mathematics and programming there are many different notations for it. Take note of those familiar to you and feel free to ignore the others.

In the examples below we take

Σ

to be the latin alphabet.

domain	an element of the data type is called	neutral element (length $0$ )	example of length $3$
English	word		$t h e$
Automata and Formal Languages	word	$ϵ$	$t h e$
Maths	sequence, tuple	$()$	$(t, h, e)$
Programming	string	`""`	`"the"`
Programming	list	`Nil`	`Cons 't' (Cons 'h' (Cons 'e' Nil))`
Programming	list	`[]`	`["t","h","e"]`

This list is not exhaustive, many further variations do exists both in math and in programming. In the following I may (possibly inadvertently) switch between these notations. Give me a shout if you are in doubt.

The Free Monoid

(essential)

Given an set

Σ

, we denote by

Σ^{*}

, or sometimes

L i s t (Σ)

, the set of all words over

Σ

Notation: If we think of the elements of

Σ^{*}

as words or strings,

Σ

is called an alphabet. But this is not the only possible interpretation of

Σ

. Instead of

Σ

, we also use

X

, in particular if we think of the elements of

X

not as input symbols of an automaton but as variables used in terms and equations such as

x \cdot (y \cdot z) = (x \cdot y) \cdot z

. We will hear much more about variables, terms, equations throughout this course.

Terminology: The operation that takes a set

X

to the set

X^{*}

of words over

X

is known as the Kleene star.

X^{*}

is known as the free monoid over

X

, a terminology that will be explained below.

The set of words

X^{*}

carries additional structure: Words can be concatenated and there is an empty word. And there is much more structure, some of which we will explore now.

The free monoid can be represented in different, and equally useful, ways. The first two will appear again in the last lecture when we will define monads and are named for Eilenberg and Moore and Kleisli. If you want to avoid the mathematical notation for now, you can instead also look at the Haskell code in monoids. The mathematical notation has the advantage of being independent of any particular programming language. ^[1]

Table Comparing Notations

In fact, it may be useful to give a table of translations already now, even if not all the notation will be explained right here: ^[2] ^[3]

my Lists in Math	my Lists in Haskell	built-in lists in Haskell	Generic Monads in Math	Generic Monads in Haskell
$X^{*}$	`List a`	`[a]`	$T X$	`m a`
$f^{*}$	`mapList f`	`map f`	$T f$	`fmap f`
$[x]$	`eta x`	`[x]`	$η_{X} (x)$	`return x`
$μ$	`mu`	`concat`	$μ_{X}$	`join`
$g^{♯} (t)$	`lift g t`	`t >>= g`	$g^{♯} (t)$	`t >>= g`

The first column shows widely used notation for the list monad (=free monoid monad) in mathematics, which I adapted, in the second column for my implementation of the list-monad in Haskell, see monoids. If you want to use instead the built-in Haskell list-monad use the notation of the third column, see monoidshs. Widely used notation for generic monads in mathematics is given in the fourth column, while the fifth column does the same for Haskell, see also here and here.

The following three Notations can be skipped on first reading. The idea is that, in order to define the list monad, there are various equivalent ways to choose the basic operations from which the others can be defined.

Notation 1 (Eilenberg-Moore)

$η : X \to X^{*}$ defined as
$η (x) = [x]$ ,
$(-)^{*} : (X \to Y) \to X^{*} \to Y^{*}$ mapping
$f : X \to Y$ and
$[x_{1}, \dots x_{n}]$ to
$[f (x_{1}), \dots f (x_{n})]$ ,
$μ : (X^{*})^{*} \to X^{*}$ flattening a list of lists to a list. For example,
$μ_{N} ([[], [1, 2, 3], [], [4, 5], [6]]) = [1, 2, 3, 4, 5, 6]$ .

Notation 2 (Kleisli)

$η : X \to X^{*}$ as above,
For every
$g : X \to Y^{*}$ an extension (or substitution)
$g^{♯} : X^{*} \to Y^{*}$ .

Notation 3 (operations)

an empty word
$ϵ \in X^{*}$ ,
the multiplication
$\cdot : X^{*} \times X^{*} \to X^{*}$ concatenating two lists.

Exercise: Compute

η (1)

f^{*} ([1, 2, 3])

where

f (x) = x + 1

μ ([[1, 2, 3], [4, 5], [6]])

. Note that

X

above can be

X^{*}

itself. So it makes sense to evaluate

η ([1, 2, 3])

and

η^{*} ([1, 2, 3])

and

μ ([[[1, 2], [3, 4]], [[5, 6]]])

. To run these and more examples go to monoidshs.

Programming Activity: Think about whether these functions are in your your list library. Can you find some equations that these operations must satisfy? Write some list processing algorithms that make use of these functions. ^[4]

The next exercise contains the essential insights needed to prove that Notation 1, 2, and 3 are equivalent.

Exercise: Define

μ

in terms of

\cdot

and vice versa. Define

(-)^{♯}

in terms of

(-)^{*}

and

μ

. Define

μ

in terms of

(-)^{♯}

. Test your definitions by implementing them in monoidshs.

Remark: The equations for Notations 1 and 2 are generic while the ones of Notation 3 depend on a particular presentation. If that remark is not clear now, I hope it will become clear as we go on. Ask me if in doubt.

Axioms

(important)

We will now use Notation 3 to formally define monoids. We will come back to Notation 1 and 2 in the exercises and when we define monads in Lecture 4.

A monoid

(M, e, \cdot)

is a set

M

together with an element

e \in M

and a binary operation

\cdot : M \times M \to M

satisfying for all

x, y, z \in M

\begin{aligned} e \cdot x & = x & (l e f t i d e n t i t y) \\ x \cdot e & = x & (r i g h t i d e n t i t y) \\ x \cdot (y \cdot z) & = (x \cdot y) \cdot z & (a s s o c i a t i v i t y) \end{aligned}

A (right) monoid action

(M, A, .)

consist of a monoid

M

, a set

A

and a function

∙ : A \times M \to A

such that

a ∙ ϵ = a a ∙ (x \cdot y) = (a ∙ x) ∙ y

Remark: One can also define a left action. Or an action both from the left and the right. ^[5]

Activity/Exercise: How do you have to map the notation

\cdot

and

∙

of the definition to the notation of the introductory example of DFAs in order to see that a DFA is an "action of the free monoid on the set of states"?

More Examples

The following are examples of monoids.

The set
$X \to X$ of functions on a set
$X$ . (Such functions are also called endo-functions).
The set of binary relations on a set
$X$ .
The set of natural numbers with addition.
The set of natural numbers with multiplication.
The set of integers, rationals, etc …
A group is a monoid in which every element as an inverse. ^[6]
A ring consists of a group and a monoid that interact via a distributive law.
Every geometry is a group.
Time is a monoid.
$λ$ -calculus.
…

Theory

(can be skipped for now, but these topics will show up later again)

There is a lot of theory to choose from …

Equational reasoning by reduction to normal form

We have seen that monoids are defined up to the equations of identity and associativity. This means that there are a lot of terms/expressions that are equivalent but look different. For computation and reasoning that is a problem. How do we know that two terms are equivalent? Is equivalence even decidable? Is there always a "canonical" or "normal" term we can pick? This is not always possible, but if it is, it makes life much easier.
Representation of monoids.

We have seen that functions give rise to monoids. What about the converse? In fact, every monoid can be represented as a monoid of functions. This representation theorem is important because it answers the question: What is a monoid? In other words, the monoid axioms capture exactly the properties of function composition. Moreover, its categorical generalisation, the Yoneda lemma, is very powerful and the cornerstone of category theory.
Quotienting by congruence relations

When can we simplify a monoid by identifying several of its elements. We introduce congruence relations and show that they allow us to quotient monoids. This allows us to abstract, or simplify, in a structure preserving way. For example, we can use this to minimize automata, a particular way to do data compression.

In the following I will only look at the 2nd item. It is relevant conceptually, but not needed technically.

The 3rd item is interesting because algebras for a monad are closed under congruence relations. But while important, this observation is not central to what follows in this course. (I may fill it in later, though.)

Every monoid is a monoid of functions

Functions form monoids because identity is a function and functions have associative composition.

But some monoids, such as

(N, 0, +)

do not look as if their elements were functions. Or can we think of numbers as functions?

In fact we can …

Proposition: Every monoid can be represented as a monoid of functions.

Proof (Sketch): Let

(M, e, \cdot)

be a monoid. For all

m \in M

define a function

\begin{aligned} \hat{m} : M & \to M \\ x & \mapsto x \cdot m \end{aligned}

Remark: Using notation from

λ

-calculus we could write the definition of

\hat{m}

as ^[7]

\hat{m} = λ x . x \cdot m

Exercise: Finish the proof above. Interpret the construction of the proof in various examples.

Remark: In group theory, the proposition is known as Cayley's Theorem (all groups are groups of permutations).

Quotienting by a congruence relation

Here I assume familiarity with the notion of equivalence relation and the quotient of a set by a an equivalence relation. See What is Abstraction? for a general introduction and Equivalence Relations for a summary of the relevant technicalities.

Definition (congruence relation): Let

(M, e, \cdot)

be a monoid. A congruence relation

\equiv

M

is an equivalence relation on

M

such that ^[8]

\frac{x \equiv x^{'} y \equiv y^{'}}{x \cdot y \equiv x^{'} \cdot y^{'}}

This rule should be obviously true if you think of

\equiv

=

. Conversely, any relation with this property can be thought of as an equality relation on

M

. This is formalised in the next proposition.

Proposition: The quotient of a monoid by a congruence relation is a monoid.

Maths Exercise: Prove the proposition.

Answer. ^[9]

The mathematical essence of the definition of a monad we will see in Lecture 4 can be understood as an axiomatic generalisation of the following fact.

Proposition: Every monoid

(M, e, \cdot)

is a quotient of the free monoid over

M

Maths Exercise: Prove the proposition.

Answer. ^[10]

Applications

(incomplete … can be skipped)

We first show with the help of congruence relations that every automaton can be minimised.

Then we indicate briefly that all we learned about monoids in this lecture extends to general algebras, that is, mathematical structures defined by arbitrary operations and equations.

General algebras are studied in the area of mathematics called Universal Algebra. Universal algebra is closely related to monads. In fact, much of universal algebra can be seen as the study of monads from the point of view of operations and equations. In these lectures, we will not have the time to introduce universal algebra and will just say enough to be able to point out some connections and directions a curious reader might want to follow up.

One of the many connections of universal algebra with programming is that many data types can be understood as algebras.^[11] Another connection is given by an (until recently unsolved) problem in the area of constraint satisfaction.

Minimal Automata

Minimising DFAs is such an important optimisation technique for solving many practical problems (such as searching for patterns in text) that there are various different ways to construct them. I choose here one that emphasises the point of view of automata-as-monoids.

Every DFA

A

over

Σ

defines defines an equivalence relation

\equiv

Σ^{*}

as follows.

w_{1} \equiv w_{2}

iff for all

a \in A

w_{1} ∙ a = w_{2} ∙ a

. The quotient

Σ^{*} / \equiv

is called the syntactic monoid of the automaton. The syntactic monoid is also the minimal automaton accepting the same language.
- The initial state is the equivalence class of the empty word.
- The transition function is given by

w ∙ [v]_{\equiv} = [w \cdot v]_{\equiv}

.
- The final states are the equivalence classes of …

Homework

If you have time for just one homework, do the one on monoids. We will use it later.

(Maths) Equivalence of Eilenberg-Moore and Kleisli notation

If you want to do some mathematics show the following.

Given
$η$ ,
$(-)^{*}$ ,
$μ$ define
$(-)^{♯}$ .
Given
$η$ and
$(-)^{♯}$ , define
$(-)^{*}$ and
$μ$ .
Show that doing Step 1 and then Step 2 leads us back where we started. Similarly for doing first Step 2 and then Step 1.

There is more to the equivaence of Eilenberg-Moore and Kleisli, namely the preservation of certain equations. We will come back to this in Lecture 4.

Btw, the Haskell implementation below can help with the solving this exercise.

(Programming) Implementing Monads in Haskell: Introduction

The aim of the homework is to practice translating from maths to Haskell and back. We look at two different kind of algebraic structures.

Monoids

We learn how to translate into Haskell some of the math above, in particular the Eilenberg-Moore and the Kleisli notation.
An Error Monad

We build the simplest possible monad in Haskell and investigate it from mathematical point of view.

In both cases we use the algebraic/monadic approach in order to take into account that data obeys equations:

In the case of monoids we do not implement the equations of identity and associativity directly. Instead we define lists first, then define multiplication/composition/concatenation by induction and then prove that this makes a monoid. It is interesting to think about why we need to do it this way. ^[12]

In the case of the error monad we take the equations that will later define a monad as a specification and ask to implement the functions of Notation 1 and 2 so that they satisfy this specification.

(Programming) The Free Monoid Monad

(essential, we will build on this throughout the following lectures)

Above we defined the so-called free monoid

X^{*}

over a set

X

. We can write this as a Haskell data type, essentially just writing a for

X

, List a for

X^{*}

, Nil for the empty word and Cons for the operation that prefixes a word by a letter.

data List a = Nil | Cons a (List a)

To experiment follow the link monoids. I also made a version that uses Haskell's predefined lists in monoidshs. ^[13]

For List to be a monoid, we need a neutral element and multiplication. The neutral element will be Nil. Multiplication (which is concatenation) can be defined by recursion:

mult :: List a -> List a -> List a   
mult Nil xs = xs
mult (Cons x xs) ys = Cons x (mult xs ys)

Exercise: Do Nil and mult satisfy the equations of identity and associativity? Write some Haskell code that is testing these laws.

Next we go back to Notation 1 and 2 above, where we introduced functions

\begin{aligned} f^{*} & : X^{*} \to Y^{*} & f o r f : X \to Y \\ [-] & : X \to X^{*} \\ μ & : (X^{*})^{*} \to X^{*} \\ g^{♯} & : X^{*} \to Y^{*} & f o r g : X \to Y^{*} \end{aligned}

In Haskell we can implement these functions as follows ^[14]

mapList :: (a -> b) -> (List a -> List b)
mapList f Nil = Nil
mapList f (Cons x xs) = Cons (f x) (mapList f xs)

eta :: a -> List a 
eta x = Cons x nil 

mu :: List (List a) -> List a 
mu Nil = Nil 
mu (Cons xs xss) =  mult xs (mu xss)

lift :: (a -> List b) -> List a -> List b 
lift g xs = mu (mapList g xs)

Exercise: Execute these functions pen-and-paper on some small examples. (You also may want to find more instructive names for the operations … I kept names generic as they will appear later in different contexts.)

Exercise (Eilenberg-Moore): Write Haskell functions that test the following equations on input data of your choice. (Recall that

f^{*}

is mapList f.)

\begin{matrix} μ (η (x)) = x \\ μ (η^{*} (x)) = x \\ μ (μ^{*} (x)) = μ (μ (x)) \end{matrix}

Note: In the first two equations

x

is any list, but in the last equation

x

is a list of lists of lists.

Exercise (Kleisli): Write Haskell functions that test the following equations on input data of your choice.

\begin{matrix} η^{♯} (x) = x \\ g^{♯} (η (x)) = g (x) \\ (λ x . g^{♯} (f (x)))^{♯} (x) = g^{♯} (f^{♯} (x)) \end{matrix}

(Programming) An Error Monad

(important but optional)

Let us define a data type that allows us to wrap any other data type into one that adds the possibility of an error as result. You can experiment with the code following the link to wrap. We define

data Wrap a = Error | Value a

Mathematically, we can think of Wrap as an operation that takes a set

X

and maps it to

{E r r o r} + X

, where

+

denotes here disjoint union of sets. (One often sees the notation

1 + X

instead of

{E r r o r} + X

.) This disjoint union comes, by definition, with two functions

\begin{matrix} E r r o r : 1 \to 1 + X \\ V a l u e : X \to 1 + X \end{matrix}

Where do the equations come in? In the case of the monoids we knew that we had to take into account the laws of identity and associativity. On the other hand, there do not seem to be any obvious equations relating errors and values.

But if we go back to the monoids we see that the equations come into play when we iterate the type constructor

mu :: List (List a) -> List a

In fact, it is mu that encodes the laws of identity and associativity.

Something similar happens for Wrap. After all, error hanling through functions that call each other recursively can lead to a whole stack Wrap (Wrap ( ...)...) and in order to extract an error or value from this stack, we need to flatten it. What are the equations for this flattening?

Exercise: Justify the following equations in terms of error handling:

\begin{matrix} V a l u e (E r r o r) = E r r o r \\ V a l u e (V a l u e (x)) = V a l u e (x) \end{matrix}

We write this in Haskell as

mu :: Wrap (Wrap a) -> Wrap a
mu Error = Error
mu (Value Error) = Error
mu (Value (Value x)) = Value x

So far so good. But for an error handling library we need more functionality. We need a function

eta :: a -> Wrap a

that wraps a value, a function

mapWrap :: (a -> b) -> Wrap a -> Wrap b

that allows to change wrapped values, and a function

lift :: (a -> Wrap b) -> (Wrap a -> Wrap b)

that allows us to chain computations.^[15]

Exercise: Head over to wrap and implement these functions so that they pass the tests specified in the file.

Summary and Outlook

We started with monoids for several reasons. First, they are a fundamental structure in computer science, not least because monoid actions model time-irreversible behaviour. Second, monoids are one of the simplest algebraic structures. Third, groups are special monoids and play a central role in much of mathematics. Fourth, we hinted at the relevance of these structures to algebraic data types. In the next lecture, we will explain this in more detail. We will also generalised monoids to universal algebras and give our first axiomatic answer to the question of what is algebraic structure.

References

For more on automata see

Hopcroft, Motwani, Ullman: Introduction to Automata Theory, Languages, and Computation, 3rd edition, 2006.

For more on the List monad in Haskell see

School of Haskell

Code

-- https://hackmd.io/@alexhkurz/B1oIuXxRv

-- the free monoid over `a`
data List a = Nil | Cons a (List a)
    deriving (Show, Eq)

-- multiplication / concatenation
mult :: List a -> List a -> List a   
mult Nil xs = xs
mult (Cons x xs) ys = Cons x (mult xs ys)

-- fold a `List a` over a monoid `b -> (b->b->b)`
foldMon :: (a->b) -> b -> (b->b->b) -> List a -> b
foldMon f e op Nil = e
foldMon f e op (Cons x xs) = op (f x) (foldMon f e op xs) 

mapList :: (a -> b) -> (List a -> List b)
mapList f Nil = Nil
mapList f (Cons x xs) = Cons (f x) (mapList f xs)

eta :: a -> List a 
eta x = Cons x Nil 

mu :: List (List a) -> List a 
mu Nil = Nil 
mu (Cons xs xss) =  mult xs (mu xss)

lift :: (a -> List b) -> List a -> List b 
lift f xs = mu (mapList f xs) 

-- testing 
display :: List a -> [a]
display Nil = []
display (Cons x xs) =  x:(display xs)

{- does not work:
display :: List a -> [a]
display Nil = []
display (Cons x xs) =  case x of 
  Nil -> []:(display xs)
  Cons y ys -> (display (Cons y ys)):(display xs)
  x -> x:(display xs)
-}

-- add your own tests
main = do 
  let l123 = Cons 1 (Cons 2 (Cons 3 Nil))
  let l456 = Cons 4 (Cons 5 (Cons 6 Nil))
  let l789 = Cons 7 (Cons 8 (Cons 9 Nil))
  let l = l123 
  let element = 7 
  let f = (\ x -> eta (x + 1))
  let g = (\ x -> eta (2*x))
  let lift_f_after_g x = lift f (g x) 
  let l3 = Cons (Cons l123 (Cons l456 Nil)) (Cons (Cons l789 Nil) Nil)

  putStr "test multiplication: [1,2,3] [4,5,6] == " ; print $ display $ (mult l123 l456)

  putStr "fold + over [4,5,6]:  " ; print $ foldMon (\x->x) 0 (\x->(\y->(x+y))) l456
  putStr "fold * over [4,5,6]:  " ; print $ foldMon (\x->x) 1 (\x->(\y->(x*y))) l456 

  print "test Eilenberg-Moore laws for eta, map, mu:"
  print $ mu ( mapList eta l123) == l123
  print $ mu (eta l123) == l123
  print $ (mu (mu l3)) ==  mu (mapList mu l3)

  print "test Kleisli laws for eta, lift:"
  print $ (lift eta l123 == l123)
  print $ (lift f (eta element) == f element)
  print $ (lift lift_f_after_g l123) == lift f (lift g l123)

Monads can be implemented in any programming language. Here is an example of monads in ML. ↩︎
The notation
$η, μ$ is widely used. The use of
$(-)^{*}$ is specific to the example of words. In fact, other sources use
$(-)^{*}$ for what I write
$(-)^{♯}$ . I may use
$M$ later for monads but I already use
$M$ for monoids and
$T$ is another conventional letter for monads.
$T$ may remind us of "triple" (an older name for monads) or of "term" (we will see that the elements of
$T X$ can be understood as equivalence classes of terms over variables in
$X$ ) or of "type" (many important type constructors are monads). ↩︎
t >>= g, or should we write as >>= bs is usually written as
```
do a <- as
   bs a
```
↩︎
If you know monads in Haskell try to match them up with the Kleisli notation. ↩︎
DFAs are right actions. DFAs cannot be reverse because not all functions have inverses. On the other hand, relations can be reversed and non-deterministic automata give rise to both left and right actions. ↩︎
(eg integers with addition but not integers with multiplication) ↩︎
There is an analogy here with Church numberals. Church numerals are functions used to represent numbers. Here we did go the other way round and used numbers to define functions ↩︎
Read
$\frac{A}{B}$ as
$A \Rightarrow B$ or "
$A$ implies
$B$ " or "if
$A$ then
$B$ " or "assuming
$A$ conclude
$B$ ". ↩︎
We define the quotient
$M / \equiv$ as the set
${[m]_{\equiv} ∣ m \in M}$ , where
$[m]_{\equiv}$ is the set of equivalence classes of
$\equiv$ . The neutral element is
$[e]_{\equiv}$ . Multiplication is defined as

$[x]_{\equiv} \cdot [y]_{\equiv} = [x \cdot y]_{\equiv}$

We have to show that this definition does not depend on the choice of "representatives"
$x, y$ . But this is exactly what the definition of congruence guarantees. ↩︎
We define
$f : L i s t (M) \to M$ inductively as
$f ([]) = e$ and
$f ([m, m_{1}, \dots m_{n}]) = m \cdot f (m_{1}, \dots m_{n})$ . It remains to show that
$l \equiv l^{'} \overset{d e f}{\Leftrightarrow} f (l) = f (l^{'})$ is a congruence relation. These type of definitions always give equivalence relations. So we need to show that the congruence rule holds. This involves a somewhat tedious unfolding of the relevant definitions plus one lemma that is of independent interest,
$f (x \cdot x^{'}) = f (x) \cdot f (x^{'})$ , which can be proved by induction on the structure of lists. ↩︎
The idea of so-called abstract data types is easy to explain. An interface for a data type such as queues can be described independently from any implementation by operations
```
nil: Queue
enqueue: Queue x Elem -> Queue
dequeue: Queue -> Queue
head: Queue -> Elem
```
The operation head may also be called front (for fifo) or end (for lifo), but here we want to emphasize that both lifo and fifo are based on the same types (or sorts) Queue and Elemand the same operations (or signature).

Nevertheless, the difference between lifo and fifo does show up in an implementation-independent, purely algebraic way, namely in the equations that hold for them.

For example, for lifo we have
```
dequeue(enqueue(s,e)) = s
head(enqueue(s,e)) = e
```
which do not hold for fifo. (Exercise: Write out more equations for lifo and fifo.) ↩︎
Associativity tells us that all terms can be normalised by moving all parentheses to the left (or right). Lists can be understood as an inductive definition of these normal forms. ↩︎
Haskell's predefined lists are more readable and but one must rely on some "secret" Haskell internals. ↩︎
Renaming
$X, Y$ to a,b and
$[-]$ to eta and
$f^{*}$ as mapList f and
$μ$ as mu and
$g^{♯}$ as lift g. ↩︎
Suppose we have two functions
```
f :: (Int -> Wrap Int)
g :: (Int -> Wrap Int)
```
and we want to compute
$f$ of
$g (x)$ for some integer
$x$ . Notice that the types don't match up, we cannot simply write
$f (g (x))$ or f (g x). The solution is to use f (lift g x). ↩︎

Monoids

Automata and Lists

Examples

Deterministic Finite Automata

Words, Strings, Lists, …

The Free Monoid

Table Comparing Notations

Notation 1 (Eilenberg-Moore)

Notation 2 (Kleisli)

Notation 3 (operations)

Axioms

More Examples

Theory

Every monoid is a monoid of functions

Quotienting by a congruence relation

Applications

Minimal Automata

Homework

(Maths) Equivalence of Eilenberg-Moore and Kleisli notation

(Programming) Implementing Monads in Haskell: Introduction

(Programming) The Free Monoid Monad

(Programming) An Error Monad

Summary and Outlook

References

Further Reading

Code

Read more

Create, Detect, Lift, Preserve, Reflect (Co)Limits

A Very Short Introduction to Monads

Category Theory - Axiomatic Theory of Structure

Category Theory in Computer Science