$\newcommand{\sem}[1]{[\![#1]\!]}$
Universal Algebra 3: Variables, Free Algebras
===
"In order to find the value of an expression it is necessary to know the value of its components. Thus to find the value of a + 5 + b/a we need to know the values of a and b. Thus we speak of evaluating an expression in an environment (or sometimes relative to an environment) which provides the values of components."
Christopher Strachey, [Fundamental Concepts in Programming Languages
](https://www.cs.cmu.edu/~crary/819-f09/Strachey67.pdf)
## Introduction
In the previous lecture we learned about
- term-algebras
- homomorphisms
- initiality
In this lecture we will see how to add
- variables
- substitutions
- equations
into the picture. We start with some general consideration about variables, then run again through our running example from the syntax and semantics lecture and think about how to extend it with variables. This then leads to the new notion of free algebras.
## What are variables?
There are at least two possible answers.
Let us think of natural numbers with addition, that is, the algebra $(\mathbb N,0,+)$.
### Variables as place holders
For example, going back in school to when we first encountered variables, we wanted to express, for example, that $+$ is commutative, that is
\begin{align}
1+2 &=2+1\\
1+3 &= 3+1\\
\ldots &= \dots \\
2+3 &= 3+2 \\
\ldots &= \dots
\end{align}
and then found that it is much easier to write instead
$$ x+y=y+x$$
So, in the first answer, variables are place-holders for the elements of the algebra and just a clever notation to avoid writing infinitely many equations.
### Variables as first class citizens
What distinguishes high-school algebra from primary school computations with numbers?
One of the main ideas introduced in high-school algebra is that of variables as first-class citizens. For example, when we do computations such as
$$x^2-1=(x+1)\cdot(x-1)$$
to find the 0's of the quadratic equation
$$y=x^2-1$$
we treat the variables as part of the data ("first class citizens"), not just as place-holders that represent numbers.
For example, we say that $y=x^2-1$ "is a parabola", which makes no sense when we replace $x,y$ by individual numbers as in $y=0$ and $x=1$.
### Meta-variables vs object-variables
In programming languages we often call the programming language that is studied the *object language*, as opposed to the *meta-language*, which is the language that is used to study the programming language, typically mathematics or English or a mix of the two. We have seen this distinction of object-language and meta-language at work in the lecture on [syntax and semantics](https://hackmd.io/hILQksyiTUW4mXxxOSF7eQ) where the language of expressions `exp` is the object language but, for example, the numbers $\mathbb N$ and the meaning function $\sem{-}$ are part of the meta-language.
In primary school the object languages consists of numbers and operations $+,*,-,/$. Later, when we learn first about variables, they are ***meta-variables***, which we use to denote numbers in the object language. The step to algebra consists in adding variables to the object language. To distinguish these variables from meta-variables we call them ***object-variables***.
For example, when we computed above
$$x^2-1=(x+1)\cdot(x-1)$$we did not do this computation for all possible numbers, but we used rules of computation that did directly manipulate the symbols $x,y$. In other words, variables are "objects" of rules of computation.
We will encounter below further examples of meta-variables and object-variables.
### Outlook on variables in programming languages
Of course, in programming languages variables also play a crucial role, both as mathematical variables and as names for memory cells and other resources.
- Variables are sometimes called "names" or "nominals" or "identifiers" if the operation of interest is the equaltiy test $x=y$ (which here is not a comparison of values but a comparison of the names themselves). [^names]
- Mathematical variables are names that can have a value. They also can be substituted.
- Programming variables are variables the value of which can change over time.
- Both mathematical and programming variables can be free or bound, and can have scope. We will hear much more about this when we come to $\lambda$-calculus.
In this lecture, we will only look at the simplest kind of mathematical variables.
To see what is at issue let us go back to the example of the first lecture.
## Example: Arithmetic Expressions
Recall from the lecture on [syntax and semantics](https://hackmd.io/hILQksyiTUW4mXxxOSF7eQ) that we started with an example of a formal language given by
exp ::= 1 | exp + exp | exp * exp
and we defined its semantics by
\begin{align}
\sem{-}:\mathcal L&\to \mathbb N\\\hline
\sem{1}& =1\\
\sem{t+t'}& =\sem{t}+\sem{t'} \\
\sem{t*t'}& =\sem{t}\cdot\sem{t'}
\end{align}
What happens if we add variabls to the language?
Pause the lecture and think about it.
First, we need to add variables to the syntax. We can add one variable `x` by defining
exp ::= 1 | x | exp + exp | exp * exp
or we can have a set `Var` of variables all of which we may want to use in expressions (Question: How many variables are there in your favourite programming language?) and we write
exp ::= 1 | v | exp + exp | exp * exp v in Var
**Remark:** In the first definition of `exp` the symbol `x` is the variable we want to use in the expressions. In the second definition `v` is a place-holder that ranges over the set of variables `Var`.[^variable1] [^variable2] For example, if `Var` is the set consisting of `x,y,z`, then the second definition of `exp` is a short hand for
exp ::= 1 | x | y | z | exp + exp | exp * exp
In programming terminology we can think of the `v in Var` as a macro that is expanded to turn the second definition of `exp` into the third. In practice, the notational distinction between a variable `x` and the place-holder `v` that ranges over variables such as `x` is usually not made. So don't worry too much about it now. But distinctions such as these will become important again when we will look at compilers next semester. (End of Remark)
How can we modify the semantics to account for variables? We need to add to the three clauses of
\begin{align}
\sem{-}:\mathcal L&\to \mathbb N\\\hline
\sem{1}& =1\\
\sem{t+t'}& =\sem{t}+\sem{t'} \\
\sem{t*t'}& =\sem{t}\cdot\sem{t'}
\end{align}
one clause for variables. But where do we map variables to? What should $\sem{x}$ be?
Something is missing here ... so let us just add it!
## Environments
The solution is to introduce an "environment", that is a function $\rho:\mathsf{Var}\to\mathbb N$ (pronounced as "rho") that evaluates variables. We can then define the semantics $\sem{}_\rho$ depending on the parameter $\rho$ as follows:
\begin{align}
\sem{-}_\rho:\mathcal L&\to \mathbb N\\\hline
\sem{1}_\rho& =1\\
\sem{t+t'}_\rho& =\sem{t}_\rho+\sem{t'}_\rho \\
\sem{t*t'}_\rho& =\sem{t}_\rho\cdot\sem{t'}_\rho\\
\sem{x}_\rho& =\rho(x)
\end{align}
**Remark:** Notice that $\sem{-}_\rho$ preserves $1,+,*$ but not $x$. In other words, $\sem{-}$ is a homomorphism wrt the signature $\{1,+,*\}$ but not wrt the signature $\{1,+,*,x\}$.
**Remark:** It is typical for universal algebra that the $\rho$ in the definition above always stays the same. This will change as soon as we start applying this methodology to programming languages that have assigments or variable binding or higher-order functions. But for now it is good to explore in some depth the simpler situation. (Our programming language is now that of a calculator that can deal with (mathematical) variables but does not allow the user to define functions or write programs.)
The next step is to look at the above definition of $\sem{-}$ trough the eyes of universal algebra. One of the central notions of universal algebra is that of a free algebra.
## Free algebras
The aim of this section is to show define so-called free algebras and to show how they are used to describe syntax and semantics of languages that inlcude variables.
There are at two different ways to introduce free algebras. A concrete one, via term-algebras, and an abstract one, generalising the universal property of initiality.
### Term-algebras over a set of variables
The first one emphasises the similarity of variables and constants: The free algebra over a set of variables $X$, is the term algebra that is obtained when adding to the operations of the signature also the variables $x\in X$ as new constants symbols. At this point, the only difference between constants and variables is that the variables are not part of the signature and, therefore, do not need to preserved by homomorphisms.
**Example:** Let $\Sigma=\{1,+,*\}$ be the signature of our running example, that is, with one constant 1 and two binary operations $+,*$. The free algebra over the set $X=\{x,y,z\}$ of variables is the language $\mathcal L_X$ given by
exp ::= 1 | exp + exp | exp * exp | x | y | z
with operations $1^{\mathcal L_X}$, $+^{\mathcal L_X}$, $*^{\mathcal L_X}$ defined in exactly the same way as in the section on [term-algebras](https://hackmd.io/s/Bymo_vCj7#Termalgebras) in the previous lecture.
It is clear that the example above generalises to arbitrary signatures. This leads to new
**Notation:** We denote by $Term_\Sigma(X)$, or just $Term(X)$ the term-algebra over a set $X$.
### Substitution as a homomorphism
Here we want to show that substitutions are homomorphisms on the term-algebra.
For concreteness let us fix a set $X=\{x,y,z\}$ of variables.
I claim that homomorphisms
$$h:Term(X)\to Term(X)$$
are precisely substitutions.
First, every homomorphism $h$ defines a substitution $s:X\to Terms(X)$ given by
$$s(x)\ \stackrel{def}{=}\ h(x)$$
So a substitution just specifies for each variable $x$ a term $s(x)$, which is meant to be the term replacing $x$ in another term $t$.
This means that we also have the operation of applying a substitution $s$ to a term $t$. This is often written as $t[x\mapsto s(x)]$ or $t[x\mapsto s_x]$ or $t[x:=s_x]$.
**Activity:** Make some concrete examples and think about how to implement the operation that takes a term and a substitution and applies the substitution to the term. Think about why this operation is a homomorphism and why it is important that the signature does not contain the variables as constants.
We summarise the activity by
**Proposition:** There is a bijective correspondence between functions (substitutions) $X\to Term(X)$ and homomorphism $h:Term(X)\to Term(X)$. More precisely, for all functions $s:X\to Term(X)$ there is a unique homomorphism $h:Term(X)\to Term(X)$ such that $s(x)=h(x)$ for all $x\in X$.
Notice the similarity of the proposition with the definition of initiality.
### Free algebras defined
The second way to introduce free algebras directly leads to the official definition of free algebra. Recall the universal property of initiality:
An algebra $I$ is initial if for all algebras $A$ there is a unique homomorphism $I\to A$.
How should we modify this universal property if, moving back to our example, instead of the initial algebra $\mathcal L$ we want to characterise the algebras $\mathcal L_X$ given wrt a set $X$ of variables?
Recall how we introduced the environments $\rho:X\to\mathbb N$ in our [running example](https://hackmd.io/s/By3OtPAsQ#Example-Arithmetic-Expressions) (back then we wrote Var instead of $X$). The semantics $\sem{-}$ was defined as the unique function such that
\begin{align}
\sem{-}_\rho:\mathcal L&\to \mathbb N\\\hline
\sem{1}_\rho& =1\\
\sem{t+t'}_\rho& =\sem{t}_\rho+\sem{t'}_\rho \\
\sem{t*t'}_\rho& =\sem{t}_\rho\cdot\sem{t'}_\rho\\
\sem{x}_\rho& =\rho(x)
\end{align}
The first three conditions say that $\sem{-}$ is a $\Sigma$-homomorphism. Hence $\sem{-}$ is the unique homomorphism satsifying $$\sem{x}=\rho(x)$$
To summarise, we have seen that $\mathcal L_X$ has the following property.
For all algebras $A$ and for all environments $\rho:X\to A$ there is a unique homomorphism $h:\mathcal L_X \to A$ such that $h(x)=\rho(x)$.
We turn this now into a
**Definition:** Let $\Sigma$ be a signature and $X$ a set. A $\Sigma$-algebra $F_X$ is called a ***free $\Sigma$-algebra over $X$*** if for all $\Sigma$-algebras and all functions $\rho:X\to
A$ there is a unique homomorphism $h:F_X\to A$ such that $h(x)=\rho(x)$.
**Remark:** The notation $F_X$ is meant to suggest "free over $X$". Other common notations are $FX$ or $F(X)$ or $Term_\Sigma(X)$ or $T(X)$ or $TX$ and many more. There is no standardized symbolic notation for free algebras, but the terminology "free algebra" is used uniformly in mathematics and computer science.
**Remark on the difficulty of maintaining a uniform notation:** Why do we change notation from Var to $X$, or from $\mathcal L$ to $I$, or from $\mathcal L_X$ to $Term(X)$ to $F_X$?
$\quad$ Changing notation always make learning a subject more difficult. On the other hand, it is impossible to keep a consistent and good notation across different levels of abstraction. And as this is something we need to deal with all the time as software engineers anyway, we can as well get used to it now.
$\quad$ For example: On the concrete level one prefers notation that reflects the particularities of the current situation. For instance, in programming we name variables and functions in such a way that the code becomes self-documenting as far as possible; on an abstract level, though, one prefers a more neutral naming convention that is not biased towards a particular application or interpretation.
## References
...
[^names]: Assigning unique names is a more subtle problem then it may look like at first sight.
(1) For example, if we want to have unique postal addresses, we use house numbers to identify a house in a street. We use street names to identify streets in a city. Etc. But note how uniqueness is achieved by fixing a context such as "in a street", "in a city", etc.
(2) It is common to use numbers as addresses (Activity: List 3 examples of this.) One property that makes numbers suitable for this is that there infinitely many numbers. So if one needs a new name, we know that there must be one available.
(3) It is not always convenient or practical to keep a list of all names that have been assigned. (Activity: List examples.) A common solution then is to use [hash-functions](https://en.wikipedia.org/wiki/Hash_function). Hashes can be used as names because they are (almost) unique and they can be compared for equality, which is the basic operation one needs on names.
(4) Activity: Think about, in the spirit of our discussion about [abstract data types](https://hackmd.io/s/Hk7_Dfyj7#Abstract-Data-Types), what are the operations that should be supported by an abstract data type of names. (As in the case of sequences, there may be more than one answer.)
[^variable1]: `x` is an object-variable, `v` is a meta-variable.
[^variable2]: I want to counter the impression that meta-variables are simpler than object-variables. This impression may be created by our examples which suggests that meta-variables are merely placeholders. In fact, meta-variables can carry the same complicated structure as object variables. For example, both `exp` and `Var` are meta-variables, with `exp` being a bound variable and `Var` a free variable. We also see that the scope of `Var` changes in the remark: First, `Var` refers to any set, later `Var` is in a scope in which it has value $\{x,y,z\}$.