---
tags: chapman, programming languages
---

$\newcommand{\sem}[1]{[\![#1]\!]}$

(part of a course on Programming Languages)

# Operational and Denotational Semantics

"In a rough and ready sort of way it seems to me fair to think of the semantics as being what we want to say and the syntax as how we have to say it." [Christopher Strachey](https://www.chessprogramming.org/Christopher_Strachey) in his seminal paper [Fundamental Concepts in Programming Languages](https://www.cs.cmu.edu/~crary/819-f09/Strachey67.pdf) (1967).

## Aims

To be able to explain two concepts central to the field of Programming Languages, namely *denotational and operational semantics*.

## Introduction

To solve programming problems we need a **programming language to represent data and define operations on data**. To reason about the correctness of our programs we also need to know what is being represented. To this end we define **mathematical models of the real world**. These models are sometimes also called domains or mathematical domains.

*Example:* In data science, mathematical models are typically statistical models built from probability distributions.

(Whether the mathematical models are accurate representations of the real world cannot be determined by purely scientific methods. For this we need to look at scientific consensus, which arises from a process that draws on science but also comprises social, historical and psychological factors. In the following we will not pursue this point further, but I recommend the important book [The Structure of Scientific Revolutions](https://en.wikipedia.org/wiki/The_Structure_of_Scientific_Revolutions) by Thomas Kuhn to everybody who is interested in this.) [^truth]

Once we have agreed on the mathematical model, the correctness of the program becomes the question of whether the program agrees with the model. We can picture this as a function

$$\sem{\cdot}: Data \longrightarrow Domain$$

Such a map is often called a **denotational semantics** because it associates to each datum its mathematical denotation (meaning). It is also sometimes called a 'meaning function' or just 'the semantics'.

*Example:* For shortest path algorithms $Data$ could be adjacency lists or matrices (or ...) while the objects of $Domain$ could be graphs. For sorting and search algorithms $Data$ could be lists or trees or B-trees or red-black trees (or ...) while $Domain$ could just be finite sequences (lists).

*Activity:* Add your own examples.

In the following, we are interested in the case where the elements of the set $Data$ are programs. We now write the denotational semantics

$$\sem{\cdot}: \mathcal L \longrightarrow Domain$$

to emphasize that we think of $\mathcal L$ as a programming $\cal L$anguage.

Programs are special data that can be executed. Thus, in addition to a denotational semantics we want an **operational semantics** that captures computations. For this we need a *model of computation*. To keep things simple, we will build on our previous work on rewriting. In other words, our model of computation will be **rewriting to normal form**.
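Before we turn to programs, here is a minimal Haskell sketch of such a meaning map for the graph example above (a sketch only; the names `AdjList`, `Graph` and `graphSem` are mine and not part of the course code): the data are adjacency lists, and the domain consists of graphs given by their vertices and edges.

```haskell
import Data.List (nub, sort)

-- Data: adjacency lists, e.g. [(1,[2,3]),(2,[3]),(3,[])].
type AdjList = [(Int, [Int])]

-- Domain: a graph, given by its set of vertices and its set of
-- (directed) edges.
type Graph = ([Int], [(Int, Int)])

-- The meaning map: an adjacency list denotes the graph it represents.
graphSem :: AdjList -> Graph
graphSem adj = (vertices, edges)
  where
    vertices = nub (sort (map fst adj ++ concatMap snd adj))
    edges    = nub (sort [ (v, w) | (v, ws) <- adj, w <- ws ])
```

Two different adjacency lists (for example, listing neighbours in a different order) can denote the same graph; the denotational semantics is exactly what makes "represent the same thing" precise.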
We can now summarize our approach (that will be refined later) in the following diagram

<iframe class="quiver-embed" src="https://q.uiver.app/?q=WzAsMyxbMiwwLCJcXGNhbCBMIl0sWzQsMiwiRG9tYWluIl0sWzAsMiwiTm9ybWFsRm9ybXMiXSxbMCwxLCJkZW4uIHNlbSJdLFswLDIsIm9wLnNlbS4iLDJdLFsyLDEsImFncmVlbWVudCIsMl1d&embed" width="885" height="432" style="border-radius: 8px; border: none;"></iframe>

## Notation

Since the write-up of this lecture is longer than usual, <font color=grey>I typeset the parts that can be skipped on first reading in grey.</font>

## Questions

<font color=grey>
This lecture will address the following questions.

What is meaning? [^whatis] On the one hand, computation consists of the blind application of formal rules; on the other hand, we write programs to perform meaningful tasks. Is it possible to give a mathematical account of the meaning of a program?

While these questions may seem rather philosophical, the attempt at an answer will lead both to a deeper understanding of programming and to practical tools that are relevant to everyday software engineering. [^meaning] For example:

- How can I be sure that my program computes what I want it to compute?
- How do I know what the program computes? Operational Semantics!
- How do I know what I want? Denotational Semantics!
- How can we build software tools that verify correctness of programs?
- What does it mean to say the following? ![](https://i.imgur.com/xFnB7p4.png =520x)
- What is the relationship between the "purity" of Haskell and the importance of monads in Haskell?
</font>

## Outline of the Answers

Throughout, we will illustrate the concepts by the simplest example from arithmetic. But this methodology scales to complex software systems including compilers and operating systems.

We start with a language $\mathcal L$ that has only one constant `1` and two binary operations `+` and `*`. So far $\mathcal L$ is a purely formal language that has no meaning. In particular, there is nothing that distinguishes `+` from `*`. There are two ways to equip $\mathcal L$ with meaning.

1. **Operational Semantics** adds computation rules (equations, rewrite rules). For example, after adding `X*(Y+Z) = X*Y+X*Z` the symmetry between `+` and `*` is broken and `+` begins to look like addition and `*` like multiplication. Moreover, we can read the equation from left to right and interpret it as a computation rule.

2. **Denotational Semantics** maps the language $\mathcal L$ to our familiar mathematical numbers $\mathbb N$. In particular, we will map `1` to $1$, `+` to $+$ and `*` to $\cdot$.

**Question:** Can you think of examples of both methods in the context of software engineering? Review some of your favourite algorithms. How do you establish the correctness of the algorithms? [^favourite-algorithms]

**Summary:** The remainder of the lecture discusses the details of the sketch above:

- Recap of Arithmetic
- Denotational Semantics
- Operational Semantics
- Agreement between Operational and Denotational Semantics (Adequacy)

Finally, we will indicate that the principles outlined for arithmetic translate to programming languages and software engineering in general and briefly look at the problem of full abstraction.

## Recap of Arithmetic

The following recap of arithmetic is not intended to introduce new material, but to clarify the structure of arithmetic in terms of syntax, computation and semantics.
This structure repeats itself in countless examples in programming languages, algorithms, and logic and we will see more of it throughout Programming Languages and Compiler Construction.

### Abstract Syntax of Arithmetic

The abstract syntax [^abstractsyntax]

    exp ::= 1 | exp + exp | exp * exp

allows us to form, for example, the expression

    (1+1)*((1+1)+1)

We can read this as $2*3$, but to keep things simple, we only allow ourselves $1$ as a basic number. That is ok, because all other numbers can be obtained as a sum of $1$s. For example, the number $4$ will be represented as

    ((1+1)+1)+1

**Remark (Haskell):** Similarly to our implementation of [the calculator](https://hackmd.io/@alexhkurz/HkpdXJ1fK), we can represent the abstract syntax of a simple language for arithmetic in Haskell as

```haskell
data Exp = One | Add Exp Exp | Mult Exp Exp
```

The expression `(1+1)*((1+1)+1)` would then be represented as

    Mult (Add One One) (Add (Add One One) One)

### Computing in Arithmetic

What about computations?

**Activity:** How do we compute $$2*3$$ on a machine that has only one basic number `1` and does not have multiplication tables stored? Answer. [^twoTimesThree]

**Activity:** Which rules justify the computation steps above? Label each step of the computation above by the rule that justifies it. Hint. [^xyzEquations]

**Further Question:** Can you think of an example computation that needs the following equations?

    X + Y = Y + X
    X * Y = Y * X
    X * ( Y * Z ) = ( X * Y ) * Z

**Important Idea:** We can use equations as **rewrite rules** to do computations. In order for computations to terminate, we orient equations and allow applications of equations only "from left to right". If we want to emphasise our intention to only apply equations from left to right, we say "rewrite rule" instead of "equation" and use `->` instead of `=`. For example,

    X + ( Y + Z ) -> ( X + Y ) + Z
    X * 1 -> X
    X * ( Y + Z ) -> X * Y + X * Z

**Exercise:** The rewrite rules above define an abstract reduction system $(A,R)$. Write out in detail what $A$ and $R$ are.

**Exercise:** Why does every computation in the language given by the syntax `exp ::= 1 | exp + exp` and the rewrite rule `X + ( Y + Z ) -> ( X + Y ) + Z` terminate? Can you find an invariant?

**Optional Exercise:** Extend the exercise above to the larger language involving multiplication. Another extension would be to add zero and a unary operation for successor. Etc.

**Preliminary summary:** We reviewed abstract syntax, equations, rewrite rules. Moreover, we raised the question of how to recognise whether we have enough equations/rewrite rules. "Enough" can refer to two distinct but related ideas. Do we have enough equations to prove all mathematical equalities? Do we have enough rewrite rules to do all relevant computations (in our example: to evaluate all arithmetic expressions)?

<!--
### Semantics of Arithmetic?

Of course, we learned in school that the equation `X + ( Y + Z ) = ( X + Y ) + Z` is true (order of addition doesn't matter). But, so far, we only have a formal language. Just looking at the syntax

    exp ::= 1 | exp + exp | exp * exp

there is really no difference between `+` and `*`. Both are just two different meaningless symbols. So what distinguishes true equations such as

    X * 1 = X

from wrong ones such as the following?

    X + 1 = X

How can we distinguish right and wrong, true and false?
-->
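As a concrete illustration, here is a minimal Haskell sketch of this rewrite system (a sketch only; the names `step` and `trace` are mine, and the strategy of trying the rules at the root first and then in the subexpressions is just one of many possible choices). We re-declare the `Exp` type from the remark above, now deriving `Eq` and `Show` so that expressions can be compared and printed.

```haskell
data Exp = One | Add Exp Exp | Mult Exp Exp
  deriving (Eq, Show)

-- One rewrite step, if any rule applies. The rules
--   X + (Y + Z) -> (X + Y) + Z
--   X * 1       -> X
--   X * (Y + Z) -> X * Y + X * Z
-- are tried at the root first, then in the left subexpression,
-- then in the right one.
step :: Exp -> Maybe Exp
step (Add x (Add y z))  = Just (Add (Add x y) z)
step (Mult x One)       = Just x
step (Mult x (Add y z)) = Just (Add (Mult x y) (Mult x z))
step (Add x y)          = case step x of
                            Just x' -> Just (Add x' y)
                            Nothing -> Add x <$> step y
step (Mult x y)         = case step x of
                            Just x' -> Just (Mult x' y)
                            Nothing -> Mult x <$> step y
step One                = Nothing

-- The full computation: the list of all intermediate expressions.
trace :: Exp -> [Exp]
trace e = e : maybe [] trace (step e)
```

For example, `trace (Mult (Add One One) (Add (Add One One) One))` produces a computation of $2*3$ much like the one in the answer footnote, except that the order of the steps may differ, since the footnote does not fix a strategy.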
## Introduction to Semantics

The big idea that is missing so far is how to give meaning to a language. We also say: What is the semantics of our formal language?

Meaning comes from talking about something that everybody has agreed upon or understands already (the pre-understanding). For us, there are two possibilities: We all understand (at least to some degree that should suffice for the following)

- mathematics
- computing machines

If we give the semantics of a language in terms of a mathematical model we speak of mathematical or **denotational semantics**. If the semantics is in terms of a (virtual) computing machine, we speak of **operational semantics**. These are two different and important ways of explicating the meaning of a programming language.

In our running example of arithmetic, the denotational semantics will assign to every expression the number denoted by that expression. The operational semantics will describe instead the computation steps taken in order to evaluate the expression to normal form. [^denvsop]

[^denvsop]: These last two paragraphs may not be clear right now, but I will expand on them in the next two subsections. A good way to clarify these ideas for yourself is to make your own examples and discuss them with me.

The borderline between denotational semantics and operational semantics can be somewhat fluid if the semantic domain is a **virtual machine**. For example, if we take as our semantic domain for arithmetic the `Integer` data type of Haskell, then, since Haskell can be executed, we have a semantics of arithmetic that is both denotational and operational. But usually there is a distinct gap between operational and denotational semantics: The operational semantics is the specification of an interpreter for the programming language whereas the denotational semantics tries to capture the mathematical essence of a program, abstracting away from computational details.

## Denotational Semantics

Let us write $\mathcal L$ for the formal language of expressions given again by the grammar

    exp ::= 1 | exp + exp | exp * exp

Let $\mathbb N = \{1,2,3,\dots\}$ be the natural numbers. The (denotational) semantics of the language

$$\color{black}{\sem{-} :}\color{blue}{\mathcal L} \color{black}{\to} \color{red}{\mathbb N}$$

will involve three different languages and we use, just for now, three different colours to help keep them apart. $\color{blue}{\textrm{Blue}}$ to denote the formal language (aka object language or source language), $\color{red}{\textrm{red}}$ for the target language (in our example the mathematical natural numbers with addition and multiplication) and black for the informal "meta"-language in which we freely mix English with mathematical notation.

We can now define the semantics $\sem{-}$ by recursion on abstract syntax

\begin{align}
\sem{\color{blue}1}& =\color{red}1\\
\sem{\color{blue}{t+t'}}&=\sem{\color{blue}t}\color{red}+\sem{\color{blue}{t'} }\\
\sem{\color{blue}{t*t'}}&=\sem{\color{blue}t}\color{red}\cdot\sem{\color{blue}{t'} }
\end{align}

Or the same definition in Haskell (now taking Haskell's `Integer` instead of the mathematician's $\mathbb N$):

```haskell
eval :: Exp -> Integer
eval One = 1
eval (Add n m) = (eval n) + (eval m)
eval (Mult n m) = (eval n) * (eval m)
```

You can run this definition at [repl.it](https://repl.it/@alexhkurz/arithmeticOnePlus#main.hs). [^runit]
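As a quick check (an illustrative GHCi session, with the `Exp` declaration from the recap in scope), the expression representing $2*3$ indeed denotes $6$:

    ghci> eval (Mult (Add One One) (Add (Add One One) One))
    6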
**Remark:** As a little excursion important later in the course, note that the definition of the function $\sem{-}$ is what we call ***structure preserving***: The $\color{blue}{+}$ on the left is mapped to a $\color{red}{+}$ on the right, etc. This principle is key to all of engineering in general. It is what allows us to build big systems from small systems and to control how the small systems interact. Here this general principle is visible in the simple fact that an equation such as

$$\quad\sem{\color{blue}{t+t'}}=\sem{\color{blue}t}\color{red}+\sem{\color{blue}{t'} }\quad$$

explains the meaning of the "big system" $\;\sem{\color{blue}{t+t'}}\;$ as a function of the meanings of the "small systems" $\;\sem{\color{blue}t}\;$ and $\;\sem{\color{blue}{t'} }\;$. This principle is also called ***compositional semantics***. It plays a major role not only in computer science and other engineering disciplines but also in linguistics to explain the meaning of natural languages. (End of Remark)

**Remark:** To get a feeling for how such an inductive definition works we do an example computation where we inductively/recursively decompose terms until we can apply the base case and finally add up in the natural numbers. (**Exercise:** Fill in the dots in the equational reasoning below.)

\begin{align}
\sem{\color{blue}{((1+1)+1)+(1+1)}} & = \sem{\color{blue}{((1+1)+1)}}\color{red}{+}\sem{\color{blue}{1+1}} \\
& = \dots \\
& = \sem{\color{blue}{1}}\color{red}{+} \sem{\color{blue}{1}}\color{red}{+} \sem{\color{blue}{1}}\color{red}{+} \sem{\color{blue}{1}}\color{red}{+} \sem{\color{blue}{1}} \\
& = \color{red}{1}\color{red}{+} \color{red}{1}\color{red}{+} \color{red}{1}\color{red}{+} \color{red}{1}\color{red}{+} \color{red}{1} \\
& = \color{red}{5}
\end{align}

If you now wonder why this denotational semantics is not also an operational semantics you are on the right track. But we are in a special situation here where the mathematics (in red) is in fact computable. This is not always the case for mathematics. For example, most real numbers are not computable.

## Operational Semantics

While the denotational semantics worked by mapping syntax to a semantic domain, we now describe a different way of giving meaning to syntax.

There are many different ways to set up an operational semantics. The simplest one, pursued in this section, is **rewriting to normal form**. We have studied rewriting to normal form before, so this section is largely a recap of things we have seen.

One advantage of rewriting to normal form is that it does not require us to define another language or a virtual machine. Instead, we directly define computations on the already given syntax

    exp ::= 1 | exp + exp

by adding rewrite rules, which, in our case, amounts to adding the single rewrite rule

    X + ( Y + Z ) -> ( X + Y ) + Z

**Question:** What do we need to do in order to show that all computations terminate? If two computations from the same initial expression terminate, are the two final expressions the same? (Recall what we learned about confluence and unique normal forms.) [^maclane]

Recall that a **normal form** is an expression to which no rewrite rule can be applied.

**Exercise:** Determine the normal forms of the rewrite system above.

A rewrite system is **terminating** if there are no infinite computations.

**Exercise:** Show that the rewrite system above is terminating.

**Remark:** Recall that a rewrite system is a non-deterministic algorithm. The input is the starting expression. The result is the normal form in which the computation terminates.
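An implementation, in contrast, has to fix a deterministic strategy. Continuing the Haskell sketch from the recap section (same `Exp` and `step`; again only one possible implementation among many), rewriting to normal form is simply iterating `step` until no rule applies:

```haskell
-- Rewrite to normal form: iterate step until no rule applies.
-- On the restricted language exp ::= 1 | exp + exp only the
-- associativity rule ever fires, and the normal forms are exactly
-- the left-nested sums of Ones.
norm :: Exp -> Exp
norm e = maybe e norm (step e)
```

For example, `norm (Add One (Add One One))` returns `Add (Add One One) One`, the normal form `(1+1)+1`.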
For a full analysis of our rewrite system it remains to show that results of computations are unique, that is, any two computations starting from the same input lead to the same normal form.

Recall that a rewrite system has **unique normal forms** if any two terminating computations starting from the same input lead to the same normal form.

**Exercise:** Show that the rewrite system above has unique normal forms.

Recall that two expressions are called **equivalent** if they can be connected by a finite sequence of (forward or backward) rewrite steps.

**Exercise:** Characterise equivalence of expressions.

**Optional Exercise:** Extend the exercises above to the language with multiplication.

Now that we have two different ways of explaining the meaning of a programming language, we want to know whether they amount to the same (in a suitable sense). Before we can explain the (positive) answer to this question, we need to learn/revise some notions from discrete mathematics.

## Recap: Equivalence Relations and Isomorphisms

In order to clarify the relationship between rewrite rules and equations, we need the mathematical notion of equivalence relations. If you remember what this was about feel free to skip.

Briefly, for any binary relation $\to$ on a set, there is a smallest equivalence relation $\equiv$ containing $\to$. [^equiv]

[^equiv]: We write $\equiv$ instead of $=$, because $=$ already has so many uses in maths and computer science. We write $\equiv$ to denote a particular equivalence relation.

One of the uses one makes of equivalence relations is to quotient by them. This turns equivalence into equality. For example, if one quotients the integers by the smallest equivalence relation containing $\{(n,m) \in \mathbb Z\times \mathbb Z \mid n = m + 12\}$ then this quotient is isomorphic[^isomorphic] to the set $\{0,1,2,3,4,5,6,7,8,9,10,11\}$ known from the hours on the clock.

[^isomorphic]: For our current purposes we can take isomorphic to mean bijective and isomorphism to mean bijective function. But isomorphism is the right concept if we want to compare not only sets but also take into account operations like `+` and `*`. So I prefer to use the terminology that will continue to be useful for a more thorough investigation we may turn to later (if time permits).

If you are familiar with these ideas you can go to the next section, otherwise read the notes on [Discrete Mathematics](https://hackmd.io/@alexhkurz/SJ1cc-dDr).

To check your understanding consider again the equation

    X + ( Y + Z ) = ( X + Y ) + Z

**Remark:** Note that the variables `X, Y, Z` are not part of our language of arithmetic. They are merely placeholders (metavariables) for the terms (expressions) of the language.

**Exercise:** Describe the equivalence relation induced by the equation above. Answer.[^Answer1]

[^Answer1]: We define the equivalence relation ${\equiv}\subseteq A\times A$ on the set $A$ of all abstract syntax trees given by `exp ::= 1 | exp + exp` as the smallest equivalence relation containing $\{(X + ( Y + Z ),( X + Y ) + Z) \mid X,Y,Z \in A \}$.

**Question:** Equations introduce a problem. On the one hand, 3+2=5. On the other hand, 3+2 and 5 are certainly different expressions.[^different] How can we make sense of these different notions of equality?

[^different]: 3+2 and 5 are certainly different as strings. But in some sense they also differ in information content: If I tell you that I am thinking of two numbers the sum of which is 5, you won't be able to say which numbers I am thinking of. Computation destroys information.
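In code, the two notions of equality can be kept apart quite concretely (a sketch, assuming `Exp` with its derived `Eq` instance and `norm` from the earlier sketches; note that it anticipates the exercise on characterising equivalence): syntactic equality is `==` on abstract syntax trees, while equivalence of expressions can be decided by comparing normal forms, because our rewrite system is terminating and has unique normal forms.

```haskell
-- Two expressions are equivalent (connected by forward/backward
-- rewrite steps) iff they have the same normal form.
equiv :: Exp -> Exp -> Bool
equiv s t = norm s == norm t
```

For example, `Add One (Add One One)` and `Add (Add One One) One` are different as trees but equivalent according to `equiv`.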
For background on this question see the sections on Equivalence Relations and Quotienting by Equivalence Relations in [my notes on relations](https://hackmd.io/@alexhkurz/SJ1cc-dDr).

**Exercise:** Answer the question above using the fact that one can quotient by an equivalence relation.

In the next section, we want to connect operational and denotational semantics. The following two exercises will be important.

**Exercise:** Show that the quotient of the language `exp ::= 1 | exp + exp` by the equation `X + ( Y + Z ) = ( X + Y ) + Z` is isomorphic to the positive integers. Answer. [^Answer2]

**Exercise:** Show that the isomorphism $\quad \mathcal L/{\equiv}\ \cong\ \mathbb N\quad$ from the previous exercise preserves all the relevant structure.

**Optional Exercise:** Extend all exercises above to the language with multiplication.

## Adequacy

Now that we have built two different semantics, one mathematical and one computational, we want to know how they relate to each other.

To recap, the denotational semantics is a function

$$\sem{-}:\mathcal L \to \mathbb N$$

and the operational semantics is a function

$$norm:\mathcal L \to NForm$$

where $NForm=\{1, 1+1, (1+1)+1,\ldots \}$ is the set of normal forms (or values) and $norm$ is the function that normalises expressions.

How can we compare the two semantics?

**Exercise:** Find a map $NForm\to\mathbb N$.

I will call that map the "meaning" map $mng:NForm\to\mathbb N$. Now we can arrange our ingredients into a triangle as follows.

![](https://i.imgur.com/zyfM22F.png =500x200)

We say that **the diagram commutes** to express that going around both ways gives the same function. In one-dimensional algebraic notation we write this

$$mng \circ norm = \sem{-}$$

or sometimes also as

$$norm \,; mng = \sem{-}$$

Here, the ";" reminds us of the sequential composition in programming languages.

We say that the operational semantics is **adequate** if the diagram above commutes.

If we allow ourselves the identifications

    1 = 1
    (1+1) = 2
    ((1+1)+1) = 3
    ...

then we can also formulate **adequacy** as

$$ e \stackrel\ast\to n \quad \Longleftrightarrow\quad \sem{e}= n$$

The direction "$\Rightarrow$" can be phrased as "denotation is an invariant of computation" and is also known as the **correctness** of the denotational semantics wrt the operational semantics. Correctness, even for more complicated programming languages, is usually not hard to prove, because we construct the denotational semantics with correctness in mind.

The direction "$\Leftarrow$" is then known as adequacy in the proper sense and can require sophisticated mathematics beyond a typical undergraduate degree. The difficulties stem from recursion and higher-order functions. An entire branch of mathematics, known as domain theory, had to be developed to deal with them. But in the case of arithmetic alone, adequacy is much easier to establish and we will look at this in a later lecture on [Normalization by Evaluation](https://hackmd.io/@alexhkurz/BkQ7VEp_r).
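For our small example, adequacy can at least be tested in code. The following sketch assumes `Exp`, `step` and `norm` from the earlier sketches and `eval` from the Denotational Semantics section; the names `mng` and `adequateOn` are mine.

```haskell
-- The meaning map on normal forms: a normal form is a left-nested
-- sum of Ones, so we count the Ones.
mng :: Exp -> Integer
mng One         = 1
mng (Add s One) = mng s + 1
mng _           = error "mng: not a normal form"

-- The triangle commutes on e if going around both ways agrees:
adequateOn :: Exp -> Bool
adequateOn e = mng (norm e) == eval e
```

Running `adequateOn` on random expressions (for example with QuickCheck) gives evidence, but adequacy is a statement about all expressions and therefore needs a proof.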
## Full Abstraction

The language of arithmetic is special since there are no functions and no recursion. All computations terminate and evaluate to a number. In particular, two expressions are equivalent if and only if they evaluate to the same value. But what would be a good notion of equivalence for functions?

Two functions, or programs, are **observationally equivalent** if they evaluate to the same values in all contexts. I will write $P\simeq P'$ if the two programs $P,P'$ are observationally equivalent. A context $C[-]$ is a program with a hole. For example, $C$ could be a context that applies a function to an argument. The functions that take an input `x` and return, respectively, `x+1` and `1+x` are observationally equivalent.

**Exercise:** Use the definition of observational equivalence to show that the code fragments `x+x` and `2*x` are equivalent.

**Remark:** Observational equivalence is the notion of equivalence needed for compiler optimisations: If $P\simeq P'$ then the compiler can safely replace the code fragment $P$ by (the supposedly more efficient) $P'$ without changing the behaviour of the overall program $C[P]$.

Similarly to our discussion of adequacy, we can now ask again whether operational and denotational semantics agree, that is,

$$P\simeq P' \quad\Longleftrightarrow\quad \sem{P}=\sem{P'}$$

Here, the "easy" direction is "$\Leftarrow$". [^rt] Following (Milner, 1977), the other direction is called **full abstraction** and can be much harder to establish.

[^rt]: $P\simeq P' \Leftarrow \sem{P}=\sem{P'}$ can be considered a stronger version of [referential transparency](https://en.wikipedia.org/wiki/Referential_transparency): If the two expressions $P,P'$ have the same semantics then one can be replaced by the other without changing the overall program's behaviour.

The problem of full abstraction was first introduced for the programming language PCF, which is the simply typed lambda calculus with natural numbers and recursion.

How can we try to understand why full abstraction is a difficult problem? One can always construct a model from syntax by quotienting. But this does not constitute a "natural mathematical" model, that is, a model from which one could hope to gain insights into programming using "ordinary" mathematics. On the other hand, when one constructs a mathematical model from mathematical (as opposed to computational) principles, this model may contain features that are natural from a mathematical point of view but happen to not be part of the programming language. If this happens, the mathematical model is not fully abstract for the operational semantics.

For PCF this is exactly what happened. (Plotkin, 1977) identified the feature that was missing from PCF for Scott's denotational semantics to be fully abstract: the parallel or. Obviously, this would need further explanations, but as this excursion on full abstraction already goes beyond what we can discuss in this course, I leave it at this for now and refer to the literature below for more on this story.

## Conclusion

We can summarize the lecture in the following picture. We start with a language $\mathcal L$ and then consider three ways of giving a meaning to its expressions.

![](https://i.imgur.com/pL50bVp.png)

On the left, we have an operational semantics rewriting expressions to normal form. On the right we have a denotational semantics to a semantic domain $\mathcal D$ defined using mathematics independent of any notion of computation.[^computation]

[^computation]: "Independent of any notion of computation" is somewhat vague and one can debate in each particular case how well this aim was achieved. What is important is that the denotational and the operational model both shed different lights on the meaning of the language and complement each other.

In the middle we have the model that is constructed from quotienting syntax.
In our example, the horizontal arrows are all isomorphisms, that is, we obtain a perfect match between operational and denotational semantics.[^inourexample] This perfect picture can be achieved in many situations and will appear, implicitly or explicitly, again and again later in this course.

[^inourexample]: Moreover, $\equiv$ is the smallest equivalence containing the rewrite relation $\to$ that defines the normal forms.

In particular, we will see examples where the operational semantics (= an algorithm) is given and the question is to understand it (= find a denotational semantics) and, vice versa, where the specification of a problem (= a denotational semantics) is given and the task is to find an appropriate algorithm (= an operational semantics).

Again, in most cases we will look at in this course, a perfect match between the two sides is what we aim for. However, if we had time to study how this picture can be extended to programming languages with recursion (or while loops, for that matter), we would see that we run into the halting problem and matters get rather more complicated/interesting.

## Homework

Read these notes and do the exercises. Here are two further discussion questions:

- In the conclusion we made the analogy between programming language and algorithm, thinking of both as an operational semantics. But algorithms are written *in* programming languages. How can we justify thinking of a programming language as just another algorithm? [^algorithms]

[^algorithms]: A relevant idea here is the universal Turing machine.

- At the beginning of the lecture we quoted Strachey saying *semantics as being what we want to say and syntax as how we have to say it.* Discuss the following variation: *denotational semantics as being what we want to compute and operational semantics as how we have to compute it.*

## Backstory

Even though Church invented the lambda calculus as a calculus of mathematical functions, for a long time it only had an operational semantics. In fact, it was known that there cannot be a (naive) set-theoretic denotational semantics for the (untyped) lambda calculus.[^settheory] This was a significant obstacle since set theory was (and is) the most widely accepted foundation of mathematics. Then, in 1969, two things happened: While working on a paper (Scott, 1969/1993) arguing against the (untyped) lambda calculus of Church,[^frustration] Dana Scott found a denotational semantics for it (Scott, 1972). With these discoveries, Scott not only solved one of the outstanding problems of theoretical computer science, he also initiated the fields of domain theory and denotational semantics, which in turn had ramifications in process algebra, concurrency, theorem proving, verification, and program analysis. For some pointers to the early literature, continue reading the historical references below.

## Historical References

Reading the introductions of the following historical references gives a sense of the developments. Many of the questions that arose at the time remain fundamental in the area of programming languages.

The early history of the subject is difficult to pin down. We are lucky that the following has been published (more than 20 years after it was written). It is the paper that argued against the untyped lambda calculus and introduced a typed lambda calculus known as the logic of computable functions (LCF):

- Dana Scott: [A type-theoretical alternative to ISWIM, CUCH, OWHY](https://www.cs.cmu.edu/~crary/819-f09/Scott93.pdf). 1969/1993.
In the 1993 preface to the above paper, Scott tells the story of how he finally found a mathematical model for the (untyped) lambda calculus. The earliest publication of this I could find is:

- Dana Scott: [Continuous Lattices](https://www.cs.ox.ac.uk/files/3229/PRG07.pdf). 1972.

For a modern account of the field that was started by this work of Scott, see Abramsky and Jung's [Domain Theory](https://www.cs.bham.ac.uk/~axj/pub/papers/handy1.pdf).

The next paper emphasises the connections between the mathematical theory (domain theory) and its application to programming languages (denotational semantics).

- Dana Scott: [Mathematical concepts in programming language semantics](http://ropas.snu.ac.kr/~kwang/520/readings/sco72.pdf). 1972.

Ironically, Scott's unpublished work on LCF was as important as his models of the untyped lambda calculus. Robin Milner turned LCF into one of the first modern interactive theorem provers. Moreover, in order to implement LCF, he invented one of the major functional programming languages, ML. Then, to make programming in ML practical, he created the type-inference algorithm described in [A theory of type polymorphism in programming](https://www.sciencedirect.com/science/article/pii/0022000078900144?via%3Dihub).[^wrong] An early paper by Milner on LCF is the following. It is still surprisingly readable, written in the notation we use today.

- Robin Milner: [Models of LCF](http://i.stanford.edu/pub/cstr/reports/cs/tr/73/332/CS-TR-73-332.pdf). 1973.

The next in my list is one of the early papers that showed how to apply the nascent area of domain theory to the semantics of programming languages. It introduced PCF, which is to this day the paradigmatic language for introducing domain theory and denotational semantics.

- Gordon Plotkin: [LCF considered as a programming language](http://homepages.inf.ed.ac.uk/gdp/publications/LCF.pdf). 1977.

As a direct response to Plotkin's work on PCF, Milner introduced the notion of full abstraction.

- Robin Milner: [Fully abstract models of typed λ-calculi](https://www.sciencedirect.com/science/article/pii/0304397577900536?via%3Dihub). 1977.

Milner provided a syntactic model (syntax quotiented by an equivalence relation). The question of finding a fully abstract mathematical model for PCF not arising as a quotient of syntax was one of the most famous and fruitful problems in Programming Languages. The best-known solutions were found around 1993 by Abramsky, Jagadeesan and Malacaria and by Hyland and Ong. For more on the history see the introduction of [Full Abstraction for PCF](https://arxiv.org/abs/1311.6125) and the conclusion of [On Full Abstraction for PCF](https://www.sciencedirect.com/science/article/pii/S0890540100929171?via%3Dihub).

The work on full abstraction for PCF led to a third style of semantics, game semantics. These three styles of semantics, operational semantics, denotational semantics and game semantics, continue to play a major role in Programming Languages research. A graduate level introduction to the topic can be found in Dan Ghica's videoed lectures on [Game Semantics](https://www.youtube.com/watch?v=EpGGenaS-mQ).

We followed some of the history of denotational semantics and domain theory. The modern approach to operational semantics was introduced in

- Gordon Plotkin: [A Structural Approach to Operational Semantics](https://web.eecs.umich.edu/~weimerw/2006-615/reading/plotkin81structural.pdf). 1981.
For more on the history of the ideas sketched above see

- Gordon Plotkin: [The origins of structural operational semantics](https://www.sciencedirect.com/science/article/pii/S1567832604000268). 2004.

## Further References

(This is material typically covered in graduate courses on programming languages.)

The following lecture by one of the leading researchers in domain theory and denotational semantics explains some of the basic questions of the area in quite general philosophical terms that should be of interest to a wider computer science audience.

- Achim Jung: [The Surprising Difficulty of Using Mathematics in Computer Science](http://logic-ufrn.weebly.com/achim-jung.html).

Next, I list some sources one would study in a graduate course on Semantics of Programming Languages.

The first 30 min of the first lecture of the videoed course

- Dan Ghica: [Game Semantics](https://www.youtube.com/watch?v=EpGGenaS-mQ)

gives a good introduction to Semantics of Programming Languages in general. The rest of the first lecture explains the idea of functions as processes at the heart of game semantics.[^rest] After that, the introduction of [Game-theoretic analysis of call-by-value computation](https://www.sciencedirect.com/science/article/pii/S0304397599000390) comparing call-by-name and call-by-value should make sense.

[^rest]: This will make more sense after you have learned about lambda calculus.

Classic texts on domain theory and denotational semantics are:

- Gordon Plotkin: [Pisa Notes on Domains](http://homepages.inf.ed.ac.uk/gdp/publications/Domains_a4.ps). 1983.
- Gunter, Mosses, Scott: [Semantic Domains and Denotational Semantics](https://repository.upenn.edu/cgi/viewcontent.cgi?article=1887&context=cis_reports). 1989.
- Samson Abramsky and Achim Jung: [Domain Theory](https://www.cs.bham.ac.uk/~axj/pub/papers/handy1.pdf). 1994.
- Glynn Winskel: [Lecture Notes on Denotational Semantics](https://www.cl.cam.ac.uk/~gw104/dens.pdf). 1997, 2004.
- Thomas Streicher: [Domain-theoretic Foundations of Functional Programming](https://b-ok.org/dl/768876/ab56b8). 2006.

The "canonical" content of a first course on denotational semantics is discussed in

- Achim Jung: [Teaching denotational semantics](https://www.cs.bham.ac.uk/~axj/pub/papers/Jung-2014-Teaching-denotational-semantics.pdf). 2014.

While this outline of a course on denotational semantics is addressed to the teacher rather than the student, it may be a useful overview for students as well. The technical details can be found in (Streicher, 2006).

[^abstractsyntax]: Recall that by saying "abstract syntax", we are signalling that we think of syntax as trees. Parentheses are not part of the abstract syntax and are only used to unambiguously write trees in one-dimensional notation.

[^settheory]: We may come back to this if you are interested.

[^frustration]: And reading the introduction, one gets a real sense of the frustration caused by the inability to find a denotational semantics of the lambda calculus.

[^wrong]: This paper contains the famous slogan that well-typed programs cannot go "wrong". This slogan is usually quoted without the quotes and then has to be taken with a grain of salt. But it is also a theorem in the paper, see page 365.

[^Answer2]: Let $\mathbb N$ be the positive integers. We first define a function $f:\mathcal L\to \mathbb N$ by recursion on abstract syntax as $f(1) = 1$ and $f(e_1+e_2)=f(e_1)+f(e_2)$. This function induces a relation $\bar f :\mathcal L/{\equiv}\to \mathbb N$ via $\bar f([e]_\equiv) = f(e)$.
We need to show that $\bar f$ is a bijective function. $\bar f$ is clearly total. $\bar f$ is single-valued because the equation $X + ( Y + Z ) = ( X + Y ) + Z$ holds in $\mathbb N$. $\bar f$ is surjective because every number in $\mathbb N$ can be represented as a sum of $1$s. That $\bar f$ is injective follows from the lemma below (how?). Lemma: Let $\mathsf{nf}:\mathbb N \to\mathcal L$ be the function that assigns to the number $n$ the expression `(...(1+1)+... 1)`, that is, the left-nested sum of $n$ ones. Then $\mathsf{nf}(f(e))\equiv e$. Exercise: Define $\mathsf{nf}$ by induction. Prove the lemma. Answer the "how?" question.

## Acknowledgements

Thanks to Clarisse Bonang, Dan Haub, Drew Moshier for suggesting improvements.

[^truth]: Is truth objective or is truth relative to agreement by people? This question has been important ever since the invention of money, writing and propaganda, but is becoming ever more urgent now that more and more people spend most of their lives online in virtual worlds.

[^whatis]: "What is" questions are often what moves science forward. Typically, these questions do not have definite answers and every now and then a new answer is found, often moving science forward decisively. Examples are: What is the infinite? What is the continuum? What is space? What is time? What is causality? What is logic? What is a proof? What is an algorithm? What is communication? What is probability? All these questions have received mathematical answers which are important to engineering despite not being settled.

[^meaning]: Roughly speaking, the outcome of the discussion will be that we cannot fully escape the fact that meaning depends on "subjective" pre-scientific understanding (the *pre-understanding*) tied to our human existence, sometimes called [thrownness](https://books.google.com/books/about/Understanding_Computers_and_Cognition.html?id=2sRC8vcDYNEC) and giving rise to the [hermeneutic circle](https://plato.stanford.edu/entries/hermeneutics/#HermCirc), the relevance of which to science was studied by [Jean-Michel Salanskis](https://en.wikipedia.org/wiki/Jean-Michel_Salanskis) in his book "L'herméneutique formelle". The relevance of the hermeneutic circle to computer science has been presented by well-known AI researcher (and PhD advisor of Google founder Larry Page) Terry Winograd (in collaboration with Fernando Flores) in the book [Understanding computers and cognition: a new foundation for design](https://archive.org/details/understandingcom00wino). Importantly, from a software engineering point of view, while the philosophical problems should inform our work, they do not prevent us from creating ad hoc formalizations of meaning that can be turned into powerful software tools (such as [verified compilers](https://www.cs.cmu.edu/~15811/papers/compcert-journal.pdf)).

[^favourite-algorithms]: Some comments:

    - One of my favourite examples is **finite automata and regular languages**. Stretching the ideas presented so far a little, one can take finite automata as expressions in a programming language $\mathcal L$ and their denotational semantics would be functions that map words to booleans (true if a word is accepted, false otherwise). The operational semantics would be given by a machine that executes the program and computes the function.
    - The general picture arising from this discussion should be: The denotational semantics is (a mathematical model of) the **specification**, the operational semantics is the **algorithm**, and the agreement between operational and denotational semantics is the correctness of the algorithm, that is, a proof that the algorithm satisfies its specification.

[^twoTimesThree]: We want to do a computation like the following.

        (1+1)*((1+1)+1)
        = (1+1)*(1+1)+(1+1)*1
        = (1+1)*(1+1)+(1+1)
        = ((1+1)*1+(1+1)*1)+(1+1)
        = ((1+1)*1+(1+1))+(1+1)
        = ((1+1)+(1+1))+(1+1)
        = (((1+1)+1)+1)+(1+1)
        = ((((1+1)+1)+1)+1)+1

[^xyzEquations]: Consider the equations/computation rules/rewrite rules

        X + ( Y + Z ) = ( X + Y ) + Z
        X * 1 = X
        X * ( Y + Z ) = X * Y + X * Z

[^runit]: It is interesting to think about why the Haskell code runs. There is still a difference between source (blue, `Exp`) and target (red, `Integer`) language. But all are embedded into Haskell (black), which itself has an operational semantics.

[^maclane]: This proof is famous in category theory. If you look at page 167 of MacLane's [Category Theory for the Working Mathematician](https://people.math.rochester.edu/faculty/doug/otherpapers/maclanecat.pdf) you may recognize the diamond shape we encountered when we discussed confluence. This is not a coincidence. If one peels away all the category theory, MacLane's famous proof boils down to a simple rewriting argument. Essentially, it just shows that the rewriting system given by the equation above has unique normal forms.