--- tags: programming languages --- $\newcommand{\fix}{{\rm fix}}$ # Lambda Calculus: Extensions --- **Summary of `let rec`:** concrete syntax ``` let rec f = e1 in e2 ``` abstract syntax ``` Rec f e1 e2 ``` to implement `let rec` in the interpreter, we need an equation along the lines of ``` eval (Rec f e1 e2) = ... ``` the strategy is to do this in two steps: ``` eval (Rec f e1 e2) = ... Fix ... eval (Fix e) = ... ``` to fill in the dots, you need to know (do the [Important Exercise](https://hackmd.io/@alexhkurz/rJEeYqZtw#Important-Exercise) below) that the equation for Fix is (mixing abstract and concrete syntax for better readability, adapt appropriately for the interpreter) ``` Fix e = e (Fix e) ``` and that `let rec f = e1 in e2` can then be taken as syntactic sugar for ``` let f = (fix (\ f. e1)) in e2 ``` (if you look at the grammar, you see that the concrete syntax `fix` is not allowed in the programming language `LambdaNat`, one has to use `let rec` ... what we gain from introducing `Fix` in the abstract syntax is the transparent two-step implementation of recursion outlined above) --- In this lecture we will show that the $\lambda$-calculus is a flexible language that can be extended by a wide range of programming constructs. The main idea is that we use the $\lambda$-calculus to deal with variables, binding, scope, and subsitution, but are free to add whatever features we are interested in. We start from pure untyped lambda calculus, the syntax of which in our bnfc grammar was specified as ``` Abs. Exp ::= "\" Id "." Exp ; App. Exp ::= Exp Exp1 ; Var. Exp1 ::= Id ; ``` In this lecture, we will see how to add recursion, conditionals, sums, products, and numbers. [^moreispossible] [^moreispossible]: Much more is possible. For example, memory, input/output, objects and classes, polymorphism, streams and infinite data types, etc. In fact, that is how Haskell came into existence. The approach of using $\lambda$-calculus to deal with variables, binding and scope and to add further features as desired is also sometimes known as [Higher Order Abstract Syntax](https://en.wikipedia.org/wiki/Higher-order_abstract_syntax) (HOAS). The idea is that whenever we have to deal with variables, binding and capture avoiding substitution, we use the binding mechanism provided by the $\lambda$-calculus instead of redefining it each time "by hand". This can be done either by directly extending the $\lambda$-calculus, as done here, or by using the $\lambda$-calculus as a meta-language into which we translate our object-language. ## The General Scheme Each time we add a new feature to the language we need to specify syntax and semantics. Theorists call - the rules for specifying syntax "introduction rules". They correspond to the rules we add to the grammar. - the rules for specifying semantics "computation rules". They correspond to the equations we add to the interpreter in Haskell. ## Numbers Let us add numbers. We define numbers in the by now familiar way via 0 and S $$ \frac{}{0 : \mathbb N} \quad\quad\quad\quad \frac{n:\mathbb N}{Sn : \mathbb N} $$ Only the notation is new. Read the horizontal bar as a rule, with the premises on the top and the conclusions at the bottom. So the rule on the left says that $0$ is a natural number. And the rule on the right says "If $n$ is a natural number, then $Sn$ is a natural number." **Activity:** Line up the mathematical notation of the two rules above with the rules of the context-free grammar Nat0. Exp4 ::= "0" ; NatS. Exp4 ::= "S" Exp4 ; In which sense are these just two different notations for the same rules? What are the differences? What are the reasons for the differences? Which would would you find easier to use? These rules are also called **introduction rules** as they introduce new elements of a data type. Introduction rules tell us how to construct data. <font color='grey'> ### The Computation Rules of Natural Numbers (this subsection can be skipped ... here we define recursion over natural numbers but we will see how to define general recursion later anyway) Each time we introduce a new data type, we should also think about **computation rules**. What computation rules are appropriate for natural numbers? The computation rules should do some kind of recursion. A simple one would be a new constant $Iter$ (for iteration) which takes as arguments an element $z$ in a type $A$ and a function $s:A\to A$ and returns a function $$Iter\ z\ s:\mathbb N\to A$$ satisfying the computation rules $$ \frac{z : A\quad\quad s:A\to A}{Iter\, z\, s\, 0 \approx z} \quad\quad\quad\quad \frac{z : A\quad\quad s:A\to A}{Iter\, z\, s\, (Sn) \approx s(Iter\, z\, s\, n)} $$ In the rules above, the "$\approx$" has essentially the same meaning as the "$=$" in Haskell and the "$\to$" in rewriting systems: "$\approx$" denotes a computation step (from left to right). I chose "$\approx$" here because both "$=$" and "$\to$" are already used with different meanings (eg in the sentence "[...] for $A=\mathbb N$, $z=0$, the functions $s:\mathbb N\to\mathbb N$ given by [...]"). **Activity:** Compute $Iter\, z\, s\, (SSS0)$ for $A=\mathbb N$, $z=0$, the functions $s:\mathbb N\to\mathbb N$ given by - $k\mapsto k+1$ - $k\mapsto k+n$ for some $n\in\mathbb N$ - $k\mapsto k\cdot n$ for some $n\in\mathbb N$ **Question:** Can we do factorial or fibonacci with iteration? One can also think about more powerful recursion schemes. Actually we will see general recursion below. **Remark:** One reason why special recursion schemes are of interest is that they can be used to create programming languages in which all computations terminate. This is an important topic in type theory. </font> ## Conditional There are several reasons why the lambda calculus is foundational and one of them is that it is easy to extend by new constructs. [^level] Of course, if we add new constructs, the $\beta$-rule on its own may not be enough. In our programming language we try to be economical in the features we introduce. For example, we have no Booleans and instead a case distinction in which the equality test is built in: if e1 = e2 then e3 else e4 A more principled approach is to introduce the conditional together with Booleans. The introduction rules for Booleans are $$ \frac{}{true : Bool} \quad\quad\quad\quad \frac{}{false : Bool} \quad\quad\quad\quad $$ The computation rules for the conditional are $$ \frac{}{\texttt{if true then e1 else e2} \ \approx \ \texttt{e1}} \quad\quad\quad\quad \frac{}{\texttt{if false then e1 else e2} \ \approx \ \texttt{e2}} $$ **Notation:** $\approx$ is `=` in Haskell and $\to$ in ARSs. I was wondering whether it would be better to use `=` or $\to$, but these symbols are already over-used in maths ... ## Naming functions using `let` ### Syntax of `let` In a [previous lecture](https://hackmd.io/@alexhkurz/H1e4Nv8Bv) we have seen functions such as ```haskell= (\plus . \two . \one . plus two one) (\m.\n.\f.\x. m f (n f x)) (\f.\x. f (f x)) (\f.\x.f x) ``` **Exercise:** On which line is the function named by `plus` (similarly by `two` and by `one`)? This looks like a roundabout way to name functions. In particular, the name (eg `one`) and the function it names (eg `\f.\x.f x`) are quite far apart. In future we want to write this as let plus = \m.\n.\f.\x. m f (n f x) in let two = \f.\x. f (f x) in let one = \f.\x.f x in plus two one To this end we add to the grammar (ignoring precedence levels) Let. Exp ::= "let" Id "=" Exp "in" Exp ; ### Semantics of `let` We already know that the meaning of $\lambda$-expressions is given by $\beta$-reduction. So instead of defining bespoke computation rules for `let`, we just treat it as "syntactic sugar". In other words, whenever we have in a program let x = e1 in e2 (which is `Let id e1 e2` in abstract syntax) we translate it to (\ x . e2 ) e1 (which is `App (Abs x e2) e1` in abstract syntax) and then evaluate this expression using the already existing clause for `App`. **Exercise:** Evaluate pen-and-paper [^fmid] ``` let f = (\x . x x) in let mid = f c in f mid ``` [^fmid]: Should give `c c ((\ x . x x) c)`. Thanks to Chris Chang for the example.` ` ## Recursion ### Syntax: `let rec` `let rec` is our syntax for defining recursive functions. We add to the grammar (ignoring precedence levels) Rec. Exp ::= "let rec" Id "=" Exp "in" Exp ; (which is `Rec id e1 e2` in abstract syntax) **Remark (ML):** Functional programming languages are often compared to Haskell and ML. [Ocaml](https://ocaml.org/) is one of the best known languages in the ML family. In Ocaml, the factorial can be written as [^letrec] ```ocaml let rec f n = if n=0 then 1 else f(n-1)*n ;; ``` The notation we use in LambdaNat is inspired by ML. **Exercise:** Write the Haskell program `fact.hs` ```haskell fact n = if n==0 then 1 else n * fact (n-1) ``` in `LambdaNat` using the `let rec` syntax. Use the expression after `in` to call `fact` on an argument. [Hint: Use the additional constants `if then else` and `==` and `0`, `1`, `-1`, `*` plus $\lambda$-abstraction.] ### Semantics: The fixed point combinator Recursion can be added by a so-called fixed point combinator. **Activity:** - Review the notion of [fixed point](https://en.wikipedia.org/wiki/Fixed_point_(mathematics)) of a function from calculus. - Functions can be fixed points of equations. For example, the equation $f=\frac{d}{dx} f$ has as a solution the function $e^x$. [^exponential] - Find a $\lambda$-term $F$ such that on can write $$ f(n) \ = \ \texttt{if} \ n=0 \ \texttt{then} \ 1 \ \texttt{else} \ f(n-1)*n$$ as $$f=Ff.$$ ### Important Exercise Show that the factorial function `fact` is a fixed point of ```ocaml \f.\n. if n=0 then 1 else f(n-1)*n ``` that is, `fact` satisfies ```ocaml fact = (\f.\n. if n=0 then 1 else f(n-1)*n) fact ``` ### The magic of recursion The magic of recursion is that we can solve a problem by pretending to be able to solve it. Think about $$ f(n) \ = \ \texttt{if} \ n==0 \ \texttt{then} \ 1 \ \texttt{else} \ f(n-1)*n.$$ We define the $f$ on the left-hand side of the equation by pretending to know that we already know $f$ and can use it on the right-hand side. So why not take this seriously? Recall the definition of $F$ from above. Let us just pretend we know the solution of $$f=Ff$$ and call it [^callit] $$\fix_F.$$ [^callit]: In math we write $\fix_F$, concrete syntax is `fix e` and abstract syntax is `Fix e` (following conventions we follow elsewhere). Here `e` and `F` denote the same kind of entity, but in math we prefer $F$ to remind us of function whereas `e` reminds us of expression. This is the so-called fixed point combinator. We give ourselves one such fixed point combinator for each lambda term $F$. What should be the computation rule for the fixed point combinator? What we are looking for is something like the $\beta$-rule, or the rules for `if then else` above, but for $\fix_F$. But that is easy, we just use as a computation rule the property of being a fixed point: $$\fix_F \approx F\;\fix_F$$ <!-- **Notation:** The equation above, and similar ones to follow below, should be see as elements of a set $E$ of equations in the sense of the lecture on [TRS](https://hackmd.io/s/BJLCzAKnQ). Recall how equations $E$ give rise to a rewrite relation $\to_E$ which is used to compute with data. Accordingly, and following the tradition of type theory, we call these equations *computation rules*. --> ### Important Exercise Let $F$ be $$\lambda f.\lambda n.\ \texttt{if} \ n==0 \ \texttt{then} \ 1 \ \texttt{else} \ f(n-1)*n$$ and reduce $\fix_F 2$. (Use equational reasoning as we have done previously.) **Remark:** The fixed point combinator can be encoded in $\lambda$-calculus. **Remark:** The reason why Church introduced the lambda calculus. At the time, mathematics was shaken by paradoxes. These [paradoxes](https://plato.stanford.edu/entries/russell-paradox/) where mathematical versions of the well-known "This sentence is false". Is it false or true? There is no solution in two-valued logic. But there is a lambda-term solving this paradox, namely a process which runs forever and outputs an infinite sequence (true, false, true, false, ...) ### More Exercises **Exercise:** Compute pen-and-paper each of the following.[^hw-letrec] ``` let rec f = (\ x . S x) in f 0 let rec f = (\ x . if x = 0 then 0 else f (minus_one x)) in f (S 0) let rec f = (\ x . if x = 0 then 0 else f (minus_one x)) in f (S S 0) let rec f = (\ x . if x = 0 then x else c (f (minus_one x))) in f (S 0) ``` Some of the examples depend on how you define `minus_one`. [^hw-letrec]: Results should be S 0 0 0 c ((fix \ fx0 . \ x00 . if x00 = 0 then x00 else c (fx0 (minus_one x00))) (minus_one S 0)) **Exercise:** - Write a recursive addition function in `LambdaNat` and compute `add SS0 S0` by hand. - Use our interpreter and run and test factorial and addition. - Also add multiplication. - Implement the fibonacci sequence in LambdaNat. - A compiler will maintain a call stack to implement recursion. Think about why we can do recursion with the fixed point combinator without implementing a stack. (Maybe this question will make more sense after next semester's course on compiler construction, but you can give it a try.) --- (... what follows was not covered in the lectures and will not be needed for the assignment ... see it as bonus material ...) <font color=grey> ## Booleans We have two Booleans, true and false. So let us write this down has an inductive definition $$\frac{}{true:Bool} \quad\quad\quad\quad \frac{}{false: Bool}$$ **Notation:** We now write "$:$" instead of "$\in$" as this is the more common notation in the subfield of programming languages called *type theory*, which is the area we are touching upon here. The above rules tell us how to construct Booleans. We can also add some more such as $$\frac{a,b:Bool}{a\wedge b:Bool} \quad\quad\quad\quad \frac{a,b:Bool}{a\vee b:Bool} \quad\quad\quad\quad \frac{a:Bool}{\neg a:Bool} $$ Apart from the rules telling us how to construct data, we also need **computation rules** telling us how to compute with data. We already used the rules $$ \texttt{if} \ true \ \texttt{then} \ x \ \texttt{else} \ y \ \approx \ x $$ and $$ \texttt{if} \ false \ \texttt{then} \ x \ \texttt{else} \ y \ \approx \ y $$ **Remark:** **Exercise:** What are the rules for $\wedge,\vee,\neg$? ## Computation rules via universal properties Let us review the previous section from the point of view of [universal property](https://hackmd.io/s/Bymo_vCj7#Universal-Properties). First, the definition of $(\mathbb N,0,S)$ makes it the [initial algebra](https://hackmd.io/s/Bymo_vCj7#Universal-Properties) for a signature with a constant (the interpretation of which in $\mathbb N$ is denoted $0$) a unary operation symbol (the interpretation of which in $\mathbb N$ is denoted $S$). The assumptions for the computation rule of $Iter$ make sure that $(A,z,s)$ is an algebra for the same signature. **Exercise:** Show that $Iter\,z\,s$ is a [homomorphism](https://hackmd.io/s/Bymo_vCj7#Homomorphisms) $(\mathbb N,0,S)\to(A,z,s)$. Since $(\mathbb N,0,S)$ is initial, we know that there can be only one homomorphism $(\mathbb N,0,S)\to(A,z,s)$ and it, therefore, must coincide with $Iter\,z\,s$. **Activity:** What is the universal property corresponding to the computation rule of if-then-else? ## Sum Types If $A$ and $B$ are types, then we can form a type $A+B$ which has constructors $$\frac{a: A}{inl\, a: A + B} \quad\quad\quad\quad \frac{b: B}{inr\, b: A + B}$$ We read $inl$ as "in left" and $inr$ as "in right". Their is a new constant coming with this construction which we denote by $[-,-]$. Given functions $f:A\to C$ and $f:B\to C$, we construct a function $[f,g]:A+B\to C$ which is defined via computation rules $$\frac{f: A\to C}{[f,g](inl\, a) \approx f(a)} \quad\quad\quad\quad \frac{g: B\to C}{[f,g](inr\, b) \approx g(b)}$$ **Activity:** What is the universal property from which the two computation rules arise? **Exercise:** Show that $Bool$ is a sum type. Does this account includes the computation rules as well? ## Product Types Product types are dual to sum types. Instead of $$inl:A\to A+B \quad\quad inr:B\to A+B$$ we have projections $$lpr:A\times B \to A \quad\quad rpr:A\times B\to B$$ and instead of $$[f,g]:A+B\to C$$ we have, given $f:C\to A$ and $f:C\to B$, the so-called pairing $$\langle f,g\rangle:C\to A\times B$$ The computation rules state that $$lpr\,\langle f,g\rangle = f \quad\quad rpr\,\langle f,g\rangle = g$$ If $C$ contains exactly one element, then $f$ can be identified with an element of $a\in A$ and $g$ with an element of $b\in B$ and we write $(a,b)$ instead of $\langle f,g\rangle$. The computation rules then say $$lpr\,(a,b) \approx a \quad\quad rpr\,(a,b) \approx b$$ **Activity:** Explore in what sense sum types are dual to product types. **Notation:** In many programming languages the products and projections are written in the following style. struct myProduct left :: myTypeA right :: myTypeB end When we then evaluate `myProduct(a,b).left` we obtain `a`, which is just another notation for the computation rule $lpr\,(a,b) \approx a$. </font> ## Answers to selected exercises **Exercise:** Show that the factorial function `fact` is a fixed point of ```ocaml \f.\n. if n=0 then 1 else f(n-1)*n ``` that is, `fact` satisfies ```ocaml fact = (\f.\n. if n=0 then 1 else f(n-1)*n) fact ``` **Answer:** [^factfix] [^factfix]: We can prove this equation by induction. Base case ($k=0$): ```ocaml ((\f.\n. if n=0 then 1 else f(n-1)*n) fact) 0 = 1 = fact 0 ``` Inductive case ($k+1$): ```ocaml ((\f.\n. if n=0 then 1 else f(n-1)*n) fact) (k+1) = if k=0 then 1 else fact(k)*(k+1) = fact(k)*(k+1) = fact(k+1) ``` [^level]: $\lambda$-calculus is a low-level language if we encode everything just using variables, application and abstraction. On the other hand, we can turn $\lambda$-calculus in a high-level language by adding high-level constructs. [^exponential]: In fact, $e^x$ is the unique such function that satisfies $f(0)=1$. [^letrec]: An interesting discussion and further references on the distinction between `let` and `let rec` can be found [in this question at stackexchange](https://stackoverflow.com/questions/9325888/why-does-ocaml-need-both-let-and-let-rec). In fact, with what we learned in this section we can answer in a clear way: `let x=N in M` means $(\lambda x.M)N$ whereas `let rec` denotes the fixed point combinator. Understanding this, of course, does not take away from us the effort of learning the subtle differences in the way actual programming languages such as Haskell and ML express these concepts. $\quad$ Nevertheless, I still think it helps to know the theory behind such questions, because the theory is actually *easier to understand* than the often baroque implementations in really existing programming languages.