A Calculator in Haskell (Abstract Syntax) (2020)

--- tags: programming languages --- # A Calculator in Haskell (Abstract Syntax) (2020) "The form of syntax we shall now describe differs from the Backus normal form in two ways. First, it is analytic rather than synthetic; it tells how to take a program apart, rather than how to put it together. Second, it is abstract in that it is independent of the notation used to represent, say sums, but only affirms that they can be recognized and taken apart." From the article [Towards a Mathematical Science of Computation](http://www-formal.stanford.edu/jmc/towards.ps) by [McCarthy](https://en.wikipedia.org/wiki/John_McCarthy_%28computer_scientist%29). [^mccarthy] [^mccarthy]: Today AI and Programming Languages are areas that are often thought to have little overlap. McCarthy is a founding father of both. For example, he coined both the term Artificial Intelligence and created LISP, one of the first functional programming languages. --- The next two lectures aim at the following. - Learn some more Haskell, in particular [algebraic data types](https://en.wikipedia.org/wiki/Algebraic_data_type). - Write a simple calculator. - Understand how parsing and interpretation works in the smallest meaningful example. - Introduce the idea of concrete and abstract syntax. At the same time we will learn some of the basic ideas underpinning next semester's compiler construction course. Abstract syntax plays the role of an intermediate language between the source language and the target language. For example, all the main algorithms of our C compiler (typechecking, interpretation, code generation) will operate on abstract syntax. In this session we focus on a calculator that works on abstract syntax. The aim of the first activity is to find the simplest way of defining our own numbers. **Activity:** We want to keep everything under control and not use the Haskell implementation of numbers for our calculator. But we also do not want to burden ourselves with the subtleties of how to represent binary numbers, nor do we want to complicate our algorithms with the need to carry bits. - So what is the simplest possible way to represent numbers? [Hint: Ignore efficiency questions.] - I will show you how to implement these numbers as a simple recursive data type in Haskell. Answer. [^successornumbers] Next, we need some operations on numbers. By the way, without our simple representation of numbers we would have to face complications. For example, here is a circuit that adds 4 bit binary numbers (taken from this [Four Bit Full Adder Tutorial](http://langster1980.blogspot.com/2014/12/four-bit-full-adder-tutorial.html)). As they say, "It gets complicated!" ![](https://i.imgur.com/cZKvWSd.png =400x400) **Activity:** How can we define addition and multiplication for these numbers? How can we implement this in Haskell? Our next aim is to write our first interpreter. **<font color=red>Homework:</font>** - Read [Abstract and Concrete Syntax](http://www.cse.chalmers.se/edu/year/2011/course/TIN321/lectures/proglang-02.html) up to and including "Notations for abstract syntax trees". Prepare a question for Thursday. (We will here more about context-free grammars, BNF, and the BNF converter in the next lecture. Skip the excursion on GF.) - Download and test [addition.hs](https://github.com/alexhkurz/programming-languages-2021/blob/master/src/Haskell/addition.hs). Experiment with dialogues like the following. >ghci Prelude> :load addition.hs *Main> add (S O) (S O) S (S O) *Main> add 1 1 <interactive>:3:5: error: • No instance for (Num NN) arising from the literal ‘1’ - What is the reason for the error message in the dialogue above? - Implement a function plusone :: NN->NN [Hint: This does not need recursion as the meaning of `S` is already "plus one".] Similarly, implement a function minusone :: NN -> NN - Can you do multiplication? [Hint: How do you break down a multiplication into a sequence of additions?] - What about subtraction? [Hint: Make two piles of stones and think about how to subtract the second from the first by removing a stone at a time from both piles.] **Stocktaking:** We currently have a presentation of numbers and addition and multiplication for these numbers. Or, in other words, we have a programming language in which we can add and multiply. But for a calculator, we also need a language in which the user can write arithmetic expressions. This is what we turn to next. **Activity:** Define a Haskell data type of arithmetic expressions that are made from numerals, and from operations `Plus` and `Times`. The following should be examples of numerals Num 0 Num 1 Num 2 Num 3 etc. From these numerals we then want to form expressions such as Times (Num 2) (Num 3) Plus (Num 1) (Times (Num 2) (Num 3)) **Stocktaking:** So far we have numbers with addition and multiplication and a language for arithmetic expressions. Our next aim is to write an interpreter that evaluates arithmetic expressions. **Activity:** Write a Haskell function`eval` that evaluates such expressions as numbers. ## Homework (until Tuesday) Finish the previous activity. In more detail, the task is to evaluate expressions in our source language data Exp = Num Int | Plus Exp Exp | Times Exp Exp into the successor numbers defined by data NN = O | S NN The type of the function will be eval : Exp -> NN To get you started I give you the base case eval (Num 0) = O eval (Num n) = S (eval (Num (n-1))) - Add the cases for `Plus` and `Times`. You should then be able to get a dialogue such as *Main> eval (Times (Num 2) (Num 3)) S (S (S (S (S (S O))))) *Main> eval (Plus (Num 1) (Times (Num 2) (Num 3))) S (S (S (S (S (S (S O)))))) *Main> - Extend your program so that it can account for subraction. - (Optional:) Can you add division? This is interesting, but will require some work as one has to find out how to do division with successor numbers. Use subtraction in the recursion. - (Optional:) Can you simplify the program by using Haskell's built in numbers instead of our own successor numbers? Once you done this adding division should be straightforward as you can now use Haskell's built-in division. --- ... this is how far we went on Thursday ... --- ## Coming Next So far, we can evaluate "abstract expressions" such as eval (Plus (Num 1) (Times (Num 2) (Num 3))) but it would be nicer to also be able to evaluate "concrete experessions" such as 1 + 2 * 3 (Of course, both are just two different ways of encoding $1+2\cdot 3$.) This will be the topic of the next lecture. **Activity:** Discuss the advantages and disadvantages of the concrete and the abstract format. Relate this discussion to the introductory quote by McCarthy. ## Summary We have seen how to write a calculator that operates on abstract syntax. I want to emphasise how simple this is. Also note how little we used of Haskell: Just algebraic data types, recursion and pattern matching. No libraries. No monads (that is the Haskell way to introduce side effects such as memory and input/output into pure functional programming). On the other hand, we do not want to force users to interact with our calculator using abstract syntax. One reason is that abstract syntax is 2-dimensional and linearising it forces us to write a lot of parentheses. More generally, humans find it difficult to read linearised abstract syntax. To address this problem we will take a quick look at parsing in the next session. ## A quick look at the history The inductive definition of addition and multiplication in terms of successor appears to my knowledge for the first time in - Richard Dedekind: [Was sind und was sollen die Zahlen](https://gdz.sub.uni-goettingen.de/id/PPN23569441X). 1888. On page 379 one finds ![](https://i.imgur.com/j5oQtJw.png) We see that Dedekind does recursion on the second argument, uses $1$ as its base case instead of $0$, and writes $'$ for successor. Multiplication is treated in the same way on page 382. Given how simple it is to write an interpreter on abstract syntax, and how difficult it is to write it on concrete syntax (if we do not use libraries or tools that help us with the parsing), it is maybe not a surprise that in one of the first modern programming languages, LISP, programmers write directly in linearised abstract syntax. (That is why LISP programs have so many parentheses.) Have a look at the classical article [Towards a Mathematical Science of Computation](http://www-formal.stanford.edu/jmc/towards.ps) by [McCarthy](https://en.wikipedia.org/wiki/John_McCarthy_%28computer_scientist%29). McCarthy was a pioneer of Computer Science. In 1955 he coined the term "artificial ingelligence", shortly afterwards he invented LISP and garbage collection, [time-sharing systems](https://en.wikipedia.org/wiki/Time-sharing), and in 1962 he introduced, with the quote above, the notion of abstract syntax. Interesting for us is that McCarthy emphasises that BNF [^bnf] makes it easy to generate, write and synthesize programs, whereas abstract syntax makes it easy to analyse, translate, type check, interprete and compile programs. Another way to put it, is to say that concrete syntax is good for human readers and abstract syntax is good for automated processing. Another influential article by McCarthy is [Ascribing mental qualities to machines](http://cs.uns.edu.ar/~grs/InteligenciaArtificial/ascribing.pdf) from 1979. (It is always worth looking at the original work of the pioneers. They are mostly remembered for only a very small part of their ideas, so there is often something interesting and forgotten to discover.)  ## Homework: More Practice Questions The aim of these questions is to help you to transfer the skills from the last session to the new concept of algebraic data types. - Instead of using built-in lists, we can define lists as our own algebraic data type as follows. data List a = Nil | Cons a List `a` is a type variable, denoting the type of the elements in the list. `Nil` plays the same role as `[]`. And `Cons` plays the same role as `:`, just that here we do not write it in infix but in prefix notation. The exercise is now to rewrite all the programs from the [previous session](https://hackmd.io/@alexhkurz/H1jUka4Gv) that worked by recursion over lists with this new definition of lists. - Using recursion over data NN = O | S NN - rewrite the programs from the [previous session](https://hackmd.io/@alexhkurz/H1jUka4Gv) that worked by recursion over numbers such as `fib`, `factorial`, `triangle_numbers`, etc. - write functions of type NN->NN->NN for less, greater, less_equal, greater_equal, div, mod, exp, ... Let me know if you make your own examples. ## Further Reading [Alexis King - "Types as axioms"](https://lexi-lambda.github.io/blog/2020/08/13/types-as-axioms-or-playing-god-with-static-types/). One thing that is new to many Haskell beginners is the type system. This blog presents an interesting point of view. In particular the data types we will define to represent abstract syntax are an example of what she calls "postive space". [^successornumbers]: The easiest way to represent numbers is using [tally marks](https://en.wikipedia.org/wiki/Tally_marks): `I` for one, `II` for two, `IIIIIIIIIIII` for 12, etc. But we also want a symbol for zero. So let us write `O` (pronounced "Oh") for zero, `S O` for 1, `S S O` for two, `S S S S S S S S S S S S S O` for 12, etc. The letter `S` here is used to remind us of "successor". As we will see shortly, these so-called **successor numerals** allow us to implement arithmetic much more easily then if had to use a [place-value system](https://en.wikipedia.org/wiki/Positional_notation) such as binary or decimal numbers. [^bnf]: We will learn about [BNF](https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form) in the next session.