Assignment 3: Interp

Assignment 3: ==Interp== ------ This is a **pair or solo Gradescope assignment**—you will submit 4 Racket files and a README on Gradescope (to several different drops, as described below). :::info Recall that assignments, if completed as a pair, must be completed **together** rather than dividing work. See the syllabus for more. ::: :::warning For this assignment, you may reference the [`plait` documentation](https://docs.racket-lang.org/plait/index.html), the [free online PLAI textbook](https://www.plai.org), and the class notes. You should not use any other resources without checking with me first! You should also not use any mutation within your implementation. ::: ### Change log I will avoid making changes to the assignment after it is posted, but any necessary changes will be added here: # Setup As with previous assignments, I suggest you complete this assignment in DrRacket. The stencil code is setup to use `#lang plait`. ### Attribution This assignment is adapted from the `interp`/`desugar` assignment of [CS173](https://cs.brown.edu/courses/cs173/2024/) at Brown University, by Shriram Krishnamurthi (the author of the [PLAI textbook](https://www.plai.org/)), David Beazley and the staff of CSCI 1730. # Introduction For this assignment, you will write an interpreter for a simple (but Turing complete!) language with numbers, booleans, strings, and lambdas: `MiniLang`. Testing is critical to good language implementations, so as mentioned in class, your tests will also be programmatically graded! To make things even more interesting, you will also implement a `desugar` function that allows you to add `and`, `or`, and `let` to the languages _without_ changing your interpreter (by treating them as _syntactic sugar_). :::success As such, the assignment will have 4 parts: 1. **Part 1**:`interp` tests 2. **Part 2**:`interp` code 3. **Part 3**:`desugar` tests 4. **Part 4**:`desugar` code ::: :::info I suggest approaching the assignment as follows: 1. Read through all of **Parts 1 & 2: `interp`**. Ask any questions you may have on Zulip or at help hours. 2. Write 20+ `interp` test cases (they will initially fail, since the stencil's `interp` only ever returns an error). Upload your `interp` tests to Gradescopes to get automated feedback from the autograder, to make sure you understand what the interpreter **should** do. 3. Once you are happy with your initial tests, implement `interp` case-by-case, checking to see if more of your tests pass on every case. Upload your `interp` to Gradescope to get basic automated feedback that the autograder can run your handin. 4. Read through **Parts 3 & 4: `desugar`**. Ask any questions you may have on Zulip or at help hours. Again, repeat the process of checking your tests, then uploading your implementation. ::: The reference implementation for `interp` is around 60 lines of code and the reference implementation for `desugar` is around 35 lines of code. But, ==these lines of code will all require careful consideration==! I've added **preliminary deadlines** for earlier parts of the assignment to Gradescope and in the submission section at the end. I will accept submission of all 4 parts until the final late deadline, but I will take into account whether you made **some good-faith submission** (does not need to be your final submission) by these preliminary deadlines in determining the 10 points for your style/design score. :::warning Last semester, this assignment took students on average 11 hours, though some students took up to 15 hours. This is also our last assignment before Quiz 1 (and there is no assignment the week of the quiz). For these reasons, the final part of this assignment is due next Sunday, rather than Thursday. The late deadline for the entire assignment is Tuesday, 10/7. ::: # Stencil Code I've provided following starter code (files you should edit are in green): :::success - `interpreter-tests.rkt`, where you will write your tests for `interp`. - `interpreter.rkt`, where you will implement `interp`. You are not allowed to change the signature of `eval` or `interp`, but you are welcome to--and might need to!--add helper functions for your implementation. - `desugar-tests.rkt`, where you will write your tests for `desugar`. - `desugar.rkt`, where you will implement `desugar`. ::: :::warning - `support.rkt`, `test-support.rkt`, `desugar-support.rkt`: support files, including parsing, that you may read but **should not** edit. ::: You can download the [stencil code here (requires Wellesley login)](https://drive.google.com/file/d/1fVW9Oo9JluaJ2qedT5DNxrkGdcNPMSqb/view?usp=sharing). # Parts 1 & 2: `interp` First, you'll write tests for and implement your very first interpreter. In doing so, you'll have _implemented_ a programming language for the first time, instead of just _using_ one! Your interpreter will consume ASTs and produces values. The stencil code provides a function, `parse`, which consumes an expression in the `MiniLang` language's concrete syntax, `S-Exp`, and returns the abstract syntax tree representation of that expression (an `Expr`). ``` parse :: S-Exp -> Expr ``` `parse` only accepts expressions that follow `MiniLang`'s [grammar](#%28part._grammar%29). :::success You will implement a new function: `interp`. - `interp :: Expr -> Value` which consumes the abstract syntax tree (i.e. an `Expr`) and returns a `MiniLang` `Value` (which has variants for a number, string, boolean, or lambda). ::: `interp` should evaluate programs by performing a _post-order traversal_ of the abstract syntax tree (AST): first, evaluate all of the children of an expression from left to right, then evaluate the expression itself. This makes it unambiguous which error to raise if there are multiple errors in the AST. ## Errors For throwing errors, you should use Racket's error message convention we saw in class. You should throw an error with `(error <symbol> <message string>)`. For this assignment, you should use the symbol `'interp-error` and then a descriptive error message for that specific error. For example: ``` (error 'interp-error "unbound variable was found") ``` Some examples of expressions that should raise errors are: ```racket (str= (+ 5 "bad") "hello") (++ false (+ "bad" 6)) ("not function" (+ 7 "bad")) ``` ## Features to Implement ### Environment Your interpreter should use an environment, `Env`, to keep track of the `Value`s of variables in scope (similar to our `consts` in the calculator example from class). Your `interp` should implement _lexical scope_. ```racket (define-type-alias Env (Hashof Symbol Value)) ``` Since `Env` is a [Hashof](https://docs.racket-lang.org/plait/Types.html#%28form._%28%28lib._plait%2Fmain..rkt%29._.Hashof%29%29), you can use Plait’s built-in hash table functions on your `Env`. :::warning For your environment, make sure you use [hash](https://docs.racket-lang.org/plait/Predefined_Functions_and_Constants.html?q=hash#%28def._%28%28lib._plait%2Fmain..rkt%29._hash%29%29), which creates _immutable_ hash tables! What happens if you use [make-hash](https://docs.racket-lang.org/plait/Predefined_Functions_and_Constants.html?q=hash#%28def._%28%28lib._plait%2Fmain..rkt%29._make-hash%29%29), which creates _mutable_ hash tables instead? Try replacing one with the other and see. If none of your tests fail, you aren’t testing enough! You should have at least one failing test, if not several, when you make this switch. ::: `interp` should allow _variable shadowing_, meaning that if you bind a variable that is already bound, the new, inner-most binding takes precedence. When in doubt, your interpreter should handle scope the same way Racket handles scope. When you encounter an unbound variable in `interp` encounters an unbound variable, it should raise an error. ### Binary Operators `MiniLang` includes binary addition (`+`) and number equality testing (`num=`), as well as string appending (`++`) and string equality testing (`str=`). In place of having separate syntactic forms for each of `+`, `num=`, `++`, and `str=`, `parse` converts these operators into a single AST datatype variant, `e-op`, which denotes the operation to use via an `Operator` variant: ``` (define-type Operator (op-plus) (op-append) (op-str-eq) (op-num-eq)) ``` When you implement these operators, you should use Plait's - [+](https://docs.racket-lang.org/plait/Predefined_Functions_and_Constants.html#%28def._%28%28lib._plait%2Fmain..rkt%29._%2B%29%29) for `op-plus`, - [string-append](https://docs.racket-lang.org/plait/Predefined_Functions_and_Constants.html#%28def._%28%28lib._plait%2Fmain..rkt%29._string-append%29%29) for `op-append`, - [string=?](https://docs.racket-lang.org/plait/Predefined_Functions_and_Constants.html#%28def._%28%28lib._plait%2Fmain..rkt%29._string~3d~3f%29%29) for `op-str-eq`, and - [=](https://docs.racket-lang.org/plait/Predefined_Functions_and_Constants.html#%28def._%28%28lib._plait%2Fmain..rkt%29._~3d%29%29) for `op-num-eq`. Evaluation should also raise an error for non-numeric values passed to `+` and `num=` operations, and for non-string values passed to `++` and `str=` operations. For example, you should throw an error when the operator is receiving the wrong type of value, for example: ```racket (+ 0 "string") ; should be an error ``` ### Conditionals `if`-expressions in `MiniLang` have three parts: - `condition`, which should evaluate to a Boolean `Value` - `consq`, which evaluates if `condition` evaluated to `true` - `altern`, which evaluates if `condition` evaluated to `false` Like in Racket, `if` statements should short-circuit (i.e. only evaluate the relevant branch). :::warning **Unlike in Racket**, if the `condition` to the `if` evaluates to a non-Boolean `Value`, in `MiniLang` this should produce an error. ::: ### Functions For simplicity, functions in `MiniLang` are _unary_ (that is, they take exactly 1 argument). Here are two examples of valid functions and their applications: ``` ((lam x (+ x 3)) 2) ((lam y 5) 1) ``` These should both evaluate to 5. It’s possible that when attempting to perform a function application, the value in the function position isn’t actually a function; e.g., you might have `(1 2)`. In this case, you should raise an error as well. ## `MiniLang` Grammar The grammar of `MiniLang` is as follows: ``` <expr> ::= <num> | <string> | <var> # variable | true | false | (+ <expr> <expr>) | (++ <expr> <expr>) | (num= <expr> <expr>) | (str= <expr> <expr>) | (if <expr> <expr> <expr>) | (lam <var> <expr>) # anonymous function | (<expr> <expr>) # function application ``` ### `MiniLang` ASTs Refer to [Environment](#environment) for the definition of `Env` and [Binary Operators](#binary-operators) for the definition of `Operator`. ```racket ; MiniLang ASTs (define-type Expr (e-num [value : Number]) (e-str [value : String]) (e-bool [value : Boolean]) (e-op [op : Operator] [left : Expr] [right : Expr]) (e-if [condition : Expr] [consq : Expr] [altern : Expr]) (e-lam [param : Symbol] [body : Expr]) (e-app [func : Expr] [arg : Expr]) (e-var [name : Symbol])) ``` Your `interp` should produce the following values: ```racket ; MiniLang Value variants (results) (define-type Value (v-num [value : Number]) (v-str [value : String]) (v-bool [value : Boolean]) (v-fun [param : Symbol] [body : Expr] [env : Env])) ``` ## Testing Programming language implementations are expected to be extremely reliable (when’s the last time you ran into an implementation bug?). I want you to practice upholding this standard! In addition to the quality and correctness of your code, you will be evaluated on the **quality and correctness of your tests**. Rather than trying to judge this purely subjectively, on this assignment, I will also do so programmatically. ### How I'll Test Your Tests What’s the job of a test suite (i.e., a set of tests)? It’s to find errors in a program. In short, test suites are like sorting machines, putting programs in a “good” or “bad” bin. If you are a mathy person, you might call a test suite a _classifier_. So, here’s how I will test your test suites on this assignment: the autograder has some interpreters that are known to be correct; we'll call each of these a _wheat_. The others are known to be incorrect (i.e., have known errors); we call each of these a _chaff_. Your test suite’s job is to separate the [wheat from the chaff](https://en.wikipedia.org/wiki/Chaff#Metaphor). That is, the autograder runs each of the wheats and chaffs against your test suite and see what happens: ``` | On a wheat… | On a chaff… | ------------------------------------------------ …all tests passed | GREAT! | Not great… | …some tests failed | Ooops! | GREAT! | ``` All tests passing on a wheat, and **at least one test failing** on a chaff, is exactly what we are hoping for. If all tests pass on a chaff, that's not ideal, but you may miss _some_ chaffs, so it may be okay. But when *any* tests fail on a wheat, that's **definitely** a problem because it should never happen. It quite likely means you've misunderstood the problem statement, or perhaps the problem statement is ambiguous, or something like that. If your tests fail on a wheat and you don't understand why, reach out to me via Zulip DM right away. The quality of your test suite is then a measure of whether you passed all of the wheats and how many chaffs you caught. Of course, the latter could be arbitrarily hard depending on how devious the introduced bugs are. For instance, I could define a chaff that always works correctly except when a given number is exactly 42. The chaffs will not be like these, both because that would make the assignment too difficult and because real implementations are very rarely buggy in this way. Instead, each chaff has one or more “reasonable” mistakes (but not all of them will be easy!). In short, I will be running _your_ test suite against _other_ implementations. Therefore, it is very important that when you turn in your test suite (see details below), it not be accompanied by your implementation: otherwise, the autograder will fail on naming conflicts. ### Guidelines for testing your interpreter As a reminder, the provided `eval` consumes a program in the `MiniLang` language's concrete syntax (`S-Exp`) and returns a `MiniLang` `Value`, calling your `interp` along the way: ``` eval :: S-Exp -> Value ``` You should use `eval` in your testing file when writing test cases. You should not call `interp` or any helper functions directly in your **test file** (though you are welcome to individually test these functions in your code file). To enable autograder of tests, please use the following testing forms: - `(test-equal? name actual expected)`, `(test-not-equal? name actual expected)` Tests that `actual` and `expected` evaluate to the same value (in the case of `test-not-equal?`, different values). - `(test-true name expr)`, `(test-false name expr)` Tests that `expr` evaluates to `#t` (in the case of `test-false`, `#f`). - `(test-pred name pred expr)` Tests that `expr` returns a value that satisfies the given pred predicate. - `(test-raises-error? test-name expr)` Tests that the given `expr` raises an error. (Example usage can be found in the testing stencil.) Finally, recall that programs can evaluate to functions. For simplicity, your tests in your test file should only check that such a program returned a function, and not rely on the specific function returned (because of there could be differences in representation of lambdas). For instance, you may write: ``` (test-pred "My predicate test" v-fun? (eval `{lam x 5}) #t) ``` :::info Reminder: In Plait, you can add a `?` to the end of the name of any given type variant to create a function that returns `true` if the expression evaluates to that type variant. ::: However, you should not write: ``` (test-equal? "Don't write this test" (eval `{lam x 5}) (v-fun 'x (e-num 5) (hash (list )))) ``` because other representations of closures may not match your exact representation. ## Part 1: `interp` tests :::success First, write tests that cover all the core language features and focus on potentially interesting behavior. This should be 20+ tests. All of your tests will initially fail with `error todo: unimplemented`. Upload `interp-test.rkt` to the `A3: Part 1: interp tests` drop on Gradescope. You will receive autograder feedback on whether your tests pass on all wheats, and fail on a _subset_ of chaffs (the full set of chaffs will only be autograded after your final submission). ::: ## Part 2: `interp` code :::success Once you are happy with your initial set of `interp` tests, get to work implementing `interp`! You may find yourself adding new tests as you write the actually implementation, add these as you go. You can locally run your tests against your own implementation by pressing `Run` from `interp-tests.rkt`. Upload `interp.rkt` to `A3: Part 1: interp code` on Gradescope. ::: ## Debugging You may find it useful to use Plait's [trace](https://docs.racket-lang.org/plait/Definitions.html#%28form._%28%28lib._plait%2Fmain..rkt%29._trace%29%29) to help understand the control flow of your interpreter. For instance, if you write ``` (trace interp) ``` then all subsequent calls (including—and especially—recursive calls) to `interp` will be presented with their arguments and results. Do not include calls to `trace` in your final submissions. # Parts 3 & 4: `desugar` As in Parts 1 & 2, I have provided a `parse` function which consumes an expression in the `MiniLang`'s *extended* concrete syntax and returns an AST: we'll call the extended language `MiniLang+` This new parser produces ASTs as a new type, `Expr+`: ``` parse+ :: S-Exp -> Expr+ ``` `Expr+` is very similar to the `Expr` from before. However, it now includes three new constructors, `sugar-and`, `sugar-or`, and `sugar-let`. These represent the new syntactic sugar expressions! To avoid confusion, the constructors it shares with `Expr` have been renamed to `e-num+`, `e-app+`, and so on. You will implement the `desugar` function: ``` desugar :: Expr+ -> Expr ``` which consumes a `MiniLang+` abstract syntax tree (i.e. an `Expr+`, as returned by `parse+`), replaces all instances of syntactic sugar with desugared equivalents, and returns the result as an `Expr` (which can then be evaluated by you existing `interp`). Observe that an expression may have _multiple different correct desugarings_. (A desugaring is “correct” if, when the result is evaluated, it produces the desired result.) :::warning Because there are multiple correct desugarings, you must be careful in how you write tests for this assignment. For example, any test that follows this pattern: ``` (test-equal? "test name" (desugar ...) ...) ``` is **invalid!** If you were to run this test on another `desugar` implementation that happens to produce a different correct desugared expression, it would fail. ::: To judge whether your `desugar` implementation is correct, you will need your`interp` implementation (so, you should finish **Part 2** first. The `desugar.rkt` code stencil automatically imports your interpreter. You will use the provided `eval` function (which wraps around your `desugar` and `interp`, and the stencil's `parse+`) to write test cases. ## Features to Implement As described previously, `MiniLang+` contains several forms of syntactic sugar: `and`, `or`, and `let`. `desugar` should convert `sugar-and`, `sugar-or`, and `sugar-let` `Expr+`s into functionally equivalent, non-sugar `Expr`s. There are multiple implementation strategies for `desugar`. Be sure to test it well: it's easy to miss some details when desugaring! ## `and` and `or` - `and` consumes two boolean expressions. It evaluates to `true` if both boolean expressions are true; otherwise, it evaluates to `false`. - `or` consumes two boolean expressions. It evaluates to `true` if at least one boolean expression is true; otherwise, it evaluates to `false`. :::warning In `MiniLang+`, `and` and `or` should short-circuit; that is, they should only evaluate the second expression if necessary to determine `true` or `false` (similar to Racket's `and` and `or`). For your desugaring strategy, it may help to think of what *core* `MiniLang` feature(s) have similar behavior (where a subexpression is only evaluated some of the time). ::: ## `let` For simplicity, `MiniLang+`'s `let` supports just a single variable-value pair and a body. `let` evaluates the value, binds it to the variable, and evaluates the body with the newly bound variable in scope. For example, the following should evaluate to `3`: ```racket (let (x 1) (+ x 2)) ``` `MiniLang+`'s `let` should not support recursive definitions. That is, in `(let (<var> <expr>) <body>)`, `<var>` should be bound in `<body>` but not in `<expr>` (e.g., it should work more like Racket's `let` than `let*`). The desugaring of `sugar-let` may not be obvious, so here's a hint: what `Expr`(s) allow us to extend the variables bound within a given environment? ## `MiniLang+` extended grammar The (expanded) grammar of `MiniLang+` is as follows: ``` <expr> ::= <num> | <string> | <var> # variable | true | false | (+ <expr> <expr>) | (++ <expr> <expr>) | (num= <expr> <expr>) | (str= <expr> <expr>) | (if <expr> <expr> <expr>) | (and <expr> <expr>) | (or <expr> <expr>) | (let (<var> <expr>) <expr>) | (lam <var> <expr>) # anonymous function | (<expr> <expr>) # function application ``` ### `MiniLang+` ASTs ```racket (define-type Expr+ (e-num+ [value : Number]) (e-str+ [value : String]) (e-bool+ [value : Boolean]) (e-op+ [op : Operator] [left : Expr+] [right : Expr+]) (e-if+ [condition : Expr+] [consq : Expr+] [altern : Expr+]) (e-lam+ [param : Symbol] [body : Expr+]) (e-app+ [func : Expr+] [arg : Expr+]) (e-var+ [name : Symbol]) (sugar-and [left : Expr+] [right : Expr+]) (sugar-or [left : Expr+] [right : Expr+]) (sugar-let [var : Symbol] [value : Expr+] [body : Expr+])) ``` ## Part 3: `desugar` tests For the purposes of testing, there's now a modified `eval` function that additionally passes the parsed `Expr+` through your `desugar` function before interpreting it. ``` eval :: S-Exp -> Value ``` You should use again `eval` in your testing file when writing test cases. You should not directly test `desugar` individually in your test file (though you are welcome to individually test your functions in your code file). Your submitted test cases should indirectly test desugaring, because you should test that `and`, `or`, and `let` work correctly)! Of course, since `eval` relies on both your `interp` and `desugar` functions, test failures could come from either implementation. If you can't figure out a bug, you may want to double-check that your `interp` is correct - try writing more test cases for **Part 1**. :::success Write tests cases for `and`, `or`, and `let`; as well as how they interact with other language features from Parts 1 & 2. Upload `desugar-tests.rkt` to the `A3: Part 3: desugar tests` drop on Gradescope. ::: ## Part 4: `desugar` code :::success Once you are happy with your initial set of `desugar` tests, get to work implementing `desugar`! Upload `desugar.rkt` to `A3: Part 4: desugar code` on Gradescope. For this final submission, you should also include a README that reports: - Roughly how long did this assignment take you (in hours)? - Which was the most challenging part? - Which your favorite part? - Which resources did you find to be the most useful? ::: --- # Grading This assignment is worth 100 points, broken down by: :::info - 30: Part 1: `interp` tests - 30: Part 2: `interp` code - 15: Part 3: `desugar` tests - 15: Part 4: `desugar` code - 10: Style, design, and good-faith effort on preliminary deadlines. ::: Note that each Gradescope drop is out of 100, but these will be weighted to build your final assignment grade as specified above. # Submission Submit the following **5 files** on **Gradescope**: :::success - To `A3: Part 1: interp tests`: `interpreter-tests.rkt` - Preliminary deadline: Sunday, 09/21 - To `A3: Part 2: interp code`: `interpreter.rkt` - Preliminary deadline: Tuesday, 09/30 - To `A3: Part 3: desugar tests`: `desugar-tests.rkt` - Preliminary deadline: Thursday, 10/2 - To `A3: Part 4: desugar code`: `desugar.rkt`, `README.` {`.txt`, `.md`} - On-time deadline: Sunday, 10/5 :::