# Assignment 5: Type Checking `MiniLangT`
This is a **pair or solo Gradescope assignment**—you will submit 2 OCaml files (`a5.ml` and `a5test.ml`) and a README on Gradescope.
:::info
Recall that assignments, if completed as a pair, must be completed **together** rather than dividing work. See the syllabus for more.
:::
In this assignment, you will define a single OCaml function, `type_of`, that is a **static type checker** for expressions in the `MiniLangT` language: a typed variant of the `MiniLang` language from `A3: Interp`. You can (and should!) design helper functions to aid in your implementation of `type_of`. You will also answer some written reflection questions on the impact of type checking. For extra credit, you may optionally choose to implement an interpreter for `MiniLangT` that is designed to run _after_ type checking is successful.
:::warning
For this assignment, you may reference any OCaml language documentation, the [free online OCaml textbook](https://cs3110.github.io/textbook/cover.html), and the class notes. You should not use any other resources without checking with me first!
You should also not use any mutation within your implementation.
:::
### Change log
I will avoid making changes to the assignment after it is posted, but any necessary changes will be added here:
# Setup
:::success
As with Assignment 4, the [stencil code](https://drive.google.com/file/d/1GuWmRAOed6C40q0wlVsafl-6H5wuobAi/view?usp=share_link) is set up as a `dune` OCaml project.
You will also need to install the following new library, for parsing code:
```
opam install sexplib
```
Then, run `dune test` to compile your code in `a5.ml` and run the inline tests in `a5tests.ml`.
:::
### Attribution
This assignment is adapted from the type checking assignment of [CS173](https://cs.brown.edu/courses/cs173/2024/) at Brown University, by Shriram Krishnamurthi and others.
# Introduction
As we saw in class, `MiniLangT` is a language with the following types: numbers (now, always integers), booleans, function types, and homogenous list types. We represent these types with the following OCaml variant type:
```ocaml
type typ =
| NumT
| BoolT
(* Types of the arguments, then return type *)
| FunT of typ list * typ
(* Element type *)
| ListT of typ
```
:::success
Your task is to implement a static type checker to assign one of these types to a `MiniLangT` program, or produce a specific exception if there is a type error. If a program "passes" your type checker (if some type is assigned), then _interpreting_ that program should not encounter a type error.
:::
The stencil code provides a `parse_from_string` function that consumes a `MiniLangT` expression as a string and produces a value from the following AST:
```ocaml
type operator =
| PlusOp
| NumEqOp
| ConsOp
type unary_operator =
| FirstOp
| RestOp
| IsEmptyOp
type expr =
| NumE of int
| BoolE of bool
| VarE of string
| EmptyE of typ
| OpE of operator * expr * expr
| UnaryOpE of unary_operator * expr
| IfE of expr * expr * expr
(* parameters with types, then body *)
| LamE of (string * typ) list * expr
(* function then argument *)
| AppE of expr * expr list
(* bindings then body *)
| LetE of (string * expr) list * expr
```
:::success
Your job is to complete the definition of the following function, along with any necessary helpers:
```ocaml
let type_of (e : expr) : typ =
```
:::
The stencil code also provides the function to parse and then call your type checker on an input:
```ocaml
let type_check (s: string) : typ =
let parsed = parse_from_string s in
type_of parsed
```
:::success
You should write a series of tests using `type_check`. Some examples are provided for you in `a5tests.ml`. You should have at least one test for every branch/arm of your `type_of` implementation.
:::
# Grammar
`MiniLangT` has the following concrete grammar.
Beyond being *statically types*, `MiniLangT` has the following changes from `MiniLang+` from Assignment 3, as we discussed in class:
1. For simplicity, `MiniLangT` does not have a string type or string operations, or `and` and `or` sugar.
2. `MiniLangT` has cons lists and their basic operations: `empty`, `cons`, `first`, and `rest`.
3. `let` local bindings and `lam` anonymous functions now can take 1 or more arguments (instead of exactly 1). Note that this means that in the grammar, there is an extra set of parentheses around the list of variables for each. In the grammar, the superscript `⁺` indicates that a form can appear 1 or more times (similar to `⁺` in [regular expressions](https://en.wikipedia.org/wiki/Regular_expression)).
4. As we saw in class, we need to annotate `lam` arguments and the `empty` list constructors with their types (otherwise, we would need to have a polymorphic type, which `MiniLangT` lacks).
```
<expr> ::= <num>
| true
| false
| <var>
| (+ <expr> <expr>)
| (num= <expr> <expr>)
| (if <expr> <expr> <expr>)
| (lam ((<id> : <type>)⁺) <expr>)
| (<expr> <expr>⁺)
| (let ((<id> <expr>)⁺) <expr>)
| (first <expr>)
| (rest <expr>)
| (is-empty <expr>)
| (empty : <type>)
| (cons <expr> <expr>)
<type> ::= Num
| Str
| Bool
| (List <type>)
| (<type>⁺ -> <type>)
```
:::warning
Note, the spaces before and after the ` : ` and ` -> ` are required by the parser for type annotations. In general, make sure you have spaces whenever they are shown in the grammar.
:::
# Features to Implement
## Type Environment
As we saw in class, static type checking requires us to track the types of any bound variables. Similar to your interpreter, your type checker should do this with a _type environment_. Rather than using a hash, the stencil code defines the following type definition based on a list of tuples that you should use:
```ocaml
type typ_env = (string * typ) list
```
As well as the following functions to look up a type given a variable:
```ocaml
let rec lookup env name =
match env with
| [] -> None
| (x, value) :: env -> if x = name then Some value else lookup env name
```
Due to shadowing, if the program binds an identifier that is already bound, the new binding should take precedence.
## `let`
`let` expressions now accept a series of identifier-expression pairs and a body.
`let` evaluates each expression in the list to a value, binds it to the corresponding identifier, and evaluates the body with the newly bound identifiers in scope.
For example, the following should evaluate to `3` and have type `Num`:
```racket
(let ((x 1)) (+ x 2))
```
While the following should evaluate to `2` and have type `Num`:
```racket
(let ((x (+ 1 2)) (y false)) (if y x 2))
```
`let` should not support recursive definitions. That is, in `(let ((<id1> <expr1>) (<id2> <expr2>)) <body>)`, `<id1>` should be bound in `<body>` but not in `<expr1>` or `<expr2>`.
## Lists
`MiniLangT` now contains support for lists via the constant `empty` and the operations `cons`, `is-empty`, `first`, and `rest`. Like in OCaml, lists in `MiniLangT` are _homogeneous_: all elements in the list must have the same type. (Unlike OCaml, `MiniLangT` lacks variant types, which means we could not implement a type checker for `MiniLangT` _in_ `MiniLangT` - unless we made a much more convoluted type definition!)
Because our type checker is not polymorphic, we handle typing the `empty` constant list with this with a simple syntactic adjustment: we require that every empty list be annotated with a type for the elements. The elements that eventually are cons to it must be consistent with that annotated type.
Thus, our typed list semantics are as follows:
- `empty`
`(empty : t)` makes an empty list whose elements have type `t`.
- `cons` has type `t (List t) -> (List t)`
`(cons x y)` appends the element `x` to the front of the list `y`.
- `first` has type `(List t) -> t`
`(first x)` returns the first element of `x`.
- `rest` has type `(List t) -> (List t)`
`(rest x)` returns the list `x` except for the first element of `x`.
- `is-empty` has type `(List t) -> Bool`
`(is-empty x)` returns `true` if `x` is `empty`; otherwise, it returns false.
Whenever these type signatures are violated, your type checker should raise an error.
Error-checking `cons` should occur in the following manner: if the type of the second argument to `cons` isn't a `(List t)`, you should raise an `OperatorError` exception with a message. If the second argument to `cons` is a `(List t)` but the type of the first argument is not `t`, then you should raise an `OperatorError` exception with a different message. For example:
```racket
(cons 1 3)
(cons 2 (empty : Bool))
```
should both raise `OperatorError` exceptions with helpful error messages.
## Functions, conditionals, arithmetic
Your type checker should statically rule out all of the possible type errors from assignment 3 around these constructs.
Recall that conditionals in `MiniLang` require that the first argument be a boolean. This issue or any other `if`-related type issue should raise a `ConditionalError` exception.
`MiniLangT` has true multi-argument support, rather than support for currying. Thus, passing the wrong number or type of arguments should be a type error. These issues should raise a `ApplicationError` exception.
Any other type issue (for example, with `+`, `num=`, `cons`, etc) should raise an `OperatorError` exception.
:::success
Note the autograder will check for specific exceptions (so your tests should, too)! If you are unsure what exception a specific `MiniLangT` program should raise, ask me on Zulip.
:::
## Unbound variables
Unbound variables should raise the `UndefVar` exception.
For example:
```
(let ((y: Int)) x)
```
Should raise `UndefVar` at the use of `x`.
## Error Exceptions
The stencil code provides the following exceptions:
```ocaml
(* You should use in a5.ml *)
exception OperatorError of string
exception ConditionalError of string
exception ApplicationError of string
exception UndefVar of string
(* Used by the stencil code *)
exception ParseError of string
```
As noted in the comments and described above, you should use the first 4 exceptions in implementing `ty_of`. The stencil code use the `ParseError` exception, but you do not need to use or test for it directly.
As shown in the example test, you should have tests for specific exceptions. For example:
```ocaml
let%test "test_num_eq_bool_num_error" =
try ignore (type_check "{num= false 0}"); false
with OperatorError _ -> true
```
OCaml's builtin `ignore` function has type `'a -> unit`, it's used to indicate that the value is not used but instead just produces some side effect (here, raising an exception in the expected case). If `type_of` incorrect raised an `UndefVar` exception on this program, the exception would not be caught by the `with` and this test would fail, as expected.
## Implementation summary
:::success
Your `type_of` function should produce either:
1. The specific type of the expression passed in, or
2. Raise a specific one of the 4 provided exceptions.
:::
The reference implementation is 85 lines (though depending on how you structure your `match` statements and use newlines, yours may be substantially shorter or longer).
## Debugging with `utop`
You can use `dune utop` to help you debug your code. For example, here is how to use `dune utop` to see the `MiniLangT` AST (`expr`) for a given expression, then check the type your `type_of` assigns it.
```ocaml
$ dune utop
utop # #mod_use "a5types.ml";;
module A5types :
sig
exception OperatorError of string
(* ... more definitions elided ... *)
utop # #use "a5.ml";;
type typ_env = (string * typ) list
val string_of_typ_env_entry : string * typ -> string = <fun>
val lookup : ('a * 'b) list -> 'a -> 'b option = <fun>
val ty : expr -> typ_env -> typ = <fun>
val type_of : expr -> typ = <fun>
val type_check : string -> typ = <fun>
utop # parse_from_string
"{rest (cons (lam ((x : Bool) (y : Bool)) x) (empty : (Bool Bool -> Bool)))}" ;;
- : expr =
UnaryOpE (RestOp,
OpE (ConsOp, LamE ([("x", BoolT); ("y", BoolT)], VarE "x"),
EmptyE (FunT ([BoolT; BoolT], BoolT))))
utop # type_check
"{rest (cons (lam ((x : Bool) (y : Bool)) x) (empty : (Bool Bool -> Bool)))}" ;;
- : typ = ListT (FunT ([BoolT; BoolT], BoolT))
```
# Written questions
:::success
Create a README file and answer the following questions.
:::
Consider implementing an interpreter for `MiniLangT` that runs _after_ type checking (as you can do for extra credit, as described below).
1. Did you need to account for whether `MiniLangT` has lexical vs. dynamic scope in your type checker? Explain how this impacts your design.
1. What runtime errors does the hypothetical interpreter for `MiniLangT` **not** need to check for, with type-checking run beforehand? Give at least two examples. Think back to your Assignment 3 implementation for inspiration.
1. What runtime errors are possible from this hypothetical interpreter for `MiniLangT`, even if type-checking succeeded? That is, what runtime errors can type checking **not** rule out here?
1. **[Independent]** How has the addition of type-checking changed the semantics of `MiniLangT` compared to a language with the same features, but no types? Write two programs that would run just fine (if type annotations on `lam` and `empty` were ignored), but are no longer valid because they would fail the type checker. For full credit, the two programs should fail the type checker for different reasons.
1. **[Independent]** How does adding type-checking affect how long it takes to interpret a program (ignoring the time to type-check)? What kinds of programs might take less ["wall clock" running time](https://en.wikipedia.org/wiki/Elapsed_real_time#:~:text=In%20computing%2C%20elapsed%20real%20time,at%20which%20the%20task%20started.), what kinds might stay the same, and what kinds might take more time?
# [Independent] Optional extra credit: interpreter redux
For up to +10 (out of 100) extra credit, implement an interpreter for `MiniLangT` in OCaml that is designed to run after your type checker.
You should define a `valu` type, then an `interp` function, which you should then be able to use like below.
:::success
Place the definitions of `valu`, `interp`, and `ty_interp` in `a5.ml`.
:::
```ocaml
let ty_interp (s: string) : valu =
let parsed = parse_from_string s in
let _ = type_of parsed in
interp parsed
```
Include additional tests for `ty_interp` in a new file, `a5interptest.ml`. Start the file with:
```ocaml
open A5
open A5types
```
Add the following to your `dune` file to be able to run these new tests:
```ocaml
(library
(name a5interptest)
(modules a5interptest)
(libraries a5 a5types)
(inline_tests)
(preprocess
(pps ppx_inline_test)))
```
Submit the new file as well. Note the extra credit is not checked by the autograder for this assignment, since you defined your own `valu` type.
# Submission
Submit the following **3 files** on **Gradescope**:
:::success
- `a5.ml`
- `a5tests.ml`
- `README` as either a `txt`, `md`, or `pdf` file, with:
- Your answers to written questions 1-5.
- Roughly how long did this assignment take you (in hours)?
- Which was the most challenging part of this assignment?
- Which resources did you find to be the most useful?
- If you completed the extra credit, describe how your interpreter implementation differs from the one you completed in Assignment 3.
:::