# Assignment 9: `MiniJava` :coffee: This is a **pair or solo Gradescope assignment**—you will submit 2 OCaml files (`eval.ml` and `tests.ml`) and a README on Gradescope. :::info Recall that assignments, if completed as a pair, must be completed **together** rather than dividing work. See the syllabus for more. ::: We all know and love Java from CS230. In this assignment, you will get to explore how Java is made :coffee:! You will complete a type-checker and interpreter for a miniature version of Java, `MiniJava`. You are provided with a stencil of an implementation for `MiniJava`, a language that captures many interesting OOP semantic features of Java. The implementation has 3 phases: :::info - **Phase 1**: A type-checker that runs on the source AST, - **Phase 2**: A compiler from the source AST to a lower-level object representation, and - **Phase 3**: An interpreter that runs on the lower-level representation. ::: This loosely follows the design of real [JVMs](https://en.wikipedia.org/wiki/Java_virtual_machine). To tighten our focus in this assignment to the new OOP features, the stencil includes cases for the non-OOP language constructs. **Phase 2** is also fully implemented in the stencil. This leaves you with two tasks: - **Phase 1**: `type_check` and `type_check_classes` :::success Your first job is to implement the **type-checking cases** for objects, methods, and fields. ::: - **Phase 2**: The compilation step is implemented for you, but relies on your `type_check*` code to correctly rule out type errors and set the static object type on field constructs. - **Phase 3**: `interp` :::success Your second job is to implement the **interpreter cases** for objects, methods, and fields. ::: :::warning For this assignment, you may reference any OCaml language documentation, the [free online OCaml textbook](https://cs3110.github.io/textbook/cover.html), Java documentation, and the class notes. You may also find it useful to translate `MiniJava` programs to Java to check expected outputs! ::: The stencil code also provides numerous test cases. :::success Your final job is to add **at least 5 more test cases** for `type_check_classes` and **at least 10 more test cases** for `iterp` (you can reuse some of your 5 type check programs). ::: ### Change log I will avoid making changes to the assignment after it is posted, but any necessary changes will be added here: - [4/24] Added more tips for finding type-errors and defining `type_check_class` [here](https://hackmd.io/yNdBHOKoTIeyywH6JcmkKg?view#Finding-type-errors) - [4/28] Added more on handling runtime matches. - [5/5] Fix typo in `type_check` name. # Setup As with previous assignments, the [stencil code](https://drive.google.com/file/d/1ny03jGLADWaG1xxclC_mrX7eIF9c7FPp/view?usp=sharing) is set up as a `dune` OCaml project. :::success The stencil code uses a new dependency to pretty-print exceptions. Install it with: ``` opam install ppx_deriving ``` ::: Run `dune test` to compile your code in `eval.ml` and run the inline tests in `tests.ml`. If you would like to test expression in `utop`, run `dune utop` from the project directory then, from inside `utop`, run: ``` #mod_use "eval.ml" ;; open Eval ;; ``` You can then see program output with: ``` eval "(program ...)" ;;; ``` This may be useful for debugging test cases. ### Attribution This assignment is adapted from an assignment from [CS173](https://cs.brown.edu/courses/cs173/2024/) at Brown University, by Shriram Krishnamurthi and others. # The `MiniJava` Language To help us focus on the core OOP semantics rather than imperative parsing and interpretation, `MiniJava` uses a Racket-like parenthesized syntax and let-bindings. A `MiniJava` program consists of a list of classes (just like real Java programs) and then a single expression that is a simplified version of the program's `main` function (without the `public static void` syntax!). Each class consists: 1. A class name. 2. A required superclass (unlike real Java, `MiniJava` programs must explicitly extend `Object` if they do not have a different superclass). 3. A list of fields (which can be empty). 4. A list of methods (which can be empty). :::info All objects must have a superclass, which can be `Object`. However, `Object` is still implicitly defined; for `MiniJava`, as the top-level class that has no fields or methods. ::: For simplicity, all methods are "single-arity", which means they must take exactly one argument and must return a value. For example, here is a `MiniJava` class called `Foo` that extends `Object` with an integer field named `x` and a method named `bar` that takes in a number and returns the sum of `x` with the input number: ```lisp (class Foo extends Object (fields (x Int)) (methods (method Int bar (Int y) (+ (get this x) y)))) ``` The grammar for fields is ```lisp (field <field-name> <field-type>) ``` The grammar for methods (similar to real Java) is: ```lisp (method <return-type> <method-name> (<param-type> <param-name>) <method-body>) ``` We can instantiate a new object with ``` (new <class-name>) ``` ## Object creation Like real Java, `MiniJava` creates new object instances with `new`. Less like real Java, `MiniJava` does not have special constructor semantics. When an instance of an object is instantiated, all of its number field values should be initialized to `0` and all of its object field values should be initialized to `null`. Users must set fields explicitly with `get` or `set` (though, as you can see from `tests.ml`, you can write your own `init<Class>` functions to run after `new`). ## Inheritance and polymorphism Like real Java, `MiniJava` has polymorphism from inheritance via **method overriding**. Subclasses are subtypes of superclasses. Any type is also a subtype of itself. When inheriting from a superclass, a subclass can **override methods** defined in the superclass. The typing and evaluation rules for methods follow that of real Java, as specified below (and in more detail in the class notes). :::info Note that the stencil code provides a `check_subtype` helper. ::: ### Static and dynamic types As we saw in class, Java semantics depends on both the **static** and **dynamic** type of an object. As a reminder, in the real Java line ```java A declAC = new C(); ``` The static type of `declAC` is `A`; the dynamic type is `C`. We can write the same example in `MiniJava` with: ```lisp (let (A declAC (new C)) ...code-that-uses-declAC...) ``` ### Method overriding with dynamic type As in real Java, method calls should be resolved via inheritance and dynamic dispatch rules. In short, method calls should be based on the **dynamic type** of an object. Method calls should use the definition provided in the most specific superclass of the dynamic type of the object that implements the method. To call a method on an object in `MiniJava`, we can write: ```lisp (call <object-expr> <method-name> <arg-value>) ``` #### Covariant return types Like in real Java, `MiniJava` programs with method overriding can have **covariant return types**. The return type of the overriden method must be a subtype of the return type in the superclass. For example, the following program type would type check: ```lisp (classes (class A extends Object (fields) (methods (method A foo (Int x) (...)))) (class B extends A (fields) (methods (method B foo (Int x) (...))))) ``` For simplicity, argument types must be the exact same type in overriden methods. That is, like real Java, `MiniJava` does not allow overriden methods to have different arguments types than the method they are overriding. #### No method overloading For simplicity, `MiniJava` **does not** support method overloading (i.e., there cannot be two methods in the same class with the same name). ### Field access with static type As in real Java, fields are inherited, but **not overridable** in the sense that methods are. Instead, in short, field access should be based on the **static type** of an object. If the static type does not have a field defined, then the field references the superclass's fields (recursively until `Object` is reached). #### Field `set` To set a field on an object in `MiniJava`, we can write: ```lisp (set <object-expr> <field-name> <value-expr>) ``` Unlike real Java, in `MiniJava`, `set` should also return the new value being assigned. #### Field `get` To get a field on an object in `MiniJava`, we can write: ```lisp (get <object-expr> <field-name>) ``` :::warning Unlike real Java, `MiniJava` makes a distinction between fields and other identifiers. We cannot access the current object's field implicitly with `myField`, we must write `(get this myField)`. If an identifier `x` appears in a method outside of a `get` or `set` expression, then `x` can only refer to a method parameter or a bindings from a `let`. ::: ## `this` As in real Java, `MiniJava` has the `this` keyword. `this` should be added to the type (and dynamic) environments when a method is called, mapping the identifier `this` to the current object instance. ## Recursion Fields and methods can reference their own class. For example, we can have a class, `Foo`, which has a field named `f` of type `Foo`. Methods can also have recursive calls and `call` other methods within the class by passing the `this` keyword as the object-expression to the call function. ## Non-OOP constructs :::info These constructs have already been implemented for you, but their semantics are important to understand for writing new test cases. ::: `MiniJava` only has a few other non-OOP constructs---just enough to write some interesting OOP-focused programs! `+` and `let` work as they did in `MiniLangT`. `ifzero` is like `if` in `MiniLangT`, except that because `MiniLangT` does not have booleans, `ifzero` checks whether the first argument is `0`. For example, `(ifzero 0 1 2)` should be 1 and `(ifzero (new Object) 1 2)` should be a type error. `block` is roughly the same as in Assignment 7, except that it evaluates to each expression for its side effects, then produces the last expression in the chain, rather than evaluating statements for their side effects. This is implemented for y # Grammar `MiniJava` has the following concrete grammar. ``` <type> ::= | Int | <class-name> <program> ::= | (program (classes <class> ...) <expr>) <class> ::= | (class <class-name> extends <class-name> (fields <field> ...) (methods <method> ...)) <field> ::= | (<field-name> <type>) <method> ::= | (method <type> <method-name> (<type> <id>) <expr>) <expr> ::= | <num> | <id> | (+ <expr> <expr>) | (block <expr> <expr> ...) | (let (<type> <id> <expr>) <expr>) | (ifzero <expr> <expr> <expr>) | (get <expr> <field-name>) | (set <expr> <field-name> <expr>) | (call <expr> <method-name> <expr>) | (new <class-name>) | this | (null <class-name>) ``` The `tests.ml` file has many examples of `MiniJava` programs and expected outputs. Two test programs include translations of the `Tricky.java` code from our first OOP lecture! ## Abstract Syntax Trees and compiled representation. The source language is parsed to the AST type `jprogram_src`, which the stencil code compiles to a lower-level type `jprogram`. :::warning You can find the definitions for the language, including the environment types, in `ast.ml`. You should read, but not edit, this file. ::: ## Error Exceptions The stencil code provides the following exceptions. Note that similar to real Java, there are far more compile-type/type-checker errors than dynamic errors! In fact, for `MiniJava`, the **only** runtime exception your code should actually hit (once complete) is `NullPointerException`. If you are, for example, performing a pattern match in `interp`, you will still need to do something like `failwith "unreachable"` for the unreachable branch to satisfy the OCaml type checker. However, your code should never actually hit this case! ```ocaml (* You should use in type_check_classes or its helpers *) exception ErrClassNotFound of string exception ErrFieldNotFound of string exception ErrMethodNotFound of string exception ErrTypeMismatch of jtype * jtype exception ErrThisOutsideOfMethod exception ErrNonObject of jtype (* Already used in type_check_classes or its helpers by the stencil code *) exception ErrUnboundId of string exception ErrIfBranches of jtype * jtype (* You should use in interp *) exception NullPointerException ``` # Implementation Summary The stencil code `eval.ml` has 3 places marked `TODO` for you to hook in your new code. ## Type checking :::success Your job is to implement the remaining `type_check` and `type_check_classes` code. You can (and probably should) add additional helper functions. ::: Before implementing type checker code, I suggest commenting-out the existing tests and writing a few type-checker-only tests. That way, you can be sure your type checker works before moving on! You can use something like the following helper method to write tests specifically for type checking: ```ocaml let type_check_prog (prog : jprogram_src) : jtype = let (ProgramSrc {classes; main}) = prog in List.iter (type_check_class prog) classes; type_check main [] classes ``` :::success Your type checker functions must return the right `jtype`. In addition, to facilitate a separate type-checker and compiler implementation, your `type_check` function must set the correct static object type on the `GetFieldSrc` and `SetFieldSrc` expressions. ::: ### Finding type errors Your type checker should raise the following errors: - `ErrClassNotFound` if the class name is referenced, but does not exist (for example, if a field is declared as a type `Foo` but no class `Foo` is defined). - `ErrFieldNotFound` if a field is used but not defined for the relevant class (potentially via inheritance). - `ErrMethodNotFound` if a field is used but not defined for the relevant class (potentially via inheritance). - `ErrTypeMismatch` if a found type is not equal to or a subtype of an expected type (for example, in a method call's return value). - `ErrTypeMismatch` if `this` is used outside of a method body. - `ErrNonObject` if an object was expected, but not provided (for example, a method being called on a non-object). ### Defining `type_check_class` How you should define `type_check_class` follows from the errors you need to identify. To know if a class type-checks, you must check that its field and method definitions do not violate any of the typing rules we expect. Put another way, a valid type checker should make it so that the only possible runtime error is a `NullPointerException`. :::success `type_check_class` should: 1. Validate the types used by each field in the class. 2. Validate that each method body type-checks with respect to the method's types. This should use `type_check` on the body. ::: ## Interpreter Once your type checker works, turn to the `interp` function. :::success Your job is to implement the remaining `interp` code. You can (and probably should) add additional helper functions. ::: I suggest slowly adding back in the provided tests, as you support more languages features. # Testing The stencil code already includes many interesting tests, including a test for covariance, our `Tricky.java` program, a linked-list length method! :::success Your final job is to add **at least 5 more test cases** for `type_check_classes` and **at least 10 more test cases** for `iterp`. Your `interp` test cases can reuse some of your 5 type check programs, as long as they are checking for value results and not errors. ::: If you are unsure about what the expected result for a new case should be, DM me on Zulip. # Submission Submit the following **3 files** on **Gradescope**: :::success - `eval.ml` - `tests.ml` - `README` as either a `txt`, `md`, or `pdf` file, with: - Roughly how long did this assignment take you (in hours)? - Which was the most challenging part of this assignment? - Which resources did you find to be the most useful? :::