Assignment 9: `MiniJava` :coffee: --- :::warning Note, this assignment has two deadlines: - A Checkpoint deadline Monday, December 8 (our last day of class). I will allocate some time in class for working on this. - A final submission Tuesday, December 16. The last time you can submit with an extension is ==Thursday, December 18, 12:00pm== (the college has a hard deadline at 4pm; 12pm accounts for possible technical issues). ::: This is a **pair or solo Gradescope assignment**—you will submit 2 OCaml files (`eval.ml` and `tests.ml`) and a README on Gradescope. :::info Recall that assignments, if completed as a pair, must be completed **together** rather than dividing work. See the syllabus for more. ::: We all know and love Java from CS230. In this assignment, you will get to explore how Java is made :coffee:! You will complete a type-checking compiler/interpreter for a miniature version of Java, `MiniJava`. You are provided with a stencil of an implementation for `MiniJava`, a language that captures many interesting OOP semantic features of Java. The implementation has 2 phases: :::info - **Phase 1**: A type-checking compiler that runs on the source AST, and produces a lower-level object representation - **Phase 2**: An interpreter that runs on the lower-level representation. ::: This loosely follows the design of real [JVMs](https://en.wikipedia.org/wiki/Java_virtual_machine), where Java source code is first compiled to Java "bytecode", then JVM implements an interpreter on the bytecode. However, unlike real JVMs, this assignment does not require you to implement your own garbage collector! :truck: :sunglasses: Instead, your focus is just these two core tasks: - **Phase 1**: `compile`: ==source `MiniJava`== :arrow_right: ==compiled AST (or static type error)== :::success Your first job is to implement a **type-checking `compile` function**. You'll likely want to write your own helper functions for compiling *expressions*, *methods*, and *classes*. ::: - **Phase 2**: `interp`: ==compiled AST== :arrow_right: ==result value (or dynamic error)== :::success Your second job is to implement the **interpreter cases** for objects, methods, and fields. ::: For this assignment, you may reference any OCaml language documentation, the [free online OCaml textbook](https://cs3110.github.io/textbook/cover.html), Java documentation, the class notes, and other tools as noted in the AI policy. You may also find it useful to translate `MiniJava` programs to Java to check expected outputs! :::danger You **should not** view code from previous CS251 students or any related course assignments at other universities. ::: The stencil code also provides several test cases (though, it does not provide all of the test cases the autograder will use). :::success Your final task is to implement at least 20 additional tests. See the Testing section for details. ::: ### Change log I will avoid making changes to the assignment after it is posted, but any necessary changes will be added here: # Setup As with previous assignments, the [stencil code](https://drive.google.com/file/d/1hPWqJzSZz290mFJr38UPkGWTtrwCtV-v/view?usp=share_link) is set up as a `dune` OCaml project. :::success The stencil code uses a new dependency to pretty-print exceptions and other AST values. Install it with: ``` opam install ppx_deriving ``` ::: Run `dune test` to compile your code in `eval.ml` and run the inline tests in `tests.ml`. If you would like to test expression in `utop`, run `dune utop` from the project directory then, from inside `utop`, run: ``` #mod_use "eval.ml" ;; open Eval ;; ``` You can then see program output with: ``` eval "(program ...)" ;;; ``` This may be useful for debugging test cases. ### Attribution This assignment is adapted from an assignment from [CS173](https://cs.brown.edu/courses/cs173/2024/) at Brown University, by Shriram Krishnamurthi and others. # AI Policy For the programming on this assignment, you may use a wide range of resources, including generative AI if you so desire, subject to the following restrictions: 1. You should not submit any code that you do not understand. I may ask to have an in-person grading meeting to discuss your submission. 2. You must describe in detail in your README which parts of your submission used which resources, including AI. # The `MiniJava` Language To help us focus on the core OOP semantics rather than imperative parsing and interpretation, `MiniJava` uses a Racket-like parenthesized syntax and let-bindings. The stencil code includes a provided parser. A `MiniJava` program consists of a list of classes (just like real Java programs) and then a single expression that is a simplified version of the program's `main` function (without the `public static void` boilerplate!). Each class consists of: 1. A class name. 2. A required superclass (unlike real Java, `MiniJava` programs must explicitly extend `Object` if they do not have a different superclass). 3. A list of fields (which can be empty). 4. A list of methods (which can be empty). :::info All objects must have a superclass, which can be `Object`. However, `Object` is still implicitly defined; for `MiniJava`, as the top-level class that has no fields or methods. ::: For simplicity, all methods are "single-arity", which means they must take exactly one argument. All methods must return a value. For example, here is a `MiniJava` class called `Foo` that extends `Object` with an integer field named `x` and a method named `bar` that takes in a number and returns the sum of `x` with the input number: ```lisp (class Foo extends Object (fields (x Int)) (methods (method Int bar (Int y) (+ (get this x) y)))) ``` The grammar for fields is ```lisp (field <field-name> <field-type>) ``` The grammar for methods (similar to real Java) is: ```lisp (method <return-type> <method-name> (<param-type> <param-name>) <method-body>) ``` We can instantiate a new object with ``` (new <class-name>) ``` The provided parser checks that there are no cycles in the inheritance graph. ## Object creation (with defaults) Like real Java, `MiniJava` creates new object instances with `new`. Less like real Java, `MiniJava` does not have special constructor semantics. :::info When you instantiate a new instance of an object, you should initialize all of its number field values to the default value of`0` and all of its object field values to the default value of `null`. ::: Users must set fields explicitly with `get` or `set` (though, as you can see from `tests.ml`, you can write your own `init<Class>` functions to run after `new`). ## Inheritance and polymorphism Like real Java, `MiniJava` has polymorphism from inheritance via **method overriding**. Subclasses are subtypes of superclasses. Any type is also a subtype of itself. When inheriting from a superclass, a subclass can **override methods** defined in the superclass. The typing and evaluation rules for methods follow that of real Java, as specified below (and in more detail in the class notes). :::success You should write a `check_subtype` helper (prior to the Checkpoint deadline). `IntT` should be a subtype of itself, as should any defined class. ::: ### Static and dynamic types As we saw in class, Java semantics depends on both the **static** and **dynamic** type of an object. As a reminder, in the real Java line ```java A declAC = new C(); ``` The static type of `declAC` is `A`; the dynamic type is `C`. We can write the same example in `MiniJava` with: ```lisp (let (A declAC (new C)) ...code-that-uses-declAC...) ``` ### Method overriding with dynamic type As in real Java, method calls should be resolved via inheritance and dynamic dispatch rules. In short, method calls should be based on the **dynamic type** of an object. Method calls should use the definition provided in the most specific superclass of the dynamic type of the object that implements the method. To call a method on an object in `MiniJava`, we can write: ```lisp (call <object-expr> <method-name> <arg-value>) ``` #### Covariant return types for method overriding Like in real Java, `MiniJava` programs with method overriding can have **covariant return types**. The return type of the overriden method must be a subtype of the return type in the superclass. For example, the following program type would type check: ```lisp (classes (class A extends Object (fields) (methods (method A foo (Int x) (...)))) (class B extends A (fields) (methods (method B foo (Int x) (...))))) ``` For simplicity, argument types must be the **exact same type** in overriden methods. That is, like real Java, `MiniJava` does not allow overriden methods to have different arguments types than the method they are overriding (that is, neither Java nor `MiniJava` have contravariant argument types for overriden methods). #### No method overloading For simplicity, `MiniJava` **does not** support method overloading (i.e., there cannot be two methods in the same class with the same name). ### Field access with static type As in real Java, fields are inherited, but **not overridable** in the sense that methods are. They are instead *shadowed*. In short, field access should be based on the **static type** of an object. If the static type does not have a field defined, then the field references the superclass's fields (recursively until `Object` is reached). See the class notes for more details! #### Field `set` To set a field on an object in `MiniJava`, we can write: ```lisp (set <object-expr> <field-name> <value-expr>) ``` Unlike real Java, in `MiniJava`, `set` should also return the new value being assigned. #### Field `get` To get a field on an object in `MiniJava`, we can write: ```lisp (get <object-expr> <field-name>) ``` :::warning Unlike real Java, `MiniJava` makes a distinction between fields and other identifiers. We cannot access the current object's field implicitly with `myField`, we must write `(get this myField)`. If an identifier `x` appears in a method outside of a `get` or `set` expression, then `x` can only refer to a method parameter or a bindings from a `let`. ::: ## `this` As in real Java, `MiniJava` has the `this` keyword. `this` should be added to the type (and dynamic) environments when a method is called, mapping the identifier `this` to the current object instance. ## No `super` For this assignment, you do not need to implement the keyword `super` (though you can do so for extra credit). ## Recursion Fields and methods can reference their own class. For example, we can have a class, `Foo`, which has a field named `f` of type `Foo`. Methods can also have recursive calls and `call` other methods within the class by passing the `this` keyword as the object-expression to the call function. ## Non-OOP constructs `MiniJava` only has a few other non-OOP constructs---just enough to write some interesting OOP-focused programs! - `+` and `let` work as they did in `MiniLangT`. - `ifzero` is like `if` in `MiniLangT`, except that because `MiniLangT` does not have booleans, `ifzero` checks whether the first argument is `0`. For example, `(ifzero 0 1 2)` should be 1 and `(ifzero (new Object) 1 2)` should be a type error. - `block` is roughly the same as the `;` operator in OCaml. It evaluates each subexpression in order for its side effects, then produces the last subexpression in the block. The type should be the type of the last subexpression in the block. # Grammar `MiniJava` has the following concrete grammar. ``` <type> ::= | Int | <class-name> <program> ::= | (program (classes <class> ...) <expr>) <class> ::= | (class <class-name> extends <class-name> (fields <field> ...) (methods <method> ...)) <field> ::= | (<field-name> <type>) <method> ::= | (method <type> <method-name> (<type> <id>) <expr>) <expr> ::= | <num> | <id> | (+ <expr> <expr>) | (block <expr> <expr> ...) | (let (<type> <id> <expr>) <expr>) | (ifzero <expr> <expr> <expr>) | (get <expr> <field-name>) | (set <expr> <field-name> <expr>) | (call <expr> <method-name> <expr>) | (new <class-name>) | this | (null <class-name>) ``` The `tests.ml` file has many examples of `MiniJava` programs and expected outputs. Two test programs include translations of the `Tricky.java` code from our first OOP lecture! ## Abstract Syntax Trees and compiled representation The source language is parsed to the AST type `jprogram_src`, which the stencil code compiles to a lower-level type `jprogram`. There are also source and compiled versions of other parts of the AST, for example, `jexpr_src`/`jexpr` for expressions, `jmethod_src`/`jmethod` for method definitions, and `jclass_src`/`jclass` for classes. Note that many of these AST components use the following structure: ```ocaml type jprogram_src = | ProgramSrc of { classes: jclass_src list; main: jexpr_src; } ``` That is, they define a variant type (here, `jprogram_src`) that has one variant: `ProgramSrc`, which has an [inlined record](https://ocaml.org/manual/5.4/inlinerecords.html) with named fields. This is a common pattern for ASTs, because multiple parts of the AST may have overlap in field names. For example, the compiled program has the same field names: ```ocaml type jprogram = | Program of { classes: jclass list; main: jexpr; } ``` This means to get the list of classes from a value of type `jprogram_src` or similar, you'll need to pattern match. For example: ```ocaml let interp (Program prog : jprogram) : jval = (* ... now can use prog.main, prog.classes *) let (ProgramSrc {classes; main}) = prog in (* Now can use classes, main *) ``` :::warning You can find all of the definitions for `MiniJava`'s source and compiled ASTs, including the environment types, in `ast.ml`. You should read, but not edit, `ast.ml`. ::: ## Error Exceptions The stencil code provides the following exceptions. Note that similar to real Java, there are far more compile-type/type-checker errors than dynamic errors! In fact, for `MiniJava`, the **only** runtime exception your code should actually hit (once complete) is `NullPointerException`. If you are, for example, performing a pattern match in `interp`, you will still need to do something like `failwith "unreachable"` for the unreachable branch to satisfy the OCaml type checker. However, your code should never actually hit this case! ```ocaml (* You should use in compile or its helpers *) exception ErrClassNotFound of string exception ErrFieldNotFound of string exception ErrMethodNotFound of string exception ErrTypeMismatch of jtype * jtype exception ErrThisOutsideOfMethod exception ErrNonObject of jtype exception ErrUnboundId of string exception ErrIfBranches of jtype * jtype (* You should use in interp or its helpers *) exception NullPointerException ``` # Implementation Summary ## Type checking `compile` function :::success Your first task is to implement a `compile` function. You can (and should) add additional helper functions. ::: Before implementing `compile` code, I suggest commenting-out the existing tests and writing a few basic tests for `compile`. Implement one language feature at a time, targeting new tests as you go. That way, you can be sure each part of your `compile` works before moving on! :::success Your `compile` function should produce either a complete lower-level `jprogram` AST, or a type error. The lower-level `jprogram` includes both a list of classes, and a `main` `jexpr`, as described above. :star: Your compiled list of classes should likely include a class for the `Object` superclass, even though the source programs will not include this. ::: ### Finding type errors Your type checker should raise the following errors: - `ErrClassNotFound` if the class name is referenced, but does not exist (for example, if a field is declared as a type `Foo` but no class `Foo` is defined). - `ErrFieldNotFound` if a field is used but not defined for the relevant class (potentially via inheritance). - `ErrMethodNotFound` if a method is used but not defined for the relevant class (potentially via inheritance). - `ErrTypeMismatch` if a found type is not equal to or a subtype of an expected type (for example, in a method call's return value). - `ErrThisOutsideOfMethod` if `this` is used outside of a method body. - `ErrNonObject` if an object was expected, but not provided (for example, a method being called on a non-object). ### Logic to `compile` each class How you should define `compile` follows from the errors you need to identify. To know if a class type-checks, you must check that its field and method definitions do not violate any of the typing rules we expect. Put another way, a valid type checker should make it so that the only possible runtime error is a `NullPointerException`. It should also make sure the program does not try to use any class types that are not actually defined! :::success To compile a class, you should: 1. Ensure that all the types used by each **field** in the class are valid (either `IntT`, a class defined in the program, or `Object`). 2. `compile` and type check each method body in the correct environment. To do this, you will likely want a `compile_expr` helper. ::: ### :star: `compile` tips: - In order to both (1) type-check each source expression, and (2) tranlate each source expression (`jexpr_src`) to a compiled expression (`jexpr`), you may want to consider having your `compile_expr` helper's return type include **both** a type and a compiled expression. - To implement `check_subtype` and related helpers, you'll need to be able to get a `jclass_src` from a class name (`string`). You can pass around the list of classes and use `List` library functions to accomplish this. ## Interpreter Once your type checker works, turn to the `interp` function. :::success Your job is to implement the `interp` function. You can (and probably should) add additional helper functions. ::: I suggest slowly adding back in the provided tests, as you support more languages features. # Testing The stencil code already includes a few interesting tests, including a test for covariance, our `Tricky.java` program, a linked-list length method! :::success Your final job is to add **at least 10 more static error test cases** for `compile` (see the type error exceptions, of which there are 8) and **at least 10 more test cases** for `interp` with value results. Note that you can write and pass the error test cases prior to implementing `interp`, since `compile` runs first. ::: If you are unsure about what the expected result for a new case should be, DM me on Zulip. # Timeline and grading The reference solution for this assignment contains around 300 lines of OCaml (on top of the existing stencil lines), which is more than previous assignments. :::info This assignment will be weighted 200, rather than 100, points toward your assignment total. ::: While the stencil code provides the ASTs and parsing logic, it provides little structure in terms of how you implement `compile` and `interp`. As such, we will have the following *checkpoint* deadline. # Checkpoint deadline :::success By check Checkpoint deadline of **Monday, December 8** (our last day of class), you should submit your work-in-progress file to Gradescope. To receive a ==90% or above== on the final project, your checkpoint submission should include a `check_subtype` helper, as well as a basic structure for your `compile` function and its helpers (it does not need to pass all tests or fully compile programs yet). For example, a basic structure might include: - A `check_subtype` helper, potentially with inline unit tests in the same `eval.ml` file. - A `compile` method that compiles classes and the main method for a given program. - Stubs for `compile` helpers that may not yet be complete (e.g., you might have `failwith "TODO"` for several or all cases). ::: After this Checkpoint deadline, we will release additional optional stencil code: a `check_subtype` and one possible structure for `compile` helper functions. That way, even if your group is not able to get through these starter tasks prior to the Checkpoint deadline, you can get back on track (to reach a score up to 90%). # Extra credit Only attempt the extra credit once you have received 100% test correctness on the autograder. :::warning Since several of these change the AST representation, you'll need to implement them in a copy of your `ast.ml`/`eval.ml` file (so that the autograder can still run). Make copies of these files (`ast-ec.ml`/`eval-ec.ml`) and make the changes there. ::: I'm happy to discuss these (and help with necessary parser changes) over Zulip. You can also propose new ideas for extensions. ## `super` keyword [up to +5] Implement the `super` keyword for method calls and field access, matching the semantics of real Java. ## Access modifiers [up to +10] Implement the `public`, `private`, and `protected` access modifiers on methods and fields, matching the semantics of real Java. ## Parser from a subset of real Java [up to +15] Implement a parser from a subset of real Java programs to `MiniJava` programs. That way, you can impress your friends with your complete Java implementation! For example, the real Java program: ```java class A { int x; int plusX(int y) { return x + y; } } public class Main { public static void main(String[] args) { A a = new A(); System.out.println(a.plusX(3)); } } ``` Should parse to the same AST representation as: ```ocaml (program (classes (class A extends Object (fields (x Int)) (methods (method Int plusX (Int y) (+ (get this x) y))))) (block (call (new A) plusX 3))) ``` (Your parser can assume that `main` ends with a single `println`). And both should, when compiled, produce `3`. For full extra credit for this option, you should also create an executable so that you can consume a Java program as `<executable-name> Main.java` and have `3` printed to the terminal. # Submission For the Checkpoint, submit the following **1 file** on Gradescope: :::success - `eval.ml` ::: For the final submission, submit the following **3 files** on **Gradescope**: :::success - `eval.ml` - `tests.ml` - `README` as either a `txt`, `md`, or `pdf` file, with: - Roughly how long did this assignment take you (in hours)? - Which was the most challenging part of this assignment? - Which resources did you find to be the most useful? - Which resources/references/tools did you use on which parts of the assignment? Include specific details of what code used generative AI, if used. :::