# Assignment 7: Modularity This is a **pair or solo Gradescope assignment**—you will submit 3 OCaml files (`assoc_list_map.ml`, `bst_map.ml`, and `a7tests.ml`) and a README on Gradescope. :::info Recall that assignments, if completed as a pair, must be completed **together** rather than dividing work. See the syllabus for more. ::: In this assignment, you will explore how different data structures can implement the same abstract map interface. You'll also be able to write *one* group of tests to check multiple implementations. :::warning For this assignment, you may reference any OCaml language documentation, the free online [OCaml Programming: Correct + Efficient + Beautiful](https://cs3110.github.io/textbook/cover.html) textbook, and the class notes. This assignment also has a more open **additional source** policy than previous assignments. You may use additional sources subject to the constraints: (1) You provide detailed citations of your use in your README, and (2) You are sure that you understand each line of code you submit. I may ask each group to discuss their solution with me to ensure that this condition is met, and may adjust grades based on these discussions. ::: ### Change log I will avoid making changes to the assignment after it is posted, but any necessary changes will be added here: # Setup :::success As with previous assignments, the [stencil code](https://drive.google.com/file/d/1osn5p6c_517j9fJwJc3SBmmErE7MLEQi/view?usp=sharing) is set up as a `dune` OCaml project. Run `dune test` to compile your code and run the inline tests in `a7tests.ml`. Create a `README`.{`txt`, `md`} file for your written answers. ::: # Modular `map`s In this assignment, you will implement different versions of key-value maps. :::danger You can (and should) use functions from the OCaml standard library`List` and `Array` modules. You should not use the standard library's `Map` or `Hashtbl` modules, since you will be implementing those features yourself. ::: # Testing your maps Testing is an essential part of this assignment! You’ll use the provided file `a7test.ml` to write and run tests for each of your `Map` implementations. `a7test.ml` contains *parametric testing module* called `MapTests`, which takes a module `M` that satisfies the `Map` signature and runs a collection of tests against it: ```ocaml module MapTests (M : Map) = struct ... end ``` This syntax defines what’s known as a **functor**—a module that *takes another module as input*. You don’t need to understand functors in detail for this assignment or for CS251 in general. Here, all you need to understand is that we can reuse the *same* test code to test *different* implementations of the `Map` interface. For example: ```ocaml module AssocTests = MapTests(Assoc_list_map) module BstTests = MapTests(Bst_map) ``` Each line *instantiates* `MapTests` for a particular module that implements `Map` (in this case, running the full set of tests on each) When you run `dune test`, all tests inside both `AssocTests` and `BstTests` will automatically run. I suggest temporarily commenting out the second `Bst_map` line while you first work on implementing the `Assoc_list_map`. :::success You should implement a thorough set of tests for each `map` function (as well as combinations of different functions). You may want to define local variables for larger maps to use across multiple tests. You could even try randomly generating different maps to be tested! ::: You may also define tests that depend on implementation details of your map (like internal types), but any of these tests should go in the *implementation files*, not `a7test.ml`. :::warning `a7test.ml` should only rely on the signature defined in `map.mli`, not any implementation details. ::: :::info Debugging tip: There is no simple, safe way in OCaml to print a polymorphic value of type `'a`. However, you may want to print values to aid in debugging. The following function uses some unsafe OCaml magic to attempt to print a value if it is an `int` or can be represented as a `string`. You may use this for debugging, but remove any calls to it before submitting your code. Alternatively, you can ammend you functions to call an internal version that takes a `print_key`/`print_value argument`, which is safer and more idiomatic in OCaml. ```ocaml (* Unsafe debug print for polymorphic keys/values *) let debug_print k = if Stdlib.Obj.is_int (Stdlib.Obj.repr k) then Printf.printf "%d\n" (Obj.magic k : int) else try Printf.printf "%s\n" (Obj.magic k : string) with _ -> Printf.printf "<unknown>\n" ``` ::: # Warmup: functional `map` with association list To start, you'll implement a functional key-value `map` using an association list. :::warning Note that many (but not all) of the functions here are provided in the [OCaml Programming: Correct + Efficient + Beautiful](https://cs3110.github.io/textbook/cover.html) textbook. However, you should make sure you fully understand their implementation. I suggest first attempting to complete the module on your own, before referencing the book implementations. ::: One way to implement a key-value map is with an *association list*, like those we used in A6: Types II. :::success Review the specific `Map` module type signature in `map.mli`. In `assoc_list_map.ml`, define a module named `Assoc_list_map` that implements the `Map` module type using a `list` type. For full credit, use currying to omit explicit arguments where possible. ::: :::success In your README, answer the following - **Written question 1**: Does your list maintain any *invariants* to make some of the functions simpler or more efficient? Which functions have simpler implementations due to your design choice of which (if any) invariants to enforce? Does your implemetation assume anything about the input types? ::: # Functional `map` with binary search tree Next, you'll complete the implementation of a functional key-value `map` using a *binary search tree*. A binary search tree (BST) is a binary tree that maintains the following **BST invariant**: :::info A tree $t$ with value $t_v$ is a BST if: - For all values $l_{vi}$ in the left subtree, $l_{vi} < t_v$ - For all values $r_{vi}$ in the right subtree, $r_{vi} > t_v$ ::: Note that by this definition, BSTs cannot store duplicate values. This invariant lets us use a convenient algorithm to find a given node in the BST: To find out if an element `x` is in the BST, we start at the root. If we reach an empty tree, then we know the element is not present. Otherwise, we check whether `x` is equal to the value stored at the root (if it is, we've found it!). If not, we compare `x` against the root's value, repeating this process on the left subtree if `x` is less than the root's value, or the right subtree if `x` is greater than the root's value. We can use a similar process to insert a new element into the tree: we figure out the position where it *should* go in the tree. If the new element was not previously present, then we'd eventually get to an empty tree, which we can replace with the new element. If you'd like additional review of BSTs, see Chapter 17 of the CS230 textbook [here](https://drive.google.com/file/d/1LXkQ02oQq_woG6oHBlQXyvX2Khq0hH87/view). --- ## Completing `bst_map.ml` `bst_map.ml` contains the *start* of a binary search tree implementation of a `map`. It uses an OCaml variant type for a binary tree, with a record type for named node fields. For the sake of this assignment, we will use the OCaml standard library `compare` function to determine :::warning A partial implementation for `insert` is provided, but it's not correct! Your job is to fix this implementation and complete the remaining required methods. ::: :::success Complete the implementation of `bst_map.mli` (**without** changing the `type ('k, 'v) t = ('k, 'v) tree` type definition). ::: Once you have all functions defined, uncomment the line: ```ocaml module Bst_map : Map = struct ``` :::success In your README, answer the following - **Written question 2**: What was incorrect about the initial implementation of `insert`? Be as specific as possible. - **Written question 3**: Compare the asymptotic efficiency of your two `map` implementations. - **Written question 4**: Provide one hypothetical situation/application where your `Assoc_list_map` would be the better choice for a programmer to use. - **Written question 5**: Provide one hypothetical situation/application where `Bst_map` would be the better choice for a programmer to use. ::: # Modularity across languages :::success **Written question 6** Pick a programming language other than OCaml. Use the language documentation and other online sources to answer the following (citing each source you used). In your `README`, describe how the language of your choice handles *modularity*. Include both how the language handles *namespaces* and how it handles *abstraction*, as defined in the class notes. Also include how these concepts tie into file naming/project structure for that language. ::: # Extra credit :::warning Two notes for the extra credit parts on this assignment: 1. Only attempt them once you have completed the standard assignment. 2. To receive any extra credit *points* for this assignment, your team must have a **synchronous grading meeting** with me (either at help hours after the assignment is submitted or with a Calendly meeting). Your extra credit score will be based primarily on answering my design questions in that meeting, rather than the code itself. ::: ## EC 1: Balanced BST [Independent] Copy your `bst_map.ml` implementation into a new file, `bst_map_balanced.ml`. There, implement a *balanced* binary tree (see the [CS230 textbook](https://drive.google.com/file/d/1LXkQ02oQq_woG6oHBlQXyvX2Khq0hH87/view) or other online sources for more details as needed). Write new inline unit tests in the same file. Add your new implementation file to the `dune` file (see the documentation for details). :::info Up to +10 extra credit points pending the **synchronous grading meeting**. ::: ## EC 2: Resizable mutable map [Independent] Review the signature in `mut_map.mli`. Implement a mutable map that supports amortized constant-time `lookup`. Use the OCaml data structures we've seen in class to implement your own backing hash table (see Chapter 20 of the [CS230 textbook](https://drive.google.com/file/d/1LXkQ02oQq_woG6oHBlQXyvX2Khq0hH87/view)). To handle hash collisions, use either chaining with buckets (typically easier in a functional context) or open addressing with linear probing. Write new inline unit tests in the same file. Add your new implementation file to the `dune` file (see the documentation for details). :::info Up to +10 extra credit points pending the **synchronous grading meeting**. ::: # Submission Submit the following **4 files** on **Gradescope**: :::success - `assoc_list_map.ml` - `bst_map.ml` - `a7tests.ml` - `README` as either a `txt`, `md`, or `pdf` file, with: - **Written questions 1-6**. - Roughly how long did this assignment take you (in hours)? List the extra credit separately if attempted. - Which was the most challenging part of this assignment? - Which resources did you find to be the most useful? :::