# Assignment 4: OCaml time! This is a **pair or solo Gradescope assignment**—you will submit 2 OCaml files (`a4.ml` and `a4test.ml`) and a README on Gradescope. :::info Recall that assignments, if completed as a pair, must be completed **together** rather than dividing work. See the syllabus for more. ::: In this assignment, you will define several OCaml functions across **5 problems** (*and optionally, an extra credit challenge problem*). Some functions will be short because they use higher-order functions. Except where otherwise noted, you may use functions in OCaml’s library; the problems point you towards useful ones and often require that you use them. The sample solution is 70 lines (125 lines including the challenge problem). --- # **Change Log** I will avoid making changes to the assignment after it is posted, but any necessary changes will be added here: --- # Setup ### Attribution This assignment is adapted from a [CS341](https://sites.google.com/cs.washington.edu/cse341autumn2024) assignment from the University of Washington. ## OCaml Installation Install OCaml on your machine by following the University of Washington's [OCaml Software Setup Instructions](https://sites.google.com/cs.washington.edu/cse341autumn2024/resources/software-setup). I suggest also following these instructions to install VSCode if you have not already installed it. ## **Dune** The code and directory structure for this homework use OCaml's `dune` build system. After getting the starter code, navigate to the `a4` directory in your terminal. This directory should contain a `dune` file and other homework files. From that directory, you can run the following commands: - `dune test` compiles your code (reporting any type errors, if there are any) and if successful, runs the test file `a4test.ml` - Please see below for more testing details. - If you want to run your code interactively, `dune utop` loads your libraries into the `utop` environment. - Once in `dune utop` you can use `#use` as normal to load OCaml files. :::info If after installation, you get an error that `dune` is not found when running `dune build`, make sure that you've run the following command in your current terminal: ``` eval $(opam config env) ``` ::: ## Stencil Code I've provided the following [starter code](https://drive.google.com/file/d/1dDOCawwHoHHvd8YMaiGQN76fRiBaScpj/view?usp=sharing) (requires Wellesley access). Files you should edit are in green: :::success - `a4.ml`, where you should implement the required functions and any helper functions you'd like. - `a4test.ml`, where you should write tests for each function. ::: :::warning - `a4types.ml`, provided type definitions and a required helper function for Problem 3. Read, but do not edit, this file. - `dune`, `dune-project` setup files so that you can build the project using `dune build` (and test using `dune test`). Don't worry about reading or editing these, unless you are curious. ::: :::success ### Before you submit Make sure that running both `dune build` and `dune test` work without error. Do not modify `a4types.ml` and do not turn it in. ::: ## Setup instructions for testing This assignment uses a test framework for writing inline tests. This is similar to `RackUnit` in Racket. Now that you've seen unit testing in one language, this assignment asks you to practice picking it up in a new language! You will first need to install the test framework using `opam`: `opam install ppx_inline_test` You can then write tests in `a4test.ml` such as `let%test "my test" = fst (1, 2) = 1` The `let%test` binding takes an expression that evaluates to a boolean on the right hand side of the assignment and throws an error if it is false. See `a4test.ml` for examples. :::warning Run the tests with `dune test` . Please note that `dune test` will **produce no output** if your tests pass and will **output error messages** if at least one of your tests fails. ::: --- ## Important note on function bindings In `a4.ml`, for each function you will implement, the first line is provided for you in the assignment. **The form of this first line matters and you should not change it.** Consider: ```ocaml let foo x = e1 let bar x y = e2 let baz = e3 ``` Notice `foo` takes one argument named `x` while `bar` uses [currying (see textbook chapter)](https://cs3110.github.io/textbook/chapters/hop/currying.html) to take two arguments `x` and `y`. Most importantly, `baz` is a “regular” variable binding, but if `e3` evaluates to a function, then `baz` will be bound to that function. For each function with snippet `let baz = ...`, you need to implement an expression that evaluates to the correct function. You should *not* change the form of the binding to be something like `let foo x = ...`. You should also not have your expression be an anonymous function. This is to practice currying! For example, suppose a problem asked for a function that takes an `int list` and produces a list containing only the positive numbers in the input. If the provided snippet was `let only_positive = ...`, then a correct solution is ```ocaml let only_positive = List.filter (fun x -> x > 0) ``` :::danger whereas these solutions would pass all tests but **would not receive full credit**: ```ocaml let only_positive xs = List.filter (fun x -> x > 0) xs let only_positive = fun xs -> List.filter (fun x -> x > 0) xs ``` ::: :::success ## **Tip** As you work on these problems, read through the documentation for the [String](https://ocaml.org/manual/5.0/api/String.html) and [List](https://ocaml.org/manual/5.2/api/List.html) as applicable for each problem. One of the goals of CS251 is to increase your comfort picking up new languages from reading their documentation (which should be much easier, now that you've seen similar idioms in Racket)! ::: --- # **Problems** :::success For each function, you should write your own tests in `a4test.ml`. If you are unsure about the expected result for specific inputs, please post a question on Zulip. ::: ## **Problem 1** :::success Write a function `longest_string1` that takes a `string list` and returns the longest `string` in the list. If the list is empty, return `""`. In the case of a tie, return the string closest to the beginning of the list. ::: Use `List.fold_left`, `String.length`, and no explicit recursion. :::warning Your function must start with: ```ocaml let longest_string1 = ... ``` ::: :::success Then, write a function `longest_string2` that is exactly like `longest_string1` except in the case of ties it returns the string closest to the end of the list. ::: :::warning Your function must start with: ```ocaml let longest_string2 = ... ``` ::: Your second function should be almost an exact copy of `longest_string1`. ## **Problem 2** :::success Write a function `all_answers` of type: ```ocaml ('a -> 'b list option) -> 'a list -> 'b list option ``` ::: Note, the arguments are *curried*! The first argument, a function, should be applied to each element of the second argument. If the function application returns `None` for any element, then the result for `all_answers` is `None`. Else, the calls to the first argument will have produced ```ocaml Some lst1, Some lst2, ... Some lstn ``` and the result of `all_answers` is `Some lst` where `lst` is ```ocaml lst1, lst2, ... lstn ``` appended together. :::warning Your function must start with: ```ocaml let all_answers f xs = ... ``` ::: Your solution can return these lists appended in any order. The sample solution is 10 lines. It uses a helper function with an accumulator and uses `@` (list append). :::warning Note: `all_answers f []` should evaluate to `Some []`. ::: --- # Patterns as data The remaining problems use these type definitions, which are *inspired by* the type definitions that OCaml’s implementation would use to implement pattern matching: ```ocaml type pattern = | WildcardP | VariableP of string | UnitP | ConstantP of int | ConstructorP of string * pattern | TupleP of pattern list type valu = | Constant of int | Unit | Constructor of string * valu | Tuple of valu list ``` ## **Problem 3** :::info *This problem uses the `pattern` recursive variant type but is not really about pattern-matching.* ::: `a4types.ml` contains a function, `g`, that you **must use** for each of the functions in this problem. :::success Use `g` to define a function `sum_variable_lengths` that takes a pattern and returns the sum of the string lengths of all the variables in the variable patterns it contains. ::: :::warning Your function must start with: ```ocaml let sum_variable_lengths = ... ``` ::: Use `String.length`. We care only about **variable** names; the constructor names are not relevant. :::success Use `g` to define a function `count_a_var` that takes a string and a pattern (*curried*) and returns the number of times the string appears as a variable in the pattern. ::: :::warning Your function must start with: ```ocaml let count_a_var s = ... ``` ::: We care only about **variable** names; the constructor names are not relevant. ## **Problem 4** :::success Write a function `check_pat` that takes a pattern and returns true if and only if all the **variables** appearing in the pattern are distinct from each other (i.e., use different strings). ::: :::warning Your function must start with: ```ocaml let check_pat pat = ... ``` ::: The constructor names are not relevant. Hints: * The sample solution uses two helper functions. * `List.exists` or `List.mem` may be useful. * The sample solution is 15 lines. These are option hints: you are not required to use `List.fold_left` and `List.exists`/`List.mem` here, but they make it easier. If you want to build your functions from scratch, that is fine for this problem! ## **Problem 5** Consider the following definition of `matches`, inspired by the idea of using these types to implement pattern matching in an interpreter/compiler for an OCaml-like language. Given `valu` *v* and `pattern` *p*, either ==*p matches v*== or not. If it does, the match produces a `list` of `string * valu` bindings pairs; order in the list does not matter. The rules for matching should be match your intuition for OCaml pattern matching: - `WildcardP` ==matches== everything and produces the empty list of bindings. This is similar to the `else` block in a Racket `cond`. - `VariableP s` ==matches== any value v and produces the one-element binding list holding `(s,v)`. - `UnitP` ==matches== only `Unit` and produces the empty list of bindings. - `ConstantP 17` ==matches== only `Constant 17` and produces the empty list of bindings (and similarly for other integers). - `ConstructorP(s1,p)` ==matches== `Constructor(s2,v)` if `s1` and `s2` are the same string (you can compare them with `=`) and `p` ==matches== `v`. The list of bindings produced is the list from the nested pattern match. We call the strings `s1` and `s2` the *constructor name*. Note that unlike in OCaml, we allow any string to be a constructor name. - `TupleP ps` ==matches== a value of the form `Tuple vs` if `ps` and `vs` have the same length and for all *i*, the *ith* element of `ps` ==matches== the *ith* element of `vs`. The list of bindings produced is all the lists from the nested pattern matches appended together. - Nothing else matches. :::success Write a function `matches` of type `valu -> pattern -> (string * valu) list option`. It should take a value and a pattern, and return `Some lst` where `lst` is the list of bindings if the value "matches" the pattern according to the rules above, or `None` otherwise. ::: :::warning Your function must start with: ```ocaml let rec matches valu pat = ... ``` ::: Note that `if` the value matches but the pattern has no variables in it (i.e., no patterns of the form `VariableP s`), then the result is `Some []`. Hints: - The sample solution has one `match` expression with 7 branches, and uses `all_answers` (as you defined earlier) and `List.combine`. - The sample solution is about 18 lines. - Remember to follow the rules above for what patterns match what values, and what bindings they produce. :::info - Tip: Similar to how function definitions can unpack tuples, it can also be useful to use tuples in `match` expressions. For example, here, it may be helpful to match on: ```ocaml match (valu, pat) with ``` ::: --- ## **Challenge Problem** :::warning The following problem is **[INDEPENDENT]**, thus, we won't offer help on it beyond clarification questions. It is worth up to +10 extra credit on the assignment. ::: :::success Write a function `typecheck_patterns` that “type-checks” a `pattern list`. ::: :::warning Your function must start with: ```ocaml let typecheck_patterns bindings pats = ... ``` ::: Types for our made-up pattern language are defined by: ```ocaml type typ = | AnythingT (* any type of value is okay *) | UnitT (* type for Unit *) | IntT (* type for integers *) | TupleT of typ list (* tuple types *) | VariantT of string (* some named variant *) ``` The first argument contains elements that look like `("foo","bar",IntT)`, which means constructor `foo` makes a value of type `VariantT "bar"` given a value of type `IntT`. Assume list elements all have different first fields (the constructor name), but there are probably elements with the same second field (the variant name). Under the assumptions this list provides, you “type-check” the `pattern list` to see if there exists some `typ` (call it *t*) that *all* the patterns in the list can have. If so, return `Some t`, else return `None`. You must return the “most lenient” type that all the patterns can have. For example, given patterns: `TupleP [VariableP "x", VariableP "y"]` and `TupleP [WildcardP, WildcardP]`, you should return `Some (TupleT [AnythingT, AnythingT])` even though they could both have type `TupleT [IntT, IntT]`. As another example, if the only patterns are `TupleP [WildcardP, WildcardP]` and `TupleP [WildcardP, TupleP [WildcardP, WildcardP]]` you should return `Some (TupleT [AnythingT, TupleT[AnythingT, AnythingT]])` --- # **Summary** Evaluating a correct assignment solution should generate these bindings *or more general types* in addition to the bindings from the provided code. ```ocaml val longest_string1 : string list -> string val longest_string2 : string list -> string val all_answers : ('a -> 'b list option) -> 'a list -> 'b list option val sum_variable_lengths : pattern -> int val count_a_var : string -> pattern -> int val check_pat : pattern -> bool val matches : valu -> pattern -> (string * valu) list option (* optional challenge problem generates this binding: *) val typecheck_patterns : (string * string * typ) list -> pattern list -> typ option ``` Notice all functions use currying for multiple arguments. Generating these bindings does not guarantee that your solutions are correct. # Submission Submit the following **3 files** on **Gradescope**: :::success - `a4.ml` - `a4test.ml` - A `README.txt` or `README.md` file that answers the following: - Roughly how long did this assignment take you (in hours)? - Which was the most challenging problem? - Which was your favorite problem? - Which resources did you find to be the most useful? :::