Lecture 8: OCaml cont., compound static types --- :::warning Announcements - **Please sit in the first 2 rows of the room**. - A3: Interp parts 3, 4 due tonight, Sunday - Quiz 1 next Thursday, the 9th - Logistics page [here](https://cs.wellesley.edu/~cs251/f25/notes/quiz1.html), including practice quiz. - Next help hours: - Emily's 2-4pm Friday L037 ::: :::success Agenda for today - Continuing OCaml basics - Local bindings, floats, strings - Lists in more detail - Pattern matching - Group exercise: lists, pattern matching - Sum vs. product types - Variant types revisited - Tuples, records ::: Today, we'll continue to see the basics of OCaml and how it relates to what we already know from Racket. Then, we'll move on to some :new: exciting language features in OCaml: richer composite types. # Continuing OCaml basics ## Local bindings in OCaml In OCaml, local bindings with `let` take the form `let <id> = <e> in <body>`. You can use `and` for chains of lets: ```ocaml let x = 1 and y = 2 in x + y ``` The equivalent to Racket's `let*` would be to write a series of `let` statements. This style is somewhat more common than the `and` above: ```ocaml let x = 1 in let y = x + 2 in x + y ``` We can use local bindings inside a top-level function binding: ```ocaml utop: let f x = let y = 2 in (x * y);; val f : int -> int = <fun> utop: f 5 ;; - : int = 10 ``` :::warning Note here that OCaml does not require parentheses in function calls - they're only necessary to disambiguate order of operations. Here, in calling `f` and passing `5`, we don't need to group anything with parens. ::: ## Floats Recall that last lecture, we saw that using the `+` operator allowed OCaml to infer that the arguments had type `int`: ```ocaml utop: let g x = 1 + x ;; val g : int -> int = <fun> ``` :::info This implies that `+` can only be used on OCaml integers! How can we add other types? ::: OCaml has a distinct type for `float`s. ```ocaml utop # 5.3 ;; - : float = 5.3 ``` Like the `.` indicating that a constant should be read as a `float`, OCaml's operators on `float` include a `.`: ```ocaml utop # let h x = 1.0 +. x ;; val h : float -> float = <fun> ``` We cannot add integers and floats with this alone, since it violates OCaml's static typing semantics: ```ocaml utop # 1.0 +. 2 ;; Error: This expression has type int but an expression was expected of type float ``` However, we can use an explicit conversion operator: ```ocaml utop # 1.0 +. (float_of_int 2) ;; - : float = 3. ``` :::info This pattern of `type1_of_type2` is used across many OCaml types. ::: Note that we can also leave off the zero: ```ocaml utop # 1. ;; - : float = 1. ``` ## Strings OCaml strings are what you might expect: ```ocaml utop # "hi" ;; - : string = "hi" ``` We cannot add integers and strings: ```ocaml utop # "hi" + 0 ;; Error: This expression has type string but an expression was expected of type int utop # "1.0" + 0 ;; Error: This expression has type string but an expression was expected of type int ``` However, we can use a conversion: ```ocaml utop # (int_of_string "1") + 0 ;; - : int = 1 ``` We can also append strings using `^`: ```ocaml "1" ^ string_of_int 0 ;; ``` ## Boolean operators OCaml uses `true`, `false`, infix `&&` and `||`, as well as the keyword `not`. To demonstrate some of these features together: ```ocaml utop # let is_even n = n mod 2 = 0 ;; val is_even : int -> bool = <fun> utop # let avg_ceil ((x: int), (y: int), (ceil: bool)) = ((x + y) / 2) + (if ceil && not (is_even (x + y)) then 1 else 0) ;; val avg_ceil : int * int * bool -> int = <fun> utop # avg_ceil (1, 2, true) ;; - : int = 2 utop # avg_ceil (1, 2, false) ;; - : int = 1 ``` Next, we'll zoom out to look at more complex types that we briefly saw in `#lang plait` but will explore in depth in OCaml. # OCaml compound data types, pattern matching Broadly, whenever we design a new type `t` we need two things: 1. A way to ==build== (“introduce”, “construct”) values of type `t` 2. A way to ==use== (“eliminate”, “destruct”) values of type `t` ## Lists For example, for the cons `list` type we already previewed: :::info **Type:** `t list` **==Build==** `[]`, `::` Examples: `bool list` `int list` `int list list` **==Use==** `List.hd` and `List.tl` :new: Or, **pattern match** with ` x::xs'` ! ::: :::danger Note that OCaml's static types also rule out many lists that were allowed by Racket! ::: For example: ```ocaml utop # "a"::1::[] ;; Error: This expression has type int but an expression was expected of type string ``` We can see from this example that OCaml lists must be homogeneous: ==every element must have the same type==. ## Our first taste of pattern matching Across both Racket and OCaml, we have now seen many recursive functions that (1) have a case for the empty list, then (2) have a case that does something that uses both the first element of the list, and the remaining elements of the list. OCaml provides an extremely convenient syntax for this common form: **pattern matching**. We'll see more of this next class, but the basic idea is that you can write the following, for example, to find the product of a list of integers: ```ocaml (* xs has type a' list *) match xs with (* base case *) | [] -> 1 (* recursive case, can use x for head and xs' for tail *) | x::xs' -> x * (product xs') ``` Following the general pattern of: ```ocaml match <expr with compound type> with (* base case *) | <building base case> -> 1 (* recursive case *) | <building rec case> -> <use bound names> ``` # Group exercise: lists, pattern matching :::success Write a function that sums the numbers in a list using explicit tail recursion in OCaml. Use pattern matching, e.g., `| x::xs' ->`. Use an *internal* helper with `let rec helper ...` ::: :::spoiler Think, then click! Solution: explicit tail recursion ```ocaml let sum_list ls = let rec helper xs acc = match xs with | [] -> acc | x::xs' -> helper xs' (acc + x) in helper ls 0 ``` ::: ## Sum and Product Types: Tuples, Variants We've now seen many of the basic primitive types in OCaml, and a basic structured, homogeneous data type of `list`. But how do we _combine_ primitive types to make more complex data representations? Say we have two types: an integer, and a string. We have different ways of combining these depending on the purpose! Broadly, programming language people describe these as either _sum_ or _product_ types. If we think of "the set of all integers" as $I$ and "the set of all strings" as #, sometimes we want: 1. A datatype that pulls from *either* of these two sets: an integer *or* a string. This is a ==sum type==. :::success Have we seen examples of this in CS251? ::: :::spoiler Think, then click! Yes, the `type-expr` s from `plait`: `MathExpr` from lecture and `Expr`, `Value`, etc from A3: Interp. ::: 2. A datatype that includes components from *both* of the two sets: an integer *and* a string. This is a ==product type==. (Think: "cross product"). ## Sum Types: Variants _Sum_ types have their name because conceptually, going back to our "set of all integers" and "set of all strings" thought experiment, _sum_ types are choices taken from either of the sets. The total number of choices is the sum, rather than the product. We have already seen basic sum types: `plait` variants! OCaml also defines these as variant types: For example, if we want _either_ an integer or a string (maybe because we want an ID that can be either): ```ocaml type int_or_string = | MyInt of int | MyString of string ``` :::info **Type:** Variant Types **==Build==** Define the type as: ```ocaml type type_name = | Variant1 of ... | Variant2 of ... ``` Where each `...` could be a further compound type. Example types: `int_or_string` Calculator example (below) Construct a variant with: ```ocaml Variant1 <data> ``` For example: ``` Variant1 5 ``` **==Use==** Pattern match with: ```ocaml match e with | Variant1 ... -> (* case1 *) | Variant2 ... -> (* case2 *) ``` ::: We can use variant types to reproduce the core of our calculator example from `plait`: ```ocaml (* Define the type MathExpr *) type math_expr = | ENum of int | EAdd of math_expr * math_expr (* Function to evaluate a math expression *) let rec calc expr = match expr with | ENum value -> value | EAdd (x, y) -> (calc x) + (calc y) ``` Note that here, our `math_expr` variant type is recursive! The `EAdd` variant contains a pair of `math_expr`s. Examples of building variants: ```ocaml utop # let e0 = (ENum 0) ;; val e0 : math_expr = ENum 0 utop # let e1 = (ENum 1) ;; val e1 : math_expr = ENum 1 utop # let e2 = EAdd (e0, e1) ;; val e2 : math_expr = EAdd (ENum 0, ENum 1) ``` OCaml also has a built in variant type for *options*, like we saw in `plait`. :::info **Type:** option type **==Build==** Defined by OCaml as the type: ```ocaml type 'a option = | Some of 'a | None ``` Example types: `int option` **==Use==** Pattern match with: ```ocaml match e with | Some e' -> (* case1 *) | None -> (* case2 *) ``` ::: --- ## Product types: tuples and records The most simple product types in OCaml are _tuples_ (which you have seen before in Python). ## Product type: pairs (2-tuples) We'll start with tuples with two values: also known as _pairs_: :::info **Type:** Pairs (2-tuples) **==Build==** `(e1, e2)` where `e1`, `e2` are expressions Example types: `(int * int)` `(string * int)` `*` is a type constructor: it builds tuple types out of other types. **==Use==** Access with: `fst e` and `snd e` *Or* pattern match with: `let (e1, e2) = e`. ::: :::warning Note that `*` means different things when applied to *values* vs. *types*: `3 * 4` is an expression that evaluates to `12`, `int * int` is a type that represents a pair of integers. ::: ### Pair examples For example, we could use a pair to model a student with name and id: ```ocaml utop # let student : string * int = ("alice", 4) ;; val student : string * int = ("alice", 4) (fst student) ;; - : string = "alice" utop # (snd student) ;; - : int = 4 utop # let (name, id) = student in name ;; - : string = "alice" ``` ### General tuples We can also have more than two fields in a tuple: ```ocaml utop # let coordinate : float * float * float = (1., 2., 0.) ;; ``` Pairs (and more general tuples) are great for when we want to use multiple types _at once_ in a compound data type. If we have *a lot* of different types we want to combine into one data type, it can get confusing to remember which field is in which position. OCaml also provides a way to have *named* fields in product types: a *record*. ## Product type: records :::info **Type:** Records **==Build==** `{ f1 = e1; f2 = e2; ... }` where `e1`, `e2`, ... are expressions and `f1`, `f2`, ... are field names. Example types: ```ocaml { x : float; y : float; z: float } { name : string; age : int } ``` Each field has a **field name** and a **type**. **==Use==** Access fields with: `e.<fieldname>` *Or* pattern match with: ```ocaml let { f1 = x; f2 = y } = e (* ... *) | { f1 = x; f2 = y } -> (* or shorthand: *) let { f1; f2 } = e (* ... *) | { f1; f2 } -> ``` ::: For example, we can define a type: ```ocaml type student = {name : string; year : int} ``` which is a record with two fields, a `name` of type `string` and a `year` of type `int`. We can construct a student with: ```ocaml let s = {name = "alice"; year = 2027} ``` :::warning ### When to use records vs tuples When combining a large number of types into one (say, more than 3), it's arguably better to use a record instead of a tuple, so the programmer does not have to remember what position corresponds to what field. But the syntax is a bit more concise for variants. ::: ## Combining patterns Now that we have more interesting static types to play with, we can also write more interesting patterns for pattern matching. Say we wanted to sum a list of type `int option`, where `None`s are treated as zero. We could write this with nested match statements like so: ```ocaml let rec sum_int_options1 xs = match xs with | [] -> 0 | x::xs' -> match x with | None -> sum_int_options1 xs' | Some i -> i + sum_int_options1 xs' ``` However, we can also write a more concise, single match statement by nesting the option cases inside the list cases: ```ocaml let rec sum_int_options2 xs = match xs with | [] -> 0 | None::xs' -> sum_int_options2 xs' | (Some i)::xs' -> i + sum_int_options2 xs' ``` :::info Question froma student: what if we matched on something more specific, or didn't cover all cases? ::: :::spoiler Think, then click! If we wrote a silly match like this, that only checked for `Some 5`: ```ocaml let rec sum_int_options42 xs = match xs with | [] -> 0 | None::xs' -> sum_int_options42 xs' | (Some 5)::xs' -> 5 + sum_int_options42 xs' ``` Then we'd get a warning like this: ```ocaml File "options.ml", lines 12-15, characters 2-46: 12 | ..match xs with 13 | | [] -> 0 14 | | None::xs' -> sum_int_options42 xs' 15 | | (Some 5)::xs' -> 5 + sum_int_options42 xs' Warning 8 [partial-match]: this pattern-matching is not exhaustive. Here is an example of a case that is not matched: Some 0::_ ``` And an error when we actually tried to run this code on a list with other integers: ```ocaml utop # sum_int_options42 [Some 3; Some 4] ;; Exception: Match_failure ("options.ml", 12, 2). ``` ::: We'll continue with more examples of pattern matching next class.