Lecture 8: OCaml cont., compound static types
---
:::warning
Announcements
- **Please sit in the first 2 rows of the room**.
- A3: Interp parts 3, 4 due tonight, Sunday
- Quiz 1 next Thursday, the 9th
- Logistics page [here](https://cs.wellesley.edu/~cs251/f25/notes/quiz1.html), including practice quiz.
- Next help hours:
- Emily's 2-4pm Friday L037
:::
:::success
Agenda for today
- Continuing OCaml basics
- Local bindings, floats, strings
- Lists in more detail
- Pattern matching
- Group exercise: lists, pattern matching
- Sum vs. product types
- Variant types revisited
- Tuples, records
:::
Today, we'll continue to see the basics of OCaml and how it relates to what we already know from Racket. Then, we'll move on to some :new: exciting language features in OCaml: richer composite types.
# Continuing OCaml basics
## Local bindings in OCaml
In OCaml, local bindings with `let` take the form `let <id> = <e> in <body>`.
You can use `and` for chains of lets:
```ocaml
let x = 1
and y = 2 in
x + y
```
The equivalent to Racket's `let*` would be to write a series of `let` statements. This style is somewhat more common than the `and` above:
```ocaml
let x = 1 in
let y = x + 2 in
x + y
```
We can use local bindings inside a top-level function binding:
```ocaml
utop: let f x = let y = 2 in (x * y);;
val f : int -> int = <fun>
utop: f 5 ;;
- : int = 10
```
:::warning
Note here that OCaml does not require parentheses in function calls - they're only necessary to disambiguate order of operations. Here, in calling `f` and passing `5`, we don't need to group anything with parens.
:::
## Floats
Recall that last lecture, we saw that using the `+` operator allowed OCaml to infer that the arguments had type `int`:
```ocaml
utop: let g x = 1 + x ;;
val g : int -> int = <fun>
```
:::info
This implies that `+` can only be used on OCaml integers! How can we add other types?
:::
OCaml has a distinct type for `float`s.
```ocaml
utop # 5.3 ;;
- : float = 5.3
```
Like the `.` indicating that a constant should be read as a `float`, OCaml's operators on `float` include a `.`:
```ocaml
utop # let h x = 1.0 +. x ;;
val h : float -> float = <fun>
```
We cannot add integers and floats with this alone, since it violates OCaml's static typing semantics:
```ocaml
utop # 1.0 +. 2 ;;
Error: This expression has type
int
but an expression was expected of type
float
```
However, we can use an explicit conversion operator:
```ocaml
utop # 1.0 +. (float_of_int 2) ;;
- : float = 3.
```
:::info
This pattern of `type1_of_type2` is used across many OCaml types.
:::
Note that we can also leave off the zero:
```ocaml
utop # 1. ;;
- : float = 1.
```
## Strings
OCaml strings are what you might expect:
```ocaml
utop # "hi" ;;
- : string = "hi"
```
We cannot add integers and strings:
```ocaml
utop # "hi" + 0 ;;
Error: This expression has type string
but an expression was expected of type
int
utop # "1.0" + 0 ;;
Error: This expression has type string
but an expression was expected of type
int
```
However, we can use a conversion:
```ocaml
utop # (int_of_string "1") + 0 ;;
- : int = 1
```
We can also append strings using `^`:
```ocaml
"1" ^ string_of_int 0 ;;
```
## Boolean operators
OCaml uses `true`, `false`, infix `&&` and `||`, as well as the keyword `not`.
To demonstrate some of these features together:
```ocaml
utop # let is_even n = n mod 2 = 0 ;;
val is_even : int -> bool = <fun>
utop # let avg_ceil ((x: int), (y: int), (ceil: bool)) =
((x + y) / 2) + (if ceil && not (is_even (x + y)) then 1 else 0) ;;
val avg_ceil : int * int * bool -> int = <fun>
utop # avg_ceil (1, 2, true) ;;
- : int = 2
utop # avg_ceil (1, 2, false) ;;
- : int = 1
```
Next, we'll zoom out to look at more complex types that we briefly saw in `#lang plait` but will explore in depth in OCaml.
# OCaml compound data types, pattern matching
Broadly, whenever we design a new type `t` we need two things:
1. A way to ==build== (“introduce”, “construct”) values of type `t`
2. A way to ==use== (“eliminate”, “destruct”) values of type `t`
## Lists
For example, for the cons `list` type we already previewed:
:::info
**Type:** `t list`
**==Build==**
`[]`, `::`
Examples:
`bool list`
`int list`
`int list list`
**==Use==**
`List.hd` and `List.tl`
:new: Or, **pattern match** with ` x::xs'` !
:::
:::danger
Note that OCaml's static types also rule out many lists that were allowed by Racket!
:::
For example:
```ocaml
utop # "a"::1::[]
;;
Error: This expression has type int but an expression was expected of type
string
```
We can see from this example that OCaml lists must be homogeneous: ==every element must have the same type==.
## Our first taste of pattern matching
Across both Racket and OCaml, we have now seen many recursive functions that (1) have a case for the empty list, then (2) have a case that does something that uses both the first element of the list, and the remaining elements of the list.
OCaml provides an extremely convenient syntax for this common form: **pattern matching**.
We'll see more of this next class, but the basic idea is that you can write the following, for example, to find the product of a list of integers:
```ocaml
(* xs has type a' list *)
match xs with
(* base case *)
| [] -> 1
(* recursive case, can use x for head and xs' for tail *)
| x::xs' -> x * (product xs')
```
Following the general pattern of:
```ocaml
match <expr with compound type> with
(* base case *)
| <building base case> -> 1
(* recursive case *)
| <building rec case> -> <use bound names>
```
# Group exercise: lists, pattern matching
:::success
Write a function that sums the numbers in a list using explicit tail recursion in OCaml.
Use pattern matching, e.g., `| x::xs' ->`. Use an *internal* helper with `let rec helper ...`
:::
:::spoiler Think, then click!
Solution: explicit tail recursion
```ocaml
let sum_list ls =
let rec helper xs acc =
match xs with
| [] -> acc
| x::xs' -> helper xs' (acc + x)
in
helper ls 0
```
:::
## Sum and Product Types: Tuples, Variants
We've now seen many of the basic primitive types in OCaml, and a basic structured, homogeneous data type of `list`. But how do we _combine_ primitive types to make more complex data representations?
Say we have two types: an integer, and a string.
We have different ways of combining these depending on the purpose!
Broadly, programming language people describe these as either _sum_ or _product_ types.
If we think of "the set of all integers" as $I$ and "the set of all strings" as #, sometimes we want:
1. A datatype that pulls from *either* of these two sets: an integer *or* a string. This is a ==sum type==.
:::success
Have we seen examples of this in CS251?
:::
:::spoiler Think, then click!
Yes, the `type-expr` s from `plait`: `MathExpr` from lecture and `Expr`, `Value`, etc from A3: Interp.
:::
2. A datatype that includes components from *both* of the two sets: an integer *and* a string. This is a ==product type==. (Think: "cross product").
## Sum Types: Variants
_Sum_ types have their name because conceptually, going back to our "set of all integers" and "set of all strings" thought experiment, _sum_ types are choices taken from either of the sets. The total number of choices is the sum, rather than the product.
We have already seen basic sum types: `plait` variants! OCaml also defines these as variant types:
For example, if we want _either_ an integer or a string (maybe because we want an ID that can be either):
```ocaml
type int_or_string =
| MyInt of int
| MyString of string
```
:::info
**Type:** Variant Types
**==Build==**
Define the type as:
```ocaml
type type_name =
| Variant1 of ...
| Variant2 of ...
```
Where each `...` could be a further compound type.
Example types:
`int_or_string`
Calculator example (below)
Construct a variant with:
```ocaml
Variant1 <data>
```
For example:
```
Variant1 5
```
**==Use==**
Pattern match with:
```ocaml
match e with
| Variant1 ... -> (* case1 *)
| Variant2 ... -> (* case2 *)
```
:::
We can use variant types to reproduce the core of our calculator example from `plait`:
```ocaml
(* Define the type MathExpr *)
type math_expr =
| ENum of int
| EAdd of math_expr * math_expr
(* Function to evaluate a math expression *)
let rec calc expr =
match expr with
| ENum value -> value
| EAdd (x, y) -> (calc x) + (calc y)
```
Note that here, our `math_expr` variant type is recursive! The `EAdd` variant contains a pair of `math_expr`s.
Examples of building variants:
```ocaml
utop # let e0 = (ENum 0) ;;
val e0 : math_expr = ENum 0
utop # let e1 = (ENum 1) ;;
val e1 : math_expr = ENum 1
utop # let e2 = EAdd (e0, e1) ;;
val e2 : math_expr = EAdd (ENum 0, ENum 1)
```
OCaml also has a built in variant type for *options*, like we saw in `plait`.
:::info
**Type:** option type
**==Build==**
Defined by OCaml as the type:
```ocaml
type 'a option =
| Some of 'a
| None
```
Example types:
`int option`
**==Use==**
Pattern match with:
```ocaml
match e with
| Some e' -> (* case1 *)
| None -> (* case2 *)
```
:::
---
## Product types: tuples and records
The most simple product types in OCaml are _tuples_ (which you have seen before in Python).
## Product type: pairs (2-tuples)
We'll start with tuples with two values: also known as _pairs_:
:::info
**Type:** Pairs (2-tuples)
**==Build==** `(e1, e2)` where `e1`, `e2` are expressions
Example types:
`(int * int)`
`(string * int)`
`*` is a type constructor: it builds tuple types out of other types.
**==Use==**
Access with: `fst e` and `snd e`
*Or* pattern match with: `let (e1, e2) = e`.
:::
:::warning
Note that `*` means different things when applied to *values* vs. *types*: `3 * 4` is an expression that evaluates to `12`, `int * int` is a type that represents a pair of integers.
:::
### Pair examples
For example, we could use a pair to model a student with name and id:
```ocaml
utop # let student : string * int = ("alice", 4) ;;
val student : string * int = ("alice", 4)
(fst student) ;;
- : string = "alice"
utop # (snd student) ;;
- : int = 4
utop # let (name, id) = student in name ;;
- : string = "alice"
```
### General tuples
We can also have more than two fields in a tuple:
```ocaml
utop # let coordinate : float * float * float = (1., 2., 0.) ;;
```
Pairs (and more general tuples) are great for when we want to use multiple types _at once_ in a compound data type.
If we have *a lot* of different types we want to combine into one data type, it can get confusing to remember which field is in which position.
OCaml also provides a way to have *named* fields in product types: a *record*.
## Product type: records
:::info
**Type:** Records
**==Build==** `{ f1 = e1; f2 = e2; ... }` where `e1`, `e2`, ... are expressions and `f1`, `f2`, ... are field names.
Example types:
```ocaml
{ x : float; y : float; z: float }
{ name : string; age : int }
```
Each field has a **field name** and a **type**.
**==Use==**
Access fields with: `e.<fieldname>`
*Or* pattern match with:
```ocaml
let { f1 = x; f2 = y } = e
(* ... *)
| { f1 = x; f2 = y } ->
(* or shorthand: *)
let { f1; f2 } = e
(* ... *)
| { f1; f2 } ->
```
:::
For example, we can define a type:
```ocaml
type student = {name : string; year : int}
```
which is a record with two fields, a `name` of type `string` and a `year` of type `int`.
We can construct a student with:
```ocaml
let s = {name = "alice"; year = 2027}
```
:::warning
### When to use records vs tuples
When combining a large number of types into one (say, more than 3), it's arguably better to use a record instead of a tuple, so the programmer does not have to remember what position corresponds to what field. But the syntax is a bit more concise for variants.
:::
## Combining patterns
Now that we have more interesting static types to play with, we can also write more interesting patterns for pattern matching.
Say we wanted to sum a list of type `int option`, where `None`s are treated as zero.
We could write this with nested match statements like so:
```ocaml
let rec sum_int_options1 xs =
match xs with
| [] -> 0
| x::xs' ->
match x with
| None -> sum_int_options1 xs'
| Some i -> i + sum_int_options1 xs'
```
However, we can also write a more concise, single match statement by nesting the option cases inside the list cases:
```ocaml
let rec sum_int_options2 xs =
match xs with
| [] -> 0
| None::xs' -> sum_int_options2 xs'
| (Some i)::xs' -> i + sum_int_options2 xs'
```
:::info
Question froma student: what if we matched on something more specific, or didn't cover all cases?
:::
:::spoiler Think, then click!
If we wrote a silly match like this, that only checked for `Some 5`:
```ocaml
let rec sum_int_options42 xs =
match xs with
| [] -> 0
| None::xs' -> sum_int_options42 xs'
| (Some 5)::xs' -> 5 + sum_int_options42 xs'
```
Then we'd get a warning like this:
```ocaml
File "options.ml", lines 12-15, characters 2-46:
12 | ..match xs with
13 | | [] -> 0
14 | | None::xs' -> sum_int_options42 xs'
15 | | (Some 5)::xs' -> 5 + sum_int_options42 xs'
Warning 8 [partial-match]: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
Some 0::_
```
And an error when we actually tried to run this code on a list with other integers:
```ocaml
utop # sum_int_options42 [Some 3; Some 4] ;;
Exception: Match_failure ("options.ml", 12, 2).
```
:::
We'll continue with more examples of pattern matching next class.