Mutation - HackMD

:::warning Announcements - Help hours 3:45-5:15pm in L037 - A5 due tonight - Late deadline: Tuesday after SB. - Quiz 2: - Monday (Mar 31) the week after week back from SB - Content through Mutation (to be finished Monday after SB) ::: :::success Agenda for today - Mutation - References - Sequencing - Records, mutable fields - Mutable arrays (more on this next time) - Quiz 2 questions (more on this next time) ::: # Mutation We know that function programming focuses on _pure_ computations that produces values without side effects, typically by prioritizing recursion over iteration. _Pure_ here means functions are mathematically defined by their inputs and outputs, without side effects. However, both Racket and OCaml are not 100% pure languages—they do allow side effects. We've already seen cases of this, like I/O via printing to the terminal or dynamic errors due to division by zero. Functional programming languages allow mutability for several reasons: 1. For programs to have useful results, they often need side effects. 2. Some data structures are easier to implement with mutation. We'll start with the simplest OCaml type with mutation—`ref`—then build to some other cases. ## Reference cells _References_ in OCaml are locations in memory whose contents can be mutated. :::info **New type**: `t ref` where `t` is a type References are first-class in OCaml (they are values that can be used anywhere other values can be used) New expressions to construct a `ref`: - `ref e` to create a reference with initial contents `e` - If `e` has type `t`, `ref e` has type `t ref` - Note the constructor name and the type name are the same: this is fine, OCaml will figure out which one you mean via the context. The constructor comes before the contents, the type `ref` comes after the element type (similar to the `list` type). New expressions to use a `ref`: - `e1 := e2` to update contents - `:=` is an operator with type `'a ref -> 'a -> unit` - `!e` to retrieve contents (`!` is not negation in OCaml, which is `not`) ::: Here is an examples of constructing two references: ```ocaml let x = ref 10 (* : int ref *) (* ____ *) (* x -> | 10 | *) (* ---- *) let y = ref 20 (* : int ref *) (* ____ *) (* x -> | 10 | *) (* ---- *) (* ____ *) (* y -> | 20 | *) (* ---- *) ``` Then seeing the contents of reference `x` to be the contents of reference `y` + 1: ```ocaml let _ = x := !y + 1 (* ____ *) (* x -> | 21 | *) (* ---- *) (* ____ *) (* y -> | 20 | *) (* ---- *) ``` In code demo: ```ocaml (* refs and mutable state *) let x = ref 10 let y = ref 20 (* ____ *) (* x -> | 10 | *) (* ---- *) (* ____ *) (* y -> | 20 | *) (* ---- *) let z = x (* ____ *) (* x -> | 10 | *) (* z -> ---- *) (* ____ *) (* y -> | 20 | *) (* ---- *) let _ = x := 21 (* ____ *) (* x -> | 21 | *) (* z -> ---- *) (* ____ *) (* y -> | 20 | *) (* ---- *) let w = !x + !z (* 21 + 21 = 42! *) (* x + 1 *) (* does not typecheck! *) let w = (!y) + (!z) (* 41 *) ``` Note that `ref` first evaluates its argument to a value before it is saved as the contents: ```ocaml utop # ref (1 + 2) ;; - : int ref = {contents = 3} utop # ref (1 / 0) ;; Exception: Division_by_zero. ``` `ref`s themselves are also values. :::warning If you've programed in C, you'll notice that `ref`s are similar to pointers. However, unlike C, OCaml is _memory safe_ (more on this later in the semester), which in part means that `ref`s are much less flexible than C pointers. You cannot cast back and forth between integers and references, for example, and you cannot do pointer arithmetic in OCaml. ::: ### Variables vs. reference contents - A variable bound to a reference (e.g., `x`) is still immutable: it will always refer to the same reference cell (or spot in memory). - But the contents of the reference may change via `:=` - And there may be aliases to the reference, which sees the contents change! That is, the contents `x` references might change even if `x` itself is not used directly. E.g., `z` aliases `x` in the previous example! :::danger - **Aliasing** in general: when one location for data in memory can be accessed through two or more different names in the program. Aliasing matters for references because of mutability! ::: ### Evaluation and typing rules for `ref` Constructing references :::info **Construct with**: `ref e` **Type rules** - If `e : t`, `ref e : t ref` **Eval rules** - Evaluate `e` to a value `v` - Allocate a new location `loc` in memory to hold `v` - Store `v` in `loc` - Return `loc` ::: Mutating references (ref assignment) :::info **Use with**: `e1 := e2` **Type rules** - If `e1 : t ref` and `e2: t`, then `e1 : e2 : unit` **Eval rules** - Evaluate `e2` to a value `v` and `e1` to a location `loc` - Store `v` in `loc` - Return `()` (the unit value) ::: Access contents (deference) :::info **Use with**: `!e` **Type rules** - If `e : t ref`, then `!e : t` **Eval rules** - Evaluate `e` to a location `loc`. - Return the contents of `loc`. ::: ## Records, mutable fields We've been working with one OCaml product type: tuples. We'll now introduce another OCaml product type, tuple, that is by default immutable, but can be combined with mutation. _Records_ are another composite type of other types of data, where instead of the types being combined just by position, they are combined by name. This is similar to structs in C (and Rust) and slightly similar to fields in Java/python. For examples, we can define a type: ```ocaml type student = {name : string; year : int} ``` which is a record with two fields, a `name` of type `string` and a `year` of type `int`. We can construct a student with: ```ocaml let s = {name = "alice"; year = 2027} ``` :::info **New type**: Records Construct with: `{f1 = e1; ...; fn = en}` Use with: - field access `e.f`. `f` must be a a field name, not an expression. - Pattern matching: - `{f1 = p1; ...; fn = pn} -> ` binds `p1`... `pn` to their respective fields. - Or syntactic sugar: `{f1; ...; fn}` ::: :::warning ### When to use records vs tuples When combining a large number of types into one (say, more than 3), it's arguably better to use a records instead of a tuple, so the programmer does not have to remember what position corresponds to what field. But the syntax is a bit more concise for variants. ::: ### Mutable record fields OCaml also lets you declare fields of a record as mutable, which allows their contents to change without constructing a new records. For example: ```ocaml type student = {name : string; year : int; mutable num_classes : int} ``` The operator `<-` updates a mutable field, for example, ```ocaml let s2 = {name = "alice"; year = 2027; num_classes = 4} ``` We can then update the field with: ```ocaml s2.num_classes <- 5 ``` ### What's the difference? Refs actually _are_ mutable record fields! OCaml actually implements the `ref` type via the mutable field mechanism. OCaml's standard library contains: ```ocaml type 'a ref = { mutable contents : 'a } ``` The other syntax we saw earlier are actually syntactic sugar! E.g., `!` and `:=` desugar to roughly: ```ocaml let ( ! ) r = r.contents let ( := ) r x = r.contents <- x ``` ## Sequencing of side effects Sometimes when using expressions just for their side effects, it's useful to describe a chain of effectful expressions. We can do this with `;` (similar to non-expression languages). :::info **Sequence** `e1; e2` **Type checking**: if `e1: unit` and `e2 : t`, `e1; e2 : t`. Similar for longer chains (all `unit` until a `t` in the last expression). **Eval rules** - First evaluate `e1` to a value `v1`. - Then evaluate `e2` to a value `v2`. - Return `v2`. (`v1` is not used at all.) - If there are multiple expressions in a sequence, e.g., `e1; e2; ...; en`, then evaluate each one in order from left to right, returning only `vn`. ::: # Physical Equality The OCaml documentation states: > `e1 == e2` tests for physical equality of `e1` and `e2`. On mutable types such as references, arrays, byte sequences, records with mutable fields and objects with mutable instance variables, `e1 == e2` is true if and only if physical modification of `e1` also affects `e2`. On non-mutable types, the behavior of ( `==` ) is implementation-dependent; however, it is guaranteed that `e1 == e2` implies compare `e1 e2 = 0`. A `ref` is physically equal to itself and any aliases, but not to another `ref` that is a different location in memory, even if it has the same contents. Different types can have special equality: eg., `String.equal` looks for structural equality. We can see this with mutable records: ```ocaml let s3 = {name = "alice"; year = 2027; num_classes = 5} let physically_eq = s2 == s3 (* false *) let structurally_eq = s2 = s3 (* true *) ``` # Group exercises ```ocaml (* Exercise 1: draw boxes for reference locations. Use them to predict the output of all bindings in this program. *) let a = ref 1 let b = ref 2 let c = ref 1 let b1 = a = c let b2 = a == c let d = b let b3 : bool = d = b let b4 = d == b let _ = (d := 1) let b5 = a = b ``` Solution, as provided by `utop`: ```ocaml utop # let a = ref 1 let b = ref 2 let c = ref 1 ;; val a : int ref = {contents = 1} val b : int ref = {contents = 2} val c : int ref = {contents = 1} utop # let b1 = a = c ;; val b1 : bool = true utop # let b2 = a == c ;; val b2 : bool = false utop # let d = b ;; val d : int ref = {contents = 2} utop # let b3 = d = b ;; val b3 : bool = true utop # let b4 = d == b ;; val b4 : bool = true utop # ignore (d := 1) ;; - : unit = () utop # d ;; - : int ref = {contents = 1} utop # b ;; - : int ref = {contents = 1} utop # let b5 = a = b ;; val b5 : bool = true ``` ```ocaml (* Exercise 2: draw boxes for reference locations. Use them to predict the output of all bindings in this program. *) (* Note: exercise adapted from OCaml Programming textbook https://cs3110.github.io/textbook/chapters/mut/exercises.html *) (* Java has `++` and `+=` operators. Informally describe what each one does. Then, define OCaml equivalents on `ref` types, like so:*) let ( ++ ) x = x := !x + 1 (* Note: important thing here is the body, OCaml is picky about operator syntax in a way that makes this one impossible to define exactly how it's used in Java/C: https://ocaml.org/docs/operators#allowed-operators *) let ( += ) x y = x := !x + y ``` --- ## Mutable arrays Arrays in OCaml are similar to primitive arrays in Java or C in that they are ordered sequences of elements with constant-time access and mutation. :::success Recall: how does the $O(...)$ runtime differ between arrays and linked list (which `cons` lists are a form of)? Accessing an element in a linked list or cons list of length $n$ is $O(n)$, while for arrays it is $O(1)$. ::: In OCaml, array sizes must be specified at construction and cannot be changed. More on this next time!