Lecture 7: Intro OCaml, static types --- :::warning Announcements - **Please sit in the first 2 rows of the room**. - A3: Interp parts 2-4 due Tuesday, Thursday, Sunday - Quiz 1 next Thursday, the 9th - Logistics page [here](https://cs.wellesley.edu/~cs251/f25/notes/quiz1.html), including practice quiz. - Next help hours: - Mine 6:30-8:30pm tonight in L037 - Grace's 1-2pm Tuesday on Zoom ::: :::success Agenda for today - Finish lexical vs dynamic scope ([prior notes](https://hackmd.io/@avh/HyaUZJQnxx#Why-lexical-scope)) _**(end of Quiz 1 material)**_ --- - Motivation/context for OCaml - Static types - Intro to OCaml - Bindings for evaluation *and* types - Recursion in OCaml - Basics of pattern matching, tuples, records, variant types (if time) ::: # New language: OCaml! For the next segment of the course, we'll see how these language fundamentals apply in a new context: in the functional, statically-typed language **OCaml**! ## Why OCaml? As we discussed in the start of the semester, new languages help us (1) become better programmers in the languages we already know, and (2) practice the skill of learning new languages. In addition, OCaml is just really fun! Some reasons: - Powerful static types (like the sports car of type systems) - Good libraries, industry use (e.g., Jane Street) I highly recommend the (free, online, videos included) [OCaml textbook](https://cs3110.github.io/textbook/cover.html) linked to from the Resources page. ## A very brief history of OCaml **"The other"** ML: ML: Meta-Language for Theorem-Proving - Developed in the early 1970s at the University of Edinburgh, for treating programs as proofs. - It's creator, Milner, won the [Turing Award](https://amturing.acm.org/award_winners/milner_1569367.cfm) (the "CS Nobel Prize") for "ML, the first language to include polymorphic type inference together with a type-safe exception-handling mechanism". - OCaml is the most widely-used/taught modern descendant of ML. --- # Types in general Zooming out a bit, let's consider _types_ in general. :::success Question: what is a _type_ (in a programming sense)? ::: :::spoiler Think, then click! A type is... - How data is represented in binary, as 0s and 1s - A set of values that share some property - A prediction about the value an expression will yield. E.g., if the expression `(+ 1 0)` has type `int`, we can predict that it will yield an int-shaped value even before evaluating the expression. ::: :::success Question: Why do programming languages use types? ::: :::spoiler Think, then click! - Types help catch certain kinds of errors (see below). - Types help with documentation. Comments can get out-of-date/state: types cannot! - Compilers can exploit types to make code faster. For example, if the type system rules out invalid field access, then the compiled code does not need to check the type of the data before accessing the field. ::: :::success What kind of errors can types rule out? ::: :::spoiler Think, then click! - Adding a number and a string - Calling a function that returns the wrong number or type of arguments - Calling a function that returns the wrong result - Storing data of the wrong kind to a data structure - For those who have taken 240: silently corrupting memory In more advanced languages, like Rust: - Eliminating concurrency bugs ::: In practice, type systems often reflect categories of values that share implementation-independent properties. Integers: - support arithmetic: adding, subtracting, _truncating_ divide - Produce other integers as a result of the above - Can be formatted a certain way Lists: - Can be concatenated - Can have elements appended - Have a first item ## Static type checking In static type checking, a program is checked for type errors before it is run/evaluated. The benefits of this are that (1) it catches certain errors as early as possible, and (2) it enables languages implementations (compilers, interpreters) to produce more efficient code. The downside of this is that it will rules out programs that _may_ have run as intended, because the static type analysis determines that it is _possible_ for them to have an error. We'll see some examples of this in OCaml. --- # Types as a window into OCaml In A3: Interp, you'll use environments to track variable bindings. These environments we have worked with so far this semester are _dynamic_ environments: values are added to them as the program executes (not to be confused with dynamic scope, which is about where lambda/function environments are saved). This is essentially how the core of Racket operates. OCaml instead maintains **two distinct kinds** of environments: 1. ==The dynamic environment (runtime)==: Maps names to actual values. This is analogous to what we have been using so far with Racket. 2. ==:new: the static environment (compile-time)==: Maps names to types. These environments help the compiler and the runtime system keep track of identifier types and values. The meaning of a binding in OCaml is to update both the static and dynamic environments. OCaml uses the static environment to ensure type correctness before execution (at compile-time). OCaml uses the dynamic environment to handle actual values during execution (at runtime). --- ## Running OCaml We'll first show running OCaml in a Read-Eval-Print Loop (REPL) similar to the DrRacket interactions window. We'll run this REPL in any IDE: I'll show examples in VSCode's terminal. :::info If you have not used VSCode before, you can install it [here](https://code.visualstudio.com/download). I suggest installing OCaml itself (including `utop` for the REPL and `dune` for the build system) with [these instructions](https://sites.google.com/cs.washington.edu/cse341autumn2024/resources/software-setup) from CS341 at the University of Washington. ::: The OCaml REPL is called utop. - To start utop: after installing OCaml and then `utop`, type `utop` in your VSCode terminal. - Run commands: Type OCaml expressions and see immediate results. Use `;;` to signal the end of the top-level expressions. - Load a file: Use `#use "filename.ml";;` to load and execute a file. Example: ```ocaml #use "my_program.ml";; ``` In OCaml, we use `let` bindings to associate names with values (including functions). `let` takes the place of `define` in Racket: its used for both top-level and local bindings. An OCaml program is a series of bindings. ```ocaml (* Binding a variable *) let x = 3 (* Static Environment Update: { x : int } *) (* Dynamic Environment Update: { x : 3 } *) (* Binding another variable *) let y = 6 (* Static Environment Update: { x : int; y : int } *) (* Dynamic Environment Update: { x : 3; y : 6 } *) ``` ## Conditionals in OCaml Unlike Racket, OCaml conditionals do use the `then` and `else` keywords. OCaml also uses infix, rather than prefix, operators. ```ocaml utop # let result = if y > 5 then 0 else 1 ;; val result : int = 0 ``` Note: in `utop`, we **have** to use `;;` to indicate the end of the top-level expressions that we want evaluated. :::warning `;;` is not required in files! It's better OCaml style to omit it in files. ::: Beyond the syntactic differences, OCaml's static types give conditionals a different semantics to Racket's. In particular, the following is an error: ```ocaml utop # let result = if y > 5 then 0 else "1" ;; Error: This expression has type string but an expression was expected of type int ``` :::success _Why_ is this an error in OCaml? ::: :::spoiler Think, then click! In order to prevent bugs with types, OCaml needs to be able to assign a type to _every_ subexpression. To assign a type to `if y > 5 then 0 else "1"` _without_ evaluating the expression, we need to impose the more strict requirement that regardless of whether `y > 5`, the overall value should have the same type. As we saw in A0, this is different than Racket, which does not have static types? ::: :::info Student question: does the first argument to `if` need to be a boolean? ::: :::spoiler Think, then click! Let's try it out in `utop`! ```ocaml utop # if 0 then 1 else 2 ;; Error: This expression has type int but an expression was expected of type bool because it is in the condition of an if-statement ``` So *yes*, unlike Racket, OCaml requires the first argument to have type `bool`. ::: :::info Student question: does OCaml not have a sense of "truthy"? ::: :::spoiler Think, then click! No, not in the sense that Racket does. Booleans in OCaml look like this: ```ocaml utop # true ;; - : bool = true utop # false ;; - : bool = false ``` ::: # Functions in OCaml Like in Racket, OCaml functions are first-class values. They can be defined, passed as arguments, and returned from other functions. ```ocaml (* Defining Functions Syntax: let function_name ((param_1: type_1), ..., (param_n: type_n)) : return_type = (* function body *) Example: *) let max (a: int) (b: int) : int = if a < b then b else a (* Function Name: max *) (* Parameters: two ints named a and b both of type int *) (* Return Type: int *) (* Function Body: Uses an `if` expression to determine the maximum *) (* We also can leave off the types *) let max2 a b = if a < b then b else a;; (* val max2 : 'a -> 'a -> 'a = <fun> *) (* (We'll cover _type inference_ later in the semester *) let increment x = x + 1 (* val increment : int -> int = <fun> increment is the identifier to which the value was bound. int -> int is the type of the value. This is the type of functions that take an int as their argument and produce an int as output. The value is a function, which the utop chooses to display just as <val> rather than the full body. *) ``` # Recursion in OCaml OCaml, as a functional language, of course supports recursion! We use `let rec` to indicate a binding that can refer to itself: ```ocaml let rec pow (base: int) (exp: int) : int = if exp = 0 then 1 else base * pow base (exp - 1) (* Function Name: pow *) (* Parameters: two ints, base and exp, both of type int *) (* Return Type: int *) (* Base Case: If exp = 0, return 1 *) (* Recursive Case: Multiply base by pow (base, exp - 1) *) ``` We can call this function with either: ```ocaml (pow 2 2) (* Or: *) pow 2 2 ``` # Lists in OCaml Lists are also a recursive data structure in OCaml: we have the empty list, then non-empty lists that _cons_ together an element and a list. When we enter the empty list into `utop`, we get: ```ocaml utop # [] ;; - : 'a list = [] ``` Because _every_ expression in OCaml must have a type, OCaml uses `'a` to indicate that this is a list of some *polymorphic*, unspecified type. That is, the empty list could be a list of anything! This is similar to generic in Java, and we will explore it in much more detail in the coming weeks! We can create a non-empty list with cons, which is written as an infix `::`: ```ocaml utop # 1::[] ;; - : int list = [1] ``` Now, we see the type is `int list`. We could also have a list of lists of int: `int list list`. As we make the list longer: ```ocaml utop # 1::2::[] ;; - : int list = [1; 2] ``` We see that OCaml prints list values as `[v0; ... ; vn]`. Note the `;` instead of `,`. :::warning We got this far in Monday's lecture. We'll continue with this topic on Thursday! ::: OCaml's static types also rule out many lists that were allowed by Racket. For example: ```ocaml utop # "a"::1::[] ;; Error: This expression has type int but an expression was expected of type string ``` That is, lists must be homogeneous: every element should have the same type. We can get the first (_head_) and rest (_tail_) with: ```ocaml utop # List.hd ls;; - : int = 1 utop # List.hd(ls);; - : int = 1 utop # List.tl(ls) ;; - : int list = [2; 3; 4] ``` :::info Note, we can leave off the parens when calling functions in OCaml if the expression does not need them to specify order of operations. ::: # Local bindings in OCaml In OCaml, local bindings with `let` take the form `let <id> = <e> in <body>`. You can use `and` for chains of lets: ```ocaml let x = 1 and y = 2 in x + y ``` The equivalent to Racket's `let*` would be to write a series of `let` statements. This style is somewhat more common than the `and` above: ```ocaml let x = 1 in let y = x + 2 in x + y ``` # Group exercise :::success Write a function that sums the numbers in a list using explicit natural recursion, then tail recursion, in OCaml. Use `List.hd`, `List.tl` to access the head and tail. Check for empty with `= []`. ::: :::spoiler Think, then click! Solution: explicit natural recursion. Note we can leave the types off, since the `+` in OCaml can only apply to `int`s: so OCaml infers the full types for us. ```ocaml let rec sum_list lst = if lst = [] then 0 else List.hd lst + sum_list (List.tl lst) ``` Solution: explicit tail recursion ```ocaml let sum_list ls = let rec helper ls acc = if ls = [] then acc else helper (List.tl ls) (acc + (List.hd ls)) in helper ls 0 ``` ::: Note, for local `let` bindings, we use `in` to separate the expression being bound from the body: `let x = e1 in body`. Next class, we'll see the more cannonical way to write this in OCaml with _pattern matching_.