Lecture 5: Higher-Order Functions

:::success Agenda for today - The most famous higher-order functions (HOF) - `map` (transform all!) - `filter` (keep some!) - `fold` (combine to one value!) - left vs right fold - Practice: - Writing HOFs - Using HOFs - If time: testing with RackUnit ::: # Announcements - A2 pairs out, due Thursday - Note: make sure names match exactly, as you functions will be graded in part via unit test suites. - Thinking about testing is an important learning goal of this course! In A3, your _tests_ will also be programmatically graded by how many _wrong_ implementations they can find. - My help hours: next week, likely moving to: - 3:45-5:15pm Mondays in L043 (this room) - 3:45-5:15pm Thursdays in L037 (systems lab) # Higher-order function terminology First, a bit of terminology: A **higher order function** is a function that takes one or more other functions as arguments and/or produces a function as a result. A language having **first-class** functions means that functions can be used in any other place a value can be used. This is also sometimes referred to as "functions as values". Why two different terms? "Higher-order functions" are a mathematical concept that applies outside of programming, where "first-class functions" is more commonly used to describe a feature of a programming language. # `map` (transform all) Consider the following warmup questions: ```racket (define (double-all ls) (if (empty? ls) '() (cons (* 2 (first ls)) (double-all (rest ls))))) (define (add-1-all ls) (if (empty? ls) '() (cons (+ 1 (first ls)) (add-1-all (rest ls))))) (define ls '(1 2 3 4 5)) ;; WARMUP QUESTIONS: ;; Q1: What is shared across these two implementations? ;; Q2: What is different? ;; Q3: Write two lambda expressions to capture what each function does to ;; each element of the list (the "different" piece, one for double-all ;; and one for add-1-all). ``` For Q1: both implementations produce a new list by applying some transformation to every element in the argument list. For Q2: `double-all` doubles the element, `add-1-all` adds 1 to the element. For Q3: here are the two lambdas: ```racket (lambda (x) (* 2 x)) ; doubles (lambda (x) (+ 1 x)) ; adds 1 ``` Now, let's write a function that can do _either_ of these actions based on the function passed in as an argument: we can start with `double-all`, rename it to `out`, then take the function to apply as an argument `f`: ```racket (define (out f ls) (if (empty? ls) '() (cons (f (first ls)) (out f (rest ls))))) ; note: need to pass f ``` We can then call this as: ```racket > (out (lambda (x) (* 2 x)) ls) '(2 4 6 8 10) > (out (lambda (x) (+ 1 x)) ls) '(2 3 4 5 6) ``` We have accomplished the same thing as our two original functions, without duplicating so much code! :::success This is in fact the **most famous** higher order function: better known by the name **map**! **map** applies a given function to each element of a collection, producing a new collection of results in the same respective order as the original list. ::: With its more common name: ```racket ; Most famous higher-order function! (define (map f xs) (if (empty? xs) '() (cons (f (first xs)) (map f (rest xs))))) (map (lambda (x) (* 2 x)) ls) (map (lambda (x) (+ 1 x)) ls) ``` Another question: based on our implementation, does the value we return for each element need to be the same type as the values in the list? Answer: No! Nothing here requires that the output have the same type. Consider: ```racket > (map (lambda (x) (number->string x)) ls) '("1" "2" "3" "4" "5") ``` Question: is this good style? Can you see how we can make it more concise? Answer: rather than using a lambda, we _already_ have a function that converts a number to a string: it's named `number->string`! That is, we don't need to introduce a lambda when we already have a binding that does what we need. It's much better style to do: ``` > (map number->string ls) ``` :::success :star: style point: don't use redundant lambdas! Functions are values, so you can map an existing variable directly. ::: ### Map in other languages Map is built into _many_ languages now: Python: ```python # Map in python doubled = list(map(lambda x: x * 2, numbers)) # Or: similar idea, list comprehension doubled = [x * 2 for x in numbers] ``` JavaScript: ```javascript const doubled = numbers.map(x => x * 2); ``` Rust: ```rust let doubled: Vec<i32> = numbers.iter().map(|x| x * 2).collect(); ``` OCaml: ```ocaml let doubled = List.map (fun x -> x * 2) numbers;; ``` ## More properties of map: :::success Properties of **map**: - Input items and return items do not need to be of the same type - Preserves the length of the original list ::: --- # `filter` (keep some) Let's look at another case where we could use a nice functional abstraction: ```racket (define (keep-even ls) (if (empty? ls) '() (let ((rs (keep-even (rest ls)))) (if (even? (first ls)) (cons (first ls) rs) rs)))) (define (keep-gt3 ls) (if (empty? ls) '() (let ((rs (keep-gt3 (rest ls)))) (if (> (first ls) 3) (cons (first ls) rs) rs)))) ``` What do both of these functions do? They both apply a check to every element of a list, producing a new list with only the values that passed the check. In the first function, the check is `even?`. In the second function, the check is '(lambda (x) (> x 3)`. This leads us to another famous higher-order function: **filter**! ```racket ; Another famous higher-order function. (define (filter f xs) (if (null? xs) null (if (f (first xs)) (cons (first xs) (filter f (rest xs))) (filter f (rest xs))))) ``` Rewriting our two functions from above: ```racket > (filter even? ls) '(2 4) > (filter (lambda (x) (> x 3)) ls) '(4 5) ``` Question: what restrictions do we have on the function for filter? There are in a sense two different answers: 1. *Conceptually*, we want it to return a boolean. 2. In practice, Racket treats anything other than `#f` as true. For example: ```racket > (filter (lambda (x) (+ 1 x)) ls) '(1 2 3 4 5) ``` :::warning Be aware of this as you are debugging! **filter** will run with functions that don't produce a boolean, but probably that was not what you meant to do. ::: ## Properties of **filter** :::success Properties of **filter**: - The length of the produced list may be anything from 0 to the length of the original list. - (In Racket) this does not mutate the list or its elements, conceptually it copies them to build a new list. ::: --- # `fold`: (reduce/combine to result) We'll motivate our third famous higher-order function with another pair of examples: ```racket (define (sum ls) (if (empty? ls) 0 (+ (first ls) (sum (rest ls))))) (define (product ls) (if (empty? ls) 1 (* (first ls) (product (rest ls)))) ``` Question: what is **not** shared across these two functions? This will let us know how many arguments our HOF needs (in addition to the list its operating on)? Answer: 1. The function to apply (e.g., `+` vs `*`) 2. The base case (here, the identity: `0` vs `1`) - we can also think of this as the "starting" value We call this *fold* (though in other languages, its sometimes called *reduce*). ## Fold group exercise: - Implement 2 version of fold - 1 should use natural recursion (same pattern as sum and product) - 1 should use tail recursion - Discuss: do both versions produce the same result? - It's likely helpful to draw out list of calls - '(fold + 0 ls) ; ls '(1 2 3)` - '(fold + (+ ? ?) (?))` - (Followup): what happens if we pass both '(fold cons empty lst)` Solution: ```racket ; Our third famous higher-order function: fold! ; Natural recursion (define (foldr func acc lst) (if (empty? lst) acc (func (first lst) (foldr func acc (rest lst))))) ; Note: foldr is not tail recursion, since foldr is not called ; as the outer-most expression of the "else" ; Said another way: after a recursive call returns, the function ; still needs to do the work of applying the function to the ; current element and the recursive result. ; In tail recursion, the function should have no work left by ; the time the recursive call returns ; Tail recursion ; Note we don't *need* a helper function with an additional ; acc, since we can use the argument tracking the "starting" ; value for that! (define (foldl func acc lst) (if (empty? lst) acc ; Note: correction from previous version of the notes, ; acc should be the 2nd arg to the (func ...) call to ; match Racket's built-in foldl order. In class, our ; specification was not precise enough to require this, ; but the version below *is* what I showed in class as ; the reference solution! (foldl func (func (first lst) acc) (rest lst)))) ``` ```racket ; Fold right recursion (foldr + 0 '(1 2 3)) ; result: (+ 1 (foldr + 0 '(2 3))) (foldr + 0 '(2 3)) ; result: (+ 2 (foldr + 0 '(3))) (foldr + 0 '(3)) ; result: (+ 3 (foldr + 0 '())) (foldr + 0 '()) ; result 0 ; Overall, computes (+ 1 (+ 2 (+ 3 0)) ; ^ base case, initial acc ; ^ processes elems -> to the right ; Note: acc stays the same, is just passed down to the base ; case ; Fold left recursion (foldl + 0 '(1 2 3)) (foldl + (+ 1 0) '(2 3)) (foldl + (+ 2 1) '(3)) (foldl + (+ 3 3) ()) ; returns (+ 3 3) = 6 ; Overall, computes (+ 3 (+ 2 (+ 1 0))) ; ^ initial acc ; ^ processes elems <- to the left ; 3 is the outer-most elem ; Note: acc changes on each recursive call ``` :star: `foldr` uses an accumulator, but is not tail recursive! - `foldl` is more efficient because it is tail recursive, but if order matters, it effectively visits the elements in reverse order. - `foldr` maintains the same order. This matters for e.g., string concatenation. If stack space is more of a concern then time, it's more efficient to `foldl` and then `reverse` (both tail recursive). # Combination HOF example Write a function that returns the sum of the squares of all the positive numbers in a list. ```racket ; Answer using all 3 famous HOFs! (foldl + 0 (map (lambda (x) (* x x)) (filter positive? ls))) ``` Note: this solution uses HOFs, but does not use _explicit_ recursion (it does not call a function recursively on the `rest` of `ls`, even though implicitly we know the implementations do recur). Assignment 2 asks you to do some similar examples. # Testing A few quick notes on [RackUnit](https://docs.racket-lang.org/rackunit/), which I encourage you to use for Assignment 2: ```racket ; Use the unit testing library RackUnit (require rackunit) ; Our function-under-test (define (f x) (cons x empty)) ; check-equal? checks for equivalent values (check-equal? (f 0) '(0)) ; succeeds ; check-eq? check for identical values (the same heap location) ; This matters for cons cells, but not always for literals ;(check-eq? (f 0) '(0)) ; fails, copy of cons cell ; For functions that return booleans, good style to use: (check-true (odd? (first (f 1)))) (check-false (odd? (first (f 2)))) ``` :::warning Important note: - Moving forward, I encourage you to use tests over commented-out interaction code! :::