---
tags: Documents-S23
title: Lambda expressions for Table functions
---

# Lambda expressions for Table functions (addendum to DCIC)

## Addendum to Section 4.1.4

[Section 4.1.4](https://dcic-world.org/2024-09-03/intro-tabular-data.html#%28part._.Processing_.Rows%29) of your textbook gives some examples of using `filter-with` and `build-column` in a format slightly different from the one we are going to use in class.

**For [4.1.4.1 Finding Rows](https://dcic-world.org/2024-09-03/intro-tabular-data.html#%28part._.Finding_.Rows%29):**
* The textbook writes: `filter-with(shuttle, below-1K)`
* We want you to use: `filter-with(shuttle, lam(r): below-1K(r) end)`

**For [4.1.4.3 Adding New Columns](https://dcic-world.org/2024-09-03/intro-tabular-data.html#%28part._.Adding_.New_.Columns%29):**
* The textbook writes: `build-column(employees, "total-wage", compute-wages)`
* We want you to use: `build-column(employees, "total-wage", lam(r): compute-wages(r) end)`

:::success
**Why are we doing it this way?**

A lambda is just a function without a name (called an "anonymous function"). It allows us to express a computation that we want to happen on some input, for example, each `Row` of a `Table`. For now, **we'll only use them in one specific context: as the input for `filter-with` and `build-column`.**

Since `build-column` and `filter-with` require us to provide a computation that gets applied to each `Row`, think of breaking up a lambda expression such as `lam(r): compute-wages(r) end` into two pieces:

1. An expression on any `Row` of the table, assuming the `Row` is named `r`, e.g. `compute-wages(r)`. In this case, we are calling `compute-wages` on `r`, which we know we can do because `compute-wages` was written to take in a `Row` with the specific columns of the `employees` `Table`.
2. The piece with `lam(r)` and `end`, which is just the syntax we use for defining an anonymous function (which `filter-with` and `build-column` will call on each `Row` of the `Table`).


By using the lambda syntax, we make it explicit that **we are using an expression that computes on each row `r` of the `Table`**. This will come in handy when we tackle more complicated programming tasks, and it can also save you from having to write a function every time you have to process a table, as we'll see below.
:::

## Simplifying the textbook example without writing a helper function

In our preferred method of calling `build-column` with a lambda expression, **the lambda expression can be any expression on a `Row` of the table, not just a helper function call!** Let's see how we can use that idea to rewrite the example in 4.1.4.3 without a helper function.

### Trying out expressions on `Row`s

When we were writing functions, we started by writing example expressions on different, *specific* pieces of data and highlighting the similarities and differences in order to create a function that can take in custom inputs (like the [three stripe flag example](https://dcic-world.org/2024-09-03/From_Repeated_Expressions_to_Functions.html)). We can do the same thing with expressions on `Row`s of a `Table`.

Let's do this with the total wage example from 4.1.4, by computing the total wage for some example rows. First, let's grab some example rows from the table using `row-n`:

    harley-row = employees.row-n(0)
    obi-row = employees.row-n(1)

How do we compute Harley's total wage and Obi's total wage using the information stored in these rows? We know that we can use the notation `row-name[column-name]` to extract the value of a specific column within a row ([Section 4.1.2](https://dcic-world.org/2024-09-03/intro-tabular-data.html#%28part._.Extracting_.Rows_and_.Cell_.Values%29)).
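
To try `row-n` and bracket notation yourself, here is a minimal sketch. It assumes a small stand-in `employees` table: the wage and hour values are made up for illustration, and only the column names `"hourly-wage"` and `"hours-worked"` come from this handout.

```
# Stand-in table; the numbers are hypothetical sample data.
employees = table: name, hourly-wage, hours-worked
  row: "Harley", 15, 40
  row: "Obi", 20, 45
end

# Rows are numbered starting from 0.
harley-row = employees.row-n(0)
obi-row = employees.row-n(1)

check "extracting cell values from example rows":
  harley-row["name"] is "Harley"
  harley-row["hourly-wage"] is 15
  obi-row["hours-worked"] is 45
end
```

Each bracket expression produces a plain value (here a `String` or a `Number`), so it can be used anywhere such a value is expected.
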
If we wanted to compute Harley's total wage, we would multiply the value of the "hourly-wage" cell by the value of the "hours-worked" cell:

    harley-row["hourly-wage"] * harley-row["hours-worked"]

Same idea for Obi:

    obi-row["hourly-wage"] * obi-row["hours-worked"]

We can run these expressions in Pyret to check our work.

### From specific expressions to lambdas

Looking at our examples above, we see that if we have any `Row` in the `employees` table (let's call it `r`), the total wage will be:

    r["hourly-wage"] * r["hours-worked"]

If we want to build a column with the total wage for every employee, this is exactly the computation we need! `build-column` applies the computation in its third input (the lambda expression) to every row of the input table, in order to compute the value that goes into the new column of that row. The syntax `lam(r)` makes that possible by telling Pyret/`build-column` that each row can be called `r`:

    build-column(employees, "total-wage", lam(r): r["hourly-wage"] * r["hours-worked"] end)

This means that we do not need a separate `compute-wages` function and can accomplish our task in just one line.

### Helper functions with inputs other than rows

Let's think outside of the context of `Table`s for a minute and say we had a function to compute a total wage from two `Number`s:

```
fun get-total-wage(hour-wage :: Number, hours-worked :: Number) -> Number:
  hour-wage * hours-worked
where:
  get-total-wage(100, 0) is 0
  get-total-wage(15.50, 8) is 124
end
```

:::spoiler **Why might we have such a function?**
The example above is admittedly extremely simple, but we often have functions that were written outside of the context of tabular data yet work on data we see in a `Table`. These functions might be used in other computations, or might be easier to test than functions on `Table`s/`Row`s, since they only require us to provide example values such as `String`s and `Number`s.
:::

<br>

How might we use `get-total-wage` to rewrite our `build-column`?
**We need to pay attention to how we "stitch together" the data in a row (because `build-column` works on `Row`s) with the data the function takes in.**

Again, let's start with an example for Harley's `Row`. To get their total wage using the helper function, we need to know their hourly wage and their hours worked. We can get both using bracket notation:

```
get-total-wage(harley-row["hourly-wage"], harley-row["hours-worked"])
```

Generalizing to any row `r` in the table, our helper function call looks like:

```
get-total-wage(r["hourly-wage"], r["hours-worked"])
```

So, our `build-column` can be written a *third* way:

```
build-column(employees, "total-wage", lam(r): get-total-wage(r["hourly-wage"], r["hours-worked"]) end)
```

:::warning
**When should we use a helper function vs. putting the whole computation in a lambda expression?**

The following three expressions produce equivalent tables, so any of them would be correct:

* `build-column(employees, "total-wage", lam(r): compute-wages(r) end)`
* `build-column(employees, "total-wage", lam(r): r["hourly-wage"] * r["hours-worked"] end)`
* `build-column(employees, "total-wage", lam(r): get-total-wage(r["hourly-wage"], r["hours-worked"]) end)`

However, one form may be preferable to the others in the following cases:

**If the expression is a short (1 line) expression over a row,** it is probably simpler and more readable to write the computation directly in the lambda expression rather than to use a helper function. Some examples include short arithmetic expressions such as `(r["col1"] + r["col2"]) / 2`, String expressions such as `string-length(r["col3"])`, or Boolean expressions such as `r["name"] == "Blueno"`.

**If you want to test the computation separately,** use a helper function (either on `Row`s or other data). You cannot write tests for lambda expressions, but you could write tests for a helper function such as `compute-wages` to verify that it is working correctly.

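
As a sketch of what testing a helper separately can look like, the block below attaches a `where:` block to a version of `compute-wages`. It assumes `build-column` is available in your Pyret setup (as in class). The table values are made-up sample data, and this `compute-wages` is written for this handout's column names, not copied from the textbook.

```
# Stand-in table; the numbers are hypothetical sample data.
employees = table: name, hourly-wage, hours-worked
  row: "Harley", 15, 40
  row: "Obi", 20, 45
end

# Helper on a Row: the tests live with the function, not with the lambda.
fun compute-wages(r :: Row) -> Number:
  r["hourly-wage"] * r["hours-worked"]
where:
  compute-wages(employees.row-n(0)) is 600
  compute-wages(employees.row-n(1)) is 900
end

# The lambda only hands each row to the tested helper.
with-wages = build-column(employees, "total-wage", lam(r): compute-wages(r) end)

check "the new column holds the helper's result":
  with-wages.row-n(0)["total-wage"] is 600
end
```
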

**If the computation is more complex than a 1-line expression,** use a helper function (either on `Row`s or other data). Not only will your code be more readable, but more complex expressions should also be tested. An example would be a computation that requires an `if`-expression (for instance, if an employee's salary were taxed at different rates in the wages example).

**If the computation gets used in a context outside of `Table`s,** use a helper function that takes in data other than `Row`s (such as our `get-total-wage` function).
:::

## Using lambda expressions when working with multiple tables

In this section, we are using the problem statement and plan from Part 1 of [this document](https://hackmd.io/@cs111/planning). The plan was:

1. Find the row in the student table that matches the given student
2. Extract the semester from the row
3. Find the row in the advisor table that matches the extracted semester
4. Extract the advisor name from the row

How do we translate this to working with our `Table` operations? First, we define our function:

`fun get-advisor(student-name :: String, student-table :: Table, advisor-table :: Table) -> String`

We will be writing the body of this function, so we have the following names available to us from the inputs: `student-name`, `student-table`, and `advisor-table`.

#### 1. Find the row in the student table that matches the given student

To find a row in a table, we filter the table (using `filter-with`) with a lambda expression that will result in a table that *only contains the row we need.* If we assume that student names are unique in the table, the criterion simply needs to be that the "name" column of the row matches the given student name.
Once we have a `Table` with one `Row`, we can use `row-n(0)` to get that row (since the row in question is guaranteed to be the first row):

`student-row = filter-with(student-table, lam(r): r["name"] == student-name end).row-n(0)`

Note that we can use function inputs (here, `student-name`) in lambda expressions, as long as the lambda expression appears in the body of the function.

#### 2. Extract the semester from the row

We simply use bracket notation:

`student-semester = student-row["semester"]`

Note that we could have combined the expressions from steps 1 and 2 into one expression, e.g. `student-semester = filter-with(student-table, lam(r): r["name"] == student-name end).row-n(0)["semester"]`. Regardless of how we code the steps, it is still worthwhile to break the problem into two steps, to remember that we are first finding a row and then extracting a value from the row.

#### 3. Find the row in the advisor table that matches the extracted semester

Now we have a new named piece of data to work with, `student-semester`. We can use the same searching idea from Step 1, but this time we are searching within `advisor-table` for a row whose "sem" column matches `student-semester`:

`advisor-row = filter-with(advisor-table, lam(r): r["sem"] == student-semester end).row-n(0)`

#### 4. Extract the advisor name from the row

Again, we use bracket notation: `advisor-row["name"]`.

The final function looks like:

```
fun get-advisor(student-name :: String, student-table :: Table, advisor-table :: Table) -> String:
  student-row = filter-with(student-table, lam(r): r["name"] == student-name end).row-n(0)
  student-semester = student-row["semester"]
  advisor-row = filter-with(advisor-table, lam(r): r["sem"] == student-semester end).row-n(0)
  advisor-row["name"]
end
```

When juggling multiple tables, writing down a plan helps us keep track of which information we need from which table. Then, it is a matter of using operations we've already learned for single tables to code up the steps.
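
To see the whole plan run end to end, here is a minimal sketch with two small stand-in tables. The student names, semester codes, and advisor names are all made up for illustration. Only the column names (`"name"` and `"semester"` in the student table, `"sem"` and `"name"` in the advisor table) come from the steps above, and `filter-with` is assumed to be available in your Pyret setup.

```
# Hypothetical sample data, invented for this example.
students = table: name, semester
  row: "Ana", "S23"
  row: "Ben", "F22"
end

advisors = table: sem, name
  row: "F22", "Prof. Lee"
  row: "S23", "Prof. Ortiz"
end

fun get-advisor(student-name :: String, student-table :: Table, advisor-table :: Table) -> String:
  # Steps 1 and 2: find the student's row, then extract their semester
  student-row = filter-with(student-table, lam(r): r["name"] == student-name end).row-n(0)
  student-semester = student-row["semester"]
  # Steps 3 and 4: find the advisor row for that semester, then extract the name
  advisor-row = filter-with(advisor-table, lam(r): r["sem"] == student-semester end).row-n(0)
  advisor-row["name"]
where:
  get-advisor("Ana", students, advisors) is "Prof. Ortiz"
  get-advisor("Ben", students, advisors) is "Prof. Lee"
end
```

This sketch relies on every student name and semester having a matching row. If a filter finds no match, the filtered table is empty and `row-n(0)` raises an error, so that case would need to be handled separately.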