Section 4.1.4 of your textbook gives some examples of using filter-with and build-column in a format slightly different from what we are going to use in class.
For 4.1.4.1 Finding Rows:
filter-with(shuttle, below-1K)
filter-with(shuttle, lam(r): below-1K(r) end)
For 4.1.4.3 Adding New Columns:
build-column(employees, "total-wage", compute-wages)
build-column(employees, "total-wage", lam(r): compute-wages(r) end)
Why are we doing it this way? A lambda is just a function without a name (called an "anonymous function"). It allows us to express a computation that we want to happen on some input, for example, each Row of a Table. For now, we'll only use lambdas in one specific context: as the input for filter-with and build-column. Since build-column and filter-with require us to provide a computation that gets applied to each Row, think of breaking up a lambda expression such as lam(r): compute-wages(r) end into two pieces:
An expression on any Row of the table, assuming the Row is named r, e.g. compute-wages(r). In this case, we are calling compute-wages on r, which we know we can do because compute-wages was written to take in a Row with the specific columns of the employees Table.
The piece with lam(r) and end, which is just syntax we use for defining an anonymous function (which filter-with and build-column will call on each Row of the Table).
By using the lambda syntax, we make it explicit that we are using an expression that computes on each row r of the Table. This will come in handy when we tackle more complicated programming tasks, but can also save you from having to write a function every time you have to process a table, as we'll see below.
In our preferred method of calling build-column with a lambda expression, the lambda expression can be any expression on a Row of the table, not just a helper function call! Let's see how we can use that idea to rewrite the example in 4.1.4.3 without a helper function.
Rows
When we were writing functions, we started by writing example expressions on different, specific pieces of data and highlighting the similarities and differences in order to create a function that can take in custom inputs (like the three stripe flag example). We can do the same thing with expressions on Rows of a Table. Let's do this with the total wage example from 4.1.4, by computing the total wage for some example rows. First, let's grab some example rows from the table using row-n:
harley-row = employees.row-n(0)
obi-row = employees.row-n(1)
How do we compute Harley's total wage and Obi's total wage using the information stored in these rows? We know that we can use the notation row-name[column-name] to extract the value of a specific column within a row (Section 4.1.2). If we wanted to compute Harley's total wage, we would multiply the value of the "hourly-wage" cell by the value of the "hours-worked" cell:
harley-row["hourly-wage"] * harley-row["hours-worked"]
Same idea for Obi:
obi-row["hourly-wage"] * obi-row["hours-worked"]
We can run these expressions in Pyret to check our work.
Looking at our examples above, we see that if we have any Row in the employees table (let's call it r), the total wage will be:
r["hourly-wage"] * r["hours-worked"]
If we want to build a column with the total wage for every employee, this is exactly the computation we need! build-column applies the computation of its third input (the lambda expression) to every row of the input table, in order to compute the value that goes into the new column of that row. The syntax lam(r) helps that happen, by telling Pyret/build-column that each row can be called r:
build-column(employees, "total-wage", lam(r): r["hourly-wage"] * r["hours-worked"] end)
This means that we do not need a separate compute-wages function and can accomplish our task in just one line.
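The path from example rows to a lambda can be sketched end to end as follows. This is a minimal sketch, not the textbook's code: the employees table here is a hypothetical two-row stand-in with made-up numbers, and the block assumes the course's table library (the one that provides build-column) has already been included as we do in class.

```pyret
# Assumes the course's table library (which provides build-column)
# has already been included, as in class.
# Hypothetical data: these names and numbers are made up for illustration.
employees = table: name, hourly-wage, hours-worked
  row: "Harley", 15, 40
  row: "Obi", 20, 45
end

# Grab two specific rows to write examples on.
harley-row = employees.row-n(0)
obi-row = employees.row-n(1)

# The lambda generalizes the two example expressions to any row r.
with-wages = build-column(employees, "total-wage",
  lam(r): r["hourly-wage"] * r["hours-worked"] end)

check "the lambda agrees with the hand-written examples":
  harley-row["hourly-wage"] * harley-row["hours-worked"] is 600
  obi-row["hourly-wage"] * obi-row["hours-worked"] is 900
  with-wages.row-n(0)["total-wage"] is 600
  with-wages.row-n(1)["total-wage"] is 900
end
```

Writing the per-row examples first gives us something to check the new column against, even though the lambda itself cannot carry its own tests.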
Let's think outside of the context of Tables for a minute and say we had a function to compute a total wage from two Numbers:
fun get-total-wage(hour-wage :: Number, hours-worked :: Number) -> Number:
  hour-wage * hours-worked
where:
  get-total-wage(100, 0) is 0
  get-total-wage(15.50, 8) is 124
end
The example above is admittedly extremely simple, but often we have functions that are written outside of the context of tabular data but work on data we see in a Table. These functions might be used in other computations, or might be easier to test than functions on Tables/Rows, since they just require us to provide example values such as Strings and Numbers.
How might we use get-total-wage to rewrite our build-column? We need to pay attention to how we "stitch together" the data in a row (because build-column works on Rows) with the data the function takes in. Again, let's start with an example for Harley's Row. To get their total wage using the helper function, we need to know their hourly wage and their hours worked. But we can get that using the bracket notation:
get-total-wage(harley-row["hourly-wage"], harley-row["hours-worked"])
Generalizing to any row r in the table, our helper function call looks like:
get-total-wage(r["hourly-wage"], r["hours-worked"])
So, our build-column can be written a third way:
build-column(employees, "total-wage", lam(r): get-total-wage(r["hourly-wage"], r["hours-worked"]) end)
When should we use a helper function vs. putting the whole computation in a lambda expression?
The expressions
build-column(employees, "total-wage", lam(r): compute-wages(r) end)
and
build-column(employees, "total-wage", lam(r): r["hourly-wage"] * r["hours-worked"] end)
and
build-column(employees, "total-wage", lam(r): get-total-wage(r["hourly-wage"], r["hours-worked"]) end)
will produce equivalent tables, so any of the three would be correct. However, one form may be preferable to the others in the following cases:
If the expression is a short (1 line) expression over a row, it is probably simpler/more readable to directly write the computation in the lambda expression rather than to use a helper function. Some examples include short arithmetic expressions such as (r["col1"] + r["col2"]) / 2, String expressions such as string-length(r["col3"]), Boolean expressions such as r["name"] == "Blueno", and very simple if-expressions such as if r["discount"] == "student": r["price"] * 0.9 else: r["price"] end.
If you want to test the computation separately, use a helper function (either on Rows or other data). You cannot write tests for lambda expressions, but you could write tests for a helper function such as compute-wages to verify that it is working correctly.
If the computation is more complex than a 1-line expression, use a helper function (either on Rows or other data). Not only will your code be more readable, but more complex expressions should also be tested. An example would be if your computation requires an if-expression with else ifs (such as if, in the wages example, we needed to determine whether an employee's salary is taxed at different rates).
If the computation gets used in a context outside of Tables, use a helper function that takes in data other than Rows (such as our get-total-wage function).
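To make the "more complex than one line" case concrete, here is a small sketch of a tested helper used inside a lambda. Everything specific in it is hypothetical: the function name wage-after-tax, the tax brackets and rates, and the two-row employees table are all made up for illustration. It also assumes the course's table library (which provides build-column) has been included as in class.

```pyret
# Assumes the course's table library (which provides build-column)
# has already been included, as in class.
# Hypothetical rule, made up for illustration: wages under 500 are
# untaxed, wages under 1000 are taxed at 10%, all others at 20%.
fun wage-after-tax(wage :: Number) -> Number:
  if wage < 500:
    wage
  else if wage < 1000:
    wage * 0.9
  else:
    wage * 0.8
  end
where:
  wage-after-tax(400) is 400
  wage-after-tax(600) is 540
  wage-after-tax(1000) is 800
end

# Hypothetical data for illustration.
employees = table: name, hourly-wage, hours-worked
  row: "Harley", 15, 40
  row: "Obi", 20, 50
end

# The lambda stitches row data to the helper's Number input.
take-home = build-column(employees, "take-home",
  lam(r): wage-after-tax(r["hourly-wage"] * r["hours-worked"]) end)
```

The branching logic lives in a helper that takes a plain Number, so its where: block can test each bracket without building any rows, while the lambda stays a single short line.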
In this section, we are using the problem statement and plan from Part 1 of this document. The plan was:
How do we translate this to working with our Table operations? First, we define our function:
fun get-advisor(student-name :: String, student-table :: Table, advisor-table :: Table) -> String
We will be writing the body of this function, so we have the following names available to us from the inputs: student-name, student-table, advisor-table.
To find a row in a table, we filter the table (using filter-with) with a lambda expression that will result in a table that only contains the row we need. If we assume that student names are unique in the table, the criterion simply needs to be that the "name" column of the row matches the given student name. Once we have a Table with one Row, we can use row-n(0) to get that row (since the row in question is guaranteed to be the first row):
student-row = filter-with(student-table, lam(r): r["name"] == student-name end).row-n(0)
Note that we can use function inputs in lambda expressions, as long as the whole line of code appears in the body of the function.
We simply use bracket notation:
student-semester = student-row["semester"]
Note that we could have combined the expressions from steps 1 and 2 into one expression, e.g. student-semester = filter-with(student-table, lam(r): r["name"] == student-name end).row-n(0)["semester"]. Regardless of how we code the steps, it is still worthwhile to break the problem into two steps to remember that we are first finding a row, and then extracting a value from the row.
Now we have a new named piece of data to work with, student-semester. We can use the same searching idea from Step 1, but recognize now that we are searching within advisor-table for the "sem" column of a row to match student-semester:
advisor-row = filter-with(advisor-table, lam(r): r["sem"] == student-semester end).row-n(0)
Again, we use bracket notation: advisor-row["name"]. The final function looks like:
fun get-advisor(student-name :: String, student-table :: Table, advisor-table :: Table) -> String:
  student-row = filter-with(student-table, lam(r): r["name"] == student-name end).row-n(0)
  student-semester = student-row["semester"]
  advisor-row = filter-with(advisor-table, lam(r): r["sem"] == student-semester end).row-n(0)
  advisor-row["name"]
end
When juggling multiple tables, writing down a plan helps us keep track of which information we need from which table. Then, it is a matter of using operations we've already learned for single tables to code up the steps.
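One way to try out get-advisor is to run it against two tiny tables. The sketch below is only illustrative: the students and advisors tables, their names, and the use of numbers for semesters are all hypothetical, and it assumes the course's table library (which provides filter-with) has been included as in class. The function body is repeated from above so the block stands on its own.

```pyret
# Assumes the course's table library (which provides filter-with)
# has already been included, as in class.
# Hypothetical tables: names and semester values are made up.
students = table: name, semester
  row: "Ana", 1
  row: "Ben", 3
end

advisors = table: name, sem
  row: "Prof. Lee", 1
  row: "Prof. Kim", 3
end

fun get-advisor(student-name :: String, student-table :: Table, advisor-table :: Table) -> String:
  # Steps 1 and 2: find the student's row, then extract the semester.
  student-row = filter-with(student-table, lam(r): r["name"] == student-name end).row-n(0)
  student-semester = student-row["semester"]
  # Steps 3 and 4: find the matching advisor row, then extract the name.
  advisor-row = filter-with(advisor-table, lam(r): r["sem"] == student-semester end).row-n(0)
  advisor-row["name"]
where:
  get-advisor("Ana", students, advisors) is "Prof. Lee"
  get-advisor("Ben", students, advisors) is "Prof. Kim"
end
```

Note that this relies on the same assumption as the walkthrough: every student name and every semester appears in its table. If a filter matches nothing, the filtered table has no rows and row-n(0) should fail with an error rather than return a row.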