Due: Wednesday, February 19th, 11:59 PM
hw4-code.arr
where you will write your Homework 4 solution.hw4-src.pdf
. This will hold your written work. Do not include your name in your SRC submission!hw4-code.arr
Copy and paste the code in the following dropdown into hw4-code.arr
. Remember to change your context to dcic2024
, just like in HW3!
include gdrive-sheets
include data-source
ALL-DATA = load-spreadsheet("1wWEI3nx64xkbpV9Ey06Ho_yvNQcWi8KpyQAtGlQmcfo")
CURRENCY-CONVERSION = load-table: from-c :: String, to-c :: String, conv-rate :: Number
source: ALL-DATA.sheet-by-name("currency", true)
end
# Uncomment for Part 2
#|
ART = load-table: id :: Number, cost :: Number, currency :: String, kind :: String
source: ALL-DATA.sheet-by-name("art", true)
end
|#
Table
documentation)Remember to use the CS0111 Table
documentation from above (also linked in Setup and Handin), NOT the official documentation. If your code uses constructs named sieve or extend (without functions), for example, you are using the wrong version and will not get credit.
This assignment has you practice a multi-stage data-analysis task. We're going to write functions that draw on data from two tables to perform computations. We'll also do multiple versions of a function, where each version adds more features, so you can see an example of building up solutions a bit at a time.
This assignment draws on all of the table lectures so far. For the planning tasks, we saw an example on Feb 12, and you're also practicing it in Lab 3. While we have covered everything that you need as of Feb 12, the examples we do in class on Feb 14 will give more practice with the concepts in this assignment.
Important: as you start working on more complicated programming problems, we want you to practice planning out your tasks before you start coding. Taking the time to plan your approach to the programming tasks can potentially save hours of debugging time. We have given you some guidance on planning here.
As an international organization, Highland Cow LCC (limited cow corporation) has to deal with the complexities of currency exchange rates. In order to do this, the leaders at Highland Cow LCC have to manipulate tables containing complicated data. This homework will get you ready to join Highland Cow LCC by using your table skills to uncover important information. Once you've practiced your skills, one of your new co-workers, Karis, also needs some help evaluating art sales and keeping track of some names.
Learning Goals:
Companies that do business internationally need to be able to quote prices in different currencies, following a table of exchange rates. This is the kind of situation where one would have two separate tables: one with the prices of items, and another with the exchange rates per country. For this problem, we'll be computing currency conversions for an online art store in which artists and clients may be in different countries.
The store has two tables, that are both in your starter code. One is for artwork, called ART
: it tracks the unique id, cost and base currency for each piece, as well as what kind of artwork it is. It has some errors in the data, which we will clean up in the next part. It is commented out – ignore it for now.
table: id :: Number, cost :: Number, currency :: String, kind :: String
The other table is for currency conversion, called CURRENCY-CONVERSION
. Each row holds the conversion rate (the multiplicative factor) to convert from the first currency (from-c
) to the second (to-c
).
table: from-c :: String, to-c :: String, conv-rate :: Number
For example, if a row of the Table
looked like:
["EUR" "TND" 3.32]
we would convert from Euros (EUR) to Tunisian Dinars (TND) by multiplying the price in Euros by 3.32.
We will use these tables to develop several programs for selling art across currencies.
Tables
in this assignment look like?Unlike in Homework 3, these tables are not typed directly into Pyret, but rather pulled in from a Google Sheet. To see what these Tables
look like, feel free to type the name of the Table
in the definitions window and hit "Run", or hit "Run" first and then type the name in the interactions window and hit return.
For these tasks, you can use our CURRENCY-CONVERSION
Table
when writing tests.
Task 1: Write a function called exchange-price-1
, which takes in as input a Table
that has the same columns as CURRENCY-CONVERSION
, a starting currency (a String
), a desired currency (a String
), and a price (a Number
), and returns a Number
of the price converted from the starting currency to the desired currency. For this exercise, you can assume that a row with the starting currency in the "from-c" column and the desired currency in the "to-c" column exists exactly once in the table. For now, do not worry about handling the case of converting a currency to itself. Remember to make a plan for this task before starting to code it up!
For this and every homework problem that has text like "you can assume that a row appears exactly once in the table," your test cases should obey this assumption. That is, for this problem, do not include tests like exchange-price-1(CURRENCY-CONVERSION, "JPY", "EUR")
, since this test violates the assumption (there are no rows with "JPY" in the "from-c" column and "EUR" in the "to-c" column in this Table
).
Remember that you can use mathematical expressions in tests, such as 0.25 is 1/4
.
Pyret also has an is-roughly, but it will be much easier to use the mathematical expression instead.
Task 2: Sometimes, we cannot make the assumption that we make in Task 1. For this task, assume that input currency conversion table may not list the conversion from A to B, but it may list the conversion from B to A. Write a function called exchange-price-2
, which has the same inputs and output as exchange-price-1
, but uses the direct conversion (as in Task 1) or the inverse conversion if necessary. (You compute the inverse of a number N using 1 / N, where / is division). Remember to make a plan for this task before starting to code it up!
Details:
Table
does not contain the relevant row. (Hint: what would the conversion rate be in this case?)Table
. If there are multiple matching rows, you can assume that they are consistent with each other (that is, if you have a Row
with A in to-c and B in from-c, and a row with B in to-c and A in from-c, no matter if you choose to do the direct conversion or the inverse conversion, you will get approximately the same answer.)Task 3: Create a function called plot-exchange-rates
which takes in a Table
that has the same columns as CURRENCY-CONVERSION
and a starting currency (a String
) and produces a bar chart of conversion rates from the given currency to all the other available currencies, that is, currencies for which there exists either a direct or an inverse conversion (or both) from the given currency. Remember to make a plan for this task before starting to code it up!
Important: Please put the line import bar-chart from shared-gdrive("dcic-2021", "1wyQZj_L0qqV9Ekgr9au6RX2iqt2Ga8Ep")
at the top of your file in order to use bar-chart
. bar-chart
was accidentally left out of the dcic2024
context, and we've contacted the creators of the context to get this fixed, but we have to use this quick fix for now.
This is one of the more complex tasks in this assignment - you might think about how you will do it, and then move on to other tasks in the assignment and come back to it. If you get stuck, try one of the following strategies:
bar-chart
function comes from only one column. How can you make sure that the bar chart includes both of these cases?In class, we discussed that a Table
that we're given sometimes does not contain a column with the exact information we need. In that case, we need to do some computations to get the Table
into a usable state. This could include isolating/keeping desired values from the original dataset. Consider what sort of column(s) would be helpful to add to this table?
If you are finding that the computation for adding on a column is too complex, work on creating a helper function that can take in a row or the relevant information from a row and perform the computation for a single row's worth of data. Test it thoroughly by pulling some example Row
s from the Table to gain confidence that it's working!
This section deals with the ART
Table
. Most of the section is exploratory (you are meant to write short pieces of code and write down what you notice, but not turn in code), with one coding task that you turn in.
In the Feb 14 lecture, we learned that there are multiple approaches to cleaning data: using sanitizers, manually cleaning up the source spreadsheet, filtering out rows that we recognize as malformed, and writing functions to clean up columns. One of your jobs in these tasks will be to determine which of these approaches makes the most sense for different kinds of messy data.
Task 4: Uncomment the block below the line that says # Uncomment for Part 2
by removing the surrounding #|
and |#
, and try to run the code. You will encounter one error. Using what you learned in class, change the load-table
block of code so that this error no longer happens (Note: column A corresponds to the "id" column). Even though the error text will hint towards editing the Google sheet itself, you can and should resolve this in code. In a one-sentence comment below this change, in your own words, explain what the error was and why it happened.
Now, examine the ART
Table
in the Pyret window. For the data to be considered "clean," we would want:
The next few tasks ask you to reflect on what it would mean to clean the existing data to conform to these expectations.
As you complete tasks 5-7 remember that, aside from scrolling through the columns to examine data, you can also graph the data or even sort it.
Task 5: One of the values in the "id" column appears twice, violating one of the requirements above. In a short comment that starts with "Part 2, Task 5:", name this value, and explain how to find it efficiently ("efficiently" here means faster than reading through all of the values in the column).
Task 6: In a short comment that starts with "Part 2, Task 6:", explain how the values in the "cost" column are not clean, using at least 2 specific examples from the data. Which of the approaches listed in the blue box at the top of this part would be your first choice for cleaning up the data, and why? Which would be your second choice, and what is an upside and a downside of this second choice?
Task 7: In a short comment that starts with "Part 2, Task 7:", explain how the values in the "currency" column are not clean, using 2-4 specific examples from the data. Which of the approaches listed in the blue box at the top of this part would be your first choice for cleaning up the data, and why? Which would be your second choice, and what is an upside and a downside of this second choice?
Task 8: Examine the "kind" column. We consider "visual" art to be art that has "paint" or "photo" in the "kind" description, and "sculpture" to be art that has "marble", "ceramic," or "wood" in the "kind" description (all case-insensitive). Everything else would be considered "other." Write a function called clean-kind
that takes in a String
(which you can assume will look like any of the values in the pre-cleaning "kind" column) and produces either "visual", "sculpture", or "other", based on the criteria named in this task. To test this task, use specific values from the given data.
Note: this task does not ask you to use transform-column
to clean the ART
Table
, but you can (and should) certainly try out what it looks like to use clean-kind
as an input to transform-column
for this purpose.
Learning Goals:
Task 9: Since the given ART
Table
has data errors and is pretty large (and therefore unwieldy to use in test cases), come up with one small example Table
with the same columns as ART
. Make sure that the Table
you create obey the criteria for what it means for the data to be clean, as given in Part 2. For the currency column, you can limit yourself to the following currencies (so that they match the CURRENCY-CONVERSION
Table
): CAD, CHF, CNY, EUR, JPY, MXN, USD, ZAR. You may end up revising this table as you work through and test the following tasks – that's okay, but make the table now as a starting point!
For the next two tasks, you can use the provided CURRENCY-CONVERSION
as your test input for the relevant table, and you should use the test table you defined in Task 9.
Task 10: Create a function called get-art-price
that takes in an art ID (Number
), a currency (String
), a Table
that has the same columns as CURRENCY-CONVERSION
and a Table
that has the same columns as ART
. This function will return a Number
for the price of the artwork with the given id, converted to the input currency. You can assume that exactly one row with the given id exist in the ART
-like input table. This means that you do not have to test for an art ID that does not exist in the table. However, if a direct or inverse conversion doesn't exist (as described in Task 2), this function should raise("unable to convert price")
. Remember to make a plan for this task before starting to code it up!
Task 11: You want to find the priciest art in the Table
, but since the artworks are in different currencies, you will need to convert all of the artwork to the same currency to do this analysis. Create a function called priciest-art
that takes in a target currency (String
), a Table
that has the same columns as CURRENCY-CONVERSION
, and a Table
that has the same columns as ART
, and returns the artwork ID (Number
) that costs the most once all artworks have been converted to the target currency. You can assume that all conversions to the given currency exist (either as direct or inverse conversions) in the input currency conversion table. Remember to make a plan for this task before starting to code it up!
Note: this means that you will have to examine CURRENCY-CONVERSION
to find such currencies when you are testing your function in order to obey this assumption.
Learning Goals:
Task 12: Highland Cow LCC would like to create a table that will store names. Users of the table will add rows to the table with data. Create a table to store information about names and store 3-5 names within the table. It’s up to you to decide what sorts of column(s) you’ll use for this table.
Task 13: Read this article. In your code file, copy-paste the following function:
fun last-name(name :: String) -> String:
doc: "Returns the last name from the given name"
space-index = string-index-of(name, " ")
string-substring(name, space-index + 1, string-length(name))
where:
last-name("Angus McMoo") is "McMoo"
end
based on the article, add 4 different examples in the where-block that would fail on this function, and include a brief comment about why each one fails. You are not modifying the function, so the tests you write should all show up as failing in the Pyret window. If the function should produce an error for some name, you can indicate that with a raises. For example,last-name(“Cher”) raises “No last name!” # Cher does not have a last name
Now, in your hw4-src.pdf
file,
Task 14: While the design of a dataset embodies certain assumptions, no dataset can be perfectly representative due to limitations with data collection and data storage. Consider the following groups who are interested in using a name dataset.
In hw4-src.pdf
, answer the following questions for the groups:
Hit "Run", then paste this check block into the interactions window. All the tests should pass: if they don't, this means you have a typo in a function header and the autograder won't run on your code!
check "functions exist and have correct inputs":
CLEAN-ART = table: id, cost, currency, kind
row: 1, 1, "USD", "other"
end
is-number(exchange-price-1(CURRENCY-CONVERSION, "USD", "EUR", 1)) is true
is-number(exchange-price-2(CURRENCY-CONVERSION, "USD", "EUR", 1)) is true
is-image(plot-exchange-rates(CURRENCY-CONVERSION, "USD")) is true
is-string(clean-kind("")) is true
is-number(get-art-price(1, "USD", CURRENCY-CONVERSION, CLEAN-ART)) is true
is-number(priciest-art("USD", CURRENCY-CONVERSION, CLEAN-ART)) is true
end
If you see the block below appear in the interactions window after running it, then you are fine! Please submit to Gradescope.
If not, double-check your function names, input types, and input order. If you are stuck at any point, please feel free to come to hours or post on Ed!
hw4-code.arr
.hw4-src.pdf
. If your written file is not a PDF, we might not be able to grade it.hw4-code.arr
and `hw4-src.pdf' on Gradescope. Make sure you do not include your name or any other identification on your submissions!Back at the Barnyard Theme Song by Nicktoons
Brown University CSCI 0111 (Spring 2024)