---
title: Homework 3
tags: Homeworks-F23, 2023
---

# Homework 3: Testing and Tables

**Due:** Tuesday, February 13th at 11:59 PM

### Setup

- Create a file in [Pyret](http://code.pyret.org) called `hw3-code.arr`. You will write your solution in here. You will need to copy and paste the following at the top of the file.

```
provide *
provide-types *

include shared-gdrive("dcic-2021", "1wyQZj_L0qqV9Ekgr9au6RX2iqt2Ga8Ep")
include tables

FACULTY-DIR = table: name :: String, department :: String, office :: String, phone :: String
  row: "Strong", "East Asian Studies", "Kassar 180", "282-280-2481"
  row: "Pugh", "Econ", "Gerard 21", "581-280-2510"
  row: "Humphrey", "Physics", "Metcalf 300", "231-121-1129"
  row: "Coleman", "Physics", "B&H 387", "974-093-5128"
  row: "Kaufman", "Music", " ", "502-120-5781"
  row: "Pratt", "Anthro", "CIT 672", "281-6723-5677"
  row: "Cherry", "CS", "CIT 007", "459-005-2098"
  row: "Sosa", "History", "", "878-784-5612"
  row: "Price", "Music", "Kassar 4", "102-503-6002"
  row: "Hopkins", "English", " ", "742-024-8762"
  row: "Everett", "Physics", "Metcalf 039", "975-202-3333"
  row: "Keller", "French", "Watson 91", "871-458-0235"
  row: "Day", "French", " ", "441-459-8722"
  row: "Stark", "Anthro", "Maxcy 19", "240-120-7852"
  row: "Branch", "CS", "Robinson 230", "583-0971-2011"
  row: "Brandt", "Bio", "Watson 210", "401-412-9065"
  row: "Moody", "Sociology", "Gerard 721", "206-425-3109"
  row: "Chung", "APMA", "Metcalf 001", "102-030-4050"
end
```

- Do **not** put your name anywhere in the file.

### Resources

We recommend briefly browsing relevant pages (particularly the CS111 Table documentation, Pyret's String documentation, and the lambda expression documentation) before working on these problems.
- [Pyret documentation](https://www.pyret.org/docs/latest/)
- [Lambda expressions for tables](https://hackmd.io/@cs111/lambdas-tables)
- [CS111 Table documentation](https://hackmd.io/@cs111/table) (use this **instead** of the official `Table` documentation)
- [Strings documentation](https://www.pyret.org/docs/latest/strings.html)
- [Edstem](https://edstem.org/us/courses/54800)
- [Testing, Design, and Style guide](https://cs.brown.edu/courses/csci0111/fall2021/assets/docs/pyret-clarity-design-testing.html)
- Style tip: ALL lines (including comments) must be **under 100 characters long**

## The Assignment

Your friend Francine Frogger (who is a frog) is a student at Green University, and you are helping her navigate through all of the different departments and professors. In order to find the professors you are looking for, you have to create a way to easily look through the large staff at the university.

<img src="https://hackmd.io/_uploads/SkoBG-DY6.jpg" alt="frog in a graduation cap" width="50%"> <img src="https://lh3.google.com/u/0/d/1gap4fkfE81BHxvUhvFaB5BBqteWMMu5a" alt="two cs 111 TAs in frog costumes" width="40%">

## Learning Objectives

Our goal this week is to practice several essential skills, including breaking down problems into smaller tasks, using documentation, developing comprehensive collections of tests, and writing functions that take a `Row` as input. We are also going to think about the tables of data that consumer products collect about people.

We're building up to writing programs that process tables as part of doing data analysis (on homework 4 and the first project).

## Part 1: Practice Writing Comprehensive Tests

:::info
Learning Goal:
- To practice developing a comprehensive set of tests for a function that you didn't write
:::

Knowing how to check whether a program does what you need it to is a valuable skill, even if you aren't the one writing the program. For example, you might have to determine whether a program written (or sold!)
by someone else matches your needs. Running a set of specific tests against the existing program can help with this task.

For this problem, imagine that you want to hire a programmer to write a function called `is-phone-num` that checks if a given string is a valid phone number. A valid phone number is of the format `XXX-XXX-XXXX`, where each `X` is a numerical digit.

Your job is to write a `check` block for `is-phone-num`. It will look like the following (though you don't need to include our tests):

```
check "Basic tests for is-phone-num":
  is-phone-num("123-456-7890") is true
  is-phone-num("1") is false
end
```

**You are NOT writing the is-phone-num function itself**. The CS111 staff have already done that. In fact, we've written *several* versions, some that return the answers that you want and some that don't (think of these as proposed solutions from multiple programmers). We are going to grade your `check` block by running it against all of our versions and seeing whether your tests are comprehensive enough to tell the ones that work from the ones that don't.

:::spoiler How do tests tell apart working and non-working solutions?
Imagine that someone claims to have written a program `double` that doubles a number. You write a test

```
check:
  double(2) is 4
end
```

This test passes for both of the following functions:

```
fun double(x :: Number) -> Number: x * 2 end # a correct version
fun double(x :: Number) -> Number: x + 2 end # an incorrect version
```

Thus, a `check` block with only this one test isn't very good, because it doesn't have enough tests to flag that the second version doesn't work. If instead, you wrote the following tests:

```
check:
  double(2) is 4
  double(3) is 6
end
```

Now the first function passes both tests while the second function fails one of the tests. This is a more comprehensive test suite because it has enough cases to catch errors in the functions.
Your job on this assignment will be to figure out a set of tests that catch errors in as many of our broken solutions as you can.
:::

<br>

To run your tests against our many versions, you'll work in a custom version of Pyret.

**Task 1:** Click [this link](https://pyret.cs.brown.edu/assignment/1ePOdRVmJyozQdbpsIaXL4Io85lS0b4M_) to access the custom Pyret version. It might take some time to load (sometimes even 1-2 minutes), especially the first time. **DO NOT click "Begin Implementation".**

:::spoiler Pyret/Examplar not loading?
Try removing permissions for Pyret@Brown from your Google account. You can do this by going to Manage your Google Account, searching "apps" in the search bar on top of the page, and removing permissions for Pyret@Brown. Then, revisit the link, log in, and re-allow permissions. If you have any questions, post on Ed or come to hours!
:::

<br>

:::spoiler Did you accidentally click "Begin Implementation"?
Go to drive.google.com, logged in with the account you use for Pyret. Find the folder that is called "pyret.cs.brown.edu" (***not code.pyret.org***), and delete the file called "is-phone-code.arr". Refresh the Examplar page and you should be good to go! ***Do not delete hw3-tests-examples.arr, since that contains your work!!!***
:::

<br>

**Task 2:** Fill in the `check` block that you see in the custom Pyret window. Your block should contain multiple tests for `is-phone-num`.

As you add tests and hit "Run", the report in the interactions window will show you how many unsatisfactory software products you've ruled out. Pyret is running your tests against seven versions of `is-phone-num`, only one of which is correct (each of the other six is broken in some way). Your goal is to knock out as many of the six broken ones as you can.

*Note: You don't have to knock out all of the unsatisfactory products to do well on this. This is our first exercise that focuses on designing good tests. This week, we're trying to build up your experience writing tests.
Aim to knock out at least four, and see if you can get all the way to knocking out all six. You will get full credit for catching four.*

:::spoiler Hints on how to approach this
The key to doing well on this is to think systematically about the conditions of the problem and the possible variations in the strings. It will help to have some examples of correctly formatted phone numbers, but also make sure you have plenty of examples with *incorrectly* formatted phone numbers that should result in `false`. What might an incorrectly formatted phone number be? Think about the assumptions of the format -- for example, the length of the input string, the kind of character that exists at every spot in the string, etc.
:::

&NewLine;

**Task 3:** In your tests file, include a multi-line comment summarizing your strategy and what you learned about writing tests from doing this exercise.

**Task 4:** Download your tests file (you'll upload it with the code file when you are done with the assignment).

:::warning
If your tests download as a .zip file, unzip it to retrieve the Pyret file. Make sure your file is called `hw3-tests-examples.arr`. This will be submitted along with `hw3-code.arr` to Gradescope. If you also see a downloaded file called `hw3-code-ignore.arr`, disregard that file.
:::

**Task 5:** As we begin to work with tabulated data this week, in the form of a faculty directory at Green University, we want to ask ourselves where this data comes from. Our goal in tasks 5 and 6 is to think about how we retrieve data and recognize some of the biases in [data pre-processing](https://www.sciencedirect.com/topics/engineering/data-preprocessing).
Read the following article by Robert Santos from the Urban Institute titled [‘Nonresponse Bias’ Could Explain the Census Bureau’s Finding on the Citizenship Question.](https://www.urban.org/urban-wire/nonresponse-bias-could-explain-census-bureaus-finding-citizenship-question)

In a separate document titled `hw3-src.pdf`, respond to the following prompts:

1. In **1-2 sentences**, answer the questions: Were you familiar with the United States Census Bureau prior to this assignment? What surprised you about this article? (see this [resource](https://www.census.gov/programs-surveys/gov-finances/technical-documentation/methodology/how-the-data-are-collected.html) on how the data is collected for more information).
2. In **2-3 sentences**, explain what non-response bias is. (see this [article](https://www.sciencedirect.com/topics/nursing-and-health-professions/nonresponse-bias) from ScienceDirect for more information).

**Task 6:** Now read this excerpt from Brookings on how census sampling could be dangerous (see the full article [here](https://www.brookings.edu/articles/census-sampling-is-dangerous/)):

> While the 1990 Census reported that 12.05% of adults are African-Americans, the true figure is more like 12.41%. To address undercounting, the Census Bureau has an adjustment mechanism called the Accuracy and Coverage Evaluation. The idea is quite simple—after the first census, the bureau randomly selects particular locations (technically, 11,800 block clusters with 314,000 housing units) and then intensively revisits them. Using this second survey, the Census estimates the number of people who were missed by the first census in various population subgroups (e.g., white males between 19 and 24 living in rural areas). Then the Census uses these undercount estimates to adjust its population estimates. Thus, if 5% of the African-American population is estimated to be missed, the population count for this group will be raised appropriately.
>
> Critics of these techniques argue that they introduce their own errors. This is true, but in ideal circumstances, this procedure will surely improve things. Unfortunately, statistical adjustment also gives much greater discretion to the Census Bureau. The correction procedure is based on population subgroups, and choosing them is very subjective. Do we treat all young urban black males as a subgroup or do we separate them by region? How many ethnic groups do we want to treat as distinct?
>
> This leads to a general point: As you allow for more statistical sophistication, you put more discretion in the hands of the statistician.

In about **4-5 sentences**, answer the following questions: Do you agree with correcting datasets with mechanisms such as the Census Bureau’s [Accuracy and Coverage Evaluation](https://www.census.gov/programs-surveys/decennial-census/technical-documentation/coverage-evaluation/dssd03-dm.html) to limit inaccurate figures of underrepresented groups? What is missing when we misrepresent certain groups? Are data collectors responsible for disclosing the limitations of the data? (There are no right or wrong answers here, just try to address all of these questions!)

Please submit answers to Tasks 5 & 6 in the separate document called `hw3-src.pdf`.

## Part 2: String validation

:::info
Learning Goals:
- To practice writing functions using string functions and boolean operators
- To practice using documentation to find useful built-in functions
:::

**Task 7:** Write a function `in-CIT` that takes a `String` (an office, such as `"CIT 429"` or `"Metcalf 322"`) and returns a `Boolean` that indicates whether the office is in the CIT, that is, whether it starts with the characters `"CIT "` (not including the quotations, and including the space after the T).
:::spoiler **Hints:**
- Many of the tasks in this section are about which characters are in specific positions in a string. Think about that approach to these problems, and look in the documentation for a `String` function that helps with this.
- Remember that the first position in a `String` is numbered 0, not 1.
:::

&NewLine;

::: warning
**Note:** Being able to look in language documentation for useful operations is an important skill, which is why we aren't telling you exactly which `String` operations to use. That said, limit yourself to operations with input and output types that we have used this semester (`Number`, `String`, `Boolean`). Don't use operations that return `List`, as we haven't covered that yet.

::: spoiler **How to Search for Functions in Documentation**
When trying to use documentation, read over the names and types of the available functions: do any sound relevant? For those that do, look at the text description and the examples for more detail. If you aren't sure whether a function will help, try a small example on your own in the interactions window.
:::

<br>

**Task 8:** Write a function called `in-building` that takes a `String` (an office) and another `String` (a building) and returns a `Boolean` that indicates whether the office is in that building, that is, whether it starts with the building name followed by a space.

**Task 9:** Write a function called `get-room-number` that takes in a `String` (an office) and returns a `String` of only the office number. For example, `get-room-number("CIT 429")` should return `"429"` (as a `String`). The room number of an office is the part of the String that follows the first space. If the input string does not have a space, it should `raise("Malformed input: no spaces")`. We've seen an example of `raise` in the `get-grade` function in the Feb 7 lecture (take a look at the starter code).
[Here](https://www.pyret.org/docs/latest/_global_.html#%28part._~3cglobal~3e_raise%29) is the Pyret documentation.

:::spoiler How do we test if a function raises?
You can use `raises` instead of `is`! See the [Pyret documentation](https://www.pyret.org/docs/latest/testing.html#%28part._testing_raises%29)
:::

## Part 3: Lookup operations

**We will cover the material for Part 3 on Feb 9. You can do Parts 1 and 2 before then.**

:::info
Learning Goals:
- To give you practice accessing data from tables
- To help you understand the benefit of bundling small pieces of data into one larger piece of data
:::

Take a look at the `FACULTY-DIR` table in your Pyret file (you can see it in full by typing `FACULTY-DIR` in the interactions window), which represents a directory of university faculty. Notice it has the columns "name", "department", "office", and "phone".

### Part 3A: Combining helper functions and lambdas

:::warning
Remember that we have an [explainer on using lambda expressions and Tables](https://hackmd.io/@cs111/lambdas-tables). Be careful of boolean redundancy!
:::

**Task 10:** In a comment in your Pyret file, write a lambda expression that operates on a row `r` and computes the result of calling `in-CIT` for the "office" cell of the row. In other words, this lambda expression should evaluate to `true` if the office listed in that row is in the CIT, and `false` otherwise.

*Hint: don't overthink this task -- this should be a very short expression!*

:::info
You are not running or testing this expression, you are just writing it down so that you have a reference to use for later tasks. Your answer should be a single line that looks like `lam(r): [your expression here] end`.
:::

**Task 11:** In a comment in your Pyret file, write a lambda expression that operates on a row `r` and returns the result of calling `get-room-number` for the "office" cell of the row.
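
If the idea of a lambda that takes a row still feels abstract, the following minimal sketch may help. It uses a made-up table: `PETS` and `is-frog` are hypothetical names that are not part of the assignment or of any library. The sketch assumes only Pyret's built-in table syntax, the `row-n` table method, and the `r["column"]` cell lookup. Giving the lambda a name lets us call it on individual rows and see that it behaves like any other function.

```
# Hypothetical table for illustration only (unrelated to FACULTY-DIR)
PETS = table: name :: String, species :: String
  row: "Hopper", "frog"
  row: "Biscuit", "dog"
end

# A lambda is a function without a name; binding it to a name lets us call it.
# It takes one row and looks up that row's "species" cell.
is-frog = lam(r): r["species"] == "frog" end

check "a lambda applied to individual rows":
  is-frog(PETS.row-n(0)) is true
  is-frog(PETS.row-n(1)) is false
end
```

Note that the body of the lambda is already a `Boolean` expression, so there is no need to wrap it in an `if` that returns `true` or `false`.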
### Part 3B: Table operations with helper functions

:::danger
*Note: It is **extremely** important that you use the [CS0111 Table documentation](https://hackmd.io/@cs111/table), not the Pyret `Table` documentation. The 111 version includes the documentation for `Row`s, which is necessary for this assignment.*
:::

**Task 12:** Write an expression for computing the table of all of the directory entries of faculty in `FACULTY-DIR` who have offices in the CIT. The result of this expression should be stored with the name `rows-in-CIT` (e.g. you should have something of the form `rows-in-CIT = [some expression to compute the table]` in your code). Make use of the lambda expression you wrote down in Part 3A, Task 10.

**Task 13:** Some faculty in the directory do not have offices listed. In those cases, the "office" cell of that row is a String of 0 or more spaces (e.g. `""` or `" "`). Write an expression for computing a table of all of the directory entries for faculty in `FACULTY-DIR` who do not have an office listed. Your expression should make use of a helper function (one option is for this helper to take in a `String` and return a `String`, but you might have a different approach). Remember to write tests for this helper function. The resulting table should be stored with the name `no-office-fac`.

:::spoiler **Hint**:
The helper function should be used to help determine if a String is only made up of spaces. There are at least two ways to do this using functions in the Pyret String documentation -- by transforming the original `String`, or by generating a new `String` to compare against.
:::

<br>

**Task 14:** Write a computation to determine the name of the faculty member who has the largest listed office number on campus (as determined alphabetically, e.g. using String sorting).
You might have to use multiple lines to compute intermediate data, but the name should be stored with the name `largest-office-fac`. Your computation should make use of the lambda expression you wrote in Task 11. **If the result surprises you, that's the point -- keep reading the handout 🐸**

:::spoiler **Hint**:
Faculty who don't have an office listed by definition cannot have the largest office number. How can you make sure that the error `raise`d by `get-room-number` does not cause an error in this computation?
:::

What office number does that faculty member have? Examine `FACULTY-DIR` by looking through it manually -- does the result surprise you? Are there larger office numbers in the table?

**Task 15:** In a comment right below the expression for Task 14, write 1-2 sentences on why the office number for the faculty member that you determined in Task 14 may not be the *numerically* largest office number in the directory. Your answer should demonstrate knowledge of data types.

**Task 16:** Write a set of expressions to determine whether all the offices for the Physics faculty (faculty with the department "Physics") are located on the 3rd floor. An office on the third floor has a number with three digits, where the first digit is 3 (for example, "CIT 368" is on the third floor, while "Metcalf 30" is not). Assume all Physics faculty have an office listed in the table. The result of the computation should be a `Boolean` stored with the name `all-physics-third-floor`.

:::spoiler **Hint:**
There are multiple ways to approach this task. Write down a to-do list of the intermediate computation(s) you might need. Where does it make sense to apply a table operation? Where does it make sense to make a helper function?
:::

&NewLine;

**Task 17:** Write a function called `get-phone` that takes in a faculty name (as a `String`) and returns that faculty member's phone number as a `String`.
If a faculty member of that name is not in `FACULTY-DIR`, this function should `raise("faculty name not in table")`. You can assume that names are unique in this table, that is, no two faculty have the same name.

## Check Block (Autograder Compatibility)

When you complete all the tasks of this homework, please copy the following `check` block into the bottom of your code file! The `check` block will be checking that all required functions and names are included, correctly named, and that they handle inputs in the right order. This is the bare minimum for the autograder to function properly -- do not edit this!

```
check "functions exist and have correct inputs":
  in-CIT("") is false
  in-building("", "") is false
  is-string(get-room-number("CIT 333")) is true
  is-table(rows-in-CIT) is true
  is-table(no-office-fac) is true
  is-string(largest-office-fac) is true
  is-boolean(all-physics-third-floor) is true
  is-string(get-phone("Strong")) is true
end
```

If you see this block appear in the interactions window after running it, then you are fine! Please submit to [Gradescope](https://www.gradescope.com/courses/718380).

&NewLine;

![](https://i.imgur.com/vp6aMHK.png)

If not, double-check your function names, input types, and input order. If you are still stuck, please feel free to come to hours or post on [Ed](https://edstem.org/us/courses/54800)!

## Handin

- Download both your solutions file and your test file (instructions about creating the test file are in Part 1) and make sure they are called `hw3-code.arr` and `hw3-tests-examples.arr` respectively. Download your written responses as `hw3-src.pdf`. Hand in your work on [Gradescope](https://www.gradescope.com/courses/718380) in the assignment named "Homework 3," and remember to include all three files in your submission!
## Theme Song

[Twenty Froggies Went To School](https://www.youtube.com/watch?v=pzhP9Xq5i30) by Rajeev Lawrence

------

> Brown University CSCI 0111 (Spring 2024)

<iframe src="https://forms.gle/tVELrdxLYisxKvsb6" width="640" height="372" frameborder="0" marginheight="0" marginwidth="0">Loading…</iframe>