Homework 5: Much Ado about Modern Lists!

Due: Tuesday October 25, 2022 at 11:59PM ET.

This assignment has two goals: to give you some practice writing list functions, and to step back and give you a chance to show us what you have learned about writing clean code and developing a good set of examples for a problem. It's something of a mid-semester checkpoint on your program development skills. We'll be paying particular attention to your code organization and testing in grading this assignment.

Setup and Handin

Setup

  • We are not using a stencil/helper code file for this homework! Like with the first assignment, use code.pyret.org.
  • Create a file in Pyret called hw5-code.arr where you will write your Homework 5 solution.
  • Do not put your name anywhere in the file.

Handin

  • Hit "Run", then paste this check block into the interactions window. All the tests should pass: if they don't, this means you have a typo in a function header.
check "function name and input checks":
​ exists = lam(f): true end
​ # count was renamed line-count in the next line
​ t = table: name :: String, line-count :: Number end
​ check_list = [list:]
​ find-charactersB(check_list) 
​ find-charactersR(check_list) 
​ characters-table("") is t  # NEW/CORRECT
​ # Next line was old/incorrect check in original handout
​ # characters-table(check_list, check_list) is t
​ count-a-gabriel("gabriel hi", "gomes") is 0
​ exists(food) is true
​ is-expired(food("",100,true))
​end
  • Download your coding file and make sure it is called hw5-code.arr.
  • Download your tests file from Examplar and make sure it is called hw5-tests.arr.
  • Hand in both files on Gradescope.

Remember your resources!

The Assignment

Learning Goals

  • To practice writing recursive functions over lists
  • To practice using map and filter functions
  • To practice testing list programs
  • To practice writing your own datatype

This assignment draws on some basic ideas about analyzing texts. For some of these problems, we will tell you whether to use the built-in list operations (like map and filter) or an explicit recursive function (written with cases and based on examples, like we've been doing in lecture).

Part 1: Analyzing Literature

Computational tools are helping literature scholars find patterns in text. With the text of many books now available online, we can download that text and use programs to extract data about the text. For this part of the assignment, we will work with the text of the play Much Ado About Nothing.

Our goal is to determine the number of lines in the play that are spoken by each character. Specifically, we want to generate a table with two columns: the first column holds names of characters in the play and the second column holds how many lines that character had in the play. There should be only one row per character, but those rows can be in any order.

Unlike with earlier tables work, for this assignment we will be building tables from lists. We'll introduce the operations that you need as they come up in this assignment.

Task 1: Copy this code into your hw5-code.arr file. The text of the play is defined (in the shakespear-text.arr file) as one long string named book-text.

include shared-gdrive("shakespeare-text.arr", "1-XGIpbC13ItjxCh7WY14cCjfygk5wz5x") 
include shared-gdrive("dcic-2021", "1wyQZj_L0qqV9Ekgr9au6RX2iqt2Ga8Ep")

To examine the text, we have to convert the single String that contains the text to a list of words.

Task 2: Create a list named book-words that contains all of the words in book-text in order. In other words, we are converting the single string for the text into a list of the words. You should be able to do this in a single line. Refer to the Pyret String documentation if you don't remember the function that we've seen earlier in the semester that does this (professionals refer to documentation to look up functions they've used before all the time).

Identifying the characters

In order to build our table of lines per character, we first need to identify all of the characters in the play. Luckily for us, this play text used character names in ALL-CAPS to label each line (real text is usually messier). Thus, we need to isolate all of the ALL-CAPS names.

Since this assignment practices different ways to process lists, we want you write this two ways: once using built-in list operations and once as a recursive function.

Task 3: Write a function called find-charactersB that takes a list of String and returns a list of all of the character names in the given list. Your implementation should use built-in list operations and no manually-written recursion. You may write non-recursive helper functions (with or without lam, as you prefer).

Task 4: Now write a second version of this function called find-charactersR with the same inputs/outputs, but that uses recursion instead of built-ins. Your output list may contain duplicates (or not, that decision is up to you).

Note: Remember how we wrote such functions in class: start by writing out the entire list pattern (with the cases and the call to the function on the rest of the list), then try to fill in the holes to finish the function.

Note: We are not asking for sophisticated text processing here. We're asking you to find words in ALL-CAPS. Don't try to scan for ALL-CAPS that might not mark lines, etc. Our tests assume that you've counted ALL-CAPS words, plain and simple.

Testing: You should have several where examples for each of these functions (you can use the same examples for both).

Creating a table of lines per character

Now that you have a list of all the names, it's time to create a table summarizing the number of lines per character. We are going to do this using mostly functions over lists. In particular, do NOT use the count function in the table documentation (there will be no credit for that solution).

Final table requirements: the table produced for this part should have one row per character. It should have columns called "name" and "line-count". The character rows can be in any order.

Roughly, to do this, we're going to create a list of the contents of each column in the desired table, then add those column-lists to the table. Here are functions (that were loaded in the starter code that you copied) that will help you do this:

  • create-table-with-col(colname :: String, colvals :: List<Any>) -> Table
    This function takes a column header (like "name") and a list of values and creates a single-column table with those values.
  • add-col :: (t :: Table, colname :: String, colvals :: List<Any>) -> Table
    This function takes a table and adds a new (rightmost) column with the given colname and values. The list of colvals must be the same length as the number of rows (without the header/column name.)

Experiment with these in the interactions window to see how they work.

Task 5: Now, let's put it all together. Write a function characters-table that takes a String with the text of a play and produces a table summarizing the number of lines spoken by each character.

Note: You may use any combination of built-in operators and recursion to create the lists of column values. Use your code from previous tasks as helpers if you want (and write additional helpers as you need). Write as many helper functions as you need. Sketch out a plan if that would help you. This is your chance to show what you've learned about writing clean and robust code for a nontrivial problem.

Task 6: Develop two good examples of shorter input strings to use to test your character-table function (and use them in a where block for that function). Put a comment above each example indicating what situation(s) you designed it to help test.

Note: to test your function, you can generate a character table then pull out individual cell values to check you don't have to check the entire table. This can make your testing much easier and less annoying to write.

Part 2: Looking at Text in Context

Our first text analysis centered around individual words. For some questions, however, we need to look at text in context, meaning that we consider the words that occur close to other words. This set of problems looks at how we write programs to consider words in context.

Imagine that you are analyzing a text that features two characters named Gabriel (Gabriel Sun and Gabriel Costa). You want to see which one is mentioned more often. To do this, you need to inspect the words immedately after "Gabriel" in the text (which could be either one of the characters' last names or some other word).

Task 7: Use Examplar to develop a comprehensive set of tests (in a check block) for a function named count-a-gabriel, which takes a String of main text (multiple words) and a String containing a last name to track and returns a Number. The function returns the number of times that the given last name directly follows "Gabriel" in the main text. Accept any capitalization of "gabriel" and the given last name.

Note: There are 7 broken solutions. Finding 6 will get you full credit (with partial credit for finding fewer than 6; you'll get most of that credit if you find 5).

Designing a Solution: Now it's time to think about how to implement count-a-gabriel. This is a different style problem than what you've done before, so let's think through how to approach it.

As we've discussed in lecture, people program by retrieving/remembering solutions to problems that they've already seen and adapting them to a new situation. We have done problems that count list items before, but not problems that count surrounding items. We've seen two approaches to counting occurrences of items: (1) a combination of filter and length or (2) explicit recursion.

Which makes sense here? Think about it before opening this to check your answer.

The built-in functions, by design, only look at one element of a list at a time. In this problem, we need the surrounding words, so we have to use explicit recursion.


If you already see how to adapt a previous counting approach to write count-a-gabriel, jump down to Task 8. Otherwise, read on for guidance.

Sometimes, we can successfully adapt code by actually writing out the original version and deciding how to modify it. Set aside checking the last names, and write down the previous approach that you know to just count appearances of "Gabriel" in the given string. (We're serious, type it into your file and make sure it works).

Now, look at that code and identify what has to be different to count the word after "Gabriel" instead. Keep the high-level structure of the code that you have, and decide how you'll need to modify it.

Task 8: Implement count-a-gabriel. Your code should pass the tests that you developed through Examplar. You are welcome to go back to Examplar and add more tests if something occurs to you while you are writing the function.

Part 3: Creating Datatypes

Task 9: Create a datatype FoodItem. It should have one constructor food with three fields:food-name (a String), days-until-expiration (a Number) and keep-cold (a Boolean).

Note: These inputs should be in the same order as written here, in order to work with the autograder. For this question, we're just checking that you can use the syntax from class to define a datatype, nothing fancier than that!

Task 10: Create a list named food-list containing three FoodItems (you choose the details of the items).

Task 11: Develop a function is-expired that determines whether or not a food expired. It should take one FoodItem and return a boolean indicating whether the days-until-expiration is below zero.

Theme Song

Two Way Street originally by Kimbra, covered by Brown University's Harmonic Motion A Cappella


Brown University CSCI 0111 (Fall 2022)
Do you have feedback? Fill out this form