Due: Tuesday October 25, 2022 at 11:59PM ET.
This assignment has two goals: to give you some practice writing list functions, and to step back and give you a chance to show us what you have learned about writing clean code and developing a good set of examples for a problem. It's something of a mid-semester checkpoint on your program development skills. We'll be paying particular attention to your code organization and testing in grading this assignment.
hw5-code.arr
where you will write your Homework 5 solution.hw5-code.arr
.hw5-tests.arr
.Table
documentation)Learning Goals
This assignment draws on some basic ideas about analyzing texts. For some of these problems, we will tell you whether to use the built-in list operations (like map
and filter
) or an explicit recursive function (written with cases and based on examples, like we've been doing in lecture).
Computational tools are helping literature scholars find patterns in text. With the text of many books now available online, we can download that text and use programs to extract data about the text. For this part of the assignment, we will work with the text of the play Much Ado About Nothing.
Our goal is to determine the number of lines in the play that are spoken by each character. Specifically, we want to generate a table with two columns: the first column holds names of characters in the play and the second column holds how many lines that character had in the play. There should be only one row per character, but those rows can be in any order.
Unlike with earlier tables work, for this assignment we will be building tables from lists. We'll introduce the operations that you need as they come up in this assignment.
Task 1: Copy this code into your hw5-code.arr
file. The text of the play is defined (in the shakespear-text.arr
file) as one long string named book-text
.
To examine the text, we have to convert the single String
that contains the text to a list of words.
Task 2: Create a list named book-words
that contains all of the words in book-text
in order. In other words, we are converting the single string for the text into a list of the words. You should be able to do this in a single line. Refer to the Pyret String documentation if you don't remember the function that we've seen earlier in the semester that does this (professionals refer to documentation to look up functions they've used before all the time).
In order to build our table of lines per character, we first need to identify all of the characters in the play. Luckily for us, this play text used character names in ALL-CAPS to label each line (real text is usually messier). Thus, we need to isolate all of the ALL-CAPS names.
Since this assignment practices different ways to process lists, we want you write this two ways: once using built-in list operations and once as a recursive function.
Task 3: Write a function called find-charactersB
that takes a list of String
and returns a list of all of the character names in the given list. Your implementation should use built-in list operations and no manually-written recursion. You may write non-recursive helper functions (with or without lam
, as you prefer).
Task 4: Now write a second version of this function called find-charactersR
with the same inputs/outputs, but that uses recursion instead of built-ins. Your output list may contain duplicates (or not, that decision is up to you).
Note: Remember how we wrote such functions in class: start by writing out the entire list pattern (with the cases and the call to the function on the rest of the list), then try to fill in the holes to finish the function.
Note: We are not asking for sophisticated text processing here. We're asking you to find words in ALL-CAPS. Don't try to scan for ALL-CAPS that might not mark lines, etc. Our tests assume that you've counted ALL-CAPS words, plain and simple.
Testing: You should have several where
examples for each of these functions (you can use the same examples for both).
Now that you have a list of all the names, it's time to create a table summarizing the number of lines per character. We are going to do this using mostly functions over lists. In particular, do NOT use the count
function in the table documentation (there will be no credit for that solution).
Final table requirements: the table produced for this part should have one row per character. It should have columns called "name"
and "line-count"
. The character rows can be in any order.
Roughly, to do this, we're going to create a list of the contents of each column in the desired table, then add those column-lists to the table. Here are functions (that were loaded in the starter code that you copied) that will help you do this:
create-table-with-col(colname :: String, colvals :: List<Any>) -> Table
"name"
) and a list of values and creates a single-column table with those values.add-col :: (t :: Table, colname :: String, colvals :: List<Any>) -> Table
colvals
must be the same length as the number of rows (without the header/column name.)Experiment with these in the interactions window to see how they work.
Task 5: Now, let's put it all together. Write a function characters-table
that takes a String
with the text of a play and produces a table summarizing the number of lines spoken by each character.
Note: You may use any combination of built-in operators and recursion to create the lists of column values. Use your code from previous tasks as helpers if you want (and write additional helpers as you need). Write as many helper functions as you need. Sketch out a plan if that would help you. This is your chance to show what you've learned about writing clean and robust code for a nontrivial problem.
Task 6: Develop two good examples of shorter input strings to use to test your character-table
function (and use them in a where
block for that function). Put a comment above each example indicating what situation(s) you designed it to help test.
Note: to test your function, you can generate a character table then pull out individual cell values to check – you don't have to check the entire table. This can make your testing much easier and less annoying to write.
Our first text analysis centered around individual words. For some questions, however, we need to look at text in context, meaning that we consider the words that occur close to other words. This set of problems looks at how we write programs to consider words in context.
Imagine that you are analyzing a text that features two characters named Gabriel (Gabriel Sun and Gabriel Costa). You want to see which one is mentioned more often. To do this, you need to inspect the words immedately after "Gabriel" in the text (which could be either one of the characters' last names or some other word).
Task 7: Use Examplar to develop a comprehensive set of tests (in a check
block) for a function named count-a-gabriel
, which takes a String
of main text (multiple words) and a String
containing a last name to track and returns a Number
. The function returns the number of times that the given last name directly follows "Gabriel"
in the main text. Accept any capitalization of "gabriel" and the given last name.
Note: There are 7 broken solutions. Finding 6 will get you full credit (with partial credit for finding fewer than 6; you'll get most of that credit if you find 5).
Designing a Solution: Now it's time to think about how to implement count-a-gabriel
. This is a different style problem than what you've done before, so let's think through how to approach it.
As we've discussed in lecture, people program by retrieving/remembering solutions to problems that they've already seen and adapting them to a new situation. We have done problems that count list items before, but not problems that count surrounding items. We've seen two approaches to counting occurrences of items: (1) a combination of filter
and length
or (2) explicit recursion.
The built-in functions, by design, only look at one element of a list at a time. In this problem, we need the surrounding words, so we have to use explicit recursion.
If you already see how to adapt a previous counting approach to write count-a-gabriel
, jump down to Task 8. Otherwise, read on for guidance.
Sometimes, we can successfully adapt code by actually writing out the original version and deciding how to modify it. Set aside checking the last names, and write down the previous approach that you know to just count appearances of "Gabriel"
in the given string. (We're serious, type it into your file and make sure it works).
Now, look at that code and identify what has to be different to count the word after "Gabriel"
instead. Keep the high-level structure of the code that you have, and decide how you'll need to modify it.
Task 8: Implement count-a-gabriel
. Your code should pass the tests that you developed through Examplar. You are welcome to go back to Examplar and add more tests if something occurs to you while you are writing the function.
Task 9: Create a datatype FoodItem
. It should have one constructor food
with three fields:food-name
(a String
), days-until-expiration
(a Number
) and keep-cold
(a Boolean
).
Note: These inputs should be in the same order as written here, in order to work with the autograder. For this question, we're just checking that you can use the syntax from class to define a datatype, nothing fancier than that!
Task 10: Create a list named food-list
containing three FoodItems
(you choose the details of the items).
Task 11: Develop a function is-expired
that determines whether or not a food expired. It should take one FoodItem
and return a boolean indicating whether the days-until-expiration
is below zero.
Two Way Street originally by Kimbra, covered by Brown University's Harmonic Motion A Cappella
Brown University CSCI 0111 (Fall 2022)
Do you have feedback? Fill out this form