In this lab, you’ll be using everything you’ve learned about tables (and some of what you’ve learned about lists) to create a data visualization tool. Specifically, you’ll use a dataset of active Providence business licenses published by the Rhode Island government to mark locations on a map.
A key goal of this lab is to give you practice breaking large problems into separate tasks, each of which gets handled with its own collection of functions, expressions, and operators. As you work on the problems, practice writing out plans of tasks and using those to guide your work.
Lab 5 Presentation Slides
Pyret List Documentation
CS0111 Table Documentation
Before we get into maps, however, let's practice writing functions with lists. We're going to do a set of problems that involve processing emails. Your goal is to use a combination of filter, map, distinct, length, and get on lists to solve these problems.
Note: As you read the Pyret lists documentation, you might see that there are two syntaxes for computing on lists. For example, if sample-lst is a list, you could get the distinct elements by calling the function distinct on the list: distinct(sample-lst). The Pyret documentation also includes list methods, which use a slightly different notation: sample-lst.distinct(). Both will produce the same answer, but we prefer that you use the function version (e.g. distinct(sample-lst)). The reasoning is that we are applying a computation on a list, just as we've been using functions in class up to this point.
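To make the two notations concrete, here is a minimal sketch. It assumes the list functions are reached through import lists as L; with the lab's setup code you may be able to write the bare names used in this handout instead. The list sample-lst is made up for illustration.

```pyret
import lists as L

# A made-up list for illustration
sample-lst = [list: "a", "b", "a"]

# Function notation (the style preferred in this course)
L.length(sample-lst)
L.distinct(sample-lst)

# Method notation for the same length computation
sample-lst.length()
```

Both length calls produce the same number; only the notation differs.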
Task 1: Define two lists of strings: one that has only valid email addresses (like "milda@brown.edu", "ice@google.com", etc.) and one that also has some invalid ones (missing the "@", missing everything after the username, etc.). You'll use these for testing.
Task 2: Write a function that takes a list of email addresses and counts the number that are from .edu addresses (it's okay to just use string-contains here, rather than checking that .edu is at the end of the address).
Task 3: Write a function that takes a list of emails and returns the ones that lack the "@" sign.
Task 4: Write a function that takes a list of email addresses and returns just a list of the domains without the ending ("brown" for "milda@brown.edu", "google" for "ice@google.com"), or the part of the email that is between the "@" and the ".". This takes it one step further than what we saw in class using the string-split function.
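As a reminder of how string-split behaves, here is a small sketch that pulls out the username rather than the domain. The helper name username is hypothetical, and the list functions are assumed to be reached through import lists as L.

```pyret
import lists as L

# string-split cuts a string at the FIRST occurrence of the separator
parts = string-split("milda@brown.edu", "@")
# parts is [list: "milda", "brown.edu"]

# Hypothetical helper: the part of an address before the "@"
fun username(email :: String) -> String:
  L.get(string-split(email, "@"), 0)
end

usernames = map(username, [list: "milda@brown.edu", "ice@google.com"])
# usernames is [list: "milda", "ice"]
```

Getting the domain without its ending takes a second split on the piece that follows the "@".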
Task 5: Write a function that returns a list of unique domains without the ending (so if your email list has multiple "@brown.edu" addresses, the result should only have one "brown").
Before you start working with the data, take a look at the Providence Business License dataset. The first sheet (inside the file) is the original dataset, and the second sheet is a pruned version with only the columns you’ll need for this lab. Load the pruned sheet into Pyret by pasting the following code into the top of the definitions window (above your code from Part 1):
include shared-gdrive("dcic-2021", "1wyQZj_L0qqV9Ekgr9au6RX2iqt2Ga8Ep")
include gdrive-sheets
import data-source as ds
include shared-gdrive("stencil.arr", "1Ntm5Hyox9of8JoVy9gZqLZkiqwFxEkCM")
##############
# SETUP CODE #
##############
ssid = "18eKc_o-eZ8TWno6ePYkcMGKhx-Y2Fe8LEERqV5gClaU"
data-sheet = load-spreadsheet(ssid)
pvd-businesses =
  load-table: license-id, name, extended-hours, hotel, parking,
    restaurant, food-and-liquor, food-dispenser, lat, long
    source: data-sheet.sheet-by-name("Licenses_Pruned", true)
    sanitize license-id using ds.string-sanitizer
    sanitize name using ds.string-sanitizer
    sanitize extended-hours using ds.bool-sanitizer
    sanitize hotel using ds.bool-sanitizer
    sanitize parking using ds.bool-sanitizer
    sanitize restaurant using ds.bool-sanitizer
    sanitize food-and-liquor using ds.bool-sanitizer
    sanitize food-dispenser using ds.bool-sanitizer
    sanitize lat using ds.num-sanitizer
    sanitize long using ds.num-sanitizer
  end
We have provided you with a function table-to-map(t :: Table) -> Image, which takes in a table and plots rows as dots on a map of Providence, based on their "x", "y", and "color" columns. Treat table-to-map like a built-in function; you can call it just like you called filter-with or any other built-in function in previous labs.
To use table-to-map to map the locations of the businesses listed in pvd-businesses, you will need to convert the longitude and latitude values in the table to x- and y-coordinate values. Only locations that fit within the bounds of the Providence map can be included (table-to-map will account for this automatically). We’ve defined the following values for you in the stencil.arr file that the setup code includes:
LAT-MIN - the minimum latitude that fits the map
LAT-MAX - the maximum latitude that fits the map
LON-MIN - the minimum longitude that fits the map
LON-MAX - the maximum longitude that fits the map
HEIGHT - the height of the map
WIDTH - the width of the map
Try typing LAT-MIN or any of these names in the interactions window; you should see that they are predefined values that you can use in your functions.
Use these values to write functions that scale latitudes to y values and longitudes to x values relative to the map dimensions. You must proportionally scale the businesses' (long, lat) coordinates to (x, y) coordinates in order for the rows to be read by the table-to-map function correctly.
To make things easier for you, we've included scaling formulas!
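For intuition about what proportional scaling means, here is a generic linear-scaling sketch. It is not the lab's formula: the bounds and size below are made-up stand-ins for the stencil's predefined values, and the helper name scale-linear is hypothetical. Whether an axis needs to be flipped (image y-coordinates usually grow downward, while latitude grows upward) depends on the formulas provided with the lab, so follow those.

```pyret
# Made-up bounds and size, standing in for the stencil's predefined values
ex-min = -71.5
ex-max = -71.3
ex-size = 800

# Generic linear scaling: maps lo to 0 and hi to size,
# and everything in between proportionally
fun scale-linear(value :: Number, lo :: Number, hi :: Number, size :: Number) -> Number:
  ((value - lo) / (hi - lo)) * size
end

# A value halfway between the bounds lands halfway across
scale-linear(-71.4, ex-min, ex-max, ex-size)
```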
Task 6: Use these scaling functions to add columns named x and y to the pvd-businesses table with these scaled values.
Task 7: Add a column labeled color and set each row’s color to "black". After adding these columns, the computed table can be used as an input to table-to-map.
Think about whether the output for this column is dependent on each specific row. How can we generate this column using build-column?
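As a refresher on build-column, here is a small sketch on a made-up table. It assumes the function form build-column(table, column-name, builder) supplied by the dcic-2021 library that the setup code includes; the table shops and the helper half-visits are hypothetical.

```pyret
include shared-gdrive("dcic-2021", "1wyQZj_L0qqV9Ekgr9au6RX2iqt2Ga8Ep")

# A small made-up table for illustration
shops = table: name, visits
  row: "Cafe A", 4
  row: "Cafe B", 10
end

# The builder receives one Row at a time and returns the new cell's value
fun half-visits(r :: Row) -> Number:
  r["visits"] / 2
end

with-half = build-column(shops, "half", half-visits)
```

Here the builder reads from its row, but nothing requires a builder to do so.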
Complete the tasks and pass the transformed table into table-to-map. The output should look something like this:
Now that you have a map of Providence, let’s think about the types of data that have vital geo-spatial components. When comparing your map to the tabular representation of the active Providence business licenses, each portrayal of data provided different insights. Data linked to these licenses will most likely be spatial as it contains attributes linked to location.
As we think about the City of Providence we can consider how several factors have shaped the city today: historic urban planning decisions and policies, development pressure, rates of homeownership, inequitable private and public investment in the installation and maintenance of green spaces and trees, urban density and available plantable area, social and cultural practices, etc. Several of these factors are interconnected and could even have mutually reinforcing relationships.
Look at Providence maps of:
from the Providence Neighborhood Planting Program.
Note: if the above link doesn't work, use this archive link!
Think about how these phenomena intertwine with one another and about common trends in different areas of Providence. With your partner, find a blank slide in the Google Slides presentation provided by your TAs and write down your observations. Circle areas on the different maps that you and your partner speculate may be related to each other. Annotate any further questions you or your partner have about these areas of the city. Once you have at least 4-5 annotations, call the TA over to get checked off.
Now that you have a map of Providence (albeit not a very pretty one), we need to put it to good use. Your friend Kara has taken you up on your invitation to visit and has just arrived in Providence. She wants to know what she should do while she's here!
Task 8: Write a function generate-tourist-map(needs-hotel :: Boolean, has-car :: Boolean, stays-up-late :: Boolean, eats-out :: Boolean, businesses :: Table) -> Image that takes in a series of Booleans and the original pvd-businesses table and returns a map marked with the places that Kara might want to visit.
needs-hotel: If this Boolean is true, Kara needs a hotel to stay at! Mark all of the inns on the map with a color of your choice.
has-car: If true, Kara has decided to drive and will need to find places with parking. Mark all places with parking with a second color.
stays-up-late: If true, Kara is ready for a night out! Mark all places with extended hours (look at the column entitled extended-hours) with a third color.
eats-out: If true, Kara has given up on cooking for herself and wants to eat out. Mark all eateries with a fourth color.
Remember that your generate-tourist-map function should take in the original pvd-businesses table, not the modified one made in Part 1. You'll have to add the x and y columns again within generate-tourist-map.
If a place has more than one relevant characteristic, it should be colored according to the characteristic that's most important to Kara. Any non-relevant places should be colored black. Starting with the most important characteristic, Kara cares first about hotels, then eating out, then places that have parking, and lastly, places that stay open late.
Depending on the values of the parameters (needs-hotel, has-car, stays-up-late, eats-out), the computed colors for places will be different. Think about how you can accomplish this behavior with a helper function.
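One way to express "color by the most important matching characteristic" is an if / else if chain that tests the characteristics in priority order. The sketch below is deliberately cut down to two made-up characteristics, a and b, with a outranking b; the helper name pick-color, its parameters, and the colors are all hypothetical.

```pyret
# Hypothetical helper with only two characteristics, where "a" outranks "b".
# wants-a / wants-b say what the visitor cares about;
# is-a / is-b say what is true of one particular place.
fun pick-color(wants-a :: Boolean, is-a :: Boolean, wants-b :: Boolean, is-b :: Boolean) -> String:
  if wants-a and is-a:
    "red"
  else if wants-b and is-b:
    "blue"
  else:
    "black"
  end
end
```

Because the branches are tried from top to bottom, a place that matches both characteristics gets the color of the first one.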
NOTE: Look at the list of predefined colors in the Pyret Documentation to see what options you have.
It turns out that your game-enthusiast friend Kara wants nothing more than to buy and play Monopoly. However, before she can buy Monopoly, Kara would like a doughnut (everyone gets hungry!). Good thing Kara is a big advocate for local businesses and wants to visit all the local doughnut shops in Providence!
How do we distinguish local shops from those in national chains? One way is to find all the doughnut shops and see which ones have names that appear many times in the data; local shops will have fewer entries in the list. A frequency bar chart would show us this answer visually, but what if we had to compute this information from a list of names of doughnut shops, rather than read it off of a frequency bar chart?
Task 9: Write an expression (you may need to use a helper function) that gets the list of all donut shops from the original table.
Something to keep in mind: There are two spellings for the tasty dessert of interest: donut and doughnut. Be aware of this when you are looking for businesses.
Task 10: Based on inspecting the list of doughnut shops, determine which shops are local or a chain. We'll consider a doughnut shop to be local if it has no more than 2 locations in the pvd-businesses table.
Filter out the chain donut shops to create a list called donuts-lst of names of local doughnut shops. Since you're working with a List (as opposed to a Table), remember to use List operations.
Remember to write out your plan first and use that to guide your work.
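A plan for this kind of problem often has three steps: count how often one name occurs, keep the names whose count is small enough, and drop the repeats. Here is a sketch of those steps on made-up names; count-of is a hypothetical helper, and the list functions are assumed to be reached through import lists as L.

```pyret
import lists as L

# Made-up names for illustration
names = [list: "Acme", "Bolt", "Acme", "Acme", "Bolt", "Cozy"]

# How many times does target appear in lst?
fun count-of(target :: String, lst :: List<String>) -> Number:
  L.length(filter(lam(n): n == target end, lst))
end

# Keep the names that appear at most 2 times, then drop the repeats
rare = L.distinct(filter(lam(n): count-of(n, names) <= 2 end, names))
# rare is [list: "Bolt", "Cozy"]
```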
Task 11: Filter the original pvd-businesses table to only include donut shops (both local and chain).
Task 12: On this filtered table, add a color column where local stores are red and chain stores are black.
Task 13: Using the table-to-map function and your updated table, create a map that highlights all of the donut shops in their correct colors.
Now that Kara got her doughnut, she is ready to purchase Monopoly. Unsatisfied with how few stores sell them in Providence, Kara decides to open her own game store. After conducting a survey of existing options, she's created a list of the most popular board games. With Monopoly, Kara can finally have fun!
Task 14: Write a function create-token(token-colors :: List<String>) -> List<Image> that takes in a list of strings representing valid colors and produces a list of game tokens of these colors. Each token should have this shape, but with its index-specific color and a yellow star:
Each token should also have a randomly generated size, such that its radius is between 15 and 35 units.
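For the random size, Pyret's num-random(n) produces an integer from 0 up to n - 1, so shifting it gives a range that starts wherever you like. The sketch below assumes the radius should be a whole number and that both endpoints are allowed; random-radius is a hypothetical helper name.

```pyret
# num-random(21) is an integer from 0 through 20,
# so adding 15 gives an integer from 15 through 35
fun random-radius() -> Number:
  15 + num-random(21)
end
```

Each call can produce a different value, so call it once per token rather than reusing one result.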
Discuss why you think lists are being used in this problem as opposed to tables. When would it be preferable to use tables over lists?
Kara has discovered that the ideal token size has a radius between 20 and 30. She wants to discard all of the token images that are outside of that range and make display signs out of the rest.
Task 15: Write a function generate-token-signs(token-images :: List<Image>) -> List<Image> that takes in the output list from create-token. This function should remove all of the tokens that lie outside of the size range and put the remaining tokens on square backgrounds. This is what an output list item should look like:
Look at the image-width function in the documentation when writing this function.
All the square backgrounds should be of the same fixed size, independent of the random sizes of the tokens. However, all possible token sizes should fit into this background – think about what the maximum token size could be when you're picking the size of your square background!
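Here is a sketch of measuring images with image-width and centering them on a fixed background. It uses plain circles as stand-ins for tokens, so the width is simply twice the radius; the lab's real token shape may measure differently. The names shapes, is-medium, and framed, as well as the size limits and the background size, are made up for illustration.

```pyret
include image

# Hypothetical stand-ins for tokens: plain circles of radius 10, 25, and 40
shapes = [list:
  circle(10, "solid", "red"),
  circle(25, "solid", "blue"),
  circle(40, "solid", "green")]

# image-width reports an image's width in pixels;
# for a plain circle that is twice the radius
fun is-medium(img :: Image) -> Boolean:
  (image-width(img) >= 40) and (image-width(img) <= 60)
end

medium = filter(is-medium, shapes)

# overlay centers its first image on top of its second,
# so every framed image takes the size of the square
framed = map(lam(img): overlay(img, square(100, "solid", "white")) end, medium)
```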
Kara finally bought her new token! Hooray! As a gift, she has provided you with a $5 gift card for her game store! Business at Kara's store is booming, all thanks to you.
Brown University CSCI 0111 (Spring 2025)
Feedback form: tell us about your lab experience today