The objective of this lab is to gain familiarity with sanitizing, visualizing, and gaining insights from data.
In all past assignments, the data you worked with had been sanitized for you (i.e., you could expect Pyret to process the data without any extra work). The data in this lab, however, is ported directly from Google Forms submissions, so some of it is either invalid or not useful. You need to find ways to sort through such cases effectively so that you can gather the insights you need.
If you don't feel more comfortable with sanitizing, visualizing, and interpreting data after working on this lab, come to TA hours! Your TAs would love to help in any way they can.
You’ve just arrived at Pluto University to meet with one of the resident psychology professors.
The professor wants to learn more about human behavior. Given your rare qualifications as an Earth-dwelling human, they thought they could use your help.
They need your help to collect meaningful data and analyze it for them so they can publish their next paper!
Brainstorm. What defines you as a person? What kind of information would be good for a report on human behavior? Different kinds of data suit different kinds of visualizations. For example, line graphs are often best at representing numeric data, while pie charts may be better for presenting tallies and votes.
For their paper, the professor would like a variety of data types, from numbers to strings. After thinking for a while, the professor decides that they want to study the life of a student – namely, Brown University students (how convenient!).
Your lab TAs will give you a link to a Google Form with the professor's questions.
Fill out the form to contribute to the study!
Go to the response spreadsheet (also from your TAs), make a copy of it (select "Make a Copy" under the "File" tab), and import it into your program. This copy of the spreadsheet is yours to tweak manually. There are some values that Pyret will not allow you to read in, such as empty cells and other lovely surprises for you to discover. Pyret will let you know when it finds a value it doesn't like, and you will have to correct these by hand directly in the spreadsheet.
To import the spreadsheet, include this at the top of your program:
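A minimal sketch of such an import follows. The tab name `"Form Responses 1"`, the placeholder ssid, and every column name are assumptions; adjust them to match your copy, listing one name per spreadsheet column in the same order as the sheet.

```pyret
include gdrive-sheets
include data-source

# Placeholder: paste the ssid of YOUR copy of the spreadsheet here.
ssid = "your-ssid-here"

# Assumed tab name. The `true` argument says that the first row of
# the sheet holds column headers rather than data.
responses = load-spreadsheet(ssid).sheet-by-name("Form Responses 1", true)

# Hypothetical column names, chosen only for illustration.
student-data =
  load-table: timestamp, lives-on-campus, num-classes, commute-time
    source: responses
  end
```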
(Replace the ssid with that of your own copy.)
Note: Your ssid is the long string of letters, digits, and symbols in your spreadsheet's URL, between `/d/` and `/edit`.
Now that we have the data collected, we need to make sure that the data is clean and processed before analysis.
Talk about why it will be difficult to draw insights from the columns about concentration name and hours of extracurricular obligations. Be able to explain your reasoning to a TA. Create a new table (within Google Sheets) that doesn't include these columns, and use this new table for the rest of the lab.
Once you import your table of student data, you will notice that the values in certain columns are in the format `some(x)` or `none`. To turn these values into the data types we all know and love (`String`, `Number`, etc.), use the sanitize functions upon import. Read the Pyret documentation on sanitizers before you start.
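As a rough sketch of how sanitizers attach to an import: each `sanitize` line names a column and the sanitizer to apply to it. `string-sanitizer` and `num-sanitizer` come from Pyret's `data-source` library; the ssid, the tab name, and the column names below are placeholders, not the lab's actual ones.

```pyret
include gdrive-sheets
include data-source

# Placeholder ssid and assumed tab name; replace both with your own.
ssid = "your-ssid-here"
responses = load-spreadsheet(ssid).sheet-by-name("Form Responses 1", true)

# Hypothetical column names. Without the `sanitize` lines, cells in
# these columns may load as `some(x)` or `none` values.
student-data =
  load-table: timestamp, lives-on-campus, num-classes
    source: responses
    sanitize lives-on-campus using string-sanitizer
    sanitize num-classes using num-sanitizer
  end
```

Columns without a `sanitize` line are left as they were read in.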
Before writing your sanitizers, remember: there are some values that Pyret will not allow you to read in, such as empty cells and other lovely surprises for you to discover. Pyret will let you know when it finds a value it doesn't like, and you will have to correct these by hand directly in the spreadsheet.
Look at the other columns. Discuss with your partner which columns might need to be cleaned, and come up with a plan to do this. In particular, come up with functions to clean up the columns about the number of classes, the hours spent doing schoolwork, the number of extracurriculars, and the commute time. Think about some times when it might be better to clean the data using sanitizers and other times when it might be more prudent to do so manually.
Write these functions, and create a new table with cleaned columns.
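One possible shape for such cleaning functions is sketched below. The cleaning rules (a plausible class count is a whole number from 1 to 8; unparseable commute times become `-1`), the column names, and the sample rows are all invented for illustration, so pick rules that fit what you actually see in your data.

```pyret
# Hypothetical rule: a class count is plausible only if it is a
# whole number between 1 and 8, inclusive.
fun is-plausible-class-count(n :: Number) -> Boolean:
  doc: "True when n is a whole number from 1 to 8"
  num-is-integer(n) and (n >= 1) and (n <= 8)
where:
  is-plausible-class-count(4) is true
  is-plausible-class-count(40) is false
  is-plausible-class-count(3.5) is false
end

# Hypothetical rule: commute times typed as text are parsed into
# numbers; answers that cannot be parsed are marked with -1.
fun text-to-minutes(s :: String) -> Number:
  doc: "Parse s as a number, or produce -1 if it is not numeric"
  cases (Option) string-to-number(s):
    | some(n) => n
    | none => -1
  end
where:
  text-to-minutes("15") is 15
  text-to-minutes("a while") is -1
end

# Invented sample rows standing in for the imported table.
sample = table: num-classes, commute-time
  row: 4, "15"
  row: 40, "10"
  row: 5, "a while"
end

# Drop rows with implausible class counts, then clean the commute column.
plausible = sample.filter(lam(r): is-plausible-class-count(r["num-classes"]) end)
cleaned = plausible.transform-column("commute-time", text-to-minutes)
```

Filtering removes whole rows, while `transform-column` keeps every row and rewrites one column; which is appropriate depends on whether the rest of a bad row is still usable.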
Now that the data has been tidied up, we need to put it together and analyze it in a way that will be meaningful for a research paper! What interesting ways can we represent this information? How should we visualize it?
You get a message from your space laboratory out in the city and they're wondering what is taking so long – they need your help in the laboratory! Consult with your lab partner on how to explain what you have been up to. Telling your boss and co-workers that you're "relaxing on Pluto's beautiful campus" probably isn't going to cut it. Show them something cool to persuade them that your trip is worthwhile!
We want to transpose the table (flip rows and columns) to look at the average commute times and average hours of sleep on weekdays of students living on campus and students living off campus. Since we don't yet know how to transpose an entire table through code, we'll build the table we need manually.
Create a table named `campus-living-table` with the columns `location`, `avg-commute`, and `avg-sleep-weekdays`.

Location | Commute time | Sleep on weekdays |
---|---|---|
On campus | Average | Average |
Off campus | Average | Average |
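A sketch of how the averages in this table could be computed and assembled by hand, assuming a cleaned table with a Boolean `on-campus` column and numeric `commute-time` and `sleep-weekdays` columns. These source column names and the sample rows are invented; the helper `avg-of` is hypothetical, and `mean` comes from Pyret's `statistics` library.

```pyret
import statistics as S

# Invented sample rows standing in for your cleaned table.
sample = table: on-campus, commute-time, sleep-weekdays
  row: true, 10, 7
  row: true, 20, 6
  row: false, 30, 8
  row: false, 50, 6
end

# Hypothetical helper: the mean of one numeric column of a table.
fun avg-of(t :: Table, col :: String) -> Number:
  S.mean(t.get-column(col))
end

# Split the rows by where the student lives.
on-campus-rows = sample.filter(lam(r): r["on-campus"] end)
off-campus-rows = sample.filter(lam(r): not(r["on-campus"]) end)

on-commute = avg-of(on-campus-rows, "commute-time")
on-sleep = avg-of(on-campus-rows, "sleep-weekdays")
off-commute = avg-of(off-campus-rows, "commute-time")
off-sleep = avg-of(off-campus-rows, "sleep-weekdays")

# Build the summary table manually, one row per living situation.
campus-living-table = table: location, avg-commute, avg-sleep-weekdays
  row: "On campus", on-commute, on-sleep
  row: "Off campus", off-commute, off-sleep
end
```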
Plot any two columns that you think might be related to one another. Do you notice any cool correlations?
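One way to draw such a plot is with Pyret's `chart` library, sketched here under the assumption that your course setup does not already provide its own plotting functions. The two lists hold invented values; with a real table you could obtain them with `get-column` on the two columns you want to compare.

```pyret
import chart as C

# Invented sample values, for illustration only.
commute-minutes = [list: 5, 10, 20, 35, 50]
sleep-hours = [list: 8, 7.5, 7, 6.5, 6]

# A scatter plot pairs up the two lists element by element.
series = C.from-list.scatter-plot(commute-minutes, sleep-hours)

chart-window = C.render-chart(series)
titled = chart-window.title("Sleep vs. commute (sample data)")
labeled = titled.x-axis("Commute time (minutes)").y-axis("Sleep on weekdays (hours)")
labeled.display()
```

A scatter plot is a reasonable first choice for two numeric columns, since any trend between them shows up as a visible slope in the cloud of points.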
It's now time for the professor's paper to be published and presented to a bunch of other smart professors! The professor gets a little shy when it comes to answering questions and presenting – so they have asked you to do all of it.
Discuss the answers to the following questions with your partner:
Thanks to your excellent work, the presentation was a success and the professor is very pleased. Who knew aerospace engineers could make such good psychologists?