Final Project
Due: December 12, 2022 at 10 pm
Overview
The final project in CS 100 is an opportunity to apply any and all of the skills you have learned during the course of the semester (e.g., statistics, machine learning, visualization) to tell a compelling story, backed by a data set of your choosing. You should pick a topic you are genuinely interested in, and should aim to produce something you are proud of. Working on this project should be fun, engaging, and rewarding.
Groups
You should complete this project in a group of either 2 or 3. Groups of size 2 will only be required to use one data set, but groups of size 3 will need to use at least two related data sets to tell a compelling and cohesive story.
There is a thread on EdStem dedicated to searching for partners. Note that the project proposals are due on Tuesday, November 22, so if you don't already have a partner lined up, you should begin looking for one soon.
CS 100 Homework #4
Movie Recommendations
Due: December 2, 2022 at 10 pm
Instructions
Please submit to Gradescope your R Markdown (.Rmd) file. Please also knit your R markdown file, and submit the resulting PDF file as well.
Be sure to follow the CS100 course collaboration policy as you work on this and all CS100 assignments.
Objectives
To practice clustering with a variety of similarity metrics, and to learn about content and collaborative filtering.
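As a rough illustration of one similarity metric used in collaborative filtering, the sketch below computes cosine similarity between users' rating vectors in base R. The ratings matrix, the user and movie names, and the convention that 0 means "unrated" are all made up for this example; they are not taken from the assignment's data.

```r
# Hypothetical ratings (made up): rows are users, columns are movies.
# Assumption: 0 stands for "not rated", which is a simplification.
ratings <- matrix(
  c(5, 4, 0, 1,
    4, 5, 1, 0,
    1, 0, 5, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ana", "ben", "cal"), c("m1", "m2", "m3", "m4"))
)

# Cosine similarity: the dot product divided by the product of vector lengths.
cosine_similarity <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Compare one user against the other two.
sim_ana_ben <- cosine_similarity(ratings["ana", ], ratings["ben", ])
sim_ana_cal <- cosine_similarity(ratings["ana", ], ratings["cal", ])
```

Under this metric, a user whose ratings point in nearly the same direction as another's gets a similarity close to 1, so that user's ratings would be the more natural source of recommendations.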
CS100: Studio 12
Visualizing Election Data
December 7, 2022
Instructions
During today’s studio, you will be creating data visualizations in R. Please write all of your code, and answers to the questions, in an R markdown document.
Objectives
By the end of this studio, you will know:
what FIPS is
CS 100: Studio 11
Text Analysis
November 30, 2022
Instructions
Today, you will be analyzing the transcripts of all the 2016 US Presidential Primary Debates.
You will be exploring the text to detect patterns in the language used by the candidates.
First, you will look at all the candidates individually, and then you will separate them by party to see if any patterns emerge that distinguish the two parties.
Upon completion of all tasks, a TA will give you credit for today's studio. If you do not manage to complete all the assigned work during the studio period, do not worry. You can continue to work on this assignment until Sunday, December 4 at 7 PM. Come by TA hours any time before then to show us your completed work and get credit for today's studio.
CS 100 Homework #3, Part II
Due: December 5, 2022 at 10 pm
N.B. This assignment is optional. If completed, its grade may be substituted for that of one of Homeworks 0, 1, 2, or 3 Part I.
Instructions
This assignment is extra credit. It will replace your lowest grade among the first four homeworks (if this grade is higher than that grade). Please submit to Gradescope your R Markdown (.Rmd) file. Please also knit your R markdown file, and submit the resulting PDF file as well.
Be sure to follow the CS100 course collaboration policy as you work on this and all CS100 assignments.
Overview
The topic of this homework assignment is supervised learning. The first part is concerned with linear regression, and this, the second part, with classification.
CS 100: Studio 10
Clustering
November 16, 2022
Instructions
In this week’s studio, you will be clustering colleges using the College Scorecard Data, first collected under President Obama.
Upon completion of all tasks, a TA will give you credit for today's studio. If you do not manage to complete all the assigned work during the studio period, do not worry. You can continue to work on this assignment until Sunday, November 20 at 7 PM. Come by TA hours any time before then to show us your completed work and get credit for today's studio.
Objectives
By the end of this studio, you will know:
CS 100: Studio 9
Classification
November 9, 2022
Instructions
During today’s studio, you’ll be building several binary classifiers in R to classify the passengers on the Titanic as survivors or not. Specifically, you will be using the $k$-nearest neighbors algorithm, for various values of $k$.
Upon completion of all tasks, a TA will give you credit for today's studio. If you do not manage to complete all the assigned work during the studio period, do not worry. You can continue to work on this assignment until Sunday, November 13 at 7 PM. Come by TA hours any time before then to show us your completed work and get credit for today's studio.
Objectives
To understand a supervised learning algorithm, namely $k$-nearest neighbors, and to use cross-validation to optimize the hyperparameter ($k$) of this algorithm.
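The idea of choosing $k$ by cross-validation can be sketched as follows. This is only an illustration under stated assumptions: it uses two species from R's built-in `iris` data as a stand-in binary problem (not the Titanic data), and it relies on `knn.cv` from the `class` package, which performs leave-one-out cross-validation. The studio itself may use a different data split or a different function.

```r
library(class)  # recommended package that ships with standard R installations

# Stand-in data (assumption): two iris species as a binary classification task.
data <- subset(iris, Species != "setosa")
features <- scale(data[, 1:4])      # k-NN is distance-based, so standardize
labels <- droplevels(data$Species)

set.seed(100)                        # knn breaks ties at random
ks <- seq(1, 15, by = 2)             # odd k avoids tied votes with two classes

# Leave-one-out cross-validated accuracy for each candidate k.
cv_accuracy <- sapply(ks, function(k) {
  predicted <- knn.cv(train = features, cl = labels, k = k)
  mean(predicted == labels)
})

# Pick the k with the highest cross-validated accuracy.
best_k <- ks[which.max(cv_accuracy)]
```

Evaluating each $k$ on held-out points, rather than on the points used to make the prediction, is what keeps very small values of $k$ from looking artificially good.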
CS 100 Homework #3, Part 1
Part 1 Due: November 17, 2022 at 10 pm
Instructions
Please submit to Gradescope your R Markdown (.Rmd) file. Please also knit your R markdown file, and submit the resulting PDF file as well.
Be sure to follow the CS100 course collaboration policy as you work on this and all CS100 assignments.
Overview
The topic of this homework assignment is supervised learning. The first (and only) part is concerned with linear regression. A second part, now retired (though it may be available for extra credit), concerns classification.
CS 100: Studio 8
Simple Linear Regression
November 2, 2022
Instructions
In this week’s studio, you will be learning how to create and interpret a simple linear regression model in R. You will also get some practice with data transformations.
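A minimal sketch of what fitting a simple linear regression on transformed data might look like is given below. It assumes R's built-in `mtcars` data set as a stand-in (the studio's actual data is not shown here), and the choice of a log transformation on both variables is only one example of a data transformation.

```r
# Stand-in data (assumption): the built-in mtcars data set.
# Model log fuel economy as a linear function of log weight.
model <- lm(log(mpg) ~ log(wt), data = mtcars)

# Intercept and slope of the fitted line.
coefs <- coef(model)

# Proportion of variance in log(mpg) explained by log(wt).
r_squared <- summary(model)$r.squared
```

With both variables on the log scale, the slope is read as the approximate percent change in `mpg` associated with a one percent change in `wt`.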
Upon completion of all tasks, a TA will give you credit for today's studio. If you do not manage to complete all the assigned work during the studio period, do not worry. You can continue to work on this assignment until Sunday, November 6 at 7 PM. Come by TA hours any time before then to show us your completed work and get credit for today's studio.
Objectives
By the end of this studio, you will know:
CS 100 Homework #2
Hypothesis Testing
Due: November 5, 2022 at 10 pm
Instructions
Please submit to Gradescope your R Markdown (.Rmd) file. Please also knit your R markdown file, and submit the resulting PDF or HTML file as well (PDF preferred).
Be sure to follow the CS100 course collaboration policy as you work on this and all CS100 assignments.
Objectives
By the end of this homework, you will know:
Objectives
By the end of this section, you will be able to:
use the stringr and lubridate libraries
save a cleaned data set with a date tag and re-load it
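The last objective above, saving a cleaned data set with a date tag and re-loading it, could look roughly like this. The sketch uses only base R (`Sys.Date`, `saveRDS`, `readRDS`) rather than lubridate, and the data frame, file name pattern, and temporary directory are hypothetical choices made for illustration.

```r
# Hypothetical cleaned data set (made up for this example).
cleaned <- data.frame(
  id = 1:3,
  joined = as.Date(c("2022-09-07", "2022-09-21", "2022-10-05"))
)

# Build a file name that carries today's date, e.g. cleaned_YYYY-MM-DD.rds.
date_tag <- format(Sys.Date(), "%Y-%m-%d")
path <- file.path(tempdir(), paste0("cleaned_", date_tag, ".rds"))

# Save the data set, then read it back in.
saveRDS(cleaned, path)
reloaded <- readRDS(path)
```

Tagging the file name with the date makes it easy to tell which version of the cleaned data an analysis was run on.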
Setup
In order to proceed, you will need to install a few R libraries for data cleaning. Open RStudio, and then run the following commands in the console:
CS100 Mini-Project
Fun with EDA!
Due: October 18, 2022, at 10 pm
Instructions
This is a pair programming assignment. Please refresh your memory about pair programming here.
As the name suggests, pair programming requires that you work with a partner. If you have trouble finding a partner, please post on EdStem that you are looking for a match. If you cannot find one that way, please contact the TAs for help.
Handin instructions: Each submission should include both your code, as an R markdown (.Rmd) file---suppressing code, or not, as appropriate---as well as the resulting PDF, after running Knit PDF on the R markdown file. Partners should submit their mini-projects as a group, using Gradescope’s group submission feature. After uploading your files and pressing submit on Gradescope, press either the “Group Members” or “Add Group Member” button to add your partner to the submission.
CS 100: Studio 4
Programming Practice
October 5, 2022
Instructions
During today's studio, you will be practicing some programming fundamentals. Please write all of your code in an R script (not in an R markdown file, as you usually do).
Upon completion of all tasks, a TA will give you credit for today's studio.
Objectives
By the end of this studio, you will be able to:
CS 100: Studio 2
Introduction to R
September 21, 2022
Instructions
During this week’s studio, you will be learning how to use the dplyr library, and to produce plots using the mosaic library. You will be writing your code in R Markdown, and you will also be using RStudio to interface with R.
So far, you have been conducting your data analyses using spreadsheets. While R is a different and more powerful tool, the concepts you have learned will continue to apply.
Upon completion of all tasks, a TA will give you credit for today’s studio.
CS 100: Studio 3
Introduction to Visualization in R
September 28, 2022
Instructions
During today’s studio, you will be creating data visualizations in R. Please write all of your code, and answers to the questions, in an R markdown document.
Upon completion of all tasks, a TA will give you credit for today’s studio.
Objectives
By the end of this studio, you will know:
CS 100: Studio 7
Confidence Intervals and Hypothesis Testing: Two Sides of the Same Coin
October 26, 2022
Introduction
The topic of this week’s studio is approval ratings. In the first part, you will investigate Presidential approval ratings, and in the second part, those of Congress. These ratings summarize polls conducted among U.S. adults. Your job will be to investigate the statistical significance of the polls’ results, using confidence intervals and hypothesis testing---two sides of the same coin.
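The "two sides of the same coin" idea can be sketched with a single poll. The numbers below are made up, and the sketch assumes a normal approximation with the same standard error used for both the interval and the test (a Wald-style test); the studio may use a different procedure.

```r
# Hypothetical poll (made-up numbers): 440 of 1000 respondents approve.
n <- 1000
approve <- 440
p_hat <- approve / n
p0 <- 0.5        # null hypothesis: the true approval rate is 50%
alpha <- 0.05

# Standard error of the sample proportion and the two-sided critical value.
se <- sqrt(p_hat * (1 - p_hat) / n)
z_crit <- qnorm(1 - alpha / 2)

# 95% confidence interval for the approval rate.
ci <- c(lower = p_hat - z_crit * se, upper = p_hat + z_crit * se)

# Two-sided test of the null hypothesis using the same standard error.
z <- (p_hat - p0) / se
p_value <- 2 * pnorm(-abs(z))

# The two procedures agree: the test rejects exactly when p0 is outside the CI.
reject_by_test <- p_value < alpha
reject_by_ci <- p0 < ci[["lower"]] || p0 > ci[["upper"]]
```

Because both calculations rest on the same standard error and critical value, rejecting the null hypothesis at level `alpha` and finding `p0` outside the corresponding confidence interval are the same event.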
Upon completion of all tasks, a TA will give you credit for today's studio. If you do not manage to complete all the assigned work during the studio period, do not worry. You can continue to work on this assignment until Sunday, October 30 at 7 PM. Come by TA hours any time before then to show us your completed work and get credit for today's studio.
Objectives
By the end of this studio you will be able to:
CS 100: Studio 0
Welcome!
September 7, 2022
Instructions:
Welcome to your first CSCI 0100 studio! During this studio, you will take care of various administrative necessities, to get you ready for an exciting and productive semester. For example, you will begin by signing the course Collaboration Policy. You will also get started with Markdown, a tool for creating documents that can be converted into web pages, PDFs, etc.
You are invited to complete all the rest of your studios in a computer lab in the CIT, with your fellow classmates. For this studio only, you can/should do most of your work independently, and then visit the course TAs during office hours to introduce yourselves to them, and get credit for completing this studio. They can also help you with the various studio tasks, as necessary.
To find out when TA hours are held each week, you can check the TA hours here.