<style type="text/css">
.reveal .slides {
min-height: 100%;
width: 100%;
height: 96mm;
}
.reveal .slides section h1
{
color: #ffefef;
text-shadow:
-1px -1px 0 rgba(230,2,255, 0.7),
2px -2px 0 rgba(230,2,255, 0.7),
-3px 3px 0 rgba(230,2,255, 0.7),
3px 3px 0 rgba(230,2,255, 0.7);
}
.reveal .slides section h3
{
color: #ffefef;
text-shadow:
-1px -1px 0 rgba(230,2,131, 0.7),
1px -1px 0 rgba(230,2,131, 0.7),
-2px 2px 0 rgba(230,2,131, 0.7),
2px 2px 0 rgba(230,2,131, 0.7);
}
.reveal img, .reveal video, .reveal iframe {
width: 100%;
min-width: 100%;
max-height: 100%;
}
html, .reveal {
color: #CFC7BF;
}
ol, ul {
font-size: 3.5rem;
}
.reveal pre code {
display: block;
width:100%;
line-height: 1.5em;
border-radius: 0.9rem;
font-size: 2rem;
padding: 3rem;
overflow: auto;
}
</style>
---
# Final Project
## [Northeastern University](https://www.northeastern.edu/)
27 September 2022
### by Kelsey Packard
---

---
### Objectives
- Explore data warehouses
- Practice using SQL
- Understand how data travels as `.csv` files
- Look for correlation between maks wearing and public health
---
# Mask Implementation during COVID-19
## Did you wear your mask during the Pandemic?
---

---
# SQL queries
---
# The Schema of Table 1
---

---
# The Schema of Table 2
---

---
## I used the following SQL code on BigQuery public data sets
---
```sql
SELECT
m.*,
c.*
FROM
`bigquery-public-data.covid19_nyt.mask_use_by_county` AS m
INNER JOIN
`bigquery-public-data.covid19_nyt.us_counties` AS c
ON
c.county_fips_code = m.county_fips_code;
```
---
## The first query generated over 2 million records so I tried a different one. Basically, to start with I wanted from table c only the name of the county and the name of the state
---
```sql=
SELECT
DISTINCT(c.county_fips_code),
m.*,
c.county,
c.state_name
FROM
`bigquery-public-data.covid19_nyt.mask_use_by_county` AS m
INNER JOIN
`bigquery-public-data.covid19_nyt.us_counties` AS c
ON
c.county_fips_code = m.county_fips_code;
```
---
## I used DISTINCT to only get each county once
---
## Then I uploaded the `.csv` into a Jupyter nootebook
---

---
## Pandas: data exploration ðŸ§
---
<iframe title="Embedded cell output" src="https://embed.deepnote.com/5e0e7109-c2dd-4701-a62c-9c6f35b1ab25/b16620a7-1450-444b-827f-6599d328bb37/f896aeb6a0304397b994d3a37be93145?height=134.17897415161133" height="134.17897415161133" width="500"/>
---
## The first graph 👇
---
<iframe title="Embedded cell output" src="https://embed.deepnote.com/b582b3d0-399f-4dee-88a7-309ce0171951/b16620a7-1450-444b-827f-6599d328bb37/8f66c56bef1a4c6fb4f86f368d027716?height=600" height="600" width="500"/>
---
## The Second
---
<iframe title="Embedded cell output" src="https://embed.deepnote.com/b582b3d0-399f-4dee-88a7-309ce0171951/b16620a7-1450-444b-827f-6599d328bb37/3d1b0ad0844442af988b4cc97a4cecad?height=600" height="600" width="500"/>
---
# The Third
---
<iframe title="Embedded cell output" src="https://embed.deepnote.com/5e0e7109-c2dd-4701-a62c-9c6f35b1ab25/b16620a7-1450-444b-827f-6599d328bb37/3388c81f1a5f4f82b19461d221ba31c2?height=599.9744262695312" height="599.9744262695312" width="500"/>
---
## Some data cleaning
---

---
## Choropleth without data cleaning
---
```
import plotly.express as px
fig = px.choropleth(mask_use_by_county_df,
geojson=counties,
locations='fips',
color='% always',
color_continuous_scale="Viridis",
range_color=(0, 12),
scope="usa",
labels={'county':'county name'}
)
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()
```
---
<iframe title="Embedded cell output" src="https://embed.deepnote.com/5e0e7109-c2dd-4701-a62c-9c6f35b1ab25/b16620a7-1450-444b-827f-6599d328bb37/f46d73326ccc4d7e8b253b86e3f9ae5d?height=599.9744262695312" height="599.9744262695312" width="500"/>
---
The graph (it is called a choropleth) above has California broken. I Googled aboout FIPS format and found that it had to be formatted as a 5 digit string which is what the code below does.
---
## **Data Cleaning and Manipulation**
```
mask_use_by_county_df['county_fips_code'].apply(lambda x: str(x) if len(str(x))>4 else '0'+str(x))
```
---
<iframe title="Embedded cell output" src="https://embed.deepnote.com/5e0e7109-c2dd-4701-a62c-9c6f35b1ab25/b16620a7-1450-444b-827f-6599d328bb37/ef8da8e82c7a4fdeb5b597c2ecbba203?height=599.9957275390625" height="599.9957275390625" width="500"/>
---
## Other types of data viz
---
<iframe title="Embedded cell output" src="https://embed.deepnote.com/5e0e7109-c2dd-4701-a62c-9c6f35b1ab25/b16620a7-1450-444b-827f-6599d328bb37/d2f9038010d0489c9d390932f2c979c5?height=629.9744262695312" height="629.9744262695312" width="500"/>
---
<iframe title="Embedded cell output" src="https://embed.deepnote.com/5e0e7109-c2dd-4701-a62c-9c6f35b1ab25/b16620a7-1450-444b-827f-6599d328bb37/63ca423e7f4d4318872a3196a61aaaea?height=629.9744262695312" height="629.9744262695312" width="500"/>
---

---
<iframe title="Embedded cell output" src="https://embed.deepnote.com/5e0e7109-c2dd-4701-a62c-9c6f35b1ab25/b16620a7-1450-444b-827f-6599d328bb37/86423ae5c1da42eb8aa719187909742b?height=599.9744262695312" height="599.9744262695312" width="500"/>
---
<iframe title="Embedded cell output" src="https://embed.deepnote.com/5e0e7109-c2dd-4701-a62c-9c6f35b1ab25/b16620a7-1450-444b-827f-6599d328bb37/40fbbec1d7ed44e1ada634285a101a08?height=599.9744262695312" height="599.9744262695312" width="500"/>
---
<iframe title="Embedded cell output" src="https://embed.deepnote.com/5e0e7109-c2dd-4701-a62c-9c6f35b1ab25/b16620a7-1450-444b-827f-6599d328bb37/581ee4e9e09048df9a06ceb4b9f944cd?height=599.9744262695312" height="599.9744262695312" width="500"/>
---
<iframe title="Embedded cell output" src="https://embed.deepnote.com/5e0e7109-c2dd-4701-a62c-9c6f35b1ab25/b16620a7-1450-444b-827f-6599d328bb37/ab06081dbd8340388665a84a3e2cf82f?height=599.9744262695312" height="599.9744262695312" width="500"/>
---
<iframe title="Embedded cell output" src="https://embed.deepnote.com/5e0e7109-c2dd-4701-a62c-9c6f35b1ab25/b16620a7-1450-444b-827f-6599d328bb37/84c3cedf39804a8f8c595c287d8f86b0?height=683.9943237304688" height="683.9943237304688" width="500"/>
---
{"metaMigratedAt":"2023-06-17T10:15:15.659Z","metaMigratedFrom":"YAML","title":"Kelsey Packard presentation","breaks":true,"slideOptions":"{\"theme\":\"dark\",\"transition\":\"fade\"}","contributors":"[{\"id\":\"28161843-f826-46bb-ab16-4f11acb3bfad\",\"add\":8624,\"del\":1006},{\"id\":null,\"add\":5097,\"del\":5744}]"}