10/12/2021 (Meeting Minutes)
===
:::info
- **Time** 3:40- 4:50 PM
- **Agenda**
- Revisit and discuss potential issues
- Obsidian notes.
:::
:books: Discussion Notes
---
- Discussed issues with school-level data on lead samples (the data appear to have multiple entries per school taken at different times).
- Scope of study will focus primarily on public schools only.
- Sample will include K-12
- Main historically redlined cities: San Diego, Los Angeles, Fresno, Sacramento, Stockton, Oakland, and San Francisco
- Look at differences within cities and then compare across other redlined cities.
:dart: Task Accomplished
---
- Consensus on Obsidian notes repository
- Creating new shared drive
- Separate folder for Literature Review
- Schedule next meeting with Serge
:question: Upcoming Meeting’s Questions
---
- Since the samples for each school are collected from different locations and at different times during a particular day, should we aggregate the data?
- How to aggregate the data within schools?
- Regression and Geospatial Analysis?
- Do we look at educational outcomes?
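
One possible way to aggregate within schools is sketched below with pandas. It assumes a long table with one row per sample; the column names (`school_id`, `sample_date`, `lead_ppb`) and the values are hypothetical, not taken from the actual lead data. Keeping the maximum, the median, and the sample counts side by side leaves the choice of summary open for later analysis.

```python
import pandas as pd

# Hypothetical long-format sample table: one row per fixture sample.
# Column names and values are illustrative, not taken from the real data.
samples = pd.DataFrame(
    {
        "school_id": ["A", "A", "A", "B", "B"],
        "sample_date": [
            "2018-03-01", "2018-03-01", "2018-09-15",
            "2018-04-02", "2018-04-02",
        ],
        "lead_ppb": [3.0, 12.0, 6.0, 1.0, 2.0],
    }
)
samples["sample_date"] = pd.to_datetime(samples["sample_date"])

# One row per school: worst-case reading, typical reading, and how much
# sampling went into each summary.
by_school = (
    samples.groupby("school_id")
    .agg(
        max_ppb=("lead_ppb", "max"),
        median_ppb=("lead_ppb", "median"),
        n_samples=("lead_ppb", "size"),
        n_sampling_days=("sample_date", "nunique"),
    )
    .reset_index()
)
```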
----
10/21/2021 (Meeting Minutes)
===
:::info
- **Time** 3:40- 4:50 PM
- **Agenda**
- Progress report
- Review the lead sample data.
:::
:books: Discussion Notes
----
- Explored potential exploratory research questions for lead sample data.
- Which schools are exempt from testing? Is there any sign that high-SES schools are more likely to get an exemption (mostly schools built after 2010, those that got tested outside, and those that have their own water supply)?
- What is the frequency of sampling in most schools?
- Do we see patterns of action level exceedance (ALE) clustered in certain school points, or do we have more schools with lead problems?
- Do we see patterns/hotspots of schools with ALE concentrated in redlined neighborhoods?
:dart: Task Accomplished
--
- Scheduled group meeting to discuss the proposal and other issues.
- Setting a realistic timeline for the project.
- Scheduled next meeting with Serge.
- Got a copy of the HOLC maps for California.
:question: Upcoming Meeting’s Questions
---
- Seek comments on the project proposal.
- Highlight next tasks for group meetings.
- Discuss the geocoding of school names as school points
---
10/27/2021 (Meeting Minutes)
===
:::info
- **Time** 6:20- 7:00 PM
- **Agenda**
- Progress Report
- Points on data cleaning
:::
:books: Discussion Notes
----
- HOLC data is up on the repo.
- Capstone proposal is to be reviewed and sent out at the end of this week.
- Potential issues with data cleaning.
:dart: Task Accomplished
--
- HOLC data is up on the repo.
- Emailed the California State Water Resources Control Board (SWRCB) regarding the school shapefile.
:question: Upcoming Meeting’s Questions
---
- Feedback/comments on the **[Project Proposal](https://hackmd.io/yIT3Z06LTFSP-7P4KUr3Lg?view)**.
- Discuss the geocoding of school names as school points. Can we have a session (2 slots on 11/11) for geocoding the data and other data inquiries?
- How do we pull out data for only HOLC cities from school addresses?
> [name=WN] ==**[geosnap](https://github.com/spatialucr/geosnap/blob/master/examples/02_creating_community_datasets.ipynb)** community feature ?==
> [color=#bf0361]
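
A rough first pass at pulling out only the HOLC cities could filter on the city named in each address, as sketched below. It assumes addresses look like `street, city, CA zip`; the helper `city_from_address`, the column names, and the sample rows are all hypothetical. A city-name filter only approximates the HOLC map extent, so a spatial join against the HOLC polygons would be more precise once school coordinates are available.

```python
import pandas as pd

# The seven cities named in the 10/12 notes.
HOLC_CITIES = {
    "san diego", "los angeles", "fresno", "sacramento",
    "stockton", "oakland", "san francisco",
}


def city_from_address(address: str) -> str:
    """Return the lower-cased city, assuming 'street, city, CA zip' formatting."""
    parts = [p.strip() for p in address.split(",")]
    # With the assumed format the city is the second-to-last component.
    return parts[-2].lower() if len(parts) >= 3 else ""


# Hypothetical school rows for illustration only.
schools = pd.DataFrame(
    {
        "school_name": ["Example Elementary", "Sample High", "Demo Middle"],
        "address": [
            "100 Main St, Fresno, CA 93701",
            "200 Oak Ave, Riverside, CA 92501",
            "300 Pine Rd, Oakland, CA 94601",
        ],
    }
)
schools["city"] = schools["address"].map(city_from_address)
holc_schools = schools[schools["city"].isin(HOLC_CITIES)]
```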
#### Data Limitations
- Outliers in the data: in some school samples the lead concentration is > 1000 ppb. Misleading results are reported from fixtures that have been shut down for several months. [[1]](https://edsource.org/2018/gaps-in-california-law-requiring-schools-to-test-for-lead-could-leave-children-at-risk/602756)
- Exact values under 5 ppb are not reported.
- The school names column in the data has 140 more observations than the school address column.
- Missing observations in the sample data?
- Unique ID can help solve this issue?
- Refer to the **[Notebook](https://github.com/pcarl006/Capstone-Project/blob/Final/New_Lead_Data.ipynb)**.
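
The name/address mismatch and the unique ID idea could be checked along the lines of the sketch below. The column names (`school_name`, `school_address`) and the rows are hypothetical. The normalized-name key is only a stopgap assumption; an official school code, if the data carries one, would be a safer ID.

```python
import pandas as pd

# Hypothetical records: names and addresses are made up for illustration.
records = pd.DataFrame(
    {
        "school_name": [
            "Example Elementary", "example elementary ",
            "Sample High", "Demo Middle",
        ],
        "school_address": ["100 Main St", "100 Main St", None, "300 Pine Rd"],
    }
)

# Non-null counts per column show where names outnumber addresses.
non_null = records.notna().sum()
missing_address = records[records["school_address"].isna()]

# A normalized name gives a simple key that survives case and spacing
# differences between otherwise identical entries.
records["name_key"] = (
    records["school_name"]
    .str.lower()
    .str.replace(r"\s+", " ", regex=True)
    .str.strip()
)

# Integer ID per distinct normalized name.
records["school_uid"] = records.groupby("name_key").ngroup()
```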
---
10/28/2021 (Meeting Minutes)
===
:::info
- **Time** 3:45- 4:00 PM
- **Agenda**
- Meeting with the Mentor
- Progress Report
- Capstone Proposal Review
- Data related questions
- Additional timeslots for geocoding
:::
:books: Discussion Notes
----
- Add more research questions, such as comparing HOLC versus non-HOLC areas that have the same racial composition.
- Potential to get shape files from **[SWRCB](https://www.waterboards.ca.gov/)**.
> [name=Haley] Sent an email earlier this week. No response yet from SWRCB.
> [color=#bf0361]
- Discussed the issues with the lead data.
- Finalize next meeting with advisor.
:dart: Task Accomplished
--
- Received feedback on the proposal.
- Progress report.
- Updated the **[Project Proposal](https://hackmd.io/yIT3Z06LTFSP-7P4KUr3Lg?view)**.
---
11/10/2021 (Meeting Minutes)
===
:::info
- **Time** 6:30 PM
- **Agenda**
- Group Meeting Software Installation
:::
:dart: Task Accomplished
--
- Installed `geosnap`
11/16/2021 (Meeting Minutes)
===
:::info
- **Time** 12 PM
- **Agenda**
- Group Meeting Geocoding
:::
:books: Discussion Notes
----
- Split up the data and ran the geocoding code.
:dart: Task Accomplished
--
- Geocoded using two different methods
11/25/2021 (Meeting Minutes)
===
:::info
- **Time** 4 PM
- **Agenda**
- Group Meeting Notebooks
:::
:books: Discussion Notes
----
- Explained example for Riverside
- Geocoding works, but we don't get the entire sample back; some values were not geocoded
- Subsample data for schools above 5 ppb
- Map it on census data along with school points
:dart: Task Accomplished
--
- Next Meeting agenda
12/02/2021 (Meeting Minutes)
===
:::info
- **Time** 3:30 PM
- **Agenda**
- Meeting with the Mentor
:::
:books: Discussion Notes
----
- Comments on the project progress
- We will be comparing HOLC with non-HOLC cities in California that have similar underlying characteristics.
- We should check the frequency of testing done at each location (idea: use bubble or proportional choropleth maps).
- We might find some underlying characteristics with the help of frequency: do schools in richer areas, metros, or white neighborhoods get more tests done?
- pscode
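
The testing-frequency idea could start from a count of tests per school compared across neighborhood groups, as in the sketch below. The `income_group` label is a hypothetical stand-in for whatever neighborhood characteristic ends up joined to each school, and all values are made up. The left merge keeps schools with no recorded tests, which matters if untested schools are part of the question.

```python
import pandas as pd

# Hypothetical inputs; the income labels stand in for whatever
# neighborhood characteristic is eventually joined to each school.
samples = pd.DataFrame({"school_id": ["A", "A", "A", "A", "B", "C", "C"]})
schools = pd.DataFrame(
    {
        "school_id": ["A", "B", "C", "D"],
        "income_group": ["high", "low", "low", "high"],
    }
)

# Number of tests recorded for each school that appears in the samples.
tests_per_school = (
    samples.groupby("school_id").size().rename("n_tests").reset_index()
)

# Schools with no samples get 0 rather than dropping out.
freq = schools.merge(tests_per_school, on="school_id", how="left")
freq["n_tests"] = freq["n_tests"].fillna(0).astype(int)

# Average number of tests per school within each group.
mean_tests = freq.groupby("income_group")["n_tests"].mean()
```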
:dart: Task Accomplished
--
- Showed the progress so far
- Received feedback and ideas on how to proceed
- Identified next meeting goals
12/09/2021 (Meeting Minutes)
===
:::info
- **Time** 3:30 PM
- **Agenda**
- Data Cleaning
:::
:books: Discussion Notes
----
- Retrieved locations for 80 percent of the schools using NCES data.
- Updated the GitHub repository with additional notebooks
:dart: Task Accomplished
--
- School data have geographic coordinates
- Issues of missing schools not found in the merged data.
- Whether to look at public schools only and exclude private schools since NCES does not have the data.
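
The share of schools that picked up coordinates, and the list of schools that did not, could be tracked with a left merge and pandas' `indicator` flag, as in this sketch. The frames, the `school_name` join key, and the coordinates are hypothetical; the real merge key may differ.

```python
import pandas as pd

# Hypothetical stand-ins for the lead data and an NCES extract.
lead = pd.DataFrame(
    {
        "school_name": [
            "Example Elementary", "Sample High", "Demo Middle",
            "Private Academy", "Test Charter",
        ]
    }
)
nces = pd.DataFrame(
    {
        "school_name": [
            "Example Elementary", "Sample High", "Demo Middle", "Test Charter",
        ],
        "lat": [36.74, 33.95, 37.80, 38.58],
        "lon": [-119.78, -117.39, -122.27, -121.49],
    }
)

# indicator=True adds a _merge column saying where each row was found.
merged = lead.merge(nces, on="school_name", how="left", indicator=True)

# Fraction of lead-data schools that found a match in the NCES extract.
match_rate = (merged["_merge"] == "both").mean()

# Schools present in the lead data but absent from the NCES extract.
unmatched = merged.loc[merged["_merge"] == "left_only", "school_name"].tolist()
```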
01/04/2022 (Meeting Minutes)
===
:::info
- **Time** 3:00 PM
- **Agenda**
- Discuss data issues
:::
:books: Discussion Notes
----
- Comments on the project progress
- Presentation targets for 01/10/2022
- Attempted coding of school addresses
:dart: Task Accomplished
--
- Showed the progress so far
- Set up next meeting with the mentor
- Identified next meeting goals
01/06/2022 (Meeting Minutes)
===
:::info
- **Time** 3:30 PM
- **Agenda**
- Meeting with the Mentor
- Discuss the potential recurring meetings for the quarter on any suitable day.
- Discuss the possibility of reducing the sample to only the school locations we have retrieved so far.
- Gantt Chart for the group
- Progress Report
- HOLC data sprint: ask for time and date
:::
:books: Discussion Notes
----
- To get the maximum value per school, use code similar to `df.groupby(by='schoolid').max()`.
- Reviewed and plotted the data provided by the DDW.
- Discussed other data issues, such as how to get the maximum value and how to merge the two datasets.
- Sorted out issues related to recurring meetings in the coming weeks.
- Defined next meeting tasks, including HOLC mapping and census tracts overlaid with schools.
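
One thing to watch with the `groupby(...).max()` suggestion: it takes the maximum of each column separately, so the date shown next to the highest reading may come from a different sample. The sketch below contrasts it with `idxmax`, which keeps the whole row of the highest reading. Only `schoolid` comes from the notes; the other column names and all values are hypothetical.

```python
import pandas as pd

# Hypothetical samples; for school 1 the highest reading is on the earlier date.
df = pd.DataFrame(
    {
        "schoolid": [1, 1, 2, 2],
        "sample_date": ["2018-06-05", "2018-01-10", "2018-02-01", "2018-02-01"],
        "result_ppb": [4.0, 20.0, 7.0, 2.0],
    }
)

# Column-wise maxima: the date and the result may come from different rows.
col_max = df.groupby(by="schoolid").max()

# Row of the highest reading per school, keeping its own date.
worst = df.loc[df.groupby("schoolid")["result_ppb"].idxmax()].set_index("schoolid")
```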
:dart: Task Accomplished
--
- Report on progress
- Showed the latest notebooks
01/13/2022 (Meeting Minutes)
===
:::info
- **Time** 1 PM
- **Agenda**
- Meeting with the Mentor
- Coding session on combining HOLC map with 2010 census data from `geosnap`
- Ask the advisor for study material/past studies
> **Comparing redlined versus nonredlined areas with similar racial/demographic characteristics.**
:::
:books: Discussion Notes
----
Questions:
We are doing the analysis for 2010 census tracts (CT); do we need to harmonize using `geosnap`?
### Road Blocks
- holc-narsc repo: `make environment` fails to solve the environment.
- Failed to `make notebooks` using the geosnap environment.
- Individual notebooks 100, 150, and 151 work, but the join doesn't work in notebook 155, where we combined the HOLC maps with the 2010 census data.
- The new communities dataset uses areal interpolation to create new communities (neighborhoods).
- A spatial join is recommended for the schools layer and the new community layer.
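
A minimal sketch of the recommended spatial join, using `geopandas.sjoin` on toy geometries. The layers, IDs, and coordinates are hypothetical stand-ins for the school points and the interpolated community polygons. A left join keeps schools that fall outside every polygon, so they can be counted rather than silently dropped.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical community polygons (two unit squares side by side).
communities = gpd.GeoDataFrame(
    {"community_id": ["c1", "c2"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(1, 0), (2, 0), (2, 1), (1, 1)]),
    ],
    crs="EPSG:4326",
)

# Hypothetical school points; the last one falls outside every polygon.
schools = gpd.GeoDataFrame(
    {"school_id": ["A", "B", "C"]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(5, 5)],
    crs="EPSG:4326",
)

# Both layers must share a CRS before joining.
schools = schools.to_crs(communities.crs)

# Left join keeps every school; unmatched schools get a missing community_id.
joined = gpd.sjoin(schools, communities, how="left", predicate="within")
```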
:dart: Task Accomplished
--
- Report on progress
- Showed the latest notebooks
- Discussed the method for doing the redlined (treatment) versus nonredlined (control) analysis.
01/20/2022 (Meeting Minutes)
===
:::info
- **Time** 3:30 PM
- **Agenda**
- Meeting with the Mentor
- Coding session on combining HOLC map with 2010 census data from `geosnap`
- Difference between using `sjoin` to pool the cities and using overlay to create a community.
- The overlay just gives us the intensive and extensive variables.
- How do we combine the redlined cities with the rest of California (is that required for our analysis?)
- `sjoin` for schools?
- How to concatenate the two datasets
- How to do pair matching? Tools/software?
:::
:books: Discussion Notes
----
- Discussed the issues with combining the datasets
- The HOLC cities are defined as counties... update the methodology.
- Discussion on pair matching
- Discussion on regression analysis ....
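
As one possible starting point for pair matching, the sketch below does greedy nearest-neighbor matching without replacement on standardized covariates. This is an illustrative technique, not a method the group has settled on. The tract IDs, the covariates (`pct_nonwhite`, `median_income`), and the values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical tract-level covariates; 'redlined' marks the treatment group.
tracts = pd.DataFrame(
    {
        "tract": ["t1", "t2", "c1", "c2", "c3"],
        "redlined": [1, 1, 0, 0, 0],
        "pct_nonwhite": [0.60, 0.30, 0.58, 0.10, 0.33],
        "median_income": [40.0, 70.0, 42.0, 95.0, 68.0],
    }
)
covs = ["pct_nonwhite", "median_income"]

# Standardize so both covariates contribute on a comparable scale.
z = (tracts[covs] - tracts[covs].mean()) / tracts[covs].std()

treated = tracts.index[tracts["redlined"] == 1]
available = list(tracts.index[tracts["redlined"] == 0])

pairs = []
for i in treated:
    # Euclidean distance from this treated tract to each unused control.
    dists = np.sqrt(((z.loc[available] - z.loc[i]) ** 2).sum(axis=1))
    j = dists.idxmin()
    pairs.append((tracts.at[i, "tract"], tracts.at[j, "tract"]))
    available.remove(j)  # match without replacement
```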
:dart: Task Accomplished
--
- Report on progress
- Showed the latest notebooks
- Fixed issues with the CRS
----
01/27/2022 (Meeting Minutes)
===
:::info
- **Time** 3:30 PM
- **Agenda**
- Meeting with the Mentor
- Progress Report
:::
:books: Discussion Notes
----
- interactive mapping
:dart: Task Accomplished
--
- Report on progress
- Showed the latest notebooks
----