# Coding for DA
This is a coding course designed to be taught together with the textbook Békés, G. and G. Kézdi: Data Analysis for Business, Economics, and Policy, Cambridge UP, 2021.
The course structure is language neutral; the class material is prepared in Stata, R, and Python.
Each class is designed as a 90-120 minute lab. Because the languages differ in syntactic complexity, the actual length of a class will vary by language.
***This is the shared prepping doc***
-----
## Part 1: Basics to EDA
### 1/1 Setup, intro to the language
* How the IDE works
* elementary operations
### 1/2 Basics of data wrangling
* IO: load, save, look at data
* filter on rows, select columns
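
As a rough illustration of the load / look / filter / select / save cycle above, here is a minimal sketch in Python. It uses only the standard library so that it is self-contained; the toy data, the column names (`hotel_id`, `city`, `price`, `stars`), and the helper functions are all made up for illustration and are not taken from the class material.

```python
import csv
import io

# Hypothetical toy data standing in for a CSV file on disk.
RAW = """hotel_id,city,price,stars
1,Vienna,120,4
2,Vienna,85,3
3,London,200,5
4,Vienna,150,4
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting numeric columns."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["hotel_id"] = int(row["hotel_id"])
        row["price"] = float(row["price"])
        row["stars"] = int(row["stars"])
        rows.append(row)
    return rows


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [r for r in rows if predicate(r)]


def select_columns(rows, columns):
    """Keep only the named columns in each row."""
    return [{c: r[c] for c in columns} for r in rows]


def save_rows(rows, columns):
    """Write rows back out as CSV text (a file handle would work the same way)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = load_rows(RAW)
vienna = filter_rows(rows, lambda r: r["city"] == "Vienna" and r["stars"] >= 4)
subset = select_columns(vienna, ["hotel_id", "price"])
print(subset[:2])  # look at the first rows
print(save_rows(subset, ["hotel_id", "price"]))
```

In a real class a data-frame library would typically replace these helpers; the point here is only the order of the steps.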
### 1/3 Doing graphs 1
* geometry of graphics
* bar graphs (horizontal, vertical)
* histograms
* scatterplot + line
### 1/4 Essential coding
* language-specific bits
* R: tidyverse
* Stata: do files
### 1/6 Doing graphs 2
* Setting the scaffolding
* labels, legends
* axes
* Visuals, color schemes/palette
* Save as
---
## Part 2: Running regressions
### 2/1 OLS 1
* scatterplots
* non-parametric regressions (splines)
* univariate regressions
* regression tables (basic)
* standard errors
* functional form (log, squared)
**Relates to**
* Chapter [07](), [08](), [09]()
* Case study: [hotels...]()
**Coding materials**
*Stata:* [da-stata-class-part2-class1]()
*R:* [da-r-class-part2-class1]()
*Python:* [da-python-class-part2-class1]()
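
To make the univariate regression, standard error, and functional form items in this class concrete, here is a minimal sketch in plain Python. The `distance` and `price` values are invented, `ols_univariate` is a hypothetical helper, and the standard error is the classical homoskedastic one, which is an assumption of this sketch rather than a statement about what the class material uses.

```python
import math


def ols_univariate(x, y):
    """Fit y = alpha + beta * x by OLS; return (alpha, beta, se_beta)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    residuals = [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]
    # Classical (homoskedastic) standard error of the slope: an assumption here.
    sigma2 = sum(e ** 2 for e in residuals) / (n - 2)
    se_beta = math.sqrt(sigma2 / sxx)
    return alpha, beta, se_beta


# Made-up data: distance from the city center and nightly price.
distance = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
price = [180.0, 150.0, 140.0, 110.0, 95.0, 80.0]

# Level-level regression: price on distance.
level = ols_univariate(distance, price)

# Log-level regression: ln(price) on distance.
log_level = ols_univariate(distance, [math.log(p) for p in price])

for name, (alpha, beta, se_beta) in [("level", level), ("log-level", log_level)]:
    print(f"{name}: alpha={alpha:.3f} beta={beta:.3f} se(beta)={se_beta:.3f}")
```

In the log-level version the slope is read as an approximate relative change in price per unit of distance, which is the reason for switching functional form.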
### 2/2 OLS 2
* multivariate regressions
* interactions
* regression tables (advanced)
* predicted values
* yhat-y plot
* coeff plot
### 2/3 Essential coding II.
* more advanced, useful coding, language specific
* loops
* functions (R, Python); local/global (Stata)
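
A small Python sketch of the loop and function items above: one function that summarizes a variable, applied in a loop over several variables. The variable names and values are made up for illustration.

```python
import statistics


def summarize(values):
    """Return the number of observations, mean, and sample standard deviation."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
    }


# Hypothetical variables; in class these would come from a loaded dataset.
data = {
    "price": [120.0, 85.0, 200.0, 150.0],
    "distance": [0.5, 2.0, 1.0, 3.5],
}

# Apply the same function to every variable instead of copy-pasting code.
summary = {}
for name, values in data.items():
    summary[name] = summarize(values)
    print(name, summary[name])
```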
### 2/4 Probability regression
* LPM
* logit, probit
* AME
* predicted probabilities
* calibration
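
The calibration item above can be sketched as a simple binned comparison of predicted probabilities with observed outcome shares. This is a minimal illustration in plain Python: the predicted probabilities and outcomes are invented, `calibration_table` is a hypothetical helper, and equal-width bins are one possible choice among several.

```python
def calibration_table(pred_probs, outcomes, n_bins=5):
    """Group observations into equal-width probability bins and compare the
    mean predicted probability with the observed share of y = 1 in each bin.

    Returns a list of (bin_index, count, mean_predicted, share_actual).
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(pred_probs, outcomes):
        # min() keeps p == 1.0 in the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, y))

    table = []
    for index, members in enumerate(bins):
        if not members:
            continue  # skip empty bins
        mean_pred = sum(p for p, _ in members) / len(members)
        share_actual = sum(y for _, y in members) / len(members)
        table.append((index, len(members), mean_pred, share_actual))
    return table


# Hypothetical predicted probabilities (e.g. from a logit) and binary outcomes.
pred_probs = [0.1, 0.15, 0.3, 0.35, 0.6, 0.65, 0.9, 0.95]
outcomes = [0, 0, 0, 1, 1, 0, 1, 1]

for row in calibration_table(pred_probs, outcomes, n_bins=4):
    print(row)
```

A well-calibrated model has the mean predicted probability close to the observed share in every bin; plotting one against the other gives the usual calibration curve.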
### 2/5 Time series basics to regression
* Working with time series data, frequency
* Time series graphs
* Serial correlation
* Lags and leads
* TS regressions
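
For the lags, leads, and serial correlation items above, a minimal sketch in plain Python follows. The helpers `lag`, `lead`, and `first_order_autocorr` are hypothetical names, the series is made up, and serial correlation is measured here as the Pearson correlation between the series and its first lag, which is one common definition.

```python
import math


def lag(series, k=1):
    """Element t holds the value from k periods earlier (None where unavailable).

    Assumes series is a list and 0 <= k <= len(series).
    """
    return [None] * k + series[: len(series) - k]


def lead(series, k=1):
    """Element t holds the value from k periods later (None where unavailable)."""
    return series[k:] + [None] * k


def first_order_autocorr(series):
    """Pearson correlation between y_t and y_{t-1}."""
    pairs = [(y, y_prev) for y, y_prev in zip(series, lag(series)) if y_prev is not None]
    current = [p[0] for p in pairs]
    previous = [p[1] for p in pairs]
    mean_c = sum(current) / len(current)
    mean_p = sum(previous) / len(previous)
    cov = sum((c - mean_c) * (p - mean_p) for c, p in pairs)
    var_c = sum((c - mean_c) ** 2 for c in current)
    var_p = sum((p - mean_p) ** 2 for p in previous)
    return cov / math.sqrt(var_c * var_p)


# Made-up monthly series.
sales = [10.0, 12.0, 11.0, 14.0, 15.0, 13.0]
print(lag(sales))
print(lead(sales))
print(first_order_autocorr(sales))
```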
---
## Part 3: Prediction and Machine Learning
### 3/1 Predictions, CV
### 3/2 LASSO, elastic net
### 3/3 CART
### 3/4 RF, Boosting
### 3/5 Classification
### 3/6 Time series prediction
---
## Part 4: Causal inference
### 4/1 Advanced methods 1
* matching (exact, coarsened, NN)
* IV, RDD (?)
### 4/2 Difference in differences
### 4/3 Panel data methods
* working with xsec-ts panel data (language specific)
* POLS, FE
### 4/4 Advanced panel: event study, synth