---
tags: metrics, core, group
---
$$
% My definitions
\def\ve{{\varepsilon}}
\def\dd{{\text{ d}}}
\newcommand{\dif}[2]{\frac{d #1}{d #2}} % for derivatives
\newcommand{\pd}[2]{\frac{\partial #1}{\partial #2}} % for partial derivatives
\def\R{\text{R}}
\def\E{\mathbb{E}}
$$
# Q1
Let a random variable $X$ have the following distribution:
$$
P(X=\frac{1}{4})= \frac{1}{4}, P(X=\frac{1}{2})=\frac{1}{2}, P(X=\frac{3}{4})=\frac{1}{4}
$$
Suppose $Y$ is a Bernoulli random variable and the joint distribution of $X$ and $Y$ satisfies the following condition:
$$\E[Y |X] = X^2$$
Calculate:
(a) $\E[Y ]$.
(b) $\E[XY ]$.
(c ) $\E[ \frac{Y}{X}]$.
(d) $Var[Y]$.
(e) $Var[Y|X]$.
(f) $\E[X|Y]$.
## Answer
### a
By the law of iterated expectations (LIE),
$$
\E[Y] = \E[\E[Y|X]] = \E[X^2] = \frac{1}{4}\cdot\frac{1}{16} + \frac{1}{2}\cdot\frac{1}{4} + \frac{1}{4}\cdot\frac{9}{16} = \frac{9}{32}.
$$
### b
By LIE,
$$
\E[XY] = \E[\E[XY|X]] = \E[X\E[Y|X]] = \E[X^3] = \frac{1}{4}\cdot\frac{1}{64} + \frac{1}{2}\cdot\frac{1}{8} + \frac{1}{4}\cdot\frac{27}{64} = \frac{11}{64}.
$$
### c
By LIE,
$$
\E[\frac{Y}{X}] = \E[\E[\frac{Y}{X}|X]] = \E[\frac{1}{X}\E[Y|X]] = \E[X] = \frac{1}{2}.
$$
### d
Since $Y$ is a Bernoulli random variable with $p=P(Y=1)$, we have $Var[Y]=p(1-p)$. Moreover,
$$
p = P(Y=1) = \E[Y] = \frac{9}{32} \implies Var[Y]= \frac{9}{32} \cdot \frac{23}{32} = \frac{207}{1024}.
$$
### e
$$
Var[Y|X] = \E[Y^2|X] - (\E[Y|X])^2.
$$
Since $Y$ is a Bernoulli random variable, its outcomes are still $\{0,1\}$ conditional on $X$; in other words, $Y|X$ is a Bernoulli random variable with a different parameter $p(x)$. Moreover, the event $Y^2=1$ is the same event as $Y=1$, and the event $Y^2=0$ is the same event as $Y=0$. Hence,
$$
\E[Y^2|X=x] = \E[Y|X=x]
$$
Hence,
$$
Var[Y|X] = X^2 - (X^2)^2 = X^2 - X^4.
$$
### f
We can use Bayes' rule.
Let $p(x) = P(Y=1|X=x) = \E[Y|X=x]$ be the parameter of the Bernoulli r.v. $Y|X=x$. Then
$$
\E[Y|X=\frac{1}{4}] = \frac{1}{16} \implies p(\frac{1}{4})= \frac{1}{16}\\
\E[Y|X=\frac{1}{2}] = \frac{1}{4} \implies p(\frac{1}{2})= \frac{1}{4}\\
\E[Y|X=\frac{3}{4}] = \frac{9}{16} \implies p(\frac{3}{4})= \frac{9}{16}
$$
We already know $P(Y=1)= \E[Y] = \frac{9}{32}$. By Bayes' rule, $P(X=x|Y=1) = \frac{p(x) P(X=x)}{P(Y=1)}$, so
$$
P(X=\frac{1}{4}|Y=1) = \frac{\frac{1}{64}}{\frac{9}{32}} = \frac{1}{18}\\
P(X=\frac{1}{2}|Y=1) = \frac{\frac{8}{64}}{\frac{9}{32}} = \frac{4}{9}\\
P(X=\frac{3}{4}|Y=1) = \frac{\frac{9}{64}}{\frac{9}{32}} = \frac{1}{2}
$$
Hence,
$$
\E[X|Y=1]= \frac{1}{4} \cdot \frac{1}{18} + \frac{1}{2} \cdot \frac{4}{9} + \frac{3}{4} \cdot \frac{1}{2}= \frac{11}{18}
$$
We can also derive $\E[X|Y=0]$ in the same way.
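The Bayes-rule computation can be reproduced exactly; a minimal sketch where the prior over $X$ and $p(x)=x^2$ come from the problem and the rest is bookkeeping:
```python
from fractions import Fraction as F

prior = {F(1, 4): F(1, 4), F(1, 2): F(1, 2), F(3, 4): F(1, 4)}  # P(X = x)
p = {x: x**2 for x in prior}                                    # P(Y=1 | X=x) = x^2

P_Y1 = sum(prior[x] * p[x] for x in prior)                      # P(Y=1) = 9/32
posterior = {x: prior[x] * p[x] / P_Y1 for x in prior}          # P(X=x | Y=1), Bayes' rule
E_X_given_Y1 = sum(x * q for x, q in posterior.items())

print(posterior)       # {1/4: 1/18, 1/2: 4/9, 3/4: 1/2}
print(E_X_given_Y1)    # 11/18
```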
# Q2
Consider the simple regression model
$$Y_i = \beta_0 + \beta_1 X_i + u_i$$
and let $Z_i$ be a dummy instrumental variable for $X_i$, where $E(u|Z) = 0.$
(a) Show that the IV estimator $\hat{\beta}_1$ can be written as
$$
\hat{\beta}_1 = \frac{\bar{Y}_1-\bar{Y}_0}{\bar{X}_1-\bar{X}_0},
$$
where $\bar{Y}_0$ and $\bar{X}_0$ are the sample averages of $Y_i$ and $X_i$ over the part of the sample with $Z_i= 0$, and where $\bar{Y}_1$ and $\bar{X}_1$ are the sample averages of $Y_i$ and $X_i$ over the part of the sample with $Z_i = 1$.
(b) Is it a consistent estimator of $\beta_1$? Why?
## Answer
### a
:::info
Derivation from scratch
:::
$$
Z'=
\begin{pmatrix}
1 & \cdots & 1 \\
z_1 & \cdots & z_n
\end{pmatrix},
X=
\begin{pmatrix}
1 & x_1 \\
\vdots & \vdots \\
1 & x_n
\end{pmatrix},
Y=
\begin{pmatrix}
y_1 \\
\vdots \\
y_n
\end{pmatrix}
$$
Hence,
$$
Z'X= \begin{pmatrix}
n & \sum_{i=1}^n x_i \\
\sum_{i=1}^n z_i & \sum_{i=1}^n z_i x_i
\end{pmatrix},
Z'Y= \begin{pmatrix}
\sum_{i=1}^n y_i\\
\sum_{i=1}^n z_i y_i
\end{pmatrix} = \begin{pmatrix}
n \E_n[y]\\
n\E_n[zy]
\end{pmatrix}
$$
Also,
$$
(Z'X)^{-1} = \frac{1}{n \sum_{i=1}^n z_i x_i - \sum_{i=1}^n x_i \sum_{i=1}^n z_i} \begin{pmatrix}
\sum_{i=1}^n z_i x_i & -\sum_{i=1}^n x_i \\
-\sum_{i=1}^n z_i & n
\end{pmatrix}\\
= \frac{1}{n^2 \E_n[zx] - n^2 \E_n[x] \E_n[z]} \begin{pmatrix}
n \E_n[zx] & -n \E_n[x] \\
-n \E_n[z] & n
\end{pmatrix}\\
$$
Thus,
$$
(Z'X)^{-1} Z'Y = \frac{1}{n \sum_{i=1}^n z_i x_i - \sum_{i=1}^n x_i \sum_{i=1}^n z_i} \begin{pmatrix}
\sum_{i=1}^n z_i x_i \sum_{i=1}^n y_i - \sum_{i=1}^n x_i \sum_{i=1}^n z_i y_i \\ -\sum_{i=1}^n z_i \sum_{i=1}^n y_i + n \sum_{i=1}^n z_i y_i
\end{pmatrix}\\
= \frac{1}{n^2 \E_n[zx] - n^2 \E_n[x] \E_n[z]} \begin{pmatrix}
n^2 \E_n[zx] \E_n[y] - n^2 \E_n[x] \E_n[zy]\\
-n^2 \E_n[z] \E_n[y] + n^2 \E_n[zy]
\end{pmatrix}\\
$$
Hence,
$$
\hat{\beta}_1 = \frac{n^2 \E_n[zy] - n^2 \E_n[z] \E_n[y]}{n^2 \E_n[zx] - n^2 \E_n[x] \E_n[z]} = \frac{ \E_n[zy] - \E_n[z] \E_n[y]}{ \E_n[zx] - \E_n[x] \E_n[z]} = \frac{\hat{Cov}(z,y)}{\hat{Cov}(z,x)}
$$
:::info
Start here during the exam
:::
\begin{align}
\hat{\beta}_1 &= \frac{\hat{Cov}(z,y)}{\hat{Cov}(z,x)}\\
&= \frac{ \E_n[zy] - \E_n[z] \E_n[y]}{ \E_n[zx] - \E_n[x] \E_n[z]}\\
\end{align}
Since $z$ is binary, define $\hat{p} = \frac{1}{n}\sum_{i=1}^n z_i$. By definition,
$$
\bar{X}_1 = \frac{1}{\sum_{i=1}^n z_i} \sum_{i=1}^n z_i x_i = \frac{1}{n\hat{p}} \sum_{i=1}^n z_i x_i,\\
\bar{X}_0 = \frac{1}{\sum_{i=1}^n (1-z_i)} \sum_{i=1}^n (1-z_i) x_i = \frac{1}{n(1-\hat{p})} \sum_{i=1}^n (1-z_i) x_i
$$
Hence,
$$
\E_n[zx]=\hat{p} \bar{X}_1, \E_n[x] = \hat{p} \bar{X}_1 + (1-\hat{p}) \bar{X}_0, \E_n[z]= \hat{p}
$$
Similarly,
$$
\E_n[zy]=\hat{p} \bar{Y}_1, \E_n[y] = \hat{p} \bar{Y}_1 + (1-\hat{p}) \bar{Y}_0, \E_n[z]= \hat{p}
$$
Finally,
$$
\hat{\beta}_1 = \frac{\hat{p} \bar{Y}_1 - (\hat{p} \bar{Y}_1 + (1-\hat{p}) \bar{Y}_0)\hat{p}}{\hat{p} \bar{X}_1 - (\hat{p} \bar{X}_1 + (1-\hat{p}) \bar{X}_0)\hat{p}} = \frac{\hat{p}(1-\hat{p})(\bar{Y}_1-\bar{Y}_0)}{\hat{p}(1-\hat{p})(\bar{X}_1-\bar{X}_0)} = \frac{\bar{Y}_1-\bar{Y}_0}{\bar{X}_1-\bar{X}_0}.
$$
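The grouped-means (Wald) form can be checked against the covariance-ratio form on the same kind of simulated data (again a purely hypothetical DGP used only for the check):
```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
z = rng.binomial(1, 0.4, n)
x = 1.0 + 2.0 * z + rng.normal(size=n)
y = 3.0 - 1.5 * x + rng.normal(size=n)

# Covariance-ratio form vs. grouped-means (Wald) form of the IV slope.
beta_cov = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
beta_wald = (y[z == 1].mean() - y[z == 0].mean()) / (x[z == 1].mean() - x[z == 0].mean())
print(beta_cov, beta_wald)   # identical up to floating-point error
```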
### b
Yes.
By WLLN and Slutsky,
\begin{align}
\hat{\beta}_1 &= \frac{\hat{Cov}(z,y)}{\hat{Cov}(z,x)} \to_p \frac{Cov(z,y)}{Cov(z,x)}\\
&= \frac{Cov(z,\beta_0 + \beta_1 x +u)}{Cov(z,x)} = \frac{\beta_1 Cov(z,x)}{Cov(z,x)} = \beta_1,
\end{align}
where we use $Cov(z, u)=0$ (implied by $E(u|Z)=0$) and $Cov(z, x) \neq 0$ (instrument relevance).
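A minimal Monte Carlo illustration of the consistency claim (hypothetical DGP: $x$ is endogenous through a shared shock $v$, while the binary instrument $z$ is independent of $u$): the IV estimate approaches the true $\beta_1$ as $n$ grows.
```python
import numpy as np

rng = np.random.default_rng(0)
beta1 = -1.5                                  # true slope (hypothetical)
for n in (100, 10_000, 1_000_000):
    z = rng.binomial(1, 0.4, n)
    v = rng.normal(size=n)
    u = 0.8 * v + rng.normal(size=n)          # Cov(x, u) != 0, but Cov(z, u) = 0
    x = 1.0 + 2.0 * z + v
    y = 3.0 + beta1 * x + u
    b_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
    print(n, b_iv)                            # converges to beta1
```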
# Q3
# Q4