---
tags: metric, memo, public
---
# Instrumental Variable
## Hansen's Notation
$$Y_1 = X \beta + u$$
$$X = Z \Gamma + e$$
$$Y_1 = Z \lambda + \varepsilon$$
* $Y_1$: $n \times 1$ matrix
* $X$: $n \times k$ matrix
* $Z$: $n \times l$ matrix
* $\Gamma$: $l \times k$ matrix
* $\lambda$: $l \times 1$ matrix
$$X'u \neq 0, \quad Z'u =0, \quad Z'e =0, \quad Z' \varepsilon =0$$
With some basic algebra,
$$Z'u =0 = Z'(Y_1 - X \beta )$$
$$Z'Y_1 = Z 'X \beta $$
If $l = k$, we can invert $Z'X$ and calculate the **instrumental variables (IV)** estimator,
$$\hat{\beta}_{iv}=(Z'X)^{-1}Z'Y_1$$
Another approach is to plug in the equations,
$$Y_1 = X \beta + u = (Z \Gamma + e) \beta + u = Z \Gamma \beta + e \beta + u = Z \lambda + \varepsilon$$
$$\lambda = \Gamma \beta$$
If $l = k$, we may find $\Gamma^{-1}$ and $\beta = \Gamma^{-1}\lambda$. In this case, we call the estimator the **Indirect Least Squares (ILS)** estimator, which is equivalent to the instrumental variables (IV) estimator, that is
$$\hat{\beta}_{ils} = \hat{\Gamma}^{-1}\hat{\lambda} \\= ((Z'Z)^{-1}(Z'X))^{-1}((Z'Z)^{-1}(Z'Y_1))\\=(Z'X)^{-1}(Z'Z)(Z'Z)^{-1}(Z'Y_1)\\=(Z'X)^{-1}Z'Y_1=\hat{\beta}_{iv}$$
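The equivalence is easy to check numerically. Below is a minimal Python sketch on simulated data (the data-generating process and all parameter values are hypothetical, chosen only for illustration): it computes $\hat{\beta}_{iv}$ directly and via ILS and confirms they coincide in the just-identified case.

```python
# Just-identified case (l = k): the IV and ILS estimators coincide.
# Hypothetical simulated data, for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 2                          # l = k = 2

Z = rng.normal(size=(n, k))            # instruments
u = rng.normal(size=n)                 # structural error
Gamma = np.array([[1.0, 0.5], [0.3, 1.0]])
X = Z @ Gamma + rng.normal(size=(n, k)) + 0.5 * u[:, None]  # endogenous X
beta = np.array([1.0, -2.0])
y1 = X @ beta + u

# IV: beta_iv = (Z'X)^{-1} Z'y1
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y1)

# ILS: OLS estimates of Gamma and lambda, then beta = Gamma^{-1} lambda
Gamma_hat = np.linalg.solve(Z.T @ Z, Z.T @ X)
lambda_hat = np.linalg.solve(Z.T @ Z, Z.T @ y1)
beta_ils = np.linalg.solve(Gamma_hat, lambda_hat)

print(beta_iv, beta_ils)               # equal up to floating-point error
```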
If $l > k$ (the over-identified case), $Z'X$ is no longer square and cannot be inverted, so we instead use
$$Y_1 = X \beta + u = (Z \Gamma + e) \beta + u = Z \Gamma \beta + e \beta + u$$
If we know $\Gamma$, then we have
$$\hat{\beta}=[(Z \Gamma)'(Z \Gamma)]^{-1}(Z \Gamma)' Y_1\\=(\Gamma 'Z' Z \Gamma)^{-1}\Gamma' Z' Y_1$$
Using $\hat{\Gamma}=(Z'Z)^{-1} Z'X$, we obtain the feasible **two-stage least squares (2SLS)** estimator, that is
$$\hat{\beta}_{2sls}=(\hat{\Gamma}'Z' Z \hat{\Gamma})^{-1}\hat{\Gamma}' Z' Y_1 \\= (X'Z(Z'Z)^{-1}Z'Z(Z'Z)^{-1} Z'X)^{-1} X' Z(Z'Z)^{-1} Z' Y_1\\=(X'Z(Z'Z)^{-1} Z'X)^{-1}X'Z (Z'Z)^{-1} Z' Y_1.$$
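The closed-form 2SLS expression can likewise be sketched in a few lines (again a hypothetical simulation, here with $l = 3$ instruments for $k = 2$ regressors):

```python
# Over-identified case (l > k): the 2SLS estimator in closed form.
# Hypothetical simulated data, for illustration only.
import numpy as np

rng = np.random.default_rng(1)
n, k, l = 1000, 2, 3

Z = rng.normal(size=(n, l))            # instruments
u = rng.normal(size=n)
Gamma = rng.normal(size=(l, k))
X = Z @ Gamma + rng.normal(size=(n, k)) + 0.5 * u[:, None]
y1 = X @ np.array([1.0, -2.0]) + u

# First stage: Gamma_hat = (Z'Z)^{-1} Z'X
Gamma_hat = np.linalg.solve(Z.T @ Z, Z.T @ X)
# beta_2sls = (X'Z(Z'Z)^{-1}Z'X)^{-1} X'Z(Z'Z)^{-1}Z'y1
XPZX = X.T @ Z @ Gamma_hat                             # X' P_Z X
XPZy = X.T @ Z @ np.linalg.solve(Z.T @ Z, Z.T @ y1)    # X' P_Z y1
beta_2sls = np.linalg.solve(XPZX, XPZy)
print(beta_2sls)                       # close to (1, -2) in large samples
```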
## Baltagi's Notation
$$y_1 = Y_1 \alpha_1 + X_1 \beta_1 + u_1 = Z_1 \delta_1 + u_1$$
$$Y_1 = X \Gamma + e$$
$$y_1 = X \lambda + \varepsilon$$
* $y_1$: $T \times 1$ matrix
* $Y_1$: $T \times g_1$ matrix
* $X_1$: $T \times k_1$ matrix
* $\alpha_1$: $g_1 \times 1$ matrix
* $\beta_1$: $k_1 \times 1$ matrix
* $Z_1$: $[Y_1, X_1]$, $T \times (g_1 + k_1)$ matrix
* $\delta_1$: $[\alpha_1', \beta_1']'$, $(g_1 + k_1) \times 1$ matrix
* $X_2$: $T \times (k-k_1)$ matrix
* $X$: $[X_1, X_2]$ ,$T \times k$ matrix
$$Y_1'u_1 \neq 0, \quad X'u_1 =0, \quad X'e =0, \quad X' \varepsilon =0$$
In the first stage, we obtain $\widehat{Y}_1 = P_X Y_1$, where $P_X = X(X'X)^{-1}X'$ is the projection matrix onto the column space of $X$, and then use it in the second stage,
$$y_1 = \widehat{Y}_1 \alpha_1 + X_1 \beta_1 + w_1 = \widehat{Z}_1 \delta_1 + w_1$$
where $\widehat{Z}_1 =[\widehat{Y}_1, X_1] = P_X Z_1$, because $P_X X_1 = X_1$. Hence
$$\widehat{\delta}_{1,2SLS} = (\widehat{Z}_1'\widehat{Z}_1)^{-1} \widehat{Z}_1'y_1 = (Z_1'P_X Z_1)^{-1}Z_1'P_X y_1$$
---
In other words, in the first stage, we use the following equation to estimate $\widehat{Z}_1$,
$$Z_1 = X \Theta + \phi$$
and we get
$$\widehat{Z}_1=P_X Z_1$$
In the second stage, we use the following equation to estimate $\widehat{\delta}_{1}$,
$$y_1 = \widehat{Z}_1 \delta_1 + u_1$$
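The two-stage description and the closed form give the same numbers. The sketch below (hypothetical simulated data in Baltagi's notation) runs the first-stage regression, then second-stage OLS on the fitted values, and compares the result with $(Z_1'P_X Z_1)^{-1}Z_1'P_X y_1$:

```python
# Two-stage procedure vs. closed-form 2SLS in Baltagi's notation.
# Hypothetical simulated data, for illustration only.
import numpy as np

rng = np.random.default_rng(2)
T, g1, k1, k = 800, 1, 2, 4            # k instruments, g1 + k1 regressors

X = rng.normal(size=(T, k))            # full instrument set [X1, X2]
X1 = X[:, :k1]                         # included exogenous regressors
u1 = rng.normal(size=T)
Y1 = X @ rng.normal(size=(k, g1)) + 0.5 * u1[:, None] \
     + rng.normal(size=(T, g1))        # endogenous regressors
Z1 = np.hstack([Y1, X1])
y1 = Z1 @ np.array([1.0, 0.5, -1.0]) + u1

# First stage: Z1_hat = P_X Z1 (the X1 columns are reproduced exactly)
Z1_hat = X @ np.linalg.solve(X.T @ X, X.T @ Z1)

# Second stage: OLS of y1 on Z1_hat
delta_two_stage = np.linalg.solve(Z1_hat.T @ Z1_hat, Z1_hat.T @ y1)

# Closed form: (Z1' P_X Z1)^{-1} Z1' P_X y1, using Z1' P_X = Z1_hat'
delta_closed = np.linalg.solve(Z1_hat.T @ Z1, Z1_hat.T @ y1)
print(delta_two_stage, delta_closed)   # identical up to rounding
```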
**Note:** For some reason, scholars call the estimator in the just-identified case the **instrumental variables (IV)** estimator and the estimator in the over-identified case the **two-stage least squares (2SLS)** estimator. However, the 2SLS estimator is just a generalized version of the IV estimator, and in general we use the two terms interchangeably.
## Overidentification Test
Suppose that the number of instrumental variables is greater than the number of regressors, that is, $l > (g_1 +k_1)$ in Baltagi's notation. Then we can run the following overidentification test:
$$H_0: y_1 = Z_1 \delta_1+ u_1, \quad E(W'u)=0$$
$$H_1: y_1 = Z_1 \delta_1 + W^* \gamma + u_1, \quad E(W'u)=0$$
where $W$ is the instrument matrix and $W^*$ is a constructed matrix such that $[P_W Z_1 \quad W^*]$ spans the same linear space as $W$.
* $W$: $T \times l$ instrument matrix
* $W^*$: $T \times (l-k_1 -g_1)$ matrix
An alternative representation of the hypotheses is:
$$H_0: y_1 = P_W Z_1 \delta_1+ u_1, \quad E(W'u)=0$$
$$H_1: y_1 = P_W Z_1 \delta_1 + W^* \gamma + u_1, \quad E(W'u)=0$$
Based on these hypotheses, we can implement an F test:
$$F = \frac{(RRSS^* - URSS^*)/(l-(g_1 + k_1))}{URSS/(T-l)} \sim F_{l-(g_1 + k_1), T-l}$$
$RRSS^* = (y_1 - P_W Z_1 \widehat{\delta}_1)'(y_1 - P_W Z_1 \widehat{\delta}_1)$
$URSS^* = (y_1 - P_W Z_1 \widehat{\delta}_1 - W^* \widehat{\gamma})'(y_1 - P_W Z_1 \widehat{\delta}_1 - W^* \widehat{\gamma})$
$URSS = (y_1 - Z_1 \widehat{\delta}_{1, 2SLS}- W^* \widehat{\gamma})'(y_1 - Z_1 \widehat{\delta}_{1, 2SLS}- W^* \widehat{\gamma})$
By construction, $[P_W Z_1 \quad W^*]$ spans the same linear space as $W$, so
$$URSS^*= [(I-P_W)y_1]'[(I-P_W)y_1] =y_1' \bar{P}_W y_1$$
where $\bar{P}_W = I - P_W$.
We also have,
$$RRSS^* = [(I-P_{P_W Z_1})y_1]'[(I-P_{P_W Z_1})y_1] =y_1' \bar{P}_{\widehat{Z}_1} y_1$$
where $\widehat{Z}_1 = P_W Z_1$.
**Claim:** $RRSS^* - URSS^* = \|P_W(I-P_{P_W Z_1})y_1\|^2 = \|P_W(y_1 - Z_1 \widehat{\delta}_1)\|^2$
First, $\|P_W(y_1 - Z_1 \widehat{\delta}_1)\|^2 = \|P_W(y_1 - \widehat{Z}_1 \widehat{\delta}_1)\|^2=\|P_W(I-P_{P_W Z_1})y_1\|^2$, since $P_W Z_1 = P_W \widehat{Z}_1$ and $\widehat{Z}_1 \widehat{\delta}_1 = P_{P_W Z_1} y_1$.
We can observe that
$$(I-P_W)P_W(I-P_{P_W Z_1})=0$$
and
$$(I-P_W)+P_W(I-P_{P_W Z_1})=I-P_{P_W Z_1}$$
Therefore, these two identities split $(I-P_{P_W Z_1})y_1$ into two orthogonal components, and the claim holds by the Pythagorean theorem: $RRSS^* = \|(I-P_W)y_1\|^2 + \|P_W(I-P_{P_W Z_1})y_1\|^2 = URSS^* + \|P_W(I-P_{P_W Z_1})y_1\|^2$.
Moreover, we can use a $\chi^2$ statistic to test the overidentification restrictions:
$$\frac{RRSS^* - URSS^*}{\widehat{\sigma}^2} = \frac{\|P_W(y_1 - Z_1 \widehat{\delta}_1)\|^2}{\|y_1 - Z_1 \widehat{\delta}_1\|^2/T} \sim \chi^2_{l-(g_1 + k_1)}$$
In fact, this statistic is based on the following hypotheses:
$$H_0: y_1 - Z_1 \hat{\delta}_1 = u$$
$$H_1: y_1 - Z_1 \hat{\delta}_1 = W \gamma+ u$$
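The pieces above can be checked numerically. The sketch below (hypothetical simulated data with valid instruments, so $H_0$ holds) computes $URSS^*$ and $RRSS^*$, verifies the claim $RRSS^* - URSS^* = \|P_W(y_1 - Z_1\widehat{\delta}_1)\|^2$, and forms the $\chi^2$ statistic:

```python
# Overidentification test: verify the Pythagorean claim and compute
# the chi-square statistic. Hypothetical simulated data (H0 true).
import numpy as np

rng = np.random.default_rng(3)
T, g1, k1, l = 1000, 1, 2, 5           # l - (g1 + k1) = 2 restrictions

W = rng.normal(size=(T, l))            # valid instruments
X1 = W[:, :k1]
u1 = rng.normal(size=T)
Y1 = W @ rng.normal(size=(l, g1)) + 0.5 * u1[:, None] \
     + rng.normal(size=(T, g1))
Z1 = np.hstack([Y1, X1])
y1 = Z1 @ np.array([1.0, 0.5, -1.0]) + u1

def proj(A, v):
    """P_A v = A (A'A)^{-1} A' v."""
    return A @ np.linalg.solve(A.T @ A, A.T @ v)

Z1_hat = proj(W, Z1)                                       # P_W Z1
delta_hat = np.linalg.solve(Z1_hat.T @ Z1, Z1_hat.T @ y1)  # 2SLS

URSS_star = y1 @ (y1 - proj(W, y1))         # y1'(I - P_W)y1
RRSS_star = y1 @ (y1 - proj(Z1_hat, y1))    # y1'(I - P_{P_W Z1})y1
resid = y1 - Z1 @ delta_hat                 # 2SLS residuals
Pw_resid = proj(W, resid)

lhs = RRSS_star - URSS_star
rhs = Pw_resid @ Pw_resid                   # ||P_W resid||^2
chi2_stat = rhs / (resid @ resid / T)       # ~ chi2(l - g1 - k1) under H0
print(lhs, rhs, chi2_stat)
```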
## DWH Tests
The **Durbin-Wu-Hausman (DWH) tests** test whether the regressors in the OLS model are exogenous. That is,
$$H_0: y_1 = Z_1 \delta +u, \quad E(Z_1'u)=0$$
$$H_1: y_1 = Z_1 \delta +u, \quad E(Z_1'u)\neq 0, \quad E(W'u)=0$$
Let $\hat{\delta}_{OLS}$ and $\hat{\delta}_{2SLS}$ be the estimators based on OLS and 2SLS, respectively. Under $H_0$, these two estimators should be very close, so we can use their difference to construct a test statistic.
$$\hat{q} = \hat{\delta}_{2SLS} - \hat{\delta}_{OLS} \\
=(Z_1'P_W Z_1)^{-1}Z_1'P_W y_1 - (Z_1'Z_1)^{-1}Z_1'y_1 \\
=(Z_1'P_W Z_1)^{-1}(Z_1'P_W y_1 -Z_1'P_W Z_1(Z_1'Z_1)^{-1}Z_1'y_1)\\
=(Z_1'P_W Z_1)^{-1}Z_1'P_W (I - Z_1(Z_1'Z_1)^{-1}Z_1')y_1\\
=(Z_1'P_W Z_1)^{-1}Z_1'P_W (I -P_{Z_1})y_1\\
=(Z_1'P_W Z_1)^{-1}Z_1'P_W M_{Z_1}y_1$$
Under $H_0$, we have
$$\hat{q}'[Var(\hat{q})]^{-}\hat{q} \sim \chi^2_{\operatorname{rank}(Var(\hat{q}))}$$
where $[Var(\hat{q})]^{-}$ is a generalized inverse; $Var(\hat{q})$ is singular here, with rank $g_1$, since only the coefficients on the endogenous regressors $Y_1$ differ systematically between the two estimators.
Meanwhile, whether $\hat{q}$ is close to $0$ depends only on $Z_1'P_W M_{Z_1}y_1$. Since $Z_1=[Y_1, X_1]$, $P_W X_1 = X_1$, and $X_1'M_{Z_1} = 0$, the $X_1$ block of this vector is identically zero, so we only need to test whether $Y_1'P_W M_{Z_1} y_1$ is different from $0$. This is equivalent to testing the following hypotheses:
$$H_0: y_1 = Z_1 \delta +u$$
$$H_1: y_1 = Z_1 \delta +P_W Y_1 \gamma +u$$
By the FWL theorem, $\hat{\gamma}$ is obtained by regressing $M_{Z_1} y_1$ on $M_{Z_1} P_W Y_1$, that is
$$\hat{\gamma}=(Y_1'P_W M_{Z_1} P_W Y_1 )^{-1}Y_1'P_W M_{Z_1} y_1$$
and the associated F test is
$$\frac{(RRSS-URSS)/g_1}{URSS/(T-2 g_1 -k_1)} \sim F_{g_1, (T -2 g_1 - k_1)}$$
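This regression-based version of the test is straightforward to sketch. In the simulation below (hypothetical data with a genuinely endogenous $Y_1$, so $H_0$ is false), we augment the structural equation with $P_W Y_1$ and F-test its coefficient:

```python
# Regression-based DWH test: augment with P_W Y1 and F-test gamma = 0.
# Hypothetical simulated data with endogenous Y1 (H0 false).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
T, g1, k1, l = 1000, 1, 2, 4

W = rng.normal(size=(T, l))            # instruments
X1 = W[:, :k1]
u1 = rng.normal(size=T)
Y1 = W @ rng.normal(size=(l, g1)) + 0.7 * u1[:, None] \
     + rng.normal(size=(T, g1))        # correlated with u1: endogenous
Z1 = np.hstack([Y1, X1])
y1 = Z1 @ np.array([1.0, 0.5, -1.0]) + u1

def rss(A, v):
    """Residual sum of squares from OLS of v on A."""
    r = v - A @ np.linalg.lstsq(A, v, rcond=None)[0]
    return r @ r

PW_Y1 = W @ np.linalg.solve(W.T @ W, W.T @ Y1)   # first-stage fitted values
RRSS = rss(Z1, y1)                               # restricted: gamma = 0
URSS = rss(np.hstack([Z1, PW_Y1]), y1)           # unrestricted
df2 = T - 2 * g1 - k1
F = ((RRSS - URSS) / g1) / (URSS / df2)
print(F, stats.f.sf(F, g1, df2))                 # large F: reject exogeneity
```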