---
tags: metric, memo, public
---
# Seemingly Unrelated Regressions
## Kronecker product
Let $A=[a_{ij}]$ be an $m \times n$ matrix and let $B$ be a $p \times q$ matrix. We define the Kronecker product as:
$$A \otimes B = \begin{pmatrix} a_{11} B & a_{12}B & ... & a_{1n}B \\
a_{21} B & a_{22}B & ... & a_{2n}B \\ . & . & ... & . \\ a_{m1} B & a_{m2}B & ... & a_{mn}B\end{pmatrix}$$
### Properties
$(A \otimes B)(C \otimes D) = (AC)\otimes (BD)$ (for conformable $C$ and $D$)
$(A \otimes B)^{-1}=A^{-1} \otimes B^{-1}$ (for square, invertible $A$ and $B$)
$(A \otimes B)^{\top}=A^{\top} \otimes B^{\top}$
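These identities are easy to verify numerically. A minimal sketch using NumPy's `np.kron` (matrix sizes chosen arbitrarily):
```python
import numpy as np

rng = np.random.default_rng(0)
A, C = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
B, D = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))

# Mixed-product property: (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD)
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
# Inverse (A, B square and invertible): (A ⊗ B)^{-1} = A^{-1} ⊗ B^{-1}
assert np.allclose(np.linalg.inv(np.kron(A, B)),
                   np.kron(np.linalg.inv(A), np.linalg.inv(B)))
# Transpose: (A ⊗ B)' = A' ⊗ B'
assert np.allclose(np.kron(A, B).T, np.kron(A.T, B.T))
```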
## Setting
Suppose we want to estimate $M$ regressions, each with $T$ observations. We can then implement Seemingly Unrelated Regressions (SUR): estimate the $M$ equations jointly, treating the errors as i.i.d. across observations $t$ but possibly correlated across equations:
$$y_i = X_i \beta_i + U_i, \quad i=1,2,...,M$$
Let $K =\sum_{j=1}^M K_j$ be the total number of regressors in the system, where $K_j$ is the number of regressors in equation $j$. We can stack the vectors as:
$y = \begin{pmatrix} y_1 \\ y_2 \\... \\ y_M\end{pmatrix}$: $TM \times 1$ vector
$\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\... \\ \beta_M\end{pmatrix}$: $K \times 1$ vector
$U = \begin{pmatrix} U_1 \\ U_2 \\... \\ U_M\end{pmatrix}$: $TM \times 1$ vector
$X = \begin{pmatrix} X_1 & 0 & ... & 0 \\ 0 & X_2 & ... & 0 \\ . & . &... &.\\ 0 &... & 0 & X_M\end{pmatrix}$ : $TM \times K$ matrix
After stacking, we get
$$y= X \beta + U$$
Let $\Omega$ be the covariance matrix of $U$. Under the i.i.d. assumption across observations, we have
$$\Omega = \Sigma \otimes I_T$$
where $\Sigma$ is the $M \times M$ covariance matrix of the errors across equations.
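To make the stacking concrete, here is a minimal simulation sketch (assumed values: $M=3$ equations, $T=50$ observations, $K_j=2$ regressors per equation, and an arbitrary $\Sigma$); the later sketches reuse these objects:
```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(1)
M, T, K_j = 3, 50, 2                                  # equations, observations, regressors per equation
Sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.0]])                   # cross-equation error covariance

X_list = [rng.normal(size=(T, K_j)) for _ in range(M)]
beta_list = [rng.normal(size=K_j) for _ in range(M)]
U = rng.multivariate_normal(np.zeros(M), Sigma, size=T)          # T x M errors, i.i.d. over t
y_list = [X_list[i] @ beta_list[i] + U[:, i] for i in range(M)]

# Stacked system y = X beta + U with Omega = Sigma ⊗ I_T
y = np.concatenate(y_list)                            # (T*M,) stacked equation by equation
X = block_diag(*X_list)                               # (T*M, K) block-diagonal design
Omega = np.kron(Sigma, np.eye(T))                     # (T*M, T*M) covariance of the stacked errors
```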
## OLS and GLS
Because $X$ is block diagonal, OLS on the stacked system reduces to equation-by-equation OLS:
$$\hat{\beta}_{OLS} = (X'X)^{-1}X'y \\ = \begin{pmatrix} (X_1' X_1)^{-1}X'_1y_1 \\ (X_2' X_2)^{-1}X'_2y_2 \\ (X_3' X_3)^{-1}X'_3y_3 \\ ... \\ (X_M' X_M)^{-1}X'_M y_M \end{pmatrix}.$$
If we know $\Sigma$, we can compute the GLS estimator as:
$$\hat{\beta}_{GLS} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1}y \\ = (X' (\Sigma \otimes I_T)^{-1} X)^{-1} X' (\Sigma \otimes I_T)^{-1}y \\ = (X' (\Sigma ^{-1} \otimes I_T) X)^{-1} X' (\Sigma ^{-1} \otimes I_T)y.$$
Writing $\Sigma^{-1}=[\sigma^{ij}]$, we can rewrite the GLS estimator as:
$$\hat{\beta}_{GLS} = \begin{pmatrix}\sigma^{11} (X_1'X_1) & \sigma^{12} (X_1'X_2) & ... & \sigma^{1M} (X_1'X_M) \\ \sigma^{21} (X_2'X_1) & \sigma^{22} (X_2'X_2) & ... & \sigma^{2M} (X_2'X_M) \\ . & .&...& . \\ \sigma^{M1} (X_M'X_1) & \sigma^{M2} (X_M'X_2) & ... & \sigma^{MM} (X_M'X_M)\end{pmatrix}^{-1} \begin{pmatrix}X'_1(\sum_j \sigma^{1j}y_j) \\ X'_2(\sum_j \sigma^{2j}y_j) \\... \\X'_M(\sum_j \sigma^{Mj}y_j) \end{pmatrix}.$$
### Explanations
SUR is simply GLS applied to the system of equations. Since GLS is in general more efficient than OLS, SUR is typically the better estimator in this setting.
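Continuing the simulation sketch above, OLS and GLS (with $\Sigma$ treated as known) can be computed directly:
```python
# Equation-by-equation OLS -- identical to OLS on the stacked system,
# because X is block diagonal
beta_ols = np.concatenate(
    [np.linalg.solve(Xi.T @ Xi, Xi.T @ yi) for Xi, yi in zip(X_list, y_list)])

# GLS / SUR with Sigma known: (X' Omega^{-1} X)^{-1} X' Omega^{-1} y
Omega_inv = np.kron(np.linalg.inv(Sigma), np.eye(T))
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)
```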
## When SUR equals OLS
### Uncorrelated Errors across Equations
If $\Sigma$ is diagonal, i.e., $\sigma_{ij}=0$ for $i \neq j$, then $\Sigma^{-1}=diag[1/\sigma_{ii}]$ is also a diagonal matrix and $\hat{\beta}_{GLS} =\hat{\beta}_{OLS}$.
$$\Sigma^{-1}= \begin{pmatrix} \sigma_{11}^{-1} & 0 & 0 & ... & 0\\
0 & \sigma_{22}^{-1} & 0 & ... & 0 \\
& & ...\\
0 & 0 & ... & 0 &\sigma_{MM}^{-1}
\end{pmatrix}$$
$$\hat{\beta}_{GLS} = \begin{pmatrix}\sigma_{11}^{-1} (X_1'X_1) & 0 & ... & 0 \\
0 & \sigma_{22}^{-1} (X_2'X_2) & ... & 0 \\
& &...& \\
0 & 0 & ... & \sigma_{MM}^{-1} (X_M'X_M)\end{pmatrix}^{-1} \begin{pmatrix}X'_1( \sigma_{11}^{-1}y_1) \\
X'_2( \sigma_{22}^{-1}y_2) \\
... \\
X'_M( \sigma_{MM}^{-1}y_M) \end{pmatrix}\\
=\begin{pmatrix}\sigma_{11} (X_1'X_1)^{-1} & 0 & ... & 0 \\
0 & \sigma_{22} (X_2'X_2)^{-1} & ... & 0 \\
& &...& \\
0 & 0 & ... & \sigma_{MM} (X_M'X_M)^{-1}\end{pmatrix} \begin{pmatrix}\sigma_{11}^{-1}X'_1 y_1 \\
\sigma_{22}^{-1} X'_2 y_2 \\
... \\
\sigma_{MM}^{-1} X'_M y_M \end{pmatrix}\\
= \begin{pmatrix} (X_1' X_1)^{-1}X'_1y_1 \\ (X_2' X_2)^{-1}X'_2y_2 \\ (X_3' X_3)^{-1}X'_3y_3 \\ ... \\ (X_M' X_M)^{-1}X'_M y_M \end{pmatrix} \\
= (X'X)^{-1}X'y \\
=\hat{\beta}_{OLS}.$$
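A quick numerical confirmation of this equivalence, reusing the stacked objects above but weighting with an arbitrary diagonal $\Sigma$:
```python
# GLS with a diagonal Sigma collapses to equation-by-equation OLS
Sigma_diag = np.diag([1.0, 2.0, 0.5])
W = np.kron(np.linalg.inv(Sigma_diag), np.eye(T))     # Omega^{-1} for the diagonal case
beta_gls_diag = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
assert np.allclose(beta_gls_diag, beta_ols)
```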
### Identical Regressors
If $X_j = X_0, j=1, ..., M$, then $X = I_M \otimes X_0$ and $\hat{\beta}_{GLS} =\hat{\beta}_{OLS}$.
$$\hat{\beta}_{GLS} = (X' (\Sigma ^{-1} \otimes I_T) X)^{-1} X' (\Sigma ^{-1} \otimes I_T)y \\
= ((I_M \otimes X_0)' (\Sigma ^{-1} \otimes I_T) (I_M \otimes X_0))^{-1} (I_M \otimes X_0)' (\Sigma ^{-1} \otimes I_T)y \\
= (\Sigma^{-1} \otimes X_0'X_0)^{-1}(\Sigma^{-1} \otimes X_0')y\\
= (\Sigma \otimes (X_0'X_0)^{-1})(\Sigma^{-1} \otimes X_0')y\\
= (I_M \otimes (X_0'X_0)^{-1}X_0') \begin{pmatrix}y_1 \\y_2 \\... \\ y_M\end{pmatrix} \\
=\begin{pmatrix} (X_0' X_0)^{-1}X'_0y_1 \\
(X_0' X_0)^{-1}X'_0y_2 \\ (X_0' X_0)^{-1}X'_0y_3 \\
... \\
(X_0' X_0)^{-1}X'_0 y_M \end{pmatrix}\\
=\begin{pmatrix} (X_1' X_1)^{-1}X'_1y_1 \\
(X_2' X_2)^{-1}X'_2y_2 \\ (X_3' X_3)^{-1}X'_3y_3 \\
... \\
(X_M' X_M)^{-1}X'_M y_M \end{pmatrix}\\
=\hat{\beta}_{OLS}.$$
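The same kind of check works for identical regressors: the identity holds for any $y$ once the design is $I_M \otimes X_0$. Here $X_0$ is a hypothetical common design matrix drawn at random, reusing $\Sigma$ and $y$ from the sketch above:
```python
# With identical regressors X_0 in every equation, GLS equals OLS
X0 = rng.normal(size=(T, K_j))
X_same = np.kron(np.eye(M), X0)                       # X = I_M ⊗ X_0
W = np.kron(np.linalg.inv(Sigma), np.eye(T))          # Omega^{-1} with a non-diagonal Sigma
beta_gls_same = np.linalg.solve(X_same.T @ W @ X_same, X_same.T @ W @ y)
beta_ols_same = np.linalg.solve(X_same.T @ X_same, X_same.T @ y)
assert np.allclose(beta_gls_same, beta_ols_same)
```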
## Feasible SUR
A simple way to estimate $\Sigma$ is $\hat{\Sigma}=[\hat{\sigma}_{ij}]$:
$$\hat{\sigma}_{ij}=\frac{1}{T}(y_i-X_i \hat{\beta}_i)'(y_j-X_j \hat{\beta}_j)$$
where $\hat{\beta}_i$ is the OLS estimator of equation $i$. In summation notation, $\hat{\sigma}_{ij}$ is:
$$\hat{\sigma}_{ij} = \frac{1}{T}\sum_{t=1}^T e_{it} e_{jt}=\hat{s}_{ij}$$
where $e_{it}$ is the OLS residual of equation $i$ at observation $t$, and $\hat{s}_{ij}$ is Baltagi's notation.
$\hat{\Sigma}$ is a consistent but biased estimator of ${\Sigma}$.
Hence, the feasible SUR is:
$$\hat{\beta}_{SUR} = \hat{\beta}_{FGLS} \\
= (X' (\hat{\Sigma} ^{-1} \otimes I_T) X)^{-1} X' (\hat{\Sigma} ^{-1} \otimes I_T)y.$$
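A minimal sketch of the feasible SUR estimator, continuing from the simulated objects above:
```python
# Step 1: equation-by-equation OLS residuals
resid = np.column_stack(
    [yi - Xi @ np.linalg.solve(Xi.T @ Xi, Xi.T @ yi)
     for Xi, yi in zip(X_list, y_list)])              # T x M residual matrix
# Step 2: estimate Sigma, sigma_hat_ij = e_i'e_j / T
Sigma_hat = resid.T @ resid / T
# Step 3: FGLS with the estimated covariance
W_hat = np.kron(np.linalg.inv(Sigma_hat), np.eye(T))
beta_fgls = np.linalg.solve(X.T @ W_hat @ X, X.T @ W_hat @ y)
```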
## Test the Covariance Matrix
We can test whether $\Sigma$ is a diagonal matrix.
$H_0$: $\Sigma$ is diagonal
$$LM = T \sum_{i=2}^M \sum_{j=1}^{i-1} r_{ij}^2$$
$$r_{ij} = \frac{\hat{s}_{ij}}{(\hat{s}_{ii}\hat{s}_{jj})^{0.5}} $$
where $r_{ij}$ is the correlation between the OLS residuals of equations $i$ and $j$. Under $H_0$, the statistic is asymptotically distributed as
$$LM \sim \chi^2_{M(M-1)/2}.$$
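This is the Breusch–Pagan LM test for diagonality of $\Sigma$. A sketch of the statistic and its p-value, continuing from `Sigma_hat` above:
```python
from scipy import stats

# Residual correlations r_ij and the LM statistic T * sum_{i>j} r_ij^2
sd = np.sqrt(np.diag(Sigma_hat))
R = Sigma_hat / np.outer(sd, sd)
LM = T * np.sum(np.tril(R, k=-1) ** 2)                # squared correlations below the diagonal
df = M * (M - 1) // 2
p_value = stats.chi2.sf(LM, df)                       # reject diagonality when p_value is small
```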