# Testing for multi-step causality in time series
###### tags: `references`
reference:
* [Testing for multi-step causality in time series](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.3420)
* [Identification of feedback loops in neural networks based on multi-step Granger causality and its supplementary materials](https://academic.oup.com/bioinformatics/article/28/16/2146/324993#supplementary-data) (the supplementary materials of this paper include MATLAB code)
## VAR model
Suppose a *stationary* multivariate (dimension $N$) stochastic process
$X_k=[x_1(k),x_2(k),...,x_N(k)]^T$ ($k$ is the time index, $k=1,...,L$).
The $VAR(p)$ model: $X_k=V+A_1X_{k-1}+A_2X_{k-2}+\dots+A_pX_{k-p}+E_k$ ---(1)
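A minimal sketch (mine, not from the papers) of fitting the $VAR(p)$ model (1) with statsmodels; the data array and lag order are placeholders, and `cov_params()` may be exposed as a property rather than a method in older statsmodels versions:
```python
import numpy as np
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(0)
data = rng.standard_normal((500, 3))   # placeholder data: L = 500 time points, N = 3 series

p = 2                                  # lag order
results = VAR(data).fit(p)

V = results.intercept                  # estimate of V, shape (N,)
A = results.coefs                      # estimates of A_1,...,A_p, shape (p, N, N)
# Covariance of all estimated coefficients (intercepts included). Its ordering follows
# statsmodels' parameter layout, so the intercept entries have to be dropped and the
# remaining rows/columns rearranged before it can serve as Sigma_alpha in the test below.
Sigma_full = results.cov_params()
```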
## h-step prediction by VAR model
Let $\tilde{X}_k=\begin{bmatrix} X_k\\X_{k-1}\\\vdots\\X_{k-p+1} \end{bmatrix}=\begin{bmatrix} V\\0\\\vdots\\0\end{bmatrix}+\begin{bmatrix} A_1&A_2&\dots&A_{p-1}&A_p\\I&0&\dots&0&0\\0&I&\dots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\dots&I&0\end{bmatrix}\begin{bmatrix} X_{k-1}\\X_{k-2}\\\vdots\\X_{k-p} \end{bmatrix}+\begin{bmatrix} E_k\\0\\\vdots\\0\end{bmatrix}$
$=\tilde{V}_{Np\times 1}+\tilde{A}_{Np\times Np}\tilde{X}_{k-1}+\tilde{E}_k$,
then $\tilde{X}_{k+1}=\tilde{V}+\tilde{A}\tilde{X}_{k}+\tilde{E}_{k+1}=\tilde{V}+\tilde{A}(\tilde{V}+\tilde{A}\tilde{X}_{k-1}+\tilde{E_k})+\tilde{E}_{k+1}=(I+\tilde{A})\tilde{V}+\tilde{A}^2\tilde{X}_{k-1}+\tilde{A}\tilde{E}_k+\tilde{E}_{k+1}$
$\tilde{X}_{k+2}=\tilde{V}+\tilde{A}\tilde{X}_{k+1}+\tilde{E}_{k+2}=\tilde{V}+\tilde{A}[(I+\tilde{A})\tilde{V}+\tilde{A}^2\tilde{X}_{k-1}+\tilde{A}\tilde{E}_k+\tilde{E}_{k+1}]+\tilde{E}_{k+2}$
$=(I+\tilde{A}+\tilde{A}^2)\tilde{V}+\tilde{A}^3\tilde{X}_{k-1}+\tilde{A}^2\tilde{E}_k+\tilde{A}\tilde{E}_{k+1}+\tilde{E}_{k+2}$
thus, setting the future error terms to their zero mean, the $h$-step predictor is
$X_k(h)=J(I+\tilde{A}+\tilde{A}^2+\dots+\tilde{A}^{h-1})\tilde{V}+J\tilde{A}^{h}\tilde{X}_k,\quad h=1,2,\dots,\quad J=\begin{bmatrix} I & 0 & \dots & 0 \end{bmatrix}_{N \times Np}$
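A minimal numpy sketch of the companion form and the $h$-step predictor above (my own translation of the formulas, assuming the `V` and `A` estimates from the previous sketch):
```python
import numpy as np


def companion(A):
    """Stack A_1,...,A_p (shape (p, N, N)) into the (Np x Np) companion matrix A_tilde."""
    p, N, _ = A.shape
    A_tilde = np.zeros((N * p, N * p))
    A_tilde[:N, :] = np.hstack(A)            # first block row: [A_1 A_2 ... A_p]
    A_tilde[N:, :-N] = np.eye(N * (p - 1))   # identity blocks on the sub-diagonal
    return A_tilde


def h_step_predict(V, A, X_hist, h):
    """X_k(h) = J[(I + A~ + ... + A~^{h-1}) V~ + A~^h X~_k].

    X_hist is a list of the last p observations, X_hist[0] = X_k, ..., X_hist[p-1] = X_{k-p+1}.
    """
    p, N, _ = A.shape
    A_tilde = companion(A)
    V_tilde = np.concatenate([V, np.zeros(N * (p - 1))])
    X_tilde = np.concatenate(X_hist)         # stacked state [X_k; X_{k-1}; ...; X_{k-p+1}]
    S = sum(np.linalg.matrix_power(A_tilde, i) for i in range(h))  # I + A~ + ... + A~^{h-1}
    return (S @ V_tilde + np.linalg.matrix_power(A_tilde, h) @ X_tilde)[:N]  # J keeps first N rows
```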
## Modified Wald test
Want to test whether $x_{i}$ causes $x_{j}$ (a code sketch of the full procedure follows this list)
1. Find $A_1,...,A_p$ from (1)
2. Find $\tilde{A}$
3. Find the selection matrix $C_{p \times pN^2}$ such that $C\,vec(\begin{bmatrix} A_1 & A_2 & \dots & A_p \end{bmatrix})=\begin{bmatrix} A_{ji,1} & A_{ji,2} & \dots & A_{ji,p} \end{bmatrix}^T$, i.e. $C$ picks out the $(j,i)$ entry of each $A_m$
4. Let $\alpha=vec(J\tilde{A})$, $\alpha^{(m)}=vec(J\tilde{A}^m)$ $\in$ $\mathbb{R}^{pN^2 \times 1}$, $m=2,...,h$
5. Let $\xi^{(h)}=\begin{bmatrix} \alpha \\ \alpha^{(2)} \\ \vdots \\ \alpha^{(h)} \end{bmatrix}_{hpN^2 \times 1}$
6. Find $\hat{\Sigma}_{\xi}(h)=\frac{\partial\xi^{(h)}}{\partial\alpha^T}\hat{\Sigma}_{\alpha}(\frac{\partial\xi^{(h)}}{\partial\alpha^T})^T$$\in$$\mathbb{R}^{hpN^2 \times hpN^2}$
a. $\hat{\Sigma}_{\alpha}$ is the estimated variance-covariance matrix of the model coefficients $\alpha$
b. $\frac{\partial\xi^{(h)}}{\partial\alpha^T}=\begin{bmatrix} \frac{\partial\alpha}{\partial\alpha^T} \\ \frac{\partial\alpha^{(2)}}{\partial\alpha^T} \\ \vdots \\ \frac{\partial\alpha^{(h)}}{\partial\alpha^T} \end{bmatrix}$, where $\frac{\partial\alpha}{\partial\alpha^T}=I$ and $\frac{\partial\alpha^{(m)}}{\partial\alpha^T}=\sum_{i=0}^{m-1}(\tilde{A}^T)^{m-1-i}\otimes J\tilde{A}^iJ^T$, $m=2,...,h$
Note that $\otimes$ is the Kronecker product
7. Let $\hat{\Sigma}_{\rho}(h)=\begin{bmatrix} 0_{p \times p} & 0\\0 & I_{h-1} \otimes diag(C\hat{\Sigma}_{\alpha}C^T) \end{bmatrix}_{hp\times hp}$
8. Draw a random vector $\rho(h)$ from $N(0,\gamma \hat\Sigma_\rho(h))$, where $\gamma$ is a small positive number (set to 0.1 in the paper)
9. The test statistic $\lambda_{wald}=L[(I_h\otimes C)\xi^{(h)}+\frac{\rho(h)}{\sqrt{L}}]^T[(I_h\otimes C)\hat\Sigma_\xi(h)(I_h\otimes C^T)+\gamma\hat\Sigma_\rho(h)]^{-1}[(I_h\otimes C)\xi^{(h)}+\frac{\rho(h)}{\sqrt{L}}]$
under $H_0$ it asymptotically follows a $\chi^2(hp)$ distribution
10. If $\lambda_{wald} \leq \chi^2_{0.95}(hp)$, accept $H_0$: $x_i$ does not cause $x_j$;
otherwise, reject $H_0$
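A sketch of steps 1-10 in numpy/scipy, written from the description above rather than from the paper's MATLAB code. It assumes `Sigma_alpha` is already the covariance of $vec(\begin{bmatrix} A_1 & \dots & A_p \end{bmatrix})$ (statsmodels' `cov_params()` output has to be reordered and stripped of intercept terms first), `L_len` is the sample length $L$, and `i_cause`, `j_caused` are 0-based indices of $x_i$ and $x_j$:
```python
import numpy as np
from scipy.stats import chi2


def multi_step_wald(A, Sigma_alpha, L_len, h, i_cause, j_caused, gamma=0.1, seed=None):
    """Modified Wald test of H0: x_{i_cause} does not cause x_{j_caused} up to horizon h."""
    p, N, _ = A.shape

    # step 2: companion matrix A_tilde (Np x Np) and selector J = [I 0 ... 0]
    A_tilde = np.zeros((N * p, N * p))
    A_tilde[:N, :] = np.hstack(A)
    A_tilde[N:, :-N] = np.eye(N * (p - 1))
    J = np.hstack([np.eye(N), np.zeros((N, N * (p - 1)))])

    # step 3: C picks the (j,i) entry of each A_m out of vec([A_1 ... A_p])
    C = np.zeros((p, p * N * N))
    for m in range(p):
        C[m, (m * N + i_cause) * N + j_caused] = 1.0

    # steps 4-5: alpha^(m) = vec(J A_tilde^m), stacked into xi^(h)
    xi = np.concatenate([np.linalg.matrix_power(A_tilde, m)[:N, :].flatten(order="F")
                         for m in range(1, h + 1)])

    # step 6: delta-method covariance of xi^(h)
    D = np.vstack([sum(np.kron(np.linalg.matrix_power(A_tilde.T, m - 1 - i),
                               J @ np.linalg.matrix_power(A_tilde, i) @ J.T)
                       for i in range(m))
                   for m in range(1, h + 1)])
    Sigma_xi = D @ Sigma_alpha @ D.T

    # step 7: perturbation covariance Sigma_rho (hp x hp), zero block for the first step
    Sigma_rho = np.zeros((h * p, h * p))
    if h > 1:
        Sigma_rho[p:, p:] = np.kron(np.eye(h - 1),
                                    np.diag(np.diag(C @ Sigma_alpha @ C.T)))

    # step 8: rho(h) ~ N(0, gamma * Sigma_rho)
    rho = np.random.default_rng(seed).multivariate_normal(np.zeros(h * p), gamma * Sigma_rho)

    # step 9: modified Wald statistic
    IC = np.kron(np.eye(h), C)
    v = IC @ xi + rho / np.sqrt(L_len)
    M = IC @ Sigma_xi @ IC.T + gamma * Sigma_rho
    lam = L_len * v @ np.linalg.solve(M, v)

    # step 10: reject H0 if lam exceeds the chi^2(hp) 0.95 quantile
    return lam, lam > chi2.ppf(0.95, h * p)
```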
## Pitfalls
1. The rationale behind the modified test is still not clear to me
2. No off-the-shelf package implements it, so it has to be coded by hand: statsmodels' VAR can be used to estimate $A_1,...,A_p$ and $\hat{\Sigma}_{\alpha}$
3. The resulting $\lambda_{wald}$ is always very large; dropping the factor $L$ in step 9 gives values that look more reasonable
4. Still running simulation experiments and debugging QAQ
5. Switched to the standard multivariate Granger test ($h=1$), which can be run directly with statsmodels (see the sketch below)
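For item 5, a minimal sketch of the standard one-step multivariate Granger test built into statsmodels (placeholder data and column names):
```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((500, 3)), columns=["x1", "x2", "x3"])  # placeholder data

results = VAR(df).fit(2)
# H0: "x2" does not Granger-cause "x1" (one-step-ahead, i.e. h = 1)
test = results.test_causality("x1", ["x2"], kind="wald")
print(test.summary())
```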

