---
tags: math, macro
---
# Difference Equations
## Setting
Let $\{y_t\}_{t=-\infty}^{\infty}$ and $\{w_t\}_{t=-\infty}^{\infty}$ be two real sequences, and $\phi_i, i=1, \dots, p$ are real numbers.
## 1st order difference equation
Suppose we have the following relationship between two sequences,
$$
y_t = \phi_1 y_{t-1} + w_t, \quad \forall t
$$
By iterating, we can get,
\begin{align*}
y_{t+1} &= \phi_1 y_t + w_{t+1} = \phi_1 (\phi_1 y_{t-1} + w_t) + w_{t+1} = \phi_1^2 y_{t-1} +\phi_1 w_{t} + w_{t+1}\\
y_{t+2} &= \phi_1 y_{t+1} + w_{t+2} = \phi_1^3 y_{t-1} +\phi_1^2 w_{t} + \phi_1 w_{t+1} + w_{t+2}\\
\dots \\
y_{t+k} &= \phi_1^{k+1} y_{t-1} + \phi_1^k w_t +\phi_1^{k-1} w_{t+1} + \dots + \phi_1 w_{t+k-1} + w_{t+k}
\end{align*}
### Dynamic multiplier
We can also define dynamic multiplier as
$$
\frac{\partial y_{t+k}}{w_t}.
$$
Under the 1st difference model,
$$
\frac{\partial y_{t+k}}{w_t} = \phi_1^k.
$$
If $|\phi_1|<1$, then $\lim_{k \to \infty}\frac{\partial y_{t+k}}{w_t} =0$. The interpretation is that the transitory impact could not last forever.
### Permanent income
Suppose $\{y_t\}_{t=0}^{\infty}$ is the stream of income, with the discount factor $0 \le \beta \le 1$, we can define the permanent income, i.e. present value of future income, at time $t$ as
$$
PV_t= \sum_{k=t}^\infty \beta^{k-t} y_k
$$
We are also interested in the effect of $w_t$ on the permanent income,
$$
\frac{\partial PV_t}{w_t} = \frac{\partial \sum_{k=t}^\infty \beta^{k-t} y_k}{w_t} =\sum_{k=t}^\infty \beta^{k-t} \frac{\partial y_k}{w_t} .
$$
Under the 1st difference model,
$$
\frac{\partial PV_t}{\partial w_t} = \sum_{k=t}^\infty \beta^{k-t} \phi_1^{k-t} = \frac{1}{1-\beta \phi_1}.
$$
## P-th order difference equation
The second-order difference equation has the following form.
$$
y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + w_t, \quad \forall t
$$
The p-th order difference equation has the following form.
$$
y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + w_t = \sum_{k=1}^p \phi_k y_{t-k} + w_t, \quad \forall t
$$
### Matrix expression
For the calculation, it is more convenient to use matrix expression. We can express a p-th order difference equation as the following vectors and matrix,
$$
Y_t = \begin{pmatrix} y_t\\
y_{t-1}\\
. \\
y_{t-p+1}\\
y_{t-p}\end{pmatrix}, \quad
W_t=\begin{pmatrix} w_t\\
0\\
. \\
0\\
0\end{pmatrix}
$$
$$
F= \begin{pmatrix} \phi_1 & \phi_2 & \dots & \phi_{p-1} & \phi_p\\
1 & 0 & \dots & 0 & 0\\
0 &1 &\dots & 0 & 0 \\
& & \dots \\
0 & 0 & \dots & 1 & 0\end{pmatrix}
$$
We have,
$$
Y_t = F Y_{t-1} + W_t.
$$
That is,
$$
\begin{pmatrix} y_t\\
y_{t-1}\\
. \\
y_{t-p+1}\\
y_{t-p}\end{pmatrix} = \begin{pmatrix} \phi_1 & \phi_2 & \dots & \phi_{p-1} & \phi_p\\
1 & 0 & \dots & 0 & 0\\
0 &1 &\dots & 0 & 0 \\
& & \dots \\
0 & 0 & \dots & 1 & 0\end{pmatrix} \begin{pmatrix} y_{t-1}\\
y_{t-2}\\
. \\
y_{t-p}\\
y_{t-p-1}\end{pmatrix}
+
\begin{pmatrix} w_t\\
0\\
. \\
0\\
0\end{pmatrix}.
$$
By iterating, we have
$$
Y_{t+k} = F^{k+1} Y_{t-1} + F^k W_t + F^{k-1} W_{t+1} + \dots + W_{t+k}.
$$
The first element of $Y_{t+k}$ is $y_{t+k}$.
## Diagonalization
The advantage of using linear algebra is that we (probably) can diagonalize the matrix. Suppose we can diagonalize $F$ as
$$
F = Q \Lambda Q^T,
$$
where $Q Q^T=I$ and $\Lambda$ is the diagonal matrix with eigenvalues. If the length of all eigenvalues are less than $1$, $F^k$ will converge to the zero matrix as $k \to \infty$.
We have,
$$
Y_t = Q \Lambda Q^T Y_{t-1} + W_t,
$$
and,
$$
Y_{t+k} = Q \Lambda^{k+1} Q^T Y_{t-1} + Q \Lambda^k Q^T W_t + Q \Lambda^{k-1} Q^T W_{t+1} + \dots + W_{t+k}.
$$
### Non-diagonizable
Not all matrices have eigenvalues (in the real field). However, if we extend the field into complex number, all matrices have at least one eigenvalues. Moreover, we can use the Jordan normal form to decompose any matrices. Therefore, for any matrix $F$, if the length/module of all its eigenvalues are less than $1$, $\lim_{k \to \infty}F^k$ will converge to the zero matrix.
## Characteristic polynomials
The eigenvalues of $F$ are the roots of the characteristic polynomial $det|F - \lambda I|$. Moreover, the eigenvalues of $F$ are the roots of the following polynomial,
$$
f(\lambda) = \lambda^p - \phi_1 \lambda^{p-1} - \phi_2 \lambda^{p-2} - \dots - \phi_{p-1} \lambda - \phi_1 = 0
$$
<details>
<summary>
Proof:
</summary>
$$
det|F - \lambda I|= \begin{vmatrix} \phi_1 - \lambda & \phi_2 & \dots & \phi_{p-1} & \phi_p\\
1 & - \lambda & \dots & 0 & 0\\
0 &1 &\dots & 0 & 0 \\
& & \dots \\
0 & 0 & \dots & -\lambda &0\\
0 & 0 & \dots & 1 & -\lambda \end{vmatrix}
$$
For the diagonal matrix, the determinant is the product of diagonal terms. We can observe that $F - \lambda I$ is close to the diagonal matrix. The trick is just to transfer it into a diagonal matrix.
Since column operation does not change the determinant, we can multiply the last column and add it to the second last column and delete the lower triangular term in the last second column.
$$
det|F - \lambda I|= \begin{vmatrix} \phi_1 - \lambda & \phi_2 & \dots & \phi_{p-1} + \lambda^{-1} \phi_p & \phi_p\\
1 & - \lambda & \dots & 0 & 0\\
0 &1 &\dots & 0 & 0 \\
& & \dots \\
0 & 0 & \dots & -\lambda &0\\
0 & 0 & \dots & 0 & -\lambda \end{vmatrix}
$$
Applying this operation $p-1$ times, we can get
$$
det|F - \lambda I|= \begin{vmatrix} \phi_1 - \lambda + \lambda^{-1} \phi_2 + \dots + \lambda^{-(p-1)} \phi_p & mess & \dots & \phi_{p-1} + \lambda^{-1} \phi_p & \phi_p\\
0 & -\lambda & \dots & 0 &0 \\
0 & 0 &\dots &0 & 0 \\
0 & 0 & \dots & -\lambda &0\\
0 & 0 & \dots & 0 & -\lambda \end{vmatrix}\\
= \left( \phi_1 - \lambda + \lambda^{-1} \phi_2 + \dots + \lambda^{-(p-1)} \phi_p \right) (- \lambda)^{p-1}\\
=\left( \phi_1 - \lambda + \lambda^{-1} \phi_2 + \dots + \lambda^{-(p-1)} \phi_p \right)(\lambda)^{p-1} (- 1)^{p-1}\\
=\left( -\lambda^{p} + \lambda^{p-1} \phi_1 + \lambda^{p-2} \phi_2 + \dots + \lambda^{0} \phi_p \right ) (- 1)^{p-1}.
$$
Because the roots of a polynomial do not change if we multiply the polynomial with any non-zero constant, the roots of $|F - \lambda I|$ must be the roots of
$$
f(\lambda) = \lambda^p - \phi_1 \lambda^{p-1} - \phi_2 \lambda^{p-2} - \dots - \phi_{p-1} \lambda - \phi_1 = 0
$$
</details>
### Dynamic multiplier
Using the matrix calculus, we have
$$
\frac{\partial Y_{t+k}}{\partial W_t'} = F^k,
$$
where the dynamic multiplier $\frac{\partial y_{t+k}}{\partial w_t}$ is just the $(1,1)$ element of $F^k$. Notice that here we take the derivative of a column vector with respect to a row vector (the transpose of $W_t$), so we get a $p \times p$ matrix. If the length of all eigenvalues are less than $1$, $\lim_{k \to \infty}\frac{\partial y_{t+k}}{\partial w_t}=0$ as $F^k$ converge to the zero matrix.
## Permanent income
We can define the permanent income vector at time $t$ as
$$
PIV_t= \sum_{k=t}^\infty \beta^{k-t} Y_k
$$
We are also interested in the effect of $W_t$ on the permanent income vector,
$$
\frac{\partial PIV_t}{ \partial W_t'} = \frac{\partial \sum_{k=t}^\infty \beta^{k-t} Y_k}{\partial W_t'} =\sum_{k=t}^\infty \beta^{k-t} \frac{\partial Y_k}{ \partial W_t'} = \sum_{k=t}^\infty \beta^{t-k} F^{t-k}.
$$
The dynamic multiplier $\frac{\partial PV_t}{ \partial w_t}$ is the $(1,1)$ element of the above matrix.
Let $\sum_{k=t}^\infty \beta^{t-k} F^{t-k} = H$.Since its a geometric function, we have
$$H(I - \beta F) = I.$$
If $I - \beta F$ is invertible, we can get
$$
\frac{\partial PIV_t}{ \partial W_t'} = \sum_{k=t}^\infty \beta^{t-k} F^{t-k} = (I - \beta F)^{-1}.
$$
## Formula
If all of the eigenvalues of $F$ are less than $\beta^{-1}$, then $I - \beta F$ is invertible and the dynamic multiplier is,
$$
\frac{\partial PV_t}{ \partial w_t} =1/(1 - \phi_1 \beta - \phi_2 \beta^2 - \dots - \phi_{p-1}\beta^{p-1} - \phi_{p} \beta^{p} ).
$$
<details>
<summary>
Proof:
</summary>
First, if $I - \beta F$ is not invertible, its determinant should be $0$. We can observe that
$$
det|I - \beta F| = det\left|-\beta (F - \beta^{-1} I )\right| = (-\beta)^p \cdot det|F -\beta^{-1} I|.
$$
Hence, if $det|I - \beta F| =0$, $\beta^{-1}$ is an eigenvalue of $I - \beta F$, which violates our assumption.
Let $(I - \beta F)^{-1}=H$, by the definition,
$$
H (I - \beta F)= H \begin{pmatrix} 1- \beta\phi_1 & - \beta \phi_2 & \dots & -\beta \phi_{p-1} & -\beta \phi_p\\
\beta & 1 & \dots & 0 & 0\\
0 &1 &\dots & 0 & 0 \\
& & \dots \\
0 & 0 & \dots & 1 &0\\
0 & 0 & \dots & -\beta & 1 \end{pmatrix} =I
$$
Let $H_1=\begin{pmatrix}x_{11} & x_{12} & \dots & x_{1p}\end{pmatrix}$ be the first row of $H$. We have
$$
\begin{pmatrix}x_{11} & x_{12} & \dots & x_{1p}\end{pmatrix} \begin{pmatrix} 1- \beta\phi_1 & - \beta \phi_2 & \dots & -\beta \phi_{p-1} & -\beta \phi_p\\
\beta & 1 & \dots & 0 & 0\\
0 &1 &\dots & 0 & 0 \\
& & \dots \\
0 & 0 & \dots & 1 &0\\
0 & 0 & \dots & -\beta & 1 \end{pmatrix} = \begin{pmatrix}1 &0 & \dots & 0 & 0\end{pmatrix}
$$
Again, column operation does not affect the solution. We can right multiply a invertible matrix on each side to delete the terms in the lower triangular. The first step is multiply the last column with $\beta$ and add into the second last column,
$$
\begin{pmatrix}x_{11} & x_{12} & \dots & x_{1p}\end{pmatrix} \begin{pmatrix} 1- \beta\phi_1 & - \beta \phi_2 & \dots & -\beta \phi_{p-1} -\beta^2 \phi_p & -\beta \phi_p\\
\beta & 1 & \dots & 0 & 0\\
0 &1 &\dots & 0 & 0 \\
& & \dots \\
0 & 0 & \dots & 1 &0\\
0 & 0 & \dots & 0 & 1 \end{pmatrix} = \begin{pmatrix}1 &0 & \dots & 0 & 0\end{pmatrix}
$$
Repeating this process $p-1$ times, we can get,
$$
\begin{pmatrix}x_{11} & x_{12} & \dots & x_{1p}\end{pmatrix} \begin{pmatrix} 1 - \phi_1 \beta - \phi_2 \beta^2 - \dots - \phi_{p-1} \beta^{p-1} - \phi_{p} \beta^{p} & \dots & -\beta \phi_p\\
0 & \dots & 0\\
0 &\dots & 0 \\
& \dots \\
0 & \dots &0\\
0 & \dots & 1 \end{pmatrix} = \begin{pmatrix}1 &0 & \dots & 0 & 0\end{pmatrix}
$$
Hence,
$$
\frac{\partial PV_t}{ \partial w_t} =x_{11} = 1/(1 - \phi_1 \beta - \phi_2 \beta^2 - \dots - \phi_{p-1}\beta^{p-1} - \phi_{p} \beta^{p} ).
$$
</details>
## Lag operator
### Definition
Let $X$ and $Y$ be two sets. $F: X \to Y$ is a function from $X$ to $Y$. Suppose $X=Y$, we have a special name of this kind of function, **operator**.
We define the lag operator $L: \{y_t\}_{t=-\infty}^{\infty} \to \{y_t\}_{t=-\infty}^{\infty}$ as
$$
L(y_t) = y_{t-1}, \quad \forall t
$$
### P-th order difference equation
We can use the lag operator to rewrite the difference equation. The 1st order difference equation is
$$
y_t = \phi_1 y_{t-1} + w_t, \quad \forall t
$$
With lag operator,
$$
y_t = \phi_1 L(y_t) + w_t \implies (1-\phi L)y_t = w_t , \quad \forall t
$$
The second-order difference equation,
$$
y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + w_t, \quad \forall t
$$
With lag operator,
$$
(1- \phi_1 L - \phi_2 L^2) y_t = w_t, \quad \forall t
$$
The p-th order difference equation,
$$
y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + w_t = \sum_{k=1}^p \phi_k y_{t-k} + w_t, \quad \forall t
$$
With lag operator,
$$
(1- \phi_1 L - \phi_2 L^2- \dots -\phi_p L^p) y_t = w_t, \quad \forall t
$$
### Inverse of 1st order
It is straighfowrad that $L^{-1}(y_t) = y_{t+1}$. However, does $(1 - \phi L)^{-1}$ exists?
Since $1 - \phi L$ is also a operator/function defines on $\{y_t\}_{t=-\infty}^{\infty}$, the inverse function is reasonable if it exists. Consider an operator $F_k$ on $\{y_t\}_{t=-\infty}^{\infty}$ as
$$
F_k = 1 + \phi L + \phi^2 L^2 + \dots + \phi^k L^k.
$$
We can observe that,
$$
F_k(1 - \phi L) (y_t)= \left( 1 -\phi^{k+1} L^{k+1}\right) (y_t) = y_t - \phi^{k+1} y_{t-k-1}.
$$
If $\{y_t\}_{t=-\infty}^{\infty}$ is bounded and $|\phi|<1$, then
$$
\lim_{k \to \infty} F_k(1 - \phi L) (y_t)= y_t - \lim_{k \to \infty} \phi^{k+1} y_{t-k-1} =y_t, \quad \forall y_t
$$
Hence,
$$
(1 - \phi L)^{-1} = \lim_{k \to \infty} F_k = 1 + \phi L + \phi^2 L^2 + \dots = \sum_{i=0}^\infty \phi^i L^{i},
$$
where we define $L^0=1$.
Given a 1st order difference equation,
$$
(1-\phi L)y_t = w_t.
$$
If $(1 - \phi L)^{-1}$ exists and $\{w_t\}_{t=-\infty}^{\infty}$ is bounded, we have
$$
y_t = (1 - \phi L)^{-1} w_t = \sum_{i=0}^\infty \phi^i L^{i} w_t,
$$
In other words, $y_t$ is the weighted average of $\{w_t\}_{t=-\infty}^{\infty}$/ past error terms.
### Inverse of p-th order
Given a p-th order difference equation,
$$
f(L)=1- \phi_1 L - \phi_2 L^2- \dots -\phi_p L^p.
$$
If we can factorize it as
$$
f(L)=(1- \lambda_1 L)(1- \lambda_2 L) \dots (1- \lambda_p L),
$$
then we can use the inverse of the 1st order lag operator to find its inverse function.
In fact, we can, by the fundamental theorem of algebra.(The root may be complex)
### Equivalent to Matrices
The roots of a p-th order difference equation,
$$
f(L)=1- \phi_1 L - \phi_2 L^2- \dots -\phi_p L^p,
$$
are also the roots of
$$
g(z) =1- \phi_1 z - \phi_2 z^2- \dots -\phi_p z^p = (1- \lambda_1 z)(1- \lambda_2 z) \dots (1- \lambda_p z).
$$
Divide the both sides by $z^{-p}$,
$$
z^{-p} - \phi z^{-p+1} \dots -\phi_p z^0= (z^{-1}- \lambda_1 )(z^{-1}- \lambda_2 ) \dots (z^{-1}- \lambda_p )
$$
Define $\lambda = z^{-1}$,
$$
h(\lambda) = \lambda^p - \phi \lambda^{-p+1} \dots -\phi_p \lambda^0= (\lambda- \lambda_1 )(\lambda- \lambda_2 ) \dots (\lambda- \lambda_p )
$$
Therefore, $\lambda_i, i=1, \dots p$ are the roots of $h(\lambda)$, and $\lambda_i^{-1}, i=1, \dots p$ are the roots of $g(z)$ and $f(L)$. Since $h(\lambda)$ has the exact same form at the [characteristic polynomial](#Characteristic-polynomials) defined by the same difference equation, these two methods can derive the same result.
## Reference
Hamilton, J. D. (2020). *Time series analysis.* Princeton university press.
Stephen H. Friedberg, Arnold J. Insel y Lawrence E. Spence, *Linear Algebra*, Prentice-Hall, Inc., 2003
Ljungqvist, Lars, and Thomas J. Sargent. *Recursive macroeconomic theory*, MIT press, 2018.
https://en.wikipedia.org/wiki/Jordan_normal_form