---
tags: information theory
---
# Basic Properties of Markov Processes
## Definition
### Definition 1 (Markov Process)
$\{X_i | i = 1, 2, \ldots\}$ is a ***Markov process*** if $\forall n > 1$,
$$
P_{X_n | X_{n-1}, X_{n-2}, \ldots, X_1} = P_{X_n | X_{n-1}}.
$$
It's typically denoted by $X_1 - X_2 - \cdots - X_n - \cdots$.
The common alphabet $\mathcal X$ is called the state space of the Markov process.
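To make the definition concrete, here is a minimal simulation sketch in Python. The two-state chain, its transition matrix `P`, and the random seed are illustrative assumptions; the point is that each step draws $X_n$ from a distribution depending only on $X_{n-1}$.
```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed two-state chain on {0, 1}; P[i, j] = P(X_n = j | X_{n-1} = i).
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

def sample_chain(P, x1, n):
    """Sample X_1, ..., X_n; each step depends only on the previous state."""
    xs = [x1]
    for _ in range(n - 1):
        xs.append(rng.choice(len(P), p=P[xs[-1]]))
    return xs

print(sample_chain(P, x1=0, n=10))
```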
#### Corollary 1
$$
X_1 - X_2 - X_3 \iff P(X_1, X_2, X_3) = P(X_1) \cdot P(X_2|X_1) \cdot P(X_3|X_2).
$$
**proof**
$(\Rightarrow)$
By the chain rule, $P(X_1, X_2, X_3) = P(X_1) \cdot P(X_2|X_1) \cdot P(X_3|X_2, X_1)$, where $P(X_3|X_2,X_1) = P(X_3|X_2)$ by Definition 1.
$(\Leftarrow)$
Since $P(X_1, X_2, X_3) = P(X_1) \cdot P(X_2|X_1) \cdot P(X_3|X_2)$, we have
$$
P(X_3 | X_2, X_1) = \frac{P(X_3, X_2, X_1)}{P(X_2, X_1)}
= \frac{P(X_1) \cdot P(X_2|X_1) \cdot P(X_3|X_2)}{P(X_1) \cdot P(X_2|X_1)}
= P(X_3 | X_2),
$$
where the denominator is expanded as $P(X_2, X_1) = P(X_1) \cdot P(X_2|X_1)$ by the chain rule.
**remark**
The argument generalizes to $X_1 - \cdots - X_n$ by induction, giving $P(X_1, \ldots, X_n) = P(X_1) \cdot P(X_2|X_1) \cdots P(X_n|X_{n-1})$.
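The factorization can also be checked numerically. The sketch below (the initial law `pi` and kernel `P` are assumed example values) builds the joint of $(X_1, X_2, X_3)$ from the factorization and verifies the $(\Leftarrow)$ direction, $P(X_3|X_2,X_1) = P(X_3|X_2)$.
```python
import numpy as np

# Assumed initial law and transition matrix for a two-state chain.
pi = np.array([0.5, 0.5])
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

# joint[x1, x2, x3] = P(x1) * P(x2|x1) * P(x3|x2), the factorization of Corollary 1.
joint = pi[:, None, None] * P[:, :, None] * P[None, :, :]

# Recover P(x3 | x2, x1) from the joint; it should equal P(x3 | x2) = P[x2, x3].
cond = joint / joint.sum(axis=2, keepdims=True)
for x1 in range(2):
    assert np.allclose(cond[x1], P)
print("P(X3 | X2, X1) = P(X3 | X2) holds")
```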
#### Corollary 2
$X-Y-Z$ if and only if $X$ and $Z$ are independent given $Y$.
**proof**
Markovity implies conditional independence because
$$
P(x, z|y) = P(z | x, y) \times P(x | y)
= P(z | y) \times P(x | y),
$$
where the first equality is the chain rule and the second uses $P(z | x, y) = P(z | y)$, i.e., the Markov property.
Conversely, if $P(x, z | y) = P(x | y) \cdot P(z | y)$, then
$$
P(z | x, y) = \frac{P(x, y, z)}{P(x, y)}
= \frac{P(x, z | y) P(y)}{P(x,y)}
= \frac{P(x| y) P(z|y) P(y)}{P(x,y)}
= P(z|y).
$$
**remark**
Since conditional independence of $X$ and $Z$ given $Y$ is symmetric in $X$ and $Z$, $X-Y-Z$ implies (and is implied by) $Z-Y-X$.
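A quick numerical check of Corollary 2 on the same assumed two-state chain: the conditional joint $P(x, z|y)$ should factor into $P(x|y) \cdot P(z|y)$.
```python
import numpy as np

# Same assumed chain; joint[x, y, z] = P(x) P(y|x) P(z|y), so X - Y - Z.
pi = np.array([0.5, 0.5])
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
joint = pi[:, None, None] * P[:, :, None] * P[None, :, :]

p_y = joint.sum(axis=(0, 2))                 # P(y)
p_xz_y = joint / p_y[None, :, None]          # P(x, z | y)
p_x_y = joint.sum(axis=2) / p_y[None, :]     # P(x | y)
p_z_y = joint.sum(axis=0) / p_y[:, None]     # P(z | y)
assert np.allclose(p_xz_y, p_x_y[:, :, None] * p_z_y[None, :, :])
```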
## Properties
### Lemma 1
For a Markov process $X_1 - \cdots - X_n$,
$$
P(X_n | X_1)
= \sum_{X_2, \ldots, X_{n-1}} P(X_n|X_{n-1}) \times \cdots \times P(X_2 | X_1).
$$
(The same result holds in the continuous case, with sums replaced by integrals.)
**proof**
By the definition of the marginal probability distribution,
$$
\begin{align}
P(X_1, X_n)
&= \sum_{X_2, \ldots, X_{n-1}} P(X_1, \ldots, X_n).
\end{align}
$$
By [Corollary 1](#Corollary-1) in Definition 1 (in its generalized form), the equality above gives
$$
P(X_1, X_n) = \sum_{X_2, \ldots, X_{n-1}} P(X_1) \times P(X_2|X_1) \times \cdots \times P(X_n|X_{n-1}).
$$
Dividing both sides by $P(X_1)$ completes the proof.
**remark**
It implies that $P(X_n) = \sum_{X_1, \ldots, X_{n-1}} P(X_1) \times P(X_2|X_1) \times \cdots \times P(X_n | X_{n-1})$.
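In matrix form, if the chain is time-homogeneous (every step uses the same kernel, an assumption made here for simplicity), the sum over intermediate states collapses to a matrix power. The sketch below checks Lemma 1 for $n = 4$ with an assumed kernel `P`.
```python
import numpy as np

# Assumed one-step kernel, reused at every step (time-homogeneous case).
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

# Lemma 1 for n = 4: P(x4 | x1) = sum_{x2, x3} P(x2|x1) P(x3|x2) P(x4|x3),
# i.e., the explicit sum over intermediate states equals the matrix power P^3.
explicit = np.einsum('ab,bc,cd->ad', P, P, P)
assert np.allclose(explicit, np.linalg.matrix_power(P, 3))
```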
### Lemma 2
If $X_1 - \cdots - X_m - \cdots - X_{m+n}$, then $X_{m+1} - \cdots - X_{m+n}$ for any $n, m \in \mathbb N$.
**proof**
By the definition of the marginal probability distribution,
$$
\begin{align}
P(X_{m+1}, \ldots, X_{m+n}) = \sum_{X_1, \ldots, X_m}P(X_1, \ldots, X_{m+n}).
\end{align}
$$
Applying [Corollary 1](#Corollary-1) in Definition 1 and then the remark of [Lemma 1](#Lemma-1), we obtain
$$
\begin{align}
P(X_{m+1}, \ldots, X_{m+n})
&= \sum_{X_1, \ldots, X_m} P(X_1) \times P(X_2|X_1) \times \cdots \times P(X_m | X_{m-1}) \times \cdots \times P(X_{m+n} | X_{m+n-1})
\notag \\
&= P(X_{m+1}) \times P(X_{m+2} | X_{m+1}) \times \cdots \times P(X_{m+n} | X_{m+n-1}).
\end{align}
$$
Here the second equality holds because the factors from $P(X_{m+2}|X_{m+1})$ onward do not involve $X_1, \ldots, X_m$, and the remaining inner sum equals $P(X_{m+1})$ by the remark of [Lemma 1](#Lemma-1). Therefore, by [Corollary 1](#Corollary-1), the statement follows.
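Lemma 2 admits the same kind of numerical check: for an assumed homogeneous chain $X_1 - X_2 - X_3 - X_4$, marginalizing out $X_1$ should leave a joint that factorizes as a chain with initial law $P(X_2) = \pi P$ (by the remark of Lemma 1).
```python
import numpy as np

# Assumed homogeneous chain X1 - X2 - X3 - X4 with initial law pi.
pi = np.array([0.5, 0.5])
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

joint4 = np.einsum('a,ab,bc,cd->abcd', pi, P, P, P)  # P(x1, x2, x3, x4)
suffix = joint4.sum(axis=0)                          # P(x2, x3, x4)

# Lemma 2: the suffix factorizes as P(x2) P(x3|x2) P(x4|x3),
# with P(x2) = pi @ P by the remark of Lemma 1.
pi2 = pi @ P
assert np.allclose(suffix, np.einsum('b,bc,cd->bcd', pi2, P, P))
```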
### Theorem 1
Every subchain of a Markov chain is also a Markov chain.
**proof**
Using [Lemma 1](#Lemma-1) and [Lemma 2](#Lemma-2), we can show that
$$
X_1 - \cdots - X_m - \cdots -X_{m + n} - \cdots \Rightarrow X_1 - \cdots - X_m - X_{m+n} - \cdots.
$$
Repeatedly applying this property removes any finite set of intermediate variables, which completes the proof.
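As a sanity check on Theorem 1, the sketch below (again an assumed homogeneous two-state chain) marginalizes $X_2$ and $X_4$ out of $X_1 - \cdots - X_5$ and confirms that the subchain $X_1 - X_3 - X_5$ factorizes as in Corollary 1, with the two-step kernel $P^2$ playing the role of the transition matrix.
```python
import numpy as np

# Assumed homogeneous chain X1 - ... - X5.
pi = np.array([0.5, 0.5])
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

joint5 = np.einsum('a,ab,bc,cd,de->abcde', pi, P, P, P, P)  # P(x1, ..., x5)
sub = joint5.sum(axis=(1, 3))                               # P(x1, x3, x5)

# The subchain X1 - X3 - X5 factorizes as in Corollary 1 with kernel P^2.
P2 = np.linalg.matrix_power(P, 2)
assert np.allclose(sub, np.einsum('a,ab,bc->abc', pi, P2, P2))
```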