---
tags: information theory
---

# Basic Properties of Markov Processes

## Definition

### Definition 1 (Markov Process)

$\{X_i \mid i = 1, 2, \ldots\}$ is a ***Markov process*** if $\forall n > 1$,

$$
P_{X_n | X_{n-1}, X_{n-2}, \ldots, X_1} = P_{X_n | X_{n-1}}.
$$

It is typically denoted by $X_1 - X_2 - \cdots - X_n - \cdots$. The common alphabet $\mathcal X$ is called the state space of the Markov process.

#### Corollary 1

$$
X_1 - X_2 - X_3 \iff P(X_1, X_2, X_3) = P(X_1) \cdot P(X_2|X_1) \cdot P(X_3|X_2).
$$

**proof** $(\Rightarrow)$ By the chain rule, $P(X_1, X_2, X_3) = P(X_1) \cdot P(X_2|X_1) \cdot P(X_3|X_2, X_1)$, where $P(X_3|X_2,X_1) = P(X_3|X_2)$ by definition.

$(\Leftarrow)$ Since $P(X_1, X_2, X_3) = P(X_1) \cdot P(X_2|X_1) \cdot P(X_3|X_2)$, we have

$$
P(X_3 | X_2, X_1) = \frac{P(X_3, X_2, X_1)}{P(X_2, X_1)} = \frac{P(X_1) \cdot P(X_2|X_1) \cdot P(X_3|X_2)}{P(X_2, X_1)} = P(X_3 | X_2),
$$

where the last step uses $P(X_2, X_1) = P(X_1) \cdot P(X_2|X_1)$.

**remark** This generalizes easily to $X_1 - \cdots - X_n$ by induction.

#### Corollary 2

$X - Y - Z$ if and only if $X$ and $Z$ are independent given $Y$.

**proof** Markovity implies conditional independence because

$$
P(x, z|y) = P(z | x, y) \times P(x | y) = P(z | y) \times P(x | y).
$$

Conversely, if $P(x, z | y) = P(x | y) \cdot P(z | y)$, then

$$
P(z | x, y) = \frac{P(x, y, z)}{P(x, y)} = \frac{P(x, z | y) P(y)}{P(x,y)} = \frac{P(x| y) P(z|y) P(y)}{P(x,y)} = P(z|y).
$$

**remark** Since conditional independence is symmetric in $X$ and $Z$, $X - Y - Z$ implies $Z - Y - X$.

## Properties

### Lemma 1

In a Markov process $X_1 - \cdots - X_n$,

$$
P(X_n | X_1) = \sum_{X_2, \ldots, X_{n-1}} P(X_n|X_{n-1}) \times \cdots \times P(X_2 | X_1).
$$

(The same result holds in the continuous case, with the sum replaced by an integral.)

**proof** By the definition of the marginal probability distribution,

$$
P(X_1, X_n) = \sum_{X_2, \ldots, X_{n-1}} P(X_1, \ldots, X_n).
$$

By [Corollary 1](#Corollary-1) in Definition 1, the equality above gives

$$
P(X_1, X_n) = \sum_{X_2, \ldots, X_{n-1}} P(X_1) \times P(X_2|X_1) \times \cdots \times P(X_n|X_{n-1}).
$$

Dividing both sides by $P(X_1)$ completes the proof.

**remark** It implies that $P(X_n) = \sum_{X_1, \ldots, X_{n-1}} P(X_1) \times P(X_2|X_1) \times \cdots \times P(X_n | X_{n-1})$.

### Lemma 2

If $X_1 - \cdots - X_m - \cdots - X_{m+n}$, then $X_{m+1} - \cdots - X_{m+n}$ for any $n, m \in \mathbb N$.

**proof** By the definition of the marginal probability distribution,

$$
P(X_{m+1}, \ldots, X_{m+n}) = \sum_{X_1, \ldots, X_m} P(X_1, \ldots, X_{m+n}).
$$

Applying [Corollary 1](#Corollary-1) in Definition 1 and [Lemma 1](#Lemma-1), we obtain

$$
\begin{align}
P(X_{m+1}, \ldots, X_{m+n}) &= \sum_{X_1, \ldots, X_m} P(X_1) \times P(X_2|X_1) \times \cdots \times P(X_m | X_{m-1}) \times \cdots \times P(X_{m+n} | X_{m+n-1}) \notag \\
&= P(X_{m+1}) \times P(X_{m+2} | X_{m+1}) \times \cdots \times P(X_{m+n} | X_{m+n-1}),
\end{align}
$$

where the second equality follows from the remark after [Lemma 1](#Lemma-1). Therefore, by [Corollary 1](#Corollary-1), we have proved the statement.

### Theorem 1

Every subchain of a Markov chain is also a Markov chain.

**proof** Using [Lemma 1](#Lemma-1) and [Lemma 2](#Lemma-2), we can show that

$$
X_1 - \cdots - X_m - \cdots - X_{m + n} - \cdots \Rightarrow X_1 - \cdots - X_m - X_{m+n} - \cdots,
$$

i.e., the states strictly between $X_m$ and $X_{m+n}$ can be dropped while preserving Markovity. Applying this property repeatedly removes any set of intermediate states, which completes the proof. Numerical sanity checks of these identities appear in the sketches below.
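As a sanity check on Corollary 2 (not part of the original note), the following Python/NumPy sketch builds a joint distribution from the Markov factorization of Corollary 1 and verifies the conditional-independence identity numerically. The state-space sizes, the random seed, and the helper `random_kernel` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility

def random_kernel(rows, cols):
    """Random stochastic matrix: each row is a conditional pmf."""
    k = rng.random((rows, cols))
    return k / k.sum(axis=1, keepdims=True)

px = rng.random(3); px /= px.sum()  # P(x)
k_yx = random_kernel(3, 4)          # P(y|x)
k_zy = random_kernel(4, 2)          # P(z|y)

# Joint from the Markov factorization X - Y - Z (Corollary 1):
# P(x, y, z) = P(x) P(y|x) P(z|y).
joint = px[:, None, None] * k_yx[:, :, None] * k_zy[None, :, :]

p_y = joint.sum(axis=(0, 2))   # P(y)
p_xy = joint.sum(axis=2)       # P(x, y)
p_yz = joint.sum(axis=0)       # P(y, z)

# Corollary 2: P(x, z | y) = P(x | y) * P(z | y) for all x, y, z.
lhs = joint / p_y[None, :, None]
rhs = (p_xy / p_y)[:, :, None] * (p_yz / p_y[:, None])[None, :, :]
assert np.allclose(lhs, rhs)

# Symmetric form (remark): P(x | y, z) = P(x | y), i.e. Z - Y - X.
p_x_given_yz = joint / p_yz[None, :, :]
p_x_given_y = p_xy / p_y
assert np.allclose(p_x_given_yz, p_x_given_y[:, :, None])
print("conditional independence of X and Z given Y verified")
```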
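Similarly, a minimal sketch of Lemma 1 and its remark, assuming a homogeneous (time-invariant) chain so that every step uses the same kernel $K$ — an assumption the note itself does not make. Under that assumption, the sum-product over the intermediate states $X_2, \ldots, X_{n-1}$ collapses to a matrix power, which the code compares against a brute-force marginalization of the full joint.

```python
import numpy as np
from itertools import product

# Illustrative 3-state homogeneous chain; all numbers are arbitrary.
# K[i, j] = P(X_{t+1} = j | X_t = i); each row is a conditional pmf.
K = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])
p1 = np.array([0.2, 0.5, 0.3])  # P(X_1)

# Lemma 1: P(X_n | X_1) is the sum over X_2, ..., X_{n-1} of the
# product of one-step kernels, i.e. the (n-1)-th matrix power of K.
n = 5
P_Xn_given_X1 = np.linalg.matrix_power(K, n - 1)

# Remark after Lemma 1: P(X_n) = sum_{X_1} P(X_1) P(X_n | X_1).
p_n = p1 @ P_Xn_given_X1

# Brute force: marginalize the joint of Corollary 1 over all 3**n paths.
p_n_brute = np.zeros(3)
for path in product(range(3), repeat=n):
    prob = p1[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= K[a, b]
    p_n_brute[path[-1]] += prob

assert np.allclose(p_n, p_n_brute)
print(p_n)  # distribution of X_5
```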