Markov matrix

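A Markov matrix, also called a transition matrix, is a matrix that describes the transitions of a Markov chain.

Every entry of the matrix is a probability, so it is a real number between $0$ and $1$, and each column sums to $1$.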

General $n \times n$ Markov matrix

Let $P \in M_{n \times n}$ be a Markov matrix. Then $0 \leq P_{ij} \leq 1$ for all $i, j$, and every column sums to $1$:

$$\sum_{i = 1}^{n} P_{ij} = 1 \quad \text{for each } j. \tag{1}$$
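The two conditions above can be checked mechanically. The following is a minimal sketch in plain Python, assuming the matrix is stored as a list of rows; the function name `is_markov_matrix` and the tolerance `tol` are illustrative choices, not part of the original text.

```python
from typing import List

Matrix = List[List[float]]


def is_markov_matrix(P: Matrix, tol: float = 1e-12) -> bool:
    """Return True if P is square, has entries in [0, 1], and each column sums to 1."""
    n = len(P)
    if n == 0 or any(len(row) != n for row in P):
        return False
    # Entry condition: 0 <= P[i][j] <= 1.
    if any(not (0.0 <= p <= 1.0) for row in P for p in row):
        return False
    # Column-sum condition (1): the sum over i of P[i][j] equals 1 for each j.
    return all(abs(sum(P[i][j] for i in range(n)) - 1.0) <= tol for j in range(n))
```

For instance, `is_markov_matrix([[0.9, 0.2], [0.1, 0.8]])` passes both conditions, while a matrix whose first column sums to `1.1` does not.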

$2 \times 2$ Markov matrix - part 1

A $2 \times 2$ Markov matrix can be written as

$$P = \begin{bmatrix} a & b \\ 1 - a & 1 - b \end{bmatrix}, \tag{2}$$

where $0 \leq a, b \leq 1$.

Examples

$(a, b) = (1, 0)$: $P$ becomes the identity matrix,

$$P = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \tag{3}$$
- Both eigenvalues of the matrix in (3) are $1$, and a basis of eigenvectors is $(1, 0)^{T}$ and $(0, 1)^{T}$.
- Moreover, $P x = x$, so this transition matrix does nothing at all: the state after the transition is the same as before.
$(a, b) = (1, 1)$:

$$P = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}. \tag{4}$$
- The eigenvalues of the matrix in (4) are $1$ and $0$, with corresponding eigenvectors $(1, 0)^{T}$ and $(1, -1)^{T}$.
- If the initial state is a probability vector $x = [x_{1}, x_{2}]^{T}$ with $x_{1} + x_{2} = 1$, then

  $$P \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} = \begin{bmatrix} x_{1} + x_{2} \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$
  In other words, this transition matrix moves all of the probability into the first entry.
$(a, b) = (0, 0)$: essentially the same as $(a, b) = (1, 1)$, except that this transition matrix moves all of the probability into the second entry.
$(a, b) = (0, 1)$:

$$P = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}. \tag{5}$$
- The eigenvalues of the matrix in (5) are $1$ and $-1$, with corresponding eigenvectors $(1, 1)^{T}$ and $(1, -1)^{T}$.
- If the initial state is $x = [x_{1}, x_{2}]^{T}$, then

  $$P \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} = \begin{bmatrix} x_{2} \\ x_{1} \end{bmatrix}. \tag{6}$$
  In other words, this transition matrix swaps the two states.
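The four corner cases can be tried on a concrete vector. This is a small sketch in plain Python; the helper names `mat_vec` and `two_by_two` and the sample vector `x` are assumptions made for illustration only.

```python
def mat_vec(P, x):
    """Multiply a matrix (given as a list of rows) by a column vector (a list)."""
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]


def two_by_two(a, b):
    """Build the matrix in (2) from its first-row entries a and b."""
    return [[a, b], [1 - a, 1 - b]]


x = [0.25, 0.75]  # an arbitrary probability vector, chosen only for illustration

unchanged = mat_vec(two_by_two(1, 0), x)  # identity (3): x is left as it is
to_first = mat_vec(two_by_two(1, 1), x)   # matrix (4): everything moves to entry 1
to_second = mat_vec(two_by_two(0, 0), x)  # (a, b) = (0, 0): everything moves to entry 2
swapped = mat_vec(two_by_two(0, 1), x)    # matrix (5): the two entries trade places
```

Each result is again a vector whose entries sum to $1$, as the column-sum condition guarantees.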

What we most want to know about a Markov matrix is whether repeated transitions "converge" to some state. That is, we want to compute

$$\lim_{n \to \infty} P^{n} x.$$

Observation

The first important fact is that each column of a Markov matrix sums to $1$. For the $2 \times 2$ matrix in (2) this means

$$\begin{bmatrix} 1 & 1 \end{bmatrix} P = 1 \cdot \begin{bmatrix} 1 & 1 \end{bmatrix}. \tag{7}$$

In other words, $[1, 1]^{T}$ is a left eigenvector of the Markov matrix, and $1$ is always one of its eigenvalues.

Next we look at left eigenvectors and some of their properties.

Left eigenvector

Below are the definition of a left eigenvector of an $n \times n$ square matrix and some of its properties.

Definition:
Let $y \in \mathbb{R}^{n} \setminus \{0\}$ satisfy

$$y^{T} A = \lambda y^{T}. \tag{8}$$

Then $y$ is called a left eigenvector of $A$ corresponding to the eigenvalue $\lambda$.

Remark:
Taking the transpose of each side of (8) gives $A^{T} y = \lambda y$. So a left eigenvector of $A$ is an (ordinary) eigenvector of $A^{T}$.

Lemma 1: $A$ and $A^{T}$ share exactly the same eigenvalues.

pf:
$\det (A - \lambda I) = \det ((A - \lambda I)^{T}) = \det (A^{T} - \lambda I)$, so the two characteristic polynomials have exactly the same roots.

Lemma 2: Let $v \neq 0$ and $w \neq 0$ be a right and a left eigenvector of $A$ corresponding to different eigenvalues, i.e.,

$$A v = \lambda_{1} v, \quad w^{T} A = \lambda_{2} w^{T}, \quad \lambda_{1} \neq \lambda_{2}.$$

Then $w^{T} v = 0$.

pf:
$w^{T} A v = w^{T} (A v) = \lambda_{1} w^{T} v$ and $w^{T} A v = (w^{T} A) v = \lambda_{2} w^{T} v$.

Therefore $(\lambda_{1} - \lambda_{2}) w^{T} v = 0$, and $w^{T} v = 0$ because $\lambda_{1} \neq \lambda_{2}$.
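Lemma 2 can be checked by hand on a small matrix. The sketch below uses a hypothetical $2 \times 2$ matrix `A` (not a Markov matrix, and not taken from the text) with eigenvalues $2$ and $3$, and integer arithmetic so that every comparison is exact.

```python
# Hypothetical example matrix with eigenvalues 2 and 3.
A = [[2, 1],
     [0, 3]]

v = [1, 0]  # right eigenvector: A v = 2 v
w = [0, 1]  # left eigenvector:  w^T A = 3 w^T

# A v, computed row by row.
Av = [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]
# w^T A, computed column by column.
wA = [sum(w[i] * A[i][j] for i in range(2)) for j in range(2)]
# The inner product w^T v that Lemma 2 says must vanish.
w_dot_v = sum(wi * vi for wi, vi in zip(w, v))
```

Here `Av` equals `2 * v`, `wA` equals `3 * w`, and `w_dot_v` is `0`, as the lemma predicts for eigenvectors belonging to different eigenvalues.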

In the same way, a general $n \times n$ Markov matrix always has $1$ as an eigenvalue.

Lemma 3: An $n \times n$ Markov matrix has an eigenvalue $1$ with corresponding left eigenvector $\mathbf{1} = [1, 1, \dots, 1]^{T}$.

pf: Using (1) we have $\mathbf{1}^{T} P = \mathbf{1}^{T} = 1 \cdot \mathbf{1}^{T}$. Therefore $\mathbf{1}$ is a left eigenvector corresponding to the eigenvalue $1$, and by Lemma 1 the Markov matrix itself has an eigenvalue $1$.
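As a quick sanity check of Lemma 3, the all-ones row vector can be multiplied into a sample matrix. The $3 \times 3$ matrix `P` below is a made-up example, and `fractions.Fraction` is used so that the column sums are exact.

```python
from fractions import Fraction as F

# A made-up 3 x 3 Markov matrix: every column sums to 1.
P = [[F(1, 2), F(1, 4), F(0)],
     [F(1, 2), F(1, 2), F(1, 3)],
     [F(0),    F(1, 4), F(2, 3)]]

n = len(P)
ones = [F(1)] * n

# Row vector times matrix: entry j of ones^T P is the sum of column j.
ones_P = [sum(ones[i] * P[i][j] for i in range(n)) for j in range(n)]
```

Since entry `j` of `ones_P` is exactly the sum of column `j`, the result is the all-ones vector again.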

$2 \times 2$ Markov matrix - part 2

Lemma 2 lets us deduce something slightly surprising. We already know that the matrix $P$ in (2) has $1$ as an eigenvalue, with left eigenvector $[1, 1]^{T}$. Suppose its other eigenvalue is not $1$, and call a corresponding (right) eigenvector $[y_{1}, y_{2}]^{T}$. Then Lemma 2 gives

$$\begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} y_{1} \\ y_{2} \end{bmatrix} = 0, \tag{9}$$

that is, $y_{1} + y_{2} = 0$. So this eigenvector must be a multiple of $[1, -1]^{T}$.

Now we work backwards to find its eigenvalue:

$$P \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} a - b \\ b - a \end{bmatrix} = (a - b) \begin{bmatrix} 1 \\ -1 \end{bmatrix}. \tag{10}$$

So the eigenvalue is $a - b$, with eigenvector $[1, -1]^{T}$.

The starting assumption of this argument was that the second eigenvalue is not $1$, so the argument can only fail when $a - b = 1$. Since $0 \leq a, b \leq 1$, this happens only when $(a, b) = (1, 0)$, that is, when $P$ is the identity matrix.

Lemma 4: Let $P$ be a $2 \times 2$ Markov matrix, written as in (2). Assume $P$ is not the identity matrix. Then it has eigenvalue $\lambda_{2} = a - b$ with corresponding eigenvector $v_{2} = [1, -1]^{T}$.
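The eigenpair in Lemma 4 is easy to verify for particular values. This sketch uses exact fractions; the function name `second_eigenpair_holds` and the sample values of `a` and `b` are illustrative assumptions.

```python
from fractions import Fraction as F


def second_eigenpair_holds(a, b):
    """Check P v2 == (a - b) v2 for P as in (2) and v2 = [1, -1]^T."""
    P = [[a, b], [1 - a, 1 - b]]
    v2 = [1, -1]
    Pv2 = [P[0][0] * v2[0] + P[0][1] * v2[1],
           P[1][0] * v2[0] + P[1][1] * v2[1]]
    return Pv2 == [(a - b) * c for c in v2]


checks = [second_eigenpair_holds(F(3, 4), F(1, 3)),
          second_eigenpair_holds(F(0), F(1)),
          second_eigenpair_holds(F(1, 10), F(9, 10))]
```

The second case is the swap matrix (5), where $a - b = -1$, and the third has a negative eigenvalue $a - b = -4/5$.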

First eigenvector

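Next we try to diagonalize $P$. First we find the eigenvector for $\lambda = 1$:

$$P - 1 \cdot I = \begin{bmatrix} a - 1 & b \\ 1 - a & -b \end{bmatrix}.$$

It is easy to see that the eigenspace is spanned by $[b, 1 - a]^{T}$. We normalize it so that it is a probability vector (its entries sum to $1$), which is possible when $P \neq I$ because then $1 + b - a \neq 0$:

$$v_{1} = \frac{1}{1 + b - a} \begin{bmatrix} b \\ 1 - a \end{bmatrix}.$$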
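The formula for $v_{1}$ can be confirmed on a numeric instance. In this sketch the function name `stationary_vector` and the values $a = 9/10$, $b = 1/5$ are assumptions chosen for illustration; exact fractions avoid rounding.

```python
from fractions import Fraction as F


def stationary_vector(a, b):
    """Return v1 = [b, 1 - a]^T / (1 + b - a); needs a - b != 1, i.e. P != I."""
    denom = 1 + b - a
    if denom == 0:
        raise ValueError("P is the identity, so v1 is not uniquely determined")
    return [b / denom, (1 - a) / denom]


a, b = F(9, 10), F(1, 5)
P = [[a, b], [1 - a, 1 - b]]
v1 = stationary_vector(a, b)

# P v1 should reproduce v1, and the entries of v1 should sum to 1.
Pv1 = [P[0][0] * v1[0] + P[0][1] * v1[1],
       P[1][0] * v1[0] + P[1][1] * v1[1]]
```

For these values the denominator is $3/10$, so $v_{1} = [2/3, 1/3]^{T}$.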

Diagonalization

With all the eigenvalues and eigenvectors in hand, we can diagonalize the matrix.

Assume $P \neq I$. Then

$$P \begin{bmatrix} v_{1} & v_{2} \end{bmatrix} = \begin{bmatrix} v_{1} & v_{2} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & a - b \end{bmatrix}. \tag{11}$$

Let $V = [v_{1}, v_{2}]$. Then

$$P = V \begin{bmatrix} 1 & 0 \\ 0 & a - b \end{bmatrix} V^{-1}, \tag{12}$$

and

$$P^{n} = V \begin{bmatrix} 1 & 0 \\ 0 & (a - b)^{n} \end{bmatrix} V^{-1}. \tag{13}$$

Tidying up a little, we compute $V^{-1}$:

$$V^{-1} = \begin{bmatrix} 1 & 1 \\ \frac{1 - a}{1 + b - a} & \frac{-b}{1 + b - a} \end{bmatrix} = \begin{bmatrix} w_{1}^{T} \\ w_{2}^{T} \end{bmatrix}.$$

The rows of $V^{-1}$ give $w_{1}$ and $w_{2}$, which are the two left eigenvectors of $P$. As expected, $w_{1}$ is the all-ones vector.

The diagonalization (12) can therefore be rewritten as

$$P = v_{1} w_{1}^{T} + (a - b) v_{2} w_{2}^{T}, \tag{14}$$

and

$$P^{n} = v_{1} w_{1}^{T} + (a - b)^{n} v_{2} w_{2}^{T}. \tag{15}$$
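Formula (15) can be compared against plain repeated multiplication. The following sketch uses exact fractions, so the two results must agree entry by entry; the function names and the sample values of `a`, `b`, and `n` are illustrative assumptions.

```python
from fractions import Fraction as F


def mat_mul(X, Y):
    """Product of two 2 x 2 matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def power_by_formula(a, b, n):
    """P^n = v1 w1^T + (a - b)^n v2 w2^T, valid when P != I (a - b != 1)."""
    d = 1 + b - a
    v1, w1 = [b / d, (1 - a) / d], [1, 1]
    v2, w2 = [1, -1], [(1 - a) / d, -b / d]
    lam = (a - b) ** n
    return [[v1[i] * w1[j] + lam * v2[i] * w2[j] for j in range(2)]
            for i in range(2)]


def power_by_multiplication(a, b, n):
    """Multiply the identity by P a total of n times."""
    P = [[a, b], [1 - a, 1 - b]]
    result = [[F(1), F(0)], [F(0), F(1)]]
    for _ in range(n):
        result = mat_mul(result, P)
    return result


a, b = F(9, 10), F(1, 5)
same = power_by_formula(a, b, 5) == power_by_multiplication(a, b, 5)
```

The closed form needs only one scalar power $(a - b)^{n}$, which is what makes the limit $n \to \infty$ easy to read off.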

Steady state

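Finally we can find the steady state of the Markov process. Assume $a - b \neq \pm 1$ (that is, $P$ is neither the identity matrix (3) nor the swap matrix (5)), and assume $x$ is a probability vector whose entries sum to $1$. Then

$$\lim_{n \to \infty} P^{n} x = v_{1}. \tag{16}$$

Here we used $|a - b| < 1$, so that $|a - b|^{n} \to 0$, together with $w_{1}^{T} x = x_{1} + x_{2} = 1$.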

If we impose no condition on $x$ and write $x = [x_{1}, x_{2}]^{T}$, then

$$\lim_{n \to \infty} P^{n} x = (x_{1} + x_{2}) v_{1}. \tag{17}$$
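The limits (16) and (17) can be observed numerically. This is a rough floating-point sketch, assuming $a = 0.9$ and $b = 0.2$ (so $a - b = 0.7$) and $200$ steps; the helper name `apply_n_times` is hypothetical.

```python
def apply_n_times(a, b, x, n):
    """Return P^n x for the 2 x 2 matrix P in (2), by repeated multiplication."""
    P = [[a, b], [1 - a, 1 - b]]
    for _ in range(n):
        x = [P[0][0] * x[0] + P[0][1] * x[1],
             P[1][0] * x[0] + P[1][1] * x[1]]
    return x


a, b = 0.9, 0.2
v1 = [b / (1 + b - a), (1 - a) / (1 + b - a)]

# Entries sum to 1, so the iterates should approach v1 as in (16).
prob_limit = apply_n_times(a, b, [1.0, 0.0], 200)
# Entries sum to 2, so the iterates should approach 2 * v1 as in (17).
general_limit = apply_n_times(a, b, [3.0, -1.0], 200)
```

After $200$ steps the factor $0.7^{200}$ is far below machine precision, so both results agree with the predicted limits up to rounding.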

$n \times n$ Markov matrix

A similar result holds for a general Markov matrix $P$.

Recall that a Markov matrix has entries between $0$ and $1$ and columns that sum to $1$.

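By Lemma 3, $P$ has $1$ as an eigenvalue. We make the following two assumptions:

- $\dim (\operatorname{Null}(P - I)) = 1$. That is, the eigenspace of $\lambda = 1$ is one-dimensional.
- $1$ is the only eigenvalue of $P$ with modulus $1$. In particular, $-1$ is not an eigenvalue of $P$. For $n \geq 3$, excluding $-1$ alone is not enough: a cyclic permutation of three states satisfies the first assumption and does not have $-1$ as an eigenvalue, yet $P^{n} x$ does not converge, because its other eigenvalues are complex numbers of modulus $1$.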

Now let $v_{1}$ satisfy $P v_{1} = v_{1}$, with entries summing to $1$. Then

$$\lim_{n \to \infty} P^{n} x = v_{1} \tag{18}$$

holds for every vector $x$ whose entries sum to $1$.

Therefore $v_{1}$ is the steady state of the Markov process.
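The convergence in (18) depends on $P$ having no eigenvalue of modulus $1$ other than $1$ itself. As a cautionary sketch, the cyclic shift on three states below (an example not taken from the text) is a Markov matrix whose eigenvalues are the three cube roots of unity, so $-1$ is not among them and the eigenspace of $1$ is one-dimensional, yet the iterates cycle forever.

```python
def mat_vec(P, x):
    """Multiply a matrix (given as a list of rows) by a column vector (a list)."""
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]


# Cyclic shift on three states: each column sums to 1, so this is a Markov matrix.
C = [[0, 0, 1],
     [1, 0, 0],
     [0, 1, 0]]

x0 = [1, 0, 0]
orbit = [x0]
for _ in range(3):
    orbit.append(mat_vec(C, orbit[-1]))
```

The list `orbit` visits `[1, 0, 0]`, `[0, 1, 0]`, `[0, 0, 1]` and then returns to `[1, 0, 0]`, so $C^{n} x_{0}$ has no limit even though $v_{1} = [1/3, 1/3, 1/3]^{T}$ satisfies $C v_{1} = v_{1}$.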

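So, given any Markov matrix $P$, our goal is to find its $v_{1}$, which tells us what the final state looks like. One way to find $v_{1}$ is power iteration; consult a reference on that method for the details.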

Power method

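In short, though, the power-iteration procedure for finding $v_{1}$ is already described by (18): generate a random vector $x_{0}$ (scaled so that its entries sum to $1$) and keep multiplying it by $P$,

$$x_{n} = P x_{n - 1}, \quad n \geq 1,$$

checking along the way whether $x_{n}$ has converged.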
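The procedure can be sketched in a few lines of plain Python. The function name `power_iteration`, the stopping rule (largest entrywise change below `tol`), the iteration cap `max_iter`, and the fixed `seed` are all assumptions of this sketch rather than details given in the text.

```python
import random


def power_iteration(P, tol=1e-12, max_iter=10_000, seed=0):
    """Iterate x_n = P x_{n-1} from a random probability vector until it settles."""
    n = len(P)
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    total = sum(x)
    x = [xi / total for xi in x]  # scale so the entries sum to 1
    for _ in range(max_iter):
        x_next = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Stop once no entry changes by more than tol between iterations.
        if max(abs(u - v) for u, v in zip(x_next, x)) < tol:
            return x_next
        x = x_next
    raise RuntimeError("power iteration did not converge within max_iter steps")


P = [[0.9, 0.2],
     [0.1, 0.8]]
steady = power_iteration(P)
```

For this `P` the result should be close to $[2/3, 1/3]^{T}$, matching the closed form for $v_{1}$ with $a = 0.9$ and $b = 0.2$. When the convergence assumptions fail, as for the swap matrix (5), the loop hits `max_iter` and raises instead of returning a misleading answer.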