owned this note
owned this note
Published
Linked with GitHub
# 凱力–漢米頓定理
Cayley–Hamilton theorem

This work by Jephian Lin is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).
$\newcommand{\trans}{^\top}
\newcommand{\adj}{^{\rm adj}}
\newcommand{\cof}{^{\rm cof}}
\newcommand{\inp}[2]{\left\langle#1,#2\right\rangle}
\newcommand{\dunion}{\mathbin{\dot\cup}}
\newcommand{\bzero}{\mathbf{0}}
\newcommand{\bone}{\mathbf{1}}
\newcommand{\ba}{\mathbf{a}}
\newcommand{\bb}{\mathbf{b}}
\newcommand{\bc}{\mathbf{c}}
\newcommand{\bd}{\mathbf{d}}
\newcommand{\be}{\mathbf{e}}
\newcommand{\bh}{\mathbf{h}}
\newcommand{\bp}{\mathbf{p}}
\newcommand{\bq}{\mathbf{q}}
\newcommand{\br}{\mathbf{r}}
\newcommand{\bx}{\mathbf{x}}
\newcommand{\by}{\mathbf{y}}
\newcommand{\bz}{\mathbf{z}}
\newcommand{\bu}{\mathbf{u}}
\newcommand{\bv}{\mathbf{v}}
\newcommand{\bw}{\mathbf{w}}
\newcommand{\tr}{\operatorname{tr}}
\newcommand{\nul}{\operatorname{null}}
\newcommand{\rank}{\operatorname{rank}}
%\newcommand{\ker}{\operatorname{ker}}
\newcommand{\range}{\operatorname{range}}
\newcommand{\Col}{\operatorname{Col}}
\newcommand{\Row}{\operatorname{Row}}
\newcommand{\spec}{\operatorname{spec}}
\newcommand{\vspan}{\operatorname{span}}
\newcommand{\Vol}{\operatorname{Vol}}
\newcommand{\sgn}{\operatorname{sgn}}
\newcommand{\idmap}{\operatorname{id}}
\newcommand{\am}{\operatorname{am}}
\newcommand{\gm}{\operatorname{gm}}
\newcommand{\mult}{\operatorname{mult}}
\newcommand{\iner}{\operatorname{iner}}$
```python
from lingeo import random_int_list
from linspace import vtop
```
## Main idea
Let $A$ be an $n\times n$ matrix and
$$
p(x) = a_0x^d + a_1x^{d-1} + \cdots + a_d
$$
a polynomial.
Then define
$$
p(A) = a_0A^d + a_1A^{d-1} + \cdots + a_dI.
$$
##### Cayley–Hamilton Theorem
Let $A$ be an $n\times n$ matrix and
$$
p_A(x) = s_0(-x)^n + s_1(-x)^{n-1} + \cdots + s_n
$$
its characteristic polyonimal.
Then
$$
p_A(A) = s_0(-A)^n + s_1(-A)^{n-1} + \cdots + s_nI = O.
$$
##### Remark
Although $p_A(x) = \det(A - xI)$, it is not correct that $p_A(A) = \det(A - A\cdot I)$ by definition.
In particular, $p_A(A)$ is a matrix, while $\det(A - AI)$ is a number.
##### Proof of the Cayley–Hamilton Theorem
Let $A_x = A - xI$.
Then we know $\det(A_x) = p_A(x)$ and the adjucate $A_x\adj$ of $A_x$ has the property
$$
\begin{aligned}
A_x\adj A_x &= p_A(x) I \\
&= s_0I(-x)^n + s_1I(-x)^{n-1} + \cdots + s_nI.
\end{aligned}
$$
On the other hand, each entry of $A_x\adj$ is a polynomial of degree at most $n-1$, so we may write
$$
A_x\adj = C_1(-x)^{n-1} + C_2(-x)^{n-2} + \cdots + C_n
$$
for some constant matrices $C_1,\ldots, C_n$.
Therefore,
$$
\begin{aligned}
A_x\adj A_x &= A_x\adj(A - xI) \\
&= C_1(-x)^n + (C_1A + C_2)(-x)^{n-1} + \cdots + (C_{n-1}A + C_n)(-x) + C_nA.
\end{aligned}
$$
By comparing the coefficients of the two expressions of $A_x\adj A_x$ we get
$$
\begin{array}{rclcl}
s_0 I &= & & &C_1 \\
s_1 I &= &C_1A &+ &C_2\\
\vdots &&&& \\
s_{n-1}I &= &C_{n-1}A &+ &C_n\\
s_nI &= &C_nA. & &
\end{array}
$$
By post-multiplying $(-A)^n-k$ to both sides of the $k$-th equation for $k = 0,\ldots, n$, we get
$$
\begin{aligned}
p_A(A) &= s_0(-A)^n + s_1(-A)^{n-1} + \cdots + s_nI \\
&= C_1(-A)^n + C_1A(-A)^{n-1} + C_2(-A)^{n-1} + C_2A(-A)^{n-2} + \cdots + C_n(-A) + C_nA\\
&= O.
\end{aligned}
$$
Notice that $s_0,\ldots,s_n$ can be written as polynomials in entries of $A$.
Also, each entry of $A^0,\ldots,A^n$ can be written as a polynomial in entries of $A$.
This mean each entry of $p_A(A)$ is a polynomial in entries of $A$, and it is zero polynomial!
How come these terms cancel with each other can be viewed from graph theory, but it is beyond the scope.
See the book _A Combinatorial Approach to Matrix Theory and Its Applications_ by Richard A. Brualdi and Dragoš Cvetković for more details.
## Side stories
- matrix as a complex number
## Experiments
##### Exercise 1
執行以下程式碼。
<!-- eng start -->
Run the code below.
<!-- eng end -->
```python
### code
set_random_seed(0)
print_ans = False
n = 2
A = matrix(n, random_int_list(n^2))
cs = random_int_list(n + 1)
p = vtop(cs)
print("n =", n)
pretty_print(LatexExpr("A ="), A)
pretty_print(LatexExpr("p ="), p)
if print_ans:
for i in range(n + 1):
pretty_print(LatexExpr("A^{%s} ="%i), A^i)
pofA = sum([cs[i] * A^i for i in range(n + 1)], identity_matrix(n))
pretty_print(LatexExpr("p(A) ="), pofA)
```
##### Exercise 1(a)
計算 $A^0, A^1, \ldots, A^n$。
<!-- eng start -->
Find $A^0, A^1, \ldots, A^n$.
<!-- eng end -->
##### Exercise 1(b)
計算 $p(A)$。
<!-- eng start -->
Find $p(A)$.
<!-- eng end -->
:::info
What do the experiments try to tell you? (open answer)
...
:::
## Exercises
##### Exercise 2
令
$$
p(x) = x^2 - 5x + 6 = (x - 2)(x - 3).
$$
說明
$$
p(A) = A^2 - 5A + 6 = (A - 2I)(A - 3I).
$$
因此 $p(A)$ 不會因為 $p$ 的表達式不同而影響結果。
<!-- eng start -->
Let
$$
p(x) = x^2 - 5x + 6 = (x - 2)(x - 3).
$$
Verify
$$
p(A) = A^2 - 5A + 6 = (A - 2I)(A - 3I).
$$
and explain why $p(A)$ does not depend on the different expressions of $p(A)$.
<!-- eng end -->
##### Exercise 3
對以下 $n\times n$ 矩陣 $A$,
求出其特徵多項式 $p_A(x)$ 並計算 $A^0, \ldots, A^n$,
最後驗證 $p_A(A) = O$。
<!-- eng start -->
For the following matrices $A$, find the characteristic polynomial $p_A(x)$ and calculate $A^0, \ldots, A^n$, and then verify $p_A(A) = O$.
<!-- eng end -->
##### Exercise 3(a)
$$
A = \begin{bmatrix}
1 & 2 \\
3 & 4
\end{bmatrix}.
$$
##### Exercise 3(b)
$$
A = \begin{bmatrix}
1 & 1 & 1 \\
1 & 1 & 1 \\
1 & 1 & 1
\end{bmatrix}.
$$
##### Exercise 4
令
$$
A = \begin{bmatrix}
1 & 0 & 0 \\
0 & 2 & 0 \\
0 & 0 & 3
\end{bmatrix}.
$$
計算 $B_1 = A - I$、$B_2 = A - 2I$、及 $B_3 = A - 3I$。
說明 $B_1,B_2,B_3$ 彼此兩兩可交換,並驗證 $B_1B_2B_3 = O$。
求出特徵多項式 $p_A(x)$ 並說明為什麼 $p_A(A) = O$。
<!-- eng start -->
Let
$$
A = \begin{bmatrix}
1 & 0 & 0 \\
0 & 2 & 0 \\
0 & 0 & 3
\end{bmatrix}.
$$
Calculate $B_1 = A - I$, $B_2 = A - 2I$, and $B_3 = A - 3I$. Explain why $B_1, B_2, B_3$ are mutually commutative and verify $B_1B_2B_3 = O$. Find the characteristic polynomial $p_A(x)$ and explain why $p_A(A) = O$.
<!-- eng end -->
##### Exercise 5
令
$$
A = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}.
$$
計算 $A^0, A^1, \ldots, A^5$。
求出特徵多項式 $p_A(x)$ 並說明為什麼 $p_A(A) = O$。
<!-- eng start -->
Let
$$
A = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}.
$$
Calculate $A^0, A^1, \ldots, A^5$. Then find the characteristic polynomial $p_A(x)$ and explain why $p_A(A) = O$.
<!-- eng end -->
##### Exercise 6
令
$$
A = \begin{bmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 2 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 3 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 3 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3
\end{bmatrix}
$$
計算 $B_1 = (A - I)^2$、$B_2 = (A - 2I)^3$、及 $B_3 = (A - 3I)^4$。
說明 $B_1,B_2,B_3$ 彼此兩兩可交換,並驗證 $B_1B_2B_3 = O$。
求出特徵多項式 $p_A(x)$ 並說明為什麼 $p_A(A) = O$。
<!-- eng start -->
Let
$$
A = \begin{bmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 2 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 3 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 3 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3
\end{bmatrix}
$$
Calculate $B_1 = A - I$, $B_2 = A - 2I$, and $B_3 = A - 3I$. Explain why $B_1, B_2, B_3$ are mutually commutative and verify $B_1B_2B_3 = O$. Find the characteristic polynomial $p_A(x)$ and explain why $p_A(A) = O$.
<!-- eng end -->
##### Exercise 7
令
$$
A = \begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}
$$
而 $A_x = A - xI$。
計算 $A_x\adj$ 並將其寫成
$$
A_x\adj = C_1(-x)^{2} + C_2(-x)^{1} + C_3.
$$
<!-- eng start -->
Let
$$
A = \begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}
$$
and $A_x = A - xI$. Compute $A_x\adj$ and write it into the form
$$
A_x\adj = C_1(-x)^{2} + C_2(-x)^{1} + C_3.
$$
<!-- eng end -->
##### Exercise 8
當 $z = a + bi$ 為一複數時,定義
$$
A(z) =
\begin{bmatrix}
a & -b \\
b & a
\end{bmatrix}.
$$
驗證對於任兩複數 $z_1,z_2$ 都有
- $A(z_1z_2) = A(z_1)A(z_2)$ 及
- $A(z_1 + z_2) = A(z_1) + A(z_2)$。
所以當 $p$ 是一個多項式滿足 $p(z) = 0$ 時,則有 $p(A(z)) = O$。
<!-- eng start -->
For any complex number $z = a + bi$, define
$$
A(z) =
\begin{bmatrix}
a & -b \\
b & a
\end{bmatrix}.
$$
Verify that
- $A(z_1z_2) = A(z_1)A(z_2)$ and
- $A(z_1 + z_2) = A(z_1) + A(z_2)$
for any two complex numbers $z_1,z_2$. Therefore, if $p$ is a polynomial with $p(z) = 0$, then we also have $p(A(z)) = O$.
<!-- eng end -->