瑞利商 - HackMD

# 瑞利商 ![Creative Commons License](https://i.creativecommons.org/l/by/4.0/88x31.png) This work by Jephian Lin is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/). $\newcommand{\trans}{^\top} \newcommand{\adj}{^{\rm adj}} \newcommand{\cof}{^{\rm cof}} \newcommand{\inp}[2]{\left\langle#1,#2\right\rangle} \newcommand{\dunion}{\mathbin{\dot\cup}} \newcommand{\bzero}{\mathbf{0}} \newcommand{\bone}{\mathbf{1}} \newcommand{\ba}{\mathbf{a}} \newcommand{\bb}{\mathbf{b}} \newcommand{\bc}{\mathbf{c}} \newcommand{\bd}{\mathbf{d}} \newcommand{\be}{\mathbf{e}} \newcommand{\bh}{\mathbf{h}} \newcommand{\bp}{\mathbf{p}} \newcommand{\bq}{\mathbf{q}} \newcommand{\br}{\mathbf{r}} \newcommand{\bx}{\mathbf{x}} \newcommand{\by}{\mathbf{y}} \newcommand{\bz}{\mathbf{z}} \newcommand{\bu}{\mathbf{u}} \newcommand{\bv}{\mathbf{v}} \newcommand{\bw}{\mathbf{w}} \newcommand{\tr}{\operatorname{tr}} \newcommand{\nul}{\operatorname{null}} \newcommand{\rank}{\operatorname{rank}} %\newcommand{\ker}{\operatorname{ker}} \newcommand{\range}{\operatorname{range}} \newcommand{\Col}{\operatorname{Col}} \newcommand{\Row}{\operatorname{Row}} \newcommand{\spec}{\operatorname{spec}} \newcommand{\vspan}{\operatorname{span}} \newcommand{\Vol}{\operatorname{Vol}} \newcommand{\sgn}{\operatorname{sgn}} \newcommand{\idmap}{\operatorname{id}} \newcommand{\am}{\operatorname{am}} \newcommand{\gm}{\operatorname{gm}} \newcommand{\iner}{\operatorname{iner}}$ ```python from lingeo import random_int_list ``` ## Main idea Let $A$ be an $n\times n$ real symmetric matrix with eigenvalues $\lambda_1 \leq \cdots \leq \lambda_n$. According to the spectral theorem, there is an orthonormal basis $\beta = \{\bv_1, \ldots, \bv_n\}$ corresponding to $\lambda_1,\ldots,\lambda_n$. Let $\bv$ be a vector in $\mathbb{R}^n$. Then we have the following properties: - It can be written as $\bx = c_1\bv_1 + \cdots + c_n\bv_n$ for some coefficients $c_1,\ldots,c_n$. - $\|\bx\|^2 = \inp{\bx}{\bx} = c_1^2 + \cdots + c_n^2$. - $\bx\trans A\bx = \inp{A\bx}{\bx} = c_1^2\lambda_1 + \cdots + c_n^2\lambda_n$. With these property in mind, one may solve the optimization problem: minimize $\bx\trans A\bx$, subject to $\|\bx\| = 1$. The minimum value is $\lambda_1$ and is achieved by $\bx = \bv_1$ (or other vector in the same eigenspace). Similarly, if the problem is asking for an maximum value, then the maximum value is $\lambda_n$ and is achieved by $\bx = \bv_n$. Let $A$ be an $n\times n$ real symmetric matrix. The **Rayleigh quotient** of $A$ is a function of the form $$ R_A(\bx) = \frac{\bx\trans A\bx}{\bx\trans \bx}. $$ ##### Rayleigh quotient theorem Let $A$ be an $n\times n$ real symmetric matrix. Then $$ \begin{aligned} \lambda_1 &= \min_{ \substack{\bx\in\mathbb{R}^n \\ \bx\neq\bzero} } R_A(\bx) = \min_{ \substack{\bx\in\mathbb{R}^n \\ \|\bx\| = 1} } \bx\trans A\bx, \\ \lambda_n &= \max_{ \substack{\bx\in\mathbb{R}^n \\ \bx\neq\bzero} } R_A(\bx) = \max_{ \substack{\bx\in\mathbb{R}^n \\ \|\bx\| = 1} } \bx\trans A\bx. \end{aligned} $$ The minimum is achieved by an eigenvector of $\lambda_1$, while the maximum is achieved by an eigenvector of $\lambda_n$. Moreover, if one already know $\lambda_1$ and $\bv_1$, then $$ \lambda_2 = \min_{ \substack{\bx\in\mathbb{R}^n \\ \bx\neq\bzero,\ \bx\perp\bv_1} } R_A(\bx) = \min_{ \substack{\bx\in\mathbb{R}^n \\ \|\bx\| = 1,\ \bx\perp\bv_1} } \bx\trans A\bx. $$ The vector $\bv_{n-1}$ can be done in a similar way if $\lambda_n$ and $\bv_n$ is known. ## Side stories - optimization problem - Courant–Fisher theorem ## Experiments ##### Exercise 1 執行以下程式碼。 ```python ### code set_random_seed(0) print_ans = False n = 3 eigs = random_int_list(n) A = diagonal_matrix(eigs) pretty_print(LatexExpr("A ="), A) if print_ans: xAx = " + ".join("x_%s^2 (%s)"%(i+1, eigs[i]) for i in range(n)) print("xT A x =", xAx) print("max value:", max(eigs)) print("achieved by:", identity_matrix(n)[max(list(range(n)), key=lambda k: eigs[k])]) print("min value:", min(eigs)) print("achieved by:", identity_matrix(n)[min(list(range(n)), key=lambda k: eigs[k])]) ``` ##### Exercise 1(a) 令 $$ \bx = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}. $$ 計算 $\bx\trans A\bx$。 $Ans:$ 當 `seed = 0` 時， $A = \begin{bmatrix} -4 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 5 \end{bmatrix}.$ $\bx\trans A\bx = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}\begin{bmatrix} -4 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 5 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}=-4x_1^2+3x_2^2+5x_3^2$. ##### Exercise 1(b) 求 $\bx\trans A\bx$ 在 $\|\bx\| = 1$ 限制下的最大值。達成這個最大值的 $\bx$ 為何？ :::warning - [x] 加文字說明 - [x] 實際上這題是希望對這樣的問題有一點感覺，不要直接用定理。想看看長度為 $1$ 的意思是 $x_1^2 + x_2^2 + x_3^2 = 1$，而這時候 $-4x_1^2 + 3x_2^2 + 5x_3^2$ 的最大值為何？ ::: $Ans:$ 計算 $\bx\trans A\bx = -4x_1^2 + 3x_2^2 + 5x_3^2$，因題意 $\|\bx\| = 1$ 得知 $x_1^2 + x_2^2 + x_3^2 = 1$，而 $\bx_3^2$ 係數最大，因此在 $\bx = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ 時， $\bx\trans A\bx$ 會有最大值 $-4x_1^2 + 3x_2^2 + 5x_3^2=5$。 ##### Exercise 1(c) 求 $\bx\trans A\bx$ 在 $\|\bx\| = 1$ 限制下的最小值。達成這個最小值的 $\bx$ 為何？ :::warning - [x] 參考上一題 ::: $Ans:$ 計算 $\bx\trans A\bx = -4x_1^2 + 3x_2^2 + 5x_3^2$，因題意 $\|\bx\| = 1$ 得知 $x_1^2 + x_2^2 + x_3^2 = 1$，而 $\bx_１^2$ 係數最小，因此在 $x = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ 時， $\bx\trans A\bx$ 會有最小值 $-4x_1^2 + 3x_2^2 + 5x_3^2=-4$。 ## Exercises ##### Exercise 2 令 $A = \begin{bmatrix} a_{ii} \end{bmatrix}$ 為一 $n\times n$ 實對稱矩陣，且其特徵值為 $\lambda_1 \leq \cdots \leq \lambda_n$。 ##### Exercise 2(a) 證明對所有 $i = 1,\ldots,n$ 都有 $\lambda_1 \leq a_{ii} \leq \lambda_n$。提示：找一個適合的向量 $\bx$ 來套用 $\lambda_1 \leq R_A(\bx) \leq \lambda_n$。 :::warning - [x] 我不懂為什麼 $\lambda_1 \leq R_A(\bx)=(x_1)^2a_{11}+(x_2)^2a_{22}+\cdots+(x_n)^2a_{n,n}\leq \lambda_n$ 可以得到 $\lambda_1 \leq a_{ii} \leq \lambda_n$？實際上這題你只要分別取 $\bx = \be_1,\ldots, \be_n$ 就好。 ::: $Ans:$ 令 $$ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1,n} \\ a_{21} & a_{22} & \ddots & \vdots \\ \vdots & \ddots & \ddots & a_{n-1,n} \\ a_{n,1} & \cdots & a_{n,n-1} & a_{n,n} \\ \end{bmatrix}. $$ 且 $A$ 的特徵值為 $\lambda_1,...,\lambda_n$。令 $\bx_k \in \mathbb{R}^n$ 的第 $k$ 項為 $1$ 而其它項為 $0$。因為 $R_A(\bx_k) = \frac{\bx_k\trans A\bx_k}{\bx_k\trans \bx_k} =\bx\trans A\bx = a_{kk}$ 且 $\lambda_1 \leq R_A(\bx_k) \leq \lambda_n$，因此對所有 $i = 1,\ldots,n$ 都有 $\lambda_1 \leq a_{ii} \leq \lambda_n$。 ##### Exercise 2(b) 令 $$ A = \begin{bmatrix} 2 & -1 & -1 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}. $$ 說明 $A$ 的最大特徵值 $\lambda_3 \geq 2$。 $Ans:$ 由 2(a) 可知對所有 $i = 1,\ldots,n$ 都有 $a_{ii} \leq \lambda_3$，所以 $2 \leq \lambda_3$。 ##### Exercise 3 令 $A = \begin{bmatrix} a_{ii} \end{bmatrix}$ 為一 $n\times n$ 實對稱矩陣，且其特徵值為 $\lambda_1 \leq \cdots \leq \lambda_n$。 ##### Exercise 3(a) 證明 $\lambda_1 \leq \frac{1}{n} \sum_{i=1}^n\sum_{j=1}^n a_{ij} \leq \lambda_n$。 :::warning - [x] 是用 $\frac{1}{\sqrt{n}}\bone$ 沒錯，可是它不一定等於 $\bv_k$，也不用等於 $\bv_k$；這題只要說明當 $\bx = \frac{1}{\sqrt{n}}\bone$ 時，$R_A(\bx) = \frac{1}{n} \sum_{i=1}^n\sum_{j=1}^n a_{ij}$。 ::: $Ans:$ 當 $\bx = \begin{bmatrix}\frac{1}{\sqrt{n}}\\\vdots\\\frac{1}{\sqrt{n}}\end{bmatrix}$ 時，$R_A(\bx) = \begin{bmatrix}\frac{1}{\sqrt{n}}\cdots\frac{1}{\sqrt{n}}\end{bmatrix}A\begin{bmatrix}\frac{1}{\sqrt{n}}\\\vdots\\\frac{1}{\sqrt{n}}\end{bmatrix} = \frac{1}{n} \sum_{i=1}^n\sum_{j=1}^n a_{ij}$，因為 $\lambda_1 \leq R_A(\bx) \leq \lambda_n$，故 $\lambda_1 \leq \frac{1}{n} \sum_{i=1}^n\sum_{j=1}^n a_{ij} \leq \lambda_n$。 ##### Exercise 3(b) 令 $$ A = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}. $$ 說明 $A$ 的最大特徵值 $\lambda_3 \geq \frac{4}{3}$。 :::warning - [x] 用前一題 ::: $Ans:$ 計算 $\frac{1}{3} \sum_{i=1}^3\sum_{j=1}^3 a_{ij} = \frac{4}{3}$，從上題得知 $\frac{1}{n} \sum_{i=1}^n\sum_{j=1}^n a_{ij} \leq \lambda_n$，所以 $\frac{1}{3} \sum_{i=1}^3\sum_{j=1}^3 a_{ij} = \frac{4}{3} \leq \lambda_3$。 ##### Exercise 4 令 $A$ 為一 $n\times n$ 實對稱矩陣，且其特徵值為 $\lambda_1 \leq \cdots \leq \lambda_n$。令 $B$ 為 $A$ 刪掉第一行第一列的矩陣，而其特徵值為 $\mu_1 \leq \cdots \leq \mu_{n-1}$。證明 $\lambda_1 \leq \mu_1$ 且 $\mu_{n-1} \leq \lambda_n$。 :::warning - [x] 我覺得 $\lambda_A=c_1^2a_{11}+\cdots+c_n^2a_{nn}$ 等式有錯；而且沒說明 $c_i$, $a_{ij}$ 是什麼 - [x] 這題可以取 $B$ 的對應到 $\mu_1$ 的特徵向量 $\bu_1$。可假設其為單位長，並將它補上一個 $0$ 變成 $\bv_1$。如此就有 $\lambda_1 \leq \bv_1\trans A\bv_1 = \bu_1 B\bu_1 = \mu_1$ ::: $Ans:$ 取 $B$ 的對應到 $\mu_1$ 的特徵向量 $\bu_1 = (u_{11},u_{21},\cdots,u_{n1})$。且 $\|\bu_1\| = 1$，並使 $\bv_1 = (0,u_{11},u_{21},\cdots,u_{n1})$。因為 $\lambda_1 \leq \bv_1\trans A\bv_1$，因此 $\lambda_1 \leq \bv_1\trans A\bv_1 = \bu_1\trans B\bu_1 = \mu_1$。取 $B$ 的對應到 $\mu_n$ 的特徵向量 $\bu_n = (u_{1n},u_{2n},\cdots,u_{nn})$。且 $\|\bu_n\| = 1$，並使 $\bv_n = (0,u_{1n},u_{2n},\cdots,u_{nn})$。因為 $\bv_n\trans A\bv_n \leq \lambda_n$，因此 $\mu_n = \bu_n\trans B\bu_n = \bv_n\trans A\bv_n \leq \lambda_n$。 ##### Exercise 5 令 $$ A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}. $$ 解以下極值問題： maximize $\bx\trans A \bx$, subject to $\|\bx\| = 1$. :::warning - [x] 說明 $\bx$ 為什麼要這樣取 - [x] max --> $\max$ ::: $Ans:$ 將 $A$ 對角化，可變成 $$ A' = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 4 \end{bmatrix}. $$ $$ \ker(A-4I) = \operatorname{span}\left\{\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\right\}. $$ 又因 $\|\bx\| = 1$，所以取垂直矩陣 $$ \bx = \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix}. $$ 因為 $$ R_A(\bx) = \frac{\bx\trans A\bx}{\bx\trans \bx}. $$ 又 $\|\bx\| = 1$，所以只需要算 $\bx\trans A\bx$， $$ \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \end{bmatrix} \begin{bmatrix} 2&1&1 \\ 1&2&1 \\ 1&1&2 \end{bmatrix} \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix} = 4 $$ 所以 $\lambda_3 = \max \bx\trans A\bx=4.$ ##### Exercise 6 已知 $$ A_n = \begin{bmatrix} 0 & 1 & ~ & 0 \\ 1 & 0 & \ddots & ~ \\ ~ & \ddots & \ddots & 1 \\ 0 & ~ & 1 & 0 \end{bmatrix} $$ 的所有特徵值為 $\{2\cos(\frac{2k\pi}{n+1}): k = 1,\ldots, n\}$。求在 $x_1^2 + \cdots + x_n^2 = 1$ 的限制下， $x_1x_2 + x_2x_3 + \cdots + x_{n-1}x_n$ 的最大值為何。 :::warning - [x] $x_1x_2 + x_2x_3 + \cdots + x_{n-1}x_n = \bx\trans A_n\bx，$ <-- 這部份差一個倍數，所以最後答案也有問題 ::: $Ans:$ 令 $\bx = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}.$ 得 $x_1x_2 + x_2x_3 + \cdots + x_{n-1}x_n = \frac{1}{2}\bx\trans A_n\bx，$ 且 $\|\bx\| = \sqrt{x_1^2 + \cdots + x_n^2} = 1.$ 根據瑞利商定理 $x_1x_2 + x_2x_3 + \cdots + x_{n-1}x_n$ 的最大值為 $\frac{1}{2}\lambda_n = \frac{1}{2}\max_{ \substack{\bx\in\mathbb{R}^n \\ \bx\neq\bzero} } R_A(\bx) = \frac{1}{2}\max_{ \substack{\bx\in\mathbb{R}^n \\ \|\bx\| = 1} } \bx\trans A_n\bx = \cos(\frac{2n\pi}{n+1}).$ ##### Exercise 7 令 $$ L = \begin{bmatrix} 2 & -1 & -1 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}. $$ 解以下極值問題： minimize $\bx\trans L \bx$, subject to $\|\bx\| = 1$ and $\bx \perp \bone$. 這裡 $\bone$ 為全一向量。 :::info 題目出錯，我把 $\bx = \bone$ 改成 $\bx \perp \bone$ ::: :::warning 懸賞 ::: $Ans:$ $L$ 的特徵值為 $0,1,3$ 其對應的特徵向量分別為 $\bv_0 = \frac{1}{\sqrt{3}} \begin{bmatrix}1 \\ 1 \\ 1\end{bmatrix}$，$\bv_1 = \frac{1}{\sqrt{2}}\begin{bmatrix}0 \\ -1 \\ 1 \end{bmatrix}$，$\bv_3 = \frac{1}{\sqrt{6}}\begin{bmatrix}-2 \\ 1 \\ 1 \end{bmatrix}$。因為 $\bx \perp \bone$ 因此令 $\bx = c_1\bv_1+c_3\bv_3$，因為 $\|\bx\| = 1$，$c_1^2 + c_3^2 = 1$。所以 $R_A(\bx) = \frac{\bx\trans A\bx}{\bx\trans \bx} = \bx\trans A\bx = c_1^2\times 1 + c_3^2\times 3$，當 $c_1 = 1$、$c_3 = 0$ 時，可得出最小值為 $1$。 ##### Exercise 8 查找各種資源，並描述柯朗–費雪定理（Courant–Fischer Theorem）。 :::warning - [x] $\bw_k-1$ --> $W_{k-1}$ - [x] 第二次的 $W$ 的下標有錯 - [x] $\bx\neq\bzero$ 是後面那邊的 - [x] 引用的時候要看完整，$\lambda_k$ 是第 $k$ 大還是第 $k$ 小？ ::: $Ans:$ 參考來源： [https://ccjou.wordpress.com/2010/03/16/hermitian-%E7%9F%A9%E9%99%A3%E7%89%B9%E5%BE%B5%E5%80%BC%E7%9A%84%E8%AE%8A%E5%8C%96%E7%95%8C%E5%AE%9A/] 若 $A$ 為一 Hermitian 矩陣， $$ \min _{ \substack{W_{k-1}} } \max _{\bx\perp W_{k-1} \\ \bx\neq\bzero} = \frac{\bx\trans A\bx}{\bx\trans \bx} =\lambda_k . $$ $$ \max _{ \substack{W_{n-k}} } \min _{\bx\perp W_{n-k} \\ \bx\neq\bzero} = \frac{\bx\trans A\bx}{\bx\trans \bx} =\lambda_k . $$ 其中，前者稱為 $\lambda_k$ 的極小化─極大化原則；後者稱為 $\lambda_k$ 的極大化─極小化原則。且 $\lambda_k$ 是指第 $k$ 大。 Courant-Fischer 定理的價值在於：即便不知道 Hermitian 矩陣的特徵向量，我們能從極小化─極大化原則，也就是透過變化界定，來決定 Hermitian 矩陣的特徵值，所以也稱作最小─最大 (min-max) 定理。 :::info 分數 = 6.5 :::

Syntax	Example	Reference
# Header	Header	基本排版
- Unordered List	Unordered List
1. Ordered List	Ordered List
- [ ] Todo List	Todo List
> Blockquote	Blockquote
Bold font	Bold font
Italics font	Italics font
~~Strikethrough~~	~~Strikethrough~~
19^th^	19^th
H~2~O	H₂O
++Inserted text++	Inserted text
==Marked text==	Marked text
[link text](https:// "title")	Link
![image alt](https:// "title")	Image
`Code`	`Code`	在筆記中貼入程式碼
```javascript var i = 0; ```	`var i = 0;`
:smile:		Emoji list
{%youtube youtube_id %}	Externals
$L^aT_eX$	L^aT_eX
:::info This is a alert area. :::	This is a alert area.