owned this note
owned this note
Published
Linked with GitHub
# 體驗奇異值分解

This work by Jephian Lin is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).
{%hackmd 5xqeIJ7VRCGBfLtfMi0_IQ %}
```python
from lingeo import random_int_list
```
## Main idea
Let $A$ be an $m\times n$ matrix and
$f_A : \mathbb{R}^n\rightarrow\mathbb{R}^n$ the corresponding linear function defined by $f({\bf v}) = A{\bf v}$. Let $\mathcal{E}_n$ and $\mathcal{E}_m$ be the standard bases of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively.
Then $[f_A] = [f_A]_{\mathcal{E}_n}^{\mathcal{E}_m} = A$.
Let $\alpha$ be another basis of $\mathcal{R}^n$ and $V = [\operatorname{id}]_\alpha^{\mathcal{E}_n}$.
Let $\beta$ be another basis of $\mathcal{R}^m$ and $U = [\operatorname{id}]_\beta^{\mathcal{E}_m}$.
Then $[f_A]_\alpha^\beta = U^{-1}AV$ and $A = U[f_A]_\alpha^\beta V^{-1}$.
##### Singular value decomposition
Let $A$ be an $m\times n$ matrix.
Then there are orthonormal bases $\alpha$ and $\beta$ of $\mathbb{R}^n$ and $\mathcal{R}^m$, respectively, such that
$$[f_A]_\alpha^\beta = \Sigma = \begin{bmatrix}
\operatorname{diag}(\sigma_1, \ldots, \sigma_r) & O_{r,n-r} \\
O_{m-r,r} & O_{m-r,n-r}
\end{bmatrix},
$$
where $\sigma_1\geq\cdots\geq\sigma_r$ and $\operatorname{diag}(\sigma_1,\ldots,\sigma_r)$ is the diagonal matrix with the given diagonal entries.
That is, there are $m\times m$ and $n\times n$ orthogonal matrices $U$ and $V$ such that $U^\top AV = \Sigma$ and $A = U\Sigma V^\top$.
Let $\alpha = \{ {\bf v}_1, \ldots, {\bf v}_n \}$ and $\beta = \{ {\bf u}_1, \ldots, {\bf u}_m \}$ be the bases in the singular value decomposition.
Then $V$ is the matrix whose columns are vectors in $\alpha$, and
$U$ is the matrix whose columns are vectors in $\beta$.
Since $\alpha$ and $\beta$ orthogonal orthonormal, $U$ and $V$ are orthogonal and $U^{-1} = U^\top$ and $V^{-1} = V^\top$.
By examining $AV = U\Sigma$, we have
$$AV =
A\begin{bmatrix}
| & ~ & | \\
{\bf v}_1 & \cdots & {\bf v}_n \\
| & ~ & |
\end{bmatrix} =
U\Sigma =
\begin{bmatrix}
| & ~ & | \\
{\bf v}_1 & \cdots & {\bf v}_n \\
| & ~ & |
\end{bmatrix}
\begin{bmatrix}
\operatorname{diag}(\sigma_1, \ldots, \sigma_r) & O_{r,n-r} \\
O_{m-r,r} & O_{m-r,n-r}
\end{bmatrix} =
\begin{bmatrix}
| & ~ & | \\
\sigma_1{\bf v}_1 & \cdots & \sigma_r{\bf v}_r & O_{n,n-r} \\
| & ~ & |
\end{bmatrix}.
$$
Therefore, $A{\bf v}_i = \lambda_i {\bf u}_i$ for $i = 1,\ldots, r$ and $A{\bf v}_i = {\bf 0}$ for $i = r+1,\ldots, n$.
The values $\sigma_1,\ldots,\sigma_r$ are called the **sigular values** of $A$.
Singular values are necessarily positive, and the number of singular values is $r = \operatorname{rank}(A)$.
## Side stories
- image compression
## Experiments
##### Exercise 1
執行以下程式碼。
令 $f : \mathbb{R}^4 \rightarrow \mathbb{R}^3$ 為一線性函數。
令 $\alpha = \{{\bf v}_1,\ldots, {\bf v}_4\}$ 及 $\beta = \{{\bf u}_1,\ldots, {\bf u}_3\}$ 分別為 $\mathbb{R}^4$ 和 $\mathbb{R}^3$ 的一組單位長垂直基底。
已知 $\Sigma = [f]_\alpha^\beta$。
```python
### code
set_random_seed(0)
print_ans = False
r = choice([1,2,3])
sigs = list(map(lambda k: abs(k), random_int_list(r)))
sigs.sort(reverse=True)
m,n = 3,4
Sigma = zero_matrix(m,n)
for i in range(r):
Sigma[i,i] = sigs[i]
print("Sigma =")
show(Sigma)
if print_ans:
for i in range(n):
if i < r:
print("A v%s = %s u%s"%(i+1, sigs[i], i+1))
else:
print("A v%s = 0"%(i+1))
print("range(f_A) = span({ u1, ..., u%s })"%r)
print("ker(f_A) = span({ v%s, ..., vn })"%(r+1))
print("rank(f_A) =", r)
print("null(f_A) =", n-r)
```
##### Exercise 1(a)
利用 $\Sigma$ 說明 $f_A$ 的作用。
##### Exercise 1(b)
找出 $\operatorname{range}(f_A)$、$\operatorname{ker}(f_A)$、$\operatorname{rank}(f_A)$、$\operatorname{null}(f_A)$。
## Exercises
##### Exercise 2
令 $A$ 為一 $3\times 4$ 矩陣、
$\alpha = \{ {\bf v}_1,\ldots,{\bf v}_4 \}$ 為 $\mathbb{R}^4$ 的一組基底、
$\beta = \{ {\bf u}_1,\ldots,{\bf u}_3 \}$ 為 $\mathbb{R}^3$ 的一組基底。
已知
$$[f_A]_\beta^\beta = \begin{bmatrix}
3 & 0 & 0 & 0 \\
0 & 4 & 0 & 0 \\
0 & 0 & 5 & 0 \\
\end{bmatrix}.
$$
將 $A{\bf v}_1$、$A{\bf v}_2$、$A{\bf v}_3$、及 $A({\bf v}_1 + {\bf v}_2 + {\bf v}_3)$ 分別寫成 $\beta$ 的線性組合。
##### Exercise 3
令
$$A = \begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 \\
\end{bmatrix},
$$
$\beta = \{ {\bf v}_1, \ldots, {\bf v}_4 \}$ 為
$$\begin{bmatrix}
\frac{1}{2} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{12}} \\
\frac{1}{2} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{12}} \\
\frac{1}{2} & 0 & -\frac{2}{\sqrt{6}} & \frac{1}{\sqrt{12}} \\
\frac{1}{2} & 0 & 0 & -\frac{3}{\sqrt{12}} \\
\end{bmatrix}
$$
的行向量集合、
且 $\beta = \{ {\bf u}_1, \ldots, {\bf u}_3 \}$ 為
$$\begin{bmatrix}
\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\
\frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\
\frac{1}{\sqrt{3}} & 0 & -\frac{2}{\sqrt{6}} \\
\end{bmatrix}
$$
的行向量集合。
##### Exercise 3(a)
寫出 $[f_A]_\alpha^\beta$ 並說明 $f_A$ 的作用。
##### Exercise 3(b)
找出兩個垂直矩陣 $U$ 和 $V$
以及一個只有在 $i,i$-項有值的矩陣 $\Sigma$ 使得 $A = U\Sigma V^\top$。
###### Exercise 4
令 $A$ 為 $m\times n$ 矩陣且 $\operatorname{rank}(A) = r$。
令 $A = U\Sigma V^\top$ 為其奇異值分解﹐其中 $\sigma_1\geq \cdots \geq \sigma_r > 0$ 為其奇異值。
令 ${\bf u}_1,\ldots,{\bf u}_{m}$ 為 $U$ 的各行向量、
${\bf v}_0, \ldots, {\bf v}_{n-1}$ 為 $V$ 的各行向量。
##### Exercise 4(a)
說明
$$A = \sum_{i=1}^{r} \sigma_i{\bf u}_i{\bf v}_i^\top.
$$
##### Exercise 4(b)
定義 $\|M\| = \sqrt{\operatorname{M^\top M}}$ 為矩陣的長度。
說明 $\|M\|^2$ 就是 $M$ 的各項平方和。
##### Exercise 4(c)
說明對任何 $i = 1,\ldots, r$ 都有 $\|{\bf u}_i{\bf v}_i\| = 1$。
因此我們可以令 $A' = \sum_{i=1}^{k} \sigma_i{\bf u}_i{\bf v}_i^\top$ 當作 $A$ 的逼近﹐而且有 $\|A - A'\| \leq \sum_{k+1}^r \sigma_i$。
(矩陣長度也符何三角不等式。)
##### Exercise 4(d)
每張圖片的亮度都可視為一個矩陣。
令 $A$ 為原圖﹐則我們可以用 $A'$ 來降低原圖的大小。
執行以下程式碼。
試試看選用各個 $k$ 的差異。
算一下把 $A'$ 儲存起來要幾個數字。
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import linalg as LA
arr = plt.imread('incrediville-side.jpg').mean(axis=-1)
plt.imshow(arr, cmap='Greys_r', vmin=0, vmax=255)
U,s,Vh = LA.svd(arr) ### might take long
```
```python
k = 50 ### between 1 to 3120
arr_new = U[:,:k].dot(np.diag(s[:k])).dot(Vh[:k,:])
plt.imshow(arr_new, cmap='Greys_r', vmin=0, vmax=255)
```