changed 4 years ago
Linked with GitHub

體驗奇異值分解

Creative Commons License
This work by Jephian Lin is licensed under a Creative Commons Attribution 4.0 International License.

\(\newcommand{\trans}{^\top} \newcommand{\adj}{^{\rm adj}} \newcommand{\cof}{^{\rm cof}} \newcommand{\inp}[2]{\left\langle#1,#2\right\rangle} \newcommand{\dunion}{\mathbin{\dot\cup}} \newcommand{\bzero}{\mathbf{0}} \newcommand{\bone}{\mathbf{1}} \newcommand{\ba}{\mathbf{a}} \newcommand{\bb}{\mathbf{b}} \newcommand{\bc}{\mathbf{c}} \newcommand{\bd}{\mathbf{d}} \newcommand{\be}{\mathbf{e}} \newcommand{\bh}{\mathbf{h}} \newcommand{\bp}{\mathbf{p}} \newcommand{\bq}{\mathbf{q}} \newcommand{\br}{\mathbf{r}} \newcommand{\bx}{\mathbf{x}} \newcommand{\by}{\mathbf{y}} \newcommand{\bz}{\mathbf{z}} \newcommand{\bu}{\mathbf{u}} \newcommand{\bv}{\mathbf{v}} \newcommand{\bw}{\mathbf{w}} \newcommand{\tr}{\operatorname{tr}} \newcommand{\nul}{\operatorname{null}} \newcommand{\rank}{\operatorname{rank}} %\newcommand{\ker}{\operatorname{ker}} \newcommand{\range}{\operatorname{range}} \newcommand{\Col}{\operatorname{Col}} \newcommand{\Row}{\operatorname{Row}} \newcommand{\spec}{\operatorname{spec}} \newcommand{\vspan}{\operatorname{span}} \newcommand{\Vol}{\operatorname{Vol}} \newcommand{\sgn}{\operatorname{sgn}} \newcommand{\idmap}{\operatorname{id}} \newcommand{\am}{\operatorname{am}} \newcommand{\gm}{\operatorname{gm}} \newcommand{\mult}{\operatorname{mult}} \newcommand{\iner}{\operatorname{iner}}\)

from lingeo import random_int_list

Main idea

Let \(A\) be an \(m\times n\) matrix and
\(f_A : \mathbb{R}^n\rightarrow\mathbb{R}^n\) the corresponding linear function defined by \(f({\bf v}) = A{\bf v}\). Let \(\mathcal{E}_n\) and \(\mathcal{E}_m\) be the standard bases of \(\mathbb{R}^n\) and \(\mathbb{R}^m\), respectively.
Then \([f_A] = [f_A]_{\mathcal{E}_n}^{\mathcal{E}_m} = A\).

Let \(\alpha\) be another basis of \(\mathcal{R}^n\) and \(V = [\operatorname{id}]_\alpha^{\mathcal{E}_n}\).
Let \(\beta\) be another basis of \(\mathcal{R}^m\) and \(U = [\operatorname{id}]_\beta^{\mathcal{E}_m}\).
Then \([f_A]_\alpha^\beta = U^{-1}AV\) and \(A = U[f_A]_\alpha^\beta V^{-1}\).

Singular value decomposition

Let \(A\) be an \(m\times n\) matrix.
Then there are orthonormal bases \(\alpha\) and \(\beta\) of \(\mathbb{R}^n\) and \(\mathcal{R}^m\), respectively, such that \[[f_A]_\alpha^\beta = \Sigma = \begin{bmatrix} \operatorname{diag}(\sigma_1, \ldots, \sigma_r) & O_{r,n-r} \\ O_{m-r,r} & O_{m-r,n-r} \end{bmatrix}, \] where \(\sigma_1\geq\cdots\geq\sigma_r\) and \(\operatorname{diag}(\sigma_1,\ldots,\sigma_r)\) is the diagonal matrix with the given diagonal entries.

That is, there are \(m\times m\) and \(n\times n\) orthogonal matrices \(U\) and \(V\) such that \(U^\top AV = \Sigma\) and \(A = U\Sigma V^\top\).

Let \(\alpha = \{ {\bf v}_1, \ldots, {\bf v}_n \}\) and \(\beta = \{ {\bf u}_1, \ldots, {\bf u}_m \}\) be the bases in the singular value decomposition.
Then \(V\) is the matrix whose columns are vectors in \(\alpha\), and
\(U\) is the matrix whose columns are vectors in \(\beta\).
Since \(\alpha\) and \(\beta\) orthogonal orthonormal, \(U\) and \(V\) are orthogonal and \(U^{-1} = U^\top\) and \(V^{-1} = V^\top\).

By examining \(AV = U\Sigma\), we have
\[AV = A\begin{bmatrix} | & ~ & | \\ {\bf v}_1 & \cdots & {\bf v}_n \\ | & ~ & | \end{bmatrix} = U\Sigma = \begin{bmatrix} | & ~ & | \\ {\bf v}_1 & \cdots & {\bf v}_n \\ | & ~ & | \end{bmatrix} \begin{bmatrix} \operatorname{diag}(\sigma_1, \ldots, \sigma_r) & O_{r,n-r} \\ O_{m-r,r} & O_{m-r,n-r} \end{bmatrix} = \begin{bmatrix} | & ~ & | \\ \sigma_1{\bf v}_1 & \cdots & \sigma_r{\bf v}_r & O_{n,n-r} \\ | & ~ & | \end{bmatrix}. \] Therefore, \(A{\bf v}_i = \lambda_i {\bf u}_i\) for \(i = 1,\ldots, r\) and \(A{\bf v}_i = {\bf 0}\) for \(i = r+1,\ldots, n\).

The values \(\sigma_1,\ldots,\sigma_r\) are called the sigular values of \(A\).
Singular values are necessarily positive, and the number of singular values is \(r = \operatorname{rank}(A)\).

Side stories

  • image compression

Experiments

Exercise 1

執行以下程式碼。
\(f : \mathbb{R}^4 \rightarrow \mathbb{R}^3\) 為一線性函數。
\(\alpha = \{{\bf v}_1,\ldots, {\bf v}_4\}\)\(\beta = \{{\bf u}_1,\ldots, {\bf u}_3\}\) 分別為 \(\mathbb{R}^4\)\(\mathbb{R}^3\) 的一組單位長垂直基底。
已知 \(\Sigma = [f]_\alpha^\beta\)

### code
set_random_seed(0)
print_ans = False
r = choice([1,2,3])
sigs = list(map(lambda k: abs(k), random_int_list(r)))
sigs.sort(reverse=True)

m,n = 3,4
Sigma = zero_matrix(m,n)
for i in range(r):
    Sigma[i,i] = sigs[i]

print("Sigma =")
show(Sigma)

if print_ans:
    for i in range(n):
        if i < r:
            print("A v%s = %s u%s"%(i+1, sigs[i], i+1))
        else:
            print("A v%s = 0"%(i+1))
    print("range(f_A) = span({ u1, ..., u%s })"%r)
    print("ker(f_A) = span({ v%s, ..., vn })"%(r+1))
    print("rank(f_A) =", r)
    print("null(f_A) =", n-r)
Exercise 1(a)

利用 \(\Sigma\) 說明 \(f_A\) 的作用。

Exercise 1(b)

找出 \(\operatorname{range}(f_A)\)\(\operatorname{ker}(f_A)\)\(\operatorname{rank}(f_A)\)\(\operatorname{null}(f_A)\)

Exercises

Exercise 2

\(A\) 為一 \(3\times 4\) 矩陣、
\(\alpha = \{ {\bf v}_1,\ldots,{\bf v}_4 \}\)\(\mathbb{R}^4\) 的一組基底、
\(\beta = \{ {\bf u}_1,\ldots,{\bf u}_3 \}\)\(\mathbb{R}^3\) 的一組基底。
已知
\[[f_A]_\beta^\beta = \begin{bmatrix} 3 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 5 & 0 \\ \end{bmatrix}. \]
\(A{\bf v}_1\)\(A{\bf v}_2\)\(A{\bf v}_3\)、及 \(A({\bf v}_1 + {\bf v}_2 + {\bf v}_3)\) 分別寫成 \(\beta\) 的線性組合。

Exercise 3

\[A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ \end{bmatrix}, \] \(\beta = \{ {\bf v}_1, \ldots, {\bf v}_4 \}\)
\[\begin{bmatrix} \frac{1}{2} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{12}} \\ \frac{1}{2} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{12}} \\ \frac{1}{2} & 0 & -\frac{2}{\sqrt{6}} & \frac{1}{\sqrt{12}} \\ \frac{1}{2} & 0 & 0 & -\frac{3}{\sqrt{12}} \\ \end{bmatrix} \] 的行向量集合、
\(\beta = \{ {\bf u}_1, \ldots, {\bf u}_3 \}\)
\[\begin{bmatrix} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & 0 & -\frac{2}{\sqrt{6}} \\ \end{bmatrix} \] 的行向量集合。

Exercise 3(a)

寫出 \([f_A]_\alpha^\beta\) 並說明 \(f_A\) 的作用。

Exercise 3(b)

找出兩個垂直矩陣 \(U\)\(V\)
以及一個只有在 \(i,i\)-項有值的矩陣 \(\Sigma\) 使得 \(A = U\Sigma V^\top\)

Exercise 4

\(A\)\(m\times n\) 矩陣且 \(\operatorname{rank}(A) = r\)
\(A = U\Sigma V^\top\) 為其奇異值分解﹐其中 \(\sigma_1\geq \cdots \geq \sigma_r > 0\) 為其奇異值。
\({\bf u}_1,\ldots,{\bf u}_{m}\)\(U\) 的各行向量、
\({\bf v}_0, \ldots, {\bf v}_{n-1}\)\(V\) 的各行向量。

Exercise 4(a)

說明 \[A = \sum_{i=1}^{r} \sigma_i{\bf u}_i{\bf v}_i^\top. \]

Exercise 4(b)

定義 \(\|M\| = \sqrt{\operatorname{M^\top M}}\) 為矩陣的長度。
說明 \(\|M\|^2\) 就是 \(M\) 的各項平方和。

Exercise 4©

說明對任何 \(i = 1,\ldots, r\) 都有 \(\|{\bf u}_i{\bf v}_i\| = 1\)

因此我們可以令 \(A' = \sum_{i=1}^{k} \sigma_i{\bf u}_i{\bf v}_i^\top\) 當作 \(A\) 的逼近﹐而且有 \(\|A - A'\| \leq \sum_{k+1}^r \sigma_i\)
(矩陣長度也符何三角不等式。)

Exercise 4(d)

每張圖片的亮度都可視為一個矩陣。
\(A\) 為原圖﹐則我們可以用 \(A'\) 來降低原圖的大小。
執行以下程式碼。
試試看選用各個 \(k\) 的差異。
算一下把 \(A'\) 儲存起來要幾個數字。

import numpy as np
import matplotlib.pyplot as plt
from scipy import linalg as LA

arr = plt.imread('incrediville-side.jpg').mean(axis=-1)
plt.imshow(arr, cmap='Greys_r', vmin=0, vmax=255)

U,s,Vh = LA.svd(arr) ### might take long
k = 50 ### between 1 to 3120
arr_new = U[:,:k].dot(np.diag(s[:k])).dot(Vh[:k,:])
plt.imshow(arr_new, cmap='Greys_r', vmin=0, vmax=255)
Select a repo