低秩近似法與影像壓縮

tags: `線性代數` `matlab`

撰寫時間 : 2021/09/01

大綱

我的報告內容為SVD與資料分析這門課上課內容的整理與延伸 - 低秩近似法。第一部分是數學理論，這是教授在第三天的上課內容；第二部分Matlab實踐是將第一部分的數學理論轉換為Matlab程式碼，在教授提供原文書程式碼的基礎之上，我另外加上奇異值大小分布作圖，相對誤差計算與作圖，與壓縮RGB影像 - 將R、G、B三色分離圖層個別做SVD分解並轉換為秩數較低的矩陣，最後再將合併R、G、B三色圖層即可得到壓縮後的RGB影像。

數學理論

矩陣的Frobenius norm與Spectral norm

在討論低秩近似法前，我們要找到如何代表兩矩陣"長得很像"的衡量標準?直覺來說就是找兩矩陣的"長度"，"長度"講為較專業的術語就是範數(norm)，norm符合兩個性質，一是除了零矩陣的norm等於0，其他矩陣的norm必定大於0，二是符合三角不等式

‖ A + B \leq ‖ A ‖ + ‖ B ‖, \forall A, B \in R^{m \times n}

，norm在定義上百百種，這邊只討論2種norm

Frobenius norm
$‖ A ‖_{F} = \sqrt{\sum_{i = 1}^{m} \sum_{j = 1}^{n} | a_{i j} |^{2}}$ 將矩陣所有的entries平方相加最後開根號。
經推導，在原矩陣右邊或是左邊乘上正交矩陣，不改變Frobenius norm，因此Frobenius norm即為將
$A$ 做SVD分解中所有奇異值平方相加開根號。
$‖ A ‖_{F} = ‖ {U Σ V}^{T} ‖_{F} = ‖ Σ ‖_{F} = \sqrt{σ_{1}^{2} + \dots + σ_{n}^{2}}$
Spectral norm / two norm
$‖ A ‖_{2} = max_{x \neq 0} \frac{‖ Ax ‖_{2}}{‖ x ‖_{2}}$ 線性變換前後向量長度的比例，取其中的最大值。另一種看法，可令
$‖ x ‖ = 1$ 就是在單位圓上做線性轉換後得到橢圓等形狀，取其離原點最大的長度。
經推導，Spectral norm為將
$A$ 做SVD分解中最大的奇異值
$σ_{1}$ 。
$‖ A ‖_{2} = σ_{1}$

SVD的外積展開式與低秩近似法

SVD分解

A = {U Σ V}^{T}

中

A

數值只由

U

前

r

項column vector，

Σ

前

r \times r

的方陣決定，

V^{T}

前

r

項row vector所決定。因此若資料以SVD形式儲存，可進一步簡化為reduced SVD。

A_{r} = U_{r} Σ_{r} V_{r}^{T} A_{m \times n} = U_{m \times r} Σ_{r \times r} (V^{T})_{r \times n}

由第2天計算SVD步驟可知，欲求正交矩陣

U, V

，是把單位正交集合(set)擴展到單位正交基底(basis)，實際上擴展哪些基底不重要，也不具有唯一性，只是為了要讓

U, V

的行向量單位正交而形成正交矩陣而已。

如此就可以定義SVD的前

k

項外積展開式，取前將奇異值一一取出展開

A_{k} = σ_{1} u_{1} v_{1}^{T} + \dots + σ_{k} u_{k} v_{k}^{T}, i \leq k \leq r

其秩數

rank (A_{k}) = k

，根據Eckart-Young Theorem，在同一秩數下

rank (B) = k

，最接近原本

A

矩陣的距離就是前

k

項外積展開式，這就是低秩近似法(low-rank approximation)最佳化的問題。

For any B \in R^{m \times n} with rank (B) = k We have ‖ A - B ‖_{F} \geq ‖ A - A_{k} ‖_{F} and ‖ A - B ‖_{2} \geq ‖ A - A_{k} ‖_{2}

因此在同一秩數

k

下，最靠近

A

的矩陣就是前

k

項外積展開式

A_{k}

。

最後就可計算相對誤差的最小值

\begin{aligned} \frac{‖ A - B ‖_{F}}{‖ A ‖_{F}} \geq \frac{‖ A - A_{k} ‖_{F}}{‖ A ‖_{F}} = \frac{\sqrt{σ_{k + 1}^{2} + \dots + σ_{r}^{2}}}{\sqrt{σ_{1}^{2} + \dots + σ_{r}^{2}}} \\ \frac{‖ A - B ‖_{2}}{‖ A ‖_{2}} \geq \frac{‖ A - A_{k} ‖_{2}}{‖ A ‖_{2}} = \frac{σ_{k + 1}}{σ_{1}} \end{aligned}

Matlab程式實踐

給定輸入圖片lycoris_recoil.jpg。

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

clc, clear, close all
%% read file and change data type
filename = 'lycoris_recoil.jpg';
imfinfo(filename) % image information
rare_data = imread(filename); % data type : uint8
cal_data = double(rare_data); % data type : double

size_A = size(cal_data); % size of A
imshow(cal_data/255); % imshow(RGB) with RGB in [0,1]
title(['original image with ',num2str(size_A(2)),'$\times$',num2str(size_A(1)),' pixels'],'interpreter','latex','FontSize',15);
axis equal; % maintain aspect ratio of x,y
axis off; % turn off axis

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

首先讀取圖檔，並用指令imfinfo()查看圖檔資訊為

1920 \times 1080

像素，位元深度(bit depth)為24位元，色型為true color，代表有3條通道(channel)分別為紅、綠、藍，每一條通道的位元深度為8位元。而圖片預設儲存的資料型態為無號(unsigned) 8位元整數(uint8)，在計算機運算時，將其轉換為雙精度浮點數運算(double)。由於指令imshow要顯示true color影像，若是資料型態是浮點數則數值限制在

[0, 1]

，因此除上

255

，將

[0, 255]

映射至

[0, 1]

。

%% decompose coloered images into RGB layers
cal_data_1 = cal_data(:,:,1);
cal_data_2 = cal_data(:,:,2);
cal_data_3 = cal_data(:,:,3);

subplot(2,2,1);
imshow(cal_data/255);
title('colored image $\mathbf{A}$','interpreter','latex','FontSize',15);

subplot(2,2,2);
imshow(cal_data_1,[0,255]);
title('red layer of image $\mathbf{R}$','interpreter','latex','FontSize',15);
colorbar

subplot(2,2,3);
imshow(cal_data_2,[0,255]);
title('green layer of image $\mathbf{G}$','interpreter','latex','FontSize',15);
colorbar

subplot(2,2,4);
imshow(cal_data_3,[0,255]);
title('blue layer of image $\mathbf{B}$','interpreter','latex','FontSize',15);
colorbar

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

將true color影像的R、G、B通道分解，也就是把3維矩陣降至3個2維矩陣

A \in R^{1080 \times 1920 \times 3} \to R, G, B \in R^{1080 \times 1920}

需要注意灰階影像顯示時，預設數值範圍是

[0, 1]

，因此在指令imshow後面要給與要對應灰階的範圍更改為

[0, 255]

。

%% singular value decomposition
[U1,S1,V1] = svd(cal_data_1);
[U2,S2,V2] = svd(cal_data_2);
[U3,S3,V3] = svd(cal_data_3);

分別對矩陣

R, G, B

做SVD分解為

U Σ V^{T}

的形式。

%% plot of sigular value
sigular_value_number = min(size(S1));
x_axis = 1:sigular_value_number;

sigular_value_1 = zeros(1, length(x_axis));
sigular_value_2 = zeros(1, length(x_axis));
sigular_value_3 = zeros(1, length(x_axis));

for i = 1:sigular_value_number % take diagonal entries
    sigular_value_1(1,i) = S1(i,i);
    sigular_value_2(1,i) = S2(i,i);
    sigular_value_3(1,i) = S3(i,i);
end

subplot(3,1,1);
semilogy(x_axis, sigular_value_1, 'LineWidth',2);
title(['sigualr value $\sigma_i$ of $\mathbf{R}$ for i = 1, 2, ...', num2str(sigular_value_number)],'interpreter','latex','FontSize',15);
xlabel('i-axis','FontSize',10);
ylabel('sigualr value \sigma_i','FontSize',10);
grid on

subplot(3,1,2);
semilogy(x_axis, sigular_value_2, 'LineWidth',2);
title(['sigualr value $\sigma_i$ of $\mathbf{G}$ for i = 1, 2, ...', num2str(sigular_value_number)],'interpreter','latex','FontSize',15);
xlabel('i-axis','FontSize',10);
ylabel('sigualr value \sigma_i','FontSize',10);
grid on

subplot(3,1,3);
semilogy(x_axis, sigular_value_3, 'LineWidth',2);
title(['sigualr value $\sigma_i$ of $\mathbf{B}$ for i = 1, 2, ...', num2str(sigular_value_number)],'interpreter','latex','FontSize',15);
xlabel('i-axis','FontSize',10);
ylabel('sigualr value \sigma_i','FontSize',10);
grid on

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Σ = {[\begin{matrix} σ_{1} & \dots & 0 & 0 & \dots & 0 \\ ⋮ & ⋱ & ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & \dots & σ_{1080} & 0 & \dots & 0 \end{matrix}]}_{1080 \times 1920}, where σ_{1} \geq σ_{2} \geq \dots \geq σ_{1080} \geq 0

矩陣

R, G, B

的奇異值

σ_{i}

數值對index

i

作圖，可以發現以此張圖來說，R、G、B圖層的奇異值

σ_{i}

分布類似；需要注意

σ_{i}

數值是對數

\log

作圖，因此奇異值大小變化量劇烈，代表取足夠前

k

項的SVD的外積展開式，就幾乎可以涵蓋整張圖的所有資訊

A = \underset{encode almost image message}{\underset{⏟}{σ_{1} u_{1} v_{1}^{T} + σ_{2} u_{2} v_{2}^{T} + \dots + σ_{k} u_{k} v_{k}^{T}}} + \dots + σ_{1080} u_{1080} v_{1080}^{T}, k < 1080

%% minimal relative rate of error
subplot(3,1,1);
error_1 = sigular_value_1(2:end) / S1(1,1);
semilogy(x_axis(1:end-1), error_1, 'LineWidth',2);
title('relative rate of error $\frac{\| \mathbf{R} - \mathbf{R}_k \|_2}{\| \mathbf{R} \|_2} = \frac{\sigma_{k + 1}}{\sigma_1}$','interpreter','latex','FontSize',15);
xlabel('k-axis','FontSize',10);
ylabel('A_k error','FontSize',10);
grid on

subplot(3,1,2);
error_2 = sigular_value_2(2:end) / S2(1,1);
semilogy(x_axis(1:end-1), error_2, 'LineWidth',2);
title('relative rate of error $\frac{\| \mathbf{G} - \mathbf{G}_k \|_2}{\| \mathbf{G} \|_2} = \frac{\sigma_{k + 1}}{\sigma_1}$','interpreter','latex','FontSize',15);
xlabel('k-axis','FontSize',10);
ylabel('A_k error','FontSize',10);
grid on

subplot(3,1,3);
error_3 = sigular_value_3(2:end) / S3(1,1);
semilogy(x_axis(1:end-1), error_3, 'LineWidth',2);
title('relative rate of error $\frac{\| \mathbf{B} - \mathbf{B}_k \|_2}{\| \mathbf{B} \|_2} = \frac{\sigma_{k + 1}}{\sigma_1}$','interpreter','latex','FontSize',15);
xlabel('k-axis','FontSize',10);
ylabel('A_k error','FontSize',10);
grid on

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

至於這個

k

項要取多少?根據前面數學理論中矩陣之間的"距離"這個概念，在此使用norm的其中一種定義 - 2-norm來求得相對誤差值。

%% compressed with factor of k
k = 30; % reduced to number of singlular values
S_re_1 = zeros(size_A(1),size_A(2));
S_re_2 = zeros(size_A(1),size_A(2));
S_re_3 = zeros(size_A(1),size_A(2));
for ir = 1:k
    S_re_1(ir,ir) = S1(ir,ir); % compress red   layer of image
    S_re_2(ir,ir) = S2(ir,ir); % compress green layer of image
    S_re_3(ir,ir) = S3(ir,ir); % compress blue  layer of image
end
A_re_1 = U1*S_re_1*V1';
A_re_2 = U2*S_re_2*V2';
A_re_3 = U3*S_re_3*V3';

我選擇把R、G、B的相對誤差都限制在

2 %

以內，因此選擇

k = 31

，此時R、G、B誤差為

\begin{aligned} relative rate error of red layer & = \frac{‖ R - R_{k} ‖_{2}}{‖ R ‖_{2}} \approx 0.0138466 < 2 % \\ relative rate error of green layer & = \frac{‖ G - G_{k} ‖_{2}}{‖ G ‖_{2}} \approx 0.0173546 < 2 % \\ relative rate error of blue layer & = \frac{‖ B - B_{k} ‖_{2}}{‖ B ‖_{2}} \approx 0.0192512 < 2 % \end{aligned}

%% combine RGB layers into colored images
figure;
A_re = uint8(zeros(size_A(1), size_A(2), size_A(3)));
A_re(:,:,1) = uint8(A_re_1);
A_re(:,:,2) = uint8(A_re_2);
A_re(:,:,3) = uint8(A_re_3);
imshow(A_re);
title('compressed colored image $\mathbf{R}_k, \mathbf{G}_k,\mathbf{B}_k \to \mathbf{A}_k$ with low rank approximation','interpreter','latex','FontSize',15);
axis equal;
axis off;

最後合併壓縮後的

R, G, B

矩陣回原來的影像。可以發現原本原本

rank (A) = 1080

的矩陣，現在只需要取

k = 31

，使矩陣秩數只要

rank (A_{k}) = 31

，就可以看出原本影像大部分的特徵。

%% each RGB layers after low rank approximation
subplot(2,2,1);
imshow(A_re);
title('colored image $\mathbf{A}_k$','interpreter','latex','FontSize',15);

subplot(2,2,2);
imshow(uint8(A_re_1));
colorbar
title('red layer of image $\mathbf{R}_k$','interpreter','latex','FontSize',15);

subplot(2,2,3);
imshow(uint8(A_re_2));
colorbar
title('green layer of image $\mathbf{G}_k$','interpreter','latex','FontSize',15);

subplot(2,2,4);
imshow(uint8(A_re_3));
colorbar
title('blue layer of image $\mathbf{B}_k$','interpreter','latex','FontSize',15);

把R、G、B壓縮影像印出來，看個別R、G、B圖層壓縮的效果。

%% comparison of different rank k
sgtitle('SVD application - low rank approximation $\mathbf{A}_k = \mathbf{U}_k\mathbf{\Sigma}_k\mathbf{V}_k^T$','interpreter','latex','FontSize',20);
k = [5,10,50,100, 200]; % number of singlular values
for i = 1:length(k)
    S_re_1 = zeros(size_A(1),size_A(2));
    S_re_2 = zeros(size_A(1),size_A(2));
    S_re_3 = zeros(size_A(1),size_A(2));
    for ir = 1:k(i)
        S_re_1(ir,ir) = S1(ir,ir); % compress red   layer of image
        S_re_2(ir,ir) = S2(ir,ir); % compress green layer of image
        S_re_3(ir,ir) = S3(ir,ir); % compress blue  layer of image
    end
    A_re_1 = U1*S_re_1*V1';
    A_re_2 = U2*S_re_2*V2';
    A_re_3 = U3*S_re_3*V3';
    
    A_re = uint8(zeros(size_A(1), size_A(2), size_A(3)));
    A_re(:,:,1) = uint8(A_re_1);
    A_re(:,:,2) = uint8(A_re_2);
    A_re(:,:,3) = uint8(A_re_3);
    subplot(2,3,i);
    imshow(A_re);
    title(['$\mathrm{rank}(\mathbf{A}_k) =$ ',num2str(k(i))],'interpreter','latex','FontSize',15);
end
subplot(2,3,6);
imshow(cal_data/255);
title(['original image with ',num2str(size_A(2)),'$\times$',num2str(size_A(1)),' pixels'],'interpreter','latex','FontSize',15);

最後寫一個for loop，觀察取不同

k

，也就是不同的秩數下所得到的效果，可以發現在

rank (A) = 100

，以目前這種顯示比例，跟原圖比較幾乎看不出有什麼區別，幾乎能保留影像大部分的特徵，而使用秩數較低的矩陣同時又能減少大量資料儲存，這就是low rank approximation在影像壓縮的應用與意義。

低秩近似法與影像壓縮

tags: 線性代數 matlab

撰寫時間 : 2021/09/01

大綱

數學理論

矩陣的Frobenius norm與Spectral norm

SVD的外積展開式與低秩近似法

Matlab程式實踐

Read more

考研筆記 - 計算機組織 (大碩張凡)

線性代數的最小平方/範數問題與與主成分分析(PCA)

C++的inline, extern, static關鍵字

考研筆記 - 機率

tags: `線性代數` `matlab`