---
tags: linear algebra
---
# Intuition Behind Singular Value Decomposition (SVD)
## Definition
The singular value decomposition of an $m \times n$ matrix $T$ is a factorization of the form $T = U \Sigma V^*$, where $U$ is an $m \times m$ unitary matrix, $\Sigma$ is an $m \times n$ rectangular diagonal matrix with non-negative real numbers on the diagonal, and $V$ is an $n\times n$ unitary matrix. That is,
$$
T =
\begin{bmatrix}
u_1 \ldots u_m
\end{bmatrix}
\begin{bmatrix}
\sigma_1 & 0 & 0 & \cdots & \cdots & 0 \\
0 & \sigma_2 & 0 & \cdots & \cdots & 0 \\
\vdots & \vdots & \ddots & & & \vdots\\
0 & 0 & \cdots & \sigma_m& \cdots & 0
\end{bmatrix}
\begin{bmatrix}
v_1^T \\
\vdots \\
v_n^T \\
\end{bmatrix}
$$
(For illustration, this expression assumes $n > m$.)
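As a concrete sketch (not part of the definition itself), the snippet below uses NumPy's `numpy.linalg.svd` to produce this factorization for a random real matrix with $n > m$, then checks the shapes and the reconstruction. The variable names and the choice of random matrix are just for this example.

```python
import numpy as np

# A random real 3x5 matrix (m = 3, n = 5, matching the n > m illustration above).
m, n = 3, 5
rng = np.random.default_rng(0)
T = rng.standard_normal((m, n))

# full_matrices=True returns U (m x m), the min(m, n) singular values, and V^* (n x n).
U, s, Vh = np.linalg.svd(T, full_matrices=True)

# Rebuild the m x n rectangular diagonal matrix Sigma from the singular values.
Sigma = np.zeros((m, n))
Sigma[:m, :m] = np.diag(s)

# Check T = U Sigma V^* and that U, V are orthogonal (unitary, since T is real).
assert np.allclose(T, U @ Sigma @ Vh)
assert np.allclose(U.T @ U, np.eye(m))
assert np.allclose(Vh @ Vh.T, np.eye(n))
```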
## Intuition
Suppose that $y = T(x)$. What the SVD does is decompose $y$ into components along an orthonormal basis of $\mathbb R^m$. To see this, we write $U = \begin{bmatrix} u_1 \cdots u_m \end{bmatrix}$ and use the fact that $U$ is unitary (i.e. the columns of $U$ constitute an orthonormal basis of $\mathbb R^m$), so $y$ can be expressed in this way:
$$
T(x) = y =
\begin{bmatrix}
u_1 \ldots u_m
\end{bmatrix}
\begin{bmatrix}
c_1 \\
\vdots \\
c_m \\
\end{bmatrix},
$$
where $c_i$ is the component of $y$ along $u_i$. That is, we hope to express every $y$ as a linear combination of a fixed orthonormal basis.
Moreover, from the definition of the SVD, we find that
$$
\begin{bmatrix} c_1 \\ \vdots \\ c_m \end{bmatrix}
= \Sigma V^*x =
\begin{bmatrix}
\sigma_1 & 0 & 0 & \cdots & \cdots & 0 \\
0 & \sigma_2 & 0 & \cdots & \cdots & 0 \\
\vdots & \vdots & \ddots & & & \vdots\\
0 & 0 & \cdots & \sigma_m& \cdots & 0
\end{bmatrix}
\begin{bmatrix}
v_1^T \\
\vdots \\
v_n^T \\
\end{bmatrix}x,
$$
where $\beta = \{v_1, \ldots, v_n \}$ is an orthonormal basis of $\mathbb R^n$. Because $\beta$ is orthonormal, multiplying by $V^*$ simply projects $x$ onto the basis vectors in $\beta$. Thus, we have
$$
V^* x =
\begin{bmatrix}
v_1^T \\
\vdots \\
v_n^T \\
\end{bmatrix}
\begin{bmatrix}
x_1 \\
\vdots \\
x_n
\end{bmatrix} =
\begin{bmatrix}
d_1 \\ \vdots \\ d_n
\end{bmatrix},
$$
where $d_i = \langle v_i, x \rangle$ is the component of $x$ along $v_i$. From the perspective of matrix multiplication, $c_i = \sigma_i d_i$ ***only*** depends on the value of $d_i$, which means $y$'s component along $u_i$ depends only on $x$'s component along $v_i$. In other words, if $i \neq j$, then $T(v_j) = \sigma_j u_j$ contributes nothing to $y$'s component along $u_i$; only $x$'s component along $v_i$ contributes to it. Based on this discussion and the fact that $\gamma = \{ u_1, \ldots ,u_m\}$ is orthonormal, we have
$$
\forall i \neq j, \langle T(v_i), T(v_j) \rangle = 0
$$
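The sketch below (a numerical check, repeating the NumPy setup from the earlier example) verifies both claims: $c_i = \sigma_i d_i$, and the images $T(v_i)$ are mutually orthogonal.

```python
import numpy as np

m, n = 3, 5
rng = np.random.default_rng(0)
T = rng.standard_normal((m, n))
U, s, Vh = np.linalg.svd(T, full_matrices=True)

x = rng.standard_normal(n)
y = T @ x

# d_i = <v_i, x>: coordinates of x in the basis {v_1, ..., v_n}.
d = Vh @ x
# c_i = <u_i, y>: coordinates of y in the basis {u_1, ..., u_m}.
c = U.T @ y

# c_i = sigma_i * d_i: the i-th output coordinate depends only on the i-th input coordinate.
assert np.allclose(c, s * d[:m])

# The images T(v_i) = sigma_i u_i are mutually orthogonal: (T V)^T (T V) is diagonal.
TV = T @ Vh.T
assert np.allclose(TV.T @ TV, np.diag(np.concatenate([s**2, np.zeros(n - m)])))
```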
The question is: how do we find such a basis $\beta$? Or, does $\beta$ even exist for an arbitrary matrix $T$?
## Justification
From the discussion in the previous section, we know that to find a singular value decomposition of $T$, we must find an *orthonormal* basis $\beta = \{v_1, \ldots, v_n\}$ such that the following equality holds for $i \neq j$:
$$
\langle T(v_i), T(v_j) \rangle = 0
$$
Now, we apply the defining property of the adjoint of $T$ (denoted by $T^*$):
$$
\langle T(v_i), T(v_j) \rangle = 0
\iff \langle v_i, T^* T(v_j) \rangle = 0
$$
(When $T$ is a real matrix, $T^* = T^t$, where $T^t$ denotes the transpose of $T$.) Observe that $T^*T$ is positive semi-definite (you can verify this from the definition: $\langle x, T^*T(x) \rangle = \langle T(x), T(x) \rangle \ge 0$). Since $T^*T$ is also self-adjoint, the spectral theorem gives an orthonormal basis $\{v'_1, \ldots, v'_n\}$ in which each $v'_i$ is an eigenvector of $T^*T$:
$$
T^*T(v'_i) = \lambda_i v'_i.
$$
Then we have for $i \neq j$,
$$
\langle T(v'_i), T(v'_j) \rangle = \langle v'_i, T^* T(v'_j) \rangle = \lambda_j \langle v'_i, v'_j \rangle= 0
$$
Great! $\{v'_1, \ldots, v'_n \}$ is exactly the basis $\{ v_1, \ldots , v_n\}$ we want! The $i$-th row of $V^*$ is just the (conjugate) transpose of $v_i$. The next step is to find the value of $\sigma_i$ in $\Sigma$:
$$
\begin{align}
& \langle \sigma_i u_i, \sigma_i u_i \rangle = \langle T(v_i), T(v_i) \rangle = \langle v_i, T^*T(v_i) \rangle = \langle v_i, \lambda_i v_i \rangle \\
\Rightarrow & \sigma_i^2 = \lambda_i \Rightarrow \sigma_i = \sqrt{\lambda_i}
\end{align}
$$
Since $T^*T$ is positive semi-definite, we have $\lambda_i \ge 0$. Thus, the value of $\sigma_i$ is non-negative.
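As a quick numerical sanity check (a sketch for a real matrix, so $T^* = T^t$), the eigenvalues of $T^tT$ returned by `numpy.linalg.eigvalsh` should square-root to the singular values returned by `numpy.linalg.svd`:

```python
import numpy as np

m, n = 3, 5
rng = np.random.default_rng(0)
T = rng.standard_normal((m, n))

# Eigenvalues of T^t T (T is real, so T^* = T^t); eigvalsh returns them in ascending order.
lam = np.linalg.eigvalsh(T.T @ T)
lam = np.clip(lam, 0.0, None)          # clean up tiny negative round-off values

# Singular values of T (returned in descending order).
s = np.linalg.svd(T, compute_uv=False)

# sigma_i = sqrt(lambda_i): the m largest eigenvalues square-root to the singular values.
assert np.allclose(np.sort(s), np.sqrt(lam[-m:]))
```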
The last step is to find $U$. This is simple because we know $T(v_i) = \sigma_i u_i \Rightarrow u_i = \frac{1}{\sigma_i} T(v_i) = \frac{1}{\sqrt{\lambda_i}} T(v_i)$ whenever $\sigma_i > 0$. Note that if the rank of $T$ (the dimension of $R(T)$), call it $r$, is less than $m$, then $m - r$ columns of $U$ remain to be determined. This can be done using the fact that $R(T) \perp N(T^*)$: the remaining $m-r$ columns can be taken to be an orthonormal basis of $N(T^*)$.
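Putting the whole construction together, here is a minimal sketch (for a real, deliberately rank-deficient matrix, so the $N(T^*)$ completion step is actually exercised) that builds $V$, $\Sigma$, and $U$ exactly as described above and verifies $T = U\Sigma V^t$. The rank tolerance is an arbitrary choice for this example.

```python
import numpy as np

# A deliberately rank-deficient real matrix: m = 3, n = 5, but rank r = 2
# (the third row is the sum of the first two).
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 5))
T = np.vstack([A, A[0] + A[1]])
m, n = T.shape

# Step 1: orthonormal eigenbasis of T^t T (T is real, so T^* = T^t), sorted by
# decreasing eigenvalue.  The eigenvectors are the columns v_1, ..., v_n of V.
lam, V = np.linalg.eigh(T.T @ T)
order = np.argsort(lam)[::-1]
lam, V = np.clip(lam[order], 0.0, None), V[:, order]

# Step 2: sigma_i = sqrt(lambda_i); the rank r is the number of nonzero sigma_i.
sigma = np.sqrt(lam)
r = int(np.sum(sigma > 1e-10 * sigma[0]))

# Step 3: the first r columns of U are u_i = T(v_i) / sigma_i.
U_r = T @ V[:, :r] / sigma[:r]

# Step 4: the remaining m - r columns are an orthonormal basis of N(T^*) = R(T)^perp,
# obtained here as eigenvectors of T T^t with zero eigenvalue.
mu, W = np.linalg.eigh(T @ T.T)
U = np.hstack([U_r, W[:, :m - r]])

# Assemble the m x n rectangular diagonal Sigma and check T = U Sigma V^t.
Sigma = np.zeros((m, n))
Sigma[:m, :m] = np.diag(sigma[:m])
assert np.allclose(U.T @ U, np.eye(m))
assert np.allclose(T, U @ Sigma @ V.T)
```

Using the eigenvectors of $TT^t$ with zero eigenvalue is just one convenient way to pick the missing columns; any orthonormal basis of $N(T^*)$ works, since those directions are multiplied by zero singular values anyway.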