---
title: Ch6-4
tags: Linear algebra
GA: G-77TT93X4N1
---
# Chapter 6 extra note 4
> Singular values
> Schmidt decomposition
> singular value decomposition
> pseudoinverse
## Selected lecture notes
:::info
**Definition:**
Let $T\in\mathcal{L}(V, W)$, where $V$ and $W$ are finite-dimensional inner product spaces. The eigenvalues of $|T|=\sqrt{T^*T}$ are called the *singular values* of $T$. That is, if $\lambda_1, \cdots, \lambda_n$ are the eigenvalues of $T^*T$ (counted with multiplicity), then $\sqrt{\lambda_1}, \cdots, \sqrt{\lambda_n}$ are the singular values of $T$.
:::
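As a quick numerical illustration of the definition, the following sketch compares the square roots of the eigenvalues of $A^*A$ with the singular values NumPy computes directly. The matrix `A` is a hypothetical example standing in for the matrix of $T$ with respect to orthonormal bases; it is not taken from the lecture.

```python
import numpy as np

# Hypothetical example matrix representing T: R^2 -> R^3 (an assumption for illustration).
A = np.array([[3.0, 0.0],
              [0.0, 2.0],
              [0.0, 0.0]])

# Eigenvalues of the positive semidefinite matrix A^* A, sorted in decreasing order.
eigvals = np.linalg.eigvalsh(A.conj().T @ A)[::-1]

# Singular values as square roots; clip tiny negative round-off before taking the root.
sing_from_eig = np.sqrt(np.clip(eigvals, 0.0, None))

# Singular values as computed directly by NumPy (also in decreasing order).
sing_direct = np.linalg.svd(A, compute_uv=False)

print(sing_from_eig, sing_direct)
```

Both arrays should agree up to round-off. In practice `np.linalg.svd` is usually preferred, since forming $A^*A$ squares the condition number.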
**Theorem:**
Let $T\in\mathcal{L}(V, W)$, where $V$ and $W$ are inner product spaces. There exist orthonormal bases $\{{\bf v}_1, \cdots, {\bf v}_n\}\subset V$ and $\{{\bf w}_1, \cdots, {\bf w}_m\}\subset W$, and numbers $\sigma_1\ge \sigma_2\ge \cdots \ge \sigma_r>0$, such that
$$
\tag{1}
T({\bf v}_i) = \begin{cases}
\sigma_i{\bf w}_i, & 1\le i\le r \\
{\bf 0}, & r<i\le n.
\end{cases}
$$
* Proof:
> Since $T^*T\ge 0$ is self-adjoint with nonnegative eigenvalues, the spectral theorem gives an orthonormal basis $\{{\bf v}_1, \cdots, {\bf v}_n\}\subset V$ consisting of eigenvectors of $T^*T$, and an integer $r\le n$, such that
> $$
> \tag{2}
> T^*T{\bf v}_i = \sigma^2_i {\bf v}_i, \quad 1\le i\le n,
> $$
> where
> $$
> \tag{3}
> \sigma_1\ge \sigma_2\ge \cdots \ge \sigma_r>\sigma_{r+1}=\cdots=\sigma_n=0.
> $$
>
> We then define
> $$
> \tag{4}
> {\bf w}_i = \frac{1}{\sigma_i}T({\bf v}_i), \quad 1\le i\le r.
> $$
>
> Claim: $\{{\bf w}_1, \cdots, {\bf w}_r\}\subset W$ is orthonormal.
> > $$
> > \begin{align}
> > \tag{5}
> > \langle{\bf w}_i, {\bf w}_j\rangle &= \langle\frac{1}{\sigma_i}T({\bf v}_i), \frac{1}{\sigma_j}T({\bf v}_j)\rangle\\
> > &= \frac{1}{\sigma_i\sigma_j}\langle T({\bf v}_i), T({\bf v}_j)\rangle\\
> > &= \frac{1}{\sigma_i\sigma_j}\langle T^*T({\bf v}_i), {\bf v}_j\rangle\\
> > &= \frac{1}{\sigma_i\sigma_j}\langle \sigma_i^2 {\bf v}_i, {\bf v}_j\rangle\\
> > &= \frac{\sigma_i^2}{\sigma_i\sigma_j}\langle {\bf v}_i, {\bf v}_j\rangle\\
> > & = \delta_{ij}.
> > \end{align}
> > $$
>
> If $r<m=\dim(W)$, we can find $\{{\bf w}_{r+1},\cdots, {\bf w}_{m}\}\subset W$ (for example by the Gram-Schmidt process) such that $\{{\bf w}_1, \cdots, {\bf w}_m\}$ forms an orthonormal basis of $W$.
>
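> Finally, by the definition (4), we have $T({\bf v}_i) = \sigma_i{\bf w}_i$ for $1\le i\le r$.
>
> For $r+1\le i\le n$, $T^*T{\bf v}_i={\bf 0}$ implies $T{\bf v}_i={\bf 0}$, since $\|T{\bf v}_i\|^2=\langle T{\bf v}_i, T{\bf v}_i\rangle=\langle T^*T{\bf v}_i, {\bf v}_i\rangle=0$.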
>
>
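The construction in the proof can be followed step by step numerically. The sketch below is only an illustration under stated assumptions: `A` is a hypothetical rank-one matrix standing in for $T$, and the relative tolerance used to decide which eigenvalues count as zero is an arbitrary choice.

```python
import numpy as np

# Hypothetical rank-1 matrix for T: R^3 -> R^2 (n = 3, m = 2).
A = np.array([[1.0, 2.0, 2.0],
              [2.0, 4.0, 4.0]])
m, n = A.shape

# Steps (2)-(3): orthonormal eigenvectors of A^* A, eigenvalues sorted decreasingly.
lam, V = np.linalg.eigh(A.conj().T @ A)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]

# r = number of eigenvalues above a round-off tolerance (the tolerance is an assumption).
r = int(np.sum(lam > 1e-10 * lam[0]))
sigma = np.sqrt(np.clip(lam, 0.0, None))

# Step (4): w_i = T(v_i) / sigma_i for 1 <= i <= r (column i is divided by sigma_i).
W_r = (A @ V[:, :r]) / sigma[:r]

# The claim (5): the w_i are orthonormal, so this Gram matrix is close to the identity.
gram = W_r.conj().T @ W_r
# The remaining v_i are sent to zero by T.
kernel_images = A @ V[:, r:]
```

Here `gram` should be close to the $r\times r$ identity and `kernel_images` close to zero, matching (1).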
**Theorem:**
Let $T\in\mathcal{L}(V, W)$, where $V$ and $W$ are inner product spaces. Suppose there exist orthonormal bases $\{{\bf v}_1, \cdots, {\bf v}_n\}\subset V$ and $\{{\bf w}_1, \cdots, {\bf w}_m\}\subset W$, and numbers $\sigma_1\ge \sigma_2\ge \cdots \ge \sigma_r>0$, such that (1) is satisfied. Then $T^*T({\bf v}_i)=\sigma_i^2{\bf v}_i$ for $1\le i\le r$.
* Proof:
> For $1\le i\le r$ and $1\le j\le m$,
> $$
> \tag{6}
> \begin{align}
> \langle {\bf v}_i, T^*({\bf w}_j)\rangle &=\langle T({\bf v}_i), {\bf w}_j\rangle\\
> &=\langle \sigma_i{\bf w}_i, {\bf w}_j\rangle\\
> &=\sigma_i\delta_{ij},
> \end{align}
> $$
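> Since $T({\bf v}_i)={\bf 0}$ for $r<i\le n$, we also have $\langle T^*({\bf w}_j), {\bf v}_i\rangle = 0$ for those $i$. So, for $1\le j\le r$,
> $$
> \tag{7}
> \begin{align}
> T^*({\bf w}_j) &= \sum^n_{i=1}\langle T^*({\bf w}_j), {\bf v}_i\rangle {\bf v}_i\\
> &=\sum^r_{i=1}\sigma_i\delta_{ij} {\bf v}_i\\
> &=\sigma_j{\bf v}_j.
> \end{align}
> $$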
> Therefore, for $1\le i\le r$,
> $$
> \tag{8}
> T^*(T({\bf v}_i)) = T^*(\sigma_i{\bf w}_i) = \sigma^2_i{\bf v}_i.
> $$
>
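Identities (7) and (8) can be checked numerically. This is a minimal sketch, assuming a hypothetical random real matrix `A` as the matrix of $T$; the vectors ${\bf w}_j$ and ${\bf v}_i$ are taken from `np.linalg.svd`, which returns the factors of $A = W\,\mathrm{diag}(s)\,V^*$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))   # hypothetical T: R^3 -> R^4

# NumPy returns A = W @ diag(s) @ Vh; the rows of Vh are the v_i^*.
W, s, Vh = np.linalg.svd(A, full_matrices=False)
V = Vh.conj().T

# Equation (7): T^* w_j = sigma_j v_j (V * s scales column j of V by s[j]).
adjoint_residual = A.conj().T @ W - V * s

# Equation (8): T^* T v_i = sigma_i^2 v_i.
gram_residual = A.conj().T @ A @ V - V * s**2

print(np.abs(adjoint_residual).max(), np.abs(gram_residual).max())
```

Both residuals should be at the level of floating-point round-off.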
**Remark:**
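Together, the two theorems above show that the numbers $\sigma_1, \cdots, \sigma_r$ appearing in (1) are exactly the square roots of the nonzero eigenvalues of $T^*T$, that is, the positive singular values of $T$. Hence they are uniquely determined by $T$ and do not depend on the choice of the orthonormal bases.

**Theorem:** (Schmidt decomposition)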
Let $T\in\mathcal{L}(V, W)$, where $V$ and $W$ are inner product spaces. There exist orthonormal bases $\{{\bf v}_1, \cdots, {\bf v}_n\}\subset V$ and $\{{\bf w}_1, \cdots, {\bf w}_m\}\subset W$, and numbers $\sigma_1\ge \sigma_2\ge \cdots \ge \sigma_r>0$, such that, for all ${\bf v}\in V$,
$$
\tag{9}
T({\bf v}) = \sum^r_{i=1}\sigma_i\langle{\bf v}, {\bf v}_i\rangle{\bf w}_i,
$$
where (1) is satisfied.
* Proof:
> Since $\{{\bf v}_1, \cdots, {\bf v}_n\}$ is an orthonormal basis,
> $$
> \tag{10}
> {\bf v} = \sum^n_{i=1}\langle{\bf v}, {\bf v}_i\rangle{\bf v}_i.
> $$
> We then have
> $$
> \tag{11}
> T{\bf v} = \sum^n_{i=1}\langle{\bf v}, {\bf v}_i\rangle T{\bf v}_i= \sum^r_{i=1}\sigma_i\langle{\bf v}, {\bf v}_i\rangle {\bf w}_i.
> $$
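
The Schmidt decomposition (9) expresses $T{\bf v}$ as a sum of $r$ rank-one terms. The sketch below evaluates that sum for a hypothetical random complex matrix `A` and vector `v`, and compares it with the direct product. It assumes the convention used in these notes that $\langle\cdot,\cdot\rangle$ is linear in the first slot, so $\langle{\bf v}, {\bf v}_i\rangle = {\bf v}_i^*{\bf v}$.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical T: C^5 -> C^3 and a test vector v in C^5.
A = rng.standard_normal((3, 5)) + 1j * rng.standard_normal((3, 5))
v = rng.standard_normal(5) + 1j * rng.standard_normal(5)

W, s, Vh = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(s > 1e-12 * s[0]))   # numerical rank; the tolerance is an assumption

# Row i of Vh is v_i^*, so v_i = Vh[i].conj().
# np.vdot conjugates its first argument, hence np.vdot(v_i, v) = v_i^* v = <v, v_i>.
Tv = sum(s[i] * np.vdot(Vh[i].conj(), v) * W[:, i] for i in range(r))

print(np.linalg.norm(Tv - A @ v))
```

The printed norm should be at round-off level, confirming that only the first $r$ basis vectors of $V$ contribute.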
---
:::info
**Definition:**
Let $T\in\mathcal{L}(V, W)$, where $V$ and $W$ are inner product spaces. Let $L:\text{Ker}(T)^{\perp}\to\text{Ran}(T)$ be the restriction of $T$ to $\text{Ker}(T)^{\perp}$, that is, $L({\bf v})=T({\bf v})$ for all ${\bf v}\in\text{Ker}(T)^{\perp}$. Then $L$ is invertible.
The *pseudoinverse* of $T$, denoted by $T^\dagger$, is the linear map from $W=\text{Ran}(T)\oplus\text{Ran}(T)^\perp$ to $V$ determined by
$$
\tag{12}
T^\dagger{\bf w} = \begin{cases}
L^{-1}{\bf w}, & {\bf w}\in\text{Ran}(T) \\
{\bf 0}, & {\bf w}\in\text{Ran}(T)^\perp.
\end{cases}
$$
:::
**Proposition:**
Let $T\in\mathcal{L}(V, W)$, where $V$ and $W$ are inner product spaces. Let $\{{\bf v}_1, \cdots, {\bf v}_n\}\subset V$ and $\{{\bf w}_1, \cdots, {\bf w}_m\}\subset W$ be orthonormal bases, and let $\sigma_1\ge \sigma_2\ge \cdots \ge \sigma_r>0$ be such that (1) is satisfied. Then $T^\dagger\in\mathcal{L}(W,V)$ and
$$
\tag{13}
T^\dagger{\bf w} = \sum^r_{i=1}\frac{1}{\sigma_i}\langle{\bf w}, {\bf w}_i\rangle{\bf v}_i.
$$
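
Formula (13) translates directly into a matrix product. The following sketch builds the pseudoinverse of a hypothetical rank-deficient matrix `A` from its SVD; the tolerance used for the numerical rank is an assumption.

```python
import numpy as np

# Hypothetical rank-2 matrix for T: R^3 -> R^3 (third row = first row + second row).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])

W, s, Vh = np.linalg.svd(A)
r = int(np.sum(s > 1e-12 * s[0]))   # numerical rank; the tolerance is an assumption

# Formula (13) in matrix form: sum over i <= r of (1/sigma_i) v_i w_i^*.
A_pinv = (Vh[:r].conj().T / s[:r]) @ W[:, :r].conj().T

# Consistency with definition (12): T T^dagger T = T and T^dagger T T^dagger = T^dagger.
check_1 = np.allclose(A @ A_pinv @ A, A)
check_2 = np.allclose(A_pinv @ A @ A_pinv, A_pinv)
print(r, check_1, check_2)
```

The result can also be compared with `np.linalg.pinv(A)`, which uses the same idea with its own cutoff for small singular values.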
---
**Remarks:**
* We can rewrite (9) as
$$
\tag{14}
\begin{align}
T({\bf v}) &= \sum^r_{i=1}\sigma_i\langle{\bf v}, {\bf v}_i\rangle{\bf w}_i\\
&=\begin{bmatrix}
{\bf w}_1 & \cdots & {\bf w}_r
\end{bmatrix}_{? \times r}
\begin{bmatrix}
\sigma_1\langle{\bf v}, {\bf v}_1\rangle\\
\vdots \\
\sigma_r\langle{\bf v}, {\bf v}_r\rangle
\end{bmatrix}_{r \times ?}\\
&=\begin{bmatrix}
{\bf w}_1 & \cdots & {\bf w}_r
\end{bmatrix}_{? \times r}
\begin{bmatrix}
\sigma_1 & &\\
& \ddots & \\
& & \sigma_r
\end{bmatrix}_{r\times r}
\begin{bmatrix}
\langle{\bf v}, {\bf v}_1\rangle\\
\vdots \\
\langle{\bf v}, {\bf v}_r\rangle
\end{bmatrix}_{r \times ?}\\
&=\begin{bmatrix}
{\bf w}_1 & \cdots & {\bf w}_r
\end{bmatrix}_{? \times r}
\begin{bmatrix}
\sigma_1 & & \\
& \ddots & \\
& & \sigma_r
\end{bmatrix}_{r\times r}
\begin{bmatrix}
{\bf v}^*_1\\
\vdots \\
{\bf v}^*_r
\end{bmatrix}_{r \times ?}{\bf v}\\
&=\widetilde{W}\widetilde{\Sigma}\widetilde{V^*}{\bf v}.
\end{align}
$$
This is the *reduced* singular value decomposition of $T$.
* In general, the term singular value decomposition (SVD) refers to the factorization of the matrix representation of $T$ with respect to given orthonormal bases.
* The SVD of $T$ is
$$
\tag{15}
\begin{align}
T({\bf v}) &=\begin{bmatrix}
{\bf w}_1 & \cdots & {\bf w}_m
\end{bmatrix}_{? \times m}
\begin{bmatrix}
\sigma_1 & & & \\
& \ddots & & \\
& & \sigma_r & \\
& & & {\bf 0}
\end{bmatrix}_{m\times n}
\begin{bmatrix}
{\bf v}^*_1\\
\vdots \\
{\bf v}^*_n
\end{bmatrix}_{n \times ?}{\bf v}\\
&=W\Sigma V^*{\bf v}.
\end{align}
$$
* The SVD of $T^\dagger$ is
$$
\tag{16}
\begin{align}
T^\dagger({\bf w}) &=\begin{bmatrix}
{\bf v}_1 & \cdots & {\bf v}_n
\end{bmatrix}_{? \times n}
\begin{bmatrix}
\frac{1}{\sigma_1} & & & \\
& \ddots & & \\
& & \frac{1}{\sigma_r} & \\
& & & {\bf 0}
\end{bmatrix}_{n\times m}
\begin{bmatrix}
{\bf w}^*_1\\
\vdots \\
{\bf w}^*_m
\end{bmatrix}_{m \times ?}{\bf w}\\
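&=V\Sigma^{\dagger} W^*{\bf w}.
\end{align}
$$
Here $\Sigma^\dagger$ denotes the $n\times m$ matrix with diagonal entries $\frac{1}{\sigma_1}, \cdots, \frac{1}{\sigma_r}$ and zeros elsewhere. Since $\Sigma$ is $m\times n$ and may have zero singular values, it is in general not invertible, so $\Sigma^\dagger$ is the pseudoinverse of $\Sigma$ rather than an inverse.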
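
The difference between the reduced form (14) and the full forms (15) and (16) shows up in the shapes of the factors. The sketch below uses a hypothetical random $4\times 3$ matrix `A`. Note that NumPy's `full_matrices=False` keeps $\min(m, n)$ columns rather than exactly $r$, so it matches (14) only when $T$ has full rank, as it does here.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 4, 3
A = rng.standard_normal((m, n))   # hypothetical matrix of T: R^3 -> R^4

# Full SVD as in (15): W is m x m, Vh is n x n, and Sigma is padded to m x n.
W, s, Vh = np.linalg.svd(A, full_matrices=True)
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

# Reduced ("thin") SVD: only min(m, n) columns of W are kept.
W_thin, s_thin, Vh_thin = np.linalg.svd(A, full_matrices=False)

# Middle factor of (16): an n x m matrix with 1/sigma_i on the first r diagonal entries.
r = int(np.sum(s > 1e-12 * s[0]))   # numerical rank; the tolerance is an assumption
Sigma_pinv = np.zeros((n, m))
Sigma_pinv[:r, :r] = np.diag(1.0 / s[:r])
A_pinv = Vh.conj().T @ Sigma_pinv @ W.conj().T

print(W.shape, Sigma.shape, Vh.shape)   # full factors
print(W_thin.shape, Vh_thin.shape)      # reduced factors
```

Both factorizations reproduce `A`, and `A_pinv` should agree with `np.linalg.pinv(A)` up to round-off.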