---
title: Ch6-4
tags: Linear algebra
GA: G-77TT93X4N1
---

# Chapter 6 extra note 4

> Singular values
> Schmidt decomposition
> singular value decomposition
> pseudoinverse

## Selected lecture notes

:::info
**Definition:** Let $T\in\mathcal{L}(V, W)$, where $V$ and $W$ are inner product spaces. The eigenvalues of $|T|=\sqrt{T^*T}$ are called the *singular values* of $T$. That is, if $\lambda_1, \cdots, \lambda_n$ are the eigenvalues of $T^*T$, then $\sqrt{\lambda_1}, \cdots, \sqrt{\lambda_n}$ are the singular values of $T$.
:::

**Theorem:** Let $T\in\mathcal{L}(V, W)$, where $V$ and $W$ are inner product spaces. There exist orthonormal bases $\{{\bf v}_1, \cdots, {\bf v}_n\}\subset V$, $\{{\bf w}_1, \cdots, {\bf w}_m\}\subset W$ and $\sigma_1\ge \sigma_2\ge \cdots \ge \sigma_r>0$ such that
$$
\tag{1}
T({\bf v}_i) =
\begin{cases}
\sigma_i{\bf w}_i, & 1\le i\le r \\
{\bf 0}, & r<i\le n.
\end{cases}
$$

* Proof:
> Since $T^*T$ is self-adjoint and $T^*T\ge 0$, there exist an orthonormal basis $\{{\bf v}_1, \cdots, {\bf v}_n\}\subset V$ consisting of eigenvectors of $T^*T$ and an integer $r\le n$ such that
> $$
> \tag{2}
> T^*T{\bf v}_i = \sigma^2_i {\bf v}_i, \quad 1\le i\le n,
> $$
> where
> $$
> \tag{3}
> \sigma_1\ge \sigma_2\ge \cdots \ge \sigma_r>\sigma_{r+1}=\cdots=\sigma_n=0.
> $$
>
> We then define
> $$
> \tag{4}
> {\bf w}_i = \frac{1}{\sigma_i}T({\bf v}_i), \quad 1\le i\le r.
> $$
>
> Claim: $\{{\bf w}_1, \cdots, {\bf w}_r\}\subset W$ is orthonormal.
> > $$
> > \begin{align}
> > \tag{5}
> > \langle{\bf w}_i, {\bf w}_j\rangle &= \langle\frac{1}{\sigma_i}T({\bf v}_i), \frac{1}{\sigma_j}T({\bf v}_j)\rangle\\
> > &= \frac{1}{\sigma_i\sigma_j}\langle T({\bf v}_i), T({\bf v}_j)\rangle\\
> > &= \frac{1}{\sigma_i\sigma_j}\langle T^*T({\bf v}_i), {\bf v}_j\rangle\\
> > &= \frac{1}{\sigma_i\sigma_j}\langle \sigma_i^2 {\bf v}_i, {\bf v}_j\rangle\\
> > &= \frac{\sigma_i^2}{\sigma_i\sigma_j}\langle {\bf v}_i, {\bf v}_j\rangle\\
> > & = \delta_{ij}.
> > \end{align}
> > $$
>
> If $r<m=\text{dim}(W)$, we can find $\{{\bf w}_{r+1},\cdots, {\bf w}_{m}\}\subset W$ such that $\{{\bf w}_1, \cdots, {\bf w}_m\}$ forms an orthonormal basis of $W$.
>
> Finally, by the definition in (4), we have $T({\bf v}_i) = \sigma_i{\bf w}_i$ for $1\le i\le r$.
>
> For $r+1\le i\le n$, $T^*T{\bf v}_i={\bf 0}$ implies $T{\bf v}_i={\bf 0}$, since $\|T{\bf v}_i\|^2=\langle T^*T{\bf v}_i, {\bf v}_i\rangle=0$.

**Theorem:** Let $T\in\mathcal{L}(V, W)$, where $V$ and $W$ are inner product spaces. Suppose there exist orthonormal bases $\{{\bf v}_1, \cdots, {\bf v}_n\}\subset V$, $\{{\bf w}_1, \cdots, {\bf w}_m\}\subset W$ and $\sigma_1\ge \sigma_2\ge \cdots \ge \sigma_r>0$ such that (1) is satisfied. Then $T^*T({\bf v}_i)=\sigma_i^2{\bf v}_i$ for $1\le i\le r$.

* Proof:
> For $1\le i\le r$ and $1\le j\le m$,
> $$
> \tag{6}
> \begin{align}
> \langle {\bf v}_i, T^*({\bf w}_j)\rangle &=\langle T({\bf v}_i), {\bf w}_j\rangle\\
> &=\langle \sigma_i{\bf w}_i, {\bf w}_j\rangle\\
> &=\sigma_i\delta_{ij}.
> \end{align}
> $$
> For $r<i\le n$, $\langle {\bf v}_i, T^*({\bf w}_j)\rangle=\langle T({\bf v}_i), {\bf w}_j\rangle=0$, so (6) holds for all $1\le i\le n$ with the convention $\sigma_i=0$ for $i>r$. Since each $\sigma_i$ is real, for $1\le j\le r$,
> $$
> \tag{7}
> \begin{align}
> T^*({\bf w}_j) &= \sum^n_{i=1}\langle T^*({\bf w}_j), {\bf v}_i\rangle {\bf v}_i\\
> &=\sum^n_{i=1}\sigma_i\delta_{ij} {\bf v}_i\\
> &=\sigma_j{\bf v}_j.
> \end{align}
> $$
> Therefore, for $1\le i\le r$,
> $$
> \tag{8}
> T^*(T({\bf v}_i)) = T^*(\sigma_i{\bf w}_i) = \sigma^2_i{\bf v}_i.
> $$

**Remark:** The above theorems show that the singular values of $T$ are uniquely determined.

**Theorem:** (Schmidt decomposition) Let $T\in\mathcal{L}(V, W)$, where $V$ and $W$ are inner product spaces. There exist orthonormal bases $\{{\bf v}_1, \cdots, {\bf v}_n\}\subset V$, $\{{\bf w}_1, \cdots, {\bf w}_m\}\subset W$ and $\sigma_1\ge \sigma_2\ge \cdots \ge \sigma_r>0$ such that, for all ${\bf v}\in V$,
$$
\tag{9}
T({\bf v}) = \sum^r_{i=1}\sigma_i\langle{\bf v}, {\bf v}_i\rangle{\bf w}_i,
$$
where (1) is satisfied.

* Proof:
> Since $\{{\bf v}_1, \cdots, {\bf v}_n\}$ is an orthonormal basis,
> $$
> \tag{10}
> {\bf v} = \sum^n_{i=1}\langle{\bf v}, {\bf v}_i\rangle{\bf v}_i.
> $$
> By linearity of $T$ and (1), we then have
> $$
> \tag{11}
> T{\bf v} = \sum^n_{i=1}\langle{\bf v}, {\bf v}_i\rangle T{\bf v}_i= \sum^r_{i=1}\sigma_i\langle{\bf v}, {\bf v}_i\rangle {\bf w}_i.
> $$

---

:::info
**Definition:** Let $T\in\mathcal{L}(V, W)$, where $V$ and $W$ are inner product spaces. Let $L:\text{Ker}(T)^{\perp}\to\text{Ran}(T)$ be the restriction of $T$, that is, $L({\bf v})=T({\bf v})$ for all ${\bf v}\in\text{Ker}(T)^{\perp}$; then $L$ is invertible. The *pseudoinverse* of $T$, denoted by $T^\dagger$, is the linear map on $W=\text{Ran}(T)\oplus\text{Ran}(T)^\perp$ defined by
$$
\tag{12}
T^\dagger{\bf w} =
\begin{cases}
L^{-1}{\bf w}, & {\bf w}\in\text{Ran}(T) \\
{\bf 0}, & {\bf w}\in\text{Ran}(T)^\perp.
\end{cases}
$$
:::

**Proposition:** Let $T\in\mathcal{L}(V, W)$, where $V$ and $W$ are inner product spaces. Let $\{{\bf v}_1, \cdots, {\bf v}_n\}\subset V$, $\{{\bf w}_1, \cdots, {\bf w}_m\}\subset W$ be orthonormal bases and $\sigma_1\ge \sigma_2\ge \cdots \ge \sigma_r>0$ be such that (1) is satisfied. Then $T^\dagger\in\mathcal{L}(W,V)$ and
$$
\tag{13}
T^\dagger{\bf w} = \sum^r_{i=1}\frac{1}{\sigma_i}\langle{\bf w}, {\bf w}_i\rangle{\bf v}_i.
$$

---

**Remarks:**

* We can rewrite (9) as
$$
\tag{14}
\begin{align}
T({\bf v}) &= \sum^r_{i=1}\sigma_i\langle{\bf v}, {\bf v}_i\rangle{\bf w}_i\\
&=\begin{bmatrix}
{\bf w}_1 & \cdots & {\bf w}_r
\end{bmatrix}_{? \times r}
\begin{bmatrix}
\sigma_1\langle{\bf v}, {\bf v}_1\rangle\\
\vdots \\
\sigma_r\langle{\bf v}, {\bf v}_r\rangle
\end{bmatrix}_{r \times ?}\\
&=\begin{bmatrix}
{\bf w}_1 & \cdots & {\bf w}_r
\end{bmatrix}_{? \times r}
\begin{bmatrix}
\sigma_1 & &\\
& \ddots & \\
& & \sigma_r
\end{bmatrix}_{r\times r}
\begin{bmatrix}
\langle{\bf v}, {\bf v}_1\rangle\\
\vdots \\
\langle{\bf v}, {\bf v}_r\rangle
\end{bmatrix}_{r \times ?}\\
&=\begin{bmatrix}
{\bf w}_1 & \cdots & {\bf w}_r
\end{bmatrix}_{? \times r}
\begin{bmatrix}
\sigma_1 & & \\
& \ddots & \\
& & \sigma_r
\end{bmatrix}_{r\times r}
\begin{bmatrix}
{\bf v}^*_1\\
\vdots \\
{\bf v}^*_r
\end{bmatrix}_{r \times ?}{\bf v}\\
&=\widetilde{W}\widetilde{\Sigma}\widetilde{V^*}{\bf v}.
\end{align}
$$
This is the *reduced* singular value decomposition of $T$.

* In general, singular value decomposition (SVD) refers to the decomposition of the matrix representation of $T$.

* The SVD of $T$ is
$$
\tag{15}
\begin{align}
T({\bf v}) &=\begin{bmatrix}
{\bf w}_1 & \cdots & {\bf w}_m
\end{bmatrix}_{? \times m}
\begin{bmatrix}
\sigma_1 & & & \\
& \ddots & & \\
& & \sigma_r & \\
& & & {\bf 0}
\end{bmatrix}_{m\times n}
\begin{bmatrix}
{\bf v}^*_1\\
\vdots \\
{\bf v}^*_n
\end{bmatrix}_{n \times ?}{\bf v}\\
&=W\Sigma V^*{\bf v}.
\end{align}
$$

* The SVD of $T^\dagger$ is
$$
\tag{16}
\begin{align}
T^\dagger({\bf w}) &=\begin{bmatrix}
{\bf v}_1 & \cdots & {\bf v}_n
\end{bmatrix}_{? \times n}
\begin{bmatrix}
\frac{1}{\sigma_1} & & & \\
& \ddots & & \\
& & \frac{1}{\sigma_r} & \\
& & & {\bf 0}
\end{bmatrix}_{n\times m}
\begin{bmatrix}
{\bf w}^*_1\\
\vdots \\
{\bf w}^*_m
\end{bmatrix}_{m \times ?}{\bf w}\\
&=V\Sigma^{\dagger} W^*{\bf w},
\end{align}
$$
where $\Sigma^\dagger$ is the $n\times m$ matrix above. Since $\Sigma$ is in general neither square nor invertible, we write $\Sigma^\dagger$ rather than $\Sigma^{-1}$.
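
**Numerical check:** The statements above can be tried out on a small matrix. The following is a minimal sketch using NumPy, not part of the lecture notes: the matrix `A`, the tolerance `tol`, and all variable names are assumptions chosen for illustration. It compares the singular values with the square roots of the eigenvalues of $A^*A$, rebuilds $A$ from $W\Sigma V^*$ as in (15), and assembles the pseudoinverse both from the sum in (13) and from the matrix product in (16).

```python
import numpy as np

# Hypothetical example (not from the notes): a 3x2 real matrix of rank 1,
# so m = 3, n = 2, r = 1 and the case r < i <= n in (1) actually occurs.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 0.0]])
m, n = A.shape

# Singular values as square roots of the eigenvalues of A^* A.
# eigvalsh returns eigenvalues in ascending order; clip removes tiny
# negative round-off before taking the square root.
eigvals = np.linalg.eigvalsh(A.conj().T @ A)
sigma_from_eig = np.sqrt(np.clip(eigvals, 0.0, None))[::-1]

# Full SVD as in (15): A = W @ Sigma @ Vh, where Vh plays the role of V^*.
W, s, Vh = np.linalg.svd(A, full_matrices=True)
V = Vh.conj().T
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

# Numerical rank r: count singular values above a tolerance.
# The relative tolerance 1e-10 is an assumption for this small example.
tol = 1e-10 * s.max()
r = int(np.sum(s > tol))

# Pseudoinverse from (13): sum over i <= r of (1/sigma_i) v_i w_i^*.
A_pinv = np.zeros((n, m))
for i in range(r):
    v_i = V[:, i]
    w_i = W[:, i]
    A_pinv = A_pinv + (1.0 / s[i]) * np.outer(v_i, w_i.conj())

# Pseudoinverse from (16): V @ Sigma_pinv @ W^*, where Sigma_pinv is the
# n x m matrix with 1/sigma_i on the diagonal for i <= r and zeros elsewhere.
Sigma_pinv = np.zeros((n, m))
Sigma_pinv[:r, :r] = np.diag(1.0 / s[:r])
A_pinv_from_svd = V @ Sigma_pinv @ W.conj().T

print("singular values:", s)
print("pseudoinverse:\n", A_pinv)
```

Using a rank-deficient matrix is deliberate: it shows why the middle factor in (16) only inverts the nonzero singular values. In floating point the cut-off between "zero" and "nonzero" singular values has to be chosen, which is the role of `tol` here.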