Discussion Dec 11


Motivation for SVD is representing A which is \(n \times d\) by a low-rank matrix \(A_k\) which is \(n \times d\).

Objective is to minimize:
\[\sum_{i = 1}^n A_i - a_i,\]
where \(a_i = argmin_{v \in span(rows(A_k))} \|A_i - v\|_2\)

Problem 2 (Part 1)

Show that for any matrix \(A\) we have \(\sigma_k \le \frac{\|A\|_F}{\sqrt{k}}\).

\[\|A\|_F^2 = \sum_{i = 1}^n \sum_{j = 1}^d a_{ij}^2 = \sum_{i = 1}^r \sigma_i^2,\]
where \(r\) is rank of \(A\).

Suffices to show that \(\sigma_k^2 \le \frac{\|A\|_F^2}{k}\). Suppose this is not the case then we have \[\sigma_1^2 + \sigma_2^2 + \dots + \sigma_k^2 > \|A\|_F^2\]

  • \(\|A\|^2_F = \sum_{i = 1}^r \sigma_i^2 = \|\sigma\|_2^2\)
  • \(\|A\|_2 = \|\sigma\|_\infty\)
  • For square \(A\) we have \(Tr(A) = \sum_{i = 1}^n \lambda_i\)

Problem 2 (Part 2)

Prove that for every \(A\) there exists a matrix B of rank at most \(k\) such that:
\[\|A - B\|_2 = \sigma_{k + 1} \le \frac{\|A\|_F}{\sqrt{k + 1}}\]
Proof: pick \(B = A_k\), where \(A_k\) is the best rank-\(k\) approximation for \(A\) constructed via SVD.

Problem 2 (Part 3)

Is it true that for every matrix \(A\) there exists a matrix \(B\) of rank at most \(k\) such that: \[\|A - B\|_F \le \frac{\|A\|_F}{\sqrt{k}}?\]

Let's take \(A = I\). Then \(\|A\|_F = \sqrt{n}.\) We know that among all matrices \(B\) of rank at most \(k\) it holds that:
\[\|A - A_k\|_F \le \|A - B\|_F,\]
where \(A_k\) is truncated SVD.

Problem 3

Let's construct a square matrix \(A^TA\) of size \(d \times d\). In lectures we suggest computing \((A^TA)^n\) for large enough \(n\). If \(A\) is sparse this might not be taking a full advantage of sparsity.

Let's take a vector x and instead compute:
\[(A^TA)^n x\]
\[A^T A A^T (A \dots (A^T (A x)))\]
So we have \(2n\) multiplications by a sparse matrix \(A\) which can be done in \(nnz(A)\) time each, where \(nnz(A)\) is the number of non-zero elements in \(A\).

Select a repo