# $A^TA + \lambda I_n$ Is Always an Invertible Matrix
Given an arbitrary $m\times n$ matrix $A$ and a scalar $\lambda>0$, $A^TA + \lambda I_n$ is always an invertible matrix.
How to prove this fact?
---
## Explanation
We will give two proofs of this statement using the Singular Value Decomposition (SVD). Both proofs rest on the same idea: the first works with the matrix form of the SVD and manipulates whole matrices throughout, while the second uses the vector (outer-product) form, from which the inverse can be written down explicitly.
### Matrix Form
Let's look at the matrix form first.
Let the SVD of $A$ be the following: $$A = U\Sigma V^T$$
It follows immediately that $$A^T=V\Sigma^T U^T$$ and $$A^TA=V\Sigma^T\Sigma V^T$$ where $\displaystyle \Sigma^T\Sigma$ is an $n\times n$ diagonal matrix with non-negative entries.
To write down the matrix form of $\displaystyle A^TA+\lambda I_n$, we need only know the following fact: $$I_n = VV^T$$ which is due to the orthogonality of $V$.
Therefore, we have $$A^TA+\lambda I_n = V\Big(\Sigma^T\Sigma+\lambda I_n\Big)V^T$$
The middle factor $\Sigma^T\Sigma+\lambda I_n$ is diagonal with entries $\sigma_j^2+\lambda \ge \lambda > 0$, so it is invertible; since $V$ and $V^T$ are also invertible, so is the product, and the proof is done.
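The factorization above can be checked numerically in a few lines. The sketch below (the dimensions, seed, and $\lambda$ are arbitrary choices for illustration) builds the $n\times n$ matrix $\Sigma^T\Sigma$ from the singular values and confirms the identity $A^TA+\lambda I_n = V(\Sigma^T\Sigma+\lambda I_n)V^T$:

```python
import numpy as np

# small random example; m, n, and lam are arbitrary choices
rng = np.random.default_rng(0)
m, n, lam = 6, 4, 0.5
A = rng.standard_normal((m, n))

# full SVD: U is m x m, Vt is n x n, s holds the min(m, n) singular values
U, s, Vt = np.linalg.svd(A)

# Sigma^T Sigma is the n x n diagonal matrix of squared singular values
# (here min(m, n) = n, so every diagonal entry comes from s)
StS = np.zeros((n, n))
np.fill_diagonal(StS, s ** 2)

# verify A^T A + lam * I = V (Sigma^T Sigma + lam * I) V^T
lhs = A.T @ A + lam * np.identity(n)
rhs = Vt.T @ (StS + lam * np.identity(n)) @ Vt
assert np.allclose(lhs, rhs)
```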
### Vector Form
Now we give the second explanation in terms of the vector form.
Let $A$ be of rank $r$. We write down the truncated SVD of $A$: $$A=\sum_{j\le r} \sigma_j u_j v_j^T\tag{1}$$ which tells us that $$A^TA+\lambda I_n=\sum_{j\le r} (\sigma_j^2+\lambda)v_jv_j^T + \sum_{j>r}\lambda v_j v_j^T\tag{2}$$
Multiplying $(2)$ by the candidate inverse and using the orthonormality $v_i^Tv_j=\delta_{ij}$, it is easily seen that the inverse of $A^TA + \lambda I_n$ is $$\sum_{j\le r}\frac{1}{\sigma_j^2+\lambda}v_jv_j^T+\sum_{j>r}\frac{1}{\lambda}v_jv_j^T\tag{3}$$ Notice that to derive $(2)$, we also use the completeness relation $$I_n = \sum_{j\le n} v_jv_j^T$$
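Formula $(3)$ can also be assembled directly as a sum of rank-one terms $v_jv_j^T$ and tested against $A^TA+\lambda I_n$. This is a sketch; the sizes, seed, and $\lambda$ below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r, lam = 8, 5, 3, 2.0

# build a rank-r matrix as a product of thin random factors
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# rows of Vt are the right singular vectors v_j^T
U, s, Vt = np.linalg.svd(A)

# assemble formula (3): weight 1/(sigma_j^2 + lam) for j <= r, 1/lam for j > r
B_inv = np.zeros((n, n))
for j in range(n):
    w = 1.0 / (s[j] ** 2 + lam) if j < r else 1.0 / lam
    v = Vt[j]
    B_inv += w * np.outer(v, v)

B = A.T @ A + lam * np.identity(n)
assert np.allclose(B @ B_inv, np.identity(n))
```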
---
## Experiment
Let's verify what we have shown in the previous section.
In the following Python code, we do the following things:
- Create a random matrix of rank $r$.
- Compute the inverse of $\displaystyle A^TA+\lambda I_n$ through its SVD.
- Verify that the computed inverse behaves as expected.
```python=
import numpy as np
def get_rank_r_matrix(m: int, n: int, rank: int) -> np.ndarray:
    B = np.random.randn(m, rank)
    C = np.random.randn(rank, n)
    return B @ C

if __name__ == "__main__":
    # hyperparameters
    # rank of the matrix
    r = 3
    # number of rows of the matrix
    m = 24
    # number of columns of the matrix
    n = 5
    lambda_ = 5
    assert (m > r) and (n > r), f"m and n should be greater than r. Found m = {m}; n = {n}; r = {r}."
    A = get_rank_r_matrix(m=m, n=n, rank=r)
    assert A.shape == (m, n)
    assert np.linalg.matrix_rank(A) == r
    B = A.T @ A + lambda_ * np.identity(n)
    U, s, Vt = np.linalg.svd(A)
    # zero out the numerically tiny trailing singular values
    s[r:] = 0
    # D = Sigma^T Sigma + lambda I, an n x n diagonal matrix
    D = np.zeros((n, n))
    np.fill_diagonal(D, s ** 2)
    D = D + lambda_ * np.identity(n)
    D_inverse = np.linalg.inv(D)
    B_inverse = (Vt.T) @ D_inverse @ Vt
    assert np.allclose(B @ B_inverse, np.identity(n))
    assert np.allclose(B_inverse @ B, np.identity(n))
```
###### tags: `singular-value-decomposition`