{%hackmd 5xqeIJ7VRCGBfLtfMi0_IQ %}

# Rayleigh quotient of a real symmetric matrix

## Problem

Let $A$ be an $n\times n$ real symmetric matrix with eigenvalues $\lambda_1 \leq \cdots \leq \lambda_n$. Show that for any $\bx\in\mathbb{R}^n$ with $\|\bx\| = 1$, we have
$$
\lambda_1 \leq \bx\trans A\bx \leq \lambda_n.
$$
Moreover, when $\bx\trans A\bx = \lambda_1$, $\bx$ is an eigenvector of $A$ with respect to $\lambda_1$; when $\bx\trans A\bx = \lambda_n$, $\bx$ is an eigenvector of $A$ with respect to $\lambda_n$.

## Thought

According to the spectral theorem, a real symmetric matrix is orthogonally similar to a diagonal matrix. That is, there is an orthonormal basis $\beta$ such that the matrix representation of $A$ with respect to $\beta$ is diagonal. Intuitively, the proof should therefore be similar to the case when $A$ itself is a diagonal matrix.

## Sample answer

By the spectral theorem, there is an orthonormal basis $\beta = \{\bu_1, \ldots, \bu_n\}$ of $\mathbb{R}^n$ composed of eigenvectors of $A$. We may assume that $\bu_i$ is an eigenvector corresponding to $\lambda_i$ for $i = 1, \ldots, n$. Since $\beta$ is a basis of $\mathbb{R}^n$, we may write
$$
\bx = c_1\bu_1 + \cdots + c_n\bu_n
$$
for some scalars $c_1, \ldots, c_n$. As the $\bu_i$'s are eigenvectors, we also have
$$
A\bx = c_1\lambda_1\bu_1 + \cdots + c_n\lambda_n\bu_n.
$$
The fact that $\beta$ is orthonormal means that $\inp{\bu_i}{\bu_j} = 0$ for $i\neq j$ and $\inp{\bu_i}{\bu_i} = 1$ for every $i$.
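As a numerical sanity check (not part of the proof), the expansion of $\bx$ in the eigenbasis and the identity $\bx\trans A\bx = \sum_i c_i^2\lambda_i$ can be verified with NumPy; the matrix below is an arbitrary symmetric example chosen for illustration.

```python
import numpy as np

# An arbitrary real symmetric matrix (illustrative example only)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors
lam, U = np.linalg.eigh(A)

# A random unit vector x
rng = np.random.default_rng(0)
x = rng.standard_normal(3)
x /= np.linalg.norm(x)

# Coordinates of x in the eigenbasis: c_i = <u_i, x>
c = U.T @ x

# x^T A x should equal c_1^2 lambda_1 + ... + c_n^2 lambda_n
lhs = x @ A @ x
rhs = np.sum(c**2 * lam)
print(np.isclose(lhs, rhs))  # expect True
```

Note also that $\sum_i c_i^2 = \|\bx\|^2 = 1$ here, which is exactly the normalization used in the weighted-average argument below.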
By the bilinearity of the inner product, we may compute
$$
\begin{aligned}
\bx\trans A\bx &= \inp{\bx}{A\bx} \\
&= \inp{c_1\bu_1 + \cdots + c_n\bu_n}{c_1\lambda_1\bu_1 + \cdots + c_n\lambda_n\bu_n} \\
&= \sum_{i=1}^n c_i^2\lambda_i\inp{\bu_i}{\bu_i} + \sum_{i\neq j}c_ic_j\lambda_j\inp{\bu_i}{\bu_j} \\
&= c_1^2\lambda_1 + \cdots + c_n^2\lambda_n
\end{aligned}
$$
and
$$
\begin{aligned}
\|\bx\|^2 &= \inp{\bx}{\bx} \\
&= \inp{c_1\bu_1 + \cdots + c_n\bu_n}{c_1\bu_1 + \cdots + c_n\bu_n} \\
&= \sum_{i=1}^n c_i^2\inp{\bu_i}{\bu_i} + \sum_{i\neq j}c_ic_j\inp{\bu_i}{\bu_j} \\
&= c_1^2 + \cdots + c_n^2.
\end{aligned}
$$
Writing $p_i = c_i^2$ for $i = 1, \ldots, n$, we have
$$
\begin{aligned}
\bx\trans A\bx &= p_1\cdot\lambda_1 + \cdots + p_n\cdot\lambda_n, \\
\|\bx\|^2 &= p_1 + \cdots + p_n = 1, \\
\end{aligned}
$$
and $p_i \geq 0$ for $i = 1, \ldots, n$. Thus, $\bx\trans A\bx$ is a weighted average of $\lambda_1, \ldots, \lambda_n$ with respect to the weights $p_1, \ldots, p_n$. Consequently, the value of $\bx\trans A\bx$ lies between the minimum and the maximum of $\{\lambda_1, \ldots, \lambda_n\}$, which are $\lambda_1$ and $\lambda_n$, respectively.

When $\bx\trans A\bx = \lambda_1$, necessarily all the weight is attributed to those $\lambda_i$ with $\lambda_i = \lambda_1$, so $\bx$ is a linear combination of eigenvectors with respect to $\lambda_1$, which is again an eigenvector with respect to $\lambda_1$. Similarly, when $\bx\trans A\bx = \lambda_n$, $\bx$ is an eigenvector with respect to $\lambda_n$.

*This note can be found at Course website > Learning resources.*
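The weighted-average bound and the equality cases can likewise be checked numerically; a minimal self-contained sketch, using an arbitrary symmetric matrix chosen for illustration:

```python
import numpy as np

# Arbitrary real symmetric test matrix (illustrative example only)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
lam, U = np.linalg.eigh(A)  # eigenvalues ascending, orthonormal eigenvectors

# The Rayleigh quotient of every unit vector stays within [lambda_1, lambda_n]
rng = np.random.default_rng(1)
for _ in range(1000):
    x = rng.standard_normal(3)
    x /= np.linalg.norm(x)
    q = x @ A @ x
    assert lam[0] - 1e-12 <= q <= lam[-1] + 1e-12

# Equality is attained at the extreme eigenvectors
print(np.isclose(U[:, 0] @ A @ U[:, 0], lam[0]))    # expect True
print(np.isclose(U[:, -1] @ A @ U[:, -1], lam[-1]))  # expect True
```

This mirrors the argument exactly: each random unit vector distributes its weights $p_i$ over the eigenvalues, and the extremes are hit only when all weight sits on $\lambda_1$ or $\lambda_n$.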