{%hackmd 5xqeIJ7VRCGBfLtfMi0_IQ %}
# Rayleigh quotient of a real symmetric matrix
## Problem
Let $A$ be an $n\times n$ real symmetric matrix with eigenvalues $\lambda_1 \leq \cdots \leq \lambda_n$. Show that for any $\bx\in\mathbb{R}^n$ with $\|\bx\| = 1$, we have
$$
\lambda_1 \leq \bx\trans A\bx \leq \lambda_n.
$$
Moreover, when $\bx\trans A\bx = \lambda_1$, $\bx$ is an eigenvector of $A$ with respect to $\lambda_1$; when $\bx\trans A\bx = \lambda_n$, $\bx$ is an eigenvector of $A$ with respect to $\lambda_n$.
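Before reading the proof, the claim itself can be tested numerically. Below is a minimal sketch (not part of the solution), assuming NumPy; the matrix size, seed, and sample count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random 5x5 real symmetric matrix.
n = 5
B = rng.standard_normal((n, n))
A = (B + B.T) / 2

# Eigenvalues of a real symmetric matrix, in ascending order.
eigvals = np.linalg.eigvalsh(A)
lam_1, lam_n = eigvals[0], eigvals[-1]

# Every random unit vector should give a quotient in [lam_1, lam_n].
for _ in range(1000):
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)
    q = x @ A @ x
    assert lam_1 - 1e-12 <= q <= lam_n + 1e-12
print("all quotients within [lam_1, lam_n]")
```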
## Thought
According to the spectral theorem, a real symmetric matrix is orthogonally similar to a diagonal matrix. That is, there is an orthonormal basis $\beta$ of eigenvectors such that the matrix representation of $A$ with respect to $\beta$ is diagonal. Intuitively, the proof should then mirror the case when $A$ itself is diagonal.
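To see the diagonal case concretely: if $A = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, then $\bx\trans A\bx = \sum_i \lambda_i x_i^2$, a weighted average of the diagonal entries. A small sketch (the diagonal entries here are arbitrarily chosen):

```python
import numpy as np

# Diagonal case: if A = diag(lam), then x^T A x = sum_i lam_i * x_i^2.
lam = np.array([-1.0, 2.0, 5.0])   # lambda_1 <= lambda_2 <= lambda_3
A = np.diag(lam)

x = np.array([1.0, 1.0, 1.0])
x /= np.linalg.norm(x)             # make x a unit vector

q = x @ A @ x
print(q)                           # 2.0, the plain average here
print(lam[0] <= q <= lam[-1])      # True
```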
## Sample answer
By the spectral theorem, there is an orthonormal basis $\beta = \{\bu_1, \ldots, \bu_n\}$ composed of eigenvectors of $A$. We may assume that $\bu_i$ is an eigenvector corresponding to $\lambda_i$ for $i = 1, \ldots, n$.
Since $\beta$ is an orthonormal basis of $\mathbb{R}^n$, we may write
$$
\bx = c_1\bu_1 + \cdots + c_n\bu_n
$$
for some scalars $c_1, \ldots, c_n$. Since each $\bu_i$ is an eigenvector, we also have
$$
A\bx = c_1\lambda_1\bu_1 + \cdots + c_n\lambda_n\bu_n.
$$
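Numerically, `numpy.linalg.eigh` produces exactly such an orthonormal eigenbasis (as the columns of a matrix), and the two expansions above can be checked directly. A sketch, with an arbitrary symmetric matrix and seed:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
B = rng.standard_normal((n, n))
A = (B + B.T) / 2

# eigh returns ascending eigenvalues and an orthonormal eigenbasis
# (the columns of U), playing the role of beta in the proof.
lam, U = np.linalg.eigh(A)

x = rng.standard_normal(n)
x /= np.linalg.norm(x)

# Coordinates of x in the eigenbasis: c_i = <u_i, x>.
c = U.T @ x

print(np.allclose(U @ c, x))              # x = c_1 u_1 + ... + c_n u_n
print(np.allclose(A @ x, U @ (lam * c)))  # A x = c_1 lam_1 u_1 + ... + c_n lam_n u_n
```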
The fact that $\beta$ is orthonormal means that $\inp{\bu_i}{\bu_j} = 0$ for $i\neq j$ and $\inp{\bu_i}{\bu_i} = 1$ for each $i$. By the bilinearity of the inner product, we may compute
$$
\begin{aligned}
\bx\trans A\bx &= \inp{\bx}{A\bx} \\
&=\inp{c_1\bu_1 + \cdots + c_n\bu_n}{c_1\lambda_1\bu_1 + \cdots + c_n\lambda_n\bu_n} \\
&= \sum_{i=1}^n c_i^2\lambda_i\inp{\bu_i}{\bu_i} + \sum_{i\neq j}c_ic_j\lambda_j\inp{\bu_i}{\bu_j} \\
&= c_1^2\lambda_1 + \cdots + c_n^2\lambda_n
\end{aligned}
$$
and
$$
\begin{aligned}
\|\bx\|^2 &= \inp{\bx}{\bx} \\
&=\inp{c_1\bu_1 + \cdots + c_n\bu_n}{c_1\bu_1 + \cdots + c_n\bu_n} \\
&= \sum_{i=1}^n c_i^2\inp{\bu_i}{\bu_i} + \sum_{i\neq j}c_ic_j\inp{\bu_i}{\bu_j} \\
&= c_1^2 + \cdots + c_n^2.
\end{aligned}
$$
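Both identities are easy to verify numerically. Another sketch under the same assumptions (NumPy, an arbitrary symmetric matrix and seed):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
B = rng.standard_normal((n, n))
A = (B + B.T) / 2
lam, U = np.linalg.eigh(A)

x = rng.standard_normal(n)
x /= np.linalg.norm(x)
c = U.T @ x  # coordinates of x in the orthonormal eigenbasis

# x^T A x collapses to sum_i c_i^2 lam_i, and ||x||^2 to sum_i c_i^2.
print(np.isclose(x @ A @ x, np.sum(c**2 * lam)))  # True
print(np.isclose(x @ x, np.sum(c**2)))            # True
```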
Setting $p_i = c_i^2$ for $i = 1, \ldots, n$ and using $\|\bx\| = 1$, we have
$$
\begin{aligned}
\bx\trans A\bx &= p_1\cdot\lambda_1 + \cdots + p_n\cdot\lambda_n, \\
\|\bx\|^2 &= p_1 + \cdots + p_n = 1,
\end{aligned}
$$
$$
and $p_i \geq 0$ for $i = 1, \ldots, n$. Thus, $\bx\trans A\bx$ is a weighted average of $\lambda_1, \ldots, \lambda_n$ with respect to the weights $p_1, \ldots, p_n$. Consequently, the value of $\bx\trans A\bx$ lies between the minimum and the maximum of $\{\lambda_1, \ldots, \lambda_n\}$, which are $\lambda_1$ and $\lambda_n$, respectively. When $\bx\trans A\bx = \lambda_1$, necessarily all the weight is attributed to those $\lambda_i$ with $\lambda_i = \lambda_1$; that is, $c_i = 0$ whenever $\lambda_i > \lambda_1$. Hence $\bx$ is a linear combination of eigenvectors with respect to $\lambda_1$, and since $\|\bx\| = 1$, it is nonzero and therefore itself an eigenvector with respect to $\lambda_1$. Similarly, when $\bx\trans A\bx = \lambda_n$, $\bx$ is an eigenvector with respect to $\lambda_n$.
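The equality case can also be illustrated numerically: putting all the weight on $\lambda_1$ means taking $\bx$ to be a unit eigenvector for $\lambda_1$. A sketch under the same assumptions as before:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
B = rng.standard_normal((n, n))
A = (B + B.T) / 2
lam, U = np.linalg.eigh(A)

# Put all the weight on lambda_1: take x to be the unit eigenvector u_1.
x = U[:, 0]
print(np.isclose(x @ A @ x, lam[0]))   # the quotient attains lambda_1
print(np.allclose(A @ x, lam[0] * x))  # and x is an eigenvector for lambda_1
```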
*This note can be found at Course website > Learning resources.*