# Inner Products

$\newcommand{\iprod}[2]{\langle #1,#2 \rangle}$

:::warning
Let $V$ be a vector space over a field $F$ where $F=\mathbb{R}$ or $\mathbb{C}$. A function $\iprod{\cdot}{\cdot}:V\times V\to F$ satisfying the following properties is said to be an **inner product** on $V$:
(a) $\iprod{x+y}{z}=\iprod{x}{z}+\iprod{y}{z}$
(b) $\iprod{cx}{y}=c\iprod{x}{y}$
\(c) $\iprod{x}{y}=\overline{\iprod{y}{x}}$
(d) $x\not=\mathit0\implies\iprod{x}{x}\in\mathbb{R}_{>0}$
for any $x,y,z\in V$ and $c\in F$.
:::

Clearly, the following properties of an inner product are true.

:::info
For any $x,y,z\in V$ and $c\in F$:
$\iprod{x}{y+z}=\iprod{x}{y}+\iprod{x}{z}$
$\iprod{x}{cy}=\bar{c}\iprod{x}{y}$
$\iprod{x}{\mathit0}=0$
$\iprod{x}{x}=0\iff x=\mathit0$
If $\iprod{x}{y}=\iprod{x}{z}$ for every $x\in V$, then $y=z$.
:::

## Inner Product Spaces

Let $V$ be a vector space over $F$ and $\iprod{\cdot}{\cdot}$ be an inner product on $V$. We call the tuple $(V,\iprod{\cdot}{\cdot})$ an **inner product space** and, for convenience, denote it simply by $V$.

## Adjoints and Norms

:::warning
For an $m\times n$ real or complex matrix $A$, we define the **adjoint** $A^*$ of $A$ as the $n\times m$ matrix that satisfies $(A^*)_{ij}=\overline{A_{ji}}$.
:::

:::warning
Let $V$ be an inner product space over $F$. We define the norm of $x\in V$ as $\|x\|:=\sqrt{\iprod{x}{x}}$. The following inequalities hold:
- **Cauchy-Schwarz Inequality** $|\iprod{x}{y}|\le\|x\|\cdot\|y\|$
- **Triangle Inequality** $\|x+y\|\le\|x\|+\|y\|$
:::

:::warning
Let $V$ be a vector space over $F$. A function $\|\cdot\|_V:V\to\mathbb{R}$ is said to be a norm if it satisfies the following properties for any $x,y\in V$ and $c\in F$:
(a) $\|x\|_V=0\iff x=\mathit0$
(b) $\|cx\|_V=|c|\cdot\|x\|_V$
\(c) $\|x+y\|_V\le\|x\|_V+\|y\|_V$
Some definitions require that $\|x\|_V\ge0$, but since
$$
0=\|x-x\|_V\le\|x\|_V+\|-x\|_V=\|x\|_V+\|x\|_V=2\|x\|_V
$$
nonnegativity already follows from (b) and \(c), so it need not be assumed separately. Hence we can define a normed space $V$ as the tuple $(V,\|\cdot\|_V)$.
:::

:::danger
<span style = "font-size:23px;font-weight:700;">Remark</span>

Given an inner product space $V$, we can also get a *normed space* corresponding to the inner product. But given a *normed space*, we may not be able to find an inner product space corresponding to it. In fact, the *normed space* $V=\mathbb{R}^2$ where the norm is defined as
$$
\forall (a,b)\in V,\|(a,b)\|_V:=|a|+|b|
$$
has no corresponding inner product $\iprod{\cdot}{\cdot}$ on $\mathbb{R}^2$ such that $\|x\|_V^2=\iprod{x}{x}$ for all $x\in\mathbb{R}^2$. To show this, we first introduce the parallelogram law on an inner product space $V$.
- *The Parallelogram law* $\forall x,y\in V,\|x+y\|^2+\|x-y\|^2=2\|x\|^2+2\|y\|^2$
:::

:::spoiler Proof of The Parallelogram law
Expanding both terms,
$$
\iprod{x+y}{x+y}+\iprod{x-y}{x-y}=2\iprod{x}{x}+2\iprod{y}{y}+\iprod{x}{y}+\iprod{y}{x}-\iprod{x}{y}-\iprod{y}{x}=2\|x\|^2+2\|y\|^2
$$
which is the desired result.
:::

Hence, if there is an inner product, then the norm corresponding to it has to satisfy the Parallelogram law. So if the norm does not satisfy the Parallelogram law, then there is no inner product on $V$ corresponding to the norm. We show that
$$
\forall (a,b)\in V,\|(a,b)\|_V:=|a|+|b|
$$
does not satisfy the parallelogram law.

Take $x=(1,0),y=(0,1)$; then:
$$
\|(1,1)\|_V^2+\|(1,-1)\|_V^2=4+4\not=2\|(1,0)\|_V^2+2\|(0,1)\|_V^2=2+2
$$
So there does not exist any inner product that satisfies $\sqrt{\iprod{(a,b)}{(a,b)}}=|a|+|b|$.
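As a quick numerical sanity check (not a replacement for the argument above), we can compare the two sides of the parallelogram law for the $1$-norm and for the Euclidean norm; the helper function `parallelogram_gap` is ours:

```python
import numpy as np

def parallelogram_gap(norm, x, y):
    """Return ||x+y||^2 + ||x-y||^2 - (2||x||^2 + 2||y||^2); zero iff the law holds for this pair."""
    return norm(x + y)**2 + norm(x - y)**2 - 2 * norm(x)**2 - 2 * norm(y)**2

x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])

one_norm = lambda v: np.sum(np.abs(v))   # ||(a,b)||_V = |a| + |b|
two_norm = np.linalg.norm                # Euclidean norm, induced by the standard inner product

print(parallelogram_gap(one_norm, x, y)) # 4.0 -> law fails, so no inner product induces this norm
print(parallelogram_gap(two_norm, x, y)) # 0.0 -> consistent with the law
```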
## Orthogonal and Orthonormal subsets

:::warning
Let $V$ be an inner product space and $x,y\in V$. If $\iprod{x}{y}=0$, then we say $x$ and $y$ are orthogonal. Note that $\mathit0$ is orthogonal to every vector in $V$.
If $\iprod{x}{y}=0$ and $\|x\|=\|y\|=1$, then we say $x$ and $y$ are orthonormal. Note that if $2$ nonzero vectors $v_1,v_2$ are orthogonal, then after dividing by their norms respectively, $\dfrac{v_1}{\|v_1\|},\dfrac{v_2}{\|v_2\|}$ are orthonormal.
:::

:::warning
Let $S$ be a finite subset of $V$. If every pair of distinct vectors $v,v'\in S$ is orthogonal or orthonormal, then we say $S$ is an orthogonal or orthonormal subset, respectively, of $V$.
:::

:::info
If $S$ is an *orthogonal subset* of $V$ and $\mathit0\not\in S$, then $S$ must be *linearly independent*.
:::

The proof is by induction; we only prove the case of $2$ vectors.

:::spoiler Proof
Suppose that $\{v_1,v_2\}$ is an orthogonal subset and $v_1,v_2$ are nonzero. If $c_1v_1+c_2v_2=\mathit0$, then $c_1v_1=-c_2v_2$, so we get
$$
0=\iprod{c_1v_1}{c_2v_2}=\iprod{-c_2v_2}{c_2v_2}=-c_2\overline{c_2}\|v_2\|^2
$$
where the first equality holds because $v_1$ and $v_2$ are orthogonal. Since $v_2\not=\mathit0$, it follows that $c_2=0$, hence $c_1=0$. This implies that $\{v_1,v_2\}$ is a linearly independent subset.
:::

Hence an immediate result of this theorem is:

:::info
If $S$ is an *orthonormal subset* of $V$, then $S$ is *linearly independent*.
:::

## Gram-Schmidt Process

:::info
Let $V$ be a finite-dimensional inner product space. If $S=\{v_1,v_2,\dots,v_n\}$ is a linearly independent subset of $V$, define $S':=\{w_1,w_2,\dots,w_n\}$ where $w_k$ is given by the recursive formula below:
$$
w_1=v_1,\quad w_k:=v_k-\sum_{i=1}^{k-1}\dfrac{\iprod{v_k}{w_i}}{\|w_i\|^2}w_i,\quad 2\le k\le n
$$
Then $S'$ is an orthogonal subset of $V$ and $\mathit0\not\in S'$.
:::

> The idea behind this recursive formula is:
> For $v_k$, we subtract the **orthogonal projection** of $v_k$ onto each $w_i$ where $1\le i\le k-1$; then $v_k$ becomes orthogonal to every such $w_i$.

The proof is also by induction.

:::info
Also, $\text{Span}(S)=\text{Span}(S')$. So, if we apply the Gram-Schmidt Process to a basis $\beta$ of $V$, then we get an orthogonal basis $\beta'$ for $V$. We can then divide the vectors in $\beta'$ by their norms respectively to produce an orthonormal basis $\gamma=\{v_1,\dots,v_n\}$ for $V$.
:::

Hence every finite-dimensional inner product space has an orthonormal basis, and so we get the formula:

:::info
$$
\forall x\in V,x=\displaystyle\sum_{i=1}^n\iprod{x}{v_i}v_i
$$
Here the scalars $\iprod{x}{v_i}$ are called the **Fourier Coefficients** of $x$.
:::

A corollary of this theorem: for any linear operator $T:V\to V$, consider the matrix representation $A:=[T]_\beta\in\text{M}_n(F)$ under an orthonormal basis $\beta=\{v_1,\dots,v_n\}$; then:
$$
\forall 1\le i,j\le n,\quad A_{ij}=\iprod{T(v_j)}{v_i}
$$
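The Gram-Schmidt recursion above translates directly into code. Below is a minimal NumPy sketch for the standard inner product on $\mathbb{R}^n$ (the function name `gram_schmidt` and the sample vectors are ours), followed by a check of the Fourier-coefficient formula:

```python
import numpy as np

def gram_schmidt(vectors):
    """Apply w_k = v_k - sum_{i<k} <v_k, w_i>/||w_i||^2 * w_i to linearly
    independent vectors; returns an orthogonal (not yet normalized) list."""
    ws = []
    for v in vectors:
        w = v.astype(float)
        for u in ws:
            w = w - (np.dot(v, u) / np.dot(u, u)) * u
        ws.append(w)
    return ws

S = [np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 1.0])]
W = gram_schmidt(S)
gamma = [w / np.linalg.norm(w) for w in W]   # orthonormal basis for Span(S) = R^3

# Every pair in W is orthogonal ...
assert all(abs(np.dot(W[i], W[j])) < 1e-12 for i in range(3) for j in range(i))
# ... and the Fourier-coefficient formula x = sum_i <x, v_i> v_i recovers any x.
x = np.array([2.0, -1.0, 3.0])
assert np.allclose(x, sum(np.dot(x, v) * v for v in gamma))
```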
### Orthogonal complement

:::warning
Let $V$ be an inner product space. Suppose $S$ is a subset of $V$; then we define the orthogonal complement $S^{\perp}$ of $S$ as:
$$
S^{\perp}:=\{y\in V:\forall x\in S,\iprod{y}{x}=0\}
$$
:::

$S^{\perp}$ is clearly a subspace of $V$. The orthogonal complement provides a useful decomposition of $V$: imagine $S$ (which has to be a subspace of $V$) as the $x$-axis and $S^{\perp}$ as the $y$-axis of $\mathbb{R}^2$.

:::info
That is, if $S$ is a finite-dimensional subspace of $V$, then $\forall x\in V,\exists!u,v$ where $u\in S,v\in S^{\perp},\text{s.t. }u+v=x$.
:::

This is like breaking an $n$-dimensional, or even infinite-dimensional, inner product space down into a $2$-dimensional-like picture.

:::warning
Also, the vector $u\in S$ is the **closest** vector in $S$ to $x$ and is called the orthogonal projection of $x$ on $S$. Here **closest** means that $y\in S\implies\|y-x\|\ge\|u-x\|$, and the equality holds if and only if $y=u$.
:::

### The standard inner product

On $\mathbb{R}^n$ and $\mathbb{C}^n$, the standard inner product of two vectors $x=(x_1,\dots,x_n)$ and $y=(y_1,\dots,y_n)$ is defined as $\iprod{x}{y}:=\displaystyle\sum_{i=1}^nx_i\overline{y_i}$, i.e., the sum of the products of corresponding coordinates, with the second factor conjugated in the complex case.

:::warning
Let $\beta:=\{v_1,\dots,v_n\}$ be a basis for a finite-dimensional vector space $V$ over $F$. For any $x,y\in V$, write $[x]_\beta=(a_1,\dots,a_n)^t$ and $[y]_\beta=(b_1,\dots,b_n)^t$ and define:
$$
\iprod{x}{y}:=\displaystyle\sum_{i=1}^n a_i\cdot\overline{b_i}
$$
Then $\iprod{\cdot}{\cdot}$ is an inner product on $V$. Furthermore, let $\phi_{\beta}$ be the isomorphism between $V$ and $F^n$ where $\forall x\in V,\phi_{\beta}(x):=[x]_{\beta}$; then $\forall x,y\in V,\iprod{[x]_\beta}{[y]_{\beta}}'=\iprod{x}{y}$, where $\iprod{\cdot}{\cdot}'$ is the standard inner product on $F^n$. (When $V$ is already an inner product space and $\beta$ is orthonormal, $a_i=\iprod{x}{v_i}$ and $b_i=\iprod{y}{v_i}$, so this construction recovers the original inner product as $\iprod{x}{y}=\sum_{i=1}^n\iprod{x}{v_i}\cdot\overline{\iprod{y}{v_i}}$.)
:::

Note that if $x$ and $y$ are in $F^n$ with the standard inner product, then $y^*x=\iprod{x}{y}$.

## Adjoint of a Linear Operator

Before defining the adjoint of a linear operator $T$ on a finite-dimensional inner product space $V$, we first state a lemma to ensure that the adjoint is well-defined.

:::warning
Let $V$ be a finite-dimensional inner product space over $F$. Then
$$
\forall f\in V^*,\exists!y\in V,\text{s.t. }\forall x\in V,f(x)=\iprod{x}{y}
$$
where $V^*$ is the dual space of $V$.
:::

:::spoiler Proof
Because of the existence of an orthonormal basis $\beta=\{v_1,\dots,v_n\}$ for $V$, we can obtain the dual basis $\beta^*=\{f_1,\dots,f_n\}$ of $V^*$. Hence we may write $f=\displaystyle\sum_{i=1}^na_if_i$ where the $a_i$ are unique scalars in $F$. If we take
$$
y:=\sum_{i=1}^n\overline{a_i}v_i
$$
and suppose $x=\displaystyle\sum_{i=1}^nb_iv_i$, then we get:
$$
\iprod{x}{y}=\sum_{i=1}^na_ib_i=f(x)
$$
Uniqueness follows from the earlier property that $\iprod{x}{y}=\iprod{x}{z}$ for every $x\in V$ implies $y=z$.
:::

:::warning
Hence if $T$ is a linear operator on $V$, then for each $y\in V$, defining $g\in V^*$ by $g(x):=\iprod{T(x)}{y}$, we know that there exists a unique vector $y'\in V$ such that $g(x)=\iprod{x}{y'}$ for all $x\in V$. We define the **adjoint** of $T$ as the linear operator $T^*$ on $V$ where $T^*(y)=y'$, that is, $\iprod{T(x)}{y}=\iprod{x}{T^*(y)}$ for every $x,y\in V$.
:::

:::info
The matrix representation of the adjoint of a given linear operator $T$ under an orthonormal basis $\beta$ for $V$ is:
$$
[T^*]_{\beta}=[T]_{\beta}^*
$$
Hence if $A\in\text{M}_n(F)$, then $L_{A^*}=(L_A)^*$.
:::

:::info
The adjoint also has the following properties. Let $V$ be an inner product space and $T,U$ be linear operators on $V$ whose adjoints exist.
(a) $cT+U$ has an adjoint, and $(cT+U)^*=\bar{c}T^*+U^*$ where $c\in F$
(b) $TU$ has an adjoint, and $(TU)^*=U^*T^*$
\(c) $T^*$ has an adjoint, and $(T^*)^*=T$
(d) $I$ has an adjoint, and $I^*=I$
By $L_{A^*}=(L_A)^*$, the matrix versions of these properties also hold.
:::

Note that if $V$ is a finite-dimensional inner product space, we can always find an orthonormal basis for it; hence every linear operator $T$ on $V$ has an adjoint.
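As a quick sanity check (not part of the development above), the defining identity $\iprod{T(x)}{y}=\iprod{x}{T^*(y)}$ can be verified numerically for a matrix operator on $\mathbb{C}^3$ with the standard inner product, where the adjoint is the conjugate transpose; the matrix and vectors below are random and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random complex matrix and vectors; the adjoint is the conjugate transpose.
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)

def iprod(u, v):
    """Standard inner product on C^n: <u, v> = sum_i u_i * conj(v_i) = v^* u."""
    return np.vdot(v, u)          # np.vdot conjugates its FIRST argument

A_star = A.conj().T
assert np.isclose(iprod(A @ x, y), iprod(x, A_star @ y))   # <Ax, y> = <x, A*y>
```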
### Least squares approximation

Suppose there are $m$ data points $(t_i,y_i)$ in $\mathbb{R}^2$. We want to find a line with the equation $y=ct+d$ that best represents the data.

:::warning
One approach is to minimize the error $E$ defined as:
$$
E=\displaystyle\sum_{i=1}^m(y_i-ct_i-d)^2
$$
that is, to minimize the sum of the squared vertical distances from the data points to the line.
:::

There are different approaches to finding $c$ and $d$; one uses adjoints of linear operators.

If we let $A:=\begin{pmatrix}t_1&1\\t_2&1\\\vdots&\vdots\\t_m&1\end{pmatrix}$ and $x:=(c,d)^t$ and $y=(y_1,\dots,y_m)^t$, then clearly $E=\|y-Ax\|^2$. We develop a general method for finding an explicit vector $x_0\in F^n$ such that $\|y-Ax_0\|\le\|y-Ax\|$ for every $x\in F^n$, where $A\in\text{M}_{m\times n}(F)$ and $y\in F^m$.

:::info
Observe that $\iprod{Ax}{y}_m=\iprod{x}{A^*y}_n$, where $\iprod{\cdot}{\cdot}_m,\iprod{\cdot}{\cdot}_n$ are the standard inner products on $F^m,F^n$ respectively. Also $\text{rank}(A^*A)=\text{rank}(A)$ by the fact that $A^*Ax=\mathit0\iff Ax=\mathit0$ and the dimension theorem; hence $A^*A$ is invertible if $\text{rank}(A)=n$.
By theorem, there exists a unique vector in $R(L_A)$ that is closest to $y$, say $Ax_0$. Since $Ax_0-y\in (R(L_A))^\perp$, we have $\iprod{Ax}{Ax_0-y}_m=0$ for every $x\in F^n$, and therefore
$$
\forall x\in F^n, \iprod{x}{A^*Ax_0-A^*y}_n=0
$$
which holds if and only if $A^*Ax_0=A^*y$.
:::

In addition, if $\text{rank}(A)=n$, then $x_0=(A^*A)^{-1}A^*y$.

The other approach is to take partial derivatives with respect to $c$ and $d$, i.e., to set
$$
\begin{cases}\dfrac{\partial}{\partial c}E=0\\\dfrac{\partial}{\partial d}E=0\end{cases}
$$
We get
$$
\begin{cases}\displaystyle\sum_{i=1}^mt_i(ct_i+d-y_i)=0\\\displaystyle\sum_{i=1}^m(ct_i+d-y_i)=0\end{cases}
$$
which are called the **normal equations**. A direct calculation shows that they are exactly the system $A^*Ax_0=A^*y$.
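A minimal NumPy sketch of this recipe, $x_0=(A^*A)^{-1}A^*y$, on made-up data points (the numbers below are purely illustrative):

```python
import numpy as np

# Hypothetical data points (t_i, y_i) to fit with y = c t + d.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

A = np.column_stack([t, np.ones_like(t)])   # rows (t_i, 1); rank(A) = 2 = n
x0 = np.linalg.solve(A.T @ A, A.T @ y)      # normal equations A*A x0 = A*y (here A* = A^T)
c, d = x0
print(c, d)                                 # slope and intercept minimizing E = ||y - Ax||^2

# Sanity check against NumPy's built-in least-squares routine.
assert np.allclose(x0, np.linalg.lstsq(A, y, rcond=None)[0])
```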
### Minimal solution to a linear system

:::info
Let $A\in\text{M}_{m\times n}(F)$ and $b\in F^m$. Suppose that $Ax=b$ is consistent. Then the following statements are true.
(a) There exists a unique minimal solution $s$ of $Ax=b$ (the solution of smallest norm), and $s\in R(L_{A^*})$.
(b) The vector $s$ is the only solution to $Ax=b$ that lies in $R(L_{A^*})$; in fact, if $u$ satisfies $AA^*u=b$, then $s=A^*u$.
:::

## Normal and Self-Adjoint Operators

:::warning
Let $V$ be an inner product space over $F$ and $T$ be a linear operator that has an adjoint. If $TT^*=T^*T$ or $T=T^*$, then we say $T$ is normal or self-adjoint, respectively. Note that if $T$ is self-adjoint, then $T$ is also normal.
:::

:::info
Let $V$ be an inner product space over $F$ and $T$ be normal on $V$. The following statements are true.
(a) $\forall x\in V,\|T(x)\|=\|T^*(x)\|$
(b) $T-cI$ is normal for every $c\in F$
\(c) If $T(x)=\lambda x$, then $T^*(x)=\bar{\lambda}x$
(d) If $\lambda_1$ and $\lambda_2$ are distinct eigenvalues of $T$ with corresponding eigenvectors $x_1$ and $x_2$, then $\iprod{x_1}{x_2}=0$.
:::

### Schur's Theorem

:::info
Let $T$ be a linear operator on a finite-dimensional inner product space $V$. If the characteristic polynomial of $T$ splits over $F$, then there exists an orthonormal basis $\gamma$ for $V$ such that $[T]_{\gamma}$ is upper triangular.
:::

### Orthonormal Diagonalization

:::warning
Let $V$ be an inner product space over $F$ and $T$ be a linear operator. We say $T$ is orthonormally diagonalizable if and only if there exists an orthonormal basis for $V$ consisting of eigenvectors of $T$.
:::

:::warning
Let $V$ be finite-dimensional. The following statements are true.
(a) $T$ is normal if and only if $T$ is orthonormally diagonalizable, over $\mathbb{C}$.
(b) $T$ is self-adjoint if and only if $T$ is orthonormally diagonalizable, over $\mathbb{R}$.
:::

:::spoiler Proof
For (a), by the Fundamental Theorem of Algebra the characteristic polynomial splits over $\mathbb{C}$. Hence, by Schur's theorem, there exists an orthonormal basis $\beta=\{v_1,\dots,v_n\}$ for $V$ such that $[T]_{\beta}$ is upper triangular. We know that $v_1$ is an eigenvector of $T$; we claim that every $v_k$, $1\le k\le n$, is an eigenvector of $T$. By mathematical induction on $k$, assume $v_1,\dots,v_{k-1}$ are eigenvectors of $T$ for some $1\le k\le n$. Let $A=[T]_\beta$; then:
$$
T(v_k)=\displaystyle\sum_{j=1}^kA_{jk}v_j
$$
Consider any $j<k$ and let $\lambda_j$ denote the eigenvalue of $T$ corresponding to $v_j$. Since $T$ is normal, $T^*(v_j)=\overline{\lambda_j}v_j$, so we have
$$
A_{jk}=\iprod{T(v_k)}{v_j}=\iprod{v_k}{T^*(v_j)}=\lambda_j\iprod{v_k}{v_j}=0
$$
Therefore $T(v_k)=A_{kk}v_k$, which means that $v_k$ is an eigenvector of $T$. (For the converse, if $[T]_\gamma$ is diagonal for some orthonormal basis $\gamma$, then $[T^*]_\gamma=[T]_\gamma^*$ is diagonal as well, and diagonal matrices commute, so $TT^*=T^*T$.)

For (b), we use the fact that if $T$ is self-adjoint, then all of the eigenvalues of $T$ are real; combined with the Fundamental Theorem of Algebra, this shows the characteristic polynomial splits over $\mathbb{R}$. Applying Schur's theorem to obtain an orthonormal basis $\beta$ and observing that $[T^*]_{\beta}=[T]^*_{\beta}=[T]_{\beta}$ is both upper triangular and lower triangular, hence diagonal, it follows that $\beta$ consists of eigenvectors of $T$. (Conversely, if $[T]_\gamma$ is a real diagonal matrix for some orthonormal basis $\gamma$, then $[T]_\gamma^*=[T]_\gamma$, so $T$ is self-adjoint.)
:::

## Positive Definite and Semidefinite Operators

:::warning
A **Positive Definite** or a **Positive Semidefinite** linear operator is a *self-adjoint* linear operator $T$ that satisfies $\forall x\in V-\{\mathit0\},\iprod{T(x)}{x}>0$ or $\forall x\in V-\{\mathit0\},\iprod{T(x)}{x}\ge0$, respectively.
:::

:::info
It is easy to deduce that the eigenvalues of a self-adjoint linear operator $T$ are all positive or all nonnegative if and only if $T$ is positive definite or positive semidefinite, respectively.
:::

The self-adjointness requirement is natural: a scalar satisfying $>0$ or $\ge0$ must be real, and over $\mathbb{C}$ an operator with $\iprod{T(x)}{x}$ real for every $x$ is automatically self-adjoint (over $\mathbb{R}$, self-adjointness has to be assumed separately).

:::info
Let $V$ be a finite-dimensional inner product space. A linear operator $T$ is positive semidefinite if and only if $\exists B\in\text{M}_{n}(F),\text{s.t. }[T]_{\beta}=B^*B$, where $\beta$ is an orthonormal basis for $V$.
:::

:::danger
We first claim that the isomorphism $\phi_{\beta}:V\to F^n$ defined as $\forall x\in V,\phi_{\beta}(x):=[x]_{\beta}$, where $n=\dim(V)$ and $\beta=\{x_1,\dots,x_n\}$ is an orthonormal basis for $V$, satisfies:
$$
\forall u,v\in V,\ \iprod{u}{v}=\iprod{\phi_{\beta}(u)}{\phi_{\beta}(v)}'
$$
where $\iprod{\cdot}{\cdot}'$ is the standard inner product on $F^n$.
:::

:::spoiler Proof
Suppose that
$$
u=\sum_{i=1}^na_ix_i,\quad v=\sum_{i=1}^nb_ix_i
$$
Then
\begin{align*}
\iprod{u}{v}&=\iprod{\sum_{i=1}^na_ix_i}{\sum_{i=1}^nb_ix_i}
\\&=\sum_{i=1}^n\iprod{a_ix_i}{\sum_{j=1}^nb_jx_j}
\\&=\sum_{i=1}^n\iprod{a_ix_i}{b_ix_i}
\\&=\sum_{i=1}^na_i\overline{b_i}
\\&=\iprod{(a_1,a_2,\dots,a_n)^t}{(b_1,b_2,\dots,b_n)^t}'
\\&=\iprod{\phi_\beta(u)}{\phi_{\beta}(v)}'
\end{align*}
where the third equality uses the orthonormality of $\beta$.
:::

We prove the ($\implies$) direction in the proof below; the converse follows from the isomorphism and the fact that
$$
[T]_\beta=B^*B=[L_{B^*}]_\gamma[L_B]_\gamma=[(L_B)^*]_\gamma[L_B]_\gamma=[(L_B)^*L_B]_\gamma
$$
where $\gamma$ is the standard (orthonormal) basis for $F^n$. Writing $v:=\phi_\beta(x)=[x]_\beta$, we get
$$
\iprod{T(x)}{x}=\iprod{\phi_\beta(T(x))}{\phi_\beta(x)}'=\iprod{[T]_\beta[x]_\beta}{[x]_\beta}'=\iprod{(L_B)^*L_B(v)}{v}'=\iprod{L_B(v)}{L_B(v)}'=\|{L_B(v)}\|^2\ge0
$$
so we can conclude that $T$ is positive semidefinite.
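Before the proof of the ($\implies$) direction, here is a small numerical illustration of both directions for real symmetric matrices with the standard inner product on $\mathbb{R}^n$ (the matrices are random and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# (<=) Any matrix of the form B*B is positive semidefinite: <B*Bx, x> = ||Bx||^2 >= 0.
B = rng.normal(size=(3, 3))
A = B.T @ B                                   # real case: B* = B^T
x = rng.normal(size=3)
assert np.isclose(x @ A @ x, np.linalg.norm(B @ x) ** 2)

# (=>) A positive semidefinite (symmetric) matrix factors as B*B with B = C Q^T,
# where A = Q D Q^T is an orthonormal eigendecomposition and C = sqrt(D).
D_diag, Q = np.linalg.eigh(A)                 # Q orthogonal, D_diag >= 0
C = np.diag(np.sqrt(np.clip(D_diag, 0, None)))
B2 = C @ Q.T
assert np.allclose(B2.T @ B2, A)
```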
:::spoiler Proof
Suppose $T$ is positive semidefinite. Since $T$ is self-adjoint, there exists an orthonormal basis $\gamma$ for $V$ consisting of eigenvectors of $T$.
Let $D=[T]_{\gamma}$ and let $w_i\in\gamma$ be an eigenvector corresponding to the eigenvalue $\lambda_i$ for every $i$; then $D$ is diagonal and
$$
0\le\iprod{T(w_i)}{w_i}=\lambda_i
$$
Hence $D\in\text{Sym}_{n}(\mathbb{R})$, where $\text{Sym}_{n}(\mathbb{R})\subset\text{M}_n(\mathbb{R})$ is the set of all symmetric matrices. Because every eigenvalue of $T$ is nonnegative, we can take the square roots of the diagonal entries to get another diagonal matrix $C\in\text{M}_n(\mathbb{R})$. Since diagonal matrices are symmetric, $C^*C=D$.
Let $Q:=[I_V]^{\beta}_\gamma\in\text{M}_n(F)$.
Claim: $Q^{-1}=Q^*$.
Write $\beta=\{v_1,\dots,v_n\}$. By the definition of $Q$, the $i$-th column of $Q$ is the coordinate vector $[w_i]_\beta=(a_{1,i},a_{2,i},\dots,a_{n,i})^t\in F^n$, that is:
$$
w_i=\sum_{k=1}^na_{k,i}v_k
$$
Hence, using the orthonormality of $\beta$ and $\gamma$, we can get:
\begin{align*}
\delta_{i,j}=I_{i,j}=\iprod{w_i}{w_j}&=\iprod{\sum_{k=1}^na_{k,i}v_k}{\sum_{t=1}^na_{t,j}v_t}
\\&=\sum_{k=1}^n\iprod{a_{k,i}v_k}{\sum_{t=1}^na_{t,j}v_t}
\\&=\sum_{k=1}^n\iprod{a_{k,i}v_k}{a_{k,j}v_k}
\\&=\sum_{k=1}^na_{k,i}\overline{a_{k,j}}
\\&=\sum_{k=1}^nQ_{k,i}Q^*_{j,k}
\\&=(Q^*Q)_{j,i}
\end{align*}
This implies that $Q^*Q=I$, so $Q^*=Q^{-1}$.
Since $Q=[I_V]^\beta_\gamma$, we have $[T]_\beta=Q[T]_\gamma Q^{-1}=QDQ^*=QC^*CQ^*=(CQ^*)^*(CQ^*)$, so $B:=CQ^*$ works.
:::

:::info
If $T,U$ are both positive semidefinite operators and $T^2=U^2$, then $T=U$.
:::

:::spoiler Proof
$\iprod{T^2(v)}{v}=\iprod{T(v)}{T(v)}\ge0$ and $(T^2)^*=T^*T^*=T^2$, so $T^2$ is positive semidefinite; hence there exists an orthonormal basis $\beta$ consisting of eigenvectors of $T^2$. (Note that $T-cI_V$ and $T^2-cI_V$ are self-adjoint for every $c\in\mathbb{R}$, where $I_V$ is the identity transformation.)
Suppose $\lambda_i$ is an eigenvalue of $T^2$ and $v_i\in\beta$ is an eigenvector corresponding to it. Then $\lambda_i\ge 0$ and $(T^2-\lambda_iI_V)(v_i)=\mathit0$. Since
$$
T^2-\lambda_iI_V=(T+\sqrt{\lambda_i}I_V)(T-\sqrt{\lambda_i}I_V)
$$
writing $w:=(T-\sqrt{\lambda_i}I_V)(v_i)$ we get $(T+\sqrt{\lambda_i}I_V)(w)=\mathit0$, and hence
$$
0=\iprod{(T+\sqrt{\lambda_i}I_V)(w)}{w}=\iprod{T(w)}{w}+\sqrt{\lambda_i}\|w\|^2
$$
Both terms on the right are nonnegative because $T$ is positive semidefinite, so $\iprod{T(w)}{w}=0$ and $\sqrt{\lambda_i}\|w\|^2=0$. If $\lambda_i>0$, this forces $w=\mathit0$. If $\lambda_i=0$, then $\|T(v_i)\|^2=\iprod{T^2(v_i)}{v_i}=0$, so $T(v_i)=\mathit0$ and again $w=\mathit0$. In both cases $(T-\sqrt{\lambda_i}I_V)(v_i)=\mathit0$, so $v_i$ is an eigenvector of $T$ corresponding to the eigenvalue $\sqrt{\lambda_i}$.
Since $T^2=U^2$, the same argument shows that $v_i$ is also an eigenvector of $U$ corresponding to the eigenvalue $\sqrt{\lambda_i}$, that is:
$$
T(v_i)=\sqrt{\lambda_i}v_i=U(v_i)
$$
Finally, using the fact that $\beta$ is a basis consisting of eigenvectors of $T^2$ on which $T$ and $U$ agree, we can conclude that $T(v)=U(v)$ for every $v\in V$.
:::

:::info
If $T,U$ are both positive definite operators and $TU=UT$, then $TU$ is positive definite.
:::

Since $T$ and $U$ are self-adjoint and commute, $(TU)^*=U^*T^*=UT=TU$, so $TU$ is self-adjoint. Now let $\lambda$ be an eigenvalue of $TU$ with eigenvector $x\not=\mathit0$. Because $U$ is positive definite, $\iprod{U(x)}{x}>0$, and in particular $U(x)\not=\mathit0$, so:
\begin{align*}
0&<\iprod{T(U(x))}{U(x)}
\\&=\iprod{TU(x)}{U(x)}
\\&=\lambda\iprod{x}{U(x)}
\\&=\lambda\overline{\iprod{U(x)}{x}}
\\&=\lambda\iprod{U(x)}{x}
\end{align*}
Hence $\lambda>0$. Since $TU$ is self-adjoint and all of its eigenvalues are positive, $TU$ is positive definite by the eigenvalue criterion above.

:::warning
We say a square matrix $A$ is positive definite or positive semidefinite if $L_A$ is positive definite or positive semidefinite, respectively.
:::

:::info
Let $V$ be a finite-dimensional inner product space and $\beta$ be an orthonormal basis for $V$. The above theorems also hold for matrices: a square matrix $[T]_{\beta}$ is positive definite or positive semidefinite if and only if $T$ is positive definite or positive semidefinite.
:::

This is because of the isomorphism $\phi_{\beta}$ and the claim:
$$
\forall u,v\in V,\iprod{u}{v}=\iprod{\phi_{\beta}(u)}{\phi_{\beta}(v)}'
$$
from which we get $\iprod{T(x)}{x}=\iprod{L_A(\phi_\beta(x))}{\phi_{\beta}(x)}'$ where $A=[T]_{\beta}$, so one side is positive (respectively nonnegative) for all nonzero $x$ exactly when the other is.
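The uniqueness statement $T^2=U^2\implies T=U$ is what makes *the* positive semidefinite square root well defined. A short NumPy sketch, assuming the standard inner product on $\mathbb{R}^n$, computes it from the orthonormal eigendecomposition exactly as in the proofs above (the matrices are random and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# A random positive semidefinite matrix A = B*B.
B = rng.normal(size=(4, 4))
A = B.T @ B

# Orthonormal eigendecomposition (spectral theorem for real symmetric matrices).
eigvals, Q = np.linalg.eigh(A)               # A = Q diag(eigvals) Q^T, Q orthogonal
assert np.all(eigvals >= -1e-12)             # eigenvalues of a PSD matrix are nonnegative

# The positive semidefinite square root T with T^2 = A; by the theorem it is unique.
T = Q @ np.diag(np.sqrt(np.clip(eigvals, 0, None))) @ Q.T
assert np.allclose(T, T.T) and np.all(np.linalg.eigvalsh(T) >= -1e-12)
assert np.allclose(T @ T, A)
```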
## Unitary and Orthogonal Operators

Let $V$ be a finite-dimensional inner product space over $F$ and $T$ be a linear operator on $V$.

:::warning
If $\forall x\in V, \|T(x)\|=\|x\|$, then we say $T$ is a unitary or an orthogonal operator on $V$ when $F=\mathbb{C}$ or $F=\mathbb{R}$, respectively.
:::

:::info
The following statements are equivalent.
(a) $T^*T=I$
(b) $TT^*=I$
\(c) $\forall x\in V,\|T(x)\|=\|x\|$
(d) If $\beta$ is an orthonormal basis for $V$, then so is $T(\beta)$.
(e) There exists an orthonormal basis $\beta$ for $V$ such that $T(\beta)$ is orthonormal.
(f) $\forall x,y\in V,\iprod{T(x)}{T(y)}=\iprod{x}{y}$
:::

:::info
Hence over $\mathbb{R}$, $T$ is orthonormally diagonalizable and all of its eigenvalues have absolute value $1$ if and only if $T$ is self-adjoint and orthogonal. Over $\mathbb{C}$, $T$ is orthonormally diagonalizable and all of its eigenvalues have absolute value $1$ if and only if $T$ is unitary.
:::

A useful corollary is that if $T$ is a unitary or an orthogonal operator, then $T$ is invertible, with $T^{-1}=T^*$.

We also define unitary and orthogonal matrices:

:::warning
If a square matrix's adjoint is its inverse, then we call it a unitary matrix over $\mathbb{C}$ and an orthogonal matrix over $\mathbb{R}$.
:::

Note that if $A^*A=I$ where $A$ is a square matrix, then the columns of $A$ form an orthonormal basis for $F^n$.

:::warning
We say $A,B$ are unitarily equivalent or orthogonally equivalent if and only if there exists a unitary or orthogonal matrix $P$, respectively, such that $A=P^*BP$. Because $P$ is unitary, we have $P^*=P^{-1}$, so $A$ and $B$ are also similar.
:::

:::info
Let $A$ be a complex square matrix; then $A$ is normal if and only if $A$ is unitarily equivalent to a diagonal matrix. A real square matrix $A$ is symmetric if and only if $A$ is orthogonally equivalent to a real diagonal matrix.
:::

The matrix version of Schur's Theorem:

:::info
If $A$ is a square matrix whose characteristic polynomial splits over $F$, then:
(a) If $F=\mathbb{C}$, then $A$ is unitarily equivalent to a complex upper triangular matrix.
(b) If $F=\mathbb{R}$, then $A$ is orthogonally equivalent to a real upper triangular matrix.
:::

### Rigid Motions

:::warning
Let $V$ be a real inner product space. A rigid motion is a function $f:V\to V$ that satisfies $\|f(x)-f(y)\|=\|x-y\|$ for every $x,y\in V$.
:::

:::warning
A translation is a function $g:V\to V$ for which there exists a (necessarily unique) $v_0\in V$ such that $\forall x\in V,g(x)=x+v_0$.
:::

:::info
Let $f:V\to V$ be a rigid motion on a finite-dimensional real inner product space $V$. Then
$$
\exists !g,T:V\to V,\text{s.t. }f=g\circ T
$$
where $g$ is a translation and $T$ is an orthogonal operator.
:::

### Conic Sections

:::warning
A quadratic equation in two variables is an equation of the form
$$
ax^2+2bxy+cy^2+dx+ey+f=0
$$
for some $a,b,c,d,e,f\in\mathbb{R}$.
:::

Because the terms $dx$, $ey$, and $f$ can be dealt with by a translation of the axes after the rotation below, we need only consider the terms
$$
ax^2+2bxy+cy^2
$$
which is called the **associated quadratic form** of $ax^2+2bxy+cy^2+dx+ey+f=0$. We wish to eliminate the $xy$ term by changing axes; then we can easily identify the conic section as an ellipse, a parabola, or a hyperbola.

:::info
If we let
$$
A=\begin{pmatrix}a&b\\b&c\end{pmatrix}\quad\text{and}\quad X=\begin{pmatrix}x\\y\end{pmatrix}
$$
then $ax^2+2bxy+cy^2=X^tAX=\iprod{AX}{X}$. Because $A$ is symmetric, we can choose an orthogonal matrix $P$ such that $P^tAP=D$ is diagonal, and the diagonal entries of $D$ are eigenvalues of $A$. We define our new axes by $X':=\begin{pmatrix}x'\\y'\end{pmatrix}=P^tX$; then we get the transformation
$$
X=PX'\implies X^tAX=(PX')^tA(PX')=(X')^tP^tAPX'=(X')^tDX'=\lambda_1(x')^2+\lambda_2(y')^2
$$
where $\lambda_1,\lambda_2$ are eigenvalues of $A$. Furthermore, $\det(P)=\pm 1$, and if $\det(P)=-1$ we can interchange the columns of $P$ to obtain $Q$ with $\det(Q)=1$; hence we may assume $\det(P)=1$, so the transformation $P$ represents a rotation of the axes.
:::
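A short NumPy sketch of this change of variables (the coefficients $a,b,c$ below are arbitrary sample values): diagonalize the symmetric matrix $A$ with an orthogonal $P$ of determinant $1$ and read off $\lambda_1(x')^2+\lambda_2(y')^2$:

```python
import numpy as np

# Quadratic form a x^2 + 2b xy + c y^2 with sample coefficients.
a, b, c = 3.0, 2.0, 3.0
A = np.array([[a, b],
              [b, c]])

lam, P = np.linalg.eigh(A)            # A = P diag(lam) P^T with P orthogonal
if np.linalg.det(P) < 0:              # force det(P) = 1 so the change of axes is a rotation
    P = P[:, ::-1]
    lam = lam[::-1]

# In the rotated coordinates X' = P^T X the cross term disappears.
rng = np.random.default_rng(3)
X = rng.normal(size=2)
Xp = P.T @ X
assert np.isclose(X @ A @ X, lam[0] * Xp[0] ** 2 + lam[1] * Xp[1] ** 2)
print(lam)   # both eigenvalues positive here, so lam1 x'^2 + lam2 y'^2 = const is an ellipse
```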
## Orthogonal projections and The Spectral Theorem

### Projection

:::warning
Let $V$ be a vector space and $W_1,W_2$ be subspaces of $V$ with $W_1\oplus W_2=V$. If a linear operator $T$ on $V$ satisfies:
$$
x=x_1+x_2,x_1\in W_1,x_2\in W_2\implies T(x)=x_1
$$
then we say $T$ is **the projection on $W_1$ along $W_2$.** It immediately follows that $R(T)=W_1$ and $N(T)=W_2$, hence $V=R(T)\oplus N(T)$. With this in mind, we refer to $T$ as a **projection on $W_1$** or simply as a **projection**.
:::

### Orthogonal projection

:::warning
Let $V$ be an inner product space and $T$ be a projection. We call $T$ an orthogonal projection if $R(T)^\perp=N(T)$ and $N(T)^\perp=R(T)$.
:::

:::info
Let $V$ be an inner product space and $T$ be a linear operator on $V$. Then $T$ is an orthogonal projection if and only if $T$ has an adjoint and $T^*=T^2=T$.
:::

### The Spectral Theorem

:::info
Suppose $T$ is a linear operator on a finite-dimensional inner product space $V$ over $F$ with distinct eigenvalues $\lambda_i,i=1,\dots,k$. Assume that $T$ is normal if $F=\mathbb{C}$ and that $T$ is self-adjoint if $F=\mathbb{R}$. For each $i$, let $T_i$ be the orthogonal projection of $V$ on $E_{\lambda_i}$. Then the following statements are true.
(a) $V=\displaystyle\bigoplus_{i=1}^{k} E_{\lambda_i}$
(b) $\displaystyle\bigoplus_{j=1,j\not=i}^kE_{\lambda_{j}}=E_{\lambda_i}^\perp$
\(c) $\forall 1\le i,j\le k,T_iT_j=\delta_{i,j}T_i$
(d) $\displaystyle\sum_{i=1}^kT_i=I$
(e) $\displaystyle\sum_{i=1}^k\lambda_iT_i=T$
:::

## Singular Value Decomposition and The Pseudoinverse