---
title: Linear Algebra Note 4
tags: Linear Algebra, 線性代數, 魏群樹, 大學, 國立陽明交通大學, 筆記
---

# Linear Algebra Note 4

## Orthogonality (2021.11.10 ~ 2021.11.17)

- Example
$A = \left[\begin{array}{c c} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ \end{array}\right],\ \vec{b} = \left[\begin{array}{c} 6 \\ 0 \\ 0 \\ \end{array}\right],\ P = A(A^TA)^{-1}A^T \\ \text{where } A^TA = \left[\begin{array}{c c c} 1 & 1 & 1 \\ 0 & 1 & 2 \\ \end{array}\right]\left[\begin{array}{c c} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ \end{array}\right] = \left[\begin{array}{c c} 3 & 3 \\ 3 & 5 \\ \end{array}\right] \\ A^TA\vec{x} = A^T\vec{b} = \left[\begin{array}{c c} 3 & 3 \\ 3 & 5 \\ \end{array}\right]\left[\begin{array}{c} \hat{x_1} \\ \hat{x_2} \\ \end{array}\right] = \left[\begin{array}{c c c} 1 & 1 & 1 \\ 0 & 1 & 2 \\ \end{array}\right]\left[\begin{array}{c} 6 \\ 0 \\ 0 \\ \end{array}\right] = \left[\begin{array}{c} 6 \\ 0 \\ \end{array}\right] \implies \hat{\vec{x}} = \left[\begin{array}{c} \hat{x_1} \\ \hat{x_2} \\ \end{array}\right] = \left[\begin{array}{c} 5 \\ -3 \\ \end{array}\right] \\ \vec{p} = A\hat{\vec{x}} = \left[\begin{array}{c c} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ \end{array}\right]\left[\begin{array}{c} 5 \\ -3 \\ \end{array}\right] = \left[\begin{array}{c} 5 \\ 2 \\ -1 \\ \end{array}\right] \\ \vec{e} = \vec{b} - \vec{p} = \left[\begin{array}{c} 1 \\ -2 \\ 1 \\ \end{array}\right],\ \text{check } \vec{e}^T\vec{p} = \left[\begin{array}{c} 1 \\ -2 \\ 1 \\ \end{array}\right]^T\left[\begin{array}{c} 5 \\ 2 \\ -1 \\ \end{array}\right] = 0,\ \text{check } \vec{e}^TA = \left[\begin{array}{c} 1 \\ -2 \\ 1 \\ \end{array}\right]^T\left[\begin{array}{c c} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ \end{array}\right] = 0\\ \text{Alternatively, } P = A(A^TA)^{-1}A^T \text{ where } (A^TA)^{-1} = {1 \over 6}\left[\begin{array}{c c} 5 & -3 \\ -3 & 3 \\ \end{array}\right] \implies P = {1 \over 6}\left[\begin{array}{c c c} 5 & 2 & -1 \\ 2 & 2 & 2 \\ -1 & 2 & 5 \\ \end{array}\right] \\ \vec{p} = P\vec{b} = {1 \over 6}\left[\begin{array}{c c c} 5 & 2 & -1 \\ 2 & 2 & 2 \\ -1 & 2 & 5 \\ \end{array}\right]\left[\begin{array}{c} 6 \\ 0 \\ 0 \\ \end{array}\right] = \left[\begin{array}{c} 5 \\ 2 \\ -1 \\ \end{array}\right]$
- Theorem
$P^2 = P$
  - Intuition
  $P\vec{b} = \vec{p},\ P(P\vec{b}) = P^2\vec{b} = \vec{p}$
  - Proof
  $P = A(A^TA)^{-1}A^T \\ \begin{split}P^2 = PP &= (A(A^TA)^{-1}A^T)(A(A^TA)^{-1}A^T) \\ &= A(A^TA)^{-1}\color{red}{A^TA(A^TA)^{-1}}A^T \\ &= A(A^TA)^{-1}A^T = P\end{split}$
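The projection example above is easy to reproduce numerically. Below is a minimal NumPy sketch (not part of the original notes) that recomputes $\hat{\vec{x}}$, $\vec{p}$, $\vec{e}$, and $P$, and checks $\vec{e} \perp \vec{p}$ and $P^2 = P$.

```python
import numpy as np

A = np.array([[1, 0],
              [1, 1],
              [1, 2]], dtype=float)
b = np.array([6, 0, 0], dtype=float)

# Solve the normal equations A^T A x_hat = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)        # -> [ 5. -3.]

# Projection p = A x_hat and error e = b - p
p = A @ x_hat                                    # -> [ 5.  2. -1.]
e = b - p                                        # -> [ 1. -2.  1.]

# Projection matrix P = A (A^T A)^{-1} A^T, and the checks P b = p, P^2 = P, e ⟂ p
P = A @ np.linalg.inv(A.T @ A) @ A.T
print(np.allclose(P @ b, p), np.allclose(P @ P, P), np.isclose(e @ p, 0.0))
```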
### Least Square Approximations

In many practical cases, $A\vec{x} = \vec{b}$ has no solution, e.g. $A$ is a tall matrix (more equations than unknowns).
Let $\vec{e} \triangleq A\vec{x} - \vec{b}$. The best approximation is the one that minimizes $\vec{e}$, i.e. minimizes $\begin{Vmatrix} \vec{e} \end{Vmatrix}^2$; this is the least squares approximation.

- Example (Linear fit)
Find the closest line to the points $(t,\ b) = (0,\ 6),\ (1,\ 0),\ (2,\ 0)$
Let the line be $b = x_0 + x_1t$
$\left.\begin{array}{c} x_0 + x_1 \cdot 0 = 6 \\ x_0 + x_1 \cdot 1 = 0 \\ x_0 + x_1 \cdot 2 = 0 \\ \end{array}\right. \implies \left[\begin{array}{c c} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ \end{array}\right]\left[\begin{array}{c} x_0 \\ x_1 \\ \end{array}\right] = \left[\begin{array}{c} 6 \\ 0 \\ 0 \\ \end{array}\right]\ (A\vec{x} = \vec{b})$
Let $\vec{e} = A\vec{x} - \vec{b}$ and minimize $\begin{Vmatrix} \vec{e} \end{Vmatrix}^2$
$\vec{e} \perp C(A) \iff A^T\vec{e} = 0 \iff A^T(A\hat{\vec{x}} - \vec{b}) = 0 \\ \begin{split} \hat{\vec{x}} &= (A^TA)^{-1}A^T\vec{b} \\ &= (\left[\begin{array}{c c c} 1 & 1 & 1 \\ 0 & 1 & 2 \\ \end{array}\right]\left[\begin{array}{c c} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ \end{array}\right])^{-1}\left[\begin{array}{c c c} 1 & 1 & 1 \\ 0 & 1 & 2 \\ \end{array}\right]\left[\begin{array}{c} 6 \\ 0 \\ 0 \\ \end{array}\right] = \left[\begin{array}{c} 5 \\ -3 \\ \end{array}\right]\end{split} \\ \therefore b = 5 - 3t \text{ is the best linear fit}$
- Theorem (Orthogonality principle)
Let $\vec{e} = A\vec{x} - \vec{b}$ and let $\hat{\vec{x}}$ be the solution to $\min \begin{Vmatrix} \vec{e} \end{Vmatrix}^2 = \min \begin{Vmatrix} A\vec{x} - \vec{b} \end{Vmatrix}^2$. Then $\vec{e} \perp C(A)$ and $A\hat{\vec{x}} = P\vec{b} = \vec{p}$, where $P$ is the projection matrix onto $C(A)$ and $\hat{\vec{x}}$ is the solution to $A^TA\hat{\vec{x}} = A^T\vec{b}$
  - Proof
  Let $\vec{b} = \vec{p} + \vec{e}$ where $\vec{p} \in C(A)$ and $\vec{e} \in N(A^T)$. While $A\vec{x} = \vec{b}$ may **NOT** be solvable, $A\vec{x} = \vec{p}$ is always solvable since $\vec{p} \in C(A)$.
  The objective function (to be minimized) is
  $\begin{split}\begin{Vmatrix} A\vec{x} - \vec{b} \end{Vmatrix}^2 &= \begin{Vmatrix} A\vec{x} - \vec{p} - \vec{e} \end{Vmatrix}^2 \\ &= (A\vec{x} - \vec{p} - \vec{e})^T(A\vec{x} - \vec{p} - \vec{e}) \\ &= \begin{Vmatrix} A\vec{x} - \vec{p} \end{Vmatrix}^2 + \begin{Vmatrix} \vec{e} \end{Vmatrix}^2 - (A\vec{x}-\vec{p})^T\vec{e} - \vec{e}^T(A\vec{x}-\vec{p}) \\ &= \begin{Vmatrix} A\vec{x} - \vec{p} \end{Vmatrix}^2 + \begin{Vmatrix} \vec{e} \end{Vmatrix}^2 - 0 - 0\ (\because N(A^T) \perp C(A)) \\ &= \begin{Vmatrix} A\vec{x} - \vec{p} \end{Vmatrix}^2 + \begin{Vmatrix} \vec{e} \end{Vmatrix}^2\end{split}$
  - Goal: $\min \begin{Vmatrix} A\vec{x}-\vec{b} \end{Vmatrix}^2 = \min \left[ \begin{Vmatrix} A\vec{x} - \vec{p} \end{Vmatrix}^2 + \begin{Vmatrix} \vec{e} \end{Vmatrix}^2 \right] = \min \begin{Vmatrix} A\vec{x} - \vec{p} \end{Vmatrix}^2 + \begin{Vmatrix} \vec{e} \end{Vmatrix}^2 \ge \begin{Vmatrix} \vec{e} \end{Vmatrix}^2$ (the $\begin{Vmatrix} \vec{e} \end{Vmatrix}^2$ term does not depend on $\vec{x}$)
  We attain equality by choosing $\vec{x} = \hat{\vec{x}}$ such that $A\hat{\vec{x}} = P\vec{b} = \vec{p}$; the least squares error is then $\begin{Vmatrix} \vec{e} \end{Vmatrix}^2$
- Summary: When $A\vec{x}=\vec{b}$ has **NO** solution, multiply by $A^T$ and solve $A^TA\hat{\vec{x}}=A^T\vec{b}$. The solution to this alternative system of equations is the least squares approximation to $A\vec{x} = \vec{b}$

![](https://i.imgur.com/el6B4lZ.jpg)
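As a sanity check of the linear fit, the sketch below solves the same normal equations $A^TA\hat{\vec{x}} = A^T\vec{b}$ in NumPy and compares against `np.linalg.lstsq`, which solves the same least squares problem; the variable names simply mirror the notation above.

```python
import numpy as np

# Points (t, b) = (0, 6), (1, 0), (2, 0); model b = x0 + x1 * t
t = np.array([0.0, 1.0, 2.0])
b = np.array([6.0, 0.0, 0.0])

A = np.column_stack([np.ones_like(t), t])   # columns: [1, t]

# Normal equations A^T A x_hat = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
print(x_hat)                                # [ 5. -3.]  ->  b = 5 - 3t

# Same answer from the library least squares routine
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_hat, x_lstsq))          # True
```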
### Fitting by a parabola

Given a quadratic equation $b = x_0 + x_1t + x_2t^2$ with observations $(b,\ t)=(b_1,\ t_1),\ \cdots,\ (b_m,\ t_m)$ with $m > 3$ points, the $m$ equations for an exact fit are usually unsolvable
$\left\{\begin{array}{c} x_0 + x_1t_1 + x_2t^2_1 = b_1 \\ x_0 + x_1t_2 + x_2t^2_2 = b_2 \\ \vdots \\ x_0 + x_1t_m + x_2t^2_m = b_m \\ \end{array}\right. \implies A = \left[\begin{array}{c c c} 1 & t_1 & t_1^2 \\ \vdots & \vdots & \vdots \\ 1 & t_m & t_m^2 \\ \end{array}\right]_{m \times 3},\ A\left[\begin{array}{c} x_0 \\ x_1 \\ x_2 \\ \end{array}\right] = \left[\begin{array}{c} b_1 \\ b_2 \\ \vdots \\ b_m \\ \end{array}\right]\ (A\vec{x} = \vec{b})$
Least squares: solve $A^TA\hat{\vec{x}} = A^T\vec{b}$

### Orthogonal Bases

A vector space can be spanned by different bases. A basis whose vectors are mutually orthogonal is called an orthogonal basis.
Recall that vectors $\vec{q_1},\ \vec{q_2},\ \cdots,\ \vec{q_n}$ are orthogonal if $<\vec{q_i},\ \vec{q_j}> = \vec{q_i}^T\vec{q_j} = \left\{\begin{array}{c} 0\ ,\ i \not= j \\ \begin{Vmatrix} \vec{q_i} \end{Vmatrix}^2,\ i = j \\ \end{array}\right.$
- Define (Orthonormal):
The vectors $\vec{q_1},\ \vec{q_2},\ \cdots,\ \vec{q_n}$ are orthonormal (orthogonal and normalized) if $<\vec{q_i},\ \vec{q_j}> = \vec{q_i}^T\vec{q_j} = \left\{\begin{array}{c} 0\ ,\ i \not= j \\ 1\ ,\ i = j \\ \end{array}\right.$ (normalized to unit vectors)
- Theorem
Let $Q = [\vec{q_1}\ \vec{q_2}\ \cdots\ \vec{q_n}]$ where $\vec{q_1},\ \vec{q_2},\ \cdots,\ \vec{q_n}$ are orthonormal. Then $Q^TQ = I$. Moreover, if $Q$ is square, then $Q^{-1} = Q^T$
  - Proof
  $\begin{split}Q^TQ = \left[\begin{array}{c} \vec{q_1}^T \\ \vec{q_2}^T \\ \vdots \\ \vec{q_n}^T \\ \end{array}\right]\left[\begin{array}{c c c c} \vec{q_1} & \vec{q_2} & \cdots & \vec{q_n} \\ \end{array}\right] &= \left[\begin{array}{c c c c} \vec{q_1}^T\vec{q_1} & \vec{q_1}^T\vec{q_2} & \cdots & \vec{q_1}^T\vec{q_n} \\ \vec{q_2}^T\vec{q_1} & \vec{q_2}^T\vec{q_2} & \cdots & \vec{q_2}^T\vec{q_n} \\ \vdots & \vdots & \ddots & \vdots \\ \vec{q_n}^T\vec{q_1} & \vec{q_n}^T\vec{q_2} & \cdots & \vec{q_n}^T\vec{q_n} \\ \end{array}\right] \\ &= \left[\begin{array}{c c c c} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ \end{array}\right] = I\end{split}$
  If $Q$ is square $\implies$ size $= n \times n$ and $Q^TQ = I \implies Q^{-1} = Q^T$
- Example
Permutation matrix $Q = \left[\begin{array}{c c c} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \\ \end{array}\right]$ is orthogonal
  - Check
  $Q^TQ = \left[\begin{array}{c c c} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ \end{array}\right]\left[\begin{array}{c c c} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \\ \end{array}\right] = \left[\begin{array}{c c c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array}\right] = I$
- Example
Rotation matrix $Q = \left[\begin{array}{c c} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \\ \end{array}\right]$ rotates a vector counterclockwise by $\theta$ and $Q$ is orthogonal
  - Check
  $Q^TQ = \left[\begin{array}{c c} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \\ \end{array}\right]\left[\begin{array}{c c} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \\ \end{array}\right] = \left[\begin{array}{c c} 1 & 0 \\ 0 & 1 \\ \end{array}\right] = I$
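Both examples are easy to verify numerically. The short sketch below checks $Q^TQ = I$ for the permutation matrix and for a rotation matrix (the angle value is an arbitrary choice for the demonstration).

```python
import numpy as np

# Permutation matrix from the example
P = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [1., 0., 0.]])

# Rotation matrix for an arbitrary angle theta
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Both have orthonormal columns: Q^T Q = I
print(np.allclose(P.T @ P, np.eye(3)))   # True
print(np.allclose(R.T @ R, np.eye(2)))   # True
```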
- Theorem
Orthonormal matrices (matrices $Q$ with orthonormal columns) preserve the length of a vector, i.e. $\begin{Vmatrix} Q\vec{x} \end{Vmatrix} = \begin{Vmatrix} \vec{x} \end{Vmatrix}\ \forall\ \vec{x} \in \mathbb{R}^n$. Moreover, they preserve dot products, i.e. $(Q\vec{x})^T(Q\vec{y}) = \vec{x}^T\vec{y}$
  - Proof
  $(Q\vec{x})^T(Q\vec{y}) = \vec{x}^T\color{red}{Q^TQ}\vec{y} = \vec{x}^T\color{red}{I}\vec{y} = \vec{x}^T\vec{y}$
  Let $\vec{y} = \vec{x}$, then $(Q\vec{x})^T(Q\vec{x}) = \vec{x}^T\vec{x} \iff \begin{Vmatrix} Q\vec{x} \end{Vmatrix}^2 = \begin{Vmatrix} \vec{x} \end{Vmatrix}^2 \iff \begin{Vmatrix} Q\vec{x} \end{Vmatrix} = \begin{Vmatrix} \vec{x} \end{Vmatrix}$

> Recall that to project a vector $\vec{b}$ onto the subspace spanned by the columns of $A$ (the columns are a basis for $C(A)$):
> $\begin{split}<\vec{e},\ A>\ =\ &<\vec{b}-A\vec{x},\ A>\ =\ 0 \\ \implies &A^T(\vec{b}-A\vec{x}) = 0 \\ \implies &A^T\vec{b} = A^TA\vec{x} \\ \implies &\hat{\vec{x}} = (A^TA)^{-1}A^T\vec{b},\ \vec{p} = A\hat{\vec{x}} = A(A^TA)^{-1}A^T\vec{b}\end{split}$

When we have an orthonormal basis of the subspace, $Q = [\vec{q_1}\ \vec{q_2}\ \cdots\ \vec{q_n}],\ Q^TQ = I$, then $\hat{\vec{x}} = Q^T\vec{b},\ \vec{p} = Q\hat{\vec{x}} = QQ^T\vec{b}$

- The projection of $\vec{b}$ onto a subspace is the sum of the projections of $\vec{b}$ onto each orthonormal vector $\vec{q_i}$ in the orthonormal basis of the subspace
$\begin{split}\vec{p} &= QQ^T\vec{b} = \left[\begin{array}{c c c c} \vec{q_1} & \vec{q_2} & \cdots & \vec{q_n} \\ \end{array}\right]\left[\begin{array}{c} \vec{q_1}^T \\ \vec{q_2}^T \\ \vdots \\ \vec{q_n}^T \\ \end{array}\right]\vec{b} = \left[\begin{array}{c c c c} \vec{q_1} & \vec{q_2} & \cdots & \vec{q_n} \\ \end{array}\right]\left[\begin{array}{c} \vec{q_1}^T\vec{b} \\ \vec{q_2}^T\vec{b} \\ \vdots \\ \vec{q_n}^T\vec{b} \\ \end{array}\right] \\ &=(\underbrace{\vec{q_1}^T\vec{b}}_\text{scalar})\vec{q_1} + (\vec{q_2}^T\vec{b})\vec{q_2} + \cdots + (\vec{q_n}^T\vec{b})\vec{q_n}\ (\vec{b} = \vec{p}+\vec{e},\ \vec{e} \perp \vec{q_i}) \\ &= (\vec{q_1}^T\vec{p})\vec{q_1} + (\vec{q_2}^T\vec{p})\vec{q_2} + \cdots + (\vec{q_n}^T\vec{p})\vec{q_n} \\ &= \vec{p_1} + \vec{p_2} + \cdots + \vec{p_n}\end{split}$
- Let $\vec{p} = \hat{x_1}\vec{q_1} + \hat{x_2}\vec{q_2} + \cdots + \hat{x_n}\vec{q_n}$
$\vec{q_i}^T\vec{p} = \vec{q_i}^T(\hat{x_1}\vec{q_1} + \hat{x_2}\vec{q_2} + \cdots + \hat{x_n}\vec{q_n}) = \hat{x_i}\vec{q_i}^T\vec{q_i}\ (\vec{q_i}^T\vec{q_j} = 0\ \forall\ j \not= i) \\ \therefore \vec{p}^T\vec{q_i} = \hat{x_i}\vec{q_i}^T\vec{q_i},\ \hat{x_i} = {\vec{p}^T\vec{q_i} \over \begin{Vmatrix} \vec{q_i} \end{Vmatrix}^2} \text{ where } \begin{Vmatrix} \vec{q_i} \end{Vmatrix}^2 = 1 \text{ for an orthonormal basis } \{ \vec{q_i} \}$
- Example
$\vec{q_1} = \left[\begin{array}{c} 1 \\ 0 \\ 0 \\ \end{array}\right],\ \vec{q_2} = \left[\begin{array}{c} 0 \\ 1 \\ 0 \\ \end{array}\right],\ \vec{b} = \left[\begin{array}{c} 4 \\ 3 \\ 2 \\ \end{array}\right] \\ \vec{p_1} = (\vec{q_1}^T\vec{b})\vec{q_1} = \left[\begin{array}{c} 1 \\ 0 \\ 0 \\ \end{array}\right]^T\left[\begin{array}{c} 4 \\ 3 \\ 2 \\ \end{array}\right]\left[\begin{array}{c} 1 \\ 0 \\ 0 \\ \end{array}\right] = \left[\begin{array}{c} 4 \\ 0 \\ 0 \\ \end{array}\right] \\ \vec{p_2} = (\vec{q_2}^T\vec{b})\vec{q_2} = \left[\begin{array}{c} 0 \\ 1 \\ 0 \\ \end{array}\right]^T\left[\begin{array}{c} 4 \\ 3 \\ 2 \\ \end{array}\right]\left[\begin{array}{c} 0 \\ 1 \\ 0 \\ \end{array}\right] = \left[\begin{array}{c} 0 \\ 3 \\ 0 \\ \end{array}\right] \\ \vec{p} = \vec{p_1} + \vec{p_2} = \left[\begin{array}{c} 4 \\ 3 \\ 0 \\ \end{array}\right] = \text{ Projection of } \vec{b} \text{ onto } span(\{ \vec{q_1},\ \vec{q_2}\})$
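A minimal NumPy sketch of this example: projecting $\vec{b}$ onto $span(\{\vec{q_1},\ \vec{q_2}\})$ as a sum of one-dimensional projections, and the same result from $\vec{p} = QQ^T\vec{b}$.

```python
import numpy as np

q1 = np.array([1.0, 0.0, 0.0])
q2 = np.array([0.0, 1.0, 0.0])
b  = np.array([4.0, 3.0, 2.0])

# Sum of one-dimensional projections onto each orthonormal vector
p = (q1 @ b) * q1 + (q2 @ b) * q2
print(p)                                  # [4. 3. 0.]

# Same result from p = Q Q^T b
Q = np.column_stack([q1, q2])
print(np.allclose(Q @ (Q.T @ b), p))      # True
```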
### The Gram-Schmidt Process

A systematic way to create an orthonormal basis from an arbitrary basis
Input: $\vec{v_1},\ \vec{v_2},\ \cdots,\ \vec{v_n}$ ($n$ non-zero vectors)
Output: $\vec{q_1},\ \vec{q_2},\ \cdots,\ \vec{q_r}\ (r = rank([\vec{v_1}\ \vec{v_2}\ \cdots\ \vec{v_n}]))$
Step 1: Let $\vec{e_1} = \vec{v_1},\ \vec{q_1} = { \vec{e_1} \over \begin{Vmatrix} \vec{e_1} \end{Vmatrix}}$
Step 2+: $\underbrace{\vec{e_k} = \vec{v_k} - \sum\limits_{i = 1}^{k-1}(\vec{q_i}^T\vec{v_k})\vec{q_i}}_\text{orthogonalization},\ \underbrace{\vec{q_k} = { \vec{e_k} \over \begin{Vmatrix} \vec{e_k} \end{Vmatrix}}}_\text{normalization}$ (if $\vec{e_k} = \vec{0}$, then $\vec{v_k}$ depends on the previous vectors and is skipped, which is why only $r$ vectors are produced)

- Example
$\vec{a} = \left[\begin{array}{c} 1 \\ -1 \\ 0 \\ \end{array}\right],\ \vec{b} = \left[\begin{array}{c} 2 \\ 0 \\ -2 \\ \end{array}\right],\ \vec{c} = \left[\begin{array}{c} 3 \\ -3 \\ 3 \\ \end{array}\right]$
1. Let $\vec{q_1} = { \vec{a} \over \begin{Vmatrix} \vec{a} \end{Vmatrix}} = {1 \over \sqrt{2}}\left[\begin{array}{c} 1 \\ -1 \\ 0 \\ \end{array}\right]$
2. Let $\vec{e_2} = \vec{b} - (\text{projection of } \vec{b} \text{ onto } \vec{q_1}) = \vec{b} - (\vec{q_1}^T\vec{b})\vec{q_1} = \left[\begin{array}{c} 1 \\ 1 \\ -2 \\ \end{array}\right],\ \vec{q_2} = { \vec{e_2} \over \begin{Vmatrix} \vec{e_2} \end{Vmatrix}} = {1 \over \sqrt{6}}\left[\begin{array}{c} 1 \\ 1 \\ -2 \\ \end{array}\right]$
3. Let $\vec{e_3} = \vec{c} - (\vec{q_1}^T\vec{c})\vec{q_1} - (\vec{q_2}^T\vec{c})\vec{q_2} = \left[\begin{array}{c} 1 \\ 1 \\ 1 \\ \end{array}\right],\ \vec{q_3} = { \vec{e_3} \over \begin{Vmatrix} \vec{e_3} \end{Vmatrix}} = {1 \over \sqrt{3}}\left[\begin{array}{c} 1 \\ 1 \\ 1 \\ \end{array}\right]$
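A small NumPy sketch of the process as stated above; the function name `gram_schmidt` and the tolerance are my own choices, not from the notes. Running it on $\vec{a},\ \vec{b},\ \vec{c}$ reproduces $\vec{q_1},\ \vec{q_2},\ \vec{q_3}$.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthogonalize then normalize a list of vectors; dependent vectors are skipped."""
    qs = []
    for v in vectors:
        e = v.astype(float)
        for q in qs:
            e = e - (q @ v) * q          # subtract the projection of v onto each earlier q
        if np.linalg.norm(e) > tol:      # keep only independent directions
            qs.append(e / np.linalg.norm(e))
    return qs

a = np.array([1.0, -1.0,  0.0])
b = np.array([2.0,  0.0, -2.0])
c = np.array([3.0, -3.0,  3.0])

for q in gram_schmidt([a, b, c]):
    print(q)
# q1 = [1, -1, 0]/sqrt(2),  q2 = [1, 1, -2]/sqrt(6),  q3 = [1, 1, 1]/sqrt(3)
```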
### QR Factorization (Decomposition)

$\{\vec{v_1},\ \vec{v_2},\ \cdots,\ \vec{v_n}\} \text{ (linearly independent) } \xrightarrow{Gram-Schmidt\ Process} \{\vec{q_1},\ \vec{q_2},\ \cdots,\ \vec{q_n}\}$
Express $\vec{v_k}$ by the newly obtained orthonormal basis $\{\vec{q_1},\ \vec{q_2},\ \cdots,\ \vec{q_n}\}$
$\vec{v_1} = (\vec{q_1}^T\vec{v_1})\vec{q_1} \\ \vec{v_2} = (\vec{q_2}^T\vec{v_2})\vec{q_2} + (\vec{q_1}^T\vec{v_2})\vec{q_1} \\ \ \vdots \\ \vec{v_k} = (\vec{q_1}^T\vec{v_k})\vec{q_1} + \cdots + (\vec{q_k}^T\vec{v_k})\vec{q_k}$
$\begin{split}\text{Let } A &= \left[\begin{array}{c c c c} \vec{v_1} & \vec{v_2} & \cdots & \vec{v_k} \\ \end{array}\right] \\ &= \left[\begin{array}{c c c c} (\vec{q_1}^T\vec{v_1})\vec{q_1} & (\vec{q_1}^T\vec{v_2})\vec{q_1} & & (\vec{q_1}^T\vec{v_k})\vec{q_1} \\ & + (\vec{q_2}^T\vec{v_2})\vec{q_2} & \cdots & + (\vec{q_2}^T\vec{v_k})\vec{q_2} \\ & & & + \cdots \\ & & & + (\vec{q_k}^T\vec{v_k})\vec{q_k} \\ \end{array}\right] \\ &= \left[\begin{array}{c c c c} \vec{q_1} & \vec{q_2} & \cdots & \vec{q_k} \\ \end{array}\right]\left[\begin{array}{c c c c} \vec{q_1}^T\vec{v_1} & \vec{q_1}^T\vec{v_2} & \cdots & \vec{q_1}^T\vec{v_k} \\ 0 & \vec{q_2}^T\vec{v_2} & \cdots & \vec{q_2}^T\vec{v_k} \\ \vdots & 0 & & \vdots \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \vec{q_k}^T\vec{v_k} \\ \end{array}\right] = QR\end{split}$

$A = QR$; consider the least squares approximation $A^TA\vec{\hat{x}} = A^T\vec{b}$ where $A=QR$
$\begin{split}&\iff (QR)^T(QR)\vec{\hat{x}} = (QR)^T\vec{b} \\ &\iff R^T\color{red}{Q^TQ}R\vec{\hat{x}} = R^TQ^T\vec{b} \\ &\iff R^TR\vec{\hat{x}} = R^TQ^T\vec{b} \\ &\iff (R^T)^{-1}R^TR\vec{\hat{x}} = (R^T)^{-1}R^TQ^T\vec{b} \\ &\iff R\vec{\hat{x}} = Q^T\vec{b} \\ &\therefore \text{ we can solve } \vec{\hat{x}} \text{ by back substitution using } R\end{split}$
- Example
$A = \left[\begin{array}{c c c} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \\ \end{array}\right] \implies \vec{v_1} = \left[\begin{array}{c} 1 \\ 1 \\ 0 \\ \end{array}\right],\ \vec{v_2} = \left[\begin{array}{c} 1 \\ 0 \\ 1 \\ \end{array}\right],\ \vec{v_3} = \left[\begin{array}{c} 0 \\ 1 \\ 1 \\ \end{array}\right] \\ \vec{q_1} = { \vec{v_1} \over \begin{Vmatrix} \vec{v_1} \end{Vmatrix}} = {1 \over \sqrt{2}}\left[\begin{array}{c} 1 \\ 1 \\ 0 \\ \end{array}\right] \\ \vec{e_2} = \vec{v_2} - (\vec{q_1}^T\vec{v_2})\vec{q_1} = \left[\begin{array}{c} {1 \over 2} \\ {-1 \over 2} \\ 1 \\ \end{array}\right],\ \vec{q_2} = { \vec{e_2} \over \begin{Vmatrix} \vec{e_2} \end{Vmatrix}} = \left[\begin{array}{c} {1 \over \sqrt{6}} \\ {-1 \over \sqrt{6}} \\ {2 \over \sqrt{6}} \\ \end{array}\right] \\ \vec{e_3} = \vec{v_3} - (\vec{q_1}^T\vec{v_3})\vec{q_1} - (\vec{q_2}^T\vec{v_3})\vec{q_2} = \left[\begin{array}{c} {-2 \over 3} \\ {2 \over 3} \\ {2 \over 3} \\ \end{array}\right],\ \vec{q_3} = { \vec{e_3} \over \begin{Vmatrix} \vec{e_3} \end{Vmatrix}} = \left[\begin{array}{c} {-1 \over \sqrt{3}} \\ {1 \over \sqrt{3}} \\ {1 \over \sqrt{3}} \\ \end{array}\right] \\ \therefore Q = \left[\begin{array}{c c c} \vec{q_1} & \vec{q_2} & \vec{q_3} \\ \end{array}\right] = \left[\begin{array}{c c c} {1 \over \sqrt{2}} & {1 \over \sqrt{6}} & {-1 \over \sqrt{3}} \\ {1 \over \sqrt{2}} & {-1 \over \sqrt{6}} & {1 \over \sqrt{3}} \\ 0 & {2 \over \sqrt{6}} & {1 \over \sqrt{3}} \\ \end{array}\right],\ \begin{split}R &= \left[\begin{array}{c c c} \vec{q_1}^T\vec{v_1} & \vec{q_1}^T\vec{v_2} & \vec{q_1}^T\vec{v_3} \\ 0 & \vec{q_2}^T\vec{v_2} & \vec{q_2}^T\vec{v_3} \\ 0 & 0 & \vec{q_3}^T\vec{v_3}\\ \end{array}\right] \\ &= \left[\begin{array}{c c c} \sqrt{2} & {1 \over \sqrt{2}} & {1 \over \sqrt{2}} \\ 0 & {3 \over \sqrt{6}} & {1 \over \sqrt{6}} \\ 0 & 0 & {2 \over \sqrt{3}}\\ \end{array}\right]\end{split}$
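The factorization and the least squares recipe $R\hat{\vec{x}} = Q^T\vec{b}$ can be checked with NumPy. In the sketch below, `np.linalg.qr` may flip the signs of some columns relative to the hand computation, and the right-hand side $\vec{b}$ is an arbitrary vector chosen for illustration.

```python
import numpy as np

A = np.array([[1., 1., 0.],
              [1., 0., 1.],
              [0., 1., 1.]])
b = np.array([1., 2., 3.])        # arbitrary right-hand side for illustration

Q, R = np.linalg.qr(A)            # A = QR; column signs may differ from the hand computation
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(3)))

# Solve R x_hat = Q^T b by back substitution (R is upper triangular)
y = Q.T @ b
x_hat = np.zeros_like(y)
for i in reversed(range(len(y))):
    x_hat[i] = (y[i] - R[i, i+1:] @ x_hat[i+1:]) / R[i, i]

print(np.allclose(A @ x_hat, b))  # True here, since this A happens to be invertible
```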
## Determinants (2021.11.19)

- Example
Determinant of a $2 \times 2$ matrix: $A = \left[\begin{array}{c c} a & b \\ c & d \\ \end{array}\right],\ det(A) = \begin{vmatrix} A \end{vmatrix} = ad-bc$
- Properties
1. $\begin{vmatrix} I_n \end{vmatrix} = 1$
2. Exchange of two rows leads to a change of sign
$A = \left[\begin{array}{c c} a & b \\ c & d \\ \end{array}\right],\ A' = \left[\begin{array}{c c} c & d \\ a & b \\ \end{array}\right],\ \begin{vmatrix} A' \end{vmatrix} = bc-ad = -(ad-bc) = -\begin{vmatrix} A \end{vmatrix}$
3. The determinant is a linear function of each row separately
(scalar multiplication) $A' = \left[\begin{array}{c c} ta & tb \\ c & d \\ \end{array}\right] \implies \begin{vmatrix} A' \end{vmatrix} = t(ad-bc) = t\begin{vmatrix} A \end{vmatrix}$
(vector addition) $\begin{vmatrix} A'' \end{vmatrix} = \begin{vmatrix} \left[\begin{array}{c c} a+a' & b+b' \\ c & d \\ \end{array}\right] \end{vmatrix} = \begin{vmatrix} A \end{vmatrix}+\begin{vmatrix} A' \end{vmatrix} \text{ where } A' = \left[\begin{array}{c c} a' & b' \\ c & d \\ \end{array}\right]$
4. If two rows of $A$ are equal, $\begin{vmatrix} A \end{vmatrix} = 0$
   - Proof
   Suppose two rows $\vec{a_i}^T$ and $\vec{a_j}^T$ of $A$ are equal. Let $A'$ be the matrix $A$ with $\vec{a_i}^T$ and $\vec{a_j}^T$ exchanged.
   From 2, $det(A') = -det(A) ... ①$; since $A'=A,\ det(A') = det(A) ... ②$
   From ① and ②, $det(A) = -det(A) \implies det(A) = 0$
5. Subtracting a multiple of one row from another row leaves $\begin{vmatrix} A \end{vmatrix}$ unchanged
$\begin{split}\begin{vmatrix} A' \end{vmatrix} &= \begin{vmatrix} \left[\begin{array}{c c} a & b \\ c-ka & d-kb \\ \end{array}\right] \end{vmatrix} \\ &= \begin{vmatrix} \left[\begin{array}{c c} a & b \\ c & d \\ \end{array}\right] \end{vmatrix} + \begin{vmatrix} \left[\begin{array}{c c} a & b \\ -ka & -kb \\ \end{array}\right] \end{vmatrix} \\ &= \begin{vmatrix} A \end{vmatrix} + (-k)\begin{vmatrix} \left[\begin{array}{c c} a & b \\ a & b \\ \end{array}\right] \end{vmatrix} \\ &= \begin{vmatrix} A \end{vmatrix} + 0 = \begin{vmatrix} A \end{vmatrix}\end{split}$
6. A matrix with a row of zeros has $det(A) = 0$
$A = \left[\begin{array}{c c} a & b \\ 0 & 0 \\ \end{array}\right],\ A' = \left[\begin{array}{c c} a & b \\ 0+a & 0+b \\ \end{array}\right]$ (add the first row to the zero row, a special case of 5)
From 4, $\begin{vmatrix} A' \end{vmatrix} = 0$; from 5, $\begin{vmatrix} A \end{vmatrix} = \begin{vmatrix} A' \end{vmatrix} = 0$
7. If $A$ is triangular, then $\begin{vmatrix} A \end{vmatrix} =$ product of the diagonal elements
$A = \left[\begin{array}{c c c c} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{array}\right] \xrightarrow{Jordan\ elimination} D = \left[\begin{array}{c c c c} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{array}\right] \text{ (diagonal)}$
From 5, $\begin{vmatrix} D \end{vmatrix} = \begin{vmatrix} A \end{vmatrix}$; from 3 and 1, $\begin{vmatrix} D \end{vmatrix} = a_{11}a_{22}\cdots a_{nn}\begin{vmatrix} I_n \end{vmatrix} =$ product of the diagonal elements
8. If $A$ is singular, then $\begin{vmatrix} A \end{vmatrix} = 0$; if $A$ is invertible, $\begin{vmatrix} A \end{vmatrix} \not= 0$
   - Define
   A singular matrix does not have a matrix inverse
   $A$ is singular $\implies$ $A$ does not have full rank $\implies$ $A$ contains an all-zero row after Gauss-Jordan Elimination
   Note that Gaussian Elimination converts $A$ to an upper triangular matrix $U$ using row operations that do not change the determinant and row exchanges that only change its sign
   $\therefore \begin{vmatrix} U \end{vmatrix} = (-1)^k\begin{vmatrix} A \end{vmatrix}$ where $k$ = the number of row exchanges
   If $A$ is singular, $U$ has an all-zero row, so from 6, $\begin{vmatrix} A \end{vmatrix} = {1 \over (-1)^k}\begin{vmatrix} U \end{vmatrix} = 0$
   If $A$ is invertible, $\begin{vmatrix} U \end{vmatrix} =$ product of the diagonal elements of $U$, which are the non-zero pivots $\implies \begin{vmatrix} A \end{vmatrix} = {1 \over (-1)^k}\begin{vmatrix} U \end{vmatrix} \not= 0$
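The argument in property 8 is essentially an algorithm: eliminate $A$ to an upper triangular $U$, count the row exchanges $k$, and read off $\begin{vmatrix} A \end{vmatrix} = (-1)^k \times$ (product of the diagonal of $U$). Below is a minimal sketch of that idea (the helper `det_by_elimination` and its tolerance are my own choices, not from the notes).

```python
import numpy as np

def det_by_elimination(A, tol=1e-12):
    """det(A) = (-1)^k * prod(diagonal of U) after Gaussian elimination with k row exchanges."""
    U = np.array(A, dtype=float)
    n = U.shape[0]
    sign = 1.0
    for j in range(n):
        pivot = j + np.argmax(np.abs(U[j:, j]))          # partial pivoting
        if abs(U[pivot, j]) < tol:
            return 0.0                                   # no usable pivot: A is singular
        if pivot != j:
            U[[j, pivot]] = U[[pivot, j]]                # row exchange flips the sign (property 2)
            sign = -sign
        U[j+1:] -= np.outer(U[j+1:, j] / U[j, j], U[j])  # property 5: det unchanged
    return sign * np.prod(np.diag(U))                    # property 7

A = np.array([[1., 1., 0.],
              [1., 0., 1.],
              [0., 1., 1.]])
print(det_by_elimination(A), np.linalg.det(A))           # both -2.0 (up to rounding)
```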