
Linear Algebra Note 4

Orthogonality (2021.11.10 ~ 2021.11.17)

  • Example

    $$
    A=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix},\quad
    b=\begin{bmatrix}6\\0\\0\end{bmatrix},\quad
    P=A(A^TA)^{-1}A^T
    $$

    where

    $$
    A^TA=\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}=\begin{bmatrix}3&3\\3&5\end{bmatrix}
    $$

    $$
    A^TA\hat{x}=A^Tb:\quad
    \begin{bmatrix}3&3\\3&5\end{bmatrix}\begin{bmatrix}\hat{x}_1\\\hat{x}_2\end{bmatrix}
    =\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}6\\0\\0\end{bmatrix}
    =\begin{bmatrix}6\\0\end{bmatrix}
    \Rightarrow\hat{x}=\begin{bmatrix}\hat{x}_1\\\hat{x}_2\end{bmatrix}=\begin{bmatrix}5\\-3\end{bmatrix}
    $$

    $$
    p=A\hat{x}=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}\begin{bmatrix}5\\-3\end{bmatrix}=\begin{bmatrix}5\\2\\-1\end{bmatrix},\quad
    e=b-p=\begin{bmatrix}1\\-2\\1\end{bmatrix}
    $$

    Check: $e^Tp=\begin{bmatrix}1&-2&1\end{bmatrix}\begin{bmatrix}5\\2\\-1\end{bmatrix}=0$, and $e^TA=\begin{bmatrix}1&-2&1\end{bmatrix}\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}=\begin{bmatrix}0&0\end{bmatrix}$

    Alternatively, compute $P=A(A^TA)^{-1}A^T$ directly, where $(A^TA)^{-1}=\frac{1}{6}\begin{bmatrix}5&-3\\-3&3\end{bmatrix}$:

    $$
    P=\frac{1}{6}\begin{bmatrix}5&2&-1\\2&2&2\\-1&2&5\end{bmatrix},\quad
    p=Pb=\frac{1}{6}\begin{bmatrix}5&2&-1\\2&2&2\\-1&2&5\end{bmatrix}\begin{bmatrix}6\\0\\0\end{bmatrix}=\begin{bmatrix}5\\2\\-1\end{bmatrix}
    $$

    (a NumPy check of this example appears after the theorem below)

  • Theorem

    $P^2=P$

    • Intuition
      $Pb=p,\quad P(Pb)=P^2b=p$
    • Proof
      $P=A(A^TA)^{-1}A^T$

      $$
      P^2=PP=\left(A(A^TA)^{-1}A^T\right)\left(A(A^TA)^{-1}A^T\right)=A(A^TA)^{-1}\left(A^TA\right)(A^TA)^{-1}A^T=A(A^TA)^{-1}A^T=P
      $$
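
As a quick numerical check of the example above and of $P^2=P$, here is a minimal NumPy sketch (the matrices are taken from the worked example):

```python
import numpy as np

A = np.array([[1., 0.],
              [1., 1.],
              [1., 2.]])
b = np.array([6., 0., 0.])

# Projection matrix onto C(A): P = A (A^T A)^{-1} A^T
P = A @ np.linalg.inv(A.T @ A) @ A.T

p = P @ b        # projection of b onto C(A)
e = b - p        # error component, orthogonal to C(A)

print(np.round(6 * P))          # [[ 5.  2. -1.] [ 2.  2.  2.] [-1.  2.  5.]]
print(p)                        # [ 5.  2. -1.]
print(e @ p, e @ A)             # ~0 and ~[0. 0.]: e is orthogonal to p and to C(A)
print(np.allclose(P @ P, P))    # True: projecting twice changes nothing
```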

Least Squares Approximations

In many practical cases, $Ax=b$ has no solution, e.g. when $A$ is a tall matrix (more equations than unknowns).

Let $e \triangleq Ax-b$. The best approximation is the one that minimizes $\|e\|$, or equivalently $\|e\|^2$: the least squares approximation.

  • Example (Linear fit)
    Find the closest line to points

    $(t,\ b)=(0,\ 6),\ (1,\ 0),\ (2,\ 0)$

    Let the line be $b=x_0+x_1t$:

    $$
    \begin{cases}x_0+x_1\cdot 0=6\\x_0+x_1\cdot 1=0\\x_0+x_1\cdot 2=0\end{cases}
    \Rightarrow
    \begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}\begin{bmatrix}x_0\\x_1\end{bmatrix}=\begin{bmatrix}6\\0\\0\end{bmatrix}\quad(Ax=b)
    $$

    Let $e=Ax-b$ and minimize $\|e\|^2$:

    $$
    e\perp C(A)\Rightarrow A^Te=0\Rightarrow A^T(A\hat{x}-b)=0
    \Rightarrow \hat{x}=(A^TA)^{-1}A^Tb
    =\left(\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}\right)^{-1}
    \begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}6\\0\\0\end{bmatrix}
    =\begin{bmatrix}5\\-3\end{bmatrix}
    $$

    $b=5-3t$ is the best linear fit (a NumPy check of this fit appears after this section)

  • Theorem (Orthogonality principle)
    Let $e=Ax-b$ and let $\hat{x}$ be the solution to $\min\|e\|^2=\min\|Ax-b\|^2$. Then $e\perp C(A)$ and $A\hat{x}=Pb=p$, where $P$ is the projection matrix onto $C(A)$ and $\hat{x}$ is the solution to $A^TA\hat{x}=A^Tb$

    • Proof
      Let $b=p+e$ where $p\in C(A)$ and $e\in N(A^T)$

      While $Ax=b$ may NOT be solvable, $Ax=p$ is always solvable as $p\in C(A)$

      The objective function (to be minimized) is

      $$
      \begin{aligned}
      \|Ax-b\|^2&=\|Ax-p-e\|^2=(Ax-p-e)^T(Ax-p-e)\\
      &=\|Ax-p\|^2+\|e\|^2-(Ax-p)^Te-e^T(Ax-p)\\
      &=\|Ax-p\|^2+\|e\|^2-0-0\quad(Ax-p\in C(A),\ e\in N(A^T),\ N(A^T)\perp C(A))\\
      &=\|Ax-p\|^2+\|e\|^2
      \end{aligned}
      $$
      • Goal:
        $$
        \min\|Ax-b\|^2=\min\left[\|Ax-p\|^2+\|e\|^2\right]=\min\|Ax-p\|^2+\|e\|^2\ge\|e\|^2
        $$

        We can hit the equality by choosing $x=\hat{x}$ such that $A\hat{x}=p$; the least squares minimum is then $\|e\|^2$, attained when $A\hat{x}=Pb=p$
      • Summary:
        When $Ax=b$ has NO solution, multiply by $A^T$ and solve $A^TA\hat{x}=A^Tb$. The solution to this alternative system of equations is the least squares approximation to $Ax=b$
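
The linear fit example above can be reproduced numerically. A minimal NumPy sketch, solving the normal equations and cross-checking against the built-in least squares solver:

```python
import numpy as np

# Data points (t, b) = (0, 6), (1, 0), (2, 0)
t = np.array([0., 1., 2.])
b = np.array([6., 0., 0.])

# Design matrix for the line b = x0 + x1 t
A = np.column_stack([np.ones_like(t), t])

# Normal equations: A^T A x_hat = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
print(x_hat)                         # [ 5. -3.]  ->  b = 5 - 3t

# Same answer from the built-in least squares solver
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_hat, x_lstsq))   # True
```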


Fitting by a parabola

Given a quadratic equation

$b=x_0+x_1t+x_2t^2$ with observations $(b,\ t)=(b_1,\ t_1),\ \dots,\ (b_m,\ t_m)$

with $m>3$ points, the $m$ equations for an exact fit are usually unsolvable:

$$
\begin{cases}
x_0+x_1t_1+x_2t_1^2=b_1\\
x_0+x_1t_2+x_2t_2^2=b_2\\
\qquad\vdots\\
x_0+x_1t_m+x_2t_m^2=b_m
\end{cases}
\qquad
A=\begin{bmatrix}1&t_1&t_1^2\\ \vdots&\vdots&\vdots\\ 1&t_m&t_m^2\end{bmatrix}_{m\times 3},\quad
A\begin{bmatrix}x_0\\x_1\\x_2\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\ \vdots\\ b_m\end{bmatrix}\quad(Ax=b)
$$

Least squares: solve $A^TA\hat{x}=A^Tb$
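
A sketch of the parabola fit in NumPy; the sample points are made up for illustration, since the note leaves $b_i,\ t_i$ symbolic:

```python
import numpy as np

# Hypothetical sample data (not from the note): m = 5 points
t = np.array([0., 1., 2., 3., 4.])
b = np.array([1., 2., 6., 11., 19.])

# m x 3 design matrix for b = x0 + x1 t + x2 t^2
A = np.column_stack([np.ones_like(t), t, t**2])

# Least squares via the normal equations A^T A x_hat = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
print(x_hat)   # coefficients [x0, x1, x2] of the best-fit parabola
```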

Orthogonal Bases

A vector space can be spanned by different bases. A basis consisting of mutually orthogonal vectors is called an orthogonal basis.
Recall that vectors $q_1,\ q_2,\ \dots,\ q_n$ are orthogonal if

$$
\langle q_i,\ q_j\rangle=q_i^Tq_j=\begin{cases}0,&i\neq j\\ \|q_i\|^2,&i=j\end{cases}
$$

  • Define (Orthonormal): The vectors $q_1,\ q_2,\ \dots,\ q_n$ are orthonormal (orthogonal and normalized) if

    $$
    \langle q_i,\ q_j\rangle=q_i^Tq_j=\begin{cases}0,&i\neq j\\ 1,&i=j\end{cases}
    $$

    (each $q_i$ normalized to a unit vector)

  • Theorem
    Let $Q=[q_1\ q_2\ \cdots\ q_n]$ where $q_1,\ q_2,\ \dots,\ q_n$ are orthonormal. Then $Q^TQ=I$. Moreover, if $Q$ is square, then $Q^{-1}=Q^T$

    • Proof
      $$
      Q^TQ=\begin{bmatrix}q_1^T\\q_2^T\\ \vdots\\ q_n^T\end{bmatrix}
      \begin{bmatrix}q_1&q_2&\cdots&q_n\end{bmatrix}
      =\begin{bmatrix}q_1^Tq_1&q_1^Tq_2&\cdots&q_1^Tq_n\\
      q_2^Tq_1&q_2^Tq_2&\cdots&q_2^Tq_n\\
      \vdots&\vdots&\ddots&\vdots\\
      q_n^Tq_1&q_n^Tq_2&\cdots&q_n^Tq_n\end{bmatrix}
      =\begin{bmatrix}1&0&\cdots&0\\0&1&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\0&0&\cdots&1\end{bmatrix}=I
      $$

      If $Q$ is square (size $n\times n$) and $Q^TQ=I$, then $Q^{-1}=Q^T$
  • Example
    Permutation matrix

    $$
    Q=\begin{bmatrix}0&1&0\\0&0&1\\1&0&0\end{bmatrix}
    $$

    is orthogonal

    • Check
      $$
      Q^TQ=\begin{bmatrix}0&0&1\\1&0&0\\0&1&0\end{bmatrix}\begin{bmatrix}0&1&0\\0&0&1\\1&0&0\end{bmatrix}=\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}=I
      $$
  • Example
    Rotation matrix

    $$
    Q=\begin{bmatrix}\cos\theta&-\sin\theta\\ \sin\theta&\cos\theta\end{bmatrix}
    $$

    rotates a vector counterclockwise by $\theta$, and $Q$ is orthogonal

    • Check
      $$
      Q^TQ=\begin{bmatrix}\cos\theta&\sin\theta\\ -\sin\theta&\cos\theta\end{bmatrix}\begin{bmatrix}\cos\theta&-\sin\theta\\ \sin\theta&\cos\theta\end{bmatrix}=\begin{bmatrix}1&0\\0&1\end{bmatrix}=I
      $$
  • Theorem
    Orthonormal matrices preserve the length of a vector, i.e.

    $\|Qx\|=\|x\|\ \ \forall x\in\mathbb{R}^n$

    Moreover, they preserve dot products, i.e. $(Qx)^T(Qy)=x^Ty$

    • Proof
      $(Qx)^T(Qy)=x^TQ^TQy=x^TIy=x^Ty$

      Let $y=x$: $(Qx)^T(Qx)=x^Tx\Rightarrow\|Qx\|^2=\|x\|^2\Rightarrow\|Qx\|=\|x\|$
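
A minimal NumPy check of both results, using the rotation matrix from the example above (the angle is an arbitrary test value):

```python
import numpy as np

theta = 0.7   # arbitrary test angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([3., -1.])
y = np.array([2.,  5.])

print(np.allclose(Q.T @ Q, np.eye(2)))                       # True: Q^T Q = I
print(np.isclose((Q @ x) @ (Q @ y), x @ y))                  # True: dot products preserved
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # True: lengths preserved
```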

Recall the projection of a vector $b$ onto a subspace spanned by the columns of $A$ (they are a basis for $C(A)$):

$$
\langle e,\ A\rangle=\langle b-A\hat{x},\ A\rangle=0\Rightarrow A^T(b-A\hat{x})=0\Rightarrow A^Tb=A^TA\hat{x}\Rightarrow\hat{x}=(A^TA)^{-1}A^Tb,\quad p=A\hat{x}=A(A^TA)^{-1}A^Tb
$$

When we have an orthonormal basis of the subspace, $Q=[q_1\ q_2\ \cdots\ q_n]$ with $Q^TQ=I$, this simplifies:

$$
\hat{x}=Q^Tb,\quad p=Q\hat{x}=QQ^Tb
$$

  • Projection of $b$ onto a subspace is the sum of the projections of $b$ onto each orthonormal vector $q_i$ in the orthonormal basis of the subspace

    $$
    \begin{aligned}
    p=QQ^Tb&=\begin{bmatrix}q_1&q_2&\cdots&q_n\end{bmatrix}\begin{bmatrix}q_1^T\\q_2^T\\ \vdots\\ q_n^T\end{bmatrix}b
    =\begin{bmatrix}q_1&q_2&\cdots&q_n\end{bmatrix}\begin{bmatrix}q_1^Tb\\q_2^Tb\\ \vdots\\ q_n^Tb\end{bmatrix}\\
    &=(\underbrace{q_1^Tb}_{\text{scalar}})q_1+(q_2^Tb)q_2+\cdots+(q_n^Tb)q_n\quad(b=p+e,\ e\perp q_i)\\
    &=(q_1^Tp)q_1+(q_2^Tp)q_2+\cdots+(q_n^Tp)q_n=p_1+p_2+\cdots+p_n
    \end{aligned}
    $$

  • Let $p=\hat{x}_1q_1+\hat{x}_2q_2+\cdots+\hat{x}_nq_n$. Then

    $$
    q_i^Tp=q_i^T(\hat{x}_1q_1+\hat{x}_2q_2+\cdots+\hat{x}_nq_n)=\hat{x}_i\,q_i^Tq_i\quad(q_i^Tq_j=0\ \ \forall j\neq i)
    $$

    $$
    \Rightarrow\hat{x}_i=\frac{p^Tq_i}{\|q_i\|^2},\quad\text{where }\|q_i\|^2=1\text{ for an orthonormal basis }\{q_i\}
    $$

    • Example
      $$
      q_1=\begin{bmatrix}1\\0\\0\end{bmatrix},\ q_2=\begin{bmatrix}0\\1\\0\end{bmatrix},\ b=\begin{bmatrix}4\\3\\2\end{bmatrix}
      $$

      $$
      p_1=(q_1^Tb)q_1=\left(\begin{bmatrix}1\\0\\0\end{bmatrix}^T\begin{bmatrix}4\\3\\2\end{bmatrix}\right)\begin{bmatrix}1\\0\\0\end{bmatrix}=\begin{bmatrix}4\\0\\0\end{bmatrix},\quad
      p_2=(q_2^Tb)q_2=\left(\begin{bmatrix}0\\1\\0\end{bmatrix}^T\begin{bmatrix}4\\3\\2\end{bmatrix}\right)\begin{bmatrix}0\\1\\0\end{bmatrix}=\begin{bmatrix}0\\3\\0\end{bmatrix}
      $$

      $$
      p=p_1+p_2=\begin{bmatrix}4\\3\\0\end{bmatrix}=\text{projection of }b\text{ onto span}(\{q_1,\ q_2\})
      $$
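
In NumPy, the same projection onto an orthonormal basis can be computed both as $QQ^Tb$ and as the sum of per-vector projections (vectors from the example above):

```python
import numpy as np

Q = np.array([[1., 0.],
              [0., 1.],
              [0., 0.]])      # columns are the orthonormal q1, q2
b = np.array([4., 3., 2.])

p = Q @ (Q.T @ b)             # p = Q Q^T b
print(p)                      # [4. 3. 0.]

# Same result as the sum of projections (q_i^T b) q_i
p_sum = sum((Q[:, i] @ b) * Q[:, i] for i in range(Q.shape[1]))
print(np.allclose(p, p_sum))  # True
```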

The Gram-Schmidt Process

A systematic way to create an orthonormal basis from an arbitrary basis

Input: $v_1,\ v_2,\ \dots,\ v_n$ ($n$ non-zero vectors)
Output: $q_1,\ q_2,\ \dots,\ q_r$ $\left(r=\text{rank}([v_1\ v_2\ \cdots\ v_n])\right)$

Step 1: Let $e_1=v_1$, $q_1=\frac{e_1}{\|e_1\|}$

Step $k$ ($k\ge 2$):

$$
e_k=\underbrace{v_k-\sum_{i=1}^{k-1}(q_i^Tv_k)\,q_i}_{\text{orthogonalization}},\qquad
q_k=\underbrace{\frac{e_k}{\|e_k\|}}_{\text{normalization}}
$$

  • Example
    $a=\begin{bmatrix}1\\-1\\0\end{bmatrix},\ b=\begin{bmatrix}2\\0\\-2\end{bmatrix},\ c=\begin{bmatrix}3\\-3\\3\end{bmatrix}$
    1. Let
       $q_1=\frac{a}{\|a\|}=\frac{1}{\sqrt{2}}\begin{bmatrix}1\\-1\\0\end{bmatrix}$
    2. Let
       $e_2=v_2-(\text{projection of }v_2\text{ onto }q_1)=v_2-(q_1^Tv_2)q_1,\quad q_2=\frac{e_2}{\|e_2\|}$ (here $v_2=b$)
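
A minimal sketch of the classical Gram-Schmidt process in NumPy, applied to $a,\ b,\ c$ as written above; dependent inputs are skipped so the output has $r=\text{rank}$ columns:

```python
import numpy as np

def gram_schmidt(V, tol=1e-12):
    """Orthonormalize the columns of V (classical Gram-Schmidt)."""
    qs = []
    for v in V.T:
        e = v.astype(float)
        for q in qs:
            e = e - (q @ v) * q          # subtract projection (q_i^T v_k) q_i
        norm = np.linalg.norm(e)
        if norm > tol:                    # skip linearly dependent vectors
            qs.append(e / norm)
    return np.column_stack(qs)

V = np.array([[1.,  2.,  3.],
              [-1., 0., -3.],
              [0., -2.,  3.]])            # columns a, b, c
Q = gram_schmidt(V)
print(np.round(Q, 4))
print(np.allclose(Q.T @ Q, np.eye(Q.shape[1])))  # True: orthonormal columns
```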

QR Factorization (Decomposition)

$$
\{v_1,\ v_2,\ \dots,\ v_n\}\ \text{(linearly independent)}\ \xrightarrow{\text{Gram-Schmidt Process}}\ \{q_1,\ q_2,\ \dots,\ q_n\}
$$

Express each $v_k$ in the newly obtained orthonormal basis $\{q_1,\ q_2,\ \dots,\ q_n\}$:

$$
\begin{aligned}
v_1&=(q_1^Tv_1)q_1\\
v_2&=(q_1^Tv_2)q_1+(q_2^Tv_2)q_2\\
&\ \ \vdots\\
v_k&=(q_1^Tv_k)q_1+\cdots+(q_k^Tv_k)q_k
\end{aligned}
$$

Let

$$
A=\begin{bmatrix}v_1&v_2&\cdots&v_k\end{bmatrix}
=\begin{bmatrix}(q_1^Tv_1)q_1&(q_1^Tv_2)q_1+(q_2^Tv_2)q_2&\cdots&(q_1^Tv_k)q_1+\cdots+(q_k^Tv_k)q_k\end{bmatrix}
=\begin{bmatrix}q_1&q_2&\cdots&q_k\end{bmatrix}
\begin{bmatrix}
q_1^Tv_1&q_1^Tv_2&\cdots&q_1^Tv_k\\
0&q_2^Tv_2&\cdots&q_2^Tv_k\\
\vdots&\vdots&\ddots&\vdots\\
0&0&\cdots&q_k^Tv_k
\end{bmatrix}=QR
$$

Given $A=QR$, consider the least squares approximation $A^TA\hat{x}=A^Tb$ with $A=QR$:

$$
\begin{aligned}
(QR)^T(QR)\hat{x}&=(QR)^Tb\\
R^TQ^TQR\hat{x}&=R^TQ^Tb\\
R^TR\hat{x}&=R^TQ^Tb\\
(R^T)^{-1}R^TR\hat{x}&=(R^T)^{-1}R^TQ^Tb\\
R\hat{x}&=Q^Tb
\end{aligned}
$$

We can solve for $\hat{x}$ by back substitution using the upper triangular $R$

  • Example
    $$
    A=\begin{bmatrix}1&1&0\\1&0&1\\0&1&1\end{bmatrix},\quad
    v_1=\begin{bmatrix}1\\1\\0\end{bmatrix},\ v_2=\begin{bmatrix}1\\0\\1\end{bmatrix},\ v_3=\begin{bmatrix}0\\1\\1\end{bmatrix}
    $$

    $$
    q_1=\frac{v_1}{\|v_1\|}=\frac{1}{\sqrt{2}}\begin{bmatrix}1\\1\\0\end{bmatrix},\quad
    e_2=v_2-(q_1^Tv_2)q_1=\begin{bmatrix}\tfrac{1}{2}\\-\tfrac{1}{2}\\1\end{bmatrix},\quad
    q_2=\frac{e_2}{\|e_2\|}=\begin{bmatrix}\tfrac{1}{\sqrt{6}}\\-\tfrac{1}{\sqrt{6}}\\ \tfrac{2}{\sqrt{6}}\end{bmatrix}
    $$

    $$
    e_3=v_3-(q_1^Tv_3)q_1-(q_2^Tv_3)q_2=\begin{bmatrix}-\tfrac{2}{3}\\ \tfrac{2}{3}\\ \tfrac{2}{3}\end{bmatrix},\quad
    q_3=\frac{e_3}{\|e_3\|}=\begin{bmatrix}-\tfrac{1}{\sqrt{3}}\\ \tfrac{1}{\sqrt{3}}\\ \tfrac{1}{\sqrt{3}}\end{bmatrix}
    $$

    $$
    Q=\begin{bmatrix}q_1&q_2&q_3\end{bmatrix}
    =\begin{bmatrix}\tfrac{1}{\sqrt{2}}&\tfrac{1}{\sqrt{6}}&-\tfrac{1}{\sqrt{3}}\\ \tfrac{1}{\sqrt{2}}&-\tfrac{1}{\sqrt{6}}&\tfrac{1}{\sqrt{3}}\\ 0&\tfrac{2}{\sqrt{6}}&\tfrac{1}{\sqrt{3}}\end{bmatrix},\quad
    R=\begin{bmatrix}q_1^Tv_1&q_1^Tv_2&q_1^Tv_3\\ 0&q_2^Tv_2&q_2^Tv_3\\ 0&0&q_3^Tv_3\end{bmatrix}
    =\begin{bmatrix}\tfrac{2}{\sqrt{2}}&\tfrac{1}{\sqrt{2}}&\tfrac{1}{\sqrt{2}}\\ 0&\tfrac{3}{\sqrt{6}}&\tfrac{1}{\sqrt{6}}\\ 0&0&\tfrac{2}{\sqrt{3}}\end{bmatrix}
    $$
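
A sketch of solving least squares through QR in NumPy, using the matrix from the example; the right-hand side $b$ is made up for illustration, and np.linalg.qr may flip signs of some columns of $Q$ (and rows of $R$) relative to the hand computation, which does not affect the solution:

```python
import numpy as np

A = np.array([[1., 1., 0.],
              [1., 0., 1.],
              [0., 1., 1.]])
b = np.array([1., 2., 3.])          # hypothetical right-hand side

Q, R = np.linalg.qr(A)              # A = QR, R upper triangular

# Back substitution for R x_hat = Q^T b
y = Q.T @ b
n = R.shape[1]
x_hat = np.zeros(n)
for i in range(n - 1, -1, -1):
    x_hat[i] = (y[i] - R[i, i+1:] @ x_hat[i+1:]) / R[i, i]

print(x_hat)                        # solves A^T A x_hat = A^T b
print(np.allclose(A @ x_hat, b))    # True here: A is square and invertible
```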

Determinants (2021.11.19)

  • Example
    Determinant of a $2\times 2$ matrix

    $$
    A=\begin{bmatrix}a&b\\c&d\end{bmatrix},\quad \det(A)=|A|=ad-bc
    $$

  • Properties

    1. $|I_n|=1$
    2. Exchange of two rows leads to a change of sign
       $$
       A=\begin{bmatrix}a&b\\c&d\end{bmatrix},\ A'=\begin{bmatrix}c&d\\a&b\end{bmatrix},\quad |A'|=bc-ad=-(ad-bc)=-|A|
       $$
    3. The determinant is a linear function of each row separately
       (scalar multiplication)
       $$
       A'=\begin{bmatrix}ta&tb\\c&d\end{bmatrix}\Rightarrow|A'|=t(ad-bc)=t|A|
       $$
       (vector addition)
       $$
       \begin{vmatrix}a+a'&b+b'\\c&d\end{vmatrix}=|A|+|A'|\quad\text{where }A'=\begin{bmatrix}a'&b'\\c&d\end{bmatrix}
       $$
    4. If two rows are equal in $A$, $|A|=0$
       • Proof
         Given that two rows $a_i^T$ and $a_j^T$ are equal in $A$,
         let $A'$ be the matrix $A$ with $a_i^T$ and $a_j^T$ exchanged.
         From 2, $\det(A')=-\det(A)$ ... ①; since $A'=A$, $\det(A')=\det(A)$ ... ②

         From ① and ②, $\det(A)=-\det(A)\Rightarrow\det(A)=0$
    5. Subtracting a multiple of one row from another row leaves $|A|$ unchanged
       $$
       |A'|=\begin{vmatrix}a&b\\c-ka&d-kb\end{vmatrix}=\begin{vmatrix}a&b\\c&d\end{vmatrix}+\begin{vmatrix}a&b\\-ka&-kb\end{vmatrix}=|A|+(-k)\begin{vmatrix}a&b\\a&b\end{vmatrix}=|A|+0=|A|
       $$
    6. A matrix with a row of zeros has $\det(A)=0$

       $$
       A=\begin{bmatrix}a&b\\0&0\end{bmatrix},\quad A'=\begin{bmatrix}a&b\\0+a&0+b\end{bmatrix}
       $$

       From 4, $|A'|=0$; from 5, $|A|=|A'|=0$
    7. If $A$ is triangular, then $|A|=$ product of diagonal elements
       $$
       A=\begin{bmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\0&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&\ddots&\vdots\\0&0&\cdots&a_{nn}\end{bmatrix}
       \xrightarrow{\text{Jordan elimination}}
       D=\begin{bmatrix}a_{11}&0&\cdots&0\\0&a_{22}&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\0&0&\cdots&a_{nn}\end{bmatrix}\ \text{(diagonal)}
       $$

       From 5, $|A|=|D|=$ product of diagonal elements
    8. If $A$ is singular, then $|A|=0$; if $A$ is invertible, $|A|\neq 0$
       • Define
         A singular matrix does not have a matrix inverse:
         $A$ is singular $\Leftrightarrow A$ does not have full rank $\Leftrightarrow A$ contains an all-zero row after Gauss-Jordan Elimination

         Note that Gaussian Elimination can convert $A$ to an upper triangular matrix $U$ with row operations that do not change the determinant, or row exchanges that change only the sign of the determinant:

         $$
         |U|=(-1)^k|A|\quad\text{where }k=\text{the number of row exchanges}
         $$

         If $A$ is singular, $U$ has an all-zero row, so $|A|=\frac{1}{(-1)^k}|U|=0$

         If $A$ is invertible, $|U|=$ product of diagonal elements of $U\neq 0$, so $|A|=\frac{1}{(-1)^k}|U|\neq 0$
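
A minimal NumPy check of properties 2, 5, and 8 on a random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

# Property 2: exchanging two rows flips the sign of the determinant
B = A[[1, 0, 2, 3], :]
print(np.isclose(np.linalg.det(B), -np.linalg.det(A)))   # True

# Property 5: subtracting a multiple of one row from another keeps it unchanged
C = A.copy()
C[2] -= 3.0 * C[0]
print(np.isclose(np.linalg.det(C), np.linalg.det(A)))    # True

# Property 8: a singular matrix (here, a repeated row) has determinant 0
S = A.copy()
S[3] = S[0]
print(np.isclose(np.linalg.det(S), 0.0))                 # True
```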