\(\newcommand{\trans}{^\top}
\newcommand{\adj}{^{\rm adj}}
\newcommand{\cof}{^{\rm cof}}
\newcommand{\inp}[2]{\left\langle#1,#2\right\rangle}
\newcommand{\dunion}{\mathbin{\dot\cup}}
\newcommand{\bzero}{\mathbf{0}}
\newcommand{\bone}{\mathbf{1}}
\newcommand{\ba}{\mathbf{a}}
\newcommand{\bb}{\mathbf{b}}
\newcommand{\bc}{\mathbf{c}}
\newcommand{\bd}{\mathbf{d}}
\newcommand{\be}{\mathbf{e}}
\newcommand{\bh}{\mathbf{h}}
\newcommand{\bp}{\mathbf{p}}
\newcommand{\bq}{\mathbf{q}}
\newcommand{\br}{\mathbf{r}}
\newcommand{\bx}{\mathbf{x}}
\newcommand{\by}{\mathbf{y}}
\newcommand{\bz}{\mathbf{z}}
\newcommand{\bu}{\mathbf{u}}
\newcommand{\bv}{\mathbf{v}}
\newcommand{\bw}{\mathbf{w}}
\newcommand{\tr}{\operatorname{tr}}
\newcommand{\nul}{\operatorname{null}}
\newcommand{\rank}{\operatorname{rank}}
%\newcommand{\ker}{\operatorname{ker}}
\newcommand{\range}{\operatorname{range}}
\newcommand{\Col}{\operatorname{Col}}
\newcommand{\Row}{\operatorname{Row}}
\newcommand{\spec}{\operatorname{spec}}
\newcommand{\vspan}{\operatorname{span}}
\newcommand{\Vol}{\operatorname{Vol}}
\newcommand{\sgn}{\operatorname{sgn}}
\newcommand{\idmap}{\operatorname{id}}
\newcommand{\am}{\operatorname{am}}
\newcommand{\gm}{\operatorname{gm}}
\newcommand{\mult}{\operatorname{mult}}
\newcommand{\iner}{\operatorname{iner}}\)
How to solve a system of linear equations?
In high school, we saw many systems of linear equations with a unique solution, and we know how to solve them. Here we are going to learn how to obtain all solutions of a system of linear equations when its solution is not unique.
Consider the system of linear equations
\[
\begin{aligned}
x + y + z + w + u &= 3, \\
x + 2y + 2z + 2w + 2u &= 4, \\
x + 3y + 3z + 4w + 4u &= 5. \\
\end{aligned}
\]
Step 1: Transform the system into the augmented matrix
We may record the coefficients on the left into a matrix \(A\) and the constants on the right into a vector \(\bb\). By setting
\[
A = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 \\
1 & 2 & 2 & 2 & 2 \\
1 & 3 & 3 & 4 & 4 \\
\end{bmatrix},\
\bx = \begin{bmatrix} x \\ y \\ z \\ w \\ u \end{bmatrix}, \text{ and }
\bb = \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix},
\]
we know the system of linear equations is equivalent to \(A\bx = \bb\), and we may represent this system by the augmented matrix
\[
\left[\begin{array}{c|c}
A & \bb
\end{array}\right] =
\left[\begin{array}{ccccc|c}
1 & 1 & 1 & 1 & 1 & 3 \\
1 & 2 & 2 & 2 & 2 & 4 \\
1 & 3 & 3 & 4 & 4 & 5 \\
\end{array}\right].
\]
Step 2: Make it into an echelon form
By running Gaussian elimination, which is a sequence of row operations, we may obtain
\[
\left[\begin{array}{ccccc|c}
{\color{red}1} & 1 & 1 & 1 & 1 & 3 \\
0 & {\color{red}1} & {\color{red}1} & 1 & 1 & 1 \\
0 & 0 & 0 & {\color{red}1} & {\color{red}1} & {\color{red}0} \\
\end{array}\right].
\]
This stair-like structure is called an echelon form. The term echelon refers to a way of arranging troops in the military; see the pictures in Wikipedia: Echelon formation to get a better sense of this name.
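One possible sequence of row operations that produces the echelon form above is sketched below. Here \(R_i\) denotes the \(i\)-th row, a notation introduced only for this illustration; other sequences of row operations work as well.
\[
\begin{aligned}
\left[\begin{array}{ccccc|c}
1 & 1 & 1 & 1 & 1 & 3 \\
1 & 2 & 2 & 2 & 2 & 4 \\
1 & 3 & 3 & 4 & 4 & 5 \\
\end{array}\right]
&\xrightarrow{\substack{R_2 \leftarrow R_2 - R_1 \\ R_3 \leftarrow R_3 - R_1}}
\left[\begin{array}{ccccc|c}
1 & 1 & 1 & 1 & 1 & 3 \\
0 & 1 & 1 & 1 & 1 & 1 \\
0 & 2 & 2 & 3 & 3 & 2 \\
\end{array}\right] \\
&\xrightarrow{R_3 \leftarrow R_3 - 2R_2}
\left[\begin{array}{ccccc|c}
1 & 1 & 1 & 1 & 1 & 3 \\
0 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 0 \\
\end{array}\right].
\end{aligned}
\]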
Note that, if preferred, one may run Gaussian elimination further to get the reduced echelon form
\[
\left[\begin{array}{ccccc|c}
{\color{red}1} & 0 & 0 & 0 & 0 & 2 \\
0 & {\color{red}1} & {\color{red}1} & 0 & 0 & 1 \\
0 & 0 & 0 & {\color{red}1} & {\color{red}1} & {\color{red}0} \\
\end{array}\right].
\]
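As a sketch of how the reduced echelon form may be reached from the echelon form, one can clear the entries above each leading \(1\). Again \(R_i\) denotes the \(i\)-th row, a notation introduced only for this illustration.
\[
\begin{aligned}
\left[\begin{array}{ccccc|c}
1 & 1 & 1 & 1 & 1 & 3 \\
0 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 0 \\
\end{array}\right]
&\xrightarrow{R_1 \leftarrow R_1 - R_2}
\left[\begin{array}{ccccc|c}
1 & 0 & 0 & 0 & 0 & 2 \\
0 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 0 \\
\end{array}\right] \\
&\xrightarrow{R_2 \leftarrow R_2 - R_3}
\left[\begin{array}{ccccc|c}
1 & 0 & 0 & 0 & 0 & 2 \\
0 & 1 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 1 & 0 \\
\end{array}\right].
\end{aligned}
\]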
Step 3: Recognize the leading variables and the free variables
The echelon form we used above is equivalent to
\[
\begin{aligned}
x + & y + z + & w + u &= 3, \\
~ & y + z + & w + u &= 1, \\
~ & ~ & w + u &= 0. \\
\end{aligned}
\]
The first (left-most) variable with a nonzero coefficient in each equation is called a leading variable. In this case, we have three leading variables \(x\), \(y\), and \(w\). Any variable that is not a leading variable is called a free variable. In this case, \(z\) and \(u\) are the free variables.
Note that given any numbers for the free variables, the leading variables are uniquely determined, as we will see in the next two steps.
Step 4: Find a special solution
For example, we may assign \(z = 0\) and \(u = 0\). Solving the echelon-form equations from the bottom up, that is, for the leading variables from right to left, we get \(w = 0\), \(y = 1\), and \(x = 2\). Recording these numbers as a vector
\[
\bp = \begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix},
\]
we call it a special solution of \(A\bx = \bb\), which means it is one of the solutions.
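As a quick check, one may substitute \(\bp\) back into the original system. Since only the first two entries of \(\bp\) are nonzero, \(A\bp\) is \(2\) times the first column of \(A\) plus \(1\) times the second column:
\[
A\bp =
2\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} +
1\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} =
\begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix} = \bb.
\]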
Step 5: Find the homogeneous solutions
In fact, we may assign \(z = c_1\) and \(u = c_2\) to get all solutions. Thus, we have
\[
\begin{bmatrix} x \\ y \\ z \\ w \\ u \end{bmatrix} =
\begin{bmatrix}
 2 \\
 1 - c_1 \\
 c_1 \\
 -c_2 \\
 c_2 \\
\end{bmatrix} =
\begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} +
c_1\begin{bmatrix} 0 \\ -1 \\ 1 \\ 0 \\ 0 \end{bmatrix} +
c_2\begin{bmatrix} 0 \\ 0 \\ 0 \\ -1 \\ 1 \end{bmatrix} =
\bp + c_1\bh_1 + c_2\bh_2.
\]
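The entries of this parametrization come from back substitution in the echelon-form equations of Step 3. With \(z = c_1\) and \(u = c_2\), solving from the last equation upward gives
\[
\begin{aligned}
w &= 0 - u = -c_2, \\
y &= 1 - z - w - u = 1 - c_1 + c_2 - c_2 = 1 - c_1, \\
x &= 3 - y - z - w - u = 3 - (1 - c_1) - c_1 + c_2 - c_2 = 2.
\end{aligned}
\]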
Note that \(\bh_1\) is usually not a solution of \(A\bx = \bb\); instead, it is a solution of \(A\bx = \bzero\). This is not surprising after a second thought. By assigning \(z = 0\) and \(u = 0\), we get the solution \(\bp\), so \(A\bp = \bb\). By assigning \(z = 1\) and \(u = 0\), we get the solution \(\bp + \bh_1\), so \(A(\bp + \bh_1) = \bb\). Combining these two facts with some algebra, it is straightforward to see
\[
A\bh_1 = A(\bp + \bh_1) - A\bp = \bb - \bb = \bzero.
\]
Indeed, \(\bh_1\) is the unique solution of \(A\bx = \bzero\) with \(z = 1\) and \(u = 0\). Therefore, here is another way to obtain \(\bh_1\). First, consider the homogeneous equation \(A\bx = \bzero\). By running Gaussian elimination, we know it is equivalent to
\[
\begin{aligned}
x + & y + z + & w + u &= 0, \\
~ & y + z + & w + u &= 0, \\
~ & ~ & w + u &= 0, \\
\end{aligned}
\]
which are the same equations we have been using except that the constants on the right are replaced by zeros. By solving this homogeneous system with \(z = 1\) and \(u = 0\), we see again that
\[
\bh_1 = \begin{bmatrix} 0 \\ -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}.
\]
In a similar way, one may obtain \(\bh_2\) without computing the full parametrization. (Give it a try!)
Now, the set
\[
\vspan(\{\bh_1, \bh_2\}) = \{c_1\bh_1 + c_2\bh_2: c_1, c_2\in\mathbb{R}\}
\]
is the solution set of \(A\bx = \bzero\). It is also called the homogeneous solution of \(A\bx = \bb\), since it is obtained by replacing \(\bb\) with \(\bzero\). On the other hand, the set
\[
\bp + \vspan(\{\bh_1, \bh_2\}) = \{\bp + c_1\bh_1 + c_2\bh_2: c_1, c_2\in\mathbb{R}\}
\]
is the solution set of \(A\bx = \bb\). It is also called the general solution of \(A\bx = \bb\), in contrast to a special solution.
In summary, the general solution is equal to a special solution plus the homogeneous solution.
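This can be checked directly for the example. Multiplying out with the vectors \(\bh_1\) and \(\bh_2\) found in Step 5,
\[
A\bh_1 = \begin{bmatrix} -1 + 1 \\ -2 + 2 \\ -3 + 3 \end{bmatrix} = \bzero
\quad\text{and}\quad
A\bh_2 = \begin{bmatrix} -1 + 1 \\ -2 + 2 \\ -4 + 4 \end{bmatrix} = \bzero,
\]
so for any \(c_1, c_2\in\mathbb{R}\) we have
\[
A(\bp + c_1\bh_1 + c_2\bh_2) = A\bp + c_1A\bh_1 + c_2A\bh_2 = \bb + c_1\bzero + c_2\bzero = \bb.
\]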
This note can be found at Course website > Learning resources.