## Chapter 3.1. Systems of Linear Equations and Gaussian Elimination

We begin by examining a fundamental problem in mathematics and its applications: finding a set of unknown values that simultaneously satisfy multiple conditions. Specifically, we consider the goal of finding $n$ real numbers, the scalars $x_{1}, \ldots, x_{n} \in \mathbb{R}$, that are constrained by the following set of $m$ equations:
$$
\begin{align}
a_{11}x_{1}+a_{12}x_{2}+\cdots+a_{1n}x_{n} &= y_{1} \\
a_{21}x_{1}+a_{22}x_{2}+\cdots+a_{2n}x_{n} &= y_{2} \quad (3.1) \\
\vdots \qquad\qquad &\qquad \vdots \\
a_{m1}x_{1}+a_{m2}x_{2}+\cdots+a_{mn}x_{n} &= y_{m}
\end{align}
$$
The formulation labeled (3.1) is what we call a ==system of $m$ linear equations in $n$ unknowns==, or simply a ==system of linear equations==. This system is defined by the fixed real ==coefficients== $a_{ij}$ and the constant terms $y_{i}$. A ==solution== of the system is any $n$-tuple $\left(x_{1}, \ldots, x_{n}\right) \in \mathbb{R}^{n}$ that satisfies every equation in (3.1). A system of linear equations is called ==homogeneous== if, in (3.1), $y_1=y_2=\cdots=y_m=0$.

### 3.1.1. Methods of Substitution and Elimination

Among the first methods we learn for solving a system of linear equations are the ==method of substitution== and the ==method of elimination==. For example, consider
$$
\begin{aligned}
3x_{1}-2x_{2} &= 10 \quad (3.2) \\
4x_{1}+2x_{2} &= 4
\end{aligned}
$$
Let us first solve this system using the method of substitution. We begin by isolating $x_{2}$ in the second equation of (3.2), and we get
$$
\begin{aligned}
4x_{1}+2x_{2} &= 4 \\
2x_{2} &= 4-4x_{1} \\
x_{2} &= \frac{4-4x_{1}}{2} \\
x_{2} &= 2-2x_{1} \quad (3.2.1)
\end{aligned}
$$
Next, substitute the expression for $x_{2}$, obtained in (3.2.1), into the first equation of (3.2) to get
$$
\begin{aligned}
3x_{1}-2x_{2} &= 10 \\
3x_{1}-2(2-2x_{1}) &= 10 \\
3x_{1}-4+4x_{1} &= 10 \\
7x_{1} &= 14 \\
\text{or, } \quad x_{1} &= 2.
\end{aligned}
$$
Now, substitute $x_{1}=2$ back into (3.2.1) and obtain
$$
\begin{aligned}
x_{2} &= 2-2x_{1} \\
x_{2} &= 2-2(2) \\
\text{or, } \quad x_{2} &= -2.
\end{aligned}
$$
Next, let us try the method of elimination on the system (3.2). Adding the two equations in (3.2) eliminates $x_{2}$, since its coefficients in the two equations are negatives of each other:
$$
\begin{aligned}
(3x_{1}-2x_{2})+(4x_{1}+2x_{2}) &= 10+4 \\
7x_{1} &= 14 \\
x_{1} &= 2.
\end{aligned}
$$
Substituting $x_{1}=2$ into the first equation of (3.2), we have
$$
\begin{aligned}
3(2)-2x_{2} &= 10 \\
6-2x_{2} &= 10 \\
-2x_{2} &= 4 \\
\text{or, } \quad x_{2} &= -2.
\end{aligned}
$$

### 3.1.2. Equivalent Systems of Linear Equations and Matrix Representation

In general, we leverage a simple but powerful technique when solving a system of linear equations like (3.1): *linearly combining the equations*. Consider our general system (3.1). Suppose we select $m$ scalars $c_{1}, \ldots, c_{m}$, multiply the $j^{th}$ equation by $c_{j}$, and then add the resulting equations together. This operation yields a new equation:
$$
\begin{array}{r}
\left(c_{1} a_{11}+\cdots+c_{m} a_{m1}\right)x_{1}+\cdots+\left(c_{1} a_{1n}+\cdots+c_{m} a_{mn}\right)x_{n} \\
=c_{1}y_{1}+\cdots+c_{m}y_{m}
\end{array}
$$
We call this new equation a ==linear combination== of the equations in the original system (3.1). Evidently, any solution of the entire system (3.1) will also be a solution of this new equation.
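As a concrete companion to these ideas, here is a minimal plain-Python sketch of the elimination computation for (3.2); the variable names are ours, not the text's. The final assertion also checks the observation just made: any linear combination of the equations is automatically satisfied by a solution of the system.

```python
# Coefficients of (3.2): 3*x1 - 2*x2 = 10 and 4*x1 + 2*x2 = 4
a1, b1, y1 = 3.0, -2.0, 10.0
a2, b2, y2 = 4.0, 2.0, 4.0

# Eliminate x2 by adding the equations (their x2-coefficients are opposite):
# (a1 + a2) * x1 = y1 + y2
x1 = (y1 + y2) / (a1 + a2)   # 14 / 7 = 2
x2 = (y2 - a2 * x1) / b2     # (4 - 8) / 2 = -2
print(x1, x2)                # 2.0 -2.0

# Any linear combination c1*(eq. 1) + c2*(eq. 2) is also satisfied by (x1, x2).
c1, c2 = 5.0, -7.0           # arbitrary scalars
lhs = c1 * (a1 * x1 + b1 * x2) + c2 * (a2 * x1 + b2 * x2)
rhs = c1 * y1 + c2 * y2
assert abs(lhs - rhs) < 1e-12
```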
If we have another system of linear equations
$$
\begin{align*}
b_{11}x_{1}+\cdots+b_{1n}x_{n} &= z_{1} \\
\vdots \qquad\qquad &\qquad \vdots \quad (3.3) \\
b_{k1}x_{1}+\cdots+b_{kn}x_{n} &= z_{k}
\end{align*}
$$
in which each of the $k$ equations is a linear combination of the equations in (3.1), then every solution of (3.1) is a solution of this new system. Of course it may happen that some solutions of (3.3) are not solutions of (3.1). This clearly does not happen if each equation in the original system is also a linear combination of the equations in the new system.

<span style="color: lime;">**Definition 3.1.**</span> Two systems of linear equations are ==equivalent== if each equation in each system is a linear combination of the equations in the other system.

We can then formally state our observations as follows.

<span style="color: mediumslateblue;">**Theorem 3.1.**</span> Equivalent systems of linear equations have exactly the same solutions.

In essence, equivalent systems are just different algebraic expressions of the exact same problem; their solution sets are identical. This theorem guarantees that the process of elimination, which replaces one system with an equivalent, simpler one, will not change the final answer. Therefore, if the elimination process is to be effective in finding the solutions of a system like (3.1), one must see how, by forming linear combinations of the given equations, to produce an equivalent system of equations which is easier to solve. In the next section we shall discuss one method of doing this.

We shall now abbreviate the system (3.1) as
$$
\mathbf{A}\mathbf{x}=\mathbf{y} \qquad\qquad (3.4)
$$
where
$$
\mathbf{A}=\left[\begin{array}{ccc}
a_{11} & \cdots & a_{1n} \\
\vdots & & \vdots \\
a_{m1} & \cdots & a_{mn}
\end{array}\right], \quad
\mathbf{x}=\left[\begin{array}{c}
x_{1} \\
\vdots \\
x_{n}
\end{array}\right], \quad \text{and} \quad
\mathbf{y}=\left[\begin{array}{c}
y_{1} \\
\vdots \\
y_{m}
\end{array}\right].
$$
We call $\mathbf{A}$ the matrix of coefficients of the system. In general, a rectangular array of real numbers is called a ==matrix==. A matrix having $m$ ==rows== and $n$ ==columns== is referred to as an ==$m \times n$ matrix==, and $m$ and $n$ are called the ==dimensions of the matrix==. The ==scalar== (all scalars are real numbers throughout this course) located at the intersection of the $i$th row and the $j$th column of a matrix is called the ==$ij^{th}$ element or entry== of the matrix.

<span style="color: lime;">**Definition 3.2.**</span> Two matrices $\mathbf{A}$ and $\mathbf{B}$ of the same dimensions are said to be ==equal==, written $\mathbf{A}=\mathbf{B}$, if each element of $\mathbf{A}$ equals the corresponding element of $\mathbf{B}$. They are said to be ==unequal== otherwise, i.e., if some element of $\mathbf{A}$ differs from the corresponding element of $\mathbf{B}$, in which case we write $\mathbf{A} \neq \mathbf{B}$.

A matrix that has only one column, that is, a matrix of the form
$$
\left[\begin{array}{c}
a_{1} \\
a_{2} \\
\vdots \\
a_{m}
\end{array}\right]
$$
is called a ==column vector==. Similarly, a matrix that has only one row is called a ==row vector==. A row or column vector having $m$ elements may be referred to as an ==$m$-dimensional row or column vector==.
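As a concrete illustration of the abbreviation (3.4), the system (3.2) can be stored as a coefficient matrix and a right-hand-side vector. The NumPy call below is shown purely as a sanity check of (3.2), not as the method of this chapter.

```python
import numpy as np

# The system (3.2) in the matrix form Ax = y of (3.4).
A = np.array([[3.0, -2.0],
              [4.0,  2.0]])   # 2 x 2 matrix of coefficients
y = np.array([10.0, 4.0])     # right-hand side

x = np.linalg.solve(A, y)     # library routine, used only to confirm (3.2)
print(x)                      # [ 2. -2.]
```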
In the system (3.4), $\mathbf{x}$ is an $n$-dimensional column vector and $\mathbf{y}$ is an $m$-dimensional column vector. Null row or column vectors (i.e., null matrices having one row or one column) are sometimes referred to simply as null vectors, and (like null matrices in general) are represented by the symbol $\mathbf{0}$.

### 3.1.3. Elementary Row Operations and Gaussian Elimination

We now wish to consider operations on the rows of the matrix $\mathbf{A}$ which correspond to forming linear combinations of the equations in the system $\mathbf{Ax}=\mathbf{y}$. We restrict our attention to three ==elementary row operations== on an $m \times n$ matrix $\mathbf{A}$:

(i) multiplication of one row of $\mathbf{A}$ by a non-zero scalar $c$;
(ii) replacement of the $r$th row of $\mathbf{A}$ by row $r$ plus $c$ times row $s$, $c$ any scalar and $r \neq s$;
(iii) interchange of two rows of $\mathbf{A}$.

Here is a formal definition.

<span style="color: lime;">**Definition 3.3.**</span> An ==elementary row operation== is a function $e$ mapping an $m \times n$ matrix $\mathbf{A}$ to another $m \times n$ matrix $e(\mathbf{A})$. The three types of elementary row operations are:

**ERO-1.** $e(\mathbf{A})_{ij}=a_{ij}$ if $i \neq r$; $e(\mathbf{A})_{rj}=ca_{rj}$, where $c \neq 0$.
**ERO-2.** $e(\mathbf{A})_{ij}=a_{ij}$ if $i \neq r$; $e(\mathbf{A})_{rj}=a_{rj}+ca_{sj}$, where $c$ is any scalar and $r \neq s$.
**ERO-3.** $e(\mathbf{A})_{ij}=a_{ij}$ if $i \neq r, s$; $e(\mathbf{A})_{rj}=a_{sj}$ and $e(\mathbf{A})_{sj}=a_{rj}$.

In defining $e(\mathbf{A})$, it is not really important how many columns $\mathbf{A}$ has, but the number of rows of $\mathbf{A}$ is crucial. To avoid any complications, we shall agree that an elementary row operation $e$ is defined on the class of all $m \times n$ real matrices, for some fixed $m$ but any $n$. In other words, a particular $e$ is defined on the class of all $m$-rowed real matrices.

One reason that we restrict ourselves to these three simple types of row operations is that, having performed such an operation $e$ on a matrix $\mathbf{A}$, we can recapture $\mathbf{A}$ by performing a similar operation on $e(\mathbf{A})$.

<span style="color: mediumslateblue;">**Theorem 3.2.**</span> To each elementary row operation $e$ there corresponds an elementary row operation $e_{1}$, of the same type as $e$, such that $e_{1}(e(\mathbf{A}))=e\left(e_{1}(\mathbf{A})\right)=\mathbf{A}$ for each $\mathbf{A}$.

*Proof.* Suppose $e$ is the operation which multiplies the $r^{th}$ row of a matrix by the non-zero scalar $c$. Let $e_{1}$ be the operation which multiplies row $r$ by $c^{-1}$. Next, suppose $e$ is the operation which replaces row $r$ by row $r$ plus $c$ times row $s$, $r \neq s$. Let $e_{1}$ be the operation which replaces row $r$ by row $r$ plus $(-c)$ times row $s$. Lastly, if $e$ interchanges rows $r$ and $s$, let $e_{1}=e$. In each of these three cases we clearly have $e_{1}(e(\mathbf{A}))=e\left(e_{1}(\mathbf{A})\right)=\mathbf{A}$ for each $\mathbf{A}$. $\blacksquare$

<span style="color: lime;">**Definition 3.4.**</span> If $\mathbf{A}$ and $\mathbf{B}$ are $m \times n$ real matrices, we say that $\mathbf{B}$ is ==row-equivalent== to $\mathbf{A}$ if $\mathbf{B}$ can be obtained from $\mathbf{A}$ by a finite sequence of elementary row operations.
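Definition 3.3 and Theorem 3.2 translate directly into small functions. The following NumPy sketch (the function names are ours, and row indices are 0-based) implements the three operations and checks that each is undone by an operation of the same type, exactly as in the proof above.

```python
import numpy as np

def scale(A, r, c):            # ERO-1: multiply row r by a non-zero scalar c
    assert c != 0
    B = A.astype(float)        # work on a copy, leaving A untouched
    B[r] *= c
    return B

def add_multiple(A, r, s, c):  # ERO-2: replace row r by row r + c * (row s), r != s
    assert r != s
    B = A.astype(float)
    B[r] += c * B[s]
    return B

def swap(A, r, s):             # ERO-3: interchange rows r and s
    B = A.astype(float)
    B[[r, s]] = B[[s, r]]
    return B

# Theorem 3.2: each operation is inverted by one of the same type.
A = np.array([[2.0, -1.0], [1.0, 4.0]])
assert np.allclose(scale(scale(A, 0, 3.0), 0, 1 / 3.0), A)
assert np.allclose(add_multiple(add_multiple(A, 0, 1, 2.0), 0, 1, -2.0), A)
assert np.allclose(swap(swap(A, 0, 1), 0, 1), A)
```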
Using Theorem 3.2, one should find it easy to verify the following. Each matrix is row-equivalent to itself; if $\mathbf{B}$ is row-equivalent to $\mathbf{A}$, then $\mathbf{A}$ is row-equivalent to $\mathbf{B}$; and if $\mathbf{B}$ is row-equivalent to $\mathbf{A}$ and $\mathbf{C}$ is row-equivalent to $\mathbf{B}$, then $\mathbf{C}$ is row-equivalent to $\mathbf{A}$. In other words, row-equivalence is an ==equivalence relation==.

<span style="color: mediumslateblue;">**Theorem 3.3.**</span> If $\mathbf{A}$ and $\mathbf{B}$ are row-equivalent $m \times n$ matrices, the homogeneous systems of linear equations $\mathbf{Ax}=\mathbf{0}$ and $\mathbf{Bx}=\mathbf{0}$ have exactly the same solutions.

*Proof.* Suppose we pass from $\mathbf{A}$ to $\mathbf{B}$ by a finite sequence of elementary row operations:
$$\mathbf{A}=\mathbf{A}_{0} \rightarrow \mathbf{A}_{1} \rightarrow \cdots \rightarrow \mathbf{A}_{k}=\mathbf{B}.$$
It is sufficient to show that a single elementary row operation in the above sequence does not disturb the set of solutions. Therefore, without loss of generality, assume that $\mathbf{B}$ is obtained from $\mathbf{A}$ by a single elementary row operation. No matter which of the three types the operation is, each equation in the system $\mathbf{Bx}=\mathbf{0}$ will be a linear combination of the equations in the system $\mathbf{Ax}=\mathbf{0}$. Since the inverse of an elementary row operation is an elementary row operation (Theorem 3.2), each equation in $\mathbf{Ax}=\mathbf{0}$ will also be a linear combination of the equations in $\mathbf{Bx}=\mathbf{0}$. Hence, these two systems are equivalent, and by Theorem 3.1 they have the same solutions. $\blacksquare$

> Example. Suppose
> $$ \mathbf{A}=\left[\begin{array}{rrrr} 2 & -1 & 3 & 2 \\ 1 & 4 & 0 & -1 \\ 2 & 6 & -1 & 5 \end{array}\right]. $$
> We shall perform a finite sequence of elementary row operations on $\mathbf{A}$, indicating by numbers in parentheses above the arrows the type of operation performed.
> $$ \left[\begin{array}{rrrr} 2 & -1 & 3 & 2 \\ 1 & 4 & 0 & -1 \\ 2 & 6 & -1 & 5 \end{array}\right] \xrightarrow{(2)} \left[\begin{array}{rrrr} 0 & -9 & 3 & 4 \\ 1 & 4 & 0 & -1 \\ 2 & 6 & -1 & 5 \end{array}\right] \xrightarrow{(2)} \left[\begin{array}{rrrr} 0 & -9 & 3 & 4 \\ 1 & 4 & 0 & -1 \\ 0 & -2 & -1 & 7 \end{array}\right] \xrightarrow{(1)} $$
> $$ \left[\begin{array}{rrrr} 0 & -9 & 3 & 4 \\ 1 & 4 & 0 & -1 \\ 0 & 1 & \frac{1}{2} & -\frac{7}{2} \end{array}\right] \xrightarrow{(2)} \left[\begin{array}{rrrr} 0 & -9 & 3 & 4 \\ 1 & 0 & -2 & 13 \\ 0 & 1 & \frac{1}{2} & -\frac{7}{2} \end{array}\right] \xrightarrow{(2)} \left[\begin{array}{rrrr} 0 & 0 & \frac{15}{2} & -\frac{55}{2} \\ 1 & 0 & -2 & 13 \\ 0 & 1 & \frac{1}{2} & -\frac{7}{2} \end{array}\right] \xrightarrow{(1)} $$
> $$ \left[\begin{array}{rrrr} 0 & 0 & 1 & -\frac{11}{3} \\ 1 & 0 & -2 & 13 \\ 0 & 1 & \frac{1}{2} & -\frac{7}{2} \end{array}\right] \xrightarrow{(2)} \left[\begin{array}{rrrr} 0 & 0 & 1 & -\frac{11}{3} \\ 1 & 0 & 0 & \frac{17}{3} \\ 0 & 1 & \frac{1}{2} & -\frac{7}{2} \end{array}\right] \xrightarrow{(2)} \left[\begin{array}{rrrr} 0 & 0 & 1 & -\frac{11}{3} \\ 1 & 0 & 0 & \frac{17}{3} \\ 0 & 1 & 0 & -\frac{5}{3} \end{array}\right] $$
> The row-equivalence of $\mathbf{A}$ with the final matrix in the above sequence tells us in particular that the solutions of
> $$ \begin{aligned} 2x_{1}-x_{2}+3x_{3}+2x_{4} &= 0 \\ x_{1}+4x_{2}-x_{4} &= 0 \\ 2x_{1}+6x_{2}-x_{3}+5x_{4} &= 0 \end{aligned} $$
> and
> $$ \begin{aligned} x_{3}-\frac{11}{3}x_{4} &= 0 \\ x_{1}+\frac{17}{3}x_{4} &= 0 \\ x_{2}-\frac{5}{3}x_{4} &= 0 \end{aligned} $$
> are exactly the same. In the second system it is apparent that if we assign any real value $c$ to $x_{4}$, we obtain a solution $\left(-\frac{17}{3}c,\ \frac{5}{3}c,\ \frac{11}{3}c,\ c\right)$, and also that every solution is of this form.
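As a quick cross-check of the hand computation above, SymPy's `rref()` routine (a library shortcut, not the chapter's method) row-reduces $\mathbf{A}$ in exact rational arithmetic; its output agrees with the final matrix of the sequence up to the order of the rows.

```python
from sympy import Matrix

# Row-reduce A exactly and compare with the hand computation.
A = Matrix([[2, -1, 3, 2],
            [1, 4, 0, -1],
            [2, 6, -1, 5]])
R, pivots = A.rref()
print(R)
# Matrix([[1, 0, 0, 17/3], [0, 1, 0, -5/3], [0, 0, 1, -11/3]])
```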
So far, our choice of row operations was motivated by a desire to simplify the coefficient matrix in a manner analogous to *eliminating unknowns* in the system of linear equations. Let us now make a formal definition of the type of matrix at which we were attempting to arrive.

<span style="color: lime;">**Definition 3.5.**</span> An $m \times n$ matrix $\mathbf{R}$ is called ==row-reduced== if:
(i) the first non-zero entry in each non-zero row of $\mathbf{R}$ is equal to $1$;
(ii) each column of $\mathbf{R}$ which contains the leading non-zero entry of some row has all its other entries $0$.

> Example. One example of a row-reduced matrix is the $n \times n$ (square) identity matrix $\mathbf{I}$. This is the $n \times n$ matrix defined by $\mathbf{I}=\left[\delta_{ij}\right]$, where
> $$\delta_{ij}= \begin{cases}1, & \text{if } i=j \\ 0, & \text{if } i \neq j.\end{cases}$$
> Two examples of matrices which are not row-reduced are:
> $$ \left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right], \quad\left[\begin{array}{rrr} 0 & 2 & 1 \\ 1 & 0 & -3 \\ 0 & 0 & 0 \end{array}\right]. $$
> The first matrix does satisfy condition (i), but fails to satisfy condition (ii) in column 3. The second matrix fails to satisfy condition (i), because the leading non-zero entry of the first row is not $1$.

We shall now prove that we can pass from any given matrix to a row-reduced matrix by means of a finite number of elementary row operations. In combination with Theorem 3.3, this will provide us with an effective tool for solving systems of linear equations.

<span style="color: mediumslateblue;">**Theorem 3.4.**</span> Every $m \times n$ matrix is row-equivalent to a row-reduced matrix.

*Proof.* Let $\mathbf{A}$ be an $m \times n$ matrix. If every entry in the first row of $\mathbf{A}$ is $0$, then condition (i) is satisfied in so far as row $1$ is concerned. If row $1$ has a non-zero entry, let $k_{1}$ be the smallest positive integer $j$ for which $a_{1j} \neq 0$. Multiply row $1$ by $\frac{1}{a_{1k_{1}}}$, and then condition (i) is satisfied with regard to row $1$. Now for each $i \geq 2$, add $\left(-a_{ik_{1}}\right)$ times row $1$ to row $i$. Now the leading non-zero entry of row $1$ occurs in column $k_{1}$, that entry is $1$, and every other entry in column $k_{1}$ is $0$.

Now consider the matrix which has resulted from the above. If every entry in row $2$ is $0$, we do nothing to row $2$. If some entry in row $2$ is different from $0$, we multiply row $2$ by a scalar so that its leading non-zero entry is $1$. In the event that row $1$ had a leading non-zero entry in column $k_{1}$, this leading non-zero entry of row $2$ cannot occur in column $k_{1}$; say it occurs in column $k_{2} \neq k_{1}$. By adding suitable multiples of row $2$ to the various rows, we can arrange that all entries in column $k_{2}$ are $0$, except the $1$ in row $2$. The important thing to notice is this: in carrying out these last operations, we will not change the entries of row $1$ in columns $1, \ldots, k_{1}$; nor will we change any entry of column $k_{1}$. Of course, if row $1$ was identically $0$, the operations with row $2$ will not affect row $1$. Working with one row at a time in the above manner, it is clear that in a finite number of steps we will arrive at a row-reduced matrix. $\blacksquare$
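The proof of Theorem 3.4 is constructive, and the procedure it describes translates directly into code. Below is a minimal sketch in plain Python, using `fractions.Fraction` for exact arithmetic; the function name `row_reduce` is ours.

```python
from fractions import Fraction

def row_reduce(A):
    """The procedure from the proof of Theorem 3.4: for each row in turn,
    normalize its leading non-zero entry to 1 (ERO-1), then clear the rest
    of that column (ERO-2). Returns a row-reduced matrix."""
    R = [[Fraction(x) for x in row] for row in A]
    m, n = len(R), len(R[0])
    for i in range(m):
        # Find the leading non-zero entry of row i, if any.
        k = next((j for j in range(n) if R[i][j] != 0), None)
        if k is None:
            continue                       # zero row: nothing to do
        pivot = R[i][k]
        R[i] = [x / pivot for x in R[i]]   # ERO-1: make the leading entry 1
        for p in range(m):                 # ERO-2: zero out column k elsewhere
            if p != i and R[p][k] != 0:
                c = R[p][k]
                R[p] = [x - c * y for x, y in zip(R[p], R[i])]
    return R

A = [[2, -1, 3, 2], [1, 4, 0, -1], [2, 6, -1, 5]]
for row in row_reduce(A):
    print([str(x) for x in row])
# ['1', '0', '0', '17/3']
# ['0', '1', '0', '-5/3']
# ['0', '0', '1', '-11/3']
```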
Until now, our work with systems of linear equations was motivated by an attempt to find the solutions of such a system. We now wish to acquire some information which is slightly more theoretical, and for that purpose it is convenient to go a little beyond row-reduced matrices.

<span style="color: lime;">**Definition 3.6.**</span> An $m \times n$ matrix $\mathbf{R}$ is called a ==row-reduced echelon matrix== if:
(i) $\mathbf{R}$ is row-reduced;
(ii) every row of $\mathbf{R}$ which has all its entries $0$ occurs below every row which has a non-zero entry;
(iii) if rows $1, \ldots, l$ are the non-zero rows of $\mathbf{R}$, and if the leading non-zero entry of row $i$ occurs in column $k_{i}$, $i=1, \ldots, l$, then $k_{1}<k_{2}<\cdots<k_{l}$.

One can also describe an $m \times n$ row-reduced echelon matrix $\mathbf{R}$ as follows. Either every entry in $\mathbf{R}$ is $0$, or there exists a positive integer $l$, $1 \leq l \leq m$, and $l$ positive integers $k_{1}, \ldots, k_{l}$ with $1 \leq k_{i} \leq n$, such that
(i) $r_{ij}=0$ for $i>l$, and $r_{ij}=0$ if $j<k_{i}$;
(ii) $r_{ik_{j}}=\delta_{ij}$, $1 \leq i \leq l$, $1 \leq j \leq l$;
(iii) $k_{1}<\cdots<k_{l}$.

> Example. Two examples of row-reduced echelon matrices are the $n \times n$ identity matrix and the $m \times n$ zero matrix, in which all entries are $0$. Here is a non-trivial one:
> $$ \left[\begin{array}{rrrrr} 0 & 1 & -3 & 0 & \frac{1}{2} \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]. $$

<span style="color: mediumslateblue;">**Theorem 3.5.**</span> Every $m \times n$ matrix $\mathbf{A}$ is row-equivalent to a row-reduced echelon matrix.

*Proof.* We know that $\mathbf{A}$ is row-equivalent to a row-reduced matrix (Theorem 3.4). All that we need observe is that by performing a finite number of row interchanges on a row-reduced matrix, we can bring it to row-reduced echelon form. $\blacksquare$

Let us now discuss briefly the system $\mathbf{Rx}=\mathbf{0}$, when $\mathbf{R}$ is a row-reduced echelon matrix. Let rows $1, \ldots, l$ be the non-zero rows of $\mathbf{R}$, and suppose that the leading non-zero entry of row $i$ occurs in column $k_{i}$. The system $\mathbf{Rx}=\mathbf{0}$ then consists of $l$ non-trivial equations. Also, the unknown $x_{k_{i}}$ will occur (with non-zero coefficient) only in the $i^{th}$ equation. If we let $u_{1}, \ldots, u_{n-l}$ denote the $(n-l)$ unknowns which are different from $x_{k_{1}}, \ldots, x_{k_{l}}$, then the $l$ non-trivial equations in $\mathbf{Rx}=\mathbf{0}$ are of the form
$$
\begin{aligned}
x_{k_{1}}+\sum_{j=1}^{n-l} c_{1j} u_{j} &= 0 \\
&\ \,\vdots \\
x_{k_{l}}+\sum_{j=1}^{n-l} c_{lj} u_{j} &= 0.
\end{aligned}
$$
All the solutions to the system of equations $\mathbf{Rx}=\mathbf{0}$ are obtained by assigning any values whatsoever to $u_{1}, \ldots, u_{n-l}$ and then computing the corresponding values of $x_{k_{1}}, \ldots, x_{k_{l}}$. For example, if $\mathbf{R}$ is the matrix displayed in the preceding example, then $l=2$, $k_{1}=2$, $k_{2}=4$, and the two non-trivial equations in the system $\mathbf{Rx}=\mathbf{0}$ are
$$
\begin{aligned}
x_{2}-3x_{3}+\frac{1}{2}x_{5}=0 \quad &\text{or} \quad x_{2}=3x_{3}-\frac{1}{2}x_{5} \\
x_{4}+2x_{5}=0 \quad &\text{or} \quad x_{4}=-2x_{5}.
\end{aligned}
$$
So we may assign any values to $x_{1}$, $x_{3}$, and $x_{5}$, say $x_{1}=a$, $x_{3}=b$, $x_{5}=c$, and obtain the solution $\left(a,\ 3b-\frac{1}{2}c,\ b,\ -2c,\ c\right)$.
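The recipe just described, namely assigning values to the free unknowns and then computing the pivot unknowns, can be checked mechanically. Here is a small plain-Python sketch for the matrix $\mathbf{R}$ of the preceding example; the sample values $a=7$, $b=-2$, $c=4$ are arbitrary.

```python
from fractions import Fraction

# Free unknowns x1, x3, x5 get arbitrary values a, b, c;
# the pivot unknowns x2, x4 are then determined.
a, b, c = Fraction(7), Fraction(-2), Fraction(4)

x1, x3, x5 = a, b, c
x2 = 3 * x3 - Fraction(1, 2) * x5   # from x2 - 3*x3 + (1/2)*x5 = 0
x4 = -2 * x5                        # from x4 + 2*x5 = 0
x = [x1, x2, x3, x4, x5]

# Verify that R @ x = 0 for the row-reduced echelon matrix R above.
R = [[0, 1, -3, 0, Fraction(1, 2)],
     [0, 0, 0, 1, 2],
     [0, 0, 0, 0, 0]]
assert all(sum(rij * xj for rij, xj in zip(row, x)) == 0 for row in R)
print([str(v) for v in x])          # ['7', '-8', '-2', '-8', '4']
```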
Let us observe one thing more in connection with the system of equations $\mathbf{Rx}=\mathbf{0}$. If the number $l$ of non-zero rows in $\mathbf{R}$ is less than $n$, then the system $\mathbf{Rx}=\mathbf{0}$ has a ==non-trivial solution==, that is, a solution $\left(x_{1}, \ldots, x_{n}\right)$ in which not every $x_{j}$ is $0$. For, since $l<n$, we can choose some $x_{j}$ which is not among the $l$ unknowns $x_{k_{1}}, \ldots, x_{k_{l}}$, and we can then construct a solution as above in which this $x_{j}$ is $1$. This observation leads us to one of the most fundamental facts concerning systems of homogeneous linear equations.

<span style="color: mediumslateblue;">**Theorem 3.6.**</span> If $\mathbf{A}$ is an $m \times n$ matrix and $m<n$, then the homogeneous system of linear equations $\mathbf{Ax}=\mathbf{0}$ has a non-trivial solution.

*Proof.* Let $\mathbf{R}$ be a row-reduced echelon matrix which is row-equivalent to $\mathbf{A}$. Then the systems $\mathbf{Ax}=\mathbf{0}$ and $\mathbf{Rx}=\mathbf{0}$ have the same solutions by Theorem 3.3. If $l$ is the number of non-zero rows in $\mathbf{R}$, then certainly $l \leq m$, and since $m<n$, we have $l<n$. It follows immediately from our remarks above that $\mathbf{Ax}=\mathbf{0}$ has a non-trivial solution. $\blacksquare$

<span style="color: mediumslateblue;">**Theorem 3.7.**</span> If $\mathbf{A}$ is an $n \times n$ (square) matrix, then $\mathbf{A}$ is row-equivalent to the $n \times n$ identity matrix if and only if the system of equations $\mathbf{Ax}=\mathbf{0}$ has only the trivial solution.

*Proof.* If $\mathbf{A}$ is row-equivalent to $\mathbf{I}$, then $\mathbf{Ax}=\mathbf{0}$ and $\mathbf{Ix}=\mathbf{0}$ have the same solutions. Conversely, suppose $\mathbf{Ax}=\mathbf{0}$ has only the trivial solution $\mathbf{x}=\mathbf{0}$. Let $\mathbf{R}$ be an $n \times n$ row-reduced echelon matrix which is row-equivalent to $\mathbf{A}$, and let $l$ be the number of non-zero rows of $\mathbf{R}$. Then $\mathbf{Rx}=\mathbf{0}$ has no non-trivial solution. Thus $l \geq n$. But since $\mathbf{R}$ has $n$ rows, certainly $l \leq n$, and we have $l=n$. Since this means that $\mathbf{R}$ actually has a leading non-zero entry of $1$ in each of its $n$ rows, and since these $1$'s occur each in a different one of the $n$ columns, $\mathbf{R}$ must be the $n \times n$ identity matrix. $\blacksquare$
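Both theorems are easy to observe computationally. In the following SymPy sketch (a convenience for illustration, not the chapter's method), a $2 \times 3$ homogeneous system necessarily has a non-trivial solution, and a square matrix whose homogeneous system has only the trivial solution row-reduces to the identity.

```python
from sympy import Matrix, eye

# Theorem 3.6: m = 2 < n = 3, so Ax = 0 has a non-trivial solution.
A = Matrix([[1, 2, 3],
            [4, 5, 6]])
print(A.nullspace())            # basis of solutions, here spanned by (1, -2, 1)

# Theorem 3.7: Bx = 0 only trivially, so B is row-equivalent to I.
B = Matrix([[1, -2],
            [3, 1]])
R, _ = B.rref()
assert R == eye(2)
```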
Let us now ask what elementary row operations do toward solving a system of linear equations $\mathbf{Ax}=\mathbf{y}$ which is not homogeneous. At the outset, one must observe one basic difference between this and the homogeneous case: while the homogeneous system always has the trivial solution $x_{1}=\cdots=x_{n}=0$, an inhomogeneous system may have no solution at all.

We form the ==augmented matrix== $\overline{\mathbf{A}}=\left[\mathbf{A} \mid \mathbf{y}\right]$ of the system $\mathbf{Ax}=\mathbf{y}$. This is the $m \times(n+1)$ matrix whose first $n$ columns are the columns of $\mathbf{A}$ and whose last column is $\mathbf{y}$. More precisely,
$$
\begin{aligned}
\overline{a}_{ij} &= a_{ij}, \quad \text{if } j \leq n, \\
\overline{a}_{i(n+1)} &= y_{i}.
\end{aligned}
$$
Suppose we perform a sequence of elementary row operations on $\mathbf{A}$, arriving at a row-reduced echelon matrix $\mathbf{R}$. If we perform this same sequence of row operations on the augmented matrix $\overline{\mathbf{A}}$, we will arrive at a matrix $\overline{\mathbf{R}}$ whose first $n$ columns are the columns of $\mathbf{R}$ and whose last column contains certain scalars $z_{1}, \ldots, z_{m}$. The scalars $z_{i}$ are the entries of the $m$-dimensional column vector
$$
\mathbf{z}=\left[\begin{array}{c}
z_{1} \\
\vdots \\
z_{m}
\end{array}\right]
$$
which results from applying the sequence of row operations to $\mathbf{y}$. It should be clear to the reader that, just as in the proof of Theorem 3.3, the systems $\mathbf{Ax}=\mathbf{y}$ and $\mathbf{Rx}=\mathbf{z}$ are equivalent and hence have the same solutions.

It is very easy to determine whether the system $\mathbf{Rx}=\mathbf{z}$ has any solutions and to determine all the solutions if any exist. For, if $\mathbf{R}$ has $l$ non-zero rows, with the leading non-zero entry of row $i$ occurring in column $k_{i}$, $i=1, \ldots, l$, then the first $l$ equations of $\mathbf{Rx}=\mathbf{z}$ effectively express $x_{k_{1}}, \ldots, x_{k_{l}}$ in terms of the $(n-l)$ remaining $x_{j}$ and the scalars $z_{1}, \ldots, z_{l}$. The last $(m-l)$ equations are
$$
\begin{aligned}
0 &= z_{l+1} \\
&\ \,\vdots \\
0 &= z_{m}
\end{aligned}
$$
and accordingly the condition for the system to have a solution is $z_{i}=0$ for $i>l$. If this condition is satisfied, all solutions to the system are found just as in the homogeneous case, by assigning arbitrary values to $(n-l)$ of the $x_{j}$ and then computing $x_{k_{i}}$ from the $i^{th}$ equation.

> Example. Let
> $$ \mathbf{A}=\left[\begin{array}{rrr} 1 & -2 & 1 \\ 2 & 1 & 1 \\ 0 & 5 & -1 \end{array}\right] $$
> and suppose that we wish to solve the system $\mathbf{Ax}=\mathbf{y}$ for some $y_{1}$, $y_{2}$, and $y_{3}$. Let us perform a sequence of row operations on the augmented matrix $\overline{\mathbf{A}}$ which row-reduces $\mathbf{A}$:
> $$ \left[\begin{array}{rrrc} 1 & -2 & 1 & y_{1} \\ 2 & 1 & 1 & y_{2} \\ 0 & 5 & -1 & y_{3} \end{array}\right] \xrightarrow{(2)} \left[\begin{array}{rrrc} 1 & -2 & 1 & y_{1} \\ 0 & 5 & -1 & \left(y_{2}-2y_{1}\right) \\ 0 & 5 & -1 & y_{3} \end{array}\right] \xrightarrow{(2)} \left[\begin{array}{rrrc} 1 & -2 & 1 & y_{1} \\ 0 & 5 & -1 & \left(y_{2}-2y_{1}\right) \\ 0 & 0 & 0 & \left(y_{3}-y_{2}+2y_{1}\right) \end{array}\right] \xrightarrow{(1)} $$
> $$ \left[\begin{array}{rrrc} 1 & -2 & 1 & y_{1} \\ 0 & 1 & -\frac{1}{5} & \frac{1}{5}\left(y_{2}-2y_{1}\right) \\ 0 & 0 & 0 & \left(y_{3}-y_{2}+2y_{1}\right) \end{array}\right] \xrightarrow{(2)} \left[\begin{array}{rrrc} 1 & 0 & \frac{3}{5} & \frac{1}{5}\left(y_{1}+2y_{2}\right) \\ 0 & 1 & -\frac{1}{5} & \frac{1}{5}\left(y_{2}-2y_{1}\right) \\ 0 & 0 & 0 & \left(y_{3}-y_{2}+2y_{1}\right) \end{array}\right] $$
> The condition that the system $\mathbf{Ax}=\mathbf{y}$ have a solution is thus
> $$2y_{1}-y_{2}+y_{3}=0$$
> and if the given scalars $y_{i}$ satisfy this condition, all solutions are obtained by assigning a value $c$ to $x_{3}$ and then computing
> $$ \begin{aligned} x_{1} &= -\frac{3}{5}c+\frac{1}{5}\left(y_{1}+2y_{2}\right) \\ x_{2} &= \frac{1}{5}c+\frac{1}{5}\left(y_{2}-2y_{1}\right). \end{aligned} $$
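The same computation can be carried out symbolically, treating $y_{1}$, $y_{2}$, $y_{3}$ as unknowns. The following SymPy sketch mirrors the four row operations above and recovers the consistency condition $2y_{1}-y_{2}+y_{3}=0$ as the last entry of the reduced augmented matrix.

```python
from sympy import Matrix, symbols, simplify

# Augmented matrix [A | y] with a symbolic right-hand side.
y1, y2, y3 = symbols('y1 y2 y3')
Abar = Matrix([[1, -2,  1, y1],
               [2,  1,  1, y2],
               [0,  5, -1, y3]])

Abar[1, :] = Abar[1, :] - 2 * Abar[0, :]   # R2 <- R2 - 2*R1  (type 2)
Abar[2, :] = Abar[2, :] - Abar[1, :]       # R3 <- R3 - R2    (type 2)
Abar[1, :] = Abar[1, :] / 5                # R2 <- (1/5)*R2   (type 1)
Abar[0, :] = Abar[0, :] + 2 * Abar[1, :]   # R1 <- R1 + 2*R2  (type 2)

# The last entry of the zero row must vanish for a solution to exist.
print(simplify(Abar[2, 3]))                # 2*y1 - y2 + y3
```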