## Chapter 3.2. Matrix Operations and Determinants

This chapter introduces the fundamental tools for manipulating and analyzing matrices: matrix operations and the determinant. Matrices are more than just tables of numbers; they are powerful mathematical objects that represent linear transformations, store data, and simplify systems of equations. To fully utilize their potential, we must understand the essential operations that govern their behavior. We will first explore arithmetic operations like matrix addition and scalar multiplication, leading into the crucial, non-commutative process of matrix multiplication. We will then define important derived properties, including the transpose (flipping a matrix over its diagonal) and the trace (the sum of its main diagonal elements). Finally, we will introduce the determinant, a single scalar value unique to square matrices, which is essential for determining a matrix's invertibility and is a cornerstone concept for solving linear systems and understanding geometric transformations.

### 3.2.1. Basic Operations

### Transpose of a Matrix

The ==transpose== of an $m \times n$ matrix $\mathbf{A}$, denoted $\mathbf{A}^{T}$, is the $n \times m$ matrix whose entries satisfy $a_{ij}^{T} = a_{ji}$.

### Matrix Addition

Given two $m \times n$ matrices $\mathbf{A}$ and $\mathbf{B}$, the ==sum== $\mathbf{A}+\mathbf{B}$ is the $m \times n$ matrix whose $ij$th element is $$a_{ij} + b_{ij}, \qquad i=1,\ldots,m \text{ and } j=1,\ldots,n.$$ The sum is not defined if the dimensions of the two matrices differ. It is easy to verify that matrix addition is *commutative* and *associative*.

### Matrix Multiplication

==Scalar multiplication== is defined for an arbitrary scalar $k \in \mathbb{R}$ and an arbitrary $m \times n$ matrix $\mathbf{A}=\left\{a_{i j}\right\}$. The product of $k$ and $\mathbf{A}$ is written as $k \mathbf{A}$ and is defined to be the $m \times n$ matrix whose $ij$th element is $k a_{ij}$.
For example, $$ 4\left[\begin{array}{rrr} 2 & -3 & 0 \\ 1 & 4 & 2 \end{array}\right]=\left[\begin{array}{rrr} 8 & -12 & 0 \\ 4 & 16 & 8 \end{array}\right] . $$ The matrix $k \mathbf{A}$ is said to be a scalar multiple of $\mathbf{A}$. For any matrix $\mathbf{A}$, $$1 \mathbf{A}=\mathbf{A}.$$

For a matrix $\mathbf{A}$ with $m$ rows and $n$ columns and a matrix $\mathbf{B}$ with $p$ rows and $q$ columns, the ==product== $\mathbf{AB}$ is defined when $n=p$ and is equal to the $m \times q$ matrix whose $ij$ th element is $$\sum_{k=1}^n a_{i k} b_{k j}=a_{i1} b_{1j}+a_{i2} b_{2j}+\cdots+a_{in} b_{nj}.$$

> Example. $$ \begin{aligned} & \left[\begin{array}{rrr} 2 & -3 & 0 \\ 1 & 4 & 2 \end{array}\right]\left[\begin{array}{rr} 3 & 4 \\ -2 & 0 \\ 1 & 2 \end{array}\right] \\ & \quad=\left[\begin{array}{ll} 2(3)+(-3)(-2)+0(1) & 2(4)+(-3)(0)+0(2) \\ 1(3)+4(-2)+2(1) & 1(4)+4(0)+2(2) \end{array}\right] \\ & \quad=\left[\begin{array}{rr} 12 & 8 \\ -3 & 8 \end{array}\right] . \end{aligned} $$

Matrix multiplication is not, in general, ==commutative==. That is, $\mathbf{AB}$ is not necessarily identical to $\mathbf{BA}$. In fact, when $n=p$ but $m \neq q$ or when $m=q$ but $n \neq p$, one of the matrix products $\mathbf{AB}$ and $\mathbf{BA}$ is defined, while the other is undefined. When $n=p$ and $m=q$, $\mathbf{AB}$ and $\mathbf{BA}$ are both defined, but the dimensions of $\mathbf{AB}$ are the same as those of $\mathbf{BA}$ only if $m=n$. Even when $n=p=m=q$, in which case $\mathbf{A}$ and $\mathbf{B}$ are both $n \times n$ matrices and the two matrix products $\mathbf{AB}$ and $\mathbf{BA}$ are both defined and have the same dimensions, it is not necessarily true that $\mathbf{AB}=\mathbf{BA}$.
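As a numerical aside, the scalar multiple and the worked product above, as well as the failure of commutativity, can be checked with NumPy (the use of `numpy` here is an illustration and not part of the text):

```python
import numpy as np

A = np.array([[2, -3, 0],
              [1,  4, 2]])
B = np.array([[ 3, 4],
              [-2, 0],
              [ 1, 2]])

# Scalar multiplication scales every entry of A.
assert (4 * A == np.array([[8, -12, 0], [4, 16, 8]])).all()

# The ij-th element of AB is sum_k a_ik * b_kj; a (2x3)(3x2) product is 2x2.
assert (A @ B == np.array([[12, 8], [-3, 8]])).all()

# Two square matrices generally do not commute.
P = np.array([[1, 2], [3, 4]])
Q = np.array([[0, 1], [1, 0]])
assert not (P @ Q == Q @ P).all()
```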
For example, $$ \left[\begin{array}{ll} 1 & 0 \\ 0 & c \end{array}\right]\left[\begin{array}{ll} 0 & 1 \\ 1 & 0 \end{array}\right]=\left[\begin{array}{ll} 0 & 1 \\ c & 0 \end{array}\right] \neq\left[\begin{array}{ll} 0 & c \\ 1 & 0 \end{array}\right]=\left[\begin{array}{ll} 0 & 1 \\ 1 & 0 \end{array}\right]\left[\begin{array}{ll} 1 & 0 \\ 0 & c \end{array}\right] . $$

Matrix multiplication is ==associative==. Thus, introducing a third matrix $\mathbf{C}$, $$ \mathbf{A}(\mathbf{BC})=(\mathbf{AB}) \mathbf{C}, $$ provided that $p=n$ and that $\mathbf{C}$ has $q$ rows, so that the matrix multiplications required to form the left and right sides of the equality are defined. The symbol $\mathbf{ABC}$ is used to represent the common value of the left and right sides of this equality, and this value is referred to as the product of $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$. This notation and terminology extend in an obvious way to any finite number of matrices.

Matrix multiplication is ==distributive== with respect to addition; that is, $$ \begin{aligned} & \mathbf{A}(\mathbf{B}+\mathbf{C})=\mathbf{A B}+\mathbf{A} \mathbf{C}, \\ & (\mathbf{A}+\mathbf{B}) \mathbf{C}=\mathbf{A} \mathbf{C}+\mathbf{B C}, \end{aligned} $$ where, in each equality, it is assumed that the dimensions of $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ are such that all multiplications and additions are defined. These distributive laws extend in an obvious way to the postmultiplication or premultiplication of a matrix $\mathbf{A}$ or $\mathbf{C}$ by the sum of any finite number of matrices.

### Inverse of a Matrix

Next, we define the inverse of square matrices, i.e., matrices with the same number of rows and columns.

<span style="color: lime;">**Definition 3.7.**</span> Let $\mathbf{A}$ be an $n \times n$ (square) matrix.
An $n \times n$ matrix $\mathbf{B}$ such that $\mathbf{BA}=\mathbf{I}$ is called a ==left inverse== of $\mathbf{A}$; an $n \times n$ matrix $\mathbf{B}$ such that $\mathbf{AB}=\mathbf{I}$ is called a ==right inverse== of $\mathbf{A}$. If $\mathbf{AB}=\mathbf{BA}=\mathbf{I}$, then $\mathbf{B}$ is called a ==(two-sided) inverse of $\mathbf{A}$== and $\mathbf{A}$ is said to be invertible.

<span style="color: mediumslateblue;">**Theorem 3.8.**</span> If $\mathbf{A}$ has a left inverse $\mathbf{B}$ and a right inverse $\mathbf{C}$, then $\mathbf{B}=\mathbf{C}$.

Proof. Suppose $\mathbf{BA}=\mathbf{I}$ and $\mathbf{AC}=\mathbf{I}$. Then $$ \mathbf{B}=\mathbf{BI}=\mathbf{B(AC)}=\mathbf{(BA)C}=\mathbf{IC}=\mathbf{C}.$$

Thus if $\mathbf{A}$ has a left and a right inverse, $\mathbf{A}$ is invertible and has a unique inverse, which we shall denote by $\mathbf{A}^{-1}$ and simply call the inverse of $\mathbf{A}$.

<span style="color: mediumslateblue;">**Theorem 3.9.**</span> Let $\mathbf{A}$ and $\mathbf{B}$ be $n \times n$ matrices. (i) If $\mathbf{A}$ is invertible, so is $\mathbf{A}^{-1}$ and $\left(\mathbf{A}^{-1}\right)^{-1}=\mathbf{A}$. (ii) If both $\mathbf{A}$ and $\mathbf{B}$ are invertible, so is $\mathbf{AB}$, and $(\mathbf{AB})^{-1}=\mathbf{B}^{-1} \mathbf{A}^{-1}$.

*Proof*. The first statement is evident from the symmetry of the definition. The second follows upon verification of the relations $$(\mathbf{AB})\left(\mathbf{B}^{-1}\mathbf{A}^{-1}\right)=\left(\mathbf{B}^{-1}\mathbf{A}^{-1}\right)(\mathbf{AB})=\mathbf{I}.$$

<span style="color: mediumslateblue;">**Corollary 3.1.**</span> A product of invertible matrices is invertible.

<span style="color: mediumslateblue;">**Theorem 3.10.**</span> An elementary matrix is invertible.

*Proof*. Left to the reader.

<span style="color: mediumslateblue;">**Theorem 3.11.**</span> If $\mathbf{A}$ is an $n \times n$ matrix, the following are equivalent. (i) $\mathbf{A}$ is invertible.
(ii) $\mathbf{A}$ is row-equivalent to the $n \times n$ identity matrix. (iii) $\mathbf{A}$ is a product of elementary matrices.

*Proof*. Let $\mathbf{R}$ be a row-reduced echelon matrix which is row-equivalent to $\mathbf{A}$. Then there exist elementary matrices $\mathbf{E}_{1}, \ldots, \mathbf{E}_{k}$ such that $$\mathbf{R}=\mathbf{E}_{k} \cdots \mathbf{E}_{2} \mathbf{E}_{1}\mathbf{A}.$$ Each $\mathbf{E}_j$ is invertible, and so $$\mathbf{A}=\mathbf{E}_1^{-1} \cdots \mathbf{E}_k^{-1}\mathbf{R}.$$ Since products of invertible matrices are invertible, we see that $\mathbf{A}$ is invertible if and only if $\mathbf{R}$ is invertible. Since $\mathbf{R}$ is a (square) row-reduced echelon matrix, $\mathbf{R}$ is invertible if and only if each row of $\mathbf{R}$ contains a non-zero entry, that is, if and only if $\mathbf{R}=\mathbf{I}$. We have now shown that $\mathbf{A}$ is invertible if and only if $\mathbf{R}=\mathbf{I}$, and if $\mathbf{R}=\mathbf{I}$ then $\mathbf{A}^{-1}=\mathbf{E}_{k} \cdots \mathbf{E}_{1}$. It should now be apparent that (i), (ii), and (iii) are equivalent statements about $\mathbf{A}$. $\blacksquare$

<span style="color: mediumslateblue;">**Corollary 3.2.**</span> If $\mathbf{A}$ is an invertible $n \times n$ matrix and if a sequence of elementary row operations reduces $\mathbf{A}$ to the identity, then that same sequence of operations when applied to $\mathbf{I}$ yields $\mathbf{A}^{-1}$.

<span style="color: mediumslateblue;">**Corollary 3.3.**</span> Let $\mathbf{A}$ and $\mathbf{B}$ be $m \times n$ matrices. Then $\mathbf{B}$ is row-equivalent to $\mathbf{A}$ if and only if $\mathbf{B}=\mathbf{PA}$, where $\mathbf{P}$ is an invertible $m \times m$ matrix.

<span style="color: mediumslateblue;">**Theorem 3.12.**</span> For an $n \times n$ matrix $\mathbf{A}$, the following are equivalent. (i) $\mathbf{A}$ is invertible. (ii) The homogeneous system $\mathbf{AX}=\mathbf{0}$ has only the trivial solution $\mathbf{X}=\mathbf{0}$.
(iii) The system of equations $\mathbf{AX}=\mathbf{Y}$ has a solution $\mathbf{X}$ for each $n \times 1$ matrix $\mathbf{Y}$.

### 3.2.2. Some Basic Types of Matrices

### Square matrices

A matrix having the same number of rows as columns is called a ==square matrix==. An $n \times n$ square matrix is said to be of ==order== $n$. The $n$ elements $a_{11}, a_{22}, \ldots, a_{n n}$ of an $n \times n$ square matrix $\mathbf{A}$ that fall on an imaginary diagonal line extending from the upper left to the lower right corners of the matrix are known as the (first, second, etc.) ==diagonal elements== or the ==diagonal== of the matrix $\mathbf{A}$. Those elements of a square matrix other than the diagonal elements (i.e., those elements that lie above and to the right or below and to the left of the diagonal) are called the ==off-diagonal elements==.

Note that the product $\mathbf{AA}$ of a matrix $\mathbf{A}$ by itself is defined if and only if $\mathbf{A}$ is square. For a square matrix $\mathbf{A}$, the symbol $\mathbf{A}^{2}$ is used to represent $\mathbf{AA}$, $\mathbf{A}^{3}$ to represent $\mathbf{A} \mathbf{A}^{2}=\mathbf{A}\mathbf{A}\mathbf{A}$, and, more generally, $\mathbf{A}^k$ to represent $\mathbf{A} \mathbf{A}^{k-1}=\mathbf{A} \mathbf{A} \mathbf{A}^{k-2}= \cdots=\mathbf{A} \mathbf{A} \cdots \mathbf{A}$ $(k=2,3,\ldots)$.

The symbol $\mathbf{J}_{m n}$ is used to represent an $m \times n$ matrix whose elements are all equal to $1$. Or, when the dimensions are clear from the context or are to be left unspecified, we may simply write $\mathbf{J}$ for such a matrix. Thus, $$ \mathbf{J}=\left[\begin{array}{cccc} 1 & 1 & \ldots & 1 \\ 1 & 1 & \ldots & 1 \\ \vdots & \vdots & & \vdots \\ 1 & 1 & \ldots & 1 \end{array}\right] . $$ Also, $\mathbf{J}_n$ is sometimes written for $\mathbf{J}_{n n}$. Note that $$ \mathbf{J}_{m n} \mathbf{J}_{np}=n \mathbf{J}_{mp}.$$

A matrix whose elements are all equal to 0 is called a ==null matrix==.
A null matrix is denoted by the symbol $\mathbf{0}$. Thus, $$ \mathbf{0}=\left[\begin{array}{cccc} 0 & 0 & \ldots & 0 \\ 0 & 0 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & 0 \end{array}\right] . $$

### Symmetric matrices

A matrix $\mathbf{A}$ is said to be ==symmetric== if $\mathbf{A}^{T}=\mathbf{A}$. Thus, a symmetric matrix is a square matrix whose $ij$th element equals its $ji$th element. For example, $$ \left[\begin{array}{rrr} 3 & 1 & -4 \\ 1 & 0 & 2 \\ -4 & 2 & -5 \end{array}\right] $$ is a symmetric matrix.

### Diagonal matrices

A diagonal matrix is a square matrix whose off-diagonal elements are all equal to $0$, that is, a matrix of the general form $$ \left[\begin{array}{cccc} d_1 & 0 & \ldots & 0 \\ 0 & d_2 & & 0 \\ \vdots & & \ddots & \\ 0 & 0 & & d_n \end{array}\right], $$ where $d_1, d_2, \ldots, d_n$ are scalars. A diagonal matrix $$ \left[\begin{array}{cccc} 1 & 0 & \ldots & 0 \\ 0 & 1 & & 0 \\ \vdots & & \ddots & \\ 0 & 0 & & 1 \end{array}\right] $$ whose diagonal elements are all equal to $1$ is called an ==identity matrix==. The symbol $\mathbf{I}_n$ is used to represent an identity matrix of order $n$. In cases where the order is clear from the context, we may simply write $\mathbf{I}$ for an identity matrix.
Clearly, for an arbitrary matrix $\mathbf{A}$, $$\mathbf{I} \mathbf{A}=\mathbf{A} \mathbf{I}=\mathbf{A}.$$

### Null matrices

Letting $\mathbf{A}$ represent an arbitrary matrix and $k$ an arbitrary scalar, we find that $$ \begin{gathered} 0 \mathbf{A}=\mathbf{0}, \quad k \mathbf{0}=\mathbf{0}, \\ \mathbf{A}+\mathbf{0}=\mathbf{0}+\mathbf{A}=\mathbf{A}, \\ \mathbf{A}-\mathbf{A}=\mathbf{0}, \\ \mathbf{0} \mathbf{A}=\mathbf{0}, \quad \mathbf{A 0}=\mathbf{0} \end{gathered} $$ (where the dimensions of each null matrix $\mathbf{0}$ can be ascertained from the context).

### Triangular matrices

If all of the elements of a square matrix that are located below and to the left of the diagonal are $0$, the matrix is called an ==upper triangular matrix==. Thus, the general form of an upper triangular matrix is $$ \left[\begin{array}{ccccc} a_{11} & a_{12} & a_{13} & \ldots & a_{1 n} \\ 0 & a_{22} & a_{23} & \ldots & a_{2 n} \\ 0 & 0 & a_{33} & \ldots & a_{3 n} \\ \vdots & \vdots & & \ddots & \vdots \\ 0 & 0 & 0 & & a_{n n} \end{array}\right] . $$ Similarly, if all of the elements that are located above and to the right of the diagonal are $0$, the matrix is called a lower triangular matrix. More formally, an $n \times n$ matrix $\mathbf{A}=\left\{a_{i j}\right\}$ is upper triangular if $a_{i j}=0$ for $j<i=1, \ldots, n$, and is lower triangular if $a_{i j}=0$ for $j>i=1, \ldots, n$. By a triangular matrix, we mean a (square) matrix that is upper triangular or lower triangular. Observe that a (square) matrix is a diagonal matrix if and only if it is both upper and lower triangular. The transpose of an upper triangular matrix is a ==lower triangular matrix== (and vice versa), as is easily verified. Further, that the sum of two upper triangular matrices (of the same order) is upper triangular, and that the sum of two lower triangular matrices is lower triangular, is also easily verified.
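The transpose and sum properties just stated are easy to confirm numerically; a small NumPy sketch (an illustration, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(0)
# np.triu zeroes everything below the diagonal, leaving an upper triangular matrix.
U1 = np.triu(rng.standard_normal((4, 4)))
U2 = np.triu(rng.standard_normal((4, 4)))

# The transpose of an upper triangular matrix is lower triangular.
assert np.allclose(U1.T, np.tril(U1.T))

# The sum of two upper triangular matrices is upper triangular.
S = U1 + U2
assert np.allclose(S, np.triu(S))
```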
Some basic properties of the product of two upper or two lower triangular matrices are described in the following theorem.

<span style="color: mediumslateblue;">**Theorem 3.13.**</span> Let $\mathbf{A}=\left(a_{ij}\right)$ and $\mathbf{B}=\left(b_{ij}\right)$ represent $n \times n$ matrices. If $\mathbf{A}$ and $\mathbf{B}$ are both upper triangular, then their product $\mathbf{A B}$ is upper triangular. Similarly, if $\mathbf{A}$ and $\mathbf{B}$ are both lower triangular, then $\mathbf{A B}$ is lower triangular. Further, if $\mathbf{A}$ and $\mathbf{B}$ are either both upper triangular or both lower triangular, then the $i$ th diagonal element of $\mathbf{A B}$ is the product $a_{i i} b_{i i}$ of the $i$ th diagonal elements $a_{i i}$ and $b_{i i}$ of $\mathbf{A}$ and $\mathbf{B}(i=1, \ldots, n)$.

Proof. By definition, the $ij$ th element of $\mathbf{A B}$ is $\sum_{k=1}^n a_{i k} b_{k j}$. Suppose that $\mathbf{A}$ and $\mathbf{B}$ are both upper triangular and hence that $a_{i k}=0$, for $k<i$, and $b_{k j}=0$, for $k>j$. If $j \leq i$, then clearly $b_{k j}=0$ for $k>i$. Thus, for $j \leq i$, $$ \sum_{k=1}^n a_{i k} b_{k j}=\sum_{k=1}^{i-1}(0) b_{k j}+a_{i i} b_{i j}+\sum_{k=i+1}^n a_{i k}(0)=a_{i i} b_{i j}. $$ In particular, $$ \sum_{k=1}^n a_{i k} b_{k i}=a_{i i} b_{i i}, $$ and, for $j<i$, $$ \sum_{k=1}^n a_{i k} b_{k j}=a_{i i}(0)=0. $$ We conclude that the $i$ th diagonal element of $\mathbf{AB}$ is $a_{ii} b_{ii}(i=1, \ldots, n)$ and that $\mathbf{A B}$ is upper triangular. Suppose now that $\mathbf{A}$ and $\mathbf{B}$ are both lower triangular, rather than upper triangular. Then, since $\mathbf{B}^{T}$ and $\mathbf{A}^{T}$ are both upper triangular with $i$ th diagonal elements $b_{i i}$ and $a_{i i}$, respectively, $\mathbf{B}^{T} \mathbf{A}^{T}$ is upper triangular with $i$ th diagonal element $a_{ii} b_{ii}$, implying that $\mathbf{A B}=\left(\mathbf{B}^{T} \mathbf{A}^{T}\right)^{T}$ is lower triangular with $i$ th diagonal element $a_{i i} b_{i i}$.
Q.E.D.

An (upper or lower) triangular matrix is called a unit (upper or lower) triangular matrix if all of its diagonal elements equal one. In the special case of unit triangular matrices, Theorem 3.13 reduces to the following result.

<span style="color: mediumslateblue;">**Corollary 3.4.**</span> The product of two unit upper triangular matrices (of the same order) is unit upper triangular, and the product of two unit lower triangular matrices is unit lower triangular.

Clearly, the transpose of an $m$-dimensional column vector is an $m$-dimensional row vector, and vice versa.

Let $\mathbf{A}$ be an $m \times n$ matrix. Then $\mathbf{A}$ can be carried to a row-reduced echelon matrix $\mathbf{U}$, which is upper triangular: $$ \mathbf{A} \rightarrow \mathbf{E}_1 \mathbf{A} \rightarrow \mathbf{E}_2\mathbf{E}_1 \mathbf{A} \rightarrow \mathbf{E}_3\mathbf{E}_2\mathbf{E}_1 \mathbf{A} \rightarrow \cdots \rightarrow \mathbf{E}_k \mathbf{E}_{k-1} \cdots \mathbf{E}_2 \mathbf{E}_1 \mathbf{A}=\mathbf{U}, $$ where $\mathbf{E}_{1}, \mathbf{E}_{2}, \ldots, \mathbf{E}_{k}$ are elementary matrices corresponding to the row operations used. Hence $$\mathbf{A}=\mathbf{LU},$$ where $\mathbf{L}=\left(\mathbf{E}_{k} \mathbf{E}_{k-1} \cdots \mathbf{E}_{2} \mathbf{E}_{1}\right)^{-1}=\mathbf{E}_{1}^{-1} \mathbf{E}_{2}^{-1} \cdots \mathbf{E}_{k-1}^{-1} \mathbf{E}_{k}^{-1}$. If we do not insist that $\mathbf{U}$ be reduced, then, except for row interchanges, none of these row operations involves adding a row to a row above it. Thus, if no row interchanges are used, all the $\mathbf{E}_i$, and hence their inverses $\mathbf{E}_i^{-1}$, are lower triangular, and so $\mathbf{L}$ is lower triangular by Theorem 3.13 (and invertible by Corollary 3.1). Such a factorization of the matrix $\mathbf{A}$ into a lower triangular matrix $\mathbf{L}$ and an upper triangular matrix $\mathbf{U}$ is called the ==LU factorization== of the matrix $\mathbf{A}$.

### 3.2.3. Linear Independence

In this section, we focus on the space of $m \times n$ matrices.
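Before developing linear independence, the elimination process behind the LU factorization described above can be sketched in code. The sketch below (assuming NumPy, and assuming no zero pivot arises so that no row interchanges are needed) records the elimination multipliers below the diagonal of $\mathbf{L}$ and checks that $\mathbf{A}=\mathbf{LU}$; it is an illustration, not part of the text:

```python
import numpy as np

def lu_no_pivot(A):
    """Eliminate below each pivot using 'add a multiple of a row to a
    lower row' operations; the multipliers, stored below the diagonal
    of L, encode the inverses of the elementary matrices E_i."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L, U = np.eye(n), A.copy()
    for col in range(n - 1):
        if np.isclose(U[col, col], 0.0):
            raise ValueError("zero pivot: a row interchange would be needed")
        for row in range(col + 1, n):
            m = U[row, col] / U[col, col]
            U[row] -= m * U[col]   # the row operation E_i applied to the working matrix
            L[row, col] = m        # record E_i^{-1}
    return L, U

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
L, U = lu_no_pivot(A)
assert np.allclose(L @ U, A)                           # A = LU
assert np.allclose(L, np.tril(L))                      # L is (unit) lower triangular
assert np.allclose(U, np.triu(U))                      # U is upper triangular
```

Library routines (for example `scipy.linalg.lu`) add row interchanges (pivoting) for numerical stability, which this sketch deliberately omits to match the text.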
Any finite set of row or column vectors, or more generally any finite set of $m \times n$ matrices, is either linearly dependent or linearly independent. <span style="color: lime;">**Definition 3.8.**</span> A nonempty finite set $\left\{\mathbf{A}_1, \mathbf{A}_2, \ldots, \mathbf{A}_k\right\}$ of $m \times n$ matrices is said to be linearly dependent if there exist scalars $x_1, x_2, \ldots, x_k$, not all zero, such that $$ x_{1} \mathbf{A}_1+x_{2} \mathbf{A}_2+\cdots+x_{k} \mathbf{A}_{k}=\mathbf{0}$$ If no such scalars exist, the set is called linearly independent. The empty set is considered to be linearly independent. Note that if any subset of a finite set of matrices is linearly dependent, then the set itself is linearly dependent. While technically linear dependence and independence are properties of sets of matrices, it is customary to speak of "a set of linearly dependent (or independent) matrices" or simply of "linearly dependent (or independent) matrices" instead of "a linearly dependent (or independent) set of matrices." In particular, in the case of row or column vectors, it is customary to speak of "linearly dependent (or independent) vectors." Note that a set consisting of a single matrix is linearly dependent if the matrix is null, and is linearly independent if the matrix is non-null. An elementary result on the linear dependence or independence of two or more matrices is expressed in the following theorem. <span style="color: mediumslateblue;">**Theorem 3.14.**</span> A set $\left\{\mathbf{A}_1, \mathbf{A}_2, \ldots, \mathbf{A}_k\right\}$ of two or more $m \times n$ matrices, the first of which is non-null, is linearly dependent if and only if at least one of the matrices is expressible as a linear combination of the preceding ones, that is, if and only if, for some integer $j$, with $2 \leq j \leq k, \mathbf{A}_j$ is expressible as a linear combination of $\mathbf{A}_1, \ldots, \mathbf{A}_{j-1}$. Proof. 
Suppose that, for some $j \geq 2$, $\mathbf{A}_{j}$ is expressible as a linear combination of $\mathbf{A}_{1}, \ldots, \mathbf{A}_{j-1}$. That is, $$ \mathbf{A}_{j}=x_{1} \mathbf{A}_{1}+\cdots+x_{j-1} \mathbf{A}_{j-1} $$ Then, $$ \begin{aligned} \left(-x_{1}\right) \mathbf{A}_1 & +\cdots+\left(-x_{j-1}\right) \mathbf{A}_{j-1}+\mathbf{A}_j =\mathbf{0}, \end{aligned} $$ implying that $\left\{\mathbf{A}_{1}, \mathbf{A}_{2}, \ldots, \mathbf{A}_{k}\right\}$ is linearly dependent. Conversely, suppose that the set $\left\{\mathbf{A}_1, \ldots, \mathbf{A}_k\right\}$ is linearly dependent. Define $j$ to be the first integer for which $\left\{\mathbf{A}_1, \ldots, \mathbf{A}_j\right\}$ is a linearly dependent set. Then, $$ x_1 \mathbf{A}_1+\cdots+x_j \mathbf{A}_j=\mathbf{0} $$ for some scalars $x_1, \ldots, x_j$, not all zero. Moreover, $x_j \neq 0$, since otherwise $\left\{\mathbf{A}_1, \ldots, \mathbf{A}_{j-1}\right\}$ would be a linearly dependent set, contrary to the definition of $j$. Thus, $$ \mathbf{A}_j=\left(-x_1 / x_j\right) \mathbf{A}_1+\cdots+\left(-x_{j-1} / x_j\right) \mathbf{A}_{j-1}. \quad \blacksquare$$ <span style="color: mediumslateblue;">**Corollary 3.5.**</span> A set $\left\{\mathbf{A}_1, \mathbf{A}_2, \ldots, \mathbf{A}_k\right\}$ of two or more $m \times n$ matrices is linearly dependent if and only if at least one of the matrices is expressible as a linear combination of the others; that is, if and only if, for some integer $j(1 \leq j \leq k)$, $\mathbf{A}_j$ is expressible as a linear combination of $\mathbf{A}_1, \ldots, \mathbf{A}_{j-1}, \mathbf{A}_{j+1}, \ldots, \mathbf{A}_k$. <span style="color: mediumslateblue;">**Corollary 3.6.**</span> Suppose that $\left\{\mathbf{A}_1, \ldots, \mathbf{A}_k\right\}$ is a (nonempty) linearly independent set of $m \times n$ matrices and that $\mathbf{A}$ is another $m \times n$ matrix. 
Then, the set $\left\{\mathbf{A}_1, \ldots, \mathbf{A}_k, \mathbf{A}\right\}$ is linearly independent if and only if $\mathbf{A}$ is not expressible as a linear combination of the matrices $\mathbf{A}_1, \ldots, \mathbf{A}_k$.

### 3.2.4. Row Space, Column Space, and Null Space of a Matrix

In this section, we define and study the properties of two important spaces related to an $m \times n$ matrix $\mathbf{A}$ - its ==row space== and its ==column space==. Given an $m \times n$ matrix $\mathbf{A}$, its $i$th row vector is denoted by $A_{i}$ and its $j$th column vector is denoted by $\mathbf{a}^{j}$.

<span style="color: lime;">**Definition 3.9.**</span> The ==column space== of an $m \times n$ matrix $\mathbf{A}$ is the set whose elements consist of all $m$-dimensional column vectors that are expressible as linear combinations of the $n$ columns of $\mathbf{A}$. Thus, the elements of the column space of $\mathbf{A}$ consist of all $m$-dimensional column vectors of the general form $$x_{1} \mathbf{a}^{1}+x_{2} \mathbf{a}^{2}+\cdots+x_{n} \mathbf{a}^{n},$$ where $x_1, x_2, \ldots, x_n$ are scalars and $\mathbf{a}^{1}, \mathbf{a}^{2}, \ldots, \mathbf{a}^{n}$ represent the columns of $\mathbf{A}$, or equivalently of the general form $\mathbf{A x}$, where $\mathbf{x}$ is an $n \times 1$ column vector.

>Example. The column space of the $3 \times 4$ matrix $$ \left(\begin{array}{rrrr} 2 & -4 & 0 & 0 \\ -1 & 2 & 0 & 0 \\ 0 & 0 & 1 & 2 \end{array}\right) $$ includes the column vector $$ \left(\begin{array}{r} 4 \\ -2 \\ -3 \end{array}\right)=2\left(\begin{array}{r} 2 \\ -1 \\ 0 \end{array}\right)+0\left(\begin{array}{r} -4 \\ 2 \\ 0 \end{array}\right)-3\left(\begin{array}{l} 0 \\ 0 \\ 1 \end{array}\right)+0\left(\begin{array}{l} 0 \\ 0 \\ 2 \end{array}\right), $$ but not the column vector $(1,0,0)^{T}$.
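Membership in the column space can be tested numerically: $\mathbf{y}$ lies in the column space exactly when $\mathbf{Ax}=\mathbf{y}$ is consistent, i.e. when the least-squares residual is (numerically) zero. A NumPy sketch using the matrix from the example above (an illustration, not part of the text):

```python
import numpy as np

A = np.array([[ 2.0, -4.0, 0.0, 0.0],
              [-1.0,  2.0, 0.0, 0.0],
              [ 0.0,  0.0, 1.0, 2.0]])

def in_column_space(A, y, tol=1e-10):
    """y is in the column space of A iff Ax = y has a solution, i.e. the
    least-squares residual ||Ax - y|| is (numerically) zero."""
    x, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.linalg.norm(A @ x - y) < tol

assert in_column_space(A, np.array([4.0, -2.0, -3.0]))    # in the column space
assert not in_column_space(A, np.array([1.0, 0.0, 0.0]))  # not in the column space
```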
<span style="color: lime;">**Definition 3.10.**</span> The ==row space== of an $m \times n$ matrix $\mathbf{A}$ is the set whose elements consist of all $n$-dimensional row vectors that are expressible as linear combinations of the $m$ rows of $\mathbf{A}$. Thus, the elements of the row space of $\mathbf{A}$ consist of all $n$-dimensional row vectors of the general form $$x_{1} A_{1}+x_2 A_{2}+\cdots+x_{m} A_{m}.$$

The following theorem relates the column space of a matrix to the row space of its transpose; here $\mathcal{C}(\mathbf{A})$ denotes the column space of $\mathbf{A}$ and $\mathcal{R}(\mathbf{A})$ its row space.

<span style="color: mediumslateblue;">**Theorem 3.15.**</span> For any matrix $\mathbf{A}, \mathbf{y} \in \mathcal{C}(\mathbf{A})$ if and only if $\mathbf{y}^{T} \in \mathcal{R}\left(\mathbf{A}^{T}\right)$.

*Proof*. If $\mathbf{y} \in \mathcal{C}(\mathbf{A})$, then $\mathbf{y}=\mathbf{A x}$ for some column vector $\mathbf{x}$, implying that $\mathbf{y}^{T}=(\mathbf{Ax})^{T}=\mathbf{x}^{T} \mathbf{A}^{T}$ and hence that $\mathbf{y}^{T} \in \mathcal{R}\left(\mathbf{A}^{T}\right)$. An analogous argument can be used to show that if $\mathbf{y}^{T} \in \mathcal{R}\left(\mathbf{A}^{T}\right)$, then $\mathbf{y} \in \mathcal{C}(\mathbf{A})$. $\blacksquare$

<span style="color: lime;">**Definition 3.11.**</span> The ==rank of a matrix $\mathbf{A}$==, denoted by $\operatorname{rank}(\mathbf{A})$, is the number of non-zero rows in the row-reduced echelon matrix $\mathbf{R}$ that is row-equivalent to $\mathbf{A}$.

Similar to elementary row operations, one can also define ==elementary column operations==; using elementary column operations, we can construct a column-reduced echelon matrix $\mathbf{C}$ that is column-equivalent to a matrix $\mathbf{A}$. The number of non-zero columns in this column-reduced echelon matrix $\mathbf{C}$ is called the ==column rank of the matrix $\mathbf{A}$==.
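Numerically, rank is usually computed from a singular value decomposition rather than by row reduction; NumPy's `matrix_rank` does this. The sketch below (an illustration, not part of the text) also uses rank to test the linear independence of Definition 3.8, by flattening each matrix into a vector; it relies on the standard fact that row rank and column rank coincide:

```python
import numpy as np

A = np.array([[ 2.0, -4.0, 0.0, 0.0],
              [-1.0,  2.0, 0.0, 0.0],
              [ 0.0,  0.0, 1.0, 2.0]])

# The second row is -1/2 times the first, so the rank is 2.
assert np.linalg.matrix_rank(A) == 2
# Row rank equals column rank, so transposing does not change the rank.
assert np.linalg.matrix_rank(A.T) == np.linalg.matrix_rank(A)

def is_linearly_independent(mats):
    """Matrices A_1, ..., A_k are linearly independent iff their
    flattened vectors form a stack of rank k (Definition 3.8)."""
    V = np.array([np.asarray(M, dtype=float).ravel() for M in mats])
    return np.linalg.matrix_rank(V) == len(mats)

A1 = np.array([[1.0, 0.0], [0.0, 0.0]])
A2 = np.array([[0.0, 1.0], [0.0, 0.0]])
assert is_linearly_independent([A1, A2])
assert not is_linearly_independent([A1, A2, 2 * A1 - 3 * A2])
```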
An $m \times n$ matrix $\mathbf{A}$ is said to have ==full row rank== if $\operatorname{rank}(\mathbf{A})=m$, that is, if its rank equals the number of rows, and to have ==full column rank== if $\operatorname{rank}(\mathbf{A})=n$. Clearly, an $m \times n$ matrix can have full row rank only if $m \leq n$, that is, only if the number of rows does not exceed the number of columns, and can have full column rank only if $n \leq m$. A matrix is said to be ==non-singular== if it has both full row rank and full column rank. Clearly, any non-singular matrix is square. By definition, an $n \times n$ matrix $\mathbf{A}$ is non-singular if and only if $\operatorname{rank}(\mathbf{A})=n$. An $n \times n$ matrix of rank less than $n$ is said to be singular.

<span style="color: lime;">**Definition 3.12.**</span> The ==null space== of an $m \times n$ matrix $\mathbf{A}$, denoted $\mathcal{N}(\mathbf{A})$, is the set of all vectors $\mathbf{x}$ in $\mathbb{R}^{n}$ such that $\mathbf{Ax}=\mathbf{0}$. Intuitively, these are the solutions of the homogeneous system of linear equations when the coefficient matrix is $\mathbf{A}$.

<span style="color: lime;">**Definition 3.13.**</span> The ==nullity of an $m \times n$ matrix $\mathbf{A}$==, denoted by $\operatorname{nullity}(\mathbf{A})$, is defined as $n-\operatorname{rank}(\mathbf{A})$.

### 3.2.5. Determinants

Associated with any square matrix is a scalar that is known as the ==determinant== of the matrix. As a preliminary to defining the determinant, it is convenient to introduce a convention for classifying various pairs of matrix elements as either positive or negative. Let $\mathbf{A}$ represent an arbitrary $n \times n$ matrix. Consider any pair of elements of $\mathbf{A}$ that do not lie either in the same row or the same column, say $a_{i j}$ and $a_{i'j'}$, where $i' \neq i$ and $j' \neq j$.
The pair is said to be a ==negative pair== if one of the elements is located above and to the right of the other, or equivalently if either $i'>i$ and $j'<j$ or $i'<i$ and $j'>j$. Otherwise, if either $i'>i$ and $j'>j$ or $i'<i$ and $j'<j$, the pair is said to be a ==positive pair==. For example (supposing that $n \geq 4$ ), the pair $a_{34}$ and $a_{22}$ is positive, whereas the pair $a_{34}$ and $a_{41}$ is negative. Note that whether the pair $a_{ij}$ and $a_{i'j'}$ is positive or negative is completely determined by the relative locations of $a_{i j}$ and $a_{i'j'}$ and has nothing to do with whether $a_{i j}$ and $a_{i'j'}$ are positive or negative numbers. Now, consider $n$ elements of $\mathbf{A}$, no two of which lie either in the same row or the same column, say the $i_1 j_1, \ldots, i_n j_n$ th elements (where both $i_1, \ldots, i_n$ and $j_1, \ldots, j_n$ are permutations of the first $n$ positive integers). A total of $\binom{n}{2}$ pairs can be formed from these $n$ elements. The symbol $\sigma_n\left(i_1, j_1 ; \ldots ; i_n, j_n\right)$ is to be used to represent the number of these $\binom{n}{2}$ pairs that are negative pairs. Observe that $\sigma_n\left(i_1, j_1 ; \ldots ; i_n, j_n\right)$ has the following two properties: (i) the value of $\sigma_n\left(i_1, j_1 ; \ldots ; i_n, j_n\right)$ is not affected by permuting its $n$ pairs of arguments; in particular, it is not affected if the $n$ pairs are permuted so that they are ordered by row number or by column number. For example, $\sigma_3(2,3 ; 1,2 ; 3,1)= \sigma_3(1,2 ; 2,3 ; 3,1)=\sigma_3(3,1 ; 1,2 ; 2,3)$. (ii) $\sigma_n\left(i_1, j_1 ; \ldots ; i_n, j_n\right)=\sigma_n\left(j_1, i_1 ; \ldots ; j_n, i_n\right)$. For example, $\sigma_3(2,3 ; 1,2 ; 3,1)=\sigma_3(3,2 ; 2,1 ; 1,3)$. 
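The counting function $\sigma_n$ and its two properties can be sketched directly in code (an illustration, not part of the text):

```python
from itertools import combinations

def sigma(cells):
    """sigma_n: the number of negative pairs among n cells (i, j), no two
    of which share a row or a column."""
    count = 0
    for (i1, j1), (i2, j2) in combinations(cells, 2):
        # Negative pair: one element lies above and to the right of the other.
        if (i1 - i2) * (j1 - j2) < 0:
            count += 1
    return count

cells = [(2, 3), (1, 2), (3, 1)]
assert sigma(cells) == 2
# Property (i): invariant under permuting the argument pairs.
assert sigma(cells) == sigma([(1, 2), (2, 3), (3, 1)])
# Property (ii): invariant under swapping row and column indices.
assert sigma(cells) == sigma([(j, i) for (i, j) in cells])
```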
For any sequence of $n$ distinct integers $i_1, \ldots, i_n$, define $$ \phi_n\left(i_1, \ldots, i_n\right)=p_1+\cdots+p_{n-1}, $$ where $p_k$ represents the number of integers in the subsequence $i_{k+1}, \ldots, i_n$ that are smaller than $i_k(k=1, \ldots, n-1)$. For example, $$ \phi_5(3,7,2,1,4)=2+3+1+0=6 . $$ Then, clearly, for any permutation $i_1, \ldots, i_n$ of the first $n$ positive integers, $$ \sigma_n\left(1, i_1 ; \ldots ; n, i_n\right)=\sigma_n\left(i_1, 1 ; \ldots ; i_n, n\right)=\phi_n\left(i_1, \ldots, i_n\right) . $$ <span style="color: lime;">**Definition 3.14.**</span> The ==determinant of an $n \times n$ matrix $\mathbf{A}$==, to be denoted by $\operatorname{det}(\mathbf{A})$, is defined by $$\operatorname{det}(\mathbf{A})=\sum(-1)^{\sigma_n\left(1, j_1 ; \ldots ; n, j_n\right)} a_{1 j_1} \cdots a_{n j_n}, $$ or equivalently by $$\operatorname{det}(\mathbf{A})=\sum(-1)^{\phi_n\left(j_1, \ldots, j_n\right)} a_{1 j_1} \cdots a_{n j_n},$$ where $j_1, \ldots, j_n$ is a permutation of the first $n$ positive integers and the summation is over all such permutations. Thus, the determinant of an $n \times n$ matrix $\mathbf{A}$ can (at least in principle) be obtained via the following process: (i) Form all possible products, each of $n$ factors, that can be obtained by picking one and only one element from each row and column of $\mathbf{A}$. (ii) In each product, count the number of negative pairs among the $\binom{n}{2}$ pairs of elements that can be generated from the $n$ elements that contribute to this particular product. If the number of negative pairs is an even number, attach a plus sign to the product; if it is an odd number, attach a minus sign. (iii) Sum the signed products.
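The three-step process above can be implemented verbatim for small matrices. The sketch below (assuming NumPy and the standard library; an illustration, not part of the text) computes $\phi_n$ by counting inversions, sums the signed products over all permutations, and compares the result with `numpy.linalg.det`:

```python
import numpy as np
from itertools import permutations

def phi(js):
    """phi_n(j_1,...,j_n): for each j_k, count the later entries that are
    smaller, and sum the counts (as in the text)."""
    return sum(sum(1 for j in js[k + 1:] if j < js[k])
               for k in range(len(js) - 1))

def det_by_definition(A):
    """Definition 3.14 taken literally: sum the signed products over all
    permutations. O(n * n!), so practical only for small n."""
    n = A.shape[0]
    return sum((-1) ** phi(p) * np.prod([A[i, p[i]] for i in range(n)])
               for p in permutations(range(n)))

assert phi((3, 7, 2, 1, 4)) == 6        # the worked example in the text

A = np.array([[2.0, -3.0, 1.0],
              [4.0,  0.0, 5.0],
              [1.0,  2.0, 3.0]])
assert np.isclose(det_by_definition(A), np.linalg.det(A))
```

Note that `phi` works equally well on `range(n)` permutations as on permutations of $1,\ldots,n$, since only the relative order of the entries matters.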
In particular, the determinant of a $1 \times 1$ matrix $\mathbf{A}=\left(a_{11}\right)$ is $$ \operatorname{det}(\mathbf{A})=a_{11} ; $$ the determinant of a $2 \times 2$ matrix $\mathbf{A}=\left\{a_{i j}\right\}$ is $$ \begin{aligned} \operatorname{det}(\mathbf{A}) & =(-1)^0 a_{11} a_{22}+(-1)^1 a_{12} a_{21} \\ & =a_{11} a_{22}-a_{12} a_{21}; \end{aligned} $$ and the determinant of a $3 \times 3$ matrix $\mathbf{A}=\left\{a_{i j}\right\}$ is $$ \begin{aligned} \operatorname{det}(\mathbf{A})= & (-1)^0 a_{11} a_{22} a_{33}+(-1)^1 a_{11} a_{23} a_{32}+(-1)^1 a_{12} a_{21} a_{33} \\ & +(-1)^2 a_{12} a_{23} a_{31}+(-1)^2 a_{13} a_{21} a_{32}+(-1)^3 a_{13} a_{22} a_{31}. \end{aligned} $$

<span style="color: mediumslateblue;">**Theorem 3.17.**</span> If an $n \times n$ matrix $\mathbf{A}=\left\{a_{i j}\right\}$ is (upper or lower) triangular, then $$ \operatorname{det}(\mathbf{A})=a_{11}a_{22}\cdots a_{nn},$$ that is, the determinant of a triangular matrix equals the product of its diagonal elements.

*Proof*. Consider a lower triangular matrix $$ \mathbf{A}=\left(\begin{array}{llll} a_{11} & 0 & \ldots & 0 \\ a_{21} & a_{22} & & \\ \vdots & \vdots & \ddots & \\ a_{n 1} & a_{n 2} & \cdots & a_{n n} \end{array}\right) . $$ That $\operatorname{det}(\mathbf{A})=a_{11} a_{22} \cdots a_{n n}$ follows immediately upon observing that the only term in the determinant formula that can be nonzero is that corresponding to the permutation $j_1=1,\ldots,j_n=n$ [and that $\phi_n(1,2, \ldots, n)=0$]. (To verify formally that only this one term can be nonzero, let $j_1, \ldots, j_n$ represent an arbitrary permutation of the first $n$ positive integers, and suppose that $a_{1 j_1} \cdots a_{n j_n} \neq 0$ or equivalently that $a_{i j_i} \neq 0$ for $i=1, \ldots, n$. Then, it is clear that $j_1=1$ and that, if $j_1= 1, \ldots, j_{i-1}=i-1$, then $j_i=i$. We conclude, on the basis of mathematical induction, that $j_1=1, \ldots, j_n=n$.) An analogous argument applies when $\mathbf{A}$ is upper triangular. $\blacksquare$
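A quick numerical confirmation of Theorem 3.17 with NumPy (an illustration, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(3)
T = np.tril(rng.standard_normal((6, 6)))   # a random lower triangular matrix

# Theorem 3.17: the determinant equals the product of the diagonal elements.
assert np.isclose(np.linalg.det(T), np.prod(np.diag(T)))

# The same holds for the upper triangular matrix T^T, which has the same diagonal.
assert np.isclose(np.linalg.det(T.T), np.prod(np.diag(T)))
```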
<span style="color: mediumslateblue;">**Theorem 3.18.**</span> For any $n \times n$ matrix $\mathbf{A}$, $$ \operatorname{det}\left(\mathbf{A}^{T}\right)=\operatorname{det}(\mathbf{A}).$$ *Proof*. Let $a_{ij}$ and $b_{ij}$ represent the $ij$th elements of $\mathbf{A}$ and $\mathbf{A}^{T}$, respectively $(i, j=1, \ldots, n)$. Then, since $b_{ij}=a_{ji}$, $$ \begin{aligned} \operatorname{det}\left(\mathbf{A}^{T}\right) & =\sum(-1)^{\phi_n\left(j_1, \ldots, j_n\right)} b_{1 j_1} \cdots b_{n j_n} \\ & =\sum(-1)^{\phi_n\left(j_1, \ldots, j_n\right)} a_{j_1 1} \cdots a_{j_n n} \\ & =\operatorname{det}(\mathbf{A}), \end{aligned} $$ where $j_1, \ldots, j_n$ is a permutation of the first $n$ positive integers and the summations are over all such permutations, and where the final equality follows from the fact that $\sigma_n\left(j_1, 1 ; \ldots ; j_n, n\right)=\phi_n\left(j_1, \ldots, j_n\right)$. $\blacksquare$ As an immediate consequence of the definition of a determinant, we have the following theorem. <span style="color: mediumslateblue;">**Theorem 3.19.**</span> If an $n \times n$ matrix $\mathbf{B}$ is formed from an $n \times n$ matrix $\mathbf{A}$ by multiplying all of the elements of one row or one column of $\mathbf{A}$ by the same scalar $k$ (and leaving the elements of the other $n-1$ rows or columns unchanged), then $$ \operatorname{det}(\mathbf{B})=k\operatorname{det}(\mathbf{A}) . $$ As a corollary of Theorem 3.19, we obtain the following result. <span style="color: mediumslateblue;">**Corollary 3.7.**</span> If one or more rows (or columns) of an $n \times n$ matrix $\mathbf{A}$ are null, then $$ \operatorname{det}(\mathbf{A})=0.$$ *Proof*. Suppose that the $i$ th row of $\mathbf{A}$ is null, and let $\mathbf{B}$ represent the $n \times n$ matrix formed from $\mathbf{A}$ by multiplying all of the elements of the $i$ th row of $\mathbf{A}$ by zero. Then, $\mathbf{A}=\mathbf{B}$, and we find that $$\operatorname{det}(\mathbf{A})=\operatorname{det}(\mathbf{B})=0\operatorname{det}(\mathbf{A})=0.
\quad \blacksquare$$ The following corollary (of Theorem 3.19) relates the determinant of a scalar multiple of a matrix $\mathbf{A}$ to that of $\mathbf{A}$ itself. <span style="color: mediumslateblue;">**Corollary 3.8.**</span> For any $n \times n$ matrix $\mathbf{A}$ and any scalar $k$, $$\operatorname{det}(k\mathbf{A})=k^n\operatorname{det}(\mathbf{A}).$$ *Proof*. This result follows from Theorem 3.19 upon observing that $k \mathbf{A}$ can be formed from $\mathbf{A}$ by successively multiplying each of the $n$ rows of $\mathbf{A}$ by $k$. $\blacksquare$ As a special case of Corollary 3.8, we have the following additional corollary. <span style="color: mediumslateblue;">**Corollary 3.9.**</span> For any $n \times n$ matrix $\mathbf{A}$, $$ \operatorname{det}(-\mathbf{A})=(-1)^n\operatorname{det}(\mathbf{A}).$$ The following theorem describes how the determinant of a matrix is affected by interchanging its rows or columns. <span style="color: mediumslateblue;">**Theorem 3.20.**</span> If an $n \times n$ matrix $\mathbf{B}=\left\{b_{i j}\right\}$ is formed from an $n \times n$ matrix $\mathbf{A}=\left\{a_{i j}\right\}$ by interchanging two rows or two columns of $\mathbf{A}$, then $$ \operatorname{det}(\mathbf{B})=-\operatorname{det}(\mathbf{A}).$$ *Proof*. Consider first the case where $\mathbf{B}$ is formed from $\mathbf{A}$ by interchanging two adjacent rows, say the $i$ th and $(i+1)$ th rows.
Then, $$ \begin{aligned} \operatorname{det}(\mathbf{B})= & \sum(-1)^{\phi_n\left(j_1, \ldots, j_n\right)} b_{1 j_1} \cdots b_{n j_n} \\ = & \sum(-1)^{\phi_n\left(j_1, \ldots, j_n\right)} a_{1 j_1} \cdots a_{i-1, j_{i-1}} a_{i+1, j_i} a_{i, j_{i+1}} a_{i+2, j_{i+2}} \cdots a_{n j_n} \\ = & -\sum(-1)^{\phi_n\left(j_1, \ldots, j_{i-1}, j_{i+1}, j_i, j_{i+2}, \ldots, j_n\right)} \\ & \quad \times a_{1 j_1} \cdots a_{i-1, j_{i-1}} a_{i, j_{i+1}} a_{i+1, j_i} a_{i+2, j_{i+2}} \cdots a_{n j_n} \\ & \quad\left[\text{since } \phi_n\left(j_1, \ldots, j_{i-1}, j_{i+1}, j_i, j_{i+2}, \ldots, j_n\right)\right. \\ & \quad=\phi_n\left(j_1, \ldots, j_n\right)+1, \text { if } j_{i+1}>j_i \\ & \left.\quad=\phi_n\left(j_1, \ldots, j_n\right)-1, \text { if } j_{i+1}<j_i\right] \\ = & -\operatorname{det}(\mathbf{A}), \end{aligned} $$ where $j_1, \ldots, j_n$ (and hence $j_1, \ldots, j_{i-1}, j_{i+1}, j_i, j_{i+2}, \ldots, j_n$ ) is a permutation of the first $n$ positive integers and the summation is over all such permutations. An interchange of two nonadjacent rows can be accomplished via an odd number of successive interchanges of adjacent rows, so the conclusion extends to that case; the result for columns follows upon applying the result for rows to $\mathbf{A}^{T}$, in light of Theorem 3.18. $\blacksquare$ Let $\mathbf{A}$ represent an $n \times n$ matrix. Let $\mathbf{A}_{i j}$ represent the $(n-1) \times(n-1)$ submatrix of $\mathbf{A}$ obtained by striking out the row and the column that contain the element $a_{i j}$, that is, the $i$ th row and the $j$ th column. The determinant $\left|\mathbf{A}_{i j}\right|$ of this submatrix is called the ==minor== of the element $a_{i j}$. The *signed* minor $(-1)^{i+j}\left|\mathbf{A}_{i j}\right|$ is called the ==cofactor of $a_{ij}$==. The determinant of $\mathbf{A}$ can be expanded in terms of the cofactors of the $n$ elements of any particular row or column of $\mathbf{A}$, as described in the following theorem. <span style="color: mediumslateblue;">**Theorem 3.21.**</span> Let $a_{i j}$ represent the $i j$ th element of an $n \times n$ matrix $\mathbf{A}$, and let $\alpha_{i j}$ represent the cofactor of $a_{i j}$ $(i, j=1, \ldots, n)$.
Then, for $i=1, \ldots, n$, $$\operatorname{det}(\mathbf{A})=\sum_{j=1}^n a_{i j} \alpha_{i j}=a_{i 1} \alpha_{i 1}+\cdots+a_{i n} \alpha_{i n} \tag{5.1}$$ and, for $j=1, \ldots, n$, $$\operatorname{det}(\mathbf{A})=\sum_{i=1}^n a_{i j} \alpha_{i j}=a_{1 j} \alpha_{1 j}+\cdots+a_{n j} \alpha_{n j}. \tag{5.2}$$ *Proof*. Let $\mathbf{A}_{ij}$ represent the matrix obtained by striking out the $i$ th row and the $j$ th column of $\mathbf{A}$ $(i, j=1, \ldots, n)$. Consider first result (5.1) for the case $i=1$. Denote by $a_{t s}^{(j)}$ the $t s$ th element of the matrix $\mathbf{A}_{1 j}$ $(t, s=1, \ldots, n-1 ; j=1, \ldots, n)$. We find that $$ \operatorname{det}(\mathbf{A})=\sum(-1)^{\phi_n\left(k_1, \ldots, k_n\right)} a_{1 k_1} \cdots a_{n k_n} $$ (where $k_1, \ldots, k_n$ is a permutation of the first $n$ positive integers and the summation is over all such permutations) $$ \begin{aligned} =a_{11} & \sum(-1)^{\phi_{n-1}\left(k_2, \ldots, k_n\right)} a_{2 k_2} \cdots a_{n k_n}+\cdots \\ & +a_{1 j} \sum(-1)^{j-1+\phi_{n-1}\left(k_2, \ldots, k_n\right)} a_{2 k_2} \cdots a_{n k_n}+\cdots \\ & +a_{1 n} \sum(-1)^{n-1+\phi_{n-1}\left(k_2, \ldots, k_n\right)} a_{2 k_2} \cdots a_{n k_n} \end{aligned} $$ (where, in the $j$ th of the $n$ sums, $k_2, \ldots, k_n$ is a permutation of the $n-1$ integers $1, \ldots, j-1, j+1, \ldots, n$ and the summation is over all such permutations) $$ \begin{aligned} =a_{11} & \sum(-1)^{\phi_{n-1}\left(s_1, \ldots, s_{n-1}\right)} a_{1 s_1}^{(1)} \cdots a_{n-1, s_{n-1}}^{(1)}+\cdots \\ & +a_{1 j}(-1)^{j-1} \sum(-1)^{\phi_{n-1}\left(s_1, \ldots, s_{n-1}\right)} a_{1 s_1}^{(j)} \cdots a_{n-1, s_{n-1}}^{(j)}+\cdots \\ & +a_{1 n}(-1)^{n-1} \sum(-1)^{\phi_{n-1}\left(s_1, \ldots, s_{n-1}\right)} a_{1 s_1}^{(n)} \cdots a_{n-1, s_{n-1}}^{(n)} \end{aligned} $$ (where, in each of the $n$ sums, $s_1, \ldots, s_{n-1}$ is a permutation of the first $n-1$ positive integers and the summation is over all such permutations) $$ \begin{aligned} & =\sum_{j=1}^n a_{1 j}(-1)^{j-1}\left|\mathbf{A}_{1 j}\right| \\ & =\sum_{j=1}^n a_{1 j}(-1)^{j+1}\left|\mathbf{A}_{1 j}\right| \\ & =\sum_{j=1}^n a_{1 j} \alpha_{1 j}. \end{aligned} $$ Consider now result (5.1) for the case $i>1$. Let $\mathbf{B}$ represent the $n \times n$ matrix whose first row is the $i$ th row of $\mathbf{A}$, whose second, $\ldots, i$ th rows are the first, $\ldots,(i-1)$ th rows, respectively, of $\mathbf{A}$, and whose $(i+1), \ldots, n$th rows are the same as those of $\mathbf{A}$. Observe that $\mathbf{A}$ can be obtained from $\mathbf{B}$ by successively interchanging the first row of $\mathbf{B}$ with its second, $\ldots, i$ th rows, so that, according to Theorem 3.20, $$ \operatorname{det}(\mathbf{A})=(-1)^{i-1}\operatorname{det}(\mathbf{B}) . $$ Let $\mathbf{B}_{1 j}$ represent the matrix obtained by striking out the first row and the $j$ th column of $\mathbf{B}$, and let $b_{1 j}$ represent the $j$ th element of the first row of $\mathbf{B}$ $(j=1, \ldots, n)$. Then, $\mathbf{B}_{1 j}=\mathbf{A}_{i j}$ $(j=1, \ldots, n)$, and we find that $$ \begin{aligned} \operatorname{det}(\mathbf{A})=(-1)^{i-1}\operatorname{det}(\mathbf{B}) & =(-1)^{i-1} \sum_{j=1}^n b_{1 j}(-1)^{1+j}\left|\mathbf{B}_{1 j}\right| \\ & =(-1)^{i-1} \sum_{j=1}^n a_{i j}(-1)^{1+j}\left|\mathbf{A}_{i j}\right| \\ & =\sum_{j=1}^n a_{i j} \alpha_{i j}. \end{aligned} $$ Finally, consider result (5.2). Observe that the matrix obtained by striking out the $j$ th row and the $i$ th column of $\mathbf{A}^{T}$ is $\mathbf{A}_{i j}^{T}$ and hence that the cofactor of the $j i$ th element of $\mathbf{A}^{T}$ is $$ (-1)^{j+i}\left|\mathbf{A}_{i j}^{T}\right|=(-1)^{i+j}\left|\mathbf{A}_{i j}\right|=\alpha_{i j} $$ $(i, j=1, \ldots, n)$.
Thus, since the $j i$ th element of $\mathbf{A}^{T}$ is the $i j$ th element of $\mathbf{A}$ $(i, j=1, \ldots, n)$, it follows from result (5.1) (and from Theorem 3.18) that $$ \operatorname{det}(\mathbf{A})=\operatorname{det}\left(\mathbf{A}^{T}\right)=\sum_{i=1}^n a_{i j} \alpha_{i j} $$ $(j=1, \ldots, n)$. $\blacksquare$ Note that if the $j$ th element $a_{i j}$ of the $i$ th row of $\mathbf{A}$ equals zero, then the $j$ th term in formula (5.1) also equals zero. Similarly, if $a_{i j}=0$, then the $i$ th term in formula (5.2) equals zero. Thus, if the $n \times n$ matrix $\mathbf{A}$ contains a row or column that includes many zeros, then formula (5.1) or (5.2) can be used to re-express $\operatorname{det}(\mathbf{A})$ in terms of the determinants of a relatively small number of $(n-1) \times(n-1)$ submatrices of $\mathbf{A}$. The following theorem can be regarded as a companion to Theorem 3.21. <span style="color: mediumslateblue;">**Theorem 3.22.**</span> Let $a_{i j}$ represent the $i j$ th element of an $n \times n$ matrix $\mathbf{A}$, and let $\alpha_{i j}$ represent the cofactor of $a_{i j}$ $(i, j=1, \ldots, n)$. Then, for $i^{\prime} \neq i=1, \ldots, n$, $$\sum_{j=1}^n a_{i j} \alpha_{i^{\prime} j}=a_{i 1} \alpha_{i^{\prime} 1}+\cdots+a_{i n} \alpha_{i^{\prime} n}=0 \tag{5.3}$$ and, for $j^{\prime} \neq j=1, \ldots, n$, $$ \sum_{i=1}^n a_{i j} \alpha_{i j^{\prime}}=a_{1 j} \alpha_{1 j^{\prime}}+\cdots+a_{n j} \alpha_{n j^{\prime}}=0. \tag{5.4}$$ *Proof*. Consider result (5.3). Let $\mathbf{B}$ represent a matrix whose $i^{\prime}$ th row equals the $i$ th row of $\mathbf{A}$ and whose first, $\ldots,\left(i^{\prime}-1\right),\left(i^{\prime}+1\right), \ldots, n$th rows are identical to those of $\mathbf{A}$ (where $i^{\prime} \neq i$ ). Observe that the $i^{\prime}$ th row of $\mathbf{B}$ is a duplicate of its $i$ th row and hence, by Theorem 3.20, that $\operatorname{det}(\mathbf{B})=0$ (interchanging the two identical rows leaves $\mathbf{B}$ unchanged but reverses the sign of its determinant, so that $\operatorname{det}(\mathbf{B})=-\operatorname{det}(\mathbf{B})$). Let $b_{k j}$ represent the $k j$ th element of $\mathbf{B}$ $(k, j=1, \ldots, n)$.
Clearly, the cofactor of $b_{i^{\prime} j}$ is the same as that of $a_{i^{\prime} j}$ $(j=1, \ldots, n)$. Thus, making use of Theorem 3.21, we find that $$ \sum_{j=1}^n a_{i j} \alpha_{i^{\prime} j}=\sum_{j=1}^n b_{i^{\prime} j} \alpha_{i^{\prime} j}=\operatorname{det}(\mathbf{B})=0, $$ which establishes result (5.3). Result (5.4) can be proved via an analogous argument. $\blacksquare$ For any $n \times n$ matrix $\mathbf{A}=\left\{a_{i j}\right\}$, the $n \times n$ matrix whose $i j$ th element is the cofactor $\alpha_{i j}$ of $a_{i j}$ is called the ==matrix of cofactors== (or cofactor matrix) of $\mathbf{A}$. The transpose of this matrix is called the ==adjoint matrix== of $\mathbf{A}$ and is denoted by the symbol $\operatorname{adj} \mathbf{A}$ or $\operatorname{adj}(\mathbf{A})$. Thus, $$ \operatorname{adj} \mathbf{A}=\left(\begin{array}{ccc} \alpha_{11} & \ldots & \alpha_{1 n} \\ \vdots & \ddots & \vdots \\ \alpha_{n 1} & \ldots & \alpha_{n n} \end{array}\right)^{T}=\left(\begin{array}{ccc} \alpha_{11} & \ldots & \alpha_{n 1} \\ \vdots & \ddots & \vdots \\ \alpha_{1 n} & \ldots & \alpha_{n n} \end{array}\right) . $$ If $\mathbf{A}$ is symmetric, then the matrix of cofactors is also symmetric (or, equivalently, the adjoint matrix equals the matrix of cofactors), as is easily verified. There is a close relationship between the adjoint of a nonsingular matrix $\mathbf{A}$ and the inverse of $\mathbf{A}$, as is evident from the following theorem and is made explicit in its corollary. <span style="color: mediumslateblue;">**Theorem 3.23.**</span> For an $n \times n$ matrix $\mathbf{A}$, $$\mathbf{A} \operatorname{adj}(\mathbf{A})=(\operatorname{adj} \mathbf{A}) \mathbf{A}=\operatorname{det}(\mathbf{A}) \mathbf{I}_n.$$ *Proof*. Let $a_{i j}$ represent the $i j$ th element of $\mathbf{A}$, and let $\alpha_{i j}$ represent the cofactor of $a_{i j}$ $(i, j=1, \ldots, n)$.
Then, the $i i^{\prime}$ th element of the matrix product $\mathbf{A} \operatorname{adj}(\mathbf{A})$ is $\sum_{j=1}^n a_{i j} \alpha_{i^{\prime} j}$ $\left(i, i^{\prime}=1, \ldots, n\right)$. According to Theorems 3.21 and 3.22, $$ \begin{aligned} \sum_{j=1}^n a_{i j} \alpha_{i^{\prime} j} & =\operatorname{det}(\mathbf{A}), \quad \text { if } \quad i^{\prime}=i, \\ & =0, \quad \text { if } \quad i^{\prime} \neq i . \end{aligned} $$ Thus, $\mathbf{A} \operatorname{adj}(\mathbf{A})=\operatorname{det}(\mathbf{A}) \mathbf{I}$. That $(\operatorname{adj} \mathbf{A}) \mathbf{A}=\operatorname{det}(\mathbf{A}) \mathbf{I}$ can be established via a similar argument. $\blacksquare$ <span style="color: mediumslateblue;">**Corollary 3.10.**</span> If $\mathbf{A}$ is an $n \times n$ nonsingular matrix, then $$\operatorname{adj}(\mathbf{A})=\operatorname{det}(\mathbf{A})\mathbf{A}^{-1}$$ or equivalently $$ \mathbf{A}^{-1}=\left(\dfrac{1}{\operatorname{det}(\mathbf{A})}\right) \operatorname{adj}(\mathbf{A}).$$ ### 3.2.5. Eigenvalues and Eigenvectors To explain eigenvalues, we first explain eigenvectors. Almost all vectors change direction when they are multiplied by an $n \times n$ matrix $\mathbf{A}$. Certain exceptional vectors $\mathbf{x}$, however, point in the same direction as $\mathbf{Ax}$. Those are the ==eigenvectors==. Multiply an eigenvector by $\mathbf{A}$, and the vector $\mathbf{Ax}$ is a number $\lambda$ times the original $\mathbf{x}$. > Example.
Let $$ \mathbf{A}=\left(\begin{array}{rrr} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{array}\right) $$ First, compute the product $\mathbf{Ax}$ for $$ \mathbf{x}=\left(\begin{array}{r} -5 \\ -4 \\ 3 \end{array}\right) $$ This product is given by $$ \mathbf{Ax}=\left[\begin{array}{rrr} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{array}\right]\left[\begin{array}{r} -5 \\ -4 \\ 3 \end{array}\right]=\left[\begin{array}{r} -50 \\ -40 \\ 30 \end{array}\right]=10\left[\begin{array}{r} -5 \\ -4 \\ 3 \end{array}\right] $$ In this case, the product $\mathbf{Ax}$ is a vector equal to $10$ times the vector $\mathbf{x}$. In other words, $\mathbf{Ax}=10\mathbf{x}$. Next, let's see what happens when we compute $\mathbf{Ax}$ for the vector $$ \mathbf{x}=\left[\begin{array}{l} 1 \\ 0 \\ 0 \end{array}\right] $$ This product is given by $$ \mathbf{Ax}=\left[\begin{array}{rrr} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{array}\right]\left[\begin{array}{l} 1 \\ 0 \\ 0 \end{array}\right]=\left[\begin{array}{l} 0 \\ 0 \\ 0 \end{array}\right]=0\left[\begin{array}{l} 1 \\ 0 \\ 0 \end{array}\right] $$ In this case, the product $\mathbf{Ax}$ is a vector equal to 0 times the vector $\mathbf{x}$; that is, $\mathbf{Ax}=0\mathbf{x}$. Perhaps this matrix is such that $\mathbf{Ax}$ results in $\lambda\mathbf{x}$ for every vector $\mathbf{x}$. However, consider $$ \left[\begin{array}{rrr} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{array}\right]\left[\begin{array}{r} 1 \\ 1 \\ 1 \end{array}\right]=\left[\begin{array}{r} -5 \\ 38 \\ -11 \end{array}\right] $$ In this case, $\mathbf{Ax}$ did not result in a vector of the form $\lambda\mathbf{x}$ for any scalar $\lambda$. There is something special about the first two products calculated in this example. Notice that for each of them, $\mathbf{Ax}=k\mathbf{x}$ where $k$ is some scalar. When this equation holds for some non-zero vector $\mathbf{x}$ and some scalar $k$, we call $k$ an eigenvalue of $\mathbf{A}$.
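The three products above are easy to reproduce numerically. A quick NumPy check (mine, not part of the original example):

```python
import numpy as np

# The matrix from the example above.
A = np.array([[0, 5, -10],
              [0, 22, 16],
              [0, -9, -2]])

x1 = np.array([-5, -4, 3])   # A @ x1 equals 10 * x1  (eigenvalue 10)
x2 = np.array([1, 0, 0])     # A @ x2 equals  0 * x2  (eigenvalue 0)
x3 = np.array([1, 1, 1])     # not a scalar multiple of x3

print(A @ x1)   # [-50 -40  30]
print(A @ x2)   # [0 0 0]
print(A @ x3)   # [-5 38 -11]
```

Only the first two vectors come back as scalar multiples of themselves.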
We often use the special symbol $\lambda$ instead of $k$ when referring to eigenvalues. In the example above, the values 10 and 0 are eigenvalues of the matrix $\mathbf{A}$, and we can label them as $\lambda_1=10$ and $\lambda_2=0$. <span style="color: lime;">**Definition 3.15.**</span> An ==eigenvector== of an $n \times n$ matrix $\mathbf{A}$ and the associated ==eigenvalue== are a non-zero vector $\mathbf{x}$ and a number $\lambda \in \mathbb{R}$ such that $\mathbf{Ax}=\lambda\mathbf{x}$. <span style="color: mediumslateblue;">**Theorem 3.24.**</span> Let $\mathbf{A}$ be an $n \times n$ matrix and let $\lambda$ be a scalar. The following are equivalent. (i) $\lambda$ is an eigenvalue of $\mathbf{A}$. (ii) The matrix $\mathbf{A}-\lambda\mathbf{I}$ is singular. (iii) $\operatorname{det}(\mathbf{A}-\lambda\mathbf{I})=0$. Based on Theorem 3.24, we will now look at how to find the eigenvalues and eigenvectors of an $n \times n$ matrix $\mathbf{A}$ in detail. (i) First, find the eigenvalues $\lambda$ of $\mathbf{A}$ by solving the equation $\operatorname{det}(\lambda \mathbf{I}-\mathbf{A})=0$. (ii) For each $\lambda$, find the basic eigenvectors $\mathbf{x} \neq \mathbf{0}$ by finding the basic solutions to $(\lambda\mathbf{I}-\mathbf{A})\mathbf{x}=\mathbf{0}$. > Example. In this example, we find the eigenvalues and eigenvectors of the matrix $$ \mathbf{A}=\left(\begin{array}{rrr} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{array}\right) $$ First, we need to find the eigenvalues of $\mathbf{A}$.
Recall that they are the solutions of the equation $$ \operatorname{det}(\lambda\mathbf{I}-\mathbf{A})=0 $$ In this case the equation is $$ \operatorname{det}\left(\lambda\left(\begin{array}{lll} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right)-\left(\begin{array}{rrr} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{array}\right)\right)=0 $$ which becomes $$ \operatorname{det}\left(\begin{array}{ccc} \lambda-5 & 10 & 5 \\ -2 & \lambda-14 & -2 \\ 4 & 8 & \lambda-6 \end{array}\right)=0 $$ Expanding this determinant (for instance, by cofactors along the first row) yields the equation $$ (\lambda-5)\left(\lambda^2-20 \lambda+100\right)=0.$$ Solving this equation, we find that the eigenvalues are $\lambda_1=5, \lambda_2=10$, and $\lambda_3=10$. Notice that 10 is a root of multiplicity two, since $$ \lambda^{2}-20\lambda+100=(\lambda-10)^{2}.$$ Therefore, $\lambda_2=10$ is an eigenvalue of multiplicity two. Now that we have found the eigenvalues of $\mathbf{A}$, we can compute the eigenvectors. First, we will find the basic eigenvectors for $\lambda_{1}=5$. In other words, we want to find all non-zero vectors $\mathbf{x}$ so that $\mathbf{Ax}=5\mathbf{x}$. This requires that we solve the equation $(5\mathbf{I}-\mathbf{A})\mathbf{x}=\mathbf{0}$ for $\mathbf{x}$ as follows. $$ \left(5\left(\begin{array}{lll} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right)-\left(\begin{array}{rrr} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{array}\right)\right)\left(\begin{array}{l} x_{1} \\ x_{2} \\ x_{3} \end{array}\right)=\left(\begin{array}{l} 0 \\ 0 \\ 0 \end{array}\right) $$ Equivalently, we need to find the solution to $$ \left(\begin{array}{rrr} 0 & 10 & 5 \\ -2 & -9 & -2 \\ 4 & 8 & -1 \end{array}\right)\left(\begin{array}{l} x_{1} \\ x_{2} \\ x_{3} \end{array}\right)=\left(\begin{array}{l} 0 \\ 0 \\ 0 \end{array}\right) $$ To do so, we set up the augmented matrix and row reduce it to get the solution.
Thus the matrix you must row reduce is $$ \left(\begin{array}{rrr|r} 0 & 10 & 5 & 0 \\ -2 & -9 & -2 & 0 \\ 4 & 8 & -1 & 0 \end{array}\right) $$ The row-reduced echelon form is $$ \left(\begin{array}{rrr|r} 1 & 0 & -\frac{5}{4} & 0 \\ 0 & 1 & \frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) $$ and the solution is any vector of the form $$ \left(\begin{array}{c} \frac{5}{4}t \\ -\frac{1}{2}t \\ t \end{array}\right)=t\left(\begin{array}{r} \frac{5}{4} \\ -\frac{1}{2} \\ 1 \end{array}\right) $$ where $t \in \mathbb{R}$. Multiplying this vector by 4, we obtain a simpler description of the solution to this system, given by $$ t\left(\begin{array}{r} 5 \\ -2 \\ 4 \end{array}\right) $$ where $t \in \mathbb{R}$. Here, the basic eigenvector is given by $$ \mathbf{x}=\left(\begin{array}{r} 5 \\ -2 \\ 4 \end{array}\right) $$ Notice that we cannot let $t=0$ here, because this would result in the zero vector, and eigenvectors are never equal to $\mathbf{0}$! Every other choice of $t$ in the above equation results in an eigenvector. Next, we will find the basic eigenvectors for $\lambda_2, \lambda_3=10$.
These vectors are the basic solutions to the equation $$ \left(10\left(\begin{array}{lll} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right)-\left(\begin{array}{rrr} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{array}\right)\right)\left(\begin{array}{l} x_{1} \\ x_{2} \\ x_{3} \end{array}\right)=\left(\begin{array}{l} 0 \\ 0 \\ 0 \end{array}\right) $$ That is, you must find the solutions to $$ \left(\begin{array}{rrr} 5 & 10 & 5 \\ -2 & -4 & -2 \\ 4 & 8 & 4 \end{array}\right)\left(\begin{array}{l} x_{1} \\ x_{2} \\ x_{3} \end{array}\right)=\left(\begin{array}{l} 0 \\ 0 \\ 0 \end{array}\right) $$ Consider the augmented matrix $$ \left(\begin{array}{rrr|r} 5 & 10 & 5 & 0 \\ -2 & -4 & -2 & 0 \\ 4 & 8 & 4 & 0 \end{array}\right) $$ The row-reduced echelon form for this matrix is $$ \left(\begin{array}{lll|l} 1 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) $$ and so the eigenvectors are of the form $$ \left(\begin{array}{c} -2 s-t \\ s \\ t \end{array}\right)=s\left(\begin{array}{r} -2 \\ 1 \\ 0 \end{array}\right)+t\left(\begin{array}{r} -1 \\ 0 \\ 1 \end{array}\right) $$ Note that you cannot pick $t$ and $s$ both equal to zero, because this would result in the zero vector, and eigenvectors are never equal to zero. Here, there are two basic eigenvectors, given by $$ \mathbf{x}'=\left(\begin{array}{r} -2 \\ 1 \\ 0 \end{array}\right), \quad \mathbf{x}''=\left(\begin{array}{r} -1 \\ 0 \\ 1 \end{array}\right) $$ Taking any non-zero linear combination of $\mathbf{x}'$ and $\mathbf{x}''$ will also result in an eigenvector for the eigenvalue $\lambda=10$. As in the case of $\lambda=5$, always check your work! For the first basic eigenvector, we can check $\mathbf{A}\mathbf{x}'=10\mathbf{x}'$ as follows.
$$ \left(\begin{array}{rrr} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{array}\right)\left(\begin{array}{r} -2 \\ 1 \\ 0 \end{array}\right)=\left(\begin{array}{r} -20 \\ 10 \\ 0 \end{array}\right)=10\left(\begin{array}{r} -2 \\ 1 \\ 0 \end{array}\right) $$ This is what we wanted. Checking the second basic eigenvector, $\mathbf{x}''$, is left as an exercise.
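The whole worked example can also be confirmed with NumPy (this verification is mine, not part of the original exercise):

```python
import numpy as np

# The matrix from the worked example.
A = np.array([[5.0, -10.0, -5.0],
              [2.0, 14.0, 2.0],
              [-4.0, -8.0, 6.0]])

# Eigenvalues found above: 5, and 10 with multiplicity two.
eigvals = np.linalg.eigvals(A)
print(np.sort(eigvals.real))   # approximately [ 5. 10. 10.]

# Each basic eigenvector x from the example satisfies A @ x == lam * x.
for lam, x in [(5, np.array([5.0, -2.0, 4.0])),
               (10, np.array([-2.0, 1.0, 0.0])),
               (10, np.array([-1.0, 0.0, 1.0]))]:
    assert np.allclose(A @ x, lam * x)
```

Note that `np.linalg.eigvals` returns the eigenvalues in no particular order and, for a real matrix, possibly with negligible imaginary parts, which is why the check sorts the real parts.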