345 Notes - Determinants and Volume
==============
###### tags: `345` `linear algebra`
As a warm up here is a [nicely illustrated video from 3Blue1Brown](https://youtu.be/Ip3X9LOh2dk) discussing the same aspects of determinants that I am trying to get at in these notes.
{%youtube Ip3X9LOh2dk %}
This is from the [Essence of Linear Algebra](https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab) series and [this video](https://youtu.be/kYB8IZa5AuE) might be useful in understanding the relationship between what I call the area determined by $\mathbf{u,v}\in\mathbb R^2$ and the way that the unit square determined by $\mathbf{e_1,e_2}$[^std] is *transformed* by $A$, where
$$
A=\bigl[\mathbf u\,\mathbf v\bigr]=\begin{bmatrix}u_1&v_1\\u_2&v_2\end{bmatrix}
=\bigl[A\mathbf e_1\,A\mathbf e_2\bigr]
$$
With this in mind we can think of $\bigl[\mathbf e_1\,\mathbf e_2\bigr]$ as the unit square (oriented in a particular way) and $A\bigl[\mathbf e_1\,\mathbf e_2\bigr]=\bigl[A\mathbf e_1\,A\mathbf e_2\bigr]=A$ as the transformed parallelogram.
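This correspondence is easy to check numerically. Below is a minimal NumPy sketch (the matrix entries are made up for illustration) showing that the columns of $A$ are exactly $A\mathbf e_1$ and $A\mathbf e_2$, and that $\det A$ is the signed area of the transformed unit square:

```python
import numpy as np

# A sends the unit square spanned by e1, e2 to the parallelogram
# spanned by its columns u = A e1 and v = A e2.
A = np.array([[2.0, 1.0],
              [0.5, 3.0]])  # illustrative entries

e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

u = A @ e1  # first column of A
v = A @ e2  # second column of A

print(u, v)              # the columns of A
print(np.linalg.det(A))  # signed area of the transformed square
```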
# n-dimensional volume of a [parallelepiped](https://en.wikipedia.org/wiki/Parallelepiped)
In $\mathbb{R}^n$ let $\mathbf{u_1}, \mathbf{u_2},\ldots,\mathbf{u_n}$ be $n$ vectors. These $n$ vectors uniquely determine a [parallelepiped](https://en.wikipedia.org/wiki/Parallelepiped) which we will denote $P(\mathbf{u_1},\ldots,\mathbf{u_n})$. The volume of $P(\mathbf{u_1},\ldots,\mathbf{u_n})$ will just be denoted $\text{vol}(\mathbf{u_1},\ldots,\mathbf{u_n})$.
If the vectors $\mathbf{u_1}, \mathbf{u_2},\ldots,\mathbf{u_n}$ are not independent, then the parallelepiped $P(\mathbf{u_1},\ldots,\mathbf{u_n})$ is degenerate and has volume 0. Conversely, if the vectors are independent, then the parallelepiped $P(\mathbf{u_1},\ldots,\mathbf{u_n})$ is non-degenerate and has non-zero volume. This is summed up in the following fact.
**Fact:** $\text{vol}(\mathbf{u_1},\ldots,\mathbf{u_n})=0$ iff $\mathbf{u_1}, \mathbf{u_2},\ldots,\mathbf{u_n}$ are not independent.
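The fact can be tested numerically; here is a NumPy sketch (the vectors are chosen only for illustration) in which one triple is deliberately dependent and the other is not:

```python
import numpy as np

# Three dependent vectors in R^3: u3 = u1 + u2, so the
# parallelepiped is flattened into a plane and has volume 0.
u1 = np.array([1.0, 0.0, 2.0])
u2 = np.array([0.0, 1.0, 1.0])
u3 = u1 + u2
dependent = np.column_stack([u1, u2, u3])
print(np.linalg.det(dependent))  # 0, up to round-off

# Replace u3 with an independent vector: non-zero volume.
independent = np.column_stack([u1, u2, np.array([0.0, 0.0, 1.0])])
print(abs(np.linalg.det(independent)))  # the (unsigned) volume
```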
A trivial case of this would be when $\mathbf{u_i}=\mathbf{u_j}$ for some $i\neq j$. Here we want to axiomatize volume and it is always preferable to take the weakest form of an axiom, so we adopt the following.
**Axiom 1':** $\text{vol}(\ldots,\mathbf{u},\ldots,\mathbf{u},\ldots)=0$
It turns out that directly axiomatizing volume is a bad idea since it does not behave nicely with respect to linear combinations of the vectors generating the parallelepipeds.
**Example:** In 1 dimension, 3 is a vector; you might think of it as the "arrow" from 0 to 3, but this is not necessary. Similarly, $-3$ is a vector. $P(3)$ is the 1-dimensional parallelepiped, i.e., line segment, $[0,3]$, and $\text{vol}(3)=|3|=3$. Similarly, $P(-3)$ is the line segment $[-3,0]$ and $\text{vol}(-3)=|-3|=3$.
$$
\text{vol}(3 + (-3))=\text{vol}(0)=0\neq\text{vol}(3)+\text{vol}(-3)=|3|+|-3| = 6
$$
In 1-dimension, $\text{vol}(r)=|r|$ and the example is just a case of the well-known fact that $|r+s|\leq|r|+|s|$, but in general, $|r+s|\neq |r| + |s|$.
**Example:** In $\mathbb{R}^2$, let $\mathbf{u}$ and $\mathbf{v}$ be two non-collinear vectors, so $P(\mathbf{u},\mathbf{v})$ is a non-degenerate parallelogram. Then clearly $\text{vol}(\mathbf{u},\mathbf{v})=\text{vol}(\mathbf{-u},\mathbf{v})$, so again
$$
\begin{split}
0=\text{vol}&(\mathbf{0},\mathbf{v})=\text{vol}(\mathbf{u}+(\mathbf{-u}),\mathbf{v})\\
&\neq \text{vol}(\mathbf{u},\mathbf{v})+\text{vol}(\mathbf{-u},\mathbf{v})=2\cdot \text{vol}(\mathbf{u},\mathbf{v})>0
\end{split}
$$
These two examples suggest that if we want volume to behave nicely with respect to linear combinations of the vectors, then we need a "signed" volume, so that $\text{vol}(\mathbf{-u},\mathbf{v})=-\text{vol}(\mathbf{u},\mathbf{v})$.
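With sign included, the 2-D counterexample above goes away. Here is a NumPy sketch (`sdet` is a hypothetical helper name, and $\mathbf u,\mathbf v$ are arbitrary illustrative vectors):

```python
import numpy as np

def sdet(a, b):
    """Signed area of the parallelogram P(a, b) in R^2."""
    return np.linalg.det(np.column_stack([a, b]))

u = np.array([2.0, 1.0])
v = np.array([1.0, 3.0])

# Unsigned areas fail additivity: vol(u + (-u), v) = 0, but
# vol(u, v) + vol(-u, v) = 2 vol(u, v) > 0.
print(abs(sdet(u, v)) + abs(sdet(-u, v)))  # twice the area, not 0

# Signed areas repair it: det(-u, v) = -det(u, v), so the sum is 0.
print(sdet(u, v) + sdet(-u, v))            # 0, up to round-off
```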
# Axioms for signed volume
Here are our axioms of signed volume, $\det(\mathbf{u_1}, \mathbf{u_2},\ldots,\mathbf{u_n})$:
**Axiom 1: (multilinearity)**
$$
\begin{split}
\det(\mathbf{u_1},\ldots,&\alpha\mathbf{u_i}+\beta\mathbf{u_i}',\ldots,\mathbf{u_n})\\
&=\alpha\det(\mathbf{u_1},\ldots,\mathbf{u_i},\ldots,\mathbf{u_n})+\beta\det(\mathbf{u_1},\ldots,\mathbf{u_i}',\ldots,\mathbf{u_n})
\end{split}
$$
This is often split into two separate items.
$$
\begin{split}
\det(\mathbf{u_1},\ldots,&\mathbf{u_i}+\mathbf{u_i}',\ldots,\mathbf{u_n})\\
&=\det(\mathbf{u_1},\ldots,\mathbf{u_i},\ldots,\mathbf{u_n})+\det(\mathbf{u_1},\ldots,\mathbf{u_i}',\ldots,\mathbf{u_n})
\end{split}\tag{1}
$$
and
$$
\det(\mathbf{u_1},\ldots,\alpha\mathbf{u_i},\ldots,\mathbf{u_n})=\alpha\det(\mathbf{u_1},\ldots,\mathbf{u_i},\ldots,\mathbf{u_n})\tag{2}
$$
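Multilinearity is easy to spot-check numerically in, say, $\mathbb{R}^3$; the following NumPy sketch uses arbitrary illustrative vectors and scalars (`u1p` plays the role of $\mathbf{u_1}'$):

```python
import numpy as np

det = lambda *cols: np.linalg.det(np.column_stack(cols))

u1  = np.array([1.0, 2.0, 0.0])
u1p = np.array([0.0, 1.0, 3.0])   # the vector u1' in the text
u2  = np.array([4.0, 0.0, 1.0])
u3  = np.array([0.0, 1.0, 1.0])
alpha, beta = 2.0, -1.5

# Axiom 1: det is linear in the first slot (and likewise in each slot).
lhs = det(alpha * u1 + beta * u1p, u2, u3)
rhs = alpha * det(u1, u2, u3) + beta * det(u1p, u2, u3)
print(lhs, rhs)  # equal, up to round-off
```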
**Axiom 2:** $\det(\ldots,\mathbf{u},\ldots,\mathbf{u},\ldots)=0$
**Axiom 3:** $\det(\mathbf{e_1},\ldots,\mathbf{e_n})=1$
Here $\mathbf{e_i}=(0,\ldots,0,1,0,\ldots,0)$ is the unit vector in $\mathbb{R}^n$ with a 1 in the i^th^ position. So this axiom just says that the unit cube has unit volume.
Axioms 2 and 3 are clear, although Axiom 3 actually determines the "orientation" of our system of vectors and is sort of arbitrary, for example in $\mathbb{R}^2$ we have $\det((1,0),(0,1))=1$, while $\det((0,1),(1,0))=-1$.
|Illustration of Axiom 1 (1) in 2D|
|:--:|
||
|[Open in Geogebra](https://www.geogebra.org/calculator/bezhfnpw)|
:::info
Note that the areas of the three parallelograms do not change throughout the motion. From the case where $v_1$ and $v_2$ are orthogonal to $u$, it is clear that the signed areas $\det(v_2,u)$ and $\det(v_1,u)$ sum to $\det(v_1+v_2,u)$. Note also that, by the *counter-clockwise rule*, the signed areas satisfy $\det(v_1,u)<0$, $\det(v_1+v_2,u)<0$, and $\det(v_2,u)>0$.
:::
|Illustration of Axiom 1 (1) in 3D|
|:--:|
||
|[Open in Geogebra](https://www.geogebra.org/3d/ygbnxzhh)|
:::info
Similar to the situation above, the volumes of the three parallelepipeds do not change throughout their motion. From the *right-hand rule* the volumes $\det(v,u,w_1+w_2)$ and $\det(v,u,w_1)$ are positive and $\det(v,u,w_2)$ is negative. From the position where the $uv$-plane is orthogonal to the $w_i$'s it is clear that $\det(v,u,w_1)+\det(v,u,w_2)=\det(v,u,w_1+w_2)$.
:::
|Illustration of Axiom 1 (2) in 3D|
|:--:|
||
|[Open in Geogebra](https://www.geogebra.org/3d/xb7cnsez)|
:::info
This image makes it clear that $\det(u,v,\alpha w)=\alpha \det(u,v,w)$, at least in $\mathbb R^3$.
:::
Part (2) of Axiom 1 is relatively clear; the tricky part is part (1), which is hinted at in the images above. Note that the various parallelepipeds *clearly* satisfy the rule when $\mathbf{v_1}$ and $\mathbf{v_2}$ are co-linear; the trick is to note that the volume of each parallelepiped does not change, and thus (1) holds also when $\mathbf{v_1}$ and $\mathbf{v_2}$ are not co-linear. The *formal* argument follows the images.
To argue formally, start with our vectors $\mathbf{u_1},\mathbf{u_2},\ldots,\mathbf{u_n}$. Let $\mathbf{n}$ be a unit normal to the hyperplane determined by the $n-1$ vectors, $\mathbf{u_2},\ldots,\mathbf{u_n}$. Let $\mathbf{w_1}=\text{proj}(\mathbf{u_1},\mathbf{n})=\langle \mathbf{u_1},\mathbf{n}\rangle\,\mathbf{n}$ be the projection of $\mathbf{u_1}$ onto $\mathbf{n}$, in other words, the *height* of the parallelepiped above the hyperplane. (In the pictures above, the hyperplane is just the *line* in the direction of $\mathbf{u}$ in the 2-dimensional picture and is the plane containing $\mathbf{u}$ and $\mathbf{v}$ in the 3-dimensional picture.) The point is
$$
\det(\mathbf{u_1},\mathbf{u_2},\ldots,\mathbf{u_n})=\det(\mathbf{w_1},\mathbf{u_2},\ldots,\mathbf{u_n})
$$
If you do the same thing for $\mathbf{u_1}'$, then we get
$$
\begin{split}
\det(\mathbf{u_1}&,\mathbf{u_2},\ldots,\mathbf{u_n})+\det(\mathbf{u_1}',\mathbf{u_2},\ldots,\mathbf{u_n})\\
&=\det(\mathbf{w_1},\mathbf{u_2},\ldots,\mathbf{u_n})+\det(\mathbf{w_1}',\mathbf{u_2},\ldots,\mathbf{u_n})\\
\end{split}
$$
Since $\mathbf{w_1}$ and $\mathbf{w_1}'$ are colinear we know
$$
\begin{split}
\det(\mathbf{w_1}&,\mathbf{u_2},\ldots,\mathbf{u_n})+\det(\mathbf{w_1}',\mathbf{u_2},\ldots,\mathbf{u_n})\\
&=\det(\mathbf{w_1}+\mathbf{w_1}',\mathbf{u_2},\ldots,\mathbf{u_n})
\end{split}
$$
Finally, since the projection to $\mathbf{n}$ of $\mathbf{u_1}+\mathbf{u_1}'$ is $\mathbf{w_1}+\mathbf{w_1}'$ we have
$$
\det(\mathbf{w_1}+\mathbf{w_1}',\mathbf{u_2},\ldots,\mathbf{u_n})=\det(\mathbf{u_1}+\mathbf{u_1}',\mathbf{u_2},\ldots,\mathbf{u_n})
$$
Thus we have
$$
\begin{split}
\det(\mathbf{u_1}&,\mathbf{u_2},\ldots,\mathbf{u_n})+\det(\mathbf{u_1}',\mathbf{u_2},\ldots,\mathbf{u_n})\\
&=\det(\mathbf{u_1}+\mathbf{u_1}',\mathbf{u_2},\ldots,\mathbf{u_n})
\end{split}
$$
as desired!
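The key step of the argument, that replacing $\mathbf{u_1}$ by its projection $\mathbf{w_1}$ onto the normal does not change the determinant, can be checked numerically. A NumPy sketch in $\mathbb R^3$ with illustrative vectors (here $\mathbf n$ is a unit normal to the plane of $\mathbf{u_2},\mathbf{u_3}$):

```python
import numpy as np

det = lambda *cols: np.linalg.det(np.column_stack(cols))

u1 = np.array([1.0, 2.0, 0.0])
u2 = np.array([1.0, 0.0, 1.0])
u3 = np.array([0.0, 1.0, 1.0])

# Unit normal to the plane spanned by u2 and u3.
n = np.cross(u2, u3)
n = n / np.linalg.norm(n)

# w1 = proj(u1, n) = <u1, n> n: the "height" of the parallelepiped.
w1 = np.dot(u1, n) * n

print(det(u1, u2, u3), det(w1, u2, u3))  # equal, up to round-off
```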
# Properties of the determinant
Besides the properties contained in Axioms 1 - 3 above it is easy to verify the following:
* $\det(\ldots,\mathbf{u},\ldots,\mathbf{v},\ldots)=-\det(\ldots,\mathbf{v},\ldots,\mathbf{u},\ldots)$
That is, swapping two vectors (rows/columns) changes the sign.
* $\det(\mathbf{u_1},\ldots,\mathbf{u_n})=0 \iff \mathbf{u_1},\ldots,\mathbf{u_n}$ are not linearly independent.
For the first, consider
$$
\begin{split}
\det(\ldots,&\mathbf{u},\ldots,\mathbf{v},\ldots)
+\det(\ldots,\mathbf{v},\ldots,\mathbf{u},\ldots)\\
&=\det(\ldots,\mathbf{u},\ldots,\mathbf{v},\ldots) + \det(\ldots,\mathbf{v},\ldots,\mathbf{v},\ldots)\\
&\qquad\qquad+\det(\ldots,\mathbf{v},\ldots,\mathbf{u},\ldots)+
\det(\ldots,\mathbf{u},\ldots,\mathbf{u},\ldots)\\
&=\det(\ldots,\mathbf{u+v},\ldots,\mathbf{v},\ldots)
+\det(\ldots,\mathbf{u+v},\ldots,\mathbf{u},\ldots)\\
&=\det(\ldots,\mathbf{u+v},\ldots,\mathbf{u+v},\ldots)=0
\end{split}
$$
This just uses (1) of Axiom 1 and Axiom 2.
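The sign change under a swap is easy to confirm numerically (a NumPy sketch with illustrative vectors):

```python
import numpy as np

det = lambda *cols: np.linalg.det(np.column_stack(cols))

u = np.array([1.0, 0.0, 2.0])
v = np.array([0.0, 3.0, 1.0])
w = np.array([1.0, 1.0, 0.0])

# Swapping two arguments flips the sign of the determinant.
print(det(u, v, w), det(v, u, w))  # same magnitude, opposite sign
```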
For the second, suppose WLOG that $\mathbf{u_1}$ is a linear combination of the rest of the vectors. Then $\mathbf{u_1}=\alpha_2\mathbf{u_2}+\cdots+\alpha_n\mathbf{u_n}$ and
$$
\begin{split}
\det(\mathbf{u_1},\mathbf{u_2},\ldots,\mathbf{u_n}) &=
\det\left(\sum_{i=2}^n\alpha_i\mathbf{u_i},\mathbf{u_2},\ldots,\mathbf{u_n}\right)\\
&=\sum_{i=2}^n\alpha_i\det(\mathbf{u_i},\mathbf{u_2},\ldots,\mathbf{u_i},\ldots,\mathbf{u_n})=0
\end{split}
$$
# The axioms are enough to compute the determinant
It turns out that the three axioms given suffice to show that $\det(\mathbf{u_1},\mathbf{u_2},\ldots,\mathbf{u_n})$ always yields a unique value and this is what we call the determinant.
Note that this yields immediately that
$$
\text{vol}(\mathbf{u_1},\ldots,\mathbf{u_n})=|\det(\mathbf{u_1},\ldots,\mathbf{u_n})|\tag{3}
$$
This is true since what we showed above was that any signed volume must satisfy Axioms 1 - 3. If those axioms determine a unique value, then (3) must hold.
Basically what one shows here is that the axioms lead to the usual expansion along columns. This is just a sort of tedious inductive proof, so I will just do the calculation for $\mathbb{R}^2$, then show how that leads to the usual expansion in $\mathbb{R}^3$, going from 2 to 3 indicates what must be done in the actual inductive step.
## Computing det(u,v)
In $\mathbb{R}^2$, the standard basis vectors are $\mathbf{e_1}=(1,0)$ and $\mathbf{e_2}=(0,1)$. An arbitrary vector can be written $\mathbf{v}=(v_1,v_2)=v_1\mathbf{e_1}+v_2\mathbf{e_2}$. Using this notation, let's see that the axioms suffice to compute $\det(\mathbf{u},\mathbf{v})$.
$$
\begin{align*}
\det(\mathbf{u},\mathbf{v})&=\det(u_1\mathbf{e_1}+u_2\mathbf{e_2},\mathbf{v})\\
&=u_1\det(\mathbf{e_1},\mathbf{v})+u_2\det(\mathbf{e_2},\mathbf{v})\\
&=u_1\det(\mathbf{e_1},v_1\mathbf{e_1}+v_2\mathbf{e_2})+
u_2\det(\mathbf{e_2},v_1\mathbf{e_1}+v_2\mathbf{e_2})\\
&=u_1\bigl(v_1\det(\mathbf{e_1},\mathbf{e_1})+v_2\det(\mathbf{e_1},\mathbf{e_2})\bigr)+
u_2\bigl(v_1\det(\mathbf{e_2},\mathbf{e_1})+v_2\det(\mathbf{e_2},\mathbf{e_2})\bigr)\\
&=u_1v_1\det(\mathbf{e_1},\mathbf{e_1})+u_1v_2\det(\mathbf{e_1},\mathbf{e_2})+
u_2v_1\det(\mathbf{e_2},\mathbf{e_1})+u_2v_2\det(\mathbf{e_2},\mathbf{e_2})\\
&=u_1v_1\cdot 0+u_1v_2\cdot 1 +u_2v_1\cdot(-1)+u_2v_2\cdot 0\\
&=u_1v_2-u_2v_1
\end{align*}
$$
This is the usual value of the determinant:
$$
\det(\mathbf{u},\mathbf{v})=\det\begin{bmatrix}u_1&v_1\\u_2&v_2\end{bmatrix}=u_1v_2-u_2v_1
$$
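The formula we just derived can be packaged as a function and compared against NumPy (a sketch; `det2` is a hypothetical helper name and the vectors are illustrative):

```python
import numpy as np

def det2(u, v):
    """2x2 determinant from the axioms: det(u, v) = u1*v2 - u2*v1."""
    return u[0] * v[1] - u[1] * v[0]

u = np.array([2.0, 1.0])
v = np.array([1.0, 3.0])

print(det2(u, v))                              # 5.0
print(np.linalg.det(np.column_stack([u, v])))  # agrees
```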
## Computing det(u,v,w)
As above we use the notation $\mathbf{u}=u_1\mathbf{e_1}+u_2\mathbf{e_2}+u_3\mathbf{e_3}$ so
$$
\begin{align*}
\det(\mathbf{u},\mathbf{v},\mathbf{w})&=
\det\left(\sum_{i=1}^3u_i\mathbf{e_i},\mathbf{v},\mathbf{w}\right)\\
&=\sum_{i=1}^3 u_i\det(\mathbf{e_i},\mathbf{v},\mathbf{w})\\
\end{align*}
$$
So we just need to see that $\det(\mathbf{e_i},\mathbf{v},\mathbf{w})$ corresponds to the $(i,1)$-minor (essentially we are expanding along the first column). This is easy; I'll do $i=2$, just to see where the $-1$ comes from.
$$
\begin{align*}
\det(\mathbf{e_2},\mathbf{v},\mathbf{w})&=
\det(\mathbf{e_2},v_1\mathbf{e_1}+v_2\mathbf e_2+v_3\mathbf{e_3},w_1\mathbf{e_1}+w_2\mathbf e_2+w_3\mathbf{e_3})\\
&=\det(\mathbf{e_2},(v_1\mathbf{e_1}+v_3\mathbf{e_3})+v_2\mathbf e_2,(w_1\mathbf{e_1}+w_3\mathbf{e_3})+w_2\mathbf e_2)\\
&=\det(\mathbf{e_2},v_1\mathbf{e_1}+v_3\mathbf{e_3},(w_1\mathbf{e_1}+w_3\mathbf{e_3})+w_2\mathbf e_2)
+\det(\mathbf{e_2},v_2\mathbf e_2,(w_1\mathbf{e_1}+w_3\mathbf{e_3})+w_2\mathbf e_2)\\
&=\det(\mathbf{e_2},v_1\mathbf{e_1}+v_3\mathbf{e_3},(w_1\mathbf{e_1}+w_3\mathbf{e_3})+w_2\mathbf e_2)\\
&=\det(\mathbf{e_2},v_1\mathbf{e_1}+v_3\mathbf{e_3},
w_1\mathbf{e_1}+w_3\mathbf{e_3})+\det(\mathbf{e_2},v_1\mathbf{e_1}+v_3\mathbf{e_3},w_2\mathbf e_2)\\
&=\det(\mathbf{e_2},v_1\mathbf{e_1}+v_3\mathbf{e_3},w_1\mathbf{e_1}+w_3\mathbf{e_3})\\
&=-\det(v_1\mathbf{e_1}+v_3\mathbf{e_3},\mathbf{e_2},w_1\mathbf{e_1}+w_3\mathbf{e_3})\\
&=-(v_1w_3-v_3w_1)\\
&=(-1)^{2+1}\det(M_{2,1})
\end{align*}
$$
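The $i=2$ case can also be verified numerically (a NumPy sketch; $\mathbf v,\mathbf w$ are illustrative):

```python
import numpy as np

det = lambda *cols: np.linalg.det(np.column_stack(cols))

e2 = np.array([0.0, 1.0, 0.0])
v  = np.array([1.0, 2.0, 3.0])
w  = np.array([4.0, 5.0, 6.0])

# M_{2,1}: delete row 2 and column 1 of [e2 v w], leaving
# the 2x2 matrix with columns (v1, v3) and (w1, w3).
M21 = np.array([[v[0], w[0]],
                [v[2], w[2]]])

print(det(e2, v, w))                         # (-1)^(2+1) * det(M21)
print((-1) ** (2 + 1) * np.linalg.det(M21))  # agrees
```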
## Computing det(u₁,...,uₙ)
As above $\mathbf{u_i}=\sum_{j=1}^n {u_{ij}}\mathbf{e_j}$ and so
\begin{align}
\det(\mathbf{u_1},\mathbf{u_2},\ldots,\mathbf{u_i},\ldots,\mathbf{u_n})&=
\det\left(\mathbf{u_1},\mathbf{u_2},\ldots,\sum_{j=1}^n {u_{ij}}\mathbf{e_j},\ldots,\mathbf{u_n}\right)\\
&=\sum_{j=1}^n u_{ij}\det(\mathbf{u_1},\mathbf{u_2},\ldots,\mathbf{e_j},\ldots,\mathbf{u_n})
\end{align}
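Carrying this reduction all the way down gives the usual recursive cofactor expansion along the first column, which we can sketch directly and compare against NumPy (`det_rec` is a hypothetical name, not a standard function):

```python
import numpy as np

def det_rec(A):
    """Determinant by cofactor expansion along the first column.
    A is an n x n array; this mirrors the inductive argument above."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for i in range(n):
        # minor M_{i+1,1}: delete row i and column 0 (0-based indices)
        minor = np.delete(np.delete(A, i, axis=0), 0, axis=1)
        # (-1)^(i+2) is the 1-based sign (-1)^((i+1)+1)
        total += (-1) ** (i + 2) * A[i, 0] * det_rec(minor)
    return total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(det_rec(A), np.linalg.det(A))  # agree, up to round-off
```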
[^std]: Recall that in this class we use $\bf e_i^n$ to be the $i$^th^ standard basis element in $\mathbb R^n$. The $n$ is often determined from context. So $\bf e_1=\bf i=\Bigl[\begin{smallmatrix}1\\0\end{smallmatrix}\Bigr]$ and $\bf e_2=\bf j=\Bigl[\begin{smallmatrix}0\\1\end{smallmatrix}\Bigr]$.