# The Trouble with Transforms

Here, we summarize TRS transforms, their relationship with affine transformations, and algebraic problems that arise therein. These are then contrasted with isometries, which do not suffer similarly. This can serve also as a brief introduction to the algebra of affine transformations.

## Introduction

Ah, `Transform`, everyone's favorite little guy, used to describe position and orientation of entities:

```rust
struct Transform {
    translation: Vec3,
    rotation: Quat,
    scale: Vec3,
}
```

Let's recall what the components are doing here:

* `translation`: motion away from the default position, described as a vector;
* `rotation`: rotation from the default orientation, described as a unit quaternion;
* `scale`: scaling deformation from the default shape along the `x`, `y`, and `z` axes, encoded as the components of a vector.

Of course, these are all taken simultaneously, so there is the matter of the *semantics* of this: that is, what exactly it does when you actually apply it to something. The idea here is that the components are applied in the reverse of the listed order:

* First, `scale` is applied, stretching the entity based on its components.
* Next, the entity is rotated by `rotation`.
* Finally, the entity is translated by `translation`.

So when a `Transform` is actually applied, this is what happens. Let's see how that works mathematically speaking:

## Matrix representation

It's not uncommon for these kinds of transforms to be referred to using the acronym "TRS," which just stands for "translation, rotation, scale". The point is that, when we use a `Transform`, we first convert it to a form which can be used to do math on coordinates (e.g. those of meshes).
In Bevy, this happens through (essentially) `Affine3`, which is a 3x3 matrix together with a translation vector, that is, an *affine transformation*:

```rust
struct Affine3 {
    matrix3: Mat3,
    translation: Vec3,
}
```

Each component of a `Transform` has an associated `Affine3`:

* `translation` is obvious: the translation is just shoved into the corresponding field of `Affine3`:

  ```rust
  fn translation_aff(translation: Vec3) -> Affine3 {
      Affine3 {
          matrix3: Mat3::IDENTITY,
          translation,
      }
  }
  ```

* `rotation` passes to the associated orthogonal matrix, which gets put in the `matrix3` field:

  ```rust
  fn rotation_aff(rotation: Quat) -> Affine3 {
      Affine3 {
          matrix3: Mat3::from_quat(rotation),
          translation: Vec3::ZERO,
      }
  }
  ```

* `scale` is embedded as the diagonal of the matrix:

  ```rust
  fn scale_aff(scale: Vec3) -> Affine3 {
      Affine3 {
          matrix3: Mat3::from_diagonal(scale),
          translation: Vec3::ZERO,
      }
  }
  ```

Affine transformations are applied to points (encoded as `Vec3`) by first multiplying by the `matrix3` component and then translating by `translation`:

```rust
fn apply(aff: Affine3, point: Vec3) -> Vec3 {
    // Linear part first, then the translation.
    aff.matrix3 * point + aff.translation
}
```

Of course, the next ingredient is the multiplication of affine transformations themselves, which is used to finish the definition of a map `Transform -> Affine3`. That's also where we'll see the trouble starts to creep in.

## Multiplying affine transformations

Let us be slightly more mathematical and write affine transformations as $A = (M, v)$, where $M$ corresponds to the `matrix3` component and $v$ to `translation`. We saw above that for $p \in \mathbb{R}^3$,

$$
A(p) = (M, v)(p) = Mp + v.
$$

Now, given a second affine transformation $B = (N, w)$, we want to know the affine transformation that describes the composite $AB$, which ought to satisfy a form of associativity:

$$
(AB)(p) = A(B(p))
$$

On the other hand, we have

$$
\begin{align*}
A(B(p)) &= A((N, w)(p)) \\
&= A(Np + w) \\
&= (M, v)(Np + w) \\
&= MNp + Mw + v \\
&= (MN)p + (Mw + v)
\end{align*}
$$

and so defining

$$
AB = (M, v)(N, w) := (MN, Mw + v)
$$

is consistent. In code, that looks like this:

```rust
fn mul(first: Affine3, second: Affine3) -> Affine3 {
    Affine3 {
        matrix3: first.matrix3 * second.matrix3,
        // `second` is applied first, so its translation gets hit by `first.matrix3`.
        translation: first.matrix3 * second.translation + first.translation,
    }
}
```

Of course, this should (and does) live in an [actual `Mul` implementation](https://docs.rs/bevy/latest/bevy/math/struct.Affine3A.html#impl-Mul-for-Affine3A), and we'll use this multiplication with the `*` operator as such.

With this in hand, we are finally ready to make the jump from `Transform` to `Affine3`:

```rust
fn trs(transform: Transform) -> Affine3 {
    let Transform { translation, rotation, scale } = transform;
    // The rightmost factor acts first: scale, then rotate, then translate.
    translation_aff(translation) * rotation_aff(rotation) * scale_aff(scale)
}
```

This is scaling, then rotation, then translation, made manifest in affine transformations: three individual transformations $T$, $R$, and $S$, multiplied as $TRS$, giving an affine transformation which can actually transform points. We'll now see that this multiplication is precisely what leads to issues with `Transform`.

## Multiplication of transforms

Well, now that we know we can multiply affine transformations and we can turn `Transform`s into affine transformations, it's natural to ask whether we can multiply `Transform`s as well. Sadly, the answer is no, at least not in a way which universally makes sense semantically.
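The composition rule $(M, v)(N, w) = (MN, Mw + v)$ can be sanity-checked numerically before going further. The following is a minimal sketch, not Bevy code: `M3`, `V3`, `Aff`, `compose`, `example_a`, and `example_b` are hypothetical plain-array stand-ins introduced here so that the snippet needs no external crates.

```rust
// Hypothetical plain-array stand-ins for `Mat3`, `Vec3`, and `Affine3`.
type M3 = [[f64; 3]; 3]; // row-major: m[row][col]
type V3 = [f64; 3];

#[derive(Clone, Copy, Debug, PartialEq)]
struct Aff {
    matrix3: M3,
    translation: V3,
}

fn mat_mul(a: M3, b: M3) -> M3 {
    let mut out = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            for k in 0..3 {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}

fn mat_vec(m: M3, v: V3) -> V3 {
    let mut out = [0.0; 3];
    for i in 0..3 {
        for k in 0..3 {
            out[i] += m[i][k] * v[k];
        }
    }
    out
}

fn add(a: V3, b: V3) -> V3 {
    [a[0] + b[0], a[1] + b[1], a[2] + b[2]]
}

/// A(p) = Mp + v
fn apply(a: Aff, p: V3) -> V3 {
    add(mat_vec(a.matrix3, p), a.translation)
}

/// (M, v)(N, w) = (MN, Mw + v)
fn compose(a: Aff, b: Aff) -> Aff {
    Aff {
        matrix3: mat_mul(a.matrix3, b.matrix3),
        translation: add(mat_vec(a.matrix3, b.translation), a.translation),
    }
}

/// Quarter turn about the z axis, followed by a shift along x.
fn example_a() -> Aff {
    Aff {
        matrix3: [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
        translation: [1.0, 0.0, 0.0],
    }
}

/// Nonuniform scale along x, followed by a shift along y.
fn example_b() -> Aff {
    Aff {
        matrix3: [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
        translation: [0.0, 3.0, 0.0],
    }
}
```

With these definitions, `apply(compose(a, b), p)` agrees with `apply(a, apply(b, p))` for `a = example_a()`, `b = example_b()`, and any point `p`, while `compose(a, b)` and `compose(b, a)` come out different: the order of the factors matters.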
The problem with multiplying `Transform`s stems from the fact that affine multiplication is not commutative:

$$
\begin{align*}
AB = (M,v)(N,w) &= (MN, Mw + v)\\
BA = (N,w)(M,v) &= (NM, Nv + w)
\end{align*}
$$

To see how this causes trouble, let's imagine that we have a pair of transforms `transform0` and `transform1`, naming the affine transformations of the components $T_0, R_0, S_0$ and $T_1, R_1, S_1$ respectively, and the composites $F_0 = T_0 R_0 S_0$, $F_1 = T_1 R_1 S_1$. On the one hand, we know the result of `trs(transform0) * trs(transform1)`:

$$
F_0 F_1 = (T_0 R_0 S_0)(T_1 R_1 S_1) = T_0 R_0 S_0 T_1 R_1 S_1
$$

If we want this to be shadowed by some multiplication `transform0 * transform1` (i.e. so that `trs(transform0 * transform1) == trs(transform0) * trs(transform1)`), then $F_0 F_1$ needs to be decomposed as $TRS$, where $T$, $R$, and $S$ are actually translation, rotation, and scaling matrices.

Noncommutativity defeats the naïve attempt at this immediately: while $T_0 T_1$ is a translation matrix, for example, $T_1$ cannot be moved past $S_0$ and $R_0$ to bring the two together, and the same goes for the other parts.

Still, one might hope that a more sophisticated technique could produce something of the correct form (by studying the commutators of the component transformations and so on), but it turns out that the task is, unfortunately, impossible: TRS transformations are just not closed under multiplication in general. For example, $F_0 F_1$ may have shearing, which TRS transformations never do.

## Inverses of transforms

Of course, this naturally means that a `Transform` does not, in general, have a multiplicative inverse which is rightfully considered a `Transform`; it only has a multiplicative inverse at the level of affine transformations (and only when every component of `scale` is nonzero). In fact, the inverse is conceptually an SRT transformation:

$$
(TRS)(S^{-1}R^{-1}T^{-1}) = (S^{-1}R^{-1}T^{-1})(TRS) = \text{id}
$$

(Note that $T^{-1}$, $R^{-1}$, and $S^{-1}$ are still translation, rotation, and scaling transformations respectively.)
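Both failures (products and inverses leaving the TRS family) can be seen numerically by looking only at the matrix parts. The sketch below relies on one observation: a matrix of the form $RS$ always has mutually orthogonal columns, because $(RS)^{\mathsf{T}}(RS) = S R^{\mathsf{T}} R S = S^2$ is diagonal. A matrix that fails this test therefore cannot be the matrix part of a TRS transformation. The names `M3`, `rot_z`, `scale`, and `has_orthogonal_columns` are hypothetical helpers written for this illustration, not Bevy APIs.

```rust
// Hypothetical plain-array stand-in for `Mat3`, row-major: m[row][col].
type M3 = [[f64; 3]; 3];

fn mat_mul(a: M3, b: M3) -> M3 {
    let mut out = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            for k in 0..3 {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}

fn transpose(m: M3) -> M3 {
    let mut out = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            out[j][i] = m[i][j];
        }
    }
    out
}

/// Rotation about the z axis by `angle` radians.
fn rot_z(angle: f64) -> M3 {
    let (s, c) = angle.sin_cos();
    [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
}

/// Scaling along the coordinate axes.
fn scale(x: f64, y: f64, z: f64) -> M3 {
    [[x, 0.0, 0.0], [0.0, y, 0.0], [0.0, 0.0, z]]
}

/// True when the columns of `m` are mutually orthogonal, i.e. when
/// `m^T m` is diagonal (up to a small tolerance).
fn has_orthogonal_columns(m: M3) -> bool {
    let gram = mat_mul(transpose(m), m);
    for i in 0..3 {
        for j in 0..3 {
            if i != j && gram[i][j].abs() > 1e-9 {
                return false;
            }
        }
    }
    true
}
```

For instance, `mat_mul(rot_z(0.3), scale(2.0, 1.0, 1.0))` and `mat_mul(rot_z(0.785), scale(1.0, 3.0, 1.0))` each pass the test, but their product does not, and neither does the inverse of the first one, `mat_mul(scale(0.5, 1.0, 1.0), rot_z(-0.3))`, which has the $S^{-1}R^{-1}$ shape.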
## Isometries

We can repeat the preceding story but without the scaling component, obtaining `Isometry` rather than `Transform`:

```rust
struct Isometry {
    translation: Vec3,
    rotation: Quat,
}
```

Then, using the same construction as before, we can create TR transformations within affine transformations:

```rust
fn tr(isometry: Isometry) -> Affine3 {
    let Isometry { translation, rotation } = isometry;
    // The rightmost factor acts first: rotate, then translate.
    translation_aff(translation) * rotation_aff(rotation)
}
```

Of course, at first glance it might seem like this still runs into all the old problems that TRS had. However, the situation is actually much better: given two TR transformations, their product is actually also a TR transformation.

This follows from the fact that the commutator $\lbrack T, R \rbrack = T^{-1}R^{-1}TR$ of a translation transformation $T$ and a rotation transformation $R$ is itself a translation transformation (see Appendix II). Specifically, if $G_0$ and $G_1$ are TR transformations:

$$
\begin{align*}
G_0 G_1 &= T_0 R_0 T_1 R_1 \\
&= T_0 (R_0 T_1) R_1 \\
&= T_0 (R_0 T_1 R_0^{-1} T_1^{-1} T_1 R_0) R_1 \\
&= T_0 (\lbrack R_0^{-1}, T_1^{-1} \rbrack T_1 R_0) R_1 \\
&= (T_0 \lbrack R_0^{-1}, T_1^{-1} \rbrack T_1) (R_0 R_1).
\end{align*}
$$

That is, $G_0 G_1 = TR$, where

$$
\begin{align*}
T &= T_0 \lbrack R_0^{-1}, T_1^{-1} \rbrack T_1, \\
R &= R_0 R_1.
\end{align*}
$$

Let's unpack this commutator to see what the resulting translation actually looks like:

$$
\begin{align*}
T_i = (I, v_i) &\implies T_i^{-1} = (I, -v_i)\\
R_i = (M_i, 0) &\implies R_i^{-1} = (M_i^{-1}, 0)
\end{align*}
$$

$$
\begin{align*}
\lbrack R_0^{-1}, T_1^{-1} \rbrack &= \lbrack T_1^{-1}, R_0^{-1} \rbrack^{-1} \\
&= \lbrack (I, -v_1), (M_0^{-1}, 0) \rbrack^{-1}\\
&= (I, v_1 - M_0 v_1)^{-1} ~~\text{(Appendix II)} \\
&= (I, M_0v_1 - v_1).
\end{align*}
$$

Thus:

$$
T = T_0 \lbrack R_0^{-1}, T_1^{-1} \rbrack T_1 = (I, v_0)(I, M_0v_1 - v_1)(I, v_1) = (I, M_0v_1 + v_0).
$$

This lets us define multiplication for `Isometry`:

```rust
fn mul(first: Isometry, second: Isometry) -> Isometry {
    Isometry {
        translation: first.rotation * second.translation + first.translation,
        rotation: first.rotation * second.rotation,
    }
}
```

By construction, we have `tr(iso1 * iso2) == tr(iso1) * tr(iso2)`. It's also worth noting that the expression $M_0v_1 + v_0$ is exactly what showed up in the definition of multiplication for arbitrary affine transformations. If you unravel this, you'll see we have demonstrated that the TR construction produces exactly the affine transformations whose matrix component is a special orthogonal matrix.

It should also be easy to convince yourself that the inverse of an isometry can be computed exactly as in Appendix I:

```rust
fn inverse(iso: Isometry) -> Isometry {
    let inv = iso.rotation.inverse();
    Isometry {
        // Negate the rotated vector. Writing `-inv * iso.translation` would
        // negate the quaternion instead, and `-q` encodes the same rotation as `q`.
        translation: -(inv * iso.translation),
        rotation: inv,
    }
}
```

This finishes the story for `Isometry`: they don't suffer from any of the annoying algebraic limitations of `Transform`. Of course, it's also worth remembering that they're less expressive. This can be remedied to a limited extent, as the next section demonstrates.

## Isometries + uniform scaling

The whole problem with TRS transformations can be attributed to the commutators of the form $\lbrack T, S \rbrack$ and $\lbrack R, S \rbrack$; as we saw, $\lbrack T, R \rbrack$ is actually nice. In fact, $\lbrack T, S \rbrack$ is also nice: the derivation in Appendix II doesn't even use that $R$ is a rotation, so it can also be applied to $\lbrack T, S \rbrack$, demonstrating that it is also a translation.

At any rate, there is the question of what adjustment could be made so that $\lbrack R, S \rbrack$ is nice, and a natural answer to that question is that we could impose that scaling is *uniform*, meaning that $S = (\lambda I, 0)$. The matrix $\lambda I$ commutes with every matrix, so such an $S$ commutes with every rotation and $\lbrack R, S \rbrack$ is trivial.
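The difference between uniform and nonuniform scaling is easy to check on matrix parts alone. The following sketch uses hypothetical helpers (`M3`, `rot_z`, `diag`, `commutes`) over plain arrays rather than Bevy types, and compares $RS$ against $SR$ up to a small tolerance.

```rust
// Hypothetical plain-array stand-in for `Mat3`, row-major: m[row][col].
type M3 = [[f64; 3]; 3];

fn mat_mul(a: M3, b: M3) -> M3 {
    let mut out = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            for k in 0..3 {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}

/// Rotation about the z axis by `angle` radians.
fn rot_z(angle: f64) -> M3 {
    let (s, c) = angle.sin_cos();
    [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
}

/// Diagonal scaling matrix; uniform when all three entries agree.
fn diag(x: f64, y: f64, z: f64) -> M3 {
    [[x, 0.0, 0.0], [0.0, y, 0.0], [0.0, 0.0, z]]
}

/// True when `a * b` and `b * a` agree entrywise up to a small tolerance.
fn commutes(a: M3, b: M3) -> bool {
    let (ab, ba) = (mat_mul(a, b), mat_mul(b, a));
    for i in 0..3 {
        for j in 0..3 {
            if (ab[i][j] - ba[i][j]).abs() > 1e-9 {
                return false;
            }
        }
    }
    true
}
```

Here `commutes(rot_z(0.7), diag(2.0, 2.0, 2.0))` holds, while `commutes(rot_z(0.7), diag(2.0, 1.0, 1.0))` does not: stretching along `x` before rotating is not the same as rotating and then stretching along `x`.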
On the level of data, uniform scaling looks like this:

```rust
struct ScaledIsometry {
    translation: Vec3,
    rotation: Quat,
    scale: f32,
}
```

In code, the scale component transforms to an $S$ matrix by placing its value along the diagonal:

```rust
fn scale_aff(scale: f32) -> Affine3 {
    Affine3 {
        matrix3: scale * Mat3::IDENTITY,
        translation: Vec3::ZERO,
    }
}
```

and TRS works exactly as in `Transform` with this modification:

```rust
fn trs(iso: ScaledIsometry) -> Affine3 {
    let ScaledIsometry { translation, rotation, scale } = iso;
    translation_aff(translation) * rotation_aff(rotation) * scale_aff(scale)
}
```

The same argument for TR transformations works with the addition of uniform scaling, since the $S$ transformations commute with rotations and can be moved past translations at the cost of rescaling the translation vector: $S_0 T_1 = T_1' S_0$ with $T_1' = (I, \lambda_0 v_1)$. Accordingly, multiplication works as in the case of `Isometry`, except that the scales multiply and the second translation is additionally scaled by the first scale:

$$
(\lambda_0 M_0, v_0)(\lambda_1 M_1, v_1) = (\lambda_0 \lambda_1 M_0 M_1, \lambda_0 M_0 v_1 + v_0)
$$

(and so on for inverses). You can convince yourself additionally that the TRS transformations produced this way with nonzero scale are exactly the affine transformations $(M, v)$ such that $\lambda M$ is orthogonal for some $\lambda \in \mathbb{R}$.

## Appendix I: Inverse transformations

Given an affine transformation $(M, v)$, one may ask whether an inverse exists and, if so, what it is. Thankfully, this is easily computed: the matrix component must obviously be $M^{-1}$ (so an inverse exists exactly when $M$ is invertible), and if the inverse's translation component is $w$, we have:

$$
\begin{align*}
(M^{-1}, w)(M, v) &= (I, M^{-1}v + w) = (I, 0) =: \text{id} \\
(M, v)(M^{-1}, w) &= (I, Mw + v) = (I, 0) =: \text{id}
\end{align*}
$$

The first of these implies $w = -M^{-1}v$, which is also compatible with the second:

$$
M(-M^{-1}v) + v = -MM^{-1}v + v = -v + v = 0.
$$

So we have:

$$
(M, v)^{-1} = (M^{-1}, -M^{-1}v).
$$

In code:

```rust
fn inverse(aff: Affine3) -> Affine3 {
    // Assumes `matrix3` is invertible.
    let inv = aff.matrix3.inverse();
    Affine3 {
        matrix3: inv,
        translation: -(inv * aff.translation),
    }
}
```

## Appendix II: The TR Commutator

Here we compute the commutator between a translation transformation $T$ and a rotation $R$.
Recall:

$$
\begin{align*}
T &= (I, v) \\
R &= (M, 0)
\end{align*}
$$

where $I$ is the identity matrix and $M \in SO(3)$. Evidently, we also have

$$
\begin{align*}
T^{-1} &= (I, -v) \\
R^{-1} &= (M^{-1}, 0).
\end{align*}
$$

Then the commutator is:

$$
\begin{align*}
\lbrack T, R \rbrack &= T^{-1}R^{-1}TR \\
&= ((I, -v)(M^{-1}, 0))((I, v)(M, 0)) \\
&= (M^{-1}, -v)(M, v) \\
&= (I, M^{-1}v - v).
\end{align*}
$$

That is, the commutator of a translation and a rotation is a translation.
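As a numerical cross-check of the closed form $\lbrack T, R \rbrack = (I, M^{-1}v - v)$, here is a minimal sketch that builds the commutator out of affine products and compares it with the prediction. All names (`M3`, `V3`, `Aff`, `compose`, `inverse_orthogonal`, `commutator`, `predicted_translation`) are hypothetical stand-ins over plain arrays, and the inverse assumes an orthogonal matrix part so that $M^{-1} = M^{\mathsf{T}}$.

```rust
// Hypothetical plain-array stand-ins for `Mat3`, `Vec3`, and `Affine3`.
type M3 = [[f64; 3]; 3]; // row-major: m[row][col]
type V3 = [f64; 3];

#[derive(Clone, Copy, Debug)]
struct Aff {
    matrix3: M3,
    translation: V3,
}

const IDENTITY: M3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]];

fn mat_mul(a: M3, b: M3) -> M3 {
    let mut out = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            for k in 0..3 {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}

fn mat_vec(m: M3, v: V3) -> V3 {
    let mut out = [0.0; 3];
    for i in 0..3 {
        for k in 0..3 {
            out[i] += m[i][k] * v[k];
        }
    }
    out
}

fn transpose(m: M3) -> M3 {
    let mut out = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            out[j][i] = m[i][j];
        }
    }
    out
}

/// (M, v)(N, w) = (MN, Mw + v)
fn compose(a: Aff, b: Aff) -> Aff {
    let mw = mat_vec(a.matrix3, b.translation);
    let v = a.translation;
    Aff {
        matrix3: mat_mul(a.matrix3, b.matrix3),
        translation: [mw[0] + v[0], mw[1] + v[1], mw[2] + v[2]],
    }
}

/// (M, v)^{-1} = (M^{-1}, -M^{-1} v), assuming `M` is orthogonal so M^{-1} = M^T.
fn inverse_orthogonal(a: Aff) -> Aff {
    let inv = transpose(a.matrix3);
    let iv = mat_vec(inv, a.translation);
    Aff { matrix3: inv, translation: [-iv[0], -iv[1], -iv[2]] }
}

/// T = (I, v)
fn translation_aff(v: V3) -> Aff {
    Aff { matrix3: IDENTITY, translation: v }
}

/// R = (M, 0), with M a rotation about the z axis by `angle` radians.
fn rotation_z_aff(angle: f64) -> Aff {
    let (s, c) = angle.sin_cos();
    Aff { matrix3: [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]], translation: [0.0; 3] }
}

/// [A, B] = A^{-1} B^{-1} A B
fn commutator(a: Aff, b: Aff) -> Aff {
    compose(compose(inverse_orthogonal(a), inverse_orthogonal(b)), compose(a, b))
}

/// Translation part predicted by the derivation: M^{-1} v - v.
fn predicted_translation(m: M3, v: V3) -> V3 {
    let mv = mat_vec(transpose(m), v);
    [mv[0] - v[0], mv[1] - v[1], mv[2] - v[2]]
}
```

For a translation by `v` and a rotation `r`, the matrix part of `commutator(translation_aff(v), r)` should be the identity and its translation should match `predicted_translation(r.matrix3, v)` up to floating-point error.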