Sodium Translucency Sorting using Global Face Normal Indexing

## Read the full text of my Master's thesis [here](https://douira.dev/) instead of this # Overview This document has become somewhat outdated as the problem is more complex as initially anticipated and I found a better approach to dynamic sorting later on (multi-partition trees). It's nonetheless a good introduction to the situation and visualizes how GFNI, which is just a small part of the whole translucency sorting system, works. The whole system is not called GFNI, that's just one of the triggering mechanisms it uses. Please see the implementation in PRs [#2016](https://github.com/CaffeineMC/sodium-fabric/pull/2016) and [#2352](https://github.com/CaffeineMC/sodium-fabric/pull/2352) I wrote my [Master's thesis](https://douira.dev/) on translucency sorting for Minecraft with the implementation in Sodium. Since this document has been linked in various places, I'll post the abstract here: > Ordering translucent quadrilaterals (quads) by descending depth produces a correct image with alpha-blended translucency. Generating such an order efficiently with a moving camera is challenging in the general case. Since translucency is common in computer graphics, a multitude of techniques have been developed for it, but none is universally optimal. This work implements quad-based translucency sorting in Sodium, a Minecraft modification focused on rendering performance. A sort order is obtained by topologically sorting a visibility graph over the quads with respect to a polytope of view points. However, this algorithm's quadratic runtime and potential for sort failure when being approximated limits its suitability to static sorting. Axis-aligned multi-partition trees without quad fragmentation can be used instead. They are efficiently built with a projected interval scanning algorithm and generate dynamic sort orders 60 percent faster than sorting quads by distance. Together these techniques improve visual correctness and provide a performant translucency sorting system. Unaligned partitioning, although not necessary for Minecraft, fully exploits partitionable geometry and lends itself to interpretation in parameter space. It is given a polynomial upper bound and options for transferring a lower bound from linear constraint solving. # Old Documentation on GFNI When rendering translucent geometry, one approach is to sort the geometry back-to-front in order to be able to render it correctly. Rendering it in any other order requires either order independent blending of translucent surfaces or accepting that translucent geometry isn't displayed correctly from all viewing angles. Of the two possible solutions, order independent blending results in a lower visual quality and is not object of this document. This document aims to outline the challenges of translucent geometry sorting, a potential performant solution, and the concepts of its implementation. ## Prior work Vanilla Minecraft performs translucency sorting (from now on referred to as "sorting" wherever unambiguous) every frame for all translucent geometry on the CPU. This incurs a performance cost but results in correct rendering. (TODO: when exactly does it sort? Always or just on camera movement? Does it frustum cull?) Currently Sodium doesn't sort translucent geometry which results in visual glitches when the camera is looking through two or more translucent surfaces. ### Work related to Sodium - [General translucency sorting tracking issue](https://github.com/CaffeineMC/sodium-fabric/issues/38). - [Weighted, Blended Order-Independent Transparency implementation](https://github.com/CaffeineMC/sodium-fabric/pull/519): Abandoned by author, visually unappealing - [Sorting with Compute Shaders 2021](https://github.com/CaffeineMC/sodium-fabric/pull/963): Requires OGL 4.3/2+ (compute shaders), now incompatible with current Sodium. JellySquid wrote [a discord message](https://discord.com/channels/602796788608401408/908399790839177216/915777405220696154) about this PR. (TODO: when does it sort? What is the trigger? Always?) - [Sorting with Compute Shaders 2023](https://github.com/CaffeineMC/sodium-fabric/pull/1687): Requires OGL 4.3/2+ (compute shaders), visual popping due to incremental sort on highly layered scenes, picks chunk to sort randomly and requries a proper triggering mechanism - [Discussion on Discord](https://discord.com/channels/602796788608401408/651120262129123330/1093935693950103715) about translucency sort triggering using Global Face Normal Indexing. [Continuation 1](https://discord.com/channels/602796788608401408/754753337525534892/1097682690246971492), [continuation 2](https://discord.com/channels/602796788608401408/651120262129123330/1097693661271167057). [Followup on easier special cases](https://discord.com/channels/602796788608401408/651120262129123330/1098074083326963804). ## Context In Minecraft chunk sections (16x16x16 blocks) are rendered in sorted order (TODO: is this correct?) so only geometry within sections needs to be sorted. Faces are usually aligned to axes but not necessarily. Flowing water can produce many different normals. However, mods and other custom content could generate any number of translucent face orientations. Translucency sorting is a difficult problem because there can be a large amount of translucenct geometry in need of sorting and the existing sorting work is potentially invalidated quickly. The order of faces changes when the camera moves, but not when the camera direction changes, since rotatation without translation doesn't change a face's distance to the camera. The distance of a face to the camera is measured using the [manhattan distance](https://en.wikipedia.org/wiki/Taxicab_geometry) (taxicab distance). TODO: give a summary of which data we have access to, its format and how it ends up being rendered. Are we sorting triangles or quads? What index buffer are we sorting, specifically in sodium 0.4 without assuming large changes? See also [this discord conversation](https://discord.com/channels/602796788608401408/651120262129123330/1123750737910964326) # (Old) Translucency Sorting using Global Face Normal Indexing ## Sort Triggering Given a certain sorting algorithm, be it on the GPU or CPU, it is necessary to determine which sections need sorting after a certain camera movement to avoid the performance cost of sorting everything every frame. The specific sorting implementation which is often bitonic sort on the GPU (see [this file as an example in cortex's PR](https://github.com/CaffeineMC/sodium-fabric/pull/1687/files#diff-c8701e49048e23ab4b8d3e251550fd1b395bb6c88b85cc4c16813199920549b9)) and merge or insertion sort on the CPU depending on the array size (see [this file for an example in Sodium](https://github.com/CaffeineMC/sodium-fabric/blob/1.19.3/dev/src/main/java/me/jellysquid/mods/sodium/client/util/GeometrySort.java)). For the purposes of compatibility and simplicity, this document will assume a CPU-based implementation but a GPU-based implementation can also benefit from the triggering mechanism. ## Algorithm Introduction and Description The following triggering and sorting algorithm assumes: - There are many translucent faces spread out throughout the world - They come in varying orientation but many are axis-aligned - Potentially quantizing the exact orientation of some faces to a factor of something around 360°/64 degrees (approximation compontent of this algorithm) ### Face Geometry ![Diagram of a face and its associated geometry](https://hackmd.io/_uploads/ryEsyTd_2.png) Each face (triangle) $f$ has a normal vector $n$ that determines its orientation. A _face plane_ $P_{nd}$ is the extension of a face into an infinite plane has the same normal vector. Such a face plane is identified only by its normal vector $n$ and its distance $d$ from the world's origin. For every face plane there is a _normal plane_ $P_n=P_{n_0}$ which has the same normal but distance 0 to the origin (lies on it). ![Diagram of multiple faces with a common normal](https://hackmd.io/_uploads/rkVCf6u_3.png) Multiple faces $f$, $g$, and $h$ share the same normal $n$ if they face in the same direction. However, they are distinguished by their distance $d$ to the normal plane. The coplanar faces $g$ and $g'$ even share the same distance $d(g)=d(g')$. Such coplanar faces' ordering in translucency sorting is arbitrary since they can never visually overlap. ### Plane Crossing ![Diagram of a camera crossing two face planes](https://hackmd.io/_uploads/SypMLTO_3.png) Given a camera movement from point $a$ to $b$ (in a single frame presumably), the face planes crossed by the camera can be determined by looking at the distances $d(a)$, $d(b)$ and those of the faces $d(h)$, $d(g)$, and $d(f)$. The camera crossed the _face planes_ $p\in C$ where the plane's distance $d(p)$ is between the start $d(a)$ and end $d(b)$ distance. In this example the camera crossed planes $C=\{g,h\}$ because $d(a)<d(h)<d(g)<d(b)$. With a unit normal vector $n_0=\frac{n}{||n||_2}$ the distance of a point $p$ to the normal plane $P_n$ can be easily calculated with the dot product: $d_{n_0}(p)=n_0\cdot p$. Intuitively this makes sense as points orthogonal to the normal vector naturally must lie within the plane and thus have a distance 0, just as the dot product of two orthogonal vectors is 0. ### Multi-Plane Crossing ![Diagram of three different viewing directions of two faces](https://hackmd.io/_uploads/BJlejTu_3.png) A crossing event refers to the camera moving across a face plane. Finding crossing events is interesting because each face planes forms a potential boundary between at two differing sort orders. In the diagram viewing direction 1 the faces are sorted as $f$ and then $g$ (in back-to-front sorting). In direction 3 they are sorted $g$ and then $f$. In direction 2 it doesn't matter in which order they are sorted as the faces aren't visually overlapping. Determining when the two faces overlap and in which order is hard as soon as there's more than just a few faces in the scene. However, we know that the sort order of faces can only change when the camera crosses the face plane of one of the involved faces. In this case there's two face planes and the camera moving from direction 1 to direction 3 would generate two crossing events. Since there are only two different sort orders, one of the crossing events is redundant. The approximation of sort order changes with face plane crossing events gets better as more different normals and faces are introduced. In complex scenes, almost all crossing events can indicate a change in sort order. ### Global Face Normal Indexing How can all crossing events that a given camera movement between two points generates be determined efficiently? Under the reasonable assumption that the world's translucent geometry stays mostly constant (and sparse updates are applied otherwise), sorted lists of the face planes for each normal can be maintained and traversed to find crossed face planes. This is the main idea of this approach: Within a sorted list $L_n$ of face planes all sharing the same normal $n$, finding crossing events is easy because there's simply a distance range, namely the range between the start and end distance of the camera with regards to $n$, within which all face planes have been crossed. #### In Detail After a section is meshed and prepared for rendering, the normal of each translucent face is known. Let $f$ be a translucent face of the geometry with some normal $n$ and distance $d_n(f)$ to its normal plane $P_n$. For each possible normal $n$ in the world there's a list $L_n$ of faces with this normal. The faces in this list are sorted by their distance $d_n(f)$. Each face $f$ with normal $n$ is inserted into $L_n$ at a position that maintains the sorting order. These lists could be realized as tree heaps (`TreeSet` or similar) for efficient updates that maintain their internal ordering. A normal vector that is points in exactly the opposite (colinear) direction of another one is counted as the same, but the distances are negative. This negative distance naturally comes from the distance calculation $d_n(p)=p\cdot n$. ![Diagram of various faces with two different normals and the origin](https://hackmd.io/_uploads/SJx_QhY_2.png) In this diagram there are faces with two different normals $n$ and $m$. As such, there's lists $L_n$ and $L_m$ for each normal. Without giving the exact distances from the diagram, the lists are sorted by ascending distance and faces behind the origin have negative distances: $L_n=[i,h,g,f]$ and $L_m=[t,s,r]$. The datastructure backing each list uses the distance as the key and the face as the stored value, allowing efficient additions and removals as well as range queries. Initially all translucent faces of a section are sorted for camera position $a$ and referenced in their respective lists. Note that the section's sort involves faces with a combination of different normals and distances. Then the camera moves to position $b$ in the next frame. A camera movement $a\neq b$ means the system needs to figure out which sections' sort order have been invalidated. Each face $f$ references the section $s(f)=S$ it came from. $S$ is a set of faces in a chunk section that can have different normals and distances. In the implementation this means that instead of just using a tree heap of faces, the lists $L_n$ store the tuple $(f, S)$ or otherwise make $S$ reachable through the entry of $f$ in $L_n$. ![Diagram of various faces with two different normals, the origin, and a camera movement from a to b](https://hackmd.io/_uploads/H1wKJH9On.png) For each normal $n$ that has a non-empty list $L_n$, the new distance $d_n(b)$ for the camera is calculated. The previous distance $d_n(a)$ is stored and re-used from the previous frame. For each section, an empty list $I_S$ that stores the crossed faces is initialized. All faces that the camera crossed with regards to normal $n$ are determined using a range query $L_n[d_n(a),d_n(b)]$. The entries in $L_n$ are dereferenced to their sections $S$ and stored into their respective $I_S$. After finding all plane crossings, the sections that had crossings and thus have entries in their $I_S$ need to be re-sorted. The sorting can be performed with any algorithm but a few special cases that improve performance are considered in the next section. In the diagram's example, faces $h$ and $g$ of normal $n$ and face $r$ of normal $m$ have been crossed. This is the result of doing a $[d_n(a), d_n(b)]$ range query on $L_n=[h,g,f]$ and a $[d_m(a), d_m(b)]$ range query on $L_n=[s,r]$. As mentioned in the beginning, there can potentially be a large number of different normals in the scene. In Vanilla Minecraft this number is probably low, but could be increased through resource packs or mods. In order to keep the number of diferent normals and associated normal lists $L_n$ reasonably small, the normals can be quantized. This doesn't yield a large saving without drastic quanitzation, but is important to avoid iterating through thousands of different faces individually. An additional optimization can be applied to reduce the number of relevant normal lists for common camera movements and is discussed in a later section. ## Optimizations and Special Cases ### Triggering with Backface Culling In Minecraft faces are only visible in when viewed from where their normal points. If the camera is looking in the same direction as the normal, the face isn't rendered. This allows translucency sorting to be a little lazy in sorting faces that aren't visible anyways because they're facing in the wrong direction. ![Diagram of two blocks that have outwards facing normals, for demostrating backface culling](https://hackmd.io/_uploads/rytckl7Yn.png) This diagram shows a common situation in a simplified form: Two "blocks" of translucent blocks adjacent to eachother. Internal faces between the individual blocks aren't rendered at all. There are two outwards facing faces and one internal bidirectional face. Here there are two normals $n$ and $m$. Camera movement from $a$ to $b$ goes in the same direction as $m$ but not $n$. Faces in face planes that are crossed against their normal are no longer visible after the camera movement, such as faces $f$ and $g$, and thus no sort triggering needs to happen since they don't matter for rendering anyways. In this example, a sort trigger happens for crossing the face planes of faces $f$ and $g$ but not $i$. Since $g$ and $h$ lie in the same plane, $h$ is effectively also triggered on but doesn't need to be considered during sorting. The implementation simply keeps separate normal lists $L_n$ and $L_m$ for $n_0\neq m_0$ even if $n_0=-m_0$. Splitting up normal lists by normal direction easily avoids processing faces that point in uninteresting directions. Faces that are bidirectional with normal $k$ are added two both $L_k$ and $L_{-k}$. If they are actually just two mirored faces, the system can process them as separate and trigger when the face plane is crossed in both directions. ### Special Sorting Cases These special cases are referenced by their letter in the implementation. There are three different outcomes for a newly built section: The section can be excluded from the normal lists entirely, its faces can be statically sorted by their normal distances (and not introduced to normal lists either), or it can be added to normal lists and dynamically sorted on triggering events. An interesting triggering case is when only a single face of a section was crossed. In this case, sorting can be expedited by simply moving that face to its sorted position without re-sorting all the other faces relative to eachother. This case doesn't include multiple faces on the same plane being crossed as they aren't necessarily adjacent in the sort order their ordering in the new sort order can be unrelated to the previous ordering. (In such cases special case D often applies.) #### A: Empty or Single Face-Plane Nothing needs to be done if a setion has no translucent faces. If there is only a single quad in a section, it doesn't even need to be introduced to the translucency sorting system as the sort order always stays the same. The same is true for multiple faces that all lie on the same plane within a section. While there are different ways such a set of faces can be sorted, they cannot overlap and thus sorting them is irrelevant. #### B: Two Face Planes with Opposing Normals If there are only two normals $n$ and $m$ with $n=-m$ and each only one face plane, nothing needs to happen since no face can be seen through another one. This is a common case for the ocean and other bodies of water so that translucency sorting only becomes relevant for flowing water or player-made structures. #### C: Faces on a Convex Hull If the translucent faces are on the surface of the convex hull of all translucent faces in the section and face outwards, then there is no way to see one through another. Since convex hulls are hard, a simpler case only uses the axis aligned normals: Under the condition that only aligned normals are used in the section, tracking the bounding box of the translucent geometry (the vertices) in the section and then checking if the normal distances line up with the bounding box allows the exclusion of some sections containing a single convex translucent cuboid (of which not all faces need to exist). #### D: Two Opposing Normals (General Case) If there are only two opposing normals but with more than one face plane each, the faces do need to be sorted, but not dynamically. First of all with opposing normals, only faces of the same normal can be seen through each other. Furthermore, faces are only seen through each other in the same order, since faces are only visible through their front side. A static sort that orders faces by their normal distance is correct for any view direction. The sort only needs to be updated when the section is rebuilt, just as it is with other cases. ### Optimization for Picking Normal Lists If there are many normal lists, it can be beneficial to avoid iterating through all of them and performing potentially unnecessary range queries. A normal list $L_n$ can be excluded from consideration for a particular camera movement $a$ to $b$ early if it is known ahead of time that there is no face $f$ such that $d_n(a)<d_n(f)<d_n(b)$, assuming $d_n(a)<d_n(b)$. The other direction behaves identically. If the new distance $d_n(b)$ is calculated for a list $L_n$, it can be exluded if $d_n(b)<d_\mathrm{next}$ where $d_\mathrm{next}$ is the distance for the next closest face in $L_n$. This avoids performing a range query on $L_n$ if $d_\mathrm{next}$ can be saved from the last query performed on this list. Furthermore, to avoid calculating $d_n(b)$, the overapproximation $d_{(1,1,1)}(b)\geq d_n(b)$ can be used. We've been assuming unit normal vectors with $||n||=1$ but in this case using a non-unit normal vector $(1,1,1)$ gives the maximum possible distance $b$ can have with regards to any normal vector. If $d_{(1,1,1)}(b)<d_\mathrm{next}$, then the corresponding normal list can be excluded and $d_n(b)$ doesn't even need to be calculated. This optimization is useful for the common case of sparsely populated normal lists with large gaps. Especially normal lists for rare off-axis normal vectors could benefit from this optimization. Additonally, if this makes working with a larger number of normal lists feasible, quantization of normal vectors may be unnecessary or only necessary in small amounts. With large numbers of normal lists, an additional structure that can quickly query the set of normal lists that have a $d_\mathrm{next}$ above a calculated threshold may be necessary. ### Data Structures for Normal Lists There's a number of data structures that could be used to implement the normal lists. There's a two functions a normal list needs to support: - Updates that add or remove faces at any position - Range queries for faces. They are connected such that $b_1=a_2$ for any two queries $[a_1, b_1]$ and $[a_2, b_2]$. In the following $n$ refers to the number of items in a normal list and $k$ to the number of faces within a queried range. #### Tree Heaps A tree heap like `TreeSet` support $O(\log n)$ updates and $O(k \log n)$ range queries. Generally these use red-black trees internally and can deal with very sparse data, which is expected to occur. Trees have the caveat of indirection overhead and being potentially spread out over the Java heap. #### Sorted Array Lists A resizable array list like `ArrayList` supports only $O(n\log n)$ updates when they aren't at the end of the list or the array needs to be resized. Using the connected property of the range queries, they are supported in $O(k)$ with the initial search for the starting position taking $O(\log n)$ for a binary search. Since this option has such a bad update complexity, it shouldn't be considered unless that caveat can be mitigated. #### Sparse Arrays Instead of using a compact array, @burger suggested to use sparse arrays that are big enough to hold faces at any possible distance in buckets. This means that for a given render distance, all normal lists would be resized to fit all possible quanitzed plane distances. At each distance, such a list would then refer to a list of faces that are at that distance. The problem with quantizing the distance of faces to full or potentially even half blocks is that it could introduce visual artifacts when a sort is delayed for too long. However, since the plane crossing isn't the exact boundary on which a sort order changes necessarily, this may be a minimal problem. The caveat with this approach is a large memory usage. Sparse arrays support updates in $O(1)$ (if the sub-lists within each bucket are efficient). Range queries take $O(n)$ since the whole array might have to be iterated since it can have many empty spots. However, since camera movement is usually not very fast, this would only be a small fraction of $n$. Interestingly the index of the initial camera position can be calculated instead of searched for with a binary search since it directly maps buckets to world space. ### Efficient Updates of Whole Sections When a section is updated, its entire geometry is re-calculated and a new list of faces is produced. The following structure allows updates to be performed efficienty by separating the storage of faces within each normal list into groups. This also includes adding or removing the sections of a chunk when its loaded or unloaded. Since only translucent faces are relevant to this algorithm, the normal lists only need to be touched at all if a section update makes changes to translucent faces. If checking for such a change would involve comparing all the faces individually, a hash consisting of the number of translucent faces and their vertices could be stored and tested against. When the faces of different sections are stored mixed together with those of other sections, an update involves removing the previous set of faces and then adding the new set of faces. This results in a lot of tree rebalancing (in the case of using a tree heap) even though most of the work being performed is probably unnecessary as usually only few of the faces change at once. If many faces change but the number of translucent faces remains similar, a tree would still potentially perform unnecessary operations to grow and shrink. #### Structure Instead of storing all faces in one big heap, normal lists can store them grouped by section instead. Each group has a simple tree of the face plane distances and is represented by an interval of the minimum and maximum distances it contains. The normal list stores the groups keyed by their intervals in an interval tree, which supports range queries that return all intersecting intervals. A stored "face" may actually refer to any number of faces stored together if they all lie in the same face plane. If face planes are stored within (section-)groups, storing each individual face becomes unnecessary since the group already gives the associated section. Thus, if groups are used, only storing unique face plane distances is necessary. #### Queries To perform a range query on a normal list with this structure the set of overlapping groups are queried from the interval tree. Within each group the face planes that fall within the range are triggered. There are cases where the interval of a group fully encompasses the query interval without having any distances to trigger. Only distances that actually fall within the query interval are triggered. It might be enticing to eliminate the storage of face planes in the normal lists entirely, but it would lead to unnecessary triggering. One could concievably simply store the groups of faces from a section as ranges of minimum and maximum distances with regards to the normal in a segmenet tree. This would yield good results if the range of distances given by the camera movement partially intersects a group's distance range. However, if the camera only moves a small amount, such as 1 block as it is common, the missing information that a particular section might not even have translucent face planes within this range, causes many more trigger events to happen than necessary. #### Updates of Whole Sections The following procedure describes an update: - Proceed if the section's update changes translucent geometry - Proceed if the section requires to be entered into normal lists (see special cases) - For each translucent face $f$, determine its normal $n$ and distance $d_n(f)$. This yields the face planes $P_{nd}$ of the section. - For each normal $n$ of a face plane $P_{nd}$ with distance range $[d_\min,d_\max]$ of the section: - Within $L_n$ find the group for this section - If an update is necessary and the group exists, remove the existing group - Construct a new group of face planes (with normal $n$) for this section by adding them to a tree heap or other size-appropriate structure - Add the new group to the normal list's interval tree ![Diagram of the data structure with normal lists, buckets, groups, and face planes](https://hackmd.io/_uploads/HJKQIVx2h.png)