BVH ideas & algorithms

# BVH ideas & algorithms ## What is BVH (Bounding Volume Hierarchies)? ### Background * We need to send several rays to render one scene. * For each ray, we need to test whether this ray intersect with **any** object. * The naive solution is to **iterate through all objects** in the scene, and test whether this object collides with the ray. * In this way, the ray-object intersection is the main time-bottleneck in a ray tracer, and the time is **linear with the number of objects**. * A better way is to **build a tree** for faster traversing. There are two methods that is widely used in practice: (1) kd-tree and (2) BVH tree. * Because the objects in the scene is bounded, the maximum nodes in BVH tree is at most $2N-1$, where $N$ is the number of nodes. The number of nodes is smaller than that of kd-tree. ### Ideas * Every node in the tree represents a bounding box, and contains information of what objects is in that bounding box. ![](https://i.imgur.com/zNQSAWz.png) * There is no specific meaning in "left" or "right" children. * Bounding volumes can overlap. * The objects contains in left children and right children should be disjoint. * We can test whether a ray hits any object in the scene by this pseudo code: ```cpp if (hits purple) hit0 = hits blue enclosed objects hit1 = hits red enclosed objects if (hit0 or hit1) return true and info of closer hit return false ``` ### Axis-Aligned Bounding Box (AABB) * A good bounding box should be able to determine whether this box intersect with the ray quickly. * A widely used bounding box is Axis-Aligned Bounding Box (AABB). * We can determine whether a ray hits the AABB easily. * Suppose the coordinates of the minimum corner of the AABB is $(x_1,y_1,z_1)$, and the maximum corner of the AABB is $(x_2,y_2,z_2)$. * The coordinate of the ray origin is $(x_o,y_o,z_o)$. * The direction of the ray is $(b_x,b_y,b_z)$. * To simplify calculation, suppose $b_x$, $b_y$, and $b_z$ are all positive. * We can calculate the intersection time $t_{entry}$ and $t_{exit}$ by the following eqations: $$t_{entry_x}= \frac{x_1-x_o}{b_x}\\ t_{exit_x}= \frac{x_2-x_o}{b_x}\\ t_{entry_y}= \frac{y_1-y_o}{b_y}\\ t_{exit_y}= \frac{y_2-y_o}{b_y}\\ t_{entry_z}= \frac{z_1-z_o}{b_z}\\ t_{exit_z}= \frac{z_2-z_o}{b_z}\\ t_{entry}=\max\left( t_{entry_x},t_{entry_y},t_{entry_z} \right)\\ t_{exit}=\min\left( t_{exit_x},t_{exit_y},t_{exit_z} \right)$$ * if $t_{entry}\leq t_{exit}$, then the ray hits the AABB. * Problem arises when any of $b_x$, $b_y$, or $b_z$ is negative, but it can be easily resolved. For example, if $b_x<0$, we just need to swap $x_1$ and $x_2$ in the above eqations to ensure $t_{entry_x}\leq t_{exit_x}$. * $b_x$, $b_y$, and $b_z$ may be zero, resulting in $t=\infty$ or $t=-\infty$. But it can be shown that the result is still correct. ### Surface Area Heuristic (SAH) * There are several ways to split primitives into two disjoint set. * A naive way is choose one axis, and split primitives into two disjoint set by their median (this can be done by calling STL's `std::nth_element`. * However, sometimes it can lead to poor partition, resulting in slower rendering. * Most of the best current algorithms for building acceleration structures for ray-tracing are based on the “surface area heuristic” (SAH). * The SAH model estimates the computational expense of performing ray intersection tests, including the **time spent traversing nodes of the tree** and the **time spent on ray–primitive intersection tests** for a particular partitioning of primitives. * While constructing BVH, we need to evaluate the performance of several partition scheme, choose the heuristically best partition scheme, and check whether partitioning is better than no partitioning. * To achieve this, SAH-based algorithm computed two values, (1) the cost if we divides the primitives, and (2) the cost if we treat current node as a leaf node. * For (2), the SAH cost is computed simply as: $$C_{leaf}=\sum_{i\in S} t_{isect}(i)$$ where $t_{isect}(i)$ is the cost of ray–object intersection with the $i$th primitive, and $S$ is the primitive set of the current node. * For (1), the SAH cost is more complicated: $$C_{A,B}=t_{trav} +p(A|S)\sum_{i\in A}t_{isect}(i) +p(B|S)\sum_{i\in B}t_{isect}(i)$$ * $t_{trav}$ is the time it takes to traverse the current node and determine which of the children the ray passes through. * $A$ is the primitive set of left child. * $B$ is the primitive set of right child. * $S$ is the primitive set of the current node. * $p(A|S)$ is the conditional probability that the ray passes through the bounding box of $A$ given that the ray hits the bounding box of $S$. * $p(B|S)$ is similar to $p(A|S)$. * The probabilities $p(A|S)$ and $p(B|S)$ can be computed using ideas from geometric probability. * It can be shown that for a convex volume $A$ contained in another convex volume $B$, the conditional probability that a uniformly distributed random ray passing through $B$ will also pass through $A$ is the ratio of their surface areas, $s_A$ and $s_B$: $$p(A|B)=\frac{s_A}{s_B}$$ * After choosing best $A$ and $B$ to minimize $C_{A,B}$, we compare $C_{A,B}$ with $C_{leaf}$. If $C_{A,B} > C_{leaf}$, then there is no need to partition the node, we just treat current node as leaf node. Otherwise, we divide $S$ into $A$ and $B$, which result in two children node for current node. * The entire tree can be constructed by recursive method. ## Implementation [madmann91/bvh: A modern C++ BVH construction and traversal library](https://github.com/madmann91/bvh) ### Tree Structure #### `bvh::Bvh` * The structure of BVH tree (`bvh::Bvh`) is as follows: ![](https://i.imgur.com/PXrYzXn.png) * `nodes` is an array of `bvh::Bvh::Node`. * `primitive_indices` is an array of `size_t`, which represent the actual primitive index. * `node_count` indicates how many nodes in the tree. * Each node in `bvh::Bvh` is a `bvh::Bvh::Node`, and the member variable it contains is as follows: ![](https://i.imgur.com/o16noK2.png) * `bounds[0]`, `bounds[2]`, `bounds[4]` are the x-coordinate, y-coordinate, z-coordinate of the minimum corner of the bounding box. * `bounds[1]`, `bounds[3]`, `bounds[5]` are the x-coordinate, y-coordinate, z-coordinate of the maximum corner of the bounding box. * `primitive_count` indicates how many primitive (triangle, sphere, or other "objects") is in this node. If the node is not leaf node, this value will be 0. * `first_child_or_primitive`: if this node is leaf node, then this value is the index to the first primitive contains in this node. Otherwise, this value is the index to the left child of this node. ### Tree Construction #### `bvh::SweepSahBuilder` ![](https://i.imgur.com/Y25fW94.png) * Suppose There are 4 unit spheres in the scene. * Their center are $(2,2,2)$, $(3,3,0)$, $(1,-1,0)$, and $(0,0,0)$, respectively. #### Preprocessing Steps * `bvh::SweepSahBuilder` first create `nodes[0]` in the tree, which is the root node. * The bounding box (`bounds`) of `nodes[0]` is the global bounding box. * Then it use `RadixSort` (a floating point version of radix sort) to sort the primitives based on their center, along x, y, z-axis, and store the result in `sorted_references` array. * After sorting, the result looks like this: ![](https://i.imgur.com/ocXPIib.png) * After sorting, we can start to build the tree recursively. #### Recursive Steps * First, `bvh::SweepSahBuilder` check whether the number of primitives contain in current node is 1 or the depth of the tree exceeds `max_depth` (which is 64 for `bvh::SweepSahBuilder`). If yes, then it treat current node as a leaf node and return. * Otherwise, it iterate over all possible split along x-axis, y-axis, and z-axis, and find the best split which minimize $C_{A,B}$ (refer to [Surface Area Heuristic (SAH)](#Surface-Area-Heuristic-SAH)). * Then it check whether $C_{A,B}\ge C_{leaf}$ (The actual implementation in `bvh::SweepSahBuilder` is a little different, but the logic is the same, see [SAH implementation details](#SAH-implementation-details)). If yes, then it treat current node as a leaf node and return. If not, then it create two children nodes based on the best split and resurse on it. * Before recursing next node, we need to rearrange the elements in `sorted_references` to make sure primitive reference indices in the left child comes before those in the right child (which can be done by `std::stable_partition` in $\mathcal{O}(N)$). ##### SAH implementation details $$\text{best_split_cost}=\text{bbox_half_area}(A_{best})\cdot |A_{best}|+\text{bbox_half_area}(B_{best})\cdot |B_{best}|$$$$\text{max_split_cost}=\text{current_node_bbox_half_area}\cdot(\text{current_node_num_primitives}-1)$$ * If $\text{best_split_cost}>=\text{max_split_cost}$, do not split and make current node as leaf node. * Otherwise, split based on $A_{best}$ and $B_{best}$. * By some rearrangement, we can see that this is the same as what is described in [Surface Area Heuristic (SAH)](#Surface-Area-Heuristic-SAH), with $t_{isect}=t_{trav}=1$. #### Final Result ##### BVH Tree ![](https://i.imgur.com/kpUmYsV.png) ![](https://i.imgur.com/knUMLgr.png) ##### Graphical Representation * Each cuboid is a bounding box. ![](https://i.imgur.com/8iiJ1NK.png) ### Intersection Test #### `bvh::FastNodeIntersector` * This is the AABB intersector used for traversing the BVH tree. The implementation is basically the same as what is described in [Axis-Aligned Bounding Box (AABB)](#Axis-Aligned-Bounding-Box-AABB). #### `bvh::ClosestPrimitiveIntersector` * An primitive intersector with `any_hit == false`. Stores intersected primitive index and intersection information. #### `bvh::AnyPrimitiveIntersector` * An primitive intersector with `any_hit == true`. Only stores the distance to the intersected primitive. #### `bvh::Triangle.intersect()` * The implementation is based on Moeller-Trumbore test. ![](https://i.imgur.com/2So5NFT.png) * Notice that $\mathbf{e_1}=\mathbf{p_0}-\mathbf{p_1}$, which is different from we usually see. * Using barycentric coordinates $(u,v)$, we can write down the ray-triangle intersection equation:$$\mathbf{p_0}+u(\mathbf{p_1}-\mathbf{p_0})+v(\mathbf{p_2}-\mathbf{p_0})=\mathbf{o}+t\mathbf{d}$$Rearranging the terms gives, $$ -u\mathbf{e_1}+v\mathbf{e_2}+t(-\mathbf{d})=\mathbf{o}-\mathbf{p_0}\\ u\mathbf{e_1}-v\mathbf{e_2}+t\mathbf{d}=\mathbf{p_0}-\mathbf{o}$$Writing in matrix, $$\begin{bmatrix} \mathbf{e_1} & -\mathbf{e_2} &\mathbf{d} \end{bmatrix} \begin{bmatrix} u\\ v\\ t \end{bmatrix} = \mathbf{p_0}-\mathbf{o}$$The solution can be abtained by Cramer's rule: $$\begin{bmatrix} u\\ v\\ t \end{bmatrix} = \frac{1}{ \begin{vmatrix} \mathbf{e_1} & -\mathbf{e_2} & \mathbf{d} \end{vmatrix} } \begin{bmatrix} \begin{vmatrix} \mathbf{p_0}-\mathbf{o} & -\mathbf{e_2} & \mathbf{d} \end{vmatrix} \\ \begin{vmatrix} \mathbf{e_1} & \mathbf{p_0}-\mathbf{o} & \mathbf{d} \end{vmatrix} \\ \begin{vmatrix} \mathbf{e_1} & -\mathbf{e_2} & \mathbf{p_0}-\mathbf{o} \end{vmatrix} \end{bmatrix}$$ Rearrange the terms inside determinants: $$\begin{bmatrix} u\\ v\\ t \end{bmatrix} = \frac{1}{ \begin{vmatrix} \mathbf{d} & \mathbf{e_1} & -\mathbf{e_2} \end{vmatrix} } \begin{bmatrix} \begin{vmatrix} -\mathbf{e_2} & \mathbf{d} & \mathbf{p_0}-\mathbf{o} \end{vmatrix} \\ \begin{vmatrix} \mathbf{e_1} & -\mathbf{d} & \mathbf{p_0}-\mathbf{o} \end{vmatrix} \\ \begin{vmatrix} \mathbf{p_0}-\mathbf{o} & \mathbf{e_1} & -\mathbf{e_2} \end{vmatrix} \end{bmatrix}\\ \begin{bmatrix} u\\ v\\ t \end{bmatrix} = \frac{1}{ \begin{vmatrix} \mathbf{d} & \mathbf{e_1} & \mathbf{e_2} \end{vmatrix} } \begin{bmatrix} \begin{vmatrix} \mathbf{e_2} & \mathbf{d} & \mathbf{p_0}-\mathbf{o} \end{vmatrix} \\ \begin{vmatrix} \mathbf{e_1} & \mathbf{d} & \mathbf{p_0}-\mathbf{o} \end{vmatrix} \\ \begin{vmatrix} \mathbf{p_0}-\mathbf{o} & \mathbf{e_1} & \mathbf{e_2} \end{vmatrix} \end{bmatrix}$$ From linear algebra, there is a formula which connects determinant and cross product: $$\begin{vmatrix} \mathbf{x} & \mathbf{a} & \mathbf{b}\end{vmatrix}=\langle \mathbf{x} , \mathbf{a} \times \mathbf{b} \rangle$$We can leverage this formula to simplify the eqation: $$ \begin{bmatrix} u\\ v\\ t \end{bmatrix} = \frac{1}{ \langle\mathbf{d}, \mathbf{e_1} \times\mathbf{e_2}\rangle } \begin{bmatrix} \langle\mathbf{e_2}, \mathbf{d}\times (\mathbf{p_0}-\mathbf{o})\rangle \\ \langle\mathbf{e_1}, \mathbf{d} \times(\mathbf{p_0}-\mathbf{o})\rangle \\ \langle\mathbf{p_0}-\mathbf{o} , \mathbf{e_1} \times \mathbf{e_2}\rangle \end{bmatrix}$$Notice that there are several duplicated terms, so we can define these three vectors: $$\mathbf{c}=\mathbf{p_0}-\mathbf{o}\\ \mathbf{r}=\mathbf{d}\times\mathbf{c}\\ \mathbf{n}=\mathbf{e_1}\times\mathbf{e_2}$$Substitute back to the equation: $$\begin{bmatrix} u\\ v\\ t \end{bmatrix} = \frac{1}{ \langle\mathbf{d}, \mathbf{n}\rangle } \begin{bmatrix} \langle\mathbf{e_2}, \mathbf{r}\rangle \\ \langle\mathbf{e_1}, \mathbf{r}\rangle \\ \langle\mathbf{c} , \mathbf{n}\rangle \end{bmatrix}$$ which is how `bvh::Triangle.intersect()` is implemented :) #### `bvh::Sphere.intersect()` under construction... ### Tree Traversal #### `bvh::SingleRayTraverser` ##### Initialization Step * First, check whether the root node is a leaf node. If yes, then call `intersect_leaf` on root node and return. If not, push the root node to stack and goto [Loop Step](#Loop-Step-Loop-Until-Stack-is-Empty). * The implementation of `intersect_leaf` is simple. Just iterate through all primitive in that leaf node, and return the closest hit. (If `any_hit == true`, return immediately if hit any primitive.) ##### Loop Step (Loop Until Stack is Empty) * Initialize. * `curr_node = stack.pop()` * `left_child = address of curr_node's left child` * `right_child = address of curr_node's right child` * Check if the ray hit the bounding box of the left child. * If **yes** and **the left child is a leaf node**, call `intersect_leaf` on it and set `left_child = nullptr`. (If `any_hit == true` and the ray intersect anything in the left node, return.) * If **the ray doesn't hit the bounding box**, set `left_child = nullptr`. * Check if the ray hit the bounding box of the right child. * If **yes** and **the right child is a leaf node**, call `intersect_leaf` on it and set `right_child = nullptr`. (If `any_hit == true` and the ray intersect anything in the right node, return.) * If **the ray doesn't hit the bounding box**, set `right_child = nullptr`. * If `left_child != nullptr && right_child != nullptr` * Check which bounding box is closer. * If left bounding box is closer, then push `right_child` into stack. After that, push `left_child` into stack. (To ensure left child will be traversed earlier) * If right bounding box is closer, then push `left_child` into stack. After that, push `right_child` into stack. (To ensure right child will be traversed earlier) * If `left_child != nullptr && right_child == nullptr`, push `left_child` into stack. * If `left_child == nullptr && right_child != nullptr`, push `right_child` into stack. * Goto [Loop Step](#Loop-Step-Loop-Until-Stack-is-Empty). ##### Final Step * Return best hit. ##### Note * The traversal loop is "eager", meaning that it immediately processes leaves instead of pushing them on the stack. * In the actual implementation, instead of pushing topmost node and poping it in the next loop step, it omit this redundant step by saving the topmost node directly into ```left_child``` variable. ## References [Ray Tracing in One Weekend Series](https://raytracing.github.io/) [Bounding Volume Hierarchies](https://pbr-book.org/3ed-2018/Primitives_and_Intersection_Acceleration/Bounding_Volume_Hierarchies) [madmann91/bvh: A modern C++ BVH construction and traversal library](https://github.com/madmann91/bvh) [Lecture 13: Spatial Data Structures (CMU 15-462/662) - YouTube](https://www.youtube.com/watch?v=NF7r-pC8fFc) [Fast Minimum Storage Ray-Triangle Intersection (Möller–Trumbore intersection algorithm)](http://www.graphics.cornell.edu/pubs/1997/MT97.pdf)