Render|Frame Graph

{%hackmd theme-dark %}

# Render|Frame Graph

## TODO List

- [ ] Blog intro frame graph
- [ ] Mini RenderGraph (Vulkan|DX) => Example code without/with
- [ ] Crash Course Render Dependency Graph Unreal
- [ ] What is ring allocation?
- [ ] Show the complete (DX12) graphics pipeline
- [ ] Talk about the Mesh Shader Stage and Amplification Shader Stage (parts of the graphics pipeline)
- [ ] Talk about FLOPs (floating-point operations) based on precision

![](https://i.imgur.com/MMijnY7.png)
https://www.techpowerup.com/gpu-specs/quadro-rtx-3000-mobile.c3428

## Introduction

A Render Graph (or Frame Graph) is a **high-level** representation of the graphics operations needed to render a scene. In other words, it is basically a **wrapper** around one or more low-level **graphics APIs** (Vulkan, DirectX12, OpenGL, ...).

Examples:

- <a href="https://www.gdcvault.com/play/1024612/FrameGraph-Extensible-Rendering-Architecture-in">Frostbite (DICE): Frame Graph</a>
- <a href="https://www.gdcvault.com/play/1024612/FrameGraph-Extensible-Rendering-Architecture-in">Unreal Engine (Epic Games): Render Dependency Graph</a>
- <a href="https://www.gdcvault.com/play/1024612/FrameGraph-Extensible-Rendering-Architecture-in">Anvil Engine (Ubisoft): Frame Graph</a>

## Properties

- Each render operation gets abstracted to a **single way to produce render code**.
- Clearer and lighter code
- Easier to debug (when the implementation of a render operation is sound, the bug is in the call to it)
- **Hides many low-level operations** (memory allocation, resource state transitions)
- Low-level and high-level code are optimized independently.

### Recent 3D APIs

**D3D12|Vulkan**:

→ Manage **resource states**:

![](https://i.imgur.com/HmQSvUf.png)
(https://learn.microsoft.com/fr-fr/windows/win32/direct3d12/using-resource-barriers-to-synchronize-resource-states-in-direct3d-12#implicit-state-transitions)

→ Manage **relative transitions**: switching a resource from one state to another.
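As a rough illustration of what managing states and transitions involves, here is a minimal, API-agnostic sketch. `ResourceState`, `Transition` and `StateTracker` are hypothetical names invented for this example; they are not the real D3D12 or Vulkan types. The tracker remembers the current state of each resource and records a transition only when a pass needs a different state.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified states (not the real D3D12/Vulkan enums).
enum class ResourceState { Common, RenderTarget, DepthWrite, ShaderResource, UnorderedAccess };

using ResourceId = std::uint32_t;

// One state change for one resource.
struct Transition {
    ResourceId resource;
    ResourceState before;
    ResourceState after;
};

class StateTracker {
public:
    // Declare the state a pass needs. Unknown resources start in Common.
    // A transition is recorded only when the state actually changes.
    void require(ResourceId id, ResourceState needed) {
        auto it = current_.emplace(id, ResourceState::Common).first;
        if (it->second == needed) return;
        pending_.push_back(Transition{id, it->second, needed});
        it->second = needed;
    }

    // Return every transition recorded since the last flush and clear the list.
    std::vector<Transition> flush() {
        std::vector<Transition> out;
        out.swap(pending_);
        return out;
    }

private:
    std::unordered_map<ResourceId, ResourceState> current_;
    std::vector<Transition> pending_;
};
```

In a real renderer, each recorded `Transition` would presumably be translated into the API's own barrier structure before submission; that translation is left out here.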
Consider the following example:

![](https://i.imgur.com/Tp7Fmgx.png)

→: Read
→: Write

The **Render Graph** has information about these interactions, so it can figure out the best place for each **barrier transition**. In general, the more **resource transitions** we can **group** into a single call, the fewer calls we make to the SDK and the faster the program runs.

***

### Minimize And Group Resource Barriers

For example, running a shadow mapping algorithm:

- Pass 1: Outputs the **depth buffer**
- Pass 2 (Compute Shader Pass): Reads the **depth buffer** per-pixel (UAV: UnorderedAccessView, representing ReadWrite data) and outputs the **shadow buffer**
  ***Warning:*** ***Textures can be read per-pixel and be ReadWrite, but `Sample()` does not work on them.***
- Pass 3: Takes the **depth buffer** and the **shadow buffer** as input.

The transitions of the **depth buffer** and the **shadow buffer** from `write` → `read`, needed to use them as shader resources in `Pass 3`, should be grouped.

![](https://i.imgur.com/SiBBgH5.png)

It is possible to group **resource barriers** from **different command queues**. (AAA engines have at least 3 command queues: a **graphics**, a **compute** and a **copy** queue.) The main command queue is the graphics one → queues need a **synchronisation mechanism** to communicate with each other.

Optimization of resource transitions: https://levelup.gitconnected.com/organizing-gpu-work-with-directed-acyclic-graphs-f3fd5f2c2af3

***

### Manage And Optimize Resource Memory

![](https://i.imgur.com/NBiCWHw.png)

Here `Resource A` stops being used in pass 3 and `Resource C` is used in pass 4, which means their lifetimes do not overlap, so the same memory can be used for `A` and `C` (same for `A` and `D`).

**Types of resources:**

- **Graph|Transient resources:** Used on a per-frame basis (*GBuffer*, *Deferred lighting pass*, ...) → their lifetime can be **fully handled** by the render graph.
- **External resources:** Resources whose lifetime depends on systems outside the graph (e.g. the [**swapchain**](#Swapchain) back buffer) → the **render graph** only manages their **states**.

#### Transient Resource System

**Transient resources** are owned by the *RDG* and last for a maximum of one frame.
→ High potential for **memory re-use** → resource lifetimes are then used to apply [**resource aliasing**](#Resource-aliasing) on [**placed resources**](#Placed-resource).

### Parallel Command List Recording

Check:

- https://learn.microsoft.com/en-us/windows/win32/direct3d12/user-mode-heap-synchronization
- https://devblogs.microsoft.com/directx/gpus-in-the-task-manager/#:~:text=GPU%20engines%20are%20made%20up,of%20that%20engine's%20underlying%20cores.

**GPU Engines**:

- **3D Engine**: Processes and renders 3D graphics. It contains specialized hardware for geometry processing, rasterization and pixel shading, which work together to turn 3D objects into a 2D image that can be displayed on screen. The geometry stage transforms vertices, the rasterizer converts the resulting primitives into pixels, and the pixel shader computes the final color of each pixel, applying effects such as texturing, lighting and transparency.
- **Compute Engine**: Performs general-purpose computing on the GPU. It is used for work that does not necessarily involve graphics, such as scientific simulations, data processing and machine learning. It is designed for highly parallel computation: it runs many calculations simultaneously, which suits tasks involving large amounts of data or complex mathematical operations.
- **Copy Engine**: Moves data between the CPU and the GPU, as well as between different areas of GPU memory. It is used to upload data from the CPU to GPU memory, copy data between memory locations within the GPU, and download data from GPU memory back to the CPU. It is designed to transfer data quickly and efficiently, which makes it important for many graphics and compute tasks.

## Glossary

### **Swapchain**:

Vulkan has no concept of a default framebuffer, so we must create an infrastructure that owns the buffers we render to before presenting them on screen. This infrastructure is called the swap chain in Vulkan and must be created explicitly. The swap chain is essentially a queue of images waiting to be displayed. Our application acquires one of the images from the queue, draws to it, then returns it to the queue. How the queue works and the conditions for presentation depend on how the swap chain is configured. However, the main purpose of the swap chain is to synchronize presentation with the refresh of the screen.

### **Resource aliasing**:

Consists of using the **same GPU memory** for different resources whose **lifetimes don't overlap** during frame computation.

### **Placed resource**:

A resource for which the developer specifies the **location of the resource's memory** directly. Placed resources are the **lightest weight** resource objects available, and are the **fastest to create and destroy**.
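To make resource aliasing more concrete, here is a minimal sketch of how non-overlapping lifetimes could be mapped onto shared memory slots. It is a simple greedy assignment, not the algorithm of any particular engine. `Lifetime`, `Slot` and `assignSlots` are hypothetical names, and a "slot" stands for a region of memory in which resources would be placed one after another over the frame.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of when a transient resource is in use.
struct Lifetime {
    int firstPass;     // first pass that touches the resource
    int lastPass;      // last pass that touches the resource (inclusive)
    std::size_t size;  // memory needed, in bytes
};

// A reusable region of memory.
struct Slot {
    int busyUntil;     // lastPass of the latest resource placed in the slot
    std::size_t size;  // largest resource ever placed in the slot
};

// Greedy assignment: reuse the first slot that is free before the resource
// starts, otherwise open a new slot. Returns the slot index per resource.
// Feeding resources ordered by firstPass keeps the packing tight.
std::vector<std::size_t> assignSlots(const std::vector<Lifetime>& resources,
                                     std::vector<Slot>& slots) {
    std::vector<std::size_t> assignment;
    assignment.reserve(resources.size());
    for (const Lifetime& r : resources) {
        assert(r.firstPass <= r.lastPass);
        std::size_t chosen = slots.size();
        for (std::size_t i = 0; i < slots.size(); ++i) {
            if (slots[i].busyUntil < r.firstPass) {  // lifetimes do not overlap
                chosen = i;
                break;
            }
        }
        if (chosen == slots.size()) {
            slots.push_back(Slot{r.lastPass, r.size});
        } else {
            slots[chosen].busyUntil = r.lastPass;
            slots[chosen].size = std::max(slots[chosen].size, r.size);
        }
        assignment.push_back(chosen);
    }
    return assignment;
}
```

With four resources whose lifetimes are passes 1-3, 2-5, 4-6 and 6-7, the first and third end up in one slot and the second and fourth in another, so two regions of memory serve four resources.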
## External resources

- D3D12 Placed resources: https://learn.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12device-createplacedresource
- D3D12 Resource Handling: https://logins.github.io/graphics/2020/07/31/DX12ResourceHandling.html
- D3D12 resource aliasing: https://learn.microsoft.com/en-us/windows/win32/direct3d12/memory-aliasing-and-data-inheritance
- Memory aliasing algorithm: https://levelup.gitconnected.com/gpu-memory-aliasing-45933681a15e
- Resource aliasing: https://gpuopen-librariesandsdks.github.io/D3D12MemoryAllocator/html/resource_aliasing.html
- Vulkan swapchain: https://vulkan-tutorial.com/fr/Dessiner_un_triangle/Presentation/Swap_chain