![In-GPU-we-Rust](https://hackmd.io/_uploads/B1y03o3gJe.jpg)
---
## `whoami`
![rust-projects](https://hackmd.io/_uploads/rkw__tEbyx.png)
<!-- 12 years of experience -->
---
## Agenda
1. Landscape of GPU abstractions
2. History of *wgpu*
3. *Blade* of difference
---
## GPU abstractions
![safe/lightweight/portable](https://hackmd.io/_uploads/rJRR3QT6C.png)
----
### Map of Portability
![platform availability](https://github.com/kvark/slides/raw/b3cbeaa4704dd6090cf633e5b559390e744f6c1c/md/ProcessingShaders/PlatformsMap.jpeg)
----
![gpu-safety](https://hackmd.io/_uploads/HkSnocSZkg.jpg)
<!-- is it safe to access memory? Browser definition -->
<!-- undefined behavior vs undefined data -->
----
<!-- .slide: style="text-align: left; font-size: 24px; margin-left: 60px; " -->
### case: glow
Purity: :heavy_check_mark:
Safety: :grey_question:
- OpenGL is safe, but Rust API is not
Backends: *GL/GLES/WebGL*
- no compute on Apple platforms
Overhead: :question:
- API itself is close to zero overhead
- but actual platforms may involve translation
Ergonomics: *AA+*
- relatively small API
- boilerplate related to bindings and framebuffers
Downloads: every *8 seconds*
----
<!-- .slide: style="text-align: left; font-size: 24px; margin-left: 60px;" -->
### case: Ash
Purity: :heavy_check_mark: (no shader solution)
Safety: :x:
Backends: *Vulkan*
Overhead: :heavy_check_mark:
Ergonomics: *A*
Downloads: every *9 seconds*
- is a dependency of many others
----
<!-- .slide: style="text-align: left; font-size: 24px; margin-left: 60px;" -->
### case: Vulkano
![vulkano-logo](https://github.com/vulkano-rs/vulkano/blob/master/logo.png?raw=true =15%x)
Purity: :heavy_check_mark: host, :x: shader processing (3rd party C++)
Safety: :heavy_check_mark: host, :x: shaders, relies on robust buffer/image access
Backends: *Vulkan*
Overhead: :zzz:
- every draw/dispatch is iterating all the used resources
- actual commands are recorded at the end of the pass
Ergonomics: *AA*
- automatic barriers, a bit of type sugar
Downloads: every *2.5 minutes*
----
<!-- .slide: style="text-align: left; font-size: 24px; margin-left: 60px;" -->
### case: wgpu
![wgpu-logo](https://github.com/gfx-rs/wgpu/blob/trunk/logo.png?raw=true =15%x)
Purity: :heavy_check_mark: (includes shader solution via `naga`)
Safety: :heavy_check_mark: (includes shader instrumentation)
Backends: *Vulkan*, *D3D12*, *Metal*, *GL*, *WebGPU*, *WebGL2*
Overhead: :zzz:
- tracking every bind group setup
- actual commands are recorded at the end of the pass
Ergonomics: *AAA*
- simple specification
- automatic state tracking
Downloads: every *12 seconds*
----
<!-- .slide: style="text-align: left; font-size: 24px; margin-left: 60px;" -->
### case: wgpu-hal
Purity: :heavy_check_mark: (includes shader solution via `naga`)
Safety: :x:
Backends: *Vulkan*, *D3D12*, *Metal*, *GL/GLES/WebGL2*, *WebGPU*
Overhead: :heavy_check_mark: (directly mapped)
Ergonomics: *A+*
- a bit simpler than Vulkan
Downloads: every *12 seconds* (same as wgpu)
----
<!-- .slide: style="text-align: left; font-size: 24px; margin-left: 60px;" -->
### case: Blade
![blade-logo](https://github.com/kvark/blade/blob/main/docs/logo.png?raw=true =15%x)
Purity: :heavy_check_mark: (includes shader solution via `naga`)
Safety: :x:
Backends: *Vulkan*, *Metal*, *GLES/WebGL2*
Overhead: :heavy_check_mark: (directly mapped)
GPU penalty: :question: (to be discussed)
Ergonomics: *AAA+*
- doesn't involve any bind group layout business
- no resource states or barriers
- but requires manual resource destruction
Downloads: every *15 minutes*
----
### Ergonomics scale
![ergonomics](https://hackmd.io/_uploads/SkOtDBaaA.png)
<!-- drops portability and overhead -->
---
![logo](https://github.com/gfx-rs/wgpu/blob/trunk/logo.png?raw=true =50%x)
----
### wgpu: Implementation of WebGPU
![webgpu-problem](https://hackmd.io/_uploads/HyPBTKne1l.png)
----
### WebGPU: Targets
![wgpu-intersection](https://hackmd.io/_uploads/HJuEyjHbye.png)
----
### wgpu: History
![wgpu-history](https://hackmd.io/_uploads/SJDB4U3lyx.png)
----
### wgpu: Architecture
![wgpu-graph](https://hackmd.io/_uploads/r1lZA9rWJl.png)
----
### wgpu: Safety
Core idea: *validating correctness takes as much computation as providing it*.
<!-- not obvious, needed to be experimentally proven -->
----
### wgpu: Synchronization
![wgpu-usages](https://hackmd.io/_uploads/ryPf0Fneyx.png)![wgpu-sync](https://hackmd.io/_uploads/Hk3mAFhlye.png)
----
### WebGPU Shading Language
![webgpu-shading-language2](https://hackmd.io/_uploads/ByZw6_hlJx.jpg)
----
#### WGSL: Motivation
- one of the drivers behind the early Web was the ability to _inspect/edit/write_ pages directly.
- no existing shading language was designed for safety and the absence of UB.
- GLSL is outdated, the SPIR-V spec is difficult, everything else is poorly specified...
Naga demonstrates GLSL -> SPIR-V translation in just 1.5 ms per shader.
<!-- in any case SPIR-V fork would require a spec -->
----
#### naga: Architecture
![naga-architecture](https://hackmd.io/_uploads/r1ooC5Bbkx.png)
----
### wgpu: Conclusion
- most mature, portable, well specified
- pretty fast, and the only truly safe option
![vangers debug](https://github.com/kvark/slides/raw/b3cbeaa4704dd6090cf633e5b559390e744f6c1c/md/WgpuChallenges/vangers-raymax-debug.png)
---
## blade
Lean and mean graphics API
![](https://i.imgur.com/uXB8rPj.png)
----
### blade: Motivation
- it's not always worth it to provide the driver with all the info ahead of time.
- lots of workflows are leaning toward *compute-only*, e.g. 2D graphics rendering, ray tracing, neural networks.
- most API complexity is from rasterization.
- modern APIs are too verbose.
----
![Screenshot 2024-10-28 222030](https://hackmd.io/_uploads/rJqCex0gJg.png)
----
### blade: Principles
1. hacking graphics should be fun!
- we can live without resource barriers
- shader resource layouts can be simpler
- uniforms are just data
2. simplicity >> safety
- no runtime validation
- copyable handles
<!-- user-facing abstraction should be safe,
the question is - at what level is this enforced?
Arguably, GPU API level isn't the best -->
----
![validation](https://hackmd.io/_uploads/r1ggZ0SWkg.jpg)
----
### blade: Look, ma, no bindings!
Shader:
```rust
var<storage,read_write> particles: array<Particle>;
var<uniform> parameters: Parameters;
```
Host:
```rust
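// Fields of `MainData` line up with the shader variables of the same name.
pc.bind(0, &MainData {
    particles: particle_buffer.into(),
    // uniforms are passed as plain data
    parameters: Parameters {
        my_uniform: [1, 2, 3, 4],
    },
});
pc.dispatch([group_count, 1, 1]);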
```
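----
### blade: Binding by name (sketch)
A minimal, self-contained sketch of the idea on the previous slide: host struct fields are matched to shader globals by name, so no explicit binding indices or layouts are written by hand. This is not Blade's actual API; `Binding`, `ShaderData`, `fields` and `resolve` are hypothetical names made up for illustration, and the buffer id is a plain integer standing in for a real GPU handle.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for something bound to a shader.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Binding {
    /// Made-up buffer id, standing in for a real GPU buffer handle.
    Buffer(u32),
    /// Uniforms are just data.
    Uniform([u32; 4]),
}

/// Hypothetical trait: a struct exposes its fields as (name, binding) pairs.
/// A derive macro could generate this; here it is written by hand.
trait ShaderData {
    fn fields(&self) -> Vec<(&'static str, Binding)>;
}

struct MainData {
    particles: u32,
    parameters: [u32; 4],
}

impl ShaderData for MainData {
    fn fields(&self) -> Vec<(&'static str, Binding)> {
        vec![
            ("particles", Binding::Buffer(self.particles)),
            ("parameters", Binding::Uniform(self.parameters)),
        ]
    }
}

/// Match host fields to shader globals by name.
/// Slots are assigned in the order the shader declares its globals.
fn resolve(
    shader_globals: &[&str],
    data: &dyn ShaderData,
) -> Result<Vec<(u32, Binding)>, String> {
    let by_name: HashMap<&str, Binding> = data.fields().into_iter().collect();
    shader_globals
        .iter()
        .enumerate()
        .map(|(slot, name)| {
            by_name
                .get(name)
                .copied()
                .map(|binding| (slot as u32, binding))
                .ok_or_else(|| format!("no host field for shader global `{name}`"))
        })
        .collect()
}
```

Calling `resolve(&["particles", "parameters"], &data)` yields one slot per shader global, and a global without a matching host field is reported as an error instead of being silently left unbound.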
----
### blade: Synchronization
```rust
if let mut pass = command_encoder.compute("fill-gbuf") {
    let mut pc = pass.with(&self.fill_pipeline);
    pc.bind(0, &FillData {...});
    pc.dispatch(groups);
}
// implicit barrier between passes
if let mut pass = command_encoder.compute("ray-trace") {
    let mut pc = pass.with(&self.main_pipeline);
    pc.bind(0, &MainData {...});
    pc.dispatch(groups);
}
```
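----
### blade: Implicit barriers (sketch)
One way to picture the implicit barrier between passes is a toy encoder that records a single global barrier whenever a pass ends. This is only a sketch under that assumption, not Blade's implementation; the `Command` enum and the barrier-on-drop placement are hypothetical, and a real backend may place the barrier elsewhere (for example at the start of the next pass).

```rust
/// Commands as a hypothetical backend might record them.
#[derive(Debug, PartialEq)]
enum Command {
    BeginPass(&'static str),
    Dispatch([u32; 3]),
    /// Global memory barrier, inserted by the encoder, never by the user.
    Barrier,
}

#[derive(Default)]
struct CommandEncoder {
    commands: Vec<Command>,
}

/// A pass borrows the encoder mutably, so only one pass can be open at a time.
struct ComputePass<'a> {
    commands: &'a mut Vec<Command>,
}

impl CommandEncoder {
    fn compute(&mut self, label: &'static str) -> ComputePass<'_> {
        self.commands.push(Command::BeginPass(label));
        ComputePass { commands: &mut self.commands }
    }
}

impl ComputePass<'_> {
    fn dispatch(&mut self, groups: [u32; 3]) {
        self.commands.push(Command::Dispatch(groups));
    }
}

impl Drop for ComputePass<'_> {
    fn drop(&mut self) {
        // Ending a pass records one coarse barrier, so everything the pass
        // wrote is visible to whatever is recorded next.
        self.commands.push(Command::Barrier);
    }
}
```

Because the pass is a scoped value, leaving the block is what ends it. The user never names a resource state or a barrier, at the cost of a coarser barrier than a hand-tuned one.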
----
![blade-zed](https://hackmd.io/_uploads/Bk8r-Z0xye.png)
---
### blade: Performance
API translation and command recording: :zap:
Rasterization:
| GPU | blade | wgpu-hal |
| --- | ----- | -------- |
| Ryzen 3500U | 20K | 20K |
| Ryzen 6850U | 70K | 70K |
| GeForce 3050 | 100K | 100K |
----
<!-- .slide: style="text-align: left; font-size: 24px; margin-left: 60px; " -->
### blade: GPU Penalty
[@krOoze on Khronos forums](https://community.khronos.org/t/which-vulkan-implementations-really-care-about-image-layouts/6885/4):
>Supplying GENERAL everywhere sure is state-of-the-art weapons-grade laziness…
Drivers:
* NVIDIA: [irrelevant](https://d29g4g2dyqv443.cloudfront.net/sites/default/files/akamai/gameworks%2FVulkanDevDaypdaniel.pdf)
>Just leave images in the VK_IMAGE_LAYOUT_GENERAL layout
* AMD: comes down to [ac_surface_supports_dcc_image_stores](https://gitlab.freedesktop.org/mesa/mesa/-/blob/e18733300e65f97757150c6a670f80d032a2615d/src/amd/common/ac_surface.c#L149)
* roughly starts with RDNA
* experiments show no penalty on Vega
* Intel: unclear
Easy to mitigate by inserting transitions around render passes.
----
### blade: Conclusion
- easy to use, hackable
- very fast and portable
![game](https://github.com/kvark/blade/blob/main/docs/vehicle-colliders.jpg?raw=true)
---
## Thank you! :crab: :crab: :crab:
![torus](https://github.com/kvark/blade/raw/d99fd709b8d0b415197eee0b71b1cac9cee84aa2/docs/ray-query.gif =50%x)