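# 10/20 Regular Meeting
## Meeting Information
* Time: 10/20 (Thu) 16:00 @ professor's office / online
* Attendees:
* Location:
* Topic: Regular meeting
## Meeting Minutes
## Meeting Topics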
#### 俊佑
### Understanding how vulkan-sim works and its installation tools
* Vulkan-Sim: A GPU Architecture Simulator for Ray Tracing(https://zenodo.org/record/6941619#.Y1AGknZBy3A)
* Windows 10 driver repair tool, McAfee: Windows problems > Windows 10 operating system errors > Windows driver repair "Update Vulkan Instantly"(https://www.driverfix.com/land/hgd/index.php?tracking=GGfix&dyn_param2=Update%20Vulkan%20Instantly&whf=true&banner=15680941118&adgroup=131606965156&keyword=vulkan&ads_name=&gclid=CjwKCAjwwL6aBhBlEiwADycBIALB6letKsKaRNUTM7VC-nC93tx-V8oNx8gyoHdDoIR30E0hV9hnNRoC6SsQAvD_BwE)
### Understanding practical applications through online tutorial videos
* Real-Time Ray Tracing | "RTX ON in Vulkan" | Vulkan Lecture Series Ep. 6, Real-Time Rendering Course(https://www.youtube.com/watch?v=12k_frqw7tM)
* Bringing Ray Tracing to Vulkan(https://www.youtube.com/watch?v=xpxVAoXaVgg)
* XDC 2020 | Ray-tracing in Vulkan: A brief overview of the provisional VK_KHR_ray_tracing API(https://www.youtube.com/watch?v=-FvAJmq8NvI)
* Vulkan API and DCS | USAF Sim/Wargame Developer Reacts(https://www.youtube.com/watch?v=ZLVjND2kt-U)
* DCS: What is Vulkan API?(https://www.youtube.com/watch?v=tOWm8E7ZV9E)
### Installing a suitable vulkan-sim version and setting up the environment
* Vulkan - NVIDIA Developer(https://developer.nvidia.com/vulkan)
* Developing a Simulation Framework for Vulkan - ProQuest(https://www.proquest.com/docview/2188240472?pq-origsite=gscholar&fromopenview=true)
* Vulkan Support? :: Euro Truck Simulator 2 General Discussions(https://steamcommunity.com/app/227300/discussions/0/412448792365477045/)
* The Vulkan Device Simulation Layer - Sascha Willems(https://www.saschawillems.de/blog/2017/08/19/the-vulkan-device-simulation-layer/)
### Learning vulkan-sim-related applications
* Vulkan Tutorial 7: Vulkan Memory Management(https://www.youtube.com/watch?v=ifmW3lOA6CA)
* Is Vulkan Hard?(https://www.youtube.com/watch?v=LOQLg-JHRZc)
### Implementation with vulkan-sim
* How to Install Vulkan API(https://www.youtube.com/watch?v=2XiLqW4stWw)
* NVIDIA Vulkan Ray Tracing Tutorial(https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/)
* INTRODUCTION TO VULKAN RAY TRACING(https://link.springer.com/content/pdf/10.1007/978-1-4842-7185-8_16.pdf)
* Ray Tracing In Vulkan(https://www.khronos.org/blog/ray-tracing-in-vulkan)
* New Vulkan example on raytracing using VK_NV_ray_tracing(https://www.saschawillems.de/blog/2019/04/21/new-vulkan-example-on-raytracing-using-vk_nv_ray_tracing/)
* Tutorial: Vulkan GLSL Ray Tracing Emulator(https://www.gsn-lib.org/docs/nodes/raytracing.php)
#### 彥傑
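### Reducing triangle intersection precision to half
* Rendered with pbrt-v3. Only the ray precision was changed; the triangle precision (e.g., the triangle's vertex vectors and edge vectors) was left unchanged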
* float

* half
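
The setup above (ray stored in half, triangle left in float) can be sketched roughly as follows. This is a minimal NumPy illustration, not the actual pbrt-v3 change: the Möller–Trumbore test stands in for pbrt-v3's own triangle intersection routine, the names `quantize_ray` and `intersect` are hypothetical, and only the stored ray values are rounded to half (the notes do not say whether the arithmetic itself was also done in half).

```python
import numpy as np

def quantize_ray(origin, direction, dtype=np.float16):
    """Round ray origin/direction to `dtype`, then widen to float32 for the math."""
    o = np.asarray(origin, dtype=np.float32).astype(dtype).astype(np.float32)
    d = np.asarray(direction, dtype=np.float32).astype(dtype).astype(np.float32)
    return o, d

def intersect(o, d, v0, v1, v2, eps=1e-7):
    """Moller-Trumbore ray/triangle test; returns the hit distance t or None."""
    e1 = v1 - v0
    e2 = v2 - v0
    p = np.cross(d, e2)
    det = np.dot(e1, p)
    if abs(det) < eps:          # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    s = o - v0
    u = np.dot(s, p) * inv_det  # first barycentric coordinate
    if u < 0 or u > 1:
        return None
    q = np.cross(s, e1)
    v = np.dot(d, q) * inv_det  # second barycentric coordinate
    if v < 0 or u + v > 1:
        return None
    t = np.dot(e2, q) * inv_det
    return float(t) if t > 0 else None

# Triangle data stays in float32 (vertices and edge vectors are not quantized).
v0 = np.array([0.0, 0.0, 5.0], dtype=np.float32)
v1 = np.array([1.0, 0.0, 5.0], dtype=np.float32)
v2 = np.array([0.0, 1.0, 5.0], dtype=np.float32)

# Hypothetical example ray; only its stored precision differs between the two runs.
origin, direction = [0.1, 0.2, 0.0], [0.0, 0.0, 1.0]
o_f, d_f = quantize_ray(origin, direction, np.float32)
o_h, d_h = quantize_ray(origin, direction, np.float16)

t_float = intersect(o_f, d_f, v0, v1, v2)
t_half = intersect(o_h, d_h, v0, v1, v2)

# Distance between the two hit points, caused purely by rounding the ray to half.
hit_error = float(np.linalg.norm((o_f + t_float * d_f) - (o_h + t_half * d_h)))
print(t_float, t_half, hit_error)
```

In this toy case the hit distance is unaffected and only the hit position shifts slightly, because 0.1 and 0.2 are not exactly representable in half. Rays near triangle edges would be the cases where such rounding can flip a hit into a miss.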

### Visualize BVH
* Scene: Crytek Sponza, with ray queries intercepted from pbrt-v3
* path tracing
* maxdepth=2

* Large image: https://drive.google.com/file/d/1Buw31a_f9wQVc6lWHQhzNt-aSiPgs9u-/view?usp=sharing
* 50 images:
https://drive.google.com/drive/folders/1uNGMN5j5YAz78IqlGYBMMk43lP76_1Yh?usp=sharing
### Experiment: splitting one large cache into several small caches
* Scene: Dabrovic Sponza

* I split the BVH into 4 parts, colored black, yellow, green, and blue:

* Then I generated primary rays and collected their traversal history
* Then I fed the traversal history to pycachesim to simulate LRU, fully-associative caches
:::spoiler pycachesim code
```python=
from cachesim import CacheSimulator, Cache, MainMemory
from tqdm import tqdm
mem = [MainMemory() for i in range(4)]
l1 = 4 * [None]
l1[0] = Cache("L1", 1, 32, 64, "LRU") # 2KB
l1[1] = Cache("L1", 1, 160, 64, "LRU") # 10KB
l1[2] = Cache("L1", 1, 160, 64, "LRU") # 10KB
l1[3] = Cache("L1", 1, 160, 64, "LRU") # 10KB
mem[0].load_to(l1[0])
mem[0].store_from(l1[0])
mem[1].load_to(l1[1])
mem[1].store_from(l1[1])
mem[2].load_to(l1[2])
mem[2].store_from(l1[2])
mem[3].load_to(l1[3])
mem[3].store_from(l1[3])
cs = [CacheSimulator(l1[i], mem[i]) for i in range(4)]
bigmem = MainMemory()
bigl1 = Cache("L1", 1, 512, 64, "LRU") # 32KB
bigmem.load_to(bigl1)
bigmem.store_from(bigl1)
bigcs = CacheSimulator(bigl1, bigmem)
with open('node_load_trace.txt') as f:
    # Each trace line is "<address> <cluster index>".
    for line in tqdm(f.readlines()):
        addr, cluster = [int(x) for x in line.split()]
        cs[cluster].load(addr, length=8)  # small cache of that cluster
        bigcs.load(addr, length=8)        # single 32KB cache
for i in range(4):
    cs[i].print_stats()
    print()
bigcs.print_stats()
```
:::
* This is the case with a single 32KB cache

* And this is the case with 4 separate caches for the black, yellow, green, and blue parts, sized 2KB, 10KB, 10KB, and 10KB respectively

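* As it turns out, splitting into several small caches is indeed no better than the single cache. However, the results show that the black treelet holds only 0.1% of the nodes yet accounts for 37% of the bandwidth
* It is therefore reasonable to infer that a hierarchical architecture helps balance memory bandwidth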
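
To make explicit what the experiment above measures, here is a minimal pure-Python sketch of an LRU, fully-associative cache that replays a trace of `(addr, cluster)` pairs through per-cluster caches and through one shared cache. It is not pycachesim and does not reproduce the numbers from the experiment: the class `LRUCache`, the function `simulate`, and the tiny synthetic trace are assumptions for illustration, and each access is assumed to fall within a single 64-byte line.

```python
from collections import OrderedDict

class LRUCache:
    """Fully-associative cache with LRU replacement, tracked per cache line."""
    def __init__(self, num_lines, line_size=64):
        self.num_lines = num_lines
        self.line_size = line_size
        self.lines = OrderedDict()  # line index -> None, least recently used first
        self.hits = 0
        self.misses = 0

    def load(self, addr):
        line = addr // self.line_size
        if line in self.lines:
            self.hits += 1
            self.lines.move_to_end(line)        # mark as most recently used
        else:
            self.misses += 1
            self.lines[line] = None
            if len(self.lines) > self.num_lines:
                self.lines.popitem(last=False)  # evict the least recently used line

def simulate(trace, split_lines, shared_lines):
    """Replay (addr, cluster) pairs through per-cluster caches and one shared cache."""
    split = [LRUCache(n) for n in split_lines]
    shared = LRUCache(shared_lines)
    for addr, cluster in trace:
        split[cluster].load(addr)
        shared.load(addr)
    return split, shared

# Synthetic trace (an assumption, not the real traversal history): every ray
# touches two top-level nodes in cluster 0, then one node in cluster 1, 2, or 3.
trace = []
for ray in range(100):
    trace += [(0, 0), (64, 0)]
    leaf = 1 + ray % 3
    trace.append((4096 * leaf + 64 * (ray % 8), leaf))

# 2KB / 10KB / 10KB / 10KB split versus one 32KB cache, all with 64-byte lines.
split, shared = simulate(trace, split_lines=(32, 160, 160, 160), shared_lines=512)

split_misses = sum(c.misses for c in split)
# Share of all accesses that go to each cluster: a rough proxy for bandwidth share.
access_share = [(c.hits + c.misses) / len(trace) for c in split]
print(split_misses, shared.misses, access_share)
```

The per-cluster access share is the quantity that corresponds to the bandwidth observation above: a small top-level treelet can receive a disproportionate fraction of the loads because every ray passes through it.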
### Reduced Precision Traversal
* Motivation

* [Watertight Ray Traversal with Reduced Precision Notes](/iQkIBMuzQTiOKwdoFXuxZw)
#### 凱雋
#### 冠吾
* Successfully ran vulkan-sim and dumped the PTX from its benchmarks
* The config is similar to gpgpu-sim's, but it has an additional RT section
* What it looks like when running
### TODOs
#### 陳俊佑
- [ ] run Vulkan-sim
## Next Meeting