# 10/20 Regular Meeting

## Meeting Info
* Time: 10/20 (Thu) 16:00 @ professor's office / online
* Attendees:
* Location:
* Topic: Regular meeting

## Meeting Minutes

## Meeting Topics

#### 俊佑

### Understand how vulkan-sim works and which tools to install
* [Vulkan-Sim: A GPU Architecture Simulator for Ray Tracing](https://zenodo.org/record/6941619#.Y1AGknZBy3A)
* [Windows 10 driver repair tool, McAfee, Windows issues > Windows 10 operating system errors > Windows driver repair "Update Vulkan Instantly"](https://www.driverfix.com/land/hgd/index.php?tracking=GGfix&dyn_param2=Update%20Vulkan%20Instantly&whf=true&banner=15680941118&adgroup=131606965156&keyword=vulkan&ads_name=&gclid=CjwKCAjwwL6aBhBlEiwADycBIALB6letKsKaRNUTM7VC-nC93tx-V8oNx8gyoHdDoIR30E0hV9hnNRoC6SsQAvD_BwE)

### Understand practical applications through online tutorial videos
* [Real-Time Ray Tracing | "RTX ON in Vulkan" | Vulkan Lecture Series Ep. 6, Real-Time Rendering Course](https://www.youtube.com/watch?v=12k_frqw7tM)
* [Bringing Ray Tracing to Vulkan](https://www.youtube.com/watch?v=xpxVAoXaVgg)
* [XDC 2020 | Ray-tracing in Vulkan: A brief overview of the provisional VK_KHR_ray_tracing API](https://www.youtube.com/watch?v=-FvAJmq8NvI)
* [Vulkan API and DCS | USAF Sim/Wargame Developer Reacts](https://www.youtube.com/watch?v=ZLVjND2kt-U)
* [DCS: What is Vulkan API?](https://www.youtube.com/watch?v=tOWm8E7ZV9E)

### Install a suitable vulkan-sim version and set up the environment
* [Vulkan - NVIDIA Developer](https://developer.nvidia.com/vulkan)
* [Developing a Simulation Framework for Vulkan - ProQuest](https://www.proquest.com/docview/2188240472?pq-origsite=gscholar&fromopenview=true)
* [Vulkan Support? :: Euro Truck Simulator 2 General Discussions](https://steamcommunity.com/app/227300/discussions/0/412448792365477045/)
* [The Vulkan Device Simulation Layer - Sascha Willems](https://www.saschawillems.de/blog/2017/08/19/the-vulkan-device-simulation-layer/)

### Learn vulkan-sim related applications
* [Vulkan Tutorial 7: Vulkan Memory Management](https://www.youtube.com/watch?v=ifmW3lOA6CA)
* [Is Vulkan Hard?](https://www.youtube.com/watch?v=LOQLg-JHRZc)

### Implementation and vulkan-sim
* [How to Install Vulkan API](https://www.youtube.com/watch?v=2XiLqW4stWw)
* [NVIDIA Vulkan Ray Tracing Tutorial](https://nvpro-samples.github.io/vk_raytracing_tutorial_KHR/)
* [INTRODUCTION TO VULKAN RAY TRACING](https://link.springer.com/content/pdf/10.1007/978-1-4842-7185-8_16.pdf)
* [Ray Tracing In Vulkan](https://www.khronos.org/blog/ray-tracing-in-vulkan)
* [New Vulkan example on raytracing using VK_NV_ray_tracing](https://www.saschawillems.de/blog/2019/04/21/new-vulkan-example-on-raytracing-using-vk_nv_ray_tracing/)
* [Tutorial: Vulkan GLSL Ray Tracing Emulator](https://www.gsn-lib.org/docs/nodes/raytracing.php)

#### 彥傑

### Reduce triangle intersection precision to half
* Rendered with pbrt-v3. Only the ray's precision was changed; the triangle's precision (e.g., its vertex vectors and edge vectors) was left unchanged.
* float
  ![](https://i.imgur.com/Ox4ApFT.png)
* half
  ![](https://i.imgur.com/pg50UWP.jpg)

### Visualize BVH
* Scene: Crytek Sponza, with the ray queries intercepted from pbrt-v3
* path tracing
* maxdepth=2
  ![](https://i.imgur.com/amC7w1L.png)
* Large image: https://drive.google.com/file/d/1Buw31a_f9wQVc6lWHQhzNt-aSiPgs9u-/view?usp=sharing
* 50 images: https://drive.google.com/drive/folders/1uNGMN5j5YAz78IqlGYBMMk43lP76_1Yh?usp=sharing

### Experiment: splitting one large cache into several small caches
* Scene: Dabrovic Sponza
  ![](https://i.imgur.com/MourQdH.png)
* I split the BVH into 4 clusters, colored black, yellow, green, and blue:
  ![](https://i.imgur.com/jTvwqud.png)
* Then I generated primary rays and collected their traversal history
* Then I fed the traversal history to pycachesim to simulate LRU, fully-associative caches

:::spoiler pycachesim code
```python=
from cachesim import CacheSimulator, Cache, MainMemory
from tqdm import tqdm

# Cache(name, sets, ways, cl_size, replacement_policy):
# sets=1 makes each cache fully associative; cache lines are 64 bytes.
# One cache per cluster (black, yellow, green, blue), 32KB in total.
WAYS = [32, 160, 160, 160]  # 2KB, 10KB, 10KB, 10KB

# Each cluster gets its own main memory and its own L1 cache.
mem = [MainMemory() for _ in range(4)]
l1 = [Cache("L1", 1, ways, 64, "LRU") for ways in WAYS]
for m, c in zip(mem, l1):
    m.load_to(c)
    m.store_from(c)
cs = [CacheSimulator(c, m) for c, m in zip(l1, mem)]

# Baseline: a single 32KB cache shared by all clusters.
bigmem = MainMemory()
bigl1 = Cache("L1", 1, 512, 64, "LRU")  # 32KB
bigmem.load_to(bigl1)
bigmem.store_from(bigl1)
bigcs = CacheSimulator(bigl1, bigmem)

# Each trace line holds two integers: node address and cluster index.
with open('node_load_trace.txt') as f:
    for line in tqdm(f.readlines()):
        addr, cluster = [int(x) for x in line.split()]
        cs[cluster].load(addr, length=8)  # replay on the cluster's cache
        bigcs.load(addr, length=8)        # replay on the shared cache

for i in range(4):
    cs[i].print_stats()
    print()
bigcs.print_stats()
```
:::

* Result with a single 32KB cache
  ![](https://i.imgur.com/9nxdHIt.png)
* Result when split into 4 caches for black, yellow, green, and blue, sized 2KB, 10KB, 10KB, and 10KB respectively
  ![](https://i.imgur.com/ltgKcO1.png)
* Splitting into several small caches indeed did not perform better. However, the results show that the black treelet holds only 0.1% of the nodes yet accounts for 37% of the bandwidth
* This suggests that a hierarchical architecture helps balance memory bandwidth

### Reduced Precision Traversal
* Motivation
  ![](https://i.imgur.com/5ussavJ.png)
* [Watertight Ray Traversal with Reduced Precision Notes](/iQkIBMuzQTiOKwdoFXuxZw)

#### 凱雋

#### 冠吾
* Successfully ran vulkan-sim and dumped the PTX from its benchmarks
* The config is similar to gpgpu-sim's, but with an additional block of RT settings
  * ![](https://i.imgur.com/2MdQpkT.png)
* What it looks like when running
  * ![](https://i.imgur.com/spyutDR.png)

### TODOs

#### 陳俊佑
- [ ] run Vulkan-sim

## Next meeting
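
Follow-up note on 彥傑's half-precision experiment: the idea of lowering only the ray's precision while keeping the triangle in full precision can be sketched as below. This is a minimal sketch under stated assumptions, not the actual pbrt-v3 change. It assumes that only the stored ray origin and direction are rounded to IEEE 754 half and that the arithmetic itself stays in ordinary Python floats. It also uses Möller-Trumbore as a stand-in intersection test, which is not pbrt-v3's own triangle test. The names `to_half`, `intersect`, and `intersect_half_ray` are hypothetical.

```python
import struct


def to_half(x):
    """Round x to the nearest IEEE 754 half value, returned as a Python float.

    struct raises OverflowError for magnitudes beyond the half range.
    """
    return struct.unpack("e", struct.pack("e", x))[0]


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def intersect(orig, direc, v0, v1, v2, eps=1e-9):
    """Möller-Trumbore test. Returns (t, u, v) on a hit, or None on a miss."""
    e1, e2 = sub(v1, v0), sub(v2, v0)   # triangle edge vectors (full precision)
    p = cross(direc, e2)
    det = dot(e1, p)
    if abs(det) < eps:                  # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(orig, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direc, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return (t, u, v) if t > eps else None


def intersect_half_ray(orig, direc, v0, v1, v2):
    """Same test, but the ray (and only the ray) is first rounded to half."""
    return intersect(tuple(map(to_half, orig)),
                     tuple(map(to_half, direc)), v0, v1, v2)


# Assumed example data: a unit right triangle in the z=0 plane and a slanted ray.
TRI = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
RAY_O, RAY_D = (0.1, 0.2, 1.0), (0.3, 0.1, -1.0)

print("float ray:", intersect(RAY_O, RAY_D, *TRI))
print("half ray: ", intersect_half_ray(RAY_O, RAY_D, *TRI))
```

In this setup the two results differ only slightly in the barycentric coordinates, which is the kind of error that rounding the ray alone introduces. Rounding the triangle's vertices and edges too would be a separate experiment.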