2023q1 Homework3 (fibdrv)

--- tags: linux kernel --- # 2023q1 Homework3 (fibdrv) contributed by < [andy155224](https://github.com/andy155224) > [作業說明](https://hackmd.io/@sysprog/linux2023-fibdrv/https%3A%2F%2Fhackmd.io%2F%40sysprog%2Flinux2023-fibdrv-g) ## 開發環境 ```shell $ gcc --version gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0 $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 46 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 24 On-line CPU(s) list: 0-23 Vendor ID: GenuineIntel Model name: 13th Gen Intel(R) Core(TM) i7-13700 CPU family: 6 Model: 183 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 1 Stepping: 1 CPU max MHz: 5200.0000 CPU min MHz: 800.0000 BogoMIPS: 4224.00 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bt s rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe po pcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect avx_vnni dtherm ida arat pln pts hw p hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi umip pku ospke waitpkg gfni vaes vpclmulqdq tme rdpid movdiri movdir64b fsrm md_clear serialize pconfig arch_lbr ibt flush_l1d arch_ca pabilities Virtualization features: Virtualization: VT-x Caches (sum of all): L1d: 640 KiB (16 instances) L1i: 768 KiB (16 instances) L2: 24 MiB (10 instances) L3: 30 MiB (1 instance) NUMA: NUMA node(s): 1 NUMA node0 CPU(s): 0-23 Vulnerabilities: Itlb multihit: Not affected L1tf: Not affected Mds: Not affected Meltdown: Not affected Mmio stale data: Not affected Retbleed: Not affected Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence Srbds: Not affected Tsx async abort: Not affected 0-3 ``` ## 自我檢查清單 - [x] 研讀上述 ==Linux 效能分析的提示== 描述，在自己的實體電腦運作 GNU/Linux，做好必要的設定和準備工作 $\to$ 從中也該理解為何不希望在虛擬機器中進行實驗; - [ ] 研讀上述費氏數列相關材料 (包含論文)，摘錄關鍵手法，並思考 [clz / ctz](https://en.wikipedia.org/wiki/Find_first_set) 一類的指令對 Fibonacci 數運算的幫助。請列出關鍵程式碼並解說 - [ ] 複習 C 語言 [數值系統](https://hackmd.io/@sysprog/c-numerics) 和 [bitwise operation](https://hackmd.io/@sysprog/c-bitwise)，思考 Fibonacci 數快速計算演算法的實作中如何減少乘法運算的成本; - [ ] 學習指出針對大數運算的加速運算和縮減記憶體操作成本的舉措，紀錄你的認知和疑惑 - [ ] 注意到 `fibdrv.c` 存在著 `DEFINE_MUTEX`, `mutex_trylock`, `mutex_init`, `mutex_unlock`, `mutex_destroy` 等字樣，什麼場景中會需要呢？撰寫多執行緒的 userspace 程式來測試，觀察 Linux 核心模組若沒用到 mutex，到底會發生什麼問題。嘗試撰寫使用 [POSIX Thread](https://en.wikipedia.org/wiki/POSIX_Threads) 的程式碼來確認。 $\to$ 搭配閱讀〈[並行和多執行緒程式設計](https://hackmd.io/@sysprog/concurrency)〉 ## 事前環境準備在我的環境下， VSCode 的 C/C++ 延伸模組沒有抓到以下標頭檔 ![](https://i.imgur.com/xgLPL9v.png) 透過 find 指令，找到了這幾個標頭檔實際的位置 ![](https://i.imgur.com/Fr3Dem0.png) 將其路徑加入到 IncludePath 中但還是有以下錯誤 ![](https://i.imgur.com/XCjtTR6.png) 同樣透過 find 指令去找出標頭檔的位置並加到 IncludePath 中最後經過幾次的 find 後就解決了 include 失敗的問題 ```json { "configurations": [ { "name": "Linux", "includePath": [ "${workspaceFolder}/**", "/usr/src/linux-hwe-5.19-headers-5.19.0-38/include", "/usr/src/linux-hwe-5.19-headers-5.19.0-38/arch/x86/include", "/usr/src/linux-headers-5.19.0-38-generic/include", "/usr/src/linux-headers-5.19.0-38-generic/arch/x86/include/generated" ], "defines": [], "compilerPath": "/usr/bin/gcc", "cStandard": "gnu17", "cppStandard": "gnu++17", "intelliSenseMode": "linux-gcc-x64" } ], "version": 4 } ``` ## 開發紀錄一開始 `clone` 下來後先 `make check` 一次，得到預期的結果 ```bash Passed [-] f(93) fail input: 7540113804746346429 expected: 12200160415121876738 ``` 可以發現因為原先計算 fibonacci sequence 的 function 回傳的型別是 `long long`，所以當計算第 93 個 fibonacci number 後就會發生 overflow 的問題。 ### 時間測量和效能分析在 [時間測量和效能分析](https://hackmd.io/@sysprog/linux2023-fibdrv/%2F%40sysprog%2Flinux2023-fibdrv-c) 中有提到 `hrtimer` 是 2.6.16 開始有的新計時機制，裡面使用了新的資料結構 `ktime_t` 來計時。 `ktime_t` 可以做加減法和時間單位的轉換。根據 [時間測量和效能分析](https://hackmd.io/@sysprog/linux2023-fibdrv/%2F%40sysprog%2Flinux2023-fibdrv-c)，在 `fibdrv.c` 中引入 `ktime_t` ```c static ktime_t kt; ``` 並且更改 `fib_read` 這個 function，從原先回傳 `fib_sequence(*offset)` 變成回傳 `fib_time_proxy(*offset)`，而後者與前者的差異是後者除了會去呼叫 `fib_sequence(*offset)` 來計算第 offset 個 fibonacci number 外，也會在呼叫的前後引入 `ktime_get()` 和 `ktime_sub` 以計算 `fib_sequence()` 的執行時間，也就是在 kernel space 中運算的時間。 ```c static long long fib_time_proxy(long long k) { kt = ktime_get(); long long result = fib_sequence(k); kt = ktime_sub(ktime_get(), kt); return result; } ``` 而在這個核心模組中 `fib_write` 目前是沒有作用的，所以可以藉由此函式來回傳上述計算的在 kernel space 中運算的時間 `kt`，同時也透過 `ktime_to_ns()` 將 `kt` 的時間單位轉換至 `ns`。 ```c static ssize_t fib_write(struct file *file, const char *buf, size_t size, loff_t *offset) { return ktime_to_ns(kt); } ``` 為了要使用 [4ce43c4](https://github.com/colinyoyo26/fibdrv/commit/4ce43c4e7ee0b80c4ce9e6dcd995bbfdf239206c) ，除了上述已經得到的 `fib_sequence()` 的計算時間 `kt` 外， ### 透過 [Fast Doubling](https://chunminchang.github.io/blog/post/calculating-fibonacci-numbers-by-fast-doubling) 來降低 fibonacci sequence 的計算成本