# 2022q1 Homework3 (fibdrv) ###### tags: `Linux 核心設計` `成大課程` contributed by < `hbr890627` > ## 實驗環境 ```shell $ gcc --version gcc (Ubuntu 9.4.0-1ubuntu1~20.04) 9.4.0 $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 39 bits physical, 48 bits virtual CPU(s): 12 On-line CPU(s) list: 0-11 Thread(s) per core: 2 Core(s) per socket: 6 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family: 6 Model: 151 Model name: 12th Gen Intel(R) Core(TM) i5-12400 Stepping: 5 CPU MHz: 2500.000 CPU max MHz: 5600.0000 CPU min MHz: 800.0000 BogoMIPS: 4992.00 Virtualization: VT-x L1d cache: 288 KiB L1i cache: 192 KiB L2 cache: 7.5 MiB L3 cache: 18 MiB NUMA node0 CPU(s): 0-11 ``` ## 前期準備 * 檢查 Linux 核心版本 ```shell $ uname -r 5.13.0-35-generic ``` * 安裝 linux-headers 套件 ```shell $ sudo apt install linux-headers-`uname -r` ``` * 確認 linux-headers 套件已正確安裝於開發環境 ```shell $ dpkg -L linux-headers-`uname -r` | grep "/lib/modules" /lib/modules /lib/modules/5.13.0-35-generic /lib/modules/5.13.0-35-generic/build ``` * 檢驗目前的使用者身份 ```shell $ whoami hbr $ sudo whoami root ``` * 取得原始程式碼 ```shell $ git clone https://github.com/hbr890627/fibdrv ``` * 編譯並測試 ```shell $ cd fibdrv $ make check Git hooks are installed successfully. cc -o client client.c make: cc: Command not found make: *** [Makefile:28:client] 錯誤 127 ``` 並未看到預期的輸出,還以為是前面某個步驟做錯了,仔細觀察錯誤訊息 ```shell make: cc: Command not found ``` 檢查是否尚未安裝 gcc ```shell $ gcc Command 'gcc' not found, but can be installed with: sudo apt install gcc ``` gcc 安裝完成後,再次進行 make check,成功編譯,出現預期輸出 ```shell f(93) fail input: 7540113804746346429 expected: 12200160415121876738 ``` * 觀察產生的 fibdrv.ko 核心模組 ```shell $ modinfo fibdrv.ko filename: /home/hbr/Desktop/linux2022/fibdrv/fibdrv.ko version: 0.1 description: Fibonacci engine driver author: National Cheng Kung University, Taiwan license: Dual MIT/GPL srcversion: 9A01E3671A116ADA9F2BB0A depends: retpoline: Y name: fibdrv vermagic: 5.13.0-35-generic SMP mod_unload modversions ``` * 將模組載入核心 ```shell $ sudo insmod fibdrv.ko ``` * 觀察 fibdrv.ko 核心模組在 Linux 核心掛載後的行為 ```shell $ ls -l /dev/fibonacci crw------- 1 root root 510, 0 三 12 16:52 /dev/fibonacci $ cat /sys/class/fibonacci/fibonacci/dev 510:0 $ cat /sys/module/fibdrv/version 0.1 $ lsmod | grep fibdrv fibdrv 16384 0 $ cat /sys/module/fibdrv/refcnt 0 ``` > TODO: > 為何 $ cat /sys/class/fibonacci/fibonacci/dev 的結果是 510 ? ## 效能分析 ### 查看行程的 CPU affinity ```shell $ taskset -p 1 pid 1's current affinity mask: fff $ taskset -cp 1 pid 1's current affinity list: 0-11 ``` :::info 此實驗環境的CPU核心數的確為 12 ::: ### 將行程固定在特定的 CPU 中執行 ```shell $ taskset -p 0x001 1 pid 1's current affinity list: 0-11 taskset: failed to set pid 1's affinity: Operation not permitted ``` 並未開啟 [CAP_SYS_NICE](https://man7.org/linux/man-pages/man7/capabilities.7.html) 這個權限 > TODO: 要怎麼開啟? ### 以特定的 CPU 執行程式 ```shell $ taskset COREMASK EXECUTABLE ``` ### 限定 CPU 給特定的程式使用 修改 `/etc/default/grub` ,讓第 12 個 CPU 核心被保留 ``` GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=11" ``` 並依照文件中的說明,輸入指令更新 ``` $ sudo update-grub ``` 再重新開機後,使用 `htop` 確認第 12 個 CPU 核心並沒有再使用,也使用 `taskset` 再確認一次 ``` $ taskset -p 1 pid 1's current affinity mask: 7ff $ taskset -cp 1 pid 1's current affinity list: 0-10 ``` ### 排除干擾效能分析的因素 * 抑制 address space layout randomization (ASLR) ``` $ sudo sh -c "echo 0 > /proc/sys/kernel/randomize_va_space" ``` * 設定 scaling_governor 為 performance。準備以下 shell script,檔名為 performance.sh ``` for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor do echo performance > ${i} done ``` 之後再用 `$ sudo sh performance.sh` 執行 * 針對 Intel 處理器,關閉 turbo mode ``` $ sudo sh -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo" ``` ## 修正 Fibonacci 數計算的正確性 原本給予計算 Fibonacci 數的程式碼如下,是最基本的算法 ``` static long long fib_sequence(long long k) { /* FIXME: C99 variable-length array (VLA) is not allowed in Linux kernel. */ long long f[k + 2]; f[0] = 0; f[1] = 1; for (int i = 2; i <= k; i++) { f[i] = f[i - 1] + f[i - 2]; } return f[k]; } ``` 發現有一段註解 `FIXME: C99 variable-length array (VLA) is not allowed in Linux kernel.` ,在編譯時也會跳出警告訊息 ``` warning: ISO C90 forbids variable length array ‘f’ [-Wvla] ``` 修改程式碼,將 VLA 的部份移除 ``` static long long fib_sequence(long long k) { long long a = 0, b = 1, t; for (int i = 2; i <= k; i++) { t = a + b; a = b; b = t; } return b; } ``` ## Fast Doubling 以下為虛擬碼 ```c Fast_Fib(n) a = 0; b = 1; // m = 0 for i = (number of binary digit in n) to 1 t1 = a*(2*b - a); t2 = b^2 + a^2; a = t1; b = t2; // m *= 2 if (current binary digit == 1) t1 = a + b; // m++ a = b; b = t1; return a; ``` 對應的實作 ```c unsigned long long Fast_Fib(unsigned int n) { unsigned long long a = 0, b = 1; unsigned long long t1, t2; for (int i = 256; i > 0; i /= 2) { t1 = a * (2 * b - a); t2 = b * b + a * a; a = t1; b = t2; // m *= 2 if (n & i) { t1 = a + b; a = b; b = t1; } } return a; } ``` > TODO: 比較使用 Fast Doubling 後帶來的效能提升 ## 自我檢查清單 - [x] 研讀上述 ==Linux 效能分析的提示== 描述,在自己的實體電腦運作 GNU/Linux,做好必要的設定和準備工作 $\to$ 從中也該理解為何不希望在虛擬機器中進行實驗; - [ ] 研讀上述費氏數列相關材料 (包含論文),摘錄關鍵手法,並思考 [clz / ctz](https://en.wikipedia.org/wiki/Find_first_set) 一類的指令對 Fibonacci 數運算的幫助。請列出關鍵程式碼並解說 - [ ] 複習 C 語言 [數值系統](https://hackmd.io/@sysprog/c-numerics) 和 [bitwise operation](https://hackmd.io/@sysprog/c-bitwise),思考 Fibonacci 數快速計算演算法的實作中如何減少乘法運算的成本; - [ ] 研讀 [KYG-yaya573142 的報告](https://hackmd.io/@KYWeng/rkGdultSU),指出針對大數運算,有哪些加速運算和縮減記憶體操作成本的舉措? - [ ] `lsmod` 的輸出結果有一欄名為 `Used by`,這是 "each module's use count and a list of referring modules",但如何實作出來呢?模組間的相依性和實際使用次數 ([reference counting](https://en.wikipedia.org/wiki/Reference_counting)) 在 Linux 核心如何追蹤呢? > 搭配閱讀 [The Linux driver implementer’s API guide » Driver Basics](https://www.kernel.org/doc/html/latest/driver-api/basics.html) - [ ] 注意到 `fibdrv.c` 存在著 `DEFINE_MUTEX`, `mutex_trylock`, `mutex_init`, `mutex_unlock`, `mutex_destroy` 等字樣,什麼場景中會需要呢?撰寫多執行緒的 userspace 程式來測試,觀察 Linux 核心模組若沒用到 mutex,到底會發生什麼問題。嘗試撰寫使用 [POSIX Thread](https://en.wikipedia.org/wiki/POSIX_Threads) 的程式碼來確認。