--- tags: 2023 linux kernel --- # 2023q1 Homework3 (fibdrv) contributed by < [`Risheng1128`](https://github.com/Risheng1128) > > [作業要求](https://hackmd.io/@sysprog/linux2023-fibdrv) > [作業區](https://hackmd.io/@sysprog/linux2023-homework3) ## 實驗環境 ```shell $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 12 On-line CPU(s) list: 0-11 Vendor ID: GenuineIntel Model name: 11th Gen Intel(R) Core(TM) i5-11400H @ 2.70GHz CPU family: 6 Model: 141 Thread(s) per core: 2 Core(s) per socket: 6 Socket(s): 1 Stepping: 1 CPU max MHz: 4500.0000 CPU min MHz: 800.0000 BogoMIPS: 5376.00 Virtualization features: Virtualization: VT-x Caches (sum of all): L1d: 288 KiB (6 instances) L1i: 192 KiB (6 instances) L2: 7.5 MiB (6 instances) L3: 12 MiB (1 instance) NUMA: NUMA node(s): 1 NUMA node0 CPU(s): 0-11 $ gcc --version gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0 Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. ``` ## 研讀 Linux 效能分析的提示 參考 [Linux 效能分析的提示](https://hackmd.io/@sysprog/linux2023-fibdrv/%2F%40sysprog%2Flinux2023-fibdrv-c#-Linux-%E6%95%88%E8%83%BD%E5%88%86%E6%9E%90%E7%9A%84%E6%8F%90%E7%A4%BA) 為了分析效能而做的措施 首先保留特定的處理器,以便我們執行程式時有空閒的處理器可以直接使用,輸入命令 `sudo vim /etc/default/grub` 開啟檔案,並新增以下的內容 ``` GRUB_CMDLINE_LINUX="isolcpus=0" ``` 接著輸入命令 `sudo update-grub` ,以下為預期輸出 ```shell Sourcing file `/etc/default/grub' Sourcing file `/etc/default/grub.d/init-select.cfg' Generating grub configuration file ... Found linux image: /boot/vmlinuz-5.19.0-35-generic Found initrd image: /boot/initrd.img-5.19.0-35-generic Found linux image: /boot/vmlinuz-5.19.0-32-generic Found initrd image: /boot/initrd.img-5.19.0-32-generic Memtest86+ needs a 16-bit boot, that is not available on EFI, exiting Warning: os-prober will be executed to detect other bootable partitions. Its output will be used to detect bootable binaries on them and create new boot entries. Found Windows Boot Manager on /dev/nvme1n1p1@/EFI/Microsoft/Boot/bootmgfw.efi Adding boot menu entry for UEFI Firmware Settings ... done ``` 重新開機後,輸入命令 `taskset -p 1` 查看 PID 1 可使用的處理器編號 ```shell $ taskset -p 1 pid 1's current affinity mask: ffe ``` 可以發現編號 0 已經被保留了 接著抑制 [ASLR](https://en.wikipedia.org/wiki/Address_space_layout_randomization) ```shell $ sudo sh -c "echo 0 > /proc/sys/kernel/randomize_va_space" ``` 設定 scaling_governor ,執行以下的 shell script (使用命令 `sudo sh filename.sh`) ``` for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor do echo performance > ${i} done ``` 針對 Intel 處理器,關閉 turbo mode ```shell $ sudo sh -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo" ``` ## 修正 Fibonacci 數計算並改善 fibdrv 核心模組的計算效率 首先在編譯核心模組時,發生了以下警告 ``` warning: the compiler differs from the one used to build the kernel The kernel was built by: x86_64-linux-gnu-gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0 You are using: gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0 ``` 產生原因是建立核心和模組的編譯器不同,這裡修改 Makefile 的內容,指定 `x86_64-linux-gnu-gcc` 編譯核心模組 ```diff - $(MAKE) -C $(KDIR) M=$(PWD) modules + $(MAKE) -C $(KDIR) M=$(PWD) CC=x86_64-linux-gnu-gcc modules ``` 這裡做了一個修改,主要是讓 fibdrv 可以主動偵測編譯 kernel 的編譯器,完整修改可見 [commit](https://github.com/Risheng1128/fibdrv/commit/c8f4f071e9880396c7709efa9ace7e011a701294) ### 使用 Fast Doubling 計算 Fibonacci 數 在原本的 fibdrv 中,是使用 repeated addition 的方法計算 Fibonacci 數,可見論文 [Algorithms for Computing Fibonacci Numbers Quickly](https://ir.library.oregonstate.edu/downloads/t435gg51w) ,額外的分析放在 [研讀 Algorithms for Computing Fibonacci Numbers Quickly](https://hackmd.io/@Risheng/rkdoM1YWq) 參考作業說明 [加速 Fibonacci 運算](https://hackmd.io/@sysprog/linux2023-fibdrv/%2F%40sysprog%2Flinux2023-fibdrv-d) 的部份,裡頭提到 1. iterative 方法的時間複雜度為 O(n) 2. fast doubling 的時間複雜度降為 O(logn) 而 fibdrv 原始的實作則是使用迭代的方式計算,因此這裡使用 [fast doubling](https://www.nayuki.io/page/fast-fibonacci-algorithms) 的方法改進計算 Fibonacci 數的效能 透過 fast doubling 可以得知,目標的 Fibonacci 數可由以下兩式計算取得 1. $F(2k) = F(k)[2F(k+1) - F(k)]$ 2. $F(2k+1) = F(k+1)^2 + F(k)^2$ 因此,假設 $F(0) = 0$ 且 $F(1) = 1$ ,可以開始往下計算 當 $k = 1$ 時 1. $F(2) = F(1)[2F(2) - F(1)]$ &rarr; $F(2) = 1$ 2. $F(3) = F(2)^2 + F(1)^2$ &rarr; $F(3) = 2$ 以此類推 接著參考 [Bottom-up Fast Doubling](https://hackmd.io/@sysprog/linux2023-fibdrv/%2F%40sysprog%2Flinux2023-fibdrv-d#Bottom-up-Fast-Doubling) 的邏輯實作 fast doubling 首先計算變數 `k` 使用的位元數 ```c int count = 63 - __builtin_clzll(k); long long fn0 = 1, fn1 = 1; ``` 接著在迴圈內根據 fast doubling 的公式計算出新的 fibonacci 數 ```c /* f(2n) = f(n) * (2f(n + 1) - f(n)) */ long long f2n0 = fn0 * ((fn1 << 1) - fn0); /* f(2n + 1) = f(n + 1) ^ 2 + f(n) ^ 2 */ long long f2n1 = fn1 * fn1 + fn0 * fn0; ``` 最後更新變數,用來計算下一個 fibonacci 數 ```c /* update f(n) and f(n + 1) * if mask = -1, then fn0 = f2n1 and fn1 = f2n0 + f2n1 * if mask = 0, then fn0 = f2n0 and fn1 = f2n1 */ long long mask = -!!(k & (1UL << count)); fn0 = (f2n1 & mask) | (f2n0 & ~mask); fn1 = ((f2n0 + f2n1) & mask) | (f2n1 & ~mask); ``` 根據這樣的方法,一樣可以計算 fibonacci 數,一樣算到第 92 個數都是正確的 ```shell Reading from /dev/fibonacci at offset 92, returned the sequence 7540113804746346429. ``` 以上的修改可見 [commit](https://github.com/Risheng1128/fibdrv/commit/0eec6cb514e82b31ecdc906e779af25bd314eae3) ### 使用 ktime 量測時間 參考[核心模式的時間測量](https://hackmd.io/@sysprog/linux2023-fibdrv/%2F%40sysprog%2Flinux2023-fibdrv-c#%E6%A0%B8%E5%BF%83%E6%A8%A1%E5%BC%8F%E7%9A%84%E6%99%82%E9%96%93%E6%B8%AC%E9%87%8F)裡頭提到的 ktime 來量測計算 fibonacci 數所需的時間 首先宣告一個型態為 `ktime_t` 的全域變數 ```c static ktime_t fib_time; ``` 接著修改函式 `fib_read` 的內容,使其執行時會量測計算 fibonacci 數的時間,如下所示 ```c /* calculate the fibonacci number at given offset */ static ssize_t fib_read(struct file *file, char *buf, size_t size, loff_t *offset) { fib_time = ktime_get(); ssize_t result = fib_sequence(*offset); fib_time = ktime_sub(ktime_get(), fib_time); return result; } ``` 接著修改 `fib_write` 的功能,使其回傳最近一次計算 fibonacci 的時間 ```c /* write operation is skipped */ static ssize_t fib_write(struct file *file, const char *buf, size_t size, loff_t *offset) { return ktime_to_ns(fib_time); } ``` 以上的完整修改可見 [commit](https://github.com/Risheng1128/fibdrv/commit/9e8f402725eb122e8b8acaf5803766c74c36de72) 可以量測時間後,可以接著比較使用 fast doubling 前後的差異,如下所示 ![](https://i.imgur.com/bgmzVn7.png) 很明顯使用 fast doubling 後,效能明顯提升許多 ### 實作大數運算