# 2022q1 Homework3 (fibdrv) contributed by < `happyjack91630` > ## 前期準備 :::spoiler 模組安裝 ```shell= //檢查Linux核心版本 $ uname -r 5.13.0-30-generic //安裝 linux-headers 套件 $ sudo apt install linux-headers-`uname -r` Reading package lists... Done Building dependency tree Reading state information... Done linux-headers-5.13.0-30-generic is already the newest version (5.13.0-30.33~20.04.1). 0 upgraded, 0 newly installed, 0 to remove and 95 not upgraded. //確認 linux-headers 套件已正確安裝於開發環境 $ dpkg -L linux-headers-5.13.0-30-generic | grep "/lib/modules" /lib/modules /lib/modules/5.13.0-30-generic /lib/modules/5.13.0-30-generic/build //檢驗目前的使用者身份 $ whoami jack $ sudo whoami root $ sudo apt install util-linux strace gnuplot-nox $ git clone https://github.com/sysprog21/fibdrv $ cd fibdrv $ make check make -C /lib/modules/5.13.0-30-generic/build M=/home/jack/fibdrv modules make[1]: Entering directory '/usr/src/linux-headers-5.13.0-30-generic' make[1]: Leaving directory '/usr/src/linux-headers-5.13.0-30-generic' make unload make[1]: Entering directory '/home/jack/fibdrv' sudo rmmod fibdrv || true >/dev/null rmmod: ERROR: Module fibdrv is not currently loaded make[1]: Leaving directory '/home/jack/fibdrv' make load make[1]: Entering directory '/home/jack/fibdrv' sudo insmod fibdrv.ko make[1]: Leaving directory '/home/jack/fibdrv' sudo ./client > out make unload make[1]: Entering directory '/home/jack/fibdrv' sudo rmmod fibdrv || true >/dev/null make[1]: Leaving directory '/home/jack/fibdrv' Passed [-] f(93) fail input: 7540113804746346429 expected: 12200160415121876738 //觀察產生的 fibdrv.ko 核心模組 $ modinfo fibdrv.ko filename: /home/jack/fibdrv/fibdrv.ko version: 0.1 description: Fibonacci engine driver author: National Cheng Kung University, Taiwan license: Dual MIT/GPL srcversion: 2E292D54A69BD2B4FF86019 depends: retpoline: Y name: fibdrv vermagic: 5.13.0-30-generic SMP mod_unload modversions //insmod命令-->install module的縮寫,用来載入魔塊 $ sudo insmod fibdrv.ko $ ls -l /dev/fibonacci crw------- 1 root root 511, 0 三 8 18:12 /dev/fibonacci 511:0 $ cat /sys/module/fibdrv/version 0.1 $ lsmod | grep fibdrv fibdrv 16384 0 $ cat /sys/module/fibdrv/refcnt 0 ``` ::: :::spoiler 排除干擾效能分析 #### 限定 CPU 給特定的程式使用 - 以 `sudo` 管理者權限更改 `/etc/default/grub` 檔案。修改 `GRUB_CMDLINE_LINUX_DEFAULT` 參數,新增 `"isolcpus=x(,x...)"` 指定 cpu 核心只給特定程式使用。 ```shell $ sudo vim /etc/default/grub ... GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=7" ... ``` - 更新grub(也要使用sudo權限) > 參考 [KYG-yaya573142](https://hackmd.io/@KYWeng/rkGdultSU#%E6%8E%92%E9%99%A4%E5%B9%B2%E6%93%BE%E6%95%88%E8%83%BD%E5%88%86%E6%9E%90%E7%9A%84%E5%9B%A0%E7%B4%A0)的做法都沒辦法將特定cpu獨立出來,加了這行`sudo update-grub`就可以了。 ```shell $ sudo update-grub ``` - 重新開機後檢查行程的 CPU affinity: ```shell $ taskset -cp 1 pid 1's current affinity list: 0-6 $ cat /sys/devices/system/cpu/isolated 7 ``` - 將程式固定在特定的 CPU 中執行 ```shell $ sudo taskset -c 7 ./client ``` - 抑制 address space layout randomization (ASLR) ```shell $ sudo sh -c "echo 0 > /proc/sys/kernel/randomize_va_space" ``` - 設定 scaling_governor 為 performance ```shell $ vim performance.sh for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor do echo performance > ${i} done $ sudo sh performance.sh ``` - 關閉 Intel 處理器的 turbo mode ```shell $ sudo sh -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo" ``` ::: ## kernel space 與 user space 傳遞時間 #### kernel space 使用`ktime`相關的API來做計算,計算時間特別獨立出來寫成`fib_time_proxy`這個函式,這樣就不用一直重複寫計算時間的程式碼了。那由於之後可能也要計算不同fibonacci實作方法的時間,所以這個函式第二個參數是帶入function pointer,以便切換不同的fibonacci的實做函式。 ```c= static long long fib_time_proxy(long long k, long long (*func_ptr)(long long)) { kt = ktime_get(); long long result = func_ptr(k); kt = ktime_sub(ktime_get(), kt); return result; } ``` #### user space and system call time 使用`clock_gettime`,用來在user space的地方計算時間。 並將kernel time - user time來取得system call所花費的時間。 ```c= //time[0] record kernel time //time[1] record user time //time[2] record system call time clock_gettime(CLOCK_MONOTONIC, &t1); kernel_time = write(fd, write_buf, 0); time[0][s] = kernel_time; mean[0] += time[0][s]; clock_gettime(CLOCK_MONOTONIC, &t2); double user_time = (double) (t2.tv_sec * 1e9 + t2.tv_nsec) - (t1.tv_sec * 1e9 + t1.tv_nsec); //ns time[1][s] = user_time; mean[1] += time[1][s]; time[2][s] = user_time - kernel_time; mean[2] += time[2][s]; ``` ![](https://i.imgur.com/oizg1vl.png) > 上面圖片利用了常態分佈的特性將兩個標準差以外的數據都排除掉,用來去除極端值 ## 費氏數列實作 ### iterative ```c= static long long fib_sequence(long long k) { /* FIXME: C99 variable-length array (VLA) is not allowed in Linux kernel. */ long long f[k + 2]; f[0] = 0; f[1] = 1; for (int i = 2; i <= k; i++) { f[i] = f[i - 1] + f[i - 2]; } return f[k]; } ``` 一開始iterative的版本,在註解上有個警告`/* FIXME: C99 variable-length array (VLA) is not allowed in Linux kernel. */`,那是linux kernel規定宣告array必須有**固定的長度**,不能動態變化。換個角度想,可以用malloc來動態宣告陣列,但是kernel module裡不能使用`<stdlib.h>`的函式庫([參考](https://stackoverflow.com/questions/15662480/error-stdio-h-no-such-file-or-directory-error-during-make))。 ```c= static long long fib_sequence(long long k) { if (k < 2) return 0; long long a = 0, b = 1; for (int i = 2; i <= k; i++) { long long c = a + b; a = b; b = c; } return b; } ``` 後來,就將iterative的程式碼改成不用陣列來儲存計算出來的值。改用兩個變數(a,b)來儲存後續要計算的值就行了。 ### fast doubling 講義中[費氏數列](https://hackmd.io/PMozqaxqRA-jF2KmgTZcPQ?view#-%E8%B2%BB%E6%B0%8F%E6%95%B8%E5%88%97)推導了一種 `fast doubling` 的演算法: $$ \begin{split} F(2k) &= F(k)[2F(k+1) - F(k)] \\ F(2k+1) &= F(k+1)^2+F(k)^2 \end{split} $$ fast doubling的實作方法則是參考了[網誌](https://chunminchang.github.io/blog/post/calculating-fibonacci-numbers-by-fast-doubling)的作法。 ```c= static long long fib_doubling(long long k) { unsigned int h = 0; for (unsigned int i = k; i; ++h, i >>= 1) ; long long a = 0; // F(0) = 0 long long b = 1; // F(1) = 1 // There is only one `1` in the bits of `mask`. The `1`'s position is same // as the highest bit of n(mask = 2^(h-1) at first), and it will be shifted // right iteratively to do `AND` operation with `n` to check `n_j` is odd or // even, where n_j is defined below. for (unsigned int mask = 1 << (h - 1); mask; mask >>= 1) { // Run h times! // Let j = h-i (looping from i = 1 to i = h), n_j = floor(n / 2^j) = n // >> j (n_j = n when j = 0), k = floor(n_j / 2), then a = F(k), b = // F(k+1) now. long long c = a * (2 * b - a); // F(2k) = F(k) * [ 2 * F(k+1) – F(k) ] long long d = a * a + b * b; // F(2k+1) = F(k)^2 + F(k+1)^2 if (mask & k) { // n_j is odd: k = (n_j-1)/2 => n_j = 2k + 1 a = d; // F(n_j) = F(2k + 1) b = c + d; // F(n_j + 1) = F(2k + 2) = F(2k) + F(2k + 1) } else { // n_j is even: k = n_j/2 => n_j = 2k a = c; // F(n_j) = F(2k) b = d; // F(n_j + 1) = F(2k + 1) } } return a; } ``` 在kernel space量測個別的計算時間。 ```c= static ssize_t fib_write(struct file *file, const char *buf, size_t mode, loff_t *offset) { /* mode 0 : 將fib_sequence的function pointer帶入fib_time_proxy, 用來計算fib_sequence的執行時間 mode 1 : 將fib_doubling的function pointer帶入fib_time_proxy, 用來計算fib_doubling的執行時間 */ switch (mode) { case 0: sequence_result = fib_time_proxy(*offset, fib_sequence); printk("%lld", kt); break; case 1: doubling_result = fib_time_proxy(*offset, fib_doubling); printk("%lld", kt); break; case 2: return 1; } return (ssize_t) ktime_to_ns(kt); } ``` ![](https://i.imgur.com/RZsiCDL.png) > fib_seq 是使用iterative,複雜度為O(N)。 > fib_doubling 則是使用fast doubling的做法,複雜度為O(log n) ### overflow 雖然上述方法,都是正確計算fibonacci的程式碼,但是在f(92)以後的數值會出錯,是因為`long long` overflow的問題,可以參考[F(92) 以後的數值錯誤的原因https://leetcode.jp/problems.php](https://hackmd.io/BZjwYw1FQ1O0NPY5kIS7VQ?view#F92-%E4%BB%A5%E5%BE%8C%E7%9A%84%E6%95%B8%E5%80%BC%E9%8C%AF%E8%AA%A4%E7%9A%84%E5%8E%9F%E5%9B%A0)。 ## 大數運算(iterative) 將大數(超過long long的數),轉成char array,在進行數學運算(這個部分先做加法的大數運算)。 ### step 1.將大數轉成char array ```c= static void int2char(long long a, long long b, char *a_s, char *b_s) { int reminder; int last_pos_a = max_space - 1; int last_pos_b = max_space - 1; while (a > 0) { reminder = a % 10; a_s[last_pos_a--] = reminder + '0'; a = a / 10; } while (b > 0) { reminder = b % 10; b_s[last_pos_b--] = reminder + '0'; b = b / 10; } } ``` ![](https://i.imgur.com/s1XFeny.png) >上述圖片,解釋將一個大數轉成相對應的char array。 ### step 2.將兩個char array做相加 ```c= static void sum(char *a, char *b, char *c) { int last_pos = max_space - 1; int carry = 0; int i = max_space - 1, j = max_space - 1; for (; j >= 0; i--, j--) { int sum_v; sum_v = (a[i] - '0') + (b[j] - '0') + carry; carry = sum_v / 10; sum_v %= 10; c[last_pos--] = sum_v + '0'; } } ``` ![](https://i.imgur.com/yVhDPxf.png) >上述圖片,解釋將兩個char array相加的流程。 ### step 3. 將上述大數方法,融入到fib_sequence ```c= static void fib_big_num(int n, char *r) { if (n == 0) { *r = n + '0'; r++; *r = '\0'; return; } if (n == 1 || n == 2) { *r = '1'; r++; *r = '\0'; return; } long long a = 1; long long b = 1; char a_s[max_space]; char b_s[max_space]; char ans[max_space]; for (int i = 0; i < max_space; i++) { a_s[i] = '0'; b_s[i] = '0'; ans[i] = '0'; } int2char(a, b, a_s, b_s); for (int i = 3; i <= n; i++) { sum(a_s, b_s, ans); for (int j = 0; j < max_space; j++) { a_s[j] = b_s[j]; b_s[j] = ans[j]; } } int flag = 0; for (int i = 0; i < max_space; i++) { if (ans[i] == '0' && flag == 0) { continue; } else { flag = 1; *r++ = ans[i]; } } *r = '\0'; } ``` 目前上述方法可以將long long overflow的問題解決,正確算出f(92)以後fibonacci的數字。 ![](https://i.imgur.com/2meIMmy.png) >要算fibonacci之前,要記得把`fibdrv.c`裡的`#define MAX_LENGTH` 調高才行,否則,`fib_device_lessk`算new_pos會算錯。 ![](https://i.imgur.com/PibMa2F.png) > 隨然上述方法能算出正確的答案,但是非常沒有效率。第一個原因是因為此大數運算是運用在本來就比較慢的fib_sequence上。第二個原因是此大數運算本身的實作方式蠻浪費時間的。舉例來說,將兩個char array 相加的過程的那個for迴圈,把一堆沒必要的'0' + '0'也做相加了,所以非常浪費時間。