2022q1 Homework3 (fibdrv)

# 2022q1 Homework3 (fibdrv) contributed by < `freshLiver` > ###### tags: `linux2022` `kernel module` `fibonacci` ## 準備工作 ### 測試環境 ```shell $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 39 bits physical, 48 bits virtual CPU(s): 12 On-line CPU(s) list: 0-11 Thread(s) per core: 2 Core(s) per socket: 6 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family: 6 Model: 167 Model name: 11th Gen Intel(R) Core(TM) i5-11400 @ 2.60GHz Stepping: 1 CPU MHz: 665.263 CPU max MHz: 4400.0000 CPU min MHz: 800.0000 BogoMIPS: 5184.00 Virtualization: VT-x L1d cache: 288 KiB L1i cache: 192 KiB L2 cache: 3 MiB L3 cache: 12 MiB NUMA node0 CPU(s): 0-11 ``` ### Kernel Headers ```json "includePath": [ "${default}", "/lib/modules/5.11.0-49-generic/build/arch/x86/include", "/lib/modules/5.11.0-49-generic/build/arch/x86/include/generated", "/lib/modules/5.11.0-49-generic/build/include", "/lib/modules/5.11.0-49-generic/build/include/generated" ], "defines": [ "MODULE", "__KERNEL__" ] ``` VS Code 相關的設定 ( `.vscode/c_cpp_properties.json` )，可以避免編輯器找不到 Kernel Header 而跳出錯誤。 ## fibdrv 實作 ### 現有缺陷 #### Variable-length Array 如同註解所示，Linux Kernel 中不應該使用 C99 新增的 VLA 來宣告陣列，可以改用大小為 2 的陣列搭配 `swap` 巨集改寫： ```c static long long fib_sequence(long long k) { long long f[2] = {0, 1}; for (int i = 1; i <= k; i++) { swap(f[0], f[1]); f[0] += f[1]; } return f[0]; } ``` 但這樣一樣會因為 64 位元的限制造成 `fib(93)` 後的結果仍是錯誤的。 ### 更快速的 Fibonacci 運算實作 - [x] [費氏數列分析](https://hackmd.io/@sysprog/fibonacci-number) - [ ] [Fibonacci and Golden Ratio Formulae](http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/Fibonacci/fibFormulae.html) - [x] [The Fibonacci Q-matrix](https://youtu.be/lTHVwsHJrG0) - [x] [Fast Doubling](https://chunminchang.github.io/blog/post/calculating-fibonacci-numbers-by-fast-doubling) #### 實作 Fast Doubling 使用 Fast Doubling 的規則加上 Bitwise 的應用進行遞迴版的實作： ```c static inline uint64_t fast_doubling(uint32_t target) { if (target <= 2) return !!target; // fib(2n) = fib(n) * (2 * fib(n+1) − fib(n)) // fib(2n+1) = fib(n) * fib(n) + fib(n+1) * fib(n+1) uint64_t n = fast_doubling(target >> 1); uint64_t n1 = fast_doubling((target >> 1) + 1); // check 2n or 2n+1 if (target & 1) return n * n + n1 * n1; return n * ((n1 << 1) - n); } ``` 但雖然減少了不少計算，但還是會有重複計算的部份： ```graphviz digraph FIB { "3a" [label="3"; color="red"] "3b" [label="3"; color="red"] "2a" [label="2"] "2b" [label="2"] "2c" [label="2"] "1a" [label="1"] "1b" [label="1"] 6 -> "3a"; 6 -> 4; "3a" -> "2a"; "3a" -> "1a"; 4 -> "2b"; 4 -> "3b"; "3b" -> "2c"; "3b" -> "1b"; } ``` 而當 `target` 越大時重複的計算會造成的效能影響越大。 #### Bottom-up Fast Doubling 從一數的 2 進制表示可以發現該數是如何產生的，以 $87_{10}$ 為例： ```text 87 = 1010111 (87 >> i+1) i = 0 : 43 = (1010111 - 1) << 1 = 101011 i = 1 : 21 = ( 101011 - 1) << 1 = 10101 i = 2 : 10 = ( 10101 - 1) << 1 = 1010 i = 3 : 5 = ( 1010 - 0) << 1 = 101 i = 4 : 2 = ( 101 - 1) << 1 = 10 i = 5 : 1 = ( 10 - 0) << 1 = 1 i = 6 : 0 = ( 1 - 1) << 1 = 0 ^ 87 的第 i 個位元 ``` 若是進行移項並反過來看的話會變成： ```text (87 >> i+1) i = 6 : 1 = 0 << 1 + 1 = 1 i = 5 : 2 = 1 << 1 + 0 = 10 i = 4 : 5 = 10 << 1 + 1 = 101 i = 3 : 10 = 101 << 1 + 0 = 1010 i = 2 : 21 = 1010 << 1 + 1 = 10101 i = 1 : 43 = 10101 << 1 + 1 = 101011 i = 0 : 87 = 101011 << 1 + 1 = 1010111 ^ 87 的第 i 個位元 ``` 從 $n=0$ 開始看，可以發現每次位移後只要檢查目標數對應的位元，就可以知道下次應以 $fib(2n)$ 還是 $fib(2n+1)$ 為基礎進行右移。若以文字表示流程的話可以寫成： 1. 從最高位元的 1 開始，此時 $n=1$，而： - $fib(n)=1$ - $fib(n+1)=1$ 2. 若下一個位元不存在的話跳到第 3 步，否則（假設目前為 $n=k$）： - 透過 $fib(k)$ 以及 $fib(k+1)$ 計算 $fib(2k)$ 以及 $fib(2k+1)$ - 檢查下一個位元： - 0：$n = 2k$ - 1：$n = 2k + 1$，此時需要 $fib(n+1)$ 讓下一迭代能夠計算 $fib(2n)$ 以及 $fib(2n+1)$ - 回到第 2 步 3. 此時 $n$ 為目標數，回傳 $fib(n)$ 若寫成程式碼的話則會變成： ```c static inline uint64_t fast_doubling_iter(uint64_t target) { if (target <= 2) return !!target; // find first 1 uint8_t count = 63 - __builtin_clzll(target); uint64_t fib_n0 = 1, fib_n1 = 1; for (uint64_t i = count, fib_2n0, fib_2n1; i-- > 0;) { fib_2n0 = fib_n0 * ((fib_n1 << 1) - fib_n0); fib_2n1 = fib_n0 * fib_n0 + fib_n1 * fib_n1; if (target & (1UL << i)) { fib_n0 = fib_2n1; fib_n1 = fib_2n0 + fib_2n1; } else { fib_n0 = fib_2n0; fib_n1 = fib_2n1; } } return fib_n0; } ``` :::warning 由於 cppcheck 會偵測區域變數的 scope 是否能夠再縮減，因此這邊將 `i` 一併宣告成 64 位元的無號整數，以便將 `for` 迴圈需要用到的區域變數一併宣告。 ::: 而迴圈內 `if...else...` 的部份可以利用 `-!!(target & (1 << i))` 作為 mask 的技巧簡化成： ```c static inline uint64_t fast_doubling_iter(uint64_t target) { if (target <= 2) return !!target; // find first 1 uint8_t count = 63 - __builtin_clzll(target); uint64_t fib_n0 = 1, fib_n1 = 1; for (uint64_t i = count, fib_2n0, fib_2n1, mask; i-- > 0;) { fib_2n0 = fib_n0 * ((fib_n1 << 1) - fib_n0); fib_2n1 = fib_n0 * fib_n0 + fib_n1 * fib_n1; mask = -!!(target & (1UL << i)); fib_n0 = (fib_2n0 & ~mask) + (fib_2n1 & mask); fib_n1 = (fib_2n0 & mask) + fib_2n1; } return fib_n0; } ``` ### 更精確的 Fibonacci 運算實作 #### 基於 `list_head` 的大數運算先在熟悉的 user space 進行實作來檢驗程式碼的正確性。 ##### 實作理念 ```c typedef struct { size_t len; struct list_head link; } bignum_head; typedef struct { uint64_t value; struct list_head link; } bignum_node; #define NEW_NODE(head, val) \ ({ \ bignum_node *node = malloc(sizeof(bignum_node)); \ if (node) { \ node->value = val; \ list_add_tail(&node->link, head); \ } \ }) ``` - 使用鏈結串列避免像字串操作須重複 `malloc` 以及 `free` 的成本 - 利用內建的 64 位元整數型別進行運算 - 減少解碼成本（僅須 `printf` 類的格式化輸出函式） ```c= static inline void bignum_add_to_smaller(struct list_head *lgr, struct list_head *slr) { struct list_head **l = &lgr->next, **s = &slr->next; for (bool carry = 0;; l = &(*l)->next, s = &(*s)->next) { if (*s == slr) { if (*l == lgr) { if (carry) NEW_NODE(slr, 1); break; } NEW_NODE(slr, 0); } bignum_node *lentry = list_entry(*l, bignum_node, link), *sentry = list_entry(*s, bignum_node, link); carry = FULL_ADDER_64(lentry->value, sentry->value, carry); } } ``` :::danger 程式碼中的 `l` 以及 `s` 兩個 indirect pointer 雖然看似多餘，但是其實不可被簡化，否則會影響答案，因為：第 14 行的 `NEW_NODE` 會新增一節點並透過 `list_add_tail` 加到 `slr` 的尾端，因此若是單純使用指向 `list_head` 的指標的話，則需要在 `NEW_NODE` 後重新將 `s` 指向新的節點，否則第 18 行的 `list_entry` 相當於對串列的 head 使用 `list_entry`，造成存取到預期外位址。因此使用 indirect pointer `*s` 來省略將 `s` 指向新的尾端節點的步驟。 ::: ```c #define MAX_DIGITS 18 #define BOUND64 1000000000000000000UL #define FULL_ADDER_64(a, b, carry) \ ({ \ uint64_t res = (a) + (b) + (carry); \ uint64_t overflow = -!!(res >= BOUND64); \ (b) = res - (BOUND64 & overflow); \ overflow; \ }) ``` 由於 64 位元無號整數的上限為 20 位數的 $18,446,744,073,709,551,615$，而為了減少解碼的成本，透過限制每個節點的上限值為 10 的冪，每當超過上限時就新增一節點儲存進位，而解碼時只要從最高位依序將每個節點以 10 進位的形式印出即可完成解碼。 :::success 為了盡可能的減少節點數量，原先是以 `UINT64_MAX` 作為上限，但發現在解碼時仍須進行大數運算，因此改以 10 的冪（$10^{18} = 1000000000000000000$）作為上限值，來減少解碼成本。而不選擇以 $10^{19}$ 作為上限的原因則是考慮到兩個 $10^{19}-1$ 相加時會超過 `UINT64_MAX`，造成需要額外判斷是否有 overflow 的發生，因此選擇少一位數的 $10^{18}$ 作為上限配合 $\geq$ 進行簡單的判斷。 ::: ##### 測試結果（最簡單實作的費氏數列運算） ```c int main(int argc, char const *argv[]) { struct list_head *h1 = bignum_new(1); struct list_head *h2 = bignum_new(0); for (int i = 0; i < atoi(argv[1]); ++i) { bignum_add_to_smaller(h1, h2); swap(h1, h2); } bignum_to_string(h2, NULL, 0); bignum_free(h1); bignum_free(h2); return 0; } ``` 由於目前只有實作加法的功能，因此先以最基本的遞迴呼叫測試： ```shell $ taskset -cp 1 pid 1's current affinity list: 0-9 ``` ```shell $ time taskset -c 10 ./a.out 100000 | wc 0 1 20899 taskset -c 10 ./a.out 100000 0.25s user 0.00s system 99% cpu 0.257 total wc 0.00s user 0.00s system 0% cpu 0.256 total ``` $fib(100000)$ 大約在 $\frac{1}{4}$ 秒左右計算完成，且與 [WolframAlpha 上的結果](https://www.wolframalpha.com/input?i=fib%28100000%29)相同。 ```shell $ time taskset -c 10 ./a.out 1000000 | wc 0 1 208988 taskset -c 10 ./a.out 1000000 29.70s user 0.00s system 99% cpu 29.707 total wc 0.00s user 0.00s system 0% cpu 29.707 total ``` 而 $fib(1000000)$ 大約在 30 秒內計算完成，但因位數過多無法與 WolframAlpha 上的結果相比。 ##### 實作 Fast Doubling 版本 :::warning TODO : 需要實作 in-place 乘法運算 ::: #### 以 Kernel API 改寫 ##### 動態記憶體配置而由於 Linux Kernel 中不應使用 VLA，所以需要使用動態記憶體配置： > 參考資料：[LWN - Variable-length arrays and the max() mess](https://lwn.net/Articles/749064/) 但若使用熟悉的 `malloc` 的話，會發現連熟悉的函式庫都找不到，因此 `make all` 時就會發現編譯失敗： ```text /home/freshliver/Dropbox/Notes/_jserv/linux/labs/lab3-fibdrv/fibdrv.c:11:10: fatal error: stdlib.h: 沒有此一檔案或目錄 11 | #include <stdlib.h> ``` > Kernel Module 不能使用標準函式庫的原因參考了 [Stackoverflow 的說明](https://stackoverflow.com/questions/26455140/how-does-the-kernel-stop-you-using-malloc)，但還需要查證其他第一手資料。因此需要從 Linux Kernel 的 Header 中找出動態記憶體配置的函式，以及需要引入的標頭檔。而從 [Linux Kernel 文件的 Memory Allocation Guide](https://www.kernel.org/doc/html/latest/core-api/memory-allocation.html) 中有寫到： > Linux provides a variety of APIs for memory allocation. You can allocate small chunks using `kmalloc` or `kmem_cache_alloc` families, large virtually contiguous areas using `vmalloc` and its derivatives, or you can directly request pages from the page allocator with `alloc_pages`. It is also possible to use more specialized allocators, for instance `cma_alloc` or `zs_malloc`. 這段首先提示了幾個可用來動態記憶體配置的函式，而這些函式需要引入以下幾個標頭檔： ```c #include <linux/mm.h> #include <linux/slab.h> #include <linux/vmalloc.h> ``` 接下來則是要從[一堆看起來很像的記憶體配置函式](https://www.kernel.org/doc/html/latest/core-api/mm-api.html)中尋找符合須由的函式，而在下方的 [Selecting memory allocator](https://www.kernel.org/doc/html/latest/core-api/memory-allocation.html#selecting-memory-allocator) 部份則有提到選擇 Allocator 的建議以及限制： > The most straightforward way to allocate memory is to use a function from the kmalloc() family. And, to be on the safe side it’s best to use routines that set memory to zero, like kzalloc(). > The maximal size of a chunk that can be allocated with kmalloc is limited. The actual limit depends on the hardware and the kernel configuration, but it is a **good practice to use kmalloc for objects smaller than page size**. > For **large** allocations you can use vmalloc() and vzalloc(),(後略) 其中提到了 `kmalloc` 的限制，可以從 [Linux Manual Page 中 `sysconf` 的 POSIX.1 variables 部份](https://www.man7.org/linux/man-pages/man3/sysconf.3.html) 中找到 `PAGESIZE` 這個系統變數，而系統變數則可以透過 [`getconf`](https://www.man7.org/linux/man-pages/man1/getconf.1p.html) 取得（單位為位元組）： ```shell $ getconf PAGESIZE 4096 ``` 由於在我電腦上的 `PAGESIZE` 設定是 4 KB，因此在這次實作中，只有能夠確保記憶體佔用小於 4 KB 的物件才使用 `kmalloc` 進行配置，例如：`list_head` 以及 `bignum_head` 等簡單的結構體物件，並使用 `kfree` 釋放對應的記憶體哭間。而無法確保大小的物件，例如 $fib(k)$ 的結果字串，則使用 `vmalloc` 相關的函式進行配置，並搭配 `vfree` 釋放記憶體空間。 ##### 改寫記憶體配置的部份在建立節點以及刪除串列的部份比較單純，直接將 `malloc` 以及 `free` 分別改成 `kmalloc` 以及 `kfree` 即可： :::warning GFP ::: ```c #define NEW_NODE(head, val) \ ({ \ bignum_node *node = kmalloc(sizeof(bignum_node), GFP_KERNEL); \ if (node) { \ node->value = val; \ list_entry(head, bignum_head, link)->len++; \ list_add_tail(&node->link, head); \ } \ }) static inline void bignum_free(struct list_head *head) { bignum_node *ptr, *next; list_for_each_entry_safe (ptr, next, head, link) kfree(ptr); kfree(list_entry(head, bignum_head, link)); } ``` 而解碼的部份則是使用 `vzalloc` 來省去 `memset` 的步驟： ```c static inline char *bignum_to_string(struct list_head *head) { // UINT64 < BOUND64 (10^18) size_t digits = list_entry(head, bignum_head, link)->len * MAX_DIGITS + 1; // DMA for result string char *res = vzalloc(sizeof(char) * digits), *pres = res; if (!res) return NULL; // decode from Most Significant Node if (head == head->next) { vfree(res); return NULL; } uint64_t node_result = 0; struct list_head *p = head->prev; // Most Significant Node node_result = list_entry(p, bignum_node, link)->value; snprintf(pres, MAX_DIGITS + 1, "%" U64_FMT, node_result); // other nodes for (p = p->prev; p != head; p = p->prev) { size_t pos = strlen(res); node_result = list_entry(p, bignum_node, link)->value; snprintf(&res[pos], MAX_DIGITS + 1, "%018" U64_FMT, node_result); } return res; } ``` ##### 改寫 `fibdrv.c` ```c static long long fib_sequence_big(uint64_t target, char *buf, size_t size) { struct list_head *lgr = bignum_new(1), *slr = bignum_new(0); for (uint64_t i = 0; i < target; ++i) { bignum_add_to_smaller(lgr, slr); swap(lgr, slr); } char *result = bignum_to_string(slr); size_t failed = copy_to_user(buf, result, size); vfree(result); bignum_free(lgr); bignum_free(slr); return size - failed; } ``` 在 `client.c` 中新增一個函式來進行最基本的 `fibonacci` 運算，並將運算結果透過 `copy_to_user` 複製到在 user space 中透過 `malloc` 分配的記憶體，最後再釋放記憶體空間。 ```c static ssize_t fib_read(struct file *file, char *buf, size_t size, loff_t *offset) { return (ssize_t) fib_sequence_big(*offset, buf, size); } ``` 接著則須改寫 `fib_read` 的部份，傳入 `buf` 以及 `size` 讓 `client` 能夠取得運算結果。 ##### 改寫 `client.c` ```c // fib(n) length ~= 0.2090n size_t length_pred = offset / 4 + 1; char *result = calloc(length_pred, 1); for (int i = 0; i <= offset; i++) { lseek(fd, i, SEEK_SET); sz = read(fd, result, length_pred); printf("Reading from " FIB_DEV " at offset %d, returned the sequence " "%s.\n", i, result); } for (int i = offset; i >= 0; i--) { lseek(fd, i, SEEK_SET); sz = read(fd, result, length_pred); printf("Reading from " FIB_DEV " at offset %d, returned the sequence " "%s.\n", i, result); } free(result); ``` 而 `client.c` 則須修改兩個包含 `read` 的 `for` 迴圈的內容，為了確保答案能夠全部存放到 `buffer` 中，這邊使用了大於 [$fib(n)$ 位數預測值 $n \times 0.2090$](https://en.wikipedia.org/wiki/Fibonacci_number#Magnitude) 的 $n \times 0.25$ 作為 `buffer` 大小，並使用此 `buffer` 作為 `read` 的參數儲存結果，最後再將 `printf` 改用 `result` 輸出答案。 ##### 移除 `MAX_LENGTH` 限制在完成上面的步驟後，如果直接執行 `make check` 會發現 `fibdrv` 的輸出在 $fib(92)$ 後與預期的數值不同，這是因為 `fibdrv.c` 原先用來回傳答案的型別 `sszie_t` 無法表示 $fib(93)$ 以及之後的數值，因此有在 `seek` 的部份另外進行限制： ```c #define MAX_LENGTH 92 static loff_t fib_device_lseek(struct file *file, loff_t offset, int orig) { loff_t new_pos = 0; switch (orig) { case 0: /* SEEK_SET: */ new_pos = offset; break; case 1: /* SEEK_CUR: */ new_pos = file->f_pos + offset; break; case 2: /* SEEK_END: */ new_pos = MAX_LENGTH - offset; break; } if (new_pos > MAX_LENGTH) new_pos = MAX_LENGTH; // max case if (new_pos < 0) new_pos = 0; // min case file->f_pos = new_pos; // This is what we'll use now return new_pos; } ``` 這會讓 `seek` 無法設定成大於 92 的數，因此 `fib_read` 參數的 `offset` 最大只會是 92，造成在 `client.c` 中呼叫的 $fib(k), \forall k \geq 93$ 都以 $fib(92)$ 替代，使得 `fibdrv` 在 $fib(92)$ 後的輸出皆相同，因此只要將 `MAX_LENGTH` 改成 `100` 即可。 ##### 修改正確答案 ```diff Reading from /dev/fibonacci at offset 90, returned the sequence 2880067194370816120. Reading from /dev/fibonacci at offset 91, returned the sequence 4660046610375530309. Reading from /dev/fibonacci at offset 92, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 93, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 94, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 95, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 96, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 97, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 98, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 99, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 100, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 100, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 99, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 98, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 97, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 96, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 95, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 94, returned the sequence 7540113804746346429. -Reading from /dev/fibonacci at offset 93, returned the sequence 7540113804746346429. +Reading from /dev/fibonacci at offset 93, returned the sequence 12200160415121876738. +Reading from /dev/fibonacci at offset 94, returned the sequence 19740274219868223167. +Reading from /dev/fibonacci at offset 95, returned the sequence 31940434634990099905. +Reading from /dev/fibonacci at offset 96, returned the sequence 51680708854858323072. +Reading from /dev/fibonacci at offset 97, returned the sequence 83621143489848422977. +Reading from /dev/fibonacci at offset 98, returned the sequence 135301852344706746049. +Reading from /dev/fibonacci at offset 99, returned the sequence 218922995834555169026. +Reading from /dev/fibonacci at offset 100, returned the sequence 354224848179261915075. +Reading from /dev/fibonacci at offset 100, returned the sequence 354224848179261915075. +Reading from /dev/fibonacci at offset 99, returned the sequence 218922995834555169026. +Reading from /dev/fibonacci at offset 98, returned the sequence 135301852344706746049. +Reading from /dev/fibonacci at offset 97, returned the sequence 83621143489848422977. +Reading from /dev/fibonacci at offset 96, returned the sequence 51680708854858323072. +Reading from /dev/fibonacci at offset 95, returned the sequence 31940434634990099905. +Reading from /dev/fibonacci at offset 94, returned the sequence 19740274219868223167. +Reading from /dev/fibonacci at offset 93, returned the sequence 12200160415121876738. Reading from /dev/fibonacci at offset 92, returned the sequence 7540113804746346429. Reading from /dev/fibonacci at offset 91, returned the sequence 4660046610375530309. Reading from /dev/fibonacci at offset 90, returned the sequence 2880067194370816120. ``` ### 透過寫入使用不同的實作求值 ```c static enum FIB_MODES { FIB_MODE_BASIC_64 = 0, FIB_MODE_BASIC_BIG = 1, FIB_MODE_FAST_DOUBLING_64 = 2, FIB_MODE_FAST_DOUBLING_BIG = -1, } mode = FIB_MODE_BASIC_64; ``` 透過建立一個包含數種費氏數列計算實作的 `enum`，並修改 `fib_write` 讓程式能夠透過對 `/dev/fibonacci` 進行 `write` 來修改模式： ```c static ssize_t fib_write(struct file *file, const char *buf, size_t size, loff_t *offset) { int val, err = kstrtoint_from_user(buf, size, 10U, &val); if (err == ERANGE || err == EINVAL) pr_err("OVERFLOW OR NOT A NUMBER STRING.\n"); else switch (val) { case FIB_MODE_BASIC_64: pr_info("SET MODE : BASIC_64.\n"); mode = FIB_MODE_BASIC_64; return FIB_MODE_BASIC_64; case FIB_MODE_FAST_DOUBLING_64: pr_info("SET MODE : FAST_64.\n"); mode = FIB_MODE_FAST_DOUBLING_64; return FIB_MODE_FAST_DOUBLING_64; case FIB_MODE_BASIC_BIG: pr_info("SET MODE : BASIC_BIG.\n"); mode = FIB_MODE_BASIC_BIG; return FIB_MODE_BASIC_BIG; case FIB_MODE_FAST_DOUBLING_BIG: default: pr_warn("TO BE IMPLEMENTED.\n"); break; } return 1; } ``` 這邊是選擇使用整數作為各種實作模式的編號，在 `fib_write` 中會先透過 `kstrtoint_from_user` 嘗試將 `buf` 中的字串轉為整數，接著再根據結果設定模式。 :::warning `printk` 或是 `pr_info` 等 [Message Logging Functions](https://www.kernel.org/doc/html/latest/core-api/printk-basics.html) 都會輸出到 `/var/log/syslog` 中，可以透過 `tail` 或是 `cat` 等命令檢查程式是否有照預期運作。 ```shell Mar 16 00:02:24 freshliver-K21 kernel: [78008.798135] MODE = BASIC_64. Mar 16 00:02:27 freshliver-K21 kernel: [78011.276084] UNKNOWN MODE. Mar 16 00:02:27 freshliver-K21 kernel: [78011.276092] MODE = BASIC_64. Mar 16 00:02:31 freshliver-K21 kernel: [78015.190769] MODE = BASIC_64. Mar 16 00:02:33 freshliver-K21 kernel: [78017.419772] MODE = BASIC_64. Mar 16 00:02:36 freshliver-K21 kernel: [78019.993990] SET MODE : BASIC_BIG. Mar 16 00:02:36 freshliver-K21 kernel: [78019.993994] MODE = BASIC_BIG. Mar 16 00:02:45 freshliver-K21 kernel: [78028.973371] SET MODE : BASIC_BIG. Mar 16 00:02:45 freshliver-K21 kernel: [78028.973374] MODE = BASIC_BIG. ``` ::: 而 `fib_read` 中則須根據選擇的模式呼叫對應的實作函式來進行計算： ```c static ssize_t fib_read(struct file *file, char *buf, size_t size, loff_t *offset) { switch (mode) { case FIB_MODE_BASIC_64: printk(KERN_INFO "MODE = BASIC_64.\n"); return (ssize_t) fib_basic_64(*offset, buf, size); case FIB_MODE_FAST_DOUBLING_64: printk(KERN_INFO "MODE = FAST_DOUBLING_64.\n"); return (ssize_t) fib_fast_64(*offset, buf, size); case FIB_MODE_BASIC_BIG: printk(KERN_INFO "MODE = BASIC_BIG.\n"); return (ssize_t) fib_basic_big(*offset, buf, size); case FIB_MODE_FAST_DOUBLING_BIG: default: return copy_to_user(buf, "TO BE IMPLEMENTED.", size); } } ``` ## 評測及分析在評測及分析的部份參考了 [KYG-yaya573142 的報告](https://hackmd.io/@KYWeng/rkGdultSU)中所使用到的技巧以及觀念。 ### 準備 #### 減少干擾因素為了進可能減少干擾因素，在測試時做了以下處理： - 關閉其他可能影響測量的程式 - 修改 GRUB 參數讓部份 CPU 不被使用並使用 `taskset` 設定 Affinity ```shell $ cat /etc/default/grub | grep "isolcpu" GRUB_CMDLINE_LINUX_DEFAULT="isolcpus=10,11" ``` - 關閉 Intel Turbo `sudo sh -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo"` - 關閉 ASLR `sudo sh -c "echo 0 > /proc/sys/kernel/randomize_va_space"` - 設定 scaling_governor 為 performance `sudo sh -c "echo performance > /sys/devices/system/cpu/cpu$CPU_ID/cpufreq/scaling_governor"` - 重複測量並排除 95% 信賴區間外的數據 - 小範圍 ($fib(0)$ 到 $fib(92)$) 的測試重複 1000 次 - 大範圍 ($fib(0)$ 到 $fib(2000)$) 的測試則重複 100 次 :::danger ```bash ISOLATED_CPU_ID=11 ISOLATED_CPU_AFFINITY_MASK=7ff for irq in $(find /proc/irq/ -name "smp_affinity"); do AND_MSK="$(( 0x$(cat $irq) & 0x$ISOLATED_CPU_AFFINITY_MASK ))" AND_MSK="$(printf \"%03x\" $AND_MSK )" sudo sh -c "echo $AND_MSK > $irq"; done sudo sh -c "echo $ISOLATED_CPU_AFFINITY_MASK > /proc/irq/default_smp_affinity" ``` 修改部份 IRQ 的 `smp_affinity` 時會出現 `sh: 1: echo: echo: I/O error` 的錯誤訊息： ```shell $ ISOLATED_CPU_AFFINITY_MASK=7ff $ for irq in $(find /proc/irq/ -name "smp_affinity"); do AND_MSK="$(( 0x$(cat $irq) & 0x$ISOLATED_CPU_AFFINITY_MASK ))" AND_MSK="$(printf \"%03x\" $AND_MSK )" sudo sh -c "echo $AND_MSK > $irq"; done sh: 1: echo: echo: I/O error sh: 1: echo: echo: I/O error sh: 1: echo: echo: I/O error ... ``` ::: #### 透過 Shell Script 進行自動測試以及格式化輸出為了方便重複測量執行時間以及處理數據，另外[寫了一個 Shell Script](https://github.com/freshLiver/fibdrv/commit/4e1e729c243c4c1c23502a9f06a402c6d4ee1190) 來處理前一部份提到的處理以及重複測量相關的，這會將 [experiments/read_perf.c](https://github.com/freshLiver/fibdrv/commit/35dc8cbc008814fe61a7fc174a549e12b2ce40fc) 的結果以 JSON 的格式輸出，最後再交給 [Python Script](https://github.com/freshLiver/fibdrv/blob/4e1e729c243c4c1c23502a9f06a402c6d4ee1190/experiments/plot_result.ipynb) 進行信賴區間的處理以及繪製圖表。 ### 64 位元整數版本 #### VLA (原始版本) #### 無 VLA ![](https://i.imgur.com/LNGHxJW.png) :::warning 原先即使關閉了 Turbo、ASLR 也將 cpufreq 設定成 PERFORMANCE 模式，並透過 taskset 將測試程式指定在其中一個已設定 isolcpu 的 cpu 上測試了，但似乎仍會受到其他因素影響、畫出像下方一樣有明顯抖動的結果。 ![](https://i.imgur.com/0QVUad5.png) 結果透過 htop 發現 dropbox 有明顯佔用其他 cpu，而將 dropbox 關閉後就沒有明顯的抖動了。 ::: #### Fast Doubling ##### Top-down (Recursive) ##### Bottom-up (Iterative) ![](https://i.imgur.com/23V5952.png) ##### `ctz`/`clz` 類的指令對費氏數列的效益在 Bottom-up 的策略中可以看到 `__builtin_clz` 的使用，以便快速的找到最高位的 1 :::warning TODO : 與自定義的 `clz` 進行比較 ::: ### 大數運算版本 - `bignum` ![](https://i.imgur.com/QPYwpzB.png) ![](https://i.imgur.com/660giGy.png) #### --- // #### 加速大數運算與縮減記憶體操作成本的措施 - [ ] [KYG-yaya573142 的報告](https://hackmd.io/@KYWeng/rkGdultSU) ### `fibdrv.c` 中的 MUTEX > 注意到 `fibdrv.c` 存在著 `DEFINE_MUTEX`, `mutex_trylock`, `mutex_init`, `mutex_unlock`, `mutex_destroy` 等字樣，什麼場景中會需要呢？撰寫多執行緒的 userspace 程式來測試，觀察 Linux 核心模組若沒用到 mutex，到底會發生什麼問題。嘗試撰寫使用 POSIX Thread 的程式碼來確認。 ## 其他 ### 為何不希望在虛擬機器中進行實驗