# 2022q1 Homework3 (fibdrv) contributed by < [Nomad1230](https://github.com/Nomad1230) > ## 實驗環境 ```shell $ gcc --version gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0 $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz CPU family: 6 Model: 158 Thread(s) per core: 2 Core(s) per socket: 4 Socket(s): 1 Stepping: 10 CPU MHz: 2300.000 CPU max MHz: 4000.0000 CPU min MHz: 800.0000 BogoMIPS: 4599.93 Virtualization: VT-x L1d cache: 128 KiB L1i cache: 128 KiB L2 cache: 1 MiB L3 cache: 8 MiB NUMA node0 CPU(s): 0-7 ``` ## 檢查環境 ```shell $ uname -r 5.13.0-37-generic $ dpkg -L linux-headers-`uname -r` | grep "/lib/modules" /lib/modules /lib/modules/5.13.0-37-generic /lib/modules/5.13.0-37-generic/build $ whoami tungyi $ sudo whoami root ``` ## 自我檢查清單 - [ ] 研讀上述 ==Linux 效能分析的提示== 描述,在自己的實體電腦運作 GNU/Linux,做好必要的設定和準備工作 $\to$ 從中也該理解為何不希望在虛擬機器中進行實驗; - [ ] 研讀上述費氏數列相關材料 (包含論文),摘錄關鍵手法,並思考 [clz / ctz](https://en.wikipedia.org/wiki/Find_first_set) 一類的指令對 Fibonacci 數運算的幫助。請列出關鍵程式碼並解說 - [ ] 複習 C 語言 [數值系統](https://hackmd.io/@sysprog/c-numerics) 和 [bitwise operation](https://hackmd.io/@sysprog/c-bitwise),思考 Fibonacci 數快速計算演算法的實作中如何減少乘法運算的成本; - [ ] 研讀 [KYG-yaya573142 的報告](https://hackmd.io/@KYWeng/rkGdultSU),指出針對大數運算,有哪些加速運算和縮減記憶體操作成本的舉措? - [ ] `lsmod` 的輸出結果有一欄名為 `Used by`,這是 "each module's use count and a list of referring modules",但如何實作出來呢?模組間的相依性和實際使用次數 ([reference counting](https://en.wikipedia.org/wiki/Reference_counting)) 在 Linux 核心如何追蹤呢? > 搭配閱讀 [The Linux driver implementer’s API guide » Driver Basics](https://www.kernel.org/doc/html/latest/driver-api/basics.html) - [ ] 注意到 `fibdrv.c` 存在著 `DEFINE_MUTEX`, `mutex_trylock`, `mutex_init`, `mutex_unlock`, `mutex_destroy` 等字樣,什麼場景中會需要呢?撰寫多執行緒的 userspace 程式來測試,觀察 Linux 核心模組若沒用到 mutex,到底會發生什麼問題。嘗試撰寫使用 [POSIX Thread](https://en.wikipedia.org/wiki/POSIX_Threads) 的程式碼來確認。 ## 作業要求 > [fibdrv](https://hackmd.io/@sysprog/linux2020-fibdrv#-%E4%BD%9C%E6%A5%AD%E8%A6%81%E6%B1%82) ## 大數運算 #### bignum 結構 ```c typedef struct _bignum { unsigned long int *value; unsigned long int length; int sign; } bignum; ``` 使用 unsigned long int 的陣列儲存大數,並儲存其長度及正負號。 #### bignum_new ```c bignum *bignum_new(unsigned long int length) { bignum *new = (bignum *) kmalloc(sizeof(bignum)); if (!new) return NULL; new->value = malloc(length * sizeof(unsigned long int)); new->length = length; return new; } ``` 宣告時輸入長度並分配其所需空間。 #### bignum_free ```c void bignum_free(bignum *a) { kfree(a->value); kfree(a); } ``` #### bignum_swap ```c void bignum_swap(bignum *a, bignum *b) { bignum tmp = *a; *a = *b; *b = tmp; } ``` #### bignum_resize ```c void bignum_resize(bignum *a, unsigned long int length) { a->value = (unsigned long int *) krealloc(a->value, length * sizeof(unsigned long int)); a->length = length; } ``` 當大數需要更多空間儲存資料時,使用 `krealloc` 分配更多空間。 #### bignum_add ```c= void bignum_add(bignum *a, bignum *b, bignum *ans) { unsigned long int max_length = a->length > b->length? a->length : b->length; bignum *tmp = NULL; if (ans == a || ans == b) { tmp = ans; ans = bignum_new(max_length); } if (ans->length < max_length) bignum_resize(ans, max_length); if (a->sign == b->sign) { ans->sign = a->sign; unsigned long int i, carry = 0; for (i = 0; i < max_length; i++) { unsigned long int tmp = a->value[i] + carry; carry = tmp > ~b->value[i] || a->value[i] > tmp; ans->value[i] += tmp + b->value[i]; } if (carry) bignum_resize(ans, max_length + 1); } else { ans->sign = 0; int i; unsigned long int borrow = 0; bignum *pos, *neg; if (a->sign) { pos = b; neg = a; } else { pos = a; neg = b; } for (i = 0; i < max_length; i++) { unsigned long int tmp = pos->value[i] - borrow; borrow = tmp < neg->value[i] || tmp == ULONG_MAX; ans->value[i] = tmp - neg->value[i]; } i = max_length - 1; while (!ans->value[i]) i--; ans->length = i + 1; if (borrow) { ans->sign = 1; for (i = max_length - 1; i >= 0; i--) ans->value[i] = ~ans->value[i]; ans->value[0]++; } } if (tmp) { bignum_swap(tmp, ans); bignum_free(ans); } } ``` 第 8 ~ 17 行是同號數的加法,若是計算完最高位時依然有 carry bit ,則要呼叫 `bignum_resize` 分配更多空間。 第 21 ~ 51 行是異號的減法,減法完成後會重新計算長度,將高位數值為 0 的部份不計,若是最後依然有 borrow bit 則要將整個大數做二補數運算。 #### bignum_mul ```c void bignum_mul(bignum *a,bignum *b, bignum *ans) { unsigned long int i,j; unsigned long int max_length = a->length + b->length; if (ans->length < max_length) bignum_resize(ans, max_length); bignum *tmp = NULL; if (ans == a || ans == b) { tmp = ans; ans = bignum_new(max_length); } for (i = 0; i < a->length; i++) { int a_clz = __builtin_clzl(a->value[i]); unsigned long int carry = 0; unsigned long int rem = 0; for (j = 0; j < b->length; j++) { int b_clz = __builtin_clzl(b->value[j]); carry = ((a->value[i] >> b_clz) * (b->value[j] >> a_clz)) >> (128 - a_clz - b_clz); rem = a->value[i] * b->value[j]; bignum_do_mul_add(ans, i+j+1, carry); bignum_do_mul_add(ans, i+j, rem); } } ans->sign = a->sign ^ b->sign; i = max_length - 1; while (!ans->value[i]) i--; ans->length = i + 1; if (tmp) { bignum_swap(tmp, ans); bignum_free(ans); } } ``` ```c void bignum_do_mul_add(bignum *ans, unsigned long int length, unsigned long int value) { unsigned long int carry = 0; do { ans->value[length] += carry;//unsigned long int tmp = ans->value[length] + carry; carry = ans->value[length] > ~value || (ans->value[length] == 0 && carry); ans->value[length] += value; value = 0; length++; } while (carry); } ``` 在計算乘法時,使用 bitwise 操作直接計算 carry 值,但由於計算時是使用 unsigned long int 儲存計算結果,在兩數的 clz 值加起來小於 64 時還是會造成錯誤。 此套算法可以計算出 F(100) 的數值,但若是要再計算更大的數值有可能會發生錯誤。 #### bignum_mul 修正 將計算計算乘法部份的程式碼修正如下: ```c for (i = 0; i < a->length; i++) { for (j = 0; j < b->length; j++) { unsigned __int128 result = (unsigned __int128) a->value[i] * b->value[j]; unsigned long int carry = result >> 64; unsigned long int rem = result & ULONG_MAX; printf("i = %lu, j = %lu, carry = %lu, rem = %lu\n", i, j, carry, rem); bignum_do_mul_add(ans, i+j+1, carry); bignum_do_mul_add(ans, i+j, rem); } } ``` 改由使用 `unsigned __int128` 來儲存計算過程,即可計算出結果。 #### 乘法的正確性驗證 在與教授 meeting 的過程中被教授提問要如何檢驗自己寫的 `bignum_mul` 是否可以計算出正確結果,目前我想到的方式是去檢驗其是否符合乘法的基本性質:交換律、結合律、分配律。 使用 $xy = x + x(y - 1) = y + y(x - 1)$ ## 結果輸出 ## Fast Doubling ## sysfs :::danger 1.如何檢驗大數乘法的正確性 2.閱讀 sysfs 的課程文件及觀看課程影片,紀錄中途遇到的問題 :::