# 2022q1 Homework3 (fibdrv)
contributed by < [Nomad1230](https://github.com/Nomad1230) >
## 實驗環境
```shell
$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: GenuineIntel
Model name: Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz
CPU family: 6
Model: 158
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 10
CPU MHz: 2300.000
CPU max MHz: 4000.0000
CPU min MHz: 800.0000
BogoMIPS: 4599.93
Virtualization: VT-x
L1d cache: 128 KiB
L1i cache: 128 KiB
L2 cache: 1 MiB
L3 cache: 8 MiB
NUMA node0 CPU(s): 0-7
```
## 檢查環境
```shell
$ uname -r
5.13.0-37-generic
$ dpkg -L linux-headers-`uname -r` | grep "/lib/modules"
/lib/modules
/lib/modules/5.13.0-37-generic
/lib/modules/5.13.0-37-generic/build
$ whoami
tungyi
$ sudo whoami
root
```
## 自我檢查清單
- [ ] 研讀上述 ==Linux 效能分析的提示== 描述,在自己的實體電腦運作 GNU/Linux,做好必要的設定和準備工作 $\to$ 從中也該理解為何不希望在虛擬機器中進行實驗;
- [ ] 研讀上述費氏數列相關材料 (包含論文),摘錄關鍵手法,並思考 [clz / ctz](https://en.wikipedia.org/wiki/Find_first_set) 一類的指令對 Fibonacci 數運算的幫助。請列出關鍵程式碼並解說
- [ ] 複習 C 語言 [數值系統](https://hackmd.io/@sysprog/c-numerics) 和 [bitwise operation](https://hackmd.io/@sysprog/c-bitwise),思考 Fibonacci 數快速計算演算法的實作中如何減少乘法運算的成本;
- [ ] 研讀 [KYG-yaya573142 的報告](https://hackmd.io/@KYWeng/rkGdultSU),指出針對大數運算,有哪些加速運算和縮減記憶體操作成本的舉措?
- [ ] `lsmod` 的輸出結果有一欄名為 `Used by`,這是 "each module's use count and a list of referring modules",但如何實作出來呢?模組間的相依性和實際使用次數 ([reference counting](https://en.wikipedia.org/wiki/Reference_counting)) 在 Linux 核心如何追蹤呢?
> 搭配閱讀 [The Linux driver implementer’s API guide » Driver Basics](https://www.kernel.org/doc/html/latest/driver-api/basics.html)
- [ ] 注意到 `fibdrv.c` 存在著 `DEFINE_MUTEX`, `mutex_trylock`, `mutex_init`, `mutex_unlock`, `mutex_destroy` 等字樣,什麼場景中會需要呢?撰寫多執行緒的 userspace 程式來測試,觀察 Linux 核心模組若沒用到 mutex,到底會發生什麼問題。嘗試撰寫使用 [POSIX Thread](https://en.wikipedia.org/wiki/POSIX_Threads) 的程式碼來確認。
## 作業要求
> [fibdrv](https://hackmd.io/@sysprog/linux2020-fibdrv#-%E4%BD%9C%E6%A5%AD%E8%A6%81%E6%B1%82)
## 大數運算
#### bignum 結構
```c
typedef struct _bignum
{
unsigned long int *value;
unsigned long int length;
int sign;
} bignum;
```
使用 unsigned long int 的陣列儲存大數,並儲存其長度及正負號。
#### bignum_new
```c
bignum *bignum_new(unsigned long int length)
{
bignum *new = (bignum *) kmalloc(sizeof(bignum));
if (!new)
return NULL;
new->value = malloc(length * sizeof(unsigned long int));
new->length = length;
return new;
}
```
宣告時輸入長度並分配其所需空間。
#### bignum_free
```c
void bignum_free(bignum *a)
{
kfree(a->value);
kfree(a);
}
```
#### bignum_swap
```c
void bignum_swap(bignum *a, bignum *b)
{
bignum tmp = *a;
*a = *b;
*b = tmp;
}
```
#### bignum_resize
```c
void bignum_resize(bignum *a, unsigned long int length)
{
a->value = (unsigned long int *) krealloc(a->value, length * sizeof(unsigned long int));
a->length = length;
}
```
當大數需要更多空間儲存資料時,使用 `krealloc` 分配更多空間。
#### bignum_add
```c=
void bignum_add(bignum *a, bignum *b, bignum *ans)
{
unsigned long int max_length = a->length > b->length? a->length : b->length;
bignum *tmp = NULL;
if (ans == a || ans == b) {
tmp = ans;
ans = bignum_new(max_length);
}
if (ans->length < max_length)
bignum_resize(ans, max_length);
if (a->sign == b->sign)
{
ans->sign = a->sign;
unsigned long int i, carry = 0;
for (i = 0; i < max_length; i++)
{
unsigned long int tmp = a->value[i] + carry;
carry = tmp > ~b->value[i] || a->value[i] > tmp;
ans->value[i] += tmp + b->value[i];
}
if (carry)
bignum_resize(ans, max_length + 1);
}
else
{
ans->sign = 0;
int i;
unsigned long int borrow = 0;
bignum *pos, *neg;
if (a->sign)
{
pos = b;
neg = a;
}
else
{
pos = a;
neg = b;
}
for (i = 0; i < max_length; i++)
{
unsigned long int tmp = pos->value[i] - borrow;
borrow = tmp < neg->value[i] || tmp == ULONG_MAX;
ans->value[i] = tmp - neg->value[i];
}
i = max_length - 1;
while (!ans->value[i])
i--;
ans->length = i + 1;
if (borrow)
{
ans->sign = 1;
for (i = max_length - 1; i >= 0; i--)
ans->value[i] = ~ans->value[i];
ans->value[0]++;
}
}
if (tmp) {
bignum_swap(tmp, ans);
bignum_free(ans);
}
}
```
第 8 ~ 17 行是同號數的加法,若是計算完最高位時依然有 carry bit ,則要呼叫 `bignum_resize` 分配更多空間。
第 21 ~ 51 行是異號的減法,減法完成後會重新計算長度,將高位數值為 0 的部份不計,若是最後依然有 borrow bit 則要將整個大數做二補數運算。
#### bignum_mul
```c
void bignum_mul(bignum *a,bignum *b, bignum *ans)
{
unsigned long int i,j;
unsigned long int max_length = a->length + b->length;
if (ans->length < max_length)
bignum_resize(ans, max_length);
bignum *tmp = NULL;
if (ans == a || ans == b)
{
tmp = ans;
ans = bignum_new(max_length);
}
for (i = 0; i < a->length; i++)
{
int a_clz = __builtin_clzl(a->value[i]);
unsigned long int carry = 0;
unsigned long int rem = 0;
for (j = 0; j < b->length; j++)
{
int b_clz = __builtin_clzl(b->value[j]);
carry = ((a->value[i] >> b_clz) * (b->value[j] >> a_clz)) >> (128 - a_clz - b_clz);
rem = a->value[i] * b->value[j];
bignum_do_mul_add(ans, i+j+1, carry);
bignum_do_mul_add(ans, i+j, rem);
}
}
ans->sign = a->sign ^ b->sign;
i = max_length - 1;
while (!ans->value[i])
i--;
ans->length = i + 1;
if (tmp)
{
bignum_swap(tmp, ans);
bignum_free(ans);
}
}
```
```c
void bignum_do_mul_add(bignum *ans, unsigned long int length, unsigned long int value)
{
unsigned long int carry = 0;
do
{
ans->value[length] += carry;//unsigned long int tmp = ans->value[length] + carry;
carry = ans->value[length] > ~value || (ans->value[length] == 0 && carry);
ans->value[length] += value;
value = 0;
length++;
}
while (carry);
}
```
在計算乘法時,使用 bitwise 操作直接計算 carry 值,但由於計算時是使用 unsigned long int 儲存計算結果,在兩數的 clz 值加起來小於 64 時還是會造成錯誤。
此套算法可以計算出 F(100) 的數值,但若是要再計算更大的數值有可能會發生錯誤。
#### bignum_mul 修正
將計算計算乘法部份的程式碼修正如下:
```c
for (i = 0; i < a->length; i++)
{
for (j = 0; j < b->length; j++)
{
unsigned __int128 result = (unsigned __int128) a->value[i] * b->value[j];
unsigned long int carry = result >> 64;
unsigned long int rem = result & ULONG_MAX;
printf("i = %lu, j = %lu, carry = %lu, rem = %lu\n", i, j, carry, rem);
bignum_do_mul_add(ans, i+j+1, carry);
bignum_do_mul_add(ans, i+j, rem);
}
}
```
改由使用 `unsigned __int128` 來儲存計算過程,即可計算出結果。
#### 乘法的正確性驗證
在與教授 meeting 的過程中被教授提問要如何檢驗自己寫的 `bignum_mul` 是否可以計算出正確結果,目前我想到的方式是去檢驗其是否符合乘法的基本性質:交換律、結合律、分配律。
使用 $xy = x + x(y - 1) = y + y(x - 1)$
## 結果輸出
## Fast Doubling
## sysfs
:::danger
1.如何檢驗大數乘法的正確性
2.閱讀 sysfs 的課程文件及觀看課程影片,紀錄中途遇到的問題
:::