# 2022q1 Homework3 (fibdrv)
contributed by < `happyjack91630` >
## 前期準備
:::spoiler 模組安裝
```shell=
//檢查Linux核心版本
$ uname -r
5.13.0-30-generic
//安裝 linux-headers 套件
$ sudo apt install linux-headers-`uname -r`
Reading package lists... Done
Building dependency tree
Reading state information... Done
linux-headers-5.13.0-30-generic is already the newest version (5.13.0-30.33~20.04.1).
0 upgraded, 0 newly installed, 0 to remove and 95 not upgraded.
//確認 linux-headers 套件已正確安裝於開發環境
$ dpkg -L linux-headers-5.13.0-30-generic | grep "/lib/modules"
/lib/modules
/lib/modules/5.13.0-30-generic
/lib/modules/5.13.0-30-generic/build
//檢驗目前的使用者身份
$ whoami
jack
$ sudo whoami
root
$ sudo apt install util-linux strace gnuplot-nox
$ git clone https://github.com/sysprog21/fibdrv
$ cd fibdrv
$ make check
make -C /lib/modules/5.13.0-30-generic/build M=/home/jack/fibdrv modules
make[1]: Entering directory '/usr/src/linux-headers-5.13.0-30-generic'
make[1]: Leaving directory '/usr/src/linux-headers-5.13.0-30-generic'
make unload
make[1]: Entering directory '/home/jack/fibdrv'
sudo rmmod fibdrv || true >/dev/null
rmmod: ERROR: Module fibdrv is not currently loaded
make[1]: Leaving directory '/home/jack/fibdrv'
make load
make[1]: Entering directory '/home/jack/fibdrv'
sudo insmod fibdrv.ko
make[1]: Leaving directory '/home/jack/fibdrv'
sudo ./client > out
make unload
make[1]: Entering directory '/home/jack/fibdrv'
sudo rmmod fibdrv || true >/dev/null
make[1]: Leaving directory '/home/jack/fibdrv'
Passed [-]
f(93) fail
input: 7540113804746346429
expected: 12200160415121876738
//觀察產生的 fibdrv.ko 核心模組
$ modinfo fibdrv.ko
filename: /home/jack/fibdrv/fibdrv.ko
version: 0.1
description: Fibonacci engine driver
author: National Cheng Kung University, Taiwan
license: Dual MIT/GPL
srcversion: 2E292D54A69BD2B4FF86019
depends:
retpoline: Y
name: fibdrv
vermagic: 5.13.0-30-generic SMP mod_unload modversions
//insmod命令-->install module的縮寫,用来載入魔塊
$ sudo insmod fibdrv.ko
$ ls -l /dev/fibonacci
crw------- 1 root root 511, 0 三 8 18:12 /dev/fibonacci
511:0
$ cat /sys/module/fibdrv/version
0.1
$ lsmod | grep fibdrv
fibdrv 16384 0
$ cat /sys/module/fibdrv/refcnt
0
```
:::
:::spoiler 排除干擾效能分析
#### 限定 CPU 給特定的程式使用
- 以 `sudo` 管理者權限更改 `/etc/default/grub` 檔案。修改 `GRUB_CMDLINE_LINUX_DEFAULT` 參數,新增 `"isolcpus=x(,x...)"` 指定 cpu 核心只給特定程式使用。
```shell
$ sudo vim /etc/default/grub
...
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=7"
...
```
- 更新grub(也要使用sudo權限)
> 參考 [KYG-yaya573142](https://hackmd.io/@KYWeng/rkGdultSU#%E6%8E%92%E9%99%A4%E5%B9%B2%E6%93%BE%E6%95%88%E8%83%BD%E5%88%86%E6%9E%90%E7%9A%84%E5%9B%A0%E7%B4%A0)的做法都沒辦法將特定cpu獨立出來,加了這行`sudo update-grub`就可以了。
```shell
$ sudo update-grub
```
- 重新開機後檢查行程的 CPU affinity:
```shell
$ taskset -cp 1
pid 1's current affinity list: 0-6
$ cat /sys/devices/system/cpu/isolated
7
```
- 將程式固定在特定的 CPU 中執行
```shell
$ sudo taskset -c 7 ./client
```
- 抑制 address space layout randomization (ASLR)
```shell
$ sudo sh -c "echo 0 > /proc/sys/kernel/randomize_va_space"
```
- 設定 scaling_governor 為 performance
```shell
$ vim performance.sh
for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
do
echo performance > ${i}
done
$ sudo sh performance.sh
```
- 關閉 Intel 處理器的 turbo mode
```shell
$ sudo sh -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo"
```
:::
## kernel space 與 user space 傳遞時間
#### kernel space
使用`ktime`相關的API來做計算,計算時間特別獨立出來寫成`fib_time_proxy`這個函式,這樣就不用一直重複寫計算時間的程式碼了。那由於之後可能也要計算不同fibonacci實作方法的時間,所以這個函式第二個參數是帶入function pointer,以便切換不同的fibonacci的實做函式。
```c=
static long long fib_time_proxy(long long k, long long (*func_ptr)(long long))
{
kt = ktime_get();
long long result = func_ptr(k);
kt = ktime_sub(ktime_get(), kt);
return result;
}
```
#### user space and system call time
使用`clock_gettime`,用來在user space的地方計算時間。
並將kernel time - user time來取得system call所花費的時間。
```c=
//time[0] record kernel time
//time[1] record user time
//time[2] record system call time
clock_gettime(CLOCK_MONOTONIC, &t1);
kernel_time = write(fd, write_buf, 0);
time[0][s] = kernel_time;
mean[0] += time[0][s];
clock_gettime(CLOCK_MONOTONIC, &t2);
double user_time = (double) (t2.tv_sec * 1e9 + t2.tv_nsec) -
(t1.tv_sec * 1e9 + t1.tv_nsec); //ns
time[1][s] = user_time;
mean[1] += time[1][s];
time[2][s] = user_time - kernel_time;
mean[2] += time[2][s];
```
![](https://i.imgur.com/oizg1vl.png)
> 上面圖片利用了常態分佈的特性將兩個標準差以外的數據都排除掉,用來去除極端值
## 費氏數列實作
### iterative
```c=
static long long fib_sequence(long long k)
{
/* FIXME: C99 variable-length array (VLA) is not allowed in Linux kernel. */
long long f[k + 2];
f[0] = 0;
f[1] = 1;
for (int i = 2; i <= k; i++) {
f[i] = f[i - 1] + f[i - 2];
}
return f[k];
}
```
一開始iterative的版本,在註解上有個警告`/* FIXME: C99 variable-length array (VLA) is not allowed in Linux kernel. */`,那是linux kernel規定宣告array必須有**固定的長度**,不能動態變化。換個角度想,可以用malloc來動態宣告陣列,但是kernel module裡不能使用`<stdlib.h>`的函式庫([參考](https://stackoverflow.com/questions/15662480/error-stdio-h-no-such-file-or-directory-error-during-make))。
```c=
static long long fib_sequence(long long k)
{
if (k < 2)
return 0;
long long a = 0, b = 1;
for (int i = 2; i <= k; i++) {
long long c = a + b;
a = b;
b = c;
}
return b;
}
```
後來,就將iterative的程式碼改成不用陣列來儲存計算出來的值。改用兩個變數(a,b)來儲存後續要計算的值就行了。
### fast doubling
講義中[費氏數列](https://hackmd.io/PMozqaxqRA-jF2KmgTZcPQ?view#-%E8%B2%BB%E6%B0%8F%E6%95%B8%E5%88%97)推導了一種 `fast doubling` 的演算法:
$$
\begin{split}
F(2k) &= F(k)[2F(k+1) - F(k)] \\
F(2k+1) &= F(k+1)^2+F(k)^2
\end{split}
$$
fast doubling的實作方法則是參考了[網誌](https://chunminchang.github.io/blog/post/calculating-fibonacci-numbers-by-fast-doubling)的作法。
```c=
static long long fib_doubling(long long k)
{
unsigned int h = 0;
for (unsigned int i = k; i; ++h, i >>= 1)
;
long long a = 0; // F(0) = 0
long long b = 1; // F(1) = 1
// There is only one `1` in the bits of `mask`. The `1`'s position is same
// as the highest bit of n(mask = 2^(h-1) at first), and it will be shifted
// right iteratively to do `AND` operation with `n` to check `n_j` is odd or
// even, where n_j is defined below.
for (unsigned int mask = 1 << (h - 1); mask; mask >>= 1) { // Run h times!
// Let j = h-i (looping from i = 1 to i = h), n_j = floor(n / 2^j) = n
// >> j (n_j = n when j = 0), k = floor(n_j / 2), then a = F(k), b =
// F(k+1) now.
long long c = a * (2 * b - a); // F(2k) = F(k) * [ 2 * F(k+1) – F(k) ]
long long d = a * a + b * b; // F(2k+1) = F(k)^2 + F(k+1)^2
if (mask & k) { // n_j is odd: k = (n_j-1)/2 => n_j = 2k + 1
a = d; // F(n_j) = F(2k + 1)
b = c + d; // F(n_j + 1) = F(2k + 2) = F(2k) + F(2k + 1)
} else { // n_j is even: k = n_j/2 => n_j = 2k
a = c; // F(n_j) = F(2k)
b = d; // F(n_j + 1) = F(2k + 1)
}
}
return a;
}
```
在kernel space量測個別的計算時間。
```c=
static ssize_t fib_write(struct file *file,
const char *buf,
size_t mode,
loff_t *offset)
{
/*
mode 0 : 將fib_sequence的function pointer帶入fib_time_proxy,
用來計算fib_sequence的執行時間
mode 1 : 將fib_doubling的function pointer帶入fib_time_proxy,
用來計算fib_doubling的執行時間
*/
switch (mode) {
case 0:
sequence_result = fib_time_proxy(*offset, fib_sequence);
printk("%lld", kt);
break;
case 1:
doubling_result = fib_time_proxy(*offset, fib_doubling);
printk("%lld", kt);
break;
case 2:
return 1;
}
return (ssize_t) ktime_to_ns(kt);
}
```
![](https://i.imgur.com/RZsiCDL.png)
> fib_seq 是使用iterative,複雜度為O(N)。
> fib_doubling 則是使用fast doubling的做法,複雜度為O(log n)
### overflow
雖然上述方法,都是正確計算fibonacci的程式碼,但是在f(92)以後的數值會出錯,是因為`long long` overflow的問題,可以參考[F(92) 以後的數值錯誤的原因https://leetcode.jp/problems.php](https://hackmd.io/BZjwYw1FQ1O0NPY5kIS7VQ?view#F92-%E4%BB%A5%E5%BE%8C%E7%9A%84%E6%95%B8%E5%80%BC%E9%8C%AF%E8%AA%A4%E7%9A%84%E5%8E%9F%E5%9B%A0)。
## 大數運算(iterative)
將大數(超過long long的數),轉成char array,在進行數學運算(這個部分先做加法的大數運算)。
### step 1.將大數轉成char array
```c=
static void int2char(long long a, long long b, char *a_s, char *b_s)
{
int reminder;
int last_pos_a = max_space - 1;
int last_pos_b = max_space - 1;
while (a > 0) {
reminder = a % 10;
a_s[last_pos_a--] = reminder + '0';
a = a / 10;
}
while (b > 0) {
reminder = b % 10;
b_s[last_pos_b--] = reminder + '0';
b = b / 10;
}
}
```
![](https://i.imgur.com/s1XFeny.png)
>上述圖片,解釋將一個大數轉成相對應的char array。
### step 2.將兩個char array做相加
```c=
static void sum(char *a, char *b, char *c)
{
int last_pos = max_space - 1;
int carry = 0;
int i = max_space - 1, j = max_space - 1;
for (; j >= 0; i--, j--) {
int sum_v;
sum_v = (a[i] - '0') + (b[j] - '0') + carry;
carry = sum_v / 10;
sum_v %= 10;
c[last_pos--] = sum_v + '0';
}
}
```
![](https://i.imgur.com/yVhDPxf.png)
>上述圖片,解釋將兩個char array相加的流程。
### step 3. 將上述大數方法,融入到fib_sequence
```c=
static void fib_big_num(int n, char *r)
{
if (n == 0) {
*r = n + '0';
r++;
*r = '\0';
return;
}
if (n == 1 || n == 2) {
*r = '1';
r++;
*r = '\0';
return;
}
long long a = 1;
long long b = 1;
char a_s[max_space];
char b_s[max_space];
char ans[max_space];
for (int i = 0; i < max_space; i++) {
a_s[i] = '0';
b_s[i] = '0';
ans[i] = '0';
}
int2char(a, b, a_s, b_s);
for (int i = 3; i <= n; i++) {
sum(a_s, b_s, ans);
for (int j = 0; j < max_space; j++) {
a_s[j] = b_s[j];
b_s[j] = ans[j];
}
}
int flag = 0;
for (int i = 0; i < max_space; i++) {
if (ans[i] == '0' && flag == 0) {
continue;
} else {
flag = 1;
*r++ = ans[i];
}
}
*r = '\0';
}
```
目前上述方法可以將long long overflow的問題解決,正確算出f(92)以後fibonacci的數字。
![](https://i.imgur.com/2meIMmy.png)
>要算fibonacci之前,要記得把`fibdrv.c`裡的`#define MAX_LENGTH` 調高才行,否則,`fib_device_lessk`算new_pos會算錯。
![](https://i.imgur.com/PibMa2F.png)
> 隨然上述方法能算出正確的答案,但是非常沒有效率。第一個原因是因為此大數運算是運用在本來就比較慢的fib_sequence上。第二個原因是此大數運算本身的實作方式蠻浪費時間的。舉例來說,將兩個char array 相加的過程的那個for迴圈,把一堆沒必要的'0' + '0'也做相加了,所以非常浪費時間。