---
tags: linux kernel, linux2022
---
# 2022q1 Homework3 (fibdrv)
contributed by < [`linjohnss`](https://github.com/linjohnss) >
> [作業要求](https://hackmd.io/PMozqaxqRA-jF2KmgTZcPQ?view#K04-fibdrv)
### 自我檢查清單
- [ ] 研讀上述 ==Linux 效能分析的提示== 描述,在自己的實體電腦運作 GNU/Linux,做好必要的設定和準備工作 $\to$ 從中也該理解為何不希望在虛擬機器中進行實驗;
- [ ] 研讀上述費氏數列相關材料 (包含論文),摘錄關鍵手法,並思考 [clz / ctz](https://en.wikipedia.org/wiki/Find_first_set) 一類的指令對 Fibonacci 數運算的幫助。請列出關鍵程式碼並解說
- [ ] 複習 C 語言 [數值系統](https://hackmd.io/@sysprog/c-numerics) 和 [bitwise operation](https://hackmd.io/@sysprog/c-bitwise),思考 Fibonacci 數快速計算演算法的實作中如何減少乘法運算的成本;
- [ ] 研讀 [KYG-yaya573142](https://hackmd.io/@KYWeng/rkGdultSU) 的報告,指出針對大數運算,有哪些加速運算和縮減記憶體操作成本的舉措?
- [ ] `lsmod` 的輸出結果有一欄名為 `Used by`,這是 “each module’s use count and a list of referring modules”,但如何實作出來呢?模組間的相依性和實際使用次數 ([reference counting](https://en.wikipedia.org/wiki/Reference_counting)) 在 Linux 核心如何追蹤呢?
> 搭配閱讀 [The Linux driver implementer’s API guide » Driver Basics](https://www.kernel.org/doc/html/latest/driver-api/basics.html)
- [ ] 注意到 `fibdrv.c` 存在著 `DEFINE_MUTEX`, `mutex_trylock`, `mutex_init`, `mutex_unlock`, `mutex_destroy` 等字樣,什麼場景中會需要呢?撰寫多執行緒的 userspace 程式來測試,觀察 Linux 核心模組若沒用到 mutex,到底會發生什麼問題。嘗試撰寫使用 [POSIX Thread](https://en.wikipedia.org/wiki/POSIX_Threads) 的程式碼來確認。
## 開發環境
### 實驗環境
```shell
$ gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 39 bits physical, 48 bits virtual
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 142
Model name: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
Stepping: 10
CPU MHz: 1800.000
CPU max MHz: 3400.0000
CPU min MHz: 400.0000
BogoMIPS: 3600.00
Virtualization: VT-x
L1d cache: 128 KiB
L1i cache: 128 KiB
L2 cache: 1 MiB
L3 cache: 6 MiB
NUMA node0 CPU(s): 0-7
$ uname -r
5.13.0-30-generic
```
### 撰寫 Linux 核心模組的前期準備
:::info
自從 Linux 核心 4.4 版以來,Ubuntu Linux 預設開啟 EFI_SECURE_BOOT_SIG_ENFORCE,這使得核心模組需要適度的簽章才可掛載進入 Linux 核心,為了後續測試的便利,我們需要將 UEFI Secure Boot 的功能關閉,請見 Why do I get “Required key not available” when install 3rd party kernel modules or after a kernel upgrade?
:::
- 查看是否有啟用 secureboot
```shell
$ dmesg|grep -i secure
[ 0.000000] secureboot: Secure boot enabled
```
- [Disable Secure Boot](https://askubuntu.com/questions/762254/why-do-i-get-required-key-not-available-when-install-3rd-party-kernel-modules)
```shell
$ sudo apt install mokutil
$ sudo mokutil --disable-validation
```
- 再次確認是否有啟用 secureboot
```shell
$ dmesg|grep -i secure
[ 0.000000] secureboot: Secure boot disabled
```
- 安裝 `linux-headers` 套件
```shell
$ sudo apt install linux-headers-`uname -r`
```
- 確認 `linux-headers` 套件已安裝
```shell
$ dpkg -L linux-headers-5.13.0-30-generic | grep "/lib/modules"
/lib/modules
/lib/modules/5.13.0-30-generic
/lib/modules/5.13.0-30-generic/build
```
- 確認目前使用者身份,不可為 root
```shell
$ whoami
```
- 安裝後續會用得到的工具
```shell
$ sudo apt install util-linux strace gnuplot-nox
```
- fork fibdrv 並 clone 到本地端
```shell
$ git clone git@github.com:linjohnss/fibdrv.git
$ cd fibdrv
```
- 編譯並測試
```shell
$ make check
```
- 給定的 fibdrv 存在缺陷
```shell
Passed [-]
f(93) fail
input: 7540113804746346429
expected: 12200160415121876738
```
## 排除干擾效能分析的因素
- 限定 CPU 給特定的程式使用
修改 /etc/default/grub,修改完成後 `sudo update-grub` 來更新設定。
```
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=7"
```
- 查看行程的 CPU affinity
```shell
$ taskset -p 1
pid 1's current affinity mask: 7f
$ taskset -cp 1
pid 1's current affinity list: 0-6
```
- 將行程固定在特定的 CPU 中執行
```shell
$ sudo taskset -c 7 ./client
$ sudo taskset -c 7 ./client_plot > plot_input
```
- 抑制 address space layout randomization (ASLR)
```shell
$ sudo sh -c "echo 0 > /proc/sys/kernel/randomize_va_space"
```
- 設定 scaling_governor 為 performance
```shell
$ cat performance.sh
for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
do
echo performance > ${i}
done
$ sudo sh performance.sh
```
- 針對 Intel 處理器,關閉 turbo mode
```shell
$ sudo sh -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo"
```
- IRQ affinity
參考 [KYG-yaya573142](https://hackmd.io/@KYWeng/rkGdultSU#%E6%8E%92%E9%99%A4%E5%B9%B2%E6%93%BE%E6%95%88%E8%83%BD%E5%88%86%E6%9E%90%E7%9A%84%E5%9B%A0%E7%B4%A0) 的作法
```bash
for file in `find /proc/irq -name "smp_affinity"`
do
var=0x`cat ${file}`
var="$(( $var & 0x7f ))"
var=`printf '%.2x' ${var}`
sudo bash -c "echo ${var} > ${file}"
done
sudo bash -c "echo 7f > /proc/irq/default_smp_affinity"
```
- 將上述設定寫成 script,方便後續實做
```c
$ sudo sh setup.sh
```
## 核心模式的時間測量
### Kernel Space
先根據作業提示用 ktime API 實做 kernel space 的時間量測
```c
static ktime_t kt;
static long long fib_time_proxy(long long k)
{
kt = ktime_get();
long long result = fib_sequence(k);
kt = ktime_sub(ktime_get(), kt);
return result;
}
static ssize_t fib_read(struct file *file,
char *buf,
size_t size,
loff_t *offset)
{
return (ssize_t) fib_time_proxy(*offset);
}
static ssize_t fib_write(struct file *file,
const char *buf,
size_t size,
loff_t *offset)
{
return ktime_to_ns(kt);
}
```
- gnuplot
```shell
$ gnuplot scripts/plot.gp
```
### User Space
```c
struct timespec t1, t2;
clock_gettime(CLOCK_MONOTONIC, &t1);
...
long long user_time = (long long)(t2.tv_sec * 1e9 + t2.tv_nsec)
- (t1.tv_sec * 1e9 + t1.tv_nsec);
```
## iterative
```c
static long long fib_sequence(long long k)
{
/* FIXME: C99 variable-length array (VLA) is not allowed in Linux kernel. */
long long f[k + 2];
f[0] = 0;
f[1] = 1;
for (int i = 2; i <= k; i++) {
f[i] = f[i - 1] + f[i - 2];
}
return f[k];
}
```
![](https://i.imgur.com/ytCoLxj.png)
## FIXME: C99 variable-length array (VLA) is not allowed in Linux kernel.
原先為 VLA 形式的宣告。
```c
static long long fib_sequence(long long k)
{
/* FIXME: C99 variable-length array (VLA) is not allowed in Linux kernel. */
long long f[k + 2];
f[0] = 0;
f[1] = 1;
for (int i = 2; i <= k; i++) {
f[i] = f[i - 1] + f[i - 2];
}
return f[k];
}
```
根據費氏數列定義,可以發現只須利用 3 個變數來實做,不須紀錄之前的數列。
```c
f[0] = 0;
f[1] = 1;
f[i] = f[i - 1] + f[i - 2];
```
```c
static long long fib_sequence(long long k)
{
if(k < 2)
return k;
long long f[2];
f[0] = 0;
f[1] = 1;
for (int i = 2; i <= k; i++) {
f[2] = f[0] + f[1];
f[0] = f[1];
f[1] = f[2];
}
return f[2];
}
```
## fast doubling
在作業要求中提及了一個更快的演算法 [fast doubling](https://hackmd.io/PMozqaxqRA-jF2KmgTZcPQ?view#-%E8%B2%BB%E6%B0%8F%E6%95%B8%E5%88%97),我們可以用 `F(0) = 0, F(1) = 1, F(2) = 1` 推導出隨後的數值。
$$
\begin{split}
F(2k) &= F(k)[2F(k+1) - F(k)] \\
F(2k+1) &= F(k+1)^2+F(k)^2
\end{split}
$$
### [參考實作](https://chunminchang.github.io/blog/post/calculating-fibonacci-numbers-by-fast-doubling)
```c
static long long fib_fast_doubling(long long k)
{
long long h = 0;
for (long long i = k; i; ++h, i >>= 1)
;
long long a = 0;
long long b = 1;
for (unsigned int mask = 1 << (h - 1); mask; mask >>= 1) {
long long c = a * (2 * b - a);
long long d = a * a + b * b;
if (mask & k) {
a = d;
b = c + d;
} else {
a = c;
b = d;
}
}
return a;
}
```
### 與 interative 比較
```shell
$ sudo taskset -c 7 ./client_plot 1 > plot_input
$ gnuplot scripts/plot_compare.gp
```
從下圖可以看出 fast doubling 整體效率會比 interative 還要好。
![](https://i.imgur.com/S7WylCw.png)
### 考慮到硬體加速 $F(n)$ 的手法
前面的 `h` 計算我們可以改用 clz 處理。
```diff
static long long fib_fast_doubling(long long k)
{
+ long long h = 64 - __builtin_clzll(k);
- long long h = 0;
- for (long long i = k; i; ++h, i >>= 1)
- ;
long long a = 0;
long long b = 1;
for (unsigned int mask = 1 << (h - 1); mask; mask >>= 1) {
long long c = a * (2 * b - a);
long long d = a * a + b * b;
if (mask & k) {
a = d;
b = c + d;
} else {
a = c;
b = d;
}
}
return a;
}
```
```shell
$ sudo taskset -c 7 ./client_plot 2 > plot_input
$ gnuplot scripts/plot_compare.gp
```
從下圖可以看到使用 clz 加速的 fast doubling 算法所需的時間快一些。
![](https://i.imgur.com/2D5XLTE.png)
## 計算 $F(93)$ (包含) 之後的 Fibonacci 數
原先的演算法在 $F(93)$ 都會產生 Overflow 的問題,因此我們可以用 big number 來嘗試解決問題。
### Big Number
結構的部份,利用兩個 `unsigned long long` 來處理 F~93~ 之後 Overflow 的問題。
```c
struct BigN {
unsigned long long lower, upper;
};
```
參考作業提示的方法實做 `BigN` 加法。
```c
static inline void addBigN(struct BigN *output, struct BigN x, struct BigN y)
{
output->upper = x.upper + y.upper;
if (y.lower > ~x.lower)
output->upper++;
output->lower = x.lower + y.lower;
}
```
參考 [KYG-yaya573142](https://hackmd.io/@KYWeng/rkGdultSU#bn-%E8%B3%87%E6%96%99%E7%B5%90%E6%A7%8B) 的作法進行改寫,將 `BigN` 轉成 `string`。
```c
static char *BigNtoDec(struct BigN target)
{
// log10(x) = log2(x) / log2(10) ~= log2(x) / 3.322
size_t size = sizeof(struct BigN) * 8 / 3 + 2;
char *str = (char *) kmalloc(size + 1, GFP_KERNEL);
memset(str, '0', size);
str[size] = '\0';
int carry;
for (unsigned long long i = 1ULL << 63; i; i >>= 1) {
carry = (target.upper & i) > 0;
for (int j = size - 1; j >= 0; j--) {
str[j] += ((str[j] - '0') + carry);
carry = str[j] > '9';
if (carry)
str[j] -= 10;
}
}
for (unsigned long long i = 1ULL << 63; i; i >>= 1) {
carry = (target.lower & i) > 0;
for (int j = size - 1; j >= 0; j--) {
str[j] += ((str[j] - '0') + carry);
carry = str[j] > '9';
if (carry)
str[j] -= 10;
}
}
char *ptr = str;
for (; *ptr == '0' && ptr != &str[size - 1]; ptr++)
;
return ptr;
}
```
### BigN iterative
將原先的 iterative 作法,`long long` 用 `BigN` 來代替,並且加法用 `addBigN` 來處理。
```c
static struct BigN fib_iterative_bign(long long k)
{
struct BigN f[3];
f[0].lower = 0;
f[0].upper = 0;
f[1].lower = 1;
f[1].upper = 0;
if (k < 2)
return f[k];
for (int i = 2; i <= k; i++) {
addBigN(&f[2], f[0], f[1]);
f[0] = f[1];
f[1] = f[2];
}
return f[2];
}
```
實測到 F~186~ 都是正確的
```shell
Reading from /dev/fibonacci at offset 184, returned the sequence 127127879743834334146972278486287885163.
Reading from /dev/fibonacci at offset 185, returned the sequence 205697230343233228174223751303346572685.
Reading from /dev/fibonacci at offset 186, returned the sequence 332825110087067562321196029789634457848.
```
### 改為 string 表示 big number
```c
typedef struct {
char num[MAX_DIGITS]; /* represent the number */
int digit; /* index of high-order digit */
} BigN;
```
Big number 結構改為一個 string 與一個 interger,利用 string 紀錄每個位數的值,而 digit 負責紀錄目前的位數。
```c
static void init_BigN(BigN *n, long long input)
{
for (int i = 0; i < MAX_DIGITS; i++)
n->num[i] = (char) 0;
n->digit = -1;
if (input == 0)
n->digit = 0;
for (; input > 0; input /= 10) {
n->digit++;
n->num[n->digit] = (input % 10);
}
}
```
init_BigN 會先初始化 BigN,使 BigN 的 string 全為 `(char) 0`、`digits = -1`。
接著根據 `input` 的數字轉成 char 給 big number。
```c
static void add_BigN(BigN *a, BigN *b, BigN *c)
{
int carry = 0;
init_BigN(c, 0);
c->digit = a->digit > b->digit ? a->digit + 1 : b->digit + 1;
for (int i = 0; i <= c->digit; i++) {
c->num[i] = (char) (carry + a->num[i] + b->num[i]) % 10;
carry = (carry + a->num[i] + b->num[i]) / 10;
}
if (c->num[c->digit] == 0)
c->digit--;
}
```
將兩個 BigN 相加,並將成果存在另一個 BigN。
1. 初始化 `c` 為 0,接著判斷 `a`, `b` 誰的位數較高,用來配置 `c` 的位數
2. for 迴圈對每個位數相加,並查看是否需進位
3. 最後的判斷最高位數是否進位
```c
static char *result_BigN(BigN *result)
{
size_t s = result->digit + 1;
char *str = kmalloc(MAX_DIGITS, GFP_KERNEL);
memset(str, '0', s);
str[s] = '\0';
for (int i = s - 1, j = 0; i > -1; i--, j++) {
str[i] += result->num[j];
}
return str;
}
```
將結果字串轉成 `0~9` 的字串
```shell
Reading from /dev/fibonacci at offset 227, returned the sequence 123227981463641240980692501505442003148737643593.
Reading from /dev/fibonacci at offset 228, returned the sequence 199387062373213542599493807777207997205533596336.
Reading from /dev/fibonacci at offset 229, returned the sequence 322615043836854783580186309282650000354271239929.
Reading from /dev/fibonacci at offset 230, returned the sequence 522002106210068326179680117059857997559804836265.
Reading from /dev/fibonacci at offset 231, returned the sequence 844617150046923109759866426342507997914076076194.
```
實測到 231 都是對的,超過會發生 Segmentation fault