---
tags: 2023 linux kernel
---
# 2023q1 Homework3 (fibdrv)
contributed by < [`Risheng1128`](https://github.com/Risheng1128) >
> [作業要求](https://hackmd.io/@sysprog/linux2023-fibdrv)
> [作業區](https://hackmd.io/@sysprog/linux2023-homework3)
## 實驗環境
```shell
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Vendor ID: GenuineIntel
Model name: 11th Gen Intel(R) Core(TM) i5-11400H @ 2.70GHz
CPU family: 6
Model: 141
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 1
Stepping: 1
CPU max MHz: 4500.0000
CPU min MHz: 800.0000
BogoMIPS: 5376.00
Virtualization features:
Virtualization: VT-x
Caches (sum of all):
L1d: 288 KiB (6 instances)
L1i: 192 KiB (6 instances)
L2: 7.5 MiB (6 instances)
L3: 12 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-11
$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
```
## 研讀 Linux 效能分析的提示
參考 [Linux 效能分析的提示](https://hackmd.io/@sysprog/linux2023-fibdrv/%2F%40sysprog%2Flinux2023-fibdrv-c#-Linux-%E6%95%88%E8%83%BD%E5%88%86%E6%9E%90%E7%9A%84%E6%8F%90%E7%A4%BA) 為了分析效能而做的措施
首先保留特定的處理器,以便我們執行程式時有空閒的處理器可以直接使用,輸入命令 `sudo vim /etc/default/grub` 開啟檔案,並新增以下的內容
```
GRUB_CMDLINE_LINUX="isolcpus=0"
```
接著輸入命令 `sudo update-grub` ,以下為預期輸出
```shell
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/init-select.cfg'
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.19.0-35-generic
Found initrd image: /boot/initrd.img-5.19.0-35-generic
Found linux image: /boot/vmlinuz-5.19.0-32-generic
Found initrd image: /boot/initrd.img-5.19.0-32-generic
Memtest86+ needs a 16-bit boot, that is not available on EFI, exiting
Warning: os-prober will be executed to detect other bootable partitions.
Its output will be used to detect bootable binaries on them and create new boot entries.
Found Windows Boot Manager on /dev/nvme1n1p1@/EFI/Microsoft/Boot/bootmgfw.efi
Adding boot menu entry for UEFI Firmware Settings ...
done
```
重新開機後,輸入命令 `taskset -p 1` 查看 PID 1 可使用的處理器編號
```shell
$ taskset -p 1
pid 1's current affinity mask: ffe
```
可以發現編號 0 已經被保留了
接著抑制 [ASLR](https://en.wikipedia.org/wiki/Address_space_layout_randomization)
```shell
$ sudo sh -c "echo 0 > /proc/sys/kernel/randomize_va_space"
```
設定 scaling_governor ,執行以下的 shell script (使用命令 `sudo sh filename.sh`)
```
for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
do
echo performance > ${i}
done
```
針對 Intel 處理器,關閉 turbo mode
```shell
$ sudo sh -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo"
```
## 修正 Fibonacci 數計算並改善 fibdrv 核心模組的計算效率
首先在編譯核心模組時,發生了以下警告
```
warning: the compiler differs from the one used to build the kernel
The kernel was built by: x86_64-linux-gnu-gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
You are using: gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
```
產生原因是建立核心和模組的編譯器不同,這裡修改 Makefile 的內容,指定 `x86_64-linux-gnu-gcc` 編譯核心模組
```diff
- $(MAKE) -C $(KDIR) M=$(PWD) modules
+ $(MAKE) -C $(KDIR) M=$(PWD) CC=x86_64-linux-gnu-gcc modules
```
這裡做了一個修改,主要是讓 fibdrv 可以主動偵測編譯 kernel 的編譯器,完整修改可見 [commit](https://github.com/Risheng1128/fibdrv/commit/c8f4f071e9880396c7709efa9ace7e011a701294)
### 使用 Fast Doubling 計算 Fibonacci 數
在原本的 fibdrv 中,是使用 repeated addition 的方法計算 Fibonacci 數,可見論文 [Algorithms for Computing Fibonacci Numbers Quickly](https://ir.library.oregonstate.edu/downloads/t435gg51w) ,額外的分析放在 [研讀 Algorithms for Computing Fibonacci Numbers Quickly](https://hackmd.io/@Risheng/rkdoM1YWq)
參考作業說明 [加速 Fibonacci 運算](https://hackmd.io/@sysprog/linux2023-fibdrv/%2F%40sysprog%2Flinux2023-fibdrv-d) 的部份,裡頭提到
1. iterative 方法的時間複雜度為 O(n)
2. fast doubling 的時間複雜度降為 O(logn)
而 fibdrv 原始的實作則是使用迭代的方式計算,因此這裡使用 [fast doubling](https://www.nayuki.io/page/fast-fibonacci-algorithms) 的方法改進計算 Fibonacci 數的效能
透過 fast doubling 可以得知,目標的 Fibonacci 數可由以下兩式計算取得
1. $F(2k) = F(k)[2F(k+1) - F(k)]$
2. $F(2k+1) = F(k+1)^2 + F(k)^2$
因此,假設 $F(0) = 0$ 且 $F(1) = 1$ ,可以開始往下計算
當 $k = 1$ 時
1. $F(2) = F(1)[2F(2) - F(1)]$ → $F(2) = 1$
2. $F(3) = F(2)^2 + F(1)^2$ → $F(3) = 2$
以此類推
接著參考 [Bottom-up Fast Doubling](https://hackmd.io/@sysprog/linux2023-fibdrv/%2F%40sysprog%2Flinux2023-fibdrv-d#Bottom-up-Fast-Doubling) 的邏輯實作 fast doubling
首先計算變數 `k` 使用的位元數
```c
int count = 63 - __builtin_clzll(k);
long long fn0 = 1, fn1 = 1;
```
接著在迴圈內根據 fast doubling 的公式計算出新的 fibonacci 數
```c
/* f(2n) = f(n) * (2f(n + 1) - f(n)) */
long long f2n0 = fn0 * ((fn1 << 1) - fn0);
/* f(2n + 1) = f(n + 1) ^ 2 + f(n) ^ 2 */
long long f2n1 = fn1 * fn1 + fn0 * fn0;
```
最後更新變數,用來計算下一個 fibonacci 數
```c
/* update f(n) and f(n + 1)
* if mask = -1, then fn0 = f2n1 and fn1 = f2n0 + f2n1
* if mask = 0, then fn0 = f2n0 and fn1 = f2n1
*/
long long mask = -!!(k & (1UL << count));
fn0 = (f2n1 & mask) | (f2n0 & ~mask);
fn1 = ((f2n0 + f2n1) & mask) | (f2n1 & ~mask);
```
根據這樣的方法,一樣可以計算 fibonacci 數,一樣算到第 92 個數都是正確的
```shell
Reading from /dev/fibonacci at offset 92, returned the sequence 7540113804746346429.
```
以上的修改可見 [commit](https://github.com/Risheng1128/fibdrv/commit/0eec6cb514e82b31ecdc906e779af25bd314eae3)
### 使用 ktime 量測時間
參考[核心模式的時間測量](https://hackmd.io/@sysprog/linux2023-fibdrv/%2F%40sysprog%2Flinux2023-fibdrv-c#%E6%A0%B8%E5%BF%83%E6%A8%A1%E5%BC%8F%E7%9A%84%E6%99%82%E9%96%93%E6%B8%AC%E9%87%8F)裡頭提到的 ktime 來量測計算 fibonacci 數所需的時間
首先宣告一個型態為 `ktime_t` 的全域變數
```c
static ktime_t fib_time;
```
接著修改函式 `fib_read` 的內容,使其執行時會量測計算 fibonacci 數的時間,如下所示
```c
/* calculate the fibonacci number at given offset */
static ssize_t fib_read(struct file *file,
char *buf,
size_t size,
loff_t *offset)
{
fib_time = ktime_get();
ssize_t result = fib_sequence(*offset);
fib_time = ktime_sub(ktime_get(), fib_time);
return result;
}
```
接著修改 `fib_write` 的功能,使其回傳最近一次計算 fibonacci 的時間
```c
/* write operation is skipped */
static ssize_t fib_write(struct file *file,
const char *buf,
size_t size,
loff_t *offset)
{
return ktime_to_ns(fib_time);
}
```
以上的完整修改可見 [commit](https://github.com/Risheng1128/fibdrv/commit/9e8f402725eb122e8b8acaf5803766c74c36de72)
可以量測時間後,可以接著比較使用 fast doubling 前後的差異,如下所示
![](https://i.imgur.com/bgmzVn7.png)
很明顯使用 fast doubling 後,效能明顯提升許多
### 實作大數運算