owned this note
owned this note
Published
Linked with GitHub
# 2023q1 Homework3 (fibdrv)
contributed by < [slipet](https://github.com/slipet/fibdrv) >
<details>
<summary><font color=darkred>開發環境</font></summary>
```
$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
$lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 5 5600X 6-Core Processor
CPU family: 25
Model: 33
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 1
Stepping: 0
Frequency boost: enabled
CPU max MHz: 4650.2920
CPU min MHz: 2200.0000
BogoMIPS: 7399.35
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_ts
c rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_
lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb
cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt
xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale v
mcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor
smca fsrm
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 192 KiB (6 instances)
L1i: 192 KiB (6 instances)
L2: 3 MiB (6 instances)
L3: 32 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-11
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Retbleed: Not affected
Spec rstack overflow: Mitigation; safe RET, no microcode
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
Srbds: Not affected
Tsx async abort: Not affected
```
</details>
## Checklist
- [ ] 研讀上述 ==Linux 效能分析的提示== 描述,在自己的實體電腦運作 GNU/Linux,做好必要的設定和準備工作 $\to$ 從中也該理解為何不希望在虛擬機器中進行實驗;
- [ ] 研讀上述費氏數列相關材料 (包含論文),摘錄關鍵手法,並思考 [clz / ctz](https://en.wikipedia.org/wiki/Find_first_set) 一類的指令對 Fibonacci 數運算的幫助。請列出關鍵程式碼並解說
- [ ] 複習 C 語言 [數值系統](https://hackmd.io/@sysprog/c-numerics) 和 [bitwise operation](https://hackmd.io/@sysprog/c-bitwise),思考 Fibonacci 數快速計算演算法的實作中如何減少乘法運算的成本;
- [ ] 學習指出針對大數運算的加速運算和縮減記憶體操作成本的舉措,紀錄你的認知和疑惑
- [ ] 注意到 `fibdrv.c` 存在著 `DEFINE_MUTEX`, `mutex_trylock`, `mutex_init`, `mutex_unlock`, `mutex_destroy` 等字樣,什麼場景中會需要呢?撰寫多執行緒的 userspace 程式來測試,觀察 Linux 核心模組若沒用到 mutex,到底會發生什麼問題。嘗試撰寫使用 [POSIX Thread](https://en.wikipedia.org/wiki/POSIX_Threads) 的程式碼來確認。 $\to$ 搭配閱讀〈[並行和多執行緒程式設計](https://hackmd.io/@sysprog/concurrency)〉
## Experiment procedure
### Modification to calculate Fib(n), when n > 92
In the original design, the Fibonacci sequence is stored in a `long long` type array, which will overflow after fib(92). In this report, I followed the recommendation from [link](https://hackmd.io/@sysprog/linux2023-fibdrv-b#%E8%A8%88%E7%AE%97-F93-%E5%8C%85%E5%90%AB-%E4%B9%8B%E5%BE%8C%E7%9A%84-Fibonacci-%E6%95%B8-%E5%8A%9F%E8%83%BD%E5%8F%97%E9%99%90) to implement two methods to deal with this issue:
#### 1. GCC `__uint128` type:
Declare the array for recording Fibonacci sequence using `__uint128` type. The key difference between the `long long` and `__uint128` types is that `printf()` cannot directly output the result with `__uint128` type. As the result of this, we need to convert the number into a string and pass it to the user process through a buffer.
<details>
<summary><font color=darkred>Details</font></summary>
```c
static size_t fib_sequence(long long k, char *buf)
{
uint128_t f[k + 2];
f[0] = 0;
f[1] = 1;
for (int i = 2; i <= k; i++) {
f[i] = f[i - 1] + f[i - 2];
}
char *result = uint128_to_string(f[k]);
size_t len = uint128_len(f[k]);
__reverse(result, len);
if (copy_to_user(buf, result, len))
return -EFAULT;
kfree(result);
return len;
}
```
</details>
`uint128_to_string` and `uint128_len` are the functions associated with `uint128_t` type. Below are the details of the implementation:
<details>
<summary><font color=darkred>Details</font></summary>
```c
static size_t uint128_len(uint128_t x)
{
size_t size = 0;
while (x) {
size++;
x /= 10;
}
size += (size == 0); // digit len can not less than 1
return size;
}
static char *uint128_to_string(uint128_t x)
{
char *str = (char *) kmalloc(uint128_len(x) + 1, GFP_KERNEL);
char *p = str;
*p = '0'; // initilize the string to zero;
while (x) {
*p++ = x % 10 + '0';
x /= 10;
}
return str;
}
```
</details>
So far, we are able to accurately calculate the Fibonacci sequence up to n = 100; however, we still encounter overflow after n = 187.
```shell
##Modify verify.py to calculate fibonacci sequence til fib(200)
f(187) fail
input: 198239973509362327032045173661212819077
expected: 538522340430300790495419781092981030533
```
#### 2. [myBigN](https://github.com/slipet/fibdrv/blob/master/myBigN.h)
There is an alternative for the large number representation mentioned in [link](https://hackmd.io/@sysprog/linux2023-fibdrv-b#%E8%A8%88%E7%AE%97-F93-%E5%8C%85%E5%90%AB-%E4%B9%8B%E5%BE%8C%E7%9A%84-Fibonacci-%E6%95%B8-%E5%8A%9F%E8%83%BD%E5%8F%97%E9%99%90), which is a customized data structure. I reference the implementations from [AdrianHuang](https://github.com/AdrianHuang/fibdrv/tree/big_number) and [SSO-23](https://github.com/elliotgoodrich/SSO-23) to design `myBigN` for large number representation.
<details>
<summary><font color=darkred>Structure details</font></summary>
```c
typedef struct {
char *ptr;
size_t size : MY_BIGNUM_MAX_LEN, capcity : 6, is_ptr : 1, reserved : 3;
} myLong;
typedef struct {
char data[15];
uint8_t free_space : 4, reserved : 4;
} myShort;
typedef union {
myLong long_num;
myShort short_num;
} myBigN;
```
</details>
Numbers up to 15 digits are represented using the `myShort` structure. Whenever a number exceeds 15 digits, we dynamically allocate memory space to a pointer in `myLong` for storing large numbers. Both structures include a reserved byte to record information about free space, size, and a control flag. The following is the implementation associated with `myBigN`.
<details>
<summary><font color=darkred>fib_sequence</font></summary>
```c
static ssize_t fib_sequence(long long k, char *buf)
{
myBigN f[k + 2];
f[0] = *MY_BIGNUM_TMP("0");
f[1] = *MY_BIGNUM_TMP("1");
for (int i = 2; i <= k; i++) {
myBigN_Add(&f[i], &f[i - 1], &f[i - 2]);
}
ssize_t len = myBigN_size(&f[k]);
if (copy_to_user(buf, myBigN_data(&f[k]), len))
return -EFAULT;
for (int i = 0; i <= k; i++)
myBigN_free(&f[i]);
return len;
}
```
</details>
The strategy for `myBigN` variables addition is implemented character by character. Below are the details of this method.
<details>
<summary><font color=darkred>myBigN_Add</font></summary>
```c
void myBigN_Add(myBigN *out, const myBigN *x, const myBigN *y)
{
size_t sizeX = myBigN_size(x), sizeY = myBigN_size(y);
char *dataX = myBigN_data(x), *dataY = myBigN_data(y);
size_t i = 0;
if (sizeX < sizeY) {
__SWAP(dataX, dataY, char);
__SWAP(&sizeX, &sizeY, size_t);
}
__reverse(dataX, sizeX);
__reverse(dataY, sizeY);
uint8_t carry = 0;
char result[sizeX + 2];
for (; i < sizeX || carry; i++) {
uint8_t sum = ((i < sizeX) ? (dataX[i] - '0') : 0) +
((i < sizeY) ? (dataY[i] - '0') : 0) + carry;
result[i] = '0' + sum % 10;
carry = sum / 10;
}
result[i] = 0;
__reverse(result, i);
__reverse(dataX, sizeX);
__reverse(dataY, sizeY);
if (out)
*out = *MY_BIGNUM_TMP(result);
}
```
</details>
The macros `MY_BIGNUM_TMP` and `INIT_STRING_LITERALS`, referenced from [AdrianHuang](https://github.com/AdrianHuang/fibdrv/tree/big_number), leverage the designated initializer to make the initialization of variables more elegant and readable.
<details>
<summary><font color=darkred>MY_BIGNUM_TMP and INIT_STRING_LITERALS</font></summary>
```c
#define INIT_STRING_LITERALS() \
(myBigN) \
{ \
.short_num = {.free_space = 15 } \
}
#define MY_BIGNUM_TMP(x) myBigN_init(&INIT_STRING_LITERALS(), x)
```
</details>
In `MY_BIGNUM_TMP(x)`, we can initialize a myBigN pointer with `NIT_STRING_LITERALS()` and a string literal. Declaring the `INIT_STRING_LITERALS()` macro is equivalent to declaring a myBigN object. `myBigN_init` receives the address of the object and a string literal to complete the subsequent allocation. Numbers with a digit length of up to 15 are stored in a character array. When the length exceeds 15, we allocate heap space to `x->long_num.ptr` for storing the larger number.
<details>
<summary><font color=darkred>myBigN_init</font></summary>
```c
myBigN *myBigN_init(myBigN *x, const char *strNum)
{
*x = INIT_STRING_LITERALS();
//+ 1 for '\0'
ssize_t len = strlen(strNum) + 1;
if (len > 16) {
x->long_num.size = len - 1;
x->long_num.is_ptr = true;
x->long_num.ptr = kmalloc(myBigN_size(x), GFP_KERNEL);
memcpy(myBigN_data(x), strNum, len);
} else {
memcpy(x->short_num.data, strNum, len);
x->short_num.free_space = 15 - len + 1;
}
return x;
}
```
</details>
Up until now, everything seemed set up to calculate Fibonacci sequence to fib(500). However, every time I execute `$make check` command, a failure message appears, followed by my Ubuntu system freezing, requiring a restart to resolve. The failure message is looks like this:
```shell
##Modify verify.py to calculate fibonacci sequence til fib(500)
f(312) fail
input: 1719558092601766452430641106302905217344934236440122960529002115744
expected: 71558092601766452430641106302905217344934236440122960529002115744
```
Furthermore, the terminated numbers of the Fibonacci sequence are not the same and appear to occur randomly.
At first, I intserted `printk` into the function and checked the information about myBigN variables. I used `$sudo dmesg | grep "xxxxx"` to filter the kernel messages. With the help of the print messages, I found that there is a condition uncovered in `myBigN_init`. The error was that the large number should be assigned to `x->long_num.ptr` rather than `x->short_num.data` when `len = 16`. The revised code is as follows:
<details>
<summary><font color=darkred>myBigN_init (revised)</font></summary>
```diff
myBigN *myBigN_init(myBigN *x, const char *strNum)
{
*x = INIT_STRING_LITERALS();
//+ 1 for '\0'
ssize_t len = strlen(strNum) + 1;
- if (len > 16) {
+ if (len >= 16) {
x->long_num.size = len - 1;
x->long_num.is_ptr = true;
x->long_num.ptr = kmalloc(myBigN_size(x), GFP_KERNEL);
memcpy(myBigN_data(x), strNum, len);
} else {
memcpy(x->short_num.data, strNum, len);
x->short_num.free_space = 15 - len + 1;
}
return x;
}
```
</details>
However, the failure message persists despite the modification. This leads me to suspect other parts of my implementation. Subsequently, I spent two days addressing this issue. Ultimately, I discovered that the problem lies in `memcpy(myBigN_data(x), strNum, len);`. It copies the entire string including `\0` into `myBigN_data(x)`, but the allocated size for `x->long_num.ptr` is only len - 1. Thus, an extra byte is copied into `myBigN_data(x)`, overwriting the area designated for storing the size. Below is the modification:
<details>
<summary><font color=darkred>myBigN_init (final)</font></summary>
```diff
myBigN *myBigN_init(myBigN *x, const char *strNum)
{
*x = INIT_STRING_LITERALS();
//+ 1 for '\0'
ssize_t len = strlen(strNum) + 1;
if (len >= 16) {
x->long_num.size = len - 1;
x->long_num.is_ptr = true;
x->long_num.ptr = kmalloc(myBigN_size(x), GFP_KERNEL);
- memcpy(myBigN_data(x), strNum, len);
+ memcpy(myBigN_data(x), strNum, myBigN_size(x));
} else {
memcpy(x->short_num.data, strNum, len);
x->short_num.free_space = 15 - len + 1;
}
return x;
}
```
</details>
Now, we are able to calculate Fibonacci sequence up to fib(500). The following is the result of fib(500) in the out file, which is same as [link](https://gist.github.com/brchiu/71bf87f51be35d6e334f4898b1ebd011)
```
/fibonacci at offset 500, returned the sequence 139423224561697880139724382870407283950070256587697307264108962948325571622863290691557658876222521294125.
```
TODO: I should learn using other debugging tools to help me debug kernel modules.
### [sysprog21/bignum](https://github.com/sysprog21/bignum)
From the description of [sysprog21/bignum 程式碼分析](https://hackmd.io/@sysprog/linux2023-fibdrv/%2F%40sysprog%2Flinux2023-fibdrv-e), we can found that
## Reference
[AdrianHuang](https://github.com/AdrianHuang)
[Linux 核心專題: 重做 fibdrv](https://hackmd.io/@sysprog/Hk2NJG3H2)