# 2023q1 Homework3 (fibdrv)
contributed by < `chiacyu` >
## Experimental environment
```c
$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 5900HX with Radeon Graphics
CPU family: 25
Model: 80
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
Stepping: 0
Frequency boost: enabled
CPU max MHz: 4679.2959
CPU min MHz: 1200.0000
BogoMIPS: 6587.65
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 256 KiB (8 instances)
L1i: 256 KiB (8 instances)
L2: 4 MiB (8 instances)
L3: 16 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-15
Vulnerabilities:
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Retbleed: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
Srbds: Not affected
Tsx async abort: Not affected
```
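### Troubleshooting
Running `make check` failed with `Key was rejected by service`. With Secure Boot enabled the kernel refuses to load an unsigned module, so the fix was to turn off `secure boot mode` in the `bios`. Try this if you hit the same error.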
### fibdrv deep dive
If you are not familiar with `device driver`s, read [Character device drivers](https://linux-kernel-labs.github.io/refs/heads/master/labs/device_drivers.html) before starting the implementation to get a better picture of the overall architecture.
#### struct file_operations
```c
const struct file_operations fib_fops = {
.owner = THIS_MODULE,
.read = fib_read,
.write = fib_write,
.open = fib_open,
.release = fib_release,
.llseek = fib_device_lseek,
};
```
The article notes that `the character device drivers receive unaltered system calls made by users over device-type files.` The `file_operations` structure defines which driver function handles each of those `system call`s.
Note that these `function`s differ from the ones called in `userspace`. Take `read` as an example:
```c
ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
```
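This is quite different from the prototype used in `userspace`, shown below. A `device driver` runs in `kernel space`, and kernel code must not dereference a `userspace` pointer directly (hence the `__user` annotation). The data has to be moved through kernel helpers such as `copy_to_user()`.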
```c
ssize_t read(int fd, void *buf, size_t count);
```
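## Rewriting with fast doubling
`fibonacci` is implemented here with the `fast-doubling` method described in the assignment, following the assignment notes and the explanation in [Calculating Fibonacci Numbers by Fast Doubling](https://chunminchang.github.io/blog/post/calculating-fibonacci-numbers-by-fast-doubling).
The formulas are: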
$$
\begin{align}
&F(2k) = 2F(k)\times F(k+1) - F(k)^2 \\
&F(2k+1) = F(k)^2 + F(k+1)^2
\end{align}
$$
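As a quick sanity check of the two identities (a worked example added for illustration, not part of the referenced derivation), take $k = 5$, where $F(5) = 5$ and $F(6) = 8$:

$$
\begin{align}
&F(10) = 2 \times 5 \times 8 - 5^2 = 55 \\
&F(11) = 5^2 + 8^2 = 89
\end{align}
$$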
If `fast-doubling` is unfamiliar, the same author has two more posts, [Matrix Difference Equation for Fibonacci Sequence](https://chunminchang.github.io/blog/post/matrix-difference-equation-for-fibonacci-sequence) and [Exponentiation by squaring](https://chunminchang.github.io/blog/post/exponentiation-by-squaring). Reading them makes the whole method much clearer.
```c
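static __uint128_t fib_sequence(long long k)
{
    __uint128_t a = 0; /* F(n) */
    __uint128_t b = 1; /* F(n + 1) */
    __uint128_t t1, t2;
    /* Bit length of k; __builtin_clzl(0) is undefined, so guard k == 0. */
    int len = k ? 64 - __builtin_clzl((unsigned long) k) : 0;

    while (len > 0) {
        t1 = a * (2 * b - a);   /* F(2n) */
        t2 = (b * b) + (a * a); /* F(2n + 1) */
        a = t1;
        b = t2;
        if (((k >> (len - 1)) & 0x1)) {
            /* Current bit is set: advance from 2n to 2n + 1. */
            t1 = a + b;
            a = b;
            b = t1;
        }
        len -= 1;
    }
    return a;
}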
```
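Besides using the `__uint128_t` type directly, the code uses `__builtin_clzl()` to count the zero bits from the `MSB` toward the `LSB` before the first set bit. Subtracting that count from 64 gives the number of bits the loop has to process. For a 64-bit `unsigned long`, `4` is `0b100`, so there are 61 leading zeros and `64 - 61 = 3` bits to process:
```c
__builtin_clzl(4) == 61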
```
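The bit-length computation and the doubling loop can be checked in user space before they go into the driver. The sketch below is only an illustration: `bit_length`, `fib_fast`, and `fib_iter` are hypothetical names, and it uses plain 64-bit arithmetic, so it is only valid up to `F(92)`.

```c
#include <assert.h>
#include <stdint.h>

/* Number of significant bits in k (0 for k == 0). */
static int bit_length(uint64_t k)
{
    /* __builtin_clzll(0) is undefined, so zero is handled separately. */
    return k ? 64 - __builtin_clzll(k) : 0;
}

/* Fast doubling with 64-bit arithmetic; only valid for k <= 92. */
static uint64_t fib_fast(uint64_t k)
{
    uint64_t a = 0, b = 1; /* invariant: a = F(n), b = F(n + 1) */

    assert(k <= 92);
    for (int len = bit_length(k); len > 0; len--) {
        uint64_t t1 = a * (2 * b - a); /* F(2n) */
        uint64_t t2 = a * a + b * b;   /* F(2n + 1) */
        a = t1;
        b = t2;
        if ((k >> (len - 1)) & 1) {    /* bit set: step to 2n + 1 */
            t1 = a + b;
            a = b;
            b = t1;
        }
    }
    return a;
}

/* Plain iteration, used only as a reference result. */
static uint64_t fib_iter(uint64_t k)
{
    uint64_t a = 0, b = 1;

    while (k--) {
        uint64_t t = a + b;
        a = b;
        b = t;
    }
    return a;
}
```

Comparing `fib_fast(k)` with `fib_iter(k)` for every `k` from 0 to 92 is a cheap way to catch mistakes in the bit handling.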
## Time measurement and performance analysis
Timing uses the `ktime_t` type, as the course requires. When `main` calls `read(fd, buf, 1)`, the kernel invokes `static ssize_t fib_read(struct file *file, char *buf, size_t size, loff_t *offset)`.
A `static` variable, `static ktime_t kt`, is added first, together with a `helper function` named `fib_time_proxy` that does the timing.
`fib_read()` returns the value of `fib_time_proxy()`. Inside `fib_time_proxy()`, `ktime_get();` records the current time, and once `fib_sequence()` has finished, `ktime_sub(ktime_get(), kt);` computes the elapsed time.
```c
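static ktime_t kt;

/* Runs fib_sequence() and leaves the elapsed time in kt.
 * Defined before fib_read() so that it is declared at its point of use. */
static long long fib_time_proxy(long long k)
{
    long long result;

    kt = ktime_get();
    result = fib_sequence(k);
    kt = ktime_sub(ktime_get(), kt);
    return result;
}

static ssize_t fib_read(struct file *file,
                        char *buf,
                        size_t size,
                        loff_t *offset)
{
    return (ssize_t) fib_time_proxy(*offset);
}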
```
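The modified `client.c` is shown below. It treats the return value of `write()` as the elapsed time recorded for the preceding `read()`: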
```c
int main()
{
long long sz;
long long kt;
char buf[1];
char write_buf[] = "testing writing";
int offset = 100; /* TODO: try test something bigger than the limit */
FILE *data_set = fopen("data.txt", "w");
int fd = open(FIB_DEV, O_RDWR);
if (fd < 0) {
perror("Failed to open character device");
exit(1);
}
for (int i = 0; i <= offset; i++) {
lseek(fd, i, SEEK_SET);
sz = read(fd, buf, 1);
printf("Reading from " FIB_DEV
" at offset %d, returned the sequence "
"%lld.\n",
i, sz);
kt = write(fd, write_buf, sizeof(write_buf));
fprintf(data_set, "%lld\n", kt);
printf("The time span is %lld\n", kt);
}
fclose(data_set);
close(fd);
return 0;
}
```
The output is:
```c
Reading from /dev/fibonacci at offset 0, returned the sequence 0.
The time span is 2147017456
Reading from /dev/fibonacci at offset 1, returned the sequence 1.
The time span is 1396
Reading from /dev/fibonacci at offset 2, returned the sequence 1.
The time span is 978
Reading from /dev/fibonacci at offset 3, returned the sequence 2.
The time span is 1466
Reading from /dev/fibonacci at offset 4, returned the sequence 3.
The time span is 978
Reading from /dev/fibonacci at offset 5, returned the sequence 5.
The time span is 978
Reading from /dev/fibonacci at offset 6, returned the sequence 8.
The time span is 908
Reading from /dev/fibonacci at offset 7, returned the sequence 13.
The time span is 908
Reading from /dev/fibonacci at offset 8, returned the sequence 21.
The time span is 838
Reading from /dev/fibonacci at offset 9, returned the sequence 34.
The time span is 978
Reading from /dev/fibonacci at offset 10, returned the sequence 55.
The time span is 908
Reading from /dev/fibonacci at offset 11, returned the sequence 89.
The time span is 978
Reading from /dev/fibonacci at offset 12, returned the sequence 144.
The time span is 908
Reading from /dev/fibonacci at offset 13, returned the sequence 233.
The time span is 1396
Reading from /dev/fibonacci at offset 14, returned the sequence 377.
The time span is 978
Reading from /dev/fibonacci at offset 15, returned the sequence 610.
The time span is 908
Reading from /dev/fibonacci at offset 16, returned the sequence 987.
The time span is 908
Reading from /dev/fibonacci at offset 17, returned the sequence 1597.
...
...
```
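From [linux/include/linux/ktime.h](linux/include/linux/ktime.h) we can see that `ktime_t` is an `s64`, a signed 64-bit integer, and that its unit is `nanoseconds`. The time measured so far covers only the `fib sequence` computation in `kernel space`; the time that passes in `user space` still has to be measured. The flow of `client.c` is:
1. Run `client.c` (user space)
2. Switch to `kernel space`
3. Run `fib_sequence()` (kernel space)
4. Switch back to `user space`
5. Write the collected `ktime` to `data.txt`
The current `ktime_t` value reports only the time spent in `kernel space`. Subtracting it from the time `user space` spends waiting gives the cost of switching between the two.
The figure below shows only the time the `kernel` spends in `fib_sequence()` when it computes with `fast-doubling`.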

Compare this with the timing of the original version, before `fast-doubling` was applied:

Next, measure the time that passes in `user space`. The `timespec` structure used for this is documented in [clock_getres(2)](https://man7.org/linux/man-pages/man2/clock_getres.2.html):
```c
struct timespec {
time_t tv_sec; /* seconds */
long tv_nsec; /* nanoseconds */
};
```
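Because `ktime_t` is in `nanoseconds`, the `timespec` value has to be converted to `nanoseconds` as well. An integer constant is used here instead of `1e9`, which would turn the expression into a `double` and lose precision for large `tv_sec` values:
```c
ts.tv_sec * 1000000000LL + ts.tv_nsec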
```
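The `utime_tons()` helper called by the client is not listed in these notes. A minimal sketch of what it might look like follows, assuming `CLOCK_MONOTONIC` as the clock source (that choice is an assumption, not something the notes state):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Hypothetical helper: current monotonic time in nanoseconds. */
static long long utime_tons(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void) rc;
    /* Integer arithmetic keeps full nanosecond precision. */
    return (long long) ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

A monotonic clock is a reasonable choice for interval measurements because it is not affected by wall-clock adjustments while the benchmark runs.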
Then modify `client.c` so that it records three time values:
```c
for (int i = 0; i <= offset; i++) {
    long long kt, ut;
    ut = utime_tons();
    lseek(fd, i, SEEK_SET);
    sz = read(fd, read_buf, 1);
    read_buf[sz] = '\0';
    printf("Reading from " FIB_DEV
           " at offset %d, returned the sequence "
           "%s and size is %lld.\n",
           i, read_buf, sz);
    kt = write(fd, write_buf, sizeof(write_buf));
    ut = utime_tons() - ut;
    /* i is an int and the times are long long: use %d and %lld. */
    fprintf(data_set_k, "%d %lld\n", i, kt);
    fprintf(data_set_u, "%d %lld\n", i, ut);
    fprintf(data_set_d, "%d %lld\n", i, ut - kt);
    printf("The time is %lld\n", kt);
}
```
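`kt` : time elapsed in `kernel space`
`ut` : time elapsed as seen from `user space` (the interval also covers the `printf` call between the two timestamps)
`dt` : `ut - kt`, the overhead outside the kernel computation, which includes the mode switches and `Copy to user`
The figure below compares the time `original` and `fast-doubling` spend in `kernel space`.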

The `fast-doubling` method is faster than the original version, and the speedup grows as the input number gets larger.
## Small/Short String Optimization
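A first look at `String Optimization` made it clear that much of programming comes down to `encode`-ing information as numbers and `decode`-ing it again.
The basic idea of `String Optimization` here is to store each decimal digit in a `char`. For example, representing a digit from 0 to 9 as an `int` takes 4 `bytes` on this platform, whereas a `char` needs only 1 `byte`.
The structure used is: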
```c
typedef struct big_num_fix {
char num[128];
} big_num_fix_t;
```
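- `num` : holds the number itself, represented as a string
The array size is fixed at `128` `char`s in the declaration. The advantage is that no extra initialization code is needed, and the `num` fields of the first two elements can be set directly to `0` and `1`.
The digits are stored least-significant first, however, so a sum comes out in reversed order. For example, when a carry occurs:
['5'] + ['8'] = ['3', '1']
A `reverse_str` helper is therefore needed to put the digits back into reading order.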
```c
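void reverse_bn(big_num_t *a)
{
    int bit_num = a->num_size;
    char *buf = malloc(Max_len * sizeof(char));

    if (!buf)
        return;
    /* Work from a temporary copy so the source is not overwritten. */
    strncpy(buf, a->num_string, (Max_len * sizeof(char)));
    for (int i = bit_num - 1, j = 0; i >= 0; i--) {
        a->num_string[j++] = buf[i];
    }
    free(buf); /* release the temporary copy */
}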
```
Next we need an `add_function` that adds two `big_num_fix_t` values.
```c
static void big_num_fix_add(big_num_fix_t *ina,
big_num_fix_t *inb,
big_num_fix_t *out)
{
size_t size_a = strlen(ina->num);
size_t size_b = strlen(inb->num);
int i, sum, carry = 0;
if (size_a >= size_b) {
for (i = 0; i < size_b; i++) {
sum = (ina->num[i] - '0') + (inb->num[i] - '0') + carry;
out->num[i] = '0' + sum % 10;
carry = sum / 10;
}
for (i = size_b; i < size_a; i++) {
sum = (ina->num[i] - '0') + carry;
out->num[i] = '0' + sum % 10;
carry = sum / 10;
}
} else {
for (i = 0; i < size_a; i++) {
sum = (ina->num[i] - '0') + (inb->num[i] - '0') + carry;
out->num[i] = '0' + sum % 10;
carry = sum / 10;
}
for (i = size_a; i < size_b; i++) {
sum = (inb->num[i] - '0') + carry;
out->num[i] = '0' + sum % 10;
carry = sum / 10;
}
}
if (carry)
out->num[i++] = '0' + carry;
out->num[i] = '\0';
}
```
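The `['5'] + ['8'] = ['3', '1']` example can be reproduced in user space. The sketch below is an illustration under assumptions: `add_rev` is a hypothetical, simplified stand-in for `big_num_fix_add` that works on bare strings, and `reverse_str` is a guess at the helper mentioned above, whose body is not listed in these notes.

```c
#include <assert.h>
#include <string.h>

#define MAX_DIGITS 128

/* Adds two decimal strings stored least-significant digit first. */
static void add_rev(const char *a, const char *b, char *out)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t n = la > lb ? la : lb;
    size_t i;
    int carry = 0;

    assert(n + 2 <= MAX_DIGITS); /* room for a final carry and '\0' */
    for (i = 0; i < n; i++) {
        int sum = carry;
        if (i < la)
            sum += a[i] - '0';
        if (i < lb)
            sum += b[i] - '0';
        out[i] = (char) ('0' + sum % 10);
        carry = sum / 10;
    }
    if (carry)
        out[i++] = (char) ('0' + carry);
    out[i] = '\0';
}

/* Reverses the first len characters of s in place. */
static void reverse_str(char *s, size_t len)
{
    for (size_t i = 0; i < len / 2; i++) {
        char tmp = s[i];
        s[i] = s[len - 1 - i];
        s[len - 1 - i] = tmp;
    }
}
```

Adding `"5"` and `"8"` yields `"31"` in storage order, and `reverse_str` turns it into the readable `"13"`. Handling both operand lengths in one loop also avoids the duplicated branches of the longer version.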
The rewritten `fib_sequence_big_num_fix`:
```c
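static long long fib_sequence_big_num_fix(long long k, char *buf)
{
    long long ret;
    /* kzalloc() zero-fills, so every num[] starts as an empty string. */
    big_num_fix_t *f = kzalloc((k + 2) * sizeof(big_num_fix_t), GFP_KERNEL);

    if (!f)
        return -ENOMEM;
    f[0].num[0] = '0'; /* F(0) */
    f[1].num[0] = '1'; /* F(1) */
    for (int i = 2; i <= k; i++) {
        big_num_fix_add(&f[i - 1], &f[i - 2], &f[i]);
    }
    ret = strlen(f[k].num);
    reverse_str(f[k].num, ret); /* back to most-significant digit first */
    if (copy_to_user(buf, f[k].num, ret))
        ret = -EFAULT;
    kfree(f);
    return ret;
}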
```
After this change, computing up to `fib(100)` works without problems:
```c
The time span is 3487
Reading from /dev/fibonacci at offset 93, returned the sequence 12200160415121876738 and size is 20.
The time span is 3758
Reading from /dev/fibonacci at offset 94, returned the sequence 19740274219868223167 and size is 20.
The time span is 3627
Reading from /dev/fibonacci at offset 95, returned the sequence 31940434634990099905 and size is 20.
The time span is 3637
Reading from /dev/fibonacci at offset 96, returned the sequence 51680708854858323072 and size is 20.
The time span is 3727
Reading from /dev/fibonacci at offset 97, returned the sequence 83621143489848422977 and size is 20.
The time span is 3547
Reading from /dev/fibonacci at offset 98, returned the sequence 135301852344706746049 and size is 21.
The time span is 3597
Reading from /dev/fibonacci at offset 99, returned the sequence 218922995834555169026 and size is 21.
The time span is 3908
Reading from /dev/fibonacci at offset 100, returned the sequence 354224848179261915075 and size is 21.
```
The measured results:

Some values spike sharply, probably because of interference such as other `process`es competing for `kernel` resources. Next, look at the cost and time spent in `kernel space`, in `user space`, and on `copy to user`.

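The results suggest that the `bn` structure solves the big-number problem, but at the cost of computation time. The values also fluctuate noticeably, so the sources of interference have to be eliminated.
## Eliminating interference
### Checking a process's CPU affinity
Query it with the method from the assignment: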
```c
pid 1's current affinity mask: ffff
(base) chiacyu@chiacyu-msi:~$ taskset -cp 1
pid 1's current affinity list: 0-15
```
This means the `process` with `pid=1` may run on `core`s `0-15`.
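The same information can be read from inside a program. The following is a minimal sketch using the Linux-specific `sched_getaffinity()` call; the helper name `allowed_cpu_count` is made up for this illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sched.h>

/* Returns how many CPUs the calling thread may run on, or -1 on error. */
static int allowed_cpu_count(void)
{
    cpu_set_t set;
    int n;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) /* 0 = calling thread */
        return -1;
    n = CPU_COUNT(&set);
    assert(n >= 1); /* a running thread has at least one allowed CPU */
    return n;
}
```

With the affinity mask `ffff` shown above, this count would be 16.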
### Isolating a specific CPU core
See [How to get isolcpus kernel](https://askubuntu.com/questions/165075/how-to-get-isolcpus-kernel-parameter-working-with-precise-12-04-amd64) for the detailed steps. First locate the `grub` file under `/etc/default` and open it, for example with `vim`. `sudo` is required to overwrite the file.
```c
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
# info -f grub -n 'Simple configuration'
GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=hidden
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=1"
GRUB_CMDLINE_LINUX=""
# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"
# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console
# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480
# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true
# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"
# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"
```
When done, remember to run `update-grub`.
After a successful update and a reboot, the setting can be verified under `/sys/devices/system/cpu`:
```c
(base) chiacyu@chiacyu-msi:/sys/devices/system/cpu$ ls
cpu0 cpu12 cpu2 cpu6 cpufreq kernel_max online smt
cpu1 cpu13 cpu3 cpu7 cpuidle microcode possible uevent
cpu10 cpu14 cpu4 cpu8 hotplug modalias power vulnerabilities
cpu11 cpu15 cpu5 cpu9 isolated offline present
(base) chiacyu@chiacyu-msi:/sys/devices/system/cpu$ cat isolated
1
```
Next, pin the process to `CPU core 1` and look at the result:

The fluctuation looks smaller than before. Putting the two side by side: before the `CPU core` was isolated, spikes reached `20000`; after isolation the largest spike is `14000`.

## Eliminating factors that interfere with performance analysis
Disable [address space layout randomization (ASLR)](https://en.wikipedia.org/wiki/Address_space_layout_randomization):
```c
$ sudo sh -c "echo 0 > /proc/sys/kernel/randomize_va_space"
```
The assignment says to turn off turbo mode on Intel processors:
```c
sudo sh -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo"
```
My machine has an `AMD` processor, though. To turn off `turbo mode` there, see [Disabling AMD's equivalent (on a Zen-1 Epyc)](https://askubuntu.com/questions/1294142/disabling-amds-equivalent-on-a-zen-1-epyc-of-intels-turbo-boost-at-runtime). The result with turbo mode turned off:

## String Optimization with fast-doubling
Recall that the `fast-doubling` method additionally needs multiplication and subtraction, so matching routines are required:
```c
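static __uint128_t fib_sequence(long long k)
{
    __uint128_t a = 0; /* F(n) */
    __uint128_t b = 1; /* F(n + 1) */
    __uint128_t t1, t2;
    /* Bit length of k; __builtin_clzl(0) is undefined, so guard k == 0. */
    int len = k ? 64 - __builtin_clzl((unsigned long) k) : 0;

    while (len > 0) {
        t1 = a * (2 * b - a);   /* F(2n): needs multiply and subtract */
        t2 = (b * b) + (a * a); /* F(2n + 1): needs multiply and add */
        a = t1;
        b = t2;
        if (((k >> (len - 1)) & 0x1)) {
            t1 = a + b;
            a = b;
            b = t1;
        }
        len -= 1;
    }
    return a;
}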
```
### big_num_fix_mul()
```c
/* out = ina * inb (schoolbook multiplication on reversed digit strings).
 * out must not alias ina or inb. */
static void big_num_fix_mul(big_num_fix_t *ina,
                            big_num_fix_t *inb,
                            big_num_fix_t *out)
{
    size_t size_a = strlen(ina->num);
    size_t size_b = strlen(inb->num);
    size_t len = size_a + size_b; /* a product has at most this many digits */
    size_t i, j;

    /* Start from an all-zero product so every digit is initialized. */
    memset(out->num, '0', len);
    for (i = 0; i < size_b; i++) {
        int carry = 0;
        for (j = 0; j < size_a; j++) {
            int sum = ((inb->num[i] - '0') * (ina->num[j] - '0')) +
                      (out->num[j + i] - '0') + carry;
            carry = sum / 10;
            out->num[j + i] = (sum % 10) + '0';
        }
        /* No earlier row has written this position yet. */
        out->num[i + size_a] = '0' + carry;
    }
    /* Leading zeros sit at the end of the reversed string. */
    while (len > 1 && out->num[len - 1] == '0')
        len--;
    out->num[len] = '\0';
}
```
### big_num_fix_sub()
```c
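/* out = ina - inb; requires ina >= inb, which holds for 2F(k+1) - F(k). */
static void big_num_fix_sub(big_num_fix_t *ina,
                            big_num_fix_t *inb,
                            big_num_fix_t *out)
{
    size_t size_a = strlen(ina->num);
    size_t size_b = strlen(inb->num);
    size_t i;
    int borrow = 0;

    for (i = 0; i < size_a; i++) {
        /* Digits past the end of inb count as zero. */
        int diff = (ina->num[i] - '0') - borrow -
                   (i < size_b ? inb->num[i] - '0' : 0);
        borrow = diff < 0;
        if (borrow)
            diff += 10;
        out->num[i] = '0' + diff; /* store a character, not a raw value */
    }
    while (i > 1 && out->num[i - 1] == '0')
        i--;
    out->num[i] = '\0';
}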
```
```c
static long long fib_sequence_big_num_fix_fd(long long k, char *buf)
{
big_num_fix_t a;
big_num_fix_t b;
big_num_fix_t t1, t2;
big_num_fix_t const2;
a.num[0] = '0';
a.num[1] = '\0';
b.num[0] = '1';
b.num[1] = '\0';
const2.num[0] = '2';
const2.num[1] = '\0';
big_num_fix_t twobtmp;
big_num_fix_t twobmatmp;
big_num_fix_t bsquare;
big_num_fix_t asquare;
int len = k ? 64 - __builtin_clzl((unsigned long) k) : 0; /* clz(0) is undefined */
while (len > 0) {
big_num_fix_mul(&const2, &b, &twobtmp);
big_num_fix_sub(&twobtmp, &a, &twobmatmp);
big_num_fix_mul(&a, &twobmatmp, &t1);
big_num_fix_mul(&b, &b, &bsquare);
big_num_fix_mul(&a, &a, &asquare);
big_num_fix_add(&asquare, &bsquare, &t2);
a = t1;
b = t2;
if (((k >> (len - 1)) & 0x1)) {
big_num_fix_add(&a, &b, &t1);
a = b;
b = t1;
}
len -= 1;
}
long long ret = strlen(a.num);
reverse_str(a.num, ret); /* digits are stored least-significant first */
if (copy_to_user(buf, a.num, ret))
return -EFAULT;
return ret;
}
```
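Before loading the module, the string-based fast doubling can be exercised end to end in user space. The sketch below is an illustration rather than the driver code: `bn_t`, `bn_add`, `bn_sub`, `bn_mul`, and `fib_fd` are hypothetical names, digits are stored least-significant first as in `big_num_fix_t`, and outputs must not alias inputs.

```c
#include <assert.h>
#include <string.h>

#define DIGITS 128
typedef struct { char num[DIGITS]; } bn_t; /* reversed decimal string */

/* Drops leading zeros (at the end of the string) and terminates it. */
static void bn_trim(bn_t *x, size_t len)
{
    while (len > 1 && x->num[len - 1] == '0')
        len--;
    x->num[len] = '\0';
}

static void bn_add(const bn_t *a, const bn_t *b, bn_t *out)
{
    size_t la = strlen(a->num), lb = strlen(b->num);
    size_t n = la > lb ? la : lb, i;
    int carry = 0;

    assert(n + 1 < DIGITS);
    for (i = 0; i < n; i++) {
        int s = carry + (i < la ? a->num[i] - '0' : 0) +
                (i < lb ? b->num[i] - '0' : 0);
        out->num[i] = (char) ('0' + s % 10);
        carry = s / 10;
    }
    out->num[i] = (char) ('0' + carry);
    bn_trim(out, i + 1);
}

/* Requires a >= b. */
static void bn_sub(const bn_t *a, const bn_t *b, bn_t *out)
{
    size_t la = strlen(a->num), lb = strlen(b->num), i;
    int borrow = 0;

    for (i = 0; i < la; i++) {
        int d = (a->num[i] - '0') - borrow - (i < lb ? b->num[i] - '0' : 0);
        borrow = d < 0;
        out->num[i] = (char) ('0' + (borrow ? d + 10 : d));
    }
    assert(borrow == 0);
    bn_trim(out, la);
}

static void bn_mul(const bn_t *a, const bn_t *b, bn_t *out)
{
    size_t la = strlen(a->num), lb = strlen(b->num), i, j;

    assert(la + lb < DIGITS);
    memset(out->num, '0', la + lb);
    for (i = 0; i < lb; i++) {
        int carry = 0;
        for (j = 0; j < la; j++) {
            int s = (b->num[i] - '0') * (a->num[j] - '0') +
                    (out->num[i + j] - '0') + carry;
            out->num[i + j] = (char) ('0' + s % 10);
            carry = s / 10;
        }
        out->num[i + la] = (char) ('0' + carry);
    }
    bn_trim(out, la + lb);
}

/* Writes F(k) into dst, most-significant digit first. */
static void fib_fd(unsigned long long k, char *dst)
{
    bn_t a = {"0"}, b = {"1"}, two = {"2"}, t, u, t1, t2;
    size_t n;

    for (int len = k ? 64 - __builtin_clzll(k) : 0; len > 0; len--) {
        bn_mul(&two, &b, &t);  /* 2F(n+1) */
        bn_sub(&t, &a, &u);    /* 2F(n+1) - F(n) */
        bn_mul(&a, &u, &t1);   /* F(2n) */
        bn_mul(&a, &a, &t);    /* F(n)^2 */
        bn_mul(&b, &b, &u);    /* F(n+1)^2 */
        bn_add(&t, &u, &t2);   /* F(2n+1) */
        a = t1;
        b = t2;
        if ((k >> (len - 1)) & 1) {
            bn_add(&a, &b, &t);
            a = b;
            b = t;
        }
    }
    n = strlen(a.num);
    for (size_t i = 0; i < n; i++)
        dst[i] = a.num[n - 1 - i];
    dst[n] = '\0';
}
```

The values printed earlier in these notes, such as `fib(93)` and `fib(100)`, make convenient reference results for this check.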
### Time measurement and performance analysis
Some outliers still show up during the measurement:

The result after removing the outliers:

Compare this with the original `naive` result:

The speedup becomes more and more pronounced as the values grow.
## Mutex observations and experiments
### When fib uses the mutex lock
Look at the structure of `fib_open`:
```c
static int fib_open(struct inode *inode, struct file *file)
{
if (!mutex_trylock(&fib_mutex)) {
printk(KERN_ALERT "fibdrv is in use");
return -EBUSY;
}
return 0;
}
```
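`Linux` follows the `Everything is a file` philosophy, so the `fib driver` is accessed through `open`. On `open`, the driver calls `mutex_trylock` to check whether the `mutex` is already held by another `process`. If it is, the call fails immediately with `-EBUSY` instead of blocking.
### What the mutex lock does
Start with a simple `pthread` program: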
```c
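void *thread_func(void *arg)
{
    long long sz;
    char read_buf[128]; /* writable buffer; a string literal must not be written to */
    char write_buf[] = "testing writing"; /* array, so sizeof gives the string size */
    int offset = 100;
    int fd = open(FIB_DEV, O_RDWR);
    if (fd < 0) {
        perror("Failed to open character device");
        exit(1); /* exit() ends the whole process, not only this thread */
    }
    for (int i = 0; i <= offset; i++) {
        long long kt;
        lseek(fd, i, SEEK_SET);
        sz = read(fd, read_buf, 1);
        read_buf[sz > 0 ? sz : 0] = '\0';
        printf("Reading from " FIB_DEV
               " at offset %d, returned the sequence "
               "%s and size is %lld.\n",
               i, read_buf, sz);
        kt = write(fd, write_buf, sizeof(write_buf));
        (void) kt; /* timing is not used in this experiment */
    }
    close(fd);
    return NULL;
}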
int main()
{
pthread_t thr[NUM_THREADS];
int i, rc;
for (i=0 ; i<NUM_THREADS ; i++) {
if ((rc = pthread_create(&thr[i], NULL, thread_func, NULL))) {
fprintf(stderr, "error:pthread_create, rc: %d\n", rc);
return EXIT_FAILURE;
}
}
for (i=0 ; i<NUM_THREADS ; i++) {
pthread_join(thr[i], NULL);
}
return EXIT_SUCCESS;
}
```
Running it produces:
```c
sudo ./client > out
Failed to open character device: Device or resource busy
Failed to open character device: Device or resource busy
Failed to open character device: Device or resource busy
make: *** [Makefile:38: check] Error 1
```
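When one `thread` holds the `mutex` and has not released it yet, any other `thread` that tries to acquire the `mutex` gets this error. Because the failing `thread` calls `exit(1)`, which terminates the whole process, the `thread` that holds the `mutex` is taken down as well, not just the one that lost the race.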
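The "fail immediately instead of waiting" behaviour can be reproduced without the driver. The sketch below is a user-space analogy only: it uses `pthread_mutex_trylock()` in place of the kernel's `mutex_trylock()`, and `fake_open` and `fake_release` are hypothetical stand-ins for the driver's open and release paths.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for fib_open(): never blocks, reports -EBUSY if already held. */
static int fake_open(void)
{
    return pthread_mutex_trylock(&dev_lock) == 0 ? 0 : -EBUSY;
}

/* Stand-in for closing the device: gives the lock back. */
static void fake_release(void)
{
    int rc = pthread_mutex_unlock(&dev_lock);

    assert(rc == 0);
    (void) rc;
}
```

A second `fake_open()` while the lock is held returns `-EBUSY` right away, which matches the `Device or resource busy` messages above. If the losing threads should not take the whole program down, one option is to return from `thread_func` on failure rather than calling `exit(1)`.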