2022q1 Homework3 (fibdrv)

contributed by < jimmy-liu1021 >

自我檢查清單

研讀上述 Linux 效能分析的提示描述，在自己的實體電腦運作 GNU/Linux，做好必要的設定和準備工作
$\to$ 從中也該理解為何不希望在虛擬機器中進行實驗;
研讀上述費氏數列相關材料 (包含論文)，摘錄關鍵手法，並思考 clz / ctz 一類的指令對 Fibonacci 數運算的幫助。請列出關鍵程式碼並解說
複習 C 語言數值系統和 bitwise operation，思考 Fibonacci 數快速計算演算法的實作中如何減少乘法運算的成本;
研讀 KYG-yaya573142 的報告，指出針對大數運算，有哪些加速運算和縮減記憶體操作成本的舉措？
lsmod 的輸出結果有一欄名為 Used by，這是 "each module's use count and a list of referring modules"，但如何實作出來呢？模組間的相依性和實際使用次數 (reference counting) 在 Linux 核心如何追蹤呢？

搭配閱讀 The Linux driver implementer’s API guide » Driver Basics
注意到 fibdrv.c 存在著 DEFINE_MUTEX, mutex_trylock, mutex_init, mutex_unlock, mutex_destroy 等字樣，什麼場景中會需要呢？撰寫多執行緒的 userspace 程式來測試，觀察 Linux 核心模組若沒用到 mutex，到底會發生什麼問題。嘗試撰寫使用 POSIX Thread 的程式碼來確認。

說好的開發紀錄呢？

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

jserv

實驗環境

$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

$ lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              2
Core(s) per socket:              4
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           60
Model name:                      Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
Stepping:                        3
CPU MHz:                         3500.000
CPU max MHz:                     3900.0000
CPU min MHz:                     800.0000
BogoMIPS:                        6999.36
Virtualization:                  VT-x
L1d cache:                       128 KiB
L1i cache:                       128 KiB
L2 cache:                        1 MiB
L3 cache:                        8 MiB
NUMA node0 CPU(s):               0-7

前期準備

$ uname -r
5.13.0-39-generic

//確認 linux-headers 套件已正確安裝於開發環境
$ dpkg -L linux-headers-`uname -r` | grep "/lib/modules"
/lib/modules
/lib/modules/5.13.0-39-generic
/lib/modules/5.13.0-39-generic/build

//確認使用者身份不是 root
$ whoami
jimmy
$ sudo whoami
root

//安裝製圖工具
$ sudo apt install util-linux strace gnuplot-nox

//clone `fibdrv` 專案並安裝模組
$ git clone https://github.com/sysprog21/fibdrv
$ cd fibdrv
$ make check

此時會發現有報錯

@diff -u out scripts/expected.txt

這道指令是要將 out 與預期的輸出 expected.txt 比對差異，然而實際上去觀察 expected.txt 的內容發現 F(93) 以後的數值亦是錯誤的，因此將此行刪除。

排除效能干擾因素

查看指定行程的處理器親和性
```
$ taskset -cp 1
pid 1's current affinity list: 0-7
```
代表 Process 1 可在第0到第7個 Cpu 中執行
若將其中一顆Cpu單獨讓我們使用，量測效能時會更加準確。
```
$ cat /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=7"
```
將 GRUB_CMDLINE_LINUX_DEFAULT 加上 isolcpus=7 以空出第7個Cpu
修改完 grub 之後，內有提醒要使用 update-grub 指令進行更新
```
$ sudo update-grub
```

重開機之後再檢查一次，可發現第7個 Cpu 不在列表中

$ taskset -cp 1
pid 1's current affinity list: 0-6

測量 Fibonacci sequence 執行時間

Kernel space

在 kernel space 中，使用 ktime_get 取得時間

static ssize_t fib_write(struct file *file,
                         const char *buf,
                         size_t size,
                         loff_t *offset)
{
    ktime_t kt;
    switch (size) {
    case 0:
        kt = ktime_get();
        fib_sequence(*offset);
        kt = ktime_sub(ktime_get(), kt);
        break;
    case 1:
        kt = ktime_get();
        fib_sequence_fdouble(*offset);
        kt = ktime_sub(ktime_get(), kt);
        break;
    default:
        return 0;
    }
    return (ssize_t) ktime_to_ns(kt);
}

User space

在 user space 中，使用 clock_gettime 取得時間

clock_gettime(CLOCK_REALTIME, &t_start);
fz = write(fd, write_buf, 0);
clock_gettime(CLOCK_REALTIME, &t_end);
long long dif = t_end.tv_nsec - t_start.tv_nsec;

System Call

將 kernel space runtime 扣掉 user space runtime 便可估測 system call 所佔用的時間

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

測量 Fast doubling 執行時間

參考 KYG-yaya573142 中的對於 fast doubling 的實作

static long long fib_sequence_fdouble(long long n)
{
    if (n < 2) { /* F(0) = 0, F(1) = 1 */
        return n;
    }
    long long f[2];
    unsigned int ndigit = 32 - __builtin_clz(n); /* number of digit in n */
    f[0] = 0;                                    /* F(k) */
    f[1] = 1;                                    /* F(k+1) */

    for (unsigned int i = 1U << (ndigit - 1); i;
         i >>= 1) { /* walk through the digit of n */
        long long k1 =
            f[0] * (f[1] * 2 - f[0]); /* F(2k) = F(k) * [ 2 * F(k+1) – F(k) ] */
        long long k2 =
            f[0] * f[0] + f[1] * f[1]; /* F(2k+1) = F(k)^2 + F(k+1)^2 */
        if (n & i) {                   /* current binary digit == 1 */
            f[0] = k2;                 /* F(n) = F(2k+1) */
            f[1] = k1 + k2; /* F(n+1) = F(2k+2) =  F(2k) +  F(2k+1) */
        } else {
            f[0] = k1; /* F(n) = F(2k) */
            f[1] = k2; /* F(n+1) = F(2k+1) */
        }
    }
    return f[0];
}

使用相同的方式測量時間

kernel space 與 user space 近乎 constant time

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

比較 sequential 與 fast doubling 效能差異

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

2022q1 Homework3 (fibdrv)

自我檢查清單

排除效能干擾因素

測量 Fibonacci sequence 執行時間

Kernel space

User space

System Call

測量 Fast doubling 執行時間

使用相同的方式測量時間

比較 sequential 與 fast doubling 效能差異

大數運算

Read more

2022q1 Lecture 3 (quiz3)

2022q1 Homework2 (quiz2)

2022q1 Lecture 2 (quiz2)

2022q1 Homework1 (quiz1)