
# 2020q1 Homework2 (fibdrv)

contributed by < `ldotrg` >

###### tags: `linux2020` `dutsai`

System information:

```shell
# uname -a
Linux 5.3.0-42-generic 18.04.1-Ubuntu SMP
# lscpu
Architecture:        x86_64
CPU(s):              6
On-line CPU(s) list: 0-5
Model name:          Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
```

## Building a reference Fibonacci sequence for cross-checking

This borrows the idea from LeetCode [2. Add Two Numbers](https://leetcode.com/problems/add-two-numbers/): every digit is handled on its own, in other words one byte represents one decimal digit. This most basic implementation serves as the baseline for the performance comparison.

```cpp
/* output = x + y; val[] holds one decimal digit per byte, least significant first */
static inline void add_BigN(struct BigN *output, struct BigN x, struct BigN y)
{
    u8 carry = 0;
    for (int ii = 0; ii < MAX_DIGITS; ii++) {
        u8 tmp = x.val[ii] + y.val[ii] + carry; /* at most 9 + 9 + 1 = 19 */
        output->val[ii] = tmp % 10;
        carry = 0;
        if (tmp > 9)
            carry = 1;
    }
}
```

# Measuring `debug_read` execution time with an eBPF one-liner

[bpftrace](https://github.com/iovisor/bpftrace) changes quickly, so it is best installed from source.

- [bpftrace install instructions](https://github.com/iovisor/bpftrace/blob/master/INSTALL.md)
- Reference steps:
```shell
$ ./build.sh
$ cd build-release
$ make install
```
- Shared notes: [BCC & bpftrace - Installation](https://hackmd.io/@0xff07/BJ6vxInlU)

Read these alongside Brendan Gregg's walkthrough, [Jump start from Intermediate](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html#Intermediate):

- [bpftrace Reference Guide](https://github.com/iovisor/bpftrace/blob/master/docs/reference_guide.md)
- [`uprobe`/`uretprobe`: Dynamic Tracing, User-Level](https://github.com/iovisor/bpftrace/blob/master/docs/reference_guide.md#3-uprobeuretprobe-dynamic-tracing-user-level)

Once everything is in place, the single command below is all it takes to print, for every `debug_read` call, the term index and the elapsed time in nanoseconds (remember to scroll horizontally):

```shell
sudo bpftrace -e 'BEGIN { @start = nsecs; } uprobe:'$PWD'/client:debug_read /@start != 0/ { @start = nsecs; } uretprobe:'$PWD'/client:debug_read { @cnt++; @stop = nsecs; printf("%d ,%llu\n", (@cnt - 1), (@stop - @start)); } END { clear(@start); clear(@stop); clear(@cnt);}' | tee performance.csv && sed -i 's/Attaching 4 probes...//g' performance.csv
```

Seeing `Attaching 4 probes...` in the output means it worked.

Breaking the command down:

`bpftrace`: see
[bpftrace-reference-guide](https://github.com/iovisor/bpftrace/blob/master/docs/reference_guide.md#bpftrace-reference-guide)

```shell
#!/usr/bin/env bash
echo "Data will be preserved in performance.csv... Press Ctrl+C to stop."
sudo bpftrace -e '
BEGIN { @start = nsecs; }
uprobe:'$PWD'/client:debug_read /@start != 0/ { @start = nsecs; }
uretprobe:'$PWD'/client:debug_read {
    @cnt++;
    @stop = nsecs;
    printf("%d ,%llu\n", (@cnt - 1), (@stop - @start));
}
END { clear(@start); clear(@stop); clear(@cnt); }'
```

Write to a file and to the terminal at the same time:
`| tee performance.csv`

Strip the unneeded banner line with `sed`:
`&& sed -i 's/Attaching 4 probes...//g' performance.csv`

> Source code: [ldotrg/fibdrv](https://github.com/ldotrg/fibdrv)
> Script file: [bpf_scripts.sh](https://github.com/ldotrg/fibdrv/blob/master/bpf_scripts.sh)

The final results are written to `performance.csv`, which can then be plotted with gnuplot.

> Column 1 of `performance.csv` is the index n of the Fibonacci term.
> Column 2 is the time in nanoseconds from the function being called to its return.

Plotting:

```shell
$ gnuplot \
    -e 'in="performance.csv";out="fibtime.png";gtitle="Fibonacci Sequence Performance"' \
    plot.gp
```

Generating 1000 terms
![](https://i.imgur.com/Iw7HNQr.png)

Generating 100 terms
![](https://i.imgur.com/6b0fdlI.png)

This big-number representation is really inefficient, though: computing beyond term 10,000 triggers a kernel panic, so for now only the first 1000 terms are supported.

:::danger
Why do the plots above show such pronounced jitter? Make sure [SMP IRQ affinity](https://www.kernel.org/doc/Documentation/IRQ-affinity.txt) is fully configured, and combine it with `isolcpus` to rule out interference from hardware interrupts.
:notes: jserv
:::

# Eliminating factors that interfere with performance analysis

### SMP IRQ affinity

- Check the per-CPU IRQ statistics: `$ cat /proc/interrupts`
- List all IRQs: `ls /proc/irq/`
- Bind a specific interrupt to the chosen CPUs:
  `sudo sh -c "echo <cpu bitmask> > /proc/irq/<IRQ number>/smp_affinity"`
  e.g. bind IRQ 128 to CPUs 0-3:
  `sudo sh -c "echo f > /proc/irq/128/smp_affinity"`
- Set the default mask, which applies to all non-active IRQs; once an IRQ is allocated/activated, its affinity bitmask is set to this default mask:
  `sudo sh -c "echo f > /proc/irq/default_smp_affinity"`

### taskset

Pin the client to a single CPU (mask `0x10` selects CPU 4):

```bash
sudo taskset 0x10 ./client
```

### Disabling turbo mode on Intel processors

`sudo sh -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo"`

Execution time increases as a result.

### CPU Isolation

```bash
sudo vim /etc/default/grub
# in /etc/default/grub, set:
GRUB_CMDLINE_LINUX_DEFAULT="isolcpus=4,5"
sudo update-grub
sudo reboot
```

![](https://i.imgur.com/zqoGbrd.png)

# Attempting to port [bignum](https://github.com/sysprog21/bignum)

- Following [4.2 Composite Host Programs](https://elixir.bootlin.com/linux/latest/source/Documentation/kbuild/makefiles.rst#L621), set up kbuild for a module built from multiple source files:

```shell
obj-m := fibdrv.o
fibdrv-objs := bignum_k/apm.o \
               bignum_k/bignum.o \
               bignum_k/format.o \
               bignum_k/mul.o \
               bignum_k/sqr.o
```

As a result, `insmod fibdrv.ko` ran into the following problem:

```
[ 111.811807] fibdrv: module license 'unspecified' taints kernel.
[ 111.811810] Disabling lock debugging due to kernel taint
```

`obj-m` defines the file name of the final kernel module after the linker has linked it, so `fibdrv.c` is not pulled into the build: with `fibdrv-objs` defined, `fibdrv.o` is produced by linking the listed objects rather than by compiling `fibdrv.c`. Adding `fibdrv.o` to `fibdrv-objs` gives the same result. `objdump -tT fibdrv.ko` shows that none of the symbols from `fibdrv.c` are present. In addition, `modinfo` shows that the strings declared with `MODULE_LICENSE`, `MODULE_AUTHOR`, and so on in `fibdrv.c` are missing as well.

```shell
$ modinfo fibdrv.ko
filename:       fibdrv/fibdrv.ko
srcversion:     23EEA8BE9D31043C12427A0
depends:
retpoline:      Y
name:           fibdrv
vermagic:       5.3.0-42-generic SMP mod_unload
```

So `fibdrv.c` was in fact never compiled. The fix was to rename `fibdrv.c` to `fibdrv_main.c` and change the Makefile as follows:

```shell
obj-m := $(TARGET_MODULE).o
$(TARGET_MODULE)-objs := fibdrv_main.o \
                         bignum_k/apm.o \
                         bignum_k/bignum.o \
                         bignum_k/format.o \
                         bignum_k/mul.o \
                         bignum_k/sqr.o
```

With that, everything works again. Conclusion: when a kernel module is built from several C source files, none of those source files may share its name with the module.

### A replacement for `realloc` in kernel code is needed

```cpp
static void *(*orig_malloc)(size_t, gfp_t) = kmalloc;
static void *(*orig_realloc)(void *, size_t) = NULL; /* no replacement chosen yet */
static void (*orig_free)(const void *) = kfree;
```

:::warning
The Linux kernel's slab interface provides `krealloc`; pay attention to its usage rules.
:notes: jserv
:::

### Changing program behavior at run time through `procfs`

To make testing easier, a single variable decides which Fibonacci algorithm is used, so the source does not have to be recompiled every time.

[Creating a proc and interfacing with user
space](https://devarea.com/linux-kernel-development-creating-a-proc-file-and-interfacing-with-user-space/)

```shell
$ cat /proc/fibonacci/fib_flag
fib_flag = 0
$ echo 2 > /proc/fibonacci/fib_flag
$ cat /proc/fibonacci/fib_flag
fib_flag = 2
```

> Forgive the ugly code; other mainline problems have to be solved first.

```clike
switch (fib_flag) {
case 1: /* 64-bit fast doubling */
    result = fast_doubling(*offset);
    snprintf(kbuf, MAX_DIGITS, "%llu", result);
    /* NOTE: copy_to_user() returns the number of bytes it could NOT copy */
    len = copy_to_user(buf, kbuf, MAX_DIGITS);
    break;
case 2: /* bignum fast doubling */
    bignum_k_fast_doubling(buf, size, *offset);
    break;
default: /* digit-per-byte baseline */
    fib_sequence(buf, size, *offset);
    len = size;
}
```

> TODO: sysfs uses kobject to wrap kernel-internal data structures and expose them to user space.
> [Video walkthrough of the question](https://youtu.be/Fo-3MtrXr3E?t=7053)
> - [File system concepts and implementation techniques](https://beta.hackfoldr.org/linux/https%253A%252F%252Fhackmd.io%252Fs%252FBypqEyF6N)
> - [sysfs.txt](https://github.com/torvalds/linux/blob/c309b6f24222246c18a8b65d3950e6e755440865/Documentation/translations/zh_CN/filesystems/sysfs.txt)
> - [sysfs example](https://hackmd.io/@colinyoyo26/2020q1fibdrv#%E5%BE%9E-sysfs-%E7%95%8C%E9%9D%A2%E8%AE%80%E5%8F%96%E5%9F%B7%E8%A1%8C%E6%99%82%E9%96%93)

### Copying the output into user space memory

First version of bignum fast doubling completed: [b721239](https://github.com/ldotrg/fibdrv/commit/b721239cca3c508ada83d77ba17149d5b0b567d1)

> It is not yet clear what could take over the role of `FILE` inside the kernel.

```
void bn_fprint(const bn *n, unsigned int base, FILE *fp);
```

# Performance analysis in three parts

## Performance - Sampling

![](https://i.imgur.com/42cx4hc.png)

- The jitter in the bignum results is worth noting.

![](https://i.imgur.com/8elX5Ef.png)
![](https://i.imgur.com/Z5LOtrS.png)

:::danger
The build fails on Ubuntu Linux 18.04-4 with Linux node1 4.15.0-91-generic:
```
In file included from /tmp/fibdrv/bignum_k/format.c:1:0:
./arch/x86/include/asm/fpu/api.h:28:8: error: unknown type name ‘bool’
 extern bool irq_fpu_usable(void);
        ^~~~
./arch/x86/include/asm/fpu/api.h:37:30: error: unknown type name ‘u64’
 extern int cpu_has_xfeatures(u64 xfeatures_mask, const char **feature_name);
                              ^~~
```
:notes: jserv
:::

> Fixed in commit [6e95b9](https://github.com/ldotrg/fibdrv/commit/6e95b90c4ceea9bfa4ff05d0e327cf0cb82629a9)
> While fixing the compile error caused by floating-point arithmetic, two approaches were tried and the second one was chosen in the end, but the header used by the first was never removed.

```shell
bignum_k/format.c:143:67: error: SSE register return with SSE disabled
  return (size_t)(radix_sizes[radix] * (size * APM_DIGIT_SIZE)) + 2;
```

> 1. Any region of code that must use floating point has to be guarded by `kernel_fpu_begin(void)` and `kernel_fpu_end(void)`, which requires `#include <asm/fpu/api.h>`. There still seemed to be other problems with this approach.
> 2. Add `-msse2` directly to the compile flags.

## Performance - Analysis

## Performance - Improvement

## References

- [2020q1 quiz2](https://hackmd.io/@sysprog/linux2020-quiz2#2020q1-%E7%AC%AC-2-%E9%80%B1%E6%B8%AC%E9%A9%97%E9%A1%8C)
- [folly string](https://github.com/facebook/folly/blob/master/folly/FBString.h#L208)
- [Self-check list](https://hackmd.io/@sysprog/linux2020-fibdrv#%E8%87%AA%E6%88%91%E6%AA%A2%E6%9F%A5%E6%B8%85%E5%96%AE)
- [Assignment requirements](https://hackmd.io/@sysprog/linux2020-fibdrv#-%E4%BD%9C%E6%A5%AD%E8%A6%81%E6%B1%82)
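
As a user-space companion to the digit-per-byte `add_BigN` baseline at the top of this note, the sketch below generates reference values that the driver output could be compared against. It is only an illustration under stated assumptions: the names `ref_bign`, `ref_add`, `ref_fib` and `REF_MAX_DIGITS` are hypothetical and do not come from the fibdrv repository, and the capacity of 256 digits is an assumption sized for F(1000), which has 209 decimal digits.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed capacity; digits beyond this are silently dropped. */
#define REF_MAX_DIGITS 256

/* Little-endian decimal number: val[0] is the ones digit. */
struct ref_bign {
    uint8_t val[REF_MAX_DIGITS];
};

/* out = x + y, one decimal digit per byte (same idea as add_BigN). */
static void ref_add(struct ref_bign *out,
                    const struct ref_bign *x,
                    const struct ref_bign *y)
{
    uint8_t carry = 0;
    for (int i = 0; i < REF_MAX_DIGITS; i++) {
        uint8_t tmp = (uint8_t)(x->val[i] + y->val[i] + carry); /* <= 19 */
        out->val[i] = tmp % 10;
        carry = tmp / 10;
    }
}

/*
 * Write F(n) as a NUL-terminated decimal string into buf.
 * buf must hold at least REF_MAX_DIGITS + 1 bytes.
 */
static void ref_fib(unsigned int n, char *buf)
{
    struct ref_bign a, b, t;
    memset(&a, 0, sizeof(a)); /* a = F(0) = 0 */
    memset(&b, 0, sizeof(b));
    b.val[0] = 1;             /* b = F(1) = 1 */

    for (unsigned int i = 0; i < n; i++) {
        ref_add(&t, &a, &b);  /* t = F(i) + F(i + 1) */
        a = b;
        b = t;
    }

    /* Skip leading zeros but always keep at least one digit. */
    int top = REF_MAX_DIGITS - 1;
    while (top > 0 && a.val[top] == 0)
        top--;

    int len = 0;
    for (int i = top; i >= 0; i--)
        buf[len++] = (char)('0' + a.val[i]);
    buf[len] = '\0';
}
```

Calling `ref_fib(100, buf)` with a `char buf[REF_MAX_DIGITS + 1]` yields the string `354224848179261915075`, which can be compared against what the client reads from the driver for the same offset.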