
# 2019q1 Homework2 (kcalc)
contributed by < `jeffcarl67` >

## Environment
* Linux 4.15.0-45-generic #48~16.04.1-Ubuntu SMP
* gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
* Kernel code is quoted from [linux v5.0](https://elixir.bootlin.com/linux/v5.0/source)

## Assignment requirements
* Answer every question in the "Self-check list", attaching the relevant references and the necessary code; first-hand material (including experiments of your own design) is preferred
* Fork [kcalc](https://github.com/sysprog21/kcalc) on GitHub. The main goal is to integrate `MathEx` into `calc.c` (in the form of an LKM), completing the following along the way:
    * Replace the floating-point arithmetic in `MathEx` with fixed point. Verify it at user level first, then move it into the Linux kernel. Features of `MathEx` may be removed where appropriate, but the removal must be fully explained;
    * Quantify the execution time of `MathEx` in the kernel, using the utility programs written for [fibdrv](https://hackmd.io/s/SJ2DKLZ8E);
    * Design micro-benchmarking experiments to verify how `MathEx` behaves after being ported to the Linux kernel;
    * Try to interpret the resulting time distribution, in particular the effect on the Linux kernel as the Fibonacci sequence grows;
    * Improve the execution efficiency of `MathEx`;
    * Make good use of performance analysis tools such as [perf](http://wiki.csie.ncku.edu.tw/embedded/perf-tutorial);

## Self-check list
* Explain why floating-point arithmetic needs special treatment in the Linux kernel. What must be taken care of when a context switch involves the FPU/SIMD context?
    * Hint: see [Lazy FP state restore](https://en.wikipedia.org/wiki/Lazy_FP_state_restore) and the references above
    * You should write a Linux kernel module that contains floating-point arithmetic, then actually compile and test it
* The given `calc.c` has a character device just like [fibdrv](https://hackmd.io/s/SJ2DKLZ8E), but registers it with a different kernel API (`register_chrdev` vs. `alloc_chrdev_region`). Can you explain the difference and when each is appropriate?

Reading the source shows that `register_chrdev` actually calls `__register_chrdev`. Both `__register_chrdev` and `alloc_chrdev_region` are defined in `linux/fs/char_dev.c`. The kernel-doc comments of the two functions are:

* `__register_chrdev`

```clike
/**
 * __register_chrdev() - create and register a cdev occupying a range of minors
 * @major: major device number or 0 for dynamic allocation
 * @baseminor: first of the requested range of minor numbers
 * @count: the number of minor numbers required
 * @name: name of this range of devices
 * @fops: file operations associated with this devices
 *
 * If @major == 0 this functions will dynamically allocate a major and return
 * its number.
 *
 * If @major > 0 this function will attempt to reserve a device with the given
 * major number and will return zero on success.
 *
 * Returns a -ve errno on failure.
 *
 * The name of this device has nothing to do with the name of the device in
 * /dev. It only helps to keep track of the different owners of devices. If
 * your module name has only one type of devices it's ok to use e.g. the name
 * of the module here.
 */
```

* `alloc_chrdev_region`

```clike
/**
 * alloc_chrdev_region() - register a range of char device numbers
 * @dev: output parameter for first assigned number
 * @baseminor: first of the requested range of minor numbers
 * @count: the number of minor numbers required
 * @name: the name of the associated device or driver
 *
 * Allocates a range of char device numbers.  The major number will be
 * chosen dynamically, and returned (along with the first minor number)
 * in @dev.  Returns zero or a negative error code.
 */
```

Comparing the comments with the code shows that both functions can allocate a major device number dynamically, and both call `__register_chrdev_region` to obtain the device number. The difference is that `__register_chrdev` additionally allocates and registers a `struct cdev`, whereas `alloc_chrdev_region` only reserves the device numbers and leaves setting up the `struct cdev` to the caller.

* In `scripts/test.sh` there is a command `sudo chmod 0666`. What does it do, and how does it help our testing? Can a similar change be applied to the `/dev/fibonacci` device file created by [fibdrv](https://hackmd.io/s/SJ2DKLZ8E)? Also, explain the role of a device file at the kernel level and at the user level.

Running `sudo chmod 0666` makes the given file readable and writable by everyone, so any user can read from and write to the device directly. Checking the permissions of `/dev/fibonacci`:

```shell
$ ls -l /dev/fibonacci
crw------- 1 root root 242, 0 3月 19 05:16 /dev/fibonacci
```

shows that only root may read and write `/dev/fibonacci`. Every run of `client` therefore needs `sudo` to raise the privileges of an ordinary user, which is inconvenient during testing. After applying the same command to `/dev/fibonacci`:

```shell
$ sudo chmod 0666 /dev/fibonacci
$ ls -l /dev/fibonacci
crw-rw-rw- 1 root root 242, 0 3月 19 05:22 /dev/fibonacci
```

`client` can be run without `sudo`.

* `calc.c` uses the kernel API `copy_to_user`. What does it do, and what is it used for in this example? If the amount of data grows, will performance be seriously affected?

`copy_to_user` is defined in `linux/include/linux/uaccess.h`. It eventually calls `raw_copy_to_user`, defined in `linux/arch/x86/include/asm/uaccess_64.h`, to do the actual work. On x86, when the size is a small compile-time constant, this function uses the macros below to pick a different x86 move instruction according to the length of the data; for example, for 1 byte of data the macro expands to the instruction `movb %b0,%1`. For other sizes it falls back to `copy_user_generic`.

```cpp
#define __put_user_goto(x, addr, itype, rtype, ltype, label)	\
	asm_volatile_goto("\n"						\
		"1:	mov"itype" %"rtype"0,%1\n"			\
		_ASM_EXTABLE_UA(1b, %l2)				\
		: : ltype(x), "m" (__m(addr))				\
		: : label)

#define __put_user_failed(x, addr, itype, rtype, ltype, errret)	\
	({	__label__ __puflab;					\
		int __pufret = errret;					\
		__put_user_goto(x,addr,itype,rtype,ltype,__puflab);	\
		__pufret = 0;						\
	__puflab: __pufret; })

#define __put_user_asm(x, addr, retval, itype, rtype, ltype, errret)	do {	\
	retval = __put_user_failed(x, addr, itype, rtype, ltype, errret);	\
} while (0)
```

* Find at least 3 uses of fixed-point numbers in the Linux kernel and explain them with code
    * Hint: see [Linux Kernel Load Average 計算分析](http://brytonlee.github.io/blog/2014/05/07/linux-kernel-load-average-calc/), [What Every Computer Scientist Should Know About Floating-Point Arithmetic](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html), [Load average explained](https://wiki.nix-pro.com/view/Load_average_explained)
* Do you know how MathEx parses a given string to separate the variables from the corresponding mathematical notation?
* How can `MathEx` be profiled? Can you find a similar math expression library to analyze and compare against?
    * Hint: see [muparserSSE - A Math Expression Compiler](http://beltoforion.de/article.php?a=muparsersse)
* What does the `vec`-related code in `expression.[ch]` of the `MathEx` source mainly do? Do you notice any technique similar to the ones used in [list](https://hackmd.io/s/S12jCWKHN)?
    * Hint: see the test cases in `mathex/test-unit.c`
* What real-world applications do math expression evaluators like `MathEx` have? Is there even similar code inside the Linux kernel?
    * Hint: see [A thorough introduction to eBPF](https://lwn.net/Articles/740157/)
* When moving a user-level C program into the Linux kernel as a kernel module (LKM), which issues need attention? Give examples
    * Hint: pay attention to the definition of the `__KERNEL__` macro, the use of [kmalloc](https://www.kernel.org/doc/htmldocs/kernel-api/API-kmalloc.html), and the use of [vmalloc](https://www.kernel.org/doc/htmldocs/kernel-api/API-vmalloc.html) (and the difference between the latter two)
* How does [fixed point arithmetic for RT-Linux](http://netwinder.osuosl.org/pub/netwinder/docs/nw/rt_fixed/doc/html/rt_fix.html) work? The given code is more than 20 years old and many details no longer match; can you try porting it to Linux `v4.15+`?
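
The assignment asks for the floating-point arithmetic in `MathEx` to be replaced with fixed point and verified at user level before moving into the kernel. The block below is a minimal user-space sketch of that idea, assuming a signed Q16.16 layout (16 integer bits, 16 fractional bits). The names `fix16_t`, `FIX_SHIFT`, `FIX_ONE`, `fix_from_int`, `fix_to_int`, `fix_mul`, `fix_div` and `fix_frac_hundredths` are hypothetical and are not taken from `MathEx`, kcalc, or the RT-Linux library; the format choice is also an assumption, not the one the homework prescribes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical Q16.16 format: value = raw / 2^16. */
#define FIX_SHIFT 16
#define FIX_ONE ((int32_t) 1 << FIX_SHIFT)

typedef int32_t fix16_t;

/* Convert an integer to fixed point; the integer part must fit in 16 bits. */
static inline fix16_t fix_from_int(int32_t n)
{
    assert(n >= -32768 && n <= 32767);
    return (fix16_t) (n * FIX_ONE);
}

/* Drop the fractional part, truncating toward zero. */
static inline int32_t fix_to_int(fix16_t x)
{
    return x / FIX_ONE;
}

/* Multiply: widen to 64 bits so the intermediate product cannot overflow,
 * then scale back by 2^16. The caller must keep the result in range. */
static inline fix16_t fix_mul(fix16_t a, fix16_t b)
{
    return (fix16_t) (((int64_t) a * b) / FIX_ONE);
}

/* Divide: pre-scale the dividend by 2^16 so the quotient keeps its
 * fractional bits. Division by zero is rejected. */
static inline fix16_t fix_div(fix16_t a, fix16_t b)
{
    assert(b != 0);
    return (fix16_t) (((int64_t) a * FIX_ONE) / b);
}

/* Fractional part of a non-negative value, in hundredths (0..99),
 * useful for printing "2.25" without any floating-point formatting. */
static inline int32_t fix_frac_hundredths(fix16_t x)
{
    assert(x >= 0);
    return (int32_t) (((int64_t) (x & (FIX_ONE - 1)) * 100) >> FIX_SHIFT);
}
```

With these helpers, 1.5 is stored as `FIX_ONE + FIX_ONE / 2`, and squaring it with `fix_mul` gives a value whose integer part is 2 and whose fractional part is 25 hundredths. If code like this were moved into a kernel module, `assert` would have to be replaced by explicit error returns, and on 32-bit targets the 64-bit division would likely need a helper such as `div64_s64` from `linux/math64.h`.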