# 2024q1 Homework6 (integration)

## Self-check list

- [x] Study the ==Linux performance analysis== material described earlier, run GNU/Linux on your own physical machine, and complete the necessary setup and preparation $\to$ this should also make clear why the experiments should not be run in a virtual machine;
- [x] Read "[How Linux Kernel Modules Work](https://hackmd.io/@sysprog/linux-kernel-module)" and, with reference to the Linux kernel source (v6.1+), explain how the kernel finds a module's symbols after `insmod` (using the List API), what effect the license specified by the `MODULE_LICENSE` macro has on the kernel (whether it is GPL determines which symbols are available), and, by tracing the loading of a kernel module with [strace](https://man7.org/linux/man-pages/man1/strace.1.html), which system calls and subsystems are involved.
  > The code listed in "[How Linux Kernel Modules Work](https://hackmd.io/@sysprog/linux-kernel-module)" is dated; feel free to edit the page and update it to Linux v6.1 or later.
- [ ] Read *[The Linux Kernel Module Programming Guide](https://sysprog21.github.io/lkmpg/)* (LKMPG), explain how mutex locks are used in the [simrupt](https://github.com/sysprog21/simrupt) code, and discuss whether it can be rewritten to be [lock-free](https://hackmd.io/@sysprog/concurrency-lockfree);
  > See the [2021 notes](https://hackmd.io/@linD026/simrupt-vwifi). Contributions to LKMPG are welcome!
  > $\to$ Read alongside "[Concurrency and Multithreaded Programming](https://hackmd.io/@sysprog/concurrency)"
- [ ] Examine the average number of comparisons made while sorting by Timsort, Pattern Defeating Quicksort (pdqsort), and the Linux kernel's [lib/sort.c](https://github.com/torvalds/linux/blob/master/lib/sort.c), and provide the corresponding mathematical proofs;
  > Compare with the analysis and benchmarking methods of [fluxsort](https://github.com/scandum/fluxsort) and [crumsort](https://github.com/scandum/crumsort)
- [ ] Study the [CMWQ](https://www.kernel.org/doc/html/latest/core-api/workqueue.html) (Concurrency Managed Workqueue) documentation and compare it against the runtime behavior of the [simrupt](https://github.com/sysprog21/simrupt) project. Note that worker-pools can be of the "Bound" type, which assigns and restricts specific workers to designated CPUs. How does the Linux kernel achieve this? How do the worker threads associated with CMWQ interact with the CPU scheduler?
  > Read alongside *Demystifying the Linux CPU Scheduler*
- [ ] Explain how `xoroshiro128+` works (with reference to the paper "[Scrambled Linear Pseudorandom Number Generators](https://arxiv.org/pdf/1805.01407.pdf)"), and use the `xoro` kernel module provided by [ksort](https://github.com/sysprog21/ksort) to compare its speed with the kernel's built-in `/dev/random` and `/dev/urandom`. Does `xoroshiro128+` have a speed advantage? What are its weaknesses?
  > $\to$ Read alongside: [Not-so-random "random numbers"](https://blog.cruciferslab.net/?p=599)
- [ ] Explain how [ksort](https://github.com/sysprog21/ksort) uses CMWQ to sort concurrently;

## Reading "[How Linux Kernel Modules Work](https://hackmd.io/@sysprog/linux-kernel-module)"

### `insmod`

The `insmod` command loads a module into the kernel dynamically. This keeps the kernel itself small and makes it more flexible, because modules can be written for specific needs. If every module were instead compiled statically into the kernel, the kernel image would become very large, and the whole kernel would have to be rebuilt each time a module is added, which is time-consuming.

### How does the kernel find a Linux kernel module's symbols (using the List API)?

In [module/main.c](https://github.com/torvalds/linux/blob/5eb4573ea63d0c83bf58fb7c243fc2c2b6966c02/kernel/module/main.c#L302), the function `find_symbol` first searches the kernel's own exported-symbol tables and then uses `list_for_each_entry_rcu` to walk the list of loaded modules, searching each module's exported symbols.

### What effect does the license specified by the `MODULE_LICENSE` macro have on the kernel?

[linux/module.h](https://github.com/torvalds/linux/blob/5eb4573ea63d0c83bf58fb7c243fc2c2b6966c02/include/linux/module.h#L186) documents the license strings accepted by `MODULE_LICENSE`:

```
/*
 * The following license idents are currently accepted as indicating free
 * software modules
 *
 *	"GPL"				[GNU Public License v2]
 *	"GPL v2"			[GNU Public License v2]
 *	"GPL and additional rights"	[GNU Public License v2 rights and more]
 *	"Dual BSD/GPL"			[GNU Public License v2
 *					 or BSD license choice]
 *	"Dual MIT/GPL"			[GNU Public License v2
 *					 or MIT license choice]
 *	"Dual MPL/GPL"			[GNU Public License v2
 *					 or Mozilla license choice]
 *
 * The following other idents are available
 *
 *	"Proprietary"			[Non free products]
 *
 * Both "GPL v2" and "GPL" (the latter also in dual licensed strings) are
 * merely stating that the module is licensed under the GPL v2, but are not
 * telling whether "GPL v2 only" or "GPL v2 or later". The reason why there
 * are two variants is a historic and failed attempt to convey more
 * information in the MODULE_LICENSE string. For module loading the
 * "only/or later" distinction is completely irrelevant and does neither
 * replace the proper license identifiers in the corresponding source file
 * nor amends them in any way. The sole purpose is to make the
 * 'Proprietary' flagging work and to refuse to bind symbols which are
 * exported with EXPORT_SYMBOL_GPL when a non free module is loaded.
 *
 * In the same way "BSD" is not a clear license information. It merely
 * states, that the module is licensed under one of the compatible BSD
 * license variants. The detailed and correct license information is again
 * to be found in the corresponding source files.
 *
 * There are dual licensed components, but when running with Linux it is the
 * GPL that is relevant so this is a non issue. Similarly LGPL linked with GPL
 * is a GPL combined work.
 *
 * This exists for several reasons
 * 1.	So modinfo can show license info for users wanting to vet their setup
 *	is free
 * 2.	So the community can ignore bug reports including proprietary modules
 * 3.	So vendors can do likewise based on their own policies
 */
```

One sentence in this comment reads: `The sole purpose is to make the 'Proprietary' flagging work and to refuse to bind symbols which are exported with EXPORT_SYMBOL_GPL when a non free module is loaded.` In other words, if a module's `MODULE_LICENSE` is not one of the accepted free-software licenses, the kernel refuses to bind that module to any symbol exported with `EXPORT_SYMBOL_GPL`, which preserves the intent of the GPL. What is the GPL?

> The GNU General Public License (GNU GPL or GPL) is a widely used free software license that gives end users the freedom to run, study, share, and modify the software. The license was originally written by Richard Stallman of the Free Software Foundation for the GNU Project, and it grants users of a computer program the rights of the Free Software Definition. The GPL is a copyleft license, which means that as long as some part of a project (such as a dynamically linked library) is released under the GPL, the whole project and its derivative works may only be distributed under the same license terms.

As the Wikipedia explanation quoted above says, the GPL is a [Copyleft](https://zh.wikipedia.org/wiki/Copyleft) license: GPL software may be freely used and modified, and derivative works must also be released under the GPL.

### Tracing kernel module loading with [strace](https://man7.org/linux/man-pages/man1/strace.1.html): which system calls and subsystems are involved?

Use the following command to trace the loading of a kernel module:

```
sudo strace insmod hello.ko
```

```
openat(AT_FDCWD, "/home/jimmylu/linux2024/linux2024_lab/hello-word-linux-module/hello.ko", O_RDONLY|O_CLOEXEC) = 3
mmap(NULL, 192416, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ffff7e78000
finit_module(3, "", 0) = 0
```

- `openat`: opens the module file (as well as other related system files)
- `mmap`: maps the file's contents read-only (`PROT_READ`) into the memory of the `insmod` process so they can be read
- `finit_module`: passes the open file descriptor to the kernel, which loads the module and runs its initialization function

### A first look at character devices

Following the [character-device](http://derekmolloy.ie/writing-a-linux-kernel-module-part-2-a-character-device/) tutorial listed by the instructor, a character device can be used to pass data from user space to kernel space. Suppose the module is named CDD; its `module_init` function performs the following steps:

```c
// 1. Register the character device (driver); passing 0 requests a dynamically allocated major number
majorNumber = register_chrdev(0, DEVICE_NAME, &cdd_fops);
// 2. Register the device class (the single-argument form is for Linux v6.4 and later)
cddClass = class_create(CLASS_NAME);
// 3. Create the device (file) with minor number 0
cddDevice = device_create(cddClass, NULL, MKDEV(majorNumber, 0), NULL, DEVICE_NAME);
```

Registering the driver with `register_chrdev` adds it to `/proc/devices`. The driver implements the members of `struct file_operations` (such as `read` and `write`) according to what it needs to do. Here 511 is the major number; the kernel uses the major number to look up the corresponding driver.

```
$ cat /proc/devices
Character devices:
  1 mem
  4 /dev/vc/0
  4 tty
  4 ttyS
...
511 cddchar
```

Creating the device with `device_create` creates an actual device file under `/dev/`. The leading `c` in the listing marks it as a character device; `cddchar` has major number 511 and minor number 0. When a user-space program operates on this device, the kernel uses the major number to find the corresponding driver and performs the requested operation.

```
$ ls /dev/ -l
total 0
crw------- 1 root root 511, 0 四 27 19:50 cddchar
```

In a user-space program, a `write` to `/dev/cddchar` is routed to the `write` function implemented in the driver (`cdd_fops.write()`).

![image](https://hackmd.io/_uploads/SySl1d5ZR.png)
> [Image source](https://www.quora.com/Is-device-driver-programming-good-for-career-setting)

```c
const char *filename = "/dev/cddchar";
int fd = open(filename, O_RDWR);
// This write is handled by the driver's cdd_fops.write()
ssize_t ret = write(fd, stringToSend, strlen(stringToSend));
```

> [Full program](https://github.com/jimmylu890303/linux2024_lab/tree/main/character-device)

One question remains open. When a user-space program operates on a device such as `/dev/tty0`, the kernel routes the request to the corresponding driver by major number. These devices share a major number but all have different minor numbers, so how is the minor number used to perform the right operation?

```
$ cat /proc/devices
Character devices:
  4 tty
  4 ttyS
$ ls /dev -l
crw--w---- 1 root tty 4, 0 四 27 19:49 tty0
crw--w---- 1 root tty 4, 1 四 27 19:49 tty1
crw--w---- 1 root tty 4, 10 四 27 19:49 tty10
crw--w---- 1 root tty 4, 11 四 27 19:49 tty11
crw--w---- 1 root tty 4, 12 四 27 19:49 tty12
crw--w---- 1 root tty 4, 13 four 27 19:49 tty13
...
```
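
A possible answer to the minor-number question, sketched in user space under stated assumptions: the device number stored in a device file's inode encodes both the major and the minor number, and a driver that serves several minors commonly reads the minor in its `open` handler (in the kernel, for example with `iminor(inode)`) to select per-device state. The macros below mirror `MINORBITS`, `MAJOR`, `MINOR`, and `MKDEV` from `include/linux/kdev_t.h`. The `tty_state` structure, the `tty_table` array, `NR_TTYS`, and the `select_tty` helper are hypothetical names chosen for illustration and are not the real tty driver code.

```c
#include <stddef.h>

/* Mirrors include/linux/kdev_t.h: 12 major bits and 20 minor bits. */
#define MINORBITS 20
#define MINORMASK ((1U << MINORBITS) - 1)
#define MAJOR(dev) ((unsigned int) ((dev) >> MINORBITS))
#define MINOR(dev) ((unsigned int) ((dev) & MINORMASK))
#define MKDEV(ma, mi) (((unsigned int) (ma) << MINORBITS) | (unsigned int) (mi))

#define TTY_MAJOR 4 /* matches the "4 tty" entry in /proc/devices */
#define NR_TTYS 64  /* hypothetical table size for this sketch */

/* Hypothetical per-device state: one slot per minor number. */
struct tty_state {
    unsigned int minor_nr;
    int open_count;
};

static struct tty_state tty_table[NR_TTYS];

/*
 * Hypothetical stand-in for a driver's open handler. In the kernel the
 * handler receives a struct inode and could call iminor(inode); here the
 * device number is passed in directly so the sketch runs in user space.
 */
static struct tty_state *select_tty(unsigned int dev)
{
    if (MAJOR(dev) != TTY_MAJOR || MINOR(dev) >= NR_TTYS)
        return NULL; /* not a device this driver serves */

    struct tty_state *st = &tty_table[MINOR(dev)];
    st->minor_nr = MINOR(dev);
    st->open_count++;
    return st;
}
```

With the listing above, `select_tty(MKDEV(4, 13))` would pick the slot for `tty13`, while `MKDEV(511, 0)` (the `cddchar` device) is rejected because its major number belongs to a different driver. Indexing a table by minor number is only one way to organize per-device state; it is used here because it keeps the lookup to a single array access.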