2024q1 Homework6 (integration)

# 2024q1 Homework6 (integration) contributed by < `LindaTing0106` > ## 自我檢查清單 ## 閱讀〈[Linux 核心模組運作原理](https://hackmd.io/@sysprog/linux-kernel-module)〉 ### MODULE_LICENSE 巨集指定的授權條款核心有什麼影響 :::danger 注意用語：code 作為原始程式碼的語境，翻譯為「程式碼」 ::: 需要在<s>代碼</s> 程式碼中加入 MODULE_LICENSE ，否則程式連編譯都無法。 ``` modpost: missing MODULE_LICENSE() ``` :::danger 為何有此機制？ ::: 根據 [Linux kernel licensing rules](https://docs.kernel.org/process/license-rules.html#id1) ，需要有 MODULE_LICENSE 的目的在於識別此模組為自由軟體亦或是專有軟體。在 [linux/module.h](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/include/linux/module.h) 裡頭可以找到可定義的 License ： <s>免費</s> 自由軟體為： - "GPL" [GNU Public License v2] - "GPL v2" [GNU Public License v2] - "GPL and additional rights" [GNU Public License v2 rights and more] - "Dual BSD/GPL" [GNU Public License v2 or BSD license choice] - "Dual MIT/GPL" [GNU Public License v2 or MIT license choice] - "Dual MPL/GPL" [GNU Public License v2 or Mozilla license choice] <s>不是免費</s> 專有軟體： - "Proprietary" [Non free products] 當一模組被標示為專有軟體，則在讀取此模組時核心會將此模組標示為 P ，並且不允許使用 EXPORT_SYMBOL_GPL() 導出的 symbols 。 :::danger 關鍵不是「免費」，參照第三週教材，從 Revolution OS 紀錄片得知授權條款背後的精神。 ::: :::info 根據第三週教材， free software 指的是有散播原始程式碼的自由。確保你可取得原始碼，確保你可以修改該軟體並使用它為基礎創造新的自由軟體；以及確保你認知到你擁有這些權力。 ::: ### insmod 後，Linux 核心模組的符號 (symbol) 如何被 Linux 核心找到在程式中有 `module_init(init_fib_dev)` ，而此處 gcc 在編譯後會將我們傳進去的參數設為 `int init_module(void)` 的別名。 `module_init` 檢查傳入的函式，回傳值是否正確，並把 `init_module` 和傳入的函式關聯起來，因為 `insmod` 指令實作內部會呼叫 `init_module`。如此一來呼叫 `init_module` 就等同於呼叫我們自己寫的函式。而因為我們在使用 `strace` 追蹤 `insmod` 會做什麼行為時，可以看到其會呼叫 `finit_module` ，而後再進入到 `load_module` ，則在 `load_module` 內部程式碼中可以看到在執行 `do_init_module` 之前，會先做 `layout_and_allocate` 為載入的 module 進行記憶體配置，而進入 `do_init_module` 後，將再執行 `do_one_initcall` ，而 `init_function` 就在 `do_one_initcall` 被執行到。 :::danger 你如何確認？用第七週介紹的 UML 和 QEMU 來追蹤。 ::: :::info 根據 [建構 User-Mode Linux 的實驗環境](https://hackmd.io/@sysprog/user-mode-linux-env) ，在 gdb 設好幾個中斷點後，將 `hello.ko` 掛載後顯示以下訊息。 ``` Thread 1 "vmlinux" hit Breakpoint 1, load_module (info=info@entry=0x609efdd8, uargs=uargs@entry=0x40097ca0 <error: Cannot access memory at address 0x40097ca0>, flags=flags@entry=0) at kernel/module.c:3875 3875 { (gdb) Continuing. Thread 1 "vmlinux" hit Breakpoint 2, layout_and_allocate (flags=0, info=0x609efdd8) at kernel/module.c:3488 3488 err = check_modinfo(info->mod, info, flags); (gdb) Continuing. Thread 1 "vmlinux" hit Breakpoint 3, layout_and_allocate (flags=<optimized out>, info=0x609efdd8) at kernel/module.c:3540 3540 return mod; (gdb) p mod $17 = (struct module *) 0x64822140 (gdb) c Continuing. Thread 1 "vmlinux" hit Breakpoint 6, load_module (info=info@entry=0x609efdd8, uargs=uargs@entry=0x40097ca0 <error: Cannot access memory at address 0x40097ca0>, flags=flags@entry=0) at kernel/module.c:4050 4050 return do_init_module(mod); (gdb) Continuing. Thread 1 "vmlinux" hit Breakpoint 7, do_init_module (mod=mod@entry=0x64822140) at kernel/module.c:3635 3635 { (gdb) p mod $18 = (struct module *) 0x64822140 (gdb) c Continuing. Thread 1 "vmlinux" hit Breakpoint 8, do_one_initcall (fn=0x64822000) at init/main.c:1217 1217 { (gdb) c Continuing. Thread 1 "vmlinux" hit Breakpoint 11, do_one_initcall (fn=0x64822000) at init/main.c:1226 1226 ret = fn(); (gdb) n Hello World! - init ``` 在進入 `do_init_module` 前 `load_module` 會先設置好變數 `mod` 的值，而在進入 `layout_and_allocate` 配置此模組在記憶體中的位置，而後 mod 的值已經為 `0x64822140` 了，而在 `do_init_module` 中會將 `mod->init` 也就是 `0x64822000` 傳入 `do_one_initcall` 中，而後在此函式中去呼叫 `fn()` ，則顯示出 `Hello World! - init` 。 ::: :::warning - 沒有看到 mod->init 是在哪裡被賦予值。 ::: ### [strace](https://man7.org/linux/man-pages/man1/strace.1.html) 追蹤 Linux 核心的掛載，涉及哪些些系統呼叫和子系統？ <s> - execve() ：執行呼叫的程式。 - mmap() ：把文件內容映射到一段虛擬記憶體上，通過對這段記憶體的讀取和修改，實現對文件的讀取和修改。 - openat() ：當傳給函數的路徑名是絕對路徑時， open 與 openat 無區別。 </s> :::danger 注意用語： * file 是「檔案」，而非「文件」(document) * 研讀第一手材料，避免低劣的簡體中文翻譯文章 ::: ## 閱讀《The Linux Kernel Module Programming Guide》(LKMPG) ### 解釋 simrupt 程式碼裡頭的 mutex lock 的使用方式 simrupt 是由 simulate 和 interrupt 二個單字組合而來。在資料傳入使用者層級之前，資料會先儲存在 `rx_fifo` 中。 simrupt 中設了三個 lock ：分別為 read_lock 、 producer_lock 和 consumer_lock 。而每次做讀取時，當前執行緒就會先將 read_lock 鎖起來，防止別的執行緒重複做到以下程式片段。 ```c do { ret = kfifo_to_user(&rx_fifo, buf, count, &read); if (unlikely(ret < 0)) break; if (read) break; if (file->f_flags & O_NONBLOCK) { ret = -EAGAIN; break; } ret = wait_event_interruptible(rx_wait, kfifo_len(&rx_fifo)); } while (ret == 0); pr_debug("simrupt: %s: out %u/%u bytes\n", __func__, read, kfifo_len(&rx_fifo)); ``` 而在 `simrupt_work_func` 中則使用到了 `producer_lock` 和 `consumer_lock` 在這邊 `consumer_lock` 在取 `circular buffer` 前會先鎖住 `consumer_lock` ，以防同時有別的執行緒對 `circular buffer` 進行存取，而在當前執行緒取得值後，便將 `consumer_lock` 解鎖，之後程式繼續運行，而同樣在值要放入 `kfifo buffer` 時，會將 `producer_lock` 上鎖，而在存入值後便將 `producer_lock` <s>鬆開</s> 解鎖。這裡由於從 `circular buffer` 存取值和將值放入 `kfifo buffer` 並不衝突，故 `consumer_lock` 和 `producer_lock` 不會互相影響。 :::danger 什麼「鬆開」？使用精準的詞彙。不要「舉燭」，你能從實驗確認你對程式碼的認知嗎？ ::: ## 探討 Timsort, Pattern Defeating Quicksort (pdqsort) 及 Linux 核心 [lib/sort.c](https://github.com/torvalds/linux/blob/master/lib/sort.c) 在排序過程中的平均比較次數在 [fluxsort](https://github.com/scandum/fluxsort) 中可以看到作者是使用 `bench.c` 檔去觀察不同排序演算法的各種比較，利用原作者提供的檔案我們可以測試 timsort 和 pdqsort 的效能。這是執行出來的結果。 ``` | Name | Items | Type | Best | Average | Compares | Samples | Distribution | | --------- | -------- | ---- | -------- | -------- | --------- | ------- | ---------------- | | timsort | 100000 | 32 | 0.008617 | 0.008822 | 1532560 | 10 | random order | | pdqsort | 100000 | 32 | 0.006241 | 0.006335 | 1875709 | 10 | random order | | | | | | | | | | | timsort | 100000 | 32 | 0.005844 | 0.005937 | 1050602 | 10 | random % 100 | | pdqsort | 100000 | 32 | 0.002654 | 0.002724 | 815320 | 10 | random % 100 | | | | | | | | | | | timsort | 100000 | 32 | 0.000144 | 0.000148 | 99999 | 10 | ascending order | | pdqsort | 100000 | 32 | 0.000299 | 0.000300 | 200010 | 10 | ascending order | | | | | | | | | | | timsort | 100000 | 32 | 0.001104 | 0.001175 | 299994 | 10 | ascending saw | | pdqsort | 100000 | 32 | 0.006847 | 0.006915 | 3082227 | 10 | ascending saw | | | | | | | | | | | timsort | 100000 | 32 | 0.000647 | 0.000671 | 200001 | 10 | pipe organ | | pdqsort | 100000 | 32 | 0.004417 | 0.004517 | 2472304 | 10 | pipe organ | | | | | | | | | | | timsort | 100000 | 32 | 0.000145 | 0.000154 | 99999 | 10 | descending order | | pdqsort | 100000 | 32 | 0.000429 | 0.000441 | 300032 | 10 | descending order | | | | | | | | | | | timsort | 100000 | 32 | 0.001172 | 0.001186 | 300010 | 10 | descending saw | | pdqsort | 100000 | 32 | 0.007551 | 0.007720 | 3219752 | 10 | descending saw | | | | | | | | | | | timsort | 100000 | 32 | 0.002413 | 0.002447 | 508111 | 10 | random tail | | pdqsort | 100000 | 32 | 0.004687 | 0.004765 | 1850433 | 10 | random tail | | | | | | | | | | | timsort | 100000 | 32 | 0.004718 | 0.004807 | 866402 | 10 | random half | | pdqsort | 100000 | 32 | 0.005524 | 0.005677 | 1953722 | 10 | random half | | | | | | | | | | | timsort | 100000 | 32 | 0.001120 | 0.001140 | 461648 | 10 | ascending tiles | | pdqsort | 100000 | 32 | 0.003520 | 0.003560 | 1713633 | 10 | ascending tiles | | | | | | | | | | | timsort | 100000 | 32 | 0.004282 | 0.004352 | 1532027 | 10 | bit reversal | | pdqsort | 100000 | 32 | 0.005813 | 0.005890 | 1828342 | 10 | bit reversal | ``` 原本在執行時， pdqsort 的比較次數皆為 0 ，看了程式之後發現是因為沒有將程式中設定的 `cmp` 傳入 pdqsort 函式中，也就造成了在比較時，不會去計算比較次數。但改為程式中設定的 `cmp` ，也導致其執行時間變慢。在執行程式時，若是執行由 c++ 撰寫的演算法，<s>且當前需排序資料不為 4 bytes 則排序會出錯。</s> 並且資料型態不為 32 位元時會出錯。 :::danger 改進你的漢語表達。 :::