Linux 核心專題: 重作 lab0

# Linux 核心專題: 重作 lab0 > 執行人: markarz > [專題解說錄影](https://youtu.be/K_AeLtggJC4) ### Reviewed by `jserv` [markarz/lab0-c](https://github.com/markarz/lab0-c) 尚未依據作業要求，rebase 到最新的 [lab0-c](https://github.com/sysprog21/lab0-c)，務必改正。 ### Reviewed by `Mike1117` - 程式碼區塊都沒有註記程式語言，導致可讀性太低 - 在 `q_shuffle` 的實作中，你選擇交換節點中的 `value` 指標，而不是節點本身，這樣做的考量是什麼？ ### Reviewed by `liangchingyub` 列點時使用 HACKMD 的「有序清單」功能，正確縮排。 ### Reviewed by `yy214123` 建議中文英文可以加入空白間隔，這樣視覺上比較不會黏在一起。 ### Reviewed by `otischung` > [直覺的判斷 xorshift 的效能會比 c 的 rand() 好](https://hackmd.io/@sysprog/Bkh9cEV4xl?stext=7010%3A40%3A0%3A1751429720%3AtTftMw) 請避免使用「直覺」，請嘗試證明兩者之間的 CPU cycle、指令數量、或執行時間相差多少。 >以補上使用 `perf`工具測量的數據。 ### Reviewed by `alexc-313` > 直覺的判斷 xorshift 的效能會比 c 的 rand() 好使用 perf 驗證效能有發現顯著差異嗎? 如果沒有的話是為什麼呢? >我用perf測試完，我看 cycle 的數量還是有差一些，如果要更名顯的差異需要 queue 的資料有更多筆會越明顯，假設你的 queue 有1000個數字那亂數就要取999次，那每輪的 rand() 和 xorshift() 使用的 cycle 數量一定會有差，所以自然的當我跑越多次我的 cycle 會有更明顯的差別，至於rand()和 xorshift() 差多少cycle這個我可以再做實驗測試證明。 ### Reviewed by `MuChengChen` 請說明 Xorshift 演算法中 $T = (I + L^c)(I + R^b)(I + L^a)$ 如何演變成 `xorshift()` 中的運算 ? >我們可以這樣看$(I+L^C)\vec{a}$ = `a^(a<<c)`這個 $\vec{a}$ 就是數學定義的$seed$ ，`a^(a<<c)` 是程式語法，那 $T$ 的物理意義是產生亂數的核心邏輯，也就是說我的 $\vec{seed} !=0$可以與 $T$ 相乘就會產生亂數。 ### Reviewed by `Hande1004` 文字訊息及程式碼避免用截圖的呈現方式。 ## TODO: 重作 lab0 > 依據[第一份作業](https://hackmd.io/@sysprog/linux2025-lab0/)指示，重新投入，並強化對於亂數、排序演算法和改進、常數時間判定的理論和實務、統計手法、網頁伺服器，以及程式碼品質的議題。 :::danger 注意書寫規範：中英文間用一個半形空白字元區隔 ::: ## 亂數、排序演算法和改進 ### Fisher–Yates shuffle [參考資料](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#The_modern_algorithm) 演算法運作流程: + 先取 queue 的長度(len) + 開始交換 len-1 的位置和隨機到的數字位置行 swap + len減一 + 2、3步循環做一直做直到len<=1 運作圖示： ![image](https://hackmd.io/_uploads/rkjIksyHge.png) ### 在 qtest 提供新的指令 shuffle :::danger 區分「指令」(instruction) 和「命令」(command)! ::: 要新增`shuffle`指令在 qtest 新增指令，新增指令方法根據[教材](https://hackmd.io/@sysprog/linux2025-lab0/%2F%40sysprog%2Flinux2025-lab0-b)在 qtest 新增 `hello` 指令的方法，在 `qtest.c` 新增一個`do_shuffle`function ，在`console_init()`裡新增指令`ADD_COMMAND(shuffle, "random test", "")`。 ### `do_shuffle` ```C static bool do_shuffle(int argc, char *argv[]) { if (argc != 1) { report(1, "%s takes no arguments", argv[0]); return false; } set_noallocate_mode(true); if (exception_setup(true)) { q_shuffle(current->q); } exception_cancel(); set_noallocate_mode(false); q_show(3); return !error_check(); } ``` `set_noallocate_mode()`:當傳入 true 時會禁止使用 free 和 malloc 的指令，它是為了確保適用者在執行不同功能時不會釋放到鏈結串列也不會在運作時配置新的 heap 空間，那做這個動作是確保我們在下命令時都是對到同一條鏈結串列，避免每次做影響鏈結串列的順序功能時都會產生新的一條鏈結串列。 `exception_setup()`、`exception_cancel()`:[參考教材解釋](https://hackmd.io/@sysprog/linux2025-lab0/%2F%40sysprog%2Flinux2025-lab0-b) ### ` q_shuffle()` ```C void q_shuffle(struct list_head *head) { if (!head || list_empty(head) || list_is_singular(head)) { return; } int len = q_size(head); if (len < 2) { return; } struct list_head *tail_node = head->prev; for (; len > 1; len--) { int random_idx = rand() % len; struct list_head *random_node = head->next; for (int i = 0; i < random_idx; i++) { random_node = random_node->next; } if (random_node != tail_node) { element_t *random_entry = list_entry(random_node, element_t, list); element_t *tail_entry = list_entry(tail_node, element_t, list); char *temp_val = random_entry->value; random_entry->value = tail_entry->value; tail_entry->value = temp_val; } tail_node = tail_node->prev; } } ``` --- ### 驗證Uniform distribution 檢驗虛無假說（Null hypothesis），即某特定事件在樣本中觀察到的頻率分佈與特定理論分佈一致，事件必須是互斥且機率為 1。如果要 shuffle 三個不同的數字，會出現六種可能的結果，彼此是互斥的，且發生機率加起來為 1。那假設 shuffle 是公平的（六種結果發生的機率相同），並遵守 Uniform distribution，那虛無假說為： + $H_0$（虛無假說）: shuffle 的結果發生的機率相同，遵守 Uniform distribution + $H_1$（對立假說）: shuffle 的結果發生的機率至少有一個不同接下來透過 Pearson's chi-squared test 來驗證 shuffle 三個數字的頻率是符合 Uniform distribution: #### 1. 先計算 chi-squared test statistic $X^2$ $$ X^2 = \sum_{i=1}^n {(O_i - E_i)^2 \over E_i} $$ + $X^2$: Pearson's cumulative test statistic + $O_i$: the number of observations of type i + $E_i$: the expected count of type i ``` Expectation: 41666 Observation: {'1234': 41694, '1243': 41533, '1324': 41443, '1342': 41534, '1423': 41916, '1432': 41380, '2134': 41869, '2143': 41547, '2314': 41849, '2341': 41508, '2413': 42131, '2431': 41476, '3124': 41743, '3142': 42016, '3214': 41168, '3241': 41761, '3412': 41568, '3421': 41694, '4123': 42024, '4132': 41707, '4213': 41408, '4231': 41806, '4312': 41608, '4321': 41617} chi square sum: 29.12902606441703 ``` ![image](https://hackmd.io/_uploads/HyLPnM34gl.png) --- ## shuffle使用Xorshift ### xorshift演算法介紹 [xorshift背後原理參考資料](https://www.researchgate.net/publication/5142825_Xorshift_RNGs) [LFSR](https://en.wikipedia.org/wiki/Linear-feedback_shift_register) #### 數學公式邏輯推導 + 想法希望可以找到一個邏輯運算是可以$1\sim2^n-1$ 不會造成數字分群和每一個數字都能跑到，數字分群意思是假設我這邊有 $1\sim9$的數字我對這些數字都 $mod2$ ，結果就是偶數的數字會分一群奇數會分一群，那就不會是每個都有機會而是只會特定產生某幾組數字，而是希望每個數字都是獨立的都能被獨立產生。那為什麼不是從 $0$ 開始，因為設定 $0$ 就像 `srand(0)` 一樣她永遠都只會產生 $0$。 + 實作 [Primitive polynomial](https://en.wikipedia.org/wiki/Primitive_polynomial_(field_theory)) #### 驗證: $$ R = \begin{pmatrix} 0 & 0 & \dots & 0 & 0 \\ 1 & 0 & \dots & 0 & 0 \\ 0 & 1 & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \dots & 1 & 0 \end{pmatrix} \qquad L = \begin{pmatrix} 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & 1 \\ 0 & 0 & 0 & \dots & 0 \end{pmatrix} \qquad I = \begin{pmatrix} 1 & 0 & \dots & 0 & 0 \\ 0 & 1 & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \dots & 1 & 0 \\ 0 & 0 & \dots & 0 & 1 \end{pmatrix} $$ $T = (I + L^c)(I + R^b)(I + L^a)$ + T: The final 32x32 transition matrix。 + I: The 32x32 identity matrix. + L: The 32x32 left-shift matrix. + R: The 32x32 right-shift matrix. + a, b, c: The three shift amounts you input. step1: $T = (I + L^5)(I + R^{17})(I + L^{13})$ $P(z) = \det(T - zI)= z³² + z²⁷ + z²⁶ + z²⁵ + z²⁴ + z²³ + z²² + z²¹ + z²⁰ + z¹⁹ + z¹⁸ + z¹⁷ + z¹⁶ + z¹⁵ + z¹⁴ + z¹³ + z¹² + z¹¹ + z¹⁰ + z⁹ + z⁸ + z⁷ + z⁶ + z⁵ + z⁴ + z³ + z² + z + 1$ step2: 檢查 $P(Z)$ 是否可以拆成兩個多項式，如果不能這步就測試成功並進到step3。 step3: $2^{32}-1 = 4,294,967,295 = 3 × 5 × 17 × 257 × 65537$ 需滿足以下條件 + $P(3)\bmod 3! \neq 0$ + $P(5)\bmod 5! \neq 0$ + $P(17)\bmod17! \neq 0$ + $P(257)\bmod 257! \neq 0$ + $P(65537)\bmod 65537! \neq 0$ `**step2、step3都滿足那他就可以當亂數演算法的位移量**` 圖片裡的數據都能用在`32bits版本的xorshift`來做亂數產生器: ![image](https://hackmd.io/_uploads/BkW0K6-Sge.png) --- 新增`xorshift`指令在qtest新增指令，按照上面新增shuffle指令的步驟一樣 ### ` do_xorshift()` `q_shuffle(current->q);` 改成 `q_xorshift(current->q);`即可 ![image](https://hackmd.io/_uploads/Bkib5AJrlg.png) ### ` xorshift()` 實做邏輯是依造維基百科給的範例程式改寫[參考資料](https://en.wikipedia.org/wiki/Xorshift) ![image](https://hackmd.io/_uploads/SkqjqLpNeg.png) ```C struct xorshift32_state { uint32_t a; }; static struct xorshift32_state x32_state= { .a = 1 }; uint32_t xorshift(void) { /* Algorithm "xor" from p. 4 of Marsaglia, "Xorshift RNGs" */ uint32_t x = x32_state.a; x ^= x << 13; x ^= x >> 17; x ^= x << 5; x32_state.a = x; return x; } ``` ### ` q_xorshift()` 取隨機數的方法從`rand()`換成`xorshift()` ![image](https://hackmd.io/_uploads/rJmB30JBex.png) ### 驗證Uniform distribution 驗證方法和上面一樣 [參考教材](https://hackmd.io/@sysprog/linux2025-lab0/%2F%40sysprog%2Flinux2025-lab0-d) ``` Expectation: 41666 Observation: {'1234': 41330, '1243': 41646, '1324': 41489, '1342': 41785, '1423': 41570, '1432': 41748, '2134': 41392, '2143': 41613, '2314': 41782, '2341': 41860, '2413': 41730, '2431': 41862, '3124': 41509, '3142': 41699, '3214': 41862, '3241': 41786, '3412': 41411, '3421': 42183, '4123': 41454, '4132': 41615, '4213': 41611, '4231': 41708, '4312': 41550, '4321': 41805} chi square sum: 20.212979407670524 ``` ![image](https://hackmd.io/_uploads/Bkk5qwp4eg.png) --- ## `rand()`和`Xorshift()` 比較 [rand()背後原理](https://en.wikipedia.org/wiki/Linear_congruential_generator) 下圖裡rand()的程式由[c99規格書7.20.2.1 page:312](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)提供 ![image](https://hackmd.io/_uploads/B1ZRhT1Sxl.png) 下圖xorshift(uint32)版本是依[xorshift維基百科資料](https://en.wikipedia.org/wiki/Xorshift)給的範例程式 ![image](https://hackmd.io/_uploads/SkqjqLpNeg.png) 我們可以從這兩段程式碼可以直覺的判斷xorshift的效能會比c的rand()好，原因是xorshift是用位移運算rand()是用除法相關指令，一般我們說除法相關指令通常都會跑比較多的cycle，所以xorshift為了加速亂數而實做出來的演算法。 ### 使用perf驗證效能：執行方式： + 先在 lab0-c 建立`xorshift_test.cmd` ``` new it RAND 1000 xorshift free quit ``` + 在Terminal輸入 ``` sudo perf stat -e cycles,instructions ./qtest -f xorshift_test.cmd` ``` 結果： xorshift()版本　　　　　　 ``` Performance counter stats for './qtest -f xorshift_test.cmd': 454,750,468 cycles 821,570,150 instructions # 1.81 insn per cycle 0.099687190 seconds time elapsed 0.074281000 seconds user 0.036938000 seconds sys ``` rand()版本 ``` Performance counter stats for './qtest -f shuffle_test.cmd': 455,163,523 cycles 822,829,028 instructions # 1.81 insn per cycle 0.100295716 seconds time elapsed 0.073202000 seconds user 0.038321000 seconds sys ``` | 項目 | xorshift | rand | 差異百分比 | | -------------------- | --------- | ----------- | ---------- | | Cycle（個） | 454,750,468 |455,163,523 | 0.0908%| | 總花費時間大幅（秒） | 0.099687190 | 0.100295716 |0.6104% |　 :::danger 注意書寫規範：中英文間用一個半形空白字元區隔 ::: ## TODO: 彙整其他學員的成果 > 紀錄你從其他學員成果獲得的啟發，應適度重現實驗並詳實標注