2025q1 Homework2 (quiz1+2)

# 2025q1 Homework2 (quiz1+2) contributed by < `cmosinverter` > ## 開發環境 ```shell $ gcc --version gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 4 On-line CPU(s) list: 0-3 Vendor ID: GenuineIntel Model name: Intel(R) N100 CPU family: 6 Model: 190 Thread(s) per core: 1 Core(s) per socket: 4 Socket(s): 1 Stepping: 0 CPU(s) scaling MHz: 87% CPU max MHz: 3400.0000 CPU min MHz: 700.0000 BogoMIPS: 1612.80 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat p se36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdp e1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqd q dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm p cid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l2 cd p_l2 ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid r dt_a rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsave c xgetbv1 xsaves split_lock_detect user_shstk avx_vnni dtherm ida a rat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req vnmi umip pku ospke waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize arch_lbr ibt flush_l1d arch_capabilities Virtualization features: Virtualization: VT-x Caches (sum of all): L1d: 128 KiB (4 instances) L1i: 256 KiB (4 instances) L2: 2 MiB (1 instance) L3: 6 MiB (1 instance) NUMA: NUMA node(s): 1 NUMA node0 CPU(s): 0-3 Vulnerabilities: Gather data sampling: Not affected Itlb multihit: Not affected L1tf: Not affected Mds: Not affected Meltdown: Not affected Mmio stale data: Not affected Reg file data sampling: Mitigation; Clear Register File Retbleed: Not affected Spec rstack overflow: Not affected Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitizatio n Spectre v2: Mitigation; Enhanced / Automatic IBRS; IBPB conditional; RSB fillin g; PBRSB-eIBRS Not affected; BHI BHI_DIS_S Srbds: Not affected Tsx async abort: Not affected ``` ## 第一週 Qiuz 1 #### 延伸問題 1 > 解釋上方程式碼運作原理根據 `list_insert_before` prototype 的描述，我們可以透過這個函式完成以下條件: 1. 將新的 item 加到 before 之前 2. 如果 before 是 head of list，那麼新的 item 會被加到最前面 3. 如果 before 為 NULL，那麼新的 item 會被加到最尾端 ```c static inline void list_insert_before(list_t *l, list_item_t *before, list_item_t *item) { list_item_t **p; for (p = AAAA; *p != BBBB; p = CCCC) ; *p = item; DDDD = before; } ``` 考慮一般的情況 (如下圖): ```graphviz digraph G { rankdir=LR; node [shape=record]; a [label="{ <data> head | <ref> }"]; b [label="{ <data> a | <ref> }"]; c [label="{ <data> b | <ref> }"]; d [label="{ <data> NULL}"]; a:ref:c -> b:data [arrowhead=vee, arrowtail=dot, dir=both, tailclip=false]; b:ref:c -> c:data [arrowhead=vee, arrowtail=dot, dir=both, tailclip=false, label="before"]; c:ref:c -> d:data [arrowhead=vee, arrowtail=dot, dir=both, tailclip=false]; } ``` * p 的初始情況為一個指向指向 head 的指標的指標，因此 AAAA 為 `&l->head` * p 是一個指標的指標，他會逐一走訪每一個節點的指標，只要 `*p != Before`，他就會繼續走訪，因此 BBBB 為 `*p != Before` * 如果要移動 p 這個間接指標，那麼要先對他解指標 `*p` 然後取出該指標指向的節點所指向的下一個節點的指標 `(*p)->next`，然後再用 address-of operator 取出這個指標的位置: `&(*p)->next`，因此 CCCC 為 `&(*p)->next` * 當 `*p == Before` 時，可以直接對 p 解指標然後指派 item 給 `*p`，此時情況如下圖所示: ```graphviz digraph G { rankdir=LR; node [shape=record]; head [label="{ <data> head | <ref> }"]; a [label="{ <data> a | <ref> }"]; b [label="{ <data> b | <ref> }"]; item [label="{ <data> item | <ref> }"]; n [label="{ <data> NULL}"]; before [label="{ <data> &before | <ref> }"]; head:ref:c -> a:data [arrowhead=vee, arrowtail=dot, dir=both, tailclip=false]; a:ref:c -> item:data [arrowhead=vee, arrowtail=dot, dir=both, tailclip=false]; b:ref:c -> n:data [arrowhead=vee, arrowtail=dot, dir=both, tailclip=false]; before:ref:c -> b:data [arrowhead=vee, arrowtail=dot, dir=both, tailclip=false]; } ``` 然後把 item 的 next 指派為 before，也就是目前指向 B 的指標: ```graphviz digraph G { rankdir=LR; node [shape=record]; head [label="{ <data> head | <ref> }"]; a [label="{ <data> a | <ref> }"]; b [label="{ <data> b | <ref> }"]; item [label="{ <data> item | <ref> }"]; n [label="{ <data> NULL}"]; head:ref:c -> a:data [arrowhead=vee, arrowtail=dot, dir=both, tailclip=false]; a:ref:c -> item:data [arrowhead=vee, arrowtail=dot, dir=both, tailclip=false]; b:ref:c -> n:data [arrowhead=vee, arrowtail=dot, dir=both, tailclip=false]; item:ref:c -> b:data [arrowhead=vee, arrowtail=dot, dir=both, tailclip=false]; } ``` 成功插入 item 至 linked-list 中。那麼 `如果 before 是 head of list 呢` ? 如下圖所示: ```mermaid graph LR subgraph linked list head==>node1==>node2 end subgraph pointer before==>head end ``` 根據剛剛完成的程式碼: * 先假設程式碼是對的 ```C= static inline void list_insert_before(list_t *l, list_item_t *before, list_item_t *item) { list_item_t **p; for (p = &l->head; *p != before; p = &(*p)->next) ; *p = item; item->next = before; } ``` 從程式碼中 for-loop 結束後的狀態可以得知，indirect pointer `p` 指向的指標 `*p`，正好跟 pointer `Before`一模一樣，也就是 `*p == Before`，如果此時照著程式碼 (6) 的邏輯繼續執行，如下圖: ```mermaid graph LR subgraph linked list direction LR head==>node1==>node2 end subgraph linked list direction LR item end subgraph pointer to pointer direction LR p ==> item end subgraph pointer direction LR before==>head end ``` `*p`，會改為指向一個新的 item，然後繼續執行 (7): ```mermaid graph LR subgraph pointer to pointer direction LR p end subgraph linked list direction LR p ==> item ==> head==>node1==>node2 end ``` 此結構符合一開始的規定，也就是若 before 是 head of list，那麼新的 item 會被加到最前面。最後，`如果 before 為 NULL，那麼 item 會被加進 tail of list 嗎` ? 從程式碼中 for-loop 結束後的狀態可以得知，p 會在迴圈結束後指向指向 NULL 的 pointer，所以綜合前面得到的經驗， item 會被加到 list 的最後。 #### 延伸問題 2 > 在現有的程式碼基礎上，撰寫合併排序操作 ## 第一週 Qiuz 2 #### 延伸問題 1.1 > 解釋程式碼運作的原理，並嘗試補完程式碼首先看到管理記憶體區塊的結構表示: ```C typedef struct block { size_t size; struct block *l, *r; } block_t; ``` 因為所要管理的記憶體是樹狀結構，而且這個樹狀結構是一個 Binary Search Tree ，也就是任何一個節點他的 right subtree 內的任何一個節點的記憶體區塊大小都比較大，left subtree 內的任何一個節點的記憶體區塊大小都比較小，遵循這個規則，可以在 log(n) 時間複雜度找到目標大小的記憶體區塊。閱讀完了程式碼和對應的註解後，了解到 remove_free_tree 這個函式想要達成的目的如下: * 可以移除目標記憶體區塊 * 如果該區塊左右都有子節點，就需要找到 size 小於該區塊的最大記憶體區塊 `in-order predecessor` * 如果該區塊只有一個子節點，用該子節點做替代 * 沒有子節點，直接移除該節點 #### 延伸問題 1.2 > 使其可獨立執行，應提供相關的測試程式碼 [Code](https://github.com/cmosinverter/linux2025-w1-q2) 測試程式碼構想: * 先產生一個大小為 n 的 list，裡面每一個 element 都是 size_t * 透過這個 list 建立一個 binary search tree * 從原本的 list 中隨機 remove memory block k 次， k 小於等於 n 首先，為了讓程式碼可以獨立的執行，所以必須先完成 `find_free_tree` 和 `find_predecessor_free_tree` 對應的實作。 **[find_free_tree](https://github.com/cmosinverter/linux2025-w1-q2/blob/main/block.c#L8)** * 用遞迴版本 Binary Search 求出 target 的指標 **[find_predecessor_free_tree](https://github.com/cmosinverter/linux2025-w1-q2/blob/main/block.c#L29)** * 這邊的 predecessor 是指 in-order predecessor，所以該節點為左子節點的最右邊子節點在測試樹狀記憶體結構之前，還要再補上一些樹操作的實作如下: **[init_tree](https://github.com/cmosinverter/linux2025-w1-q2/blob/main/block.c#L49)** * 藉由動態分配記憶體初始化 **root** 間接指標，並將 ***root** 指向 NULL **[insert_free_tree_iterative](https://github.com/cmosinverter/linux2025-w1-q2/blob/main/block.c#L142)** * 利用迭代法插入記憶體區塊 **[inorder](https://github.com/cmosinverter/linux2025-w1-q2/blob/main/block.c#L159)** * In-order traversal 的遞迴版本，預期可以由小到大印出所有剩餘記憶體的大小實際測試流程: **[test.c](https://github.com/cmosinverter/linux2025-w1-q2/blob/main/test.c)** 1. 初始化根節點 2. 生成各種大小的記憶體區塊 3. 將前一步生成的記憶體區塊插入樹結構中 4. 移除所有記憶體區塊 5. 釋放記憶體測試結果: ```shell 15 5 15 5 10 15 5 10 15 25 5 10 15 20 25 5 10 15 25 5 10 15 5 15 15 ``` 所有函式功能皆正常，最後使用 valgrind 檢查 memory leak: ```shell $ valgrind ./test ==938828== Memcheck, a memory error detector ==938828== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al. ==938828== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info ==938828== Command: ./test ==938828== 15 5 15 5 10 15 5 10 15 25 5 10 15 20 25 5 10 15 25 5 10 15 5 15 15 ==938828== ==938828== HEAP SUMMARY: ==938828== in use at exit: 0 bytes in 0 blocks ==938828== total heap usage: 7 allocs, 7 frees, 1,152 bytes allocated ==938828== ==938828== All heap blocks were freed -- no leaks are possible ==938828== ==938828== For lists of detected and suppressed errors, rerun with: -s ==938828== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) ``` valgrind 沒有發現記憶體洩漏。 #### 延伸問題 2 > 參閱 [memory-allocators](https://github.com/mtrebi/memory-allocators)，針對你補完 (及後續改進) 的記憶體配置器，進行效能評比，並解讀其效能表現 ## 第一週 Quiz 3 #### 延伸問題 1 > 解釋程式碼運作原理以下程式碼為 quick_sort 的變數宣告: ```C int n = list_length(list); int value; int i = 0; int max_level = 2 * n; struct list_head *begin[max_level]; struct list_head *result = NULL, *left = NULL, *right = NULL; begin[0] = list->next; list->prev->next = NULL; ``` max_level: 定義為 2 x n ，此情況模擬了遞迴實作下的最大遞迴深度。 begin: 本實作方法為 iterative 而不是 recursive，因此需要模擬一個 stack 結構。 result: 排序後的暫存指標對於鏈結串列的 quick sort 而言，每一輪的排序(對於最外層的 while 迴圈而言)如下 : * 若分割有大於一個節點: * 將所有小於 pivot 的節點改接到 left，所有大於 pivot 的節點改接到 right * 依照 left，pivot，right 的順序推到 stack 中 * 若分割沒有或只有一個節點: * 若有一個節點的話，將該節點插入 result 的頭部 * 執行 stack pop `i--` #### 延伸問題 2 > 研讀〈[A comparative study of linked list sorting algorithms](https://dl.acm.org/doi/pdf/10.1145/225890.225893)〉，分析針對雙向鏈結串列的排序演算法考量，並運用 Linux 核心風格的 List API 予以實作 ## 第二週 Quiz 1 #### 延伸問題 1 > 解釋程式碼運作原理，改進 random_shuffle_array 也探討 list_for_each_entry_safe 建構的迴圈中，若將 list_move_tail 換為 list_move，為何會無法滿足 stable sorting list_quicksort 為遞迴版本的 quick sort，其詳細步驟如下: * 取出串列的第一個元素作為 pivot，並且從串列中移除 pivot * 依序比對串列中各個元素與 pivot 的大小關係，小於的接到 list_less 尾端，大於的接到 list_greater 尾端 * 遞迴的對 list_less 和 list_greater 做排序 * 最後依照 list_less、pivot、list_greater 的順序組建最終排序串列 stable sorting 之 [Wikipedia](https://simple.wikipedia.org/wiki/Stable_sorting_algorithm) 定義: > A sorting algorithm is called stable if it preserves the order of elements with the same sorting key. 假如將 list_move_tail 換成 list_move，那麼相同鍵值的相對順序將會反轉，導致 stable sorting 的狀態不成立，list_move_tail 因為遵循比較晚被比較的鍵值 (比較後面的鍵值) 會被放在後面的性質，因此 stable sorting 的狀態依然成立。在改進 random_shuffle_array 之前，先探討題目是如何實作的。 ```C= static inline uint8_t getnum(void) { static uint16_t s1 = UINT16_C(2); static uint16_t s2 = UINT16_C(1); static uint16_t s3 = UINT16_C(1); s1 *= UINT16_C(171); s1 %= UINT16_C(30269); s2 *= UINT16_C(172); s2 %= UINT16_C(30307); s3 *= UINT16_C(170); s3 %= UINT16_C(30323); return s1 ^ s2 ^ s3; } static uint16_t get_unsigned16(void) { uint16_t x = 0; size_t i; for (i = 0; i < sizeof(x); i++) { x <<= 8; x |= getnum(); } return x; } ``` 本程式碼用到了一個叫做 Linear congruential generator (LCG) 的 PRNG，s1、s2、s3 分別代表一個 LCG，且由於 s1、s2、s3 已經被宣告為 static variable，因此他們只會在第一次 getnum 的 function call 被初始化為 2、1、1，並且在整個程式的生命週期內每一次 getnum function call 之後產生隨機數。之所以使用三個 LCG 而不是單一個的好處就是可以增加隨機性，並且延長 PRNG 的週期至 30268、30306、30322 的最小公倍數。將 random_shuffle_array 的問題: 1. 不能對任意 array 洗牌 2. 只能產生 0, 1, 2, ... length(array) 的隨機陣列將 random_shuffle_array 改成 Fisher–Yates shuffle: ```C static inline void random_shuffle_array(uint16_t *operations, uint16_t len) { for (uint16_t i = len-1; i > 0; i--) { uint16_t j = get_unsigned16() % (i + 1); uint16_t temp = operations[i]; operations[i] = operations[j]; operations[j] = temp; } } ``` 同時在 main 裡面加入 init_array 來初始化 values[256]，變成一個 0, 1, 2, 3, ..., 255 的陣列。 Reference * [Wichmann–Hill](https://en.wikipedia.org/wiki/Wichmann%E2%80%93Hill) * [Linear congruential generator](https://en.wikipedia.org/wiki/Linear_congruential_generator) #### 延伸問題 2 > 將程式碼整合進 lab0 提及的 qtest，並探討環狀雙向鏈結串列的排序效能，並善用 perf 一類的效能分析工具，探討包含 Linux 核心 list_sort 在內排序實作的效能表現並充分解釋 ## [第二週 Qiuz 2](https://hackmd.io/@sysprog/linux2025-quiz2#%E6%B8%AC%E9%A9%97-2) #### 延伸問題 1 > 解釋上述程式碼，並探討擴充為 ⌈根號(x)⌉ (向上取整數) 實作，該如何調整並保持只用整數加減及位元操作為了要好的理解 digit-by-digit method 的實作方法，我打算先實做一個自己看得懂的版本，~~因為我看了一個月還是看不懂 Quiz 裡面的寫法~~。以下是我實作的版本： ```C= uint64_t sqrti_simple(uint64_t x) { uint64_t y = 0; if (x <= 1) return x; int total_bits = 64; int shift = (total_bits - 1 - clz64(x)) & (~1UL); uint64_t p = 0; for(int i = (shift / 2); i >= 0; i--) { uint64_t b = (p << (i + 1)) + (1 << (2 * i)); if (x >= b) { x -= b; p += (1 << i); } } y = p; return y; } ``` `Line 5`: 若 x 小於 1，直接返回 (0, 1 平方根與自己相同) `Line 8`: uint64_t 的 bit 個數 `Line 9`: 從 LSB 數過來最高的 set bit (最高的為 1 的 bit) 的 index `Line 11`: 目前找到所有 bit 的和 `Line 12`: 假設 shift 是 k，那麼根據平方根的不等式定義： $$ 2^k <= x <= 2^{k+1} $$ 因此對 x 開根號之後： $$ 2^{k/2} <= sqrt(x) <= 2^{(k+1)/2} $$ 因為 quiz 裡面要找的平方根定義是要找到 $floor(x)$ 的 binary representation，所以 $sqrt(x)$ 的最高的 set bit 會是 $floor(k/2)$，這也是為什麼 for 迴圈裡面要從 `i = (shift / 2)` 開始尋找。 `Line 12~18`: 假設以下式子，我們想要找到 x 的平方根： $$x=N^2\approx(b_n2^0 + b_{n-1}2^1 + ... +b_02^n)^2$$ 其中， $$b_i \in \{0, 1\}$$ 且定義數列 $P$ 如下所示： $$P_m=P_{m+1} + a_m$$ 遞迴式子不太好懂，所以我先從 $m=0$ 把式子展開： $$P_0=b_n2^0 + b_{n-1}2^1 + ... +b_02^n=P_{1} + a_m$$ 其中， $$a_m=b_m2^0$$ 其實 $P_0$ 就是我們所要找的最終解答，在每一個 for 迴圈裡面，我們都要嘗試決定 $a_m$ 裡面的 $b_m$ 到底是 1 還是 0。也就是說，每一次在進到迴圈之前都已經有了 $P_{m+1}$，我們想要看看如果 $(P_m)^2 = (P_{m+1}+2^m)^2$ 會不會超過 $x$，如果超過了代表 $a_m=b_m=0$，$P_m=P_{m+1}$，第 $m$ 個 bit 被設為 0，反之沒超過的話就設為 1。可是要怎麼知道 $(P_m)^2$ 的值是多少？回到數列 $P$ 的定義用平方公式把他展開： $$(P_{m+1} + a_m)^2 = P_{m+1}^2 + P_{m+1}2^{m+1} + 2^{2m}$$ 可以發現 $(P_m)^2$ 其實就是 $P_{m+1}^2$ 加上 $P_{m+1}2^{m+1} + 2^{2m}$，而我們定義了誤差值的關係式： $$X_m = N^2 - P_m^2$$ 因為我們想要決定以下不等式是否成立： $$(P_{m+1}+2^m)^2 \leq N^2$$ 將以下式子帶入： $$X_{m+1} = N^2 - P_{m+1}^2$$ 所以等同檢查下面不等式是否成立： $$X_{m+1} \geq P_{m+1}2^{m+1}+2^{2m}$$ 這就是 `Line 12、13` 在做的事情。 `Line 15、16`: 把 $x$ 扣掉 $P_{m+1}2^{m+1}+2^{2m}$ (縮小誤差)，然後將最終答案加上 $2^ i$ 至於如何改進？參考[陳麒升](https://hackmd.io/@rota1001/linux2025-homework2?stext=19320%3A10%3A0%3A1746545653%3A_OW8m9)大神的解釋：我們可以令 $y_i = P_i*2^{i}$，代入 `Line 13` ```C uint64_t b = (p << (i + 1)) + (1 << (2 * i)); ``` 變成： ```C uint64_t b = y + (1 << (2 * i)); ``` `(p << (i + 1))` 實際上就是 $y_{i+1} = P_{i+1}*2^{i+1}$ 令 $y_{i+1} = P_{i+1}*2^{i+1}$，代入 `Line 16`： ```C p += (1 << i); ``` 上面的數學是其實就是 $P_i = P_{i+1} + 2^i$ $$P_{i} = 2^{-i}*y_i$$ $$P_{i+1} = 2^{-(i+1)}*y_{i+1}$$ 帶入 $P_i = P_{i+1} + 2^i$ 之後變成： $$2^{-i}*y_i = 2^{-(i+1)}*y_{i+1} + 2^i$$ 左右同時乘上 $2^{-i}$： $$y_i = 2^{-1}*y_{i+1} + 2^{2i}$$ 對應到程式碼的話，就是 `line 14、17`： ```C= uint64_t sqrti_simple(uint64_t x) { uint64_t y = 0; if (x <= 1) return x; int total_bits = 64; int shift = (total_bits - 1 - clz64(x)) & (~1UL); uint64_t p = 0; for(int i = (shift / 2); i >= 0; i--) { uint64_t b = y + (1 << (2 * i)); y >> 1; if (x >= b) { x -= b; y += 1 << (2 * i) } } return y; } ``` 最後看到 `Line 13、17` 都有出現的部分： ```C (1 << (2 * i)) ``` 因為 for 迴圈裡面的 i 是從 shift/2 開始，所以初始狀態的 i： $$2i = 2*(shift/2) = shift$$ 每一次的迭代 i 都會 -1，所以： $$2(i-1) = 2*(shift/2-1) = shift-2$$ 意思就是我們可以令 `m = 1 << shift`，然後每次在迴圈結束的時候把 `m` 往右位移兩個 bit，當 `m` 為 0 的時候剛好迴圈也結束了，所以可以順便把 `m` 當作 while loop 的條件。修改之後的程式碼： ```C= uint64_t sqrti_simple(uint64_t x) { uint64_t m, y = 0; if (x <= 1) return x; int total_bits = 64; int shift = (total_bits - 1 - clz64(x)) & (~1UL); m = 1ULL << shift; while (m) { uint64_t b = y + m; y >>= 1; if (x >= b) { x -= b; y += m; } m >>= 2; } return y; } ``` 與 `uint64_t sqrti(uint64_t x)` 一樣，修改完畢。至於向上取整數的話，先看到下面原本向下取整數的不等式： $$(P_{m+1}+2^m)^2 \leq N^2$$ 意思就是最終狀態 $$(P_{0})^2 \leq N^2$$ 如果我們想要向上取整數，假設向上取整的最終結果是 $P_0'$，不等式要修改成下面的樣子 $$(P_{0}'-1)^2 < N^2$$ 也就是 `if (x >= b)` 要改成 `if (x > b)`，改完了之後再把扣掉的 1 加回去 `return y + 1`。 #### 延伸問題 2 > 參照計算機結構測驗題 C 及其注釋，以 C 語言 (搭配 GNU extension) 撰寫不依賴除法指令的 sqrtf，確保符合 IEEE 754 規格。對照 sqrtf, rsqrt, reciprocal implementations in Arm assembly #### 延伸問題 3 > 參照從 √2 的存在談開平方根的快速運算提到的「藉由查表的平方根運算」，以 C 語言實作，並比較不同手法的整數開平方根效能一旦發現效能改進的機會，可準備提交改進 int_sqrt 程式碼到 Linux 核心 ## 第二週 Qiuz 3