2025q1 Homework2 (quiz1+2)

# 2025q1 Homework2 (quiz1+2) contributed by <`alanhc`)> {%hackmd NrmQUGbRQWemgwPfhzXj6g %} ## 環境 ``` alanhc@alanhc:~/lab0-c$ uname -a Linux alanhc 6.8.0-55-generic #57-Ubuntu SMP PREEMPT_DYNAMIC Wed Feb 12 23:42:21 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux alanhc@alanhc:~/lab0-c$ hostnamectl Static hostname: alanhc Icon name: computer-desktop Chassis: desktop 🖥️ Machine ID: 85df47c270b342f3b43b1b02b90d3c16 Boot ID: a64d2aee6fcd432f91d013489094b858 Operating System: Ubuntu 24.04.2 LTS Kernel: Linux 6.8.0-55-generic Architecture: x86-64 Hardware Vendor: ASUSTeK COMPUTER INC. Hardware Model: Pro WS W680-ACE IPMI Firmware Version: 3901 Firmware Date: Fri 2024-09-27 Firmware Age: 5month 1w 4d alanhc@alanhc:~/lab0-c$ gcc --version gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 Copyright (C) 2023 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. alanhc@alanhc:~/lab0-c$ ldd --version ldd (Ubuntu GLIBC 2.39-0ubuntu8.4) 2.39 Copyright (C) 2024 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Written by Roland McGrath and Ulrich Drepper. alanhc@alanhc:~/lab0-c$ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 46 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 28 On-line CPU(s) list: 0-27 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i7-14700 CPU family: 6 Model: 183 Thread(s) per core: 2 Core(s) per socket: 20 Socket(s): 1 Stepping: 1 CPU(s) scaling MHz: 17% CPU max MHz: 5400.0000 CPU min MHz: 800.0000 BogoMIPS: 4224.00 ``` ## Links - [作業說明](https://hackmd.io/@sysprog/rkqqXa55yl) ## [測驗1](https://hackmd.io/@sysprog/linux2025-quiz1#測驗-1) - [課前測驗參考解答: Q1](https://hackmd.io/@sysprog/bitwise-reverse?type=view) ### 1 - 題目有段測試程式，其中 - 執行階層如下 - main - test_suit - my_run_test(test_list) - test是一個自己定義的測試function，對應到程式的test_list - tests_run 是count ```cpp= #define my_run_test(test) \ do { \ char *message = test(); \ tests_run++; \ if (message) \ return message; \ } while (0) ``` - test_list - 主要流程 - 測試insert到頭 (list_insert_before(&l, l.head, &items[i])) - 測試insert到尾 (list_insert_before(&l, NULL, &items[i])) - 題目要回答：list_insert_before 函式實作 - 查看定義 ```clike= /** * Inserts a new item into the list before a specified item. * * This function traverses the list to locate the position immediately before * the item pointed to by @before and inserts @item in that position. * The time complexity is O(n), where n is the number of steps needed to * reach @before from the head of the list. * * Parameters: *@1 : Pointer to the list. * @before : Pointer to the item before which the new item should be inserted. * - If @before is the head of the list, the new item is inserted * at the front. * - If @before is NULL, the new item is appended to the end of the list. * - In all other cases, behavior is undefined if @before does not belong to @l. * @item */ : The new list item to be inserted. static inline void list_insert_before(list_t *1, list_item_t *before, list_item_t *item); ``` - before是要新增的前一個元素 - l 是一個list - item 是被新增的 node - 答案為 ```cpp= list_item_t **p; for (p = &l->head; *p != before; p = &(*p)->next) ; *p = item; item->next = before; ``` - 因為 p 從定義是 pointer to pointer to list_item_t - 因此 *p 意義是當前節點 - 答案第2行 - l->head 是指向第一個元素，&l->head 是是指向第一個元素的 address - 確保 *p 還沒有到新增元素的前一個 (before) - 迭代的時候 p = 當前(*p) 所指到的地址 (&) - 例子解釋原本：`head -> [ 10 ] -> [ 20 ] -> [ 30 ] -> NULL` 要在 20 節點之前插入一個新的節點 item，並且假設這個新節點的值是 15，因此before是20。 ``` head -> [ 10 ] -> [ 20 ] -> [ 30 ] -> NULL ^ p (指向 head 節點) ``` *p 會是 10 ，然後檢查 *p != before（即 10 != 20）。因為條件不成立，我們會更新 p，使其指向下一個節點的 next 指標。 - 可以想像 p 是 *p 的前一個節點 - 由於答案第二行終止節點是 before == *p ，再加上 p 為 *p 的前一個節點 - 第4行 *p = item; 的目的是將 item 插入到 10 節點之後，因為 p - 第5行因為要將新加入的item 插入 before - ![截圖 2025-03-17 晚上10.21.17](https://hackmd.io/_uploads/rk--ehrhkg.jpg) - 依照每次迭代來理解 | 步驟| p (指標)| *p (當前節點) | -------- | -------- | -------- | | 初始| &head| 10 | 迭代 1| &head->next | 10 | 迭代 2| &head->next->next | 20 | 插入 | &head->next->next | 15 #### 延伸在現有的程式碼基礎上，撰寫合併排序操作 TODO: 待補完整解釋 - 使用快慢指標找中點 - 使用 pointer to pointer 簡化 - 完整程式碼：https://gist.github.com/alanhc/3705f57d8fefbe92cb64f00f172fe843 - 其中的merge邏輯： ```cpp= void merge_sorted_lists(list_t *l, list_item_t *a, list_item_t *b) { list_reset(l); while (a && b) { if (a->value < b->value) { list_insert_before(l, NULL, a); a = a->next; } else { list_insert_before(l, NULL, b); b = b->next; } } while (a) { list_insert_before(l, NULL, a); a = a->next; } while (b) { list_insert_before(l, NULL, b); b = b->next; } } ``` ### 2 - remove_free_tree：給定樹的root，刪除target - 要討論的case有 - target 同時擁有左子樹及右子樹 - target 只有左子樹 **或** 右子樹 - 程式碼片段如下 ```cpp= /* If the target node has two children, we need to find a replacement. */ if ((*node_ptr)->l && (*node_ptr)->r) { /* Find the in-order predecessor: * This is the rightmost node in the left subtree. */ block_t **pred_ptr = &(*node_ptr)->l; while (EEEE) pred_ptr = FFFF; ``` - 回答EEEE及FFFF - 第2行爲當要移除的target有兩個子節點時，要找出in-order predecessor（左子樹的最右節點），即為第7-8行 - EEEE 為(*pred_ptr)->r，檢查是否還有右邊子節點 - FFFF 為(*pred_ptr)->r，往右走 #### 補完程式 - memory allocator best-fit - 解釋： - 在記憶體分配器（memory allocator）中，best-fit 策略是： - 在所有「足夠大的區塊」中，選擇「最接近 target size」的區塊。 - 也就是 size >= target->size 中最小的那一個。 - find_free_tree - 邏輯： - 要找到左邊接近target size較小但大於或等於target的節點 - 第6-7行：當發現一個「大於等於」 target 的區塊，這有潛力成為 best-fit，所以記錄到 best。 - 第9-10行：太小，往右找 ```cpp= block_t **find_free_tree(block_t **root, block_t *target) { block_t **best = NULL; block_t **cur = root; while (*cur) { if ((*cur)->size >= target->size) { best = cur; cur = &(*cur)->l; } else { cur = &(*cur)->r; } } return best ? best : cur; } ``` - 視覺化 ```clike 50 / \ 30 70 / \ \ 20 40 80 target->size = 35 1. root = 50: 50 >= 35 (best = 50), 往左 2. root = 30: 30 < 35, 往右 3. root = 40: 40 >= 35 (best = 40), 往左 4. root = NULL -> 結束結果: best = 40 ``` - find_predecessor_free_tree - predecessor：左子樹中最右節點 - 第6-7行：有右邊就往右找 ```cpp= block_t *find_predecessor_free_tree(block_t **root, block_t *node) { if (!node || !node->l) return NULL; block_t *pred = node->l; while (pred->r) pred = pred->r; return pred; } ``` - 參閱 [memory-allocators](https://github.com/mtrebi/memory-allocators)，針對你補完 (及後續改進) 的記憶體配置器，進行效能評比，並解讀其效能表現 ### 3 - 第23行為`p = n->next;`，遍歷pivot右邊的節點 - 21-25邏輯是將pivot右邊依照value分為大於及小於value的 - 第28行為`list_tail(&left);`，這行是將left最尾端的元素作為佐子區間的end - 第32行保留right區間最尾節點作為右區間的end - 核心概念：分而治之 (Divide and Conquer) 1. 選擇 pivot 每一輪從當前子 list 的開頭 L 當作 pivot 節點。 2. 分割成兩個子鏈表 (left / right) 遍歷 pivot 右邊的節點 (p = pivot->next)，根據 value 分成 left 子 list 和 right 子 list： n->value > pivot->value 放到 right 反之，放到 left 3. 遞迴處理子鏈表（用 stack 模擬遞迴）先把 left -> pivot -> right 推進 stack。持續重複 quick sort 子問題。 4. 逐步合併回 result 當 L == R 時，代表當前子問題只剩 1 個節點，直接 pop 出來放到 result 的頭部。 ```cpp= void quick_sort(node_t **list) { int n = list_length(list); int value; int i = 0; int max_level = 2 * n; node_t *begin[max_level], *end[max_level]; node_t *result = NULL, *left = NULL, *right = NULL; begin[0] = *list; end[0] = list_tail(list); while (i >= 0) { node_t *L = begin[i], *R = end[i]; if (L != R) { node_t *pivot = L; value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; while (p) { node_t *n = p; p = n->next; list_add(n->value > value ? &right : &left, n); } begin[i] = left; end[i] = list_tail(&left); begin[i + 1] = pivot; end[i + 1] = pivot; begin[i + 2] = right; end[i + 2] = list_tail(&right); left = right = NULL; i += 2; } else { if (L) list_add(&result, L); i--; } } *list = result; } ``` ## 測驗2 ### 1 - 此為使用linux核心風格實作quick sort - cmpint 比較兩個 - 第13行為 list_first_entry(head, struct listitem, list); - 第14行為 list_del(&pivot->list); 把 pivot 從 head 中移除 - 19-21 看item跟pivit哪個大 - 第19行為 list_move_tail(&item->list, &list_less); - 第21行為 list_move_tail(&item->list, &list_greater); - 第28行為 list_add(&pivot->list, head); - 第29行為 list_splice_tail(&list_less, head); - 第30行為 list_splice_tail(&list_greater, head); - 主要邏輯： - list_first_entry 取得 pivot。 - list_del 把 pivot 暫時移除。 - 迴圈裡用 cmpint 比較 item 與 pivot，分別移到 list_less 和 list_greater。 - 最後遞迴排好兩邊，再把 pivot 接回 head 的頭，再接上兩個 sorted 子 list。 ```cpp= { struct list_head list_less, list_greater; struct listitem *pivot; struct listitem *item = NULL, *is = NULL; if (list_empty(head) || list_is_singular(head)) return; INIT_LIST_HEAD(&list_less); INIT_LIST_HEAD(&list_greater); // 取 head 的第一個元素作為 pivot pivot = list_first_entry(head, struct listitem, list); // 把 pivot 從 head 中移除 list_del(&pivot->list); list_for_each_entry_safe(item, is, head, list) { if (cmpint(&item->i, &pivot->i) < 0) list_move_tail(&item->list, &list_less); else list_move_tail(&item->list, &list_greater); } list_quicksort(&list_less); list_quicksort(&list_greater); // 重新把三個部分接起來 list_add(&pivot->list, head); list_splice_tail(&list_less, head); list_splice_tail(&list_greater, head); } ``` ### 2 - 可藉由題目輸入輸出推估 - clz2 - x 為 input, c 為shift c |mask[c]| 0xFFFF >> mask[c]| lower 保留的 bits 數量 |---|---|---|---| 0| 0 |0xFFFF >> 0 = 0xFFFF| 16 bits 1| 8 |0xFFFF >> 8 = 0x00FF| 8 bits 2| 12 |0xFFFF >> 12 = 0x000F| 4 bits 3| 14 |0xFFFF >> 14 = 0x0003| 2 bits - 第1行當 c = 3 時，我們希望只剩下最後 2 bits，用來對應 magic[] 做最終查表。 bits| binary| magic 值 | 說明 |---|---|---|---| 00| 00| 2| 2 個 leading zeros 01| 01| 1| 1 個 leading zero 10| 10| 0| 無 leading zero 11| 11| 0| 無 leading zero - 第2行 00 對應 2 個 leading zeros 11 / 10 對應 0 個 leading zeros - 第11行，當 c == 3，代表： - 已經只剩下 2 bits 可判斷（mask[3] = 14） - 直接使用 magic[] 回傳答案 - 第12行：lower 是 2 bits 時，最多還能有 "2" 個 leading zeros，因為已經處理過高層，這裡再補上 2 bits 的 offset - 第13行，每次進入下一層要 c+1 每層細分： c=0 ➡️ 16 bits c=1 ➡️ 8 bits c=2 ➡️ 4 bits c=3 ➡️ 2 bits 二分法 (divide and conquer) clz的流程： 1. 先拆成 upper / lower（高位 or 低位） 2. upper ≠ 0 就繼續在 upper 遞迴 3. upper = 0 時，回傳 (16 >> c) 的 offset + 遞迴到 lower 4. 最後 c=3 時，直接用 magic[] 直接給答案 ```cpp= static const int mask[] = {0, 8, 12, 14}; static const int magic[] = {2, 1, 0, 0}; // magic[] size = 4 對應 clz2 最後的 2 bits 判斷 unsigned clz2(uint32_t x, int c) { if (!x && !c) return 32; uint32_t upper = (x >> (16 >> c)); uint32_t lower = (x & (0xFFFF >> mask[c])); if (c == 3) return upper ? magic[upper] : 2 + magic[lower]; return upper ? clz2(upper, c + 1) : (16 >> (c)) + clz2(lower, c + 1); } #define clz32(x) clz2(x, 0) ``` - 第9行 ~1 是位元反轉操作。因此 ~1 等同於 111...1110。 - 這是為了確保我們在計算平方根的起始位置時，能夠對齊至 power of 4（即 4 的冪次），這樣可以避免不必要的多次迭代，並保證更高效的計算。 - 例如，假設 clz64(x) 計算出來是 3，則 (total_bits - 1 - clz64(x)) 等於 60，然後 & ~1 確保結果是 60 變成 60 & ~1 = 60，即 shift = 60，這樣可以確保對齊至 4 的冪次。 - 第14行將y右移1 ，也就是將 y 除以 2 - 為什麼是右移 1 位呢？因為在每一輪中，y 代表的是目前的平方根近似值。每次我們測試 y + m 是否小於或等於 x，並進行更新時，都會進行一次右移操作，使得 y 被縮小，逐步逼近平方根的值。 - 第19行 m 右移 2 位，這是為了在每次迭代時逐漸減少 m 的值，使其逐步縮小，這樣在每一輪中可以有效逼近平方根。 - m 的初始值是根據 clz64(x) 計算的，代表了 x 的最高位的初始估算值。每次迭代時，我們將 m 降低，逐步精確化平方根。 - 第3-10行：目的是根據輸入 x 的位元數來確定平方根的初始估算值 m，並且將其對齊到最近的 4 的冪次，這樣可以高效地進行後續的計算。 - ~1 是位元反轉運算。~1 會將 1 轉為 0，而其餘位元保持為 1。這樣操作的目的，是將 shift 值調整為最接近的偶數，使其符合 power of 4。 - 為什麼需要對齊到 power of 4 呢？因為我們希望從 m 開始計算平方根，這樣會更高效。每次計算 m 的更新時，根據平方根的二進位特性，我們會使用「四分之一縮放」，這樣會大幅提高效率。 - 第14行這裡的 y 是我們當前的平方根估算值，每次迭代，我們會通過右移來更新 y，逐步逼近平方根。 - 第19行右移 2 位相當於除以 4，這有助於調整 m 以便更精確地逼近平方根。 ```cpp= uint64_t sqrti(uint64_t x) { uint64_t m, y = 0; if (x <= 1) return x; int total_bits = 64; int shift = (total_bits - 1 - clz64(x)) & ~1; // MMMM = ~1 m = 1ULL << shift; while (m) { uint64_t b = y + m; y >>= 1; // NNNN = 1 if (x >= b) { x -= b; y += m; } m >>= 2; // PPPP = 2 } return y; } ``` ### 3 - 第6-7prev為前一個節點 node 為現在節點 - 第8-12 遍歷節點並設prev - 第14行卻本head的prev只巷最後一個節點 ```cpp= static void rebuild_list_link(struct list_head *head) { if (!head) return; struct list_head *node, *prev; prev = head; node = head->next; while (node) { node->prev = prev; prev = node; node = node->next; } prev->next = head; head->prev = prev; } ``` - 第14行為 value = pivot->value; 因為value是用於比較的值 - 第21行為 int n_value = n->value; 從n個節點取值用於與pivot value比較 - 第32行透過將 begin i+1 指向 right 起點，將right連結到下一層 - 第33行將begin i+2 指向NULL，不需繼續處理 - 解釋14-35 - pivot 節點的值用來分割其他節點到 left 和 right 部分。 - 對每個遍歷到的節點，根據其值與 pivot 的值比較，將它放到相應的分區（left 或 right）。 - 最後，更新 begin[i] 和 begin[i + 1]，處理左右分區的鏈接，並準備下一輪處理。 ```cpp= { int n = list_length(list); int value; int i = 0; int max_level = 2 * n; struct list_head *begin[max_level]; struct list_head *result = NULL, *left = NULL, *right = NULL; begin[0] = list->next; list->prev->next = NULL; while (i >= 0) { struct list_head *L = begin[i], *R = list_tail(begin[i]); if (L != R) { struct list_head *pivot = L; value = pivot->value; struct list_head *p = pivot->next; pivot->next = NULL; /* break the list */ while (p) { struct list_head *n = p; p = p->next; int n_value = n->value; if (n_value > value) { n->next = right; right = n; } else { n->next = left; left = n; } } begin[i] = left; begin[i + 1] = right; begin[i + 2] = NULL; left = right = NULL; i += 2; } else { if (L) { L->next = result; result = L; } i--; } } list->next = result; rebuild_list_link(list); } ``` ### TODO研讀〈A comparative study of linked list sorting algorithms〉，分析針對雙向鏈結串列的排序演算法考量，並運用 Linux 核心風格的 List API 予以實作 ## 作業二表單當你在已 fork 的 lab0-c 執行 make test 時，你應該要看到 Fingerprint: 開頭的訊息，請填入該訊息的內容 p77c051686cyi8hczvugrubt lab0-c 的 list.h 利用 C 語言的 bit field 達成「當編譯器缺乏 typeof 支援時，停止編譯」，這反映在該檔案的哪些巨集？ list_for_each_entry 在〈從 Revolution OS 看作業系統生態變化〉提及 Apple Safari 和 Google Chrome 網頁瀏覽器可追溯到最初程式碼的源頭是 KDE 計畫旗下的哪個子專案？ (小寫英文字母作答) khtml 在〈Linux 核心設計: 發展動態回顧〉的解說錄影中，提及授課教師在 2022 年發表的論文，運用 Linux 核心什麼機制來收集 CPU 排程事件，以降低統計過程對系統效能的衝擊 (小寫英文字母作答) ebpf 在「[並行程式設計: 排程器原理](https://hackmd.io/@sysprog/concurrency-sched#搶佔式多工)」提及可在使用者層級做到搶佔式的 coroutine，這需要 Linux 提供什麼機制？用英語簡述實作手法，以及為了達成搶佔 (preemption) 特性，你該如何調整？ To implement preemptive coroutines in user space, you can use signals combined with a timer (e.g., setitimer) to periodically interrupt the running coroutine. When the signal (such as SIGALRM) is delivered, the signal handler saves the coroutine’s current context using getcontext or sigsetjmp, then switches to the scheduler context or another coroutine using setcontext or siglongjmp. 已知 int x(int i, int k) { return (i < k && putchar(i + '0') && COND || putchar(i + '0'); } 在 x(1,3); printf("\n"); 使用情境會得到 12321 字串輸出，請補完 COND 敘述，使該程式運作符合預期。用最精簡的形式書寫程式碼，不要提交空白字元。(提示: 遞迴呼叫) !x(i+1,k) https://hackmd.io/@sysprog/c-recursion#案例分析：數列輸出〈Linux 核心的紅黑樹〉提到 Linux 核心對於紅黑樹節點的定義是 struct rb_node { unsigned long __rb_parent_color; struct rb_node *rb_right; struct rb_node *rb_left; } __attribute__((aligned(sizeof(long)))); 也就是親代節點跟顏色一起宣告在 unsigned long 型態的結構體成員中，搭配 x86-64 ABI 規格，用英語解釋其運作原理。 In the Linux kernel, the struct rb_node is designed to be space-efficient by combining the parent pointer and color information into a single unsigned long field called __rb_parent_color. How it works: • On x86-64 architecture, pointers are aligned to at least 8 bytes (due to 64-bit addressing), meaning the lower 3 bits of any valid pointer to a struct are always zero (because of alignment). • The kernel takes advantage of this by storing the color information (and sometimes additional flags) in the lower bits of the __rb_parent_color field. • The upper bits of __rb_parent_color contain the actual parent pointer, and the lower bits encode the color (typically 1 bit: 0 for red, 1 for black). Why do this? This technique: 1. Saves space by avoiding an extra field for color. 2. Maintains cache friendliness and alignment on 64-bit systems. 3. Aligns with x86-64 ABI rules that ensure the lower bits are free for such usage without corrupting the actual pointer value. Example: • The parent pointer is extracted by masking out the lower bits: parent = (struct rb_node *)(__rb_parent_color & ~3UL); • The color is extracted by: color = __rb_parent_color & 1UL;