# 屠龍矮人 ## 2/25 討論 問題 1. 為何 hash table 要用雙向鏈結串列? 單向就可以了不是嗎? > 為了方便刪除 2. 如何修正單精度浮點數的誤差? 3. Linux hash table 設計的思維是建立在 hash function 不太碰撞的前提之下? 如何量化證明比 list_head 結構更好 4. 參見〈 [預期目標 + 開發環境設定](https://hackmd.io/@sysprog/linux2025-lab0/%2F%40sysprog%2Flinux2025-lab0-a) 〉 > queue.c 僅提供介面 (interface) 但尚未有完整程式實作,需要由學員補完並逐步精進,包含以下: 沒寫到 `q_ascend` > q_sort: 以遞增順序來排序鏈結串列的節點 lab0-c 裡面的要求是 ascend/descend, 作業說明跟作業內容不一致 > 寬:你說的對,但 qtest.c 裡面 `descend` 直接被寫死為 `0` ,所以就算你的 qsort 不使用 `descend` 這個參數也不會怎樣。 --- ## 2/26 討論 ### list.h 討論 `list_cut_position`: 根據 `head_to` 原本指向的鏈結串列的記憶體位置要怎麼追蹤?是否該修改文件? ### 2025 quiz 2-1 檢討 冷知識: qsort 的 pseudocode: ``` QUICKSORT(A, p, r) if p < r q = PARTITION(A, p, r) QUICKSORT(A, p, q-1) QUICKSORT(A, q+1, r) ``` 問題: 1. `stable sorting` 是啥? 如何驗證? > Stable sorting algorithms maintain the relative order of records with equal keys (i.e. values). That is, a sorting algorithm is stable if whenever there are two records R and S with the same key and with R appearing before S in the original list, R will appear before S in the sorted list. The Sorting algorithm article provides a more complete description of this. 3. `cccc` 為何是 `list_move_tail` 不是 `list_move` ? >因為如果使用 list_move 就不是 stable sort, 假設有 [$1_A$, $1_B$ ] 若使用 list_move 加入 `less` 或 `greater` 則結果為 [ $1_B$, $1_A$ ] >抑或是說,`list_for_each` 是從左至右,而 list_move 是從右至左 3. `head` 從哪裡來的? > 是遞迴函式的參數 ```c static void list_quicksort(struct list_head *head); ``` 4. what if `head` is a dummy node ? >不用擔心,因為 `BBBB` 使用的函式是 `list_first_entry` ,獲取的是 `head` 的下一個節點 5. 為何需要 `BBBB` 不移除 `head` 會怎樣? > 這是個大烏龍,`BBBB` 的正確答案應為 `list_first_entry` ,即獲取 dummy node 的下一個節點。 ### 比較 2025 quiz 2-1 (遞迴寫法) 和 2025 quiz 1-3 (非遞迴寫法) ## 3/1 討論 1. `MMMM` 為何不能是 `62` (`~1`) 2. `clz` 的定義為和? ### 二分逼近法 鮭:可找我要十進位制的筆記 但我二進位制算法還不會 ### 牛頓法 --- # 貳:上課內容 ## 2/25 上課 - [ ] 裝 git hook - [ ] 修改作者名稱 + email 帳號 - [ ] 閱讀 [How to Write a Git Commit Message](https://cbea.ms/git-commit/) - [ ] 學習 GDB debug - [ ] 閱讀 CSAPP ch2 (p67 ~ 179) (大概看一個禮拜吧) - [ ] 閱讀 [從 √2 的存在談開平方根的快速運算](https://hackmd.io/@sysprog/sqrt) - [ ] 看熟 [list.h](https://github.com/sysprog21/lab0-c/blob/master/list.h) --- ## 2/18 上課簡記 ### 浮點數修正 ```c #include <stdio.h> int main() { float sum = 0.0f; for (int i = 0; i < 10000; i++) sum += i + 1; printf("Sum: %f\n", sum); return 0; } ``` 可以發現單精度浮點數的誤差值會循環出現,目標是利用該特性來修正誤差,即利用「查表」的方式來修正誤差: ```c int diff24[4] = {0, -1, 0, 1}; int diff25[8] = {0, -1, -2, 1, 0, -1, 2, 1}; ``` 誤差發生在 `sum = sum +(i + 1)` 此刻我們假設右手邊的 `sum` 和 `(i+1)` 都能精準表示, ## 3/3 上課簡記 1. namespace, cgroup 2. io_uring 3. 要看作業系統術語與概念 4. 核心發展 為何 ```c /* mpi: Multi-Precision Integers */ typedef struct { uint32_t *data; size_t capacity; } mpi_t[1]; ``` 要設計成 `mpi_t[1]` 用字串當成數入有什麼好處? 保證是連續記憶體? HOW? string literal 是啥?有多長? ## 3/5 討論 為什麼 `q_sort` 在元素為的時候 ``[dolphin bear gerbil]`` 的時候 `next_item->value` 為無法存取的記憶體,而在元素為 `[ddd ccc bbb aaa dolphin bear gerbil]` 的時候,`item->value` 和 `next_item->value` 會一樣都是 `ddd`? ```c // 把 target 的元素全部貼到 head 去,並排序 void merge_two(struct list_head *head, struct list_head *target, bool descend) { if (!head || !target) { return; } LIST_HEAD(pos); while (!list_empty(head) && !list_empty(target)) { element_t *l_entry = list_first_entry(head, element_t, list); element_t *r_entry = list_first_entry(target, element_t, list); if ((!descend && strcmp(l_entry->value, r_entry->value) <= 0) || (descend && strcmp(l_entry->value, r_entry->value) >= 0)) { list_move_tail(&l_entry->list, &pos); } else { list_move_tail(&r_entry->list, &pos); } } // 剩下的部份也貼上去 &pos ,最後再全部貼回 head if (list_empty(head)) list_splice_tail(target, &pos); if (list_empty(target)) list_splice_tail(head, &pos); list_splice(&pos, head); } ``` ```c /* Sort elements of queue in ascending/descending order */ void q_sort(struct list_head *head, bool descend) { if (!head || list_empty(head) || list_is_singular(head)) { return; } int count = q_size(head) / 2; struct list_head *slow = head, *fast = head->next; for (; count; slow = slow->next, fast = fast->next->next, count--) ; // 用 indirect pointer ,創一個指向 dummy node 的指標,代表後半部的 dummy node // 把後半部變成雙向鏈結串列 struct list_head **right = &slow->next; head->prev->next = *right; (*right)->prev = head->prev; slow->next->prev = *right; // 把前半部變成雙向鏈結串列 slow->next = head; head->prev = slow; q_sort(head, descend); q_sort(*right, descend); merge_two(head, *right, descend); } ``` #### 解析 do_sort ```c bool do_sort(int argc, char *argv[]) { if (argc != 1) { report(1, "%s takes no arguments", argv[0]); return false; } int cnt = 0; if (!current || !current->q) report(3, "Warning: Calling sort on null queue"); else cnt = q_size(current->q); error_check(); if (cnt < 2) report(3, "Warning: Calling sort on single node"); error_check(); set_noallocate_mode(true); /* If the number of elements is too large, it may take a long time to check the * stability of the sort. So, MAX_NODES is used to limit the number of elements * to check the stability of the sort. */ #define MAX_NODES 100000 struct list_head *nodes[MAX_NODES]; unsigned no = 0; if (current && current->size && current->size <= MAX_NODES) { element_t *entry; list_for_each_entry (entry, current->q, list) nodes[no++] = &entry->list; } else if (current && current->size > MAX_NODES) report(1, "Warning: Skip checking the stability of the sort because the " "number of elements %d is too large, exceeds the limit %d.", current->size, MAX_NODES); if (current && exception_setup(true)) q_sort(current->q, descend); exception_cancel(); set_noallocate_mode(false); bool ok = true; if (current && current->size) { for (struct list_head *cur_l = current->q->next; cur_l != current->q && --cnt; cur_l = cur_l->next) { /* Ensure each element in ascending/descending order */ element_t *item, *next_item; item = list_entry(cur_l, element_t, list); next_item = list_entry(cur_l->next, element_t, list); if (!descend && strcmp(item->value, next_item->value) > 0) { report(1, "ERROR: Not sorted in ascending order"); ok = false; break; } if (descend && strcmp(item->value, next_item->value) < 0) { report(1, "ERROR: Not sorted in descending order"); ok = false; break; } /* Ensure the stability of the sort */ if (current->size <= MAX_NODES && !strcmp(item->value, next_item->value)) { bool unstable = false; for (unsigned i = 0; i < MAX_NODES; i++) { if (nodes[i] == cur_l->next) { unstable = true; break; } if (nodes[i] == cur_l) { break; } } if (unstable) { report( 1, "ERROR: Not stable sort. The duplicate strings \"%s\" " "are not in the same order.", item->value); ok = false; break; } } } } #undef MAX_NODES q_show(3); return ok && !error_check(); } ``` ## 3/10 討論 ### 2025 q1 利用指標陣列 `begin` 和 `end` 來模擬堆疊 從堆疊中取出索引值對應的指標 `begin[i]` 和 `end[i]`並將值存在` L `和 `R` 之中 並將 `pivot` 設定為 `L` > node_t *L = begin[i], *R = end[i]; 示意圖如下 ```graphviz digraph graphname{ node[shape=box] rankdir = LR A[label=4] B[label=1] C[label=3] D[label=5] E[label=2] F[label=7] G[label=8] NULL[label=NULL,shape=plaintext] A->B->C->D->E->F->G->NULL { rank="same"; L[label="L",shape=plaintext,fontcolor=red] pivot[label="pivot",shape=plaintext,fontcolor=blue] pivot->L[color=blue] L->A[color=red]; } { rank="same"; R[label="R",shape=plaintext,fontcolor=red] R->G[color=red]; } } ``` 當 `L` 不等於 `R` 時, 將大於 pivot 的節點加入 `right` ,反之則加入 `left` ```graphviz digraph{ ## rankdir=TD rankdir=LR node [shape=box,color=black] //tp [shape=none,label=pivot,fontcolor=red] A [shape=none,label=left] B [shape=none,label=right] C [shape=none,label=pivot,fontcolor=blue] C->4 B->8->7->5 A->2->3->1 } ``` 接著分別用指標 `begin[i]` 和 `end[i]` 來記錄 `left` 的開頭與結尾、`begin[i+2]` 和 `end[i+2]` 來記錄 `right` 的開頭與結尾,而 `pivot` 則同時用 `begin[i+1]` 和 `end[i+1]` 來記錄 最後將堆疊的索引 `i` 更新為 `i+=2` (取出一層又放回三層) 是意圖如下: ```graphviz digraph graphname{ node[shape=box] rankdir = LR p [shape=none,label=pivot,fontcolor=blue] p->4 p1[shape=none,label="begin[i+1]",fontcolor=red] p1->4 p2[shape=none,label="end[i+1]",fontcolor=red] p2->4 A[label=8] B[label=7] C[label=5] D[label=2] E[label=3] F[label=1] R[label="R",shape=none,label=right] //NULL[label=NULL,shape=plaintext] right[shape=none,label="right",fontcolor=blue] right->A left[shape=none,label="left",fontcolor=blue] left ->D D->E->F A->B->C { rank="same"; L[label="begin[i+2]",shape=plaintext,fontcolor=red] //pivot[label="pivot",shape=plaintext,fontcolor=blue] //pivot->L[color=blue] L->A[color=red]; } { rank="same"; R[label="end[i]",shape=plaintext,fontcolor=red] R->F[color=red]; } { rank="same"; L1[label="begin[i]",shape=plaintext,fontcolor=red] //pivot[label="pivot",shape=plaintext,fontcolor=blue] //pivot->L[color=blue] L1->D[color=red]; } { rank="same"; R1[label="end[i+2]",shape=plaintext,fontcolor=red] R1->C[color=red]; } } ``` ```graphviz digraph graphname{ node[shape=box] subgraph cluster_0 { //rank="same"; rankdir = LR; style=filled; color=lightgrey; node [style=filled,color=white]; "begin[i+2]" "begin[i+1]" "begin[i]" ; label = "begin"; } subgraph cluster_1 { //rank="same"; rankdir = LR; style=filled; color=lightgrey; node [style=filled,color=white]; "end[i+2]" "end[i+1]" "end[i]" ; label = "end"; } } ``` 可以想像成這樣: ```graphviz digraph graphname{ node[shape=box] rankdir = LR node[shape=record]; map [label="stack |begin[i+2] end[i+2] |begin[i+1] end[i+1] |begin[i] end[i] "]; } ``` ## 4/22 課堂 # 叁:作業 ## lab0 ### TODO: - [ ] 測試不同的 hashtable 的結構對效能的影響 + 使用 bloom filter (不用做了 烏龍一場) - [ ] 2024 年 Linux 核心課程期末展示的錄影中,授課教師若干學員進行互動,其中提到「教授很難找」,是在討論什麼專案時? (小寫) > simplefs - [ ] 作業說明提到「前述提過 qtest 本身就是個命令直譯器」,但直譯器是啥? ### 問題: 1. fork 完之後在本地端 `lab0-c` 底下執行 `$ make`,會出現如下錯誤: ``` Fatal: GitHub Actions MUST be activated. Check this article: https://docs.github.com/en/actions/managing-workflow-runs/disabling-and-enabling-a-workflow Then, execute 'make' again. make: *** [Makefile:37: .git/hooks/applied] Error 1 ``` > 第一次在本地端 linux 開發 git,應該要設定 git 開發環境 ``` $ git config --global user.name 你偏好的英文或拼音名稱 $ git config --global user.email 你的電子郵件信箱 ``` > 如此可避免執行自動測試時拋出錯誤訊息 2. 執行 git pre-commit hook 檢查會遇到以下問題: ``` $ git commit -a Following files need to be cleaned up: queue.c Running static analysis... Implement q_free to properly free queue elements [line 1] - Avoid using non-American English words ``` 什麼是 non-American English words? 整段 commit message 如下: ``` Implement q_free to properly free queue elements Add q_free() to properly deallocate all dynamically allocated memory associated with the queue. This function iterates through the list, removes each element, and frees its allocated memory to prevent leaks. This implementation ensures that all elements, including their values, are properly deallocated before releasing the queue head. ``` 3. git commit 時被要求要使用 pointer to const code : ```c element *entry, *safe, *min; min = list_first_entry(head, element_t, list); ``` 在 git commit 時會被阻攔,得到以下訊息: ``` queue.c:293:15: style: Variable 'min' can be declared as pointer to const [constVariablePointer] q_release_element(entry); ^ queue.c:325:15: style: Variable 'min' can be declared as pointer to const [constVariablePointer] q_release_element(entry); ^ Fail to pass static analysis. ``` 見鬼啦, pointet to const 是啥? ## quiz1+2 ### 第一週測驗 1 只用指標來執行插入節點的版本: ```c void list_insert_before_naive(list_t *l, list_item_t *before, list_item_t *item) { list_item_t *pos = l->head; if (before == l->head) { item->next = before; l->head = item; return; } while (pos && pos->next != before) { pos = pos->next; } item->next = before; pos->next = item; } ``` # 肆:教材閱讀 ## Linux 核心的 hash table 實作 > [Linux 核心的 hash table 實作](https://hackmd.io/@sysprog/linux-hashtable) Perfect function -> 1 to 1 ,不會發生碰撞 -> 不可能做到 ### 1.解釋上述程式碼的運作原理。