2024q1 Homework2 (quiz1+2)

# 2024q1 Homework2 (quiz1+2) contributed by < `ssheep773` > ## 第 1 週測驗題 [2024q1 第 1 週測驗題](https://hackmd.io/@sysprog/linux2024-quiz1) ### 測驗 `1` 運作原理使用鏈結串列實作 `Quick sort` ，其中的 `node_t` 結構如下 ```c typedef struct __node { struct __node *left, *right; struct __node *next; long value; } node_t; ``` ```graphviz digraph node_t { node [shape= "record"]; rankdir= "LR"; subgraph cluster_0 { label= "node_t"; node1 [label= "<head>value | next | {<left> left | <right> right}"]; } } ``` 透過實際例子解釋程式流程: ```graphviz digraph initial { node [shape="box"]; node4 [label= "4"] node3 [label= "3"] node5 [label= "5"] node1 [label= "1"] node [shape="plaintext"]; L [fontcolor="blue"] R [fontcolor="blue"] pivot [fontcolor="red"] { rank="same"; node4->node3 node3->node5 node5->node1 } pivot->node4; L->node4; R->node1; } ``` 選擇 `L` 指向的節點 4 作為 `pivot` ，將 `pivot` 從串列移除，並比較剩餘節點和 `pivot` 的大小關係，使用 `list_add` 將小於 `pivot` 的節點加入 `left` ，大於 `pivot` 的節點加入 `right` 。 ```graphviz digraph initial { node [shape="box"]; node3 [label= "3"]; node1 [label= "1"]; node4 [label= "4"]; node5 [label= "5"]; node [shape="plaintext"]; left [fontcolor="blue"]; right [fontcolor="blue"]; pivot [fontcolor="red"]; left->node3; "begin[0]"->node3; node3->node1; node1->"end[0]" [dir=back]; pivot->node4; "begin[1]"->node4; node4->"end[1]" [dir=back]; right->node5; "begin[2]"->node5; node5->"end[2]" [dir=back]; } ``` `i = 2` ， `begin[2] == end[2]` 將 `begin[2]` 加入到 `result` ```graphviz digraph initial { rankdir = "LR" node [shape="box"]; node5 [label= "5"] result [shape="plaintext"]; "result"->node5; } ``` `i = 1` ， `begin[1] == end[1]` 將 `begin[1]` 加入到 `result` ```graphviz digraph initial { rankdir = "LR" node [shape="box"]; node4 [label= "4"] node5 [label= "5"] result [shape="plaintext"]; "result"->node4; node4->node5; } ``` 然後再回到 `i = 0` 重複上述的步驟完成排序 ```graphviz digraph initial { node [shape="box"]; node4 [label= "4"] node3 [label= "3"] node5 [label= "5"] node1 [label= "1"] result [shape="plaintext"]; { rank="same"; result->node1; node1->node3; node3->node4; node4->node5; } } ``` 此方法都是選擇最左邊的節點作為 `pivot` ，並且透過 `begin[n] == end[n]` 來確認是否完成排序，我們的結果是由小到大的排序，所以每次優先選擇 `right` 做排序 **改進方法** `pivot` 的選擇 `max_level` 的大小目前的選擇 `pivot` 的方法在處理一個排序好的遞增串列時，會使的時間複雜度 $O(n^2)$ 暴力解 : 可以在排序前先檢查串列是否為排序好的遞增串列，若符合擇不執行 `quick sort` 直接將串列反轉使用 median of three ，查看串列的前中後三個點的大小，取中間值作為 `pivot` 使用 median of median ，找尋串列的中位數作為 `pivot` 使用 Linux 核心風格的 List API 改寫上述程式碼，並針對鏈結串列，提出可避免最差狀況的快速排序實作，應設計效能評比的測試程式。 ### 使用 Linux 核心風格的 List API 改寫 Quick sort [commit a64bbcf](https://github.com/ssheep773/linux2024q1/commit/a64bbcfb9f90e1006048af17d99be88374b3d159) 首先引入 `list.h` 以及將測試程式改成測驗一中 timsort 使用的 `main.c` 。修改成 Linux 核心風格的 List API ，最需要修改的是排序節點的結構是定義在 `main.c` ，我們無法在 `quick.c` 中得知排序節點的架構，無法使用`list_entry` 等巨集，比較節點大小時則是透過呼叫比較函式 `cmp` ```graphviz digraph { node [shape=plaintext]; "Pointer to pointer:" -> "*begin[ ]:" -> "list_head:" [color=white]; node [shape=record, fontcolor=black, ]; pointers [label="<f0> **begin ", color=white]; values [label="<f0> *begin[0] | <f1> begin[1] | <f2> *begin[2] | <f3> *begin[3] | <f4> ... | <f5> *begin[max_level]", ]; indices [label="<head> head"]; { rank=same; "Pointer to pointer:"; pointers } { rank=same; "*begin[ ]:"; values } { rank=same; "list_head:"; indices } edge [color=black]; pointers:f0 -> values:f0; values:f0 -> indices:f0; } ``` 上面是修改後的結構圖，會在排序開始時配置整個 `**begin` ```c struct list_head **begin = malloc(sizeof(struct list_head *) * max_level); ``` 在排序過程中配置 `*begin[]` 指向 `head` 用來串接排序的串列，其中因為最後是將結果接回輸入的 `head` 需要初始化避免出錯，所以使用`list_splice_tail_init(head, begin[0])` ```c begin[0] = malloc(sizeof(struct list_head)); INIT_LIST_HEAD(begin[0]); list_splice_tail_init(head, begin[0]); ``` 我們可以透過檢查串列只有一個節點或是為空來決定要分割還是合併串列 ```diff - if (L != R) + if (!list_is_singular(begin[i]) && !list_empty(begin[i])) ``` ### 使用 link-list 建立 `begin` 上面的方法中我們在一開始就考慮最差的情況，建立最大的可能使用的 `begin[]` ，這在一般隨機的情況下會產生資源浪費（這在後續的實驗會證明），而且要求配置龐大的連續空間，也存在失敗的風險，我希望 `begin[]` 可以改成以鏈結串列的方式建立建立 `begin_t` 的架構，並且紀錄串列最長時的 `begin_t` 個數 ```c typedef struct begin_t{ struct list_head blist; // 串接其他的 begin_t struct list_head *head; // 串接排序的串列 } begin_t; ``` 因為 `begin_t` 是定義在 `quick.[ch]` 所以我們可以使用 `lis_entry` 等巨集來到實驗的部分，當樣本數是 10000 時的最壞情況， `begin_t` 個數來到 19999 ，符合前面用 `begin[]` 建立時的大小 ``` Creating sample ==== Testing quick_sort ==== max begin size = 19999 Comparisons: 49995000 check: List is sorted ``` 在相同樣本中一般情況下，串列最長時的 `begin_t` 個數約在 2100 個左右，可以發現個數大約是原始最大值的 10% ，減少了 90% 的空間，並且是非連續的記憶體建制。 ``` Creating sample ==== Testing quick_sort ==== max begin size = 2154 Comparisons: 5043928 check: List is sorted ``` 比較使用鏈結串列與陣列建立 begin 的記憶體開銷首先是使用陣列建立的記憶體開銷 ![memory_array](https://hackmd.io/_uploads/Hyn6nKQfR.png) 在來是使用鏈結串列動態建立的記憶體開銷可以看到記憶體的起伏，說明記憶體的配置與釋放 ![linklist](https://hackmd.io/_uploads/S1862tQGA.png) ### 改進 `pivot` 的選擇使用 middle of three [commit f14fa6a](https://github.com/ssheep773/linux2024q1/commit/f14fa6af6b845fa7f7417090cd70838244596b57) 比較串列的第一個、最後一個與中間的節點數值，選擇他們的中間值作為 `pivot` ，中間的節點是透過快慢指標取得，其中的 `snode->head->next->next->next` 是為了要確認目前的串列至少有兩個節點，否則使用快慢指標會進入無窮迴圈， ```c struct list_head *pivot; if (begin[i]->next->next->next != begin[i]) { struct list_head *first = begin[i]->next; struct list_head *last = begin[i]->prev; struct list_head *mid = begin[i]->next; for (struct list_head *fast = begin[i]->next; fast != begin[i] && fast->next != begin[i]; fast = fast->next->next) { mid = mid->next; } pivot = findMiddle(first, mid, last, priv, cmp); } else { pivot = begin[i]->next; } ``` 若排序一個排序好的串列時，原始的 quick sort 會是最壞的情況 (worse case) ，而改成使用 middle of three 後，在最壞的情況下可以減少約 99% 的比較次數。 ``` Creating sample ==== Testing quick_sort ==== Comparisons: 49995000 check: List is sorted ==== Testing quick_sort_mid ==== Comparisons: 121821 check: List is sorted ``` 但是在隨機串列的情況下，直接使用串列首個節點作 `pivot` 的比較次數會較少，並且 middle of three 會有嚴重 unstable 的情況，還需要改進。 (上面提到的最壞的情況不會有 unstable 的情況，是因為樣本中沒有重複的數字) ``` Creating sample ==== Testing quick_sort ==== Comparisons: 5037252 check: List is sorted ==== Testing quick_sort_mid ==== Comparisons: 5066375 ERROR: unstable 5001 check: ERROR: unstable 5001 List is not sorted ``` [commit d0f0990](https://github.com/ssheep773/linux2024q1/commit/d0f099057a2362f7f9fc6d04b9a3ee0f49806aef) 修改 `findMiddle` 找中間數值的函式，這次的修改主要是確保當 `a == b == c` 時，會優先選擇 `a` 減少 unstable 的發生，而 `cmpBC` 可以透過 `cmpAB - cmpAC` 得到，可以減少一次比較 ```c struct list_head *findMiddle(struct list_head *a, struct list_head *b, struct list_head *c, void *priv, list_cmp_func_t cmp) { int cmpAB = cmp(priv, a, b); int cmpAC = cmp(priv, a, c); int cmpBC = cmpAB - cmpAC; if ( (cmpAB <= 0 && cmpAC >= 0) || (cmpAB >= 0 && cmpAC <= 0) ) { return a; } if ( (cmpAB >= 0 && cmpBC >= 0) || (cmpAB <= 0 && cmpBC <= 0)) { return b; } return c; } ``` 修改後 unstable 的情況大幅減少，但是仍然會發生。 ``` Creating sample ==== Testing quick_sort ==== Comparisons: 5034315 check: List is sorted ==== Testing quick_sort_mid ==== Comparisons: 5067210 ERROR: unstable 10 check: ERROR: unstable 10 List is not sorted ``` 這主要是因為在前幾次分割時，選擇前中後三個數字時造成的，以例子說明：在下面的例子中節點以 `數值.次序` 表示，可以看到因為 `pivot = 9.10` ，而我們是用 ` < pivot <= ` 來劃分，這樣數值相同但是次序較前的 `9.7` 會被排到 `9.10` 後面產生 unstable 的情況，而若改成 ` <= pivot < ` 則換成 `9.15` 會產生 unstable 的情況，所以目前的方法還是 unstable 。 ``` list = 3.0 5.1 5.2 6.3 2.4 5.5 8.6 9.7 4.8 4.9 9.10 5.11 8.12 6.13 3.14 9.15 0.16 5.17 7.18 7.19 first= 3.0, mid= 9.10 ,last= 7.19 => pivot=9.10 begin[0] : 3.0 5.1 5.2 6.3 2.4 5.5 8.6 4.8 4.9 5.11 8.12 6.13 3.14 0.16 5.17 7.18 7.19 begin[1] : 9.10 begin[2] : 9.7 9.15 ``` ### 測驗 `2` Timsort 程式碼理解這裡使用的 `merge` 方法與 merge sort 使用的 one-pair-at-a-time mode 相同 ```c static struct list_head *merge(void *priv, list_cmp_func_t cmp, struct list_head *a, struct list_head *b) { struct list_head *head; struct list_head **tail = AAAA; for (;;) { /* if equal, take 'a' -- important for sort stability */ if (cmp(priv, a, b) <= 0) { *tail = a; tail = &a->next; a = a->next; if (!a) { *tail = b; break; } } else { *tail = b; tail = &b->next; b = b->next; if (!b) { *tail = a; break; } } } return head; } ``` `build_prev_link` 將單向鏈結串列 `list` 加入雙向鏈結串列中，因為 Timsort 在排序過程中，會將原本的雙向鏈結斷掉 ```c static void build_prev_link(struct list_head *head, struct list_head *tail, struct list_head *list) { tail->next = list; do { list->prev = tail; tail = list; list = list->next; } while (list); /* The final links to make a circular doubly-linked list */ tail->next = head; head->prev = tail; } ``` `merge_final` 與前面提到的 `merge` 差異是合併時是雙向鏈結串列，並且如果 `a` 走訪完畢則結束，但若是 `b` 走完則會將 `a` 放入 `b` ，再透過 `build_prev_link(head, tail, b)` 建立剩餘節點的雙向鏈結，並且這樣撰寫的優點是我們不必分別確認 `a` 跟 `b` 是否為走訪完 `merge_at` : 函式是將 `tp` 和 `tp->prev` 合併至 `tp` `find_run` ```c static struct pair find_run(void *priv, struct list_head *list, list_cmp_func_t cmp) { size_t len = 1; struct list_head *next = list->next, *head = list; struct pair result; if (!next) { result.head = head, result.next = next; return result; } if (cmp(priv, list, next) > 0) { /* decending run, also reverse the list */ struct list_head *prev = NULL; do { len++; list->next = prev; prev = list; list = next; next = list->next; head = list; } while (next && cmp(priv, list, next) > 0); list->next = prev; } else { do { len++; list = next; next = list->next; } while (next && cmp(priv, list, next) <= 0); list->next = NULL; } head->prev = NULL; head->next->prev = (struct list_head *) len; result.head = head, result.next = next; return result; } ``` 使用圖解的方式說明 : 這裡參考 [SH](https://hackmd.io/@ShchangAnderson/linux2024-homework2#%E6%B8%AC%E9%A9%97-2) 同學的圖解第一種情況 : 升序排列的節點 * 首先將 head 指向鏈結串列的開頭， next 指向 list 的下一個節點 ```graphviz digraph initial { node [shape="box"]; node1 [label= "1", color="red"] node4 [label= "4", color="red"] node5 [label= "5"] node2 [label= "2"] node3 [label= "3"] node [shape="plaintext"]; head [fontcolor="blue"] list [fontcolor="blue"] next [fontcolor="red"] { rank="same"; node1->node4 node4->node5 node5->node2 node2->node3 node3->"NULL" } head->node1; next->node4; list->node1; } ``` * `next` 指標中節點的值不小於 `list` 指標中節點的值，則繼續走訪，直到找到非升序的節點為止 ```graphviz digraph initial { node [shape="box"]; node1 [label= "1"] node4 [label= "4"] node5 [label= "5", color="red"] node2 [label= "2", color="red"] node3 [label= "3"] node [shape="plaintext"]; head [fontcolor="blue"] list [fontcolor="blue"] next [fontcolor="red"] { rank="same"; node1->node4 node4->node5 node5->node2 node2->node3 node3->"NULL" } head->node1; next->node2; list->node5; } ``` * 離開迴圈後回傳串列的開頭以及 next 指標，使得下一輪能夠繼續走訪並找到下一組 run ```graphviz digraph initial { node [shape="box"]; node1 [label= "1"] node4 [label= "4"] node5 [label= "5", color="red"] node2 [label= "2", color="red"] node3 [label= "3"] node [shape="plaintext"]; head [fontcolor="blue"] list [fontcolor="blue"] next [fontcolor="red"] null [label = "NULL"] { rank="same"; "result.head"->node1; node1->node4 node4->node5 node5->"NULL" } { rank="same"; "result.next"->node2; node2->node3 node3->null } next->node2; list->node5; head->node1; } ``` 第二種情況 : 降序排列的節點 * 一樣首先將 `head` 指向鏈結串列的開頭， `next` 指向 `list` 的下一個節點，不過這種情況多一個 `prev` ```graphviz digraph initial { node [shape="box"]; node1 [label= "1"] node4 [label= "4", color="red"] node5 [label= "5", color="red"] node2 [label= "2"] node3 [label= "3"] node [shape="plaintext"]; head [fontcolor="blue"] list [fontcolor="blue"] next [fontcolor="red"] prev [fontcolor="red"] null [label="NULL"] { rank="same"; node5->node4 node4->node1 node1->node2 node2->node3 node3->"NULL" } next->node4; list->node5; head->node5; prev->null; } ``` * 第一次執行後結果 ```graphviz digraph initial { node [shape="box"]; node1 [label= "1"] node4 [label= "4", color="red"] node5 [label= "5", color="red"] node2 [label= "2"] node3 [label= "3"] node [shape="plaintext"]; head [fontcolor="blue"] list [fontcolor="blue"] next [fontcolor="red"] prev [fontcolor="red"] null [label="NULL"] { rank="same"; node4->node1 node1->node2 node2->node3 node3->"NULL" node5->null; } next->node1; list->node4; head->node4; prev->node5; } ``` * 持續走訪直到 next 節點的值大於(或等於) list 節點的值 ```graphviz digraph initial { node [shape="box"]; node1 [label= "1", color="red"] node4 [label= "4", color="red"] node5 [label= "5"] node2 [label= "2"] node3 [label= "3"] node [shape="plaintext"]; head [fontcolor="blue"] list [fontcolor="blue"] next [fontcolor="red"] prev [fontcolor="red"] null [label="NULL"] { rank="same"; node1->node2 node2->node3 node3->"NULL" node4->node5; node5->null; } next->node2; list->node1; head->node1; prev->node5; } ``` * 最終離開迴圈後回傳串列的開頭以及 `next` 指標，使得下一輪能夠繼續走訪並找到下一組 run，可以發現走訪過的節點皆被反轉。 ```graphviz digraph initial { node [shape="box"]; node1 [label= "1", color="red"] node4 [label= "4", color="red"] node5 [label= "5"] node2 [label= "2"] node3 [label= "3"] node [shape="plaintext"]; list [fontcolor="blue"] next [fontcolor="red"] null [label="NULL"] { rank="same"; "result.head"->node1 "result.next"->node2 node2->node3 node3->"NULL" node1->node4; node4->node5; node5->null; } next->node2; list->node5; } ``` 接著，timsort 會檢視目前的 runs 是否可以合併，這是 timsort 的一個特色，會在分割的過程中進行合併，達到對 cache 友善的效果。 `merge_collapse` 是 timsort 檢查 stack 中頂層 3 個 run 的大小是否滿足以下兩個條件 : ( X在最上層 ) * Z > Y + X * Y > X 若沒有符合則比較 X 、 Z 的大小，較小的與 Y 合併，而如果只有 2 個 run 時則只需滿足第二個條件 X = `run_size(tp)` Y = `run_size(tp->prev)` Z = `run_size(tp->prev->prev)` 在下面的程式碼我們不只看頂層 3 個 run ，也看頂層 2 到 4 層是否滿足上述的條件 ```c static struct list_head *merge_collapse(void *priv, list_cmp_func_t cmp, struct list_head *tp) { int n; while ((n = stk_size) >= 2) { if ((n >= 3 && run_size(tp->prev->prev) <= run_size(tp->prev) + run_size(tp)) || (n >= 4 && run_size(tp->prev->prev->prev) <= run_size(tp->prev->prev) + run_size(tp->prev))) { if (run_size(tp->prev->prev) < run_size(tp)) { tp->prev = merge_at(priv, cmp, tp->prev); } else { tp = merge_at(priv, cmp, tp); } } else if (run_size(tp->prev) <= run_size(tp)) { tp = merge_at(priv, cmp, tp); } else { break; } } return tp; } ``` `merge_force_collapse` 這裡的合併就是不斷抓出頂部三個 run ，比較第一層與第三層的大小，較小的與第二層合併，直到剩下小於 3 run。合併時只跟上下層合併是為了保持排序的 stable `timsort` 首先將鏈結串列斷成單向，接著執行 `find_run` 將串列拆分成多個 run 存在 tp 中並透過 `merge_collapse` 確保符合timsort 對 run 的兩個條件。接著執行 `merge_force_collapse` 使 `stk_size < 3` ，而我們會有剩一個 run 以及剩兩個 run 的情況，若剩一個則執行 `build_prev_link` 將鏈結串列串成雙向則完成，否則呼叫`merge_final` 將最後兩個 run 合併 ```c void timsort(void *priv, struct list_head *head, list_cmp_func_t cmp) { stk_size = 0; struct list_head *list = head->next, *tp = NULL; if (head == head->prev) return; /* Convert to a null-terminated singly-linked list. */ head->prev->next = NULL; do { /* Find next run */ struct pair result = find_run(priv, list, cmp); result.head->prev = tp; tp = result.head; list = result.next; stk_size++; tp = merge_collapse(priv, cmp, tp); } while (list); /* End of input; merge together all the runs. */ tp = merge_force_collapse(priv, cmp, tp); /* The final merge; rebuild prev links */ struct list_head *stk0 = tp, *stk1 = stk0->prev; while (stk1 && stk1->prev) stk0 = stk0->prev, stk1 = stk1->prev; if (stk_size <= 1) { build_prev_link(head, head, stk0); return; } merge_final(priv, cmp, head, stk1, stk0); } ``` 目前疑惑這段看起來是走訪到 stack 的底部，但不知道其中用意 ```c /* The final merge; rebuild prev links */ struct list_head *stk0 = tp, *stk1 = stk0->prev; while (stk1 && stk1->prev) stk0 = stk0->prev, stk1 = stk1->prev; ``` 目前的 timsort 時做並未實作 minrun 與 galloping mode ，預計加入這兩個方法並分析效能將 timsort 引入lab0 中 [commit bcd1a78](https://github.com/ssheep773/lab0-c/commit/bcd1a78bfaa5a7b7d16a18161a3839ba05e0eab0) ## 第 2 週測驗題 [2024q1 第 2 週測驗題](https://hackmd.io/@sysprog/linux2024-quiz2) ### 測驗 `1` #### 運用 [Linux 核心的 hash table 實作](https://hackmd.io/@sysprog/linux-hashtable) 構建二元樹首先看主函式 `buildTree`，使用中序 (inorder) 的序列建立 hash table ，使用 `INIT_HLIST_HEAD` 初始化 hash table ，再透過 `node_add` 建立 hash table，最後用 `dfs` 搭配建立的 hash table 建構二元樹 ```c static struct TreeNode *buildTree(int *preorder, int preorderSize, int *inorder, int inorderSize) { struct hlist_head *in_heads = malloc(inorderSize * sizeof(*in_heads)); for (int i = 0; i < inorderSize; i++) INIT_HLIST_HEAD(&in_heads[i]); for (int i = 0; i < inorderSize; i++) node_add(inorder[i], i, inorderSize, in_heads); return dfs(preorder, 0, preorderSize - 1, inorder, 0, inorderSize - 1, in_heads, inorderSize); } ``` 先觀察 `node_add` 如何建立 hash table ，根據 `hash` 變數決定我們在 hash table 內的位置，透過 `hlist_add_head` 函式將其加至 hash table 中 ```c static inline void node_add(int val, int idx, //在中序的位置 int size, struct hlist_head *heads) { struct order_node *on = malloc(sizeof(*on)); on->val = val; on->idx = idx; int hash = (val < 0 ? -val : val) % size; hlist_add_head(&on->node, ＆heads[hash]); } ``` 可以得到如下的 hash table，值得關注的是 `order_node` 還儲存 `inorder` 串列的索引值 `idx` ```graphviz digraph so { rankdir=LR; {rank=same inh4 inh3 inh2 inh1 inh0} inh0 [label="in_heads[0]"; shape="none"]; inh1 [label="in_heads[1]"; shape="none"]; inh2 [label="in_heads[2]"; shape="none"]; inh3 [label="in_heads[3]"; shape="none"]; inh4 [label="in_heads[4]"; shape="none"]; node [shape = "record"]; 15 [label="{val : 15 | idx : 2}"]; 20 [label="{val : 20 | idx : 3}"]; 7 [label="{val : 7 | idx : 4}"]; 9 [label="{val : 9 | idx : 0}"]; 3 [label="{val : 3 | idx : 1}"]; inh0 -> 20 -> 15 inh2 -> 7 inh3 -> 3 inh4 -> 9 } ``` 接者是 `hlist_add_head` ，需在加入前先檢查原先是否有節點存在，若存在則須將原先第一個節點的 `pprev` 指向 `&n->next` ，詳細內容的可以參見 [Linux 核心的 hash table 實作](https://hackmd.io/@sysprog/linux-hashtable) ```c static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h) { if (h->first) h->first->pprev = &n->next; n->next = h->first; n->pprev = &h->first; h->first = n; } ``` 在使用中序 (inorder) 建立 hash table 後，接著看到 `dfs` 輸入的變數 : `preorder` `pre_low` `pre_high` ：前序序列與節點搜尋範圍 `inorder` `in_low` `in_high` ：中序序列與節點搜尋範圍 `in_heads` ：中序建立的 hash table `size` ：是 `inorderSize` 同時也是 hash table 的大小 ```c static struct TreeNode *dfs(int *preorder, int pre_low, int pre_high, int *inorder, int in_low, int in_high, struct hlist_head *in_heads, int size) { if (in_low > in_high || pre_low > pre_high) return NULL; struct TreeNode *tn = malloc(sizeof(*tn)); tn->val = preorder[pre_low]; int idx = find(preorder[pre_low], size, in_heads); tn->left = dfs(preorder, pre_low + 1, pre_low + (idx - in_low), inorder, in_low, idx - 1, in_heads, size); tn->right = dfs(preorder, pre_high - (in_high - idx - 1), pre_high, inorder, idx + 1, in_high, in_heads, size); return tn; } ``` `tn->val = preorder[pre_low]` 因為是前序 `tn` 必為父點，透過 `find` 得知 `tn` 在中序的位置 `idx` ，我們可以透過 `idx` 得知節點是如何劃分的 ```graphviz digraph so { rankdir=TB; left [color=red, label="9"] right [color=red, label="15 20 7"] root [color=blue;label="3"] root -> left root -> right } ``` 如此遞迴下去即可建構二元樹最後來看 `find` 尋找節點 `val` 在中序的位置 `idx` ，而使用 hash table 可以快速地找到目標節點資訊，其中 `hlist_for_each (p, &heads[hash])` 為走訪特定索引中的所有節點，找到相同 `val` 並回傳 `idx` ```c static int find(int num, int size, const struct hlist_head *heads) { struct hlist_node *p; int hash = (num < 0 ? -num : num) % size; hlist_for_each (p, &heads[hash]) { struct order_node *on = list_entry(p, struct order_node, node); if (num == on->val) return on->idx; } return -1; } ```  **TODO** 1. 指出其中可改進之處並實作 2. 在 Linux 核心的 cgroups 相關原始程式碼，找出 pre-order walk 程式碼並探討 ### 測驗 `2` #### 理解程式碼 `LRUCache` 結構 ```c typedef struct { int capacity; // 快取的最大容量 int count; // 目前已儲存的節點數量 struct list_head dhead; // 儲存快取中的節點 struct hlist_head hhead[]; // 快速查找特定鍵值的節點 } LRUCache; ``` `lRUCacheCreate` 新增 LRU Cache 並初始化 :::warning 疑問 : 在配置 hhead[] 的記憶體時，為何是使用 `sizeof(struct list_head))` 而不是 `hlist_head` 是否因為空間相同？ ::: ```c LRUCache *lRUCacheCreate(int capacity) { LRUCache *cache = malloc(2 * sizeof(int) + sizeof(struct list_head) + capacity * sizeof(struct list_head)); cache->capacity = capacity; cache->count = 0; INIT_LIST_HEAD(&cache->dhead); for (int i = 0; i < capacity; i++) INIT_HLIST_HEAD(&cache->hhead[i]); return cache; } ``` `lRUCacheFree` 釋放先前分配的記憶體並清除 LRU 快取我們透過 `list_for_each_safe` 走訪快取中的所有節點，而節點中 `link` 變數表示節點儲存的資料位置，我們在刪除節點 `free(cache)` 前需要先將 `link` 刪除 ```c void lRUCacheFree(LRUCache *obj) { struct list_head *pos, *n; list_for_each_safe (pos, n, &obj->dhead) { LRUNode *cache = list_entry(pos, LRUNode, link); list_del(&cache->link); free(cache); } free(obj); } ``` `lRUCacheGet` 使用 hash table 能夠更快速找到節點，透過 `LRUNode *cache = list_entry(pos, LRUNode, node)` 我們走訪特定索引值內中的節點，而且同一個索引值節點之間是用 `node` 連接。並且若我們找到對應的 `key` 會將該節點移至 `dhead` 的開頭。 ```c int lRUCacheGet(LRUCache *obj, int key) { int hash = key % obj->capacity; struct hlist_node *pos; hlist_for_each (pos, &obj->hhead[hash]) { LRUNode *cache = list_entry(pos, LRUNode, node); if (cache->key == key) { list_move(&cache->link, &obj->dhead); return cache->value; } } return -1; } ``` `lRUCachePut` 可以分為三種情況： 1. 已有相同的節點存在與 `lRUCacheGet` 作法相似，因為都在找尋相同 `key` 值，差異在於最後找到的節點 c 賦值給 cache。 2. 沒有相同的節點存在，快取已滿，需要刪除節點取得 `dhead` 尾部的節點，表示是最久未被使用的節點，將它移動至 `dhead` 開頭並刪除，最後在刪除的節點位置新增新的節點資訊 3. 沒有相同的節點存在，快取未滿，可直接加入為 `cache` 配置新的記憶體位置，將節點加入 head table 的索引及 `dhead` 開頭，最後更新目前的快取數量在 `lRUCachePut` 的部分，情況 2 使用了原本刪除節點的記憶體位置，省去新配置記憶體的失敗風險，並且因為是額滿的情況也無需更新 `obj->count` ```c void lRUCachePut(LRUCache *obj, int key, int value) { LRUNode *cache = NULL; int hash = key % obj->capacity; struct hlist_node *pos; hlist_for_each (pos, &obj->hhead[hash]) { LRUNode *c = list_entry(pos, LRUNode, node); if (c->key == key) { list_move(c->link, &obj->dhead); cache = c; } } if (!cache) { if (obj->count == obj->capacity) { cache = list_last_entry(&obj->dhead, LRUNode, link); list_move(&cache->link, &obj->dhead); hlist_del(&cache->node); hlist_add_head(&cache->node, &obj->hhead[hash]); } else { cache = malloc(sizeof(LRUNode)); hlist_add_head(&cache->node, &obj->hhead[hash]); list_add(&cache->link, &obj->dhead); obj->count++; } cache->key = key; } cache->value = value; } ``` 改進之處並實作 `lRUCacheGet` 中會將剛尋找到的節點移至 `dhead` 的開頭，我們也應該將節點移至 `hlist_head` 的開頭更快速的從 hash table 中找到。 ```diff int lRUCacheGet(LRUCache *obj, int key) { int hash = key % obj->capacity; struct hlist_node *pos; hlist_for_each (pos, &obj->hhead[hash]) { LRUNode *cache = list_entry(pos, LRUNode, node); if (cache->key == key) { list_move(&cache->link, &obj->dhead); + hlist_del(&cache->node); + hlist_add_head(&cache->node, &obj->hhead[hash]); return cache->value; } } return -1; } ``` 延伸問題: 在 Linux 核心找出 LRU 相關程式碼並探討 ### 測驗 `3` #### 在指定的記憶體空間找出第 N 個設定的位元解釋程式碼的運作原理首先看到 `find_nth_bit` 他有三個變數 `addr` : 指向要尋找的位元組的指標 `size` : 位元的搜尋範圍 `n` : 需要確認是否為 1 的位元位置我們會先檢查 n 是否超出搜尋範圍，接著我們需要先了解 `small_const_nbits` `GENMASK` 這兩個巨集的作用 ```c static inline unsigned long find_nth_bit(const unsigned long *addr, unsigned long size, unsigned long n) { if (n >= size) return size; if (small_const_nbits(size)) { unsigned long val = *addr & GENMASK(size - 1, 0); return val ? fns(val, n) : size; } return FIND_NTH_BIT(addr[idx], size, n); } ``` `small_const_nbits` 透過 `builtin_constant_p` 檢查 `nbits` 是否在編譯時就已經確立，並且檢查是否介於範圍 `0 < nbit <= BITS_PER_LONG` ```c #define small_const_nbits(nbits) \ (__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0) ``` `GENMASK` 產生長度為 `BITS_PER_LONG` 並且在 `h` 位元到 `l` 位元之間為 1 的 bitmask ```c #define GENMASK(h, l) \ (((~0UL) - (1UL << (l)) + 1) & (~0UL >> (BITS_PER_LONG - 1 - (h)))) ``` 再知曉 `small_const_nbits` `GENMASK` 的作用後，看到下面程式碼，先確認變數是否在編譯時就確立後，並生成一個由 0 到 (size-1) 都為 1 的 mask ，與 addr 做交集，高於 size 的位元會被歸零，如此只保留搜尋範圍內的位元這個判斷條件的主要是在 size 足夠小時，可以用更簡單快速的方法來找到第 n 個為 1 位元，而不需要調用到較複雜的 `FIND_NTH_BIT` ```c if (small_const_nbits(size)) { unsigned long val = *addr & GENMASK(size - 1, 0); ``` 之後再透過 `fns` 尋找第 nth 位元，`fns` 目的是找到第 n 個被設為 1 的位置，我們呼叫 `__fs` 尋找位元組中設為 1 的最低位元位置，並用 `__clear_bit` 將那個位置歸零避免重複找到，當 n = 1 時表示找到第 n 個被設為 1 的位置 ```c static inline unsigned long fns(unsigned long word, unsigned int n) { while (word) { unsigned int bit = __ffs(word); if (n-- == 0) return bit; __clear_bit(bit, &word); } return BITS_PER_LONG; } ``` 接者來看 `FIND_NTH_BIT` 使用 `idx` 以 `BITS_PER_LONG` 長度為一個區間走訪，透過 `hweight_long` 可以得知區間內的 1 個數，若個數大於 `nr` 則代表目標 `nr` 在這個區間內，再呼叫 `fns` 在該區間內尋找確切的位置，並且 `fns` 結果需加上目前所在區間位置 `idx * BITS_PER_LONG` ，才是最後的結果反之若 `hweight_long` 區間內的 1 個數不超過 `nr` 代表，目標 `nr` 不再區間內，往下一個區間尋找，直到超過搜尋範圍 `sz` ```c #define FIND_NTH_BIT(FETCH, size, num) \ ({ \ unsigned long sz = (size), nr = (num), idx, w, tmp; \ \ for (idx = 0; (idx + 1) * BITS_PER_LONG <= sz; idx++) { \ if (idx * BITS_PER_LONG + nr >= sz) \ goto out; \ \ tmp = (FETCH); \ w = hweight_long(tmp); \ if (w > nr) \ goto found; \ \ nr -= w; \ } \ \ if (sz % BITS_PER_LONG) \ tmp = (FETCH) & BITMAP_LAST_WORD_MASK(sz); \ found: \ sz = min(idx * BITS_PER_LONG + fns(tmp, nr), sz); \ out: \ sz; \ }) ``` 延伸問題：在 Linux 核心原始程式碼找出 find_nth_bit 的應用案例，並予以解說和分析，需要涵蓋 CPU affinity 相關的原始程式碼。