# 2021q1 Homework (quiz1) contributed by < `hankluo6` > > [第 1 週測驗題](https://hackmd.io/@sysprog/linux2021-quiz1) ### 解題思路 #### `LLL` - [x] `(c) left = &((*left)->next)` 從函式名稱 `list_concat` 可知應為將兩個 list 做 [concatenation](https://dictionary.cambridge.org/dictionary/english/concatenation),`LLL` 為在迴圈內尋找 list 的尾端,且 left 的型態為 `node_t **`,為 node pointer 的 "address",故答案為 `&((left)->next`。 #### `AAA` - [x] `(e) &right` `AAA` 為 `n->value > value` 時要將 node `n` 加入到的 list,此時 `n` 為目前 `while` 迴圈中操作的 node,`value` 為 pivot 節點上的值 (此例為第一個節點的值),但這樣並不能判斷 left list 跟 right list 兩者間的大小關係,觀察下方程式碼中的 `list_concat(&result, left);` 先出現可知 left list 的值較小,故 `AAA` 應為 `right` 表示 value 比 pivot 的 value 大時加入到 right 的 list 中。 #### `BBB` - [x] `(c) &left` 反之,value 小於等於 pivot 的 value 時加入到 left 的 list 中。 #### `CCC` - [x] `(b) list_concat(&result, pivot); list_concat(&result, right);` 再依照大小順序將 list 連接即可。 ### 程式原理 ```cpp= static inline void list_add_node_t(node_t **list, node_t *node_t) { node_t->next = *list; *list = node_t; } ``` 將 `node_t` 加入到 `list` 的前端,並將之設定為新的 head。因為傳入的 `list` type 為 pointer to pointer,所以可以很優雅地處理 head 為 NULL 時的問題。 * 第 2 行 : `node_t->next = *list;` ```graphviz digraph linked_list{ rankdir = LR; node [shape = record]; subgraph cluster_0 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "list(0x04)"; "NULL(0x00)"; } "node_t(0x08)"->"NULL(0x00)" } ``` * 第 3 行`*list = node_t` ```graphviz digraph linked_list{ rankdir = LR; node [shape = record]; subgraph cluster_0 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "list(0x04)"; "node_t(0x08)"; } "node_t(0x08)"->"NULL(0x00)" } ``` ```cpp= static inline void list_concat(node_t **left, node_t *right) { while (*left) left = &((*left)->next); *left = right; } ``` 取得 `left` 中最後的節點的 address,並把 `right` 的 list 移動到該位置中。因 `right` 本身的結構沒有更改,故原先節點在 `right` 中的順序在串聯後的 list 依然保持不變。 * 滿足 while 迴圈條件: 第 1 次 $\to$ `left = &((*left)->next)` ```graphviz digraph linked_list{ rankdir = LR; node [shape = record]; subgraph cluster_0 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&A(0x14)"; "A(0x04)"; } subgraph cluster_1 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&B(0x18)"; "B(0x08)"; } subgraph cluster_2 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&C(0x1b)"; "C(0x0b)"; } subgraph cluster_3 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&D(0x1d)"; "NULL(0x00)"; } "A(0x04)"->"B(0x08)" "B(0x08)"->"C(0x0b)" "C(0x0b)"->"NULL(0x00)" "right(0x0d)" node [shape=plaintext, fontcolor=black, fontsize=15]; "left=&A(0x14), *left=A(0x04)"; "right(0x0d)"->"A(0x04)" [style=invis] } ``` * 滿足 while 迴圈條件: 第 2 次 $\to$ `left = &((*left)->next)` ```graphviz digraph linked_list{ rankdir = LR; node [shape = record]; subgraph cluster_0 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&A(0x14)"; "A(0x04)"; } subgraph cluster_1 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&B(0x18)"; "B(0x08)"; } subgraph cluster_2 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&C(0x1b)"; "C(0x0b)"; } subgraph cluster_3 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&D(0x1d)"; "NULL(0x00)"; } "A(0x04)"->"B(0x08)" "B(0x08)"->"C(0x0b)" "C(0x0b)"->"NULL(0x00)" "right(0x0d)" node [shape=plaintext, fontcolor=black, fontsize=15]; "left=&B(0x18), *left=B(0x08)"; "right(0x0d)"->"A(0x04)" [style=invis] } ``` * 滿足 while 迴圈條件: 第 3 次 $\to$ `left = &((*left)->next)` ```graphviz digraph linked_list{ rankdir = LR; node [shape = record]; subgraph cluster_0 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&A(0x14)"; "A(0x04)"; } subgraph cluster_1 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&B(0x18)"; "B(0x08)"; } subgraph cluster_2 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&C(0x1b)"; "C(0x0b)"; } subgraph cluster_3 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&D(0x1d)"; "NULL(0x00)"; } "A(0x04)"->"B(0x08)" "B(0x08)"->"C(0x0b)" "C(0x0b)"->"NULL(0x00)" "right(0x0d)" node [shape=plaintext, fontcolor=black, fontsize=15]; "left=&C(0x1b), *left=C(0x0b)"; "right(0x0d)"->"A(0x04)" [style=invis] } ``` * 滿足 while 迴圈條件: 第 4 次 $\to$ `left = &((*left)->next)` ```graphviz digraph linked_list{ rankdir = LR; node [shape = record]; subgraph cluster_0 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&A(0x14)"; "A(0x04)"; } subgraph cluster_1 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&B(0x18)"; "B(0x08)"; } subgraph cluster_2 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&C(0x1b)"; "C(0x0b)"; } subgraph cluster_3 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&D(0x1d)"; "NULL(0x00)"; } "A(0x04)"->"B(0x08)" "B(0x08)"->"C(0x0b)" "C(0x0b)"->"NULL(0x00)" "right(0x0d)" node [shape=plaintext, fontcolor=black, fontsize=15]; "left=&D(0x1d), *left=NULL(0x00)"; "right(0x0d)"->"A(0x04)" [style=invis] } ``` * 一旦 `*left == NULL` 成立,離開上述 while 迴圈 * 第 4 行 : `*left = right` ```graphviz digraph linked_list{ rankdir = LR; node [shape = record]; subgraph cluster_0 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&A(0x14)"; "A(0x04)"; } subgraph cluster_1 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&B(0x18)"; "B(0x08)"; } subgraph cluster_2 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&C(0x1b)"; "C(0x0b)"; } subgraph cluster_3 { style = "filled"; color = "black"; fillcolor = "lightyellow"; node [style = filled, fillcolor = white, color = black]; label = "&D(0x1d)"; "right(0x0d)"; } "A(0x04)"->"B(0x08)" "B(0x08)"->"C(0x0b)" "C(0x0b)"->"right(0x0d)" node [shape=plaintext, fontcolor=black, fontsize=15]; "left=&D(0x1d), *left=right(0x0d)"; "left=&D(0x1d), *left=right(0x0d)"->"A(0x04)" [style=invis] } ``` ```cpp void quicksort(node_t **list) { if (!*list) return; node_t *pivot = *list; int value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; node_t *left = NULL, *right = NULL; while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value > value ? &right : &left, n); } quicksort(&left); quicksort(&right); node_t *result = NULL; list_concat(&result, left); list_concat(&result, pivot); list_concat(&result, right);; *list = result; } ``` `quicksort` 與一般[常見陣列中的 quick sort 類似](https://en.wikipedia.org/wiki/Quicksort),在 `quicksort` 中會先選擇第一個 node 當作 pivot,接著依照每個 node 的值跟 pivot 間的大小關係,將每個 node 分配到比 pivot 大的 list 與比 pivot 小的 list 當中,再遞迴對兩個 sublist 做 `quicksort`,最後再透過 `list_concat` 以 `left_head ~ left_tail -> pivot -> right_head ~ right_tail` 的順序接起來,並把 `*list` 設定為最後 `result` list 的開頭 (`node_t **list` 本身的值不變,改變的是其指向的記憶體位置 `*list`)。 ### `random` [random(3)](https://linux.die.net/man/3/random) 提到: > The srandom() function sets its argument as the seed for a new sequence of pseudo-random integers to be returned by random(). These sequences are repeatable by calling srandom() with the same seed value. If no seed value is provided, the random() function is automatically seeded with a value of 1. 所以要讓每次程式結果不同,必須設置 seed 值,可在程式內加入 `srand(time(NULL))` 以目前時間設定 seed。 使用 [getrandom(2)](https://man7.org/linux/man-pages/man2/getrandom.2.html) 從 [/dev/urandom](https://en.wikipedia.org/wiki//dev/random) 取得目前 entropy pool 中的 noise,就能在不透過 seed 的情況下保證有一定的隨機性: ```cpp int number; while (count--) { getrandom(&number, 2, GRND_NONBLOCK); list = list_make_node_t(list, number % 1024); } ``` --- ### Optimized QuickSort > [GitHub](https://github.com/hankluo6/quicksort-ll) ```cpp // quickSort // // This public-domain C implementation by Darel Rex Finley. // // * Returns YES if sort was successful, or NO if the nested // pivots went too deep, in which case your array will have // been re-ordered, but probably not sorted correctly. // // * This function assumes it is called with valid parameters. // // * Example calls: // quickSort(&myArray[0],5); // sorts elements 0, 1, 2, 3, and 4 // quickSort(&myArray[3],5); // sorts elements 3, 4, 5, 6, and 7 bool quickSort(int *arr, int elements) { #define MAX_LEVELS 1000 int piv, beg[MAX_LEVELS], end[MAX_LEVELS], i=0, L, R ; beg[0]=0; end[0]=elements; while (i>=0) { L=beg[i]; R=end[i]-1; if (L<R) { piv=arr[L]; if (i==MAX_LEVELS-1) return NO; while (L<R) { while (arr[R]>=piv && L<R) R--; if (L<R) arr[L++]=arr[R]; while (arr[L]<=piv && L<R) L++; if (L<R) arr[R--]=arr[L]; } arr[L]=piv; beg[i+1]=L+1; end[i+1]=end[i]; end[i++]=L; } else { i--; }} return YES; } ``` 先觀察原本 [Optimized QuickSort — C Implementation (Non-Recursive)](https://alienryderflex.com/quicksort/) 內的程式,可以發現 `while` 迴圈用來當作遞迴的每次呼叫,並分成兩部份: * `if` 的部份為 quicksort 中的 split * `else` 部份則為 quicksort 中的 merge 另外使用 `beg` 及 `end` 紀錄範圍,變數 `i` 用來模擬 stack 的行為,用其隨著時間改變 `beg` 及 `end` 的範圍。 * 陣列初始內容 ```graphviz digraph { node [color=black shape=none margin=0 height=.3 ] values [label=< <table border="0" cellborder="1" cellspacing="0"> <tr> <td port="f0" bgcolor = "lightyellow">14</td> <td port="f1" bgcolor = "lightyellow">22</td> <td port="f2" bgcolor = "lightyellow">10</td> <td port="f3" bgcolor = "lightyellow">15</td> <td port="f4" bgcolor = "lightyellow">31</td> <td port="f5" bgcolor = "lightyellow">30</td> <td port="f6" bgcolor = "lightyellow">12</td> <td port="f7" bgcolor = "lightyellow"><font color = "white">00</font></td> </tr> </table> >]; } ``` * 滿足 while 迴圈條件: 第 1 次 ```graphviz digraph { node [shape=plaintext, fontsize=18]; "i=0" [color=white]; node [shape=plaintext, fontcolor=g, fontsize=18]; "invis" [style=invis]; node [color=black shape=none margin=0 height=.3 ] values [label=< <table border="0" cellborder="1" cellspacing="0"> <tr> <td port="f0" bgcolor = "gray">12</td> <td port="f1" bgcolor = "gray">10</td> <td port="f2" bgcolor = "gray">14</td> <td port="f3" bgcolor = "lightyellow">15</td> <td port="f4" bgcolor = "lightyellow">31</td> <td port="f5" bgcolor = "lightyellow">30</td> <td port="f6" bgcolor = "lightyellow">22</td> <td port="f7" bgcolor = "lightyellow"><font color = "white">00</font></td> </tr> </table> >]; "beg[0]"->values:f0 "beg[1]"->values:f3 "end[0]"->values:f2 "end[1]"->values:f7 "i=0"->"invis"[style=invis] { rank=same; "i=0"; values } { rank=same; "invis"; "end[0]" ; "end[1]" } } ``` * 滿足 while 迴圈條件: 第 2 次 ```graphviz digraph { node [shape=plaintext, fontcolor=g, fontsize=18]; "i=1" [color=white]; node [shape=plaintext, fontcolor=g, fontsize=18]; "invis" [style=invis]; node [color=black shape=none margin=0 height=.3 ] values [label=< <table border="0" cellborder="1" cellspacing="0"> <tr> <td port="f0" bgcolor = "gray">12</td> <td port="f1" bgcolor = "gray">10</td> <td port="f2" bgcolor = "gray">14</td> <td port="f3" bgcolor = "gray">15</td> <td port="f4" bgcolor = "lightyellow">31</td> <td port="f5" bgcolor = "lightyellow">30</td> <td port="f6" bgcolor = "lightyellow">22</td> <td port="f7" bgcolor = "lightyellow"><font color = "white">00</font></td> </tr> </table> >]; "beg[0]"->values:f0 "beg[1]"->values:f3 "beg[2]"->values:f4 "end[0]"->values:f2 "end[1]"->values:f3 "end[2]"->values:f7 "i=1"->"invis"[style=invis] { rank=same; "i=1"; values } { rank=same; "invis"; "end[0]" ; "end[1]" ; "end[2]" ;} } ``` * 滿足 while 迴圈條件: 第 3 次 ```graphviz digraph { node [shape=plaintext, fontcolor=g, fontsize=18]; "i=2" [color=white]; node [shape=plaintext, fontcolor=g, fontsize=18]; "invis" [style=invis]; node [color=black shape=none margin=0 height=.3 ] values [label=< <table border="0" cellborder="1" cellspacing="0"> <tr> <td port="f0" bgcolor = "gray">12</td> <td port="f1" bgcolor = "gray">10</td> <td port="f2" bgcolor = "gray">14</td> <td port="f3" bgcolor = "gray">15</td> <td port="f4" bgcolor = "gray">22</td> <td port="f5" bgcolor = "gray">30</td> <td port="f6" bgcolor = "lightyellow">31</td> <td port="f7" bgcolor = "lightyellow"><font color = "white">00</font></td> </tr> </table> >]; "beg[0]"->values:f0 "beg[1]"->values:f3 "beg[2]"->values:f4 "beg[3]"->values:f7 "end[0]"->values:f2 "end[1]"->values:f3 "end[2]"->values:f6 "end[3]"->values:f7 "i=2"->"invis"[style=invis] { rank=same; "i=2"; values } { rank=same; "invis"; "end[0]" ; "end[1]" ; "end[2]" ; "end[3]" } } ``` > 你好,我驗算過發現,滿足 while 迴圈條件: 第 3 次的時候 > end[2] 會指向 31, beg[3] 和 end[3] 會指向最尾端空的地方。[name=Chia-Lun Hsu] >感謝更正![name=hankluo6] 由上面例子可以觀察到此程式會把 array 一直分成 sub-array,並優先選擇右方的 array 來處理,亦即每次排序皆為選擇目前尚未排序的陣列中最大值,並放至右側,如此便能讓右側的陣列滿足 [Loop invariant](https://en.wikipedia.org/wiki/Loop_invariant) 的特性。可以看成把 Array 以二等分持續分開,形成 Full Binary Tree,再將 tree 以右子樹開始走訪各節點,每次碰到 leaf 時,該數字就是目前要排序的值。 將上述概念套用到 linked list 也符合,持續往樹的右方 search,透過變數 `i` 模擬 stack 的操作,當找到要排序的值後,在透過 `list_add_node_t` 插入到最終 list 的前方。為了程式簡潔,我將 pivot 也當作一次迴圈考量,所以此時對應的 tree 應為 Ternary Tree。 ```cpp void quicksort_nonrecursive(node_t **list) { int value; int i = 0; int max_level = 100; node_t *node_beg[max_level], *node_end[max_level]; node_t *result = NULL, *left = NULL, *right = NULL; node_t *pivot, *p, *L, *R; node_beg[0] = *list; node_end[0] = get_list_tail(list); while (i >= 0) { L = node_beg[i]; R = node_end[i]; if (L != R) { pivot = L; value = pivot->value; p = pivot->next; pivot->next = NULL; while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value > value ? &right : &left, n); } node_beg[i] = left; node_end[i] = get_list_tail(&left); node_beg[i + 1] = pivot; node_end[i + 1] = pivot; node_beg[i + 2] = right; node_end[i + 2] = get_list_tail(&right); left = NULL; right = NULL; i += 2; } else { if (L) list_add_node_t(&result, L); i--; } } *list = result; } ``` * 效能比較 輸入為長度 2000 的 linked list * 使用 `-O0` 編譯: ![](https://i.imgur.com/aV5qniU.png) 可以發現沒有使用 recursion 的效能平均比使用 recursion 的還高。 * 使用 `-O3` 編譯: ![](https://i.imgur.com/jzHe6A6.png) 兩者間的差距明顯縮小,說明編譯器對遞迴函式有很好的最佳化。 可改進的空間: 1. `max_level` 與 quick sort 時 partition 的位置有關,當 worst case 時,Tree 退化成 linked list,樹高為 n,而每次迴圈 `i` 需要加 2,會 `max_level = 2n`,best case 時,Tree 為平衡樹, `max_level = 2log(n)`。 2. `node_beg` 以及 `node_end` 可以用 dynamic array 來實作,起始容量為 $log_2(n)$,並隨著迴圈越深加倍其容量。 3. 以 `introsort` 改進原先 quick sort: * 實做 insertion sort ```c void insert_sorted(node_t *entry, node_t **list) { while (*list && (*list)->value < entry->value) list = &(*list)->next; entry->next = *list; *list = entry; } void insertsort(node_t **list) { node_t *sorted = NULL; node_t *cur = *list; while (cur) { node_t *node = cur; cur = cur->next; insert_sorted(node, &sorted); } *list = sorted; } ``` * 在做 partition 時紀錄 left list 及 right list 的長度 ```c while (p) { node_t *n = p; p = p->next; if (n->value > value) { list_add_node_t(&right, n); r++; } else { list_add_node_t(&left, n); l++; } } ``` * 當長度小於 20 時,以 insertion sort 取代 quick sort ```c if (l < 20) insertsort(&left); else quicksort_recursion(&left); if (r < 20) insertsort(&right); else quicksort_recursion(&right); ``` > [程式碼](https://github.com/hankluo6/quicksort-ll/blob/bbbc6404b551ed4683ba80226fd4af61e1b705d1/linked_list.c#L52) > 下一步就是引入 tree-based sort 到上方程式碼中,改善 quick sort 原本 worst case 的表現,從而落實 introsort > :notes: jserv ### Intro Sort 當遞迴的次數太多時,改以 tree sort 來排序,在 `introSort` 內新增 `treesort` 函式: ```c void introsort(node_t **list, int max_level, int insert) { if (!*list) return; if (max_level == 0) { treesort(list); return; } node_t *pivot = *list; int value = pivot->value; int l = 0, r = 0; node_t *p = pivot->next; pivot->next = NULL; node_t *left = NULL, *right = NULL; while (p) { node_t *n = p; p = p->next; if (n->value > value) { list_add_node_t(&right, n); r++; } else { list_add_node_t(&left, n); l++; } } if (l < insert) { insertsort(&left); } else { introsort(&left, max_level - 1, insert); } if (r < insert) { insertsort(&right); } else { introsort(&right, max_level - 1, insert); } node_t *result = NULL; list_concat(&result, left); list_concat(&result, pivot); list_concat(&result, right);; *list = result; } ``` 而 `treesort` 的實作是以紅黑樹來當作平衡樹,便能在 worst case 也有 $O(nlogn)$ 的複雜度,紅黑樹以 [rv32emu-next/c_map.h](https://github.com/sysprog21/rv32emu-next/blob/master/c_map.h) 及 [rv32emu-next/c_map.c](https://github.com/sysprog21/rv32emu-next/blob/master/c_map.c) 改寫來實現。並將原本的 `node_t` 結構擴增為同時存有 linked list 及 tree 的欄位: ```c typedef struct __node { int value; struct __node *next; // for linked list struct __node *left, *right, *up; // for red-black tree c_map_color_t color; // for red-black tree } node_t; ``` :::info 改用 `union` 可以更節省空間。 ::: `treesort` 會將 linked list 內的 node 加入到 red-black tree 中,在以 inorder 的方式取出: ```c void treesort(node_t **list) { node_t **record = list; c_map_t map = c_map_new(sizeof(int), sizeof(NULL), c_map_cmp_int); while (*list) { c_map_insert(map, *list, NULL); list = &(*list)->next; } node_t *node = c_map_first(map), *first = node; for ( ;node; node = c_map_next(node)) { *list = node; list = &(*list)->next; } *list = NULL; *record = first; free(map); } ``` 要特別注意此紅黑樹不能插入同樣數值的節點。 ### 效能分析 * 找尋適當的參數值,測試資料為 100000 筆隨機不重複數字: ```c for (int max_level = 1; max_level < (1 << 15); max_level <<= 1) { for (int insert = 10; insert < 40; ++insert) { size_t times = 100; time = 0; while (times--) { size_t count = 100000; int *test_arr = malloc(sizeof(int) * count); for (int i = 0; i < count; ++i) { test_arr[i] = i; } shuffle(test_arr, count); while (count--) { list1 = list_make_node_t(list1, test_arr[count]); } clock_gettime(CLOCK_MONOTONIC, &tt1); introsort(&list1, max_level, insert); clock_gettime(CLOCK_MONOTONIC, &tt2); time += diff_in_ns(tt1, tt2); list_free(&list1); free(test_arr); } fprintf(fd, "%f ", time / 100.0); } fprintf(fd, "\n"); } ``` | max_level | insert | time(ns) | |:---------:|:------:|:--------:| | 16 | 21 | 43538009 | | 32 | 21 | 38184258 | | 64 | 21 | 49350176 | | 128 | 21 | 49877200 | * [詳細輸出結果](https://github.com/hankluo6/quicksort-ll/blob/main/introsort_parameter.txt) * `max_level` 參數決定到第幾層時使用 tree sort * `insert` 參數決定少於幾個 node 時使用 insertion sort 測量出來最佳的 `max_level` 為 32,大約為 $2 \times log_2(100000)$,與論文 [Introspective Sorting and Selection Algorithms](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.5196&rep=rep1&type=pdf) 中的假設相同。 >摘錄自 Introspective Sorting and Selection Algorithms 第 5 頁: > >depth limit is $2\lfloor log_2N\rfloor$, since this value empirically produces good results. * 每種 sorting 方式比較,測試資料為 100000 筆隨機不重複數字: ![](https://i.imgur.com/AR5tWC5.png) > 針對圖例標注的修正: with 可簡稱 `w/`; without 可簡稱 `w/o`。參見 [wiktionary: w/o](https://en.wiktionary.org/wiki/w/o) > :notes: jserv 可以發現 tree sort 的效能比其他都還要快上許多,這與 [A Comparative Study of Linked List Sorting Algorithms](https://pages.mtu.edu/~shene/PUBLICATIONS/1996/3Conline.pdf) 內的統計結果一樣。問題可能是出在遞迴的成本太高,導致 quick sort 無法超越 tree sort。 * 與 [linux/lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 比較,測試資料為 100000 筆隨機不重複數字: ![](https://i.imgur.com/F2CPCyi.png) 嘗試用 perf 分析 tree sort 與 list_sort: - [ ] `list_sort` ``` Performance counter stats for './list_sort.out': 4,416,717 cache-misses # 0.531 % of all cache refs (66.65%) 832,301,750 cache-references (66.68%) 48,222,171,264 cycles (66.69%) 16,549,168,666 branch-instructions (66.69%) 26,319,955 branch-misses # 0.16% of all branches (66.67%) 564,284,840 L1-dcache-load-misses (66.63%) 12.103174289 seconds time elapsed 11.414579000 seconds user 0.688155000 seconds sys ``` - [ ] `tree_sort` ``` Performance counter stats for './tree_sort.out': 1,347,843,845 cache-misses # 20.133 % of all cache refs (66.65%) 6,694,777,854 cache-references (66.66%) 226,565,049,469 cycles (66.68%) 28,693,147,859 branch-instructions (66.68%) 1,123,005,940 branch-misses # 3.91% of all branches (66.67%) 3,115,288,209 L1-dcache-load-misses (66.66%) 58.205776700 seconds time elapsed 58.170830000 seconds user 0.007998000 seconds sys ``` > list_sort.c 的[註解](https://github.com/torvalds/linux/blob/05a59d79793d482f628a31753c671f2e92178a21/lib/list_sort.c#L136)寫道: > > Thus, it will avoid cache thrashing as long as $3 \times 2^k$ elements can fit into the cache. Not quite as good as a fully-eager bottom-up mergesort, but it does use $0.2 \times n$ fewer comparisons, so is faster in the common case that everything fits into L1. 所以 list_sort 中的 merge sort 對 cache 比較友善,從 perf 的分析也能看出 list_sort 的各項數值皆比 tree sort 還低,這也能說明 linux 核心中使用 merge sort 而不使用 tree sort 的原因。 tree sort 的效能瓶頸推測是在 red-black tree 的部份,透過 perf 分析,可以發現 L1-cache-misses 和 branch-misses 最高的部份皆為 `c_map_insert` 中的比較函式 `obj->comparator(&node->value, &cur->value)` 內的 Conditional Expression ,而 cache-misses 最高的部份則是 `c_map_insert` 中建立樹的指派 `cur->left = node` 及 `cur->right = node`。故可以推測出造成效能較低的原因為建立樹所需要的成本太高,需要走訪及比較樹中的節點,而當節點的數量越多,此成本就越高。 ###### tags: `linux2021`