2021q1 Homework1 (quiz1)

# 2021q1 Homework1 (quiz1) contributed by < `tiffany6022` > ###### tags: `linux2021` > [作業說明](https://hackmd.io/@sysprog/linux2021-quiz1) - [x] 解釋上述程式運作原理，搭配 Graphviz，比照 Linked List 練習題在 HackMD 筆記上視覺化展現; - [x] 測試程式使用到 random，多執行幾次可發現輸出結果相仿，請修正，並引入其他 Pseudorandom number generator - [ ] 參考 Optimized QuickSort — C Implementation (Non-Recursive) 並重寫上述 quick sort 程式碼，避免使用遞迴呼叫 - [ ] Linux 核心內部也有 linked list 實作，但是 circulr doubly-linked list，linux-list 仿效 Linux 核心的實作並予以簡化，在 examples/ 目錄提供 quick sort 實作，請探討 Linux 的 linked list 和上述程式碼的落差，並改寫 linux-list 的 quick sort 範例，避免使用遞迴呼叫 - [ ] 參考資料: The Linux Kernel API - List Management Functions - [ ] 研讀 Ching-Kuang Shene (冼鏡光] 教授撰寫的 A Comparative Study of Linked List Sorting Algorithms，思考高效率的 linked list 排序演算法，並落實於上述 (3) 的程式碼中 ## 解釋程式運作原理 1. 首先，考慮一個 singly linked list，其結構定義為: ```cpp typedef struct __node { int value; struct __node *next; } node_t; ``` ```graphviz digraph graphname { rankdir=LR node[shape=record] struct1 [label="{<v> value|<n> }"] struct2 [label="{<v> value|<n> }"] struct3 [label="{<v> value|<n> }"] NULL [label="{<v> NULL }", color=White] struct1:n:c -> struct2:v [label="next", arrowtail=dot, dir=both, tailclip=false, arrowsize=0.8] struct2:n:c -> struct3:v [label="next", arrowtail=dot, dir=both, tailclip=false, arrowsize=0.8] struct3:n:c -> NULL [label="next", arrowtail=dot, dir=both, tailclip=false, arrowsize=0.8] } ``` 2. `quicksort` 函式 * 傳入的參數 `list` 是一個指向 node_t 指標的指標 ```cpp void quicksort(node_t **list) ``` ```graphviz digraph graphname { rankdir=LR node[shape=box] pointer1 [label= "", shape=none,height=.0,width=.5] pointer2 [label= "", shape=none,height=.0,width=.0] struct1 [label="node1"] struct2 [label="node2"] struct3 [label="node3"] NULL [label="NULL", color=White] { rank="same" pointer1 -> pointer2 [label="list", color="red"] } pointer2 -> struct1 struct1 -> struct2 struct2 -> struct3 struct3 -> NULL } ``` :::info :bulb: 為什麼這裡要使用 **指標的指標**？根據 C 語言 Call by value 的特性 (呼叫者與被呼叫者是兩個獨立的變數但使用相同的數值，如果被呼叫者更改了參數的數值，並不會影響到呼叫者本身)，如果我們要改動的是指標，就需傳入指向指標的指標，就可**不用在函式結束後回傳指標** ::: * 一開始判斷如果 `*list` 是 NULL 時 return ```cpp if (!*list) return; ``` * 將 `pivot` 指向 *list 的最初節點，並紀錄它的 `value`，同樣地將 `p` 設成指向 pivot-> next 指向的節點，再把 pivot->next 指向 NULL ```cpp node_t *pivot = *list; int value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; ``` ```graphviz digraph graphname { rankdir=LR node[shape=box] pointer2 [label= "", shape=none,height=.0,width=.0] struct1 [label="node1 \n value", fontcolor="red"] NULL [label="NULL", color=White] pointer2 -> struct1 [label="pivot", fontcolor="red"] struct1 -> NULL } ``` ```graphviz digraph graphname { rankdir=LR node[shape=box] pointer2 [label= "", shape=none,height=.0,width=.0] struct2 [label="node2"] struct3 [label="node3"] NULL [label="NULL", color=White] pointer2 -> struct2 [label="p", fontcolor="red"] struct2 -> struct3 struct3 -> NULL } ``` * 定義指向 node_t 結構的指標 `left` 和 `right` ```cpp node_t *left = NULL, *right = NULL; ``` * 當 `p` 不是 NULL 時，進入迴圈 * 將指向 node_t 結構的指標 `n` 設成 `p` * `p` 再設成自己的 `next` 指標指向的節點 (往前走一步) * 進入 `list_add_node_t` 函式 (詳見 `3. list_add_node_t 函式`) * 迴圈結束後 `left` 會指向一串都比 value 小的 list，`right` 會指向一串都比 value 大的 list ```cpp while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value > value ? AAA : BBB, n); } ``` ```graphviz digraph graphname { rankdir=LR node[shape=box] pointer2 [label= "", shape=none,height=.0,width=.0] pointer3 [label= "", shape=none,height=.0,width=.0] struct2 [label="node2"] struct3 [label="node3"] NULL [label="NULL", color=White] { rank="same"; pointer3 -> struct3:v [label="p", fontcolor="red"] } pointer2 -> struct2 [label="n", fontcolor="red"] struct2 -> struct3 struct3 -> NULL } ``` * 進入迴圈前 ```graphviz digraph graphname { rankdir=LR node[shape=box] pointer [label= "", shape=none,height=.0,width=.0] struct5 [label="5"] NULL [label="NULL", color=White] pointer -> struct5 [label="pivot", fontcolor="red"] struct5 -> NULL } ``` ```graphviz digraph graphname { rankdir=LR node[shape=box] struct7 [label="7"] struct1 [label="1"] struct3 [label="3"] struct6 [label="6"] struct2 [label="2"] NULL [label="NULL", color=White] struct7 -> struct1 -> struct3 -> struct6 -> struct2 -> NULL } ``` * 迴圈結束後 ```graphviz digraph graphname { rankdir=LR node[shape=box] pointer [label= "", shape=none,height=.0,width=.0] struct7 [label="7"] struct6 [label="6"] NULL [label="NULL", color=White] struct6 -> struct7 -> NULL pointer -> struct6 [label="right"] } ``` ```graphviz digraph graphname { rankdir=LR node[shape=box] pointer [label= "", shape=none,height=.0,width=.0] struct1 [label="1"] struct3 [label="3"] struct2 [label="2"] NULL [label="NULL", color=White] struct2 -> struct3 -> struct1 -> NULL pointer -> struct2 [label="left"] } ``` :::success :face_with_monocle: **AAA**︰`&right` **BBB**︰`&left` 因為程式最後預期的輸出是依據**遞增順序**排序，因此大於 pivot->value 的節點要放在右邊 (right)，小於的要放在左邊 (left)，又因為傳入的 right 或 left 會需要改動到，且 `list_add_node_t` 函式並未回傳任何值，所以判斷這裡使用的是**指標的指標**，因此需要傳入指標 right 或 left 的地址: `&right` 或 `&left` ::: * 先後帶入 `&left` `&right` 進入 `quicksort` 函式做遞迴 * 由於 `quicksort` 傳入的參數型態是指標的指標，因此這邊需要代入指標的地址 * 透過遞迴執行 divide and conquer ```cpp quicksort(&left); quicksort(&right); ``` ```graphviz digraph graphname { rankdir=LR node[shape=box] struct5 [label="5"] struct7 [label="7"] struct1 [label="1"] struct3 [label="3"] struct6 [label="6"] struct2 [label="2"] struct5 -> struct7 -> struct1 -> struct3 -> struct6 -> struct2 } ``` ```graphviz digraph graphname { rankdir=LR node[shape=box] pointer_r [label= "", shape=none,height=.0,width=.0] pointer_l [label= "", shape=none,height=.0,width=.0] struct5 [label="5", color="red"] struct7 [label="7"] struct1 [label="1"] struct3 [label="3"] struct6 [label="6"] struct2 [label="2"] struct5 -> pointer_l [color="white"] pointer_l -> struct2 [label="left"] struct2 -> struct3 -> struct1 struct1 -> pointer_r [color="white"] pointer_r -> struct6 [label="right"] struct6 -> struct7 } ``` ```graphviz digraph graphname { rankdir=LR node[shape=box] pointer_r1 [label= "", shape=none,height=.0,width=.0] pointer_r2 [label= "", shape=none,height=.0,width=.0] pointer_l [label= "", shape=none,height=.0,width=.0] struct5 [label="5"] struct2 [label="2", color="red"] struct3 [label="3"] struct1 [label="1"] struct6 [label="6", color="red"] struct7 [label="7"] struct5 -> struct2 [color="white"] struct2 -> pointer_l [color="white"] pointer_l -> struct1 [label="left"] struct1 -> pointer_r1 [color="white"] pointer_r1 -> struct3 [label="right"] struct3 -> struct6 [color="white"] struct6 -> pointer_r2 [color="white"] pointer_r2 -> struct7 [label="right"] } ``` * 將指向 node_t 結構的指標 `result` 設成 NULL，並進入 `list_concat` 函式 (詳見 `4. list_concat 函式`) * 進行 divide and conquer 後的 combine ```cpp node_t *result = NULL; list_concat(&result, left); CCC; *list = result; ``` ```graphviz digraph graphname { rankdir=LR node[shape=box] struct1 [label="1"] struct2 [label="2 \n pivot", color="red"] struct3 [label="3"] struct5 [label="5"] struct6 [label="6 \n pivot", color="red"] struct7 [label="7"] struct5 -> struct1 [color="white"] struct1 -> struct2 struct2 -> struct3 [color="white"] struct3 -> struct6 [color="white"] struct6 -> struct7 [color="white"] } ``` ```graphviz digraph graphname { rankdir=LR node[shape=box] struct1 [label="1"] struct2 [label="2"] struct3 [label="3 \n right", color="red"] struct5 [label="5"] struct6 [label="6"] struct7 [label="7 \n right", color="red"] struct5 -> struct1 [color="white"] struct1 -> struct2 struct2 -> struct3 struct3 -> struct6 [color="white"] struct6 -> struct7 } ``` ```graphviz digraph graphname { rankdir=LR node[shape=box] struct1 [label="1"] struct2 [label="2"] struct3 [label="3"] struct5 [label="5 \n pivot", color="red"] struct6 [label="6"] struct7 [label="7"] struct1 -> struct2 struct2 -> struct3 struct3 -> struct5 struct5 -> struct6 [color="white"] struct6 -> struct7 } ``` ```graphviz digraph graphname { rankdir=LR node[shape=box] struct1 [label="1"] struct2 [label="2"] struct3 [label="3"] struct5 [label="5"] struct6 [label="6 \n right", color="red"] struct7 [label="7"] struct1 -> struct2 -> struct3 -> struct5 -> struct6 -> struct7 } ``` :::success :face_with_monocle: **CCC**︰`list_concat(&result, pivot); list_concat(&result, right)` 因為第一個傳入的參數會將第二個傳入的參數接起來，又根據題目從小到大的排序方式，所以要先將 &result (上一行 left 已與它相接) 和 pivot 相接，再將接完的 &result 再與 right 相接，這裡的 &result 需要加 `&` 也是因為傳入的參數為指標的指標。 ::: 3. `list_add_node_t` 函式 * 將傳入的 node_t 的 next 指向 dereference list (指向節點的指標)，再將 dereference list 重新指向 node_t ```cpp static inline void list_add_node_t(node_t **list, node_t *node_t) { node_t->next = *list; *list = node_t; } ``` 4. `list_concat` 函式 * 將 `**left` 和 `*right` 合併 * 當 dereference left 的指標並非指向 NULL 時，往 next 方向前進，當指向 NULL 時，跳出迴圈，將指向 NULL 的指標改成指向 right ```cpp static inline void list_concat(node_t **left, node_t *right) { while (*left) LLL; *left = right; } ``` :::success :face_with_monocle: **LLL**︰`left = &((*left)->next)` 由於這裡是要透過迴圈找到 left 串列中指向 NULL 的指標，也就是最後一個，因此每次 left 都要找到下一個節點，又因為這裡的 left 是指標的指標，因此要將 left 設成 (*left)->next 的**地址** ::: --- ## 修正 Random #### 測試 `random()` 函式根據 [Linux man page](https://linux.die.net/man/3/random) 中描述，透過設定 `srandom` 函式的參數 `seed` 可以得到 pesudo-random integers，但若沒提供 `seed value` 將自動設為 1，這也是為什麼只使用 `random` 函式的結果會都相同 > The srandom() function sets its argument as the seed for a new sequence of pseudo-random integers to be returned by random(). These sequences are repeatable by calling srandom() with the same seed value. If no seed value is provided, the random() function is automatically seeded with a value of 1. ```cpp for(int i=0; i<10; i++) printf("%ld ", random() % 1024); ``` ```shell $ ./random 359 966 105 115 81 255 74 236 809 205 $ ./random 359 966 105 115 81 255 74 236 809 205 $ ./random 359 966 105 115 81 255 74 236 809 205 ``` #### 加入 srandom 修正加入 `srandom()` 函數，並將當前系統時間當作 `seed` 參數傳入，以確保每次傳入都可以得到不同 seed value，看起來就很像真正的 random (找不到任何規律) > [Pseudorandom number generator (PRNG)](https://en.wikipedia.org/wiki/Pseudorandom_number_generator#Mathematical_definition) > 電腦考慮到運算的複雜度、系統設計成本，因此在多數的情況下，我們用的亂數是透過一系列的方程式而來的。 > 好的 PRNG: 執行夠久的話，可以把給定的範圍都執行過一次 ```cpp srandom(time(NULL)); for(int i=0; i<10; i++) printf("%ld ", random() % 1024); ``` ```shell $ ./random 900 874 585 76 224 492 254 221 475 535 $ ./random 913 638 913 875 261 915 350 647 781 9 $ ./random 951 284 654 314 415 447 905 104 327 859 ``` #### 注意 * 由於傳入的是當下時間，若在同一秒執行多次，也會產生相同的輸出 * `rand()` 與 `random()` 差別在於前者返回的是一個 int 型態，而後者則是 long int 型態