2021q1 Homework1 (quiz1)

--- tags: linux2021 --- # 2021q1 Homework1 (quiz1) contributed by < `catkitchen721` > [2021q1 第 1 週測驗題](https://hackmd.io/@sysprog/linux2021-quiz1) - [x] 解釋上述程式運作原理，搭配 Graphviz，比照 Linked List 練習題在 HackMD 筆記上視覺化展現; - [ ] 測試程式使用到 random，多執行幾次可發現輸出結果相仿，請修正，並引入其他 Pseudorandom number generator - [ ] 參考 Optimized QuickSort — C Implementation (Non-Recursive) 並重寫上述 quick sort 程式碼，避免使用遞迴呼叫 - [ ] Linux 核心內部也有 linked list 實作，但是 circulr doubly-linked list，linux-list 仿效 Linux 核心的實作並予以簡化，在 examples/ 目錄提供 quick sort 實作，請探討 Linux 的 linked list 和上述程式碼的落差，並改寫 linux-list 的 quick sort 範例，避免使用遞迴呼叫 - [ ] 研讀 Ching-Kuang Shene (冼鏡光) 教授撰寫的 A Comparative Study of Linked List Sorting Algorithms，思考高效率的 linked list 排序演算法，並落實於上述 (3) 的程式碼中 ## 測驗 1 1. 考慮一個單向 linked list，其結構定義為： ```clike typedef struct __node { int value; struct __node *next; } node_t; ``` * 該結構可表示成： ```graphviz digraph node_struct { rankdir=LR; node [shape=record]; node0 [label="node_t|{{value|<value>}|{next|<next>}}"]; node [shape=box]; node1 [style="invis"]; node0:next:c -> node1 [tailclip=false, arrowtail=dot, dir=both]; } ``` 2. 已知不存在 circular (環狀結構)，下方程式碼嘗試對該單向 linked list 進行 Quick sort，預期依據遞增順序。 ```cpp static inline void list_add_node_t(node_t **list, node_t *node_t) { node_t->next = *list; *list = node_t; } static inline void list_concat(node_t **left, node_t *right) { while (*left) LLL; *left = right; } void quicksort(node_t **list) { if (!*list) return; node_t *pivot = *list; int value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; node_t *left = NULL, *right = NULL; while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value > value ? AAA : BBB, n); } quicksort(&left); quicksort(&right); node_t *result = NULL; list_concat(&result, left); CCC; *list = result; } ``` ### 將各函數分開分析如以下各點： #### `list_add_node_t` 函式 ```cpp static inline void list_add_node_t(node_t **list, node_t *node_t) { node_t->next = *list; *list = node_t; } ``` 首先注意到的是 ==static== 與 ==inline== 關鍵字，在過去我甚少利用到這些 specifiers ，正好利用這次機會查詢 C 語言規格書了解其使用方式： :::info * ## static > `節錄自 C 語言規格書 ISO/IEC 9899:1999 - 6.2.2 章節` > **6.2.2 Linkages of identifiers** > * An identifier declared in different scopes or in the same scope more than once can be made to refer to the same object or function by a process called linkage. There are three kinds of linkage: ==external==, ==internal==, and ==none==. > * In the set of translation units and libraries that constitutes an entire program, each declaration of a particular identifier with external linkage denotes the same object or function. ==Within one translation unit, each declaration of an identifier with internal linkage denotes the same object or function==. Each declaration of an identifier with no linkage denotes a unique entity. > * If the declaration of a file scope identifier for an object or a function contains the storage-class specifier ==static==, the identifier has ==internal linkage==. <br/> 看了以上對於 linkage 的說明我還是產生了疑惑，於第二段說明中，到底什麼是 **translation unit** ？ <br/> 既然有疑惑就繼續往下查： <br/> > `節錄自 C 語言規格書 ISO/IEC 9899:1999 - 5.1.1.1 章節` > **5.1.1.1 Program structure** > * A C program need not all be translated at the same time. The text of the program is kept in units called source files, (or preprocessing files) in this International Standard. A source file together with all the headers and source files included via the preprocessing directive **#include** is known as a ==preprocessing translation unit==. After preprocessing, a preprocessing translation unit is called a ==translation unit==. <br/> 對於這段說明我的解讀是，一個 source file 與它 **#include** 的所有 header files, source files 合稱 ==preprocessing translation unit== ，到這邊還沒結束，最後一句提到要在 **preprocessing** 之後才稱作 ==translation unit==。 <br/> 又遇到問題了，什麼是 **preprocessing** ？ <br/> > `節錄自 C 語言規格書 ISO/IEC 9899:1999 - 6.10 章節 - Semantics 項目` > **6.10 Preprocessing directives** > > **Semantics** > > The implementation can process and skip sections of source files conditionally, include other source files, and replace macros. These capabilities are called ==preprocessing==, because conceptually they occur before translation of the resulting translation unit. <br/> **prepocessing** 指的是，實作可以： 1. 依據**條件**處理與**跳過** source files 的片段，包括其他 source files 2. **替換 macros** <br/> * 對於 ==static== 關鍵字的總結： * 在一個 object 或一個 function 識別符前加上 **static** 關鍵字，則該識別符在同一個 translation unit 下指的是同一個 object 或 function。 * 同一個 translation unit 指的是，該識別符所在的 source file 經過條件跳過一些程式片段以及替換 macros 後（包括 **#include** 所包含的 header files 也會展開），該處理完的 source file 稱作一個 translation unit 。 ::: :::info * ## inline > `節錄自 C 語言規格書 ISO/IEC 9899:1999 - 6.7.4 章節 - Semantics 項目` > **6.7.4 Function specifiers** > > A function declared with an inline function specifier is an inline function. The function specifier may appear more than once; the behavior is the same as if it appeared only once. ==Making a function an inline function suggests that calls to the function be as fast as possible.== The extent to which such suggestions are effective is ==implementation-defined==. <br/> **inline** 關鍵字是一個函式（ function ）說明符（ specifier ），若加在 function 識別符（ identifier ）之前，表示建議編譯器使得這個函式能呼叫地愈快愈好。但規格書也提到了實際效果是由**實作**定義的。 ### GCC 對於處理 inline function 的實作 * [Using the GNU Compiler Collection (GCC)](https://gcc.gnu.org/onlinedocs/gcc/) <br/> > `節錄自 GCC 官方文件 - 6.45 章節` > **6.45 An Inline Function is As Fast As a Macro** > By declaring a function inline, you can direct GCC to make calls to that function faster. ==One way GCC can achieve this is to integrate that function’s code into the code for its callers==. This makes execution faster by eliminating the function-call overhead; in addition, ==if any of the actual argument values are constant, their known values may permit simplifications at compile time so that not all of the inline function’s code needs to be included==. The effect on code size is less predictable; object code may be larger or smaller with function inlining, depending on the particular case. <br/> GCC 有兩種方式處理 inline function ： 1. 直接將 function code 整合到呼叫程式的 code 中 2. 若有引數的值是常數，則會允許在 compile time 時以它們的已知值進行簡化，使得 function code 不用被全部包含進呼叫程式中。 <br/> 對於是否加上 **static** 關鍵字， GCC 也有進行說明： <br/> > `節錄自 GCC 官方文件 - 6.45 章節` > **6.45 An Inline Function is As Fast As a Macro** > ... > When an inline function is not static, then the compiler must assume that there may be calls from other source files; since a global symbol can be defined only once in any program, the function must not be defined in the other source files, so the calls therein cannot be integrated. Therefore, a non-static inline function is always compiled on its own in the usual fashion. <br/> GCC 解釋，若一個 inline function 不具 static 關鍵字，會如同 C 語言規格書所解釋，不將它當成一個具有 internal linkage 的 function ，可能會有其他 source file 呼叫這個 function ，因此 GCC 不會將這個 function 直接整合進呼叫程式中，而是直接視為 inline 關鍵字沒有效果。 ::: ##### 小結： 1. static 使得這些 functions 有 internal linkage，不會從其他 translation unit 呼叫 2. inline 使得這些體積較小的 functions 能以較快的速度呼叫回到原題目： ```cpp static inline void list_add_node_t(node_t **list, node_t *node_t) { node_t->next = *list; *list = node_t; } ``` ==（注意到參數中的 node_t 應為原資料結構 node_t 的指標型態）== * 若 list 原本為空： ```graphviz digraph node_struct { rankdir=LR; node [shape=record]; node0 [label="<id>|{{value|<value>}|{next|<next>}}"]; node [shape=plaintext]; null0 [label="NULL"]; node [shape=box]; node_t; node1 [label="list"]; invis0 [style="invis"]; { rank=same; node1 -> null0; invis0; } node_t -> node0:id; node0:next:c -> invis0 [tailclip=false, arrowtail=dot, dir=both]; } ``` ```cpp node_t->next = *list; ``` ```graphviz digraph node_struct { rankdir=LR; node [shape=record]; node0 [label="<id>|{{value|<value>}|{next|<next>}}"]; node [shape=plaintext]; null0 [label="NULL"]; node [shape=box]; node_t; node1 [label="list"]; { rank=same; node1 -> null0; } node_t -> node0:id; node0:next:c -> null0 [tailclip=false, arrowtail=dot, dir=both]; } ``` ```cpp *list = node_t; ``` ```graphviz digraph node_struct { rankdir=LR; node [shape=record]; node0 [label="<id>|{{value|<value>}|{next|<next>}}"]; node [shape=plaintext]; null0 [label="NULL"]; node [shape=box]; node_t; invis0 [style="invis"]; node1 [label="list"]; { rank=same; node1; invis0; } node1 -> node_t; node_t -> node0:id; invis0 -> node0 [style="invis"]; node0:next:c -> null0 [tailclip=false, arrowtail=dot, dir=both]; } ``` * 若 list 原本不為空： ```graphviz digraph node_struct { rankdir=LR; node [shape=record]; node0 [label="<id>|{{value|<value>}|{next|<next>}}"]; node [shape=box]; node_t; node1 [label="list"]; node2 [label="node-in-list"]; invis0 [style="invis"]; { rank=same; node1 -> node2; invis0; } node_t -> node0:id; node0:next:c -> invis0 [tailclip=false, arrowtail=dot, dir=both]; } ``` ```cpp node_t->next = *list; ``` ```graphviz digraph node_struct { rankdir=LR; node [shape=record]; node0 [label="<id>|{{value|<value>}|{next|<next>}}"]; node [shape=box]; node_t; node1 [label="list"]; node2 [label="node-in-list"]; { rank=same; node1 -> node2; } node_t -> node0:id; node0:next:c -> node2 [tailclip=false, arrowtail=dot, dir=both]; } ``` ```cpp *list = node_t; ``` ```graphviz digraph node_struct { rankdir=LR; node [shape=record]; node0 [label="<id>|{{value|<value>}|{next|<next>}}"]; node [shape=box]; node_t; node1 [label="list"]; node2 [label="node-in-list"]; invis0 [style="invis"]; { rank=same; node1; invis0; } node_t -> node0:id; invis0 -> node0 [style="invis"]; node1 -> node_t:id; node0:next:c -> node2 [tailclip=false, arrowtail=dot, dir=both]; } ``` 可以發現兩者的結果皆是在 list 開頭插入 node_t 。 --- #### `list_concat` 函式 ```cpp= static inline void list_concat(node_t **left, node_t *right) { while (*left) left = &((*left)->next) /*LLL*/ *left = right; } ``` * 假設 left 指向 NULL ，圖像化表示成： ```graphviz digraph node_struct { rankdir=LR; node [shape=box]; left; node [shape=plaintext]; null0[label="NULL"]; { rank=same; left -> null0; } } ``` right 指向一串 nodes ： ```graphviz digraph node_struct { rankdir=LR; node [shape=box]; right;rnode0;rnode1;rnode2; r1 [label="rnode1_ptr"];r2 [label="rnode2_ptr"]; invis0 [style=invis]; node [shape=plaintext]; null0[label="NULL"]; { rank=same; right -> rnode0; } { rank=same; r1 -> rnode1; } { rank=same; r2 -> rnode2; } { rank=same; null0 -> invis0 [style=invis]; } rnode0 -> r1; right -> r1 [style=invis]; rnode0 -> rnode1 [style=invis]; rnode1 -> r2; r1 -> r2 [style=invis]; rnode1 -> rnode2 [style=invis]; rnode2 -> null0; r2 -> null0 [style=invis]; rnode2 -> invis0 [style=invis]; } ``` 放在一起： ```graphviz digraph node_struct { rankdir=LR; node [shape=box]; left; right;rnode0;rnode1;rnode2; r1 [label="rnode1_ptr"];r2 [label="rnode2_ptr"]; invis0 [style=invis]; node [shape=plaintext]; null0[label="NULL"]; null1[label="NULL"]; { rank=same; right -> rnode0; } { rank=same; r1 -> rnode1; } { rank=same; r2 -> rnode2; } { rank=same; null0 -> invis0 [style=invis]; } { rank=same; left -> null1; } rnode0 -> r1; right -> r1 [style=invis]; rnode0 -> rnode1 [style=invis]; rnode1 -> r2; r1 -> r2 [style=invis]; rnode1 -> rnode2 [style=invis]; rnode2 -> null0; r2 -> null0 [style=invis]; rnode2 -> invis0 [style=invis]; null1 -> right [style=invis]; } ``` 首先，第 2 行 `while (*left)` 會因為 left 指向 NULL 而跳離迴圈，因此可以直接看第 4 行 `*left = right;` ，這句會將 left 指向的記憶體空間內容改為 right 這個指標。 ```graphviz digraph node_struct { rankdir=LR; node [shape=box]; left; right [label=< <TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0"> <TR><TD PORT="right">right</TD></TR> </TABLE>>]; rnode0;rnode1;rnode2; r1 [label="rnode1_ptr"];r2 [label="rnode2_ptr"]; invis0 [style=invis]; node [shape=plaintext]; null0[label="NULL"]; { rank=same; left -> right; right:right -> rnode0; } { rank=same; r1 -> rnode1; } { rank=same; r2 -> rnode2; } { rank=same; null0 -> invis0 [style=invis]; } rnode0 -> r1; right -> r1 [style=invis]; rnode0 -> rnode1 [style=invis]; rnode1 -> r2; r1 -> r2 [style=invis]; rnode1 -> rnode2 [style=invis]; rnode2 -> null0; r2 -> null0 [style=invis]; rnode2 -> invis0 [style=invis]; } ``` 這樣就可以將 left 指向的記憶體空間置換成 right 。 --- ```cpp= static inline void list_concat(node_t **left, node_t *right) { while (*left) left = &((*left)->next) /*LLL*/ *left = right; } ``` * 假設 left 所指向的記憶體空間**不**為 NULL ```graphviz digraph node_struct { rankdir=LR; node [shape=box]; left; right;rnode0;rnode1;rnode2; lnode0;lnode0_ptr;lnode1;lnode1_ptr; r1 [label="rnode1_ptr"];r2 [label="rnode2_ptr"]; invis0 [style=invis]; invis1 [style=invis]; node [shape=plaintext]; null0[label="NULL"]; null1[label="NULL"]; { rank=same; right -> rnode0; } { rank=same; r1 -> rnode1; } { rank=same; r2 -> rnode2; } { rank=same; null0 -> invis0 [style=invis]; } { rank=same; left -> lnode0_ptr -> lnode0; } { rank=same; lnode1_ptr -> lnode1; } { rank=same; null1 -> invis1 [style=invis]; } rnode0 -> r1; right -> r1 [style=invis]; rnode0 -> rnode1 [style=invis]; rnode1 -> r2; r1 -> r2 [style=invis]; rnode1 -> rnode2 [style=invis]; rnode2 -> null0; r2 -> null0 [style=invis]; rnode2 -> invis0 [style=invis]; lnode0 -> lnode1_ptr; lnode1 -> null1; null1 -> right [style=invis]; lnode0_ptr -> lnode1_ptr [style=invis]; lnode0 -> lnode1 [style=invis]; lnode1_ptr -> null1 [style=invis]; lnode1 -> invis1 [style=invis]; } ``` 則第 2 行的 `while (*left)` 不會馬上跳離迴圈，繼續往下追蹤： ```cpp left = &((*left)->next) ``` ```graphviz digraph node_struct { rankdir=LR; node [shape=box]; left;sleft[label="lnode0_ptr"]; right;rnode0;rnode1;rnode2; lnode0;lnode1;lnode1_ptr; r1 [label="rnode1_ptr"];r2 [label="rnode2_ptr"]; invis0 [style=invis]; invis1 [style=invis]; node [shape=plaintext]; null0[label="NULL"]; null1[label="NULL"]; { rank=same; right -> rnode0; } { rank=same; r1 -> rnode1; } { rank=same; r2 -> rnode2; } { rank=same; null0 -> invis0 [style=invis]; } { rank=same; left -> sleft [style=invis]; sleft -> lnode0; } { rank=same; lnode1_ptr -> lnode1; } { rank=same; null1 -> invis1 [style=invis]; } rnode0 -> r1; right -> r1 [style=invis]; rnode0 -> rnode1 [style=invis]; rnode1 -> r2; r1 -> r2 [style=invis]; rnode1 -> rnode2 [style=invis]; rnode2 -> null0; r2 -> null0 [style=invis]; rnode2 -> invis0 [style=invis]; lnode0 -> lnode1_ptr; lnode1 -> null1; null1 -> right [style=invis]; sleft -> lnode1_ptr [color=blue, label="next", fontcolor=blue]; lnode0 -> lnode1 [style=invis]; lnode1_ptr -> null1 [style=invis]; lnode1 -> invis1 [style=invis]; left -> lnode1_ptr; } ``` 整理後： ```graphviz digraph node_struct { rankdir=LR; node [shape=box]; left;sleft[label="lnode0_ptr"]; right;rnode0;rnode1;rnode2; lnode0;lnode1;lnode1_ptr; r1 [label="rnode1_ptr"];r2 [label="rnode2_ptr"]; invis0 [style=invis]; invis1 [style=invis]; node [shape=plaintext]; null0[label="NULL"]; null1[label="NULL"]; { rank=same; right -> rnode0; } { rank=same; r1 -> rnode1; } { rank=same; r2 -> rnode2; } { rank=same; null0 -> invis0 [style=invis]; } { rank=same; sleft -> lnode0; } { rank=same; left -> lnode1_ptr -> lnode1; } { rank=same; null1 -> invis1 [style=invis]; } rnode0 -> r1; right -> r1 [style=invis]; rnode0 -> rnode1 [style=invis]; rnode1 -> r2; r1 -> r2 [style=invis]; rnode1 -> rnode2 [style=invis]; rnode2 -> null0; r2 -> null0 [style=invis]; rnode2 -> invis0 [style=invis]; lnode0 -> lnode1_ptr; lnode1 -> null1; null1 -> right [style=invis]; sleft -> lnode1_ptr [style=invis]; lnode0 -> lnode1 [style=invis]; lnode1_ptr -> null1 [style=invis]; lnode1 -> invis1 [style=invis]; } ``` 迴圈第一圈完成了，這邊可以針對 LLL 這題的 $d$ 選項做個解析： :::info 如果這題使用了 $d$ 選項， `*left = (*left)->next` ，則圖示會變成如下所示： 1. ```graphviz digraph node_struct { rankdir=LR; node [shape=box]; left; right;rnode0;rnode1;rnode2; lnode0;lnode0_ptr;lnode1;lnode1_ptr; r1 [label="rnode1_ptr"];r2 [label="rnode2_ptr"]; invis0 [style=invis]; invis1 [style=invis]; node [shape=plaintext]; null0[label="NULL"]; null1[label="NULL"]; { rank=same; right -> rnode0; } { rank=same; r1 -> rnode1; } { rank=same; r2 -> rnode2; } { rank=same; null0 -> invis0 [style=invis]; } { rank=same; left -> lnode0_ptr -> lnode0; } { rank=same; lnode1_ptr -> lnode1; } { rank=same; null1 -> invis1 [style=invis]; } rnode0 -> r1; right -> r1 [style=invis]; rnode0 -> rnode1 [style=invis]; rnode1 -> r2; r1 -> r2 [style=invis]; rnode1 -> rnode2 [style=invis]; rnode2 -> null0; r2 -> null0 [style=invis]; rnode2 -> invis0 [style=invis]; lnode0 -> lnode1_ptr; lnode1 -> null1; null1 -> right [style=invis]; lnode0_ptr -> lnode1_ptr [style=invis]; lnode0 -> lnode1 [style=invis]; lnode1_ptr -> null1 [style=invis]; lnode1 -> invis1 [style=invis]; } ``` 2. ```graphviz digraph node_struct { rankdir=LR; node [shape=box]; left; right;rnode0;rnode1;rnode2; lnode1;lnode1_ptr [label=< <TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0"> <TR><TD PORT="here">lnode1_ptr</TD></TR> </TABLE>>]; r1 [label="rnode1_ptr"];r2 [label="rnode2_ptr"]; invis0 [style=invis]; invis1 [style=invis]; node [shape=plaintext]; null0[label="NULL"]; null1[label="NULL"]; { rank=same; right -> rnode0; } { rank=same; r1 -> rnode1; } { rank=same; r2 -> rnode2; } { rank=same; null0 -> invis0 [style=invis]; } { rank=same; left -> lnode1_ptr; lnode1_ptr:here -> lnode1; } { rank=same; null1 -> invis1 [style=invis]; } rnode0 -> r1; right -> r1 [style=invis]; rnode0 -> rnode1 [style=invis]; rnode1 -> r2; r1 -> r2 [style=invis]; rnode1 -> rnode2 [style=invis]; rnode2 -> null0; r2 -> null0 [style=invis]; rnode2 -> invis0 [style=invis]; lnode1 -> null1; null1 -> right [style=invis]; lnode1_ptr -> null1 [style=invis]; lnode1 -> invis1 [style=invis]; } ``` 會直接將 left 所指的空間蓋掉，造成 lnode0_ptr 的值不見，再也無法取得 lnode0 這個 node 。 ::: 進行迴圈第 2 圈， `while (*left)` 此時的 \*left 是剛剛的 lnode1_ptr 指標，不為空，因此進入迴圈： ```cpp left = &((*left)->next) ``` ```graphviz digraph node_struct { rankdir=LR; node [shape=box]; left;head[label="lnode0_ptr"]; right;rnode0;rnode1;rnode2; lnode0;lnode1;lnode1_ptr; r1 [label="rnode1_ptr"];r2 [label="rnode2_ptr"]; invis0 [style=invis]; invis1 [style=invis]; node [shape=plaintext]; null0[label="NULL"]; null1[label="NULL"]; { rank=same; right -> rnode0; } { rank=same; r1 -> rnode1; } { rank=same; r2 -> rnode2; } { rank=same; null0 -> invis0 [style=invis]; } { rank=same; head -> lnode0; } { rank=same; left -> lnode1_ptr [style=invis]; lnode1_ptr -> lnode1; } { rank=same; null1 -> invis1 [style=invis]; } rnode0 -> r1; right -> r1 [style=invis]; rnode0 -> rnode1 [style=invis]; rnode1 -> r2; r1 -> r2 [style=invis]; rnode1 -> rnode2 [style=invis]; rnode2 -> null0; r2 -> null0 [style=invis]; rnode2 -> invis0 [style=invis]; lnode0 -> lnode1_ptr; lnode1 -> null1; null1 -> right [style=invis]; head -> lnode1_ptr [style=invis]; lnode0 -> lnode1 [style=invis]; lnode1_ptr -> null1 [color=blue, label="next", fontcolor=blue]; lnode1 -> invis1 [style=invis]; left -> null1; } ``` 整理後： ```graphviz digraph node_struct { rankdir=LR; node [shape=box]; left;head[label="lnode0_ptr"]; right;rnode0;rnode1;rnode2; lnode0;lnode1;lnode1_ptr; r1 [label="rnode1_ptr"];r2 [label="rnode2_ptr"]; invis0 [style=invis]; invis1 [style=invis]; node [shape=plaintext]; null0[label="NULL"]; null1[label="NULL"]; { rank=same; right -> rnode0; } { rank=same; r1 -> rnode1; } { rank=same; r2 -> rnode2; } { rank=same; null0 -> invis0 [style=invis]; } { rank=same; head -> lnode0; } { rank=same; lnode1_ptr -> lnode1; } { rank=same; left -> null1; null1 -> invis1 [style=invis]; } rnode0 -> r1; right -> r1 [style=invis]; rnode0 -> rnode1 [style=invis]; rnode1 -> r2; r1 -> r2 [style=invis]; rnode1 -> rnode2 [style=invis]; rnode2 -> null0; r2 -> null0 [style=invis]; rnode2 -> invis0 [style=invis]; lnode0 -> lnode1_ptr; lnode1 -> null1; null1 -> right [style=invis]; head -> lnode1_ptr [style=invis]; lnode0 -> lnode1 [style=invis]; lnode1_ptr -> null1 [style=invis]; lnode1 -> invis1 [style=invis]; } ``` 最後迴圈第 3 次檢查後，此時的 \*left 為 NULL ，因此直接跳第 4 行： ```cpp *left = right; ``` ```graphviz digraph node_struct { rankdir=LR; node [shape=box]; left;head[label="lnode0_ptr", fontname="times-bold"]; right [label=< <TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0"> <TR><TD PORT="here">right</TD></TR> </TABLE>>];rnode0;rnode1;rnode2; lnode0;lnode1;lnode1_ptr; r1 [label="rnode1_ptr"];r2 [label="rnode2_ptr"]; invis0 [style=invis]; invis1 [style=invis]; node [shape=plaintext]; null0[label="NULL"]; { rank=same; right:here -> rnode0; } { rank=same; r1 -> rnode1; } { rank=same; r2 -> rnode2; } { rank=same; null0 -> invis0 [style=invis]; } { rank=same; head -> lnode0; } { rank=same; lnode1_ptr -> lnode1; } { rank=same; left -> right; } rnode0 -> r1; right -> r1 [style=invis]; rnode0 -> rnode1 [style=invis]; rnode1 -> r2; r1 -> r2 [style=invis]; rnode1 -> rnode2 [style=invis]; rnode2 -> null0; r2 -> null0 [style=invis]; rnode2 -> invis0 [style=invis]; lnode0 -> lnode1_ptr; lnode1 -> right; head -> lnode1_ptr [style=invis]; lnode0 -> lnode1 [style=invis]; lnode1_ptr -> right [style=invis]; lnode1 -> invis1 [style=invis]; } ``` 如前面的分析，因為 left 所指的記憶體空間不變，是直接以 right 覆蓋，因此 lnode1 也能指到 right ，成功串接。串接結束後，原本的 **lnode0_ptr** 指標就能直接取得串列的頭。 --- #### `quicksort` 函式 ```cpp= void quicksort(node_t **list) { if (!*list) return; node_t *pivot = *list; int value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; node_t *left = NULL, *right = NULL; while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value > value ? &right : &left, n); /*AAA, BBB*/ } quicksort(&left); quicksort(&right); node_t *result = NULL; list_concat(&result, left); list_concat(&result, pivot); /*CCC*/ list_concat(&result, right); /*CCC*/ *list = result; } ``` :::info 在從頭開始分析之前，先注意到剛剛解析的 `list_concat` 函式用法： ```cpp node_t *result = NULL; list_concat(&result, left); ``` 其中 result 就相當於剛剛分析的 **lnode0_ptr** ，而 \&result 就相當於 **left** 這個指標的指標。剛才的最後提到了 **lnode0_ptr** 能繼續指向串列開頭，因此串接完後， result 能夠繼續維持串列開頭位址，不因為串列連接而失去串列開頭。 ::: ```cpp if (!*list) return; ``` 檢查 list 是否指向 NULL ，若是，則直接 return 。 ```cpp node_t *pivot = *list; int value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; ``` 假設 list 為： ```graphviz digraph quicksort { rankdir=LR; # node [shape=record]; # value [label="value|"]; node [shape=box]; list; p_1;n_1 [label="6"]; p_2;n_2 [label="2"]; p_3;n_3 [label="9"]; # pivot; # p; invis0 [style=invis]; node [shape=plaintext]; null0 [label="NULL"]; { rank=same; list -> p_1 -> n_1; } { rank=same; p_2 -> n_2; } { rank=same; p_3 -> n_3; } { rank=same; null0 -> invis0 [style=invis]; } n_1 -> p_2; p_1 -> p_2 [style=invis]; n_1 -> n_2 [style=invis]; n_2 -> p_3; p_2 -> p_3 [style=invis]; n_2 -> n_3 [style=invis]; n_3 -> null0; p_3 -> null0 [style=invis]; n_3 -> invis0 [style=invis]; } ``` pivot 設為 list 開頭： ```graphviz digraph quicksort { rankdir=LR; # node [shape=record]; # value [label="value|"]; node [shape=box]; list; p_1;n_1 [label="6"]; p_2;n_2 [label="2"]; p_3;n_3 [label="9"]; pivot [color=blue]; # p; invis0 [style=invis]; node [shape=plaintext]; null0 [label="NULL"]; { rank=same; list -> p_1 -> n_1; n_1 -> pivot [arrowtail=normal, dir=both, arrowhead=none]; } { rank=same; p_2 -> n_2; } { rank=same; p_3 -> n_3; } { rank=same; null0 -> invis0 [style=invis]; } n_1 -> p_2; p_1 -> p_2 [style=invis]; n_1 -> n_2 [style=invis]; n_2 -> p_3; p_2 -> p_3 [style=invis]; n_2 -> n_3 [style=invis]; n_3 -> null0; p_3 -> null0 [style=invis]; n_3 -> invis0 [style=invis]; } ``` pivot 指向的 value 賦值給 value 變數： ```graphviz digraph quicksort { rankdir=LR; node [shape=record]; value [label="value|6", color="#225522"]; node [shape=box]; list; p_1;n_1 [label="6"]; p_2;n_2 [label="2"]; p_3;n_3 [label="9"]; pivot [color=blue]; # p; invis0 [style=invis]; node [shape=plaintext]; null0 [label="NULL"]; { rank=same; list -> p_1 -> n_1; n_1 -> pivot [arrowtail=normal, dir=both, arrowhead=none]; } { rank=same; p_2 -> n_2; } { rank=same; p_3 -> n_3; } { rank=same; null0 -> invis0 [style=invis]; } n_1 -> p_2; p_1 -> p_2 [style=invis]; n_1 -> n_2 [style=invis]; n_2 -> p_3; p_2 -> p_3 [style=invis]; n_2 -> n_3 [style=invis]; n_3 -> null0; p_3 -> null0 [style=invis]; n_3 -> invis0 [style=invis]; } ``` p 指到 pivot->next 所指的位置： ```graphviz digraph quicksort { rankdir=LR; node [shape=record]; value [label="value|6", color="#225522"]; node [shape=box]; list; p_1;n_1 [label="6"]; p_2;n_2 [label="2"]; p_3;n_3 [label="9"]; pivot [color=blue]; p [color="#dd5522"]; invis0 [style=invis]; node [shape=plaintext]; null0 [label="NULL"]; { rank=same; list -> p_1 -> n_1; n_1 -> pivot [arrowtail=normal, dir=both, arrowhead=none]; } { rank=same; p_2 -> n_2; n_2 -> p [arrowtail=normal, dir=both, arrowhead=none]; } { rank=same; p_3 -> n_3; } { rank=same; null0 -> invis0 [style=invis]; } n_1 -> p_2; p_1 -> p_2 [style=invis]; n_1 -> n_2 [style=invis]; n_2 -> p_3; p_2 -> p_3 [style=invis]; n_2 -> n_3 [style=invis]; n_3 -> null0; p_3 -> null0 [style=invis]; n_3 -> invis0 [style=invis]; } ``` 將 pivot 所指的 node 斷開於串列： ```graphviz digraph quicksort { rankdir=LR; node [shape=record]; value [label="value|6", color="#225522"]; node [shape=box]; list; p_1;n_1 [label="6"]; p_2;n_2 [label="2"]; p_3;n_3 [label="9"]; pivot [color=blue]; p [color="#dd5522"]; invis0 [style=invis]; invis1 [style=invis]; node [shape=plaintext]; null0 [label="NULL"]; null1 [label="NULL"]; { rank=same; list -> p_1 -> n_1; n_1 -> pivot [arrowtail=normal, dir=both, arrowhead=none]; } { rank=same; p_2 -> n_2; n_2 -> p [arrowtail=normal, dir=both, arrowhead=none]; } { rank=same; p_3 -> n_3; } { rank=same; null0 -> invis0 [style=invis]; } { rank=same; null1 -> invis1 [style=invis]; } n_1 -> null1; p_1 -> null1 [style=invis]; null1 -> p_2 [style=invis]; p_1 -> null1 [style=invis]; n_1 -> invis1 [style=invis]; n_2 -> p_3; p_2 -> p_3 [style=invis]; n_2 -> n_3 [style=invis]; n_3 -> null0; p_3 -> null0 [style=invis]; n_3 -> invis0 [style=invis]; } ``` 以上這段的用意是找出一個 pivot 且將它由串列分離，並以 p 指標紀錄 pivot 的下一個 node ，同時將 pivot 指向的值紀錄在 value 變數中。以下的程式片段則是繼續以 p 指向的串列，依據 pivot node 的值，分為左右兩串列，左串列都將會比 pivot 小或相等於 pivot ，右串列則會比 pivot 都大。 ```cpp node_t *left = NULL, *right = NULL; while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value > value ? &right : &left, n); /*AAA, BBB*/ } ``` 以上的例子做完後會如下所示： ```graphviz digraph quicksort { rankdir=LR; node [shape=record]; value [label="value|6", color="#225522"]; node [shape=box]; list; p_1;n_1 [label="6"]; p_2;n_2 [label="2"]; p_3;n_3 [label="9"]; pivot [color=blue]; p [color="#dd5522"]; left [color=blue];right [color=blue]; invis0 [style=invis]; invis1 [style=invis]; invis2 [style=invis]; node [shape=plaintext]; null0 [label="NULL"]; null1 [label="NULL"]; null2 [label="NULL"]; null3 [label="NULL"]; { rank=same; list -> p_1 -> n_1; n_1 -> pivot [arrowtail=normal, dir=both, arrowhead=none]; } { rank=same; left -> p_2 -> n_2; } { rank=same; right -> p_3 -> n_3; } { rank=same; null0 -> invis0 [style=invis]; } { rank=same; null1 -> invis1 [style=invis]; } { rank=same; null3 -> invis2 [style=invis]; } n_1 -> null1; p_1 -> null1 [style=invis]; null1 -> p_2 [style=invis]; p_1 -> null1 [style=invis]; n_1 -> invis1 [style=invis]; n_2 -> null3; null3 -> p_3 [style=invis]; invis2 -> n_3 [style=invis]; p_2 -> null3 [style=invis]; n_2 -> invis2 [style=invis]; n_3 -> null0; p_3 -> null0 [style=invis]; n_3 -> invis0 [style=invis]; p -> null2; } ``` ```cpp quicksort(&left); quicksort(&right); ``` 但 left 與 right 都只保證比 pivot 大或小，並不保證串列內部是排序好的，因此針對 left 與 right 進行遞迴呼叫 quicksort 函式。 ```cpp node_t *result = NULL; list_concat(&result, left); list_concat(&result, pivot); /*CCC*/ list_concat(&result, right); /*CCC*/ *list = result; ``` 利用上面所解釋的 list_concat 函式串接三串列，且此函式保證了 result 指標能時時保持指在串列開頭，因此最後只要將 list 指到已串接完成的 result 串列開頭即可。 --- 3. 對應的測試程式如下： ```cpp= static bool list_is_ordered(node_t *list) { bool first = true; int value; while (list) { if (first) { value = list->value; first = false; } else { if (list->value < value) return false; value = list->value; } list = list->next; } return true; } static void list_display(node_t *list) { printf("%s IN ORDER : ", list_is_ordered(list) ? " " : "NOT"); while (list) { printf("%d ", list->value); list = list->next; } printf("\n"); } int main(int argc, char **argv) { size_t count = 20; node_t *list = NULL; while (count--) list = list_make_node_t(list, random() % 1024); list_display(list); quicksort(&list); list_display(list); if (!list_is_ordered(list)) return EXIT_FAILURE; list_free(&list); return EXIT_SUCCESS; } ``` ### 將各函數分開分析如以下各點： #### `list_is_ordered` 函式 ```cpp static bool list_is_ordered(node_t *list) { bool first = true; int value; while (list) { if (first) { value = list->value; first = false; } else { if (list->value < value) return false; value = list->value; } list = list->next; } return true; } ``` 首先看到 return type `bool` 以及 return value `true` 與 `false` ， > `節錄自 C 語言規格書 ISO/IEC 9899:1999 - 7.16 章節` > **7.16 Boolean type and values <stdbool.h>** > The header <stdbool.h> defines four macros. The macro **bool** expands to **\_Bool**. > `節錄自 C 語言規格書 ISO/IEC 9899:1999 - 6.2.5 章節` > **6.2.5 Types** > * An object declared as type **\_Bool** is large enough to store the values 0 and 1. > * The remaining three macros are suitable for use in #if preprocessing directives. They are **true** which expands to the integer constant 1, **false** which expands to the integer constant 0, and **\__bool_true_false_are_defined** which expands to the integer constant 1. **bool** 是可以儲存 0 或 1 的型態， **true** 被定義為 1 ， **false** 被定義為 0 。要使用它們的話必須先 `#include <stdbool.h>` 。（ **\_Bool** 則不用，它是 C 語言關鍵字） > `可參閱 C 語言規格書 6.4.1 Keywords 章節 - Syntax 項目` 此函式在迴圈中分為 `first` 條件與**非** `first` 兩種，因為除了第一次以外，其他次都必須與 value 做比較，第一次進迴圈時 value 是尚未初始化的，也還沒取得第一次的 value ，自然無法做比較。此函式是為了檢查串列是否已經由小排到大：第一圈先將第一個 node 的值賦予 value 。 ```graphviz digraph ordered { rankdir=LR; node [shape=record]; value [label="value|8"]; node [shape=box]; n0 [label="8"]; n1 [label="2"]; n2 [label="6"]; node [shape=plaintext]; list; { rank=same; list;n0; } n0 -> n1 -> n2; } ``` 第二圈檢查第二個 node 的值是否 $\leq$ value ，否則 return false ，此處正好大於 value ，直接 return false ，判斷此串列尚未排序好。 ```graphviz digraph ordered { rankdir=LR; node [shape=record]; value [label="value|8"]; node [shape=box]; n0 [label="8"]; n1 [label="2", fontcolor=red]; n2 [label="6"]; node [shape=plaintext]; list; { rank=same; list;value;n1; } n0 -> n1 -> n2; } ``` 若為以下例子： ```graphviz digraph ordered { rankdir=LR; node [shape=record]; value [label="value|2"]; node [shape=box]; n0 [label="2"]; n1 [label="6"]; n2 [label="8"]; node [shape=plaintext]; list; { rank=same; list;n0; } n0 -> n1 -> n2; } ``` 第二圈檢查第二個 node 的值 $<$ value ，則將 value 改為第二個 node 的值，繼續往下比較。 ```graphviz digraph ordered { rankdir=LR; node [shape=record]; value [label="value|6"]; node [shape=box]; n0 [label="2"]; n1 [label="6"]; n2 [label="8"]; node [shape=plaintext]; list; { rank=same; list;value;n1; } n0 -> n1 -> n2; } ``` 若最後都沒有 return false ，則 return true ，判斷此串列已排序。 ```graphviz digraph ordered { rankdir=LR; #node [shape=record]; #value [label="value|6"]; node [shape=box]; n0 [label="2"]; n1 [label="6"]; n2 [label="8"]; node [shape=plaintext]; list;NULL; { rank=same; list;NULL; } n0 -> n1 -> n2 -> NULL; } ``` #### `list_display` 函式 ```cpp static void list_display(node_t *list) { printf("%s IN ORDER : ", list_is_ordered(list) ? " " : "NOT"); while (list) { printf("%d ", list->value); list = list->next; } printf("\n"); } ``` 這個函式就比較單純，先以 `list_is_ordered` 函式判斷串列是否排好序，再依序印出各 node 的值。 #### `main` 函式 ```cpp int main(int argc, char **argv) { size_t count = 20; node_t *list = NULL; while (count--) list = list_make_node_t(list, random() % 1024); list_display(list); quicksort(&list); list_display(list); if (!list_is_ordered(list)) return EXIT_FAILURE; list_free(&list); return EXIT_SUCCESS; } ``` 首先觀察到的是 `random()` 函式，經由 man 手冊我觀察到了以下資訊： > `節錄自 Linux man-pages 5.05` > **RANDOM(3)** > * **CONFORMING TO** > POSIX.1-2001, POSIX.1-2008, 4.3BSD. 可以發現這個函式並不包含於 C 語言標準，因此它只能使用於符合以上標準的系統上。至於 `EXIT_FAILURE` 與 `EXIT_SUCCESS` 兩個 macro ， C 語言規格書寫道： > `節錄自 C 語言規格書 ISO/IEC 9899:1999 - 7.20 章節` > **7.20 General utilities <stdlib.h>** > * The header **<stdlib.h>** declares five types and several functions of general utility, and defines several macros. > ... > * The macros defined are **NULL** (described in 7.17); **EXIT_FAILURE** and **EXIT_SUCCESS** which expand to integer constant expressions that can be used as the argument to the exit function to return unsuccessful or successful termination status, respectively, to the host environment; ... 直接去找 `stdlib.h` 檔案（我的裝置上這個檔案位於 `/usr/include/stdlib.h` ），搜尋 `EXIT_SUCCESS` 關鍵字： ```cpp /* We define these the same for all machines. Changes from this to the outside world should be done in `_exit'. */ #define EXIT_FAILURE 1 /* Failing exit status. */ #define EXIT_SUCCESS 0 /* Successful exit status. */ ``` 因此為了使用這兩個 macro 務必先 `#include <stdlib.h>` 。回到函式本身，先來看前面的一句敘述： ```cpp size_t count = 20; ``` **size_t** 是一個型態， > `節錄自 C 語言規格書 ISO/IEC 9899:1999 - 6.5.3.4 章節` > **6.5.3.4 The sizeof operator** > * The value of the result is implementation-defined, and its type (an unsigned integer type) is **size_t**, defined in <stddef.h> (and other headers). 實際去查找 `stddef.h` 檔案，發現並不在 `/usr/include/` 中，而是在 `/usr/lib/gcc` 的子資料夾中，是由 GCC 所實作的檔案： ```cpp #ifndef __SIZE_TYPE__ #define __SIZE_TYPE__ long unsigned int #endif #if !(defined (__GNUG__) && defined (size_t)) typedef __SIZE_TYPE__ size_t; ``` 對於 `__GNUG__` 這個 macro ，查閱 GCC 官方文件的 [CPP Manual](https://gcc.gnu.org/onlinedocs/cpp/)： > `節錄自 GCC 官方文件 - CPP Manual - 3.7.2 章節` > **3.7.2 Common Predefined Macros** > * **\_\_GNUG\_\_** The GNU C++ compiler defines this. Testing it is equivalent to testing (**\_\_GNUC\_\_** && **\_\_cplusplus**). 這個 macro 是 GNU C++ 編譯器會定義的，因此 size_t 在「不是 GNU C++ 編譯器」或「沒有定義 size_t」的情況下會將 size_t 定義為 `long unsigned int` 型態。回到函式， ```cpp size_t count = 20; node_t *list = NULL; while (count--) list = list_make_node_t(list, random() % 1024); ``` 由程式可以猜測這邊想要隨機產生 20 個介在 0 ~ 1023 的數字，並加入到 list 中，這邊題目沒有定義，稍後會補上這個函式的詳細定義。 ```cpp list_display(list); quicksort(&list); list_display(list); ``` 這邊程式會呼叫 `list_display` 函式印出排序前的數字，並以 `quicksort` 函式真正執行排序，再印出排序後的結果。 ```cpp if (!list_is_ordered(list)) return EXIT_FAILURE; ``` 這邊是以 `list_is_ordered` 函式驗證排序是否正確，若失敗則 return EXIT_FAILURE 。 ```cpp list_free(&list); return EXIT_SUCCESS; ``` 若剛才的驗證沒有失敗，則呼叫 `list_free` 函式（而這個函式題目也是尚未定義的，稍後會補上詳細定義，猜測函式的目的是要釋放 `list_make_node_t` 函式配置的記憶體空間），最後視為成功而 return EXIT_SUCCESS 。 ### 補充尚未定義的函式 #### `list_make_node_t` 函式 ```cpp static node_t *list_make_node_t(node_t *list, int n) { node_t *tmp = (node_t *)malloc(sizeof(node_t)); tmp->value = n; tmp->next = list; return tmp; } ``` 這次 `list` 參數以 `node_t *` 型態傳入， `list` 這個指標只會複製它的值進入函式，不會修改到傳入的指標本身，因此只能選擇將新配置的 node 指標 return 回去。 #### `list_free` 函式 ```cpp static void list_free(node_t **list) { while (*list) { node_t *tmp = *list; *list = (*list)->next; free(tmp); } } ``` 傳入的參數是 node_t 指標的指標 `list` ，可以直接更動到 `*list` 的值，先檢查 `*list` 是否為空，若否，則先以 `tmp` 儲存 `*list` 指標值，這是稍後要釋放的空間，再將 `*list` 指標往後移動一個 node ，再來就是釋放 tmp 中配置的空間，最後重新回迴圈開頭檢查 `*list` 是否為空，直到所有 node 釋放完畢。 ### 實際執行 ```shell NOT IN ORDER : 1016 84 706 124 326 483 763 498 939 186 205 809 236 74 255 81 115 105 966 359 IN ORDER : 74 81 84 105 115 124 186 205 236 255 326 359 483 498 706 763 809 939 966 1016 ``` ### Valgrind 檢查記憶體洩漏 ```shell ==44461== Memcheck, a memory error detector ==44461== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==44461== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info ==44461== Command: ./list ==44461== NOT IN ORDER : 1016 84 706 124 326 483 763 498 939 186 205 809 236 74 255 81 115 105 966 359 IN ORDER : 74 81 84 105 115 124 186 205 236 255 326 359 483 498 706 763 809 939 966 1016 ==44461== ==44461== HEAP SUMMARY: ==44461== in use at exit: 0 bytes in 0 blocks ==44461== total heap usage: 21 allocs, 21 frees, 1,344 bytes allocated ==44461== ==44461== All heap blocks were freed -- no leaks are possible ==44461== ==44461== For lists of detected and suppressed errors, rerun with: -s ==44461== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) ``` 結果： `list_free` 順利執行，成功釋放已配置的記憶體空間。