2021q1 Homework1 (quiz1)

# 2021q1 Homework1 (quiz1) contributed by < `xxiex123` > ## 延伸問題1 解析以下code ``` cpp static inline void list_add_node_t(node_t **list, node_t *node_t) { node_t->next = *list; *list = node_t; } static inline void list_concat(node_t **left, node_t *right) { while (*left) left = &((*left)->next); *left = right; } void quicksort(node_t **list) { if (!*list) return; node_t *pivot = *list; int value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; node_t *left = NULL, *right = NULL; while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value > value ? &right : &left, n); } quicksort(&left); quicksort(&right); node_t *result = NULL; list_concat(&result, left); list_concat(&result, pivot); list_concat(&result, right); *list = result; } ``` 我們先從簡單的兩個function著手瞭解: **list_add_node_t** ```graphviz digraph graph1{ node [shape=box];nodeA; { rank="same"; nodeA->null; } } ``` 假設需要加入一個nodeB，經過list_add_node_t（&list,nodeB);后 (把nodeB從list頭部加入) ```graphviz digraph graph2{ node [shape=box]; { rank="same"; nodeB->nodeA->null; } } ``` nodeB是以pass by value的方式被assign到nodeA的前面。 **list_concat** 從上面的結果延續，假設想要從上面的list的尾部接上另外一組list， ```graphviz digraph graph4{ node [shape=box]; { rank="same"; D->C->null; } } ``` 經過list_concat(&nodeB,D);之後 ```graphviz digraph graph3{ node [shape=box]; { rank="same"; nodeB->nodeA->D->C->null; } } ``` 其中，D為head的list是以pass by value的方式assign 到nodeB的list。 **quicksort** 在瞭解主程式碼之前，先知道一下quicksort的概念： quicksort主要的想法是從一組數中取其中一個數，叫做pivot，它被用來作爲一個參照，區分找出這組數中、比pivot的值小的和比pivot大的，再把較小的數全部排到pivot的左邊，較大的排到右邊。由此可得到兩組未被排序好的數組，分別爲pivot左邊的數組和pivot 右邊的數組。被quicksort 所區分出來的數組的個數會大於等於1，小於等於n-1（n為被quicksort處理的數組大小），所以每quicksort 一次都會降低排序問題的複雜度，因此只要不斷對子數組做quicksort，就能得到一個已排序好的數組。以下為實際例子 ``` graphviz digraph graph4{ node[shape=record]; struct [label="<0>7|<1>4|<2>9|<3>3|<4>8"] } ``` 初始狀態,讓參數 j從大到小執行直到 j=1。 ``` graphviz digraph graph4{ node[shape=record]; struct [label="<0>7|<1>4|<2>9|<3>3|<4>8"] node[shape=oval] pivot->struct:0; i->struct:4; j->struct:4; } ``` （loop j=4）因8>7, 把8換到i的位置（這裏的情況也就是他自己）并且i-1 ``` graphviz digraph graph4{ node[shape=record]; struct [label="<0>7|<1>4|<2>9|<3>3|<4>8"] node[shape=oval] pivot->struct:0; i->struct:4; j->struct:4; } ``` （loop j=3）因3<7,不執行任何指令 ``` graphviz digraph graph4{ node[shape=record]; struct [label="<0>7|<1>4|<2>9|<3>3|<4>8"] node[shape=oval] pivot->struct:0; i->struct:3; j->struct:3; } ``` （loop j=2），9>7,把9換到i的位置并且i-1. ``` graphviz digraph graph4{ node[shape=record]; struct [label="<0>7|<1>4|<2>9|<3>3|<4>8"] node[shape=oval] pivot->struct:0; struct:3->struct:2[color=red,dir=both] i->struct:3; j->struct:2; } ``` （loop j=1），4<7,不執行任何指令 ``` graphviz digraph graph4{ node[shape=record]; struct [label="<0>7|<1>4|<2>3|<3>9|<4>8"] node[shape=oval] pivot->struct:0; i->struct:2; j->struct:1; } ``` （loop j=0），for loop 結束，并且把pivot 換到i的位置 ``` graphviz digraph graph4{ node[shape=record]; struct [label="<0>7|<1>4|<2>3|<3>9|<4>8"] node[shape=oval] pivot->struct:0; i->struct:2; j->struct:0; } ``` 由此形成以pivot左右兩邊的兩組組未排序數組，因此我們要再次對它們做quick sort，因此一般quick sort是一種迭代函數 ``` graphviz digraph graph4{ node[shape=record]; struct [label="<0>3|<1>4|<2>7|<3>9|<4>8"] node[shape=oval] pivot->struct:2; } ``` 最終形成 ``` graphviz digraph graph4{ node[shape=record]; struct [label="<0>3|<1>4|<2>7|<3>8|<4>9"] node[shape=oval] pivot->struct:2; } ``` 知道了quick sort的概念后，我們便很容易瞭解用來 sort link list 的主程式碼， ```cpp= void quicksort(node_t **list) { if (!*list) return; node_t *pivot = *list; int value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; node_t *left = NULL, *right = NULL; while (p) { node_t *n = p; p = p->next; list_add_node_t(n->value > value ? &right : &left, n); } quicksort(&left); quicksort(&right); node_t *result = NULL; list_concat(&result, left); list_concat(&result, pivot); list_concat(&result, right); *list = result; } ``` 這裏的quick sort與之前的例子概念一致，只是被設計成針對link-list的資料結構而不是前面的array。以下圖的list為例子做實際示範 ``` graphviz digraph graph4{ node[shape=record] struct1 [label="<v> 8|<next> next"]; struct2 [label="<v> 4|<next> next"]; struct3 [label="<v> 3|<next> next"]; struct4 [label="<v> 7|<next> next"]; struct5 [label="<v> 9|<next> next"]; struct6 [shape=none,label="null"]; rankdir=LR struct1:next->struct2; struct2:next->struct3; struct3:next->struct4; struct4:next->struct5; struct5:next->struct6; } ``` 程式碼内： 3,4行是迭代函數的中斷條件 6-9行只是從list 中取head作爲pivot 11行宣告兩組list，分別爲left 和right，用來區分比pivot大或比pivot小的數直到11行爲止，狀態如下 ``` graphviz digraph graph4{ node[shape=record] struct1 [label="p"] struct2 [label="<v> 4|<next> next"]; struct3 [label="<v> 3|<next> next"]; struct4 [label="<v> 7|<next> next"]; struct5 [label="<v> 9|<next> next"]; struct6 [shape=none,label="null"]; rankdir=LR struct7 [label="left|{<v>nan|<next> next}"]; struct8 [label="right|{<v>nan |<next> next}"]; struct9 [shape=none,label="null"]; struct10 [shape=none,label="null"]; struct12 [shape=none,label="null"]; struct13 [shape=record,label="value|8"] struct11 [label="pivot|{<v>8|<next>next}"]; struct1->struct2 struct2:next->struct3; struct3:next->struct4; struct4:next->struct5; struct5:next->struct6; struct7:next->struct9; struct8:next->struct10; struct11:next->struct12; } ``` 下面開始做數組的區分（程式碼12-16行）若p不是指向null，就把p指向的node拿來和pivot做對比，如果比pivot小，就把其插入到left 的集合裏，相反則插入到right集合，之後把p指向下一個元素，繼續進行該指令，直到p指向null爲止（list 結束比較）比較結束后,產生如下3個list ``` graphviz digraph graph4{ node[shape=record] struct3 [label="<v> 3|<next> next"]; struct4 [label="<v> 7|<next> next"]; rankdir=LR struct7 [label="right|{<v>9|<next> next}"]; struct8 [label="left|{<v>4 |<next> next}"]; struct9 [shape=none,label="null"]; struct10 [shape=none,label="null"]; struct12 [shape=none,label="null"]; struct11 [label="pivot|{<v>8|<next>next}"]; struct7:next->struct9; struct8:next->struct3->struct4->struct10; struct11:next->struct12; } ``` left的list 值比pivot 低，right 的list 值都比pivot 高，并且left right 都是未被排序的，所以把left 和right 都丟入quicksort 做排序。因爲right只有一個元素，所以進入quick sort后滿足第4行的條件，直接退出不需要做，而left 有多餘一個element 所以需要再跑一次quick sort。 ``` graphviz digraph graph4{ node[shape=record] struct3 [label="<v> 3|<next> next"]; struct4 [label="<v> 7|<next> next"]; rankdir=LR struct8 [label="left|{<v>4 |<next> next}"]; struct10 [shape=none,label="null"]; struct8:next->struct3->struct4->struct10; } ``` 做完quick sort 得到 ``` graphviz digraph graph4{ node[shape=record] rankdir=LR struct7 [label="right|{<v>7|<next> next}"]; struct8 [label="left|{<v>3 |<next> next}"]; struct9 [shape=none,label="null"]; struct10 [shape=none,label="null"]; struct12 [shape=none,label="null"]; struct11 [label="pivot|{<v>4|<next>next}"]; struct7:next->struct9; struct8:next->struct10; struct11:next->struct12; } ``` 因這裏left right 都只有一個元素，所以不需要再做quick sort。在這時候就會執行第21-24行程式碼，主要是把這三個list 以 left-> pivot - > right 整合在一起。因此得到 ``` graphviz digraph graph4{ node[shape=record] struct3 [label="<v> 4|<next> next"]; struct4 [label="<v> 7|<next> next"]; rankdir=LR struct8 [label="left|{<v>3 |<next> next}"]; struct10 [shape=none,label="null"]; struct8:next->struct3->struct4->struct10; } ``` 這時第二層的quick sort執行完畢，把left 變成上圖已排序好的形式了，所以第一層接著執行剛剛 combine 三個list 的動作，最終形成 ``` graphviz digraph graph4{ node[shape=record] struct1 [label="<v> 3|<next> next"]; struct2 [label="<v> 4|<next> next"]; struct3 [label="<v> 7|<next> next"]; struct4 [label="<v> 8|<next> next"]; struct5 [label="<v> 9|<next> next"]; struct6 [shape=none,label="null"]; rankdir=LR struct1:next->struct2->struct3->struct4->struct5->struct6; } ``` 測試程式碼内用到random（）的函數，但random 函數在沒有給予seed的情況下，會以一比固定的亂數作爲random 輸出的參考，因此每次執行程式碼時，都用固定的亂數，造成每次的結果都一模一樣。 ```cpp int main(int argc, char **argv) { size_t count = 20; node_t *list = NULL; while (count--) list = list_make_node_t(list, random() % 1024); ``` 要改變這種囧境，我們可以給予random函數一個seed，設定seed 的方法為 srand（**要給的seed**），想要每次執行都得到不同的亂數，通常以time(0)為seed，time 函數會回傳現在的即時時間，因時間一直在走動，所以產生出來的亂數也每分每秒都不一樣。可以參考http://www.cplusplus.com/reference/cstdlib/srand/ ```cpp int main(int argc, char **argv) { size_t count = 20; srand(time(0)); node_t *list = NULL; while (count--) list = list_make_node_t(list, random() % 1024); ``` --- ## 延申問題2 由參考[Optimized QuickSort — C Implementation (Non-Recursive)](https://alienryderflex.com/quicksort/) 的文章，該作者主要利用兩個array 來當作stack去處理recursion 的邏輯運作，但因爲這樣可能造成**處理大筆資料**的時候因預先設定好的array不足夠大而處理失敗（因爲所需要的array 最大空間可能會是‘所處理的資料數量’-1）。但作者也有提到相對應的解決方案，利用了時間成本換取了執行成功的保証。利用作者所提供的演算法套入我們討論的link-list quick sort問題，解決不需要recursion來執行quick sort。具體改變如下 ```cpp= int findLength(node_t **list){ node_t *p =*list;int lenght =0; while(p){ lenght++; p=p->next; } return lenght; } void findLocat(node_t **list,int target){ while(target>1){ list=&((*list)->next); target--; } if(target ==1){ node_t *t = (*list)->next; (*list)->next=null; list=&t; } } void quicksort(node_t **list) { #define MAX_LEVELS 300 int value,beg[MAX_LEVELS], end[MAX_LEVELS], i=0,Lend=0, Rend =0; Lend = findLength(*list); beg[0]=0; end[0]=Lend-1; while(i>=0){ node_t *left = NULL, *right = NULL; node_t *pivot = *list; findLocat(&pivot,beg[i]); value = pivot->value; node_t *p = pivot->next; node_t *prev = pivot; if(end[i]>0){ while (end[i]>0) { node_t *n = p; p = p->next; if(n->value > value){ Rend++; prev = n; } else{ Lend++; prev->next = p; list_add_node_t(&left,n); } end[i]--; } node_t *result = *list; if(beg[i]==0) result = NULL; list_concat(&result, left); list_concat(&result, pivot); list_concat(&result, right); *list = result; if (Lend>=Rend) { beg[i+1]=Lend+2; end[i]=Lend-1; end[i+1]=Rend-1;i++; } else { beg[i+1]=beg[i];beg[i]=Lend+2; end[i]=Rend-1; end[i+1]=Lend-1;i++; } } else i--; }} ``` 想要代替 recursion ，通常都需要利用到類似 stack 的概念，因爲 recursion 的行爲就與 stack 有異曲同工之妙，因此學習作者利用 beg 與 end 的空間（FILO）來存放工作序列。分類方法的具體差別在於 singly link-list 不像 array，不能直接指定位置，因此排序時的操作需要修改。修改方法利用了 list 的特性（易拆解、易合并），來避免用到 list 尾部的資料，提升效率。 ``` graphviz digraph graph4{ node[shape=record] struct1 [label="<v> 7|<next> next"]; struct2 [label="<v> 4|<next> next"]; struct3 [label="<v> 3|<next> next"]; struct4 [label="<v> 8|<next> next"]; struct5 [label="<v> 9|<next> next"]; struct6 [shape=none,label="null"]; rankdir=LR struct1:next->struct2->struct3->struct4->struct5->struct6; } ``` ``` graphviz digraph graph4{ node[shape=record] struct1 [label="<v> 7|<next> next"]; struct2 [label="<v> 4|<next> next"]; struct3 [label="<v> 3|<next> next"]; struct4 [label="<v> 8|<next> next"]; struct5 [label="<v> 9|<next> next"]; struct6 [shape=none,label="null"]; struct7 [shape=none,label="null"]; rankdir=LR pivot->struct1; smaller->struct2->struct3->struct7; bigger->struct4->struct5->struct6; } ``` 到這裏會記錄下 smaller 與bigger 的 list 的開始位置與結束位置，作爲下一筆要處理的資料存放到 beg 和 end 。 concat 之後： ``` graphviz digraph graph4{ node[shape=record] struct1 [label="<v> 4|<next> next"]; struct2 [label="<v> 3|<next> next"]; struct3 [label="<v> 7|<next> next"]; struct4 [label="<v> 8|<next> next"]; struct5 [label="<v> 9|<next> next"]; struct6 [shape=none,label="null"]; rankdir=LR struct1:next->struct2->struct3->struct4->struct5->struct6; } ``` 從上圖可以看到 linked-list 的分類方法與平常的 array 的不同之處。另外作者的描述，如果每一次選擇較短的資料做為下一筆的分類目標，則可以使該演算法適用于任意大小的 list ，而不像前面説的如果超過MAXLEVEL 就有可能排序失敗。從數學的角度去分析，當一筆資料被分成兩半時，較短的一邊大小肯定小於等於原資料的二分之一。因此在 MAXLEVEL = 300 的同時，意味著可以處理 $2^{300}$ (比宇宙原子數量多) 大小的 list ，超過這個大小的問題，基本上微乎其微。程式碼下方也加上了同樣的機制。 ```cpp if (Lend>=Rend) { beg[i+1]=Lend+2; end[i]=Lend-1; end[i+1]=Rend-1;i++; } else { beg[i+1]=beg[i];beg[i]=Lend+2; end[i]=Rend-1; end[i+1]=Lend-1;i++; } } ``` --- ## 延申問題3 以上所探討的程式碼都是基於簡單的linked list 結構 ```cpp typedef struct __node { int value; struct __node *next; } node_t; ``` 而linux-list 的資料結構較爲複雜。linux-list 先是定義出一個list-head ```cpp struct list_head { struct list_head *prev; struct list_head *next; }; ``` 其是一個doubly-linked list，并且再針對其定義出各種作業函數 ```cpp container_of(ptr, type, member) static inline void INIT_LIST_HEAD(struct list_head *head) static inline void list_add(struct list_head *node, struct list_head *head) static inline void list_del(struct list_head *node) static inline int list_empty(const struct list_head *head) static inline int list_is_singular(const struct list_head *head) static inline void list_splice(struct list_head *list, struct list_head *head)static inline void list_cut_position(struct list_head *head_to,struct list_head *head_from,struct list_head *node) static inline void list_move(struct list_head *node, struct list_head *head) list_entry(node, type, member) list_for_each(node, head) ``` 因爲這些作業函數是以list 為circular為前提才能正常操作，因此若慾使用其作業函數，最好把linked-list用環狀鏈接。有了這些作業函數，我們可以對list-head 做許多操作，因爲list-head也**只有**指向前後兩個list-head 形態的資料，所以可想而知上面這些作業函數不過是幫助找到list中的某個node。這種資料形態并沒有自己的資料變數（像之前討論的就有value作爲儲存資料的變數），因此它是被用來建構在一個有資料變數的新結構裏面，使新的結構具備完整linked-list 的特質。如下： ```cpp typedef struct __element { char *value; struct __element *next; struct list_head list; } list_ele_t; ``` 資料變數為value， next 是指到下一個__element 形態的物件，可以當作singular linked list 來使用，而 list_head 形態的list就是用來做doubly-linked list的工具。這個外層的結構可以依照需求去自行定義，而list_head 扮演著的是提供結構一個doubly-linked list的工具。