--- tags: linux2022 --- contributed by < `scottxxxabc` > > [作業說明](https://hackmd.io/@sysprog/linux2022-lab0) > [GitHub](https://github.com/scottxxxabc/lab0-c) ## 開發環境 (使用 VirtualBox) :::warning 注意書寫方式: VirtualBox (中間沒有空格) :notes: jserv ::: ```bash $ uname -r 5.13.0-30-generic $ gcc --version gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 Copyright (C) 2019 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 39 bits physical, 48 bits virtual CPU(s): 1 On-line CPU(s) list: 0 Thread(s) per core: 1 Core(s) per socket: 1 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family: 6 Model: 140 Model name: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz Stepping: 1 CPU MHz: 2995.202 BogoMIPS: 5990.40 Hypervisor vendor: KVM Virtualization type: full L1d cache: 48 KiB L1i cache: 32 KiB L2 cache: 1.3 MiB L3 cache: 12 MiB ``` ## 開發環境設定 安裝必要的開發工具套件 ```bash $ sudo apt install build-essential git-core valgrind $ sudo apt install cppcheck clang-format aspell colordiff ``` CppCheck版本 ```bash $ cppcheck --version Cppcheck 1.90 ``` ## queue.c 的實作 ### q_new() ```c struct list_head *q_new() { struct list_head *head = malloc(sizeof(struct list_head)); if (!head) return NULL head->next = head; head->prev = head; return head; } ``` * `malloc` 一個大小為 `struct list_head` 的記憶體,並將`head` 指向這個記憶體空間。 * 由於 list 尚未有加入元素,因此將 `prev` 以及 `next` 指標指向 `head`。 ```graphviz digraph G { rankdir=LR; node[shape=record]; head[label="head | {<prev>prev|<next>next}"] head:next:e->head:e head:prev:w->head:w } ``` *** ### q_free() ```c void q_free(struct list_head *head) { if (head == NULL) return; head->prev->next = NULL; head->prev = NULL; struct list_head *tmp, *l = head->next; while (l != NULL) { tmp = l->next; element_t *e = container_of(l, element_t, list); q_release_element(e); l = tmp; } free(head); } ``` * 檢查list是否為空 * 將 circular doubly-linked list 首尾互相連結的 link 打斷。 * 指標 `l` 指向要刪除的節點,刪除 `l` 時 `l->next` 指標也會遺失,因此用指標 `tmp` 指向 `l` 的下一個節點。 * `container_of(l, element_t, list)` 取得包含 `l` 的 `element_t` 節點。 * 使用 `q_release_element()` 釋放占用的記憶體。 * 迴圈重複上述步驟直到 `l` 指向的節點為 `NULL`,也就是list尾端。 * 最後釋放head占用的記憶體。 :::warning 注意書寫方式: doubly-linked list (留意連字號和空白字元) :notes: jserv ::: *增加下方程式碼避免輸入 list 為空* ```c if (head == NULL) return; ``` :::warning 可改寫為: ```c if (!head) return; ``` :notes: jserv ::: ### q_insert_head() ```c bool q_insert_head(struct list_head *head, char *s) { if (!head) return false; element_t *new = malloc(sizeof(element_t)); if (!new) return false; new->value = strdup(s); if (new->value == NULL) { free(new); return false; } struct list_head *tmp = head->next; tmp->prev = &(new->list); (new->list).next = tmp; head->next = &(new->list); (new->list).prev = head; // cppcheck-suppress memleak return true; } ``` * 檢查 `head` 是否為空以及新加入的元素 `new` 是否成功 `malloc` 。 * 使用 `strdup` 配置記憶體並複製字串,並檢查是否成功。 * 將新元素插入 list 前端。 #### **<font color="#f70">CppCheck error : `Memory leak`</font>** `new` 指向的新配置的記憶體未在結束時釋放,導致 `new` 指向的該記憶體在 `q_inserthead()` 結束後無法存取。 然而我們仍能利用`container_of()`及 該記憶體空間包含的 list 節點位址來重新取得記憶體空間的位址 當需要使用到該記憶體時,可以使用 ```c element_t *tmp = container_of(node, element_t, list); ``` **<font color="#444">加入註解 `// cppcheck-suppress memleak` 忽略 CppCheck 判斷 `new` 指標未釋放的 memory leak error。</font>** :::warning 你應該說明為何抑制 Cppcheck 的 memory leak 錯誤是正確的操作。 :notes: jserv ::: *** ### q_insert_tail() ```c bool q_insert_tail(struct list_head *head, char *s) { if (!head) return false; element_t *new = malloc(sizeof(element_t)); if (!new) return false; new->value = strdup(s); if (new->value == NULL) { free(new); return false; } struct list_head *tmp = head->prev; tmp->next = &(new->list); (new->list).prev = tmp; head->prev = &(new->list); (new->list).next = head; // cppcheck-suppress memleak return true; } ``` * 做法與 `q_insert_head()` 相似,將最後一個步驟改為插入尾端。 * *** ### q_remove_head() ```c element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (q_size(head) == 0) return NULL; struct list_head *r = head->next; head->next = r->next; r->next->prev = head; element_t *tmp = container_of(r, element_t, list); size_t str_size = sizeof(sp) <= bufsize - 1 ? sizeof(sp) : bufsize - 1; memset(sp, '\0', bufsize); memcpy(sp, tmp->value, str_size); return tmp; } ``` * 檢查 list 是否為空。 * 將 list 第一個元素移出。 * 使用 `memcpy` 將 `bufsize` 長的的記憶體空間複製至 `sp`。 * 將 `sp` 最後一個字元設為 `\0`。 ** 將`if (head == NULL || head->next == NULL)`的判斷修改為`q_size(head) == 0`, 以及將 ```c size_t str_size = sizeof(sp) <= bufsize - 1 ? sizeof(sp) : bufsize - 1; memset(sp, '\0', bufsize); memcpy(sp, tmp->value, str_size); ``` 改成 ```c if(sp){ memcpy(sp, tmp->value, bufsize); sp[bufsize - 1] = '\0'; } ``` *** ### q_remove_tail() ```c element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) { if (head == NULL || head->next == NULL) return NULL; struct list_head *r = head->prev; r->prev->next = head; head->prev = r->prev; element_t *tmp = container_of(r, element_t, list); size_t str_size = sizeof(sp) <= bufsize - 1 ? sizeof(sp) : bufsize - 1; memset(sp, '\0', bufsize); memcpy(sp, tmp->value, str_size); return tmp; } ``` * 做法與 `q_remove_head()` 相似,將第二個步驟改為尾端元素。 *** ### q_size() ```c int q_size(struct list_head *head) { if (head == NULL || head->next == NULL) return 0; int count = 0; struct list_head **pptr = &head->next; for (; *pptr != head; pptr = &(*pptr)->next, count++) ; return count; } ``` * 走訪 list 的每一個節點並計算長度。 利用額外空間存放 list 長度可使q_size()的呼叫為 constant time。 *** ### q_delete_mid() ```c bool q_delete_mid(struct list_head *head) { if (head == NULL || head->next == NULL) return false; struct list_head *slow = head->next, *fast = head->next->next; for (; fast != head && fast->next != head; slow = slow->next, fast = fast->next->next) ; slow->prev->next = slow->next; slow->next->prev = slow->prev; element_t *tmp = container_of(slow, element_t, list); q_release_element(tmp); return true; } ``` * 檢查 list 是否為空。 * 初始化 `slow` 以及 `fast` 指向 list 第一及第二個節點。 - `slow`每次前進一個節點。 - `fast`每次前進兩個節點。 * 當 `fast` 前進到 list 的尾端,也就是list 的最後一個節點或 head 時,`slow` 會指向中間的節點。 * 呼叫 `q_release_element()` 刪除 `slow` 指向的節點。 *** ### q_delete_dup() ```c bool q_delete_dup(struct list_head *head) { if (q_size(head) < 1) return false; struct list_head *tmp = head->next; while (tmp != head) { while (tmp->next != head && strcmp(container_of(tmp, element_t, list)->value, container_of(tmp->next, element_t, list)->value) == 0) { struct list_head *tmp2 = tmp->next; list_del(tmp->next); q_release_element(container_of(tmp2, element_t, list)); } tmp = tmp->next; } return true; } ``` * 檢查 list 是否為空。 * 將 `tmp` 指向第一個節點。 * 刪除與tmp值相同的元素 - 若 `tmp` 與 `tmp->next` 值相同,移出並刪除 `tmp->next`。 - 重複以上步驟直到 `tmp` 與 `tmp->next` 值不同 * `tmp` 指向 `tmp->next`。 * 重複直到 `tmp` 指向陣列尾端。 *** ### q_swap() ```c void q_swap(struct list_head *head) { if (q_size(head) <= 1) return; struct list_head *node1 = head->next, *node2 = node1->next; while (node1 != head && node2 != head) { node1->prev->next = node2; node2->prev = node1->prev; node2->next->prev = node1; node1->next = node2->next; node1->prev = node2; node2->next = node1; node1 = node1->next; if (node1 == head) break; node2 = node1->next; } return; } ``` * 檢查 list 是否為空或只有1個節點。 * 交換 `node1` 及 `node2` 指向的節點。 *** ### q_reverse() ```c void q_reverse(struct list_head *head) { if (q_size(head) <= 1) return; struct list_head *ptr = head, *prev_node = ptr->prev, *next_node = ptr->next; do { ptr->prev = next_node; ptr->next = prev_node; prev_node = ptr; ptr = ptr->prev; next_node = ptr->next; } while (ptr != head); } ``` * 檢查 list 是否為空或只有1個節點。 * `ptr` 指向現在的節點,`prev_node` 指向 `ptr` 的前一個節點,`next_node` 指向 `ptr` 的後一個節點。 - 交換`ptr->prev`、`ptr->next` 指向的位址。 - `ptr`、`prev_node`、`next_node` 往下一個節點前進。 - 重複直到 `ptr` 指向 list 尾端。 *** ### q_sort() `q_sort()` 由 `q_sort()`, `merge()`, `mergesort()` 三個函式組成。 #### q_sort() 將環狀結構拆開,呼叫 `merge_sort()` 排序後將 list 重新鏈接成 *circular*。 ```c void q_sort(struct list_head *head) { if (q_size(head) <= 1) return; struct list_head *tmp = head->next; head->prev->next = NULL; head->next->prev = NULL; head->prev = NULL; head->next = NULL; struct list_head *newhead = merge_sort(tmp); head->next = newhead; newhead->prev = head; struct list_head **pptr = &head; for (; (*pptr)->next != NULL; pptr = &(*pptr)->next) ; head->prev = *pptr; (*pptr)->next = head; } ``` #### merge_sort() 排序輸入的 *doubly-linked list* ```c struct list_head *merge_sort(struct list_head *pptr2first) { if (pptr2first == NULL || pptr2first->next == NULL) return pptr2first; struct list_head *slow = pptr2first, *fast = slow->next; for (; fast != NULL && fast->next != NULL; slow = slow->next, fast = fast->next->next) ; struct list_head *tmp = slow->next; tmp->prev = NULL; slow->next = NULL; struct list_head *l1 = merge_sort(pptr2first); struct list_head *l2 = merge_sort(tmp); return merge(l1, l2); } ``` #### merge() 將兩個 *doubly-linked list* merge成一個。 ```c struct list_head *merge(struct list_head *list1, struct list_head *list2) { char *v1, *v2; if (!list1 || !list2) return (list2 == NULL) ? list1 : list2; struct list_head *newhead = NULL, *tail = NULL, **node = NULL; v1 = container_of(list1, element_t, list)->value; v2 = container_of(list2, element_t, list)->value; node = (strcmp(v1, v2) <= 0) ? &list1 : &list2; newhead = tail = *node; *node = (*node)->next; // cppcheck-suppress knownConditionTrueFalse while (list1 && list2) { v1 = container_of(list1, element_t, list)->value; v2 = container_of(list2, element_t, list)->value; node = (*v1 <= *v2) ? &list1 : &list2; (*node)->prev = tail; tail->next = *node; tail = tail->next; *node = (*node)->next; } // cppcheck-suppress knownConditionTrueFalse node = (list2 == NULL) ? &list1 : &list2; (*node)->prev = tail; tail->next = *node; return newhead; } ``` 採用遞迴的方式實作,不使用原本的環狀結構以及`head`節點,避免多餘的指標數值指派以及空間配置。 ```graphviz digraph G { rankdir=LR; ptr[label="ptr"] node[shape=record]; n1[label="1 | {<prev>NULL|<next>next}"] n2[label="2 | {<prev>prev|<next>next}"] n3[label="3 | {<prev>prev|<next>NULL}"] ptr->n1 n1:next->n2 n2:prev:s->n1:s n2:next->n3 n3:prev:s->n2:s } ``` * [`q_sort()`](#q_sort()) - 檢查 list 是否為空或只有1個節點。 - 將 `head` 移出 list ,將環狀結構打斷。 - 呼叫 [`merge_sort()`](#merge_sort()) 排序 list。 - 將 `head` 加入 list 前端,重新連結首尾。 * [`merge_sort()`](#merge_sort()) - 檢查 list 是否為空或只有1個節點。 - 使用與 [`delete_dup()`](#q_delete_dup()) 相同的方式找到中間的節點。 - 從中央節點將 list 分為左右兩個子 list。 - 以兩個子 list 為輸入遞迴呼叫自己,排序兩個子 list。 - 呼叫 [`merge()`](#merge()) 將排序好的兩個 list 合併。 * [`merge()`](#merge()) - `list1`、`list2` 分別指向兩個 *doubly-linked list*。 - `newhead` 指向新 *doubly-linked list* 的第一個節點,`tail` 指向 list 尾端。 - 指標的指標 `node` 指向下一個要接在 `tail` 後的節點。 ```graphviz digraph G { rankdir=LR; tail[label="tail"] list1[label="list1"] list2[label="list2"] pptr[label="node"] node[shape=record] n1[label="# | {<prev>prev|<next>NULL}"] n2[label="1 | {<prev>NULL|<next>next}"] n3[label="2 | {<prev>NULL|<next>next}"] n4[label="3 | {<prev>prev|<next>next}"] tail->n1 list1->n2 list2->n3 pptr->list1 n2:next->n4[dir="both"] } ``` *** * 將 `*node` 指向的節點連接到tail後。 ```graphviz digraph G { rankdir=LR; tail[label="tail"] list1[label="list1"] list2[label="list2"] pptr[label="node"] node[shape=record] n1[label="# | {<prev>prev|<next>next}"] n2[label="1 | {<prev>prev|<next>next}"] n3[label="2 | {<prev>NULL|<next>next}"] n4[label="3 | {<prev>prev|<next>next}"] tail->n1 list1->n2 list2->n3 pptr->list1 n1:next->n2 n2:prev->n1 n2:next->n4[dir=both] } ``` * * `*node` 指向下一個節點。 ```graphviz digraph G { rankdir=LR; tail[label="tail"] list1[label="list1"] list2[label="list2"] pptr[label="node"] node[shape=record] n1[label="# | {<prev>prev|<next>next}"] n2[label="1 | {<prev>prev|<next>next}"] n3[label="2 | {<prev>NULL|<next>next}"] n4[label="3 | {<prev>prev|<next>next}"] tail->n2:n list1->n4 list2->n3 pptr->list1 n2:prev->n1:next[dir=both] } ``` * * 重複直到其中一個 list 為空。 * 將非空的 list 接在 `tail` 後。 #### **<font color="#f70">CppCheck error : `Condition always true/false`</font>** 在 `merge()` 中的 `while (list1 && list2)` 迴圈中透過指標的指標操作 `list1`、 `list2`,但CppCheck仍偵測出 `Condition always true`,推測是由於函式起始時if空指標的判斷 ```c if (!list1 || !list2) return (list2 == NULL) ? list1 : list2; ``` 導致 CppCheck 認為 `list1`, `list2` 非空以及無法偵測出指標的指標的間接操作造成。 下方的程式碼 ```c node = (list2 == NULL) ? &list1 : &list2; ``` 也會因相同原因造成`Condition always false`。 **<font color="#444">加入註解 `// cppcheck-suppress knownConditionTrueFalse` 可忽略此問題。</font>** ## shuffle 命令的實作 在 `console_init()` 加入程式碼 ```c ADD_COMMAND(shuffle, " | Shuffle queue contents"); ``` 執行 `make`,發現錯誤訊息 ```bash $ make CC qtest.o In file included from qtest.c:36: qtest.c: In function ‘console_init’: console.h:46:45: error: ‘do_shuffle’ undeclared (first use in this function) 46 | #define ADD_COMMAND(cmd, msg) add_cmd(#cmd, do_##cmd, msg) | ^~~ qtest.c:793:5: note: in expansion of macro ‘ADD_COMMAND’ 793 | ADD_COMMAND(shuffle, " | Shuffle queue contents"); | ^~~~~~~~~~~ console.h:46:45: note: each undeclared identifier is reported only once for each function it appears in 46 | #define ADD_COMMAND(cmd, msg) add_cmd(#cmd, do_##cmd, msg) | ^~~ qtest.c:793:5: note: in expansion of macro ‘ADD_COMMAND’ 793 | ADD_COMMAND(shuffle, " | Shuffle queue contents"); | ^~~~~~~~~~~ make: *** [Makefile:50: qtest.o] Error 1 ``` 第 5 行顯示 `ADD_COMMAND(cmd, msg)` 巨集內的 `do_##cmd` 函式未宣告,於是加入 ```c static bool do_shuffle() { return true; } ``` 執行 `qtest`,輸入 `help` 後 `shuffle` 出現於第 20 行。 ```bash= $ ./qtest cmd> help Commands: # ... | Display comment dedup | Delete all nodes that have duplicate string dm | Delete middle node in queue free | Delete queue help | Show documentation ih str [n] | Insert string str at head of queue n times. Generate random string(s) if str equals RAND. (default: n == 1) it str [n] | Insert string str at tail of queue n times. Generate random string(s) if str equals RAND. (default: n == 1) log file | Copy output to file new | Create new queue option [name val] | Display or set options quit | Exit program reverse | Reverse queue rh [str] | Remove from head of queue. Optionally compare to expected value str rhq | Remove from head of queue without reporting value. rt [str] | Remove from tail of queue. Optionally compare to expected value str show | Show queue contents shuffle | Shuffle queue contents size [n] | Compute queue size n times (default: n == 1) sort | Sort queue in ascending order source file | Read commands from source file swap | Swap every two adjacent nodes in queue time cmd arg ... | Time command execution Options: echo 1 Do/don't echo commands error 5 Number of errors until exit fail 30 Number of times allow queue operations to return false length 1024 Maximum length of displayed string malloc 0 Malloc failure probability percent simulation 0 Start/Stop simulation mode verbose 4 Verbosity level ``` ### do_shuffle() ```c static bool do_shuffle() { if (!l_meta.l || !l_meta.size) { show_queue(0); return false; } srand(time(NULL)); int i; struct list_head *tail = l_meta.l->prev; for (i = lcnt; i > 1; i--) { int num = rand() % i; int j; struct list_head *ptr = l_meta.l->next; for (j = 0; j < num; j++) ptr = ptr->next; struct list_head *tmp = tail->prev; if (ptr != tail) { if (ptr->next == tail) { ptr->prev->next = tail; tail->prev = ptr->prev; tail->next->prev = ptr; ptr->next = tail->next; ptr->prev = tail; tail->next = ptr; } else { list_del_init(tail); list_add(tail, ptr); list_del_init(ptr); list_add(ptr, tmp); } } if (tmp != l_meta.l) tail = tmp; } show_queue(0); return true; } ``` * 檢查 list 是否為空。 * `tail` 指向 list 最後一個節點。 根據 [Fisher–Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle) 演算法實作,步驟如下 * 在第一個節點到 `tail` 節點之間隨機選一節點 `ptr`。 * `ptr` 與 `tail` 互換。 若 `ptr` 的下一節點為 `tail` ,交換節點之方法可參考[`q_swap()`](#q_swap()) 若 `ptr` 的下一節點不為 `tail` ,: - 將 `tail` 移出 list 。 ```graphviz digraph G { rankdir=LR; head[label="head"] tail[label="tail"] ptr[label="ptr"] tmp[label="tmp"] node[shape=record] n1[label="# | {<prev>prev|<next>next}"] n2[label="1 | {<prev>prev|<next>next}"] n3[label="2 | {<prev>prev|<next>next}"] n4[label="3 | {<prev>prev|<next>next}"] n5[label="4 | {<prev>prev|<next>next}"] head->n1 tail->n5 n1:next->n2:prev[dir=both] n2:next->n3:prev[dir=both] n3:next->n4:prev[dir=both] n4:next->n5:prev[dir=both] n5:next->n1:prev:s[dir=both] tmp->n4 ptr->n2 } ``` ```graphviz digraph G { rankdir=LR; head[label="head"] tail[label="tail"] ptr[label="ptr"] tmp[label="tmp"] node[shape=record] n1[label="# | {<prev>prev|<next>next}"] n2[label="1 | {<prev>prev|<next>next}"] n3[label="2 | {<prev>prev|<next>next}"] n4[label="3 | {<prev>prev|<next>next}"] n5[label="4 | {<prev>prev|<next>next}"] head->n1 tail->n5 n1:next->n2:prev[dir=both] n2:next->n3:prev[dir=both] n3:next->n4:prev[dir=both] n4:next->n1:prev[dir=both] tmp->n4 ptr->n2 } ``` * * 加入到 `ptr` 後 ```graphviz digraph G { rankdir=LR; head[label="head"] tail[label="tail"] ptr[label="ptr"] tmp[label="tmp"] node[shape=record] n1[label="# | {<prev>prev|<next>next}"] n2[label="1 | {<prev>prev|<next>next}"] n3[label="2 | {<prev>prev|<next>next}"] n4[label="3 | {<prev>prev|<next>next}"] n5[label="4 | {<prev>prev|<next>next}"] head->n1 tail->n5 n1:next->n2:prev[dir=both] n2:next->n5:prev[dir=both] n3:next->n4:prev[dir=both] n4:next->n1:prev[dir=both] n5:next->n3:prev:s[dir=both] tmp->n4 ptr->n2 } ``` * * 將 `ptr` 移出 list 。 ```graphviz digraph G { rankdir=LR; head[label="head"] tail[label="tail"] ptr[label="ptr"] tmp[label="tmp"] node[shape=record] n1[label="# | {<prev>prev|<next>next}"] n2[label="1 | {<prev>prev|<next>next}"] n3[label="2 | {<prev>prev|<next>next}"] n4[label="3 | {<prev>prev|<next>next}"] n5[label="4 | {<prev>prev|<next>next}"] head->n1 tail->n5 n1:next->n5:prev[dir=both] n3:next->n4:prev[dir=both] n4:next:s->n1:prev[dir=both] n5:next->n3:prev[dir=both] tmp->n4 ptr->n2 } ``` * * 將 `ptr` 加入 `tmp` 後。 ```graphviz digraph G { rankdir=LR; head[label="head"] ptr[label="ptr"] tmp[label="tmp"] tail[label="tail"] node[shape=record] n1[label="# | {<prev>prev|<next>next}"] n2[label="1 | {<prev>prev|<next>next}"] n3[label="2 | {<prev>prev|<next>next}"] n4[label="3 | {<prev>prev|<next>next}"] n5[label="4 | {<prev>prev|<next>next}"] head->n1 n1:next->n5:prev[dir=both] n3:next->n4:prev[dir=both] n4:next->n2:prev[dir=both] n2:next->n1:prev[dir=both] n5:next->n3:prev[dir=both] tmp->n4 ptr->n2 tail->n5 } ``` ** 將 `tail` 指向 `tmp` 指向的節點。 ## 引入[list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) ### 研讀list_sort.c ```c __attribute__((nonnull(2,3))) void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp) ``` * 參數 `priv` 在 `list_sort()` 中並未使用,只是將 `priv` 傳給 `cmp`,若 `cmp` 也不使用則可傳入 `NULL`。 * 參數 `head` 為要排序的 list。 * `cmp` 為比較元素大小用的 function pointer, `cmp`在 [list_sort.h](https://github.com/torvalds/linux/blob/master/include/linux/list_sort.h) 的定義如下 : ```c typedef int __attribute__((nonnull(2,3))) (*list_cmp_func_t)(void *, const struct list_head *, const struct list_head *); ``` `__attribute__((nonnull(2,3)))`表示第二及第三個參數的值不可能為 `NULL` , #### `list_sort()` `list_sort()` 嘗試將待合併的 list 維持 $2 : 1$ 的長度,若已存在 2 個 $2^k$ 大小的 sublist ,後面的元素總量又達到 $2^k$ 時,就會合併 2 個 $2^k$ 大小的 sublist。 如此作法只要在 L1 Cache 能夠容納 $3*2^k$ 數量的元素時就能避免 thrashing,也能夠減少 $0.2*n$ 次的比較。 ```c struct list_head *list = head->next, *pending = NULL; size_t count = 0; /* Count of pending */ if (list == head->prev) /* Zero or one elements */ return; /* Convert to a null-terminated singly-linked list. */ head->prev->next = NULL; ``` 首先先將 list 尾端連向 `head` 的連結打斷,使其變成單向的 list,`*pending` 指向一個 list,其中每個元素皆為等待被 merge 的 sublist,並巧妙利用原 list 的 `prev` 指標作為指向下一個節點的指標。 ```c do { size_t bits; struct list_head **tail = &pending; /* Find the least-significant clear bit in count */ for (bits = count; bits & 1; bits >>= 1) tail = &(*tail)->prev; /* Do the indicated merge */ if (likely(bits)) { struct list_head *a = *tail, *b = a->prev; a = merge(priv, cmp, b, a); /* Install the merged result in place of the inputs */ a->prev = b->prev; *tail = a; } /* Move one element from input list to pending */ list->prev = pending; pending = list; list = list->next; pending->next = NULL; count++; } while (list); ``` 初始狀態 ```graphviz digraph G { rankdir=LR; head[label="head"] pending[label="pending"] tail[label="tail"] list[label="list"] node[shape=record] n0[label="# | {<prev>prev|<next>next}"] n1[label="1 | {<prev>prev|<next>next}"] n2[label="2 | {<prev>prev|<next>next}"] n3[label="3 | {<prev>prev|<next>next}"] n4[label="4 | {<prev>prev|<next>next}"] head->n0 n0:next->n1:prev[dir=both] n1:next->n2:prev[dir=both] n2:next->n3:prev[dir=both] n3:next->n4:prev[dir=both] n4:next->NULL tail->pending pending->NULL list->n1:n } ``` `count` = $0_{10}$,`bits` = $0_2$,經過 `for` 迴圈之後,`bits` 為 0 ,因此不進行 `merge`,將 `list` 節點加入 `pending`。 ```graphviz digraph G { rankdir=LR; head[label="head"] pending[label="pending"] tail[label="tail"] list[label="list"] node[shape=record] n0[label="# | {<prev>prev|<next>next}"] n1[label="1 | {<prev>prev|<next>next}"] n2[label="2 | {<prev>prev|<next>next}"] n3[label="3 | {<prev>prev|<next>next}"] n4[label="4 | {<prev>prev|<next>next}"] head->n0 n0:next->n1:prev n2:prev->n1 n2:next->n3:prev[dir=both] n3:next->n4:prev[dir=both] n4:next->NULL tail->pending n1:prev->NULL n1:next->NULL pending->n1 list->n2:n subgraph cluster_0 { style=filled; color="pink"; label="sublist 1"; n1; } } ``` `count` = $1_{10}$,`bits` = $01_2$,經過 `for` 迴圈之後,`bits` 為 0 ,因此不進行 `merge`,將 `list` 節點加入 `pending`,建立新的 sublist。 ```graphviz digraph G { rankdir=LR; head[label="head"] pending[label="pending"] tail[label="tail"] list[label="list"] node[shape=record] n0[label="# | {<prev>prev|<next>next}"] n1[label="1 | {<prev>prev|<next>NULL}"] n2[label="2 | {<prev>prev|<next>NULL}"] n3[label="3 | {<prev>prev|<next>next}"] n4[label="4 | {<prev>prev|<next>NULL}"] head->n0 n0:next->n1:w n2:prev->n1 n3:prev->n2 n3:next->n4:prev[dir=both] tail->pending n1:prev->NULL subgraph cluster_0 { style=filled; color="pink"; label="sublist 1"; n1; } subgraph cluster_1 { style=filled; color="pink"; label="sublist 2"; n2; } pending->n2 list->n3 } ``` `count` = $2_{10}$,`bits` = $10_2$,經過 `for` 迴圈之後,`bits` 為 $10_2$ ,因此將`*tail`指向的 sublist 與前一個 sublist `merge`。 ```graphviz digraph G { rankdir=LR; head[label="head"] pending[label="pending"] tail[label="tail"] list[label="list"] node[shape=record] n0[label="# | {<prev>prev|<next>next}"] n1[label="1 | {<prev>NULL|<next>next}"] n2[label="2 | {<prev>prev|<next>NULL}"] n3[label="3 | {<prev>prev|<next>next}"] n4[label="4 | {<prev>prev|<next>NULL}"] head->n0 n0:next->n1 n1:next->n2 n3:prev->n2 n3:next->n4:prev[dir=both] tail->pending pending->n1 list->n3 subgraph cluster_1 { style=filled; color="pink"; label="sublist 1"; n1; n2; } } ``` 再把 `list` 節點加入 `pending`,建立新的 sublist。 ```graphviz digraph G { rankdir=LR; head[label="head"] pending[label="pending"] tail[label="tail"] list[label="list"] node[shape=record] n0[label="# | {<prev>prev|<next>next}"] n1[label="1 | {<prev>NULL|<next>next}"] n2[label="2 | {<prev>prev|<next>NULL}"] n3[label="3 | {<prev>prev|<next>NULL}"] n4[label="4 | {<prev>prev|<next>NULL}"] head->n0 n0:next->n1 n1:next->n2 n3:prev->n1 n4:prev->n3:w tail->pending pending->n3 list->n4 subgraph cluster_0 { style=filled; color="pink"; label="sublist 1"; n1; n2; } subgraph cluster_1 { style=filled; color="pink"; label="sublist 2"; n3; } } ``` 重複以上步驟至 list 沒有剩餘節點, ```c list = pending; pending = pending->prev; for (;;) { struct list_head *next = pending->prev; if (!next) break; list = merge(priv, cmp, pending, list); pending = next; } /* The final merge, rebuilding prev links */ merge_final(priv, cmp, head, pending, list); ``` 參考 [SmallHanley](https://hackmd.io/@SmallHanley/linux2022-lab0)以及 [arthurchang09](https://hackmd.io/@arthur-chang/linux2022-lab0) 的方式,在 `qtest.c` 中加入 `linuxsort` 命令來執行。 ```c static int list_node_cmp(void *priv, const struct list_head *a, const struct list_head *b) { return strcmp(list_entry(a, element_t, list)->value, list_entry(b, element_t, list)->value); } static bool do_linuxsort(){ if (!l_meta.l || !l_meta.size) { show_queue(0); return false; } list_sort(NULL, l_meta.l, list_node_cmp); show_queue(0); return true; } ``` 接著使用 `qtest.c` 的 `time` 命令來計算排序一百萬個節點所花費的時間,如下表。 | 節點個數 | 我的sort | linux sort | 我的sort / linux sort| | -------- | -------- | -------- | -------- | | 500000 | 0.887(s) | 0.435(s) | 0.49| | 1000000 | 1.816(s) | 0.971(s) | 0.53| | 2000000 | 4.098(s) | 2.171(s) | 0.53| 使用 Valgrind 的 Cachegrind 工具分析我的實作,以一百萬個節點為例。 ```bash valgrind --tool=cachegrind --branch-sim=yes ./qtest -f traces/my_sort.cmd ``` ```bash cg_annotate --auto=yes cachegrind.out.5966 -------------------------------------------------------------------------------- I1 cache: 32768 B, 64 B, 8-way associative D1 cache: 49152 B, 64 B, 12-way associative LL cache: 12582912 B, 64 B, 12-way associative Command: ./qtest -f traces/my_sort.cmd Data file: cachegrind.out.5966 -------------------------------------------------------------------------------- Ir I1mr ILmr Dr D1mr DLmr Dw D1mw DLmw Bc Bcm Bi Bim -------------------------------------------------------------------------------- 2,836,569,531 1,750 1,720 719,430,080 57,810,495 27,214,285 331,698,599 4,719,563 4,347,615 394,236,008 15,439,132 39,671,200 337 PROGRAM TOTALS -------------------------------------------------------------------------------- Ir I1mr ILmr Dr D1mr DLmr Dw D1mw DLmw Bc Bcm Bi Bim file:function -------------------------------------------------------------------------------- 400,040,751 4 4 100,371,223 14,478,851 4,528,482 82,696,976 8,144 32 39,862,271 1,561,607 0 0 /home/scott/Documents/lab0-c/queue.c:merge 97,920,684 1 1 35,017,853 16,569,945 4,935,556 6,999,993 263,755 242,938 22,951,423 771,021 0 0 /home/scott/Documents/lab0-c/queue.c:merge_sort 4,000,025 2 2 1,000,006 1,000,001 999,674 11 4 4 1,000,002 7 0 0 /home/scott/Documents/lab0-c/queue.c:q_sort ``` Cachegrind 工具分析 linux list_sort,同樣排序一百萬個節點。 ```bash cg_annotate --auto=yes cachegrind.out.6025 -------------------------------------------------------------------------------- I1 cache: 32768 B, 64 B, 8-way associative D1 cache: 49152 B, 64 B, 12-way associative LL cache: 12582912 B, 64 B, 12-way associative Command: ./qtest -f traces/linux_sort.cmd Data file: cachegrind.out.6025 -------------------------------------------------------------------------------- Ir I1mr ILmr Dr D1mr DLmr Dw D1mw DLmw Bc Bcm Bi Bim -------------------------------------------------------------------------------- 2,706,564,927 1,743 1,713 653,863,237 39,702,748 20,293,696 312,061,904 5,453,266 5,097,384 363,826,443 24,564,592 57,370,502 338 PROGRAM TOTALS -------------------------------------------------------------------------------- Ir I1mr ILmr Dr D1mr DLmr Dw D1mw DLmw Bc Bcm Bi Bim file:function -------------------------------------------------------------------------------- 248,131,972 1 1 27,686,766 3,100,861 1,045,376 43,373,556 0 0 36,373,570 10,023,940 17,686,786 1 /home/scott/Documents/lab0-c/list_sort.c:merge 149,494,248 1 1 56,060,343 12,968,773 4,829,301 18,686,781 0 0 0 0 0 0 /home/scott/Documents/lab0-c/qtest.c:list_node_cmp 42,024,281 8 8 6,999,966 507,586 500,184 8,999,980 1,005,613 992,750 6,000,011 1,406,627 999,995 1 /home/scott/Documents/lab0-c/list_sort.c:list_sort ``` :::success [**Valgrind cg-manual**](https://valgrind.org/docs/manual/cg-manual.html) Cachegrind gathers the following statistics (abbreviations used for each statistic is given in parentheses): * I cache reads (***Ir***, which equals the number of instructions executed), I1 cache read misses (***I1mr***) and LL cache instruction read misses (***ILmr***). * D cache reads (***Dr***, which equals the number of memory reads), D1 cache read misses (***D1mr***), and LL cache data read misses (***DLmr***). * D cache writes (***Dw***, which equals the number of memory writes), D1 cache write misses (***D1mw***), and LL cache data write misses (***DLmw***). * Conditional branches executed (***Bc***) and conditional branches mispredicted (***Bcm***). * Indirect branches executed (***Bi***) and indirect branches mispredicted (***Bim***). ::: | | L1 read miss | LL read miss |Bcm| | -------- | -------- | -------- |--------| | my mergesort | 17,318,797 | 10,463,712 |2,332,635| |`list_sort`|16,577,220|6,374,861|11,430,567| 根據兩次的測試結果可以發現,我的實作在 Cache miss 方面比 `list_sort` 多,但在 Branch mispredicted 則比較少。[Valgrind cg-manual](https://valgrind.org/docs/manual/cg-manual.html) 提到 > On a modern machine, an L1 miss will typically cost around 10 cycles, an LL miss can cost as much as 200 cycles, and a mispredicted branch costs in the region of 10 to 30 cycles. Detailed cache and branch profiling can be very useful for understanding how your program interacts with the machine and thus how to make it faster. 可見 LL miss 的懲罰比起分支預測錯誤還要重很多。對於 `list_sort` 而言,減少 Cache miss 顯然是效益較高的。 仔細查看我的實作程式碼的部分可以發現在 [`merge_sort()`](#merge_sort()) 以快慢節點找出中間位置的時候發生了許多 miss : ```bash Ir I1mr ILmr Dr D1mr DLmr Dw D1mw DLmw Bc Bcm Bi Bim ------------------------------------------------------ 47,787,842 0 0 9,884,992 6,045,985 2,126,317 0 0 0 18,951,425 652,723 0 0 for (; fast != NULL && fast->next != NULL; 18,132,866 0 0 18,132,866 10,517,420 2,809,211 0 0 0 0 0 0 0 slow = slow->next, fast = fast->next->next) ``` 此方法必須要走訪每一個節點,由於資料量高達一百萬個,也不可能將全部資料搬進 Cache,因此每一次調用 `next` 指標都很有可能造成 Cache miss。 [`merge()`](#merge()) 函式中的 `*node = (*node)->next;` ```bash Ir I1mr ILmr Dr D1mr DLmr Dw D1mw DLmw Bc Bcm Bi Bim ------------------------------------------------------ 35,348,494 0 0 17,674,247 2,944,096 948,749 17,674,247 0 0 0 0 0 0 *node = (*node)->next; ``` 另一個造成嚴重 Cache miss 的則是比較字串的部分,以下為 [`merge()`](#merge())字串比較的部分 ```bash Ir I1mr ILmr Dr D1mr DLmr Dw D1mw DLmw Bc Bcm Bi Bim ------------------------------------------------------ 143,393,974 0 0 35,348,494 11,523,119 3,579,693 17,674,247 0 0 0 0 0 0 node = (strcmp(v1, v2) <= 0) ? &list1 : &list2; ``` `list_sort` 則是傳入 `list_sort()` 參數的 `cmp` function。 ```bash Ir I1mr ILmr Dr D1mr DLmr Dw D1mw DLmw Bc Bcm Bi Bim ------------------------------------------------------ 149,494,248 1 1 56,060,343 12,968,773 4,829,301 18,686,781 0 0 0 0 0 0 /home/scott/Documents/lab0-c/qtest.c:list_node_cmp ``` ## 參考資料 [你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list) ***