2024q1 Homework1 (lab0)

# 2024q1 Homework1 (lab0) contributed by < [Appmedia06](https://github.com/Appmedia06) > ## 開發環境 ``` $ gcc --version gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 Vendor ID: GenuineIntel Model name: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz CPU family: 6 Model: 140 Thread(s) per core: 2 Core(s) per socket: 4 Socket(s): 1 Stepping: 1 CPU max MHz: 4200.0000 CPU min MHz: 400.0000 BogoMIPS: 4838.40 Virtualization features: Virtualization: VT-x Caches (sum of all): L1d: 192 KiB (4 instances) L1i: 128 KiB (4 instances) L2: 5 MiB (4 instances) L3: 8 MiB (1 instance) NUMA: NUMA node(s): 1 NUMA node0 CPU(s): 0-7 ``` ## :penguin:作業要求 * 修改 ```queue.[ch]``` 內的函式，並通過 ``` make test ``` 的自動評分系統，其中包括： * ```q_new``` * ```q_free``` * ```q_insert_head``` * ```q_insert_tail``` * ```q_remove_head``` * ```q_remove_tail``` * ```q_size``` * ```q_delete_mid``` * ```q_delete_dup``` * ```q_swap``` * ```q_reverse``` * ```q_reverseK``` * ```q_sort``` * ```q_descend``` * ```q_merge``` * 研讀 Linux 核心原始程式碼的 [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) * Valgrind ＆自動測試獲得 100 分 * 在 ```qtest``` 提供新的命令 shuffle，參考 [Fisher–Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle) 演算法 * 研讀論文〈[Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf)〉 ## 作業背景在開始作業之前，先了解 lab0-c 的組成。 * `console.[ch]` : 實作 `./qtest` 的命令界面 * `list.h` : 實做 Linux 雙向鍊結串列的實作 * `qtest.c` : 測試程式 * `queue.[ch]` : 佇列實作接著來看要實作的佇列的架構，在 `list.h` 中的 `struct list_head` 結構，可以看到佇列為雙向的鍊結串列。 ```cpp struct list_head { struct list_head *prev; struct list_head *next; }; ``` 佇列節點裡的的內容為定義在 `queue.h` 的 `element_t` ，其內容包含一個字串 `value` ，以及一個鍊結串列 `list` 。 ```cpp /** * element_t - Linked list element * @value: pointer to array holding string * @list: node of a doubly-linked list * * @value needs to be explicitly allocated and freed */ typedef struct { char *value; struct list_head list; } element_t; ``` 實際上的佇列就如下圖所示，是一個雙向環狀的鍊結串列。 ```graphviz digraph ele_list { rankdir=LR node[shape=record] head [label="head|{<left>prev|<right>next}"] node1 [label="node1|{<left>prev|<right>next}"] node2 [label="node2|{<left>prev|<right>next}"] node3 [label="node3|{<left>prev|<right>next}"] // node1 [label="node4|{<left>prev|<right>next}"] head:right:e -> node1:w node1:left:w -> head:e node1:right:e -> node2:w node2:left:w -> node1:e node2:right:e -> node3:w node3:left:w -> node2:e node3:e -> head:n head:left:w -> node3:s } ``` ## 實作佇列操作 ### ```q_new``` 用 malloc 配置記憶體給新建立的 ```list_head``` ，若 malloc 配置記憶體失敗會回傳 ```NULL``` ，因此判斷若是成功配置的話，使用 ``` INIT_LIST_HEAD ``` 這個 API 來初始化佇列，使其變成一個**環狀**的鍊結串列，反之則回傳 ```NULL``` 。 ```cpp /* Create an empty queue */ struct list_head *q_new() { struct list_head *head = malloc(sizeof(*head)); if (head) INIT_LIST_HEAD(head); return head; } ``` --- ### ```q_free``` 要釋放掉整個佇列的記憶體空間，考慮了以下事項： 1. 要走訪整個佇列對每項元素釋放，但必須注意到可能會把指標指到未知的地址上。 * 使用 ```list_for_each_entry_safe``` ，它比 ```list_for_each_entry``` 多提供了一個safe指標，來儲存下一個要走訪的節點，來**安全**的走訪佇列。 2. 先將元素從佇列中 ```remove``` ，接著再釋放掉其元素的資料及指標。 * 使用 ```list_del``` 將走訪到的節點先 ```remove```。 * 使用 ```q_release_element``` 將移除的節點裡的資料及指標釋放。 3. 最後將佇列的 ```head``` 釋放。 ```cpp /* Free all storage used by queue */ void q_free(struct list_head *head) { if (!head) return; element_t *entry, *safe; list_for_each_entry_safe (entry, safe, head, list) { list_del(&entry->list); q_release_element(entry); } free(head); } ``` --- ### ```q_insert_head``` 要插入一個節點到佇列的最前面 * 配置記憶體空間給新的節點 * 配置記憶體空間給節點內的字串，其大小為字串長度加一（'\0'）乘上字元大小 * 將字串**安全**的複製過去 * ```strcpy``` 和 ```strncpy```差異，C99 7.21.2.4 The strncpy function： > The strncpy function copies not more than n characters (characters that follow a null character are not copied) from the array pointed to by s2 to the array pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined. * ```strcpy```會直接將整個字串複製過去，而它判斷字串的大小即為尋訪字串直到找到字元 ```'\0'``` 。但當要被複製過去的目的的記憶體大小不足夠時，就可能發生 **buffer overflow** 的問題，因此使用 ```strncpy``` 讓我們先判斷來源字串的大小用於配置，若配置失敗則表示記憶體大小不足，進而中止插入。 * 使用 ```list_add``` 將 ```new_node``` 的list和head連在一起，下圖顯示了list_add的流程 ![list_add.drawio](https://hackmd.io/_uploads/H1y9wnTAT.png) ```cpp /* Insert an element at head of queue */ bool q_insert_head(struct list_head *head, char *s) { if (!head) return false; element_t *new_node = malloc(sizeof(element_t)); if (!new_node) return false; size_t s_len = strlen(s); new_node->value = malloc((s_len + 1) * sizeof(char)); if (!new_node->value) { q_release_element(new_node); return false; } strncpy(new_node->value, s, s_len); new_node->value[s_len] = '\0'; list_add(&new_node->list, head); return true; } ``` --- ### ```q_insert_tail``` 要插入一個節點到佇列的最後面，操作和 ``` q_insert_head ``` 相似，差別差在: * `list_add` 變成 ```list_add_tail``` ```cpp /* Insert an element at tail of queue */ bool q_insert_tail(struct list_head *head, char *s) { if (!head) return false; element_t *new_node = malloc(sizeof(*new_node)); if (!new_node) return false; size_t s_len = strlen(s); new_node->value = malloc((s_len + 1) * sizeof(char)); if (!new_node->value) { q_release_element(new_node); return false; } strncpy(new_node->value, s, s_len); new_node->value[s_len] = '\0'; list_add_tail(&new_node->list, head); return true; } ``` #### 精簡程式碼我們發現 `q_insert_head` 和 `q_insert_tail` 這兩個函式只差了一行，也就是決定要加入的位置。因此我們可以利用 `q_insert_head` 來實作 `q_insert_tail` 。原本插入最前端的只要改成插入最後一個節點即可，利用 `list_head` 的雙向鍊結，引入最後一個節點，即可插入最後一個節點。且提供了比較簡潔的程式碼。 ```cpp bool q_insert_tail(struct list_head *head, char *s) { return q_insert_head(head->prev, s); } ``` --- ### ```q_remove_head``` & ```q_remove_tail``` 要從佇列中**移除**一個節點，而非刪除一個節點。首先注意到了 ```q_remove_head``` 和 ```q_remove_tail``` 類似，主要差別和上述一樣，我們可以呼叫 ```list.h``` 中方便的 API : ```list_first_entry``` 和 ```list_last_entry``` 即可找出佇列的頭或尾了。 * 檢查head是否存在以及list是否為空 * 取得頭元素或尾元素 * 呼叫```q_remove_element``` * 將其元素使用 ```list_del_init``` 移除，並重新初始化這個被移除的節點 * 若 ```sp``` 和 ```bufsize``` 皆不為0，則使用 ```strncpy``` 安全的將被移除的節點的值複製給 ```sp``` ```cpp element_t *q_remove_element(element_t *element, char *sp, size_t bufsize) { list_del_init(&element->list); if (sp && bufsize) { strncpy(sp, element->value, bufsize); sp[bufsize - 1] = '\0'; } return element; } ``` 這邊列出 ```q_remove_head``` 的函式。 ```cpp /* Remove an element from head of queue */ element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *entry = list_first_entry(head, element_t, list); q_remove_element(entry, sp, bufsize); return entry; } ``` #### 精簡程式碼和插入一樣， `q_remove_head` 和 `q_remove_tail` 只差一行程式碼而已。 `list_first_entry` 巨集的功能是取出引入的 head 的下一個節點，因此只要將輸入改為 `head->prev->prev` ，也就是最後一個節點的前一個，就會取出最後一個節點。 ```cpp element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) { return q_remove_head(head->prev->prev, sp, bufsize); } ``` --- ### ```q_size``` 使用 ```list_for_each``` 尋訪整個 ```list``` 來計算 ```list``` 中有幾個元素。 ```cpp /* Return number of elements in queue */ int q_size(struct list_head *head) { if (!head) return 0; int size = 0; struct list_head *node = NULL; list_for_each (node, head) size++; return size; } ``` --- ### ```q_delete_mid``` 「龜兔賽跑」理論( [Floyd's Cycle detection](https://en.wikipedia.org/wiki/Cycle_detection) )，利用快慢指標來判斷何時到 ```list``` 的中間了。 * 演算步驟： 1. 檢查 head 是否存在以及 list 是否為空 2. 初始化 ```fast``` 以及 ```slow``` 指標為```head``` (開頭） 3. do_while 迴圈內，先讓 ```fast``` 走兩步 4. 判斷 ```fast``` 是否走到最後一個節點，若有則結束迴圈，若無則讓 ```slow``` 走一步。 5. 迴圈的中止條件也是判斷 ```fast``` 是否走到最後一個節點 6. 當 ```fast``` 走到最後一個節點時， ```slow```的下一個節點即為中間值 7. 使用 ```list_entry``` 取得中間節點，再使用 ```list_del```移除此節點，最後使用 ```q_release_element``` 將此節點的完整釋放 * 針對 ```fast``` 以2的幂在移動是否走到最後一個節點，基於我們使用的是環狀鍊結串列，有兩種可能： * ```list``` 長度為**偶數**的: 因為最後一個節點為2的幂，所以 ```fast``` 會走到最後一個節點，也就是 ```head->prev``` * ```list``` 長度為**奇數**的：因為最後一個節點不為2的幂，所以 ```fast``` 會跳過最後一個節點，直接走到第一個節點，也就是 ```head``` ，這也是用 do_while 的理由 ```cpp /* Delete the middle node in queue */ bool q_delete_mid(struct list_head *head) { if (!head || list_empty(head)) { return false; } struct list_head *fast = head, *slow = head; do { fast = fast->next->next; if ((fast == head) || (fast == head->prev)) { break; } slow = slow->next; } while (!(fast == head) && !(fast == head->prev)); element_t *node = list_entry(slow->next, element_t, list); list_del(&node->list); q_release_element(node); return true; // https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/ } ``` --- ### ```q_delete_dup``` 要將佇列裡重複的節點刪除 * 演算步驟： 1. 檢查 head 是否存在 2. 檢查佇列是否為空或是只有一個節點 3. 使用 ```list_for_each_entry_safe``` 尋訪每個節點，定義 ```next_str``` 為下一個節點的字串， 4. 使用 ```stmcmp``` 來判斷目前節點的字串和下一個節點的字串是否相等，並且目前節點不是最後一個節點，若成立，則刪除目前節點 5. 若第4點不成立，則根據 ```dup_flag``` 來判斷是否為重複節點中的最後一個節點 ```cpp /* Delete all nodes that have duplicate string */ bool q_delete_dup(struct list_head *head) { if (!head) return false; if (list_empty(head) || list_is_singular(head)) return true; element_t *node, *safe; bool dup_flag = false; list_for_each_entry_safe (node, safe, head, list) { char *next_str = list_entry(node->list.next, element_t, list)->value; if ((strcmp(next_str, node->value) == 0) && (node->list.next != head)) { list_del(&node->list); q_release_element(node); dup_flag = true; } else if (dup_flag) { list_del(&node->list); q_release_element(node); dup_flag = false; } } return true; // https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/ } ``` 但當我執行 ```./qtest``` 的 ```dedup``` 命令時，得到了以下錯誤： Segmentation fault occurred. You dereferenced a NULL or invalid pointerAborted (core dumped) 錯誤的原因是當 ```node``` 走到最後一個節點時，```list_entry(node->list.next, element_t, list)->value;```，其 ```value``` 是沒有值的，這就導致 ```strcmp``` 函式發出 ```Segmentation fault```。因此我讓判斷條件先檢查是否為最後一個節點，這樣它就可以發現是最後一個節點而不會去執行 ```strcmp``` 。 ```diff bool q_delete_dup(struct list_head *head) bool dup_flag = false; list_for_each_entry_safe (node, safe, head, list) { char *next_str = list_entry(node->list.next, element_t, list)->value; - if ((strcmp(next_str, node->value) == 0) && (node->list.next != head)) { + if ((node->list.next != head) && (strcmp(next_str, node->value) == 0)) { list_del(&node->list); q_release_element(node); dup_flag = true; ``` --- ### ```q_swap``` 先理解 ```list_move``` 的功用，實際上 ```list_move``` 就是先做 ```list_del``` （移除）再做 ```list_add``` （插入），所以我想要做 ```swap``` 節點兩兩交換就可以利用 ```list_move(node, node->next)``` ，原理如下： 1. 使用 ```list_for_each``` 從 ```node0``` 開始尋訪 2. 若不是最後一個節點的話，則使用 ```list_move(node0, node1)``` ，透過這樣的操作後， ```node0``` 就在 ```node1``` 和 ```node2``` 之間了 3. 接著繼續尋訪，因為剛剛已經交換過位置了，所以會從 ```node2``` 繼續，重複剛剛的步驟即可做到交換了 ![list_move.drawio](https://hackmd.io/_uploads/B1I7shTRT.png) ```cpp /* Swap every two adjacent nodes */ void q_swap(struct list_head *head) { if (!head || list_empty(head)) return; struct list_head *node = NULL; list_for_each (node, head) { if (node->next == head) break; list_move(node, node->next); } // https://leetcode.com/problems/swap-nodes-in-pairs/ } ``` #### 精簡程式碼透過上面示意圖思考後，發現 `q_swap` 再做的事其實跟 `q_reverseK` 函式很像。若是將 K 設置為 2，那其實就會兩個節點為一組去做反轉了。 ```cpp void q_swap(struct list_head *head) { if (!head || list_empty(head)) return; q_reverseK(head, 2); } ``` --- ### ```q_reverse``` 基於這邊佇列是一個雙向鍊結串列，所以要把每個節點做反轉，只要將每個節點的 ```next``` 和 ```prev``` 交換即可。 * 使用 ```list_for_each_safe``` 進行尋訪佇列，每個節點做反轉 * ```list_for_each_safe``` 的中止條件為當尋訪的 ```node``` 已經是 ```head```。也就是說最後要將 ```head``` 本身的 ```next``` 和 ```prev``` 對調 ```clike #define list_for_each_safe(node, safe, head) \ for (node = (head)->next, safe = node->next; node != (head); \ node = safe, safe = node->next) ``` ```cpp /* Reverse elements in queue */ void q_reverse(struct list_head *head) { struct list_head *node = NULL, *safe = NULL; list_for_each_safe (node, safe, head) { struct list_head *tmp = node->next; node->next = node->prev; node->prev = tmp; } struct list_head *tmp = head->next; node->next = head->prev; node->prev = tmp; } ``` --- ### ```q_reverseK``` 要將佇列**k**個節點為一組做反轉，若剩餘的節點數目小於**k**則不需在反轉。 * 當尋訪到需要反轉的節點時(cnt == k)，要把從一開始到這個節點切割下來，對這個切割下來的佇列做反轉，再將反轉後的佇列插回到原來的位置 * 變數說明： * ```cnt``` : 計數 * ```node``` : 尋訪的節點 * ```safe``` : 指向 ```node``` 的下一個節點 * ```head_from``` : 紀錄切割範圍的開頭 * ```new_head``` : 暫時存放切割下來的佇列 * 演算步驟： 1. 檢查head是否存在或是佇列是否為空或是只有一個節點 2. 將 ```list_from``` 設成 ```head``` 作為起點，並使用 ```LIST_HEAD``` 宣告一個 ```list_head``` 作為暫時存放切割下來的佇列 3. 使用 ```list_for_each_safe``` 尋訪佇列 4. 當計數到k時，以 ```list_from``` 為起點， ```node``` 為終點，使用 ```list_cut_position``` 將其切割下來放入 ```new_head``` 5. 將 ```new_head``` 反轉 6. 將 ```new_head``` 以 ```head_from``` 為開頭插回原本的佇列 7. 計數器重新計數 8. 將切割起始位置 ```head_from``` 設為 ```safe``` 的前一個，這樣下次才會從 ```node``` 的下一個開始切割 ```cpp /* Reverse the nodes of the list k at a time */ void q_reverseK(struct list_head *head, int k) { if (!head || list_empty(head) || list_is_singular(head)) return; int cnt = 0; struct list_head *node, *safe, *head_from = head; LIST_HEAD(new_head); list_for_each_safe (node, safe, head) { cnt++; if (cnt == k) { list_cut_position(&new_head, head_from, node); q_reverse(&new_head); list_splice_init(&new_head, head_from); cnt = 0; head_from = safe->prev; } } // https://leetcode.com/problems/reverse-nodes-in-k-group/ } ``` --- ### ```q_sort``` 以**小到大**順序來排序鏈結串列的節點，使用 **merge sort** 的**遞迴**方式來實做。 ```graphviz digraph G { compound=true node[shape=box, height=.1];sorted_1 sorted_2 sorted_3 sorted_4 sorted_5 sorted_6 sorted_7 sorted_8; node[shape=record, height=.1];input result divide_41 divide_42 divide_21 divide_22 divide_23 divide_24 merge_21 merge_22 merge_23 merge_24 merge_41 merge_42; input[label="<f0>2|<f1>5|<f2>4|<f3>6|<f4>8|<f5>1|<f6>7|<f7>3"] result[label="<f0>1|<f1>2|<f2>3|<f3>4|<f4>5|<f5>6|<f6>7|<f7>8"] divide_41[label="<f1>2|<f2>5|<f3>4|<f4>6"] divide_42[label="<f1>8|<f2>1|<f3>7|<f4>3"] divide_21[label="<f0>2|<f1>5"] divide_22[label="<f0>4|<f1>6"] divide_23[label="<f0>8|<f1>1"] divide_24[label="<f0>7|<f1>3"] sorted_1[label="1"] sorted_2[label="2"] sorted_3[label="3"] sorted_4[label="4"] sorted_5[label="5"] sorted_6[label="6"] sorted_7[label="7"] sorted_8[label="8"] merge_21[label="<f0>2|<f1>5"] merge_22[label="<f0>4|<f1>6"] merge_23[label="<f0>1|<f1>8"] merge_24[label="<f0>3|<f1>7"] merge_41[label="<f1>2|<f2>4|<f3>5|<f4>6"] merge_42[label="<f1>1|<f2>3|<f3>7|<f4>8"] subgraph cluster_0 { style=filled; color="#ef6548"; label="divide"; divide_pad[label="----------------------", style=invis] divide_pad -> divide_23[style=invis] input -> divide_41 input -> divide_42 divide_41 -> divide_21 divide_41 -> divide_22 divide_42 -> divide_23 divide_42 -> divide_24 } divide_21:f0 -> sorted_2 divide_21:f1 -> sorted_5 divide_22 -> sorted_4 divide_22 -> sorted_6 divide_23:f0 -> sorted_8 divide_23:f1 -> sorted_1 divide_24:f0 -> sorted_7 divide_24:f1 -> sorted_3 subgraph cluster_1 { style=filled; color="#a6cee3"; label="sorted lists"; sorted_1; sorted_2; sorted_3; sorted_4; sorted_5; sorted_6; sorted_7; sorted_8; } sorted_2 -> merge_21:f0 sorted_5 -> merge_21:f1 sorted_4 -> merge_22:f0 sorted_6 -> merge_22:f1 sorted_8 -> merge_23:f1[constraint=false] sorted_1 -> merge_23:f0 sorted_7 -> merge_24:f1 sorted_3 -> merge_24:f0 subgraph cluster_2 { style=filled; color="#b2df8a"; label="merge"; merge_pad[label="--------------------", style=invis] rank=same; merge_41; merge_pad; merge_42 rank=same; merge_41; merge_42; merge_pad; merge_22 -> merge_pad[style=invis] merge_23 -> merge_pad[style=invis] merge_pad -> result[style=invis] merge_21 -> merge_41 merge_22 -> merge_41 merge_23 -> merge_42 merge_24 -> merge_42 merge_41 -> result merge_42 -> result } } ``` #### ```q_sort``` * 演算方法 1. 檢查head是否存在或為空佇列 2. 透過將最後一個節點的 ```next``` 賦值為 ```NULL``` ，來斷開環狀鍊結 3. 使用 ```q_divide``` 將鍊結串列做 **divide & conquer** ，回傳即為排序好的鍊結串列 4. 走訪每個節點將其 ```prev``` 串起，最後再將步驟2斷開的頭尾連接起來 * 實做細節 * 因為考慮到在做分割及合併時需要做成單向無環狀的鍊結串列，所以在第7行將鍊結串列的頭尾切割 * 在做排序時， ```prev``` 會亂掉，因此需要將其串起 ```cpp /* Sort elements of queue in ascending/descending order */ void q_sort(struct list_head *head, bool descend) { if (!head || list_empty(head)) return; head->prev->next = NULL; head->next = q_divide(head->next); struct list_head *iter_node; for (iter_node = head; iter_node->next != NULL; iter_node = iter_node->next) { iter_node->next->prev = iter_node; } iter_node->next = head; head->prev = iter_node; } ``` #### ```q_divide``` * 演算方法 1. 若 ```head``` 或其下個節點以為空，代表切到只剩一個，就可以回傳回去了 2. 使用**快慢指標**，使其更快找到中間節點 3. 當 ```fast``` 走到最後一個節點的下一個（奇）或走到最後一個節點（偶），此刻的 ```slow``` 即為中間節點 4. 分別將左邊和右邊的鍊結串列繼續做分割 5. 使用 ```merge_Two_list``` ，將左鍊結串列和右鍊結串列合併 ```cpp struct list_head *q_divide(struct list_head *head) { if (!head || !head->next) return head; struct list_head *fast = head, *slow = head, *mid; for (; fast && fast->next; fast = fast->next->next, slow = slow->next ; mid = slow; slow->prev->next = NULL; struct list_head *left = q_divide(head); struct list_head *right = q_divide(mid); return merge_Two_list(left, right); } ``` #### ```merge_Two_list``` * 演算方法 1. 宣告 ```new_head``` 為合併 ```left``` 和 ```right``` 的新鍊結串列 2. 宣告 ```indirect``` (指標的指標）指向 ```new_head``` 3. 宣告 ```iter_node``` (指標的指標）用來走訪 ```left``` 和 ```right``` 4. 迴圈內，只要 ```left``` 和 ```right``` 有一方走完了，即跳出迴圈到步驟9，反之則執行步驟5 5. 判斷 ```left``` 和 ```right``` 的值誰比較**小**，就將 ```iter_node``` 指向該指標 6. 將 ```iter_node``` 指派給 ```indirect``` ，也就等同於將 ```left``` 或 ```right``` 指派給 ```new_head``` 7. ```indirect``` 指向它所指向的指標的下一個節點 8. ```iter_node``` 走到下一個節點 9. 有一方走完後，將剩下的一方直接串上 ```indirect``` * 實做細節 * 這邊 ```indirect``` 利用到了 <[你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-prog/%2Fs%2FSkE33UTHf#Merge-Sort-%E7%9A%84%E5%AF%A6%E4%BD%9C)> 說明的**間接指標**的概念，在第4行將它指向 ```new_head``` 指標，而在第11行 ```iter_node``` 指派給 ```indirect``` ，意思即為將 ```left``` 或 ```right``` 的節點賦給 ```new_head``` ，接著 ```indirect``` 再去指向 ```new_head``` 的下一個節點，這一套下來，雖然都是對 ```new_head``` 和 ```left``` 或 ```right``` 操作，但透過操作間接指標，我們就不用直接對這些指標做更動 * 而 ```iter_node``` 也是這樣的操作，透過間接指標對 ```left``` 和 ```right``` 走訪 ![indirect.drawio](https://hackmd.io/_uploads/Sy9BnHBkA.png) ```cpp struct list_head *merge_Two_list(struct list_head *left, struct list_head *right) { struct list_head *new_head = NULL, **indirect = &new_head, **iter_node = NULL; for (; left && right; *iter_node = (*iter_node)->next) { iter_node = strcmp(list_entry(left, element_t, list)->value, list_entry(right, element_t, list)->value) >= 0 ? &right : &left; *indirect = *iter_node; indirect = &(*indirect)->next; } *indirect = (struct list_head *) ((uintptr_t) left | (uintptr_t) right); return new_head; } ``` --- ### ```q_ascend``` & ```q_descend``` * ```q_ascend``` : 刪除**左邊**有比較大的節點，也就是說處理後的鍊結串列是**升序**的 * 例如： 8, 3, 9, 2, 5 => 8, 9 * ```q_descend``` : 刪除**右邊**有比較大的節點，也就是說處理後的鍊結串列是**降序**的 * 例如： 5, 2, 9, 3, 8 => 9, 8 ```q_ascend``` 和 ```q_descend``` 兩者是非常相似，可以走訪佇列，比較前後兩個節點的大小來去判斷是否要刪除。差別就差在 ```q_ascend``` 是透過**從頭到尾**，而 ```q_descend``` 則是反過來的**從尾到頭**。 #### ```q_ascend``` * 演算步驟 1. 檢查head是否存在或為空佇列 2. 宣告兩個指標 ```ele``` 和 ```ele_next``` 分別代表前一個和下一個節點的元素 3. 走訪佇列，當走回到 ```head``` 時，代表結束，跳出迴圈 4. 比較前一個和下一個節點的值 * 若前一個節點的值比下一個節點的值**小**，則繼續走訪 * 若前一個節點的值比下一個節點的值**大**，則刪除下一個節點 * 實做細節 * 需回傳刪除後最終的節點數量， ```node_num``` 即為用於計數，當不需刪除節點時，自加一，而最後 ```++node_num``` 則是要加上 ```head``` * 本來想要利用一個變數 ```max``` 來紀錄目前最大值，但發現在使用 ```list_del``` 移除節點時會把原本的佇列串起來，這也就可以讓我繼續走訪節點去比較前後兩個節點了 ```cpp /* Remove every node which has a node with a strictly less value anywhere to * the right side of it */ int q_ascend(struct list_head *head) { if (!head || list_empty(head)) return 0; int node_num = 0; element_t *ele = list_entry(head->next, element_t, list); element_t *ele_next = list_entry(head->next->next, element_t, list); while (&ele_next->list != head) { if (strcmp(ele->value, ele_next->value) < 0) { ele_next = list_entry(ele_next->list.next, element_t, list); ele = list_entry(ele->list.next, element_t, list); node_num++; } else { list_del(&ele_next->list); q_release_element(ele_next); ele_next = list_entry(ele->list.next, element_t, list); } } return ++node_num; // https://leetcode.com/problems/remove-nodes-from-linked-list/ } ``` #### ```q_descend``` 其想法和演算步驟都和 ```q_ascend``` 一樣，差別僅在它是從尾開始走到頭的，因此需要修改一開始的 ```ele``` 和 ```ele_prev``` 為最後一個以及倒數第二個，並把 ```next``` 都改為 ```prev``` 。 ```cpp /* Remove every node which has a node with a strictly greater value anywhere to * the right side of it */ int q_descend(struct list_head *head) { if (!head || list_empty(head)) return 0; int node_num = 0; element_t *ele = list_entry(head->prev, element_t, list); element_t *ele_prev = list_entry(head->prev->prev, element_t, list); while (&ele_prev->list != head) { if (strcmp(ele->value, ele_prev->value) < 0) { ele_prev = list_entry(ele_prev->list.prev, element_t, list); ele = list_entry(ele->list.prev, element_t, list); node_num++; } else { list_del(&ele_prev->list); q_release_element(ele_prev); ele_prev = list_entry(ele->list.prev, element_t, list); } } return ++node_num; // https://leetcode.com/problems/remove-nodes-from-linked-list/ } ``` #### 小發現當我使用 ```./qtest``` 測試時，帶入<[2487. Remove Nodes From Linked List](https://leetcode.com/problems/remove-nodes-from-linked-list/description/)> 給的範例[5, 2, 13, 3, 8]時，使用 ```descend``` 會發現 **13** 這個 **字串** 被刪除了，透過實驗發現只有雙位數以上的值才有可能會發生這種情況，原來是因為 ```strcmp``` 函式的實作是從兩個字串的第一個字元開始比較，所以當字串 "3" 和 "13" 做比較時會先拿 "3" 和 "1" 做比較，因為 "3" 比 "1" 的 ASCII code 還大，所以會直接回傳正數，而導致第13行會判斷 "3" 比 "13" 還大，進而把 "13" 給刪除了。 --- ### ```q_merge``` 要合併數個長度不同且已排序的佇列，並回傳總節點數量，先理解 ```q_merge``` 的參數是佇列，卻說是要合併多個佇列： * ```queue_contex_t```: 管理一連串佇列的結構 ```cpp /** * queue_contex_t - The context managing a chain of queues * @q: pointer to the head of the queue * @chain: used by chaining the heads of queues * @size: the length of this queue * @id: the unique identification number */ typedef struct { struct list_head *q; struct list_head chain; int size; int id; } queue_contex_t; ``` 也就是說， ```q_merge``` 傳入的 ```head``` 即為 ```queue_contex_t``` 的 ```chain``` 的 ```head```，以下圖解說明： ![queue_contex_t.drawio](https://hackmd.io/_uploads/HJMGUsSyA.png) 我把它簡單理解為**一個鍊結串列每個節點都是一個獨立的鍊結串列**，而我們要做的就是把每個節點的鏈結串列合併。 * 演算步驟 1. 檢查head是否存在或為空佇列 2. 宣告佇列 ```first``` 為第一個子佇列 3. 累加 ```first``` 節點數量並斷開頭尾連結 4. 走訪之後的每個節點，取出其佇列 ```sub``` 、累加佇列數量、斷開頭尾連結，使用 ```merge_Two_list``` 將 ```first->next``` 和 ```sub->next``` 合併回 ```first->next``` ，接著初始化 ```sub``` 佇列，繼續走訪直到走完佇列 5. 走訪每個節點將其 ```prev``` 串起，最後再將步驟2斷開的頭尾連接起來 * 實做細節 * 取出 ```first``` 佇列時，要抓的是 ```head->next``` ，而非 ```head``` * 在使用 ```merge_Two_list``` 來合併兩個佇列時，也都是從它們的 ```next``` 開始合併，因為佇列的 ```head``` 是沒有東西的 * 當初在想 ```first``` 和 ```sub``` 名稱時蠻猶豫的，有考慮過使用 ```major``` 和 ```minor``` ，但考慮到它們並無主從關係，因此還是使用上述名稱 * 和 ```q_sort``` 一樣，再做合併時會導致 ```prev``` 順序亂掉以及需要先斷開頭尾，因此最後需要補上 ```cpp /* Merge all the queues into one sorted queue, which is in ascending/descending * order */ int q_merge(struct list_head *head, bool descend) { if (!head || list_empty(head)) return 0; int node_num = 0; struct list_head *first = list_entry(head->next, queue_contex_t, chain)->q; node_num += q_size(first); first->prev->next = NULL; struct list_head *iter_node = head->next->next; while (iter_node != head) { struct list_head *sub = list_entry(iter_node, queue_contex_t, chain)->q; node_num += q_size(sub); sub->prev->next = NULL; first->next = merge_Two_list(first->next, sub->next); INIT_LIST_HEAD(sub); iter_node = iter_node->next; } struct list_head *node; for (node = first; node->next != NULL; node = node->next) { node->next->prev = node; } node->next = first; first->prev = node; return node_num; // https://leetcode.com/problems/merge-k-sorted-lists/ } ``` ## 研讀 Linux 核心原始程式碼的 [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 我的筆記：[研讀 Linux 核心原始程式碼的 lib/list_sort.c 筆記](https://hackmd.io/A6U5qYEXSXKi4Uvsrb96_Q) ### 將 list_sort.c 加入專案研讀 `list_sort.c` 裡的 merge sort 之後，要將其引入自己的專案裡。以下是流程。 1. 將 `lib/list_sort.c` 加進 lab0-c 目錄下 * 修改標頭檔 ```cpp #include <linux/kernel.h> // #include <linux/bug.h> // #include <linux/compiler.h> // #include <linux/export.h> // #include <linux/string.h> #include <list_sort.h> #include <list.h> ``` * 刪除 `EXPORT_SYMBOL(list_sort);` 2. 創建 `list_sort.h` ，將使用到的巨集加入 ```cpp #ifndef _LIST_SORT_H #define _LIST_SORT_H #include <stdint.h> #include "list.h" typedef uint8_t u8; #define likely(x) __builtin_expect(!!(x), 1) #define unlikely(x) __builtin_expect(!!(x), 0) typedef int __attribute__((nonnull(2,3))) (*list_cmp_func_t)(void *, const struct list_head *, const struct list_head *); void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp); #endif /* End of _LIST_SORT_H_ */ ``` 3. 修改 `qtest.c` * 新增 `list_sort.h` 標頭檔 * 新增 `cmp` 函式，用於 `list_sort` ```cpp static int cmp(void *priv, const struct list_head *list1, const struct list_head *list2) { return strcmp(list_entry(list1, element_t, list)->value, list_entry(list2, element_t, list)->value); } ``` * 定義變數 `use_list_sort` ，為 1 時，用 `list_sort` ，反之則用 `q_sort` ```cpp static int use_list_sort = 0; ``` * 在 `do_sort` 函式，根據 `use_list_sort` 決定排序函式 ```cpp if (current && exception_setup(true)) { if (use_list_sort == 1) list_sort(NULL, current->q, cmp); else q_sort(current->q, descend); } ``` * 在 `console_init` 函式內新增 ```cpp add_param("listsort", &use_list_sort, "Use the list sort in Linux", NULL); ``` --- ### 安裝 perf 因為安裝 perf 會有版本的問題，因此先看自己的 Linux 版本。 ``` $ uname -r ``` 根據自己的版本修改以下安裝命令。 ``` $ sudo apt-get install linux-tools-6.5.0-35-generic linux-cloud-tools-6.5.0-35-generic ``` 接著，要得到 perf 的權限，可以先透過以下指令來設置 perf 的最高權限。 ``` $ sudo sh -c "echo -1 > /proc/sys/kernel/perf_event_paranoid" ``` --- ### 新增測試程式在 `traces` 目錄下創建 `trace-sort.cmd` ，來用 `qtest` 提供的 time 來計算排序所花的時間。 ``` # Test performance of sort betweem q_sort() and list_sort() # listsort: 0: q_sort(), 1: list_sort() # it RAND: randon list number option listsort 1 new it RAND 500000 time sort time ``` 回到 lab0-c 目錄下執行 `./qtest -f traces/trace-sort.cmd` ， ``` cmd> # Test performance of sort betweem q_sort() and list_sort() cmd> # listsort: 0: q_sort(), 1: list_sort() cmd> # it RAND: randon list number cmd> cmd> option listsort 0 cmd> new l = [] cmd> it RAND 100000 l = [mlslrwdt mmropwi oaomp ucqlsbdx xudyqljd aqdhrif ufgvmrutp vanrctr eiyzkbcpz kzacvqmkx bhmwki e bueobevm shrmfpp hmedmi xjuvefu zqyotxa bgpmyuqs ywqmc phorbcp xqnoibvq fzvoe ntfjvpxm hfeuuzb d wnfzubqr cypmd htewl kywwsd ffrmamabr wiqidgsm qyiezj ... ] cmd> time Elapsed time = 0.063, Delta time = 0.063 cmd> sort l = [aaagvhdn aaahrtt aaajl aaamc aaaoezxb aaavpqwc aaaxs aabas aabcwgn aabnbj aabsnn aabxjc aacee q aachnqrvs aaczo aadatpi aadco aadkafe aadllqz aadmvdf aadzrqwp aaeajz aaecf aaehil aaeiwg aaejpz aaeniz aaennb aaevgzvd aaexnmdx ... ] cmd> time Elapsed time = 0.157, Delta time = 0.094 Freeing queue ``` 根據命令可以看到，先創建了一個佇列，在其佇列尾端隨機插入 100,000 筆資料，並獲取現在執行的時間，接著就執行排序，排序結束後再次獲取執行時間，因此倒數第二行的 **Delta time** 就是執行排序的時間。 --- ### 比較 `q_sort` 和 `list_sort` 之間效能落差以下分別統計`q_sort` 和 `list_sort` 之間效能落差，資料數從 10 萬筆到 80 萬筆，共 8 個，每次都做 3 次來平均。 * **q_sort** | 資料數 | q_sort_1st | q_sort_2nd | q_sort_3rd | Average | | -------- | -------- | -------- | -------- | --------| | 100,000 | 0.094 | 0.092 | 0.102 | 0.096 | | 200,000 | 0.239 | 0.259 | 0.257 | 0.251 | | 300,000 | 0.391 | 0.482 | 0.459 | 0.444 | | 400,000 | 0.572 | 0.594 | 0.579 | 0.581 | | 500,000 | 0.738 | 0.793 | 0.750 | 0.760 | | 600,000 | 0.942 | 0.987 | 0.978 | 0.969 | | 700,000 | 1.098 | 1.165 | 1.169 | 1.144 | | 800,000 | 1.344 | 1.321 | 1.322 | 1.329 | * **list_sort** | 資料數 | list_sort_1st | list_sort_2nd | list_sort_3rd |Average | | -------- | -------- | -------- | -------- | -------- | | 100,000 | 0.070 | 0.056 | 0.064 | 0.063 | | 200,000 | 0.151 | 0.147 | 0.169 | 0.156 | | 300,000 | 0.253 | 0.258 | 0.252 | 0.254 | | 400,000 | 0.349 | 0.361 | 0.360 | 0.356 | | 500,000 | 0.462 | 0.463 | 0.451 | 0.459 | | 600,000 | 0.570 | 0.582 | 0.564 | 0.572 | | 700,000 | 0.692 | 0.699 | 0.676 | 0.689 | | 800,000 | 0.789 | 0.813 | 0.806 | 0.803 | * **排序效能比較圖** ![Sorting performance comparsion](https://hackmd.io/_uploads/H1PFVafxR.png) 可以看到當資料量越大時，效能差的越多。 ### 實際在機器運行的數據 perf 提供了很多事件的觀測，現在要來看兩個函式分別在機器上跑的 cache-misses 、 instruction 、 cycle 。 * **q_sort** ``` Performance counter stats for './qtest -f traces/trace-sort.cmd' (5 runs): 33,195,862 cache-misses # 67.38% of all cache refs ( +- 0.11% ) 49,269,944 cache-references ( +- 0.31% ) 3,308,935,634 instructions # 0.94 insn per cycle ( +- 0.04% ) 3,504,986,169 cycles ( +- 0.22% ) 1.04023 +- 0.00337 seconds time elapsed ( +- 0.32% ) ``` * **list_sort** ``` Performance counter stats for './qtest -f traces/trace-sort.cmd' (5 runs): 28,224,650 cache-misses # 69.40% of all cache refs ( +- 0.31% ) 40,668,846 cache-references ( +- 0.24% ) 3,291,921,745 instructions # 1.21 insn per cycle ( +- 0.03% ) 2,727,824,372 cycles ( +- 0.47% ) 0.79568 +- 0.00507 seconds time elapsed ( +- 0.64% ) ``` * 將上面結果整理成表格 | | q_sort | list_sort | | ---------------- | ------------- | ------------- | | cache-references | 49,269,944 | 40,668,846 | | cache-misses | 33,195,862 | 28,224,650 | | instructions | 3,308,935,634 | 3,291,921,745 | | cycles | 3,504,986,169 | 2,727,824,372 | | insn per cycle | 0.32 | 0.64 | 從上面表格可以很明顯看出來， Linux 的 list_sort 因為其設計對 cache 友善的演算法，因此比 q_sort 的 cache-misses 少了 **17%** ，也因為這樣，使其 cycle 的數量少非常多， inst per cycle 也比 q_sort 好了 2 倍。 ### 在 lab0-c 實作 Timsort 根據[第二周作業](https://hackmd.io/DMjzkMZrSjWBSMznxlSQtw#quiz1-%E6%B8%AC%E9%A9%972)，實作了 Timsort，現在需要將其整合到 qtest 命令中。具體步驟如下： * 將 `timsort.c` 和 `timsort.h` (`sort_impl.h`) 添加到 lab0-c 目錄下。 * 在 `qtest.c` 中新增 do_timsort 函式，該函式基本上就是將 `do_sort` 中的排序演算法改為 `timsort` 函式。 * 在 `console_init` 函式中新增 `timsort` 的指令。另外，在上面提到的 `list_sort` 部分有個疏忽，直接編譯會產生以下錯誤訊息。需要先在 Makefile 裡新增輸出檔，才能解決這個問題。 ```diff Error message: /usr/bin/ld: qtest.o: in function `do_timsort': /home/appmedia/linux2024/hw/lab0-c/qtest.c:671: undefined reference to `timsort' collect2: error: ld returned 1 exit status make: *** [Makefile:49: qtest] Error 1 OBJS := qtest.o report.o console.o harness.o queue.o \ random.o dudect/constant.o dudect/fixture.o dudect/ttest.o \ shannon_entropy.o \ linenoise.o web.o \ + list_sort.o timsort.o ``` 在提交 commit 時， Cppcheck 發出了錯誤訊息: ```command timsort.c:27:23: style: Variable 'head' is not assigned a value. [unassignedVariable] struct list_head *head; ``` 原來是需要將指標做初始的賦值。 ```diff -struct list_head *head; +struct list_head *head = NULL; ``` > 詳細修改: [commit](https://github.com/Appmedia06/lab0-c/commit/f95f48f4aaaca6966d932968a7beedfbdd207437) ## Valgrind ＆自動測試 ### ```make test``` 實作完佇列後便可以測試，輸入 `make test` ，會去用 `qtest` 執行 scripts/driver.py ，得到結果為 100 分，代表通過自動評分系統的所有項目。 ``` $ make test ... ... --- TOTAL 100/100 ``` --- ### Valgrind 參考 [以 Valgrind 分析記憶體問題](https://hackmd.io/@sysprog/linux2023-lab0/%2F%40sysprog%2Flinux2023-lab0-b) ，得知 Valgrind 的用途簡單來說就是讓使用者可以觀察程式在執行時的一些狀況，例如使用未初始化的記憶體、不當的記憶體配置、或取消配置記憶體等等。現在要開始測試，先輸入 ``` $ make valgrind ``` 就會得到動態分析後的結果 :::spoiler 完整結果 ``` # Explicitly disable sanitizer(s) make clean SANITIZER=0 qtest make[1]: Entering directory '/home/appmedia/linux2024/hw/lab0-c' rm -f qtest.o report.o console.o harness.o queue.o random.o dudect/constant.o dudect/fixture.o dudect/ttest.o shannon_entropy.o linenoise.o web.o list_sort.o .qtest.o.d .report.o.d .console.o.d .harness.o.d .queue.o.d .random.o.d .dudect/constant.o.d .dudect/fixture.o.d .dudect/ttest.o.d .shannon_entropy.o.d .linenoise.o.d .web.o.d .list_sort.o.d *~ qtest /tmp/qtest.* rm -rf .dudect rm -rf *.dSYM (cd traces; rm -f *~) CC qtest.o CC report.o CC console.o CC harness.o CC queue.o CC random.o CC dudect/constant.o CC dudect/fixture.o CC dudect/ttest.o CC shannon_entropy.o CC linenoise.o CC web.o CC list_sort.o LD qtest make[1]: Leaving directory '/home/appmedia/linux2024/hw/lab0-c' cp qtest /tmp/qtest.3oObvB chmod u+x /tmp/qtest.3oObvB sed -i "s/alarm/isnan/g" /tmp/qtest.3oObvB scripts/driver.py -p /tmp/qtest.3oObvB --valgrind --- Trace Points +++ TESTING trace trace-01-ops: # Test of insert_head and remove_head --- trace-01-ops 5/5 +++ TESTING trace trace-02-ops: # Test of insert_head, insert_tail, remove_head, remove_tail, and delete_mid --- trace-02-ops 6/6 +++ TESTING trace trace-03-ops: # Test of insert_head, insert_tail, remove_head, reverse and merge --- trace-03-ops 6/6 +++ TESTING trace trace-04-ops: # Test of insert_head, insert_tail, size, swap, and sort --- trace-04-ops 6/6 +++ TESTING trace trace-05-ops: # Test of insert_head, insert_tail, remove_head, reverse, size, swap, and sort --- trace-05-ops 6/6 +++ TESTING trace trace-06-ops: # Test of insert_head, insert_tail, delete duplicate, sort, descend and reverseK --- trace-06-ops 6/6 +++ TESTING trace trace-07-string: # Test of truncated strings --- trace-07-string 6/6 +++ TESTING trace trace-08-robust: # Test operations on empty queue --- trace-08-robust 6/6 +++ TESTING trace trace-09-robust: # Test remove_head with NULL argument --- trace-09-robust 6/6 +++ TESTING trace trace-10-robust: # Test operations on NULL queue --- trace-10-robust 6/6 +++ TESTING trace trace-11-malloc: # Test of malloc failure on insert_head --- trace-11-malloc 6/6 +++ TESTING trace trace-12-malloc: # Test of malloc failure on insert_tail --- trace-12-malloc 6/6 +++ TESTING trace trace-13-malloc: # Test of malloc failure on new --- trace-13-malloc 6/6 +++ TESTING trace trace-14-perf: # Test performance of insert_tail, reverse, and sort --- trace-14-perf 6/6 +++ TESTING trace trace-15-perf: # Test performance of sort with random and descending orders # 10000: all correct sorting algorithms are expected pass # 50000: sorting algorithms with O(n^2) time complexity are expected failed # 100000: sorting algorithms with O(nlogn) time complexity are expected pass --- trace-15-perf 6/6 +++ TESTING trace trace-16-perf: # Test performance of insert_tail --- trace-16-perf 6/6 +++ TESTING trace trace-17-complexity: # Test if time complexity of q_insert_tail, q_insert_head, q_remove_tail, and q_remove_head is constant Probably constant time Probably constant time Probably constant time Probably constant time --- trace-17-complexity 5/5 --- TOTAL 100/100 Test with specific case by running command: scripts/driver.py -p /tmp/qtest.3oObvB --valgrind -t <tid> ``` ::: 結果得到了 100 分，代表並無出現有記憶體的問題，這邊我順便列出 Valgrind 可以偵測出的記憶體相關的問題。 * **Memory Leak** 當程式配置了記憶體，但在釋放前就已經沒有指標指向此塊記憶體，依據情況又可以分成以下四種。 * *definitely lost*: 真的發生 memory leak * *indirectly lost*: 間接的 memory leak，structure 本身發生 memory leak，而內部的 member 如果是 allocate 的出來的，一樣會 memory leak，但是只要修好前面的問題，後面的問題也會跟著修復。 * *possibly lost*: allocate 一塊記憶體，並且放到指標 ptr，但事後又改變 ptr 指到這塊記憶體的中間 * *still reachable*: 程式結束時有未釋放的記憶體，不過卻還有指標指著，通常會發生在 global 變數即便是在所用的函式庫裡頭配置的記憶體，也可偵測到，這也是動態分析手法的優勢。 * **Invalid Memory Access** 有些對記憶體有危險的操作並不會馬上發生 [Segmentation fault](https://en.wikipedia.org/wiki/Segmentation_fault) ，但是在之後執行的程式很有可能就出現問題了。可能會有以下幾種狀況。 * 使用到了不合法的記憶體區段，也就是試圖讀寫一個非法的區域 * 讀取的區域已被釋放 * 釋放一個不存在的區域 :::info Segmentation Fault: 它會出現在當程式企圖存取CPU無法定址的記憶體區段時。當錯誤發生時，硬體會通知作業系統產生了記憶體存取權限衝突的狀況。作業系統通常會產生核心轉儲（core dump）以方便開發者進行除錯。下列幾種情況可能會導致 Segmentation Fault : * 錯誤的訪問、寫入至唯讀記憶體 * 訪問了不屬於程式地址空間的內存 * 寫入空指標 * 越界存取 ::: * 其他案例 * 用到沒有結束字元 (null-terminated string) 的字串 * 區域變數的存取若超過範圍，可能造成 stack corrupt --- ### qtest 記憶體的消耗量使用工具 massif 來視覺化 qtest 使用時的記憶體消耗量。可以先將 `harness.c` 裡面的 `time_limit` 調整大一點，因為 Valgrind 會使程式的速度大幅下降，就很可能發生 timeout ，所以暫時調高 `time_limit` 可以讓我們先得到其數據。 ```cpp static int time_limit = 10; ``` 輸入命令。 ``` $ valgrind --tool=massif ./qtest -f traces/trace-17-complexity.cmd ``` 就可以得到 `massif.out.xxxx` 的檔案，接著到 [Massif-Visualizer](https://apps.kde.org/zh-tw/massif-visualizer/) 用 Ubuntu Software 安裝 Massif 。安裝完成後，輸入命令，其中 xxxx 為生成的數字。 ``` $ massif-visualizer massif.out.xxxx ``` 就可以得到記憶體消耗量的視覺圖了。 ![Screenshot from 2024-04-13 15-41-58](https://hackmd.io/_uploads/B13wP3vlC.png) ## 在 qtest 實做 `shuffle` 命令 ### Fisher–Yates shuffle 演算法 * 演算方法 1. 取得佇列的長度來走訪整個佇列 2. 設置最後一個節點為 `tail` 反向的走訪佇列，每次都和前面**隨機**一個節點 `rand_node` 交換 3. 被走訪後的節點都是被隨機交換過的且不會再被交換將實作佇列洗牌的函式 `q_shuffle` 寫在 `qtest.c` 中。 ```cpp void swap_node(element_t *a, element_t *b) { char *tmp = a->value; a->value = b->value; b->value = tmp; } void q_shuffle(struct list_head *head) { if (!head || list_empty(head)) return; int len = q_size(head); srand(time(NULL)); struct list_head *tail = head->prev, *rand_node; for (int i = len; i > 1; i--) { int rand_num = rand() % i; rand_node = tail; while (rand_num-- >= 0) { rand_node = rand_node->prev; } swap_node(list_entry(rand_node, element_t, list), list_entry(tail, element_t, list)); tail = tail->prev; } } ``` 接著要把 shuffle 加入 `./qtest` 的命令中，根據前面 `do_what` 函式，寫 `do_shuffle` 函式，最後顯示佇列前 3 個元素。 ```cpp static bool do_shuffle(int argc, char *argv[]) { if (argc != 1) { report(1, "%s takes no arguments", argv[0]); return false; } set_noallocate_mode(true); if (current && exception_setup(true)) q_shuffle(current->q); exception_cancel(); set_noallocate_mode(false); return q_show(3); } ``` 繼續在 `qtest.c` 中的 `console_init` 函式中新增 `ADD_COMMAND` 。 ```cpp ADD_COMMAND(shuffle, "Shuffle the queue using Fisher-Yates shuffle algorithm", ""); ``` 執行 `./qtest` ，使用命令 `it RAND 10` 先插入隨機 10 個字串，接著執行 `shuffle` 命令後發現出現以下錯誤。 ``` Segmentation fault occurred. You dereferenced a NULL or invalid pointerAborted (core dumped) ``` 檢查後發現是在找隨機節點時可能會存取到無效的指標，因此修改以下程式。 ```diff -while (rand_num-- >= 0) { +while (rand_num-- > 0) { ``` ### 亂度測試程式根據老師給的 [測試程式](https://github.com/Appmedia06/lab0-c/blob/master/test_shuffle.py) 來檢驗 `shuffle` 的亂度，創建一個測試用的 python 檔，安裝必要的套件包，執行命令 `python3 test_shuffle.py` ，便可以得到洗牌亂度的直方圖了。 ![shuffle1](https://hackmd.io/_uploads/Sy7wkTReR.png) 但這裡發現亂度非常不均勻，看完老師的 [M06: integration](https://hackmd.io/@sysprog/linux2024-integration-b) 後發現拿掉時間種子就可以解決了。 ```diff - srand(time(NULL)); + // srand(time(NULL)); ``` 會導致這樣的原因，要先從 `rand()` 函數說起，根據 C99 (§ 7.20.2.2) 說明了 `rand()` 函式的作法，它會將 next 進行一次 [Linear congruential generator](https://en.wikipedia.org/wiki/Linear_congruential_generator) ，接著再將其限縮在 $0\sim 32767$ ，也就是 $0\sim 2^{15}-1$ 之間。 ```cpp int rand(void) // RAND_MAX assumed to be 32767 { next = next * 1103515245 + 12345; return (unsigned int)(next/65536) % 32768; } ``` 而 `srand(time(NULL))` 會初始化當下時間一次，但初始化 `next` 變數後，它便會去做 Linear congruential generator，就和時間無關了，而移除掉它的功能是每次 shuffle 都會將 Linear congruential generator 的 seed 初始化為當下的時間，這樣就符合 Uniform Distribution 。但實際上這樣並沒有達成真正的洗牌，每次shuffle 都是一樣的操作，根本沒有 shuffle 到，因為每次 seed 都被 `static unsigned long int next = 1;` 這樣初始化。為其實 shuffle 是有規律的，只是不知道它從週期的哪個起點開始。那有規律不就不是獨立事件。 ![shuffle2](https://hackmd.io/_uploads/r19scp0eA.png) ## 研讀論文〈Dude, is my code constant time?〉 ### 論文大綱這篇論文介紹了 [dudect](https://github.com/oreparaz/dudect) 工具，用來檢測特定平台上的程式碼是否以 **constant time** 運行。作者們的方法基於 TIMING LEAKAGE DETECTION ，提供了一個簡潔且易於使用的方案。該工具不依賴於靜態分析，而是基於對使用平台上執行時間測量的統計分析。這樣一來，就避免了對底層硬體進行建模的問題，也就是說可以不用在特定的硬體環境下測試。該工具的設計使得在黑盒測試情況下使用時具有極大的優勢，並且可以通過實施細節來提高檢測定時攻擊的效果。 ### 實作方法根據論文中的 *II. OUR APPROACH : TIMING LEAKAGE DETECTION* 說明實作方法。 * *Step 1: Measure execution time* 1. **Classes definition** * 對所研究的加密功能的執行時間進行重複測量，並使用屬於兩個不同輸入數據類別的輸入進行測試。 * leakage detection 技術通常會對兩個不同的輸入數據類別進行測試，以獲取潛在的定時洩漏。 * 一種常見的洩漏檢測測試是固定 VS. 隨機測試，其中第一類輸入數據固定為常數值，而第二類輸入數據則每次測量時隨機選擇 2. **Cycle counters** * 現代 CPU（例如 x86 系列中的 TSC 寄存器或 ARM 中的 systick 外設）提供了週期計數器，可用於準確獲取執行時間測量。 * 週期計數器可以記錄 CPU 執行指令的時脈週期數，從而提供了準確的執行時間訊息。 3. **Environmental conditions** * 為了確保測量結果的穩定性和可靠性，需要最小化環境條件的影響。 * 每次測量應對應一個隨機選擇的類別，以減少環境變化對測量結果的影響。 * *Step 2: Apply post-processing* 描述了在執行時間測量後，實踐者可以對每個單獨的測量進行後處理，以進行統計分析。 1. **Cropping** * 典型的定時分佈通常會被較大的執行時間影響。 * 通過裁剪測量，可以排除一些極端值，從而使統計分析更加穩定和準確。 2. **Higher-order preprocessing** * 根據應用的統計測試，可以用一些有利於的高階預處理，例如中心化乘積，模擬高階DPA攻擊。 * *Step 3: Apply statistical test* 第三步是應用統計的測試，以試圖推翻"兩個定時分佈相等"的虛無假設。可以選擇不同的統計測試方法，以確定兩個定時分佈是否存在差異。 1. **t-test** * 一種簡單的統計測試方法是 [Welch's t-test](https://en.wikipedia.org/wiki/Welch%27s_t-test) 。這種統計測試用於測試平均值的等價性，因此如果該測試失敗，則意味著存在一個一階定時信息洩漏。 * 當 t-test 與裁剪預處理結合時，不僅僅是測試平均值的相等性，還涉及更高階的統計矩，因為裁剪是一種非線性轉換。 2. **Non-parametric tests** * 除了 t-test 外，還可以使用更一般的非參數測試。這些測試通常對於底層分佈的假設較少，但可能收斂較慢並且需要更多的樣本。 ### 修改 lab0-c 的 dudect 相關程式在 lab0-c 中有實作論文中的 [dudect](https://github.com/oreparaz/dudect) 程式，只是當我們將 lab0-c push 到 github 上時會發現在 `trace-17-complexity 0/5` 是沒有分數的。因為實作程式碼和論文上說明的實作方法有出入，因此現在要來修改它。看到 `fixture.c` ，透過註解可以理解這個程式是多次測量給定函數的執行時間，使用 **Welch's t-test 檢測函式是否為常數時間** 。而 `doit` 就實作的函式。 * `doit` 函式的實作流程 1. `prepare_inputs` 來準備輸入資料。 2. 在 `constant.c` 中有 `measure` 函式，會檢測的函式執行前後的 `cpucycles` ，而 `mode` 則是選擇要檢測哪個函式（有 `ih` , `it` , `rh` , `rt`)。 3. `differentiate` 將前後時間相減獲得執行時間。 4. `update_statistics` 去除掉 CPU cycle counter 中非正常的值。 5. `report` 計算 Welford method 的 t 值， t 值用於統計數據確定數據之間的平均值是否存在顯著差異。如果 t 值大於 threshold 及判定不是恆定時間。回傳布林代數。而和論文的實作 [dudect](https://github.com/oreparaz/dudect) 比較後發現缺少了 **Cropping** 的動作。因此需要在 `fixture.c` 裡增加可以將測量結果進行預處理，丟掉超過特定百分比的測量結果。 * 在 `update_statistics` 中新增判斷是否超過特定百分比 ```cpp for (size_t cropping_idx = 0; cropping_idx < DUDECT_NUMBER_PERCENTILES; cropping_idx++) { if (difference < percentiles[cropping_idx]) { t_push(t, difference, classes[i]); } } ``` * 配置一個 `percentiles` 陣列，當第一次執行的時候，先呼叫 `prepare_percentiles` 用來初始化 `percentiles` 的值。接著就和原本程式一樣呼叫 `update_statistics` 和 `report` 。只是 `update_statistics` 多了上述檢測功能。 ```cpp if (first_time) { prepare_percentiles(exec_times, percentiles); } else { update_statistics(exec_times, classes, percentiles); ret &= report(); } ``` 詳細差異： [commit](https://github.com/sysprog21/lab0-c/commit/aa0f989f7e3053a25aa505e25f3022f4411f5142) 經過修改後，最終 push 到 github 上，成功得到 [100 分](https://github.com/Appmedia06/lab0-c/actions/runs/8772947389/job/24072453556)。 ### 〈Teach Yourself Programming in Ten Years〉心得這篇[文章](https://norvig.com/21-days.html)首先提到了人們在程式設計領域往往過於急躁，只求結果而不願深入學習。這種表面涉獵卻缺乏深度的方式，往往造就了技能不足的程式設計師。這讓我回想起大一大二時期，當我學會一項技能後，便急於開始專案開發。然而，現在回顧當初的作品，發現其中許多不成熟之處。若當初能夠充分補充知識，或許專案會更加完善。目前我正在學習 Linux 核心設計課程，這門課程要求的知識量相當龐大。因此，我決定調整自己的節奏，深入吸收各個知識點，包括你所未接觸過的 C 語言系列以及與 Linux 系統相關的知識，並且努力完成所有作業。雖然我尚未修習過這門課程，但我期望通過自己的努力學習，成為浩瀚的程式碼中一個微小的齒輪。 ## 參考資料 * [M01: lab0](https://hackmd.io/@sysprog/linux2024-lab0/%2F%40sysprog%2Flinux2024-lab0-a) * [你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list) * [The Linux Kernel API](https://www.kernel.org/doc/html/latest/core-api/kernel-api.html)