2024q1 Homework1 (lab0)

# 2024q1 Homework1 (lab0) contributed by < [Ackerman666](https://github.com/Ackerman666) > ### Reviewed by `allenliao666` - 建議使用 git diff 方便閱讀者了解內容，例如 `q_free` 中的「原先有在函式中將 head 設成 NULL 的動作 (在 commit 時跳以下錯誤)」。 - 內文中同時使用元素與節點表示 element，應統一翻譯所使用的詞語。 - 應把 assign 、 reverse 等翻譯成中文，避免中英交雜。 - `q_swap` 可以使用 list_move 簡化程式碼。 - ## 開發環境 $ gcc --version gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 16 On-line CPU(s) list: 0-15 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz CPU family: 6 Model: 165 Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 1 Stepping: 5 CPU max MHz: 4800.0000 CPU min MHz: 800.0000 BogoMIPS: 5799.77 <s> ## 修改 `queue.c` </s> ## 指定佇列的操作 ### `q_new` :::danger allocate 翻譯為「配置」，以區格 dispatch (分發/分配) 的翻譯 > 已更正 ::: * 用`malloc` <s>分配</s>**配置**空間給予 `list_head` * `list_head` 物件透過 `INIT_LIST_HEAD` 初始化值得注意的是`malloc`可能會回傳==空指標==，因此進行初始化前先判斷其是否為空指標 ```c /* Create an empty queue */ struct list_head *q_new() { struct list_head *q_head = (struct list_head *) malloc(sizeof(struct list_head)); if(q_head) INIT_LIST_HEAD(q_head); return q_head; } ``` ### `q_free` :::danger HackMD 不是讓你張貼完整程式碼的地方，GitHub 才是！因此你在開發紀錄只該列出關鍵程式碼 (善用 diff 標示)，可附上對應 GitHub commit 的超連結，列出程式碼是為了「檢討」和「便於他人參與討論」，不是用來「假裝自己有付出」 ::: ```c /* Free all storage used by queue */ void q_free(struct list_head *head) { if (!head) return; element_t *element, *safe; list_for_each_entry_safe (element, safe, head, list) { q_release_element(element); } free(head); } ``` 原先有在函式中將 `head` 設成 `NULL` 的動作 (在 `commit` 時跳以下錯誤) Following files need to be cleaned up: queue.c queue.c:39:5: warning: Assignment of function parameter has no effect outside the function. Did you forget dereferencing it? [uselessAssignmentPtrArg] head = NULL; ^ Fail to pass static analysis. 後來意識到 `head` 這個變數本身就只是函式內的區域變數因此就算將其設定成 `NULL`，本質上不會影響到當初傳遞進去的指標變數，因此要避免 ### `q_insert_head` :::danger 避免非必要的項目符號 (即 `* `)，以清晰、明確，且流暢的漢語書寫。 ::: 建立一個新的 `element` 物件檢查佇列和 `element` 是否存在如通過上述檢查，`strdup` 會分配空間，並將 `ｓ` 指向字串複製進去，再將空間地址 `assign` 到 `element->value` 最後判斷 `element->value` 是否為NULL，是則本次 insert 失敗，並透過 `free` 釋放出空間 insert 成功則透過 `list_add` 將 `element` 加進佇列開頭 ```c /* Insert an element at head of queue */ bool q_insert_head(struct list_head *head, char *s) { element_t *element = (element_t *) malloc(sizeof(element_t)); if (!element || !head) return false; element->value = strdup(s); if (!element->value) { free(element); return false; } list_add(&element->list, head); return true; } ``` ### `q_insert_tail` 程式邏輯和 `q_insert_head` 本質上相同 ```c /* Insert an element at tail of queue */ bool q_insert_tail(struct list_head *head, char *s) { element_t *element = (element_t *) malloc(sizeof(element_t)); if (!element || !head) return false; element->value = strdup(s); if (!element->value) { free(element); return false; } list_add_tail(&element->list, head); return true; } ``` ### `q_remove_head` * 佇列存在且非空就透過 `list_first_entry` 找到第一個 `element` * 執行 `list_del` 將 `element` 移除 * 如果 `sp` 非 `NULL`，則透過 `strncpy` 將 `element->value` 複製到 `sp` ```c /* Remove an element from head of queue */ element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *element = list_first_entry(head, element_t, list); list_del(&element->list); if (sp && bufsize > 0) { strncpy(sp, element->value, bufsize - 1); sp[bufsize - 1] = '\0'; } return element; } ``` ### `q_remove_tail` 程式邏輯和 `q_remove_head` 本質上相同 ```c /* Remove an element from tail of queue */ element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *element = list_last_entry(head, element_t, list); list_del(&element->list); if (sp && bufsize > 0) { strncpy(sp, element->value, bufsize - 1); sp[bufsize - 1] = '\0'; } return element; } ``` ### `q_size` * 設定一個計數器為 `size` ，透過 `list_for_each` 走訪每個節點，每走訪一個節點 `size` 就加一 ```c /* Return number of elements in queue */ int q_size(struct list_head *head) { if (!head || list_empty(head)) return 0; int size = 0; struct list_head *iterator; list_for_each (iterator, head) { ++size; } return size; } ``` ### `find_mid_node` **該函式用於找尋佇列裡的 middle point** 原先想透過一次走訪得到佇列長度，再進行一次走訪刪除中間節點，複雜度為 $\mathcal{O}(n)$ 但看了[分析「快慢指標」](https://hackmd.io/@sysprog/ry8NwAMvT)後，得知上述方法因時間局部性，在 cache miss 上會比快慢指標來得多 :::danger 提供量化分析來確認這說法。 ::: 因此採取快慢指標的方式實作該函式 ```c struct list_head *find_mid_node(struct list_head *head) { struct list_head *fast, *slow, *tail; fast = slow = head->next; tail = head->prev; // queue size > 1 if (slow != tail) { while (fast != head && fast != tail) { fast = fast->next->next; if (fast != head) slow = slow->next; } } return slow; } ``` ### `q_delete_mid` 透過`find_mid_node`找到中間節點，再予以刪除 ```c /* Delete the middle node in queue */ bool q_delete_mid(struct list_head *head) { // https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/ if (!head || list_empty(head)) return false; struct list_head *mid; mid = find_mid_node(head); list_del(mid); element_t *mid_element = list_entry(mid, element_t, list); q_release_element(mid_element); return true; } ``` ### `q_delete_dup` 透過 `list_for_each_safe` 可以得到當前元素 `cur` 和下個元素 `next` 迴圈中會執行 `strcmp` 比較兩元素的 value ，相等則將 `cur` 進行刪除，並設置 `delete` 變數為 true 而 `delete` 變數的目為確保每個 value 重複出現的節點都能確實的被刪掉 ```c /* Delete all nodes that have duplicate string */ bool q_delete_dup(struct list_head *head) { // https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/ if (!head) return false; struct list_head *node, *safe; bool delete = false; list_for_each_safe (node, safe, head) { element_t *cur = list_entry(node, element_t, list); element_t *next = list_entry(safe, element_t, list); if (safe != head && strcmp(cur->value, next->value) == 0) { list_del(node); q_release_element(cur); delete = true; } else if (delete) { list_del(node); q_release_element(cur); delete = false; } } return true; } ``` ### `q_swap` 主要透過 `node` (當前節點)， `safe` (下個節點)， `pre` (當前節點的前一個節點)來操作而 [list.h](https://github.com/Ackerman666/lab0-c/blob/644c4d3ef473b6a365bfa401d5684d9b87bd3464/list.h#L409-L411) 定義的巨集 `list_for_each_safe` 中每進行一次迭代，會將 `safe` <s>assign</s> 給 `node` 完成交換後，為了使 `node` 指向位置是下一個要交換的節點，必須將 `node->next` assign給 `safe` :::danger 改進你的漢語表達。 ::: ```c /* Swap every two adjacent nodes */ void q_swap(struct list_head *head) { // https://leetcode.com/problems/swap-nodes-in-pairs/ struct list_head *node, *safe, *pre = head; list_for_each_safe (node, safe, head) { if (node->next != head) { pre->next = safe; node->next = safe->next; node->prev = safe; safe->next = node; safe->prev = pre; pre = node; safe = node->next; } if (safe == head) head->prev = node; } } ``` ### `q_reverse` 主要透過 `node` (當前節點)， `safe` (下個節點)， `tmp` (當前節點的前一個節點) 來操作在 `list_for_each_safe` 執行完畢後，還需要將 `head` 指向 reverse 過後的首尾節點 ```c /* Reverse elements in queue */ void q_reverse(struct list_head *head) { if (!head || list_empty(head)) return; struct list_head *node, *safe, *tmp = head; list_for_each_safe (node, safe, head) { node->next = tmp; node->prev = safe; tmp = node; } head->prev = head->next; head->next = tmp; } ``` ### `q_reverseK` :::danger access 的翻譯是「存取」，而非「訪問」(visit) > 已更正 ::: 用 `times` 紀錄<s>拜訪</s> **存取**的節點數量，當等於 k 時，用 `list_cut_position` 將節點切開，並移至 `tmp` 再對 `tmp` 執行 `q_reverse` ，並用 `list_splice_tail_init` 將 `tmp` 指向的 list 移動至 `rk_head` 上述執行完畢後，會反轉 k 個節點，為了要繼續反轉下組節點，必須在反轉完畢後將 `times` 設成 0 最後當迴圈執行完畢時，將 `rk_head` 指向的 list 再移回 `head` :::danger HackMD 不是讓你張貼完整程式碼的地方，GitHub 才是！因此你在開發紀錄只該列出關鍵程式碼 (善用 diff 標示)，可附上對應 GitHub commit 的超連結，列出程式碼是為了「檢討」和「便於他人參與討論」，不是用來「假裝自己有付出」 ::: ```c /* Reverse the nodes of the list k at a time */ void q_reverseK(struct list_head *head, int k) { // https://leetcode.com/problems/reverse-nodes-in-k-group/ if (!head || list_empty(head) || k == 1) return; int times = 1; struct list_head *node, *safe; struct list_head tmp, rk_head; INIT_LIST_HEAD(&tmp); INIT_LIST_HEAD(&rk_head); list_for_each_safe (node, safe, head) { if (times == k) { list_cut_position(&tmp, head, node); q_reverse(&tmp); list_splice_tail_init(&tmp, &rk_head); times = 0; } ++times; } list_splice(&rk_head, head); } ``` ### `merge_sorted_queues` 利用 merge sort 在 merge 階段的方式，將兩個已排序的佇列合併 **思路 :** 過程會兩兩比較節點，並依 `descend` 與否來選出排序較前面的節點 `selected` 並透過 `list_move_tail` 移到 `head` 上最後再將 `head` 透過 `list_splice_tail`將結點移至 `l` 為了讓程式碼更精簡，我在判斷 `descend` 與否時使用了 <s>[三元表達式](https://shengyu7697.github.io/cpp-ternary-operator/)</s> **條件運算子 (conditional operator )** :::danger 參照 C 語言規格書！ > 已更正 ::: 後來想到`l` 與 `r` 的升降序方向要相同，否則結果會錯，這部分我再思考如何更 general :::danger 改進你的漢語表達。 ::: ```c /* Merge tow sorted queues and set l to the queue head*/ void merge_sorted_queues(struct list_head *l, struct list_head *r, bool descend) { struct list_head head; INIT_LIST_HEAD(&head); while (!list_empty(l) && !list_empty(r)) { struct list_head *selected, *l_first = l->next, *r_first = r->next; element_t *left_e = list_first_entry(l, element_t, list); element_t *right_e = list_first_entry(r, element_t, list); int cmp_result = strcmp(left_e->value, right_e->value); selected = descend ? ((cmp_result < 0) ? r_first : l_first) : ((cmp_result > 0) ? r_first : l_first); list_move_tail(selected, &head); } list_splice_tail_init(list_empty(l) ? r : l, &head); list_splice_tail(&head, l); } ``` ### `q_sort` 透過 merge sort 來對佇列排序 **思路** 先用 `find_mid_node` 找到中間節點 `mid` 再透過 `list_cut_position` 將原本的佇列切分成兩份 `l` 指向以 `mid` 為結尾的左半部， `head` 則為剩餘的右半部 `l` 與 `head` 再個別進行 `q_sort` 來進行排序將排序好的`l` 和 `head` 利用 `merge_sorted_queues` 來 merge **執行過程中，資料結構皆保持環狀鏈結串列，不用特地拆開變成單向鏈結串列** ```c /* Sort elements of queue in ascending/descending order */ void q_sort(struct list_head *head, bool descend) { if (!head || list_empty(head) || list_is_singular(head)) return; /* left and right sub queue*/ struct list_head l, r; struct list_head *mid; INIT_LIST_HEAD(&l); INIT_LIST_HEAD(&r); mid = find_mid_node(head); /* After cut * l will point to the left part of the queue relative to the mid position. * otherwise head will point to the right part (excluding mid) */ list_cut_position(&l, head, mid); q_sort(&l, descend); q_sort(head, descend); merge_sorted_queues(&l, head, descend); list_splice_tail(&l, head); } ``` 精簡程式碼： ```diff /* left and right sub queue*/ - struct list_head l, r; + struct list_head l; struct list_head *mid; INIT_LIST_HEAD(&l); - INIT_LIST_HEAD(&r); ``` :::danger 你如何確保目前的測試已涵蓋排序演算法的最差狀況？ ::: ### `q_descend` 函式功能註解講得不夠明確，`Remove` 一詞根據 [queue.h](https://github.com/Ackerman666/lab0-c/blob/b324db1ac5ed804a4f61ee2e3e3a22a48f9f69d0/queue.h#L90C1-L92C52) 不該將 space freed 掉但實際上要執行 `q_release_element` 否則無法通過<s>測資</s>**測試項目** :::danger 此處為「測試項目」，著重在 item 一詞，不是「資料」。避免歧義，「測試資料」不要簡寫。 > 已更正 ::: **錯誤訊息如下** +++ TESTING trace trace-06-ops: # Test of insert_head, insert_tail, delete duplicate, sort, descend and reverseK ERROR: There is no queue, but 4 blocks are still allocated ERROR: Freed queue, but 4 blocks are still allocated :::danger 程式碼註解不該出現中文，總是以美式英語書寫。 > 已更正 ::: **主要思路** : 反向走訪佇列，並透過 `max` 紀錄當前 value 值最大之節點當有節點的 value 小於所紀錄 `max` 的 value時，就將該點 remove 反之代表該節點的 value 比當前記錄的大，因此將 `max` 進行更新 ```c /* Remove every node which has a node with a strictly greater value anywhere to * the right side of it */ int q_descend(struct list_head *head) { // https://leetcode.com/problems/remove-nodes-from-linked-list/ if (!head || list_empty(head)) return 0; int amount = 1; struct list_head *max = head->prev, *node = head->prev->prev, *next; for (next = node->prev; node != (head); node = next, next = node->prev) { element_t *element = list_entry(node, element_t, list); element_t *cur_max_element = list_entry(max, element_t, list); if (strcmp(cur_max_element->value, element->value) > 0) { list_del(&element->list); /* should "delete" the element, otherwise cant pass trace-06-ops */ q_release_element(element); } else { max = node; ++amount; } } return amount; } ``` :::danger HackMD 不是讓你張貼完整程式碼的地方，GitHub 才是！因此你在開發紀錄只該列出關鍵程式碼 (善用 diff 標示)，可附上對應 GitHub commit 的超連結，列出程式碼是為了「檢討」和「便於他人參與討論」，不是用來「假裝自己有付出」 ::: ### `q_ascend` 程式邏輯和 `q_descend` 本質上相同 ```c /* Remove every node which has a node with a strictly less value anywhere to * the right side of it */ int q_ascend(struct list_head *head) { // https://leetcode.com/problems/remove-nodes-from-linked-list/ if (!head || list_empty(head)) return 0; int amount = 1; struct list_head *min = head->prev, *node = head->prev->prev, *next; for (next = node->prev; node != (head); node = next, next = node->prev) { element_t *element = list_entry(node, element_t, list); element_t *cur_min_element = list_entry(min, element_t, list); if (strcmp(cur_min_element->value, element->value) < 0) { list_del(&element->list); q_release_element(element); } else { min = node; ++amount; } } return amount; } ``` ### `q_merge` 因為給定的所有 `queues` 皆是排序好的狀態，因此直接用利用 `list_for_each_entry` 遞迴並以 `merge_sorted_queues` 兩兩合併至 `target` 中原先我想使用的做法是將整個佇列合併成大佇列再執行一次 `q_sort` 但是 `q_sort` 在過程中會反覆的進行遞迴，並多次的 `merge_sorted_queues` 以記憶體的觀點來看，這樣會在 Stack memory 有多次的allocation and deallocation 操作所以最後我改用逐一合併的方式來避免過多的函式呼叫 :::danger 改進你的漢語表達。 ::: ```c /* Merge all the queues into one sorted queue, which is in ascending/descending * order */ int q_merge(struct list_head *head, bool descend) { // https://leetcode.com/problems/merge-k-sorted-lists/ if (!head || list_empty(head)) return 0; queue_contex_t *target = list_entry(head->next, queue_contex_t, chain); queue_contex_t *other = NULL; list_for_each_entry (other, head, chain) { if (target == other) continue; merge_sorted_queues(target->q, other->q, descend); other->size = 0; } return q_size(target->q); } ``` ## 實作 shuffle 命令 > 參考 [Fisher–Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle) 原理是每次隨機挑選一個陣列元素並將該元素與陣列最尾端**未置換**過的元素做交換因過程不需額外記憶體空間，因此除線性時間外也達到 [in place](https://en.wikipedia.org/wiki/In-place_algorithm)的效果 :::danger 優先參照權威材料，例如你的演算法教科書和指定教材，退而求其次，才是英文 Wikipedia 條目。 > 了解，已將連結替換 ! ::: 參考之虛擬碼 <s>pseudocode</s> -- To shuffle an array a of n elements (indices 0..n-1): for i from n−1 down to 1 do j ← random integer such that 0 ≤ j ≤ i exchange a[j] and a[i] ### 統計結果腳本程式碼來自[作業說明](https://hackmd.io/@sysprog/linux2024-lab0/%2F%40sysprog%2Flinux2024-lab0-d) :::danger 這個次數怎麼決定呢？能否用你的統計知識來解釋？要有多少次測試才能夠總結呢？ > 假說檢定有兩個重要的錯誤指標 >* 型一錯誤 ( alpha（α）) : A/B 兩組本質上沒差異，卻被統計成有差異的機率 >* 型二錯誤 ( beta （β）) : A/B 兩組本質上有差異，卻無法偵測到顯著差異的機率 > >通常會希望這兩種錯誤越低越好，最直觀的方式就是去增加樣本數，來貼近實際情況 >譬如有相同數量的黑球白球放在一個箱子中，如只取出10顆，結果為7顆黑球與3顆白球 >那麼就會造成型一錯誤，而如果將抽取次數增多，則可避免因型一錯誤造成的誤判 > >目前已知樣本數和型一錯誤、型二錯誤、效力量(Effect Size)有關 >但相關推算公式，我目前還沒找到資料 > > 你的統計學教科書呢？去讀書！ :notes: jserv ::: 在執行該腳本時要注意 `qtest ` 執行檔的位置，原先我是複製一份到腳本資料夾底下但跳出以下錯誤 stderr: FATAL: You should run qtest in the directory containing valid git workspace 最後我是透過 `cwd` 更改 `subprocess.run()` 執行的工作目錄位置，切換至根目錄來執行 ```diff +cwd = '../' -completedProcess = subprocess.run(clist, capture_output=True, text=True, input=input) +completedProcess = subprocess.run(clist, capture_output=True, text=True, input=input, cwd = cwd) ``` 執行 scripts/shuffle.py 進行實驗總共執行1000000 shuffle Expectation: 41666 Observation: {'1234': 41503, '1243': 41338, '1324': 41708, '1342': 41853, '1423': 42100, '1432': 41624, '2134': 41583, '2143': 41588, '2314': 41966, '2341': 41518, '2413': 41603, '2431': 41547, '3124': 41929, '3142': 41890, '3214': 41663, '3241': 41660, '3412': 41767, '3421': 41783, '4123': 41462, '4132': 41573, '4213': 41858, '4231': 41376, '4312': 41432, '4321': 41676} chi square sum: 20.96140738251812 * Expectation : 預期每種組合出現的次數 * Observation : 每種組合在該次測試中實際出現次數 * chi square sum : 20.96 下圖為該次測試的統計圖可以看出每種組合出現次數都相當均勻 ![image](https://hackmd.io/_uploads/rk-TXIQ0T.png) ## 研讀 Linux [`list_sort.c`](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 並引入專案將 [`list_sort.c`](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 和 [`list_sort.h`](https://github.com/torvalds/linux/blob/master/include/linux/list_sort.h) 加入專案 `list_sort.h` 中會用到 `likely` 和 `unlikely` 兩個巨集而該巨集是被定義在 [`linux/compiler.h`](https://github.com/torvalds/linux/blob/master/include/linux/compiler.h) 底下但本次專案並不包含 `compiler.h`，因此我將該巨集定義在 `list_sort.h` 中 ```diff +#define likely(x) __builtin_expect(!!(x), 1) +#define unlikely(x) __builtin_expect(!!(x), 0) ``` **likely 和 unlikely 巨集用意** 兩個巨集由 `__builtin_expect()` 而來，該函式由 gcc 內建該函式會用到兩個 not operator，目的就是確保 `!!(x)` 的結果一定會是 0 或者是 1 巨集目的是要讓 branch 期望的比較結果讓編譯器知道，並做進一步優化下述程式中，透過 `unlikely(x)` 可以表示 `x` 這個敘述為否的機會較大因此編譯器在編譯程式後，會使 `bar()` 的 assembly code 執行順序較 `foo()` 來的前面以此來減少 branch hazarad 的發生 if (unlikely(x)) { // foo() } else { // bar() } `list_sort.h` 中有定義如下的函式指標 __attribute__((nonnull(2, 3))) (*list_cmp_func_t)(void *, const struct list_head *, const struct list_head *); 因此我在 [`custom_cmp.c`](https://github.com/Ackerman666/lab0-c/blob/master/custom_cmp.c) 實作了對應到 queue element 的升降序比較函式 **發生的錯誤** 在 commit `custom_cmp.c` 程式碼時，在 Cppcheck 靜態分析中會發生 nullPointer error 原因和 [jasperlin1996](https://hackmd.io/@jasperlin1996/linux2022-lab0#Cppcheck-Null-pointer-dereference:~:text=Cppcheck%3A%20Null%20pointer%20dereference) 指出的[問題](https://hackmd.io/@jasperlin1996/linux2022-lab0#Cppcheck-Null-pointer-dereference:~:text=Cppcheck%3A%20Null%20pointer%20dereference)一樣 :::danger 「呼叫」(call) 和「使用」(use) 不同，請查閱辭典以理解二者落差。巨集不能「呼叫」，因為前置處理器會在真正編譯 C 程式碼之前就展開巨集內容。 $\to$ [你所不知道的 C 語言：前置處理器應用篇](https://hackmd.io/@sysprog/c-preprocessor) > 已修正 ::: 以下會<s>呼叫</s> **使用** `list_entry` 的巨集 ```c int q_asc_cmp(void *, const struct list_head *l1, const struct list_head *l2) { if (l1 == NULL || l2 == NULL) return 0; return strcmp(list_entry(l1, element_t, list)->value, list_entry(l2, element_t, list)->value); } ``` `list_entry` 由 `container_of` 而來問題關鍵在於 `container_of` 中有以下程式碼 `(type *) 0` 為 null pointer，因而導致在靜態檢查中，偵測到 **Null pointer dereference** 的錯誤 const __typeof__(((type *) 0)->member).... **解法** 在 [`lab0-c/scripts/pre-commit.hook`](https://github.com/Ackerman666/lab0-c/blob/master/scripts/pre-commit.hook) 中已經有對特定的檔案做 nullPointer 的壓制因此仿照同樣做法新增以下程式即可解決 ```diff CPPCHECK_suppresses="--inline-suppr harness.c \ --suppress=missingIncludeSystem \ --suppress=noValidConfiguration \ --suppress=unusedFunction \ --suppress=identicalInnerCondition:log2_lshift16.h \ --suppress=nullPointerRedundantCheck:report.c \ --suppress=nullPointerRedundantCheck:harness.c \ --suppress=nullPointer:queue.c \ --suppress=nullPointer:qtest.c \ +--suppress=nullPointer:custom_cmp.c \ --suppress=returnDanglingLifetime:report.c \ --suppress=constParameterCallback:console.c \ --suppress=constParameterPointer:console.c \ --suppress=checkLevelNormal:log2_lshift16.h" CPPCHECK_OPTS="-I. --enable=all --error-exitcode=1 --force $CPPCHECK_suppresses $CPPCHECK_unmatched ." ``` ### 增加 `list_sort` 指令對 `sort` 指令進行修改，加入第二個參數來指定排序模式改動細節 : [commit 8b5af29](https://github.com/Ackerman666/lab0-c/commit/8b5af29e9b92259ba547c79138fff270771e5684) ```diff -ADD_COMMAND(sort, "Sort queue in ascending/descening order", ""); +ADD_COMMAND(sort,"Sort queue in ascending/descening order (sort type : [0] ""merge_sort, [1] list_sort)","[t]"); ``` ### 比較 `list_sort` 與 `q_sort` 先參考 [Linux 效能分析工具: Perf](https://wiki.csie.ncku.edu.tw/embedded/perf-tutorial) 進行 Perf 工具安裝新增 `trace-sort.cmd` 隨機新增1000000筆資料到佇列，執行排序，並記錄所耗費時間 option fail 0 option malloc 0 new ih RAND 1000000 time sort 1 time free 執行下列指令來重複執行5次 `qtest` 在這邊實際上會用到該指令兩次，第一次 `qtest` 執行 `q_sort` ，第二次為 `list_sort` sudo perf stat --repeat 5 ./qtest -f traces/trace-sort.cmd **執行時間比較** 由下表可看出每次執行 `list_sort` 耗時皆比 `q_sort` 少 | Exp No | q_sort | list_sort| | -------- | -------- | -------- | | 1 | 1.079 | 0.749 | | 2 | 1.099 | 0.755 | | 3 | 1.073 | 0.746 | | 4 | 1.065 | 0.751 | | 5 | 1.106 | 0.743 | **perf stat 分析情況** my_sort Performance counter stats for './qtest -f traces/trace-sort.cmd' (5 runs): 1640.86 msec task-clock # 1.000 CPUs utilized ( +- 0.39% ) 3 context-switches # 1.828 /sec ( +- 24.94% ) 0 cpu-migrations # 0.000 /sec 3,5234 page-faults # 21.473 K/sec ( +- 0.00% ) 75,6362,5118 cycles # 4.610 GHz ( +- 0.37% ) 49,5211,3455 instructions # 0.65 insn per cycle ( +- 0.00% ) 6,9795,4553 branches # 425.358 M/sec ( +- 0.00% ) 1220,5845 branch-misses # 1.75% of all branches ( +- 0.33% ) list_sort Performance counter stats for './qtest -f traces/trace-sort.cmd' (5 runs): 1307.84 msec task-clock # 1.000 CPUs utilized ( +- 0.61% ) 2 context-switches # 1.529 /sec ( +- 29.15% ) 0 cpu-migrations # 0.000 /sec 3,5233 page-faults # 26.940 K/sec ( +- 0.00% ) 60,0273,3421 cycles # 4.590 GHz ( +- 0.16% ) 48,3465,9132 instructions # 0.81 insn per cycle ( +- 0.00% ) 7,2748,9750 branches # 556.254 M/sec ( +- 0.00% ) 2101,7155 branch-misses # 2.89% of all branches ( +- 0.13% ) 1.30832 +- 0.00820 seconds time elapsed ( +- 0.63% ) 將差距較明顯的4個數據挑出 | | q_sort | list_sort | | -------- | -------- | -------- | | task-clock | 1640.86 msec | 1307.84 msec | | instructions | 49,5211,3455 | 48,3465,9132 | | cycles | 75,6362,5118 | 60,0273,3421 | | branch-misses | 1.75% of all branches | 2.89% of all branches| 可以看到雖 `q_sort` 在 `branch-misses` 的比例上少於 `list_sort` 但主要 `list_sort` 在執行的 `cycles` 數量少 `q_sort` 滿多的因此整體 `list_sort` 表現優秀於 `q_sort` ## 論文閱讀 :〈[Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf)〉 ### Abstract 該論文透過 leakage detection test 的方式，結合統計上的 [Welch’s t-test](https://en.wikipedia.org/wiki/Welch's_t-test) 去判斷程式執行是否為常數時間，因是透過統計手法來測量，所以使用上不侷限於特定的硬體平台 ### Step 1: Measure execution time 進行 leakage detection test 時會準備兩種 class 測試資料，這邊是以 fix-vs-random 的方式 * 第一個 class : 全部的輸入資料會被設定成固定值 * 第一個 class : 全部的輸入資料會以隨機的方式來產生最後將兩種 class 的資料輸入進程式，並記錄執行時間 ### Step 2: Apply post-processing 對記錄到的時間進行後處理 * Cropping : 在程式執行時，有機會受到外部的影響 (系統中斷等)，造成記錄的時間涵蓋了一些與實驗無關的部分，因此透過 Cropping 將異常值去除 * Higher-order preprocessing : 根據應用的統計測試不同，可以採取一些其他處理機制 ### Step 3: Apply statistical test 建立虛無假說 : 兩個 class 執行時間分布一致 (因要判斷是否為常數時間，因此在實驗設計上，固定值的 class 要能真的是以常數時間執行) 透過 Welch’s t-test 進行假說檢定，當檢定結果拒絕虛無假說，則可判定該執行時間不為常數 ### 修正 `fixture.c` [commit 855889d](https://github.com/Ackerman666/lab0-c/commit/855889d5ef4ec1872ee83c8b3a2aad2aa4620275) 與原版 `dudect` 最大差別是我發現 `update_statistics()` 沒有做 Cropping 的動作因此參考 [`dudect/src/dudect.h`](https://github.com/oreparaz/dudect/blob/dc269651fb2567e46755cfb2a13d3875592968b5/src/dudect.h#L302) 的實作新增 `cmp`, `percentile`, `prepare_percentiles` 來對 `update_statistics` 做 Cropping 處理最後通過 trace-17-complexity ![image](https://hackmd.io/_uploads/H1kZX-yyR.png)