2020q3 Homework1 (lab0)

# 2020q3 Homework1 (lab0) contributed by < `unknowntpo` > ###### tags: `進階電腦系統理論與實作` `linux2020` 續: [2020q1 Homework1 (lab0)](/@unknowntpo/lab0-c) ## 實驗環境 ## 作業要求 ## 開發過程 ### NLL_PRT_GUARD 利用 macro 來簡化每次檢查傳入的 pointer 是否為 `NULL` 如果 `q` 是 `NULL` 的話就回傳 `0` ```cpp /* GUARD of NULL pointer */ #define NULL_PTR_GUARD(q) \ do { \ if (!q) \ return 0; \ } while (0) ``` ### q_new ```cpp /* * Create empty queue. * Return NULL if could not allocate space. */ queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); NULL_PTR_GUARD(q); q->head = NULL; q->tail = NULL; q->size = 0; return q; } ``` ### q_free ```cpp /* Free all storage used by queue */ void q_free(queue_t *q) { if (!q) return; list_ele_t *tmp; for (tmp = q->head; tmp;) { /* Store the next element */ q->head = tmp->next; free(tmp->value); free(tmp); tmp = q->head; } free(q); } ``` ### q_insert_head 解釋: * Step1: * 一開始先檢查是否傳遞的 `q` 為 NULL queue * 並使用 `strlen()` 求得字串長度，儲存在型態為 `size_t` 的變數 `len` 內 * Step2: * 為新的 node `newh` 配置記憶體空間， * 使用 `NULL_PTR_GUARD` 做 null pointer 檢查， * 如果配置失敗，就 return false * 為 `newh->value` 配置記憶體空間，以便之後把輸入的字串 `s` 複製到新的 node 上 * 一樣進行 null pointer 的檢查，不過如果這裡配置失敗的話，必須先釋放整個 `newh` 結構體的記憶體空間才能 return false, 所以沒辦法用 `NULL_PTR_GUARD`, 只能手動寫一個。 * Step3: * 進行從 `s` 到 `newh->value` 的字串複製，並在結尾手動加入 `\0` * 對於 `strncpy()` 是否會自動補 0 的討論 * [strncpy 是否會自動補上 \0 ?](/F4me2PsnQ7ex99VMblZolQ?view&fbclid=IwAR2_cblvSpm1iDsBOk449exJpuU4btfEy_Hde8Fzo8cqfpx4VN3orIAKIv8) :::info TODO: 參考 [你所不知道的C語言：技巧篇](/@sysprog/c-trick?type=view) 中的文章 [停止使用 strncpy 函數](http://blog.haipo.me/?p=1065) ，使用 `strdup()` 來改寫 code ::: * Step4: * 更新 `q->head` 與 `q->tail` 所指向的位置 * `q->tail` 要分成兩種情況討論 * 如果 `q_insert_head()`執行前節點數量為 0， * 這時就必須讓 `q->tail` 指向新的 node `newh`, * 如果 `q_insert_head()`執行前節點數量為 1， * 這時就不能更動 `q->tail` 的位置 * 更新 q->size 的數值 * Step5: * return true ```cpp /* * Attempt to insert element at head of queue. * Return true if successful. * Return false if q is NULL or could not allocate space. * Argument s points to the string to be stored. * The function must explicitly allocate space and copy the string into it. */ bool q_insert_head(queue_t *q, char *s) { /* Guard of NULL pointer */ size_t len = strlen(s); NULL_PTR_GUARD(q); NULL_PTR_GUARD(len); list_ele_t *newh; newh = malloc(sizeof(list_ele_t)); NULL_PTR_GUARD(newh); newh->value = malloc(len + 1); if (!newh->value) { free(newh); return false; } strncpy(newh->value, s, len); newh->value[len] = '\0'; newh->next = q->head; q->head = newh; if (!q->tail) q->tail = newh; q->size++; return true; } ``` ### q_insert_tail * 使用 `newt` 代表新插入的 node * 大致上與 `q_insert_head()` 相同，不同的地方是，必須考慮到兩種情況 * Empty queue * `q->size >= 1` * Empty queue * 此時 `q->head` 與 `q->tail` 都指向 `NULL` * 在更新 `q->head` 與 `q->tail` 時必須讓他們同時指向`newt` * `q->size >= 1` * 此時 `q->head` 與 `q->tail` 都不為 `NULL`, 分別指向 queue 的開頭與結尾 * 我們只需要更新 `q->tail` 的位置 * 首先讓 `q->tail->next = newt` 來把新的 node 接在尾端 * 再來更新 `q->tail` 的位置讓他指向新的尾端，也就是 `newt` ```cpp /* * Attempt to insert element at tail of queue. * Return true if successful. * Return false if q is NULL or could not allocate space. * Argument s points to the string to be stored. * The function must explicitly allocate space and copy the string into it. */ bool q_insert_tail(queue_t *q, char *s) { NULL_PTR_GUARD(q); size_t len = strlen(s); list_ele_t *newt = malloc(sizeof(list_ele_t)); NULL_PTR_GUARD(newt); newt->next = NULL; newt->value = malloc(len + 1); if (!newt->value) { free(newt->value); free(newt); return false; } strncpy(newt->value, s, len); newt->value[len] = '\0'; if (!q->tail) { // empty queue q->tail = newt; q->head = newt; } else { q->tail->next = newt; q->tail = q->tail->next; } q->size++; return true; } ``` ### q_remove_head * 如果 `bufsize - 1` 大於等於要刪除的字串長度 * `ncopy` 就是要刪除的字串長度 * :bulb:: 只要關注不包含 `\0` 的字串長度就好 * 如果 `bufsize` 小於要刪除的字串長度 * `ncopy` 就是 `bufsize - 1` * 因為要留一個 byte 放置 `\0` ```cpp /* * Attempt to remove element from head of queue. * Return true if successful. * Return false if queue is NULL or empty. * If sp is non-NULL and an element is removed, copy the removed string to *sp * (up to a maximum of bufsize-1 characters, plus a null terminator.) * The space used by the list element and the string should be freed. */ bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { list_ele_t *old_h; if (!q || !q->head || !sp) return false; size_t ncopy = (bufsize - 1) >= strlen(q->head->value) ? strlen(q->head->value) : bufsize - 1; strncpy(sp, q->head->value, ncopy); sp[ncopy] = '\0'; old_h = q->head; q->head = q->head->next; /* Update q->tail if q->size == 1 */ /* TODO: Apply Good-Taste to the update of q->tail */ if (q->size == 1) q->tail = NULL; free(old_h->value); free(old_h); q->size--; return true; } ``` ### q_size ```cpp /* * Return number of elements in queue. * Return 0 if q is NULL or empty */ int q_size(queue_t *q) { NULL_PTR_GUARD(q); return q->size; } ``` ### q_reverse ```cpp /* * Reverse elements in queue * No effect if q is NULL or empty * This function should not allocate or free any list elements * (e.g., by calling q_insert_head, q_insert_tail, or q_remove_head). * It should rearrange the existing ones. */ void q_reverse(queue_t *q) { if (!q || q->size <= 1) return; list_ele_t *pre, *cur, *nxt; /* Swap q->head and q->tail */ pre = q->head; q->head = q->tail; q->tail = pre; /* reverse the linked list */ for (q->head->next = q->tail, pre = q->tail, cur = pre->next, nxt = cur->next; ;) { /* reverse */ cur->next = pre; /* update the pointer */ pre = cur; if (cur == q->head) { q->tail->next = NULL; return; // means we're done } else { cur = nxt; nxt = nxt->next; } } } ``` ### Sort 這裡參照 `AdrianHuang` 同學的[實作](https://github.com/AdrianHuang/lab0-c/commit/5fc70c742351e4e0c66f80c12a9de0b989e30d7d)，使用一系列 function 來選擇不同 sorting method 函式呼叫流程 ```shell do_sort |_ q_sort_register_method(sort_method); |_ q_sort(q); |_merge_sort(q) |_do_merge_sort(head) |_do_merge(l1, l2) ``` :::danger 這樣做的缺點: 1. 太多相似命名的 function 了，我自己使用起來都有點混亂 2. 要新增更多 sorting 方法時，要在存放 function 順序的 `enum` 與存放 `function pointer` 的 `sort_func` 做對應的更動，要同時維護兩個地方很麻煩 ::: ### do_sort >in `qtest.c` ```cpp bool do_sort(int argc, char *argv[]) { int sort_method = MERGE_SORT; if (argc < 1) { report(1, "%s takes no arguments", argv[0]); return false; } else if (argc > 2) { report(1, "Too many arguments\nUsage: %s <sorting method>", argv[0]); } if (!q) report(3, "Warning: Calling sort on null queue"); error_check(); int cnt = q_size(q); if (cnt < 2) report(3, "Warning: Calling sort on single node"); error_check(); if (argc == 2) sort_method = atoi(argv[1]); q_sort_register_method(sort_method); set_noallocate_mode(true); if (exception_setup(true)) q_sort(q); exception_cancel(); set_noallocate_mode(false); ``` 使用方法: 我們可以使用 help command 來查看如何使用 ```shell sort [index] | Sort queue in ascending order, where index == 0 (default: merge sort), 1 (selection sort), 2 (bubble sort) ``` ```shell >cmd sort 2 ``` 使用 `atoi()` 取得我們從 command line 輸入的 number 後在這裡註冊我們要使用的排序 function ```cpp q_sort_register_method(sort_method); ``` ### q_sort 在 `do_sort()` 內被呼叫 ```cpp /* * Function pointer to actual sorting method. * Sort elements of queue in ascending order * No effect if q is NULL or empty. In addition, if q has only one * element, do nothing. */ void (*q_sort)(queue_t *q); ``` ### q_sort_register_method >in `queue.c` 負責註冊 sort function 到 `q_sort()` 這個 function pointer ```cpp= /* * Register the sorting method. */ void q_sort_register_method(int sort_method) { /* Sanity check */ if (sort_method < MERGE_SORT || sort_method >= SORT_METHOD_NUM) sort_method = MERGE_SORT; q_sort = sort_func[sort_method]; } ``` ### enum to store the sorting number in `queue.h` 如果需要加入新的 sort function 時，只要在 `BUBBLE_SORT` 下新增，並且在sort_func 底下新增一個 function pointer 就可以了 :::warning 這樣其實有點麻煩，要維護兩個地方，還在想有什麼辦法能夠簡化 ::: ```cpp /* Enumeration for different sorting methods */ enum { MERGE_SORT, SELECTION_SORT, BUBBLE_SORT, SORT_METHOD_NUM, }; ``` ### sort_func 由 `function pointer ` 組成的陣列 ```cpp /* Array of function pointer which points to the actual sort function */ void (*sort_func[SORT_METHOD_NUM])(queue_t *q) = { merge_sort, selection_sort, bubble_sort, }; ``` ### merge_sort ```cpp /* The caller function preparing merge_sort operation */ static void merge_sort(queue_t *q) { if (!q || q->size <= 1) return; q->head = do_merge_sort(q->head); while (q->tail->next) q->tail = q->tail->next; return; } ``` ### do_merge_sort ```cpp /* Do the actual operation of merge_sort */ /* FIXME: Stack overflow, the end condition doesn't work */ static list_ele_t *do_merge_sort(list_ele_t *head) { if (!head->next) return head; /* Do split using tortoise and hare algorithm */ list_ele_t *slow = head; list_ele_t *fast = head->next; while (fast && fast->next) { slow = slow->next; fast = fast->next->next; } fast = slow->next; slow->next = NULL; /* Sort each list */ list_ele_t *l1 = do_merge_sort(head); list_ele_t *l2 = do_merge_sort(fast); return do_merge(l1, l2); } ``` #### do_merge 取自 [`LeetCode 21. Merge Two Sorted Lists`的討論](https://leetcode.com/problems/merge-two-sorted-lists/discuss/788262/C-5-line-simple-recursive-solution) ```cpp /* Do the merge part */ static list_ele_t *do_merge(list_ele_t *l1, list_ele_t *l2) { if (!l1) return l2; if (!l2) return l1; list_ele_t *head = (strcmp(l1->value, l2->value) <= 0) ? l1 : l2; head->next = (strcmp(l1->value, l2->value) <= 0) ? do_merge(l1->next, l2) : do_merge(l1, l2->next); return head; } ``` 在使用 valgrind 測試時出現 stack overflow，於是使用迭代法來避免使用太多 stack 空間 ```cpp /* Do the merge part */ static list_ele_t *do_merge(list_ele_t *l1, list_ele_t *l2) { list_ele_t *head; if (strcmp(l1->value, l2->value) <= 0) { head = l1; l1 = l1->next; } else { head = l2; l2 = l2->next; } for (list_ele_t *cur = head;; cur = cur->next) { if (!l1) { cur->next = l2; break; } if (!l2) { cur->next = l1; break; } if (strcmp(l1->value, l2->value) <= 0) { cur->next = l1; l1 = l1->next; } else { cur->next = l2; l2 = l2->next; } } return head; } ``` :::success TODO: 嘗試更好的方法，整合 head 位置的判斷進入迴圈 ::: ### bubble_sort 使用 pointer to pointer 來避免將 head of queue 當作 special case，省去分支 ```cpp static void bubble_sort(queue_t *q) { if (!q || q->size <= 1) return; list_ele_t *tmp; list_ele_t **in_h = &q->head; for (int i = 0; i < q->size; i++) { /* Set q->tail when every round of comparison is finished */ q->tail = (*in_h); in_h = &q->head; for (int j = 0; j < q->size - 1 - i; j++) { if (strcmp((*in_h)->value, (*in_h)->next->value) > 0) { tmp = (*in_h)->next; (*in_h)->next = tmp->next; tmp->next = (*in_h); (*in_h) = tmp; } in_h = &(*in_h)->next; } } return; } ``` ## Valgrind ### linenoise 問題 * 在 strdup() 時發生 memory leak * >發生於 [commit b48fca1](https://github.com/unknowntpo/lab0-c/commit/b48fca19032704c5a332b740291324d117a6e4d7) * 試試看使用 gdb+valgrind! * [抓漏 - Gdb 和 Valgrind 合體技](https://wen00072.github.io/blog/2014/11/29/catch-the-leak-gdb-and-valgrind-siamese-juggling/) ### 症狀 * 當我們對第一筆測資進行 valgrind 測試時 ```shell $ make valgrind tid=1 ``` 得到 ```shell +++ TESTING trace trace-01-ops: # Test of insert_head and remove_head ==16369== 104 bytes in 20 blocks are still reachable in loss record 1 of 2 ==16369== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==16369== by 0x5277A29: strdup (strdup.c:42) ==16369== by 0x111E6A: linenoiseHistoryAdd (linenoise.c:1236) ==16369== by 0x1121B4: linenoiseHistoryLoad (linenoise.c:1325) ==16369== by 0x10B8A9: main (qtest.c:779) ==16369== ==16369== 160 bytes in 1 blocks are still reachable in loss record 2 of 2 ==16369== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==16369== by 0x111DD8: linenoiseHistoryAdd (linenoise.c:1224) ==16369== by 0x1121B4: linenoiseHistoryLoad (linenoise.c:1325) ==16369== by 0x10B8A9: main (qtest.c:779) ==16369== --- trace-01-ops 6/6 --- TOTAL 6/6 Test with specific case by running command: scripts/driver.py -p /tmp/qtest.uK1qgh --valgrind -t <tid> ``` ## 論文 [Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf) ## 同學範本 * [RinHizakura 同學](https://hackmd.io/@RinHizakura/BysgszHNw#Fisher%E2%80%93Yates-shuffle) * linenoise * Dude, is my code constant time? 論文導讀 * duduct 實作原理分析 * Valgrind & Massif