2020q1 Homework1 (lab0)

--- tags: linux, queue, linked list, c_programming --- # 2020q1 Homework1 (lab0) contributed by < [chses9440611](https://github.com/chses9440611/lab0-c) > ## Implemenation ### queue.h 在 queue_t 增加 size 和 tail 欄位，用來完成在 $O(1)$ 時間內完成 q_insert_tail 和 q_size ```cpp typedef struct ELE { char *value; struct ELE *next; } list_ele_t; typedef struct { list_ele_t *head, *tail; // Linked list of elements int size; } queue_t; ``` ### q_new 當 malloc queue 失敗時要回傳 `NULL`，而一開始讓 `q->head` 和 `q->tail` 指向 `NULL` ```cpp queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (!q) return NULL; q->head = q->tail = NULL; q->size = 0; return q; } ``` ### q_free ```cpp void q_free(queue_t *q) { // when q is not NULL, we do free(q), free node from head if (q != NULL) { while (q->size > 0) { list_ele_t *tmp = q->head; q->head = q->head->next; free(tmp->value); free(tmp); q->size--; } free(q); } } ``` ### q_insert_head 這邊要注意的是當新的節點的 value 在 malloc 失敗時要記得將新節點的空間一併釋放。第二個要注意的是當要插入第一個節點時，要讓 `q->tail` 和 `q->head` 指向同一個節點。 ```cpp bool q_insert_head(queue_t *q, char *s) { list_ele_t *newh; int length; if (!q) return false; newh = malloc(sizeof(list_ele_t)); if (!newh) { printf("Fail to malloc a node\n"); return false; } // get the string length of s, not including '\0' length = strlen(s); newh->value = malloc(length + 1); // if fail to malloc value, also free the node newh if (!newh->value) { printf("Fail to malloc value\n"); free(newh); return false; } strncpy(newh->value, s, length); newh->value[length] = '\0'; /* * if q is an empty queue, first insert * will let tail and head point to same node */ newh->next = q->head; q->head = newh; if (q->size == 0) q->tail = q->head; (q->size)++; return true; } ``` ### q_insert_tail 要注意的地方跟 q_insert_head 一樣。 ```cpp bool q_insert_tail(queue_t *q, char *s) { /* Remember: It should operate in O(1) time */ if (!q) return false; list_ele_t *newt; newt = malloc(sizeof(list_ele_t)); if (!newt) { return false; } int length = strlen(s); newt->value = malloc(length + 1); if (!newt->value) { free(newt); return false; } strncpy(newt->value, s, length); newt->value[length] = '\0'; // avoid newt->next NULL deference newt->next = NULL; if (q->size == 0) { q->head = q->tail = newt; } else { q->tail = q->tail->next = newt; } (q->size)++; return true; } ``` ### q_remove_head 當 q 為 `NULL` 或是 q->size 為0，以及 sp 指向 NULL 時回傳 `NULL` 要注意要移除的節點的字串長度要跟 bufsize 比較大小，字串長度不可以超過 `bufsize - 1` ```cpp bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (!q || q->size == 0 || !sp) return false; list_ele_t *old = q->head; int length = strlen(old->value); if (bufsize < length + 1) length = bufsize - 1; // for sp is pointed to an allocated space, no need to malloc again. // sp = malloc(length + 1); // if(!sp) { // printf("sp malloc fail\n"); // return false; // } strncpy(sp, old->value, length); sp[length] = '\0'; q->head = q->head->next; free(old->value); free(old); if (q->size == 1) q->tail = q->head; (q->size)--; return true; } ``` ### q_size 由於在 queue_t 中增加了 size 欄位，使得可以在常數時間內完成 ```cpp int q_size(queue_t *q) { if (!q) return 0; return q->size; } ``` :::warning 可改寫為以下: ```cpp int q_size(queue_t *q) { return q ? q->size : 0; } ``` 簡潔又清晰 :notes: jserv ::: :::success 經過改寫後： ```cpp int q_size(queue_t *q) { return q ? q->size : 0; } ``` ::: ### q_reverse 先讓 `q->tail = q->head` 可以避免之後再用迴圈來尋找 `q->tail` 的 overhead。 ```cpp void q_reverse(queue_t *q) { if (q && q->size > 1) { list_ele_t *left, *right; q->tail = q->head; left = q->head->next; while (left != NULL) { right = left->next; left->next = q->head; q->head = left; left = right; } q->tail->next = NULL; } } ``` ### q_sort 若用 bottom-up 的方式，會需要 $O(n)$ 的空間，在 trace-15 會檢測時會超出程式記憶體的容量。若是純用遞迴也會造成 Segmentation fault。於是最後我採取的是用 Top-down 遞迴來均分串鏈，用迴圈來合併分併兩個串鏈，最後在用一個 for 迴圈來尋找 `q->tail`。則遞迴式為 $\Big\{\begin{array}{11} T(n) = 2T(\frac{n}{2}) + O(n) & \mbox{n > 1} \\ T(1) = O(1) & \mbox{n = 1}\end{array}$ 藉由 [Mater theorem](https://en.wikipedia.org/wiki/Master_theorem_$analysis_of_algorithms$) 或是計算可得 $T(n) = O(n\log n)$ ```cpp void q_sort(queue_t *q) { if (!q || q->size < 2) return; q->head = merge_sort(q->head); for (q->tail = q->head; q->tail->next != NULL; q->tail = q->tail->next); } ``` ### auxiliary function of q_sort :::warning `mergesort` 和 `merge` 這兩個函式應該在宣告標註 `static`，這樣就不會有符號的 visibility 被公開的狀況，有助於編譯器施加最佳化。 :notes: jserv ::: :::info 在使用時，要注意 static function 的宣告和實作要放在同一個檔案否則會有error: foo_function declared ‘static’ but never defined ::: #### declare in queue.c ```cpp static list_ele_t *merge_sort(list_ele_t *head); static list_ele_t *merge(list_ele_t *l1, list_ele_t *l2); ``` #### mergesort ```cpp static list_ele_t *merge_sort(list_ele_t *head) { if (!head || !head->next) return head; list_ele_t *fast = head->next; list_ele_t *slow = head; // split list while (fast && fast->next) { slow = slow->next; fast = fast->next->next; } fast = slow->next; slow->next = NULL; // sort each List list_ele_t *l1 = merge_sort(head); list_ele_t *l2 = merge_sort(fast); return merge(l1, l2); } ``` #### merge 在合併時，先尋找第一個節點，並讓 `head` 指向他，之後再用迴圈的方式來尋找。 ```cpp static list_ele_t *merge(list_ele_t *l1, list_ele_t *l2) { int cmpResult = strcmp(l1->value, l2->value); list_ele_t *head, *tail; if (cmpResult <= 0) { tail = head = l1; l1 = l1->next; } else { tail = head = l2; l2 = l2->next; } while (l1 != NULL && l2 != NULL) { cmpResult = strcmp(l1->value, l2->value); if (cmpResult <= 0) { tail->next = l1; l1 = l1->next; } else { tail->next = l2; l2 = l2->next; } tail = tail->next; } if (l1 == NULL) { tail->next = l2; } else { tail->next = l1; } return head; } ``` ### q_sort (Bottom-up way) 但這個函式在節點數來到百萬級時造成程式的 stack 已滿而無法運作 ```cpp void q_sort(queue_t *q) { if (!q || q->size <= 1) return; int size = q->size; list_ele_t *ptr[size]; // linked list merge sort bottom-up way for (int width = 1; width < size; width *= 2) { list_ele_t* target; for(int k = 0; k < size; k++) { ptr[k] = q->head; q->head = q->head->next; } q->tail = ptr[size-1]->next = NULL; for (int i = 0; i < size; i += 2 * width) { if(size < i + width) { // the remain left < size, no need to compare left and rig q->tail->next = ptr[i]; q->tail = ptr[size-1]; continue; } // the total component to insert int leftBound = i + width, rightBound = (size < i + 2 * width)?size : i + 2 * width; for (int left = i, right = leftBound; left < leftBound || right < rightBound;) { if (left == leftBound) { // right remain q->tail->next = ptr[right]; right = rightBound; q->tail = ptr[right-1]; } else if (right == rightBound) { // left remain q->tail->next = ptr[left]; left = leftBound; q->tail = ptr[left-1]; } else { int result = strcmp(ptr[left]->value, ptr[right]->value); if (result <= 0) { // Same string or smaller left, insert left target = ptr[left++]; // ptr[j] = tmp[left], j++, left += 1 } else { // right is smaller target = ptr[right++]; } if(!q->tail) q->head = target; else { q->tail->next = target; } q->tail = target; } } } } q->tail->next = NULL; } ``` ## Make test result ```shell make test scripts/driver.py -c --- Trace Points +++ TESTING trace trace-01-ops: # Test of insert_head and remove_head --- trace-01-ops 6/6 +++ TESTING trace trace-02-ops: # Test of insert_head, insert_tail, and remove_head --- trace-02-ops 6/6 +++ TESTING trace trace-03-ops: # Test of insert_head, insert_tail, reverse, and remove_head --- trace-03-ops 6/6 +++ TESTING trace trace-04-ops: # Test of insert_head, insert_tail, size, and sort --- trace-04-ops 6/6 +++ TESTING trace trace-05-ops: # Test of insert_head, insert_tail, remove_head, reverse, size, and sort --- trace-05-ops 5/5 +++ TESTING trace trace-06-string: # Test of truncated strings --- trace-06-string 6/6 +++ TESTING trace trace-07-robust: # Test operations on NULL queue --- trace-07-robust 6/6 +++ TESTING trace trace-08-robust: # Test operations on empty queue --- trace-08-robust 6/6 +++ TESTING trace trace-09-robust: # Test remove_head with NULL argument --- trace-09-robust 6/6 +++ TESTING trace trace-10-malloc: # Test of malloc failure on new --- trace-10-malloc 6/6 +++ TESTING trace trace-11-malloc: # Test of malloc failure on insert_head Fail to malloc a node Fail to malloc value Fail to malloc a node Fail to malloc a node Fail to malloc a node Fail to malloc a node Fail to malloc a node Fail to malloc a node --- trace-11-malloc 6/6 +++ TESTING trace trace-12-malloc: # Test of malloc failure on insert_tail --- trace-12-malloc 6/6 +++ TESTING trace trace-13-perf: # Test performance of insert_tail --- trace-13-perf 6/6 +++ TESTING trace trace-14-perf: # Test performance of size --- trace-14-perf 6/6 +++ TESTING trace trace-15-perf: # Test performance of insert_tail, size, reverse, and sort --- trace-15-perf 6/6 +++ TESTING trace trace-16-perf: # Test performance of sort with random and descending orders # 10000: all correct sorting algorithms are expected pass # 50000: sorting algorithms with O(n^2) time complexity are expected failed # 100000: sorting algorithms with O(nlogn) time complexity are expected pass --- trace-16-perf 6/6 +++ TESTING trace trace-17-complexity: # Test if q_insert_tail and q_size is constant time complexity Probably constant time Probably constant time --- trace-17-complexity 5/5 --- TOTAL 100/100 ``` ## Natural sort - [ ] 什麼是 natural sort - [ ] 如何修改比較程式 - [ ] 如何在 simulation 修改來反映出 natural sort 的使用 ## Valgrind 排除記憶體錯誤, Massif 視覺化 - [ ] 什麼是 Valgrind 怎麼使用 - [ ] 什麼是 Massif，如何使用 - [ ] 要設計怎樣的實驗 - [ ] 如何實作實驗 ### Dude, is my code constant time? - [ ] ## TODO: - [ ] 修改排序所用的比較程式，變更為 natural sort，在 "simulaition" 也做對應的修改，得以反映出 natural sort 的使用。 * 紀錄如何逐步達到自動評分的要求，需要涵蓋 - [ ] 運用 Valgrind 排除 qtest 實作的記憶體錯誤，並透過 Massif視覺 "simulation" 過程中的記憶體使用量，需要設計對應的實驗。 - [ ] 研讀論文 [Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf)，解釋本程式的 "simulation" 模式是如何透過以實驗而非理論分析，達到驗證時間複雜度，需要解釋 [Student's t-distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution) 及程式實作的原理; - [ ] 解釋 [select](http://man7.org/linux/man-pages/man2/select.2.html) 系統呼叫在本程式的使用方式，並分析 [console.c](https://github.com/sysprog21/lab0-c/blob/master/console.c) 的實作，說明其中運用 CS:APP [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 的原理和考量點。可對照參考 [CS:APP 第 10 章重點提示](https://hackmd.io/@sysprog/H1TtmVTTz) :arrow_forward: 為避免「舉燭」，請比照 `qtest` 實作，撰寫獨立的終端機命令解譯器 (可建立新的 GitHub repository) * 挑戰題 - [ ] 參照 [antirez/linenoise](https://github.com/antirez/linenoise)，強化 `qtest` 命令列功能，使其支援單行或多行編輯、歷史命令處理、命令自動補完 (例如輸入 `h` 之後按下 Tab 按鍵，自動接續補上 `elp`，成為完整的 `help` 命令)。除了整合程式碼外，應當說明 [antirez/linenoise](https://github.com/antirez/linenoise) 的運作原理，注意到 [termios](http://man7.org/linux/man-pages/man3/termios.3.html) 的運用 - [ ] 思考如何預設開啟 AddressSanitizer，卻又能滿足效能測試所需 (可能需要兩種以上的 build target) - [ ] 指出現有程式的缺陷 (提示: 和 [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 有關)，嘗試強化並提交 pull request