--- ###### tags: `sysprog2021q1` --- # 2021q1 Homework1 (lab0) contributed by < `93i7xo2` > > Source: [lab-0](https://hackmd.io/@sysprog/linux2021-lab0) ## To-Do List - [ ] 研讀論文 [Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf),解釋本程式的 "simulation" 模式是如何透過以實驗而非理論分析,達到驗證時間複雜度,需要解釋 [Student's t-distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution) 及程式實作的原理。注意:現有實作存在若干致命缺陷,請討論並提出解決方案; - [ ] 開啟 [Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer),修正 `qtest` 執行過程中的錯誤 * 先執行 `qtest` 再於命令提示列輸入 `help` 命令,會使 開啟 [Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer) 觸發錯誤,應予以排除 - [ ] 解釋 [select](http://man7.org/linux/man-pages/man2/select.2.html) 系統呼叫在本程式的使用方式,並分析 [console.c](https://github.com/sysprog21/lab0-c/blob/master/console.c) 的實作,說明其中運用 CS:APP [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 的原理和考量點。可對照參考 [CS:APP 第 10 章重點提示](https://hackmd.io/@sysprog/H1TtmVTTz) :arrow_forward: 為避免「舉燭」,請比照 `qtest` 實作,撰寫獨立的終端機命令解譯器 (可建立新的 GitHub repository) - [x] 運用 Valgrind 排除 `qtest` 實作的記憶體錯誤,並透過 Massif 視覺化 "simulation" 過程中的記憶體使用量,需要設計對應的實驗 * 在 `qtest` 中實作 [coroutine](https://en.wikipedia.org/wiki/Coroutine),並提供新的命令 `web`,提供 web 伺服器功能,注意: web 伺服器運作過程中,`qtest` 仍可接受其他命令 * 可嘗試整合 [tiny-web-server](https://github.com/7890/tiny-web-server),將 `FORK_COUNT` 變更為 `0`,並以 [coroutine](https://en.wikipedia.org/wiki/Coroutine) 取代原本的 fork 系統呼叫 * 說明 [antirez/linenoise](https://github.com/antirez/linenoise) 的運作原理,注意到 [termios](http://man7.org/linux/man-pages/man3/termios.3.html) 的運用 * 指出現有程式的缺陷 (提示: 和 [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 有關),嘗試強化並提交 pull request ## 系統環境 ``` CPU: i5-6200U RAM: 20G OS: Ubuntu 20.04.2 LTS ``` ## 完成`qtest.c`實作 ### `q_new` 為了實現 $O(1)$ 時間複雜度,加上 `count` 、 `last_element` 紀錄。 ```diff queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); - /* TODO: What if malloc returned NULL? */ - q->head = NULL; + if (q) { + q->head = NULL; + q->count = 0; + q->last_element = q->head; + } return q; } ``` ### `q_free` ```diff void q_free(queue_t *q) { - /* TODO: How about freeing the list elements and the strings? */ /* Free queue structure */ + if (!q) + return; + + while (q->head) { + list_ele_t *tmp = q->head; + q->head = q->head->next; + free(tmp->value); + free(tmp); + } free(q); } ``` ### `q_insert_head` 依據 [CERN Common vulnerabilities guide for C programmers](https://security.web.cern.ch/recommendations/en/codetools/c.shtml) 的建議使用 `strncpy`,以 `strlen` 取的字串大小作為 `strncpy` 參數。由於 `strncpy` 並不會加上 null character,需自行補上或以 `calloc` 替代 `malloc`。 ```diff bool q_insert_head(queue_t *q, char *s) { - list_ele_t *newh; - /* TODO: What should you do if the q is NULL? */ - newh = malloc(sizeof(list_ele_t)); - /* Don't forget to allocate space for the string and copy it */ - /* What if either call to malloc returns NULL? */ + if (!q) + return false; + + /* Create a new element. */ + list_ele_t *newh = malloc(sizeof(list_ele_t)); + int len = strlen(s); + if (!newh) + return false; + newh->value = (char *) malloc(len + 1); + if (!newh->value) { + free(newh); + return false; + } + strncpy(newh->value, (const char *) s, len); + newh->value[len] = '\0'; + + /* Attach new element at the head of linked list. */ newh->next = q->head; q->head = newh; + q->count++; + + /* Mark it as last element if it's a first element. */ + if (!newh->next) + q->last_element = newh; + return true; } ``` ### `q_insert_tail` 一開始實作 `q_insert_tail` 忽視掉佇列可能是空的可能性,透過 valgrind 發現無效寫入 ``` ==127495== Invalid write of size 8 ==127495== at 0x10DF1E: q_insert_tail (queue.c:105) ==127495== by 0x10E3D2: measure (constant.c:69) ==127495== by 0x10E5D7: doit (fixture.c:136) ==127495== by 0x10E82F: is_insert_tail_const (fixture.c:168) ==127495== by 0x10B572: do_insert_tail (qtest.c:259) ==127495== by 0x10CB67: interpret_cmda (console.c:220) ==127495== by 0x10D0EF: interpret_cmd (console.c:243) ==127495== by 0x10D6CC: cmd_select (console.c:569) ==127495== by 0x10D903: run_console (console.c:628) ==127495== by 0x10BFC1: main (qtest.c:772) ==127495== Address 0x8 is not stack'd, malloc'd or (recently) free'd ==127495== Segmentation fault occurred. You dereferenced a NULL or invalid pointer ``` 於是加入空佇列的判斷 ```diff -q->last_element->next = newh; -q->last_element = newh; +if (!q->last_element) { + q->head = q->last_element = newh; +} else { + q->last_element->next = newh; + q->last_element = newh; +} q->count++; ``` 最終實作如下 ```diff bool q_insert_tail(queue_t *q, char *s) { - /* TODO: You need to write the complete code for this function */ - /* Remember: It should operate in O(1) time */ - /* TODO: Remove the above comment when you are about to implement. */ - return false; + if (!q) + return false; + + /* Create a new element. */ + list_ele_t *newh = malloc(sizeof(list_ele_t)); + int len = strlen(s); + if (!newh) + return false; + newh->value = (char *) malloc(len + 1); + if (!newh->value) { + free(newh); + return false; + } + strncpy(newh->value, (const char *) s, len); + newh->value[len] = '\0'; + newh->next = NULL; + + /* Insert element at tail of queue, with time complexity of O(1) */ + if (!q->last_element) { + q->head = q->last_element = newh; + } else { + q->last_element->next = newh; + q->last_element = newh; + } + q->count++; + + return true; } ``` ### `q_remove_head` ```diff bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { - /* TODO: You need to fix up this code. */ - /* TODO: Remove the above comment when you are about to implement. */ + if (!q || !q->head) + return false; + if (sp) { + int realBufsize = strlen(q->head->value) < bufsize - 1 + ? strlen(q->head->value) + : bufsize - 1; + memcpy((void *) sp, (const void *) q->head->value, realBufsize); + *(sp + realBufsize) = '\0'; + } + list_ele_t *target = q->head; + /* May be bufsize is greater than actual size of value */ q->head = q->head->next; + q->count--; + free(target->value); + free(target); + + /* Mark last element as NULL if queue is empty */ + if (!q->head) + q->last_element = NULL; + return true; } ``` ### `q_size` ```diff int q_size(queue_t *q) { - /* TODO: You need to write the code for this function */ - /* Remember: It should operate in O(1) time */ - /* TODO: Remove the above comment when you are about to implement. */ - return 0; + /* With time complexity of O(1) */ + return q ? q->count : 0; } ``` ### `q_reverse` ```diff void q_reverse(queue_t *q) { - /* TODO: You need to write the code for this function */ - /* TODO: Remove the above comment when you are about to implement. */ + if (q == NULL) + return; + + list_ele_t *prev = NULL; + list_ele_t *temp; + list_ele_t *current = q->last_element = q->head; + while (current != NULL) { + temp = current->next; + current->next = prev; + prev = current; + current = temp; + } + q->head = prev; } ``` ### `q_sort` - 以 mergesort 來實作 `qsort`,分割 list 時使用 slow/faster pointer 的方法 - 參考 [gpwork4u](https://hackmd.io/-k6KIJ8GRSen7NUdVBt0AQ?view) 的建議將 comparator 獨立出來方便替換成其他函式 ```diff void q_sort(queue_t *q) { - /* TODO: You need to write the code for this function */ - /* TODO: Remove the above comment when you are about to implement. */ + if (q == NULL || q->head == NULL || q->count == 1) + return; + q->head = mergesort(q->head, comparator); + + list_ele_t *end = q->head; + while (end->next) { + end = end->next; + } + q->last_element = end; } + +/* + * Sort elements in linked list + * This function is called by `q_sort` + */ +list_ele_t *mergesort(list_ele_t *start, + int (*compar)(const void *, const void *)) +{ + if (!start || !start->next) { + return start; + } + + // Split the list into two lists + list_ele_t *slow = start, *fast = start; + while (fast->next && fast->next->next) { + slow = slow->next; + fast = fast->next->next; + } + list_ele_t *left = start, *right = slow->next; + slow->next = NULL; + + left = mergesort(left, compar); + right = mergesort(right, compar); + + for (list_ele_t *merge = NULL; left || right;) { + if (!right || (left && (*compar)(left, right) < 0)) { + list_ele_t *next = left->next; + if (!merge) { + start = merge = left; + } else { + merge->next = left; + merge = left; + left = next; + } else { + list_ele_t *next = right->next; + if (!merge) { + start = merge = right; + } else { + merge->next = right; + merge = right; + } + right = next; + } + merge->next = NULL; + } + return start; +} ``` ## memleak in `qtest.c` 使用 `make valgrind` 或是 `valgrind ./qtest -f traces/trace-XX-ops.cmd` 會出現 memory leak 的訊息: ``` cmd> Freeing queue ==364164== 5 bytes in 1 blocks are still reachable in loss record 1 of 3 ==364164== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==364164== by 0x4A6150E: strdup (strdup.c:42) ==364164== by 0x110394: linenoiseHistoryAdd (linenoise.c:1236) ==364164== by 0x110F27: linenoiseHistoryLoad (linenoise.c:1325) ==364164== by 0x10C2BE: main (qtest.c:781) ==364164== ``` trace 後發現這段配置的記憶體應該是經由 `linenoise` 掛上 exit handlers 並於程式正常結束時清除 ``` linenoise └linenoiseRaw └enableRawMode └atexit(linenoiseAtExit); ``` 按照 `run_console` 現有寫法,只有在 interactive mode 下才會呼叫到 `linenoise` ```cpp if (!has_infile) { char *cmdline; while ((cmdline = linenoise(prompt)) != NULL) { interpret_cmd(cmdline); linenoiseHistoryAdd(cmdline); /* Add to the history. */ linenoiseHistorySave(HISTORY_FILE); /* Save the history on disk. */ linenoiseFree(cmdline); } } else { while (!cmd_done()) cmd_select(0, NULL, NULL, NULL, NULL); } ``` 並在 `runconsole` 前設定 `linenoise` ```cpp /* Trigger call back function(auto completion) */ linenoiseSetCompletionCallback(completion); linenoiseHistorySetMaxLen(HISTORY_LEN); linenoiseHistoryLoad(HISTORY_FILE); /* Load the history at startup */ ... bool ok = true; ok = ok && run_console(infile_name); ``` 想到的解法有兩個,在 `qtest` 結束前自行呼叫 `linenoiseAtExit`,或是將 `linenoiseHistoryLoad` 搬移到接下來要執行 `linenoise` 的前面,確保記憶體一定被釋放,例如原始專案範例 [example.c](https://github.com/antirez/linenoise/blob/master/example.c)。 選擇用前者,畢竟在 non-interactive mode 去載入 `linenoise` 的設定就是一件匪夷所思的事情,修改後如下: ```diff if (!has_infile) { char *cmdline; + /* Trigger call back function(auto completion) */ + linenoiseSetCompletionCallback(completion); + + linenoiseHistorySetMaxLen(HISTORY_LEN); + linenoiseHistoryLoad(HISTORY_FILE); /* Load the history at startup */ while ((cmdline = linenoise(prompt)) != NULL) { interpret_cmd(cmdline); linenoiseHistoryAdd(cmdline); /* Add to the history. */ linenoiseHistorySave(HISTORY_FILE); /* Save the history on disk. */ linenoiseFree(cmdline); } } else { while (!cmd_done()) cmd_select(0, NULL, NULL, NULL, NULL); } ``` ## 嵌入 natural sort [natsort](https://github.com/sourcefrog/natsort) 的行為是先以數字做排序,其次是字母,有別於 `strcmp` 逐一字元比較;數字的界定是以左右字母或特殊字元(`-`)做區隔,不計中間的空格或 0。 ``` pic02a pic02000 pic05 pic2 pic3 pic4 pic 4 else pic 5 ``` 測試腳本提供數種 testdata,然而實際用到的只有 test-dates、test-fractions、test-words。 來看 natsort 的程式碼,`strnatcasecmp.c` 提供兩個比較函式 - strnatcmp - strnatcasecmp: 功能同上,但字母不分大小寫,官方測試資料也沒測到大小寫字母。 ```cpp if (fold_case) { ca = nat_toupper(ca); cb = nat_toupper(cb); } ``` 基於考量採用 `strnatcasecmp` 當作比較函數,並在 `qtest.c` 引入 `strnatcmp.h`、`strnatcmp.h`,同時在`console.c`的`simulation`後方新增一個 option - natsort,用以切換 natsort/normal sort mode。 ```cpp int comparator(const void *a, const void *b) { int (*sort_alg)() = strcmp; if (natsort) sort_alg = strnatcasecmp; return sort_alg(((list_ele_t *) a)->value, ((list_ele_t *) b)->value); } ``` 可惜在 console 輸入時是以空白做運算元分隔,下面字詞插入時會被誤判,故不列入測試。 ``` pic 4 else pic 5 something Fractional release numbers ``` :::info `run-test.bash` 有這麼一段 ```shell diff -u "$1" <( ./natsort < "$2" ) && return ``` 根據 [3.5.6 Process Substitution](https://www.gnu.org/software/bash/manual/html_node/Process-Substitution.html#Process-Substitution) 的說明,`diff` 必須要將 `<()` 內的程式輸出讀進來,作用等同 ```shell ./natsort < "$2" | diff -u "$1" - && return ``` ::: ## 研讀論文 Dude, is my code constant time?