---
###### tags: `sysprog2021q1`
---
# 2021q1 Homework1 (lab0)
contributed by < `93i7xo2` >
> Source: [lab-0](https://hackmd.io/@sysprog/linux2021-lab0)
## To-Do List
- [ ] 研讀論文 [Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf),解釋本程式的 "simulation" 模式是如何透過以實驗而非理論分析,達到驗證時間複雜度,需要解釋 [Student's t-distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution) 及程式實作的原理。注意:現有實作存在若干致命缺陷,請討論並提出解決方案;
- [ ] 開啟 [Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer),修正 `qtest` 執行過程中的錯誤
* 先執行 `qtest` 再於命令提示列輸入 `help` 命令,會使 開啟 [Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer) 觸發錯誤,應予以排除
- [ ] 解釋 [select](http://man7.org/linux/man-pages/man2/select.2.html) 系統呼叫在本程式的使用方式,並分析 [console.c](https://github.com/sysprog21/lab0-c/blob/master/console.c) 的實作,說明其中運用 CS:APP [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 的原理和考量點。可對照參考 [CS:APP 第 10 章重點提示](https://hackmd.io/@sysprog/H1TtmVTTz) :arrow_forward: 為避免「舉燭」,請比照 `qtest` 實作,撰寫獨立的終端機命令解譯器 (可建立新的 GitHub repository)
- [x] 運用 Valgrind 排除 `qtest` 實作的記憶體錯誤,並透過 Massif 視覺化 "simulation" 過程中的記憶體使用量,需要設計對應的實驗
* 在 `qtest` 中實作 [coroutine](https://en.wikipedia.org/wiki/Coroutine),並提供新的命令 `web`,提供 web 伺服器功能,注意: web 伺服器運作過程中,`qtest` 仍可接受其他命令
* 可嘗試整合 [tiny-web-server](https://github.com/7890/tiny-web-server),將 `FORK_COUNT` 變更為 `0`,並以 [coroutine](https://en.wikipedia.org/wiki/Coroutine) 取代原本的 fork 系統呼叫
* 說明 [antirez/linenoise](https://github.com/antirez/linenoise) 的運作原理,注意到 [termios](http://man7.org/linux/man-pages/man3/termios.3.html) 的運用
* 指出現有程式的缺陷 (提示: 和 [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 有關),嘗試強化並提交 pull request
## 系統環境
```
CPU: i5-6200U
RAM: 20G
OS: Ubuntu 20.04.2 LTS
```
## 完成`qtest.c`實作
### `q_new`
為了實現 $O(1)$ 時間複雜度,加上 `count` 、 `last_element` 紀錄。
```diff
queue_t *q_new()
{
queue_t *q = malloc(sizeof(queue_t));
- /* TODO: What if malloc returned NULL? */
- q->head = NULL;
+ if (q) {
+ q->head = NULL;
+ q->count = 0;
+ q->last_element = q->head;
+ }
return q;
}
```
### `q_free`
```diff
void q_free(queue_t *q)
{
- /* TODO: How about freeing the list elements and the strings? */
/* Free queue structure */
+ if (!q)
+ return;
+
+ while (q->head) {
+ list_ele_t *tmp = q->head;
+ q->head = q->head->next;
+ free(tmp->value);
+ free(tmp);
+ }
free(q);
}
```
### `q_insert_head`
依據 [CERN Common vulnerabilities guide for C programmers](https://security.web.cern.ch/recommendations/en/codetools/c.shtml) 的建議使用 `strncpy`,以 `strlen` 取的字串大小作為 `strncpy` 參數。由於 `strncpy` 並不會加上 null character,需自行補上或以 `calloc` 替代 `malloc`。
```diff
bool q_insert_head(queue_t *q, char *s)
{
- list_ele_t *newh;
- /* TODO: What should you do if the q is NULL? */
- newh = malloc(sizeof(list_ele_t));
- /* Don't forget to allocate space for the string and copy it */
- /* What if either call to malloc returns NULL? */
+ if (!q)
+ return false;
+
+ /* Create a new element. */
+ list_ele_t *newh = malloc(sizeof(list_ele_t));
+ int len = strlen(s);
+ if (!newh)
+ return false;
+ newh->value = (char *) malloc(len + 1);
+ if (!newh->value) {
+ free(newh);
+ return false;
+ }
+ strncpy(newh->value, (const char *) s, len);
+ newh->value[len] = '\0';
+
+ /* Attach new element at the head of linked list. */
newh->next = q->head;
q->head = newh;
+ q->count++;
+
+ /* Mark it as last element if it's a first element. */
+ if (!newh->next)
+ q->last_element = newh;
+
return true;
}
```
### `q_insert_tail`
一開始實作 `q_insert_tail` 忽視掉佇列可能是空的可能性,透過 valgrind 發現無效寫入
```
==127495== Invalid write of size 8
==127495== at 0x10DF1E: q_insert_tail (queue.c:105)
==127495== by 0x10E3D2: measure (constant.c:69)
==127495== by 0x10E5D7: doit (fixture.c:136)
==127495== by 0x10E82F: is_insert_tail_const (fixture.c:168)
==127495== by 0x10B572: do_insert_tail (qtest.c:259)
==127495== by 0x10CB67: interpret_cmda (console.c:220)
==127495== by 0x10D0EF: interpret_cmd (console.c:243)
==127495== by 0x10D6CC: cmd_select (console.c:569)
==127495== by 0x10D903: run_console (console.c:628)
==127495== by 0x10BFC1: main (qtest.c:772)
==127495== Address 0x8 is not stack'd, malloc'd or (recently) free'd
==127495==
Segmentation fault occurred. You dereferenced a NULL or invalid pointer
```
於是加入空佇列的判斷
```diff
-q->last_element->next = newh;
-q->last_element = newh;
+if (!q->last_element) {
+ q->head = q->last_element = newh;
+} else {
+ q->last_element->next = newh;
+ q->last_element = newh;
+}
q->count++;
```
最終實作如下
```diff
bool q_insert_tail(queue_t *q, char *s)
{
- /* TODO: You need to write the complete code for this function */
- /* Remember: It should operate in O(1) time */
- /* TODO: Remove the above comment when you are about to implement. */
- return false;
+ if (!q)
+ return false;
+
+ /* Create a new element. */
+ list_ele_t *newh = malloc(sizeof(list_ele_t));
+ int len = strlen(s);
+ if (!newh)
+ return false;
+ newh->value = (char *) malloc(len + 1);
+ if (!newh->value) {
+ free(newh);
+ return false;
+ }
+ strncpy(newh->value, (const char *) s, len);
+ newh->value[len] = '\0';
+ newh->next = NULL;
+
+ /* Insert element at tail of queue, with time complexity of O(1) */
+ if (!q->last_element) {
+ q->head = q->last_element = newh;
+ } else {
+ q->last_element->next = newh;
+ q->last_element = newh;
+ }
+ q->count++;
+
+ return true;
}
```
### `q_remove_head`
```diff
bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
- /* TODO: You need to fix up this code. */
- /* TODO: Remove the above comment when you are about to implement. */
+ if (!q || !q->head)
+ return false;
+ if (sp) {
+ int realBufsize = strlen(q->head->value) < bufsize - 1
+ ? strlen(q->head->value)
+ : bufsize - 1;
+ memcpy((void *) sp, (const void *) q->head->value, realBufsize);
+ *(sp + realBufsize) = '\0';
+ }
+ list_ele_t *target = q->head;
+ /* May be bufsize is greater than actual size of value */
q->head = q->head->next;
+ q->count--;
+ free(target->value);
+ free(target);
+
+ /* Mark last element as NULL if queue is empty */
+ if (!q->head)
+ q->last_element = NULL;
+
return true;
}
```
### `q_size`
```diff
int q_size(queue_t *q)
{
- /* TODO: You need to write the code for this function */
- /* Remember: It should operate in O(1) time */
- /* TODO: Remove the above comment when you are about to implement. */
- return 0;
+ /* With time complexity of O(1) */
+ return q ? q->count : 0;
}
```
### `q_reverse`
```diff
void q_reverse(queue_t *q)
{
- /* TODO: You need to write the code for this function */
- /* TODO: Remove the above comment when you are about to implement. */
+ if (q == NULL)
+ return;
+
+ list_ele_t *prev = NULL;
+ list_ele_t *temp;
+ list_ele_t *current = q->last_element = q->head;
+ while (current != NULL) {
+ temp = current->next;
+ current->next = prev;
+ prev = current;
+ current = temp;
+ }
+ q->head = prev;
}
```
### `q_sort`
- 以 mergesort 來實作 `qsort`,分割 list 時使用 slow/faster pointer 的方法
- 參考 [gpwork4u](https://hackmd.io/-k6KIJ8GRSen7NUdVBt0AQ?view) 的建議將 comparator 獨立出來方便替換成其他函式
```diff
void q_sort(queue_t *q)
{
- /* TODO: You need to write the code for this function */
- /* TODO: Remove the above comment when you are about to implement. */
+ if (q == NULL || q->head == NULL || q->count == 1)
+ return;
+ q->head = mergesort(q->head, comparator);
+
+ list_ele_t *end = q->head;
+ while (end->next) {
+ end = end->next;
+ }
+ q->last_element = end;
}
+
+/*
+ * Sort elements in linked list
+ * This function is called by `q_sort`
+ */
+list_ele_t *mergesort(list_ele_t *start,
+ int (*compar)(const void *, const void *))
+{
+ if (!start || !start->next) {
+ return start;
+ }
+
+ // Split the list into two lists
+ list_ele_t *slow = start, *fast = start;
+ while (fast->next && fast->next->next) {
+ slow = slow->next;
+ fast = fast->next->next;
+ }
+ list_ele_t *left = start, *right = slow->next;
+ slow->next = NULL;
+
+ left = mergesort(left, compar);
+ right = mergesort(right, compar);
+
+ for (list_ele_t *merge = NULL; left || right;) {
+ if (!right || (left && (*compar)(left, right) < 0)) {
+ list_ele_t *next = left->next;
+ if (!merge) {
+ start = merge = left;
+ } else {
+ merge->next = left;
+ merge = left;
+ left = next;
+ } else {
+ list_ele_t *next = right->next;
+ if (!merge) {
+ start = merge = right;
+ } else {
+ merge->next = right;
+ merge = right;
+ }
+ right = next;
+ }
+ merge->next = NULL;
+ }
+ return start;
+}
```
## memleak in `qtest.c`
使用 `make valgrind` 或是 `valgrind ./qtest -f traces/trace-XX-ops.cmd` 會出現 memory leak 的訊息:
```
cmd>
Freeing queue
==364164== 5 bytes in 1 blocks are still reachable in loss record 1 of 3
==364164== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==364164== by 0x4A6150E: strdup (strdup.c:42)
==364164== by 0x110394: linenoiseHistoryAdd (linenoise.c:1236)
==364164== by 0x110F27: linenoiseHistoryLoad (linenoise.c:1325)
==364164== by 0x10C2BE: main (qtest.c:781)
==364164==
```
trace 後發現這段配置的記憶體應該是經由 `linenoise` 掛上 exit handlers 並於程式正常結束時清除
```
linenoise
└linenoiseRaw
└enableRawMode
└atexit(linenoiseAtExit);
```
按照 `run_console` 現有寫法,只有在 interactive mode 下才會呼叫到 `linenoise`
```cpp
if (!has_infile) {
char *cmdline;
while ((cmdline = linenoise(prompt)) != NULL) {
interpret_cmd(cmdline);
linenoiseHistoryAdd(cmdline); /* Add to the history. */
linenoiseHistorySave(HISTORY_FILE); /* Save the history on disk. */
linenoiseFree(cmdline);
}
} else {
while (!cmd_done())
cmd_select(0, NULL, NULL, NULL, NULL);
}
```
並在 `runconsole` 前設定 `linenoise`
```cpp
/* Trigger call back function(auto completion) */
linenoiseSetCompletionCallback(completion);
linenoiseHistorySetMaxLen(HISTORY_LEN);
linenoiseHistoryLoad(HISTORY_FILE); /* Load the history at startup */
...
bool ok = true;
ok = ok && run_console(infile_name);
```
想到的解法有兩個,在 `qtest` 結束前自行呼叫 `linenoiseAtExit`,或是將 `linenoiseHistoryLoad` 搬移到接下來要執行 `linenoise` 的前面,確保記憶體一定被釋放,例如原始專案範例 [example.c](https://github.com/antirez/linenoise/blob/master/example.c)。
選擇用前者,畢竟在 non-interactive mode 去載入 `linenoise` 的設定就是一件匪夷所思的事情,修改後如下:
```diff
if (!has_infile) {
char *cmdline;
+ /* Trigger call back function(auto completion) */
+ linenoiseSetCompletionCallback(completion);
+
+ linenoiseHistorySetMaxLen(HISTORY_LEN);
+ linenoiseHistoryLoad(HISTORY_FILE); /* Load the history at startup */
while ((cmdline = linenoise(prompt)) != NULL) {
interpret_cmd(cmdline);
linenoiseHistoryAdd(cmdline); /* Add to the history. */
linenoiseHistorySave(HISTORY_FILE); /* Save the history on disk. */
linenoiseFree(cmdline);
}
} else {
while (!cmd_done())
cmd_select(0, NULL, NULL, NULL, NULL);
}
```
## 嵌入 natural sort
[natsort](https://github.com/sourcefrog/natsort) 的行為是先以數字做排序,其次是字母,有別於 `strcmp` 逐一字元比較;數字的界定是以左右字母或特殊字元(`-`)做區隔,不計中間的空格或 0。
```
pic02a
pic02000
pic05
pic2
pic3
pic4
pic 4 else
pic 5
```
測試腳本提供數種 testdata,然而實際用到的只有 test-dates、test-fractions、test-words。
來看 natsort 的程式碼,`strnatcasecmp.c` 提供兩個比較函式
- strnatcmp
- strnatcasecmp: 功能同上,但字母不分大小寫,官方測試資料也沒測到大小寫字母。
```cpp
if (fold_case) {
ca = nat_toupper(ca);
cb = nat_toupper(cb);
}
```
基於考量採用 `strnatcasecmp` 當作比較函數,並在 `qtest.c` 引入 `strnatcmp.h`、`strnatcmp.h`,同時在`console.c`的`simulation`後方新增一個 option - natsort,用以切換 natsort/normal sort mode。
```cpp
int comparator(const void *a, const void *b)
{
int (*sort_alg)() = strcmp;
if (natsort)
sort_alg = strnatcasecmp;
return sort_alg(((list_ele_t *) a)->value, ((list_ele_t *) b)->value);
}
```
可惜在 console 輸入時是以空白做運算元分隔,下面字詞插入時會被誤判,故不列入測試。
```
pic 4 else
pic 5 something
Fractional release numbers
```
:::info
`run-test.bash` 有這麼一段
```shell
diff -u "$1" <( ./natsort < "$2" ) && return
```
根據 [3.5.6 Process Substitution](https://www.gnu.org/software/bash/manual/html_node/Process-Substitution.html#Process-Substitution) 的說明,`diff` 必須要將 `<()` 內的程式輸出讀進來,作用等同
```shell
./natsort < "$2" | diff -u "$1" - && return
```
:::
## 研讀論文 Dude, is my code constant time?