# 2021q1 Homework1 (lab0) contributed by < `chiehen` > ###### tags: `linux2021` ## 作業要求 - [x] 依據指示著手修改 queue.[ch] 和連帶的檔案,測試後用 Git 管理各項修改,記得也該實作 q_sort 函式。 - [x] 開啟 Address Sanitizer,修正 qtest 執行過程中的錯誤 - [ ] 運用 Valgrind 排除 qtest 實作的記憶體錯誤,並透過 Massif 視覺化 “simulation” 過程中的記憶體使用量,需要設計對應的實驗 - [ ] 在 qtest 中實作 coroutine,並提供新的命令 web,提供 web 伺服器功能 - [ ] 解釋 select 系統呼叫在本程式的使用方式,並分析 console.c 的實作,說明其中運用 CS:APP RIO 套件 的原理和考量點。 - [ ] 說明 antirez/linenoise 的運作原理,注意到 termios 的運用 - [ ] 研讀論文 Dude, is my code constant time? - [ ] 指出現有程式的缺陷 (提示: 和 RIO 套件 有關),嘗試強化並提交 pull request ## 環境準備 1. 安裝必要的開發工具套件 ``` $ sudo apt install build-essential git-core valgrind $ sudo apt install cppcheck clang-format aspell colordiff $ sudo apt install tig ``` 2. 取得程式 1. 在 GitHub 上 fork 專案([lab-0](https://github.com/sysprog21/lab0-c)) 2. clone 至本地端 ``` $ git clone git@github.com:你的帳號名稱/lab0-c ``` ## C Programming Lab 在程式中 [queue.h](https://github.com/sysprog21/lab0-c/blob/master/queue.h) 定義了鏈結串列(linked-list) 的結構: ```c=16 /* Linked list element (You shouldn't need to change this) */ typedef struct ELE { /* Pointer to array holding string. * This array needs to be explicitly allocated and freed */ char *value; struct ELE *next; } list_ele_t; ``` 為單向的 linked-list, 接著用鏈結串列來實作佇列 (queue): ```c=25 /* Queue structure */ typedef struct { list_ele_t *head; /* Linked list of elements */ /* TODO: You will need to add more fields to this structure * to efficiently implement q_size and q_insert_tail. */ /* TODO: Remove the above comment when you are about to implement. */ } queue_t; ``` 可發現 [queue.c](https://github.com/sysprog21/lab0-c/blob/master/queue.c) 僅提供介面 (interface) 但尚未有完整程式實作,包含以下: * `q_new`: 建立新的「空」佇列; * `q_free`: 釋放佇列所佔用的記憶體; * `q_insert_head`: 在佇列開頭 (head) 加入 (insert) 給定的新元素 (以 LIFO 準則); * `q_insert_tail`: 在佇列尾端 (tail) 加入 (insert) 給定的新元素 (以 FIFO 準則); * `q_remove_head`: 在佇列開頭 (head) 移去 (remove) 給定的元素。 * `q_size`: 計算佇列中的元素數量。 * `q_reverse`: 以反向順序重新排列鏈結串列,該函式不該配置或釋放任何鏈結串列元素,換言之,它只能重排已經存在的元素; * `q_sort`: 以遞增順序來排序鏈結串列的元素。 因此第一個作業要求即是完成此部份實做。 ### 實做 q_new 根據註解創建一個空的 queue, 將 `head` 及`tail` 指向 `NULL`, `size` 設為 `0` 。如果無法成功獲得動態配置的記憶體空間則回傳 `NULL`。 以下為程式碼: ```c /* * Create empty queue. * Return NULL if could not allocate space. */ queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (!q) return NULL; q->head = NULL; q->tail = NULL; q->size = 0; return q; } ``` ### 實做 q_free 當呼叫此函式時將釋放 queue 中所有配置的記憶體空間,依照 linked-list 單向的結構,從 `head` 開始逐一釋放記憶體。 另外,因 list element 中成員 `value` 有額外配置記憶體,因此在釋放記憶體時,須先釋放 `value` 的記憶體。在釋放完所有 linked-list 的記憶體後,最後釋放 queue 的記憶體。 以下為程式碼: ```c /* Free all storage used by queue */ void q_free(queue_t *q) { /* Free queue structure */ if (q) { while (q->head) { list_ele_t *p = q->head; q->head = p->next; free(p->value); free(p); } } free(q); } ``` ### 實做 q_insert_head 此段函式在佇列開頭 (head) 加入 (insert) 給定的新元素。依據註解如果 queue 是`NULL`,或記憶體配置失敗則回傳 `false`。 配置出字串的記憶體時: ``` char *val = malloc(strlen(s) + 1); ``` 可注意到配置的空間是 `strlen(s) +1` bytes, 是因為根據 Linux Programmer's Manual: > The strlen() function calculates the length of the string pointed to by s, excluding the terminating null byte ('\0'). 但在複製字串時也須複製空字元 ('\0') 因此多配置1 byte 儲存此字元, 此外, 須注意若此時字串配置記憶體失敗, 在回傳 `false` 前要記得釋放先前配置的 list element 記憶體, 以免記憶體漏失(memory leak)。 一開始在複製字串時使用 `strcpy()`, 但在執行 git commit 時Git pre-commit hook 提示 `strcpy()` 為不安全函式, 本來想改為使用較完善且安全的 `strlcpy`, 但發現此函式並沒有被 glibc 支援, 因此改用 `strncpy()` 複製字串: ```c strncpy(val, s, strlen(s) + 1); ``` 完整程式碼為: ```c /* * Attempt to insert element at head of queue. * Return true if successful. * Return false if q is NULL or could not allocate space. * Argument s points to the string to be stored. * The function must explicitly allocate space and copy the string into it. */ bool q_insert_head(queue_t *q, char *s) { /* Check if q is NULL */ if (!q) return false; list_ele_t *newh; newh = malloc(sizeof(list_ele_t)); /* Check if allocation successes */ if (!newh) return false; char *val = malloc(strlen(s) + 1); /* Check if allocation successes */ if (!val) { free(newh); return false; } /* Copy string */ strncpy(val, s, strlen(s) + 1); /* Update list element */ newh->value = val; newh->next = q->head; /* Update queue */ q->head = newh; // If this is first element in q, // then also update tail if (!q->tail) q->tail = newh; q->size++; return true; } ``` ### 實做 q_insert_tail 在佇列尾端 (tail) 加入 (insert) 給定的新元素。依據註解如果 queue 是`NULL`,或記憶體配置失敗則回傳 `false`。 為達成運算時間為 $O(1)$ 的要求, 增加新成員 `list_ele_t *tail` 至 queue 的結構以紀錄尾端位置: ```c /* Queue structure */ typedef struct { list_ele_t *head; /* Linked list of elements */ list_ele_t *tail; // Make q_insert_tail compute in O(1) int size; } queue_t; ``` 實做方式與 `q_insert_head` 大致相同, 只有在更新 queue 時須改為以下: ```c /* Update queue */ if (q->tail) q->tail->next = newt; q->tail = newt; // If this is the first element in q, // then also update head if (!q->head) q->head = newt; ``` 完整程式碼: ```c /* * Attempt to insert element at tail of queue. * Return true if successful. * Return false if q is NULL or could not allocate space. * Argument s points to the string to be stored. * The function must explicitly allocate space and copy the string into it. */ bool q_insert_tail(queue_t *q, char *s) { /* Check if q is NULL */ if (!q) return false; list_ele_t *newt; // newt means new tail newt = malloc(sizeof(list_ele_t)); /* Check if allocation successes */ if (!newt) return false; char *val = malloc(strlen(s) + 1); /* Check if allocation successes */ if (!val) { free(newt); return false; } /* Copy string */ strncpy(val, s, strlen(s) + 1); /* Update list element */ newt->value = val; newt->next = NULL; /* Update queue */ if (q->tail) q->tail->next = newt; q->tail = newt; // If this is the first element in q, // then also update head. if (!q->head) q->head = newt; q->size++; return true; } ``` ### 實做 q_remove_head 在佇列開頭 (head) 移去 (remove) 給定的元素, 並將被移除字串複製至 `*sp`。 * 在複製字串時, 須考慮被移除字串長度大於 `buffsize` 的情形: ```c if (strlen(val) > bufsize - 1) { // The removed string is too long, // then only copy bufsize-1 characters. strncpy(sp, val, bufsize - 1); sp[bufsize - 1] = '\0'; } else { strncpy(sp, val, strlen(val) + 1); } ``` 此時將只複製 `bufsize-1` characters * 在釋放前須檢查釋放元素非tail 所指向的元素, 否則將造成 `q_insert_tail` 時使用到已釋放的記憶體, i.e. ``` q = [world] cmd> rh Removed world from queue q = [] cmd> it t q = [t ... ] ERROR: Either list has cycle, or queue has more than 1 elements cmd> free ERROR: Attempted to free unallocated block. Address = 0x5555555555555555 匯流排錯誤 (核心已傾印) ``` 因此須添加判斷: ```c if(q->tail == ele) q->tail = NULL; // avoid use after free ``` 完整程式碼: ```c /* * Attempt to remove element from head of queue. * Return true if successful. * Return false if queue is NULL or empty. * If sp is non-NULL and an element is removed, copy the removed string to *sp * (up to a maximum of bufsize-1 characters, plus a null terminator.) * The space used by the list element and the string should be freed. */ bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { /* Check if queue is NULL or empty, return false */ if (!q || !q->head) return false; list_ele_t *ele = q->head; q->head = q->head->next; /* copy removed string to *sp */ char *val = ele->value; if (sp) { if (strlen(val) > bufsize - 1) { strncpy(sp, val, bufsize - 1); sp[bufsize - 1] = '\0'; } else { strncpy(sp, val, strlen(val) + 1); } } /* Free memory */ if(q->tail == ele) q->tail = NULL; // avoid use after free free(val); free(ele); q->size--; return true; } ``` ### 實做 q_size 計算佇列中的元素數量, 且運算時間須為 $O(1)$。 為達成 $O(1)$ 要求, 須增加新成員 `int size` 至 queue 的結構: ```c /* Queue structure */ typedef struct { list_ele_t *head; /* Linked list of elements */ list_ele_t *tail; int size; // Make q_size compute in O(1) } queue_t; ``` 建立新的佇列時(`q_new`) 將 `size` 初始至 $0$。在函式 `q_insert_head`, `q_insert_tail`, `q_remove_head`中須更新 `size`。 則在 `q_size` 只須回傳 `size` 即可: ```c /* * Return number of elements in queue. * Return 0 if q is NULL or empty */ int q_size(queue_t *q) { if (q) return q->size; return 0; } ``` ### 實做 q_reverse 函式以反向順序重新排列鏈結串列。 當更新 `next` 前, 須紀錄前一個元素(q->head)位置及後一個元素位置(n)。 ```c void q_reverse(queue_t *q) { if (q && q->head) { list_ele_t *curr = q->head->next; q->tail = q->head; q->head->next = NULL; while (curr) { list_ele_t *n = curr->next; // the original next curr->next = q->head; q->head = curr; curr = n; } } } ``` ### 實做 q_sort 以遞增順序來排序鏈結串列的元素。 參考[Big-O Cheat Sheet](https://www.bigocheatsheet.com/), 選擇 Average case 和 Worst case 表現都不差 ($O(nlog(n))$) 的 Merge sort 實做 q_sort。 ```c /* * Sort elements of queue in ascending order * No effect if q is NULL or empty. In addition, if q has only one * element, do nothing. */ void q_sort(queue_t *q) { /* q is NULL or empty*/ if (!q || !q->head || !q->head->next) return; q->head = merge_sort(q->head); list_ele_t *temp = q->head; /* update tail */ while (temp->next) temp = temp->next; q->tail = temp; } ``` #### Merge Sort 參考 [Linked List Sort](https://npes87184.github.io/2015-09-12-linkedListSort/) 實做 Merge Sort 在 Merge Sort 中須尋找 list 的中間點。 利用兩變數 `fast` 和 `slow`, `fast` 速度為每次兩步, `slow` 速度為每次一步, 因此當 `fast` 走至 list 尾端時($2X$), `slow` 便在 list 中間($X$): ```c list_ele_t *fast = head->next; list_ele_t *slow = head; // find the middle while (fast && fast->next) { slow = slow->next; fast = fast->next->next; } fast = slow->next; slow->next = NULL; // divide list_ele_t *l1 = merge_sort(head); list_ele_t *l2 = merge_sort(fast); ``` 而在 `merge` 函式中, 原先如[參考文件](https://npes87184.github.io/2015-09-12-linkedListSort/)一樣建立 pseudo node 來進行合併: ```c list_ele_t *merge(list_ele_t *l1, list_ele_t *l2) { // merge with pseudo node list_ele_t *result = malloc(sizeof(list_ele_t)); list_ele_t *result_h = result; if (!result) return l1; ... ... list_ele_t *head = result_h->next; free(result_h); return head; } ``` 但在使用 qtest 測試時得到報錯: ``` cmd> sort FATAL ERROR: Calls to malloc disallowed FATAL Error. Exiting ``` 因此得知不允許在此處要求配置記憶體,因此改為先比較兩個 list 的第一個元素: ```c list_ele_t *merge(list_ele_t *l1, list_ele_t *l2) { list_ele_t *result; list_ele_t *result_h; if (!l2 || (l1 && strcmp(l1->value, l2->value) < 0)) { result = l1; l1 = l1->next; } else { result = l2; l2 = l2->next; } result_h = result; ... ... return result_h; } ``` Merge Sort 完整程式碼: ```c= list_ele_t *merge_sort(list_ele_t *head) { // when no element or one element in list, stop if (!head || !head->next) return head; list_ele_t *fast = head->next; list_ele_t *slow = head; // find the middle while (fast && fast->next) { slow = slow->next; fast = fast->next->next; } fast = slow->next; slow->next = NULL; // divide list_ele_t *l1 = merge_sort(head); list_ele_t *l2 = merge_sort(fast); return merge(l1, l2); } list_ele_t *merge(list_ele_t *l1, list_ele_t *l2) { // merge with pseudo node list_ele_t *result; list_ele_t *result_h; if (!l2 || (l1 && strcmp(l1->value, l2->value) < 0)) { result = l1; l1 = l1->next; } else { result = l2; l2 = l2->next; } result_h = result; while (l1 && l2) { if (strcmp(l1->value, l2->value) < 0) { result->next = l1; result = result->next; l1 = l1->next; } else { result->next = l2; result = result->next; l2 = l2->next; } } if (l1) result->next = l1; if (l2) result->next = l2; return result_h; } ``` ### Notice ### 測試 ``` $ make $ make test ``` 測試結果: ``` --- TOTAL 100/100 ``` ## Address Sanitizer 開啟 Address Sanitizer,修正 qtest 執行過程中的錯誤。 開啟 Address Sanitizer, 如果先前已生成執行檔, 先清除: ``` $ make clean ``` ``` $ make SANITIZER=1 ``` 執行測試: ``` $ make test ``` 錯誤訊息節錄: ``` ==25269==ERROR: AddressSanitizer: global-buffer-overflow on address 0x56189862a5e0 at pc 0x561898612ae8 bp 0x7ffc3dec4a40 sp 0x7ffc3dec4a30 READ of size 4 at 0x56189862a5e0 thread T0 #0 0x561898612ae7 in do_option_cmd /home/jane/linux2021/lab0-c/console.c:369 #1 0x5618986118d0 in interpret_cmda /home/jane/linux2021/lab0-c/console.c:221 #2 0x5618986120b5 in interpret_cmd /home/jane/linux2021/lab0-c/console.c:244 #3 0x5618986132e1 in cmd_select /home/jane/linux2021/lab0-c/console.c:594 #4 0x56189861385b in run_console /home/jane/linux2021/lab0-c/console.c:667 #5 0x5618986103e5 in main /home/jane/linux2021/lab0-c/qtest.c:780 #6 0x7fac60ba20b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) #7 0x56189860db8d in _start (/home/jane/linux2021/lab0-c/qtest+0x8b8d) 0x56189862a5e1 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:21:6' (0x56189862a5e0) of size 1 'simulation' is ascii string '' SUMMARY: AddressSanitizer: global-buffer-overflow /home/jane/linux2021/lab0-c/console.c:369 in do_option_cmd ``` 得知錯誤發生在 console.c 的 369 行 又與變數 'simulation' 有關 ```c=369 int oldval = *plist->valp; ``` 發現在 104 行, simulation 從 bool 型態被強制轉形成 int 型態後被放入 param_ptr 結構的成員 valp 中 ```c=104 add_param("simulation", (int *) &simulation, "Start/Stop simulation mode ", ``` 因此在369行取值時發生錯誤 #### 修正 嘗試宣告 simulation 為 int ```c=21 int simulation ``` 報錯: ``` console.c:21:5: error: conflicting types for ‘simulation’ 21 | int simulation = 0; | ^~~~~~~~~~ In file included from console.c:3: console.h:11:13: note: previous declaration of ‘simulation’ was here 11 | extern bool simulation; | ^~~~~~~~~~ ``` 也更改 consle.h 中的宣告 ```c=11 extern int simulation ``` 另外發現執行 qtest 中的 help 指令也會有錯誤訊息 ``` $ ./qtest cmd> help ``` 錯誤訊息節錄: ``` ==26168==ERROR: AddressSanitizer: global-buffer-overflow on address 0x560412b823c0 at pc 0x560412b6b7bd bp 0x7ffeb60b7c50 sp 0x7ffeb60b7c40 READ of size 4 at 0x560412b823c0 thread T0 #0 0x560412b6b7bc in do_help_cmd /home/jane/linux2021/lab0-c/console.c:307 #1 0x560412b6b8d0 in interpret_cmda /home/jane/linux2021/lab0-c/console.c:221 #2 0x560412b6c0b5 in interpret_cmd /home/jane/linux2021/lab0-c/console.c:244 #3 0x560412b6d7f8 in run_console /home/jane/linux2021/lab0-c/console.c:660 #4 0x560412b6a3e5 in main /home/jane/linux2021/lab0-c/qtest.c:780 #5 0x7f83b44520b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) #6 0x560412b67b8d in _start (/home/jane/linux2021/lab0-c/qtest+0x8b8d) 0x560412b823c1 is located 0 bytes to the right of global variable 'echo' defined in 'console.c:59:13' (0x560412b823c0) of size 1 SUMMARY: AddressSanitizer: global-buffer-overflow /home/jane/linux2021/lab0-c/console.c:307 in do_help_cmd ``` 發現一樣在 108 行時被強制轉型 ```c=108 add_param("echo", (int *) &echo, "Do/don't echo commands", NULL); ``` #### 修正 同理將 echo 也宣告為 int 型態 ```c=59 static int echo ``` ## Valgrind 運用 Valgrind 排除 qtest 實作的記憶體錯誤,並透過 Massif 視覺化 “simulation” 過程中的記憶體使用量,需要設計對應的實驗 ``` $ make valgrind ``` 得到輸出: ``` ==6204== 5 bytes in 1 blocks are still reachable in loss record 1 of 3 ==6204== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreloa d_memcheck-amd64-linux.so) ==6204== by 0x4A5250E: strdup (strdup.c:42) ==6204== by 0x1100D8: linenoiseHistoryAdd (linenoise.c:1236) ==6204== by 0x110C6B: linenoiseHistoryLoad (linenoise.c:1325) ==6204== by 0x10C22C: main (qtest.c:769) ==6204== ==6204== 91 bytes in 19 blocks are still reachable in loss record 2 of 3 ==6204== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreloa d_memcheck-amd64-linux.so) ==6204== by 0x4A5250E: strdup (strdup.c:42) ==6204== by 0x11004C: linenoiseHistoryAdd (linenoise.c:1236) ==6204== by 0x110C6B: linenoiseHistoryLoad (linenoise.c:1325) ==6204== by 0x10C22C: main (qtest.c:769) ==6204== ==6204== 160 bytes in 1 blocks are still reachable in loss record 3 of 3 ==6204== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreloa d_memcheck-amd64-linux.so) ==6204== by 0x110098: linenoiseHistoryAdd (linenoise.c:1224) ==6204== by 0x110C6B: linenoiseHistoryLoad (linenoise.c:1325) ==6204== by 0x10C22C: main (qtest.c:769) ==6204== ``` 發現造成記憶體錯誤的為引入的套件 linenoise 中 #### Massif 視覺化 “simulation” 過程 ``` $ valgrind --tool=massif ./qtest cmd> option simulation 1 cmd> it Probably constant time cmd> size Probably constant time cmd> option simulation 0 cmd> quit Freeing queue ``` 使用 massif 視覺化 ``` $ massif-visualizer ``` 得到: ![](https://i.imgur.com/k5sIgAD.png)