# 2021q1 Homework1 (lab0)
contributed by < `chiehen` >
###### tags: `linux2021`
## 作業要求
- [x] 依據指示著手修改 queue.[ch] 和連帶的檔案,測試後用 Git 管理各項修改,記得也該實作 q_sort 函式。
- [x] 開啟 Address Sanitizer,修正 qtest 執行過程中的錯誤
- [ ] 運用 Valgrind 排除 qtest 實作的記憶體錯誤,並透過 Massif 視覺化 “simulation” 過程中的記憶體使用量,需要設計對應的實驗
- [ ] 在 qtest 中實作 coroutine,並提供新的命令 web,提供 web 伺服器功能
- [ ] 解釋 select 系統呼叫在本程式的使用方式,並分析 console.c 的實作,說明其中運用 CS:APP RIO 套件 的原理和考量點。
- [ ] 說明 antirez/linenoise 的運作原理,注意到 termios 的運用
- [ ] 研讀論文 Dude, is my code constant time?
- [ ] 指出現有程式的缺陷 (提示: 和 RIO 套件 有關),嘗試強化並提交 pull request
## 環境準備
1. 安裝必要的開發工具套件
```
$ sudo apt install build-essential git-core valgrind
$ sudo apt install cppcheck clang-format aspell colordiff
$ sudo apt install tig
```
2. 取得程式
1. 在 GitHub 上 fork 專案([lab-0](https://github.com/sysprog21/lab0-c))
2. clone 至本地端
```
$ git clone git@github.com:你的帳號名稱/lab0-c
```
## C Programming Lab
在程式中 [queue.h](https://github.com/sysprog21/lab0-c/blob/master/queue.h) 定義了鏈結串列(linked-list) 的結構:
```c=16
/* Linked list element (You shouldn't need to change this) */
typedef struct ELE {
/* Pointer to array holding string.
* This array needs to be explicitly allocated and freed
*/
char *value;
struct ELE *next;
} list_ele_t;
```
為單向的 linked-list,
接著用鏈結串列來實作佇列 (queue):
```c=25
/* Queue structure */
typedef struct {
list_ele_t *head; /* Linked list of elements */
/* TODO: You will need to add more fields to this structure
* to efficiently implement q_size and q_insert_tail.
*/
/* TODO: Remove the above comment when you are about to implement. */
} queue_t;
```
可發現 [queue.c](https://github.com/sysprog21/lab0-c/blob/master/queue.c) 僅提供介面 (interface) 但尚未有完整程式實作,包含以下:
* `q_new`: 建立新的「空」佇列;
* `q_free`: 釋放佇列所佔用的記憶體;
* `q_insert_head`: 在佇列開頭 (head) 加入 (insert) 給定的新元素 (以 LIFO 準則);
* `q_insert_tail`: 在佇列尾端 (tail) 加入 (insert) 給定的新元素 (以 FIFO 準則);
* `q_remove_head`: 在佇列開頭 (head) 移去 (remove) 給定的元素。
* `q_size`: 計算佇列中的元素數量。
* `q_reverse`: 以反向順序重新排列鏈結串列,該函式不該配置或釋放任何鏈結串列元素,換言之,它只能重排已經存在的元素;
* `q_sort`: 以遞增順序來排序鏈結串列的元素。
因此第一個作業要求即是完成此部份實做。
### 實做 q_new
根據註解創建一個空的 queue, 將 `head` 及`tail` 指向 `NULL`, `size` 設為 `0` 。如果無法成功獲得動態配置的記憶體空間則回傳 `NULL`。
以下為程式碼:
```c
/*
* Create empty queue.
* Return NULL if could not allocate space.
*/
queue_t *q_new()
{
queue_t *q = malloc(sizeof(queue_t));
if (!q)
return NULL;
q->head = NULL;
q->tail = NULL;
q->size = 0;
return q;
}
```
### 實做 q_free
當呼叫此函式時將釋放 queue 中所有配置的記憶體空間,依照 linked-list 單向的結構,從 `head` 開始逐一釋放記憶體。 另外,因 list element 中成員 `value` 有額外配置記憶體,因此在釋放記憶體時,須先釋放 `value` 的記憶體。在釋放完所有 linked-list 的記憶體後,最後釋放 queue 的記憶體。
以下為程式碼:
```c
/* Free all storage used by queue */
void q_free(queue_t *q)
{
/* Free queue structure */
if (q) {
while (q->head) {
list_ele_t *p = q->head;
q->head = p->next;
free(p->value);
free(p);
}
}
free(q);
}
```
### 實做 q_insert_head
此段函式在佇列開頭 (head) 加入 (insert) 給定的新元素。依據註解如果 queue 是`NULL`,或記憶體配置失敗則回傳 `false`。
配置出字串的記憶體時:
```
char *val = malloc(strlen(s) + 1);
```
可注意到配置的空間是 `strlen(s) +1` bytes, 是因為根據 Linux Programmer's Manual:
> The strlen() function calculates the length of the string pointed to by s, excluding the terminating null byte ('\0').
但在複製字串時也須複製空字元 ('\0') 因此多配置1 byte 儲存此字元,
此外, 須注意若此時字串配置記憶體失敗, 在回傳 `false` 前要記得釋放先前配置的 list element 記憶體, 以免記憶體漏失(memory leak)。
一開始在複製字串時使用 `strcpy()`, 但在執行 git commit 時Git pre-commit hook 提示 `strcpy()` 為不安全函式,
本來想改為使用較完善且安全的 `strlcpy`, 但發現此函式並沒有被 glibc 支援, 因此改用 `strncpy()` 複製字串:
```c
strncpy(val, s, strlen(s) + 1);
```
完整程式碼為:
```c
/*
* Attempt to insert element at head of queue.
* Return true if successful.
* Return false if q is NULL or could not allocate space.
* Argument s points to the string to be stored.
* The function must explicitly allocate space and copy the string into it.
*/
bool q_insert_head(queue_t *q, char *s)
{
/* Check if q is NULL */
if (!q)
return false;
list_ele_t *newh;
newh = malloc(sizeof(list_ele_t));
/* Check if allocation successes */
if (!newh)
return false;
char *val = malloc(strlen(s) + 1);
/* Check if allocation successes */
if (!val) {
free(newh);
return false;
}
/* Copy string */
strncpy(val, s, strlen(s) + 1);
/* Update list element */
newh->value = val;
newh->next = q->head;
/* Update queue */
q->head = newh;
// If this is first element in q,
// then also update tail
if (!q->tail)
q->tail = newh;
q->size++;
return true;
}
```
### 實做 q_insert_tail
在佇列尾端 (tail) 加入 (insert) 給定的新元素。依據註解如果 queue 是`NULL`,或記憶體配置失敗則回傳 `false`。
為達成運算時間為 $O(1)$ 的要求, 增加新成員 `list_ele_t *tail` 至 queue 的結構以紀錄尾端位置:
```c
/* Queue structure */
typedef struct {
list_ele_t *head; /* Linked list of elements */
list_ele_t *tail; // Make q_insert_tail compute in O(1)
int size;
} queue_t;
```
實做方式與 `q_insert_head` 大致相同, 只有在更新 queue 時須改為以下:
```c
/* Update queue */
if (q->tail)
q->tail->next = newt;
q->tail = newt;
// If this is the first element in q,
// then also update head
if (!q->head)
q->head = newt;
```
完整程式碼:
```c
/*
* Attempt to insert element at tail of queue.
* Return true if successful.
* Return false if q is NULL or could not allocate space.
* Argument s points to the string to be stored.
* The function must explicitly allocate space and copy the string into it.
*/
bool q_insert_tail(queue_t *q, char *s)
{
/* Check if q is NULL */
if (!q)
return false;
list_ele_t *newt; // newt means new tail
newt = malloc(sizeof(list_ele_t));
/* Check if allocation successes */
if (!newt)
return false;
char *val = malloc(strlen(s) + 1);
/* Check if allocation successes */
if (!val) {
free(newt);
return false;
}
/* Copy string */
strncpy(val, s, strlen(s) + 1);
/* Update list element */
newt->value = val;
newt->next = NULL;
/* Update queue */
if (q->tail)
q->tail->next = newt;
q->tail = newt;
// If this is the first element in q,
// then also update head.
if (!q->head)
q->head = newt;
q->size++;
return true;
}
```
### 實做 q_remove_head
在佇列開頭 (head) 移去 (remove) 給定的元素, 並將被移除字串複製至 `*sp`。
* 在複製字串時, 須考慮被移除字串長度大於 `buffsize` 的情形:
```c
if (strlen(val) > bufsize - 1) {
// The removed string is too long,
// then only copy bufsize-1 characters.
strncpy(sp, val, bufsize - 1);
sp[bufsize - 1] = '\0';
} else {
strncpy(sp, val, strlen(val) + 1);
}
```
此時將只複製 `bufsize-1` characters
* 在釋放前須檢查釋放元素非tail 所指向的元素, 否則將造成 `q_insert_tail` 時使用到已釋放的記憶體, i.e.
```
q = [world]
cmd> rh
Removed world from queue
q = []
cmd> it t
q = [t ... ]
ERROR: Either list has cycle, or queue has more than 1 elements
cmd> free
ERROR: Attempted to free unallocated block. Address = 0x5555555555555555
匯流排錯誤 (核心已傾印)
```
因此須添加判斷:
```c
if(q->tail == ele)
q->tail = NULL; // avoid use after free
```
完整程式碼:
```c
/*
* Attempt to remove element from head of queue.
* Return true if successful.
* Return false if queue is NULL or empty.
* If sp is non-NULL and an element is removed, copy the removed string to *sp
* (up to a maximum of bufsize-1 characters, plus a null terminator.)
* The space used by the list element and the string should be freed.
*/
bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
/* Check if queue is NULL or empty, return false */
if (!q || !q->head)
return false;
list_ele_t *ele = q->head;
q->head = q->head->next;
/* copy removed string to *sp */
char *val = ele->value;
if (sp) {
if (strlen(val) > bufsize - 1) {
strncpy(sp, val, bufsize - 1);
sp[bufsize - 1] = '\0';
} else {
strncpy(sp, val, strlen(val) + 1);
}
}
/* Free memory */
if(q->tail == ele)
q->tail = NULL; // avoid use after free
free(val);
free(ele);
q->size--;
return true;
}
```
### 實做 q_size
計算佇列中的元素數量, 且運算時間須為 $O(1)$。
為達成 $O(1)$ 要求, 須增加新成員 `int size` 至 queue 的結構:
```c
/* Queue structure */
typedef struct {
list_ele_t *head; /* Linked list of elements */
list_ele_t *tail;
int size; // Make q_size compute in O(1)
} queue_t;
```
建立新的佇列時(`q_new`) 將 `size` 初始至 $0$。在函式 `q_insert_head`, `q_insert_tail`, `q_remove_head`中須更新 `size`。
則在 `q_size` 只須回傳 `size` 即可:
```c
/*
* Return number of elements in queue.
* Return 0 if q is NULL or empty
*/
int q_size(queue_t *q)
{
if (q)
return q->size;
return 0;
}
```
### 實做 q_reverse
函式以反向順序重新排列鏈結串列。
當更新 `next` 前, 須紀錄前一個元素(q->head)位置及後一個元素位置(n)。
```c
void q_reverse(queue_t *q)
{
if (q && q->head) {
list_ele_t *curr = q->head->next;
q->tail = q->head;
q->head->next = NULL;
while (curr) {
list_ele_t *n = curr->next; // the original next
curr->next = q->head;
q->head = curr;
curr = n;
}
}
}
```
### 實做 q_sort
以遞增順序來排序鏈結串列的元素。
參考[Big-O Cheat Sheet](https://www.bigocheatsheet.com/), 選擇 Average case 和 Worst case 表現都不差 ($O(nlog(n))$) 的 Merge sort 實做 q_sort。
```c
/*
* Sort elements of queue in ascending order
* No effect if q is NULL or empty. In addition, if q has only one
* element, do nothing.
*/
void q_sort(queue_t *q)
{
/* q is NULL or empty*/
if (!q || !q->head || !q->head->next)
return;
q->head = merge_sort(q->head);
list_ele_t *temp = q->head;
/* update tail */
while (temp->next)
temp = temp->next;
q->tail = temp;
}
```
#### Merge Sort
參考 [Linked List Sort](https://npes87184.github.io/2015-09-12-linkedListSort/) 實做 Merge Sort
在 Merge Sort 中須尋找 list 的中間點。 利用兩變數 `fast` 和 `slow`, `fast` 速度為每次兩步, `slow` 速度為每次一步, 因此當 `fast` 走至 list 尾端時($2X$), `slow` 便在 list 中間($X$):
```c
list_ele_t *fast = head->next;
list_ele_t *slow = head;
// find the middle
while (fast && fast->next) {
slow = slow->next;
fast = fast->next->next;
}
fast = slow->next;
slow->next = NULL;
// divide
list_ele_t *l1 = merge_sort(head);
list_ele_t *l2 = merge_sort(fast);
```
而在 `merge` 函式中, 原先如[參考文件](https://npes87184.github.io/2015-09-12-linkedListSort/)一樣建立 pseudo node 來進行合併:
```c
list_ele_t *merge(list_ele_t *l1, list_ele_t *l2)
{
// merge with pseudo node
list_ele_t *result = malloc(sizeof(list_ele_t));
list_ele_t *result_h = result;
if (!result)
return l1;
...
...
list_ele_t *head = result_h->next;
free(result_h);
return head;
}
```
但在使用 qtest 測試時得到報錯:
```
cmd> sort
FATAL ERROR: Calls to malloc disallowed
FATAL Error. Exiting
```
因此得知不允許在此處要求配置記憶體,因此改為先比較兩個 list 的第一個元素:
```c
list_ele_t *merge(list_ele_t *l1, list_ele_t *l2)
{
list_ele_t *result;
list_ele_t *result_h;
if (!l2 || (l1 && strcmp(l1->value, l2->value) < 0)) {
result = l1;
l1 = l1->next;
} else {
result = l2;
l2 = l2->next;
}
result_h = result;
...
...
return result_h;
}
```
Merge Sort 完整程式碼:
```c=
list_ele_t *merge_sort(list_ele_t *head)
{
// when no element or one element in list, stop
if (!head || !head->next)
return head;
list_ele_t *fast = head->next;
list_ele_t *slow = head;
// find the middle
while (fast && fast->next) {
slow = slow->next;
fast = fast->next->next;
}
fast = slow->next;
slow->next = NULL;
// divide
list_ele_t *l1 = merge_sort(head);
list_ele_t *l2 = merge_sort(fast);
return merge(l1, l2);
}
list_ele_t *merge(list_ele_t *l1, list_ele_t *l2)
{
// merge with pseudo node
list_ele_t *result;
list_ele_t *result_h;
if (!l2 || (l1 && strcmp(l1->value, l2->value) < 0)) {
result = l1;
l1 = l1->next;
} else {
result = l2;
l2 = l2->next;
}
result_h = result;
while (l1 && l2) {
if (strcmp(l1->value, l2->value) < 0) {
result->next = l1;
result = result->next;
l1 = l1->next;
} else {
result->next = l2;
result = result->next;
l2 = l2->next;
}
}
if (l1)
result->next = l1;
if (l2)
result->next = l2;
return result_h;
}
```
### Notice
### 測試
```
$ make
$ make test
```
測試結果:
```
--- TOTAL 100/100
```
## Address Sanitizer
開啟 Address Sanitizer,修正 qtest 執行過程中的錯誤。
開啟 Address Sanitizer, 如果先前已生成執行檔, 先清除:
```
$ make clean
```
```
$ make SANITIZER=1
```
執行測試:
```
$ make test
```
錯誤訊息節錄:
```
==25269==ERROR: AddressSanitizer: global-buffer-overflow on address 0x56189862a5e0 at pc 0x561898612ae8 bp 0x7ffc3dec4a40 sp 0x7ffc3dec4a30
READ of size 4 at 0x56189862a5e0 thread T0
#0 0x561898612ae7 in do_option_cmd /home/jane/linux2021/lab0-c/console.c:369
#1 0x5618986118d0 in interpret_cmda /home/jane/linux2021/lab0-c/console.c:221
#2 0x5618986120b5 in interpret_cmd /home/jane/linux2021/lab0-c/console.c:244
#3 0x5618986132e1 in cmd_select /home/jane/linux2021/lab0-c/console.c:594
#4 0x56189861385b in run_console /home/jane/linux2021/lab0-c/console.c:667
#5 0x5618986103e5 in main /home/jane/linux2021/lab0-c/qtest.c:780
#6 0x7fac60ba20b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
#7 0x56189860db8d in _start (/home/jane/linux2021/lab0-c/qtest+0x8b8d)
0x56189862a5e1 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:21:6' (0x56189862a5e0) of size 1
'simulation' is ascii string ''
SUMMARY: AddressSanitizer: global-buffer-overflow /home/jane/linux2021/lab0-c/console.c:369 in do_option_cmd
```
得知錯誤發生在 console.c 的 369 行
又與變數 'simulation' 有關
```c=369
int oldval = *plist->valp;
```
發現在 104 行, simulation 從 bool 型態被強制轉形成 int 型態後被放入 param_ptr 結構的成員 valp 中
```c=104
add_param("simulation", (int *) &simulation, "Start/Stop simulation mode ",
```
因此在369行取值時發生錯誤
#### 修正
嘗試宣告 simulation 為 int
```c=21
int simulation
```
報錯:
```
console.c:21:5: error: conflicting types for ‘simulation’
21 | int simulation = 0;
| ^~~~~~~~~~
In file included from console.c:3:
console.h:11:13: note: previous declaration of ‘simulation’ was here
11 | extern bool simulation;
| ^~~~~~~~~~
```
也更改 consle.h 中的宣告
```c=11
extern int simulation
```
另外發現執行 qtest 中的 help 指令也會有錯誤訊息
```
$ ./qtest
cmd> help
```
錯誤訊息節錄:
```
==26168==ERROR: AddressSanitizer: global-buffer-overflow on address 0x560412b823c0 at pc 0x560412b6b7bd bp 0x7ffeb60b7c50 sp 0x7ffeb60b7c40
READ of size 4 at 0x560412b823c0 thread T0
#0 0x560412b6b7bc in do_help_cmd /home/jane/linux2021/lab0-c/console.c:307
#1 0x560412b6b8d0 in interpret_cmda /home/jane/linux2021/lab0-c/console.c:221
#2 0x560412b6c0b5 in interpret_cmd /home/jane/linux2021/lab0-c/console.c:244
#3 0x560412b6d7f8 in run_console /home/jane/linux2021/lab0-c/console.c:660
#4 0x560412b6a3e5 in main /home/jane/linux2021/lab0-c/qtest.c:780
#5 0x7f83b44520b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
#6 0x560412b67b8d in _start (/home/jane/linux2021/lab0-c/qtest+0x8b8d)
0x560412b823c1 is located 0 bytes to the right of global variable 'echo' defined in 'console.c:59:13' (0x560412b823c0) of size 1
SUMMARY: AddressSanitizer: global-buffer-overflow /home/jane/linux2021/lab0-c/console.c:307 in do_help_cmd
```
發現一樣在 108 行時被強制轉型
```c=108
add_param("echo", (int *) &echo, "Do/don't echo commands", NULL);
```
#### 修正
同理將 echo 也宣告為 int 型態
```c=59
static int echo
```
## Valgrind
運用 Valgrind 排除 qtest 實作的記憶體錯誤,並透過 Massif 視覺化 “simulation” 過程中的記憶體使用量,需要設計對應的實驗
```
$ make valgrind
```
得到輸出:
```
==6204== 5 bytes in 1 blocks are still reachable in loss record 1 of 3
==6204== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreloa
d_memcheck-amd64-linux.so)
==6204== by 0x4A5250E: strdup (strdup.c:42)
==6204== by 0x1100D8: linenoiseHistoryAdd (linenoise.c:1236)
==6204== by 0x110C6B: linenoiseHistoryLoad (linenoise.c:1325)
==6204== by 0x10C22C: main (qtest.c:769)
==6204==
==6204== 91 bytes in 19 blocks are still reachable in loss record 2 of 3
==6204== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreloa
d_memcheck-amd64-linux.so)
==6204== by 0x4A5250E: strdup (strdup.c:42)
==6204== by 0x11004C: linenoiseHistoryAdd (linenoise.c:1236)
==6204== by 0x110C6B: linenoiseHistoryLoad (linenoise.c:1325)
==6204== by 0x10C22C: main (qtest.c:769)
==6204==
==6204== 160 bytes in 1 blocks are still reachable in loss record 3 of 3
==6204== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreloa
d_memcheck-amd64-linux.so)
==6204== by 0x110098: linenoiseHistoryAdd (linenoise.c:1224)
==6204== by 0x110C6B: linenoiseHistoryLoad (linenoise.c:1325)
==6204== by 0x10C22C: main (qtest.c:769)
==6204==
```
發現造成記憶體錯誤的為引入的套件 linenoise 中
#### Massif 視覺化 “simulation” 過程
```
$ valgrind --tool=massif ./qtest
cmd> option simulation 1
cmd> it
Probably constant time
cmd> size
Probably constant time
cmd> option simulation 0
cmd> quit
Freeing queue
```
使用 massif 視覺化
```
$ massif-visualizer
```
得到:
