# 2021q1 Homework1 (lab0) contributed by < `robinsonweng` > ###### tags: `linux2021` > [GitHub](https://github.com/robinsonweng/lab0-c) ### Reviewed by `idoleat` 1. Git commit 內容過度瑣碎,很多 commit 就只是單純一行更動,附帶造成很多 commit 只有 title 沒有 body,如此在閱讀 commit 紀錄時比較難掌握整體上你對程式碼做了哪些改動。另外 [commit 38348ab](https://github.com/robinsonweng/lab0-c/commit/38348ab361ece193b74a1515c2aad6a53548659f) 需要更多描述為何該段程式碼是 junk,以及何謂 add comment? 3. 建議可以把會造成 stack overflow 的 merge sort 的實做 push 到另一個 branch 上,讓別人看是不是有可能是其他地方出了問題。因為其他同學大多都可以用 recursive 的方法實做,把你覺得有問題的 code 放上來也許可以讓別人找到改進空間。 4. lab0 還有後續項目值得繼續完成,例如實做 coroutine,及討論 lineoise 原理 ## :penguin: 作業要求 * [x] 開啟 [Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer),修正 `qtest` 執行過程中的錯誤 * 先執行 `qtest` 再於命令提示列輸入 `help` 命令,會使 開啟 [Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer) 觸發錯誤,應予以排除 ## C programming lab - [x] q_new: 新增一個空佇列. - [x] q_insert head: 對當下佇列的尾插入新節點 - [x] q_insert_tail: 對當下佇列的頭插入新節點 - [x] q_free: 釋放當下佇列所分配的所有記憶體空間 - [x] q_remove head: 移除當下佇列的頭節點. - [x] q_size: 計算當下佇列的大小,必須是常數執行時間 - [x] q_reverse: 翻轉當下佇列,並且不得呼叫free或是malloc等記憶體分配操作(inplace reverse) - [x] q_sort: 對當下的佇列進行排序 ### queue struct 題目要求 q_size 以及 q_insert_tail 必須是 O(1),意指不論佇列長度如何,這兩個函數的執行時間必須是固定的。最簡便的方式就是在佇列內新增兩個成員: size 以及 tail ```c=0 typedef struct { list_ele_t *head; /* Linked list of elements */ list_ele_t *tail; int size; } queue_t; ``` ### q_new ```c=0 queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (q == NULL) abort(); q->head = NULL; return q; } ``` 一開始過於急躁,沒有仔細讀 `qtest.[ch]` 就貿然進行修改,一頭霧水的我直接加入 `abort`,經由老師提醒`abort` 會將整個`qtest`中斷。重新閱讀相關程式碼後發現只需要回傳 NULL pointer 就好,果然不做任何閱讀就貿然修改程式碼是大忌阿。 ```c=0 queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (!q) return NULL; q->head = NULL; q->tail = NULL; q->size = 0; return q; } ``` ### q_insert_head `q_insert_head` 除了需要實做在佇列頭的位置插入新節點以外,函式內的註解還有幾個要求: 1. 必須分配記憶體給節點的值(字串) 透過 `(char *) malloc(sizeof(char) * slen + 1)` ,其中 (char *) 是為了轉換`malloc`回傳的`void *` 指標類型,slen 是輸入字串 s 的長度。最後不要忘記為了中止字元我們還要再+1 2. 要怎麼處理newh, newh->value 記憶體分配OOM的情境 newh 就跟前面的 q_new 一樣,當`malloc`回傳 NULL,回傳 false,但 `newh->value` 就不一樣了,由於前面的 newh 先分配成功了,所以在回傳 false 以前必須先釋放 newh ```cpp bool q_insert_head(queue_t *q, char *s) { if(!q) return false; list_ele_t *newh; /* TODO: What should you do if the q is NULL? */ newh = malloc(sizeof(list_ele_t)); if (!newh) return false; /* Don't forget to allocate space for the string and copy it */ /* What if either call to malloc returns NULL? */ int slen = strlen(s); newh->value = (char *) malloc(sizeof(char) * slen + 1); if (!newh->value) { free(newh); return false; } strncpy(newh->value, s, sizeof(char) * slen + 1); if (!q->tail) q->tail = newh; newh->next = q->head; q->head = newh; q->size++; ``` 由於會對佇列的長度進行更改,必須即時更新 q->size :::warning 使用 strncpy, strcpy 複製字串時,需要考慮到中止字元 `'\0'`, 所以在分配記憶體以及複製的長度都必須 + 1 ::: :::warning 假設頭節點的記憶體分配成功,但頭節點的字串沒有成功分配到記憶體,應該要先釋放頭節點的記憶體後才回傳false ::: ::: danger 使用`strcpy` 跳出此[函數不安全](https://security.web.cern.ch/recommendations/en/codetools/c.shtml)的警告,不得不讚嘆檢查代碼工具的強大 > The strcpy built-in function does not check buffer lengths and may very well overwrite memory zone contiguous to the intended destination. In fact, the whole family of functions is similarly vulnerable: strcpy, strcat and strcmp. ::: 最後要注意假設 q->tail 為空代表這個節點是佇列的第一個元素,要順便將新節點指向 q->tail ### q_insert_tail ```c=0 bool q_insert_tail(queue_t *q, char *s) { if (!q) return false; list_ele_t *newt = malloc(sizeof(list_ele_t)); if (!newt) return false; int slen = strlen(s); newt->value = (char *) malloc(sizeof(char) * slen + 1); if (!newt->value) { free(newt); return false; } strncpy(newt->value, s, sizeof(char) * slen + 1); if (!q->tail) { q->head = newt; } else { q->tail->next = newt; } q->tail = newt; q->tail->next = NULL; q->size++; return true; } ``` insert_tail 大部分都和 insert_head 一樣,差別只在最後指標要怎麼指 #### if !q->tail 假設 it 是第一個被執行的指令,那麼首要就是要將 head 跟 tail 指向新增的節點,最後要記得將next指向NULL ```graphviz digraph { node [shape=box] head -> newt tail -> newt newt -> NULL {rank=same; newt, NULL} } ``` #### if q->tail 假設在執行 it 以前,佇列已經有了其他元素在 ```graphviz digraph { node [shape=box] head -> node1 tail -> node2 node1-> node2 node2 -> NULL {rank=same; node1, node2, NULL, newt} } ``` 第一步: 將 q->tail->next 指向 newt ```graphviz digraph { node [shape=box] head -> node1 tail -> node2 node1 -> node2 node2 -> newt {rank=same; node1, node2, NULL, newt} } ``` 第二步: 將 q->tail 指向 newt,並將 q->tail->next 重新指向 NULL ```graphviz digraph { node [shape=box] head -> node1 tail -> newt node1 -> node2 node2 -> newt newt -> NULL {rank=same; node1, node2, NULL, newt} } ``` 一個我不能理解的地方是, 以上兩種情況都必須在最後將 q->tail->next 指向 NULL ,否則會造成循環, 即便 next 一開始就沒有指向任何地方。 ### q_free ```c=0 void q_free(queue_t *q) { if (!q) return; /* Walk through and free linked list & char */ /* Free should use on things that allocate by malloc */ list_ele_t *temp_list = q->head; while (q->head) { temp_list = q->head; q->head = q->head->next; free(temp_list->value); free(temp_list); } /* Free queue structure */ free(q); } ``` 我這時候並不是很清楚誰該 free 誰不該 free ,但我只記住一個原則: 那些被`malloc`分配的記憶體都該被`free`,從這點推論下來,我至少有三個東西要釋放 1. 佇列(q)本身 2. linked list 節點 3. linked list 指向的字串 1在範例當中已經幫我做好了,我只需要處理2,3就好,但在釋放過程中,順序是很重要的。我自己是照著「誰最後被分配,誰就先釋放」的順序進行。 首先,將 q->head 複製一份放在 temp_list 當中,並讓 q->head 移動至下一個節點 ```graphviz digraph { node [shape=box] head -> node2 tail -> node3 node1 -> node2 node2 -> node3 node3 -> NULL node1 -> a node2 -> c node3 -> b temp_list -> node1 {rank=same; node1, node2, node3, NULL} } ``` 然後照著字串, 節點依序將其釋放 ```graphviz digraph { node [shape=box] head -> node2 tail -> node3 node2 -> node3 node3 -> NULL node2 -> c node3 -> b temp_list -> Unreachable {rank=same; node2, node3, NULL} } ``` 原本這樣就可以灑花下一題了,但 commit 時跳出了關於 scope 的錯誤,大意是希望我們將宣告temp_list 放進 while 裡面, 不知道是基於什麼原因? :::danger style: The scope of the variable 'temp_list' can be reduced. [variableScope] ::: ### q_remove_head ```c=0 bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (!q || !q->head) return false; if ((*sp != '\0') && (strncmp(sp, q->head->value, bufsize) > 0)) return false; strncpy(sp, q->head->value, bufsize - 1); *(sp += bufsize - 1) = '\0'; free(q->head->value); list_ele_t *temp = q->head; q->head = q->head->next; free(temp); q->size--; return true; } ``` 這題有幾個要求: `rh [str]` 1. 假設 str 為空 那麼直接移除頭節點 2. 假設 str 與佇列頭節點的字串不同,回傳false 3. 在 option 中設定 length = n 的前提之下,str 只需要和頭節點的字串比對至 n 個字元即可 要求本身不難, 在`strncmp`的手冊當中有提到: > It returns an integer less than, equal to, or greater than zero if s1 is found, respectively, to be less than to match, or be greater than s2. 所以在 q_remove_head 只需要知道`strncmp`比較的結果有沒有小於零,而且輸入的字串也不為空就可以了。 這裡有個問題,就是 q_remove_head 的 sp 不論參數怎麼下,都是空的字元,所以我重新閱讀`q_test.c` 的程式碼,並修改 [do_remove_head](https://github.com/robinsonweng/lab0-c/blob/master/qtest.c#L348) ,將參數透過 sp 傳進 q_remove_head 內,這樣就可以比較輸入的字串是否正確了。 ```c=347 // in do_remove_head removes[0] = '\0'; memset(removes + 1, 'X', string_length + STRINGPAD - 1); removes[string_length + STRINGPAD] = '\0'; bool check = argc > 1; bool ok = true; if (check) { strncpy(checks, argv[1], string_length + 1); checks[string_length] = '\0'; int alen = strlen(argv[1]); strncpy(removes, argv[1], alen); removes[alen] = '\0'; } ``` 但就像之前在 [q_insert_head](https://hackmd.io/VZqz2DmMQL6Gpu1E4YUx7Q?both#q_insert_head) 提到的,我們使用`strncpy, strcpy`必須考慮到`\0`,於是在複製字串的同時,必須在字串尾端加上中止符號。最後也不要忘了即時更新 q->size ### q_reverse (inplace) 參考文章: [Reverse a linked list](https://www.geeksforgeeks.org/reverse-a-linked-list/) 反轉後需要重新將 head 和 tail 指向對應的位置: ```graphviz digraph { node [shape=box] head -> node1 tail -> node3 node1 -> NULL node3 -> node2 node2 -> node1 comming -> NULL current -> NULL previous -> node1 {rank=same; node1, node2, node3, NULL} } ``` ```c=0 q->tail = q->head; q->head = previous; ``` ```graphviz digraph { node [shape=box] head -> node3 tail -> node1 node1 -> NULL node3 -> node2 node2 -> node1 comming -> NULL current -> NULL previous -> node1 {rank=same; node1, node2, node3, NULL} } ``` ### q_sort 參考文章: [iterative Merge Sort for linked list](https://www.geeksforgeeks.org/iterative-merge-sort-for-linked-list/) 由於測資極為龐大(1000000),merge sort 的 divide and conquer 策略不能用遞迴去實現,會把stack塞滿,於是將 merge sort 改為遞迴版的 ``` ==25711== Stack overflow in thread #1: can't grow stack to 0x1ffe801000 ``` ## 開啟 Address Sanitizer,修正 qtest 執行過程中的錯誤 ### 先執行 qtest 再於命令提示列輸入 help 命令,會使 Address Sanitizer 觸發錯誤,應予以排除 閱讀了[ASAN wiki](https://github.com/google/sanitizers/wiki/AddressSanitizer) 以及老師提供的[投影片1](https://www.slideshare.net/sermp/sanitizer-cppcon-russia), [投影片2](https://www.slideshare.net/CysinfoCommunity/a-look-into-the-sanitizer-family-asan-ubsan-by-akul-pillai)後可以對下圖的 Shadow bytes 和 Shasow byte legend 有點了解。簡單來講就是在原應用程式周圍分配一堆沒有被使用過的記憶體,也就是 shadow byte。假設我們操作記憶體妥當,那麼這些 shadow byte 理當都不會有任何更動,一旦發生溢位,周圍配置 shadow byte 就會被更改,我們就可以偵測到使用者有記憶體操作上的失誤 ``` ==4282==ERROR: AddressSanitizer: global-buffer-overflow on address 0x5596b643b3c0 at pc 0x5596b64247bd bp 0x7fff50fb2c20 sp 0x7fff50fb2c10 READ of size 4 at 0x5596b643b3c0 thread T0 #0 0x5596b64247bc in do_help_cmd /home/robinson/ubuntu/linux2021/lab0-c/console.c:307 #1 0x5596b64248d0 in interpret_cmda /home/robinson/ubuntu/linux2021/lab0-c/console.c:221 #2 0x5596b64250b5 in interpret_cmd /home/robinson/ubuntu/linux2021/lab0-c/console.c:244 #3 0x5596b64267f8 in run_console /home/robinson/ubuntu/linux2021/lab0-c/console.c:660 #4 0x5596b64233e5 in main /home/robinson/ubuntu/linux2021/lab0-c/qtest.c:780 #5 0x7ff992f9b0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) #6 0x5596b6420b8d in _start (/home/robinson/ubuntu/linux2021/lab0-c/qtest+0x8b8d) 0x5596b643b3c1 is located 0 bytes to the right of global variable 'echo' defined in 'console.c:59:13' (0x5596b643b3c0) of size 1 SUMMARY: AddressSanitizer: global-buffer-overflow /home/robinson/ubuntu/linux2021/lab0-c/console.c:307 in do_help_cmd Shadow bytes around the buggy address: 0x0ab356c7f620: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 0x0ab356c7f630: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 0x0ab356c7f640: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 0x0ab356c7f650: 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00 0x0ab356c7f660: 00 00 f9 f9 f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9 =>0x0ab356c7f670: 01 f9 f9 f9 f9 f9 f9 f9[01]f9 f9 f9 f9 f9 f9 f9 0x0ab356c7f680: 04 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 0x0ab356c7f690: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ab356c7f6a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ab356c7f6b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ab356c7f6c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==4282==ABORTING ``` 可以看到錯誤訊息指出在`console.c`內的全域變數`echo`發生溢位,閱讀`console.c`內的程式碼發現`echo`變數宣告類型為`bool`後,為了送進函數 add_param,取址並轉換成`(int *)` 型態 ```c=104 add_param("simulation", (int *) &simulation, "Start/Stop simulation mode", NULL); add_param("verbose", &verblevel, "Verbosity level", NULL); add_param("error", &err_limit, "Number of errors until exit", NULL); add_param("echo", (int *) &echo, "Do/don't echo commands", NULL); ``` 由於`simulation`做了跟`echo`類似的事情,推測它也會發生類似的溢位問題。 ``` ==4363==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55d723b4d5e0 at pc 0x55d723b35ae8 bp 0x7ffc347e7200 sp 0x7ffc347e71f0 READ of size 4 at 0x55d723b4d5e0 thread T0 #0 0x55d723b35ae7 in do_option_cmd /home/robinson/ubuntu/linux2021/lab0-c/console.c:369 #1 0x55d723b348d0 in interpret_cmda /home/robinson/ubuntu/linux2021/lab0-c/console.c:221 #2 0x55d723b350b5 in interpret_cmd /home/robinson/ubuntu/linux2021/lab0-c/console.c:244 #3 0x55d723b367f8 in run_console /home/robinson/ubuntu/linux2021/lab0-c/console.c:660 #4 0x55d723b333e5 in main /home/robinson/ubuntu/linux2021/lab0-c/qtest.c:780 #5 0x7f1fc9de90b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) #6 0x55d723b30b8d in _start (/home/robinson/ubuntu/linux2021/lab0-c/qtest+0x8b8d) 0x55d723b4d5e1 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:21:6' (0x55d723b4d5e0) of size 1 'simulation' is ascii string '' SUMMARY: AddressSanitizer: global-buffer-overflow /home/robinson/ubuntu/linux2021/lab0-c/console.c:369 in do_option_cmd Shadow bytes around the buggy address: 0x0abb64761a60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0abb64761a70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0abb64761a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0abb64761a90: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 0x0abb64761aa0: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 =>0x0abb64761ab0: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9[01]f9 f9 f9 0x0abb64761ac0: f9 f9 f9 f9 00 00 00 00 01 f9 f9 f9 f9 f9 f9 f9 0x0abb64761ad0: 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00 0x0abb64761ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0abb64761af0: 00 f9 f9 f9 f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9 0x0abb64761b00: 01 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==4363==ABORTING ``` Bingo, 但說實話知道了問題點在哪,我也不知道是怎麼錯的。只好翻遍 C 語言規格書看看有沒有什麼漏掉的東西,過程中翻到了一個有趣的東西: > 6.2.5 Types > An object declared as type _Bool is large enough to store the values 0 and 1. 足夠大?足夠大是多大? ```c=0 #include <stdlib.h> #include <stdio.h> #include <stdbool.h> int main() { bool a = true; printf("the size of a is: %ld\n", sizeof(a)); // the size of a is: 1 } ``` 居然是1, 我印象中的溢位都是資料型態的空間無法容納數值的大小,類似滿出來那種感覺(這其實是叫算術溢位),這種因為空間小而造成的溢位我是第一次見過(因為其實是叫緩衝區溢位)。不過重新思考轉換指標型態後我有了一個挺合理的解釋: 原本變數 a 只佔一個 byte ```c bool a = true; ``` 我們取址後將其轉換為 (int *),轉換指標的資料型態只會改變編譯器對這個地址的看法,本身不會改變變數任何值 ```c int *b = (int *) &a; ``` 也就是原本編譯器只會向後看一個 byte 的值作為變數 a 的值,現在會多看三個 byte,編譯器存取到了不該存取的記憶體位置,自然符合緩衝區溢位的定義 ### 修正方案 我能夠想到對原程式碼衝擊最小的,就是直接新增一個`bool` 型態指標的參數在 add_param 函數內: ```c void add_param(char *name, int *valp, char *doccumentation, bool *switches, setter_function setter); ```