# 2020q1 Homework1 (lab0) contributed by < `MetalheadKen` > :::info :mega: 題目出處 * [H01: lab0](https://hackmd.io/@sysprog/linux2020-lab0) * [C Programming Lab: Assessing Your C Programming Skills](http://www.cs.cmu.edu/~213/labs/cprogramminglab.pdf) ::: ## 實驗環境 ```shell $ uname -a Linux kendai-XPS-13-9360 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux $ gcc --version | head -n 1 gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0 ``` ## 題目摘要 * Some of the skills tested are : * Explicit memory management, as required in C * Creating and manipulating pointer-based data structures * Working with strings * Enhancing the proformance of key operations by storing redundant information in data structures * Implementing robust code that operates correctly with invalid arguments, including NULL pointers * Implementing a queue * Supporting both last-in, first-out (LIFO) and first-in-first-out (FIFO) * The underlying data structure is a singly-linked list * Resources * C programming * Kernighan and Ritchie, *The C Programming Language, second edition* * [Wikibooks](https://en.wikibooks.org/wiki/C_Programming) * [Linked lists](http://www.cs.cmu.edu/~iliano/courses/18S-CMU-CS122/handouts/10-linkedlist.pdf) * [Asympotic (big-Oh) notation](http://www.cs.cmu.edu/~iliano/courses/18S-CMU-CS122/handouts/05-sort.pdf) * Linux Man pages * The task is to modify the code in `queue.h` and `queue.c` to fully implement the following functions * `q_new`: Create a new, empty queue * `q_free`: Free all storage used by a queue * `q_insert_head`: Attempt to insert a new element at the head of the queue (LIFO discipline) * `q_insert_tail`: Attempt to insert a new element at the tail of the queue (FIFO discipline) * `q_remove_head`: Attempt to remove the element at the head of the queue * `q_size`: Compute the number of elements in the queue * `q_reverse`: Reorder the list so that the queue elements are reversed in order. * `q_sort`: Sort elements of queue in ascending order. * ==Precautions== * A *NULL* queue is one for which the pointer has value `NULL`. An *empty* queue is one pointing to a valid structure, but the `head` field has value `NULL` * For functions that provide strings as arguments, you must create and store a copy of the string by calling `malloc` to allocate space and then copying from the source to the newly allocated space * You cannot assume any fixed upper bound on the length of a string ─ you must allocate space for each string based on its length * `q_insert_tail` and `q_size` will require that your implementations operate in time $O(1)$ * `q_reverse` should not allocate or free any list elements. Instead, it should rearrange the existing elements ## 實作流程 ### `queue.h` 依據題意,為了讓函式 `q_insert_tail` 和 `q_size` 的時間複雜度為 $O(1)$,在 struct `queue_t` 裡新增成員 `tail` 和 `size`,使得在 runtime 的時候可以動態紀錄 queue 的尾端和其大小 ```cpp /* Queue structure */ typedef struct { list_ele_t *head; list_ele_t **tail; size_t size; } queue_t; ``` ### `q_new` 需注意若 `malloc` 失敗時需回傳 NULL ```cpp queue_t *q_new() { queue_t *q = (queue_t *) malloc(sizeof(queue_t)); if (!q) return NULL; q->head = NULL; q->tail = &(q->head); q->size = 0; return q; } ``` ### `q_free` 釋放 queue 的空間,注意記得一併釋放節點和字串所佔的記憶體空間 ```cpp void q_free(queue_t *q) { if (!q) return; while (q->head != NULL) { list_ele_t *tmp = q->head; q->head = q->head->next; free(tmp->value); free(tmp); } free(q); } ``` ### `q_insert_head` * 分別建立節點和字串的空間,之後進行字串複製和節點的串接 * 需注意若 `malloc` 失敗時,在回傳 NULL 前需釋放之前已經配置好的記憶體空間 * 在這邊使用 `goto` 來作錯誤處理,可參考 [goto in Linux kernel coding style]( https://www.kernel.org/doc/html/v4.18/process/coding-style.html#centralized-exiting-of-functions) 章節 * 紀錄節點數量 ```cpp bool q_insert_head(queue_t *q, char *s) { if (!q) goto ERR; list_ele_t *newh; newh = (list_ele_t *) malloc(sizeof(list_ele_t)); if (!newh) goto ERR; int len = strlen(s) + 1; char *str = (char *) malloc(len); if (!str) goto ERR_FREE_LIST; memcpy(str, s, len); newh->value = str; newh->next = q->head; if (!q->head) { q->tail = &(newh->next); } q->head = newh; q->size++; return true; ERR_FREE_LIST: free(newh); ERR: return false; } ``` ### `q_insert_tail` * 大致上與先前的 `q_insert_head` 的實作方向相同 * 因題意規定其時間複雜度需為 $O(1)$,這邊使用 `tail` 來串接新節點 * 撰寫「有品味」的程式碼 * 讓 `tail` 的 data type 為 `list_ele_t**`,使得可以減少 branch penalty,可參考 * [Linus Torvalds 在 TED 2016 的訪談](https://www.ted.com/talks/linus_torvalds_the_mind_behind_linux?language=zh-tw) * [Applying the Linus Torvalds “Good Taste” Coding Requirement](https://medium.com/@bartobri/applying-the-linus-tarvolds-good-taste-coding-requirement-99749f37684a) * [Naetw 的開發紀錄](https://hackmd.io/s/BysQssYHN#q_insert_tail) * 紀錄節點數量 ```cpp bool q_insert_tail(queue_t *q, char *s) { if (!q) goto ERR; list_ele_t *newt = (list_ele_t *) malloc(sizeof(list_ele_t)); if (!newt) goto ERR; int len = strlen(s) + 1; char *str = (char *) malloc(len); if (!str) goto ERR_FREE_LIST; memcpy(str, s, len); newt->value = str; newt->next = NULL; *(q->tail) = newt; q->tail = &(newt->next); q->size++; return true; ERR_FREE_LIST: free(newt); ERR: return false; } ``` ### `q_remove_head` * 若 `sp` 不為 NULL 時代表需把移除的字串複製到 `sp` 中 * 注意有可能要移除的字串的長度超過 `sp` 的大小,因而最多只需複製長度為 `bufsize - 1`,並注意要在結尾上加上 terminating null byte * 調整節點數量 ```cpp bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (!q || !q->head) return false; if (sp != NULL) { strncpy(sp, q->head->value, bufsize - 1); sp[bufsize - 1] = '\0'; } list_ele_t *tmp = q->head; q->head = q->head->next; q->tail = (--q->size == 0) ? &(q->head) : q->tail; free(tmp->value); free(tmp); return true; } ``` ### `q_size` * 題意規定時間複雜度為 $O(1)$,因此藉由 `size` 動態紀錄 queue 的大小,並在此函式直接回傳 `size` 的內容 * 需注意有可能在沒有使用 `new` command 的情況下直接使用 `size` command,所以需判斷若 queue 為 NULL queue 或 empty queue 則直接回傳數值 0 ```cpp int q_size(queue_t *q) { if (!q || !q->head) { return 0; } else { return q->size; } } ``` ### `q_reverse` * 若 queue 為 NULL queue 或 empty queue 或長度為 1 的 queue 則無需 reverse * 使用 `prev`, `curr`, `next` 來紀錄反轉的過程 * `prev` 紀錄 queue 前一個的節點 * `curr` 紀錄 queue 當前的節點,在反轉過程中,`curr` 的下一個節點應指派為 `curr` 的前一個節點 * `next` 紀錄 queue 下一個的節點 * 注意最後 `head` 和 `tail` 需重設方向 ```cpp void q_reverse(queue_t *q) { if (!q || !q->head || !q->head->next) return; list_ele_t *prev, *curr, *next; prev = q->head; curr = q->head->next; q->head->next = NULL; q->tail = &(q->head->next); while (curr != NULL) { next = curr->next; curr->next = prev; prev = curr; curr = next; } q->head = prev; } ``` ### `q_sort` * 在這邊使用了 merge sort 來排序 queue 裡面的元素 * 可以注意到 `merge_sort` 裡面透過 `fast` 和 `slow` pointer 來找到該 list 的中間點為何 * 剛好 [LeetCode](https://leetcode.com/problems/sort-list/) 上有相關題目可以練練手,順便可以用獨立環境來跑一下 unit test ```cpp void q_sort(queue_t *q) { if (!q || !q->head || !q->head->next) return; q->head = merge_sort(q->head); q->tail = &(q->head); while (*(q->tail)) { q->tail = &((*q->tail)->next); } } static list_ele_t *merge_sort(list_ele_t *head) { if (!head || !head->next) return head; list_ele_t *slow = head; list_ele_t *fast = head->next; while (fast && fast->next) { slow = slow->next; fast = fast->next->next; } list_ele_t *mid = slow->next; slow->next = NULL; return merge(merge_sort(head), merge_sort(mid)); } /* Merge two list in ascending order */ static list_ele_t *merge(list_ele_t *left, list_ele_t *right) { if (!left) return right; if (!right) return left; list_ele_t *dummy = NULL; list_ele_t *curr = NULL; if (strcmp(left->value, right->value) > 0) { dummy = right; right = right->next; } else { dummy = left; left = left->next; } curr = dummy; while (left && right) { if (strcmp(left->value, right->value) > 0) { curr->next = right; right = right->next; } else { curr->next = left; left = left->next; } curr = curr->next; } if (left) curr->next = left; if (right) curr->next = right; return dummy; } ``` ### Implement Natural Sort 這邊採用 [sourcefrog/natsort](https://github.com/sourcefrog/natsort) 所實作的 natural sort 的 compare function - `strnatcmp`。從原專案中取得 `strnatcmp.[ch]` 後放入 `lab0-c` 的專案中。 :::info 這邊為了明確的表示是使用別人的專案,原本是想採用 git submodule 的方式建構的,但是考慮到若要對 natsort 專案做改動,勢必要 push 回去到原專案中,或者在當初 submodule init 時只能指定 fork 後的專案,但都相對的要麻煩許多。 不知道有什麼方式可以除了把有 reference 到的專案之相關訊息寫進 commit message 以外達到明確的使用到以及隨時跟新進度在別人的專案上。 > [name=MetalheadKen] ::: :::warning 這是練習撰寫清晰的 git commit messages 的好機會 :notes: jserv ::: 接著把在排序中採用到的 `strcmp` 一律改成 `strnatcmp` * `qtest.c` ```cpp bool do_sort(int argc, char *argv[]) { ... for (list_ele_t *e = q->head; e && --cnt; e = e->next) { /* Ensure each element in ascending order */ /* FIXME: add an option to specify sorting order */ if (strnatcmp(e->value, e->next->value) > 0) { report(1, "ERROR: Not sorted in ascending order"); ok = false; break; ... } ``` * `queue.c` ```cpp static list_ele_t *merge(list_ele_t *left, list_ele_t *right) { ... if (strnatcmp(left->value, right->value) > 0) { dummy = right; right = right->next; } else { ... while (left && right) { if (strnatcmp(left->value, right->value) > 0) { curr->next = right; right = right->next; } else { ... } ``` 更改 `Makefile`,使其可以編譯 `strnatcmp.c` * `Makefile` ```cpp ... OBJS := qtest.o report.o console.o harness.o queue.o \ random.o dudect/constant.o dudect/fixture.o dudect/ttest.o \ natsort/strnatcmp.o ... ``` 為了對現有的 natsort 做測試,增加新的 trace file * `traces/trace-18-natsort.cmd` ```cpp # Test operation of natural sort option fail 0 option malloc 0 new ih fred ih pic2 ih pic100a ih pic120 ih pic121 ih jane ih tom ih pic02a ih pic3 ih pic4 ih 1-20 ih pic100 ih pic02000 sort ``` * `script/driver.py` ```diff --- a/scripts/driver.py +++ b/scripts/driver.py @@ -35,7 +35,8 @@ class Tracer: 14: "trace-14-perf", 15: "trace-15-perf", 16: "trace-16-perf", - 17: "trace-17-complexity" + 17: "trace-17-complexity", + 18: "trace-18-natsort" } traceProbs = { @@ -55,10 +56,11 @@ class Tracer: 14: "Trace-14", 15: "Trace-15", 16: "Trace-16", - 17: "Trace-17" + 17: "Trace-17", + 18: "Trace-18" } - maxScores = [0, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5] + maxScores = [0, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5, 6] RED = '\033[91m' GREEN = '\033[92m' ``` 對 `trace-16-perf.cmd` 採用到的 `RAND` 提供英文與數字的亂數排序,其實只要新增數字到 `charset` 就可以了 * `qtest.c` ```diff --- a/qtest.c +++ b/qtest.c @@ -60,7 +60,7 @@ static int string_length = MAXSTRING; #define MIN_RANDSTR_LEN 5 #define MAX_RANDSTR_LEN 10 - static const char charset[] = "abcdefghijklmnopqrstuvwxyz"; + static const char charset[] = "abcdefghijklmnopqrstuvwxyz0123456789"; ``` 執行 `make test` 後出現以下錯誤,依錯誤訊息可得知為超時 ```SHELL ... ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient ... ``` 這是因為在 natsort 中所使用的 `strnatcmp` 比 glibc 所提供的 `strcmp` 的執行時間還要長的緣故。為了解決這個問題,這邊採用的方法為新增一個新的 command option `time`,藉由 `option time <limit=1>` 可以指定在這次測試中所會參考的 time limit。 * `qtest.c` ```cpp static void console_init() { ... add_param("fail", (void *) &fail_limit, sizeof(fail_limit), "Number of times allow queue operations to return false", NULL); add_param("time", (void *) &time_limit, sizeof(time_limit), "Maximum number of time limit", NULL); } ``` 為了要讓 `time_limit` 可以動態更改,必須把 `time_limit` 改為 global variable * `harness.c` ```diff --- a/harness.c +++ b/harness.c @@ -52,7 +52,7 @@ static bool noallocate_mode = false; static bool error_occurred = false; static char *error_message = ""; - static int time_limit = 1; + int time_limit = 1; /* * Data for managing exceptions ``` * `harness.h` ```diff --- a/harness.c +++ b/harness.c; @@ -25,6 +25,9 @@ size_t allocation_check(); /* Probability of malloc failing, expressed as percent */ extern int fail_probability; + /* Amount of time limit */ + extern int time_limit; + /* * Set/unset cautious mode. * In this mode, makes extra sure any block to be freed is currently allocated.; ``` 加入 `option time` 到會因為 natsort 的緣故而造成 Time limit exceeded 的 trace 中 * `traces/trace-15-perf.cmd` ```diff --- a/traces/trace-15-perf.cmd +++ b/traces/trace-15-perf.cmd @@ -1,6 +1,8 @@ # Test performance of insert_tail, size, reverse, and sort option fail 0 option malloc 0 + # For test natural sort, we extend the amount of time limit + option time 3 new ih dolphin 1000000 it gerbil 1000000 ... ``` ## Address Sanitizer 執行 `make SANITIZER=1 test` 後出現下列錯誤 ```shell ... # Test if q_insert_tail and q_size is constant time complexity ================================================================= ==26442==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55ba43b1e9c0 at pc 0x55ba4390f86e bp 0x7ffe34a65e40 sp 0x7ffe34a65e30 READ of size 4 at 0x55ba43b1e9c0 thread T0 #0 0x55ba4390f86d in do_option_cmd /home/kendai/git/lab0-c/console.c:368 #1 0x55ba4390e48a in interpret_cmda /home/kendai/git/lab0-c/console.c:220 #2 0x55ba4390ee8e in interpret_cmd /home/kendai/git/lab0-c/console.c:243 #3 0x55ba4390fb77 in cmd_select /home/kendai/git/lab0-c/console.c:569 #4 0x55ba4390ff94 in run_console /home/kendai/git/lab0-c/console.c:628 #5 0x55ba4390d0ad in main /home/kendai/git/lab0-c/qtest.c:772 #6 0x7f0b512e6b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96) #7 0x55ba4390a809 in _start (/home/kendai/git/lab0-c/qtest+0x6809) 0x55ba43b1e9c1 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:20:6' (0x55ba43b1e9c0) of size 1 'simulation' is ascii string '' SUMMARY: AddressSanitizer: global-buffer-overflow /home/kendai/git/lab0-c/console.c:368 in do_option_cmd ... +++ TESTING trace trace-17-complexity: --- trace-17-complexity 0/5 --- TOTAL 95/100 ``` 先用 GDB 定位問題 ```SHELL $ gdb -q --args ./qtest -f traces/trace-17-complexity.cmd ... (gdb) b console.c:368 (gdb) r ... Breakpoint 1, do_option_cmd (argc=3, argv=0x555555763ce0) at console.c:368 368 int oldval = *plist->valp; (gdb) p *plist $1 = {name = 0x55555555bd47 "simulation", valp = 0x55555575e520 <simulation>, documentation = 0x55555555bd2c "Start/Stop simulation mode", setter = 0x0, next = 0x5555557613e0} (gdb) whatis (*plist).valp type = int * (gdb) whatis simulation type = _Bool (gdb) p sizeof(_Bool) $2 = 1 ``` 從這邊可以知道由於 `simulation` 從 `bool` 轉成 `int *` 並存入 `valp`,之後當直接 dereference 的時候就會因為要取出 4 bytes 的資料而存取到不該存取的地方故造成 buffer overflow 為了要解決 global-buffer-overflow,我們需要把轉型成 `int *` 的地方改為 `void *`,根據 C99 規格,pointer to void 做為一個 generic pointer 可以把原本的 object type 轉成 pointer to void 再轉回來。藉由於此,我們可以透過 pointer to void 來做到不同的 data type 的存取 > C99 Spec 6.3.2.3:1 [Pointers] > > A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer. 除了 `void *` 來儲存各種的 data type 以外,還需要一個 variable 來儲存原本的 object 所佔的大小。最後,根據上面所述,我們需要修改以下動作來解決所需的問題 * 修改 struct `PELE` 來儲存 `void *` 和原始的 object size * 修改對應的 function argument 來完成儲存 `void *` 和原始的 object size 的動作 * 新增一個新的 function 來讓 `void *` 可以透過原始的 object size 來轉型回原來的 data type 修改如下 * `console.h` ```diff --- a/console.h +++ b/console.h @@ -23,13 +23,14 @@ struct CELE { }; /* Optionally supply function that gets invoked when parameter changes */ - typedef void (*setter_function)(int oldval); + typedef void (*setter_function)(void *oldval, int oldsize); /* Integer-valued parameters */ typedef struct PELE param_ele, *param_ptr; struct PELE { char *name; - int *valp; + void *valp; + int valsize; char *documentation; /* Function that gets called whenever parameter changes */ setter_function setter; @@ -44,7 +45,8 @@ void add_cmd(char *name, cmd_function operation, char *documentation); /* Add a new parameter */ void add_param(char *name, - int *valp, + void *valp, + int valsize, char *doccumentation, setter_function setter); ``` * `console.c` ```diff --- a/console.c +++ b/console.c @@ -82,6 +82,8 @@ static void pop_file(); static bool interpret_cmda(int argc, char *argv[]); + static uint32_t get_plist_valp(param_ptr p); + /* Initialize interpreter */ void init_cmd() { @@ -99,11 +101,14 @@ void init_cmd() add_cmd("log", do_log_cmd, " file | Copy output to file"); add_cmd("time", do_time_cmd, " cmd arg ... | Time command execution"); add_cmd("#", do_comment_cmd, " ... | Display comment"); - add_param("simulation", (int *) &simulation, "Start/Stop simulation mode", + add_param("simulation", (void *) &simulation, sizeof(simulation), + "Start/Stop simulation mode", NULL); + add_param("verbose", (void *) &verblevel, sizeof(verblevel), + "Verbosity level", NULL); + add_param("error", (void *) &err_limit, sizeof(err_limit), + "Number of errors until exit", NULL); + add_param("echo", (void *) &echo, sizeof(echo), "Do/don't echo commands", NULL); - add_param("verbose", &verblevel, "Verbosity level", NULL); - add_param("error", &err_limit, "Number of errors until exit", NULL); - add_param("echo", (int *) &echo, "Do/don't echo commands", NULL); init_in(); init_time(&last_time); @@ -130,7 +135,8 @@ void add_cmd(char *name, cmd_function operation, char *documentation) /* Add a new parameter */ void add_param(char *name, - int *valp, + void *valp, + int valsize, char *documentation, setter_function setter) { @@ -144,6 +150,7 @@ void add_param(char *name, param_ptr ele = (param_ptr) malloc_or_fail(sizeof(param_ele), "add_param"); ele->name = name; ele->valp = valp; + ele->valsize = valsize; ele->documentation = documentation; ele->setter = setter; ele->next = next_param; @@ -248,6 +255,20 @@ static bool interpret_cmd(char *cmdline) return ok; } + static uint32_t get_plist_valp(param_ptr p) + { + switch (p->valsize) { + case 1: + return (*(uint8_t *) p->valp); + case 2: + return (*(uint16_t *) p->valp); + default: + return (*(uint32_t *) p->valp); + } + } + /* Set function to be executed as part of program exit */ void add_quit_helper(cmd_function qf) { @@ -303,7 +324,7 @@ static bool do_help_cmd(int argc, char *argv[]) param_ptr plist = param_list; report(1, "Options:"); while (plist) { - report(1, "\t%s\t%d\t%s", plist->name, *plist->valp, + report(1, "\t%s\t%d\t%s", plist->name, get_plist_valp(plist), plist->documentation); plist = plist->next; } @@ -342,7 +363,7 @@ static bool do_option_cmd(int argc, char *argv[]) param_ptr plist = param_list; report(1, "Options:"); while (plist) { - report(1, "\t%s\t%d\t%s", plist->name, *plist->valp, + report(1, "\t%s\t%d\t%s", plist->name, get_plist_valp(plist), plist->documentation); plist = plist->next; } @@ -365,10 +386,10 @@ static bool do_option_cmd(int argc, char *argv[]) param_ptr plist = param_list; while (!found && plist) { if (strcmp(plist->name, name) == 0) { - int oldval = *plist->valp; - *plist->valp = value; + void *oldvalp = plist->valp; + memcpy(oldvalp, &value, plist->valsize); if (plist->setter) - plist->setter(oldval); + plist->setter(oldvalp, plist->valsize); found = true; } else plist = plist->next; ``` * `qtest.c` ```diff --- a/qtest.c +++ b/qtest.c @@ -98,11 +98,11 @@ static void console_init() add_cmd("size", do_size, " [n] | Compute queue size n times (default: n == 1)"); add_cmd("show", do_show, " | Show queue contents"); - add_param("length", &string_length, "Maximum length of displayed string", - NULL); - add_param("malloc", &fail_probability, "Malloc failure probability percent", - NULL); - add_param("fail", &fail_limit, + add_param("length", (void *) &string_length, sizeof(string_length), + "Maximum length of displayed string", NULL); + add_param("malloc", (void *) &fail_probability, sizeof(fail_probability), + "Malloc failure probability percent", NULL); + add_param("fail", (void *) &fail_limit, sizeof(fail_limit), "Number of times allow queue operations to return false", NULL); } ``` ## Valgrind + Massif Heap Profiler ### 安裝 massif-visualizer `$ sudo apt install massif-visualizer` ### 使用 valgrind + massif 進行測試 ```SHELL $ valgrind -q --tool=massif --stack=yes --time-unit=i --massif-out-file=massif.out ./qtest ... cmd> option fail 0 cmd> option malloc 0 cmd> new cmd> ih RAND 10 cmd> reverse cmd> sort cmd> rh ... 總共執行 rh 10 次 cmd> quit ``` ### 使用 massif-visualizer 視覺化 `$ massif-visualizer massif.out` ![](https://i.imgur.com/tRbxPUu.png) ## [Dudect](https://github.com/oreparaz/dudect) 相關研究 ## select 系統呼叫 ## 自動評分系統相關補充 ### 如何呼叫使用者輸入命令對應的函式 一開始使用者執行 `qtest` 後,輸入命令時,會先執行 `console.c` 中的 `interpret_cmd` 並呼叫 `parse_args` 來將輸入的字串以空白為區隔,轉換為 `argc` 和 `argv`,如下 ```cpp bool interpret_cmd(char *cmdline) { int argc; ... char **argv = parse_args(cmdline, &argc); bool ok = interpret_cmda(argc, argv); ... return ok; } ``` 之後在函式 `interpret_cmda` 中,利用迴圈走訪 singly linked list,並利用 `strcmp` 找出函式相對應的字串。可以發現在 `qtest` 中使用函式指標的方式可以很簡單方便的維護使用者可輸入的命令 ```cpp static bool interpret_cmda(int argc, char *argv[]) { if (argc == 0) return true; /* Try to find matching command */ cmd_ptr next_cmd = cmd_list; bool ok = true; while (next_cmd && strcmp(argv[0], next_cmd->name) != 0) next_cmd = next_cmd->next; if (next_cmd) { ok = next_cmd->operation(argc, argv); if (!ok) record_error(); } else { report(1, "Unknown command '%s'", argv[0]); record_error(); ok = false; } return ok; } ``` ### 在寫好 trace file 後程式如何運作 在 `qtest` 中有提供參數 `-f <filename>` 來讀取 trace file。當使用者輸入命令後,在 `qtest.c` 中的 `main` 函式會呼叫 `getopt` 來解析使用者先前輸入命令的後面所添加的參數,節錄如下 ```cpp int main(int argc, char *argv[]) { ... while ((c = getopt(argc, argv, "hv:f:l:")) != -1) { switch (c) { case 'h': usage(argv[0]); break; case 'f': strncpy(buf, optarg, BUFSIZE); buf[BUFSIZE - 1] = '\0'; infile_name = buf; break; case 'v': level = atoi(optarg); break; case 'l': strncpy(lbuf, optarg, BUFSIZE); buf[BUFSIZE - 1] = '\0'; logfile_name = lbuf; break; default: printf("Unknown option '%c'\n", c); usage(argv[0]); break; } } ... } ``` :::info `getopt` 是一個可以很方便的去 parse 從 command line 傳過來的參數的函式,該函式會一直讀取參數直到沒有更多的參數可以讀時才回傳 -1 * function prototype: `int getopt(int argc, char *argv[], const char *optstring)` * `int argc`: the argument count passed to the `main()` function * `char *argv[]`: the argument array passed to the `main()` function * `const char *optstring`: a string containing the legitimate option characters * 一個字元︰ 一個可用的 option * 一個字元搭配冒號︰該 option 後有額外帶參數 * 一個字元當配兩個冒號︰該 option 後帶的參數是可選的 **(此為 GNU extension)** * `getopt` 回傳數值後 * 成功回傳 option 字元,無更多 option 可讀則回傳 -1,在 `optstring` 找不到 option 字元則回傳 `?`,在 `optstring` 設定需帶參數的 option 沒有帶參數的話則回傳 `:` * `optarg` 變數儲存該 option 後面所帶的參數 * `optind` 變數儲存下次呼叫 `getopt` 時要處理的 argv 的 index * `optopt` 變數儲存當 `getopt` 找不到 option 字元或是缺少某些參數時的該 option * ==注意事項== * `getopt` 是 POSIX standard 但是並不是 C standard 並在 glibc 實作中的某些行為是 GNU extensions,這點需要特別注意 * 參考資料 * [man 3 getopt](https:// "http://man7.org/linux/man-pages/man3/getopt.3.html") * [Oracle Docs](https://docs.oracle.com/cd/E19253-01/816-5168/getopt-3c/index.html) * [POSIX-1.200x Draft](http://www.open-std.org/jtc1/sc22/open/n4217.pdf#page=1052) * Michael Kerrisk, *The Linux Programming Interface: Appendix B* ::: 在 `console.c` 中定義以下 structure 來儲存每個要使用的檔案的 file descritptor ```cpp #define RIO_BUFSIZE 8192 typedef struct RIO_ELE rio_t, *rio_ptr; struct RIO_ELE { int fd; /* File descriptor */ int cnt; /* Unread bytes in internal buffer */ char *bufptr; /* Next unread byte in internal buffer */ char buf[RIO_BUFSIZE]; /* Internal buffer */ rio_ptr prev; /* Next element in stack */ }; ``` > RIO 在 CS:APP 第 10.5 章提及,全名為 Robust I/O > Ref: [System-Level I/O](https://csapp.cs.cmu.edu/2e/ch10-preview.pdf#page=6) (preview version) 並在 `push_file` 函式中把開啟檔案的 file descriptor 放進 `struct RIO_ELE` 中的 `fd`,若無指定開啟檔案的路徑則選擇為標準輸入 (也就是 `stdin`) ```cpp static bool push_file(char *fname) { int fd = fname ? open(fname, O_RDONLY) : STDIN_FILENO; if (fd < 0) return false; if (fd > fd_max) fd_max = fd; rio_ptr rnew = malloc_or_fail(sizeof(rio_t), "push_file"); rnew->fd = fd; rnew->cnt = 0; rnew->bufptr = rnew->buf; rnew->prev = buf_stack; buf_stack = rnew; return true; } ``` ### 如何偵測記憶體洩漏 `harness.[ch]` 中藉由替換掉 `malloc` 和 `free` 的實作方式使得可以檢查到 allocation error 從 `harness.h` 中可以見到 `qtest` 使用 macro 來把 `malloc` 和 `free` 替換成 `test_malloc` 和 `test_free` ```cpp /* Tested program use our versions of malloc and free */ #define malloc test_malloc #define free test_free ``` 而從 `test_malloc` 所配置出來的記憶體都會被紀錄在 `struct BELE` 中,structure 在 `harness.c` 中定義如下並以 doubly-linked list 來串接節點 ```cpp typedef struct BELE { struct BELE *next; struct BELE *prev; size_t payload_size; size_t magic_header; /* Marker to see if block seems legitimate */ unsigned char payload[0]; /* Also place magic number at tail of every block */ } block_ele_t; ``` 在 structure 中 `payload[0]` 是一個 struct hack,在 [GCC manual](https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html) 中稱呼為 Arry of Length Zero,藉由這種方式可以做到讓 structure 中的成員可以動態的配置記憶體又不用因為多次呼叫 `malloc` 造成記憶體碎片化 用法如下 ```c block_ele_t *new_block = malloc(array_size + sizeof(block_ele_t)); ``` 這樣的手法其實就是利用 GNU C compiler 預設不會去檢查 Array index out of bounds 的問題來做到的,但是使用這個技巧需要思考以下的問題 * Flexible array members 需為 incomplete type,並且 `sizeof` 不能用在 incomplete type 上 * Flexible array members 需宣告在最後一個 non-empty structure 成員上 * Flexible array members 不能單獨出現,至少需一個成員在 * Flexible array members 本身不可以宣告或包含在 union 中 * [Clang 支援該 GNU extension,但不支援 static initialization](http://clang.llvm.org/docs/UsersManual.html#intentionally-unsupported-gcc-extensions) > Note that clang does support flexible array members (arrays with zero or unspecified size at the end of a structure). > > clang does not support static initialization of flexible array members. This appears to be a rarely used extension, but could be implemented pending user demand. 另外在 ISO C90 可以使用 Array of Length One 來做到同樣的事情,而從 ISO C99 開始支援 Flexible Array Members,請參照 * [Wikipedia](https://en.wikipedia.org/wiki/Flexible_array_member) * [C99 §6.7.2.1, item 16](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf#page=115) > As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. ### 如何配置記憶體 `test_malloc` 把經由 `malloc` 配置出來的新的記憶體空間,將其串連到 doubly linked list `allocated` 中,其中配置出記憶體空間的寫法如下 ```c block_ele_t *new_block = malloc(size + sizeof(block_ele_t) + sizeof(size_t)); ``` 可以看到除了 `size` 和 `sizeof(block_ele_t)` 以外,還多了 `sizeof(size_t)`,這是因為在 `qtest` 中,藉由在尾端多配置出一個空間並填入 magic number 來查看若當該數值有被更動過的話,即表示出現 memory corruption (array access out of bounds),以及偵測是否是由 `test_malloc` 配置出來的記憶體空間,而前一個成員 `magic_header` 也是基於此用途的。`harness.c` 中的 `test_malloc` 節錄如下 ```cpp void *test_malloc(size_t size) { if (fail_allocation()) { report_event(MSG_WARN, "Malloc returning NULL"); return NULL; } block_ele_t *new_block = malloc(size + sizeof(block_ele_t) + sizeof(size_t)); ... new_block->magic_header = MAGICHEADER; new_block->payload_size = size; *find_footer(new_block) = MAGICFOOTER; void *p = (void *) &new_block->payload; memset(p, FILLCHAR, size); new_block->next = allocated; new_block->prev = NULL; if (allocated) allocated->prev = new_block; allocated = new_block; allocated_count++; return p; } ``` :::info 在 `fail_allocation` 中,實作了當亂數產生出來的數小於 0.01 乘上預先設定好的失敗機率 (`fail_probability`) 的話,即 malloc failure,為何需要如此實作? > [name=MetalheadKen] ::: :::warning 回去複習機率統計,思考上述做法是否合理,若有疑慮就提出論證和實驗 :notes: jserv ::: 其中為了方便取得到 `payload` 的尾端來指派 magic number (`MAGICFOOTER`),`qtest` 實作了 `find_footer` 函式,如下 ```cpp static size_t *find_footer(block_ele_t *b) { size_t *p = (size_t *) ((size_t) b + b->payload_size + sizeof(block_ele_t)); return p; } ``` 藉由傳進來的 `block_ele_t *` 的開頭,在加上 payload size 和該 structure 本身的大小來取得到 `payload` 的尾端