2019q1 Homework1 (lab0)

--- tags: 作業 --- # 2019q1 Homework1 (lab0) contributed by < `ab37695543xs` > ## 開發環境 ``` 架構： x86_64 CPU 作業模式： 32-bit, 64-bit Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 每核心執行緒數： 2 每通訊端核心數： 4 Socket(s): 1 NUMA 節點： 1 供應商識別號： GenuineIntel CPU 家族： 6 型號： 142 Model name: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz 製程： 10 CPU MHz： 2099.944 CPU max MHz: 3400.0000 CPU min MHz: 400.0000 BogoMIPS: 3600.00 虛擬： VT-x L1d 快取： 32K L1i 快取： 32K L2 快取： 256K L3 快取： 6144K NUMA node0 CPU(s)： 0-7 ``` ## 作業目標 - [x] 英文閱讀 - [x] C 語言程式設計基礎 - [x] 準備 GNU/Linux 開發工具 - [x] 學習使用 Git 與 GitHub - [ ] 學習效能分析 - [x] 研究自動測試機制 CMU [Introduction to Computer Systems (ICS)](http://www.cs.cmu.edu/~213/index.html) 準備了 [C Programming Lab](http://www.cs.cmu.edu/~213/labs/cprogramminglab.pdf) 作為檢閱學生 C 語言程式設計的認知: * Explicit memory management, as required in C. * Creating and manipulating pointer-based data structures. * Working with strings. * Enhancing the performance of key operations by storing redundant information in data structures. * Implementing robust code that operates correctly with invalid arguments, including NULL pointers. 實驗目標為實作 queue * first in, first out (FIFO) * last in, first out (LIFO) ## 基本要求 ### queue.h 為了滿足 O(1) 的時間要求下，實作出 `q_insert_tail()` 與`q_size()`，便在資料結構上新增了 `tail` 與 `size` 來紀錄 ```clike= typedef struct { list_ele_t *head; list_ele_t *tail; int size; } queue_t; ``` ### queue.c #### q_new() 成功分配記憶體的話，就初始化 `q`，反之 `q` 會回傳 `NULL` ```cpp= queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); /* initialize the queue */ if (q) { q->head = NULL; q->tail = NULL; q->size = 0; } return q; } ``` #### q_size() 由於透過額外 field 來記錄 queue 的大小，且在 `q_new()` 已初始化，這邊直接回傳即可 > 犯錯的地方： > 由於初始化是在 q_new() 時進行，需要考慮到未初始化的情況，即 q 為 NULL ```cpp= int q_size(queue_t *q) { if (q != NULL) return q->size; else return 0; } ``` #### q_insert_head() * 考慮傳入的 `q` 與記憶體分配的 `malloc()` 與 `strdup()` 為 `NULL` 的可能 * 若複製字串空間不夠時，會視為插入失敗，同時釋放該節點 * 需要考慮最初的情況，head 和 tail 為同一個，避免插入失敗 * 有新節點的情況也會需要更新 size ```cpp= bool q_insert_head(queue_t *q, char *s) { if (q == NULL) return false; list_ele_t *newh = malloc(sizeof(list_ele_t)); if (newh == NULL) return false; /* set the element */ newh->value = strdup(s); if (newh->value == NULL) { free(newh); return false; } newh->next = q->head; /* maintain the queue */ q->head = newh; if (q->tail == NULL) // same element q->tail = newh; q->size++; return true; } ``` #### q_insert_tail() 由於透過額外 field 記錄 tail 位置，所以省去遍尋的時間，剩下邏輯同 `q_insert_head()` ```cpp= bool q_insert_tail(queue_t *q, char *s) { /* Remember: It should operate in O(1) time */ list_ele_t *newt = malloc(sizeof(list_ele_t)); if (newt == NULL) return false; /* set the element */ newt->value = strdup(s); if (newt->value == NULL) { free(newt); return false; } newt->next = NULL; /* maintain the queue */ if (q->head == NULL) // same element q->head = newt; else q->tail->next = newt; q->tail = newt; q->size++; return true; } ``` #### q_free() * 由於是整個 queue 要 free，沒有必要保留 head，所以透過移動 head 來依序移除 * 字串在儲存時有經由 `strdup()` 分配記憶體，所以要手動 free * 最後所有節點 free 完後，head 和 tail 也就不復存在，可以直接 free 掉 queue > 犯錯的地方： > * 只要有參照指標 q 的地方，都應該檢查 q 是否為 NULL > * 雖然 free(NULL) 是可行的，但在 while 裡先使用到 `q->head`，所以會先失敗 ```cpp= void q_free(queue_t *q) { if (q == NULL) return; list_ele_t *tmp; while (q->head != NULL) { // next element tmp = q->head; q->head = tmp->next; free(tmp->value); // strdup() free(tmp); } /* nodes all freed */ free(q); } ``` #### q_remove_head() * 要移除 head 一定要先檢查 head 存不存在 * 這邊要將刪除的 head 所存的字串複製到終點 `sp`，同時有限制複製的字元數不超過 `bufsize` * 考慮到超過 `bufsize` 的字串不會複製到 `'\0'`，這邊是先將 `sp` 統一設為 `'\0'`，再將 `bufsize - 1` 以內的字元複製過去 * 刪除節點同樣需要更新 size > 犯錯的地方： > * 一開始只考慮到檢查 head 而已，忽略了 q 為 NULL 的情況 > * 字串複製時忽略了 sp 為 NULL 時，會造成無法寫入的情形 ```cpp= bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (q == NULL || q->head == NULL) return false; /* pop the element */ if (sp != NULL) { memset(sp, (int) '\0', bufsize); strncpy(sp, q->head->value, bufsize - 1); } free(q->head->value); /* maintain the queue */ list_ele_t *tmp = q->head; q->head = tmp->next; free(tmp); q->size--; return true; } ``` #### q_reverse() 反轉的部分一開始寫的是後面的版本 (由尾到頭依序)，後來改良的是以下版本 (由頭到尾依序) * 改良版的運作如下圖，透過三個指標 `prev`、`head` 和 `next` 來完成交換。 * 迴圈前為圖 1, 2，先記錄一開始的 head 為 `newt`，需要考慮最後完成的 tail 要接 NULL * 迴圈操作為圖 3, 4 1. 記錄 head 下一個節點為 next 2. 將 head 連接至 prev，完成反轉 3. 將 prev 移至 head 4. 將 head 移至 next，繼續下一輪 ![](https://i.imgur.com/eCiOlQd.png) * 完成情況如下 1. head 移到原 tail 的下一個 (NULL)，跳出迴圈 2. 因為是透過 head 來遍尋 queue，所以將 head 移至 prev (原 tail) 3. 將 tail 移至一開始記錄的 newt，完成整個 queue 的反轉 ![](https://i.imgur.com/D5jQqrx.png) ```cpp= void q_reverse(queue_t *q) { if (q == NULL) return; /* start from head to tail link current(head) and prev then jump to next */ list_ele_t *newt = q->head; list_ele_t *prev = NULL; while (q->head != NULL) { list_ele_t *next = q->head->next; q->head->next = prev; prev = q->head; q->head = next; } q->head = prev; // prevent NULL q->tail = newt; // origin head } ``` * 原始版的運作如下圖，透過兩個指標 `tmp` 和 `tail` 來完成交換，缺點是每次都需要從頭找到 tail 前一個 * 首先記錄一開始的 tail 為 `newh` * 迴圈操作為圖 2, 3, 4 1. tmp 從 head 開始遍尋 2. 直到下一個節點為 tail 時停下來 3. 將 tail 連接至 tmp，完成反轉 4. tail 移至 tmp，作為下一次的中斷點 ![](https://i.imgur.com/E8SwpMp.png) * 完成情況如下 1. tail 與 head 重合時，跳出迴圈 2. 將目前 head (新 tail) 連接至 NULL 3. 將 head 移至一開始記錄的 newh，完成整個 queue 的反轉 ![](https://i.imgur.com/EfzivCA.png) ```cpp= /* start from tail to head */ list_ele_t *newh = q->tail; while (q->tail != q->head) { list_ele_t *tmp = q->head; while (tmp->next != q->tail) tmp = tmp->next; q->tail->next = tmp; // reverse the link of tail q->tail = tmp; } q->head->next = NULL; q->head = newh; // origin tail ``` ## 自動評分系統 ### makefile * 執行 `make test` 會等同執行 makefile 裡以下指令 * 第一列是代表相依的檔案，第二列為執行的命令 ``` test: qtest scripts/driver.py scripts/driver.py ``` ### scripts/driver.py * 執行 `driver.py` 後，首先會檢查有沒有後續的引數 ```python if __name__ == "__main__": run(sys.argv[0], sys.argv[1:]) ``` * 提供了幾種引數，`-p` 可以決定待測試的程式，`-t` 挑選測資題目，`-v` 決定細節顯示 (0~3 越高越細)，或是 `--valgrind` 以 valgrind 開啟程式 > 有個隱藏引數是 `-A`，開啟 autograde (以 JSON 字串印出)，但因為過程中就會自動計算分數，所以預設不會用到 ```python print("Usage: %s [-h] [-p PROG] [-t TID] [-v VLEVEL] [--valgrind]" % name) print(" -h Print this message") print(" -p PROG Program to test") print(" -t TID Trace ID to test") print(" -v VLEVEL Set verbosity level (0-3)") ``` ### scripts/driver.py/run() * 執行 `run` 後，主要處理上述引數 * 預設情況下會執行以下片段建立 class 並執行 function `run` ```python t = Tracer(qtest = "", verbLevel = 1, autograde = False, useValgrind = False) t.run(0) ``` ### scripts/driver.py/Tracer.run() * 根據是否有執行 valgrind，腳本的指令也會相對不同 * 上半部對應到的是 `valgrind ./qtest` * 下半部對應到的是 `./qtest` > 事實上下半部的 command 可以不用再次賦值，預設就是 `./qtest`，不過這樣可以增加可讀性 ```python if self.useValgrind: self.command = 'valgrind' else: self.command = self.qtest self.qtest = '' ``` * 在預設 `tid` 未指定測資的情況下，會依序跑完所有測資 ```python for t in tidList: tname = self.traceDict[t] if self.verbLevel > 0: print("+++ TESTING trace %s:" % tname) ok = self.runTrace(t) ``` * 分數是透過每個測資的滿分加總決定，無論是錯在該題的哪個地方，該題都是以 0 分計算，所以會需要 `verbLevel` 幫助了解 ```python maxval = self.maxScores[t] tval = maxval if ok else 0 print("---\t%s\t%d/%d" % (tname, tval, maxval)) score += tval maxscore += maxval scoreDict[t] = tval ``` ### scripts/driver.py/Tracer.runTrace() * 依序將前面所得的對應指令結合，便完成對應的腳本，再透過 python 的 `subprocess.call` 來呼叫 qtest > 這邊事實上也可以將 `fname` 的部份全部預設寫在 dict 裡面，拆開來只是為了精簡過程中顯示的文字 ```python fname = "%s/%s.cmd" % (self.traceDirectory, self.traceDict[tid]) vname = "%d" % self.verbLevel clist = [self.command, self.qtest, "-v", vname, "-f", fname] try: retcode = subprocess.call(clist) ``` ## qtest 行為與技巧 ### C 與 Python 擷取引數的差別 * 在 `qtest.c` 裡，使用的是 [getopt()](http://man7.org/linux/man-pages/man3/getopt.3.html) * 字串中的` h` `v:` `f:` `l:` 即是代表要處理的引數 * 輸入指令時必須是以 `-h`、`-v` 這樣的形式才會擷取 * `:`表示後面需要有對應的資訊，如 `-f test.txt`，該資訊會存在 `optarg` 這個固定的變數 * 成功擷取會回傳對應的字元，失敗則根據情況會回傳 `-1`、`?` 和 `：` ```cpp= int c; while ((c = getopt(argc, argv, "hv:f:l:")) != -1) { switch (c) { case 'h': usage(argv[0]); break; case 'f': strncpy(buf, optarg, BUFSIZE); buf[BUFSIZE - 1] = '\0'; infile_name = buf; break; case 'v': level = atoi(optarg); break; case 'l': strncpy(lbuf, optarg, BUFSIZE); buf[BUFSIZE - 1] = '\0'; logfile_name = lbuf; break; default: printf("Unknown option '%c'\n", c); usage(argv[0]); break; } } ``` * 在 `driver.py` 裡，使用的是 [getopt.getopt()](https://docs.python.org/3.6/library/getopt.html) C-style parser，而 python 也有提供比較強大的模組是 [argparse](https://docs.python.org/3.6/library/argparse.html#module-argparse) * 這邊的長引數 `valgrind` 是透過 list 來儲存，在 C 則需要呼叫 `getopt_long()`，以特別的 struct 來儲存 * 回傳值 `optlist` 和 `args` 可以更改變數名稱，前者以 list 儲存所有擷取到的引數，後者是儲存擷取後剩餘的引數 * 回傳的 list 每個元素都是以 `(option, value)` 形式儲存，`value` 只有在指定 `:` 時才會一併把資訊放入 ```python= optlist, args = getopt.getopt(args, 'hp:t:v:A', ['valgrind']) for (opt, val) in optlist: if opt == '-h': usage(name) elif opt == '-p': prog = val elif opt == '-t': tid = int(val) elif opt == '-v': vlevel = int(val) levelFixed = True elif opt == '-A': autograde = True elif opt == '--valgrind': useValgrind = True else: print("Unrecognized option '%s'" % opt) usage(name) ``` ### 測試過程 #### qtest.c * qtest 也提供了幾種可用的引數，`-f` 和 `-v` 對應到上述自動評分提及的腳本與細節顯示，`-l` 則是指定紀錄檔存放的位置 ```cpp printf("Usage: %s [-h] [-f IFILE][-v VLEVEL][-l LFILE]\n", cmd); printf("\t-h Print this information\n"); printf("\t-f IFILE Read commands from IFILE\n"); printf("\t-v VLEVEL Set verbosity level\n"); printf("\t-l LFILE Echo results to LFILE\n"); ``` * 執行 qtest 後會先根據需要的 [signal](http://man7.org/linux/man-pages/man7/signal.7.html) 註冊 signal handler * 這邊只有考慮兩種事件，`SIGSEGV` 處理記憶體錯誤，`SIGALRM` 則處理定時器 ```cpp static void queue_init() { fail_count = 0; q = NULL; signal(SIGSEGV, sigsegvhandler); signal(SIGALRM, sigalrmhandler); } ``` * handler 裡面會直接呼叫 `harness.c` 的 `trigger_exception()`，注意 handler 會有固定一個參數儲存信號 ```cpp void sigsegvhandler(int sig) { trigger_exception( "Segmentation fault occurred. You dereferenced a NULL or invalid " "pointer"); } ``` * 在每個與 `queue.c` 對應的函式裡，扣除引數的檢查後，都會先檢查 q 是否為 NULL * 接著執行 `harness.c` 的 `error_check()`，因為尚未測試 queue，這邊忽略回傳值，目的只是重設 `error_occurred` ```cpp if (q == NULL) report(3, "Warning: Calling insert tail on null queue"); error_check(); ``` ```cpp bool error_check() { bool e = error_occurred; error_occurred = false; return e; } ``` * 以 `do_new()` 為例，完成上述步驟後會先設置一個存檔點，`true` 表示允許開啟定時器，~~畢竟闖關就是要存檔~~，完成後就會解除定時器 ```cpp if (exception_setup(true)) q = q_new(); exception_cancel(); ``` * 部份測試會開啟額外的模式，如做完 `do_free()` 會開啟 `harness.c` 的 `cautious_mode`，確保任何要 free 的區塊目前都 allocated，但似乎不會用到 (?) > 影響到 `find_header()`，只有 `test_free()` 用到，但沒有呼叫到 `test_free()` 的機會 ```cpp set_cautious_mode(true); ``` * `do_reverse()` 也會開關 `noallocate_mode`，開啟的情況下會禁止 `test_malloc()` 和 `test_free()` > 一樣沒有呼叫到兩個函式的機會 ```cpp set_noallocate_mode(true); if (exception_setup(true)) q_reverse(q); exception_cancel(); set_noallocate_mode(false); ``` #### harness.c * 以下變數是為了處理例外發生，可以搭配參考 [setjmp.h](https://zh.wikipedia.org/wiki/Setjmp.h) * `env` 會儲存當下的 stack 資訊，宣告雖然可以用 `jmp_buf`，但可以改成 `sigjmp_buf` 比較有一致性 * `volatile` 表示該變數能夠在 set 和 jump 之間存取，且**不會套用編譯器的最佳化**，避免影響變數的更新 * `sig_atomic_t` 表示變數的更動需要在一個指令內完成，所以兩個關鍵字常一併出現 * `time_limited` 表示是否有開啟定時器 * `time_limit` 表示定時器的間隔，預設一秒 ```cpp static jmp_buf env; static volatile sig_atomic_t jmp_ready = false; static bool time_limited = false; static int time_limit = 1; ``` * 當 handler 捕捉到對應信號時，記錄要顯示的錯誤資訊，並使用 [siglongjmp()](https://linux.die.net/man/3/siglongjmp) 跳回上一個存檔點 ```cpp= void trigger_exception(char *msg) { error_occurred = true; error_message = msg; if (jmp_ready) siglongjmp(env, 1); else exit(1); } ``` * 這邊存檔點的設置使用的是 [sigsetjump()](https://linux.die.net/man/3/sigsetjmp)，算是 setjump() 的改良版，差別在可以處理信號內容 * 比較複雜的 context switch 可以參考 [setcontext()](https://linux.die.net/man/2/setcontext) * `savesigs` 的位置傳入 1，根據 [signal](http://man7.org/linux/man-pages/man7/signal.7.html)，表示能夠處理 SIGHUP 程序中斷的情況 * 建立存檔點時，固定呼叫 `exception_setup(true)` 1. 先經由 if 建立存檔點，`sigsetjmp()` 成功會是回傳 0，跳到 else 2. 允許 jump 並開啟定時器 3. 因定時器間隔一秒，正常是還不會觸發 `SIGALRM` 就回傳 `true` * 假設在執行 queue.c 的函式時發生 `SIGSEGV` 1. 捕捉到信號會立即執行 `trigger_exception()`，記錄對應的訊息 2. 接著 `siglongjmp()` 會回到上一個存檔點 `sigsetjmp()` 3. jump 回來會使得回傳值非 0，進入 if 4. 禁止 jump 並關閉定時器 5. 印出對應的錯誤訊息並跳出 6. 因為 false 使得原本 `qtest.c` 測試的 if 區塊不再執行 ```cpp= bool exception_setup(bool limit_time) { if (sigsetjmp(env, 1)) { /* Got here from longjmp */ jmp_ready = false; if (time_limited) { alarm(0); time_limited = false; } if (error_message) { report_event(MSG_ERROR, error_message); } error_message = ""; return false; } else { /* Got here from initial call */ jmp_ready = true; if (limit_time) { alarm(time_limit); time_limited = true; } return true; } } ``` * 無論執行過程中是否有 `SIGSEGV`，最終都會呼叫 `exception_cancel()`，差別在發生例外的情況下會多做一次而已 ```cpp= void exception_cancel() { if (time_limited) { alarm(0); time_limited = false; } jmp_ready = false; error_message = ""; } ``` #### qtest.c * 執行完測試的指令後，仍會再次執行 `harness.c` 的 `error_check()` * 如果正常執行，那麼仍會維持執行前的 `error_occurred = false` * 反之只要有發生 `trigger_exception()`，就會維持 `error_occurred = true`，使得最終回傳值為 false，表示失敗 ```cpp return ok && !error_check(); ``` ```cpp bool error_check() { bool e = error_occurred; error_occurred = false; return e; } ```