# 2021q1 Homework1 (lab0) contributed by < `DokiDokiPB` > ## 環境建置 - 實作程式碼用的是 macOS Big Sur 系統 - 如果是特定的工具,就會在 Linux 上使用,例如`valgrind` - 在 VSCode 編輯器上實作程式碼: - 使用 InteliSense 輔助編輯 - 工作上是使用 Sourcetree App 操作,但是 git hook 的關係沒辦法使用,改為 terminal 手打指令 commit - 可以透過:beer: Homebrew 安裝 - `cppcheck` - `make` - `clang-format` - `aspell` - 剛開始測試 git commit 時一直沒有觸發 git hook ,[原文](https://hackmd.io/@sysprog/linux2021-lab0#Git-Hooks-進行自動程式碼排版檢查)是使用 `make` 指令後才會安裝至 `.git/hooks` 底下。原文花了一點時間才找到 - 使用 `make` 指令出現找不到 `.git/hooks`,透過 `git init` 重新建立該資料夾 ## 相關資訊 - [My lab0 repo](https://github.com/a1091150/lab0-c) - [lab0 作業要求](https://hackmd.io/@sysprog/linux2021-lab0#-作業要求) - [lab0 作業清單](https://hackmd.io/@sysprog/linux2021-homework1) ## 程式碼實作清單 - [x] queue_t *q_new() - [x] void q_free(queue_t *q) - [x] bool q_insert_head(queue_t *q, char *s) - [x] bool q_insert_tail(queue_t *q, char *s) - [x] bool q_remove_head(queue_t *q, char *sp, size_t bufsize) - [x] int q_size(queue_t *q) - [x] void q_reverse(queue_t *q) - [x] void q_sort(queue_t *q) ### list_ele_t ```cpp typedef struct ELE { char *value; struct ELE *next; struct ELE *prev; } list_ele_t; ``` - 新增 prev 可以直接存取前一個 element ### queue_t ```diff typedef struct { list_ele_t *head; + list_ele_t *tail; + int size; } queue_t; ``` - 新增 ```int size``` - lab0 要求 ```int q_size(queue_t *q)``` function 為 constant time - 透過持續紀錄 size,就能以 $O(1)$ time 方式達成。 - 透過已知 size,可以減少檢查 ```list_ele_t *next ``` 是否為 NULL 的機率。 - 新增 ```list_ele_t tail``` - lab0 要求 ```q_insert_tail(queue_t *q) function 為 constant time``` - 透過紀錄 ```queue_t tail``` 方式達到紀錄 $O(1)$ time. > :::warning > 光新增欄位,是不足以達到題目要求,請詳述相關實作 > :notes: jserv > ::: > > 已修正 > > :heavy_check_mark: DokiDokiPB > > [color=#00a700] ### queue_t *q_new() ```cpp queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (!q) { return NULL; } q->head = NULL; q->tail = NULL; q->size = 0; return q; } ``` ### void q_free(queue_t *q) ```cpp void q_free(queue_t *q) { if (!q) { return; } int size = q->size; list_ele_t *iter = q->head; for (int i = 0; i < size; i++) { list_ele_t *next = iter->next; free(iter->value); free(iter); iter = next; } free(q); } ``` ### bool q_insert_head(queue_t *q, char *s) ```cpp bool q_insert_head(queue_t *q, char *s) { if (!q) { return false; } list_ele_t *newh; newh = malloc(sizeof(list_ele_t)); if (!newh) { return false; } size_t size = strlen(s) + 1; newh->value = malloc(sizeof(char) * size); if (!newh->value) { free(newh); return false; } memset(newh->value, 0, size); memcpy(newh->value, s, size); if (!q->head) { q->head = newh; q->tail = newh; newh->prev = NULL; newh->next = NULL; } else { newh->next = q->head; q->head->prev = newh; q->head = newh; newh->prev = NULL; } q->size++; return true; } ``` - 在 lab0 中已提示使用 strcpy, strncpy 會有潛在風險 - 我選擇使用 memset 方式直接初始化數值、memcpy 方式複製數值。 ### bool q_insert_tail(queue_t *q, char *s) ```cpp bool q_insert_tail(queue_t *q, char *s) { if (!q) { return false; } list_ele_t *newh; newh = malloc(sizeof(list_ele_t)); if (!newh) { return false; } size_t size = strlen(s) + 1; newh->value = malloc(sizeof(char) * size); if (!newh->value) { free(newh); return false; } memset(newh->value, 0, size); memcpy(newh->value, s, size); if (!q->size) { q->head = newh; q->tail = newh; newh->prev = NULL; newh->next = NULL; } else { q->tail->next = newh; newh->prev = q->tail; q->tail = newh; q->tail->next = NULL; } q->size++; return true; } ``` - 作法與 ```bool q_insert_head(queue_t *q, char *s)``` 類似,只是新增至 tail ### bool q_remove_head(queue_t *q, char *sp, size_t ```cpp= bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (!q) { return false; } if (q->size == 0) { return false; } list_ele_t *tmp = q->head; if (sp) { size_t size = strlen(tmp->value); size_t fixbufsize = bufsize - 1; size_t fixsize = size > fixbufsize ? fixbufsize : size; memset(sp, 0, bufsize); memcpy(sp, tmp->value, fixsize); } q->head = q->head->next; q->size--; if (q->size != 0) { q->head->prev = NULL; } free(tmp->value); free(tmp); return true; } ``` - 沒有考慮當 queue empty 時不應該去執行第 23 行程式碼,以至於測試時出現 Segmentation fault. ### void q_reverse(queue_t *q) ```cpp void q_reverse(queue_t *q) { if (!q) { return; } if (!q->size) { return; } int half = q->size >> 1; list_ele_t *next = q->head; list_ele_t *prev = q->tail; for (int i = 0; i < half; i++) { char *tmp = next->value; next->value = prev->value; prev->value = tmp; next = next->next; prev = prev->prev; } } ``` - 我透過互換數值的方式,而不是修改 ```list_ele_t *next, prev```,例如: - ```q->head->value``` 跟 ```queue->tail->value``` 互換 - ```q->head->next->value``` 跟 ```queue->tail->prev->value``` 互換,以此類推。 ### void q_sort(queue_t *q) #### 版本一的 q_sort ```cpp int compare(const void *a, const void *b) { char *ll = *(char **) a; char *rr = *(char **) b; return strcmp(ll, rr); } void q_sort(queue_t *q) { if (!q) { return; } if (q->size == 0 || q->size == 1) { return; } int size = q->size; char *iter[size]; list_ele_t *foo = q->head; for (int i = 0; i < size; i++) { iter[i] = foo->value; foo = foo->next; } qsort(iter, size, sizeof(char *), compare); foo = q->head; for (int i = 0; i < size; i++) { foo->value = iter[i]; foo = foo->next; } } ``` - 由於時間有限,我直接使用 C 語言內建的 qsort 作為代替 - 有記憶體不足的問題 #### 版本二 這一次使用的是 Merge Sort,並依照 lab0 的提示實作 Linked List 版本。 - 相比 lab0 中提供的方法,該並未利用在```queue_t``` 中新增的 ```size```,透過已知 size 的方式實作,我認為可以省下很多 if-else 的消耗。 - 使用 LinkedList 的特性,可以直接變更 ```queue_t prev``` 跟 ```queue_t next``` 的方式,使得不需要更多 memory 去暫時儲存該 memory address。 - 我參考很多程式碼是使用 merge sort,應該說作業區提供的寫法就有 Merge Sort - 它的 Time Complexity 為 $O(n \log n)$ - n 為 queue size ```cpp void q_sort(queue_t *q) { mergeSort(q); } ``` ##### void mergeSort(queue_t *q) ```clike= void mergeSort(queue_t *q) { if (!q || !q->head || !q->tail) { return; } if (q->size <= 1) { return; } int half = q->size >> 1; list_ele_t *mid = q->head; for (int i = 0; i < half; i++) { mid = mid->next; } // 拆成兩份 queue_t queue_t left = {.head = q->head, .tail = mid->prev, .size = half}; queue_t right = {.head = mid, .tail = q->tail, .size = q->size - half}; mergeSort(&left); mergeSort(&right); merge(&left, &right); // merge function 會把結果儲存在 queue_t left 中. q->head = left.head; q->tail = left.tail; } ``` - ```queue_t``` 中有 size 可以使用,我認為使用 for-loop 找尋中間位置所需時間,比起迭代 next、檢查是否為 NULL 要少。 - 將原本 queue 分成兩半,以 ```queue_t left``` 跟 ```queue_t right``` ,並且以遞迴方式排序 ```queue_t left``` 跟 ```queue_t right``` - 使用 ```merge``` function 合併 ```queue_t left``` 跟 ```queue_t right``` ##### void merge(queue_t *left, queue_t *right) ```clike= void merge(queue_t *left, queue_t *right) { if (!left || !right || left == right) { return; } queue_t *result = left; // 將排序好的結果儲存至 left 中 int lsize = left->size; int rsize = right->size; int size = lsize + rsize; list_ele_t *sourcehead = left->head->prev; // 儲存原本 queue 的順序,在排序好後接起 list_ele_t *sourcetail = right->tail->next; // 儲存原本 queue 的順序,在排序好後接起 list_ele_t *iter = NULL; list_ele_t *liter = left->head; list_ele_t *riter = right->head; int lindex = 0; int rindex = 0; // 決定 queue head if (strcmp(liter->value, riter->value) < 0) { // L iter = liter; liter = liter->next; lindex++; } else { // R iter = riter; riter = riter->next; rindex++; } // 將新的 head 接回原本的 queue if (sourcehead) { sourcehead->next = iter; iter->prev = sourcehead; } else { iter->prev = NULL; } result->head = iter; // 比較數值 for (int i = 1; i < size; i++) { if (lindex >= lsize) { iter->next = riter; riter->prev = iter; iter = riter; riter = riter->next; rindex++; } else if (rindex >= rsize) { iter->next = liter; liter->prev = iter; iter = liter; liter = liter->next; lindex++; } else { if (strcmp(liter->value, riter->value) < 0) { // L iter->next = liter; liter->prev = iter; iter = liter; liter = liter->next; lindex++; } else { // R iter->next = riter; riter->prev = iter; iter = riter; riter = riter->next; rindex++; } } } // 將新的 tail 接回原本的 queue if (sourcetail) { iter->next = sourcetail; sourcetail->prev = iter; } else { iter->next = NULL; } // 將結果儲存至 queue_t left 中 result->tail = iter; result->size = size; } ``` ## Address Sanitizer - 作業本身存在一些小細節沒有修正,透過 Address Sanitizer 發現 - 在 macOS 以及 Linux 皆支援 Address Sanitizer - ```make clean``` 後使用 ```make SANITIZER=1``` 可開啟該功能 - 在程式碼實作完後,出現 ```zsh ==13778==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00010a7b1880 at pc 0x00010a796760 bp 0x7ffee546f2d0 sp 0x7ffee546f2c8 READ of size 4 at 0x00010a7b1880 thread T0 #0 0x10a79675f in do_option_cmd console.c:370 #1 0x10a797cf1 in interpret_cmda console.c:222 #2 0x10a797416 in interpret_cmd console.c:245 #3 0x10a79721a in cmd_select console.c:595 #4 0x10a7979dc in run_console console.c:668 #5 0x10a792b95 in main qtest.c:780 #6 0x7fff6d712cc8 in start+0x0 (libdyld.dylib:x86_64+0x1acc8) 0x00010a7b1881 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:21:6' (0x10a7b1880) of size 1 'simulation' is ascii string '' SUMMARY: AddressSanitizer: global-buffer-overflow console.c:370 in do_option_cmd ``` - AddressSanitizer 已經提示 simulation,行數也標示很清楚,透過文件搜尋,可以找到: ```clike= // console.c bool simulation = false; // ...... add_param("simulation", (int *)&simulation, "Start/Stop simulation mode", NULL); ``` - size of bool 與 size of int 不為同一個大小,AddressSanitizer 提示會有潛在記憶體問題。 - 修改以下程式碼,即可解決錯誤: ```clike= // console.c int simulation = 0; // console.h extern int simulation; ``` ### 版本一 > 使用版本:[lab0 作業版本 v0.1](https://github.com/a1091150/lab0-c/releases/tag/v0.1) > 這個版本使用 C 語言內建提供的 qsort 實作 q_sort > [color=#000000] #### valgrind - 第一次實作完程式碼後,使用 Linux 系統並使用 ```make valgrind``` 出現以下訊息: ``` ➜ lab0-c git:(master) make valgrind ==32600== Can&apos;t extend stack to 0x1ffe7ffe18 during signal delivery for thread 1: ==32600== no stack segment ==32600== ==32600== Process terminating with default action of signal 11 (SIGSEGV) ==32600== Access not within mapped region at address 0x1FFE7FFE18 ==32600== at 0x10E5FC: q_sort (queue.c:228) ``` - 推斷原因:在原程式碼中,我會先將 linked list 轉換成 Array,使得其使用更多的記憶體來實作功能,結果導致記憶體不足。 #### 檢測到 still reachable in loss record memory - 在使用版本一的程式碼後,遇到其它種類的問題: ``` ==32600== 32 bytes in 1 blocks are still reachable in loss record 5 of 33 ==32600== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==32600== by 0x10C8F8: malloc_or_fail (report.c:189) ==32600== by 0x10D388: add_cmd (console.c:125) ==32600== by 0x10D4A5: init_cmd (console.c:98) ==32600== by 0x10C0D1: main (qtest.c:762) ``` ### 版本二 > 使用版本:[lab0 作業版本 v0.2](https://github.com/a1091150/lab0-c/releases/tag/v0.2) > 依據作業討論區提示,自行實作 Merge Sort 版本的 q_sort > [color=#000000] - 修復 AddressSanitizer 提示 simulation、重新實作 q_sort 後,使用 ```valgrind``` 沒有出現任何訊息。 ## 其它測試 > 使用版本:[lab0 作業版本 v0.2](https://github.com/a1091150/lab0-c/releases/tag/v0.2) > 依據作業討論區提示,自行實作 Merge Sort 版本的 q_sort > [color=#000000] - 參考以下筆記並使用指令 - [hankluo6 筆記](https://hackmd.io/@hankluo6/2020q3-lab0#Valgrind) - [akamayu-ouo 筆記](https://hackmd.io/@akamayu-ouo/sysprog21-lab0#Valgrind) - [gprof 筆記](https://hackmd.io/@qqMMSyj8SpeDLlB4jjxvPA/HyFU355EQ) - [GNU prof](https://sourceware.org/binutils/docs/gprof/) 這個筆記使用 gprof 判斷 function 是否有被執行。 - [```valgrind -tool=callgrind```](https://shininglionking.blogspot.com/2018/03/performance-profiler-callgrind.html) - ```valgrind ./qtest -f traces/trace-01-ops.cmd ``` ```zsh ➜ lab0-c git:(master) ✗ valgrind ./qtest -f traces/trace-01-ops.cmd ==10532== 4 bytes in 1 blocks are still reachable in loss record 1 of 3 ==10532== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==10532== by 0x4A5050E: strdup (strdup.c:42) ==10532== by 0x110220: linenoiseHistoryAdd (linenoise.c:1236) ==10532== by 0x110DB3: linenoiseHistoryLoad (linenoise.c:1325) ==10532== by 0x10C22C: main (qtest.c:769) ==10532== ==10532== 137 bytes in 19 blocks are still reachable in loss record 2 of 3 ==10532== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==10532== by 0x4A5050E: strdup (strdup.c:42) ==10532== by 0x110194: linenoiseHistoryAdd (linenoise.c:1236) ==10532== by 0x110DB3: linenoiseHistoryLoad (linenoise.c:1325) ==10532== by 0x10C22C: main (qtest.c:769) ==10532== ==10532== 160 bytes in 1 blocks are still reachable in loss record 3 of 3 ==10532== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==10532== by 0x1101E0: linenoiseHistoryAdd (linenoise.c:1224) ==10532== by 0x110DB3: linenoiseHistoryLoad (linenoise.c:1325) ==10532== by 0x10C22C: main (qtest.c:769) ==10532== ``` - 推測以檔案輸入時,該 linenoiseHistoryAdd 中的記憶體沒有被釋放。 - 依據透過文字搜尋相關特定字串如 free, history 可以獲得 linenoise.c 中 ```freeHistory``` function,反向尋找 caller ```click= //...... static int enableRawMode(int fd) { struct termios raw; if (!isatty(STDIN_FILENO)) goto fatal; if (!atexit_registered) { atexit(linenoiseAtExit); atexit_registered = 1; } if (tcgetattr(fd, &orig_termios) == -1) goto fatal; //...... } ``` - linenoiseAtExit 會執行 freeHistory 釋放 history 的記憶體 - 透過 ```man atexit``` 查詢該功能,表示程式 exit 時會執行註冊的 function - ```enableRawMode``` 必須至少讀取一次 cmd 指令才會註冊 ```linenoiseAtExit```,如果使用檔案輸入的方式,就不會註冊 atexit. ``` DESCRIPTION The atexit() function registers the given function to be called at normal process termination, either via exit(3) or via return from the program's main(). Functions so registered are called in the reverse order of their registration; no arguments are passed. The same function may be registered multiple times: it is called once for each registration. POSIX.1 requires that an implementation allow at least ATEXIT_MAX (32) such functions to be registered. The actual limit supported by an implementation can be obtained using sysconf(3). When a child process is created via fork(2), it inherits copies of its parent's registrations. Upon a successful call to one of the exec(3) functions, all registrations are removed. ``` #### 解決方式 - 我認為 - qtest.c 的目的是測試 queue_t 的實作,而不是設定一些 console 相關的變數,例如 linenoise - console.c 使用 linenoise 功能,它就有必要負責 linenoise - 或是 linenoise 缺少一些檢查,例如總是註冊 ```atexit(linenoiseAtExit);``` - 我選擇的解決方式為:在 ```linenoiseHistorySetMaxLen``` function 新增以下程式碼 ```diff int linenoiseHistorySetMaxLen(int len) { ///... + if (!atexit_registered) { + atexit(linenoiseAtExit); + atexit_registered = 1; + } return 1; } ``` #### 使用 valgrind - 使用指令 ```valgrind ./qtest -f traces/trace-01-ops.cmd ``` - 再度使用指令 ```kcachegrind callgrind.out.[PID] ``` - 可以取得該 function call graph --- ## [Coroutine](https://hackmd.io/@sysprog/linux2021-lab0#Coroutine) - [子共筆](https://hackmd.io/@DokiDokiPB/largeWebServer) ### [Dude, is my code constant time?](https://hackmd.io/@sysprog/linux2021-lab0#-論文%E3%80%88Dude-is-my-code-constant-time〉重點提示)(未實作內容) > [time=Sun, Mar 7, 2021 12:01 AM] > [Note](https://hackmd.io/r_1ECy8ASOuzHPRB4tfwKA#以-Address-Sanitizer-偵測記憶體錯誤) > Address Sanitizer, Valgrind, Tiny-web-server > [Note](https://hackmd.io/@akamayu-ouo/sysprog21-lab0#Valgrind) > valgrind: fix memory leak on ```qtest -f <FILENAME>``` > 解釋 select 系統 > [Note](https://hackmd.io/@hankluo6/2021q1-lab0#整合-tiny-web-server-至-lab0) > Tiny-web-server > 解釋 select 系統