# 2020q3 Homework1 (lab0) contributed by < `tsengsam` > [lab0](https://hackmd.io/@sysprog/2020-lab0) --- #### 學習目標 - [x] 不定個數參數的處理, signal, setjmp/longjmp 及記憶體管理 - [x] GNU/Linux 開發工具 + Cppcheck + Valgrind + Makefile - [x] Git - [x] 樹立易於協作的開發規範 + ClangFormat - [x] 研究自動測試機制 - [x] 完成 lab0-c + queue.c + queue.h + qtest.c - [ ] 論文 ## 【開發環境】 ```shell $ uname -a Linux HOSTNAME 5.4.0-47-generic #51~18.04.1-Ubuntu SMP Sat Sep 5 14:35:50 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux $ gcc --version gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 Copyright (C) 2017 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. ``` ## 【開發工具】 ### [Cppcheck](http://cppcheck.sourceforge.net/manual.pdf) * 靜態程式碼分析工具,分析如 `.c` 檔,偵測 undefined behaviour ```cpp $ cppcheck file.c ``` ### [Valgrind](https://valgrind.org/docs/manual/manual.html) * 動態程式碼分析工具**們** * 作業中特別用到其中一個工具 [Massif](https://valgrind.org/docs/manual/ms-manual.html) 來量測 heap memory 的使用情形,和 stack memory 除了儲存的結構不同外,其儲存的類型、大小上限也不同可參考 [Heap vs Stack](https://gribblelab.org/CBootCamp/7_Memory_Stack_vs_Heap.html) 和 [使用方法](https://valgrind.org/docs/manual/ms-manual.html) ### Makefile * 適合在有很多程式檔的專案,若專案中一部分程式碼被修改則可以只編譯被修改的程式 * Makefile 檔案的主要語法如: ```shell= target: dependencies <tab>commands ``` `target`: 指的是要建立的目標檔案 `dependencies`: 則是在建立 target 之前所檢查的檔案,若發現 target 比它還新,那就不會再重新建立 target 以避免不必要的編譯動作 `command`: 表示要進行的 shell 指令,注意 具體像是: ```shell= test: test.c gcc test.c -o test ``` 更詳細參考 [Makefile 語法和示範](https://hackmd.io/@sysprog/gnu-linux-dev/https%3A%2F%2Fhackmd.io%2Fs%2FSySTMXPvl) 或 [Makefile 語法簡介](https://sites.google.com/site/mymakefile/makefile-yu-fa-jian-jie) ### [ClangFormat](https://clang.llvm.org/docs/ClangFormat.html) * 這是有助於建立一致程式碼風格的好用工具,手動執行的方式: ```shell $ clang-format -i *.[ch] ``` * 這個功能也能和 vim 整合,使得每次儲存 vim 檔案後會自動執行,方法為進入到 .vimrc 檔案加入以下設定: ```shell function! Formatonsave() let l:formatdiff = 1 py3f <path to your clang-format.py>/clang-format.py endfunction autocmd BufWritePre *.h,*.hpp,*.c,*.cc,*.cpp call Formatonsave() ``` ### Git * pre-commit hook * tig * commit editor * GNU Aspell ## 【修改程式】 * 完整程式碼 [GitHub](https://github.com/tsengsam/lab0-c) #### 結構定義 ```cpp /* Linked list element */ typedef struct ELE { char *value; struct ELE *next; } list_ele_t; /* Queue structure */ typedef struct { list_ele_t *head; /* Linked list of elements */ list_ele_t *tail; int size; } queue_t; ``` ### q_new Queue 的初始化,包含配置記憶體 ### q_free 要記得釋放結構內字串的記憶體空間 ```cpp= /* Free all storage used by queue */ void q_free(queue_t *q) { /* Free queue structure */ if (!q) return; while (q->head) { list_ele_t *tmp = q->head; free(tmp->value); q->head = q->head->next; free(tmp); } free(q); } ``` ### q_insert_head ```cpp= bool q_insert_head(queue_t *q, char *s) { if (!q) return false; list_ele_t *newh; newh = malloc(sizeof(list_ele_t)); if (!newh) return false; /* Don't forget to allocate space for the string and copy it */ char *tmp = malloc(sizeof(char) * strlen(s) + 1); if (!tmp) { free(newh); return false; } newh->value = tmp; strncpy(tmp, s, strlen(s) + 1); /* What if either call to malloc returns NULL? */ newh->next = q->head; q->head = newh; q->size += 1; if (q->size == 1) { q->tail = newh; } return true; } ``` ### q_insert_tail * 為使執行時間在 O(1),在 `queue_t` 定義了 `tail` 指標 ```cpp= bool q_insert_tail(queue_t *q, char *s) { if (!q) return false; list_ele_t *newt = malloc(sizeof(list_ele_t)); if (!newt) return false; newt->value = malloc(sizeof(char) * strlen(s) + 1); if (!newt->value) { free(newt); return false; } strncpy(newt->value, s, strlen(s) + 1); newt->next = NULL; if (!q->tail) { q->tail = newt; q->head = newt; } else { q->tail->next = newt; q->tail = newt; } q->size++; return true; } ``` ### q_remove_head ```cpp= bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (!q || !q->head) return false; list_ele_t *tmp = q->head; q->head = q->head->next; if (sp) { strncpy(sp, tmp->value, bufsize - 1); sp[bufsize - 1] = '\0'; } free(tmp->value); free(tmp); q->size--; return true; } ``` ### q_size 取用結構內儲存的 size 變數使執行時間複雜度為常數級 ### q_reverse * 程式內的註解是否指希望 reuse 先前函式,嘗試重複使用函式但會跳出錯誤訊息。 ```shell /* * Reverse elements in queue * No effect if q is NULL or empty * This function should not allocate or free any list elements * (e.g., by calling q_insert_head, q_insert_tail, or q_remove_head). * It should rearrange the existing ones. */ ``` ![](https://i.imgur.com/9U83CoY.png) ```cpp= void q_reverse(queue_t *q) { if (!q || q->size == 0) return; list_ele_t *current = q->head; list_ele_t *next = current->next; list_ele_t *tmp; q->tail = q->head; q->tail->next = NULL; while (next) { tmp = next->next; next->next = current; current = next; next = tmp; } q->head = current; } ``` ### q_sort * 照著 [文章](https://npes87184.github.io/2015-09-12-linkedListSort/) 的 merge sort 進行實作後發現有一題測資沒過 觀察測資的操作後猜測與節點數量太多、使用遞迴導致執行過長或記憶體錯誤有關,需要再使用工具進行分析與修正。 ```cpp= void q_sort(queue_t *q) { if (!q || q->size <= 1) return; q->head = merge_list(q->head); q->tail = q->head; for (; q->tail->next; q->tail = q->tail->next) { /*nothing to do here*/ } } /* * Apply divide and conquer strategy merge sort * Split into two lists recursivly then merge them */ list_ele_t *merge_list(list_ele_t *head) { if (!head || !head->next) return head; list_ele_t *slow = head; list_ele_t *fast = head->next; while (fast && fast->next) { slow = slow->next; fast = fast->next->next; } fast = slow->next; slow->next = NULL; list_ele_t *ll = merge_list(head); list_ele_t *rl = merge_list(fast); return merge(ll, rl); } list_ele_t *merge(list_ele_t *ll, list_ele_t *rl) { if (!ll) return rl; if (!rl) return ll; if (strcmp(ll->value, rl->value) > 0) { rl->next = merge(ll, rl->next); return rl; } else { ll->next = merge(ll->next, rl); return ll; } } ``` ![](https://i.imgur.com/arbDOBU.png) * 使用 massif 測試 事後發現,其實 massif 是用來檢查 memory leak,而這邊的問題應是和 sort 有關,且函式內並沒有進行 `malloc` 分配記憶體,而是重複宣告變數及遞迴,故應該改使用測量 stack overflow 的工具才對,不過就當作學習使用方法。 ```script $ valgrind --tool=massif --time-unit=B ./qtest -v 3 -f traces/trace-15-perf.cmd $ ./qtest -v 3 -f traces/trace-15-perf.cmd ``` ![](https://i.imgur.com/teedo5Q.png) * 執行 `$ make valgrind` 之後得到如下的更多資訊,看到關於 Stack overflow, SIGSEGV, merge 等關鍵字後就大致確定是我的 merge 函式導致記憶體錯誤 ```shell= +++ TESTING trace trace-15-perf: # Test performance of insert_tail, size, reverse, and sort ==17332== Stack overflow in thread #1: can't grow stack to 0x1ffe801000 ==17332== Stack overflow in thread #1: can't grow stack to 0x1ffe801000 ==17332== Can't extend stack to 0x1ffe8010a8 during signal delivery for thread 1: ==17332== no stack segment ==17332== ==17332== Process terminating with default action of signal 11 (SIGSEGV) ==17332== Access not within mapped region at address 0x1FFE8010A8 ==17332== Stack overflow in thread #1: can't grow stack to 0x1ffe801000 ==17332== at 0x10CE4E: merge (queue.c:209) ==17332== If you believe this happened as a result of a stack ==17332== overflow in your program's main thread (unlikely but ==17332== possible), you can try to increase the size of the ==17332== main thread stack using the --main-stacksize= flag. ==17332== The main thread stack size used in this run was 8388608. ==17332== Stack overflow in thread #1: can't grow stack to 0x1ffe801000 ... ``` * 我再使用這個專案裡提供的另一個功能 `debug.py`,發現就是執行 GNU Debugger (GDB),我嘗試對 size 足夠大的 queue 做排序後程式會在第 209 行後終止,而相對應的程式碼正好就是在準備要再次遞迴的 `merge` 函式,與上面 make valgrind 的結果一樣 ```shell $ ./scripts/debug.py -d Reading symbols from /home/shan/Linux2020/lab0-c/qtest...done. Signal Stop Print Pass to program Description SIGALRM No No No Alarm clock Starting program: /home/shan/Linux2020/lab0-c/qtest cmd> new cmd> new q = [] cmd> ih hello 1000000 cmd> ih hello 1000000 q = [hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello ... ] cmd> sort cmd> sort Program received signal SIGSEGV, Segmentation fault. 0x0000555555558e4e in merge (ll=0x55555b1793d0, rl=rl@entry=0x55555946bbd0) at queue.c:209 209 if (strcmp(ll->value, rl->value) > 0) { (gdb) ... ``` * 於是將 `merge` 函式修改成迭代作法: ```cpp= list_ele_t *merge(list_ele_t *ll, list_ele_t *rl) { list_ele_t *newh = NULL; list_ele_t **cursor = &newh; while (ll && rl) { if (strcmp(ll->value, rl->value) > 0) { *cursor = rl; rl = rl->next; } else { *cursor = ll; ll = ll->next; } cursor = &(*cursor)->next; } *cursor = ll ? ll : rl; return newh; } ``` --- ###### tags: `linux2020`