# 2020q1 Homework1 (lab0) contributed by < `ignite1771` > ## 開發環境 - Host OS: macOS Mojave 10.14.6 + QEMU/KVM - Guest OS: Ubuntu 18.04.3 LTS Live Server (參考 [@junwei 的筆記](https://hackmd.io/@junwei/Syy7Sc4WU)) ```shell $ uname -a Linux ubuntu_mac_pro 4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux $ gcc --version gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0 ``` ## 預先執行之實驗 - [x] [Valgrind 使用案例](https://hackmd.io/@sysprog/linux2020-lab0#Valgrind-使用案例) ## 實作功能 ### `queue.h` ```cpp typedef struct { list_ele_t *head; /* Linked list of elements */ list_ele_t *tail; int size; } queue_t; ``` ### `queue.c` ### q_new ```cpp queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (!q) return NULL; q->head = NULL; q->tail = NULL; q->size = 0; return q; } ``` ### q_free ```cpp void q_free(queue_t *q) { if (!q) return; while (q->head) { list_ele_t *curh = q->head; q->head = q->head->next; free(curh->value); free(curh); } free(q); } ``` :::danger TODO: 思考如何讓上方程式碼更精簡 (注意 `free`) :notes: jserv ::: > 已在 commit [851e85f](https://github.com/ignite1771/lab0-c/commit/851e85ffc4bce471144d0288966cbbea6c7033ea) 改寫。 [name=ignite1771] ### q_insert_head ```cpp= bool q_insert_head(queue_t *q, char *s) { if (!q) return false; list_ele_t *newh; newh = malloc(sizeof(list_ele_t)); if (!newh) return false; newh->value = malloc(strlen(s) + 1); if (!newh->value) { free(newh); return false; } strncpy(newh->value, s, strlen(s) + 1); newh->next = q->head; if (!q->tail) q->tail = newh; q->head = newh; q->size += 1; return true; } ``` :::warning 自第 10, 15 行起可改寫,降低縮排深度 :notes: jserv ::: > 已在 commit [04e0d96](https://github.com/ignite1771/lab0-c/commit/04e0d960b2a40ee189521fa709a96a397124318d) 改寫。 [name=ignite1771] ### q_insert_tail ```cpp bool q_insert_tail(queue_t *q, char *s) { if (!q) return false; list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!newh) return false; newh->next = NULL; newh->value = malloc(strlen(s) + 1); if (!newh->value) { free(newh); return false; } strncpy(newh->value, s, strlen(s) + 1); if (!q->head && !q->tail) { q->head = newh; q->tail = newh; } else { q->tail->next = newh; q->tail = newh; } q->size += 1; return true; } ``` ### q_remove_head - 加入 `sp[bufsize - 1] = '\0';` 來確保 sp 儲存之字串有 terminating charater ```cpp bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (!q || !q->head) return false; if (sp) { strncpy(sp, q->head->value, bufsize); // Concatenate terminating character if bufsize is not eqaul to the // length of q->head->value sp[bufsize - 1] = '\0'; } list_ele_t *oldh = q->head; q->head = q->head->next; if (!q->head) q->tail = NULL; free(oldh->value); free(oldh); q->size -= 1; return true; } ``` :::info **發問** - 為何我不能將 `if (sp != NULL)` 改寫成 `if (!sp)` ? - 在使用 Cppcheck 工具進行靜態程式碼分析時,會出現以下 message: ``` queue.c:99:17: warning: Either the condition '!sp' is redundant or there is possible null pointer dereference: sp. [nullPointerRedundantCheck] strncpy(sp, q->head->value, bufsize); ^ queue.c:98:9: note: Assuming that condition '!sp' is not redundant if (!sp) ^ queue.c:99:17: note: Null pointer dereference strncpy(sp, q->head->value, bufsize); ^ queue.c:99:17: error: Null pointer dereference [nullPointer] strncpy(sp, q->head->value, bufsize); ^ Fail to pass static analysis. ``` :warning: `if (sp != NULL)` 等價於 `if (sp)`,而非 `if (!sp)`,兩者意思相反 :notes: jserv ::: > 了解,在修改程式碼讓其變簡潔時,我會更注意保持邏輯清晰 [name=ignite1771] ### q_size ```cpp int q_size(queue_t *q) { if (!q || !q->head) return 0; return q->size; } ``` :::warning 考慮以下程式變更: ```diff @@ -1,8 +1,6 @@ int q_size(queue_t *q) { - if (q == NULL || q->head == NULL) { + if (!q || !q->head) return 0; - } else { - return q->size; - } + return q->size; } ``` 要點: 1. 更精簡的縮排和敘述; 2. 日後引入 `likely` 和 `unlikely` 巨集,程式碼會更清晰 3. 上方其他程式碼也可比照修改; :notes: jserv ::: > 已在 commit [04e0d96](https://github.com/ignite1771/lab0-c/commit/04e0d960b2a40ee189521fa709a96a397124318d) 改寫。 [name=ignite1771] ### q_reverse ```cpp void q_reverse(queue_t *q) { if (!q || !q->head) return; list_ele_t *slow = q->head; list_ele_t *fast = q->head->next; list_ele_t *faster; q->tail = slow; slow->next = NULL; while (fast) { faster = fast->next; fast->next = slow; slow = fast; fast = faster; } q->head = slow; } ``` ### q_sort - MIN macro ```cpp #define MIN(a, b) \ { \ a < b ? a : b \ } ``` :::danger 這個 MIN 巨集存在缺陷: 1. 參數 a 和 b 傳遞到巨集定義內,應該要加上 parentheses (小括號) $\to$ 請思考為什麼; 2. a 和 b 實際可能是某些敘述,如 `MIN(++i, i++)`,依據上述巨集的定義方式,就要考慮到 side effect; 因此,在 gcc 中,會這樣定義: ```cpp #define min(X,Y) \ (__extension__ \ ({ \ typeof(X) __x = (X), __y = (Y); \ (__x < __y) ? __x : __y; \ }) \ ) ``` 請討論背後的思考議題 :notes: jserv ::: - [Natural Sort](https://github.com/sourcefrog/natsort) 將 `strnatcmp.h` 及 `strnatcmp.c` 加入 lab0-c 目錄,並修改 `Makefile` OBJS - merge 函式 iterative 版本 + pointer to pointer - 可以通過 trace-15-perf, 不會 stack overflow - 感謝 [jwang0306](https://hackmd.io/@jwang0306/lab0-c) 提供做法以及說明 ```cpp static list_ele_t *merge(list_ele_t *l1, list_ele_t *l2) { list_ele_t *head = NULL; list_ele_t **tmp = &head; while (l1 && l2) { if (strnatcmp(l1->value, l2->value) < 0) { *tmp = l1; l1 = l1->next; } else { *tmp = l2; l2 = l2->next; } tmp = &((*tmp)->next); } if (l1) { *tmp = l1; } if (l2) { *tmp = l2; } return head; } ``` merge_sort_list 函式則還是對照 [merge 函式 recursive 版本](https://npes87184.github.io/2015-09-12-linkedListSort/) 實作 ```cpp static list_ele_t *merge_sort_list(list_ele_t *head) { if (!head || !head->next) return head; // merge sort list_ele_t *slow = head; list_ele_t *fast = head->next; // split list while (fast && fast->next) { slow = slow->next; fast = fast->next->next; } fast = slow->next; slow->next = NULL; // sort each list list_ele_t *l1 = merge_sort_list(head); list_ele_t *l2 = merge_sort_list(fast); // merge sorted l1 and l2 return merge(l1, l2); } ``` :::danger :warning: 1. 注意函式和變數命命慣例,避免 [camelCase](https://en.wikipedia.org/wiki/Camel_case),在作業說明已解釋 Linux 核心原始程式碼的使用慣例,請跟著修改; 2. 關於 `merge` 和 `mergeSortList` 這兩個函式,其作用為 helper (內部使用),應該在宣告加上 `static`,將 [visibility](https://gcc.gnu.org/wiki/Visibility) 設定為不公開,有利於編譯器之後施加最佳化,也可避免符號 (symbol) 的衝突; 3. 函式 `merge` 程式碼尚可精簡; :notes: jserv ::: > 1. 已在 commit [66ea3fe](https://github.com/ignite1771/lab0-c/commit/66ea3fe8ae2fd43a6ccd744797fda94585959981) 改寫。 > 2. 已在 commit [2d8ea8d](https://github.com/ignite1771/lab0-c/commit/2d8ea8db97ed7882edd59cc621dd5ebd914e888c) 改寫。 > 3. 已在 commit [17b7741](https://github.com/ignite1771/lab0-c/commit/17b7741073e968b9b76d33f7921698ecd66a1e8e) 改寫。 [name=ignite1771] - q_sort ```cpp void q_sort(queue_t *q) { if (!q || !q->head || !q->head->next) return; q->head = merge_sort_list(q->head); // reset q->tail list_ele_t *curh = q->head; while (curh->next) { curh = curh->next; } q->tail = curh; } ``` ## 目前遇到的問題 - unary arithmetic operator ! ## 通過的 Traces - [x] trace-01-ops - [x] trace-02-ops - [x] trace-03-ops - [x] trace-04-ops - [x] trace-05-ops - [x] trace-06-string - [x] trace-07-robust - [x] trace-08-robust - [x] trace-09-robust - [x] trace-10-malloc - [x] trace-11-malloc - [x] trace-12-malloc - [x] trace-13-perf - [x] trace-14-perf - [x] trace-15-perf - [x] trace-16-perf - [x] trace-17-complexity ## 紀錄: Cppcheck 工具使用 1. 嘗試使用 `strcpy` 函數時出現警告 * 閱讀 https://security.web.cern.ch/security/recommendations/en/codetools/c.shtml, 提到 > The `strcpy` built-in function does not check buffer lengths and may very well overwrite memory zone contiguous to the intended destination. In fact, the whole family of functions is similarly vulnerable: `strcpy`, `strcat` and `strcmp`. * 修正:改用 `strncpy` 函數 ## 紀錄: Git 拆解 1 個 commit 成多個 commits :::info 使用情境: 當一次完成多個 functions 時 ::: 1. `git rebase {原始要拆解的 commit 之 id}~` * `edit` {原始要拆解的 commit 之 id} 3. `git reset HEAD~` 4. 修改檔案為**拆完之第 1 個 commit** 要的內容 * 建議使用之指令與工具 * vim 的快捷鍵 `V`, `y`, `p` * `git co -- {filename}` 5. `git add` & `git commit` 6. 用 `git reflog` 查詢 `{原始要拆解的 commit 之 id}` 7. `git reset {原始要拆解的 commit 之 id}` * 運用第 3 步之建議方法, 記下想要修改的**拆完之第 2 個 commit** 內容 8. `git reset {拆完之第 1 個 commit 之 id}` 9. 修改檔案為**拆完之第 2 個 commit** 要的內容 10. `git add` & `git commit` 11. 以此類推重複,完成後 `git rebase --continue` ## 討論: Symbol Visibility 參考: [你所不知道的 C 語言:動態連結器篇](https://hackmd.io/@sysprog/c-dynamic-linkage?type=view#Symbol-Visibility) Symbol Visibility 章節 ### 實驗 1: - 加入 `__attribute__((visibility("hidden")))` 描述至 `merge`, `merge_sort_list` 2 個函式中: ```cpp __attribute__((visibility("hidden"))) list_ele_t *merge(list_ele_t *l1, list_ele_t *l2); __attribute__((visibility("hidden"))) list_ele_t *merge_sort_list(list_ele_t *head); ``` - 或是,參考 [visibility](https://gcc.gnu.org/wiki/Visibility) 中提到 > To aid you converting old code to use the new system, GCC now supports also a `#pragma GCC visibility command`: ... `#pragma GCC visibility` is stronger than `-fvisibility`; it affects extern declarations as well. -fvisibility only affects definitions, so that existing code can be recompiled with minimal changes. **This is more true for C than C++;** C++ interfaces tend use classes, which are affected by `-fvisibility`. - 加入 `#pragma GCC visibility push(hidden)` 與 `#pragma GCC visibility pop` 包含住 `merge`, `merge_sort_list` 2 個函式中: ```cpp #pragma GCC visibility push(hidden) list_ele_t *merge(list_ele_t *l1, list_ele_t *l2); list_ele_t *merge_sort_list(list_ele_t *head); #pragma GCC visibility pop ``` 觀察 dynamic symobol table: - 第 20, 22 行可以看到 `merge`, `merge_sort_list` 函式的 `Vis` (Visability) 變成 `HIDDEN` - 但 `Bind` 依然是 `GLOBAL` ```shell $ gcc -o queue.o -c queue.c $ LC_ALL=C readelf --syms queue.o Symbol table '.symtab' contains 24 entries: Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS queue.c 2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 3: 0000000000000000 0 SECTION LOCAL DEFAULT 3 4: 0000000000000000 0 SECTION LOCAL DEFAULT 4 5: 0000000000000000 0 SECTION LOCAL DEFAULT 6 6: 0000000000000000 0 SECTION LOCAL DEFAULT 7 7: 0000000000000000 0 SECTION LOCAL DEFAULT 5 8: 0000000000000000 76 FUNC GLOBAL DEFAULT 1 q_new 9: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _GLOBAL_OFFSET_TABLE_ 10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND test_malloc 11: 000000000000004c 106 FUNC GLOBAL DEFAULT 1 q_free 12: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND test_free 13: 00000000000000b6 242 FUNC GLOBAL DEFAULT 1 q_insert_head 14: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND strlen 15: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND strncpy 16: 00000000000001a8 231 FUNC GLOBAL DEFAULT 1 q_insert_tail 17: 000000000000028f 207 FUNC GLOBAL DEFAULT 1 q_remove_head 18: 000000000000035e 43 FUNC GLOBAL DEFAULT 1 q_size 19: 0000000000000389 142 FUNC GLOBAL DEFAULT 1 q_reverse 20: 0000000000000417 241 FUNC GLOBAL HIDDEN 1 merge 21: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND strncmp 22: 0000000000000508 188 FUNC GLOBAL HIDDEN 1 merge_sort_list 23: 00000000000005c4 127 FUNC GLOBAL DEFAULT 1 q_sort ``` ### 實驗 2: - 加入 `static` 描述至 `merge`, `merge_sort_list` 2 個函式中: ```cpp static list_ele_t *merge(list_ele_t *l1, list_ele_t *l2); static list_ele_t *merge_sort_list(list_ele_t *head); ``` 觀察 dynamic symobol table: - 第 5, 6 行可以看到 `merge`, `merge_sort_list` 函式的 `Bind` (Visability) 變成 `LOCAL` - 但 `Vis` 依然是 `DEFAULT` ```shell $ gcc -o queue.o -c queue.c $ LC_ALL=C readelf --syms queue.o Symbol table '.symtab' contains 24 entries: Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS queue.c 2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 3: 0000000000000000 0 SECTION LOCAL DEFAULT 3 4: 0000000000000000 0 SECTION LOCAL DEFAULT 4 5: 0000000000000417 241 FUNC LOCAL DEFAULT 1 merge 6: 0000000000000508 188 FUNC LOCAL DEFAULT 1 merge_sort_list 7: 0000000000000000 0 SECTION LOCAL DEFAULT 6 8: 0000000000000000 0 SECTION LOCAL DEFAULT 7 9: 0000000000000000 0 SECTION LOCAL DEFAULT 5 10: 0000000000000000 76 FUNC GLOBAL DEFAULT 1 q_new 11: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _GLOBAL_OFFSET_TABLE_ 12: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND test_malloc 13: 000000000000004c 106 FUNC GLOBAL DEFAULT 1 q_free 14: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND test_free 15: 00000000000000b6 242 FUNC GLOBAL DEFAULT 1 q_insert_head 16: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND strlen 17: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND strncpy 18: 00000000000001a8 231 FUNC GLOBAL DEFAULT 1 q_insert_tail 19: 000000000000028f 207 FUNC GLOBAL DEFAULT 1 q_remove_head 20: 000000000000035e 43 FUNC GLOBAL DEFAULT 1 q_size 21: 0000000000000389 142 FUNC GLOBAL DEFAULT 1 q_reverse 22: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND strncmp 23: 00000000000005c4 127 FUNC GLOBAL DEFAULT 1 q_sort ``` :::info **發問** - 為什麼可以是: - `Bind`=LOCAL, `Vis`=DEFAULT - `Bind`=GLOBAL, `Vis`=HIDDEN - 不是應該要 `Bind`=LOCAL, `Vis`=HIDDEN 嗎? :sparkle: 查閱 [Why symbol visibility is good](https://www.technovelty.org/code/why-symbol-visibility-is-good.html) > "local binding is the opposite and keeps the symbol local only (static) and weak is like global, but suggests that the symbol can be overridden." 交叉對照 [你所不知道的 C 語言:動態連結器篇](https://hackmd.io/@sysprog/c-dynamic-linkage) :notes: jserv ::: ### 實驗 3: - 同時加入: - `__attribute__((visibility("hidden")))` 或 `#pragma GCC visibility` - `static` 描述至 `merge`, `merge_sort_list` 2 個函式中,例如: ```cpp __attribute__((visibility("hidden"))) static list_ele_t *merge(list_ele_t *l1, list_ele_t *l2); __attribute__((visibility("hidden"))) static list_ele_t *merge_sort_list(list_ele_t *head); ``` - GCC 告訴我們 visibility attribute 並不需要 - 因為 `static` 已標示該函式僅限此 compilation unit 可見 ```shell $gcc -o queue.o -c queue.c queue.c:18:64: warning: ‘visibility’ attribute ignored [-Wattributes] list_ele_t *l2); ^~~~~~~~~~ queue.c:21:5: warning: ‘visibility’ attribute ignored [-Wattributes] list_ele_t *head); ^~~~~~~~~~ # Symbol Table 輸出結果與實驗 2 一樣 ```