# 2020q1 Homework1 (lab0)
contributed by < `ignite1771` >
## 開發環境
- Host OS: macOS Mojave 10.14.6 + QEMU/KVM
- Guest OS: Ubuntu 18.04.3 LTS Live Server (參考 [@junwei 的筆記](https://hackmd.io/@junwei/Syy7Sc4WU))
```shell
$ uname -a
Linux ubuntu_mac_pro 4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ gcc --version
gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
```
## 預先執行之實驗
- [x] [Valgrind 使用案例](https://hackmd.io/@sysprog/linux2020-lab0#Valgrind-使用案例)
## 實作功能
### `queue.h`
```cpp
typedef struct {
list_ele_t *head; /* Linked list of elements */
list_ele_t *tail;
int size;
} queue_t;
```
### `queue.c`
### q_new
```cpp
queue_t *q_new()
{
queue_t *q = malloc(sizeof(queue_t));
if (!q)
return NULL;
q->head = NULL;
q->tail = NULL;
q->size = 0;
return q;
}
```
### q_free
```cpp
void q_free(queue_t *q)
{
if (!q)
return;
while (q->head) {
list_ele_t *curh = q->head;
q->head = q->head->next;
free(curh->value);
free(curh);
}
free(q);
}
```
:::danger
TODO: 思考如何讓上方程式碼更精簡 (注意 `free`)
:notes: jserv
:::
> 已在 commit [851e85f](https://github.com/ignite1771/lab0-c/commit/851e85ffc4bce471144d0288966cbbea6c7033ea) 改寫。 [name=ignite1771]
### q_insert_head
```cpp=
bool q_insert_head(queue_t *q, char *s)
{
if (!q)
return false;
list_ele_t *newh;
newh = malloc(sizeof(list_ele_t));
if (!newh)
return false;
newh->value = malloc(strlen(s) + 1);
if (!newh->value) {
free(newh);
return false;
}
strncpy(newh->value, s, strlen(s) + 1);
newh->next = q->head;
if (!q->tail)
q->tail = newh;
q->head = newh;
q->size += 1;
return true;
}
```
:::warning
自第 10, 15 行起可改寫,降低縮排深度
:notes: jserv
:::
> 已在 commit [04e0d96](https://github.com/ignite1771/lab0-c/commit/04e0d960b2a40ee189521fa709a96a397124318d) 改寫。 [name=ignite1771]
### q_insert_tail
```cpp
bool q_insert_tail(queue_t *q, char *s)
{
if (!q)
return false;
list_ele_t *newh = malloc(sizeof(list_ele_t));
if (!newh)
return false;
newh->next = NULL;
newh->value = malloc(strlen(s) + 1);
if (!newh->value) {
free(newh);
return false;
}
strncpy(newh->value, s, strlen(s) + 1);
if (!q->head && !q->tail) {
q->head = newh;
q->tail = newh;
} else {
q->tail->next = newh;
q->tail = newh;
}
q->size += 1;
return true;
}
```
### q_remove_head
- 加入 `sp[bufsize - 1] = '\0';` 來確保 sp 儲存之字串有 terminating charater
```cpp
bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
if (!q || !q->head)
return false;
if (sp) {
strncpy(sp, q->head->value, bufsize);
// Concatenate terminating character if bufsize is not eqaul to the
// length of q->head->value
sp[bufsize - 1] = '\0';
}
list_ele_t *oldh = q->head;
q->head = q->head->next;
if (!q->head)
q->tail = NULL;
free(oldh->value);
free(oldh);
q->size -= 1;
return true;
}
```
:::info
**發問**
- 為何我不能將 `if (sp != NULL)` 改寫成 `if (!sp)` ?
- 在使用 Cppcheck 工具進行靜態程式碼分析時,會出現以下 message:
```
queue.c:99:17: warning: Either the condition '!sp' is redundant or there is possible null pointer dereference: sp. [nullPointerRedundantCheck]
strncpy(sp, q->head->value, bufsize);
^
queue.c:98:9: note: Assuming that condition '!sp' is not redundant
if (!sp)
^
queue.c:99:17: note: Null pointer dereference
strncpy(sp, q->head->value, bufsize);
^
queue.c:99:17: error: Null pointer dereference [nullPointer]
strncpy(sp, q->head->value, bufsize);
^
Fail to pass static analysis.
```
:warning:
`if (sp != NULL)` 等價於 `if (sp)`,而非 `if (!sp)`,兩者意思相反
:notes: jserv
:::
> 了解,在修改程式碼讓其變簡潔時,我會更注意保持邏輯清晰 [name=ignite1771]
### q_size
```cpp
int q_size(queue_t *q)
{
if (!q || !q->head)
return 0;
return q->size;
}
```
:::warning
考慮以下程式變更:
```diff
@@ -1,8 +1,6 @@
int q_size(queue_t *q)
{
- if (q == NULL || q->head == NULL) {
+ if (!q || !q->head)
return 0;
- } else {
- return q->size;
- }
+ return q->size;
}
```
要點:
1. 更精簡的縮排和敘述;
2. 日後引入 `likely` 和 `unlikely` 巨集,程式碼會更清晰
3. 上方其他程式碼也可比照修改;
:notes: jserv
:::
> 已在 commit [04e0d96](https://github.com/ignite1771/lab0-c/commit/04e0d960b2a40ee189521fa709a96a397124318d) 改寫。 [name=ignite1771]
### q_reverse
```cpp
void q_reverse(queue_t *q)
{
if (!q || !q->head)
return;
list_ele_t *slow = q->head;
list_ele_t *fast = q->head->next;
list_ele_t *faster;
q->tail = slow;
slow->next = NULL;
while (fast) {
faster = fast->next;
fast->next = slow;
slow = fast;
fast = faster;
}
q->head = slow;
}
```
### q_sort
- MIN macro
```cpp
#define MIN(a, b) \
{ \
a < b ? a : b \
}
```
:::danger
這個 MIN 巨集存在缺陷:
1. 參數 a 和 b 傳遞到巨集定義內,應該要加上 parentheses (小括號) $\to$ 請思考為什麼;
2. a 和 b 實際可能是某些敘述,如 `MIN(++i, i++)`,依據上述巨集的定義方式,就要考慮到 side effect;
因此,在 gcc 中,會這樣定義:
```cpp
#define min(X,Y) \
(__extension__ \
({ \
typeof(X) __x = (X), __y = (Y); \
(__x < __y) ? __x : __y; \
}) \
)
```
請討論背後的思考議題
:notes: jserv
:::
- [Natural Sort](https://github.com/sourcefrog/natsort)
將 `strnatcmp.h` 及 `strnatcmp.c` 加入 lab0-c 目錄,並修改 `Makefile` OBJS
- merge 函式 iterative 版本 + pointer to pointer
- 可以通過 trace-15-perf, 不會 stack overflow
- 感謝 [jwang0306](https://hackmd.io/@jwang0306/lab0-c) 提供做法以及說明
```cpp
static list_ele_t *merge(list_ele_t *l1, list_ele_t *l2)
{
list_ele_t *head = NULL;
list_ele_t **tmp = &head;
while (l1 && l2) {
if (strnatcmp(l1->value, l2->value) < 0) {
*tmp = l1;
l1 = l1->next;
} else {
*tmp = l2;
l2 = l2->next;
}
tmp = &((*tmp)->next);
}
if (l1) {
*tmp = l1;
}
if (l2) {
*tmp = l2;
}
return head;
}
```
merge_sort_list 函式則還是對照 [merge 函式 recursive 版本](https://npes87184.github.io/2015-09-12-linkedListSort/) 實作
```cpp
static list_ele_t *merge_sort_list(list_ele_t *head)
{
if (!head || !head->next)
return head;
// merge sort
list_ele_t *slow = head;
list_ele_t *fast = head->next;
// split list
while (fast && fast->next) {
slow = slow->next;
fast = fast->next->next;
}
fast = slow->next;
slow->next = NULL;
// sort each list
list_ele_t *l1 = merge_sort_list(head);
list_ele_t *l2 = merge_sort_list(fast);
// merge sorted l1 and l2
return merge(l1, l2);
}
```
:::danger
:warning:
1. 注意函式和變數命命慣例,避免 [camelCase](https://en.wikipedia.org/wiki/Camel_case),在作業說明已解釋 Linux 核心原始程式碼的使用慣例,請跟著修改;
2. 關於 `merge` 和 `mergeSortList` 這兩個函式,其作用為 helper (內部使用),應該在宣告加上 `static`,將 [visibility](https://gcc.gnu.org/wiki/Visibility) 設定為不公開,有利於編譯器之後施加最佳化,也可避免符號 (symbol) 的衝突;
3. 函式 `merge` 程式碼尚可精簡;
:notes: jserv
:::
> 1. 已在 commit [66ea3fe](https://github.com/ignite1771/lab0-c/commit/66ea3fe8ae2fd43a6ccd744797fda94585959981) 改寫。
> 2. 已在 commit [2d8ea8d](https://github.com/ignite1771/lab0-c/commit/2d8ea8db97ed7882edd59cc621dd5ebd914e888c) 改寫。
> 3. 已在 commit [17b7741](https://github.com/ignite1771/lab0-c/commit/17b7741073e968b9b76d33f7921698ecd66a1e8e) 改寫。 [name=ignite1771]
- q_sort
```cpp
void q_sort(queue_t *q)
{
if (!q || !q->head || !q->head->next)
return;
q->head = merge_sort_list(q->head);
// reset q->tail
list_ele_t *curh = q->head;
while (curh->next) {
curh = curh->next;
}
q->tail = curh;
}
```
## 目前遇到的問題
- unary arithmetic operator !
## 通過的 Traces
- [x] trace-01-ops
- [x] trace-02-ops
- [x] trace-03-ops
- [x] trace-04-ops
- [x] trace-05-ops
- [x] trace-06-string
- [x] trace-07-robust
- [x] trace-08-robust
- [x] trace-09-robust
- [x] trace-10-malloc
- [x] trace-11-malloc
- [x] trace-12-malloc
- [x] trace-13-perf
- [x] trace-14-perf
- [x] trace-15-perf
- [x] trace-16-perf
- [x] trace-17-complexity
## 紀錄: Cppcheck 工具使用
1. 嘗試使用 `strcpy` 函數時出現警告
* 閱讀 https://security.web.cern.ch/security/recommendations/en/codetools/c.shtml, 提到
> The `strcpy` built-in function does not check buffer lengths and may very well overwrite memory zone contiguous to the intended destination. In fact, the whole family of functions is similarly vulnerable: `strcpy`, `strcat` and `strcmp`.
* 修正:改用 `strncpy` 函數
## 紀錄: Git 拆解 1 個 commit 成多個 commits
:::info
使用情境: 當一次完成多個 functions 時
:::
1. `git rebase {原始要拆解的 commit 之 id}~`
* `edit` {原始要拆解的 commit 之 id}
3. `git reset HEAD~`
4. 修改檔案為**拆完之第 1 個 commit** 要的內容
* 建議使用之指令與工具
* vim 的快捷鍵 `V`, `y`, `p`
* `git co -- {filename}`
5. `git add` & `git commit`
6. 用 `git reflog` 查詢 `{原始要拆解的 commit 之 id}`
7. `git reset {原始要拆解的 commit 之 id}`
* 運用第 3 步之建議方法, 記下想要修改的**拆完之第 2 個 commit** 內容
8. `git reset {拆完之第 1 個 commit 之 id}`
9. 修改檔案為**拆完之第 2 個 commit** 要的內容
10. `git add` & `git commit`
11. 以此類推重複,完成後 `git rebase --continue`
## 討論: Symbol Visibility
參考: [你所不知道的 C 語言:動態連結器篇](https://hackmd.io/@sysprog/c-dynamic-linkage?type=view#Symbol-Visibility) Symbol Visibility 章節
### 實驗 1:
- 加入 `__attribute__((visibility("hidden")))` 描述至 `merge`, `merge_sort_list` 2 個函式中:
```cpp
__attribute__((visibility("hidden")))
list_ele_t *merge(list_ele_t *l1, list_ele_t *l2);
__attribute__((visibility("hidden")))
list_ele_t *merge_sort_list(list_ele_t *head);
```
- 或是,參考 [visibility](https://gcc.gnu.org/wiki/Visibility) 中提到
> To aid you converting old code to use the new system, GCC now supports also a `#pragma GCC visibility command`:
...
`#pragma GCC visibility` is stronger than `-fvisibility`; it affects extern declarations as well. -fvisibility only affects definitions, so that existing code can be recompiled with minimal changes. **This is more true for C than C++;** C++ interfaces tend use classes, which are affected by `-fvisibility`.
- 加入 `#pragma GCC visibility push(hidden)` 與 `#pragma GCC visibility pop` 包含住 `merge`, `merge_sort_list` 2 個函式中:
```cpp
#pragma GCC visibility push(hidden)
list_ele_t *merge(list_ele_t *l1, list_ele_t *l2);
list_ele_t *merge_sort_list(list_ele_t *head);
#pragma GCC visibility pop
```
觀察 dynamic symobol table:
- 第 20, 22 行可以看到 `merge`, `merge_sort_list` 函式的 `Vis` (Visability) 變成 `HIDDEN`
- 但 `Bind` 依然是 `GLOBAL`
```shell
$ gcc -o queue.o -c queue.c
$ LC_ALL=C readelf --syms queue.o
Symbol table '.symtab' contains 24 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS queue.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
5: 0000000000000000 0 SECTION LOCAL DEFAULT 6
6: 0000000000000000 0 SECTION LOCAL DEFAULT 7
7: 0000000000000000 0 SECTION LOCAL DEFAULT 5
8: 0000000000000000 76 FUNC GLOBAL DEFAULT 1 q_new
9: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _GLOBAL_OFFSET_TABLE_
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND test_malloc
11: 000000000000004c 106 FUNC GLOBAL DEFAULT 1 q_free
12: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND test_free
13: 00000000000000b6 242 FUNC GLOBAL DEFAULT 1 q_insert_head
14: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND strlen
15: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND strncpy
16: 00000000000001a8 231 FUNC GLOBAL DEFAULT 1 q_insert_tail
17: 000000000000028f 207 FUNC GLOBAL DEFAULT 1 q_remove_head
18: 000000000000035e 43 FUNC GLOBAL DEFAULT 1 q_size
19: 0000000000000389 142 FUNC GLOBAL DEFAULT 1 q_reverse
20: 0000000000000417 241 FUNC GLOBAL HIDDEN 1 merge
21: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND strncmp
22: 0000000000000508 188 FUNC GLOBAL HIDDEN 1 merge_sort_list
23: 00000000000005c4 127 FUNC GLOBAL DEFAULT 1 q_sort
```
### 實驗 2:
- 加入 `static` 描述至 `merge`, `merge_sort_list` 2 個函式中:
```cpp
static list_ele_t *merge(list_ele_t *l1, list_ele_t *l2);
static list_ele_t *merge_sort_list(list_ele_t *head);
```
觀察 dynamic symobol table:
- 第 5, 6 行可以看到 `merge`, `merge_sort_list` 函式的 `Bind` (Visability) 變成 `LOCAL`
- 但 `Vis` 依然是 `DEFAULT`
```shell
$ gcc -o queue.o -c queue.c
$ LC_ALL=C readelf --syms queue.o
Symbol table '.symtab' contains 24 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS queue.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
5: 0000000000000417 241 FUNC LOCAL DEFAULT 1 merge
6: 0000000000000508 188 FUNC LOCAL DEFAULT 1 merge_sort_list
7: 0000000000000000 0 SECTION LOCAL DEFAULT 6
8: 0000000000000000 0 SECTION LOCAL DEFAULT 7
9: 0000000000000000 0 SECTION LOCAL DEFAULT 5
10: 0000000000000000 76 FUNC GLOBAL DEFAULT 1 q_new
11: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _GLOBAL_OFFSET_TABLE_
12: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND test_malloc
13: 000000000000004c 106 FUNC GLOBAL DEFAULT 1 q_free
14: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND test_free
15: 00000000000000b6 242 FUNC GLOBAL DEFAULT 1 q_insert_head
16: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND strlen
17: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND strncpy
18: 00000000000001a8 231 FUNC GLOBAL DEFAULT 1 q_insert_tail
19: 000000000000028f 207 FUNC GLOBAL DEFAULT 1 q_remove_head
20: 000000000000035e 43 FUNC GLOBAL DEFAULT 1 q_size
21: 0000000000000389 142 FUNC GLOBAL DEFAULT 1 q_reverse
22: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND strncmp
23: 00000000000005c4 127 FUNC GLOBAL DEFAULT 1 q_sort
```
:::info
**發問**
- 為什麼可以是:
- `Bind`=LOCAL, `Vis`=DEFAULT
- `Bind`=GLOBAL, `Vis`=HIDDEN
- 不是應該要 `Bind`=LOCAL, `Vis`=HIDDEN 嗎?
:sparkle:
查閱 [Why symbol visibility is good](https://www.technovelty.org/code/why-symbol-visibility-is-good.html)
> "local binding is the opposite and keeps the symbol local only (static) and weak is like global, but suggests that the symbol can be overridden."
交叉對照 [你所不知道的 C 語言:動態連結器篇](https://hackmd.io/@sysprog/c-dynamic-linkage)
:notes: jserv
:::
### 實驗 3:
- 同時加入:
- `__attribute__((visibility("hidden")))` 或 `#pragma GCC visibility`
- `static`
描述至 `merge`, `merge_sort_list` 2 個函式中,例如:
```cpp
__attribute__((visibility("hidden")))
static list_ele_t *merge(list_ele_t *l1, list_ele_t *l2);
__attribute__((visibility("hidden")))
static list_ele_t *merge_sort_list(list_ele_t *head);
```
- GCC 告訴我們 visibility attribute 並不需要
- 因為 `static` 已標示該函式僅限此 compilation unit 可見
```shell
$gcc -o queue.o -c queue.c
queue.c:18:64: warning: ‘visibility’ attribute ignored [-Wattributes]
list_ele_t *l2);
^~~~~~~~~~
queue.c:21:5: warning: ‘visibility’ attribute ignored [-Wattributes]
list_ele_t *head);
^~~~~~~~~~
# Symbol Table 輸出結果與實驗 2 一樣
```