owned this note
owned this note
Published
Linked with GitHub
# 2020q1 Homework1 (lab0)
contributed by < `amyhsieh16` >
###### tags: `linux2020`
## 開發環境
```shell
$ uname -a
Linux 4.15.0-88-generic #88-Ubuntu SMP Tue Feb 11 20:11:34 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ gcc --version
gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
```
## 作業
### 說明
* [作業說明: lab0](https://hackmd.io/@sysprog/linux2020-lab0)
### 作業要求
* 在 GitHub 上 fork `lab0-c`
* ==詳細閱讀 [C Programming Lab](http://www.cs.cmu.edu/~213/labs/cprogramminglab.pdf)== ,依據指示著手修改 `queue.[ch] `和連帶的檔案
* `q_new`:建立新的「空」佇列;
* `q_free`:釋放佇列所佔用的記憶體;
* `q_insert_head`:在佇列開頭 (head) 加入 (insert) 給定的新元素 (以 LIFO 準則);
* `q_insert_tail`:在佇列尾端 (tail) 加入 (insert) 給定的新元素 (以 FIFO 準則);
* `q_remove_head`:在佇列開頭 (head) 移除 (remove) 給定的元素。
* `q_size`:計算佇列中的元素數量。
* `q_reverse`:以反向順序重新排列鏈結串列,該函式不該配置或釋放任何鏈結串列元素,換言之,它只能重排已經存在的元素;
* `q_sort`:以遞增順序來排序鏈結串列的元素
* 修改排序所用的比較函式,變更為 [natural sort](https://github.com/sourcefrog/natsort),在 “simulation” 也該做對應的修改,得以反映出 [natural sort](https://github.com/sourcefrog/natsort) 的使用。
## 開發過程
### Define `queue_t`
* 為了使 `q_size`和 `q_insert_tail`在 $O(1)$條件下完成,新增
* `size`:紀錄 queue的元素數量
* `tail`:紀錄 queue最後一個元素的記憶體位置
```cpp
typedef struct
list_ele_t *head; /* Linked list of elements */
list_ele_t *tail;
int size;
} queue_t;
```
### `q_new`
* 用 ==[malloc](https://en.wikibooks.org/wiki/C_Programming/stdlib.h/malloc)== 動態配置記憶體
* 失敗後,回傳 NULL
* 成功後,則初始化 `head`及 `tail`為 NULL, `size`為 0
```cpp
queue_t *q_new()
{
queue_t *q = malloc(sizeof(queue_t));
if (!q)
return;
q->head = NULL;
q->tail = NULL;
q->size = 0;
return q;
}
```
### `q_free`
* queue 為 `NULL`時,不進行 free
* 先 free 每一個 元素的結構,最後才 free 整個 queue
```cpp
void q_free(queue_t *q)
{
/* What should you do if the q is NULL? */
if (!q)
return;
/* Free queue structure */
while (q->head) {
list_ele_t *tmp = q->head;
q->head = q->head->next;
free(tmp->value);
free(tmp);
}
free(q);
}
```
### `q_insert_head`
* 如果 q 是 `NULL` 或沒有`配置動態記憶體`,回傳 `false`
* 當使用 `malloc` 函數後,須確認是否成功,若==否==則須回傳 false
* 需確定要新增的型別,在使用malloc 時,需使用到sizeof(型別)*物件大小
* 從 queue 的 head 插入一個元素
* 設定元素的內容: value, next與記憶體位置
* 考量到 [CREN](https://security.web.cern.ch/security/recommendations/en/codetools/c.shtml) 的安全建議,棄置 `strcpy`而使用 `strncpy`
* 當插入元素後, queue的 `size`, `tail`的設定
```cpp
bool q_insert_head(queue_t *q, char *s)
{
/* What should you do if the q is NULL? */
if (!q)
return false;
/* allocate space for list_ele_t */
list_ele_t *newh;
newh = malloc(sizeof(list_ele_t));
if (!newh)
return false;
/* allocate space for the string and copy it */
newh->value = malloc(strlen(s) + 1);
if (!newh->value) {
free(newh);
return false;
}
memset(newh->value, '\0', strlen(s) + 1);
strncpy(newh->value, s, strlen(s));
newh->next = q->head;
q->head = newh;
q->tail = q->size ? q->tail : newh;
q->size++;
return true;
}
```
* 查閱[malloc](https://en.wikibooks.org/wiki/C_Programming/stdlib.h/malloc) ,可知在使用malloc 設定記憶體大小時,需sizeof(型別)相乘物件大小
```diff
@@ -58,7 +58,7 @@ bool q_insert_head(queue_t *q, char *s)
return false;
/* allocate space for the string and copy it */
- newh->value = malloc(sizeof(char) * (strlen(s) + 1));
+ newh->value = malloc(strlen(s) + 1);
if (!newh->value) {
```
### `q_insert_tail`
* 如果 q 是 `NULL` 或沒有`配置動態記憶體`,回傳 `false`
* 當使用 `malloc` 函數後,須確認是否成功,若==否==則須回傳 false
* 從 queue的 tail 插入一個元素
* 設定元素的內容: value, ==next== 與記憶體位置
* 因為 [CREN](https://security.web.cern.ch/security/recommendations/en/codetools/c.shtml),不使用 `strcpy`而使用 `strncpy`
* 當插入元素後, queue的 `size`, `head`的設定
* 時間複雜度為 $O(1)$
```cpp
bool q_insert_tail(queue_t *q, char *s)
{
/* What should you do if the q is NULL? */
if (!q)
return false;
/* allocate space for list_ele_t */
list_ele_t *newt;
newt = malloc(sizeof(list_ele_t));
if (!newt)
return false;
/* allocate space for the string and copy it */
newt->value = malloc(strlen(s) + 1);
if (!newt->value) {
free(newt);
return false;
}
memset(newt->value, '\0', strlen(s) + 1);
strncpy(newt->value, s, strlen(s));
newt->next = NULL;
q->tail->next = newt;
q->tail = newt;
q->head = q->size ? q->head : newt;
q->size++;
return true;
}
```
* 查閱[malloc](https://en.wikibooks.org/wiki/C_Programming/stdlib.h/malloc) ,可知在使用malloc 設定記憶體大小時,需sizeof(型別)相乘物件大小
```diff
@@ -93,8 +93,7 @@ bool q_insert_tail(queue_t *q, char *s)
return false;
/* allocate space for the string and copy it */
-
- newt->value = malloc(sizeof(char) * (strlen(s) + 1));
+ newt->value = malloc(strlen(s) + 1);
if (!newt->value) {
free(newt);
return false;
```
### `q_remove_head`
* 當 queue 是 `NULL` 或 `empty` 時,回傳 false
* 複製 head 的 value到 sp
* 移除 head
```cpp
bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
/* if queue is NULL or empty */
if (!q || !q->size)
return false;
list_ele_t *tmp = q->head;
/* copy the removed string to *sp */
if (sp) {
memset(sp, '\0', bufsize);
strncpy(sp, tmp->value, bufsize-1);
}
q->head = q->head->next;
if (!q->head)
q->tail = q->head;
q->size--;
/* Free all storage used by q->head */
free(tmp->value);
free(tmp);
return true;
}
```
### `q_size`
* 回傳 queue的元素數量
* queue為 `NULL`或為 `empty`時,回傳0
* 時間複雜度為 $O(1)$
```cpp
int q_size(queue_t *q)
{
return q ? q->size : 0;
}
```
### `q_reverse`
* 當 `queue`是 `NULL`或 `empty`時,不進行 `reverse`
* 從 `head`開始,改變其 `next`的位置
* 宣告三個變數
* `curr`: 要修改的元素
* `prev`: 下一個要修改的元素
* `tmp`: 暫存 `prev->next`
把原本 curr->prev->tmp ,改成 prev->curr
```cpp
void q_reverse(queue_t *q)
{
if (!q || !q->size) {
return;
}
list_ele_t *curr = q->head, *prev = curr->next, *tmp = NULL;
q->tail = curr;
q->tail->next = NULL;
while (prev) {
tmp = prev->next;
prev->next = curr;
curr = prev;
prev = tmp;
}
q->head = curr;
}
```
:::warning
考慮以下的修改:
```diff
@@ -1,14 +1,13 @@
void q_reverse(queue_t *q)
{
- if (!q || !q->size) {
+ if (!q || !q->size)
return;
- }
- list_ele_t *curr = q->head, *prev = curr->next, *tmp = NULL;
+ list_ele_t *curr = q->head, *prev = curr->next;
q->tail = curr;
q->tail->next = NULL;
while (prev) {
- tmp = prev->next;
+ list_ele_t *tmp = prev->next;
prev->next = curr;
curr = prev;
prev = tmp;
```
要點:
1. 更精簡的排版,在 `{ ... }` 裡頭只有 `return` 或變數指派的話,可縮減 `{` 和 `}` 的使用;
2. 縮減變數 `tmp` 的 scope,不僅視覺上更清晰,而且日後引入 foreach 巨集 (Linux 核心原始程式碼的風格) 時,會更容易;
:notes: jserv
:::
>[name=amyhsieh16]已修正
1. commit `ac17c81`
2. commit `7b414e6`
### `q_sort`
* 參考 [Linked List Sort](https://npes87184.github.io/2015-09-12-linkedListSort/)
* Approach 1: 使用 Bubble Sort,會出現 Time limit exceeded
因為 Bubble Sort 的時間複雜度為 $O(n^2)$,而本作業所要求的時間複雜度為 $O(nlog(n))$,所以需要使用Approach 2的Merge Sort
:::danger
你需要解釋,為何會超時?
:notes: jserv
:::
* Approach 2: 使用 Merge Sort
* 藉由 [quiz1 的參考題解 1](https://hackmd.io/@Ryspon/HJVH8B0XU)與 divide and conquer 的觀念
* 用 recursive 手段實作 `merge` 函式時,在 `trace-15-perf` 一項中沒有得分,利用 `$ make valgrind`發現 ==stack overflow==
```shell
==5065== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==5065== Stack overflow in thread #1: can't實做 grow stack to 0x1ffe801000
==5065== Can't extend stack to 0x1ffe8010a8 during signal delivery for thread 1:
==5065== no stack segment
==5065==
==5065== Process terminating with default action of signal 11 (SIGSEGV)
==5065== Access not within mapped region at address 0x1FFE8010A8
==5065== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==5065== at 0x10CE89: merge (queue.c:240)
```
* 改用 iteration手段實作 `merge`函式
- [ ] Approach 1
```cpp
void q_sort(queue_t *q)
{
/* No effect if q is NULL or empty.*/
if (!q || !q->head)
return;
list_ele_t *head, *tail;
head = q->head;
tail = q->tail;
while (head != tail) {
list_ele_t *curr, *prev;
curr = head;
prev = head;
while (curr != tail) {
if (strncmp(curr->value, curr->next->value, 10) > 0 ) {
list_ele_t *tmp = curr->next;
if (curr != head) {
prev->next = tmp;
} else {
head = tmp;
}
curr->next = tmp->next;
tmp->next = curr;
prev = tmp;
if (tail == tmp) {
tail = curr;
}
} else {
prev = curr;
curr = curr->next;
}
if (!curr->next)
q->tail = curr;
}
tail = prev;
}
q->head = head;
}
```
- [ ] Approach 2
```cpp
list_ele_t *merge(list_ele_t *left, list_ele_t *right) {
if (!left)
return right;
if (!right)
return left;
if (strcmp(left->value, right->value) > 0) {
right->next = merge(left, right->next);
return right;
} else {
left->next = merge(left->next, right);
return left;
}
}
list_ele_t *mergesort (list_ele_t *head)
{
if (!head || !head->next) return head;
list_ele_t *fast, *slow;
fast = head->next;
slow = head;
while (fast && fast->next) {
slow = slow->next;
fast = fast->next->next;
}
fast = slow->next;
slow->next = NULL;
slow = mergesort(head);
fast = mergesort(fast);
return merge(slow, fast);
}
void q_sort(queue_t *q)
{
/* No effect if q is NULL or empty.*/
if (!q || !q->head) {
return;
}
q->tail = q->head = mergesort(q->head);
while (q->tail->next) {
q->tail = q->tail->next;
}
}
```
:::danger
縮減上述程式碼:
1. 用 `for` 迴圈改寫原本的 `while` 敘述,這樣之後可用 foreach 巨集改寫,而且會更清晰;
2. `merge` 和 `mergesort` 都以遞迴實作,但其實可減少遞迴次數;
:notes: jserv
:::
> [name=amyhsieh16] 1. 已修改,請參閱commit 83a3277
* 改用 `for`迴圈實作 `merge`函式
```cpp
list_ele_t *merge(list_ele_t *left, list_ele_t *right)
{
if (!left)
return right;
if (!right)
return left;
list_ele_t *head, *merge;
for (merge = NULL; right || left;) {
if (!right) {
merge->next = left;
break;
} else if (!left) {
merge->next = right;
break;
}
if (strcmp(left->value, right->value) > 0) {
if (!merge) {
head = merge = right;
} else {
merge->next = right;
merge = merge->next;
}
right = right->next;
} else {
if (!merge) {
head = merge = left;
} else {
merge->next = left;
merge = merge->next;
}
left = left->next;
}
}
return head;
}
```
### 記憶體檢測
* 使用 `$ make SANITIZER=1`
```shell
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
=================================================================
==7506==ERROR: AddressSanitizer: global-buffer-overflow on address 0x5645f93159c0 at pc 0x5645f910686e bp 0x7ffd16b6c920 sp 0x7ffd16b6c910
READ of size 4 at 0x5645f93159c0 thread T0
#0 0x5645f910686d in do_option_cmd linux2020/lab0-c/console.c:368
#1 0x5645f910548a in interpret_cmda linux2020/lab0-c/console.c:220
#2 0x5645f9105e8e in interpret_cmd linux2020/lab0-c/console.c:243
#3 0x5645f9106b77 in cmd_select linux2020/lab0-c/console.c:569
#4 0x5645f9106f94 in run_console linux2020/lab0-c/console.c:628
#5 0x5645f91040ad in main linux2020/lab0-c/qtest.c:772
#6 0x7f8f9df4fb96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
#7 0x5645f9101809 in _start (/home/amy/linux2020/lab0-c/qtest+0x6809)
0x5645f93159c1 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:20:6' (0x5645f93159c0) of size 1
'simulation' is ascii string ''
SUMMARY: AddressSanitizer: global-buffer-overflow linux2020/lab0-c/console.c:368 in do_option_cmd
Shadow bytes around the buggy address:
0x0ac93f25aae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac93f25aaf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac93f25ab00: 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9
0x0ac93f25ab10: 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
0x0ac93f25ab20: 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
=>0x0ac93f25ab30: 00 f9 f9 f9 f9 f9 f9 f9[01]f9 f9 f9 f9 f9 f9 f9
0x0ac93f25ab40: 00 00 00 00 01 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9
0x0ac93f25ab50: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac93f25ab60: 00 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9
0x0ac93f25ab70: f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9 01 f9 f9 f9
0x0ac93f25ab80: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==7506==ABORTING
--- trace-17-complexity 0/5
```
### Natural Sort
* 修改排序所用的比較函式,變更為 [natural sort](https://github.com/sourcefrog/natsort),在 “simulation” 也該做對應的修改,得以反映出 [natural sort](https://github.com/sourcefrog/natsort) 的使用。
## Valgrind
* 運用 Valgrind 排除 `qtest` 實作的記憶體錯誤,Massif 視覺化 “simulation” 過程中的記憶體使用量,需要設計對應的實驗
## 參考資料
* [C99 規格書](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)