# 2020q3 Homework1 (lab0)
contributed by <`StevenChen8759`>
###### tags:
:::info
在先前參與旁聽的 [`2020q1 Linux核心設計`](http://wiki.csie.ncku.edu.tw/linux/schedule) 課程中,已撰寫完成 `lab0-c` 的 *Linked-list* 功能,並進行基本的記憶體洩漏檢測與效能量測。在本次作業中會將先前已實現的功能導入並改善實現,同時著重於更加詳細且精準的程式量測,並依據量測結果改善程式。
:radio: Steven HH Chen
:::
> 相關連結:
> [2020春季 Linux 核心設計 lab0-c 作業說明](https://hackmd.io/@sysprog/linux2020-lab0)
> [2020春季 Linux 核心設計 lab0-c 作業共筆](https://hackmd.io/@StevenHHChen/linux2020_hw01_lab0_c)
> [2020春季 Linux 核心設計 lab0-c GitHub](https://github.com/StevenChen8759/lab0-c_2020q1)
> _
> [2020秋季進階電腦系統理論與實作 lab0 作業說明](https://hackmd.io/@sysprog/2020-lab0)
> [2020秋季進階電腦系統理論與實作 lab0 作業繳交區](https://hackmd.io/@sysprog/2020-homework1)
> [2020秋季進階電腦系統理論與實作 lab0 GitHub](https://github.com/StevenChen8759/lab0-c)
> [color=green]
## :one: 開發環境
* PC 硬體摘要
* CPU: *Intel Core i5-5200 @ 2.20 GHz*
* CPU Arch: *x86_64*
* 作業系統環境
* Guest OS: *Ubuntu 18.04 LTS*
* Host OS: *Windows 8.1*
* Virtual Machine: *VMWare Workstation Player 15.5.6*
* Linux 核心: *Linux ubuntu 5.4.0-47-generic #51~18.04.1-Ubuntugc*
* GCC 版本: *gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0*
## :two: 功能實作與其正確性測試
* 本章節著重於 Linked List 的資料結構與操作功能設計,並在前述的虛擬環境下測試功能正確性。
* 除此之外,我們將舊有的實現修改成更加優雅的寫法,以增加執行效能。
* 參考 [2020q1 Linux核心設計的 lab0-c 作實作程式碼](https://hackmd.io/@StevenHHChen/linux2020_hw01_lab0_c#-%E5%AF%A6%E4%BD%9C%E7%B5%90%E6%9E%9C%E8%88%87%E9%96%8B%E7%99%BC%E9%81%8E%E7%A8%8B%E8%A8%98%E9%8C%84),並整理功能概要如下章節。
### :camera_with_flash: 程式功能概要
* 修改 `queue.h` 中的資料結構 `queue_t`,並新增指定欄位,以符合效能需求。
* 修改下列 `queue.c` 中對資料結構 `queue_t` 的相關操作函式,並留意以下細節:
* `malloc()` 配置空間的初始值問題
* `malloc()` 回傳 `NULL` 的處理,避免 `Null Pointer Dereference`
* 字串複製、比較的記憶體操作不應使用造成 `緩衝區溢位(Buffer Overflow)` 的處理函式。
* 如 `strcpy()` `strcmp()` ...等等
* 在 `remove_list_head()` 中,複製的字串應留意 buffer 空間的存取,以避免複製錯誤的數值。
* 參考 [Linux 核心設計作業一 -> lab0-c 的程式除錯部分](https://hackmd.io/@StevenHHChen/linux2020_hw01_lab0_c#-%E7%B5%90%E5%B0%BE%E5%AD%97%E5%85%83%E7%9A%84%E8%AD%B0%E9%A1%8C)
* 嚴格管制動態宣告的記憶體,避免 `記憶體洩漏(memory leak)` 與 `存取空指標內含值(Null Pointer Dereference)` 等錯誤
* `q_reverse()` 的操作如下 (以四個元素的 List 操作為例):
* 此方法精隨在於透過三個指標的精準操作,達成無記憶體洩漏的串列鏈結轉移。









* `q_sort()`採用 **Merge Sort** 方法排序,時間複雜度為 **$O(nlogn)$**。
* Ref: [pes87184 的 Merge Sort 實作程式碼](https://npes87184.github.io/2015-09-12-linkedListSort/)
### :hammer: 正確性測試
* `make check` 測試指令
```shell
$ make check
./qtest -v 3 -f traces/trace-eg.cmd
cmd>
cmd> # Demonstration of queue testing framework
cmd> # Use help command to see list of commands and options
cmd> # Initial queue is NULL.
cmd> show
q = NULL
cmd> # Create empty queue
cmd> new
q = []
cmd> # Fill it with some values. First at the head
cmd> ih dolphin
q = [dolphin]
cmd> ih bear
q = [bear dolphin]
cmd> ih gerbil
q = [gerbil bear dolphin]
cmd> # Now at the tail
cmd> it meerkat
q = [gerbil bear dolphin meerkat]
cmd> it bear
q = [gerbil bear dolphin meerkat bear]
cmd> # Reverse it
cmd> reverse
q = [bear meerkat dolphin bear gerbil]
cmd> # See how long it is
cmd> size
Queue size = 5
q = [bear meerkat dolphin bear gerbil]
cmd> # Delete queue. Goes back to a NULL queue.
cmd> free
q = NULL
cmd> # Exit program
cmd> quit
Freeing queue
```
* `make test` 測試結果
```shell
$ make test
...
--- Trace Points
--- trace-01-ops 6/6
--- trace-02-ops 6/6
--- trace-03-ops 6/6
--- trace-04-ops 6/6
--- trace-05-ops 5/5
--- trace-06-string 6/6
--- trace-07-robust 6/6
--- trace-08-robust 6/6
--- trace-09-robust 6/6
--- trace-10-malloc 6/6
--- trace-11-malloc 6/6
--- trace-12-malloc 6/6
--- trace-13-perf 6/6
--- trace-14-perf 6/6
--- trace-15-perf 0/6 *Fail
--- trace-16-perf 6/6
--- trace-17-complexity 5/5
--- TOTAL 94/100
```
* `make valgrind` 測試結果
```shell
$ make valgrind
...
--- Trace Points
--- trace-01-ops 6/6
--- trace-02-ops 6/6
--- trace-03-ops 6/6
--- trace-04-ops 6/6
--- trace-05-ops 5/5
--- trace-06-string 6/6
--- trace-07-robust 6/6
--- trace-08-robust 6/6
--- trace-09-robust 6/6
--- trace-10-malloc 6/6
--- trace-11-malloc 6/6
--- trace-12-malloc 6/6
--- trace-13-perf 6/6
--- trace-14-perf 6/6
--- trace-15-perf 0/6 *Fail
--- trace-16-perf 6/6
--- trace-17-complexity 5/5
--- TOTAL 94/100
```
## :three: 效能分析與記憶體分析相關議題
### :arrow_heading_up: `Merge Sort` 在排序大量資料時的錯誤改善
* 參考以下的 `queue` 操作,可發現在排序大量資料 (數百萬筆等級) 時,會造成 `Segmentation Fault`,此一錯誤亦造成測資 `trace-15-perf` 失敗:
```shell
$ ./qtest
cmd> new
cmd> new
q = []
cmd> ih ls 1000000
q = [ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ... ]
cmd> it cp 1000000
q = [ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ... ]
cmd> sort
Segmentation fault (core dumped)
```
* 執行 `make valgrind`,可發現造成 `Segmentation Fault` 的原因為 `堆疊溢位(Stack Overflow)`
```shell
$ make valgrind
...
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, reverse, and sort
==9813== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==9813== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==9813== Can't extend stack to 0x1ffe8010a8 during signal delivery for thread 1:
==9813== no stack segment
==9813==
==9813== Process terminating with default action of signal 11 (SIGSEGV)
==9813== Access not within mapped region at address 0x1FFE8010A8
==9813== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==9813== at 0x10CF00: merge (queue.c:247)
==9813== If you believe this happened as a result of a stack
==9813== overflow in your program's main thread (unlikely but
==9813== possible), you can try to increase the size of the
==9813== main thread stack using the --main-stacksize= flag.
==9813== The main thread stack size used in this run was 8388608.
==9813== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==9813==
==9813== Process terminating with default action of signal 11 (SIGSEGV)
==9813== Access not within mapped region at address 0x1FFE801F70
==9813== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==9813== at 0x4A2A650: _vgnU_freeres (in /usr/lib/valgrind/vgpreload_core-amd64-linux.so)
==9813== If you believe this happened as a result of a stack
==9813== overflow in your program's main thread (unlikely but
==9813== possible), you can try to increase the size of the
==9813== main thread stack using the --main-stacksize= flag.
==9813== The main thread stack size used in this run was 8388608.
```
* 原始的實作採用的是`Recursive Merge Sort`,改以 `Iterative Merge Sort`實現,因程式在執行期間大幅減少了遞迴呼叫的次數,故可有效避免 `Stack Overflow`。
* 在排序函式中使用了元素指標陣列,在進行元素比較後直接針對元素指標陣列作 swap,並在排序完成後重新 assign list 的順序,使指定的 list 是已排序的。
* Ref: [參考程式碼](https://www.tutorialspoint.com/c-program-for-iterative-merge-sort)
```cpp=228
/*
* List-Element-Is-Less-or-Equal function
* This function compare with two list elements by input argument.
* If ele1->value is less or equal to ele2->value, then return true.
* Otherwise or NULL list element pointer input, return false.
* TODO: make this function as a Marco call to improve performance
*/
bool listelement_isle(list_ele_t *ele1, list_ele_t *ele2)
{
size_t len1, len2;
char *str1 = NULL, *str2 = NULL;
/* Input pointer check */
if (!ele1 || !ele2)
return false;
/* Assign input string and do NULL chcek */
str1 = ele1->value;
str2 = ele2->value;
if (!str1 || !str2)
return false;
/* Decrease strlen() function call count,
* We store length in the local variable.
*/
len1 = strlen(str1);
len2 = strlen(str2);
/* Call strncmp() to compare two string based on shorter string length
* This method can avoid buffer overflow attack.
* Based on return value of strncmp, this function return assigned
* comparison result
*/
return (len1 >= len2) ? (strncmp(str1, str2, len2) <= 0)
: (strncmp(str1, str2, len1) <= 0);
}
/*
* Merge function for iterative Linked-list merge sort.
*/
void merge(list_ele_t *listarr[], int lb, int mid, int rb)
{
int i, j, k;
int n1 = mid - lb + 1;
int n2 = rb - mid;
list_ele_t *L[n1], *R[n2];
for (i = 0; i < n1; i++)
L[i] = listarr[lb + i];
for (j = 0; j < n2; j++)
R[j] = listarr[mid + 1 + j];
i = 0, j = 0, k = lb;
while (i < n1 && j < n2) {
if (listelement_isle(L[i], R[j])) {
listarr[k] = L[i];
i++;
} else {
listarr[k] = R[j];
j++;
}
k++;
}
while (i < n1) {
listarr[k] = L[i];
i++;
k++;
}
while (j < n2) {
listarr[k] = R[j];
j++;
k++;
}
}
/*
* Iterative merge sort (main function) for a linked-list.
*/
void itMergeSort(list_ele_t **listarr, int lb, int rb)
{
if (lb < rb) {
int mid = lb + (rb - lb) / 2;
// assert(mid & 0x80000000); // Assert if mid goes to negative value!
itMergeSort(listarr, lb, mid);
itMergeSort(listarr, mid + 1, rb);
merge(listarr, lb, mid, rb);
}
}
/*
* Sort elements of queue in ascending order
* No effect if q is NULL or empty. In addition, if q has only one
* element, do nothing.
*/
void q_sort(queue_t *q)
{
list_ele_t *ptr;
int i;
/* Reject q is NULL case */
if (!q)
return;
/* Return directly while queue has less than or equal to 1 element */
if (q->size <= 1)
return;
/* Declare Array for list element pointer */
list_ele_t *listarray[q->size];
list_ele_t **laptr;
/* Assign list element address from head to tail (O(n) cost) */
ptr = q->head;
laptr = listarray;
while (ptr != NULL) {
*laptr = ptr;
// printf("%s\n", (*laptr)->value);
laptr++;
ptr = ptr->next;
}
/*
* Traverse array and do merge sort based on string comparison result
* Time: O(nlogn)
*/
itMergeSort(listarray, 0, (q->size - 1));
/* Based on the result of sorted list array, assign next field of each node
*/
for (i = 0; i < q->size; i++)
listarray[i]->next = (i != q->size - 1) ? listarray[i + 1] : NULL;
/* Finally, assign new head and tail pointers */
q->head = listarray[0];
q->tail = listarray[q->size - 1];
}
```
* 但這樣的實現仍然存在缺陷,在 `q_sort()` 函式內儲存指標的陣列 `listarray` 在操作大量元素時,Process 的 `Stack Space` 大小可能不足,進而造成 `Segmentation Fault`,參考下列的 Runtime GDB 操作:
```shell
$ gdb --args ./qtest -v 3 -f traces/trace-15-perf.cmd
...
(gdb) run
Starting program: /home/stch/linux2020/lab0-c/qtest -v 3 -f traces/trace-15-perf.cmd
cmd> # Test performance of insert_tail, size, reverse, and sort
...
cmd> sort
Program received signal SIGSEGV, Segmentation fault.
q_sort (q=0x555555762b80) at queue.c:348
warning: Source file is more recent than executable.
348 *laptr = ptr;
(gdb) p laptr
$1 = (list_ele_t **) 0x7fffff0bb480
(gdb) p listarray
value requires 16000000 bytes, which is more than max-value-size
(gdb) p ptr
$2 = (list_ele_t *) 0x555564b86c70
(gdb)
```
* 基於上述的缺陷,我們並無法直接透過儲存指標的陣列進行大量資料的排序,應回到直接操作各節點的 `next` 欄位操作,透過適當的方式進行分割,並合併成已排序的 List,避免空間不足的問題。