# 2020q3 Homework1 (lab0) contributed by <`StevenChen8759`> ###### tags: :::info 在先前參與旁聽的 [`2020q1 Linux核心設計`](http://wiki.csie.ncku.edu.tw/linux/schedule) 課程中,已撰寫完成 `lab0-c` 的 *Linked-list* 功能,並進行基本的記憶體洩漏檢測與效能量測。在本次作業中會將先前已實現的功能導入並改善實現,同時著重於更加詳細且精準的程式量測,並依據量測結果改善程式。 :radio: Steven HH Chen ::: > 相關連結: > [2020春季 Linux 核心設計 lab0-c 作業說明](https://hackmd.io/@sysprog/linux2020-lab0) > [2020春季 Linux 核心設計 lab0-c 作業共筆](https://hackmd.io/@StevenHHChen/linux2020_hw01_lab0_c) > [2020春季 Linux 核心設計 lab0-c GitHub](https://github.com/StevenChen8759/lab0-c_2020q1) > _ > [2020秋季進階電腦系統理論與實作 lab0 作業說明](https://hackmd.io/@sysprog/2020-lab0) > [2020秋季進階電腦系統理論與實作 lab0 作業繳交區](https://hackmd.io/@sysprog/2020-homework1) > [2020秋季進階電腦系統理論與實作 lab0 GitHub](https://github.com/StevenChen8759/lab0-c) > [color=green] ## :one: 開發環境 * PC 硬體摘要 * CPU: *Intel Core i5-5200 @ 2.20 GHz* * CPU Arch: *x86_64* * 作業系統環境 * Guest OS: *Ubuntu 18.04 LTS* * Host OS: *Windows 8.1* * Virtual Machine: *VMWare Workstation Player 15.5.6* * Linux 核心: *Linux ubuntu 5.4.0-47-generic #51~18.04.1-Ubuntugc* * GCC 版本: *gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0* ## :two: 功能實作與其正確性測試 * 本章節著重於 Linked List 的資料結構與操作功能設計,並在前述的虛擬環境下測試功能正確性。 * 除此之外,我們將舊有的實現修改成更加優雅的寫法,以增加執行效能。 * 參考 [2020q1 Linux核心設計的 lab0-c 作實作程式碼](https://hackmd.io/@StevenHHChen/linux2020_hw01_lab0_c#-%E5%AF%A6%E4%BD%9C%E7%B5%90%E6%9E%9C%E8%88%87%E9%96%8B%E7%99%BC%E9%81%8E%E7%A8%8B%E8%A8%98%E9%8C%84),並整理功能概要如下章節。 ### :camera_with_flash: 程式功能概要 * 修改 `queue.h` 中的資料結構 `queue_t`,並新增指定欄位,以符合效能需求。 * 修改下列 `queue.c` 中對資料結構 `queue_t` 的相關操作函式,並留意以下細節: * `malloc()` 配置空間的初始值問題 * `malloc()` 回傳 `NULL` 的處理,避免 `Null Pointer Dereference` * 字串複製、比較的記憶體操作不應使用造成 `緩衝區溢位(Buffer Overflow)` 的處理函式。 * 如 `strcpy()` `strcmp()` ...等等 * 在 `remove_list_head()` 中,複製的字串應留意 buffer 空間的存取,以避免複製錯誤的數值。 * 參考 [Linux 核心設計作業一 -> lab0-c 的程式除錯部分](https://hackmd.io/@StevenHHChen/linux2020_hw01_lab0_c#-%E7%B5%90%E5%B0%BE%E5%AD%97%E5%85%83%E7%9A%84%E8%AD%B0%E9%A1%8C) * 嚴格管制動態宣告的記憶體,避免 `記憶體洩漏(memory leak)` 與 `存取空指標內含值(Null Pointer Dereference)` 等錯誤 * `q_reverse()` 的操作如下 (以四個元素的 List 操作為例): * 此方法精隨在於透過三個指標的精準操作,達成無記憶體洩漏的串列鏈結轉移。 ![init](https://i.imgur.com/Mdkj2NQ.png) ![top1](https://i.imgur.com/HEkKNJU.png) ![p1](https://i.imgur.com/bWetLZd.png) ![top2](https://i.imgur.com/rHiNan2.png) ![p2](https://i.imgur.com/XjwxwXa.png) ![top3](https://i.imgur.com/eXR2Tfv.png) ![p3](https://i.imgur.com/PgfQYOe.png) ![top4](https://i.imgur.com/SgBXCl7.png) ![p4](https://i.imgur.com/KF0rBWY.png) * `q_sort()`採用 **Merge Sort** 方法排序,時間複雜度為 **$O(nlogn)$**。 * Ref: [pes87184 的 Merge Sort 實作程式碼](https://npes87184.github.io/2015-09-12-linkedListSort/) ### :hammer: 正確性測試 * `make check` 測試指令 ```shell $ make check ./qtest -v 3 -f traces/trace-eg.cmd cmd> cmd> # Demonstration of queue testing framework cmd> # Use help command to see list of commands and options cmd> # Initial queue is NULL. cmd> show q = NULL cmd> # Create empty queue cmd> new q = [] cmd> # Fill it with some values. First at the head cmd> ih dolphin q = [dolphin] cmd> ih bear q = [bear dolphin] cmd> ih gerbil q = [gerbil bear dolphin] cmd> # Now at the tail cmd> it meerkat q = [gerbil bear dolphin meerkat] cmd> it bear q = [gerbil bear dolphin meerkat bear] cmd> # Reverse it cmd> reverse q = [bear meerkat dolphin bear gerbil] cmd> # See how long it is cmd> size Queue size = 5 q = [bear meerkat dolphin bear gerbil] cmd> # Delete queue. Goes back to a NULL queue. cmd> free q = NULL cmd> # Exit program cmd> quit Freeing queue ``` * `make test` 測試結果 ```shell $ make test ... --- Trace Points --- trace-01-ops 6/6 --- trace-02-ops 6/6 --- trace-03-ops 6/6 --- trace-04-ops 6/6 --- trace-05-ops 5/5 --- trace-06-string 6/6 --- trace-07-robust 6/6 --- trace-08-robust 6/6 --- trace-09-robust 6/6 --- trace-10-malloc 6/6 --- trace-11-malloc 6/6 --- trace-12-malloc 6/6 --- trace-13-perf 6/6 --- trace-14-perf 6/6 --- trace-15-perf 0/6 *Fail --- trace-16-perf 6/6 --- trace-17-complexity 5/5 --- TOTAL 94/100 ``` * `make valgrind` 測試結果 ```shell $ make valgrind ... --- Trace Points --- trace-01-ops 6/6 --- trace-02-ops 6/6 --- trace-03-ops 6/6 --- trace-04-ops 6/6 --- trace-05-ops 5/5 --- trace-06-string 6/6 --- trace-07-robust 6/6 --- trace-08-robust 6/6 --- trace-09-robust 6/6 --- trace-10-malloc 6/6 --- trace-11-malloc 6/6 --- trace-12-malloc 6/6 --- trace-13-perf 6/6 --- trace-14-perf 6/6 --- trace-15-perf 0/6 *Fail --- trace-16-perf 6/6 --- trace-17-complexity 5/5 --- TOTAL 94/100 ``` ## :three: 效能分析與記憶體分析相關議題 ### :arrow_heading_up: `Merge Sort` 在排序大量資料時的錯誤改善 * 參考以下的 `queue` 操作,可發現在排序大量資料 (數百萬筆等級) 時,會造成 `Segmentation Fault`,此一錯誤亦造成測資 `trace-15-perf` 失敗: ```shell $ ./qtest cmd> new cmd> new q = [] cmd> ih ls 1000000 q = [ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ... ] cmd> it cp 1000000 q = [ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ls ... ] cmd> sort Segmentation fault (core dumped) ``` * 執行 `make valgrind`,可發現造成 `Segmentation Fault` 的原因為 `堆疊溢位(Stack Overflow)` ```shell $ make valgrind ... +++ TESTING trace trace-15-perf: # Test performance of insert_tail, size, reverse, and sort ==9813== Stack overflow in thread #1: can't grow stack to 0x1ffe801000 ==9813== Stack overflow in thread #1: can't grow stack to 0x1ffe801000 ==9813== Can't extend stack to 0x1ffe8010a8 during signal delivery for thread 1: ==9813== no stack segment ==9813== ==9813== Process terminating with default action of signal 11 (SIGSEGV) ==9813== Access not within mapped region at address 0x1FFE8010A8 ==9813== Stack overflow in thread #1: can't grow stack to 0x1ffe801000 ==9813== at 0x10CF00: merge (queue.c:247) ==9813== If you believe this happened as a result of a stack ==9813== overflow in your program's main thread (unlikely but ==9813== possible), you can try to increase the size of the ==9813== main thread stack using the --main-stacksize= flag. ==9813== The main thread stack size used in this run was 8388608. ==9813== Stack overflow in thread #1: can't grow stack to 0x1ffe801000 ==9813== ==9813== Process terminating with default action of signal 11 (SIGSEGV) ==9813== Access not within mapped region at address 0x1FFE801F70 ==9813== Stack overflow in thread #1: can't grow stack to 0x1ffe801000 ==9813== at 0x4A2A650: _vgnU_freeres (in /usr/lib/valgrind/vgpreload_core-amd64-linux.so) ==9813== If you believe this happened as a result of a stack ==9813== overflow in your program's main thread (unlikely but ==9813== possible), you can try to increase the size of the ==9813== main thread stack using the --main-stacksize= flag. ==9813== The main thread stack size used in this run was 8388608. ``` * 原始的實作採用的是`Recursive Merge Sort`,改以 `Iterative Merge Sort`實現,因程式在執行期間大幅減少了遞迴呼叫的次數,故可有效避免 `Stack Overflow`。 * 在排序函式中使用了元素指標陣列,在進行元素比較後直接針對元素指標陣列作 swap,並在排序完成後重新 assign list 的順序,使指定的 list 是已排序的。 * Ref: [參考程式碼](https://www.tutorialspoint.com/c-program-for-iterative-merge-sort) ```cpp=228 /* * List-Element-Is-Less-or-Equal function * This function compare with two list elements by input argument. * If ele1->value is less or equal to ele2->value, then return true. * Otherwise or NULL list element pointer input, return false. * TODO: make this function as a Marco call to improve performance */ bool listelement_isle(list_ele_t *ele1, list_ele_t *ele2) { size_t len1, len2; char *str1 = NULL, *str2 = NULL; /* Input pointer check */ if (!ele1 || !ele2) return false; /* Assign input string and do NULL chcek */ str1 = ele1->value; str2 = ele2->value; if (!str1 || !str2) return false; /* Decrease strlen() function call count, * We store length in the local variable. */ len1 = strlen(str1); len2 = strlen(str2); /* Call strncmp() to compare two string based on shorter string length * This method can avoid buffer overflow attack. * Based on return value of strncmp, this function return assigned * comparison result */ return (len1 >= len2) ? (strncmp(str1, str2, len2) <= 0) : (strncmp(str1, str2, len1) <= 0); } /* * Merge function for iterative Linked-list merge sort. */ void merge(list_ele_t *listarr[], int lb, int mid, int rb) { int i, j, k; int n1 = mid - lb + 1; int n2 = rb - mid; list_ele_t *L[n1], *R[n2]; for (i = 0; i < n1; i++) L[i] = listarr[lb + i]; for (j = 0; j < n2; j++) R[j] = listarr[mid + 1 + j]; i = 0, j = 0, k = lb; while (i < n1 && j < n2) { if (listelement_isle(L[i], R[j])) { listarr[k] = L[i]; i++; } else { listarr[k] = R[j]; j++; } k++; } while (i < n1) { listarr[k] = L[i]; i++; k++; } while (j < n2) { listarr[k] = R[j]; j++; k++; } } /* * Iterative merge sort (main function) for a linked-list. */ void itMergeSort(list_ele_t **listarr, int lb, int rb) { if (lb < rb) { int mid = lb + (rb - lb) / 2; // assert(mid & 0x80000000); // Assert if mid goes to negative value! itMergeSort(listarr, lb, mid); itMergeSort(listarr, mid + 1, rb); merge(listarr, lb, mid, rb); } } /* * Sort elements of queue in ascending order * No effect if q is NULL or empty. In addition, if q has only one * element, do nothing. */ void q_sort(queue_t *q) { list_ele_t *ptr; int i; /* Reject q is NULL case */ if (!q) return; /* Return directly while queue has less than or equal to 1 element */ if (q->size <= 1) return; /* Declare Array for list element pointer */ list_ele_t *listarray[q->size]; list_ele_t **laptr; /* Assign list element address from head to tail (O(n) cost) */ ptr = q->head; laptr = listarray; while (ptr != NULL) { *laptr = ptr; // printf("%s\n", (*laptr)->value); laptr++; ptr = ptr->next; } /* * Traverse array and do merge sort based on string comparison result * Time: O(nlogn) */ itMergeSort(listarray, 0, (q->size - 1)); /* Based on the result of sorted list array, assign next field of each node */ for (i = 0; i < q->size; i++) listarray[i]->next = (i != q->size - 1) ? listarray[i + 1] : NULL; /* Finally, assign new head and tail pointers */ q->head = listarray[0]; q->tail = listarray[q->size - 1]; } ``` * 但這樣的實現仍然存在缺陷,在 `q_sort()` 函式內儲存指標的陣列 `listarray` 在操作大量元素時,Process 的 `Stack Space` 大小可能不足,進而造成 `Segmentation Fault`,參考下列的 Runtime GDB 操作: ```shell $ gdb --args ./qtest -v 3 -f traces/trace-15-perf.cmd ... (gdb) run Starting program: /home/stch/linux2020/lab0-c/qtest -v 3 -f traces/trace-15-perf.cmd cmd> # Test performance of insert_tail, size, reverse, and sort ... cmd> sort Program received signal SIGSEGV, Segmentation fault. q_sort (q=0x555555762b80) at queue.c:348 warning: Source file is more recent than executable. 348 *laptr = ptr; (gdb) p laptr $1 = (list_ele_t **) 0x7fffff0bb480 (gdb) p listarray value requires 16000000 bytes, which is more than max-value-size (gdb) p ptr $2 = (list_ele_t *) 0x555564b86c70 (gdb) ``` * 基於上述的缺陷,我們並無法直接透過儲存指標的陣列進行大量資料的排序,應回到直接操作各節點的 `next` 欄位操作,透過適當的方式進行分割,並合併成已排序的 List,避免空間不足的問題。