# 2020q1 Homework1 (lab0) contributed by < `matches4453` > > GitHub: [YanjenChen/lab0-c](https://github.com/YanjenChen/lab0-c) ## Enviroment ```shell $ lscpu 架構: x86_64 CPU 作業模式: 32-bit, 64-bit Byte Order: Little Endian CPU(s): 4 On-line CPU(s) list: 0-3 每核心執行緒數: 1 每通訊端核心數: 4 Socket(s): 1 NUMA 節點: 1 供應商識別號: GenuineIntel CPU 家族: 6 型號: 60 Model name: Intel(R) Core(TM) i5-4440 CPU @ 3.10GHz 製程: 3 CPU MHz: 800.016 CPU max MHz: 3300.0000 CPU min MHz: 800.0000 BogoMIPS: 6198.06 虛擬: VT-x L1d 快取: 32K L1i 快取: 32K L2 快取: 256K L3 快取: 6144K NUMA node0 CPU(s): 0-3 $ uname -a Linux yanjenlinux-desktop 5.3.0-40-generic #32~18.04.1-Ubuntu SMP Mon Feb 3 14:05:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux $ gcc --version gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 ``` ## Homework Requirement :::danger 修正下方英文書寫,文法和用語 :notes: jserv ::: The basic requirement of this homwork are listed at below, for more detail, please view the [documentation](https://hackmd.io/@sysprog/linux2020-lab0#-%E4%BD%9C%E6%A5%AD%E8%A6%81%E6%B1%82): - Implement the following queue operation: - `q_new`: Create a new, empty queue. - `q_free`: Free all storage used by a queue. - `q_insert_head`: Attempt to insert a new element at the head of the queue (LIFO discipline). - `q_insert_tail`: Attempt to insert a new element at the tail of the queue (FIFO discipline). - `q_remove_head`: Attempt to remove the element at the head of the queue. - `q_size`: Compute the number of elements in the queue. - `q_reverse`: Reorder the list so that the queue elements are reversed in order. - `q_sort`: Sort elements of queue in ascending order. - Replace `strcmp` with [nature sort]() in `q_sort`. ## Development Log ### `q_new` 加入判斷 `q` 是否為 NULL 避免 segmentation fault。修改前的原始碼沒有確認 `malloc` 是否成功配置記憶體。在配置失敗下初始化 `head` 會造成 segmentation fault。 commit [bed8927](https://github.com/YanjenChen/lab0-c/commit/bed89273c45a1b6eaff201d58e93bdd413e7fc6b) ```cpp queue_t *q = malloc(sizeof(queue_t)); if (q) { q->head = NULL; } ``` ### `q_free` 走訪 linked list 釋放每個節點與資料的記憶體。修改前的原始碼並沒有釋放 link list 中所有資料的記憶體,造成 memory leak。 commit [94ba00b](https://github.com/YanjenChen/lab0-c/commit/94ba00b8422fe13339cb4332e847017319065aff) ```cpp /* Remove first list element until queue is empty */ list_ele_t *tmp = q->head; while (tmp) { q->head = tmp->next; free(tmp->value); free(tmp); tmp = q->head; } ``` ### `q_insert_head` 透過 strncpy 建立資料字串的副本避免 buffer overflow,並檢查是否有成功配置記憶體。 commit [956b063](https://github.com/YanjenChen/lab0-c/commit/956b0638a4953e6536f5fc14fbd2c0a656dbac40) ```cpp list_ele_t *newh; char *news; const int slen = strlen(s); newh = malloc(sizeof(list_ele_t)); news = malloc(sizeof(char) * (slen + 1)); if (!q || !newh || !news) { return false; } /* Copy string value and manuly add \0 to buffer end */ strncpy(news, s, slen); news[slen] = '\0'; newh->next = q->head; newh->value = news; q->head = newh; ``` ### `q_insert_tail` 在 queue 的資料結構中新增 `tail` 指向 linked list 的尾端,讓時間複雜度維持 O(1)。其餘實作重點同 `q_insert_head` commit [27b7f12](https://github.com/YanjenChen/lab0-c/commit/27b7f127650329e32aa3faecf142b81a82e08785) ```cpp queue_t *q_new() { ... q->tail = NULL; ... } ``` ### `q_remove_head` 縮減 `strncpy` 補零的次數提升一點點的效能。考量到 `sp` 和儲存在 `tmp` 中的字串長度差異過大會讓 `strncpy` 補太多零在結尾,額外加入字串長度判斷。 commit [7c9d0e9](https://github.com/YanjenChen/lab0-c/commit/7c9d0e982f0efad261c84e14009ce0626324f54c) ```cpp bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { ... list_ele_t *tmp = q->head; q->head = tmp->next; /* Modify tail if remove last element in queue */ if (!(tmp->next)) { q->tail = NULL; } /* Copy to sp if specified */ const size_t slen = strlen(tmp->value); if (sp) { strncpy(sp, tmp->value, slen >= bufsize ? bufsize - 1 : slen); sp[slen >= bufsize ? bufsize - 1 : slen] = '\0'; } /* Free memory of removed element */ free(tmp->value); free(tmp); ... } ``` ### `q_size` 在 queue 的資料結構中新增 `size` 紀錄 linked list 中的節點數,讓時間複雜度維持 $O(1)$。 commit [95256c8](https://github.com/YanjenChen/lab0-c/commit/95256c8d0a72feed78dcf3abbbc7252981942774) ```cpp queue_t *q_new() { ... q->size = 0; ... } ``` ### `q_reverse` 就是一個 queue reverse。 commit [bae4099](https://github.com/YanjenChen/lab0-c/commit/bae4099d0ed630e55e20cdb4614160169b675c21), [383c97a (Bug fixed)](https://github.com/YanjenChen/lab0-c/commit/383c97a2eb837b7537448ba953175c4105d0f934) ### `q_sort` 使用 merge sort 作為 queue sort 的演算法。 Linked list 由於記憶體配置並不連續,<s>無法支援 quick sort 所需的隨機存取</s>。 `merge` 中使用遞迴會在 linked list 過長時觸發 stack overflow,因此使用迴圈替代遞迴。 :::danger 在 linked list 上,當然可實作 quick sort,參見 [linux-list/examples/quick-sort.c](https://github.com/sysprog21/linux-list/blob/master/examples/quick-sort.c)。真正的限制因素,是你的想像力。排序是演算法,linked list 是資料結構,這兩者本可達到 [interoperability](https://en.wikipedia.org/wiki/Interoperability)。 :notes: jserv ::: commit [f013761](https://github.com/YanjenChen/lab0-c/commit/f01376163e8fbf87b013f53bc497fc7cdbec789e) ```cpp /* * Merge given list by nature order * Recursive funciton call will trigger stackoverflow, * use loop instead. */ static list_ele_t *merge(list_ele_t *a, list_ele_t *b) { if (!a) return b; else if (!b) return a; list_ele_t *head, *tmp; if (strcmp(a->value, b->value) <= 0) { head = a; a = a->next; } else { head = b; b = b->next; } tmp = head; while (a && b) { /* if (strnatcmp(a->value, b->value) == -1) { */ if (strcmp(a->value, b->value) <= 0) { tmp->next = a; a = a->next; } else { tmp->next = b; b = b->next; } tmp = tmp->next; } if (!a) tmp->next = b; else if (!b) tmp->next = a; return head; } ``` ### Fix memory leak 使用 `$ make test` 測試發現以下幾個情況有 memory leak: - NULL queue operation - Malloc faild during insertion 檢查後補上在這些情況下沒有釋放記憶體的漏洞。 commit [89f3784](https://github.com/YanjenChen/lab0-c/commit/89f3784ee8ac882329ee450bdf5f820a738c2635) ```cpp bool q_insert_head(queue_t *q, char *s) { /* Return false when queue is NULL or could not allocate space */ if (!q) return false; ... newh = malloc(sizeof(list_ele_t)); news = malloc(sizeof(char) * (slen + 1)); if (!newh || !news) { if (newh) free(newh); if (news) free(news); return false; } ... } ``` ### Nature comparison merge sort 應用 [sourcefrog/natsort](https://github.com/sourcefrog/natsort) 取代 `strcmp` 讓排序結果更符合直覺。實作時對原始碼作一些修改: - 使用 `short` 型態表示 tri-state 比較結果壓縮記憶體使用。 - 為無窮迴圈加上終止條件,避免無法預期的意外發生。 - 修改結構與註解提升可讀性。 commit [096daba](https://github.com/YanjenChen/lab0-c/commit/096daba12e51ca6f8f49d5195010a68753042d73) ```cpp /* * Compare value of integer substring at start of string `a`, `b` * Return value represent the following logical condition: * - `-1` if `a` < `b` * - `0` if `a` = `b` * - `1` if `a` > `b` */ static short compare_int(char const *a, char const *b) { bool lead_zero = (*a == '0' || *b == '0'); short result = 0; /* If `a` and `b` have same digit, compare significant digit. */ /* Check which integer substring have shortest digits */ for (; isdigit(*a) && isdigit(*b); a++, b++) { if (!result) { result = (*a <= *b) ? ((*a < *b) ? -1 : 0) : 1; } else if (lead_zero) { /* If one of substring has leading zeros, return the first */ /* difference. */ return result; } } if (isdigit(*a) && !isdigit(*b)) { result = 1; } else if (!isdigit(*a) && isdigit(*b)) { result = -1; } return result; } /* * Compare string `a`, `b` base on nature order * Return value represent the following logical condition: * - `-1` if `a` < `b` * - `0` if `a` = `b` * - `1` if `a` > `b` */ static short strnatcmp(char const *a, char const *b) { /* TODO: What if string doesn't contains `\0`? */ /* TODO: Reduce time complextiy if possible. */ bool result = 0; while (*a && *b) { /* Skip leading spaces */ for (; isspace(*a); a++) ; for (; isspace(*b); b++) ; /* Compare digits */ if (isdigit(*a) && isdigit(*b) && ((result = compare_int(a, b)) != 0)) { break; } /* Compare characters */ if (*a < *b) { result = -1; } else if (*a > *b) { result = 1; } } return result; } ``` :::warning 上述程式碼改為: ```cpp for (; isspace(*a); a++) ; ``` 並試著減少程式縮排 (indention) 的深度 (depth) :notes: jserv ::: :::success 已修正 ::: ### Upgrade simulation 新增 `natsort` 選項控制控制 simulation 是否使用 nature sort。並擴充隨機字串產生器產生數字,讓 trace16 可以進行相關測試。 commit [26bb9c8](https://github.com/YanjenChen/lab0-c/commit/26bb9c82f82b1c2f82c30d566232985145504b73) ```cpp= ... static const char charset[] = "abcdefghijklmnopqrstuvwxyz0123456789"; ... bool do_sort(int argc, char *argv[]) { ... int (*strcompare)(const char *, const char *) = use_natsort ? strnatcmp : strcasecmp; ... } ``` ### Valgrind analysis 開發過程中用 valgrind 找到不少 segmentation fault 發生的位置。修正問題後發現忘了紀錄。附上修正後的結果以及 `$ make test` 的記憶體消耗紀錄。 ```shell $ make test ... +++ TESTING trace trace-17-complexity: # Test if q_insert_tail and q_size is constant time complexity Probably constant time Probably constant time --- trace-17-complexity 5/5 --- TOTAL 100/100 ``` ![](https://i.imgur.com/B8LsM7I.jpg) ## Dude, is my code constant time? :::info **WIP** ::: ## `Select` and `console.c` :::info **WIP** :::