# [2020q3 Homework1 (lab0)](https://hackmd.io/@sysprog/2020-lab0) contributed by <`zhu849`> ## 操作環境 ``` $ cat /etc/lsb-release DISTRIB_ID=Ubuntu DISTRIB_RELEASE=20.04 DISTRIB_CODENAME=focal DISTRIB_DESCRIPTION="Ubuntu 20.04.1 LTS" $ uname -a Linux zhu-MS-7414 5.4.0-47-generic #51-Ubuntu SMP Fri Sep 4 19:50:52 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux $ gcc --version gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0 $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 39 bits physical, 48 bits virtual CPU(s): 4 On-line CPU(s) list: 0-3 Thread(s) per core: 1 Core(s) per socket: 4 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family: 6 Model: 158 Model name: Intel(R) Core(TM) i5-7400 CPU @ 3.00GHz Stepping: 9 CPU MHz: 800.094 CPU max MHz: 3500.0000 CPU min MHz: 800.0000 BogoMIPS: 6000.00 Virtualization: VT-x L1d cache: 128 KiB L1i cache: 128 KiB L2 cache: 1 MiB L3 cache: 6 MiB NUMA node0 CPU(s): 0-3 ``` ## Queue實作 ### 實作目標 * `q_new()`: * Create empty queue. * Return NULL if could not allocate space. * `q_free()`: * Free all storage used by queue * `q_insert_head()`: * Attempt to insert element at head of queue. * Return true if successful. * Return false if `q` is NULL or could not allocate space. * Argument `s` points to the string to be stored. * The function must explicitly allocate space and copy the string into it. * `q_insert_tail()`: * Attempt to insert element at head of queue. * Return true if successful. * Return false if `q` is NULL or could not allocate space. * Argument `s` points to the string to be stored. * The function must explicitly allocate space and copy the string into it. * `q_remove_head()`: * Attempt to remove element from head of queue. * Return true if successful. * Return false if queue is NULL or empty. * If `sp` is non-NULL and an element is removed, copy the removed string to `*sp` * (up to a maximum of bufsize-1 characters, plus a null terminator.) * The space used by the list element and the string should be freed. * `q_size()`: * Return number of elements in queue. * Return 0 if `q` is NULL or empty * `q_reverse()`: * Reverse elements in queue * No effect if `q` is NULL or empty * This function should not allocate or free any list elements * (e.g., by calling q_insert_head, q_insert_tail, or q_remove_head). * It should rearrange the existing ones. * `q_sort()`: * Reverse elements in queue * No effect if `q` is NULL or empty * This function should not allocate or free any list elements * (e.g., by calling q_insert_head, q_insert_tail, or q_remove_head). * It should rearrange the existing ones. **以下 Step 代表實作時的思考順序,每個function都非一氣呵成,必須經過多次的修改,但是為了讓讀者可以更懂我的思考,每次的改動都會清楚寫出來** ### ==q_size()== **Step 0** * 按照老師給予的前導,修改 `q_size()` ```clike= int q_size(queue_t *q) { if (!q) return 0; return q->size; } ``` ### ==q_new()== ver.1 #### Step 1 * 先根據`q_new()`內的敘述來看 > What if malloc returned NULL? > * 考慮 q malloc 後可能為 NULL ,若為 NULL 則 `q->head = NULL`將會無法執行 ```clike= queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if(!q){ q->head = NULL; q->size = 0; } return q; } ``` ### ==q_insert_head()== #### Step 2 * 接著考慮 `q_insert_head()`的敘述 > What should you do if the q is NULL? > * 跟 `q_new()`相同需要考慮如果q為 NULL 情況時的處理 > > Don't forget to allocate space for the string and copy it > * 若新的空間(newh)配置成功,則需要再配置一個用來存放字串的char陣列 > > What if either call to malloc returns NULL? > * 承上點,需要考慮記憶體配置是否成功 * 若 `newh` 的記憶體配置成功,但是 `newh->value`的配置失敗的話,我們需要先將 `newh` 進行記憶體釋放才能回傳 Ver.1 ```clike= bool q_insert_head(queue_t *q, char *s) { list_ele_t *newh; if (!q) return false; newh = malloc(sizeof(list_ele_t)); if (!newh) return false; newh->value = malloc(sizeof(char) * (strlen(s) + 1)); if (!newh->value) { free(newh); return false; } strncpy(newh->value, s, strlen(s)); newh->value[strlen(s)] = '\0'; newh->next = q->head; q->head = newh; q->size++; return true; } ``` ### ==q_insert_tail()== #### Step 3 * 接著考慮 `q_insert_tail()`的敘述 > Remember: It should operate in O(1) time > * 考慮要在 O(1) 時間內完成,這代表我們不能夠利用迴圈從頭開始計數,需要從結構下手 ### ==Queue structure== #### Step 4 * 根據 **Step 3** 的考慮,我們先到`queue.h`的檔案內修改 Queue 的結構 * 在結構內新增 list_ele_t 型態的 \*tail 指向 list 的最後一個元素 ```clike= /* Queue structure */ typedef struct { list_ele_t *head; /* Linked list of elements */ list_ele_t *tail; int size; } queue_t; ``` #### Step 5 * 這邊和 `q_insert_head()` 的架構不完全相同,因為 `q_tail` 要先判斷是否為 NULL 才能決定插入位置。若是 NULL 則直接將 `q_tail` 指向 `newt`,反之若不是則要將 `q_tail->next` 先指向 `newt` * 另外也需要考慮到當 `q_head`為 NULL 時要將 `q_head` 也指向 newt ```clike= bool q_insert_tail(queue_t *q, char *s) { if (!q) return false; list_ele_t *newt = malloc(sizeof(list_ele_t)); if (!newt) return false; newt->value = malloc(sizeof(char) * (strlen(s) + 1)); if (!newt->value) { free(newt); return false; } strncpy(newt->value, s, strlen(s)); newh->value[strlen(s)] = '\0'; newt->next = NULL; if (q->tail) q->tail->next = newt; q->tail = newt; if (!q->head) q->head = q->tail; q->size++; return true; } ``` ### ==q_new()== ver.2 #### Step 6 * 因為我們剛剛在 Step 4 中新增了 \*tail 的指標,所以我們也需要去修改`q_new()`和`q_insert_head()`的內容來確保新增的點或是插入的點是不是位於 tail node ```clike= queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (q) { q->head = NULL; q->tail = NULL; q->size = 0; } return q; } ``` ### ==q_insert_head()== ver.2 ```clike= bool q_insert_head(queue_t *q, char *s) { list_ele_t *newh; if (!q) return false; newh = malloc(sizeof(list_ele_t)); if (!newh) return false; newh->value = malloc(sizeof(char) * (strlen(s) + 1)); if (!newh->value) { free(newh); return false; } strncpy(newh->value, s, strlen(s)); newh->value[strlen(s)] = '\0'; newh->next = q->head; q->head = newh; if (!q->tail) q->tail = newh; q->size++; return true; } ``` ### ==q_remove_head()== #### Step 7 * `sp` 的使用情境與意義:做的操作有點像是在"pop",意即在達到移除目標節點後,我們仍要取的目標的值,需要注意的是,如果`sp`不是 NULL 代表已經宣告好 bufsize 大小的記憶體空間了 * `bufsize`的考慮:bufsize是我們最大保留的 string 長度,因為我們所複製的 string 長度可能會小於給定的 bufsize,所以需要做長度的判斷 * strcpy 與 strncpy 的選擇與問題 //TODO ```clike= bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (!q || !(q->head)) { return false; } list_ele_t *tmp = q->head; q->head = q->head->next; tmp->next = NULL; // sp is non-NULL, check real bufsize if (sp) { size_t realbufsize = strlen(tmp->value) < bufsize ? (strlen(tmp->value) + 1) : bufsize; strncpy(sp, tmp->value, realbufsize); sp[realbufsize - 1] = '\0'; } free(tmp->value); free(tmp); q->size--; return true; } ``` ### ==q_reverse()== * 先檢查 `q` 和 `q->head` 為 NULL 的情況後執行 list reverse 的動作 * 需要使用到一個 `cursor` 讓尾端的 next 指向,next宣告在 while 內的可以帶來較好的安全性(不會被亂使用) * line 15:當 `q->head` 指到 NULL 時,這時候的 `cursor` 才是真正的 head,所以要將 `q->head` 指標更新 ```c= void q_reverse(queue_t *q) { if (!q || !(q->head)) return; // set tail position before do sth with q->head q->tail = q->head; list_ele_t *cursor = NULL; while (q->head) { list_ele_t *next = q->head->next; q->head->next = cursor; cursor = q->head; q->head = next; } q->head = cursor; } ``` ### ==q_free()== * 先看 `q_free()` 的敘述 > How about freeing the list elements and the strings? * 若要達到完整的 free list,就必須先 free 掉內容值,再 free 掉自己本身 * 利用 `prev` 指標指向原本的 `q_head` 再對 `prev` 做處理 ```c= void q_free(queue_t *q) { if (!q) return; while (q->head) { list_ele_t *prev = q->head; q->head = q->head->next; free(prev->value); free(prev); } free(q); } ``` ### ==q_sort()== * 在大學課程中我們學到的 sort 有好幾種,有 bubble sort, selection sort, insertion sort, merge sort etc. ![](https://i.imgur.com/aLAzeQA.png) [圖片來源](http://notepad.yehyeh.net/Content/Algorithm/Sort/Sort.php) * 為了增加 sort 內操作的靈活度,直接傳入 `&q->head` 到 sort function 內 * 為什麼不考慮利用已經算好的 `q->size` 傳入方便 split 的操作? * Ans: 考慮以下兩段程式碼,可以發現版本一雖然傳入 `qsize` 感覺可以更快找到 split 位置,但實際上兩個版本執行的次數是一樣的,因為在 linked-list 結構下沒辦法利用 index search 找到中間點,必須透過一一尋找才能找到我們要 split 的那個點 * 版本一: ```c= void merge_sort(list_ele_t **head, int qsize) { list_ele_t left = *head; list_ele_t right = *head; for (int i = 0; i < qsize / 2; i++) { right = right->next; } } ``` * 版本二: ```c= void merge_sort(list_ele_t **head) { list_ele_t *left = *head; list_ele_t *right = (*head)->next; while (right && right->next) { left = left->next; right = right->next->next; } right = left->next; } ``` * 另外在解釋這段 code 的抽象意義 1. ```graphviz digraph structs { node [shape=record]; left[shape=plaintext,fontcolor=darkgreen]; right[shape=plaintext,fontcolor=blue]; head[shape=plaintext,fontcolor=red]; "address of node"[shape=plaintext,fontcolor=black] struct1 [label="{A|<ref>}"]; struct2 [label="{B|<ref>}"]; struct3 [label="{C|<ref>}"]; struct4 [label="{D|<ref>}"]; struct1:ref:c -> struct2 [arrowtail=dot, dir=both,tailclip=false]; struct2:ref:c -> struct3 [arrowtail=dot, dir=both,tailclip=false]; struct3:ref:c -> struct4 [arrowtail=dot, dir=both,tailclip=false]; head -> "address of node" -> struct1 left -> struct1 right -> struct2 rankdir = LR } ``` * 利用 `strcmp(str1, str2)` 來當作比較基準做排序 * 回傳值若是小於0 - 代表 str2 > str1 * 回傳值若是大於0 - 代表 str2 < str1 * 回傳值若是等於0 - 代表 str2 == str1 * 這個寫法不是很滿意,感覺沒有盡可能的用到 `head` 傳入 pointer to pointer 的優勢,而且 line 32 - line 50 感覺有點醜,感覺還能再更好 //TODO ```c= void q_sort(queue_t *q) { if (!q || q->size < 2) return; merge_sort(&q->head); // update q->tail while (q->tail->next) q->tail = q->tail->next; } void merge_sort(list_ele_t **head) { // queue element size is less then 2 if (!(*head) || !((*head)->next)) return; list_ele_t *left = *head; list_ele_t *right = (*head)->next; while (right && right->next) { left = left->next; right = right->next->next; } right = left->next; left->next = NULL; left = (*head); merge_sort(&left); merge_sort(&right); list_ele_t *newlist = NULL; while (left && right) { if (strcmp(left->value, right->value) < 0) { if (newlist) { newlist->next = left; newlist = newlist->next; } else { newlist = left; *head = newlist; } left = left->next; } else { if (newlist) { newlist->next = right; newlist = newlist->next; } else { newlist = right; *head = newlist; } right = right->next; } } newlist->next = (left) ? left : right; } ``` :::warning 延伸問題:要再多小的節點數量不適合用 merge sort? ::: ## 實作結果 ![](https://i.imgur.com/3gPglwd.png) ### 實作中途遭遇困難 #### 問題一:strncpy的使用 * 中途曾經測試時遇到以下問題: ![](https://i.imgur.com/wrss1ZG.png) * 經過檢查程式碼的記憶體使用後發現對 strncpy 的使用不夠了解 | | strcpy | strncpy | | -------- | ---------------------------------------- | ------------------------------------------------------- | | 呼叫方式 | strcpy( char \*dest, const char \*src ); | strncpy( char \*dest, const char \*src, size_t count ); | | 存在問題 | buffer overflow | 若 src 字元數比 count 少,剩下格子用 '\0' 補滿未滿的格。若 src 字元數比 count 多,它**不會**幫忙補 '\0' | * 所以使用 strncpy 時需要利用 `newh->value[strlen(s)] = '\0'` 給定結束符號 ## 測試與實驗 #### 使用指令 ``` $ valgrind --tool=massif ./qtest -f ./traces/trace-02-ops.cmd $ ms_print massif.out.18587 $ massif-visualizer massif.out.18587 ``` ### Valgrind #### `trace-02-ops.cmd` 分析 執行內容: ``` cmd> # Test of insert_head, insert_tail, and remove_head cmd> option fail 0 cmd> option malloc 0 cmd> new q = [] cmd> ih gerbil q = [gerbil] cmd> ih bear q = [bear gerbil] cmd> ih dolphin q = [dolphin bear gerbil] cmd> it meerkat q = [dolphin bear gerbil meerkat] cmd> it bear q = [dolphin bear gerbil meerkat bear] cmd> it gerbil q = [dolphin bear gerbil meerkat bear gerbil] cmd> rh dolphin Removed dolphin from queue q = [bear gerbil meerkat bear gerbil] cmd> rh bear Removed bear from queue q = [gerbil meerkat bear gerbil] cmd> rh gerbil Removed gerbil from queue q = [meerkat bear gerbil] cmd> rh meerkat Removed meerkat from queue q = [bear gerbil] cmd> rh bear Removed bear from queue q = [gerbil] cmd> rh gerbil Removed gerbil from queue q = [] Freeing queue ``` ![](https://i.imgur.com/x8KrUoC.jpg) #### `trace-08-robust.cmd` 分析 ``` cmd> # Test operations on empty queue cmd> option fail 10 cmd> option malloc 0 cmd> new q = [] cmd> rh Warning: Calling remove head on empty queue Removal from queue failed q = [] cmd> reverse q = [] cmd> size Queue size = 0 q = [] cmd> sort Warning: Calling sort on single node q = [] Freeing queue ``` ![](https://i.imgur.com/RHXEI0a.jpg) #### `trace-10-malloc.cmd` 分析 ``` cmd> # Test of malloc failure on new cmd> option fail 10 cmd> option malloc 50 cmd> new WARNING: Malloc returning NULL q = NULL cmd> new WARNING: Malloc returning NULL q = NULL cmd> new WARNING: Malloc returning NULL q = NULL cmd> new WARNING: Malloc returning NULL q = NULL cmd> new q = [] cmd> new Freeing old queue q = NULL WARNING: Malloc returning NULL q = NULL cmd> cmd> Freeing queue ``` ![](https://i.imgur.com/bsevBlv.jpg) #### `trace-15-perf.cmd` 分析 ``` cmd> # Test performance of insert_tail, size, reverse, and sort cmd> option fail 0 cmd> option malloc 0 cmd> new q = [] cmd> ih dolphin 1000000 ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient ERROR: Insertion of dolphin failed (1 failures total) q = [dolphin ... ] cmd> it gerbil 1000000 ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient ERROR: Insertion of gerbil failed (2 failures total) q = [dolphin ... ] cmd> size 1000 Queue size = 1090429 q = [dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin ... ] cmd> reverse q = [gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil ... ] cmd> sort q = [dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin ... ] cmd> size 1000 Queue size = 1090429 q = [dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin ... ] cmd> ``` ![](https://i.imgur.com/treFKD2.jpg)