2020q3 Homework1 (lab0)

# 2020q3 Homework1 (lab0) contributed by < `fennecJ` > > [作業說明](https://hackmd.io/@sysprog/2020-lab0) ## 索引 [TOC] ## 環境 ```shell! $ uname -a Linux fennecj-Z-Series 5.4.0-47-generic #51-Ubuntu SMP Fri Sep 4 19:50:52 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux $ gcc --version gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0 Copyright (C) 2019 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. ``` 剛開始看到這份作業時完全摸不著頭緒，總之就先照著作業說明讓 q_size() 回傳 q->size 這裡要注意考慮 !q 的情形： ```cpp! int q_size(queue_t *q) { if(!q) return 0; return q->size; } ``` 之後到 queue.h 內修改 queue_t 的 structure 加入 size ，為了讓程式能在 O(1) 的時間複雜度執行 q_insert_tail() 我也在 queue_t 中加入了 tail ： ```cpp! typedef struct { list_ele_t *head; /* Linked list of elements */ list_ele_t *tail; int size; } queue_t; ``` 接著我從 q_new() 開始著手: 首先是程式內的註解提示說要記得處理 malloc 回傳 NULL 的例外狀況： ```cpp queue_t *q = malloc(sizeof(queue_t)); if (!q) { printf("Error! Memory allocation failed\n"); return q; } ``` 再來是 q 的初始化，整段合起來會變這樣： ## q_new() ```cpp queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (!q) { printf("Error! Memory allocation failed\n"); return q; } q->size = 0; q->tail = NULL; q->head = NULL; return q; } ``` 接著我打算補完 q_free() 為了方便，我另外寫了一個 function l_free() ，其作用為以遞迴的方式 free 掉 list_ele_t 所使用的記憶體空間： ```cpp void l_free(list_ele_t *ELE) { if (ELE == NULL) return; l_free(ELE->next); free(ELE->value); free(ELE); } void q_free(queue_t *q) { if (q == NULL) return; l_free(q->head); free(q); } ``` 但是後來我看了一下 [作業說明](https://www.cs.cmu.edu/afs/cs/academic/class/15213-s20/www/labs/cprogramminglab.pdf) ，其中在第五頁的說明提到： ```! Your program will be tested on queues with over 1,000,000 elements. You will find that you cannot operateon such long lists using recursive functions, since that would require too much stack space. Instead, youneed to use a loop to traverse the elements in a list. ``` 看來我本來的寫法是行不通的。所以我把它改成 ## q_free() ```cpp= void q_free(queue_t *q) { if(q == NULL) return; while (q->head) { list_ele_t *t = q->head; q->head = q->head->next; free(t->value); free(t); } free(q); } ``` 接著著手進行 q_insert_head() 和 q_insert_tail() 的補完。首先是 q_insert_head() ## q_insert_head() **要注意幾個地方：** 1. 分配記憶體位置給字串 s 時 malloc 要用 strlen(s)+1 的位置而非 strlen(s) 2. 避免使用 strcpy 來複製字串，參見 [Common vulnerabilities guide for C programmers](https://security.web.cern.ch/recommendations/en/codetools/c.shtml) 其中有提到： ```! The strcpy built-in function does not check buffer lengths and may very well overwrite memory zone contiguous to the intended destination. In fact, the whole family of functions is similarly vulnerable: strcpy, strcat and strcmp. ``` 根據網站建議我以 strlcpy 替換掉 strcpy ： ```cpp= #ifndef strlcpy #define strlcpy(dst,src,sz) snprintf((dst), (sz), "%s", (src)) #endif ``` 3. 若 newh->value malloc 出錯，要記得 free 掉 newh 實做如下： ```cpp= bool q_insert_head(queue_t *q, char *s) { list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!q || !newh){ return false; } newh->value = malloc(sizeof(char) * strlen(s) + 1); if (!newh->value) { free(newh); return false; } strlcpy(newh->value, s, strlen(s) + 1); newh->next = q->head; q->head = newh; if (!q->tail) q->tail = newh; q->size++; return true; } ``` q_insert_tail() 也是大同小異： ## q_insert_tail() ```cpp= bool q_insert_tail(queue_t *q, char *s) { list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!q || !newh){ return false; } newh->value = malloc(sizeof(char) * (strlen(s) + 1)); if (!newh->value) { free(newh); return false; } strlcpy(newh->value, s, strlen(s) + 1); q->tail->next = newh; newh->next = NULL; q->tail = newh; if (!q->head) q->head = newh; q->size++; return true; } ``` 下一個是 q_remove_head() ，由於我使用 strlcpy() ，因此不用特別在後面補 '\0' ，和前兩個 insert() 函式一樣，都要考慮 !q 的情況，然後就是要記得 free 掉原本的 head ： ## q_remove_head() ```cpp= bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (!q) return false; if (!q->head) return false; strlcpy(sp, q->head->value, bufsize); list_ele_t *tmp = q->head; q->head = q->head->next; free(tmp->value); free(tmp); q->size--; return true; } ``` ## q_reverse() 下一個是 q_reverse() ，實作的概念和 [第一周測驗題](https://hackmd.io/@sysprog/sysprog2020-quiz1) 中的 reverse 相同，即另外設一個 cursor 指向前一個 node ，值得注意的是，因為 reverse 之後原本 head 的位置會變為 tail 的位置，因此我們可以一開始就 q->tail = q->head ： ```cpp= void q_reverse(queue_t *q) { if (!q) return; if (q->size < 2) return; list_ele_t *cursor = NULL; q->tail = q->head; while (q->head) { list_ele_t *next = q->head->next; q->head->next = cursor; cursor = q->head; q->head = next; } q->head = cursor; } ``` ## q_sort() 最後剩下 q_sort() 這個函式，這部分我是參考 [Linked List Sort](https://npes87184.github.io/2015-09-12-linkedListSort/) 實作，但是偶爾測資會 failed ，後來參考了 [tsengsam](https://hackmd.io/@shanvia/rytih9i4w) 的文章之後，把 merge 的函式改寫為 iterative 的寫法： ```cpp= list_ele_t *merge(list_ele_t *a, list_ele_t *b) { list_ele_t *head = NULL; list_ele_t **tmp = &head; while (a && b) { if (strcmp(a->value, b->value) < 0) { *tmp = a; a = a->next; } else { *tmp = b; b = b->next; } tmp = &(*tmp)->next; } if (a) *tmp = a; if (b) *tmp = b; return head; } list_ele_t *split(list_ele_t *head) { if (!head || !head->next) return head; list_ele_t *fast = head->next; list_ele_t *slow = head; while (fast && fast->next) { slow = slow->next; fast = fast->next->next; } fast = slow->next; slow->next = NULL; list_ele_t *a = split(head); list_ele_t *b = split(fast); return merge(a, b); } void q_sort(queue_t *q) { if (!q) return; if (q->size < 2) return; q->head = split(q->head); while (q->tail->next) q->tail = q->tail->next; } ``` 至此我跑了一次 test ，拿了 94 分： ```bash $ make test ... +++ TESTING trace trace-07-robust: # Test operations on NULL queue ERROR: Freed queue, but 2 blocks are still allocated --- trace-07-robust 0/6 ... TOTAL:94/100 ``` 看來是不知道哪裡有沒被 free 掉的東西，因此我用 valgrind 檢查： ```bash= $make valgrind ... +++ TESTING trace trace-07-robust: # Test operations on NULL queue ERROR: Freed queue, but 2 blocks are still allocated ==12960== 56 bytes in 1 blocks are still reachable in loss record 1 of 2 ==12960== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==12960== by 0x10D995: test_malloc (harness.c:138) ==12960== by 0x10DE3C: q_insert_head (queue.c:53) ==12960== by 0x10B9F7: do_insert_head (qtest.c:212) ==12960== by 0x10CB87: interpret_cmda (console.c:220) ==12960== by 0x10D10F: interpret_cmd (console.c:243) ==12960== by 0x10D6EC: cmd_select (console.c:569) ==12960== by 0x10D923: run_console (console.c:628) ==12960== by 0x10BFE1: main (qtest.c:772) ==12960== ==12960== 56 bytes in 1 blocks are still reachable in loss record 2 of 2 ==12960== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==12960== by 0x10D995: test_malloc (harness.c:138) ==12960== by 0x10DF02: q_insert_tail (queue.c:80) ==12960== by 0x10B75C: do_insert_tail (qtest.c:297) ==12960== by 0x10CB87: interpret_cmda (console.c:220) ==12960== by 0x10D10F: interpret_cmd (console.c:243) ==12960== by 0x10D6EC: cmd_select (console.c:569) ==12960== by 0x10D923: run_console (console.c:628) ==12960== by 0x10BFE1: main (qtest.c:772) ==12960== --- trace-07-robust 0/6 ... ``` 根據第 9 行和第 20 行提供的提示 (queue.c:53 及 queue.c:80) ，這兩行的程式碼分別在 q_insert_head() 及 q_insert_tail() 中，且皆為同樣的程式碼： ```cpp newh = malloc(sizeof(list_ele_t)); ``` 仔細檢視，發現我原本的寫法若在 !q 但已成功 malloc newh 的情況下，程式會直接離開函式但並未 free 掉 newh ： ```cpp ... list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!q || !newh){ return false; } ... ``` 因此我在 q_insert_head() 和 q_insert_tail() 的 if (!q || !newh) 中皆加上 free(newh) ： ```cpp= bool q_insert_head(queue_t *q, char *s) { list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!q || !newh){ free(newh); return false; } newh->value = malloc(sizeof(char) * strlen(s) + 1); if (!newh->value) { free(newh); return false; } strlcpy(newh->value, s, strlen(s) + 1); newh->next = q->head; q->head = newh; if (!q->tail) q->tail = newh; q->size++; return true; } bool q_insert_tail(queue_t *q, char *s) { list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!q || !newh){ free(newh); return false; } newh->value = malloc(sizeof(char) * (strlen(s) + 1)); if (!newh->value) { free(newh); return false; } strlcpy(newh->value, s, strlen(s) + 1); q->tail->next = newh; newh->next = NULL; q->tail = newh; if (!q->head) q->head = newh; q->size++; return true; } ``` 然後我又跑了一次 test ： ```bash $ make test ... TOTAL:100/100 ``` 總算拿到了 100 分。 ## 利用 valgrind 搭配視覺化工具追蹤記憶體使用情形使用工具： [Massif-Visualizer](https://kde.org/applications/en/massif-visualizer) 實驗設計：比較 ih 和 it 的記憶體使用差異：在 queue 中插入相同的字串，相同的次數，比較使用 ih 及 it 的差別。 ```bash $valgrind --tool=massif ./qtest cmd >new cmd >ih aaaaaaaaaa 30 cmd >quit ``` ![](https://i.imgur.com/NlWsamx.png) ```bash $valgrind --tool=massif ./qtest cmd >new cmd >it aaaaaaaaaa 30 cmd >quit ``` ![](https://i.imgur.com/3iZJJb4.png) 不知為何 it 在前段似乎上升的比 ih 快，重複試了幾次都是一樣的結果。原因待確認。而這個現象到 queue 中的資料量變多時就消失了： ```bash $valgrind --tool=massif ./qtest cmd >new cmd >ih aaaaaaaaaa 50 cmd >quit ``` ![](https://i.imgur.com/2ewICNw.png) ```bash $valgrind --tool=massif ./qtest cmd >new cmd >it aaaaaaaaaa 50 cmd >quit ``` ![](https://i.imgur.com/QmIPs9w.png) ## 參考資料 [Common vulnerabilities guide for C programmers](https://security.web.cern.ch/recommendations/en/codetools/c.shtml) [ZhuMon](https://hackmd.io/@ZhuMon/lab0-c) [RinHizakura](https://hackmd.io/@RinHizakura/ByuY_q6AU) [tsengsam](https://hackmd.io/@shanvia/rytih9i4w) [Linked List Sort](https://npes87184.github.io/2015-09-12-linkedListSort/)[第一周測驗題](https://hackmd.io/@sysprog/sysprog2020-quiz1)