# 2020q1 Homework1 (lab0) contributed by < `jwang0306` > > [作業要求](https://hackmd.io/@sysprog/linux2020-lab0#-%E4%BD%9C%E6%A5%AD%E8%A6%81%E6%B1%82) > [GitHub](https://github.com/jwang0306/lab0-c) ## 實做歷程 ### q_new 只要有 malloc 就確認一下到底有沒有成功 ```cpp queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (!q) return NULL; q->head = NULL; q->tail = NULL; q->size = 0; return q; } ``` ### q_size ```cpp int q_size(queue_t *q) { return (!q) ? 0 : q->size; } ``` ### q_free 把每一個 list element 都 free 掉,最後才是 queue 本身 ```cpp void q_free(queue_t *q) { if (!q) return; if (q->size > 0) { list_ele_t *curr = q->head; list_ele_t *prev = NULL; while (curr) { prev = curr; curr = curr->next; free(prev->value); free(prev); } } free(q); } ``` ### q_insert_head 一樣是只要有 malloc 就檢查是否成功,然後判斷 queue 是不是空的,再另做打算 ```cpp bool q_insert_head(queue_t *q, char *s) { if (!q) return false; list_ele_t *newh; newh = malloc(sizeof(list_ele_t)); if (!newh) return false; newh->value = malloc((strlen(s) + 1) * sizeof(char)); if (!newh->value) { free(newh); return false; } memcpy(newh->value, s, (strlen(s) + 1)); if (q->size == 0) { newh->next = NULL; q->head = q->tail = newh; } else { newh->next = q->head; // insert to head q->head = newh; // update head } q->size++; return true; } ``` - 使用 valgrind 跑 trace 03: ```shell ==30056== Invalid read of size 8 ==30056== at 0x10CD68: q_reverse (queue.c:158) ==30056== by 0x109DB8: do_reverse (qtest.c:463) ==30056== by 0x10B938: interpret_cmda (console.c:220) ==30056== by 0x10BEAC: interpret_cmd (console.c:243) ==30056== by 0x10C47A: cmd_select (console.c:571) ==30056== by 0x10C6C2: run_console (console.c:630) ==30056== by 0x10ADE7: main (qtest.c:770) ==30056== Address 0x555555555555555d is not stack'd, malloc'd or (recently) free'd ==30056== ERROR: Segmentation fault occurred. You dereferenced a NULL or invalid pointer ERROR: Removed value vulture != expected value bear ... ``` valgrind 指出在 ```q_reverse``` 的地方出問題,在 ```curr = curr->next``` 的時候觸及了不合法區域,但該段程式碼看起來是沒問題的,因此推論是在 insert 的時候出問題,沒有連接好。最後發現 queue 是空的時候忘記將插入 node 的 next 設為 NULL (`newh->next = NULL`)。 ### q_insert_tail 因為要 $O(1)$ ,所以 queue 新增一個 tail 元素。架構與 ```q_insert_head``` 一模一樣。 ```cpp bool q_insert_tail(queue_t *q, char *s) { if (!q) return false; list_ele_t *newt; newt = malloc(sizeof(list_ele_t)); if (!newt) return false; newt->value = malloc((strlen(s) + 1) * sizeof(char)); if (!newt->value) { free(newt); return false; } memcpy(newt->value, s, (strlen(s) + 1)); newt->next = NULL; if (q->size == 0) { q->head = q->tail = newt; } else { q->tail->next = newt; // insert to tail q->tail = newt; // update tail } q->size++; return true; } ``` ### q_remove_head ```cpp bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (!q || q->size == 0) return false; list_ele_t *tmp = q->head; q->head = q->head->next; // remove head q->size--; if (sp) { strncpy(sp, tmp->value, bufsize - 1); sp[bufsize - 1] = '\0'; } tmp->next = NULL; free(tmp->value); free(tmp); return true; } ``` - 在第六個 test 出現了 error : > ERROR: Removed value > meerkat_panda_squirrel_vultureXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX != expected value meerkat_panda_squirrel_vulture 用 valgrind 也看不出所以然,但是錯誤訊息就直接說是 removed value ,所以就找這個 function 的問題,果然就找到了,加上 `sp[bufsize - 1] = '\0'` 來切掉。 ### q_reverse ```cpp void q_reverse(queue_t *q) { if (!q || q->size <= 1) return; list_ele_t *curr = q->head; list_ele_t *next = NULL; list_ele_t *prev = NULL; while (curr != NULL) { next = curr->next; curr->next = prev; prev = curr; curr = next; } q->tail = q->head; q->head = prev; } ``` ### q_sort 使用時間複雜度 $O(nlog(n))$ 的 merge sort ,參考 [Linked List Sort ](https://npes87184.github.io/2015-09-12-linkedListSort/) 來完成。有空想嘗試改寫成 iterative 或是 tail recursive 版本。 #### `q_sort` ```cpp void q_sort(queue_t *q) { if (!q || q->size <= 1) return; /* Merge Sort */ q->head = merge_sort(q->head, q->size); list_ele_t *tmp = q->head; while (tmp->next) { tmp = tmp->next; } q->tail = tmp; } ``` #### ```merge_sort``` ```cpp list_ele_t *merge_sort(list_ele_t *head) { if (!head || !head->next) return head; /* Split the list */ list_ele_t *list_l = NULL; list_ele_t *list_r = NULL; split(head, &list_l, &list_r); /* Sort the list*/ list_l = merge_sort(list_l); list_r = merge_sort(list_r); /* Merge the list*/ return merge(list_l, list_r); } ``` :::warning `merge_sort` 函式的最後一行 `return head` 應刪去 :notes: jserv ::: > [name=jwang0306]已刪,筆誤 #### `split` 用以切開 linked list。一個 C 函式若要回傳多個元素,其中一個方法就是 <s>call by reference</s> indirection (間接資料存取)。 :::warning C 語言==沒有== call by reference,永遠只有一種模式,也就是傳遞數值 (俗語的 "copy by value"),請用這兩個術語在 C 語言規格書找找。call by xxx 實在不是恰當的說法,即便是 C++ 語言規格書也避免用這樣不精準的話語。 :notes: jserv ::: > [name=jwang0306]謝謝老師指正。我弄錯了,我的本意是 call by address ,但仔細想想 address 本身就是一種 value 。我會補上 C 語言規格書的內容來佐證。 :::success [C99](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf) [6.5.2.2 ] **Function calls** - If the expression that denotes the called function has a type that includes a prototype, the number of arguments shall agree with the number of parameters. Each argument shall have a type such that its value may be assigned to an object with the unqualified version of the type of its corresponding parameter. - An argument may be an expression of any object type. In preparing for the call to a function, the arguments are evaluated, and each parameter is assigned the value of the corresponding argument. ::: ```cpp void split(list_ele_t *head, list_ele_t **list_l, list_ele_t **list_r) { list_ele_t *slow = head; list_ele_t *fast = head->next; while (fast && fast->next) { slow = slow->next; fast = fast->next->next; } *list_l = slow->next; *list_r = head; slow->next = NULL; // to actually cut the list } ``` #### ```merge``` 舊版本: :::spoiler 整個 ```q_sort``` 不能有額外新增的 node,所以在迴圈裡判斷是否為第一次進入,直接將元素放進 ```tmp``` ,往後再用 ```->next``` 連接,並將 ```head``` 保存下來以便回傳。 ```cpp list_ele_t *merge(list_ele_t *list_l, list_ele_t *list_r) { if (!list_l) return list_l; if (!list_r) return list_r; list_ele_t *tmp = NULL; list_ele_t *head = NULL; /* Compare each element and link together */ while (list_l && list_r) { if (strcmp(ele_l->value, ele_r->value) < 0) { if (tmp) { tmp->next = list_l; tmp = tmp->next; } else { // first access tmp = list_l; head = tmp; } list_l = list_l->next; } else { if (tmp) { tmp->next = list_r; tmp = tmp->next; } else { // first access tmp = list_r; head = tmp; } list_r = list_r->next; } } if (list_l) tmp->next = list_l; if (list_r) tmp->next = list_r; return head; } ``` 改成 natural sort 之後,在 trace-15 出現 time exceed error 。解決方法: - 盡量減少 branch 。將判斷是否為第一次以大小的事件拿出迴圈: ```cpp list_ele_t *merge(list_ele_t *list_l, list_ele_t *list_r) { if (!list_l) return list_l; if (!list_r) return list_r; list_ele_t *tmp = NULL; list_ele_t *head = NULL; /* Compare each element and link together */ if (strnatcmp(list_l->value, list_r->value) < 0) { head = tmp = list_l; list_l = list_l->next; } else { head = tmp = list_r; list_r = list_r->next; } while (list_l && list_r) { if (strnatcmp(list_l->value, list_r->value) < 0) { tmp->next = list_l; tmp = tmp->next; list_l = list_l->next; } else { tmp->next = list_r; tmp = tmp->next; list_r = list_r->next; } } if (list_l) tmp->next = list_l; if (list_r) tmp->next = list_r; return head; } ``` 但應該不是長久之計,大概要從更根本的地方解決。 ::: --- :::info 3/12 : 老師提到,在 merge function 裡頭不該出現兩次 strcasecmp 的比較,希望我們能夠研究一下如何減少為一次。 :warning: 我的本意不是「不該出現兩次 strcasecmp」,而是讓學員們思考「同等效果但更精簡」的實作手段,後者大量存在於 Linux 核心 —— 各式很不直覺但想懂後異常優雅的程式碼 :notes: jserv ::: 因此我又開始檢視,對照我最先的 code ,雖然只有出現一次,但是明顯太多層 if-else 。感謝 johnnycck 給我的想法,善用 pointer 就能順利解決問題了,也讓整段程式碼更為精簡。 ```cpp list_ele_t *merge(list_ele_t *list_l, list_ele_t *list_r) { list_ele_t *head = NULL; list_ele_t **tmp = &head; while (list_l && list_r) { if (strnatcmp(list_l->value, list_r->value) < 0) { *tmp = list_l; list_l = list_l->next; } else { *tmp = list_r; list_r = list_r->next; } tmp = &((*tmp)->next); } if (list_l) *tmp = list_l; if (list_r) *tmp = list_r; return head; } ``` 大師 Linus Torvalds 曾說: > "People who understand pointers just use a "pointer to the entry pointer", and initialize that with the address of the list_head. And then as they traverse the list, they can remove the entry without using any conditionals, by just doing`*pp = entry->next`" 因此我們先做一個指向 head 的指標: `list_ele_t **tmp = &head;` ```mermaid graph LR subgraph linked-list head==>node1==>node2 end subgraph pointer to pointer tmp==>head end ``` 在走訪 list 的時候,先```*tmp``` (dereference ```tmp``` ,找出它所指向的 node ,此刻為 head), 接著```(*tmp)->next```(找出它的 `->next` ,這邊為 head 的 next ,也就是 node1),然後 `tmp = &((*tmp)->next)` (將其 reference 存回 ```tmp``` ,此刻 `tmp` 就變為指向 node1 的指標): ```mermaid graph LR subgraph linked-list head==>node1==>node2 end subgraph pointer to pointer tmp==>node1 end ``` 如此一來 head 從頭到尾都沒有動,直接回傳 head 就好了。以上是我的理解。 :::warning 接下來可欣賞 Linux 核心原始程式碼的「藝術」: [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 實作技巧: 1. 拆成 `merge` 和 `merge_final` 並考慮到節點重建的成本; 2. 縮減所需要比較次數 (太美,沒辦法用簡單的話來說); 3. 切割 linked list 的過程即已對元素比例進行分類,並考慮到 cache 效益去調整; 顯然,掌握 pointer to pointer 操作技巧是入門訓練。 :notes: jserv ::: - 利用 perf 分析 merge sort 效能: ![](https://i.imgur.com/JHbyz4j.png) ```merge_sort``` 佔了非常多比例,進一步往下看: ![](https://i.imgur.com/YNOgY5Z.png) 發現是整個 slow fast 的分割法佔掉了大部分效能,但想想好像也不能更快了,複雜度就是 $O(n)$ 。 :::info 3/17:不止要將 merge sort 寫好, comparator 也要寫好。根據上圖 perf 效能分析,可看出主要的效能瓶頸確實也出現在 `strnatcmp` (約第 110 行左右)。可能的優化: - 定義切割字元 - 約束 natural sort 的條件 ::: ### TODO - merge sort 的 iterative 和 tail recursive 版本 - 優化 `strnatcmp` ## Massif 的運用 ## Dude, is my code constant time? ## select 系統呼叫