2023q1 Homework1 (lab0)

# 2023q1 Homework1 (lab0) contributed by < [willow11235811](https://github.com/willow11235811) > ## 開發環境 ```shell $ paleofetch OS: Arch Linux x86_64 Host: X570 AORUS ELITE -CF Kernel: 6.1.12-arch1-1 Uptime: 3 hours, 41 mins Battery: Not found Packages: 1791 (pacman) Shell: zsh Resolution: 1920x1080 Terminal: st-256color CPU: AMD Ryzen 7 5700X 8- Processor (16) @ 4.661GHz Memory: 2850MiB / 32023MiB (8%) $ gcc --version gcc (GCC) 12.2.1 20230201 ``` ## :notebook: 開發過程 - [x] 實作 `queue.c` - [ ] 開啟 [Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer)，修正 `qtest` 執行過程中的錯誤 - [ ] 運用 Valgrind 排除 `qtest` 實作的記憶體錯誤，並透過 Massif 視覺化 "simulation" 過程中的記憶體使用量，需要設計對應的實驗 - [ ] 精簡 ```q_delete_dup``` - [ ] 思考其他 ```q_merge``` 實作方式 - [ ] 研讀 Linux 核心原始程式碼的 [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 並嘗試引入到 `lab0-c` 專案 :::spoiler 比較你自己實作的 merge sort 和 Linux 核心程式碼之間效能落差 ::: - [ ] 確保 `qtest` 提供的 `web` 命令得以搭配上述佇列實作使用 :::spoiler 目前一旦網頁伺服器啟動後，原本處理命令列操作的 `linenoise` 套件就無法使用，請提出改善機制 * 提示: 使用 [select](http://man7.org/linux/man-pages/man2/select.2.html) 系統呼叫 ::: - [ ] 在 `qtest` 提供新的命令 `shuffle` :::spoiler 允許藉由 [Fisher–Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle) 演算法，對佇列中所有節點進行洗牌 (shuffle) 操作，需要以統計的原理來分析你的實作，探究洗牌的「亂度」 ::: - [ ] 在 `qtest` 中執行 `option entropy 1` 並搭配 `ih RAND 10` 一類的命令 :::spoiler 觀察亂數字串的分布，並運用「資訊與熵互補，資訊就是負熵」的觀念，設計一組新的命令或選項，得以在 `qtest` 切換不同的 PRNG 的實作 (可選定 [Xorshift](https://en.wikipedia.org/wiki/Xorshift))，搭配統計分析工具，比較亂數產生器 (如 Xorshift vs. 內建) 的品質 * 參見: [由大數法則理解訊息熵 (entropy) 的數學意義](https://hackmd.io/@8dSak6oVTweMeAe9fXWCPA/ryoegPgw_) * 參見: [The Linux Pseudorandom Number Generator Revisited](https://eprint.iacr.org/2012/251.pdf) ::: - [ ] 研讀論文〈[Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf)〉 :::spoiler 解釋本程式的 "simulation" 模式是如何透過以實驗而非理論分析，達到驗證時間複雜度，需要解釋 [Student's t-distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution) 及程式實作的原理。 * 注意：現有實作存在若干致命缺陷，請討論並提出解決方案 * 在 [oreparaz/dudect](https://github.com/oreparaz/dudect) 的程式碼具備 percentile 的處理，但在 [lab0-c](https://github.com/sysprog21/lab0-c) 則無。 ::: - [ ] 指出現有程式的缺陷或者可改進之處 (例如更全面地使用 Linux 核心風格的鏈結串列 API)，嘗試實作並提交 pull request - [ ] 詳閱第 1 週所有教材及作業描述 [(一)](/@sysprog/linux2023-lab0-a), [(二)](/@sysprog/linux2023-lab0-b), [(三)](/@sysprog/linux2023-lab0-c), [(四)](/@sysprog/linux2023-lab0-d), [(五)](/@sysprog/linux2023-lab0-e)，記錄你的啟發和疑惑 - [ ] 解釋 [select](http://man7.org/linux/man-pages/man2/select.2.html) 系統呼叫在本程式的使用方式，並分析 [console.c](https://github.com/sysprog21/lab0-c/blob/master/console.c) 的實作 :::spoiler 說明其中運用 CS:APP [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 的原理和考量點。可對照參考 [CS:APP 第 10 章重點提示](https://hackmd.io/@sysprog/H1TtmVTTz) ::: ## 實作 queue.c > Score: 95 / 100 ### q_new 利用 `malloc` 配置記憶體。若配置失敗先釋放記憶體空間，並回傳 ```NULL``` 值。若成功配置，呼叫 ```list.h``` 的 ```INIT_LIST_HEAD``` 進行初始化。 > commit: [85b8cd1](https://github.com/willow11235811/lab0-c/commit/85b8cd13dfcf449faa1f8fe457bbdfded310c423) :::danger 不用張貼完整程式碼。 :notes: jserv > 已修改 ::: ### q_free 先檢查是否為 ```NULL``` 佇列或是空的佇列，利用 ```list.h``` 提供的 ```list_for_each_safe``` 走訪所有節點，再使用 ```q_release_element``` 釋放節點內的所有 element。 commit: [325c1fe](https://github.com/willow11235811/lab0-c/commit/325c1fe16620b797587000de97805c7af608a008) :::spoiler ```c void q_free(struct list_head *l) { if (!l) return; if (list_empty(l)) { free(l); return; } struct list_head *node, *save; list_for_each_safe(node, save, l) q_release_element(container_of(node, element_t, list)); free(l); } ``` ::: ### q_insert_head 利用 ```list.h``` 提供的鏈結串列巨集和函式，可管理鏈結串列，重點在於實作出新的節點和管理指標與記憶體。藉由 ```malloc``` 和 ```strncpy``` 來獲取新的記憶體位置。 commit: [aae2f1c](https://github.com/willow11235811/lab0-c/commit/aae2f1c1f3bea14bde7b4419063f4ff1e6f735b0) :::spoiler ```c bool q_insert_head(struct list_head *head, char *s) { if (!head || !s) return false; element_t *new_ele = malloc(sizeof(element_t)); if (!new_ele) { free(new_ele); return false; } size_t len = strlen(s) + 1; char *new_str = malloc(len * sizeof(char)); if (!new_str) { free(new_str); free(new_ele); return false; } strncpy(new_str, s, head); new_ele->value = new_str; list_add(&new_ele->list, head); return true; } ``` ::: ### q_insert_tail 與 ```q_insert_head``` 作法相同，改使用 ```list_add_tail``` 實作。 commit: [f04edd1](https://github.com/willow11235811/lab0-c/commit/f04edd1f3cc38f2996ce8701273df36da51b5f50) :::spoiler ```c bool q_insert_tail(struct list_head *head, char *s) { if (!head || !s) return false; element_t *new_ele = malloc(sizeof(element_t)); if (!new_ele) { free(new_ele); return false; } size_t len = strlen(s) + 1; char *new_str = malloc(len * sizeof(char)); if (!new_str) { free(new_str); free(new_ele); return false; } strncpy(new_str, s, len); new_ele->value = new_str; list_add_tail(&new_ele->list, head); return true; } ``` ::: ### q_remove_head 要移除佇列的頭部且回傳被移除的元素，同時要將元素的字串資料儲存。利用 ```list.h``` 提供的 ```container_of``` 可以找到頭部元素，並使用 ```list_del``` 移除元素，使用 ```strncpy``` 複製字串資料，並在最後加上 null terminator 。 commit: [7d4331c](https://github.com/willow11235811/lab0-c/commit/7d4331c7878ec2ac89dbde4dcbbaaa51d9767f8a), [ca27194](https://github.com/willow11235811/lab0-c/commit/ca27194a831fcd0f553c587dba21ad41ff295a0e) :::spoiler ```c element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *tmp = container_of(head->next, element_t, list); if (sp && bufsize) { strncpy(sp, tmp->value, bufsize - 1); *(sp + bufsize - 1) = '\0'; } list_del(head->next); return tmp; } ``` ::: ### q_remove_tail 與 q_remove_head 作法相同，改尋找尾部。 commit: [7aad974](https://github.com/willow11235811/lab0-c/commit/7aad974e1b4ca3cc0010ea2ff66d182bc1a9a3db), [ca27194](https://github.com/willow11235811/lab0-c/commit/ca27194a831fcd0f553c587dba21ad41ff295a0e) :::spoiler ```c element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *tmp = container_of(head->prev, element_t, list); if (sp && bufsize) { strncpy(sp, tmp->value, bufsize - 1); *(sp + bufsize - 1) = '\0'; } list_del(head->prev); return tmp; } ``` ::: ### q_size 利用 ```list.h``` 提供的 ```list_for_each``` 走訪所有節點，這個實作的時間複雜度是 $O(N)$ 。 commit: [693b886](https://github.com/willow11235811/lab0-c/commit/693b886ca63fdffcd209fe2242fcdbfdab3f172b) :::spoiler ```c int q_size(struct list_head *head) { if (!head || list_empty(head)) return 0; int size = 0; struct list_head *count = NULL; list_for_each (count, head) size++; return size; } ``` ::: ### q_delete_mid 利用快慢指標演算法 (Tortoise and Hare Algorithm) ，藉由快指標：慢指標移動速度2：1，當快指標走完整個 linked list 時，慢指標會到達中點。 commit: [17573d1](https://github.com/willow11235811/lab0-c/commit/17573d14933fb79049c6c29cbea39916b7ca7e0a) :::spoiler ```c bool q_delete_mid(struct list_head *head) { // https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/ if (!head || list_empty(head)) return false; struct list_head *fast = head->next, *slow = head->next; while (fast != head && fast->next != head) slow = slow->next, fast = fast->next->next; list_del(slow); q_release_element(list_entry(slow, element_t, list)); return true; } ``` ::: ### q_delete_dup 一開始想利用 ```list_for_each_safe``` 走訪所有節點，但根據定義只允許刪除 ```node``` 節點，其他操作都可能造成 undefined behavior 。所以利用 ```tmp``` 走訪 ```save``` 到尾段的節點，只要發現 ```node``` 與 ```tmp``` 有相同的 ```value``` ，就刪除 ```node``` 這個節點。再利用 ```break``` 跳脫至最外層的 ```list_for_each_safe``` 繼續走訪剩下節點。 commit: [930e5db](https://github.com/willow11235811/lab0-c/commit/930e5db42ec4fe94345cbcc709a6f90f3491a50f) :::spoiler Version 1 ```c bool q_delete_dup(struct list_head *head) { // https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/ if (!head) return false; struct list_head *node, *save, *tmp; list_for_each_safe (node, save, head) { for (tmp = save; tmp != head; tmp = tmp->next) if (!strcmp(list_entry(node, element_t, list)->value, list_entry(tmp, element_t, list)->value)) { list_del(node); q_release_element(list_entry(node, element_t, list)); break; } } return true; } ``` ::: :::danger :bug: Bugs 隨後仔細閱讀 ```queue.h``` 發現 ```q_delete_dup``` 的要求是刪除「所有」重複的節點， Version 1 會留下1個節點。 ::: #### Debug 加入 ```is_dup``` 判斷是否有連續的節點重複，但程式碼應該還能再精簡。 commit: [03d9191](https://github.com/willow11235811/lab0-c/commit/03d91915a6151531ec3aa3532ea86d60622e33f2) :::info :notebook_with_decorative_cover: ToDo 精簡 ```q_delete_dup``` ::: :::spoiler Version 2 ```c bool q_delete_dup(struct list_head *head) { // https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/ if (!head) return false; struct list_head *node, *save; bool is_dup = false; list_for_each_safe (node, save, head) { if (node->next != head && !strcmp(list_entry(node, element_t, list)->value, list_entry(save, element_t, list)->value)) { is_dup = true; list_del(node); q_release_element(list_entry(node, element_t, list)); } else if (is_dup) { is_dup = false; list_del(node); q_release_element(list_entry(node, element_t, list)); } } return true; } ``` ::: ### q_swap 利用 indirect pointer 的技巧，直接操作要互換節點的指標。 commit: [25817e5](https://github.com/willow11235811/lab0-c/commit/25817e521436b7196bafcc8c5a7abe768220b7bf) :::spoiler Version 1 ```c void q_swap(struct list_head *head) { // https://leetcode.com/problems/swap-nodes-in-pairs/ if (!head) return; for (struct list_head **l = &head->next; *l != head && (*l)->next != head; l = &(*l)->next) { struct list_head *first = head, *second = head->next; first->prev = second; first->next = second->next; second->prev = first->prev; second->next = first; *l = first; } } ``` ::: :::danger :bug: Bugs 1. 出現無限迴圈 2. 指標操作應可用 ```list.h``` 提供的程式精簡 ::: #### Debug 使用 ```list_move``` 增加可讀性，並避免無限迴圈。 commit: [b9be753](https://github.com/willow11235811/lab0-c/commit/b9be753f88c2f9287de409f06061cf869b2aa8ea) :::spoiler Version 2 ```c void q_swap(struct list_head *head) { // https://leetcode.com/problems/swap-nodes-in-pairs/ if (!head) return; struct list_head *node, *save; for (node = head->next, save = node->next; node != head && save != head; node = node->next, save = node->next) { list_move(node, save); } } ``` ::: ### q_reverse ```list.h``` 提供的 ```list_move``` 可將節點移到頭部，但操作時會刪除節點，因此使用 ```list_for_each_safe``` 走訪所有節點。 commit: [df3a1e7](https://github.com/willow11235811/lab0-c/commit/df3a1e7c66323a697abfdff7195883236756657e) :::spoiler ```c void q_reverse(struct list_head *head) { if (!head || list_empty(head)) return; struct list_head *node, *save; list_for_each_safe (node, save, head) list_move(node, head); } ``` ::: ### q_reverseK 要反轉 $K$ 個元素，會使用到 ```list.h``` 提供的 ```list_cut_position``` 和 ```list_splice``` 來切斷與接合反轉後的佇列。先讓 ```node``` 前進 $K-1$ 個節點，再利用 ```tmp``` 作為儲存切斷後佇列的 buffer ，並反覆迭代直到尾部。 commit: [9070c81](https://github.com/willow11235811/lab0-c/commit/9070c817252848a832b72c20c6227eba072cfedc) :::spoiler Version 1 ```c void q_reverseK(struct list_head *head, int k) { // https://leetcode.com/problems/reverse-nodes-in-k-group/ if (!head || list_empty(head) || list_is_singular(head) || k <= 1) return; int i = k; struct list_head *node, *save, *cut = head; list_for_each_safe (node, save, head) { if (--i) continue; i = k; LIST_HEAD(tmp); list_cut_position(&tmp, cut, head); q_reverse(&tmp); list_splice(&tmp, cut); cut = save->prev; } } ``` ::: :::danger :bug: Bugs 雖可執行，但反轉後的佇列順序不對。 > 提交 `qtest` 檢查機制的強化改進？ > :notes: jserv ::: #### Debug ```list_cut_position``` 應使用 ```node``` 帶入，否則切點不對。 commit: [c2b25d1](https://github.com/willow11235811/lab0-c/commit/c2b25d1c229d9ec2700924b4f54e0f612cbe7a35) :::spoiler Version 2 ```c void q_reverseK(struct list_head *head, int k) { // https://leetcode.com/problems/reverse-nodes-in-k-group/ if (!head || list_empty(head) || list_is_singular(head) || k <= 1) return; int i = k; struct list_head *node, *save, *cut = head; list_for_each_safe (node, save, head) { if (--i) continue; i = k; LIST_HEAD(tmp); list_cut_position(&tmp, cut, node); q_reverse(&tmp); list_splice(&tmp, cut); cut = save->prev; } } ``` ::: ### q_sort 使用 Merge Sort 演算法加上遞迴的形式進行實作。利用快慢指標演算法 (Tortoise and Hare Algorithm) ，將佇列切成兩段。在實作 ```merge_list``` 用以合併佇列，參考[@ItisCaleb](https://hackmd.io/@ItisCaleb/S1v9IQpai#q_sort)所實作的函數。 commit: [252b5b7](https://github.com/willow11235811/lab0-c/commit/252b5b7d335b7a1cde4f44f929752fe61a449a17) :::spoiler Version 1 ```c static inline void merge_list(struct list_head *head, struct list_head *left, struct list_head *right) { while (!list_empty(left) && !list_empty(right)) { if (strcmp(list_entry(left->next, element_t, list)->value, list_entry(right->next, element_t, list)->value) < 0) { list_move_tail(left->next, head); } else { list_move_tail(right->next, head); } } if (!list_empty(left)) { list_splice_tail_init(left, head); } else { list_splice_tail_init(right, head); } } void q_sort(struct list_head *head) { /* Using merge sort */ if (!head || list_empty(head) || list_is_singular(head)) return; /* Find the middle point */ struct list_head *fast, *slow; for (fast = head, slow = head; fast != head && fast->next != head;) { fast = fast->next->next; slow = slow->next; } /* Divide into left and right parts */ LIST_HEAD(left); LIST_HEAD(right); list_splice_tail_init(head, &right); list_cut_position(&left, &right, slow); /* Recursion */ q_sort(&left); q_sort(&right); merge_list(head, &left, &right); } ``` ::: :::danger :bug: Bugs 出現無限迴圈 ::: #### Debug 呼叫 ```list_splice_tail_init``` 時會使 header 被初始化，其 ```*prev``` 和 ```*next``` 不會指向第一個節點，破壞 doubly-linked list 的結構。改使用兩個 ```list_cut_position``` 。 commit: [522d558](https://github.com/willow11235811/lab0-c/commit/522d5584c371eaec91385c820f8abe637bc6c5ee) :::spoiler Version2 ```c static inline void merge_list(struct list_head *head, struct list_head *left, struct list_head *right) { while (!list_empty(left) && !list_empty(right)) { if (strcmp(list_entry(left->next, element_t, list)->value, list_entry(right->next, element_t, list)->value) < 0) { list_move_tail(left->next, head); } else { list_move_tail(right->next, head); } } if (!list_empty(left)) { list_splice_tail_init(left, head); } else { list_splice_tail_init(right, head); } } void q_sort(struct list_head *head) { /* Using merge sort */ if (!head || list_empty(head) || list_is_singular(head)) return; /* Find the middle point */ struct list_head *fast, *slow; fast = head, slow = head; do { fast = fast->next->next; slow = slow->next; } while (fast != head && fast->next != head); /* Divide into left and right parts */ LIST_HEAD(left); LIST_HEAD(right); list_cut_position(&left, head, slow); list_cut_position(&right, head, head->prev); /* Recursion */ q_sort(&left); q_sort(&right); merge_list(head, &left, &right); } ``` ::: ### q_descend 要刪除所有節點，只要該節點右側有比自己更大的 value 。策略是假設尾部有佇列中的最大值 ```best``` ，從尾部往頭部依序搜尋，只要找到更小的值就刪除該節點。過程中若遇到比現有最大值更大的 value ，就更新佇列的最大值 ```best``` 。 commit: [7379088](https://github.com/willow11235811/lab0-c/commit/73790880e9da6314a54ab24400e9d21607d0c1fa) :::spoiler ```c int q_descend(struct list_head *head) { // https://leetcode.com/problems/remove-nodes-from-linked-list/ if (!head || list_empty(head)) return 0; element_t *elem = list_entry(head->prev, element_t, list), *save = list_entry(elem->list.prev, element_t, list); char *best = elem->value; while (&elem->list != head) { if (strcmp(elem->value, best) < 0) { list_del(&elem->list); q_release_element(elem); } else { best = elem->value; } elem = save; save = list_entry(elem->list.prev, element_t, list); } return q_size(head); } ``` ::: ### q_merge 一開始策略是想利用 Merge sort 演算法，由佇列頭部往尾部兩兩合併佇列。但實作時一直無法實現程式碼，最後參考 [@paul90317](https://hackmd.io/@paul90317/linux2023-spring-lab0#q_merge) 的 ```q_merge``` 實作。 :::info :notebook_with_decorative_cover: ToDo 思考其他 ```q_merge``` 實作方式 ::: commit: [d082344](https://github.com/willow11235811/lab0-c/commit/d08234460d077569ab6f1851db561cc4cd3cd57d) :::spoiler ```c int q_merge(struct list_head *head) { // https://leetcode.com/problems/merge-k-sorted-lists/ if (!head || list_empty(head)) return 0; int n = q_size(head); while (n > 1) { struct list_head *fast = head->next, *slow = head->next; for (int i = 0; i < n / 2; ++i) { LIST_HEAD(temp); merge_list(&temp, container_of(fast, queue_contex_t, chain)->q, container_of(fast->next, queue_contex_t, chain)->q); list_splice_tail(&temp, container_of(slow, queue_contex_t, chain)->q); fast = fast->next->next; slow = slow->next; } if (n & 1) list_splice_tail_init(container_of(fast, queue_contex_t, chain)->q, container_of(slow, queue_contex_t, chain)->q); n = (n + 1) / 2; } return q_size(container_of(head->next, queue_contex_t, chain)->q); } ``` :::