2024q1 Homework1 (lab0)

# 2024q1 Homework1 (lab0) contributed by < [`patri27826`](https://github.com/patri27826) > ## 開發環境 ```shell $ gcc --version gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 46 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 24 On-line CPU(s) list: 0-23 Vendor ID: GenuineIntel Model name: 13th Gen Intel(R) Core(TM) i7-13700 CPU family: 6 Model: 183 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 1 Stepping: 1 CPU(s) scaling MHz: 20% CPU max MHz: 5200.0000 CPU min MHz: 800.0000 BogoMIPS: 4224.00 ``` ## 指定的佇列操作 ### `q_insert_head` :::danger 列出程式碼之前，要闡述你的想法、考量因素，以及你的做法是否會有副作用 (side effect)。 ::: <s>初步想法</s> ??? ```c bool q_insert_head(struct list_head *head, char *s) { if(!head) return false; element_t *new = malloc(sizeof(element_t)); if (!new) return false; new->value = strdup(s); list_add(&new->list, head); return true; } ``` 在參考 [willwillhi1](https://hackmd.io/@willwillhi/lab0-2023#%E5%AE%8C%E6%88%90-queuec) 後，發現還會需要去檢查`new-value` 是不是 `NULL`，所以修改成如下 ```diff bool q_insert_head(struct list_head *head, char *s) if (!new) return false; new->value = strdup(s); + if (!new->value) { + free(new); + return false; + } list_add(&new->list, head); return true; } ``` 然後我就有點好奇會需要去排除 s 是 `NULL` 的情況嗎？，我就去看了`strdup` 裡面使用到`strlen`， :::danger 無論標題和內文中，中文和英文字元之間要有空白字元 (對排版和文字搜尋有利)； ::: <s>參考 cppreference.com 的 `strlen`說明，發現s在`null`的情況下，`strlen`會中止 size_t strlen( const char *str ); 1) Returns the length of the given **null-terminated** byte string, that is, the number of characters in a character array whose first element is pointed to by str up to and not including the first null character. </s> :::danger 查閱 C 語言規格書和 Linux man page，而非 cppreference 網站。 ::: 在查閱 [C 語言規格書](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)和 [Linux man page](https://linux.die.net/man/3/strlen)後，發現他們對於 `strlen` 都只有介紹用法，沒有提及特殊值的情況，因此這邊我還是先暫時維持檢查 s 不為 `null` 的條件 :::danger 不要以為做完關鍵字搜尋，就等同於你看完 C 語言規格書。注意規格書的註腳 (footnote)，幾百頁的內容，可讓你一輩子受益，快去讀！ ::: 因此補上檢查 ```diff bool q_insert_head(struct list_head *head, char *s) { - if(!head) + if(!head || !s) return false; element_t *new = malloc(sizeof(element_t)); if (!new) return false; new->value = strdup(s); if(!new->value) { free(new); return false; } list_add(&new->list, head); return true; } ``` ### `q_insert_tail` 跟 `q_insert_head` 一樣做法， `list_add` 換成 `list_add_tail` :::danger 使用 `git diff` 或 `diff` 命令來產生程式碼之間的變更列表，而非憑感覺填入，注意位移量。 ::: ### `q_remove_head` 和 `q_remove_tail` 實作 ```c element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *head_element = list_first_entry(head, element_t, list); list_del(&head_element->list); if (sp && bufsize) { memcpy(sp, head_element->value, bufsize); sp[bufsize - 1] = '\0'; } return head_element; } ``` 參考 [komark06](https://hackmd.io/@komark06/SyCUIEYpj#q_remove_head-%E5%92%8C-q_remove_tail-%E5%AF%A6%E4%BD%9C)，先把元素從佇列中移除，再去對元素的值複製到傳入的 `sp` 指標，除了 `memcpy` 也有看到 [willwillhi](https://hackmd.io/@willwillhi/lab0-2023) 用 strncpy 實作 ``` strncpy(sp, node->value, bufsize-1); sp[bufsize-1] = '\0'; ``` <s> 了解到 `strcpy` 跟 `strncpy` 之間的[差異與用法](https://dotblogs.com.tw/Ace_Dream/2016/01/05/strcpy_strncpy) </s> :::danger 參照第一手材料。 ::: ### `q_delete_mid` 這題我一看到就想到 `leetcode` 上面的某一題尋找鏈結佇列的中心點是用快慢指針去實作，差異是在因為 `leetcode` 上的是單向的佇列，而我們實作的是雙向佇列，所以會需要去修改終止條件 ```c struct list_head *slow = head->next, *fast = head->next, *tail = head->prev; while (fast != tail && fast->next != tail) { slow = slow->next; fast = fast->next->next; } ``` ### `q_delete_dup` 這題看到一開始的想法是跑兩個迴圈來去檢查所有的 `Duplicate String` ，但這樣的時間複雜度會到 $O(n^2)$，之後參考 [`wanghanchi`](https://hackmd.io/@wanghanchi/linux2023-lab0#q_delete_dup) 的方法 ```c bool isDup = false; element_t *node, *safe; list_for_each_entry_safe (node, safe, head, list) { if (&safe->list != head && !strcmp(safe->value, node->value)) { list_del(&node->list); q_release_element(node); isDup = true; } else if (isDup) { list_del(&node->list); q_release_element(node); isDup = false; } } ``` 先去比對第一個元素的值跟第二個元素的值是否相同，如果相同，移除第一個，同時用一個 `flag` 來記錄第二個的值是否 `Duplicate` ，這樣當第一個 `if` 不通過時，我就可以去通過這個 `flag` 來去確認當前的值是不是前一次的 `Duplicate String` ### `q_swap` :::danger 列出程式碼之前，要闡述你的想法、考量因素，以及你的做法是否會有副作用 (side effect)。 ::: ```c void q_swap(struct list_head *head) { if (!head || !head->next || list_empty(head)) return; struct list_head *cur = head->next, *next = head->next->next; element_t *e1, *e2; while (cur != head && next != head) { e1 = list_entry(cur, element_t, list); e2 = list_entry(next, element_t, list); char *c = e1->value; e1->value = e2->value; e2->value = c; cur = next->next; if (!next->next) { break; } next = next->next->next; } } ``` 做法就是每次交換兩個元素的值，結束再往下兩個去找，比較要注意的是這邊 ```c cur = next->next; if (!next->next) { break; } ``` 要先去檢查 `next->next` 是不是 `null` ，否則在`next->next->next`就會出錯 ### `q_reverse` and `q_reverseK` :::danger - [ ] traverse (動詞) 和 traversal (名詞) 根據 Dictionary.com 的[解釋](https://www.dictionary.com/browse/traverse ): (作為及物動詞和不及物動詞都有類似的意思，以下列出作為及物動詞的寓意) * to pass or move over, along, or through. * to go to and fro over or along. 其實這意思很好懂，就像我們「走過」/「穿越」校園一般，於是 traverse a linked list 就會是「(用某種手段) 存取多個鏈結串列的節點」，但這裡卻沒有必要「所有」的範圍：英語的 "move over/through" 用於某個區域時，根本沒有這樣的隱含意義。如果將 traverse 翻譯為「遍歷」，就會導致「超譯」，也就是跳脫「直譯」和「意譯」。當我們回頭看 "traverse" 所在的技術描述內容，例如 "traverse every node"，若翻譯為「遍歷每個節點」，那麼既然都「遍」(意即「全面」、「到處」)，又何來「每個」節點呢？於是，合理的翻譯應改為「逐一走訪每個節點」 —— 差異在哪？在 "traverse every node" 的應用場景中，可能是我們嘗試在鏈結串列尋找特定的節點內含的資料，一旦找到就停止，或者我們要偵測給定的鏈結串列是否包含環狀 (circular) 結構，並沒有真的要「遍」(到處/全面)「歷」(意即「經過」) 每個節點。在我們的用語中，要區分「意圖」(intention) 和「實際作用」(reaction)，濫用「遍歷」會使得語意不清，從而難以推測英語原文的訴求。還有個更重要的原因是，「遍歷」這詞已在理工領域存在，且廣泛使用，即「遍歷理論」(Ergodic theory)，後者是研究具有不變測度 (invariant measure) 的動力系統及其相關問題的一個數學分支。遍歷理論研究遍歷變換，由試圖證明統計物理中的遍歷假設 (Ergodic hypothesis) 演進而來。在統計學中，若單一個物理系統在不同時間內重複相同的實驗 (如丟擲一枚硬幣)，其取樣數據所得的統計結果 (如硬幣出現正面的機率) 和極多個完全相同的物理系統的系集 (如丟極多個相同的硬幣) 在同時作相同的實驗取樣數據的統計結果假設為相同時，此種假設即稱為「遍歷性假設」或「遍歷假設」。基於這個假設，對時間平均的統計方式及結果，便可由對系集的平均及統計方式得到。在一般物理系統中，尤其在統計力學範圖中，均採用此遍歷性假設為基本的假設。在流體力學中對亂流的實驗分析，亦是採用此假設。遍歷 (Ergodic) 源於以下二個希臘詞： * ergon (對應英語的 work) * odos (對應英語的 path 或 way) 最初這是由奧地利物理學家波茲曼 ([Ludwig Boltzmann](https://en.wikipedia.org/wiki/Ludwig_Boltzmann)) 於統計力學領域 (statistical mechanics) 提出的概念，其一廣為人知的結果是：在經過長時間後，時間平均將會趨近空間平均。此事實在動力系統扮演極重要的角色，在隨機分析領域亦然。因此，若貿然將 traverse 翻譯為「遍歷」，一來語意不清，二來增加和理工多項領域專業人員的溝通成本，實在得不償失。 $\to$ [資訊科技詞彙翻譯](https://hackmd.io/@sysprog/it-vocabulary) ::: `q_reverse` 比較直觀，就是<s>遍歷</s> 整個佇列，一直把元素往 `head` 搬，也就是去呼叫 `list_move(node, head)` 而 `reverseK` 我是參考 [`var-x`](https://hackmd.io/@vax-r/linux2024-homework1#q_reverseK) ```c void q_reverseK(struct list_head *head, int k) { if (!head) return; struct list_head *last = head->next; int times = q_size(head) / k; LIST_HEAD(tmp); LIST_HEAD(result); for (int i = 0; i < times; i++) { for (int j = 0; j < k; j++) { struct list_head *node = last->next; list_del(last); list_add(last, &tmp); last = node; } list_splice_tail_init(&tmp, &result); } list_splice_init(&result, head); } ``` 主要精神就是我事先得知我有幾組需要被 `reverse` 的元素，再分組把 `reverse` 的結果存到暫存佇列，最後再把這個暫存佇列加回原本的佇列流程如下： 1. 我有7個元素且 $K=3$ ，則組數則為 $7/3 = 2$ ```c int times = q_size(head) / k; ``` 2. 接著在我的每個組內，依次把元素加到暫存佇列 `tmp` 的 `head` ，其實就是變相的reverse ```c for (int i = 0; i < times; i++) { for (int j = 0; j < k; j++) { struct list_head *node = last->next; list_del(last); list_add(last, &tmp); last = node; } list_splice_tail_init(&tmp, &result); ``` 3. 把 `tmp` 加到 `result` 的尾巴， `result就是最終結果的暫存佇列` ```c list_splice_tail_init(&tmp, &result); ``` 4. 把 `result` 加回原始佇列的頭部，原因是我們在 `reverse` 時會有一些數量不夠的情況 ($7％3 = 1$) ，這些多的元素就會被留在原始佇列內，因此我們最終要把 `result` 加回原始佇列的頭部 ```c list_splice_init(&result, head); ``` ### `q_ascend` and `q_descend` ```c element_t *first = list_entry(head->prev, element_t, list); element_t *second = list_entry(head->prev->prev, element_t, list); while (&second->list != head) { if (strcmp(first->value, second->value) > 0) { first = list_entry(first->list.prev, element_t, list); second = list_entry(second->list.prev, element_t, list); } else { list_del(&second->list); q_release_element(second); second = list_entry(first->list.prev, element_t, list); } } ``` 主要想法就是從佇列尾部開始去做比對，把不符的點刪除，若符合則一起往前到下一組 ### `q_sort` 一開始想的是 quick sort，但發現現在這個情況可能會用 `merge sort` 會比較合適，因此在參考 [`willwillhi`](https://hackmd.io/@willwillhi/lab0-2023)的做法，實作了 `merge sort` :::danger 列出明確的考量因素，不要人云亦云。 ::: ```c struct list_head *_merge_sorted_single_linked_queues(struct list_head *l1, struct list_head *l2) { struct list_head *head = NULL, **ptr = &head; for (struct list_head **cur = NULL; l1 && l2; *cur = (*cur)->next) { if (strcmp(container_of(l1, element_t, list)->value, container_of(l2, element_t, list)->value) >= 0) cur = &l2; else cur = &l1; *ptr = *cur; ptr = &(*ptr)->next; } *ptr = (struct list_head *) (void *) ((uintptr_t) (void *) l1 | (uintptr_t) (void *) l2); return head; } struct list_head *_merge_sort(struct list_head *l) { if (l == NULL || l->next == NULL) return l; struct list_head *fast = l; struct list_head *slow = l; while (fast->next != NULL && fast->next->next != NULL) { fast = fast->next->next; slow = slow->next; } struct list_head *l1 = l; struct list_head *l2 = slow->next; slow->next = NULL; return _merge_sorted_single_linked_queues(_merge_sort(l1), _merge_sort(l2)); } ``` :::danger 工程人員說話要精準，避免說「覺得」和「好像」等詞彙。 ::: 但在了解他的程式碼之後，<s>我覺得好像</s> 有改進的空間，原因是在他是將雙向鏈結佇列先拆成單向，再去做 `merge` ，但是這樣就不能套用我們事先定義好的一些函式，於是我嘗試將他保持著雙向鏈結的形式去做 `merge sort` ，但很可惜嘗試了很多一直 `segmentation fault` ，於是目前還是先維持單鏈結做法 ### `q_merge` 一開始花了一點時間搞懂 `queue_context_t` 的結構，了解之後發現他也是用 `merge sort` 的精神來實作，只是這次的最小單位就是一個 `queue` ，在參考 [var-x](https://hackmd.io/@vax-r/linux2024-homework1#q_merge)的做法後，實作了如下， ```c int q_merge(struct list_head *head, bool descend) { if (!head || list_empty(head)) return 0; if (list_is_singular(head)) return q_size(list_entry(head, queue_contex_t, chain)->q); int size = q_size(head); int count = (size % 2) ? size / 2 + 1 : size / 2; int queue_size = 0; for (int i = 0; i < count; ++i) { queue_contex_t *first = list_first_entry(head, queue_contex_t, chain); queue_contex_t *second = list_entry(first->chain.next, queue_contex_t, chain); while (!list_empty(first->q) && !list_empty(second->q)) { queue_size = _merge_sorted_double_linked_queues(first->q, second->q); list_move_tail(&second->chain, head); first = list_entry(first->chain.next, queue_contex_t, chain); second = list_entry(first->chain.next, queue_contex_t, chain); } } if (descend) { q_reverse(head); } return queue_size; } ``` :::danger 「主要精神」是什麼？查閱辭典，再來看是否合適。 ::: <s>主要精神</s> 是先得知他會需要做幾輪合併，在每一輪內一對一對去處理，例如一開始有4對，接下來合併到2對，再來一對，最後就剩一個。這樣的好處是能避免單一佇列在合併的過程會太長，每一輪的每一次都各自去合併兩個佇列，維持每個合併佇列都是合併相同數量的佇列 `_merge_sorted_double_linked_queues` 的作用是把兩個佇列依照 `merge sort ` 的方法，合併到傳入的第一個佇列 :::danger 改進你的漢語表達。 :::