2025q1 Homework1 (lab0)

# 2025q1 Homework1 (lab0) contributed by <`keepgoing-228`> {%hackmd NrmQUGbRQWemgwPfhzXj6g %} ## 開發環境 ```shell $ gcc --version gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 46 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 32 On-line CPU(s) list: 0-31 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i9-14900K CPU family: 6 Model: 183 Thread(s) per core: 2 Core(s) per socket: 24 Socket(s): 1 Stepping: 1 CPU(s) scaling MHz: 18% CPU max MHz: 6000.0000 CPU min MHz: 800.0000 BogoMIPS: 6374.40 ```   ## 開始實作之前雖然自認已經看了很多遍作業要求，但我還是沒注意到`Lab0(A)`的牛刀小試，主要是因為沒看懂clang-format工具，只認知到寫作風格在coding的時候手動處理就好。因此趕緊回頭補齊[Clang 21.0.0git documentation](https://clang.llvm.org/docs/ClangFormat.html)->牛刀小試用到，可以想像為VScode Extensions的 Formatter。另外由於我的小腦袋記憶體不足，看完老師[你所不知道C語言：linked list和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list#%E4%BD%A0%E6%89%80%E4%B8%8D%E7%9F%A5%E9%81%93%E7%9A%84-C-%E8%AA%9E%E8%A8%80-linked-list-%E5%92%8C%E9%9D%9E%E9%80%A3%E7%BA%8C%E8%A8%98%E6%86%B6%E9%AB%94)影片，竟然忘記[Linux 核心風格的鏈結串列提供的巨集](https://www.kernel.org/doc/html/latest/core-api/kernel-api.html)，再加上當下只有看完影片，也沒深入去閱讀document，因此在這邊做個小記。 - list_for_each: 走訪每個 list_head結構體(node)，在linux定義為 Iterate over list nodes - hlist_for_each: 影片`2:55:47` 處，還不懂hash要再回頭看 - container_of: [文章簡化的實作](https://hackmd.io/@sysprog/c-linked-list#%E7%B0%A1%E5%8C%96%E7%9A%84%E5%AF%A6%E4%BD%9C) >Pointer `p` 利用char 是 1 個位元組（byte）大小的型別，將結構的位址cast成 (char *) 後代表請把 &x 的位址視為一個指向 char 的指標，因為 &x 原本型別是 (struct data *)。因此，`p` 可以用來逐 byte 地存取或操作 x 的記憶體內容。 >```clike >struct data { > int a; > double b; >}; > >int main() >{ > struct data x; > char *p = (char *) &x; > ... >} >``` 最後，實在不了解[Perf 分析程式效能](https://hackmd.io/@sysprog/gnu-linux-dev/https%3A%2F%2Fhackmd.io%2F%40sysprog%2Flinux-perf)，卡住了一個禮拜了，所以我決定先開始實作佇列`quece.c` ## 指定的佇列操作常常會想直接開始在程式碼中programming，但是沒有到非常熟悉的狀況下反而會花更多的時間，應先仔細閱讀`list.h`,`queue.h`以及[The Linux Kernel API](https://www.kernel.org/doc/html/latest/core-api/kernel-api.html)，必要時需要畫圖。 ### q_new 一開始壞毛病就想自己手刻，有記得老師上課說的linux雙向list，卻沒有記得看Linux Kernel API - List Management Functions，再加上後來看到前人們的紀錄也發現已經有`INIT_LIST_HEAD` function，回頭看`list.h`如下： ```cpp static inline void INIT_LIST_HEAD(struct list_head *head) { head->next = head; head->prev = head; } ``` 原來已經在標頭檔了，直接使用就好。 ### q_free `list_for_each_safe`跟`list_for_each`兩者的差別為是否有提前存取下一個節點的地址。如果使用`list_for_each` node 的下一次疊代依賴於 node->next，所以執行 free(e) 後，即 current的記憶體被釋放，但current->next 變成無效指針。 ```cpp #define list_for_each_safe(node, safe, head) \ for (node = (head)->next, safe = node->next; node != (head); \ node = safe, safe = node->next) #define list_for_each(node, head) \ for (node = (head)->next; node != (head); node = node->next) ``` `list_entry`是`container_of`的簡單封裝，特別在linux kernel雙向鏈表設計的，較容易閱讀，回傳包含`list_head`結構體的`element_t`結構體起始地址，參數： `node`：指向 `list_head`結構體的指針（通常是鏈表中的某個節點）。 `type`：指向包含 `list_head` 結構體的 `element_t` 結構體。 `member`：`element_t`結構體中 `list_head` 成員的名稱。. ```cpp! /** * list_entry() - Get the entry for this node * @node: pointer to list node * @type: type of the entry containing the list node * @member: name of the list_head member variable in struct @type * * Return: @type pointer of entry containing node */ #define list_entry(node, type, member) container_of(node, type, member) /** * element_t - Linked list element * @value: pointer to array holding string * @list: node of a doubly-linked list * * @value needs to be explicitly allocated and freed */ typedef struct { char *value; struct list_head list; } element_t; ``` 完整function如下: ```cpp! /* Free all storage used by queue */ void q_free(struct list_head *head) { if (!head) return; struct list_head *current, *next; // current will be set to the current node in list_for_each_safe function list_for_each_safe(current, next, head) { // e point to the element_t that contain current list_head element_t *e = list_entry(current, element_t, list); free(e->value); free(e); } free(head); } ``` ### q_insert_head 目標是Insert a new node after head (LIFO rule)，在寫這個function時，再次感受到聽的懂上課內容並不是真正的了解，自己動手的時候才真正體會到 `list_head` 結構體不是`element_t`， `element_t` 結構體是自行定義於 `queue.h`的結構體。 ```cpp! /** * element_t - Linked list element * @value: pointer to array holding string * @list: node of a doubly-linked list * * @value needs to be explicitly allocated and freed */ typedef struct { char *value; struct list_head list; } element_t; ``` ![image](https://hackmd.io/_uploads/HyolB8oAyl.png) 一開始只看參數(parameter)的annotation 完全看不懂 `list_add` function 到底要把新的 node 插入在鏈結串列的哪邊，再次提醒自己務必仔細閱讀完整 annotation，包含說明。 [參數parameter vs. 引數argument](https://stackoverflow.com/questions/156767/whats-the-difference-between-an-argument-and-a-parameter) ```cpp! /** * list_add - Insert a node after a given node in a circular list * @node: Pointer to the list_head structure to add. * @head: Pointer to the list_head structure after which to add the new node. * * Adds the specified @node immediately after @head in a circular doubly-linked * list. The node that previously followed @head will now follow @node, and the * list's circular structure is maintained. */ static inline void list_add(struct list_head *node, struct list_head *head) { struct list_head *next = head->next; next->prev = node; node->next = next; node->prev = head; head->next = node; } ``` ### q_insert_tail 參考到[Jackiempty的學習紀錄](https://hackmd.io/@Jackiempty/linux2025-homework1#q_insert_tail)可以有兩種方式實作，第一將`q_insert_head` 使用相同手法，只有改`list_add` function 變成 `list_add_tail`，為最陽春版本。第二，利用環狀鏈結的概念，調換 `list_add_head()` 的引數，只需要換 `head->prev` 就可以巧妙的插入尾部。 ```CPP! bool q_insert_tail(struct list_head *head, char *s) { q_insert_head(head->prev, s); } ``` ### q_remove_head `remove` 不同於 `delete`: The only thing "remove" need to do is **UNLINK** it. 複製`element_t` 的 value 到 `sp` 指向的空間，這過程不知道如何下手，因此參考[chiangkd](https://hackmd.io/@chiangkd/2023spring-lab0-c#q_remove_head) 的程式碼： ```cpp! element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { ... // copy the string to sp if (sp) { strncpy(sp, rm_element->value, bufsize - 1); sp[bufsize - 1] = '\0'; } ... } ``` 程式碼解釋：如果 `sp` 存在，就使用 `strncpy` 函數將 `element_t` 的 value 複製到 `sp` 指向的空間，而 `bufsize - 1` 引數限制最多複製的字符數量，來保留最後一個字節用於存放字符串終止符 "\0"。最後固定在 `sp` 的最後一個位置設置字符串終止符 "\0"，是為了確保永遠都是有效 C 字符串(char) ，因為 `strncpy` 在字符串長度大於或等於指定的複製長度時不會自動添加終止符，所以加上中止符才會是有效 C 字符串。 ### q_remove_tail 跟 `q_insert_tail` 一樣有兩種方式可以實作。第一，與 `q_remove_head` 相同手法，將第一個 linked list 的部份改成最後一個 linked list 就可以。第二，參考 [alanhc](https://hackmd.io/@alanhc/linux2025-homework1#Queuech) 的程式碼，可以直接調換 `q_remove_head` 的引數，只要換 `head->prev->prev` 就可以移除最後一個 linked list，兩次 `prev` 的原因是除了環狀連結到尾部，還有要考慮 `q_insert_head` 內部實作時是從 `head` 往後找到頭部。 ```cpp! /* Remove an element from head of queue */ element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { // check if queue exists or is empty if (!head || list_empty(head)) return NULL; struct list_head *first = head->next; element_t *ele = list_entry(first, element_t, list); // copy the string to sp if (sp) { strncpy(sp, ele->value, bufsize - 1); sp[bufsize - 1] = '\0'; } // remove the node from the list, not delete list_del(first); return ele; } /* Remove an element from tail of queue */ element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) { // use q_remove_head to remove at tail directly return q_remove_head(head->prev->prev, sp, bufsize); } ``` ### q_delete_mid 使用快慢指針，當快指針走完一圈，慢指針剛好在中間。因為是 double linked list，所以要注意的是 while loop 條件應為： ```cpp! while (fast != head && fast->next != head) {...} ``` 而不是 [leetcode 2095](https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/) 中 single linked list 的 while loop： ```cpp! while (fast && fast->next) {...} ``` ### q_delete_dup 我先完成 [leetcode 82](https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/) 再回去實作 `q_delete_dup` 發現兩者雖然是同樣移除重複 value 的效果，但是在 single linked 跟 double linked 上不能用相同的手法。更精準的說是double linked 還要多考慮`list_head` 的 `prev` 指標。目前以下的作法沒有考慮要被移除的 `list_head` 指標 (下圖紅指標)，未來可能要改用 `list_del` 而不是自己手刻。 ![Screenshot from 2025-04-17 16-38-44](https://hackmd.io/_uploads/SkwXRVAAyl.png) ```cpp! /* Delete all nodes that have duplicate string */ bool q_delete_dup(struct list_head *head) { if (!head || list_empty(head)) return false; struct list_head *prev = head; struct list_head *cur = head->next; while (cur != head) { while (cur->next && list_entry(cur, element_t, list)->value == list_entry(cur->next, element_t, list)->value) { cur = cur->next; } // Check if the currnent node is a duplicate, if so, remove it if (prev->next != cur) { prev->next = cur->next; cur->next->prev = prev; cur = cur->next; } else { prev = cur; cur = cur->next; } } return true; } ``` ### q_swap 基於上一個 `q_delete_dup` 指標沒處理好的經驗，我想優先使用 The Linux Kernel API ，看到 [Jackeimpty](https://hackmd.io/@Jackiempty/linux2025-homework1#q_swap) 有使用 `list_move` ，這個 API 會將 `second` remove 後再 insert 在 `cur` 後方也是 `first` 的前方。 ![Screenshot from 2025-04-18 12-49-03](https://hackmd.io/_uploads/BJG0KUykee.png) ```cpp! /* Swap every two adjacent nodes */ void q_swap(struct list_head *head) { // https://leetcode.com/problems/swap-nodes-in-pairs/ if (!head || list_empty(head)) return; struct list_head *cur = head; while (cur->next != head && cur->next->next != head) { struct list_head *first = cur->next; struct list_head *second = first->next; list_move(second, cur); cur = first; // skip the two nodes, not cur = second } } ``` 後來參考 [PigeonSir](https://hackmd.io/@PigeonSir/linux2025-homework1#q_swap-q_reverse-q_reverseK) 程式碼，了解到 `q_swap` 和 `q_reverse` 都只是 `q_reverseK` 的特例，實現 `q_reserveK` 後分別將引數調換成 2 以及 queue size 即可。 ```cpp! void q_swap(struct list_head *head) { q_reverseK(head, 2); } ``` ### q_reverse 直接使用 `q_reverseK` function 。 ```cpp! void q_reverse(struct list_head *head) { q_reverseK(head, q_size(head)); } ``` ### q_reverseK 參考 [PigeonSir](https://hackmd.io/@PigeonSir/linux2025-homework1#q_swap-q_reverse-q_reverseK) 程式碼，不過他並沒有考慮雙向鏈結剩下的指標，我嘗試用 `q_swap` 的 `list_move` 實作方法，並用下方圖說明： ![Screenshot from 2025-04-19 15-59-18](https://hackmd.io/_uploads/ryjVd0x1le.png) 但是這個手法不是特別直觀，還要考慮 i = 0 時，不能 list_move，因此應該要再去參考其他 linux kernal API，精簡許多。 ```cpp! void q_reverseK(struct list_head *head, int k) { if (!head || list_empty(head)) return; struct list_head *n, *s, iter, *tmp_head = head; int i = 0; INIT_LIST_HEAD(&iter); list_for_each_safe (n, s, head) { i++; if (i == k) { list_cut_position(&iter, tmp_head, n); q_reverse(&iter); list_splice_init(&iter, tmp_head); i = 0; tmp_head = s->prev; } } } ``` ### q_sort 這邊使用 [Neetcode講解 merge sort array](https://www.youtube.com/watch?v=P1Ic85RarKY) 的方法，**merge sort** 是 [divide and conquer](https://en.wikipedia.org/wiki/Divide-and-conquer_algorithm) 分治法的其中一種，也就是說分治法應用於排序的方法就叫 **merge sort**。首先，Recursive call 遞迴呼叫來切分鏈結 (紅色部份)；接著，結合兩個已經排序的陣列/鏈結串列 (綠色部份)。 ![image](https://hackmd.io/_uploads/BJQKFEmyxg.png) 參考 [Jackiempty](https://hackmd.io/@Jackiempty/linux2025-homework1#q_sort) 並改進閱讀性，除了 `q_sort` 本身還另外加上兩個函式： 1. `merge_recur` 來利用快慢指標將佇列的中間節點找出來並遞迴處理 ```cpp struct list_head *merge_recur(struct list_head *head) { // base case: only one node, return directly if (!head->next) return head; // use slow and fast pointers to find the middle of the list struct list_head *slow = head; struct list_head *fast = head->next; while (fast && fast->next) { slow = slow->next; fast = fast->next->next; } // split the list struct list_head *mid = slow->next; slow->next = NULL; // recursive sort the left and right halves struct list_head *left = merge_recur(head); struct list_head *right = merge_recur(mid); // merge the sorted left and right halves return merge_two_list(left, right); } ``` 2. `merge_two_list` 來結合已經排序的鍊結串列，先將小的值接續在 `dummy` (Line 19)，因此是 ascending 的，而且現在只有先處理成單向鏈結串列。 ```cpp= struct list_head *merge_two_list(struct list_head *left, struct list_head *right) { // create a temporary head node struct list_head dummy; struct list_head *dum = &dummy; // check the boundary case if (!left && !right) { return NULL; } // merge two lists while (left && right) { // compare the values of two nodes if (strcmp(list_entry(left, element_t, list)->value, list_entry(right, element_t, list)->value) <= 0) { // take left node, if left value <= right value dum->next = left; left = left->next; } else { // take right node dum->next = right; right = right->next; } dum = dum->next; } // connect the remaining nodes dum->next = left ? left : right; // return the merged list (excluding the temporary head node) return dummy.next; } ``` `q_sort` 拿 `merge_two_list` 的結果，補上判斷是否為 descend 的功能 (Line 12-47)，以及最後設定每個鏈結的 `prev` 並確保為雙向鏈結串列。 ```cpp= void q_sort(struct list_head *head, bool descend) { if (!head || list_empty(head)) return; // disconnect the circular structure head->prev->next = NULL; // use merge sort to sort the list head->next = merge_recur(head->next); if (!descend) { // ascending order: // reconnect the prev pointer and circular structure struct list_head *current = head; struct list_head *next = head->next; // traverse the list, set the prev pointer of each node while (next) { next->prev = current; current = next; next = next->next; } // connect the tail node to the head node, form a circular list current->next = head; head->prev = current; } else { // descending order struct list_head *current = head; struct list_head *next = head->next; struct list_head *beyond = head->next->next; // traverse and reverse the list while (beyond) { next->next = current; next->prev = beyond; current = next; next = beyond; beyond = beyond->next; } // handle the last node next->next = current; next->prev = head; head->next = next; } } ``` ### q_ascend/ q_descend `q_descend` 函式會將右邊存在有嚴格大於的節點的節點給 remove ```cpp! int q_descend(struct list_head *head) { if (!head || list_empty(head)) return 0; struct list_head *current = head->next; while (current != head) { struct list_head *next = current->next; bool should_advance = true; while (next != head) { // Compare current node with all nodes to its right if (strcmp(list_entry(next, element_t, list)->value, list_entry(current, element_t, list)->value) > 0) { // Found a greater value to the right, remove current node struct list_head *to_remove = current; current = current->prev; // Move back before deleting list_del(to_remove); q_release_element(list_entry(to_remove, element_t, list)); should_advance = false; break; } next = next->next; } if (should_advance) current = current->next; } return q_size(head); } ``` ### q_merge 一開始沒搞清楚 head 如何指向多個queue，參考[Jackiempty](https://hackmd.io/@Jackiempty/linux2025-homework1#q_merge) 後發現有 `queue_contex_t` 這個結構體 `chain` 是像之前的 list_head，而指標 `q` 是指向真正的鏈結串列。 ```cpp! /** * queue_contex_t - The context managing a chain of queues * @q: pointer to the head of the queue * @chain: used by chaining the heads of queues * @size: the length of this queue * @id: the unique identification number */ typedef struct { struct list_head *q; struct list_head chain; int size; int id; } queue_contex_t; ``` ```cpp int q_merge(struct list_head *head, bool descend) { // https://leetcode.com/problems/merge-k-sorted-lists/ if (!head || list_empty(head)) return 0; queue_contex_t *first_queue = list_entry(head->next, queue_contex_t, chain); if (head->next == head->prev) return first_queue->size; // merge all the queues into one sorted queue for (struct list_head *current = head->next->next; current != head; current = current->next) { queue_contex_t *current_queue = list_entry(current, queue_contex_t, chain); // merge the elements of the current queue to the first queue list_splice_init(current_queue->q, first_queue->q); // set the size of the current queue to 0, because its elements have // been moved current_queue->size = 0; } // sort the merged queue q_sort(first_queue->q, descend); // update the size of the first queue first_queue->size = q_size(first_queue->q); // return the size of the merged queue return first_queue->size; } ``` ## 研讀論文並改進 dudect ## 在 qtest 提供新的命令 shuffle ## 研讀 lib/list_sort.c 研讀 Linux 核心原始程式碼的 [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 並嘗試引入到專案，比較你自己實作的 merge sort 和 Linux 核心程式碼之間效能落差，也該嘗試改進針對鏈結串列排序實作的效能，[詳細說明](https://hackmd.io/@sysprog/linux2025-lab0/%2F%40sysprog%2Flinux2025-lab0-e)以及[改進資料](https://hackmd.io/@sysprog/Hy5hmaKBh)。 ## 補充 ### c與c++差異因為之前兩者都接觸過，但只有初淺知道語法差異跟有無物件導向的概念，沒有深入了解差異，想藉由這次作業探討兩者差異。 - C - 1972 年由 Dennis Ritchie 開發，是一種結構化程式語言。 - 強調簡單、高效和底層控制，特別適合作業系統、嵌入式系統。 ```clike #include <stdio.h> struct Point { int x, y; }; int main() { struct Point p = {3, 4}; printf("x: %d, y: %d\n", p.x, p.y); return 0; } ``` - C++ - 由C演進而來，在1980 年代由 Bjarne Stroustrup 開發 - 增加了 [OOP(是態度不是語法)](https://hackmd.io/@sysprog/c-oop)、[generics](https://www.geeksforgeeks.org/generics-in-c/)、[operator overloading](https://www.geeksforgeeks.org/operator-overloading-cpp/)等等，目標是保持C的高效能，同時提供更好的直覺設計。 ```cpp #include <iostream> using namespace std; class Point { private: int x, y; public: Point(int x_=0, int y_=0) : x(x_), y(y_) {} void print() { cout << "x: " << x << ", y: " << y << endl; } }; int main() { Point p(3, 4); p.print(); return 0; } ``` ### Valgrind 第一週直播 `3:32` ### 整合網頁伺服器第一週直播 `3:57`