2025q1 Homework1 (lab0)

2025q1 Homework1 (lab0) === contributed by <`gg21-aping`> {%hackmd NrmQUGbRQWemgwPfhzXj6g %} ### 開發環境 ``` $ gcc --version gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 Copyright (C) 2023 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 4 On-line CPU(s) list: 0-3 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i5-4200U CPU @ 1.60GHz CPU family: 6 Model: 69 Thread(s) per core: 2 Core(s) per socket: 2 Socket(s): 1 Stepping: 1 CPU(s) scaling MHz: 49% CPU max MHz: 2600.0000 CPU min MHz: 800.0000 BogoMIPS: 4589.18 ``` ### Reviewed by `MikazkiHikari` 請確認是否有按照預期進行rebase；然後 queue.[ch]的部分寫的很詳盡，但其他的項目呢? ### 資料結構和實作方式 * `element_t`: the element to be linked in the queue, carrying a pointer to an array holding `string`. * `queue_contex_t`: can be considered as the config of the queue structure, carrying the pointer to the head of the queue, the size, and the id. ### 佇列實作：修改 `queue.[ch]` #### `queue_new()` 參考 Linux Kernel 實作佇列，使用 `INIT_LIST_HEAD()`，將 `list_head` 的 `next` 和 `prev` 指標都指向結構本身。 #### `q_free()` 考量僅是將節點自佇列中移除，但不釋放其記憶體空間，可透過 `list_for_each_entry_safe()` 保存下個節點的指標，並逐一走訪佇列中各節點。提交程式碼的時候遇到一些問題： 1. (resolved) 程式碼被判斷並沒有使用指定的 coding style。查 Linux 核心程式碼使用 `list_for_each_entry_safe()` 都沒有在函式名稱及 `(` 間插入一個空白。 ```git - list_for_each_entry_safe(e, next, head, list) { + list_for_each_entry_safe (e, next, head, list) { q_release_element(e); } [!] queue.c does not follow the consistent coding style. Make sure you indent as the following: clang-format -i queue.c ``` 感謝社團同學提出參考 Linux 核心程式碼對 `.clang-format` 的操作，新增 `SpaceBeforeParens` 並提出 [PR](https://github.com/sysprog21/lab0-c/pull/232)。 2. (resolved) `cppcheck` 觸發的 `unusedlabel` 警告。參考 [issue#211](https://github.com/sysprog21/lab0-c/pull/211#issuecomment-2690577631) 的討論，只能將 `cppcheck` 從 ubuntu 24.04 LTS 的[官方版本 2.13](https://launchpad.net/ubuntu/noble/+package/cppcheck) 再往上更新。[Commit e0c6b11](https://github.com/sysprog21/lab0-c/commit/e0c6b113c9c9404208dcf4dccf4d229016aa4778) 解決了 `unusedlabel` 警告，但依然會觸發 `uninitvar` 警告。因此提了 [issue#229](https://github.com/sysprog21/lab0-c/issues/229) 的討論。 ```shell Following files need to be cleaned up: queue.c Running static analysis... queue.c:28:5: error: Uninitialized variable: next [uninitvar] list_for_each_entry_safe (e, next, head, list) { ^ Fail to pass static analysis. ``` #### `q_insert_head/tail()` 使用 `strdup()` 複製字串 `s`，避免原始的 `s` 指向的記憶體區塊被釋放或更動後影響到 `e->value`。 > The `strdup()` function returns a pointer to a new string which is a duplicate of the string s. Memory for the new string is obtained with `malloc(3)`, and can be freed with `free(3)`. `strdup()` 透過 `malloc(strlen(s) + 1)` 來動態分配記憶體空間，再透過 `strcopy()` 將原始字串的內容複製到新分配的記憶體。 ##### todo: * 在佇列中將節點插入頭或尾的操作基本相同，可以思考更有品味的程式碼。 #### `q_remove_head/tail()` 透過 `list_entry()` 系列操作可取得想要的節點資訊。注意 `strncpy(*dest, *src, n)` 在 destination 長度比 source 長度長的時候，會將剩餘空白的位置都自動補上終止符 `\0`，反之則 `dest` 指向的字串將不會有終止符。因此，為了避免呼叫者使用可能未正確終止的字串而導致的各種問題，做了一些防禦機制： ```c strncpy(sp, e->value, bufsize - 1); sp[bufsize - 1] = '\0'; ``` > `unlink` 被認定是 american english，但 `unlinked` 或 `unlinks` 均不被 `aspell` 認定為 american english，`Functions` 也不被認定為 american english QQ ##### todo: * 在佇列中移除頭或尾節點的操作基本相同，可以思考更有品味的程式碼。 #### `q_size()` 原本想透過 `container_of()` 直接使用 `queue_contx_t` 中的成員 `size`，但觀察 `qtest.c` 中對 `q_size()` 的呼叫，發現傳入的值有時是成員 `q`，有時是成員 `chain`。因此貿然的直接定位會無法取得正確的記憶體位置。因此仍是透過逐一走訪的方式計算。 #### `q_delete_mid()` 鏈結的中間節點可以透過兩次走訪鏈結取得，但透過 Hare and Tortoise Algorithm（快慢指標），我們可以只走訪鏈結一次，便快速定位到中間節點。僅需特別注意的是，此作業提供的結構為 circular doubly linked list，亦即指標不會指向 `NULL`，且單一節點便能取得其連結的前後節點。注意 `list_del()` 並不會釋放記憶體空間，僅是將該節點從鏈結中移除。節點仍會存在，但其指標指向的位址不能被保證；若要再次將節點加入鏈結中，依然要先呼叫 `LIST_INIT_HEAD()`。 #### `q_delete_dup()` 在 `queue.h` 的註解中，並沒有提到所有呼叫此函式的鏈結都是有序的，但卻附上 leetcode 刪除有序鏈結中的重複節點。所以一開始會有點不太確定課程期待的寫法是什麼。若能保證是有序鏈結，那麼透過雙指標，就能有效率的完成刪除重複節點的行為。但若無法保證傳入的鏈結為有序，就需要透過例如雜湊表的方式，來達到最佳的效能。參考數位同學的解法，推估該函式就是預設傳入的節點都為有序，因此先透過雙指標的方式完成此函式。逐一走訪佇列中各項元素，並透過 `strcmp` 做 `char` 的比較。其中 `strcmp()` 規定傳入的字串必須是 null-ended strings（即字串必須是 `\0` 結尾），雖然在 `queue.h` 中並未明定，但若節點的 `value` 成員不存在，設定回傳 `false`。 ```git + list_for_each_entry_safe(cur, next, head, list) { + if (&next->list == head) + break; + + if (!cur->value || !next->value) + return false; + + if (!strcmp(cur->value, next->value)) { + list_del(&cur->list); + q_release_element(cur); + } + } ``` 後來發現我把題目理解成 `A -> B -> B -> C` 要變成 `A -> B -> C` 的意思，但其實應該要把 `B` 完全移除。所以又做了以下修正。透過變數 `should_delete` 紀錄最後一個重複節點，並進行刪除。 ```git element_t *curr = NULL, *next = NULL; + bool should_delete = false; if (!head || list_empty(head)) return false; list_for_each_entry_safe(curr, next, head, list) { - if (&next->list == head) - break; + bool duplicate = + (&next->list != head && !strcmp(curr->value, next->value)); - if (!curr->value || !next->value) - return false; - - if (!strcmp(curr->value, next->value)) { + if (duplicate || should_delete) { list_del(&curr->list); q_release_element(curr); + should_delete = duplicate; } } ``` #### `q_swap()` 透過兩個指標，並使用 `list_move()` 可以快速的完成兩兩一組的交換操作。此時，`curr` 指向一組節點中的第一個節點，而 `next` 指向下一組節點中的第一個節點。 ```c curr = head->next; while (curr != head && curr->next != head) { next = curr->next->next; list_move(curr->next, curr->prev); curr = next; } ``` 更新：可透過呼叫 `q_reverseK(head, 2)` 來取代原本的 swap 操作。 #### `q_reverse()` 透過逐一走訪各節點並呼叫 `list_move()` 將各節點移至 `head` 節點後，即可完成整串鏈結的反轉。 #### `q_reverseK()` 透過三個指標 `start`、`end` 和 `iter` 來做鏈結的部分反轉。 ```c start = head; while (size >= k) { struct list_head *end = start->next, *iter = start->next; for (i = 0; i < k; i++) { list_move(iter, start); iter = end->next; } start = end; size -= k; } ``` 有特別注意變數宣告的位置，原本將 `end` 和 `iter` 一起宣告在整個函式區塊的最頂部 (beginning of the block)；但考慮該二變數僅在迴圈中使用，應定義在最小作用域的程式碼區塊中的最頂部，因此做了修正。 #### `q_sort()` 考量到資料結構為鏈結串列，使用 merge sort 的方式對鏈結進行排序。概念如下： * 透過快慢指針找到鏈結的中間點，透過 `list_cut_position()` 進行切分。 * 遵循 divide and conquer 的概念，透過遞迴手法不斷進行上述的切分行為。 * 呼叫 helper function `q_merge_two()` 進行排序。 * 參考 [yuto0226](https://hackmd.io/@yuto0226/linux2025-homework1) 同學的筆記，新增 `is_descend()` 函式來判斷排序順序。 #### `q_ascend/descend()` 題目要求將所有破壞升序、降序法則的節點刪除，即在升序的鏈結串列中，較小的元素不能在後面出現，降序則反。若從鏈結的起點開始往後走，判斷上會非常困難。但若從尾部開始往前走，則僅需紀錄一個 `extreme` 值，在升序排列中，遇到比 `extreme` 大的值一律刪除，在降序排列中則反。因此我首先寫了 `list_for_each_entry_safe_reverse` 的 macro 來協助我完成安全的迴圈操作。 ```C /* Iterate through a list reversely */ #define list_for_each_entry_safe_reverse(entry, safe, head, member) \ for (entry = list_entry((head)->prev, typeof(*entry), member), \ safe = list_entry(entry->member.prev, typeof(*entry), member); \ &entry->member != (head); entry = safe, \ safe = list_entry(safe->member.prev, typeof(*entry), member)) ``` 並創建一個 `q_make_monotonic()` 的函式，透過對變數 `descend` 的控制，即可讓 `q_ascend()` 和 `q_descend()` 同時呼叫。 ```C /** * q_make_monotonic() - Remove nodes to make queue monotonic * @head: header of queue * @descend: whether to filter for descending (true) or ascending (false) * * Return: the number of elements in queue after performing operation */ static int q_make_monotonic(struct list_head *head, bool descend) { element_t *curr, *next; char *extreme = NULL; if (!head || list_empty(head) || list_is_singular(head)) return q_size(head); list_for_each_entry_safe_reverse(curr, next, head, list) { if (!extreme) { extreme = curr->value; continue; } if (is_descend(extreme, curr->value, descend)) { list_del(&curr->list); q_release_element(curr); } else { extreme = curr->value; } } return q_size(head); } ``` #### `q_merge()` 藉由之前完成 `q_sort()` 開發過的 `q_merge_two()`，用暴力解的方式，兩兩合併在一起。