2025q1 Homework1 (lab0)

# 2025q1 Homework1 (lab0) contributed by <Pin99978> {%hackmd NrmQUGbRQWemgwPfhzXj6g %} ## 開發環境 ```shell $ gcc --version gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 % lscpu Architecture: x86_64 CPU op-mode(s): 32-bit， 64-bit Address sizes: 39 bits physical， 48 bits virtual Byte Order: Little Endian CPU(s): 20 On-line CPU(s) list: 0-19 Vendor ID: GenuineIntel Model name: 12th Gen Intel(R) Core(TM) i7-12700H CPU family: 6 Model: 154 Thread(s) per core: 2 Core(s) per socket: 14 Socket(s): 1 Stepping: 3 CPU(s) scaling MHz: 20% CPU max MHz: 4700.0000 CPU min MHz: 400.0000 BogoMIPS: 5376.00 ``` --- ### 實做前參考的相關筆記 * [你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list) ### 概念論述：在閱讀 linked list 和非連續記憶體中 **Linux 核心原始程式碼**小節，自行整理以下概念以及補充部份 * 相對於常見的linked list 操作，`list_head` 的優點是，新增或刪除節點時，不需要自行將 `prev` 與 `next` 鏈結到下個節點上，其結構體 ```c typedef struct { char *value; struct list_head list; } my_data; ``` -- ## 實做 queue.c 根據 queue.h 對各個函式的描述，將實做遇到的問題與解釋紀錄在此處。 --- ### q_new() 目的：建立一個空佇列，並檢查是否配置成功。若成功，將 `prev` 和 `next` 指向自己。反之，若失敗則返回 `NULL` 實做: 1. 動態配置(malloc) 一個 `struct list_head` 大小的記憶體空間，作為佇列的頭部 (head) 2. 檢查是否配置失敗，若`malloc`的返回值為 `NULL`，則立即返回`NULL` 2. 若成功，呼叫 `list.h` 提供的 `INIT_LIST_HEAD` 函式，將新配置的節點初始化，使其 next 與 prev 指標指向 `head` 本身 3. 返回 head 指標 --- ### q_free() 目的: 釋放佇列中所有的元素的記憶體空間，且因 `element_t` 的 value 為指向字串的指標，因此也需將此記憶體釋放。實做: 1. 確認 `head` 是否為 `NULL`，若是，則返回 `NULL` 2. 使用 list.h 中定義的 `list_for_each_entry_safe` 來得到 entry，因需要在遍歷節點時候進行釋放，需先知道 `entry` 後的節點 `safe` 來避免斷鏈的問題. --- ### q_insert_head() 目的: * 建立新的佇列元素，該佇列元素插入至 `head` 與 `head->next` 之間。 * 指定的佇列元素的 `value` 為字串類型，value 需配置記憶體後將給定字串複製至 `value` 中實做: 1. 新的佇列元素為`element_t`，動態配置(malloc) 一個`element_t` 的記憶體空間，並確認是否配置成功，若配置失敗則返回`false` 2. 而 element_t 的 value 根據 `list.h` 中 `q_insert_head` 為指向字串(s)的指標，因此需額外配置字串的記憶體位置 3. 使用 `strlen` 得到字串長度 `len` 後，動態配置 `len + 1` 的長度後，使用 `strcpy` 將 `s` 複製到動態配置的記憶體位置(此處同樣需要判斷是否配置成功) 4. 若配置成功，呼叫 queue.h 提供的 `list_add` 函式 5. 若配置失敗，需釋放已配置的記憶體(前面的字串記憶體空間)，避免洩漏。從 list.h 中的敘述 `list_add` 可在給定的節點後新增新的節點 ```c static inline void list_add(struct list_head *node， struct list_head *head) { struct list_head *next = head->next; // 暫存節點，紀錄目前 head->next 的節點 // 避免在重新鏈結時失去節點 next->prev = node; // 將 next 節點的 prev 鏈結到新的 node node->next = next; // 將新節點的 next 鏈節到暫存節點 node->prev = head; // 將新節點的 prev 鏈節到 head head->next = node; // 將 head 的 next 鏈節到新節點 } ``` 6. 將新節點的 list 鏈結至 head 節點之後 `list_add(&new_node -> list ， head)` 按照實做 ```c bool q_insert_head(struct list_head *head， char *s) { element_t *new_node = malloc(sizeof(element_t)); if (!new_node) return false; size_t len = strlen(s); char *node_str = malloc(len + 1); if (node_str == NULL) free(node_str); return false; strcpy(node_str， s); new_node->value = node_str; list_add(&new_node->list， head); return true; } ``` 在嘗試 `git add commit` 時指出 `strcpy` 此函式有安全疑慮，參照[CREN Computer Security](https://security.web.cern.ch/) >strcpy() The strcpy built-in function does not check buffer lengths and may very well overwrite memory zone contiguous to the intended destination. In fact， the whole family of functions is similarly vulnerable: strcpy， strcat and strcmp. `strcpy` 在複製的過程中不考慮緩衝區大小，直到複製完成，因此在配置記憶體為完善的情況下將會造成覆寫已存在緩衝區資料的可能參考 [Dennis40816](https://hackmd.io/@dennis40816/linux2025-homework1#q_insert_head) 實做， `strdup` 是一個已將 1.字串長度 2. 記憶體配置 3. 複製字串整合的一個函式，使用此函式改寫實做 ```c bool q_insert_head(struct list_head *head， char *s) { element_t *new_node = malloc(sizeof(element_t)); if (!new_node) return false; new_node->value = strdup(s); if (!new_node->value) { free(new_node); return false; } list_add(&new_node->list， head); return true; } ``` --- ### q_insert_tail() 目的: * 與 `q_insert_head()` 類似，建立新的佇列元素，該佇列元素插入至尾端 * 指定的佇列元素的 `value` 為字串類型，value 需配置記憶體後將給定字串複製至 `value` 中實做: * 作法與 `insert_add()` 類似，將 `list_add`函式更改為 list.h 中的 `list_add_tail` 即可 ```c static inline void list_add_tail(struct list_head *node， struct list_head *head) { struct list_head *prev = head->prev; // 暫存節點，紀錄目前 head->prev 的節點 // 避免在重新鏈結時失去節點 prev->next = node; // 原尾端節點的 next 鏈結到新節點 node->next = head; // 新節點的 next 鏈結到 head node->prev = prev; // 新節點的 prev 鏈結到原尾端節點 head->prev = node; // head 的 prev 鏈結到新節點 } ``` --- ### q_remove_head() 目的: * 移除佇列中的第一個元素，並返回此元素的記憶體位址 * 若 `sp` 不是空字串，則需將被移除元素的值複製到 `sp` ，而此字串長度最大為 `bufsize - 1` 實做: 1. 判斷 `head` 是 `NULL` 以及 `head` 指向空佇列，如果為真，返回 `NULL` 2. 透過 `list_first_entry` 得到第一個 entry 3. 使用 `list_del` 將第一個 entry 斷鏈結，而需注意在 `queue.h` 中定義 remove的定義 > "remove" is different from "delete" > The space used by the list element and the string should not be freed. > The only thing "remove" need to do is unlink it. 4. 若 `sp` 不是空字串，利用 `strncpy` 複製字串後返回註1. 另外在閱讀 `list_del` 時，此函式包含一項預處理 `LIST_POSSONING` ```C #ifdef LIST_POISONING node->next = NULL; node->prev = NULL; #endif ``` 先了解 `#ifdef` 的用法，C11: 6.10.1 ， > Preprocessing directives of the forms > #ifdef identifier new-line groupopt > #ifndef identifier new-line groupopt > check whether the identifier is or is not currently defined as a macro name. Their conditions are equivalent to #if defined identifier and #if !defined identifier respectively 在 `Preprossing` 階段，若巨集`List_POISONING`已經被定義，則保留 `#ifdef` 到 `endif`的程式碼。而`List_POISONING`此巨集的作用在此是，當 `list_del` 函式被呼叫刪除一個節點時，若編譯時期有被定義，那麼該節點的 next 和 prev 指標會被設定為 NULL。這樣的目的是若該節點被刪除後又調用的話，則會出現 `Segement fault` :::info 疑問點1: 在哪個 .h file 定義了 LIST_POISONING ? 試著 grep 整個資料夾找到 `grep -rl "LIST_POISONING" *` `>list.h` ::: --- ### q_remove_tail() 目的: * 移除佇列中的最後一個元素，並返回此元素的記憶體位址 * 若 `sp` 不是空字串，則需將被移除元素的值複製到 `sp` ，而此字串長度最大為 `bufsize - 1` 實做: 與 `q_remove_tail()` 類似，將`list_first_entry` 改成`list_tail_entry()` 即可 --- ### q_size() 目的: 計算出 queue 的長度實做: 1. 判斷 `head` 是 `NULL` 以及 `head` 指向空佇列，如果為真，返回 0 2. 宣告一個 `int cnt` 遍歷至各個節點並將`cnt+1`，直到節點再回到head 3. 遍歷節點方法使用 list.h 的 `list_for_each` --- ### q_delete_mid() 目的: 1. 移除佇列中間的元素， `delete` 除了將原本的鏈結斷開之外，也需將元素的值(value) 以及元素本身佔用的記憶體空間釋放 2. 若移除元素成功返回 `true`，另外若`head`為`NULL` 或`head`指向空佇列，則返回 `false` 實做: 1. 判斷 `head` 是 `NULL` 以及 `head` 指向空佇列，如果為真，返回 false 2. 使用快慢指標技巧，在這佇列的第一個元素是 `head->next`，而當快針再次走到 `head` 時，慢針會剛好走到中間點(mid)，另外我們需考慮到長度是奇數與偶數的差別 > The middle node of a linked list of size n is the > ⌊n / 2⌋th node from the start using 0-based indexing. > If there're six elements， the third member should be deleted. ```c struct list_head *fast = head->next; // 快針 struct list_head *slow = head->next; // 慢針 while (fast != head && fast->next != head){ fast = fast->next->next; slow = slow->next; } ``` 3. 當找到 `mid` 節點時，將鏈結斷鏈和釋放其值佔用的記憶體即本身佔用的記憶體參考資料: [分析「快慢指標」](https://hackmd.io/@sysprog/ry8NwAMvT) 在此參考資料提及 --- ### q_delete_dup 目的: 比較佇列元素其字串是否重複，如果有重複就將包含自己"所有"元素移除(delete) 。說明: 比較 leetcode 82 的問題，在此需考慮兩個更複雜的條件相異點: 1. 比較的元素值為字串，若使用如 `node1->value == node2-> value` 將會是 false（這樣比較是兩個字串的記憶體位置），因此**需使用`strcmp` 函式來比較兩字串** 2. 在遍歷節點時，若上面的條件成立，除了將節點從佇列移除外，仍需釋放其記憶體空間與其字串的記憶體空間，因此需**使用暫存節點方式移除避免鏈結串列遺失**，此方法同 `list.h` 中部份函式的 `safe` 概念一樣相同點: 1. 假設 queue 佇列已經經過排序實做: 1. 判斷 `head` 是 `NULL` 以及 `head` 指向空佇列，如果為真，返回 false 2. 使用 `cur` 與 `run` 兩個節點來遍歷佇列, `cur` 用來遍歷鏈結串列，停止條件為當 `cur == head `時，而 `run` 使之為 `cur` 的下一個節點用來比較當前元素字串是否與 `cur` 元素相同 3 --（待補) -- 對觀念不熟： --- ### q_swap 目的:將相鄰的兩個元素做交換參考: 在 leetcode : [24. Swap Nodes in pair](https://leetcode.com/problems/swap-nodes-in-pairs/description/)，若鍊結串列為 `head = [1,2,3,4]` ，則經過交換後結果因為 `head = [2,1,4,3]`。此題目初見將問題簡化成 `[dummy,1,2,3]`，並實做交換方式 ```c // 假設已經定義 prev , node1 , node2, nextNode(node3) = node2->next // 確保不會在重新鏈結時找不到 node3 node1->next = nextNode; prev ->next = node2; node2->next = node1; ``` // 間接指標方式: --- ### 概念補充(待整理): C 語言 marco 以及 define 以及實際操作(查閱規格書 6.10) * The ## operator (token pasting) * > EXAMPLE 2 The following defines a function-like macro whose value is the maximum of its arguments. It has the advantages of working for any compatible types of the arguments and of generating in-line code without the overhead of function calling. It has the disadvantages of evaluating one or the other of its arguments a second time (including side effects) and generating more code than a function if invoked several times. It also cannot have its address taken， as it has none. #define max(a， b) ((a) > (b) ? (a) : (b)) `static` & `inline`: 在 `list.h` 中 `static inline void INIT_LIST_HEAD()` 此函式使用了 static & inline 進行修飾因為之前對這兩個綽詞沒有很理解，參照 C99 的解釋 * `inline` : 在 6.7.4 Function specifiers裡的敘述： ### 其餘進度(待整理): 閱讀及觀看 [你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list) 後以間接指標實做leetcode 82. Remove Duplicates from Sorted List II 83. Remove Duplicates from Sorted List 24. Swap Nodes in Pairs 2095. Delete the Middle Node of a Linked List (快慢針以及間接指標)