2025q1 Homework1 (lab0)

contributed by < HeatCrab >

CC: Urbaner3: Reply for Review on lumynous

HeatCrab, 你直接要求他 lummynous5， rebase A 分支到主分支，我看到你有做而且也希望他做。他CI 測試還沒過，你的已經過了，我猜他有刻意保留，因為另外兩個分支好像可以測試，不過他本人比較清楚。 link

這個是我路過留言，我老是忘記小人的功能。 Urbaner3

其實 rebase 是老師要求的檢查項目之一，也可以方便其他開發者檢視 commit ，也就是開發紀錄，而不會受到上游分支的更新影響而混合在一起。另外，他其他兩個分支應該是用於 PR 的？我好像看到跟他推送到老師（upstream）有關係的樣子。 Heatcarb

作業書寫規範:

無論標題和內文中，中文和英文字元之間要有空白字元 (對排版和文字搜尋有利)
文字訊息 (尤其是程式執行結果) 請避免用圖片來表示，否則不好搜尋和分類
共筆書寫請考慮到日後協作，避免過多的個人色彩，用詞儘量中性
不要在筆記內加入 [TOC] : 筆記左上方已有 Table of Contents (TOC) 功能，不需要畫蛇添足
不要變更預設的 CSS 也不要加入任何佈景主題: 這是「開發紀錄」，用於評分和接受同儕的檢閱
當在筆記中貼入程式碼時，避免非必要的行號，也就是該手動將 c= 或 cpp= 變更為 c 或 cpp。行號只在後續討論明確需要行號時，才要出現，否則維持精簡的展現。可留意「你所不知道的 C 語言: linked list 和非連續記憶體」裡頭程式碼展現的方式
HackMD 不是讓你張貼完整程式碼的地方，GitHub 才是！因此你在開發紀錄只該列出關鍵程式碼 (善用 diff 標示)，可附上對應 GitHub commit 的超連結，列出程式碼是為了「檢討」和「便於他人參與討論」
留意科技詞彙的使用，請參見「資訊科技詞彙翻譯」及「詞彙對照表」
不要濫用 :::info, :::success, :::warning 等標示，儘量用清晰的文字書寫。:::danger 則僅限授課教師作為批注使用
避免過多的中英文混用，已有明確翻譯詞彙者，例如「鏈結串列」(linked list) 和「佇列」(queue)，就使用該中文詞彙，英文則留給變數名稱、人名，或者缺乏通用翻譯詞彙的場景
在中文敘述中，使用全形標點符號，例如該用「，」，而非 ","。注意書名號的使用，即 〈 和 〉，非「小於」和「大於」符號
避免使用不必要的 emoji 字元

開發環境

$ gcc --version
gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0

$ lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          12
On-line CPU(s) list:             0-11
Thread(s) per core:              2
Core(s) per socket:              6
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           151
Model name:                      12th Gen Intel(R) Core(TM) i5-12400
Stepping:                        5
CPU MHz:                         800.000 (根據 CPU(s) scaling MHz: 18% 和 max/min 推算)
CPU max MHz:                     4400.0000
CPU min MHz:                     800.0000
BogoMIPS:                        4992.00
Virtualization:                  VT-x
L1d cache:                       288 KiB
L1i cache:                       192 KiB
L2 cache:                        7.5 MiB
L3 cache:                        18 MiB
NUMA node0 CPU(s):               0-11

針對佇列操作 `queue.c` 的程式碼實作

實作前的準備

在開始實作 queue.c 之前，先閱讀與其最直接相關的三個標頭檔：queue.h 、 harness.h 和 list.h 。

queue.h
- element_t - 鏈結串列元素架構
```
typedef struct {
    char *value;
    struct list_head list;
} element_t;
```
  參考「你所不知道的 C 語言: linked list 和非連續記憶體」中的圖片：
- queue_context_t - 佇列上下文結構
  目前根據 q_merge() 中的註解猜測可能在實作上有關係：
  
  There is no need to free the 'queue_contex_t' and its member 'q' since they will be released externally.
- Operations on queue - 佇列的實作
  接下來將在 queue.c 中被實現的所有函式的功能簡介與注意事項。
harness.h
- 根據作業說明的確認：
  - 定義了 test_malloc 、 test_calloc 、 test_free 、 test_strdup 這四個函式，並在程式碼中使用巨集定義，去方便我們撰寫程式碼時使用的 malloc 、 calloc 、 free 、 strdup 。
  - 根據註解
    
    This test harness enables us to do stringent testing of code. It overloads the library versions of malloc and free with ones that allow checking for common allocation errors.
    
    我的解釋是可以直接使用原本的函式寫法，但是編譯器會自動幫你替換成 harness.h 中新定義的函式來防止發生問題並進行額外的檢查。
    為了程式的可讀性，我會選擇繼續使用較簡單且明確，也就是不含 test 的寫法。
- 與 queue.[ch] 的關聯（牽涉記憶體）：
  - malloc 跟 strdup 可用於 q_insert_head 、 q_insert_tail 這兩個函式的撰寫，如註解中說明的：
    
    The function must explicitly allocate space and copy the string into it.
    
    來處理記憶體與字串複製的問題。
  - free() 就很直觀明瞭，用於釋放記憶體的工作，像是 q_release_element 中就有明確的定義使用以及 q_free() 這個一眼明瞭是用來做什麼的函式。
list.h
- 先翻閱「你所不知道的 C 語言: linked list 和非連續記憶體」，裡面有提到許多跟這個頭標檔相關的函式內容，我就簡單的整理一下目前可能會使用到實作 queue.c 的部分。
  - 根據 INIT_LIST_HEAD() 與 LIST_HEAD() 的註解定義和程式碼，我們可以判斷，這個函式可以使用在 q_new() 的實作上。
  - 同樣的，我們可以知道， list_add() 跟 list_add_tail() 分別可以使用在 queue.h 中的 q_insert_head() 和 q_insert_tail() 的實作上。
  - 接下來的頗有意思，名稱命名為 list_del() ，但是根據註解的定義說明：
    
    Remove a node from a circular doubly-linked list與The node’s memory and its containing structure, if any, are not freed.
    
    透過程式碼我們可以發現，其實這個函式的工作應該是 remove 而非如命名的縮寫 del 一樣是 delete。
    下方還有一個 list_del_init() ，除了一樣會實作 list_del() 以外，他多增加了一行 INIT_LIST_HEAD() 來初始化，以確保節點有被安全的釋放與初始化。
  - 有數個會遍歷整個 list 的巨集，我就不一一舉例，但推估是會與實作需要確認整個 list 時有關的函式，像是 q_free() 或是 q_delete_dup() 之類的。
  - 還有許多的函式與巨集定義，但是我暫時沒什麼想法跟聯想，像是 list_splice 或是 list_splice_tail 等等。而 lab0-c 中的 list.h 卻只是佔了在「 The Linux Kernal Api 」中「List Management Functions」的冰山一角，可見 Linux kernal 程式碼的龐大。
- 有趣的部分
  在 list.h 中有許多巨集的定義，對我來說十分新鮮。其中，在 list.h 裡，除了定義整個程式碼的 typeof 外，下一個定義的巨集是 container_of() 。在老師的「你所不知道的 C 語言: linked list 和非連續記憶體」中也有被提到的，是個改變程式設計思維的變革。

開始實作

1. `q_new()`

建立新的「空」佇列

根據定義

Create an empty queue whose next and prev pointer point to itself

所以定義一個 new queue ，以下程式碼稱為 head ，head 的 prev 跟 next 皆會指向自己，代表著這個佇列的建立與錨點，沒有 head ，就沒有佇列。
根據前面閱讀標頭檔的收穫，list.h 有幫實作好了一個函式： INIT_LIST_HEAD ，直接引用即可。

2. `q_free()`

釋放佇列所佔用的記憶體

根據前面 q_new() 的設計， q_free() 要釋放一個佇列，就是將佇列的錨點，也就是 head 釋放掉。於是我從 list.h 中挑選一個會遍歷整個鏈結串列的函式，確保可以安全地執行移去。最終我在 list_for_each_safe 跟 list_for_each_entry_safe 這兩個函式中做選擇，兩者在實作上的功能幾乎一模一樣，但是在程式碼的寫法上卻有些出入，當然也間接地去影響到使用上的不同。

list_for_each_safe （選用此寫法）:

作法上較底層，直接處理 list_head 的節點，需要手動轉換自定義的結構 (element_t)
重視靈活性與相容性的寫法。

list_for_each_safe(node, safe, head) {
    element_t *e = list_entry(node, element_t, list);
    test_free(e->value);                             
    test_free(e);                                    
}

list_for_each_entry_safe:

高層次的操作，直接處理好 element_t 省去手動轉換的步驟。
重視簡潔性與可讀性的寫法。

element_t *e, *safe;
list_for_each_entry_safe(e, safe, head, list) {
    test_free(e->value); 
    test_free(e);
}

站在巨人的肩膀上：
我在撰寫完 q_free() 的程式碼時就覺得有些奇怪，感覺自己的寫法好像哪裡可以更好，因此我決定翻看大家實作的想法。為了加快效率，我優先打開了老師提供的「2023 年 Linux 核心設計課程第一次作業檢討」並觀摩了第一位同學： yanjiew1 的筆記。

在他的筆記中使用了在先前我讀完標頭檔後，就遺忘的 q_release_elemet() ，所以我最終也將我的程式碼直接更改為：

list_for_each_safe(node, safe, head) {
    element_t *e = list_entry(node, element_t, list);
    q_release_element(e);
}

3. `q_insert_head()` / `q_insert_tail()`

q_insert_head():
在佇列開頭 (head) 插入 (insert) 給定的新節點 (以 LIFO 準則)

q_insert_tail():
在佇列尾端 (tail) 插入 (insert) 給定的新節點 (以 FIFO 準則)

根據前面閱讀標頭檔的收穫，分別使用 list.h 中的 list_add 和 list_add_tail 來實作。

問題與發想：

探討 queue.h 中的 q_release_elemet() 使用
```
static inline void q_release_element(element_t *e)
{
    test_free(e->value);
    test_free(e);
}
```
考慮到是否能夠以 q_release_element() 取代原始程式碼中直接撰寫的 free(new_element); ，然而，在 test_free(e->value); 這行程式碼中，
若 e->value 為 NULL，可能會產生額外的問題。因此，若要使用 q_release_element()，則需對 e->value 進行額外檢查，如下：
```
static inline void q_release_element(element_t *e)
{
    if(e->value)
        test_free(e->value);
    test_free(e);
}
```
但這樣的改動可能影響到其他程式碼部分，且未必能顯著提升便利性，甚至可能增加不必要的額外處理。因此，綜合考量後，選擇維持原始設計，不進行修改。
老師在「2025 年 Linux 核心設計課程作業 —— lab0 (A)」中
- 對於 queue.c 中對 q_insert_head() 和 q_insert_tail() 的簡單說明提到
  - q_insert_head(): … (以 LIFO 準則)。
  - q_insert_tail(): … (以 FIFO 準則)。
- 在實作上，使用 list_add 和 list_add_tail 在使用 q_remove_head() 做移去的時候，是符合老師說明的這兩個準則的沒有錯。
- 但是，如果使用 q_remove_tail() 進行移去的話，程式碼在實作上，就會變為相反的結果：
  - q_insert_head() 變成 FIFO 準則。
  - q_insert_tail() 變成 LIFO 準則。
- 目前尚未有明確的決定，決定暫時先保留現狀。
  1. 曾考慮是否應根據不同的移去方式，調整新增節點的方法以符合規範。然而，這種做法可能會導致程式碼變得較為冗長，影響可讀性與簡潔性。
  2. 規範要求不得使用 q_remove_tail()，需統一從 head 進行移去。此限制可能會降低程式碼的靈活性與便利性，需進一步評估其合理性。
  3. (思考貓)

4. `q_size`

計算目前佇列的節點總量

其實在作業說明中的牛刀小試就已經很完善了，但為了滿足 queue.h 中的說明：

zero if queue is NULL or empty

我在判斷式新增一個 list_empty() 函式來確認佇列是否為空。並調整了變數名稱，增加可讀性。

@@ -40,14 +79,15 @@ 
 /* Return number of elements in queue */
 int q_size(struct list_head *head)
 {
-     if (!head)                         
+     if (!head || list_empty(head))     
         return 0;
 
      int len = 0;
-     struct list_head *li;              
+     struct list_head *node;            
 
-     list_for_each (li, head)           
+     list_for_each (node, head)         
          len++;
 
      return len;
 }

5. `q_remove_head()` / `q_remove_tail()`

q_remove_head():
在佇列開頭 (head) 移去 (remove) 給定的節點

q_remove_tail():
在佇列尾端 (tail) 移去 (remove) 給定的節點

跟 q_size() 這個函式一樣先透過 if (!head || list_empty(head)) 這個判斷式，判斷佇列是否為空，防止發生記憶體問題。
接著使用 list.h 中的 list_del() 分別在 q_remove_head() 將 head->next ，也就是佇列的第一個元素移去。而 q_remove_tail() 將 head->prev ，也就是佇列的最後一個元素移去。

    /* 找到第一個node */    
    struct list_head *first = head->next;
    list_del(first);
    element_t *e = list_entry(first, element_t, list);

    /* 找到最後一個node */
    struct list_head *last = head->prev;
    list_del(last);
    element_t *e = list_entry(last, element_t, list);

站在巨人的肩膀上：

在寫到一半時，決定先去參考前人，也就是剛剛的前輩： yanjiew1 的筆記，很幸運地在裡面找到了對這兩個函式撰寫有幫助的提示，原本前輩是自己寫了一個函式來處理字串複製的問題，後來老師就提示他，他額外寫的 q_copy_string() 這個函式是否有必要？最後，他找到了 strncpy 這個函式來做優化

-        if (sp && bufsize)                              
-            q_copy_string(sp, bufsize, elem->value);    
+        if (!sp || !bufsize)                            
+            return elem;                                
+        strncpy(sp, elem->value, bufsize);              
+        sp[bufsize - 1] = '\0';

commit 5b8c553 from yanjiew1

我簡單的調整一下判斷式，從 if (!sp || !bufsize) ，更改成 if (sp && bufsize > 0) ，並將 srrncpy 中的 bufsize 改成 bufszie - 1 ，防止記憶體問題發生。

    if (sp && bufsize > 0) {
        strncpy(sp, e->value, bufsize - 1);
        sp[bufsize - 1] = '\0';
    }

在移去的做法上，我原本是根據函式名稱 (q_remove_head() / q_remove_tail()) ，去找到移去的節點 (head / tail) 之後，再透過 list_entry() 來取得這個元素，去做接下來的複製字串任務。但是根據前人 yanjiew1 的筆記中的程式碼，我再次回顧了 list.h ，發現有已經寫好的 list_first_entry() 跟 list_last_entry() 可以使用，最終將這兩個函式引入我的程式碼中。

6. `q_delete_mid()`

移走佇列中間的節點
詳見 LeetCode 2095. Delete the Middle Node of a Linked List

在實作上我參考了老師在「你所不知道的 C 語言: linked list 和非連續記憶體」中開頭提到的快慢指標法。但在用這個方法前，我先使用了我在閱讀 list.h 時發現的函式 list_is_singular() ，這個函式會確認整個 list 是否只有一個節點，如果是就回傳1，不是就會回傳零。所以就簡單設計了一個判斷式，如果只有一個節點的話，就不用繼續後續的快慢指標法去尋找了！

    if (list_is_singular(head)) {
        element_t *elem = list_first_entry(head, element_t, list);
        list_del(&elem->list);
        q_release_element(elem);
        return true;
    }

接著就是快慢指標法的實作。一樣讓快的指標，一次移動兩個節點，慢的一次移動一個。當快的指標走到整個佇列的末端時，慢指標所指的節點就是我們要的中間節點，也就是要刪除的對象。並使用 list_entry() 找到節點的位置，將其刪除。因為是 delete ，所以最後要使用 q_release_element 去釋放記憶體。

    while (fast != head && fast->next != head) {
        fast = fast->next->next;
        slow = slow->next;
    }

    element_t *mid = list_entry(slow, element_t, list);
    list_del(slow);
    q_release_element(mid);

7. `q_delete_dup`

在已經排序的狀況，移走佇列中具備重複內容的節點
詳見 LeetCode 82. Remove Duplicates from Sorted List II

跟實作 q_delete_mid() 時一樣，除了確認佇列是否為空？ head 是否為 NULL 以外，再使用 list_is_singular() 來確認是否只有一個節點，減少程式碼的運算成本。
如果整個佇列不是只有一個節點，接著使用 list_for_each_safe() 來遍歷整個鏈結串列，並使用字串比對函式 strcmp() ，如果當下的節點字串跟我們前一個節點的字串一樣，我們就將其刪除掉。

    list_for_each_safe(node, safe, head) {
        element_t *curr = list_entry(node, element_t, list);
        if (prev && strcmp(prev->value, curr->value) == 0) {
            list_del(&curr->list);
            q_release_element(curr);
        } else {
            prev = curr;
        }
    }

站在巨人的肩膀上：
後來我發現我這樣寫會有一個問題，假設我的字串如下：
"apple" -> "apple" -> "apple" -> "cherry" -> "cherry" -> "banana"
那因為我的函式在功能上會刪除後來所有比較到的重複字串，所以最終刪除結果會變成：
"apple" -> "cherry" -> "banana"
可是正確的結果應該要是：
"banana"
所以我決定翻閱一下前人的筆記，看看大家怎麼寫的。這次一樣參閱了 yanjiew1 的筆記。他的做法很簡單，他就是先使用 LIST_HEAD() 建立一個新的佇列錨點（以下稱為 pending )，接著使用 list_for_each_entry_safe() 開始比較字串的重複性。他定義了一個 cut 變數，來判斷是否出現不同字串的節點。當出現不同字串的節點時，他會將這個節點以前的整個鏈結串列切除，並將這個鏈結串列嫁接到 pending 的後面，然後更新 cut 的位置，接著重複這個動作，直到 list_for_each_entry_safe() 這個函式結束。最後將這個由 pending 為頭部新組成的鏈結串列的整個記憶體釋放掉，完成刪除重複字串的動作。此外，正如老師說的學長一開始的命名不是很好，雖然調整過了，但我在參考時，還是讀得有點不知所云，所以我再次更改了變數名稱，增加可讀性。

commit 104f9e5

8. `q_swap()`

交換佇列中鄰近的節點
詳見 LeetCode 24. Swap Nodes in Pairs

想法非常的簡單，使用一個 while 迴圈，從第一個節點 head->next 開始，兩兩一組執行互換。比較特別的是，我原本實作上是會先使用 list_del() 來移去原本的節點，再用 list_add() 把節點加到前一個位置上，達到互換的效果。但是根據前面實作時的經驗， list.h 檔案中一定有已經寫好的函式可以使用，所以我再次閱讀標頭檔後，找到了 list_move() 這個函式。他可以將我想要的節點先移去後，再將節點移動到我指定的指標後面。

static inline void list_move(struct list_head *node,
                             struct list_head *head)
{
    list_del(node);
    list_add(node, head);
}

    struct list_head *node = head->next;
    while (node != head && node->next != head) {
        struct list_head *next = node->next;
        list_move(node, &next);     
        node = next->next;        
    }

更新與修正：
在 commit 104f9e5 後收到pointer type mismatch error 的報錯，導致 make test 無法完成。回去回顧了 list.h 後發現沒有注意好細節。如果我要使用 list_move() 的話，就不能使用指標的指標，因為會與定義不符。

queue.c: In function ‘q_swap’:
queue.c:194:25: error: passing argument 2 of ‘list_move’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  194 |         list_move(node, &next);
      |                         ^~~~~
      |                         |
      |                         struct list_head **
In file included from queue.h:14,
                 from queue.c:5:
list.h:334:72: note: expected ‘struct list_head *’ but argument is of type ‘struct list_head **’
  334 | static inline void list_move(struct list_head *node, struct list_head *head)
      |                                                      ~~~~~~~~~~~~~~~~~~^~~~
cc1: all warnings being treated as errors
make: *** [Makefile:54: queue.o] Error 1

最終先把程式碼調整回原本的樣貌：

@@ -191,8 +191,11 @@ void q_swap(struct list_head *head)
     struct list_head *node = head->next;
     while (node != head && node->next != head) {
         struct list_head *next = node->next;
-        list_move(node, &next);                  
-        node = next->next;                       
+        struct list_head *next_next = next->next;
+                                                 
+        list_del(node);                          
+        list_add(node, next);                    
+        node = next_next;                        
     }
 }

commit ada17c4

站在巨人的肩膀上：
參考 yanjiew1 的筆記，他在 q_swap() 上的做法令我歎為觀止，他直接調用了 q_reverseK() ，並將 k 設為2，如此一來即省略了實作上 q_swap() 需要用不同方式執行，卻與 q_reverseK() 在執行相似工作這件事。

commit c5d22b0

9. `q_reverse()`

以反向順序重新排列鏈結串列，該函式不該配置或釋放任何鏈結串列節點
換言之，它只能重排已經存在的節點

實作上，我透過 LIST_HEAD() 這個巨集，初始化了一個臨時的佇列錨點 new_head ，接著使用 list_for_each_safe() ，遍歷整個鏈結串列並在每經過一個節點後，使用 list_move() 這個函式將節點接到 new_head 上，因為 list_move() 使用的是 list_add() ，他可以將節點向頭部鏈結，形成一個反轉，最後透過 list_splice() 這個函式把整個鏈結串列嫁接到原本的佇列 head 上，完成佇列的反向順序重新排列。

void q_reverse(struct list_head *head) 
{
    if (!head || list_empty(head))
        return;

    LIST_HEAD(new_head);

    struct list_head *node, *safe;
    list_for_each_safe(node, safe, head) {
        list_move(node, &new_head); 
    }

    list_splice(&new_head, head);
}

問題與發想：
我在寫完我自己的 q_resverse() 後，去看了一下 yanjiew1 的筆記，我發現他並沒有初始化一個新的 LIST_HEAD() ，於是我回去翻看 queue.h ，如同老師在作業說明裡說的一樣，不能配置或釋放任何記憶體，我可以確認我沒有釋放，但是配置呢？我感覺我使用 LIST_HEAD() 理論上應該不是吧？

翻閱 C 語言標準規格後：^[3]

變數定義與儲存期的相關規格

6.2.4 Storage durations of objects

An object has a storage duration that determines its lifetime. There are three storage durations: static, automatic, and allocated. […]

An object declared with external or internal linkage, or with the storage-class specifier static, has static storage duration. Its lifetime is the entire execution of the program and its stored value is initialized only once, prior to program startup.

An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration. […] Its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way.

解釋：

當使用 LIST_HEAD(my_list) 定義一個變數時，這個變數的記憶體是由編譯器分配的，並根據其聲明的位置具有「靜態儲存期」或「自動儲存期」。
如果在全域範圍定義（如文件頂層），根據第 6.2.4.2 段，它具有靜態儲存期，記憶體在程式啟動前分配。
如果在函數內部定義（如 void foo() { LIST_HEAD(my_list); }），根據第 6.2.4.5 段，它具有自動儲存期，記憶體在函數執行時於堆疊上分配。
這兩種情況都不涉及運行時的動態記憶體分配，而是由編譯器在編譯時確定記憶體位置。

動態記憶體配置的相關規格

7.20.3 Memory management functions

The order and contiguity of storage allocated by successive calls to the calloc, malloc, and realloc functions is unspecified. The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated). […]

相關函數描述：

void *malloc(size_t size); - Allocates space for an object whose size is specified by size and whose value is indeterminate.
void free(void *ptr); - Causes the space pointed to by ptr to be deallocated, that is, made available for further allocation.

解釋：

C 標準明確定義了動態記憶體配置是由 malloc、calloc、realloc 等函數完成的，這些函數在運行時從堆疊中分配記憶體，並且需要使用 free 手動釋放。
LIST_HEAD 巨集展開後僅僅是一個結構變數的定義和初始化（struct list_head my_list = {&(my_list), &(my_list)}），並未調用任何記憶體管理函數。
因此，根據第 7.20.3 節的定義，LIST_HEAD 不涉及動態記憶體配置。

3.結構定義與初始化的相關規格

6.7.8 Initialization

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then: — if it has pointer type, it is initialized to a null pointer; — if it has arithmetic type, it is initialized to (positive or unsigned) zero; […]

An initializer specifies the initial value stored in an object.

If an object that has static storage duration is not initialized explicitly, the object is initialized implicitly as if every member that has arithmetic type were assigned 0 and every member that has pointer type were assigned a null pointer constant.

解釋：

LIST_HEAD 巨集展開後，立即對結構變數進行初始化（{&(my_list), &(my_list)}），這符合第 6.7.8 節中關於初始化的規定。
這種初始化是在編譯時完成的，編譯器會在生成的可執行文件中為變數預留空間並設置初始值，而不是在運行時動態分配記憶體。

結論與佐證總結

不涉及動態記憶體配置：
- 根據第 7.20.3 節，動態記憶體配置明確由 malloc 等函數處理，而 LIST_HEAD 巨集未使用這些函數。
- 因此，LIST_HEAD 不涉及運行時的動態記憶體分配。
涉及靜態或自動記憶體分配：
- 根據第 6.2.4 節，LIST_HEAD 定義的變數具有靜態或自動儲存期，其記憶體是由編譯器在編譯時分配的。
  *如果在全域範圍使用，則為靜態儲存期；如果在函數內使用，則為自動儲存期。
初始化由編譯器處理：
- 根據第 6.7.8 節，LIST_HEAD 的初始化（如 {&(my_list), &(my_list)}）是在編譯時完成的，無需運行時分配額外記憶體。

最終答案

根據 C 語言標準規格，LIST_HEAD 巨集不涉及運行時動態記憶體配置（即不使用 malloc 等函數從堆中分配記憶體）。它僅定義並初始化一個結構變數，其記憶體是由編譯器在編譯時自動分配的（靜態或自動儲存期，取決於上下文）

最後我還是決定使用前人的方法，在可以省略程式碼的長度下，即可達到相同的效果，何樂而不為呢？

/* Reverse elements in queue */
 void q_reverse(struct list_head *head)
 {
     if (!head || list_empty(head))
         return;
 
-    LIST_HEAD(new_head);                        
+    // LIST_HEAD(new_head);                     
 
     struct list_head *node, *safe;
     list_for_each_safe (node, safe, head)
         list_move(node, head);

-    list_splice(&new_head, head);               
+    // list_splice(&new_head, head);            
 }

commit c5d22b0

[3]：使用 GROK3 協助整理與修飾語句。

10. `q_reverseK()`

類似 q_reverse，但指定每 k 個節點為一組，進行反向順序排列
詳見 LeetCode 25. Reverse Nodes in k-Group

延續一貫的前面實作 q_swap 與 q_reverse 時的思考方法，透過迴圈來進行。所以我一樣先確認如 queue.h 中對 q_reverseK 的註解說明： No effect if queue is NULL or empty. If there has only one element, do nothing. ，接著使用一個 while 迴圈，包著一個 for 迴圈，用最基本的方式，把 k 個一組的佇列進行反向順序排列。

    while (count >= k) {
        LIST_HEAD(tmp);           
        struct list_head *prev = head;   
        struct list_head *next;    
        int i;

        /* 將 k 個節點移動到臨時佇列，並反轉順序 */
        for (i = 0; i < k; i++) {
            struct list_head *next = curr->next;
            list_move(curr, &tmp);        
            curr = next;                         
        }
        next = curr; // 更新起點

        struct list_head *new_start = tmp.next; // 新的起點
        list_splice(&tmp, prev);                // 將臨時佇列拼接起來
        prev = new_start;                       // 更新結束節點

        count -= k;
    }

更新與修正：
在調整完我的 q_descend() 後，又遇到以下問題

    cmd> ih e 2
    l = [e e d c b a a a]
    cmd> reverseK 3
    l = [a b c e e d a a]
    cmd> rh d
    Removed a from queue
    ERROR: Removed value a != expected value d

開始回顧我的程式碼，並確認哪裡出了狀況。
以下是初始的鏈結串列：

首先移動三個節點（因為 k = 3 )到 tmp 中，

執行 reverse 的動作後得到以下新的鏈結串列，

接著重複動作，再抓取三個節點，到 tmp 中執行 reverse ，

但是問題在這時候發生了，因為剛剛更新的 tmp.next 現在是 head 所以如下圖一樣，將 a1 → b → c 直接透過 list_splice() 鏈接在錯誤的地方。

最後剩下兩個節點，因為 2 <= k 所以就不執行，也得到以下錯誤的鏈結串列。

站在巨人的肩膀上：
其實我在寫 q_reverseK() 之前有先參考過前人 yanjiew1 的筆記了，但是我想說我自己實作也沒問題，最後問題可大了，有蠻多小細節沒有注意到的。所以我在更改的時候決定參考他的程式碼，學習他使用 list_move_tail() 這個函式來將要 reverse 的節點按照順序接起來，接著使用前面實作的 q_reverse() 去操作 reverse 的動作。不同的是，我仍舊使用 while 迴圈，也因此我設計了一個指標 list_tail 來更新每一次的串接位置，也就是修正我一開始犯的錯誤。

        /* 將反轉的終點更新 */
        for (int i = 0; i < k; i++)
            list_tail = list_tail->next;

commit ce019b9

q_sort()

以遞增順序來排序鏈結串列的節點
可參閱 Linked List Sort 得知實作手法

在實作排序前，我決定先觀看前人的筆記。通過 yanjiew1 、 alanjian85 這兩位的筆記，我決定跟隨他們的腳步使用 merge sort 來實作這邊的 q_sort() 函式。做法上卻跟他們有些微的不同。
首先跟前人一樣都會定義一個新的函式，我命名為 q_merge_two() ，參考 yanjiew1 的命名，用來將兩個已排序好的鏈結串列結合在一起，也方便接下來的 q_merge() 實作。
在實作 q_sort 上延續了前面實作 q_delete_mid 時使用的快慢指標法來找到中點進行切割，並採用遞迴的方式來排序。
然後執行 make test 時就，爆炸了！！！
在一番左思右想，觀察自己的程式碼邏輯後，最終決定看看同學們的實作。運氣很好的，第一個點開的同學 JimmyChongz 就有遇到跟我一樣的問題。

首先是記憶體上的缺失，通過調整 list_cut_position() 的慢指標，來防止使用 list_cut_position() 在切割上的缺陷，像是：
- 不均等分割
- 遺漏中點節點
- 邊界條件下的未定義行為
那我們主要是遇到「邊界條件下的未定義行為」為主，慢指標指到錯誤的位置，導致發生記憶體問題。^[1]

接著透過在 q_merge_two() 增加邊界檢查，來防止排序不穩定的問題。

+++ TESTING trace trace-04-ops:
# Test of 'q_new', 'q_insert_tail', 'q_insert_head', 'q_remove_head', 'q_size', 'q_swap', and 'q_sort'
ERROR: Not stable sort. The duplicate strings "bear" are not in the same order.
---	trace-04-ops	0/6
+++ TESTING trace trace-05-ops:
# Test of 'q_new', 'q_free', 'q_insert_head', 'q_insert_tail', 'q_remove_head', 'q_reverse', 'q_size', 'q_swap', and 'q_sort'
ERROR: Not stable sort. The duplicate strings "bear" are not in the same order.
---	trace-05-ops	0/6

    if (!left || !right || list_empty(right))
        return;

(檢討) 在寫排序的時候卡在究竟相同的字串應該誰先誰後，然後只能一直透過 test 來驗證，浪費許多時間，結果在撰寫問題與發想時在翻閱「你所不知道的 C 語言: linked list 和非連續記憶體」時發現，文章的最後一小段，老師其實有說明，證明老師的每一個文件都需要學生多花時間品味，「魔鬼藏在細節裡」。

問題與發想：

在參考前人 yanjiew1 的筆記時，老師在他的 q_sort() 下方詢問：如果使用迭代的做法呢？
這讓我回憶起我在老師提供的「你所不知道的 C 語言: linked list 和非連續記憶體」裡有提到實作迭代方式的排序程式碼實作。也我讓開始思考，會不會使用迭代的方式實作，會更節省成本？因為根據我在資料結構與演算法這類課程的所學，遞迴看似快，實際上在跟迭代的運算成本上相比，仍是略遜迭代一籌的，或許應該嘗試使用迭代去實作。
實作：

「你所不知道的 C 語言: linked list 和非連續記憶體」中有實作方法可以參考
TBD

[1]: 圖片引用自 JimmyChongz 的 2025q1 Homework1 (lab0)

12. `q_merge()`

合併所有已排序的佇列，並合而為一個新的已排序佇列
詳見 LeetCode 23. Merge k Sorted Lists

根據一開始閱讀標頭檔的了解，我們知道這個函式與 queue_contex_t 這個資料型別有關係。這個資料型別定義的 q 變數，方便我們去定義要合併的佇列位置，接著通過使用 list_for_each_safe() 函式，我們將要被合併的佇列篩選出來，並透過 q_merge_two() 傳送 descned 變數的布林值來做升序或是降序的排列合併。

    queue_contex_t *target_queue = list_first_entry(head, queue_contex_t, chain);
    struct list_head *target_list = target_queue->q;

    struct list_head *node, *safe;
    list_for_each_safe(node, safe, head) {
        if (node == &target_queue->chain)
            continue;
        queue_contex_t *curr_queue = list_entry(node, queue_contex_t, chain);
        struct list_head *curr_list = curr_queue->q;
        if (curr_list) {
            q_merge_two(target_list, curr_list, descend);

13. `q_ascend()` / `q_descend()`

參閱 LeetCode 2487. Remove Nodes From Linked List
注意節點間的順序

q_descend()

我首先先實作 q_descend() ，首先一樣再次閱讀與確認 queue.h 中對於 q_descending 的定義說明。先說結論，我以為我看懂了，結果耽誤了非常多時間。
原文如下： Remove every node which has a node with a strictly greater value anywhere to the right side of it. ，所以可以知道，只要你的右邊有比你的值大的點，我們就需要將那個點移去。
那我就先入為主地想，我就從左邊，也就是整個佇列的頭部開始實作，只要比較到右邊出現比較大的字串，就將該字串移除掉。
那到這邊想法完全沒問題，有問題是在程式碼的實作上。

    LIST_HEAD(stack);
    struct list_head *node, *safe;

    list_for_each_safe (node, safe, head) {
        const element_t *curr = list_entry(node, element_t, list);
        while (!list_empty(&stack)) {
            element_t *top = list_last_entry(&stack, element_t, list);
            if (strcmp(curr->value, top->value) > 0) {
                list_del(&top->list);
                q_release_element(top);
            } else {
                break;
            }
        }
        list_move(node, &stack);
    }

    list_splice_init(&stack, head);
    return q_size(head);

做法上是定義了一個堆疊，來儲存字串。如果遇到右邊有比較小的字串，我們就將堆疊內的數字移除，放入新字串。若是沒有，就將字串塞入堆疊中，繼續往下判斷，如下圖所示：

那很顯然的，依照 queue.h 中的定義來實作的話 5 → 3 → 1 → 2 → 4 的結果應該是：
5 → 4 才對，但是程式碼帶來的結果卻是 4 → 2 → 1
（有趣的是，我這邊大小於還寫錯邊了）
接著我就開始調整我的程式碼，但是怎麼調整都難以達到要求。

cmd> ih c
l = [c a d c b a]
cmd> descend
ERROR: At least one node violated the ordering rule
l = [a b c d a]
cmd> rh d
Removed a from queue
ERROR: Removed value a != expected value d
l = [b c d a]
cmd> rh c
Removed b from queue
ERROR: Removed value b != expected value c
l = [c d a]
cmd> rh b
Removed c from queue
ERROR: Removed value c != expected value b
l = [d a]
cmd> rh a
Removed d from queue
ERROR: Removed value d != expected value a

最後我跟 horseface1110 討論，他告訴我說，他是從尾巴開始做，這個想法打開了我的思路，我就按這個想法，重新寫了一版新的程式碼。
這次使用 list_last_entry() 這函式找到整個鏈結串列的最尾端，達到從尾巴做起的需求。並定義一個最後一個節點為 max_value ，接著使用 while 迴圈開始做檢查，只要左邊，也就是往頭部方向的節點，大於我們當今的 max_value ，他就不會被取代，並且我們會在過程中隨時更新 max_value 的值，確保最後會呈現遞減的狀態。

commit 6698a3d

q_ascend()
將 max_value 這個變數名稱調整為 min_value 後，一樣使用 list_last_entry() 找到串列的尾巴，並使用 while 迴圈依序檢查到頭部，差別在於，這次是出現大於當前的 min_value 則會被刪除。

commit ce019b9

問題與發想：
以下是 q_descend() 與 q_ascend() 在 queue.h 中寫到的原文定義：

q_descend() - Remove every node which has a node with a strictly greater value anywhere to the right side of it.
q_ascend() - Remove every node which has a node with a strictly less value anywhere to the right side of it.

沒錯！這兩個都是寫 remove 而不是 delete ，於是好奇究竟是什麼做法的我，就將我程式碼中的 q_release_element() 註解掉，執行 driver.py 後，得到以下結果：

+++ TESTING trace trace-06-ops:
# Test of 'q_new', 'q_free', 'q_insert_head', 'q_insert_tail', 'q_remove_head', 'q_delete_dup', 'q_sort', 'q_descend', and 'q_reverseK'
ERROR: There is no queue, but 4 blocks are still allocated
ERROR: Freed queue, but 4 blocks are still allocated
---	trace-06-ops	0/6

報錯 Freed queue, but 4 blocks are still allocated ，所以結論應該是 delete ，而不是 remove 。

到目前為止的結果：

$ make test

...
---	TOTAL		95/100
make: *** [Makefile:72：test] 錯誤 1

與

$ make valgrind

...
---	TOTAL		95/100
make: *** [Makefile:72：test] 錯誤 1

Valgrind ＋自動測試程式

通過前述測試結果，我發現使用 make test 與 make valgrind 同樣在測試 trace-17-complexity.cmd 時，儘管都失敗了，卻在四個函式上的表現結果不盡相同，所以我決定先研讀 Makefile 探究兩者造成差異的原因。

test: qtest scripts/driver.py
	$(Q)scripts/check-repo.sh
	scripts/driver.py -c

valgrind_existence:
	@which valgrind 2>&1 > /dev/null || (echo "FATAL: valgrind not found"; exit 1)

valgrind: valgrind_existence
	# Explicitly disable sanitizer(s)
	$(MAKE) clean SANITIZER=0 qtest
	$(eval patched_file := $(shell mktemp /tmp/qtest.XXXXXX))
	cp qtest $(patched_file)
	chmod u+x $(patched_file)
	sed -i "s/alarm/isnan/g" $(patched_file)
	scripts/driver.py -p $(patched_file) --valgrind $(TCASE)
	@echo
	@echo "Test with specific case by running command:" 
	@echo "scripts/driver.py -p $(patched_file) --valgrind -t <tid>"

可以發現都是呼叫 scripts/driver.py 來做測試的動作，關鍵點可能是 Valgrind 呼叫了sed -i "s/alarm/isnan/g" $(patched_file) 這行。
因為我們使用 cp qtest $(patched_file) 將 qtest 這個檔案複製到 $(patched_file) ，所以我接下來想要去找 qtest 中的 alarm 在哪裏，卻 qtest 裡根本發現沒有。
接著我們透過指令在詳細的測試一次：

$ scripts/driver.py -t 17 -c -v 2

得到結果：

+++ TESTING trace trace-17-complexity:
# Test if time complexity of 'q_insert_tail', 'q_insert_head', 'q_remove_tail', and 'q_remove_head' is constant
ERROR: Probably not constant time or wrong implementation
ERROR: Probably not constant time or wrong implementation
Probably constant time
Probably constant time
---	trace-17-complexity	0/5

再透過 Valgrind 測試

$ scripts/driver.py -p ./qtest --valgrind -t 17 -v 2

得到結果：

+++ TESTING trace trace-17-complexity:
# Test if time complexity of 'q_insert_tail', 'q_insert_head', 'q_remove_tail', and 'q_remove_head' is constant
Probably constant time
ERROR: Probably not constant time or wrong implementation
Probably constant time
Probably constant time
---	trace-17-complexity	0/5

由於在使用 Makefile 測試時與 qtest 檔案有高度相關，所以決定繼續研讀 qtest 中與 trace-17-complexity.cmd 測試有關的操作。
可以發現在 do_it 、 do_ih 、 do_rt 、 do_rh 這四個函式發現他們分別呼叫了 queue_insert 與 queue_remove ，其中這兩個函式都有相似的程式碼，呼叫 constant time 的函式判斷，以下用 queue_remove 函式舉例：

#if !(defined(__aarch64__) && defined(__APPLE__))
    if (simulation) {
        if (argc != 1) {
            report(1, "%s does not need arguments in simulation mode", argv[0]);
            return false;
        }
        bool ok =
            pos == POS_TAIL ? is_remove_tail_const() : is_remove_head_const();
        if (!ok) {
            report(1,
                   "ERROR: Probably not constant time or wrong implementation");
            return false;
        }
        report(1, "Probably constant time");
        return ok;
    }
#endif

當 simulation 為1的時候，即會進入判斷，並呼叫 is_remove_tail_const() 、 is_remove_head_const() 等與常數時間相關的函式。而在 trace-17-complexity.cmd 中，設定 option simulation 1 就是為了測試函式是否為常數時間。
之後在 dudect/fixture.[ch] 中發現相關的程式碼：

在 fxiture.h 中定義

/* Interface to test if function is constant */
#define _(x) bool is_##x##_const(void);
DUT_FUNCS
#undef _

在 fixture.c 中呼叫，並在 test_const 這個函式中實作

#define DUT_FUNC_IMPL(op) \
    bool is_##op##_const(void) { return test_const(#op, DUT(op)); }

附圖為使用 massif 顯示出的記憶體用量圖：

翻看作業說明後發現與論文〈Dude, is my code constant time?〉有關聯，於是決定先來研讀論文後，根據所學所得來調整問題。

研讀論文〈Dude, is my code constant time?〉

論文整理

背景與問題
- 什麼是 Timing Attacks？
  時間攻擊是駭客通過測量程式跑多久來偷秘密，是一種側信道攻擊（side-channel attack），比如老師在課程說明時提到的鋼琴家-官大為，他使用 Linux 測試的東西就可以延伸出類似的資安疑慮。如果程式處理不同輸入的時間不一樣，駭客就能猜出一些線索。論文中提到這種攻擊從 1996 年開始就有了，後來連 TLS 這種大系統都被攻破（references [AP13]），說明問題很嚴重。
- 為什麼要常數時間？
  接續前述說的 Timing Attacks 的問題，如果程式跑的時間跟輸入有關，駭客就能利用這點。理想是無論輸入什麼，時間都一樣，這樣才安全。有趣的是，原文說連一些號稱常數時間的程式都被發現有漏洞（引用 [GBY16], [Cry16]），所以得確保沒秘密相關的分支或存取。
- 傳統方法的麻煩
  這邊就是在點題，為什麼作者會開發 dudect 呢？在論文中提到，過去人們靠手動檢查程式碼（特別是組合語言），看看有沒有秘密相關的分支或存取。這很花時間，而且每次換編譯器或設定都要重來。其他工具（像 ctgrind、ctverif）雖然自動化，但需要模擬硬體行為，很難做得準確。
dudect 的方法（如作業說明整理的）
- 步驟 1：測量執行時間
  分為兩組測量（a fix-vs-random leakage detection test）一個固定數值，一個亂數產生，跑很多次，記下每次時間，用現代 CPU 自帶的計時器測。為了避免環境的影響，論文中還特別提到會會隨機混淆測量順序避免干擾（“each measurement corresponds to a randomly chosen class”）。
- 步驟 2：處理數據
  有時候測量結果會有異常值（比如被作業系統干擾跑超久），可以用 cropping 把這些極端值去掉。也可以做一些 higher-order preprocessing ，先處理數據，讓分析更敏感。
- 步驟 3：統計分析
  用統計測試，選用的是 Welch's t-test 檢查兩組時間分佈有沒有明顯差異。如果差異很大，就說明執行時間跟輸入有關，不是常數時間。也可以用其他測試，像 Kolmogorov-Smirnov （引用 [WOM11]），但相較於 t-test 大多都有其他缺陷或更多需求。
實驗結果

記憶體比較：測試了一個普通的 memcmp（會提早結束的變動時間版本），很快就發現時間洩漏。換成常數時間版本（用邏輯運算，不提早結束），測了幾百萬次都沒問題。
AES 加密：測試了三種 AES 實作：
- T-tables 版本（已知有時間漏洞），幾千次測量就抓到洩漏。
- Bitsliced 版本（應該是常數時間），測了十億次都沒問題。
- 向量排列版本（用 SSSE3 指令），也沒洩漏。
Curve25519 on ARM：在 ARM7TDMI 上測試了一個橢圓曲線運算（curve25519-donna），發現執行時間跟輸入有關，因為 ARM 的乘法指令本身就不是常數時間。這點在 x86 上卻沒問題，說明硬體差異很重要。

Welch's t-test 與 Student's t-test

閱讀完論文後了解，作者在實作 t-test 上使用的是 Welch's t-test ，但是老師作業說明中也有同學提問過相關議題，下方也有老師的解惑。但我對這些都不甚清楚，所以決定先從 Student's t-distribution 了解。

為什麼叫 Student ？
蠻有趣的，作者 William Sealy Gosset 為了避免洩露商業機密，因此在任職於 Guinness 啤酒廠時發表的文章都以筆名 “Student” 發表，。
什麼是 t-distribution ？
一種對稱、鐘形概率分佈，用於小樣本且母體標準差未知時的估計，特色是具較胖的尾部。
與常態分佈相比，兩者均為對稱鐘形，但是在分布圖上，t 分佈尾部較胖（極端值機率高），因為考慮到小樣本不確定性，也因此在樣本越大的情況下，會越接近常態分佈。而與 z-distribution 的比較之下， z-distribution是假設母體標準差已知，較適用於大樣本，尾部較瘦，而 t-distribution 用樣本標準差估計，適用小樣本，尾部較胖。

自由度（Degrees of Freedom, df）公式：
t 值（單樣本）：

$t = \frac{\bar{x} - μ}{s / \sqrt{n}}$ 其中
$\bar{x}$ 為樣本平均，
$μ$ 為母體平均，
$s$ 為樣本標準差，
$n$ 為樣本數。

為什麼使用 t 分佈？：
因為在現實中常遇小樣本且母體標準差未知，z-distribution 因需已知標準差而不適用。其中， t-distribution 以樣本標準差估計，胖尾反映估計的不確定性，確保小樣本檢驗結果保守且可靠。
跟 t-test 的關聯在哪？
t-distribution 為 t-test 提供概率框架，計算 t 統計量並判斷顯著性，適用於小樣本與未知標準差場景。當用樣本標準差替代母體標準差，t 統計量（如
$t = \frac{\bar{x} - μ}{s / \sqrt{n}}$ ）遵循 t-distribution ，而非常態分佈，因小樣本變動性。
在狹義上，Gosset 提出的原始 t-test，假設變異數相等，涵蓋：
- One-sample：
  
  $t = \frac{\bar{x} - μ_{0}}{s / \sqrt{n}}$
- Two-sample：
  
  $t = \frac{{\bar{x}}_{1} - {\bar{x}}_{2}}{s_{p} \sqrt{\frac{1}{n_{1}} + \frac{1}{n_{2}}}} (s_{p} 為合併標準差)$
而在廣義上 Student's t-test 常被用作 t-test 家族代稱，因其率先應用 t-distribution。

在了解了 Student's t-distribution 後，我接著研讀 Welch's t-test：

什麼是 Welch's t-test？
在定義上 Welch's t-test 是一種 t-test 變體，用於比較兩組獨立樣本的平均值是否顯著不同，不假設兩組母體變異數相等，依據 t 分佈判斷。
- 公式：
  
  $t = \frac{{\bar{x}}_{1} - {\bar{x}}_{2}}{\sqrt{\frac{s_{1}^{2}}{n_{1}} + \frac{s_{2}^{2}}{n_{2}}}}$
  
  ${\bar{x}}_{1}, {\bar{x}}_{2}$ ：兩組樣本平均值；
  $s_{1}^{2}, s_{2}^{2}$ ：樣本變異數；
  $n_{1}, n_{2}$ ：樣本數。
- 自由度：用 Welch-Satterthwaite 近似計算，如：
  
  $d f \approx \frac{{(\frac{s_{1}^{2}}{n_{1}} + \frac{s_{2}^{2}}{n_{2}})}^{2}}{\frac{(s_{1}^{2} / n_{1})^{2}}{n_{1} - 1} + \frac{(s_{2}^{2} / n_{2})^{2}}{n_{2} - 1}}$
也因此 Welch's t-test 適用於變異數未知或不相等的情況，比傳統的 t-test 更靈活。
跟 Student's t-test 的差異在哪？
假設條件：
- Student's t-test：假設兩組母體變異數相等，用合併標準差
  $s_{p} = \sqrt{\frac{(n_{1} - 1) s_{1}^{2} + (n_{2} - 1) s_{2}^{2}}{n_{1} + n_{2} - 2}}$
- Welch's t-test：不假設變異數相等，直接用各組樣本標準差。
公式：
- Student's：
  $t = \frac{{\bar{x}}_{1} - {\bar{x}}_{2}}{s_{p} \sqrt{\frac{1}{n_{1}} + \frac{1}{n_{2}}}}$ ，自由度
  $d f = n_{1} + n_{2} - 2$ 。
- Welch's：分母用獨立標準誤差，自由度經調整。
適用性：
- Student's：需驗證變異數相等，適合穩定數據。
- Welch's：無需變異數假設，適用於變動性不同的數據。
為什麼論文中會使用 Welch's t-test？
在論文中，比較固定與隨機輸入的執行時間，檢測是否有時間洩漏。
有以下的原因：
- 變異數不確定：執行時間受環境影響（如作業系統干擾），兩組變異數未必相等，Welch's 不需假設變異數相等更合適。
- 穩健性：論文測量數據可能包含異常值或非均勻變動，Welch's 能應對這種不確定性。
因此 Welch's t-test 的靈活性使其適合論文中複雜的時間數據分析。
論文中使用的可不可以是 Student's t-test？
以廣義上的定義來看是可行的，t-test 常被統稱為 Student's t-test，說“Student's”不完全錯。但是總的來說不合適， Student's t-test 要求變異數相等，論文數據（執行時間）變異數可能因輸入或環境不同而不一致，用 Student's 可能失準。來舉個有趣的例子：
論文用 Welch's，就像你點了百事可樂。說「可口可樂」（指 Student's）感覺沒什麼問題，因為都是可樂（t-test），但總的來說不夠精確，因為你喝的是百事！學術上該說「百事」（Welch's），不然就像點錯飲料，細節不對。
所以總結來說，其實作業說明的解釋可以接受，但不太正確。

接著開始閱讀 lab0-c 中的 dudect 資料夾，並開始著手改動程式碼，以完善缺陷並期許通過 trace-17-complexity.cmd 的測試，見到星之卡比！

閱讀 `dudect` 資料夾中的程式碼並分析不足之處

首先先理解資料夾中的檔案與他們之間的連結：

ttest.c
如前述解釋的，此檔案符合實作 Welch's t-test 的統計計算，首先接收 fixture.c 的執行時間數據來計算 t 值，接著使用 t_init 初始化統計變數並將平均值與平方和及樣本數設為 0，通過 t_push 計算新數據與舊平均值的差值後更新平均值並累加平方和，再利用 t_compute 套入 Welch's t-test 公式：

$t = \frac{{\bar{x}}_{1} - {\bar{x}}_{2}}{\sqrt{\frac{s_{1}^{2}}{n_{1}} + \frac{s_{2}^{2}}{n_{2}}}}$ 最終得出 t 統計量以判斷兩組執行時間平均值是否顯著不同，作為分析是否為常數時間的統計核心。
constant.c
負責準備測試用的輸入數據並測量佇列操作的執行時間，生成數據供給 fixture.c 處理以測試是否保持常數時間，使用 init_dut 初始化佇列結構並設為空，通過 prepare_inputs 呼叫 random.h 的 get_random_string 函式並利用 randombytes 生成隨機與固定輸入數據，將類別 0 設為全 0 ，而類別 1 保留隨機值，再利用 measure 根據指定操作測量時間並透過呼叫 cpucycles.h 的函式記錄操作前後週期，以執行如插入或移除的佇列操作，模擬不同輸入場景並收集數據作為 Welch's t-test 的輸入。
fixture.c
整合測試流程並應用 Welch's t-test，結合 constant.c 的測量結果與 ttest.c 的統計數據執行完整測試，以判斷執行時間是否為常數，使用 fixture.h 中定義的 is_op_const 呼叫 test_const 檢查特定操作並生成測試函式如插入或移除，通過 differentiate 從前後週期相減計算執行時間，再利用 update_statistics 呼叫 ttest.c 的 t-push 更新統計數據，接著透過呼叫 report 使用 t_compute 計算 t 值並與閾值比較來報告結果，最後利用 doit 整合準備與測量及分析執行單次測試。
console.h
在閱讀標上述3個程式碼使用的標頭檔時發現了裡面定義 simulation 變數，可以猜測 simulation 的動作不僅僅與測試函式的常數時間有關，也與 qtest.c 乃至整個檔案的指令呼叫，有一定程度的關係。

所以可以知道， ttest.c 接收 fixture.c 的數據計算 t 值，constant.c 生成數據供給 fixture.c，fixture.c 整合兩者實現 Welch's t-test 檢測是否能達到常數時間。

根據作業說明，我們知道原本的程式碼是不完整的，我們接著比較整個 dudect 自料夾中的程式碼跟論文作者實作的 dudect.h 會發現，目前在 dudect 中的程式碼正如作業說明裡說的一樣，是沒有具備 percentile 的處理，也就是我們的程式碼相較於論文內所述，在 pre-processing 的步驟上，除了 update_statistics 這個函式以外，沒有明確的實作前處理，導致檢測能力較弱，使得測試 t-test 會無法通過的原因。

實作 pre-processing 步驟改善 t-test 測試

作者是通過 prepare_percentiles 與 update_statistics 這兩個函式來處理步驟二。相較於作者，我們資料夾中的檔案內沒有 prepare_percentiles ，所以我們要來研究一下，作者做了什麼，並把它應用到我們的程式碼中。
根據論文 III.A.a Pre-processing

We discard measurements that are larger than the p percentile, for various values of p… The exact values for p follow an inverse exponential trend…

以及程式碼的實作，我們知道作者先透過 qsort 排序執行時間，接著透過 p 的計算公式：1 - pow(0.5, 10 * (double)(i + 1) / DUDECT_NUMBER_PERCENTILES) （此處 DUDECT_NUMBER_PERCENTILES = 100）找出分位的閾值，最後透過 update_statistics 函式，將不要的數據剪裁掉，也就是 cropping 的動作。

我的實作上，就是引用作者程式碼中的 prepare_percentiles 函式與其相關函式拿來做使用。再使用 make test 後，成功的通過測驗

+++ TESTING trace trace-17-complexity:
# Test if time complexity of 'q_insert_tail', 'q_insert_head', 'q_remove_tail', and 'q_remove_head' is constant
Probably constant time
Probably constant time
Probably constant time
Probably constant time
---	trace-17-complexity	5/5
---	TOTAL		100/100

可是，這樣是不嚴謹的，因為根據我們研讀的論文以及老師上課提到的。若是直接引用的話，只有處理到的是 pre-processing 的部分，像是作者使用了有使用到多重 t-test 測試。不過我覺得，先求有再求好。
正當我要繳交 commit 假裝自己「真的」通過測試的時候，發現同學 BennyWang1007 完成了新版的 fixture.c 程式碼，並且提交的 PR 已經被老師接受了。在簡單的閱覽 commit-message ，與確認程式碼內容後，發現同學實作的與作者論文裡對 dudect 的要求高度相關，於是我決定拋棄我的半成品，準備 rebase ，並接著測試我的 queue.c 是否能通過這版嚴謹的測試。

commit 83077d6 contributed by BennyWang1007

使用 git fetch 命令以取得 sysprog21/lab0-c 的最新更新

如同老師在 review 裡提供的圖片：

以下是我 rebase 的步驟：

git status ：在開始前與 rebase 的過程中，我會使用這個指令，確認我目前的狀況。
git fetch upstream ： fetch 到上游分支（ sysprog21/lab0-c ）。
git rebase upstream/master ：重新定義我分支的基底。
git pull origin master ：接著把我分支的資料拉取上來。
git push origin master --force ：將資料推送到我的分支上。由於我的分支有變動，所以會產生與衝突的情況。這時候我會選擇直接 force 推送。

git log --oneline --graph --all ：圖像化看一下我的 rebase 結果。

* 3399ce2 (HEAD -> master, origin/master, origin/HEAD) Resolve conflicts in pre-commit hook
* 9b22c61 Integrate q_swap via q_reverseK, refine q_reverse
* b033c41 Refactor queue functions reverseK, ascend, descend
* 6aefb08 Add sort and merge with helper merge function
* 871d65d Add node removal by right-side value comparison
* 1b766fd Fix pointer type mismatch error in queue operation
* 091c41c Fix duplicate removal bug and update operations
* 60dfeac Update four newly functions for queue operations
* 29dc67d Enhance queue functions in testing branch
* 02f9904 Resolve git pull conflict and update queue.c
* fe25702 Fix queue.c, queue.h, and hash scripts for lab-0
*   f53314e (upstream/master) Merge pull request #283 from BennyWang1007/dudect_fix
|\  
| * 83077d6 Improve dudect test with cropped time analysis
* |   9563d40 Merge pull request #281 from EricccTaiwan/rest-api
|\ \  
| * | 427c9b3 Introduce REST API fallback
| |/  
* |   eb54e6f Merge pull request #282 from salmoniscute/fix
|\ \  
| |/  
|/|   
| * b34fcd3 Correct grammar in queue.h function descriptions

接著使用同學提供的新版 fixture.c ，透過 make test 測試後發現，原先能通過的測試都無法通過：

+++ TESTING trace trace-17-complexity:
# Test if time complexity of 'q_insert_tail', 'q_insert_head', 'q_remove_tail', and 'q_remove_head' is constant
ERROR: Probably not constant time or wrong implementation
ERROR: Probably not constant time or wrong implementation
ERROR: Probably not constant time or wrong implementation
ERROR: Probably not constant time or wrong implementation
---	trace-17-complexity	0/5
---	TOTAL		95/100
make: *** [Makefile:72：test] 錯誤 1

相較於新版更為嚴謹的測試，我本身的簡單版本的程式碼有學習論文實作有關 warm-up 的步驟，通過捨棄第一批資料，來過濾與防止硬體造成的記憶相關問題，因此我發想，是否跟這邊會有差別呢？並開始實作。與此同時，我也正在等待 make valgrind 測試的結果。
最終結果發現，原來記憶體發生洩漏，以下是以 q_insert_tail 為例子：

==94830== 43,632 bytes in 909 blocks are definitely lost in loss record 1 of 4
==94830==    at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==94830==    by 0x110F04: init_once (fixture.c:216)
==94830==    by 0x110F04: test_const (fixture.c:227)
==94830==    by 0x1113F5: is_insert_tail_const (fixture.c:250)
==94830==    by 0x10CCC6: queue_insert (qtest.c:192)
==94830==    by 0x10D0EE: do_it (qtest.c:288)
==94830==    by 0x10E9CC: interpret_cmda (console.c:214)
==94830==    by 0x10ED76: interpret_cmd (console.c:234)
==94830==    by 0x10F00E: cmd_select (console.c:604)
==94830==    by 0x10F8EF: run_console (console.c:684)
==94830==    by 0x10DB6E: main (qtest.c:1441)

根據報錯 43,632 bytes in 909 blocks are definitely lost ，而且確認後發現，都是發生在 init_once 這個函式。回去檢視原本的程式碼，可以發現原本的程式碼在 test_const 這個函式裡的 for 迴圈中， init_once 一直被不斷地重複呼叫，但是在釋放記憶體的動作卻只有在最後的時候被執行一次而已。
根據 TEST_TRIES 參數為10與 ctx[] 矩陣定義為101個，假如只有釋放一次，那就是

101 \times 10 - 101 = 909

正好跟我們透過 valgrind 找到的問題一樣。
所以我調整 init_once 跟 test_const 的程式碼。新增一個判斷式在 init_once 中，確認 ctx 是否有重複 malloc 的行為。接著調整 test_const 函式，把 init_once 從 for 迴圈中移出，改成在每次測試時只呼叫一次，防止重複呼叫的問題。
此外，我也在 doit 函式中，防止 calloc 記憶體問題的判斷式中，新增判斷 percentiles ，讓整個判斷式更嚴謹。
最後的結果也十分理想，除了通過測試得到星之卡比外，在記憶體方面也不再出錯了。

+++ TESTING trace trace-17-complexity:
# Test if time complexity of 'q_insert_tail', 'q_insert_head', 'q_remove_tail', and 'q_remove_head' is constant
Probably constant time
Probably constant time
Probably constant time
Probably constant time
---	trace-17-complexity	5/5
---	TOTAL		100/100

Test with specific case by running command:
scripts/driver.py -p /tmp/qtest.6SX9ZB --valgrind -t <tid>

接著回到最一開始提到的 warm-up 步驟。實作出結果的我，當然也好奇是否 warm-up 步驟也能改善原先程式碼的問題。所以我將我原先修復好的記憶體問題程式碼復原成原先同學實作的版本。並調整 doit 新增 warm-up ，但是在測試了幾次後，結果都差不多， malloc() 或者說，記憶體相關方面出問題了。
接著我把我更改的記憶體程式碼新增後，問題就都不再發生，且可以順利通過測試了。

針對 sysprog21/lab0-c 專案，提交 pull request

找到問題並修復後，就決定繳交一份 PR 來協助完善 fixture.c 程式碼。
以下是我繳交的步驟：

建立一個新的 branch
- git fetch upstream：先 fetch 到上游最新的 commit 。
- git checkout upstream/master：切換到上游分支。
- git checkout -b < 要 PR 的 branch 名稱 >：根據剛剛最新的上游分支，來建立一個用來推送 PR 的分支。
將要繳交的更改新增到剛剛建立的分支上：
分成兩種情況
- 檔案已經有 push 過
  - git cherry-pick < 要提交的commit hash code >：透過 cherry-pick 把 commit 嫁接到要繳交 PR 的分之上。
- 檔案還沒有 push 過
  - 在要繳交 PR 的分支上修改檔案
  - git add < 修改的檔案名稱 >
  - git commit -a
git push origin < 要 PR 的 branch 名稱 >：將檔案推送上去。
接著去 Github 網站，手動建立 PR ，就完成了。

與老師進行 code review

因為是人生第一次提交 PR ，所以很興奮的同時，也還未想過一個好的 PR 繳交會有這麼多的細節要處理，導致一直被打槍。
事後我才發現是老師在協助我進行 code review ，並挑出我許多的沒注意到或是過於隨性處理的問題，整個過程受益匪淺。

問題紀錄：

過於依賴 AI 翻譯與修飾 commit message ，導致訊息出現不必要的總結，且沒有名確涵蓋開發者的發現問題的過程與想法。
沒有閱讀 CONTRIBUTING.md 導致將修復 memory leaks 的問題，與新增 warm-up 步驟的程式碼，隨意地放在一起，就繳交 PR 。
不應該直接提及檔案名稱，選擇使用模組化名稱，與說明做了什麼。否則不僅不能明確表達本次 commit 做出的修正，也會造成其他開發者在檢視過去開發紀錄上的困擾。
確認註解是否有存在的必要，並使用 c-style 的方式，或者符合該 repo 中要求的方式新增註解。

如何重複更改與調整 commit：

若是要調整程式碼，使用 git reset --soft HEAD^ ，退回上一次繳交 commit 前的狀態，接著修改檔案後重新繳交。
使用 git commit --amend 來修改已經繳交的 commit message
推送上去時，需要再指令尾端加上 --force 進行強制推送。

TBD
已經 PR warm-up code

TBD
整合網頁伺服器
測試半通過
不在相同網域，不能測試瀏覽器，但也有可能是我不夠熟悉電腦網路問題。
繳交時發現 #265 fmtscan 問題
嘗試修復
新增前端用語到字典中
繳交PR
select 跟 signal 基本上都還沒摸透

在 qtest 提供新的命令 shuffle

TBD
根據記憶 queue.h 不能改，不信邪，真的炸開。
直接寫在 qtest.c 裡
完成
理解與分析
ssh 改成 png
結果

Expectation:  41666
Observation:  {'1234': 41642, '1243': 41730, '1324': 41901, '1342': 41513, '1423': 41646, '1432': 41995, '2134': 41847, '2143': 41502, '2314': 41600, '2341': 41277, '2413': 41700, '2431': 41592, '3124': 41755, '3142': 41851, '3214': 41435, '3241': 41975, '3412': 41848, '3421': 41288, '4123': 41551, '4132': 41592, '4213': 42041, '4231': 41337, '4312': 41800, '4321': 41582}
chi square sum:  25.764124225987615

研讀 Linux 核心原始程式碼的 `lib/list_sort.c`

我原本的 q_sort() 是使用 top-down 的方式，研讀後應該改成 bottom-up 來節省記憶體上的使用。
目前找到關鍵點，不能用遞迴。
所以上面的迭代版，要嘗試做，或許可以先研讀完，再做，會更好。
或許也能解釋以下，為何總是跳警告。
做作業二的時候找到2024第一週測驗題與此題超級相關，有使用timsort
題目 7 / 參考題解

+++ TESTING trace trace-14-perf:
# Test performance of 'q_new', 'q_insert_head', 'q_insert_tail', 'q_reverse', and 'q_sort'
Warning: Skip checking the stability of the sort because the number of elements 2000000 is too large, exceeds the limit 100000.
---	trace-14-perf	6/6

TBD

2025q1 Homework1 (lab0)

CC: Urbaner3: Reply for Review on lumynous

開發環境

針對佇列操作 queue.c 的程式碼實作

實作前的準備

開始實作

1. q_new()

2. q_free()

3. q_insert_head() / q_insert_tail()

4. q_size

5. q_remove_head() / q_remove_tail()

6. q_delete_mid()

7. q_delete_dup

8. q_swap()

9. q_reverse()

10. q_reverseK()

q_sort()

12. q_merge()

13. q_ascend() / q_descend()