2025q1 Homework1 (lab0)

contributed by < Hlunlun >

Reviewed by `yy214123`

`q_free`

在 queue.h 中有實作：

static inline void q_release_element(element_t *e)
{
    test_free(e->value);
    test_free(e);
}

可以如下改善：

void q_free(struct list_head *head)
{
    element_t *entry = NULL, *safe;
    list_for_each_entry_safe(entry, safe, head, list) {
-       free(entry->value);
-       free(entry);
+       q_release_element(entry);
    }
    free(head);
    return;
}

作業書寫規範:

無論標題和內文中，中文和英文字元之間要有空白字元 (對排版和文字搜尋有利)
文字訊息 (尤其是程式執行結果) 請避免用圖片來表示，否則不好搜尋和分類
共筆書寫請考慮到日後協作，避免過多的個人色彩，用詞儘量中性
不要在筆記內加入 [TOC] : 筆記左上方已有 Table of Contents (TOC) 功能，不需要畫蛇添足
不要變更預設的 CSS 也不要加入任何佈景主題: 這是「開發紀錄」，用於評分和接受同儕的檢閱
當在筆記中貼入程式碼時，避免非必要的行號，也就是該手動將 c= 或 cpp= 變更為 c 或 cpp。行號只在後續討論明確需要行號時，才要出現，否則維持精簡的展現。可留意「你所不知道的 C 語言: linked list 和非連續記憶體」裡頭程式碼展現的方式
HackMD 不是讓你張貼完整程式碼的地方，GitHub 才是！因此你在開發紀錄只該列出關鍵程式碼 (善用 diff 標示)，可附上對應 GitHub commit 的超連結，列出程式碼是為了「檢討」和「便於他人參與討論」
留意科技詞彙的使用，請參見「資訊科技詞彙翻譯」及「詞彙對照表」
不要濫用 :::info, :::success, :::warning 等標示，儘量用清晰的文字書寫。:::danger 則僅限授課教師作為批注使用
避免過多的中英文混用，已有明確翻譯詞彙者，例如「鏈結串列」(linked list) 和「佇列」(queue)，就使用該中文詞彙，英文則留給變數名稱、人名，或者缺乏通用翻譯詞彙的場景
在中文敘述中，使用全形標點符號，例如該用「，」，而非 ","。注意書名號的使用，即 〈 和 〉，非「小於」和「大於」符號
避免使用不必要的 emoji 字元

開發環境

lun@lun-OMEN:~/linux2025/hw1/lab0-c$ gcc --version
gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

開發步驟與工具

前置作業

安裝雙系統：安裝雙系統在有兩個 SSD 的 hp 筆電上

安裝套件

sudo apt install build-essential git-core valgrind
sudo apt install cppcheck clang-format aspell colordiff

取得程式碼與測試

git clone https://github.com/sysprog21/lab0-c

每次開發完一個函數就 commit ，紀錄開發過程

clang-format -i <file>
git add <file>
git commit -a
git push

實做佇列

`q_new`

C99 [7.20.3.3] The malloc function

The malloc function returns either a null pointer or a pointer to the allocated space

malloc 如果沒有成功配置到記憶體是有可能回傳 NULL 的，所以要檢查 malloc 要求的記憶體是否有成功
問題：但後來在 make test 後發現第 13 個 trace 無法通過

猜測這個函式可能是一定要配置到記憶體位置才是正確的，因此使用迴圈檢查記憶體配值直到成功為止，經過 Commit 4417ea0 竟然就通過測資了，但是這樣可能會造成記憶體一直不夠一直再回圈中檢查，造成整個系統異常，要在想原因是什麼
2025/03/09 不要平感覺寫程式

     struct list_head *q = malloc(sizeof(struct list_head));
-    if (q == NULL)
-        return NULL;
+    while (q == NULL)
+        q = malloc(sizeof(struct list_head));
     INIT_LIST_HEAD(q);
     return q;

`q_free`

先將 list 移除，在釋放整個節點

`q_insert_head` 、`q_insert_tail`

重點：需要更注意字串的記憶體配置
問題：原本我直接把 element->value 指向 s，如果有在其他地方有人對 s 做任何變更或是釋放記憶體，element->value 就會也跟著改變，所以應該是要用 strdup 配置新的空間給 element->value
在你不知道的C語言：指標篇中也有說到字串的生命週期有可能快結束，所以才使用 strdup

-    element->value = s;
+    element->value = strdup(s);
    if (!element->value) {
        free(element);
        return false;

那為何要判斷 element->value 是否為 NULL 呢？
可以運用 linux 的說明文件 man page，在 terminal 打上 man strdup，這時候就會看到因為在使用 strdup 時會用到 malloc，而在規格書 C99 [7.20.3.3] 中有提到 malloc 有可能因為記憶體不夠用而造成記憶體配置失敗從而回傳 NULL，所以才要檢查 element->value

The strdup() function returns a pointer to a new
string which is a duplicate of the string s. Memory
for the new string is obtained with malloc(3), and
can be freed with free(3).

`q_remove_head` 、`q_remove_tail`

重點：只要 unlinlk 就好，不用釋放記憶體，找到第一個 list_head 節點，然後 unlink
問題：copying of string in remove_head overflowed destination buffer.
原本我是直接用 bufsize 有多大就複製多大的 entry->value 到 sp ，但如果 entry->value 的長度比 bufsize 小，會有 buffer overflow detected 的錯誤
這裡就會好奇所以 strncpy 是否會自動補上 \0 ，從 man page 看到 strncpy 的實做原理，可以看到這個函式會用到 strpncpy 並且在其中會檢查 dsize 和 src 的長度，但是並沒有和 dst 比較，而這裡的 q_remove_* 會有 bufsize 的限制，所以要而外檢查 element->value 的大小，經過 commit 27965a2 即可解決這個問題
```
stpncpy(char *restrict dst, const char *restrict src, size_t dsize)
{
   size_t  dlen;
   dlen = strnlen(src, dsize);
   return memset(mempcpy(dst, src, dlen), 0, dsize - dlen);
}
```
關於可以複製多大的字串到 sp ，在 man strncpy 中，看到了 strncpy 的實做使用到 strnlen(src, dsize) ，這個函式會回傳 min(strlen(src), dsize) ，可以讓程式碼更精簡，因此做了 commit d03a3d8 的修改

     if (sp) {
-        size_t size = strlen(entry->value) < (bufsize - 1)
-                          ? strlen(entry->value)
-                          : (bufsize - 1);
-        strncpy(sp, entry->value, size);
-        sp[size] = '\0';
+        size_t dlen = strnlen(entry->value, bufsize - 1);
+        mempcpy(sp, entry->value, dlen);
+        *(sp + dlen) = 0;
     }

`q_delete_mid`

運用指標的指標技巧，找到指向 middle node 的指標，然後將它移除，只是這裡的迴圈判斷條件是 fast 和 fast->next 還沒到迭代到 head 之前，如 commit a351491

`q_delete_dup`

使用 list_for_each_entry_safe 才可以在迭代過程中移除節點，因為只要出現多於 1 次就要全部移除，所以需要一個標誌來確認目前此次刪除是否進行完，如 commit 49224d8

list_for_each_entry_safe(node, safe, head, list) {
    if (&safe->list != head && !strcmp(node->value, safe->value)) {
        //delete node
        dup = true;
    } else if (dup) {
        //delete node
        dup = false;
    }
}

`q_swap`

先看 list_for_each_safe，所以使用這個巨集時，要 swap 的就是 node 和 safe

#define list_for_each_safe(node, safe, head)                     \
    for (node = (head)->next, safe = node->next; node != (head); \
         node = safe, safe = node->next)

因為是每兩個節點交換位置，所以每次迭代 safe 和 node 需要移動兩個位置，且要檢查 safe 和 node 都不是 head，但 node 會在 list_for_each_safe 中被檢查是否是 head，所以只要檢查 safe 就行，如 commit 6a80e4b

list_for_each_safe(node, safe, head) {
    if (safe == head)
        return;

    struct list_head *next = safe->next;
    struct list_head *prev = node->prev;
    
    //... node 和 safe 兩個節點交換位置
    
    safe = node->next;
}

`q_reverse`

重點：利用雙向鍊結串列的特性，改變每個節點的 prev 和 next，如 commit 7203bbd

`q_reverseK`

為了可以直接使用到 q_reverse，可以透過 list_cut_position 將雙向鍊結串列中的片段切割出來使用 q_reverse 去倒轉，將到轉完的雙向鍊結串列先存放在暫時的 new_head 中，繼續迭代 head 找出 k-group 的片段，最後在將 new_head 用 list_splice 全部拼接到 head，如 commit b4dc453

`q_sort`

透過研讀 Linux 核心原始程式碼的 lib/list_sort.c 了解排序運作原理後，實做在此函數，如 commit 4ebd88e

C99 [7.21.4.2] The strcmp function
應該使用 strcmp 函數來比較兩個字串，因為此函數是比較指標指向的記憶體位置中存放的值

The strcmp function compares the string pointed to by s1 to the string pointed to by s2.

C99 [6.5.8] Relational operators
如果直接使用關係運算子 (Relational operator) 對兩個指標( list_entry(a, element_t, list) 的屬性 value 變數類型是指標) 進行比較，則是比較指標指向的記憶體位置，而非記憶體中存放的數值

When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object or incomplete types both point to the same object, or both point one past the last element of the same array object, they compare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript

 bool cmp(const struct list_head *a, const struct list_head *b, bool descend)
 {
+    element_t *e1 = list_entry(a, element_t, list);
+    element_t *e2 = list_entry(b, element_t, list);
+
     if (descend)
-        return list_entry(a, element_t, list)->value >=
-               list_entry(b, element_t, list)->value;
+        return strcmp(e1->value, e2->value) >= 0;
 
-    return list_entry(a, element_t, list)->value <=
-           list_entry(b, element_t, list)->value;
+    return strcmp(e1->value, e2->value) <= 0;

`q_ascend`、`q_descend`

兩個函數都只要將最右邊當作基準點去由右而左迭代，只要遇到 left 比 right 指向的節點還要小( q_descend ) 或大( q_ascend )就刪除，並且一直往左走直到 left 走到 head ，如 commit ed4547b

`q_merge`

此函數主要功能就是將在不同 q_context_t 中的雙向鍊結串列 q 全部合併到第一個 q_context_t 中的 q ，不同的 q_context_t 則是藉由 chain 來連接(下圖藍色線條)，且在合併之前的每個 q 都保證已經排序過，因為是字串，所以 "10" 小於 "23"

合併後應該要是長這樣，除了第一個 q_context_t 中的 q，其餘 q_context_t 中的 q 都會變空的雙向鍊結串列

這裡我直接把其他 queue_context_t 中的雙向鍊結串列拼接到第一個 queue_context_t 中，在用 q_sort 進行排列即完成，commit

至此，即佇列各種函數的實做，成功就會出現星之卡比的圖片

研讀 Linux 核心原始程式碼的 lib/list_sort.c

參考：坤諦江、The Linux Kernel、Risheng

`attribute`

參考 GNU 6.35.1 Common Function Attributes

其中這段有提到 nonnull function attribute 可能會用在至少有一個指標參數的函數，告訴編譯器哪幾個指標參數必為非空指標

The nonnull attribute may be applied to a function that takes at least one argument of a pointer type. It indicates that the referenced arguments must be non-null pointers. For instance, the declaration…informs the compiler that, in calls to my_memcpy, arguments dest and src must be non-null.

lib/list_sort.c 使用的函數屬性使其第 2、3、4 個參數 cmp 、a 、 b 必須為非空的指標，所以從程式碼可以間接知道 cmp 是指標，因為 __attribute__((nonnull)) 只用於限制指標參數(可以透過閱讀 nonnull function attribute 範例得出結論 )

__attribute__((nonnull(2,3,4)))
static struct list_head *merge(void *priv, list_cmp_func_t cmp,
                struct list_head *a, struct list_head *b)

仔細閱讀函數屬性在函數被觸發以及在編譯階段定義時會有不同的效果，在函數被觸發時如果在標記為 nonnull 的位置傳入 null 會觸發緊告，所以其實程式還是可以通過編譯，但是會有警告

If the compiler determines that a null pointer is passed in an argument slot marked as non-null, and the -Wnonnull option is enabled, a warning is issued.

在課堂上老師也不斷提及 linux kernel 是牽一髮而動全身的，所以開發者必須更謹慎小心，而 __attribute__ 就是一種警告機制，把控傳參數這樣的細節

`list_cmp_func_t cmp`

解釋函數屬性時提及 cmp 是一個指標，其型態定義於 linux/list_sort.h ，將整數定義為一個指標函數 (pointer to function) ，或者可以解釋為這是一個會回傳整數的指標函數

typedef int __attribute__((nonnull(2,3))) (*list_cmp_func_t)(void *,
        const struct list_head *, const struct list_head *);

為何不直接使用它原本的整數型態，還特別定義為指標函數呢？

透過程式中的註解解釋 cmp，此函數用於比較當前節點的數值，

 The comparison function @cmp must return > 0 if @a should sort after
 @b ("@a > @b" if you want an ascending sort), and <= 0 if @a should
 sort before @b *or* their original order should be preserved.  It is
 always called with the element that came first in the input in @a,
 and list_sort is a stable sort, so it is not necessary to distinguish
 the @a < @b and @a == @b cases.

`merge`

運用到指標的指標技巧，進行兩個雙向鍊結串列的合併，並沒有去修改 prev 指標指向的節點，舉例以下 a 和 b 可能的樣貌，且最後一個節點都會是 NULL ，原因可以看到 merge_final 的解釋內容

經過 merge 函數後，會呈現以下，也就是說只有將 next 指標指向對的節點，prev 指標依舊，也就是說按照 next 指標走，排列的順序是正確的

`merge_final`

首先看到函數定義可知，參數中 cmp 、head 、a 、b 必不為空指標

__attribute__((nonnull(2,3,4,5)))
static void merge_final(void *priv, list_cmp_func_t cmp, struct list_head *head,
            struct list_head *a, struct list_head *b)

head 在 list_sort 中作為呼叫 merge_final 的引數，可以知道 head 是一個雙向鍊結串列的第一個節點，而在 merge_final 中 a 和 b 會有序的排列在 head 之後。

此處也可解釋為何在 merge 在過程中 a 和 b 明明是雙向鍊結串列的型態 list_head 結構中卻有 NULL ，因為以下程式碼第 11 行將 head->prev 這個節點的 next 指標指向 NULL ，也就是原本指向 head 的 head->prev->next 指標，現在指向 NULL ( 這跟直接將 head 設置為 NULL 還是差別的！所以 head 依然存在並未變成 NULL )，所以才可以用跟合併單向鍊結串列一樣的判斷條件是迭代直至 NULL

















__attribute__((nonnull(2,3)))
void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp)
{
    struct list_head *list = head->next, *pending = NULL;
    size_t count = 0;	/* Count of pending */

    if (list == head->prev)	/* Zero or one elements */
        return;

    /* Convert to a null-terminated singly-linked list. */
    head->prev->next = NULL;

        ...

    /* The final merge, rebuilding prev links */
    merge_final(priv, cmp, head, pending, list);
}

看完第 55 到到第 74 行的程式碼可以知道 merge_final 和 merge 作法大致相同，也就是會比較 a 和 b 這兩個雙向鍊結串列並且做合併，但是 merge_final 有調整每個節點的 prev 指標

for (;;) {
    /* if equal, take 'a' -- important for sort stability */
    if (cmp(priv, a, b) <= 0) {
        tail->next = a;
        a->prev = tail;
        tail = a;
        a = a->next;
        if (!a)
            break;
    } else {
        tail->next = b;
        b->prev = tail;
        tail = b;
        b = b->next;
        if (!b) {
            b = a;
            break;
        }
    }
}

若是兩個鍊結串列不是平衡的，代表只有將其中一個鍊結串列走完並排列到 head 後面，需要繼續處理另一個鍊結串列，以下程式碼即是處理此情況

/* Finish linking remainder of list b on to tail */
tail->next = b;
do {
    /*
     * If the merge is highly unbalanced (e.g. the input is
     * already sorted), this loop may run many iterations.
     * Continue callbacks to the client even though no
     * element comparison is needed, so the client's cmp()
     * routine can invoke cond_resched() periodically.
     */
    if (unlikely(!++count))
        cmp(priv, b, b);
    b->prev = tail;
    tail = b;
    b = b->next;
} while (b);

/* And the final links to make a circular doubly-linked list */
tail->next = head;
head->prev = tail;

`unlikely` 、 `likely`

參考 likely() and unlikely()

定義於 linux/compiler.h ，使用到 GCC 內建函式 __builtin_expect (long exp, long c) 如下

# define likely(x)	__builtin_expect(!!(x), 1)
# define unlikely(x)	__builtin_expect(!!(x), 0)

關於 __builtin_expect(long exp, long c) 的描述如下，此內建函數會提供編譯器分支預測的資訊，此處分支 branch 指程式中的能夠改變執行指令順序的如 if-else 和 switch 的條件表達式，這種功能可以避免指令跳轉從而優化程式執行過程

You may use __builtin_expect to provide the compiler with branch prediction information…

其運作方式如字面上 expect 的含意，編譯器會預期 exp 的數值結果是 c，回傳值會是 exp 的運算結果

The return value is the value of exp, which should be an integral expression. The semantics of the built-in are that it is expected that exp == c.

嘗試以下程式碼印出 __builtin_expect 的值，就是 exp 本身，此函數並不會影響 exp 本身，

#include<stdio.h>
int main(){
	int a = 10;
	printf("%ld\n", __builtin_expect(a, 1)); // 10
	printf("%ld\n", __builtin_expect(a==10, 0)); // 1
}

所以，__builtin_expect(!!(x), 1) 代表預期布林值 !!(x) 是對的，即希望執行 likely 這個分支；反之 __builtin_expect(!!(x), 0) 則代表預期布林值 !!(x) 是錯的，即不希望執行 unlikely 這個分支。透過預期 exp 發生的機率高低讓編譯器可以提早預測是否按照順序執行程式碼或是需要執行會降低效能的跳轉指令。

`list_sort`

此處一樣有限制第 2 和 3 個參數須不為空的指標

__attribute__((nonnull(2,3)))
void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp)

其參數定義如下，head 即 list_sort 要進行排序的雙向鍊結串列， cmp 則為的用於比較的函數

@priv: private data, opaque to list_sort(), passed to @cmp
@head: the list to sort
@cmp: the elements comparison function

參考 RinHizakura 中解釋 [RESEND PATCH v2 5/5] lib/list_sort: Optimize number of calls to comparison function

開頭解釋由於 CONFIG_RETPOLINE 降低了呼叫間接函數的表現，所以應該要減少使用間接函數如用來比較的 cmp() 函數，這也就是為什麼 list_sort 的合併排列中兩個序列長度需要以 2:1 的比例進行，因為此方式可以減少比較的次數，也就減少了調用 cmp() 而降低了系統的性能情況發生。

CONFIG_RETPOLINE has severely degraded indirect function call performance, so it's worth putting some effort into reducing the number of times cmp() is called.

這段程式碼改變了以前是用 Bottom-up mergesort 的方式(以下為前版本描述)，可以參考 patch 中提供的文獻：Bottom-up mergesort – A detailed analysis，比較特別的是這篇論文是一個經濟系和數學系的專家寫的，他們推導出可以表達 bottom-up mergesort 過程中做比較次數的表示式，並與 top-down 進行分析，並且得出結論雖然 top-down mergesort 的複雜度可能更低，但是它伴隨著需要先知道排列個數的限制，我想這也是為什麼以前是用 bottom-up 現在優化也不選擇 top-down 的原因

- * This function implements a bottom-up merge sort, which has O(nlog(n))
- * complexity.  We use depth-first order to take advantage of cacheing.
- * (E.g. when we get to the fourth element, we immediately merge the
- * first two 2-element lists.)

且合併排列的比例盡可能要是 2:1

兩個鍊結串列都是
$2^{k}$ ，所以合併後就是
$2^{k + 1}$
後續至少還有
$2^{k}$ 個元素
所以，合併過程中要同時處理的元素總數就是

$2^{k + 1} + 2^{k} = 2 \cdot 2^{k} + 2^{k} = 3 \cdot 2^{k}$

+ * This mergesort is as eager as possible while always performing at least
+ * 2:1 balanced merges.  Given two pending sublists of size 2^k, they are
+ * merged to a size-2^(k+1) list as soon as we have 2^k following elements.

cache thrashing 快取抖動：重複存取不在快取中的資料，導致快取不斷被替換，造成性能下降。

這邊有解釋到，因為這樣可以防止 cache thrashing，由於 CPU 的 cache 快取存取速度由快到慢分別為 L1, L2, L3，所以如果 CPU 要存取資料會最先到 L1，這

3 \cdot 2^{k}

個資料剛好可以都在 L1 就不會造成 cache thrashing

Image Not Showing Possible Reasons
The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported
Learn More →

可以透過 lscpu 知道自己電腦的 L1 大小

+ * Thus, it will avoid cache thrashing as long as 3*2^k elements can
+ * fit into the cache.  Not quite as good as a fully-eager bottom-up
+ * mergesort, but it does use 0.2*n fewer comparisons, so is faster in
+ * the common case that everything fits into L1.

我們需要計算兩個鍊結串列的長度並且檢查是否為 2 的次方，而用來計算的 count 就是控制是否合併的因素，當兩個鍊結串列長度為

2^{k}

就合併，也就是計算 pending 的長度，以下所說的："在 k 個位置設為1後k 以前的位置就都變成0" 就是指 count 變成

2^{k}

的時候二進位的呈現

+ * The merging is controlled by "count", the number of elements in the
+ * pending lists.  This is beautiully simple code, but rather subtle.
+ *
+ * Each time we increment "count", we set one bit (bit k) and clear
+ * bits k-1 .. 0.  Each time this happens (except the very first time
+ * for each bit, when count increments to 2^k), we merge two lists of
+ * size 2^k into one list of size 2^(k+1).

以下是計算方式的說明，說明有幾個

2^{k}

長度的鍊結串列並且何時合併

這邊用實際數字來模擬這個過程，為了方便表示這裡先用單向鍊結串列：

初始狀態
$\to$ count = 0: 000

插入第 1 個元素

\to

count = 1: 001
有一個長度為 1 的鍊結串列

插入第 2 個元素

\to

count = 2: 010

有兩個長度為 1 的鍊結串列

插入第 3 個元素
$\to$ count = 3: 011

合併這兩個長度為 1 的鍊結串列為一個長度為 2 的鍊結串列、有一個長度為 1 的鍊結串列

插入第 4 個元素

\to

count = 4: 100

兩個長度為 1 的鍊結串列、一個長度為 2 的鍊結串列

兩個長度為 1 的鍊結串列合併成一個長度為 2 的鍊結串列、一個長度為 2 的鍊結串列

合併兩個長度為 2 的鍊結串列為一個長度為 4 的鍊結串列

重複 2. ～ 5. 流程

+ * The number of pending lists of size 2^k is determined by the
+ * state of bit k of "count" plus two extra pieces of information:
+ * - The state of bit k-1 (when k == 0, consider bit -1 always set), and
+ * - Whether the higher-order bits are zero or non-zero (i.e.
+ *   is count >= 2^(k+1)).
+ * There are six states we distinguish.  "x" represents some arbitrary
+ * bits, and "y" represents some arbitrary non-zero bits:
+ * 0:  00x: 0 pending of size 2^k;           x pending of sizes < 2^k
+ * 1:  01x: 0 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k
+ * 2: x10x: 0 pending of size 2^k; 2^k     + x pending of sizes < 2^k
+ * 3: x11x: 1 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k
+ * 4: y00x: 1 pending of size 2^k; 2^k     + x pending of sizes < 2^k
+ * 5: y01x: 2 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k
+ * (merge and loop back to state 2)

而這個過程合併用的函數就是定義在上面的 merge，完成所有節點的排序後，最後在使用 merge_final 調整雙向鍊結串列的 prev 指標，至此，就是 list_sort 的註解解釋

list_sort 排序過程

先將雙向鍊結串列頭尾斷開，即連接到 head 的 prev 指標指向的節點的 next 指標指向 NULL ，所以 head 的 prev 並未斷開
創建待排序的指標 pending ，初始化這個指標指向 NULL

struct list_head *list = head->next, *pending = NULL;
size_t count = 0;	/* Count of pending */

if (list == head->prev)	/* Zero or one elements */
    return;

/* Convert to a null-terminated singly-linked list. */
head->prev->next = NULL;

以下就是作者那一大串註解所說明的排序過程，

do {
    ..
} while (list);

tail 在這裡的作用是在找待排序的鍊結串列 pending 上的最後一個節點，pending 一開始是空的，所以 tail 指向 pending

size_t bits;
struct list_head **tail = &pending;

count 初始化是 0 ，所以一開始不會執行這個迴圈

for (bits = count; bits & 1; bits >>= 1)
    tail = &(*tail)->prev;

此處使用到的 likely，即有很大可能 bits 是大於 0 的數字，但是如果有經過上面的迴圈 bits 還大於 0 的可能有以下兩種
- bits 本身就是
  $2^{k}$
  2 (0010) 、4 (0100) 、8 (1000) … 等數都不會經過上面迴圈
- bits 是
  $(2^{k}, 2^{k + 1} - 1)$ 區間的整數
  5 (0101) 、6 (0110) 、9 (1001) 、10 (1010) 、11 (1011) 、12 (1100) 、13 (1101) 、14 (1110) … 等數中，偶數不會經過上面迴圈，奇數則是經過上面迴圈後依然大於 0
```
if (likely(bits)) {
    struct list_head *a = *tail, *b = a->prev;

    a = merge(priv, cmp, b, a);
    /* Install the merged result in place of the inputs */
    a->prev = b->prev;
    *tail = a;
}
```

count = 0

\to

count = 1
將雙向鍊結串列上的節點移至 pending 上

/* Move one element from input list to pending */
list->prev = pending;
pending = list;
list = list->next;
pending->next = NULL;
count++;

第一次執行這段程式碼如以下，此時 pending 上就會有一個節點 12 ，且其 prev 和 next 都會被斷開，如同從此雙向鍊結上移至另一個等待的鍊結串列上，此時也因應 pending 上增加一個節點，count = 1

count = 1

\to

count = 2：不 merge

count = 2

\to

count = 3 : 合併

進入 if likely(bits) 的區塊準備進行合併，這裡使用的是只會調整每個節點的 next 指標的 merge 函數

此時節點 2 的 next 指標就會指向節點 12 (此處假設是升冪排序)，且節點 2 的 prev 也會被清空指向 NULL

最後將下一個的節點增加至 pending

count = 3

\to

count = 4 : 不合併
找目前在 pending 上的最後一個節點，並將 tail 指向該指標，因為 next 指標是有順序的指向下一個節點，如果要找到最後一個節點就是要用 prev 去找 tail

將下一個節點新增至 pending

count = 4

\to

count = 5: 合併
將 pending 上的節點進行合併

經過合併後，節點 5 的 next 指標會指向節點 13

將下一個節點移至 pending

至此，由於 list 指向 NULL ，所以跳出迴圈

由上圖可知，雙向鍊結串列尚未排序完成，只有每格兩個節點用 next 去看順序是正確的，prev 目前也尚未完成，以下程式碼就是

/* End of input; merge together all the pending lists. */
list = pending;
pending = pending->prev;
for (;;) {
    struct list_head *next = pending->prev;

    if (!next)
        break;
    list = merge(priv, cmp, pending, list);
    pending = next;
}

將在 pending 的節點進行排序，

進入 for 迴圈進行排序
先將下一個節點存放至 next

經過 merge，完成所有節點的 next 指標排序： 5

\to

7

\to

13

先將下一個節點存放至 next

經過 merge，完成所有節點的 next 指標排序： 2

\to

5

\to

7

\to

12

\to

13

最後使用 merge_final 函數，將 prev 指標更新完成

merge_final(priv, cmp, head, pending, list);

實作 shuffle 命令

Fisher–Yates shuffle

此演算法是一個時間複雜度為 O(n) 和空間複雜度為 O(1) 的洗牌演算法，步驟如下：

假設有一個　n 長度的陣列　array

迭代 0 到　n 之間，迭代過程的變數　i
第　i 次迭代中
- 在
  $[0, i]$ 之間隨機找一個索引 j
- 將　array[i] 和　array[j] 交換位置

for i from 0 to n − 1 do
    j ← random integer such that 0 ≤ j ≤ i
    a[i] ← source[i]
    a[i] ← a[j]
    a[j] ← source[i]

`q_shuffle`

實做 Fisher-Yates Shuffle 在雙向鍊結串列中

閱讀論文〈Dude, is my code constant time?〉

Abstract

開發了一個可以檢測差不多 300 行的 C 語言程式的執行時間是否為常數時間的工具: dudect

Introduction

動機
為了抵抗 Timing attack 的相關資安攻擊，雖然聽起來是很瑣碎的事情，但是如果發生 timing leaks 就會造成很嚴重的結果

Assessing whether an implementation runs in constant time is not a trivial task. Even implementations that were supposed to be constant-time turned out not to be so [GBY16 ], [ Cry16]— reinforcing the argument that timing leaks may not be easy to detect, but can have serious consequences.
貢獻
即使以前也有可以評估記憶體資源或時間的工具，但是這些工具都需要去模擬硬體，這其實是滿困難的，而這篇論文的作者開發的 dudect 並不需要，只要跑在他們的平台上即可評估時間複雜度，而且不是靜態的分析?是用統計分析程式執行時間?

Our tool does not rely on static analysis but on statistical analysis of execution timing measurements on the target platform. In this way, we circumvent the problem of modeling the underlying hardware.

Approach

第一步: 測執行時間
- fix-vs-random 測試
  測兩組不同資料類別的執行時間來做洩漏檢測，一組是固定輸入，然後另一組是隨機輸入，這樣就可以抓到比較多潛在的時序洩漏
- 週期計數器
  用硬體的計數器去測執行時間
- 可以減少環境的影響
  因為是隨機的輸入類別，而且還是在測量前就選好資料類別的輸入，進入測量後就不會受到環境影響
第二步: 後處理
- 剪裁
  把無關類別和不是百分位數的測量值丟掉，只處理常態分布的時序，減少異常質對測量的影響
- 高階處理
  可用內積模仿 DPA 攻擊
第三步：統計測試
- 排除 null hyprothesis:
- t-test: 因為 cropping 是非線性的轉換，可以使用 Welch's t-test 來測試 first-order timing information leakage
- non-parametric test: fewer assumptions

Result

將由 300 行 C 程式碼開發的工具 deduct 開放在 github 上
採用 Welch's t 檢驗，結合 Welford 方差計算方法可以提高數值穩定性
dudect 在 MacBook、Intel Xeon 和 ARM7TDMI 上都有效，具有其通用性

開發過程遇到的問題

commit message 遇到的問題: 說我用到非英文單字，後來我檢查，因為我在 body 中寫到 "Use INIT_LIST_HEAD" macro 的名字導致無法儲存，改掉就好，

不通過的詞：deallocation

Read https://github.com/sysprog21/lab0-c/blob/master/CONTRIBUTING.md#git-commit-style carefully.
Proceed with commit? [e/n/?] e
Implement q_new with circular list structure                               [line 1]
 - Avoid using non-American English words

Failed to retrieve upstream commit hash from GitHub.
解決：重開機或是等一陣子在試，3/9直播有說道不要急著做 make test ，先思考在執行

lun@lun-OMEN:~/linux2025/hw1/lab0-c$ make test
Progress: [####################--------------------] 50%
[!] Failed to retrieve upstream commit hash from GitHub.

make: *** [Makefile:60: test] Error 1

閱讀資料

你所不知道的C語言：指標篇

釐清 pointer
引用 C語言: 超好懂的指標，初學者請進～中的描述可以很好理解

指標 (Pointer) 就是某變數的位址。而這邊的指標變數 (Pointer Variable)，則是用來存放指標的變數。
array subscripting 在編譯時期只確認以下兩件事:
- 得知 size
- 取得陣列第 0 個 (即最初的一項) 元素的指標
前兩者以外的操作，都透過 pointer
array subscripting => syntax sugar
C has no real "array"!
關於為什麼　&b[0] 是　pointer to struct?　& 不是　address of 用來取位置的嗎？那為什麼不是一個數值？
因為我們前面有了解過指標變數中存的值就是記憶體位置，在這裡我的理解是：把　b[0] 看成一個變數，&b[0] 這個數值就是記憶體位置，那什麼變數會存放記憶體位置呢？就是指標變數了，所以才會是　pointer to struct
```
(gdb) whatis &b[0]
type = struct {...} *
```

遇到 gdb 無法存取記憶體

(gdb) p &b
$1 = (struct {...} (*)[17]) 0x4620 <b>
(gdb) x/4 b
0x4620 <b>:	0	0	0	0
(gdb) p (&b[0])->v = {1, 2, 3}
Cannot access memory at address 0x4620

後來找到原因，因為我沒有把程式跑起來，但是跑起來後，因為 main() 中沒有程式碼，整個程式會直接結束，還來不急跟變數做互動它就結束的概念，所以要在 main() 這個地方做一個 break point，讓它停在那裡我們就可以對 b[0] 做修改記憶體了

(gdb) break main
Breakpoint 1 at 0x555555555136: file s.c, line 5.
(gdb) run
Starting program: /home/lun/linux2025/hw1/s 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, main () at s.c:5
5	int main(){}
(gdb) p (&b[0])->v = {1, 2, 3}
$1 = {1, 2, 3}

關於你所不知道的C語言：指標篇的這個部份，我和影片中實做的結果不一樣

(gdb) p &b
$31 = (struct {...} (*)[17]) 0x555555558060 <b>
(gdb) p &b+1
$32 = (struct {...} (*)[17]) 0x555555558280 <calendar>
(gdb) p &a[0]
$33 = (int *) 0x555555558040 <a>
(gdb) p &a[3]
$34 = (int *) 0x55555555804c

其中 &b+1 不是 a[0] 或是 a[3] 的位置，而是 calendar 的位置

(gdb) p &b[16]     
$42 = (struct {...} *) 0x555555558260 <b+512>
(gdb) p &calendar
$43 = (int (*)[12][31]) 0x555555558280 <calendar>

在查看 calendar 位置的實做過程中，發現 memcpy 不能直接使用，要先強至轉型才能看到值

$3 = (double *) 0x555555558280 <calendar>
(gdb) p memcpy(calendar, b, sizeof b[0])         
'memcpy' has unknown return type; cast the call to its declared return type
(gdb) p (double*)memcpy(calendar, b, sizeof b[0])
$3 = (double *) 0x555555558280 <calendar>

你所不知道的 C 語言: linked list 和非連續記憶體

pointer to pointer 的實做理解
用 leetcode 82. Remove Duplicates from Sorted List II 來實做，並用 graphviz 畫每個步驟

每一個節點都是一個 ListNode 物件

最一開始 indirect 指向 head，head 指向第一個節點的位置，cur 指向 indirect 指向的 head 指向的位置（指標的指標）

如果 cur 的值與下一個節點值不相同

改變 indirect 中儲存的記憶體位置，改到指向位置的指標的 next(也是一個指標)的位置

所以，indirect 不是直接指向 2 !! 而是指向指向 2 的 1->next 指標

如果 cur的值與下一個節點值相同

迭代 cur 一直到不相同

然後把 indirect 指到的 next 指標，指向 cur 指向的節點 4

重複以上步驟，直到 indirect 指向的指標指向的物件為 NULL

*-safe 就是不管任何條件 *，這個函式都可以執行

2025q1 Homework1 (lab0)

Reviewed by yy214123

q_free

開發環境

開發步驟與工具

實做佇列

q_new

q_free

q_insert_head 、q_insert_tail

q_remove_head 、q_remove_tail

q_delete_mid

q_delete_dup

q_swap

q_reverse

q_reverseK

q_sort

q_ascend、q_descend

q_merge

研讀 Linux 核心原始程式碼的 lib/list_sort.c

__attribute__

list_cmp_func_t cmp

merge

merge_final

unlikely 、 likely

list_sort

list_sort 排序過程

實作 shuffle 命令

Fisher–Yates shuffle

q_shuffle

閱讀論文〈Dude, is my code constant time?〉

Abstract

Introduction

Approach

Result

開發過程遇到的問題

閱讀資料

你所不知道的C語言：指標篇

你所不知道的 C 語言: linked list 和非連續記憶體

Read more

Linux 期末自我評量

`.venv` 虛擬環境

2025q1 Homework5 (assessment)

2025q1 Homework3 (kxo)

Reviewed by `yy214123`

`q_free`

`q_new`

`q_free`

`q_insert_head` 、`q_insert_tail`

`q_remove_head` 、`q_remove_tail`

`q_delete_mid`

`q_delete_dup`

`q_swap`

`q_reverse`

`q_reverseK`

`q_sort`

`q_ascend`、`q_descend`

`q_merge`

`attribute`

`list_cmp_func_t cmp`

`merge`

`merge_final`

`unlikely` 、 `likely`

`list_sort`

`q_shuffle`