Lab0 lib/list_sort.c 研讀筆記

# Lab0 `lib/list_sort.c` 研讀筆記 ::: spoiler `list_sort.c` src ```c // SPDX-License-Identifier: GPL-2.0 #include <linux/compiler.h> #include <linux/export.h> #include <linux/list_sort.h> #include <linux/list.h> /* * Returns a list organized in an intermediate format suited * to chaining of merge() calls: null-terminated, no reserved or * sentinel head node, "prev" links not maintained. */ __attribute__((nonnull(2,3,4))) static struct list_head *merge(void *priv, list_cmp_func_t cmp, struct list_head *a, struct list_head *b) { struct list_head *head, **tail = &head; for (;;) { /* if equal, take 'a' -- important for sort stability */ if (cmp(priv, a, b) <= 0) { *tail = a; tail = &a->next; a = a->next; if (!a) { *tail = b; break; } } else { *tail = b; tail = &b->next; b = b->next; if (!b) { *tail = a; break; } } } return head; } /* * Combine final list merge with restoration of standard doubly-linked * list structure. This approach duplicates code from merge(), but * runs faster than the tidier alternatives of either a separate final * prev-link restoration pass, or maintaining the prev links * throughout. */ __attribute__((nonnull(2,3,4,5))) static void merge_final(void *priv, list_cmp_func_t cmp, struct list_head *head, struct list_head *a, struct list_head *b) { struct list_head *tail = head; u8 count = 0; for (;;) { /* if equal, take 'a' -- important for sort stability */ if (cmp(priv, a, b) <= 0) { tail->next = a; a->prev = tail; tail = a; a = a->next; if (!a) break; } else { tail->next = b; b->prev = tail; tail = b; b = b->next; if (!b) { b = a; break; } } } /* Finish linking remainder of list b on to tail */ tail->next = b; do { /* * If the merge is highly unbalanced (e.g. the input is * already sorted), this loop may run many iterations. * Continue callbacks to the client even though no * element comparison is needed, so the client's cmp() * routine can invoke cond_resched() periodically. */ if (unlikely(!++count)) cmp(priv, b, b); b->prev = tail; tail = b; b = b->next; } while (b); /* And the final links to make a circular doubly-linked list */ tail->next = head; head->prev = tail; } /** * list_sort - sort a list * @priv: private data, opaque to list_sort(), passed to @cmp * @head: the list to sort * @cmp: the elements comparison function * * The comparison function @cmp must return > 0 if @a should sort after * @b ("@a > @b" if you want an ascending sort), and <= 0 if @a should * sort before @b *or* their original order should be preserved. It is * always called with the element that came first in the input in @a, * and list_sort is a stable sort, so it is not necessary to distinguish * the @a < @b and @a == @b cases. * * The comparison function must adhere to specific mathematical properties * to ensure correct and stable sorting: * - Antisymmetry: cmp(@a, @b) must return the opposite sign of * cmp(@b, @a). * - Transitivity: if cmp(@a, @b) <= 0 and cmp(@b, @c) <= 0, then * cmp(@a, @c) <= 0. * * This is compatible with two styles of @cmp function: * - The traditional style which returns <0 / =0 / >0, or * - Returning a boolean 0/1. * The latter offers a chance to save a few cycles in the comparison * (which is used by e.g. plug_ctx_cmp() in block/blk-mq.c). * * A good way to write a multi-word comparison is:: * * if (a->high != b->high) * return a->high > b->high; * if (a->middle != b->middle) * return a->middle > b->middle; * return a->low > b->low; * * * This mergesort is as eager as possible while always performing at least * 2:1 balanced merges. Given two pending sublists of size 2^k, they are * merged to a size-2^(k+1) list as soon as we have 2^k following elements. * * Thus, it will avoid cache thrashing as long as 3*2^k elements can * fit into the cache. Not quite as good as a fully-eager bottom-up * mergesort, but it does use 0.2*n fewer comparisons, so is faster in * the common case that everything fits into L1. * * * The merging is controlled by "count", the number of elements in the * pending lists. This is beautifully simple code, but rather subtle. * * Each time we increment "count", we set one bit (bit k) and clear * bits k-1 .. 0. Each time this happens (except the very first time * for each bit, when count increments to 2^k), we merge two lists of * size 2^k into one list of size 2^(k+1). * * This merge happens exactly when the count reaches an odd multiple of * 2^k, which is when we have 2^k elements pending in smaller lists, * so it's safe to merge away two lists of size 2^k. * * After this happens twice, we have created two lists of size 2^(k+1), * which will be merged into a list of size 2^(k+2) before we create * a third list of size 2^(k+1), so there are never more than two pending. * * The number of pending lists of size 2^k is determined by the * state of bit k of "count" plus two extra pieces of information: * * - The state of bit k-1 (when k == 0, consider bit -1 always set), and * - Whether the higher-order bits are zero or non-zero (i.e. * is count >= 2^(k+1)). * * There are six states we distinguish. "x" represents some arbitrary * bits, and "y" represents some arbitrary non-zero bits: * 0: 00x: 0 pending of size 2^k; x pending of sizes < 2^k * 1: 01x: 0 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k * 2: x10x: 0 pending of size 2^k; 2^k + x pending of sizes < 2^k * 3: x11x: 1 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k * 4: y00x: 1 pending of size 2^k; 2^k + x pending of sizes < 2^k * 5: y01x: 2 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k * (merge and loop back to state 2) * * We gain lists of size 2^k in the 2->3 and 4->5 transitions (because * bit k-1 is set while the more significant bits are non-zero) and * merge them away in the 5->2 transition. Note in particular that just * before the 5->2 transition, all lower-order bits are 11 (state 3), * so there is one list of each smaller size. * * When we reach the end of the input, we merge all the pending * lists, from smallest to largest. If you work through cases 2 to * 5 above, you can see that the number of elements we merge with a list * of size 2^k varies from 2^(k-1) (cases 3 and 5 when x == 0) to * 2^(k+1) - 1 (second merge of case 5 when x == 2^(k-1) - 1). */ __attribute__((nonnull(2,3))) void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp) { struct list_head *list = head->next, *pending = NULL; size_t count = 0; /* Count of pending */ if (list == head->prev) /* Zero or one elements */ return; /* Convert to a null-terminated singly-linked list. */ head->prev->next = NULL; /* * Data structure invariants: * - All lists are singly linked and null-terminated; prev * pointers are not maintained. * - pending is a prev-linked "list of lists" of sorted * sublists awaiting further merging. * - Each of the sorted sublists is power-of-two in size. * - Sublists are sorted by size and age, smallest & newest at front. * - There are zero to two sublists of each size. * - A pair of pending sublists are merged as soon as the number * of following pending elements equals their size (i.e. * each time count reaches an odd multiple of that size). * That ensures each later final merge will be at worst 2:1. * - Each round consists of: * - Merging the two sublists selected by the highest bit * which flips when count is incremented, and * - Adding an element from the input as a size-1 sublist. */ do { size_t bits; struct list_head **tail = &pending; /* Find the least-significant clear bit in count */ for (bits = count; bits & 1; bits >>= 1) tail = &(*tail)->prev; /* Do the indicated merge */ if (likely(bits)) { struct list_head *a = *tail, *b = a->prev; a = merge(priv, cmp, b, a); /* Install the merged result in place of the inputs */ a->prev = b->prev; *tail = a; } /* Move one element from input list to pending */ list->prev = pending; pending = list; list = list->next; pending->next = NULL; count++; } while (list); /* End of input; merge together all the pending lists. */ list = pending; pending = pending->prev; for (;;) { struct list_head *next = pending->prev; if (!next) break; list = merge(priv, cmp, pending, list); pending = next; } /* The final merge, rebuilding prev links */ merge_final(priv, cmp, head, pending, list); } EXPORT_SYMBOL(list_sort); ``` ::: ### 標頭研讀 ```c // SPDX-License-Identifier: GPL-2.0 #include <linux/compiler.h> #include <linux/export.h> #include <linux/list_sort.h> #include <linux/list.h> ``` `// SPDX-License-Identifier: GPL-2.0` SPDX（Software Package Data Exchange）是一種標準化的方式，用來標示軟體的授權許可資訊。`GPL-2.0` 則代表採用 GNU 通用公共授權（GNU General Public License）第 2 版進行授權。這是一種開源許可證，允許使用者自由地使用、修改和分發代碼，但前提是必須遵守 GPL-2.0 的條款，例如分發修改後的代碼時也必須保持開源並提供原始碼。這行註解清楚地告訴開發者這段代碼的法律授權狀態，避免了冗長的版權聲明文字。 `#include <linux/compiler.h>` : 它確保代碼在不同環境下都能正確編譯和運行 `#include <linux/export.h>` : 此定義了用於導出符號的巨集，例如 `EXPORT_SYMBOL`。在 Linux 核心中，有些函數或變數需要被其他核心模組（module）使用，這時就需要透過這些巨集將它們標記為可導出，如果有函數需要被其他模組調用，就會依賴這個頭檔案提供的功能 ## 概述 `list_sort` 用於對linux kernel中的doubly linekd list進行排序。排序部分採用的是穩定的 merge sort algorithm，並支援自定義比較邏輯。 ### 為什麼使用合併排序？ 1. 穩定性：合併排序保證相等元素的相對順序不變，這在核心中非常重要。 2. linked list友好：合併排序不需隨機存取，只需指針操作即可分割和合併鏈表。 3. 時間複雜度：O(nlogn)，空間複雜度低，適合linked list結構。 ## 實現實作方面 `list_sort` 仰賴兩個輔助函式 : `merge` 和 `merge_final` ### 輔函式1 `merge` ```C /* * Returns a list organized in an intermediate format suited * to chaining of merge() calls: null-terminated, no reserved or * sentinel head node, "prev" links not maintained. */ ``` 此註解為針對 `merge` 函式的作用 : 此函式回傳的中間格式架構有以下三特點 : 1. Null-terminated（以 NULL 結尾） 2. No reserved or sentinel head node（無保留或哨兵頭節點） 3. "Prev" links not maintained（不維護 "prev" 連結）作用 : 合併兩個已排序的子列表 a 和 b，生成一個單向鏈表（以 NULL 結尾） src : ```c __attribute__((nonnull(2,3,4))) ``` 此是GCC編譯器提供的一個擴展屬性，用於告訴編譯器某些函數參數不應該為 NULL。具體來說，這裡的 (2,3,4) 表示函數的第 2、第 3 和第 4 個參數必須是非空的ptr ```c static struct list_head *merge(void *priv, list_cmp_func_t cmp, struct list_head *a, struct list_head *b) { struct list_head *head, **tail = &head; for (;;) { if (cmp(priv, a, b) <= 0) { // 穩定性：相等時選 a *tail = a; tail = &a->next; a = a->next; if (!a) { *tail = b; break; } } else { *tail = b; tail = &b->next; b = b->next; if (!b) { *tail = a; break; } } } return head; } ``` 引數 : * `void *priv` : 為一私有泛型資料指標，目的是讓使用者能設置比較邏輯 `cmp` 的對象 * `list_cmp_func_t cmp` : 是一個自訂的比較函數，用來決定兩個串列節點 a 和 b 的相對順序 * `list_cmp_func_t` 定義在 `<linux/list.h>` 中 ```c typedef int (*list_cmp_func_t)(void *priv, const struct list_head *a, const struct list_head *b); ``` * `struct list_head *a` : a 是第一個sorted的linked list head * `struct list_head *b` : b 是第二個sorted的linked list head 邏輯 : 用 `*head` 作為新的list之head，再用 `**tail` 根據 a、b list中element內容比較去選擇將該 element 串入新的list(若相等則取a保證stable)。若 a、b 誰先取完，就將另一個list串到新list後面 ### 輔函式2 `merge_final` 作用：最終合併兩個子列表，並恢復雙向鏈表的 `prev` 指針，形成循環結構。 src : ```c static void merge_final(void *priv, list_cmp_func_t cmp, struct list_head *head, struct list_head *a, struct list_head *b) { struct list_head *tail = head; u8 count = 0; for (;;) { if (cmp(priv, a, b) <= 0) { tail->next = a; a->prev = tail; tail = a; a = a->next; if (!a) break; } else { tail->next = b; b->prev = tail; tail = b; b = b->next; if (!b) { b = a; break; } } } tail->next = b; do { if (unlikely(!++count)) cmp(priv, b, b); // 避免過長迴圈，允許調度 b->prev = tail; tail = b; b = b->next; } while (b); tail->next = head; head->prev = tail; } ``` 邏輯 : 前面部分與 `merge` 相同，新的部分為多維護一個 `count` 與後面使用 do-while 迴圈處理剩餘節點，同時監控迴圈次數(`count`)以避免過長執行。最後閉合形成循環 * `unlikely()` 是一個巨集（macro），作為編譯器的提示，告訴編譯器這個條件`（!++count）`不太可能發生，優化程式執行效率。 * 又因 `count` 為 u8 ，值為255時再執行 `++count` 後，會溢出，`count` 變為 0 * `cmp(priv, b, b)` 為一 no-op ，通常包含一些副作用（side effect），觸發某些系統檢查點。這些副作用給予系統一個機會，讓操作系統的調度器（scheduler）介入，檢查是否需要切換到其他任務。 ### 主函式 `list_sort` : src : ```c void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp) { struct list_head *list = head->next, *pending = NULL; size_t count = 0; if (list == head->prev) // 0 或 1 個元素，直接返回 return; head->prev->next = NULL; // 斷開循環，轉為單向鏈表 do { size_t bits; struct list_head **tail = &pending; for (bits = count; bits & 1; bits >>= 1) tail = &(*tail)->prev; if (likely(bits)) { struct list_head *a = *tail, *b = a->prev; a = merge(priv, cmp, b, a); a->prev = b->prev; *tail = a; } list->prev = pending; pending = list; list = list->next; pending->next = NULL; count++; } while (list); list = pending; pending = pending->prev; for (;;) { struct list_head *next = pending->prev; if (!next) break; list = merge(priv, cmp, pending, list); pending = next; } merge_final(priv, cmp, head, pending, list); } ``` #### 註解研讀 ```c /** * list_sort - sort a list * @priv: private data, opaque to list_sort(), passed to @cmp * @head: the list to sort * @cmp: the elements comparison function */ ``` * `@priv`：私有資料，對 list_sort 來說是不透明的指標，會直接傳遞給比較函數 @cmp。這允許 @cmp 在比較元素時使用額外的上下文資料。 * `@head`：要排序的linked list的頭部，指標指向串列的起始節點。 * `@cmp`：自訂的比較函數，由使用者提供，用來定義兩個節點之間的順序關係。 ```c /* * The comparison function @cmp must return > 0 if @a should sort after * @b ("@a > @b" if you want an ascending sort), and <= 0 if @a should * sort before @b *or* their original order should be preserved. ``` 描述比較函數 @cmp 的行為大於 0：表示 @a 應排在 @b 後面。例如，若要升序排序，則 @a > @b 時返回大於 0。小於等於 0：表示 @a 應排在 @b 前面，或兩者相等時保持原始順序。`list_sort` 是一個穩定排序（stable sort），即相等元素的相對順序不會改變。@cmp 總是以輸入串列中較早出現的元素作為 @a，因此無需明確區分 @a < @b 和 @a == @b 的情況，只需返回 <= 0 即可。 ```c /* The comparison function must adhere to specific mathematical properties: - Antisymmetry: cmp(@a, @b) must return the opposite sign of cmp(@b, @a). - Transitivity: if cmp(@a, @b) <= 0 and cmp(@b, @c) <= 0, then cmp(@a, @c) <= 0. ``` 描述@cmp所需的數學屬性要求 * 反對稱性（Antisymmetry）： cmp(@a, @b) 和 cmp(@b, @a) 的返回值符號必須相反。例如，若 cmp(@a, @b) > 0，則 cmp(@b, @a) < 0。這避免了邏輯矛盾（如 @a > @b 且 @b > @a）。 * 傳遞性（Transitivity）：若 cmp(@a, @b) <= 0 且 cmp(@b, @c) <= 0，則 cmp(@a, @c) <= 0 必須成立。這確保排序結果的一致性和正確性。 ```c /* This is compatible with two styles of @cmp function: - The traditional style which returns <0 / =0 / >0, or - Returning a boolean 0/1. ``` @cmp的兩種style * traditional style：返回 < 0（@a < @b）、= 0（@a == @b）、> 0（@a > @b），類似標準 C 的 `strcmp`。 * boolean style：返回 0（@a <= @b）或 1（@a > @b）。這種方式更簡潔，可能節省少量 CPU 週期。例如，`block/blk-mq.c` 中的 `plug_ctx_cmp` 就採用此 ```c /* A good way to write a multi-word comparison is:: if (a->high != b->high) return a->high > b->high; if (a->middle != b->middle) return a->middle > b->middle; return a->low > b->low; ``` @cmp 多欄位比較範例假設節點有 high、middle、low 三個欄位，依序比較。先比較 high，若不同，根據 high 的值決定順序；若相同，再比較 middle，最後比較 low。這種方式實現了字典序（lexicographical order），常用於多條件排序。 ```c /* This mergesort is as eager as possible while always performing at least 2:1 balanced merges. Given two pending sublists of size 2^k, they are merged to a size-2^(k+1) list as soon as we have 2^k following elements. ``` 說明此list_sort的eager體現在條件允許時盡早合併子串列，以免最後很長一條merge一小條不balanced的case。且實施的準則為 `2:1 balanced merges` ，當有兩個大小為 2^k 的子串列，且後續有 2^k 個元素時，將這兩個子串列合併成一個大小為 2^(k+1) 的串列，旨在先merge小。 ```c /* Thus, it will avoid cache thrashing as long as 3*2^k elements can fit into the cache. Not quite as good as a fully-eager bottom-up mergesort, but it does use 0.2*n fewer comparisons. ``` 只要快取能容納 3*2^k 個元素，就能避免cache thrashing，相較於fully-eager bottom-up mergesort，這種方法減少了約 0.2*n 次比較。 ```c /* The merging is controlled by "count", the number of elements in the pending lists. Each time we increment "count", we set one bit (bit k) and clear bits k-1 .. 0. Each time this happens (except the very first time for each bit, when count increments to 2^k), we merge two lists of size 2^k into one list of size 2^(k+1). ``` 描述 `count` 的角色為表示待處理串列中的元素總數，用來決定何時進行合併。且描述了運作機制 : 位元操作：每次 count 增加時，設置某個位元（bit k），並清除低於 k 的位元（k-1 至 0）。合併觸發：當 count 達到 2^k 的奇數倍（除了第一次達到 2^k），觸發兩個大小為 2^k 的串列合併，生成一個 2^(k+1) 的串列。 ```c /* This merge happens exactly when the count reaches an odd multiple of 2^k, which is when we have 2^k elements pending in smaller lists. ``` 描述 `count` 機制的merge時機 : 當 `count` 為 2^k 的奇數倍時，表示有 2^k 個元素在較小的待處理串列中，此時可以安全合併兩個 2^k 大小的串列。後續合併時，生成兩個 2^(k+1) 的串列，接著它們會被合併成 2^(k+2)，確保待處理串列數量不超過兩個。 ```c /* This merge happens exactly when the count reaches an odd multiple of 2^k, which is when we have 2^k elements pending in smaller lists. ``` 描述pending list的狀態，大小為 2^k 的待處理串列數量取決於： 1. `count` 的第 k 位狀態。 2. 第 k-1 位狀態（k = 0 時，假設 bit -1 恆為 1）。 3. 高位是否為非零（即 `count` >= 2^(k+1)）。 ```c /* There are six states we distinguish: 0: 00x: 0 pending of size 2^k; x pending of sizes < 2^k 1: 01x: 0 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k 2: x10x: 0 pending of size 2^k; 2^k + x pending of sizes < 2^k 3: x11x: 1 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k 4: y00x: 1 pending of size 2^k; 2^k + x pending of sizes < 2^k 5: y01x: 2 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k (merge and loop back to state 2) ``` 描述6種states `x` 表示任意位元，`y` 表示非零高位。 0：`00x` - 無 2^k 大小的串列，x 個小於 2^k 的串列。 1：`01x` - 無 2^k ，2^(k-1) + x 個小於 2^k。 2：`x10x` - 無 2^k ，2^k + x 個小於 2^k。 3：`x11x` - 1 個 2^k ，2^(k-1) + x 個小於 2^k。 4：`y00x` - 1 個 2^k ,2^k + x 個小於 2^k。 5：`y01x` - 2 個 2^k， 2^(k-1) + x 個小於 2^k，觸發合併後回到狀態 2。 ```c /* We gain lists of size 2^k in the 2->3 and 4->5 transitions and merge them away in the 5->2 transition. When we reach the end of the input, we merge all the pending lists, from smallest to largest. ``` 生成 : 2->3 和 4->5：當 bit k-1 為 1 且高位非零時，生成 2^k 大小的串列(state 5)。合併 : 5->2：合併兩個 2^k 的串列，生成 2^(k+1)，並回到state 2 最終合併：處理完所有輸入後，將所有待處理串列從小到大合併，完成排序。總結 : `list_sort` 透過 `count` 控制合併時機，實現 2:1 balanced merge。它優化了cache使用（需容納 32^k 元素）和比較次數（減少 0.2n 次），特別適合資料能放入 L1 快取的情況。使用者可透過 @cmp 和 @priv 靈活定義排序邏輯函式流程 1. guard clause與初始化 : * 若list只有 0 或 1 個元素，直接返回。 * 將circular list斷開，轉為單向鏈表。 * 初始化 `*pending`（待合併子列表）為 NULL，`count` 為 0。 2. 主循環：處理輸入 (merge時機為 `count` 達到某個 2^k 的奇數倍時，觸發大小為 2^k 的子列表合併) * 計算合併點： * 檢查 `count` 的二進位表示，找到最低位的 0（記為 bits）。 * 沿著 `pending` 的 `prev` 指針前進 `bits` 步，定位需要合併的子列表。 * 執行合併： * 若 `bits` != 0，取出 `pending` 中的兩個子列表 a 和 b，使用 merge 合併，並更新 `pending`。 * 加入新元素： * 從輸入 `list` 中取出一個元素，作為大小為 1 的子列表，加入 `pending`。 * 將 `count` 加 1。 3. 最終合併 * 從 `pending` 開始，逐對合併子列表，使用 merge * 使用 merge_final 合併剩餘兩個sublists，並恢復doubly linked list `list_sort` 優點特性 * 合併策略 : * 自底向上：從大小為 1 的子列表開始，逐步合併。 * 平衡合併：通過 count 控制，確保每次合併的子列表大小比例接近 2:1。 * 減少合併次數：僅在必要時合併，減少約 20% 的比較次數。 * stable * 在 merge 和 merge_final 中，當 cmp(a, b) <= 0 時優先選 a，保證相等元素的原始順序 * 效能考量 * cache友好：平衡合併使子列表大小適中，適合快取。 * 上下文切換：merge_final 中的計數器設計允許調度器介入 * count 設計 * 每次 count 增加，檢查其最低位的 0，觸發對應大小的子列表合併。bitwise 運算是非常快速的操作，尤其是在核心程式（如作業系統核心）中，對效能要求很高時，這種方法能顯著提升速度。