# 2024q1 Homework2 (quiz1+2) contributed by < [yenslife](https://github.com/yenslife) > ## 第一週測驗 `1` [quiz1-1](https://hackmd.io/@sysprog/linux2024-quiz1) 參考 [Optimized QuickSort — C Implementation (Non-Recursive)](https://alienryderflex.com/quicksort/) 實作非遞迴的快速排序法。 考慮結構體 `node_t`,其中 `left`, `right` 成員在 [quiz1-1](https://hackmd.io/@sysprog/linux2024-quiz1) 中似乎沒有用到。 ```c typedef struct __node { struct __node *left, *right; struct __node *next; long value; } node_t; ``` ```graphviz digraph n1 { rankdir=LR; node [shape=record]; head1 [label="{<left>left|<right>right}|<next>next|<value>value"]; head2 [label="{<left>left|<right>right}|<next>next|<value>value"]; head3 [label="{<left>left|<right>right}|<next>next|<value>value"]; NULL1[label=NULL,shape=plaintext] // next head1:next -> head2 head2:next -> head3 head3:next -> NULL1 } ``` 函式說明: - `list_tail` : 回傳 list 的最後一個節點 - `int list_length(node_t **left)` : 回傳 list 的長度 - `void list_add(node_t **list, node_t *node_t)` : 在 list 的開頭新增一個節點,並將 list 指向的節點改為新增的節點,該節點必須先被初始化 - `node_t *list_construct(node_t *list, int n)` : 初始化一個新的節點 - `void list_free(node_t **list)` : 釋放 list 中所有節點的記憶體空間 ### 快速排序函式解說 #### 初始化 觀察快速排序函式 `void quick_sort(node_t **list)`,在初始化部分,這邊使用 `begin` 和 `end` 這兩個長度為 `max_level` 的 `node_t` 陣列來表示 stack,並個別初始化為 list 的開頭和結尾。`left` 和 `right` 用來指向被分割成兩個部分的 list 開頭。每次都從 stack 的最上層 (index 為 0) 開始處理 ```c int n = list_length(list); int value; int i = 0; int max_level = 2 * n; /* (2 * n) 是最糟情況,即每次都找到最大/小的 */ node_t *begin[max_level], *end[max_level]; node_t *result = NULL, *left = NULL, *right = NULL; begin[0] = *list; end[0] = list_tail(list); ``` #### 主要的迴圈,模擬遞迴呼叫 先讓 L, R 分別表示目前這個 list 要處理的開頭和結尾。`i` 是待處理堆疊的數量,一開始是 0。當 list 還可以被切割時,程式碼最後會以 `i += 2` 增加兩個堆疊數量,分別表示左邊和右邊未處理的 list。如果發現當下的 list 只剩下一個節點 (`L == R`),那就將 `i--`。如果 L 非空,則將 L 加入 result 這個 list。 ```c /* 模擬遞迴呼叫 */ while (i >= 0) { node_t *L = begin[i], *R = end[i]; if (L != R) { /** * * 其他程式碼,稍後討論 * */ i += 2; } else { /* 剩下一個節點,放到 */ if (L) list_add(&result, L); i--; } } *list = result; ``` #### `if (L != R)` 的內容 發現 pivot 的選定方式為直接指定 L (這邊一定可以改進,好的 pivot 選擇方式能有效避免最糟情況),並將 `p` 指向 `pivot` 的下一個節點,也就是 `L` 的下一個節點。 ```c node_t *pivot = L; value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; ``` ```graphviz digraph main { node[shape=plaintext] pivot [fontcolor="red"]; L [fontcolor="blue",label=left]; R [fontcolor="blue",label=right]; null[label=NULL]; node[shape=box] s0 [label="4"]; s1 [label="1"]; s2 [label="3"]; s3 [label="5"]; s4 [label="2"]; s5 [label="7"]; { rank="same"; s0->s1 s1->s2 s2->s3 s3->s4 s4->s5 s5->null } pivot->s0 } ``` 在 while 迴圈中開始比大小,比 `pivot` 大的節點放到 `right` 之後;反之,放到 `left` 之後。 ```c while (p) { node_t *n = p; p = p->next; list_add(n->value > value ? &right : &left, n); } ``` ```graphviz digraph main { node[shape=plaintext] pivot [fontcolor="red"]; L [fontcolor="blue",label=left]; R [fontcolor="blue",label=right]; n1[label=NULL]; n2[label=NULL]; node[shape=box] s0 [label="4"]; s1 [label="1"]; s2 [label="3"]; s3 [label="5"]; s4 [label="2"]; s5 [label="7"]; { rank="same"; s1->s2 s2->s4 s4->n1 s3->s5 s5->n2 } pivot->s0 L->s1 R->s3 } ``` 最後才是「切割」,可以發現處理順序是右邊 -> pivot -> 左邊 ```c begin[i] = left; end[i] = list_tail(&left); begin[i + 1] = pivot; end[i + 1] = pivot; begin[i + 2] = right; end[i + 2] = list_tail(&right); left = right = NULL; i += 2; ``` ```graphviz digraph stack { // rankdir=LR node [shape=plain]; end [label="end"]; begin [label="begin"]; node [shape=record]; e_stack [label="{<left>list_tail(&left)}|<pivot>pivot|<right>list_tail(&right)"]; b_stack [label="{<left>left}|<pivot>pivot|<right>right"]; { // rank="same"; end->e_stack:left; begin->b_stack:left; } } ``` ### 我的改寫方法 兩點可能的改進方案: 1. 減少額外空間利用 (`max_level` 大小) 2. 選擇 pivot 的方式 (Randomized Quick sort / Middle-of-Three) ### 減少空間佔用 主要佔空間的變數,其中 `max_level = 2 * n` - `begin` : `max_level * sizeof(node_t)` - `end` : `max_level * sizeof(node_t)` `max_level` 初始化大小之所以會是 `n * 2 * sizeof(node_t)`,是因為其中的 `n * 2` 表示 worst case 的 stack 大小,也就是每次都選到最大/小,導致無法一分為二。但越想越不對,如果已經有 shuffle 過了,那 `max_level` 的大小也不應該超過 n 才對。 ```c /* shuffle array, only work if n < RAND_MAX */ void shuffle(int *array, size_t n) { if (n <= 0) return; for (size_t i = 0; i < n - 1; i++) { size_t j = i + rand() / (RAND_MAX / (n - i) + 1); int t = array[j]; array[j] = array[i]; array[i] = t; } } ``` #### 測試1:極端案例 我在快速排序的 while 迴圈中加入了 `max_i` 變數,其功能為紀錄目前最大的 `i` 值,並在函式最後將其印出。 ```c int max_i = 0; while (i >= 0) { /** * 一堆程式碼 * */ if (i > max_i) max_i = i; } ``` 用極端案例「排序好的 list」來測試,發現 `max_i` 的值會和節點數量相同,與先前的理解無誤。 #### 測試2:隨機案例 再來我想看看 shuffle 過後的 list 平均的最大 stack 數 (`average_max_i`) 會是多少,以下設定節點數量為 1000 ~ 10000,shuffle 次數從一次到一千次,取平均。這邊我讓 quick_sort 可以回傳 `max_i`。 ```c /* 觀察 shuffle 1 ~ n 次平均的最大 stack 數 */ void test_shuffle(int n, int node_count) { int *list = malloc(sizeof(int) * node_count), average = 0; node_t *head = NULL; for (int i = 0; i < node_count; ++i) list[i] = i; for (int i = 1; i <= n; ++i) { for (int k = 1; k <= i; k++) { shuffle(list, node_count); } for (int i = 0; i < node_count; ++i) head = list_construct(head, list[i]); average += quick_sort(&head); list_free(&head); } free(list); printf("node_count = %d, shuffle = %d, average = %f\n", node_count, n, (float)average / n); } int main () { for (int i = 1000; i <= 10000; i+=1000) test_shuffle(50, i); return 0 } ``` 印出結果,發現節點個數確實與 `average_max_i` 成正比,`shuffle` 的次數增加也會增加平均最高堆疊數,但是影響力不大 (因為是隨機的隨機),到了 50 次之後就幾乎沒有影響了。 :::spoiler 實驗數據 (shuffle: 10) node_count = 1000, shuffle = 10, average = 24.600000 node_count = 2000, shuffle = 10, average = 28.200001 node_count = 3000, shuffle = 10, average = 29.600000 node_count = 4000, shuffle = 10, average = 29.000000 node_count = 5000, shuffle = 10, average = 30.600000 node_count = 6000, shuffle = 10, average = 33.200001 node_count = 7000, shuffle = 10, average = 33.799999 node_count = 8000, shuffle = 10, average = 33.000000 node_count = 9000, shuffle = 10, average = 33.799999 node_count = 10000, shuffle = 10, average = 34.599998 ::: :::spoiler 實驗數據 (shuffle: 20) node_count = 1000, shuffle = 20, average = 25.500000 node_count = 2000, shuffle = 20, average = 28.200001 node_count = 3000, shuffle = 20, average = 31.900000 node_count = 4000, shuffle = 20, average = 32.000000 node_count = 5000, shuffle = 20, average = 32.400002 node_count = 6000, shuffle = 20, average = 31.900000 node_count = 7000, shuffle = 20, average = 34.000000 node_count = 8000, shuffle = 20, average = 35.099998 node_count = 9000, shuffle = 20, average = 35.099998 node_count = 10000, shuffle = 20, average = 35.700001 ::: :::spoiler 實驗數據 (shuffle: 50) node_count = 1000, shuffle = 50, average = 26.480000 node_count = 2000, shuffle = 50, average = 28.760000 node_count = 3000, shuffle = 50, average = 31.799999 node_count = 4000, shuffle = 50, average = 31.520000 node_count = 5000, shuffle = 50, average = 33.279999 node_count = 6000, shuffle = 50, average = 34.080002 node_count = 7000, shuffle = 50, average = 35.160000 node_count = 8000, shuffle = 50, average = 35.919998 node_count = 9000, shuffle = 50, average = 36.160000 node_count = 10000, shuffle = 50, average = 36.639999 ::: :::spoiler 實驗數據 (shuffle: 100) node_count = 1000, shuffle = 100, average = 27.000000 node_count = 2000, shuffle = 100, average = 29.740000 node_count = 3000, shuffle = 100, average = 31.320000 node_count = 4000, shuffle = 100, average = 32.740002 node_count = 5000, shuffle = 100, average = 33.740002 node_count = 6000, shuffle = 100, average = 34.860001 node_count = 7000, shuffle = 100, average = 35.139999 node_count = 8000, shuffle = 100, average = 36.259998 node_count = 9000, shuffle = 100, average = 36.599998 node_count = 10000, shuffle = 100, average = 37.400002 ::: :::spoiler 實驗數據 (shuffle: 200) node_count = 1000, shuffle = 200, average = 26.809999 node_count = 2000, shuffle = 200, average = 29.410000 node_count = 3000, shuffle = 200, average = 31.680000 node_count = 4000, shuffle = 200, average = 33.139999 node_count = 5000, shuffle = 200, average = 33.730000 node_count = 6000, shuffle = 200, average = 34.820000 node_count = 7000, shuffle = 200, average = 35.840000 node_count = 8000, shuffle = 200, average = 36.189999 node_count = 9000, shuffle = 200, average = 37.160000 node_count = 10000, shuffle = 200, average = 37.810001 node_count = 1000, shuffle = 1000, average = 26.486000 node_count = 2000, shuffle = 1000, average = 29.719999 node_count = 3000, shuffle = 1000, average = 31.938000 node_count = 4000, shuffle = 1000, average = 33.189999 node_count = 5000, shuffle = 1000, average = 34.256001 node_count = 6000, shuffle = 1000, average = 35.015999 node_count = 7000, shuffle = 1000, average = 36.018002 node_count = 8000, shuffle = 1000, average = 36.480000 node_count = 9000, shuffle = 1000, average = 37.243999 node_count = 10000, shuffle = 1000, average = 37.987999 ::: :::spoiler 節點數少(shuffle: 50, node_count: 1 ~ 200) node_count = 1, shuffle = 50, average = 0.000000 node_count = 2, shuffle = 50, average = 1.960000 node_count = 3, shuffle = 50, average = 2.560000 node_count = 4, shuffle = 50, average = 2.920000 node_count = 5, shuffle = 50, average = 3.760000 node_count = 6, shuffle = 50, average = 4.040000 node_count = 7, shuffle = 50, average = 4.720000 node_count = 8, shuffle = 50, average = 5.160000 node_count = 9, shuffle = 50, average = 5.800000 node_count = 10, shuffle = 50, average = 6.200000 node_count = 10, shuffle = 50, average = 5.920000 node_count = 20, shuffle = 50, average = 9.160000 node_count = 30, shuffle = 50, average = 10.320000 node_count = 40, shuffle = 50, average = 11.680000 node_count = 50, shuffle = 50, average = 12.440000 node_count = 60, shuffle = 50, average = 13.120000 node_count = 70, shuffle = 50, average = 13.800000 node_count = 80, shuffle = 50, average = 14.360000 node_count = 90, shuffle = 50, average = 15.600000 node_count = 100, shuffle = 50, average = 15.760000 node_count = 110, shuffle = 50, average = 16.040001 node_count = 120, shuffle = 50, average = 16.200001 node_count = 130, shuffle = 50, average = 16.680000 node_count = 140, shuffle = 50, average = 16.799999 node_count = 150, shuffle = 50, average = 17.000000 node_count = 160, shuffle = 50, average = 17.799999 node_count = 170, shuffle = 50, average = 17.480000 node_count = 180, shuffle = 50, average = 17.879999 node_count = 190, shuffle = 50, average = 17.959999 node_count = 200, shuffle = 50, average = 18.320000 ::: 將實驗結果繪製成折線圖 `shuffle` 1 次,每次增加十個節點 ![截圖 2024-03-12 下午5.01.05](https://hackmd.io/_uploads/BytnZ6TTT.jpg) `shuffle` 2 次,每次增加十個節點 ![截圖 2024-03-12 下午7.49.28](https://hackmd.io/_uploads/rkkAZa6a6.jpg) `shuffle` 5 次,每次增加十個節點 ![截圖 2024-03-12 下午4.49.35](https://hackmd.io/_uploads/HyKmZTppa.png) `shuffle` 10 次,每次增加十個節點 ![截圖 2024-03-10 上午1.01.10](https://hackmd.io/_uploads/rJc1Lf56T.png) `shuffle` 50 次,每次增加一百個節點 (因為跑太慢了) ![截圖 2024-03-10 上午12.45.23](https://hackmd.io/_uploads/SJjEzG5Ta.png) 可以發現結果會趨近於 45 左右,`shuffle` 越多次,趨勢就越集中。所以我想將 `max_level` 的值依照節點數量的多寡來設定,並在排序之前先執行 2 ~ 10 次 shuffle,也就是加入 Randomized Quick Sort 的想法進去。 | 節點數 n | max_level | | ------------------ | --------- | | n < 100 | n | | `100 <= n <= 1000` | n / 10 | | `1000 <= n ` | n / 20 | 同時也發現一個問題,那就是 `shuffle` 的時間也會隨著節點量增加而變長,有沒有更好的方法?或者說可以如何改進 `shuffle` 的方式?會不會 `shuffle` 的亂度會影響趨勢的集中程度呢? TODO: 利用 [lab0-c(D)](https://hackmd.io/@sysprog/linux2024-lab0-d) 的工具來測試亂度 ### 觀察 glibc 的 `rand()` 實作 我原本想要用以上的 Randomized Quick Sort 方法來避免最糟情況,但是遇到太慢的問題,觀察 `shuffle` 函式的實作,會發現他其實是用 `rand()` 這個 [PRNG](https://en.wikipedia.org/wiki/Pseudorandom_number_generator) 來實作 [Fisher–Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#The_modern_algorithm) 方法。希望透過理解 `rand()` 或是 `srand()` 的實作,嘗試以合理的角度修改目前 `shuffle` 的實作方式,最好在不影響隨機性的情況下提升執行速度。 (隨機性?這個也可以來探討,因為 `rand()` 並不是真正的「隨機」,而是在使用範圍內不可預測的「偽亂數」PRNG) ```c size_t j = i + rand() / (RAND_MAX / (n - i) + 1); ``` 觀察 [glibc](https://www.gnu.org/software/libc/) 的 [`stdlib/rand.c`](https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=stdlib/rand.c;hb=d39a893ed6de8e63ffbfbcc4b7176a2fa852f8a8),其用途為回傳一個介於 0 ~ RAND_MAX 之間的整數 ```c /* Return a random integer between 0 and RAND_MAX. */ int rand (void) { return (int) __random (); } ``` 然後找到 `__random` 的實作,在 [`stdlib/random.c`](https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/random.c;hb=faa575631468b0cb318df08bd030deb816a472aa),這邊用了 `__libc_lock_lock (lock)` 來避免 race condition,其中的 critical section 是 `_random_r (&unsafe_state, &retval)`,該函式實作在 `stdlib/random_r.c`。 至於為什麼要分成 `random.c` 和 `random_r.c` 兩個檔案?前者用來呼叫實作函式,並用 lock 確保不會發生 race condition,後者才是邏輯實作。其他函式像是 `initstate`, `setstate` 也是相同的概念,這讓我想到軟體開發中的 [Dependency Injection](https://zh.wikipedia.org/zh-tw/%E4%BE%9D%E8%B5%96%E6%B3%A8%E5%85%A5) 的設計方式。 ```c long int __random (void) { int32_t retval; __libc_lock_lock (lock); (void) __random_r (&unsafe_state, &retval); __libc_lock_unlock (lock); return retval; } ``` 這裡有一個參數 `unsafe_state`,用來記錄隨機演算法的 state information 參數,他是一個名為 `random_data` 的結構體。 ```c static struct random_data unsafe_state = { .fptr = &randtbl[SEP_3 + 1], /* 指向距離 rptr rand_sep 個位置 * 事實上不用這個也可以,只是為了實作方便 */ .rptr = &randtbl[1], /* 指向 state information 的開頭,從 * 之所以從 index 1 開始是因為 * index 0 紀錄 TYPE 了*/ .state = &randtbl[1], /* 由於是從 index 1 開始,所以 * state[-1] 就是 R.N.G 的種類 */ .rand_type = TYPE_3, /* R.N.G 的 種類 */ .rand_deg = DEG_3, /* 多項式冪數 */ .rand_sep = SEP_3, /* fptr 和 rptr 的距離 */ .end_ptr = &randtbl[sizeof (randtbl) / sizeof (randtbl[0])] /* end_ptr 指向 state information 的結尾 */ }; ``` 閱讀 `stdlib/random.c` 的註解,當 state information 小於 32 bytes 時,使用較簡單的 [Linear congruential](https://zh.wikipedia.org/zh-tw/%E7%B7%9A%E6%80%A7%E5%90%8C%E9%A4%98%E6%96%B9%E6%B3%95) (`TYPE_0`);反之,使用 [LFSR](https://zh.wikipedia.org/zh-tw/%E7%BA%BF%E6%80%A7%E5%8F%8D%E9%A6%88%E7%A7%BB%E4%BD%8D%E5%AF%84%E5%AD%98%E5%99%A8) 來產生隨機數,預設為 128 bytes 也就是 `TYPE_3`。state information 本質上是一個由 long 組成的陣列,第一個元素為 TYPE,其他則是 R.N.G 的 state information。 > If we are using the trivial TYPE_0 R.N.G., just do the old linear congruential bit. Otherwise, we do our fancy trinomial stuff, which is the same in all the other cases due to all the global variables that have been set up. The basic operation is to add the number at the rear pointer into the one at the front pointer. Then both pointers are advanced to the next location cyclically in the table. The value returned is the sum generated, reduced to 31 bits by throwing away the "least random" low bit. Note: The code takes advantage of the fact that both the front and rear pointers can't wrap on the same call by not testing the rear pointer if the front one has wrapped. Returns a 31-bit random number. 如果 state table (`randtbl`) 是 32 bytes 的情況,扣掉第一個元素 `TYPE_1`,就會剩下 7 個元素,而 7 就表示多項式的最高冪次;同理,128 bytes 的最高冪次就是 $(128\div4) - 1 = 31$ 個元素,如下程式碼。 ```c static int32_t randtbl[DEG_3 + 1] = { TYPE_3, -1726662223, 379960547, 1735697613, 1040273694, 1313901226, 1627687941, -179304937, -2073333483, 1780058412, -1989503057, -615974602, 344556628, 939512070, -1249116260, 1507946756, -812545463, 154635395, 1388815473, -1926676823, 525320961, -1009028674, 968117788, -123449607, 1284210865, 435012392, -2017506339, -911064859, -370259173, 1132637927, 1398500161, -205601318, }; ``` :::warning 我的問題:這些數字到底是怎麼選的,完全看不出線索,一開始猜測會不會是質數,但也不是,而且居然有負數存在? > 我後來在這個 [commit 50843ff](https://sourceware.org/git/?p=glibc.git;a=blobdiff;f=stdlib/random.c;h=473a5b13d3af3d34740fa8c5fc16c2dc70f01743;hp=fb32b36b876ef2ff170a6a52b26a0b91e750ddcd;hb=50843ff068aae5b1ff58581eddd3bd107e99114b;hpb=23ad311df0e01caa58b54afb8ae999a3ffc1ba7a) 發現 Roland McGrath 有更新這些數字,但他並沒有解釋為什麼。似乎在最初的 [commit](https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=50843ff068aae5b1ff58581eddd3bd107e99114b) 這些數值就已經定好了。 ::: 實作上使用三項式,如此一來要相加的量比較少。state table 中所有數值的 LSB 會是週期為 $2^{deg} - 1$ 的 LFSR,前提為多項式是不可約 (Irreducible polynomial)、本原多項式 (Primitive Polynomial),總週期約為 $deg*(2^{deg} - 1)$ `__random_r` 函式的實作可以分成「LCG `TYPE_0`」和「LFSR (`TYPE_1` ~ `TYPE_4`)」兩個部分來討論。 ### 線性同餘實作 LCG (Linear Congruential Generator) 直接先看程式碼 ```c int32_t *state; state = buf->state; int32_t val = ((state[0] * 1103515245U) + 12345U) & 0x7fffffff; state[0] = val; *result = val; ``` 為什麼會有 **1103515245U** 這個數字,這個數字不是亂選的,參考 [stackoverflow](https://stackoverflow.com/questions/8569113/why-1103515245-is-used-in-rand) 上的討論,以及這篇[論文](https://www.researchgate.net/publication/2683298_A_Collection_of_Selected_Pseudorandom_Number_Generators_With_Linear_Structures)。 在開始之前要先理解什麼是 LCG (Linear Congruential Generator),他是一種很常在密碼學中被使用的 Uniform PRNG 方法,利用質數的特性來產生一個週期很長的序列,每次的值都是可以預期的,但因為序列長度很長,所以有「隨機」的效果。判斷一個 LCG 的好壞通常使用 [spectral test](https://en.wikipedia.org/wiki/Spectral_test),觀察序列在二維平面或更高維度中的分佈,並比較距離。 ```c state[0] = val; /* 遞迴關係 */ ``` 遞迴關係式:$X_{n+1} = a X_n + b \ (mod\ m)$,LCG 的週期最大為 $m$,我們會選擇一個 seed $x_0$ 當作初始值,寫成函式的樣子:$LCG(m, a, b, x_0), a, b, x_0 \in Z_m$。 要讓 LCG 達到最大週期,應符合以下條件 - $b, m$ 互質 - m 的所有質因數都要能整除 $a-1$ - 若 $M$ 是 4 的倍數,$A-1$ 也是 - $a, b, x_0$ 都比 $m$ 小 - $a, b$ 是正整數 那為什麼 ANSIC 要選擇 $LCG(2^{31}, 1103515245, 12345, 12345)$ 做為參數呢?參考二維度 spectral test 的結果,雖然看起來很平均 ![image](https://hackmd.io/_uploads/SytYaa3aa.png) 但在論文中提到這句話 > Park and Miller [194] writes: its deficiencies are well known- according to Berke- ley 4.2 documentation, the low bits of the numbers generated are not very random. Spectral test for dim. $2 \leq s \leq 8$ 與相對應的表格 | $s$ = 2 | 3 | 4 | 5 | 6 | 7 | 8 | | ------- | ---- | ---- | ---- | ---- | ---- | ---- | | 0.84 | 0.52 | 0.63 | 0.49 | 0.68 | 0.43 | 0.54 | 測試結果表示這個方法並不隨機 (特別是低位元的表現),但在**實作**方面卻很方便,因為取模運算的成本高,除非你是對二取餘數,就可以直接用 bitwise 操作。 ```c int32_t val = ((state[0] * 1103515245U) + 12345U) & 0x7fffffff; ``` `0x7fffffff` 這個數字的來源?`0x7fffffff` 就是 $2^{31}-1$,所以取餘數就是保留剩下的位元 (就想想 `x` 對 $2^1$ 取餘數會寫成 `x & 1`;$2^{31}$ 也是一樣的道理)。 ### LFSR (Linear feedback shift register) 實作 LFSR (Linear feedback shift register) 指的是給定前一個狀態的輸出,將該輸出的線性函數再用作輸入的移位暫存器。 ```c int32_t *state; if (buf == NULL || result == NULL) goto fail; state = buf->state; /* * other code... * */ /* 儲存 LFSR 的狀態*/ int32_t *fptr = buf->fptr; int32_t *rptr = buf->rptr; int32_t *end_ptr = buf->end_ptr; uint32_t val; /* 計算 LFSR 的下一個狀態 */ val = *fptr += (uint32_t) *rptr; /* Chucking least random bit. * 之所以要丟棄是因為最 LSB 的隨機性較差 */ *result = val >> 1; ++fptr; if (fptr >= end_ptr) { /* 當 ptr 到達 end,就回到 state */ fptr = state; ++rptr; } else { ++rptr; if (rptr >= end_ptr) rptr = state; } buf->fptr = fptr; buf->rptr = rptr; ``` 將 `fptr` 和 `rptr` 指向的值相加,並更新 `fptr` 的值,將結果右移 1 bit,即可得到隨機值。目前最大的疑問就是 `randtbl` 那 31 個值到底是怎麼決定的?不可能是憑空想出來的,因為它可以做到 $2^{31}$ 循環,但我目前還沒找到相關文獻。 ### 用 List API 改寫非遞迴 Quick sort #### 結構體 新的結構體 `node_t`,與 [lab0](https://hackmd.io/@sysprog/linux2024-lab0/%2F%40sysprog%2Flinux2024-lab0-e#list_sort) 的 `element_t` 相似,將原本 `struct __node` 的 `next` 改成 `struct list_head`,並刪除用不到的 `left`, `right`。 ```diff - typedef struct __node { - struct __node *left, *right; - struct __node *next; + typedef struct { + struct list_head list; long value; } node_t; ``` #### `list_length`, `list_free` 只要會走訪整個 list 的函式都用 `list_for_each` API 來改寫 ```c int n = 0; /* 計算長度 */ struct list_head *pos; list_for_each(pos, list) { n++; } ``` ```c struct list_head *pos, *q; /* 釋放 list */ list_for_each_safe(pos, q, list) { node_t *node = list_entry(pos, node_t, list); free(node); } ``` #### `list_add` 在 `list.h` 中已經有定義 `list_add` 了,我把 `list_add` 改成 `list_add_node` ```c list_add(&node->list, list); ``` 其他函式的改寫都有一個共通點:參數只接收 `struct list_head` 而不是 `node_t`,因為有 `list_entry` 可以找到所屬 `node_t` 了。 #### `swap_node` 利用 list API 可以很方便實作雙向鏈結的節點交換 TODO: 相鄰節點交換會出錯,待改進。 ```c void swap_node(struct list_head *head, struct list_head *l1, struct list_head *l2) { if (!head || !l1 || !l2) return; if (list_empty(head)) return; if (l1 == l2) return; if (l1->next == l2) { list_move_tail(l1, l2->next); return; } if (l2->next == l1) { list_move_tail(l2, l1->next); return; } struct list_head *temp1 = l1->next; struct list_head *temp2 = l2->next; list_del_init(l1); /* 拔下來 */ list_del_init(l2); list_add_tail(l1, temp2); /* 插回去 */ list_add_tail(l2, temp1); } ``` #### Quick sort 與原本作法的不同之處在於 - 原本的 `begin` 和 `end` 從 `node_t` 陣列改成 `struct list_head` 的陣列 - 使用「交換」的方式來儲存節點,如此一來不需要額外的變數,更直覺。 - 利用雙向鏈結的優勢來比大小,把尾端比較小的值和開頭比較大的值交換,規則如下: 1. 若 `left`, `right` 指向同一個節點,表示該節點為 pivot,直接 return 2. 當 `left <= pivot` ,節點位置不變,將 `left` 指向下一個節點 (next) 3. 當 `left > pivot` - 若`right >= pivot`,則將 `right` 指向上一個節點 (prev) - 若`right < pivot`,則將 `left` 與 `right` 指向的節點交換 4. 重複以上步驟直到 `left == right` 且 `left != right->next` (兩個節點的情況),將 `pivot` 與 `left` 的上一個節點交換 (小於 pivot 的節點) 5. 將下一次迭代的 `left` 和 `right` 記錄到 `begin` 和 `end` 陣列 圖示如下,給定鏈結串列 `[4, 1, 6, 5, 2]` 進行快速排序 `left` 指向的節點 `4 <= 4`,所以將 `left` 指向下一個節點,下一個節點 `1 <= 4`,所以再指向下一個節點 6。 ```graphviz digraph main { node[shape=plaintext] pivot [fontcolor="red"]; L [fontcolor="blue",label=left]; R [fontcolor="blue",label=right]; node[shape=box] s0 [label="4"]; s1 [label="1"]; s2 [label="6"]; s3 [label="5"]; s4 [label="2"]; s5 [label="7"]; head[label=head]; { rank="same"; head->s0[dir=both] head->s5[dir=both] s0->s1 s1->s0 s1->s2 s2->s1 s2->s3 s3->s2 s3->s4 s4->s3 s4->s5 s5->s4 } pivot->s0 L->s0 R->s5 } ``` `left` 指向的節點 6 比 4 大,檢查 `right` 指向的節點,7 比 4 大所以位置保持不變,將 `right` 指向上一個節點 ```graphviz digraph main { node[shape=plaintext] pivot [fontcolor="red"]; L [fontcolor="blue",label=left]; R [fontcolor="blue",label=right]; node[shape=box] s0 [label="4"]; s1 [label="1"]; s2 [label="6"]; s3 [label="5"]; s4 [label="2"]; s5 [label="7"]; head[label=head]; { rank="same"; head->s0[dir=both] head->s5[dir=both] s0->s1 s1->s0 s1->s2 s2->s1 s2->s3 s3->s2 s3->s4 s4->s3 s4->s5 s5->s4 } pivot->s0 L->s2 R->s5 } ``` 檢查 `right` 指向的節點值,2 比 4 小所以將 `left` 與 `right` 交換 ```graphviz digraph main { node[shape=plaintext] pivot [fontcolor="red"]; L [fontcolor="blue",label=left]; R [fontcolor="blue",label=right]; node[shape=box] s0 [label="4"]; s1 [label="1"]; s2 [label="6"]; s3 [label="5"]; s4 [label="2"]; s5 [label="7"]; head[label=head]; { rank="same"; head->s0[dir=both] head->s5[dir=both] s0->s1 s1->s0 s1->s2 s2->s1 s2->s3 s3->s2 s3->s4 s4->s3 s4->s5 s5->s4 } pivot->s0 L->s2 R->s4 } ``` ```graphviz digraph main { node[shape=plaintext] pivot [fontcolor="red"]; L [fontcolor="blue",label=left]; R [fontcolor="blue",label=right]; node[shape=box] s0 [label="4"]; s1 [label="1"]; s2 [label="2"]; s3 [label="5"]; s4 [label="6"]; s5 [label="7"]; head[label=head]; { rank="same"; head->s0[dir=both] head->s5[dir=both] s0->s1 s1->s0 s1->s2 s2->s1 s2->s3 s3->s2 s3->s4 s4->s3 s4->s5 s5->s4 } pivot->s0 L->s2 R->s4 } ``` 將 `left` 指向下一個節點,5 比 4 大於是檢查 `right` 指向的節點 6,因為不滿足條件所以將 `right` 指向前一個節點 ```graphviz digraph main { node[shape=plaintext] pivot [fontcolor="red"]; L [fontcolor="blue",label=left]; R [fontcolor="blue",label=right]; node[shape=box] s0 [label="4"]; s1 [label="1"]; s2 [label="2"]; s3 [label="5"]; s4 [label="6"]; s5 [label="7"]; head[label=head]; { rank="same"; head->s0[dir=both] head->s5[dir=both] s0->s1 s1->s0 s1->s2 s2->s1 s2->s3 s3->s2 s3->s4 s4->s3 s4->s5 s5->s4 } pivot->s0 L->s3 R->s3 } ``` 最後將 `pivot` 與 `left` 的前一個節點交換,並記錄下次迭代 `left`, `right` 的位置。 ```graphviz digraph main { node[shape=plaintext] b1 [fontcolor="blue",label="begin[i]"]; b2 [fontcolor="blue",label="begin[i+2]"]; pivot [fontcolor="red",label="begin[i+1], end[i+2]"]; e1 [fontcolor="blue",label="end[i]"]; e2 [fontcolor="blue",label="end[i+2]"]; node[shape=box] s0 [label="2"]; s1 [label="1"]; s2 [label="4"]; s3 [label="5"]; s4 [label="6"]; s5 [label="7"]; head[label=head]; { rank="same"; head->s0[dir=both] head->s5[dir=both] s0->s1 s1->s0 s1->s2 s2->s1 s2->s3 s3->s2 s3->s4 s4->s3 s4->s5 s5->s4 } b1->s0 e1->s1 pivot->s2 b2->s3 e2->s5 } ``` 兩個節點的情況 ```graphviz digraph main { node[shape=plaintext] pivot [fontcolor="red"]; L [fontcolor="blue",label=left]; R [fontcolor="blue",label=right]; node[shape=box] s0 [label="2"]; s1 [label="1"]; head[label=head]; { rank="same"; head->s0[dir=both] head->s1[dir=both] s0->s1 s1->s0 } pivot->s0 L->s0 R->s1 } ``` ```graphviz digraph main { node[shape=plaintext] pivot [fontcolor="red"]; L [fontcolor="blue",label=left]; R [fontcolor="blue",label=right]; node[shape=box] s0 [label="2"]; s1 [label="1"]; head[label=head]; { rank="same"; head->s0[dir=both] head->s1[dir=both] s0->s1 s1->s0 } pivot->s0 L->s1 R->s1 } ``` 會需要在最後加上判斷來將 `left`, `right` 指向的同一個節點和 pivot 交換 ```c if (value > list_entry(right, node_t, list)->value) swap_node(list, left, pivot); ``` ```graphviz digraph main { node[shape=plaintext] pivot [fontcolor="red"]; L [fontcolor="blue",label=left]; R [fontcolor="blue",label=right]; node[shape=box] s0 [label="1"]; s1 [label="2"]; head[label=head]; { rank="same"; head->s0[dir=both] head->s1[dir=both] s0->s1 s1->s0 } pivot->s0 L->s1 R->s1 } ``` 實作上遇到的問題:因為是雙線鏈結串列,所以要隨時記得更新 `left`, `right` 的值,這樣會很麻煩,要考慮很多特殊情況,到這邊有點卡關,上面的方法不可行,但如果是矩陣就會方便很多。 > [醜陋的結果](https://gist.github.com/yenslife/deb45e747f8821f9b0b914f1eabf943c) 舉例以下情況,假設我想將 pivot 和 target 交換 ```graphviz digraph main { node[shape=plaintext] pivot [fontcolor="red"]; target [fontcolor="blue",label=target]; node[shape=box] s0 [label="4"]; s1 [label="1"]; s2 [label="2"]; s3 [label="3"]; s4 [label="8"]; s5 [label="7"]; head[label=head]; { rank="same"; head->s0[dir=both] head->s5[dir=both] s0->s1 s1->s0 s1->s2 s2->s1 s2->s3 s3->s2 s3->s4 s4->s3 s4->s5 s5->s4 } pivot->s0 // L->s2 target->s3 } ``` 因為紀錄的是記憶體位址而不是陣列的 "index",除了交換節點以外,還要去更改指標指向的位置,這就會導致很多例外情況要處理。 ```graphviz digraph main { node[shape=plaintext] target [fontcolor="blue",label=target]; pivot [fontcolor="red"]; node[shape=box] s0 [label="3"]; s1 [label="1"]; s2 [label="2"]; s3 [label="4"]; s4 [label="8"]; s5 [label="7"]; head[label=head]; { rank="same"; head->s0[dir=both] head->s5[dir=both] s0->s1 s1->s0 s1->s2 s2->s1 s2->s3 s3->s2 s3->s4 s4->s3 s4->s5 s5->s4 } pivot->s3 // L->s2 target->s0 } ``` 所以要改寫成 list API 風格可能要把單向鏈結的想法全部消除。舉例來說:`end` 陣列可能再不需要了,因為雙向鏈結串列的尾端很容易取得;要判斷是否剩下一個節點也只需要用 `list_is_singular` 巨集即可。參考 [yeh-sudo](https://hackmd.io/@yehsudo/linux2024-homework2#%E6%B8%AC%E9%A9%97-1) 的做法來寫會比較好,上面實作想法並沒有完全利用 list API 的優勢。 - 使用 `list_for_each_entry_safe` 來取代 while 迴圈 - 不再需要 `end` 陣列 我的 `new_list` 的實作 ```c struct list_head *new_list() { struct list_head *list = malloc(sizeof(struct list_head)); INIT_LIST_HEAD(list); return list; } ``` 改寫後,需要特別注意的是釋放記憶體的部分,用 `valgrind --leak-check=full ./a.out` 來查看記憶體洩漏狀況,發現有 53 個 block 沒有被釋放 ``` ==16617== HEAP SUMMARY: ==16617== in use at exit: 848 bytes in 53 blocks ==16617== total heap usage: 65 allocs, 12 frees, 2,136 bytes allocated ``` 在節點為空或者剩下一個節點的時候釋放記憶體 ```diff } else { if (list_is_singular(L)) { list_splice(L, result); } + free(L); + free(left); + free(right); i--; } } list_splice(result, head); + free(result); ``` 查看記憶體狀況,沒有多餘的釋放操作,但還是有 6 個 block 沒有檢查到,一個是 16 byte 與 `struct list_head` 相同。 ``` ==20547== ==20547== HEAP SUMMARY: ==20547== in use at exit: 96 bytes in 6 blocks ==20547== total heap usage: 65 allocs, 59 frees, 2,136 bytes allocated ==20547== ==20547== 32 bytes in 2 blocks are definitely lost in loss record 1 of 2 ==20547== at 0x4885118: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so) ==20547== by 0x10904B: new_list (in /Users/mac/test/linux2024/a.out) ==20547== by 0x1091FF: quick_sort (in /Users/mac/test/linux2024/a.out) ==20547== by 0x1094CB: main (in /Users/mac/test/linux2024/a.out) ==20547== ==20547== 64 bytes in 4 blocks are definitely lost in loss record 2 of 2 ==20547== at 0x4885118: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so) ==20547== by 0x10904B: new_list (in /Users/mac/test/linux2024/a.out) ==20547== by 0x109207: quick_sort (in /Users/mac/test/linux2024/a.out) ==20547== by 0x1094CB: main (in /Users/mac/test/linux2024/a.out) ==20547== ==20547== LEAK SUMMARY: ==20547== definitely lost: 96 bytes in 6 blocks ==20547== indirectly lost: 0 bytes in 0 blocks ==20547== possibly lost: 0 bytes in 0 blocks ==20547== still reachable: 0 bytes in 0 blocks ==20547== suppressed: 0 bytes in 0 blocks ==20547== ==20547== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0) ``` 最後的程式碼在[這邊](https://gist.github.com/yenslife/d117b92fead970400be9acf3c4d1ac7d),希望有人能幫忙找到那消失的 6 個 blocks,可以用 `list_print` 來查看整個 list 來找線索 。 ## 第一週測驗題 2 Timsort ## 第二週測驗題 1 [quiz2](https://hackmd.io/@sysprog/linux2024-quiz2) 參考:[Linux 核心的 hash table 實作](https://hackmd.io/@sysprog/linux-hashtable) 題目是給定前序 (preorder) 和中序 (inorder),我們要用這些資訊來重新建構一個二元樹 ### `hlist_node` 操作 `hlist_head` 為 hash table `hlist_node` 則是 hash table 指向的節點們 ```c struct hlist_node { struct hlist_node *next, **pprev; } ``` `list_add_head` 在雜湊表的開頭插入一個節點」 ```c static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h) { if (h->first) h->first->pprev = &n->next; n->next = AAAA; n->pprev = &h->first; h->first = n; } ``` 有一段比較不直覺的程式碼,因為牽扯到 indirect pointer ```c if (h->first) h->first->pprev = &n->next; ``` 要先知道 `h->first` 表示第一個 `hlist_node`,`pprev` 則是指向一個指向 `hlist_node` 的指標,`&n->next` 表示 n 這個節點的下一個節點的記憶體位址,也就是 `*pprev` 存的內容。 一開始是這樣 ```graphviz digraph G { rankdir = LR; splines = false; node[shape = "record"] list_head[label = "<m>list_head | <n>first"] node_1[label = "<m>hlist_node_1 | {<p>pprev | <n>next}", group = list]; node_2[label = "<m>n | {<p>pprev | <n>next}", group = list]; NULL[shape = plaintext, label = "NULL", group = list] {rank = "min" list_head} list_head -> node_1 -> node_2[ weight = 10, style = invis ] list_head:n -> node_1:m; node_1:p -> list_head:n; node_1:n -> NULL } ``` 經過這一段程式碼之後會變成這樣 ```graphviz digraph G { rankdir = LR; splines = false; node[shape = "record"] list_head[label = "<m>list_head | <n>first"] node_1[label = "<m>hlist_node_1 | {<p>pprev | <n>next}", group = list]; node_2[label = "<m>n | {<p>pprev | <n>next}", group = list]; NULL[shape = plaintext, label = "NULL", group = list] {rank = "min" list_head} list_head -> node_2 -> node_1[ style = invis weight = 10 ] list_head:n -> node_1:m[constraint=false]; node_1:p -> node_2:n node_1:n -> NULL } ``` 然後是後續的程式碼 ```c n->next = h->first; n->pprev = &h->first; h->first = n; ``` ```graphviz digraph G { rankdir = LR; splines = false; node[shape = "record"] list_head[label = "<m>list_head | <n>first"] node_1[label = "<m>hlist_node_1 | {<p>pprev | <n>next}", group = list]; node_2[label = "<m>n | {<p>pprev | <n>next}", group = list]; NULL[shape = plaintext, label = "NULL", group = list] {rank = "min" list_head} // list_head [weight = 10] // node_1 [weight = 10] // node_2 [weight = 10] // NULL [weight = 10] list_head -> node_2 -> node_1[ style = invis weight = 10 ] list_head:n -> node_2:m[constraint=false]; node_1:p -> node_2:n node_2:n ->node_1:m node_2:p -> list_head:n node_1:n -> NULL } ``` ### Tree 相關資料結構 `TreeNode` 是我們要建構樹的基本單元 ```c struct TreeNode { int val; struct TreeNode *left, *right; }; ``` `order_node` 紀錄節點的值和索引,內含一個 `hlist_node` ```c struct order_node { struct hlist_node node; int val; int idx; }; ``` ### Tree 操作 `find`: 傳入節點的值並計算雜湊值,從雜湊表中找到節點的索引編號 (idx) ```c static int find(int num, int size, const struct hlist_head *heads) { struct hlist_node *p; int hash = (num < 0 ? -num : num) % size; hlist_for_each (p, &heads[hash]) { struct order_node *on = list_entry(p, struct order_node, node); if (num == on->val) return on->idx; } return -1; } ``` `node_add` 在雜湊表中新增一個節點 ```c static inline void node_add(int val, int idx, int size, struct hlist_head *heads) { struct order_node *on = malloc(sizeof(*on)); on->val = val; on->idx = idx; int hash = (val < 0 ? -val : val) % size; hlist_add_head(&on->node, &heads[hash]); } ``` ### build tree 和 dfs build tree 會先將 inorder 陣列的內容新增到雜湊表上,然後開始做 DFS。 遞迴呼叫 DFS 來建構這顆樹,preorder 順序是中->左->右,可以把每次遞迴呼叫傳入的 `preorder` 的第一個元素當成根,然後根據 inorder 和 perorder 的範圍來判斷走訪的這個節點存不存在,如果不存在則代表子節點為 NULL。 ```c if (in_low > in_high || pre_low > pre_high) return NULL; struct TreeNode *tn = malloc(sizeof(*tn)); tn->val = preorder[pre_low]; int idx = find(preorder[pre_low], size, in_heads); tn->left = dfs(preorder, pre_low + 1, pre_low + (idx - in_low), inorder, in_low, idx - 1, in_heads, size); tn->right = dfs(preorder, pre_high - (in_high - idx - 1), pre_high, inorder, idx + 1, in_high, in_heads, size); return tn; ``` ### 測試 用測驗題中的範例來測試 > [程式碼](https://gist.github.com/yenslife/be9d4cf827dc52318945c5402782fc51) ### 改進 原本的程式碼沒有檢查 malloc 成功與否,可以在所有 malloc 之前都先檢查一次 ### Linux cgroups [cgroup](https://man7.org/linux/man-pages/man7/cgroups.7.html) 全名為 Linux control group,用來將 process 以層級的方式管理,透過一個虛擬的檔案系統 cgroupfs 為介面來管理 [source code](https://github.com/torvalds/linux/blob/master/kernel/cgroup/cgroup.c) 在第 4573 行有關於 pre-order walk 的介紹,透過 `cgroup_subsys_state` 來走訪 ```c struct cgroup_subsys_state * css_next_descendant_pre(struct cgroup_subsys_state *pos, struct cgroup_subsys_state *root) ``` ##