# 2024q1 Homework2 (quiz1+2) contributed by < `yu-hsiennn` > ## quiz 1 > [題目](https://hackmd.io/@sysprog/linux2024-quiz1) ### 測驗 1 #### 函式功用 - `void list_add(node_t **list, node_t *node_t)`: 串 `node_t` 到 `*list` 前方 - `node_t *list_tail(node_t **left)`: 回傳此串列的最後節點 - `int list_length(node_t **left)`: 計算此串列的長度 - `node_t *list_construct(node_t *list, int n)`: 串列初始化 - `void list_free(node_t **list)`: 釋放串列的記憶體空間 - `static bool list_is_ordered(node_t *list)`: 串列是否由小到大排序 - `void shuffle(int *array, size_t n)`: 將陣列依據其大小打亂裡面的值 #### 解題過程 ```c node_t *list_tail(node_t **left) { while ((*left) && (*left)->next) left = &(AAAA); return *left; } ``` - 由函式名稱可以推得,要找的是使串列的尾巴,所以需要跑完整條鏈結串列,故要更新 left 指向的位置, `AAAA` 則為下一個節點的位置,即 `(*left)->next` ```c int list_length(node_t **left) { int n = 0; while (*left) { ++n; left = &(BBBB); } return n; } ``` - 由函式名稱可以推得是要找此串列的長度,故要更新 left 指向的位置, `AAAA` 則為下一個節點的位置,即 `(*left)->next` ```c while (p) { node_t *n = p; p = CCCC; list_add(n->value > value ? &right : &left, n); } begin[i] = left; end[i] = DDDD; begin[i + 1] = pivot; end[i + 1] = pivot; begin[i + 2] = right; end[i + 2] = EEEE; ``` - 這邊的 `while loop` 是在遍歷整條鏈結串列,故 `p` 應為 `p->next` - `DDDD`,`EEEE` 都是要找目前子串列的尾巴,即 `list_tail(left)`,`list_tail(right)` #### quick_sort 流程如下: 1. 計算此串列長度 `n` 2. 創兩個大小為 `n` 的指標陣列 `begin` , `end` ,初始值設為 `begin[0] = *list, end[0] = list_tail(list);` 3. 判斷是否 i >= 0,是則進入迴圈 4. 將 `L` 設為 begin[i], `R` 設為 end[i],並判斷 `L` 是否等於 `R` 5. 若不相等則將 `L` 加進最終結果;相等則把 `L` 當為 `pivot` 6. 跑過所有陣列並將小於 `pivot` 的串到 `left` 大於的串到 `right` 7. 更新 `begin` 陣列內容 8. 重覆 2.~7. 直到跳出迴圈 9. 把結果只回 list 完成排序 以下用圖解來說明: - Round 1 ```graphviz digraph { node[shape=record]; graph[pencolor=transparent]; rankdir=LR; p1[label="{<data> 4|<next>}"]; p2[label="{<data> 1|<next>}"]; p3[label="{<data> 3|<next>}"]; p4[label="{<data> 5|<next>}"]; p5[label="{<data> 2|<next>}"]; p6[label="{<data> 7|<next>}"]; L[shape=none]; R[shape=none]; pivot[shape=none]; L -> p1; pivot -> p1 R -> p6; // {rank=same; L -> p1;} edge[tailclip=false,arrowtail=dot,dir=both]; p1:next:c -> p2:data; p2:next:c -> p3:data; p3:next:c -> p4:data; p4:next:c -> p5:data; p5:next:c -> p6:data; } ``` ```graphviz digraph { node[shape=record]; graph[pencolor=transparent]; rankdir=TB; "Begin[0]" -> 2 "End[0]" -> 1 "Begin[1]" -> 4 "End[1]" -> 4 "Begin[2]" -> 7 "End[2]" -> 5 } ``` - Round 2 ```graphviz digraph { node[shape=record]; graph[pencolor=transparent]; rankdir=LR; p1[label="{<data> 7|<next>}"]; p2[label="{<data> 5|<next>}"]; L[shape=none]; R[shape=none]; pivot[shape=none]; L -> p1; pivot -> p1 R -> p2; edge[tailclip=false,arrowtail=dot,dir=both]; p1:next:c -> p2:data; } ``` ```graphviz digraph { node[shape=record]; graph[pencolor=transparent]; rankdir=TB; "Begin[0]" -> 2 "End[0]" -> 1 "Begin[1]" -> 4 "End[1]" -> 4 "Begin[2]" -> 5 "End[2]" -> 5 "Begin[3]" -> 7 "End[3]" -> 7 } ``` - Round 3 ~ 5 (從 `Begin` 及 `End` 右邊,相等則加進 `Result` 的前面) ```graphviz digraph { node[shape=record]; graph[pencolor=transparent]; rankdir=TB; "Result" -> 4 -> 5 -> 7 "Begin[0]" -> 2 "End[0]" -> 1 } ``` - Round 6 ```graphviz digraph { node[shape=record]; graph[pencolor=transparent]; rankdir=LR; p1[label="{<data> 2|<next>}"]; p2[label="{<data> 3|<next>}"]; p3[label="{<data> 1|<next>}"]; L[shape=none]; R[shape=none]; pivot[shape=none]; L -> p1; pivot -> p1 R -> p3; edge[tailclip=false,arrowtail=dot,dir=both]; p1:next:c -> p2:data; p2:next:c -> p3:data; } ``` ```graphviz digraph { node[shape=record]; graph[pencolor=transparent]; rankdir=TB; "Result" -> 4 -> 5 -> 7 "Begin[0]" -> 1 "End[0]" -> 1 "Begin[1]" -> 2 "End[1]" -> 2 "Begin[2]" -> 3 "End[2]" -> 3 } ``` - 最終結果 ```graphviz digraph { node[shape=record]; graph[pencolor=transparent]; rankdir=LR; "Result"[shape=none]; "Result" -> 1 -> 2 -> 3 -> 4 -> 5 -> 7 } ``` #### 改進 quick_sort - 空間可以不用開到 2 * n ```diff void quick_sort(node_t **list) { int n = list_length(list); int value; int i = 0; - int max_level = 2 * n; + int max_level = n; node_t *begin[max_level], *end[max_level]; node_t *result = NULL, *left = NULL, *right = NULL; begin[0] = *list; end[0] = list_tail(list); while (i >= 0) { node_t *L = begin[i], *R = end[i]; if (L != R) { node_t *pivot = L; value = pivot->value; node_t *p = pivot->next; pivot->next = NULL; while (p) { node_t *n = p; p = p->next; list_add(n->value > value ? &right : &left, n); } + if (left) { + begin[i] = left; + end[i++] = list_tail(&left); + } + begin[i] = pivot; + end[i++] = pivot; + if (right) { + begin[i] = right; + end[i++] = list_tail(&right); + } - begin[i] = left; - end[i] = DDDD; - begin[i + 1] = pivot; - end[i + 1] = pivot; - begin[i + 2] = right; - end[i + 2] = EEEE; left = right = NULL; - i += 2; } else { - if (L) list_add(&result, L); i--; } } *list = result; } ``` #### 使用 **Linux 核心** `List API` 改寫 - 最差狀況: 串列數值為小到大,或是大到小 - TODO: List API --- ### 測驗 2 #### 函式功用 - `static void create_sample(struct list_head *head, element_t *space, int samples)`: 創造一個隨機串列 - `static void copy_list(struct list_head *from, struct list_head *to, element_t *space)`: 複製 `from` 的串列到 `to` 的串列上 - `int compare(void *priv, const struct list_head *a, const struct list_head *b)`: 用來當作排序的依據 - `bool check_list(struct list_head *head, int count)`: 確認串列是否排序好且是穩定的排序 #### Timsort 特性 - 有穩定性 - `merge` 採用2的冪效果最佳 - 每個 `run` 都會呈現小到大 - 設定 `minrun` 來限制每個 `run` 的**最小**長度,若不符合要求,則利用二分搜尋法插入進適合的位置 - `minrun` 會從 32~64 當中挑選一個數字,使其長度除以 `minrun` 可以等於或是略小於2的冪 - 合併採取 $A > B + C$ 及 $B > C$ ,只要不滿足任何一式,就會進行合併 #### 解題過程 ```c static struct list_head *merge(void *priv, list_cmp_func_t cmp, struct list_head *a, struct list_head *b) { struct list_head *head; struct list_head **tail = AAAA; for (;;) { /* if equal, take 'a' -- important for sort stability */ if (cmp(priv, a, b) <= 0) { *tail = a; tail = BBBB; a = a->next; if (!a) { *tail = b; break; } } else { *tail = b; tail = CCCC; b = b->next; if (!b) { *tail = a; break; } } } return head; } ``` - `AAAA` 應為 `head` 的起始值,也就是 `&head` ,後續再改變指向的位置即可 - `BBBB` 及 `CCCC` 都是再串接新節點後要更新 `tail` 的最後位置,即 `&a->next`,`&b->next` ```c static void build_prev_link(struct list_head *head, struct list_head *tail, struct list_head *list) { tail->next = list; do { list->prev = tail; tail = list; list = list->next; } while (list); /* The final links to make a circular doubly-linked list */ DDDD = head; EEEE = tail; } ``` - 根據註解 `The final links to make a circular doubly-linked list` ,可以知道是要將整條串列變成有迴圈的雙向鏈結串列,而再觀察上面的 `do while` 可以知道目前 `tail` 的位置是串列最後一個節點,所以 `DDDD` 應為 `tail->next`, `EEEE` 應為 `head->prev` ```c /* The final merge; rebuild prev links */ struct list_head *stk0 = tp, *stk1 = stk0->prev; while (stk1 && stk1->prev) stk0 = stk0->prev, stk1 = stk1->prev; if (stk_size <= FFFF) { build_prev_link(head, head, stk0); return; } merge_final(priv, cmp, head, stk1, stk0); ``` - 根據判斷是可以知道若小於則不做合,只需要重建鏈結串列節點的關係即可,故可以推得 `FFFF` 為 `1` #### 改進程式碼 ```diff static struct list_head *merge(void *priv, list_cmp_func_t cmp, struct list_head *a, struct list_head *b) { struct list_head *head; struct list_head **tail = AAAA; - for (;;) { - /* if equal, take 'a' -- important for sort stability */ - if (cmp(priv, a, b) <= 0) { - *tail = a; - tail = BBBB; - a = a->next; - if (!a) { - *tail = b; - break; - } - } else { - *tail = b; - tail = CCCC; - b = b->next; - if (!b) { - *tail = a; - break; - } - } - } + while (a && b) { + if (cmp(priv, a, b) <= 0) { + *tail = a; + a = a->next; + } else { + *tail = b; + b = b->next; + } + tail = &(*tail)->next; + } + *tail = (a) ? a : b; return head; } ``` - 重構部分重複的部分,減少迴圈內 `if` 的判斷次數 #### 整合進 `lab0-c` - TODO ## quiz 2 > [題目](https://hackmd.io/@sysprog/linux2024-quiz2) ### 測驗 1 #### 解題過程 ```c static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h) { if (h->first) h->first->pprev = &n->next; n->next = AAAA; n->pprev = &h->first; h->first = n; } ``` - 根據函式名稱可以推測出是要把新的節點加進串列中,所以 `AAAA` 應為 `h->first` ```graphviz digraph hlist_add_head { rankdir=LR; node [shape=record, fontname=Helvetica, fontsize=10]; label="Before Adding New Node"; color=lightgrey; head[label="<f0> head | <f1> first"]; nodeA[label="{<p> pprev | <n> Node A | <next> next}"]; nodeB[label="{<p> pprev | <n> Node B | <next> next}"]; null1[shape=none, label=""]; head:f1 -> nodeA:n; nodeA:next -> nodeB:n; nodeB:next -> null1; nodeA:p -> head:f1; nodeB:p -> nodeA:next; } ``` ```graphviz digraph hlist_add_head { rankdir=LR; node [shape=record, fontname=Helvetica, fontsize=10]; label="After Adding New Node"; color=lightgrey; head_new[label="<f0> head | <f1> first", style=filled, fillcolor=lightblue]; newNode[label="{<p> pprev | <n> New Node | <next> next}", style=filled, fillcolor=lightblue]; nodeA_new[label="{<p> pprev | <n> Node A | <next> next}", style=filled, fillcolor=lightblue]; nodeB_new[label="{<p> pprev | <n> Node B | <next> next}", style=filled, fillcolor=lightblue]; null2[shape=none, label=""]; head_new:f1 -> newNode:n [color=red]; newNode:next -> nodeA_new:n [color=red]; nodeA_new:next -> nodeB_new:n; nodeB_new:next -> null2; newNode:p -> head_new:f1 [color=red]; nodeA_new:p -> newNode:next [color=red]; nodeB_new:p -> nodeA_new:next; } ``` ```c static int find(int num, int size, const struct hlist_head *heads) { struct hlist_node *p; int hash = (num < 0 ? -num : num) % size; hlist_for_each (p, BBBB) { struct order_node *on = CCCC(p, struct order_node, node); if (num == on->val) return on->idx; } return -1; } ``` - `BBBB` 是要遍歷每個節點,故應該要用 `&heads[hash]`,而 `CCCC` 是要進去 `entry` 裡面取值,所以應該為 `list_entry` ```c static inline void node_add(int val, int idx, int size, struct hlist_head *heads) { struct order_node *on = malloc(sizeof(*on)); on->val = val; on->idx = idx; int hash = (val < 0 ? -val : val) % size; hlist_add_head(&on->node, DDDD); } ``` - 將新的節點加進串列裡面,`DDDD` 應為 `&heads[hash]` ```graphviz digraph node_add { rankdir=LR; node [shape=record, fontname=Helvetica, fontsize=10]; label="Hash Table Before Adding New Node"; color=lightgrey; head0[label="<f0> Bucket 0 | <f1>"]; head1[label="<f0> Bucket 1 | <f1>"]; head2[label="<f0> Bucket 2 | <f1> first"]; head3[label="<f0> Bucket 3 | <f1>"]; nodeA[label="{<p> pprev | <n> Node A (Val X) | <next> next}"]; nodeB[label="{<p> pprev | <n> Node B (Val Y) | <next> next}"]; null1[shape=none, label=""]; head2:f1 -> nodeA:n; nodeA:p -> head2:f1 nodeA:next -> nodeB:n; nodeB:p -> nodeA:next; nodeB:next -> null1; } ``` ```graphviz digraph node_add { rankdir=LR; node [shape=record, fontname=Helvetica, fontsize=10]; label="Hash Table After Adding New Node"; color=lightgrey; head0_new[label="<f0> Bucket 0 | <f1>", style=filled, fillcolor=lightblue]; head1_new[label="<f0> Bucket 1 | <f1>", style=filled, fillcolor=lightblue]; head2_new[label="<f0> Bucket 2 | <f1> first", style=filled, fillcolor=lightblue]; head3_new[label="<f0> Bucket 3 | <f1>", style=filled, fillcolor=lightblue]; newNode[label="{<p> pprev | <n> New Node (Val Z) | <next> next}", style=filled, fillcolor=lightblue]; nodeA_new[label="{<p> pprev | <n> Node A (Val X) | <next> next}", style=filled, fillcolor=lightblue]; nodeB_new[label="{<p> pprev | <n> Node B (Val Y) | <next> next}", style=filled, fillcolor=lightblue]; null2[shape=none, label="...", style=filled, fillcolor=lightblue]; head2_new:f1 -> newNode:n [color=red]; newNode:p -> head2_new:f1 [color=red]; newNode:next -> nodeA_new:n [color=red]; nodeA_new -> newNode:next [color=red]; nodeA_new:next -> nodeB_new:n; nodeB_new:p -> nodeA_new:next; nodeB_new:next -> null2; } ``` ### 測驗 2 #### 解題過程 - `LRUChach`: 用於 `LRU` 緩存機制的主體結構。包含下列幾個元素 - `capacity`: 緩存的最大容量 - `count`: 當前緩存中項目的數量 - `dhead`: 一個雙向鏈結串列的頭節點,用於按最近使用順序維持其緩存項目 - `hhead[]`: 一個 `hash` 頭節點的數組,用於快速訪問緩存項目 - `LRUNode`: 是緩存中每個項目的結構,包含每個緩存項目的資料 - `key`: 鍵值 - `value`: 數值 - `node`: 用於將此緩存項目連接到 `hash` 的結構 - `link`: 用於將此緩存項目連接到雙向鏈結的結構 - 透過 `hhead[]` 中的 `hash table` 可以快速定位一個特定的 `LRUNode` - `dhead` 用於指向最近被訪問的值 - 關係圖如下 ```graphviz digraph LRU_Cache { rankdir=LR; node [shape=record, style=filled, fillcolor=gray95]; LRUCache [label="{LRUCache|{capacity|count|dhead|<hhead> hhead[]}}"]; LRUNode [label="{LRUNode|{key|value|<node>node|<link>link}}"]; Array [label="Array of hlist_head", shape=plaintext]; List [label="Doubly Linked List (list_head)", shape=plaintext]; LRUCache:hhead -> LRUNode:n [label="Hash Table Linkage"]; LRUCache -> LRUNode:link [label="List Ordering Linkage"]; Array -> LRUNode:n [label="Hashing", color="red"]; List -> LRUNode:link [label="Ordering", color="blue"]; } ``` ```c void hlist_del(struct hlist_node *n) { struct hlist_node *next = n->next, **pprev = n->pprev; *pprev = next; if (next) EEEE = pprev; } ``` - 藉由函式名稱可以知道是要用來移除某個節點,而 `*pprev` 為前一個節點,所以應該更新 `next` 的 `prev`, `EEEE` 即為 `next->pprev` ```graphviz digraph hlist_del { rankdir=LR; node [shape=record, fontname=Helvetica, fontsize=10]; label="Before Deletion"; color=lightgrey; head[label="<f0> head | <f1> first"]; nodeA[label="{<p> pprev | <n> Node A | <next> next}"]; nodeB[label="{<p> pprev | <n> Node B (delete) | <next> next}", style=filled, fillcolor=red]; nodeC[label="{<p> pprev | <n> Node C | <next> next}"]; null1[shape=none, label=""]; head:f1 -> nodeA:n; nodeA:next -> nodeB:n; nodeB:next -> nodeC:n; nodeC:next -> null1; nodeA:p -> head:f1; nodeB:p -> nodeA:next; nodeC:p -> nodeB:next; } ``` ```graphviz digraph hlist_del { rankdir=LR; node [shape=record, fontname=Helvetica, fontsize=10]; label="After Deletion"; color=lightgrey; head_new[label="<f0> head | <f1> first", style=filled, fillcolor=lightblue]; nodeA_new[label="{<p> pprev | <n> Node A | <next> next}", style=filled, fillcolor=lightblue]; nodeC_new[label="{<p> pprev | <n> Node C | <next> next}", style=filled, fillcolor=lightblue]; null2[shape=none, label=""]; head_new:f1 -> nodeA_new:n [color=blue]; nodeA_new:next -> nodeC_new:n [color=blue]; nodeC_new:next -> null2; nodeA_new:p -> head_new:f1 [color=blue]; nodeC_new:p -> nodeA_new:next [color=blue, label="pprev updated", fontcolor=blue]; } ``` ```c void lRUCacheFree(LRUCache *obj) { struct list_head *pos, *n; list_for_each_safe (pos, n, &obj->dhead) { LRUNode *cache = list_entry(pos, LRUNode, FFFF); list_del(GGGG); free(cache); } free(obj); } ``` - `FFFF` 為 `pos` 裡面的 `LRUNode` 中的 `link` - `GGGG` 要將此節點從串列中移除,故 `&cache->link` ```c int lRUCacheGet(LRUCache *obj, int key) { int hash = key % obj->capacity; struct hlist_node *pos; hlist_for_each (pos, &obj->hhead[hash]) { LRUNode *cache = list_entry(pos, LRUNode, HHHH); if (cache->key == key) { list_move(IIII, &obj->dhead); return cache->value; } } return -1; } void lRUCachePut(LRUCache *obj, int key, int value) { LRUNode *cache = NULL; int hash = key % obj->capacity; struct hlist_node *pos; hlist_for_each(pos, &obj->hhead[hash]) { LRUNode *c = list_entry(pos, LRUNode, JJJJ); if (c->key == key) { list_move(KKKK, &obj->dhead); cache = c; } } if (!cache) { if (obj->count == obj->capacity) { cache = list_last_entry(&obj->dhead, LRUNode, link); list_move(&cache->link, &obj->dhead); hlist_del(&cache->node); hlist_add_head(&cache->node, &obj->hhead[hash]); } else { cache = malloc(sizeof(LRUNode)); hlist_add_head(&cache->node, &obj->hhead[hash]); list_add(&cache->link, &obj->dhead); obj->count++; } cache->key = key; } cache->value = value; } ``` - 根據函式名稱可以得知是要找快取中是否有這個 `key` 值,而 `pos` 會跑雜湊位置中的每個節點,而 `HHHH`、`JJJJ` 都會對應到 `hlist_node` 的內容,根據 `LRUNode` 結構可以知道兩個都為 `node` - `IIII`、`KKKK` 用於將他們的 `link` 移到 `dhead` 的下一個位置,故為 `&cache->link`、`&c->link` #### TODO (find LRU code in Linux kernel) ### 測驗 3 #### 解題過程 首先,先觀察巨集的功用 - `BITMAP_LAST_WORD_MASK(bits)`: 利用遮罩的方式,來把用不到的位元給過濾掉,因位元的大小可能不是剛好 `64-bit` - `__const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) >> 32))`: 計算 `Hamming` 權重,即二進位有幾個 `1` ``` # example: word = 1010101010101010101010101010101010101010101010101010101010101010 (64 bits) Step 1: __const_hweight64 Lower 32 bits: 10101010101010101010101010101010 Upper 32 bits: 10101010101010101010101010101010 Step 2: __const_hweight32 For both Lower and Upper 32-bit parts: Split into 16-bit parts. Lower 16 bits: 1010101010101010 Upper 16 bits: 1010101010101010 Step 3: __const_hweight16 For both Lower and Upper 16-bit parts of each 32-bit part: Split into 8-bit parts. Lower 8 bits: 10101010 Upper 8 bits: 10101010 Step 4: __const_hweight8 For each 8-bit part: Bit 0: 1 (1) Bit 1: 0 (0) Bit 2: 1 (1) Bit 3: 0 (0) Bit 4: 1 (1) Bit 5: 0 (0) Bit 6: 1 (1) Bit 7: 0 (0) The Hamming weight for each 8-bit part is 4 (since there are four 1s). Step 5: Sum up the Hamming weights Each 16-bit part has two 8-bit parts, so its Hamming weight is 4 + 4 = 8. Each 32-bit part has two 16-bit parts, so its Hamming weight is 8 + 8 = 16. Finally, the 64-bit word has two 32-bit parts, so its total Hamming weight is 16 + 16 = `32`. ``` ```c static inline unsigned long __ffs(unsigned long word) { int num = 0; #if BITS_PER_LONG == 64 if ((word & AAAA) == 0) { num += 32; word >>= 32; } #endif if ((word & 0xffff) == 0) { num += 16; word >>= 16; } if ((word & 0xff) == 0) { num += 8; word >>= 8; } if ((word & 0xf) == 0) { num += 4; word >>= 4; } if ((word & 0x3) == 0) { num += 2; word >>= 2; } if ((word & 0x1) == 0) num += 1; return num; } ``` - 由下面的程式可以推論, `AAAA` 應為 `0xffffffff` - 作用為找到第一個 `1` 的索引值 (右到左) ```c static inline void __clear_bit(unsigned long nr, volatile unsigned long *addr) { unsigned long mask = BIT_MASK(nr); // 把第 nr 個 bit 設為 1 unsigned long *p = ((unsigned long *) addr) + BIT_WORD(nr); // 指向正確要清除的數的位址 *p &= BBBB; } ``` - `BBBB` 將 `mask` 反向即可清除第 `nr` 個位元 ```c #define FIND_NTH_BIT(FETCH, size, num) \ ({ \ unsigned long sz = (size), nr = (num), idx, w, tmp; \ \ for (idx = 0; (idx + 1) * BITS_PER_LONG <= sz; idx++) { \ if (idx * BITS_PER_LONG + nr >= sz) \ goto out; \ \ tmp = (FETCH); \ w = hweight_long(tmp); \ if (w > nr) \ goto found; \ \ nr -= w; \ } \ \ if (sz CCCC BITS_PER_LONG) \ tmp = (FETCH) & BITMAP_LAST_WORD_MASK(sz); \ found: \ sz = min(idx * BITS_PER_LONG + fns(tmp, nr), sz); \ out: \ sz; \ }) ``` - 為了不讓 `sz` 超過 `BITS_PER_LONG`,故要做 `%` - 用於找到第 `N` 個設置為 `1` 的位元