2024q1 Homework2 (quiz1+2)

# 2024q1 Homework2 (quiz1+2) contributed by < `kevinzxc1217` > ## 第一週測驗題 ### 測驗一 ```c node_t *list_tail(node_t **left) { while ((*left) && (*left)->next) left = &(AAAA); return *left; } ``` 這裡一開始我的想法會是填 `left -> next` ，不斷的往下走訪，不過要考慮到 `left` 是指標的指標，所以這裡的 `AAAA` 要填 `(*left) -> next` 。 ```c int list_length(node_t **left) { int n = 0; while (*left) { ++n; left = &(BBBB); } return n; } ``` 這裡要計算串列的長度，一樣是希望透過指標的指標去走訪整個串列，所以這裡的 `BBBB` 要填 `(*left) -> next` 。 ```c while (p) { node_t *n = p; p = CCCC; list_add(n->value > value ? &right : &left, n); } ``` ![1](https://hackmd.io/_uploads/Syg-2Jwap.png) 希望 `p` 不斷的往下移動，則 `CCCC` 要填 `p -> next` 。 ![2](https://hackmd.io/_uploads/S1fgpkDTp.png) ![4](https://hackmd.io/_uploads/B1aPTyPpa.png) ![5](https://hackmd.io/_uploads/rJLkyxwT6.png) 這裡主要是在分類指標 `n` 所指向的值 `n->value` 是否大於 `pivot` ，若大於的話分類在 `right` ，否則分類在 `left` 串列。 ```c begin[i] = left; end[i] = DDDD; begin[i + 1] = pivot; end[i + 1] = pivot; begin[i + 2] = right; end[i + 2] = EEEE; ``` ![a1](https://hackmd.io/_uploads/rywJWew66.png) ![a2](https://hackmd.io/_uploads/B1mo-lvTT.png) 這裡的 `end` 指的是 `left` 串列上的尾端，可以呼叫函式 `list_tail(node_t **left)` ，其中要注意該函式接收的參數需要是指標的指標，所以這裡的`DDDD` 要填 `list_tail(&left)` ; 而 `EEEE` 要填 `list_tail(&right)` 。 ```c if (L != R) { ... i += 2; } else { if (L) list_add(&result, L); i--; } ``` 該函式主要是用到了 `begin` 和 `end` 的 `index` 來取代quick sort的遞迴呼叫，不同的 `index` 就對應著不同的遞迴階段，其中 `i+=2` 就對應遞迴的再次呼叫本身函式，而 `i--` 就對應遞迴的 `return` 。 #### 使用 Linux 核心風格的 List API 改寫上述程式碼 >[程式連結](https://github.com/kevinzxc1217/linux2024_hw2/blob/main/quick_sort/quick_sort.c) ```c typedef struct __node { struct list_head list; long value; } node_t; ``` 將 `node_t` 內各個 node 的連結改成 List API 內部的 `list_head` 型態。 ```c void *list_construct(struct list_head *head, int n){ if(!head) return false; node_t* node = malloc(sizeof(node_t)); list_add_tail(&node -> list, head); node -> value = n; } ``` `node_t` 結構類似於 lab-0 內的 `element_t` ，可以透過 `list_add_tail` 的方式接上所新增的節點。 ```c struct list_head *list_tail(struct list_head *head) { struct list_head *tail = head -> prev; return tail; } ``` 由於改成環狀鏈結串列，故可以直接從 `head` 的前一個找到最尾巴的節點。 ```c static bool list_is_ordered(struct list_head *head) { struct list_head *cur = head -> next; while(cur != head && cur-> next != head){ node_t *n_cur = list_entry(cur,node_t,list); node_t *n_next = list_entry(cur->next,node_t,list); if(n_cur->value < n_next->value) return false; cur = cur -> next; } return true; } ``` 檢查是否有照順序排序，鏈結串列上的節點，主要是透過 `list_entry()` 的函式找到其 `value` 值，再去做比對。 ```c void quick_sort(struct list_head *list) { int n = list_length(list); int pivot_value; int i = 0; int max_level = 2 * n; struct list_head *begin[max_level], *end[max_level]; struct list_head *result = list_new(), *left = list_new(), *right = list_new(); begin[0] = list; while (i >= 0) { struct list_head *L = begin[i]; if (L ->next !=list_tail(L)) { struct list_head *pivot = list_new(); struct list_head *cur = L ->next; int max = list_length(L) -1; int min = 0; int cnt = rand() % (max - min + 1) + min; struct list_head *p = cur; while(cnt){ p = p->next; cnt--; } if(cur == p) cur = cur->next; pivot_value = list_entry(p,node_t,list)->value; list_move_tail(p, pivot); while (!list_empty(L)) { cur = cur -> next; list_move_tail(cur -> prev, list_entry(cur -> prev,node_t,list)->value > pivot_value ? right : left); } list_splice_tail_init(left,begin[i]); begin[i+1] = list_new(); begin[i+2] = list_new(); list_splice_tail_init(pivot,begin[i+1]); list_splice_tail_init(right,begin[i+2]); i += 2; } else { if (!list_empty(L)) list_splice_tail_init(L, result); i--; } } list_splice_tail_init(result,list); } ``` 這裡改寫時有幾點要注意： - 在創建新鏈結串列時應使用 `list_new()` ，已確保有分配到一個 `head` 空間。 - 原先傳入的 `**list` 是指標的指標，這裡改寫成傳入 `*list` 指標，有部分程式會受影響。 - 原本程式碼內是使用 `begin` 和 `end` 兩個指標，由於改成環狀鏈結串列，改寫後僅使用 `begin` ，可用 `list_tail()` 代替需要找尋 `end` 的程式。 - 原先的 quick_sort 程式所使用到的鏈結串列並沒有 `head` ，所以在找第一個節點時要注意。 - 這裡將鏈結串列賦予時，應使用 `list_splice()` 函式，才能確保完全合併，不能僅使用等號做賦予。 ```c int max = list_length(L) -1; int min = 0; int cnt = rand() % (max - min + 1) + min; struct list_head *p = cur; while(cnt){ p = p->next; cnt--; } if(cur == p) cur = cur->next; ``` 其中這段程式是在目前的鏈結串列中隨機挑選一個節點當作 pivot ，與每次挑選第一個節點或是最後一個節點當作 pivot 來比較，可減少 worse case 發生。這裡要注意的是程式內 `cur` 所指向的節點和 `p` 所指向的節點要不同，不然後續將 `p` 所指向的節點移至 `pivot` 串列時， `cur` 便會跟著指向，我們希望 `cur` 仍然指向原先的 `list` 串列。 ### 測驗二 ```c static struct list_head *merge(void *priv, list_cmp_func_t cmp, struct list_head *a, struct list_head *b) { struct list_head *head; struct list_head **tail = AAAA; for (;;) { /* if equal, take 'a' -- important for sort stability */ if (cmp(priv, a, b) <= 0) { *tail = a; tail = BBBB; a = a->next; if (!a) { *tail = b; break; } } else { *tail = b; tail = CCCC; b = b->next; if (!b) { *tail = a; break; } } } return head; } ``` 因為 `**tail` 為指標的指標，一開始僅有 `*head` ，故 `AAAA` 所指的位址應為 `&head` 。其中 `*tail = a` 是將 `head` 指向 `a` 所指向的串列，而 `tail` 是指向新加入串列所指向的地址，故 `BBBB` 為 `&a->next` 而 `CCCC` 類推為 `&b->next`。 ```c static void build_prev_link(struct list_head *head, struct list_head *tail, struct list_head *list) { tail->next = list; do { list->prev = tail; tail = list; list = list->next; } while (list); /* The final links to make a circular doubly-linked list */ DDDD = head; EEEE = tail; } ``` 這段程式碼主要是在建構迴圈雙向鏈結串列，簡單來看，如果沒有需要新增 `list` 的話，只需要將傳入的 `head` 和 `tail` 兩者的 `prev` 和 `next` 互指即可形成迴圈雙向鏈結串列，所以其中 `DDDD` 為 `tail->next` ，而 `EEEE ` 為 `head->prev` 。中間部分主要是將 `list` 新增到 `tail` 後面，且形式為迴圈雙向鏈結串列。 ```c void timsort(void *priv, struct list_head *head, list_cmp_func_t cmp) { stk_size = 0; struct list_head *list = head->next, *tp = NULL; if (head == head->prev) return; /* Convert to a null-terminated singly-linked list. */ head->prev->next = NULL; do { /* Find next run */ struct pair result = find_run(priv, list, cmp); result.head->prev = tp; tp = result.head; list = result.next; stk_size++; tp = merge_collapse(priv, cmp, tp); } while (list); /* End of input; merge together all the runs. */ tp = merge_force_collapse(priv, cmp, tp); /* The final merge; rebuild prev links */ struct list_head *stk0 = tp, *stk1 = stk0->prev; while (stk1 && stk1->prev) stk0 = stk0->prev, stk1 = stk1->prev; if (stk_size <= FFFF) { build_prev_link(head, head, stk0); return; } merge_final(priv, cmp, head, stk1, stk0); } ``` 在這段程式碼中，其中 `find_run` 函數主要用是將鏈結串列依據規則分成不同的 `run` ，用 `tp` 去連接各個整理後的 `run` ，最後再 `merge` 起來。 ![b1](https://hackmd.io/_uploads/rkKK6StT6.png) 假設這次初始狀態的鏈結串列 ![b2](https://hackmd.io/_uploads/r1wJ0BFaa.png) 這是剛進去`rnu` 函式後，初始化的鏈結串列 ```c if (cmp(priv, list, next) > 0) { /* decending run, also reverse the list */ struct list_head *prev = NULL; do { len++; list->next = prev; prev = list; list = next; next = list->next; head = list; } while (next && cmp(priv, list, next) > 0); list->next = prev; } else { do { len++; list = next; next = list->next; } while (next && cmp(priv, list, next) <= 0); list->next = NULL; } ``` 這段程式主要是進行分割不同的 run ，並且依據其升序或降序去做反轉。 ![b5](https://hackmd.io/_uploads/r1Ic7LFTT.png) ![b6](https://hackmd.io/_uploads/ByjaQItaa.png) 這裡主要是將 `4->3->1` 鏈結串列反轉成 `1->3->4`，經過上述程式一輪後，鏈結串列會如上圖所示。 ![B3](https://hackmd.io/_uploads/By61MUFp6.png) ![B3](https://hackmd.io/_uploads/HyoFfLK6a.png) 跑完整個 `find_run` 函式後，會發現原本的 `4->3->1` 鏈結串列反轉成 `1->3->4` 了，並且將 `next` 指標指向下一次需要進去 `find_run` 整理的鏈結串列。重複以上動作最後可以得到多個經過升序排列好不同的 run 。 `merge_collapse()` 該函式主要檢查堆疊頂端的 3 個 run 是否滿足以下原則： 1. A 的長度要大於 B 和 C 的長度總和。 2. B 的長度要大於 C 的長度。當不符合這些條件時，函式會決定是否進行合併。 `merge_force_collapse()` 不檢查 `merge_collapse()` 的原則，直接將不同的 run 進行合併，直到剩餘 2 個 run。 `merge_final()` 將最後 2 個 run 進行合併，形成完全排序好的鏈結串列。 ![5](https://hackmd.io/_uploads/SyzMtUK6p.png) ## 第二週測驗題 ### 測驗一 ```c struct hlist_node { struct hlist_node *next, **pprev; }; struct hlist_head { struct hlist_node *first; }; static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h) { if (h->first) h->first->pprev = &n->next; n->next = AAAA; n->pprev = &h->first; h->first = n; } ``` 這裡由於 `n->next`的資料型態是指標，而`n->pprev` 的資料型態是指標的指標 `&h->first` ，故可以推敲出 `AAAA` 應填 `h->first` 。 ```c #define container_of(ptr, type, member) \ ((type *) ((char *) (ptr) - (size_t) & (((type *) 0)->member))) #define hlist_for_each(pos, head) \ for (pos = (head)->first; pos; pos = pos->next) static int find(int num, int size, const struct hlist_head *heads) { struct hlist_node *p; int hash = (num < 0 ? -num : num) % size; hlist_for_each (p, BBBB) { struct order_node *on = CCCC(p, struct order_node, node); if (num == on->val) return on->idx; } return -1; } ``` 這裡是在尋找該鏈結串列上，是否有值和該函式的 `num` 一樣，對應`hlist_for_each()` ， `BBBB` 是傳入該函式的 `&heads[hash]` ；而 `*on` 是希望將目前指標的 `val` 是什麼，所以 `CCCC` 是 `container_of` 。 ```c static inline void node_add(int val, int idx, int size, struct hlist_head *heads) { struct order_node *on = malloc(sizeof(*on)); on->val = val; on->idx = idx; int hash = (val < 0 ? -val : val) % size; hlist_add_head(&on->node, DDDD); } ``` 根據 `hlist_add_head()` 函式， `DDDD` 應填入 `struct hlist_head *h` 型態，故為 `&heads[hash]` 。 ```c static struct TreeNode *dfs(int *preorder, int pre_low, int pre_high, int *inorder, int in_low, int in_high, struct hlist_head *in_heads, int size) { if (in_low > in_high || pre_low > pre_high) return NULL; struct TreeNode *tn = malloc(sizeof(*tn)); tn->val = preorder[pre_low]; int idx = find(preorder[pre_low], size, in_heads); tn->left = dfs(preorder, pre_low + 1, pre_low + (idx - in_low), inorder, in_low, idx - 1, in_heads, size); tn->right = dfs(preorder, pre_high - (in_high - idx - 1), pre_high, inorder, idx + 1, in_high, in_heads, size); return tn; } ``` 這裡參考的演算法是先從前序找第一個位置當做 root ，該 root可以在中序內分割成兩個陣列，root 左邊為左子樹，右邊為右子樹；再將找到的左子樹陣列和右子樹在前序內做分割陣列，完成第一次。後續再將切分完的左子樹的第一個位置當作 root 重複以上步驟，直到都切成單一元素，再將右子樹重複以上步驟，直到切成單一元素，即可完成。而 `dfs` 使用 `pre_low` 和 `pre_high` 以及 `in_low` 和 `in_high` 透過設置不同的上下界，來替代上述提到的分割陣列，使其完成建樹。 ```c static struct TreeNode *buildTree(int *preorder, int preorderSize, int *inorder, int inorderSize) { struct hlist_head *in_heads = malloc(inorderSize * sizeof(*in_heads)); for (int i = 0; i < inorderSize; i++) INIT_HLIST_HEAD(&in_heads[i]); for (int i = 0; i < inorderSize; i++) node_add(inorder[i], i, inorderSize, in_heads); return dfs(preorder, 0, preorderSize - 1, inorder, 0, inorderSize - 1, in_heads, inorderSize); } ``` 假設一開始的前序中序為: 前序 3,9,20,15,7 中序 9,3,15,20,7 經過第一次的分割，前序中序變成: 前序 3[9][20,15,7] 中序 [9]3[15,20,7] ```graphviz digraph BST { node [fontname="Arial" ]; l1 [ label = "3" ]; l21 [ label = "9" ]; l22 [ label = "15,20,7" ]; l1 -> { l21 l22 }; } ``` 由於左子樹切割到只剩餘一個節點時，會換右子樹重複做分割的動作，都切到剩餘單一元素即可完成建樹: 前序 20,[15],[7] 中序 [15],20,[7] ```graphviz digraph BST { node [fontname="Arial" ]; l1 [ label = "3" ]; l21 [ label = "9" ]; l22 [ label = "20" ]; l31 [ label = "15" ]; l32 [ label = "7" ]; l1 -> { l21 l22 }; l22 -> { l31 l32 }; } ``` ### 測驗二 ```c typedef struct { int capacity; int count; struct list_head dhead; struct hlist_head hhead[]; } LRUCache; ``` **LRUCache** 內部的資料結構 `capacity` : 紀錄該 cache 的容量，能放多少值 `count` : 紀錄 cache 當前所使用的容量 `dhead` : 紀錄所有的鏈結串列 `hhead` : 紀錄各個 key 的對應的鏈結串列 ```c typedef struct { int key; int value; struct hlist_node node; struct list_head link; } LRUNode; ``` **LRUNode** 內部的資料結構 `key` : 該 node 所對應的 hash table 上的 key 值 `value` : 該 node 所存放的值 `node` : 與該串列上其他 node 連結的鏈結 `link` : 與 dhead 連結的鏈結 ```graphviz digraph LRUcache { node [shape=record]; rankdir = LR; subgraph cluster_a { label = LRUCache; struct0 [label="capacity"]; struct1 [label="count"]; struct2 [label="dhead | {<prev>prev | <next>next}"]; struct3 [label="<head>hhead[0] | <head1>hhead[1] | ..."]; } subgraph cluster_b { label = LRUNode1; struct6 [label="link | {<prev>prev | <next>next}"]; struct7 [label="node | {<prev>pprev | <next>next}"]; } subgraph cluster_c { label = LRUNode2; struct10 [label="link | {<prev>prev | <next>next}"]; struct11 [label="node | {<prev>pprev | <next>next}"]; } subgraph cluster_d { label = LRUNode3; struct12 [label="link | {<prev>prev | <next>next}"]; struct13 [label="node | {<prev>pprev | <next>next}"]; } struct2:next -> struct6; struct6:next -> struct10; struct10:next -> struct12; struct3:head -> struct7; struct3:head1 -> struct13; struct7:next -> struct11; } ``` ```c void hlist_del(struct hlist_node *n) { struct hlist_node *next = n->next, **pprev = n->pprev; *pprev = next; if (next) EEEE = pprev; } ``` 該函式主要是刪除指定的 node ，其中 `**pprev` 指向的是 `n` 的前一個 node 的 `next` 指標，所以 `*pprev = next` 這段是將 `n` 前一個 node 的 `next` 指向 `n` 的下一個 node 。而剩下部分就是希望 `next` 的前一個指向 `n` 的前一個，所以 `EEEE` 為 `next->pprev` 。 ```c void lRUCacheFree(LRUCache *obj) { struct list_head *pos, *n; list_for_each_safe (pos, n, &obj->dhead) { LRUNode *cache = list_entry(pos, LRUNode, FFFF); list_del(GGGG); free(cache); } free(obj); } ``` 這裡的 `pos` 型態是 `list_head` ，而 `LRUNode` 內對應 `list_head` 型態的結構是 `link` ，在 `list_entry` 中， `FFFF` 要填入和 `pos`同個型態，所以是 `node` 。而 `void list_del(struct list_head *entry)` 這裡是希望傳入的是 `list_head` 的形式，所以是 `&cache->link` 。 ```c int lRUCacheGet(LRUCache *obj, int key) { int hash = key % obj->capacity; struct hlist_node *pos; hlist_for_each (pos, &obj->hhead[hash]) { LRUNode *cache = list_entry(pos, LRUNode, HHHH); if (cache->key == key) { list_move(IIII, &obj->dhead); return cache->value; } } return -1; } ``` 該 LRU cache 首先使用傳入的 `key` 值去算出在要找的值在 hash table 陣列上的哪個串列中，若沒有的話回傳 -1 ，有的話則將該節點的 `value` 回傳。而其中因為 `pos` 是 `hlist_node` 的型態，所以 `HHHH` 是 `node` ；因為 `void list_move(struct list_head *entry, struct list_head *head)` 所以 `IIII` 要填入的是 `*list_head` 的型態，所以是 `&cache->link` 。 ```c void lRUCachePut(LRUCache *obj, int key, int value) { LRUNode *cache = NULL; int hash = key % obj->capacity; struct hlist_node *pos; hlist_for_each (pos, &obj->hhead[hash]) { LRUNode *c = list_entry(pos, LRUNode, JJJJ); if (c->key == key) { list_move(KKKK, &obj->dhead); cache = c; } } if (!cache) { if (obj->count == obj->capacity) { cache = list_last_entry(&obj->dhead, LRUNode, link); list_move(&cache->link, &obj->dhead); hlist_del(&cache->node); hlist_add_head(&cache->node, &obj->hhead[hash]); } else { cache = malloc(sizeof(LRUNode)); hlist_add_head(&cache->node, &obj->hhead[hash]); list_add(&cache->link, &obj->dhead); obj->count++; } cache->key = key; } cache->value = value; } ``` 這段程式碼主要是到雜湊表內放入新的節點，如果該節點存在，則將該節點透過 `list_move` 函式放到 `dhrad` 所指向的串列最前面；若不存在，則檢查該 `LRUCache` 是否已滿，若已滿，則透過 `list_last_entry` 函式找到該鏈結串列的最後一個節點，將其放到值放在 `dhead` 串列的第一個節點上，並且透過 `hlist_del` 該函式刪除 `hhead` 串列上的節點，再將新的值插入在第一個節點上；若未滿，則將其插入在 `hhead` 鏈結串列和 `dhead` 鏈結串列的第一個節點上。其中這裡的 `pos` 是 `hlist_node` 的型態，所以 `JJJJ` 是 `node` ，而 `KKKK` 需要是 `*list_head` 的型態，所以是 `&c->link` 。 ### 測驗三 ```c static inline unsigned long __ffs(unsigned long word) { int num = 0; #if BITS_PER_LONG == 64 if ((word & AAAA) == 0) { num += 32; word >>= 32; } #endif if ((word & 0xffff) == 0) { num += 16; word >>= 16; } if ((word & 0xff) == 0) { num += 8; word >>= 8; } if ((word & 0xf) == 0) { num += 4; word >>= 4; } if ((word & 0x3) == 0) { num += 2; word >>= 2; } if ((word & 0x1) == 0) num += 1; return num; } ``` 這裡主要是去計算傳入的 `AAAA` 可以根據規律判斷為 `0xffffffff` 。 ```c static inline void __clear_bit(unsigned long nr, volatile unsigned long *addr) { unsigned long mask = BIT_MASK(nr); unsigned long *p = ((unsigned long *) addr) + BIT_WORD(nr); *p &= BBBB; } ``` 該函數主要是希望將指定的位元清成 0 ，所以 `BBBB` 是 `~mask` 。 ```c #define FIND_NTH_BIT(FETCH, size, num) \ ({ \ unsigned long sz = (size), nr = (num), idx, w, tmp; \ \ for (idx = 0; (idx + 1) * BITS_PER_LONG <= sz; idx++) { \ if (idx * BITS_PER_LONG + nr >= sz) \ goto out; \ \ tmp = (FETCH); \ w = hweight_long(tmp); \ if (w > nr) \ goto found; \ \ nr -= w; \ } \ \ if (sz CCCC BITS_PER_LONG) \ tmp = (FETCH) & BITMAP_LAST_WORD_MASK(sz); \ found: \ sz = min(idx * BITS_PER_LONG + fns(tmp, nr), sz); \ out: \ sz; \ }) ``` 其中 `hweight_long()` 該函式主要是去計算這 8 個位元中有多少個 1 ，而程式內的 `for` 迴圈是在找第 n 個位元 1 放在哪裡，所以這裡的 `CCCC` 應填入 `%` 。