2025q1 Homework2 (quiz1+2)

# 2025q1 Homework2 (quiz1+2) contributed by < devarajabc > ## q1-1 此題旨在利用指標的指標實作 `list_insert_before` >The macro `LIST_INSERT_BEFORE()` inserts the new element `elm` before the `element` listelm. ### 解釋程式碼運作原理 #### 為何要使用 pointet to pointer ? 最天真的想法是走訪整個鏈結串列，找到 `before` 的前一位節點 `cure` 並將 `cur->next` 的值改為 `item` 而`item->next` 賦值為 `before`. ```c list_item_t* cur = l->head; while(cur && cur->next != before) cur = cur->next; cur->next = item; item->next = before; ``` 但這樣有什麼問題呢？因為此設計並未考量到 `l->head` 為 `before` 的情況 > If @before is the head of list, the new item is inserted at the front 上方的程式無法處理此情況，因此要需要新增一個分支 ```diff +if(l->head == before){ + l->head = item; + item->next = before; +} while(cur && cur->next != before) cur = cur->next; cur->next = item; item->next = before; ``` 若使用指標的指標可以減少分支: ```c list_insert_before(list_t* l, list_item_t *item, list_item_t* before){ list_item_t **p; for (p = &l->head; *p != before; p = &(*p)->next) ; *p = item; item = before; } ``` `p` 的意義為「指向某一節點的指標」的指標，而 `*p` 為指向某一節點的指標，利用 deference 的方式即可修改某一結構的指標裡的內容。若將「指標」比喻為一個記載地址的信封，而「指標的指標」就是該信封的位置；藉由「指標的指標」找到「信封」後，修改裡面所記載的地址。 ![image](https://hackmd.io/_uploads/rJmkkhGhyl.png) 以上方程式碼來說，利用 `for loop` 找到「指向 `before` 的指標」，再將該指標的地址紀錄在 `p` 中，接著利用 `*p->item` 將「指向 `before` 的指標」改為「指向 `item` 的指標」 #### 如何測試？ ```c #define my_assert(test, message) \ do { \ if (!(test)) \ return message; \ } while (0) ``` ### 在現有的程式碼基礎上，撰寫合併排序操作 ## q1-2 [你所不知道的 C 語言：記憶體管理、對齊及硬體特性](https://hackmd.io/@sysprog/c-memory) >A dynamic memory allocator maintatins maintains an area of a process's virtual memory known as the **heap**. >An allocator maintains the heap as a collection of various-size blocks 為何要要這樣呢？調整 `brk ptr`的大小再 `sw` 不就好了嗎？為何還要 various-size blocks 呢？ malloc vs. calloc ### 解釋程式碼運作原理此題要求完成二元搜尋樹的刪除程式，為了維護二元搜尋樹的結構，刪除節點 `target` 時需考慮以下三種情況： 1. `target` 為單一節點： ```c /* No children: remove the node. */ *node_ptr = NULL; ``` 2. `target` 只有一個子節點: ```c /* If the target node has one child (or none), simply splice it out. */ block_t *child = ((*node_ptr)->l) ? (*node_ptr)->l : (*node_ptr)->r; *node_ptr = child; ``` 3. `target` 有兩個子節點: 此時要先找到 `target` 的 predecessor ，即左子樹中的最大節點 >A "predecessor" of a node refers to the node that comes immediately before it in an inorder traversal. ```c block_t **pred_ptr = &(*node_ptr)->l; while ((*pred_ptr)->r) pred_ptr = &(*pred_ptr)->r; ``` 接著要再分成兩種情況: 1. predecessor 即為 `target` 的左子節點： ```graphviz digraph RBT { graph[ordering="out"] node [shape=circle, fixedsize=true, style=filled,width=.5] target -> {35 80} 35 [fillcolor=red] 80 -> {70 90} } ``` ```c /* If the predecessor is the immediate left child. */ if (*pred_ptr == (*node_ptr)->l) { block_t *old_right = (*node_ptr)->r; *node_ptr = *pred_ptr; /* Replace target with its left child. */ (*node_ptr)->r = old_right; /* Attach the original right subtree. */ } ``` 此時可以直接以 `pred_ptr` 取代 `taget` 原先的位置 ```graphviz digraph RBT { graph[ordering="out"] node [shape=circle, fixedsize=true, style=filled,width=.5] 35 -> {80} 35 [fillcolor=red] 80 -> {70 90} } ``` 2. predecessor 為左子樹中的的最大值 ```graphviz digraph RBT { graph[ordering="out"] node [shape=circle, fixedsize=true, style=filled,width=.5] target -> {33 50} 33 -> {20 48} 48 [fillcolor=red] } ``` ```c else { /* The predecessor is deeper in the left subtree. */ block_t *old_left = (*node_ptr)->l; block_t *old_right = (*node_ptr)->r; block_t *pred_node = *pred_ptr; /* Remove the predecessor from its original location. */ remove_free_tree(&old_left, *pred_ptr); /* Replace the target node with the predecessor. */ *node_ptr = pred_node; (*node_ptr)->l = old_left; (*node_ptr)->r = old_right; ``` ```graphviz digraph RBT { graph[ordering="out"] node [shape=circle, fixedsize=true, style=filled,width=.5] 48 -> {33 50} 33 -> {20} 48 [fillcolor=red] } ``` ## q1-3 題目說明介紹了非遞迴 (non-recursive; iterative) 的快速排序法，利用 `begin[i]`, `end[i]` 分別用來存放單一排序分割的起始點和終點(how)，i 為分割編號，堆疊中的片段（鏈結串列）會被依序取出，取出的片段會依據其長度進行不同的操作： - 片段長度為 `1`: 將該節點加入 `result` - 片段長度不為 `1`: 將取出的片段切分成三段: `left`(小於 pivot), pivot, `right`(大於 pivot) 並依序放進堆疊題目的要求是利用 Linux 核心風格 List API 的簡化標頭檔 list.h 來改寫快速排序程式碼。(有真的加快嗎？「加快」跟修改 API 有關係嗎？) 節點的結構做出以下調整 ```diff typedef struct __node { - struct __node *left, *right; - struct __node *next; + struct list_head list; long value; } node_t; ``` 因應結構的挑整，需要增加以下輔助函式： - `list_tail`: 回傳鏈結串列的最後一個節點的指標，即最後一個節點 - `list_length`: 回傳鏈結串列的節點數 - `rebuild_list_link`: 走訪鏈結串列並修正每一個節點的 `prev` 指標，將其指向該節點的前一個節點以符合雙相鏈結串列的定義。 ```c static void rebuild_list_link(struct list_head *head) { if (!head) return; struct list_head *node, *prev; prev = head; node = head->next; while (node) { node->prev = prev; prev = node; node = node->next; } prev->next = head; head->prev = prev; } ``` ### 解釋程式碼運作原理改寫後的程式碼和原先的邏輯一致，差別在於將節點加入不同鏈結串列的方式：改寫後的版本反而沒有用 list.h 的 API ，在排序的過程中把雙向鏈結串列當作單向鏈結串列處理：直接修改節點中，指標 `next` 的內容，並在最後用 `rebuild_list_link` 修正每個節點的 `prev` 指標。 ```c while (p) { struct list_head *n = p; p = p->next; int n_value = list_entry(n, node_t, list)->value; if (n_value > value) { n->next = right; right = n; } else { n->next = left; left = n; } } ``` 為何要將 `list_add` 拆成兩個步驟呢？ >是因為在排序結束之前，`prev` 的數值尚未確定，沒必要每次加入節點都要調整 `prev` 。 TODO: perf ### 研讀〈A comparative study of linked list sorting algorithms〉 ## q2-1 使用 Linux 核心風格 List API 的簡化標頭檔 list.h 來實作遞迴版本的快速排序程式在每次遞迴呼中，取出鏈結陣列的第一個節點當作 `pivot` ```c pivot = list_first_entry(head, struct listitem, list); list_del(&pivot->list); ``` 接著走訪剩餘的節點並依據節點的值放將其存入 `list_less` 或 `list_greater` ```c list_for_each_entry_safe (item, is, head, list) { if (cmpint(&item->i, &pivot->i) < 0) list_move_tail(&item->list, &list_less); else list_move_tail(&item->list, &list_greater); } ``` 此時鏈結陣列已被分成三個部分：`list_less`, `pivot`, `list_greater` 接著將分割好的鏈結串列 `list_less` 和 `list_greater` 放入遞迴呼叫中 ```c list_quicksort(&list_less); list_quicksort(&list_greater); ``` 最後將排序好的 `list_less`, `list_greater` 和 `pivot` 重新加入鏈結串列 `head` 中 ```c list_add(&pivot->list, head); list_splice(&list_less, head); list_splice_tail(&list_greater, head); ``` ### 若將 `list_move_tail` 換為 `list_move`，為何會無法滿足 stable sorting 根據 list.h 中對 `list_move` 的描述 >Move a list node to the beginning of the list 節點會往 linked-list 的 `prev` 方向加入，如此一來便會改變節點原先的順序 ## q2-2 題目要求開發整數的開平方根運算，此任務分成兩個部分 1. 藉由遞迴呼叫以計算 count leading zero (clz) 2. 實作整數開平方根 ### 解釋程式碼(1) 每次呼叫 `clz2()` 函式會將目前關注的位元（bits）「切成一半」，以檢查 leading zero 的數量。 ***Recursive case*** 讓我們來觀察「切一半」的過程，以 32 位元的 `x` 為例：獲得 `x_upper` 的內容可以透過 shift right :`x >> 16` 而 `x_lower` 則需要使用 bit mask：`x & 0xFFFF` 接著 `x_upper` 和 `x_lower` 分別作為下一次遞迴呼叫的輸入：如此壹來可以整理出規律： ```c static const int mask[] = {0, 8, 12, 14}; uint32_t upper = (x >> (16 >> c)); uint32_t lower = (x & (0xFFFF >> mask[c])); ``` 其中 `c` 為「切」第幾刀，`mask` 的內容為 $16-2^n$， $n$ 為欲取得之位數。接著依據 upper 的數值判斷下一次遞迴呼叫應該處理哪一部分，以累計 leading zero 的數量。若 upper 等於 0，代表 leading zero 至少有 `16 >> c`個零若 upper 不等於 0，則下一次遞迴呼叫應繼續檢查 upper 部分。 ```c return upper ? clz2(upper, c + 1) : (16 >> (c)) + clz2(lower, c + 1); ``` ***Base case*** 遞迴呼叫 `clz2()` 函式，直到僅剩 2 位元（bits）需要檢查 leading zero，然後回傳結果。僅剩下 2 位元，代表 32 已經被「切」了 4 次，即 `c == 3` ```c if (c == 3) return upper ? magic[upper] : 2 + magic[lower]; ``` 陣列 `magic` 的內容為索引值的 count leading zero: `0` -> `00` -> 兩位 `1` -> `01` -> 一位 `2` -> `10` -> 零位 `3` -> `11` -> 零為 ```c static const int magic[] = {2, 1, 0, 0}; ``` 如果 `upper` 為 true 有三種可能： `10`, `11` 或 `01` ，透過 `mask` 可以得到 leadind zero 的數量。 ```c if (c == 3) return upper ? magic[upper] : 2 + magic[lower]; ``` 如果 `upper` 為零則 leading zero 至少有兩位(+2)，再根據 `lower` 的值獲得 leading zero 的數量。 ### 解釋程式碼(2) ## q2-3 ### 解釋程式碼運作原理，應提供測試程式碼題意是給定一個陣列 `nums` 和一個目標值 `target`，求找到 `nums` 的 2 個元素相加會等於 `target` 的索引值。題目確保必為單一解，且回傳索引的順序沒差異。例如給定輸入 `nums` = [2, 7, 11, 15], `numsSize` = 4, `target` = 9，相加變成 9 的元素僅有 2 及 7，因此回傳這二個元素的索引值 `returnSize` = [0, 1] 把當前數字 ```c int *twoSum(int *nums, int numsSize, int target, int *returnSize) { map_t *map = map_init(10); *returnSize = 0; int *ret = malloc(sizeof(int) * 2); if (!ret) goto bail; for (int i = 0; i < numsSize; i++) { int *p = map_get(map, target - nums[i]); if (p) { /* found */ ret[0] = i, ret[1] = *p; *returnSize = 2; break; } p = malloc(sizeof(int)); *p = i; map_add(map, nums[i], p); } bail: map_deinit(map); return ret; } ``` 程式首先利用 `map_init` 初始化一個大小為？的 hashtable ```c typedef struct { int bits; struct hlist_head *ht; } map_t; ``` 為何不直接使用 `struct hlist_head` 陣列(e.g. `struct hlist_head Array[100]`)？ `bits` 是什麼呢？ ```c MAP_HASH_SIZE(map->bits) ``` `map_add` ```c void map_add(map_t *map, int key, void *data) { struct hash_key *kn = find_key(map, key); if (kn) return; kn = malloc(sizeof(struct hash_key)); kn->key = key, kn->data = data; struct hlist_head *h = &map->ht[hash(key, BBBB)]; struct hlist_node *n = &kn->node, *first = h->first; n->next = first; if (first) CCCC = &n->next; h->first = n; DDDD = &h->first; } ```