2023q1 Homework3 (quiz3)

--- tags: linux kernel class --- # 2023q1 Homework3 (quiz3) contributed by < `willwillhi1` > ### 測驗一先補一些相關的知識 [treesort](https://en.wikipedia.org/wiki/Tree_sort) 基本的做法是對所有要排序的元素建立二元搜尋數，然後就可以從中列出排序數組。經典的使用時機就是用在 `online` 的情境之中也就是會對這個數組有較多的操作(插入、刪除等)。所以如果是用在 `one-time sort`，還是用 `quick sort` 或 `merge sort` 就可以了。 [red-black tree](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree) 規則: 1. 每個節點非黑即白。 2. `root` 節點一定是黑色。 3. 每個 `leaf(nil)` 都是黑色。 4. 如果一個 `node` 是紅的，則它的 `children` 都是黑的。 5. 對每一個點來說，往下走到任何 `leaf`，途中經過的黑色節點數量都會相同，也就是說從 `root` 到某個 `leaf` 的 `simple path` 一定不會超過從 `root` 到任何一條其他這樣的 `path` 的兩倍長。插入、搜尋、刪除的時間複雜度最差都是 $O(logN)$ 基本旋轉操作可以分成: `Left Rotation`、`Right Rotation` 插入的點預設都是紅色。插入的 `case`: 1. `uncle` 是紅的，重新標記顏色後往上層做檢查 2. `uncle` 是黑的或者是 `nil`，可以分成以下情況: * `LL`: 做一次 `Right Rotation` 即可。 * `LR`: 先做一次 `left Rotation`，就會變成 `LL`，然後再做一次 `Right Rotation`。 * `RR`: 做一次 `Left Rotation` 即可。 * `RL`: 先做一次 `right Rotation`，就會變成 `RR`，然後再做一次 `left Rotation`。 * 最後顏色的處理，把新的 `grandparent` 和新的 `uncle` 的顏色互換即可。研讀 [`treesort.c`](https://gist.github.com/jserv/920766d1fd893d0efd66ac7fdaf23401) 首先看到結構體 `node_t`，記錄了該節點的顏色 `color`、在紅黑樹中的左右節點 `left, right` 以及在原本陣列中的下一個節點 `next` 和其值 `value`。 ```c= typedef struct __node { uintptr_t color; struct __node *left, *right; struct __node *next; long value; } node_t __attribute__((aligned(sizeof(long)))); ``` `cmap_internal` 記錄整個 `map` 的資訊，有 `key`, `element`, `map` 的 `size`, 也記錄了 `end`, `most`, `least` 和比較的函式 `comparator`。 ```c= struct cmap_internal { node_t *head; /* Properties */ size_t key_size, element_size, size; cmap_iter_t it_end, it_most, it_least; int (*comparator)(void *, void *); }; ``` 定義紅色是 `0`，黑色是 `1` ```c= typedef enum { CMAP_RED = 0, CMAP_BLACK } color_t; ``` 這裡使用的技巧是利用 `node_t` 的位址一定會對齊 `long`，再加上系統是 `64` 位元，所以 `long` 是 `8 byte`，換句話說 `node_t` 的位址的末 `3` 位一定都是 `0`，所以這邊就利用這三個 `bits` 來表示顏色。最後要找該 `node` 的 `parent` 就把 `color` 末三碼歸零，反之要找 `color` 就是直接用最低的一位判斷即可。 ```c= #define rb_parent(r) ((node_t *) (((r)->color & ~7))) #define rb_color(r) ((color_t) (r)->color & 1) ``` 同理也可以判斷出 ```c= #define rb_set_red(r) \ do { \ (r)->color &= ~1; \ } while (0) #define rb_set_black(r) \ do { \ (r)->color |= 1; \ } while (0) ``` 這邊第七行的註解說是標成 `black`，所以應該是 `rb_set_red` ```c= static node_t *cmap_create_node(node_t *node) { /* Setup the pointers */ node->left = node->right = NULL; rb_set_parent(node, NULL); /* Set the color to black by default */ rb_set_red(node); return NULL; } ``` `cmap_l_l`, `cmap_l_r`, `cmap_r_r`, `cmap_r_l` 代表在平衡紅黑樹時會處理的四種情況，並會在 `cmap_fix_colors` 裡呼叫。 `cmap_insert` 裡的 `for` 迴圈部分是在紅黑樹裡尋找可以插入的位置，比較出來的結果 `res` 如果小於零代表往左走，反之則往右走，最後的中止條件就是如果走訪到的節點為空，就直接做插入的動作然後跳出迴圈。 ```c= for (node_t *cur = obj->head;;) { int res = obj->comparator(&node->value, &cur->value); if (!res) /* If the key matches something else, don't insert */ assert(0 && "not support repetitive value"); if (res < 0) { if (!cur->left) { cur->left = node; rb_set_parent(node, cur); cmap_fix_colors(obj, node); break; } cur = cur->left; } else { if (!cur->right) { cur->right = node; rb_set_parent(node, cur); cmap_fix_colors(obj, node); break; } cur = cur->right; } } ``` 最後看到 `tree_sort`，輸入是一個需要排序的陣列，所以一開始要做的是建立一個 `map` 並逐一地把節點輸入進去，這邊有用到 `indirect pointer` 的概念，所以在移動到下一個節點時要用 `list = &(*list)->next`。全部的節點都輸入到紅黑樹後還要把原本的陣列重新依照順序串接起來也就是 `for` 迴圈的部分，最後把尾端的點的 `next` 指向 `NULL`。 ```c= void tree_sort(node_t **list) { node_t **record = list; cmap_t map = cmap_new(sizeof(long), sizeof(NULL), cmap_cmp_int); while (*list) { cmap_insert(map, *list, NULL); printf("%ld\n", map->size); list = &(*list)->next; } node_t *node = cmap_first(map), *first = node; for (; node; node = cmap_next(node)) { *list = node; list = &(*list)->next; } *list = NULL; *record = first; free(map); } ``` 延伸問題，將 `treesort` 整合進 `lab0-c` 另外開一個筆記： [treesort](https://hackmd.io/@willwillhi/treesort) --- ### 測驗二 --- ### 測驗三 --- ### 測驗四計算 $⌈log2(x)⌉$，且 `x > 0` 這裡的程式原理是在算從最高位 `1` 到最低位總共幾位，但是這樣會造成 `2` 和 `3` 有同樣的結果 `2`，不符合取上界的需求，所以如果輸入是 `2` 的冪就可以通過 `x--` 來避免這個情況，最後 `2` 會回傳 `1`，`3` 會回傳 `2`。 ```c= int ceil_log2(uint32_t x) { uint32_t r, shift; x--; // 以前 32 bits 來看 // 如果前 16 bits 不為零則把 x 右移 16 位 r = (x > 0xFFFF) << 4; x >>= r; // 以前 16 bits 來看 // 如果前 8 bits 不為零則把 x 右移 8 位 // 最後把 r 做累加，等同 r += shift shift = (x > 0xFF) << 3; x >>= shift; r |= shift; // 以前 8 bits 來看 // 如果前 4 bits 不為零則把 x 右移 4 位 shift = (x > 0xF) << 2; x >>= shift; r |= shift; // 以前 4 bits 來看 // 如果前 2 bits 不為零則把 x 右移 2 位 shift = (x > 0x3) << 1; x >>= shift; // r | shift 先累加上一步的 bits // 到這裡會有三個 case: 0b01, 0b10, 0b11 // 0b10 和 0b11 因為最高位的 1 在第二位所以做 x >> 1 剛好可以加 1 // 最後因為這三個 case 都大於 0 代表至少有一個 1，所以需要加 1 return (r | shift | x >> 1) + 1; } ```