2023q1 Homework3 (quiz3)

contributed by < LiChiiiii>

測驗題目：quiz3

測驗一

紅黑樹

研讀 treesort.c ，考慮一個 Tree sort 的實作，搭配 Red–black tree 作為排序所需的自我平衡樹。
定義一個列舉型別 color_t ，包含兩個值，分別為 CMAP_RED 與 CMAP_BLACK，代表紅色節點和黑色節點。

typedef enum { CMAP_RED = 0, CMAP_BLACK } color_t;

定義一個結構 node_t ，其中 color 代表節點顏色， left 和 right 分別代表紅黑樹的左節點和右節點，next 代表在原始陣列中下一個節點， value 代表節點的值。

typedef struct __node {
uintptr_t color;
struct __node *left, *right;
struct __node *next;
long value;
} node_t __attribute__((aligned(sizeof(long))));

為了節省記憶體空間，這裡把父節點的位址存在 uintptr_t color 的最高位中。在 64 位元的系統中，考慮到 alignment，記憶體以 8 bytes 為一個單位進行管理，也就是說 node_t 的位址的最後 3 個 bits 一定為 0 ，因此只要對給定的引數 (r) 清除後面 3 個 bits 就可以得到父節點的位址，並且在引數 (r) 最後一個 bit 存放顏色信息， (r)->color & 1 取出最後一個 bit 。

#define rb_parent(r) ((node_t *) ((r)->color & ~7))
#define rb_color(r) ((color_t) (r)->color & 1)

在設定顏色時，以 bitwise 操作，若節點是紅色則設定最後一個bit 為 0 ，反之。

#define rb_set_red(r) do { (r)->color &= ~1; } while (0)
#define rb_set_black(r) do { (r)->color |= 1; } while (0)

cmap_insert() 是在紅黑樹中插入節點的功能，若比較結果 res 小於零代表往左邊走，否則往右邊走，直到走訪到的節點為 NULL，或迴圈結束，即插入節點。

static bool cmap_insert(cmap_t obj, node_t *node, void *value)
{
    /* Just insert the node in as the new head. */
    /* Calibrate the tree to properly assign pointers.*/
    ...       
    /* Traverse the tree until we hit the end or find a side that is NULL */
    for (node_t *cur = obj->head;;) {
        int res = obj->comparator(&node->value, &cur->value);
        if (!res) /* If the key matches something else, don't insert */
            assert(0 && "not support repetitive value");
        if (res < 0) {
            if (!cur->left) {
                cur->left = node;
                rb_set_parent(node, cur);
                cmap_fix_colors(obj, node);
                break;
            }
            cur = cur->left;
        } 
        else {
            if (!cur->right) {
                cur->right = node;
                rb_set_parent(node, cur);
                cmap_fix_colors(obj, node);
                break;
            }
            cur = cur->right;
        }
    }
    cmap_calibrate(obj);
    return true;
}

tree_sort() 會建立一個 map 配置 key 和 element 足夠的空間，接著在 while 迴圈將 list 中的每個節點一個一個插入 map 中。定義 *node 為 map 中擁有最小值的節點，透過 for 迴圈，從最左子開始在紅黑樹找到下一個節點（cmap_next(node) ），由小到大放入 list 中，完成 sort 功能。

void tree_sort(node_t **list)
{
    node_t **record = list;
    cmap_t map = cmap_new(sizeof(long), sizeof(NULL), cmap_cmp_int);
    while (*list) {
        cmap_insert(map, *list, NULL);
        list = &(*list) -> next ;
    }
    node_t *node = cmap_first(map), *first = node;
    for (; node; node = cmap_next(node)) {
        *list = node;
        list = &(*list) -> next ;
    }
    *list = NULL;
    *record = first;
    free(map);
}

測驗二

AVL 樹

定義一個結構 avl_node ，其中 parent_balance 最低位 2 bits 紀錄 node 的平衡狀態，高位紀錄父節點的位址，left 和 right 分別代表 AVL 樹的左節點和右節點。

struct avl_node {
    unsigned long parent_balance;
    struct avl_node *left, *right;
} AVL_NODE_ALIGNED;

因此想取出得到父節點就將 parent_balance 的高位取出，因為是 64 位元的系統，所以 avl_node 的位址的最後 3 個 bits 一定為 0，因此只要把最後 3 bits 設為 0 ，即可得到父節點的位址。

static inline struct avl_node *avl_parent(struct avl_node *node)
{
    return (struct avl_node *) ((node)->parent_balance & ~7);
}

定義一個列舉型別 avl_node_balance ，包含三個值，分別為 AVL_NEUTRAL 、 AVL_LEFT 和 AVL_RIGHT 。

AVL_NEUTRAL : 左子樹和右子樹的深度一樣
AVL_LEFT ：左子樹的深度 - 右子樹的深度 = 1
AVL_RIGHT ：右子樹的深度 - 左子樹的深度 = 1

enum avl_node_balance { AVL_NEUTRAL = 0, AVL_LEFT, AVL_RIGHT, };

因為 parent_balance 最低位 2 bits 存放節點的平衡狀態，利用 & 3 將最低位 2 bits 取出。

static inline enum avl_node_balance avl_balance(const struct avl_node *node)
{
    return (enum avl_node_balance)((node)->parent_balance & 3);
}

透過 node->parent_balance & 3 保留原本的 balance 資訊，利用 or 設定 parent。
透過 node->parent_balance & ~7 保留原本的 parent 資訊，利用 or 設定 balance。

static void avl_set_parent(struct avl_node *node, struct avl_node *parent)
{
    node->parent_balance = (unsigned long) parent | (node->parent_balance & 3);
}

static void avl_set_balance(struct avl_node *node,
                            enum avl_node_balance balance)
{
    node->parent_balance = (node->parent_balance & ~7) | balance;
}

插入新節點後需要確認 tree 是否平衡。換句話說，會先檢查插入節點是左子還是右子，再檢查 parent 的 balance 狀態在插入新節點後是否平衡。
如果插入的新節點是左子，就去檢查 parent 的 balancd 狀態，會出現三種可能的狀態：AVL_RIGHT 、 AVL_NEUTRAL 、 AVL_LEFT，當parent 是 AVL_LEFT 的狀態時表示左子樹高度比右子樹高 1 ，接著檢查插入節點的 balance ，如果是 AVL_LEFT 或 AVL_NEUTRAL，那麼就進行右旋操作（第42行），如果是 AVL_RIGHT ，那麼就進行左右旋操作，先將右子樹向左旋轉，再將原節點向右旋轉（第45行）。



























































void avl_insert_balance(struct avl_node *node, struct avl_root *root)
{
    struct avl_node *parent;

    /* go tree upwards and fix the nodes on the way */
    while ((parent = avl_parent(node))) {
        if (avl_is_right_child(node)) {
            switch (avl_balance(parent)) {
            default:
            case AVL_RIGHT:
                /* compensate double right balance by rotation
                 * and stop afterwards
                 */
                ...
            case AVL_NEUTRAL:
                /* mark balance as right and continue upwards */
                ...
            case AVL_LEFT:
                /* new right child + left leaning == balanced
                 * nothing to propagate upwards after that
                 */
                ...
        } else {
            switch (avl_balance(parent)) {
            default:
            case AVL_RIGHT:
                /* new left child + right leaning == balanced
                 * nothing to propagate upwards after that
                 */
                ...
            case AVL_NEUTRAL:
                /* mark balance as left and continue upwards */
                ...
            case AVL_LEFT:
                /* compensate double left balance by rotation
                 * and stop afterwards
                 */
                switch (avl_balance(node)) {
                default:
                case AVL_LEFT:
                case AVL_NEUTRAL:
                    avl_rotate_right(node, parent, root);
                    break;
                case AVL_RIGHT:
                    avl_rotate_leftright(node, parent, root);
                    break;
                }

                parent = NULL;
                break;
            }
        }

        if (!parent)
            break;

        node = parent;
    }
}

測驗三

這段程式碼是一個長度為 256 的表格，被稱為「對數表」（log table），用於計算二進制數字的位數（也就是 log_table_256[x] =

l o g_{2} x

）。例如， log_table_256[6] 的值為 2 ，代表著二進位表示法中的 00000110 ，第一個 1 位元出現在位置 2 。

static const char log_table_256[256] = {
#define _(n) n, n, n, n, n, n, n, n, n, n, n, n, n, n, n, n
    -1,   0,    1,    1,    2,    2,    2,    2,    3,    3,    3,
    3,    3,    3,    3,    3,    _(4), _(5), _(5), _(6), _(6), _(6),
    _(6), _(7), _(7), _(7), _(7), _(7), _(7), _(7), _(7),
#undef _
};

在程式碼中，它首先檢查最高的 32 位元，如果它們不是 0 ，則將它們向右移動 32 位元，並繼續檢查下一個 16 位元和 8 位元。找到的最高位元的對數與某些固定的值相加，以計算整個 64 位元的對數。因為表格中的值是以 2 為底的對數，所以它們的值域在 0 到 7 之間，可以直接用來計算每個 8 位元組的對數，而不需要進行實際的對數運算。

/* ASSUME x >= 1
 * returns smallest integer b such that 2^b = 1 << b is >= x
 */
uint64_t log2_64(uint64_t v)
{
    unsigned r;
    uint64_t t, tt, ttt;

    ttt = v >> 32;
    if (ttt) {
        tt = ttt >> 16;
        if (tt) {
            t = tt >> 8;
            if (t) {
                r = 56 + log_table_256[t];
            } else {
                r = 48 + log_table_256[tt];
            }
        } else {
            t = ttt >> 8;
            if (t) {
                r = 40 + log_table_256[t];
            } else {
                r = 32 + log_table_256[ttt];
            }
        }
    } else {
        tt = v >> 16;
        if (tt) {
            t = tt >> 8;
            if (t) {
                r = 24 + log_table_256[t];
            } else {
                r = 16 + log_table_256[tt];
            }
        } else {
            t = v >> 8;
            if (t) {
                r = 8 + log_table_256[t];
            } else {
                r = 0 + log_table_256[v];
            }
        }
    }
    return r;
}

這段程式碼是用來將一個 64 位元整數 x 根據 N_BITS 和 N_BUCKETS 的設定，將其分配到對應的桶(bucket)中。其中，mask111 和 mask011 分別是用來產生全為 1 的二進位數字，mask111的長度為 N_BITS + 1， mask011 的長度為 N_BITS 。

unsigned int bucket_number(uint64_t x)
{
    uint64_t mask111 = (1 << (N_BITS + 1)) - 1;
    uint64_t mask011 = (1 << (N_BITS)) - 1; /* one 1 less */

    unsigned char leq = ((x & mask111) < N_BUCKETS);
    /* leq (less or equal) is 0 or 1. */

    return (leq * (x & mask111)) +
           ((1 - leq) * ((x >> (N_BITS + 1)) & mask011));

}

測驗四

運作原理

考慮 ceil_log2 可針對給定無號 32 位元數值 x，找出

⌈ l o g 2 (x) ⌉

，舉例來說：

ceil_log2(7) = 3
ceil_log2(8) = 3
ceil_log2(9) = 4

主要概念是找到最高位元的 1 是在 32 bits 中的哪一個 bit，因為可以根據最高位元的 1 來觀察這個數值是介於哪兩個 2 的次方數之間。

程式碼運作方式如下：

變數 r 初始值為 0 ，目的是儲存計算後的對數結果。變數 shift 初始值為 0 ，目的是用來儲存計算每次 x 的偏移量。
將 x 先減 1 ，確保四捨五入可以到最近的 2 的次方數
檢查 x 是否大於 0xFFFF ，若是的話， r 會被設為 16 ，否和 r = 0 ，並將 x 右移 16 bits；
檢查 x 是否大於 0xFFF ，若是的話， shift 會被設為 8，否則 shift = 0 ，並將 x 右移 8 bits，接著 r 和 shift 做結合，以此類推檢查 0xF 和 0x3。
最後 r 會和最後剩下的偏移量 shift（可能是 0、1 或 2 個位元）及其移位後的結果結合，得出最終的對數結果。該結果最後會加 1，以取得向上取整後的對數值。

從 0xFFFF 開始檢查，是因為當輸入值大於 0xFFFF 時，右移 16 位元可以將輸入值的高 16 位元右移到低 16 位元，這樣可以把輸入值的位數減少一半，從而加快計算速度。接著，再根據輸入值的大小，從 0xFF、0xF、0x3 分別開始檢查，進行對應的右移操作，最後計算出以 2 為底數的對數的位數。

int ceil_log2(uint32_t x)
{
    uint32_t r, shift;

    x--;
    r = (x > 0xFFFF) << 4;                 
    x >>= r;
    shift = (x > 0xFF) << 3;
    x >>= shift;
    r |= shift;
    shift = (x > 0xF) << 2;
    x >>= shift;
    r |= shift;
    shift = (x > 0x3) << 1;
    x >>= shift;
    return ( r | shift | x >> 1 ) + 1;       
}

Trace ceil_log2(0x0FFFFFFF) 的執行過程。

shift = 0
r = 0
x = 0x0FFFFFFF

x = 0x0FFFFFFE

x > 0xFFFF
r = 0x10
x = 0x00000FFF

x > 0xFF
shift = 0x08
x = 0x0000000F
r = 0x18

x > 0xF
shift = 0x04
x = 0
r = 0x1C

x < 0x3
shift = 0

return (0x1C | 0 | 0 ) + 1 = 0x1D    //以 2 為底數的對數的位數

2023q1 Homework3 (quiz3)

測驗一

紅黑樹

測驗二

AVL 樹

測驗三

測驗四

運作原理

Read more

2023q1 Homework1 (lab0)

2023q1 Homework4 (quiz4)

2023q1 Homework3 (fibdrv)

2023q1 Homework2 (quiz2)