2024q1 Homework2 (quiz1+2)

contributed by < TSAICHUNGHAN >

第一周測驗題

測驗`1` : quicksort

參考 : vax-r

運作原理

鏈結串列結構體:

typedef struct __node {
    struct __node *left, *right;
    struct __node *next;
    long value;
} node_t;

list_add()

使用 list_add 將 node_t 加入到鏈結串列開頭

void list_add(node_t **list, node_t *node_t)
{
    node_t->next = *list;
    *list = node_t;
}

list_tail()

從 *left 開始訪問每個節點，直到訪問到尾部 (*left)->next = NULL 就將其回傳。

node_t *list_tail(node_t **left)
{
    while ((*left) && (*left)->next)
        left = &((*left)->next);
    return *left;
}

list_length()

方法類似 list_tail() ，只是在每次走訪節點時會記錄一次，最終回傳走過的節點數。

int list_length(node_t **left)
{
    int n = 0;
    while (*left) {
        ++n;
        left = &((*left)->next);
    }
    return n;
}

list_construct()

創建一個 node 節點並分配記憶體空間，再將其節點插入鏈結串列的開頭。

node_t *list_construct(node_t *list, int n)
{
    node_t *node = malloc(sizeof(node_t));
    node->next = list;
    node->value = n;
    return node;
}

list_free()

逐一走訪每個節點，並釋放記憶體空間直到尾巴(即 *list = NULL )

void list_free(node_t **list)
{
    node_t *node = (*list)->next;
    while (*list) {
        free(*list);
        *list = node;
        if (node)
            node = node->next;
    }
}

quick_sort()

定義 L 、R ， pivot 指向 L ， p 指向 pivot 的下個節點，並將 pivot 斷開，後續會將其歸類在單獨 begin[i+1] 中。

begin[0] = *list;
    end[0] = list_tail(list);
            
    while (i >= 0) {
        node_t *L = begin[i], *R = end[i];
        if (L != R) {
            node_t *pivot = L;
            value = pivot->value;
            node_t *p = pivot->next;
            pivot->next = NULL;

*p 由左至右訪問每個節點，若節點的 value 小於 pivot 將其放至於 *left ，若大於 pivot 將其放至於 *right ，分類完成後如下圖所示 :

while (p) {
    node_t *n = p;
    p = CCCC;
    list_add(n->value > value ? &right : &left, n);
}

begin[i] = left;
end[i] = DDDD;
begin[i + 1] = pivot;
end[i + 1] = pivot;
begin[i + 2] = right;
end[i + 2] = EEEE;

此時 pivot 換成 begin[2] 所指向鏈結串列的第一個，重複一樣的分類方式。

由於 L 與 R 都只剩一個節點，因此 L == R 的結果會依序將 begin[3] 、 begin[2] 、 begin[1] 送入 result 中，如下圖 :

if (L != R) {
    ...;
} else {
    if (L)
        list_add(&result, L);
    i--;
}

此時經過一連串 i-- 回到了 begin[0] ，重複以上動作如圖所示 :

最終結果 :

延伸問題:

引入 Linux Kernel API
引入 list.h 將 node_t 結構體改寫成：

typedef struct __node {
-   struct __node *left, *right;
-   struct __node *next;
+   struct list_head *list;
    long value;
} node_t;

改寫 quicksort ：

void quick_sort(struct list_head **head)
{
    if (!*head || list_empty(*head))
        return;
    int n = list_size(*head);
    int i = 0;
    int max_level = 2 * n;
    struct list_head *begin[max_level];

    for (int i = 1; i < max_level; i++)
        begin[i] = list_new();
    struct list_head *result = list_new(), *left = list_new(),
                     *right = list_new();
    begin[0] = *head;

    int stage = 0;
    // printf("%ld\n", list_entry(begin[0]->prev, node_t, list)->value);
    while (i >= 0) {
        
        fprintf(stdout, "This is %d stage -> ", stage++);

        if (begin[i]->next != begin[i]->prev) {
            fprintf(stdout, "If\n");
            
            node_t *pivot = list_entry(begin[i]->next, node_t, list);
            node_t *entry, *safe;
            list_for_each_entry_safe (entry, safe, begin[i], list) {
                if (entry->value > pivot->value) {
                    list_del(&entry->list);
                    list_add(&entry->list, right);
                } else if (entry->value < pivot->value) {
                    list_del(&entry->list);
                    list_add(&entry->list, left);
                }
            }
            begin[i] = left;
            list_del(&pivot->list);
            list_add(&pivot->list, begin[i + 1]);
            begin[i + 2] = right;

            INIT_LIST_HEAD(left);
            INIT_LIST_HEAD(right);
            i += 2;
        } else {
            fprintf(stdout, "else \n");
            if(!list_empty(begin[i]))
                list_splice_init(begin[i], result);
            i--;
        }
    }
    
    fprintf(stdout, "Sort Complete !\n");
    for (int j = 1; j < max_level; j++) {
        list_free(begin[j]);
    }
    struct list_head *q = result;
    list_for_each(q, result){
        printf("%ld\n",list_entry(q,node_t,list)->value);
    }
    // list_free(left);
    // show(result);
    // list_free(right);
    // show(result);
    *head = result;
}

完整程式碼請參考：quiz1+2

測驗`2` : Timsort

Timesort 運作原理

特性：

merge()

for (;;) {
    /* if equal, take 'a' -- important for sort stability */
    if (cmp(priv, a, b) <= 0) {
        *tail = a;
        tail = &a->next;
        a = a->next;
        if (!a) {
            *tail = b;
            break;
        }
    } else {
        *tail = b;
        tail = &b->next;
        b = b->next;
        if (!b) {
            *tail = a;
            break;
        }
    }
}

逐一比較兩個已排序的鏈結串列 a 跟 b 的頭節點，若較小的值則放入 head 的鏈結串列的尾部，其中 tail 指向鏈結串列的尾部，比較直到一方指向 NULL ，則另一方接續 tail 。

build_prev_link()

tail->next = list;
do {
    list->prev = tail;
    tail = list;
    list = list->next;
} while (list);
    /* The final links to make a circular doubly-linked list */
tail->next = head;
head->prev = tail;

建立 prev 連接上個節點，使鏈結串列形成環狀雙向鏈結串列。

merge_final()

與 merge() 相似，但多了 prev 的連結，及 build_prev_link() ，來形成環狀雙向鏈結串列。

find_run()

在 Timsort 中用於找到已經排序好的鏈結串列並計算子序列大小 len ，若為降序則將其反轉。

merge_at()

在特定位置 at 連接兩個已排序的 run 並更新 run 大小，及減少堆疊的大小 stk_size 。

merge_force_collapse(),*merge_collapse()

當 run 的數量超過一定值時將進行合併，用於提升合併的效率。

來源：測驗 2
Timsort 採用一組堆疊 (stack) 來暫存 run，此舉是為了避免在一開始就掃描整個資料範圍來產生 run，從而降低記憶體開銷。過程中，Timsort 運用 merge_collapse 函式來確保堆疊上的 run 長度保持平衡。該函式主要檢查堆疊頂端的 3 個 run 是否滿足以下原則：

A 的長度要大於 B 和 C 的長度總和。
B 的長度要大於 C 的長度。

當不符合這些條件時，函式會決定是否進行合併。例如，考慮 3 段 run: A 的長度為 30，B 的長度為 20，C 的長度為 10，由於 A 的長度不大於 B 加 C 的總和（亦即

A > B + C

），且 C 的長度小於 A，因此會選擇將 C 與 B 進行合併，形成兩個新的 run

A:30
BC:30

Timesort()

使用 find_run() 函數來尋找 run ，找到 run 之後將其添加到 stack 中，同時使用 merge_collapse() 與 merge_force_collapse() 進行合併，最後使用 build_prev_link() 將其形成環狀鏈結串列。

待完成：

提出改進方案並予以實作。
將改進過的 Timsort 實作整合進 sysprog21/lab0-c，作為另一個有效的排序命令，應設計效能評比的測試程式，確保測試案例符合 Timsort 設想的資料分布樣式。

第二周測驗題

測驗`1` : Binary Tree

Linux Kernel 裡 hash table 結構體：

代表 hlist_head 的頭節點

struct hlist_head {
	struct hlist_node *first;
};

Linux 核心的 hash table 實作中，用以處理 hash 數值碰撞的 hlist_node :

struct hlist_node {
    struct hlist_node *next, **pprev;
};

INIT_HLIST_HEAD()

初始化 hlist_head 的 first 使其指向 NULL

static inline void INIT_HLIST_HEAD(struct hlist_head *h)
{
    h->first = NULL;
}

hlist_add_head()

在 hlist 鏈結串列的頭插入一個新節點

static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    if (h->first)
        h->first->pprev = &n->next;
    n->next = h->first;
    n->pprev = &h->first;
    h->first = n;
}

參考 hlist 数据结构图示说明裡面有更完整的圖示表達插入頭節點的過程。
其中說明 pprev 為何要宣告為指標的指標的目的。

n->pprev = &h->first;
h->first = n;

新節點的 pprev 指向指著自身節點的指標，因此將 h->first 指向新節點時新節點的 pprev 也會順道更新。

圖解：

find()

在雜湊表中查找指定值的位置索引 idx，如果找到則返回該值在列表中的索引，否則返回 -1。

static int find(int num, int size, const struct hlist_head *heads)
{
    struct hlist_node *p;
    int hash = (num < 0 ? -num : num) % size;
    hlist_for_each (p, &heads[hash]) {
        struct order_node *on = list_entry(p, struct order_node, node);
        if (num == on->val)
            return on->idx;
    }
    return -1;
}

其中 int hash = (num < 0 ? -num : num) % size; 計算指定數值 num 的哈希值，並將其與哈希列表的大小取餘數，以確保得到的哈希值在合法範圍內。
hlist_for_each (p, &heads[hash]) 訪問雜湊表中特定的 bucket 若找到對應 num 則回傳對應的位置索引，若無回傳 -1 。

struct TreeNode *dfs()

由 inorder / preorder 重建二元樹。

由前序( preorder )可知哪個節點在上面 (越前面的越上面)、而由中序( inorder )可知哪個節點靠左(越前面的越左邊)，於是可得知前序的第一個元素一定是根，且該元素對應到中序後，左邊的值就在左邊，右邊的值就在右邊，在從右邊的值對應 preorder 即可找出頂點，以此類推可以建構出樹。

遞迴的中止條件是在 inorder 中找不到對應元素。

static struct TreeNode *dfs(int *preorder,
                            int pre_low,
                            int pre_high,
                            int *inorder,
                            int in_low,
                            int in_high,
                            struct hlist_head *in_heads,
                            int size)
{
    if (in_low > in_high || pre_low > pre_high)
        return NULL;

    struct TreeNode *tn = malloc(sizeof(*tn));
    tn->val = preorder[pre_low];
    int idx = find(preorder[pre_low], size, in_heads);
    tn->left = dfs(preorder, pre_low + 1, pre_low + (idx - in_low), inorder,
                   in_low, idx - 1, in_heads, size);
    tn->right = dfs(preorder, pre_high - (in_high - idx - 1), pre_high, inorder,
                    idx + 1, in_high, in_heads, size);
    return tn;
}

node_add()

將新的節點添加到哈希列表中，以便後續可以根據特定的值快速查找到該節點。

static inline void node_add(int val,
                            int idx,
                            int size,
                            struct hlist_head *heads)
{
    struct order_node *on = malloc(sizeof(*on));
    on->val = val;
    on->idx = idx;
    int hash = (val < 0 ? -val : val) % size;
    hlist_add_head(&on->node, &heads[hash]);
}

struct TreeNode *buildTree()

先將 inoreder 節點建立 hash table ，再傳回 dfs() 函式來建樹。

static struct TreeNode *buildTree(int *preorder,
                                  int preorderSize,
                                  int *inorder,
                                  int inorderSize)
{
    struct hlist_head *in_heads = malloc(inorderSize * sizeof(*in_heads));
    for (int i = 0; i < inorderSize; i++)
        INIT_HLIST_HEAD(&in_heads[i]);
    for (int i = 0; i < inorderSize; i++)
        node_add(inorder[i], i, inorderSize, in_heads);

    return dfs(preorder, 0, preorderSize - 1, inorder, 0, inorderSize - 1,
               in_heads, inorderSize);
}

延伸問題：

撰寫完整的測試程式，指出其中可改進之處並實作

測試內容 :

在 Linux 核心的 cgroups 相關原始程式碼，找出 pre-order walk 程式碼並探討

參考 Linux kernel 裡的 cgroup.c 中搜尋 pre-order walk 會找到這段：

struct cgroup_subsys_state *
css_next_descendant_pre(struct cgroup_subsys_state *pos,
			struct cgroup_subsys_state *root)
{
	struct cgroup_subsys_state *next;

	cgroup_assert_mutex_or_rcu_locked();

	/* if first iteration, visit @root */
	if (!pos)
		return root;

	/* visit the first child if exists */
	next = css_next_child(NULL, pos);
	if (next)
		return next;

	/* no child, visit my or the closest ancestor's next sibling */
	while (pos != root) {
		next = css_next_child(pos, pos->parent);
		if (next)
			return next;
		pos = pos->parent;
	}

	return NULL;
}

在了解這段程式碼之前先來了解什麼是 cgroups ，其全名為 control groups ，他是 Linux Kernel 中的功能，可以用來限制，控制與隔離 Process 所使用到的系統資源，例如 CPU, Memory, Disk I/O, Network…等。

參考來源：第一千零一篇的 cgroups 介紹

以 root 自身當成第一個走訪的節點next = css_next_child(NULL, pos)，若子節點存在走訪相鄰子節點 next = css_next_child(pos, pos->parent) ，若子節點不存在則尋先前的 root 走訪 pos = pos->parent 。

測驗`2` : LRU Cache

Leetcode : 146. LRU Cache
LRU 說明：Least recently used (LRU)

由於 hlist 大致邏輯相同於lab0-c 的 list.h ，因此直接探討 LRU Cache 的部份

LRU 結構體

typedef struct {
    int capacity;
    int count;
    struct list_head dhead;
    struct hlist_head hhead[];
} LRUCache;

typedef struct {
    int key;
    int value;
    struct hlist_node node;
    struct list_head link;
} LRUNode;

dhead : 雙向鏈結串列的頭節點
hhead[] : 一個 hash 頭節點的數
node : 用於將緩存項目連接到 hash
link : 用於將此緩存項目連接到雙向鏈結的結構

lRUCacheCreate()

開頭 int capacity 表 Cache 的最大容量

INIT_LIST_HEAD(&cache->dhead) 初始化雙向鏈結串列

INIT_HLIST_HEAD(&cache->hhead[i]) 這個循環初始化了哈希表的各個 bucket 的頭部，使其成為空的哈希列表

LRUCache *lRUCacheCreate(int capacity)
{
    LRUCache *cache = malloc(2 * sizeof(int) + sizeof(struct list_head) +
                             capacity * sizeof(struct list_head));
    cache->capacity = capacity;
    cache->count = 0;
    INIT_LIST_HEAD(&cache->dhead);
    for (int i = 0; i < capacity; i++)
        INIT_HLIST_HEAD(&cache->hhead[i]);
    return cache;
}

lRUCacheFree()

用 LRUNode *cache = list_entry(pos, LRUNode, link) 獲取該LRU節點的結構，並做釋放的動作。

void lRUCacheFree(LRUCache *obj)
{
    struct list_head *pos, *n;
    list_for_each_safe (pos, n, &obj->dhead) {
        LRUNode *cache = list_entry(pos, LRUNode, FFFF);
        list_del(GGGG);
        free(cache);
    }
    free(obj);
}

FFFF 為 link,GGGG 為 &cache->link

lRUCacheGet()

與測驗1的 find() 函式相似， hash 透過 hash = key % obj->capacity 取得，在用 hlist_for_each() 走訪 hash 值對應的 hhead[hash] 鏈結串列，若有找到對應的 key 值將及節點放到雙向鏈結串列的頭並回傳該節點的 value 。

int lRUCacheGet(LRUCache *obj, int key)
{
    int hash = key % obj->capacity;
    struct hlist_node *pos;
    hlist_for_each (pos, &obj->hhead[hash]) {
        LRUNode *cache = list_entry(pos, LRUNode, HHHH);
        if (cache->key == key) {
            list_move(IIII, &obj->dhead);
            return cache->value;
        }
    }
    return -1;
}

HHHH 為 node , IIII 為 &cache->link

lRUCachePut()

在訪問每個節點的過程若找到對應的 key 使用 list_move(KKKK, &obj->dhead) 將該節點移動至雙向鏈結串列的頭，即最近使用過。
若 cache = NULL 表示 Cache Miss ，然後有兩種補同情況的處理方式：

若 Cache 已滿將雙向鏈結串列 dhead 的最尾端節點刪除，並將節點插到 hhead 的頭部
若 Cache 還沒滿，創建一個新節點，並將其加到 hhead 的頭部，同時添加到 dhead 的前面

void lRUCachePut(LRUCache *obj, int key, int value)
{
    LRUNode *cache = NULL;
    int hash = key % obj->capacity;
    struct hlist_node *pos;
    hlist_for_each (pos, &obj->hhead[hash]) {
        LRUNode *c = list_entry(pos, LRUNode, JJJJ);
        if (c->key == key) {
            list_move(KKKK, &obj->dhead);
            cache = c;
        }
    }

    if (!cache) {
        if (obj->count == obj->capacity) {
            cache = list_last_entry(&obj->dhead, LRUNode, link);
            list_move(&cache->link, &obj->dhead);
            hlist_del(&cache->node);
            hlist_add_head(&cache->node, &obj->hhead[hash]);
        } else {
            cache = malloc(sizeof(LRUNode));
            hlist_add_head(&cache->node, &obj->hhead[hash]);
            list_add(&cache->link, &obj->dhead);
            obj->count++;
        }
        cache->key = key;
    }
    cache->value = value;
}

JJJJ 為 node , KKKK 為 &c->link

延伸問題

撰寫完整的測試程式，指出其中可改進之處並實作

測試內容：LRUCacheTest

在 Linux 核心找出 LRU 相關程式碼並探討

測驗`3` : find_nth_bit

參考： yu-hsiennn 的測驗3
BITMAP_LAST_WORD_MASK(nbits) ：利用 mask 將不用的位元過濾，因電腦的未有可能為 32-bit 或 64-bit
__const_hweight64(w)：計算 Hamming 權重，及二進位有幾個1

#define __const_hweight8(w)                                              \
    ((unsigned int) ((!!((w) & (1ULL << 0))) + (!!((w) & (1ULL << 1))) + \
                     (!!((w) & (1ULL << 2))) + (!!((w) & (1ULL << 3))) + \
                     (!!((w) & (1ULL << 4))) + (!!((w) & (1ULL << 5))) + \
                     (!!((w) & (1ULL << 6))) + (!!((w) & (1ULL << 7)))))

#define __const_hweight16(w) (__const_hweight8(w) + __const_hweight8((w) >> 8))
#define __const_hweight32(w) \
    (__const_hweight16(w) + __const_hweight16((w) >> 16))
#define __const_hweight64(w) \
    (__const_hweight32(w) + __const_hweight32((w) >> 32))

__ffs

static inline unsigned long __ffs(unsigned long word)
{
    int num = 0;

#if BITS_PER_LONG == 64
    if ((word & AAAA) == 0) {
        num += 32;
        word >>= 32;
    }
#endif
    if ((word & 0xffff) == 0) {
        num += 16;
        word >>= 16;
    }
    if ((word & 0xff) == 0) {
        num += 8;
        word >>= 8;
    }
    if ((word & 0xf) == 0) {
        num += 4;
        word >>= 4;
    }
    if ((word & 0x3) == 0) {
        num += 2;
        word >>= 2;
    }
    if ((word & 0x1) == 0)
        num += 1;
    return num;
}

AAAA 為 0xffffffff

舉例說明：

word = 0b1010101000000000
num = 0
經過 if ((word & 0xff) == 0) 成立

word = 0b10101010
num = 8

經過 if ((word & 0x1) == 0) 成立
num = 9  // __ffs = 9 最終答案

__clear_bit

巨集 BIT_MASK(nr) 為只有第 nr 個 bit 為 1 的二進位值
巨集 BIT_WORD(nr) 位元位置在 BITS_PER_LONG 之內

若要指定清除 *p 指定的第 nr 個 bit 只須將 mask 取反，及 ~mask

static inline void __clear_bit(unsigned long nr, volatile unsigned long *addr)
{
    unsigned long mask = BIT_MASK(nr);
    unsigned long *p = ((unsigned long *) addr) + BIT_WORD(nr);

    *p &= BBBB;
}

BBBB 為 ~mask

fns

基於 __ffs 可以求出第 n 個被 set 過的 bit 的所在位置。

static inline unsigned long fns(unsigned long word, unsigned int n)
{
    while (word) {
        unsigned int bit = __ffs(word);
        if (n-- == 0)
            return bit;
        __clear_bit(bit, &word);
    }
    return BITS_PER_LONG;
}

舉例說明：

fns(1010101, 4)
word = 1010101
n = 4
-- 第一次迭代 --
n = 4
bit = 2
word = 1010100
-- 第二次迭代 --
n = 3
bit = 4
word = 1010000
-- 第三次迭代 --
n = 1
bit = 6
word = 1000000
-- 第四次迭代 --
n = 0
bit = 6

return 6

延伸問題：

在 Linux 核心原始程式碼找出 find_nth_bit 的應用案例，並予以解說和分析，需要涵蓋 CPU affinity 相關的原始程式碼。

2024q1 Homework2 (quiz1+2)

第一周測驗題

測驗1 : quicksort

運作原理

延伸問題:

測驗2 : Timsort

Timesort 運作原理

待完成：

第二周測驗題

測驗1 : Binary Tree

測驗2 : LRU Cache

延伸問題

測驗3 : find_nth_bit

Read more

伺服語錄

撰寫 LKM

印表機驅動安裝

研讀〈並行和多執行緒程式設計〉系列講座

測驗`1` : quicksort

測驗`2` : Timsort

測驗`1` : Binary Tree

測驗`2` : LRU Cache

測驗`3` : find_nth_bit