--- tags: linux2023 --- # [2023q1](http://wiki.csie.ncku.edu.tw/linux/schedule) 第 4 週測驗題 :::info 目的: 檢驗學員對 C 語言程式設計, alignment, bitwise, preprocessor 的認知 ::: ==[作答表單: 測驗 1](https://docs.google.com/forms/d/e/1FAIpQLSdEdI-0AglUU8CJdsvaPDjKxL4kRdpIxC1ztSNv95yJDBTlJw/viewform)== (針對 Linux 核心「設計」課程) ==[作答表單: 測驗 2-3](https://docs.google.com/forms/d/e/1FAIpQLSe_WRplQC0gOWxG-q8msC1sNPmpA35r0XT0OV333m6EPHCR1w/viewform)== (針對 Linux 核心「實作」課程) ### 測驗 `1` 延續[第 3 週測驗題](https://hackmd.io/@sysprog/linux2023-quiz3),我們想改寫 [Linux 核心的紅黑樹](https://hackmd.io/@sysprog/linux-rbtree) 的實作,進一步精簡紅黑樹節點佔用的空間,如下: ```c /* Node structure */ #define rb_node(x_type) \ struct { \ x_type *left, *right_red; \ } ``` 注意親代節點不保存於 `rb_node` 中,`right_red` 運用 [C 語言:記憶體管理、對齊及硬體特性](https://hackmd.io/@sysprog/c-memory) 和 [C 語言: bitwise 操作](https://hackmd.io/@sysprog/c-bitwise) 提及的指標地址對齊特性。以下是 C++ 測試程式碼: ```cpp #include <assert.h> #include <stdbool.h> #include <stdint.h> #include <stdlib.h> #include <time.h> #include <iostream> #include <random> #include "rb.h" using std::cout; using std::endl; struct node_; typedef struct node_ node_t; struct node_ { rb_node(node_t) link; int key; }; static int node_cmp(const node_t *a, const node_t *b) { int ret = (a->key > b->key) - (a->key < b->key); if (ret == 0 && a != b) assert(0 && "Duplicates are not allowed in the tree"); return (ret); } typedef rb_tree(node_t) tree_t; rb_gen(static, tree_, tree_t, node_t, link, node_cmp); typedef int value_t; enum { LEFT = 0, RIGHT = 1, NEITHER = 3, }; typedef int direction; struct node_s; class node { public: node() : m_node(NULL) {} explicit node(node_s *n) : m_node(n) {} node_s *operator->() const { return as_node(); } explicit operator bool() const { return m_node != NULL; } int longer() const { return uintptr_t(m_node) & 3; } void set_longer(int longer) { assert(m_node); m_node = (node_s *) ((intptr_t(m_node) & ssize_t(-4)) | longer); } bool balanced() const { return (uintptr_t(m_node) & 2) != 0; } bool operator==(const node &other) const { return as_node() == other.as_node(); } bool operator!=(const node &other) const { return as_node() != other.as_node(); } private: node_s *as_node() const { return (node_s *) (intptr_t(m_node) & ssize_t(-4)); } node_s *m_node; }; struct node_s { value_t value; node next[2]; }; int dir_check_depth(node tree) { if (!tree) return 0; int err = 0; int rv; int b = dir_check_depth(tree->next[LEFT]); int f = dir_check_depth(tree->next[RIGHT]); if (b == f) { if (!tree.balanced()) err = 1; rv = b + 1; } else if (b == f - 1) { if (tree.longer() != RIGHT) err = 1; rv = f + 1; } else if (b - 1 == f) { if (tree.longer() != LEFT) err = 1; rv = b + 1; } else { err = 1; rv = 0; } if (err) printf("err at %d: b=%d f=%d longer=%d\n", tree->value, b, f, tree.longer()); return rv; } std::pair<unsigned int, unsigned long long> doGetGepth(node_t *tree, node_t *null, unsigned int level) { if (tree == null) return std::make_pair(0U, 0ULL); auto left(doGetGepth(rbtn_left_get(node_t, link, tree), null, level + 1)), right(doGetGepth(rbtn_right_get(node_t, link, tree), null, level + 1)); return {1U + left.first + right.first, left.second + right.second + level}; } std::pair<unsigned int, unsigned long long> doGetGepth(node tree, node null, unsigned int level) { if (tree == null) return std::make_pair(0U, 0ULL); auto left(doGetGepth(tree->next[0], null, level + 1)), right(doGetGepth(tree->next[1], null, level + 1)); return {1U + left.first + right.first, left.second + right.second + level}; } template <typename T> double getDepth(T tree, T null = static_cast<T>(0)) { auto depth(doGetGepth(tree, null, 0U)); return depth.first ? double(depth.second) / depth.first : 0; } enum { NNODES = 1000 * 1000 }; int main(int argc, char *argv[]) { node_t *nodes = new node_t[NNODES]; std::default_random_engine dre; for (int i = 0; i < NNODES; ++i) { std::uniform_int_distribution<int> di(0, i); const int j = di(dre); if (i != j) nodes[i].key = nodes[j].key; nodes[j].key = i; } tree_t tree; tree_new(&tree); clock_t start = clock(); for (int i = 0; i < NNODES; ++i) tree_insert(&tree, nodes + i); cout << " tree_insert time: " << (double) (clock() - start) / CLOCKS_PER_SEC << " seconds" << endl; cout << " RB depth: " << getDepth(tree.root) << endl; start = clock(); for (int i = 0; i < NNODES; ++i) { if (!tree_search(&tree, nodes + i)) cout << "Strange failure to tree_search " << nodes[i].key << '\n'; } cout << " tree_search time: " << (double) (clock() - start) / CLOCKS_PER_SEC << " seconds" << endl; start = clock(); for (int i = 0; i < NNODES; ++i) { if (!tree_search(&tree, nodes + nodes[i].key)) cout << "Strange failure to tree_search " << nodes[i].key << '\n'; } cout << " alternative tree_search time: " << (double) (clock() - start) / CLOCKS_PER_SEC << " seconds" << endl; start = clock(); for (int i = 0; i < NNODES; ++i) { tree_remove(&tree, nodes + i); } cout << " tree_remove time: " << (double) (clock() - start) / CLOCKS_PER_SEC << " seconds" << endl; delete[] nodes; tree_destroy(&tree, NULL, NULL); cout << endl; return 0; } ``` 在 [Ampere eMAG 8180](https://en.wikichip.org/wiki/ampere_computing/emag/8180) 的參考執行輸出: ``` tree_insert time: 0.421294 seconds RB depth: 18.8082 tree_search time: 0.471369 seconds alternative tree_search time: 0.544996 seconds tree_remove time: 0.486769 seconds ``` 其中 `rb.h` 的原始程式碼可見 [rb.h](https://gist.github.com/jserv/6610eb56bf2486979c8bf6ee8061f71c) (部分遮蔽) > 你需要對照看 [C 語言:前置處理器應用篇](https://hackmd.io/@sysprog/c-preprocessor) 以理解裡頭巨集的使用 請補完程式碼,使其運作符合預期。作答規範: * `DDDD` 是 0 或 1 的數值 * `AAAA`, `CCCC`, `FFFF`, `GGGG` 是巨集名稱 * `BBBB` 是變數名稱 * `EEEE` 是表示式 :::success 延伸問題: 1. 摘錄上述紅黑樹新增和刪除節點操作的程式碼,解釋其原理。不用列出完整程式碼,但要推敲其設計思維 > 提示: 善用 [TreeComparison](https://github.com/tinfoilhater/TreeComparison) 製圖 2. 運用你在[第 3 週測驗題](https://hackmd.io/@sysprog/linux2023-quiz3)提及的延伸問題所開發出的效能評比程式,分析不同 rbtree 實作的效能落差 3. [jemalloc](https://github.com/jemalloc/jemalloc) 專案的 [test/unit/rb.c](https://github.com/jemalloc/jemalloc/blob/dev/test/unit/rb.c) 對 rbtree API 有更完整的涵蓋,其中 summarize/filter 能夠特化查詢範圍,請自 [jemalloc](https://github.com/jemalloc/jemalloc) 移植更豐富的 rbtree API 及其實作,並撰寫相關的測試程式 (含效能評比) 4. 解讀 Linux 核心的 [include/linux/rbtree.h](https://github.com/torvalds/linux/blob/master/include/linux/rbtree.h), [include/linux/rbtree_types](https://github.com/torvalds/linux/blob/master/include/linux/rbtree_types.h), [lib/rbtree.c](https://github.com/torvalds/linux/blob/master/lib/rbtree.c) 運作原理,並探討能否運用上述手法,以縮減 rbtree 節點的佔用空間 ::: --- ### 測驗 `2` [第 1 週測驗題](https://hackmd.io/@sysprog/linux2023-quiz1) 提及 [Timsort](https://en.wikipedia.org/wiki/Timsort),這裡我們想實作簡化的 Timsort,在 [Ampere eMAG 8180](https://en.wikichip.org/wiki/ampere_computing/emag/8180) 的參考執行輸出: ``` timsort: 21.0 us qsort: 41.0 us ``` 其中 `qsort` 來自 glibc 的 [qsort(3p)](https://man7.org/linux/man-pages/man3/qsort.3p.html),儘管 glibc 的 [stdlib/qsort.c](https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/qsort.c;hb=HEAD) 已採取非遞迴的快速排序法並包含若干調整,上述的 Timsort 仍可優於快速排序法,並滿足 stable sort。 原始程式碼可見 [sort.c](https://gist.github.com/jserv/2ef9c2c6c274ff738f928f4e5af493a3) (部分遮蔽) 假設 `timsort` 函式第二個參數 (即 `void *f`) 轉型為 `offset_t` 後,必定大於第一個參數 (即 `void *l`) 轉型為 `offset_t` 的值。請補完程式碼,使其運作符合預期。作答規範: (以最精簡的形式撰寫,且依循 `lab0-c` 一致程式碼風格,皆不得包含 `;` 字元) * `AAAA` 是合法 C 語言敘述 * `BBBB` 和 `CCCC` 為表示式,不得出現 `++` 和 `--` 運算子 * `DDDD` 為表示式,必須包含 `MIN_RUN`,且 `+` 運算子必須先於 `*` 或 `/` 運算子出現 (由左到右),不得出現小括號 (即 `(` 或 `)`) * `EEEE`, `FFFF`, `GGGG` 是表示式,不得出現小括號 (即 `(` 或 `)`) :::success 1. 解釋上述程式碼運作原理,不用列出完整程式碼,但要推敲其設計思維,應參照論文〈[On the Worst-Case Complexity of TimSort](https://arxiv.org/abs/1805.08612)〉和〈[Merge Strategies: from Merge Sort to TimSort](http://nicolas.louvet.pages.univ-lyon1.fr/meef-nsi/m1-algoprog/TD7-tris/merge_strategies.pdf)〉 2. BONUS: 針對 Linux 核心原始程式碼的 `lib/list_sort.c` 和 `lib/sort.c`,提出 hybrid sorting 的引入和改進方案,並予以實作 $\to$ 之後可提交成果到 LKML 3. F9 microkernel 的 [kernel/lib/sort.c](https://github.com/f9micro/f9-kernel/blob/master/kernel/lib/sort.c) 用非遞迴的方式實作快速排序,請評估導入 [introsort](https://en.wikipedia.org/wiki/Introsort) 或更新的 hybrid sort 的效益 ::: --- ### 測驗 `3` 延伸測驗 `1`,考慮 `rbi.c` 是不依賴遞迴呼叫的紅黑樹實作,原始程式碼可見 [rbi.c](https://gist.github.com/jserv/c2b80079ba4fc6cfe563f5c02767f455) (部分遮蔽),編譯後得到檔名為 `rbi` 的執行檔,執行方式: ```shll $./rbi 7 11 1 99 22 33 44 9 3 5 ``` 參考執行輸出: ``` lookup(11): OK value: 1 value: 3 value: 5 value: 7 value: 9 value: 11 value: 22 value: 33 value: 44 value: 99 sum: 234 ``` `234` 是所有子樹節點的內含值總和。請補完程式碼,使其運作符合預期,作答規範: (以最精簡的方式撰寫,不包含小括號) * `HHHH` 和 `IIII` 是正整數 * `JJJJ`, `KKKK`, `LLLL` 皆為巨集名稱 :::success 1. 解釋上述程式碼運作原理,不用列出完整程式碼,但要推敲其設計思維 2. 將測驗 `1` 對節點的紅色和黑色標注的技巧嵌入到指標變數中 3. Linux 核心的 rbtree 具備 cache 機制 (參見 [Red-black Trees (rbtree) in Linux](https://www.kernel.org/doc/Documentation/rbtree.txt)),探討其原理,並嘗試在測驗 `1` 或測驗 `3` 的基礎之上實作 :::