2024q1 Homework2 (quiz1+2)

# 2024q1 Homework2 (quiz1+2) contributed by < `eleanorLYJ` > > [作業規範](https://hackmd.io/@sysprog/BkmmEQCn6) ## [第一週測驗題](https://hackmd.io/@sysprog/linux2024-quiz1) ### 測驗1 #### 理解程式碼有變數i，去模擬遞迴的stack的層數，用`begin[i]` 與 `end[i]` 代表第i層考慮的左閉右閉的區間。每次把區間的第一個節點當pivot，接著從pivot->next 開始迭代向next走，直到把連 R 也走完也就會把除了pivot的所有節點與pivot的值做比較或是說分類，節點大於pivot的加到right的開頭，反之與接到 left 的開頭。將所有節點分類完畢。測驗寫的演算法，都是先考慮遞迴pivot左邊區間，之後將pivot存下，之後再遞迴pivot的右邊區間。  #### 提出改進方案並予以實作 - 使用List API改寫 - 提出可避免最差狀況的快速排序實作想要解決: 與pivot同樣的值的節點仍要繼續遞迴。  ### 測驗2 #### 閱讀題目時 Timsort 筆記 - timsort 核心想法: - 走訪過程把有序數列塞進 run, 適當時機就合併 - 關於 run - merge_collapse 確保 run 長度 - run 太短就用二分插入 (插到其他的run裡) - 有個 stack 負責暫存 run, 前三個 run 稱為A, B, C - 關於合併 - 不考慮merge的兩個條件 1. `len(A) > len(B) + len(C)` 2. `len(B) > len(C)` -> 其一不符合就考慮合併 - 兩種合併排序 (如果len(A) < len(B) 就會把A 暫存起來) 1. one-pair-at-a-time (經典排序) 2. collpase mode: (考慮兩個已排序的數列時): 找 B[0] 應該插在 A 的哪位置，接著 B[1] 考慮 B[0] 以右的位置，直到把 B 陣列的值全部都插入到 A 陣列中 #### 理解程式碼原本以為`run_size`函數寫錯，發現是把size 存在 `節點->next->pre`裡。 #### 提出改進方案並予以實作閱讀程式碼後發現沒有作merge galloping, 我原本著手撰寫merge galloping, 但多加思考後是此演算法要實作在串列上, 然而普通合併方式與 merge galloping 效果會相同, 因此程式碼就維持不變 #### 資料分布設計效能評比的測試程式與 Timsort 設想的資料分布樣式。我測試資料設計成6種類別，分別為(0)全部隨機資料、(1)遞增資料、(2)遞減資料、(3)遞增資料並隨機交換3次、(4)遞增資料隨機交換7次、(5)每64個數字作為一個 run 組成的資料、(6)每64個數字作為一個 run 組成的資料並在同一個 run 以20％機率隨機交換 (7)全部資料以同一個隨機值組成 [github](https://github.com/eleanorLYJ/lab2) ``` ==== Testing timsort ==== type == 0 Elapsed time: 4507 Comparisons: 482810 List is sorted type == 1 Elapsed time: 63 Comparisons: 13357 List is sorted type == 2 Elapsed time: 54 Comparisons: 13357 List is sorted type == 3 Elapsed time: 0 Comparisons: 0 List is sorted type == 4 Elapsed time: 0 Comparisons: 0 List is sorted type == 5 Elapsed time: 0 Comparisons: 0 List is sorted type == 6 Elapsed time: 0 Comparisons: 0 List is sorted type == 7 Elapsed time: 1 Comparisons: 0 List is sorted ==== Testing list_sort ==== type == 0 Elapsed time: 3310 Comparisons: 450213 List is sorted type == 1 Elapsed time: 306 Comparisons: 93373 List is sorted type == 2 Elapsed time: 269 Comparisons: 90613 List is sorted type == 3 Elapsed time: 2 Comparisons: 0 List is sorted type == 4 Elapsed time: 2 Comparisons: 0 List is sorted type == 5 Elapsed time: 1 Comparisons: 0 List is sorted type == 6 Elapsed time: 0 Comparisons: 0 List is sorted type == 7 Elapsed time: 0 Comparisons: 0 List is sorted ``` TODO: 1. 將以上程式碼合併進lab0 , 作為一個有效的排序命令 2. 確定亂數足夠亂 ## [第二周測驗題](https://hackmd.io/@sysprog/linux2024-quiz2) ### 測驗 1 > 前序可知哪個節點在上面 (越前面的越上面)、而由中序可知哪個節點靠左(越前面的越左邊) #### 理解程式碼觀察 `dfs`函式如何遞迴的，其中著重理解參數意義。 ```c static struct TreeNode *dfs(int *preorder, int pre_low, int pre_high, int *inorder, int in_low, int in_high, struct hlist_head *in_heads, int size) //省略終止條件與建 `TreeNode *tn` 的過程 tn->left = dfs(preorder, pre_low + 1, pre_low + (idx - in_low), inorder, in_low, idx - 1, in_heads, size); tn->right = dfs(preorder, pre_high - (in_high - idx - 1), pre_high, inorder, idx + 1, in_high, in_heads, size); ``` 首先知道，變數`idx` 是當前節點(`tn`)在中序陣列中節點的索引值。 `dfs` 函式中的參數 `in_low` 與 `in_high` 比較好推測意義。觀察 `tn->left`，遞迴給`tn`的左子樹，範圍 `in_low` 到 `idx-1` (含)的中序索引範圍，而右子樹則是 `idx+1` (含) 到 `in_hight` 以後的索引範圍。也就是用 `idx` 分左右子樹的範圍. 接著觀察 `tn->left`，遞迴給 `tn` 左子樹中的`pre_low` 的值是 `pre_low + 1` ，因為建節點順序是按前序的順序，而當前節點`tn`已建值為`preorder[pre_low]` 的節點，因此 `tn` 的左子節點的值就是 `preorder[pre_low+1]`。而`tn`遞迴給左子樹的`pre_high` 是 `pre_low + (idx - in_low)`，`idx - in_low`是從中序觀點看目前 `tn` 的左子樹的節點個數。因此從`pre_low` 加左子樹的節點個數就是 `tn` 的左子樹要考慮的前序範圍。我的結論: `pre_low`、`pre_high`、`in_low` 和 `in_high` 參數是用於控制遞迴的範圍 - `pre_low` 與 `pre_high` 代表當前遞迴要考慮的前序索引範圍（閉區間）。 - `in_low` 與 `in_high` 代表當前遞迴要考慮的中序索引範圍（閉區間）。  ### linux核心 pre-order walk程式碼探討 > 在 Linux 核心的 cgroups 相關原始程式碼，找出 pre-order walk 程式碼並探討 kernel/cgroup/cgroup.c: * css_next_descendant_pre - find the next descendant for pre-order walk ```c struct cgroup_subsys_state * css_next_descendant_pre(struct cgroup_subsys_state *pos, struct cgroup_subsys_state *root) { struct cgroup_subsys_state *next; cgroup_assert_mutex_or_rcu_locked(); /* if first iteration, visit @root */ if (!pos) return root; /* visit the first child if exists */ next = css_next_child(NULL, pos); if (next) return next; /* no child, visit my or the closest ancestor's next sibling */ while (pos != root) { next = css_next_child(pos, pos->parent); if (next) return next; pos = pos->parent; } return NULL; } EXPORT_SYMBOL_GPL(css_next_descendant_pre); ``` ### 測驗 2 #### 理解程式碼  ### 測驗 3 #### 理解程式碼 `find_nth_bit` 指定的記憶體空間找出第 N 個設定的位元 ##### `__ffs` `__ffs` 找在word中第一個 bit (find first bit in word) 做法為逐步檢查數字中每個 2 的冪位元組，若該範圍內皆為 0，則將該範圍的位元數累加至 num 變數中，最後返回 num。若某個範圍內有 1 出現，則停止檢查並返回 num    ```c static inline unsigned long __ffs(unsigned long word) { int num = 0; #if BITS_PER_LONG == 64 if ((word & AAAA) == 0) { num += 32; word >>= 32; } #endif if ((word & 0xffff) == 0) { num += 16; word >>= 16; } if ((word & 0xff) == 0) { num += 8; word >>= 8; } if ((word & 0xf) == 0) { num += 4; word >>= 4; } if ((word & 0x3) == 0) { num += 2; word >>= 2; } if ((word & 0x1) == 0) num += 1; return num; } ``` 帶入值確定做法。 (二進位的表示，我以每2的冪作區隔) 如果 64 位元全0 `__ffs` 就會得到 63 word = 0|00|0000|00000000|0000000000000000|00000000000000000000000000000000|0000000000000000000000000000000000000000000000000000000000000000 num = 32+16+8+4+2+1 = 63 若 32 位元中的第一個位元為 1，`__ffs` 將返回 30 word = 1|00|0000|00000000|0000000000000000|00000000000000000000000000000000 -> num = 16+8+4+2=30 若 32 位元中的第 1 和 2 位元為 1，`__ffs` 將返回 28 word = 1|10|0000|00000000|0000000000000000|00000000000000000000000000000000 ->num = 16+8+4+0 = 28 能看出`ffs`得到從右數數過來第一個1是出現在第 n 個位元為 ##### `__clear_bit` ```c static inline void __clear_bit(unsigned long nr, volatile unsigned long *addr) { unsigned long mask = BIT_MASK(nr); unsigned long *p = ((unsigned long *) addr) + BIT_WORD(nr); *p &= BBBB; } ``` `#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG)) ` BIT_MASK 生成出遮罩，只有第(nr) % BITS_PER_LONG bit 為1 ，其餘為0的遮罩。其做法為將1往左位移 ((nr) % BITS_PER_LONG)) 位 `#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)` 用來計算指定位元 nr 在位址的 word 索引。其做法為超過 `BITS_PER_LONG` 就回傳1 ，反之，小於BITS_PER_LONG就回傳0。 `clear_bit` 先取得遮罩，再判斷 `nr` 長度若超過`BITS_PER_LONG`，`addr` 加一，得到指向`nr` 所在的位元組的指標 `p`。接著將遮罩做位元翻轉(`~mask`)，再與 `*p` 做 AND 運作，將位元`nr` 設為0。 ##### 計算一個數字中設置為 1 的位元數量 ```c #define __const_hweight8(w) \ ((unsigned int) ((!!((w) & (1ULL << 0))) + (!!((w) & (1ULL << 1))) + \ (!!((w) & (1ULL << 2))) + (!!((w) & (1ULL << 3))) + \ (!!((w) & (1ULL << 4))) + (!!((w) & (1ULL << 5))) + \ (!!((w) & (1ULL << 6))) + (!!((w) & (1ULL << 7))))) #define __const_hweight16(w) (__const_hweight8(w) + __const_hweight8((w) >> 8)) #define __const_hweight32(w) \ (__const_hweight16(w) + __const_hweight16((w) >> 16)) #define __const_hweight64(w) \ (__const_hweight32(w) + __const_hweight32((w) >> 32)) static inline unsigned long hweight_long(unsigned long w) { return __const_hweight64(w); } ``` 這段程式碼定義了計算一個數字中設置為 1 的位元數量的巨集和函式。 `__const_hweight8` 計算一個 8 位元數字中設置為 1 的位元數量。這是通過對數字進行 8 次位元 AND 運算，每次都檢查該位元是否設置為 1，如果是則將結果加 1。最後，返回加總的結果。這巨集裡的`!!` 能確保其結果為 0 或 1。接著，`__const_hweight16` 使用 `__const_hweight8` 函式計算給定數字的前 8 位元和後 8 位元的位元數量，並將結果相加。 `__const_hweight32` 和 `__const_hweight64` 類似地使用 `__const_hweight16` 函式來計算給定數字的前 16 位元、前 32 位元和前 64 位元的位元數量，並將結果相加。 ##### `fns` `fns` : 找第n個被設為1的位元位置 (find N'th set bit in a word) while 迴圈內，變數 `bit`的值為 `__ffs` 函式找到 word 中最低位設為 1 的位元的位置，之後變 n 減一，接著使用 `__clear_bit`清除`bit`所在的位元，也是讓下一次迭代找到的最小位元是此次迭代的次小位元。然而當 n 為 0 就回傳當前最小位元。倘若n 大於 word 全部有被設為1的位元總量或是 word 就是 0，兩情況就返回 `BITS_PER_LONG`。 ```c // @word: The word to search // @n: Bit to find static inline unsigned long fns(unsigned long word, unsigned int n) { while (word) { unsigned int bit = __ffs(word); if (n-- == 0) return bit; __clear_bit(bit, &word); } return BITS_PER_LONG; } ``` ##### ‵GENMASK` ```c #define GENMASK(h, l) \ (((~0UL) - (1UL << (l)) + 1) & (~0UL >> (BITS_PER_LONG - 1 - (h)))) ``` `((~0UL) - (1UL << (l)) + 1)`: 創造一個 l 之前都是 1，之後全是 0 的二進位例如當 l = 5 二進位 = 1111111111111111111111111111111111111111111111111111111111100000 `(~0UL >> (BITS_PER_LONG - 1 - (h))` : 創造一個第 `h+1` 之前位元都是 0，之後全是 1 的二進位。例如當 h = 5 二進位 = 00000000000000000000000000000000000000000000000000000000000111111 GENMASK 最終能得到介於第 h 與 l 之間的位元為 1 其餘為 0 的遮罩。 ##### `BITMAP_LAST_WORD_MASK` ```c #define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1))) ``` ~0UL 全部位元都設置為 1 的值。而 (-(nbits) & (BITS_PER_LONG - 1))決定要讓~0UL往右移多少。首先我查 [C Operator Precedence](https://en.cppreference.com/w/c/language/operator_precedence)。為 unary minus的 - 會優先於 Bitwise AND，所以 `(-(nbits) & (BITS_PER_LONG - 1))` 先將 `nbits` 取二補數之後，再與 `BITS_PER_LONG - 1` 進行做 AND 運算。其中 `BITS_PER_LONG - 1`為常數63 (二進位為 11111)，所以這個 AND 運算只會保留`-(nbits)`最後一個BITS_PER_LLONG長度的有效位元。接著我帶入值，觀察此巨集效果 {nbits} -- {(nbits) & (BITS_PER_LONG - 1) (十進位)} -> {BITMAP_LAST_WORD_INDEX(nbits)} 0 -- 0(0) ->11111111111111111111111111111111 1 -- 111111(63) ->1 2 -- 111110(62) ->11 3 -- 111101(61) ->111 4 -- 111100(60) ->1111 5 -- 111011(59) ->11111 6 -- 111010(58) ->111111 7 -- 111001(57) ->1111111 8 -- 111000(56) ->11111111 9 -- 110111(55) ->111111111 設 n = -(nbits) 巨集會返回的值是 $2^{64-(n)} -1$ ##### small_const_nbits `__builtin_constant_p` 判斷其參數值是否在編譯時期就知道參數 ```c #define small_const_nbits(nbits) \ (__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0) ``` ##### FIND_NTH_BIT ```c #define FIND_NTH_BIT(FETCH, size, num) \ ({ \ unsigned long sz = (size), nr = (num), idx, w, tmp; \ \ for (idx = 0; (idx + 1) * BITS_PER_LONG <= sz; idx++) { \ if (idx * BITS_PER_LONG + nr >= sz) \ goto out; \ \ tmp = (FETCH); \ w = hweight_long(tmp); \ if (w > nr) \ goto found; \ \ nr -= w; \ } \ \ if (sz CCCC BITS_PER_LONG) \ tmp = (FETCH) & BITMAP_LAST_WORD_MASK(sz); \ found: \ sz = min(idx * BITS_PER_LONG + fns(tmp, nr), sz); \ out: \ sz; \ }) ``` ##### find_nth_bit >不理解: idx是? ```c static inline unsigned long find_nth_bit(const unsigned long *addr, unsigned long size, unsigned long n) { if (n >= size) return size; if (small_const_nbits(size)) { unsigned long val = *addr & GENMASK(size - 1, 0); return val ? fns(val, n) : size; } return FIND_NTH_BIT(addr[idx], size, n); } ```