2022q1 Homework4 (quiz4)

# 2022q1 Homework4 (quiz4) contributed by < [tommy2234](https://github.com/tommy2234) > 題目連結 [quiz4](https://hackmd.io/@sysprog/linux2022-quiz4) ## 測驗 1 延伸[第 3 週測驗題](https://hackmd.io/@sysprog/linux2022-quiz3)的測驗 `7`，已知輸入必為大於 `0` 的數值 (即 `x > 0`)，以下程式碼可計算 ⌈log2(x)⌉ ，也就是 [ceil](https://man7.org/linux/man-pages/man3/ceil.3.html) 和 [log2](https://man7.org/linux/man-pages/man3/log2.3.html) 的組合並轉型為整數: ```c int ceil_log2(uint32_t x) { uint32_t r, shift; x--; r = (x > 0xFFFF) << 4; x >>= r; shift = (x > 0xFF) << 3; x >>= shift; r |= shift; shift = (x > 0xF) << 2; x >>= shift; r |= shift; shift = (x > 0x3) << 1; x >>= shift; return (EXP1) + 1; } ``` 此題的概念和第三周的測驗 7 很像，整個過程就是在尋找最高位的 1 的位置。我們慢慢將 x shift 到 0 ，並且將 shift 的次數累計起來（累計在 r ），就可以得到最高位的 1 的位置。過程如下： 1. `r = (x > 0xFFFF) << 4;` `x >>= r;` 若最高位的 1 在第 16 位以上，就向右 shift 16 位。 2. `shift = (x > 0xFF) << 3` `x >>= shift;` 若最高位的 1 在第 8 位以上，就向右 shift 8 位。 3. `shift = (v > 0xF) << 2` `x >>= shift;` 若最高位的 1 在第 4 位以上，就向右 shift 4 位。以此類推... 因此 EXP1 = `r | shift | x>>1` 而最後的 +1 是和一開始的 x-- 互相搭配，都是為了滿足取 celling 的結果。 - Case 1 : x 正好為 2 羃次方此時 x-- 會讓最高位的 1 退一位，因此最後計算的結果會比預期少 1 ，再加上 1 就變回正確答案。 - Case 2 : x 不是 2 的羃次方此時 x-- 不會改變最高位的 1 的位置，最後計算的結果即是最高位的 1 的位置，再加上 1 正好達成取 celling 的目的。 ### 改進程式碼，使其得以處理 `x = 0` 的狀況，並仍是 branchless 最簡單的方法即是套用 x86 上的 [`__builtin_clz`](http://gcc.gnu.org/onlinedocs/gcc-4.3.4/gcc/Other-Builtins.html) 指令，迅速找到最高位的 1 的位置。 ```c int ceil_log2(uint32_t x) { assert(x != 0 && "log2(0) is undefined !\n"); return 32 - __builtin_clz(x - 1); } ``` :::warning TODO: 在 Linux 核心原始程式碼找出 ceil 和 log2 的組合 (類似的形式)，並解說應用案例 (提示: 在 Linux 核心排程器就有) ::: --- ## 測驗 2 複習〈[你所不知道的 C 語言: bitwise 操作](https://hackmd.io/@sysprog/c-bitwise)〉，改寫[第 3 週測驗題](https://hackmd.io/@sysprog/linux2022-quiz3)的測驗 `11` 裡頭的 fls 函式 (fls 意謂 “find last set”)，使其得以計算 [Find first set](https://en.wikipedia.org/wiki/Find_first_set): ```c #define BITS_PER_BYTE 8 #define BITS_PER_LONG (sizeof(unsigned long) * BITS_PER_BYTE) #include <stddef.h> static inline size_t ffs(unsigned long x) { if (x == 0) return 0; size_t o = 1; unsigned long t = ~0UL; size_t shift = BITS_PER_LONG; shift >>= 1; t >>= shift; while (shift) { if ((EXP2) == 0) { x >>= shift; EXP3; } shift >>= 1; t >>= shift; } return o; } ``` 在 x 非 0 的情況下將 o 初始化為 1 ，shift 初始化為`BITS_PER_LONG / 2`，代表的是目前搜索 first set bit 的搜索範圍。而 t 是搭配 shift 所形成的 mask ，` t >>= shift` 就會形成一個可以拿來檢測 first set bit 是否位在搜索範圍之內的 mask。過程中我們不斷縮小搜索範圍，也就是 shift ，並且用 `x&t` 來判斷 x 之中最低位的 1 位在目前搜索範圍的左邊（全 0）或者在範圍之內（全 1），如果位在左邊就要把 shift 累加進答案 o 之中，同時將 x 右移 shift。 EXP2 = `x&t` EXP3 = `o += shift` :::warning TODO: 在 Linux 核心原始程式碼的 include/asm-generic/bitops 目錄找出相關程式碼，並探討應用案例 ::: --- ## 測驗 3 考慮以下改寫自 Linux 核心的程式碼: 請補完，使 `consumer_del` 行為符合註解規範。 ```c struct foo_consumer { int (*handler)(struct foo_consumer *self, void *); struct foo_consumer *next; }; struct foo { struct foo_consumer *consumers; unsigned long flags; }; #include <stdbool.h> /* * For foo @foo, delete the consumer @fc. * Return true if the @fc is deleted sfccessfully * or return false. */ static bool consumer_del(struct foo *foo, struct foo_consumer *fc) { struct foo_consumer **con; bool ret = false; for (con = &foo->consumers; *con; EXP4) { if (*con == fc) { *con = EXP5; ret = true; break; } } return ret; } ``` 這題的實做很像[<你所不知道的 C 語言: linked list 和非連續記憶體>](https://hackmd.io/@sysprog/c-prog/%2Fs%2FSkE33UTHf)中所提到的『有品味的』刪除節點的方法。使用 indirect pointer 使刪除的操作更加方便，無須特別處理刪除 head 的情況。本題我們要從 foo 結構體所包含的 linked list -- consumers 之中刪除 fc 這個節點，在走訪的過程中尋找目標 fc 並刪除。 EXP4 無疑是使 con 走訪到下一個位置。因此 EXP4 為 `con = &(*con)->next`。 EXP5 則是已經找到 fc 的時候，此時 \*con 應該是 `foo->consumers` 或者是 fc 前一個節點的 next 指標，此時我們只要將 \*con 指向 `fc->next` 即可。因此 EXP5 為 `fc->next`。 ### 區分 `foo` 和 `foo_consumer` 的好處這樣的區分和 [linux/list.h](https://github.com/torvalds/linux/blob/master/include/linux/list.h) 之中 linux 核心風格的 linked list 有相似之處。將 foo 作為結構體本身，其中包含紀錄節點資訊的 foo_consumer ，如此一來我們在走訪時只須要走訪 foo_consumer 即可，當需要存取 foo 時可以透過 foo_consumer 回推母體的位置。並且將 linked list 的節點和母體分開可使節點得以適用於各種不同的結構體，只須將其包含於結構體中，而不需要重新定義。 :::warning TODO: 在 Linux 核心原始程式碼相似技巧的程式碼 (例如 uprobe)，並探討應用案例 ::: --- ## 測驗 4 以下嘗試用 [lab0](https://hackmd.io/@sysprog/linux2022-lab0) 提及的 [setjmp](https://man7.org/linux/man-pages/man3/setjmp.3.html) 和 [longjmp](https://man7.org/linux/man-pages/man3/setjmp.3.html)，用以實作〈[你所不知道的 C 語言: goto 和流程控制篇](https://hackmd.io/@sysprog/c-control-flow)〉闡述的 [coroutine](https://en.wikipedia.org/wiki/Coroutine)，參考的程式執行輸出如下: ``` Task 0: n = 3 Task 1: n = 3 Task 0: resume Task 1: resume Task 0: resume Task 0: complete Task 1: resume Task 1: complete ``` 原始程式碼如下: (檔名 `jmp.c`) ```c #include <setjmp.h> #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include "list.h" struct task { jmp_buf env; struct list_head list; }; static LIST_HEAD(tasklist); static void (**tasks)(void *); static int ntasks; static jmp_buf sched; static void task_add(struct list_head *tasklist, jmp_buf env) { struct task *t = malloc(sizeof(*t)); memcpy(t->env, env, sizeof(jmp_buf)); INIT_LIST_HEAD(&t->list); list_add_tail(&t->list, tasklist); } static void task_switch(struct list_head *tasklist) { jmp_buf env; if (!list_empty(tasklist)) { struct task *t = list_first_entry(tasklist, struct task, list); EXP6; memcpy(env, t->env, sizeof(jmp_buf)); free(t); longjmp(env, 1); } } static void task_join(struct list_head *tasklist) { jmp_buf env; while (!list_empty(tasklist)) { struct task *t = list_first_entry(tasklist, struct task, list); EXP7; memcpy(env, t->env, sizeof(jmp_buf)); free(t); longjmp(env, 1); } } void schedule(void) { static int i; srand(0xCAFEBABE ^ (uintptr_t) &schedule); /* Thanks to ASLR */ setjmp(sched); while (ntasks-- > 0) { int n = rand() % 5; tasks[i++](&n); printf("Never reached\n"); } task_join(&tasklist); } /* A task yields control n times */ void task0(void *arg) { jmp_buf env; static int n; n = *(int *) arg; printf("Task 0: n = %d\n", n); if (setjmp(env) == 0) { task_add(&tasklist, env); EXP8; } for (int i = 0; i < n; i++) { if (setjmp(env) == 0) { task_add(&tasklist, env); task_switch(&tasklist); } printf("Task 0: resume\n"); } printf("Task 0: complete\n"); longjmp(sched, 1); } void task1(void *arg) { jmp_buf env; static int n; n = *(int *) arg; printf("Task 1: n = %d\n", n); if (setjmp(env) == 0) { task_add(&tasklist, env); EXP9; } for (int i = 0; i < n; i++) { if (setjmp(env) == 0) { task_add(&tasklist, env); task_switch(&tasklist); } printf("Task 1: resume\n"); } printf("Task 1: complete\n"); longjmp(sched, 1); } #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0])) int main(void) { void (*registered_task[])(void *) = {task0, task1}; tasks = registered_task; ntasks = ARRAY_SIZE(registered_task); schedule(); return 0; } ``` 請補完，使行為符合註解規範。 ### Scheduling 流程 1. 在 scheduler 中，先將 jmp_buf 設定在 sched 中，以便之後跳回來。 2. 從 scheduler 分別跳到 task 0 和 task 1 ，讓兩者分別存下自己的 jmp_buf 並加到 tasklist ，之後跳回 scheduler 本身。 3. scheduler 呼叫 taskjoin()，將 tasklist 裡的 tasks 輪流拿出執行。 ### task 裡的結構第一次進到 task 裡面，會將自己的 jmp_buf 存下加到 tasklist 之後跳回 scheduler ，但是下次再跳進 task 並不會無限次回到原本`setjmp(env)`的位置，如以下的 if statement。 ```c if (setjmp(env) == 0) { task_add(&tasklist, env); EXP9; } ``` 從 man page 可以得知只有第一次的 direct call 會return 0 ，之後藉由 longjmp() 跳回時會 return 非 0 的值。因此 routine 得以繼續執行下去。 > setjmp() and sigsetjmp() return 0 when called directly; on the "fake" return that occurs after longjmp() or siglongjmp(), the nonzero value specified in val is returned. 接著 task 會依照傳入的 n 值將自己的 jmp_buf 加入 tasklist 共 n 次，也做 n 次的 task switch，因此 "`Task 1: resume`" 和 "`Task 2 resume`" 印出的次數都要符合自己收到的 n 值。 ```c for (int i = 0; i < n; i++) { if (setjmp(env) == 0) { task_add(&tasklist, env); task_switch(&tasklist); } printf("Task 1: resume\n"); } ``` ### Bug 然而程式的輸出似乎不符合預期，在 n = 3 時，task 0 和 task 1 都只印出了兩次 resume。 ``` Task 0: n = 3 Task 1: n = 3 Task 0: resume Task 1: resume Task 0: resume Task 0: complete Task 1: resume Task 1: complete ``` 將迴圈中的 i 值也印出看看發生了什麼事。發現兩個 task 的 i 值都發生了錯亂。 ``` Task 0: n = 3 Task 1: n = 3 Task 0: resume. i = 0 Task 1: resume. i = 1 Task 0: resume. i = 2 Task 0: complete Task 1: resume. i = 22058 Task 1: complete ``` 再看一下 man page 1. > The setjmp() function saves various information about the calling environment (==typically, the stack pointer, the instruction pointer, possibly the values of other registers and the signal mask==) in the buffer env for later use by longjmp(). In this case, setjmp() returns 0. 2. > The compiler may optimize variables into registers, and longjmp() may restore the values of other registers in addition to the stack pointer and program counter. Consequently, the values of automatic variables are unspecified after a call to longjmp() if they meet all the following criteria: • they are local to the function that made the corresponding setjmp() call; • their values are changed between the calls to setjmp() and longjmp(); and • they are not declared as volatile. 原來 setjmp 只會存下 stack pointer, the instruction pointer 等等暫存器的值，而 local variable 這類放在 stack 內的值則不會被存下，透過 longjmp 跳回時也不會被回復。這導致兩個 task 內迴圈迭代的 i 值發生錯亂，輸出也不如預期。 ### Fix bug 既然 atomatic variables 會受到影響，那我們可以把不希望被影響的值宣告成 static variable。 ```c static i; for (i = 0; i < n; i++) { if (setjmp(env) == 0) { task_add(&tasklist, env); task_switch(&tasklist); } printf("Task 1: resume. i = %d\n", i); } ``` 結果符合預期 ``` Task 0: n = 3 Task 1: n = 3 Task 0: resume. i = 0 Task 1: resume. i = 0 Task 0: resume. i = 1 Task 1: resume. i = 1 Task 0: resume. i = 2 Task 0: complete Task 1: resume. i = 2 Task 1: complete ``` :::warning TODO: 1. 對比 concurrent-programs 裡頭的 coroutine 程式碼，如 tinync 和 preempt_sched，嘗試引入 UNIX Signal 來做到 preemptive scheduling 2. 對比〈A brief history of the Linux Kernel’s process scheduler: The very first scheduler, v0.01〉，解說早期 Linux 核心 CPU 排程器的行為，並利用上述程式碼來模擬 :::