> [GitHub](https://github.com/NOVBobLee/sysprog202308/tree/main/quiz0) \ > [2023 年暑期 Linux 核心課程第 1 次作業](https://hackmd.io/@sysprog/linux2023-summer-quiz0) ## 測驗 $α−1$ ## 測驗 $α−2$ 1. adjust balance parameter [-1, 1] -> [-2, 2], etc. * Need to fix `replace_right`/`replace_left` 2. `hint` start w/`1`, not `0` 3. add `struct st_node **pparent` to deal w/judgement of left/right child 4. add some members in struct node's padding (for what?) ## 測驗 $α−3$ ## 測驗 $β−1$ 第一部份 (1.) `align_up()` 的定義。第二部份 (2, 3.) 通用式 (即第 `9` 行) 的原理。第三部分 (4, 5.) 為特殊情況 ( `alignment` 為二的冪) 下,針對運算進行最佳化。 ```c= #include <stdint.h> /* alignment > 0 */ static inline uintptr_t align_up(uintptr_t sz, size_t alignment) { uintptr_t mask = alignment - 1; if ((alignment & mask) == 0) { /* power of two? */ return (sz + mask) & ~mask; } return (((sz + mask) / alignment) * alignment); } ``` 1. `align_up()` 針對 `sz` 除以 `alignment` 所得到的餘數非零時,會有「無條件進位」到 `alignment` 倍數的動作;而當餘數為零時,則保持 `sz` 不變,以[數學](https://zh.wikipedia.org/zh-tw/%E5%B8%A6%E4%BD%99%E9%99%A4%E6%B3%95#%E5%9F%BA%E6%9C%AC%E5%AE%9A%E7%90%86)表示如下: $\text{Given }s \in\mathbb{N}\cup\{0\} \,,\, a \in\mathbb{N}$ \ $\text{Then, }\exists!\,q_s,r_s \in\mathbb{N}\cup\{0\} \text{ s.t. } s = aq_s + r_s \;\text{, where }\; 0 \leq r_s < a$ \ $\text{Case }r_s = 0 \text{: }\; \text{align_up(s,a) gives }a \cdot q_s = s = s - r_s$ \ $\text{Case }r_s = 1 \text{: }\; \text{align_up(s,a) gives }a \cdot (q_s + 1) =(s - r_s) + a$ \ $\qquad\vdots$ \ $\text{Case }r_s = a - 1 \text{: }\; \text{align_up(s,a) gives }a \cdot (q_s + 1) = (s - r_s) + a$ 2. 電腦整數除法有[下取整](https://zh.wikipedia.org/zh-tw/%E5%8F%96%E6%95%B4%E5%87%BD%E6%95%B0)的效果,以數學表示為 $\Bigl\lfloor\dfrac{s}{a}\Bigr\rfloor = q_s$ ($=\dfrac{s - r_s}{a}$) ,小數點的部份就消去了。在給定 `a` 的情況下, $\Bigl\lfloor\dfrac{s}{a}\Bigr\rfloor\cdot a$ 與 `align_up(s,a)` 的圖形皆為離散的[階梯函數](https://en.wikipedia.org/wiki/Step_function)樣子,此時將 $\Bigl\lfloor\dfrac{s}{a}\Bigr\rfloor \cdot a$ 左移即可與 `align_up(s,a)` 疊合。 $\text{Find }k \in\mathbb{N}\cup\{0\} \text{ s.t. the every results from align_up(s+k,a) in 1. don't change.}$ \ $\text{We have } s + k = (aq_s + r_s) + k = aq_{s+k} + r_{s+k} \text{, where } 0 \leq r_{s+k} < a$ \ $\text{Case }r_s = 0\text{: }\; s + k = (aq_s + 0) + k = aq_s + k \text{, where } 0 \leq k < a$ \ $\Rightarrow k \text{ could be }0, 1, \dots, (a - 1)$ \ $\text{Case }r_s = 1\text{: }\; s + k = (aq_s + 1) + k = a(q_s + 1) + (k - a + 1)\text{, where } 0 \leq (k - a + 1) < a$ \ $\Rightarrow k \text{ could be }(a - 1), a, \dots, (2a - 2)$ \ $\qquad\vdots$ \ $\text{Case }r_s = a - 1\text{: }\; s + k = \big(aq_s + (a - 1)\big) + k = a(q_s + 1) + (k - 1)\text{, where } 0 \leq (k - 1) < a$ \ $\Rightarrow k \text{ could be }1, 2, \dots, a$ \ $\text{There's only one } k \text{ matches in every cases} \Rightarrow k \text{ must be } a - 1$ 3. $\Bigl\lfloor\dfrac{s}{a}\Bigr\rfloor \cdot a$ 若要跟 `align_up()` 疊合,則必須左移 $a - 1$ (即變數 `mask` ) ,此時 `align_up()` $= \Bigl\lfloor\dfrac{s + m}{a}\Bigr\rfloor \cdot a$,得到通用式 (第 `9` 行)。 4. 從 $\Bigl\lfloor\dfrac{s + m}{a}\Bigr\rfloor \cdot a = aq_{s+m}$ 的式子可以看出,就是把原本 $s + m$ 中對於除數 $a$ 的餘數 $r_{s+m}$ 去除。此操作在 $a$ 為二的冪情況下,跟電腦位元操作結合,有比通用式 (除法和乘法) 更快的運算方式。該方法即為第 `7` 行 `(s + m) & ~m` ,原意就是將 $(s + m)$ 中對於除數 $a$ 之餘數 $r_{s + m}$ 去除,前提為 $a$ 是二的冪。 ```graphviz digraph { node [shape=plaintext]; bitarray [label=<<TABLE> <TR> <TD width="150">...</TD> <TD width="30">n</TD> <TD width="40" bgcolor="Yellow">(n-1)</TD> <TD width="30" bgcolor="Yellow">...</TD> <TD width="30" bgcolor="Yellow">1</TD> <TD width="30" bgcolor="Yellow">0</TD> </TR> </TABLE>>]; } ``` $\text{(1) Unsigned 64-Bits Integer } (s + m) = \sum_{0\leq d<64} i_d2^d, i_d = 0,1$ \ $\text{(2) Remainders of divisor }2^n = \Big\{ \sum_{0\leq d<n} i_d2^d | i_d = 0, 1 \Big\}$ \ $\text{(3) } m = (2^n - 1) = (2 - 1)(2^{n-1} + 2^{n-2} + \dots + 2^0)$ 5. 由於現在電腦位元是二進位制的,每個整數可以用二的冪組合而成 $(1)$ ,其係數就是記憶體上對應位置的位元。相同地,對於 $2^n$ 的餘數也可以用此方式組合而成 $(2)$ ,出現位置在上圖中黃色範圍內。觀察 `m` 的位元在記憶體上的樣子,剛好是整個黃色區域填滿 `1` $(3)$ ,此時用 `~` 運算,將 `m` 的位元反轉,搭配 `&` 運算,可以將白色區域內的位元保留,黃色區域內的位元歸零,此動作就是我們要的 $(s + m) - r_{s+m} = aq_{s+m} = \Bigl\lfloor\dfrac{s + m}{a}\Bigr\rfloor \cdot a$ 。 ## 測驗 $β−2$ ## 測驗 $γ−1$ ## 測驗 $γ−2$ * 使用 [`TSan`(`ThreadSanitizer`)](https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual) 編譯,在之後程式執行時可檢查 data race 的發生。 ```shell $ gcc -std=gnu11 -Wall -g -fsanitize=thread -o qsort_mt qsort_mt.c -lpthread ``` * 執行程式 (使用兩個執行緒), `TSan` 警告有 data race 發生。 ``` ./qsort_mt -tv -f2 -h2 -n100 ================== WARNING: ThreadSanitizer: data race (pid=5885) Write of size 4 at 0x7b4000000080 by thread T1 (mutexes: write M0): #0 allocate_thread /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:211 (qsort_mt+0x31f3) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) #1 qsort_algo /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:314 (qsort_mt+0x3f0e) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) #2 qsort_thread /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:352 (qsort_mt+0x4328) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) Previous read of size 4 at 0x7b4000000080 by thread T2 (mutexes: write M1): #0 qsort_thread /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:343 (qsort_mt+0x423e) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) Location is heap block of size 256 at 0x7b4000000000 allocated by main thread: #0 calloc /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:701 (libtsan.so.2+0x43413) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e) #1 qsort_mt /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:132 (qsort_mt+0x2677) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) #2 main /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:505 (qsort_mt+0x4d7d) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) Mutex M0 (0x7ffeb8a182d8) created at: #0 pthread_mutex_init /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1329 (libtsan.so.2+0x448c6) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e) #1 qsort_mt /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:130 (qsort_mt+0x265a) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) #2 main /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:505 (qsort_mt+0x4d7d) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) Mutex M1 (0x7b40000000a8) created at: #0 pthread_mutex_init /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1329 (libtsan.so.2+0x448c6) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e) #1 qsort_mt /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:136 (qsort_mt+0x26fd) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) #2 main /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:505 (qsort_mt+0x4d7d) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) Thread T1 (tid=5887, running) created by main thread at: #0 pthread_create /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1036 (libtsan.so.2+0x44219) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e) #1 qsort_mt /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:144 (qsort_mt+0x280c) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) #2 main /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:505 (qsort_mt+0x4d7d) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) Thread T2 (tid=5888, running) created by main thread at: #0 pthread_create /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1036 (libtsan.so.2+0x44219) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e) #1 qsort_mt /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:144 (qsort_mt+0x280c) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) #2 main /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:505 (qsort_mt+0x4d7d) (BuildId: 73768f1e290a7297a057e88a4b073661ed3575c4) SUMMARY: ThreadSanitizer: data race /home/zz4t/test/ttt/sysprog2023/quiz0/qsort_mt.c:211 in allocate_thread ================== 0.0254 0.017 0.0102 ThreadSanitizer: reported 1 warnings ``` 從訊息中可以得到 data race 發生的記憶體位置為 `0x7b4000000000` ,在 `qsort_mt.c:132` 配置的 `c.pool` 之中。 ```c=132 if ((c.pool = calloc(maxthreads, sizeof(struct qsort))) == NULL) ``` `c.pool` 為 `struct qsort` 組成之陣列,其中的成員 `st` 應該要被 `mtx_st` 所保護。 ```c=88 struct qsort { enum thread_state st; /* For coordinating work. */ struct common *common; /* Common shared elements. */ void *a; /* Array base. */ size_t n; /* Number of elements. */ pthread_t id; /* Thread id. */ pthread_mutex_t mtx_st; /* For signalling state change. */ pthread_cond_t cond_st; /* For signalling state change. */ }; ``` 但在 `qsort_mt.c:211` 的位置進行 `st` 的寫入動作前,並未持有 `mtx_st` ,與 `qsort_mt.c:343` 位置的讀取動作有 data race 發生的可能。 ```c=205 static struct qsort *allocate_thread(struct common *c) { verify(pthread_mutex_lock(&c->mtx_al)); for (int i = 0; i < c->nthreads; i++) if (c->pool[i].st == ts_idle) { c->idlethreads--; c->pool[i].st = ts_work; /* st write */ verify(pthread_mutex_lock(&c->pool[i].mtx_st)); /* mtx_st acquire */ verify(pthread_mutex_unlock(&c->mtx_al)); return (&c->pool[i]); } verify(pthread_mutex_unlock(&c->mtx_al)); return (NULL); } ``` ```c=341 /* Wait for work to be allocated. */ verify(pthread_mutex_lock(&qs->mtx_st)); while (qs->st == ts_idle) /* st read */ verify(pthread_cond_wait(&qs->cond_st, &qs->mtx_st)); ``` 修正很簡單,就是在 `st` 執行寫入動作前,先爭取到 `mtx_st` 即可。 ```diff --- a/quiz0/qsort_mt.c +++ b/quiz0/qsort_mt.c @@ -208,8 +208,8 @@ static struct qsort *allocate_thread(struct common *c) for (int i = 0; i < c->nthreads; i++) if (c->pool[i].st == ts_idle) { c->idlethreads--; - c->pool[i].st = ts_work; verify(pthread_mutex_lock(&c->pool[i].mtx_st)); + c->pool[i].st = ts_work; verify(pthread_mutex_unlock(&c->mtx_al)); return (&c->pool[i]); } ``` ## 測驗 $γ−3$