# 2023 Homework2 (quiz1) contributed by < [tintinjian12999](https://github.com/tintinjian12999) > 題目: [2023_sumer_quiz1](https://hackmd.io/@sysprog/linux2023-summer-quiz1) ## 完成 AAAA 到 FFFF ```c //cond.h static inline void cond_wait(cond_t *cond, mutex_t *mutex) { int seq = load(&cond->seq, relaxed); mutex_unlock(mutex); #define COND_SPINS 128 for (int i = 0; i < COND_SPINS; ++i) { if (load(&cond->seq, relaxed) != seq) { mutex_lock(mutex); return; } spin_hint(); } futex_wait(&cond->seq, seq); mutex_lock(mutex); fetch_or(&mutex->state, MUTEX_SLEEPING, relaxed); //AAAA } static inline void cond_signal(cond_t *cond, mutex_t *mutex) { fetch_add(&cond->seq, 1, relaxed); //BBBB futex_wake(&cond->seq, 1); //EEEE } static inline void cond_broadcast(cond_t *cond, mutex_t *mutex) { fetch_add(&cond->seq, 1, relaxed); //CCCC futex_requeue(&cond->seq, 1, &mutex->state); //DDDD } ``` ```c //mutex.h static inline void mutex_unlock(mutex_t *mutex) { int state = exchange(&mutex->state, 0, release); if (state & MUTEX_SLEEPING) futex_wake(&mutex->state, 1); //FFFF } ``` AAAA : MUTEX_SLEEPING BBBB : 1 CCCC : 1 DDDD : 1 EEEE : futex_wake FFFF : futex_wake ## 解釋程式碼的運作機制 [參考資料來源: Futexes Are Tricky](https://dept-info.labri.fr/~denis/Enseignement/2008-IR/Articles/01-futex.pdf) [參考資料來源: futex(2) — Linux manual page](https://man7.org/linux/man-pages/man2/futex.2.html) ### futex.h #### SYS_futex system call ```c long syscall (SYS_futex, uint32_t *uaddr, int futex_op, uint32_t val, const struct timespec *timeout, /* or: uint32_t val2 */ uint32_t *uaddr2, uint32_t val3); ``` `uaddr:` futex word `futex_op:` 要進行的 futex 操作 `val`/`timeout`/`val2`/`uaddr2`/`val3` 會根據不同的 `futex_op` 而有不同的作用 #### futex_wait ```c /* Atomically check if '*futex == value', and if so, go to sleep */ static inline void futex_wait(atomic int *futex, int value) { syscall(SYS_futex, futex, FUTEX_WAIT_PRIVATE, value, NULL); } ``` `FUTEX_WAIT_PRIVATE :`被呼叫時會將 thread 暫停並加入 kernel 裡的 wait queue 直到 FUTEX_WAKE_PRIVATE 被呼叫為止,在 thread 被暫停之前會檢查欲暫停之 futex 的 value 與當前的 value 是否一致,若不相同則會返回 `EWOULDBLOCK` 這個 error,此舉是為了防止 lost wake-up,如果其他的 thread 在欲呼叫 `FUTEX_WAIT_PRIVATE` 的 thread 執行該呼叫之前變更了 futex 的 value,則會發現 value 的變動並回傳錯誤訊息 `EWOULDBLOCK`。 ![](https://hackmd.io/_uploads/ryYuwxh23.png) 如上圖所示,thread 2 在 value 被 thread 1 變更之前已經呼叫了 futex_wait ,因此 thread 2 會暫停並進入 wait queue , thread 3 在讀入 value 後被 thread 1 interupt,而後 thread 1 改變了 value 的值並喚醒了所有的 thread,此時 thread 2 從 wait queue 中被 return 並伴隨著傳回值 0,而 thread 3 試著呼叫 futex_wait 但因為 value 的值與先前不同而傳回錯誤值 `EWOULDBLOCK`。 `note:`這裡 FUTEX_WAIT_PRIVATE 的 PRIVATE 意指這個 futex 是 process-private 的,也就是只作用於同一個 process 下的不同 thread。 #### futex_wake ```c /* Wake up 'limit' threads currently waiting on 'futex' */ static inline void futex_wake(atomic int *futex, int limit) { syscall(SYS_futex, futex, FUTEX_WAKE_PRIVATE, limit); } ``` 喚醒在 wake queue 中的 waiters,共會喚醒 limit 個,通常 limit 的值會是 1 或 INT_MAX。 #### futex_requeue ```c /* Wake up 'limit' waiters, and re-queue the rest onto a different futex */ static inline void futex_requeue(atomic int *futex, int limit, atomic int *other) { syscall(SYS_futex, futex, FUTEX_REQUEUE_PRIVATE, limit, INT_MAX, other); } ``` 將喚醒 limit 個 waiters ,與 `futex_wake` 不同的是,若 wait_queue 裡有超過 limit 個waiters,則會將剩餘為喚醒的 waiters 從 futex 對應的 wait_queue 移到 other 對應的 wait_queue。 ### mutex.h #### mutex_trylock ```c static bool mutex_trylock(mutex_t *mutex) { int state = load(&mutex->state, relaxed); if (state & MUTEX_LOCKED) return false; state = fetch_or(&mutex->state, MUTEX_LOCKED, relaxed); if (state & MUTEX_LOCKED) return false; thread_fence(&mutex->state, acquire); return true; } ``` 嘗試持有鎖,若成功則回傳 true ,反之回傳 false 將 mutex 指向的 state 讀取出來,並觀察此時的 state 是否為 LOCKED,若是則傳回 false ,若不是則透過 `fetch_or`將 mutex 裡的 state 設為 LOCKED,且因為 `fetch_or` 會回傳尚未經過 or 運算原有的 state 值,若原來為 LOCKED 則回傳 false,也就是只有在原來的 state 不為 LOCKED 而在此函式中被設為 LOCKED 才會回傳 true。 > 一開始的 load 可能不需要 ? #### mutex_lock ```c static inline void mutex_lock(mutex_t *mutex) { #define MUTEX_SPINS 128 for (int i = 0; i < MUTEX_SPINS; ++i) { if (mutex_trylock(mutex)) return; spin_hint(); } int state = exchange(&mutex->state, MUTEX_LOCKED | MUTEX_SLEEPING, relaxed); while (state & MUTEX_LOCKED) { futex_wait(&mutex->state, MUTEX_LOCKED | MUTEX_SLEEPING); state = exchange(&mutex->state, MUTEX_LOCKED | MUTEX_SLEEPING, relaxed); } thread_fence(&mutex->state, acquire); } ``` 前面的迴圈嘗試獲得鎖,若成功則直接回傳而不用使用 system call,若試了 MUTEX_SPINS 次還是無法獲得鎖則將 mutex 的 state 設為 MUTEX_LOCKED | MUTEX_SLEEPING,並且如果 state 先前的狀態試 LOCKED 的話則將其加入 wait queue。 #### mutex_unlock ```c static inline void mutex_unlock(mutex_t *mutex) { int state = exchange(&mutex->state, 0, release); if (state & MUTEX_SLEEPING) futex_wake(&mutex->state, 1); //FFFF } ``` 重設 mutex 的狀態為 0,若先前 mutex 在 wait queue 中則透過 `futex_wake` 喚醒。 ## 修改[第 1 次作業的測驗 γ](https://hackmd.io/@sysprog/linux2023-summer-quiz0)提供的並行版本快速排序法實作,使其得以搭配上述 futex 程式碼運作 ### 以 [Thread Sanitizer](https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual) 找出上述程式碼的 data race 並著手修正 為了效能量測的準確性,要先解決潛在的 data race 問題。 運行結果為 ``` WARNING: ThreadSanitizer: data race (pid=8049) Write of size 4 at 0x7b4000000080 by thread T1 (mutexes: write M0): #0 allocate_thread /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:211 (a.out+0x3448) #1 qsort_algo /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:314 (a.out+0x415b) #2 qsort_thread /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:351 (a.out+0x4565) Previous read of size 4 at 0x7b4000000080 by thread T2 (mutexes: write M2): #0 qsort_thread /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:343 (a.out+0x4487) Location is heap block of size 256 at 0x7b4000000000 allocated by main thread: #0 calloc ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:668 (libtsan.so.0+0x305aa) #1 qsort_mt /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:132 (a.out+0x28fb) #2 main /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:503 (a.out+0x4f7f) Mutex M0 (0x7ffed14191f8) created at: #0 pthread_mutex_init ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1220 (libtsan.so.0+0x4a616) #1 qsort_mt /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:130 (a.out+0x28de) #2 main /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:503 (a.out+0x4f7f) Mutex M2 (0x7b40000000a8) created at: #0 pthread_mutex_init ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1220 (libtsan.so.0+0x4a616) #1 qsort_mt /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:136 (a.out+0x2981) #2 main /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:503 (a.out+0x4f7f) Thread T1 (tid=8051, running) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79) #1 qsort_mt /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:144 (a.out+0x2a86) #2 main /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:503 (a.out+0x4f7f) Thread T2 (tid=8052, running) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79) #1 qsort_mt /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:144 (a.out+0x2a86) #2 main /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:503 (a.out+0x4f7f) SUMMARY: ThreadSanitizer: data race /home/tintinjian12999/linux_summer_2023/hw2/qsort-mt.c:211 in allocate_thread ``` 看到在 qsort-mt.c 的 132 行發生 data race ,參考 [NOVBobLee](https://hackmd.io/@sysprog/SJxzu7zo3) 的作法成功修正。 ### 修改程式碼 只需要引用 `cond.h` 並將 `pthread` 版本的實作如 `pthread_mutex_lock`, `pthread_cond_signal` 等改成 `futex` 版的 `mutex_lock`, `cond_signal`。 ```c struct qsort { enum thread_state st; /* For coordinating work. */ struct common *common; /* Common shared elements. */ void *a; /* Array base. */ size_t n; /* Number of elements. */ pthread_t id; /* Thread id. */ pthread_mutex_t mtx_st; /* For signalling state change. */ pthread_cond_t cond_st; /* For signalling state change. */ mutex_t mtx_st; /* For signalling state change. */ cond_t cond_st; /* For signalling state change. */ }; /* Invariant common part, shared across invocations. */ struct common { int swaptype; /* Code to use for swapping */ size_t es; /* Element size. */ void *thunk; /* Thunk for qsort_r */ cmp_t *cmp; /* Comparison function */ int nthreads; /* Total number of pool threads. */ int idlethreads; /* Number of idle threads in pool. */ int forkelem; /* Minimum number of elements for a new thread. */ struct qsort *pool; /* Fixed pool of threads. */ pthread_mutex_t mtx_al; /* For allocating threads in the pool. */ mutex_t mtx_al; /* For allocating threads in the pool. */ }; ```