# 2022q1 Homework5 (quiz5) contributed by < [Shawn5141](https://github.com/Shawn5141) > # [測驗1](https://hackmd.io/@sysprog/linux2022-quiz5#%E6%B8%AC%E9%A9%97-1) # 延伸問題: ## 解釋上述程式碼運作原理,指出可改進之處並實作。 ```cpp /* Generates a table of prime numbers up to the maximum having digits. * All even numbers (including 2) are ignored. Bit n in the bitmap * represents the odd number (2n + 1). This enables the table to be * half the size of a naive Sieve of Eratosthenes implementation. */ static void generate_sieve(int digits) { ull max = 0; for (int count = 0; count < digits; ++count) max = max * 10 + 9; max = isqrt(max); half_max = max >> 1; /* We need half the space as multiples of 2 can be omitted */ int bytes = (max >> 1) + (max & 0x1); /* Calculate the actual number of bytes required */ bytes = (bytes >> 3) + (bytes & 0x1); bytes = ALIGN(bytes, 16); /* Align-up to 16-byte */ psieve = realloc(psieve, bytes); if (!psieve) { printf("realloc() failed!\n"); exit(1); } memset(psieve, 0, bytes); /* In psieve bit 0 -> 1, 1 -> 3, 2 -> 5, 3 -> 7 and so on... */ /* Set the 0th bit representing 1 to COMPOSITE */ psieve[0] |= COMPOSITE << (1 >> 1); unsigned char mask = 0x7; for (ull n = 3; n <= max; n += 2) { if (((psieve[n >> 4] >> ((n >> 1) & mask)) & 0x1) == PRIME) { for (ull mul = (n << 1); mul < max; mul += n) { /* Skip the evens: there is no representation in psieve */ if (!(mul & 0x1)) continue; /* Set offset of mul in psieve */ psieve[mul >> 4] |= COMPOSITE << ((mul >> 1) & mask); /* bit offset */ } } } } ``` :::warning 注意書寫規範:中英文間用一個半型空白字元或標點符號區隔 ::: - generate_sieve 透過 bitmap(psieve bit)去儲存所有的奇數。 - psieve bit 的大小為: - 長度為digit的數開根號。(後續檢驗質數不需查大於 sqrt 值,若有大於 sqrt 的因數必然有對應的小於 sqrt 的因數) - 再除以 2 (因為不去計算偶數)。 - 把 bits 轉換成 byte,並且用16Btye 去對齊。 - 存在psieve bit裡面的值代表: - 一個 psieve 的元素有 8 bits,可以覆蓋 16 個數字(因為偶數被刪掉)。所以這邊會用 `n>>4` 去找 n 所對應在 psieve 中的 index。`psieve[n >> 4]` 就得到對應的元素。 - 接著透過 `(n >> 1) & mask` 去計算 n 在此元素中對應到的 bit (offset)。 - 接著把 `((psieve[n >> 4] >> ((n >> 1) & mask)) & 0x1)` 組合起來,把 `psieve[n >> 4]` 右移 `(n >> 1) & mask)` offset 並取 0x1,來得到LSB。 - 最後去檢查 n 是否為質數,如果是的話就把 n 的所有倍數設為 `COMPOSITE`。 ```graphviz digraph "reverse-bits" { node [shape=record]; rankdir=LR; a [label="{<0> 1 |<1> 3 |<2> 5 |<3> 7 |<4> 9 |<5> 11 |<6> 13 |<7> 15 }"]; b [label="{<0> 1 |<1> 1 |<2> 1 |<3> 0 |<4> 0 |<5> 0 |<6> 0 |<7> 0 }"] node [shape="plain"] "uint8_t psieve[0]"->a:0:w "mask" -> b:0:w } ``` 圖源:[Kevin-Shih](https://hackmd.io/LQUDaPr6QCOOsRusCQOmNg?view#:~:text=uint8_t%20psieve%5B0,mask) ## isprime 質數的檢驗 ```cpp /* Check if a number is prime. */ static bool isprime(const ull val) { if (!(val & 0x1)) /* Test for divisibility by 2 */ return false; // 一次取64 bits出來 ull *pquadbits = (ull *) psieve; ull next = 3; /* start at 7 (i.e. 3 * 2 + 1) */ // quad 取 2's complement & ~0b111 () : 起始值再與 ~0b111 做 AND 以忽略以另外處理的前三個奇數(1, 3, 5 // prev = 0 // for (ull quad = ~*pquadbits & ~0b111, prev = 0; prev <= half_max; quad = ~*++pquadbits) { if (!quad) { prev += 64; continue; } while (quad) { ull i = __builtin_ctzll(quad); next = prev + i; if (!(val % ((next << 1) + 1))) return false; quad &= ~(1ULL << i); } prev += 64; } return true; } ``` - `pquadbits` - 代表每次取出 64 bits 的 psieve - `quad` - quad 取 2's complement `& ~0b111` () : 起始值再與 ~0b111 做 AND 以忽略以另外處理的前三個奇數(1, 3, 5)。 - `prev`: - 一次加64,只要比 half_max 還小就繼續計算。(代表還沒pquadbits 還有) - `i` and `next`: - 使用 `__builtin_ctzll` 去找出 trailing zero 有幾位,然後找出 `next`,代表下一個壓縮過後的 `prime number`。 - `!(val %(next << 1) + 1))`: - 把 next * 2 + 1 復原回原本的 prime number。 - 如果 val % prime 為 0(代表可以整除),取not 之後就是1,所以val 不會是 `prime number` 。直接回傳 `false` - 如果 val % prime 為 0(代表不能整除),取not 之後就是0,所以 val 不會是 `primte number` 。 繼續做。 - `quad &= ~(1ULL << i)``: - 標註檢驗過的質數 ## 是否有必要先將數值轉成字串?用十進位的角度處理運算是否產生額外的計算負擔? ## Linux 核心原始程式碼 lib/math/prime_numbers.c 有類似的程式碼,請解釋其作用、裡頭 RCU 的使用情境,及針對執行時間的改進 --- # 測驗 2 ## 解釋程式碼運作原理,指出可改進之處並實作 ```cpp= /* shortcuts */ #define atomic_load(src) __atomic_load_n(src, __ATOMIC_SEQ_CST) #define atomic_store(dst, val) __atomic_store(dst, val, __ATOMIC_SEQ_CST) #define atomic_exchange(ptr, val) \ __atomic_exchange_n(ptr, val, __ATOMIC_SEQ_CST) #define atomic_cas(dst, expected, desired) \ __atomic_compare_exchange(dst, expected, desired, 0, __ATOMIC_SEQ_CST, \ __ATOMIC_SEQ_CST) ``` 根據 [gcc doc](https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html),得知這邊的 memory order 是使用 `__ATOMIC_SEQ_CST` , 會嚴格的執行Sequentially consistent 。 > __atomic_load_n: This built-in function implements an atomic load operation. It returns the contents of *ptr. > __atomic_store: This is the generic version of an atomic store. It stores the value of *val into *ptr. > __atomic_exchange_n: This built-in function implements an atomic exchange operation. It writes val into *ptr, and returns the previous contents of *ptr. > __atomic_compare_exchange: This built-in function implements an atomic compare and exchange operation > __ATOMIC_SEQ_CST: Barrier in both directions and synchronizes with acquire loads and release stores in all threads. :::success | memory order | 作用 | |:------------:|:---------------------:| | memory_order_relaxed | 沒有 memory barrier,允許編譯器和處理器充分發揮。若目前操作為 atomic,則只保證該 atomic 操作的順序 | | memory_order_consume | 後面依賴此 atomic 變數的存取指令,不得重排至此條指令之前 | | memory_order_acquire | 後面存取指令不得重排至此條指令之前 | | memory_order_release | 前面存取指令不得重排至此條指令之後。當此條指令的結果對其他執行緒可見後,之前的所有指令都可見 | | memory_order_acq_rel | acquire + release 語意 | | memory_order_seq_cst | acq_rel 語意外加所有使用 seq_cst 的指令有嚴格地順序,若沒指定 memory order,則預設採用 | ::: ## list operation ```cpp= #include <stdint.h> //LIST_ITER: 利用 `atomic_load` 遍歷 list 中的每一個點。 #define LIST_ITER(head, node) \ for (node = atomic_load(head); node; node = atomic_load(&node->next)) //hp_t` 是 struct object,有 pointer 只到下一個 struct。 typedef struct __hp { uintptr_t ptr; struct __hp *next; } hp_t; /* Allocate a new node with specified value and append to list */ static hp_t *list_append(hp_t **head, uintptr_t ptr) { // Allocate a new node with specified value and append to list hp_t *new = calloc(1, sizeof(hp_t)); if (!new) return NULL; // 然後指派傳進來的 `ptr` 給 `new` 的 ptr new->ptr = ptr; // 利用 `atomic_laod` 讀取 `head` hp_t *old = atomic_load(head); // 重複把`old` 指派給 `new->next`, 直到`atomic_cas` 去把 head 的值替換成新的 do { new->next = old; } while (!atomic_cas(head, &old, &new)); return new; } /* Attempt to find an empty node to store value, otherwise append a new node. * Returns the node containing the newly added value. */ hp_t *list_insert_or_append(hp_t **head, uintptr_t ptr) { hp_t *node; bool need_alloc = true; LIST_ITER (head, node) { // 利用 atomic_load 去取node->ptr的值,如果是0的話,就使用atomic_cas 去想辦法替換成 傳進來 ptr的值。 // 如果成功的話就不用呼叫 list_append,所以讓 need_alloc = false uintptr_t expected = atomic_load(&node->ptr); if (expected == 0 && atomic_cas(&node->ptr, &expected, &ptr)) { need_alloc = false; break; } } if (need_alloc) node = list_append(head, ptr); return node; } /* Remove a node from the list with the specified value */ bool list_remove(hp_t **head, uintptr_t ptr) { hp_t *node; const uintptr_t nullptr = 0; LIST_ITER (head, node) { uintptr_t expected = atomic_load(&node->ptr); if (expected == ptr && atomic_cas(&node->ptr, &expected, &nullptr)) return true; } return false; } /* Returns 1 if the list currently contains an node with the specified value */ bool list_contains(hp_t **head, uintptr_t ptr) { hp_t *node; LIST_ITER (head, node) { if (atomic_load(&node->ptr) == ptr) return true; } return false; } /* Frees all the nodes in a list - NOT THREAD SAFE */ void list_free(hp_t **head) { hp_t *cur = *head; while (cur) { hp_t *old = cur; cur = cur->next; free(old); } } ``` ## shared domain object ```cpp= #define DEFER_DEALLOC 1 typedef struct { hp_t *pointers; hp_t *retired; void (*deallocator)(void *); } domain_t; /* Create a new domain on the heap */ domain_t *domain_new(void (*deallocator)(void *)) { domain_t *dom = calloc(1, sizeof(domain_t)); if (!dom) return NULL; dom->deallocator = deallocator; return dom; } /* Free a previously allocated domain */ void domain_free(domain_t *dom) { if (!dom) return; if (dom->pointers) list_free(&dom->pointers); if (dom->retired) list_free(&dom->retired); free(dom); } ``` - domain_t 是用來存 hazard pointer 和 retire pointer linked list 的object - domain_free 則是把存在裡面的 object 釋放掉。 ```cpp= /* * Load a safe pointer to a shared object. This pointer must be passed to * `drop` once it is no longer needed. Returns 0 (NULL) on error. */ uintptr_t load(domain_t *dom, const uintptr_t *prot_ptr) { const uintptr_t nullptr = 0; while (1) { uintptr_t val = atomic_load(prot_ptr); // 把 val 存到 &dom->pointers 的 其中一個 node hp_t *node = list_insert_or_append(&dom->pointers, val); if (!node) return 0; /* Hazard pointer inserted successfully */ if (atomic_load(prot_ptr) == val) return val; /* * This pointer is being retired by another thread - remove this hazard * pointer and try again. We first try to remove the hazard pointer we * just used. If someone else used it to drop the same pointer, we walk * the list. */ uintptr_t tmp = val; if (!atomic_cas(&node->ptr, &tmp, &nullptr)) list_remove(&dom->pointers, val); } } /* * Drop a safe pointer to a shared object. This pointer (`safe_val`) must have * come from `load` */ void drop(domain_t *dom, uintptr_t safe_val) { if (!list_remove(&dom->pointers, safe_val)) __builtin_unreachable(); } static void cleanup_ptr(domain_t *dom, uintptr_t ptr, int flags) { if (!list_contains(&dom->pointers, ptr)) { /* deallocate straight away */ dom->deallocator((void *) ptr); } else if (flags & DEFER_DEALLOC) { /* Defer deallocation for later */ list_insert_or_append(&dom->retired, ptr); } else { /* Spin until all readers are done, then deallocate */ while (list_contains(&dom->pointers, ptr)) ; dom->deallocator((void *) ptr); } } /* Swaps the contents of a shared pointer with a new pointer. The old value will * be deallocated by calling the `deallocator` function for the domain, provided * when `domain_new` was called. If `flags` is 0, this function will wait * until no more references to the old object are held in order to deallocate * it. If flags is `DEFER_DEALLOC`, the old object will only be deallocated * if there are already no references to it; otherwise the cleanup will be done * the next time `cleanup` is called. */ void swap(domain_t *dom, uintptr_t *prot_ptr, uintptr_t new_val, int flags) { const uintptr_t old_obj = atomic_exchange(prot_ptr, new_val); cleanup_ptr(dom, old_obj, flags); } /* Forces the cleanup of old objects that have not been deallocated yet. Just * like `swap`, if `flags` is 0, this function will wait until there are no * more references to each object. If `flags` is `DEFER_DEALLOC`, only * objects that already have no living references will be deallocated. */ void cleanup(domain_t *dom, int flags) { hp_t *node; LIST_ITER (&dom->retired, node) { uintptr_t ptr = node->ptr; if (!ptr) continue; if (!list_contains(&dom->pointers, ptr)) { /* We can deallocate straight away */ if (list_remove(&dom->pointers, ptr)) dom->deallocator((void *) ptr); } else if (!(flags & DEFER_DEALLOC)) { /* Spin until all readers are done, then deallocate */ while (list_contains(&dom->pointers, ptr)) ; if (list_remove(&dom->pointers, ptr)) dom->deallocator((void *) ptr); } } } ``` - load: 把資源加到 domain 的 `pointers` 中,因為是 shared domain , 只要呼叫 `load` 的 thread 就會把 資源加到這個 pointers 中。 - drop: 把資源從 shared domain pointers 移除。 - cleanup_ptr 搭配 swap 使用,而 cleanup 則是用來淘汰 retired array。 - swap: 把舊的資源跟新的交換,並把舊的從 domain 中刪除。 ## rw_thread ```cpp= #include <assert.h> #include <err.h> #include <pthread.h> #include <stdio.h> #define N_READERS 1 #define N_WRITERS 1 #define N_ITERS 20 #define ARRAY_SIZE(x) sizeof(x) / sizeof(*x) typedef struct { unsigned int v1, v2, v3; } config_t; static config_t *shared_config; static domain_t *config_dom; config_t *create_config() { config_t *out = calloc(1, sizeof(config_t)); if (!out) err(EXIT_FAILURE, "calloc"); return out; } void delete_config(config_t *conf) { assert(conf); free(conf); } static void print_config(const char *name, const config_t *conf) { printf("%s : { 0x%08x, 0x%08x, 0x%08x }\n", name, conf->v1, conf->v2, conf->v3); } void init() { shared_config = create_config(); config_dom = domain_new(delete_config); if (!config_dom) err(EXIT_FAILURE, "domain_new"); } void deinit() { delete_config(shared_config); domain_free(config_dom); } static void *reader_thread(void *arg) { (void) arg; for (int i = 0; i < N_ITERS; ++i) { config_t *safe_config = (config_t *) load(config_dom, (uintptr_t *) &shared_config); if (!safe_config) err(EXIT_FAILURE, "load"); print_config("read config ", safe_config); drop(config_dom, (uintptr_t) safe_config); } return NULL; } static void *writer_thread(void *arg) { (void) arg; for (int i = 0; i < N_ITERS / 2; ++i) { config_t *new_config = create_config(); new_config->v1 = rand(); new_config->v2 = rand(); new_config->v3 = rand(); print_config("updating config", new_config); swap(config_dom, (uintptr_t *) &shared_config, (uintptr_t) new_config, 0); print_config("updated config ", new_config); } return NULL; } int main() { pthread_t readers[N_READERS], writers[N_WRITERS]; init(); /* Start threads */ for (size_t i = 0; i < ARRAY_SIZE(readers); ++i) { if (pthread_create(readers + i, NULL, reader_thread, NULL)) warn("pthread_create"); } for (size_t i = 0; i < ARRAY_SIZE(writers); ++i) { if (pthread_create(writers + i, NULL, writer_thread, NULL)) warn("pthread_create"); } /* Wait for threads to finish */ for (size_t i = 0; i < ARRAY_SIZE(readers); ++i) { if (pthread_join(readers[i], NULL)) warn("pthread_join"); } for (size_t i = 0; i < ARRAY_SIZE(writers); ++i) { if (pthread_join(writers[i], NULL)) warn("pthread_join"); } deinit(); return EXIT_SUCCESS; } ``` ## 上述程式碼開啟多個 reader 時,無法通過 ThreadSanitizer,請指出具體問題並克服 把 N_READERS 改成多個 `reader`,並使用下列的指令去編譯過後, ``` gcc -o hazard_ptr hazard_ptr.c -lpthread -fsanitize=thread -fPIE -pie ``` 不會產生 data race condition。只有等 N_WRITER 變為多個後才會產生一下的問題。 ``` ================== WARNING: ThreadSanitizer: data race (pid=4006842) Write of size 8 at 0x7b0400000810 by thread T5: #0 free ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:711 (libtsan.so.0+0x36908) #1 delete_config <null> (hazard_ptr+0x1bc6) #2 cleanup_ptr <null> (hazard_ptr+0x191a) #3 swap <null> (hazard_ptr+0x19d5) #4 writer_thread <null> (hazard_ptr+0x1eed) Previous read of size 4 at 0x7b0400000810 by thread T4: #0 print_config <null> (hazard_ptr+0x1c28) #1 writer_thread <null> (hazard_ptr+0x1f03) Thread T5 (tid=4006848, running) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:969 (libtsan.so.0+0x5add5) #1 main <null> (hazard_ptr+0x1fd3) Thread T4 (tid=4006847, finished) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:969 (libtsan.so.0+0x5add5) #1 main <null> (hazard_ptr+0x1fd3) SUMMARY: ThreadSanitizer: data race (/usr/local/google/home/shouenhsiao/side-project/linux/hp/hazard_ptr+0x1bc6) in delete_config ================== ================== WARNING: ThreadSanitizer: data race (pid=4006842) Write of size 8 at 0x7b0400000818 by thread T5: #0 free ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:711 (libtsan.so.0+0x36908) #1 delete_config <null> (hazard_ptr+0x1bc6) #2 cleanup_ptr <null> (hazard_ptr+0x191a) #3 swap <null> (hazard_ptr+0x19d5) #4 writer_thread <null> (hazard_ptr+0x1eed) Previous read of size 4 at 0x7b0400000818 by thread T4: #0 print_config <null> (hazard_ptr+0x1bfd) #1 writer_thread <null> (hazard_ptr+0x1f03) Thread T5 (tid=4006848, running) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:969 (libtsan.so.0+0x5add5) #1 main <null> (hazard_ptr+0x1fd3) Thread T4 (tid=4006847, finished) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:969 (libtsan.so.0+0x5add5) #1 main <null> (hazard_ptr+0x1fd3) SUMMARY: ThreadSanitizer: data race (/usr/local/google/home/shouenhsiao/side-project/linux/hp/hazard_ptr+0x1bc6) in delete_config ``` 從以上可以發現 - `delete_config` 會和 `print_config` 搶資源。所以在想如果使用 `load` 把存好的 new_config 從 shared_config 拿出來,並且設定flag 為 `DEFER_DEALLOC` ,這樣就會確保不會有 data race 發生在 `delete_config` 和 `print_config`。 ```cpp= static void *writer_thread(void *arg) { (void) arg; for (int i = 0; i < N_ITERS / 2; ++i) { config_t *new_config = create_config(); new_config->v1 = rand(); new_config->v2 = rand(); new_config->v3 = rand(); print_config("updating config", new_config); swap(config_dom, (uintptr_t *) &shared_config, (uintptr_t) new_config, DEFER_DEALLOC); config_t *tmp_config = (config_t *) load(config_dom, (uintptr_t*) &shared_config); if(!tmp_config) err(EXIT_FAILURE, "load"); print_config("updated config ", tmp_config); drop(config_dom, (uintptr_t) tmp_config); } return NULL; } ``` 跑出來的結果也的確沒有觸發 ThreadSanitizer。 ## 比較 RCU 和 hazard pointer (HP),探討為何 Linux 不採用 HP 而是另外發展 RCU 根據 [linux 核心設計 RCU討論](https://hackmd.io/@sysprog/linux-rcu#Linux-%E6%A0%B8%E5%BF%83%E8%A8%AD%E8%A8%88-RCU-%E5%90%8C%E6%AD%A5%E6%A9%9F%E5%88%B6)得到以下表格。取自 [Proposed RCU C++ API](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0461r1.pdf) 和 [Folly](https://github.com/facebook/folly/blob/master/folly/synchronization/Rcu.h) 的表格: | | [Reference Counting](https://en.wikipedia.org/wiki/Reference_counting) | RCU | [Hazptr](https://en.wikipedia.org/wiki/Hazard_pointer) | | --- | --- | --- | --- | | Unreclaimed objects | Bounded | Unbounded | Bounded | | Contention among readers | High | None | None | | Traversal forward progress | lock-free | wait-free | lock-free | | Reclamation forward progress | lock-free | blocking | wait-free | | Traversal speed | atomic | no-overhead | folly’s is no-overhead | | Reference acquisition | unconditional | unconditional | conditional | | Automatic reclamation | yes | no | no | | Purpose of domains | N/A | isolate slow reader | configuration |