---
tags: linux2022
---
# 2022q1 Homework5 (quiz5)
contributed by < `Xx-oX` >
> [題目](https://hackmd.io/@sysprog/linux2022-quiz5)
## 測驗 `2`
### [Hazard Pointer](https://en.wikipedia.org/wiki/Hazard_pointer)
在並行程式設計中,當我們在存取共用的記憶體物件時,需要考慮到其他執行緒是否有可能也正在存取同一個物件,若要釋放該記憶體物件時,不考慮這個問題,會引發嚴重的後果,例如 [dangling pointer](https://www.wikiwand.com/en/Dangling_pointer)。
使用 mutex 是最簡單且直觀的方法:存取共用記憶體時,acquire lock 即可保證沒有其他執行緒正在存取同一物件,也就可安全地釋放記憶體。但若我們正在存取的是一種 lock-free 資料結構,當然就不能恣意地使用 lock,因為會違反 lock-free 特性,即無論任何執行緒失敗,其他執行緒都能可繼續執行。於是乎,我們需要某種同為 lock-free 的記憶體物件回收機制。
>[lock-free](https://hackmd.io/@sysprog/concurrency/https%3A%2F%2Fhackmd.io%2F%40sysprog%2Fconcurrency-lockfree): 以系統的觀點來看,只要執行足夠長的時間,至少會有一個執行緒會有進展 (progress),是一種性質
>lock-less: 不使用 mutex / semaphore 之類的 lock 以達到安全無誤地操作共用資料
#### [ABA Problem](https://en.wikipedia.org/wiki/ABA_problem)
任何使用 [compare and swap](https://en.wikipedia.org/wiki/Compare-and-swap) primitive 的 lock-free 資料結構都有可能碰上 ABA 問題。
ABA 問題源自於 cas 的 c (compare),只要被比較的值沒有改變,就會執行 swap 操作,但這個被比較的值可能已經被動過手腳。
舉個例子,設計一個 lock-free stack (改寫自 [lab](https://lumian2015.github.io/lockFreeProgramming/lock-free-stack.html))
```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#define atomic_load(src) __atomic_load_n(src, __ATOMIC_SEQ_CST)
#define atomic_cas(dst, expected, desired) \
__atomic_compare_exchange(dst, expected, desired, 0, __ATOMIC_SEQ_CST, \
__ATOMIC_SEQ_CST)
struct node {
int data;
struct node *next;
};
struct stack {
struct node *top;
}
void push(struct stack *s, int val)
{
struct stack next;
struct stack orig = atomic_load(s);
struct node *new_el = malloc(sizeof(struct node));
do {
new_el->data = val;
new_el->next = orig.top;
} while(!atomic_cas(s, &orig, next));
}
int pop(struct stack *s)
{
struct stack next;
struct stack orig = atomic_load(s);
do{
if(orig.top == NULL)
{
return -1;
}
next.top = orig.top->next;
} while(!atomic_cas(s, &orig, next));
int ret = orig.top.data;
free(orig.top);
return ret;
}
```
假設現在 stack 裡面資料如下 (其中 0x01, 0x02, 0x03 為記憶體位置),然後有兩個 thread T1, T2
```graphviz
digraph g{
rankdir = LR
node[shape=record]
top[label="<p>top|"]
1[label="<p>0x01|1"]
2[label="<p>0x02|2"]
3[label="<p>0x03|3"]
top:p -> 1:p -> 2:p -> 3:p
}
```
T1 執行 pop()
```c
orig.top = 0x01
next.top = 0x02
```
被 T2 插隊, T2 執行 2 次 pop()
stack 變成
```graphviz
digraph g{
rankdir = LR
node[shape=record]
top[label="<p>top|"]
3[label="<p>0x03|3"]
top:p -> 3:p
}
```
T2 再執行一次 push()
stack 變成
```graphviz
digraph g{
rankdir = LR
node[shape=record]
top[label="<p>top|"]
1[label="<p>0x01|4"]
3[label="<p>0x03|3"]
top:p -> 1:p -> 3:p
}
```
這邊假設 0x01 被重新使用了
如此一來,當 T2 結束, T1 繼續運行時執行剛剛被中斷的下一步
```c
atomic_cas(s, &orig, next)
```
此時 orig->top 依舊是 0x01
compare 成立
```c
s->top == orig.top == 0x01
```
所以便將 s 的值 swap 成 next 的值
```c
s->top = next.top == 0x02
```
但是 0x02 已經被釋放了 (在 T2 pop 時),於是就導致了錯誤!!
解決的方法:
* Tagged state reference: 用一個 tag 記錄 pointer 被更動的次數,或者記錄時間戳,以便比對,例如 java 中的 [AtomicStampedReference](https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/AtomicStampedReference.html)
* Intermediate nodes: 使用一些不代表任何 data 的中間節點來標記哪些節點是被刪除的,但 cost 太高
* Deferred reclamation: 利用 [Garbage collection](https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)) 機制、 Hazard pointer 或 [RCU (read copy update)](https://en.wikipedia.org/wiki/Read-copy-update) 來解決
> defer: to delay something until a later time
> defer to sb/sth: to allow someone or something to make decisions for you or tell you what to do because of your respect for them or because of their higher rank, authority, knowledge, etc.
> 與 delay 的差別在於,defer 表示主動推遲某件事,而不是因為突發或不可控的狀況造成的延遲
對於 C 這樣缺乏內建 concurrent GC 機制的程式語言來說。Hazard pointer 是其中一種解決方案,其原理是讀取端執行緒對指標進行識別,指標 (特別是指向的記憶體區塊) 若要釋放時,會事先保存,延遲到確認沒有讀取端執行緒,才進行真正的釋放。Linux 核心的 [RCU 同步機制](https://hackmd.io/@sysprog/linux-rcu)是另一種 lock-free 程式設計演算法和記憶體回收機制。
Hazard pointer 可簡稱為 “HP”,其關鍵的結構有:
* Hazard pointer: 儲存此 thread 正在存取的指標,因此該指標不能直接被釋放
* Retire list (也稱為 thread free list): 被這個 thread 要求釋放的指標,但實際尚未釋放
![](https://i.imgur.com/P1Yh5AV.png)
要安全的釋放記憶體,其基本想法就是:
* 每個執行緒放在 hazard pointer 中的指標尚不能被釋放
* 每個執行緒要求釋放的指標先放在 retire list 中
* 掃描 retire list 可以得知所有執行緒皆不使用的指標,則可將其真正的釋放給作業系統
### 程式碼解釋
定義 atomic 操作,其中 cas 為 compare and swap 的縮寫
```c
/* shortcuts */
#define atomic_load(src) __atomic_load_n(src, __ATOMIC_SEQ_CST)
#define atomic_store(dst, val) __atomic_store(dst, val, __ATOMIC_SEQ_CST)
#define atomic_exchange(ptr, val) \
__atomic_exchange_n(ptr, val, __ATOMIC_SEQ_CST)
#define atomic_cas(dst, expected, desired) \
__atomic_compare_exchange(dst, expected, desired, 0, __ATOMIC_SEQ_CST, \
__ATOMIC_SEQ_CST)
```
> `__ATOMIC_SEQ_CST`
Enforces total ordering with all other `__ATOMIC_SEQ_CST` operations.
儲存 hazard pointer 跟 retire list 的 linked list 資料結構
```c
typedef struct __hp {
uintptr_t ptr;
struct __hp *next;
} hp_t;
```
管理 hazard pointer 的結構
```c
typedef struct {
hp_t *pointers;
hp_t *retired;
void (*deallocator)(void *);
} domain_t;
```
測試用的資源
```c
typedef struct {
unsigned int v1, v2, v3;
} config_t;
```
要填空的部分,cleanup 用來釋放已經在 retire list 中,但還沒被 deallocate 的資源。
flag: 為 0 或者 DEFER_DEALLOC (1)
若為 0 則等待無人使用指標指向的資源
為 DEFER_DEALLOC 時,則先不 deallocate
```c
/* Forces the cleanup of old objects that have not been deallocated yet. Just
* like `swap`, if `flags` is 0, this function will wait until there are no
* more references to each object. If `flags` is `DEFER_DEALLOC`, only
* objects that already have no living references will be deallocated.
*/
void cleanup(domain_t *dom, int flags)
{
hp_t *node;
LIST_ITER (&dom->retired, node) {
uintptr_t ptr = node->ptr;
if (!ptr)
continue;
if (!list_contains(&dom->pointers, ptr)) {
/* We can deallocate straight away */
if (EXP3)
dom->deallocator((void *) ptr);
} else if (!(flags & DEFER_DEALLOC)) {
/* Spin until all readers are done, then deallocate */
while (list_contains(&dom->pointers, ptr))
;
if (EXP4)
dom->deallocator((void *) ptr);
}
}
}
```
`EXP3`: list_remove(&dom->retired, ptr)
`EXP4`: list_remove(&dom->retired, ptr)
要先將 ptr 從 retired list 中移除,再進行 deallocate,故使用 list_remove
list_remove: 將 linked list 中特定的節點移除,成功時回傳 true
使用 atomic_cas 以便在不被插隊的情況下將要移除的位址指向 nullptr
```c
/* Remove a node from the list with the specified value */
bool list_remove(hp_t **head, uintptr_t ptr)
{
hp_t *node;
const uintptr_t nullptr = 0;
LIST_ITER (head, node) {
uintptr_t expected = atomic_load(&node->ptr);
if (expected == ptr && atomic_cas(&node->ptr, &expected, &nullptr))
return true;
}
return false;
}
```
:::spoiler 完整程式碼
```c
/* shortcuts */
#define atomic_load(src) __atomic_load_n(src, __ATOMIC_SEQ_CST)
#define atomic_store(dst, val) __atomic_store(dst, val, __ATOMIC_SEQ_CST)
#define atomic_exchange(ptr, val) \
__atomic_exchange_n(ptr, val, __ATOMIC_SEQ_CST)
#define atomic_cas(dst, expected, desired) \
__atomic_compare_exchange(dst, expected, desired, 0, __ATOMIC_SEQ_CST, \
__ATOMIC_SEQ_CST)
#include <stdint.h>
#define LIST_ITER(head, node) \
for (node = atomic_load(head); node; node = atomic_load(&node->next))
typedef struct __hp {
uintptr_t ptr;
struct __hp *next;
} hp_t;
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
/* Allocate a new node with specified value and append to list */
static hp_t *list_append(hp_t **head, uintptr_t ptr)
{
hp_t *new = calloc(1, sizeof(hp_t));
if (!new)
return NULL;
new->ptr = ptr;
hp_t *old = atomic_load(head);
do {
new->next = old;
} while (!atomic_cas(head, &old, &new));
return new;
}
/* Attempt to find an empty node to store value, otherwise append a new node.
* Returns the node containing the newly added value.
*/
hp_t *list_insert_or_append(hp_t **head, uintptr_t ptr)
{
hp_t *node;
bool need_alloc = true;
LIST_ITER (head, node) {
uintptr_t expected = atomic_load(&node->ptr);
if (expected == 0 && atomic_cas(&node->ptr, &expected, &ptr)) {
need_alloc = false;
break;
}
}
if (need_alloc)
node = list_append(head, ptr);
return node;
}
/* Remove a node from the list with the specified value */
bool list_remove(hp_t **head, uintptr_t ptr)
{
hp_t *node;
const uintptr_t nullptr = 0;
LIST_ITER (head, node) {
uintptr_t expected = atomic_load(&node->ptr);
if (expected == ptr && atomic_cas(&node->ptr, &expected, &nullptr))
return true;
}
return false;
}
/* Returns 1 if the list currently contains an node with the specified value */
bool list_contains(hp_t **head, uintptr_t ptr)
{
hp_t *node;
LIST_ITER (head, node) {
if (atomic_load(&node->ptr) == ptr)
return true;
}
return false;
}
/* Frees all the nodes in a list - NOT THREAD SAFE */
void list_free(hp_t **head)
{
hp_t *cur = *head;
while (cur) {
hp_t *old = cur;
cur = cur->next;
free(old);
}
}
#define DEFER_DEALLOC 1
typedef struct {
hp_t *pointers;
hp_t *retired;
void (*deallocator)(void *);
} domain_t;
/* Create a new domain on the heap */
domain_t *domain_new(void (*deallocator)(void *))
{
domain_t *dom = calloc(1, sizeof(domain_t));
if (!dom)
return NULL;
dom->deallocator = deallocator;
return dom;
}
/* Free a previously allocated domain */
void domain_free(domain_t *dom)
{
if (!dom)
return;
if (dom->pointers)
list_free(&dom->pointers);
if (dom->retired)
list_free(&dom->retired);
free(dom);
}
/*
* Load a safe pointer to a shared object. This pointer must be passed to
* `drop` once it is no longer needed. Returns 0 (NULL) on error.
*/
uintptr_t load(domain_t *dom, const uintptr_t *prot_ptr)
{
const uintptr_t nullptr = 0;
while (1) {
uintptr_t val = atomic_load(prot_ptr);
hp_t *node = list_insert_or_append(&dom->pointers, val);
if (!node)
return 0;
/* Hazard pointer inserted successfully */
if (atomic_load(prot_ptr) == val)
return val;
/*
* This pointer is being retired by another thread - remove this hazard
* pointer and try again. We first try to remove the hazard pointer we
* just used. If someone else used it to drop the same pointer, we walk
* the list.
*/
uintptr_t tmp = val;
if (!atomic_cas(&node->ptr, &tmp, &nullptr))
list_remove(&dom->pointers, val);
}
}
/*
* Drop a safe pointer to a shared object. This pointer (`safe_val`) must have
* come from `load`
*/
void drop(domain_t *dom, uintptr_t safe_val)
{
if (!list_remove(&dom->pointers, safe_val))
__builtin_unreachable();
}
static void cleanup_ptr(domain_t *dom, uintptr_t ptr, int flags)
{
if (!list_contains(&dom->pointers, ptr)) { /* deallocate straight away */
dom->deallocator((void *) ptr);
} else if (flags & DEFER_DEALLOC) { /* Defer deallocation for later */
list_insert_or_append(&dom->retired, ptr);
} else { /* Spin until all readers are done, then deallocate */
while (list_contains(&dom->pointers, ptr))
;
dom->deallocator((void *) ptr);
}
}
/* Swaps the contents of a shared pointer with a new pointer. The old value will
* be deallocated by calling the `deallocator` function for the domain, provided
* when `domain_new` was called. If `flags` is 0, this function will wait
* until no more references to the old object are held in order to deallocate
* it. If flags is `DEFER_DEALLOC`, the old object will only be deallocated
* if there are already no references to it; otherwise the cleanup will be done
* the next time `cleanup` is called.
*/
void swap(domain_t *dom, uintptr_t *prot_ptr, uintptr_t new_val, int flags)
{
const uintptr_t old_obj = atomic_exchange(prot_ptr, new_val);
cleanup_ptr(dom, old_obj, flags);
}
/* Forces the cleanup of old objects that have not been deallocated yet. Just
* like `swap`, if `flags` is 0, this function will wait until there are no
* more references to each object. If `flags` is `DEFER_DEALLOC`, only
* objects that already have no living references will be deallocated.
*/
void cleanup(domain_t *dom, int flags)
{
hp_t *node;
LIST_ITER (&dom->retired, node) {
uintptr_t ptr = node->ptr;
if (!ptr)
continue;
if (!list_contains(&dom->pointers, ptr)) {
/* We can deallocate straight away */
if (list_remove(&dom->retired, ptr))
dom->deallocator((void *) ptr);
} else if (!(flags & DEFER_DEALLOC)) {
/* Spin until all readers are done, then deallocate */
while (list_contains(&dom->pointers, ptr))
;
if (list_remove(&dom->retired, ptr))
dom->deallocator((void *) ptr);
}
}
}
#include <assert.h>
#include <err.h>
#include <pthread.h>
#include <stdio.h>
#define N_READERS 1
#define N_WRITERS 1
#define N_ITERS 20
#define ARRAY_SIZE(x) sizeof(x) / sizeof(*x)
typedef struct {
unsigned int v1, v2, v3;
} config_t;
static config_t *shared_config;
static domain_t *config_dom;
config_t *create_config()
{
config_t *out = calloc(1, sizeof(config_t));
if (!out)
err(EXIT_FAILURE, "calloc");
return out;
}
void delete_config(config_t *conf)
{
assert(conf);
free(conf);
}
static void print_config(const char *name, const config_t *conf)
{
printf("%s : { 0x%08x, 0x%08x, 0x%08x }\n", name, conf->v1, conf->v2,
conf->v3);
}
void init()
{
shared_config = create_config();
config_dom = domain_new(delete_config);
if (!config_dom)
err(EXIT_FAILURE, "domain_new");
}
void deinit()
{
delete_config(shared_config);
domain_free(config_dom);
}
static void *reader_thread(void *arg)
{
(void) arg;
for (int i = 0; i < N_ITERS; ++i) {
config_t *safe_config =
(config_t *) load(config_dom, (uintptr_t *) &shared_config);
if (!safe_config)
err(EXIT_FAILURE, "load");
print_config("read config ", safe_config);
drop(config_dom, (uintptr_t) safe_config);
}
return NULL;
}
static void *writer_thread(void *arg)
{
(void) arg;
for (int i = 0; i < N_ITERS / 2; ++i) {
config_t *new_config = create_config();
new_config->v1 = rand();
new_config->v2 = rand();
new_config->v3 = rand();
print_config("updating config", new_config);
swap(config_dom, (uintptr_t *) &shared_config, (uintptr_t) new_config,
0);
print_config("updated config ", new_config);
}
return NULL;
}
int main()
{
pthread_t readers[N_READERS], writers[N_WRITERS];
init();
/* Start threads */
for (size_t i = 0; i < ARRAY_SIZE(readers); ++i) {
if (pthread_create(readers + i, NULL, reader_thread, NULL))
warn("pthread_create");
}
for (size_t i = 0; i < ARRAY_SIZE(writers); ++i) {
if (pthread_create(writers + i, NULL, writer_thread, NULL))
warn("pthread_create");
}
/* Wait for threads to finish */
for (size_t i = 0; i < ARRAY_SIZE(readers); ++i) {
if (pthread_join(readers[i], NULL))
warn("pthread_join");
}
for (size_t i = 0; i < ARRAY_SIZE(writers); ++i) {
if (pthread_join(writers[i], NULL))
warn("pthread_join");
}
deinit();
return EXIT_SUCCESS;
}
```
:::
### 無法通過 Thread Sanitizer
#### 1 reader 1 writer
:::spoiler ThreadSanitizer 訊息
```
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
updating config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 }
==================
WARNING: ThreadSanitizer: data race (pid=14932)
Write of size 8 at 0x7b0400000000 by thread T2:
#0 free ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:707 (libtsan.so.0+0x35f25)
#1 delete_config <null> (test+0x13f2)
#2 cleanup_ptr <null> (test+0x1df3)
Previous read of size 4 at 0x7b0400000004 by thread T1:
#0 print_config <null> (test+0x1445)
#1 reader_thread <null> (test+0x15ee)
Thread T2 (tid=14935, running) created by main thread at:
#0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
#1 main <null> (test+0x1858)
Thread T1 (tid=14934, finished) created by main thread at:
#0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
#1 main <null> (test+0x1801)
SUMMARY: ThreadSanitizer: data race (/home/ycwu4142/linux2022/quiz5/test+0x13f2) in delete_config
==================
==================
WARNING: ThreadSanitizer: data race (pid=14932)
Write of size 8 at 0x7b0400000008 by thread T2:
#0 free ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:707 (libtsan.so.0+0x35f25)
#1 delete_config <null> (test+0x13f2)
#2 cleanup_ptr <null> (test+0x1df3)
Previous read of size 4 at 0x7b0400000008 by thread T1:
#0 print_config <null> (test+0x142d)
#1 reader_thread <null> (test+0x15ee)
Thread T2 (tid=14935, running) created by main thread at:
#0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
#1 main <null> (test+0x1858)
Thread T1 (tid=14934, finished) created by main thread at:
#0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
#1 main <null> (test+0x1801)
SUMMARY: ThreadSanitizer: data race (/home/ycwu4142/linux2022/quiz5/test+0x13f2) in delete_config
==================
updated config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 }
ThreadSanitizer: reported 2 warnings
```
:::
> 為了簡化,僅使用 2 個 iter
大意是說,T1 (reader thread) 在 print_config (讀取並印出) 時跟 T2 (writer thread) 執行的 delete_config 會有 data race 的情形
仔細查看,發現在 writer_thread 中 swap(將 shared_config 的值更新)時,會執行 cleanup_ptr 進而引發 delete_config 將舊的指標釋放掉
```c
void cleanup_ptr(domain_t *dom, uintptr_t ptr, int flags)
{
if (!list_contains(&dom->pointers, ptr)) { /* deallocate straight away */
dom->deallocator((void *) ptr);
} else if (flags & DEFER_DEALLOC) { /* Defer deallocation for later */
list_insert_or_append(&dom->retired, ptr);
} else { /* Spin until all readers are done, then deallocate */
while (list_contains(&dom->pointers, ptr))
;
dom->deallocator((void *) ptr);
}
}
```
將傳入的 flags 改成 `DEFER_DEALLOC`,試圖讓 cleanup_ptr 將指標 (ptr) 丟到 retired list 而不是直接 deallocate,但發現結果沒變。
更進一步查看發現原來是進到了第一個 case - hazard pointer 裡面沒有這個指標 (ptr)
:::warning
若是 writer 在 reader 讀取的過程中試圖 swap ,理論上該指標必然會在 hazard pointer 的串列之中,因為 reader 使用了 load 來讀取,而 load 會執行 insert_or_append ,將指標放進 hazard pointer 中,直到呼叫 drop 來解除。
:::
> 參考 [kdnvt](https://hackmd.io/@kdnvt/linux2022-quiz5#Memory-leak) 同學與 [bakudr18](https://hackmd.io/@bakudr18/r1Lbsu27q#%E9%87%8D%E6%96%B0%E5%AF%A6%E9%A9%97) 同學的共筆中的分析方式
```c=
reader | writer
--------------------+------------------------
| /* swap */
| val = atomic_exchange(ptr,new);
| /* cleanup_ptr */
| if(!list_contain(val)){
/* load */ |
list_insert(val); |
|
/* load on ptr */ |
if(ptr == val){ |
return val; |
} |
| free(val);
| }
```
由於 reader 在第 9 行才將 val insert 到 hazard pointer 裡面,而 writer 在第 6 行檢查時就找不到 val,所以便直接釋放 val
另外從兩位同學的共筆中還發現第 14 行的 free 會將 reader 讀到的值 free 掉,巧合的是正好我的 cpu 就照這樣的順序跑...
以上的結果是我將 hp 相關程式碼分離出來,編譯成 .o 檔 (ELF 格式),再連結的結果
> 見 [code](https://github.com/Xx-oX/Quiz5-Hazard-Pointer)
但如果按直接塞在一個檔案中直接編譯則有機率跑出以下結果
```c=
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
updating config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 }
updated config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 }
updating config : { 0x66334873, 0x74b0dc51, 0x19495cff }
read config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 }
read config : { 0x66334873, 0x74b0dc51, 0x19495cff }
read config : { 0x66334873, 0x74b0dc51, 0x19495cff }
updated config : { 0x66334873, 0x74b0dc51, 0x19495cff }
updating config : { 0x2ae8944a, 0x625558ec, 0x238e1f29 }
read config : { 0x66334873, 0x74b0dc51, 0x19495cff }
updated config : { 0x2ae8944a, 0x625558ec, 0x238e1f29 }
updating config : { 0x46e87ccd, 0x3d1b58ba, 0x507ed7ab }
updated config : { 0x46e87ccd, 0x3d1b58ba, 0x507ed7ab }
updating config : { 0x2eb141f2, 0x41b71efb, 0x79e2a9e3 }
updated config : { 0x2eb141f2, 0x41b71efb, 0x79e2a9e3 }
updating config : { 0x7545e146, 0x515f007c, 0x5bd062c2 }
updated config : { 0x7545e146, 0x515f007c, 0x5bd062c2 }
updating config : { 0x12200854, 0x4db127f8, 0x0216231b }
updated config : { 0x12200854, 0x4db127f8, 0x0216231b }
updating config : { 0x1f16e9e8, 0x1190cde7, 0x66ef438d }
updated config : { 0x1f16e9e8, 0x1190cde7, 0x66ef438d }
updating config : { 0x140e0f76, 0x3352255a, 0x109cf92e }
updated config : { 0x140e0f76, 0x3352255a, 0x109cf92e }
updating config : { 0x0ded7263, 0x7fdcc233, 0x1befd79f }
updated config : { 0x0ded7263, 0x7fdcc233, 0x1befd79f }
```
其中 15 ~ 18 行很明顯就有 racing 的狀況,但沒有被 thread sanitizer 抓出來
#### multi-readers multi-writers
當有多個 writer 時也會發生 data race
:::spoiler Thread Sanitizer 訊息
```c
read config : { 0x00000000, 0x00000000, 0x00000000 }
read config : { 0x00000000, 0x00000000, 0x00000000 }
updating config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 }
updated config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 }
updating config : { 0x66334873, 0x74b0dc51, 0x19495cff }
==================
WARNING: ThreadSanitizer: data race (pid=2670)
Write of size 8 at 0x7b0400000810 by thread T3:
#0 free <null> (libtsan.so.0+0x35f45)
#1 delete_config <null> (a.out+0x1d7c)
#2 cleanup_ptr <null> (a.out+0x1ac3)
#3 swap <null> (a.out+0x1b82)
#4 writer_thread <null> (a.out+0x2089)
#5 <null> <null> (libtsan.so.0+0x2d1af)
Previous read of size 4 at 0x7b0400000810 by thread T2:
#0 print_config <null> (a.out+0x1de2)
#1 writer_thread <null> (a.out+0x209c)
#2 <null> <null> (libtsan.so.0+0x2d1af)
Thread T3 (tid=2674, running) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x5ea99)
#1 main <null> (a.out+0x217d)
Thread T2 (tid=2673, finished) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x5ea99)
#1 main <null> (a.out+0x217d)
SUMMARY: ThreadSanitizer: data race (/lib/x86_64-linux-gnu/libtsan.so.0+0x35f45) in __interceptor_free
==================
==================
WARNING: ThreadSanitizer: data race (pid=2670)
Write of size 8 at 0x7b0400000818 by thread T3:
#0 free <null> (libtsan.so.0+0x35f45)
#1 delete_config <null> (a.out+0x1d7c)
#2 cleanup_ptr <null> (a.out+0x1ac3)
#3 swap <null> (a.out+0x1b82)
#4 writer_thread <null> (a.out+0x2089)
#5 <null> <null> (libtsan.so.0+0x2d1af)
Previous read of size 4 at 0x7b0400000818 by thread T2:
#0 print_config <null> (a.out+0x1db7)
#1 writer_thread <null> (a.out+0x209c)
#2 <null> <null> (libtsan.so.0+0x2d1af)
Thread T3 (tid=2674, running) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x5ea99)
#1 main <null> (a.out+0x217d)
Thread T2 (tid=2673, finished) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x5ea99)
#1 main <null> (a.out+0x217d)
SUMMARY: ThreadSanitizer: data race (/lib/x86_64-linux-gnu/libtsan.so.0+0x35f45) in __interceptor_free
==================
updated config : { 0x66334873, 0x74b0dc51, 0x19495cff }
ThreadSanitizer: reported 2 warnings
```
:::
可以看出 writer 跟 writer 間有 UAF (use after free) 的問題
解決方法:在 writer print_config 之前強行 load 一次 shared_config,確保待會印出的內容沒有被其他人釋放
```c
static void *writer_thread(void *arg)
{
(void) arg;
for (int i = 0; i < N_ITERS / 2; ++i) {
config_t *new_config = create_config();
new_config->v1 = rand();
new_config->v2 = rand();
new_config->v3 = rand();
print_config("updating config", new_config);
swap(config_dom, (uintptr_t *) &shared_config, (uintptr_t) new_config,
0);
config_t *safe_config =
(config_t *) load(config_dom, (uintptr_t *) &shared_config);
if (!safe_config)
err(EXIT_FAILURE, "load");
print_config("updated config ", safe_config);
drop(config_dom, (uintptr_t) safe_config);
}
return NULL;
}
```
注意 load 完要 drop,不然會一直占用資源 (ptr 持續留在 hp 中)
### 比較 RCU 與 HP
RCU 適合用於 1 個 writer 多個 reader 的情形
利用 [訂閱-發布機制](https://hackmd.io/@sysprog/linux-rcu#%E8%A8%82%E9%96%B1%E2%80%94%E7%99%BC%E5%B8%83%E6%A9%9F%E5%88%B6) 來確保所有 reader 讀取到的資源都沒有被釋放,雖然內容可能不是最新的 (因此只能用於 [DNS resolution](https://nlp.stanford.edu/IR-book/html/htmledition/dns-resolution-1.html) 之類即使存取舊的資料,也不會影響最終行為正確性的場景)。
> 訂閱-發布機制: writer 更改資料然後發布,此時會進入「等待讀取端執行緒離開 critical section」等待所有訂閱該資料的 reader 都離開這個 critical section 後才釋放舊內容的記憶體
RCU 與 Hazard pointer 最大的區別在於 HP 保護的是一個 pointer 或者說一個記憶體區塊,而 RCU 則是保護一整個 critical section
整體來說 RCU 適用於 read-intensive workloads 而 HP 適用於 update rates 高的 workload
> 參見 [Proposed RCU C++ API](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0461r1.pdf)
也就是說 write/read 比率高 => HP
write/read 比率低 => RCU
> 取自 [對比其他 lock-free 同步機制](https://hackmd.io/@sysprog/linux-rcu#%E5%B0%8D%E6%AF%94%E5%85%B6%E4%BB%96-lock-free-%E5%90%8C%E6%AD%A5%E6%A9%9F%E5%88%B6)
| | RCU| HP |
| -------- | -------- | -------- |
|Unreclaimed objects|Unbounded|Bounded|
|Traversal forward progress|wait-free|lock-free|
|Reclamation forward progress|blocking|wait-free|
|Reference acquisition|unconditional|conditional|
> * wait-free 是 non-blocking 之中強度最高的,它保證所有的指令可以 bound 在某個時間內完成,也就是所有的執行緒經過一段足夠的時間後,一定會有所進展
> * lock-free 稍微弱一點,它允許部份執行緒被暫停,但仍然保證整體程式的其他部份會有進展
最差的情況下,就算只有一個執行緒在執行,其他所有執行緒都在等待,仍然屬於 lock-free
在 [ABA 問題](https://en.wikipedia.org/wiki/ABA_problem) 中解決辦法的部分提到
> Hazard pointers are lock-free, but can only track at most a fixed number of elements per thread as being in-use.
不確定跟表格中的 Unreclaimed objects 是不是同一件事
我認為能追蹤的數量有限是因為 retire list 需要 clean up,而執行 clean up 的時機,通常是 retire list 中的內容到達一定數量 (就好比垃圾桶滿了 => 此時需要傾倒)。因此需要設置上限
(定義何謂「滿」) ,而這個滿就限制了 HP 能夠追蹤的量