Try   HackMD

2022q1 Homework5 (quiz5)

contributed by < Xx-oX >

題目

測驗 2

Hazard Pointer

在並行程式設計中,當我們在存取共用的記憶體物件時,需要考慮到其他執行緒是否有可能也正在存取同一個物件,若要釋放該記憶體物件時,不考慮這個問題,會引發嚴重的後果,例如 dangling pointer

使用 mutex 是最簡單且直觀的方法:存取共用記憶體時,acquire lock 即可保證沒有其他執行緒正在存取同一物件,也就可安全地釋放記憶體。但若我們正在存取的是一種 lock-free 資料結構,當然就不能恣意地使用 lock,因為會違反 lock-free 特性,即無論任何執行緒失敗,其他執行緒都能可繼續執行。於是乎,我們需要某種同為 lock-free 的記憶體物件回收機制。

lock-free: 以系統的觀點來看,只要執行足夠長的時間,至少會有一個執行緒會有進展 (progress),是一種性質
lock-less: 不使用 mutex / semaphore 之類的 lock 以達到安全無誤地操作共用資料

ABA Problem

任何使用 compare and swap primitive 的 lock-free 資料結構都有可能碰上 ABA 問題。

ABA 問題源自於 cas 的 c (compare),只要被比較的值沒有改變,就會執行 swap 操作,但這個被比較的值可能已經被動過手腳。
舉個例子,設計一個 lock-free stack (改寫自 lab)

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define atomic_load(src) __atomic_load_n(src, __ATOMIC_SEQ_CST)
#define atomic_cas(dst, expected, desired)                                 \
    __atomic_compare_exchange(dst, expected, desired, 0, __ATOMIC_SEQ_CST, \
                              __ATOMIC_SEQ_CST)

struct node {
    int data;
    struct node *next;
};

struct stack {
    struct node *top;
}

void push(struct stack *s, int val)
{
    struct stack next;
    struct stack orig = atomic_load(s);
    struct node *new_el = malloc(sizeof(struct node));
    do {
        new_el->data = val;
        new_el->next = orig.top;
    } while(!atomic_cas(s, &orig, next));
}

int pop(struct stack *s)
{
    struct stack next;
    struct stack orig = atomic_load(s);
    do{
        if(orig.top == NULL)
        {
            return -1;
        }
        next.top = orig.top->next;
    } while(!atomic_cas(s, &orig, next));
    int ret = orig.top.data;
    free(orig.top);
    return ret;
}

假設現在 stack 裡面資料如下 (其中 0x01, 0x02, 0x03 為記憶體位置),然後有兩個 thread T1, T2







g



top

top

 



1

0x01

1



top:p->1:p





2

0x02

2



1:p->2:p





3

0x03

3



2:p->3:p





T1 執行 pop()

orig.top = 0x01
next.top = 0x02

被 T2 插隊, T2 執行 2 次 pop()
stack 變成







g



top

top

 



3

0x03

3



top:p->3:p





T2 再執行一次 push()
stack 變成







g



top

top

 



1

0x01

4



top:p->1:p





3

0x03

3



1:p->3:p





這邊假設 0x01 被重新使用了
如此一來,當 T2 結束, T1 繼續運行時執行剛剛被中斷的下一步

atomic_cas(s, &orig, next)

此時 orig->top 依舊是 0x01
compare 成立

s->top == orig.top == 0x01

所以便將 s 的值 swap 成 next 的值

s->top = next.top == 0x02

但是 0x02 已經被釋放了 (在 T2 pop 時),於是就導致了錯誤!!

解決的方法:

  • Tagged state reference: 用一個 tag 記錄 pointer 被更動的次數,或者記錄時間戳,以便比對,例如 java 中的 AtomicStampedReference
  • Intermediate nodes: 使用一些不代表任何 data 的中間節點來標記哪些節點是被刪除的,但 cost 太高
  • Deferred reclamation: 利用 Garbage collection 機制、 Hazard pointer 或 RCU (read copy update) 來解決

defer: to delay something until a later time
defer to sb/sth: to allow someone or something to make decisions for you or tell you what to do because of your respect for them or because of their higher rank, authority, knowledge, etc.
與 delay 的差別在於,defer 表示主動推遲某件事,而不是因為突發或不可控的狀況造成的延遲

對於 C 這樣缺乏內建 concurrent GC 機制的程式語言來說。Hazard pointer 是其中一種解決方案,其原理是讀取端執行緒對指標進行識別,指標 (特別是指向的記憶體區塊) 若要釋放時,會事先保存,延遲到確認沒有讀取端執行緒,才進行真正的釋放。Linux 核心的 RCU 同步機制是另一種 lock-free 程式設計演算法和記憶體回收機制。

Hazard pointer 可簡稱為 “HP”,其關鍵的結構有:

  • Hazard pointer: 儲存此 thread 正在存取的指標,因此該指標不能直接被釋放
  • Retire list (也稱為 thread free list): 被這個 thread 要求釋放的指標,但實際尚未釋放
    Image Not Showing Possible Reasons
    • The image file may be corrupted
    • The server hosting the image is unavailable
    • The image path is incorrect
    • The image format is not supported
    Learn More →

要安全的釋放記憶體,其基本想法就是:

  • 每個執行緒放在 hazard pointer 中的指標尚不能被釋放
  • 每個執行緒要求釋放的指標先放在 retire list 中
  • 掃描 retire list 可以得知所有執行緒皆不使用的指標,則可將其真正的釋放給作業系統

程式碼解釋

定義 atomic 操作,其中 cas 為 compare and swap 的縮寫

/* shortcuts */
#define atomic_load(src) __atomic_load_n(src, __ATOMIC_SEQ_CST)
#define atomic_store(dst, val) __atomic_store(dst, val, __ATOMIC_SEQ_CST)
#define atomic_exchange(ptr, val) \
    __atomic_exchange_n(ptr, val, __ATOMIC_SEQ_CST)
#define atomic_cas(dst, expected, desired)                                 \
    __atomic_compare_exchange(dst, expected, desired, 0, __ATOMIC_SEQ_CST, \
                              __ATOMIC_SEQ_CST)

__ATOMIC_SEQ_CST
Enforces total ordering with all other __ATOMIC_SEQ_CST operations.

儲存 hazard pointer 跟 retire list 的 linked list 資料結構

typedef struct __hp {
    uintptr_t ptr;
    struct __hp *next;
} hp_t;

管理 hazard pointer 的結構

typedef struct {
    hp_t *pointers;
    hp_t *retired;
    void (*deallocator)(void *);
} domain_t;

測試用的資源

typedef struct {
    unsigned int v1, v2, v3;
} config_t;

要填空的部分,cleanup 用來釋放已經在 retire list 中,但還沒被 deallocate 的資源。
flag: 為 0 或者 DEFER_DEALLOC (1)
若為 0 則等待無人使用指標指向的資源
為 DEFER_DEALLOC 時,則先不 deallocate

/* Forces the cleanup of old objects that have not been deallocated yet. Just
 * like `swap`, if `flags` is 0, this function will wait until there are no
 * more references to each object. If `flags` is `DEFER_DEALLOC`, only
 * objects that already have no living references will be deallocated.
 */
void cleanup(domain_t *dom, int flags)
{
    hp_t *node;

    LIST_ITER (&dom->retired, node) {
        uintptr_t ptr = node->ptr;
        if (!ptr)
            continue;

        if (!list_contains(&dom->pointers, ptr)) {
            /* We can deallocate straight away */
            if (EXP3)
                dom->deallocator((void *) ptr);
        } else if (!(flags & DEFER_DEALLOC)) {
            /* Spin until all readers are done, then deallocate */
            while (list_contains(&dom->pointers, ptr))
                ;
            if (EXP4)
                dom->deallocator((void *) ptr);
        }
    }
}

EXP3: list_remove(&dom->retired, ptr)
EXP4: list_remove(&dom->retired, ptr)

要先將 ptr 從 retired list 中移除,再進行 deallocate,故使用 list_remove

list_remove: 將 linked list 中特定的節點移除,成功時回傳 true
使用 atomic_cas 以便在不被插隊的情況下將要移除的位址指向 nullptr

/* Remove a node from the list with the specified value */
bool list_remove(hp_t **head, uintptr_t ptr)
{
    hp_t *node;
    const uintptr_t nullptr = 0;

    LIST_ITER (head, node) {
        uintptr_t expected = atomic_load(&node->ptr);
        if (expected == ptr && atomic_cas(&node->ptr, &expected, &nullptr))
            return true;
    }

    return false;
}
完整程式碼
/* shortcuts */
#define atomic_load(src) __atomic_load_n(src, __ATOMIC_SEQ_CST)
#define atomic_store(dst, val) __atomic_store(dst, val, __ATOMIC_SEQ_CST)
#define atomic_exchange(ptr, val) \
    __atomic_exchange_n(ptr, val, __ATOMIC_SEQ_CST)
#define atomic_cas(dst, expected, desired)                                 \
    __atomic_compare_exchange(dst, expected, desired, 0, __ATOMIC_SEQ_CST, \
                              __ATOMIC_SEQ_CST)

#include <stdint.h>

#define LIST_ITER(head, node) \
    for (node = atomic_load(head); node; node = atomic_load(&node->next))

typedef struct __hp {
    uintptr_t ptr;
    struct __hp *next;
} hp_t;

#include <stdbool.h>
#include <stdlib.h>
#include <string.h>


/* Allocate a new node with specified value and append to list */
static hp_t *list_append(hp_t **head, uintptr_t ptr)
{
    hp_t *new = calloc(1, sizeof(hp_t));
    if (!new)
        return NULL;

    new->ptr = ptr;
    hp_t *old = atomic_load(head);

    do {
        new->next = old;
    } while (!atomic_cas(head, &old, &new));

    return new;
}

/* Attempt to find an empty node to store value, otherwise append a new node.
 * Returns the node containing the newly added value.
 */
hp_t *list_insert_or_append(hp_t **head, uintptr_t ptr)
{
    hp_t *node;
    bool need_alloc = true;

    LIST_ITER (head, node) {
        uintptr_t expected = atomic_load(&node->ptr);
        if (expected == 0 && atomic_cas(&node->ptr, &expected, &ptr)) {
            need_alloc = false;
            break;
        }
    }

    if (need_alloc)
        node = list_append(head, ptr);

    return node;
}

/* Remove a node from the list with the specified value */
bool list_remove(hp_t **head, uintptr_t ptr)
{
    hp_t *node;
    const uintptr_t nullptr = 0;

    LIST_ITER (head, node) {
        uintptr_t expected = atomic_load(&node->ptr);
        if (expected == ptr && atomic_cas(&node->ptr, &expected, &nullptr))
            return true;
    }

    return false;
}

/* Returns 1 if the list currently contains an node with the specified value */
bool list_contains(hp_t **head, uintptr_t ptr)
{
    hp_t *node;

    LIST_ITER (head, node) {
        if (atomic_load(&node->ptr) == ptr)
            return true;
    }

    return false;
}

/* Frees all the nodes in a list - NOT THREAD SAFE */
void list_free(hp_t **head)
{
    hp_t *cur = *head;
    while (cur) {
        hp_t *old = cur;
        cur = cur->next;
        free(old);
    }
}

#define DEFER_DEALLOC 1

typedef struct {
    hp_t *pointers;
    hp_t *retired;
    void (*deallocator)(void *);
} domain_t;

/* Create a new domain on the heap */
domain_t *domain_new(void (*deallocator)(void *))
{
    domain_t *dom = calloc(1, sizeof(domain_t));
    if (!dom)
        return NULL;

    dom->deallocator = deallocator;
    return dom;
}

/* Free a previously allocated domain */
void domain_free(domain_t *dom)
{
    if (!dom)
        return;

    if (dom->pointers)
        list_free(&dom->pointers);

    if (dom->retired)
        list_free(&dom->retired);

    free(dom);
}

/*
 * Load a safe pointer to a shared object. This pointer must be passed to
 * `drop` once it is no longer needed. Returns 0 (NULL) on error.
 */
uintptr_t load(domain_t *dom, const uintptr_t *prot_ptr)
{
    const uintptr_t nullptr = 0;

    while (1) {
        uintptr_t val = atomic_load(prot_ptr);
        hp_t *node = list_insert_or_append(&dom->pointers, val);
        if (!node)
            return 0;

        /* Hazard pointer inserted successfully */
        if (atomic_load(prot_ptr) == val)
            return val;

        /*
         * This pointer is being retired by another thread - remove this hazard
         * pointer and try again. We first try to remove the hazard pointer we
         * just used. If someone else used it to drop the same pointer, we walk
         * the list.
         */
        uintptr_t tmp = val;
        if (!atomic_cas(&node->ptr, &tmp, &nullptr))
            list_remove(&dom->pointers, val);
    }
}

/*
 * Drop a safe pointer to a shared object. This pointer (`safe_val`) must have
 * come from `load`
 */
void drop(domain_t *dom, uintptr_t safe_val)
{
    if (!list_remove(&dom->pointers, safe_val))
        __builtin_unreachable();
}

static void cleanup_ptr(domain_t *dom, uintptr_t ptr, int flags)
{
    if (!list_contains(&dom->pointers, ptr)) { /* deallocate straight away */
        dom->deallocator((void *) ptr);
    } else if (flags & DEFER_DEALLOC) { /* Defer deallocation for later */
        list_insert_or_append(&dom->retired, ptr);
    } else { /* Spin until all readers are done, then deallocate */
        while (list_contains(&dom->pointers, ptr))
            ;
        dom->deallocator((void *) ptr);
    }
}

/* Swaps the contents of a shared pointer with a new pointer. The old value will
 * be deallocated by calling the `deallocator` function for the domain, provided
 * when `domain_new` was called. If `flags` is 0, this function will wait
 * until no more references to the old object are held in order to deallocate
 * it. If flags is `DEFER_DEALLOC`, the old object will only be deallocated
 * if there are already no references to it; otherwise the cleanup will be done
 * the next time `cleanup` is called.
 */
void swap(domain_t *dom, uintptr_t *prot_ptr, uintptr_t new_val, int flags)
{
    const uintptr_t old_obj = atomic_exchange(prot_ptr, new_val);
    cleanup_ptr(dom, old_obj, flags);
}

/* Forces the cleanup of old objects that have not been deallocated yet. Just
 * like `swap`, if `flags` is 0, this function will wait until there are no
 * more references to each object. If `flags` is `DEFER_DEALLOC`, only
 * objects that already have no living references will be deallocated.
 */
void cleanup(domain_t *dom, int flags)
{
    hp_t *node;

    LIST_ITER (&dom->retired, node) {
        uintptr_t ptr = node->ptr;
        if (!ptr)
            continue;

        if (!list_contains(&dom->pointers, ptr)) {
            /* We can deallocate straight away */
            if (list_remove(&dom->retired, ptr))
                dom->deallocator((void *) ptr);
        } else if (!(flags & DEFER_DEALLOC)) {
            /* Spin until all readers are done, then deallocate */
            while (list_contains(&dom->pointers, ptr))
                ;
            if (list_remove(&dom->retired, ptr))
                dom->deallocator((void *) ptr);
        }
    }
}

#include <assert.h>
#include <err.h>
#include <pthread.h>
#include <stdio.h>

#define N_READERS 1
#define N_WRITERS 1
#define N_ITERS 20
#define ARRAY_SIZE(x) sizeof(x) / sizeof(*x)

typedef struct {
    unsigned int v1, v2, v3;
} config_t;

static config_t *shared_config;
static domain_t *config_dom;

config_t *create_config()
{
    config_t *out = calloc(1, sizeof(config_t));
    if (!out)
        err(EXIT_FAILURE, "calloc");
    return out;
}

void delete_config(config_t *conf)
{
    assert(conf);
    free(conf);
}

static void print_config(const char *name, const config_t *conf)
{
    printf("%s : { 0x%08x, 0x%08x, 0x%08x }\n", name, conf->v1, conf->v2,
           conf->v3);
}

void init()
{
    shared_config = create_config();
    config_dom = domain_new(delete_config);
    if (!config_dom)
        err(EXIT_FAILURE, "domain_new");
}

void deinit()
{
    delete_config(shared_config);
    domain_free(config_dom);
}

static void *reader_thread(void *arg)
{
    (void) arg;

    for (int i = 0; i < N_ITERS; ++i) {
        config_t *safe_config =
            (config_t *) load(config_dom, (uintptr_t *) &shared_config);
        if (!safe_config)
            err(EXIT_FAILURE, "load");

        print_config("read config    ", safe_config);
        drop(config_dom, (uintptr_t) safe_config);
    }

    return NULL;
}

static void *writer_thread(void *arg)
{
    (void) arg;

    for (int i = 0; i < N_ITERS / 2; ++i) {
        config_t *new_config = create_config();
        new_config->v1 = rand();
        new_config->v2 = rand();
        new_config->v3 = rand();
        print_config("updating config", new_config);

        swap(config_dom, (uintptr_t *) &shared_config, (uintptr_t) new_config,
             0);
        print_config("updated config ", new_config);
    }

    return NULL;
}

int main()
{
    pthread_t readers[N_READERS], writers[N_WRITERS];

    init();

    /* Start threads */
    for (size_t i = 0; i < ARRAY_SIZE(readers); ++i) {
        if (pthread_create(readers + i, NULL, reader_thread, NULL))
            warn("pthread_create");
    }
    for (size_t i = 0; i < ARRAY_SIZE(writers); ++i) {
        if (pthread_create(writers + i, NULL, writer_thread, NULL))
            warn("pthread_create");
    }

    /* Wait for threads to finish */
    for (size_t i = 0; i < ARRAY_SIZE(readers); ++i) {
        if (pthread_join(readers[i], NULL))
            warn("pthread_join");
    }
    for (size_t i = 0; i < ARRAY_SIZE(writers); ++i) {
        if (pthread_join(writers[i], NULL))
            warn("pthread_join");
    }

    deinit();

    return EXIT_SUCCESS;
}

無法通過 Thread Sanitizer

1 reader 1 writer

ThreadSanitizer 訊息
read config     : { 0x00000000, 0x00000000, 0x00000000 }
read config     : { 0x00000000, 0x00000000, 0x00000000 }
updating config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 }
==================
WARNING: ThreadSanitizer: data race (pid=14932)
  Write of size 8 at 0x7b0400000000 by thread T2:
    #0 free ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:707 (libtsan.so.0+0x35f25)
    #1 delete_config <null> (test+0x13f2)
    #2 cleanup_ptr <null> (test+0x1df3)

  Previous read of size 4 at 0x7b0400000004 by thread T1:
    #0 print_config <null> (test+0x1445)
    #1 reader_thread <null> (test+0x15ee)

  Thread T2 (tid=14935, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 main <null> (test+0x1858)

  Thread T1 (tid=14934, finished) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 main <null> (test+0x1801)

SUMMARY: ThreadSanitizer: data race (/home/ycwu4142/linux2022/quiz5/test+0x13f2) in delete_config
==================
==================
WARNING: ThreadSanitizer: data race (pid=14932)
  Write of size 8 at 0x7b0400000008 by thread T2:
    #0 free ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:707 (libtsan.so.0+0x35f25)
    #1 delete_config <null> (test+0x13f2)
    #2 cleanup_ptr <null> (test+0x1df3)

  Previous read of size 4 at 0x7b0400000008 by thread T1:
    #0 print_config <null> (test+0x142d)
    #1 reader_thread <null> (test+0x15ee)

  Thread T2 (tid=14935, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 main <null> (test+0x1858)

  Thread T1 (tid=14934, finished) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 main <null> (test+0x1801)

SUMMARY: ThreadSanitizer: data race (/home/ycwu4142/linux2022/quiz5/test+0x13f2) in delete_config
==================
updated config  : { 0x6b8b4567, 0x327b23c6, 0x643c9869 }
ThreadSanitizer: reported 2 warnings

為了簡化,僅使用 2 個 iter

大意是說,T1 (reader thread) 在 print_config (讀取並印出) 時跟 T2 (writer thread) 執行的 delete_config 會有 data race 的情形
仔細查看,發現在 writer_thread 中 swap(將 shared_config 的值更新)時,會執行 cleanup_ptr 進而引發 delete_config 將舊的指標釋放掉

void cleanup_ptr(domain_t *dom, uintptr_t ptr, int flags)
{
    if (!list_contains(&dom->pointers, ptr)) { /* deallocate straight away */
        dom->deallocator((void *) ptr);
    } else if (flags & DEFER_DEALLOC) { /* Defer deallocation for later */
        list_insert_or_append(&dom->retired, ptr);
    } else { /* Spin until all readers are done, then deallocate */
        while (list_contains(&dom->pointers, ptr))
            ;
        dom->deallocator((void *) ptr);
    }
}

將傳入的 flags 改成 DEFER_DEALLOC,試圖讓 cleanup_ptr 將指標 (ptr) 丟到 retired list 而不是直接 deallocate,但發現結果沒變。
更進一步查看發現原來是進到了第一個 case - hazard pointer 裡面沒有這個指標 (ptr)

若是 writer 在 reader 讀取的過程中試圖 swap ,理論上該指標必然會在 hazard pointer 的串列之中,因為 reader 使用了 load 來讀取,而 load 會執行 insert_or_append ,將指標放進 hazard pointer 中,直到呼叫 drop 來解除。

參考 kdnvt 同學與 bakudr18 同學的共筆中的分析方式

reader | writer --------------------+------------------------ | /* swap */ | val = atomic_exchange(ptr,new); | /* cleanup_ptr */ | if(!list_contain(val)){ /* load */ | list_insert(val); | | /* load on ptr */ | if(ptr == val){ | return val; | } | | free(val); | }

由於 reader 在第 9 行才將 val insert 到 hazard pointer 裡面,而 writer 在第 6 行檢查時就找不到 val,所以便直接釋放 val

另外從兩位同學的共筆中還發現第 14 行的 free 會將 reader 讀到的值 free 掉,巧合的是正好我的 cpu 就照這樣的順序跑

以上的結果是我將 hp 相關程式碼分離出來,編譯成 .o 檔 (ELF 格式),再連結的結果

code

但如果按直接塞在一個檔案中直接編譯則有機率跑出以下結果

read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x00000000, 0x00000000, 0x00000000 } updating config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 } read config : { 0x00000000, 0x00000000, 0x00000000 } read config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 } updated config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 } updating config : { 0x66334873, 0x74b0dc51, 0x19495cff } read config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 } read config : { 0x66334873, 0x74b0dc51, 0x19495cff } read config : { 0x66334873, 0x74b0dc51, 0x19495cff } updated config : { 0x66334873, 0x74b0dc51, 0x19495cff } updating config : { 0x2ae8944a, 0x625558ec, 0x238e1f29 } read config : { 0x66334873, 0x74b0dc51, 0x19495cff } updated config : { 0x2ae8944a, 0x625558ec, 0x238e1f29 } updating config : { 0x46e87ccd, 0x3d1b58ba, 0x507ed7ab } updated config : { 0x46e87ccd, 0x3d1b58ba, 0x507ed7ab } updating config : { 0x2eb141f2, 0x41b71efb, 0x79e2a9e3 } updated config : { 0x2eb141f2, 0x41b71efb, 0x79e2a9e3 } updating config : { 0x7545e146, 0x515f007c, 0x5bd062c2 } updated config : { 0x7545e146, 0x515f007c, 0x5bd062c2 } updating config : { 0x12200854, 0x4db127f8, 0x0216231b } updated config : { 0x12200854, 0x4db127f8, 0x0216231b } updating config : { 0x1f16e9e8, 0x1190cde7, 0x66ef438d } updated config : { 0x1f16e9e8, 0x1190cde7, 0x66ef438d } updating config : { 0x140e0f76, 0x3352255a, 0x109cf92e } updated config : { 0x140e0f76, 0x3352255a, 0x109cf92e } updating config : { 0x0ded7263, 0x7fdcc233, 0x1befd79f } updated config : { 0x0ded7263, 0x7fdcc233, 0x1befd79f }

其中 15 ~ 18 行很明顯就有 racing 的狀況,但沒有被 thread sanitizer 抓出來

multi-readers multi-writers

當有多個 writer 時也會發生 data race

Thread Sanitizer 訊息
read config     : { 0x00000000, 0x00000000, 0x00000000 }
read config     : { 0x00000000, 0x00000000, 0x00000000 }
updating config : { 0x6b8b4567, 0x327b23c6, 0x643c9869 }
updated config  : { 0x6b8b4567, 0x327b23c6, 0x643c9869 }
updating config : { 0x66334873, 0x74b0dc51, 0x19495cff }
==================
WARNING: ThreadSanitizer: data race (pid=2670)
  Write of size 8 at 0x7b0400000810 by thread T3:
    #0 free <null> (libtsan.so.0+0x35f45)
    #1 delete_config <null> (a.out+0x1d7c)
    #2 cleanup_ptr <null> (a.out+0x1ac3)
    #3 swap <null> (a.out+0x1b82)
    #4 writer_thread <null> (a.out+0x2089)
    #5 <null> <null> (libtsan.so.0+0x2d1af)

  Previous read of size 4 at 0x7b0400000810 by thread T2:
    #0 print_config <null> (a.out+0x1de2)
    #1 writer_thread <null> (a.out+0x209c)
    #2 <null> <null> (libtsan.so.0+0x2d1af)

  Thread T3 (tid=2674, running) created by main thread at:
    #0 pthread_create <null> (libtsan.so.0+0x5ea99)
    #1 main <null> (a.out+0x217d)

  Thread T2 (tid=2673, finished) created by main thread at:
    #0 pthread_create <null> (libtsan.so.0+0x5ea99)
    #1 main <null> (a.out+0x217d)

SUMMARY: ThreadSanitizer: data race (/lib/x86_64-linux-gnu/libtsan.so.0+0x35f45) in __interceptor_free
==================
==================
WARNING: ThreadSanitizer: data race (pid=2670)
  Write of size 8 at 0x7b0400000818 by thread T3:
    #0 free <null> (libtsan.so.0+0x35f45)
    #1 delete_config <null> (a.out+0x1d7c)
    #2 cleanup_ptr <null> (a.out+0x1ac3)
    #3 swap <null> (a.out+0x1b82)
    #4 writer_thread <null> (a.out+0x2089)
    #5 <null> <null> (libtsan.so.0+0x2d1af)

  Previous read of size 4 at 0x7b0400000818 by thread T2:
    #0 print_config <null> (a.out+0x1db7)
    #1 writer_thread <null> (a.out+0x209c)
    #2 <null> <null> (libtsan.so.0+0x2d1af)

  Thread T3 (tid=2674, running) created by main thread at:
    #0 pthread_create <null> (libtsan.so.0+0x5ea99)
    #1 main <null> (a.out+0x217d)

  Thread T2 (tid=2673, finished) created by main thread at:
    #0 pthread_create <null> (libtsan.so.0+0x5ea99)
    #1 main <null> (a.out+0x217d)

SUMMARY: ThreadSanitizer: data race (/lib/x86_64-linux-gnu/libtsan.so.0+0x35f45) in __interceptor_free
==================
updated config  : { 0x66334873, 0x74b0dc51, 0x19495cff }
ThreadSanitizer: reported 2 warnings

可以看出 writer 跟 writer 間有 UAF (use after free) 的問題
解決方法:在 writer print_config 之前強行 load 一次 shared_config,確保待會印出的內容沒有被其他人釋放

static void *writer_thread(void *arg)
{
    (void) arg;

    for (int i = 0; i < N_ITERS / 2; ++i) {
        config_t *new_config = create_config();
        new_config->v1 = rand();
        new_config->v2 = rand();
        new_config->v3 = rand();
        print_config("updating config", new_config);

        swap(config_dom, (uintptr_t *) &shared_config, (uintptr_t) new_config,
             0);

        config_t *safe_config =
            (config_t *) load(config_dom, (uintptr_t *) &shared_config);
        if (!safe_config)
            err(EXIT_FAILURE, "load");
        print_config("updated config ", safe_config);
        drop(config_dom, (uintptr_t) safe_config);
    }

    return NULL;
}

注意 load 完要 drop,不然會一直占用資源 (ptr 持續留在 hp 中)

比較 RCU 與 HP

RCU 適合用於 1 個 writer 多個 reader 的情形
利用 訂閱-發布機制 來確保所有 reader 讀取到的資源都沒有被釋放,雖然內容可能不是最新的 (因此只能用於 DNS resolution 之類即使存取舊的資料,也不會影響最終行為正確性的場景)。

訂閱-發布機制: writer 更改資料然後發布,此時會進入「等待讀取端執行緒離開 critical section」等待所有訂閱該資料的 reader 都離開這個 critical section 後才釋放舊內容的記憶體

RCU 與 Hazard pointer 最大的區別在於 HP 保護的是一個 pointer 或者說一個記憶體區塊,而 RCU 則是保護一整個 critical section

整體來說 RCU 適用於 read-intensive workloads 而 HP 適用於 update rates 高的 workload

參見 Proposed RCU C++ API

也就是說 write/read 比率高 => HP
write/read 比率低 => RCU

取自 對比其他 lock-free 同步機制

RCU HP
Unreclaimed objects Unbounded Bounded
Traversal forward progress wait-free lock-free
Reclamation forward progress blocking wait-free
Reference acquisition unconditional conditional
  • wait-free 是 non-blocking 之中強度最高的,它保證所有的指令可以 bound 在某個時間內完成,也就是所有的執行緒經過一段足夠的時間後,一定會有所進展
  • lock-free 稍微弱一點,它允許部份執行緒被暫停,但仍然保證整體程式的其他部份會有進展
    最差的情況下,就算只有一個執行緒在執行,其他所有執行緒都在等待,仍然屬於 lock-free

ABA 問題 中解決辦法的部分提到

Hazard pointers are lock-free, but can only track at most a fixed number of elements per thread as being in-use.

不確定跟表格中的 Unreclaimed objects 是不是同一件事

我認為能追蹤的數量有限是因為 retire list 需要 clean up,而執行 clean up 的時機,通常是 retire list 中的內容到達一定數量 (就好比垃圾桶滿了 => 此時需要傾倒)。因此需要設置上限
(定義何謂「滿」) ,而這個滿就限制了 HP 能夠追蹤的量