2022q1 Homework5 (quiz6)

contributed by < linjohnss >

2022q1 第 6 週測驗題

測驗 `1`

利用 clone 實做 user level 的 thread 函式庫

程式碼實做原理

函數使用

int main()
{
    mutex_init(&lock);
    mutex_init(&rwlock);
    spin_init(&print_lock);

    atomic_init(&n_readers_in, 0);
    atomic_init(&n_writers_in, 0);

    thread_t readers[5], writers[5];
    for (int i = 0; i < 5; i++) {
        thread_create(&readers[i], f1, NULL);
        thread_create(&writers[i], f2, NULL);
    }

    for (int i = 0; i < 5; i++) {
        thread_join(writers[i], NULL);
        thread_join(readers[i], NULL);
    }

    return 0;
}

首先初始化 mutex, spin, atomic
接著 create 10 個 thread（5 個 reader, 5 個 writer）
最後分別 join 這 10 個 thread

Reader process in
Reader process in
Reader process out
Reader process out
Writer process in
Writers process out
Writer process in
Writers process out
Reader process in
Reader process out
Reader process in
Reader process out
Writer process in
Writers process out
Writer process in
Writers process out
Writer process in
Writers process out
Reader process in
Reader process out

會有 10 個 reader 的 in, out 與 10 個 writer 的 in, out

測驗 `2`

lock-free

在並行程式設計中，存取共同記憶體時，使用 mutex 是最直觀的方式，然而 mutex 容易造成 thread 互相阻塞的機會。為了提高程式的運行效能，可以改用 lock-free 結構來處理。

何種情境要考慮 ABA Problem

在執行 CAS 時，若 thread1 在讀取的間隔中，有其他 thread2 修改該值，使得 thread1 誤以為值並未改變，造成非預期的結果。

stack:top → A → B → C
thread1 run pop and gets interrupted before the compare_exchange_weak
thread2 pop twice and push A back
stack:top → A → C
thread1 will error assigning B as new head node

lock-free hash table 如何避免 ABA Problem?

Hazard pointer

根據2022q1 第 5 週測驗題描述

Hazard pointer 是其中一種 lock-free 處理記憶體釋放的解決方案，其原理是讀取端執行緒對指標進行識別，指標 (特別是指向的記憶體區塊) 若要釋放時，會事先保存，延遲到確認沒有讀取端執行緒，才進行真正的釋放。

在 hazard pointer 架構中，每個 thread 首先需要各自維護的兩個關鍵集合是：

hazard pointer: 儲存此 thread 正在存取的指標，因此該指標不能直接被釋放
retire list: 被這個 thread 要求釋放的指標，但實際尚未釋放

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

因此要安全的釋放記憶體，其基本想法就是：

每個執行緒放在 hazard pointer 中的指標尚不能被釋放
每個執行緒要求釋放的指標先放在 retire list 中
掃描 retire list 可以得知所有執行緒皆不使用的指標，則可將其真正的釋放給作業系統

程式碼的作法

在 destroy_node_later 中，分別用 free_later 釋放 node->key、node->value 與 node，不會直接釋放記憶體而是根據不同型態決定其釋放空間的方式







static void destroy_node_later(void *opaque, hashmap_kv_t *node)
{
    /* free all of these later in case other threads are using them */
    free_later((void *) node->key, free);
    free_later(node->value, opaque);
    free_later(node, free);
}

程式碼運作原理

hashmap 結構

首先，查看 hashmap 的資料結構，其定義在 hashmap.h：

/* links in the linked lists that each bucket uses */
typedef struct hashmap_keyval {
    struct hashmap_keyval *next;
    const void *key;
    void *value;
} hashmap_kv_t;

/* main hashmap struct with buckets of linked lists */
typedef struct {
    hashmap_kv_t **buckets;
    uint32_t n_buckets;

    uint32_t length; /* total count of entries */

    /* pointer to the hash and comparison functions */
    uint64_t (*hash)(const void *key);
    uint8_t (*cmp)(const void *x, const void *y);

    /* custom memory management of internal linked lists */
    void *opaque;
    hashmap_kv_t *(*create_node)(void *opaque, const void *key, void *data);
    void (*destroy_node)(void *opaque, hashmap_kv_t *node);
} hashmap_t;

為一個由n_buckets 個 hashmap_kv_t 單相串接而成的結構
另外包含兩個 function pointer hash 與 cmp
利用 hashmap_new、hashmap_get、hashmap_put、hashmap_del 對 hashmap 進行操作

hashmap_put

hashmap_put 負責將 value 根據 key 放入 map 中

/* Put the given key-value pair in the map.
 * @return true if an existing matching key was replaced.
 */
bool hashmap_put(hashmap_t *map, const void *key, void *value);

先根據 key 找到對應的 bucket_index


uint32_t bucket_index = map->hash(key) % map->n_buckets;

接著在 while 迴圈中，負找到對應的 bucket_index 中 key 是否已存在
prev 紀錄前一個節點









prev = NULL;
if (head) {
    for (kv = head; kv; kv = kv->next) {
        if (map->cmp(key, kv->key) == 0)
            break;
        prev = kv;
    }
}

若 kv 存在，更新該節點的 value
分為兩種狀況：

存在前一個節點 prev：
需要更新 prev->next，將 next 接上去，故 KKKK = &prev->next
然後對 kv 執行 destroy_node







if (__atomic_compare_exchange(KKKK, &kv, &next, false,
                              __ATOMIC_SEQ_CST,
                              __ATOMIC_SEQ_CST)) {
    /* this node, key and value are never again used by this */
    map->destroy_node(map->opaque, kv);
    return true;
}

不存在前一個節點 prev，即為 head：
需要更新 head->next，將 next 接上去，故 QQQQ = &map->buckets[bucket_index]
然後對 kv 執行 destroy_node






if (__atomic_compare_exchange(QQQQ, &kv,
                              &next, false, __ATOMIC_SEQ_CST,
                              __ATOMIC_SEQ_CST)) {
    map->destroy_node(map->opaque, kv);
    return true;
}

若 kv 不存在，代表未發現 key，需要建立節點並當成新的 head
需要更新 map->buckets[bucket_index]，將 next 接上去
以 __atomic_fetch_add 的方式將 map->length 加 1






if (__atomic_compare_exchange(&map->buckets[bucket_index], &head,
                              &next, false, __ATOMIC_SEQ_CST,
                              __ATOMIC_SEQ_CST)) {
    __atomic_fetch_add(&map->length, 1, __ATOMIC_SEQ_CST);
    return false;
}

hashmap_del

刪除 key 對應的節點，並保證在 multiple threads 同時刪除同一個 key 時，只會刪除一次

/* Remove the given key-value pair in the map.
 * @return true if a key was found.
 * This operation is guaranteed to return true just once, if multiple threads
 * are attempting to delete the same key.
 */
bool hashmap_del(hashmap_t *map, const void *key);

同 hashmap_put 要先找到對應的 bucket_index
判斷是否找到目標 match
判斷 prev 是否存在

prev 存在：
需要更新 prev->next，將 match->next 接上去，故 ZZZZ = &prev->next
以 __atomic_fetch_sub 的方式將 map->length 減 1
然後對 kv 執行 destroy_node







if (__atomic_compare_exchange(ZZZZ, &match, &match->next,
                              false, __ATOMIC_SEQ_CST,
                              __ATOMIC_SEQ_CST)) {
    __atomic_fetch_sub(&map->length, 1, __ATOMIC_SEQ_CST);
    map->destroy_node(map->opaque, match);
    return true;
}

prev 不存在，match 即為 head：







if (__atomic_compare_exchange(&map->buckets[bucket_index], &match,
                              &match->next, false, __ATOMIC_SEQ_CST,
                              __ATOMIC_SEQ_CST)) {
    __atomic_fetch_sub(&map->length, 1, __ATOMIC_SEQ_CST);
    map->destroy_node(map->opaque, match);
    return true;
}

hashmap_put 考慮因素

最佳的情況下，時間複雜度是 O(1)

hashmap 在沒有發生 collision 時的時間複雜度是 O(1)
本次所使用的 hash function 是直接 mod

static uint64_t hash_uint32(const void *key)
{
    return *(uint32_t *) key;
}

hash function 處理 collision 的作法

Separate chaining
發生 collision 將相同 index 的 node 串接在一起
選擇 key mod 7 作為 hash function，輸入 50, 700, 76, 85, 92, 73, 101

Open addressing
發生 collision 則往下接續尋找空的欄位，並插入空的欄位
選擇 key mod 7 作為 hash function，輸入 50, 700, 76, 85, 92, 73, 101

因為使用到 link list，Separate chaining 對於未知數量的 inputs 不用擔心 hash map 滿了，但 Separate chaining 相較於 Open addressing 會有高的效能損失。Open addressing 仰賴精準的 hash map 大小估計，進而避免越是後面的輸入其探查數量就急遽增高，因為碰到位子先有人佔的機率會愈來愈高。

目前的 hashmap 採用何種手段？

目前 hashmap 採用 Separate chaining
發生 collision 時，也就是 hashmap_put 中 head 存在，程式會接著找尋 head 後續串接的 node，代表每個 bucket 是由一條 link list 組成







if (head) {
    for (kv = head; kv; kv = kv->next) {
        if (map->cmp(key, kv->key) == 0)
            break;
        prev = kv;
    }
}

並且當 kv 不存在時，會先將 next->next 指向 head，代表當發生 collision 時，會將新的節點插入至 link list 的開頭


if (head) /* make sure the reference to existing nodes is kept */
    next->next = head;

hashmap_put_retries 的作用?

hashmap_put 有三個在執行 __atomic_compare_exchange 負責紀錄失敗次數的指標

hashmap_put_replace_fail
hashmap_put_head_fail
hashmap_put_retries

free later

free later 是先將要釋放記憶體的資料加到一個 link list buffer，後續呼叫 free_later_exit 時才會將 buffer 中的資料釋放
list_t 的結構包含 head 與紀錄長度的 length，每一個 list_node_t 包含串接下一個 node 的指標，與紀錄資料的 val

typedef struct list_node {
    struct list_node *next;
    void *val;
} list_node_t;

typedef struct {
    list_node_t *head;
    uint32_t length;
} list_t;

free_later_t 為 link list val 存放的資料結構，包含欲釋放的變數 var 與 function pointer free 可以用於決定變數釋放的時機

typedef struct {
    void *var;
    void (*free)(void *var);
} free_later_t;

當我們呼叫 free_later_exit 就會釋放 buffer 中的記憶體

int free_later_exit()
{
    /* purge anything that is buffered */
    free_later_run();

    /* stage and purge anything that was unbuffered */
    free_later_stage();
    free_later_run();

    /* release memory for the buffer */
    free(buffer);
    buffer = NULL;
    return 0;
}

程式碼改進

`free_later_run` 會試圖存取已經被釋放的空間





for (list_node_t *n = buffer_prev->head; n; n = n->next) {
    free_later_t *v = n->val;
    v->free(v->var);
    free(n);
}