2022q1 Homework5 (quiz6)

# 2022q1 Homework5 (quiz6) > contributed by < oucs638 > > 測驗題: [quiz6](https://hackmd.io/@sysprog/linux2022-quiz6) > Github: [oucs638/linux_2022spring_quiz6](https://github.com/oucs638/linux_2022spring_quiz6) ## 測驗 2 : Lock-free Hash Table - 此題測驗要補完 `hashmap.c` 中的程式碼，使程式執行符合預期 - 首先看 hash table 在 `hashmap.h` 中的資料結構定義 ```c /* links in the linked lists that each bucket uses */ typedef struct hashmap_keyval { struct hashmap_keyval *next; const void *key; void *value; } hashmap_kv_t; /* main hashmap struct with buckets of linked lists */ typedef struct { hashmap_kv_t **buckets; uint32_t n_buckets; uint32_t length; /* total count of entries */ /* pointer to the hash and comparison functions */ uint64_t (*hash)(const void *key); uint8_t (*cmp)(const void *x, const void *y); /* custom memory management of internal linked lists */ void *opaque; hashmap_kv_t *(*create_node)(void *opaque, const void *key, void *data); void (*destroy_node)(void *opaque, hashmap_kv_t *node); } hashmap_t; ``` - 每個 `hashmap_t` 會紀錄 `n_buckets` 個 single linked list 的首個節點 - `buckets[i]` 可以取用第 `i` 個 single linked list 的首個節點 - 每個 single linked list 由 `hashmap_kv_t` 組成，`->next` 指向下一個節點 ### KKKK, QQQQ - 這兩個程式碼片段在 `hashmap_put` 函式中 - 當 key 值存在，則要檢查該 key 值的 linked list 是否為 head - 如是，則新的 key 值接在前一個節點之後，故 `KKKK = &prev->next` - 如否，則接到 head 之後，故 `QQQQ = &map->buckets[bucket_index]` #### ZZZZ - 這個程式碼片段在 `hashmap_del` 函式中 - 當要刪除的 key 值對應的 linked list 存有節點，則要判斷目標節點是否為首個節點 - 如否，則直接修改前一個節點的 `next`，故 `ZZZZ = &prev->next` ### 延伸問題 1 - 解釋上述程式碼運作原理，並指出其設計和實作的缺失 (如原本只接受 uin32_t 型態的輸入，無法處理字串作為 key) ```c= bool hashmap_put(hashmap_t *map, const void *key, void *value) { if (!map) return NULL; /* hash to convert key to a bucket index where value would be stored */ uint32_t bucket_index = map->hash(key) % map->n_buckets; hashmap_kv_t *kv = NULL, *prev = NULL; /* known head and next entry to add to the list */ hashmap_kv_t *head = NULL, *next = NULL; while (true) { /* copy the head of the list before checking entries for equality */ head = map->buckets[bucket_index]; /* find any existing matches to this key */ prev = NULL; if (head) { for (kv = head; kv; kv = kv->next) { if (map->cmp(key, kv->key) == 0) break; prev = kv; } } if (kv) { /* if the key exists, update and return it */ if (!next) /* lazy make the next key-value pair to append */ next = map->create_node(map->opaque, key, value); /* ensure the linked list's existing node chain persists */ next->next = kv->next; /* CAS-update the reference in the previous node */ if (prev) { /* replace this link, assuming it has not changed by another * thread */ if (__atomic_compare_exchange(&prev->next, &kv, &next, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) { /* this node, key and value are never again used by this */ map->destroy_node(map->opaque, kv); return true; } hashmap_put_replace_fail += 1; } else { /* no previous link, update the head of the list */ /* set the head of the list to be whatever this node points to * (NULL or other links) */ if (__atomic_compare_exchange(&map->buckets[bucket_index], &kv, &next, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) { map->destroy_node(map->opaque, kv); return true; } /* failure means at least one new entry was added, retry the * whole match/del process */ hashmap_put_head_fail += 1; } } else { /* if the key does not exist, try adding it */ if (!next) /* make the next key-value pair to append */ next = map->create_node(map->opaque, key, value); next->next = NULL; if (head) /* make sure the reference to existing nodes is kept */ next->next = head; /* prepend the kv-pair or lazy-make the bucket */ if (__atomic_compare_exchange(&map->buckets[bucket_index], &head, &next, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) { __atomic_fetch_add(&map->length, 1, __ATOMIC_SEQ_CST); return false; } /* failure means another thead updated head before this. * track the CAS failure for tests -- non-atomic to minimize * thread contention */ hashmap_put_retries += 1; } } } ``` - `hashmap_put` 中會先判斷 key value 是否已存在 hash table 中 - 如 key value 已存在 hash table 中，則需要判斷 key 是否存在 head (line 28 - 64) - 如否，則將 `prev->next` 指向 `next`，將新的節點連接上去 (line 36 - 47) - 如是，則將新的節點接在 head 之後 (line 48 - 63) - 如 key value 不存在 hash table 中，則將新的節點作為 head (line 64 - 85) ```c= bool hashmap_del(hashmap_t *map, const void *key) { if (!map) return false; uint32_t bucket_index = map->hash(key) % map->n_buckets; /* try to find a match, loop in case a delete attempt fails */ while (true) { hashmap_kv_t *match, *prev = NULL; for (match = map->buckets[bucket_index]; match; match = match->next) { if ((*map->cmp)(key, match->key) == 0) break; prev = match; } /* exit if no match was found */ if (!match) return false; /* previous means this not the head but a link in the list */ if (prev) { /* try the delete but fail if another thread did delete */ if (__atomic_compare_exchange(&prev->next, &match, &match->next, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) { __atomic_fetch_sub(&map->length, 1, __ATOMIC_SEQ_CST); map->destroy_node(map->opaque, match); return true; } hashmap_del_fail += 1; } else { /* no previous link means this needs to leave empty bucket */ /* copy the next link in the list (may be NULL) to the head */ if (__atomic_compare_exchange(&map->buckets[bucket_index], &match, &match->next, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) { __atomic_fetch_sub(&map->length, 1, __ATOMIC_SEQ_CST); map->destroy_node(map->opaque, match); return true; } /* failure means whole match/del process needs another attempt */ hashmap_del_fail_new_head += 1; } } return false; } ``` - `hashmap_del` 中會先判斷 key value 是否已存在 hash table 中 - 如否，則回傳 `false` (line 18 - 19) - 如是，則判斷目標節點是否為 `head` (line 22 - 43) - 如否，則將前個節點的 `next` 指向下個節點 (line 22 - 30) - 如是，則將下個節點設為 `head` (line 31 - 43) ### 延伸問題 2 - 研讀 [lfht](https://github.com/ksandstr/lfht) (Lock-free wait-free hash table using C11 atomics) 程式碼及對應的論文 [Lock-Free Linked Lists and Skip Lists](http://www.cse.yorku.ca/~ruppert/papers/lfll.pdf) #### Lock-Free Linked Lists and Skip Lists - 這篇論文討論如何實現 lock-free 的 distributed data structure - 論文使用 singly-linked list 來實現，並引入 skip list 的結構 - 論文實現的演算法使用 single-word compare & swap synchronization primitive - 為了實現 lock-free，使個別 process 的 delays 或 failures 不會阻礙其他 process 的執行，論文的實作中使用了 backlinks 和 flag bits 以及 mark 被刪除節點的機制 - backlink 指向被刪除的節點的前一個節點 - 當 flag bit 設為 1，表示下一個節點的刪除正在進行 - backlink 的設計使得被刪除的節點可以復原 - 論文實現的演算法主要有八個部分 - `Search (Key k): Node` - 如有目標 key 值的節點，則回傳該節點 - 否則，該 key 不存在 - `SearchFrom (Key k, Node *curr_mode): (Node, Node)` - 回傳兩個相鄰的節點 `n1`, `n2`，符合 `n1.key <= k < n2.key` - 在拜訪節點的同時，會檢查下個節點是否被 mark，並呼叫 `HelpMarked` 以刪除被 mark 的節點 - `HelpMarked (Node *prev_node, Node *del_node)` - 嘗試刪除被 mark 的節點 `del_node`，並將前一個節點 `prev_node` 的 flag 設回 0 - 使用 atomic compare and swap 指令進行操作，以確保同時沒有其他 process 對目標節點做更動 - `Delete (Key k): Node` - 嘗試刪除目標 key 值的節點，使用 three-step deletion - 首先，呼叫 `SearchFrom` 搜尋目標 key 值節點，如果該節點存在，則呼叫 `TryFlag` 嘗試將前一個節點的 flag 設為 1 - 如果 `TryFlag` 回傳結果不為 `null`，則呼叫 `HelpFlagged` 進行第二步和第三步 - `HelpFlagged (Node *prev_node, Node *del_node)` - 刪除節點的第二步，將 `del_node` 的 backlink 指向前一個節點，並呼叫 `TryMark` mark `del_node` - 刪除節點的第三步，呼叫 `HelpMarked` 嘗試刪除目標節點 - `TryMark (Node del_node)` - 嘗試 mark 要刪除的節點 `del_node` - 使用 atomic compare ad swap 指令進行操作，以確保同時沒有其他 process 對目標節點做更動 - `TryFlag (Node *prev_node, Node *target_node): (Node, Boolean)` - 嘗試將要刪除的目標節點的前一個節點 flag 設為 1，並回傳 `prev_node` - 使用 atomic compare ad swap 指令進行操作，以確保同時沒有其他 process 對目標節點做更動 - 如果目標節點已刪除，則回傳 `null` - `Insert (Key k, Element e): Node` - 嘗試插入給定 key 值的新節點 - 呼叫 `SearchForm` 搜尋 key 值插入的位置，並確認 key 值不重複 - 如果要插入的位置前一個節點 flag 為 1，代表後續的節點刪除還未完成，呼叫 `HelpFlagged` 完成刪除動作 - 如果要插入的位置前一個節點被 mark，表示該節點要被刪除，用 backlinks 往前找到沒有 mark 的節點重新設為 `prev_node` - 使用 atomic compare ad swap 指令進行操作，以確保同時沒有其他 process 對目標節點做更動 ### 延伸問題 3 - 參照 [atomic_hash](https://github.com/divfor/atomic_hash) 及其內建的測試程式，用更大規模的來測試