# 2022q1 Homework5 (quiz6) contributed by < `hankluo6` > > [第 6 週測驗題](https://hackmd.io/@sysprog/linux2022-quiz6) ## 測驗 `1` ### 程式原理 ```c /** * @brief Node in the TCB of the thread */ typedef struct __node { unsigned long int tid, tid_copy; void *ret_val; struct __node *next; funcargs_t *fa; } node_t; /** * @brief Singly-linked list of thread control blocks (TCB) */ typedef struct { node_t *head, *tail; } list_t; static list_t tid_list; ``` 執行緒的相關資料以 `tld_list` 連接,其中每個 node 會存有 `tld`, 執行緒的回傳值 (`ret_val`) 以及函式的資料 (`fa`)。 ```c /** * @brief Create a One One mapped thread * @param t Reference to the thread * @param routine Function associated with the thread * @param arg Arguments to the routine */ int thread_create(thread_t *t, void *routine, void *arg) { ... node_t *node = list_insert(&tid_list, 0); funcargs_t *fa = malloc(sizeof(funcargs_t)); fa->f = routine; fa->arg = arg; void *thread_stack = alloc_stack(STACK_SZ, GUARD_SZ); fa->stack = thread_stack; thread_t tid = clone(wrap, (char *) thread_stack + STACK_SZ + GUARD_SZ, CLONE_FLAGS, fa, &(node->tid), NULL, &(node->tld)); node->tid_copy = tid; node->fa = fa; *t = tid; return 0; } ``` 建立執行緒時,會新增一個新的節點放到 `tld_list` 中,節點的 `tld` 會被初始化為 0。透過 `clone` 建立新的執行緒,並將新建立的執行緒資料放到節點中。 其中 `CLONG_FLAGS` 為設置 child process 與 parent process 共用資源。 而 `CLONE_PARENT_SETTID` 表示將 child thread ID 設置到 `parent_tid` 參數上, `CLONE_CHILD_CLEARTID` 則表示將當 child process 終止時,`child_tid` 參數上的內容會被清空,且該位址上的 futex 會被喚醒。 >**CLONE_PARENT_SETTID** > >Store the child thread ID at the location pointed to by parent_tid (clone()) or cl_args.parent_tid (clone3()) in the parent’s memory. >**CLONE_CHILD_CLEARTID** > >Clear (zero) the child thread ID at the location pointed to by child_tid (clone()) or cl_args.child_tid (clone3()) in child memory when the child exits, and do a wakeup on the futex at that address. ```c /** * @brief Function to wait for a specific thread to terminate * @param t TID of the thread to wait for * @param guard Size of guard pag */ int thread_join(thread_t t, void **retval) { spin_acquire(&global_lock); void *addr = get_tid_addr(&tid_list, t); if (!addr) { spin_release(&global_lock); return ESRCH; } if (*((pid_t *) addr) == 0) { spin_release(&global_lock); return EINVAL; } int ret = 0; while (*((pid_t *) addr) == t) { spin_release(&global_lock); ret = syscall(SYS_futex, addr, FUTEX_WAIT, t, NULL, NULL, 0); spin_acquire(&global_lock); } syscall(SYS_futex, addr, FUTEX_WAKE, INT_MAX, NULL, NULL, 0); if (retval) *retval = get_node_from_tid(&tid_list, t)->ret_val; spin_release(&global_lock); return ret; } ``` [futex](https://man7.org/linux/man-pages/man2/futex.2.html) 可以用來檢查 user space 上的特定情況發生,當參數為 `FUTEX_WAIT` 時,如果 `uaddr` 與 `val` 相同,則會進入睡眠;而當參數為 `FUTEX_WAKE` 時,會喚醒在 `uaddr` 上最多 `val` 數量的 futex。 `thread_join` 必須等待執行緒執行完畢,當執行緒還在執行時,`addr` 的內容為執行緒的 `tld`,當執行緒結束時,根據上方 `clone` 的 `CLONE_CHILD_CLEARTID` flag,會將 `addr` 的內容清空。 故 `while` 迴圈與內部的 system call 會持續等待到執行緒結束才會跳出,並將回傳值回傳。 :::warning `FUTEX_WAKE` 看起來不需要,沒有其他執行緒使用同個空間 `addr` 在等待 wake up。 ::: ```c /** * @brief Function to make a thread terminate itself * @param ret return value of the thread to be available to thread_join() * @note Implicit call to thread_exit is made by each thread after completing * the execution of routine */ void thread_exit(void *ret) { spin_acquire(&global_lock); void *addr = get_tid_addr(&tid_list, gettid()); if (!addr) { spin_release(&global_lock); return; } if (ret) { node_t *node = get_node_from_tid(&tid_list, gettid()); node->ret_val = ret; } syscall(SYS_futex, addr, FUTEX_WAKE, INT_MAX, NULL, NULL, 0); spin_release(&global_lock); kill(SIGINT, gettid()); } ``` `thread_exit` 會在 `wrap` 函式的最後被呼叫,會設置回傳值並將在 `thread_join` 等待的 futex 喚醒,最後刪除自己本身的 process。 :::warning `FUTEX_WAKE` 這邊看起來也不需要,因為當 child process 結束時,也會喚醒 futex。 ::: --- ## 測驗 `2` ### 程式原理 #### hash table ```c void *hashmap_get(hashmap_t *map, const void *key) { /* hash to convert key to a bucket index where value would be stored */ uint32_t index = map->hash(key) % map->n_buckets; /* walk through the linked list nodes to find any matches */ for (hashmap_kv_t *n = map->buckets[index]; n; n = n->next) { if (map->cmp(n->key, key) == 0) return n->value; } return NULL; /* no matches found */ } ``` 與一般 hash table 一樣,找到對應的 bucket,再遍歷這個 bucket 上的 linked list 找到對應的 node。 ```c= bool hashmap_put(hashmap_t *map, const void *key, void *value) { if (!map) return NULL; /* hash to convert key to a bucket index where value would be stored */ uint32_t bucket_index = map->hash(key) % map->n_buckets; hashmap_kv_t *kv = NULL, *prev = NULL; /* known head and next entry to add to the list */ hashmap_kv_t *head = NULL, *next = NULL; while (true) { /* copy the head of the list before checking entries for equality */ head = map->buckets[bucket_index]; /* find any existing matches to this key */ prev = NULL; if (head) { for (kv = head; kv; kv = kv->next) { if (map->cmp(key, kv->key) == 0) break; prev = kv; } } if (kv) { /* if the key exists, update and return it */ if (!next) /* lazy make the next key-value pair to append */ next = map->create_node(map->opaque, key, value); /* ensure the linked list's existing node chain persists */ next->next = kv->next; /* CAS-update the reference in the previous node */ if (prev) { /* replace this link, assuming it has not changed by another * thread */ if (__atomic_compare_exchange(&prev->next, &kv, &next, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) { /* this node, key and value are never again used by this */ map->destroy_node(map->opaque, kv); return true; } hashmap_put_replace_fail += 1; } else { /* no previous link, update the head of the list */ /* set the head of the list to be whatever this node points to * (NULL or other links) */ if (__atomic_compare_exchange(&head->next, &kv, &next, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) { map->destroy_node(map->opaque, kv); return true; } /* failure means at least one new entry was added, retry the * whole match/del process */ hashmap_put_head_fail += 1; } } else { /* if the key does not exist, try adding it */ if (!next) /* make the next key-value pair to append */ next = map->create_node(map->opaque, key, value); next->next = NULL; if (head) /* make sure the reference to existing nodes is kept */ next->next = head; /* prepend the kv-pair or lazy-make the bucket */ if (__atomic_compare_exchange(&map->buckets[bucket_index], &head, &next, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) { __atomic_fetch_add(&map->length, 1, __ATOMIC_SEQ_CST); return false; } /* failure means another thead updated head before this. * track the CAS failure for tests -- non-atomic to minimize * thread contention */ hashmap_put_retries += 1; } } } ``` `hashmap_put` 首先尋找 key 使否已經在 hash table 當中,`next` 為新建立的節點,如果新增失敗,則此次迴圈建立的 `next` 可以在之後的迴圈中使用,不用再重新建立。可以分成各個部分討論: * key 已經在 hash table 當中 (Line 28 ~ Line 64) `kv` 為要被取代的節點。接著依照 `kv` 的位置分成兩種情況: * key 不在第一個節點 (Line 36 ~ Line 48) 則我們要更新 `prev->next` 將 `next` 連接上去,故更新程式為: * `__atomic_compare_exchange(&prev->next, &kv, &next, false,__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)` * key 在第一個節點 (Line 48 ~ Line 63) 此情況表示 `head[bucket_idx] = kv`,我們要更新 `head->next` 將 `next` 連接上去,故更新程式為: * `__atomic_compare_exchange(&head->next, &kv, &next, false,__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)` 而被取代的 `kv` 空間必須被釋放,故透過 `map->destroy_node` 呼叫釋放函式。 * key 不在 hash table 當中 (Line 64 ~ Line 85) 直接將 `next` 連接到當成新的 `head`,故將 `next->next` 設定為 `head`,並更新: * `__atomic_compare_exchange(&map->buckets[bucket_index], &kv, &next, false,__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)` ```c bool hashmap_del(hashmap_t *map, const void *key) { if (!map) return false; uint32_t bucket_index = map->hash(key) % map->n_buckets; /* try to find a match, loop in case a delete attempt fails */ while (true) { hashmap_kv_t *match, *prev = NULL; for (match = map->buckets[bucket_index]; match; match = match->next) { if ((*map->cmp)(key, match->key) == 0) break; prev = match; } /* exit if no match was found */ if (!match) return false; /* previous means this not the head but a link in the list */ if (prev) { /* try the delete but fail if another thread did delete */ if (__atomic_compare_exchange(&prev->next, &match, &match->next, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) { __atomic_fetch_sub(&map->length, 1, __ATOMIC_SEQ_CST); map->destroy_node(map->opaque, match); return true; } hashmap_del_fail += 1; } else { /* no previous link means this needs to leave empty bucket */ /* copy the next link in the list (may be NULL) to the head */ if (__atomic_compare_exchange(&map->buckets[bucket_index], &match, &match->next, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) { __atomic_fetch_sub(&map->length, 1, __ATOMIC_SEQ_CST); map->destroy_node(map->opaque, match); return true; } /* failure means whole match/del process needs another attempt */ hashmap_del_fail_new_head += 1; } } return false; } ``` `hashmap_del` 與 `hashmap_put` 類似,根據要被移除的節點位置,透過 `__atomic_compare_exchange` 將 linked list 更新。 ### free list free list 用來存放要被釋放的空間,每個要釋放的空間先以 `free_later_t` 包裝,其存放該空間以及要釋放該空間的函式,再以 `list_node_t` 包裝以便可以放置在 free list `buffer` 當中。 ```graphviz digraph linked_list{ rankdir = LR; node [shape = record]; node2 [label = "<f0> val | <f1> next"]; node1 [label = "<f0> var | <f1> free"]; node0 [label = "address" height=0.4]; node2:f0->node1 node1:f0->node0 "list_node_t"[height=0.4,color=white]; "free_later_t"[height=0.4,color=white]; "key or value or node \n memory"[height=0.4,color=white]; "list_node_t"->"free_later_t"->"key or value or node \n memory" [color=white]; } ``` 這樣實作可以對每種型態決定其釋放空間的函式,如 hash 的 key 值存在 stack 中,使用 `opaque` 防止釋放。 :::warning 但此程式好像沒有正確釋放所有空間,`free_later_run` 並不會進到迴圈中。 > 後續實作: [hashmap](https://github.com/sysprog21/concurrent-programs/tree/master/hashmap),請將你的改進以 pull request 提交。 > :notes: jserv ::: ### 實作缺失 * 程式有時後會卡住 `test_del` 會呼叫 `mt_del_vals`,其中 `N_THREADS` 呼叫 `add_val` 增加相同資料到 `map` 當中,另外 `N_THREADS` 呼叫 `del_val` 將資料移除,有第三群 `N_THREADS` 會呼叫 `mt_add_vals` 插入不同的資料到 `map` 當中。 當資料從 `map` 移除時 atomic operation 沒有失敗,`hashmap_del_fail` 或 `hashmap_del_fail_new_head` 為 0 時,會再重新重複上述步驟。 導致程式執行時,會在 `while` 迴圈中重複操作,此情況在搭配 Debugger 或 Valgrind 等工具時更明顯。 :::warning 不確定 `hashmap_del_fail` 的 `hashmap_del_fail_new_head` 用途。 ::: * Memory Leak 程式有多處分配的記憶體沒有釋放。[Commit](https://github.com/hankluo6/concurrent-programs/commit/f8c15f28fc454fa90577b5905abd98011a7cc204) * `hashmap.c` * hashmap 在建立後沒有釋放空間,新增 `hashmap_free` 將 hashmap 釋放,並把剩下的節點放到 free list 中。 ```c void hashmap_free(hashmap_t *map) { for (int bucket = 0; bucket < map->n_buckets; ++bucket) { for (hashmap_kv_t *kv = map->buckets[bucket]; kv; kv = kv->next) { map->destroy_node(map->opaque, kv); } } free(map->buckets); free(map); } ``` * `free_later.c` * 原本 `free_later_run` 在釋放時會讀取到已經釋放的節點,且當 free function 為 `NULL` 時,`v->free` 會觸發 segmentation fault,改為釋放前一個節點。 ```diff - for (list_node_t *n = buffer_prev->head; n; n = n->next) { + for (list_node_t *n = buffer_prev->head, *prev = NULL; n;) { + prev = n; + n = n->next; - free_later_t *v = n->val; + free_later_t *v = prev->val; + if (v->free) free_later_t *v = n->val; + v->free(v->var); - free(n); + free(prev); } ``` * `free_later_stage` 會將 `buffer` 的內容給 `buffer_prev`,並新增一個新的 `buffer`,在之後的 `free_later_run` 便能將 `buffer_prev` 的內容釋放,但原本程式的 `free_later_stage` 並不會將內容賦值給 `buffer_prev`。 ```diff void free_later_stage(void) { ... - if (!buffer_prev || buffer_prev->length == 0) { + if (buffer_prev) { release_lock(&lock); return; } ... } ``` ###### tags: `linux2022`