# 2022q1 Homework5 (quiz6)
contributed by < `hankluo6` >
> [第 6 週測驗題](https://hackmd.io/@sysprog/linux2022-quiz6)
## 測驗 `1`
### 程式原理
```c
/**
* @brief Node in the TCB of the thread
*/
typedef struct __node {
unsigned long int tid, tid_copy;
void *ret_val;
struct __node *next;
funcargs_t *fa;
} node_t;
/**
* @brief Singly-linked list of thread control blocks (TCB)
*/
typedef struct {
node_t *head, *tail;
} list_t;
static list_t tid_list;
```
執行緒的相關資料以 `tld_list` 連接,其中每個 node 會存有 `tld`, 執行緒的回傳值 (`ret_val`) 以及函式的資料 (`fa`)。
```c
/**
* @brief Create a One One mapped thread
* @param t Reference to the thread
* @param routine Function associated with the thread
* @param arg Arguments to the routine
*/
int thread_create(thread_t *t, void *routine, void *arg)
{
...
node_t *node = list_insert(&tid_list, 0);
funcargs_t *fa = malloc(sizeof(funcargs_t));
fa->f = routine;
fa->arg = arg;
void *thread_stack = alloc_stack(STACK_SZ, GUARD_SZ);
fa->stack = thread_stack;
thread_t tid = clone(wrap, (char *) thread_stack + STACK_SZ + GUARD_SZ,
CLONE_FLAGS, fa, &(node->tid), NULL, &(node->tld));
node->tid_copy = tid;
node->fa = fa;
*t = tid;
return 0;
}
```
建立執行緒時,會新增一個新的節點放到 `tld_list` 中,節點的 `tld` 會被初始化為 0。透過 `clone` 建立新的執行緒,並將新建立的執行緒資料放到節點中。
其中 `CLONG_FLAGS` 為設置 child process 與 parent process 共用資源。
而 `CLONE_PARENT_SETTID` 表示將 child thread ID 設置到 `parent_tid` 參數上, `CLONE_CHILD_CLEARTID` 則表示將當 child process 終止時,`child_tid` 參數上的內容會被清空,且該位址上的 futex 會被喚醒。
>**CLONE_PARENT_SETTID**
>
>Store the child thread ID at the location pointed to by parent_tid (clone()) or cl_args.parent_tid (clone3()) in the parent’s memory.
>**CLONE_CHILD_CLEARTID**
>
>Clear (zero) the child thread ID at the location pointed to by child_tid (clone()) or cl_args.child_tid (clone3()) in child memory when the child exits, and do a wakeup on the futex at that address.
```c
/**
* @brief Function to wait for a specific thread to terminate
* @param t TID of the thread to wait for
* @param guard Size of guard pag
*/
int thread_join(thread_t t, void **retval)
{
spin_acquire(&global_lock);
void *addr = get_tid_addr(&tid_list, t);
if (!addr) {
spin_release(&global_lock);
return ESRCH;
}
if (*((pid_t *) addr) == 0) {
spin_release(&global_lock);
return EINVAL;
}
int ret = 0;
while (*((pid_t *) addr) == t) {
spin_release(&global_lock);
ret = syscall(SYS_futex, addr, FUTEX_WAIT, t, NULL, NULL, 0);
spin_acquire(&global_lock);
}
syscall(SYS_futex, addr, FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
if (retval)
*retval = get_node_from_tid(&tid_list, t)->ret_val;
spin_release(&global_lock);
return ret;
}
```
[futex](https://man7.org/linux/man-pages/man2/futex.2.html) 可以用來檢查 user space 上的特定情況發生,當參數為 `FUTEX_WAIT` 時,如果 `uaddr` 與 `val` 相同,則會進入睡眠;而當參數為 `FUTEX_WAKE` 時,會喚醒在 `uaddr` 上最多 `val` 數量的 futex。
`thread_join` 必須等待執行緒執行完畢,當執行緒還在執行時,`addr` 的內容為執行緒的 `tld`,當執行緒結束時,根據上方 `clone` 的 `CLONE_CHILD_CLEARTID` flag,會將 `addr` 的內容清空。
故 `while` 迴圈與內部的 system call 會持續等待到執行緒結束才會跳出,並將回傳值回傳。
:::warning
`FUTEX_WAKE` 看起來不需要,沒有其他執行緒使用同個空間 `addr` 在等待 wake up。
:::
```c
/**
* @brief Function to make a thread terminate itself
* @param ret return value of the thread to be available to thread_join()
* @note Implicit call to thread_exit is made by each thread after completing
* the execution of routine
*/
void thread_exit(void *ret)
{
spin_acquire(&global_lock);
void *addr = get_tid_addr(&tid_list, gettid());
if (!addr) {
spin_release(&global_lock);
return;
}
if (ret) {
node_t *node = get_node_from_tid(&tid_list, gettid());
node->ret_val = ret;
}
syscall(SYS_futex, addr, FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
spin_release(&global_lock);
kill(SIGINT, gettid());
}
```
`thread_exit` 會在 `wrap` 函式的最後被呼叫,會設置回傳值並將在 `thread_join` 等待的 futex 喚醒,最後刪除自己本身的 process。
:::warning
`FUTEX_WAKE` 這邊看起來也不需要,因為當 child process 結束時,也會喚醒 futex。
:::
---
## 測驗 `2`
### 程式原理
#### hash table
```c
void *hashmap_get(hashmap_t *map, const void *key)
{
/* hash to convert key to a bucket index where value would be stored */
uint32_t index = map->hash(key) % map->n_buckets;
/* walk through the linked list nodes to find any matches */
for (hashmap_kv_t *n = map->buckets[index]; n; n = n->next) {
if (map->cmp(n->key, key) == 0)
return n->value;
}
return NULL; /* no matches found */
}
```
與一般 hash table 一樣,找到對應的 bucket,再遍歷這個 bucket 上的 linked list 找到對應的 node。
```c=
bool hashmap_put(hashmap_t *map, const void *key, void *value)
{
if (!map)
return NULL;
/* hash to convert key to a bucket index where value would be stored */
uint32_t bucket_index = map->hash(key) % map->n_buckets;
hashmap_kv_t *kv = NULL, *prev = NULL;
/* known head and next entry to add to the list */
hashmap_kv_t *head = NULL, *next = NULL;
while (true) {
/* copy the head of the list before checking entries for equality */
head = map->buckets[bucket_index];
/* find any existing matches to this key */
prev = NULL;
if (head) {
for (kv = head; kv; kv = kv->next) {
if (map->cmp(key, kv->key) == 0)
break;
prev = kv;
}
}
if (kv) { /* if the key exists, update and return it */
if (!next) /* lazy make the next key-value pair to append */
next = map->create_node(map->opaque, key, value);
/* ensure the linked list's existing node chain persists */
next->next = kv->next;
/* CAS-update the reference in the previous node */
if (prev) {
/* replace this link, assuming it has not changed by another
* thread
*/
if (__atomic_compare_exchange(&prev->next, &kv, &next, false,
__ATOMIC_SEQ_CST,
__ATOMIC_SEQ_CST)) {
/* this node, key and value are never again used by this */
map->destroy_node(map->opaque, kv);
return true;
}
hashmap_put_replace_fail += 1;
} else { /* no previous link, update the head of the list */
/* set the head of the list to be whatever this node points to
* (NULL or other links)
*/
if (__atomic_compare_exchange(&head->next, &kv,
&next, false, __ATOMIC_SEQ_CST,
__ATOMIC_SEQ_CST)) {
map->destroy_node(map->opaque, kv);
return true;
}
/* failure means at least one new entry was added, retry the
* whole match/del process
*/
hashmap_put_head_fail += 1;
}
} else { /* if the key does not exist, try adding it */
if (!next) /* make the next key-value pair to append */
next = map->create_node(map->opaque, key, value);
next->next = NULL;
if (head) /* make sure the reference to existing nodes is kept */
next->next = head;
/* prepend the kv-pair or lazy-make the bucket */
if (__atomic_compare_exchange(&map->buckets[bucket_index], &head,
&next, false, __ATOMIC_SEQ_CST,
__ATOMIC_SEQ_CST)) {
__atomic_fetch_add(&map->length, 1, __ATOMIC_SEQ_CST);
return false;
}
/* failure means another thead updated head before this.
* track the CAS failure for tests -- non-atomic to minimize
* thread contention
*/
hashmap_put_retries += 1;
}
}
}
```
`hashmap_put` 首先尋找 key 使否已經在 hash table 當中,`next` 為新建立的節點,如果新增失敗,則此次迴圈建立的 `next` 可以在之後的迴圈中使用,不用再重新建立。可以分成各個部分討論:
* key 已經在 hash table 當中 (Line 28 ~ Line 64)
`kv` 為要被取代的節點。接著依照 `kv` 的位置分成兩種情況:
* key 不在第一個節點 (Line 36 ~ Line 48)
則我們要更新 `prev->next` 將 `next` 連接上去,故更新程式為:
* `__atomic_compare_exchange(&prev->next, &kv, &next, false,__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)`
* key 在第一個節點 (Line 48 ~ Line 63)
此情況表示 `head[bucket_idx] = kv`,我們要更新 `head->next` 將 `next` 連接上去,故更新程式為:
* `__atomic_compare_exchange(&head->next, &kv, &next, false,__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)`
而被取代的 `kv` 空間必須被釋放,故透過 `map->destroy_node` 呼叫釋放函式。
* key 不在 hash table 當中 (Line 64 ~ Line 85)
直接將 `next` 連接到當成新的 `head`,故將 `next->next` 設定為 `head`,並更新:
* `__atomic_compare_exchange(&map->buckets[bucket_index], &kv, &next, false,__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)`
```c
bool hashmap_del(hashmap_t *map, const void *key)
{
if (!map)
return false;
uint32_t bucket_index = map->hash(key) % map->n_buckets;
/* try to find a match, loop in case a delete attempt fails */
while (true) {
hashmap_kv_t *match, *prev = NULL;
for (match = map->buckets[bucket_index]; match; match = match->next) {
if ((*map->cmp)(key, match->key) == 0)
break;
prev = match;
}
/* exit if no match was found */
if (!match)
return false;
/* previous means this not the head but a link in the list */
if (prev) { /* try the delete but fail if another thread did delete */
if (__atomic_compare_exchange(&prev->next, &match, &match->next,
false, __ATOMIC_SEQ_CST,
__ATOMIC_SEQ_CST)) {
__atomic_fetch_sub(&map->length, 1, __ATOMIC_SEQ_CST);
map->destroy_node(map->opaque, match);
return true;
}
hashmap_del_fail += 1;
} else { /* no previous link means this needs to leave empty bucket */
/* copy the next link in the list (may be NULL) to the head */
if (__atomic_compare_exchange(&map->buckets[bucket_index], &match,
&match->next, false, __ATOMIC_SEQ_CST,
__ATOMIC_SEQ_CST)) {
__atomic_fetch_sub(&map->length, 1, __ATOMIC_SEQ_CST);
map->destroy_node(map->opaque, match);
return true;
}
/* failure means whole match/del process needs another attempt */
hashmap_del_fail_new_head += 1;
}
}
return false;
}
```
`hashmap_del` 與 `hashmap_put` 類似,根據要被移除的節點位置,透過 `__atomic_compare_exchange` 將 linked list 更新。
### free list
free list 用來存放要被釋放的空間,每個要釋放的空間先以 `free_later_t` 包裝,其存放該空間以及要釋放該空間的函式,再以 `list_node_t` 包裝以便可以放置在 free list `buffer` 當中。
```graphviz
digraph linked_list{
rankdir = LR;
node [shape = record];
node2 [label = "<f0> val | <f1> next"];
node1 [label = "<f0> var | <f1> free"];
node0 [label = "address" height=0.4];
node2:f0->node1
node1:f0->node0
"list_node_t"[height=0.4,color=white];
"free_later_t"[height=0.4,color=white];
"key or value or node \n memory"[height=0.4,color=white];
"list_node_t"->"free_later_t"->"key or value or node \n memory" [color=white];
}
```
這樣實作可以對每種型態決定其釋放空間的函式,如 hash 的 key 值存在 stack 中,使用 `opaque` 防止釋放。
:::warning
但此程式好像沒有正確釋放所有空間,`free_later_run` 並不會進到迴圈中。
> 後續實作: [hashmap](https://github.com/sysprog21/concurrent-programs/tree/master/hashmap),請將你的改進以 pull request 提交。
> :notes: jserv
:::
### 實作缺失
* 程式有時後會卡住
`test_del` 會呼叫 `mt_del_vals`,其中 `N_THREADS` 呼叫 `add_val` 增加相同資料到 `map` 當中,另外 `N_THREADS` 呼叫 `del_val` 將資料移除,有第三群 `N_THREADS` 會呼叫 `mt_add_vals` 插入不同的資料到 `map` 當中。
當資料從 `map` 移除時 atomic operation 沒有失敗,`hashmap_del_fail` 或 `hashmap_del_fail_new_head` 為 0 時,會再重新重複上述步驟。
導致程式執行時,會在 `while` 迴圈中重複操作,此情況在搭配 Debugger 或 Valgrind 等工具時更明顯。
:::warning
不確定 `hashmap_del_fail` 的 `hashmap_del_fail_new_head` 用途。
:::
* Memory Leak
程式有多處分配的記憶體沒有釋放。[Commit](https://github.com/hankluo6/concurrent-programs/commit/f8c15f28fc454fa90577b5905abd98011a7cc204)
* `hashmap.c`
* hashmap 在建立後沒有釋放空間,新增 `hashmap_free` 將 hashmap 釋放,並把剩下的節點放到 free list 中。
```c
void hashmap_free(hashmap_t *map)
{
for (int bucket = 0; bucket < map->n_buckets; ++bucket) {
for (hashmap_kv_t *kv = map->buckets[bucket]; kv; kv = kv->next) {
map->destroy_node(map->opaque, kv);
}
}
free(map->buckets);
free(map);
}
```
* `free_later.c`
* 原本 `free_later_run` 在釋放時會讀取到已經釋放的節點,且當 free function 為 `NULL` 時,`v->free` 會觸發 segmentation fault,改為釋放前一個節點。
```diff
- for (list_node_t *n = buffer_prev->head; n; n = n->next) {
+ for (list_node_t *n = buffer_prev->head, *prev = NULL; n;) {
+ prev = n;
+ n = n->next;
- free_later_t *v = n->val;
+ free_later_t *v = prev->val;
+ if (v->free)
free_later_t *v = n->val;
+ v->free(v->var);
- free(n);
+ free(prev);
}
```
* `free_later_stage` 會將 `buffer` 的內容給 `buffer_prev`,並新增一個新的 `buffer`,在之後的 `free_later_run` 便能將 `buffer_prev` 的內容釋放,但原本程式的 `free_later_stage` 並不會將內容賦值給 `buffer_prev`。
```diff
void free_later_stage(void)
{
...
- if (!buffer_prev || buffer_prev->length == 0) {
+ if (buffer_prev) {
release_lock(&lock);
return;
}
...
}
```
###### tags: `linux2022`