2022q1 Homework1 (quiz1)

--- tags: linux2022 --- # 2022q1 Homework1 (quiz1) contributed by < `eric88525` > > [第一週測驗](https://hackmd.io/@sysprog/linux2022-quiz1) ## 測驗 1 ### 1-1 > 解釋上述程式碼運作原理 ```c= #include <stddef.h> #include <stdlib.h> struct hlist_node { struct hlist_node *next, **pprev; }; struct hlist_head { struct hlist_node *first; }; typedef struct { int bits; struct hlist_head *ht; } map_t; #define MAP_HASH_SIZE(bits) (1 << bits) /** * 建立 2^bits 欄位的 hash table */ map_t *map_init(int bits) { map_t *map = malloc(sizeof(map_t)); if (!map) return NULL; map->bits = bits; // MAP_HASH_SIZE(bits) 回傳2^bits map->ht = malloc(sizeof(struct hlist_head) * MAP_HASH_SIZE(map->bits)); if (map->ht) { for (int i = 0; i < MAP_HASH_SIZE(map->bits); i++) (map->ht)[i].first = NULL; } else { free(map); map = NULL; } return map; } struct hash_key { int key; void *data; struct hlist_node node; }; #define container_of(ptr, type, member) \ ({ \ void *__mptr = (void *) (ptr); \ ((type *) (__mptr - offsetof(type, member))); \ }) #define GOLDEN_RATIO_32 0x61C88647 static inline unsigned int hash(unsigned int val, unsigned int bits) { /* High bits are more random, so use them. */ return (val * GOLDEN_RATIO_32) >> (32 - bits); } /* * 檢查 key 是否存在於 map 中 * */ static struct hash_key *find_key(map_t *map, int key) { struct hlist_head *head = &(map->ht)[hash(key, map->bits)]; for (struct hlist_node *p = head->first; p; p = p->next) { struct hash_key *kn = container_of(p, struct hash_key, node); if (kn->key == key) return kn; } return NULL; } /* * 檢查 key 是否存在於 hash table 中 * */ void *map_get(map_t *map, int key) { struct hash_key *kn = find_key(map, key); return kn ? kn->data : NULL; } /* * 新增一組 kn = (key,data) 到 map_t 結構 * */ void map_add(map_t *map, int key, void *data) { struct hash_key *kn = find_key(map, key); if (kn) return; // 建立 hash_key 實體並指派數值 kn = malloc(sizeof(struct hash_key)); kn->key = key, kn->data = data; // 取得 hash table [hash key] 的 head struct hlist_head *h = &map->ht[hash(key, map->bits)]; struct hlist_node *n = &kn->node, *first = h->first; // 把 kn 的 hlist_node 連接到對應的 head n->next = first; if (first) first->pprev = &n->next; h->first = n; n->pprev = &h->first; //BBB } /* * 刪除 map_t 結構 * */ void map_deinit(map_t *map) { if (!map) return; for (int i = 0; i < MAP_HASH_SIZE(map->bits); i++) { struct hlist_head *head = &map->ht[i]; for (struct hlist_node *p = head->first; p;) { struct hash_key *kn = container_of(p, struct hash_key, node); struct hlist_node *n = p; p = p->next; if (!n->pprev) /* unhashed */ goto bail; struct hlist_node *next = n->next, **pprev = n->pprev; *pprev = next; if (next) next->pprev = pprev; n->next = NULL, n->pprev = NULL; bail: free(kn->data); free(kn); } } free(map); } int *twoSum(int *nums, int numsSize, int target, int *returnSize) { // 建立 hash table map_t *map = map_init(10); *returnSize = 0; // 要回傳的答案 int *ret = malloc(sizeof(int) * 2); if (!ret) goto bail; for (int i = 0; i < numsSize; i++) { // 檢查有無存在 target-當前數字 int *p = map_get(map, target - nums[i]); if (p) { /* found */ // 寫入答案 ret[0] = i, ret[1] = *p; *returnSize = 2; break; } // 加入 nums[i] 到 map_t 結構中 p = malloc(sizeof(int)); *p = i; map_add(map, nums[i], p); } bail: // 釋放記憶體 map_deinit(map); return ret; } ``` ### 1-2 > 研讀 Linux 核心原始程式碼 include/linux/hashtable.h 及對應的文件 How does the kernel implements Hashtables?，解釋 hash table 的設計和實作手法，並留意到 tools/include/linux/hash.h 的 GOLDEN_RATIO_PRIME，探討其實作考量 --- ## 測驗 2 ### Ans COND1 = `head->next && (head->val == head->next->val)` COND2 = `head->next && (head->val == head->next->val)` ### Code ```c= struct ListNode *deleteDuplicates(struct ListNode *head) { if (!head) return NULL; if (head->next && (head->val == head->next->val)) { while (head->next && (head->val == head->next->val)) head = head->next; return deleteDuplicates(head->next); } head->next = deleteDuplicates(head->next); return head; } ``` ### Explanation > line 6-8 [color=lightgreen] 如果當前節點跟下一個節點重複，那就會持續移動head指標到最後一個重複的node + 一開始指向頭 ![](https://i.imgur.com/ztvRa6L.png) + 持續移動到直到最後一個重複節點 ![](https://i.imgur.com/VD1TCxB.png) > line 9 [color=lightgreen] 將最後重複節點的下一個位置，丟入下次遞迴並回傳，在本次遞迴中就成功跳過重複的數字 + 下一輪遞迴會看到以下情況，這也是本次函式呼叫的回傳值 ![](https://i.imgur.com/ERyo4L3.png) > line 12 [color=lightgreen] 如果當前節點跟下一節點不重複，指定head->next為經過遞迴處理的後續節點 + head 指向的節點，跟head->next不重複 ![](https://i.imgur.com/PmodJJx.png) + 將 head->next 丟入下輪遞迴，並指定head->next為其回傳值 ![](https://i.imgur.com/sIJ2hXF.png) + 回傳值如同上面 line 9的描述，為一個數字為3的節點，因此最後結果為 ![](https://i.imgur.com/QhO7J7O.png) ### 2-1 > 嘗試避免遞迴，寫出同樣作用的程式碼 #### Code ```c= struct ListNode *deleteDuplicates(struct ListNode *head) { if (!head) return NULL; struct ListNode **p = &head; struct ListNode *curr = head; while (curr) { if ( curr->next && (curr->val == curr->next->val) ){ while( curr->next && (curr->val == curr->next->val)) curr = curr->next; curr = curr->next; *p = curr; }else{ p = &((*p)->next); curr = curr->next; } } return head; } ``` #### Explanation + 透過 head pointer 迭代所有的node，會碰上兩種情況，需要做不同處理 1. head 跟 head->next **重複** 2. head 跟 head->next **不重複** > line 6-7 [color=lightgreen] 宣告 curr 跟 head指向相同節點位址，**p 則是指向 head本身的位址 ![](https://i.imgur.com/oKMPXCw.png) > line 9 [color=lightgreen] 只要 curr一直有指向節點 while 就會持續 > line 10-13 [color=lightgreen] 情況1發生：節點數字重複，讓curr移動到不重複片段的起點 ![](https://i.imgur.com/Dci3oON.png) > line 14 [color=lightgreen] 改變 \*p 為 curr 的位址，也就是此時 head 會改為指向 curr，成功跳過重複節點 ![](https://i.imgur.com/kr8MREG.png) > line 16-17[color=lightgreen] 情況2發生，curr的數字不等於curr->next的數字，p就會改為 \*p 所指向節點的 next 指標，最後讓curr往下移動 ![](https://i.imgur.com/nohsKn6.png) curr往下移動後，如果curr還是跟curr->next不一樣，p 就會再次移動到 \*p 指到節點的 next 指標 ![](https://i.imgur.com/XtFbl16.png) 此時碰上curr = curr->next 的情況 ![](https://i.imgur.com/9txnQP4.png) curr會透過 line 10-13 來移動到不重複的起點，並改變 *p 為 curr 的位址(此時 p 指向 2 的 next 指標，因此讓 2 的 next 指到 node 4)，完成跳過兩個重複的 3 ![](https://i.imgur.com/2LzY08E.png) ### 2-2 > 以類似 Linux 核心的 circular doubly-linked list 改寫，撰寫遞迴和迭代 (iterative) 的程式碼 --- ## 測驗 3 ### Ans MMM1 = `list_for_each_entry_safe` MMM2 = `list_for_each_entry` MMM3 = `list_for_each_entry` MMM4 = `list_last_entry` ### Code ```c= #include <stdio.h> #include <stdlib.h> #include "list.h" typedef struct { int capacity, count; struct list_head dhead, hheads[]; } LRUCache; typedef struct { int key, value; struct list_head hlink, dlink; } LRUNode; LRUCache *lRUCacheCreate(int capacity) { LRUCache *obj = malloc(sizeof(*obj) + capacity * sizeof(struct list_head)); obj->count = 0; obj->capacity = capacity; INIT_LIST_HEAD(&obj->dhead); for (int i = 0; i < capacity; i++) INIT_LIST_HEAD(&obj->hheads[i]); return obj; } void lRUCacheFree(LRUCache *obj) { LRUNode *lru, *n; list_for_each_entry_safe (lru, n, &obj->dhead, dlink) { // MMM1 list_del(&lru->dlink); free(lru); } free(obj); } int lRUCacheGet(LRUCache *obj, int key) { LRUNode *lru; int hash = key % obj->capacity; list_for_each_entry (lru, &obj->hheads[hash], hlink) { // MMM2 if (lru->key == key) { list_move(&lru->dlink, &obj->dhead); return lru->value; } } return -1; } void lRUCachePut(LRUCache *obj, int key, int value) { LRUNode *lru; int hash = key % obj->capacity; list_for_each_entry (lru, &obj->hheads[hash], hlink) { // MMM3 if (lru->key == key) { list_move(&lru->dlink, &obj->dhead); lru->value = value; return; } } if (obj->count == obj->capacity) { list_for_each_entry (lru,&obj->dhead, dlink){} // MMM4 list_del(&lru->dlink); list_del(&lru->hlink); } else { lru = malloc(sizeof(LRUNode)); obj->count++; } lru->key = key; list_add(&lru->dlink, &obj->dhead); list_add(&lru->hlink, &obj->hheads[hash]); lru->value = value; } ``` ### 3-1 > 解釋上述程式碼的運作，撰寫完整的測試程式，指出其中可改進之處並實作 + 資料型態 + LRUCache 主結構: + capacipy 紀錄最大容量 + count 紀錄目前有多少資料存在裡面 + dhead 紀錄最近使用哪些節點 + hheads : 陣列，每個內容會指向一個hash表頭 + LRUNode + 函式 + lRUCacheCreate：在底下說明 + lRUCacheFree：刪除 dlink 和 dhead 指向的空間 + lRUCacheGet：先取得hashkey後，到對應的hhead[ hashkey ] double linking list 內找 key 對應的 value 並回傳 + lRUCachePut：在底下說明 > line 16-25 [color=lightgreen] **lRUCacheCreate:** 創建 LRUCache 實體如果 capacity傳入為3，那就會讓 hhead 為 size 3的指標陣列， ```c LRUCache *myCache = lRUCacheCreate(3); ``` ![](https://i.imgur.com/0CvgfLK.png) > line 50-74 [color=lightgreen] **lRUCachePut:** 將新的 key,value 加入LRUCache 結構加入 key = 4, value = 44 ```c lRUCachePut(myCache , 4, 44) ``` > line 54-60 [color=lightgreen] 會先幫 value 計算 **hashkey**，並到對應的 hhead[ **hashkey** ] 內查找是否曾有存過這個 **hashkey**，有的話就更新數值並把 dlink 移動到最前面，位置越靠前代表近期有使用過。 1. 假設原本有兩個節點存在，想加入 (7,10) ![](https://i.imgur.com/rL8dBxW.png) 2. 在 line 55-58 ，先檢查 hhead[ hashkey ] 內，如果有找到相同 key 的節點，讓 \*lru 指向他，並更改其 value 後移動到 dhead 的最前方 ![](https://i.imgur.com/2uslgOt.png) > line 62-66 [color=lightgreen] 當 capacity == count 時，透過 list_last_entry 來讓 lru 指向距離 dhead 最遠的節點，並且從list中抽出 ![](https://i.imgur.com/GRGjdVm.png) 抽出後接續 line 66-74，加入到 > line 66-74 [color=lightgreen] 將 \*lru 指向節點，指派 key 和 value 後接上對應位置(dlink 接上dhead，hlink 接到 hhead ) ![](https://i.imgur.com/UQjXBfA.png) #### Testcode 測試在capacity = `CAPACITY` 的實體內塞入 `TEST_SIZE` 份資料 ```c #define CAPACITY 5 #define TEST_SIZE 10 int main() { LRUCache *lRUCache = lRUCacheCreate(CAPACITY); printf("[lRUCacheCreate] capacity = %d\n", CAPACITY); int klist[TEST_SIZE]; int i, k, v; for (i = 0; i < TEST_SIZE; i++) { int randNum = rand(); k = randNum % 100; v = randNum % 100; printf("[lRUCachePut] key = %d value = %d\n", k, v); lRUCachePut(lRUCache, k, v); klist[i] = k; } for (i = 0; i < TEST_SIZE; i++) { int getVal = lRUCacheGet(lRUCache, klist[i]); printf("[lRUCacheGet] finding key = %d ", klist[i]); if (getVal == -1) { printf("but not found\n"); } else { printf("exist: value = %d\n", getVal); } } lRUCacheFree(lRUCache); } ``` ### 3-2 > 在 Linux 核心找出 LRU 相關程式碼並探討 --- ## 測驗 4 ### Ans LLL = `--left` RRR = `++right` ### code ```c= #include <stdio.h> #include <stdlib.h> #include "list.h" struct seq_node { int num; struct list_head link; }; static struct seq_node *find(int num, int size, struct list_head *heads) { struct seq_node *node; int hash = num < 0 ? -num % size : num % size; list_for_each_entry (node, &heads[hash], link) { if (node->num == num) return node; } return NULL; } int longestConsecutive(int *nums, int n_size) { int hash, length = 0; struct seq_node *node; struct list_head *heads = malloc(n_size * sizeof(*heads)); for (int i = 0; i < n_size; i++) INIT_LIST_HEAD(&heads[i]); for (int i = 0; i < n_size; i++) { if (!find(nums[i], n_size, heads)) { hash = nums[i] < 0 ? -nums[i] % n_size : nums[i] % n_size; node = malloc(sizeof(*node)); node->num = nums[i]; list_add(&node->link, &heads[hash]); } } for (int i = 0; i < n_size; i++) { int len = 0; int num; node = find(nums[i], n_size, heads); while (node) { len++; num = node->num; list_del(&node->link); int left = num, right = num; while ((node = find(LLL, n_size, heads))) { len++; list_del(&node->link); } while ((node = find(RRR, n_size, heads))) { len++; list_del(&node->link); } length = len > length ? len : length; } } return length; } ``` ### 4-1 > 解釋上述程式碼的運作，撰寫完整的測試程式，指出其中可改進之處並實作 + 用以下題目來說明程式運作，答案為4 ```c nums = [100,4,200,1,3,2] ``` + find 功能為在結構中找 num 是否存在，作法為 mod num 後得到`hash`，接著到對應的`heads[hash]`查找有無`num`，有則回傳節點位址 ```c=10 static struct seq_node *find(int num, int size, struct list_head *heads) { struct seq_node *node; int hash = num < 0 ? -num % size : num % size; list_for_each_entry (node, &heads[hash], link) { if (node->num == num) return node; } return NULL; } ``` + 初始化`heads`為`size n_size`的陣列 ```c=23 int hash, length = 0; struct seq_node *node; struct list_head *heads = malloc(n_size * sizeof(*heads)); for (int i = 0; i < n_size; i++) INIT_LIST_HEAD(&heads[i]); ``` ![](https://i.imgur.com/fVFk4xe.png) + 先在 `heads [hash]` 中查找數字，如果此數字不存在就加入到 `heads[hashkey]` 後面 ```c=30 for (int i = 0; i < n_size; i++) { if (!find(nums[i], n_size, heads)) { hash = nums[i] < 0 ? -nums[i] % n_size : nums[i] % n_size; node = malloc(sizeof(*node)); node->num = nums[i]; list_add(&node->link, &heads[hash]); } } ``` 第一個 num 為100，透過 find 找不到此數字，所以加入 heads[hash] 這裡的 hashkey = 100 % 6 = 4 ![](https://i.imgur.com/eKyzVKL.png) nums[1] = 4，find 找不到此數字，加入 heads[hash] ![](https://i.imgur.com/RBGyYiJ.png) 最後完成所有數字的加入 ![](https://i.imgur.com/UB5JSqS.png) + 從 `nums[i]` 開始，如 `nums[i]` 存在於結構中，向左右查找並更新最大長度 ```c=39 for (int i = 0; i < n_size; i++) { int len = 0; int num; node = find(nums[i], n_size, heads); while (node) { len++; num = node->num; list_del(&node->link); int left = num, right = num; while ((node = find(--left, n_size, heads))) { len++; list_del(&node->link); } while ((node = find(++right, n_size, heads))) { len++; list_del(&node->link); } length = len > length ? len : length; } } ``` 指向100，移出結構，並往`left = 99` `right = 101`查找，更新length為1 ![](https://i.imgur.com/s0c0MOR.png) 指向4，移出結構，並往`left = 3` `left = 2` `left = 1` `right = 5`查找，更新length為 4 ![](https://i.imgur.com/nwXVUxs.png) + 最後回傳`length` ```c=63 return length; ``` ### Test 在 TEST_SIZE 大小的 int array中讓 MAX_LEN個數字為連續，函式回傳值應該等於 MAX_LEN ```c #define MAX_LEN 50 #define TEST_SIZE 300 int main(){ int i, startNum = MAX_LEN; int nums[TEST_SIZE] ={}; printf("Test nums:\n"); for(i=0;i<TEST_SIZE;i++){ if ( startNum && i%2) nums[i] = startNum--; printf("%d ",nums[i]); } printf("\n"); int ans = longestConsecutive(nums, TEST_SIZE); printf("Ans = %d , longestConsecutive = %d\n",MAX_LEN,ans); } ``` ### 4-2 > 嘗試用 Linux 核心風格的 hash table 重新實作上述程式碼 + [畫lk list](https://stackoverflow.com/questions/50494263/circular-list-in-graphviz-or-how-to-bend-the-edge)