2023q1 Homework1 (lab0)

--- tags: linux kernel --- # 2023q1 Homework1 (lab0) contributed by < [aa12551](https://github.com/aa12551/lab0-c.git) > ## 作業要求 - [ ] 修改 queue.[ch] 和連帶的檔案，測試後用 Git 管理各項修改，要能滿足 $ make test 自動評分系統的所有項目 - [ ] 研讀 Linux 核心原始程式碼的 [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 並嘗試引入到 `lab0-c` 專案，比較你自己實作的 merge sort 和 Linux 核心程式碼之間效能落差 - [ ] 確保 `qtest` 提供的 `web` 命令得以搭配上述佇列實作使用，目前一旦網頁伺服器啟動後，原本處理命令列操作的 `linenoise` 套件就無法使用，請提出改善機制 - [ ] 在 `qtest` 提供新的命令 `shuffle`，允許藉由 [Fisher–Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle) 演算法，對佇列中所有節點進行洗牌 (shuffle) 操作，需要以統計的原理來分析你的實作，探究洗牌的「亂度」 - [ ] 在 `qtest` 中執行 `option entropy 1` 並搭配 `ih RAND 10` 一類的命令，觀察亂數字串的分布，並運用「資訊與熵互補，資訊就是負熵」的觀念，設計一組新的命令或選項，得以在 `qtest` 切換不同的 PRNG 的實作 (可選定 Xorshift)，搭配統計分析工具，比較亂數產生器 (如 Xorshift vs. 內建) 的品質 - [ ] 研讀論文[〈Dude, is my code constant time?〉](https://eprint.iacr.org/2016/1123.pdf)，解釋本程式的 “simulation” 模式是如何透過以實驗而非理論分析，達到驗證時間複雜度，需要解釋 [Student’s t-distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution) 及程式實作的原理。 ## 開發環境 ```shell $ gcc --version gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0 Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz CPU family: 6 Model: 158 Thread(s) per core: 2 Core(s) per socket: 4 Socket(s): 1 Stepping: 9 CPU max MHz: 3800.0000 CPU min MHz: 800.0000 BogoMIPS: 5599.85 ``` ## 改進 `queue.c` ### 結構 - 在程式開始時，<s>會創建一個 `queue_chain_t` 的結構</s> :::warning 思考上述表達是否合理，結構體並非「程式開始時建立」，其實結構體在描述一段連續記憶體區塊的佈局 (layout)。改進你的漢語表達，應要足以彰顯你的專業。 :notes: jserv ::: ```c typedef struct { struct list_head head; int size; } queue_chain_t; ``` - 而在 `list_head` 中，為兩個指向 `list_head` 的指標，分別會指向前一個節點及後一個節點 ```c struct list_head { struct list_head *prev; struct list_head *next; }; ``` - 當我們下 `new` 指令時，即會創建一個 list ，此時就會在 `queue_chain_t` 的 `head` 指向一個 `queue_context_t` ，此結構將會儲存此 list 的相關資訊，`q` 為指向此 list 的 element ，此實再創健一個 list 時， `chain` 會指向下一個 list 的 `queue_context_t` ```c typedef struct { struct list_head *q; struct list_head chain; int size; int id; } queue_contex_t; ``` - list 的每一個 element 則是一個 `element_t` 結構，`value` 儲存資料，`list` 指向上一個與下一個 element ```c typedef struct { char *value; struct list_head list; } element_t; ``` - 由上面的結構可以知道，所有的結構皆以 `list_head` 連接，這樣的好處是可以將指標的類型都統一，但會無法知道此指標指向的結構是哪一個，且無法直接存取結構內的參數，需要利用特定巨集來存取參數 ### list 的巨集與函式介紹 - INIT_LIST_HEAD(head) : 對 list 的 `next` 和 `prev` 初始化 - list_entry(node, type, member) : 存取 `list_head` 指向結構的其他參數 - list_for_each(node, head) : 走過 list 的所有節點 - list_for_each_entry_safe (entry, safe, l, list) : 在可能會刪除節點的運算中，也可以順利的走完 list 的所有節點 (使用 safe 先儲存下一個節點的位置) - list_add(struct list_head *node, struct list_head *head) : 增加節點在 head 前面 - list_add_tail(struct list_head *node, struct list_head *head) : 增加節點在 head 後面 - list_del(struct list_head *node) : 移除 (非刪除) node 指向的節點 ### q_new - 需要確認是否有成功分配記憶體，並初始化 list ```c struct list_head *q_new() { struct list_head *head = malloc(sizeof(*head)); if (!head) return NULL; INIT_LIST_HEAD(head); return head; } ``` ### q_free - 走遍整個 list，因為有刪除節點的運算，使用 `list_for_each_entry_safe` 確保可以走遍整個 list - 如果傳入的指標為 NULL，不做任何事情 ```c void q_free(struct list_head *l) { if (!l) return; element_t *entry = NULL, *safe = NULL; list_for_each_entry_safe (entry, safe, l, list) { q_release_element(entry); } free(l); } ``` ### q_insert_head - 此函式需要負責完整複製字串到 `list` ，且負責分配記憶體。 - 需要檢查 `head` 是否為 `NULL` 以及在 malloc 新的 `element_t` 及 `element_t` 的成員`value` 是否有成功 - 需要注意在字串結尾需要有 `\0` ```c bool q_insert_head(struct list_head *head, char *s) { if (!head) return false; element_t *newnode = malloc(sizeof(element_t)); if (!newnode) return false; newnode->value = malloc(sizeof(char) * (strlen(s) + 1)); if (!newnode->value) { free(newnode); return false; } memcpy(newnode->value, s, strlen(s) + 1); list_add(&newnode->list, head); return true; } ``` :::warning insert head 在 trace-17-complexity 的測試中沒有過關 ::: ### q_insert_tail 與 `q_insert_head` 的實作相同，僅將 `list_add` 改為 `list_add_tail` ```c bool q_insert_tail(struct list_head *head, char *s) { if (!head) return false; element_t *newnode; if (!(newnode = malloc(sizeof(element_t)))) return false; if (!(newnode->value = malloc(strlen(s) + 1))) { free(newnode); return false; } memcpy(newnode->value, s, strlen(s) + 1); list_add_tail(&newnode->list, head); return true; } ``` :::warning insert tail 在 trace-17-complexity 的測試中沒有過關 ::: ### q_remove_head - `sp` 由外部分配記憶體 - 如果 `sp` 的長度大於 `bufsize` ，只複製 `bufsize` 的尺寸到 `sp` - 需要注意在字串結尾需要有 `\0` ```c element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *removeNode = list_entry(head->next, element_t, list); size_t len = strlen(removeNode->value) + 1; len = (len > bufsize) ? bufsize : len; if (sp) { memcpy(sp, removeNode->value, len); sp[len - 1] = '\0'; } list_del(head->next); return removeNode; } ``` :::warning 在一般的模式時，會檢查 `sp` 是否有被分配記憶體，不過在 `simulation` 模式時，不會確保 `sp` 的記憶體有被分配，會造成 segmentation fault ，因此需要額外增加判斷 `sp` 是否有被有效的分配到記憶體 ::: ### q_remove_tail - 與 `q_remove_head` 的實作一樣，僅更改移除的方向 ```c element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *removeNode = list_entry(head->prev, element_t, list); size_t len = strlen(removeNode->value) + 1; len = (len > bufsize) ? bufsize : len; if (sp) { memcpy(sp, removeNode->value, len); sp[len - 1] = '\0'; } list_del(head->prev); return removeNode; } ``` ### q_size - 將整個 `list` 走遍，並計算途中有幾個節點 ```c int q_size(struct list_head *head) { struct list_head *iterator = NULL; if (!head) return 0; int count = 0; list_for_each (iterator, head) count++; return count; } ``` ### q_delete_mid - 因為之後的函式會常需要從 list 移除節點並釋放空間，以 `q_delete_element` 完成移除節點並釋放空間 ```c void q_delete_element(element_t *node) { list_del(&node->list); q_release_element(node); } ``` - 使用快慢指標找出中間節點並刪除 ```c bool q_delete_mid(struct list_head *head) { if (!head || list_empty(head)) return false; struct list_head *slow = head->next, *fast = head->next; while (fast != head && fast->next != head) { fast = fast->next->next; slow = slow->next; } q_delete_element(list_entry(back, element_t, list)); return true; } ``` - 而因為我們使用的是 doubly linked list ，可以使用兩個指標分別由前後找，當前後的指標指向一樣的節點或是只差一個節點時，即為中間節點，可以減少 traverse 的次數，將以上的程式碼改為如下 ```c bool q_delete_mid(struct list_head *head) { if (!head || list_empty(head)) return false; struct list_head *front = head->next, *back = head->prev; while (front != back && front->next != back) { front = front->next; back = back->prev; } q_delete_element(list_entry(back, element_t, list)); return true; } ``` ### q_delete_dup ```c bool q_delete_dup(struct list_head *head) { if (!head) return false; if (list_empty(head)) return true; element_t *entry, *safe; bool delete = false; list_for_each_entry_safe (entry, safe, head, list) { if (&safe->list == head) { if (delete) { list_del(&entry->list); q_release_element(entry); } return true; } if (strcmp(entry->value, safe->value) == 0) { list_del(&entry->list); q_release_element(entry); delete = true; continue; } if (delete) { list_del(&entry->list); q_release_element(entry); delete = false; } } return true; } ``` ### q_swap - 兩兩節點交換，交換完後再往下兩個節點交換 ```c void q_swap(struct list_head *head) { if (!head) return; struct list_head *iterator = head->next; while (iterator != head && iterator->next != head) { char *temp = list_entry(iterator, element_t, list)->value; list_entry(iterator, element_t, list)->value = list_entry(iterator->next, element_t, list)->value; list_entry(iterator->next, element_t, list)->value = temp; iterator = iterator->next->next; } } ``` ### q_reverse - 將所有節點本身的 `prev` 和 `next` 互換，即可得到一個反轉的 `list` ```c void q_reverse(struct list_head *head) { if (!head) return; struct list_head *iterator = head->prev; while (iterator != head) { q_reverse_pointer(iterator); iterator = iterator->next; } q_reverse_pointer(head); } ``` - 而因為在 `q_reverse` 和 `q_reverseK` 皆需要將節點本身的 `next` 和 `prev` 互換，因此增加一個函式完成此功能 ```c void q_reverse_pointer(struct list_head *head) { struct list_head *temp = head->next; head->next = head->prev; head->prev = temp; } ``` ### q_reverseK - 先計算 list 的長度，即可知道要做幾組反轉 (len / k) 組 ```c void q_reverseK(struct list_head *head, int k) { if (!head) return; struct list_head *local_head = head; struct list_head *iterator = head->next; int len = q_size(head); for (int j = 0; j < (len / k); j++) { for (int i = 0; i < k; i++) { q_reverse_pointer(iterator); iterator = iterator->prev; } local_head->next->next = iterator; iterator->prev->prev = local_head; struct list_head *temp = iterator->prev; iterator->prev = local_head->next; local_head->next = temp; local_head = iterator->prev; } } ``` ### q_sort - 實作 merge sort :::danger 不需要張貼完整程式碼，在開發紀錄中，你應該提及自己的想法、考量因素，對現有途徑的分析等等。程式碼僅是輔佐性質，列出關鍵者即可。 :notes: jserv ::: <s> ```c /* Function to split the doubly linked list into two halves */ void splitList(struct list_head *source, struct list_head **frontRef, struct list_head **backRef) { struct list_head *slowPtr = source; struct list_head *fastPtr = source->next; /* Move fastPtr by two and slowPtr by one */ while (fastPtr != NULL) { fastPtr = fastPtr->next; if (fastPtr != NULL) { slowPtr = slowPtr->next; fastPtr = fastPtr->next; } } /* Set the front and back halves of the list */ *frontRef = source; *backRef = slowPtr->next; (*backRef)->prev = NULL; slowPtr->next = NULL; } /* Function to merge two sorted doubly linked lists */ struct list_head *mergeLists(struct list_head *a, struct list_head *b) { struct list_head *result = NULL; /* Base case */ if (a == NULL) { return b; } else if (b == NULL) { return a; } char *s1 = list_entry(a, element_t, list)->value; char *s2 = list_entry(b, element_t, list)->value; /* Recursively merge the lists */ if (strcmp(s1, s2) < 0) { result = a; result->next = mergeLists(a->next, b); result->next->prev = result; } else { result = b; result->next = mergeLists(a, b->next); result->next->prev = result; } return result; } /* Function to perform merge sort on a doubly linked list */ void sort(struct list_head **headRef) { struct list_head *head = *headRef; struct list_head *a = NULL; struct list_head *b = NULL; /* Base case: if the list is empty or has only one element */ if (head == NULL || head->next == NULL) { return; } /* Split the list into two halves */ splitList(head, &a, &b); /* Recursively sort the two halves */ sort(&a); sort(&b); /* Merge the sorted halves */ *headRef = mergeLists(a, b); } /* Sort elements of queue in ascending order */ void q_sort(struct list_head *head) { if (!head || head->next == head) return; head->prev->next = NULL; sort(&(head->next)); struct list_head *iter = head; while (iter->next != NULL) { iter = iter->next; } head->next->prev = head; head->prev = iter; iter->next = head; } ``` </s> :::warning 在 trace-14-perf 的測試中，會產生 segmentation fault，測試過後發現當 list 長度過長時才會產生 segmentation fault (大概300000筆RAND資料時) ::: - 確實也覺得的寫的有點冗長，嘗試將程式縮減並與 `q_merge` 所用到的 `mergetwolists` 與此函式合併，並將原本 `splitList` 使用快慢指標的方法改為由頭尾去找中間節點 ```c void mergetwolists(struct list_head **l1, struct list_head **l2) { struct list_head *l1_iter = (*l1)->next; struct list_head *l2_iter = (*l2)->next; struct list_head *iter = *l1; while (l1_iter != *l1 && l2_iter != *l2) { char *s1 = list_entry(l1_iter, element_t, list)->value; char *s2 = list_entry(l2_iter, element_t, list)->value; if (strcmp(s1, s2) > 0) { iter->next = l2_iter; l2_iter->prev = iter; l2_iter = l2_iter->next; iter = iter->next; } else { iter->next = l1_iter; l1_iter->prev = iter; l1_iter = l1_iter->next; iter = iter->next; } } if (l1_iter == *l1) { iter->next = l2_iter; l2_iter->prev = iter; while (l2_iter->next != *l2) l2_iter = l2_iter->next; l2_iter->next = *l1; (*l1)->prev = l2_iter; } if (l2_iter == *l2) { iter->next = l1_iter; l1_iter->prev = iter; } } void splitlists(struct list_head **l1, struct list_head **l2) { struct list_head *front = (*l1)->next, *back = (*l1)->prev; while (front != back && front->next != back) { front = front->next; back = back->prev; } list_cut_position((*l2)->next, (*l1)->next, back); } /* Sort elements of queue in ascending order */ void q_sort(struct list_head *head) { /* Base case: if the list is empty or has only one element */ if (head == NULL || head->next == head || head->next == head->prev) { return; } struct list_head l1; INIT_LIST_HEAD(&l1); struct list_head *l1_ptr = &l1; /* Split the list into two halves */ splitlists(&head, &l1_ptr); /* Recursively sort the two halves */ q_sort(head); q_sort(l1_ptr); /* Merge the sorted halves */ mergetwolists(&head, &l1_ptr); } ``` - 這樣就可以過 trace-14-perf 的測試了 ### q_descend - 從 list 的後面開始往前 traverse ，當節點的數值比 `max` 大時，`max` 的數值改為當前節點的數值，如果節點的數值比 `max` 小時，則從 list 移除此節點並刪除 ```c int q_descend(struct list_head *head) { // https://leetcode.com/problems/remove-nodes-from-linked-list/ if (!head || head->next == head) return 0; element_t *entry = list_entry(head->prev, element_t, list); element_t *safe = list_entry(entry->list.prev, element_t, list); char *max = "\0"; // the smallest value of ASCII int count = 0; while (&entry->list != (head)) { if (strcmp(max, entry->value) > 0) { q_delete_element(entry); } else { max = entry->value; count++; } entry = safe; safe = list_entry(safe->list.prev, element_t, list); } return count; } ``` ### q_merge - 使用 `mergetwolists` 將兩個已排列好的 list 合併並重新排列，並將所有的 list 都合併到第一個 list ```c /* Merge all the queues into one sorted queue, which is in ascending order */ int q_merge(struct list_head *head) { // https://leetcode.com/problems/merge-k-sorted-lists/ if (!head) return 0; struct list_head *l1 = list_entry(head->next, queue_contex_t, chain)->q; struct list_head *iter = head->next; while (iter->next != head) { struct list_head *l2 = list_entry(iter->next, queue_contex_t, chain)->q; mergetwolists(&l1, &l2); list_entry(iter->next, queue_contex_t, chain)->q = NULL; iter = iter->next; } return q_size(l1); } ``` :::warning current score 95/100 :::