# 2024q1 Homework1 (lab0) contributed by < `yqt2000` > ## 開發環境 ```shell $ gcc --version gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 Vendor ID: GenuineIntel Model name: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz CPU family: 6 Model: 140 Thread(s) per core: 2 Core(s) per socket: 4 Socket(s): 1 Stepping: 1 CPU max MHz: 4200.0000 CPU min MHz: 400.0000 BogoMIPS: 4838.40 ``` :::warning 在終端機執行 `export LC_ALL=C` 再執行上述命令,可得英語訊息,目前的中文訊息用語不精準。 ::: ::: success 已修正 by `yqt2000` ::: :::danger 1. 無論標題和內文中,中文和英文字元之間要有空白字元 (對排版和文字搜尋有利); 2. 文字訊息避免用圖片來表示,否則不好搜尋和分類 ::: ## 針對佇列操作的程式碼實作 ### q_free 建立新的「空」佇列 使用 `list.h` 中的巨集 `list_for_each_entry_safe` ,並搭配 `queue.h` 的 `q_release_element` 逐一釋放佇列中的物件。 ```c element_t *entry, *safe = NULL; list_for_each_entry_safe (entry, safe, head, list) { q_release_element(entry); } ``` ### q_insert_head/q_insert_tail 在佇列開頭/尾端 (head/tail) 插入 (insert) 給定的新節點 (以 LIFO/FIFO 準則); 原本想使用 strncpy 來初始化節點的值,與同學討論後發現有 strdup 這個函式,以下引用[strdup - cppreference.com](https://en.cppreference.com/w/c/experimental/dynamic/strdup)的函式敘述 > ```c > char * strdup( const char *str1 ); > ``` > Returns a pointer to a null-terminated byte string, which is a duplicate of the string pointed to by str1. > The returned pointer must be passed to free to avoid a memory leak. > If an error occurs, a null pointer is returned and errno may be set. 此函式可幫助檢驗錯誤,當錯誤發生時會回傳空指標,但是為避免發生記憶體流失 (memory leak) 須釋放(free)這個回傳的指標,其中 q_release_element 中已包含其實作,因此使用 strdup 可使整體程式更加簡潔有品味,故使用 strdup 替代 strncpy 來實作 。 ```c if (!(e->value = strdup(s))) { q_release_element(e); return false; } list_add(&e->list, head); ``` 在實作 q_insert_head/q_insert_tail 中差異部份僅為上文程式碼中,將 list_add 更改為 `list_add_tail` ### q_remove_head/q_remove_tail 功能簡述:在佇列開頭/尾端 (head/tail) 移去 (remove) 給定的節點; :::danger 避免過多的中英文混用,已有明確翻譯詞彙者,例如「鏈結串列」(linked list) 和「佇列」(queue),就使用該中文詞彙,英文則留給變數名稱、人名,或者缺乏通用翻譯詞彙的場景。本例的 pointer to character 就是「字串」。 ::: :::danger 參照 C 語言規格書和 Linux man pages,亦即要用 https://man7.org/linux/man-pages/ 的超連結。 ::: ::: success 已更正 by `yqt2000` ::: 此部份使用 strncpy 將欲移除節點的值複製至 sp 這個字串中,其中引用 [strncpy(3p) — Linux manual page](https://man7.org/linux/man-pages/man3/strncpy.3p.html) 中的部份函式敘述 ::: info The stpncpy() and strncpy() functions shall copy not more than n bytes (bytes that follow a NUL character are not copied) from the array pointed to by s2 to the array pointed to by s1. If the array pointed to by s2 is a string that is shorter than n bytes, NUL characters shall be appended to the copy in the array pointed to by s1, until n bytes in all are written. ::: 顯然在以下程式碼中,在呼叫 strncpy 時,將所有能用的空間 bufsize 全用上了(如上述描述未使用的記憶體空間會填補 `\0` ),這並不是一個節省空間的行為,因此尚有改進的空間。 ```c element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *e = list_first_entry(head, element_t, list); if (sp) { strncpy(sp, e->value, bufsize - 1); sp[bufsize - 1] = '\0'; } list_del(&e->list); return e; } ``` ### q_delete_mid 功能簡述:移走佇列中間的節點,詳見 [LeetCode 2095. Delete the Middle Node of a Linked List](https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/description/) middle node 的敘述如下: > The middle node of a linked list of size n is the ⌊n / 2⌋th node from the start using 0-based indexing. If there're six elements, the third member should be returned. 此實作關鍵部份為如何找出 middle node ,以下為使用快慢指標實作的部份程式碼,然而因為此 doubly lined list 是 circular 的,移動fast指標須謹慎檢查是否為head,否則容易進入無窮迴圈。 ```c struct list_head **slow, *fast; slow = &head; fast = head->next; while (fast != head) { slow = &(*slow)->next; fast = fast->next; if (fast != head) { fast = fast->next; } } ``` ### q_delete_dup 在已經排序的狀況,移走佇列中具備重複內容的節點,詳見 [LeetCode 82. Remove Duplicates from Sorted List II](https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/description/) 使用 `list_for_each_entry_safe` 遍歷整條list,其中使用該巨集可存取下個 entry (s),當entry c 與 entry s 值為相同時,移除 entry c 於 list 的連結並釋放其記憶體空間,並且使用 bool dup 協助移除該次重複數值的最後一個節點。 ```c element_t *c, *s = NULL; bool dup = false; list_for_each_entry_safe (c, s, head, list) { if (&s->list != head && strcmp(c->value, s->value) == 0) { list_del(&c->list); q_release_element(c); dup = true; } else if (dup) { list_del(&c->list); q_release_element(c); dup = false; } } return true; ``` ### q_swap 交換佇列中鄰近的節點,詳見 [LeetCode 24. Swap Nodes in Pairs](https://leetcode.com/problems/swap-nodes-in-pairs/description/) 善用 `list.h` 中 list_move 功能,將節點往後接一個節點,程式碼兼具簡潔即可讀性。 ```c for (struct list_head *c = head->next; c->next != head && c != head; c = c->next) { list_move(c, c->next); } ``` ### q_reverse 以反向順序重新排列鏈結串列,該函式不該配置或釋放任何鏈結串列節點,換言之,它只能重排已經存在的節點; 將鍊結串列依序重新接至head後面 ```c if (!head || list_empty(head) || list_is_singular(head)) return; struct list_head *n, *s; list_for_each_safe (n, s, head) { list_del(n); list_add(n, head); } ``` 後來發現其指令同 list_move 故作其更改 ```diff - list_del(n); - list_add(n, head); + list_move(n,head) ``` ### q_reverseK 類似 q_reverse,但指定每 k 個節點為一組,進行反向順序排列,詳見 [LeetCode 25. Reverse Nodes in k-Group](https://leetcode.com/problems/reverse-nodes-in-k-group/description/) 首先計算要進行 reverse 的 group 的個數,使用 cutHead 及 cutLast 兩個指標指向該 group 的前一個 node 和最後一個 node ,依序裁切各個 group 進行 reverse 後,再接回原本從 list 切斷的位置 ```c void q_reverseK(struct list_head *head, int k) { // https://leetcode.com/problems/reverse-nodes-in-k-group/ int times = q_size(head) / k; struct list_head *cutLast = head; struct list_head cutList; for (int i = 0; i < times; i++) { INIT_LIST_HEAD(&cutList); struct list_head *cutHead; cutHead = cutLast; for (int j = 0; j < k; j++) cutLast = cutLast->next; list_cut_position(&cutList, cutHead, cutLast); q_reverse(&cutList); list_splice(&cutList, cutHead); cutLast = cutList.prev; } } ``` ### q_ascend/q_descend 參閱 [LeetCode 2487. Remove Nodes From Linked List](https://leetcode.com/problems/remove-nodes-from-linked-list/) ```c /* Remove every node which has a node with a strictly greater value anywhere to * the right side of it */ int q_descend(struct list_head *head) { // https://leetcode.com/problems/remove-nodes-from-linked-list/ if (!head || list_empty(head) || list_is_singular(head)) return 0; element_t *e = list_first_entry(head, element_t, list); element_t *s = list_entry(e->list.next, element_t, list); for (; &s->list != head; e = s, s = list_entry(s->list.next, element_t, list)) { // check element e, if e<s delete it and move e to previous node while (&s->list != head && strcmp(e->value, s->value) < 0) { list_del(&e->list); q_release_element(e); if (s->list.prev != head) { e = list_entry(s->list.prev, element_t, list); } else { break; } } } return 1; } ``` ### q_sort 這邊值得注意的是,會先將雙向鏈結串列的 last 暫時先指向 NULL 避免在 mergeSortList 函式中進入無窮迴圈,另外 mergeSortList 函式實作的想法是僅排序節點 ,其中並不是傳入合法的雙向環狀鍊結串列,因此在結束該函式後,必須將節點串列重新接回 head 和重新連接 previous link (list_head *prev)。 ```c /* Sort elements of queue in ascending/descending order */ void q_sort(struct list_head *head, bool descend) { if (!head || list_empty(head) || list_is_singular(head)) return; head->prev->next = NULL; head->next = mergeSortList(head->next); // reconnect the link(prev) struct list_head *cur; for (cur = head; cur->next; cur = cur->next) cur->next->prev = cur; cur->next = head; head->prev = cur; if (descend) q_reverse(head); } ``` ```c // ascending struct list_head *mergeSortList(struct list_head *head) { if (!head || !head->next) return head; struct list_head *slow = head; struct list_head *fast = head; while (fast && fast->next) { slow = slow->next; fast = fast->next->next; } slow->prev->next = NULL; struct list_head *l1 = mergeSortList(head); struct list_head *l2 = mergeSortList(slow); // merge sorted l1 and sorted l2 return mergeTwoSortedList(l1, l2); } ``` 在 mergeTwoSortedList 中,當 l1, l2非空時,接在 `head` 尾端,此部份僅作相接,並不是合法的雙向鏈結串列,因此如上述描述會再重接 previous link。 ```c // assume l1, l2 are ascending and return the asending merge result struct list_head *mergeTwoSortedList(struct list_head *l1, struct list_head *l2) { struct list_head head; INIT_LIST_HEAD(&head); while (l1 && l2) { element_t *e1 = list_entry(l1, element_t, list); element_t *e2 = list_entry(l2, element_t, list); if (strcmp(e1->value, e2->value) <= 0) { l1 = l1->next; list_add_tail(&e1->list, &head); } else { l2 = l2->next; list_add_tail(&e2->list, &head); } } if (l1) { head.prev->next = l1; l1->prev = head.prev; } if (l2) { head.prev->next = l2; l2->prev = head.prev; } return head.next; } ``` :::danger 避免過多的中英文混用,已有明確翻譯詞彙者,例如「鏈結串列」(linked list) 和「佇列」(queue),就使用該中文詞彙,英文則留給變數名稱、人名,或者缺乏通用翻譯詞彙的場景。 ::: ### q_merge 在這部份實作中,先將其他佇列的雙向鏈結串列接至第一條佇列的雙向鏈結串列的尾端,再進行排序。 :::danger 改進你的漢語表達。 ::: 顯然這並沒有利用「每個佇列中的雙向鏈結串列皆為已排序的」這個條件,因此有待改進。 ```c /* Merge all the queues into one sorted queue, which is in ascending/descending * order */ int q_merge(struct list_head *head, bool descend) { // https://leetcode.com/problems/merge-k-sorted-lists/ if (!head || list_empty(head)) return 0; queue_contex_t *q1 = list_entry(head->next, queue_contex_t, chain); queue_contex_t *qnode = NULL; if (list_is_singular(head)) return q1->size; list_for_each_entry (qnode, head, chain) { if (qnode->id == q1->id) continue; list_splice_tail(qnode->q, q1->q); qnode->size = 0; INIT_LIST_HEAD(qnode->q); q1->size += qnode->size; } q_sort(q1->q, descend); return q1->size; } ``` ### Valgrind + 自動測試程式 以下為執行 make valgrind 的結果 使用 make valgrind ,利用原先於 `traces` 資料夾中17個 `.cmd` 測資檔案進行測試,並檢查是否有記憶體流失(momory leak)的問題,得到的結果如下 --- TOTAL 100/100 </s> :::danger 摘錄關鍵訊息即可。 ::: --- ## 引入 [lib/list_sort](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 到 `lab0-c` 專案 ### `void *priv` 在 [lib/list_sort](https://github.com/torvalds/linux/blob/master/lib/list_sort.c)中 `void *priv` 有著以下的描述。 :::info @priv: private data, opaque to list_sort(), passed to @cmp ::: 在我們的專案中不會使用到這部份,所以將其移除並不會影響排序結果,因此將其從函式引數及相關程式碼中移除,以下舉其中一函式為例,另外還有 `merge_final`, 和`list_sort` 也有傳入此引數 `priv`。 ```diff= - __attribute__((nonnull(2,3))) - void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp) + __attribute__((nonnull(1,2))) + void list_sort(struct list_head *head, list_cmp_func_t cmp) ``` 另外 `__attribute__((nonnull(1,2))` 參考 [5.24 Declaring Attributes of Functions](https://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Function-Attributes.html) 有以下描述 > nonnull (arg-index, ...) The nonnull attribute specifies that some function parameters should be non-null pointers. For instance, the declaration: ```c extern void * my_memcpy (void *dest, const void *src, size_t len) __attribute__((nonnull (1, 2))); ``` > causes the compiler to check that, in calls to my_memcpy, arguments dest and src are non-null. If the compiler determines that a null pointer is passed in an argument slot marked as non-null, and the -Wnonnull option is enabled, a warning is issued. The compiler may also choose to make optimizations based on the knowledge that certain function arguments will not be null. ### add command in the queue console :::danger 改進你的漢語表達。 ::: 欲在實作的佇列命令視窗(console)中加入 `lsort` 這條命令,首先須在 `console_init` 函式中加入以下這行 ```c ADD_COMMAND(lsort, "Sort queue by linux kernel like list_sort", ""); ``` 並仿照 `do_sort` 函式,添加 `do_lsort` ,將函式呼叫改為引入的新函式 `q_list_sort` ```diff= bool do_lsort(int argc, char *argv[]){ ... - q_sort(current->q, descend); + q_list_sort(current->q, descend); ... } ``` 即可於 `.cmd` 的測試檔中呼叫 `lsort` 來使用引入的 `list_sort` 進行排序。 ### 對 linux kernel like list_sort 與 implemented merge_sort 進行效能分析 仿照 [2024q1 第 1 週測驗題測驗二](https://hackmd.io/@sysprog/linux2024-quiz1#%E6%B8%AC%E9%A9%97-2) 的 timsort 測驗程式進行改寫,細節程式碼紀錄於 [commit d90e226](https://github.com/yqt2000/lab0-c/commit/d90e2268e1fef66e4976ac929912404455f97ab4) 對於隨機生成的樣本數為 10000 的雙向鍊結串列進行排序。 | sorting algo | cpu cycles | cmp nums | | -------- | -------- | -------- | | linux kernel like list_sort| 730 | 120999 | | implemented merge sort | 836 | 120428 | 其中並未考慮資料分佈問題,因此接續進行 shuffle 的實作 ### 在 qtest 提供新的命令 shuffle 比照 [lab0-d](https://hackmd.io/@sysprog/linux2024-lab0/%2F%40sysprog%2Flinux2024-lab0-d) 在qtest 中提供新的命令 shuffle, 實作的 shuffle 演算法為 Fisher–Yates shuffle 細節程式碼請參照 [Commit f287e75](https://github.com/yqt2000/lab0-c/commit/f287e7531d5a453bae45ee5d60cda178ad5409d2) ```c // To shuffle an array a of n elements (indices 0..n-1): void FYshuffle(struct list_head *head){ // for i from n−1 down to 1 do // j ← random integer such that 0 ≤ j ≤ i // exchange a[j] and a[i] if (!head || list_is_singular(head)) return; int len = q_size(head); struct list_head *node_i = head->prev; for(int i = len-1; i>0; i--){ int r = rand() % (i+1); struct list_head *node_j = head->next; for(int j=0; j<r; j++) node_j = node_j->next; list_swap(node_i,node_j); node_i = node_j->prev; } return; } ``` 統計分析如下 ``` Expectation: 41666 Observation: {'1234': 41652, '1243': 41399, '1324': 41729, '1342': 41727, '1423': 41761, '1432': 41756, '2134': 41315, '2143': 41691, '2314': 41993, '2341': 41282, '2413': 41651, '2431': 41846, '3124': 41547, '3142': 41810, '3214': 41796, '3241': 41363, '3412': 41871, '3421': 41332, '4123': 41143, '4132': 41699, '4213': 41932, '4231': 42077, '4312': 41952, '4321': 41676} chi square sum: 33.6128738059809 ``` ![uniform_distribution](https://hackmd.io/_uploads/rkDu4KtCT.png) --- ## 整合網頁伺服器 :::danger 說好的進度呢? :::