# 2023q1 Homework1 (lab0) contributed by < `hankTaro` > ###### tags: `Linux 核心設計` ## 開發環境 ```bash $ gcc --version gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0 $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i5-1035G1 CPU @ 1.00GHz CPU family: 6 Model: 126 Thread(s) per core: 2 Core(s) per socket: 4 Socket(s): 1 Stepping: 5 CPU max MHz: 3600.0000 CPU min MHz: 400.0000 BogoMIPS: 2380.80 ``` ## 開發過程 ### `struct element_t` 在開始前,先確認其節點的架構,以便進行撰寫 ```c typedef struct { char *value; struct list_head list; } element_t; ``` ### `q_new` ```c /* Create an empty queue */ struct list_head *q_new() { struct list_head *new = malloc(sizeof(struct list_head)); if (!new) return NULL; INIT_LIST_HEAD(new); return new; } ``` ### `q_free` 由於 `list_head` 嵌入於 element_t 的結構中,使其可以成為 doubly-linked list,要釋放的是結構整體,而非 `list_head` 本身,所以使用 `q_release_element()` 而非 `free()` 由 [malloc for struct and pointer in C](https://stackoverflow.com/questions/14768230/malloc-for-struct-and-pointer-in-c) 所言,由於宣告 `element_t` 時,只有架好 1 的部分,其中 x 指標所指的空間(2)需要另外分配 ```graphviz digraph graphname { rankdir="LR" node [shape=record] subgraph cluster_0 { label="1" peripheries=0 stack[label="<f0>x|<f1>n"] } subgraph cluster_1 { label="2" peripheries=0 stack2[label="<f0>*x"] } ptr [label=y ,shape=plaintext] ptr->stack:f0 stack:f0->stack2:f0 } // 1 2 // +-----+ +------+ // y------>| x------>| *x | // | n | +------+ // +-----+ ``` :::warning 使用 Graphviz 重新製圖。 :notes: jserv ::: :::info 已更改 ::: 同理,`free(list_head *ptr)` 只會將其所指的 list 節點釋放(也就是上圖中,1的部分),並未釋放內部指針指向的空間(也就是上圖中,2的部分) 最後釋放 `list_head` 結構的 l ```c /* Free all storage used by queue */ void q_free(struct list_head *l) { if (list_empty(l)) return; element_t *temp, *it; list_for_each_entry_safe(it,temp,l,list) { q_release_element(it); } free(l); } ``` ### `q_insert_head` / `q_insert_tail` 起初使用 `new->value = s;` 並未使用 `strup(s)` 複製字串並進行空間的分配,由於 element_t 中的 value 是指標,需要為其分配記憶體位置 所以使用 [strdup()](https://man7.org/linux/man-pages/man3/strdup.3.html) 來進行動態記憶體配置 > The strdup() function returns a pointer to a new string which is a duplicate of the string s. > Memory for the new string is obtained with malloc(), and can be freed with free(). ~~根據 [Dangling Pointer](https://hackmd.io/@sysprog/c-std-security#II%E8%A1%8D%E7%94%9F%E7%9A%84%E8%B3%87%E5%AE%89%E8%AD%B0%E9%A1%8C-Integer-Overflow3) 概念,在 `free()` 後將 new 指向 NULL~~ function 結束後會自動釋放清除 ```c /* Insert an element at head of queue */ bool q_insert_head(struct list_head *head, char *s) { if (!head) return false; element_t *new = malloc(sizeof(element_t)); if (!new) return false; new->value = strdup(s); if (!new->value) { //if allocate failed free(new); return false; } list_add(&new->list, head); return true; } /* Insert an element at tail of queue */ bool q_insert_tail(struct list_head *head, char *s) { if (!head) return false; element_t *new = malloc(sizeof(element_t)); if (!new) return false; new->value = strdup(s); if (!new->value) { //if allocate failed free(new); return false; } list_add_tail(&new->list, head); return true; } ``` :::warning 思考如何縮減重複的程式碼,從而達到更精簡的目標。 :notes: jserv ::: ### `q_remove_head` / `q_remove_tail` 依照 `queue.h` 中的 `q_remove_head` 說明 > NOTE: "remove" is different from "delete" > If sp is non-NULL and an element is removed, copy the removed string to `*sp` (up to a maximum of bufsize-1 characters, plus a null terminator.) > Return: the pointer to element, %NULL if queue is NULL or empty. ```c /* Remove an element from head of queue */ element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *remove_loc = list_first_entry(head, element_t, list); list_del(&remove_loc->list); if(sp) { strncpy(sp, remove_loc->value, bufsize - 1); sp[bufsize - 1] = '\0'; } return remove_loc; } /* Remove an element from tail of queue */ element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *remove_loc = list_last_entry(head, element_t, list); list_del(&remove_loc->list); if(sp) { strncpy(sp, remove_loc->value, bufsize - 1); sp[bufsize - 1] = '\0'; } return remove_loc; } ``` ### `q_size` ```c /* Return number of elements in queue */ int q_size(struct list_head *head) { if(!head || list_empty(head)) return 0; int count = 0; struct list_head *it; list_for_each(it, head) count++; return count; } ``` ### `q_delete_mid` 有想到兩種做法 1. 利用雙向鏈結串列的功能,一個指標正向尋訪,一個逆向尋訪,當正向指標或是正向指標的下一個位置等於逆向指標,重疊位置便是 mid 3. 利用 Floyd 演算法,fast 指標的尋訪速度是 slow 的兩倍,當 fast 或 fast->next 指標指向 head 時,slow 的位置便是 mid 本次使用第一種方法,想法是多加利用雙向串列的優勢,減少 $(n+\dfrac{n}{2})-n=\dfrac{n}{2}$ 次尋訪時間 ```c /* Delete the middle node in queue */ bool q_delete_mid(struct list_head *head) { if (!head || list_empty(head)) return false; struct list_head *r = head->next; struct list_head *l = head->prev; while(r != l && r->next != l){ r = r->next; l = l->prev; } list_del(r); element_t *del = list_entry(r, element_t, list); q_release_element(del); return true; } ``` ### `q_delete_dup` `*start` 紀錄其起始點,用 `*it` 進行尋訪,並同時確認`it->value` 與 `it->value`內容是否相同,直到抵達兩者內容不重複的 node ,開始判斷 `it` 與 `*start` 中是否有 node 存在,若存在就代表起始點的值存在重複,則使用 `list_cut_position(&tmp, start, end->prev)` 將其部分移出至 `del` ,最後統一清理掉,並將 `safe->list.prev` 設為新起始點 (it 所指位置有可能在`list_cut_position()` 中被 remove,故不使用其作為新起始點) ```c /* Delete all nodes that have duplicate string */ bool q_delete_dup(struct list_head *head) { if (!head || list_empty(head)) return false; struct list_head *start = head; element_t *it, *safe; LIST_HEAD(del); list_for_each_entry_safe (it, safe, head, list) { if (&safe->list != head && strcmp(safe->value, it->value) == 0) continue; /* Detect duplicated elements */ if (it->list.prev != start) { LIST_HEAD(tmp); list_cut_position(&tmp, start, &it->list); list_splice(&tmp, &del); } start = safe->list.prev; } /* free del */ list_for_each_entry_safe (it, safe, &del, list) q_release_element(it); return true; } ``` ### `q_swap` ```c /* Swap every two adjacent nodes */ void q_swap(struct list_head *head) { if (!head) return; struct list_head *node; for (node = head->next; node != head && node->next != head; node = node->next) { struct list_head *next = node->next; list_del_init(node); list_add(node, next); } } ``` ### `q_reverse` 想法是將 node 中的 `next` 指向 `prev` 並且 `prev` 指向 `next`,並使用迭代對每個節點做轉換,此行為不難做,所以重點在於如何尋訪 ```c /* Reverse elements in queue */ void q_reverse(struct list_head *head) { if (!head || list_empty(head)) return; struct list_head *cur = head, *prev = head->prev, *next = NULL; while (next != head) { next = cur->next; cur->next = prev; cur->prev = next; prev = cur; cur = next; } } ``` ### `q_reverseK` `*start` 紀錄其起始點,每尋訪 K 個 node 便將起始點到當前的點用 `list_cut_position()` 分割到串列外,做 `reverse()` 隨後再用 `list_splice_init()` 將其,同時更新 `start` 位置,做為下次的分割起點以及合併點 ```c /* Reverse the nodes of the list k at a time */ void q_reverseK(struct list_head *head, int k) { if (!head || list_empty(head)) return; LIST_HEAD(make_rev); int count = 0; struct list_head *it, *safe, *start = head; list_for_each_safe (it, safe, head){ count++; if (count == k) { count = 0; list_cut_position(&make_rev, start, it); q_reverse(&make_rev); list_splice_init(&make_rev, start); start = safe->prev; } } } ``` :::warning 撰寫更精簡的程式碼。 :notes: jserv ::: ### `q_sort` 參考[你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list#%E6%A1%88%E4%BE%8B%E6%8E%A2%E8%A8%8E-LeetCode-21-Merge-Two-Sorted-Lists)以及 [seasonwang0905](https://hackmd.io/6dj5N-DGRWKY0gX2NcNSZQ?view#q_sort)的 Merge sort 寫法,先寫出合併的函式 ```c struct list_head *merge_two_lists(struct list_head *L1, struct list_head *L2) { struct list_head *head = NULL, **ptr = &head, **node; for (node = NULL; L1 && L2; *node = (*node)->next) { element_t *E1 = list_entry(L1, element_t, list); element_t *E2 = list_entry(L2, element_t, list); node = strcmp(E1->value, E2->value) <= 0 ? &L1: &L2; *ptr = *node; ptr = &(*ptr)->next; } *ptr = (struct list_head *)((u_int64_t) L1 | (u_int64_t) L2); return head; } ``` ~~接下來寫出遞迴的分割函式,其中的`struct list_head *L1 = mergesort(head);` 宣告十分重要,不能不宣告寫出 `return merge_two_list(mergesort(head), mergesort(fast));` ,因為 `mergesort(head)` 的型態是 `list_head`,會需要用另一個 `list_head` 去指向他,否則進入 `merge_two_list()` 後,`mergesort(head)` 被作為 `head` 無法被 `container_of()` ,其中的值也就無法被取出(此階段會使整體 element 減少1)~~ 此處不可使用 `list_empty(head)` 因為末端不會指向 `head` 只會指向 `NULL` ```c struct list_head *mergesort(struct list_head *head) { if (!head || !head->next) return head; struct list_head *fast = head, *slow = head; while (fast && fast->next) { fast = fast->next->next; slow = slow->next; } fast = slow; slow->prev->next = NULL; struct list_head *L1 = mergesort(head); struct list_head *L2 = mergesort(fast); return merge_two_list(L1, L2); } ``` 此函式的想法是,先選取雙向串列中其中一向,將其處理成單向鏈結串列進行排序,再依據其順序依序走訪,將另一項的串列重新定向 ```c /* Sort elements of queue in ascending order */ void q_sort(struct list_head *head) { if (!head || list_empty(head)) return; head->prev->next = NULL; head->next = mergesort(head->next); struct list_head *current = head, *next = head->next; while (next) { next->prev = current; current = next; next = next->next; } current->next = head; head->prev = current; } ``` >疑問: >要怎麼知道傳入的head 是否有被嵌入於 sturct 中,像是q_sort中 `head->next = mergesort(head->next);`傳入的值是確定嵌入於 `element_t` 中的 >因為如果 head 有被嵌入於 sturct 中,那就會需要`head = mergesort(head);` 去求出正確的排序,否則 head 中的值不會被排序到 >如果是 head 有被嵌入於 sturct 中,感覺用`head = mergesort(head);`也不是不行,直接重新找出個 head 出來(已排序),反正出來一定是個排序好的單向串列,再去把prev整理一下就好了 > >有被嵌入於 sturct 中的 list_head 可以當作head嗎?(個人猜測可以 但這邊的head應該沒有) >有規定 sturct 的 head 一定要屬於 被嵌入於 sturct 中嗎?(個人猜測沒規定 因為這邊兩者都有使用) :::warning 改進你的漢語表達,什麼是「包著」?請善用 GDB 追蹤。 :notes: jserv ::: ### `q_descend` 其說明 >Remove every node which has a node with a strictly greater value anywhere to the right side of it. 換個說法就是從 tail 開始逆向尋訪,並先將 tail 設為 max,任何值比其小的節點,都必須被移除,直到遇見比其大的節點,將 max 更新為當前節點,直到尋訪到 head,每當有一個點成為 `max` 就代表此點最終會存在於串列中,所以 `count++` ```c /* Remove every node which has a node with a strictly greater value anywhere to * the right side of it */ int q_descend(struct list_head *head) { if (!head || list_empty(head)) return 0; element_t *it, *safe; char *max = NULL; int count = 0; for (it = list_entry(head->prev, element_t, list), safe = list_entry(head->prev->prev, element_t, list); &it->list != head; it = safe, safe = list_entry(safe->list.prev, element_t, list)) { if (!max || strcmp(it->value, max) > 0) { count++; max = it->value; } else { list_del(&it->list); q_release_element(it); } } return count; } ``` ### `q_merge` ```graphviz digraph G { { node[shape=none,label="step = 1"]; i1 node[shape=none,label="step = 2"]; i2 node[shape=none,label="step = 3"]; i4 node[shape=none,label="step = 4"]; i8 node[shape=none,label="...."]; i9 node[shape=none,label="step = n"]; i10 } interval1[label="<f0>L0|<f1>L1|<f2>L2|<f3>L3|<f4>L4|<f5>L5|<f6>L6|<f7>L7", shape=record, fixedsize=false,width=6] interval2[label="<f0>L01|<f1>|<f2>L2|<f3>L3|<f4>L4|<f5>L5|<f6>L6|<f7>L7", shape=record, fixedsize=false,width=6] interval4[label="<f0>L01|<f1>L2|<f2>L3|<f3>L4|<f4>L5|<f5>L6|<f6>L7|<f7>", shape=record, fixedsize=false,width=6] interval8[label="<f0>L01|<f1>L23|<f2>|<f3>L4|<f4>L5|<f5>L6|<f6>L7|<f7>", shape=record, fixedsize=false,width=6] interval9[style=invis] interval10[label="<f0>result|<f1>|<f2>|<f3>|<f4>|<f5>|<f6>|<f7>", shape=record, fixedsize=false,width=6] i1->i2[style=invis] i2->i4[style=invis] i4->i8[style=invis] i8->i9[style=invis] i9->i10[style=invis] // interval1:f0 -> interval2:f0 // interval1:f1 -> interval2:f0 // interval1:f2 -> interval2:f2 // interval1:f3 -> interval2:f2 // interval1:f4 -> interval2:f4 // interval1:f5 -> interval2:f4 // interval1:f6 -> interval2:f6 // interval1:f7 -> interval2:f6 interval1:f0 -> interval2:f0 interval1:f1 -> interval2:f0 interval1:f0 -> interval2:f0[style=invis] interval1:f7 -> interval2:f7[style=invis] // interval2:f0 -> interval4:f0 // interval2:f2 -> interval4:f0 // interval2:f4 -> interval4:f4 // interval2:f6 -> interval4:f4 interval2:f1 -> interval4:f7[label=list_move_tail] interval2:f0 -> interval4:f0[style=invis] interval2:f7 -> interval4:f7[style=invis] interval4:f1 -> interval8:f1 interval4:f2 -> interval8:f1 interval4:f7 -> interval8:f7[style=invis] interval8:f7 -> interval9:f7[style=invis] interval9:f7 -> interval10:f7[style=invis] } ``` 先將串列中的空子串列移除,必免 `while (first->q && second->q)` 遇上空串列而忽略另一個非空串列 e.g. `lists = [[1,4,5],[1,3,4],[],[2,6]]` , 當 `first==[], second==[2,6]`, [2,6] 會被忽略 ```c= list_for_each_entry_safe (it, safe, head, chain) { if (!it->q) list_del_init(&it->chain); } ``` 由於有 $n$ 個串列需要合併,會需要合併 $n/2$ 次,當 $n$ 為奇數,會需要合併 $(n/2)+1$ 次,所以設 `count = (n+1)/2` 其中 `merge_two_list` 的功能與 `q_sort` 中的 `merge_two_lists` 相似,不過這裡的需要在函式內將雙向串列的結構整理好,所以使用 `list_move_tail` 來移動最小值至目標地點 ```c= int count = (q_size(head) + 1) / 2; for (int i = 0; i < count; i++) { queue_contex_t *first = list_first_entry(head, queue_contex_t, chain); queue_contex_t *second = list_entry(first->chain.next, queue_contex_t, chain); while (first->q && second->q) { merge_two_list(first->q, second->q); second->q = NULL; list_move_tail(&second->chain, head); first = list_entry(first->chain.next, queue_contex_t, chain); second = list_entry(first->chain.next, queue_contex_t, chain); } } ``` 由於最後排列好的串列會位在 `head` 指向的第一個 entry ,所以最後 return 其 size 即可 ```c return q_size(list_first_entry(head, queue_contex_t, chain)->q); ```