2023q1 Homework1 (lab0)

# 2023q1 Homework1 (lab0) contributed by < `Damien-Chen` > :::danger 不要打錯自己的 GitHub 帳號名稱。 :notes: jserv ::: ## 開發環境 Windows 11-WSL2-Ubuntu-20.04.5-LTS ```shell $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 48 bits physical, 48 bits virtual CPU(s): 6 On-line CPU(s) list: 0-5 Thread(s) per core: 2 Core(s) per socket: 3 Socket(s): 1 Vendor ID: AuthenticAMD CPU family: 25 Model: 68 Model name: AMD Ryzen 5 6600HS Creator Edition Stepping: 1 CPU MHz: 3293.725 BogoMIPS: 6587.45 Virtualization: AMD-V Hypervisor vendor: Microsoft Virtualization type: full L1d cache: 96 KiB L1i cache: 96 KiB L2 cache: 1.5 MiB L3 cache: 16 MiB ``` ## queue.c 實作 :::danger 注意書寫規範：中英文間用一個空白字元區隔。 :notes: jserv ::: ### 基礎資料結構在實作前，先對 `list.h` 和 `queue.h` 中定義的資料結構有個基礎認識，搭配 [你所不知道的 C 語言: linked list 和非連續記憶體篇](https://hackmd.io/@sysprog/c-linked-list)，對之後的理論與實作能夠更好想像和理解。 list.h 中定義的 linked list 元素如下 ```c struct list_head { struct list_head *prev; struct list_head *next; }; ``` 從 prev 和 next 兩指標可以發現這是個 circular doubly-linked list。 :::warning 注意連字號的位置。 :notes: jserv ::: 在 queue.h 中將 `list_head` 結構體嵌入每個鏈結串列單元中，如下 ```c typedef struct { char *value; struct list_head list; } element_t; ``` 因此每個鏈結串列單元包含一個字元 value，指向前一個元素的指標prev，指向下一個元素的指標 next。 ### q_new() 該函式要求建立一個空的佇列，所以配置一個 list_head，且該元素的 prev 和 next 皆指向自己。這裡直接使用內建函式，使程式碼更簡潔。 INIT_LIST_HEAD() : 將 prev 和 next 指向自己。 ```c struct list_head *q_new() { struct list_head *q = malloc(sizeof(struct list_head)); if(!q){ return NULL; } INIT_LIST_HEAD(q); return q; } ``` ### q_free() 走訪所有節點並依序刪除，使用的內建函式如下: list_for_each_entry_safe() : 取得每個節點的開頭位置 ```c void q_free(struct list_head *l) { // need to check whether list_head is NULL // otherwise segmentation fault will happen if(!l){ return; } element_t *curr, *temp; list_for_each_entry_safe(curr, temp, l, list){ q_release_element(curr); } free(l); } ``` ### q_insert_head() 於佇列的開頭加入一個節點，因此配置記憶體須為 element_t 結構，且該結構還包含需要插入的字串，所以配置一個字串 new_s，須注意後面的+1為 '\0' 的空間。 ```c bool q_insert_head(struct list_head *head, char *s) { if (!head) { return false; } element_t *new = malloc(sizeof(element_t)); if (!new) { return false; } char *new_s = malloc((strlen(s) + 1) * sizeof(char)); if (!new_s) { free(new); return false; } memcpy(new_s, s, strlen(s) + 1); new->value = new_s; list_add(&new->list, head); return true; } ``` ### q_insert_tail() 和 insert_head相同，只要更改為 list_add_tail即可。 ```c bool q_insert_tail(struct list_head *head, char *s) { if (!head) { return false; } element_t *new = malloc(sizeof(element_t)); if (!new) { return false; } char *new_s = malloc((strlen(s) + 1) * sizeof(char)); if (!new_s) { free(new); return false; } memcpy(new_s, s, strlen(s) + 1); new->value = new_s; list_add_tail(&new->list, head); return true; } ``` ### q_remove_head() Remove只需將節點的指標unlink，不須釋放記憶體。根據queue.h中函式的說明撰寫即可，須注意當bufsize = 0時，bufsize - 1 會出現錯誤，因此加入bufsize != 0 當作判斷條件。 list_first_entry() : 取得linked list 開頭的entry list_del() : 將 node 從其所屬的 linked list 結構中移走，且不釋放記憶體。 ```c element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) { return NULL; } element_t *ele = list_first_entry(head, element_t, list); if (sp && bufsize != 0) { strncpy(sp, ele->value, bufsize - 1); sp[bufsize - 1] = '\0'; } list_del(&ele->list); return ele; } ``` ### q_remove_tail() 和remove_head相同，只需更改巨集函式即可 ```c element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) { return NULL; } element_t *ele = list_last_entry(head, element_t, list); if (sp && bufsize != 0) { strncpy(sp, ele->value, bufsize - 1); sp[bufsize - 1] = '\0'; } list_del(&ele->list); return ele; } ``` ### q_size() 使用list_for_each()走訪每個節點即可。 ```c int q_size(struct list_head *head) { if (!head || list_empty(head)) { return 0; } int len = 0; struct list_head *curr; list_for_each (curr, head) { len++; } return len; } ``` ### q_delete_mid() 要找出中間節點，可使用快慢指標方式，快的每次往前走兩步，慢的往前走一步，當快指標到達linked list結尾時(也就是head->prev)就停下，此時慢指標的位置好就是中間節點的位置，將其移除並釋放記憶體。 :::danger 注意書寫規範：中英文間用一個空白字元區隔。 :notes: jserv ::: ```c bool q_delete_mid(struct list_head *head) { if (!head || list_empty(head)) { return false; } struct list_head *fast = head; struct list_head *slow = head; do { fast = fast->next->next; slow = slow->next; } while (fast->prev != head && fast != head); list_del(slow); q_release_element(list_entry(slow, element_t, list)); return true; } ``` ### q_delete_dup() ```c bool q_delete_dup(struct list_head *head) { if (!head) { return false; } struct list_head *curr = head->next; while (curr != head) { if (curr->next != head) { struct list_head *safe = curr->next; element_t *c = container_of(curr, element_t, list); element_t *n = container_of(curr->next, element_t, list); if (!strcmp(c->value, n->value)) { do { struct list_head *next = n->list.next; list_del(&n->list); q_release_element(n); if (next == head) { break; } n = container_of(next, element_t, list); } while (!strcmp(c->value, n->value)); safe = curr->next; list_del(&c->list); q_release_element(c); } curr = safe; } } return true; } ``` ### q_swap() 走訪每個節點，每次走訪將該節點移動至下一個節點後面。一開始我是使用for迴圈的形式(如下程式碼中的註解部分)，後來觀察<Lambert>的程式後發現可以直接使用list_for_each()使程式更簡潔。 ```c void q_swap(struct list_head *head) { if (!head || list_empty(head)) { return; } struct list_head *curr; list_for_each (curr, head) { if (curr->next == head) { break; } list_move(curr, curr->next); } /* The following is original form */ // for (struct list_head *curr = head->next; // curr->next != head && curr != head; curr = curr->next) { // list_move(curr, curr->next); // } } ``` ### q_reverse() 走訪每個節點，將其刪除後，再加到開頭。因為要在迴圈過程中更改節點，需使用list_for_each_safe()。 ```c void q_reverse(struct list_head *head) { if (!head || list_empty(head)) { return; } struct list_head *curr, *n; list_for_each_safe (curr, n, head) { list_del(curr); list_add(curr, head); } } ``` ### q_reverseK() 最直覺的想法就是，使用一個counter搭配上述的reverse()。每走訪一個節點，counter加1，當counter == 1時，將前面到目前的節點傳遞給reverse()，反轉後在插入原本位置。 ```c void q_reverseK(struct list_head *head, int k) { if (!head || list_empty(head) || k <= 0) { return; } struct list_head *curr, *safe, start, *pivot; INIT_LIST_HEAD(&start); pivot = head; int cnt = 0; list_for_each_safe(curr, safe, head){ cnt++; if(cnt == k){ list_cut_position(&start, pivot, curr); q_reverse(&start); list_splice_init(&start, pivot); pivot = safe->prev; cnt = 0; } } } ``` 當cnt == k 時，呼叫list_cut_position()，該函式將從pivot到curr中的所有節點全部移到start，再將start反轉後，插入pivot的起始位置。更新pivot位置為下一次的切點。用於當作暫時的head。 ### q_sort() 這裡我參考<paul90317>的方法實現merge sort。 ```c static inline void l_merge(struct list_head *head, struct list_head *left, struct list_head *right) { while (!list_empty(left) && !list_empty(right)) { if (strcmp(list_entry(left->next, element_t, list)->value, list_entry(right->next, element_t, list)->value) < 0) { list_move_tail(left->next, head); } else { list_move_tail(right->next, head); } } if (!list_empty(left)) { list_splice_tail_init(left, head); } else { list_splice_tail_init(right, head); } } ``` ```c void q_sort(struct list_head *head) { if (!head || q_size(head) < 2) return; struct list_head *fast = head, *slow = head; do { fast = fast->next->next; slow = slow->next; } while (fast != head && fast->next != head); LIST_HEAD(left); LIST_HEAD(right); list_splice_tail_init(head, &right); list_cut_position(&left, &right, slow); q_sort(&left); q_sort(&right); l_merge(head, &left, &right); } ``` ### q_descend() 該函式判斷如果該節點的右邊所有節點，只要有一個值比自己大就刪除，因此可以從右往左找最大值，每次只需跟最大值比較即可。 ```c int q_descend(struct list_head *head) { if (!head || list_empty(head)) { return 0; } element_t *entry = list_entry(head->prev, element_t, list), *safe = list_entry(entry->list.prev, element_t, list); char *best = entry->value; while (&entry->list != head) { if (strcmp(entry->value, best) < 0) { list_del(&entry->list); q_release_element(entry); } else { best = entry->value; } entry = safe; safe = list_entry(entry->list.prev, element_t, list); } return q_size(head); } ``` ### q_merge() 把所有佇列合併為一個，再呼叫 sort 排序 ```c int q_merge(struct list_head *head) { if (!head || list_empty(head)) return 0; int n = q_size(head); while (n > 1) { struct list_head *fast = head->next, *slow = head->next; for (int i = 0; i < n / 2; ++i) { LIST_HEAD(temp); l_merge(&temp, container_of(fast, queue_contex_t, chain)->q, container_of(fast->next, queue_contex_t, chain)->q); list_splice_tail(&temp, container_of(slow, queue_contex_t, chain)->q); fast = fast->next->next; slow = slow->next; } if (n & 1) list_splice_tail_init(container_of(fast, queue_contex_t, chain)->q, container_of(slow, queue_contex_t, chain)->q); n = (n + 1) / 2; } return q_size(container_of(head->next, queue_contex_t, chain)->q); } ``` ## 分數目前分數為 95/100，卡在時間複雜度無法通過。 :::danger 注意書寫規範：中英文間用一個空白字元區隔。 :notes: jserv ::: ## 研讀 linux 核心原始程式碼 list_sort.c linux 使用的 merge sort 和一般常見的不同，從註解中可以看到其設計目標: ``` This mergesort is as eager as possible while always performing at least * 2:1 balanced merges. Given two pending sublists of size 2^k, they are * merged to a size-2^(k+1) list as soon as we have 2^k following elements. * * Thus, it will avoid cache thrashing as long as 3*2^k elements can * fit into the cache. Not quite as good as a fully-eager bottom-up * mergesort, but it does use 0.2*n fewer comparisons, so is faster in * the common case that everything fits into L1. ``` 該方法會維持兩個要 merge 的 list 比例至少為 2 : 1，只要這 2 : 1 的 list 可以被放到 cache，就能有效避免 cache thrashing。可以得知這個演算法的設計是較為 cache friendly 的，且以硬體為出發點的觀點設計。課堂學到或是刷題完全不曾提過此設計觀念。了解到 linux 的 merge sort 設計目標後，接著欣賞其程式碼，