2022q1 Homework1 (lab0)

--- tag: linux2022 --- # 2022q1 Homework1 (lab0) contributed by < `jim12312321` > ### Reviewed by `kevinshieh0225` - 在執行節點逐一刪除時，可以用 linux kernel api 中的 `list_for_each_safe`。 - 有一些我自己沒有注意到的實作細節，比如在移除節點時使用 `list_del_init`，可讓整體程式碼更安全。 - 可以在實作函式的改寫版本加上小標題，較容易追蹤文章結構 - 現在使用 `container_of` 應該不再需要 `cppcheck-suppress nullPointer` 了 - `q_delete_dup` 的規則判定有變，所有在原本佇列中發生重複的節點都要刪除，目前程式碼實作並不符合要求。 - 文件程式碼適時註解。 - 可實作看看迭代版本的 q_sort 並比較結果。 ## 實驗環境 ```shell $ gcc --version gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0 $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 39 bits physical, 48 bits virtual CPU(s): 20 On-line CPU(s) list: 0-19 Thread(s) per core: 1 Core(s) per socket: 14 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family: 6 Model: 154 Model name: 12th Gen Intel(R) Core(TM) i7-12700H Stepping: 3 CPU MHz: 2700.000 CPU max MHz: 6000.0000 CPU min MHz: 400.0000 BogoMIPS: 5376.00 Virtualization: VT-x L1d cache: 336 KiB L1i cache: 224 KiB L2 cache: 8.8 MiB NUMA node0 CPU(s): 0-19 ``` ## [作業要求](https://hackmd.io/@sysprog/linux2022-lab0) 詳細閱讀 [lab0](https://hackmd.io/@sysprog/linux2022-lab0) 說明，依據指示著手修改 queue.[ch] 和連帶的檔案，測試後用 Git 管理各項修改。 * `q_new`: 建立新的「空」佇列 * `q_free`: 釋放佇列所佔用的記憶體 * `q_insert_head`: 在佇列開頭 (head) 插入 (insert) 給定的新節點 (以 LIFO 準則) * `q_insert_tail`: 在佇列尾端 (tail) 插入 (insert) 給定的新節點 (以 FIFO 準則) * `q_remove_head`: 在佇列開頭 (head) 移去 (remove) 給定的節點 * `q_release_element`: 釋放特定節點的記憶體空間 * `q_size`: 計算目前佇列的節點總量 * `q_delete_mid`: 移走佇列中間的節點 * `q_delete_dup`: 在已經排序的狀況，移走佇列中具備重複內容的節點 * `q_swap`: 交換佇列中鄰近的節點 * `q_reverse`: 以反向順序重新排列鏈結串列，該函式不該配置或釋放任何鏈結串列節點，換言之，它只能重排已經存在的節點 * `q_sort`: 以==遞增順序==來排序鏈結串列的節點 ## 針對佇列的操作 ### q_new 建立空佇列，若沒有配置指定大小的記憶體，則回傳 NULL。 > commit 1be837 ```cpp struct list_head *q_new() { element_t *q = malloc(sizeof(element_t)); if (!q) return NULL; q->value = NULL; q->list.next = &q->list; q->list.prev = &q->list; return &q->list; } ``` > 「當每次提交一些東西時，Git 會產生一個快照，將所有檔案放置在一個物件記錄著，可以輸入 git log（歷史紀錄）會看到提交後 40 個字元長 16 進位的 SHA-1（Secure Hash Algorithm 1）雜湊演算碼，這可以做為提交後的識別碼，使用命令配合 commit 識別碼，一般只需要 6~8 個數字即可辨識。」 -- 摘自 [git](https://powerkaifu.github.io/2019/10/14/lesson-git/) 後來發現在 list.h 中有可以直接初始話佇列的函式（INIT_LIST_HEAD），因此更新。 > commit 870612 ```cpp struct list_head *q_new() { element_t *q = malloc(sizeof(element_t)); if (q == NULL) { return NULL; } INIT_LIST_HEAD(&q->list); return &q->list; } ``` ### q_free 釋放佇列所佔用的記憶體 > 由於嘗試許久仍寫不出來，因此以下內容有去參考kevinshieh0225、laneser中q_free的寫法 > > commit c26a4e ```cpp void q_free(struct list_head *l) { if (!l) return; struct list_head *node = l->next; while (node != l) { // cppcheck-suppress nullPointer element_t *e = container_of(node, element_t, list); node = node->next; q_release_element(e); } // cppcheck-suppress nullPointer element_t *head = container_of(l, element_t, list); free(head); } ``` ### q_insert_head 給定要插入的新字串，將新字串加入新節點並中在佇列**開頭**插入給定的新節點。若新節點為 NULL 或沒有配置記憶體空間，則回傳 false 。若順利加入佇列**開頭**則回傳 true 。 > commit 870612 ```cpp bool q_insert_head(struct list_head *head, char *s) { element_t *q = malloc(sizeof(element_t)); if (！q) { return false; } q->value = malloc(sizeof(s)); memset(q->value, '\0', strlen(q->value)); strncpy(q->value, s, strlen(s)); list_add(&q->list, head); return true; } ``` ::: spoiler 踩坑紀錄 - 複製字串 (strnpy) 時用 strlen(s) 當作第三個參數（ERROR: Corruption detected in block with address 0x555c8c87df10 when attempting to free it） - 用設立一個變數計算字串字數取代 strlen ::: > commit c26a4e ```cpp bool q_insert_head(struct list_head *head, char *s) { element_t *q = malloc(sizeof(element_t)); if (q == NULL || head == NULL) { return false; } int charsize = 0; while (*(s + charsize)) charsize += 1; q->value = malloc(charsize + 1); if (q->value == NULL) { free(q); return false; } strncpy(q->value, s, charsize); q->value[charsize] = '\0'; list_add(&q->list, head); return true; } ``` 加入檢查 head 是否為 NULL 的判斷式讓程式更加 robust > commit 58c193 ```cpp bool q_insert_head(struct list_head *head, char *s) { if (!head) return false; element_t *q = malloc(sizeof(element_t)); if (!q) return false; int charsize = 0; while (*(s + charsize)) charsize += 1; q->value = malloc(charsize + 1); if (q->value == NULL) { free(q); return false; } strncpy(q->value, s, charsize); q->value[charsize] = '\0'; list_add(&q->list, head); return true; } ``` ### q_insert_tail 給定要插入的新字串，將新字串加入新節點並中在佇列**結尾**插入給定的新節點。若新節點為 NULL 或沒有配置記憶體空間，則回傳 false 。若順利加入佇列**結尾**則回傳 true 。 > commit 870612 ```cpp bool q_insert_tail(struct list_head *head, char *s) { element_t *q = malloc(sizeof(element_t)); if (！q) { return false; } q->value = malloc(sizeof(s)); memset(q->value, '\0', strlen(q->value)); strncpy(q->value, s, strlen(s)); list_add_tail(&q->list, head); return true; } ``` 跟 q_insert_head 有一樣問題故修正。 > commit c26a4e ```cpp bool q_insert_tail(struct list_head *head, char *s) { element_t *q = malloc(sizeof(element_t)); if (q == NULL || head == NULL) { return false; } int charsize = 0; while (*(s + charsize)) charsize += 1; q->value = malloc(charsize + 1); if (q->value == NULL) { free(q); return false; } strncpy(q->value, s, charsize); q->value[charsize] = '\0'; list_add_tail(&q->list, head); return true; } ``` 加入檢查 head 是否為 NULL 的判斷式讓程式更加 robust > commit 58c193 ```cpp bool q_insert_tail(struct list_head *head, char *s) { if (!head) return false; element_t *q = malloc(sizeof(element_t)); if (!q) return false; int charsize = 0; while (*(s + charsize)) charsize += 1; q->value = malloc(charsize + 1); if (q->value == NULL) { free(q); return false; } strncpy(q->value, s, charsize); q->value[charsize] = '\0'; list_add_tail(&q->list, head); return true; } ``` ### q_remove_head 將佇列**開頭**的節點移除。 - 回傳 NULL （移除失敗） - 佇列為空或 NULL - 回傳被移除的節點（移除成功） - 還需將被移除的節點中儲存的值複製到 sp 中 :::spoiler 踩坑紀錄 - 沒有搞清楚 sp 是什麼，胡亂 malloc ，結果不意外的跳出錯誤訊息（ERROR: Failed to store removed value） - 查看 qtest.c 後發現 remove (在 qtest.c 中傳入 q_remove_head 的變數) 並且他已經配置好記憶體跟初始化內容，因此不需要多此一舉，去掉 malloc 後錯誤訊息即消失。 ::: > commit c26a4e ``` cpp element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (head == NULL || list_empty(head) != 0) { return NULL; } if (sp == NULL) { return NULL; } // cppcheck-suppress nullPointer element_t *cur_e = list_entry(head->next, element_t, list); list_del_init(head->next); strncpy(sp, cur_e->value, bufsize - 1); return cur_e; } ``` 將 sp 根據 bufsize 的大小規定，在尾端加上 '\0' > commit fd5f90 ``` cpp element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; if (!sp) return NULL; // cppcheck-suppress nullPointer element_t *cur_e = list_entry(head->next, element_t, list); list_del_init(head->next); strncpy(sp, cur_e->value, bufsize - 1); sp[bufsize - 1] = '\0'; return cur_e; } ``` ### q_remove_tail 將佇列**結尾**的節點移除。 - 回傳 NULL （移除失敗） - 佇列為空或 NULL - 回傳被移除的節點（移除成功） - 還需將被移除的節點中儲存的值複製到 sp 中 > commit c26a4e ``` cpp element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) { if (head == NULL || list_empty(head) != 0) { return NULL; } if (sp == NULL) { return NULL; } // cppcheck-suppress nullPointer element_t *cur_e = list_entry(head->prev, element_t, list); list_del_init(head->prev); strncpy(sp, cur_e->value, bufsize - 1); return cur_e; } ``` 將 sp 根據 bufsize 的大小規定，在尾端加上 '\0' > commit fd5f90 ``` cpp element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; if (!sp) return NULL; // cppcheck-suppress nullPointer element_t *cur_e = list_entry(head->prev, element_t, list); list_del_init(head->prev); strncpy(sp, cur_e->value, bufsize - 1); sp[bufsize - 1] = '\0'; return cur_e; } ``` ### q_size 回傳佇列長度,若佇列是為 NULL 或空佇列則回傳 0。 > 以下程式碼源自老師的 [lab0 作業要求](https://hackmd.io/@sysprog/linux2022-lab0) > commit 392750 ```cpp int q_size(struct list_head *head) { if (!head) return 0; int len = 0; struct list_head *li; list_for_each (li, head) len++; return len; } ``` ### q_delete_mid 移除佇列中間的節點並將其佔用記憶體釋放。 - 回傳 true (移除成功) - 移除並釋放中間的節點 - 回傳 false (移除失敗) - 佇列為 NULL 或空佇列 > commit 16e285 ```cpp bool q_delete_mid(struct list_head *head) { if (head == NULL || list_empty(head) != 0) { return false; } if (list_is_singular(head)) { head = head->next; // cppcheck-suppress nullPointer element_t *e = container_of(head, element_t, list); list_del(head); q_release_element(e); return true; } int del_n_index = 0; struct list_head *node = head->next; /* even nodes */ if (q_size(node) % 2 == 0) { del_n_index = q_size(node) / 2; } else { /* odd nodes */ del_n_index = (q_size(node) + 1) / 2 - 1; } while (del_n_index > 0) { node = node->next; del_n_index--; } // cppcheck-suppress nullPointer element_t *e = container_of(node, element_t, list); list_del(node); q_release_element(e); return true; } ``` 利用[你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list)中提到的 fast/slow 指標找中間節點的方法重新改寫。 > commit 430645 ```cpp= bool q_delete_mid(struct list_head *head) { if (!head || list_empty(head)) return false; if (list_is_singular(head)) { head = head->next; // cppcheck-suppress nullPointer element_t *e = container_of(head, element_t, list); list_del(head); q_release_element(e); return true; } struct list_head *fast = head, *slow = head; while (true) { if (fast->next == head || fast->next->next == head) break; fast = fast->next->next; slow = slow->next; } slow = slow->next; // cppcheck-suppress nullPointer element_t *e = container_of(node, element_t, list); list_del(node); element_t *e = container_of(slow, element_t, list); list_del(slow); q_release_element(e); return true; } ``` ### q_delete_dup 將佇列中存有重複的節點移除並釋放其記憶體空間。 > commit ec6546 ```cpp void q_delete_dup(struct list_head *head) { if (!head || list_empty(head)) return false; struct list_head *node, *del; element_t *del_node; node = head->next; while (node != head) { char *cstr, *nstr; // cppcheck-suppress nullPointer cstr = list_entry(node, element_t, list)->value; if (node->next == head) { break; } // cppcheck-suppress nullPointer nstr = list_entry(node->next, element_t, list)->value; if (strcmp(cstr, nstr) == 0) { del = node; node = node->next; list_del_init(del); // cppcheck-suppress nullPointer del_node = list_entry(del, element_t, list); q_release_element(del_node); } else { node = node->next; } } return true; } ``` ### q_swap 將佇列中兩兩相鄰的節點互換。 > commit 0508d3 ```cpp void q_swap(struct list_head *head) { struct list_head *odd, *even; odd = head->next; even = odd->next; while (true) { // swap odd and even odd->next = even->next; odd->next->prev = odd; even->prev = odd->prev; even->prev->next = even; odd->prev = even; even->next = odd; // end odd = odd->next; even = odd->next; if (odd == head || even == head) { break; } } } ``` ### q_reverse 反向順序重新排列鏈結串列，並且不能釋放或配置新的記憶體空間。 > commit 3cd169 ```cpp void q_reverse(struct list_head *head) { if (head == NULL || list_empty(head) != 0) { return; } struct list_head *node, *temp; for (node = head->next; node != head; node = node->prev) { temp = node->next; node->next = node->prev; node->prev = temp; } temp = head->next; head->next = head->prev; head->prev = temp; } ``` ### q_sort 利用 merge sort q_delete_dup將佇列依照佇列節點中的字串遞增排序。 ==以下程式優化太糟，無法通過效能測試，已更新較有效率的版本== > commit 94ee22 ```cpp struct list_head *merge(struct list_head *left, struct list_head *right) { struct list_head *head, *cur; // cppcheck-suppress nullPointer char *lstr = list_entry(left, element_t, list)->value; // cppcheck-suppress nullPointer char *rstr = list_entry(right, element_t, list)->value; int cmp = strcmp(lstr, rstr); if (cmp > 0) { // left > right head = right; right = right->next; right->prev = head->prev; right->prev->next = right; head->next = head; head->prev = head; } else { // left <= right head = left; left = left->next; left->prev = head->prev; left->prev->next = left; head->next = head; head->prev = head; } cur = head; if (head == left) { left = NULL; } else if (head == right) { right = NULL; } while (true) { if (left == NULL) { cur->next = right; head->prev = right->prev; right->prev->next = head; right->prev = cur; break; } else if (right == NULL) { cur->next = left; head->prev = left->prev; left->prev->next = head; left->prev = cur; break; } // cppcheck-suppress nullPointer lstr = list_entry(left, element_t, list)->value; // cppcheck-suppress nullPointer rstr = list_entry(right, element_t, list)->value; cmp = strcmp(lstr, rstr); if (cmp > 0) { // left > right cur->next = right; if (q_size(right) == 0) { right->prev = cur; right = NULL; } else { right = right->next; right->prev = right->prev->prev; right->prev->next = right; cur->next->prev = cur; } cur = cur->next; cur->next = head; head->prev = cur; } else { // left <= right cur->next = left; if (q_size(left) == 0) { left->prev = cur; left = NULL; } else { left = left->next; left->prev = left->prev->prev; left->prev->next = left; cur->next->prev = cur; } cur = cur->next; cur->next = head; head->prev = cur; } } return head; } struct list_head *mergesort(struct list_head *head) { if (head->next == head) { return head; } int mid_index = q_size(head) / 2; struct list_head *mid, *temp = head; while (mid_index > 0) { temp = temp->next; mid_index--; } mid = temp->next; temp->next = head; mid->prev = head->prev; head->prev->next = mid; head->prev = temp; struct list_head *left = mergesort(head), *right = mergesort(mid); return merge(left, right); } void q_sort(struct list_head *head) { if (head == NULL || list_empty(head) != 0 || list_is_singular(head) != 0) { return; } struct list_head *node = head->next; node->prev = head->prev; node->prev->next = node; head->prev = NULL; head->next = NULL; node = mergesort(node); head->next = node; head->prev = node->prev; node->prev->next = head; node->prev = head; } ``` 看完 [你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list)後，利用裡面老師示範的技巧改寫 mergesort :::spoiler 解惑紀錄在理解程式碼並實作的過程中，我對於 mergeTwoLists 中的這段程式碼不了解，經過查閱一些資料後，總算理解這樣做的用意，因而紀錄。（<s>以下內容為我個人理解，若有誤，歡迎題點。</s>） > 不用在開發紀錄中說客套話。 > :notes: jserv ```cpp *ptr = (struct list_head *) ((uintptr_t) left | (uintptr_t) right); ``` 一開始我對這段程式碼存有幾點疑惑 - uintptr_t 是什麼？ - 為什麼這樣可以取代 if-else ？ - 為什麼這樣可以增進效能？ 1. uintptr_t 是什麼？根據 C99 中的規範，提供如 intptr_t,uintptr_t 讓任何可指向 void 的指標可以與之相互轉換且內容不變，因此便可以對指標所指向之地址進行操作 (e.g. void * 無法進行如++或--等數值操作) 。 > TODO: 翻閱 C 語言規格書 > :notes: jserv 參考資料： <s>[intptr_t、uintptr_t数据类型的解析](https://blog.csdn.net/cs_zhanyb/article/details/16973379)</s> > 避免看這種二手資訊，無助於你理解，而且屆時語言標準更動時，你無從掌握 > :notes: jserv [C99 standard §7.18.1.4 (p.257)](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf) 2. 為什麼這樣可以取代 if-else ？因為 NULL 經過 uintptr_t 的轉換後就會變成 0 ，因此就可以透過 or 的操作，直接得到不是 NULL 的另一個指標。 3. 為什麼這樣可以增進效能？跟處理器面對分支 (branch) 時的處理方式有關，若預測錯分支就會造成效能損耗。因此能減少分支的使用就**有可能**增進效能 (如果每次分支處理都猜對，那有沒有用 if-else 是沒有差別的) > 以講義提供的範例程式碼來說，`if` 和 `else` 成立的機會約略是各 50%，於是分支預測器就很難總是猜對，你可以翻閱 CS:APP 第 3 章，得知如何分析產生的組合語言輸出，屆時你就會發現，最初的程式碼存在比較和分支跳躍等指令，反過來說，調整後的程式碼只要單純的 bitwise OR 操作，從而省下 CPU 週期。 > :notes: jserv 參考資料： <s> [代码里写很多if会影响效率吗?](https://www.zhihu.com/question/51107242) [分支預測器 -wiki](https://zh.wikipedia.org/zh-tw/%E5%88%86%E6%94%AF%E9%A0%90%E6%B8%AC%E5%99%A8) </s> > 去翻閱計算機組織/結構的教科書，裡頭有 branch predictor 的說明，你需要「完整」的認識，而非從隻字片語中找線索，後者的難度比認真讀教科書還高，畢竟你變成「偵探」了 > :notes: jserv ::: > commit b7014e > commit 39f6ca ```cpp struct list_head *mergeTwoLists(struct list_head *left, struct list_head *right) { struct list_head *head = NULL, **ptr = &head, **node; for (node = NULL; left && right; *node = (*node)->next) { // cppcheck-suppress nullPointer node = (strcmp(list_entry(left, element_t, list)->value, // cppcheck-suppress nullPointer list_entry(right, element_t, list)->value) < 0) ? &left : &right; *ptr = *node; ptr = &(*ptr)->next; } *ptr = (struct list_head *) ((uintptr_t) left | (uintptr_t) right); return head; } struct list_head *mergesort(struct list_head *head) { if (!head->next) return head; struct list_head *fast = head, *slow = head, *mid; while (true) { if (!fast->next || !fast->next->next) break; fast = fast->next->next; slow = slow->next; } mid = slow->next; slow->next = NULL; struct list_head *left = mergesort(head), *right = mergesort(mid); return mergeTwoLists(left, right); } void q_sort(struct list_head *head) { if (!head || list_empty(head) || list_is_singular(head)) return; struct list_head *node = head->next, *temp; head->prev->next = NULL; head->next = NULL; node = mergesort(node); temp = head; temp->next = node; while (temp->next) { temp->next->prev = temp; temp = temp->next; } temp->next = head; head->prev = temp; } ``` ## 參考資訊 - 共筆格式參考：[共筆示範](https://hackmd.io/@cccccs100203/linux2020-lab0) - 同學作業參考： [laneser](https://hackmd.io/@laneser/linux2022_lab0), [kevinshieh0225](https://hackmd.io/@Masamaloka/2022q1-hw1) - 踩坑/解惑參考： - [lab0-c](https://hackmd.io/@sysprog/linux2022-lab0) - [你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list) - [sizeof 在語言層面的陷阱](http://blog.linux.org.tw/~jserv/archives/001876.html?fbclid=IwAR1leirIhqWK5Qm9VpCif08I3v6w0fZi5Y0W-EiSxjqFnVDWDd-p8F0dq_s) - [C99 standard p.257](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)