2025q1 Homework1 (lab0)

# 2025q1 Homework1 (lab0) contributed by < `HenryChaing` > {%hackmd NrmQUGbRQWemgwPfhzXj6g %} ## 開發環境 ```shell $ gcc --version gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz CPU family: 6 Model: 94 Thread(s) per core: 2 Core(s) per socket: 4 Socket(s): 1 Stepping: 3 CPU max MHz: 4000.0000 CPU min MHz: 800.0000 BogoMIPS: 6799.81 ``` ## 開發指定的佇列操作 ### `q_new` 初始版本： > [commit cee17f9](https://github.com/HenryChaing/lab0-c/commit/cee17f9281c1a5407ef7ae3df1b46e8f6bf72ecf) :::warning 「為了快速實作」這句話沒辦法傳遞任何有用的資訊，不如不說。工程人員說話要精準有效。 > 了解，謝謝老師，下次會減少冗言贅字，盡量有效表達。 ::: <s>為了快速實作，</s> 我在不考慮配置記憶體失敗的情況下做了基礎的實作。也就是新增一個 `list_head` 作為 `head` ，並將其 `prev` 及 `next` 皆指向 `head` ，最後回傳 `head` ，作為空的鏈結串列。 ```c list_head *q_new() { list_head *head = (list_head *) malloc(sizeof(list_head)); head->next = head; head->prev = head; return head; } ``` 改進： > [commit 51b75a5](https://github.com/HenryChaing/lab0-c/commit/51b75a58aec05020ef879dbc721a097b9ae45bf0) 為了通過 `make test` 自動程式檢測，實作 `queue.h` 中對於此方法的所有需求，包含若配置記憶體失敗則回傳 NULL 的這項預防程式執行時錯誤的機制。 ```diff list_head *q_new() { list_head *head = (list_head *) malloc(sizeof(list_head)); + if(!head){ + return NULL; + } head->next = head; head->prev = head; return head; } ``` > 可以嘗試以 linux 風格 [list.h](https://github.com/torvalds/linux/blob/master/include/linux/list.h) 提供的 API 改寫 [name=HotMercury] > 可改為 list_empty（head）判斷 > [name=HenryChaing] ### `q_free` 初始版本： > commit (`git commit --amend` 前) 原先只有釋放之前配置的 `list_head` 空間，如下所示: ```c void q_free(list_head *l) { free(l); } ``` 改進： > [commit cee17f9 (line 25)](https://github.com/HenryChaing/lab0-c/commit/cee17f9281c1a5407ef7ae3df1b46e8f6bf72ecf) `q_free` 的過程中會依序走訪各個 `list_head` ，並透過`containerof` 巨集來找出該節點的起始位址，也就是由 `list_head` 來找出 `element_t` 這個實際的節點。解除 `element_t` 以及其資料 `value` 所配置的空間。最後不忘再`free(head)`。 ```diff void q_free(list_head *l) { + list_head *pt = l->next; + while (pt != l) { + element_t *node = container_of(pt, element_t, list); + pt = pt->next; + free(node->value); + free(node); + } free(l); } ``` > [commit 57f5eb9](https://github.com/HenryChaing/lab0-c/commit/57f5eb9f391a1422360c8c5df74d173344314e19) 若引數 `l` 為 NULL，則直接`return`。 ```diff void q_free(list_head *l) { + if(!l){ + return; + } list_head *pt = l->next; while (pt != l) { element_t *node = container_of(pt, element_t, list); pt = pt->next; free(node->value); free(node); } free(l); } ``` ### `q_insert_head` 初始版本： > [commit cee17f9 (line 38)](https://github.com/HenryChaing/lab0-c/commit/cee17f9281c1a5407ef7ae3df1b46e8f6bf72ecf) `line 40-41`會先配置 element_t 所需的記憶體空間；`line 44-48`則會用深層複製的方式將字串作複製；`line 50-54`會將鏈結重新配置讓新的節點安插到 head 的下一個位置。 ```c=38 bool q_insert_head(struct list_head *head, char *s) { element_t *new_node = (element_t *) malloc(sizeof(element_t)); char *str_copy = (char *) malloc(strlen(s)+1); int i; for (i = 0; i < strlen(s); i++) { *(str_copy + i) = *(s + i); } *(str_copy + i) = '\0'; new_node->value = str_copy; new_node->list.prev = head; new_node->list.next = head->next; head->next->prev = &new_node->list; head->next = &new_node->list; return true; } ``` > 第 42 行是空的，行數標號錯了 [name=HotMercury] 改進： > [commit e6ceef1](https://github.com/HenryChaing/lab0-c/commit/e6ceef17a2eafc823b99b329b3a0ee297e72223d) `line 51-66`新增了配置記憶體失敗以及 `head == NULL` 的處理機制；`line 72`是解決上次字串結尾沒有`\0`結尾的錯誤。 ```diff bool q_insert_head(struct list_head *head, char *s) { + if (!head) { + return false; + } element_t *new_node = (element_t *) malloc(sizeof(element_t)); + if (!new_node) { + return false; + } char *str_copy = (char *) malloc(strlen(s)+1); + if (!str_copy) { + free(new_node); + return false; + } int i; for (i = 0; i < strlen(s); i++) { *(str_copy + i) = *(s + i); } + *(str_copy + i) = '\0'; new_node->value = str_copy; new_node->list.prev = head; new_node->list.next = head->next; head->next->prev = &new_node->list; head->next = &new_node->list; return true; } ``` > [commit 7889767](https://github.com/HenryChaing/lab0-c/commit/78897679c43f1cba901f9190129cdfa4f2eeff8b) 為了讓 `insert_head` 達到 $O(1)$ ，首先將重複呼叫的 `strlen(s)` 設為變數`s_length`。 <s>避免嚴重的執行負重</s> > 應該是可以減少 strlen 呼叫次數，跟 $O(1)$ 沒有太大的關西 [name=HotMercury] :::warning 改進你的漢語表達。 > 若有表達不清的地方，會再向chatGPT尋求協助。 ::: ```diff +int s_length = strlen(s); -char *str_copy = (char *) malloc(strlen(s)+1); +char *str_copy = (char *) malloc(s_length+1); -for (i = 0; i < strlen(s); i++) { +for (i = 0; i < s_length; i++) { ``` ### `q_insert_tail` :::danger 避免非必要的項目縮排 (即 `* ` 和 `- `)，以清晰、明確，且流暢的漢語書寫。 ::: 三次 commit 功能皆與 q_insert_head() 的紀錄相同。 > [commit cee17f9](https://github.com/HenryChaing/lab0-c/commit/cee17f9281c1a5407ef7ae3df1b46e8f6bf72ecf) > [commit a7c6fa0](https://github.com/HenryChaing/lab0-c/commit/a7c6fa05afa17b9510bd91e1299c480c5e163859) > [commit 7889767](https://github.com/HenryChaing/lab0-c/commit/78897679c43f1cba901f9190129cdfa4f2eeff8b) ### `q_remove_head` 初始版本： > [commit cee17f9 (line 72)](https://github.com/HenryChaing/lab0-c/commit/cee17f9281c1a5407ef7ae3df1b46e8f6bf72ecf) 因為這裡要回傳的是 `element_t` 的結構變數，因此在只有給定其成員變數 `struct list_head` 的情況下，我們必須要反推得到 `element_t` 之記憶體起始位址。這裡會用到巨集 ==container_of== ，也就是在給定 1.成員變數位址 2.結構名稱 3.成員名稱之情況下，就可以反推得到結構變數的記憶體位址，也就可以對這個結構變數進行相關操作。初始版本僅利用 `line 75-76` 移除節點，最後回傳 `container_of` 得到的節點。 ```c=72 element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { element_t *return_element = container_of(head->next, element_t, list); head->next->next->prev = head; head->next = head->next->next; return return_element; } ``` 改進： > [commit 91da239](https://github.com/HenryChaing/lab0-c/commit/91da2398c2b785610bf961a098582af774555869) 除了增加判別佇列為空或不存在或回傳字串指標指向 NULL 等問題外，我們新增了對字串進行深層複製的功能，並且也會考慮複製後的字串其大小不會超過 buffer size 。 ```diff element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { + if (q_size(head) == 0 || !sp) { + return NULL; + } element_t *return_element = container_of(head->next, element_t, list); head->next->next->prev = head; head->next = head->next->next; + size_t i; + for (i = 0; i < strlen(return_element->value) && i < bufsize - 1; i++) { + *(sp + i) = *(return_element->value + i); + } + *(sp + i) = '\0'; return return_element; } ``` > [commit 0807cec](https://github.com/HenryChaing/lab0-c/commit/0807cec474e7029888046d968bc6f998515df436) 為了讓 `remove_head` 達到 $O(1)$ ，首先將重複呼叫的 `strlen(s)` 設為變數`s_length`。並且為了減少指標取值的動作，額外設定指標變數 `value` 。 ```diff + char *value = return_element->value; + int s_length = strlen(value); - for (i = 0; i < strlen(return_element->value) && i < bufsize - 1; i++) { - *(sp + i) = *(return_element->value + i); + for (i = 0; i < s_length && i < bufsize - 1; i++) { + *(sp + i) = *(value + i); ``` ### `q_remove_tail` * 三次 commit 功能皆與 q_remove_head() 的紀錄相同。 > [commit cee17f9](https://github.com/HenryChaing/lab0-c/commit/cee17f9281c1a5407ef7ae3df1b46e8f6bf72ecf) > [commit a579f01](https://github.com/HenryChaing/lab0-c/commit/a579f01334e54c054699e4be2c30676b24199bd3) > [commit 0807cec](https://github.com/HenryChaing/lab0-c/commit/0807cec474e7029888046d968bc6f998515df436) ### `q_size` 初始版本： > [commit cee17f9 (line 90)](https://github.com/HenryChaing/lab0-c/commit/cee17f9281c1a5407ef7ae3df1b46e8f6bf72ecf) `line 94-97` 會讓 `pt` 指標依序拜訪各個節點，每拜訪一個節點就會增加計數，直到 `pt` 走到佇列尾端。 ```c=90 int q_size(struct list_head *head) { int count = 0; list_head *pt = head; while (pt->next != head) { count++; pt = pt->next; } return count; } ``` 改進： > [commit d09074d](https://github.com/HenryChaing/lab0-c/commit/d09074d0d4ddcd809b1879859a462430c4dacdee) 實踐題目的要求，在佇列不存在或為空佇列時，回傳大小為零。 ```diff int q_size(struct list_head *head) { + if (!head || head->next == head) { + return 0; + } int count = 0; list_head *pt = head; while (pt->next != head) { count++; pt = pt->next; } return count; } ``` ### `q_delete_mid` 初始版本： > [commit cee17f9 (line 105)](https://github.com/HenryChaing/lab0-c/commit/cee17f9281c1a5407ef7ae3df1b46e8f6bf72ecf) `line 107-113` 會先算出該佇列中間節點的索引，並將 `pt` 指標移至節點位址，其中佇列中間節點索引計算方式是採用 `floor ( length / 2 )` 。 `line 122-123` 為將節點從佇列中移除， `line 124-125` 則是刪除節點，刪除結構變數以及其指到的字串所配置到的空間。 ```c=105 bool q_delete_mid(struct list_head *head) { int length = q_size(head); int mid_index = length / 2; list_head *pt = head; for (size_t i = 0; i < mid_index; i++) { pt = pt->next; } if (length == 0) { return false; } else if (length == 1) { pt = pt->next; } element_t *delete_node = container_of(pt, element_t, list); pt->prev->next = pt->next; pt->next->prev = pt->prev; free(delete_node->value); free(delete_node); // https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/ return true; } ``` > 可以做一個分析比較與使用快慢指標找中間點的效率 [commit](https://github.com/HotMercury/lab0-c/commit/656d055d2861784410b66f5fdcf0eb37b61a1f52) 之後會補上來 [name=HotMercury] 改進： > [commit f443815 ](https://github.com/HenryChaing/lab0-c/commit/f4438154b88bc205a5b03c450d69cd694e34a683) 在進行實際測試，也就是執行 `scripts/driver.py` ，執行到 [`trace-02-ops`](https://github.com/HenryChaing/lab0-c/blob/master/traces/trace-02-ops.cmd) 時，發生了移除的節點與預期不同的錯誤問題，在用紙筆進行實際運算後，發現若 `delete_mid` 的中間節點索引採用 `ceil (length/2)`，就會如預期結果執行完畢，因此這項改動是為了達到測試目的所作的更動。 ``` cmd> rt tiger l = [dolphin,bear,gerbil,meerkat,bear,gerbil] cmd> dm l = [dolphin,bear,meerkat,bear,gerbil] cmd> dm l = [dolphin,bear,bear,gerbil] 往後符合預期... ``` ```diff -int mid_index = length / 2; +int mid_index = (length+1) / 2; ``` ### `q_delete_dup` 初始版本： > [commit cee17f9 (line 130)](https://github.com/HenryChaing/lab0-c/commit/cee17f9281c1a5407ef7ae3df1b46e8f6bf72ecf) `line 134-139` 會先配置字串陣列，字串長度限九十九，共佇列長度個字串空間。 `line 141-160` 會在走訪時紀錄節點指到的字串內容，往後若有遇到內容一致的節點，則會刪除該節點。（但最初遇到的因為沒有紀錄所以不會刪除，是疏失） `line 162-165` 釋放字串陣列所指到的字串們的空間，最後釋放字串陣列的空間。 ```c=130 bool q_delete_dup(struct list_head *head) { int str_mem_count = 0; int delete_sign = 0; char **str_mem = malloc(q_size(head) * sizeof(char *)); list_head *pt = head->next; for (size_t i = 0; i < q_size(head); i++) { str_mem[i] = (char *) malloc(100 * sizeof(char)); } while (pt != head) { element_t *node = container_of(pt, element_t, list); for (size_t i = 0; i < str_mem_count; i++) { if (strcmp(str_mem[i], node->value) == 0) { pt->next->prev = pt->prev; pt->prev->next = pt->next; pt = pt->next; delete_sign = 1; } } if (delete_sign == 1) { delete_sign = 0; free(node->value); free(node); } else { strncpy(str_mem[str_mem_count++], node->value, 100); pt = pt->next; } } for (size_t i = 0; i < q_size(head); i++) { free(str_mem[i]); } free(str_mem); // https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/ return true; } ``` 改進： > [commit c1c2822](https://github.com/HenryChaing/lab0-c/commit/c1c2822b613b030a356f4599a3afcfe2555313a7) 因為上個版本會有重複字串中首次被拜訪者不會被刪除的問題，因此改由以下版本實作，這個是在實作 `ascend/descend` 函式後，運用相同原理實作的版本。我們新增了兩個指標，分別是 `last` 及 `front` 。 **外部迴圈**： `last` 每一回合前進一個節點，每一回合會判斷內部迴圈是否有刪除節點，若有刪除則 `last` 所指到的節點也會刪除。 **內部迴圈**： `front` 會逐一拜訪 `last` 之後的節點，若有內容與 `last` 相同的節點，則會刪除 `front` 指到的節點。 ```c bool q_delete_dup(struct list_head *head) { if (!head) { return false; } list_head *last = head->next; list_head *front = last->next; bool delete = false; while (last != head) { element_t *last_node = container_of(last, element_t, list); while (front != head) { element_t *front_node = container_of(front, element_t, list); if (strcmp(last_node->value, front_node->value) == 0) { front->prev->next = front->next; front->next->prev = front->prev; front = front->next; free(front_node->value); free(front_node); delete = true; continue; } front = front->next; } if (delete) { last->prev->next = last->next; last->next->prev = last->prev; last = last->next; front = last->next; free(last_node->value); free(last_node); delete = false; continue; } last = last->next; front = last->next; } return true; } ``` ### `q_swap` 版本： > [commit cee17f9 (line 173)](https://github.com/HenryChaing/lab0-c/commit/cee17f9281c1a5407ef7ae3df1b46e8f6bf72ecf) `swap` 要實作的是相鄰的兩個節點互換，但最複雜的是鏈結的重新配置。 `line 179-180` 其他佇列節點對相鄰兩節點的鏈結重製。 `line 181-182` 相鄰兩節點之間的鏈結重製。 `line 183-184` 相鄰兩節點對其他佇列節點的鏈結重製。上述循環到相鄰兩節點其中一者為 `head` 為止。 ```c=173 void q_swap(struct list_head *head) { list_head *front = head->next->next; list_head *last = head->next; while (front != head && last != head) { (last)->prev->next = front; (front)->next->prev = last; (last)->next = (front)->next; (front)->prev = (last)->prev; (front)->next = last; (last)->prev = front; front = ((front)->next->next->next); last = ((last)->next); } // https://leetcode.com/problems/swap-nodes-in-pairs/ } ``` ### `q_reverse` 初始版本： > [commit cee17f9 (line 198)](https://github.com/HenryChaing/lab0-c/commit/cee17f9281c1a5407ef7ae3df1b46e8f6bf72ecf) 在實作 `reverse` 功能時，我使用了三個指標 `front`, `last`, `temp` ，其中 `front` 及 `last` 每一回合會更改鏈結配置，更改彼此之間的前後關係。而 `temp` 則是因為設計關係需要讓 `front` 移動而存在的節點。 ```c void q_reverse(struct list_head *head) { list_head *front = head->next; list_head *last = head; list_head *temp = NULL; do { last->prev = front; temp = front->next; front->next = last; last = front; front = temp; } while (last != head); } ``` 改動： > [commit d976f67](https://github.com/HenryChaing/lab0-c/commit/d976f67264dd218f3a9d383b827aea6fb8fead63) 在因應老師的要求下，我將函式寫成了更精簡的版本，這次移除了作用較少的 `temp` 指標。這次從每回合更改兩節點之間的前後關係，改成每回合更改單一節點的 `next` 及 `prev` 指標，這樣的寫法兩個指標就可以正常移動，而無須三個指標。 ```c void q_reverse(struct list_head *head) { list_head *last = head; list_head *front = head->next; do { last->next = last->prev; last->prev = front; front = front->next; last = last->prev; } while (head != last); } ``` > [commit 0ca8de6](https://github.com/HenryChaing/lab0-c/commit/0ca8de652da6b2911b179b44c0de9428348a85d6) 新增若佇列長度為零，則不進行反轉。 ```diff + if (q_size(head) == 0) { + return; + } ``` ### `q_reverseK` 版本： > [commit e78796a](https://github.com/HenryChaing/lab0-c/commit/e78796a9dfc8ee98827f9835b27986c84c641290) `reverseK` 就是相鄰的 K 個節點要進行反轉，若不足 K 個則不反轉。 **外部迴圈**：在完成內部迴圈的相鄰節點反轉後，要將這些相鄰節點的頭尾與佇列重新作鏈結。 **內部迴圈**：相鄰節點的反轉，採用的方式與 `q_reverse` 相同。 ```c void q_reverseK(struct list_head *head, int k) { int length = q_size(head); int turn = length / k; list_head *front = head->next->next; list_head *last = head->next; for (size_t i = 0; i < turn; i++) { list_head *ll = last->prev; for (size_t j = 0; j < k; j++) { last->next = last->prev; last->prev = front; front = front->next; last = last->prev; } list_head *ff = last; last = ll->next; front = ff->prev; last->next = ff; ff->prev = last; front->prev = ll; ll->next = front; last = ff; front = ff->next; } // https://leetcode.com/problems/reverse-nodes-in-k-group/ } ``` ### `q_sort` > 參考： [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 版本： > [commit b267f66](https://github.com/HenryChaing/lab0-c/commit/b267f664638cc9d9b0909990c54b1c4080cc47d4) 這個版本的 `q_sort` 是直接引用 Linux kernel 的合併排序方法。有額外實作的部份是 `compare` 副函式，為了達成 `stable` 的排序方法， `nodeA->value` 只有大於 `nodeB->value` 時才會回傳 true 。 ```c static bool compare(struct list_head *a, struct list_head *b) { element_t *nodeA = container_of(a, element_t, list); element_t *nodeB = container_of(b, element_t, list); if (strcmp(nodeA->value, nodeB->value) <= 0) { return false; } return true; } ``` :::danger 你如何確保目前的測試已涵蓋排序演算法的最差狀況？ ::: ### `q_ascend` 初始版本： > [commit c094b1f](https://github.com/HenryChaing/lab0-c/commit/c094b1fbd430fb707d700ae9f697f4398067359a) `ascend` 目的是為了讓佇列的值成升冪排列。因此我採用了類似 `q_delete_dup` 的方式刪除會導致無法升冪排列的節點。 **外部迴圈**：移動 `last` 指標， `last` 所指到的節點會作為判斷之後的資料是否成升冪排列的基礎。 **內部迴圈**：逐一拜訪 `last` 之後的節點，若有節點的資料比 `last` 所指到的節點資料要小，則移除該節點，保持升冪。 ```c int q_ascend(struct list_head *head) { list_head *front = head->next->next; list_head *last = head->next; while (last->next != head) { element_t *last_node = container_of(last, element_t, list); while (front != head) { element_t *front_node = container_of(front, element_t, list); if (strcmp(last_node->value, front_node->value) > 0) { last->prev->next = last->next; last->next->prev = last->prev; break; } front = front->next; } last = last->next; front = last->next; } // https://leetcode.com/problems/remove-nodes-from-linked-list/ return q_size(head); } ``` 改進： > [commit 5a751d1](https://github.com/HenryChaing/lab0-c/commit/5a751d11e7a8740f485305e28ea0953b61b8f223) 在進行實際測試，也就是執行 `scripts/driver.py` ，執行到 [`trace-06-ops`](https://github.com/HenryChaing/lab0-c/blob/master/traces/trace-06-ops.cmd) 時，發現 `ascend` 若未刪除而是移除節點時，會發生最後的 `free` 回報錯誤說有被配置的區塊沒有被回收，而在改成以下刪除節點的方式後，問題也隨之解決。 `last = last->prev` 是因為要配合函式外部迴圈 `last` 的正確移動。 ```diff if (strcmp(last_node->value, front_node->value) > 0) { last->prev->next = last->next; last->next->prev = last->prev; + last = last->prev; + free(last_node->value); + free(last_node); break; } ``` ### `q_descend` :::danger 避免非必要的項目縮排 (即 `* ` 和 `- `)，以清晰、明確，且流暢的漢語書寫。 > 會減少縮排使用，必要時改為數字索引（如：1、2、3） ::: 兩次 commit 功能皆與 q_ascend() 的紀錄相同。 > [commit 182c642](https://github.com/HenryChaing/lab0-c/commit/182c6420de723cb5e41aa6885f6d7b614190ccf5) > [commit 54ca662](https://github.com/HenryChaing/lab0-c/commit/54ca6624ee54319d15950b088d7d819259efb40d) ### `q_merge` #### `q_context_t, q_chain_t` :::danger 程式碼註解總是以美式英語書寫！ > 會再發 commit 。 ::: ```c typedef struct { struct list_head head; //queue's head int size; //queue's size } queue_chain_t; typedef struct { struct list_head *q; //point to Linux kernel queue struct list_head chain; //chain all queue_contex_t int size; // q_size(q) int id; //queue id } queue_contex_t; ``` 在開始實作 `q_merge` 前，這兩個是必須知道的結構變數。首先因為此題佇列是複數的，所以會有 `queue_context_t` 紀錄某一個佇列的內容，並且再由 `queue_chain_t` 將 `queue_context_t` 的內容串接成佇列。 #### `merge_2_queue` > [commit 0857558 ](https://github.com/HenryChaing/lab0-c/commit/08575585724066097e9c6a91a9a3a602c93a8963) > 參考： [Merge-Two-Sorted-Lists](https://hackmd.io/@sysprog/c-linked-list#%E6%A1%88%E4%BE%8B%E6%8E%A2%E8%A8%8E-LeetCode-21-Merge-Two-Sorted-Lists) ```c=497 void merge_2_queue(list_head *a_queue, list_head *b_queue, bool descend) { list_head *a_list_head = a_queue; list_head *b_list_head = b_queue; list_head *a_list = a_queue->next; list_head *b_list = b_queue->next; list_head **ptr = &a_list_head; char remove_value[10]; while (a_list != a_list_head && b_list != b_list_head) { element_t *a_node = container_of(a_list, element_t, list); element_t *b_node = container_of(b_list, element_t, list); printf("(%s,%s)\n", a_node->value, b_node->value); if (!compare(a_list, b_list)) { a_list = a_list->next; list_head *rm_list = &(q_remove_head(a_list->prev->prev, remove_value, 10)->list); rm_list->prev = (*ptr); rm_list->next = (*ptr)->next; (*ptr)->next->prev = rm_list; (*ptr)->next = rm_list; ptr = &(*ptr)->next; } else { b_list = b_list->next; list_head *rm_list = &(q_remove_head(b_list->prev->prev, remove_value, 10)->list); rm_list->prev = (*ptr); rm_list->next = (*ptr)->next; (*ptr)->next->prev = rm_list; (*ptr)->next = rm_list; ptr = &(*ptr)->next; } } while (a_list != a_list_head) { a_list = a_list->next; list_head *rm_list = &(q_remove_head(a_list->prev->prev, remove_value, 10)->list); rm_list->prev = (*ptr); rm_list->next = (*ptr)->next; (*ptr)->next->prev = rm_list; (*ptr)->next = rm_list; ptr = &(*ptr)->next; } while (b_list != b_list_head) { b_list = b_list->next; list_head *rm_list = &(q_remove_head(b_list->prev->prev, remove_value, 10)->list); rm_list->prev = (*ptr); rm_list->next = (*ptr)->next; (*ptr)->next->prev = rm_list; (*ptr)->next = rm_list; ptr = &(*ptr)->next; } } ``` 這個實作參考了課程教材關於合併排序中兩個串列合併的操作，對象從單向鏈結串列轉變成包含 `list_head` 的雙向鏈結串列，我參考了使用指標的指標 indirect pointer 版本的實作方式，以此達到不使用記憶體配置的要求。 `line 501-503` `a_list` `b_list` 會指向尚未完成合併的串列元素，間接指標 `ptr` 則會間接指向第一個串列中的 Head 。 `line 506-554` 比照單向串列的合併方法，差異在於須更改前後節點的鏈結。這裡紀錄一下間接指標的使用心得，指標運算中會使用到的運算子 (operator) ，首先是 `&` (address of) 以及 `*` (value of) ，使用它們會分別得到運算元的地址以及運算元的數值。再來是 `=` (assign) 稱為賦值，會將第一個運算元的數值變更為第二個運算元的數值。有了這些概念會較容易理解指標操作。 #### `q_merge` ```c=459 int q_merge(struct list_head *head, bool descend) { list_head *q = container_of(head->next, queue_contex_t, chain)->q; // first list list_head *chain_context = head->next->next; // second list while (chain_context != head) { list_head *merge_list = container_of(chain_context, queue_contex_t, chain)->q; merge_2_queue(q, merge_list, descend); chain_context = chain_context->next; } if (descend) { q_reverse(q); } return q_size(q); } ``` `q_merge` 會呼叫 `merge_2_list` 副函式，主要功能是將 `q_chain_t` 中的每個佇列進行合併。 ### make test 最終結果：通過 [`trace-01-ops ~ trace-16-perf`](https://github.com/HenryChaing/lab0-c/blob/master/traces) ，未通過 [trace-17-complexity](https://github.com/HenryChaing/lab0-c/blob/master/traces/trace-17-complexity.cmd)，得分： 95/100 [(回歸測試結果)](https://github.com/HenryChaing/lab0-c/actions/runs/8131889164/job/22221703343)。 ## 運用 Valgrind 排除 `qtest` 實作的記憶體錯誤 ### 實作新命令 `hello` 在 `qtest` 當中因為想要實作[作業要求](https://hackmd.io/@sysprog/linux2024-lab0/%2F%40sysprog%2Flinux2024-lab0-f)中提到的 `shuffle` 命令，因此先嘗試著新增命令，我選擇新增的指令是 `hello` ，但遇到了記憶體錯誤的問題，由於不清楚是那一道指令出錯，因此選擇了 Valgrind 作為排解問題的工具。面臨的問題： ``` $ ./qtest cmd> hello Segmentation fault occurred. You dereferenced a NULL or invalid pointerAborted (core dumped) ``` 運用了 Valgrind 後，發現了問題如下： ``` $ valgrind -q --leak-check=full ./qtest cmd> hello ==12089== Invalid read of size 8 ==12089== at 0x10B4C2: do_hello (qtest.c:880) ==12089== by 0x10E019: interpret_cmda (console.c:181) ==12089== by 0x10E5CE: interpret_cmd (console.c:201) ==12089== by 0x10F238: run_console (console.c:691) ==12089== by 0x10D40B: main (qtest.c:1266) ==12089== Address 0x0 is not stack'd, malloc'd or (recently) free'd ==12089== Segmentation fault occurred. You dereferenced a NULL or invalid pointer ``` 問題出在執行 `do_hello` 的第880行時，存取了非法的記憶體位址，於是我檢視了該方法的實作方式： ```c=878 static bool do_hello(int argc, char *argv[]) { q_hello(current->q); q_show(3); return true; } ``` >`do_new` 也沒有參數，不會出事，為什麼要讓 `do_hello` 成為特例？[name=HotMercury] 因為在 `q_test` 的設計當中，`current` 若未經過 `do_new` 方法，則 `current` 就會指向 NULL 位址，因此前述的第880行才會因為 dereferenced 的原因存取錯誤。之後也透過將 `q_hello` 改成沒有參數的方法，結束了這個問題。 ### 運用 Valgrind Massif 觀察 simulation 之記憶體使用情形 > 參考： [alanjian85](https://hackmd.io/@alanjian85/lab0-2023#%E9%80%8F%E9%81%8E-Massif-%E5%88%86%E6%9E%90%E8%A8%98%E6%86%B6%E9%AB%94%E4%BD%BF%E7%94%A8%E9%87%8F) #### 事前準備使用 massif 得到分析數據，會得到行程在記憶體使用狀況的 snapshot 。 ```shell $ valgrind --tool=massif ./qtest -f traces/trace-017-complexity.cmd ``` 使用 massif-visualizer 圖形化數據。 ```shell $ massif-visualizer massif.out.<pid> ``` #### 運行 trace-017-complexity （初始化階段） ![massif_result_2](https://hackmd.io/_uploads/Hk9QeQfa6.png) 可以看到在最初期的階段，也就是前期緩慢的斜坡，這時行程都在運行 `malloc_or_fail` ，例如 `add_cmd` , `add_param` ，初始化我們的命令界面。再來第二階段是峰值的部份，這個階段除了有上個階段一樣的工作，還有就是多了 `_IO_file_doallocate` 在運作，並且這些處理都是源自於 `report(1, "ERROR: Could not open source file '%s'", infile_name);` 這項錯誤。 #### 運行 trace-017-complexity （運作階段） ![massif_result_3](https://hackmd.io/_uploads/BykhrmG6a.png) 在實際指令運行階段，此時佔用堆積最大宗的函式已變為 `test_malloc` ，也就是在運行 `q_insert_head` 時會觸發用來分配記憶體的函式。再來佔據的第二大的是 `doit` 函式(雖然僅有 `1-2%`)，它是在設置 simulation bit 後會觸發的函式，目的是測驗 simulation 的效果。 ## 運用 Address Sanitizer 除錯在實作 `shuffle` 功能的過程當中，每次更新程式碼進行測試時，我都會選擇 `make SANITIZER=1` 來進行編譯，也就是在編譯過程當中會加入 AddressSanitizer 的程式碼，來偵測並紀錄記憶體相關的錯誤。在實作過程中，我偶然發生以下錯誤： ``` cmd> ih e l = [e d c b a] cmd> shuffle ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient cmd> Freeing queue ================================================================= ==20156==ERROR: AddressSanitizer: heap-use-after-free on address 0x6060000000b0 at pc 0x558ad103bcef bp 0x7ffd0d032f70 sp 0x7ffd0d032f60 READ of size 8 at 0x6060000000b0 thread T0 #0 0x558ad103bcee in q_free /home/ubuntu/linux2024/lab0-test-git/queue.c:43 #1 0x558ad10339f2 in q_quit /home/ubuntu/linux2024/lab0-test-git/qtest.c:1115 #2 0x558ad10394cb in do_quit /home/ubuntu/linux2024/lab0-test-git/console.c:246 #3 0x558ad103acf8 in finish_cmd /home/ubuntu/linux2024/lab0-test-git/console.c:635 #4 0x558ad103778a in main /home/ubuntu/linux2024/lab0-test-git/qtest.c:1273 #5 0x7f1351e29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #6 0x7f1351e29e3f in __libc_start_main_impl ../csu/libc-start.c:392 #7 0x558ad1032d24 in _start (/home/ubuntu/linux2024/lab0-test-git/qtest+0xad24) 0x6060000000b0 is located 48 bytes inside of 64-byte region [0x606000000080,0x6060000000c0) freed by thread T0 here: #0 0x7f13522b4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127 #1 0x558ad103b714 in test_free /home/ubuntu/linux2024/lab0-test-git/harness.c:204 #2 0x558ad103bcd1 in q_free /home/ubuntu/linux2024/lab0-test-git/queue.c:45 #3 0x558ad10339f2 in q_quit /home/ubuntu/linux2024/lab0-test-git/qtest.c:1115 #4 0x558ad10394cb in do_quit /home/ubuntu/linux2024/lab0-test-git/console.c:246 #5 0x558ad103acf8 in finish_cmd /home/ubuntu/linux2024/lab0-test-git/console.c:635 #6 0x558ad103778a in main /home/ubuntu/linux2024/lab0-test-git/qtest.c:1273 #7 0x7f1351e29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 ``` 本次的錯誤是我在按下 `ctrl+c` 送出 `SIGINT` 訊號後產生，由於我在 `q_shuffle` 方法實作錯誤，導致 `q_free` 遇到鏈結已被打亂的雙向鏈結串列，間接造成已經被解除配置的空間再次被使用的錯誤。以下是我的分析：在經過 AddressSanitizer 的回報錯誤觀察後，我發現這個空間是我起初在執行 `ih` 命令後，所分配到的空間，因此確定這是經過 `test_malloc` 後新增出來的節點空間，以下是出錯的程式碼： ```c=41 while (pt != l) { element_t *node = container_of(pt, element_t, list); pt = pt->next; free(node->value); free(node); } ``` 出錯是在 `line 43`，並且回報的錯誤是 `heap-use-after-free` ，因此能推論出此時 `pt->next` 指向的節點空間已被回收，因此我思考了可能發生這個狀況的原因。 ```graphviz digraph "Doubly Linked List" { rankdir=LR; node [shape=record]; h [label="{<ref1> | head |<ref2> }"]; a [label="{ <ref1> | <data> a | <ref2> }"] b [label="{ <ref1> | <data> b | <ref2> }"]; c [label="{ <ref1> | <data> c | <ref2> }"]; a:ref2:c -> b:data:n b:ref1:c -> a:data:s h:ref2:c -> a:data:n b:ref2:c -> c:data:n c:ref1:c -> b:data:s a:ref1:s -> h:data:s c:ref2:e -> a:ref2:n } ``` 以上是我思考後的可能原因之一，也就是某個節點的 `next` 指標，往回指向了先前的節點，也就是現在圖中 `c` 的 `next` 指向了 `a` ，因此在我寫的鏈結串列回收當中，回收順序就會變成如下： `a->b->c->a` ，也就會發生上述的 `a` 節點已被回收但是又要被 `pt` 所使用的情形發生。 <s>理論上</s> :::warning 不要濫用「理論上」，你依據什麼「理論」說話呢？工程人員說話要精準明確且著重溝通的效率。 > 了解，謝謝老師半夜批改！！ ::: --- ## 使用 perf 對 `q_merge` 以及 `lib/list_sort` 實行效能分析 > 參考： [運用 Perf 分析程式效能並改善](https://hackmd.io/@sysprog/gnu-linux-dev/https%3A%2F%2Fhackmd.io%2F%40sysprog%2Flinux-perf#%E9%81%8B%E7%94%A8-Perf-%E5%88%86%E6%9E%90%E7%A8%8B%E5%BC%8F%E6%95%88%E8%83%BD%E4%B8%A6%E6%94%B9%E5%96%84) 關於 `q_sort()` 的實作，我使用了第二週測驗提供的快速排序法，而比較對象是 Linux 核心採用的鏈結串列排序 `lib/list_sort.c` 。至於 perf 則是使用 `perf record` 用來採樣硬體事件在程式碼特定段落的發生頻率，接著 `perf stat` 則是印出硬體事件的統計數據。 ### `q_sort` 首先我們來分析程式熱點，第一個對象是快速排序也就是 `q_sort` 。經過 `perf record` 幫我們採樣程式執行時 PC 指令位址，我們得知了最常執行也就是程式熱點為 `list_move_tail` 這個函式，這段程式碼若詳閱第二週測驗一可得知這是 partition 的實作，它會將大於 pivot 的節點移至 `list_greater` 串列，反之則移到 `list_less` 串列。 ```c list_for_each_entry_safe (item, is, head, list) { if (compare(&item->list,&pivot->list)) list_move_tail(&item->list, &list_less); else list_move_tail(&item->list, &list_greater); } ``` 以下是 `perf record` 經過反組譯後得到的程式段落，由於 `list_for_each_entry_safe` 是巨集並且 `list_move_tail` 是 inline 函式，所以它們都屬於 `q_sort` 這個 symbol 底下的組合語言程式碼，這是採計 cycles 作為硬體是分析後的結果。 ![image](https://hackmd.io/_uploads/H17BeIvoyl.png) ``` 0.60 │ bc: mov 0x10(%r12),%rax // entry->member.next 77.69 │ sub $0x8,%rax // container_of() 0.83 │ lea 0x8(%r12),%rbx // entry->member 0.10 │ mov %r12,%rbp // entry value 0.20 │ cmp %r13,%rbx // entry->member comapre to head │ ↓ je 10f // jump to q_sort (recursive) 0.28 │ mov %rax,%r12 // safe pointer update ``` ```c for (entry = list_entry((head)->next, __typeof__(*entry), member), safe = list_entry(entry->member.next, __typeof__(*entry), member); &entry->member != (head); entry = safe, safe = list_entry(safe->member.next, __typeof__(*entry), member)) ``` 這段程式碼是程式熱點，不過編譯器很省用暫存器資源，因此在 `list_for_each_entry_safe` 巨集當中，最常出現的暫存器 `%r12` 同時代表了更新後的 `entry` 以及更新前的 `safe` 變數，所以以這段 `bc` symbol 所指向的指令來說，它就是要找出 `safe->member.next` 的位址並存到 `%rax` 暫存器當中。 ![image](https://hackmd.io/_uploads/ryrok8wsJg.png) 可以發現使用 `perf record` 紀錄事件 `cpu_core/L1-dcache-load-misses/` ，剛才推測的 `container_of()` 為程式熱點是快取失誤造成，這個論點不成立，因為可以發現其他在 cycles 統計時沒有顯著出現的指令成為了快取失誤的熱點，而且大部分皆為 `mov` 這類與存取記憶體相關的指令，倒是開始好奇為什麼 sub 這個單純使用暫存器計算的指令會與 L1-dcache-load-misses 有關。隨後看到 `perf stat` ，這裡想用處理器效能的角度來分析這個程式的執行狀況，這裡使用的是 CPI (clock per insruction) 這個指標，忽略在這次實驗已經確定的時脈週期以及指令個數。可以發現我們只使用了 `cpu_core` 這部份的處理器，執行使用的指令個數約為50億，而時脈則使用了60億個，這裡約莫換算 CPI 為 $\dfrac{5}{6}$ 。接著發現有命令下達不夠準確的問題，當我將指定的硬體事件 `-e` 只設定為 `cycles,instructions` 時， `perf stat` 就會幫我換算出 CPI ，雖然結果相當逼近，但是也隨之發現一個問題是執行效率，平均一個時脈還無法完成一道指令。 ``` Performance counter stats for './qtest -f test_sort.txt': 3,534,832,969 cpu_atom/cycles/ (0.06%) 5,456,970,387 cpu_core/cycles/ (99.94%) 2,554,911,179 cpu_atom/instructions/ # 0.72 insn per cycle (0.06%) 4,652,190,529 cpu_core/instructions/ # 0.85 insn per cycle (99.94%) 1.747922422 seconds time elapsed 1.486525000 seconds user 0.261092000 seconds sys ``` ### `list_sort` 再來看到 `list_sort` ，原本的函式需要提供比較函式，這裡根據 `q_test` 的使用案例提供了簡單的字串比較函式，使用 glibc 提供的 `strcmp` 函式。在使用 `perf record` 後可以發現程式熱點集中於我實作的 `compare` 函式當中，第一個部份是將 `container_of(...)->value` 搬移到暫存器時，第二個部份是跳至 `strcmp` 函式。 ![Screenshot from 2025-03-08 23-01-40](https://hackmd.io/_uploads/ryHzn0Koye.png) 接著來到 `perf stat` 的統計結果，可以發現到 IPC 仍就低於 1 ，因此可能是實驗設計的問題，我決定在這個章節的最後介紹實驗設計。 ``` Performance counter stats for './qtest -f test_sort.txt': 4,855,456,488 cpu_atom/cycles/ (0.95%) 5,547,921,482 cpu_core/cycles/ (99.05%) 5,941,247,579 cpu_atom/instructions/ # 1.22 insn per cycle (0.95%) 4,638,951,343 cpu_core/instructions/ # 0.84 insn per cycle (99.05%) 1.836980469 seconds time elapsed 1.576357000 seconds user 0.259729000 seconds sys ``` ### 實驗設計我使用的是 qtest 讀取命令腳本的功能，在這個命令腳本當中我將一百萬個隨機字串依序加入到鏈結串列當中，最後依據實驗對象呼叫 `q_sort()` 或 `list_sort()` 函式。因此上述兩個實驗在統計時之所以效果不彰，也可能是腳本中的其他命令導致，例如 `q_show` 出現的次數會比排序函式還要多。 ## 改善 `web` 命令 > 參考： [I/O Multiplexing](https://hackmd.io/@sysprog/linux-file-system#IO-Multiplexing) 在介紹 `web` 命令之前，必須先了解 `linenoise` 函式庫，這是一個文字編輯函式庫， `qtest` 執行檔就是透過 `linenoise` 來讀取執行時使用者在鍵盤輸入的內容，在預設的 `linenoise` 當中程式會讀取 `STDIN_FILENO` 這個 file descriptor 當中的字元，但是 `linenoise` 還有針對不同可以讀取的 file descriptor 作多工函式引入的程式片段。而在 `qtest` 當中就實作了 `STDIN_FILENO` 與 Socket file descriptor 的多工處理函式，接著就要介紹到重要的系統呼叫 [**select**](https://man7.org/linux/man-pages/man2/select.2.html) ， select 的特點在於可以等待多個 file descriptor (fd) ，在有 fd 等待的事件發生前會 Block waiting ，因此 select 常被用於實作 I/O Multiplexing 的函式。 #### 原先的 I/O multiplexing 實作 ```c int web_eventmux(char *buf) { fd_set listenset; FD_ZERO(&listenset); FD_SET(STDIN_FILENO, &listenset); int max_fd = STDIN_FILENO; if (server_fd > 0) { FD_SET(server_fd, &listenset); max_fd = max_fd > server_fd ? max_fd : server_fd; } int result = select(max_fd + 1, &listenset, NULL, NULL, NULL); if (result < 0) return -1; if (server_fd > 0 && FD_ISSET(server_fd, &listenset)) { FD_CLR(server_fd, &listenset); struct sockaddr_in clientaddr; socklen_t clientlen = sizeof(clientaddr); int web_connfd = accept(server_fd, (struct sockaddr *) &clientaddr, &clientlen); char *p = web_recv(web_connfd, &clientaddr); char *buffer = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\n"; web_send(web_connfd, buffer); strncpy(buf, p, strlen(p) + 1); free(p); close(web_connfd); return strlen(buf); } FD_CLR(STDIN_FILENO, &listenset); return 0; } ``` 在原先的 I/O Multiplexing 的實作當中，會先使用與 select 函式相關的巨集來制定集合內容，例如在 `web` 功能啟動後， `select` 函式會監聽鍵盤的輸入以及伺服器指定的連接埠，當有其中一者為可讀時， `select` 函式就會回傳正值並到 `FD_ISSET` 巨集判斷可讀的 fd ，接著才會開始真正的使用 `read` 或是 `accept` 系統呼叫進行讀取。但是這個實作會遇到一個狀況， `web_connfd` 是已經建立連線的 socket fd ，這個函式會讀取這個 fd 當中的內容，但是若這個連線當中的客戶端沒有傳送資料，這個行程就會被迫 blocking ，因此我再這個函式作了如下改善。 #### 加入 select 的 I/O multiplexing 實作 > [commit bce61ae ](https://github.com/HenryChaing/lab0-c/commit/bce61aedfb93c864d2543dcf387d3e1357d1e1fe) ```diff int web_connfd =accept(server_fd, (struct sockaddr *) &clientaddr, &clientlen); + fd_set listenset_2; + FD_ZERO(&listenset_2); + FD_SET(STDIN_FILENO, &listenset_2); + FD_SET(web_connfd, &listenset_2); + max_fd = STDIN_FILENO > web_connfd ? STDIN_FILENO : web_connfd; + int result = select(max_fd + 1, &listenset_2, NULL, NULL, NULL); + if (result < 0) + return -1; + if (FD_ISSET(STDIN_FILENO, &listenset_2)) { + FD_CLR(STDIN_FILENO, &listenset_2); + return 0; + } else { + FD_CLR(web_connfd, &listenset_2); + } char *p = web_recv(web_connfd, &clientaddr); ``` 我在建立連線並得到 socket fd 的過程後使用了 `select` 系統呼叫來判別 `STDIN_FILENO` 與 socket 是否為可讀狀態，這樣就可以免除 socket 的讀取會 blocking 整個行程的問題。 ## 提供新的命令 `shuffle` 我們的 `shuffle` 所採取的演算法是 `Fisher–Yates shuffle` ，是一個時間複雜度為 $O(n)$ 的演算法。會執行以下步驟： 1. 選擇 $Node_i$ (其中 $i$ 為佇列長度加一再減去回合數) 2. 選擇 $Node_j$ (其中 $j$ 為 1 到 $i$ 之間任意數) 3. 將 $Node_i$ 與 $Node_j$ 進行調換。 4. 循環上述步驟直到回合數＝佇列長度減一 - [ ] 第一回合（ i = 3+1-1 = 3 , j = 1 ）交換前： ```graphviz digraph graphname{ node[shape=box] I[shape=plaintext,fontcolor=red] J[shape=plaintext,fontcolor=red] { rankdir = LR H[label=head] A[label=A] B[label=B] C[label=C] } { rank="same"; I->C } { rank="same"; J->A } rankdir=LR H->A->B->C } ``` 交換後： ```graphviz digraph graphname{ node[shape=box] I[shape=plaintext,fontcolor=red] J[shape=plaintext,fontcolor=red] { rankdir = LR H[label=head] A[label=A] B[label=B] C[label=C] } { rank="same"; I->C } { rank="same"; J->A } rankdir=LR H->C->B->A } ``` - [ ] 第二回合（ i = 3+1-2 = 2 , j = 1 ）交換前： ```graphviz digraph graphname{ node[shape=box] I[shape=plaintext,fontcolor=red] J[shape=plaintext,fontcolor=red] { rankdir = LR H[label=head] A[label=A] B[label=B] C[label=C] } { rank="same"; I->B } { rank="same"; J->C } rankdir=LR H->C->B->A } ``` 交換後： ```graphviz digraph graphname{ node[shape=box] I[shape=plaintext,fontcolor=red] J[shape=plaintext,fontcolor=red] { rankdir = LR H[label=head] A[label=A] B[label=B] C[label=C] } { rank="same"; I->B } { rank="same"; J->C } rankdir=LR H->B->C->A } ``` ### 實作解說 > [commit 6ffe9f5](https://github.com/HenryChaing/lab0-c/commit/6ffe9f5936607675f0f8c511697ffcbc16e58068) > 更改了 checksums 已成功 commit ```c= bool q_shuffle(struct list_head *head) { list_head *node_i; list_head *node_j; list_head *nodei_prev, *nodej_prev; int size = q_size(head); for (int i = size; i > 1; i--) { int j = rand() % (i) + 1; node_i = head; node_j = head; for (size_t inner_i = 0; inner_i < i; inner_i++) { node_i = node_i->next; } for (size_t inner_i = 0; inner_i < j; inner_i++) { node_j = node_j->next; } int distance = i - j; if (distance == 0) { } else if (distance == 1) { node_j->prev->next = node_i; node_i->next->prev = node_j; node_j->next = node_i->next; node_i->prev = node_j->prev; node_i->next = node_j; node_j->prev = node_i; } else { nodei_prev = node_i->prev; nodej_prev = node_j->prev; node_i->prev->next = node_i->next; node_i->next->prev = node_i->prev; node_j->prev->next = node_j->next; node_j->next->prev = node_j->prev; list_add(node_i, nodej_prev); list_add(node_j, nodei_prev); } } return true; } ``` `line 8` 的外部迴圈，會讓 `i` 迭代從最後一個節點到第二個節點，`line 9` 會讓 `j` 隨機挑選一個節點。 `line 14-16` 的內部迴圈會將 `node_i` 移到指定位址。`line 18-20` 同理會將`node_j` 移到指定位址。 `line 24-42` 將每一回合要交換的節點分成三個案例 : * `distance == 0` : 此案例無須交換節點。 * `distance == 1` : 利用 swap 的方式交換節點。 * `distance > 1` : 先移除兩個節點，在將其安插回佇列。 ### 說明遵守 Uniform distribution 我們利用 Pearson's chi-squared test 能檢驗虛無假說（Null hypothesis）的方式，來驗證我們實作的 `shuffle` ，落到各種結果的機率均一致。 + $H_0$（虛無假說）: shuffle 的結果發生的機率相同，遵守 Uniform distribution + $H_1$（對立假說）: shuffle 的結果發生的機率至少有一個不同 ### Pearson's chi-squared test #### 1. 計算 chi-squared test statistic $X^2$ $$ X^2 = \sum_{i=1}^n {(O_i - E_i)^2 \over E_i} $$ + $X^2$: Pearson's cumulative test statistic + $O_i$: the number of observations of type i + $E_i$: the expected count of type i ``` Expectation: 41666 Observation: {'1234': 41801, '1243': 41457, '1324': 41178, '1342': 41645, '1423': 41927, '1432': 41516, '2134': 41493, '2143': 41615, '2314': 41444, '2341': 41688, '2413': 41825, '2431': 42087, '3124': 41907, '3142': 41440, '3214': 41652, '3241': 41454, '3412': 41855, '3421': 41791, '4123': 41661, '4132': 42227, '4213': 41484, '4231': 41634, '4312': 41350, '4321': 41869} chi square sum: 32.917342677482836 ``` 在測試 shuffle 1000000 次後，二十四種結果各自的頻率如下表： || 觀察到的頻率| 期待的頻率 |${(O_i - E_i)^2 \over E_i}$| | -------- | -------- | -------- |---| | [1, 2, 3, 4] | 41801 | 41666 |0.43740699851| | [1, 2, 4, 3] | 41457 | 41666 |1.04836077377| | [1, 3, 2, 4] | 41178 | 41666 |5.71554744876| | [1, 3, 4, 2] | 41645 | 41666 |0.01058416934| | [1, 4, 2, 3] | 41927 | 41666 |1.63493015888| | [1, 4, 3, 2] | 41516 | 41666 |0.54000864013| | [2, 1, 3, 4] | 41493 | 41666 |0.71830749292| | [2, 1, 4, 3] | 41615 | 41666 |0.0624249988| | [2, 3, 1, 4] | 41444 | 41666 |1.18283492536| | [2, 3, 4, 1] | 41688 | 41666 |0.01161618585| | [2, 4, 1, 3] | 41825 | 41666 |0.60675370805| | [2, 4, 3, 1] | 42087 | 41666 |4.25385206163| | [3, 1, 2, 4] | 41907 | 41666 |1.39396630346| | [3, 1, 4, 2] | 41440 | 41666 |1.2258436135| | [3, 2, 1, 4] | 41652 | 41666 |0.00470407526| | [3, 2, 4, 1] | 41454 | 41666 |1.07867325877| | [3, 4, 1, 2] | 41855 | 41666 |0.85731771708| | [3, 4, 2, 1] | 41791 | 41666 |0.37500600009| | [4, 1, 2, 3] | 41661 | 41666 |0.0006000096| | [4, 1, 3, 2] | 42227 | 41666 |7.5534248548| | [4, 2, 1, 3] | 41484 | 41666 |0.79498871982| | [4, 2, 3, 1] | 41634 | 41666 |0.02457639322| | [4, 3, 1, 2] | 41350 | 41666 |2.39658234532| | [4, 3, 2, 1] | 41869 | 41666 |0.9890318245| |Total|||32.917342677482836| $X^2$ = 32.917342677482836 #### 2. 決定自由度（degrees of freedom）在可能結果共有二十四種的情況下，我們知道二十四種結果總和機率為一，因此這次實驗的自由度為二十三，因為有固定數值 $$ P(x_n) = 1-P(x_1)-...-P(x_{n-1}) $$ #### 3. 選擇顯著水準及 P value 顯著水準 α 我們設定最常用的 0.05 ， P value 我們藉由卡方分佈表查表，由於 ${X^2 = 32.917}$ 且自由度為二十三，因此經由查表後得到 ${0.05 < P\ value< 0.1}$ 。 #### 4. 統計檢定的結果不拒絕虛無假說由於 ${P\ value}$ 沒有低於顯著水準 α ，因此虛無假說 $H_0$ 無法被拒絕，也就是說無法否認 shuffle 的結果遵循 Uniform distribution 。而從最後的圖表來觀察，機率傾向於一致。 ![Shuffle_result](https://hackmd.io/_uploads/SJpLIOy66.png) --- ## 井字遊戲 ### 並行程式設計 > 參考： [並行程式設計 coroutine](https://hackmd.io/@sysprog/concurrent-sched) 我使用 coroutine 的方式實作「電腦 vs 電腦」的對弈模式，其中電腦 A 是使用 negamax 演算法，而電腦 B 是使用 MCTS 演算法。而電腦 A、B 分別對應到任務一及任務二。至於任務之間的切換，是採用 `setjmp` + `longjmp` 的方法。 #### 1. setjmp/longjmp 這兩個函式可以轉換程式執行的順序，其中 `setjmp` 可以利用 `jum_buf` 儲存目前程式的狀態，並且在遇到 `longjmp(jum_buf)` 後跳回 `setjmp` 並恢復程式的儲存狀態，這樣的函式設計可以方便程式執行時在不同任務間轉換。 TTY raw mode #### 2. task_add/task_switch 這是主要切換任務的函式，我們用 `list_head` 構成的循環雙向鏈結串列存放即將執行的任務，也就是存放 `jmp_buf` 。其中 `task_add` 可以將任務加到串列當中， `task_switch` 可以切換到我們紀錄的下一個任務執行。流程設計參照下方程式碼， `schedule` 函式會將兩個任務放到佇列中，而任務執行完的當下會再將這個任務加到佇列當中，若此對局勝負揭曉則不會再將加到佇列當中，佇列為空也就代表並行程式結束執行。 ```c static LIST_HEAD(tasklist); static void (**tasks)(void *); static struct arg *args; static int ntasks; static jmp_buf sched; static struct task *cur_task; static void task_add(struct task *task) { list_add_tail(&task->list, &tasklist); } static void task_switch() { if (!list_empty(&tasklist)) { struct task *t = list_first_entry(&tasklist, struct task, list); list_del(&t->list); cur_task = t; longjmp(t->env, 1); } } void schedule(void) { static int i; i = 0; setjmp(sched); while (ntasks-- > 0) { struct arg arg = args[i]; tasks[i++](&arg); printf("Never reached\n"); } task_switch(); } ``` ### 程式碼解說 > 參考： [sysprog21/concurrent-programs/coro](https://github.com/sysprog21/concurrent-programs/tree/master/coro) > [commit d768307](https://github.com/HenryChaing/lab0-c/commit/d768307a7ceb6100795aa9157e4a6e882716a2e0) #### 任務參數-結構變數 ```c struct task { jmp_buf env; struct list_head list; char task_name[10]; char *table; char turn; }; struct arg { char *task_name; char *table; char turn; }; ``` 首先可以看結構變數 `task` 中，第一個成員變數 `env` 即是用來儲存這次進入任務前的程式執行狀態，而與參考程式略有不同的是我新增了棋盤以及回合符號這兩個結構變數，目的是判斷此次任務是否有結果並且停止 `task_add` 。 #### 任務一、任務二 ```c /*negamax*/ void task0(void *arg) { if (setjmp(task->env) == 0) { task_add(task); longjmp(sched, 1); } while (1) { task = cur_task; if (setjmp(task->env) == 0) { char win = check_win(task->table); if (win == 'D') { draw_board(task->table); printf("It is a draw!\n"); break; } draw_board(task->table); int move = negamax_predict(task->table, task->turn).move; if (move != -1) { task->table[move] = task->turn; record_move(move); } task_add(task); task_switch(); } } printf("%s: complete\n", task->task_name); longjmp(sched, 1); } /*mcts*/ void task1(void *arg) { if (setjmp(task->env) == 0) { task_add(task); longjmp(sched, 1); } while (1) { task = cur_task; if (setjmp(task->env) == 0) { char win = check_win(task->table); if (win == 'D') { draw_board(task->table); printf("It is a draw!\n"); break; } draw_board(task->table); int move = mcts(task->table, task->turn); if (move != -1) { task->table[move] = task->turn; record_move(move); } task_add(task); task_switch(); } } printf("%s: complete\n", task->task_name); longjmp(sched, 1); } ``` 以上是兩個任務函式的部份程式碼，其執行的流程如下： 1. 第一次呼叫這個函式時，是被 `schedule` 函式呼叫，會進到第一個條件式當中，把此任務加到佇列當中，並回到 `schedule` 當中。 2. 第二次到最後一次被呼叫，會進入迴圈當中，除了最後一次皆會 `task_add` 及 `task_switch` ，並完成 AI 下棋步驟。 3. 最後一次被呼叫，會判斷分出勝負並離開迴圈，回到 `task_switch` 當中並執行下個任務。 #### 任務三: 鍵盤事件處理 > [commit 49b8efd](https://github.com/HenryChaing/lab0-c/commit/49b8efd26615090abb4ebaed45aed145d529b6d8) 為了在遊戲中達成 `ctrl+q`, `ctrl+p` 控制程式流程，因此我們新增一個任務用來處理鍵盤事件，但為了不影響終端機畫面，因此我們將處理鍵盤事件的輸入輸出方式改為 `RAW mode` ，好處是無須按下 `ENTER` 鍵就可直接讀取輸入，以及防止輸入資訊預設的輸出到終端機上。 ```c /*** terminal ***/ void disableRawMode() { tcsetattr(STDIN_FILENO, TCSAFLUSH, &orig_termios); } void enableRawMode() { tcgetattr(STDIN_FILENO, &orig_termios); atexit(disableRawMode); struct termios raw = orig_termios; raw.c_iflag &= ~(IXON); raw.c_lflag &= ~(ECHO | ICANON | ISIG); tcsetattr(STDIN_FILENO, TCSAFLUSH, &raw); } /*** input ***/ char editorReadKey() { int nread; char c; while ((nread = read(STDIN_FILENO, &c, 1)) != 1) { if (nread == -1 && errno != EAGAIN) die("read"); } return c; } int process_key() { enableRawMode(); int signal = 0; while (signal == 0) { char c = editorReadKey(); switch (c) { case CTRL_KEY('q'): return 1; case CTRL_KEY('p'): signal = 1; break; } } disableRawMode(); return 0; } ``` :::warning 詳閱 `CONTRIBUTING.md` 以理解專案偏好的程式碼風格，避免 camelCase。 > 收到，會改用 snake_case 命名。 ::: 可以看到 `process key` 函式前後有啟用以及關閉 `RAW mode` 的函式。進到 `RAW mode` 後會先將部份狀態關閉，例如：程式中斷訊號、回傳輸入結果、逐位元讀取功能等等，而開關方式就是位元操作。這裡另外值得一提的是 `ctrl` ，它能將與它一同輸入的英文字母不論大小寫映射到 `0~25` 這個區間內，原理是 `ctrl` 會將前三個位元清除，而在 ASCII 的設計當中，英文字母的最後五個位元剛好會對應 `0~25` 的位元字串。因此在做事件處理時，我們只須觀察輸入位元組的後面五個位元，即可知道使用者輸入的資料是否為 `ctrl+alphabet`，因為鍵盤無法輸入 `0~25` 等不可顯示的字元。 #### 顯示目前時間 > [commit c6d0450](https://github.com/HenryChaing/lab0-c/commit/c6d0450ee06d31c0ed1f63a106c9872804d8c307) 依照作業要求，我們要在程式執行過程中，不斷更新目前時間。而應用的就是跳脫序列（Escape Sequence），它可以改變終端機的文字格式，其中跳脫序列的開頭都以跳脫字元作為開頭 `\x1b` ，後續接著的字元是要執行的內容。 ```c void editorDrawRows(struct abuf *ab) { time_t now = time(NULL); struct tm *currtime; currtime = localtime(&now); char r_status[80]; int r_len = snprintf(r_status, sizeof(r_status), "[ %2d:%2d:%2d ]", currtime->tm_hour, currtime->tm_min, currtime->tm_sec); abAppend(ab, "\r", 2); abAppend(ab, r_status, r_len); abAppend(ab, "\x1b[K", 3); } void editorRefreshScreen() { struct abuf ab = ABUF_INIT; abAppend(&ab, "\x1b[?25l", 6); // abAppend(&ab, "\x1b[H", 3); editorDrawRows(&ab); // abAppend(&ab, "\x1b[H", 3); abAppend(&ab, "\x1b[?25h", 6); if (write(STDOUT_FILENO, ab.b, ab.len) != ab.len) return; abFree(&ab); } ``` 接著會介紹上述程式碼會用到的跳脫序列，例如參考中有但被註解掉的 `\x1b[H` ，它是要移動游標到指定位置，預設為 `\x1b[1;1H` 也就是終端機第一列第一行的位置。至於 `\x1b[K` 則是清除游標以後此行的內容。而我實作的方式，除了針對前述三個任務進行的排程，我額外在第三個任務中加入執行緒，這個執行緒會每秒定期執行 `editorRefreshScreen` 函式，並且把當下時間輸出在游標的位置，在離開任務時也會將 `\r\n` 輸出在終端機以免影響板面。 :::warning 本程式不該使用執行緒，使用 (preemptive) coroutine 開發。 > 收到，有研究過範例程式 [peempt_sched](https://github.com/sysprog21/concurrent-programs/tree/master/preempt_sched) ::: #### 將各局過程顯示於終端機 ```c void print_all_moves() { for (int i = 0; i < record_arr_len; i++) { printf("Battle %d, Moves: ", i + 1); for (int j = 1; j <= move_record_arr[i][0]; j++) { printf("%c%d", 'A' + GET_COL(move_record_arr[i][j]), 1 + GET_ROW(move_record_arr[i][j])); if (j < move_record_arr[i][0]) { printf(" -> "); } } printf("\n"); } } ``` 我新增了一個二維陣列紀錄每次對局的過程，特別的是我用每列的第一個陣列元素紀錄此局所走的步數，確認這一列實際存放的元素個數。 :::warning TODO: 學習 [minicoro](https://github.com/edubart/minicoro) 的手法，以行內組合語言取代 setjmp/longjmp，降低任務切換的成本。 ::: ### 結果呈現 {%youtube PO-f_JB_rEM %} ### 使用定點數取代 Monte Carlo 中的浮點數運算 > [commit 9fb5342](https://github.com/HenryChaing/lab0-c/commit/9fb5342abac669bfe6a68a43f3ee97bf9eb2dda8) #### struct node ```diff struct node { int move; char player; - double score; + __uint32_t score; }; ``` #### uct_score ```diff -static inline double uct_score(int n_total, int n_visits, double score) +static inline double uct_score(int n_total, int n_visits, __uint32_t score) { if (n_visits == 0) return DBL_MAX; - return score / n_visits + EXPLORATION_FACTOR * sqrt(log(n_total) / n_visits); + return score / n_visits + EXPLORATION_FACTOR * sqrt(log(n_total) / n_visits)*65536; } ``` #### backpropagate ```diff -static void backpropagate(struct node *node, double score) +static void backpropagate(struct node *node, __uint32_t score) { while (node) { node->n_visits++; node->score += score; node = node->parent; - score = 1 - score; + score = 65536 - score; } } ``` 在 MCTS 演算法中，最常呼叫的函式是迴圈中的 `select_move` ，在這個函式中又會呼叫 `uct_score` 進行大量的浮點數運算。在效能以及程式可移植性的考量下，我們將其改為定點數的運算。這項改動讓函式從原本的資料型態 `double` 改為 `uint_32` ，而定點數的運作原理如下，我將 `uint_32` 的二進位小數點設於第十六個位元及第十七個位元之間（LSB 從第一個位元起算）。於是紀錄的數值在處理運算時就會是原先的 $2^{16}$ 倍，例如要將 `uint_32` 加上零點五就會被轉為加上 $2^{8}$ 。但是這樣的轉變就可以省去原先浮點數複雜的運算模式，換為二進位的基礎運算。對於 MCTS 迴圈中 `select_move` 大量呼叫的 `uct_score` 函式，目前已經使用定點數實作改寫完成，比較的分數值會是原本的 $2^{16}$ 倍，但這並不影響最後比較結果。 `backpropagate` 迴圈中的兩個浮點數運算也被置換完成。 ### Monte Carlo 亂數產生器在原先的 [`jserv/ttt`](https://github.com/jserv/ttt) 專案中，就有先引入偽亂數產生器 `mt19937-64` 輔助 negamax 演算法，而我也嘗試將其引入到 MCTS 演算法當中，取代原本標準函式庫的亂數產生函式 `rand()` 。 #### perf stat 於是我用效能比較工具 Perf 比較了兩者的效能，測試的項目分別是「使用的處理器時脈週期」及「使用的指令數目」這兩個項目。使用的命令是 `perf stat` 用來紀錄 MCTS 演算法執行三次後使用者模式下的測試項目用量，分別得到以下兩個結果。 ```shell //mt19937-64 random $ perf stat --repeat 3 -e cycles,instructions ./qtest Performance counter stats for './qtest': 1,441,803,000 cycles 2,396,658,028 instructions # 1.66 insn per cycle 4.612340850 seconds time elapsed 0.367987000 seconds user 0.023741000 seconds sys //stdlib random $ perf stat --repeat 3 -e cycles,instructions ./qtest Performance counter stats for './qtest': 1,498,077,142 cycles 2,411,490,314 instructions # 1.61 insn per cycle 7.924384926 seconds time elapsed 0.379149000 seconds user 0.032268000 seconds sys ``` 其中執行時間與本次實驗無關（因為 `cmd` 命令輸入也同樣記入），因此我們在意的是時脈週期以及指令數目。可以看到採用 `mt19937-64` 偽亂數產生器後，時脈週期有了明顯的減少，約是原本的 $\dfrac{96}{100}$ ，但是我用 `perf record` 對函式進行程式熱點分析時又看到了不同的結果。 #### perf record `perf record` 命令可以針對測試項目對程式的函式進行熱點分析，我用了與上述同樣的比較對象以及測試項目進行比較分析，分別得到以下結果。可以發現若只觀察 `mcts` 函式，時脈週期在變更亂數生成器後有了略微的上升，打破了 `perf stat` 所預想的時脈週期減少的結論。但是也有可能 `perf record` 只觀察使用者模式下的用量，導致亂數生成皆在使用者模式的 `mt19937-64` 時脈週期偏高。 1. clock-cycles ![need1-1](https://hackmd.io/_uploads/rkxWwmhAp.png) ![need2-1](https://hackmd.io/_uploads/SylbDQ3Cp.png) 2. instructions ![need1-2](https://hackmd.io/_uploads/rkpnP72CT.png) ![need2-2](https://hackmd.io/_uploads/SJ6nPm2Ra.png) > $\uparrow$ 上圖： rand() ，下圖： mt19937_rand() --- ## 延伸實作 ### git rebase 一項 git 專案可能同時擁有多個分支，例如目前我的 [`lab0-c`](https://github.com/HenryChaing/lab0-c) 就存在三個分支。而 `rebase` 和 `merge` 就是為了處理這些分支而存在的命令。按照這次作業的要求，我們要先從 GitHub 原始專案的 `master` 分支抓取到我們的 git 專案當中成為新的遠端分支，並且我們要將目前本地的主分支 rebase 到這個遠端分支上。這裡先用 graphviz 解釋 rebase 要處理的事情： ```graphviz digraph graphname{ node[shape=box] HEAD[shape=plaintext,fontcolor=blue] next[label="ttt",shape=plaintext,fontcolor=red] now[label="master",shape=plaintext,fontcolor=red] { rankdir = LR A[label=affe9f5] B[label=c6d0450] C[label=a53bf49] D[label=cc3388a] E[label=fc759af] F[label=c6d87cd] // NULL1[label=NULL,shape=plaintext] } { rank="same"; HEAD->now->F } { rank="same"; next->C } rankdir=LR A->B->C A->D->E->F } ``` ```graphviz digraph graphname{ node[shape=box] HEAD[shape=plaintext,fontcolor=blue] next[label="ttt",shape=plaintext,fontcolor=red] now[label="master",shape=plaintext,fontcolor=red] { rankdir = LR A[label=affe9f5] B[label=e533cbe] C[label=e07b3c4] D[label=cc3388a] E[label=fc759af] F[label=c6d87cd] // NULL1[label=NULL,shape=plaintext] } { rank="same"; now->F } { rank="same"; HEAD->next->C } rankdir=LR A->D->E->F->B->C } ``` 上圖的過程如下，首先共有兩個分支 `ttt` 以及 `master` ，並且兩個分支在分支產生後皆有新的 commit 紀錄，這時我們希望將兩個分支合併並且採用 rebase 的方式，也決定了是要將 `ttt` 分支 rebase 到 `master` 分支。因此我們會將 `c6d0450` 這個 commit 所紀錄的 patch file 重新對 `master` 分支進行 commit ，過程中可能會產生衝突， git 會自動產生標註在要 commit 的檔案當中，修改完後便繼續 commit 。重複上序流程直到 `ttt` 中的 commit 紀錄皆重新 commit 到 `master` 分支上。值得注意的是因為重新 commit 的關係所以 SHA-1 值以及時間點都會與之前的不同。 #### 實作流程 - [ ] git fetch ```shell $ git remote add origin_sys https://github.com/sysprog21/lab0-c.git $ git fetch origin_sys master From https://github.com/sysprog21/lab0-c * branch master -> FETCH_HEAD * [new branch] master -> origin_sys/master ``` 首先先新增遠端分支 `orig_sys` ，並更新這個分支到最新進度。 - [ ] git rebase (conflict) ```shell $ git branch * master ttt $ git rebase origin_sys/master Auto-merging queue.c CONFLICT (content): Merge conflict in queue.c error: could not apply cee17f9... Implement q_new , q_free , q_insert_(head,tail) hint: Resolve all conflicts manually, mark them as resolved with hint: "git add/rm <conflicted_files>", then run "git rebase --continue". hint: You can instead skip this commit: run "git rebase --skip". hint: To abort and get back to the state before "git rebase", run "git rebase --abort". Could not apply cee17f9... Implement q_new , q_free , q_insert_(head,tail) ``` 接著確認目前所在分支，確定是要合併過去的分之後，就可以執行 rebase 命令，在預設情況下會逐一進行，直到遇到衝突為止，在實作中我遇到的衝突如下，如提示所說，須將更改後的內容留在檔案中，隨後再重新 `git add <file> && git commit` 。 ```c <<<<<<<<<< master void q_free(list_head *l) { list_head *pt = l->next; while (pt != l) { element_t *node = container_of(pt, element_t, list); pt = pt->next; free(node->value); free(node); } free(l); } ========== void q_free(struct list_head *head) {} >>>>>>>>>> orig_sys/master ``` 這是實際遇到的衝突之一，只要留下需要的程式碼再將其餘標註刪除，再重新 commit 即可，我是透過 VScode 之衝突解決功能並修改 commit 內容後提交。 - [ ] git push 最後要將 rebase 完的結果推到 github 上，但是會遇到不是 fast-forward 的問題，原因在於 github 紀錄的最後一次 commit 已不存在於現在推上去的紀錄中。因此我們必須改成 `git push -f` 強置置換原先 github 上分支的內容。 ```shell To https://github.com/HenryChaing/lab0-c.git ! [rejected] master -> master (non-fast-forward) error: failed to push some refs to 'https://github.com/HenryChaing/lab0-c.git' hint: Updates were rejected because the tip of your current branch is behind hint: its remote counterpart. Integrate the remote changes (e.g. hint: 'git pull ...') before pushing again. hint: See the 'Note about fast-forwards' in 'git push --help' for details. ``` ```shell $ git push -f To https://github.com/HenryChaing/lab0-c.git + 6ffe9f5...694695b master -> master (forced update) ``` - [ ] [rebase 結果](https://github.com/HenryChaing/lab0-c/commits/master/) --- ### 〈Teach Yourself Programming in Ten Years〉閱讀感想首先作者就如標題所述，如果我們是希望成為名符其實的程式設計師，那麼我們就要有決心付出在這個領域當中，他也提到不論是音樂神童亦或是有名樂團，都也要至少十年的時間投入練習及學習。接著他提到了關於程式設計領域我們可以著手的項目，我列出幾項我體會較多的。首先他說若要在某個領域有卓越表現，通常有卓越表現的人並不等於經驗豐富的人，但反之如果有經驗豐富的個人願意花時間刻意練習，則非常有機會成為有卓越表現的人。再來他也有提到專業知識的重要性，包括計算機科學一定要理解的計算機結構基礎知識。最後他希望我們參與專案開發，並且要與其他程式設計師多加溝通，以了解他們的思路，這樣思維才得以融會貫通。以目前學生的角度出發，我的感想如下：我能認同作者所說的做中學的重要性，因為若要讓大學四年所學的有所貢獻，那麼我們還是得以程式設計師的角色出發，所以最重要的還是莫過於實作能力。而且從目前這堂核心實作課程我所得到的感想，例如循環雙向鏈結串列的實作、（快取）記憶體用量分析、排序最差狀況的避免及優化、 LRU 快取實作、並行程式的設計。這些其實都與我們大學所學的息息相關，但是要完成這些最重要的還是閱讀程式碼以及撰寫有效率程式碼的能力。要與其他程式設計師討論這點最近也略有感觸，因為實驗室計畫而需要與人合作，在著手寫 ROS2 程式前會先與同學討論，不僅得到不同思路而且問題目前都迎刃而解，所以頗為認同。在這堂課當中，也要嘗試批評其他同學的作業，老實說還真的有學到不一樣的觀點，[HotMercury](https://hackmd.io/@NeedSleep) 指出要使用 Linux 核心風格的程式碼，以及佇列插入要達到 $O(1)$與函式呼叫次數無關等等資訊，其實讓我有機會好好重思這些問題，甚至有了更好的解決方向。老師的批改也讓我了解到了問題的直接所在，也得以再更新筆記。最後作個結尾，作者最後強調要無所畏懼的投入學習以及練習，對我而言知識或許沒有其他修課同學那麼充足，但至少在投入練習這點我希望能超越其他同學，讓自己不會愧對於未來想要成為稱職的程式設計師這一點，就像老師說的工程師對這個社會的責任就是變強，我希望我也可以透過投入這堂課的練習讓自己變強。 --- ### 課程教材回顧這裡會主要回顧前三週的教材。首先是第一週的作業系統觀念，裡面提到了 `fork` 的概念是源自於並行處理的程式， `fork` 會產生新的行程，並透過 `join` 來同步兩個行程的執行。還有提到 sheduler 的觀念，因為當時已經提到了行程，因此也需要設計相對應的行程對處理器之間的資源分配，因此 scheduler 的概念就此產生，接著才有我們學到的行程排程方式，包含 Round Robin , FIFO 等等。接著是第二週的數值系統，首先探討了相對二進制，三進制更容易處理負數的問題，因為二進位還有分成無號數以及有號數的運算，在轉換有號負數成無號數時計算方式是加上 $2^n$ ( $n$ 是字組大小)，但容易產生預期外的錯誤。再來說到位元操作章節，首先講到了特殊的 ascii 設計，關於 `A` 及 `a` 的設計，剛好是 `1000001` 及 `1100001` ，因此我們可以透過對第五個位元進行與一的互斥或運算完成 `A` 及 `a` 的轉換。另外一項特性是 `ctrl` 可以遮罩第五及第六個位元，所以 `A` 及 `a` 可以轉換成 `1` ，其他字母也有相同的特性。再來是記憶體管理以及對齊的章節，首先提到了記憶體的階層，不過主題是記憶體存取速度（最快的是快取，其次是主記憶體，再來才是快閃記憶體，最後是傳統硬碟），比較需要注意的是彼此階層之間速度相差十倍，因此快取是否能抓取到常用資料成為了重要的議題，以及主記憶體及硬碟中的需求分頁也是值得注意的重點。接著是記憶體對齊的議題，舉例的是最常見的結構變數，當一個結構變數當中有 `int` 型態變數以及 `char` 型態變數，則編譯器在實際配置記憶體空間時會對齊位址為四的倍數的記憶體位址，這是為了記憶體存取時能夠直接操作。第二週最後是 bit-feild 章節，它可以透過在宣告變數時讓變數透過 `:` 設定變數使用空間，設定為零時不得有變數名稱並且之後的變數會對齊下一個界線，老實說還不懂 bit-field 為零的記憶體對齊。然後是第三週的浮點數運算，首先提到了 IEEE754 就算有明確規範，但是不同處理器也會產生不同的結果，甚至還有處理器沒有運算單元協助浮點數運算，但是編譯器可以融入 sse2 指令集以符合 IEEE754 的規範。需要特別注意的是這會導致儲存的浮點數並非我們直觀的十進位，而是會儲存成二進位相近的數值，因此 `0.1 + 0.2 == 0.3` 非真。盡量避免讓浮點數進行比較運算，因為實際儲存的數值並非實際程式賦予的數值。