2025q1 Homework1 (lab0)

# 2025q1 Homework1 (lab0) contributed by <`JeffBla`> {%hackmd NrmQUGbRQWemgwPfhzXj6g %} #### Reviewed by `MikazukiHikari` 建議不同的queue.[ch]功能分開在不同的 Commit 中實作。 ## 開發環境 ``` $ gcc --version gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 16 On-line CPU(s) list: 0-15 Vendor ID: GenuineIntel Model name: 13th Gen Intel(R) Core(TM) i7-13620H CPU family: 6 Model: 186 Thread(s) per core: 2 Core(s) per socket: 10 Socket(s): 1 Stepping: 2 CPU(s) scaling MHz: 25% CPU max MHz: 4900.0000 CPU min MHz: 400.0000 BogoMIPS: 5836.80 Virtualization features: Virtualization: VT-x Caches (sum of all): L1d: 416 KiB (10 instances) L1i: 448 KiB (10 instances) L2: 9.5 MiB (7 instances) L3: 24 MiB (1 instance) NUMA: NUMA node(s): 1 NUMA node0 CPU(s): 0-15 ``` ## 針對佇列操作的程式碼實作 ### 巨集為了提昇可讀性與便利性，將時常重複的操作以巨集代替。 ### q_entry() > commit: [8ec5ffe > ](https://github.com/JeffBla/lab0-c/commit/8ec5ffe5dbe13d775d45deff4ac0a60b33fc5d3f) 利用嵌入體鏈結串列反推佇列節點。 ### new_q_element() > commit: [8ec5ffe > ](https://github.com/JeffBla/lab0-c/commit/8ec5ffe5dbe13d775d45deff4ac0a60b33fc5d3f) 為佇列節點配置記憶體。 ### swap() > commit: [3413e9a > ](https://github.com/JeffBla/lab0-c/commit/3413e9a68da239f5c3472a8721838fa68976b2b8) 輸入三個變數，其中兩個為與交換值之變數，另一個為暫存之變數。 ### 目標函式 ### q_new() > commit: [8ec5ffe > ](https://github.com/JeffBla/lab0-c/commit/8ec5ffe5dbe13d775d45deff4ac0a60b33fc5d3f) 配置記憶體，考慮配置記憶體失敗，失敗回傳空指標，若成功，初始化嵌入體節點頭。 ### q_free() > commit: [8ec5ffe > ](https://github.com/JeffBla/lab0-c/commit/8ec5ffe5dbe13d775d45deff4ac0a60b33fc5d3f) 利用迴圈 `list_for_each_entry_safe` 拔掉節點並且釋放佇列節點與其儲存之字串，當佇列不存在，直接返回函式執行。 > commit: [5aa2284 > ](https://github.com/JeffBla/lab0-c/commit/5aa228400021ea587a20443c6418d334c391955f) 未釋放佇列的頭部，完善它。 ### q_insert_common() > commit: [250992a > ](https://github.com/JeffBla/lab0-c/commit/250992a3829109d3439531b1efe98ba7677f8377) 根據 clean code ，將 [q_insert_head](#q_insert_head()) 與 [q_insert_tail](#q_insert_tail()) 重構到此函式，利用 `insert_fn` 根據位置調整所欲呼叫的函式。 ```c void (*insert_fn)(struct list_head *, struct list_head *); ``` ```diff +bool q_insert_common(struct list_head *head, + char *s, + void (*insert_fn)(struct +list_head *, struct list_head *)){ ... - list_add(&new_node->list, head); + insert_fn(&new_node->list, head); ``` ### q_insert_head() > commit: [8ec5ffe > ](https://github.com/JeffBla/lab0-c/commit/8ec5ffe5dbe13d775d45deff4ac0a60b33fc5d3f) 利用 `new_q_element` 分配佇列節點空間，若配置記憶體失敗，函式返回並回傳 `false` 。利用 `strdup` 複製傳入字串並分配字串空間，若配置記憶體失敗，函式返回並回傳 `false` 。最後運用 linux 鏈結串列特性，將嵌入體 `list_head` 插入至佇列嵌入體所組成的鏈結串列開頭。 > commit: [513dca2 > ](https://github.com/JeffBla/lab0-c/commit/513dca26468428548f38fd92a48898b7e08a3a3c) 若 `strdup` 出錯，則應該釋放先前所配置的空間，而非直接返回函式。 > commit: [250992a > ](https://github.com/JeffBla/lab0-c/commit/250992a3829109d3439531b1efe98ba7677f8377) 將函式邏輯重構至 [q_insert_common](#q_insert_common()) ### q_insert_tail() > commit: [8ec5ffe > ](https://github.com/JeffBla/lab0-c/commit/8ec5ffe5dbe13d775d45deff4ac0a60b33fc5d3f) 與 [q_insert_head](#q_insert_head()) 相似，但最後將嵌入體插入至佇列嵌入體所組成的鏈結串列尾端。 > commit: [513dca2 > ](https://github.com/JeffBla/lab0-c/commit/513dca26468428548f38fd92a48898b7e08a3a3c) 同 [q_insert_head](#q_insert_head()) > commit: [250992a > ](https://github.com/JeffBla/lab0-c/commit/250992a3829109d3439531b1efe98ba7677f8377) 將函式邏輯重構至 [q_insert_common](#q_insert_common()) ### q_remove_mid() > commit: [8ec5ffe > ](https://github.com/JeffBla/lab0-c/commit/8ec5ffe5dbe13d775d45deff4ac0a60b33fc5d3f) > commit: [a56ce3c > ](https://github.com/JeffBla/lab0-c/commit/a56ce3cff58e500a86fa6cdb13cba8244df003a3) 符合 clean code ，將重複的程式碼重構。因為欲刪除節點以提供，在任一點的拔除皆為等價，因此將此函式代替在任意點的拔除。將目標節點拔掉，若給定字串指標存在，利用 `strncpy` 將目標所儲存的字串複製到 `bufsize` 。為了維持 c 語言的字串形式，複製長度為 `bufsize -1` 並在 `bufsize-1` 位置設為 `\0` 。當佇列不存在或佇列為空時，直接返回空指標。 ### q_remove_head() & q_remove_tail() > commit: [8ec5ffe > ](https://github.com/JeffBla/lab0-c/commit/8ec5ffe5dbe13d775d45deff4ac0a60b33fc5d3f) 此兩函式主要邏輯被重構至 [q_remove_mid](#q_remove_mid()) ### q_size() > commit: [3689eff > ](https://github.com/JeffBla/lab0-c/commit/3689eff1120a4d30468bd6f8cedbe205ab3e0457) 以迴圈遍歷並紀錄所有節點，得出長度，當佇列不存在，則回傳值為0。 ### q_delete_mid() > commit: [8ec5ffe > ](https://github.com/JeffBla/lab0-c/commit/8ec5ffe5dbe13d775d45deff4ac0a60b33fc5d3f) > commit: [6f4b259 > ](https://github.com/JeffBla/lab0-c/commit/6f4b25914723e0c83d239e2c9497b2b6a4db6c15) 當佇列不存在或為空時，直接返回函式並回傳 `false` 。使用快慢指標，並以計數器紀錄快指標經過的節點數，當計數器中的數值為基數，慢指標往下一個移動，最後當快指標指向空指標，慢指標極為中點。拔除並釋放相關空間。其中須注意若佇列為空， `mid` 若指向頭部，則表示佇列為空。 ```diff if ((cnt & 1) == 1) mid = mid->next; cnt++; curr = curr->next; } + if (mid == head) + return false; list_del(mid); q_release_element(q_entry(mid)); ``` > commit: [0c68d9f > ](https://github.com/JeffBla/lab0-c/commit/0c68d9f88a9d0f8697c2c0dfe5264d98d1d4d58d) 第二周課程提及到除法的費時，考量到此，利用 bitops ```diff int cnt = 0; while (curr) { - if (cnt % 2 == 1) + if ((cnt & 1) == 1) mid = mid->next; cnt++; ``` ### q_delete_dup() > commit: [8ec5ffe > ](https://github.com/JeffBla/lab0-c/commit/8ec5ffe5dbe13d775d45deff4ac0a60b33fc5d3f) > commit: [4fb3b3c > ](https://github.com/JeffBla/lab0-c/commit/4fb3b3cd27ef835f304b7e384e2a8d3fbc6c4397) 當佇列不存在或為空時，返回函式並回傳 `false` 。利用快慢指標遍歷，當發現相鄰節點的值相同時，標記為重複，快指標繼續向後移動，直到該段重複節點結束。刪除所有重複節點，慢指標指向快指標的目標，並調整鏈結確保正確連接，若無重複則繼續遍歷，最後回傳 `true` 。 ### q_swap() > commit: [016fde4 > ](https://github.com/JeffBla/lab0-c/commit/016fde41fe310e94cf9b0fa8df04dbcd2673ff7a) > commit: [5c7402a > ](https://github.com/JeffBla/lab0-c/commit/5c7402aa974379229bfc35769e60301aad1ff929) 當佇列不存在、只有一個節點，或為空時，直接返回函式。使用快慢指標遍歷佇列，交換相鄰的兩個節點，確保 `next` 與 `prev` 指標正確更新，維持雙向鏈結串列的完整性。 ### q_reverse() > commit: [3413e9a > ](https://github.com/JeffBla/lab0-c/commit/3413e9a68da239f5c3472a8721838fa68976b2b8) 當佇列不存在或為空時，返回函式。使用雙指標分別指向頭尾節點，利用 `swap` 巨集交換節點內部字串指標值，並向中間移動，直到遍歷完成，最終達成完整反轉。與 [q_reverse_only_k](#q_reverse_only_k()) 不同之處在於反轉整個雙向環狀鏈結串列時，頭尾的取得是 $O(1)$ ### q_reverse_only_k() > commit: [3413e9a > ](https://github.com/JeffBla/lab0-c/commit/3413e9a68da239f5c3472a8721838fa68976b2b8) 使用雙指標分別指向區段的頭尾節點，利用 `swap` 巨集交換 k 個節點內部字串指標值，確保雙向移動時不超過區段範圍，最終完成該區段的反轉。然而，在尋找尾端節點時，須遍歷 k 個元素 ```c struct list_head *back = front; for (int i = 1; i < k; i++) { back = back->next; } ``` ### q_reverseK() > commit: [3413e9a > ](https://github.com/JeffBla/lab0-c/commit/3413e9a68da239f5c3472a8721838fa68976b2b8) > commit: [39b30f3 > ](https://github.com/JeffBla/lab0-c/commit/39b30f32cc705c3f0aecbb03ec88e3e651ac1c80) 當佇列不存在或為空時，直接返回函式並返回。遍歷鏈結串列，每 k 個節點為一組，利用 [q_reverse_only_k](#q_reverse_only_k()) 反轉該區段，利用雙重迴圈，每湊齊 k 個元素，反轉一次，若剩餘節點不足 k 則保持原樣，最後完成遍歷。 ### 排序實作模仿 `list_sort.c` 實現核心風格的由下而上合併排序，並支援升序與降序。 #### cmp_in_sort() > commit: [c487b61 > ](https://github.com/JeffBla/lab0-c/commit/c487b6190c2892d026660ebe00f0ea73c5640733) > commit: [1b24768 > ](https://github.com/JeffBla/lab0-c/commit/1b24768118916d1aba81690168b73fd167e02730) > commit: [8073fae > ](https://github.com/JeffBla/lab0-c/commit/8073faed06a3f65e43632b61d2442e22304209a4) 比較兩個節點的字串值，根據 descend 旗標決定排序方式，返回是否應交換位置。 #### merge() > commit: [c487b61 > ](https://github.com/JeffBla/lab0-c/commit/c487b6190c2892d026660ebe00f0ea73c5640733) 使用底向上的方式合併兩條已排序的鏈結串列，根據 descend 旗標決定合併順序，最終返回合併後的頭節點。此函式將鏈結串列當作單向非循環。 > commit: [8073fae > ](https://github.com/JeffBla/lab0-c/commit/8073faed06a3f65e43632b61d2442e22304209a4) 相較於 [linux/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) ，為了省略進入 [cmp_in_sort](#cmp_in_sort()) 判斷其中一個欲合併鏈結串列是否為空指標，將 for 迴圈改成 while 迴圈，其中的執行條件是兩欲合併鏈結串列皆不為空指標。至於當其中一個欲合併鏈結串列為空且迴圈執行完畢後，另一欲合併鏈結串列將會被插入於當前合併合併鏈結串列的尾端。 #### merge_final() > commit: [c487b61 > ](https://github.com/JeffBla/lab0-c/commit/c487b6190c2892d026660ebe00f0ea73c5640733) 進一步合併排序後的串列，確保 prev 指標正確連結，最終形成雙向循環鏈結串列。 > commit: [58e7359 > ](https://github.com/JeffBla/lab0-c/commit/58e735980af5e732d7800a41e24263da47d40fa0) 後續追蹤發現，不單單只要維護 `head` ，所有的 `prev` 都要修正。 ```diff } - // Make linked list Circular - head->prev = *tail; + // Make rest linked list nodes doubly linked + while (*tail) { + (*tail)->prev = prev; + prev = *tail; + tail = &(*tail)->next; + } + // Make list circular + head->prev = prev; + prev->next = head; } ``` > commit: [8073fae > ](https://github.com/JeffBla/lab0-c/commit/8073faed06a3f65e43632b61d2442e22304209a4) 相較於 [linux/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) ，為了省略進入 [cmp_in_sort](#cmp_in_sort()) 判斷其中一個欲合併鏈結串列是否為空指標，將 for 迴圈改成 while 迴圈，其中的執行條件是兩欲合併鏈結串列皆不為空指標。至於當其中一個欲合併鏈結串列為空且迴圈執行完畢後，另一欲合併鏈結串列將會被插入於當前合併合併鏈結串列的尾端。 #### q_sort() > commit: [c487b61 > ](https://github.com/JeffBla/lab0-c/commit/c487b6190c2892d026660ebe00f0ea73c5640733) 當佇列不存在、為空或僅有一個元素時，直接返回函式。使用 Linux 核心 list_sort 風格的合併排序，先將雙向循環鏈結串列轉換為單向鏈結串列，再逐步合併節點，最後透過 merge_final 修正 prev 指標並恢復循環結構。其中合併方式遵照 list_sort 。在 [linux/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 中 ```c /* This mergesort is as eager as possible while always performing at least * 2:1 balanced merges. Given two pending sublists of size 2^k, they are * merged to a size-2^(k+1) list as soon as we have 2^k following elements. ``` 跟據 [你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list) > 保持兩個要被 merge 的 list 至少是 2 : 1 的狀態 (較大的 list 至多是較小者的 2 倍)，這可有效的降低比較的次數。 > 其實做方式與解釋 ```c /* The merging is controlled by "count", the number of elements in the * pending lists. This is beautiully simple code, but rather subtle. * * Each time we increment "count", we set one bit (bit k) and clear * bits k-1 .. 0. Each time this happens (except the very first time * for each bit, when count increments to 2^k), we merge two lists of * size 2^k into one list of size 2^(k+1). ``` ```c /* Find the least-significant clear bit in count */ for (bits = count; bits & 1; bits >>= 1) tail = &(*tail)->prev; /* Do the indicated merge */ if (likely(bits)) { struct list_head *a = *tail, *b = a->prev; a = merge(priv, (cmp_func)cmp, b, a); /* Install the merged result in place of the inputs */ a->prev = b->prev; *tail = a; } ``` 也就是說此演算法會跟據 `count` 的最低有效 0 位來進合併的選擇，善用 bit 來達成 > Given two pending sublists of size 2^k, they are merged to a size-2^(k+1) list as soon as we have 2^k following elements. 至於最後面則故意留下一條尚未被合併，利用 `merge_final` 將雙向循環鏈結串列的 `prev` 指標復原。 ### q_ascend_descend() > commit: [9c1eb51 > ](https://github.com/JeffBla/lab0-c/commit/9c1eb51d4e39156c99c44a8f7dbf0861dd7e442c) 符合 clean code ，將重複的程式碼重構。由於遞增與遞減的邏輯只差在比較值得方式，因此將此程式邏輯提取至此。使用遞迴處理鏈結串列，透過 `pivot` 來篩選元素，根據遞增或遞減模式移除不符合條件的節點。函式會先遞迴到末點，最後一個節點先與 `pivot` 進行比較。若符合遞增/遞減條件則更新 `pivot`，否則將節點從鏈結串列移除，最後回傳符合條件的節點數量。 ### q_ascend() > commit: [9c1eb51 > ](https://github.com/JeffBla/lab0-c/commit/9c1eb51d4e39156c99c44a8f7dbf0861dd7e442c) 移除所有右側存在更小值的節點，確保剩餘節點單調遞增。使用 `~~~~~~~` 作為初始 `pivot` ，然後呼叫 [q_ascend_descend](#q_ascend_descend()) 進行處理。會選 `~~~~~~~` 作為其初始是因為 `~` 為 ASCII 中最大的可見字元，而若剛好有 `~` 在比較中則會導致此函式出錯，因此利用多個 `~` 降低錯誤率根據 `man ascii` : ```bash Oct Dec Hex Char 176 126 7E ~ 177 127 7F DEL ``` ### q_descend() > commit: [9c1eb51 > ](https://github.com/JeffBla/lab0-c/commit/9c1eb51d4e39156c99c44a8f7dbf0861dd7e442c) 移除所有右側存在更大值的節點，確保剩餘節點單調遞減。使用 `\0` 作為初始 pivot，然後呼叫 [q_ascend_descend](#q_ascend_descend()) 進行處理。 ### 合併多個佇列 #### 巨集 ##### ctx_entry() > commit: [8073fae > ](https://github.com/JeffBla/lab0-c/commit/8073faed06a3f65e43632b61d2442e22304209a4) 將嵌入體鏈結串列反推佇列鏈 `queue_contex_t` 節點。 #### break_circular() > commit: [8073fae > ](https://github.com/JeffBla/lab0-c/commit/8073faed06a3f65e43632b61d2442e22304209a4) 為了增加可讀性，將循環變成非循環的程式邏輯重構成此函式。 #### list_is_2() > commit: [8073fae > ](https://github.com/JeffBla/lab0-c/commit/8073faed06a3f65e43632b61d2442e22304209a4) 確認鏈結串列是否只有包含兩個節點。 #### _q_merge() > commit: [8073fae > ](https://github.com/JeffBla/lab0-c/commit/8073faed06a3f65e43632b61d2442e22304209a4) 該函式負責合併 ([merge_final](#merge_final())) 兩個鏈結串列，並根據遞減/遞增參數決定合併順序。它會先取出前兩個隊列的上下文，然後打破其環狀結構 ([break_circular](#break_circular()))，接著執行合併，最終將一個記錄為空隊列並回收，而另一個則繼續保留在隊列中。原本程式碼如以下，希望模仿 [linux/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 中的合併運算，然而，目前還沒有想好整體流程，因此先取消此實做，改成全部都是使用 `merge_final`。 ```c if (is_final) { merge_final(descend, ctx1->q, ctx1->q->next, ctx2->q->next); } else { ctx1->q->next = merge(descend, ctx1->q->next, ctx2->q->next); } ``` #### q_merge() > commit: [8073fae > ](https://github.com/JeffBla/lab0-c/commit/8073faed06a3f65e43632b61d2442e22304209a4) 該函式會不斷合併隊列，直到只剩下一個排序後的隊列。它使用 ([_q_merge](#_q_merge())) 來進行兩兩合併，最後回傳合併後的隊列大小。原本程式碼如以下，利用剩餘的兩個鏈結串列，做最後的維護，希望模仿 [linux/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 中的合併運算，然而，目前還沒有想好整體流程，因此先取消此實做，改成全部都是使用 `merge_final`。 ```c while (!list_is_2(head)) { _q_merge(head, descend, false, &empty_head); } _q_merge(head, descend, true, &empty_head); } ``` ## 解釋 select 系統呼叫在本程式的使用方式首先，確定程式碼已實作同時處理網頁接收與命令列。觀察程式碼，找出主要處理命令列的函式。命令列會由 `console.c` 中 `run_console` 開始處理指令。其關鍵程式碼： ```c if (!has_infile) { char *cmdline; while (use_linenoise && (cmdline = linenoise(prompt))) { interpret_cmd(cmdline); line_history_add(cmdline); /* Add to the history. */ line_history_save(HISTORY_FILE); /* Save the history on disk. */ line_free(cmdline); while (buf_stack && buf_stack->fd != STDIN_FILENO) cmd_select(0, NULL, NULL, NULL, NULL); has_infile = false; } if (!use_linenoise) { while (!cmd_done()) cmd_select(0, NULL, NULL, NULL, NULL); } } else { while (!cmd_done()) cmd_select(0, NULL, NULL, NULL, NULL); } ``` 若 `use_linenoise` 被啟用則開始利用 `linenoise` 讀取指令，若否，執行 `cmd_select` 進行針對其他檔案描述子處理。在 `linenoise` 中，主要執行輸入指令操作的是 `line_raw` 內的 `line_edit`。其中的一段程式碼負責讀取指令。 ```c while (1) { signed char c; int nread; char seq[5]; if (eventmux_callback != NULL) { int result = eventmux_callback(l.buf); if (result != 0) return result; } nread = read(l.ifd, &c, 1); if (nread <= 0) return l.len; ``` `eventmux_callback` 為在啟用網頁服務後 `console.c` 設定的函式。 `web_eventmux` 則是本次欲探討 `select` 使用方式的出處。 ```c static bool do_web(int argc, char *argv[]) { ... web_fd = web_open(port); if (web_fd > 0) { printf("listen on port %d, fd is %d\n", port, web_fd); line_set_eventmux_callback(web_eventmux); use_linenoise = false; } else { ``` ```c int web_eventmux(char *buf) { fd_set listenset; FD_ZERO(&listenset); FD_SET(STDIN_FILENO, &listenset); int max_fd = STDIN_FILENO; if (server_fd > 0) { FD_SET(server_fd, &listenset); max_fd = max_fd > server_fd ? max_fd : server_fd; } int result = select(max_fd + 1, &listenset, NULL, NULL, NULL); if (result < 0) return -1; if (server_fd > 0 && FD_ISSET(server_fd, &listenset)) { FD_CLR(server_fd, &listenset); struct sockaddr_in clientaddr; socklen_t clientlen = sizeof(clientaddr); int web_connfd = accept(server_fd, (struct sockaddr *) &clientaddr, &clientlen); char *p = web_recv(web_connfd, &clientaddr); char *buffer = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\n"; web_send(web_connfd, buffer); ``` 首先我們分析 `web_eventmux` 的處理，創建一個讀取的檔案描述子集合，將 `stdin` 與 `socket`（如果有的話）存放於此集合，由於我們只須利用 `select` 處理輸入的操作，因此除了 `readfds` 其他皆為空指標，其宣告如下 [select(2)](https://man7.org/linux/man-pages/man2/select.2.html) ```c int select(int nfds, fd_set *_Nullable restrict readfds, fd_set *_Nullable restrict writefds, fd_set *_Nullable restrict exceptfds, struct timeval *_Nullable restrict timeout); ``` 在設定完 `select` 後，處理當檔案描述子變更時，所做的事。在確認是網頁服務的輸入後，使用 `accept` 系統呼叫建立連線並執行之後的處理。其詳細描述於 [accept(2)](https://man7.org/linux/man-pages/man2/accept.2.html)： ``` The accept() system call is used with connection-based socket types (SOCK_STREAM, SOCK_SEQPACKET). It extracts the first connection request on the queue of pending connections for the listening socket, sockfd, creates a new connected socket, and returns a new file descriptor referring to that socket. The newly created socket is not in the listening state. The original socket sockfd is unaffected by this call. ``` 另外，在 `line_edit` 中，可以看到 ```c nread = read(l.ifd, &c, 1); if (nread <= 0) return l.len; ``` 處理使用者在命令列輸入字元。不過，到此不免產生， `select` 是如何實際運作的疑問，以及如何與 `read` 與 `accept` 協作。看一次 [select(2)](https://man7.org/linux/man-pages/man2/select.2.html) 的解釋： ``` select() allows a program to monitor multiple file descriptors, waiting until one or more of the file descriptors become "ready" for some class of I/O operation (e.g., input possible). A file descriptor is considered ready if it is possible to perform a corresponding I/O operation (e.g., read(2), or a sufficiently small write(2)) without blocking. ``` 值得注意的是在[select(2) NOTES](https://man7.org/linux/man-pages/man2/select.2.html#NOTES) ： ``` The operation of select() and pselect() is not affected by the O_NONBLOCK flag. ``` `select` 會等待直到輸入的檔案描述子其中一些能夠被取用，因此利用 `FD_ISSET` 就可以讓離開 `select` 的程式開始處理已經能夠被使用的檔案描述子， `select` 隱含著分支的意義，這也是為何在 `select` 後需要加上判別並進行分支 `read` 與 `accept` 處理。利用 `gdb` 觀察。 ```bash gdb -q ./qtest ``` 建立斷點以觀察 `select` 、 `read` 及 `accept` ```bash b web.c:248 # int result = select(max_fd + 1, ... b web.c:256 # int web_connfd = accept(server_fd, ... b linenoise.c:956 # nread = read(l.ifd, &c, 1); ``` 執行後，呈現 `cmd>` ，啟用 web ，當按下鍵盤， ```clike listen on port 9999, fd is 3 cmd> Breakpoint 1, web_eventmux (buf=0x7fffffffbc50 "") at web.c:248 248 int result = select(max_fd + 1, &listenset, NULL, NULL, NULL); ``` 輸入 c 繼續執行後會發現，程式繼續跑了，但沒有進入到任一個斷點。此時若在輸入字元，即會跳到 `read` 。 ```clike Breakpoint 2, line_edit (prompt=0x555555561503 "cmd> ", buflen=4096, buf=0x7fffffffbc50 "", stdout_fd=1, stdin_fd=0) at linenoise.c:957 957 if (nread <= 0) ``` 輸入 c 繼續觀察，程式又跑回 `select` ，且輸入 n 會發現 gdb 卡住了，代表 `select` 正在等待檔案描述子的更動，輸入一個字元後會發現之前輸入的 n 又做動了。接下來觀察， `socket` 的處理情形，將 gdb 跑回 `select` 並且輸入 c ，現在已經知道目前的情形是 `select` 正在等待檔案描述子，若在令一個終端機輸入 ```bash curl http://localhost:9999/new ``` gdb 跳出進入網路處理的訊息，而若輸入 c 則對應的指令動作被執行且 gdb 又回到 `select` 。 ```clike Breakpoint 3, web_eventmux (buf=0x7fffffffbc50 "") at web.c:256 256 int web_connfd = ``` 綜上所述， `select` 會代替所指派檔案描述子對應到的系統呼叫等待，結束等待後須利用 `FD_ISSET` 以及 `FD_CLR` 導向並確認處理完成。上述程式碼並沒有任何檔案描述子被設定為不阻擋，但因 `select` 已確認該檔案描述子需要處理，因此在處理對應到的系統呼叫並不再等待。 ## 解釋 simulation 模式使用兩組不同類型的樣本：固定數值與隨機數值，利用Welch's T檢驗針對執行時間（CPU時脈週期個數），比較兩組樣本的平均值是否有差異。首先建立：虛無假設(Null hypothesis)→H0: 第一組平均數等於第二組平均數對立假說(alternative hypothesis)→H1: 第一組平均數不等於第二組平均數如果此測試失敗，則表示存在一階時序資訊洩漏。 Welch's T 檢驗方程式 $t =\frac{\mu_x-\mu_y}{\sqrt{\frac{S_x^2}{n_x}}+\frac{S_y^2}{n_y}}$ 使用到線上演算法 - Welford 演算法其更新平均數的算法： $\mu_{new}=\mu_{old}+\frac{x-\mu_{old}}{n}$ 更新變異數的算法： $M2_{new} = M2_{old} + (x-\mu_{old})(x-\mu_{new})$ $S^2 = \frac{M2}{n-1}$ 將此數值帶入 Welch's T 檢驗方程式即可得知檢測值。在過程中只須紀錄並更新平均數與變異數的中繼值 $M2$ ### 實作改進百分點論文中提到利用前處理可以得到更好的檢測結果，其方法為捨去超過特定百分位以後的值。這裡的基本原理是測試受限的分佈範圍，尤其是較低的循環計數尾端。前端可能更容易受到與資料無關的雜訊的影響。這個（啟發式）過程給出了良好的經驗結果（使檢測更快）；儘管如此，該方法也處理沒有預處理的測量結果。 > commit: [1fdd752 > ](https://github.com/JeffBla/lab0-c/commit/1fdd752928f69b4b5527ff02701bf9cd71394eeb) 本次實做參考 [oreparaz/dudect](https://github.com/oreparaz/dudect) 設定 100 個百分點並計算出該百分點實際對應的數值 `percentile` ```c static int64_t percentile(int64_t *a_sorted, double which, size_t size) { size_t array_position = (size_t) ((double) size * (double) which); assert(array_position < size); return a_sorted[array_position]; } ``` :::spoiler 其對應數值 0.0670 0.1294 0.1877 0.2421 0.2929 0.3402 0.3844 0.4257 0.4641 0.5000 0.5335 0.5647 0.5939 0.6211 0.6464 0.6701 0.6922 0.7128 0.7321 0.7500 0.7667 0.7824 0.7969 0.8105 0.8232 0.8351 0.8461 0.8564 0.8660 0.8750 0.8834 0.8912 0.8985 0.9053 0.9116 0.9175 0.9231 0.9282 0.9330 0.9375 0.9417 0.9456 0.9492 0.9526 0.9558 0.9588 0.9615 0.9641 0.9665 0.9688 0.9708 0.9728 0.9746 0.9763 0.9779 0.9794 0.9808 0.9821 0.9833 0.9844 0.9854 0.9864 0.9873 0.9882 0.9890 0.9897 0.9904 0.9910 0.9916 0.9922 0.9927 0.9932 0.9937 0.9941 0.9945 0.9948 0.9952 0.9955 0.9958 0.9961 0.9964 0.9966 0.9968 0.9970 0.9972 0.9974 0.9976 0.9978 0.9979 0.9980 0.9982 0.9983 0.9984 0.9985 0.9986 0.9987 0.9988 0.9989 0.9990 0.9990 ::: ### 致命缺陷照 [oreparaz/dudect](https://github.com/oreparaz/dudect) 實作完後，就算運行時間不是 $O(1)$ 也會過，因此假設了幾個問題。下圖是非 $O(1)$ 實作的時間分佈圖，用高斯擬合並隨機抽樣產生曲線一下的點，藍色曲線為固定數值，紅色為隨機數值，可以發現，不管是平均數或者標準差都不相同。這種行為會導致 T 檢定失敗，但運行結果卻沒有問題。 ![實際運行時間比較](https://hackmd.io/_uploads/ByxkI8LTjkx.png) 1. 運行數量錯誤如下程式碼： ```c bool measure(int64_t *before_ticks, int64_t *after_ticks, uint8_t *input_data, int mode){ ... case DUT(insert_head): for (size_t i = DROP_SIZE; i < N_MEASURES - DROP_SIZE; i++) { ... case DUT(insert_tail): for (size_t i = DROP_SIZE; i < N_MEASURES - DROP_SIZE; i++) { ... ``` 猜測應該要去掉或略過 `DROP_SIZE` 個迴圈，但此寫法去除了兩倍的 `DROP_SIZE` 。 2. 最終選擇錯誤根據 `max_test` 最後的結果取最大值，由於原本無百分位以上捨去的運算並沒有被移除，因此結果最壞應該還是失敗才對，如之前未修改的程式碼。 ```c static t_context_t *max_test(t_context_t *t[]) { size_t ret = 0; double max = 0; for (size_t i = 0; i < DUDECT_TESTS; i++) { if (t[i]->n[0] > ENOUGH_MEASURE) { double x = fabs(t_compute(t[i])); if (max < x) { max = x; ret = i; } } } return t[ret]; } ```