2025q1 Homework1 (lab0)

# 2025q1 Homework1 (lab0) contributed by <```Charlie-Tsai1123```> ### Reviewed by `Denny0097` > 筆記程式碼格式應為C，而非Cpp。 ### Reviewed by `EricccTaiwan` > 善用 [Graphviz](https://graphviz.org/)。在 `q_sort` 和 `q_merge` 的流程實作圖，用 [Graphviz](https://graphviz.org/) 取代能更加美觀，且方便移植 (e.g., 複製到其他筆記中使用)。 {%hackmd NrmQUGbRQWemgwPfhzXj6g %} ## 開發環境 ```shell $ gcc --version gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 48 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 16 On-line CPU(s) list: 0-15 Vendor ID: AuthenticAMD Model name: AMD Ryzen 7 7735HS with Radeon Graphics CPU family: 25 Model: 68 Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 1 Stepping: 1 Frequency boost: enabled CPU(s) scaling MHz: 30% CPU max MHz: 4829.0000 CPU min MHz: 400.0000 BogoMIPS: 6388.23 ``` ## queue implementation ### ```q_new``` >Commit [dae63fa](https://github.com/sysprog21/lab0-c/commit/dae63fab49a3d3222f924827cd2969dc8aabb57c) > 先用 ```malloc``` 分配空間給 ```head``` 接著判斷是否成功配置，有的話利用 ```INIT_LIST_HEAD``` 讓 ```next``` 跟 ```prev``` 指標指向自己 ```cpp /* Create an empty queue */ struct list_head *q_new() { struct list_head *queue = malloc(sizeof(struct list_head)); if (!queue) return NULL; INIT_LIST_HEAD(queue); return queue; } ``` ### ```q_free``` >Commit [dae63fa](https://github.com/sysprog21/lab0-c/commit/dae63fab49a3d3222f924827cd2969dc8aabb57c) > free 每個元素流程應該是 1. 斷開與 queue 的連結 (```list_del```)2. 釋放該元素而其中釋放元素又分成兩部分 1. 釋放 ```char*``` 2. 釋放 ```element_t*``` 原本想要自己寫，但在閱讀 ```queue.h``` 後發現 ```q_release_element``` 已經完成了 ```cpp void q_free(struct list_head *head) { if (!head) return; element_t *entry = NULL, *safe = NULL; list_for_each_entry_safe (entry, safe, head, list) { list_del(&entry->list); q_release_element(entry); } free(head); return; } ``` ### ```q_insert_head``` / ```q_insert_tail``` >Commit [e7ec77b](https://github.com/sysprog21/lab0-c/commit/e7ec77bdfc8ac20177134ba4a1e2203aa06ab5f8) 首先分配新元素的空間，分成兩步 1. 分配 ```element_t``` 2. 分配 ```char*``` ```char*``` 使用 ```strdup``` (分配空間並複製字串) 接著用 ```list_add``` / ```list_add_tail``` 加到 ```queue``` 中 ```cpp /* Insert an element at head of queue */ bool q_insert_head(struct list_head *head, char *s) { if (!head) return false; element_t *new_element = malloc(sizeof(element_t)); if (!new_element) return false; new_element->value = strdup(s); if (!new_element->value) { free(new_element); return false; } list_add(&new_element->list, head); return true; } ``` ### ```q_remove_head``` / ```q_remove_tail``` >Commit [64c4ef8](https://github.com/sysprog21/lab0-c/commit/64c4ef809ddb701a9ef200920dae777ad0061f4b) 先記錄要刪除的點，用 ```list_entry``` 得到指標所指位子的 ```element_t``` ，接著用 ```strncpy``` 複製字串，最後用 ```list_del``` 移除節點 ```cpp /* Remove an element from head of queue */ element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; struct list_head *first = head->next; element_t *element = list_entry(first, element_t, list); if (sp) { strncpy(sp, element->value, bufsize - 1); sp[bufsize - 1] = '\0'; } list_del(first); return element; } ``` 但這樣過不了 track-17-complexity ```bash +++ TESTING trace trace-17-complexity: # Test if time complexity of q_insert_tail, q_insert_head, q_remove_tail, and q_remove_head is constant ERROR: Probably not constant time or wrong implementation Probably constant time ERROR: Probably not constant time or wrong implementation Probably constant time --- trace-17-complexity 0/5 --- TOTAL 95/100 make: *** [Makefile:61: test] Error 1 ``` （新增 percentile 後已達到100/100） ### ```q_size``` >Commit [cba267a](https://github.com/sysprog21/lab0-c/commit/cba267aa5efa7e9081d4170d95fe4becd3d719a0) 用 ```list_for_each``` 遍歷每個節點並用 ```size``` 記錄遍歷了幾次 ```cpp /* Return number of elements in queue */ int q_size(struct list_head *head) { if (!head || list_empty(head)) return 0; int size = 0; struct list_head *node = NULL; list_for_each (node, head) size++; return size; } ``` ### ```q_delete_mid``` >Commit [b438041](https://github.com/sysprog21/lab0-c/commit/b438041c31f34b275213d94c9618550c87df68d4) 用快慢指針找出中間的節點也就是 ```slow->next``` 然後刪除 ```cpp /* Delete the middle node in queue */ bool q_delete_mid(struct list_head *head) { // https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/ if (!head || list_empty(head)) return false; struct list_head *fast = head, *slow = head; while (fast->next != head && fast->next->next != head) { fast = fast->next->next; slow = slow->next; } element_t *element = list_entry(slow->next, element_t, list); list_del(slow->next); q_release_element(element); return true; } ``` ### ```q_delete_dup``` >Commit [b438041](https://github.com/sysprog21/lab0-c/commit/b438041c31f34b275213d94c9618550c87df68d4) 我的想法是用兩個指標分別是 ```current``` 記錄當前檢查是否重複的節點， ```next``` 檢查後面的值是否跟 ```current``` 相同，若相同就刪除，不同就更新 ```current``` 。 ```isRepeat``` 是用來記錄是否有 ```next``` 跟 ```current``` 相同，有的話更新 ```current``` 前要刪除 ```current```。 ![q_delete_dup](https://hackmd.io/_uploads/BJsVYYSsJl.png) ```cpp /* Delete all nodes that have duplicate string */ bool q_delete_dup(struct list_head *head) { // https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/ if (!head || list_empty(head) || list_is_singular(head)) return false; struct list_head *current = head->next; while (current != head) { element_t *current_element = list_entry(current, element_t, list); struct list_head *next = current->next; bool isRepeat = false; while (next != head) { element_t *next_element = list_entry(next, element_t, list); if (strcmp(current_element->value, next_element->value) == 0) { isRepeat = true; next = next->next; list_del(next->prev); q_release_element(next_element); } else { break; } } current = current->next; if (isRepeat) { list_del(current->prev); q_release_element(current_element); } } return true; } ``` ### ```q_swap``` >Commit [9787922](https://github.com/sysprog21/lab0-c/commit/9787922e7bace3ad4071dbe8a3984e43f76006a2) 此題先找出要交換的那兩個點 ```current->next``` 、 ```current->next->next```，再藉由 ```list_del``` 先移除第一個節點再用 ```list_add``` 加入到第二個節點後 ```cpp /* Swap every two adjacent nodes */ void q_swap(struct list_head *head) { // https://leetcode.com/problems/swap-nodes-in-pairs/ if (!head || list_empty(head)) return; struct list_head *current = head; while (current->next != head && current->next->next != head) { struct list_head *move = current->next; list_del(move); list_add(move, current->next); current = move; } return; } ``` 好像可以用 ```list_move_tail``` 讓程式碼更簡潔：　（待改） ```diff - list_del(move); - list_add(move, current->next); + list_move_tail(move, current->next->next); ``` 發現好像可以直接用 ```q_reverseK``` 實做只需要 ```q_reverseK(head, 2)``` ### ```q_reverse``` >Commit [9787922](https://github.com/sysprog21/lab0-c/commit/9787922e7bace3ad4071dbe8a3984e43f76006a2) > 把第一個元素也就是 ```head->next``` 搬到 ```queue``` 的末尾並用 ```tail``` 紀錄，然後更新 ```queue``` 的 ```tail``` ```cpp /* Reverse elements in queue */ void q_reverse(struct list_head *head) { if (!head || list_empty(head)) return; struct list_head *tail = head; while (head->next != tail) { list_move_tail(head->next, tail); tail = tail->prev; } return; } ``` 發現好像可以用```q_reverseK``` 也就是 ```q_reverseK(head, q_size(head))``` ### ```q_reverseK``` >Commit [9787922](https://github.com/sysprog21/lab0-c/commit/9787922e7bace3ad4071dbe8a3984e43f76006a2) ```cpp /* Reverse the nodes of the list k at a time */ void q_reverseK(struct list_head *head, int k) { // https://leetcode.com/problems/reverse-nodes-in-k-group/ if (k == 1 || !head || list_empty(head) || list_is_singular(head)) return; struct list_head *prev = head, *current = head->next; int itGroup = k - 1, counter = q_size(head) / k; while (counter != 0) { itGroup--; list_move(current->next, prev); if (itGroup == 0) { itGroup = k - 1; counter--; prev = current; current = current->next; } } return; } ``` ### ```q_sort``` >Commit [aad1788](https://github.com/sysprog21/lab0-c/commit/aad17887702d27926bcea6ded5e98a7daa120ce7) 採用 merge sort 但實做方式使用指標第一步是 partition 須產生指向左邊右邊 partition 第一個元素的指標 ![merge_sort_partition](https://hackmd.io/_uploads/B1tmQQPsyg.png) ```cpp // q_sort step 1 separate queue into left and right part struct list_head *mergeSort_partition(struct list_head *head, struct list_head *tail, bool descend) { if (head->next == tail) return head; struct list_head *fast = head, *middle = head; // becarful singular empty NULL while (fast->next != tail && fast->next->next != tail) { fast = fast->next->next; middle = middle->next; } struct list_head *left = mergeSort_partition(head, middle->next, descend); struct list_head *right = mergeSort_partition(middle->next, tail, descend); return mergeSort_merge(left, right, tail, descend); } ``` merge 中止條件： 1. ```left``` == ```right``` 左邊的 ```queue``` 沒元素了 2. ```right``` == ```tail``` 右邊的 ```queue``` 沒元素了 ![merge_sort_merge](https://hackmd.io/_uploads/SyULqXwskl.png) ```cpp // q_sort step 2 merge left and right queue struct list_head *mergeSort_merge(struct list_head *left, struct list_head *right, struct list_head *tail, bool descend) { const struct list_head *tmp = left->prev; while (left != right && right != tail) { const char *left_value = list_entry(left, element_t, list)->value; const char *right_value = list_entry(right, element_t, list)->value; if ((descend && strcmp(left_value, right_value) > 0) || (!descend && strcmp(left_value, right_value) < 0) || strcmp(left_value, right_value) == 0) { left = left->next; } else { right = right->next; list_move_tail(right->prev, left); } } return tmp->next; } ``` >Commit [7d711af](https://github.com/sysprog21/lab0-c/commit/7d711af1c94c2a53663c22dd89a6d9f86f84dbf5) ### ```q_ascend``` / ```q_descend``` >Commit [cfd347b](https://github.com/sysprog21/lab0-c/commit/cfd347b115a89685fa84daeed4580ce75bab8317) ```cpp int q_ascend(struct list_head *head) { // https://leetcode.com/problems/remove-nodes-from-linked-list/ if (!head || list_empty(head) || list_is_singular(head)) return q_size(head); q_reverse(head); struct list_head *current = head->next; while (current->next != head) { const element_t *current_element = list_entry(current, element_t, list); element_t *next_element = list_entry(current->next, element_t, list); if (strcmp(current_element->value, next_element->value) < 0) { list_del(current->next); q_release_element(next_element); } else { current = current->next; } } q_reverse(head); return q_size(head); } ``` ### ```q_merge``` > Commit [3fb617e](https://github.com/sysprog21/lab0-c/commit/3fb617e40ba8cfbc615abeec72dc21da22ab4b60) question: 如何只用 ```struct list_head *``` 表達整個串列架構? 如何得到每個 ```queue```? 16 - 40 行 queue.h ```clike /** * element_t - Linked list element * @value: pointer to array holding string * @list: node of a doubly-linked list * * @value needs to be explicitly allocated and freed */ typedef struct { char *value; struct list_head list; } element_t; /** * queue_contex_t - The context managing a chain of queues * @q: pointer to the head of the queue * @chain: used by chaining the heads of queues * @size: the length of this queue * @id: the unique identification number */ typedef struct { struct list_head *q; struct list_head chain; int size; int id; } queue_contex_t; ``` 62 - 65 行 qtest.c ```clike typedef struct { struct list_head head; int size; } queue_chain_t; ``` 從以上兩個程式我推測 ```queue``` 的結構如下： ![queue_structure](https://hackmd.io/_uploads/Hk9QaoEj1e.png) 那知道整個結構後就可以用 ```list_entry``` 取出想要的元素了 ```cpp void merge(struct list_head *first, struct list_head *second, bool descend) { struct list_head *first_current = first->next; struct list_head *second_current = second->next; while (first_current != first && second_current != second) { const char *first_value = list_entry(first_current, element_t, list)->value; const char *second_value = list_entry(second_current, element_t, list)->value; if ((descend && strcmp(first_value, second_value) > 0) || (!descend && strcmp(first_value, second_value) < 0)) { first_current = first_current->next; } else { second_current = second_current->next; list_move_tail(second_current->prev, first_current); } } if (second_current != second) { list_splice_tail_init(second, first); } } /* Merge all the queues into one sorted queue, which is in ascending/descending * order */ int q_merge(struct list_head *head, bool descend) { // https://leetcode.com/problems/merge-k-sorted-lists/ if (!head || head->next == head) return 0; struct list_head *target = head->next->next; struct list_head *first = list_entry(head->next, queue_contex_t, chain)->q; while (target != head) { struct list_head *second = list_entry(target, queue_contex_t, chain)->q; merge(first, second, descend); target = target->next; } int size = q_size(first); list_entry(head->next, queue_contex_t, chain)->size = size; return size; } ``` ## 在 ```qtest``` 提供 ```shuffle``` ### 實做 ```q_shuffle``` >利用 Fisher–Yates shuffle 演算法來實作洗牌（shuffle） > 1. 先用 q_size 取得 queue 的大小 len。隨機從 0 ~ (len - 1) 中抽出一個數字 random > 2. old 將指向從前面數來第 random 個節點，new 會指向最後一個未被抽到的節點，將 old 和 new 指向的節點的值交換，再將 len - 1。 > 3. 隨著 len 大小變小，已經被抽到過，並交換值到 queue 後面的會愈來愈多，直到所有的節點都已經被抽到過，shuffle 就結束。 ```cpp void q_shuffle(struct list_head *head) { if (!head || list_empty(head) || list_is_singular(head)) return; int len = q_size(head); for (struct list_head *new = head->prev; new != head; new = new->prev) { int random = rand() % len; struct list_head *old = head->next; for (int i = 0; i < random; i++) { old = old->next; } if (new == old) continue; element_t* new_element = list_entry(new, element_t, list); element_t* old_element = list_entry(old, element_t, list); char *tmp = new_element->value; new_element->value = old_element->value; old_element->value = tmp; } } ``` 思考如何將 ```q_shuffle``` 的功能加入 ```qtest.c``` 中（參考 ```do_reverse```) 問題 1 : ```exception_setup``` 功能測試程式查看是否發生錯誤，回傳 signal ，讓 ```qtest``` 執行的時候得知是否有操作成功問題 2 : ```set_noallocate_mode``` 功能禁止 alloc memory ```cpp static bool do_shuffle(int argc, char *argv[]) { if (argc != 1) { report(1, "%s takes no arguments", argv[0]); return false; } if (!current || !current->q) report(3, "Warning: Calling shuffle on null queue"); error_check(); set_noallocate_mode(true); if (current && exception_setup(true)) q_shuffle(current->q); exception_cancel(); set_noallocate_mode(false); q_show(3); return !error_check(); } ``` ```cpp ADD_COMMAND(shuffle, "Shuffle queue", ""); ``` ```bash charlie-tsai:~/linux2025/hw/lab0-c$make CC qtest.o qtest.c: In function ‘do_shuffle’: qtest.c:929:9: error: implicit declaration of function ‘q_shuffle’; did you mean ‘do_shuffle’? [-Werror=implicit-function-declaration] 929 | q_shuffle(current->q); | ^~~~~~~~~ | do_shuffle cc1: all warnings being treated as errors make: *** [Makefile:54: qtest.o] Error 1 ``` 解決：在 ```queue.h``` 中加入 ```q_shuffle``` 的宣告在 ```./qtest``` 中可以使用，但是 ```git commit -a``` 出現以下錯誤 ```bash [!] You are not allowed to change the header file queue.h or list.h ``` 解決：因為禁止修改 ```queue.h``` 所以直接把 ```q_shuffle``` 的實做放到 ```qtest.c``` 內 ### 統計方法驗證　```shuffle``` [Pearson's chi-squared test](https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test) 能檢驗虛無假說（Null hypothesis) ，即某特定事件在樣本中觀察到的頻率分佈與特定理論分佈一致，事件必須是互斥且機率為 1 如果要 shuffle 四個不同的數字，會出現24種可能的結果，彼此互斥的，且發生機率加起來為 1。那假設 shuffle 是公平的（24種結果發生的機率相同），並遵守 Uniform distribution，那虛無假說為： - $H_0$ (虛無假說）: shuffle 的結果發生的機率相同，遵守 Uniform distribution - $H_1$（對立假說）: shuffle 的結果發生的機率至少有一個不同 1. 先計算 chi-squared test statistic $X^2$ $$X^2 = \sum_{i=1}^n {(O_i - E_i)^2 \over E_i}$$ - $X^2$: Pearson's cumulative test statistic - $O_i$: the number of observations of type i - $E_i$: the expected count of type i 用　[shuffle_test.py](https://hackmd.io/@sysprog/linux2025-lab0/%2F%40sysprog%2Flinux2025-lab0-d#%E6%B8%AC%E8%A9%A6%E7%A8%8B%E5%BC%8F)　跑出得結果 ```bash Expectation: 41666 Observation: {'1234': 41585, '1243': 41461, '1324': 41674, '1342': 41804, '1423': 41645, '1432': 41761, '2134': 41914, '2143': 41774, '2314': 41500, '2341': 41642, '2413': 41465, '2431': 41826, '3124': 41950, '3142': 41575, '3214': 41452, '3241': 41395, '3412': 41673, '3421': 42184, '4123': 41316, '4132': 41660, '4213': 41868, '4231': 41674, '4312': 41522, '4321': 41680} chi square sum: 21.72860365765852 ``` $X^2$ = 21.72860365765852 2. 決定自由度對於 N 個隨機樣本而言，自由度為 N - 1。我們 shuffle 4 個數字會有24種結果，因為加起來機率為 1 ，所以實際上可以自由變換的變數只有23個，其中一個結果的機率為 1 減去另外23個結果發生的機率，所以自由度為 23。 3. 選擇顯著水準 - [顯著水準（Significance level）](https://en.wikipedia.org/wiki/Statistical_significance)α 代表在虛無假說（$𝐻_0$）為真下，錯誤地拒絕 ($𝐻_0$) 的機率，即型一錯誤發生之機率。 α = P(拒絕 $𝐻_0$ | $𝐻_0$ 為真) 我們 alpha 設定為常見的 0.05。 - [P value](https://zh.wikipedia.org/wiki/P%E5%80%BC) 從卡方分布表找合適的 P value，我們的自由度為 23，$𝑋^2$ 為 21.72860365765852，因為 14.848< 21.72860365765852 < 32.007，於是 P value 介於 0.9 和 1.0 之間。 ![Screenshot from 2025-03-08 20-43-04](https://hackmd.io/_uploads/rJjdihYjkx.png) 4. 統計檢定的結果不拒絕虛無假說對於要拒絕的虛無假說（$𝐻_0$），觀察到的結果必須具有統計顯著性，即觀察到的 P value 小於預先指定的顯著水準 α。因為 P value（0.9~1.0）> alpha（0.05），統計檢定的結果不拒絕虛無假說（$𝐻_0$）也就是樣本資料不拒絕 shuffle 的結果遵循 Uniform distribution，因為沒有足夠證據推翻。從圖表來看 shuffle 的頻率是大致符合 Uniform distribution 的。用　[shuffle_test.py](https://hackmd.io/@sysprog/linux2025-lab0/%2F%40sysprog%2Flinux2025-lab0-d#%E6%B8%AC%E8%A9%A6%E7%A8%8B%E5%BC%8F)　跑出得圖表 ![shuffle_result](https://hackmd.io/_uploads/rkkelYFsyx.png) ## 運用 ```Address Sanitizer``` 除錯執行　```make SANITIZER=1``` 以及 ```make test``` 是　100/100 ## 運用 ```Valgrind``` 除錯 ### 執行 ```make valgrind``` ```bash --- trace-17-complexity 5/5 --- TOTAL 100/100 Test with specific case by running command: scripts/driver.py -p /tmp/qtest.G9xdpd --valgrind -t <tid> ``` ### 運用　```Massif``` ```Massif``` 是分析記憶體使用狀況的工具可分成以下： - Heap blocks：當程式使用 ```malloc```、```calloc```、```realloc``` 來分配動態記憶體時，作業系統會從heap分配一塊記憶體，這就是 Heap block。 - Heap administration blocks 除存內部管理資訊 - Stack sizes來存放函式的區域變數、函式返回地址、參數等的記憶體區域，每個執行緒（thread）通常都有自己的 stack，其大小受限於作業系統。 ## 研讀論文〈[Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf)〉 - 目標：檢測程式碼是否符合常數時間執行的要求 (即使某些實做理論上是常數時間，實際上可能不是) >Even implementations that were supposed to be constant-time turned out not to be so - 優點： 1. 不需要硬體建模 2. black-box testing (不需要檢查 assembly code 且不須依賴編譯器設定) >black-box testing 測試者不需要知道程式的內部運作（如程式碼或演算法），而是根據輸入和輸出來評估系統是否符合預期行為。 - 為什麼需要常數時間？ Timing Attacks 透過測量密碼學實做的執行時間來推測機密資訊 ### method : Timing Leakage Detection 檢測時間執行是否與輸入數據有關，對兩組不同輸入統計，判斷是否有顯著差異 Step 1: Measure execution time 1. Classes definition: fix-vs-random: 第一類 fixed input、第二類　random input 目標：檢測輸入資料是否影響執行時間 2. Cycle counters: x86: TSC (Time Stamp Counter) ARM: SysTick 目標：測量時間 3. Environmental condition 每次測試隨機選擇輸入類別預先分配輸入類別及準備數據 Step 2: Post-processing 1. cropping：去除極端直 discard p-th percentile of data (減少與數據無關雜訊、聚焦於教低執行時間的資料) 2. Higher-order preprocessing Step 3: Statistical Test 1. Welch's t-test：檢測平均直是否不同 2. Kolmogorov-Smirnov (K-S test)：不需要假設數據符合某種分佈，但需要更多數據 ### 解釋 simulation 模式 #### qtest.c 在 ```qtest.c``` 中搜尋 simulation 發現出現在 ```queue_insert``` 跟 ```queue_remove``` 中，而他們分別會呼叫 ```is_insert_tail_const``` 、 ```is_insert_head_const``` 以及 ```is_remove_tail_const``` 、```is_remove_head_const``` 而在 dudect/fixture.c 可以看到 ```cpp #define DUT_FUNC_IMPL(op) \ bool is_##op##_const(void) { return test_const(#op, DUT(op)); } ``` 因此以上四個函式他們會呼叫 ```test_const``` #### fixture.c 的 test_const ```cpp static bool test_const(char *text, int mode) { bool result = false; t = malloc(sizeof(t_context_t)); for (int cnt = 0; cnt < TEST_TRIES; ++cnt) { printf("Testing %s...(%d/%d)\n\n", text, cnt, TEST_TRIES); init_once(); for (int i = 0; i < ENOUGH_MEASURE / (N_MEASURES - DROP_SIZE * 2) + 1; ++i) result = doit(mode); printf("\033[A\033[2K\033[A\033[2K"); if (result) break; } free(t); return result; } ``` 最多進行 ```TEST_TRIES``` 次測試，每輪至少要有 ```ENOUGH_MEASURE``` 筆測資，而每次用 ```(N_MEASURES - DROP_SIZE * 2)``` 筆 random 的資料跟 fix 比較，所以總共進行 ```ENOUGH_MEASURE / (N_MEASURES - DROP_SIZE * 2) + 1``` 次 >[!Note] 問題：為什麼是 ```N_MEASURES - DROP_SIZE * 2``` ? 我認為是為了符合論文中所提及的 Environmental condition，因為在最前面跟最後面的資料可能會有異常值的影響。每輪會呼叫 ```doit``` 函式，並回傳是否符合常數時間 #### fixture.c 的 doit 函式 ```doit``` 中較重要的操作為以下五行 ```cpp prepare_inputs(input_data, classes); bool ret = measure(before_ticks, after_ticks, input_data, mode); differentiate(exec_times, before_ticks, after_ticks); update_statistics(exec_times, classes); ret &= report(); ``` 1. ```prepare_input``` 產生 random vs fix 的資料（輸入幾次、輸入什麼都是 random） class 0 為不輸入 (fix)、class 1 為輸入 random 次 (random)，random string 為輸入的資料 ![prepare_input](https://hackmd.io/_uploads/r1x9xA9iJx.png) ```cpp void prepare_inputs(uint8_t *input_data, uint8_t *classes) { randombytes(input_data, N_MEASURES * CHUNK_SIZE); for (size_t i = 0; i < N_MEASURES; i++) { classes[i] = randombit(); if (classes[i] == 0) memset(input_data + (size_t) i * CHUNK_SIZE, 0, CHUNK_SIZE); } for (size_t i = 0; i < N_MEASURES; ++i) { /* Generate random string */ randombytes((uint8_t *) random_string[i], 7); random_string[i][7] = 0; } } ``` 這個函式就是在執行論文中提到的 Step1: Measure Execution Time 中的 Classes Definition 2. ```measure``` 紀錄 N_MEASURES 次 operation 執行的前後時間 3. ```differentiate``` 計算 operation 執行時間藉由 after_tick - before tick 2 跟 3 都在執行 Step1: Measure Execution Time 中的 Cycle counter 計算 operation 時間 4. ```update_statistics``` 更新此輪 fix (class 0) 跟 random (class 1) 的 operation 執行時間分佈 5. ```ret &= report();``` 計算 t statistic value 查看是否分佈相同 4 跟 5 在執行 Step 3: Statistical Test 如果分佈相同則與輸入沒有關係，此操作為常數時間 ### Student's t-distribution 接下來要解釋上面的第 4 跟 5 如何使用t statistic value ### 新增 percentile 處理首先先了解 percentile 的作用，percentile 主要在實施 step 2 的 pros processing ，從 [程式碼](https://github.com/oreparaz/dudect/blob/master/src/dudect.h)中的註解可以得知由於執行時間分佈可能會受到系統因素影響，我們丟棄異常大的測量值，只保留最快的 x% 測試數據，並對不同的 x 進行多次測試，以提高分析的準確性。接下來看看原來程式碼如何實現： ```cpp static int64_t percentile(int64_t *a_sorted, double which, size_t size) { size_t array_position = (size_t)((double)size * (double)which); assert(array_position < size); return a_sorted[array_position]; } ``` ```percentile``` 中的 ```which``` 代表的是第幾百分位數，所以它回傳的是第 ```which``` 個百分位數的值 ```cpp static void prepare_percentiles(dudect_ctx_t *ctx) { qsort(ctx->exec_times, ctx->config->number_measurements, sizeof(int64_t), (int (*)(const void *, const void *))cmp); for (size_t i = 0; i < DUDECT_NUMBER_PERCENTILES; i++) { ctx->percentiles[i] = percentile( ctx->exec_times, 1 - (pow(0.5, 10 * (double)(i + 1) / DUDECT_NUMBER_PERCENTILES)), ctx->config->number_measurements); } ``` 那在計算第幾百分位數的值前，需要先將 ```exec_time``` 排序， ```prepare_percentiles``` 是計算每個百分位的值。問題：為什麼計算第幾百分為的公式為 ```1 - (pow(0.5, 10 * (double)(i + 1) / DUDECT_NUMBER_PERCENTILES)``` 我先用 matplotlib 分析畫出 ```which``` 的分佈 ![which_i](https://hackmd.io/_uploads/BJSXwO0o1g.png) 可以發現，which 前期的斜率較抖可以接受的資料量較多，因為 constant time 會符合 t 分布在大部分的資料集中於較短時間的資料 >the execution time distribution tends to be skewed towards large timings, leading to a fat right tail. ## 整合網頁伺服器 ### 理解 fd 和 select 的功能 fd 就是所謂的 file descriptor ，因為在 UNIX 系統中，任何東西都可以視為檔案，而 socket 就是利用 UNIX file descriptors 與其他程式溝通 fd_set 是在 UNIX/Linux 的 select() API 中使用的文件描述符集合，用於監聽多個文件描述符（FD，File Descriptor）的狀態，如： - 可讀（readable） - 可寫（writable） - 異常（exceptional conditions）它通常用來監控網路 socket、文件、管道 (pipe)、標準輸入輸出等 I/O 資源，確保程式不會在 read() 或 write() 時阻塞。以下為 fd_set 可使用的函式： - FD_ZERO(&set)：清空 fd_set - FD_SET(fd, &set)：將 fd 加入 fd_set - FD_CLR(fd, &set)：從 fd_set 移除 fd - FD_ISSET(fd, &set)：判斷 fd 是否在 fd_set 中 - select(nfds, &readfds, &writefds, &exceptfds, &timeout)：監聽文件描述符狀態 ```cpp int select(int nfds, fd_set *_Nullable restrict readfds, fd_set *_Nullable restrict writefds, fd_set *_Nullable restrict exceptfds, struct timeval *_Nullable restrict timeout); ``` >select() allows a program to monitor multiple file descriptors, waiting until one or more of the file descriptors become "ready" for some class of I/O operation (e.g., input possible). 從 [linux manual page](https://man7.org/linux/man-pages/man2/select.2.html) 可以得知 ```select``` 是一個多工 I/O 監聽機制，允許程式同時監控多個檔案描述符（file descriptors, fd）eg: nfds = 5, 則會監聽 fd = 0, 1, 2, 3, 4, 5 當這些 fd 有事件發生（可讀、可寫或異常），```select``` 會返回，讓程式處理事件。這樣可以避免程式一直「阻塞」在某個 fd 上，而是可以有效率地監聽多個來源。 >[!note]問題：select 用了什麼機制可以防止程式阻塞在某個 fd ### console.c 實做 #### cmd_select ```cpp static int cmd_select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout) ``` ```cmd_select``` 在 ```select``` 的基礎下又加了一些功能，以下對兩者做一些比較 | 功能 | select | cmd_select | | -------- | -------- | -------- | | 監聽標準輸入 | :ballot_box_with_check:監聽<br/> 0~nfds 的 fd | :ballot_box_with_check:只考慮 STDIN_FILENO 跟 web_fd | |緩衝區處理|:x:無緩衝區處理|:ballot_box_with_check:會先處理緩衝區（buf_stack->fd）| |命令處理|:x:不會執行命令|:ballot_box_with_check:讀取並執行（interpret_cmd(cmdline))| |linenoise 支援|:x:不支援|:ballot_box_with_check:支援 linenoise 提供歷史紀錄| #### do_web 開啟伺服器 socket (web_open(port)) 並監聽設定事件多工處理函式 ```web_eventmux``` #### run_console ```run_console``` 中皆使用 ```cmd_select(0, NULL, NULL, NULL, NULL)``` 形式，因為只須讀取資料，且處理方式只須考慮 STDIN_FILENO 跟 web_fd （忽略 nfds）。此外，因為 ```readfds``` 回傳是 0 所以會直接採用 ```&local_readset``` >[!Note] 問題：既然直接採用 local_readset 且使用 cmd_select 時皆沒有用到傳入的參數，為什麼寫的時候還要傳入參數？我的想法是，可能讓它看起來像 select 增加他的 readability