
# 2025q1 Homework1 (lab0)
**contributed by <`leonnig`>**

### Reviewed by `HeatCrab`
> Test 17 in `make test`, which checks whether it, ih, rt, and rh run in constant time, was extremely slow and, unsurprisingly, did not pass. I therefore used valgrind with massif to look into the problem.
>
> ![image](https://hackmd.io/_uploads/B1IyQQd3Jl.png)
>
> To be completed

Memory usage is probably not the main reason test 17 fails, and it is a pity that the figure is left without any analysis. The dudect analysis further down is very thorough, but it stops before reaching the actual root cause of the test 17 failure. Once that part is finished you should basically get to see Kirby, so it is a shame it stops there.

{%hackmd NrmQUGbRQWemgwPfhzXj6g %}

## Development environment
```shell
$ gcc --version
gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0

$ lscpu
Architecture:           x86_64
CPU op-mode(s):         32-bit, 64-bit
Address sizes:          39 bits physical, 48 bits virtual
Byte Order:             Little Endian
CPU(s):                 8
On-line CPU(s) list:    0-7
Vendor ID:              GenuineIntel
Model name:             11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
CPU family:             6
Model:                  140
Thread(s) per core:     2
Core(s) per socket:     4
Socket(s):              1
Stepping:               1
CPU(s) scaling MHz:     23%
CPU max MHz:            4200.0000
CPU min MHz:            400.0000
BogoMIPS:               4838.40
Virtualization:         VT-x
L1d:                    192 KiB (4 instances)
L1i:                    128 KiB (4 instances)
L2:                     5 MiB (4 instances)
L3:                     8 MiB (1 instance)
NUMA node0 CPU(s):      0-7
```

## Implementation of the queue operations
### `q_new`
The goal is to create an empty queue whose `next` and `prev` both point to itself. For the C memory-management functions I consulted section 7.24.4 of [ISO/IEC 9899:202y](https://open-std.org/JTC1/SC22/WG14/www/docs/n3467.pdf):
> The pointer returned points to the start (lowest byte address) of the allocated space. If the space cannot be allocated, a null pointer is returned.

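In other words, a failed allocation returns a null pointer, and this case has to be handled. I first call `malloc` to allocate memory (the size of `struct list_head`) and then check whether the allocation succeeded. On failure the function returns `NULL`; otherwise `INIT_LIST_HEAD` from `list.h` initializes `next` and `prev`. The implementation:
```c
struct list_head *q_new()
{
    struct list_head *head = malloc(sizeof(struct list_head));
    if (!head) {
        return NULL;
    }
    INIT_LIST_HEAD(head);
    return head;
}
```
However, after scoring with `make test`, the "test of malloc failure on new" case earned no points: it raised an error saying that some blocks were still allocated. I therefore want to confirm whether `NULL` occupies any memory.

### `q_size`
The goal is to return the number of nodes in the queue (that is, its length). My first instinct was to visit every node and accumulate a counter. Because this assignment uses a circular doubly linked list, however, the `next` of the last node never points to `NULL`, so `NULL` cannot be used to detect the end of the queue. I then found that `list.h` already provides `list_for_each`, which visits each node and stops when `next` points back to `head`. The function first checks whether the queue is empty or `NULL` and returns 0 if so; otherwise it walks the nodes with `list_for_each` and counts them with a counter variable.

### `q_insert_head`
The goal is to insert an element at the front of the queue. A diagram of how I understand it:
![first_insert](https://hackmd.io/_uploads/H1fS8sZoJe.png)

`malloc` allocates memory for the new element; if the allocation fails or the queue is `NULL`, the insertion fails. According to `queue.h`, the string must be copied into `value`, so I used `strcpy`. To make sure the copied string cannot exceed what `value` can hold, `value` is allocated with `strlen(s) + 1` bytes (the extra byte holds '\0'). Finally, the `list_add` API from `list.h` performs the insertion. The initial code:
```c
bool q_insert_head(struct list_head *head, char *s)
{
    element_t *new_node = malloc(sizeof(element_t));
    if(!new_node || !head)
        return false;
    new_node->value = malloc(strlen(s) + 1);
    strcpy(new_node->value, s);
    list_add(&new_node->list, head);
    return true;
}
```
On `git commit`, however, I got the following message:
```shell
Dangerous function detected in /home/user/linux2025/lab0-c/queue.c
35:    strcpy(new_node->value, s);
```
So I consulted the [Common vulnerabilities guide for C programmers](https://security.web.cern.ch/recommendations/en/codetools/c.shtml):
> The strcpy built-in function does not check buffer lengths and may very well overwrite memory zone contiguous to the intended destination.
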
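Because `strcpy` does not check the buffer length and may overwrite adjacent memory, I switched to `strncpy`, which requires appending '\0' explicitly to mark the end of the string.
```c
strncpy(new_node->value, s, strlen(s));
new_node->value[strlen(s)] = '\0';
```
Later I found that some necessary checks were missing: for example, `value` of the new element is allocated with `malloc`, but the result of that allocation was never checked. The revised code:
```diff
+    if (!head || !s)
+        return false;
     element_t *new_node = malloc(sizeof(element_t));
-    if (!new_node || !head)
+    if(!new_node)
         return false;
     new_node->value = malloc(strlen(s) + 1);
+    if(!new_node->value){
+        free(new_node);
+        return false;
+    }
```

### `q_insert_tail`
The implementation is essentially the same as `q_insert_head`; the difference is that `list_add_tail` is used to insert at the end of the queue.

### `q_free`
The initial idea was to traverse every `element` in the queue and release it with `free`. Committing triggered a static-analysis error because the `element` pointer used for releasing was declared outside the traversal loop. Moving the declaration inside the loop made the static analysis pass.
```diff
-    element_t *ele;
     while (cur != head) {
-        ele = container_of(cur, element_t, list);
+        element_t *ele = container_of(cur, element_t, list);
         cur = cur->next;
         q_release_element(ele);
     }
```

### `q_remove_head`
Check whether the `sp` argument is `NULL`. If it is not, use `strncpy` to copy the value of the element being removed into `sp`, then use `list_del` to remove the first node of the queue.

### `q_remove_tail`
The implementation mirrors `q_remove_head`; the only difference is that the target to remove is at the end of the queue.

### `q_delete_mid`
This uses the fast/slow pointer technique: in each loop iteration `ptr` advances one step and `fast` advances two. When the loop ends, the node `ptr` points to is the middle node of the queue. Committing triggered a static-analysis error:
> Running static analysis...
> queue.c:124:27: style: Variable 'fast' can be declared as pointer to const [constVariablePointer]
> for (struct list_head *fast = head->next;

See the section **Qualifiers on pointers (針對指標的修飾)** in [你所不知道的 C 語言：指標篇](https://hackmd.io/@sysprog/c-pointer):
> The pointer itself cannot be changed (constant pointer to variable): `const` comes after `*`
> `char * const pContent;`
> The content pointed to cannot be changed (pointer to constant): `const` comes before `*`
> `const char * pContent;`
> `char const * pContent;`

So although `fast` is pointed at different `list_head` nodes, it never modifies the contents of any `list_head`. Qualifying it with `const` makes the role of `fast` clearer.
```diff
-    for (struct list_head *fast = head->next;
+    for (const struct list_head *fast = head->next;
          fast->next != head && fast->next->next != head;
          fast = fast->next->next)
```

### `q_delete_dup`
This is implemented with `list_for_each_entry_safe`, a macro suited to loops that remove or delete nodes. The loop compares, in order, whether the `value` of the current node is duplicated. Before comparing, I declare three `element_t` pointers:
```c
element_t *ele, *safe, *dup;
```
`ele` is the main traversal pointer. `dup` points to the element after `ele` and is used to check whether that element's value duplicates `ele`'s. If it does, a second traversal starts: `dup` moves forward until it reaches an element whose value differs from `ele->value`, and every duplicate encountered on the way is released directly with `q_release_element`. Doing so may leave `safe` pointing at a node that has already been released, so `safe` must follow `dup` at every step; this guarantees that the address `ele` takes from `safe` in the next iteration is valid. The comparison of `value` uses `strcmp`.
Section 7.27.4.3 of [ISO/IEC 9899:202y](https://open-std.org/JTC1/SC22/WG14/www/docs/n3467.pdf):
> The strcmp function returns an integer greater than, equal to, or less than zero, accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2.
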
A return value of 0 means the two strings have the same content. The initial implementation:
```c
list_for_each_entry_safe(ele, safe, head, list) {
    ptr = (&(ele->list))->next;
    bool check = false;
    if (ptr == head)
        return true;
    dup = list_entry(ptr, element_t, list);
    while (ptr != head && strcmp(dup->value, ele->value) == 0) {
        check = true;
        tmp = ptr->next;
        list_del(ptr);
        q_release_element(dup);
        ptr = tmp;
        dup = list_entry(ptr, element_t, list);
        safe = dup;
    }
    if (check) {
        // ele also needs to be deleted
        list_del(&ele->list);
        q_release_element(ele);
    }
}
```

### `q_swap`
Two main pointers are kept on the first and second node of the pair about to be swapped, and `list_del` together with `list_add` swaps the two adjacent nodes.

### `q_reverse`
Starting from the first node of the queue, each node is removed in turn and re-inserted at the head of the queue. This uses `list_move`, which performs `list_del` and `list_add` in one call. After one pass over the whole queue the reversal is complete. Diagrams:
![reverse1.drawio (1)](https://hackmd.io/_uploads/ryym7m1h1e.png)
![reverse2.drawio](https://hackmd.io/_uploads/rkJgNQk2Jl.png)

### `q_reverseK`
First, `LIST_HEAD` initializes two list heads: one performs the reversal and the other stores the nodes that have already been reversed.
```c
LIST_HEAD(trans); // reverse
LIST_HEAD(tmp);   // store
```
The idea is to remove k nodes from the queue at a time: `list_cut_position` cuts off the first k nodes and moves them into `trans`, `q_reverse` reverses those k nodes, and `list_splice_tail_init` appends them to the tail of `tmp`, which preserves the original order of the groups within the queue.
```c
while (q_size(head) >= k) {
    start = head->next;
    end = start;
    for (int i = 1; i < k; i++) {
        end = end->next;
    }
    list_cut_position(&trans, head, end);
    q_reverse(&trans);
    list_splice_tail_init(&trans, &tmp);
}
```
![reversek1.drawio](https://hackmd.io/_uploads/BkHWRXk2kx.png)

-----

![reversek2.drawio](https://hackmd.io/_uploads/SkiyfN1n1l.png)

----

![reversek3.drawio](https://hackmd.io/_uploads/BJe0MEy2kg.png)

Finally, once `head` is empty, the reversed queue is moved from `tmp` back onto `head`.

### `q_sort`
The sort was originally an insertion sort, but `make test` showed that its time complexity could not be brought down to $O(n \log n)$, and running `qtest` reported that the sort was not stable. I therefore switched to merge sort and wrote two additional functions for it.

#### mergeTwoLists
Purpose: merge two already sorted lists. Both `ascending` and `descending` order must be selectable. In the `ascending` case, when the current value of L1 is less than or equal to the current value of L2, `L1_ele` advances; otherwise `L2_ele` is inserted in front of `L1_ele`. For `descending` the condition has to be inverted, so the test uses `XOR`:
```c
if ((strcmp(L1_ele->value, L2_ele->value) <= 0) ^ descend)
```

| strcmp <= 0 | descend | result |
| -------- | -------- | -------- |
| 1 | 0 | advance `L1_ele` |
| 1 | 1 | insert `L2_ele` before `L1_ele` |
| 0 | 0 | insert `L2_ele` before `L1_ele` |
| 0 | 1 | advance `L1_ele` |

Whatever remains in `L2` at the end is appended to the tail of `L1`. This uses `list_splice_tail_init`, because with the original `list_splice_tail`, calling `mergeTwoLists` from `q_merge` failed. The implementation:
```c
struct list_head *mergeTwoLists(struct list_head *L1,
                                struct list_head *L2,
                                bool descend)
{
    element_t *L1_ele = list_entry(L1->next, element_t, list),
              *L2_ele = list_entry(L2->next, element_t, list), *next;
    while (&L1_ele->list != L1 && &L2_ele->list != L2) {
        if ((strcmp(L1_ele->value, L2_ele->value) <= 0) ^ descend) {
            L1_ele = list_entry(L1_ele->list.next, element_t, list);
        } else {
            next = list_entry(L2_ele->list.next, element_t, list);
            list_move(&L2_ele->list, L1_ele->list.prev);
            L2_ele = next;
        }
    }
    list_splice_tail_init(L2, L1);
    return L1;
}
```

#### mergesort_list
Purpose: split the list at its midpoint and recurse until a list contains only one node, then call `mergeTwoLists` to merge the split lists pairwise. The fast/slow pointer technique finds the middle node, the node after the middle node becomes the split position, a new `list_head` is declared to hold the part that was split off, and both lists then recurse independently. The implementation:
```c
struct list_head *mergesort_list(struct list_head *head, bool descend)
{
    if (!head || list_empty(head) || list_is_singular(head))
        return head;
    struct list_head *slow = head;
    for (const struct list_head *fast = head->next;
         fast != head && fast->next != head; fast = fast->next->next)
        slow = slow->next;
    LIST_HEAD(cut);
    cut.prev = head->prev;
    head->prev->next = &cut;
    slow->next->prev = &cut;
    cut.next = slow->next;
    slow->next = head;
    head->prev = slow;
    struct list_head *left = mergesort_list(head, descend),
                     *right = mergesort_list(&cut, descend);
    return mergeTwoLists(left, right, descend);
}
```

#### q_sort
This function simply calls `mergesort_list`.

### `q_ascend`
Two `struct list_head *` pointers are declared first, one ahead of the other:
```c
struct list_head *cmp = head->next;
struct list_head *ptr = cmp->next;
```
A `while` loop walks the pointers along the list. Each iteration compares the value of the element `cmp` points to with the value of the adjacent element behind it, which `ptr` points to. If the former is not greater than the latter, both pointers advance. If the former is greater, the front element (the one `cmp` points to) is deleted. After a deletion there is no guarantee that the node `ptr` points to still forms an `ascending` sequence with the nodes before it, so after every deletion `cmp` and `ptr` return to the front of the queue and the traversal restarts. The loop ends only when a full traversal deletes nothing, at which point the list is `ascending`.
```c
while (ptr != head) {
    left = list_entry(cmp, element_t, list);
    right = list_entry(ptr, element_t, list);
    if (strcmp(left->value, right->value) > 0) {
        list_del(cmp);
        q_release_element(left);
        cmp = head->next;
        ptr = cmp->next;
    } else {
        cmp = ptr;
        ptr = ptr->next;
    }
}
```

### `q_descend`
The implementation is nearly identical to `q_ascend`. The difference lies in the value comparison: because the order is descending, the front element is deleted when its value is smaller than the value behind it.

### `q_merge`
Before writing this function, one needs to understand the `queue_contex_t` structure and the overall layout of the queues.
![q_merge2.drawio](https://hackmd.io/_uploads/Byuurme21e.png =60%x)

The queues are linked to each other through the `chain` member. To merge the other queues I need access to `q`, so I use `container_of` to obtain the enclosing `queue_contex_t`, which then gives access to its members. My approach uses `list_for_each_safe` to visit each node on the chain, obtains its `queue_contex_t`, and merges the list inside it into the first queue, reusing `mergeTwoLists` from above. The implementation:
```c
struct list_head *q_c = NULL, *safe = NULL;
queue_contex_t *tmp = container_of(head->next, queue_contex_t, chain);
list_for_each_safe(q_c, safe, head) {
    if (q_c != head->next) {
        queue_contex_t *merge = list_entry(q_c, queue_contex_t, chain);
        mergeTwoLists(tmp->q, merge->q, descend);
    }
}
```
I ran into a problem along the way, though. When testing with `qtest`, merging into the first queue triggered an error whenever a list contained more than one node:
> Attempted to free unallocated block.

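As a side illustration of the `container_of` step used in `q_merge` above (it is not a fix for the reported error), the following minimal sketch shows how a pointer to the `chain` member can be mapped back to its enclosing context. The names `demo_contex_t`, `demo_container_of`, and `ctx_from_chain` are hypothetical, and the structure is a simplified stand-in whose field set is an assumption rather than a copy of the real `queue_contex_t`.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the list node type from list.h. */
struct list_head {
    struct list_head *prev, *next;
};

/* Simplified stand-in for queue_contex_t; the field set is an assumption. */
typedef struct {
    struct list_head *q;    /* head of the queue owned by this context */
    struct list_head chain; /* links this context into the chain of queues */
    int size;
    int id;
} demo_contex_t;

/* Recover the enclosing structure from a pointer to one of its members
 * by subtracting the member's byte offset within the structure. */
#define demo_container_of(ptr, type, member) \
    ((type *) ((char *) (ptr) - offsetof(type, member)))

/* Hypothetical helper: map a chain node back to its owning context. */
static inline demo_contex_t *ctx_from_chain(struct list_head *node)
{
    return demo_container_of(node, demo_contex_t, chain);
}
```

Given `demo_contex_t ctx`, the call `ctx_from_chain(&ctx.chain)` returns `&ctx`, which is the same relationship `q_merge` relies on when it walks the chain and reaches each queue through `->q`.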
## git rebase
First add the upstream repository as a remote named `upstream`:
```shell
$ git remote add upstream git@github.com:sysprog21/lab0-c.git
```
Check it with `git remote -v`:
```shell
origin  git@github.com:leonnig/lab0-c (fetch)
origin  git@github.com:leonnig/lab0-c (push)
upstream        git@github.com:sysprog21/lab0-c.git (fetch)
upstream        git@github.com:sysprog21/lab0-c.git (push)
```
Use `git fetch` to update from the remote repo:
```shell
$ git fetch upstream master
```
If there are modified files that have not been committed yet, stash them before rebasing:
```shell
$ git stash
```
Use rebase to move the local commits onto the latest base:
```shell
$ git rebase upstream/master
```
Push the update to GitHub:
```shell
$ git push --force
```
Once the rebase has succeeded, restore the stashed files with `git stash pop`.

## Analyzing memory problems with Valgrind
Test 17 in `make test`, which checks whether it, ih, rt, and rh run in constant time, was extremely slow and, unsurprisingly, did not pass. I therefore used valgrind with massif to look into the problem.
![image](https://hackmd.io/_uploads/B1IyQQd3Jl.png)

To be completed

## Adding a new `shuffle` command to `qtest`
To add a command to `qtest`, I first need to understand how `qtest` works. From [qtest 命令直譯器的實作](https://hackmd.io/@sysprog/linux2025-lab0/%2F%40sysprog%2Flinux2025-lab0-b#qtest-%E5%91%BD%E4%BB%A4%E7%9B%B4%E8%AD%AF%E5%99%A8%E7%9A%84%E5%AF%A6%E4%BD%9C) I learned that each queue command is wrapped in a `cmd_element_t` structure, with the function that carries it out stored in the `operation` member, and that the commands themselves are kept in the `cmd_list` linked list. At first I could not find any queue-related functions in `console.c`; I later found them in `qtest.c`:
> Implementation of testing code for queue code.

How, then, are these queue commands added? `console.h` defines the `ADD_COMMAND` macro, which expands to a call to `add_cmd`. Following the assignment instructions, I therefore need to add `shuffle` in `console_init` to register the new command. This means writing `q_shuffle` in `queue.c` and providing a `do_shuffle` function in `qtest.c` that calls into `queue.c`. The assignment rules forbid modifying `queue.h`, however, so I cannot declare `q_shuffle` there; instead I declare it externally with `extern`.
```c
extern void q_shuffle(struct list_head *head);
```

### q_shuffle
The shuffle is implemented with the [Fisher–Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#The_modern_algorithm). The initial idea was simple: as long as `len` is greater than 0, each round calls `rand()` to obtain a value, which is the position `old` points to in that round. `new` starts at the last node and moves one node toward the front each round, and is swapped with the node `old` points to. For the swap I declare an extra pointer `tmp` that stores the position `new` has to be moved to. The swap itself uses the `list_move` API: first move `old` to just after `new`, then move `new` to just after `tmp`:
```c
tmp = old->prev;
list_move(old, new);
list_move(new, tmp);
new = old->prev;
old = head->next;
len -= 1;
```

### Testing
I ran 1,000,000 shuffles with the [test program](https://hackmd.io/@sysprog/linux2025-lab0/%2F%40sysprog%2Flinux2025-lab0-d#%E6%B8%AC%E8%A9%A6%E7%A8%8B%E5%BC%8F).

| Sample | Observed frequency | Expected frequency | ${(O_i - E_i)^2 \over E_i}$ |
| ------------ | ------------ | ---------- | --------------------------- |
| [1, 2, 3, 4] | 41899 | 41666 | 1.302956847309557 |
| [1, 2, 4, 3] | 41845 | 41666 | 0.768996303940863 |
| [1, 3, 2, 4] | 41798 | 41666 | 0.41818269092305477 |
| [1, 3, 4, 2] | 41914 | 41666 | 1.4761196179138867 |
| [1, 4, 2, 3] | 41432 | 41666 | 1.3141650266404263 |
| [1, 4, 3, 2] | 41811 | 41666 | 0.5046080737291797 |
| [2, 1, 3, 4] | 41579 | 41666 | 0.1816589065425047 |
| [2, 1, 4, 3] | 41616 | 41666 | 0.06000096001536025 |
| [2, 3, 1, 4] | 41099 | 41666 | 7.71585945375126 |
| [2, 3, 4, 1] | 41676 | 41666 | 0.00240003840061441 |
| [2, 4, 1, 3] | 41843 | 41666 | 0.7519080305284884 |
| [2, 4, 3, 1] | 41385 | 41666 | 1.8950943215091443 |
| [3, 1, 2, 4] | 41606 | 41666 | 0.08640138242211876 |
| [3, 1, 4, 2] | 41872 | 41666 | 1.018480295684731 |
| [3, 2, 1, 4] | 41747 | 41666 | 0.15746651946431142 |
| [3, 2, 4, 1] | 41525 | 41666 | 0.47715163442615083 |
| [3, 4, 1, 2] | 41604 | 41666 | 0.09225747611961792 |
| [3, 4, 2, 1] | 41833 | 41666 | 0.6693467095473528 |
| [4, 1, 2, 3] | 41656 | 41666 | 0.00240003840061441 |
| [4, 1, 3, 2] | 41555 | 41666 | 0.29570873133970144 |
| [4, 2, 1, 3] | 41501 | 41666 | 0.6534104545672731 |
| [4, 2, 3, 1] | 41745 | 41666 | 0.14978639658234533 |
| [4, 3, 1, 2] | 41782 | 41666 | 0.322949167186675 |
| [4, 3, 2, 1] | 41677 | 41666 | 0.002904046464743436 |
| Total | | | 20.32021312340997 |

$X^2$ = 20.32021312340997

The test has 24 possible outcomes (the permutations of four elements), so there are 23 degrees of freedom. The significance level α is set to 0.05. From the chi-squared distribution table:
![image](https://hackmd.io/_uploads/SkPJJsahye.png)

Looking up 23 degrees of freedom, 14.8480 < 20.3202 < 32.0069, so the p value lies between 0.9 and 0.1. Because the p value (0.1~0.9) > alpha (0.05), the test does not reject the null hypothesis, and together with the chart of the experiment the result looks roughly consistent with a uniform distribution.
![rand_out_2](https://hackmd.io/_uploads/HkaGljahyg.png =130%x)

## Studying the paper 〈Dude, is my code constant time?〉
### Simulation in lab0-c
Reading `qtest.c`, I found that both `queue_insert` and `queue_remove` additionally check the `simulation` variable to see whether simulation mode is enabled. Taking insert as an example, it contains this expression:
```c
pos == POS_TAIL ? is_insert_tail_const() : is_insert_head_const();
```
It checks the position of the insertion (or removal) and calls the matching `is_insert_XXX_const()`. I found the declaration in `fixture.h`: it comes from the `is_##x##_const(void)` macro, which is expanded through `DUT_FUNCS`, and `constant.h` contains the detailed definition of `DUT_FUNCS`:
```c
#define DUT_FUNCS  \
    _(insert_head) \
    _(insert_tail) \
    _(remove_head) \
    _(remove_tail)
```
It expands into four further macro invocations. Reading further down the same file shows what they expand to:
```c
#define DUT(x) DUT_##x

enum {
#define _(x) DUT(x),
    DUT_FUNCS
#undef _
};
```
`_(x)` is replaced by `DUT(x)`, and `DUT(x)` in turn is replaced by `DUT_##x`. Combined with the `enum`, this first expands to
```c
enum {
    DUT(insert_head),
    DUT(insert_tail),
    DUT(remove_head),
    DUT(remove_tail),
};
```
and after `DUT(x)` is replaced:
```c
enum {
    DUT_insert_head,
    DUT_insert_tail,
    DUT_remove_head,
    DUT_remove_tail,
};
```
`fixture.c` contains the full definition of `is_XXX_const`:
```c
#define DUT_FUNC_IMPL(op)                \
    bool is_##op##_const(void)           \
    {                                    \
        return test_const(#op, DUT(op)); \
    }

#define _(x) DUT_FUNC_IMPL(x)
DUT_FUNCS
#undef _
```
For each of the four operations, it returns the result of running `test_const`. The main job of `test_const` is to run `doit`, which is the function that actually measures whether the operation runs in constant time.

#### The doit function
Because the test should not interfere with normal functionality, simulation mode builds a separate queue for testing. First, `prepare_inputs` prepares the test data by generating random strings.
```c
void prepare_inputs(uint8_t *input_data, uint8_t *classes)
{
    randombytes(input_data, N_MEASURES * CHUNK_SIZE);
    for (size_t i = 0; i < N_MEASURES; i++) {
        classes[i] = randombit();
        if (classes[i] == 0)
            memset(input_data + (size_t) i * CHUNK_SIZE, 0, CHUNK_SIZE);
    }

    for (size_t i = 0; i < N_MEASURES; ++i) {
        /* Generate random string */
        randombytes((uint8_t *) random_string[i], 7);
        random_string[i][7] = 0;
    }
}
```
One of the arguments of this function is `classes`, which appears to correspond to the two classes of test data described in the paper:
> Typically, in a fix-vs-random leakage detection test, the first class input data is fixed to a constant value, and the second class input data is chosen at random for each measurement.

Across the `N_MEASURES` rounds, each round uses `randombit()` to decide whether the test data is fixed or random. `measure` is then called to check each operation and confirm that all four work correctly (taking insert_head as an example):
```c
int before_size = q_size(l);
before_ticks[i] = cpucycles();
dut_insert_head(s, 1);
after_ticks[i] = cpucycles();
int after_size = q_size(l);
dut_free();
if (before_size != after_size - 1)
    return false;
```
It checks the size before and after the insertion to confirm the operation behaves correctly, and it also records the CPU cycle count before and after the insertion, which the later measurement needs. The `ret` variable holds the result of this check.
```c
bool ret = measure(before_ticks, after_ticks, input_data, mode);
```
Next comes `differentiate`, which takes the difference in CPU cycles before and after each insert/remove as the execution time and stores it.
```c
static void differentiate(int64_t *exec_times,
                          const int64_t *before_ticks,
                          const int64_t *after_ticks)
{
    for (size_t i = 0; i < N_MEASURES; i++)
        exec_times[i] = after_ticks[i] - before_ticks[i];
}
```
Then `update_statistics` runs. Through `t_push`, it computes the quantities needed for the mean and variance of each class. This uses [Welford's online algorithm](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm), mentioned in the paper, which updates the mean and variance incrementally as each new sample arrives, without first collecting the whole data set. I think this is also why it is called **online**. The quantity computed here is the accumulated sum of squared deviations, with the formula:

$S_n = S_{n-1} + (x_n - M_{n-1})(x_n - M_n)$

`t_push` implements exactly this formula:
```c
/* Welford method for computing online variance
 * in a numerically stable way.
 */
double delta = x - ctx->mean[class];
ctx->mean[class] = ctx->mean[class] + delta / ctx->n[class];
ctx->m2[class] = ctx->m2[class] + delta * (x - ctx->mean[class]);
```
Here `delta` corresponds to $x_n - M_{n-1}$. $S_n$ is the accumulator used to compute the variance and $M_n$ is the mean. With the Welford method, when a new sample $x_n$ arrives, the mean is updated as:

$M_n = M_{n-1} + \frac{x_n - M_{n-1}}{n}$

which corresponds to the update of `ctx->mean[class]` above. So the job of `update_statistics` is to prepare the variables for the upcoming t-test. After that, `report` runs to compute the t value.
```c
double max_t = fabs(t_compute(t));
```
This uses Welch's t-test from the paper:

$t = \frac{\bar{X_0} - \bar{X_1}}{\sqrt{\frac{Var_0}{N_0} + \frac{Var_1}{N_1}}}$

which corresponds to the implementation in `t_compute`:
```c
static double t_compute(ttest_ctx_t *ctx)
{
    double var[2] = {0.0, 0.0};
    var[0] = ctx->m2[0] / (ctx->n[0] - 1);
    var[1] = ctx->m2[1] / (ctx->n[1] - 1);
    double num = (ctx->mean[0] - ctx->mean[1]);
    double den = sqrt(var[0] / ctx->n[0] + var[1] / ctx->n[1]);
    double t_value = num / den;
    return t_value;
}
```
As in the paper, the code then checks whether $t$ exceeds the threshold; if it does, the execution time is judged not to be constant. `lab0-c` lacks the percentile processing, so I tried to bring the percentile handling of the paper's implementation in:
```c
static int cmp(const int64_t *a, const int64_t *b)
{
    if (*a == *b)
        return 0;
    return (*a > *b) ? 1 : -1;
}
```
This function is used with `qsort` to sort in ascending order.

---

```c
static int64_t percentile(int64_t *a_sorted, double which, size_t size)
{
    size_t array_position = (size_t)((double)size * (double)which);
    assert(array_position < size);
    return a_sorted[array_position];
}
```
`percentile` looks up the value at a given percentile in the sorted array and returns it.

---

In `dudect`, both the percentiles and the execution times are members of the `dudect_ctx_t` structure. `lab0-c` does not define that structure and stores the data in arrays instead, so I allocate one more array dedicated to the percentile values. `lab0-c` performs no `percentile` processing on the data and calls `t_push` directly; here I bring in the `cropping` part of `dudect`, which discards the values above these percentiles.
```diff
 for (size_t i = 0; i < N_MEASURES; i++) {
     int64_t difference = exec_times[i];
     /* CPU cycle counter overflowed or dropped measurement */
     if (difference <= 0)
         continue;

     /* do a t-test on the execution time */
-    t_push(t, difference, classes[i]);
+    //t_push(t, difference, classes[i]);
+    /* cropping */
+    for (size_t crop_index = 0; crop_index < NUMBER_PERCENTILES; crop_index++){
+        if (difference < percentiles[crop_index]){
+            t_push(t, difference, classes[i]);
+        }
+    }
 }
```

## Studying [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) in the Linux kernel source
Reference notes: [chiangkd](https://hackmd.io/erWlfVMfQqyUe9JVbOLlBA?view#%E7%A0%94%E8%AE%80-Linux-%E6%A0%B8%E5%BF%83%E5%8E%9F%E5%A7%8B%E7%A8%8B%E5%BC%8F%E7%A2%BC%E7%9A%84-list_sortc), [kdnvt](https://hackmd.io/@kdnvt/linux2022_lab0#%E7%A0%94%E8%AE%80-liblist_sortc)

The file contains three main functions: `merge`, `merge_final`, and `list_sort`.

### merge
It uses an indirect pointer to compare the nodes of two linked lists and links them together in order while keeping the merge stable. It differs slightly from a typical merge: when the pointer of one list runs out first (becomes `NULL`), the rest of the other list is attached and the function returns the smallest node of the merged list, which is its head. The function is annotated with `__attribute__((nonnull(2,3,4)))`, so I consulted [Declaring Attributes of Functions in the GCC manual](https://gcc.gnu.org/onlinedocs/gcc-4.4.7/gcc/Function-Attributes.html). `__attribute__` lets the programmer attach properties to a function declaration so that the compiler can check the code more carefully. The attribute used here is `nonnull`:
> The nonnull attribute specifies that some function parameters should be non-null pointers.

That is, the compiler checks whether the specified function parameters are null. In `merge` it is used as follows:
```c
__attribute__((nonnull(2,3,4)))
static struct list_head *merge(void *priv, list_cmp_func_t cmp,
                               struct list_head *a, struct list_head *b)
```
This declares that the 2nd, 3rd, and 4th parameters of the function must not be null.

### merge_final
This function joins `list` and `pending` together. Besides doing the same work as `merge`, it reconnects the `prev` pointer of each node along the way, turning the list from singly linked back into doubly linked.
```c
static void merge_final(void *priv, list_cmp_func_t cmp, struct list_head *head,
                        struct list_head *a, struct list_head *b)
{
    struct list_head *tail = head;
    u8 count = 0;

    for (;;) {
        /* if equal, take 'a' -- important for sort stability */
        if (cmp(priv, a, b) <= 0) {
            tail->next = a;
            a->prev = tail;
            tail = a;
            a = a->next;
            if (!a)
                break;
        } else {
            tail->next = b;
            b->prev = tail;
            tail = b;
            b = b->next;
            if (!b) {
                b = a;
                break;
            }
        }
    }

    /* Finish linking remainder of list b on to tail */
    tail->next = b;
    do {
        /*
         * If the merge is highly unbalanced (e.g. the input is
         * already sorted), this loop may run many iterations.
         * Continue callbacks to the client even though no
         * element comparison is needed, so the client's cmp()
         * routine can invoke cond_resched() periodically.
         */
        if (unlikely(!++count))
            cmp(priv, b, b);
        b->prev = tail;
        tail = b;
        b = b->next;
    } while (b);

    /* And the final links to make a circular doubly-linked list */
    tail->next = head;
    head->prev = tail;
}
```
The comment in the code notes that if the input of this `merge` was already sorted, the loop runs many iterations in which no comparison is needed, and it goes on to say:
> the client's cmp() routine can invoke cond_resched() periodically.

In the `cmp` call above, `b` is compared with `b`, so no real comparison takes place. I believe the reason is that with a very large amount of data the code would stay in this loop and keep occupying the CPU, even though no comparison is actually needed there. Calling `cmp` periodically (since `count` is a `u8`, `!++count` is true once every 256 iterations) gives the client's `cmp` a chance to yield so that other processes can get the CPU.

#### likely & unlikely
These are defined in [compiler.h](https://github.com/torvalds/linux/blob/master/include/linux/compiler.h). The two macros are defined differently depending on `CONFIG_PROFILE_ALL_BRANCHES`; here I assume it is disabled.
```c
# define likely(x)	__builtin_expect(!!(x), 1)
# define unlikely(x)	__builtin_expect(!!(x), 0)
```
Following the note [[Linux Kernel慢慢學]likely and unlikely macro](https://meetonfriday.com/posts/cecba4ef/), `__builtin_expect` tells the compiler which outcome of a branch is more likely, so that the compiler can optimize accordingly. I ran the test from that note:
```c
#include<stdio.h>

#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

void foo();
void bar();

int main(int argc, char *argv[])
{
    if (likely (argc == 2))
        foo();
    else
        bar();
}
```
Written this way, the code tells the compiler that the first condition is very likely true, so the compiler should favor the branch that calls `foo`. I then changed `likely` to `unlikely` and compared the assembly of the two versions:
```shell
$ diff -c likely.s unlikely.s
```
```diff
*** likely.s	2025-04-13 19:59:41.726776737 +0800
--- unlikely.s	2025-04-13 19:59:58.096183689 +0800
***************
*** 12,28 ****
  	.cfi_def_cfa_offset 16
  	xorl	%eax, %eax
  	cmpl	$2, %edi
! 	jne	.L2
! 	call	foo@PLT
  .L3:
  	xorl	%eax, %eax
  	addq	$8, %rsp
  	.cfi_remember_state
  	.cfi_def_cfa_offset 8
  	ret
! .L2:
  	.cfi_restore_state
! 	call	bar@PLT
  	jmp	.L3
  	.cfi_endproc
  .LFE23:
--- 12,28 ----
  	.cfi_def_cfa_offset 16
  	xorl	%eax, %eax
  	cmpl	$2, %edi
! 	je	.L6
! 	call	bar@PLT
  .L3:
  	xorl	%eax, %eax
  	addq	$8, %rsp
  	.cfi_remember_state
  	.cfi_def_cfa_offset 8
  	ret
! .L6:
  	.cfi_restore_state
! 	call	foo@PLT
  	jmp	.L3
  	.cfi_endproc
  .LFE23:
```
Depending on the hint, the compiler rearranges which instruction follows the branch by default: the call expected to run is placed on the fall-through path (`foo` with `likely`, `bar` with `unlikely`), and the other call is moved behind a jump.

### list_sort
`count` records the number of nodes in the pending lists and also decides whether a merge should take place. The code below is the core part:
```c
/* Find the least-significant clear bit in count */
for (bits = count; bits & 1; bits >>= 1)
    tail = &(*tail)->prev;
/* Do the indicated merge */
if (likely(bits)) {
    struct list_head *a = *tail, *b = a->prev;

    a = merge(priv, cmp, b, a);
    /* Install the merged result in place of the inputs */
    a->prev = b->prev;
    *tail = a;
}
```
The code shows that the value of `bits` decides whether a merge happens. If `count + 1` is $2^k$, no merge takes place: `count` is then $2^k - 1$, whose binary representation is ...01111111, so the loop keeps shifting `bits` to the right until it becomes 0, and the `if (likely(bits))` test fails.
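
To make the merge rule above concrete, here is a minimal sketch that reproduces only the bit scan on `count`, without any list manipulation. The helper names `merges_at` and `merges_before_final` are hypothetical and do not appear in `lib/list_sort.c`; the sketch assumes, as in the excerpt, that the scan runs once per incoming element with `count` equal to the number of elements already pending.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if the bit scan from list_sort would
 * trigger a merge when `count` elements are already pending, else 0.
 * Trailing 1 bits are shifted out; a merge happens only if something
 * is left, i.e. count + 1 is not a power of two.
 */
static inline int merges_at(size_t count)
{
    size_t bits;
    for (bits = count; bits & 1; bits >>= 1)
        ;
    return bits != 0;
}

/*
 * Hypothetical helper: number of merges triggered by the scan while
 * n elements are moved to the pending lists (count runs 0 .. n-1).
 */
static inline size_t merges_before_final(size_t n)
{
    size_t merges = 0;
    for (size_t count = 0; count < n; count++)
        merges += merges_at(count);
    return merges;
}
```

For `count` = 0 through 7 the scan yields no merge at 0, 1, 3, and 7 (where `count + 1` is 1, 2, 4, 8) and a merge at 2, 4, 5, and 6, which matches the $2^k$ rule described above.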
