2025q1 Homework1 (lab0)

# 2025q1 Homework1 (lab0) contributed by <`Eddie-064`> {%hackmd NrmQUGbRQWemgwPfhzXj6g %} ## 開發環境 ```shell= $ uname -a Linux eddie-Macmini8-1 6.13.4-2-t2-noble #2 SMP PREEMPT_DYNAMIC Tue Feb 25 06:26:52 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux $ gcc --version gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 Copyright (C) 2023 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 4 On-line CPU(s) list: 0-3 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i3-8100B CPU @ 3.60GHz CPU family: 6 Model: 158 Thread(s) per core: 1 Core(s) per socket: 4 Socket(s): 1 Stepping: 10 CPU(s) scaling MHz: 23% CPU max MHz: 3600.0000 CPU min MHz: 800.0000 BogoMIPS: 7200.00 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge m ca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 s s ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nons top_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer a es xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpu id_fault epb pti ssbd ibrs ibpb stibp tpr_shadow flexp riority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm arat pl n pts hwp hwp_act_window hwp_epp vnmi md_clear flush_l 1d arch_capabilities Virtualization features: Virtualization: VT-x Caches (sum of all): L1d: 128 KiB (4 instances) L1i: 128 KiB (4 instances) L2: 1 MiB (4 instances) L3: 6 MiB (1 instance) NUMA: NUMA node(s): 1 NUMA node0 CPU(s): 0-3 Vulnerabilities: Gather data sampling: Vulnerable Itlb multihit: KVM: Mitigation: Split huge pages L1tf: Mitigation; PTE Inversion; VMX conditional cache flush es, SMT disabled Mds: Mitigation; Clear CPU buffers; SMT disabled Meltdown: Mitigation; PTI Mmio stale data: Mitigation; Clear CPU buffers; SMT disabled Reg file data sampling: Not affected Retbleed: Mitigation; IBRS Spec rstack overflow: Not affected Spec store bypass: Mitigation; Speculative Store Bypass disabled via prct l Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointe r sanitization Spectre v2: Mitigation; IBRS; IBPB conditional; STIBP disabled; RS B filling; PBRSB-eIBRS Not affected; BHI Not affected Srbds: Mitigation; Microcode Tsx async abort: Not affected ``` - 使用 `git fetch` 取得最新更新： 1. `git remote add upstream https://github.com/sysprog21/lab0-c` 2. `git fetch upstream` 3. 檢查是在 `master` 分支下 4. `git rebase upstream/master` 5. `git push -f` ## 實作指定佇列操作 ### `q_free` 函式的說明為 `Free all storage used by queue` ，透過 `list_for_each_entry_safe` 刪除佇列的每個節點後，要再釋放 `head` 空間。 ```diff void q_free(struct list_head *head) { if (!head) return; element_t *ele, *safe; list_for_each_entry_safe (ele, safe, head, list) { q_release_element(ele); } + list_del(head); + free(head); } ``` 不懂為什麼只有 safe 需要初始化。 ```shell Running static analysis... queue.c:25:5: error: Uninitialized variable: safe [uninitvar] list_for_each_entry_safe (ele, safe, head, list) { ^ ``` 有一個很奇怪的地方，透過 valgrind 跑 free 指令的時候會出現錯誤： ```shell $ ./scripts/driver.py -p ./qtest --valgrind -t 1 -v 4 --- Trace Points +++ TESTING trace trace-01-ops: cmd> new l = [] cmd> ih a l = [a] cmd> free ==147448== Invalid write of size 8 ==147448== at 0x10FE31: list_del (list.h:149) ==147448== by 0x10FE31: q_free (queue.c:28) ``` 但直接跑又不會出問題： ```shell $ make check ./qtest -v 3 -f traces/trace-eg.cmd cmd> new l = [] cmd> ih a l = [a] cmd> free l = NULL cmd> # Exit program cmd> quit Freeing queue ``` ### `q_insert_head`, `q_insert_tail` 在處理字串複製時，需要注意字串是否為 `NULL` 、有沒有包含 `\0` 字元，以及記憶體有沒有分配成功。參考〈 [你所不知道的C語言：指標篇](https://hackmd.io/@sysprog/c-prog/%2Fs%2FHyBPr9WGl#Pointers-vs-Arrays) 〉，教材中談到 `strcpy` 與 `strdup` 差異，`strcpy` 可能會發生 buffer overflow ，而 `strncpy` 可以避免這個狀況發生，而 `strdup` 直接回傳字串的副本，使用起來更加精簡。目前還有下列錯誤訊息需要除錯： ```shell +++ TESTING trace trace-11-malloc: # Test of malloc failure on 'q_insert_head': 'q_new', and 'q_insert_head' Segmentation fault occurred. You dereferenced a NULL or invalid pointer--- trace-11-malloc 0/6 +++ TESTING trace trace-12-malloc: # Test of malloc failure on 'q_insert_tail': 'q_new', 'q_insert_head', and 'q_insert_tail' Segmentation fault occurred. You dereferenced a NULL or invalid pointer--- trace-12-malloc 0/6 ``` 進入 `lldb` debugger 模式後，照著 `trace-11` 步驟進行測試，在執行 `ih gerbil 20` 出現以下錯誤： `Stop reason: signal SIGSEGV: address not mapped to object (fault address: 0x0)` 然後發現其實 `qtest` 有跳出提示： `WARNING: Malloc returning NULL` 也就是在分配 `element_t` 時回傳了 `NULL` ，此時應回傳 `false`。 ```diff @@ -36,6 +36,9 @@ bool q_insert_head(struct list_head *head, char *s) return false; element_t *ele = malloc(sizeof(element_t)); + if (!ele) + return false; ele->value = strdup(s); list_add(&ele->list, head); ``` 重新運行後 `qtest` 出現以下錯誤提示： `ERROR: Failed to save copy of string in queue` 而佇列的最後一個節點變成 `(null)` ，再看到 `qtest.c` 中： ```clike char *cur_inserts = entry->value; if (!cur_inserts) { report(1, "ERROR: Failed to save copy of string in queue"); ``` 雖然題目沒有特別要求空字串的處理，但從這邊的測試可以看出佇列中不能包含有空字串的節點。調整後如下。 ```diff bool q_insert_head(struct list_head *head, char *s) { - if (!head) + if (!head || !s) return false; element_t *ele = malloc(sizeof(element_t)); + if (!ele) + return false; ele->value = strdup(s); + if (!ele->value) { + free(ele); + return false; + } list_add(&ele->list, head); ``` ### `q_remove_head`, `q_remove_tail` 題目要求: ```shell If sp is non-NULL and an element is removed, copy the removed string to *sp (up to a maximum of bufsize-1 characters, plus a null terminator.) ``` 考慮使用 `strncpy` 或 `strndup`，參考 [strncpy man page](https://linux.die.net/man/3/strncpy) ，裡面提到「If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated.」，因此需要開發者自己檢查 `null terminator`。而查閱 [strndup man page](https://linux.die.net/man/3/strndup) 後，發現與 `strncpy` 不同，即使字串大於 n 也會在結尾端補上 `null terminator`，所以實際大小應該是 n + 1。後來發現 `sp` 應該是原本就已經分配好 `buffersize` 的空間，所以這邊適合使用 `strncpy` 實作。另外也可使用 [strnlen](https://man7.org/linux/man-pages/man3/strnlen.3.html) 替代 `strlen`，可節省計算超過 `buffersize` 的額外字串。 ### `q_delete_mid` 使用快慢指標實作，一開始實作時快慢指標初始值皆設為 `head` ，如果佇列只有 3 個節點，慢指標第一個節點就會停下，在第 ⎣n/2⎦ - 1 個節點，主要是因為快指標會先確認下下一個是不是終點造成的特例，改成 `head->next` 可以避免這個特例發生。 ```diff if (!head || list_empty(head)) return false; - struct list_head *slow = head, *fast = head; + struct list_head *slow = head->next, *fast = head->next; for (; fast->next != head && fast->next->next != head; slow = slow->next, fast = fast->next->next) ; list_del(slow); q_release_element(list_entry(slow, element_t, list)); ``` ### `q_delete_dup` 原本在思考要如何比較完整字串，後來看了 `trace-06` 的測資只有針對一個字節測試，因此實作上可以將字節轉成 ASCII 的數字形態，透過 hash table 的方式做比較，來移除重複的節點。在跑 `trace-06` 的時候遇到 segmentation fault ，參考 `README.md` 後看到能用 `debug -a` 分析 core dump，實際操作後發現透過 gdb 工具可以還原 error 發生當下，不僅能夠使用單步查看執行過的每一行(call hierachy)，還能透過 `p` 查看當下變數的數值。 ### `q_swap` 可以透過 `list_move` 完成 ### `q_reverse` 可以透過 `list_for_each_safe` 與 `list_move` 完成 ### `q_reverseK` 使用 `list_cut_position` 搭配 `list_splice_tail` 實作 ### `q_sort` 實作 merge sort 的時候遇到以下錯誤： ```clike! l = [3 1] cmd> sort Program received signal SIGSEGV, Segmentation fault. 0x000055555555bdc1 in merge_sort (head=0x555555569248, head@entry=<error reading variable: DWARF-2 expression error: Loop detected (257).>, descend=descend@entry=false) at queue.c:255 255 { (gdb) l 250 } 251 return result; 252 } 253 254 static struct list_head *merge_sort(struct list_head *head, bool descend) 255 { ``` 後來發現原因是因為跑 divide 後，尾巴的節點指向錯誤的節點造成迴圈。目前雖然在少量的資料可以正確排序，但在 `trace-14-perf` 測試中出現以下錯誤： ```shell! +++ TESTING trace trace-14-perf: # Test performance of 'q_new', 'q_insert_head', 'q_insert_tail', 'q_reverse', and 'q_sort' Warning: Skip checking the stability of the sort because the number of elements 2000000 is too large, exceeds the limit 100000. ==375152== Invalid read of size 1 ==375152== at 0x4850367: strcmp (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==375152== by 0x10B173: do_sort (qtest.c:628) ==375152== by 0x10E76D: interpret_cmda (console.c:181) ==375152== by 0x10ED32: interpret_cmd (console.c:201) ==375152== by 0x10EFC1: cmd_select (console.c:593) ==375152== by 0x10F89F: run_console (console.c:673) ==375152== by 0x10DB5C: main (qtest.c:1444) ==375152== Address 0xdeadbeef is not stack'd, malloc'd or (recently) free'd ``` 錯誤發生在 `qtest.c` 檔案內，可能是我排序的時候哪裡出問題間接造成影響。中斷在以下位置： ```clike (gdb) up #1 0x0000555555557174 in do_sort (argc=<optimized out>, argv=<optimized out>) at qtest.c:628 628 if (!descend && strcmp(item->value, next_item->value) > 0) { ``` ```shell! dap> p ((element_t *)(((char *)(&item->list))-8))->value (char *) 0x000055555556a7c0 "zebra" dap> p ((element_t *)(((char *)(&item->list)->next)-8))->value (char *) 0x00000000deadbeef "" dap> p ((element_t *)(((char *)(&item->list)->prev)-8))->value (char *) 0x000055555556a730 "zebra" dap> p ((element_t *)(((char *)(&item->list)->prev->prev)-8))->value (char *) 0x000055555556a1e0 "qsdifd" dap> ``` 原來 `0xdeadbeef` 有特別的涵義，參考 [Hexspeak](https://zh.wikipedia.org/zh-tw/Hexspeak#:~:text=0xDEADBEEF%EF%BC%88%E3%80%8Cdead%20beef%E3%80%8D%EF%BC%89,%E9%87%8B%E6%94%BE%E7%9A%84%E8%A8%98%E6%86%B6%E9%AB%94%E5%84%B2%E5%AD%98)。 debug 後，這個也是排序邏輯問題導致，就是不當的節點連結，當原本排序 6 個節點排序完只剩下 4 個節點，後面 qtest 測試就出現以上錯誤。奇怪的是我後面有重新確認節點的連接，不懂為什麼會有節點指向 `0xdeadbeef`，這個問題在 Valgrind 和 Sanitizer 都只能等到 qtest 的 628 行才發現，還沒找到發生問題的根本原因。(考慮使用其他 [sanitizers](https://github.com/google/sanitizers) 做實驗) > Commit [9e2b8e8](https://github.com/Eddie-064/lab0-c/commit/9e2b8e87fbd2527cd896f884c5455f7c99b8d2ba) ```diff @@ -241,7 +241,6 @@ static inline struct list_head *merge_TwoLists(struct list_head *L, if (tmp != L) { tmp->next = R; R->next = L; + tmp = R; } else { R->next = L; if (tmp == result) ``` ### `q_ascend`, `q_descend` 使用 `list_for_each_entry_safe` 疊代存取兩個節點進行比較，實作時遇到以下錯誤： ```shell l = [3 2 1] cmd> ascend Segmentation fault occurred. You dereferenced a NULL or invalid pointer Aborted (core dumped) ``` 仔細檢查後發現 `list_for_each_entry_safe` 的迴圈條件是 `&entry->list ！= head`，而 `safe` 是透過 `list_entry` 推算出下一個節點的起始位置，`list_entry` 實際上是 `container_of` 的重新命名，在解讀 `container_of` 的運作原理時參考到 [Linux 核心原始程式碼巨集: container_of](https://hackmd.io/@sysprog/linux-macro-containerof)，裡面提到幾個我沒留意到的細節： > - `container_of` 中包含 `typeof` 屬於 GNU extension。在使用 `-pedantic` 或某些編譯器選項時，如果程式碼用到 C89/C90 以外的特徵會跳出警告訊息，\_\_extension\_\_ 可以避免編譯器跳出警告，除此之外沒有其他影響。 > - `container_of(ptr, type, member)` 當中的 `const __typeof__(((type *) 0)->member) *(__pmember) = (ptr);` 其實就是在檢查 `ptr` 是不是 member 型別，但並不會檢查 `__pmember - offsetof(type, member)` 輸出後是不是 type 型別。所以如果下一個節點指向 head ，list_entry 回傳的只是減去 offsetof 後推算的地址，隨意操作可能造成不可預期行為。在除錯的過程，因為想觀察運行中的行為，所以設中斷點進行單步執行，但 `qtest` 程式碼中包含有超時發出 `SIGALRM` 的機制，`process handle SIGALRM --stop false --notify false --pass false` 指令可以告訴 `lldb` 忽略這個 signal，gdb 的話 `handle SIGALRM ignore` 即可。在跑 `trace-06` 測試失敗，發現是因為排序順序錯誤，`q_ascend` 是移除所有 value 大於後面節點的節點，從 head 開始檢查需要 backtracking 返回確認有沒有節點的 value 更大，而從 tail 往回就只要每個節點都大於等於下一個節點就好。相當於記錄最小值，只要大於就刪除節點。後來的修正如下： ```diff @@ -180,17 +194,23 @@ int q_ascend(struct list_head *head) { if (!head || list_empty(head)) return 0; - element_t *ele, *safe = NULL; + element_t *ele, *safe; int count = 0, total = 0; - list_for_each_entry_safe (ele, safe, head, list) { + char min = 127; + + for (ele = list_entry((head)->prev, element_t, list), + safe = list_entry(ele->list.prev, element_t, list); + &ele->list != (head); + ele = safe, safe = list_entry(safe->list.prev, element_t, list)) { total++; - if (ele->list.next == head) - break; - if (*safe->value < *ele->value) { + if (*ele->value > min) { list_del(&ele->list); q_release_element(ele); count++; + continue; } + if (*ele->value < min) + min = *ele->value; } return total - count; } ``` ### `q_merge` 實作的過程不知為什麼 merge 一直卡在 `list_for_each` 裡面出不來？ > 最後發現是因為 `list_entry` 取錯記憶體位置，因為使用 `list_for_each` 並不會 dereference 所以沒造成 segment fault，變成無窮迴圈了。測試命令為： ```shell! cmd> new l = [] cmd> merge ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient l = [] ``` ```clike int q_merge(struct list_head *head, bool descend) { if (!head || list_empty(head)) return 0; int count = 0; queue_contex_t *chain = list_entry(head, queue_contex_t, chain); struct list_head *merge_queue; merge_queue = chain->q; if (!merge_queue || list_empty(merge_queue)) return 0; struct list_head *pos; list_for_each (pos, merge_queue) { count++; } ``` 進入除錯模式查看 `merge_queue`，發現 `list_head * merge_queue = {prev:0x0000555555569110, next:0x00005555555654a0}`，重新查看 `q_new` 初始化的 adress 及比對 `q_merge` 中看到的 head address ，發現到了 `q_merge`，merge_queue 的 next 居然指向其他某個不明的位置，而 `prev` 和在 `q_new` 看到的一樣，仍然是指向自己。後來往 `qtest.c` 查看才發現這個 head 是 `queue_chain_t` 的成員，而不是 `queue_contex_t` 的成員，所以上面的 `chain` 指向的位置是一個未知的位子，dereference 會造成不可預期的結果。(使用 `container_of` 要小心，不要存取到錯誤的位置，即使沒造成 segment fault，也可能會像這種案例進入無窮迴圈) ## 論文〈Dude, is my code constant time?〉先以 `q_insert_head` 為 use case 了解實作的脈絡。首先看到 `qtest.c`->`queue_insert`->`is_insert_head_const`->`test_const`->`do_it`，在 `do_it` 這個函式中實作了 `dutect` 的 `leakage detection test` 測試程式，以下為個步驟實作細節： 1. `prepare_input` 這裡和 `dutect` 實作一樣隨機將一部分的 `input_data` 清空，其餘的 `input_data` 為隨機生成。`random_string` 是原實作中沒用到相同或對應的數據，一樣隨機生成。 - `input_data` 每個測試單位的 `chunk size` 為 2 byte - `random_string` 每個測試字串長度為 8 byte 2. `measure` 在直接測量 `q_insert_head` 前，從程式碼的流程來看，先是隨機生成 `s` 字串、新增佇列，接著： ```clike dut_insert_head( get_random_string(), *(uint16_t *) (input_data + i * CHUNK_SIZE) % 10000); ``` 因為 `input_data` 是前面隨機生成的，所以這邊會取得一個 16 bits 的隨機數字，帶入以下 macro: ```clike #define dut_insert_head(s, n) \ do { \ int j = n; \ while (j--) \ q_insert_head(l, s); \ } while (0) ``` 會執行 n 次 `q_insert_head`，跑完之後才是測量一次的 `q_insert_head` ，最後檢查測量前後的 `q_size`。這裡不太懂的地方是為什麼在測量之前要先執行 n 次的 `q_insert_head`? 還有就是這個函式測試數據為什麼要丟掉頭跟尾巴？ `for (size_t i = DROP_SIZE; i < N_MEASURES - DROP_SIZE; i++)` 最後程式碼整個看下來，`DROP_SIZE` 所影響的數值應該只有 `input_data` 是從中間開始取，猜測可能和亂數的亂度有關？ 3. `differentiate` 和論文的實作一樣，計算 execution time 4. `update_statistics` 這裡和論文的實作中有兩個不同點。a. 原實作中丟棄掉前 10 個樣本，`lab0` 沒有。b. 原實作中還有計算 percentile 及 cropped 部分。 5. `report` 這裡會計算 $t = \frac{\bar{X_0} - \bar{X_1}}{\sqrt{\frac{s^2_1}{N_1} + \frac{s^2_2}{N_2}}}$，最後比較 `threshold` 小於等於 10 才有可能是 `constant time`。論文中提到： > We discard measurements that are larger than the p percentile, for various values of p. The rationale here is to test a restricted distribution range, especially the lower cycle count tail. The upper tail may be more influenced by data-independent noise. 以及註腳： > The exact values for p follow an inverse exponential trend 論文實作的 percentile 函式會先將 execu_time 由小排到大，而根據上面註腳 p 為 inverse exponential，所以大部分 t 值的 threshold 會在執行時間較大的位置，疑惑的是這樣不是就等於包含了較多 data-independent noise? > 原本覺得 noise 越多，越可能造成 t 超過臨界值，但其實考慮 $t = \frac{\bar{X_0} - \bar{X_1}}{\sqrt{\frac{s^2_1}{N_1} + \frac{s^2_2}{N_2}}}$ 公式，當 noise 增加會導致 $s^2$ 與 $\bar{X_{0}}$ 增加，不過分子還是分母增加的多有待分析。問了 AI 後得到以下解釋： > ![output](https://hackmd.io/_uploads/HJFr2nST1l.png) 這張圖顯示了整體執行時間的分布，並標出了幾個常用的 percentile threshold（90%、95%、98%、99%）：這些虛線（threshold）都位於右側的 upper tail。實際上 dudect 會丟掉右側超過這些 threshold 的資料，只保留左側的 sample 來進行統計分析。這樣做的目的是避免受 upper tail 中 noisy outlier 的影響，從而提高 t-test 的穩定性與可信度。疑惑的點還有 t-test 設定的臨界值 4.5, 10, 500 從何而來？對應到的 significance level $\alpha$ 又是多少？接著參考論文實作的 percentile 加入 lab0，還是沒辦法通過 constant time，實驗之後發現 `measure` 函式中： ```clike dut_insert_head( get_random_string(), *(uint16_t *) (input_data + i * CHUNK_SIZE) % 10000); ``` 如果隨機產生的節點小於 500，可以通過，不太懂為什麼節點長度會影響結果。 > 後來想想，之所以會有這個結果可能是在測量之前大量的節點添加運算導致的 noise，考慮將 $l$ 放在 pre-processing 先處理完成，以此避免過多 noise。看了 `get_random_string` 函式實作後，發現是在已經生成的 150 個隨機字串中，按照每次呼叫順序取得。而每次疊代採樣之前都會先取得測量要用的字串，然後生成節點，呼叫 `get_random_string` 函式 `*(uint16_t *) (input_data + i * CHUNK_SIZE) % 10000)` 次，這樣下一次取得測量的字串時會依賴於前一次的 `random_string_iter`。根據[中央極限定理](https://zh.wikipedia.org/zh-tw/%E4%B8%AD%E5%BF%83%E6%9E%81%E9%99%90%E5%AE%9A%E7%90%86)來自相同的母體的樣本平均值分佈會與常態分佈相似，隨著樣本數量越大越區近於標準常態分佈。下圖來自 AI 的範例，當 t 值超過臨界值，則可以說兩樣本有顯著差異，可能來自不同母體，也就可能包含 timing leakage。 ![output (1)](https://hackmd.io/_uploads/rJBpMpHp1l.png) 下圖也來自 AI，視覺化了 t 分佈和樣本平均數分佈範例，幫助理解。 ![output (4)](https://hackmd.io/_uploads/SJ50iTBT1g.png) 以下為執行 `q_insert_tail` simulation 的執行時間分佈圖 ![image](https://hackmd.io/_uploads/SJHjCIUpJg.png) 案資料生成順序排列 ![image](https://hackmd.io/_uploads/ryGKVwLayg.png) 把數量縮小到一組 `N_MEASURES`，再將相鄰相同 class 的加上虛線，發現 fix class 只要出現第二次執行時間都會大幅下降，好神奇。 ![image](https://hackmd.io/_uploads/HJKJKvL61e.png) `q_remove_head` simulation 的執行時間分佈圖 ![image](https://hackmd.io/_uploads/SkLFyvI6kg.png) 按資料生成順序排列 ![image](https://hackmd.io/_uploads/B1jT7v8p1x.png) ## 添加 `shuffle` command 參考作業要求及[隱藏的傅立葉轉換](https://hackmd.io/@sysprog/S1jR0mHfN)實作 Fisher–Yates shuffle。熵（entropy）的定義是某個隨機事件平均每個結果發生時所能傳達出的資訊量。例如：一個隨機字串，平均每個 char 所包含的資訊量就是熵。所以當熵越小，代表資料能夠壓縮的比率越高。 ```shell! $ ./qtest cmd> new l = [] cmd> option entropy 1 cmd> ih RAND 10 l = [xcftqgkjd(37.50%) quthknttd(31.25%) ydkax(26.56%) qbmkx(26.56%) ykupiebt(35.94%) khlamqpx(35.94%) uswaldn(32.81%) bfqjpq(26.56%) lggqjyd(29.69%) imurgh(29.69%)] ``` lab0 中實作了 entropy 計算方式，運用定點數方式實作，實際計算的 `entropy` 和 `ent` 有些出入，待進一步分析原因。 > 程式碼中是運用定點數技巧實作，加上沒有補償誤差，entropy 運算完後取百分比才是最後的結果，也就是除以最大值 8（每 byte 最大的 entropy）。 ![image](https://hackmd.io/_uploads/B1q5p4jT1x.png) ```shell! $ python3 tools/entropy.py Expectation: 4166 Observation: {'1234': 4181, '1243': 4133, '1324': 4291, '1342': 4060, '1423': 4209, '1432': 4087, '2134': 4292, '2143': 4110, '2314': 4039, '2341': 4340, '2413': 4087, '2431': 4188, '3124': 4151, '3142': 4098, '3214': 4140, '3241': 4136, '3412': 4411, '3421': 4072, '4123': 4161, '4132': 4213, '4213': 4188, '4231': 4186, '4312': 4113, '4321': 4114} chi square sum: 46.16514642342775 ``` 卡方檢定可以用於檢測觀察值是否符合理論上的分佈。令 $H_0$ 為 shuffle 的每個排列發生機率相同，為 Uniform distribution。 $H_1$ 對立假說就是「shuffle 所有可能排列中存在至少一個發生機率不同」。參考 [卡方分布表](https://www.obhrm.net/index.php/%E5%8D%A1%E6%96%B9%E5%88%86%E5%B8%83%E8%A1%A8_Chi-Square_Probabilities)：自由度為 $4! -1 = 23$，$X^2 = 46.165 > 35.172$，$X^2$ 落在小於 0.01 的 p value，假設顯著水準 $\alpha$ 為 0.05，因為 $p < 0.01 < \alpha = 0.05$ 因此拒絕 $H0$，這次實作應該有很大問題。修改亂數取得方式： ```diff! diff --git a/qtest.c b/qtest.c index 16e91f4..5c9dcd5 100644 --- a/qtest.c +++ b/qtest.c @@ -716,9 +716,8 @@ static void q_shuffle(struct list_head *head) for (new = head->prev, safe = head; new != head; safe = safe->prev, new = safe->prev) { uint64_t count = 0; - uint16_t random; - randombytes((uint8_t *) &random, sizeof(uint16_t)); - int position = random % (len - 1); + double random = ((double) rand()) / ((double) RAND_MAX + 1); // 0 <= random < 1 + int position = (int) (random * (len - 1)); list_for_each(old, head) { if (count++ == position) { len--; ``` ![image](https://hackmd.io/_uploads/SyJB4UiaJx.png) ```shell! Expectation: 4166 Observation: {'1234': 4111, '1243': 4174, '1324': 4151, '1342': 4260, '1423': 4122, '1432': 4224, '2134': 4210, '2143': 4140, '2314': 4158, '2341': 4080, '2413': 4122, '2431': 4187, '3124': 4089, '3142': 4169, '3214': 4141, '3241': 4269, '3412': 4274, '3421': 4081, '4123': 4295, '4132': 4071, '4213': 4188, '4231': 4161, '4312': 4192, '4321': 4131} chi square sum: 22.572251560249644 ``` $14.848 < X^2 = 22.572 < 32.007$， p value 介於 0.9 與 0.1 之間，大於 $\alpha = 0.05$，因此無法拒絕虛無假設 $H_0$。目前參考的[卡方分布表](https://www.obhrm.net/index.php/%E5%8D%A1%E6%96%B9%E5%88%86%E5%B8%83%E8%A1%A8_Chi-Square_Probabilities)沒有列出 p value 0.9 至 0.1 的詳細表格，待查找其他來源。 ## 研讀 Linux 核心原始程式碼的 lib/list_sort.c