# 2024q1 Homework1 (lab0)
contributed by < `shlin41` >
:::danger
Do not misspell your own GitHub account name. Pay attention to details of every kind; only by valuing the small things and proceeding step by step can you take on the Linux kernel, whose source code exceeds thirty-seven million lines.
:::
## Development Environment
```shell
$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 5 5600H with Radeon Graphics
CPU family: 25
Model: 80
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 1
Stepping: 0
CPU max MHz: 4280.0000
CPU min MHz: 400.0000
BogoMIPS: 6587.55
```
## Assigned Queue Operations
:::danger
Avoid unnecessary list indentation (i.e. `* ` and `- `); write in clear, precise, and fluent prose.
:::
Before implementing, I skimmed `list.h` to confirm which APIs are available. The notes below cover the functions that took extra time to implement.
:::danger
Avoid excessive mixing of Chinese and English. Where an established translated term exists, such as 「鏈結串列」 (linked list), 「節點」 (node), and 「佇列」 (queue), use the translated term; reserve English for variable names, people's names, or scenarios lacking a common translation.
:::
### `q_remove_head`
Among the parameters is a <s>character array</s> string `sp` of length `bufsize`, into which `element->value` must be copied. At first I forgot to append the string terminator at the end of `sp`, which took some time to debug.
```c
if (sp) {
    size_t len = strlen(element->value);
    /* Truncate so the NUL terminator still fits in the buffer */
    if (len > bufsize - 1)
        len = bufsize - 1;
    strncpy(sp, element->value, len);
    sp[len] = '\0';
}
```
### `q_delete_dup`
Use two pointers, `front` and `back`, to delimit the set of elements with equal value, `[front .. back)`. If the set contains more than one element, delete all elements in it.
:::danger
Improve your phrasing.
:::
```c
bool q_delete_dup(struct list_head *head)
{
// https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/
if (!head)
return false;
struct list_head *front = head->next;
struct list_head *back = front->next;
while (front != head) {
element_t *felement = list_entry(front, element_t, list);
/* Find same-value elements from [front ... back) */
while (back != head) {
element_t *belement = list_entry(back, element_t, list);
if (strcmp(felement->value, belement->value) == 0)
back = back->next;
else
break;
}
/* Delete same-value elements if duplicating */
if (back->prev != front) {
struct list_head *root = front->prev;
while (root->next != back) {
struct list_head *temp = root->next;
element_t *telement = list_entry(temp, element_t, list);
list_del(temp);
q_release_element(telement);
}
}
front = back;
}
return true;
}
```
### `q_swap`
A single pointer `node` walks forward two steps at a time until it reaches `head`; after each stride, the two nodes just passed, `first` and `second`, are swapped.
```c
void q_swap(struct list_head *head)
{
// https://leetcode.com/problems/swap-nodes-in-pairs/
    if (!head || list_empty(head))
        return;
struct list_head *node = head->next;
while (node != head && node->next != head) {
node = node->next->next;
struct list_head *first = node->prev->prev;
struct list_head *second = node->prev;
list_move(first, second);
}
}
```
### `q_reverseK`
Use a counter to locate each group of k nodes, then rearrange the nodes within that group into the opposite order. `pseudo_head` records the node just before the current group, so that each node can be spliced to the front of the group with `list_move`.
:::danger
The given queue is circular and doubly linked, so "reversing direction" is not the right description. Improve your wording with precise terms.
:::
```c
void q_reverseK(struct list_head *head, int k)
{
// https://leetcode.com/problems/reverse-nodes-in-k-group/
    if (!head || list_empty(head))
        return;
struct list_head *pseudo_head = head;
struct list_head *node, *safe;
int count = 0;
list_for_each_safe (node, safe, head) {
count = (count + 1) % k;
if (count == 0) {
struct list_head *curr = pseudo_head->next;
struct list_head *next = curr->next;
for (int i = 0; i < k; ++i) {
list_move(curr, pseudo_head);
curr = next;
next = next->next;
}
pseudo_head = safe->prev;
}
}
}
```
### `q_ascend / q_descend`
Implement one generic `q_scend`, together with a comparison helper `cmp_function`, to handle both the ascending and descending cases.
```c
static int q_scend(struct list_head *head, bool descend)
{
    if (!head || list_empty(head))
        return 0;
int count = 0;
struct list_head *node, *safe;
list_for_each_safe (node, safe, head) {
count++;
while (node->prev != head && !cmp_function(node->prev, node, descend)) {
struct list_head *target = node->prev;
list_del(target);
q_release_element(list_entry(target, element_t, list));
count--;
}
}
return count;
}
```
### `q_merge`
The description of `q_merge`:
> This function merge the second to the last queues in the chain into the first queue. The queues are guaranteed to be sorted before this function is called. No effect if there is only one queue in the chain. Allocation is disallowed in this function. There is no need to free the 'qcontext_t' and its member 'q' since they will be released externally. However, q_merge() is responsible for making the queues to be NULL-queue, except the first one.
While implementing, I did not read this carefully and deleted the merged-away `queue_contex_t *second` from the chain entirely, which broke the tests.
The error message:
> +++ TESTING trace trace-03-ops:
> \# Test of insert_head, insert_tail, remove_head, reverse and merge
> ERROR: Removal from queue failed (1 failures total)
> ERROR: Removal from queue failed (2 failures total)
> ERROR: Removal from queue failed (3 failures total)
> ERROR: Removal from queue failed (4 failures total)
> ERROR: Attempted to free unallocated block. Address = 0x563452f82af5
> Segmentation fault occurred. You dereferenced a NULL or invalid pointer--- trace-03-ops 0/6
The fix (line 22 of my implementation): introduce a `struct list_head trash` to hold each `queue_contex_t *second` after it has been merged, and splice `trash` back onto the chain at the end.
```diff
- list_del(&second->chain);
+ list_move(&second->chain, &trash);
```
Final version:
```c
int q_merge(struct list_head *head, bool descend)
{
// https://leetcode.com/problems/merge-k-sorted-lists/
    if (!head || list_empty(head))
        return 0;
struct list_head trash;
INIT_LIST_HEAD(&trash);
while (!list_is_singular(head)) {
struct list_head *trace = head->next;
while (trace != head && trace->next != head) {
trace = trace->next->next;
queue_contex_t *first =
list_entry(trace->prev->prev, queue_contex_t, chain);
queue_contex_t *second =
list_entry(trace->prev, queue_contex_t, chain);
first->size = q_merge_two(first->q, second->q, descend);
second->size = 0;
list_move(&second->chain, &trash);
}
}
list_splice_tail(&trash, head);
return list_first_entry(head, queue_contex_t, chain)->size;
}
```
## Reading the Paper
### Problems Encountered
`make test` passes on my local PC, but the GitHub CI fails `trace-17-complexity`:
```diff
+++ TESTING trace trace-17-complexity:
--- trace-17-complexity 0/5
--- TOTAL 95/100
make: *** [Makefile:60: test] Error 1
```
Consulting [25077667](https://hackmd.io/@25077667/SkpLspWnT)'s assignment write-up, I found this to be a dudect issue.
:::warning
Explain this from the viewpoint of the assigned paper.
:::
### Notes on dudect's sample trace
`void randombytes(uint8_t *x, size_t how_much)`:
* Fills the memory at `x` with `how_much` random bytes.
* `1048576 == 2^20` is the maximum number of bytes read from /dev/urandom per read.
`uint8_t randombit(void)`:
* Produces a single random bit.
`void prepare_inputs(dudect_config_t *c, uint8_t *input_data, uint8_t *classes)`:
* First fills `input_data` with random bytes, then splits the measurements into two classes:
* `classes[0]`: the random bytes are overwritten with the fixed value 0x00.
* `classes[1]`: the random bytes are kept.
`int dudect_init(dudect_ctx_t *ctx, dudect_config_t *conf)`:
* Initializes the `dudect_ctx_t`.
```c
typedef struct {
dudect_config_t *config;
/* number_measurements = 500 */
int64_t *ticks;
int64_t *exec_times;
uint8_t *input_data;
uint8_t *classes;
/* DUDECT_TESTS = 1+DUDECT_NUMBER_PERCENTILES+1 */
ttest_ctx_t *ttest_ctxs[DUDECT_TESTS];
/* DUDECT_NUMBER_PERCENTILES = 100 */
int64_t *percentiles;
} dudect_ctx_t;
```
Three default constants:
* `number_measurements (MEASUREMENTS_PER_CHUNK)`: default 500
* `DUDECT_TESTS`: default 1 + 100 + 1
* `DUDECT_NUMBER_PERCENTILES`: default 100
`dudect_state_t dudect_main(dudect_ctx_t *c)`:
* The main testing routine.
`static void measure(dudect_ctx_t *ctx)`:
* Runs `do_one_computation()` and records its execution time.
* Worth noting: `ctx->exec_times[number_measurements-1]` is never assigned, so the later `update_statistics()` never uses that value either.
`static void prepare_percentiles(dudect_ctx_t *ctx)`:
* The comment in the code:
> Set different thresholds for cropping measurements. The exponential tendency is meant to approximately match the measurements distribution, but there's not more science than that.
* Uses the results of the first `measure()` to define the thresholds.
`static void update_statistics(dudect_ctx_t *ctx)`:
* Only `exec_times[10] ~ exec_times[number_measurements-2]` are used in the analysis.
* The results are recorded in `ttest_ctxs`:
* `ttest_ctxs[0]`: the statistics computed with no threshold.
* `ttest_ctxs[1..100]`: the statistics computed under the different thresholds (`percentiles[0..99]`).
* `ttest_ctxs[101]`: not yet understood.
`static dudect_state_t report(dudect_ctx_t *ctx)`:
* Decides whether the execution time is constant based on the value of `max_t`:

$$
max\_t = \dfrac{|mean[0] - mean[1]|}{\sqrt{\dfrac{var[0]}{n[0]} + \dfrac{var[1]}{n[1]}}}
$$

> ==Open question==: why not just take $|mean[0] - mean[1]|$ directly?
### Not Yet Sure How to Fix It
`static t_context_t *t` records the statistics of both classes with no cropping threshold applied. What I cannot see is why the `max_t` computed from those statistics still exceeds `t_threshold_moderate`.
`prepare_percentiles()` does not look like the real culprit. In the original dudect code, `max_t` is the maximum over all `ttest_ctxs[i]`, and `ttest_ctxs[0]` is precisely the statistic with no cropping threshold, so `max_t` is always at least the `ttest_ctxs[0]` value. Hence lab0-c should be able to pass the test even without `prepare_percentiles()`.
The fix-and-random tests (taking insert_head as the example):
* fix test:
```c
/* input_data[i] == 0, therefore dut_insert_head(xxx, 0) */
before_ticks[i] = cpucycles();
dut_insert_head(s, 1);
after_ticks[i] = cpucycles();
```
* random test:
```c
dut_insert_head(get_random_string(),
*(uint16_t *) (input_data + i * CHUNK_SIZE) % 10000);
before_ticks[i] = cpucycles();
dut_insert_head(s, 1);
after_ticks[i] = cpucycles();
```
Both versions time the same `dut_insert_head(s, 1)` call, but the purpose of the first statement in the random test is unclear to me.
I suspect the problem lies in how `t_compute()` performs the calculation.
```c
double t_compute(t_context_t *ctx)
{
double var[2] = {0.0, 0.0};
var[0] = ctx->m2[0] / (ctx->n[0] - 1);
var[1] = ctx->m2[1] / (ctx->n[1] - 1);
double num = (ctx->mean[0] - ctx->mean[1]);
double den = sqrt(var[0] / ctx->n[0] + var[1] / ctx->n[1]);
double t_value = num / den;
return t_value;
}
```
I captured a failing run from the GitHub CI and printed out both classes' statistics, $mean,\ var = m2/(n-1),\ n$:
```
meas: 0.01 M, not enough measurements (540 still to go).
meas: 0.01 M, not enough measurements (430 still to go).
meas: 0.01 M, not enough measurements (320 still to go).
meas: 0.01 M, not enough measurements (210 still to go).
meas: 0.01 M, not enough measurements (100 still to go).
meas: 0.01 M, max t: +42.26, max tau: 4.22e-01, (5/tau)^2: 1.40e+02.
class0: mean=42.444444, var=420.014756, n=5031.000000
class1: mean=67.699136, var=1362.694115, n=4979.000000
```
This gives den = 0.597, which makes t_value larger rather than smaller. Doesn't this computation effectively penalize tests with small variance? The smaller the variance, the harder it becomes to pass.