2024q1 Homework1 (lab0)

contributed by < 56han >

Reviewed by `allenliao666`

commit 內文中句首應該要大寫，例如 commit f88e90a
q_ascend 的內文及註解有誤，應該是若右側節點小於當前節點才需標記當前節點為需要刪除。
q_ascend 和 q_descend 有不需雙層迴圈即可完成的實作方法，可以嘗試看看。
開發紀錄寫得很清楚，閱讀起來很舒服。

詳閱作業規範
留意各式細節，唯有重視小處並步步為營，方可挑戰原始程式碼規模超越三千七百萬行的 Linux 核心

Reviewed by `Ackerman666`

q_delete_mid 在雙向佇列中，可從頭尾開始往中間找尋中間點
如此只需走訪一次串列就行可參考 YangYeh-PD 的解說

開發環境

$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

$ lscpu
Architecture:               x86_64
CPU op-mode(s):             32-bit, 64-bit
Address sizes:              39 bits physical, 48 bits virtual
Byte Order:                 Little Endian
CPU(s):                     12
On-line CPU(s) list:        0-11
Vendor ID:                  GenuineIntel
Model name:                 Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
CPU family:                 6
Model:                      158
Thread(s) per core:         2
Core(s) per socket:         6
Socket(s):                  1
Stepping:                   10
CPU max MHz:                4600.0000
CPU min MHz:                800.0000
BogoMIPS:                   6399.96
L1d cache:                  192 KiB (6 instances)
L1i cache:                  192 KiB (6 instances)
L2 cache:                   1.5 MiB (6 instances)
L3 cache:                   12 MiB (1 instance)
NUMA node0 CPU(s):          0-11

雙向鏈結串列結構

指定的佇列操作

`q_new`

定義一個函式 q_new，用於動態配置，並初始化一個空的雙向鏈結串列節點，如果配置失敗則返回 NULL。

struct list_head *q_new()
{
    struct list_head *head = malloc(sizeof(struct list_head));
    if (!head)
        return NULL;

    INIT_LIST_HEAD(head);

    return head;
}

`q_free`

原本寫法如下，但寫完 q_insert_head 之後，反而分數變少。
我認為是配置給 element_t 結構體中 value 欄位的記憶體沒有被釋放。此外，直接釋放 struct list_head 結構體的 pointer 也可能破壞 list 的結構，因為它沒有按照 list 的邏輯結構來逐一走訪每個節點，並釋放每個元素。

避免過多的中英文混用，已有明確翻譯詞彙者，例如「鏈結串列」(linked list) 和「佇列」(queue)，就使用該中文詞彙，英文則留給變數名稱、人名，或者缺乏通用翻譯詞彙的場景。

\to

資訊科技詞彙翻譯

void q_free(struct list_head *head)
{
    if (!head) {
        return;
    }
+   element_t *n, *s;
+   list_for_each_entry_safe (n, s, head, list)
+       q_release_element(n);

-   struct list_head *current = head->next;
-   while (current != head) {
-       struct list_head *tmp = current;
-       current = current->next;
-       free(tmp);
-   }

    free(head);
}

`q_insert_head`

用於在雙向鏈結串列的頭部插入一個新元素，新元素中包含了一個字串s。如果插入成功返回 true，否則返回 false。

/* Insert an element at head of queue */
element_t *new_element = malloc(sizeof(element_t));
    if (!head || !new_element) {
        return false;
    }

    new_element->value = strdup(s);  
    if (!new_element->value) {  
        free(new_element);
        return false;
    }

    list_add(&new_element->list, head);
    return true;

`q_insert_tail`

將q_insert_tail中的差異在於插入節點的位置不同，因此只須修改最後一行。

-    list_add(&new_element->list, head);
+    list_add_tail(&new_element->list, head);

觀摩 164253，可以在 q_insert_tail 中直接呼叫 q_insert_head，並將第一個參數改為 head->prev 即可。

bool q_insert_tail(struct list_head *head, char *s)
{
    return q_insert_head(head->prev, s);
}

`q_remove_head`

Delete 與 Remove 的概念可以參考 2024 年 Linux 核心設計/實作課程作業 —— lab0 (A) 提及的內容，主要的差異為目標節點還是否存在於記憶體空間。

"Remove" 這動作對於 linked list 來說，是斷開節點和節點的關聯，也就是說原本若是 A
$\to$ B
$\to$ C，在 remove(B) 操作完成後，就會變成 A
$\to$ C 的節點關係。特別注意到 B 這個節點其實還存在於記憶體中，只是我們無法從節點間的關係看出來。於是 "remove" 可解讀為 "take away" (將某物帶走)。
"Delete" 則有 "erase" 的涵義，若用於 linked list，則指某個節點將被「抹除」，對應的記憶體空間可能會被釋放 (release) 或回收 (reclaim)，並非只是指標的變化。

從雙向鏈結串列的 head 移除一個元素。將移除的元素中的字串複製到 sp 指向的緩衝區中，並確保字串以空字串終止。

/* Remove an element from head of queue */
element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
    if (head == NULL || list_empty(head)) { 
        return NULL;
    }
    element_t *n = list_first_entry(head, element_t, list);
    list_del(head->next);

    if (sp != NULL) {
        strncpy(sp, n->value, bufsize - 1);
        sp[bufsize - 1] = '\0';
    }
    return n;  
}

`q_remove_tail`

將 q_remove_tail 中的差異在於插入節點的位置不同，因此改從 tail 獲取最後一個元素，並刪除最後一個元素。

-    element_t *n = list_first_entry(head, element_t, list);
-    list_del(head->next);
+    element_t *n = list_last_entry(head, element_t, list);
+    list_del(head->prev);

參考 164253，
和 q_insert_tail 一樣可以簡單的用 q_remove_head 完成

element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize)
{
    return q_remove_head(head->prev, sp, bufsize);
}

`q_delete_mid`

參考你所不知道的 C 語言: linked list 和非連續記憶體
方法：使用快慢指標

/* Delete the middle node in queue */
bool q_delete_mid(struct list_head *head)
{
    // https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/
    struct list_head *slow = head->next;

    if (!head)
        return false;

    for (struct list_head *fast = head->next; fast != head && fast->next != head; fast = fast->next->next) {
        slow = slow->next;
    }
    list_del(slow);
    q_release_element(list_entry(slow,element_t,list));

    return true;
}

`q_delete_dup`

原本誤以為題目的意思是遇到重複，只要留下一個元素，並把後面連續重複的元素都刪除。
後來依題意改成刪除所有連續重複的元素。

/* Delete all nodes that have duplicate string */
bool q_delete_dup(struct list_head *head)
{
    // https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/
    if (!head)
        return false;
    struct list_head *node;
    bool flag = false;
    
    for(node = head->next ; node->next!=head ;){
-       element_t *prev = list_entry(node->prev, element_t, list);
        element_t *current = list_entry(node, element_t, list);
        element_t *next = list_entry(node->next, element_t, list);

        if (strcmp(current->value, next->value) == 0) {
            list_del(&next->list);
            q_release_element(next);
            flag = true;
        } else {
            if(flag){
-               list_del(&prev->list);
-               q_release_element(prev);
                // 刪除當前節點，並重置標誌
+               struct list_head *temp = node->next; // 保存下一個節點的地址
+               list_del(node);
+               q_release_element(current);
+               node = temp; // 更新node為下一個節點
+               flag = false;
            }
+           else{
                node = node->next;
+           }
-           flag = false;
        }
    }
    // 處理最後有重複的節點
+   if (flag) {
+       element_t *c = list_entry(node, element_t, list);
+       list_del(node);
+       q_release_element(c);
+   }
    return true;
}

觀摩 96121503，自我檢討，其中：for loop 可以善用 list.h 的 list_for_each_entry_safe 完成，使程式碼更加簡潔。

`q_swap`

原本想法兩兩互換，後來參考 laneser 同學方法： 先移除一個節點，再加入在交換的位置上。

void q_swap(struct list_head *head)
{
    // https://leetcode.com/problems/swap-nodes-in-pairs/
    if (!head)
        return;
    struct list_head *node;
    for (node = head->next; (node->next != head) && (node != head);
         node = node->next) {
        struct list_head *next = node->next;
        list_del(node);
        list_add(node, next);
}

觀摩 yutingshih，發現可以善用 list.h 中的 list_move_tail() 撰寫。

list_move(node, head)：將 node 移動到 head 的後面，成為鏈結串列的第一個節點。
list_move_tail(node, head)：將 node 移動到 head 的前面，成為鏈結串列的最後一個節點。

改進你的漢語表達。

list_for_each(node, head) {
    if (node->next == head)
        break;
    list_move_tail(node->next, node);
}

`q_reverse`

利用 list.h 中的list_move ，將鏈結串列中的結點一個一個往前移。

struct list_head *current, *s;
list_for_each_safe (current, s, head){
    list_move(current, head);    
}

`q_reverseK`

逐一走訪鏈結串列，每當計數達到 k 時，就將當前到達的位置之前的部分切割出來，翻轉，然後接回原來的位置，直到逐一走訪完整個鏈結串列。

/* Reverse the nodes of the list k at a time */
void q_reverseK(struct list_head *head, int k)
{
    // https://leetcode.com/problems/reverse-nodes-in-k-group/
    if (!head || list_empty(head))
        return;

    struct list_head *it, *safe, *cut;
    int count = k;
    cut = head;
    list_for_each_safe (it, safe, head) {
        if (--count)
            continue;
        LIST_HEAD(tmp);
        count = k;
        list_cut_position(&tmp, cut, it);
        q_reverse(&tmp);
        list_splice(&tmp, cut);
        cut = safe->prev;
    }
}

`q_ascend`

透過外層循環逐一走訪整個鏈結串列，對於每個節點，使用內層循環 比較其右側所有節點的值。
如果發現右側有節點的值大於當前節點的值，則標記當前節點為需要刪除。
逐一走訪完成後， list_del 將需要刪除的節點從鏈結串列中移除並 q_release_element 釋放相關資源。

程式碼註解總是用美式英語書寫！

/* Remove every node which has a node with a strictly less value anywhere to
 * the right side of it */
int q_ascend(struct list_head *head)
{
    // https://leetcode.com/problems/remove-nodes-from-linked-list/
    if (!head || list_empty(head))
        return 0;

    struct list_head *cur = head->next;
    while (cur != head) {  // 外層循環逐一走訪整個 linked list
        struct list_head *next_for_comparison = cur->next;
        bool should_delete = false;

        while (next_for_comparison !=
               head) {  // 外層循環逐一走訪當前節點右側的所有節點
            element_t *cur_element = list_entry(cur, element_t, list);
            element_t *next_element =
                list_entry(next_for_comparison, element_t, list);

            if (strcmp(cur_element->value, next_element->value) > 0) {
                should_delete = true;
                break;  // 找到一個嚴格更大的值，當前節點需要被刪除
            }

            next_for_comparison = next_for_comparison->next;
        }

        struct list_head *next = cur->next;  // 保存當前節點的下一個節點
        if (should_delete) {
            list_del(cur);  // 從 linked list 中刪除當前節點
            element_t *to_delete = list_entry(cur, element_t, list);
            q_release_element(to_delete);  // 釋放節點資源
        }

        cur = next;  // 移動到下一個節點
    }

    return q_size(head);
}

程式碼列出是為了討論和檢討，抽離出關鍵程式碼來討論。

`q_descend`

和q_ascend 相同，修改判斷式即可。

-    if (strcmp(cur_element->value, next_element->value) > 0)
+    if (strcmp(cur_element->value, next_element->value) < 0)

`q_merge_two`

合併兩個雙向鏈結串列 first 和 second。使用一個臨時鏈結串列 tmp 來存放合併後的結果。
並逐一比較 first 和 second 鏈結串列中的元素，按照元素值的升序將元素移動到 tmp 鏈結串列中。
當其中一個鏈結串列被完全移動完畢後，善用 list.h中的list_splice，將剩餘的元素也加入到 tmp 中，並最終使用list_splice_tail_init將 tmp 鏈結串列中的所有元素移動回 first 鏈結串列中。

static int q_merge_two(struct list_head *first, struct list_head *second)
{
    if (!first || !second)
        return 0;

    int count = 0;
    LIST_HEAD(tmp);
    while (!list_empty(first) && !list_empty(second)) {
        element_t *f = list_first_entry(first, element_t, list);
        element_t *s = list_first_entry(second, element_t, list);
        if (strcmp(f->value, s->value) <= 0)
            list_move_tail(&f->list, &tmp); //f 接到 tmp 後面
        else
            list_move_tail(&s->list, &tmp);
        count++;
    }
    count += q_size(first) + q_size(second);
    list_splice(&tmp, first);
    list_splice_tail_init(second, first);

    return count;
}

`q_sort`

以 Divide and conquer 實作 Merge Sort 對鏈結串列進行排序。我使用的方法是先以升序作排列，最後判斷 descend 值，判斷是否需要反轉(reverse)。

你如何確保目前的測試程式已涵蓋排序演算法的最差狀況？

/* Sort elements of queue in ascending/descending order */
void q_sort(struct list_head *head, bool descend)
{
    /* Try to use merge sort*/
    if (!head || list_empty(head) || list_is_singular(head))
        return;

    /* Find middle point */
    struct list_head *mid, *left, *right;
    left = right = head;
    do {
        left = left->next;
        right = right->prev;
    } while (left != right && left->next != right);
    mid = left;

    /* Divide into two part */
    LIST_HEAD(second);
    list_cut_position(&second, mid, head->prev);

    /* Conquer */
    q_sort(head, false);
    q_sort(&second, false);

    /* Merge */
    q_merge_two(head, &second);
    if (descend == true) {
        q_reverse(head);
    }
}

TODO: 思考如何避免遞迴

`q_merge`

處理了將多個排序好的子佇列合併成一個大佇列，並根據指定順序進行排序的任務。
這些子佇列使用 queue_contex_t 結構表示，並藉由 chain 成員鏈接在一個主要的鏈結串列 head 中。合併後的佇列給可以根據 descend 參數以升序或降序進行排序。

/* Merge all the queues into one sorted queue, which is in ascending/descending
 * order */
int q_merge(struct list_head *head, bool descend)
{
    // https://leetcode.com/problems/merge-k-sorted-lists/
    if (!head || list_empty(head))
        return 0;

    if (list_is_singular(head))
        return list_first_entry(head, queue_contex_t, chain)->size;

    LIST_HEAD(tmp);
    queue_contex_t *cur, *safe;
    queue_contex_t *first = list_first_entry(head, queue_contex_t, chain);

    list_for_each_entry_safe (cur, safe, head, chain) {
        list_splice_init(cur->q, &tmp);
    }

    int size = q_size(&tmp);
    list_splice_init(&tmp, first->q);
    q_sort(first->q, descend);

    return size;
}

Valgrind + 自動測試程式

閱讀 The Valgrind Quick Start Guide 得知，Valgrind 工具套件提供了許多 debugging 和分析工具，可協助我的程式更快、更正確，其中最受歡迎的是 Memcheck。它可以檢測 C 和 C++ 程式中常見的許多與記憶體相關的錯誤，這些錯誤可能導致崩潰和不可預測的行為。
程式將比正常運行速度慢得多，並且使用更多記憶體。Memcheck 將發出有關其檢測到的記憶體錯誤和洩漏的訊息。

massif

它測量程式使用了多少 heap 記憶體，測量程式堆疊的大小。此外，某些空間洩漏是傳統洩漏檢查器（例如 Memcheck ）無法偵測到的。

這是因為記憶體實際上並沒有丟失，仍然有一個指向它的指標，但它沒有被使用。隨著時間過去，存在此類洩漏的程式會增加其使用的記憶體量，Massif 可以幫助識別這些洩漏。

$ valgrind --tool=massif ./qtest -f <trace file>

運行 trace-017-complexity

test_const 的記憶體使用量穩定且持續增長，代表它沒有釋放記憶體。

TODO:
為何在 "Peak of 1.7 MiB at snapshot #87" 記憶體使用會達到峰值。

在 qtest 提供新的命令 shuffle

閱讀 Fisher–Yates shuffle 演算法之後，實作洗牌（shuffle）。
首先在 qtest.c 中 ADD_COMMAND，在開始實做 q_shuffle。
原本直接將 q_shuffle 實作在 queue.c 中，並在 queue.h 新增 q_shuffle 的宣告，git commit 之後出現錯誤訊息如下：

[!] You are not allowed to change the header file queue.h or list.h

因此得知不能修改 queue.h 檔，於是我多寫了兩個檔案，分別是：shuffle.h 和 shuffle.c。

commit e26bd21

commit shuffle.c 時，出現了以下錯誤

Fail to pass static analysis.

是因為 cppcheck 檢查到了非標準代碼，如果想要過濾掉某些錯誤，可以加上 cppcheck-suppress aaaa 。因此我加入註解 // cppcheck-suppress nullPointer 以忽略警告。

element_t *oldnode = 
    list_entry(old, element_t, list);  // cppcheck-suppress nullPointer

統計方法驗證 `shuffle`

Expectation:  41666
Observation:  {'1234': 41617, '1243': 41491, '1324': 41902, '1342': 42053, '1423': 41613, '1432': 41631, '2134': 41597, '2143': 41646, '2314': 41615, '2341': 41697, '2413': 41729, '2431': 41872, '3124': 41441, '3142': 41696, '3214': 41700, '3241': 41560, '3412': 41635, '3421': 41660, '4123': 41502, '4132': 41584, '4213': 41759, '4231': 41908, '4312': 41402, '4321': 41690}
chi square sum:  12.808332933326932

從圖表來看 shuffle 的頻率大致符合 Uniform distribution。

論文〈Dude, is my code constant time?〉

Dudect 是一種用於評估軟體實作是否符合常數時間執行的工具，對於密碼學實作的安全性非常重要。常數時間執行能夠防範定時攻擊 (timing attack)，這是一種利用密碼學運算的執行時間差異，來推測密鑰或機密資訊的側錯分析攻擊手法。
它的貢獻在於以很小的實作成本，提供高度的安全性檢測能力。使用的 t-test 是用於檢測算法執行時間是否存在統計學上的顯著差異，這對於偵測側通道攻擊非常重要。

程式實作的原理

dudect.h 過程就是一個迭代的統計檢測，透過不斷收集新的測量數據、更新統計數據、設置合理的閾值過濾極端值，最終得出被測試函數是否存在 Timing leakage 風險。

其中 prepare_percentiles() 函式，是先將執行時間進行排序，接著從排序後的 ctx->exec_times 執行時間資料中給定百分位數對應的值。
此函式目的在於計算出執行時間閾值，用於後續過濾極端值。透過設置不同的閾值，可以有選擇地過濾掉測量值中的極值，從而獲得更加準確的統計結果。

static void prepare_percentiles(dudect_ctx_t *ctx) {
  qsort(ctx->exec_times, ctx->config->number_measurements, sizeof(int64_t), (int (*)(const void *, const void *))cmp);
  for (size_t i = 0; i < DUDECT_NUMBER_PERCENTILES; i++) {
    ctx->percentiles[i] = percentile(
        ctx->exec_times, 1 - (pow(0.5, 10 * (double)(i + 1) / DUDECT_NUMBER_PERCENTILES)),
        ctx->config->number_measurements);
  }
}

另外 update_statistics 中以捨棄前 10 筆測量，以避免一開始測量的不穩定性。difference 儲存每次迭代的執行時間差異，對每個有效的 difference 執行 t-test，檢測算法執行時間是否存在統計學上的顯著差異。
迴圈走訪多個剪裁閾值，只對那些低於特定閾值的 difference 執行 t-test。這可以幫助識別在不同的執行時間閾值下，算法的行為是否存在顯著的統計差異。

static void update_statistics(dudect_ctx_t *ctx) {
  for (size_t i = 10 /* discard the first few measurements */; i < (ctx->config->number_measurements-1); i++) {
    int64_t difference = ctx->exec_times[i];

    if (difference < 0) {
      continue; // the cpu cycle counter overflowed, just throw away the measurement
    }

    // t-test on the execution time
    t_push(ctx->ttest_ctxs[0], difference, ctx->classes[i]);

    // t-test on cropped execution times, for several cropping thresholds.
    for (size_t crop_index = 0; crop_index < DUDECT_NUMBER_PERCENTILES; crop_index++) {
      if (difference < ctx->percentiles[crop_index]) {
        t_push(ctx->ttest_ctxs[crop_index + 1], difference, ctx->classes[i]);
      }
    }

    // ... 以下省略
  }
}

`lab0-c` 加入具備 `percentile` 的處理

commit e91a53f

在 dudect/fixture.c 中，我發現 doit 這個函式實作與 dudect.h 主要實作功能類似，但沒有實作 percentile 這個部份，造成極端值沒有被去除，導致判斷結果出錯。
將 dudect.h 處理 percentile 的部分加入 lab0-c 的 fixture.c 後便可以正確分析 insert_head, insert_tail, remove_head, remove_tail 的執行時間。

研讀並嘗試引入 Linux 核心原始程式碼的 `lib/list_sort.c`

安裝 perf 及設定系統權限，預設會使 perf 權限不足：

$ sudo su # As Root
$ sysctl -w kernel.perf_event_paranoid=-1
$ echo 0 > /proc/sys/kernel/kptr_restrict
$ exit

嘗試引入 `lib/list_sort.c`

將 lib/list_sort.c 和 linux/list_sort.h 引入本專案中

實驗

隨機產生 500000 個字串加入到鏈結串列中，使用不同的 perf 效能分析工具分析兩種不同的排序方式效能上的差異。
traces/trace-sort.cmd:

option fail 0
option malloc 0
new
ih RAND 500000
time
sort/lsort
time

執行以下指令，以比較執行時間

perf stat --repeat 5 -e instructions,cycles ./qtest -f traces/trace-sort.cmd

q_sort 執行時間表：

測試次數	執行時間（秒）
1	0.9123
2	0.9287
3	0.9295
4	0.9474
5	0.9541

linux 的 list_sort 執行時間表：

測試次數	執行時間（秒）
1	0.45526
2	0.4762
3	0.46526
4	0.45756
5	0.45429

list_sort 優點

使用單向鏈結串列，可以節省空間。
使用了 Bottom-Up 合併演算法，與傳統的 Top-Down 合併排序演算法相比，可以減少分割的步驟，節省了時間。
透過限制大小比率最多為2:1，可以減少在最終合併步驟中所需的比較次數，也可以避免極端不平衡的情況，從而改善排序的最壞情況效能。

使用 address sanitizer

make clean
make SANITIZER=1

之後執行 make test ，沒有出現任何記憶體相關錯誤訊息。

整合 tic-tac-toe 遊戲

將 jserv/ttt 專案的程式碼部分整合進 lab0-c

將 jserv/ttt 專案的 zobrist.[ch] 會用到 hlist.h 相關的程式碼抽出，成為 hlist.h，並確保後者可與 lab0-c 既有的 list.h 並存。

commit 53ad444

修改 qtest 程式，使其新增 ttt 命令，其作用是執行與人類對弈的井字遊戲，從 jserv/ttt 的 main.c 檔案中提取與遊戲相關的程式碼至專案中的 console.c。當輸入指令 ttt 後，將繼續執行原始 main.c 中的內容，以啟動井字遊戲。

commit 4876b95

Monte Carlo tree search (MCTS) 演算法

此方法依賴平衡探索和利用的智慧搜尋樹。執行隨機採樣並儲存操作統計數據，以便在每次後續迭代中做出更明智的選擇。

Monte Carlo tree search 的每個迴圈包括四個步驟：

選擇（Selection）：從根節點 R 開始，連續向下選擇子節點至葉子節點 L 。
擴充（Expansion）：除非任意一方的輸贏使得遊戲在 L 結束，否則建立一個或多個子節點並選取其中一個節點 C 。
仿真（Simulation）：再從節點 C 開始，用隨機策略進行遊戲。
反向傳播（Backpropagation）：使用隨機遊戲的結果，更新從 C 到 R 的路徑上的節點資訊。

每一個節點的內容代表勝利次數/遊戲次數

2024q1 Homework1 (lab0)

Reviewed by allenliao666

Reviewed by Ackerman666

開發環境

雙向鏈結串列結構

指定的佇列操作

q_new

q_free

q_insert_head

q_insert_tail

q_remove_head

q_remove_tail

q_delete_mid

q_delete_dup

q_swap

q_reverse

q_reverseK

q_ascend

q_descend

q_merge_two

q_sort

q_merge