2024q1 Homework1 (lab0)

contributed by < MiohitoKiri5474 >

開發環境

$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ lscpu
Architecture:           x86_64
  CPU op-mode(s):       32-bit, 64-bit
  Address sizes:        36 bits physical, 48 bits virtual
  Byte Order:           Little Endian
CPU(s):                 4
  On-line CPU(s) list:  0-3
Vendor ID:              GenuineIntel
  Model name:           06/7e
    CPU family:         6
    Model:              126
    Thread(s) per core: 1
    Core(s) per socket: 4
    Socket(s):          1
    Stepping:           5
    BogoMIPS:           3993.10

neoVim Config

習慣用 neoVim 當作 editor，有原生支援 language server protocol，提供語法補完和程式碼檢查的功能。
之前為了寫 C++ 而做了一些設定，不過這次打開 queue.c 之後發現有無法套用 C++17 的錯誤。

發現是我之前為了寫 C++ 時可以直接套用 C++17 的語法，所以在 ~/.clangd 裡面新增了相關的 argument，只要 clangd 啟動時就會套用。

研究了要如何按照不同檔案格式套用不同的標準後，改成以下樣式。

CompileFlags:
  # Flags for C files
  '.*\.c$':
    - "-std=c11"  # or your preferred C standard
  # Flags for C++ files
  '.*\.cpp$':

同時 neoVim 也支援 formatter，於是在 config 中新增了相關設定，在存檔時自動使用 clang-format formatting。

return {
    "stevearc/conform.nvim",
    optional = true,
    opts = {
        formatters_by_ft = {
            ["c"] = { "clang_format" },
        },
    },
    keys = {
        {
            "<leader>ft",
            function()
                require("conform").format()
            end,
        },
    },
}

`queue.c`

目前得分 100。

`q_new`

用 link_head 宣告一個鏈結串列，並用 INIT_LIST_HEAD 初始化後回傳。

/* Create an empty queue */
struct list_head *q_new()
{
    struct list_head *res =
        (struct list_head *) malloc(sizeof(struct list_head));
    INIT_LIST_HEAD(res);
    return res;
}

`q_free`

原本是用一個遞迴函數來處理，但因為不容易維護且容易造成 segmentation fault，最後改用 list_for_each_safe 取得所有節點，並在其中使用 list_entry, contain_of 取得其被包含的結構體，並用 list_del 刪除這個節點後再將 structure 資料釋放。

原先在測試時一直遇到有記憶體沒有被釋放的問題，最後發現是忘記把 element_t 中的字串釋放掉。

同時接下來預期會在許多地方需要釋放 element_t 的 object，因此撰寫巨集 free_element 來處理。

/* Free element_t object */
#define free_element(obj) \
    free(obj->value);     \
    free(obj)

/* Free all storage used by queue */
void q_free(struct list_head *l)
{
    if (!l || q_empty ( l )) {
        free ( l );
        return;
    }
    struct list_head *tmp, *next;

    list_for_each_safe (tmp, next, l) {
        element_t *entry = list_entry(tmp, element_t, list);
        free_element(entry);
    }

    free(l);
}

`q_insert_head`, `q_insert_tail`

兩個函數作用比較類似，也只有差一點點，所以放在同一個標題下。
也因為部分函數內容是重複的，所以額外寫了一個 element_new 函數用來建立一個新的 element_t object。

/* Create a new element */
element_t *element_new(char *s)
{
    element_t *res;
    res = (element_t *) malloc(sizeof(element_t));
    if (res == NULL) {
        free(res);
        return NULL;
    }
    res->value = strdup(s);
    return res;
}

本來 strdup 是先 malloc 位置之後再用 strcpy 過去，不過這樣寫更為簡潔，於是換成現在的寫法。

而在 q_insert_head/q_insert_tail 中便是建立完 element 後將他們加入鏈結串列中。

/* Insert an element at head of queue */
bool q_insert_head(struct list_head *head, char *s)
{
    element_t *element = element_new(s);
    if (!element) {
        free(element);
        return false;
    }
    list_add(&element->list, head);
    return true;
}

/* Insert an element at tail of queue */
bool q_insert_tail(struct list_head *head, char *s)
{
    element_t *element = element_new(s);
    if (!element) {
        free(element);
        return false;
    }
    list_add_tail(&element->list, head);
    return true;
}

`q_remove_head`, `q_remove_tail`

同樣也是為了方便接下來需要移除節點的時候不用重複寫一次，所以這邊將移除節點寫成一個函數 remove_element。

/* Delete element */
element_t *remove_element(struct list_head *head,
                          char *sp,
                          size_t bufsize,
                          struct list_head *target)
{
    element_t *res = list_entry(target, element_t, list);
    if (sp && res->value) {
        strncpy ( sp, res -> value, bufsize );
        sp[bufsize - 1] = 0;
    }
    list_del ( target );
    return res;
}

然後在 q_remove_head/q_remove_tail 分別呼叫對應的位置。

/* Remove an element from head of queue */
element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
    return remove_element(head, sp, bufsize, head->next);
}

/* Remove an element from tail of queue */
element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize)
{
    return remove_element(head, sp, bufsize, head->prev);
}

`q_delete_mid`

藉由兩個前進速度不同的指標，找到中間的節點。
等到前進速度較快的指標到末端的時候，另一個指標會停在中間。

while (fast != head && fast->next != head) {
    fast = fast->next->next;
    target = target->next;
}
list_del(target);
element_t *entry = list_entry(target, element_t, list);
free(entry->value);
free(entry);

`q_delete_dup`

本來是用 while-loop 去寫，如果檢查到相同的就刪除，然後下一個節點過去。
不過因為直接使用 list_del，會造成原本的 cur->nxet 跑掉，因此後來改成用 list_for_each_safe 檢查整個鏈結串列，並檢查如果目前的節點資料與下一個節點資料相同，便將目前節點刪除。

後來在測試的時候才發誤解題意了，是要將全部相同的元素都拔掉。

bool last = false;
list_for_each_safe (cur, safe, head) {
    element_t *entry = list_entry(cur, element_t, list);
    if (last || (cur->next != head && 
                 !strcmp(entry->value, 
                         list_entry(cur->next, element_t, list)->value))) {
        last = true;
        list_del(cur);
        free_element(entry);
    } else
        last = false;
}

寫完過了幾天的測試中，才發現之前的測試不夠完全，沒有找到 bug。
這個 bug 會發生在如果重複的元素不是排在最後面的時候（例如：["abc", "abc", "xyz"]），會將第一組出現重複的元素後面的所有元素都刪掉，因為只要在上面的程式碼第 15 行的條件一但達成後（也就是兩個字串完全相同時），last 就再也不可能被改為 false 了，無論後面的比較結果如何這一條條件永遠會達成。
於是把 function 改成以下的版本，在 for-loop 的最後會將 last 的值與 match 結果同步。

    list_for_each_safe (cur, safe, head) {
        element_t *entry = list_entry(cur, element_t, list);
-       if (last || (cur->next != head &&
-                    !strcmp(entry->value,
-                           list_entry(cur->next, element_t, list)->value))) {
+       bool match = cur->next != head &&
+                    !strcmp(entry->value,
+                            list_entry(cur->next, element_t, list)->value);
+       if (last || match) {
            list_del(cur);
            free_element(entry);
-       } else
-           last = false;
+       }
+       last = match;
    }

`q_swap`

本來是要將節點倆倆取出，並將他們依序加回鏈結串列的前端，但在很後面的時候發現會有錯誤，所以改成直接呼際 \(k = 2\) 的 q_reverseK。

/* Swap every two adjacent nodes */
void q_swap(struct list_head *head)
{
    // https://leetcode.com/problems/swap-nodes-in-pairs/
    if (!head || list_empty(head) || q_size(head) < 2)
        return;
    // the previous version had a bug, using q_reverseK instead
    q_reverseK(head, 2);
}

`q_reverse`

避免過多的中英文混用，已有明確翻譯詞彙者，例如「鏈結串列」(linked list) 和「佇列」(queue)，就使用該中文詞彙，英文則留給變數名稱、人名，或者缺乏通用翻譯詞彙的場景。

用 list_for_each_safe 遍歷所有 linked list 中的節點，並將所有節點依序放到開頭。

/* Reverse elements in queue */
void q_reverse(struct list_head *head)
{
    if (!head || list_empty(head))
        return;
    struct list_head *cur, *safe;
    list_for_each_safe (cur, safe, head)
        list_move(cur, head);
}

`q_reverseK`

用 list_cut_position 從 linked lsit 中取出前 k 個節點，藉由重複利用 q_reverse 反轉區間之後將這個區間接回去，並取出接下來 k 個節點重複此操作。

do {
    // check the rest of list have k nodes or more
    bool flag = true;
    tmp = cur->next;
    for (int i = 0; i < k; i++) {
        if (!tmp || tmp == head) {
            flag = false;
            break;
        }
        tmp = tmp->next;
    }

    // if the rest of list don't have enough nodes
    if (!flag)
        break;

    // reverse the segment and splice it back to right position
    LIST_HEAD(slice);
    list_cut_position(&slice, cur, tmp->prev);
    q_reverse(&slice);
    list_splice_init(&slice, cur);

    // move forward k nodes
    for (int i = 0; i < k; i++)
        cur = cur->next;
} while (cur != head);

`q_sort`

手寫 merge sort，原先是要按照以前習慣的作法，將鏈結串列分成兩邊後分別排序、用一個 tmp array 暫存資料後再複製回原始位置。
當時這樣寫是想要降低複雜度，在合併兩個串列時不用花單次 \(O(N)\) 的新增元素時間，每次操作均為 \(O(1)\)，並且在完成合併後再花 \(O(N)\) 的時間複製，會較快一些。
不過既然都用鏈結串列了，單次操作本來就需要 \(O(N)\)，因此改為插入元素的方式。

會將原先的 head 切出前半放在 left 中，剩下留在 head 中。
而原先是用 list_for_each_safe 遍歷 head，並確認是否能把 left 最前端的元素加入這個位置，不過這樣會讓 list_for_each_safe 的指標順序產生混亂造成 segmentation fault，後來改成遍歷 left，並用另一個指標控制目前在 head 中的位置，並檢查是否能將目前的元素放進這個位置中。


LIST_HEAD(left);
struct list_head *mid = head->next, *fast = head->next, *safe, *cur, *tmp;
while (fast != head &&
        fast->next != head) {  // get the mid position by using the same way
                                // in q_delete_mid
    fast = fast->next->next;
    mid = mid->next;
}
// split the list into two seq
list_cut_position(&left, head, mid->prev);

// sort each side
q_sort(&left, descend);
q_sort(head, descend);

tmp = head->next;
list_for_each_safe (cur, safe, &left) {  // merge lists
    while (tmp != head) {  // find a place for current element_t object
        char *cur_value = list_entry(cur, element_t, list)->value;
        char *tmp_value = list_entry(tmp, element_t, list)->value;
        int res = strcmp(cur_value, tmp_value);
        if ((!descend && res <= 0) || (descend && res >= 0)) {
            // the element is allowed be placed in this position
            list_del(cur);
            list_add(cur, tmp->prev);
            break;  // current element is placed, break the loop
        }
        // if not, keep moving
        tmp = tmp->next;
    }
    if (tmp == head) {
        list_del(cur);
        list_add_tail(cur, head);
    }
}

你如何確保目前的測試已涵蓋排序演算法最差的狀況呢？

`q_ascend`/`q_descend`

為了從串列的尾端操作，所以先將鏈結串列先倒置，等處完後再將順序倒回來。
在 q_ascend 中，從尾端開始比較每個節點的值，如果比目前節點大則會將節點刪除，反之將目前的值紀錄下來。
而在 q_descend 中則是改成紀錄最大值，並且刪除較小的節點。

另外在過程中發現有多個區塊的記憶體沒有被釋放，後來發現是在更新紀錄的值之前沒有先原先的記憶體位置釋放掉，多 free 一次便沒有這個問題了。

/* Remove every node which has a node with a strictly less value anywhere to
 * the right side of it */
int q_ascend(struct list_head *head)
{
    // https://leetcode.com/problems/remove-nodes-from-linked-list/
    if (!head || list_empty(head))
        return 0;
    struct list_head *cur, *safe;
    char *mi = NULL;
    q_reverse(head);  // the process will begin from the end, reverse first
    list_for_each_safe (cur, safe, head) {
        element_t *cur_entry = list_entry(cur, element_t, list);
        if (cur != head->next && strcmp(cur_entry->value, mi) > 0) {
            // skip at first element
            list_del(cur);
            free_element(cur_entry);
        } else {
            free(mi);  // importent! remind free the string before next strdup
            mi = strdup(cur_entry->value);
        }
        cur_entry = NULL;
    }
    q_reverse(head);  // reverse back to original order
    free(mi);
    return q_size(head);
}

/* Remove every node which has a node with a strictly greater value anywhere to
 * the right side of it */
int q_descend(struct list_head *head)
{
    // https://leetcode.com/problems/remove-nodes-from-linked-list/
    // same as q_ascend, but in different order
    if (!head || list_empty(head))
        return 0;
    struct list_head *cur, *safe;
    char *ma = NULL;
    q_reverse(head);
    list_for_each_safe (cur, safe, head) {
        element_t *cur_entry = list_entry(cur, element_t, list);
        if (cur != head->next && strcmp(cur_entry->value, ma) < 0) {
            list_del(cur);
            free_element(cur_entry);
        } else {
            free(ma);
            ma = strdup(cur_entry->value);
        }
        cur_entry = NULL;
    }
    q_reverse(head);
    free(ma);
    return q_size(head);
}

論文〈Dude, is my code constant time?〉

上面的程式碼完成後，push 上 Github，大概過了兩分鐘後收到了一封 email。

Screenshot 2024-02-28 at 15.54.14

奇怪，local host 測都過了，打開 GitHub Actions 之後發現。

Screenshot 2024-02-28 at 16.02.11

於是跑回去看作業敘述。

研讀論文〈Dude, is my code constant time?〉，解釋本程式的 "simulation" 模式是如何透過以實驗而非理論分析，達到驗證時間複雜度，需要解釋 Student's t-distribution 及程式實作的原理。

注意：現有實作存在若干致命缺陷，請討論並提出解決方案

在 oreparaz/dudect 的程式碼具備 percentile 的處理，但在 lab0-c 則無。

應該是跟這方面有關，於是去研究了一下論文。
論文中提到有些程式碼看起來是 constant time，但實際上並不是，會造成一些資安上的漏洞。

總之，在大略的研究之後，發現在 orparaz/dudect 的原始碼中，有以下的 code（429 行處）：

bool first_time = ctx->percentiles[DUDECT_NUMBER_PERCENTILES - 1] == 0;

dudect_state_t ret = DUDECT_NO_LEAKAGE_EVIDENCE_YET;
if (first_time) {
    // throw away the first batch of measurements.
    // this helps warming things up.
    prepare_percentiles(ctx);
} else {
    update_statistics(ctx);
    ret = report(ctx);
}

尋找了一下 prepare_percentiles 的定義後，有以下結果：

/*
 set different thresholds for cropping measurements.
 the exponential tendency is meant to approximately match
 the measurements distribution, but there's not more science
 than that.
*/
static void prepare_percentiles(dudect_ctx_t *ctx) {
  qsort(ctx->exec_times, ctx->config->number_measurements, sizeof(int64_t), (int (*)(const void *, const void *))cmp);
  for (size_t i = 0; i < DUDECT_NUMBER_PERCENTILES; i++) {
    ctx->percentiles[i] = percentile(
        ctx->exec_times, 1 - (pow(0.5, 10 * (double)(i + 1) / DUDECT_NUMBER_PERCENTILES)),
        ctx->config->number_measurements);
  }
}

大意是將極值切除，只留下 \(1-0.5^{10 \cdot \frac{1 + i}{N}}\) 以下的部分。

接著回去翻閱程式碼，從 qtest.c 中找到是 is_insert_tail_const 和 is_insert_head_const 這部分在判斷，於是從上面 include 的幾個 local header 逐個尋找，在第一個 header deduct/fixture.c 中就找到了，定義在最下面：

#define DUT_FUNC_IMPL(op)                \
    bool is_##op##_const(void)           \
    {                                    \
        return test_const(#op, DUT(op)); \
    }

很優雅的 define，筆記一下原來 define 可以這樣寫。

即是 X macros 技巧。

繼續尋找 test_const 的定義，再往上跳到 doit 函數，這應該就是對應論文中的 dudect_main，於是我將其中的 prepare_statistics 實作進來。

static int cmp(const int64_t *a, const int64_t *b)
{
    return (int) (*a - *b);
}

static int64_t percentile(int64_t *arr, double which, size_t sz)
{
    size_t res = (size_t) sz * which;
    assert(res < sz);
    return res;
}
static void pre_percentiles(int64_t *exec_time)
{
    qsort(exec_time, N_MEASURES, sizeof(int64_t),
          (int (*)(const void *, const void *)) cmp);
    for (size_t i = 0; i < N_MEASURES; i++) {
        exec_time[i] = percentile(
            exec_time, 1 - (pow(0.5, 10) * (double) (i + 1) / N_MEASURES),
            N_MEASURES);
    }
}

同時按照論文中的寫法，將 update_statistics 中的 for 迴圈改從 10 開始：

static void update_statistics(const int64_t *exec_times, uint8_t *classes)
{
-    for (size_t i = 0; i < N_MEASURES; i++) {
+    for (size_t i = 10; i < N_MEASURES; i++) {

最後將這些改動 push 上 GitHub 重新跑一次 CI/CD，最後得到以下結果。

Screenshot 2024-02-28 at 16.26.49

太好了是卡比！

對照論文，確認你的修改反映出其論述，之後可提交 pull request。

`q_shuffle`

使用 Fisher–Yates shuffle 來實作，但順序反過來從前端開始處理、並和往後的做交換，從前端實作會好寫一些，反正最終都是要將串列打亂，從頭或從尾開始實作並不影響結果。

會逐個處理，並將目前物件 ?? 和後面的做交換，tms 為後面幾個元素，如果 tms 為1 代表和下一個元素交換、為 2 代表和下兩個元素交換依此類推（tms 為 0 時不交換）。

/* Shuffle the list by using Fisher-Yates shuffle Algorithm */
void q_shuffle(struct list_head *head)
{
    if (!head || list_empty(head))
        return;

    int len = q_size(head);
    if (len == 1)  // don't need to process
        return;
    struct list_head *cur, *safe, *target;
    list_for_each_safe (cur, safe, head) {
        len--;
        if (len == 0)  // last element
            break;
        target = cur;
        int tms = rand() % len;
        if (tms == 0)  // don't need to swap
            continue;
        for (int i = 0; i < tms; i++)
            target = target->next;
        swap(target, cur);
    }
    puts("end of q_shuffle");
}

補上數學分析。

2024q1 Homework1 (lab0)

開發環境

neoVim Config

queue.c

q_new

q_free

q_insert_head, q_insert_tail

q_remove_head, q_remove_tail

q_delete_mid

q_delete_dup

q_swap

q_reverse

q_reverseK

q_sort

q_ascend/q_descend

論文 〈Dude, is my code constant time?〉

q_shuffle

Read more

2024q1 Homework5 (assessment)

Concurrent Red-Black Tree

2024q1 Homework 4 (quiz3+4)

簡易（？）鍵盤選購指南