2023q1 Homework1 (lab0)

contributed by < paul90317 >

Reviewed by `Urbaner`

程式碼風格簡短標準一致，說明有抓到重點，可能因省略而忘記提及過程。
第二次交作業，考慮在 git commit 並程式碼簡介過程及說明，以利後續開發者維護，注意代名詞用法，詳 google tech writing。
工作技術成熟，commit 命名可以詳述特徵，不要用數字。 requirement 可以附連結，否則不知道指的是怎樣的要求。甚至每次 commit 都有各自的要求，針對某功能的幾個細項。根據 Amazing Code Reviews ，commit 盡量短小，可以幫助團隊討論及開發。

作業要求

開發環境

$ lscpu

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  8
  On-line CPU(s) list:   0-7
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Core(TM) i5-9300H CPU @ 2.40GHz
    CPU family:          6
    Model:               158
    Thread(s) per core:  2
    Core(s) per socket:  4
    Socket(s):           1
    Stepping:            10
    CPU max MHz:         4100.0000
    CPU min MHz:         800.0000
    BogoMIPS:            4800.00
Caches (sum of all):     
  L1d:                   128 KiB (4 instances)
  L1i:                   128 KiB (4 instances)
  L2:                    1 MiB (4 instances)
  L3:                    8 MiB (1 instance)

$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0

開發紀錄

目前分數 100/100

Image Not Showing Possible Reasons
The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported
Learn More →

解析 list.h

struct list_head {
    struct list_head *prev;
    struct list_head *next;
};

typedef struct {
    char *value;
    struct list_head list;
} element_t;

#define container_of(ptr, type, member) ...

container_of 是 某個東西的容器，只要把指標丟進去，便可以反向定位出容器的指標。
ptr 為容器中成員的指標, type 為容器的類別, member 為成員的名稱。
舉例：container_of(node, element_t, list)，其中 node 的類別是 struct list_head

命名 head：head of queue，不是第一個元素
命名 entry：類別是 element_t，容器
命名 node：類別是 struct list_head

q_new

INIT_LIST_HEAD(head)：將指標 struct list_head 初始化成一個大小為 0 的 queue。

/* Create an empty queue */
struct list_head *q_new()
{
    struct list_head *q = malloc(sizeof(struct list_head));
    INIT_LIST_HEAD(q);
    return q;
}

q_free

既然此處的鏈結串列為雙向，那「順向」到底是哪個方向？需要精準用語和定義。

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

jserv

list_for_each(node, head)：單純的向後 (向 next) 走訪，當對 node 進行修改會發生未定義行為，不適合用於這裡。
list_for_each_safe(node, safe, head)：多一個 safe 預先儲存下一個存取的節點，便可以在每次對 node 修改。
list_for_each_entry_safe(entry, safe, head, member)：由於存取每個節點都要取得 node 的容器，並且將 value 釋放，不如直接拿到容器 entry 會更漂亮。

/* Free all storage used by queue */
void q_free(struct list_head *l)
{
    element_t *entry, *safe;
    list_for_each_entry_safe (entry, safe, l, list) {
        q_release_element(entry);
    }
    free(l);
}

q_size

/* Return number of elements in queue */
int q_size(struct list_head *head)
{
    if (!head)
        return -1;
    int size = 0;
    struct list_head *node;
    list_for_each (node, head)
        size++;
    return size;
}

q_insert

static inline bool q_insert(struct list_head *head, char *s)
{
    if (!head || !s)
        return false;
    
    element_t *entry = malloc(sizeof(element_t));
    if (!entry)
        return false;
    
    size_t len = strlen(s) + 1;
    entry->value = malloc(len);
    if (!entry->value) {
        free(entry);
        return false;
    }
    
    memcpy(entry->value, s, len);
    list_add(&entry->list, head);
    return true;
}

/* Insert an element at head of queue */
bool q_insert_head(struct list_head *head, char *s)
{
    return q_insert(head, s);
}

/* Insert an element at tail of queue */
bool q_insert_tail(struct list_head *head, char *s)
{
    return q_insert(head->prev, s);
}

q_remove

strncpy(s1, s2, n) 重點如下

If the array pointed to by s2 is a string that is shorter than n characters,
–-> null characters are appended to the copy in the array pointed to by s1,
–-> until n characters in all have been written.

如果 s2 比 n 短，會在 s1 中的副本的結尾接上空字元，否則只會把 s2 前 n 個字元複製到 s1 中，所以為了防止後續對 s1 進行超出記憶體的讀寫 (讀不到空字元)，都會在使用該函式後固定把緩存 s1 最後變成空字元。

static inline element_t *q_remove(struct list_head *node,
                                  char *sp,
                                  size_t bufsize)
{
    if (!node || list_empty(node))
        return NULL;
    
    element_t *entry = container_of(node, element_t, list);
    if (sp) {
        strncpy(sp, entry->value, bufsize - 1);
        sp[bufsize - 1] = '\0';
    }
    
    list_del(node);
    return entry;
}

/* Remove an element from head of queue */
element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
    return q_remove(head->next, sp, bufsize);
}

/* Remove an element from tail of queue */
element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize)
{
    return q_remove(head->prev, sp, bufsize);
}

q_delete_mid

當 fast 繞了一圈後，slow 會剛好在中間。

/* Delete the middle node in queue */
bool q_delete_mid(struct list_head *head)
{
    // https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/

    if (!head || list_empty(head))
        return false;

    struct list_head *fast = head, *slow = head;
    do {
        fast = fast->next->next;
        slow = slow->next;
    } while (fast->prev != head && fast != head);

    list_del(slow);
    q_release_element(list_entry(slow, element_t, list));

    return true;
}

q_delete_dup

走訪 list，存取每一個 entry，如果當前元素 (entry) 的下一個元素 (next) 與 entry 有相同的值時，會執行以下步驟

迭代 next，直到停在第一個不同於 entry 的值，並釋放所經元素的記憶體。
將 entry 前一元素與 next 串連，釋放 entry。

/* Delete all nodes that have duplicate string */
bool q_delete_dup(struct list_head *head)
{
    // https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/

    if (!head)
        return false;

    element_t *entry, *next, *temp;
    list_for_each_entry_safe (entry, next, head, list) {
        if (&next->list != head && strcmp(entry->value, next->value) == 0) {
            do {
                temp = next;
                next = list_entry(next->list.next, element_t, list);
                q_release_element(temp);
            } while (&next->list != head &&
                     strcmp(entry->value, next->value) == 0);

            entry->list.prev->next = &next->list;
            next->list.prev = entry->list.prev;

            q_release_element(entry);
        }
    }

    return true;
}

q_swap

變數說明如下

prev：當前節點的上一節點
node：當前節點
next：當前節點的下一節點
nextnext：當前節點的下下節點

while 說明如下

init：初始化變數
condition：當剩下兩個以下的節點則結束迭代
statement：交換 node 與 next，並與 prev 和 nextnext 進行連接
iteration：迭代 node 兩次，並給予其他變數新的值

/* Swap every two adjacent nodes */
void q_swap(struct list_head *head)
{
    // https://leetcode.com/problems/swap-nodes-in-pairs/

    if (!head)
        return;
    
    /* init */
    struct list_head *node = head->next, *next = node->next, *prev = node->prev,
                     *nextnext = next->next;
    while (/* condition */ node != head && next != head) {
        
        /* statement */
        next->next = node;
        next->prev = prev;
        node->next = nextnext;
        node->prev = next;
        
        prev->next = next;
        nextnext->prev = node;
        
        /* iteration */
        node = nextnext;
        next = node->next;
        nextnext = next->next;
        prev = node->prev;
    }
}

q_reverse

使用 list_for_each_safe 可以讀寫 node，原理請見 q_free

/* Reverse elements in queue */
void q_reverse(struct list_head *head)
{
    if (!head || list_empty(head))
        return;

    struct list_head *node, *safe, *temp;
    list_for_each_safe (node, safe, head) {
        temp = node->next;
        node->next = node->prev;
        node->prev = temp;
    }

    temp = head->next;
    head->next = head->prev;
    head->prev = temp;
}

q_reverseK

使用 stack，每 k 個元素作為一組將其推入 stack 中，然後再依序取出來，就會得到相反順序的元素序列

改進你的漢語表達。

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

jserv

已透過 Chat GPT 改進

/* Reverse the nodes of the list k at a time */
void q_reverseK(struct list_head *head, int k)
{
    // https://leetcode.com/problems/reverse-nodes-in-k-group/

    if (!head || list_empty(head) || k <= 0)
        return;

    struct list_head *node = head->next, *safe;
    for (int n = q_size(head); n >= k; n -= k) {
        LIST_HEAD(stack);

        for (int i = 0; i < k; ++i) {
            safe = node->next;
            list_del(node);
            list_add(node, &stack);
            node = safe;
        }

        list_splice_tail(&stack, node);
    }
}

q_descend

先來分析題目，每個元素 entry 會不會被刪掉取決在右側是否有一個比自己大的元素。
所以只要由右而左走訪，紀錄最大值 best，就可以在存取每個元素時只需要比較一次。

/* Remove every node which has a node with a strictly greater value anywhere to
 * the right side of it */
int q_descend(struct list_head *head)
{
    // https://leetcode.com/problems/remove-nodes-from-linked-list/

    if (!head || list_empty(head))
        return 0;

    element_t *entry = list_entry(head->prev, element_t, list),
              *safe = list_entry(entry->list.prev, element_t, list);
    char *best = entry->value;
    while (&entry->list != head) {
        if (strcmp(entry->value, best) < 0) {
            list_del(&entry->list);
            q_release_element(entry);
        } else {
            best = entry->value;
        }

        entry = safe;
        safe = list_entry(entry->list.prev, element_t, list);
    }

    return q_size(head);
}

q_sort

使用 merge sort 進行排序

l_merge

將 left 與 right 合併到 head 的尾端。

static inline void l_merge(struct list_head *head,
                           struct list_head *left,
                           struct list_head *right)
{
    while (!list_empty(left) && !list_empty(right)) {
        if (strcmp(list_entry(left->next, element_t, list)->value,
                   list_entry(right->next, element_t, list)->value) < 0) {
            list_move_tail(left->next, head);
        } else {
            list_move_tail(right->next, head);
        }
    }

    if (!list_empty(left)) {
        list_splice_tail_init(left, head);
    } else {
        list_splice_tail_init(right, head);
    }
}

q_sort

使用快慢指標找中點 slow，在用 list_cut_position 將其一分為二成 left 和 right，分別做排序，再用 l_merge 合併。

list_cut_position(left, right, mid)：將 right 以元素 mid 進行切割，左邊放到 left (包含 mid)，右邊保留在 right。

/* Sort elements of queue in ascending order */
void q_sort(struct list_head *head)
{
    if (!head || q_size(head) < 2)
        return;

    struct list_head *fast = head, *slow = head;
    do {
        fast = fast->next->next;
        slow = slow->next;
    } while (fast != head && fast->next != head);

    LIST_HEAD(left);
    LIST_HEAD(right);
    list_splice_tail_init(head, &right);
    list_cut_position(&left, &right, slow);
    q_sort(&left);
    q_sort(&right);
    l_merge(head, &left, &right);
}

q_merge

持續將 queue 兩兩合併，直到只剩 1 個 queue。

for：每次迭代會將兩個 queue 合併成一個並放到 slow 指向的 queue 裡 (原本是空的)，當只剩一個 queue 時，在 for 結束時會直接放到 slow 指向的 queue 裡。
while：若當前有 n 個 queue，經過迭代後會剩 floor[(n + 1) / 2]個 (因為會兩兩合併)，最後剩下一個就是題目要的解。

int q_merge(struct list_head *head)
{
    // https://leetcode.com/problems/merge-k-sorted-lists/
    if (!head || list_empty(head))
        return 0;

    int n = q_size(head);
    while (n > 1) {
        struct list_head *fast = head->next, *slow = head->next;
        for (int i = 0; i < n / 2; ++i) {
            LIST_HEAD(temp);
            l_merge(&temp, container_of(fast, queue_contex_t, chain)->q,
                    container_of(fast->next, queue_contex_t, chain)->q);

            list_splice_tail(&temp,
                             container_of(slow, queue_contex_t, chain)->q);

            fast = fast->next->next;
            slow = slow->next;
        }
        if (n & 1)
            list_splice_tail_init(container_of(fast, queue_contex_t, chain)->q,
                                  container_of(slow, queue_contex_t, chain)->q);

        n = (n + 1) / 2;
    }

    return q_size(container_of(head->next, queue_contex_t, chain)->q);
}

trace-17

初看論文 Dude, is my code constant time? 之後，發現 dudect 測試工具其實是對某個函式進行大量測試，得到對不同 cycle 的機率分佈，從而推論該函式是否為 constant time。

所以首先，我們先觀測目前的機率分佈，看看為何我的程式碼不是 constant time。

可以看出 q_insert_head 在 cycle 為 350 附近有不明原因的高峰，經過排查，這是 malloc 造成的，原因有待查證。

如果 dudect 是從機率分佈判斷是否為 constant time，解法就很簡單了，讓 cycle 為 150 的樣本跑完後等到 cycle 350 的樣本結束再離開，該方法用到 busy loop，以下是程式碼。

#define at_least_cycle(c, type, func, ...)    \
    ({                                        \
        type ret;                             \
        if (simulation) {                     \
            int64_t t_start = cpucycles();    \
            ret = func(__VA_ARGS__);          \
            while (cpucycles() - t_start < c) \
                ;                             \
        } else                                \
            ret = func(__VA_ARGS__);          \
        ret;                                  \
    })

為了解釋該巨集的用法，我舉一個例子，當有一個函式 float f(int a,char *b)，而函式呼叫 f(a, b) 套用巨集變成 at_least_cycle(500, float, f, a, b)，第一個參數是至少要跑多少 cycle，第二個參數是函式回傳值，第三個參數是原本的函式的指標，剩下的參數就是原本 f 的參數。

機率分佈如下

最後將程式碼上傳，順利看到皮卡丘。
Action

TODO
理解論文 Dude, is my code constant time? 想要做的事，改進 simulation 程式。而非用 busy loop 假解，FB 討論區

運用 Valgrind 排除 `qtest` 實作的記憶體錯誤

輸入命令 valgrind ./qtest < traces/trace-01-ops.cmd 可得以下輸出。






































==43214== 32 bytes in 1 blocks are still reachable in loss record 1 of 5
==43214==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==43214==    by 0x10CBE0: do_new (qtest.c:146)
==43214==    by 0x10DE0C: interpret_cmda (console.c:181)
==43214==    by 0x10E3C1: interpret_cmd (console.c:201)
==43214==    by 0x10F02B: run_console (console.c:691)
==43214==    by 0x10D1FE: main (qtest.c:1224)
==43214== 
==43214== 56 bytes in 1 blocks are still reachable in loss record 2 of 5
==43214==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==43214==    by 0x10F122: test_malloc (harness.c:133)
==43214==    by 0x10F527: q_new (queue.c:31)
==43214==    by 0x10CC19: do_new (qtest.c:150)
==43214==    by 0x10DE0C: interpret_cmda (console.c:181)
==43214==    by 0x10E3C1: interpret_cmd (console.c:201)
==43214==    by 0x10F02B: run_console (console.c:691)
==43214==    by 0x10D1FE: main (qtest.c:1224)
==43214== 
==43214== 130 bytes in 10 blocks are still reachable in loss record 3 of 5
==43214==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==43214==    by 0x49FD60E: strdup (strdup.c:42)
==43214==    by 0x1122F6: line_history_add (linenoise.c:1275)
==43214==    by 0x10F033: run_console (console.c:692)
==43214==    by 0x10D1FE: main (qtest.c:1224)
==43214== 
==43214== 150 bytes in 10 blocks are still reachable in loss record 4 of 5
==43214==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==43214==    by 0x49FD60E: strdup (strdup.c:42)
==43214==    by 0x1122F6: line_history_add (linenoise.c:1275)
==43214==    by 0x113151: line_hostory_load (linenoise.c:1360)
==43214==    by 0x10D259: main (qtest.c:1212)
==43214== 
==43214== 160 bytes in 1 blocks are still reachable in loss record 5 of 5
==43214==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==43214==    by 0x112342: line_history_add (linenoise.c:1263)
==43214==    by 0x113151: line_hostory_load (linenoise.c:1360)
==43214==    by 0x10D259: main (qtest.c:1212)
==43214==

第 1 行與第 9 行是因為 queues 仍然可以存取，我最一開始的作法是在 main 函式最後放置以下程式碼

struct list_head *cur = chain.head.next;
while (cur != &chain.head) {
    queue_contex_t *ctx;
    ctx = list_entry(cur, queue_contex_t, chain);
    cur = cur->next;
    q_free(ctx->q);
    free(ctx);
}

然而在執行 trace-14 陷入無窮迴圈，當時使用的指令是 make valgrind。本以為是排序演算法太慢，除錯後發現是因為節點太多釋放且釋放每個節點的時間過長。
解決方法是加 set_cautious_mode(false) 在上面提到的程式碼的前面，可降低每個節點釋放時間。
也可以直接移除 q_quit 第一行的 return true;，因為 q_quit 本來就是釋放 queues 的函式。

第 19 行以後是在 line_history_add 第 1263 與 1275 行所分配的記憶體沒有釋放。































int line_history_add(const char *line)
{
    if (history_max_len == 0)
        return 0;

    /* Initialization on first call. */
    if (!history) {
        history = malloc(sizeof(char *) * history_max_len);
        if (!history)
            return 0;
        memset(history, 0, (sizeof(char *) * history_max_len));
    }

    /* Don't add duplicated lines. */
    if (history_len && !strcmp(history[history_len - 1], line))
        return 0;

    /* Add an heap allocated copy of the line in the history.
     * If we reached the max length, remove the older line. */
    char *linecopy = strdup(line);
    if (!linecopy)
        return 0;
    if (history_len == history_max_len) {
        free(history[0]);
        memmove(history, history + 1, sizeof(char *) * (history_max_len - 1));
        history_len--;
    }
    history[history_len] = linecopy;
    history_len++;
    return 1;
}

為了要在程式結束前將 history 釋放，在 linenoise.c 增加以下函式

/* free history */
#include "console.h"
static bool history_free(int argc, char *argv[])
{
    for (int i = 0; i < history_len; i++) {
        free(history[i]);
    }
    free(history);
    return true;
}

在 if (!history) { 區塊中插入 add_quit_helper(history_free); 以註冊釋放事件。
以 add_quit_helper 註冊的函式會在 main 函式中所執行的 finish_cmd 執行。
接著執行命令 make valgrind，沒有錯誤訊息。
commit

論文閱讀〈Dude, is my code constant time?〉

論文中，使用兩種資料集 (fix 和 random) 作為輸入，產出兩個 distribution，對他們進行 Welch’s t-test，試圖推翻兩者相同的 null hypothesis，推翻代表程式非常數時間，否則有可能是常數時間。

在 lab0 中，random 資料集的每次測試，是先產生隨機大小的 queue，接著執行欲測試函式並紀錄其時間，fix 則都是大小為 0 (測試 remove 大小為 1) 的 queue。

以下是用於統計計算的結構。

/* 取樣配置 */
typedef struct {
  size_t chunk_size; /* 單一取樣的參數大小 */
  size_t number_measurements; /* 取樣次數 */
} dudect_config_t;

/* 多次取樣後的結果 */
typedef struct {
  double mean[2]; /* 平均 */
  double m2[2]; /* 每項與平均的差的總和 */
  double n[2]; /* 次數 */
} ttest_ctx_t;

/* 多次取樣過後的原始資料 */
typedef struct {
  int64_t *ticks; 
  int64_t *exec_times; /* 每次取樣的執行時間 (取樣結果) */
  uint8_t *input_data; /* 每格取樣預先產生的資料 */
  uint8_t *classes; /* 每個取樣是 fix 或 random */
  dudect_config_t *config; /* 取樣配置 */
  ttest_ctx_t *ttest_ctxs[DUDECT_TESTS];
  int64_t *percentiles;
} dudect_ctx_t;

/* 將單次取樣結果 x 更新至多次取樣結果 ctx 裡 */
static void t_push(ttest_ctx_t *ctx, double x, uint8_t class);

/* 計算兩個 class 的 t */
static double t_compute(ttest_ctx_t *ctx);

/* 初始化多次取樣結果 */
static void t_init(ttest_ctx_t *ctx);

取自論文原始碼 dudect.h

percentiles

引用論文
Cropping: Typical timing distributions are positively
skewed towards larger execution times. This may be caused by
measurement artifacts, the main process being interrupted by
the OS or other extraneous activities. In this case, it may be
convenient to crop the measurements (or discard measurements
that are larger than a fixed, class-independent threshold).

引用論文
Pre-processing: We pre-process measurements prior to
statistical processing. We discard measurements that are larger
than the p percentile, for various2 values of p. The rationale
here is to test a restricted distribution range, especially the
lower cycle count tail. The upper tail may be more influenced
by data-independent noise. This (heuristic) process gives good
empirical results (makes detection faster); nevertheless we also
process measurements without pre-processing.

前兩個引文取自論文，是在說為了排除排外在因素，像是測量遺器或是作業系統中斷事件等等，論文使用百分點裁切來進行前處理。以下幾個論點是我閱讀論文原始馬後，覺得跟前兩段引文最相關的部份。

裁切是為了排除與資料無關的外在因素，外在因素通常會使某次執行時間更長 (幾百倍，如圖一)，使想一個情況，如果 random 或 fix 都有一些外在因素造成的樣本，這些樣本是其他樣本的好幾倍，這樣會讓原本的樣本被稀釋，可能讓原本有 leakage 的清況被誤判為 constant time，由於原始碼的會持續檢測不會中止，這就會讓發現 leakage 的時間延後。
第一筆資料先不統計，而是用它產生多個裁剪點，dudect_main
將原始分佈以不同的百分點進行裁切，並把執行時間較高的樣本移除。原始分佈可視為以 100% 進行裁切的受限分佈，update_statistics
每個受限分佈都包含兩個分佈，分別代表兩個 class，fix 和 random。將每個受限分佈的 fix 和 random class 透過 t_compute 進行計算得到一個統計量 t-value。代表每個受限分佈都有一個統計量。
論文透過函數
$f (x) = 1 - (\frac{1}{2})^{0.1 \cdot x}$ 讓裁切點於高百分位分佈較低百分位多，但沒有科學根據來說明這樣更好，prepare_percentiles
為了更準確的發現 leakage，會在所有不同受限分佈中找到統計量最高的來進行統計試驗，這代表可以更好的排除外在因素，max_test、report
只有在統計資料足夠的情況下 (不只一筆) 才會進行統計檢驗並嘗試發現 leakage，report

圖一

這是一張柱狀圖，可以看到時間大部分的時間都在 10000 以下，但在 60000 確有幾筆資料，可能是與測資獨立的外在因素導致。

著手改進 lab0 dudect 實作

lab0 中的實作因為有自己的考量，有些部份似乎與論文原始碼不同，條列以下幾點

	lab0	dudect.h
percentile	有	無
每次測試持續時間	到有足夠資料就回報並停止	直到發現 leakage
判定常數時間的方法	10次 leakage 機會，有一次常數時間就是常數時間	程式不停止，只有 leakage found 和 constant time at this moment

lab0 為了跑自動測試，將每次測試的持續時間改成有上限。為了讓程式更容易通過自動測試，將 判定常數時間的方法 改成有 9 次 leakage 的機會。

#define N_THRESHOLDS 100

static bool first;
static t_context_t restricted_distributions[N_THRESHOLDS + 1];
static int64_t percentiles[N_THRESHOLDS];

N_THRESHOLDS: 有多少受限分佈
first: 是否是第一次呼叫 dudect_main
restricted_distributions: 受限分佈，restricted_distributions[N_THRESHOLDS] 存原始分佈
percentiles: 每個受限分佈的閾值。

typedef enum {
    LEAKAGE_FOUND = 0,
    NO_LEAKAGE_EVIDENCE_YET,
    NOT_ENOUGH_MEASURE,
    ERROR_IMPLEMENT
} report_state_t;

回報的狀態

LEAKAGE_FOUND: 發現 leakage
NO_LEAKAGE_EVIDENCE_YET: 常數時間
NOT_ENOUGH_MEASURE: 沒有足夠的樣本
ERROR_IMPLEMENT: 錯誤的實作

static report_state_t dudect_main(int mode)
{
    prepare_inputs(input_data, classes);
    if (!measure(before_ticks, after_ticks, input_data, mode))
        return ERROR_IMPLEMENT;
    differentiate(exec_times, before_ticks, after_ticks);
    if (first) {
        prepare_percentiles(exec_times);
        first = false;
        return NOT_ENOUGH_MEASURE;
    }
    update_statistics(exec_times, classes);
    return report();
}

將 doit 改成 dudect_main，第一次會呼叫函式 prepare_percentiles 而非 update_statistics。

static bool test_const(char *text, int mode)
{
    for (int cnt = 0; cnt < TEST_TRIES; ++cnt) {
        printf("Testing %s...(%d/%d)\n", text, cnt, TEST_TRIES);
        dudect_init();
        report_state_t result = NOT_ENOUGH_MEASURE;
        while (result == NOT_ENOUGH_MEASURE) {
            result = dudect_main(mode);
        }
        if (result == NO_LEAKAGE_EVIDENCE_YET)
            return true;
        if (result == ERROR_IMPLEMENT)
            return false;
    }
    return false;
}

美化 test_const

static t_context_t *max_test()
{
    size_t ret = N_THRESHOLDS;
    double max = 0;
    for (size_t i = 0; i <= N_THRESHOLDS - 20; i++) {
        if (restricted_distributions[i].n[0] +
                restricted_distributions[i].n[1] >
            ENOUGH_MEASURE) {
            double x = fabs(t_compute(restricted_distributions + i));
            if (max < x) {
                max = x
                ret = i;
            }
        }
    }
    return restricted_distributions + ret;
}

這是我自己添加，取最大值時忽略後面 20 個受限分佈，因為它們可能無法有效排除外在因素，但要注意剩下的受限分佈可能會錯誤地排除正常樣本，從而導致本來會發現 leakage 的情況變成常數時間，在論文中這會造成資安漏洞。
此外，我無法辨別 malloc 是否是造成外在因素的原因 ? 若是，malloc 所造成的外在因素是否也是一種 leakage ?

Jim Huang

2023/03/01 15:22:56

排查

如何排查？ (Edited)

paul90317

2023/03/02 03:57:17

1. 我先去看 simulation 的程式碼，發現要複製的字串長度都是 8，先排除掉因為字串長度不同造成非常數時間複雜度的情況。 2. 接著我將 malloc 換成函式中的靜態陣列，接者做實驗得到機率分布，觀察後判斷是常數時間複雜度的分佈，就可以確認是 malloc 造成的問題。