Try   HackMD

2023q1 Homework1 (lab0)

contributed by <DokiDokiPB >

開發環境

$ gcc --version                                    
gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  4
  On-line CPU(s) list:   0-3
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz

使用主機: Macbook Air 2014 A1466 (Ubuntu Linux)

開發過程

queue.c 中,使用 list.hlist_head 串接資料,list.h 中具有幾個好用函式與巨集

  • 初始化佇列開頭:INIT_LIST_HEAD
  • 新增單一節點:list_add, list_add_tail
  • 轉移多個節點至新的 headlist_splice
  • node:表示具有元素( element_t )的節點
  • list:具有多個節點的佇列開頭,或是複數串接的節點 ???
  • head:表示佇列開頭

改進漢語表達。

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

q_new

struct list_head *q_new()
{
    struct list_head *head = malloc(sizeof(struct list_head));
    if (!head) {
        return NULL;
    }
    INIT_LIST_HEAD(head);
    return head;
}

透過 malloc 配置 head,依據 list_head,皆為 head->next, head->prev 指向 heaed

q_free

void q_free(struct list_head *l)
{
    if (!l) {
        return;
    }
    element_t *e, *safe;
    list_for_each_entry_safe (e, safe, l, list) {
        free(e->value);
        free(e);
    }
    free(l);
}

list_for_each_entry_safe 中,利用 *safe 指標保留下個節點的位址,就能將現在操作的節點釋放

q_insert_head, q_insert_tail

將節點加入至 headq_insert_head 透過 list_add 加為首個節點;q_insert_tail 透過 list_add_tail 加為末個節點

bool q_insert_head(struct list_head *head, char *s)
{
    if (!head || !s) {
        return false;
    }

    element_t *e = malloc(sizeof(element_t));
    char *dup = strdup(s);
    if (!e) {
        return false;
    }
    if (!dup) {
        free(e);
        return false;
    }
    e->value = dup;
    
    list_add(&e->list, head);
    return true;
}

bool q_insert_tail(struct list_head *head, char *s)
{
    // ...
    list_add_tail(&e->list, head);
    return true;
}

q_remove_head, q_remove_tail

將指定節點的元素從 head 移除

element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
    if (!head || !bufsize) {
        return NULL;
    }
    if (list_empty(head)) {
        return NULL;
    }

    struct list_head *node = head->next;
    element_t *e = container_of(node, element_t, list);
    size_t len = strlen(e->value);
    memset(sp, 0, bufsize);
    if (len > bufsize) {
        strncpy(sp, e->value, bufsize);
        sp[bufsize - 1] = '\n';
    } else {
        strncpy(sp, e->value, len);
    }
    list_del(node);
    return e;
}

element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize)
{
    // ...
    struct list_head *node = head->prev;
    // ...
    return e;
}

依據在 queue.h 中函式的說明,bufsize*sp 的陣列長度

  • bufsize 小於元素中的字串長度(變數 len ),則擷取部分元素字串。
  • bufsize 大於 len,則整段複製到 *sp

思考如何精簡程式碼,善用前置處理器。

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

q_size

int q_size(struct list_head *head)
{
    if (!head) {
        return 0;
    }
    int c = 0;
    struct list_head *node;
    list_for_each (node, head) {
        c++;
    }
    return c;
}

利用 list_for_each 走過整個佇列,取得元素數量

q_delete_mid

bool q_delete_mid(struct list_head *head) { // if (!head) { return false; } if (list_empty(head)) { return false; } struct list_head *slow, *fast; slow = head->next; fast = head->next; while (fast != head && fast->next != head) { slow = slow->next; fast = fast->next->next; } list_del(slow); element_t *e = container_of(slow, element_t, list); free(e->value); free(e); return true; }

依據 LeetCode 2095. Delete the Middle Node of a Linked List 以及其圖片的說明

  • 當節點數量為奇數時,中間節點為正中間(若有 7 個節點,取第 4 個)
  • 當節點為偶數時,中間節點為中間取後一個(若有 4 個節點,取第 3 個)

在第 15 ~ 17 行中,透過一次移動一個節點的 slow 指標與一次移動兩個節點的 fast 指標,當 fast 回到 head 指標時,slow 指標即在中間節點的位址上。

改進程式碼,使其更精簡的程式碼。

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

q_reverse

void q_reverse(struct list_head *head) { if (!head) { return; } head->prev->next = NULL; struct list_head *node = head->next; struct list_head *cur = head; while (node) { struct list_head *tmp = node->next; cur->prev = node; node->next = cur; node->prev = tmp; cur = node; node = tmp; } cur->prev = head; head->next = cur; }

反向 = reverse

需要詞彙精準的定義。

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

在第 6 行中,先將 doubly-linked list 尾端元素 head->prev->next 指派為 NULL,使得在 9 行中能夠脫離 while 迴圈,表示到尾端處理完畢

  • 使用 node 變數走過每個節點,並將每個節點反向。
  • 使用 cur 表示 node 的前一個節點;在第 18 行中表示反向後的第一個節點,串接至 head

q_reverseK

void q_reverseK(struct list_head *head, int k)
{
    // https://leetcode.com/problems/reverse-nodes-in-k-group/
    if (k <= 1)
        return;

    struct list_head *safe, *node, *start = head->next;
    int c = 0;
    list_for_each_safe (node, safe, head) {
        c++;
        if (c == k) {
            c = 0;
            struct list_head *prev = start->prev;

            LIST_HEAD(sub_head);
-            sub_head.next = start;
-            start->prev = &sub_head;
-            sub_head.prev = node;
-            node->next = &sub_head;
+            list_cut_position(&sub_head, start, node);
            q_reverse(&sub_head);
+            list_splice(&sub_head, prev);
-            prev->next = sub_head.next;
-            sub_head.next->prev = prev;
-            sub_head.prev->next = safe;
-            safe->prev = sub_head.prev;

            start = safe;
        }
    }
}

k 個節點,將該複數節點加入到 sub_head 中,透過既有的函式 q_reverse 處理;再將處理好的 sub_head 中的節點加回 head

其他同學的實作,傾向直接操作節點。我的實作還是以一個 sub_head 包裝想要操作的節點。
TODO: 撰寫更精簡的程式碼。參照他人的程式碼時,應標註出處。

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

原本的實作考量 list_cut_postition 的函式參數名稱都為 head_from, head_to,可能僅適用完成的佇列而非節點。之後,觀摩 yanjiew 後,我發現可以使用 list_cut_position 去切割 head 的多個節點,再使用 list_splice 接回原本的 head

q_swap

void q_swap(struct list_head *head)
{
    // https://leetcode.com/problems/swap-nodes-in-pairs/
    q_reverseK(head, 2);
}

在作業說明中, q_swap 是一種 q_reverseK(head, 2); 的特例。

q_sort

避免張貼完整程式碼,你應該闡述自己的想法、考量因素,還有相關的策略。程式碼列表僅為輔助性質。

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

void q_sort(struct list_head *head)
{
//    ...

    // split with middle.
    struct list_head *slow = head->next, *fast = head->next->next;
    while (fast != head && fast->next != head) {
        slow = slow->next;
        fast = fast->next->next;
    }
    LIST_HEAD(left_head);
    LIST_HEAD(right_head);
-    struct list_head *right_node = slow->next;
-    left_head.next = head->next;
-    left_head.next->prev = &left_head;
-    left_head.prev = slow;
-    left_head.prev->next = &left_head;

-    right_head.next = right_node;
-    right_head.next->prev = &right_head;
-    right_head.prev = head->prev;
-    right_head.prev->next = &right_head;
-    INIT_LIST_HEAD(head);
+    list_cut_position(&left_head, head, slow);
+    list_splice_init(head, &right_head);
    q_sort(&left_head);
    q_sort(&right_head);
    merge_two_lists(&left_head, &right_head, head);
}

這裡採用 merge sort,程式碼主體可分為 divide, conquer and merge

  • Divide: 將 head 切為左半邊 left_head 與右半邊 right_head ,在程式碼中,切割的中間的節點為 slow
  • Conquer: 將 left_headright_head 分別用 q_sort 函式處理
  • Merge: 合併兩個佇列(以 merge_two_lists 函式實作)
    將較小元素先加入至 head 中,若其中一方沒有元素,即把另一方整個加入至 head 尾端中

在實作的過程中,可以利用 list_cut_position 切割左半與右半佇列

q_descend

int q_descend(struct list_head *head)
{
    int ret = 0;
    if (!head)
        return ret;
    if (list_empty(head))
        return ret;

    ret = 1;
    LIST_HEAD(less);
    struct list_head *cur = head->prev, *node = head->prev->prev;

    while (node != head) {
        struct list_head *prev = node->prev;
        element_t *e1 = list_entry(cur, element_t, list);
        element_t *e2 = list_entry(node, element_t, list);
        int c = strcmp(e2->value, e1->value);
        if (c >= 0) {
            cur = node;
            ret++;
        } else {
            list_move(node, &less);
        }
        node = prev;
    }

    element_t *e, *safe;
    list_for_each_entry_safe (e, safe, &less, list) {
        free(e->value);
        free(e);
    }
    return ret;
}

LeetCode 2487. Remove Nodes From Linked List 以及圖片的說明中

5 -> 2 -> 13 -> 3 -> 8
-->>
13 -> 8

可以視為由後往前來數的嚴格遞增的佇列。在實作上,也是從後往前的方式移除較小的元素。

  • less 表示要被刪除的節點,先將欲刪除的節點預先加入至 less 佇列中
  • cur 表示當前最大數值的節點
  • node 表示當前比較的節點,若大於 cur 的字串,則 cur 變為 noderet 累計加 1

q_delete_dup

bool q_delete_dup(struct list_head *head)
{
    // https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/
    if (!head)
        return false;
    if (list_is_singular(head))
        return true;

    LIST_HEAD(delete_head);
    int move_count = 1;
    struct list_head *cur = head->next;
    while (cur != head) {
        struct list_head *node = cur->next;
        bool isdup = false;
        while (node != head) {
            struct list_head *next = node->next;
            element_t *e1 = list_entry(node, element_t, list);
            element_t *e2 = list_entry(cur, element_t, list);
            int c = strcmp(e1->value, e2->value);
            if (c == 0) {
                isdup = true;
                list_move(node, &delete_head);
            }
            node = next;
        }

        if (isdup) {
            list_move(cur, &delete_head);
        } else {
            move_count += 1;
        }
        cur = head;
        for (int i = 0; i < move_count; i++)
            cur = cur->next;
    }

    element_t *e, *safe;
    list_for_each_entry_safe (e, safe, &delete_head, list) {
        free(e->value);
        free(e);
    }

    return true;
}

雖然有提供 LeetCode 23. Merge k Sorted Lists,根據函式註解說明,並無特別強調排序。

在沒有排序的佇列中,每次比較一個節點是否有重複,皆必須走過所有節點確認

  • cur 表示當前的節點
  • node 表示被比較的節點,如果與 cur 的元素相同,則加入到 delete_head
  • 透過 delete_head 重複的節點刪除。

此程式碼移除重複的節點,憑肉眼觀察也是,不過卻有 ERROR: Duplicate strings are in queue or distinct strings are not in queue,即使使用 q_sort 也會出現。

🤔🤔🤔 問題待排除

$ ./qtest 
cmd> new
l = []
cmd> it 5
l = [5]
cmd> it 4
l = [5 4]
cmd> it 3
l = [5 4 3]
cmd> it 2
l = [5 4 3 2]
cmd> it 1
l = [5 4 3 2 1]
cmd> it 5
l = [5 4 3 2 1 5]
cmd> it 4
l = [5 4 3 2 1 5 4]
cmd> dedup
ERROR: Duplicate strings are in queue or distinct strings are not in queue
l = [3 2 1]
cmd> quit

q_merge

int q_merge(struct list_head *head)
{
    // https://leetcode.com/problems/merge-k-sorted-lists/
    return 0;
}

如果是兩個佇列合併,則只需要 merge_two_lists 函式,若是多個佇列合併,則需要考量使用 priority queue 紀錄最小值


排除錯誤以及使用 Valgrind 排除錯誤

錯誤一:在兩個以上的佇列,在命令列使用 free 會造成程式進入無窮迴圈

避免使用「卡住」這樣不精準的詞彙。

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

遇到了一個錯誤,步驟如下

  1. ./qtest 啟動程式
  2. new 兩次,會有兩個不同 ID 的 queue
  3. free 一次會造成程式碼卡住
  • 經過測試之後,並不是 q_free 造成的無限迴圈
  • 推測是在某個地方造成無限迴圈

解決思路:

  • 使用 GDB 命令 gdb qtest 重現錯誤
  • 使用 Ctrl+C 中斷執行,查看無限迴圈的行數
  • 使用 where 查看 call stack
  • 使用 s 逐行檢查卡住的部份
    • 發現卡在 qtest.c 的第 851 ~ 853 行
  • 使用 p cur, p currnet->q, p currnet->q->next 檢查記憶體位址
Reading symbols from qtest...
(gdb) run
Starting program: /home/g111056119/Documents/lab0-c_2023/qtest 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
cmd> new
l = []
cmd> new
l = []
cmd> free
^C
Program received signal SIGINT, Interrupt.
is_circular () at qtest.c:853
853	        cur = cur->next;
(gdb) where
#0  is_circular () at qtest.c:853
#1  q_show (vlevel=vlevel@entry=3) at qtest.c:877
#2  0x0000555555558ab5 in do_free (argc=<optimized out>, argv=<optimized out>) at qtest.c:123
#3  0x0000555555559dfb in interpret_cmda (argc=argc@entry=1, argv=argv@entry=0x555555569070)
    at console.c:181
#4  0x000055555555a3b0 in interpret_cmd (cmdline=cmdline@entry=0x555555568ae0 "free")
    at console.c:201
#5  0x000055555555b01a in run_console (infile_name=infile_name@entry=0x0) at console.c:691
#6  0x00005555555591ed in main (argc=1, argv=<optimized out>) at qtest.c:1223

(gdb) p current->q
$18 = (struct list_head *) 0x5555555654b8 <current>
(gdb) p current->q->next
$19 = (struct list_head *) 0x555555568f78
(gdb) p current->q->next->next
$20 = (struct list_head *) 0x5555555654c0 <chain>
(gdb) p current->q->next->next->next
$21 = (struct list_head *) 0x555555568f78
(gdb) p current->q->next->next->next->next
$22 = (struct list_head *) 0x5555555654c0 <chain>

static bool is_circular() { struct list_head *cur = current->q->next; while (cur != current->q) { if (!cur) return false; cur = cur->next; } // ... }

透過比較 do_freedo_prev 函式內容,發現多打 & 符號

  • do_free 函式
if (chain.size > 1) {
        qnext = ((uintptr_t) &current->chain.next == (uintptr_t) &chain.head)
                    ? chain.head.next
                    : current->chain.next;
    }
  • do_prev 函式
if (chain.size > 1) {
        next = ((uintptr_t) chain.head.prev == (uintptr_t) &current->chain)
                   ? chain.head.next
                   : current->chain.next;
        current = next ? list_entry(next, queue_contex_t, chain) : NULL;
    }

因此在 do_free 函式中移除 & 即可

if (chain.size > 1) {
-        qnext = ((uintptr_t) &current->chain.next == (uintptr_t) &chain.head)
+        qnext = ((uintptr_t) current->chain.next == (uintptr_t) &chain.head)
                    ? chain.head.next
                    : current->chain.next;
    }

後來發現 github/lab0-c 已經修正該錯誤,因此只要 git 操作即可

  • git remote add lab0-c https://github.com/sysprog21/lab0-c.git
  • gir remote update
  • git merge lab0-c/master

錯誤二

使用指令測試單獨的檔案

valgrind ./qtest -f ./traces/trace-01-ops.cmd

valgrind 會回報以下問題:

# Test of insert_head and remove_head
l = []
l = [gerbil]
l = [bear gerbil]
l = [dolphin bear gerbil]
Removed dolphin from queue
l = [bear gerbil]
Removed bear from queue
l = [gerbil]
Removed gerbil from queue
l = []
==8952== 32 bytes in 1 blocks are still reachable in loss record 1 of 2
==8952==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==8952==    by 0x10CBCE: do_new (qtest.c:145)
==8952==    by 0x10DDFA: interpret_cmda (console.c:181)
==8952==    by 0x10E3AF: interpret_cmd (console.c:201)
==8952==    by 0x10E7B0: cmd_select (console.c:610)
==8952==    by 0x10F09C: run_console (console.c:705)
==8952==    by 0x10D1EC: main (qtest.c:1223)
==8952== 
==8952== 56 bytes in 1 blocks are still reachable in loss record 2 of 2
==8952==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==8952==    by 0x10F110: test_malloc (harness.c:133)
==8952==    by 0x10F515: q_new (queue.c:17)
==8952==    by 0x10CC07: do_new (qtest.c:149)
==8952==    by 0x10DDFA: interpret_cmda (console.c:181)
==8952==    by 0x10E3AF: interpret_cmd (console.c:201)
==8952==    by 0x10E7B0: cmd_select (console.c:610)
==8952==    by 0x10F09C: run_console (console.c:705)
==8952==    by 0x10D1EC: main (qtest.c:1223)
==8952== 

後來發現 git commit c7d31497c5845bc0af308bf4601e7636c18f0153 修正此問題


command 的翻譯是「命令」,instruction 的翻譯是「指令」

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

在 qtest 提供新的命令 shuffle

參考其他巨集的實作,新增一個 shuffle 命令

ADD_COMMAND(new, "Create new queue", "");
ADD_COMMAND(free, "Delete queue", "");
+ ADD_COMMAND(shuffle, "Shuffle the current queue.", "");

由於在 git 設定中,不能變更 queue.hlist.h,因此實作只能在 q_test.c 進行

static bool do_shuffle(int argc, char *argv[])
{
    if (argc != 1) {
        report(1, "%s takes no arguments", argv[0]);
        return false;
    }

    if (!current) {
        report(3, "Warning: Try to operate null queue");
        return false;
    }
    struct list_head *q = current->q;
    if (!q)
        return false;
    if (list_empty(q) || list_is_singular(q))
        return true;

    const int n = q_size(q);
    for (int i = n; i > 1; i--) {
        int r = rand() % i;
        struct list_head *tmp = q->next;
        while (r--)
            tmp = tmp->next;
        list_move_tail(tmp, q);
    }
    return q_show(3);
}

研讀論文 〈Dude, is my code constant time?

台北大學歐士田教授的假設檢定的介紹,閱讀之後對於論文的理解非常有幫助,使用了基本的統計學在驗證

H0

  • 論文摘要中,闡述一種資訊安全的攻擊方式:Timing attacks,一種 side-channel attacks (旁路攻擊),例如函式驗證敏感資訊時,若是錯誤的資訊就提早回傳結果,導致攻擊者可以利用處理時間,來破解驗證機制。
    • 論文在不知道硬體實作的情況下,透過統計方式檢驗是否程式碼為固定時間執行 (constant time)
  • 論文介紹中,主要闡述其貢獻:TIMING LEAKAGE DETECTION
    • 使用兩組資料測試,利用統計的方式去比較這兩者的執行時間
      • 第一組為固定的常態分佈的測試資料(固定的資料有固定的結果)
      • 第二組為隨機的資料
    • 由於沒學過統計,這裡我理解為使用兩個統計結果,使用其標準差跟平均數去計算兩者統計結果差異,利用 Welcch's t-test 跟 two-tails test 計算出某個數值,以驗證或推翻虛無假說。

修正 Probably not constant time or wrong implementation

使用測試案例 trace-17-complexity.cmd 會獲得 Probably not constant time or wrong implementation,觀察其測試內容

# Test if time complexity of q_insert_tail, q_insert_head, q_remove_tail, and q_remove_head is constant
option simulation 1
it
ih
rh
rt
option simulation 0

根據作業說明提示:亂數+論文閱讀

計算 approximate of df(degree of freedom)公式,參考 Welch (Unpooled Variance) t Tests and Confidence Intervals: Introduction 提及的公式:

  • S1
    S2
    表示變異數
  • n
    表示樣本數量
    df=(S12n1+S22n2)21n11(S12n1)2+1n21(S22n2)2

參考 RinHizakura 的筆記中的描述,在 fixture.creport函式中

static bool report(void)
{
    double max_t = fabs(t_compute(t));
    double number_traces_max_t = t->n[0] + t->n[1];
    double max_tau = max_t / sqrt(number_traces_max_t);
//     ...
//     ...
    /* Definitely not constant time */
    if (max_t > t_threshold_bananas)
        return false;

    /* Probably not constant time. */
    if (max_t > t_threshold_moderate)
        return false;

    /* For the moment, maybe constant time. */
    return true;
}
  • max_t 表示 t value
  • 利用 t value 查看在 df 產生的分佈中位置
  • 計算區域下的積分面積,此為 p value
  • 如果 p value 超過小於某個指定數值
    α
    ,即表示可以推翻
    H0
    並支持
    H1
    ,在本作業中
    • H0
      表示 Probably Not Constant Time 的虛無假設
    • H1
      表示對立假設

新增以下程式碼:

double s1 = t->m2[0] / (t->n[0] - 1);
s1 = s1 * s1;
double s2 = t->m2[1] / (t->n[1] - 1);
s2 = s2 * s2;

double num = s1 / t->n[0] + s2 / t->n[1];
num = num * num;
double d1 = s1 / t->n[0];
d1 = (d1 * d1) / (t->n[0] - 1);
double d2 = s2 / t->n[1];
d2 = (d2 * d2) / (t->n[1] - 1);
double den = d1 + d2;
double df = num / den;

附註:

  • 目前的程式碼計算僅僅算到 df。尚未有想法去計算積分下的數值

研讀 Linux 核心原始程式碼的 lib/list_sort.c 並嘗試引入到 lab0-c 專案