Try   HackMD

2025q1 Homework1 (lab0)

contributed by < Cheng5840 >

作業書寫規範:

  • 無論標題和內文中,中文和英文字元之間要有空白字元 (對排版和文字搜尋有利)
  • 文字訊息 (尤其是程式執行結果) 請避免用圖片來表示,否則不好搜尋和分類
  • 共筆書寫請考慮到日後協作,避免過多的個人色彩,用詞儘量中性
  • 不要在筆記內加入 [TOC] : 筆記左上方已有 Table of Contents (TOC) 功能,不需要畫蛇添足
  • 不要變更預設的 CSS 也不要加入任何佈景主題: 這是「開發紀錄」,用於評分和接受同儕的檢閱
  • 在筆記中貼入程式碼時,避免非必要的行號,也就是該手動將 c=cpp= 變更為 ccpp。行號只在後續討論明確需要行號時,才要出現,否則維持精簡的展現。可留意「你所不知道的 C 語言: linked list 和非連續記憶體」裡頭程式碼展現的方式
  • HackMD 不是讓你張貼完整程式碼的地方,GitHub 才是!因此你在開發紀錄只該列出關鍵程式碼 (善用 diff 標示),可附上對應 GitHub commit 的超連結,列出程式碼是為了「檢討」和「便於他人參與討論」
  • 留意科技詞彙的使用,請參見「資訊科技詞彙翻譯」及「詞彙對照表
  • 不要濫用 :::info, :::success, :::warning 等標示,儘量用清晰的文字書寫。:::danger 則僅限授課教師作為批注使用
  • 避免過多的中英文混用,已有明確翻譯詞彙者,例如「鏈結串列」(linked list) 和「佇列」(queue),就使用該中文詞彙,英文則留給變數名稱、人名,或者缺乏通用翻譯詞彙的場景
  • 在中文敘述中,使用全形標點符號,例如該用「,」,而非 ","。注意書名號的使用,即 ,非「小於」和「大於」符號
  • 避免使用不必要的 emoji 字元

開發環境

$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0

$ lscpu
Architecture:                         x86_64
CPU op-mode(s):                       32-bit, 64-bit
Byte Order:                           Little Endian
Address sizes:                        39 bits physical, 48 bits virtual
CPU(s):                               16
On-line CPU(s) list:                  0-15
Thread(s) per core:                   2
Core(s) per socket:                   8
Socket(s):                            1
Vendor ID:                            GenuineIntel
CPU family:                           6
Model:                                186
Model name:                           13th Gen Intel(R) Core(TM) i7-13620H
Stepping:                             2
CPU MHz:                              2918.398
BogoMIPS:                             5836.79
Virtualization:                       VT-x
Hypervisor vendor:                    Microsoft
Virtualization type:                  full
L1d cache:                            384 KiB
L1i cache:                            256 KiB
L2 cache:                             10 MiB
L3 cache:                             24 MiB

linked-list illustration

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

q_remove_head()

element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
    element_t *element_removed = list_entry(head->next, element_t, list);
    sp = strndup(element_removed->value, bufsize-1);
    sp[bufsize-1] = '\0';
    list_del(head->next);
    return element_removed;
}
  1. in queue.h, it says that q_reomve_head is used to remove the element from head of queue, and copy the removed string to sp up to maximum size of bufsize-1. My question is why is the copy of string needed, and why is the maximum size bufsize? If the functoin return the pointer to the element, why can't i just get the string from this pointer.

  2. why is list_del named list_del, cause in list.h, it says that list_del is used to "remove" a node instead of deleting the node.

  3. I get ERROR: Failed to store removed value in this code, but I haven't figured out why.

Correct version:

{
    if (!head || list_empty(head))
        return NULL;
    struct list_head *node = head->next;
    element_t *elem = list_entry(node, element_t, list);

    if (sp && bufsize > 0) {
        strncpy(sp, elem->value, bufsize - 1);
        sp[bufsize - 1] = '\0';
    }

    list_del(node);
    return elem;
}

q_reverse()

/* Reverse elements in queue */
void q_reverse(struct list_head *head)
{
    struct list_head *node;
    struct list_head *prev = head;
    list_for_each (node, head) {
        prev->next = prev->prev;
        prev->prev = node;
        prev = node;
    }
    prev->next = prev->prev;
    prev->prev = head;
}

I have no idea why this won't work, need to figure it out later !!
It seems like we can't modified nodes in list_for_each
-> Yes, we can't modify nodes in list_for_each. But we can use list_for_each_safe.

bool q_insert_head(struct list_head *head, char *s) { element_t *new_element_t = malloc(sizeof(element_t)); new_element_t->value = strdup(s); // need to free it when delete head list_add(&(new_element_t->list), head); return true; }
l = [1 2 3]
cmd> rh 1
ERROR: Attempted to free unallocated block.  Address = 0x5558e8098370
ERROR: Attempted to free unallocated or corrupted block.  Address = 0x5558e8098370
ERROR: Corruption detected in block with address 0x5558e8098370 when attempting to free it
free(): double free detected in tcache 2
Aborted (core dumped)

I got this error, so i check q_remove_head and q_insert_head, after a few survey, it's about :

Difference between strcpy(), strdup(), strndup()

The main difference between strcpy(), strdup(), and strndup() lies in memory allocation and how they handle copying strings.

strcpy(), defined in <cstring>, copies a null-terminated string from src to dest, including the null terminator. However, it does not perform any bounds checking, which can lead to undefined behavior if dest is not large enough or if src and dest overlap. Since it does not allocate memory dynamically, dest must be an already allocated buffer with sufficient space.

On the other hand, strdup(), defined in <string.h>, dynamically allocates memory for a duplicate of str1 and returns a pointer to the newly allocated string. Since the memory is allocated dynamically, it must be freed to prevent memory leaks. If memory allocation fails, strdup() returns a null pointer. This function is useful when a string copy is needed but the destination buffer size is unknown in advance.

Similarly, strndup() also allocates memory dynamically but differs in that it copies at most size bytes from src, ensuring that the result is null-terminated. If size is smaller than the length of src, only a portion of the string is copied. If size is larger but src lacks a null terminator within the first size bytes, one is appended. Like strdup(), the memory allocated by strndup() must also be freed to avoid memory leaks.

q_delete_mid()

    list_del(slow);
    element_t *ele_deleted = list_entry(slow, element_t, list);
    q_release_element(ele_deleted);

slow is the pointer point to the middle node of the queue, we need to list_del(slow) before q_release_element(ele_deleted), otherwise, it will cause Segmentation fault.

if q_release_element() is written before list_del():
q_release_element(ele_deleted) will release the memory of element_t that is pointed by ele_deleted, but slow is still licked in queue, which means slow->prev and slow->next may still pointed to the released memory。

Besides the fast slow pointer method, we can also use the property of doubly linked list. Which means we can use two pointer, one moves forward, while the other one moves backward. When they meat, we arrive the mid node we want!

q_delete_dup()

my code is messy, i need to use more api, but i haven't know how

q_descend()

The way I implement this function is that maintaing a *minstr (minimum string), and check the queue backwards.
For every node, check if its string is less than minstr. If so, update the minstr; otherwise, del the node.

The directino of this implementation is backwards, so I try to implement q_descend which moves forwards.

So I try to use recursive method to solve the problem on leetcode 2487.
This is the code:

struct ListNode* removeNodes(struct ListNode* head) {
    if (!head->next)
        return head;
    struct ListNode* next = removeNodes(head->next);
    if (head->val >= next->val) {
        head->next = next;
        return head;
    }   
    else
        return next;
}

Though the time complexity of these two method are both O(n), but the performance seems worse in recursive method.

q_sort()

void q_sort(struct list_head *head, bool descend) 
{
    /*split the queue to two halves.
     * q_sort both sub_queue
     * merge two sub_queue
    */ 
    struct list_head *merged = merge_two_lists(&lefthead, head, descend);
    head = merged;
}

This is what i wrote at first, and the problem here is that head = merged; won't change the original head, but only change the local variable.
Instead writing head = merged;, we should write INIT_LIST_HEAD(head) to reinitialize head and use list_splice_tail(merged, head) to update head correctly.

  • I’m considering whether q_merge can be used to implement q_sort. One possible approach is to initially treat each node as an independent queue and encapsulate it within a queue_context_t structure. Then, we can iteratively apply q_merge to combine these individual queues, gradually sorting the list in the process.

So far, i get 95/100 on github make test, but get 100/100 on local pc ??
image

Check what is TSC??

論文閱讀 <Dude, is my code constant time?>

名詞解釋:

  • timing channel:
  • leakage:
  • side channel attack:
  • timing attack:
    在本論文中,作者提出了一種基於統計分析的方法來檢測程式碼是否存在時間變異性(Timing Variability)。
    這種方法不依賴靜態分析(Static Analysis),而是透過測量執行時間並進行統計檢定來判斷程式是否符合常數時間執行。

APPROACH

Step 1: Measure Execution Time

測試密碼學函式在固定輸入 vs 隨機輸入兩種情境下的執行時間,使用 CPU 週期計數器 來精確測量。

Step 2: Apply Post-processing

執行統計分析之前,可以先對測量數據進行後處理。 一般而言,執行時間的分佈通常會向較大的執行時間偏移,大多數測試執行時間較短,但少部分測試可能會有明顯較長的執行時間。而其中可能的原因像是: 測量誤差、OS 排程中斷、其他背景程序的干擾等等。
因此我們可能需要裁減(Cropping)測量數據,如 丟棄執行時間較長的測量數據、
設定一個固定的閾值(threshold),超過該值的測量結果將被忽略(該閾值不應與輸入數據類別相關)

Apply Statistical Test

使用 Welch’s t-test 或 Kolmogorov-Smirnov 檢定 來判斷執行時間是否與輸入數據相關。

Welch’s t-test
此方法主要測試兩組數據的均值是否相同,若 t 檢定拒絕零假設,則代表程式的執行時間存在統計上顯著的差異

CDF (Cumulative Distribution Function)

memcmp() :
int memcmp( const void* lhs, const void* rhs, std::size_t count );

  • Parameters
    • lhs, rhs - pointers to the memory buffers to compare
    • count - number of bytes to examine
  • Return value
    • Negative value if the first differing byte (reinterpreted as unsigned char) in lhs is less than the corresponding byte in rhs.
    • 0​ if all count bytes of lhs and rhs are equal.
    • Positive value if the first differing byte in lhs is greater than the corresponding byte in rhs.

新增 q_shuffle()

參考Fisher–Yates shuffle教材
於 queue.c 內新增 q_shuffle 函式

+void q_shuffle(struct list_head *head)

為了能夠在 qtest 中測試 shuffle 命令,必須在 queue.h 中宣告 q_shuffle ,但因為在lab0中 queue.h 不允許修改, 所以改至 qtest.c 中宣告。

+void q_shuffle(struct list_head *head);
static bool do_shuffle(int argc, char *argv[])
{
    if (argc != 1) {
        report(1, "%s takes no arguments", argv[0]);
        return false;
    }

    if (!current || !current->q) {
        report(3, "Warning: Calling shuffle on null queue");
        return false;
    }
    error_check();

    set_noallocate_mode(true);
    if (exception_setup(true))
        q_shuffle(current->q);
    exception_cancel();
    set_noallocate_mode(false);

    q_show(3);
    return !error_check();

}

需要再理解程式細節

在 qtest.c 內新增

+static bool do_shuffle(int argc, char *argv[])

其內部使用了許多 API 但內部細節還不了解

exception_setup()這個函式提供了一種 異常處理機制,確保當發生錯誤時,不會導致程式崩潰,而是能夠安全地回到 setjmp 設定的點。

使用作業說明的測試程式,以下為 shuffle 測試結果
image

simulation

在 qtest.c 中觀察 queue_insert 可發現,若 simulation = 1 則會執行此段程式

if (simulation) {
    if (argc != 1) {
        report(1, "%s does not need arguments in simulation mode", argv[0]);
        return false;
    }
    bool ok =
        pos == POS_TAIL ? is_insert_tail_const() : is_insert_head_const();
    if (!ok) {
        report(1,
               "ERROR: Probably not constant time or wrong implementation");
        return false;
    }
    report(1, "Probably constant time");
    return ok;
}

在 simulation 模式下,不允許額外的參數,檢查完沒有額外的參數後便會檢查插入操作是否是 constant time
在 fixure.h 中

#define _(x) bool is_##x##_const(void);
DUT_FUNCS
#undef _
  1. 定義 _ 為一個巨集函式
  2. ## 會將 x 連接到 is__const(void),產生新的函式宣告
    (e.g. _(insert_head) → bool is_insert_head_const(void);)
  3. 清除 _ 的定義,避免 _ 影響後續的程式碼

上述函式在 fixture.c 中實作

#define DUT_FUNC_IMPL(op) \
    bool is_##op##_const(void) { return test_const(#op, DUT(op)); }

static bool test_const(char *text, int mode)
{
    bool result = false;
    t = malloc(sizeof(t_context_t));

    for (int cnt = 0; cnt < TEST_TRIES; ++cnt) {
        printf("Testing %s...(%d/%d)\n\n", text, cnt, TEST_TRIES);
        init_once();
        for (int i = 0; i < ENOUGH_MEASURE / (N_MEASURES - DROP_SIZE * 2) + 1;
             ++i)
            result = doit(mode);
        printf("\033[A\033[2K\033[A\033[2K");
        if (result)
            break;
    }
    free(t);
    return result;
}
static bool doit(int mode)
{
    int64_t *before_ticks = calloc(N_MEASURES + 1, sizeof(int64_t));
    int64_t *after_ticks = calloc(N_MEASURES + 1, sizeof(int64_t));
    int64_t *exec_times = calloc(N_MEASURES, sizeof(int64_t));
    uint8_t *classes = calloc(N_MEASURES, sizeof(uint8_t));
    uint8_t *input_data = calloc(N_MEASURES * CHUNK_SIZE, sizeof(uint8_t));

    if (!before_ticks || !after_ticks || !exec_times || !classes ||
        !input_data) {
        die();
    }

    prepare_inputs(input_data, classes);

    bool ret = measure(before_ticks, after_ticks, input_data, mode);
    differentiate(exec_times, before_ticks, after_ticks);
    update_statistics(exec_times, classes);
    ret &= report();

    free(before_ticks);
    free(after_ticks);
    free(exec_times);
    free(classes);
    free(input_data);

    return ret;
}

doit 即為實際測試 constant time 的函式,在閱讀 dudect 後,發現update_statistics 的 for loop 是從 i=10 開始,目的是要 discard the first few measurements ,因此我也將 lab0-c 內的函式改成

+static void update_statistics(const int64_t *exec_times, uint8_t *classes) +{ + for (size_t i = 10; i < N_MEASURES; i++) {

另外 dudect_main 當中再呼叫 update_statistics 之前會先呼叫 prepare_percentile 函式,目的是要為每次 cropping measurements 設置不同的 threshold

至此 make test 即可達到 100/100

若要查看 dudece 更詳細說明 可再點擊連結查看

queue 的操作是怎麼運作的

qtest.c 的 main 首先確認當前目錄是否為有效的 Git 專案(例如:.git 目錄、必要的 Git hooks 是否安裝正確,以及 Commit 記錄是否符合要求)。
再來 while 迴圈會先透過 getopt() 來處理指令列引數

-h: 顯示使用說明(usage(argv[0]))。
-f <檔名>: 指定從檔案讀取指令(infile_name = buf;)。
-v <整數>: 設定詳細輸出(verbosity)層級,例如 -v 3 代表輸出更多除錯訊息。
-l <檔名>: 記錄執行結果到指定的記錄檔(log file)。

例:

$ ./qtest -f traces/trace-14-perf.cmd
/* A better seed can be obtained by combining getpid() and its parent ID
     * with the Unix time.
     */
    srand(os_random(getpid() ^ getppid()));

呼叫 os_random() 取得一個相對隨機的種子,再透過 srand() 進行標準 C 的亂數初始化。
這樣可以使後續需要亂數的操作(例如插入隨機字串)更具隨機性。

q_init();

設定 fail_count = 0;
使用 INIT_LIST_HEAD(&chain.head); 初始化一個名為 chain 的全域鏈結串列,用來儲存多個佇列(queue_contex_t)。
安裝訊號處理器:對 SIGSEGV(segmentation fault)與 SIGALRM(超時)做特殊處理,以利在測試過程中捕捉危險錯誤或無限迴圈。

init_cmd();
console_init();

init_cmd() 和 console_init() 皆是為了設定互動式控制台的指令表。
console_init() 內部透過 ADD_COMMAND() 綁定各種字串指令(如 ih, it, rh, rt)對應的實際函式(如 do_ih, do_it)。
每一個綁定的指令,都是使用者在互動式終端輸入時,實際會呼叫到的操作函式。

if (!infile_name) {
    line_set_completion_callback(completion);
    line_history_set_max_len(HISTORY_LEN);
    line_history_load(HISTORY_FILE);
}

若使用者沒有提供 -f 參數(即 infile_name == NULL),表示要在終端機直接互動輸入指令。
這裡做了幾件事:

  • line_set_completion_callback(completion): 設定自動完成功能(例如按 Tab 自動補齊指令)。
  • line_history_set_max_len(HISTORY_LEN): 設定最多可記憶多少筆歷史指令。
  • line_history_load(HISTORY_FILE): 從既定的歷史檔(預設為 .history)載入以往輸入過的指令。
add_quit_helper(q_quit);

當使用者在互動式介面或檔案中輸入 quit 命令時,最終會呼叫 q_quit() 來釋放所有佇列的記憶體,並檢查是否有記憶體洩漏。

bool ok = true;
ok = ok && run_console(infile_name);

這是整個互動測試的核心執行迴圈:
如果使用者有指定 -f file,則會自動讀取該檔案的每一行命令並執行。
若沒有指定檔案,則進入互動式模式,等待使用者在終端鍵入指令,如 ih apple、rt、merge、quit 等等

第二周課堂問答

關於 swap 與 reversek 有甚麼相似處 ?
swap 是將整條 linked list 每兩個一組做順序交換,若 linked list 節點數為奇數個,則最後一個節點不動。
reverseK 是將整條 linked list 每 K 個一組,組內順序反轉,所以可以把 swap 當作是 reverseK k=2 的情況。

void q_swap(struct list_head *head)
{
    if (list_empty(head) || list_is_singular(head))
        return;

    struct list_head *cur = head->next;
    struct list_head *nextpair;
    while (cur != head && cur->next != head) {
        nextpair = cur->next->next;

        cur->prev->next = cur->next;
        cur->next->prev = cur->prev;

        cur->next->next = cur;
        cur->prev = cur->next;

        cur->next = nextpair;
        nextpair->prev = cur;

        cur = nextpair;
    }
}
q_reverseK(struct list_head *head, int k)
{
    if (!head || head->next->next == head || k == 1) return;

    struct list_head* prev= head;
    struct list_head* cur = head->next;

    int count = 0;
    //can replace by q_size()
    while (cur!=head){
        count ++;
        cur = cur->next;
    }

    while (count >= k) {
        cur = prev->next;
        struct list_head* next = cur->next;

        // reverse k nodes
        // move last node to first each round
        for (int i=0; i<k-1; i++){
            cur->next = next->next;
            next->next->prev = cur;

            next->next = prev->next;
            prev->next->prev = next;

            prev->next = next;
            next->prev = prev;
            next = cur->next;
        }

        prev = cur;
        count -= k;
    }

}

while loop 會將 linked list 分成 k 個一組,for loop 每一輪會將該組最後一個節點移至最前面,重複 k-1 輪,如此一來便能達到組內 reverese 的效果。

Git relevant

$ git add -p filename
Git 會顯示該檔案的部分修改,並詢問你是否要加入此變更
Stage this hunk [y,n,q,a,d,j,J,g,/,s,e,?]?

n→跳過此部分;
s→分割成更小的部分;
e→手動編輯變更;
a→加入這個檔案內所有的變更;
d→跳過這個檔案內所有的變更;
q→結束,不再處理剩下的變更;
?→顯示幫助

$ git restore --staged filename 讓檔案回到 unstaged 狀態
git diff --cached 這會顯示 已 staged(已 git add)但還沒 commit 的變更內容。