2024q1 Homework1 (lab0)

contributed by < LambertWSJ >

實作 Queue 介面

`q_insert_head` / `q_insert_tail`

將節點插入到 queue 的頭尾。可以利用 doubly linked list 的特性簡化成傳入 queue 頭尾的節點，之後再用 list_add 插入新節點。

list_add(node, head): node 插入到 head 後面

static inline bool q_insert_node(struct list_head *head, char *s)
{
    element_t *el = malloc(sizeof(element_t));
    if (!el)
        return false;

    el->value = strdup(s);
    if (!el->value) {
        free(el);
        return false;
    }
    list_add(&el->list, head);

    return true;
}

假設 queue 有 aa, bb, cc 三個節點，要新增的節點為 dd，如下所示

插入開頭 list_add(dd, head)，將 dd 插入到 head 後方，效果相當於 list_add_tail(dd, aa)

由於大部分的操作相同，所以只需要 q_insert_node 來處裡插入頭尾的函式即可。

/* Insert an element at head of queue */
bool q_insert_head(struct list_head *head, char *s) {
    return q_insert_node(head, s);
}

/* Insert an element at tail of queue */
bool q_insert_tail(struct list_head *head, char *s) {
    return q_insert_node(head->prev, s);
}

`q_remove_head` / `q_remove_tail`

從 queue 的頭尾移除節點也能夠簡化成只使用一個函式來處理，利用雙向串列的特性傳入頭跟尾的 list_head 即可，從串列拿掉 node 後，再把 el->value 複製到 sp 上用來比對答案。

static inline element_t *q_remove_node(struct list_head *node,
                                       char *sp,
                                       size_t bufsize) {
    if (list_empty(node))
        return NULL;

    element_t *el = list_entry(node, element_t, list);
    list_del(node);
    if (sp && el->value) {
        size_t len = strlen(el->value) + 1;
        len = len < bufsize ? len : bufsize;
        strncpy(sp, el->value, len);
        sp[len - 1] = '\0';
    }
    return el;
}

element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) {
    return q_remove_node(head->next, sp, bufsize);
}

element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) {
    return q_remove_node(head->prev, sp, bufsize);
}

`q_reverse` / `q_reverseK`

q_reverse

串列反轉的操作較為簡單，只需要將每個節點移動到 head 前面即可達成。這裡要使用 safe 版的 for 迴圈，確保節點在移動後， node 可以取得下一個節點，否則永遠會停留在第一個跟第二個節點，不斷的在循環。

void q_reverse(struct list_head *head) {
    struct list_head *node, *safe;
    list_for_each_safe (node, safe, head) {
        list_move(node, head);
    }
}

如果在 safe 版的 for 迴圈中使用 list_move_tail 會發生什麼事? 那只會不斷的把當前的節點，往後面移動，那麼 node 將永遠都到不了 head，一樣會遇到無窮迴圈。

上述的迴圈可以輕鬆改寫成遞迴，不過反轉的節點太多會導致 stack overflow

static void do_recur_reverse(struct list_head *node, struct list_head *head)
{
    if (node == head)
        return;

    struct list_head *safe = node->next;
    list_move(node, head);
    do_recur_reverse(safe, head);
}

void q_reverse_recur(struct list_head *head)
{
    do_recur_reverse(head->next, head);
}

除了單向的實做，亦可以利用雙向鍊結串列的特性，頭尾兩兩交換，概念類似一樣的技巧用在陣列反轉。

list_for_each_safe 由上到下的判斷式，依序考慮下面兩種 case

節點數量為偶數個時，則必然會遇到 cur 的下一個節點就會是 last，表示其它節點已經兩兩交換完畢，剩下彼此還沒交換位置。
節點數量為奇數個時，則 cur 與 last 必然會重疊，重疊時表示所有節點都已交換完畢，可以結束迴圈。

void q_reverse_bidir(struct list_head *head)
{
    struct list_head *cur, *safe;
    struct list_head *last = head->prev, *last_safe;

    list_for_each_safe (cur, safe, head) {
        last_safe = last->prev;
        if (cur->next == last) {
            list_swap(cur, last);
            break;
        } else if (cur == last)
            break;

        list_swap(cur, last);
        last = last_safe;
    }
}

其中 list_swap 的實做如下，考慮到可能是前後節點交換，所以 e2 和 e1 交換完後，可以檢查 prev 是不是 e1，如果是的話表示 e1 與 e2 是鄰近節點，e2 搬動完後也就完成交換了，不必再做一次 list_move。

static void list_swap(struct list_head *e1, struct list_head *e2)
{
    struct list_head *prev = e2->prev;
    list_move(e2, e1);
    if (prev != e1)
        list_move(e1, prev);
}

console.c
三種不同的實做要拿來測試的話，可以在加入 option 變數來決定使用哪個實做來進行測試

int rev_opt = 0;

void init_cmd()
{
    
    /* ... */
    add_param("rev_opt", &rev_opt,
              "chose reverse implemntation"
              ", 0 for list_move method(default)"
              ", 1 for bi-direction reverse"
              ", 2 for recursive reverse(may stack overflow)",
              NULL);
}

qtest.c

有了 option 後，就能夠透過 rev_opt 來查表選擇實做，這裡使用 typedef 簡化函式指標，用來提昇程式碼的可讀性

typedef void (*queue_op)(struct list_head *head);
queue_op rev_tlb[] = {q_reverse, q_reverse_bidir, q_reverse_recur};
queue_op reverse_op = q_reverse;


static bool do_reverse(int argc, char *argv[])
{
    /* .... */
+    if (rev_opt < 3)
+        reverse_op = rev_tlb[rev_opt];
+    else {
+        printf(
+            "unknown reverse option: %d\n"
+            "rev_opt only support follow option number\n"
+            "0 for list_move method(default)\n"
+            "1 for bi-direction reverse\n"
+            "2 for recursive reverse(may stack overflow)\n",
+            rev_opt);
+        trigger_exception("Please chose option carefully");
+    }


+    if (current && exception_setup(true))
-       q_reverse(current->q);   
+       reverse_op(current->q);
}

q_reverseK

題目大意為每 k 個節點就將串列反轉。

實做時，卡在 reverse 後，怎麼指向下個 head? 去年參考 komark06 的寫法後，才知道指向 safe 的前一個節點當做 head。

直觀的做法就是 anchor 來記錄要反轉的子串列的頭，再用另一個變數去紀錄迴圈是不是走了 k 次，是的話就將走過的這一段串列反轉，反轉的作法是將走過 k 的一段串列切割出來給 q_reverse 反轉，完畢後再放回去。

anchor 接上反轉後的子串列，要將 anchor 更新到 safe 的前一個節點，當作下一輪反轉的子串列的 list_head。

有了不同的反轉實做，就能夠將 reverse_op 帶入到 q_reverseK。

void q_reverseK(struct list_head *head, int k) {
    if (!head || k <= 1) {
        return;
    }

    struct list_head *node, *safe, *anchor;
    LIST_HEAD(rev_head);
    anchor = head;
    int i = 0;

    list_for_each_safe (node, safe, head) {
        i++;
        if (i == k) {
            list_cut_position(&rev_head, anchor, node);
-            q_reverse(&rev_head);
+            reverse_op(&rev_head);
            list_splice_init(&rev_head, anchor);
            anchor = safe->prev;
            i = 0;
        }
    }
}

`q_ascend` / `q_descend`

題目大意為刪掉串列的解點，使串列的值從右手邊觀察是嚴格遞增。單向串列的方法是透過遞迴讓 head 走訪到最後一個節點再比大小，因為題目要嚴格遞增，從最後一個節點開始記錄最大值，如果節點的值比最大值還要小就刪掉，比最大值還大就紀錄節點的內涵值，遞迴結束後就可以得到題目要求的串列。

我們可以利用雙向串列的特點，反過來走訪串列，並重複上述比大小的過程，就不用寫成遞迴了，也避開了 stack overflow 的風險。

由於是從從右手邊觀察，又要刪掉節點，所以用 macro 把迴圈簡化成 list_for_each_safe_reverse

#define list_for_each_safe_reverse(node, safe, head)         \
    for (node = head->prev, safe = node->prev; node != head; \
         node = safe, safe = node->prev)

int q_descend(struct list_head *head) {
    if (list_empty(head))
        return 0;

    char *maxv = "\0";
    int sz = 0;
    struct list_head *cur, *safe;

    list_for_each_safe_reverse(cur, safe, head) {
        sz++;
        element_t *entry = list_entry(cur, element_t, list);
        if (strcmp(entry->value, maxv) > 0) {
            maxv = entry->value;
        } else {
            list_del(cur);
            q_release_element(entry);
            sz--;
        }
    }

    return sz;
}

與 q_descend 解法一樣，差別在於改成從左手邊觀察，所以只要把迴圈換成 list_for_each_safe 即可。

int q_ascend(struct list_head *head) {
    if (list_empty(head))
        return 0;

    char *maxv = "\0";
    int sz = 0;
    struct list_head *cur, *safe;

    list_for_each_safe (cur, safe, head) {
        sz++;
        element_t *entry = list_entry(cur, element_t, list);
        if (strcmp(entry->value, maxv) > 0) {
            maxv = entry->value;
        } else {
            list_del(cur);
            q_release_element(entry);
            sz--;
        }
    }

    return sz;
}

`q_swap`

將串列的節點兩兩交換，使用 for each 走訪串列，將下個節點移動到當前節點的尾，如果下個節點是 head 就停止，head 不是用來交換的而是用來找到串列的本體。

void q_swap(struct list_head *head) {
    struct list_head *node;
    list_for_each (node, head) {
        if (node->next == head)
            break;
        list_move_tail(node->next, node);
    }
}

呼叫 list_move(node->next, node)，node2 跟 node1 交換

下一輪迴圈 node 指向 node3，依此類推將 node3 跟 node4 交換，直到 node 的下個節點是 head 就停下。

實做 q_reverse 的過程中，也順手實做了 list_swap，可以用來替代 list_move_tail，暫時沒想到好的函式名稱，先命名為 q_swap2

void q_swap2(struct list_head *head)
{
    struct list_head *node;
    list_for_each (node, head) {
        if (node->next == head)
            break;
        list_swap(node->next, node);
    }
}

`q_delete_mid`

刪除 queue 中間的節點，可以使用快慢指標的技巧來尋找中間的節點再刪除即可。

bool q_delete_mid(struct list_head *head) {
    struct list_head *fast = head->next, *node;
    list_for_each (node, head) {
        if (fast->next == head || fast->next->next == head)
            break;
        fast = fast->next->next;
    }
    list_del(node);
    q_release_element(list_entry(node, element_t, list));
    return true;
}

`q_delete_dup`

從單向串列的解法改寫過來，如果遇到是重複的就刪除掉節點，並將 dup 設為 true，如果當前節點是連續節點的最後一個，就用 dup 來確認是不是最後一個連續節點，是的話就刪除，由於刪除會拿掉節點，所以要用 safe 版的 for each 迴圈。

bool q_delete_mid(struct list_head *head)
{
    struct list_head *fast = head->next, *node;
    list_for_each (node, head) {
        if (fast->next == head || fast->next->next == head)
            break;
        fast = fast->next->next;
    }
    list_del(node);
    q_release_element(list_entry(node, element_t, list));
    return true;
}

`q_sort`

按照註解描述會有遞增與遞減排序的 list，因此要加入對應的 compare function 到排序演算法。

排序的實做選擇合併排序，除了實做上較容易外，合併排序屬於穩定排序，在最差的情況下的時間複雜度扔有

O (l o g N)

的表現。

參考 2021 的測驗題修改的遞迴版 merge sort。


















void q_sort(struct list_head *head, bool descend)
{
    if (list_empty(head) || list_is_singular(head))
        return;
    struct list_head *fast = head->next, *slow;
    list_for_each (slow, head) {
        if (fast->next == head || fast->next->next == head)
            break;
        fast = fast->next->next;
    }
    LIST_HEAD(sorted);
    LIST_HEAD(left);
    list_cut_position(&left, head, slow);
    q_sort(&left, descend);
    q_sort(head, descend);
    merge_two_lists(&left, head, &sorted, descend);
    list_splice_tail(&sorted, head);
}

5 ~ 10 行：divide 階段，將串列分割成左右兩半。用快慢指標的技巧找出找出左半邊串列的結尾 slow。
11 ~ 13 行：將 head 到 slow 的串列用 list_cut_position 切來並用 left 作為開頭的 list_head，此時 head 為右半邊的串列開頭。

list_cut_position(&left, head, slow) 分割完後如下圖所示

14 ~ 18：呼叫 q_sort 將串列分割成 sorted lists 後，用 merge_tow_lists 把兩個串列合併到 sorted 上。

`merge_two_lists`

merge_two_lists 則參考過去使用間接指標的實做，改成 list API 的版本。只要 l1 和 l2 還有節點，就用 compare function 決定下個要合併的節點移動到 sorted 後面，當其中一個串列合併完了就把令一個串列拼接在 sorted 後面。

依據 descend 判斷遞增與遞減，可以建表來紀錄兩者對應的比較函式，用以取代三元表達式(?:)的分支判斷。

static int (*sort_cmp[])(struct list_head *l1, struct list_head *l2) = {
    [true] = dec_cmp,
    [false] = asc_cmp,
};

static void merge_two_lists(struct list_head *l1,
                            struct list_head *l2,
                            struct list_head *sorted,
                            bool descend)
{
    struct list_head *next, **node;
    int (*cmp)(struct list_head *l1, struct list_head *l2) = sort_cmp[descend];

    for (node = NULL; !list_empty(l1) && !list_empty(l2); *node = next) {
        node = cmp(l1, l2) ? &l1->next : &l2->next;
        next = (*node)->next;
        list_move_tail(*node, sorted);
    }
    list_splice_tail_init(list_empty(l1) ? l2 : l1, sorted);
}

為什麼最後要使用 list_splic_tail_init 而不是 list_splic_tail ？

list_splice_tail(list, head) 的註解表示 list 開頭的 head 會被加入到 head 這個串列，所以後面要使用到 list 的話要初始化它的 head。

All nodes from @list are added to to the end of the list of @head.
It is similar to list_add_tail but for multiple nodes. The @list head is not
modified and has to be initialized to be used as a valid list head/node
again.

假設最後要和 sorted 合併的串列為 l2，而 l2 在 q_sort 的傳入的開頭為 head
qsort

\to

merge_two_lists(&left, head, &sorted, descend)

呼叫 list_splice_tail(head, sorted)，head 並沒有被拿掉，而是留在 sorted 串列裡面。
這導致之後呼叫 merge_two_lists 的時候，會陷入無窮迴圈，迴圈判斷式 list_empty(head) 永遠為 false，因為 head->next 永遠指不到自己。

比較函式使用 strcmp 來比較字串，依據手冊的 strcmp 提到回傳值的三種情況

• 0, if the s1 and s2 are equal;
• a negative value if s1 is less than s2;
• a positive value if s1 is greater than s2.

若 descend 為 true，字串要由大到小排序，也就是 s2 會優先加入到 sorted 串列裡面，因此 l2_val 比 l1_val 大時要回傳 1，反之則回傳 0，ascend 也依此類推。

static int dec_cmp(struct list_head *l1, struct list_head *l2)
{
    char *l1_val = list_first_entry(l1, element_t, list)->value;
    char *l2_val = list_first_entry(l2, element_t, list)->value;
    return strcmp(l1_val, l2_val) <= 0 ? 0 : 1;
}

static int asc_cmp(struct list_head *l1, struct list_head *l2)
{
    char *l1_val = list_first_entry(l1, element_t, list)->value;
    char *l2_val = list_first_entry(l2, element_t, list)->value;
    
    return strcmp(l1_val, l2_val) > 0 ? 0 : 1;
}

觀察上述比較函式的實，最後扔然要做分支判斷，該如何消除？
判斷式以 0 為基準，且 strcmp 為回傳有號數，可以利用 sign bit 即 MSB 來判斷數值是否為負數，在使用 not operator(!)轉成 0 和 1，以減少 merge sort 的分支數量。

+#define MSB(val) ((val) & 0x80000000)

static int dec_cmp(struct list_head *l1, struct list_head *l2)
{
    char *l1_val = list_first_entry(l1, element_t, list)->value;
    char *l2_val = list_first_entry(l2, element_t, list)->value;
-   return strcmp(l1_val, l2_val) <= 0 ? 0 : 1;
+   return !MSB(strcmp(l1_val, l2_val));
}

static int asc_cmp(struct list_head *l1, struct list_head *l2)
{
    char *l1_val = list_first_entry(l1, element_t, list)->value;
    char *l2_val = list_first_entry(l2, element_t, list)->value;
-   return strcmp(l1_val, l2_val) > 0 ? 0 : 1;
+   return !!MSB(strcmp(l1_val, l2_val));
}

為了之後引入 kernel 的 list_sort，顯然不能使用只有兩個參數的比較函式，合併時也不能僅用 true 和 false 來決定下個要合併哪個節點，而是要和 0 做比較，應此將 compare 函式修改成如下。

static int asc_cmp(void *priv,
                   const struct list_head *l1,
                   const struct list_head *l2)
{
    char *s1 = list_entry(l1, element_t, list)->value;
    char *s2 = list_entry(l2, element_t, list)->value;
    return strcmp(s1, s2);
}

static int dec_cmp(void *priv,
                   const struct list_head *l1,
                   const struct list_head *l2)
{
    return asc_cmp(priv, l2, l1);
}

手冊 strcmp(3) 提到下方描述

strcmp() returns an integer indicating the result of the
comparison, as follows:
• 0, if the s1 and s2 are equal;
• a negative value if s1 is less than s2;
• a positive value if s1 is greater than s2.

回傳值與兩個數值相減來判斷誰大誰小的概念一樣，所以 strcmp(s1, s2) 回傳 -5，那麼 s1 與 s2 調過來 strcmp(s2, s1) 就會回傳 5，可依此特性簡化升序與降序排列，合併函式就會依照正負數對調合併的節點。

合併函式的函式指標型態改成和 kernel 的 list_sort 的 cmp 一樣，不過傳入到比較函式會變成 next，如果拿原本的比較函式，除了結果不正確，用在 list_sort 還會造成 Segementation fault。

static void merge_two_lists(struct list_head *l1,
                            struct list_head *l2,
                            struct list_head *sorted,
                            bool descend)
{
    struct list_head *next, **node;
    list_cmp_func_t cmp = sort_cmp[descend];

    for (node = NULL; !list_empty(l1) && !list_empty(l2); *node = next) {
        node = cmp(NULL, l1->next, l2->next) <= 0 ? &l1->next : &l2->next;
        next = (*node)->next;
        list_move_tail(*node, sorted);
    }

    list_splice_tail_init(list_empty(l1) ? l2 : l1, sorted);
}

`q_merge`

改寫自 yanjiew1 的解法，利用雙向 linked list 的特性，每次迭代的時候，將頭跟尾串接到 sorted 串列上，如果頭跟尾是同一個，表示已經走完串列了，就把頭拼接到 sorted 上(選擇尾巴也可以)，合併完之後在透過 q_sort 合併成一條 sorted list。最後把 sorted 的串列接到 head 上。

int q_merge(struct list_head *head, bool descend)
{
    if (!head)
        return 0;
    LIST_HEAD(sorted);
    queue_contex_t *cur = NULL;
    queue_contex_t *last = list_last_entry(head, queue_contex_t, chain);

    list_for_each_entry (cur, head, chain) {
        if (cur == last) {
            list_splice_init(cur->q, &sorted);
            break;
        }
        list_splice_init(cur->q, &sorted);
        list_splice_init(last->q, &sorted);
        last = list_entry(last->chain.prev, queue_contex_t, chain);
    }
    q_sort(&sorted, descend);
    int size = q_size(&sorted);
    list_splice_init(&sorted, list_first_entry(head, queue_contex_t, chain)->q);
    return size;
}

make test 有時後會 95/100 有時候滿分，要開始讀論文才能把統計執行時間的函式修正確。

+++ TESTING trace trace-17-complexity:
# Test if time complexity of q_insert_tail, q_insert_head, q_remove_tail, and q_remove_head is constant
ERROR: Probably not constant time or wrong implementation
ERROR: Probably not constant time or wrong implementation
Probably constant time
Probably constant time
---     trace-17-complexity     0/5
---     TOTAL           95/100
make: *** [Makefile:60: test] Error 1

引入 list_sort.c

教材已寫清楚程式碼大部分的運作原理，搭配原始碼可以有效率的看懂程式碼，所以這裡改成做實驗還有驗證 paper

簡化程式碼，觀察輸出的指令

待整理

merge
原本的 merge 函式可以透過 indirect pointer 簡化原本 if-else 區塊相同操作的部份，其實就像在做 merge two sorted lists 一樣

__attribute__((nonnull(2, 3, 4))) static struct list_head *
merge(void *priv, list_cmp_func_t cmp, struct list_head *a, struct list_head *b)
{
    /* cppcheck-suppress unassignedVariable */
    struct list_head *head, **tail = &head;
    struct list_head *node;

    for (;;) {
        /* if equal, take 'a' -- important for sort stability */
        struct list_head **indir = cmp(priv, a, b) <= 0 ? &a : &b;
        node = *indir;
        *tail = node;
        tail = &node->next;
        *indir = node->next;
        if (!node->next) {
            break;
        }
    }
    *tail = (struct list_head *) ((uintptr_t) a | (uintptr_t) b);
    return head;
}

merge_final

__attribute__((nonnull(2, 3, 4, 5))) static void merge_final(
    void *priv,
    list_cmp_func_t cmp,
    struct list_head *head,
    struct list_head *a,
    struct list_head *b)
{
    struct list_head *tail = head;
    struct list_head *node;
    u8 count = 0;

    for (;;) {
        /* if equal, take 'a' -- important for sort stability */
        struct list_head **indir = cmp(priv, a, b) <= 0 ? &a : &b;
        node = *indir;
        tail->next = node;
        node->prev = tail;
        tail = node;
        *indir = node->next;
        if (!node->next)
            break;
    }

    node = (struct list_head *) ((uintptr_t) a | (uintptr_t) b);

    /* Finish linking remainder of list b on to tail */
    tail->next = node;
    do {
        /*
         * If the merge is highly unbalanced (e.g. the input is
         * already sorted), this loop may run many iterations.
         * Continue callbacks to the client even though no
         * element comparison is needed, so the client's cmp()
         * routine can invoke cond_resched() periodically.
         */
        if (unlikely(!++count))
            cmp(priv, node, node);
        node->prev = tail;
        tail = node;
        node = node->next;
    } while (node);

    /* And the final links to make a circular doubly-linked list */
    tail->next = head;
    head->prev = tail;
}

使用 GDB 觀察排序過程

lab0 - Linux 核心的鏈結串列排序何時合併的表格整理變數 count 在各階段時 pending 上的節點數量，如果是自己做實驗該如何得知 pending 每個子串列的數量，還有其節點的內含值？

首先想到的就是 codegen 生成 graphviz script，但是這麼做要在 list_sort.c 加入很多程式碼又要額外處裡字串問題，寫完後除了驗證正確性外，還要用 #ifdef 覆蓋額外的程式碼，導致原本乾淨有品味的程式碼被我弄得一團亂。

要解決上述問題，可以使用強大的 GDB 搭配 Python API 在執行時期拿到 pending 串列，並用 Python 完成 codegen、存檔和畫圖，既避免弄髒程式碼又能夠練習 GDB。

靈感來自手写红黑树, 再用gdb画给你看

新增 dump_sort 命令到 GDB

dump_sort 用來做這次實驗的命令，格式如下

dump_sort list [-o png_path] [-type list_type]

list(必填) : 傳入要觀察的串列開頭，預設是當作都傳入 pending 串列，也就是 prev 指向下個子串列開頭，next 指向子串列的下個節點，另一種則是原本的雙向鍊結串列

-type(可忽略): pending 和 origin 兩種類型，如上所述

-o(可忽略): 輸出的檔案名稱，如果有包含資料夾的路徑在裡面，會自動建立資料夾。只會輸出 png 檔，所以無論傳入的副檔名是 jpg 還是 raw 都會被改為 png

完整程式碼請參考 dump_sort.py。

要添加 gdb command 大致流程如下

建立繼承 gdb.Command 的 class
物件初始化時傳給父類別 command 名稱，還有 command 類型，若沒有特別指定命令只能在什麼情況下用 (e.g. TUI)，可以傳入 gdb.COMMAND_USER
實做 gdb.Command 的 invoke 介面。GDB 使用到自定義的命令時會呼叫到這個函式。

This method is called by GDB when this command is invoked.
CLI Commands In Python (Debugging with GDB)

這次 demo 只用到 gdb 函式庫的 parse_and_eval 和 execute，兩者都會讓 GDB 執行傳入的字串，但是前者會回傳結果，後者一般情況下不會有回傳值，除非 to_string=True 才會回傳結果。

考量到程式的可讀性，要取值時 parse_and_eval，其它都用 execute，具體使用方法如下程式碼的第 21 行走訪 pending 串列開始。

先看 26 行，從 pend 拿到子串列 list 的開頭後走訪子串列，節點的內含值使用包裝過 container_of 的 node_val 取得，31 行執行完後回傳的是 element_t 的開頭，裡面的欄位用字典保存，因此要存取資料的話要傳入欄位名稱的字串，例 val['value']，回傳的結果放到 div_list，走訪完子串列，再將 div_list 加入到 sort_lists 以便後後續做 codegen 跟畫圖。

要換到下個子串列的話就要存取 pending->prev，也就是第 34 行的操作，如果沒有子串列就退出，有的話就重複上述過程，就能在 Python 的環境中使用整個 pending 串列。




































import gdb

class DumpDivide(gdb.Command):
    def reset(self):
        self.png_path = None
        self.list_type = "pending"

    def __init__(self) -> None:
        super().__init__("dump_sort", gdb.COMMAND_USER)
        self.reset()

    def invoke(self, args: str, from_tty: bool) -> None:
        self.reset()
        div_list   = []
        sort_lists = []

        args = args.split(" ")
        self.parse_args(args)

        if self.list_type == "pending":
            gdb.execute(f"p $pend = {args[0]}")
            while True:
                if str(gdb.parse_and_eval("node_val($pend)")) == "0x0":
                    break
                else:
                    gdb.execute("p $list = $pend")
                    while True:
                        if str(gdb.parse_and_eval("node_val($list)")) == "0x0":
                            break
                        else:
                            val = gdb.parse_and_eval("node_val($list)")
                            div_list.append(val['value'].string())
                            gdb.execute("p $list = $list->next")
                gdb.execute("p $pend = $pend->prev")
                sort_lists.append(div_list)
                div_list = []

設置斷點

在 list_sort 的分割與合併階段下斷點，用來觀察兩階段執行完的 pending 串列。

分割與合併階段斷點的參考位置如下程式碼 highlight 的位置，使用這兩處的 pending 串列傳入到 dump_sort 可以印出這兩階段的結果。

list_sort.c

__attribute__((nonnull(2, 3))) void list_sort(void *priv,
                                              struct list_head *head,
                                              list_cmp_func_t cmp)
{
    do {
        /* ... */
        /* Move one element from input list to pending */
        list->prev = pending;
        pending = list;
        list = list->next;
        pending->next = NULL;
+        count++; /* divide breakpoint */
    } while (list);
    
    for (;;) {
        struct list_head *next = pending->prev;

        if (!next)
            break;
        list = merge(priv, cmp, pending, list);
+        pending = next; /* merge breakpoint */
    }
    /* The final merge, rebuilding prev links */
    merge_final(priv, cmp, head, pending, list);
}

qtest.c
原本想再 list_sort 的 merge_final 用 dump_sort 印出最終的排序好串列，但是這時候的 head 還沒有完全接好，要等到 merge_final 執行完，否則會存取到 0xdeadbeaf，所以改成執行完 list_sort 的下一行下斷點並傳入 current->q 印出結果。

bool do_sort(int argc, char *argv[])
{
    /* ... */
    m_sort = sort_opt ? q_ksort : q_sort;
    set_noallocate_mode(true);
    if (current && exception_setup(true))
        m_sort(current->q, descend);
+    exception_cancel();
    set_noallocate_mode(false);
    
}

GDB 測試 command

測資：建立 01~16 個節點，打亂後交由 list_sort 排序。篇幅關係不放完整 cmd 上來

lab0/scripts/sort_step.cmd
為了避免每次執行到斷點就要停下來敲命令，這裡使用 gdb 的 command 命令規定執行到斷點時要執行的操作。檔名使用 $cnt 產生流水號後面加上階段，dump_sort 做完後更新流水號並繼續執行下去(c, continue)。

b list_sort.c:237
b list_sort.c:249
b qtest.c:634

p $cnt=1
# divide
command 1
eval "set $png=\"./sort_step/%d_divide.png\"", $cnt
dump_sort pending -o $png
p $cnt += 1
c
end

# merge
command 2
eval "set $png=\"./sort_step/%d_merge.png\"", $cnt
dump_sort pending -o $png
p $cnt += 1
c
end

# final
command 3
eval "set $png=\"./sort_step/%d_list.png\"", $cnt
dump_sort current->q -type origin -o $png
p $cnt += 1
c
end

$ gdb -q -x ./scripts/dump_sort.py --args ./qtest -f ./traces/demo.cmd
(gdb) source ./scripts/sort_step.cmd 
Breakpoint 4 at 0x55555555fda0: file list_sort.c, line 237.
Breakpoint 5 at 0x55555555fdc4: file list_sort.c, line 249.
Breakpoint 6 at 0x555555556e10: file qtest.c, line 634.
$730 = 1
(gdb) r
Starting program: /home/lambert-wu/Repos/lab0-c/qtest -f ./traces/demo.cmd

假設 shuffle 完的串列入下

l = [11 02 12 01 05 07 03 04 14 15 16 06 10 08 13 09]

執行完成後，圖片和 dot 檔會存入到 lab0/sort_step/，暫時還不知道怎麼用 convert 命令把一系列的圖片大小不一 png 檔轉成每個步驟階段的串列置中顯示的 gif 檔，這樣會更生動

$ ls ./sort_step/
10_divide.dot  12_divide.png  15_divide.dot  17_merge.png  1_divide.dot  2_divide.png  5_divide.dot  7_divide.png
10_divide.png  13_divide.dot  15_divide.png  18_merge.dot  1_divide.png  3_divide.dot  5_divide.png  8_divide.dot
11_divide.dot  13_divide.png  16_divide.dot  18_merge.png  20_list.dot   3_divide.png  6_divide.dot  8_divide.png
11_divide.png  14_divide.dot  16_divide.png  19_merge.dot  20_list.png   4_divide.dot  6_divide.png  9_divide.dot
12_divide.dot  14_divide.png  17_merge.dot   19_merge.png  2_divide.dot  4_divide.png  7_divide.dot  9_divide.png

上述測資分割完的 pending list 如下圖所示，圖形呈現參考〈Queue-Mergesort〉。
Note: 藍色箭頭為 prev，紅色箭頭為 next。

pending 每個排序好的子串列的數量都是 2 的冪，充分利用雙向鍊結串列的特性不需要用額外的空間配置 Queue，也符合教材的表格最終結果。

最後一次合併前的 pending 串列如下, 兩個子串列長度一樣。

遞增排序完的串列如下圖所。

2024q1 Homework1 (lab0)

實作 Queue 介面

q_insert_head / q_insert_tail

q_remove_head / q_remove_tail

q_reverse / q_reverseK

q_ascend / q_descend

q_swap

q_delete_mid

q_delete_dup

q_sort

merge_two_lists

q_merge