2023q1 Homework1 (quiz1)

contributed by < Shiritai >

開發環境

$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  8
  On-line CPU(s) list:   0-7
Vendor ID:               GenuineIntel
  Model name:            12th Gen Intel(R) Core(TM) i3-12100
    CPU family:          6
    Model:               151
    Thread(s) per core:  2
    Core(s) per socket:  4
    Socket(s):           1
    Stepping:            5
    CPU max MHz:         5500.0000
    CPU min MHz:         800.0000
    BogoMIPS:            6604.80

測驗一

作答部分

AAA
由 pivot 變數名稱與其在程式碼其他地方的行為，推測其為快速排序法的錨點，再由使用的 list API 猜測程式碼想取第一項目為錨點，我們需以正確的方式使用 list API。
於 list.h 中，list_first_entry 的說明如下：










/**
 * list_first_entry() - Get first entry of the list
 * @head: pointer to the head of the list
 * @type: type of the entry containing the list node
 * @member: name of the list_head member variable in struct @type
 *
 * Return: @type pointer of first entry in list
 */
#define list_first_entry(head, type, member) \
    list_entry((head)->next, type, member)

清楚地提到 AAA 應該放擁有 list_head 成員的節點型別: struct item

BBB
承上，第二參數為 list_head 成員於 item 中的名稱，即 list。
CCC
CCC 出現的位置與形式讓人猜測為走訪所有節點，以下為所有走訪節點的巨集:
```
list_for_each(node, head)
list_for_each_entry(entry, head, member)
list_for_each_safe(node, safe, head)
list_for_each_entry_safe(entry, safe, head, member)
```
可發現參數數量不同，依據文件註解所述，可總結出
- _entry 表透過 list_head 成員走訪其 container
- _safe 表使用同時兩節點走訪，目的在於可以更改走訪過程中後者節點的指標，也就可以對其進行「移除」、「刪除」等操作。
不過要回答這題只需要知道其需要四個參數，唯一符合的只有 list_for_each_entry_safe。
DDD

CCC 以降的迴圈所執行的行為可以猜測為快速排序法的 partition，透過觀察執行 DDD 的條件，可以發現是對相較 pivot 小者的操作。另觀察 list_less 和 list_greater 起初都是空串列，可以推測這兩個首節點應該負責 partition 的任務，推測 DDD 應為將 itm 插入 list_less 中，故 DDD 似乎可為任何插入節點的 list API。

但要注意的是，題目要求 stable sort，原本的順序應該被保留，應該將較晚進行判斷的節點放入 list_less 的後面，故 DDD 為 list_move_tail。
EEE

承上題，同理 EEE 為 list_move_tail。

FFF

快速排序法為 divide and conquer 的演算法，conquer 完後要將原本切成兩份的串列併回一條。

觀察最後三行，第一行將 pivot 放回目前為空串列的 head。

第二行由函式註解得知為將 list_less 加入 head 串列的前方。











/**
 * list_splice() - Add list nodes from a list to beginning of another list
 * @list: pointer to the head of the list with the node entries
 * @head: pointer to the head of the list
 *
 * All nodes from @list are added to the beginning of the list of @head.
 * It is similar to list_add but for multiple nodes. The @list head is not
 * modified and has to be initialized to be used as a valid list head/node
 * again.
 */
static inline void list_splice(struct list_head *list, struct list_head *head)

至此，我們已經完成合併步驟的一半，剩下另一半「大者」也須併回原串列。

如此可猜測 FFF 為 list_splice_tail，如此一來便可將「大者」以正確的順序合併。

程式碼運作原理

作答部分已說明，在此對沒有解釋的部分進行補充。

邊界情形

遞迴初始條件為 0~1 個節點。

其實可以只考慮空串列的情況。注意若只有一節點，則該節點會作為 pivot，不會繼續遞迴呼叫導致無窮遞迴。不過為了效能考量，提早檢查單節點串列的情況可減少函式呼叫次數。
```
if (list_empty(head) || list_is_singular(head))
        return;
```

改進實作

快速排序的優化很多種，比如

調整取 pivot 策略
三路快排
hybrid sort

我的以下改進實作基於 lab0-c 的框架，進而可以和 q_sort (merge + insertion sort) 和純 insertion sort 進行比較。

以下為串列之快速排序的實作，有以下特點。

以首個節點為 pivot 以保持穩定排序
採三路快排，與 pivot 相等的節點會串在以 pivot 為第一節點 (list_middle 之下) 的後方，不需參與後續的遞迴呼叫。
partition 時計算各個分區的大小，以此為基礎決定後續要採取 insertion sort 或者 quick sort。















































void q_quicksort(struct list_head *head, bool descend)
{
    if (!head || list_empty(head) || list_is_singular(head))
        return;

    // optimization: use head as "list_less"
    struct list_head list_middle, list_greater;
    INIT_LIST_HEAD(&list_middle);
    INIT_LIST_HEAD(&list_greater);

    element_t *pivot = list_first_entry(head, element_t, list);
    list_move(&pivot->list, &list_middle);

    element_t *cur, *nxt;
    int l_cnt, r_cnt;
    l_cnt = r_cnt = 0;
    list_for_each_entry_safe (cur, nxt, head, list) {
        // optimization: 3-way quick partition
        int order = strcmp(cur->value, pivot->value);
        if (!order) { // "="
            list_move_tail(&cur->list, &list_middle);
        } else if (order > 0) { // ">"
            list_move_tail(&cur->list, &list_greater);
            ++r_cnt;
        } else { // "<"
            ++l_cnt;
        }
    }

    // optimization: hybrid sort
    // Use quick sort + insertion sort
    // threshold to use insertion sort optimization
    static const int THRESHOLD = 32;
    if (l_cnt > THRESHOLD) {
        q_quicksort(head, descend);
    } else if (l_cnt) {
        q_i_sort(head->next, head->prev, descend);
    }
    if (r_cnt > THRESHOLD) {
        q_quicksort(&list_greater, descend);
    } else if (r_cnt) {
        q_i_sort(list_greater.next, list_greater.prev, descend);
    }

    list_splice_tail(&list_middle, head);
    list_splice_tail(&list_greater, head);
}

2023q1 Homework1 (quiz1)

開發環境

測驗一

作答部分

程式碼運作原理

改進實作

測驗二

測驗三

2023q1 Homework1 (quiz1)

開發環境

測驗一

作答部分

程式碼運作原理

改進實作

測驗二

測驗三

Read more

Arch Linux 伺服器建置

Rust X OS 筆記

Ubuntu (desktop) 實驗平台搭建紀錄

Introduction to Database Systems