2021q1 Homework1 (quiz1)

contributed by < robertlin0401 >

tags: `linux2021`

2021 年第 1 週隨堂測驗題目

程式運作原理

node 結構

typedef struct __node {                   
    int value;
    struct __node *next;
} node_t;

list operation 函式




static inline void list_add_node_t(node_t **list, node_t *node_t) {
    node_t->next = *list;
    *list = node_t;
}

說明
- 作用：將 node_t 加入 linked list 的前端
- node_t：表指向欲加入目標 linked list 的新 node 的指標
- list：表指向目標 linked list 的第一個 node 的指標的指標

分析

給定一 linked list 結構如下， head 表指向第一個 node 的指標

第 2 行程式碼將 *list，即 head，賦值給 node_t->next，因此 node_t->next 將指向 head 所指的 node_A，結果如下

第 3 行程式碼將 *list，即 head，改成指向 node_t，結果如下





static inline void list_concat(node_t **left, node_t *right) {
    while (*left)
        left = &((*left)->next); // LLL
    *left = right;
}

說明
- 作用：將 right 所指向的 linked list，串接（concatenate）在 *left 所指向的目標 linked list 尾端
- left：表指向目標 linked list 的第一個 node 的指標的指標
- right：表欲串接在目標 linked list 尾端的 linked list 的第一個 node 的指標

分析

給定兩 linked list 結構如下， head 表指向目標 linked list 的第一個 node 的指標

第 2-3 行程式碼的目的為尋找 *left 串列的尾端 node 的 next 指標的指標，步驟如下

第一步：*left 即 head，指向 node_A，並非 NULL，因此將 left 向前移動，改成指向 node_A->next 的指標

第二步：*left 仍非 NULL，因此繼續將 left 向前移動，改成指向 node_B->next 的指標

第三步：此時 *left 為 NULL ，迴圈終止
根據上述步驟可知，LLL 應填入 &((*left)->next)，其中，(*left)->next 為下一個 node 的指標

第 4 行程式碼將當前 *left，即 node_B->next，改成欲串接的 linked list 的第一個 node 的指標 right，由此兩串列得以成功串接，結果如下

quicksort 實作




























void quicksort(node_t **list)
{
    if (!*list)
        return;

    node_t *pivot = *list;
    int value = pivot->value;
    node_t *p = pivot->next;
    pivot->next = NULL;

    node_t *left = NULL, *right = NULL;
    while (p) {
        node_t *n = p;
        p = p->next;
        list_add_node_t(n->value > value
                            ? &right /* AAA */
                            : &left  /* BBB */
                        , n);
    }

    quicksort(&left);
    quicksort(&right);

    node_t *result = NULL;
    list_concat(&result, left);
    list_concat(&result, pivot); list_concat(&result, right); // CCC
    *list = result;
}

說明
- quicksort 基本概念
  1. 取數列其中一個元素作為 pivot，將 pivot 與其餘元素逐一比較，較 pivot 小者視為一個新數列，較 pivot 大者視為一個新數列，此過程稱為 Partition
  2. 對兩個新數列分別做 quicksort，直到無法再分割出新的數列為止
  補充說明：Array vs. Linked List
  - Array 表示
    1. 在每次 Partition 的過程中，會同時進行元素的 SWAP
    2. 以由小到大排序為例，過程中會從前端搜尋比 pivot 大的元素，與從尾端搜尋的比 pivot 小的元素做 SWAP，直到比 pivot 小（大）的元素全部集中在前端（尾端）
    3. 最後再將 pivot SWAP 到最正確的位置即完成一次 Partition
    4. 由此過程可知，當無法再分割出新的數列時，排序即完成
  - Linked List 表示
    1. 在每次 Partition 的過程中不做 SWAP，而是將 list 分割成三個 sublists：left、right 以及只有一個元素的 pivot
    2. 針對 left 及 right 遞迴進行 quicksort
    3. 當無法再分割出新的 list 時，排序尚未完成，需要再將所有 sublists 依 left -> pivot -> right 的順序進行合併
- list：表目標串列的第一個 node 的指標的指標
分析
- 第 3-4 行程式碼為 quicksort 的終止條件，當 !*list 成立時，表示當前串列的元素個數為 0 個，無法再進行分割，此條件符合上述 quicksort 基本概念第 2 點
- 第 6 行程式碼將指向串列第一個 node 的指標賦值給 pivot ，因此 pivot 取的是串列的第一個元素
- 第 12-19 行程式碼根據 pivot 值將 linked list 進行分割，因為預期依據遞增順序排序，故分割成由較 pivot 小的 node 所組成的子串列 left，與由較 pivot 大的 node 所組成的子串列 right
  - list_add_node_t 函式的第一個引數 n->value > value ? AAA : BBB 為指定當前 node n 應加入的 linked list 的指標的指標，其中，value 為 pivot 之值
  - AAA 為 n->value > value 成立時 node n 應加入的 linked list 的指標的指標，因此 AAA 應填入 &right
  - BBB 為 n->value > value 不成立時 node n 應加入的 linked list 的指標的指標，因此 BBB 應填入 &left
- 第 21-22 行程式碼以遞迴的方式針對 left、right 兩子串列做 quicksort
- 第 24-27 行程式碼目的是在分割結束後，將排序後的 nodes 依序重新串接，並將最終結果賦值給 *list，即完整串列的第一個 node 的指標
  - CCC 應填入 list_concat(&result, pivot); list_concat(&result, right)，說明可參考上述 quicksort 的補充說明中 Linked List 表示部分

Random 行為的修正

問題描述

random() 方法是一種 Pseudo-Random Number Generator (PRNG)，它會根據 srandom() 所設定的 seed 來生成隨機數，當未設定 seed 時，預設值為 1

節錄自 random(3) man page description：

The srandom() function sets its argument as the seed for a new sequence of pseudo-random integers to be returned by random(). These sequences are repeatable by calling srandom() with the same seed value. If no seed value is provided, the random() function is automatically seeded with a value of 1.

由於此測試程式未設定 seed，表示每次執行程式均以 1 為 seed 生成隨機數，故每次執行之結果均相同，以下為測試結果：

$ ./random_test
NOT IN ORDER : 1016 84 706 124 326 483 763 498 939 186 205 809 236 74 255 81 115 105 966 359 
    IN ORDER : 74 81 84 105 115 124 186 205 236 255 326 359 483 498 706 763 809 939 966 1016 
$ ./random_test 
NOT IN ORDER : 1016 84 706 124 326 483 763 498 939 186 205 809 236 74 255 81 115 105 966 359 
    IN ORDER : 74 81 84 105 115 124 186 205 236 255 326 359 483 498 706 763 809 939 966 1016

想法

為了生成隨機亂數，我們需要在每次執行程式時給予不同的 seed 值
Linux kernel 會收集來自驅動程式或其它來源的環境噪音並存入熵池（entropy pool），由於環境噪音等資訊是無法預期的，因此可以使用它作為 seed
節錄自 random(4) man page description：
- The character special files /dev/random and /dev/urandom (present since Linux 1.3.30) provide an interface to the kernel's random number generator.
- The random number generator gathers environmental noise from device drivers and other sources into an entropy pool.
- Linux 3.17 and later provides the simpler and safer getrandom(2) interface which requires no special files
Linux 另外提供 getrandom() 的系統呼叫可從熵池中隨機讀取指定數量的位元組

補充說明：為何不全部使用熵池的資料作為亂數資料？
因為熵池大小有限，若全部自熵池取用，在需要取得較大量的亂數的情形，可能會發生可取得的資料量不足的問題，因此只取用其中 4 bytes 作為 PRNG 的 seed

修改

在測試程式中加入此段程式碼
```
int *seed = (int *)malloc(sizeof(int));
int result = getrandom(seed, sizeof(int), GRND_RANDOM | GRND_NONBLOCK);
if (result == -1) *seed = result;
srandom(*seed);
free(seed)
```
- 說明
  - 第 2 行程式碼透過 getrandom() 的系統呼叫從熵池中隨機讀取 4 個位元組，即 seed 所需的整數型態的大小
  - 第 3 行程式碼檢查 getrandom() 的結果是否成功，若失敗則直接將 result（-1）做為 seed 使用
  參考資料：getrandom(2) man page

以下為測試結果：

$ ./optimized_random_test 
NOT IN ORDER : 188 170 115 147 816 212 472 309 249 942 1007 433 134 436 577 831 785 589 830 1009 
    IN ORDER : 115 134 147 170 188 212 249 309 433 436 472 577 589 785 816 830 831 942 1007 1009 
$ ./optimized_random_test 
NOT IN ORDER : 678 155 495 710 632 554 699 416 46 524 143 264 930 10 249 375 177 30 582 576 
    IN ORDER : 10 30 46 143 155 177 249 264 375 416 495 524 554 576 582 632 678 699 710 930 
$ ./optimized_random_test 
NOT IN ORDER : 404 750 572 632 568 570 942 61 306 565 993 235 997 853 268 328 286 712 416 201 
    IN ORDER : 61 201 235 268 286 306 328 404 416 565 568 570 572 632 712 750 853 942 993 997

非遞迴方法實作

想法

在遞迴的方法中，每一回合會取一個 node 作為 pivot，並將 pivot 以外的 nodes 以較 pivot 大或小為基準分割為兩個 sublists，接著對個別的 sublist 做遞迴排序（詳見 quicksort 基本概念）
在非遞迴方法中，每一回合分割完成後，改從兩個 sublists 中挑選較短者先進行排序，另一者則將其記錄起來（以陣列記錄其起始與中止位置編號），並以迭代的方式一一排序，直到該陣列紀錄的所有 sublists 皆完成迭代

實作





































void non_recursive_quicksort(node_t **list, int elements)
{
    #define MAX_LEVELS 300

    int *arr = list2arr(list, elements);
    int pivot, begin[MAX_LEVELS], end[MAX_LEVELS], iter = 0, L, R;

    begin[0] = 0;
    end[0] = elements;
    while (iter >= 0) {
        L = begin[iter];
        R = end[iter] - 1;
        if (L < R) {
            pivot = arr[L];
            while (L < R) {
                while (arr[R] >= pivot && L < R) R--; if (L < R) arr[L++] = arr[R];
                while (arr[L] <= pivot && L < R) L++; if (L < R) arr[R--] = arr[L];
            }
            arr[L] = pivot;
            begin[iter+1] = L + 1;
            end[iter+1] = end[iter];
            end[iter++] = L;
            if (end[iter]-begin[iter] > end[iter-1]-begin[iter-1]) {
                SWAP(&begin[iter], &begin[iter-1]);
                SWAP(&end[iter], &end[iter-1]);
            }
        } else {
            iter--;
        }
    }

    for (iter = 0; iter < elements; ++iter) {
        (*list)->value = arr[iter];
        list = &((*list)->next);
    }
    free(arr);
}

說明
- list：表目標串列的第一個 node 的指標的指標
- elements：表目標串列的 node 總數／長度
分析
- 第 3 行程式碼所設定的 MAX_LEVELS，其意義相似於 recursive 版本中的遞迴呼叫深度，作為限制其記憶體使用量
  - note：
    在遞迴版本中，每增加一層深度，便需要多一次 function call 占用 stack space；
    在非遞迴版本中，則僅需多出記錄串列首尾位置編號的兩個變數的記憶體空間使用
- 第 5 行程式碼目的為將 linked list 中的資料提取出來，儲存成 array 以方便後續排序中搜尋與搬動資料，可避免大量的指標操作，其中，list2arr 的實作如下：
```
static int *list2arr(node_t **list, int elements)
{
    int *arr = (int *)malloc(elements * sizeof(int));
    for (int iter = 0; iter < elements; ++iter) {
        arr[iter] = (*list)->value;
        list = &((*list)->next);
    }
    return arr;
}
```
- 第 10-30 行的迴圈目的是針對每個 subarray 進行排序，直到所紀錄的所有 subarrays 皆完成迭代
  - 第 11-12 行程式碼會取出所記錄的最後一個 subarray 的資訊，並判斷是否為合法的 subarray（長度 > 0），若合法則進行排序
  - 第 14 行程式碼會取當前 subarray 的第一個元素作為 pivot
  - 第 15-18 行的迴圈目的是根據 pivot 將 subarray 中的元素進行搬動，較 pivot 小者搬至左側，反之搬至右側
  - 第 19-22 行程式碼會將 pivot 搬至正確位置，並將左、右兩 subarrays 的資訊記錄起來
  - 第 23-26 行程式碼會比較左、右兩 subarrays，並將較短者 swap 到紀錄的最後一筆，在下次迭代中便會優先排序，如此一來可避免迭代深度超出 MAX_LEVELS，其中，SWAP 的實作如下：
```
static inline void SWAP(int *a, int *b) {
    *a ^= *b; *b ^= *a; *a ^= *b;
}
```
- 第 32-35 行程式碼將排序好的 array 的值依序存回 linked list 中

參考資料：Optimized QuickSort — C Implementation (Non-Recursive)

linux-list quicksort 探討

資料結構

list_head：用於紀錄雙向鏈結，同時也用作 list 的指標

struct list_head {
    struct list_head *prev;
    struct list_head *next;
};

參考 /include/list.h#list_head

listitem：表 node 本身

struct listitem {
    uint16_t i;
    struct list_head list;
};

參考 /private/common.h#listitem

程式碼與說明





























static void list_qsort(struct list_head *head)
{
    struct list_head list_less, list_greater;
    struct listitem *pivot;
    struct listitem *item = NULL, *is = NULL;

    if (list_empty(head) || list_is_singular(head))
        return;

    INIT_LIST_HEAD(&list_less);
    INIT_LIST_HEAD(&list_greater);

    pivot = list_first_entry(head, struct listitem, list);
    list_del(&pivot->list);

    list_for_each_entry_safe (item, is, head, list) {
        if (cmpint(&item->i, &pivot->i) < 0)
            list_move_tail(&item->list, &list_less);
        else
            list_move(&item->list, &list_greater);
    }

    list_qsort(&list_less);
    list_qsort(&list_greater);

    list_add(&pivot->list, head);
    list_splice(&list_less, head);
    list_splice_tail(&list_greater, head);
}

第 10-11 行程式碼將 list_less 與 list_greater 兩 lists 進行初始化，分別作為鏈結比 pivot 小與比 pivot 大的 nodes 的 sublist

參考 INIT_LIST_HEAD
第 13-14 行程式碼取串列的第一個 node 作為 pivot，並將其從原串列刪除

參考 list_first_entry、list_del
第 16-21 行程式碼將串列剩餘的 nodes 依序與 pivot 做比較，較 pivot 小者會從原串列刪除並加入 list_less 的尾端，較 pivot 大者則是會從原串列刪除並加入 list_greater 的前端，因此比較皆完成之後，原串列會變為空串列

參考 list_for_each_entry_safe、cmpint、list_move_tail、list_move
第 23-24 行程式碼針對 list_less 與 list_greater 兩 sublists 進行遞迴排序
第 26-28 行程式碼將 pivot 加入當前為空的原串列的前端，並將 list_less 加入原串列的前端、 list_greater 加入原串列的尾端

參考 list_add、list_splice、list_splice_tail

分析

對比本題的 quicksort 與 linux-list 版本的 quicksort，我們可以發現兩者的運作流程是相同的

不同點在於後者的資料結構較複雜，相對地，其 list operation 也較為複雜，以下為後者的資料結構圖示：

非遞迴方法實作





















































































static bool isValid(struct list_head *head, struct list_head *L, struct list_head *R)
{
    if (L == R->next) return false;
    while (1) {
        L = L->next;
        if (L == R)	return true;
        if (L == head) return false;
    }
}

static int list_offset(struct list_head *head, struct list_head *begin, struct list_head *end)
{
    int count = 0;
    while (begin != end) {
        if (begin == head) return -1;
        count++;
        begin = begin->next;
    }
    return count;
}

static void SWAP(struct list_head **a, struct list_head **b)
{
    struct list_head *temp = *a;
    *a = *b;
    *b = temp;
}

static void list_qsort(struct list_head *head)
{	
    #define MAX_LEVELS 300
    #define isValid() isValid(head, L, R)
    #define valueOf(node) list_entry(node, struct listitem, list)->i

    struct list_head *pivot, *begin[MAX_LEVELS], *end[MAX_LEVELS], *L, *R;
    int iter = 0;

    if (list_empty(head) || list_is_singular(head))
        return;

    begin[0] = head->next;
    end[0] = head;
    while (iter >= 0) {
        L = begin[iter];
        R = end[iter]->prev;
        if (isValid()) {
            pivot = L;
            while (isValid()) {
                while (valueOf(R) >= valueOf(pivot) && isValid()) R = R->prev;
                if (isValid()) {
                    struct list_head *target = R;
                    R = R->prev;
                    list_move(target, L);
                    L = L->next;
                }
                while (valueOf(L) <= valueOf(pivot) && isValid()) L = L->next;
                if (isValid()) {
                    struct list_head *target = L;
                    L = L->prev;
                    list_move(target, R);
                }
            }
            begin[iter] = pivot->next;
            if (valueOf(L) >= valueOf(pivot)) {
                if (L != pivot) {
                    begin[iter+1] = L;
                    list_move_tail(pivot, L);
                } else {
                    begin[iter+1] = L->next;
                }
            } else {
                begin[iter+1] = L->next;
                list_move(pivot, L);
            }
            end[iter+1] = end[iter];
            end[iter++] = pivot;
            if (list_offset(head, begin[iter], end[iter]) > list_offset(head, begin[iter-1], end[iter-1])) {
                SWAP(&begin[iter], &begin[iter-1]);
                SWAP(&end[iter], &end[iter-1]);
            }
        } else {
            iter--;
        }
    }
}

說明
- 此處與前例先轉換成 array 的作法不同，而是使用給定的 list 結構與其提供的 list operation 進行實作
- 程式運作流程與 array 版本相同，但由於是以 linked list 進行實作，需要顧及 list 結構的維護，導致程式碼之可讀性與邏輯較混亂，此處應該可以進行加強

高效率的 linked list 排序演算法

說明

在 doubly linked list 的資料結構下，tree sort 為文章中所提及的最快速的排序演算法
tree sort 演算法步驟如下：
- 將串列建構成一個二元搜尋樹
- 對該二元搜尋樹進行中序走訪（inorder traversal），所得到的結果便會是由小到大排序好的結果
  （中序走訪：按照左子樹->樹根->右子樹的順序進行走訪）

程式碼

static inline void set_leaf(struct list_head *leaf)
{
    leaf->prev = NULL;
    leaf->next = NULL;
}

static inline void inorder_traversal(struct list_head *root, struct list_head *head)
{
    if (root != NULL) {
        inorder_traversal(root->prev, head);
        struct list_head *next = root->next;
        list_add_tail(root, head);
        inorder_traversal(next, head);
    }
}

static void list_tsort(struct list_head *head)
{	
    #define valueOf(node) list_entry(node, struct listitem, list)->i

    if (list_empty(head) || list_is_singular(head))
        return;
    
    struct list_head *root = head->next;
    list_del(root);
    set_leaf(root);
    while (!list_empty(head)) {
        struct list_head *target = head->next;
        list_del(target);
        set_leaf(target);
        struct list_head *position = root;
        while (1) {
            if (valueOf(target) <= valueOf(position)) {
                if (position->prev != NULL)
                    position = position->prev;
                else {
                    position->prev = target;
                    break;
                }
            } else {
                if (position->next != NULL)
                    position = position->next;
                else {
                    position->next = target;
                    break;
                }
            }
        }
    }

    inorder_traversal(root, head);
}

參考資料：A Comparative Study of Linked List Sorting Algorithms

2021q1 Homework1 (quiz1)

tags: linux2021

程式運作原理

node 結構

list operation 函式

quicksort 實作

Random 行為的修正

問題描述

想法

修改

非遞迴方法實作

想法

實作

linux-list quicksort 探討

資料結構

程式碼與說明

相關 operation

分析

非遞迴方法實作

高效率的 linked list 排序演算法

說明

程式碼

Read more

hw4 report

Project 3

Project 2

week 5

tags: `linux2021`