2023q1 Homework1 (lab0)

contributed by < Risheng1128 >

作業說明

實驗環境

$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  12
  On-line CPU(s) list:   0-11
Vendor ID:               GenuineIntel
  Model name:            11th Gen Intel(R) Core(TM) i5-11400H @ 2.70GHz
    CPU family:          6
    Model:               141
    Thread(s) per core:  2
    Core(s) per socket:  6
    Socket(s):           1
    Stepping:            1
    CPU max MHz:         4500.0000
    CPU min MHz:         800.0000
    BogoMIPS:            5376.00
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   288 KiB (6 instances)
  L1i:                   192 KiB (6 instances)
  L2:                    7.5 MiB (6 instances)
  L3:                    12 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-11

$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

佇列實作

使用的結構

實作之前，先分析目前 lab0-c 對管理佇列的設計

雙向鏈結串列

struct list_head {
    struct list_head *prev;
    struct list_head *next;
};

佇列元素

typedef struct {
    char *value;
    struct list_head list;
} element_t;

由上述兩種的結構，可以知道佇列的元素本身就是鏈結串列的節點，會產生如下的示意圖

接著是和去年不同的部份，現在可以管理多個佇列，主要是由於以下的結構實作

typedef struct {
    struct list_head *q;
    struct list_head chain;
    int size;
    int id;
} queue_contex_t;

上述的結構為每個佇列的節點，成員說明如下:

q: 指向佇列的第一個節點
chain: 負責管理佇列的鏈結串列的節點
size: 佇列節點的數量
id: 鏈結串列編號

接著以下的結構為管理佇列的鏈結串列的第一個節點

typedef struct {
    struct list_head head;
    int size;
} queue_chain_t;

static queue_chain_t chain = {.size = 0};

由上述的結構，可以得知目前管理佇列的鏈結串列示意圖如下:

其中 q 則是對應到上述佇列的結構

q_new

函式功能: 建立新的「空」佇列

函式流程

使用 malloc 分配記憶體空間，並由指標 new 指向
如果空間分配失敗 malloc 會回傳 NULL ，此時回傳 NULL
使用函式 INIT_LIST_HEAD 初始化

實際程式碼:

struct list_head *q_new()
{
    struct list_head *new = malloc(sizeof(*new));
    if (!new)
        return NULL;
    // initialize the head
    INIT_LIST_HEAD(new);
    return new;
}

q_free

函式功能: 釋放佇列所佔用的記憶體

函式流程

如果傳入的 head 為 NULL ，直接結束函式
走訪整個鏈結串列，每經過一個節點，就對其使用函式 list_del 移除該節點並釋放其空間
釋放 head 的記憶體空間

實際程式碼:

void q_free(struct list_head *l)
{
    if (!l)
        return;
    element_t *entry, *safe;
    list_for_each_entry_safe (entry, safe, l, list) {
        list_del(&entry->list);
        q_release_element(entry);
    }
    // free list head
    free(l);
}

q_insert_head

函式功能: 在佇列開頭 (head) 插入 (insert) 給定的新節點

函式流程

使用函式 malloc 分配 element_t 大小的記憶體空間，若失敗則回傳 false
使用函式 strdup 建立複製字串資料
使用函式 list_add 將節點加在 head 的下一個位置

實際程式碼:

bool q_insert_head(struct list_head *head, char *s)
{
    if (!head)
        return false;
    element_t *element = malloc(sizeof(*element));
    if (!element)
        return false;
    // copy string
    element->value = strdup(s);
    if (!element->value) {
        free(element);
        return false;
    }
    // add node into doubly-linked list at head
    list_add(&element->list, head);
    return true;
}

其中函式 strdup 的實作如下

/* Use undef to avoid strdup redefined error */
#undef strdup
#define strdup test_strdup

char *test_strdup(const char *s)
{
    size_t len = strlen(s) + 1;
    void *new = test_malloc(len);
    if (!new)
        return NULL;

    return memcpy(new, s, len);
}

q_insert_tail

函式功能: 在佇列尾端 (tail) 插入 (insert) 給定的新節點

函式流程

使用函式 malloc 分配 element_t 大小的記憶體空間，若失敗則回傳 false
使用函式 strdup 建立複製字串資料
使用函式 list_add_tail 將節點加在 head 的前一個位置

實際程式碼:

bool q_insert_tail(struct list_head *head, char *s)
{
    if (!head)
        return false;
    element_t *element = malloc(sizeof(*element));
    if (!element)
        return false;
    // copy string
    element->value = strdup(s);
    if (!element->value) {
        free(element);
        return false;
    }
    // add node into doubly-linked list at tail
    list_add_tail(&element->list, head);
    return true;
}

q_remove_head

函式功能: 在佇列開頭 (head) 移去 (remove) 給定的節點

函式流程

如果佇列為 NULL 或是空佇列，則回傳 NULL
移除佇列的第一個元素 (head->next) 並且取得其資料
將資料複製到 buffer 中

實際程式碼:

element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
    if (!head || list_empty(head))
        return NULL;

    struct list_head *node = head->next;
    element_t *element = list_entry(node, element_t, list);

    list_del(node);
    if (sp) {
        strncpy(sp, element->value, bufsize - 1);
        // add null terminator
        sp[bufsize - 1] = 0;
    }
    return element;
}

q_remove_tail

函式功能: 在佇列尾端 (tail) 移去 (remove) 給定的節點

函式流程

如果佇列為 NULL 或是空佇列，則回傳 NULL
移除佇列的最後一個元素 (head->prev) 並且取得其資料
將資料複製到 buffer 中

實際程式碼:

element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize)
{
    if (!head || list_empty(head))
        return NULL;

    struct list_head *node = head->prev;
    element_t *element = list_entry(node, element_t, list);

    list_del(node);
    if (sp) {
        strncpy(sp, element->value, bufsize - 1);
        // add null terminator
        sp[bufsize - 1] = 0;
    }
    return element;
}

q_size

函式功能: 計算目前佇列的節點總量

函式流程

如果佇列為 NULL 則回傳 0
走訪鏈結串列並計算走過的數量

實際程式碼:

int q_size(struct list_head *head)
{
    if (!head)
        return 0;

    int count = 0;
    struct list_head *node;
    list_for_each (node, head)
        count++;
    return count;
}

q_delete_mid

函式功能: 移走佇列中間的節點

函式流程

如果佇列為 NULL 或是空佇列則回傳 false
使用指標分別往前 (forward) 及往後 (afterward) 走，並找到中間的節點 (最後為 forward )
移除中間的節點並釋放其記憶體空間

實際程式碼:

bool q_delete_mid(struct list_head *head)
{
    // https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/
    if (!head || list_empty(head))
        return false;

    struct list_head *afterward = head->next, *forward = head->prev;
    // move point to find the middle node
    for (; afterward != forward && afterward->next != forward;
         afterward = afterward->next, forward = forward->prev)
        ;

    // remove the middle node in queue
    list_del(forward);
    // delete the node
    q_release_element(list_entry(forward, element_t, list));
    return true;
}

q_delete_dup

函式功能: 在已經排序的狀況，移走佇列中具備重複內容的節點

函式流程

如果佇列為 NULL 則回傳 false

走訪整個鏈結串列並取得 curr 和 next 的元素地址

list_for_each_entry_safe (curr_entry, next_entry, head, list)

當資料相同時，會對 next 指向的節點進行移除並釋放空間，同時 flag 被設置為 true ，字串比較則使用 strcmp 函式，當資料相同時回傳 0

while (&next_entry->list != head &&
       !strcmp(curr_entry->value, next_entry->value)) {
    list_del(&next_entry->list);
    q_release_element(next_entry);
    // update next pointer
    next_entry = list_entry(curr_entry->list.next, element_t, list);
    flag = true;
}

當 flag 為 true 時，表示發生過資料相同的情況，但 curr 指到的節點尚未釋放，因此要記得釋放該空間

// need remove current node
if (flag) {
    list_del(&curr_entry->list);
    q_release_element(curr_entry);
    flag = false;
}

實際程式碼:

bool q_delete_dup(struct list_head *head)
{
    // https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/
    if (!head)
        return false;

    bool flag = false;
    element_t *curr_entry, *next_entry;

    list_for_each_entry_safe (curr_entry, next_entry, head, list) {
        while (&next_entry->list != head &&
               !strcmp(curr_entry->value, next_entry->value)) {
            list_del(&next_entry->list);
            q_release_element(next_entry);
            // update next pointer
            next_entry = list_entry(curr_entry->list.next, element_t, list);
            flag = true;
        }

        // need remove current node
        if (flag) {
            list_del(&curr_entry->list);
            q_release_element(curr_entry);
            flag = false;
        }
    }
    return true;
}

TODO: 用更精簡的方式撰寫程式碼。
:notes: jserv

後來參考 alanjian85 同學的作業，發現了更好的解法，兩者相同的部份都是使用指標 prev 及 curr 來判斷資料是否相同，而不同之處在於我原本的實作是移除 curr 指向的節點，後者則是移除 prev 的節點，如此一來可以節省 curr 的迴圈

修改後的程式碼如下所示，修改紀錄為 commit:

bool q_delete_dup(struct list_head *head)
{
    // https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/
    if (!head)
        return false;

    bool flag = false;
    element_t *curr_entry, *next_entry;

    list_for_each_entry_safe (curr_entry, next_entry, head, list) {
        bool equal = &next_entry->list != head &&
                    !strcmp(curr_entry->value, next_entry->value);
        if (flag || equal) {
            list_del(&curr_entry->list);
            q_release_element(curr_entry);
            flag = equal;
        }
    }

    // need remove current node
    if (flag) {
        list_del(&curr_entry->list);
        q_release_element(curr_entry);
    }

    return true;
}

q_swap

函式功能: 交換佇列中鄰近的節點

函式流程

如果佇列為 NULL 則直接離開函式
走訪鏈結串列，每次都將 first 指到的節點取出並加到 second 指到的節點後，如此一來就達到交換的功能

實際程式碼:

void q_swap(struct list_head *head)
{
    // https://leetcode.com/problems/swap-nodes-in-pairs/
    if (!head)
        return;

    struct list_head *first = head->next;
    for (struct list_head *second = first->next;
         first != head && second != head;
         first = first->next, second = first->next) {
        // can swap
        list_del(first);
        list_add(first, second);
    }
}

q_reverse

函式功能: 以反向順序重新排列鏈結串列，該函式不該配置或釋放任何鏈結串列節點，換言之，它只能重排已經存在的節點

函式流程

建立三個指標 prev, curr 及 next ，並依照 prev → curr → next 的順序指向不同節點

next 指到目前節點的下一個

交換目前節點的下一個及上一個節點

更新 prev 及 curr

完整走訪整個鏈結串列，即可完成反轉

實際程式碼:

void q_reverse(struct list_head *head)
{
    if (!head)
        return;

    struct list_head *prev = head->prev, *curr = head, *next = NULL;
    while (next != head) {
        next = curr->next;
        curr->next = prev;
        curr->prev = next;
        prev = curr;
        curr = next;
    }
}

q_reverseK

函式功能: 類似 q_reverse ，但指定每 k 個節點為一組，進行反向順序排列

以下的函式流程以 trace-06-ops.cmd 的測資為例

l = [e, e, d, c, b, a, a, a], k = 3

函式流程

如果佇列為 NULL 或是空佇列則直接離開函式
切開鏈結串列裡最後一個節點回到 head 的連線
```
head->prev->next = NULL;
```

當 count 第一次和 k 相同且在執行函式 q_reverse 之前，此時從佇列頭開始，可以看到形成了一個單向環狀鏈結串列 [e, e, d] ，這麼做的目的是要滿足函式 q_reverse 輸入，也就是函式 q_reverse是以輸入節點的下一個節點 (head->next) 開始反轉

接著開始反轉，需注意的是原本的 sub_head 及 sub_tail 指向節點的相對位置會相反

將 old_tail->next 指向 sub_tail 指到的節點且 sub_head->next 指向 next->head 指到的節點，前者在這部份不會有任何改變，但在鏈結串列之間反轉後，需要上次反轉後的 tail 接上，就會有所功能

更新所有指標並重複上述的步驟
最後利用函式 restructure_list 將鏈結串列接好

實際程式碼:

void q_reverseK(struct list_head *head, int k)
{
    // https://leetcode.com/problems/reverse-nodes-in-k-group/
    if (!head || list_empty(head))
        return;

    int count = 0;
    struct list_head *sub_head = head->next, *next_head = NULL,
                     *old_tail = head;

    // cut the linked list to be singly-linked list
    head->prev->next = NULL;

    for (struct list_head *sub_tail = head->next; sub_tail;
         sub_tail = sub_tail->next) {
        if (++count == k) {
            next_head = sub_tail->next;
            sub_tail->next = old_tail;
            q_reverse(old_tail);
            // old node connects to the head of new list
            old_tail->next = sub_tail;
            // the new list connect to the next node
            sub_head->next = next_head;
            old_tail = sub_tail = sub_head;
            sub_head = next_head;
            count = 0;
        }
    }
    /* restructure the doubly-linked list */
    restructure_list(head);
}

其中函式 restructure_list 的實作如下，功能為將雙向鏈結串列的 prev 指標接好

void restructure_list(struct list_head *head)
{
    struct list_head *curr = head, *next = curr->next;
    while (next) {
        next->prev = curr;
        curr = next;
        next = next->next;
    }
    curr->next = head;
    head->prev = curr;
}

q_sort

函式功能: 以遞增順序來排序鏈結串列的節點

函式流程

函式 q_sort 的實作被主要拆為三個函式 q_sort, mergesort 及 mergelist

q_sort:

首先將指向 head 的指標改成 NULL，目的在於把雙向鏈結串列的終點從 head 改為 NULL，變成單向鏈結串列
```
head->prev->next = NULL;
```

接著呼叫函式 mergesort 開始進行 merge sort

head->next = mergesort(head->next);

排序完後每個節點的 prev 會亂掉，因此需要走訪各個節點並且把所有 prev 接回來
```
restructure_list(head);
```

mergesort: 參考你所不知道的 C 語言: linked list 和非連續記憶體裡， merge sort 的實作方式並做修改

使用「快慢指標」Floyd’s Cycle detection 演算法找出鏈結串列正中間的節點，並將鏈結串列切成 left 及 right 兩組鏈結串列

struct list_head *fast = head, *slow = head;
while (fast && fast->next) {
     fast = fast->next->next;
     slow = slow->next;
}
 
struct list_head *mid = slow;
slow->prev->next = NULL;

struct list_head *left = mergesort(head);
struct list_head *right = mergesort(mid);

呼叫函式 mergelist 合併 left 及 right

return mergelist(left, right);

mergelist: 參考 21. Merge Two Sorted Lists ，並實作合併兩個鏈結串列

利用指標的指標 indirect 指向指標 res ，並藉由修改指標完成鏈結串列合併的動作
```
struct list_head *res = NULL;
struct list_head **indirect = &res;
```

使用函式 strcmp 判斷資料的大小

node = strcmp(list1_entry->value, list2_entry->value) < 0 ? &list1

實際程式碼:

/* Merge the two lists in a one sorted list. */
static struct list_head *mergelist(struct list_head *list1,
                                   struct list_head *list2)
{
    struct list_head *res = NULL;
    struct list_head **indirect = &res;
    for (struct list_head **node = NULL; list1 && list2;
         *node = (*node)->next) {
        element_t *list1_entry = list_entry(list1, element_t, list);
        element_t *list2_entry = list_entry(list2, element_t, list);
        node = strcmp(list1_entry->value, list2_entry->value) < 0 ? &list1
                                                                  : &list2;
        *indirect = *node;
        indirect = &(*indirect)->next;
    }
    *indirect = (struct list_head *) ((size_t) list1 | (size_t) list2);
    return res;
}

/*
 * Divide the list into several nodes and merge to sorted list.
 * No effect if q is NULL or empty.
 */
static struct list_head *mergesort(struct list_head *head)
{
    if (!head || !head->next)
        return head;

    struct list_head *fast = head, *slow = head;
    while (fast && fast->next) {
        fast = fast->next->next;
        slow = slow->next;
    }

    struct list_head *mid = slow;
    slow->prev->next = NULL;

    struct list_head *left = mergesort(head);
    struct list_head *right = mergesort(mid);
    return mergelist(left, right);
}

/* Sort elements of queue in ascending order */
void q_sort(struct list_head *head)
{
    if (!head || list_empty(head))
        return;

    // cut the linked list to be singly-linked list
    head->prev->next = NULL;
    head->next = mergesort(head->next);
    /* restructure the doubly-linked list */
    restructure_list(head);
}

q_descend

函式功能: 對所有節點而言，移除所有在其之前且資料較小的節點

思考邏輯: 目標是把對每個節點之前，所有資料比其小的節點移除，換言之，其實可以利用雙向鏈結串列的特性，反向走訪所有的節點，並產生遞增的鏈結串列

實際程式碼:

int q_descend(struct list_head *head)
{
    // https://leetcode.com/problems/remove-nodes-from-linked-list/
    if (!head || list_empty(head) || list_is_singular(head))
        return 0;

    for (struct list_head *curr = head->prev, *next = curr->prev; next != head;
         next = curr->prev) {
        element_t *curr_entry = list_entry(curr, element_t, list);
        element_t *next_entry = list_entry(next, element_t, list);

        // if current node is greater than next node
        if (strcmp(curr_entry->value, next_entry->value) > 0) {
            list_del(next);
            q_release_element(next_entry);
        } else
            curr = next;
    }

    return q_size(head);
}

q_merge

函式功能: 合併所有已排序的佇列，並合而為一個新的已排序佇列

函式流程

利用指標 merge 暫時儲存已經合併的佇列
走訪所有的佇列並一個一個和 merge 合併 (利用函式 mergelist) ，由於函式 mergelist 的輸入皆為單向鏈結串列，因此這裡需要先將原本的佇列變成單向鏈結串列，再進行合併
```
head_entry->q->prev->next = NULL;
merge = mergelist(merge, head_entry->q->next);
```

將合併完的佇列復原

head_entry->q->next = head_entry->q;

最後將合併的佇列接在 chain 的第一個元素上，並使用函式 restructure_list 將 prev 完整連接

head_entry = list_entry(head->next, queue_contex_t, chain);
head_entry->q->next = merge;
/* restructure the doubly-linked list */
restructure_list(head_entry->q);

實際程式碼:

int q_merge(struct list_head *head)
{
    // https://leetcode.com/problems/merge-k-sorted-lists/
    if (!head || list_empty(head))
        return 0;

    struct list_head *merge = NULL;
    queue_contex_t *head_entry = NULL;
    list_for_each_entry (head_entry, head, chain) {
        // cut the linked list to be singly-linked list
        head_entry->q->prev->next = NULL;
        merge = mergelist(merge, head_entry->q->next);
        head_entry->q->next = head_entry->q;
    }

    head_entry = list_entry(head->next, queue_contex_t, chain);
    head_entry->q->next = merge;

    /* restructure the doubly-linked list */
    restructure_list(head_entry->q);
    return q_size(head_entry->q);
}

研讀 Linux 核心原始碼 list_sort.c

原始碼的運作分析放在額外的筆記研讀 Linux 核心原始程式碼 list_sort.c

引入 Linux 核心原始碼 list_sort.c 到 lab0-c

完整的實作紀錄可參考 commit ，以下為實作的流程

首先，將 Linux 核心原始檔案 list_sort.c 及 list_sort.h 的相關程式碼加進 lab0-c 並作修改，其中包含 list_sort 完整的程式碼及其函式定義，接著將以下 lab0-c 沒有的巨集函式加進 list_sort.h

likely() (位於 include/linux/compiler.h)
unlikely() (位於 include/linux/compiler.h)

完成的 list_sort.h 實際程式碼如下所示:

#pragma once

#ifndef likely
#define likely(x) __builtin_expect(!!(x), 1)
#endif
#ifndef unlikely
#define unlikely(x) __builtin_expect(!!(x), 0)
#endif

struct list_head;

__attribute__((nonnull(2, 3))) typedef int (*list_cmp_func_t)(
    void *,
    const struct list_head *,
    const struct list_head *);

__attribute__((nonnull(2, 3))) void list_sort(void *priv,
                                              struct list_head *head,
                                              list_cmp_func_t cmp);

__attribute__((nonnull(2, 3))) int list_cmp(void *priv,
                                            const struct list_head *list1,
                                            const struct list_head *list2);

接著在 Makefile 新增要編譯出的目標檔案，這裡可由變數 ENABLE_LINUX_SORT 讓使用者決定是否要使用 list_sort

ENABLE_LINUX_SORT ?= 1
ifeq ($(ENABLE_LINUX_SORT), 1)
CFLAGS += -D FEATURE_LINUX_SORT=1
OBJS += list_sort.o
endif

在 list_sort.c 新增函式 list_cmp ，用來比較節點的資料

/* List compare funciton */
__attribute__((nonnull(2, 3))) int list_cmp(void *priv,
                                            const struct list_head *list1,
                                            const struct list_head *list2)
{
    // cppcheck-suppress nullPointer
    element_t *list1_entry = list_entry(list1, element_t, list);
    // cppcheck-suppress nullPointer
    element_t *list2_entry = list_entry(list2, element_t, list);
    return strcmp(list1_entry->value, list2_entry->value) <= 0 ? 0 : 1;
}

最後則是修改檔案 qtest.c 裡的函式 do_sort ，並搭配剛剛在 Makefile 裡設定的參數來決定要使用原本的 q_sort 或是新引入的 list_sort

bool do_sort(int argc, char *argv[])
{
    ...
    if (current && exception_setup(true))
#if FEATURE_LINUX_SORT
        list_sort(NULL, current->q, list_cmp);
#else
        q_sort(current->q);
#endif
    ...
}

比較自己實作的 merge sort 和 Linux 核心程式碼

首先建立一個用來量測排序時間的測試檔 trace-time-measure.cmd ，內容如下所示，其中隨機產生的資料數會改變

option fail 0
option malloc 0
new
ih RAND 100000
time
sort
time

接著修改以下的 Makefile 腳本並輸入命令 make check 來分別測試 q_sort 及 list_sort

check: qtest
	./$< -v 3 -f traces/trace-time-measure.cmd

可以統計出兩者執行的時間，測試方式為對每個函式分別測試 3 次並紀錄其排序時間，同時資料總數從 100000 以公差為 100000 遞增到 500000

資料總數	`q_sort` 第一次 (s)	`q_sort` 第二次 (s)	`q_sort` 第三次 (s)	`list_sort` 第一次 (s)	`list_sort` 第二次 (s)	`list_sort` 第三次 (s)
100000	0.072	0.071	0.068	0.039	0.040	0.038
200000	0.190	0.222	0.200	0.114	0.120	0.114
300000	0.333	0.361	0.330	0.196	0.198	0.194
400000	0.488	0.528	0.517	0.286	0.284	0.301
500000	0.646	0.685	0.699	0.376	0.393	0.363

接著將以上的數據取平均後如下所示

資料總數	`q_sort` 平均 (s)	`list_sort` 平均 (s)
100000	0.070	0.039
200000	0.204	0.116
300000	0.341	0.196
400000	0.511	0.290
500000	0.676	0.377

最後使用 gnuplot 畫圖，參考 gnuplot 語法解說和示範，以下為比較結果，可以發現 list_sort 的排序時間比起 q_sort 快了不少

接著利用 perf 找到兩者的差異，安裝可以參考 Linux 效能分析工具: Perf

輸入命令 perf stat --repeat 5 -e cache-misses,cache-references,instructions,cycles ./qtest -f traces/trace-time-measure.cmd
→ 程式會執行 5 次，並且自動平均，其中每次測試的資料數都設定為 500000 ，以下為 q_sort 及 list_sort 執行的結果

首先是 q_sort 的結果

Performance counter stats for './qtest -f traces/trace-time-measure.cmd' (5 runs):

         2256,8287      cache-misses              #   68.299 % of all cache refs      ( +-  0.15% )
         3288,3709      cache-references                                              ( +-  0.36% )
      22,6838,1411      instructions              #    0.44  insn per cycle           ( +-  0.00% )
      50,7244,9200      cycles                                                        ( +-  0.51% )

           1.18335 +- 0.00408 seconds time elapsed  ( +-  0.34% )

接著是 list_sort 的結果

Performance counter stats for './qtest -f traces/trace-time-measure.cmd' (5 runs):

         1887,5232      cache-misses              #   72.800 % of all cache refs      ( +-  0.22% )
         2577,8636      cache-references                                              ( +-  0.26% )
      22,7975,0582      instructions              #    0.71  insn per cycle           ( +-  0.02% )
      31,8624,7214      cycles                                                        ( +-  0.38% )

           0.73931 +- 0.00590 seconds time elapsed  ( +-  0.80% )

將輸出的結果做成表格，如下表所示

	`q_sort`	`list_sort`
cycles	50,7244,9200	31,8624,7214
instructions	22,6838,1411	22,7975,0582
cache-references	3288,3709	2577,8636
cache-misses	2256,8287	1887,5232
insn per cycle	0.44	0.71

可以發現 list_sort 發生 cache miss 的次數比 q_sort 來的少，可以對應到 Linux 核心的鏈結串列排序裡提到「用 bottom up 實作 merge sort 對 cache 較友善，可避免大量的 cache trashing」

探討 list_sort.c 的比較次數。
:notes: jserv

運用 Valgrind 排除 qtest 實作的記憶體錯誤

輸入命令 make valgrind 對資料夾 traces 裡的所有測試資料使用 valgrind 進行檢查

測試結果

+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
---	trace-01-ops	5/5
+++ TESTING trace trace-02-ops:
# Test of insert_head, insert_tail, remove_head, remove_tail, and delete_mid
---	trace-02-ops	6/6
+++ TESTING trace trace-03-ops:
# Test of insert_head, insert_tail, remove_head, reverse and merge
---	trace-03-ops	6/6
+++ TESTING trace trace-04-ops:
# Test of insert_head, insert_tail, size, swap, and sort
---	trace-04-ops	6/6
+++ TESTING trace trace-05-ops:
# Test of insert_head, insert_tail, remove_head, reverse, size, swap, and sort
---	trace-05-ops	6/6
+++ TESTING trace trace-06-ops:
# Test of insert_head, insert_tail, delete duplicate, sort, descend and reverseK
---	trace-06-ops	6/6
+++ TESTING trace trace-07-string:
# Test of truncated strings
---	trace-07-string	6/6
+++ TESTING trace trace-08-robust:
# Test operations on empty queue
---	trace-08-robust	6/6
+++ TESTING trace trace-09-robust:
# Test remove_head with NULL argument
---	trace-09-robust	6/6
+++ TESTING trace trace-10-robust:
# Test operations on NULL queue
---	trace-10-robust	6/6
+++ TESTING trace trace-11-malloc:
# Test of malloc failure on insert_head
---	trace-11-malloc	6/6
+++ TESTING trace trace-12-malloc:
# Test of malloc failure on insert_tail
---	trace-12-malloc	6/6
+++ TESTING trace trace-13-malloc:
# Test of malloc failure on new
---	trace-13-malloc	6/6
+++ TESTING trace trace-14-perf:
# Test performance of insert_tail, reverse, and sort
---	trace-14-perf	6/6
+++ TESTING trace trace-15-perf:
# Test performance of sort with random and descending orders
# 10000: all correct sorting algorithms are expected pass
# 50000: sorting algorithms with O(n^2) time complexity are expected failed
# 100000: sorting algorithms with O(nlogn) time complexity are expected pass
---	trace-15-perf	6/6
+++ TESTING trace trace-16-perf:
# Test performance of insert_tail
---	trace-16-perf	6/6
+++ TESTING trace trace-17-complexity:
# Test if time complexity of q_insert_tail, q_insert_head, q_remove_tail, and q_remove_head is constant
Probably constant time
Probably constant time
Probably constant time
Probably constant time
---     trace-17-complexity     5/5
---	TOTAL		100/100

由以上的測試結果，可以發現對於目前的實作， valgrind 沒有發出任何的警告

接著使用 Massif 觀察記憶體使用情況，輸入命令 valgrind -v --tool=massif ./qtest -f traces/trace-01-ops.cmd 產生輸出檔 massif.out.17672

接著輸入命令 massif-visualizer ./massif.out.17672 讀取上述指令產生的輸出檔，以下為結果:

以 traces/trace-01-ops.cmd 為例，可以發現所有分配的記憶體在最後會完全被釋放

確保 qtest 提供的 `web` 命令得以搭配上述佇列實作使用

經過實驗證實，目前的實作會有以下的問題

問題: 在輸入命令 web 並建立連線後，會有以下的問題

當對伺服器端送出請求後，此時再從命令列透過 stdio 送出命令後，會產生一段很長的延遲，也就是後者的命令過了很久才會被處理
對伺服器端送出請求後，會產生以下的錯誤訊息

Unknown command 'favicon.ico'

首先從第一個問題，可以得知上述提到的延遲是由於 lab0-c 內部的函式 read 壅塞所導致，主要是 I/O 正在處理來自客戶端或是來自命令列的命令，使其一無法馬上執行。因此這邊使用函式 select 對檔案描述子 (file descriptor) 進行管理，以下為 select 的 I/O 模型，屬於 Multiplexing 模型，分成兩部份，第一部份為 select ，目的為等待準備好的資料，並回傳已經準備好的資料數，第二部份則是讀取準備好的資料。透過這樣的方法，可以一次管理多個檔案描述子。

descriptor 翻譯為「描述子」，參見 operator 為什麼翻譯為運算子
:notes: jserv

而目前的目標則是同時管理 stdin 及 socket 的資料，首先可以參考原本管理 I/O 的實作，位於檔案 console.c 的函式 run_console





























bool run_console(char *infile_name)
{
    if (!push_file(infile_name)) {
        report(1, "ERROR: Could not open source file '%s'", infile_name);
        return false;
    }

    if (!has_infile) {
        char *cmdline;
        while (use_linenoise && (cmdline = linenoise(prompt))) {
            interpret_cmd(cmdline);
            line_history_add(cmdline);       /* Add to the history. */
            line_history_save(HISTORY_FILE); /* Save the history on disk. */
            line_free(cmdline);
            while (buf_stack && buf_stack->fd != STDIN_FILENO)
                cmd_select(0, NULL, NULL, NULL, NULL);
            has_infile = false;
        }
        if (!use_linenoise) {
            while (!cmd_done())
                cmd_select(0, NULL, NULL, NULL, NULL);
        }
    } else {
        while (!cmd_done())
            cmd_select(0, NULL, NULL, NULL, NULL);
    }

    return err_cnt == 0;
}

將目光放在上述程式的變數 use_linenoise ，其功能是用來區別目前輸入為 stdin 或是 socket ，當輸入命令 web 時，根據函式 do_while 的內容，可以發現變數 use_linenoise 被設定為 false (下列程式碼第 12 行) ，同時也表示下次將讀取 socket 的資料


















static bool do_web(int argc, char *argv[])
{
    int port = 9999;
    if (argc == 2) {
        if (argv[1][0] >= '0' && argv[1][0] <= '9')
            port = atoi(argv[1]);
    }

    web_fd = web_open(port);
    if (web_fd > 0) {
        printf("listen on port %d, fd is %d\n", port, web_fd);
        use_linenoise = false;
    } else {
        perror("ERROR");
        exit(web_fd);
    }
    return true;
}

因此可以得知函式 run_console 第 10 行的迴圈用來處理來自 stdin 的資料，而第 19 行則是處理來自 socket 的資料

接著修改函式 run_console 如下所示，移除變數 use_linenoise 並且由函式 linenoise 來同時管理兩者的資料

bool run_console(char *infile_name)
{
    if (!push_file(infile_name)) {
        report(1, "ERROR: Could not open source file '%s'", infile_name);
        return false;
    }

    if (!has_infile) {
        char *cmdline;
        while ((cmdline = linenoise(prompt))) {
            interpret_cmd(cmdline);
            line_history_add(cmdline);       /* Add to the history. */
            line_history_save(HISTORY_FILE); /* Save the history on disk. */
            line_free(cmdline);
            while (buf_stack && buf_stack->fd != STDIN_FILENO)
                cmd_select(0, NULL, NULL, NULL, NULL);
            has_infile = false;
        }
    } else {
        while (!cmd_done())
            cmd_select(0, NULL, NULL, NULL, NULL);
    }

    return err_cnt == 0;
}

接著引入上述提到的函式 select 到函式 linenoise 內部，主要修改位於檔案 linenoise.c 的核心函式 line_edit 。首先使用關鍵字 extern 取得 console.c 的變數 web_fd ，其為 socket 檔案描述子

extern int web_fd;
fd_set read_set;

當然變數 web_fd 的宣告也要作修改

- static int web_fd;
+ int web_fd = -1;

接著在函式 line_edit 開始實作 select 的設定，可以參考 CS:APP 的第 12 章，前面提到要管理 stdin 及 socket ，因此設定的部份可以對應到下列程式的第 3 行及第 7 行，使用巨集函式 FD_SET 啟動 select 的監聽，其他的巨集函式可以參考 select(2) — Linux manual page













while (1) {
    FD_ZERO(&read_set);       /* clear read set */
    FD_SET(l.ifd, &read_set); /* set stdin fd */
    int fd = l.ifd + 1;
    /* If web not ready listen */
    if (web_fd != -1) {
        FD_SET(web_fd, &read_set); /* set web fd */
        fd = web_fd + 1;
    }
    ...
}

select(fd, &read_set, NULL, NULL, NULL);

接著是讀取函式的部份，實作如下所示，有兩種情況 — socket 及 stdin ，前者主要是使用 web_recv 讀取資料，並且對客戶端回傳 HTTP 狀態碼，後者則是維持原本的實作，使用函式 read 讀取來自 stdin 的資料

if ((web_fd != -1) && FD_ISSET(web_fd, &read_set)) {
    struct sockaddr_in clientaddr;
    socklen_t clientlen = sizeof(clientaddr);
    FD_CLR(web_fd, &read_set);

    int web_connfd =
        accept(web_fd, (struct sockaddr *) &clientaddr, &clientlen);

    char *p = web_recv(web_connfd, &clientaddr);
    int len = strlen(p);
    for (int i = 0; i < len; i++)
        line_edit_insert(&l, p[i]);

    char *buffer = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\n";
    web_send(web_connfd, buffer);
    free(p);
    close(web_connfd);
} else if (FD_ISSET(l.ifd, &read_set)) {
    signed char c;
    int nread;
    char seq[5];
    FD_CLR(l.ifd, &read_set);

    nread = read(l.ifd, &c, 1);
    if (nread <= 0)
        return l.len;
    ...
}

接著將 socket 設定為 non-blocking 模式，修改檔案 web.c 裡的函式 web_open ，如下所示

/* Create a socket descriptor */
- if ((listenfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
+ if ((listenfd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0)) < 0)
     return -1;

+ /* set socket non-blocking */
+ int flags = fcntl(listenfd, F_GETFL);
+ fcntl(listenfd, F_SETFL, flags | O_NONBLOCK);

另外要特別注意，當 socket 設定為 non-blocking 模式時，此時的讀取函式 read 需要考慮到回傳值 EAGAIN 及 EWOULDBLOCK ，可參考 read(2) — Linux manual page ，以下為說明

EAGAIN or EWOULDBLOCK: The file descriptor fd refers to a socket and has been marked nonblocking (O_NONBLOCK), and the read would block. POSIX.1-2001 allows either error to be returned for this case, and does not require these constants to have the same value, so a portable application should check for both possibilities.

因此將函式 rio_read 進行修改

static ssize_t rio_read(rio_t *rp, char *usrbuf, size_t n)
{
    int cnt;
    while (rp->count <= 0) { /* refill if buf is empty */
        rp->count = read(rp->fd, rp->buf, sizeof(rp->buf));
        if (rp->count < 0) {
+           // no data available yet
+           if (errno == EWOULDBLOCK || errno == EAGAIN)
+               rp->count = 0;       /* and call read() again */
-           if (errno != EINTR) /* interrupted by sig handler return */
+           else if (errno != EINTR) /* interrupted by sig handler return */
                return -1;
        } else if (rp->count == 0) { /* EOF */
            return 0;
        } else
            rp->bufptr = rp->buf; /* reset buffer ptr */
    }
    ...
}

依循 CONTRIBUTING.md 規範，在迴圈內讓 EOF 狀況及早脫離
:notes: jserv

接著是目前的展示成果，主要展示當 stdin 已經先輸入資料但尚未處理，接著從另一個終端機透過 socket 輸入命令，並不會導致資料阻塞，完整實作可見 commit 26a5dc

Note: 由於目前讀取資料後的操作是透過函式 line_edit_insert 處理，因此會讓兩個 I/O 的資料拼接在一起

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

TODO:

處裡 "Unknown command 'favicon.ico'" 的問題
確認 linenoise 對於 web 端輸入的命令也能展現歷史紀錄

2023q1 Homework1 (lab0)

實驗環境

佇列實作

使用的結構

q_new

q_free

q_insert_head

q_insert_tail

q_remove_head

q_remove_tail

q_size

q_delete_mid

q_delete_dup

q_swap

q_reverse

q_reverseK

q_sort

q_descend

q_merge

研讀 Linux 核心原始碼 list_sort.c

引入 Linux 核心原始碼 list_sort.c 到 lab0-c

比較自己實作的 merge sort 和 Linux 核心程式碼

運用 Valgrind 排除 qtest 實作的記憶體錯誤

確保 qtest 提供的 web 命令得以搭配上述佇列實作使用

在 qtest 提供新的命令 shuffle

Read more

第一次使用 ARM Cortex-M4 就上手

面試經驗

Cortex-M4 practices

2023q1 Homework3 (fibdrv)

確保 qtest 提供的 `web` 命令得以搭配上述佇列實作使用

在 qtest 提供新的命令 `shuffle`