2022q1 Homework1 (lab0)

contributed by < Risheng1128 >

作業說明

實驗環境

$ gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              2
Core(s) per socket:              2
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           142
Model name:                      Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
Stepping:                        9
CPU MHz:                         2700.000
CPU max MHz:                     3100.0000
CPU min MHz:                     400.0000
BogoMIPS:                        5399.81
Virtualization:                  VT-x
L1d cache:                       64 KiB
L1i cache:                       64 KiB
L2 cache:                        512 KiB
L3 cache:                        3 MiB
NUMA node0 CPU(s):               0-3

開發紀錄

二個重要的結構體

struct list_head {
    struct list_head *prev;
    struct list_head *next;
};

/* Linked list element */
typedef struct {
    char *value;
    struct list_head list;
} element_t;

q_new

根據 list.h 參考函式 INIT_LIST_HEAD ，其功能為將 linked list 的 next 及 prev 指向本身，即初始化

static inline void INIT_LIST_HEAD(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

實作流程
1. 使用 malloc 分配記憶體空間，並由指標 new 指向
2. 如果空間分配失敗 malloc 會回傳 NULL ，此時回傳 NULL
3. 使用 INIT_LIST_HEAD 初始化

實際程式碼

struct list_head *q_new()
{
    struct list_head *new = malloc(sizeof(struct list_head));
    if (!new)
        return NULL;

    INIT_LIST_HEAD(new);
    return new;
}

q_free

這邊利用到了一個特殊的巨集 list_entry ，可以從 list.h 查到該巨集

#define list_entry(node, type, member) container_of(node, type, member)

可以看到 list_entry 會使用到 container_of ，參考 Linux 核心原始程式碼巨集: container_of可以得知 container_of 可以用來得到包含 ptr 之 object 的起始地址

#ifndef container_of
#ifdef __LIST_HAVE_TYPEOF
#define container_of(ptr, type, member)                            \
    __extension__({                                                \
        const __typeof__(((type *) 0)->member) *__pmember = (ptr); \
        (type *) ((char *) __pmember - offsetof(type, member));    \
    })
#else
#define container_of(ptr, type, member) \
    ((type *) ((char *) (ptr) -offsetof(type, member)))
#endif
#endif

實作流程
1. 如果傳入的 head 為 NULL → return
2. 走訪整個 linked list ，每經過一個節點，就對其使用 list_entry() 並釋放其空間
3. 釋放分配給 head 的記憶體空間

實際程式碼，套用 API 可以寫出更精簡的寫法

void q_free(struct list_head *l)
{
    if (!l)
        return;

    struct list_head *node = l->next;

-   while (node != l) {
-       struct list_head *del = node;
-       node = node->next;
-       q_release_element(list_entry(del, element_t, list));
-   }
+   element_t *node, *next;
+   list_for_each_entry_safe (node, next, l, list) {
+       list_del(&node->list);
+       q_release_element(node);
+   }
    free(l);
}

q_insert_head

根據 list.h 參考函式 list_add ，查看原始碼可以得知其功能為把 node 節點加在 head 之後

static inline void list_add(struct list_head *node, struct list_head *head)
{
    struct list_head *next = head->next;

    next->prev = node;
    node->next = next;
    node->prev = head;
    head->next = node;
}

實作流程

使用 malloc() 分配 element_t 大小的記憶體空間，若分配失敗則回傳 false

element_t *new = malloc(sizeof(element_t));
if (!new)
    return false;

使用 strlen() 取得字串長度，並分配要複製字串的空間

分配的空間大小要比字串長度多一個位元組，記得加入 '\0'

int str_size = strlen(s);
new->value = malloc((str_size + 1) * sizeof(char));
if (!new->value) {
    free(new);
    return false;
}
strncpy(new->value, s, str_size);
*(new->value + str_size) = '\0';

使用 list_add() 將節點加在 head 的下一個
```
list_add(&new->list, head);
```

實際程式碼























bool q_insert_head(struct list_head *head, char *s)
{
    if (!head)
        return false;

    element_t *new = malloc(sizeof(element_t));
    if (!new)
        return false;

    INIT_LIST_HEAD(&new->list);
    int str_size = strlen(s);
    new->value = malloc((str_size + 1) * sizeof(char));

    if (!new->value) {
        free(new);
        return false;
    }

    strncpy(new->value, s, str_size);
    *(new->value + str_size) = '\0';
    list_add(&new->list, head);
    return true;
}

踩雷小紀錄: q_insert_head() 第 15 行
在執行測試檔 trace-12-malloc.cmd 時，一開始都會有 memory leak 的問題，原本以為是 q_free() 有問題，結果最後發現問題是出在 q_insert_head() 和 q_insert_tail()
原因: 缺乏考慮當 new 成功被分配空間而 new->value 分配空間失敗的情況
解法: 在 new->value 分配失敗時記得要將 new 的空間也釋放掉

q_insert_tail

q_insert_tail() 的思考幾乎都和 q_insert_head() 相同，兩者只差在插入的位置不同，也就使用不同函式

q_insert_head() 使用 list_add()
q_insert_tail() 使用 list_add_tail()

實際程式碼

bool q_insert_tail(struct list_head *head, char *s)
{
    if (!head)
        return false;

    element_t *new = malloc(sizeof(element_t));
    if (!new)
        return false;

    INIT_LIST_HEAD(&new->list);
    int str_size = strlen(s);
    new->value = malloc((str_size + 1) * sizeof(char));

    if (!new->value) {
        free(new);
        return false;
    }

    strncpy(new->value, s, str_size);
    *(new->value + str_size) = '\0';
    list_add_tail(&new->list, head);
    return true;
}

q_remove_head

根據 list.h 參考函式 list_del ，其功能為將節點 node 從 linked list 上 remove，成為單獨的節點

static inline void list_del(struct list_head *node)
{
    struct list_head *next = node->next;
    struct list_head *prev = node->prev;

    next->prev = prev;
    prev->next = next;

#ifdef LIST_POISONING
    node->prev = (struct list_head *) (0x00100100);
    node->next = (struct list_head *) (0x00200200);
#endif
}

思考方法
1. 如果 head 為 NULL 或是 empty ，則回傳 NULL
2. 將佇列的第一個 element 取出(利用 list_entry )並且 remove 該 element
3. 將 element->value 根據 bufsize 大小做複製

實際程式碼

element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
    if (!head || list_empty(head))
        return NULL;

    element_t *target = list_entry(head->next, element_t, list);
    list_del(&target->list);

    if (bufsize) {
        strncpy(sp, target->value, bufsize - 1);
        *(sp + bufsize - 1) = '\0';
    }
    return target;
}

q_remove_tail

q_remove_tail() 的作法和 q_remove_head() 大致相同，兩者只差在移除的位置不同

q_remove_tail(): 被移除的節點位置為 head->prev
q_remove_head(): 被移除的節點位置為 head->next

實際程式碼

element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize)
{
    if (!head || list_empty(head))
        return NULL;

    element_t *target = list_entry(head->prev, element_t, list);
    list_del(&target->list);

    if (bufsize) {
        strncpy(sp, target->value, bufsize - 1);
        *(sp + bufsize - 1) = '\0';
    }
    return target;
}

q_size

根據 list.h 參考巨集函式 list_for_each()，其功能為走訪整個 linked list

#define list_for_each(node, head) \
    for (node = (head)->next; node != (head); node = node->next)

利用 list_for_each() 可以完成 q_size()

實際程式碼

int q_size(struct list_head *head)
{
    if (!head)
        return 0;
    int size = 0;
    struct list_head *node;
    list_for_each (node, head)
        size++;
    return size;
}

q_delete_mid

思考方法
1. 使用「龜兔賽跑」Floyd’s Cycle detection來尋找中間的節點
  - rabbit 每一次迭代會走訪兩個節點
  - turtle 每一次迭代會走訪一個節點
2. 當 rabbit 走到 head 或是 rabbit->next 為 head 表示已經抵達「終點」，此時 turtle 指到的位址剛好會是佇列正中間的節點
3. 透過 list_del 可以將正中間的節點移出 linked list
4. 透過 list_entry 可以得到正中間 element 的地址，即可以進行記憶體空間釋放

實際程式碼

bool q_delete_mid(struct list_head *head)
{
    // https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/
    if (!head || !head->next)
        return false;

    struct list_head *rabbit = head->next, *turtle = head->next;

    while (rabbit != head && rabbit->next != head) {
        rabbit = rabbit->next->next;
        turtle = turtle->next;
    }

    list_del(turtle);
    q_release_element(list_entry(turtle, element_t, list));
    return true;
}

q_delete_dup

思考方法

如果 head 為 NULL 或 empty 或只有一個 element 時，則會回傳 false
→ 表示一定不會有重複的資料

if (!head || list_empty(head) || list_is_singular(head))
   return false;

變數解釋
- curr 為指向當前節點的指標
- next 為指向 curr 下一個節點的指標
- 宣告布林型態變數 key 表示是否發生資料相同的情況
  key 為 false 表示當前的 curr 和 next 沒有發生資料相同的情況
  key 為 true 表示當前的 curr 和 next 有發生資料相同的情況

走訪整個 linked list 並取得 curr 和 next 的 element 地址

element_t *curr_entry = list_entry(curr, element_t, list);
element_t *next_entry = list_entry(next, element_t, list);

當資料相同時，會對 next 指向的節點進行移除並釋放空間，同時 key 被設置為 true
這邊資料的比較使用 strcmp() 函式，當資料相同時回傳 0

while (next != head && !strcmp(curr_entry->value, next_entry->value)) {
       list_del(next);
       q_release_element(next_entry);
       next = curr->next;
       next_entry = list_entry(next, element_t, list);
       key = true;
   }

當 key 為 true 時，表示發生過資料相同的情況，但 curr 指到的節點尚未釋放，因此要記得釋放該空間

if (key) {
       list_del(curr);
       q_release_element(curr_entry);
       key = false;
   }

實際程式碼

bool q_delete_dup(struct list_head *head)
{
    // https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/
    if (!head || list_empty(head) || list_is_singular(head))
        return false;

    struct list_head *curr = head->next, *next = curr->next;
    bool key = false;

    while (curr != head && next != head) {
        element_t *curr_entry = list_entry(curr, element_t, list);
        element_t *next_entry = list_entry(next, element_t, list);

        while (next != head && !strcmp(curr_entry->value, next_entry->value)) {
            list_del(next);
            q_release_element(next_entry);
            next = curr->next;
            next_entry = list_entry(next, element_t, list);
            key = true;
        }

        if (key) {
            list_del(curr);
            q_release_element(curr_entry);
            key = false;
        }

        curr = next;
        next = next->next;
    }
    return true;
}

q_swap

思考方法
1. 建立兩個指標 first 及 second
2. 利用 list_del_init 將 first 指向的節點從 linked list 取出
3. 利用 list_add(first, second); 將 first 指向節點加到 second 指向節點的下一個位置，即兩節點交換位置
4. 當 first 或 first->next 為 head 時，表示已經到 linked list 的盡頭

實際程式碼

void q_swap(struct list_head *head)
{
    // https://leetcode.com/problems/swap-nodes-in-pairs/
    if (!head)
        return;

    struct list_head *first = head->next;

    while (first != head && first->next != head) {
        struct list_head *second = first->next;
        list_del_init(first);
        list_add(first, second);
        first = first->next;
    }
}

q_reverse

建立三個指標 prev, curr 及 next，並依照 prev → curr → next 的順序指向不同節點

struct list_head *prev = head->prev, *curr = head, *next = NULL;

next = curr->next; (next 指向 curr 的下一個節點 )

curr->next = prev; ( curr 的 next 指向原本的上一個節點 prev )

curr->prev = next; ( curr 的 prev 指向原本的下一個節點 next )

prev = curr; ( prev 往下移動一個節點 )

curr = next;

經過這六個步驟，可以發現 head 的 prev 和 next 的變化

head 的 next 從指向 bear (第一個 element)改成指向 meekat (最後一個 element)
head 的 prev 從指向 meerkat (最後一個 element)改成指向 bear (第一個 element)

實際程式碼

void q_reverse(struct list_head *head)
{
    if (!head)
        return;

    struct list_head *prev = head->prev, *curr = head, *next = NULL;

    while (next != head) {
        next = curr->next;
        curr->next = prev;
        curr->prev = next;
        prev = curr;
        curr = next;
    }
}

q_sort

實作 quick sort 遞迴版本

參考 quick-sort.c 使用 Linux Kernel API 實做 quick sort

參數介紹

less: 存放資料比 pivot 小的 linked list
greater: 存放資料比 pivot 大或相同的 linked list

首先將 pivot 設定為 linked list 的第一個節點，並且從 linked list上移除

pivot = list_first_entry(head, element_t, list);
list_del(&pivot->list);

接著走訪整個節點，並讓每個節點和 pivot 的資料互相比較(使用 strcmp() 並根據其結果決定將節點加在 less 或是 greater)

list_for_each_entry_safe (item, is, head, list) {
    if (strcmp(item->value, pivot->value) < 0)
        list_move_tail(&item->list, &less);
    else
        list_move(&item->list, &greater);
}

使用遞迴分別將 less 及 greater 進行比較和分割

INIT_LIST_HEAD(&less);
INIT_LIST_HEAD(&greater);

將每次分割好的節點依照 less → head → greater 重組起來

list_add(&pivot->list, head);
list_splice(&less, head);
list_splice_tail(&greater, head);

實際程式碼

void q_sort(struct list_head *head)
{
    struct list_head less, greater;
    element_t *pivot, *item, *is = NULL;

    if (!head || list_empty(head) || list_is_singular(head))
        return;

    INIT_LIST_HEAD(&less);
    INIT_LIST_HEAD(&greater);

    pivot = list_first_entry(head, element_t, list);
    list_del(&pivot->list);

    list_for_each_entry_safe (item, is, head, list) {
        if (strcmp(item->value, pivot->value) < 0)
            list_move_tail(&item->list, &less);
        else
            list_move(&item->list, &greater);
    }

    q_sort(&less);
    q_sort(&greater);

    list_add(&pivot->list, head);
    list_splice(&less, head);
    list_splice_tail(&greater, head);
}

結果 trace-14-perf.cmd 和 trace-15-perf.cmd 沒有通過QQ

注意排序操作期待是 stable sorting
jserv

好的，那學生試試看使用 merge sort 實作，不過想請問老師為什麼要使用 stable sorting ，是因為 unstable sorting 的效能比較不好，還是有其他的原因，謝謝老師！

「效能」只是評估的一個因素，stable sorting 和 unstable sorting 的應用場景不同，請回去讀書。
jserv

了解老師的意思了，謝謝老師

決定改成用 merge sort 試試看

實作 merge sort 遞迴版本

merge sort 的實作被拆為三個函式 q_sort() mergesort() 及 mergelist()

q_sort()

首先將指向 head 的指標改成 NULL，目的在於把 doubly linked list 的終點從 head 改為 NULL，變成單向 linked list
```
head->prev->next = NULL;
```

接著呼叫 mergesort() 開始進行 merge sort

head->next = mergesort(head->next);

排序完後每個節點的 prev 會亂掉，因此需要走訪各個節點把所有 prev 接回來

struct list_head *curr = head, *next = head->next;
while (next) {
    next->prev = curr;
    curr = next;
    next = next->next;
}

mergesort()
參考你所不知道的 C 語言: linked list 和非連續記憶體裡頭， merge sort 的實作方式並做修改

使用「龜兔賽跑」Floyd’s Cycle detection演算法找出 linked list 正中間的節點，並將 linked list 切成 left 及 right 兩個 linked list

struct list_head *rabbit = head, *turtle = head;
while (rabbit && rabbit->next) {
    rabbit = rabbit->next->next;
    turtle = turtle->next;
}

struct list_head *mid = turtle;
turtle->prev->next = NULL;

struct list_head *left = mergesort(head);
struct list_head *right = mergesort(mid);

呼叫 mergelist() 合併 left 及 right

return mergelist(left, right);

mergelist()
參考21. Merge Two Sorted Lists，並實作合併兩個 linked list
1. 利用 pointer to pointer indirect 指向 pointer res ，並藉由修改指標完成 linked list 合併的動作
```
struct list_head *res = NULL;
struct list_head **indirect = &res;
```
2. 使用 strcmp 判斷資料的大小
```
node = strcmp(list1_entry->value, list2_entry->value) < 0 ? &list1
```

實際程式碼

/*
 * Merge the two lists in a one sorted list.
 */
static struct list_head *mergelist(struct list_head *list1,
                                   struct list_head *list2)
{
    struct list_head *res = NULL;
    struct list_head **indirect = &res;
    for (struct list_head **node = NULL; list1 && list2;
         *node = (*node)->next) {
        element_t *list1_entry = list_entry(list1, element_t, list);
        element_t *list2_entry = list_entry(list2, element_t, list);
        node = strcmp(list1_entry->value, list2_entry->value) < 0 ? &list1
                                                                  : &list2;
        *indirect = *node;
        indirect = &(*indirect)->next;
    }
    *indirect = (struct list_head *) ((u_int64_t) list1 | (u_int64_t) list2);
    return res;
}

/*
 * Divide the list into several nodes and merge to sorted list.
 * No effect if q is NULL or empty.
 */
static struct list_head *mergesort(struct list_head *head)
{
    if (!head || !head->next)
        return head;

    struct list_head *rabbit = head, *turtle = head;
    while (rabbit && rabbit->next) {
        rabbit = rabbit->next->next;
        turtle = turtle->next;
    }

    struct list_head *mid = turtle;
    turtle->prev->next = NULL;

    struct list_head *left = mergesort(head);
    struct list_head *right = mergesort(mid);
    return mergelist(left, right);
}

/*
 * Sort elements of queue in ascending order
 * No effect if q is NULL or empty. In addition, if q has only one
 * element, do nothing.
 */
void q_sort(struct list_head *head)
{
    if (!head || list_empty(head))
        return;
    head->prev->next = NULL;
    head->next = mergesort(head->next);

    struct list_head *curr = head, *next = head->next;
    while (next) {
        next->prev = curr;
        curr = next;
        next = next->next;
    }
    curr->next = head;
    head->prev = curr;
}

研讀 Linux 核心原始程式碼 list_sort.c

作業要求

研讀 Linux 核心原始程式碼的 lib/list_sort.c
嘗試引入 Linux 核心原始程式碼到 lab0-c 專案
比較你自己實作的 merge sort 和 Linux 核心程式碼之間效能落差

為了讓篇幅小一點，這邊再開了一個新的筆記用來紀錄分析過程: 研讀 Linux 核心原始程式碼 list_sort

引入 Linux 核心原始碼到 lab0-c

為了引入 Linux 核心原始碼，必須要先了解 Linux 核心的 merge sort 整個的運作流程，已將整個分析放在額外的筆記本上，根據分析結果，以下是我規劃的實作流程

實作流程

將檔案 lib/list_sort.c 加進 lab0-c
將位於其他檔案的巨集函式加進 list_sort.h
在 qtest.c 新增函式 cmp()
修改 qtest.c 的 do_sort() 函式

建立 list_sort.c 及 list_sort.h ，前者包含 Linux 核心原始碼，後者包含必要的定義

首先需要將 lab0-c 所沒有的定義加進 list_sort.h ，如下所示

list_cmp_func_t (位於 include/linux/list_sort.h)
list_sort (位於 include/linux/list_sort.h)
likely() (位於 include/linux/compiler.h)
unlikely() (位於 include/linux/compiler.h)

以下為 list_sort.h 的程式碼

#ifndef _LIST_SORT_H
#define _LIST_SORT_H
#include <stdint.h>
#include "list.h"
#define USE_LINUX_SORT  1
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)
typedef int __attribute__((nonnull(2,3))) (*list_cmp_func_t)(void *, const struct list_head *, const struct list_head *);
void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp);
#endif /* End of _LIST_SORT_H_ */

接著在 makefile 的 OBJS 新增 list_sort.o

OBJS := qtest.o report.o console.o harness.o queue.o list_sort.o \
        random.o dudect/constant.o dudect/fixture.o dudect/ttest.o \
        linenoise.o

在 qtest.c 上建立函式 cmp()

__attribute__((nonnull(2,3)))
int cmp(void *priv, const struct list_head *list1, const struct list_head *list2)
{
    element_t *list1_entry = list_entry(list1, element_t, list);
    element_t *list2_entry = list_entry(list2, element_t, list);
    return strcmp(list1_entry->value, list2_entry->value) < 0 ? 0 : 1;
}

採雷小紀錄: 修改 list_sort() 的 prototype ，移除 __attribute__((nonnull(2,3)))
原因: 在測資 trace-10-robust.cmd ，會輸入 NULL 給 head ，原本在 list_sort 裡新增以下程式，但編譯器會編不過，錯誤訊息如下

if(!head)
    return;

  CC	list_sort.o
list_sort.c: In function ‘list_sort’:
list_sort.c:182:7: error: nonnull argument ‘head’ compared to NULL [-Werror=nonnull-compare]
  182 |     if(!head)
      |       ^
cc1: all warnings being treated as errors
make: *** [Makefile:50: list_sort.o] Error 1

解法: 移除 __attribute__((nonnull(2,3))) 即可

最後為了可以切換 list_sort() 及 q_sort() ，在 list_sort.h 新增新的巨集 USE_LINUX_SORT
→ 如果 USE_LINUX_SORT 為 1 ，則使用 list_sort() ，反之則使用 q_sort()

修改 qtest.c 裡的函式 do_sort() ，如下所示

if (exception_setup(true)) {
    #if (USE_LINUX_SORT == 1)
            list_sort(NULL, l_meta.l, cmp);
    #else
            q_sort(l_meta.l);
    #endif
}

最後結果: q_sort() 及 list_sort() 都可以正常執行

比較自己實作的 merge sort 和 Linux 核心程式碼

首先這邊新增了一個測資 trace-sort.cmd ，如以下所示，使用 qtest 提供的命令 time 可以算出排序使用的時間

option fail 0
option malloc 0
new
ih RAND 100000
time
sort
time

輸入命令 make check 測試的示意如下 (這邊的排序是自己寫的 q_sort())

./qtest -v 3 -f traces/trace-sort.cmd
cmd> option fail 0
cmd> option malloc 0
cmd> new
l = []
cmd> ih RAND 100000
l = [lhajuiamm nptum qdmcmfl nxyasup hemjdlj tetxmj pznuvn buyczxacm wbplbcipt vnbsjzf chtbzfy hpxnv upzkmvv bpvlvjz boocfn ucnxwj kemoxa azrxtyor duuks tebrmfg vfrxshpxu evpfw nmwfcp qrdld rahnv xqqroncbl qfppumm kwcrpvz puxlhhhcu ertih ... ]
cmd> time
Elapsed time = 0.087, Delta time = 0.087
cmd> sort
l = [aaaze aaazhb aabfh aabhbdo aabim aabpzxwgi aabqblx aabubl aacdjotav aacnapcm aacqj aacvkdmfq aacwcqd aacwg aacxso aadaocyzf aadldsjdq aadtapna aaeaciodj aaeakdh aaejadsff aaelwk aaeqgzt aaevmbu aafcfy aafeary aafjlj aafklh aafosjbi aafqy ... ]
cmd> time
Elapsed time = 0.216, Delta time = 0.129
Freeing queue

接著可以統計出 q_sort() 和 list_sort() 的排序的時間，這邊每一組測試三次

資料總數	`q_sort()` 第一次 (s)	`q_sort()` 第二次 (s)	`q_sort()` 第三次 (s)	`list_sort()` 第一次 (s)	`list_sort()` 第二次 (s)	`list_sort()` 第三次 (s)
100000	0.130	0.128	0.129	0.085	0.084	0.084
300000	0.504	0.496	0.492	0.305	0.306	0.306
500000	0.905	0.907	0.900	0.546	0.543	0.548

最後統計效能差異

資料總數	`q_sort()` 平均 (s)	`list_sort()` 平均 (s)	`list_sort()` 和 `q_sort()` 效能比較 (%)
100000	0.129	0.084	34.88%
300000	0.497	0.306	38.43%
500000	0.904	0.546	39.60%

接著探索 list_sort() 比較快的原因，這邊參考Linux 效能分析工具: Perf

這邊採用的資料點為 500000 筆

輸入命令 perf stat --repeat 5 -e cache-misses,cache-references,instructions,cycles ./qtest -f traces/trace-sort.cmd
→ 程式會執行 5 次，並且自動平均，以下為 q_sort() 及 list_sort() 執行的結果

首先是 q_sort() 的結果

 Performance counter stats for './qtest -f traces/trace-sort.cmd' (5 runs):

         7430,5695      cache-misses              #   69.606 % of all cache refs      ( +-  0.92% )
       1,0675,1677      cache-references                                              ( +-  0.24% )
      14,6569,3251      instructions              #    0.46  insn per cycle           ( +-  0.12% )
      32,1192,4749      cycles                                                        ( +-  0.90% )

            1.2041 +- 0.0109 seconds time elapsed  ( +-  0.90% )

接著是 list_sort() 的結果

Performance counter stats for './qtest -f traces/trace-sort.cmd' (5 runs):

         5698,1650      cache-misses              #   69.677 % of all cache refs      ( +-  0.84% )
         8178,0080      cache-references                                              ( +-  0.43% )
      14,6192,8852      instructions              #    0.65  insn per cycle           ( +-  0.08% )
      22,3567,4206      cycles                                                        ( +-  1.22% )

           0.84768 +- 0.00768 seconds time elapsed  ( +-  0.91% )

將輸出的結果做成表格，如下表所示

	`q_sort()`	`list_sort()`
cycles	3,211,924,749	2,235,674,206
instructions	1,465,693,251	1,461,928,852
cache-references	106,751,677	81,780,080
cache-misses	74,305,695	56,981,650
insn per cycle	0.46	0.65

這邊可以發現很有趣的事

兩者的 instructions 執行的次數沒有差很多，但 cycles 卻差了非常多
list_sort() 的 cache miss 比 q_sort() 來的少很多

運用 Valgrind 排除 qtest 實作的記憶體錯誤

作業目標

運用 Valgrind 排除 qtest 實作的記憶體錯誤，並透過 Massif 視覺化 “simulation” 過程中的記憶體使用量，需要設計對應的實驗

由於不知從何下手，因此起頭參考laneser的開發紀錄

command 是「命令」，instruction 是「指令」，二者語意不同。
jserv
已修正！

使用命令 valgrind -q --leak-check=full ./qtest -f traces/trace-01-ops.cmd 對檔案 trace-01-ops.cmd 進行測試

執行結果

cmd> # Test of insert_head and remove_head
cmd> option fail 0
cmd> option malloc 0
cmd> new
l = []
cmd> ih gerbil
l = [gerbil]
cmd> ih bear
l = [bear gerbil]
cmd> ih dolphin
l = [dolphin bear gerbil]
cmd> rh dolphin
Removed dolphin from queue
l = [bear gerbil]
cmd> rh bear
Removed bear from queue
l = [gerbil]
cmd> rh gerbil
Removed gerbil from queue
l = []
cmd> 
Freeing queue
==41370== 4 bytes in 1 blocks are still reachable in loss record 1 of 3
==41370==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==41370==    by 0x4A4E50E: strdup (strdup.c:42)
==41370==    by 0x1108F9: linenoiseHistoryAdd (linenoise.c:1236)
==41370==    by 0x11148C: linenoiseHistoryLoad (linenoise.c:1325)
==41370==    by 0x10C6B0: main (qtest.c:951)
==41370== 
==41370== 84 bytes in 19 blocks are still reachable in loss record 2 of 3
==41370==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==41370==    by 0x4A4E50E: strdup (strdup.c:42)
==41370==    by 0x11086D: linenoiseHistoryAdd (linenoise.c:1236)
==41370==    by 0x11148C: linenoiseHistoryLoad (linenoise.c:1325)
==41370==    by 0x10C6B0: main (qtest.c:951)
==41370== 
==41370== 160 bytes in 1 blocks are still reachable in loss record 3 of 3
==41370==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==41370==    by 0x1108B9: linenoiseHistoryAdd (linenoise.c:1224)
==41370==    by 0x11148C: linenoiseHistoryLoad (linenoise.c:1325)
==41370==    by 0x10C6B0: main (qtest.c:951)
==41370==

可以看到有出現 still reachable 類型的 memory lost

still reachable: 程式結束時有未釋放的記憶體，不過卻還有指標指著，通常會發生在 global 變數

跟著錯誤訊息的追隨路徑(main → linenoiseHistoryLoad → linenoiseHistoryAdd)可以找到沒有被釋放的記憶體空間

































int linenoiseHistoryAdd(const char *line)
{
    char *linecopy;

    if (history_max_len == 0)
        return 0;

    /* Initialization on first call. */
    if (history == NULL) {
        history = malloc(sizeof(char *) * history_max_len);
        if (history == NULL)
            return 0;
        memset(history, 0, (sizeof(char *) * history_max_len));
    }

    /* Don't add duplicated lines. */
    if (history_len && !strcmp(history[history_len - 1], line))
        return 0;

    /* Add an heap allocated copy of the line in the history.
     * If we reached the max length, remove the older line. */
    linecopy = strdup(line);
    if (!linecopy)
        return 0;
    if (history_len == history_max_len) {
        free(history[0]);
        memmove(history, history + 1, sizeof(char *) * (history_max_len - 1));
        history_len--;
    }
    history[history_len] = linecopy;
    history_len++;
    return 1;
}

錯誤訊息一共發生兩種

第一種位於 linenoiseHistoryAdd 第 10 行的位置，即 history = malloc(sizeof(char *) * history_max_len);
第二種位於 linenoiseHistoryAdd 第 22 行的位置，即 linecopy = strdup(line);

在終端機輸入命令 man strdup 可以得到其訊息，其回傳的空間會經過 malloc()

The strdup() function returns a pointer to a new string which is a duplicate of the string s. Memory for the new string is obtained with malloc(3), and can be freed with free(3).

查看了 linenoiseHistoryAdd 的原始碼後，發現 histroy 分配的每個指標指到的空間為 linecopy 分配的空間，因此只要將整個 history 釋放即可

history[history_len] = linecopy;

接著在 lineoise.c 中發現 API linenoiseAtExit() 包含 freeHistory() ，可以用來釋放 history

static void linenoiseAtExit(void)
{
    disableRawMode(STDIN_FILENO);
    freeHistory();
}

static void freeHistory(void)
{
    if (history) {
        int j;

        for (j = 0; j < history_len; j++)
            free(history[j]);
        free(history);
    }
}

實作步驟

將 static void linenoiseAtExit() 改成 void linenoiseAtExit() ，目的在於讓 console.c 裡的 finish_cmd() 可以呼叫該 API ，最後可以把 history 釋放
將 linenoiseAtExit() 的宣告從 linenoise.c 移到 linenoise.h
在 console.h 裡 include linenoise.h
在 finish_cmd() 裡呼叫 linenoiseAtExit()

重新測試 valgrind -q --leak-check=full ./qtest -f traces/trace-01-ops.cmd

測試結果

cmd> # Test of insert_head and remove_head
cmd> option fail 0
cmd> option malloc 0
cmd> new
l = []
cmd> ih gerbil
l = [gerbil]
cmd> ih bear
l = [bear gerbil]
cmd> ih dolphin
l = [dolphin bear gerbil]
cmd> rh dolphin
Removed dolphin from queue
l = [bear gerbil]
cmd> rh bear
Removed bear from queue
l = [gerbil]
cmd> rh gerbil
Removed gerbil from queue
l = []
cmd> 
Freeing queue

可以看到原本的錯誤訊息已經消失，表示 memory leak 已經解決
接著從第一筆測資測試到最後一筆測資

在測試到 trace-14-perf.cmd 時發生其他的問題，這邊先用最後出現問題的地方將不同的錯誤訊息分類

錯誤訊息分類

# 第一種
==339801== 8 bytes in 1 blocks are still reachable in loss record 2 of 37
==339801==    at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==339801==    by 0x10CDD8: calloc_or_fail (report.c:208)
==339801==    by 0x10D6A2: parse_args (console.c:154)
==339801==    by 0x10D6A2: interpret_cmd (console.c:208)
==339801==    by 0x10DF5D: cmd_select (console.c:584)
==339801==    by 0x10E22A: run_console (console.c:657)
==339801==    by 0x10C6E0: main (qtest.c:962)
# 第二種
==339801== 5 bytes in 1 blocks are still reachable in loss record 1 of 37
==339801==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==339801==    by 0x10CE7F: strsave_or_fail (report.c:230)
==339801==    by 0x10D6D9: parse_args (console.c:157)
==339801==    by 0x10D6D9: interpret_cmd (console.c:208)
==339801==    by 0x10DF5D: cmd_select (console.c:584)
==339801==    by 0x10E22A: run_console (console.c:657)
==339801==    by 0x10C6E0: main (qtest.c:962)
# 第三種
==339801== 32 bytes in 1 blocks are still reachable in loss record 23 of 37
==339801==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==339801==    by 0x10CD55: malloc_or_fail (report.c:189)
==339801==    by 0x10D7DF: add_cmd (console.c:90)
==339801==    by 0x10C631: console_init (qtest.c:796)
==339801==    by 0x10C631: main (qtest.c:945)
# 第四種
==339801== 48 bytes in 1 blocks are definitely lost in loss record 31 of 37
==339801==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==339801==    by 0x10E29C: test_malloc (harness.c:138)
==339801==    by 0x10E800: q_insert_head (queue.c:62)
==339801==    by 0x10C017: do_ih (qtest.c:201)
==339801==    by 0x10D1CB: interpret_cmda (console.c:186)
==339801==    by 0x10D717: interpret_cmd (console.c:209)
==339801==    by 0x10DF5D: cmd_select (console.c:584)
==339801==    by 0x10E22A: run_console (console.c:657)
==339801==    by 0x10C6E0: main (qtest.c:962)
# 第五種
==339801== Invalid read of size 8
==339801==    at 0x10AC42: do_sort (qtest.c:631)
==339801==    by 0x10D1CB: interpret_cmda (console.c:186)
==339801==    by 0x10D717: interpret_cmd (console.c:209)
==339801==    by 0x10DF5D: cmd_select (console.c:584)
==339801==    by 0x10E22A: run_console (console.c:657)
==339801==    by 0x10C6E0: main (qtest.c:962)
==339801==  Address 0xfffffffffffffff8 is not stack'd, malloc'd or (recently) free'd

錯誤訊息一共有五種

第一種位於 report.c 的函式 calloc_or_fail() ，即 void *p = calloc(cnt, bytes); ，類型為 still reachable
第二種位於 report.c 的函式 strsave_or_fail() ，即 char *ss = malloc(len + 1); ，類型為 still reachable
第三種位於 report.c 的函式 malloc_or_fail() ，即 void *p = malloc(bytes); ，類型為 still reachable
第四種位於 harness.c 的函式 test_malloc() ，即 block_ele_t *new_block = malloc(size + sizeof(block_ele_t) + sizeof(size_t)); ，類型為 definitely lost

definitely lost: 真的 memory leak
第五種位於 qtest.c 的函式 do_sort ，即 if (strcasecmp(item->value, next_item->value) > 0) { ，類型為 Invalid read

Invalid read: 表示被讀取的地址為 process 不可以讀取的地址， size 8 表示 process 想要讀取 8 bytes

最後發現，以上全部的訊息是因為超出測試的時間，程式進中斷後結束時沒有把記憶體釋放而產生

首先查了一段時間後發現在 harness.c 裡有一個變數 time_limit ，將其數值增加(例如改成 5) ，以上的問題都會解決

static int time_limit = 1;

以 time_limit = 5 重新輸入命令 valgrind -q --leak-check=full ./qtest -f traces/trace-14-perf.cmd 測試後，可以發現通過測試，且之後測資經過測試後也都通過

cmd> # Test performance of insert_tail, reverse, and sort
cmd> option fail 0
cmd> option malloc 0
cmd> new
l = []
cmd> ih dolphin 1000000
l = [dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin ... ]
cmd> it gerbil 1000000
l = [dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin ... ]
cmd> reverse
l = [gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil gerbil ... ]
cmd> sort
l = [dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin dolphin ... ]
cmd> 
Freeing queue

最後透過 massif 查看記憶體的狀況，輸入命令 valgrind -v --tool=massif ./qtest -f traces/trace-01-ops.cmd
→ 產生了名為 massif.out.81335 的檔案

接著使用 massif-visualizer 打開檔案，輸入命令 massif-visualizer ./massif.out.81335 ，以下為最終結果

可以看到建立的動態記憶體最後歸為 0 ，表示所有記憶體已被釋放

在 qtest 提供新的命令 shuffle

作業目標

在 qtest 提供新的命令 shuffle ，允許藉由 Fisher–Yates shuffle 演算法，對佇列中所有節點進行洗牌 (shuffle) 操作
解釋 select 系統呼叫在本程式的使用方式，並分析 console.c 的實作，說明其中運用 CS:APP RIO 套件的原理和考量點。可對照參考 CS:APP 第 10 章重點提示

為了了解怎麼新增命令，這邊先分析整個 qtest 是如何從得到命令一直到使用 console.c 來添加以及執行命令的運作流程，並做紀錄

首先，從 main 可以得知 qtest 是如何接收使用者給出的命令，其中使用到了函式 getopt()，輸入命令 man getopt 可以得到其描述

getopt is used to break up (parse) options in command lines for easy parsing by shell procedures, and to check for valid options. It uses the GNU getopt(3) routines to do this.

原始碼

while ((c = getopt(argc, argv, "hv:f:l:")) != -1) {
        switch (c) {
        case 'h':
            usage(argv[0]);
            break;
        case 'f':
            strncpy(buf, optarg, BUFSIZE);
            buf[BUFSIZE - 1] = '\0';
            infile_name = buf;
            break;
        case 'v': {
            char *endptr;
            errno = 0;
            level = strtol(optarg, &endptr, 10);
            if (errno != 0 || endptr == optarg) {
                fprintf(stderr, "Invalid verbosity level\n");
                exit(EXIT_FAILURE);
            }
            break;
        }
        case 'l':
            strncpy(lbuf, optarg, BUFSIZE);
            buf[BUFSIZE - 1] = '\0';
            logfile_name = lbuf;
            break;
        default:
            printf("Unknown option '%c'\n", c);
            usage(argv[0]);
            break;
        }
    }

得知 qtest 如何得到命令後，接著繼續了解 console.c 如何添加命令
找到函式 console_init() 可以發現裡頭加入了所有佇列會用到的所有命令，因此可以在這裡定義 shuffle 命令

console_init()

static void console_init()
{
    ADD_COMMAND(new, "                | Create new queue");
    ADD_COMMAND(free, "                | Delete queue");
    ADD_COMMAND(
        ih,
        " str [n]        | Insert string str at head of queue n times. "
        "Generate random string(s) if str equals RAND. (default: n == 1)");
    ADD_COMMAND(
        it,
        " str [n]        | Insert string str at tail of queue n times. "
        "Generate random string(s) if str equals RAND. (default: n == 1)");
    ADD_COMMAND(
        rh,
        " [str]          | Remove from head of queue.  Optionally compare "
        "to expected value str");
    ADD_COMMAND(
        rt,
        " [str]          | Remove from tail of queue.  Optionally compare "
        "to expected value str");
    ADD_COMMAND(
        rhq,
        "                | Remove from head of queue without reporting value.");
    ADD_COMMAND(reverse, "                | Reverse queue");
    ADD_COMMAND(sort, "                | Sort queue in ascending order");
    ADD_COMMAND(
        size, " [n]            | Compute queue size n times (default: n == 1)");
    ADD_COMMAND(show, "                | Show queue contents");
    ADD_COMMAND(dm, "                | Delete middle node in queue");
    ADD_COMMAND(
        dedup, "                | Delete all nodes that have duplicate string");
    ADD_COMMAND(swap,
                "                | Swap every two adjacent nodes in queue");
    add_param("length", &string_length, "Maximum length of displayed string",
              NULL);
    add_param("malloc", &fail_probability, "Malloc failure probability percent",
              NULL);
    add_param("fail", &fail_limit,
              "Number of times allow queue operations to return false", NULL);
}

，開始分析巨集函式 ADD_COMMAND ，可以發現 ADD_COMMAND 被定義為， add_cmd(#cmd, do_##cmd, msg)

#define ADD_COMMAND(cmd, msg) add_cmd(#cmd, do_##cmd, msg)

查看函式 add_cmd() ，可以看到一個有趣的結構 CELE (位於 add_cmd() 第 10 行， cmd_ele 為結構 CLEL 的變數)
發現該函式做的事情就只是把輸入的命令存到一個名為 CELE 的結構 (從 11 行到 14 行)

add_cmd()
















void add_cmd(char *name, cmd_function operation, char *documentation)
{
    cmd_ptr next_cmd = cmd_list;
    cmd_ptr *last_loc = &cmd_list;
    while (next_cmd && strcmp(name, next_cmd->name) > 0) {
        last_loc = &next_cmd->next;
        next_cmd = next_cmd->next;
    }

    cmd_ptr ele = (cmd_ptr) malloc_or_fail(sizeof(cmd_ele), "add_cmd");
    ele->name = name;
    ele->operation = operation;
    ele->documentation = documentation;
    ele->next = next_cmd;
    *last_loc = ele;
}

分析結構 CELE 後，可以發現 CELE 為 linked list 的結構形式，其中比較重要的變數為 operation ，被定義為函式指標，存放不同命令的函式地址。
由此可知，在一開始初始化的時候，所有命令會被存到一組 linked list 裡

/* Each command defined in terms of a function */
typedef bool (*cmd_function)(int argc, char *argv[]);

typedef struct CELE cmd_ele, *cmd_ptr;
struct CELE {
    char *name;
    cmd_function operation;
    char *documentation;
    cmd_ptr next;
};

最後開始分析 console.c 是如何執行命令
在 main 找到函式 run_console() ，仔細觀察原始碼後可以得知，執行命令一共分為兩種模式

一種為讓使用者一個命令接著一個命令輸入並執行 (run_console() 的第 8 行到第 15 行)
另一種則為直接讀取一個檔案執行 ( run_console() 的第 16 行到第 19 行 )

run_console()






















bool run_console(char *infile_name)
{
    if (!push_file(infile_name)) {
        report(1, "ERROR: Could not open source file '%s'", infile_name);
        return false;
    }

    if (!has_infile) {
        char *cmdline;
        while ((cmdline = linenoise(prompt)) != NULL) {
            interpret_cmd(cmdline);
            linenoiseHistoryAdd(cmdline);       /* Add to the history. */
            linenoiseHistorySave(HISTORY_FILE); /* Save the history on disk. */
            linenoiseFree(cmdline);
        }
    } else {
        while (!cmd_done())
            cmd_select(0, NULL, NULL, NULL, NULL);
    }

    return err_cnt == 0;
}

首先先討論第一種模式，在 run_console() 第 11 行可以找到函式 interpret_cmd() ，用途在於解析使用者輸入的命令並執行

解析命令使用函式 parse_args()
執行命令使用函式 interpret_cmda()

interpret_cmd()

/* Execute a command from a command line */
static bool interpret_cmd(char *cmdline)
{
    if (quit_flag)
        return false;

#if RPT >= 6
    report(6, "Interpreting command '%s'\n", cmdline);
#endif
    int argc;
    char **argv = parse_args(cmdline, &argc);
    bool ok = interpret_cmda(argc, argv);
    for (int i = 0; i < argc; i++)
        free_string(argv[i]);
    free_array(argv, argc, sizeof(char *));

    return ok;
}

接著探討很重要的用來執行命令的函式 interpret_cmda() ，可以歸納以下幾點

首先利用輸入命令的名字開始尋找是否有該命令存在，走訪 linked list 得知 (interpret_cmd() 的第 9 行和第 10 行)
如果有該命令的話，則利用函式指標開始呼叫該命令要執行的函式 (interpret_cmd() 的第 12 行)

interpret_cmda






















/* Execute a command that has already been split into arguments */
static bool interpret_cmda(int argc, char *argv[])
{
    if (argc == 0)
        return true;
    /* Try to find matching command */
    cmd_ptr next_cmd = cmd_list;
    bool ok = true;
    while (next_cmd && strcmp(argv[0], next_cmd->name) != 0)
        next_cmd = next_cmd->next;
    if (next_cmd) {
        ok = next_cmd->operation(argc, argv);
        if (!ok)
            record_error();
    } else {
        report(1, "Unknown command '%s'", argv[0]);
        record_error();
        ok = false;
    }

    return ok;
}

接下來換成討論第二種執行情況，也就是直接讀取檔案並執行
從之前提到的函式 run_console() ，查看函式 cmd_select() ，可以歸類以下幾點

在第 33 行，這邊用到一個函式庫 select.h 的函式 select() 詳細資料之後補上
在第 46 行的位置，開始執行命令

cmd_select()


















































int cmd_select(int nfds,
               fd_set *readfds,
               fd_set *writefds,
               fd_set *exceptfds,
               struct timeval *timeout)
{
    int infd;
    fd_set local_readset;

    if (cmd_done())
        return 0;

    if (!block_flag) {
        /* Process any commands in input buffer */
        if (!readfds)
            readfds = &local_readset;

        /* Add input fd to readset for select */
        infd = buf_stack->fd;
        FD_SET(infd, readfds);
        if (infd == STDIN_FILENO && prompt_flag) {
            printf("%s", prompt);
            fflush(stdout);
            prompt_flag = true;
        }

        if (infd >= nfds)
            nfds = infd + 1;
    }
    if (nfds == 0)
        return 0;

    int result = select(nfds, readfds, writefds, exceptfds, timeout);
    if (result <= 0)
        return result;

    infd = buf_stack->fd;
    if (readfds && FD_ISSET(infd, readfds)) {
        /* Commandline input available */
        FD_CLR(infd, readfds);
        result--;
        if (has_infile) {
            char *cmdline;
            cmdline = readline();
            if (cmdline)
                interpret_cmd(cmdline);
        }
    }
    return result;
}

結論

一開始使用函式 getopt() 取得使用者對 qtest 的輸入
接著在函式 console_init() 裡一路追蹤，得知了所有的命令一開始都會被儲存在一個 linked list 上
執行命令一共有兩種模式，等待使用者輸入以及讀取檔案
利用函式指標執行每種不同的命令

從上述的分析，可以得知 qtest 從執行、初始化命令到執行命令的過程，並想出了增加命令的實作方法，如下所示

實作步驟

模仿其他的命令新增函式 do_shuffle() ，作為命令 shuffle 的執行函式
實作函式 q_shuffle() ，用來實現 Fisher–Yates shuffle 演算法
在 console_init() 裡新增 shuffle 命令

在 qtest.c 新增函式 do_shuffle()

static bool do_shuffle(int argc, char *argv[])
{
    if (argc != 1) {
        report(1, "%s takes no arguments", argv[0]);
        return false;
    }

    if (!l_meta.l)
        report(3, "Warning: Try to access null queue");
    error_check();

    bool ok = true;
    if (exception_setup(true))
        ok = q_shuffle(l_meta.l);
    exception_cancel();

    show_queue(3);
    return ok && !error_check();
}

由於 queue.h 無法更改，因此將 q_shuffle() 宣告在 do_shuffle() 的上面

這邊參考 Fisher–Yates shuffle 裡提到的 Pencil-and-paper method (這邊使用 Fisher and Yates' original method) ，流程如下表所示

Range	Roll	Scratch	Result
		A B C D
1~4	2	A C D	B
1~3	3	A C	B D
1~2	1	C	B D A
1	1		B D A C

思考方式
1. 變數 number 決定目前要取第幾個 element
2. 取得節點後，將該節點 remove 並加到 linked list 的最後一個
3. 重複這些動作後，最一開始移動的節點最後會回到第一個

q_shuffle()

static bool q_shuffle(struct list_head *head)
{
    if (!head)
        return false;

    int size = q_size(head);
    while (size) {
        struct list_head *tmp = head->next;
        int number = rand() % size--;
        for(int i = 0; i < number; i++)
            tmp = tmp->next;
        list_del(tmp);
        list_add_tail(tmp, head);
    }
    
    return true;
}

在 console_init() 裡新增 shuffle 命令

ADD_COMMAND(
        shuffle, "                | Shuffle the queue by using Fisher–Yates shuffle");

接著輸入命令 ./qtest 以及 help ，可以看到所有可使用的命令，可以發現 shuffle 已被加入

./qtest
cmd> help
Commands:
	#	 ...            | Display comment
	dedup	                | Delete all nodes that have duplicate string
	dm	                | Delete middle node in queue
	free	                | Delete queue
	help	                | Show documentation
	ih	 str [n]        | Insert string str at head of queue n times. Generate random string(s) if str equals RAND. (default: n == 1)
	it	 str [n]        | Insert string str at tail of queue n times. Generate random string(s) if str equals RAND. (default: n == 1)
	log	 file           | Copy output to file
	new	                | Create new queue
	option	 [name val]     | Display or set options
	quit	                | Exit program
	reverse	                | Reverse queue
	rh	 [str]          | Remove from head of queue.  Optionally compare to expected value str
	rhq	                | Remove from head of queue without reporting value.
	rt	 [str]          | Remove from tail of queue.  Optionally compare to expected value str
	show	                | Show queue contents
+	shuffle	                | Shuffle the queue by using Fisher–Yates shuffle
	size	 [n]            | Compute queue size n times (default: n == 1)
	sort	                | Sort queue in ascending order
	source	 file           | Read commands from source file
	swap	                | Swap every two adjacent nodes in queue
	time	 cmd arg ...    | Time command execution
Options:
	echo	1	Do/don't echo commands
	error	5	Number of errors until exit
	fail	30	Number of times allow queue operations to return false
	length	1024	Maximum length of displayed string
	malloc	0	Malloc failure probability percent
	simulation	0	Start/Stop simulation mode
	verbose	4	Verbosity level
cmd>

測試程式結果

cmd> it 9
l = [1 2 3 4 5 6 7 8 9]
cmd> shuffle
l = [6 2 7 3 1 4 9 5 8]
cmd> shuffle
l = [4 8 1 5 2 7 6 9 3]
cmd> sort
l = [1 2 3 4 5 6 7 8 9]
cmd> shuffle
l = [1 4 8 3 2 5 6 7 9
cmd> sort
l = [1 2 3 4 5 6 7 8 9]
cmd> shuffle
l = [2 6 3 7 8 5 9 1 4]

在 qtest 提供新的命令 web

作業要求

在 qtest 提供新的命令 web，提供 web 伺服器功能，注意: web 伺服器運作過程中，qtest 仍可接受其他命令
- 依據前述說明，嘗試整合 tiny-web-server

首先查看了 tiny-web-server 後，發現有些步驟會用到 tiny-web-server 的函式，因此把 tiny.c 額外分出兩個檔案 tiny_function.c 及 tiny_function.h ，以下為 tiny_function.h 程式碼

tiny_function.h

#ifndef _TINY_FUNCTION_H
#define _TINY_FUNCTION_H

#include <arpa/inet.h> /* inet_ntoa */
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#define LISTENQ 1024 /* second argument to listen() */
#define MAXLINE 1024 /* max length of a line */
#define RIO_BUFSIZE 8192

#ifndef DEFAULT_PORT
#define DEFAULT_PORT 9999 /* use this port if none given as arg to main() */
#endif

#ifndef FORK_COUNT
#define FORK_COUNT 4
#endif

#ifndef NO_LOG_ACCESS
#define LOG_ACCESS
#endif

typedef struct {
    int rio_fd;                /* descriptor for this buf */
    int rio_cnt;               /* unread byte in this buf */
    char *rio_bufptr;          /* next unread byte in this buf */
    char rio_buf[RIO_BUFSIZE]; /* internal buffer */
} rio_t;

/* Simplifies calls to bind(), connect(), and accept() */
typedef struct sockaddr SA;

typedef struct {
    char filename[512];
    off_t offset; /* for support Range */
    size_t end;
} http_request;

typedef struct {
    const char *extension;
    const char *mime_type;
} mime_map;

// https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types

void rio_readinitb(rio_t *rp, int fd);
ssize_t writen(int fd, void *usrbuf, size_t n);
ssize_t rio_readlineb(rio_t *rp, void *usrbuf, size_t maxlen);
void format_size(char *buf, struct stat *stat);
void handle_directory_request(int out_fd, int dir_fd, char *filename);
int open_listenfd(int port);
void url_decode(char *src, char *dest, int max);
void parse_request(int fd, http_request *req);
void log_access(int status, struct sockaddr_in *c_addr, http_request *req);
void client_error(int fd, int status, char *msg, char *longmsg);
void serve_static(int out_fd, int in_fd, http_request *req, size_t total_size);
void process(int fd, struct sockaddr_in *clientaddr);
void print_help();

#endif /* _TINY_FUNCTION_H */

修改 Makefile ，讓 lab0-c 可以使用 tiny-web-server 的函式，另外這邊新增命令 make tiny ，保留原本 tiny-web-server 的功能

OBJS := qtest.o report.o console.o harness.o queue.o list_sort.o \
        random.o dudect/constant.o dudect/fixture.o dudect/ttest.o \
-       linenoise.o
+       linenoise.o tiny_function.o

+ tiny: tiny.c tiny_function.c
+	$(CC) $(CFLAGS) -o $@ $^
+	./$@

開始新增命令 web ，在 init_cmd() 新增以下程式碼

ADD_COMMAND(web, "                | Use web server");

另外，由於會使用到的函式分散在不同檔案，這邊宣告一些全域變數(位於 linenoise.h)

struct sockaddr_in clientaddr;
socklen_t clientlen;
int listenfd;
bool noise;

實作函式 do_web() ，這邊參考 tiny.c 的函式 main() 的作法，使用預設的 port 9999

if(!listenfd) {
    listenfd = open_listenfd(DEFAULT_PORT);
    noise = false;
    printf("listen on port %d, fd is %d\n", DEFAULT_PORT, listenfd);
} else
    printf("Server has been launched\n");
return true;

接著的步驟幾乎都是參考作業說明，跟著整合 tiny-web-server 的步驟慢慢做

修改 run_console() ，使其可以選擇使用 linenoise() 或是 cmd_select()

if (!has_infile) {
    char *cmdline;
    while (noise && (cmdline = linenoise(prompt)) != NULL) {
            interpret_cmd(cmdline);
            linenoiseHistoryAdd(cmdline); /* Add to the history. */
            linenoiseHistorySave(
                HISTORY_FILE); /* Save the history on disk. */
            linenoiseFree(cmdline);
    }
    if (!noise) {
        while (!cmd_done())
            cmd_select(0, NULL, NULL, NULL, NULL);
    }
} else {
    while (!cmd_done())
        cmd_select(0, NULL, NULL, NULL, NULL);
}

修改 cmd_select()

int cmd_select(int nfds,
               fd_set *readfds,
               fd_set *writefds,
               fd_set *exceptfds,
               struct timeval *timeout)
{
    int infd;
    fd_set local_readset;

    if (cmd_done())
        return 0;

    if (!block_flag) {
        /* Process any commands in input buffer */
        if (!readfds)
            readfds = &local_readset;

        /* Add input fd to readset for select */
        infd = buf_stack->fd;
+       FD_ZERO(readfds);
        FD_SET(infd, readfds);

+       /* If web not ready listen */
+       if (listenfd != -1)
+           FD_SET(listenfd, readfds);


        if (infd == STDIN_FILENO && prompt_flag) {
            printf("%s", prompt);
            fflush(stdout);
            prompt_flag = true;
        }

        if (infd >= nfds)
            nfds = infd + 1;
+       if (listenfd >= nfds)
+           nfds = listenfd + 1;
    }

    if (nfds == 0)
        return 0;

    int result = select(nfds, readfds, writefds, exceptfds, timeout);
    if (result <= 0)
        return result;

    infd = buf_stack->fd;
    if (readfds && FD_ISSET(infd, readfds)) {
        /* Commandline input available */
        FD_CLR(infd, readfds);
        result--;
        char *cmdline;
        cmdline = readline();
        
        if (cmdline)
            interpret_cmd(cmdline);
        
    }
+   else if (readfds && FD_ISSET(listenfd, readfds)) {
+       FD_CLR(listenfd, readfds);
+       result--;
+       int connfd = accept(listenfd, (SA *) &clientaddr, &clientlen);
+       char *p = url_process(connfd, &clientaddr);
+       if (p)
+           interpret_cmd(p);
+       free(p);
+       close(connfd);
+   }
    return result;
}

繼續照著作業說明實作，在 linenoiseEdit 新增程式碼

if(listenfd) {
    fd_set set;
    FD_ZERO(&set);
    FD_SET(listenfd, &set);
    FD_SET(stdin_fd, &set);
    clientlen = sizeof(clientaddr);
    int rv = select(listenfd + 1, &set, NULL, NULL, NULL);
    int connfd;
    switch (rv) {
    case -1:
        perror("select"); /* an error occurred */
        continue;
    case 0:
        printf("timeout occurred\n"); /* a timeout occurred */
        continue;
    default:
        if (FD_ISSET(listenfd, &set)) {
            connfd = accept(listenfd, (SA *) &clientaddr, &clientlen);
            char *p = url_process(connfd, &clientaddr);
            strncpy(buf, p, strlen(p) + 1);
            close(connfd);
            free(p);
            return strlen(p);
        } else if (FD_ISSET(stdin_fd, &set)) {
            nread = read(l.ifd, &c, 1);
            if (nread <= 0)
                return l.len;
        }
        break;
    }
} else {
    nread = read(l.ifd, &c, 1);
    if (nread <= 0)
        return l.len;
}

最後新增函式 url_process() ，透過 HTTP 的 GET method 傳送要執行的命令，這邊參考原本 tiny-web-server 的實作，不同在於 tiny-web-server 會讀取所在目錄的檔案，而 url_process() 指單純開啟空白頁面，實作如下

char *url_process(int fd, struct sockaddr_in *clientaddr)
{
#ifdef LOG_ACCESS
    printf("accept request, fd is %d, pid is %d\n", fd, getpid());
#endif
    http_request req;
    parse_request(fd, &req);
    
    int status = 200;
    char buf[MAXLINE];
    snprintf(buf, MAXLINE, "HTTP/1.1 %d\r\n",status);
    writen(fd, buf, strlen(buf));

    char *p = req.filename;
    /* Change '/' to ' ' */
    while (*p) {
        ++p;
        if (*p == '/') {
            *p = ' ';
        }
    }
#ifdef LOG_ACCESS
    log_access(status, clientaddr, &req);
#endif
    char *ret = malloc(strlen(req.filename) + 1);
    strncpy(ret, req.filename, strlen(req.filename) + 1);
    return ret;
}

最後結果如以下所示，可以成功開啟網頁，以及接收使用者命令

cmd> web
listen on port 9999, fd is 3
cmd> accept request, fd is 4, pid is 84485
0.0.0.0:0 200 - '.' (text/plain)
cmd> accept request, fd is 4, pid is 84485
127.0.0.1:33774 200 - 'new' (text/plain)
l = []
cmd> accept request, fd is 4, pid is 84485
127.0.0.1:33776 200 - 'ih 4' (text/plain)
l = [4]
cmd> accept request, fd is 4, pid is 84485
127.0.0.1:33778 200 - 'ih 5' (text/plain)
l = [5 4]
cmd> accept request, fd is 4, pid is 84485

不過還有一個小問題，開啟 server 之後，會經過很久的時間才能從終端機輸入命令，目前還沒有解決

進度紀錄

這些都在該 git log 描述，不需要在此提及，有 git commit 簡述即可。
jserv
已修正！

參考資料

Edward Hsu

2022/02/21 12:46:49

紀錄

Hello 您好: 想跟您請教個問題。我已經實作完成所有的q_xxx API()。可是 trace-17-complexity 一直過不了。請問您如何分析問題並解決問題，讓make test順利通過? 感謝!! Edward (Edited)

Risheng

2022/02/21 13:33:27

以下是我的思考流程： 1. 首先可以先修改 makefile 單獨測試 trace-17-complexity (使用 test check) 2. 接著可以看到是哪個函式的效能不夠好 3. 最後是把沒有通過的函式整個邏輯思考一次，把多餘的步驟去除 (舉例像是strlen這類會多花時間的函式) (Edited)

2022/02/22 02:31:20

好的!感謝您的提示!

2022q1 Homework1 (lab0)

實驗環境

開發紀錄

二個重要的結構體

q_new

q_free

q_insert_head

q_insert_tail

q_remove_head

q_remove_tail

q_size

q_delete_mid

q_delete_dup

q_swap

q_reverse

q_sort

實作 quick sort 遞迴版本

實作 merge sort 遞迴版本

研讀 Linux 核心原始程式碼 list_sort.c

引入 Linux 核心原始碼到 lab0-c

比較自己實作的 merge sort 和 Linux 核心程式碼

運用 Valgrind 排除 qtest 實作的記憶體錯誤

在 qtest 提供新的命令 shuffle

在 qtest 提供新的命令 web

進度紀錄

參考資料

Read more

第一次使用 ARM Cortex-M4 就上手

面試經驗

Cortex-M4 practices

2023q1 Homework3 (fibdrv)