2024q1 Homework1 (lab0)

contributed by < yu-hisennn >

Reveiwed by `yy214123`

你的 q_delete_mid 以快慢指標的策略進行實作，為找出串列的中間節點：

快指標須走訪
$n$ 次
慢指標須走訪
$\frac{n}{2}$ 次
總共
$\frac{3 n}{2}$ 次

考慮到使用的資料結構為雙向環狀鍊結串列，可改以下方方式實作：

僅走訪

n

次即可找到串列之中間節點。

開發環境

$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

$ lscpu
架構：                   x86_64
CPU 作業模式：            32-bit, 64-bit
Address sizes:          39 bits physical, 48 bits virtual
Byte Order:             Little Endian
CPU(s):                 8
On-line CPU(s) list:    0-7
供應商識別號：             GenuineIntel
Model name:             Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
CPU 家族：               6
型號：                   60
每核心執行緒數：           2
每通訊端核心數：           4
Socket(s):              1
製程：                   3
CPU max MHz:            4400.0000
CPU min MHz:            800.0000
BogoMIPS:               7981.55

針對佇列的實作

無論標題和內文中，中文和英文字元之間要有空白字元 (對排版和文字搜尋有利)

在實作過程中，發現有些程式碼已經在 list.h 檔中實作出來了，所以直接呼叫即可，這樣一來可以讓程式碼更精簡，可讀性也會有所提昇，像是︰

q_new

struct list_head *q_new()
{
    return NULL;
    struct list_head *head = malloc(sizeof(struct list_head));
    if (!head) {
        return NULL;  // Allocation failed
    }
-   head->prev = head->next = head;
+   INIT_LIST_HEAD(head);
    return head;
}

上述 diff 的內容，應該以 git diff 或 diff 命令產生，注意修改行數的位移量。

q_insert_head

bool q_insert_head(struct list_head *head, char *s)
{
    if (!head || !s) {
        return false;
    }
    element_t *new_elem = malloc(sizeof(element_t));
    if (!new_elem) {
        return false;  // Allocation failed
    }
    new_elem->value = strdup(s);  // Copy string
    if (!new_elem->value) {
        free(new_elem);
        return false;  // String copy failed
    }
-   new_elem->list.next = head->next;
-   new_elem->list.prev = head;
-   head->next->prev = &new_elem->list;
-   head->next = &new_elem->list;
+   list_add(&new_elem->list, head);
    return true;
}

q_insert_tail

bool q_insert_head(struct list_head *head, char *s)
{
    if (!head || !s) {
        return false;
    }
    element_t *new_elem = malloc(sizeof(element_t));
    if (!new_elem) {
        return false;  // Allocation failed
    }
    new_elem->value = strdup(s);  // Copy string
    if (!new_elem->value) {
        free(new_elem);
        return false;  // String copy failed
    }
-   new_elem->list.prev = head->prev;
-   new_elem->list.next = head;
-   head->prev->next = &new_elem->list;
-   head->prev = &new_elem->list;
+   list_add_tail(&new_elem->list, head);
    return true;
}

或是直接呼叫之前寫好的 q_insert_head

bool q_insert_tail(struct list_head *head, char *s) {
    if (!head || !s) {
        return false;
    }
-   if (!head || !s) {
-       return false;
-   }
-   element_t *new_elem = malloc(sizeof(element_t));
-   if (!new_elem) {
-       return false; // Allocation failed
-   }
-   new_elem->value = strdup(s); // Copy string
-   if (!new_elem->value) {
-       free(new_elem);
-       return false; // String copy failed
-   }
-   list_add_tail(&new_elem->list, head);
-   return true;
+   return q_insert_head(head->prev, s);
}

同理，q_remove_tail也可以呼叫之前寫好的 q_remove_head

element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) {
    if (!head || head->prev == head) {
        return NULL; // Queue is empty or NULL
    }
-   struct list_head *last = head->prev;
-   element_t *elem = list_entry(last, element_t, list);
-   if (sp && bufsize > 0) {
-       strncpy(sp, elem->value, bufsize - 1);
-       sp[bufsize - 1] = '\0';
-   }
-   list_del(last);
-   q_release_element(elem);
-   return elem;
+   return q_remove_head(head->prev->prev, sp, bufsize);
}

q_insert_head

commit 7ab86a7

避免非必要的項目縮排 (即 * 和 - )，以清晰、明確，且流暢的漢語書寫。

原本利用 strdup 的方式來實作，測試時發現會報出 ERROR: Probably not constant time or wrong implementation 的錯誤，於似乎就去確認 strdup 的實作方式，發現可能是因為多次呼叫需要跑遍整條字串長度的函式而造成的，於是就先把它改成自行宣告的作法。

第一次修改時，沒有使用一個變數來紀錄 strlen(n) 的回傳值，而是要用時才作呼叫，結果也會發生 ERROR: Probably not constant time or wrong implementation ，最後利用一個變數來紀錄避免多次呼叫 strlen(n) 來降低整體的時間複雜度。

bool q_insert_head(struct list_head *head, char *s)
{
    if (!head || !s) {
        return false;
    }
    element_t *new_elem = malloc(sizeof(element_t));
+   int s_len = strlen(s);

    if (!new_elem) {
        return false;  // Allocation failed
    }
-   new_elem->value = strdup(s);
+   new_elem->value = (char *) malloc((s_len + 1) * sizeof(char));
    if (!new_elem->value) {
        free(new_elem);
        return false;  // String copy failed
    }

+   strncpy(new_elem->value, s, (s_len + 1));
    list_add(&new_elem->list, head);
    return true;
}

q_delete_mid

採取快慢指標的方式，來找到中間要刪掉的節點，其中迴圈終止時，slow 即為整條鏈結串列的中點。

struct list_head *slow = head->next, *fast = head->next->next;
while (fast != head && fast->next != head) {
    slow = slow->next;
    fast = fast->next->next;
}

commit 488930d

q_delete_dup

commit db89b47

先判斷目前的值與下一個的值是不是重複的，重複的話就將下一個值給刪除，並且用一個旗標來紀錄目前的值有出現重複的情形，最後再將其刪除即可。

bool found = false;
while (current != head && current->next != head) {
    element_t *current_entry = list_entry(current, element_t, list);
    element_t *next_entry = list_entry(current->next, element_t, list);

    if (strcmp(current_entry->value, next_entry->value) == 0) {
        list_del(current->next);
        q_release_element(next_entry);
        found = true;
    } else {
        current = current->next;
        if (found) {
            list_del(current->prev);
            q_release_element(current_entry);
            found = false;
        }
    }
}

q_swap

利用兩個節點來暫存下一個及下下一個節點，按照順序來修改節點指向

struct list_head *current = head->next;
while (current != head && current->next != head) {
    struct list_head *next = current->next;
    struct list_head *nextNext = next->next;

    next->next = current;
    next->prev = current->prev;
    current->prev->next = next;
    current->next = nextNext;
    nextNext->prev = next;
    if (current->next != head) {
        current->next->prev = current;
    }

    current = nextNext;
}

commit 504c744

改用 list_move 的方式來實作

while (current != head && current->next != head) {
-   struct list_head *next = current->next;
-   struct list_head *nextNext = next->next;
-   next->next = current;
-   next->prev = current->prev;
-   current->prev->next = next;
-   current->next = nextNext;
-   nextNext->prev = next;
-   if (current->next != head) {
-       current->next->prev = current;
-   current = nextNext;
+   struct list_head *prev = current->prev;
+   list_move(current->next, prev);
+   current = current->next;
}

commit 47abac9

`q_reverse` 及 `q_reverseK`

commit a30283b

避免濫用詞彙，此「核心」非彼「核心」(還記得課名嗎？)，可改說「二者實作手法相似，唯 ___ 不同」

~~這兩個實作核心概念差不多~~，都是利用到巨集裡的 list_move ，利用一個指標定在第一個值上(最後會變成最後一個值)，把它後面的都移到head/start後面即可達成反轉的效果

struct list_head *start = head, *current = NULL;
while (times--) {
    int count = k;
    for (current = start->next; --count > 0;)
        list_move(current->next, start);
    start = current;
}

q_ascend 及 q_descend

這兩個實作手法也是大同小異，只差在裡面的判斷是 >= 或是 <= ，這邊利用一個變數 result 來儲存佇列沒有被變動幾次，最後直接回傳即可(不需要在呼叫 q_size )

element_t *min_entry = list_entry(head->prev, element_t, list);
int result = 1;

for (struct list_head *current = head->prev->prev; current != head;) {
    element_t *current_entry = list_entry(current, element_t, list);
    struct list_head *prev = current->prev;
    if (strcmp(current_entry->value, min_entry->value) >= 0) {
        list_del_init(current);
        q_release_element(current_entry);
    } else {
        result++;
        min_entry = current_entry;
    }
    current = prev;
}

commit 32571b7

將中心的實作內容拉出來成一個函式，呼叫時再把要遞增或是遞減傳入，減少兩個函式重複的程式碼

int process_list(struct list_head *head, bool ascend)
{
    // 雙方重複的程式碼
    bool condition =
            ascend ? (strcmp(current_entry->value, extreme_entry->value) >= 0)
                   : (strcmp(current_entry->value, extreme_entry->value) <= 0);
}
int q_ascend(struct list_head *head)
{
    return process_list(head, true);
}

int q_descend(struct list_head *head)
{
    return process_list(head, false);
}

commit f7a6c27

q_sort

這邊的實作是利用 merge sort 的方式去進行，一開始先利用快慢指標的方式去找到中點，找到中點後切一刀來分成左半邊及右半邊，在利用 merge 函式來完成合併的動作。

用 list_move 來串接節點，當有一邊串列為空且另一邊不為空時，再把不為空的串列接回 current 後方。

避免非必要的項目縮排 (即 * 和 - )，以清晰、明確，且流暢的漢語書寫。

void merge(struct list_head *head,
           struct list_head *left,
           struct list_head *right,
           bool descend)
{
    struct list_head *current = head;
    while (!list_empty(left) && !list_empty(right)) {
        element_t *left_entry = list_entry(left->next, element_t, list);
        element_t *right_entry = list_entry(right->next, element_t, list);

        if (descend ? strcmp(left_entry->value, right_entry->value) >= 0
                    : strcmp(left_entry->value, right_entry->value) <= 0) {
            list_move(left->next, current);
        } else {
            list_move(right->next, current);
        }
        current = current->next;
    }
    struct list_head *remaining = list_empty(left) ? right : left;
    list_splice_tail(remaining, current->next);
}

/* Sort elements of queue in ascending/descending order */
void q_sort(struct list_head *head, bool descend)
{
    if (!head || list_empty(head) || head->next->next == head) {
        return;
    }

    struct list_head *slow = head->next, *fast = head->next->next;
    while (fast != head && fast->next != head) {
        slow = slow->next;
        fast = fast->next->next;
    }

    LIST_HEAD(left);   // left half
    LIST_HEAD(right);  // right half
    list_cut_position(&left, head, slow);
    list_splice_init(head, &right);

    q_sort(&left, descend);
    q_sort(&right, descend);

    merge(head, &left, &right, descend);
}

commit 23062f2

q_merge

不理解就說不理解，不要說「不太理解」，理工人說話要精準。
你若有疑慮，就嘗試提交修改，從而確認對整個專案的衝擊。

不太了解 descend 參數的用意，因為已經拿到 k 個 sorted 的佇列了，合併時還有需要管是不是 descend 嗎？還是說有可能給的 sorted 佇列跟 descend 兩邊是不一樣的排序方式
先將所有的內佇列串起來，再執行 q_sort 來進行排序。

避免非必要的項目縮排 (即 * 和 - )，以清晰、明確，且流暢的漢語書寫。

queue_contex_t *target = list_entry(head->next, queue_contex_t, chain);
queue_contex_t *current = NULL;
list_for_each_entry (current, head, chain) {
    if (current == target)
        continue;
    list_splice_init(current->q, target->q);
    target->size = target->size + current->size;
    current->size = 0;
}
q_sort(target->q, descend);

commit 6d8e3db

實作疑問

詳見作業說明。

了解!

目前分數還是卡在 95/100 ，因 q_insert_head 及 q_insert_tail ，有機會變成 ERROR: Probably not constant time

新增 percentile

在原先的 lab0-c 中，percentile 被拔掉了，於是先研讀論文看 percentile 會對測量造成什影響，發現到

percentile 是來避免不正確的外部資料來拖長整體的時間，造成時間分析上會不準確
利用給定的閥值去篩選哪些資料要留下來，哪些資料要捨去
圖為
$1 - (\frac{1}{2})^{\frac{10 * (i + 1)}{N_{P E R C E N T I L E S}}}$ ，也就是選取的值經過排序後，範圍要在裡面才會被採用
參照 dudect ，以及觀摩 willwillhi 的作法

static int cmp(const int64_t *a, const int64_t *b)
{
    return (int) (*a - *b);
}

static int64_t percentile(int64_t *a, double which, size_t size)
{
    size_t array_position = (size_t) ((double) size * (double) which);
    assert(array_position >= 0);
    assert(array_position < size);
    return a[array_position];
}

void prepare_percentiles(int64_t *exec_times, int64_t *percentiles)
{
    qsort(exec_times, N_MEASURES, sizeof(int64_t),
          (int (*)(const void *, const void *)) cmp);
    for (size_t i = 0; i < N_PERCENTILES; i++) {
        percentiles[i] = percentile(
            exec_times, 1 - (pow(0.5, 10 * (double) (i + 1) / N_PERCENTILES)),
            N_MEASURES);
    }
}

commit 4f98446

分數紀錄

利用 API 重構

commit 6d3cad3

無論標題和內文中，中文和英文字元之間要有空白字元 (對排版和文字搜尋有利)

研讀論文 <Dude, is my code constant time?>

作者開發了一套工具 dudect ，使其可以檢驗程式是否是常數時間複雜度。

方法

步驟一: 在兩個不同類別的資料上，個別測量其執行時間
- 類別定義: 利用 fix-vs-random 來做測試。其中，第一種類別是常數，第二種則為隨機選取
- 迴圈計數器: 用來精準地測量執行時間
- 環境變數: 去最小化對於環境改變，造成的結果的影響
步驟二: 後處理
- 裁減: 由於系統或其他外部活動可能會造成測量的偏差，在這種情況下，需要丟棄那些與類別無關或是超過的數據
- 典型時間分布是正偏態(positively skew)，示意圖如下
步驟三: 統計測試
- t-test: 利用 Welch's t-test. 來檢驗平均值的等價性
- 非參數測試: 利用 Kolmogorov-Smirnov 等方式來測試

實作

利用
$p$ 當作閥值，丟棄掉那些大於百分位數
$p$ 的數值
使用 Welch's t-test 與 Welford's 的變異數去改善數值的穩定性

Student's t-distribution

定義

是一種機率分佈的類型，與常態分佈相似，常應用在估計樣本少或是不知道變異數的情況下

與常態分佈比較

兩者都屬於對稱分佈
相較於常態分佈，前端以及尾端有較多分佈機率
常態分佈會被標準差以及變異數兩個所影響，T-分佈則是被 degree of freedom(df)所控制

為何此篇論文使用t-distribution?

在測試的資料集中，樣本數量少，故不適用需要龐大樣本數量的常態分佈，T-分佈會較合適

指出論文和程式碼實作出入之處。

引入 lib/list_sort.c

流程

新增 list_sort.c 及 list_sort.h
修改 #include ，以及 list_sort.h 的內容

- #include <linux/kernel.h>
- #include <linux/bug.h>
- #include <linux/compiler.h>
- #include <linux/export.h>
- #include <linux/string.h>
- #include <linux/list.h>
+ #include "queue.h"
+ #include "list_sort.h"

把 list_sort.o 加入至 MakeFile 的 OBJS 中
在 qtest.c 中，新增 do_sort_list 的函式以及指令
打 make -> help 確認有沒有出現，即是否正確執行

cmd> help
Commands:
  #           ...          | Display comment
  ascend                   | Remove every node which has a node with a strictly less value anywhere to the right side of it
  dedup                    | Delete all nodes that have duplicate string
  descend                  | Remove every node which has a node with a strictly greater value anywhere to the right side of it
  dm                       | Delete middle node in queue
  free                     | Delete queue
  help                     | Show summary
  ih          str [n]      | Insert string str at head of queue n times. Generate random string(s) if str equals RAND. (default: n == 1)
  it          str [n]      | Insert string str at tail of queue n times. Generate random string(s) if str equals RAND. (default: n == 1)
  list_sort                | Sort queue in ascending/descening order by linux list sort
  log         file         | Copy output to file
  merge                    | Merge all the queues into one sorted queue
  new                      | Create new queue
  next                     | Switch to next queue
  option      [name val]   | Display or set options. See 'Options' section for details
  prev                     | Switch to previous queue
  quit                     | Exit program
  reverse                  | Reverse queue
  reverseK    [K]          | Reverse the nodes of the queue 'K' at a time
  rh          [str]        | Remove from head of queue. Optionally compare to expected value str
  rt          [str]        | Remove from tail of queue. Optionally compare to expected value str
  show                     | Show queue contents
  size        [n]          | Compute queue size n times (default: n == 1)
  sort                     | Sort queue in ascending/descening order
  source                   | Read commands from source file
  swap                     | Swap every two adjacent nodes in queue
  time        cmd arg ...  | Time command execution
  web         [port]       | Read commands from builtin web server
Options:
  descend     0            | Sort and merge queue in ascending/descending order
  echo        1            | Do/don't echo commands
  entropy     0            | Show/Hide Shannon entropy
  error       5            | Number of errors until exit
  fail        30           | Number of times allow queue operations to return false
  length      1024         | Maximum length of displayed string
  malloc      0            | Malloc failure probability percent
  simulation  0            | Start/Stop simulation mode
  verbose     4            | Verbosity level

commit d16586c

比較

為防止只使用一次比較可能會有不公平的情形發生，我採用每個大小個測3次來取其平均。

# script for compare q_sort and list_sort performance.
new
it RAND 100000
time
sort # list_sort
time
free

q_sort

使用 Metric prefix，而非漢語的「萬」。

資料數	round 1	round 2	round 3	平均
$10^{6}$	0.067	0.067	0.068	0.0673
$2 * 10^{6}$	0.151	0.143	0.149	0.1476
$3 * 10^{6}$	0.239	0.233	0.240	0.2373
$4 * 10^{6}$	0.331	0.334	0.334	0.333
$5 * 10^{6}$	0.425	0.432	0.428	0.4283

list_sort

資料數	round 1	round 2	round 3	平均
$10^{6}$	0.059	0.061	0.063	0.061
$2 * 10^{6}$	0.136	0.127	0.130	0.131
$3 * 10^{6}$	0.219	0.207	0.223	0.2163
$4 * 10^{6}$	0.297	0.296	0.311	0.3013
$5 * 10^{6}$	0.385	0.382	0.384	0.3836

可以發現資料數量 100000 ~ 500000 時間差從 0.0063 -> 0.0447

提供更多解讀，闡述你的啟發。

在 qtest 提供新的命令 `shuffle`

此函式用於打亂串列的順序
利用 Fisher-Yates shuffle 演算法來實作

struct list_head *last = head->prev;
int size = q_size(head);
while (size-- > 1) {
    int index = rand() % size;
    struct list_head *current = head->next, *temp;
    while (index--) {
        current = current->next;
    }
    temp = last->prev;
    list_move(last, current->prev);
    if (temp != current)
        list_move(current, temp);
    last = current->prev;
}

流程如下：

找到整條串列長度，並設為 size
隨機挑選範圍 0 ~ size - 1 的節點與 last 作交換
size 數減 1，並將 last 更新為 current 的前一位
重複 2. ~ 3. ，直到 size == 1 時，即完成亂序

圖示(為了圖片簡潔，這邊不畫最後一個節點連到 head 的線)：

Round 1:

Random Index 0 ~ 2, ex: Index = 0

Round 2:

Random Index 0 ~ 1, ex: Index = 0

current == temp

size == 1, 函式結束

測試隨機度

採用 python script 來對此隨機函式作測試，總共執行 1000000 次
統計如下：

順序	觀察到的頻率	期待的頻率	$\frac{(O_{i} - E_{i})^{2}}{E_{i}}$
123	166653	166666	0.00101
132	166775	166666	0.07128
213	167066	166666	0.96000
231	166651	166666	0.00135
312	166440	166666	0.30645
321	166415	166666	0.37800
Total			0.71812

$X^{2}$ = 1.7181188724754899

亂序 1234 的如圖所示

2024q1 Homework1 (lab0)

Reveiwed by yy214123

開發環境

針對佇列的實作

q_insert_head

q_delete_mid

q_delete_dup

q_swap

q_reverse 及 q_reverseK

q_ascend 及 q_descend

q_sort

q_merge

實作疑問

新增 percentile

研讀論文 <Dude, is my code constant time?>

方法

實作

Student's t-distribution

定義

與常態分佈比較

為何此篇論文使用t-distribution?

引入 lib/list_sort.c

流程

比較

q_sort

list_sort

在 qtest 提供新的命令 shuffle

測試隨機度

Read more

2024q1 Homework6 (integration)

2024q1 Homework5 (assessment)

2024q1 Homework4 (quiz3+4)

2024q1 Homework2 (quiz1+2)

Reveiwed by `yy214123`

`q_reverse` 及 `q_reverseK`

在 qtest 提供新的命令 `shuffle`