2022q1 Homework1 (lab0)

contributed by < nckusaniel >

實驗環境

$ gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

$ lscpu
架構：                           x86_64
CPU 作業模式：                   32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          8
On-line CPU(s) list:             0-7
每核心執行緒數：                 2
每通訊端核心數：                 4
Socket(s):                       1
NUMA 節點：                      1
供應商識別號：                   GenuineIntel
CPU 家族：                       6
型號：                           126
Model name:                      Intel(R) Core(TM) i5-1035G1 CPU @ 1.00GHz
製程：                           5
CPU MHz：                        1200.000
CPU max MHz:                     3600.0000
CPU min MHz:                     400.0000
BogoMIPS:                        2380.80
虛擬：                           VT-x
L1d 快取：                       192 KiB
L1i 快取：                       128 KiB
L2 快取：                        2 MiB
L3 快取：                        6 MiB
NUMA node0 CPU(s)：              0-7

作業要求

q_new : 建立新的「空」佇列
q_free : 釋放佇列所佔用的記憶體
q_insert_head : 在佇列開頭 (head) 插入 (insert) 給定的新節點 (以 LIFO 準則)
q_insert_tail : 在佇列尾端 (tail) 插入 (insert) 給定的新節點 (以 FIFO 準則)
q_remove_head : 在佇列開頭 (head) 移去 (remove) 給定的節點
q_release_element : 釋放特定節點的記憶體空間
q_size : 計算目前佇列的節點總量
q_delete_mid : 移走佇列中間的節點，詳見 LeetCode 2095. Delete the Middle Node of a Linked List
q_delete_dup : 在已經排序的狀況，移走佇列中具備重複內容的節點，詳見 LeetCode 82. Remove Duplicates from Sorted List II
q_swap : 交換佇列中鄰近的節點，詳見 LeetCode 24. Swap Nodes in Pairs
q_reverse : 以反向順序重新排列鏈結串列，該函式不該配置或釋放任何鏈結串列節點，換言之，它只能重排已經存在的節點
q_sort : 以遞增順序來排序鏈結串列的節點

開發過程

實做queue.c

關鍵結構
1.




typedef struct {
    char *value;
    struct list_head list;
} element_t;




struct list_head {
    struct list_head *prev;
    struct list_head *next;
};

q_new

1.給node配置一個空間
2.檢查空間是否足夠
3.利用INIT_LIST_HEAD將node初始化









struct list_head *q_new()
{
    struct list_head *node = malloc(sizeof(struct list_head));
    if (!node) {
        return NULL;
    }
    INIT_LIST_HEAD(node);
    return node;
}

q_free

1.宣告兩個element_t指標（enrty在前，safe在後）
2.判斷指標l是否為空的
3.利用list_for_each_entry_safe，探索l串列之element_t結構並利用q_release_element 釋放entry
4.釋放l









void q_free(struct list_head *l)
{
    element_t *entry, *safe;
    if (!l)
        return;
    list_for_each_entry_safe (entry, safe, l, list)
        q_release_element(entry);
    free(l);
}

q_insert_head

1.先判斷head是否為空
2.配置空間給q作插入，並判斷q否存在
3.配置空間給value，並判斷value是否存在並將s複製過去
4.利用list_add作插入



















bool q_insert_head(struct list_head *head, char *s)
{
    //檢查head
    if (head == NULL)
        return false;
    element_t *n = malloc(sizeof(element_t));
    if (!n) {
        return false;
    }

    n->value = malloc(strlen(s) + 1);
    if (!n->value) {
        free(n); 
        return false;
    }
    strncpy(n->value, s, strlen(s) + 1);
    list_add(&n->list, head);
    return true;
}

一開始不知道要配置空間給value，直接使用q->value=s，導致錯誤的發生，查詢後知道了要先malloc空間給value，再使用strcpy執行陣列的複製

commit時發現因為使用了會導致buffer overflow的函式strcpy，在研讀[https://security.web.cern.ch/recommendations/en/codetools/c.shtml] 後改用strlcpy，而後又發現c99目前無法支援strlcpy，因此改用strncpy
而後發現trace 03會不通過，因此回頭細看q_insert_head是否有錯誤並參考其他同學的code後發現了幾項問題
1.q->value 必須要配置空間，並判斷是否存在，不存在的話執行strncpy會有溢位產生

最重要的！！

我希望算出s的大小而後配置給n->value因此寫


n->value = malloc(**sizeof**(strlen(s) + 1));
//發現不管怎麼測試都會錯誤

發現
1.sizeof(s),算出的s是char*型態的位元組，而非s中有幾個元素
2.sizeof(strlen(s)+1)，其中sizeof不是函數，因此不能在裡面放入strlen
3.strlen會回傳size_t而sizeof(size_t)會算出size_t的位元組量，與所求s的大小不相同而導致錯誤？
改寫


n->value = malloc(strlen(s) + 1);

刪除sizeof，直接malloc s的字串大小

Todo

使用strdup 來簡化程式碼
去確認原先程式碼錯誤，是否如上述發現中鎖描述

q_insert_tail

原理同 q_insert_head，但使用list_add_tail

















bool q_insert_tail(struct list_head *head, char *s)
{
    if (head == NULL)
        return false;
    element_t *n = malloc(sizeof(element_t));
    if (!n) {
        return false;
    }
    n->value = malloc(strlen(s) + 1);
    if (!n->value) {
        free(n);
        return false;
    }
    strncpy(n->value, s, strlen(s) + 1);
    list_add_tail(&n->list, head);
    return true;
}

q_remove_head

1.判斷queue中是否有元素
2.宣告target指向第一個元素
3.移除target
4.將target的資料給sp



















element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
    //判斷queue是否有元素
    if (!head || list_empty(head))
        return NULL;

    // target指向第一個點
    element_t *target = list_entry(head->next, element_t, list);

    //移除taget
    list_del_init(head->next);

    // target的value 非空且被移除（初始化）就將value的資料給sp
    if (sp != NULL) {
        strncpy(sp, target->value, bufsize - 1);
        // sp[bufsize]="\0";
    }
    return target;
}

發現錯誤，改進
1.判斷queue中是否有元素用list_empty來改進
2.使用strnspy

q_remove_tail

1.道理同remove_head


















element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize)
{
    //判斷queue是否有元素
    if (!head || list_empty(head))
        return NULL;

    // target指向最後一個點（head->prev）
    element_t *target = list_entry(head->prev, element_t, list);

    //移除taget
    list_del_init(head->prev);

    // target的value 非空且被移除（初始化）就將value的資料給sp
    if (sp != NULL) {
        strncpy(sp, target->value, bufsize - 1);
    }
    return target;
}

q_release_element

1事先預設好了





void q_release_element(element_t *e)
{
    free(e->value);
    free(e);
}

q_size

預設












int q_size(struct list_head *head)
{
    if (!head)
        return 0;

    int len = 0;
    struct list_head *li;

    list_for_each (li, head)
        len++;
    return len;
}

q_delete_mid

1.先判斷是否為空
2.使用慢指標去追蹤，快指標到終點時，慢指標會指到中間
3.把slow 刪除並釋放

















bool q_delete_mid(struct list_head *head)
{
    // https:leetcode.com/problems/delete-the-middle-node-of-a-linked-list/
    if (!head || list_empty(head))
        return false;
    //宣告快慢指標
    struct list_head *fast, *slow;
    for (fast = slow = head->next; fast->next != head && fast != head;
         slow = slow->next, fast = fast->next->next) {
    }

    //找到中間點（慢指標），移除後釋放
    element_t *mid = list_entry(slow, element_t, list);
    list_del_init(slow);
    q_release_element(mid);
    return true;
}

q_delete_dup

思路
1.先判斷head是否為空
2.使用變數count，目的是希望當目前節點跟下一個節點不同時，去判斷目前節點是否要被刪除
3.宣告兩個指標first,second指向list_head，利用list_for_each_safe探索queue
4.將兩個指標分別利用list_entry取得element_t的address，目的是取用value
5.第一個if中，利用strcmp來判斷first 跟second之中的字串是否相等，若相等就刪除並釋放，且將count+1
6.若字串不相等，必須要利用count來判斷，目前節點的字串之前是否曾經釋放過，是的話目前節點也必須釋放

疑問：原本執行時一直跑出Segmentation fault occurred. You dereferenced a NULL or invalid pointer。之後觀察其他同學的code之後發現在


if ( 0 == strcmp(entry->value, safe->value))

加入second != head 就可以解決此問題。猜想是因為當second＝head時，執行element_t *safe = list_entry(second, element_t, list); 且因為head沒有被嵌入至任何struct之中，所以在取用其safe的value時會發生記憶體上的錯誤。
































bool q_delete_dup(struct list_head *head)
{
    // https://leetcode.com/problems/remove-duplicates-from-sorted-list-ii/
    if (!head || list_empty(head)) {
        return false;
    }
    //計算這個node殺了幾次
    int count = 0;
    // 宣告兩個指向現在跟下一個list
    struct list_head *first, *second;
    //探訪所有node
    list_for_each_safe (first, second, head) {
        // list轉換成struct
        element_t *entry = list_entry(first, element_t, list);
        element_t *safe = list_entry(second, element_t, list);
        //如果now跟next的字串內容相同，殺了first，並釋放，注意seocnd不能等於head沒value會錯誤
        if (second != head && 0 == strcmp(entry->value, safe->value)) {
            list_del(first);
            q_release_element(entry);
            //砍了一次
            count++;
            //字串跟下一個點不相同，判斷之前有沒有砍過
        } else {
            if (count > 0) {
                list_del(first);
                q_release_element(entry);
                count = 0;
            }
        }
    }
    return true;
}

q_swap
1.檢查head是否存在
2.宣告兩個指標now以及next1，分別指向目前節點，及下一點
3.for迴圈實做，list_del_init(now)使next1跑到now前面，再利用list_add(now,next1)將now加入next1後面，使它們位子交換












void q_swap(struct list_head *head)
{
    // https://leetcode.com/problems/swap-nodes-in-pairs/
    if (!head || list_empty(head))
        return;
    struct list_head *now, *next1;
    for (now = head->next, next1 = now->next; now->next != head && now != head;
         now = now->next, next1 = now->next) {
        list_del_init(now);
        list_add(now, next1);
    }
}

這題卡了很久，有以下幾點要注意
1.參考kdnvt的寫法才知道只要先刪除，在插入就可以使兩個點的位子交換了
2.如果queue中有奇數個資料，則最後一項不需要交換
3.在利用list_del_init(now)之後next1會跑到前面

q_reverse

1.用類似交換排序的方式來實做，但是移出比較大小
單純就是把資料放在最右邊



















void q_reverse(struct list_head *head)
{
    struct list_head *node, *first, *second;
    // n紀錄總共多少點
    int n = 0;
    int j;
    list_for_each (node, head)
        n++;
    //第一層for迴圈代表已經放了幾個點到最後面執行n-1次即可
    for (int i = 1; i < n; i++) {
        //第二層執行左右交換，並且每次可以根據i來減少交換次數
        for (first = head->next, second = first->next, j = n; i < j;
             j--, second = first->next) {
            //刪除first加入其後面
            list_del_init(first);
            list_add(first, second);
        }
    }
}

效率太差作更正

思路
1.探索每一個點，將其一個一個插入到最前面











void q_reverse(struct list_head *head)
{
    //要考慮head為null的情況
    if (!head || list_empty(head)) {
        return;
    }
    struct list_head *node, *safe;
    list_for_each_safe (node, safe, head) {
        list_move(node,head);
    }
}

原本用打算依序將節點移動到最後面用list_move_tail實做，但是終止條件有點難寫

嘗試用list_move_tail實做

q_sort
1.交換排序實做
























void q_sort(struct list_head *head)
{
    struct list_head *node, *first, *second;
    // n紀錄總共多少點
    int n = 0;
    int j;
    list_for_each (node, head)
        n++;
    //第一層for迴圈代表已經放了幾個點到最後面執行n-1次即可
    for (int i = 1; i < n; i++) {
        //第二層執行左右交換，並且每次可以根據i來減少交換次數
        for (first = head->next, second = first->next, j = n; i < j;
             j--, second = first->next) {
            //利用element取value
            element_t *current = list_entry(first, element_t, list);
            element_t *current_next = list_entry(second, element_t, list);
            // value大的放後面
            if (strlen(current->value) > strlen(current_next->value)) {
                list_del_init(first);
                list_add(first, second);
            }
        }
    }
}

qsort
1.參考你所不知道的 C 語言: linked list 和非連續記憶體之中的quicksort實做





























void q_sort(struct list_head *head)
{
    struct list_head list_less, list_greater;
    element_t *pivot;
    element_t *item = NULL, *is = NULL;

    if (!head || list_empty(head) || list_is_singular(head))
        return;

    INIT_LIST_HEAD(&list_less);
    INIT_LIST_HEAD(&list_greater);

    pivot = list_first_entry(head, element_t, list);
    list_del(&pivot->list);

    list_for_each_entry_safe (item, is, head, list) {
        if (strcmp(item->value, pivot->value) < 0)
            list_move_tail(&item->list, &list_less);
        else
            list_move(&item->list, &list_greater);
    }

    q_sort(&list_less);
    q_sort(&list_greater);

    list_add(&pivot->list, head);
    list_splice(&list_less, head);
    list_splice_tail(&list_greater, head);
}

執行後發現trace-14以及trace-15過不了，因為執行時間太久，必須要用複雜度更低的方法，因此在研讀你所不知道的 C 語言: linked list 和非連續記憶體決定使用merge sort來實做

利用merge sort實做

參考Risheng1128 來糾正我原本的錯誤

思路

分成三個funtion qsort，merge_cut，merge_twolist。
2.利用qsort呼叫merge_cut來做list的分解，再利用merge_twolist來合併

qsort

1.注意邊界條件


    if (!head || list_empty(head) || list_is_singular(head))
        return;

2.因為circular doubly linked list，有pre以及next兩個指標，並且head以及tail相接，在執行上會比較複雜，因此將circular doubly linked list先改成singular linked list。
並且在執行完merge_cut之後再進行link的調整


head->prev->next = NULL;
head->next = merge_cut(head->next);

起初我的想法是直接head= merge_cut(head);
而後發現把head也丟進merge_cut以及merge_twolist之中，會有更多的邊界條件要處理，因為list會一直被切割，並且只有某些list有head。因此使用head->next = merge_cut(head->next); 來解決head的傳入問題

最後會發現link list 所有的prev都沒有指向正確的地方，並且首尾沒有相接所以執行以下









struct list_head *now = head, *next = head->next;
while (next) {
        next->prev = now;
        now = next;
        next = next->next;
    }
    now->next = head;
    head->prev = now;
}

merge_cut

先討論邊界條件



    if (!node1 || !node1->next) {
        return node1;
    }

這邊是傳入原本List中的head->next，並且tail->next已經被改成NULL，因此不能使用list_empty(node1)，原先使用list_empty(node1)導致一直出錯，之後利用GDB檢查之後才發線問題所在

利用快慢指標來將list進行分割，並遞迴呼叫merge_twolist

1.mid是後面串列的開頭，所以有fast->next; fast = fast->next->next兩個結束循環的可能性
2.將slow的next設為NULL，來確保遞迴呼叫時所有tail的next都是NULL，來避免邊界條件








    struct list_head *slow = node1, *fast = slow->next;
    for (; fast && fast->next; fast = fast->next->next)
        slow = slow->next;
    struct list_head *mid = slow->next;
    slow->next = NULL;
    struct list_head *left = merge_cut(node1);
    struct list_head *right = merge_cut(mid);
    return merge_twolist(left, right);

merge_twolist

1.利用[指標的指標]，來實作兩個list的合併
2.因為ptr = &head，所以最初ptr就是代表head
3.而if之中的*ptr則是指node中的next指標
4.if之中只有改變next指標，並沒有改變pre指標，所以qsort 之中才要去將所有prev指標做調整














    struct list_head *head = NULL;
    struct list_head **ptr = &head;
    while (list1 && list2) {
        //比大小<0代表l1小
        if (strcmp(list_entry(list1, element_t, list)->value,
                   list_entry(list2, element_t, list)->value) < 0) {
            *ptr = list1;
            list1 = list1->next;
        } else {
            *ptr = list2;
            list2 = list2->next;
        }
        ptr = &(*ptr)->next;
    }

最後當l1或l2為NULL時(也就是ptr所停在的地方)，我們要探討到底是將l1連接到l2抑或是l2連接到l1。

將list1及list2做OR運算，並且轉換型態存給ptr


    *ptr = (struct list_head *) ((uintptr_t) list1 | (uintptr_t) list2);

完整程式碼

























































struct list_head *merge_twolist(struct list_head *list1,
                                struct list_head *list2)
{
    struct list_head *head = NULL;
    struct list_head **ptr = &head;
    while (list1 && list2) {
        //比大小<0代表l1小
        if (strcmp(list_entry(list1, element_t, list)->value,
                   list_entry(list2, element_t, list)->value) < 0) {
            *ptr = list1;
            list1 = list1->next;
        } else {
            *ptr = list2;
            list2 = list2->next;
        }
        ptr = &(*ptr)->next;
    }
    // l1和l2作or運算並型態轉換存回ptr
    *ptr = (struct list_head *) ((uintptr_t) list1 | (uintptr_t) list2);
    return head;
}
//將資料切割（divide
struct list_head *merge_cut(struct list_head *node1)
{
    //邊界條件
    if (!node1 || !node1->next) {
        return node1;
    }
    struct list_head *slow = node1, *fast = slow->next;
    for (; fast && fast->next; fast = fast->next->next)
        slow = slow->next;
    // mid是新串列開頭
    struct list_head *mid = slow->next;
    slow->next = NULL;
    struct list_head *left = merge_cut(node1);
    struct list_head *right = merge_cut(mid);
    return merge_twolist(left, right);
}


void q_sort(struct list_head *head)
{
    //邊界條件
    if (!head || list_empty(head) || list_is_singular(head))
        return;
    //首先要將傳進來的head改變成單向的linklist
    head->prev->next = NULL;
    head->next = merge_cut(head->next);
    struct list_head *now = head, *next = head->next;
    while (next) {
        next->prev = now;
        now = next;
        next = next->next;
    }
    now->next = head;
    head->prev = now;
}

代辦事項

研讀 Linux 核心原始程式碼的 lib/list_sort.c 並嘗試引入到 lab0-c 專案，比較你自己實作的 merge sort 和 Linux 核心程式碼之間效能落差
在 qtest 提供新的命令 shuffle，允許藉由 Fisher–Yates shuffle 演算法，對佇列中所有節點進行洗牌 (shuffle) 操作
在 qtest 提供新的命令 web，提供 web 伺服器功能，注意: web 伺服器運作過程中，qtest 仍可接受其他命令，可依據前述說明，嘗試整合 tiny-web-server
運用 Valgrind 排除 qtest 實作的記憶體錯誤，並透過 Massif 視覺化 “simulation” 過程中的記憶體使用量，需要設計對應的實驗
解釋 select 系統呼叫在本程式的使用方式，並分析 console.c 的實作，說明其中運用 CS:APP RIO 套件的原理和考量點。可對照參考 CS:APP 第 10 章重點提示
研讀論文〈Dude, is my code constant time?〉，解釋本程式的 “simulation” 模式是如何透過以實驗而非理論分析，達到驗證時間複雜度，需要解釋 Student’s t-distribution 及程式實作的原理。注意：現有實作存在若干致命缺陷，請討論並提出解決方案;
指出現有程式的缺陷 (提示: 和 RIO 套件有關) 或者可改進之處 (例如更全面地使用 Linux 核心風格的鏈結串列 API)，嘗試實作並提交 pull request