2020q1 Homework1 (lab0)

contributed by < mtbehisseste >

tags: `linux2020`

作業說明 : lab0-c

Environment

Docker for Mac

$ uname -a
Linux ubuntu18-docker 4.9.125-linuxkit #1 SMP Fri Sep 7 08:20:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ gcc --version
gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0

Prework study

Git Hooks

git hook 是一些在特定 git 操作時會執行的腳本，對於開發中的一些規範或風格都可以透過 git hook 來達到自動化檢查。當執行 git init 時，在 .git/hooks 底下會生成一些範例腳本（ *.sample ）。若要使用客製化腳本的話，則將檔案放在 .git/hooks 下即可（檔案須為 executable）。本專案則是用 link 的方式將檔案連到統一放置的 scripts/ 目錄下的執行檔。

觀察一下腳本內容： commit-msg.hook 檢查 commit message 是否符合裡面的 9 個規則，包括使用祈使語氣、句尾不加句號等等。 pre-commit.hook 使用 cppcheck 、 aspell 、 clang-format 等工具做檢查，也過濾 unsafe function 的使用。另外

FILES=`git diff --cached --name-only --diff-filter=ACMR | grep -E "\.(c|cpp|h)$"`

的用法可以只檢查現階段還沒被 commit 的那些 c/cpp/h 檔，有 cache 的效果。

pre-push.hook 會先確認是不是從 sysprog21/lab0-c 某次 commit 之後才 fork 的。而如果要 push 到 master 的話會先做一次 make 確保遠端的程式碼是可以成功編譯的。

scripts/driver.py

自動化將 trace 輸入到 qtest 的腳本，裡面使用 subprocess.call() 來執行 qtest 的指令。因為平時我多用 os.system() 所以我順手搜尋了兩者的比較：

os.system() 會產生一個 subshell 來執行指令，底層實作是呼叫 C 的 system() ，所有產生的結果會在 stdout 輸出
而 subprocess.call() （Python 3.5 後可用 subprocess.run() 代替）除了在給定參數 shell=True 之外，不會產生新的 shell 。另外也有 capture_output 等參數可以選擇，功能也比 os.system() 多

Reference: https://docs.python.org/3/library/subprocess.html#subprocess.run

另外這邊使用 getopt() 來解析命令列參數，與我平時使用的 argparse() 不同，使用方法與 C 語言中相同：

getopt(args, shortopts, longopts = []);

args 是待解析的參數 list，通常傳入 sys.argv[1:]
shortopts 是一串待辨認的字母選項，若字母後有冒號表示該選項後面需接上參數。如 hp:t:， h 後不需接東西， p 及 t 則要
longopts 為支援的選項名字，以 list 表示。如 ['valgrind'] 代表可以用 ./driver.py --valgrind 。若 longopts 須帶參數，則在字串後加上等號

最後回傳一個 (option, value) pair 的 list 以及剩餘的 args。

clang-format

過去在寫程式的時候一直希望自己能夠保持一致的程式風格，但卻都沒有參閱 Google C++ Style Guide 這類大型專案所採取的風格規範，導致自己可能會憑"感覺"去寫，像下面這樣的情況就會發生:

for (i = 0; i < 10; i++) // 正確
for (i=0; i<10; i++)     // 自己覺得太長就把空格省略了

使用 clang-format 讓一些沒注意到的地方可以自動改成符合風格的，且在 sysprog21/lab0-c 最近的 commit 提到，可以將 clang-format 與 vim 整合，對於往後程式開發更加便利。

Valgrind

Address Sanitizer

Implementation

queue 結構包含兩指標 head、 tail 及一整數 size， tail 為方便實作 q_insert_tail 及 q_reverse 等； size 為使 q_size 複雜度為

O (1)

：

typedef struct {
    list_ele_t *head; /* Head of linked list of elements */
    list_ele_t *tail; /* Tail of linked list of elements */
    int size;
} queue_t;

q_new

初始化 queue，首先宣告一 queue_t 結構的指標 q 指向一塊記憶體區塊，其大小為 queue_t 結構的大小；並分別初始化 queue_t 中每個成員。

queue_t *q_new()
{
    queue_t *q = malloc(sizeof(queue_t));
    if (!q)
        return NULL;

    q->head = NULL;
    q->tail = NULL;
    q->size = 0;
    return q;
}

q_free

將傳入的指標 q 所指向的記憶體釋放，不斷呼叫 q_remove_head 直到其回傳 false（當 q 為空或 NULL），並 free(q) 。
在 ISO/IEC 9899 7.20.3.2 中提到：

The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. If ptr is a null pointer, no action occurs.

因此在 free(q) 前我們不必費心考慮 q 為空或 NULL。

void q_free(queue_t *q)
{
    /* Free head element until there's no element left,
     * since q_remove_head() return false when q is NULL or empty */
    while (q_remove_head(q, NULL, 0))
        ;

    free(q);
}

q_insert_head

產生一新 element newh，將給定的字串複製到 newh 中的 value 成員並將 newh 接在 q 的頭。若 q 為 NULL 或無法配置記憶體則回傳 false，成功回傳 true 。

首先要 malloc 一塊 list_ele_t 結構大小的記憶體並由 newh 指向，接著 malloc 一塊大小為 (strlen(s) + 1) * sizeof(char) 的記憶體，由於 char 為 1 byte，可以直接用 s 的長度來知道所需記憶體大小。而 strlen 只回傳字串本身，並不包含字串後的 terminator，因此要多要 1 byte。這邊要注意的是：若是 malloc 失敗，則必須將剛剛 malloc 的 newh 的記憶體也釋放掉，並且將 newh 指向 NULL ，這是我一開始疏忽掉的。

字串複製我使用 strncpy，相較於 strcpy，指定複製長度更安全。長度則是 s 長度加上 terminator。另外我也有看到其他同學使用 memcpy 不過我想既然是字串的處理使用專門處理字串的函式可以讓未來更方便。

最後將 newh 擺到 q 的頭並增加 q->size 即可，要注意的是若 q 目前為空，newh 將成為 q 中唯一元素，則 q->tail 也必須指向 newh。

bool q_insert_head(queue_t *q, char *s)
{
    if (!q)
        return false;

    list_ele_t *newh; /* Add a new head */
    newh = malloc(sizeof(list_ele_t));
    if (!newh)
        return false;

    newh->value = malloc((strlen(s) + 1) * sizeof(char));
    if (!newh->value) {
        free(newh); /* Free the allocated space */
        newh = NULL;
        return false;
    }
    strncpy(newh->value, s, strlen(s) + 1);

    newh->next = q->head;
    q->head = newh;
    if (!q_size(q)) /* The queue is empty */
        q->tail = newh;

    q->size++;
    return true;
}

q_insert_tail

作法與 q_insert_head 雷同，當 q 為空亦須將 q->head 指向 newt ，唯最後 newt->next 指向 NULL 並成為新的 q->tail。

bool q_insert_tail(queue_t *q, char *s)
{
    if (!q)
        return false;

    list_ele_t *newt; /* Add a new tail */
    newt = malloc(sizeof(list_ele_t));
    if (!newt)
        return false;

    newt->value = malloc((strlen(s) + 1) * sizeof(char));
    if (!newt->value) {
        free(newt); /* Free the allocated space */
        newt = NULL;
        return false;
    }
    strncpy(newt->value, s, strlen(s) + 1);

    newt->next = NULL;
    if (!q_size(q))     /* The queue is empty */
        q->head = newt; /* This handles inserting to tail when queue is empty */
    else
        q->tail->next = newt;
    q->tail = newt;

    q->size++;
    return true;
}

q_remove_head

當 q 為空或 NULL 則回傳 false，否則新增一指摽 newh 指向新的 q->head （即 q->head->next ）。當 sp 為非 NULL，代表對應到 qtest 中的 rh [str] 命令，須將 head->value 複製到 sp 中。使用 strncpy 複製長度 bufsize - 1 的字串到 sp，最後再加上字串 terminator。

最後 free 掉 q->head->value 及 q->head 指向的元素，將兩者指向 NULL ，並將 q->head 指向 newh。

bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
    if (!q || !q_size(q))
        return false;

    list_ele_t *newh;
    newh = q->head->next; /* Points to new head element */
    if (sp) {
        strncpy(sp, q->head->value, bufsize - 1);
        sp[bufsize - 1] = '\0';
    }
    free(q->head->value);
    q->head->value = NULL;
    free(q->head);
    q->head = NULL;
    q->head = newh;

    q->size--;
    return true;
}

q_size

函式中的註解要求此函式必須為

O (1)

，因此在 struct queue_t 另外新增一成員 size 是必要的，每次 q_size 只需回傳該成員即可。

int q_size(queue_t *q)
{
    if (!q || !q->size)
        return 0;

    return q->size;
}

q_reverse

這邊用到資料結構中的反轉 linked list，宣告三個指摽 pre 、 cur 、 pos 來操作，流程如下：

pre(head) cur     pos
   |       |       |
   v       v       v
+----+  +----+  +----+  +----+
|ele1|->|ele2|->|ele3|->|ele4|->...
+----+  +----+  +----+  +----+

  pre      cur     pos
   |        |       |
   v        v       v
+----+   +----+  +----+  +----+
|ele1|<=>|ele2|  |ele3|->|ele4|->...     // cur->next=pre
+----+   +----+  +----+  +----+

         pre,cur   pos
            |       |
            v       v
+----+   +----+  +----+  +----+
|ele1|<=>|ele2|  |ele3|->|ele4|->...     // pre = cur
+----+   +----+  +----+  +----+

           pre   cur,pos
            |       |
            v       v
+----+   +----+  +----+  +----+
|ele1|<=>|ele2|  |ele3|->|ele4|->...     // cur = pos
+----+   +----+  +----+  +----+

           pre     cur     pos
            |       |       |
            v       v       v
+----+   +----+  +----+  +----+
|ele1|<=>|ele2|  |ele3|->|ele4|->...     // pos = pos->next
+----+   +----+  +----+  +----+

            ...                          // loop the above steps
            
 head           pre,cur(tail) pos
   |                  |        |
   v                  v        |
+----+   +------+  +----+      v
|ele1|...|eleN-1|<-|eleN|    NULL        // if (!pos) { break; }
+----+   +------+  +----+  

head,tail           pre,cur   pos
    |                  |       |
    v                  v       |
 +----+   +------+  +----+     v
 |ele1|...|eleN-1|<-|eleN|   NULL        // q->tail = q->head
 +----+   +------+  +----+  
// now ele1->next still points to ele2

      head,tail           pre,cur   pos
          |                  |       |
          v                  v       |
       +----+   +------+  +----+     v
NULL <-|ele1|...|eleN-1|<-|eleN|   NULL     // q->tail->next = NULL
       +----+   +------+  +----+  
    
        tail               head   
          |                  |    
          v                  v    
       +----+   +------+  +----+  
NULL <-|ele1|...|eleN-1|<-|eleN|         // q->head = cur
       +----+   +------+  +----+

void q_reverse(queue_t *q)
{
    if (!q || !q_size(q) || q_size(q) == 1)
        return;

    list_ele_t *pre = q->head;
    list_ele_t *cur = pre->next;
    list_ele_t *pos = cur->next;
    while (1) {
        cur->next = pre;
        pre = cur;
        if (!pos)
            break;
        cur = pos;
        pos = pos->next;
    }
    q->tail = q->head;
    q->tail->next = NULL;
    q->head = cur;
}

q_sort

q_sort 原本使用 bubble sort 但由於時間複雜度太高，因此後來選擇 merge sort 實作。 merge_sort 參數為 q->head 的 reference，在 merge_sort 結束後不必多餘的指派， q->head 即指向整個排序後的 queue。

void q_sort(queue_t *q)
{
    if (!q || !q_size(q) || q_size(q) == 1)
        return;

    merge_sort(&q->head);
}

merge_sort 流程為將 queue 平均分為兩 queue，分別由 front 、 back 指向；再分別對兩 queue 遞迴的作 merge_sort 直到 queue 為 NULL 或剩餘一元素。

參數的 **q_head 為一指向 head pointer 的指標，目的是為了將 head pointer 本身傳入，並且在函式結束後讓 head pointer 指向最終結果。後面遞迴的時候 front 及 back 會分別代表新的 head pointer ，最後 merge 會將兩 queue 合併並回傳排序後 queue 的 head pointer，指派給最初傳入的 *q_head。

void merge_sort(list_ele_t **q_head)
{
    /* q_head is the pointer point to the address of the
     * head element pointer, dereference to get pointer itself */
    list_ele_t *head = *q_head;

    if (!head ||
        !head->next) /* Return if q_head is NULL or there's only one element */
        return;

    /* Split the queue into two queue */
    list_ele_t *front = NULL;
    list_ele_t *back = NULL;
    split_queue(head, &front, &back);

    /* Sort each queue */
    merge_sort(&front);
    merge_sort(&back);

    *q_head = merge(front, back);
}

split_queue 將 q_head 指向的 queue 均分為二，方法為使用兩指標 fast 及 slow 並讓 fast 以兩倍 slow 的速度迭代整個 queue，當 fast 為 NULL 代表到底了，此時 slow 會指向中間元素的前一元素。將 q_head 指派給 *front_p ； slow->next 指派給 *back_p ，並將 slow->next = NULL 可分為兩 queue。

要注意的是 split_queue 後兩參數為 pointer to front 及 pointer to back 並且 front 及 back 亦分別為指標，因此呼叫的時候是把 front 及 back 指標本身的 reference 傳入：split_queue(head, &front, &back); 。且在最後是使用*front_p 及 *back_p 來拿到 pointer 本身。

void split_queue(list_ele_t *q_head, list_ele_t **front_p, list_ele_t **back_p)
{
    list_ele_t *fast;
    list_ele_t *slow;
    slow = q_head;
    fast = q_head->next;

    /* fast iterates two times faster than slow.
     * So that when fast reaches the end of queue,
     * slow is at middle of the queue. */
    while (fast) {
        fast = fast->next;
        if (fast) {
            fast = fast->next;
            slow = slow->next;
        }
    }

    *front_p = q_head;
    *back_p = slow->next;
    slow->next = NULL;
}

merge 將傳入的兩 head pointer 指向的 queue 合併，首先決定新 queue 的 head 並由 result 、 current 兩指標指向。並且在分別迭代 a 、 b 兩 queue，比較大小決定 current->next 直到其中一者取完為止，並將另一者剩餘部分接在新 queue 後完成，最後 result 會指向新 queue 的頭，回傳後會直接指派給 merge_sort 的 *q_head（即傳入 merge_sort 的 q->head）完成排序。

另外原本使用遞回版本的 merge 會遇到 stack 空間不足的情況。

list_ele_t *merge(list_ele_t *a, list_ele_t *b)
{
    list_ele_t *result, *current;

    if (!a)
        return b;
    else if (!b)
        return a;

    /* Initialize the head of result */
    if (strcmp(a->value, b->value) < 0) {
        result = a;
        a = a->next;
    } else {
        result = b;
        b = b->next;
    }
    current = result;

    /* Iterative version of merge */
    while (1) {
        if (!a) {
            current->next = b;
            break;
        }
        if (!b) {
            current->next = a;
            break;
        }

        if (strcmp(a->value, b->value) < 0) {
            current->next = a;
            a = a->next;
        } else {
            current->next = b;
            b = b->next;
        }

        current = current->next;
    }

    return result;
    
    /* Recursive version of merge, this might cause stack
     * explosion where there are too many elements. */
    /*
     * if (strcmp(a->value, b->value) < 0) {
     *     result = a;
     *     result->next = merge(a->next, b);
     * } else {
     *     result = b;
     *     result->next = merge(a, b->next);
     * }
     *
     * return result;
    list_ele_t *result_head = result;
     */
}