2019q1 Homework1 (lab0)

contributed by <f26401004>

尚未完成：

trace harness.h / harness.c
qtest 機制分析

目標

實作簡單的 queue data structure，並提供 insert head / insert tail / remove head 等操作
題目中特別限制部份程式碼須在特定的時間複雜度下
練習 github 以及 git hook

程式碼

queue_t *q_new()

queue_t *q_new()
{
    queue_t *q = malloc(sizeof(queue_t));
    /* return null if malloc failed */
    if (q == NULL) {
        return NULL;
    }
    /* reset the queue */
    q->head = NULL;
    q->tail = NULL;
    q->size = 0;
    return q;
}

該 function 建立一個 queue
需要小心的是當 malloc 不成功時需要 return null 以及 malloc 成功後須重設 queue 的 member

void q_free(queue_t *q)

/* Free all storage used by queue */
void q_free(queue_t *q)
{
    /* just return if queue do not exist */ 
    if (q == NULL) {
        return;
    }
    /* use two pointer to traversal all the queue entry */
    list_ele_t *iter = q->head;
    list_ele_t *prev = iter;
    q->head = NULL;
    q->tail = NULL;
    while(iter) {
        prev = iter;
        iter = iter->next;
        prev->next = NULL;
        free(prev->value);
        free(prev);
    }
    free(q);
}

在 free 一個 queue 時需要注意：因為每個 entry 的 value 當初也是被 malloc 的，因此也需要特別去 free 掉，否則會產生 dangling pointer （最一開始在寫的時候忘記 free 掉，因此在 make test 時會產生有 block 仍然存在的 error）
更多閱讀：dangling pointer

bool q_insert_head(queue_t q, char s)

bool q_insert_head(queue_t *q, char *s)
{
    /* return false if queue do not exist */
    if (q == NULL) {
        return false;
    }
    list_ele_t *newh = malloc(sizeof(list_ele_t));
    /* return false if malloc a new entry failed */
    if (newh == NULL) {
        return false;
    }
    char *buf = malloc((strlen(s) + 1) * sizeof(char));
    /* return false if malloc a new buffer to copy the string failed */
    if (buf == NULL) {
        /* [Important]: also need to free the newh entry */
        free(newh);
        return false;
    }
    q->size++;
    strcpy(buf, s);
    /* reset the newh entry */
    newh->value = buf;
    newh->next = NULL;
    /* insert the new entry to the queue head */
    if (q->head == NULL && q->tail == NULL) {
        q->head = q->tail = newh;
        return true;
    }
    newh->next = q->head;
    q->head = newh;
    return true;
}

在 insert head 時需要注意因為本次的 malloc 了兩次記憶體，因此不論在哪個階段 malloc 都需要檢查該 malloc 是否成功，不成功則需要將上一個階段的 malloc 成功的記憶體 free 掉並且 return false
若 insert head 時發現 head 和 tail 皆為 null 時則初始將 head 和 tail 皆設為新的 entry

bool q_insert_tail(queue_t q, char s)

bool q_insert_tail(queue_t *q, char *s)
{
    /* return false if queue do not exist */
    if (q == NULL) {
        return false;
    }
    list_ele_t* newh = malloc(sizeof(list_ele_t));
    /* return false if malloc a new entry failed */
    if (newh == NULL) {
        return false;
    }
    char *buf = malloc((strlen(s) + 1) * sizeof(char));
    /* return false if malloc a new buffer to copy the string failed */
    if (buf == NULL) {
        /* [Important]: also need to free the newh entry */
        free(newh);
        return false;
    }
    q->size++;
    strcpy(buf, s);
    /* reset the newh entry */
    newh->value = buf;
    newh->next = NULL;
    /* insert the new entry to the queue head */
    if (q->head == NULL && q->tail == NULL) {
        q->head = q->tail = newh;
        return true;
    }
    q->tail->next = newh;
    q->tail = newh;
    return true;
}

因為 insert tail 也要求時間複雜度只能為 O(1)，故在 queue 的 data structure 新增 tail pointer link，每次 insert tail 時同時更新最新的 tail pointer link。有了 tail pointer link 便可以和 insert head 相同使用 O(1) 時間複雜度插入

bool q_remove_head(queue_t q, char sp, size_t bufsize)

bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
    /* if queue do not exist or q->head do not exist or q->size equal to zero, then just return false */
    if (q == NULL || q->head == NULL || q->size == 0) {
        return false;
    }
    /* minus the queue size first */
    q->size--;
    list_ele_t *iter = q->head;
    if (sp != NULL) {
        /* reset sp first */
        memset(sp, '\0', (int)sizeof(sp));
        strncpy(sp, iter->value, bufsize - 1);
        /* [Important]: truncated the rest length of the sp */
        sp[bufsize - 1] = '\0';
    }
    q->head = q->head->next;
    /* free the memory space */
    iter->next = NULL;
    free(iter->value);
    free(iter);
    return true;
}

題目要求當 sp 不為 NULL 時便將正要移除的 entry value 複製到 sp 中
然而一開始使用 strncpy 時根據原始碼的實作會發現若 from 的長度大於 to 時會自動 expend to 的長度，在某些情況會造成效率低落。（當要複製的 string 長度遠小於指定的長度時）
此外strncpy 不能保證 to 存在 \0，因此必須手動填上 \0 在 to 的尾巴確保存在 null character
另一種的作法可以使用 strlcpy，它可以確保 to 的結尾存在 null character，查閱 C89/C99/C11 pdf 後發現並不存在 strlcpy，我的電腦系統為 Unbuntu 18.04LTS 需要額外安裝相關 package 才可以使用。
- $ sudo apt-get install libbsd-dev
- 引用 #include <bsd/string.h>
或者使用 memcpy 直接複製指定的記憶體區段，但同樣的他也不能保證 null character，因此也需要額外填上 \0 在 to 的尾巴。
[更多閱讀]：strncpy source code (linux v5.0-rc7)
[更多閱讀]：memcpy source code (linux v5.0-rc7)

避免說「此函式似乎沒有被納入 standard ANSI C」這種話，工程人員的論述要清晰，要指出 C89/C99/C11 是否存在 strlcpy，以及 POSIX 相關規範 (包含 4BSD) 是否有 conformance 描述

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

jserv

int q_size(queue_t *q)

int q_size(queue_t *q)
{
    /* if queue do not exist or q->head do not exist or q->size equal to zero, then just return zero */
    if (q == NULL || q->size == 0 || q->head == NULL) {
        return 0;
    }
    return q->size;
}

直接回傳在每次操作時紀錄的長度即可

void q_reverse(queue_t *q)

void q_reverse(queue_t *q)
{
    /* if queue do not exist or q->head do not exist or q->size equal to zero, then just return */
    if (q == NULL || q->size == 0 || q->head == NULL) {
        return;
    }
    list_ele_t *prev = NULL; // use prev to store previous entry
    list_ele_t *next = q->head; // use next to store next entry
    list_ele_t *current = q->head; // use current entry to reverse the next link
    /* traversal through the list and reverse the entry next pointer link */
    while(current) {
        next = next->next;
        current->next = prev;
        prev = current;
        current = next;
    }
    /* [Important]: also need to reset the head and tail pointer link */
    q->tail = q->head;
    q->head = prev;
}

利用 3 個 pointer 來 traversal 整個 queue，過程如下：
1. 初始化 next / current 為 head，prev 為 null
2. 只要 current 不為 null 則繼續進行下面的步驟
3. next 移往下一個 entry
4. 將 current->next 設置為 prev
5. 設置 prev 為 current、current = next
6. 回到第 2 步驟
需要特別注意的是別忘記回來重設 head 和 tail pointer

討論

1. 自動評分系統運作原理

首先可以先查看 makefile 中的 test，其對應行為為執行了一段 python 程式碼

test: qtest scripts/driver.py
	scripts/driver.py

若有安裝 python 環境也可以在我們 repo 檔案路徑下執行下面程式碼驅動

$ python3 scripts/driver.py

而裡面定義了一個 Tracer class ，其中在這個 class 讀入了每個 trace 的檔案以及會 output trace 過程

而最主要執行每次驗證的結果是在 Tracer 中的 runTrace:

    def runTrace(self, tid):
        if not tid in self.traceDict:
            print("ERROR: No trace with id %d" % tid)
            return False
        fname = "%s/%s.cmd" % (self.traceDirectory, self.traceDict[tid])
        vname = "%d" % self.verbLevel
        clist = [self.qtest, "-v", vname, "-f", fname]
        try:
            retcode = subprocess.call(clist)
        except Exception as e:
            print("Call of '%s' failed: %s" % (" ".join(clist), e))
            return False
        return retcode == 0

由上面程式碼的第 9 行可以看到他會產生 subprocess 去執行 traces 資料夾底下的 cmd，在裡面定義了許多執行 qtest 的指令，再將執行的結果抓出來計算分數。

而值得注意的是，因為跑 subprocess，因此實際上如果真的 malloc 失敗或 free 失敗，如果程式寫的嚴謹可以跑完 process，程式寫的不嚴謹可能會造成程式錯誤而令 process 中止，但是不論哪一種因為在 c 裡面沒有類似 c++ 的 try catch 語法，因此實際上我們的 malloc 以及 free 的 function 都經過改寫，塞入了錯誤訊息的紀錄程式碼，如此一來我們便可以測試自己的程式邏輯是否正確。(not sure)

malloc 與 free 的改寫被定義在 harness.h / harness.c 中，利用 define 將 test_malloc 定義為 malloc、 test_free 定義為 free

harness.c / harness.h

在 harness.c / harness.h 中實際上自行定義了一個 double linked list 來完成我們在 qtest 的操作








typedef struct BELE {
    struct BELE *next;
    struct BELE *prev;
    size_t payload_size;
    size_t magic_header; /* Marker to see if block seems legitimate */
    unsigned char payload[0];
    /* Also place magic number at tail of every block */
} block_ele_t;

其中可以發現每個 entry 的 data structure 存在一個 magic_header field ，他的作用是在於每次 insert 新的 entry 時會同時紀錄當前 queue 的 magic_header，以下程式碼節錄自 harness.c 的 test_malloc(size_t size)








if (new_block == NULL) {
        report_event(MSG_FATAL, "Couldn't allocate any more memory");
        error_occurred = true;
    }
    new_block->magic_header = MAGICHEADER;
    new_block->payload_size = size;
    *find_footer(new_block) = MAGICFOOTER;
    void *p = (void *) &new_block->payload;

如此一來在之後進行 free operation 時便可以檢查是否 free 到 unallocated block
而 payload_size 則紀錄了要 malloc 記憶體的大小；payload 則是模擬實際獲得的記憶體空間，他會在 test_malloc 時順便全部被初始化為 0x55

不過我發現 MAGICHEADER 都會被初始化為 0xdeadbeef，不知道是不是故意的XD
後來我去查了一下發現有一種專有名詞 Hexspeak 就是只這種將 16 進位表達成英文單詞，不同處理器、作業系統會有不同的意義的樣子

TODO: 解讀這裡 magic number 存在的必要，以及找出類似追蹤記憶體使用而出現的 magie number 設計

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

jserv

接著為了確保 queue 的執行，在 harness 中也特別定義


static block_ele_t *allocated = NULL;
static size_t allocated_count = 0;

allocated 會紀錄最近一次 malloc 的 block_ele_t pointer，而 allocated_count 則會紀錄目前 allocated block_ele_t 數

而在 harness.c 中透過下面這兩個函式檢查每次 free 一個 pointer 是否合法

static block_ele_t *find_header(void *p)
static size_t *find_footer(block_ele_t *b)

因為 test_malloc 會回傳的是 payload 的指標，因此當使用者需要 free 一個空間的時候也會把 payload 的指標作為參數丟給 test_free
find_head 中首先直接用數學運算得到該 payload 的 block_ele_t pointer，接著自 allocated 開始尋找是否存在相同的 block_ele_t pointer，如果不存在即紀錄錯誤訊息並報錯
而 find_footer 則可以用來檢查每次 test_malloc 會特別註記在每個 block_ele_t 尾巴的 MAGICFOOTER

2. qtest 的行為以及裡面的技巧

簡單繪製了一下各個 .c / .h 檔被引用的狀況

[額外筆記] 為什麼 strlcpy 被視為不安全的操作？

像是 strlcpy 以及 strlcat，兩者皆沒有被納入標準的 ANSI C 中，根據 ANSI C maintainer Ulrich Drepper 認為這兩個 function 是不安全的，一旦使用者使用了此兩個 function 很可能會造成一些 truncation error 被忽視，導致未來可能會產生更多的 bug 需要被處理。

Ulrich Drepper 建議使用 mempcpy，利用回傳得到的最後一個 character pointer 直接寫入該 pointer value 為 \0
*((char *) mempcpy (dst, src, n)) = '\0';
(雖然 mempcpy 也沒有被納入標準 ANSI C 中)

另外還有一個議題是因為沒有被納入標準的 ANSI C，因此實際上不同的 OS 的 implementation 也不相同。因此可能會導致在不同作業系統編譯出來的執行檔執行結果錯誤。