2019q1 Homework1 (lab0)

contributed by < ab37695543xs >

開發環境

架構：               x86_64
CPU 作業模式：       32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
每核心執行緒數：     2
每通訊端核心數：     4
Socket(s):           1
NUMA 節點：          1
供應商識別號：       GenuineIntel
CPU 家族：           6
型號：               142
Model name:          Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
製程：               10
CPU MHz：            2099.944
CPU max MHz:         3400.0000
CPU min MHz:         400.0000
BogoMIPS:            3600.00
虛擬：               VT-x
L1d 快取：           32K
L1i 快取：           32K
L2 快取：            256K
L3 快取：            6144K
NUMA node0 CPU(s)：  0-7

作業目標

CMU Introduction to Computer Systems (ICS) 準備了 C Programming Lab 作為檢閱學生 C 語言程式設計的認知:

Explicit memory management, as required in C.
Creating and manipulating pointer-based data structures.
Working with strings.
Enhancing the performance of key operations by storing redundant information in data structures.
Implementing robust code that operates correctly with invalid arguments, including NULL pointers.

實驗目標為實作 queue

first in, first out (FIFO)
last in, first out (LIFO)

基本要求

queue.h

為了滿足 O(1) 的時間要求下，實作出 q_insert_tail() 與q_size()，便在資料結構上新增了 tail 與 size 來紀錄





typedef struct {
    list_ele_t *head;
    list_ele_t *tail;
    int size;
} queue_t;

queue.c

q_new()

成功分配記憶體的話，就初始化 q，反之 q 會回傳 NULL











queue_t *q_new()
{
    queue_t *q = malloc(sizeof(queue_t));
    /* initialize the queue */
    if (q) {
        q->head = NULL;
        q->tail = NULL;
        q->size = 0;
    }
    return q;
}

q_size()

由於透過額外 field 來記錄 queue 的大小，且在 q_new() 已初始化，這邊直接回傳即可

犯錯的地方：
由於初始化是在 q_new() 時進行，需要考慮到未初始化的情況，即 q 為 NULL







int q_size(queue_t *q)
{
    if (q != NULL)
        return q->size;
    else
        return 0;
}

q_insert_head()

考慮傳入的 q 與記憶體分配的 malloc() 與 strdup() 為 NULL 的可能
若複製字串空間不夠時，會視為插入失敗，同時釋放該節點
需要考慮最初的情況，head 和 tail 為同一個，避免插入失敗
有新節點的情況也會需要更新 size





















bool q_insert_head(queue_t *q, char *s)
{
    if (q == NULL)
        return false;
    list_ele_t *newh = malloc(sizeof(list_ele_t));
    if (newh == NULL)
        return false;
    /* set the element */
    newh->value = strdup(s);
    if (newh->value == NULL) {
        free(newh);
        return false;
    }
    newh->next = q->head;
    /* maintain the queue */
    q->head = newh;
    if (q->tail == NULL)  // same element
        q->tail = newh;
    q->size++;
    return true;
}

q_insert_tail()

由於透過額外 field 記錄 tail 位置，所以省去遍尋的時間，剩下邏輯同 q_insert_head()






















bool q_insert_tail(queue_t *q, char *s)
{
    /* Remember: It should operate in O(1) time */
    list_ele_t *newt = malloc(sizeof(list_ele_t));
    if (newt == NULL)
        return false;
    /* set the element */
    newt->value = strdup(s);
    if (newt->value == NULL) {
        free(newt);
        return false;
    }
    newt->next = NULL;
    /* maintain the queue */
    if (q->head == NULL)  // same element
        q->head = newt;
    else
        q->tail->next = newt;
    q->tail = newt;
    q->size++;
    return true;
}

q_free()

由於是整個 queue 要 free，沒有必要保留 head，所以透過移動 head 來依序移除
字串在儲存時有經由 strdup() 分配記憶體，所以要手動 free
最後所有節點 free 完後，head 和 tail 也就不復存在，可以直接 free 掉 queue

犯錯的地方：

只要有參照指標 q 的地方，都應該檢查 q 是否為 NULL

雖然 free(NULL) 是可行的，但在 while 裡先使用到 q->head，所以會先失敗














void q_free(queue_t *q)
{
    if (q == NULL)
        return;
    list_ele_t *tmp;
    while (q->head != NULL) {  // next element
        tmp = q->head;
        q->head = tmp->next;
        free(tmp->value);  // strdup()
        free(tmp);
    }
    /* nodes all freed */
    free(q);
}

q_remove_head()

要移除 head 一定要先檢查 head 存不存在
這邊要將刪除的 head 所存的字串複製到終點 sp，同時有限制複製的字元數不超過 bufsize
考慮到超過 bufsize 的字串不會複製到 '\0'，這邊是先將 sp 統一設為 '\0'，再將 bufsize - 1 以內的字元複製過去
刪除節點同樣需要更新 size

犯錯的地方：

一開始只考慮到檢查 head 而已，忽略了 q 為 NULL 的情況

字串複製時忽略了 sp 為 NULL 時，會造成無法寫入的情形

















bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
    if (q == NULL || q->head == NULL)
        return false;
    /* pop the element */
    if (sp != NULL) {
        memset(sp, (int) '\0', bufsize);
        strncpy(sp, q->head->value, bufsize - 1);
    }
    free(q->head->value);
    /* maintain the queue */
    list_ele_t *tmp = q->head;
    q->head = tmp->next;
    free(tmp);
    q->size--;
    return true;
}

q_reverse()

反轉的部分一開始寫的是後面的版本 (由尾到頭依序)，後來改良的是以下版本 (由頭到尾依序)

改良版的運作如下圖，透過三個指標 prev、head 和 next 來完成交換。
迴圈前為圖 1, 2，先記錄一開始的 head 為 newt，需要考慮最後完成的 tail 要接 NULL
迴圈操作為圖 3, 4
1. 記錄 head 下一個節點為 next
2. 將 head 連接至 prev，完成反轉
3. 將 prev 移至 head
4. 將 head 移至 next，繼續下一輪
完成情況如下
1. head 移到原 tail 的下一個 (NULL)，跳出迴圈
2. 因為是透過 head 來遍尋 queue，所以將 head 移至 prev (原 tail)
3. 將 tail 移至一開始記錄的 newt，完成整個 queue 的反轉



















void q_reverse(queue_t *q)
{
    if (q == NULL)
        return;
    /*
    start from head to tail
    link current(head) and prev then jump to next
    */
    list_ele_t *newt = q->head;
    list_ele_t *prev = NULL;
    while (q->head != NULL) {
        list_ele_t *next = q->head->next;
        q->head->next = prev;
        prev = q->head;
        q->head = next;
    }
    q->head = prev;  // prevent NULL
    q->tail = newt;  // origin head
}

原始版的運作如下圖，透過兩個指標 tmp 和 tail 來完成交換，缺點是每次都需要從頭找到 tail 前一個
首先記錄一開始的 tail 為 newh
迴圈操作為圖 2, 3, 4
1. tmp 從 head 開始遍尋
2. 直到下一個節點為 tail 時停下來
3. 將 tail 連接至 tmp，完成反轉
4. tail 移至 tmp，作為下一次的中斷點
完成情況如下
1. tail 與 head 重合時，跳出迴圈
2. 將目前 head (新 tail) 連接至 NULL
3. 將 head 移至一開始記錄的 newh，完成整個 queue 的反轉











    /* start from tail to head */
    list_ele_t *newh = q->tail;
    while (q->tail != q->head) {
        list_ele_t *tmp = q->head;
        while (tmp->next != q->tail)
            tmp = tmp->next;
        q->tail->next = tmp;    // reverse the link of tail
        q->tail = tmp;
    }
    q->head->next = NULL;
    q->head = newh; // origin tail

自動評分系統

makefile

執行 make test 會等同執行 makefile 裡以下指令
第一列是代表相依的檔案，第二列為執行的命令

test: qtest scripts/driver.py
    scripts/driver.py

scripts/driver.py

執行 driver.py 後，首先會檢查有沒有後續的引數

if __name__ == "__main__":
    run(sys.argv[0], sys.argv[1:])

提供了幾種引數，-p 可以決定待測試的程式，-t 挑選測資題目，-v 決定細節顯示 (0~3 越高越細)，或是 --valgrind 以 valgrind 開啟程式

有個隱藏引數是 -A，開啟 autograde (以 JSON 字串印出)，但因為過程中就會自動計算分數，所以預設不會用到

print("Usage: %s [-h] [-p PROG] [-t TID] [-v VLEVEL] [--valgrind]" % name)
print("  -h        Print this message")
print("  -p PROG   Program to test")
print("  -t TID    Trace ID to test")
print("  -v VLEVEL Set verbosity level (0-3)")

scripts/driver.py/run()

執行 run 後，主要處理上述引數
預設情況下會執行以下片段建立 class 並執行 function run

t = Tracer(qtest = "", verbLevel = 1, autograde = False, useValgrind = False)
t.run(0)

scripts/driver.py/Tracer.run()

根據是否有執行 valgrind，腳本的指令也會相對不同
上半部對應到的是 valgrind ./qtest
下半部對應到的是 ./qtest

事實上下半部的 command 可以不用再次賦值，預設就是 ./qtest，不過這樣可以增加可讀性

if self.useValgrind:
    self.command = 'valgrind'
else:
    self.command = self.qtest
    self.qtest = ''

在預設 tid 未指定測資的情況下，會依序跑完所有測資

for t in tidList:
    tname = self.traceDict[t]
    if self.verbLevel > 0:
        print("+++ TESTING trace %s:" % tname)
    ok = self.runTrace(t)

分數是透過每個測資的滿分加總決定，無論是錯在該題的哪個地方，該題都是以 0 分計算，所以會需要 verbLevel 幫助了解

    maxval = self.maxScores[t]
    tval = maxval if ok else 0
    print("---\t%s\t%d/%d" % (tname, tval, maxval))
    score += tval
    maxscore += maxval
    scoreDict[t] = tval

scripts/driver.py/Tracer.runTrace()

依序將前面所得的對應指令結合，便完成對應的腳本，再透過 python 的 subprocess.call 來呼叫 qtest

這邊事實上也可以將 fname 的部份全部預設寫在 dict 裡面，拆開來只是為了精簡過程中顯示的文字

fname = "%s/%s.cmd" % (self.traceDirectory, self.traceDict[tid])
vname = "%d" % self.verbLevel
clist = [self.command, self.qtest, "-v", vname, "-f", fname]
try:
    retcode = subprocess.call(clist)

qtest 行為與技巧

C 與 Python 擷取引數的差別

在 qtest.c 裡，使用的是 getopt()
字串中的 h v: f: l: 即是代表要處理的引數
輸入指令時必須是以 -h、-v 這樣的形式才會擷取
:表示後面需要有對應的資訊，如 -f test.txt，該資訊會存在 optarg 這個固定的變數
成功擷取會回傳對應的字元，失敗則根據情況會回傳 -1、? 和 ：

























int c;
while ((c = getopt(argc, argv, "hv:f:l:")) != -1) {
    switch (c) {
    case 'h':
        usage(argv[0]);
        break;
    case 'f':
        strncpy(buf, optarg, BUFSIZE);
        buf[BUFSIZE - 1] = '\0';
        infile_name = buf;
        break;
    case 'v':
        level = atoi(optarg);
        break;
    case 'l':
        strncpy(lbuf, optarg, BUFSIZE);
        buf[BUFSIZE - 1] = '\0';
        logfile_name = lbuf;
        break;
    default:
        printf("Unknown option '%c'\n", c);
        usage(argv[0]);
        break;
    }
}

在 driver.py 裡，使用的是 getopt.getopt() C-style parser，而 python 也有提供比較強大的模組是 argparse
這邊的長引數 valgrind 是透過 list 來儲存，在 C 則需要呼叫 getopt_long()，以特別的 struct 來儲存
回傳值 optlist 和 args 可以更改變數名稱，前者以 list 儲存所有擷取到的引數，後者是儲存擷取後剩餘的引數
回傳的 list 每個元素都是以 (option, value) 形式儲存，value 只有在指定 : 時才會一併把資訊放入


















optlist, args = getopt.getopt(args, 'hp:t:v:A', ['valgrind'])
for (opt, val) in optlist:
    if opt == '-h':
        usage(name)
    elif opt == '-p':
        prog = val
    elif opt == '-t':
        tid = int(val)
    elif opt == '-v':
        vlevel = int(val)
        levelFixed = True
    elif opt == '-A':
        autograde = True
    elif opt == '--valgrind':
        useValgrind = True
    else:
        print("Unrecognized option '%s'" % opt)
        usage(name)

測試過程

qtest.c

qtest 也提供了幾種可用的引數，-f 和 -v 對應到上述自動評分提及的腳本與細節顯示，-l 則是指定紀錄檔存放的位置

printf("Usage: %s [-h] [-f IFILE][-v VLEVEL][-l LFILE]\n", cmd);
printf("\t-h         Print this information\n");
printf("\t-f IFILE   Read commands from IFILE\n");
printf("\t-v VLEVEL  Set verbosity level\n");
printf("\t-l LFILE   Echo results to LFILE\n");

執行 qtest 後會先根據需要的 signal 註冊 signal handler
這邊只有考慮兩種事件，SIGSEGV 處理記憶體錯誤，SIGALRM 則處理定時器

static void queue_init()
{
    fail_count = 0;
    q = NULL;
    signal(SIGSEGV, sigsegvhandler);
    signal(SIGALRM, sigalrmhandler);
}

handler 裡面會直接呼叫 harness.c 的 trigger_exception()，注意 handler 會有固定一個參數儲存信號

void sigsegvhandler(int sig)
{
    trigger_exception(
        "Segmentation fault occurred.  You dereferenced a NULL or invalid "
        "pointer");
}

在每個與 queue.c 對應的函式裡，扣除引數的檢查後，都會先檢查 q 是否為 NULL
接著執行 harness.c 的 error_check()，因為尚未測試 queue，這邊忽略回傳值，目的只是重設 error_occurred

if (q == NULL)
    report(3, "Warning: Calling insert tail on null queue");
error_check();

bool error_check()
{
    bool e = error_occurred;
    error_occurred = false;
    return e;
}

以 do_new() 為例，完成上述步驟後會先設置一個存檔點，true 表示允許開啟定時器，~~畢竟闖關就是要存檔~~，完成後就會解除定時器

if (exception_setup(true))
    q = q_new();
exception_cancel();

部份測試會開啟額外的模式，如做完 do_free() 會開啟 harness.c 的 cautious_mode，確保任何要 free 的區塊目前都 allocated，但似乎不會用到 (?)

影響到 find_header()，只有 test_free() 用到，但沒有呼叫到 test_free() 的機會

set_cautious_mode(true);

do_reverse() 也會開關 noallocate_mode，開啟的情況下會禁止 test_malloc() 和 test_free()

一樣沒有呼叫到兩個函式的機會

set_noallocate_mode(true);
if (exception_setup(true))
    q_reverse(q);
exception_cancel();
set_noallocate_mode(false);

harness.c

以下變數是為了處理例外發生，可以搭配參考 setjmp.h
env 會儲存當下的 stack 資訊，宣告雖然可以用 jmp_buf，但可以改成 sigjmp_buf 比較有一致性
volatile 表示該變數能夠在 set 和 jump 之間存取，且不會套用編譯器的最佳化，避免影響變數的更新
sig_atomic_t 表示變數的更動需要在一個指令內完成，所以兩個關鍵字常一併出現
time_limited 表示是否有開啟定時器
time_limit 表示定時器的間隔，預設一秒

static jmp_buf env;
static volatile sig_atomic_t jmp_ready = false;
static bool time_limited = false;
static int time_limit = 1;

當 handler 捕捉到對應信號時，記錄要顯示的錯誤資訊，並使用 siglongjmp() 跳回上一個存檔點









void trigger_exception(char *msg)
{
    error_occurred = true;
    error_message = msg;
    if (jmp_ready)
        siglongjmp(env, 1);
    else
        exit(1);
}

這邊存檔點的設置使用的是 sigsetjump()，算是 setjump() 的改良版，差別在可以處理信號內容
比較複雜的 context switch 可以參考 setcontext()
savesigs 的位置傳入 1，根據 signal，表示能夠處理 SIGHUP 程序中斷的情況
建立存檔點時，固定呼叫 exception_setup(true)
1. 先經由 if 建立存檔點，sigsetjmp() 成功會是回傳 0，跳到 else
2. 允許 jump 並開啟定時器
3. 因定時器間隔一秒，正常是還不會觸發 SIGALRM 就回傳 true
假設在執行 queue.c 的函式時發生 SIGSEGV
1. 捕捉到信號會立即執行 trigger_exception()，記錄對應的訊息
2. 接著 siglongjmp() 會回到上一個存檔點 sigsetjmp()
3. jump 回來會使得回傳值非 0，進入 if
4. 禁止 jump 並關閉定時器
5. 印出對應的錯誤訊息並跳出
6. 因為 false 使得原本 qtest.c 測試的 if 區塊不再執行
























bool exception_setup(bool limit_time)
{
    if (sigsetjmp(env, 1)) {
        /* Got here from longjmp */
        jmp_ready = false;
        if (time_limited) {
            alarm(0);
            time_limited = false;
        }
        if (error_message) {
            report_event(MSG_ERROR, error_message);
        }
        error_message = "";
        return false;
    } else {
        /* Got here from initial call */
        jmp_ready = true;
        if (limit_time) {
            alarm(time_limit);
            time_limited = true;
        }
        return true;
    }
}

無論執行過程中是否有 SIGSEGV，最終都會呼叫 exception_cancel()，差別在發生例外的情況下會多做一次而已









void exception_cancel()
{
    if (time_limited) {
        alarm(0);
        time_limited = false;
    }
    jmp_ready = false;
    error_message = "";
}

qtest.c

執行完測試的指令後，仍會再次執行 harness.c 的 error_check()
如果正常執行，那麼仍會維持執行前的 error_occurred = false
反之只要有發生 trigger_exception()，就會維持 error_occurred = true，使得最終回傳值為 false，表示失敗

return ok && !error_check();

bool error_check()
{
    bool e = error_occurred;
    error_occurred = false;
    return e;
}