Try   HackMD

2019q1 Homework1 (lab0)

contributed by <MetalheadKen>

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
題目出處

In progress

這句話不用說,共筆除了自己初始的創作,還要接受他人的檢視和批評,從而予以調整和改進,總是「進行中」

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

實驗環境

$ cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.1 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.1 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
$ uname -a
Linux kendai-XPS-13-9360 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ gcc --version | head -n 1
gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0

題目摘要

  • Some of the skills tested are :

    • Explicit memory management, as required in C
    • Creating and manipulating pointer-based data structures
    • Working with strings
    • Enhancing the proformance of key operations by storing redundant information in data structures
    • Implementing robust code that operates correctly with invalid arguments, including NULL pointers
  • Implementing a queue

    • Supporting both last-in, first-out (LIFO) and first-in-first-out (FIFO)
    • The underlying data structure is a singly-linked list
  • Resources

  • The task is to modify the code in queue.h and queue.c to fully implement the following functions

    • q_new: Create a new, empty queue
    • q_free: Free all storage used by a queue
    • q_insert_head: Attempt to insert a new element at the head of the queue (LIFO discipline)
    • q_insert_tail: Attempt to insert a new element at the tail of the queue (FIFO discipline)
    • q_remove_head: Attempt to remove the element at the head of the queue
    • q_size: Compute the number of elements in the queue
    • q_reverse: Reorder the list so that the queue elements are reversed in order.
  • Precautions

    • A NULL queue is one for which the pointer has value NULL. An empty queue is one pointing to a valid structure, but the head field has value NULL
    • For functions that provide strings as arguments, you must create and store a copy of the string by calling malloc to allocate space and then copying from the source to the newly allocated space
    • You cannot assume any fixed upper bound on the length of a string ─ you must allocate space for each string based on its length
    • q_insert_tail and q_size will require that your implementations operate in time
      O(1)
    • q_reverse should not allocate or free any list elements. Instead, it should rearrange the existing elements

實作流程

queue.h

依據題意,為了讓函式 q_insert_tailq_size 的時間複雜度為

O(1),在 struct queue_t 裡新增成員 tailsize,使得在 runtime 的時候可以動態紀錄 queue 的尾端和其大小

/* Queue structure */
typedef struct {
    list_ele_t *head;
    list_ele_t **tail;
    size_t size;
} queue_t;

q_new

需注意若 malloc 失敗時需回傳 NULL

queue_t *q_new()
{
    queue_t *q = (queue_t *) malloc(sizeof(queue_t));
    if (!q)
        return NULL;

    q->head = NULL;
    q->tail = &(q->head);
    q->size = 0;
    return q;
}

q_free

釋放 queue 的空間,注意記得一併釋放節點和字串所佔的記憶體空間

void q_free(queue_t *q)
{
    if (!q)
        return;

    while (q->head != NULL) {
        list_ele_t *tmp = q->head;
        q->head = q->head->next;

        free(tmp->value);
        free(tmp);
    }

    free(q);
    q = NULL;
}

q_insert_head

bool q_insert_head(queue_t *q, char *s)
{
    if (!q)
        goto ERR;

    list_ele_t *newh;
    newh = (list_ele_t *) malloc(sizeof(list_ele_t));
    if (!newh)
        goto ERR;

    char *str = (char *) malloc(strlen(s) + 1);
    if (!str)
        goto ERR_FREE_LIST;

    strcpy(str, s);

    newh->value = str;
    newh->next = q->head;

    if (!q->head) {
        q->tail = &(newh->next);
    }
    q->head = newh;
    q->size++;

    return true;

ERR_FREE_LIST:
    free(newh);
ERR:
    return false;
}

q_insert_tail

bool q_insert_tail(queue_t *q, char *s)
{
    if (!q)
        goto ERR;

    list_ele_t *newt = (list_ele_t *) malloc(sizeof(list_ele_t));
    if (!newt)
        goto ERR;

    char *str = (char *) malloc(strlen(s) + 1);
    if (!str)
        goto ERR_FREE_LIST;

    strcpy(str, s);

    newt->value = str;
    newt->next = NULL;

    *(q->tail) = newt;
    q->tail = &(newt->next);
    q->size++;

    return true;

ERR_FREE_LIST:
    free(newt);
ERR:
    return false;
}

q_remove_head

  • sp 不為 NULL 時代表需把移除的字串複製到 sp
  • 注意有可能要移除的字串的長度超過 sp 的大小,因而最多只需複製長度為 bufsize - 1,並注意要在結尾上加上 terminating null byte
  • 調整節點數量
bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
    if (!q || !q->head)
        return false;

    if (sp != NULL) {
        strncpy(sp, q->head->value, bufsize - 1);
        sp[bufsize - 1] = '\0';
    }

    list_ele_t *tmp = q->head;
    q->head = q->head->next;
    q->tail = (--q->size == 0) ? &(q->head) : q->tail;

    free(tmp->value);
    free(tmp);
    return true;
}

q_size

  • 題意規定時間複雜度為
    O(1)
    ,因此藉由 size 動態紀錄 queue 的大小,並在此函式直接回傳 size 的內容
  • 需注意有可能在沒有使用 new command 的情況下直接使用 size command,所以需判斷若 queue 為 NULL queue 或 empty queue 則直接回傳數值 0
int q_size(queue_t *q)
{
    if (q == NULL || q->head == NULL) {
        return 0;
    } else {
        return q->size;
    }
}

可改寫上方程式碼為更簡潔的形式:

int q_size(queue_t *q)
{
    if (!q || !q->head)
        return 0;
    return q->size;
}

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

LGTM.
MetalheadKenFeb 22, 2019

q_reverse

  • 若 queue 為 NULL queue 或 empty queue 或長度為 1 的 queue 則無需 reverse
  • 使用 prev, curr, next 來紀錄反轉的過程
    • prev 紀錄 queue 前一個的節點
    • curr 紀錄 queue 當前的節點,在反轉過程中,curr 的下一個節點應指派為 curr 的前一個節點
    • next 紀錄 queue 下一個的節點
  • 注意最後 headtail 需重設方向
void q_reverse(queue_t *q)
{
    if (!q || !q->head || !q->head->next)
        return;

    list_ele_t *prev, *curr, *next;

    prev = q->head;
    curr = q->head->next;
    q->head->next = NULL;
    q->tail = &(q->head->next);

    while (curr != NULL) {
        next = curr->next;
        curr->next = prev;
        prev = curr;
        curr = next;
    }
    q->head = prev;
}

自動評分系統運作機制

檔案相關資訊

  • console.[ch]: Command-line interpreter for qtest
  • report.[ch]: Printing of information at different levels of verbosity
  • harness.[ch]: Customized version of malloc and free to provide rigorous testing framework
  • qtest.c: Testing code for queue code
  • scripts/driver.py: The driver program, run qtest on a standard set of traces
  • traces/trace-XX-CAT.cmd: Trace files used by the driver. These are input files for qtest
  • Makefile: Builds the evaluation program qtest

關於使用命令來測試程式

如何管理命令

console.[ch] 會把所有的 command, parameter, file 用 singly linked list 串連起來。console.h 用以下 structure 來表示之

typedef bool (*cmd_function)(int argc, char *argv[]);

/* Information about each command */
typedef struct CELE cmd_ele, *cmd_ptr;
struct CELE {
    char *name;
    cmd_function operation;
    char *documentation;
    cmd_ptr next;
};

typedef void (*setter_function)(int oldval);

/* Integer-valued parameters */
typedef struct PELE param_ele, *param_ptr;
struct PELE {
    char *name;
    int *valp;
    char *documentation;
    /* Function that gets called whenever parameter changes */
    setter_function setter;
    param_ptr next;
};

透過 add_cmd 函式可將 command 串連到 linked list 裡。可以發現在串接的時候順便進行按 alphabetical order 的 insertion sort

void add_cmd(char *name, cmd_function operation, char *documentation)
{
    cmd_ptr next_cmd = cmd_list;
    cmd_ptr *last_loc = &cmd_list;
    while (next_cmd && strcmp(name, next_cmd->name) > 0) {
        last_loc = &next_cmd->next;
        next_cmd = next_cmd->next;
    }
    cmd_ptr ele = (cmd_ptr) malloc_or_fail(sizeof(cmd_ele), "add_cmd");
    ele->name = name;
    ele->operation = operation;
    ele->documentation = documentation;
    ele->next = next_cmd;
    *last_loc = ele;
}

如何呼叫使用者輸入命令對應的函式

一開始使用者執行 qtest 後,輸入命令時,會先執行 console.c 中的 interpret_cmd 並呼叫 parse_args 來將輸入的字串以空白為區隔,轉換為 argcargv,如下

bool interpret_cmd(char *cmdline)
{
    int argc;
    ...
    char **argv = parse_args(cmdline, &argc);
    bool ok = interpret_cmda(argc, argv);
    ...
    return ok;
}

之後在函式 interpret_cmda 中,利用迴圈走訪 singly linked list,並利用 strcmp 找出函式相對應的字串。可以發現在 qtest 中使用函式指標的方式可以很簡單方便的維護使用者可輸入的命令

static bool interpret_cmda(int argc, char *argv[])
{
    if (argc == 0)
        return true;
    /* Try to find matching command */
    cmd_ptr next_cmd = cmd_list;
    bool ok = true;
    while (next_cmd && strcmp(argv[0], next_cmd->name) != 0)
        next_cmd = next_cmd->next;
    if (next_cmd) {
        ok = next_cmd->operation(argc, argv);
        if (!ok)
            record_error();
    } else {
        report(1, "Unknown command '%s'", argv[0]);
        record_error();
        ok = false;
    }
    return ok;
}

在寫好 trace file 後程式如何運作

qtest 中有提供參數 -f <filename> 來讀取 trace file。當使用者輸入命令後,在 qtest.c 中的 main 函式會呼叫 getopt 來解析使用者先前輸入命令的後面所添加的參數,節錄如下

int main(int argc, char *argv[])
{
    ...
    while ((c = getopt(argc, argv, "hv:f:l:")) != -1) {
        switch (c) {
        case 'h':
            usage(argv[0]);
            break;
        case 'f':
            strncpy(buf, optarg, BUFSIZE);
            buf[BUFSIZE - 1] = '\0';
            infile_name = buf;
            break;
        case 'v':
            level = atoi(optarg);
            break;
        case 'l':
            strncpy(lbuf, optarg, BUFSIZE);
            buf[BUFSIZE - 1] = '\0';
            logfile_name = lbuf;
            break;
        default:
            printf("Unknown option '%c'\n", c);
            usage(argv[0]);
            break;
        }
    }
    ...
}

getopt 是一個可以很方便的去 parse 從 command line 傳過來的參數的函式,該函式會一直讀取參數直到沒有更多的參數可以讀時才回傳 -1

  • function prototype:
    int getopt(int argc, char *argv[], const char *optstring)
  • int argc: the argument count passed to the main() function
  • char *argv[]: the argument array passed to the main() function
  • const char *optstring: a string containing the legitimate option characters
    • 一個字元︰ 一個可用的 option
    • 一個字元搭配冒號︰該 option 後有額外帶參數
    • 一個字元當配兩個冒號︰該 option 後帶的參數是可選的 (此為 GNU extension)
  • getopt 回傳數值後
    • 成功回傳 option 字元,無更多 option 可讀則回傳 -1,在 optstring 找不到 option 字元則回傳 ?,在 optstring 設定需帶參數的 option 沒有帶參數的話則回傳 :
    • optarg 變數儲存該 option 後面所帶的參數
    • optind 變數儲存下次呼叫 getopt 時要處理的 argv 的 index
    • optopt 變數儲存當 getopt 找不到 option 字元或是缺少某些參數時的該 option
  • 注意事項
    • getopt 是 POSIX standard 但是並不是 C standard 並在 glibc 實作中的某些行為是 GNU extensions,這點需要特別注意
  • 參考資料

console.c 中定義以下 structure 來儲存每個要使用的檔案的 file descritptor

#define RIO_BUFSIZE 8192
typedef struct RIO_ELE rio_t, *rio_ptr;

struct RIO_ELE {
    int fd;                /* File descriptor */
    int cnt;               /* Unread bytes in internal buffer */
    char *bufptr;          /* Next unread byte in internal buffer */
    char buf[RIO_BUFSIZE]; /* Internal buffer */
    rio_ptr prev;          /* Next element in stack */
};

RIO在 CS:APP 第 10.5 章提及,全名為 Robust I/O
Ref: System-Level I/O (preview version)

並在 push_file 函式中把開啟檔案的 file descriptor 放進 struct RIO_ELE 中的 fd,若無指定開啟檔案的路徑則選擇為標準輸入 (也就是 stdin)

static bool push_file(char *fname)
{
    int fd = fname ? open(fname, O_RDONLY) : STDIN_FILENO;
    if (fd < 0)
        return false;
    if (fd > fd_max)
        fd_max = fd;
    rio_ptr rnew = malloc_or_fail(sizeof(rio_t), "push_file");
    rnew->fd = fd;
    rnew->cnt = 0;
    rnew->bufptr = rnew->buf;
    rnew->prev = buf_stack;
    buf_stack = rnew;
    return true;
}

關於記憶體測試機制

如何偵測記憶體洩漏

harness.[ch] 中藉由替換掉 mallocfree 的實作方式使得可以檢查到 allocation error

harness.h 中可以見到 qtest 使用 macro 來把 mallocfree 替換成 test_malloctest_free

/* Tested program use our versions of malloc and free */
#define malloc test_malloc
#define free test_free

用 marco 來把函式庫替換成自定義的函式雖然是個簡單方便的用法,但是若使用其他的函式庫裡面有呼叫到 mallocfree 就無從追蹤了,如 strdup 就是一個例子。若想解決這個問題則可以使用 dynamic linker 來抽換掉函式庫中呼叫的 mallocfree

而從 test_malloc 所配置出來的記憶體都會被紀錄在 struct BELE 中,structure 在 harness.c 中定義如下並以 doubly linked list 來串接節點

typedef struct BELE {
    struct BELE *next;
    struct BELE *prev;
    size_t payload_size;
    size_t magic_header; /* Marker to see if block seems legitimate */
    unsigned char payload[0];
    /* Also place magic number at tail of every block */
} block_ele_t;

在 structure 中 payload[0] 是一個 struct hack,在 GCC manual 中稱呼為 Arry of Length Zero,藉由這種方式可以做到讓 structure 中的成員可以動態的配置記憶體又不用因為多次呼叫 malloc 造成記憶體碎片化

用法如下

block_ele_t *new_block = malloc(array_size + sizeof(block_ele_t));

這樣的手法其實就是利用 GNU C compiler 預設不會去檢查 Array index out of bounds 的問題來做到的,但是使用這個技巧需要思考以下的問題

  • Flexible array members 需為 incomplete type, 並且 sizeof 不能用在 incomplete type 上
  • Flexible array members 需宣告在最後一個 non-empty structure 成員上
  • Flexible array members 不能單獨出現,至少需一個成員在
  • Flexible array members 本身不可以宣告或包含在 union 中
  • Clang 不支援該 GNU extension

    clang does not support the gcc extension that allows variable-length arrays in structures. This is for a few reasons: one, it is tricky to implement, two, the extension is completely undocumented, and three, the extension appears to be rarely used.

另外在 ISO C90 可以使用 Array of Length One 來做到同樣的事情,而從 ISO C99 開始支援 Flexible Array Members,請參照

  • Wikipedia
  • C99 §6.7.2.1, item 16

    As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.

如何配置記憶體

test_malloc 把經由 malloc 配置出來的新的記憶體空間,將其串連到 doubly linked list allocated 中,其中配置出記憶體空間的寫法如下

block_ele_t *new_block = malloc(size + sizeof(block_ele_t) + sizeof(size_t));

可以看到除了 sizesizeof(block_ele_t) 以外,還多了 sizeof(size_t),這是因為在 qtest 中,藉由在尾端多配置出一個空間並填入 magic number 來查看若當該數值有被更動過的話,即表示出現 memory corruption (array access out of bounds),以及偵測是否是由 test_malloc 配置出來的記憶體空間,而前一個成員 magic_header 也是基於此用途的。harness.c 中的 test_malloc 節錄如下

void *test_malloc(size_t size)
{
    if (fail_allocation()) {
        report_event(MSG_WARN, "Malloc returning NULL");
        return NULL;
    }
    block_ele_t *new_block =
        malloc(size + sizeof(block_ele_t) + sizeof(size_t));
    ...
    new_block->magic_header = MAGICHEADER;
    new_block->payload_size = size;
    *find_footer(new_block) = MAGICFOOTER;
    void *p = (void *) &new_block->payload;
    memset(p, FILLCHAR, size);
    new_block->next = allocated;
    new_block->prev = NULL;
    if (allocated)
        allocated->prev = new_block;
    allocated = new_block;
    allocated_count++;
    return p;
}

fail_allocation 中,實作了當亂數產生出來的數小於 0.01 乘上預先設定好的失敗機率 (fail_probability) 的話,即 malloc failure,為何需要如此實作?

MetalheadKen

其中為了方便取得到 payload 的尾端來指派 magic number (MAGICFOOTER),qtest 實作了 find_footer 函式,如下

static size_t *find_footer(block_ele_t *b) { size_t *p = (size_t *) ((size_t) b + b->payload_size + sizeof(block_ele_t)); return p; }

藉由傳進來的 block_ele_t* 的開頭,在加上 payload size 和該 structure 本身的大小來取得到 payload 的尾端

如何釋放記憶體空間

test_free 會把要釋放的記憶體空間從 doubly linked list allocated 移除,並在呼叫 free 來釋放該記憶體空間。該函式節錄如下

void test_free(void *p)
{
    ...
    block_ele_t *b = find_header(p);
    size_t footer = *find_footer(b);
    if (footer != MAGICFOOTER) {
        report_event(MSG_ERROR,
                     "Corruption detected in block with address %p when "
                     "attempting to free it",
                     p);
        error_occurred = true;
    }
    b->magic_header = MAGICFREE;
    *find_footer(b) = MAGICFREE;
    memset(p, FILLCHAR, b->payload_size);

    /* Unlink from list */
    block_ele_t *bn = b->next;
    block_ele_t *bp = b->prev;
    if (bp)
        bp->next = bn;
    else
        allocated = bn;
    if (bn)
        bn->prev = bp;

    free(b);
    allocated_count--;
}

依照上述程式碼可以發現到呼叫 free 來釋放記憶體空間前會先把 magic number 指派為 MAGICFREE,並把 payload 清為 FILLCHAR,是因為在 glibc 的 mallocfree 實作中為了加速等原因,free 並不會清除該記憶體空間,且也並不會釋放給 OS,僅僅是修改該 memory chunk 的資訊以便之後再次使用

Ref: Malloc Tutorial

由於呼叫 free 的傳進來的參數為該記憶體空間 payloadqtest 為了藉由 payload 取得到 block_ele_t* 的開頭,使得可以之後呼叫 free 來釋放整塊記憶體空間,實作了 find_header ,該函式節錄如下

static block_ele_t *find_header(void *p)
{
    ...
    block_ele_t *b = (block_ele_t *) ((size_t) p - sizeof(block_ele_t));
    if (cautious_mode) {
        /* Make sure this is really an allocated block */
        block_ele_t *ab = allocated;
        bool found = false;
        while (ab && !found) {
            found = ab == b;
            ab = ab->next;
        }
        if (!found) {
            report_event(MSG_ERROR,
                         "Attempted to free unallocated block.  Address = %p",
                         p);
            error_occurred = true;
        }
    }
    if (b->magic_header != MAGICHEADER) {
        report_event(
            MSG_ERROR,
            "Attempted to free unallocated or corrupted block.  Address = %p",
            p);
        error_occurred = true;
    }
    return b;
}

關於例外處理

如何送出例外

總的來說,qtest 是藉由送出 signal 和使用 signal handler 來處理 exception

qtest.cmain 函式中可以發現呼叫了 queue_init 來設定 signal 和 signal handler,如下

static void queue_init()
{
    fail_count = 0;
    q = NULL;
    signal(SIGSEGV, sigsegvhandler);
    signal(SIGALRM, sigalrmhandler);
}

signal 函式簡單來說就是當收到 signal number 時會呼叫相對應的函式,SIGSEGVSIGALRM 意思如下

Signal Value Comment
SIGSEGV 11 Invalid memory reference
SIGALRM 14 Timer signal from alarm()

Ref: man pages

而對應 signal 的 handler 如下

/* Signal handlers */
void sigsegvhandler(int sig)
{
    trigger_exception(
        "Segmentation fault occurred.  You dereferenced a NULL or invalid "
        "pointer");
}

void sigalrmhandler(int sig)
{
    trigger_exception(
        "Time limit exceeded.  Either you are in an infinite loop, or your "
        "code is too inefficient");
}

其中,sigsegvhandlersigalrmhandler 皆會呼叫到 trigger_exception,該函式實作如下

void trigger_exception(char *msg)
{
    error_occurred = true;
    error_message = msg;
    if (jmp_ready)
        siglongjmp(env, 1);
    else
        exit(1);
}

可以發現到當進行 signal handler 時該函式會呼叫 siglongjmp,而 siglongjmp 會跳到先前呼叫 sigsetjmp 的地方,為在 harness.cexception_setup 當中

bool exception_setup(bool limit_time)
{
    if (sigsetjmp(env, 1)) {
        /* Got here from longjmp */
        jmp_ready = false;
        if (time_limited) {
            alarm(0);
            time_limited = false;
        }
        if (error_message) {
            report_event(MSG_ERROR, error_message);
        }
        error_message = "";
        return false;
    } else {
        /* Got here from initial call */
        jmp_ready = true;
        if (limit_time) {
            alarm(time_limit);
            time_limited = true;
        }
        return true;
    }
}

alarm(0) 會把正在 pending 的 alarm 給取消掉,而 report_event 是在 report.c 的函式,該用途為回報錯誤訊息

如何偵測已超過指定的時間

在上面的 exception_setup 中可以發現在呼叫 sigsetjmp 後會把 jmp_ready 指派為 true,並呼叫 alarm(time_limit) 來設定 alarm clock

且對應的函式為 exception_cancel,該函式的用途為把先前設定好的 alarm 給取消掉,並把 jmp_ready 指派為 false

void exception_cancel()
{
    if (time_limited) {
        alarm(0);
        time_limited = false;
    }
    jmp_ready = false;
    error_message = "";
}

其中以 qtest.c 中的 do_reverse 為例,可以發現到在呼叫 exception_setup 後呼叫 q_reverse 來進行 queue 反轉,但若沒在設定 alarm 的時間內 (time_limited) 呼叫 exception_cancel 來把 alarm 關掉,在超過時間後即會發送 SIGALRM,即為超時

bool do_reverse(int argc, char *argv[])
{
    ...
    error_check();
    set_noallocate_mode(true);
    if (exception_setup(true))
        q_reverse(q);
    exception_cancel();
    set_noallocate_mode(false);
    show_queue(3);
    return !error_check();
}