2020q1 Homework1 (lab0)

contributed by < frankchang0125 >

題目出處：

開發環境

採用 MacBook Pro + VirtualBox (Ubuntu 18.04)，環境都是在 Ubuntu 上做開發。

$ uname -a
Linux frankchang-ubuntu 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

$ gcc --version
gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE

$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               69
Model name:          Intel(R) Core(TM) i5-4288U CPU @ 2.60GHz
Stepping:            1
CPU MHz:             2600.000
BogoMIPS:            5200.00
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            3072K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm invpcid_single pti fsgsbase avx2 invpcid md_clear flush_l1d

作業要求

依據指示著手修改 queue.[ch] 和連帶的檔案：

q_new(): Create empty queue.
q_free(): Free all storage used by queue.
q_insert_head(): Attempt to insert element at head of queue.
q_insert_tail(): Attempt to insert element at tail of queue.
q_remove_head(): Attempt to insert element at tail of queue.
q_size(): Return number of elements in queue.
q_reverse(): Reverse elements in queue. This function should not allocate or free any list elements. It should rearrange the existing ones.
q_sort(): Sort elements of queue in ascending order.

實作

依照 C Programming Lab: Assessing Your C Programming Skills 的說明，q_insert_tail 及 q_size 的時間複雜度必須為

O (1)

，因此在 queue_t 中新增 list_ele_t *tail 及 int size 來指向目前 queue 中最後一個節點及目前的 queue 的大小：





typedef struct {
    list_ele_t *head;       /* Head node of linked list */
    list_ele_t *tail;       /* Tail node of linked list */
    int size;               /* Size of linked list */
} queue_t;

q_new()


















/*
 * Create empty queue.
 * Return NULL if could not allocate space.
 */
queue_t *q_new()
{
    queue_t *q = malloc(sizeof(queue_t));

    if (!q) {
        return NULL;
    }

    q->head = NULL;
    q->tail = NULL;
    q->size = 0;

    return q;
}

分配 queue 記憶體空間。
對 q->head、q->tail 及 q->size 初始化。

q_free()




















/* Free all storage used by queue */
void q_free(queue_t *q)
{
    if (!q) {
        return;
    }

    list_ele_t *cur = q->head;
    list_ele_t *node;

    while (cur) {
        node = cur;
        cur = cur->next;
        free(node->value);
        free(node);
    }

    /* Free queue structure */
    free(q);
}

判斷 q 是否為 NULL。
在 q_free() 中，透過 while 迴圈，一個一個造訪 queue 中的每個 node 並將其所佔用的記憶體空間給釋放。在釋放 node 本身的記憶體空間前，要先釋放其 value 字串所佔的空間。當所有 nodes 所佔用的記憶體空間都釋放完後，最後再釋放 queue 本身所佔的記憶體空間。

q_allocate_node()

由於分配 node 記憶體空間除了在 q_insert_head() 中會用到，也會在 q_insert_tail() 中用到。因此我特別多定義了一個 q_allocate_node() 來統一處理：




































/*
 * Allocate node space for given string.
 * Return NULL if s is NULL, empty string
 * or could not allocate space.
 * Otherwise, return the address of the allocated node.
 */
list_ele_t *q_allocate_node(char *s)
{
    if (!s) {
        return NULL;
    }

    size_t s_len = strlen(s) + 1;

    if (s_len == 1) {
        return NULL;
    } 

    list_ele_t *node = malloc(sizeof(list_ele_t));

    if (!node) {
        return NULL;
    }

    node->next = NULL;
    node->value = malloc(s_len);

    if (!node->value) {
        free(node);
        return NULL;
    }

    memcpy(node->value, s, s_len);

    return node;
}

檢查所傳入的 char *s 是否為 NULL 或是空字串，若為真，則直接 return NULL。
分配 node 及其 value 字串的記憶體空間。特別要注意的是如果在分配 value 字串記憶體空間失敗的時候，需要在 return NULL 之前，把已經分配給 node 的記憶體空間給釋放掉，否則會有 memory leak 的情況。

q_insert_head()






























/*
 * Attempt to insert element at head of queue.
 * Return true if successful.
 * Return false if q is NULL or could not allocate space.
 * Argument s points to the string to be stored.
 * The function must explicitly allocate space and copy the string into it.
 */
bool q_insert_head(queue_t *q, char *s)
{
    if (!q) {
        return false;
    }

    list_ele_t *node = q_allocate_node(s);

    if (!node) {
        return false;
    }

    node->next = q->head;
    q->head = node;

    if (!q->tail) {
        q->tail = node;
    }

    q->size += 1;

    return true;
}

檢查 q 是否為 NULL。
呼叫 q_allocate_node() 來分配 node 及其 value 字串的記憶體空間。
將 node->next 指向目前的 q->head。
將 q->head 指向新增的 node。
如果 q->tail 為 NULL，代表原先 queue 中並沒有任何的 nodes，因此 q->tail 也應指向本次所新增的 node。
將 q->size 加 1。

q_insert_tail()
































/*
 * Attempt to insert element at tail of queue.
 * Return true if successful.
 * Return false if q is NULL or could not allocate space.
 * Argument s points to the string to be stored.
 * The function must explicitly allocate space and copy the string into it.
 */
bool q_insert_tail(queue_t *q, char *s)
{
    if (!q) {
        return false;
    }

    list_ele_t *node = q_allocate_node(s);

    if (!node) {
        return false;
    }

    if (!q->head) {
        q->head = node;
    }

    if (q->tail) {
        q->tail->next = node;
    }

    q->tail = node;
    q->size += 1;
    
    return true;
}

檢查 q 是否為 NULL。
呼叫 q_allocate_node() 來分配 node 及其 value 字串的記憶體空間。
如果 q->head 為 NULL，代表原先 queue 中並沒有任何的 nodes，因此 q->head 也應指向本次所新增的 node。
如果 q->tail 不為 NULL，則將目前的 q->tail node 的 next pointer 指向本次所新增的 node。
將 q->tail 更新為本次所新增的 node。
將 q->size 加 1。

q_remove_head()
































/*
 * attempt to remove element from head of queue.
 * Return true if successful.
 * Return false if queue is NULL or empty.
 * If sp is non-NULL and an element is removed, copy the removed string to *sp
 * (up to a maximum of bufsize-1 characters, plus a null terminator.)
 * The space used by the list element and the string should be freed.
 */
bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
    if (!q || !q->head) {
        return false;
    }

    list_ele_t *head = q->head;
    q->head = q->head->next;

    if (q->tail == head) {
        q->tail = NULL;
    }

    if (sp) {
        strncpy(sp, head->value, bufsize - 1);
        sp[bufsize - 1] = '\0';
    }

    free(head->value);
    free(head);
    q->size -= 1;
    
    return true;
}

檢查 q 是否為 NULL，及是否有 head node 可以被移除。
宣告一新的指標指向目前的 q->head，以便在最後釋放其記憶體空間；更新 q->head 為下一個 node (or NULL)。
如果 q->tail == q->head，代表目前 queue 終止有一個 node，因此必須將 q->tail 也設為 NULL。
複製欲被移除的 head node 的 value 字串至 *sp，最多 bufsize - 1 個字元 + NULL terminator。
釋放 head node 及其 value 字串所佔的記憶體空間。

q_size()











/*
 * Return number of elements in queue.
 * Return 0 if q is NULL or empty
 */
int q_size(queue_t *q)
{
    if (!q) {
        return 0;
    }
    return q->size;
}

檢查 q 是否為 NULL，若為 NULL (代表 new 指令沒有被執行) 則直接回傳 0。
若 q 不為 NULL，則直接回傳 q->size (
$O (1)$ 時間複雜度)。

q_reverse()




























/*
 * Reverse elements in queue
 * No effect if q is NULL or empty
 * This function should not allocate or free any list elements
 * (e.g., by calling q_insert_head, q_insert_tail, or q_remove_head).
 * It should rearrange the existing ones.
 */
void q_reverse(queue_t *q)
{
    if (!q || !q->head) {
        return;
    }

    list_ele_t *prev, *cur, *next;

    prev = NULL;
    cur = q->head;
    q->tail = q->head;

    while (cur) {
        next = cur->next;
        cur->next = prev;
        prev = cur;
        cur = next;
    }
    
    q->head = prev;
}

檢查 q 是否為 NULL 或是 empty queue。
透過 prev 記錄前一個 node 及 next 記錄下一個 node，並依序將 cur 的 next pointer 指向 prev。最後的 prev 即為原 queue 最後的 node，因此最後將 q->head 指向 prev 即完成 queue 的反轉。

q_sort()























/*
 * Sort elements of queue in ascending order
 * No effect if q is NULL or empty. In addition, if q has only one
 * element, do nothing.
 */
void q_sort(queue_t *q)
{
    if (!q || !q->head) {
        return;
    }

    if (q->head == q->tail) {
        return;
    }

    q->head = merge_sort(q->head);

    // 更新 q->tail
    q->tail = q->head;
    while (q->tail && q->tail->next) {
        q->tail = q->tail->next;
    }
}

merge_sort()




























list_ele_t *merge_sort(list_ele_t *head)
{
    if (!head || !head->next) {
        return head;
    }

    /* Use fast/slow pointers to split list
     * into left and right parts.
     */
    list_ele_t *fast = head->next;
    list_ele_t *slow = head;

    while (fast && fast->next) {
        fast = fast->next->next;
        slow = slow->next;
    }

    list_ele_t *left = head;
    list_ele_t *right = slow->next;
    slow->next = NULL;

    /* Perform merge sort on left part and right part lists separately. */
    left = merge_sort(left);
    right = merge_sort(right);

    /* Merge the sorted left part and right part lists into one list. */
    return merge(left, right);
}

merge()
















































list_ele_t *merge(list_ele_t *left, list_ele_t *right)
{
    if (!left) {
        return right;
    }

    if (!right) {
        return left;
    }

    list_ele_t *result = NULL;
    list_ele_t *cur = NULL;

    // Initialize, compare left and right list's first node's value
    // string length, point result to the node with shorter string length
    // determined by strcasecmp().
    if (strcasecmp(left->value, right->value) < 1) {
        result = cur = left;
        left = left->next;
    } else {
        result = cur = right;
        right = right->next;
    }
    
    while (left && right) {
        if (strcasecmp(left->value, right->value) < 1) {
            cur->next = left;
            left = left->next;
        } else {
            cur->next = right;
            right = right->next;
        }

        cur = cur->next;
    }

    // Left list has nodes left.
    if (left) {
        cur->next = left;
    }

    // Right list has nodes left.
    if (right) {
        cur->next = right;
    }

    return result;
}

q_sort() 使用 merge sort 來排序 queue 中的 nodes。一開始透過快慢 pointer 的方式找出 linked list 中的 middle node，將其切分為左半部及右半部的 lists 分別做 merge sort (i.e. Divide and conquer)。
Merge sort 的最好/平均/最差的時間複雜度都是
$O (n l o g n)$ 。一般 array 透過 merge sort 排序的時候都需要額外分配同樣長度的陣列當作暫存，因此空間複雜度為
$O (n)$ ，而這也是為何一般的 sort() 底層都不會是採用 merge sort (實務上，會根據資料特性，e.g. 資料長度，混和多個不同的 sort algorithms)。但由於我們這邊排序的是 linked list，只需要改變 next pointer 所指向的 node 即可，因此空間複雜度僅為
$O (1)$ 。

執行結果

$ make test
scripts/driver.py
---     Trace           Points
+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
---     trace-01-ops    6/6
+++ TESTING trace trace-02-ops:
# Test of insert_head, insert_tail, and remove_head
---     trace-02-ops    6/6
+++ TESTING trace trace-03-ops:
# Test of insert_head, insert_tail, reverse, and remove_head
---     trace-03-ops    6/6
+++ TESTING trace trace-04-ops:
# Test of insert_head, insert_tail, size, and sort
---     trace-04-ops    6/6
+++ TESTING trace trace-05-ops:
# Test of insert_head, insert_tail, remove_head, reverse, size, and sort
---     trace-05-ops    5/5
+++ TESTING trace trace-06-string:
# Test of truncated strings
---     trace-06-string 6/6
+++ TESTING trace trace-07-robust:
# Test operations on NULL queue
---     trace-07-robust 6/6
+++ TESTING trace trace-08-robust:
# Test operations on empty queue
---     trace-08-robust 6/6
+++ TESTING trace trace-09-robust:
# Test remove_head with NULL argument
---     trace-09-robust 6/6
+++ TESTING trace trace-10-malloc:
# Test of malloc failure on new
---     trace-10-malloc 6/6
+++ TESTING trace trace-11-malloc:
# Test of malloc failure on insert_head
---     trace-11-malloc 6/6
+++ TESTING trace trace-12-malloc:
# Test of malloc failure on insert_tail
---     trace-12-malloc 6/6
+++ TESTING trace trace-13-perf:
# Test performance of insert_tail
---     trace-13-perf   6/6
+++ TESTING trace trace-14-perf:
# Test performance of size
---     trace-14-perf   6/6
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, reverse, and sort
---     trace-15-perf   6/6
+++ TESTING trace trace-16-perf:
# Test performance of sort with random and descending orders
# 10000: all correct sorting algorithms are expected pass
# 50000: sorting algorithms with O(n^2) time complexity are expected failed
# 100000: sorting algorithms with O(nlogn) time complexity are expected pass
---     trace-16-perf   6/6
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
Probably constant time
Probably constant time
---     trace-17-complexity     5/5
---     TOTAL           100/100

TODO:
看到別人 MetalheadKen 的作業有提到 Linus Torvalds 所分享的 Good taste of codes。改寫本作業 linked list 的實作使其符合 Good taste of codes 規範。

把人名或代號標注出來，不要說「別人」
:notes: jserv

已更正
Frank ChangMar 01 2020

Natsort

思考 nature sort order 在什麼場合用得到？提示: 思考 Linux 核心版本號碼
:notes: jserv

採用 natsort 來比對字串。如同 natsort 說明所述，在排序以下字串：rfc1.txt、rfc2086.txt、rfc822.txt 時，natsort 的排序結果將會是：rfc1.txt -> rfc822.txt -> rfc2086.txt，也就是我們 "人類" 比較習慣的排序方式。而一般的 strcmp() 結果則會是：rfc1.txt ->rfc2086.txt -> rfc822.txt。

將 strnatcmp.h 及 strnatcmp.c 加入專案中。
將 strnatcmp.o 加入 Makefile compile objects (OBJS) 中：

OBJS := qtest.o report.o console.o harness.o queue.o \
        random.o dudect/constant.o dudect/fixture.o dudect/ttest.o \
        strnatcmp.o

在 queue.c 中 include strnatcmp.h：


#include "strnatcmp.h"

將 merge sort 中 merge() 的 strcasecmp() 替換為 strnatcasecmp()：







    if (strnatcasecmp(left->value, right->value) < 1) {
        result = cur = left;
        left = left->next;
    } else {
        result = cur = right;
        right = right->next;
    }











    while (left && right) {
        if (strnatcasecmp(left->value, right->value) < 1) {
            cur->next = left;
            left = left->next;
        } else {
            cur->next = right;
            right = right->next;
        }

        cur = cur->next;
    }

新增 traces 測試案例。在 natsort repo 中已經有其所使用的 test cases 了。將其複製至 traces/ 下，並加上指令，也就是將：





2000-1-10
2000-1-2
1999-12-25
2000-3-23
1999-3-3

修改為：








new
ih 2000-1-10
ih 2000-1-2
ih 1999-12-25
ih 2000-3-23
ih 1999-3-3
sort
free

修改 scripts/driver.py，新增測試案例：

a. 修改 traceDict：












    traceDict = {
        1: "trace-01-ops",
        2: "trace-02-ops",
        3: "trace-03-ops",
        4: "trace-04-ops",
        ...
        18: "trace-18-dates",
        19: "trace-19-debs",
        20: "trace-20-debvers",
        21: "trace-21-fractions",
        22: "trace-22-versions"
}

b. 修改 traceProbs：












    traceProbs = {
        1: "Trace-01",
        2: "Trace-02",
        3: "Trace-03",
        4: "Trace-04",
        ...
        18: "Trace-18",
        19: "Trace-19",
        20: "Trace-20",
        21: "Trace-21",
        22: "Trace-22"
}

c. 修改 maxScores


    maxScores = [0, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5,
                 6, 6, 6, 6, 6]

新增五組 tests 的分數 (各為 6 分)。

再次執行 make test：

$ make test
scripts/driver.py
---     Trace           Points
+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
---     trace-01-ops    6/6
+++ TESTING trace trace-02-ops:
# Test of insert_head, insert_tail, and remove_head
---     trace-02-ops    6/6
+++ TESTING trace trace-03-ops:
# Test of insert_head, insert_tail, reverse, and remove_head
---     trace-03-ops    6/6
+++ TESTING trace trace-04-ops:
# Test of insert_head, insert_tail, size, and sort
---     trace-04-ops    6/6
+++ TESTING trace trace-05-ops:
# Test of insert_head, insert_tail, remove_head, reverse, size, and sort
---     trace-05-ops    5/5
+++ TESTING trace trace-06-string:
# Test of truncated strings
---     trace-06-string 6/6
+++ TESTING trace trace-07-robust:
# Test operations on NULL queue
---     trace-07-robust 6/6
+++ TESTING trace trace-08-robust:
# Test operations on empty queue
---     trace-08-robust 6/6
+++ TESTING trace trace-09-robust:
# Test remove_head with NULL argument
---     trace-09-robust 6/6
+++ TESTING trace trace-10-malloc:
# Test of malloc failure on new
---     trace-10-malloc 6/6
+++ TESTING trace trace-11-malloc:
# Test of malloc failure on insert_head
---     trace-11-malloc 6/6
+++ TESTING trace trace-12-malloc:
# Test of malloc failure on insert_tail
---     trace-12-malloc 6/6
+++ TESTING trace trace-13-perf:
# Test performance of insert_tail
---     trace-13-perf   6/6
+++ TESTING trace trace-14-perf:
# Test performance of size
---     trace-14-perf   6/6
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, reverse, and sort
ERROR: Time limit exceeded.  Either you are in an infinite loop, or your code is too inefficient
---     trace-15-perf   0/6
+++ TESTING trace trace-16-perf:
# Test performance of sort with random and descending orders
# 10000: all correct sorting algorithms are expected pass
# 50000: sorting algorithms with O(n^2) time complexity are expected failed
# 100000: sorting algorithms with O(nlogn) time complexity are expected pass
---     trace-16-perf   6/6
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
Probably constant time
Probably constant time
---     trace-17-complexity     5/5
+++ TESTING trace trace-18-dates:
ERROR: Not sorted in ascending order
---     trace-18-dates  0/6
+++ TESTING trace trace-19-debs:
ERROR: Not sorted in ascending order
---     trace-19-debs   0/6
+++ TESTING trace trace-20-debvers:
ERROR: Not sorted in ascending order
---     trace-20-debvers        0/6
+++ TESTING trace trace-21-fractions:
---     trace-21-fractions      6/6
+++ TESTING trace trace-22-versions:
ERROR: Not sorted in ascending order
---     trace-22-versions       0/6
---     TOTAL           100/130

可以觀察到下列幾點：

由 natsort 來比對字串。所新增的 test cases (18 ~ 22)，測試結果皆失敗 (ERROR: Not sorted in ascending order)，這是因為 qtest.c 中的 q_sort() 所使用的字串比對函式為 strcasecmp()：




        for (list_ele_t *e = q->head; e && --cnt; e = e->next) {
            /* Ensure each element in ascending order */
            /* FIXME: add an option to specify sorting order */
            if (strcasecmp(e->value, e->next->value) > 0) {

將其換為 natsort 的 strnatcasecmp() 即可通過 test cases 18 ~ 22：

$ make test
scripts/driver.py
---     Trace           Points
+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
---     trace-01-ops    6/6
+++ TESTING trace trace-02-ops:
# Test of insert_head, insert_tail, and remove_head
---     trace-02-ops    6/6
+++ TESTING trace trace-03-ops:
# Test of insert_head, insert_tail, reverse, and remove_head
---     trace-03-ops    6/6
+++ TESTING trace trace-04-ops:
# Test of insert_head, insert_tail, size, and sort
---     trace-04-ops    6/6
+++ TESTING trace trace-05-ops:
# Test of insert_head, insert_tail, remove_head, reverse, size, and sort
---     trace-05-ops    5/5
+++ TESTING trace trace-06-string:
# Test of truncated strings
---     trace-06-string 6/6
+++ TESTING trace trace-07-robust:
# Test operations on NULL queue
---     trace-07-robust 6/6
+++ TESTING trace trace-08-robust:
# Test operations on empty queue
---     trace-08-robust 6/6
+++ TESTING trace trace-09-robust:
# Test remove_head with NULL argument
---     trace-09-robust 6/6
+++ TESTING trace trace-10-malloc:
# Test of malloc failure on new
---     trace-10-malloc 6/6
+++ TESTING trace trace-11-malloc:
# Test of malloc failure on insert_head
---     trace-11-malloc 6/6
+++ TESTING trace trace-12-malloc:
# Test of malloc failure on insert_tail
---     trace-12-malloc 6/6
+++ TESTING trace trace-13-perf:
# Test performance of insert_tail
---     trace-13-perf   6/6
+++ TESTING trace trace-14-perf:
# Test performance of size
---     trace-14-perf   6/6
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, reverse, and sort
ERROR: Time limit exceeded.  Either you are in an infinite loop, or your code is too inefficient
---     trace-15-perf   0/6
+++ TESTING trace trace-16-perf:
# Test performance of sort with random and descending orders
# 10000: all correct sorting algorithms are expected pass
# 50000: sorting algorithms with O(n^2) time complexity are expected failed
# 100000: sorting algorithms with O(nlogn) time complexity are expected pass
---     trace-16-perf   6/6
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
Probably constant time
Probably constant time
---     trace-17-complexity     5/5
+++ TESTING trace trace-18-dates:
---     trace-18-dates  6/6
+++ TESTING trace trace-19-debs:
---     trace-19-debs   6/6
+++ TESTING trace trace-20-debvers:
---     trace-20-debvers        6/6
+++ TESTING trace trace-21-fractions:
---     trace-21-fractions      6/6
+++ TESTING trace trace-22-versions:
---     trace-22-versions       6/6
---     TOTAL           124/130

trace-15-perf 超時 (ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient)。在交叉比對後，發現是在引入 natsort 後所造成。需將 harness.c 中的 time_limit 值增加至 2 以上才能通過。

TODO:
研究 natsort 的 strnatcmp0() 是否能提升其演算法效率。

使用 Valgrind 排除 `qtest` 實作的記憶體錯誤

使用 make valgrind 指令可以使用 Valgrind 分析我們的程式是否有 memory leak 的問題。

Makefile 分析

valgrind_existence:
	@which valgrind 2>&1 > /dev/null || (echo "FATAL: valgrind not found"; exit 1)

valgrind: valgrind_existence
	# Explicitly disable sanitizer(s)
	$(MAKE) clean SANITIZER=0 qtest
	$(eval patched_file := $(shell mktemp /tmp/qtest.XXXXXX))
	cp qtest $(patched_file)
	chmod u+x $(patched_file)
	sed -i "s/alarm/isnan/g" $(patched_file)
	scripts/driver.py -p $(patched_file) --valgrind
	@echo
	@echo "Test with specific case by running command:" 
	@echo "scripts/driver.py -p $(patched_file) --valgrind -t <tid>"

檢查 valgrind 指令是否存在。
關閉 SANITIZER，並重新編譯 qtest。
因為在跑 Valgrind 的時候我們必須將 alarm() 給取消，因此複製所編譯出來的 qtest，至 /tmp 資料夾，並透過 sed 將 qtest binary 中的 alarm 函式呼叫全部替換為 isnan。
執行 driver.py，並加上 --valgrind 參數，在 driver.py 中就會使用 Valgrind 來執行 qtest。

.valgrindrc 分析

可以閱讀 Valgrind manual 2.6.7 對於 .valgrindrc 的說明：

Note that Valgrind also reads options from three places:

The file ~/.valgrindrc

The environment variable $VALGRIND_OPTS

The file ./.valgrindrc

These are processed in the given order, before the command-line options. Options processed later override those processed earlier; for example, options in ./.valgrindrc will take precedence over those in ~/.valgrindrc.

Any tool-specific options put in $VALGRIND_OPTS or the .valgrindrc files should be prefixed with the tool name and a colon. For example, if you want Memcheck to always do leak checking, you can put the following entry in ~/.valgrindrc:

--memcheck:leak-check=yes

This will be ignored if any tool other than Memcheck is run. Without the memcheck: part, this will cause problems if you select other tools that don't understand –leak-check=yes.

而 qtest 的 .valgrindrc 內容如下：

--quiet
--memcheck:leak-check=full
--show-leak-kinds=all
--error-exitcode=1

-q –quiet: Run silently, and only print error messages. Useful if you are running regression tests or have some other automated test machinery.
leak-check=<no|summary|yes|full> [default: summary]: When enabled, search for memory leaks when the client program finishes. If set to summary, it says how many leaks occurred. If set to full or yes, each individual leak will be shown in detail and/or counted as an error, as specified by the options –show-leak-kinds and –errors-for-leak-kinds.
–show-leak-kinds=<set> [default: definite,possible]: Specifies the leak kinds to show in a full leak search, in one of the following ways:
- a comma separated list of one or more of definite indirect possible reachable.
- all to specify the complete set (all leak kinds). It is equivalent to –show-leak-kinds=definite,indirect,possible,reachable.
- none for the empty set.
–error-exitcode=<number> [default: 0]: Specifies an alternative exit code to return if Valgrind reported any errors in the run. When set to the default value (zero), the return value from Valgrind will always be the return value of the process being simulated. When set to a nonzero value, that value is returned instead, if Valgrind detects any errors. This is useful for using Valgrind as part of an automated test suite, since it makes it easy to detect test cases for which Valgrind has reported errors, just by inspecting return codes.

分析 `qtest`

在原先版本的 queue.c 的 q_remove_head() 中，原先複製字串至 *sp 我是使用 memcpy() 的：




    if (sp) {
        memcpy(sp, head->value, bufsize - 1);
        sp[bufsize - 1] = '\0';
    }

不過在使用 Valgrind 的時候，會出現 Invalid read of size 8 的錯誤：

---     Trace           Points
+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
==8991== Invalid read of size 8
==8991==    at 0x4C368AC: memmove (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8991==    by 0x10CBBD: memcpy (string_fortified.h:34)
==8991==    by 0x10CBBD: q_remove_head (queue.c:135)
==8991==    by 0x10A129: do_remove_head (qtest.c:365)
==8991==    by 0x10B949: interpret_cmda (console.c:220)
==8991==    by 0x10BEBD: interpret_cmd (console.c:243)
==8991==    by 0x10C48B: cmd_select (console.c:571)
==8991==    by 0x10C6D3: run_console (console.c:630)
==8991==    by 0x10ADF8: main (qtest.c:771)
==8991==  Address 0x55ced98 is 8 bytes before a block of size 2,049 alloc'd
==8991==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8991==    by 0x109F87: do_remove_head (qtest.c:331)
==8991==    by 0x10B949: interpret_cmda (console.c:220)
==8991==    by 0x10BEBD: interpret_cmd (console.c:243)
==8991==    by 0x10C48B: cmd_select (console.c:571)
==8991==    by 0x10C6D3: run_console (console.c:630)
==8991==    by 0x10ADF8: main (qtest.c:771)
==8991== 
...

但在將 memcpy() 換成 strncpy() 後，錯誤就消失了。

原本一直想不透到底是什麼問題，直到看見 Naetw 作業的說明：

memcpy 會完全複製 bufsize - 1 bytes 到 sp 裡，但是 ptr->value 的空間長度可能遠小於 bufsize - 1，先前誤解了註解意思，以為要直接複製 bufsize - 1 bytes，實際上它只是最大值而已。將 memcpy 替換成 strncpy 這樣一來在遇到 NULL byte 時複製行為便會直接停止。

才理解原來是 memcpy() 和 strncpy() 針對 NULL terminator 的處理不同所致。

查看 memcpy GNU C Library 說明：

Function: void * memcpy (void *restrict to, const void *restrict from, size_t size)

    Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

    The memcpy function copies size bytes from the object beginning at from into the object beginning at to. The behavior of this function is undefined if the two arrays to and from overlap; use memmove instead if overlapping is possible.

    The value returned by memcpy is the value of to.

    Here is an example of how you might use memcpy to copy the contents of an array:

    struct foo *oldarray, *newarray;
    int arraysize;
    …
    memcpy (new, old, arraysize * sizeof (struct foo));

strcpy、strncpy GNU C Library 的說明：

Function: char * strcpy (char *restrict to, const char *restrict from)

    Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

    This copies bytes from the string from (up to and including the terminating null byte) into the string to. Like memcpy, this function has undefined results if the strings overlap. The return value is the value of to.


Function: char * strncpy (char *restrict to, const char *restrict from, size_t size)

    Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

    This function is similar to strcpy but always copies exactly size bytes into to.

    If from does not contain a null byte in its first size bytes, strncpy copies just the first size bytes. In this case no null terminator is written into to.

    Otherwise from must be a string with length less than size. In this case strncpy copies all of from, followed by enough null bytes to add up to size bytes in all.

    The behavior of strncpy is undefined if the strings overlap.

    This function was designed for now-rarely-used arrays consisting of non-null bytes followed by zero or more null bytes. It needs to set all size bytes of the destination, even when size is much greater than the length of from. As noted below, this function is generally a poor choice for processing text.

由 strcpy 的說明我們可以看見：

This copies bytes from the string from (up to and including the terminating null byte) into the string to.

所以將 memcpy() 改為 strncpy() 才不會複製到多餘的 bytes。透過 Valgrind 我們的確可以找到原先所未設想到的記憶體溢漏 (or 錯誤存取) 情況。

Valgrind massif 分析

Valgrind massif 的說明可以參考 massif 的官方說明文件。

massif 做為 heap profiler，可以分析並記錄我們程式的 heap 使用量。

使用方式只需指定 --tool=massif 即可，e.g.：

valgrind --tool=massif prog

在程式執行完成後，會在該程式目錄下新增一個：massif.out.<你的程式 PID> 的檔案。我們可以再透過 ms_print 指令讀取該檔案，顯示 massif 分析的結果，e.g.：

ms_print massif.out.<你的程式 PID>

以 trace-13-perf.cmd 做為範例：

原先的 .valgrindrc 的 show-leak-kinds 前面並沒有加上 memcheck:，代表該指令只會在 memcheck 的 tool 下才會使用，因此指定 –tool 為 massif 時會出現：valgrind: Unknown option: show-leak-kinds=all 的錯誤訊息。將 .valgrindrc 中的 show-leak-kinds 改為：--memecheck:show-leak-kinds=all 即可避免此錯誤。

--------------------------------------------------------------------------------
Command:            /tmp/qtest.22gTV0 -f traces/trace-13-perf.cmd
Massif arguments:   (none)
ms_print arguments: massif.out.26686
--------------------------------------------------------------------------------


    MB
122.3^                                                 :                      
     |                                    ::::###::::::::                     
     |                                  ::::  #  :  :  ::@                    
     |                                ::::::  #  :  :  ::@:                   
     |                              :@::::::  #  :  :  ::@::                  
     |                            :::@::::::  #  :  :  ::@:::                 
     |                          :::::@::::::  #  :  :  ::@::::                
     |                         ::::::@::::::  #  :  :  ::@::::::              
     |                       ::::::::@::::::  #  :  :  ::@:::::::             
     |                     ::::::::::@::::::  #  :  :  ::@:::::::@            
     |                    :: ::::::::@::::::  #  :  :  ::@:::::::@:           
     |                 ::::: ::::::::@::::::  #  :  :  ::@:::::::@::          
     |               :::: :: ::::::::@::::::  #  :  :  ::@:::::::@:::         
     |             :@:::: :: ::::::::@::::::  #  :  :  ::@:::::::@::::        
     |           :::@:::: :: ::::::::@::::::  #  :  :  ::@:::::::@:::::@      
     |          ::::@:::: :: ::::::::@::::::  #  :  :  ::@:::::::@:::::@:     
     |       :::::::@:::: :: ::::::::@::::::  #  :  :  ::@:::::::@:::::@::    
     |      ::::::::@:::: :: ::::::::@::::::  #  :  :  ::@:::::::@:::::@:::   
     |    :@::::::::@:::: :: ::::::::@::::::  #  :  :  ::@:::::::@:::::@::::  
     |  :@:@::::::::@:::: :: ::::::::@::::::  #  :  :  ::@:::::::@:::::@::::@ 
   0 +----------------------------------------------------------------------->Mi
     0                                                                   919.3

Number of snapshots: 80
 Detailed snapshots: [4, 6, 16, 33, 41 (peak), 46, 57, 67, 77]

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  0              0                0                0             0            0
  1     11,754,735        2,922,096        2,375,811       546,285            0
  2     19,830,459        4,965,504        4,036,067       929,437            0
  3     33,996,363        8,549,904        6,948,379     1,601,525            0
  4     44,114,472       11,110,104        9,028,531     2,081,573            0
81.26% (9,028,531B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->81.17% (9,018,424B) 0x10D57E: test_malloc (harness.c:137)
| ->43.71% (4,856,040B) 0x10DDBD: q_allocate_node (queue.c:229)
| | ->43.71% (4,856,040B) 0x10DA7D: q_insert_head (queue.c:62)
| |   ->43.71% (4,856,040B) 0x109B08: do_insert_head (qtest.c:213)
| |     ->43.71% (4,856,040B) 0x10C3AD: interpret_cmda (console.c:220)
| |       ->43.71% (4,856,040B) 0x10C450: interpret_cmd (console.c:243)
| |         ->43.71% (4,856,040B) 0x10D03F: cmd_select (console.c:571)
| |           ->43.71% (4,856,040B) 0x10D395: run_console (console.c:630)
| |             ->43.71% (4,856,040B) 0x10B184: main (qtest.c:771)
| |               
| ->37.46% (4,162,320B) 0x10DDE7: q_allocate_node (queue.c:236)
| | ->37.46% (4,162,320B) 0x10DA7D: q_insert_head (queue.c:62)
| |   ->37.46% (4,162,320B) 0x109B08: do_insert_head (qtest.c:213)
| |     ->37.46% (4,162,320B) 0x10C3AD: interpret_cmda (console.c:220)
| |       ->37.46% (4,162,320B) 0x10C450: interpret_cmd (console.c:243)
| |         ->37.46% (4,162,320B) 0x10D03F: cmd_select (console.c:571)
| |           ->37.46% (4,162,320B) 0x10D395: run_console (console.c:630)
| |             ->37.46% (4,162,320B) 0x10B184: main (qtest.c:771)
| |               
| ->00.00% (64B) in 1+ places, all below ms_print's threshold (01.00%)
| 
->00.09% (10,107B) in 1+ places, all below ms_print's threshold (01.00%)

... (以下略)

Valgrind massif 會在程式執行期間製作多個 snapshots，顯示程式 heap memory 執行時期的使用量。其中：

: 代表 Normal snapshot.
@ 代表 Detailed snapshot.
# 代表 Peak snapshot.

Detailed snapshot 和 Peak snapshot 都會顯示當下 heap 使用量的詳細資訊，包含分配記憶體的 stack traces。

我們可以看到因為我們的程式有正確的釋放所佔用的記憶體空間，因此最終的記憶體使用量會回到 0。

若我們修改我們的 q_free()：





/* Free all storage used by queue */
void q_free(queue_t *q)
{
    return;
}

再次執行 Valgrind massif：

--------------------------------------------------------------------------------
Command:            /tmp/qtest.60n4QC -f traces/trace-13-perf.cmd
Massif arguments:   (none)
ms_print arguments: massif.out.23011
--------------------------------------------------------------------------------


    MB
122.3^                                                                       :
     |                                                    :::::::###::::::::::
     |                                                  :::::    #  :     :  :
     |                                               :::: :::    #  :     :  :
     |                                            ::@:::: :::    #  :     :  :
     |                                          ::::@:::: :::    #  :     :  :
     |                                       :::::::@:::: :::    #  :     :  :
     |                                    :::: :::::@:::: :::    #  :     :  :
     |                                 ::::::: :::::@:::: :::    #  :     :  :
     |                               ::: ::::: :::::@:::: :::    #  :     :  :
     |                            @::::: ::::: :::::@:::: :::    #  :     :  :
     |                         :@:@::::: ::::: :::::@:::: :::    #  :     :  :
     |                       :::@:@::::: ::::: :::::@:::: :::    #  :     :  :
     |                   :::::::@:@::::: ::::: :::::@:::: :::    #  :     :  :
     |                 :::: ::::@:@::::: ::::: :::::@:::: :::    #  :     :  :
     |              @:@:::: ::::@:@::::: ::::: :::::@:::: :::    #  :     :  :
     |           :::@:@:::: ::::@:@::::: ::::: :::::@:::: :::    #  :     :  :
     |         ::: :@:@:::: ::::@:@::::: ::::: :::::@:::: :::    #  :     :  :
     |      :::::: :@:@:::: ::::@:@::::: ::::: :::::@:::: :::    #  :     :  :
     |  :::::::::: :@:@:::: ::::@:@::::: ::::: :::::@:::: :::    #  :     :  :
   0 +----------------------------------------------------------------------->Mi
     0                                                                   628.8

Number of snapshots: 58
 Detailed snapshots: [1, 15, 17, 26, 28, 44, 54 (peak)]

--------------------------------------------------------------------------------

可以看到，此時的 Valgrind massif 顯示記憶體用量直到程式結束都沒有降低，代表我們有 memory leak 的發生。

透過 Valgrind massif，我們可以分析我們的程式 heap 的用量是否合理，並透過 stack trace，找出 heap 用量的 hot spot，並試著優化我們的程式。

預設 Valigrind massif 是採用 instructions executed 為橫軸單位，但如果程式執行的時間很短，會導致記憶體分配幾乎集中在圖形最後的部分 (因為 main() 在執行前還有很多 "前置作業"，也會一併計算在 instructions executed 中)。因此可以在 .valgrindrc 中指定 massif tool 使用 --time-unit=B 參數 (i.e. --massif:time-unit=B)，將橫軸改為以 bytes allocated/deallocated 為單位。

Select 系統呼叫與 RIO 套件分析

int select(int nfds, fd_set *readfds, fd_set *writefds,
           fd_set *exceptfds, struct timeval *timeout);

select() + read() 為 I/O multiplexing 的一種。關於 Blocking / Non-blocking I/O / I/O multiplexing 及 Asynchronous I/O，可以參考下列文章的說明：

Blocking / Non-blocking I/O：在等待 I/O 資料準備就緒 (e.g. 有資料可以讀取) 的過程，User space process 是否會被 blocked。
- Blocking I/O：在等待 I/O 資料準備就緒的過程，User space process 是被 blocked 的。
  - e.g. read() + 未設定 O_NONBLOCK flag、select()、epoll()。
- Non-blocking I/O：在向 Kernel 詢問是否可以進行 I/O operation 的過程，不論 I/O 資料是否準備就緒，User space process 都不會被 blocked。若 I/O 資料尚未準備就緒 (e.g. 尚無資料可以讀取)，則會直接返回，並設定 errno 指明錯誤 (e.g. EAGAIN、EWOULDBLOCK)。User space 必須重複詢問 (e.g. 在 while 迴圈中重複呼叫 read()) 直到 I/O 資料準備就緒才可進行 I/O operation。
  - e.g. read() + 設定 O_NONBLOCK flag、aio_read()、aio_write()
Synchronous / Asynchronous I/O：在從/向 Kernel space 讀取/寫入資料 (i.e. 實際在做 I/O operation) 的過程，User space 是否會被 blocked。
- Synchronous I/O：從/向 Kernel space 讀取/寫入資料的過程，User space process 是被 blocked 的
  - e.g. read()、recvfrom()。
- Asynchronous I/O：從/向 Kernel space 讀取/寫入的過程是交由 Kernel space 來處理的。Kernel 複製完資料後會再通知 User space process。這中間 User space process 不會被 blocked。
  - e.g. aio_read()、aio_write()。

所以 I/O multiplexing 為 Blocking I/O 的一種。但 select() + read() 比單一 read() 的好處在於 select() 可以同時等待多個 file descriptors (fd)。

在呼叫 select() 時所傳入的是多個 fd set，而非只是一個 fd。只要 fd set 中任一個以上的 fd 的 I/O 資料準備就緒，select() 就會返回。呼叫 select() 的 User space process 必須 iterate fd set 來檢查是哪幾個 fd 可以進行 I/O operation，而後再呼叫 read()、recvfrom() 讀取資料。此時 I/O 資料應準備就緒，可以直接讀取。但也有例外，e.g. select() man page 所述：

Under Linux, select() may report a socket file descriptor as "ready for reading", while nevertheless a subsequent read blocks. This could for example happen when data has arrived but upon examination has wrong checksum and is discarded. There may be other circumstances in which a file descriptor is spuriously reported as ready. Thus it may be safer to use O_NONBLOCK on sockets that should not block.

select() 後的 read()、recvfrom() 是 Blocking I/O 或是 Non-blocking I/O，還是依是否設了 O_NONBLOCK flag 而定。將 read() 或 recvfrom() 設為 NON_BLOCKING 才可以避免 User process 再次被 blocked。

P.S. 由於 select() 最多只可同時 "監控" (watch) 1024 個 fd，且當返回時，User space process 還必須 iterate 整個 fd set 才可以知道是哪些 fd 的資料已經準備好了，效率不佳。因此才有後續的 epoll()。

回到 qtest：

main() 呼叫 run_console() 並將 trace cmd 檔路徑傳入。
run_console() 呼叫 push_file()，分配一個 RIO_ELE，並將 trace cmd 檔的 fd 填入。每個 RIO_ELE 中都有一個 internal buffer：char buf[RIO_BUFSIZE]，也就是 RIO 套件所指的 RIO buffer，另外多個 RIO_ELE 也會透過 prev pointer 串接成一個 linked list。push_file() 完成後，會再呼叫 cmd_select()，準備解析 cmd 檔：











bool run_console(char *infile_name)
{
    if (!push_file(infile_name)) {
        report(1, "ERROR: Could not open source file '%s'", infile_name);
        return false;
    }

    while (!cmd_done())
        cmd_select(0, NULL, NULL, NULL, NULL);
    return err_cnt == 0;
}


















static bool push_file(char *fname)
{
    int fd = fname ? open(fname, O_RDONLY) : STDIN_FILENO;
    if (fd < 0)
        return false;

    if (fd > fd_max)
        fd_max = fd;

    rio_ptr rnew = malloc_or_fail(sizeof(rio_t), "push_file");
    rnew->fd = fd;
    rnew->cnt = 0;
    rnew->bufptr = rnew->buf;
    rnew->prev = buf_stack;
    buf_stack = rnew;

    return true;
}

cmd_select() 中第一次執行時，read_ready() 會回傳 false，代表目前尚未有未執行的指令。再來會將 readfds 設為 local_readset，做為接下來 select() 的 readset 參數。其中會將 trace cmd 檔的 fd (也就是 buf_stack->fd) 透過 FD_SET() 將 readfds 中對應的 bit 設為 1，代表我們要 "監控" (watch) 此 fd 是否有資料可以讀取：

















    if (!block_flag) {
        /* Process any commands in input buffer */
        if (!readfds)
            readfds = &local_readset;

        /* Add input fd to readset for select */
        infd = buf_stack->fd;
        FD_SET(infd, readfds);
        if (infd == STDIN_FILENO && prompt_flag) {
            printf("%s", prompt);
            fflush(stdout);
            prompt_flag = true;
        }

        if (infd >= nfds)
            nfds = infd + 1;
    }

P.S. select() 的 nfds 參數要設比我們最大想 "監控" (watch) 的 fd 值 + 1，因此 ndfs = infd + 1。

由於是讀取本地檔案，因此 select() 會立即返回，其返回值 (result) 代表目前有多少個 fd 資料準備就緒，可以存取。接著透過 FD_ISSET() 檢查 trace cmd 檔的 fd 是否準備就緒。若為 true，則再透過 readline() 讀取 trace cmd 檔的指令 (也是在此運用了　RIO 套件的原理)，並透過 FD_CLR() 從 readfds 中清除對應的 fd bit：















    int result = select(nfds, readfds, writefds, exceptfds, timeout);
    if (result <= 0)
        return result;

    infd = buf_stack->fd;
    if (readfds && FD_ISSET(infd, readfds)) {
        /* Commandline input available */
        FD_CLR(infd, readfds);
        result--;
        cmdline = readline();
        if (cmdline)
            interpret_cmd(cmdline);
    }
    return result;

:question: 疑問：
由於同時間只會 "監控" (watch) 一個 trace cmd 檔的 fd，是否真的有需要使用 select()? 或是其實直接呼叫 readline() 讀取 trace cmd 檔即可? 畢竟多呼叫了 select() 還是多了一個 system call，但卻沒有享受到 select() 所帶來的好處。

根據 CS:APP 第 10 章重點提示說明，read() 在以下情況會有 short count 的問題：

Encountering (end-of-file) EOF.
Reading text lines from a terminal.
Reading and writing network sockets.

而 read() 在以下情況不會有 short count 的問題：

Reading from disk file (except for EOF).
Writing to disk files.

經過自己測試，read() 在讀取本地檔案時，碰到 \n 並不會返回。

因此，CS:APP RIO 套件為了避免 short count 的情形，定義了 rio_t 這個 struct，其中包含了 internal buffer：rio_buf、rio_cnt 記錄 internal buffer 中剩餘未讀取的 bytes 數，及 rio_bufptr 指向 internal buffer 中剩餘未讀取資料的起始 offset。在呼叫 read() 時，所傳入的 *buf 為 rio_buf internal buffer。

readline() 運用了同樣的原理：
























































/* Read command from input file.
 * When hit EOF, close that file and return NULL
 */
static char *readline()
{
    int cnt;
    char c;
    char *lptr = linebuf;

    if (!buf_stack)
        return NULL;

    for (cnt = 0; cnt < RIO_BUFSIZE - 2; cnt++) {
        if (buf_stack->cnt <= 0) {
            /* Need to read from input file */
            buf_stack->cnt = read(buf_stack->fd, buf_stack->buf, RIO_BUFSIZE);
            buf_stack->bufptr = buf_stack->buf;
            if (buf_stack->cnt <= 0) {
                /* Encountered EOF */
                pop_file();
                if (cnt > 0) {
                    /* Last line of file did not terminate with newline. */
                    /*  Terminate line & return it */
                    *lptr++ = '\n';
                    *lptr++ = '\0';
                    if (echo) {
                        report_noreturn(1, prompt);
                        report_noreturn(1, linebuf);
                    }
                    return linebuf;
                }
                return NULL;
            }
        }

        /* Have text in buffer */
        c = *buf_stack->bufptr++;
        *lptr++ = c;
        buf_stack->cnt--;
        if (c == '\n')
            break;
    }

    if (c != '\n') {
        /* Hit buffer limit.  Artificially terminate line */
        *lptr++ = '\n';
    }
    *lptr++ = '\0';

    if (echo) {
        report_noreturn(1, prompt);
        report_noreturn(1, linebuf);
    }

    return linebuf;
}

當 buf_stack->cnt <= 0 時 (代表目前沒有尚未解析的指令)，會呼叫 read() 將讀取的資料存入 buf_stack->buf，並重置 buf_stack->bufptr 及更新 buf_stack->cnt。而後開始解析 buf_stack->buf 中所讀入的資料，直到碰到 '\n'，代表完成了一個指令解析 (解析完的指令存在 linebuf 中)。
當 buf_stack->cnt = 0 時，有下面兩種情況：
- 本次讀取的 text size > RIO_BUFSIZE，且 text 中都沒包含 \n (因此 buf_stack->cnt 被減為 0 了)，此時由於 RIO 的 internal buffer，因此可以繼續讀取後續的檔案內容。
- 碰到 EOF，此時再呼叫 pop_file() 清除此 trace cmd 檔對應的 RIO_ELE 所佔用的記憶體空間。

回到 cmd_select()，執行 interpret_cmd() 執行對應的指令。
離開 cmd_select() 回到 run_console()。但由於 cmd_done() 仍為 false，因此又會再次進到 cmd_select()。但此時若仍有尚未解析完成的指令，則 read_ready() 為 true，會再次呼叫 readline()。不過在 readline() 中，由於 buf_stack->cnt 仍 > 0，因此不會再次呼叫 read()，而只會更新 linebuf，透過 while 迴圈，我們就可以將後續未解析完成的指令一一解析完成並執行，直到所有 buf_stack linked list 的 RIO_ELE 都被處理完成為止 (cmd_done() 回傳值為 true)。：





    while (!block_flag && read_ready()) {
        cmdline = readline();
        interpret_cmd(cmdline);
        prompt_flag = true;
    }