Try   HackMD

2021q1 Homework1 (lab0)

contributed by < chiehen >

tags: linux2021

作業要求

  • 依據指示著手修改 queue.[ch] 和連帶的檔案,測試後用 Git 管理各項修改,記得也該實作 q_sort 函式。
  • 開啟 Address Sanitizer,修正 qtest 執行過程中的錯誤
  • 運用 Valgrind 排除 qtest 實作的記憶體錯誤,並透過 Massif 視覺化 “simulation” 過程中的記憶體使用量,需要設計對應的實驗
  • 在 qtest 中實作 coroutine,並提供新的命令 web,提供 web 伺服器功能
  • 解釋 select 系統呼叫在本程式的使用方式,並分析 console.c 的實作,說明其中運用 CS:APP RIO 套件 的原理和考量點。
  • 說明 antirez/linenoise 的運作原理,注意到 termios 的運用
  • 研讀論文 Dude, is my code constant time?
  • 指出現有程式的缺陷 (提示: 和 RIO 套件 有關),嘗試強化並提交 pull request

環境準備

  1. 安裝必要的開發工具套件
$ sudo apt install build-essential git-core valgrind
$ sudo apt install cppcheck clang-format aspell colordiff
$ sudo apt install tig
  1. 取得程式
    1. 在 GitHub 上 fork 專案(lab-0)
    2. clone 至本地端
    ​​​​$ git clone git@github.com:你的帳號名稱/lab0-c
    

C Programming Lab

在程式中 queue.h 定義了鏈結串列(linked-list) 的結構:

/* Linked list element (You shouldn't need to change this) */ typedef struct ELE { /* Pointer to array holding string. * This array needs to be explicitly allocated and freed */ char *value; struct ELE *next; } list_ele_t;

為單向的 linked-list,
接著用鏈結串列來實作佇列 (queue):

/* Queue structure */ typedef struct { list_ele_t *head; /* Linked list of elements */ /* TODO: You will need to add more fields to this structure * to efficiently implement q_size and q_insert_tail. */ /* TODO: Remove the above comment when you are about to implement. */ } queue_t;

可發現 queue.c 僅提供介面 (interface) 但尚未有完整程式實作,包含以下:

  • q_new: 建立新的「空」佇列;
  • q_free: 釋放佇列所佔用的記憶體;
  • q_insert_head: 在佇列開頭 (head) 加入 (insert) 給定的新元素 (以 LIFO 準則);
  • q_insert_tail: 在佇列尾端 (tail) 加入 (insert) 給定的新元素 (以 FIFO 準則);
  • q_remove_head: 在佇列開頭 (head) 移去 (remove) 給定的元素。
  • q_size: 計算佇列中的元素數量。
  • q_reverse: 以反向順序重新排列鏈結串列,該函式不該配置或釋放任何鏈結串列元素,換言之,它只能重排已經存在的元素;
  • q_sort: 以遞增順序來排序鏈結串列的元素。

因此第一個作業要求即是完成此部份實做。

實做 q_new

根據註解創建一個空的 queue, 將 headtail 指向 NULL, size 設為 0 。如果無法成功獲得動態配置的記憶體空間則回傳 NULL
以下為程式碼:

/*
 * Create empty queue.
 * Return NULL if could not allocate space.
 */
queue_t *q_new()
{
    queue_t *q = malloc(sizeof(queue_t));
    if (!q)
        return NULL;
    q->head = NULL;
    q->tail = NULL;
    q->size = 0;
    return q;
}

實做 q_free

當呼叫此函式時將釋放 queue 中所有配置的記憶體空間,依照 linked-list 單向的結構,從 head 開始逐一釋放記憶體。 另外,因 list element 中成員 value 有額外配置記憶體,因此在釋放記憶體時,須先釋放 value 的記憶體。在釋放完所有 linked-list 的記憶體後,最後釋放 queue 的記憶體。
以下為程式碼:

/* Free all storage used by queue */
void q_free(queue_t *q)
{
    /* Free queue structure */
    if (q) {
        while (q->head) {
            list_ele_t *p = q->head;
            q->head = p->next;
            free(p->value);
            free(p);
        }
    }
    free(q);
}

實做 q_insert_head

此段函式在佇列開頭 (head) 加入 (insert) 給定的新元素。依據註解如果 queue 是NULL,或記憶體配置失敗則回傳 false

配置出字串的記憶體時:

char *val = malloc(strlen(s) + 1);

可注意到配置的空間是 strlen(s) +1 bytes, 是因為根據 Linux Programmer's Manual:

The strlen() function calculates the length of the string pointed to by s, excluding the terminating null byte ('\0').

但在複製字串時也須複製空字元 ('\0') 因此多配置1 byte 儲存此字元,

此外, 須注意若此時字串配置記憶體失敗, 在回傳 false 前要記得釋放先前配置的 list element 記憶體, 以免記憶體漏失(memory leak)。

一開始在複製字串時使用 strcpy(), 但在執行 git commit 時Git pre-commit hook 提示 strcpy() 為不安全函式,
本來想改為使用較完善且安全的 strlcpy, 但發現此函式並沒有被 glibc 支援, 因此改用 strncpy() 複製字串:

strncpy(val, s, strlen(s) + 1);

完整程式碼為:

/*
 * Attempt to insert element at head of queue.
 * Return true if successful.
 * Return false if q is NULL or could not allocate space.
 * Argument s points to the string to be stored.
 * The function must explicitly allocate space and copy the string into it.
 */
bool q_insert_head(queue_t *q, char *s)
{
    /* Check if q is NULL */
    if (!q)
        return false;

    list_ele_t *newh;
    newh = malloc(sizeof(list_ele_t));
    /* Check if allocation successes */
    if (!newh)
        return false;
    
    char *val = malloc(strlen(s) + 1);
    /* Check if allocation successes */
    if (!val) {
        free(newh);
        return false;
    }

    /* Copy string */
    strncpy(val, s, strlen(s) + 1);
    
    /* Update list element */
    newh->value = val;
    newh->next = q->head;
    
    /* Update queue */
    q->head = newh;
    // If this is first element in q, 
    // then also update tail
    if (!q->tail)
        q->tail = newh;
    q->size++;
    
    return true;
}

實做 q_insert_tail

在佇列尾端 (tail) 加入 (insert) 給定的新元素。依據註解如果 queue 是NULL,或記憶體配置失敗則回傳 false
為達成運算時間為

O(1) 的要求, 增加新成員 list_ele_t *tail 至 queue 的結構以紀錄尾端位置:

/* Queue structure */
typedef struct {
    list_ele_t *head; /* Linked list of elements */
    list_ele_t *tail; // Make q_insert_tail compute in O(1)
    int size; 
} queue_t;

實做方式與 q_insert_head 大致相同, 只有在更新 queue 時須改為以下:

/* Update queue */
if (q->tail)
    q->tail->next = newt;
q->tail = newt;
// If this is the first element in q, 
// then also update head
if (!q->head)
    q->head = newt;

完整程式碼:

/*
 * Attempt to insert element at tail of queue.
 * Return true if successful.
 * Return false if q is NULL or could not allocate space.
 * Argument s points to the string to be stored.
 * The function must explicitly allocate space and copy the string into it.
 */
bool q_insert_tail(queue_t *q, char *s)
{
    /* Check if q is NULL */
    if (!q)
        return false;
    
    list_ele_t *newt;  // newt means new tail
    newt = malloc(sizeof(list_ele_t));
    /* Check if allocation successes */
    if (!newt)
        return false;
    
    char *val = malloc(strlen(s) + 1);
    /* Check if allocation successes */
    if (!val) {
        free(newt);
        return false;
    }
    /* Copy string */
    strncpy(val, s, strlen(s) + 1);
    
    /* Update list element */
    newt->value = val;
    newt->next = NULL;
    
    /* Update queue */
    if (q->tail)
        q->tail->next = newt;
    q->tail = newt;
    // If this is the first element in q, 
    // then also update head.
    if (!q->head)
        q->head = newt;
    q->size++;
    
    return true;
}

實做 q_remove_head

在佇列開頭 (head) 移去 (remove) 給定的元素, 並將被移除字串複製至 *sp

  • 在複製字串時, 須考慮被移除字串長度大於 buffsize 的情形:
if (strlen(val) > bufsize - 1) {
    // The removed string is too long,
    // then only copy bufsize-1 characters.
    strncpy(sp, val, bufsize - 1);
    sp[bufsize - 1] = '\0';
} else {
    strncpy(sp, val, strlen(val) + 1);
}

此時將只複製 bufsize-1 characters

  • 在釋放前須檢查釋放元素非tail 所指向的元素, 否則將造成 q_insert_tail 時使用到已釋放的記憶體, i.e.
q = [world]
cmd> rh
Removed world from queue
q = []
cmd> it t
q = [t ... ]
ERROR:  Either list has cycle, or queue has more than 1 elements
cmd> free
ERROR: Attempted to free unallocated block.  Address = 0x5555555555555555
匯流排錯誤 (核心已傾印)

因此須添加判斷:

if(q->tail == ele)
    q->tail = NULL; // avoid use after free

完整程式碼:

/*
 * Attempt to remove element from head of queue.
 * Return true if successful.
 * Return false if queue is NULL or empty.
 * If sp is non-NULL and an element is removed, copy the removed string to *sp
 * (up to a maximum of bufsize-1 characters, plus a null terminator.)
 * The space used by the list element and the string should be freed.
 */
bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
    /* Check if queue is NULL or empty, return false */
    if (!q || !q->head)
        return false;
    list_ele_t *ele = q->head;
    q->head = q->head->next;
    
    /* copy removed string to *sp */
    char *val = ele->value;
    if (sp) {
        if (strlen(val) > bufsize - 1) {
            strncpy(sp, val, bufsize - 1);
            sp[bufsize - 1] = '\0';
        } else {
            strncpy(sp, val, strlen(val) + 1);
        }
    }
    
    /* Free memory */
    if(q->tail == ele)
        q->tail = NULL; // avoid use after free
    free(val);
    free(ele);
    
    q->size--;
    
    return true;
}

實做 q_size

計算佇列中的元素數量, 且運算時間須為

O(1)
為達成
O(1)
要求, 須增加新成員 int size 至 queue 的結構:

/* Queue structure */
typedef struct {
    list_ele_t *head; /* Linked list of elements */
    list_ele_t *tail;
    int size; // Make q_size compute in O(1)
} queue_t;

建立新的佇列時(q_new) 將 size 初始至

0。在函式 q_insert_head, q_insert_tail, q_remove_head中須更新 size

則在 q_size 只須回傳 size 即可:

/*
 * Return number of elements in queue.
 * Return 0 if q is NULL or empty
 */
int q_size(queue_t *q)
{
    if (q)
        return q->size;
    return 0;
}

實做 q_reverse

函式以反向順序重新排列鏈結串列。
當更新 next 前, 須紀錄前一個元素(q->head)位置及後一個元素位置(n)。

void q_reverse(queue_t *q)
{
    if (q && q->head) {
        list_ele_t *curr = q->head->next;
        q->tail = q->head;
        q->head->next = NULL;
        while (curr) {
            list_ele_t *n = curr->next; // the original next
            curr->next = q->head;
            q->head = curr;
            curr = n;
         }
    }    
}

實做 q_sort

以遞增順序來排序鏈結串列的元素。

參考Big-O Cheat Sheet, 選擇 Average case 和 Worst case 表現都不差 (

O(nlog(n))) 的 Merge sort 實做 q_sort。

/*
 * Sort elements of queue in ascending order
 * No effect if q is NULL or empty. In addition, if q has only one
 * element, do nothing.
 */
void q_sort(queue_t *q)
{
    /* q is NULL or empty*/
    if (!q || !q->head || !q->head->next)
        return;

    q->head = merge_sort(q->head);
    list_ele_t *temp = q->head;
    
    /* update tail */
    while (temp->next)
        temp = temp->next;
    q->tail = temp;
}

Merge Sort

參考 Linked List Sort 實做 Merge Sort

在 Merge Sort 中須尋找 list 的中間點。 利用兩變數 fastslow, fast 速度為每次兩步, slow 速度為每次一步, 因此當 fast 走至 list 尾端時(

2X), slow 便在 list 中間(
X
):

list_ele_t *fast = head->next;
list_ele_t *slow = head;

// find the middle
while (fast && fast->next) {
    slow = slow->next;
    fast = fast->next->next;
}
fast = slow->next;
slow->next = NULL;

// divide
list_ele_t *l1 = merge_sort(head);
list_ele_t *l2 = merge_sort(fast);

而在 merge 函式中, 原先如參考文件一樣建立 pseudo node 來進行合併:

list_ele_t *merge(list_ele_t *l1, list_ele_t *l2)
{
    // merge with pseudo node
    list_ele_t *result = malloc(sizeof(list_ele_t));
    list_ele_t *result_h = result;
    if (!result)
        return l1;
    ...
    ...
    list_ele_t *head = result_h->next;
    free(result_h);
    return head;
}

但在使用 qtest 測試時得到報錯:

cmd> sort
FATAL ERROR: Calls to malloc disallowed
FATAL Error.  Exiting

因此得知不允許在此處要求配置記憶體,因此改為先比較兩個 list 的第一個元素:

list_ele_t *merge(list_ele_t *l1, list_ele_t *l2)
{
    list_ele_t *result;
    list_ele_t *result_h;
    if (!l2 || (l1 && strcmp(l1->value, l2->value) < 0)) {
        result = l1;
        l1 = l1->next;
    } else {
        result = l2;
        l2 = l2->next;
    }
    result_h = result;
    ...
    ...
    return result_h;
}

Merge Sort 完整程式碼:

list_ele_t *merge_sort(list_ele_t *head) { // when no element or one element in list, stop if (!head || !head->next) return head; list_ele_t *fast = head->next; list_ele_t *slow = head; // find the middle while (fast && fast->next) { slow = slow->next; fast = fast->next->next; } fast = slow->next; slow->next = NULL; // divide list_ele_t *l1 = merge_sort(head); list_ele_t *l2 = merge_sort(fast); return merge(l1, l2); } list_ele_t *merge(list_ele_t *l1, list_ele_t *l2) { // merge with pseudo node list_ele_t *result; list_ele_t *result_h; if (!l2 || (l1 && strcmp(l1->value, l2->value) < 0)) { result = l1; l1 = l1->next; } else { result = l2; l2 = l2->next; } result_h = result; while (l1 && l2) { if (strcmp(l1->value, l2->value) < 0) { result->next = l1; result = result->next; l1 = l1->next; } else { result->next = l2; result = result->next; l2 = l2->next; } } if (l1) result->next = l1; if (l2) result->next = l2; return result_h; }

Notice

測試

$ make
$ make test

測試結果:

---     TOTAL           100/100

Address Sanitizer

開啟 Address Sanitizer,修正 qtest 執行過程中的錯誤。

開啟 Address Sanitizer, 如果先前已生成執行檔, 先清除:

$ make clean 
$ make SANITIZER=1

執行測試:

$ make test

錯誤訊息節錄:

==25269==ERROR: AddressSanitizer: global-buffer-overflow on address 0x56189862a5e0 at pc 0x561898612ae8 bp 0x7ffc3dec4a40 sp 0x7ffc3dec4a30                           
READ of size 4 at 0x56189862a5e0 thread T0
    #0 0x561898612ae7 in do_option_cmd /home/jane/linux2021/lab0-c/console.c:369
    #1 0x5618986118d0 in interpret_cmda /home/jane/linux2021/lab0-c/console.c:221
    #2 0x5618986120b5 in interpret_cmd /home/jane/linux2021/lab0-c/console.c:244
    #3 0x5618986132e1 in cmd_select /home/jane/linux2021/lab0-c/console.c:594
    #4 0x56189861385b in run_console /home/jane/linux2021/lab0-c/console.c:667
    #5 0x5618986103e5 in main /home/jane/linux2021/lab0-c/qtest.c:780
    #6 0x7fac60ba20b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
    #7 0x56189860db8d in _start (/home/jane/linux2021/lab0-c/qtest+0x8b8d)

0x56189862a5e1 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:21:6' (0x56189862a5e0) of size 1
  'simulation' is ascii string ''
SUMMARY: AddressSanitizer: global-buffer-overflow /home/jane/linux2021/lab0-c/console.c:369 in do_option_cmd

得知錯誤發生在 console.c 的 369 行
又與變數 'simulation' 有關

int oldval = *plist->valp;

發現在 104 行, simulation 從 bool 型態被強制轉形成 int 型態後被放入 param_ptr 結構的成員 valp 中

add_param("simulation", (int *) &simulation, "Start/Stop simulation mode ",

因此在369行取值時發生錯誤

修正

嘗試宣告 simulation 為 int

int simulation

報錯:

console.c:21:5: error: conflicting types for ‘simulation’
   21 | int simulation = 0;
      |     ^~~~~~~~~~
In file included from console.c:3:
console.h:11:13: note: previous declaration of ‘simulation’ was here
   11 | extern bool simulation;
      |             ^~~~~~~~~~

也更改 consle.h 中的宣告

extern int simulation

另外發現執行 qtest 中的 help 指令也會有錯誤訊息

$ ./qtest                 
cmd> help 

錯誤訊息節錄:

==26168==ERROR: AddressSanitizer: global-buffer-overflow on address 0x560412b823c0 at pc 0x560412b6b7bd bp 0x7ffeb60b7c50 sp 0x7ffeb60b7c40
READ of size 4 at 0x560412b823c0 thread T0
    #0 0x560412b6b7bc in do_help_cmd /home/jane/linux2021/lab0-c/console.c:307
    #1 0x560412b6b8d0 in interpret_cmda /home/jane/linux2021/lab0-c/console.c:221
    #2 0x560412b6c0b5 in interpret_cmd /home/jane/linux2021/lab0-c/console.c:244
    #3 0x560412b6d7f8 in run_console /home/jane/linux2021/lab0-c/console.c:660
    #4 0x560412b6a3e5 in main /home/jane/linux2021/lab0-c/qtest.c:780
    #5 0x7f83b44520b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
    #6 0x560412b67b8d in _start (/home/jane/linux2021/lab0-c/qtest+0x8b8d)

0x560412b823c1 is located 0 bytes to the right of global variable 'echo' defined in 'console.c:59:13' (0x560412b823c0) of size 1
SUMMARY: AddressSanitizer: global-buffer-overflow /home/jane/linux2021/lab0-c/console.c:307 in do_help_cmd

發現一樣在 108 行時被強制轉型

add_param("echo", (int *) &echo, "Do/don't echo commands", NULL);

修正

同理將 echo 也宣告為 int 型態

static int echo

Valgrind

運用 Valgrind 排除 qtest 實作的記憶體錯誤,並透過 Massif 視覺化 “simulation” 過程中的記憶體使用量,需要設計對應的實驗

$ make valgrind

得到輸出:

==6204== 5 bytes in 1 blocks are still reachable in loss record 1 of 3          
==6204==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreloa
d_memcheck-amd64-linux.so)                                                      
==6204==    by 0x4A5250E: strdup (strdup.c:42)                                  
==6204==    by 0x1100D8: linenoiseHistoryAdd (linenoise.c:1236)                 
==6204==    by 0x110C6B: linenoiseHistoryLoad (linenoise.c:1325)                
==6204==    by 0x10C22C: main (qtest.c:769)                                     
==6204== 
==6204== 91 bytes in 19 blocks are still reachable in loss record 2 of 3
==6204==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreloa
d_memcheck-amd64-linux.so)
==6204==    by 0x4A5250E: strdup (strdup.c:42)
==6204==    by 0x11004C: linenoiseHistoryAdd (linenoise.c:1236)
==6204==    by 0x110C6B: linenoiseHistoryLoad (linenoise.c:1325)
==6204==    by 0x10C22C: main (qtest.c:769)
==6204== 
==6204== 160 bytes in 1 blocks are still reachable in loss record 3 of 3
==6204==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreloa
d_memcheck-amd64-linux.so)
==6204==    by 0x110098: linenoiseHistoryAdd (linenoise.c:1224)
==6204==    by 0x110C6B: linenoiseHistoryLoad (linenoise.c:1325)
==6204==    by 0x10C22C: main (qtest.c:769)
==6204== 

發現造成記憶體錯誤的為引入的套件 linenoise 中

Massif 視覺化 “simulation” 過程

$ valgrind --tool=massif ./qtest                 
cmd> option simulation 1                                                        
cmd> it                                                                         
Probably constant time                                                          
cmd> size                                                                       
Probably constant time                                                          
cmd> option simulation 0                                                        
cmd> quit                                                                       
Freeing queue                     

使用 massif 視覺化

$ massif-visualizer

得到: