Try   HackMD

2020q3 Homework1 (lab0)

contributed by < shauming1020 >

tags: homework

環境

$uname -a
Linux dcmc-System-Product-Name 5.4.0-42-generic #46~18.04.1-Ubuntu SMP Fri Jul 10 07:21:24 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ gcc --version
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

作業要求

在 queue.[c/h] 中完成以下實作

  • q_new: 建立新的「空」佇列;
  • q_free: 釋放佇列所佔用的記憶體;
  • q_insert_head: 在佇列開頭 (head) 加入 (insert) 給定的新元素 (以 LIFO 準則);
  • q_insert_tail: 在佇列尾端 (tail) 加入 (insert) 給定的新元素 (以 FIFO 準則);
  • q_remove_head: 在佇列開頭 (head) 移除 (remove) 給定的元素。
  • q_size: 計算佇列中的元素數量。
  • q_reverse: 以反向順序重新排列鏈結串列,該函式不該配置或釋放任何鏈結串列元素,換言之,它只能重排已經存在的元素;
  • q_sort: 以遞增順序來排序鏈結串列的元素

實作流程

queue.h

  • 為滿足 q_insert_tail 和 q_size 可在 O(1) 時間內完成,在 queue_t 中新增兩成員變數來紀錄相關資訊

    • [list_ele_t *tail] 指向尾端節點
    • [int size] 紀錄現有的節點個數
  • 每次執行新增/刪除時需要額外執行 tail 及 數量增減的管理

/* Queue structure */
typedef struct {
    list_ele_t *head, *tail; /* Linked list of elements */
    int size; /* Number of nodes */
} queue_t;

節點結構體

/* Linked list element (You shouldn't need to change this) */
typedef struct ELE {
    /* Pointer to array holding string.
     * This array needs to be explicitly allocated and freed
     */
    char *value;
    struct ELE *next;
} list_ele_t;
  • 要特別注意 value 為一字串,因此刪除節點時,也要記得釋放該佔用的空間

queue.c

q_new

  • malloc 失敗時要回傳 NULL
/* * Create empty queue. * Return NULL if could not allocate space. */ queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (!q) return NULL; q->head = q->tail = NULL; q->size = 0; return q; }

q_free

  • 遍歷整個 queue 的節點,釋放所有節點所佔有的空間
  • 先用 tmp 指標指向 head 所指的 heap 位置後,才移動 head 往下個節點走
  • 要先釋放字串所佔空間,再釋放 tmp
/* Free all storage used by queue */ void q_free(queue_t *q) { if (!q) return; while (q->head) { list_ele_t *tmp = q->head; q->head = q->head->next; free(tmp->value); free(tmp); } free(q); }

q_insert_head

  • 先檢查是否有足夠的空間配置給字串
  • 再檢查剩下的空間是否足以配置給節點,不夠的話連同釋放配置給字串的空間
  • 當一開始只有一個節點時即為 head,tail 也必須讓它指向 head
/* * Attempt to insert element at head of queue. * Return true if successful. * Return false if q is NULL or could not allocate space. * Argument s points to the string to be stored. * The function must explicitly allocate space and copy the string into it. */ bool q_insert_head(queue_t *q, char *s) { if (!q) return false; char *news = malloc(strlen(s) + 1); if (!news) { free(news); return false; } else strcpy(news, s); list_ele_t *newh = malloc(sizeof(list_ele_t)); if (!newh) { free(newh); free(news); return false; } newh->next = q->head; /* insert head */ newh->value = news; q->head = newh; q->size++; if (q->size == 1) q->tail = newh; return true; }

q_insert_tail

  • 與 q_insert_head 相同,需先檢查是否成功 malloc
  • 與 q_insert_head 不同的地方在於,如果 tail 一開始為 NULL,此時要移動 tail 到 next 節點時,會遭遇 dereferenced a null 的錯誤,因此必須先檢查 tail 是否有值
/* * Attempt to insert element at tail of queue. * Return true if successful. * Return false if q is NULL or could not allocate space. * Argument s points to the string to be stored. * The function must explicitly allocate space and copy the string into it. */ bool q_insert_tail(queue_t *q, char *s) { if (!q) return false; char *news = malloc(strlen(s) + 1); if (!news) { free(news); return false; } else strcpy(news, s); list_ele_t *newt = malloc(sizeof(list_ele_t)); if (!newt) { free(newt); free(news); return false; } newt->next = NULL; /* Remember: It should operate in O(1) time */ newt->value = news; if (!q->tail) { q->head = newt; q->tail = newt; } else { q->tail->next = newt; q->tail = q->tail->next; } q->size++; return true; }

q_remove_head

  • sp 存放刪除的字串,並於結尾處放置 null terminator
  • 也記得要釋放字串佔有的空間
/* * Attempt to remove element from head of queue. * Return true if successful. * Return false if queue is NULL or empty. * If sp is non-NULL and an element is removed, copy the removed string to *sp * (up to a maximum of bufsize-1 characters, plus a null terminator.) * The space used by the list element and the string should be freed. */ bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (!q || !(q->head)) return false; if (sp) { int size = strlen(q->head->value) < bufsize ? strlen(q->head->value) : bufsize - 1; strncpy(sp, q->head->value, size); sp[size] = '\0'; } list_ele_t *tmp = q->head; q->head = q->head->next; if (q->size) q->size--; free(tmp->value); free(tmp); return true; }

[ERROR] copying of string in remove_head overflowed destination buffer
到 qtest.c 去看

#define MAXSTRING 1024
#define STRINGPAD MAXSTRING
...
char *removes = malloc(string_length + STRINGPAD + 1); // 1024 + 1024 + 1 = 2049
removes[0] = '\0';
memset(removes + 1, 'X', string_length + STRINGPAD - 1); 
removes[string_length + STRINGPAD] = '\0';
...
   rval = q_remove_head(q, removes, string_length + 1); // bufsize = 1025
  1. 推敲 removes 長度為 2048 用來存放被刪除的字串,其頭尾都被放入 '\0' null terminator
  2. 一開始撰寫形式如下
      if (sp) {
       strcpy(sp, q->head->value);
       sp[strlen(q->head->value)] = '\0';
      }

仍會遇到 ERROR: copying of string in remove_head overflowed destination buffer.
原因在於若將大於 bufsize 的字串 copy 到 removes (buffer) 時所出現的錯誤,例如 長度為 2050 的字串要放到 2049 的 buffer

  1. 因此改用 strncpy 只要複製出最短的長度即可

int size = strlen(q->head->value) < bufsize ? strlen(q->head->value)
: bufsize - 1;

q_size

/* * Return number of elements in queue. * Return 0 if q is NULL or empty */ int q_size(queue_t *q) { if (!q) return 0; return q->size; }

q_reverse

  • 參考 quiz1 的 reverse 做法
  • 當 cursor 只有指向一個節點時,其 next 指向 NULL,則該節點即為新 list 的 tail
/* * Reverse elements in queue * No effect if q is NULL or empty * This function should not allocate or free any list elements * (e.g., by calling q_insert_head, q_insert_tail, or q_remove_head). * It should rearrange the existing ones. */ void q_reverse(queue_t *q) { if (!q) return; list_ele_t *cursor = NULL; list_ele_t *node_head = q->head; list_ele_t *node_tail = q->tail; while (node_head) { list_ele_t *next = node_head->next; node_head->next = cursor; cursor = node_head; if (!(cursor->next)) node_tail = cursor; node_head = next; } q->head = cursor; q->tail = node_tail; }

q_sort

/* * Sort elements of queue in ascending order * No effect if q is NULL or empty. In addition, if q has only one * element, do nothing. */ void q_sort(queue_t *q) { if (!q) return; MergeSort(&q->head); }
Merge Sort
list_ele_t *SortedMerge(list_ele_t *a, list_ele_t *b) { if (a == NULL) return (b); else if (b == NULL) return (a); list_ele_t *result = NULL; if (strcasecmp(a->value, b->value) < 0) { result = a; result->next = SortedMerge(a->next, b); } else { result = b; result->next = SortedMerge(a, b->next); } return (result); }
void MergeSort(list_ele_t **headRef) { list_ele_t *head = *headRef, *a, *b; if ((head == NULL) || (head->next) == NULL) return; // FrontBackSplit list_ele_t *fast = head->next; list_ele_t *slow = head; while (fast != NULL) { fast = fast->next; if (fast != NULL) { slow = slow->next; fast = fast->next; } } a = head; b = slow->next; slow->next = NULL; MergeSort(&a); MergeSort(&b); *headRef = SortedMerge(a, b); }
實作結果 (9/19)
$ make test
scripts/driver.py -c
---	Trace		Points
+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
---	trace-01-ops	6/6
+++ TESTING trace trace-02-ops:
# Test of insert_head, insert_tail, and remove_head
---	trace-02-ops	6/6
+++ TESTING trace trace-03-ops:
# Test of insert_head, insert_tail, reverse, and remove_head
---	trace-03-ops	6/6
+++ TESTING trace trace-04-ops:
# Test of insert_head, insert_tail, size, and sort
---	trace-04-ops	6/6
+++ TESTING trace trace-05-ops:
# Test of insert_head, insert_tail, remove_head, reverse, size, and sort
---	trace-05-ops	5/5
+++ TESTING trace trace-06-string:
# Test of truncated strings
---	trace-06-string	6/6
+++ TESTING trace trace-07-robust:
# Test operations on NULL queue
---	trace-07-robust	6/6
+++ TESTING trace trace-08-robust:
# Test operations on empty queue
---	trace-08-robust	6/6
+++ TESTING trace trace-09-robust:
# Test remove_head with NULL argument
---	trace-09-robust	6/6
+++ TESTING trace trace-10-malloc:
# Test of malloc failure on new
---	trace-10-malloc	6/6
+++ TESTING trace trace-11-malloc:
# Test of malloc failure on insert_head
---	trace-11-malloc	6/6
+++ TESTING trace trace-12-malloc:
# Test of malloc failure on insert_tail
---	trace-12-malloc	6/6
+++ TESTING trace trace-13-perf:
# Test performance of insert_tail
---	trace-13-perf	6/6
+++ TESTING trace trace-14-perf:
# Test performance of size
---	trace-14-perf	6/6
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, reverse, and sort
---	trace-15-perf	0/6
+++ TESTING trace trace-16-perf:
# Test performance of sort with random and descending orders
# 10000: all correct sorting algorithms are expected pass
# 50000: sorting algorithms with O(n^2) time complexity are expected failed
# 100000: sorting algorithms with O(nlogn) time complexity are expected pass
---	trace-16-perf	6/6
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
Probably constant time
Probably constant time
---     trace-17-complexity     5/5
---	TOTAL		94/100

Valgrind

運用 Valgrind 排除 qtest 實作的記憶體錯誤

$make valgrind

其中 trace-15-perf 存在下列錯誤
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, reverse, and sort
==1887== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==1887== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==1887== Can't extend stack to 0x1ffe8010a8 during signal delivery for thread 1:
==1887==   no stack segment
==1887== 
==1887== Process terminating with default action of signal 11 (SIGSEGV)
==1887==  Access not within mapped region at address 0x1FFE8010A8
==1887== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
==1887==    at 0x4C33619: strcasecmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1887==  If you believe this happened as a result of a stack
==1887==  overflow in your program's main thread (unlikely but
==1887==  possible), you can try to increase the size of the
==1887==  main thread stack using the --main-stacksize= flag.
==1887==  The main thread stack size used in this run was 8388608.
==1887== Stack overflow in thread #1: can't grow stack to 0x1ffe801000
...
==1887== 56,000,000 bytes in 1,000,000 blocks are still reachable in loss record 33 of 33
==1887==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1887==    by 0x10C7D3: test_malloc (harness.c:137)
==1887==    by 0x10CD01: q_insert_tail (queue.c:95)
==1887==    by 0x10A5E2: do_insert_tail (qtest.c:297)
==1887==    by 0x10B9D2: interpret_cmda (console.c:220)
==1887==    by 0x10BF46: interpret_cmd (console.c:243)
==1887==    by 0x10C514: cmd_select (console.c:569)
==1887==    by 0x10C75C: run_console (console.c:628)
==1887==    by 0x10AE81: main (qtest.c:772)

Stack overflow, wiki
In software, a stack overflow occurs if the call stack pointer exceeds the stack bound. The call stack may consist of a limited amount of address space, often determined at the start of the program. The size of the call stack depends on many factors, including the programming language, machine architecture, multi-threading, and amount of available memory.

  • 造成 stack overflow 的原因,推測是採用遞迴呼叫的 merge sort,過多的函式呼叫導致使用的堆疊大小超過事先規畫的大小,覆蓋其他記憶體內的資料,在POSIX相容平台上,堆疊溢位通常會造成作業系統產生SIGSEGV訊號

觀察 trace-15-perf 內容

# Test performance of insert_tail, size, reverse, and sort
option fail 0
option malloc 0
new
ih dolphin 1000000
it gerbil 1000000
size 1000
reverse
sort
size 1000
  • 在 list 頭尾各插入 1000000 筆的字串,再進行 size、reverse 及 sort 函式的運算,評估程式效能

利用 qtest 的 cmd 逐行檢測

$valgrind -q --leak-check=full ./qtest

log
...
cmd> sort
ERROR: Time limit exceeded.  Either you are in an infinite loop, or your code is too inefficient
==1542== Invalid read of size 8
==1542==    at 0x109CD2: do_sort (qtest.c:561)
==1542==    by 0x10B9D2: interpret_cmda (console.c:220)
==1542==    by 0x10BF46: interpret_cmd (console.c:243)
==1542==    by 0x10C6B5: cmd_select (console.c:607)
==1542==    by 0x10C75C: run_console (console.c:628)
==1542==    by 0x10AE81: main (qtest.c:772)
==1542==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==1542== 
Segmentation fault occurred.  You dereferenced a NULL or invalid pointer
==1542== 5 bytes in 1 blocks are still reachable in loss record 1 of 32
==1542==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1542==    by 0x10B5F2: strsave_or_fail (report.c:230)
==1542==    by 0x10BF06: parse_args (console.c:190)
==1542==    by 0x10BF06: interpret_cmd (console.c:242)
==1542==    by 0x10C6B5: cmd_select (console.c:607)
==1542==    by 0x10C75C: run_console (console.c:628)
==1542==    by 0x10AE81: main (qtest.c:772)
...
==1542== 26,442,416 bytes in 472,186 blocks are still reachable in loss record 32 of 32
==1542==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1542==    by 0x10C7D3: test_malloc (harness.c:137)
==1542==    by 0x10CC4A: q_insert_head (queue.c:53)
==1542==    by 0x10A869: do_insert_head (qtest.c:212)
==1542==    by 0x10B9D2: interpret_cmda (console.c:220)
==1542==    by 0x10BF46: interpret_cmd (console.c:243)
==1542==    by 0x10C6B5: cmd_select (console.c:607)
==1542==    by 0x10C75C: run_console (console.c:628)
==1542==    by 0x10AE81: main (qtest.c:772)
==1542== 
已經終止 (核心已傾印)
  • 確實在執行到 sort 時,遇到 blocks are still reachable 的情形,程式結束時有未釋放的記憶體,不過卻還有指標指著
  • 分析程式碼,除了函式 MergeSort 採用遞迴之外,函式 SortedMerge 也採用了遞迴呼叫更新 result 的 next,推測遞迴再遞迴的方式造成 stack overflow
  • 參考 ZhuMon,將函式 SortedMerge 改寫
void SortedMerge(list_ele_t *a, list_ele_t *b, list_ele_t **headRef) { *headRef = NULL; list_ele_t **result = headRef; while (a && b) { if (strcasecmp(a->value, b->value) < 0) { *result = a; a = a->next; } else { *result = b; b = b->next; } result = &((*result)->next); } (*result) = b ? b : a; }
void MergeSort(list_ele_t **headRef) { list_ele_t *head = *headRef, *a, *b; if ((head == NULL) || (head->next) == NULL) return; // FrontBackSplit list_ele_t *fast = head->next; list_ele_t *slow = head; while (fast != NULL) { fast = fast->next; if (fast != NULL) { slow = slow->next; fast = fast->next; } } a = head; b = slow->next; slow->next = NULL; MergeSort(&a); MergeSort(&b); SortedMerge(a, b, headRef); }
實作結果 (9/20) $make valgrind 沒有任何錯誤訊息
$make valgrind
# Explicitly disable sanitizer(s)
make clean SANITIZER=0 qtest
make[1]: Entering directory '/home/dcmc/OS/Q3/lab0-c'
rm -f qtest.o report.o console.o harness.o queue.o random.o dudect/constant.o dudect/fixture.o dudect/ttest.o .qtest.o.d .report.o.d .console.o.d .harness.o.d .queue.o.d .random.o.d .dudect/constant.o.d .dudect/fixture.o.d .dudect/ttest.o.d *~ qtest /tmp/qtest.*
rm -rf .dudect
rm -rf *.dSYM
(cd traces; rm -f *~)
  CC	qtest.o
  CC	report.o
  CC	console.o
  CC	harness.o
  CC	queue.o
  CC	random.o
  CC	dudect/constant.o
  CC	dudect/fixture.o
  CC	dudect/ttest.o
  LD	qtest
make[1]: Leaving directory '/home/dcmc/OS/Q3/lab0-c'
cp qtest /tmp/qtest.S79FKQ
chmod u+x /tmp/qtest.S79FKQ
sed -i "s/alarm/isnan/g" /tmp/qtest.S79FKQ
scripts/driver.py -p /tmp/qtest.S79FKQ --valgrind 
---	Trace		Points
+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
---	trace-01-ops	6/6
+++ TESTING trace trace-02-ops:
# Test of insert_head, insert_tail, and remove_head
---	trace-02-ops	6/6
+++ TESTING trace trace-03-ops:
# Test of insert_head, insert_tail, reverse, and remove_head
---	trace-03-ops	6/6
+++ TESTING trace trace-04-ops:
# Test of insert_head, insert_tail, size, and sort
---	trace-04-ops	6/6
+++ TESTING trace trace-05-ops:
# Test of insert_head, insert_tail, remove_head, reverse, size, and sort
---	trace-05-ops	5/5
+++ TESTING trace trace-06-string:
# Test of truncated strings
---	trace-06-string	6/6
+++ TESTING trace trace-07-robust:
# Test operations on NULL queue
---	trace-07-robust	6/6
+++ TESTING trace trace-08-robust:
# Test operations on empty queue
---	trace-08-robust	6/6
+++ TESTING trace trace-09-robust:
# Test remove_head with NULL argument
---	trace-09-robust	6/6
+++ TESTING trace trace-10-malloc:
# Test of malloc failure on new
---	trace-10-malloc	6/6
+++ TESTING trace trace-11-malloc:
# Test of malloc failure on insert_head
---	trace-11-malloc	6/6
+++ TESTING trace trace-12-malloc:
# Test of malloc failure on insert_tail
---	trace-12-malloc	6/6
+++ TESTING trace trace-13-perf:
# Test performance of insert_tail
---	trace-13-perf	6/6
+++ TESTING trace trace-14-perf:
# Test performance of size
---	trace-14-perf	6/6
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, reverse, and sort
---	trace-15-perf	6/6
+++ TESTING trace trace-16-perf:
# Test performance of sort with random and descending orders
# 10000: all correct sorting algorithms are expected pass
# 50000: sorting algorithms with O(n^2) time complexity are expected failed
# 100000: sorting algorithms with O(nlogn) time complexity are expected pass
---	trace-16-perf	6/6
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
Probably constant time
Probably constant time
---     trace-17-complexity     5/5
---	TOTAL		100/100

Test with specific case by running command:
scripts/driver.py -p /tmp/qtest.S79FKQ --valgrind -t <tid>

Massif 視覺化 “simulation” 過程

is_insert_tail_const

$ valgrind --tool=massif --stacks=yes --time-unit=i ./qtest
option simulation 1
it
quit
$ ms_print massif.out.<pid>  > log

  • 從峰值處可以看到程式執行的順序(上方最後呼叫)

is_size_const

$ valgrind --tool=massif --stacks=yes --time-unit=i ./qtest
option simulation 1
size
quit
$ ms_print massif.out.<pid>  > log

Dude, is my code constant time?

解釋本程式的 “simulation” 模式是如何透過以實驗而非理論分析,達到驗證時間複雜度

解釋 Student’s t-distribution

程式實作的原理

  • is_insert_tail_const(fixture.c) 會初始化資料,分成兩類
void t_init(t_ctx *ctx)
{
    for (int class = 0; class < 2; class ++) {
        ctx->mean[class] = 0.0;
        ctx->m2[class] = 0.0;
        ctx->n[class] = 0.0;
    }
    return;
}
  • doit
    • prepare_inputs(input_data, classes);
      準備實驗資料
    ​​​​void prepare_inputs(uint8_t *input_data, uint8_t *classes) ​​​​{ ​​​​ randombytes(input_data, number_measurements * chunk_size); ​​​​ for (size_t i = 0; i < number_measurements; i++) { ​​​​ classes[i] = randombit(); ​​​​ if (classes[i] == 0) ​​​​ memset(input_data + (size_t) i * chunk_size, 0, chunk_size); ​​​​ } ​​​​ for (size_t i = 0; i < NR_MEASURE; ++i) { ​​​​ /* Generate random string */ ​​​​ randombytes((uint8_t *) random_string[i], 7); ​​​​ random_string[i][7] = 0; ​​​​ } ​​​​}
    • measure(before_ticks, after_ticks, input_data, mode);
      進行測量,mode 為 test_insert_tail
    ​​​​if (mode == test_insert_tail) { ​​​​ for (size_t i = drop_size; i < number_measurements - drop_size; i++) { ​​​​ char *s = get_random_string(); ​​​​ dut_new(); ​​​​ dut_insert_head( ​​​​ get_random_string(), ​​​​ *(uint16_t *) (input_data + i * chunk_size) % 10000); ​​​​ before_ticks[i] = cpucycles(); ​​​​ dut_insert_tail(s, 1); ​​​​ after_ticks[i] = cpucycles(); ​​​​ dut_free(); ​​​​ }

TODO