# 2020q3 Homework1 (lab0) contributed by <`solnone`> ## 環境 ```shell $ uname -a Linux Macmini 5.4.0-45-generic #49-Ubuntu SMP Wed Aug 26 13:38:52 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux $ lscpu | grep "Model name" Model name: Intel(R) Core(TM)2 Duo CPU P8800 @ 2.66GHz $ gcc -v 2>&1 | grep "gcc version" gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2) $ gdb --version GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1 ... $ clang-format --version clang-format version 10.0.0-4ubuntu1 $ cppcheck --version Cppcheck 1.90 $ valgrind --version valgrind-3.15.0 $ git --version git version 2.25.1 ``` ## 使用工具 ### 使用 [git](https://git-scm.com) 取得程式碼 [5f9/lab0-c](https://github.com/5f9/lab0-c.git) 從 [GitHub sysprog21/lab0-c](https://github.com/sysprog21/lab0-c) 進行 fork 後 clone ```shell $ git clone https://github.com/5f9/lab0-c.git $ cd lab0-c ``` ### 靜態程式碼分析工具 [Cppcheck](http://cppcheck.sourceforge.net) ```shell $ cppcheck --enable=all --inline-suppr --inconclusive --suppress=missingIncludeSystem --verbose *.[ch] dudect/*.[ch] Checking console.c ... ... console.c:134:22: style:inconclusive: Function 'add_param' argument 3 names different: declaration 'doccumentation' definition 'documentation'. [funcArgNamesDifferent] char *documentation, ^ console.h:48:22: note: Function 'add_param' argument 3 names different: declaration 'doccumentation' definition 'documentation'. char *doccumentation, ^ console.c:134:22: note: Function 'add_param' argument 3 names different: declaration 'doccumentation' definition 'documentation'. char *documentation, ^ 1/11 files checked 25% done ... qtest.c:746:13: warning:inconclusive: The buffer 'lbuf' may not be null-terminated after the call to strncpy(). If the source string's size fits or exceeds the given size, strncpy() does not add a zero at the end of the buffer. This causes bugs later in the code if the code assumes buffer is null-terminated. [terminateStrncpy] strncpy(lbuf, optarg, BUFSIZE); ^ 5/11 files checked 75% done ... ``` 統一變數名稱, 並修正誤用的變數 buf -> lbuf ```diff diff --git a/console.h b/console.h index 33e7d5d..590a7a1 100644 --- a/console.h +++ b/console.h @@ -45,7 +45,7 @@ void add_cmd(char *name, cmd_function operation, char *documentation); /* Add a new parameter */ void add_param(char *name, int *valp, - char *doccumentation, + char *documentation, setter_function setter); ... diff --git a/qtest.c b/qtest.c index 86d68fb..fe4f9e6 100644 --- a/qtest.c +++ b/qtest.c @@ -744,7 +744,7 @@ int main(int argc, char *argv[]) break; case 'l': strncpy(lbuf, optarg, BUFSIZE); - buf[BUFSIZE - 1] = '\0'; + lbuf[BUFSIZE - 1] = '\0'; logfile_name = lbuf; break; default: ... ``` ### 使用 [Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer) ```shell $ SANITIZER=1 VERBOSE=1 make test ... +++ TESTING trace trace-17-complexity: ... 0x5622af06b9a1 is located 0 bytes to the right of global variable 'simulation' defined in 'console.c:20:6' (0x5622af06b9a0) of size 1 'simulation' is ascii string '' SUMMARY: AddressSanitizer: global-buffer-overflow console.c:368 in do_option_cmd Shadow bytes around the buggy address: ... ``` function add_param 的宣告為 ```cpp void add_param(char *name, int *valp, char *documentation, setter_function setter); ``` simulation 和 echo 變數型態為 bool, 但 valp 是 int pointer, 因此會造成 global-buffer-overflow [修改 simulation 和 echo 變數型態為 int](https://github.com/5f9/lab0-c/commit/5c56373afe16b324f0176a5379adaeb1580b231f) ### 動態程式分析工具 [Valgrind](https://valgrind.org) #### leak check ```shell $ make valgrind ``` #### fds not closed ```shell $ make && valgrind --track-fds=yes ./qtest -f traces/trace-17-complexity.cmd -l log.txt ... ==2513== FILE DESCRIPTORS: 5 open at exit. ==2513== Open file descriptor 5: /dev/urandom ==2513== at 0x4AC0D1B: open (open64.c:48) ==2513== by 0x10E16F: open (fcntl2.h:53) ==2513== by 0x10E16F: randombytes (random.c:17) ... ==2513== ==2513== Open file descriptor 3: log.txt ==2513== at 0x4AC0D1B: open (open64.c:48) ==2513== by 0x4A43195: _IO_file_open (fileops.c:189) ==2513== by 0x4A43459: _IO_file_fopen@@GLIBC_2.2.5 (fileops.c:281) ==2513== by 0x4A35B0D: __fopen_internal (iofopen.c:75) ==2513== by 0x4A35B0D: fopen@@GLIBC_2.2.5 (iofopen.c:86) ==2513== by 0x10C1D7: set_logfile (report.c:49) ... ``` [Close `/dev/urandom` and `logfile` descriptor](https://github.com/5f9/lab0-c/commit/e9551d2b7328349f4502699ff829123ae26fd681) ### 使用 [Make](https://www.gnu.org/software/make) 最佳化 -O2 或 -O3 ```shell $ make CFLAGS="-O2 -g -Wall -Werror -Idudect -I." CC qtest.o qtest.c: In function 'main': qtest.c:746:13: error: 'strncpy' specified bound 256 equals destination size [-Werror=stringop-truncation] strncpy(lbuf, optarg, BUFSIZE); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors make: *** [Makefile:48: qtest.o] Error 1 ``` [Fix strncpy warning in report file](https://github.com/5f9/lab0-c/commit/bbb2a8738941832bdce5291cb2edcb76475ed352) ### Core dump ```shell $ ulimit -c unlimited && ./qtest $ scripts/debug.py -a ``` ### Debug Using [GDB](https://www.gnu.org/software/gdb) ```shell $ make CFLAGS="-O0 -g -Wall -Werror -Idudect -I." VERBOSE=1 clean all $ scripts/debug.py -d Reading symbols from lab0-c/qtest... Signal Stop Print Pass to program Description SIGALRM No No No Alarm clock Starting program: lab0-c/qtest cmd> ``` ### [Perf](http://www.brendangregg.com/perf.html) ```shell $ sudo sh -c " echo 0 > /proc/sys/kernel/kptr_restrict" $ perf top -p $(pidof qtest) ``` ## 使用套件 ### Natural sort 從 [GitHub sourcefrog/natsort](https://github.com/sourcefrog/natsort.git) 進行 fork 後 clone ```shell git submodule add https://github.com/5f9/natsort.git ``` [Fix cppcheck errors](https://github.com/5f9/natsort/commit/376c59b00738e8ae6548c4cd20f25f8e75080d18) ### 自動補完 從 [GitHub antirez/linenoise](https://github.com/antirez/linenoise) 進行 fork 後 clone ```shell git submodule add https://github.com/5f9/linenoise.git ``` [Fix cppcheck errors](https://github.com/5f9/linenoise/commit/8f97b1b057355da798841503836de9b023680ed1) ## Queue 開發紀錄 ### queue.h 參考 [Naetw](https://hackmd.io/@bHCinvrjSmaS33lxDnYjog/BysQssYHN?type=view#%E4%BD%BF%E7%94%A8-macro-%E7%B0%A1%E5%8C%96%E7%A8%8B%E5%BC%8F%E7%A2%BC) 使用 macro 簡化程式碼 ```cpp #ifndef likely #define likely(x) __builtin_expect(!!(x), 1) #endif #ifndef unlikely #define unlikely(x) __builtin_expect((x), 0) #endif #define RETURN_IF_NULL(obj, ret_val) \ do { \ if (unlikely(!(obj))) \ return ret_val; \ } while (0) #define STRNCPY(dest, src, len) \ do { \ strncpy(dest, src, len); \ dest[len - 1] = '\0'; \ } while (0) /* Data structure declarations */ /* Linked list element */ typedef struct ELE { /* Pointer to array holding string. * This array needs to be explicitly allocated and freed */ char *value; struct ELE *next; } list_ele_t; /* Queue structure */ typedef struct { list_ele_t *head; /* Linked list of elements */ list_ele_t *tail; /* Element at tail of queue */ size_t size; } queue_t; ``` ### queue.c 參考 [2020q3 第 1 週測驗題](https://hackmd.io/@sysprog/sysprog2020-quiz1) ```cpp static inline list_ele_t *new_ele_str(char const *s) { list_ele_t *e = malloc(sizeof(list_ele_t)); RETURN_IF_NULL(e, NULL); size_t length = strlen(s) + 1; e->value = malloc(length); if (likely(e->value)) { STRNCPY(e->value, s, length); return e; } else { free(e); return NULL; } } static inline void free_ele_str(list_ele_t *e) { free(e->value); free(e); } queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); RETURN_IF_NULL(q, NULL); q->head = NULL; q->tail = NULL; q->size = 0; return q; } void q_free(queue_t *q) { RETURN_IF_NULL(q, ); /* Free queue structure */ for (list_ele_t *p, *e = q->head; e; e = p) { p = e->next; free_ele_str(e); } free(q); } bool q_insert_head(queue_t *q, const char *s) { RETURN_IF_NULL(q, false); list_ele_t *e = new_ele_str(s); RETURN_IF_NULL(e, false); e->next = q->head; q->head = e; if (!(q->size++)) q->tail = e; return true; } bool q_insert_tail(queue_t *q, const char *s) { RETURN_IF_NULL(q, false); RETURN_IF_NULL(q->head, q_insert_head(q, s)); list_ele_t *e = new_ele_str(s); RETURN_IF_NULL(e, false); e->next = NULL; q->tail->next = e; q->tail = e; q->size++; return true; } bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { RETURN_IF_NULL(q && q->size, false); list_ele_t *e = q->head; q->head = e->next; if (sp && bufsize) STRNCPY(sp, e->value, bufsize); free_ele_str(e); if (!(--q->size)) q->tail = NULL; return true; } int q_size(const queue_t *q) { RETURN_IF_NULL(q, 0); return q->size; } void q_reverse(queue_t *q) { RETURN_IF_NULL(q && 1 < q->size, ); list_ele_t *head = q->head; q->tail = head; list_ele_t *cursor = NULL; while (head) { list_ele_t *next = head->next; head->next = cursor; cursor = head; head = next; } q->head = cursor; } ``` ## 字串比較 ### str_cmp.h ```cpp enum { NATURAL_E = 0x01, // natural sort CI_E = 0x02, // case-insensitive DESC_E = 0x04, // descending order }; /* Comparison of the Linked list element function */ typedef int (*cmp_func)(const char *, const char *); cmp_func get_compar(const size_t order_bits); ``` ### str_cmp.c ```cpp #include "str_cmp.h" // cppcheck-suppress unusedFunction int cmp_str_desc(const char *a, const char *b) { return strcmp(b, a); } // cppcheck-suppress unusedFunction int cmp_str_nat_desc(const char *a, const char *b) { return strnatcmp(b, a); } // cppcheck-suppress unusedFunction int cmp_str_case_desc(const char *a, const char *b) { return strcasecmp(b, a); } // cppcheck-suppress unusedFunction int cmp_str_nat_case_desc(const char *a, const char *b) { return strnatcasecmp(b, a); } cmp_func get_compar(const size_t order_bits) { switch (order_bits) { case NATURAL_E: return strnatcmp; case CI_E: return strcasecmp; case NATURAL_E | CI_E: return strnatcasecmp; case DESC_E: return cmp_str_desc; case NATURAL_E | DESC_E: return cmp_str_nat_desc; case CI_E | DESC_E: return cmp_str_case_desc; case NATURAL_E | CI_E | DESC_E: return cmp_str_nat_case_desc; case 0: default: return strcmp; } } ``` ## Sort 開發紀錄 移植 Linux kernel merge sort [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) * 抽出 [test_list_sort.c](https://elixir.bootlin.com/linux/latest/source/lib/test_list_sort.c) 來測試 [5f9/linux_list_sort](https://github.com/5f9/linux_list_sort) * [Use Linux kernel merge](https://github.com/5f9/lab0-c/commit/113645b2cdd3d30adc7d0db5bf1404502e64ca3b) * [Use Linux kernel merge final](https://github.com/5f9/lab0-c/commit/5376aceff653f92cb79a94a9c242141c3c527c71) * [Use Linux kernel merge sort](https://github.com/5f9/lab0-c/commit/78e1fc5fc72fd5acf5c6cd2353d9f9a509510ee0) ### merge_sort.c ```cpp // From https://github.com/torvalds/linux/blob/master/lib/list_sort.c static inline list_ele_t *merge(const cmp_func cmp, list_ele_t *a, list_ele_t *b) { // cppcheck-suppress unassignedVariable list_ele_t *head, **tail = &head; for (;;) { if (cmp(a->value, b->value) <= 0) { *tail = a; tail = &(a->next); a = a->next; if (!a) { *tail = b; break; } } else { *tail = b; tail = &b->next; b = b->next; if (!b) { *tail = a; break; } } } return head; } static inline void merge_final(const cmp_func cmp, queue_t *q, list_ele_t *a, list_ele_t *b) { list_ele_t head = {.next = q->head}; list_ele_t *tail = &head; for (;;) { if (cmp(a->value, b->value) <= 0) { tail->next = a; tail = a; a = a->next; if (!a) break; } else { tail->next = b; tail = b; b = b->next; if (!b) { b = a; break; } } } /* Finish linking remainder of list b on to tail */ tail->next = b; do { tail = b; b = b->next; } while (b); q->head = head.next; q->tail = tail; } void merge_sort(queue_t *q, const cmp_func cmp) { list_ele_t *sorted[64 - __builtin_clzll((unsigned long long) q->size)]; size_t sorted_count = 0; list_ele_t *list = q->head->next, *pending = q->head; pending->next = NULL; size_t count = 1; do { size_t bits; sorted[sorted_count] = pending; size_t prev_count = 0; /* Find the least-significant clear bit in count */ for (bits = count; bits & 1; bits >>= 1) prev_count++; /* Do the indicated merge */ if (likely(bits)) { size_t end = sorted_count - prev_count; size_t start = end - 1; sorted[start] = merge(cmp, sorted[start], sorted[end]); for (size_t i = end; i < sorted_count; i++) sorted[i] = sorted[i + 1]; } else { sorted_count++; } pending = list; list = list->next; pending->next = NULL; count++; } while (list); for (size_t i = sorted_count - 1; i > 0; i--) { pending = merge(cmp, sorted[i], pending); } merge_final(cmp, q, sorted[0], pending); } ``` ### queue.c ```cpp void q_sort(queue_t *q, const cmp_func cmp) { RETURN_IF_NULL(q && 1 < q->size, ); merge_sort(q, cmp); } ``` ### qtest.c ```cpp static void console_init() { ... add_cmd("sort", do_sort, " [ci] [desc] | Sort queue in order\n" " | ci: case-insensitive " "(default: case-sensitive)\n" " | desc: descending (default: " "ascending)"); add_cmd("nsort", do_sort, " [ci] [desc] | Natural sort queue in order\n" " | ci: case-insensitive " "(default: case-sensitive)\n" " | desc: descending (default: " "ascending)"); ... } bool do_sort(int argc, char *argv[]) { if (simulation) { if (argc != 1) { report(1, "%s does not need arguments in simulation mode", argv[0]); return false; } bool ok = is_natural_sort(); if (!ok) { report(1, "ERROR: Probably not natural sort"); return false; } report(1, "Probably natural sort"); return ok; } if (argc != 1 && argc != 2 && argc != 3) { report(1, "%s needs 0-2 arguments", argv[0]); return false; } if (!q) report(3, "Warning: Calling sort on null queue"); error_check(); size_t sort_order = 0; if (!strcmp("nsort", argv[0])) sort_order |= NATURAL_E; if (argc > 1) { // case-insensitive if (!strcmp("ci", argv[1])) sort_order |= CI_E; if (!strcmp("desc", argv[1])) sort_order |= DESC_E; } if ((sort_order & CI_E) && argc > 2) { if (!strcmp("desc", argv[2])) sort_order |= DESC_E; } int cnt = q_size(q); if (cnt < 2) report(3, "Warning: Calling sort on single node"); error_check(); cmp_func cmpare_func = get_compar(sort_order); char *sorting_name = (sort_order & CI_E) ? "descending" : "ascending"; char *cmpare_name = (sort_order & NATURAL_E) ? " natural" : ""; set_noallocate_mode(true); if (exception_setup(true)) q_sort(q, cmpare_func); exception_cancel(); set_noallocate_mode(false); bool ok = true; if (q) { list_ele_t *e = q->head; for (; e && --cnt; e = e->next) { /* Ensure each element in ascending order */ if (cmpare_func(e->value, e->next->value) > 0) { report(1, "ERROR: Not sorted in %s%s order", sorting_name, cmpare_name); ok = false; break; } } if (e != q->tail) { report(1, "ERROR: The element at the end of the queue is incorrect"); ok = false; } } show_queue(3); return ok && !error_check(); } ``` ## 自動補完 antirez/linenoise 開發紀錄 ### [completion.c](https://github.com/5f9/lab0-c/blob/master/completion.c) ```cpp void completion(const char *buf, linenoiseCompletions *lc) { switch (buf[0]) { case 'f': linenoiseAddCompletion(lc, "free"); break; case 'h': linenoiseAddCompletion(lc, "help"); break; case 'i': linenoiseAddCompletion(lc, "ih"); linenoiseAddCompletion(lc, "it"); break; ... } } char *hints(const char *buf, int *color, int *bold) { if (!strcmp(buf, "sort") || !strcmp(buf, "nsort")) { *color = 35; *bold = 0; return " ci desc"; } return NULL; } ``` ### 移除 select 參考 [KYWeng - 解釋 select 系統呼叫在本程式的使用方式](https://hackmd.io/@KYWeng/S1DPSVSQ8#%E8%A7%A3%E9%87%8B-select-%E7%B3%BB%E7%B5%B1%E5%91%BC%E5%8F%AB%E5%9C%A8%E6%9C%AC%E7%A8%8B%E5%BC%8F%E7%9A%84%E4%BD%BF%E7%94%A8%E6%96%B9%E5%BC%8F) 移除 select ### console.c ```cpp bool run_console(char *infile_name, bool autocompletion) { if (!push_file(infile_name)) { report(1, "ERROR: Could not open source file '%s'", infile_name); return false; } bool is_stdin = buf_stack->fd == STDIN_FILENO; if (is_stdin && autocompletion) { linenoiseSetCompletionCallback(completion); linenoiseSetHintsCallback(hints); linenoiseHistoryLoad(HISTORY_FILE); } while (!cmd_done()) { if (is_stdin && autocompletion) { char *line = linenoise(prompt); if (line) { size_t len = strlen(line); strncpy(linebuf, line, RIO_BUFSIZE); linebuf[len] = '\n'; linebuf[len + 1] = '\0'; if (echo) { report_noreturn(1, prompt); report_noreturn(1, linebuf); } if (interpret_cmd(linebuf)) { linenoiseHistoryAdd(line); } linenoiseFree(line); } } else { if (is_stdin) { printf("%s", prompt); fflush(stdout); } char *cmdline = readline(); if (cmdline) interpret_cmd(cmdline); } } if (is_stdin && autocompletion) { linenoiseHistorySave(HISTORY_FILE); } return err_cnt == 0; } ``` ### scripts/debug.py ```python def debug(self, program): return self([ "-ex", "handle SIGALRM ignore", "-ex", "set args -d", # Disable auto completion "-ex", "start", "-ex", "c", program ]) ``` ## [oreparaz/dudect](https://github.com/oreparaz/dudect) ### 移植 dudect cropped ## Valgrid - Massif ```shell $ valgrind --tool=massif --stacks=yes --time-unit=B ./qtest -f traces/trace-17-complexity.cmd $ ms_print massif.out.4257 $ massif-visualizer massif.out.4257 ``` ## 自動測試機制 ### scripts/driver.py ```python= ... traceDict = { 1: "trace-01-ops", ... 18: "trace-18-natsort", 19: "trace-19-misc", 20: "trace-20-natsort-perf" } traceProbs = { 1: "Trace-01", ... 18: "Trace-18", 19: "trace-19", 20: "trace-20" } maxScores = [0, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5, 1, 1, 1] ... ``` ## References * [I01: lab0](https://hackmd.io/@sysprog/2020-lab0) * [2020 年系統軟體系列課程討論區](https://www.facebook.com/groups/system.software2020/) * [C Programming Lab](https://www.cs.cmu.edu/afs/cs/academic/class/15213-s20/www/labs/cprogramminglab.pdf) * [Programming languages — C](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf) * [Programming languages - C](http://port70.net/~nsz/c/c11/n1570.html) * [Naetw - 使用 macro 簡化程式碼](https://hackmd.io/@bHCinvrjSmaS33lxDnYjog/BysQssYHN?type=view#%E4%BD%BF%E7%94%A8-macro-%E7%B0%A1%E5%8C%96%E7%A8%8B%E5%BC%8F%E7%A2%BC) * [KYWeng - 使用 antirez/linenoise 強化 qtest 命令列功能](https://hackmd.io/@KYWeng/S1DPSVSQ8#使用-antirezlinenoise-強化-qtest-命令列功能) * [Branch prediction macros in GCC](https://www.geeksforgeeks.org/branch-prediction-macros-in-gcc) * [Find first set](https://en.wikipedia.org/wiki/Find_first_set) * [Linux kernel lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) * [tommarshall/git-good-commit](https://github.com/tommarshall/git-good-commit) * [Perf](http://www.brendangregg.com/perf.html) * [oreparaz/dudect](https://github.com/oreparaz/dudect) * [Welch's t-test](https://en.wikipedia.org/wiki/Welch%27s_t-test) * [Introduction to t statistics | Confidence intervals | AP Statistics | Khan Academy](https://youtu.be/a2rd4Qy8yNI) * [Welford's online algorithm](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm) * [你所不知道的 C 語言: 浮點數運算](https://hackmd.io/@sysprog/c-floating-point) * [Basic data structure and algorithm in Linux kernel](https://docs.google.com/document/d/1n31QMPyvdAkS2uxM5AyKu_DD3uRu6-zoRukAitDwPvA/edit#heading=h.ih9dq6159td) * [Perf 原理和實務](https://hackmd.io/@sysprog/gnu-linux-dev/https%3A%2F%2Fhackmd.io%2Fs%2FB11109rdg) * [CS:APP 學習指引](https://hackmd.io/@sysprog/CSAPP) * [Alternatives to forking into the same account](https://github.community/t/alternatives-to-forking-into-the-same-account/10200) * [How to update a forked repo with git rebase](https://medium.com/@topspinj/how-to-git-rebase-into-a-forked-repo-c9f05e821c8a)