# 2024q1 Homework1 (lab0) contributed by < [`johnny69527`](https://github.com/johnny69527/lab0-c) > ## 開發環境 ```bash $ gcc --version gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 36 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 4 On-line CPU(s) list: 0-3 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i5-2410M CPU @ 2.30GHz CPU family: 6 Model: 42 Thread(s) per core: 2 Core(s) per socket: 2 Socket(s): 1 Stepping: 7 CPU max MHz: 2900.0000 CPU min MHz: 800.0000 BogoMIPS: 4589.81 ``` ## 指定的佇列操作 不知道為何在 trace-17-complexity 中的幾個單純的操作被認為可能不是 constant time? > follow up: 研究 dudect 相關程式 ```shell +++ TESTING trace trace-17-complexity: # Test if time complexity of q_insert_tail, q_insert_head, q_remove_tail, and q_remove_head is constant ERROR: Probably not constant time or wrong implementation ERROR: Probably not constant time or wrong implementation Probably constant time ERROR: Probably not constant time or wrong implementation --- trace-17-complexity 0/5 --- TOTAL 95/100 make: *** [Makefile:60: test] Error 1 ``` ## 引入 [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) ### 加入 linux list sort 實作程式碼 [`list_sort.h`](https://github.com/johnny69527/lab0-c/blob/master/list_sort.h)和[`list_sort.c`](https://github.com/johnny69527/lab0-c/blob/master/list_sort.c) ### 加入 linux_sort 命令 :::danger 使用 `git diff` 或 `diff -u` 命令來產生程式碼之間的變更列表,不要憑感覺填入,注意位移量。 ::: 修改 `qtest.c` : ```c +#include "list_sort.h" #include "queue.h" ... .. . -bool do_sort(int argc, char *argv[]) +static int cmp(void *priv, struct list_head *a, struct list_head *b) +{ + element_t *l = list_entry(a, element_t, list); + element_t *r = list_entry(b, element_t, list); + return strcmp(l->value, r->value); +} + +static int cmp_descend(void *priv, struct list_head *a, struct list_head *b) +{ + return -1 * cmp(priv, a, b); +} + +bool _do_sort(int argc, char *argv[], int mode) { ... .. . set_noallocate_mode(true); - if (current && exception_setup(true)) - q_sort(current->q, descend); + if (current && exception_setup(true)) { + if (mode == 0) + q_sort(current->q, descend); + else if (mode == 1) + list_sort(NULL, current->q, descend ? cmp_descend : cmp); + } exception_cancel(); set_noallocate_mode(false); ... .. . +bool do_sort(int argc, char *argv[]) +{ + return _do_sort(argc, argv, 0); +} + +bool do_linux_sort(int argc, char *argv[]) +{ + return _do_sort(argc, argv, 1); +} + ... .. . ADD_COMMAND(sort, "Sort queue in ascending/descening order", ""); + ADD_COMMAND(linux_sort, "Sort queue with linux lib/list_sort.c", ""); ADD_COMMAND(size, "Compute queue size n times (default: n == 1)", "[n]"); ... .. ``` 不知道為何?上述修改,若把 `#include "list_sort.h` 放在 `#include "harness.h"` 之前,會得到如下編譯錯誤。但放在之後,便能正確編譯。 :::danger 不要說「不知道為何?」陳述事實即可,找到根本原因。 ::: ```bash $ make CC qtest.o qtest.c: In function ‘do_free’: qtest.c:104:5: error: implicit declaration of function ‘error_check’ [-Werror=implicit-function-declaration] 104 | error_check(); | ^~~~~~~~~~~ qtest.c:107:9: error: implicit declaration of function ‘set_cautious_mode’ [-Werror=implicit-function-declaration] 107 | set_cautious_mode(false); | ^~~~~~~~~~~~~~~~~ qtest.c:118:13: error: implicit declaration of function ‘exception_setup’ [-Werror=implicit-function-declaration] 118 | if (exception_setup(true)) | ^~~~~~~~~~~~~~~ qtest.c:120:9: error: implicit declaration of function ‘exception_cancel’ [-Werror=implicit-function-declaration] 120 | exception_cancel(); | ^~~~~~~~~~~~~~~~ qtest.c:132:19: error: implicit declaration of function ‘allocation_check’ [-Werror=implicit-function-declaration] 132 | size_t bcnt = allocation_check(); | ^~~~~~~~~~~~~~~~ qtest.c: In function ‘do_reverse’: qtest.c:526:5: error: implicit declaration of function ‘set_noallocate_mode’ [-Werror=implicit-function-declaration] 526 | set_noallocate_mode(true); | ^~~~~~~~~~~~~~~~~~~ qtest.c: In function ‘console_init’: qtest.c:1087:26: error: ‘fail_probability’ undeclared (first use in this function) 1087 | add_param("malloc", &fail_probability, "Malloc failure probability percent", | ^~~~~~~~~~~~~~~~ qtest.c:1087:26: note: each undeclared identifier is reported only once for each function it appears in qtest.c: In function ‘sigalrm_handler’: qtest.c:1109:5: error: implicit declaration of function ‘trigger_exception’ [-Werror=implicit-function-declaration] 1109 | trigger_exception( | ^~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors make: *** [Makefile:54: qtest.o] Error 1 ``` 修改 `Makefile`: ``` OBJS := qtest.o report.o console.o harness.o queue.o \ random.o dudect/constant.o dudect/fixture.o dudect/ttest.o \ shannon_entropy.o \ - linenoise.o web.o + linenoise.o web.o list_sort.o ``` ### 效能比較 修改 `harness.c` 關掉命令的執行時限: ```c -static int time_limit = 1; +static int time_limit = 0; ``` 建立 [trace-perf-merge-sort.cmd](https://github.com/johnny69527/lab0-c/blob/master/traces/trace-perf-merge-sort.cmd) 和 [trace-perf-linux-sort.cmd](https://github.com/johnny69527/lab0-c/blob/master/traces/trace-perf-linux-sort.cmd) 命令序列,用來給 qtest 産生我自己實作的 merge sort 和 linux list sort 的執行時間數據。 修改 `Makefile`,加入 `sort_perf` make target: ``` sort_perf: ./qtest -v 2 -f traces/trace-perf-linux-sort.cmd -l linux_sort.dat cat linux_sort.dat | grep -v 'Delta time =' | sed 's/# //g' > linux_sort.dat01 cat linux_sort.dat | grep 'Delta time =' | sed 's/Delta time = //g' > linux_sort.dat02 pr -m -T linux_sort.dat01 linux_sort.dat02 > linux_sort.dat rm linux_sort.dat01 linux_sort.dat02 ./qtest -v 2 -f traces/trace-perf-merge-sort.cmd -l merge_sort.dat cat merge_sort.dat | grep -v 'Delta time =' | sed 's/# //g' > merge_sort.dat01 cat merge_sort.dat | grep 'Delta time =' | sed 's/Delta time = //g' > merge_sort.dat02 pr -m -T merge_sort.dat01 merge_sort.dat02 > merge_sort.dat rm merge_sort.dat01 merge_sort.dat02 gnuplot scripts/sort_perf.gp rm linux_sort.dat merge_sort.dat xdg-open sort_perf.png ``` 執行 `make sort_perf` 産生以下執行時間比較圖,可以看出我實作的 merge sort 和 linux list sort 有很大的效能落差。 ![sort_perf](https://hackmd.io/_uploads/Bk3syEmp6.png) :::danger 你如何設計實驗?用什麼資料當作排序程式的輸入? ::: ## 整合 web 命令和 linenoise 套件 ### 解決 favicon.ico 的問題 按照作業說明的做法解決 favicon.ico 的問題。 ```c char *p = web_recv(web_connfd, &clientaddr); char *buffer = "HTTP/1.1 200 OK\r\n%s%s%s%s%s%s" "Content-Type: text/html\r\n\r\n" "<html><head><style>" "body{font-family: monospace; font-size: 13px;}" "td {padding: 1.5px 6px;}" "</style><link rel=\"shortcut icon\" href=\"#\">" "</head><body><table>\n"; web_send(web_connfd, buffer); ``` 但是引申了另一個問題 - [page loads twice in Google chrome](https://stackoverflow.com/questions/2009092/page-loads-twice-in-google-chrome),一些瀏覽器(如 chrome)會每次發出重複多次請求: ```shell cmd> web listen on port 9999, fd is 3 cmd> l = [] cmd> l = [] cmd> l = [] cmd> l = [a] cmd> l = [a a] cmd> l = [a a a] cmd> Current queue ID: 2 l = [a a a] cmd> Current queue ID: 2 l = [a a a] cmd> Current queue ID: 2 l = [a a a] cmd> ``` 在 `console.c` 加入一個不做任何事的 favicon.ico 命令迴避這個問題。 ```c static bool do_nothing(int arc, char *argv[]) { printf("favicon.ico work around\n"); return true; } ... .. . ADD_COMMAND(web, "Read commands from builtin web server", "[port]"); add_cmd("favicon.ico", do_nothing, "Do nothing command for favicon.ico request from browser", ""); ``` ```bash cmd> web listen on port 9999, fd is 3 cmd> l = [] cmd> favicon.ico work around cmd> l = [a] cmd> favicon.ico work around cmd> Current queue ID: 0 l = [a] cmd> favicon.ico work around cmd> ``` ### 用 select 系統呼叫同時處理 stdin 及 socket 在 `linenoise.c` 中的 `line_edit` 的 `while(1)` 迴圈裡採用 `select` 系統呼叫查詢哪個 File Descriptor 有內容可以讀取,便不會造成 web 和 stdio 相互阻塞的情況。 :::danger 使用 `git diff` 或 `diff -u` 命令來產生程式碼之間的變更列表,不要憑感覺填入,注意位移量。 ::: ```c +extern int web_fd; +extern int web_connfd; /* This function is the core of the line editing capability of linenoise. * It expects 'fd' to be already in "raw mode" so that every key pressed * will be returned ASAP to read(). */ static int line_edit(int stdin_fd, int stdout_fd, char *buf, size_t buflen, const char *prompt) { ... .. . while (1) { + fd_set set; + FD_ZERO(&set); + if (web_fd) + FD_SET(web_fd, &set); + FD_SET(l.ifd, &set); + + int rv = web_fd ? select(web_fd + 1, &set, NULL, NULL, NULL) + : select(l.ifd + 1, &set, NULL, NULL, NULL); + + switch (rv) { + case -1: + perror("select"); /* an error occurred */ + continue; + case 0: + printf("timeout occurred\n"); /* a timeout occurred */ + continue; + default: + if (web_fd && FD_ISSET(web_fd, &set)) { + struct sockaddr_in clientaddr; + socklen_t clientlen = sizeof clientaddr; + web_connfd = + accept(web_fd, (struct sockaddr *) &clientaddr, &clientlen); + char *p = web_recv(web_connfd, &clientaddr); + strncpy(buf, p, strlen(p) + 1); + + char *buffer = + "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\n"; + web_send(web_connfd, buffer); + + free(p); + return strlen(p) + 1; + } else if (FD_ISSET(l.ifd, &set)) { signed char c; int nread; char seq[5]; nread = read(l.ifd, &c, 1); if (nread <= 0) return l.len; ... .. . - return l.len; + } + } + return l.len; } ``` ## Fisher-Yates shuffle `q_shuffle` 實作: ```c /* Shuffle elements in queue */ void q_shuffle(struct list_head *head) { if (!head || list_empty(head) || list_is_singular(head)) return; /* srand(time(0)); */ int len = q_size(head); struct list_head *i_th = head->prev; for (int i = len; i > 0; i--) { struct list_head *r_th = head; int r = rand() % i + 1; if (r != i) { // find r_th for (int j = 0; j < r; j++) r_th = r_th->next; // swap r_th and i_th element element_t *e_rth = list_entry(r_th, element_t, list); element_t *e_ith = list_entry(i_th, element_t, list); char *temp = e_rth->value; e_rth->value = e_ith->value; e_ith->value = temp; } i_th = i_th->prev; } } ``` ### 統計分析 使用作業說明提供的 script 進行分析: ``` $ python3 scripts/chi_squared_sum.py Expectation: 41666 Observation: {'1234': 41399, '1243': 41753, '1324': 41752, '1342': 41786, '1423': 41823, '1432': 41857, '2134': 41372, '2143': 41883, '2314': 41408, '2341': 41436, '2413': 42025, '2431': 41837, '3124': 41544, '3142': 41452, '3214': 41737, '3241': 41646, '3412': 41844, '3421': 41626, '4123': 41845, '4132': 41684, '4213': 41543, '4231': 41539, '4312': 41529, '4321': 41680} chi square sum: 18.117601881630108 ``` ![shuffle](https://hackmd.io/_uploads/B1KbBtETa.png) 用[查表](https://homepage.divms.uiowa.edu/~mbognar/applets/chisq.html)的方式查出自由度 23,$X^2$=18.117601881630108 時,p=0.75108=75.108%。$H_0$ 不顯著,無證據說此分佈不均勻。 若以時間作為亂數種子 `srand(time(0))`,進行分析: ``` $ python3 scripts/chi_squared_sum.py Expectation: 41666 Observation: {'1234': 35691, '1243': 55973, '1324': 44561, '1342': 13536, '1423': 48781, '1432': 72626, '2134': 67690, '2143': 33906, '2314': 28418, '2341': 22146, '2413': 25936, '2431': 38575, '3124': 60054, '3142': 20356, '3214': 34389, '3241': 33245, '3412': 41248, '3421': 38249, '4123': 58762, '4132': 56140, '4213': 41892, '4231': 49423, '4312': 35990, '4321': 42413} chi square sum: 122952.21797148754 ``` ![srand_time](https://hackmd.io/_uploads/SkDJvK4pa.png) 自由度 23,$X^2$=122952.21797148754 時,p=0=0% 經由卡方檢定計算出的 p 值小於設定的信心水準 (如0.05) 拒絕虛無假說,不符合平均分佈。 為甚麼採用不同的亂數種子 - `os_random(getpid() ^ getppid())` 和 `time(0)`,會産生相對的結果呢? > > The srand() function sets its argument as the seed for a new sequence of pseudo-random integers to be returned by rand(). These sequences are repeatable bycalling srand() with the same seed value > >man srand 根據手冊所說,種子代表一組 pseudo-random integer,同一個 pseudo-random number generator 不應該是一樣亂嗎? :::danger 複習離散數學。 :::