2022q1 Homework1 (lab0)

--- tags: linux2022 --- # 2022q1 Homework1 (lab0) contributed by < [Eddielin0926](https://github.com/Eddielin0926) > ## :male-teacher: 作業要求 **記錄完成的項目** - [x] 在 GitHub 上 fork [lab0-c](https://github.com/sysprog21/lab0-c) - 參閱 [Git 教學和 GitHub 設定指引](https://hackmd.io/@sysprog/git-with-github) ^附教學影片^ - [x] 依據指示著手修改 `queue.[ch]` 和連帶的檔案，測試後用 Git 管理各項修改，要能滿足 `$ make test` 自動評分系統的所有項目。 - 在提交程式變更前，務必詳閱 [如何寫好 Git Commit Message](https://blog.louie.lu/2017/03/21/%E5%A6%82%E4%BD%95%E5%AF%AB%E4%B8%80%E5%80%8B-git-commit-message/) - 詳閱「[你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list)」並應用裡頭提及的手法，例如 pointer to pointer (指向某個指標的指標，或說 indirect pointer)，讓程式碼更精簡且有效 - 研讀 Linux 核心原始程式碼的 [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 並嘗試引入到 `lab0-c` 專案，比較你自己實作的 merge sort 和 Linux 核心程式碼之間效能落差 - [x] 在 `qtest` 提供新的命令 `shuffle`，允許藉由 [Fisher–Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle) 演算法，對佇列中所有節點進行洗牌 (shuffle) 操作 - [ ] 在 `qtest` 提供新的命令 `web`，提供 web 伺服器功能，注意: web 伺服器運作過程中，`qtest` 仍可接受其他命令 - 可依據前述說明，嘗試整合 [tiny-web-server](https://github.com/7890/tiny-web-server) - [ ] 除了修改程式，也要編輯「[作業區](https://hackmd.io/@sysprog/linux2022-homework1)」，增添開發紀錄和 GitHub 連結，除了提及你如何逐步達到自動評分程式的要求外，共筆也要涵蓋以下: - 開啟 [Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer)，修正 `qtest` 執行過程中的錯誤 - 先執行 `qtest` 再於命令提示列輸入 `help` 命令，會使開啟 [Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer) 觸發錯誤，應予以排除 - 運用 Valgrind 排除 `qtest` 實作的記憶體錯誤，並透過 Massif 視覺化 "simulation" 過程中的記憶體使用量，需要設計對應的實驗 - 解釋 [select](http://man7.org/linux/man-pages/man2/select.2.html) 系統呼叫在本程式的使用方式，並分析 [console.c](https://github.com/sysprog21/lab0-c/blob/master/console.c) 的實作，說明其中運用 CS:APP [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 的原理和考量點。可對照參考 [CS:APP 第 10 章重點提示](https://hackmd.io/@sysprog/H1TtmVTTz) - 研讀論文〈[Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf)〉，解釋本程式的 "simulation" 模式是如何透過以實驗而非理論分析，達到驗證時間複雜度，需要解釋 [Student's t-distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution) 及程式實作的原理。注意：現有實作存在若干致命缺陷，請討論並提出解決方案; - [ ] 指出現有程式的缺陷 (提示: 和 [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 有關) 或者可改進之處 (例如更全面地使用 Linux 核心風格的鏈結串列 API)，嘗試實作並提交 pull request ## :gear: 環境我使用的是 [WSL](https://docs.microsoft.com/zh-tw/windows/wsl/)，Linux Distribution 是 Ubuntu 20.04 ```shell $ neofetch --off eddie@eddie-laptop ------------------ OS: Ubuntu 20.04.3 LTS on Windows 10 x86_64 Kernel: 5.10.16.3-microsoft-standard-WSL2 Uptime: 2 hours, 53 mins Packages: 870 (dpkg) Shell: zsh 5.8 Terminal: /dev/pts/4 CPU: 11th Gen Intel i5-1135G7 (8) @ 2.419GHz Memory: 595MiB / 12562MiB ``` 另外甲骨文 (Oracle) 有提供免費的 [VPS](https://www.oracle.com/tw/cloud/free/) 我也有申請一台做為測試。 ```shell $ neofetch --off ubuntu@eddie-oracle-cloud ------------------------- OS: Ubuntu 20.04.3 LTS x86_64 Host: KVM/QEMU (Standard PC (i440FX + PIIX, 1996) pc-i440fx-4.2) Kernel: 5.11.0-1027-oracle Uptime: 32 mins Packages: 637 (dpkg), 6 (snap) Shell: bash 5.0.17 Terminal: /dev/pts/0 CPU: AMD EPYC 7551 (2) @ 1.996GHz GPU: 00:02.0 Vendor 1234 Device 1111 Memory: 169MiB / 972MiB ``` :::info :bulb: 我找到一個蠻有趣的[腳本](https://bench.sh/)可以幫你測試電腦的各項規格和網路速度及延遲執行 `wget -qO- bench.sh | bashc` 或 `curl -Lso- bench.sh | bash` 開始測試 :::spoiler 執行結果 ``` -------------------- A Bench.sh Script By Teddysun ------------------- Intro : https://teddysun.com/444.html Version : v2022-01-01 Usage : wget -qO- bench.sh | bash ---------------------------------------------------------------------- CPU Model : 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz CPU Cores : 8 CPU Frequency : 2419.200 MHz CPU Cache : 8192 KB Total Disk : 2999.0 GB (1696.0 GB Used) Total Mem : 7816 MB (632 MB Used) Total Swap : 2048 MB (0 MB Used) System uptime : 0 days, 0 hour 59 min Load average : 0.08, 0.02, 0.01 OS : Ubuntu 20.04.3 LTS Arch : x86_64 (64 Bit) Kernel : 5.10.60.1-microsoft-standard-WSL2 TCP CC : cubic Virtualization : Dedicated Organization : AS3462 Data Communication Business Group Location : Taichung / TW Region : Taiwan ---------------------------------------------------------------------- I/O Speed(1st run) : 926 MB/s I/O Speed(2nd run) : 1.7 GB/s I/O Speed(3rd run) : 1.7 GB/s I/O Speed(average) : 1469.2 MB/s ---------------------------------------------------------------------- Node Name Upload Speed Download Speed Latency Speedtest.net 38.66 Mbps 51.09 Mbps 3.30 ms Los Angeles, US 38.90 Mbps 46.17 Mbps 251.47 ms Dallas, US 39.19 Mbps 47.69 Mbps 251.45 ms Montreal, CA 39.02 Mbps 51.83 Mbps 251.66 ms Paris, FR 34.37 Mbps 50.47 Mbps 279.10 ms Amsterdam, NL 36.70 Mbps 52.75 Mbps 251.82 ms Shanghai, CN 8.31 Mbps 48.72 Mbps 106.83 ms Nanjing, CN 34.93 Mbps 47.15 Mbps 55.13 ms Guangzhou, CN 3.81 Mbps 3.86 Mbps 163.36 ms Hongkong, CN 32.39 Mbps 51.52 Mbps 81.51 ms Seoul, KR 36.47 Mbps 52.32 Mbps 58.88 ms Tokyo, JP 39.41 Mbps 45.80 Mbps 42.65 ms ---------------------------------------------------------------------- Finished in : 6 min 20 sec Timestamp : 2022-02-19 00:45:50 ---------------------------------------------------------------------- ``` ::: ### :construction_worker: 前置作業 #### 格式化程式碼我使用的是 [Visual Studio Code](https://marketplace.visualstudio.com/items?itemName=ms-vscode.cpptools)，安裝 [C/C++ 擴充套件](https://marketplace.visualstudio.com/items?itemName=ms-vscode.cpptools)後，在 `setting.json` 中設定在儲存檔案時格式化，可自動格式化程式碼。 (如果現行目錄下有 `.clang-format` 檔案，會自動套用格式) ```json "editor.formatOnSave": true ``` > 參考 [Visual Studio Code - Code formatting]( https://code.visualstudio.com/docs/cpp/cpp-ide#_code-formatting) ## :chains: 佇列和其操作 ### q_new **功能:** 分配記憶體空間給新的佇列並初始化後，傳回佇列位址，當 `malloc` 失敗時會回傳 `NULL`。撰寫程式碼時，我發現到 `lish.h` 有很多巨集定義，因此先了閱讀 `lish.h` 才繼續寫下去。 ```c struct list_head *q_new() { struct list_head *head = malloc(sizeof(struct list_head)); if (head == NULL) return NULL; INIT_LIST_HEAD(head); return head; } ``` ### q_free **功能:** 釋放佇列與其元素所分配到到的記憶體空間 `list_for_each_safe` 利用了 `tmp` 指標保留下次迭代的位置，因此可以在迭代的過程中移除節點。一開始 `container_of` 確實讓我疑惑了一下，但簡單來說就是可以反向找到包含這個屬性 (attribute) 的結構體 (struct) 位址。 ```c void q_free(struct list_head *l) { if (!l) return; struct list_head *pos, *tmp; list_for_each_safe (pos, tmp, l) { element_t *node = container_of(pos, element_t, list); free(node->value); list_del(pos); free(node); } free(l); } ``` > 參考: [你所不知道的 C 語言: linked list 和非連續記憶體 - container_of](https://hackmd.io/@sysprog/c-linked-list#container_of) ### q_insert_head **功能:** 將一個新元素插入到佇列的第一個位置。如果佇列為 `NULL`，它就回傳 `false`。如果記憶體空間分配失敗，該函式將返回 `false` 並釋放函式內其他要求的記憶體。這邊讓我考慮比較多的其實是要不要使用 `strdup`，這個函式在論壇上有一些[討論](https://stackoverflow.com/questions/12984948/why-is-strdup-considered-to-be-evil)，它不隸屬於標準的 C 函式庫，而是涵蓋於 [POSIX](https://en.wikipedia.org/wiki/POSIX)，加上它是少數會分配記憶體的函式，因此需要注意記憶體流失 (memory leak)。 :::warning 針對電腦技術名詞，若沒有涉及本地化 (localization)，不要引用中文維基百科的詞目，改用英語的 Wikipedia，因為前者資訊往往過時。 :notes: jserv ::: ```c bool q_insert_head(struct list_head *head, char *s) { if (!head || !s) return false; element_t *new_ele = malloc(sizeof(element_t)); if (new_ele) return false; new_ele->value = strdup(s); if (!new_ele->value) { free(new_ele); return false; } return true; } ``` > 參考: [What is the rationale for not including strdup in the C Standard?](https://stackoverflow.com/questions/32944390/what-is-the-rationale-for-not-including-strdup-in-the-c-standard) ### q_insert_tail **功能:** 將一個新元素插入到佇列的最後一個位置。如果佇列為 `NULL`，它就回傳 `false`。如果記憶體空間分配失敗，該函數將返回 `false` 並釋放函式內其他要求的記憶體。與 `q_insert_head` 的差別只在於使用了 `list_add_tail`。 ```c bool q_insert_tail(struct list_head *head, char *s) { if (!head || !s) return false; element_t *new_ele = malloc(sizeof(element_t)); if (!new_ele) return false; new_ele->value = strdup(s); if (!new_ele->value) { free(new_ele); return false; } list_add_tail(&new_ele->list, head); return true; } ``` ### q_remove_head **功能:** 移除佇列的第一個節點。如果佇列是 `NULL` 或是空的會回傳 `NULL`。如果 `sp` 不為 `NULL` 就會將元素拷貝到指定的緩衝區中，且最多複製 `bufsize - 1` 個字元。如果來源的字串長度大於給定的字元數， `strncpy` 不會自動附加 `\0` 字元在末端，因此須保證緩衝區的末端會是 `\0` 字元。 ```c element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *node = container_of(head->next, element_t, list); if (sp) { strncpy(sp, node->value, bufsize - 1); sp[bufsize - 1] = '\0'; } list_del(&node->list); return node; } ``` ### q_remove_tail **功能:** 移除佇列的最後一個元素。如果佇列是 `NULL` 或是空的會回傳 `NULL`。如果 `sp` 不為 `NULL` 就會將節點複製到指定的緩衝區中，且最多複製 `bufsize - 1` 個字元。 ```c element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) return NULL; element_t *node = container_of(head->prev, element_t, list); if (sp) { strncpy(sp, node->value, bufsize - 1); sp[bufsize - 1] = '\0'; } list_del(&node->list); return node; } ``` ### q_size **功能:** 計算佇列的長度，當佇列為空時回傳 0。 ```c int q_size(struct list_head *head) { if (!head || list_empty(head)) return 0; int size = 0; struct list_head *node; list_for_each (node, head) size++; return size; } ``` ### q_delete_mid **功能:** 刪除佇列中間的節點，如果長度是偶數則取 $\frac{n}{2}$ 的整數位置。我的做法是同時從開頭和從尾端開始尋找，當開頭的指標等於尾端的指標，或是開頭的下一個指標是尾端的指標，就代表找到中間的節點。 ```c bool q_delete_mid(struct list_head *head) { if (!head || list_empty(head)) return false; element_t *mid = NULL; if (list_is_singular(head)) { mid = list_first_entry(head, element_t, list); } else { struct list_head *front = head->next, *back = head->prev; while (front != back && front->next != back) { front = front->next; back = back->prev; } mid = list_entry(front, element_t, list); } list_del(&mid->list); free(mid->value); free(mid); return true; } ``` ### q_delete_dup **功能:** 刪除**所有**重複的節點每指到一個節點就會去檢查是否跟下一個節點相同，如果相同就移除掉，在刪除完重複的節點後再將自己刪除。 ```c bool q_delete_dup(struct list_head *head) { if (!head || list_empty(head) || list_is_singular(head)) return false; struct list_head *node = head->next; while (node != head) { bool is_dup = false; element_t *cur = list_entry(node, element_t, list); while (node->next && node->next != head) { element_t *next = list_entry(node->next, element_t, list); if (strcmp(cur->value, next->value) == 0) { list_del(&next->list); free(next->value); free(next); is_dup = true; } else { break; } } node = node->next; if (is_dup) { list_del(&cur->list); free(cur->value); free(cur); } } return true; } ``` ### q_swap *功能:* 將節點兩兩互換這邊利用了 `list_move_tail` 來做交換，將下一個的節點剪下並插入到當前節點之前。 ```c void q_swap(struct list_head *head) { if (!head || list_empty(head) || list_is_singular(head)) return; for (struct list_head *node = head->next; node != head && node->next != head; node = node->next) { list_move_tail(node->next, node); } } ``` ### q_reverse **功能:** 將佇列的元素順序反向將 `list` 的 `prev` 和 `next` 交換即可 ```c void q_reverse(struct list_head *head) { if (!head || list_empty(head)) return; struct list_head *node, *safe, *tmp; list_for_each_safe (node, safe, head) { tmp = node->prev; node->prev = node->next; node->next = tmp; } tmp = head->prev; head->prev = head->next; head->next = tmp; } ``` ### q_sort **功能:** 排序佇列內的元素這邊我使用的是 merge sort，因此我額外寫了一個函式 `_list_merge` 用來將兩個佇列合併。 merge sort 簡單來說就是不斷將陣列分成兩段，一直分到剩一個元素後，將元素和按照順序合併，再合併排序好的陣列。 ```c void q_sort(struct list_head *head) { if (!head || list_empty(head) || list_is_singular(head)) return; struct list_head *mid = head; int size = q_size(head) / 2; for (int i = 0; i < size; i++) mid = mid->next; struct list_head left, right; INIT_LIST_HEAD(&left); INIT_LIST_HEAD(&right); list_cut_position(&left, head, mid); list_cut_position(&right, head, head->prev); q_sort(&left); q_sort(&right); _list_merge(&left, &right); list_splice(&left, head); } ``` `_list_merge` 用來合併陣列，按照大小將元素放到 `head` 佇列中，最後將 `head` 移到 `left` 。 ```c void _list_merge(struct list_head *left, struct list_head *right) { struct list_head head; INIT_LIST_HEAD(&head); while (!list_empty(left) && !list_empty(right)) { if (strcmp(list_first_entry(left, element_t, list)->value, list_first_entry(right, element_t, list)->value) < 0) list_move_tail(left->next, &head); else list_move_tail(right->next, &head); } if (!list_empty(right)) list_splice_tail(right, &head); list_splice(&head, left); } ``` ## :game_die: 佇列洗牌在 `qtest` 提供新的命令 `shuffle`。首先，在 `console_init` 先增加一個新的指令 `shuffle`， ```c static void console_init() { ... ADD_COMMAND(shuffle, " | Shuffle queue"); ... } ``` `ADD_COMMAND` 會將 `shuffle` 的指令展開成 `do_shuffle` 的函式。 ```c add_cmd("shuffle", do_shuffle, " | Shuffle queue") ``` 我參考了功能類似的 `do_swap`，然後將 `q_swap` 替換成 `q_shuffle`。 ```c static bool do_shuffle(int argc, char *argv[]) { if (argc != 1) { report(1, "%s takes no arguments", argv[0]); return false; } if (!l_meta.l) report(3, "Warning: Try to access null queue"); error_check(); set_noallocate_mode(true); if (exception_setup(true)) q_shuffle(l_meta.l); exception_cancel(); set_noallocate_mode(false); show_queue(3); return !error_check(); } ``` 在 `queue.h` 新增 `q_shuffle` 的定義。 ```c /* * Shuffle elements of queue * No effect if q is NULL or empty. In addition, if q has only one * element, do nothing. */ void q_shuffle(struct list_head *head); ``` 接下來實作 `q_shuffle`，我採用類似 Fisher and Yates' original method 來實作，隨機取一個元素並放置到新的佇列中，最後在將這佇列放到原本的佇列。 ```c void q_shuffle(struct list_head *head) { if (!head || list_empty(head) || list_is_singular(head)) return; struct list_head shuffle; INIT_LIST_HEAD(&shuffle); while (!list_empty(head)) { int r = rand() % q_size(head); struct list_head *node = head->next; for (int i = 0; i < r; i++) { node = node->next; } list_del(node); list_add_tail(node, &shuffle); } list_splice(&shuffle, head); } ``` ## :globe_with_meridians: 整合 [tiny-web-server](https://github.com/7890/tiny-web-server) 這個功能是要建立一個伺服器並將 http request 當作 command line input 來操作 `qtest`。與 `shuffle` 一樣，我們需要先將 `web` 指令註冊到 `qtest` 中。 ```c static void console_init() { ... ADD_COMMAND(web, " | Launch a tiny web server"); ... } ``` 接下來先做一個簡單的 `do_web`。 ```c static bool do_web(int argc, char *argv[]) { if (argc != 1) { report(1, "%s takes no arguments", argv[0]); return false; } return true; } ``` 為了要整合 tiny-web-server，我先觀察了他的 `main` 函式，原本的功能是要開啟資料夾或檔案，因此先將路徑相關的處理刪除，只留下設定 socket 的程式碼，並且將回傳設為 socket 的檔案描述符 (file descriptor)，然後將 `main` 改名為 `tinyserver`。 ```c int tinyserver(int argc, char **argv) { int default_port = DEFAULT_PORT, listenfd, connfd; char buf[256]; if (argc == 2) { if (argv[1][0] >= '0' && argv[1][0] <= '9') { default_port = atoi(argv[1]); } } listenfd = open_listenfd(default_port); if (listenfd > 0) { printf("listen on port %d, fd is %d\n", default_port, listenfd); } else { perror("ERROR"); exit(listenfd); } return listenfd; } ``` :::info TODO: 應該只使用 `open_listenfd`，然後將錯誤整合到 `report`，也不需要 `exit`。 ::: 另外我將 `SIGPIPE` 放到 `queue_init`，與其他 `signal` 一起處理。 ```diff static void queue_init() { fail_count = 0; l_meta.l = NULL; signal(SIGSEGV, sigsegvhandler); signal(SIGALRM, sigalrmhandler); + // Ignore SIGPIPE signal, so if browser cancels the request, it + // won't kill the whole process. + signal(SIGPIPE, SIG_IGN); } ``` 新增標頭檔 `tinyserver.h`，然後在 `qtest.c` 中引入。 ```c #ifndef TINY_SERVER_H #define TINY_SERVER_H int tinyserver(int argc, char **argv); #endif // TINY_SERVER_H ``` 修改剛剛的 `do_web` 並呼叫 `tinyserver`。 ```c static bool do_web(int argc, char *argv[]) { if (argc > 2) { report(1, "%s takes only one argument as the port number", argv[0]); return false; } tinyserver(argc, argv); return true; } ``` 在 Makefile 中新增 `tinyserver.o` ``` OBJS := qtest.o report.o console.o harness.o queue.o \ random.o dudect/constant.o dudect/fixture.o dudect/ttest.o \ linenoise.o tinyserver.o ``` 重新編譯並測試看看 `web` 指令。 ```shell= $ make CC tinyserver.o LD qtest $ ./qtest cmd> web listen on port 9999, fd is 3 cmd> ``` 到這邊我們已經簡單將 web socket 打開了，接下來我們需要接受 request。原本的 `process` 函式用來處理 request 並打開對應的檔案，現在我們改為將 request 的 command 回傳。 ```c char *process(int fd, struct sockaddr_in *clientaddr) { #ifdef LOG_ACCESS printf("accept request, fd is %d, pid is %d\n", fd, getpid()); #endif http_request req; parse_request(fd, &req); char *p = req.filename; /* Change '/' to ' ' */ while ((*p) != '\0') { ++p; if (*p == '/') *p = ' '; } int status = 200; #ifdef LOG_ACCESS log_access(status, clientaddr, &req); #endif char *ret = strdup(req.filename); writen(fd, req.filename, strlen(req.filename)); return ret; } ``` 之後按照 [整合 tiny-web-server](https://hackmd.io/@sysprog/linux2022-lab0#%E6%95%B4%E5%90%88-tiny-web-server) 的說明對 `console.c` 做修改即可。用 curl 簡單做個測試。 ```shell $./qtest cmd> web listen on port 9999, fd is 3 cmd> accept request, fd is 4, pid is 9798 127.0.0.1:38134 200 - 'new' (text/plain) l = [] cmd> accept request, fd is 4, pid is 9798 127.0.0.1:38136 200 - 'it 5' (text/plain) l = [5] cmd> accept request, fd is 4, pid is 9798 127.0.0.1:38138 200 - 'quit' (text/plain) Freeing queue ```