--- tags: linux2022 --- # 2022q1 Homework1 (lab0) contributed by < [fourcolor](https://github.com/fourcolor) > ### 實驗環境 ```shell $ gcc --version gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 Copyright (C) 2019 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 39 bits physical, 48 bits virtual CPU(s): 4 On-line CPU(s) list: 0-3 Thread(s) per core: 1 Core(s) per socket: 4 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family: 6 Model: 158 Model name: Intel(R) Core(TM) i3-9100F CPU @ 3.60GHz Stepping: 11 CPU MHz: 4000.589 CPU max MHz: 4200.0000 CPU min MHz: 800.0000 BogoMIPS: 7200.00 Virtualization: VT-x L1d cache: 128 KiB L1i cache: 128 KiB L2 cache: 1 MiB L3 cache: 6 MiB NUMA node0 CPU(s): 0-3 ``` ## [作業要求](https://hackmd.io/@sysprog/2022-lab0) - [x] 在 GitHub 上 fork [lab0-c](https://github.com/sysprog21/lab0-c) * 參閱 [Git 教學和 GitHub 設定指引](https://hackmd.io/@sysprog/git-with-github) ^附教學影片^ - [x] 依據上述指示著手修改 `queue.[ch]` 和連帶的檔案,測試後用 Git 管理各項修改,要能滿足 `$ make test` 自動評分系統的所有項目。 * 在提交程式變更前,務必詳閱 [如何寫好 Git Commit Message](https://blog.louie.lu/2017/03/21/%E5%A6%82%E4%BD%95%E5%AF%AB%E4%B8%80%E5%80%8B-git-commit-message/) * 詳閱「[你所不知道的 C 語言: linked list 和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list)」並應用裡頭提及的手法,例如 pointer to pointer (指向某個指標的指標,或說 indirect pointer),讓程式碼更精簡且有效 * 研讀 Linux 核心原始程式碼的 [lib/list_sort.c](https://github.com/torvalds/linux/blob/master/lib/list_sort.c) 並嘗試引入到 `lab0-c` 專案,比較你自己實作的 merge sort 和 Linux 核心程式碼之間效能落差 - [x] 在 `qtest` 提供新的命令 `shuffle`,允許藉由 [Fisher–Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle) 演算法,對佇列中所有節點進行洗牌 (shuffle) 操作 - [x] 在 `qtest` 提供新的命令 `web`,提供 web 伺服器功能,注意: web 伺服器運作過程中,`qtest` 仍可接受其他命令 * 可依據前述說明,嘗試整合 [tiny-web-server](https://github.com/7890/tiny-web-server) - [ ] 除了修改程式,也要編輯「[作業區](https://hackmd.io/@sysprog/linux2022-homework1)」,增添開發紀錄和 GitHub 連結,除了提及你如何逐步達到自動評分程式的要求外,共筆也要涵蓋以下: * 開啟 [Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer),修正 `qtest` 執行過程中的錯誤 * 先執行 `qtest` 再於命令提示列輸入 `help` 命令,會使 開啟 [Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer) 觸發錯誤,應予以排除 * 運用 Valgrind 排除 `qtest` 實作的記憶體錯誤,並透過 Massif 視覺化 "simulation" 過程中的記憶體使用量,需要設計對應的實驗 * 解釋 [select](http://man7.org/linux/man-pages/man2/select.2.html) 系統呼叫在本程式的使用方式,並分析 [console.c](https://github.com/sysprog21/lab0-c/blob/master/console.c) 的實作,說明其中運用 CS:APP [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 的原理和考量點。可對照參考 [CS:APP 第 10 章重點提示](https://hackmd.io/@sysprog/H1TtmVTTz) * 研讀論文〈[Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf)〉,解釋本程式的 "simulation" 模式是如何透過以實驗而非理論分析,達到驗證時間複雜度,需要解釋 [Student's t-distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution) 及程式實作的原理。注意:現有實作存在若干致命缺陷,請討論並提出解決方案; - [ ] 指出現有程式的缺陷 (提示: 和 [RIO 套件](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) 有關) 或者可改進之處 (例如更全面地使用 Linux 核心風格的鏈結串列 API),嘗試實作並提交 pull request :::danger 今年還有新的要求 (例如 LeetCode 題目),請留意! :notes: jserv ::: ## 開發過程 ### q_new * 使用 `malloc` 後都要檢查是否配置記憶體成功 * 利用 `INIT_LIST_HEAD` 初始化 ```c struct list_head *q_new() { struct list_head *head = malloc(sizeof(struct list_head)); if (head == NULL) return NULL; INIT_LIST_HEAD(head); return head; } ``` ### q_free * 使用 `list_for_each_safe` 可以很好的代替 `for` 迴圈(後來有看到 `list_for_each_entry` 可以節省下方 `list_entry` ,之後會在做修改) ```c void q_free(struct list_head *head) { if (!head) return; struct list_head *node, *safe; list_for_each_safe (node, safe, head) { element_t *e = list_entry(node, element_t, list); free(e->value); free(e); } free(head); } ``` ### q_insert_head :::danger 注意共筆寫寫規範:中英文間用一個半型空白字元區隔 :notes: jserv ::: 第一個 commit 發出後發現 `value` 沒有檢查是否為 `NULL` ,後來加了之後發現無法 commit ,原因是因為前面的 `element_t *e` 沒有 `free` 掉。 ```c bool q_insert_head(struct list_head *head, char *s) { if (!head) return 0; element_t *e = malloc(sizeof(element_t)); if (!e) return 0; e->value = malloc((strlen(s) + 1) * sizeof(char)); if (!e->value) { free(e); return 0; } memcpy(e->value, s, strlen(s) + 1); list_add(&e->list, head); return 1; } ``` ### q_insert_tail 與 `q_insert_head` 一樣 ```c bool q_insert_tail(struct list_head *head, char *s) { if (!head) return 0; element_t *e = malloc(sizeof(element_t)); if (!e) return 0; e->value = malloc((strlen(s) + 1) * sizeof(char)); if (!e->value) { free(e); return 0; } memcpy(e->value, s, strlen(s) + 1); list_add_tail(&e->list, head); return 1; } ``` ### q_remove_head 一開始寫的時候沒注意到 `bufsize-1` 和 `strlen(sp)` 會不一樣導致在 `make check` 時卡了不少時間。後來關注到去年 [RZHuangJeff](https://hackmd.io/@Forward1105/2020q1Hw_lab0) 的複製時的字串長度有去做 `bufsize - 1` 跟 `strlen(sp)` 不一樣時的檢查,才豁然開朗 ```c element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) { sp[0] = '\0'; return NULL; } element_t *e = list_entry(head->next, element_t, list); list_del(head->next); if (sp) { int len = bufsize > strlen(e->value) ? strlen(e->value) : bufsize; memcpy(sp, e->value, len); sp[len] = '\0'; } return e; } ``` ### q_remove_tail 與 `q_remove_head` 相同 ```c element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize) { if (!head || list_empty(head)) { sp[0] = '\0'; return NULL; } element_t *e = list_entry(head->prev, element_t, list); list_del(head->prev); if (sp) { int len = bufsize > strlen(e->value) ? strlen(e->value) : bufsize; memcpy(sp, e->value, len); sp[len] = '\0'; } return e; } ``` ### q_delete_mid 利用兩個指標,一個一次走兩個,一個一次走一個,當比較快的那個指標走到底時,比較慢的指標即為中間 ```c bool q_delete_mid(struct list_head *head) { // https://leetcode.com/problems/delete-the-middle-node-of-a-linked-list/ if (!head || list_empty(head) || head->next->next == head) return false; struct list_head *node, *mid = head->next; list_for_each (node, head) { if (node->next != head) { node = node->next; mid = mid->next; } } list_del(mid); element_t *e = list_entry(mid, element_t, list); free(e->value); free(e); return true; } ``` ### q_delete_dup 當發生相鄰節點相等時,第二個節點便開始往後,直到兩個節點不相同。接著刪去除了第二個節點以外的節點的部份。 ```c= bool q_delete_dup(struct list_head *head) { if (!head) return 0; element_t *e, *safe; list_for_each_entry_safe (e, safe, head, list) { if (&safe->list != head && strcmp(safe->value, e->value) == 0) { do { safe = list_entry(safe->list.next, element_t, list); } while (&safe->list != head && strcmp(safe->value, e->value) == 0); e->list.prev->next = &safe->list; safe->list.prev = e->list.prev; while (e != safe) { element_t *tmp = list_entry(e->list.next, element_t, list); q_release_element(e); e = tmp; } } } return true; } ``` ### q_swap ```c= void q_swap(struct list_head *head) { if (!head || list_empty(head)) return; struct list_head *node = head->next, *adj_node = head->next->next; while (node != head && adj_node != head) { node->next = adj_node->next; node->next->prev = node; adj_node->prev = node->prev; adj_node->prev->next = adj_node; adj_node->next = node; node->prev = adj_node; node = node->next; adj_node = node->next; } } ``` ### q_reverse 藉由前、中、後三個指標紀錄 reverse 時要用到的節點,每次將中間的節點轉向。 ```c void q_reverse(struct list_head *head) { if (!head) return; struct list_head *prev = head->prev, *next, *cur = head; list_for_each (next, head) { cur->next = prev; cur->prev = next; prev = cur; cur = next; } cur->next = prev; cur->prev = next; } ``` ### q_sort 在這裡我實做了 merge-sort 的演算法。一開始會把 circular doubly linked list 的 `head` 去掉變成 doubly linked list 。並且在進行 merge-sort 時都會把 doubly linked list 視為一般 linked list 來實做。而在merge的部份有參考[你所不知道的-C-語言-linked-list-和非連續記憶體](https://hackmd.io/@sysprog/c-linked-list)。當所 merge sort 執行完畢,變會將一般的 linked list 改回原本的 circular doubly linked list ``` c= struct list_head *find_mid(struct list_head *head) { if (!head) return NULL; struct list_head *node, *mid = head; for (node = head; node != NULL; node = node->next) { if (node->next != NULL) { node = node->next; mid = mid->next; } } return mid; } struct list_head *merge(struct list_head *l1, struct list_head *l2) { struct list_head *head = NULL, **ptr = &head, **node; for (node = NULL; l1 && l2; *node = (*node)->next) { char *v1 = list_entry(l1, element_t, list)->value, *v2 = list_entry(l2, element_t, list)->value; node = (strcasecmp(v1, v2) <= 0) ? &l1 : &l2; *ptr = *node; ptr = &(*ptr)->next; } if (l1 == NULL) *ptr = l2; else *ptr = l1; return head; } struct list_head *merge_sort(struct list_head *head) { if (!head) return NULL; struct list_head *mid = find_mid(head), *ll = head, *rl = mid; if (mid == head) return head; mid->prev->next = NULL; ll = merge_sort(ll); rl = merge_sort(rl); return merge(ll, rl); } /* * * Sort elements of queue in ascending order * No effect if q is NULL or empty. In addition, if q has only one * element, do nothing. */ void q_sort(struct list_head *head) { if (!head || list_empty(head)) return; struct list_head *single_list_head = head->next; head->prev->next = NULL; struct list_head *sorted = merge_sort(single_list_head); struct list_head *ptr, *next_ptr; for (ptr = sorted, next_ptr = sorted->next; next_ptr != NULL; ptr = next_ptr, next_ptr = next_ptr->next) { next_ptr->prev = ptr; } head->next = sorted; sorted->prev = head; head->prev = ptr; ptr->next = head; } ``` ### q_shuffle 首先在 `qtest` 中的 `console_init` 新增 `shuffle command` ```c= ADD_COMMAND(shuffle, " | shuffle the queue"); ``` 接著根據 [Fisher–Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle) 中的 pseudo code 寫出 C 版本的 shuffle ``` -- To shuffle an array a of n elements (indices 0..n-1): for i from n−1 downto 1 do j ← random integer such that 0 ≤ j ≤ i exchange a[j] and a[i] ``` ```c= static void q_shuffle(struct list_head *head) { if (!head || list_empty(head)) return; int size = q_size(head); struct list_head *iter = head->prev; for (int i = size; i > 0; i--) { int step = (rand() % i) + 1; struct list_head *node = head->next; while (step--) { node = node->next; } struct list_head *tmp = iter->next; iter->next = node->next; node->next->prev = iter; node->next = tmp; tmp->prev = node; tmp = iter->prev; iter->prev = node->prev; node->prev->next = iter; node->prev = tmp; tmp->next = node; iter = node->prev; } } ``` 最後參考其他 do_* function 實作 `do_shuffle` ```c= static bool do_shuffle(int argc, char *argv[]) { if (argc != 1) { report(1, "%s takes no arguments", argv[0]); return false; } if (!l_meta.l) report(3, "Warning: Try to access null queue"); error_check(); set_noallocate_mode(true); if (exception_setup(true)) q_shuffle(l_meta.l); exception_cancel(); set_noallocate_mode(false); show_queue(3); return !error_check(); } ``` ### 整合 [tiny-web-server](https://github.com/7890/tiny-web-server) 首先新增 web 指令(可以在 qtest.c 或 console.c 新增),並實做 do_web 來開啟監聽 ```c= static bool do_web(int argc, char *argv[]) { listenfd = open_listenfd(DEFAULT_PORT); if (listenfd > 0) { printf("listen on port %d, fd is %d\n", DEFAULT_PORT, listenfd); } else { perror("ERROR"); exit(listenfd); } signal(SIGPIPE, SIG_IGN); noise = false; return 1; } ``` ```diff= // console.c void init_cmd() { ... + ADD_COMMAND(web, " | Web mode"); ... } ``` 接著參考 [K01: lab0](https://hackmd.io/@sysprog/linux2022-lab0#%E6%95%B4%E5%90%88-tiny-web-server) 更改 tinyWebServer.c 中的 process 接收 request 並解析想要執行的 function name ,而其餘 tinyWebServer.c 的 fuction 在用原本的就可以了 ```c= char *process(int fd, struct sockaddr_in *clientaddr) { #ifdef LOG_ACCESS printf("accept request, fd is %d, pid is %d\n", fd, getpid()); #endif http_request req; parse_request(fd, &req); int status = 200; char *p = req.function_name; /* Change '/' to ' ' */ while ((*p) != '\0') { ++p; if (*p == '/') { *p = ' '; } } #ifdef LOG_ACCESS log_access(status, clientaddr, &req); #endif char *ret = malloc(strlen(req.function_name) + 1); strncpy(ret, req.function_name, strlen(req.function_name) + 1); return ret; } ``` 接著依據 [K01: lab0](https://hackmd.io/@sysprog/linux2022-lab0#%E6%95%B4%E5%90%88-tiny-web-server) 修改 console.c 中對應的 function 來達成 non-blocking 的連線,這時會發現 console.c 會需要 tinyWebServer.c 中的 listenfd ,因此需要在 tinyWebServer.h 設為全域(這部份有參考 [2020leon](https://hackmd.io/@6649/linux2022-lab0) ) ```c= extern int listenfd; extern bool noise; ``` 雖然到這裡 qtest.c 已經能夠處理從 web 端傳來的指令,然而我想要讓 server 能夠將結果傳回 web 端,因此研究了整個架構後,發現流程為 cmd_select -> interpret_cmd -> interpret_cmda -> next_cmd->operation(argc, argv) 這時根據 ADD_COMMAND 會發現 next_cmd->operation(argc, argv) 就是指令對應的 do_* function 而每個 do_function 中所呼叫的各種 report function 就是印出接結果的 fuction ,這時我們可以修改各種 report function 來回傳訊息,並且透過 getsockopt 來確認 connfd 是否連線中。 ```diff= void report(int level, char *fmt, ...) { if (!verbfile) init_files(stdout, stdout); if (level <= verblevel) { va_list ap; va_start(ap, fmt); vfprintf(verbfile, fmt, ap); fprintf(verbfile, "\n"); fflush(verbfile); va_end(ap); if (logfile) { va_start(ap, fmt); vfprintf(logfile, fmt, ap); fprintf(logfile, "\n"); fflush(logfile); va_end(ap); } + int error_code; + int error_code_size = sizeof(error_code); + if (!getsockopt(connfd, SOL_SOCKET, SO_KEEPALIVE, &error_code, + (socklen_t *) &error_code_size)) { + va_start(ap, fmt); + char buf[1024] = {'\0'}; + vsnprintf(buf, sizeof(buf) - strlen(buf), fmt, ap); + strcat(send_buf, buf); + va_end(ap); + } } } ``` 而這裡我會先用一個 send_buf (宣告在 tinyWebServer.h ) 來儲存要回的訊息,等到 interpret_cmd 執行完才會回傳回去 ```diff= int cmd_select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout) { ... infd = buf_stack->fd; if (readfds && FD_ISSET(infd, readfds)) { /* Commandline input available */ FD_CLR(infd, readfds); result--; char *cmdline; cmdline = readline(); if (cmdline) interpret_cmd(cmdline); } else if (readfds && FD_ISSET(listenfd, readfds)) { FD_CLR(listenfd, readfds); result--; struct sockaddr_in clientaddr; socklen_t clientlen = sizeof(clientaddr); connfd = accept(listenfd, (SA *) &clientaddr, &clientlen); char *p = process(connfd, &clientaddr); if (p) { interpret_cmd(p); + response(connfd); } free(p); close(connfd); } return result; } ``` ```c= void response(int fd) { char buf[RIO_BUFSIZE] = {'\0'}; snprintf(buf, sizeof(buf), "HTTP/1.1 200 OK\r\nAccept-Ranges: bytes\r\n"); snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), "Cache-Control: no-cache\r\n"); snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), "Content-type: text/html\r\n\r\n"); snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), "<!DOCTYPE html><html><head><link rel=\"shortcut icon\" " "href=\"#\"></head><body>"); strcat(buf, send_buf); strcat(buf, "</body></html>"); writen(fd, buf, strlen(buf) + 1); free(send_buf); } ``` 結果如下圖 ![](https://i.imgur.com/TY9hL1l.png) ### 引入 linux/lib/list_sort.c 比較效能 先加入新的 option 來指定要執行的演算法 ```c= console_init() { ... add_param("sort", &sort_algo, "Which algorithm use when sorting (0: user defined(default), 1: " "kernal)", NULL); ... } ``` 接著引入 lib/list_sort.c ,引入時會發現原本的 linux/* 的 library 要替換成專案裡面的 list.h ,並且 u8 要改成 uint8_t ,最後是要把 complier.h 裡面的 unlikely(x) 定義出來。 為了能夠比較方便的比較自己寫的和 linux kernal 兩者的效能,因此寫了 trace-sort-cmp.cmd ``` # Test performance of q_sort and list_sort option echo 0 # q_sort option sort 0 new ih RAND 100000 time sort # list_sort option sort 1 new ih RAND 100000 time sort ``` 結果如下,linux 核心的 list_sort 比我自己寫的快了 16% 左右 ``` # q_sort Delta time = 0.098 # list_sort Delta time = 0.082 ``` 接著在進行一次對已排序 linked list 做排序的效能比較,linux 核心的 list_sort 比我自己寫的快了 32% 左右 ``` # q_sort Delta time = 0.098 # list_sort Delta time = 0.082 ```