# 2021q1 Homework1 (lab0)
contributed by < `ambersun1234` >
> [Assignment description](https://hackmd.io/@sysprog/linux2021-lab0)

### Reviewed by `jserv`
* [commit 81f38](https://github.com/ambersun1234/lab0-c/commit/81f3819cba1de8c63678f6671cbd3abd33b39f34) is unfriendly to the change history: it is hard to see how the code evolved and which extra modifications you made while merging. Using [git rebase](https://git-scm.com/docs/git-rebase) instead avoids this problem and ensures that your changes sit at the tip of the git history.
* The message of [commit f35422](https://github.com/ambersun1234/lab0-c/commit/f35422c0686bff78f9df03e2a4c353e16743c87a) is too terse: "Remove natsort". Outsiders cannot easily understand the reasoning behind it. Please read [How to Write a Git Commit Message](https://chris.beams.io/posts/git-commit/), in particular these points:
    * Use the imperative mood in the subject line
    * Use the body to explain what and why vs. how
* [commit 15c022](https://github.com/ambersun1234/lab0-c/commit/15c02221292095cc604e7faa25ed85c014ab39ad) claims "Fix Address sanitizer error" but does not state the cause of the problem, let alone discuss whether the fix is sufficient, so there is no guarantee that it will not recur.
* The notes list reference material on the select system call and the corresponding code in `lab0-c`, but does this program actually use the select system call? Does the code you looked at really behave as expected? Until you have verified it, hypothesize boldly and verify carefully.
* An implementation of coroutines is missing.

## Development environment
```shell
$ uname -a
Linux station 5.8.0-44-generic #50~20.04.1-Ubuntu SMP Wed Feb 10 21:07:30 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

$ gcc -v
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
```

## With [Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer) enabled, typing help in qtest triggers an error
+ ![](https://i.imgur.com/hrBLQDB.png)
+ The figure above shows the result. The white block is the traceback, and the error is at `console.c:307`.
```c=305
    report(1, "Options:");
    while (plist) {
        report(1, "\t%s\t%d\t%s", plist->name, *plist->valp,
               plist->documentation);
        plist = plist->next;
    }
    return true;
```

This code prints the options listed by help. Now consider the code that adds an option:
```c
void add_param(char *name,
               int *valp,
               char *documentation,
               setter_function setter)
{
    ...
    ele->valp = valp;
    ...
}
```

The call sites:
```c
add_param("simulation", (int *) &simulation, "Start/Stop simulation mode",
          NULL);
add_param("verbose", &verblevel, "Verbosity level", NULL);
add_param("error", &err_limit, "Number of errors until exit", NULL);
add_param("echo", (int *) &echo, "Do/don't echo commands", NULL);
```

The second arguments are defined as:
```c
/* Parameters */
static int err_limit = 5;
static bool echo = 0;
bool simulation = false;
int level = 4;
```

`add_param` stores every parameter as an `int *`, and the call sites cast the `bool` variables to `int *`. Dereferencing such a pointer reads `sizeof(int)` bytes from an object that is only `sizeof(bool)` bytes (typically 1), and this size mismatch is what Address Sanitizer reports. Changing `bool` to `int` fixes it.

According to [the C standard, p.86](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf#page=98):
> Each of the operators < (less than), > (greater than), <= (less than or equal to), and >= (greater than or equal to) shall yield 1 if the specified relation is true and 0 if it is false. The result has type int.

so we can use `0` in place of `false`.

## Explain how the [select](http://man7.org/linux/man-pages/man2/select.2.html) system call is used in this program, analyze the implementation of [console.c](https://github.com/sysprog21/lab0-c/blob/master/console.c), and describe the principles and considerations behind its use of the CS:APP [RIO package](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf)

According to [man 2 select](https://man7.org/linux/man-pages/man2/select.2.html):
> select() allows a program to monitor multiple file descriptors, waiting until one or more of the file descriptors become "ready" for some class of I/O operation (e.g., input possible). A file descriptor is considered ready if it is possible to perform a corresponding I/O operation (e.g., read(2), or a sufficiently small write(2)) without blocking.

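
To make the man page description concrete, here is a minimal sketch of waiting on a single descriptor with `select`. The helper names `wait_readable` and `read_byte_with_timeout` are hypothetical and do not come from `lab0-c`; the sketch only assumes a POSIX system. Note that the `fd_set` and the timeout are rebuilt on every call, because `select` modifies them in place.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper (not part of lab0-c): wait until fd becomes readable
 * or timeout_ms milliseconds elapse.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    for (;;) {
        fd_set readfds;
        struct timeval tv;

        /* select() modifies both arguments, so rebuild them every time. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        int n = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (n < 0 && errno == EINTR)
            continue; /* interrupted by a signal: retry with a fresh timeout */
        if (n <= 0)
            return n; /* 0 = timeout, -1 = error */
        return FD_ISSET(fd, &readfds) ? 1 : 0;
    }
}

/* Hypothetical helper: read one byte once fd is ready.
 * Returns 1 if a byte was stored in *out, 0 on timeout or end-of-file,
 * -1 on error. */
int read_byte_with_timeout(int fd, int timeout_ms, char *out)
{
    int ready = wait_readable(fd, timeout_ms);
    if (ready <= 0)
        return ready;
    ssize_t n = read(fd, out, 1);
    if (n < 0)
        return -1;
    return n == 1 ? 1 : 0;
}
```

For example, `wait_readable(STDIN_FILENO, 500)` waits up to half a second for keyboard input without blocking the caller indefinitely.
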
In short, `select` monitors whether the given file descriptors are in the ready state.
```c
int cmd_select(int nfds,
               fd_set *readfds,
               fd_set *writefds,
               fd_set *exceptfds,
               struct timeval *timeout)
{
    int infd;
    fd_set local_readset;

    if (cmd_done())
        return 0;

    if (!block_flag) {
        /* Process any commands in input buffer */
        if (!readfds)
            readfds = &local_readset;

        /* Add input fd to readset for select */
        infd = buf_stack->fd;
        FD_SET(infd, readfds);
        if (infd == STDIN_FILENO && prompt_flag) {
            printf("%s", prompt);
            fflush(stdout);
            prompt_flag = true;
        }

        if (infd >= nfds)
            nfds = infd + 1;
    }
    if (nfds == 0)
        return 0;

    int result = select(nfds, readfds, writefds, exceptfds, timeout);
    if (result <= 0)
        return result;

    infd = buf_stack->fd;
    if (readfds && FD_ISSET(infd, readfds)) {
        /* Commandline input available */
        FD_CLR(infd, readfds);
        result--;
        if (has_infile) {
            char *cmdline;
            cmdline = readline();
            if (cmdline)
                interpret_cmd(cmdline);
        }
    }
    return result;
}
```

The function `cmd_select` wraps `select`: it adds the current input descriptor (`buf_stack->fd`) to the read set, waits, and once input is available reads a command line from the internal buffer with `readline` and passes it to `interpret_cmd` for execution. Note that when `select` is called in a loop, the sets must be reinitialized manually before each call:

> Note well: Upon return, each of the file descriptor sets is modified in place to indicate which file descriptors are currently "ready". Thus, if using select() within a loop, the sets must be reinitialized before each call. The implementation of the fd_set arguments as value-result arguments is a design error that is avoided in poll(2) and epoll(7).

This is why the code manipulates the set explicitly: `FD_SET()` adds the input descriptor again on every call, and `FD_CLR()` removes it from the returned set once its command has been handled.

<hr>

According to [man 2 read](https://man7.org/linux/man-pages/man2/read.2.html):
> It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal. See also NOTES.

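
The short-count behavior quoted above is usually handled by calling `read` in a loop until the requested number of bytes has arrived. The following is a minimal sketch in the spirit of the RIO package's unbuffered read; the name `read_fully` is hypothetical, and this is neither the CS:APP code nor the code in `lab0-c`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until n bytes have been read,
 * end-of-file is reached, or a real error occurs.
 * Returns the number of bytes actually read (less than n only at EOF),
 * or -1 on error. */
ssize_t read_fully(int fd, void *buf, size_t n)
{
    assert(buf != NULL);
    char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t got = read(fd, p, left);
        if (got < 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal: just retry */
            return -1;    /* genuine error, errno is set by read() */
        }
        if (got == 0)
            break; /* end-of-file */
        /* Short count: advance and ask for the remainder. */
        p += got;
        left -= (size_t) got;
    }
    return (ssize_t) (n - left);
}
```

A caller can then treat a return value smaller than `n` as end-of-file rather than as a transient short count.
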
This shows that qtest uses [RIO](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf) to avoid the short count problem, where fewer bytes are read than were requested.

## Explain how [antirez/linenoise](https://github.com/antirez/linenoise) works, paying attention to its use of [termios](http://man7.org/linux/man-pages/man3/termios.3.html)

Basically, when the user presses tab, [linenoise/linenoise.c completeLine](https://github.com/antirez/linenoise/blob/master/linenoise.c#364) is called, and `completionCallback` supplies the candidate results. The callback is a user-defined function that has been registered with linenoise; in qtest it is [lab0-c/console.c completion](https://github.com/sysprog21/lab0-c/blob/master/console.c#620).
```c
static bool cmd_maybe(char *target, const char *src)
{
    for (int i = 0; i < strlen(src); i++) {
        if (target[i] == '\0')
            return false;
        if (src[i] != target[i])
            return false;
    }
    return true;
}

void completion(const char *buf, linenoiseCompletions *lc)
{
    if (strncmp("option ", buf, 7) == 0) {
        param_ptr plist = param_list;

        while (plist) {
            char str[128] = "";
            // if parameter is too long, now we just ignore it
            if (strlen(plist->name) > 120)
                continue;

            strcat(str, "option ");
            strcat(str, plist->name);
            if (cmd_maybe(str, buf))
                linenoiseAddCompletion(lc, str);

            plist = plist->next;
        }
        return;
    }

    cmd_ptr clist = cmd_list;
    while (clist) {
        if (cmd_maybe(clist->name, buf))
            linenoiseAddCompletion(lc, clist->name);

        clist = clist->next;
    }
}
```

The implementation above walks the names in `param_list` (when the input starts with `option `) or `cmd_list` (otherwise) and uses `cmd_maybe` to check whether the input is a prefix of each name. The matching candidates are then handed back to `completeLine`.

The user can then press `tab` to show the next candidate command, or `esc` to restore the originally typed string.

<hr>

[man 3 termios](https://man7.org/linux/man-pages/man3/termios.3.html)
> The termios functions describe a general terminal interface that is provided to control asynchronous communications ports.

So termios is a general terminal interface for controlling asynchronous communication ports, but that is still rather abstract. [Is there a way to detect if a key has been pressed?](https://stackoverflow.com/a/22166185) gives a hint:
> So tcgetattr and tcsetattr are used to get input from terminal without needing to press 'ENTER' key.

Recalling how `./qtest` behaves, pressing tab or esc does take effect immediately. Compare this with the code:
```c
/* Raw mode: 1960 magic shit. */
static int enableRawMode(int fd) {
    struct termios raw;

    if (!isatty(STDIN_FILENO)) goto fatal;
    if (!atexit_registered) {
        atexit(linenoiseAtExit);
        atexit_registered = 1;
    }
    if (tcgetattr(fd,&orig_termios) == -1) goto fatal;

    raw = orig_termios;  /* modify the original mode */
    /* input modes: no break, no CR to NL, no parity check, no strip char,
     * no start/stop output control. */
    raw.c_iflag &= ~(BRKINT | ICRNL | INPCK | ISTRIP | IXON);
    /* output modes - disable post processing */
    raw.c_oflag &= ~(OPOST);
    /* control modes - set 8 bit chars */
    raw.c_cflag |= (CS8);
    /* local modes - choing off, canonical off, no extended functions,
     * no signal chars (^Z,^C) */
    raw.c_lflag &= ~(ECHO | ICANON | IEXTEN | ISIG);
    /* control chars - set return condition: min number of bytes and timer.
     * We want read to return every single byte, without timeout. */
    raw.c_cc[VMIN] = 1; raw.c_cc[VTIME] = 0; /* 1 byte, no timer */

    /* put terminal in raw mode after flushing */
    if (tcsetattr(fd,TCSAFLUSH,&raw) < 0) goto fatal;
    rawmode = 1;
    return 0;

fatal:
    errno = ENOTTY;
    return -1;
}

static void disableRawMode(int fd) {
    /* Don't even check the return value as it's too late. */
    if (rawmode && tcsetattr(fd,TCSAFLUSH,&orig_termios) != -1)
        rawmode = 0;
}
```

Regarding `raw mode`, [man 3 termios](https://man7.org/linux/man-pages/man3/termios.3.html) says:
> cfmakeraw() sets the terminal to something like the "raw" mode of the old Version 7 terminal driver: input is available character by character, echoing is disabled, and all special processing of terminal input and output characters is disabled. The terminal attributes are set as follows:

In short, echo is disabled and input and output processing is disabled. Compared with the raw mode described in the man page, however, linenoise differs slightly.
![](https://i.imgur.com/kY6WqXm.png)
> To make the comparison easier, I changed some parts that do not affect the result

The figure above gives the following result:

||Extra|Missing|
|:--:|:--:|:--:|
|c_iflag||igncr<br>inlcr<br>ignbrk<br>parmrk|
|c_lflag|iexten|echonl|
|c_cflag||csize<br>parenb|
|c_cc|c_cc[VMIN]=1<br>c_cc[VTIME]=0||

See the comments in [linenoise.c](https://github.com/antirez/linenoise/blob/master/linenoise.c#L243):
```c
/* input modes: no break, no CR to NL, no parity check, no strip char,
 * no start/stop output control. */

/* output modes - disable post processing */

/* control modes - set 8 bit chars */

/* local modes - choing off, canonical off, no extended functions,
 * no signal chars (^Z,^C) */

/* control chars - set return condition: min number of bytes and timer.
 * We want read to return every single byte, without timeout. */

/* 1 byte, no timer */
```

In other words, input is not preprocessed, nothing is echoed, and `read` returns as soon as a single byte is available, with no timeout.

See also [linux tty通過VTIME VMIN實現阻塞與非阻塞接收](https://www.twblogs.net/a/5b8dcabf2b7177188340b2c5) (blocking and non-blocking reads on a Linux tty via VTIME and VMIN).

## Identify a defect in the existing program (hint: it is related to the [RIO package](http://csapp.cs.cmu.edu/2e/ch10-preview.pdf)), try to strengthen it, and submit a pull request

## Implement [coroutine](https://en.wikipedia.org/wiki/Coroutine) in `qtest` and provide a new command `web` that offers web server functionality. Note: while the web server is running, `qtest` must still accept other commands
> You may try integrating [tiny-web-server](https://github.com/7890/tiny-web-server), changing `FORK_COUNT` to `0`, and replacing the original fork system call with a [coroutine](https://en.wikipedia.org/wiki/Coroutine)

###### tags: `linux2021`