CSAPP: Chapter 8

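---
tags: CSAPP, Operating Systems
---

# CSAPP: Chapter 8

:::info
These notes cover only the most basic concepts. For detailed examples, refer to the original book or the course slides:
* [Exceptions & Processes](http://www.cs.cmu.edu/afs/cs/academic/class/15213-m19/www/lectures/14-ecf-procs.pdf)
* [ECF: Signals](http://www.cs.cmu.edu/afs/cs/academic/class/15213-m19/www/lectures/15-ecf-signals.pdf)
:::

## Control Flow

From power-on until shutdown, the CPU executes one instruction after another, and this sequence of instructions forms the CPU's [**control flow**](https://en.wikipedia.org/wiki/Control_flow). The mechanisms that change control flow, such as jump / branch or call / return, react to changes in program state. These alone are not enough: besides reacting to changes in program state, the system sometimes has to react to changes in system state, for example:

* data arriving from an external device (disk / router)
* an instruction that divides by zero
* I/O input
* a system timer expiring

The mechanisms that handle these cases are collectively called "exceptional control flow".

## Exceptions

An exception is a transfer of control to the OS kernel in order to handle some event. As the figure below shows, when an exception occurs, control flow moves to an exception handler in the kernel, and once the handler finishes, one of the following happens:

1. Execution returns to the instruction that triggered the exception
2. Execution returns to the instruction after the one that triggered the exception
3. The program is aborted

![](https://i.imgur.com/ntkzTEn.png)

Each kind of event that can cause an exception is assigned a number by the hardware. This number is used as an index into the kernel's exception table to find the address of the matching handler, which then processes the event.

![](https://i.imgur.com/9REXnYF.png)

Exceptions can be further divided into two types: synchronous and asynchronous.

* Asynchronous exceptions come from external devices:
    * An **interrupt** (for example ctrl-c input or a timer) is triggered when hardware sets the processor's interrupt pin. After it is handled, execution resumes at the next instruction.
* Synchronous exceptions can be subdivided further:
    * An **abort** is caused by an unrecoverable error, usually a hardware error (for example damaged RAM). The process that caused the abort is terminated.
    * A **trap** is an intentional exception, for example a breakpoint or a system call instruction that enters the kernel to request a service from the operating system (such as reading or writing a file, `fork()`, and so on). Execution returns to the instruction after the one that triggered the trap.
    * A **fault** is caused by an error condition (for example a page fault) that may be recoverable.

## Process

Multi-core computers, and even phones, are commonplace today, and with several execution units it is easy to imagine a computer doing many things at once. But consider a single-core architecture: can the computer only ever focus on one job? While we take notes in HackMD, is it impossible to play music in the background? Clearly not, and explaining why requires one of the most important concepts in operating systems: the process. A process is an instance of a program in execution. Processes give each program the illusion that it has exclusive use of hardware resources such as the CPU and memory. So how do programs actually run?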
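Every program in the system runs in the context of a process. The context consists of the state the program needs to run correctly (including the program's code, its stack, the program counter, and the register contents). Programs run concurrently: as the figure below shows, on a single core only one process is executing at any instant, but by interleaving the processes sensibly and switching between them quickly, the system creates the illusion that each program has the resources to itself.

![](https://i.imgur.com/xoZZbZJ.png)

### Context switch

As mentioned above, processes are switched in and out so that their execution is interleaved. The mechanism that implements this is the [context switch](https://en.wikipedia.org/wiki/Context_switch). In short, as the figure below shows, a program enters the kernel because of an exception, and the operating system then hands the hardware over to another process.

![](https://i.imgur.com/lmBxJz6.png)

## Library Function Error Handling

When a Unix system-level function hits an error, it usually returns -1 and sets the global variable `errno` to indicate the cause. As developers we are responsible for handling these errors, and `errno` gives us the details of what went wrong. Taking `fork` as an example:

```cpp
if ((pid = fork()) < 0) {
    fprintf(stderr, "fork error: %s\n", strerror(errno));
    exit(1);
}
```

## Process Control

Unix provides many system calls for manipulating processes.

### Obtaining process IDs

* `pid_t getpid(void)`: returns the pid of the calling process
* `pid_t getppid(void)`: returns the pid of the parent

### Creating and terminating processes

From the programmer's point of view, a process is in one of the following states:

* Running: the process is either executing or waiting to be scheduled
* Stopped: the process is suspended and will not be scheduled until it receives some kind of "notification" (such as a signal, discussed later)
* Terminated: the process has stopped permanently

A process terminates for one of three reasons:

* It receives a signal whose action is to terminate
* It returns from the main function
* It calls `exit()`: `void exit(int status)` exits with the status given by `status`. By convention 0 means a normal exit and nonzero means an error

To create a process, a parent process calls `fork()` to create a child process. `pid_t fork(void)` is an interesting function because one call returns twice: the child gets a return value of 0, and the parent gets the child's pid. The child is also very similar to its parent: it gets an identical but separate copy of the parent's virtual address space (the physical mapping may differ) and copies of the parent's open file descriptors. The two processes do, of course, have different pids.

:::warning
Details omitted. I assume there are plenty of articles about `fork()` online, ~~since it shows up on graduate entrance exams?~~
:::

### Reaping child processes

When a process terminates, the system does not remove it immediately. The process stays in a terminated state until it is reaped by its parent. Typically the parent waits for its child to terminate with `wait` or `waitpid`. The parent receives the child's exit status, and the kernel then discards the terminated process. A process that has terminated but has not been reaped is called a zombie.

What happens to the children if a parent terminates without reaping them? The kernel arranges for `init` to take over. `init` is created by the kernel at boot, is the ancestor of every process (its pid is 1), and takes responsibility for adopting these "orphans". Even so, long-running programs such as shells or servers should always reap their own children, otherwise the zombies keep consuming system resources.

We can reap a child process with `pid_t wait(int *child_status)`. When `wait` is called, the calling process is suspended until one of its children terminates. The return value is the pid of the child that terminated. If `child_status != NULL`, the integer it points to is set to a status value describing how the child terminated (the macros for decoding it are defined in `sys/wait.h`).

```cpp
void fork10()
{
    pid_t pid[N];
    int i, child_status;

    for (i = 0; i < N; i++)
        if ((pid[i] = fork()) == 0) {
            exit(100+i); /* Child */
        }
    for (i = 0; i < N; i++) { /* Parent */
        pid_t wpid = wait(&child_status);
        if (WIFEXITED(child_status))
            printf("Child %d terminated with exit status %d\n",
                   (int)wpid, WEXITSTATUS(child_status));
        else
            printf("Child %d terminated abnormally\n", (int)wpid);
    }
}
```

In the program above, the first loop uses `fork()` to create the new processes and stores each child's pid (each child exits immediately through `exit(100+i)`). The second `for` loop then calls `wait` until all N children have been reaped (the order is, of course, not fixed). `WIFEXITED(child_status)` tells us whether the child terminated normally, and `WEXITSTATUS(child_status)` gives its exit status.

`pid_t waitpid(pid_t pid, int *status, int options)` is another call that reaps child processes. It reaps one specific child, selected by `pid`.

```cpp
void fork11()
{
    pid_t pid[N];
    int i;
    int child_status;

    for (i = 0; i < N; i++)
        if ((pid[i] = fork()) == 0)
            exit(100+i); /* Child */
    for (i = N-1; i >= 0; i--) {
        pid_t wpid = waitpid(pid[i], &child_status, 0);
        if (WIFEXITED(child_status))
            printf("Child %d terminated with exit status %d\n",
                   (int)wpid, WEXITSTATUS(child_status));
        else
            printf("Child %d terminated abnormally\n", (int)wpid);
    }
}
```

With this structure we control the order in which the children are reaped. In the example above they are reaped in the reverse of the order in which they were created.

### Loading and running programs

`int execve(char *filename, char *argv[], char *envp[])` loads the executable named by `filename` and runs it with the argument array `argv` (by convention `argv[0] == filename`) and the environment array `envp` (strings of the form `name=value`). When `execve` is called, it overwrites the code, data, and stack of the calling process but keeps its pid and open files. A successful `execve` never returns; it returns only if an error occurs. The slide below describes the flow:

![](https://i.imgur.com/zGwky5O.png)

## Shell

A shell is an interactive application that runs programs on behalf of the user (for example `sh` or `bash`). A shell performs a sequence of read and evaluate steps: the read step reads a command line from the user, and the evaluate step parses the command and runs programs on the user's behalf. A simple shell looks like this:

```cpp
int main(int argc,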
         char **argv)
{
    char cmdline[MAXLINE]; /* command line */

    while (1) {
        /* read */
        printf("> ");
        Fgets(cmdline, MAXLINE, stdin);
        if (feof(stdin))
            exit(0);

        /* evaluate */
        eval(cmdline);
    }
}
```

In the main function, [`Fgets`](https://man7.org/linux/man-pages/man3/fgets.3p.html) reads a command from `stdin` into the `cmdline` array. Once [`feof`](https://man7.org/linux/man-pages/man3/feof.3p.html) confirms that the input has not reached EOF, `eval` is called to parse the command.

```cpp
void eval(char *cmdline)
{
    char *argv[MAXARGS]; /* Argument list execve() */
    char buf[MAXLINE];   /* Holds modified command line */
    int bg;              /* Should the job run in bg or fg? */
    pid_t pid;           /* Process id */

    strcpy(buf, cmdline);
    bg = parseline(buf, argv);
    if (argv[0] == NULL)
        return; /* Ignore empty lines */

    if (!builtin_command(argv)) {
        if ((pid = Fork()) == 0) { /* Child runs user job */
            if (execve(argv[0], argv, environ) < 0) {
                printf("%s: Command not found.\n", argv[0]);
                exit(0);
            }
        }

        /* Parent waits for foreground job to terminate */
        if (!bg) {
            int status;
            if (waitpid(pid, &status, 0) < 0)
                unix_error("waitfg: waitpid error");
        }
        else
            printf("%d %s", pid, cmdline);
    }
    return;
}
```

* `parseline` checks whether the command line ends with `&` to decide whether the job should run in the background
* `builtin_command` checks whether the command is a built-in. If it is, `builtin_command` handles it directly. If not, a child process is created and calls `execve` to run it
* For a foreground job, the parent waits for the child process to finish with `waitpid`. Otherwise it simply continues

The program above clearly has a problem: when the child runs in the background, the parent never takes responsibility for reaping it. How should the parent handle such child processes properly? That is where signals, the next topic, come in.

## Signal

As the name suggests, a signal is a mechanism for notifying a process that some event has occurred. A signal carries nothing but an integer ID, and the process uses this ID to tell which event the message refers to.

![](https://i.imgur.com/k5abwpE.png)

The kernel delivers a signal by updating the context of the destination process. When the kernel detects an event (for example divide-by-zero, `SIGFPE`, or the termination of a child process, `SIGCHLD`), or when a process uses the `kill` system call to send a signal to another process, the kernel marks that signal as pending for the destination process. How is pending implemented?
The kernel maintains bit vectors for each process. When signal `k` is sent, bit `k` of the pending vector is set, and when the signal is received, that bit is cleared. A process can also block specific signals: the `sigprocmask` function updates the blocked bit vector ahead of time so that signals of that type are not received. Note in particular that a process can have at most one pending signal of any given type. In other words, signals are not queued: if a process already has a pending signal k that it has not handled yet, the next signal k sent to it is simply discarded.

### Process Groups

Every process belongs to exactly one process group. `getpgrp` returns the process group of the calling process, and `setpgrp` changes it. Signals can be sent to an entire process group at once.

![](https://i.imgur.com/0zRs1Jp.png)

## Signal handling

### Overview

When the kernel finishes an exception handler and is about to return to user space, it checks `pnb = pending & ~blocked` for the process p that will run next to find out which signals need to be handled. If `pnb == 0`, no signal needs handling and process p continues with its next instruction. If `pnb != 0`, the lowest nonzero bit `k` of `pnb` identifies the signal `k` to handle first. The kernel works its way up to the highest nonzero bit and then lets the process continue.

Every signal has a default action, which may be one of the following:

* Terminate the process
* Stop the process until it receives SIGCONT
* Ignore the signal

We can change how a signal is handled by installing a signal handler with the function `handler_t *signal(int signum, handler_t *handler)`. The `handler` argument can be:

* `SIG_IGN`: ignore the signal
* `SIG_DFL`: restore the default action for the signal
* Otherwise, the address of a user-level function. This function is called when the given signal is received, and when it returns, control flow goes back to the point where the process was interrupted by the signal

Below is a simple example of installing a handler: [`signal`](https://man7.org/linux/man-pages/man2/signal.2.html) replaces the handling of `SIGINT` (that is, ctrl+C) with the `sigint_handler` function. When ctrl+C is typed, the program therefore does not end immediately as it would by default. It prints some messages first and only then terminates.

```cpp
void sigint_handler(int sig) /* SIGINT handler */
{
    printf("So you think you can stop the bomb with ctrl-c, do you?\n");
    sleep(2);
    printf("Well...");
    fflush(stdout);
    sleep(1);
    printf("OK. :-)\n");
    exit(0);
}

int main(int argc, char** argv)
{
    /* Install the SIGINT handler */
    if (signal(SIGINT, sigint_handler) == SIG_ERR)
        unix_error("signal error");

    /* Wait for the receipt of a signal */
    pause();

    return 0;
}
```

### Guidelines for Writing Safe Handlers

A signal handler runs concurrently with the main program and shares its global variables, so it is best to follow certain rules to avoid results that differ from what we expect.

0. Keep the work done in a handler as simple as possible
1. Call only [Async-Signal-Safety](#Async-Signal-Safety) functions in a handler
2. Save `errno` on entry to the handler and restore it on exit, so that the code the handler interrupted does not see an `errno` value changed by the handler
3. Protect accesses to shared data by temporarily blocking all signals
4. Declare global variables as [volatile](https://en.wikipedia.org/wiki/Volatile_(computer_programming)) to keep the compiler from caching them in registers, which would lead to unexpected results
5. Declare global flags as *volatile [sig_atomic_t](https://en.cppreference.com/w/c/program/sig_atomic_t)*

### Async-Signal-Safety

A function is async-signal-safe if it is either [reentrant](https://en.wikipedia.org/wiki/Reentrancy_(computing)) or non-interruptible by signals. A signal handler should call only async-signal-safe functions, otherwise the results may be unpredictable. As the list below shows, `printf` is not a safe function. In fact, the only safe way to produce output from a signal handler is `write`.

![](https://i.imgur.com/IXG78Z8.png)

### Correct Signal Handling

In the example below, the parent process installs a `SIGCHLD` handler, `fork`s `N` child processes, and then loops until `ccount <= 0`. When a child process terminates, the kernel sends `SIGCHLD` to its parent, which then runs the previously installed handler to reap the child.

```cpp
#define N 3
volatile int ccount = 0;

void child_handler(int sig)
{
    int olderrno = errno;
    pid_t pid;

    if ((pid = wait(NULL)) < 0)
        Sio_error("wait error");
    ccount--;
    sio_puts("Handler reaped child ");
    sio_putl((long)pid);
    sio_puts(" \n");
    sleep(1);
    errno = olderrno;
}

void fork14()
{
    pid_t pid[N];
    int i;

    ccount = N;
    Signal(SIGCHLD, child_handler);
    for (i = 0; i < N; i++) {
        if ((pid[i] = Fork()) == 0) {
            sleep(1);
            exit(0); /* Child exits */
        }
    }
    while (ccount > 0) /* Parent spins */
        ;
}
```

This looks simple enough. In theory, `ccount` is set to `N`, so once all `N` children have been reaped the parent process terminates. In practice, the parent may never finish. The cause is the fact mentioned earlier that signals are not queued. Consider the following scenario: the parent receives the `SIGCHLD` produced by the first child and starts handling it. While it does so, the second `SIGCHLD` arrives and is added to the set of pending signals. Then, with the first `SIGCHLD` still being handled, the third `SIGCHLD` arrives as well, but because the pending set already contains a `SIGCHLD`, the third one is simply discarded. Finally, once the first `SIGCHLD` has been handled, the second one in the pending set is handled too, but the third was dropped, so the third child is never reaped.

Below is the corrected handler, which reaps children in a `while` loop. A pending signal only tells us that at least one signal of that type was sent, so each time the handler runs we must reap as many children as possible.

```cpp
void child_handler2(int sig)
{
    int olderrno = errno;
    pid_t pid;

    while ((pid = wait(NULL)) > 0) {
        ccount--;
        sio_puts("Handler reaped child ");
        sio_putl((long)pid);
        sio_puts(" \n");
    }
    if (errno != ECHILD)
        sio_error("wait error");
    errno = olderrno;
}
```

### Blocking and Unblocking Signals

![](https://i.imgur.com/mojCsaI.png)

### Synchronizing Flows to Avoid Races

Below is a shell example with a race that causes a synchronization error. Ideally, each time the parent creates a child it adds the child to a global list, and when it receives `SIGCHLD` it reaps the child and removes it from the list. All signals are blocked while the global list is modified so that the list stays consistent. However, the example goes wrong when events happen in the following order:

1. The parent calls fork, and the kernel schedules the child to run
2. Before the parent runs again, the child terminates, so the kernel sends `SIGCHLD` to the parent
3. When the parent is scheduled, and before it resumes, the kernel notices the pending `SIGCHLD`, so the handler runs. `deletejob` is called but does nothing, because the parent has not added the child to the list yet
4. After the handler finishes, the parent resumes and calls `addjob`, which adds a child that no longer exists to the list. The list is now wrong

```cpp
void handler(int sig)
{
    int olderrno = errno;
    sigset_t mask_all, prev_all;
    pid_t pid;

    sigfillset(&mask_all);
    while ((pid = waitpid(-1, NULL, 0)) > 0) { /* Reap child */
        sigprocmask(SIG_BLOCK, &mask_all, &prev_all);
        deletejob(pid); /* Delete the child from the job list */
        sigprocmask(SIG_SETMASK, &prev_all, NULL);
    }
    if (errno != ECHILD)
        sio_error("waitpid error");
    errno = olderrno;
}

int main(int argc, char **argv)
{
    int pid;
    sigset_t mask_all, prev_all;
    int n = N; /* N = 5 */

    sigfillset(&mask_all);
    signal(SIGCHLD, handler);
    initjobs(); /* Initialize the job list */

    while (n--) {
        if ((pid = Fork()) == 0) { /* Child */
            Execve("/bin/date", argv, NULL);
        }
        sigprocmask(SIG_BLOCK, &mask_all, &prev_all); /* Parent */
        addjob(pid); /* Add the child to the job list */
        sigprocmask(SIG_SETMASK, &prev_all, NULL);
    }
    exit(0);
}
```

The bug comes from assuming that the parent always runs first, which is not guaranteed. Below is the fix.

```cpp
int main(int argc, char **argv)
{
    int pid;
    sigset_t mask_all, mask_one, prev_one;
    int n = N; /* N = 5 */

    sigfillset(&mask_all);
    sigemptyset(&mask_one);
    sigaddset(&mask_one, SIGCHLD);
    Signal(SIGCHLD, handler);
    initjobs(); /* Initialize the job list */

    while (n--) {
        sigprocmask(SIG_BLOCK, &mask_one, &prev_one); /* Block SIGCHLD */
        if ((pid = Fork()) == 0) { /* Child process */
            sigprocmask(SIG_SETMASK, &prev_one, NULL); /* Unblock SIGCHLD */
            Execve("/bin/date", argv, NULL);
        }
        sigprocmask(SIG_BLOCK, &mask_all, NULL); /* Parent process */
        addjob(pid); /* Add the child to the job list */
        sigprocmask(SIG_SETMASK, &prev_one, NULL); /* Unblock SIGCHLD */
    }
    exit(0);
}
```

Here `mask_one` holds a mask that contains only `SIGCHLD`. `SIGCHLD` is blocked before `fork`, so the parent cannot receive `SIGCHLD` until after it has called `addjob`, and the unexpected case of `deletejob` running before `addjob` cannot happen. In addition, because the child process inherits the blocked `SIGCHLD`, it has to unblock the signal before calling `execve`.

### Explicitly Waiting for Signals

Sometimes the main program needs to wait for a signal handler to run. For example, after a shell creates a foreground job, it must wait for that job to finish before it accepts the next command. Below is an example.

```cpp
volatile sig_atomic_t pid;

void sigchld_handler(int s)
{
    int olderrno = errno;
    pid = waitpid(-1, NULL, 0); /* Main is waiting for nonzero pid */
    errno = olderrno;
}

void sigint_handler(int s)
{
}

int main(int argc, char **argv)
{
    sigset_t mask, prev;
    int n = N; /* N = 10 */

    signal(SIGCHLD, sigchld_handler);
    signal(SIGINT, sigint_handler);
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);

    while (n--) {
        sigprocmask(SIG_BLOCK, &mask, &prev); /* Block SIGCHLD */
        if (Fork() == 0) /* Child */
            exit(0);

        /* Parent */
        pid = 0;
        sigprocmask(SIG_SETMASK, &prev, NULL); /* Unblock SIGCHLD */

        /* Wait for SIGCHLD to be received (wasteful!) */
        while (!pid)
            ;

        /* Do some work after receiving SIGCHLD */
        printf(".");
    }
    printf("\n");
    exit(0);
}
```

The parent process first installs handlers for `SIGCHLD` and `SIGINT` and then enters a loop. It blocks `SIGCHLD` first to avoid a race between parent and child. After `fork`, it sets pid to 0, unblocks `SIGCHLD`, and enters another while loop that waits for pid to change. Once the child has been reaped in the signal handler, pid becomes nonzero and the loop moves on to the next iteration. The program runs correctly, but `while (!pid)` wastes processor time. We could change it to:

```cpp
while (!pid) /* Race! */
    pause();
```

* [pause](https://man7.org/linux/man-pages/man2/pause.2.html)
* This version has a race, though: if `SIGCHLD` arrives after the while test but before `pause`, `pause` sleeps forever (because no further `SIGCHLD` will be sent once `pause` has actually been called)

```cpp
while (!pid) /* Too slow! */
    sleep(1);
```

* If `SIGCHLD` arrives after the while test but before `sleep`, the program has to wait for the whole sleep interval

A better solution is [`sigsuspend`](https://man7.org/linux/man-pages/man2/rt_sigsuspend.2.html), which can be viewed as an atomic version of

```cpp
sigprocmask(SIG_SETMASK, &mask, &prev);
pause();
sigprocmask(SIG_SETMASK, &prev, NULL);
```

The modified program is shown below:

```cpp
int main(int argc, char **argv)
{
    sigset_t mask, prev;
    int n = N; /* N = 10 */

    Signal(SIGCHLD, sigchld_handler);
    Signal(SIGINT, sigint_handler);
    Sigemptyset(&mask);
    Sigaddset(&mask, SIGCHLD);

    while (n--) {
        Sigprocmask(SIG_BLOCK, &mask, &prev); /* Block SIGCHLD */
        if (Fork() == 0) /* Child */
            exit(0);

        /* Wait for SIGCHLD to be received */
        pid = 0;
        while (!pid)
            Sigsuspend(&prev);

        /* Optionally unblock SIGCHLD */
        Sigprocmask(SIG_SETMASK, &prev, NULL);

        /* Do some work after receiving SIGCHLD */
        printf(".");
    }
    printf("\n");
    exit(0);
}
```

`sigsuspend` replaces the current blocked set with the mask passed as its argument, which temporarily unblocks `SIGCHLD`, and then suspends the process until a signal is received. Before it returns, `sigsuspend` restores the blocked set that was in place before the replacement, so `SIGCHLD` is blocked again.

### Portable Signal Handling

Different versions of UNIX handle signals differently, for example:

* On some older UNIX systems, the action for signal k reverts to the default after the handler has run, so the handler must call `signal` again to reinstall itself
* On some systems, system calls are interrupted by signals and are not restarted automatically
* Some systems do not block the signal that is currently being handled

For these reasons, [sigaction](https://man7.org/linux/man-pages/man2/sigaction.2.html) is a better choice than `signal`.

## Nonlocal jumps

C provides a form of user-level exceptional control flow called a nonlocal jump. Unlike the normal call / return sequence, [`setjmp`](https://man7.org/linux/man-pages/man3/longjmp.3.html) and `longjmp` transfer control directly from one function to another, which is very useful for error recovery and signal handling.

`int setjmp(jmp_buf j)` must be called before `longjmp`. It is called once and may return multiple times. It saves the current state (including the program counter, the stack pointer, and the general-purpose registers) in `j` and then returns 0.

`void longjmp(jmp_buf j, int i)` restores the state recorded in buffer `j`, which transfers control back to the most recent `setjmp` call that filled `j`, and makes that `setjmp` return `i`. It is called once and never returns.

```cpp
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

jmp_buf buf;

int error1 = 0;
int error2 = 1;

void foo(void), bar(void);

int main()
{
    switch (setjmp(buf)) {
    case 0:
        foo();
        break;
    case 1:
        printf("Detected an error1 condition in foo\n");
        break;
    case 2:
        printf("Detected an error2 condition in foo\n");
        break;
    default:
        printf("Unknown error condition in foo\n");
    }
    exit(0);
}

/* Deeply nested function foo */
void foo(void)
{
    if (error1)
        longjmp(buf, 1);
    bar();
}

void bar(void)
{
    if (error2)
        longjmp(buf, 2);
}
```

The code above is a simple use of `setjmp` and `longjmp`. In `main`, the first call to `setjmp` returns 0, so `foo()` runs. `error1` is initially 0, so execution continues into `bar()`, which calls `longjmp`. Control then moves back to the `setjmp` call, which this time returns 2. This also shows a limitation of nonlocal jumps: `longjmp` can only jump into a function that has been called and has not yet returned.

```cpp
#include "csapp.h"

sigjmp_buf buf;

void handler(int sig)
{
    siglongjmp(buf, 1);
}

int main()
{
    if (!sigsetjmp(buf, 1)) {
        Signal(SIGINT, handler);
        Sio_puts("starting\n");
    }
    else
        Sio_puts("restarting\n");

    while (1) {
        Sleep(1);
        Sio_puts("processing...\n");
    }
    exit(0); /* Control never reaches here */
}
```

`sigsetjmp` and `siglongjmp` are the versions of `setjmp` and `longjmp` that can be used in signal handling. The code above is a simple example. When the program starts, it saves its state with `sigsetjmp` and then enters an infinite loop. When the user types ctrl+c, the program receives `SIGINT` and control moves to the signal handler. `handler` calls `siglongjmp`, which transfers control back to the beginning of `main` rather than to the point of interruption. Running the program therefore produces output similar to this:

```
starting
processing...
processing...
processing...
(ctrl+c typed)
restarting
processing...
processing...
(ctrl+c typed)
restarting
processing...
processing...
processing...
```

A few details of this program are worth noting. First, to avoid a race, the handler must be installed with `signal` only after `sigsetjmp` has been called. Otherwise `siglongjmp` might be called before `sigsetjmp` has saved any state. Second, `sigsetjmp` and `siglongjmp` are not async-signal-safe: `siglongjmp` can jump into arbitrary code, so any code reachable from a `siglongjmp` should call only safe functions to reduce the risk of errors.
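
The restart pattern described above can also be tried without `csapp.h`. The following is only a minimal sketch under a few assumptions, not the book's code: `say`, `install_handler`, `restart_handler`, and `demo_restarts` are hypothetical names introduced here, `sigaction` stands in for the `Signal` wrapper, and `raise(SIGINT)` stands in for the user typing ctrl+c so that the function terminates on its own.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static sigjmp_buf restart_point;                /* state saved by sigsetjmp */
static volatile sig_atomic_t restart_count = 0; /* jumps taken so far */

/* Print with write(), which is async-signal-safe (unlike printf). */
static void say(const char *msg)
{
    ssize_t n = write(STDOUT_FILENO, msg, strlen(msg));
    (void)n; /* output errors are ignored in this sketch */
}

/* SIGINT handler: abandon the interrupted code and jump to the restart point. */
static void restart_handler(int sig)
{
    (void)sig;
    siglongjmp(restart_point, 1);
}

/* Hypothetical helper: install a handler with sigaction. Returns 0 or -1. */
static int install_handler(int signum, void (*handler)(int))
{
    struct sigaction action;

    memset(&action, 0, sizeof(action));
    action.sa_handler = handler;
    sigemptyset(&action.sa_mask); /* block only the signal being handled */
    action.sa_flags = SA_RESTART; /* restart interrupted system calls */
    return sigaction(signum, &action, NULL);
}

/* Restart `limit` times, then return the number of restarts (-1 on error). */
int demo_restarts(int limit)
{
    restart_count = 0;
    if (sigsetjmp(restart_point, 1) == 0) {
        /* First return: install the handler only after the jump target exists. */
        if (install_handler(SIGINT, restart_handler) < 0)
            return -1;
        say("starting\n");
    } else {
        /* Reached again through siglongjmp; the saved signal mask is restored. */
        restart_count++;
        say("restarting\n");
    }
    if (restart_count < limit)
        raise(SIGINT); /* stands in for the user typing ctrl+c */
    return restart_count;
}
```

Calling `demo_restarts(2)` from `main` prints `starting` followed by two `restarting` lines and returns 2. Passing 1 as the second argument of `sigsetjmp` matters here: the signal mask is saved and restored, so `SIGINT`, which is blocked while its handler runs, is unblocked again after each jump.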