# CS:APP Shell Lab 解題筆記
###### tags: `cs:app`
Shell Lab 對應第八章『異常控制流』,實作一個簡單的 shell `tsh`,當使用者在 terminal 輸入命令時,程式必需正確地執行且符合預期的行為:
1. 提示符為 "tsh> "
2. 使用者輸入執行程序的名稱及相應的參數,如果名稱為 shell 內置的命令,則 shell 應該立即執行並等待下一個命令。否則假定該名稱為可執行程序的路徑,shell 將讀取並創建子程序來執行
3. 無須支援管道 (pipe) 及 I/O 重新定向(< & >)
4. 若使用者輸入 ctrl-c (ctrl-z),則 shell 應該要傳送 SIGINT(SIGTSTP) 信號至目前的前景執行程序,以及該程序的所有子程序。如果前景沒有執行中的程序,則信號沒有任何影響
5. 如果命令行結尾有 `&` 字元,則程序應在背景執行
6. 每一個執行程序(任務)都有由 shell 賦予的 process ID (PID) 及 job ID (JID)。命令行若要指定 JID,需加入前綴 `%`,例如 `%5` 代表 JID 5,`5` 代表 PID 5
7. shell 應支援以下內置的命令,其中 \<job> 可以為 JID 或 PID
- quit: 立刻終止 shell 運行
- jobs: 列出所有的 job 及其運行狀態
- bg \<job>: 傳送 SIGCONT 信號重啟任務,並將任務轉換於背景執行
- fg \<job>: 傳送 SIGCONT 信號重啟任務,並將任務轉換於前景執行
8. shell 應回收所有處在僵死狀態的子進程,如果子進程是因為接收到信號而終止,shell 必需打印出 JID 及 signal 的訊息
CMU 提供的程式碼已經提供了 shell 的基本邏輯以及 job 操作的介面,讓學生可以專注在信號的細節處理。我們需要實作的函式為:
1. builtin_cmd: 解析命令行並確認是否為內置命令
2. sigint_handler: SIGINT 的信號處理
3. sigtstp_handler: SIGTSTP 的信號處理
4. sigchld_handler: SIGCHLD 的信號處理
5. eval: 解析並執行命令行的主函式
6. waitfg: 等待前景執行程序完成
7. do_bgfg: bg & fg 內置命令
```cpp
/*
* main - The shell's main routine
*/
int main(int argc, char **argv)
{
char c;
char cmdline[MAXLINE];
int emit_prompt = 1; /* emit prompt (default) */
/* Redirect stderr to stdout (so that driver will get all output
* on the pipe connected to stdout) */
dup2(1, 2);
/* Parse the command line */
while ((c = getopt(argc, argv, "hvp")) != EOF) {
switch (c) {
case 'h': /* print help message */
usage();
break;
case 'v': /* emit additional diagnostic info */
verbose = 1;
break;
case 'p': /* don't print a prompt */
emit_prompt = 0; /* handy for automatic testing */
break;
default:
usage();
}
}
/* Install the signal handlers */
/* These are the ones you will need to implement */
Signal(SIGINT, sigint_handler); /* ctrl-c */
Signal(SIGTSTP, sigtstp_handler); /* ctrl-z */
Signal(SIGCHLD, sigchld_handler); /* Terminated or stopped child */
/* This one provides a clean way to kill the shell */
Signal(SIGQUIT, sigquit_handler);
/* Initialize the job list */
initjobs(jobs);
/* Execute the shell's read/eval loop */
while (1) {
/* Read command line */
if (emit_prompt) {
printf("%s", prompt);
fflush(stdout);
}
if ((fgets(cmdline, MAXLINE, stdin) == NULL) && ferror(stdin))
app_error("fgets error");
if (feof(stdin)) { /* End of file (ctrl-d) */
fflush(stdout);
exit(0);
}
/* Evaluate the command line */
eval(cmdline);
fflush(stdout);
fflush(stdout);
}
exit(0); /* control never reaches here */
}
```
另外本實驗共有 16 個 trace 檔案來確認 `tsh` 實作的正確性,並可從 `tshref.out` 對照執行結果是否如預期
## builtin_cmd
`builtin_cmd` 接收已解析的命令列 `argv`,並判斷第一個參數是否為內置函數,如果為內置函數,則立刻執行並回傳 `1`,否則回傳 `0`。
實作並不複雜,內置函數如下
1. quit: 使用 `exit` 直接離開程式即可
2. jobs: 呼叫已實作的函數 `listjobs` 即可列出目前 `jobs` 的狀態,由於 `jobs` 為全域變數,為避免執行過程中斷被其他進程修改 job 的狀態,需阻塞所有信號的接收
3. bg/fg: 呼叫 `do_bgfg`
```cpp
/*
* builtin_cmd - If the user has typed a built-in command then execute
* it immediately.
*/
int builtin_cmd(char **argv)
{
sigset_t mask_all, mask_prev;
if (!strcmp(argv[0], "quit")) /* quit command */
exit(0);
if (!strcmp(argv[0], "jobs")) { /* jobs is a global variable, block all signal then execute the function */
sigfillset(&mask_all);
sigprocmask(SIG_BLOCK, &mask_all, &mask_prev);
listjobs(jobs);
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
return 1;
}
if (!strcmp(argv[0], "bg") || !strcmp(argv[0], "fg")) { /* change job running status */
do_bgfg(argv);
return 1;
}
if (!strcmp(argv[0], "&")) /* ignore single & */
return 1;
return 0; /* not a builtin command */
}
```
## sigint_handler
當使用者輸入 ctrl-c 時,內核傳送 `SIGINT` 信號給 shell,shell 接收到信號後傳送給所有在前景執行的程式及其所在進程組的所有進程。
- `jobs` 為全域變數,操作時必需阻塞所有信號
- 要傳送給進程組的所有進程,因此 `kill` 的 PID 需帶負號
```cpp
/*
* sigint_handler - The kernel sends a SIGINT to the shell whenver the
* user types ctrl-c at the keyboard. Catch it and send it along
* to the foreground job.
*/
void sigint_handler(int sig)
{
int olderrno = errno;
sigset_t mask_all, mask_prev;
sigprocmask(SIG_BLOCK, &mask_all, &mask_prev);
for (int i = 0; i < MAXJOBS; i++) {
if (jobs[i].state == FG) {
kill(-jobs[i].pid, SIGINT); /* send signal to entire foreground process group */
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
return;
}
}
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
errno = olderrno;
return;
}
```
## sigtstp_handler
當使用者輸入 ctrl-z 時,內核傳送 `SIGTSTP` 信號給 shell,shell 接收到信號後傳送給所有在前景執行的程式及其所在進程組的所有進程。
- `jobs` 為全域變數,操作時必需阻塞所有信號
- 要傳送給進程組的所有進程,因此 `kill` 的 PID 需帶負號
```cpp
/*
* sigtstp_handler - The kernel sends a SIGTSTP to the shell whenever
* the user types ctrl-z at the keyboard. Catch it and suspend the
* foreground job by sending it a SIGTSTP.
*/
void sigtstp_handler(int sig)
{
int olderrno = errno;
sigset_t mask_all, mask_prev;
sigprocmask(SIG_BLOCK, &mask_all, &mask_prev);
for (int i = 0; i < MAXJOBS; i++) {
if (jobs[i].state == FG) {
kill(-jobs[i].pid, SIGTSTP); /* send signal to entire foreground process group */
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
return;
}
}
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
errno = olderrno;
return;
}
```
## sigchld_handler
當子進程終止(正常終止或因外部信號中斷,成為僵死狀態)或停止運行(收到 `SIGTSTP`),內核傳送 `SIGCHLD` 信號給 shell,shell 必需負責將所有僵死狀態的子進程回收,但如果當下沒有任何子進程可以回收,則應該直接立即返回
等待跟回收子進程一般會呼叫系統函數 `waitpid`,先用 `man waitpid` 查看 waitpid 的預設行為
>The waitpid() system call suspends execution of the calling thread until a child specified by pid argument has changed state. By default, waitpid() waits only for terminated children, but this behavior is modifiable via the options argument
`waitpid` 預設會掛起呼叫該函數的進程,直到其等待集合 (wait set) 中的一個子進程終止,且只返回已終止的子進程,但作業要求當沒有進程可以回收時,不可掛起呼叫的進程,且當子進程接收到 `SIGTSTP` 而暫停時,也必需有相應的處理,參考中文版課本 cs:app 3/e p.517,使用 `WNOHANG | WUNTRACED` 選項
- 若子進程正常退出,則從任務列表中刪除對應的任務
- 若子進程因信號而終止,除了刪除任務外,還需打印終止狀態
- 若子進程被暫停,則需要將任務的狀態切換為 `ST`
- 為避免 race condition,`jobs` 相關的操作必需阻塞所有信號
```cpp
/*
* sigchld_handler - The kernel sends a SIGCHLD to the shell whenever
* a child job terminates (becomes a zombie), or stops because it
* received a SIGSTOP or SIGTSTP signal. The handler reaps all
* available zombie children, but doesn't wait for any other
* currently running children to terminate.
*/
void sigchld_handler(int sig)
{
int olderrno = errno;
int status;
sigset_t mask_all, mask_prev;
pid_t pid;
sigfillset(&mask_all);
while ((pid = waitpid(-1, &status, WNOHANG | WUNTRACED)) > 0) {
if (WIFEXITED(status)) { /* if child process return normally, block signal & delete job */
sigprocmask(SIG_BLOCK, &mask_all, &mask_prev);
deletejob(jobs, pid);
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
}
if (WIFSIGNALED(status)) { /* if child process terminated by signal, print state, block signal & delete job */
sigprocmask(SIG_BLOCK, &mask_all, &mask_prev);
printf("Job [%d] (%d) terminated by signal %d\n", pid2jid(pid), pid, WTERMSIG(status));
deletejob(jobs, pid);
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
}
if (WIFSTOPPED(status)) { /* if child process stopped by signal, print state, block signal & modify state of job */
sigprocmask(SIG_BLOCK, &mask_all, &mask_prev);
printf("Job [%d] (%d) stopped by signal %d\n", pid2jid(pid), pid, WSTOPSIG(status));
struct job_t* job = getjobpid(jobs, pid);
job->state = ST;
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
}
}
errno = olderrno;
return;
}
```
## eval & waitfg
`eval` 函式為執行命令行的主要邏輯函數,課本有給出實作的基本邏輯,我們必需在這個基礎上使程式碼更加完善,讓前/背景程序的信號處理的執行符合預期
`eval` 必需包含以下功能
- 判斷是否為內置命令
- 若是則立即執行
- 若否則視為可執行文件的路徑, `fork` 一個子進程嘗試執行,若執行失敗則打印相關訊息並離開
- 將新建的任務加入任務列表
- 若任務在前景執行,shell 必需等待任務終止然後返回,否則打印相關訊息
課本提到如果沒有在 `fork` 前阻塞 `SIGCHLD`,則可能發生子進程執行完成後先刪除不存在於列表的任務,然後 shell 再將已完成的任務添加至任務列表,導致任務列表有一個不再存在且永遠不會被刪除的任務。
一個消除這種 race condition 的方法為在呼叫 `fork` 前先阻塞 `SIGCHLD`,確保任務在添加至任務列表前,都不會因為觸發 `sigchld_handler` 而被刪除。要特別注意子進程會繼承父進程的信號狀態,在執行 `execve` 前必需將 `SIGCHLD` 的阻塞解除,以下節錄自 CMU 的 lab 說明文件[2]
>In eval, the parent must use sigprocmask to block SIGCHLD signals before it forks the child, and then unblock these signals, again using sigprocmask after it adds the child to the job list by calling addjob. Since children inherit the blocked vectors of their parents, the child must be sure to then unblock SIGCHLD signals before it execs the new program. The parent needs to block the SIGCHLD signals in this way in order to avoid the race condition where the child is reaped by sigchld handler (and thus removed from the job list) before the parent calls addjob.
shell lab 說明文件[2] 還提到,當使用者輸入 ctrl-c/ctrl-z 時,應傳送對應信號給前景任務及對應進程組的所有進程,為了避免刪除 shell 本身,在創建子進程時必需呼叫 `setpgid(0, 0);`,讓子進程形成一個獨立的進程組
>When you run your shell from the standard Unix shell, your shell is running in the foreground process group. If your shell then creates a child process, by default that child will also be a member of the foreground process group. Since typing ctrl-c sends a SIGINT to every process in the foreground group, typing ctrl-c will send a SIGINT to your shell, as well as to every process that your shell created, which obviously isn’t correct.
Here is the workaround: After the fork, but before the execve, the child process should call setpgid(0, 0), which puts the child in a new process group whose group ID is identical to the child’s PID. This ensures that there will be only one process, your shell, in the foreground process group. When you type ctrl-c, the shell should catch the resulting SIGINT and then forward it to the appropriate foreground job (or more precisely, the process group that contains the foreground job).
```cpp
/*
* eval - Evaluate the command line that the user has just typed in
*
* If the user has requested a built-in command (quit, jobs, bg or fg)
* then execute it immediately. Otherwise, fork a child process and
* run the job in the context of the child. If the job is running in
* the foreground, wait for it to terminate and then return. Note:
* each child process must have a unique process group ID so that our
* background children don't receive SIGINT (SIGTSTP) from the kernel
* when we type ctrl-c (ctrl-z) at the keyboard.
*/
void eval(char *cmdline)
{
char* argv[MAXARGS];
char buf[MAXLINE];
int bg;
pid_t pid;
strcpy(buf, cmdline);
bg = parseline(buf, argv);
if (argv[0] == NULL) return; /* empty line */
sigset_t mask_all, mask_one, mask_prev;
sigfillset(&mask_all);
sigemptyset(&mask_one);
sigaddset(&mask_one, SIGCHLD);
if (!builtin_cmd(argv)) {
/* block sigchld */
sigprocmask(SIG_BLOCK, &mask_one, &mask_prev);
if ((pid = fork()) == 0) { /* child process */
/* unblock sigchld */
setpgid(0, 0);
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
if (execve(argv[0], argv, environ) < 0) { /* failed to execute file */
printf("%s: Command not found\n", argv[0]);
exit(0);
}
}
/* add job to job list */
sigprocmask(SIG_BLOCK, &mask_all, &mask_one);
addjob(jobs, pid, bg?BG:FG, buf);
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
if (!bg) { /* parent wait for foreground terminate */
waitfg(pid);
} else {
printf("[%d] (%d) Running %s", pid2jid(pid), pid, buf);
}
}
return;
}
```
`waitfg` 使用 busy waiting 等待前景的子進程不再是前景程序,雖然我們可以在 `waitfg` & `sigchld_handler` 都實作 `waitpid`,但這樣等於要在 `waitfg` 重複 `sigchld_handler` 的程式碼,而且兩個函式的功能切不乾淨會讓其他人感到混淆。
shell lab 說明文件[2] 建議統一由 `sigchld_handler` 回收子進程,所以 `waitfg` 的任務變得非常單純
>One of the tricky parts of the assignment is deciding on the allocation of work between the waitfg and sigchld handler functions. We recommend the following approach:
– In waitfg, use a busy loop around the sleep function.
– In sigchld handler, use exactly one call to waitpid.
While other solutions are possible, such as calling waitpid in both waitfg and sigchld handler, these can be very confusing. It is simpler to do all reaping in the handler.
此外參考課本的建議,使用 `sigsuspend` 避免 busy waiting 浪費太多 CPU 資源。參考課本 `sigsuspend` 等價於以下程式碼的 `atomic` 版本
```cpp
sigprocmask(SIG_SETMASK, &mask, &prev);
pause();
sigprocmask(SIG_SETMASK, &prev, NULL);
```
以下節自 `man sigsuspend`
>sigsuspend() temporarily replaces the signal mask of the calling thread with the mask given by mask and then suspends the thread until delivery of a signal whose action is to invoke a signal handler or to terminate a process.
If the signal terminates the process, then sigsuspend() does not return. If the signal is caught, then sigsuspend() returns after the signal handler returns, and the signal mask is restored to the state before the call to sigsuspend().
```cpp
/*
* waitfg - Block until process pid is no longer the foreground process
*/
void waitfg(pid_t pid)
{
sigset_t mask_prev;
sigemptyset(&mask_prev); /* do not block any signal so that parent process could receive SIGCHLD */
while (pid == fgpid(jobs)) /* busy waiting */
sigsuspend(&mask_prev);
return;
}
```
## do_bgfg
`do_bgfg` 執行內置命令 `bg/fg`
- bg \<job>: 傳送 SIGCONT 信號重啟任務,並將任務轉換於背景執行
- fg \<job>: 傳送 SIGCONT 信號重啟任務,並將任務轉換於前景執行
此外參考 `tshref.out`,必需有一些簡單的判斷邏輯
- 確認使用者的輸入格式是否正確
- JID/PID 是否存在於任務列表中
最後針對 `bg/fg` 命令,有各自對應的指令要執行
- 若為前景程式,shell 變更任務狀態為 FG,並等待前景任務執行完成或終止
- 若為背景程式,shell 變更任務狀態為 BG,並打印相關訊息
```cpp
/*
* do_bgfg - Execute the builtin bg and fg commands
*/
int builtin_cmd(char **argv)
{
sigset_t mask_all, mask_prev;
if (!strcmp(argv[0], "quit")) /* quit command */
exit(0);
if (!strcmp(argv[0], "jobs")) { /* jobs is a global variable, block all signal then execute the function */
sigfillset(&mask_all);
sigprocmask(SIG_BLOCK, &mask_all, &mask_prev);
listjobs(jobs);
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
return 1;
}
if (!(strcmp(argv[0], "bg")) || !(strcmp(argv[0], "fg"))) { /* change job running status */
do_bgfg(argv);
return 1;
}
if (!strcmp(argv[0], "&")) /* ignore single & */
return 1;
return 0; /* not a builtin command */
}
/*
* do_bgfg - Execute the builtin bg and fg commands
*/
void do_bgfg(char **argv)
{
char* cid;
int nid, isjid, len;
if (argv[1] == NULL) {
printf("%s command requires PID or %%jobid argument\n", argv[0]);
return;
}
cid = argv[1];
if (cid[0] == '%') {
cid++;
isjid = 1;
} else {
isjid = 0;
}
len = strlen(cid);
for (int i = 0; i < len; i++) {
if (cid[i] < 48 || cid[i] > 57) {
printf("%s: argument must be a PID or %%jobid\n", argv[0]);
return;
}
}
char* endptr;
struct job_t* job;
sigset_t mask_all, mask_prev;
sigfillset(&mask_all);
nid = strtol(cid, &endptr, 10);
if (isjid) {
sigprocmask(SIG_BLOCK, &mask_all, &mask_prev);
job = getjobjid(jobs, nid);
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
if (!job) {
printf("%%%d: No such job\n", nid); return;
}
} else {
sigprocmask(SIG_BLOCK, &mask_all, &mask_prev);
job = getjobpid(jobs, nid);
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
if (!job) {
printf("(%d): No such process\n", nid); return;
}
}
kill(-(job->pid), SIGCONT);
if (!(strcmp(argv[0], "fg"))) {
sigprocmask(SIG_BLOCK, &mask_all, &mask_prev);
job->state = FG;
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
waitfg(job->pid);
} else {
sigprocmask(SIG_BLOCK, &mask_all, &mask_prev);
job->state = BG;
sigprocmask(SIG_SETMASK, &mask_prev, NULL);
printf("[%d] (%d) %s", job->jid, job->pid, job->cmdline);
}
return;
}
```
完整程式碼請參考 [github 連結](https://github.com/Chang-Chia-Chi/CSAPP-Labs/blob/main/Shell-Lab/tsh.c)
## Reference
1. [Computer Systems: A Programmer’s Perspective, 3/E (CS:APP3e)](http://csapp.cs.cmu.edu/3e/labs.html)
2. [CS 213, Fall 2002 Lab Assignment L5: Writing Your Own Unix Shell](http://csapp.cs.cmu.edu/3e/shlab.pdf)