shell lab

github

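This note records my thought process while solving the CS:APP shell lab. It was written step by step as I worked, not after the fact, so it may read as somewhat disorganized.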

TODO:

unify the notation style for better readability
need further discussion about signal safety (POSIX)

Background review - Exceptions & Processes

Completing the shell lab requires a basic understanding of exceptional control flow (ECF). I recommend reading at least Chapter 8 of CS:APP before starting the lab.

`fork`

called once but returns twice

#include <sys/types.h>
#include <unistd.h>
pid_t fork(void);

RETURN

0
returned in the child
PID of the child
returned in the parent
-1
returned in the parent on error (no child is created)
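
A minimal sketch of the two return values, assuming a POSIX system. The helper name fork_and_collect is hypothetical and not part of the lab; it only shows which branch runs in which process.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork once, let the child exit with `code`,
 * and return the exit status the parent observes (or -1 on failure). */
int fork_and_collect(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;               /* fork failed, no child exists */
    if (pid == 0)
        _exit(code);             /* child: fork returned 0 */

    /* parent: fork returned the child's PID */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_and_collect(7) from a test program should give back 7, since the parent reads the status the child passed to _exit.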

`waitpid`

#include <sys/types.h>
#include <sys/wait.h>
pid_t waitpid(pid_t pid, int *status, int options)

init (PID 1) adopts orphaned children and reaps them once their parent has terminated, so explicit reaping is only necessary for

long-running processes (e.g., shells and servers)
processes that never terminate (e.g., an infinite loop)

`pid_t pid`

< -1
meaning wait for any child process whose process group ID is equal to the absolute value of pid
-1
meaning wait for any child process
0
meaning wait for any child process whose process group ID is equal to that of the calling process
> 0
meaning wait for the child whose process ID is equal to the value of pid

`int options`

0 (default)
WNOHANG
Return immediately (with a return value of 0) if none of the child processes in the wait set has terminated yet
WUNTRACED
Return the PID of the terminated or stopped child that caused the return (The default behavior returns only for terminated children)
WNOHANG|WUNTRACED
Return immediately, with a return value of 0, if none of the children in the wait set has stopped or terminated, or with a return value equal to the PID of one of the stopped or terminated children.

`int *status`

If the status argument is non-NULL, then waitpid encodes status information about the child that caused the return in the status argument. The wait.h include file defines several macros for interpreting the status argument:

WIFEXITED(status)
Returns true if the child terminated normally, via a call to exit or a return.
- WEXITSTATUS(status)
  Returns the exit status of a normally terminated child. This status is only defined if WIFEXITED returned true.
WIFSIGNALED(status)
Returns true if the child process terminated because of a signal that was not handled.
- WTERMSIG(status)
  Returns the number of the signal that caused the child process to terminate. This status is only defined if WIFSIGNALED(status) returned true.
WIFSTOPPED(status)
Returns true if the child that caused the return is currently stopped.
- WSTOPSIG(status)
  Returns the number of the signal that caused the child to stop. This status is only defined if WIFSTOPPED(status) returned true.

RETURN

PID of child
waitpid success
0
if WNOHANG was specified and one or more children specified by pid exist, but have not yet changed state
-1
ERROR

ERRORS

The corresponding error message can be obtained with strerror(errno)

ECHILD
The process specified by pid (waitpid()) or idtype and id (waitid()) does not exist or is not a child of the calling process.
EINTR
WNOHANG was not set and an unblocked signal or a SIGCHLD was caught

#include <sys/types.h>
#include <sys/wait.h>
pid_t wait(int *status);

wait(&status) is equivalent to waitpid(-1, &status, 0)
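
To make the options and status macros above concrete, here is a small sketch. The helper poll_then_signal_child is hypothetical: it polls a still-running child with WNOHANG (expecting 0), then terminates it with a signal and decodes the status with WIFSIGNALED / WTERMSIG.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start a child that only sleeps in pause(), poll it
 * once with WNOHANG, terminate it with `sig`, then decode the status.
 * Returns the terminating signal number, or -1 if anything is unexpected. */
int poll_then_signal_child(int sig)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();             /* child: wait until a signal kills it */
    }

    int status;
    /* The child never exits on its own, so a non-blocking poll returns 0. */
    if (waitpid(pid, &status, WNOHANG) != 0) {
        kill(pid, SIGKILL);      /* clean up before reporting failure */
        waitpid(pid, &status, 0);
        return -1;
    }

    if (kill(pid, sig) < 0) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }
    if (waitpid(pid, &status, 0) != pid)   /* blocking wait reaps the child */
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

With SIGKILL as the argument the function should return SIGKILL, because that signal cannot be caught or ignored by the child.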

`getpid`

#include <sys/types.h>
#include <unistd.h>
pid_t getpid(void);
pid_t getppid(void);

getpid returns the PID of the calling process
getppid returns the PID of its parent

`execve`

#include <unistd.h>
int execve(const char *filename, char *const argv[], char *const envp[]);

The execve function loads and runs the executable object file filename with the argument list argv and the environment variable list envp. While it overwrites the address space of the current process, it does not create a new process. The new program still has the same PID, and it inherits all of the file descriptors that were open at the time of the call to the execve function.

RETURN

does not return
execve success
-1
ERROR (the calling program keeps running)
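
A sketch of the usual fork + execve pattern, assuming /bin/sh exists on the system. The helper run_program and the exit code 127 for a failed execve are my own choices for illustration (127 follows common shell convention), not something the lab prescribes.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run argv[0] in a child and return its exit status.
 * Returns 127 if execve itself fails, -1 on fork/waitpid errors. */
int run_program(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(argv[0], argv, environ);
        /* Only reached if execve failed: the old program is still running. */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing {"/bin/sh", "-c", "exit 3", NULL} should return 3, while a path that does not exist should return 127 because execve comes back with -1 and the child falls through to _exit.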

appendix - `envp`

#include <stdlib.h>
char *getenv(const char *name);
int setenv(const char *name, const char *newvalue, int overwrite);
int unsetenv(const char *name);

getenv searches the envp[] for a string "name=value"
- Returns: pointer to value if it exists, NULL if no match
setenv searches the envp[] for a string "name=oldvalue" and, if overwrite is nonzero, replaces oldvalue with newvalue
- If name does not exist, then setenv adds "name=newvalue" to the envp[]
- Returns: 0 on success, −1 on error
unsetenv searches the envp[] for a string "name=value" and deletes it
- Returns: 0 on success, −1 on error

extern char **environ;

defined as a global variable in the Glibc source file
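
A short round trip through the three calls, as a sketch. The variable name TSH_DEMO and the helper env_roundtrip are made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical demo: exercise setenv/getenv/unsetenv on a scratch variable.
 * Returns 0 if every step behaves as described above, -1 otherwise. */
int env_roundtrip(void)
{
    const char *v;

    if (setenv("TSH_DEMO", "old", 1) != 0)
        return -1;
    if (setenv("TSH_DEMO", "new", 0) != 0)   /* overwrite == 0: keeps "old" */
        return -1;
    v = getenv("TSH_DEMO");                  /* points at the value part */
    if (v == NULL || strcmp(v, "old") != 0)
        return -1;

    if (setenv("TSH_DEMO", "new", 1) != 0)   /* overwrite != 0: replaces */
        return -1;
    v = getenv("TSH_DEMO");
    if (v == NULL || strcmp(v, "new") != 0)
        return -1;

    if (unsetenv("TSH_DEMO") != 0)
        return -1;
    return getenv("TSH_DEMO") == NULL ? 0 : -1;
}
```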

further information

Background review - Signals

#include <unistd.h>
pid_t getpgrp(void);

a child process belongs to the same process group as its parent

#include <unistd.h>
int setpgid(pid_t pid, pid_t pgid);

changes the process group of process pid to pgid
Returns: 0 on success, −1 on error
If pid is zero, the PID of the current process is used
If pgid is zero, the PID of the process specified by pid is used for the process group ID

`kill` - send signal, not kill!

#include <sys/types.h>
#include <signal.h>
int kill(pid_t pid, int sig);

Returns: 0 if OK, −1 on error

unix> /bin/kill -9 15213

sends signal 9 (SIGKILL) to process 15213
A negative PID causes the signal to be sent to every process in process group |PID|, e.g. /bin/kill -9 -15213 signals every process in group 15213
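
The same idea from C, as a sketch: kill with a negative pid targets a whole process group. The helper kill_child_group is hypothetical; it puts a child into its own group with setpgid and then signals the group.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a child in its own process group, send `sig`
 * to that group, and return the signal that terminated the child (-1 on error). */
int kill_child_group(int sig)
{
    int status;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);           /* child: new group, PGID == own PID */
        for (;;)
            pause();
    }

    /* The parent makes the same call so the group exists no matter which
     * side runs first. */
    setpgid(pid, pid);

    if (kill(-pid, sig) < 0) {   /* negative pid: signal the whole group */
        kill(pid, SIGKILL);      /* clean up before reporting failure */
        waitpid(pid, &status, 0);
        return -1;
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

This is the mechanism the shell relies on later when it forwards ctrl-c to a foreground job with kill(-pid, SIGINT).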

`signal`

#include <signal.h>
typedef void (*sighandler_t)(int);
sighandler_t signal(int signum, sighandler_t handler);

Returns: ptr to previous handler if OK, SIG_ERR on error (POSIX specifies that errno is set in that case)
If handler is SIG_IGN, then signals of type signum are ignored
If handler is SIG_DFL, then the action for signals of type signum reverts to the default action
Otherwise, handler is the address of a user-defined function, called a signal handler

#include <signal.h>
int sigaction(int signum, const struct sigaction *act, struct sigaction *oldact);

portable signal handling (POSIX standard)
Returns: 0 if OK, −1 on error
With the SA_RESTART flag set, interrupted system calls are automatically restarted whenever possible

A cleaner approach is to use a wrapper:

handler_t *Signal(int signum, handler_t *handler)
{
    struct sigaction action, old_action;

    action.sa_handler = handler;
    sigemptyset(&action.sa_mask);  /* Sig of type being handled is still blocked */
    action.sa_flags = SA_RESTART;  /* Restart syscalls if possible */

    if (sigaction(signum, &action, &old_action) < 0)
        unix_error("Signal error");
    return (old_action.sa_handler);
}

/* The sigaction structure is defined as something like: */
struct sigaction {
    void      (*sa_handler)(int);
    void      (*sa_sigaction)(int, siginfo_t *, void *);
    sigset_t    sa_mask;
    int         sa_flags;
    void      (*sa_restorer)(void);
};

The man page explains this more clearly!
-> man sigaction

`sigprocmask`

#include <signal.h>
int sigprocmask(int how, const sigset_t *set, sigset_t *oldset);
    int sigemptyset(sigset_t *set);
    int sigfillset(sigset_t *set);
    int sigaddset(sigset_t *set, int signum);
    int sigdelset(sigset_t *set, int signum);
Returns: 0 if OK, −1 on error

int sigismember(const sigset_t *set, int signum);
Returns: 1 if member, 0 if not, −1 on error

how
- SIG_BLOCK: Add the signals in set to blocked (blocked = blocked | set)
- SIG_UNBLOCK: Remove the signals in set from blocked (blocked = blocked & ~set).
- SIG_SETMASK: blocked = set.
If oldset is non-NULL, the previous value of the blocked bit vector is stored in oldset
Signal sets such as set are manipulated using the following functions:
int sigemptyset(sigset_t *set) initializes set to the empty set
int sigfillset(sigset_t *set) adds every signal to set
int sigaddset(sigset_t *set, int signum) adds signum to set
int sigdelset(sigset_t *set, int signum) deletes signum from set

Solution walkthrough

[CS:APP 8.4.6] A shell performs a sequence of read/evaluate steps, and then terminates. The read step reads a command line from the user. The evaluate step parses the command line and runs programs on behalf of the user.

First, look at the textbook example:

#include "csapp.h"
#define MAXARGS 128

/* Function prototypes */
void eval(char *cmdline);
int parseline(char *buf, char **argv);
int builtin_command(char **argv);

int main()
{
    char cmdline[MAXLINE]; /* Command line */
    while (1) {
        /* Read */
        printf("> ");
        Fgets(cmdline, MAXLINE, stdin);
        if (feof(stdin))
            exit(0);

        /* Evaluate */
        eval(cmdline);
    }
}

/* eval - Evaluate a command line */
void eval(char *cmdline)
{
    char *argv[MAXARGS]; /* Argument list execve() */
    char buf[MAXLINE];   /* Holds modified command line */
    int bg;              /* Should the job run in bg or fg */
    pid_t pid;           /* Process id */
    
    strcpy(buf, cmdline);
    bg = parseline(buf, argv);
    if (argv[0] == NULL)
        return;  /* Ignore empty lines */
        
    if (!builtin_command(argv)) { 
        if ((pid = Fork()) == 0) {  /* Child runs user job */
            if (execve(argv[0], argv, environ) < 0) {
                printf("%s: Command not found.\n", argv[0]);
                exit(0);
            }
        }
        
        /* Parent waits for foreground job to terminate */
        if (!bg) {
            int status;
            if (waitpid(pid, &status, 0) < 0)
                unix_error("waitfg: waitpid error");
        }
        else
            printf("%d %s", pid, cmdline);
    }
    return;
}

/* If first arg is a builtin command, run it and return true */
int builtin_command(char **argv)
{
    if (!strcmp(argv[0], "quit")) /* quit command */
        exit(0);
    if (!strcmp(argv[0], "&"))    /* Ignore singleton & */
        return 1;
    return 0;                     /* Not a builtin command */
}

/* parseline - Parse the command line and build the argv array */
int parseline(char *buf, char **argv)
{
    char *delim;  /* Points to first space delimiter */
    int argc;     /* Number of args */
    int bg;       /* Background job? */
    
    buf[strlen(buf)-1] = ' ';  /* Replace trailing '\n' with space */
    while (*buf && (*buf == ' ')) /* Ignore leading spaces */
        buf++;
        
    /* Build the argv list */
    argc = 0;
    while ((delim = strchr(buf, ' '))) {
        argv[argc++] = buf;
        *delim = '\0';
        buf = delim + 1;
        while (*buf && (*buf == ' ')) /* Ignore spaces */
            buf++;
    }
    argv[argc] = NULL;
    
    if (argc == 0) /* Ignore blank line */
        return 1;
        
    /* Should the job run in the background? */
    if ((bg = (*argv[argc-1] == '&')) != 0)
        argv[--argc] = NULL;
        
    return bg;
}

From here on I follow the assignment writeup and work through the trace files in order.

trace01

trace01.txt - Properly terminate on EOF

This one is essentially already done in the starter code =_=

if (feof(stdin)) { /* End of file (ctrl-d) */
    fflush(stdout);
    exit(0);
}

Ctrl+D signals EOF on stdin

trace02

trace02.txt - Process builtin quit command

This is simple string matching: parseline first parses the command line, then builtin_cmd checks the first argument.

/* in eval */
bg = parseline(cmdline, argv);

/* builtin_cmd - return 1 if argv[0] is a builtin command, 0 otherwise */
int builtin_cmd(char **argv) 
{
    if (!strcmp(argv[0], "quit")) {  /* quit command */
        exit(0);
    }
    return 0;     /* not a builtin command */
}

trace03 & trace04

trace03.txt - Run a foreground job (FG)
trace04.txt - Run a background job (BG)

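These two traces can be handled together. The logic runs in this order:

parseline parses the command line and determines whether it is an FG or BG job
check whether it is a builtin command
fork a child and have it run the job with execve
the parent decides whether to wait for the child based on FG/BG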

bg = parseline(cmdline, argv);
if (argv[0] == NULL) {  /* ignore empty lines */
    return;
}
if (!builtin_cmd(argv)) {  /* no need to fork for a builtin command */
    if ((pid = fork()) == 0) {  /* child runs the job */
        if(execve(argv[0], argv, environ) < 0) {
           unix_error("execve error");
        }
    }
    /* parent waits for fg job terminate */
    if (!bg) {
        if (waitpid(pid, &status, 0) < 0) {
            unix_error("waitpid error");
        }
    }
}

trace05

trace05.txt - Process jobs builtin command

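Implementing the builtin jobs command is simple: register the command in builtin_cmd and have it call listjobs (provided with the assignment from the start) to print the contents of the job list.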

int builtin_cmd(char **argv) 
{
    if (!strcmp(argv[0], "quit")) {  /* quit command */
        exit(0);
    }
    if (!strcmp(argv[0], "jobs")) {  /* jobs command */
        listjobs(jobs);
        return 1;
    }
    return 0;     /* not a builtin command */
}

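Next we need to keep the job list up to date. The points to handle are:

After fork, call addjob to register the child in the job list
After the child finishes, call deletejob. This splits into two cases:
- the child is a foreground job: the parent waits for it to finish and reaps it
- the child is a background job: sigchld_handler reaps it later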

Before implementing this, note the hint given in the writeup:

One of the tricky parts of the assignment is deciding on the allocation of work between the waitfg and sigchld handler functions. We recommend the following approach:
– In waitfg, use a busy loop around the sleep function.
– In sigchld handler, use exactly one call to waitpid.
While other solutions are possible, such as calling waitpid in both waitfg and sigchld handler,
these can be very confusing. It is simpler to do all reaping in the handler.

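The writeup notes that calling waitpid in both waitfg and sigchld_handler may work, but it recommends letting sigchld_handler do all the reaping to avoid confusion. The approach used for trace03 above therefore does not hold up and has to be revised below.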

The parent needs to block the SIGCHLD signals in this way in order to avoid the race condition where the child is reaped by sigchld handler (and thus removed from the job list) before the parent
calls addjob.

In addition, the parent must block SIGCHLD before calling fork and unblock it only after calling addjob. Otherwise the child may finish first, and sigchld_handler would call deletejob before the parent has called addjob, which is a race condition.

When you run your shell from the standard Unix shell, your shell is running in the foreground process group. If your shell then creates a child process, by default that child will also be a member of the foreground process group. Since typing ctrl-c sends a SIGINT to every process in the foreground group, typing ctrl-c will send a SIGINT to your shell, as well as to every process that your shell
created, which obviously isn’t correct.
Here is the workaround: After the fork, but before the execve, the child process should call setpgid(0, 0), which puts the child in a new process group whose group ID is identical to the child’s PID. This ensures that there will be only one process, your shell, in the foreground process group. When you type ctrl-c, the shell should catch the resulting SIGINT and then forward it to the appropriate foreground job (or more precisely, the process group that contains the foreground
job).

To keep ctrl-c from killing our shell along with the job, the child calls setpgid(0, 0) to move itself into a separate process group.

Putting the above together, the revised code for trace03 & trace04 is:

/* 
 * eval - Evaluate the command line that the user has just typed in
 * 
 * If the user has requested a built-in command (quit, jobs, bg or fg)
 * then execute it immediately. Otherwise, fork a child process and
 * run the job in the context of the child. If the job is running in
 * the foreground, wait for it to terminate and then return.  Note:
 * each child process must have a unique process group ID so that our
 * background children don't receive SIGINT (SIGTSTP) from the kernel
 * when we type ctrl-c (ctrl-z) at the keyboard.
 */
void eval(char *cmdline) 
{
    char *argv[MAXARGS];  /* argument list execve() */
    pid_t pid;
    int bg;  /* should the job run in bg or fg */
    int status;
    sigset_t mask;

    sigemptyset(&mask);

    bg = parseline(cmdline, argv);
    if (argv[0] == NULL) { /* ignore empty lines */
        return;
    }

    if (!builtin_cmd(argv)) {  /* no need to fork for a builtin command */
        sigaddset(&mask, SIGCHLD);
        sigprocmask(SIG_BLOCK, &mask, NULL);  /* block SIGCHLD */
        if ((pid = fork()) == 0) {  /* child runs the job */
            sigprocmask(SIG_UNBLOCK, &mask, NULL); /* unblock SIGCHLD in child */
            setpgid(0,0);  /* puts the child in a new process group, GID = PID */
            if(execve(argv[0], argv, environ) < 0) {
                unix_error("execve error");
            }
        }
        /* adds the child to job list */
        addjob(jobs, pid, (bg?BG:FG), cmdline);
        sigprocmask(SIG_UNBLOCK, &mask, NULL);  /* unblock SIGCHLD */

        if (!bg) {  /* parent waits for fg job terminate */
            waitfg(pid);
        }
        else {  /* shows information of bg job */
            printf("[%d] (%d) %s", pid2jid(pid), pid, cmdline);
        }
    }
    return;
}

/* 
 * waitfg - Block until process pid is no longer the foreground process
 */
void waitfg(pid_t pid)
{
    while(pid == fgpid(jobs)) {
        sleep(0);
    }
    return;
}

/* 
 * sigchld_handler - The kernel sends a SIGCHLD to the shell whenever
 *     a child job terminates (becomes a zombie), or stops because it
 *     received a SIGSTOP or SIGTSTP signal. The handler reaps all
 *     available zombie children, but doesn't wait for any other
 *     currently running children to terminate.  
 */
void sigchld_handler(int sig) 
{
    pid_t pid;
    int status;

    while((pid = waitpid(-1, &status, WNOHANG|WUNTRACED)) > 0) {
        if(WIFEXITED(status)) {  /* process terminated normally */
            deletejob(jobs, pid);
        }
    }
    return;
}

Note: as the header comment of eval points out, SIGINT (ctrl-c) and SIGTSTP (ctrl-z) are not handled yet. They are dealt with together in trace06.

trace06 ~ trace08

trace06.txt - Forward SIGINT to foreground job
trace07.txt - Forward SIGINT only to foreground job
trace08.txt - Forward SIGTSTP only to foreground job

When we type ctrl-c or ctrl-z, the OS sends SIGINT or SIGTSTP to our shell, so we need sigint_handler and sigtstp_handler to forward the signal to the corresponding FG job.

/* 
 * sigint_handler - The kernel sends a SIGINT to the shell whenever the
 *    user types ctrl-c at the keyboard.  Catch it and send it along
 *    to the foreground job.  
 */
void sigint_handler(int sig) 
{
    pid_t pid = fgpid(jobs);
    
    if(pid != 0) {  /* do nothing if no FG job exist */
    /* send signal to entire foreground process group */
        if(kill(-pid, SIGINT) < 0) {
            unix_error("sigint error");
        }
    }
    return;
}

/*
 * sigtstp_handler - The kernel sends a SIGTSTP to the shell whenever
 *     the user types ctrl-z at the keyboard. Catch it and suspend the
 *     foreground job by sending it a SIGTSTP.  
 */
void sigtstp_handler(int sig) 
{
    pid_t pid = fgpid(jobs);

    if(pid != 0) {  /* do nothing if no FG job exist */
    /* send signal to entire foreground process group */
        if(kill(-pid, SIGTSTP) < 0) {
            unix_error("sigtstp error");
        }
    }
    return;
}

sigchld_handler also needs new branches for children terminated by SIGINT (ctrl-c) or stopped by SIGTSTP (ctrl-z):

/* 
 * sigchld_handler - The kernel sends a SIGCHLD to the shell whenever
 *     a child job terminates (becomes a zombie), or stops because it
 *     received a SIGSTOP or SIGTSTP signal. The handler reaps all
 *     available zombie children, but doesn't wait for any other
 *     currently running children to terminate.  
 */
void sigchld_handler(int sig) 
{
    pid_t pid;
    int status;

    while((pid = waitpid(-1, &status, WNOHANG|WUNTRACED)) > 0) {
        if(WIFEXITED(status)) {  /* process terminated normally */
            deletejob(jobs, pid);
        }
        if(WIFSIGNALED(status)) {  /* process terminated by signals e.g., ctrl-c */
            printf("Job [%d] (%d) terminated by signal %d\n", pid2jid(pid), pid, WTERMSIG(status));
            deletejob(jobs, pid);
        }
        if(WIFSTOPPED(status)) {  /* process stopped by signals e.g., ctrl-z */
            printf("Job [%d] (%d) stopped by signal %d\n", pid2jid(pid), pid, WSTOPSIG(status));
            struct job_t *job = getjobpid(jobs, pid);
            job->state = ST;
        }
    }
    if(pid < 0 && errno != ECHILD) {
        unix_error("waitpid error");
    }  
    return;
}

trace09 & trace10

trace09.txt - Process bg builtin command
trace10.txt - Process fg builtin command

First modify builtin_cmd to recognize bg and fg:

int builtin_cmd(char **argv) 
{
...
...
    if(!strcmp(argv[0], "bg") || !strcmp(argv[0], "fg")) {  /* bg and fg command */
        do_bgfg(argv);
        return 1;
    }
    return 0;     /* not a builtin command */
}

For the implementation of do_bgfg, first look at the hints in the writeup:

Each job can be identified by either a process ID (PID) or a job ID (JID), which is a positive integer assigned by tsh. JIDs should be denoted on the command line by the prefix ’%’. For example, “%5”
denotes JID 5, and “5” denotes PID 5.

The bg <job> command restarts <job> by sending it a SIGCONT signal, and then runs it in the background. The <job> argument can be either a PID or a JID.

The fg <job> command restarts <job> by sending it a SIGCONT signal, and then runs it in the foreground. The <job> argument can be either a PID or a JID.

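So the overall approach is roughly:

Parse the user's argument, i.e. argv
Look up the job pointer based on the parsed result
Send SIGCONT to the target job with kill
Update the job's state
Handle BG and FG differently afterwards; this part follows the same logic as eval after execve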
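
The first step (parsing the argument) can be factored into a small helper. This is a sketch only: parse_job_arg is a hypothetical function that is not part of tsh.c, and unlike a plain digit loop it also rejects an empty string or a lone "%".

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper, not part of tsh.c: parse "%<jid>" or "<pid>".
 * On success stores the number in *id, sets *is_jid to 1 or 0, returns 0.
 * Returns -1 if the argument is missing, empty, or contains a non-digit. */
int parse_job_arg(const char *arg, int *is_jid, int *id)
{
    if (arg == NULL)
        return -1;

    *is_jid = (arg[0] == '%');
    const char *digits = arg + *is_jid;     /* skip the '%' prefix if present */
    if (*digits == '\0')
        return -1;                          /* "" or "%" alone */

    for (const char *p = digits; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return -1;
    }
    *id = atoi(digits);
    return 0;
}
```

With this, "%5" parses as JID 5 and "42" as PID 42, matching the writeup's convention; the caller would then pick getjobjid or getjobpid based on is_jid.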

The final result:

/* 
 * do_bgfg - Execute the builtin bg and fg commands
 */
void do_bgfg(char **argv) 
{
    char *id = argv[1];
    struct job_t *job;
    int i;
    int length;

    if(id == NULL) {
        printf("%s command requires PID or %%jobid argument\n",argv[0]);
        return;
    }
    if(id[0] == '%') {  /* identified by JID */
        id++;  /* skip the '%' */
        length = strlen(id);
        for (i = 0; i < length; i++) {  /* check if ID are digit numbers */
            if(!isdigit(id[i])) {
                printf("%s: argument must be a PID or %%jobid\n", argv[0]);
                return;
            }
        }
        job = getjobjid(jobs, atoi(id));
        if(job == NULL) {
            printf("%%%d: No such job\n", atoi(id));
            return;
        }
    }
    else {  /* identified by PID */
        length = strlen(id);
        for (i = 0; i < length; i++) {  /* check if ID are digit numbers */
            if(!isdigit(id[i])) {
                printf("%s: argument must be a PID or %%jobid\n", argv[0]);
                return;
            }
        }
        job = getjobpid(jobs, atoi(id));
        if(job == NULL) {
            printf("(%d): No such process\n", atoi(id));
            return;
        }
    }

    kill(-(job->pid), SIGCONT); /* send SIGCONT to the job */

    if(!strcmp(argv[0], "fg")) {  /* waits until fg job terminates */
        job->state = FG;
        waitfg(job->pid);
    }
    else {  /* shows information of bg job */
        job->state = BG;
        printf("[%d] (%d) %s", job->jid, job->pid, job->cmdline);
    }
    return;
}

trace12 ~ 16

trace12.txt - Forward SIGTSTP to every process in foreground process group
trace13.txt - Restart every stopped process in process group
trace14.txt - Simple error handling
trace15.txt - Putting it all together
trace16.txt - Tests whether the shell can handle SIGTSTP and SIGINT signals that come from other processes instead of the terminal

These traces mostly check whether anything was missed in the previous steps, so I won't go through them here.

Further question: `exit()` vs. `_exit()`

First, read man 3 exit and man 2 _exit carefully:

The exit() function causes normal process termination and the value of status & 0377 is returned to the parent (see wait(2)).
All functions registered with atexit(3) and on_exit(3) are called, in the reverse order of their registration. All open stdio(3) streams are flushed and closed. Files created by tmpfile(3) are removed.

Note that a call to execve(2) removes registrations created using atexit(3) and on_exit(3).

The function _exit() is like exit(3), but does not call any functions registered with atexit(3) or on_exit(3).

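Therefore, if execve() fails in the child after fork(), the child should call _exit() rather than exit(), because exit() would flush the stdio buffers inherited from the parent, remove its temporary files, and run the functions the parent registered with atexit(3) and on_exit(3).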

void eval(char *cmdline)
{
...
...
    if ((pid = fork()) == 0) {  /* child runs the job */
        sigprocmask(SIG_UNBLOCK, &mask, NULL); /* unblock SIGCHLD in child */
        setpgid(0,0);  /* puts the child in a new process group, GID = PID */
        if(execve(argv[0], argv, environ) < 0) {
            fprintf(stderr, "%s: Command not found\n", argv[0]);
            _exit(1);
        }
    }
...
...
}

reference

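Further question: async-signal-safe functions

This section will be revisited later; the current version still needs improvement.

At this point every trace passes (or rather, has a chance of passing). But anyone who paid attention in class will notice that we completely ignored signal safety when writing the signal handlers. According to POSIX, a signal handler may only call async-signal-safe functions, and printf is not one of them!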

man 7 signal-safety

An async-signal-safe function is one that can be safely called from within a signal handler. Many functions are not async-signal-safe. In particular, nonreentrant functions are generally unsafe to call from a signal handler.

The kinds of issues that render a function unsafe can be quickly understood when one considers the implementation of the stdio library, all of whose functions are not async-signal-safe.
When performing buffered I/O on a file, the stdio functions must maintain a statically allocated data buffer along with associated counters and indexes (or pointers) that record the amount of data and the current position in the buffer. Suppose that the main program is in the middle of a call to a stdio function such as printf(3) where the buffer and associated variables have been partially updated. If, at that moment, the program is interrupted by a signal handler that also calls printf(3), then the second call to printf(3) will operate on inconsistent data, with unpredictable results.

I did run into this while testing the trace files: a signal arrived while main was inside a system call, so the signal handler ran in the middle of it.

$ make test07
./sdriver.pl -t trace07.txt -s ./tsh -a "-p"
#
# trace07.txt - Forward SIGINT only to foreground job.
#
tsh> ./myspin 4 &
[1] (2345) ./myspin 4 &
tsh> ./myspin 5
waitpid error: Interrupted system call

To avoid problems with unsafe functions, there are two possible choices:

Ensure that (a) the signal handler calls only async-signal-safe functions, and (b) the signal handler itself is reentrant with respect to global variables in the main program.

Block signal delivery in the main program when calling functions that are unsafe or operating on global data that is also accessed by the signal handler.

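So the workable options are:

Use only async-signal-safe functions inside signal handlers, and share no global variables with main
Temporarily block signals in main while it operates on global variables shared with the handlers

We split the changes into two steps:

Make sure only async-signal-safe functions are used
Make sure global variables are accessed safely

Step 1 - Make sure only async-signal-safe functions are used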

CS:APP actually ships a helper file, csapp.c, which contains functions that can be used inside signal handlers. The parts we need are excerpted below:

/* signal-safe I/O functions ported from csapp.c */
static size_t sio_strlen(char s[])
{
    int i = 0;

    while (s[i] != '\0')
        ++i;
    return i;
}

ssize_t sio_puts(char s[]) /* Put string */
{
    return write(STDOUT_FILENO, s, sio_strlen(s));
}

void sio_error(char s[]) /* Put error message and exit */
{
    sio_puts(s);
    _exit(1);
}

unix_error can simply be replaced with sio_error, but printf is more troublesome because sio_puts does not support format specifiers (%d, %s, etc.). The handler therefore sets a flag, and main later calls printf based on that flag.
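
A possible alternative to the flag approach, shown only as a sketch and not used in this solution: convert the number to text by hand and emit it with write, which is async-signal-safe. The helpers safe_ltoa and safe_put_number are hypothetical names introduced here, and the buffer size assumes a long of at most 64 bits.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: convert v to decimal text in buf (at least 21 bytes)
 * without touching any library state. Returns the number of characters. */
size_t safe_ltoa(long v, char *buf)
{
    char tmp[20];
    size_t n = 0, len = 0;
    unsigned long u = (v < 0) ? 0UL - (unsigned long)v : (unsigned long)v;

    do {                                   /* digits come out in reverse */
        tmp[n++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);

    if (v < 0)
        buf[len++] = '-';
    while (n > 0)
        buf[len++] = tmp[--n];
    buf[len] = '\0';
    return len;
}

/* Length of a C string, computed by hand to stay self-contained. */
static size_t safe_strlen(const char *s)
{
    size_t i = 0;
    while (s[i] != '\0')
        ++i;
    return i;
}

/* Hypothetical helper: write "<prefix><number><suffix>" using write(2) only.
 * Returns 0 on success, -1 if any write fails. */
int safe_put_number(const char *prefix, long v, const char *suffix)
{
    char num[21];
    size_t len = safe_ltoa(v, num);

    if (write(STDOUT_FILENO, prefix, safe_strlen(prefix)) < 0)
        return -1;
    if (write(STDOUT_FILENO, num, len) < 0)
        return -1;
    if (write(STDOUT_FILENO, suffix, safe_strlen(suffix)) < 0)
        return -1;
    return 0;
}
```

A handler could then build a message piece by piece, e.g. safe_put_number("Job pid ", pid, " stopped\n"). The trade-off is that the output is split across several write calls, so it may interleave with output from main.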

Step 2 - Make sure global variables are accessed safely

In this lab the shared global variable is the jobs list. The functions that access it are listed below:

main
 └getjobpid
 
eval
 ├addjob
 └pid2jid
 
builtin_cmd
 └listjobs
 
do_bgfg
 ├getjobjid
 └getjobpid

sigchld_handler    sigint_handler    sigtstp_handler
 ├deletejob         └fgpid            └fgpid
 └fgpid

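There are two workable options:

Move deletejob and fgpid out of the signal handlers
Wrap every function that accesses jobs with sigprocmask…

Given the requirements of this lab, option 1 is actually harder to carry out (mainly because implementing waitfg becomes quite cumbersome), so I go with option 2: wrap every function that accesses jobs with sigprocmask.
On closer inspection, though, not every call needs wrapping. pid2jid in eval and fgpid in the handlers are fine without special treatment: for the former, a conflict would at worst print a wrong number; for the latter, every function in main that accesses jobs is wrapped with sigprocmask, so no conflict can occur.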

To keep the page from getting too cluttered, only the key parts are listed below; places where sigprocmask was merely added are not shown.

/* Global flag variables for signal handlers */
volatile sig_atomic_t sigint_flag = 0;
volatile sig_atomic_t sigstp_flag = 0;
volatile pid_t sigint_pid = 0;
volatile pid_t sigstp_pid = 0;
volatile pid_t sigint_jid = 0;
volatile pid_t sigstp_jid = 0;
volatile int sigint_WIF;
volatile int sigstp_WIF;
volatile sig_atomic_t fgjob_flag = 0;

int main(int argc, char **argv)
{
    ...
    ...
    ...
    while (1) {
        ...
        ...
        /* Evaluate the command line */
        eval(cmdline);

        /* signal handling */
        if(sigint_flag) {
            printf("Job [%d] (%d) terminated by signal %d\n", sigint_jid, sigint_pid, sigint_WIF);
            sigint_flag = 0;
        }
        if(sigstp_flag) {
            printf("Job [%d] (%d) stopped by signal %d\n", sigstp_jid, sigstp_pid, sigstp_WIF);
            struct job_t *job = getjobpid(jobs, sigstp_pid);
            job->state = ST;
            sigstp_flag = 0;
        }
        fflush(stdout);
    } 

    exit(0); /* control never reaches here */
}

void sigchld_handler(int sig) 
{
    pid_t pid;
    int status;
    int olderrno = errno;  /* prevent errno overwrite by signal handler */
    sigset_t mask_all, prev_all;

    sigfillset(&mask_all);
    
    while((pid = waitpid(-1, &status, WNOHANG|WUNTRACED)) > 0) {
        /* process terminated normally */
        if(WIFEXITED(status)) {      
            if(pid == fgpid(jobs)) {
                fgjob_flag = 0;
            }
            /* block all signals while running critical code  */
            sigprocmask(SIG_BLOCK, &mask_all, &prev_all);
            deletejob(jobs, pid);
            sigprocmask(SIG_SETMASK, &prev_all, NULL);
        }
        /* process terminated by signals e.g., ctrl-c */
        if(WIFSIGNALED(status)) {
            if(pid == fgpid(jobs)) {
                fgjob_flag = 0;
            }
            sigint_flag = 1;
            sigint_pid = pid;
            sigint_jid = pid2jid(sigint_pid);
            sigint_WIF =  WTERMSIG(status);
            /* block all signals while running critical code  */
            sigprocmask(SIG_BLOCK, &mask_all, &prev_all);
            deletejob(jobs, pid);
            sigprocmask(SIG_SETMASK, &prev_all, NULL);
        }
        /* process stopped by signals e.g., ctrl-z */
        if(WIFSTOPPED(status)) {
            if(pid == fgpid(jobs)) {
                fgjob_flag = 0;
            }
            sigstp_flag = 1;
            sigstp_pid = pid;
            sigstp_jid = pid2jid(sigstp_pid);
            sigstp_WIF = WSTOPSIG(status);
        }
    }
    if(pid < 0 && errno != ECHILD) {
        sio_error("waitpid error");
    }
    errno = olderrno;
    return;
}


tags: csapp
