
# Process API

## *fork* system call

The *fork* system call is used to create a new process. Let's look at a code example:

```cpp
//p1.c
#include <stdio.h>
#include <unistd.h>

int main(){
    printf("hello world (pid:%d)\n", (int) getpid());
    int rc = fork();
    if(rc < 0){
        // fork failed
        fprintf(stderr, "fork failed\n");
    }
    else if (rc == 0){
        // child: fork() returned 0
        printf("hello, I am child (pid:%d)\n", (int) getpid());
    }
    else{
        // parent: fork() returned the child's pid
        printf("hello, I am parent of %d (pid:%d)\n", rc, (int)getpid());
    }
    return 0;
}
```

It should print something like this:

```shell=
hello world (pid:29146)
hello, I am parent of 29147 (pid:29146)
hello, I am child (pid:29147)
```

The first hello obviously comes from main. However, when the main program executes

> int rc = fork();

it creates another process, and from this point on the OS sees two copies of p1 running. The child is an almost exact copy of the parent, including its program counter (PC), so it does not start again from main. Instead, it resumes as if it had just called *fork()* itself and is about to return from it. Below the *fork()* call we therefore have two nearly identical processes that differ in the return value: in the newly created process *fork()* returns 0, and in the parent it returns the child's pid, as shown in the output.

## *wait* system call

One problem with the example above is that the order of the last two hellos is nondeterministic. Either "hello, I am parent" or "hello, I am child" can be printed first, because the CPU scheduler decides which process runs first. We can make the order deterministic with the *wait()* system call, which blocks the parent until a child process has terminated.

```cpp
//p2.c
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(){
    printf("hello world (pid:%d)\n", (int) getpid());
    int rc = fork();
    if(rc < 0){
        fprintf(stderr, "fork failed\n");
    }
    else if (rc == 0){
        printf("hello, I am child (pid:%d)\n", (int) getpid());
    }
    else{
        wait(NULL); // block until the child terminates
        printf("hello, I am parent of %d (pid:%d)\n", rc, (int)getpid());
    }
    return 0;
}
```

If we run p2.c, the child's line "hello, I am child" is always printed before the parent's line.

## *exec* system call

With the *fork()* system call we create a new process, but it has the same context as the program that created it.
Often, we want to run a different program, and we can use *exec()* to achieve this.

```cpp
//p3.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char *argv[]){
    printf("hello world (pid:%d)\n", (int) getpid());
    int rc = fork();
    if (rc < 0) {
        // fork failed; exit
        fprintf(stderr, "fork failed\n");
        exit(1);
    } else if (rc == 0) {
        // child (new process)
        printf("hello, I am child (pid:%d)\n", (int) getpid());
        char *myargs[3];
        myargs[0] = strdup("wc");   // program: "wc" (word count)
        myargs[1] = strdup("p3.c"); // argument: file to count
        myargs[2] = NULL;           // marks end of array
        execvp(myargs[0], myargs);  // runs word count
        printf("this shouldn't print out");
    } else {
        // parent goes down this path (main)
        int wc = wait(NULL);
        printf("hello, I am parent of %d (wc:%d) (pid:%d)\n", rc, wc, (int) getpid());
    }
    return 0;
}
```

When execvp (a variant of exec) is called, it loads the code and static data of the new program into the current process, overwriting what was there and transforming the process into the program we want to execute. No new process is created. Since the original program has been replaced by another one, a successful call to exec never returns.

## Why?

The combination of *fork()* and *exec()* looks somewhat strange; it seems there should be a more direct way to launch a new program.

We know the shell is itself a program: it shows a prompt and waits for the user to type a command. Keeping *fork()* and *exec()* separate lets the shell do many useful things easily, because it can run its own code in the child after the fork but before the exec.

For example, suppose we want to write the output of an executable into a new file called out.txt. The shell first finds where the executable is, forks a new process, redirects the child's standard output to out.txt, and then execs the new program. Separating *fork()* from *exec()* makes redirection (and other features) much easier to implement.

Ref : [Operating Systems: Three Easy Pieces, Chapter 5](https://raw.githubusercontent.com/Areadrill/HaPOS/master/Operating%20Systems%20-%20Three%20Easy%20Pieces.pdf)
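
The redirection scenario from the "Why?" section can be sketched in code. This is only a minimal illustration of the idea, not how any particular shell is implemented: the helper name `run_redirected` and its return convention are assumptions made for this example, and it uses `open` plus `dup2` as one common way to point standard output at a file.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not from the referenced chapter): runs the program
// argv[0] with its standard output redirected to the file `outfile`.
// Returns the child's exit status, or -1 if something went wrong.
int run_redirected(const char *outfile, char *const argv[]) {
    pid_t rc = fork();
    if (rc < 0) {
        perror("fork");
        return -1;
    }
    if (rc == 0) {
        // Child: we are after fork() but before exec(). This is the window
        // in which the redirection is set up.
        int fd = open(outfile, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0) {
            perror("open");
            _exit(127);
        }
        // Make file descriptor 1 (standard output) refer to the file.
        if (dup2(fd, STDOUT_FILENO) < 0) {
            perror("dup2");
            _exit(127);
        }
        close(fd); // the duplicate on STDOUT_FILENO stays open
        execvp(argv[0], argv);
        // Reached only if execvp failed.
        perror("execvp");
        _exit(127);
    }
    // Parent: wait for this specific child and report how it exited.
    int status;
    if (waitpid(rc, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `run_redirected("out.txt", args)` with `args` set to `{"wc", "p3.c", NULL}` would roughly correspond to typing `wc p3.c > out.txt` at a shell prompt. The program being run needs no knowledge of the redirection, because open file descriptors are kept across exec.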