OSTEP Notes

Virtualization - Processes

The Abstraction: A Process

  • Process (行程) is a running program
  • OS 利用 CPU virtualization (虛擬化) 來讓我們以為有很多 CPU 的錯覺。
  • Time sharing (分時): 用於共享電腦資源,OS 讓一個 process 跑一段時間後再切到另一個 process 運行。(Cost: Performance)
  • Space sharing: 電腦資源會被分配給需要使用的 process。(e.g. disk space is a space-shared resource)
  • Mechanisms (機制): Mechanisms 是實作出我們需要的功能的底層方法 (method) 或協定 (protocol)。(e.g. context switch)
  • Policies (策略): Policies 是 OS 做出某種決策的演算法。(e.g. scheduling (排程) policy)
  • Mechanisms (low level) -> answer to a how question about a system (e.g. OS 如何執行 context switch?)
    Policies (high level) -> answer to a which question about a system (e.g. OS 要執行哪一個行程?)
  • Machine state: 程式在運行時可以寫入或讀取的對象。包含:記憶體、暫存器 (register, e.g. program counter)
  • Program counter (PC 程式計數器, aka instruction pointer, IP): PC 存放著接下來要執行的指令 (instruction)

Process API

  • Create - create a process
  • Destroy - destory a process
  • Wait - wait for another process to stop running
  • Miscellaneous Control - e.g. suspend and resume the process
  • Status - status info about a process

Process Creation: A Little More Detail

  • Q: OS 如何執行一個程式?行程是如何建立的?
  • OS 從硬碟 (disk) 載入程式碼及 static 資料到行程的 address space (the memory that the process can address)。早期,OS 會很積極地 (eagerly, i.e. 在運行程式之前先載入所有資料) 去載入資料,而現代的 OS 則是 lazily (僅先載入一定要用到的資料, lazy loading is related to paging and swapping) 去載入資料。在運行 process 前,OS 要配置記憶體給程式的 (run-time) stack,以及 heap (用於需要動態配置記憶體 (dynamically-allocated) 的資料, i.e. linked lists, hash tables, trees),程式可利用 malloc() 和 free() 來配置及釋放 heap 的記憶體資源。在 UNIX 中,每個行程預設有三種 file descriptor (standard input, output, and error) 讓程式輕鬆地做輸入與輸出。最後 OS 才把 CPU 的控制權限轉交給新建立的行程,然後才開始運行程式。

Process States

Process 可簡單地劃分成處於這三種狀態

  • Running: process 在 CPU 上運行,執行指令
  • Ready: process 可以運行,但 OS 基於某些理由暫時不運行這個 process
  • Blocked: process 含有一些運算要等到其他 event 完成才能執行 (e.g. 讀取/寫入硬碟資料),這段空閒時段可讓其他 process 運作。

image

Data Structures

  • process list: OS 要有一個清單,內含每個 process 以及他們的狀態
  • process 的結構會含有 regsiter context (紀錄 register 的資料),在 process 暫停的時候,register data 會存放在記憶體,等 process 繼續執行的時候在把資料放回 register,這叫做 context switch。
  • process 在其他 OS 可能也會有另外的狀態,如 initial (when the process is being created) 及 final state (when the process has exited but has not yet been cleaned up, aka zombie state in UNIX)

Summary

  • OS 用 Process (行程) 來描述運行中的程式。不論是什麼時候,process 皆可以用 process state 來描述當前的狀態,如 address space 裡的記憶體,CPU register (暫存器) 的內容 (i.e. program counter, stack pointer etc.) 及有關 I/O 的資訊。
  • Process API 有著有關 process 的呼叫 (call) (i.e. creation, destruction, etc.)
  • Process 會存在於一種 process state (i.e. running, ready, blocked),I/O 會讓 process 被 scheduled or descheduled,讓它的 process state 改變。
  • Process list 包含著 OS 裡所有 process 的資訊,每一筆資料被稱為 process content block (PCB),內含有關這個 process 的所有資料。

Homework (Simulation)

1

100%, since the chance of instruction using CPU is both 100%; therefore, no IO will happen.

2

11 time units, "4:100,1:0" the first process will take up 4 time units, and the second process will take up 7 time units (1 for calling I/O (initialization), 5 (default) for doing I/O, and 1 for completion) (These are all used for simplicity, not realistic though)

3

Yes, since we are running the I/O part first and it is blocked during I/O query, in the meantime, the other process can run its instructions; afterward, the CPU control returns to the first process for I/O completion.

4

The time units will increase to 11. Since the other process cannot take over the control of CPU (because the flag is set to SWITCH_ON_END) when the first process is blocked.

5

The time units fall back to 7. Since the other process can do work when the other is blocked (doing I/O).

6

The first process delayed its io_done which causes its next I/O to be runned after all the other process had finished which is not very efficient. No, the CPU is not being effectively utilized.

7

Since I/O done is not delayed, the OS can effectively schedule the processes, i.e. run I/O contiguously and when the process is blocked, it switch to the other process.
It differs from the last scenario since IO done is waitng for the other processes to complete and then executes it which postpones the subsequent IO execution, and in turn increases the overall run time.
The reason is that we dont know if we still get IO to do subsequently; therefore, postponing the io_done will cause subsequent IO actions to be waiting for the first IO action to be completed which is not ideal for such case, the better solution would be IO running consecutively and other process be executed when the IO process is blocked.

8

  1. 15 clock ticks, no changes with flags IO_RUN_IMMEDIATE or IO_RUN_LATER. When run with flag SWITCH_ON_END the clock ticks increase to 18.
  2. 16 clock ticks, no changes with flags IO_RUN_IMMEDIATE or IO_RUN_LATER. When run with flag SWITCH_ON_END the clock ticks increase to 30.
  3. 18 clock ticks. When run with flag IO_RUN_IMMEDIATE, clock ticks decrease to 17. When run with flag SWITCH_ON_END the clock ticks increase to 24.

Interlude: Process API

  • To create a process in UNIX, we can use a pair of system calls: fork() and exec(), a third routine wait() can be used by a process to wait for a process it created to complete.

The fork() System Call

  • The fork() system call is used to create a new child process. (caller is called the parent process)
  • PID - Process Identifier, used to name the process
  • The fork()'d child process has its own address space, registers, PC etc.

The wait() System Call

  • The parent process can call the wait() syscall to wait for child process to complete before continuing executing

Finally, The exec() System Call

  • The exec() system call is useful when you need to run a different program
  • On Linux, there is six variants of exec(): execl(), execlp(), execle(), execv(), execvp(), and execvpe().
  • a successful call to exec() never returns.

Why? Motivating The API

  • The separation of fork() and exec() can be useful. In UNIX shell, this separation allows shell run code after the call to fork() but before the call to exec().

Process Control And Users

  • kill() system call - send signals (pause, die, etc.) to a process
  • control-c sends a SIGINT (interrupt) to the process (normally terminating it)
  • control-z sends a SIGTSTP (stop) signal thus pausing the process in mid-execution (resume the process with fg command)

Useful Tools

  • ps - see which processes are running
  • top - show processes of the system
  • kill - send a kill signal to a process specified with their PID

Summary

  • Each process has a name which is a number known as a process ID (PID).
  • The fork() system call is used in UNIX systems to create a new process. The creator is called the parent; the newly created process is called the child.
  • The wait() system call allows a parent to wait for its child to complete execution.
  • The exec() family of system calls allows a child to break free from its similarity to its parent and execute an entirely new program.
  • A UNIX shell commonly uses fork(), wait(), and exec() to launch user commands; the separation of fork and exec enables features like input/output redirection, pipes, and other cool features, all without changing anything about the programs being run.
  • Process control is available in the form of signals, which can cause jobs to stop, continue, or even terminate.
  • Which processes can be controlled by a particular person is encapsulated in the notion of a user; the operating system allows multiple users onto the system, and ensures users can only control their own processes.
  • A superuser (root) can control all processes.

Homework (Simulation)

1

                           Process Tree:
                               a

Action: a forks b
                               a
                               └── b
Action: b forks c
                               a
                               └── b
                                   └── c
Action: a forks d
                               a
                               ├── b
                               │   └── c
                               └── d
Action: d forks e
                               a
                               ├── b
                               │   └── c
                               └── d
                                   └── e
Action: c forks f
                               a
                               ├── b
                               │   └── c
                               │       └── f
                               └── d
                                   └── e

(each generation might vary)

2

As fork_percentage goes higher, the process tree would more likely be nested, vice versa.

3

                           Process Tree:
                               a

Action: a forks b
                               a
                               └── b
Action: a forks c
                               a
                               ├── b
                               └── c
Action: a forks d
                               a
                               ├── b
                               ├── c
                               └── d
Action: c EXITS
                               a
                               ├── b
                               └── d
Action: b forks e
                               a
                               ├── b
                               │   └── e
                               └── d

(each generation might vary)

4

Without the -R flag, the child process of c (d and e) would become the child process of a. However, with he -R flag, process d and e would now become the child process of b.

Orphan Process

When a parent process exits (either terminates or crashes unexpectedly) and leaves its child processes still running, its child processes are called "Orphan Process".

Or, when a parent process completes execution before the child process finishes, the child process is also called as "Orphan Process".

Inside Linux, the kernel will identify such situations and attach or reassign the "orphan process" the the "init" process, which is a process with PID 1 and is the root of all processes in Linux kernel (it always keep running to ensure system stability). The init process will ensure the orphaned process to complete and release its memory, remove from process list, etc.

source

5

                           Process Tree:
                               a

Action: a forks b
Action: a forks c
Action: b EXITS
Action: a forks d
Action: a forks e

                        Final Process Tree:
                               a
                               ├── c
                               ├── d
                               └── e

(each generation might vary)

6

                           Process Tree:
                               a

Action: a forks b
Action: a forks c
Action: b forks d
Action: b forks e
Action: b forks f

                        Final Process Tree:
                               a
                               ├── b
                               │   ├── d
                               │   ├── e
                               │   └── f
                               └── c

(each generation might vary)

Homework (Code)

1

Write a program that calls fork(). Before calling fork(), have the main process access a variable (e.g., x) and set its value to something (e.g., 100). What value is the variable in the child process? What happens to the variable when both the child and parent change the value of x?

Below is the C program i am working on.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    printf("hello world! (pid:%d)\n", (int) getpid());
    int x = 1;
    int rc = fork();
    if (rc < 0) {
        // fork failed; exit
        fprintf(stderr, "fork failed\n");
        exit(1);
    } else if (rc == 0) {
        // child (new process)
        printf("hello, I am child (pid:%d)\n", (int) getpid());
        x = 10;
        printf("x: %d\n", x);
    } else {
        // parent goes down this path (original process)
        printf("hello, I am parent of %d (pid:%d)\n",
	       rc, (int) getpid());
        x = 5;
        printf("x: %d\n", x);
    }
    return 0;
}

This program outputs:

hello world! (pid:89846)
hello, I am parent of 89847 (pid:89846)
x: 5
hello, I am child (pid:89847)
x: 10

Which is expected. (except for the fact that sometimes but rarely child process will get executed first and reverse the ordering of the messages. this can be implemented by adding the wait() system call at the parent process path)

However, what if I try to comment out x = 10 inside the child process path?

Well, it now gets interesting! I got the following output.

hello world! (pid:90932)
hello, I am parent of 90933 (pid:90932)
x: 5
hello, I am child (pid:90933)
x: 1

Though the parent process ran first, the value of x inside the child process seems to remain with the value of 1.

Let us comment out x = 5 to see what happens.

hello world! (pid:91555)
hello, I am parent of 91556 (pid:91555)
x: 1
hello, I am child (pid:91556)
x: 10

It is somewhat expected since the parent ran first and didn't change the value of x, and child process changed it so x became 10 in child process.

What if we let the parent process wait for the child process to finish?

} else {
    // parent goes down this path (original process)
+   wait(NULL);
    printf("hello, I am parent of %d (pid:%d)\n",
        rc, (int) getpid());
    ...

I ran the program again, here's what I got.

hello world! (pid:92283)
hello, I am child (pid:92285)
x: 10
hello, I am parent of 92285 (pid:92283)
x: 1

Now, it seems like the child process and the parent process do really have separate address space. No matter how i modify the program, they don't share the same x variable. This answers the question, What happens to the variable when both the child and parent change the value of x? they each have their own copy of x, and changes in one do not affect the other.

2

Write a program that opens a file (with the open() system call) and then calls fork()to create a new process. Can both the child and parent access the file descriptor returned by open()? What happens when they are writing to the file concurrently, i.e., at the same time?

Below is a program that opens a file using the open() syscall, calls fork(), writes something to the file, and closes the file. Both child and parent process can access the file descriptor returned by open(). When these two process run concurrently, there is no guarantee which process will run first, but usually the parent process will run first.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>

int main(void) {
    char str1[20] = "Hello from child!\n";
    char str2[20] = "Hello from parent!\n";

    printf("Parent PID: %d\n", (int) getpid());

    int fd = open("./open.txt", O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0) {
        perror("Error opening file");
        exit(EXIT_FAILURE);
    }

    pid_t pid = fork();
    switch (pid) {
    case -1:
        close(fd);
        perror("fork error");
        exit(EXIT_FAILURE);
    case 0:
        write(fd, str1, strlen(str1));
        close(fd);
        printf("Child PID %d\n", (int) getpid());
        puts("Child exiting.");
        exit(EXIT_SUCCESS);
    default:
        write(fd, str2, strlen(str2));
        close(fd);
        puts("Parent exiting.");
        exit(EXIT_SUCCESS);
    }

    return 0;
}

File Descriptor

File descriptor is an unique identifier (integer) in the process that refers to the file description in the kernel. A file descriptor can be used to access files, pipes, or network sockets.

11.1.1 Streams and File Descriptors
When you want to do input or output to a file, you have a choice of two basic mechanisms for representing the connection between your program and the file: file descriptors and streams. File descriptors are represented as objects of type int, while streams are represented as FILE * objects.

File descriptors provide a primitive, low-level interface to input and output operations. Both file descriptors and streams can represent a connection to a device (such as a terminal), or a pipe or socket for communicating with another process, as well as a normal file. But, if you want to do control operations that are specific to a particular kind of device, you must use a file descriptor; there are no facilities to use streams in this way. You must also use file descriptors if your program needs to do input or output in special modes, such as nonblocking (or polled) input (see File Status Flags).

Streams provide a higher-level interface, layered on top of the primitive file descriptor facilities. The stream interface treats all kinds of files pretty much alike—the sole exception being the three styles of buffering that you can choose (see Stream Buffering).

The main advantage of using the stream interface is that the set of functions for performing actual input and output operations (as opposed to control operations) on streams is much richer and more powerful than the corresponding facilities for file descriptors. The file descriptor interface provides only simple functions for transferring blocks of characters, but the stream interface also provides powerful formatted input and output functions (printf and scanf) as well as functions for character- and line-oriented input and output.

Since streams are implemented in terms of file descriptors, you can extract the file descriptor from a stream and perform low-level operations directly on the file descriptor. You can also initially open a connection as a file descriptor and then make a stream associated with that file descriptor.

In general, you should stick with using streams rather than file descriptors, unless there is some specific operation you want to do that can only be done on a file descriptor. If you are a beginning programmer and aren’t sure what functions to use, we suggest that you concentrate on the formatted input functions (see Formatted Input) and formatted output functions (see Formatted Output).

If you are concerned about portability of your programs to systems other than GNU, you should also be aware that file descriptors are not as portable as streams. You can expect any system running ISO C to support streams, but non-GNU systems may not support file descriptors at all, or may only implement a subset of the GNU functions that operate on file descriptors. Most of the file descriptor functions in the GNU C Library are included in the POSIX.1 standard, however.

I/O System call

There are 5 I/O system calls: creat(), open(), close(), read(), and write().

geeksforgeeks

3

Write another program using fork(). The child process should print “hello”; the parent process should print “goodbye”. You should try to ensure that the child process always prints first; can you do this without calling wait() in the parent?

My first thought was to add the sleep() function, but this is quite unusable since you cannot predict how long would it will take for another process to complete.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    int rc = fork();
    if (rc < 0) {
        perror("fork failed");
        exit(EXIT_FAILURE);
    } else if (rc == 0) {
        // child
        printf("hello\n");
    } else {
        // parent
        sleep(1);
        printf("goodbye\n");
    }

    return 0;
}

When reading the manuals for wait() system call, I came across a system call, waitpid(), which allows you to specify which child process to wait for. Additionally, there is another system call waitid() that can store signal information.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    pid_t rc = fork();
    if (rc < 0) {
        perror("fork failed");
        exit(EXIT_FAILURE);
    } else if (rc == 0) {
        // child
        printf("hello\n");
    } else {
        // parent
        
        waitpid(rc, NULL, 0);
        // or
        waitid(P_PID, rc, NULL, WEXITED); // NULL is the siginfo
        
        printf("goodbye\n");
    }

    return 0;
}

Alternatively, we can use a shared flag between processes by leveraging mmap(). By adding the MAP_SHARED flag, the memory region is visible to both the parent and child processes, allowing them to share the same variable.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void) {
    int* flag = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    *flag = 0;

    pid_t rc = fork();
    if (rc < 0) {
        perror("fork failed");
        exit(EXIT_FAILURE);
    } else if (rc == 0) {
        printf("hello\n");
        *flag = 1;
    } else {
        while (*flag != 1)
            continue;
        printf("goodbye\n");
    }

    return 0;
}

4

Write a program that calls fork() and then calls some form of exec() to run the program /bin/ls. See if you can try all of the variants of exec(), including (on Linux) execl(), execle(), execlp(), execv(), execvp(), and execvpe(). Why do you think there are so many variants of the same basic call?

From the manual, the family of exec() is build on top of the execve() system call.

int execve(const char *pathname, char *const _Nullable argv[],
           char *const _Nullable envp[]);

execve() executes the program referred to by pathname. This causes the program that is currently being run by the calling process to be replaced with a new program, with newly initialized stack, heap, and (initialized and uninitialized) data segments.

execve() does not return on success, and the text, initialized data, uninitialized data (bss), and stack of the calling process are overwritten according to the contents of the newly loaded program.

If the current program is being ptraced, a SIGTRAP signal is sent to it after a successful execve().

The exec() family

int execl(const char *pathname, const char *arg, ...
                /*, (char *) NULL */);
int execlp(const char *file, const char *arg, ...
                /*, (char *) NULL */);
int execle(const char *pathname, const char *arg, ...
                /*, (char *) NULL, char *const envp[] */);
int execv(const char *pathname, char *const argv[]);
int execvp(const char *file, char *const argv[]);
int execvpe(const char *file, char *const argv[], char *const envp[]);

From wikipeida:

The base of each is exec (execute), followed by one or more letters:

e – Environment variables are passed as an array of pointers to null-terminated strings of form name=value. The final element of the array must be a null pointer.

l – Command-line arguments are passed as individual pointers to null-terminated strings. The last argument must be a null pointer.

p – Uses the PATH environment variable to find the file named in the file argument to be executed.

v – Command-line arguments are passed as an array (vector) of pointers to null-terminated strings. The final element of the array must be a null pointer.

execl()

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int p1 = fork();
    if (p1 < 0) {
        fprintf(stderr, "fork failed\n");
        exit(EXIT_FAILURE);
    } else if (p1 == 0) {
        execl("/bin/ls", "", "-ls", NULL);
    } else {
        wait(NULL);
    }
    return EXIT_SUCCESS;
}

For some reasons, i have to add an empty string (or sth else like "idk") at arg[0].

execlp()

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int p1 = fork();
    if (p1 < 0) {
        fprintf(stderr, "fork failed\n");
        exit(EXIT_FAILURE);
    } else if (p1 == 0) {
        execlp("/bin/ls", "", "-la", (char*) NULL);
    } else {
        wait(NULL);
    }
    return EXIT_SUCCESS;
}

execle()

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int p1 = fork();
    if (p1 < 0) {
        fprintf(stderr, "fork failed\n");
        exit(EXIT_FAILURE);
    } else if (p1 == 0) {
        execle("/bin/ls", "", "-la", (char*) NULL);
    } else {
        wait(NULL);
    }
    return EXIT_SUCCESS;
}

execv()

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int p1 = fork();
    if (p1 < 0) {
        fprintf(stderr, "fork failed\n");
        exit(EXIT_FAILURE);
    } else if (p1 == 0) {
        char* someargs[3] = {"/bin/ls", "-la", NULL};
        execv(someargs[0], someargs);
    } else {
        wait(NULL);
    }
    return EXIT_SUCCESS;
}

execvp()

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int p1 = fork();
    if (p1 < 0) {
        fprintf(stderr, "fork failed\n");
        exit(EXIT_FAILURE);
    } else if (p1 == 0) {
        char* someargs[3] = {"/bin/ls", "-la", NULL};
        execvp(someargs[0], someargs);
    } else {
        wait(NULL);
    }
    return EXIT_SUCCESS;
}

execvpe()

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int p1 = fork();
    if (p1 < 0) {
        fprintf(stderr, "fork failed\n");
        exit(EXIT_FAILURE);
    } else if (p1 == 0) {
        char* someargs[3] = {"/usr/bin/ls", "-la", NULL};
        execvpe(someargs[0], someargs, NULL);
    } else {
        wait(NULL);
    }
    return EXIT_SUCCESS;
}

  • Note that execvpe() is a GNU extension, i ran this program with lima on macOS, macOS doesn't suport this natively.

The reason why we need six variants of exec() is that we have different use cases. The exec() family has l, e, v, and p suffixes. l stands for list, e stands for Environment, v stands for Vector, and p stands for Path. Sometimes you might need to parse the user input as variable arguments or a vector and you will use execl() or execv(), and sometimes you might need to change the environment variables to make it differ from the parent process and you might use execve(), and lastly, on different systems, the program may not be in the same location across systems, so execvp() helps search in PATH.

5

Now write a program that uses wait() to wait for the child process to finish in the parent. What does wait() return? What happens if you use wait() in the child?

The following program does what the question stated. When i run the program, it executes with no errors. The wait() system call returns the PID of the terminated child process.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    pid_t p_pid = getpid();
    printf("Parent PID: %d\n", p_pid);
    int fork_rc = fork();
    if (fork_rc < 0) {
        fprintf(stderr, "fork failed.");
        exit(EXIT_FAILURE);
    } else if (fork_rc == 0) {
        printf("Child PID: %d\n", (int) getpid());
        wait(NULL);
    } else {
        int wait_rc = wait(NULL);
        printf("wait_rc: %d\n", wait_rc);
    }
    return EXIT_SUCCESS;
}

I then used waitpid() to let the parent and child process wait for each other. Surprisingly, it doesn't end up to be a infinite loop. It executes in a flash. I wonder how this make sense.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    pid_t p_pid = getpid();
    printf("Parent PID: %d\n", p_pid);
    int fork_rc = fork();
    if (fork_rc < 0) {
        fprintf(stderr, "fork failed.");
        exit(EXIT_FAILURE);
    } else if (fork_rc == 0) {
        printf("Child PID: %d\n", (int) getpid());
        waitpid(p_pid, NULL, 0);
    } else {
        int wait_rc = waitpid(fork_rc, NULL, 0);
        printf("wait_rc: %d\n", wait_rc);
    }
    return EXIT_SUCCESS;
}

6

Write a slight modification of the previous program, this time using waitpid() instead of wait(). When would waitpid() be useful?

I did it before this is mentioned before this is on the next page (hurray). waitpid() gets useful when you need to wait for a specific process, not just the child process.

7

Write a program that creates a child process, and then in the child closes standard output (STDOUT FILENO). What happens if the child calls printf() to print some output after closing the descriptor?

When the child process calls printf() after closing the descriptor, it doesn't output the child path: Sup. message.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

int main() {
    printf("program started!\n");
    int rc = fork();
    if (rc < 0) {
        fprintf(stderr, "fork failed.");
        exit(EXIT_FAILURE);
    } else if (rc == 0) {
        close(STDOUT_FILENO);
        printf("child path: Sup.\n");
    } else {
        printf("parent path: Hello.\n");
        wait(NULL);
    }
    return EXIT_SUCCESS;
}

8

Write a program that creates two children, and connects the standard output of one to the standard input of the other, using the pipe() system call.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

int main(void) {
    int pipefd[2];
    char* buf;
    pid_t cpid1;
    pid_t cpid2;

    if (pipe(pipefd) == -1) {
        perror("pipe");
        exit(EXIT_FAILURE);
    }

    cpid1 = fork();
    if (cpid1 == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    } else if (cpid1 == 0) {
        close(pipefd[0]);
        char* str = "Hello my friend!\n";
        write(pipefd[1], str, strlen(str));
        close(pipefd[1]);
        exit(EXIT_SUCCESS);
    }

    cpid2 = fork();
    if (cpid2 == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    } else if (cpid2 == 0) {
        close(pipefd[1]);
        while (read(pipefd[0], &buf, 1) > 0)
            write(STDOUT_FILENO, &buf, 1);
        close(pipefd[0]);
        exit(EXIT_SUCCESS);
    } else {
        wait(NULL);
    }

    return EXIT_SUCCESS;
}

Mechanism: Limited Direct Execution

  • By time sharing the CPU, we can achieve CPU virtualization and run many jobs (processes) seemingly at the same time.
  • When virtualizing the CPU we want performance which adds little overhead to the system, and secondly, we want control over the system, we do not want a process or program to take over the system and run forever.

Basic Technique: Limited Direct Execution

  • "Direct Execution" -> Runs the instructions directly on the CPU
  • Direct Execution Protocol
    image

Problem #1: Restricted Operations

  • user mode -> restricted.
    If programs in user mode issue I/O operations -> processor raises an exception -> OS kill the program
  • kernel mode (the mode the kernel or OS runs in) -> can do privileged operations. (e.g. I/O)
  • user program performs system calls to do privileged operations.
  • To execute system calls, user program executes trap instruction to jumps into the kernel and raise the privilege level to kernel mode simultaneously. Now the system can executed the privileged operations (if allowed).
  • When finished, the OS calls return-from-trap instruction to return back to the user program, change privilege level to user mode, and continue execution.
  • On x86 platforms, the details of user-mode program such as program counter, flags etc. will be saved into the per-process kernel stack.
  • At boot time, the kernel sets up a trap table that maps interrupts (like hardware events or exceptions) and trap instructions (like syscalls) to the addresses of trap handler functions.
  • These trap handlers are small routines inside the OS that contain instructions to handle specific events or forward control to appropriate kernel functions (e.g. the syscall dispatcher).
  • User code must specify the system call via system-call number in the register or specified location on the stack. The OS will first ensure it is valid and the executes it. This form of indirection provide protection to the system, and the user cannot specify the exact address of the syscall to jump to.
  • Limited Direct Execution (LDE) Protocol (bold means the instruction is privileged)
    image

Problem #2: Switching Between Processes

How can the operating system regain control of the CPU so that it can
switch between processes?

A Cooperative Approach: Wait For System Calls

  • It is quite often for a program to transfer control of CPU to the OS by calling syscalls. So, the OS can switch to another program to run.
  • yield system call - transfer control to the OS so it can run other processes
  • Programs transfer control to the OS when doing illegal stuff such as dividing by zero, tries to access protected memory. These bad action will generate a trap to the OS and transfer control to the OS.
  • When a process is stuck or running infinitely, we can only reboot the system.

A Non-Cooperative Approach: The OS Takes Control

  • timer interrupt - a timer can raise interrupt every few milliseconds which halt the running process, and a pre-configured interrupt handler in the OS runs
  • At boot time, the OS will inform the hardware which code to run when timer interrupt happens and start a timer.

Saving and Restoring Context

  • CPU scheduler will decide to continue running the program or switch to another program.
  • When the scheduler decided to switch the program, it is called context switch. It will call the switch() routine to save the context of current process (general purpose registers, PC, and the kernel stack pointer, etc.), and then switches contexts by changing the stack pointer to use the soon-to-be-run process's kernel stack. And the OS returns-from-trap, restores the new process register and start running the program.
  • user registers are implicitly saved by the hardware and into the kernel stack of that process.
  • kernel registers are explicitly saved by the OS but into the process's process structure.
  • Limited Direct Execution Protocol (Timer Interrupt)
    image

Summary

ASIDE: KEY CPU VIRTUALIZATION TERMS (MECHANISMS)

  • The CPU should support at least two modes of execution: a restricted user mode and a privileged (non-restricted) kernel mode.
  • Typical user applications run in user mode, and use a system call to trap into the kernel to request operating system services.
  • The trap instruction saves register state carefully, changes the hardware status to kernel mode, and jumps into the OS to a pre-specified destination: the trap table.
  • When the OS finishes servicing a system call, it returns to the user program via another special return-from-trap instruction, which reduces privilege and returns control to the instruction after the trap that jumped into the OS.
  • The trap tables must be set up by the OS at boot time, and make sure that they cannot be readily modified by user programs. All of this is part of the limited direct execution protocol which runs programs efficiently but without loss of OS control.
  • Once a program is running, the OS must use hardware mechanisms to ensure the user program does not run forever, namely the timer interrupt. This approach is a non-cooperative approach to CPU scheduling.
  • Sometimes the OS, during a timer interrupt or system call, might wish to switch from running the current process to a different one, a low-level technique known as a context switch.

Homework (Measurement)

  • gettimeofday()
  • rdtsc instruction (x86)
  • mach_absolute_time (macOS)
  • sched_setaffinity() (Linux) can set processes be on the same CPU.

Measure the costs of a system call


Measure the costs of a context switch


Scheduling: Introduction

  • mechanism - context switch
  • policy - CPU schduling
  • scheduling policies are also called discplines

Workload Assumptions

We have the following (unrealistic) assumptions about each running process (sometimes called job):

  1. Each job runs for the same amount of time.
  2. All jobs arrive at the same time.
  3. Once started, each job runs to completion.
  4. All jobs only use the CPU (i.e., they perform no I/O)
  5. The run-time of each job is known.
    (Each assumption will be relaxed as we progress)

Scheduling Metrics

  • turnaround time:
    Tturnaround=TcompletionTarrival

    Since A.2,
    Tarrival
    is always zero.
    Hence,
    Tturnaround=Tcompletion
  • Turnaround time is a performance metric
  • Fairness metric
  • If CPU scheduler prevents some jobs from running to improve performance (lower turnaround time), then the fairness decreases.

First In, First Out (FIFO)

  • First In, First Out (FIFO) scheduling or sometimes First Come, First Served (FCFS)
  • If job A, B, and C arrived at roughly the same time (
    Tarrival=0
    ) with the sequence A->B->C, and each job will take 10 seconds to finish, what is the average turnaround time?
  • 10+20+303=20
  • image

Relax Assumption 1 (Each job runs for the same amount of time.)

  • image