# System Programming - 2
## Files and Directories
* Properties
* ownership
* access permission
* time attributes, type, management, etc
---
### stat(), fstat(), lstat()

```c
#include <sys/types.h>
#include <sys/stat.h>
int stat(const char *pathname, struct stat *buf);
int fstat(int fd, struct stat *buf);
int lstat(const char *pathname, struct stat *buf);
struct stat {
mode_t st_mode; // file type & mode (permissions)
ino_t st_ino; // i-node number (serial number)
dev_t st_dev; // device number (file system)
dev_t st_rdev; // device number for special files
nlink_t st_nlink; // number of links
uid_t st_uid; // user ID of owner
gid_t st_gid; // group ID of owner
off_t st_size; // size in bytes, for regular files
time_t st_atime; // time of last access
time_t st_mtime; // time of last modification
time_t st_ctime; // time of last file status change
blksize_t st_blksize; // best I/O block size
blkcnt_t st_blocks; // number of disk blocks allocated
};
```
* stat(), lstat()
* `lstat()` returns information about the symbolic link itself
* `stat()` follows the symbolic link and returns information about the file it refers to
---
### Mask, File Types, Access Permissions
1. **st_mode mask**
```
S_IFMT 0170000 bit mask: file type
S_IFSOCK 0140000 socket
S_IFLNK 0120000 symbolic link
S_IFREG 0100000 regular file
S_IFBLK 0060000 block device
S_IFDIR 0040000 directory
S_IFCHR 0020000 character device
S_IFIFO 0010000 FIFO
S_ISUID 04000 set-user-ID on execution
S_ISGID 02000 set-group-ID on execution
S_ISVTX 01000 sticky bit
S_IRWXU 00700 owner: r, w, x
S_IRUSR 00400 owner: r
S_IWUSR 00200 owner: w
S_IXUSR 00100 owner: x
S_IRWXG 00070 group: r, w, x
S_IRGRP 00040 group: r
S_IWGRP 00020 group: w
S_IXGRP 00010 group: x
S_IRWXO 00007 other: r, w, x
S_IROTH 00004 other: r
S_IWOTH 00002 other: w
S_IXOTH 00001 other: x
```
2. **File Types**
* Types
* Regular Files : text, binary, etc
* Directory Files : { (filename, pointer) }, only kernel can update the info
* Character Special Files : tty, audio
* Block Special Files : disk
* FIFO : named pipes
* Sockets : the type of file used for network communication between processes
* Symbolic Links : the type of file that points to another file
* Macros in `<sys/stat.h>`
* The argument to macros : st_mode
```
S_ISREG() // regular file
S_ISDIR() // directory file
S_ISCHR() // character special file
S_ISBLK() // block special file
S_ISFIFO() // pipe or FIFO
S_ISLNK() // symbolic link
S_ISSOCK() // socket
example:
#define S_IFMT 0xF000 // type of file
#define S_IFDIR 0x4000 // directory
#define S_ISDIR(mode) ((mode & S_IFMT) == S_IFDIR)
```
3. **Access Permissions**
* Operations
* Directory
* X: access the i-node of files in a dir
* e.g., open a file ("/usr/include/stdio.h")
* need `x` on "/", "/usr", and "/usr/include"
* R: read a dir and list all filenames in a dir
* W: create or delete a file
* create a file : need `w+x` (dir)
* delete a file : need `w+x` (dir), file's permission doesn't matter
* e.g., everyone has `w+x` for /tmp, but a user cannot delete other users' files in /tmp (sticky bit, [here](#Sticky-bit-S_ISVTX-01000))
* File
* X : execute a file, `exec()`
* R : `O_RDONLY`, `O_RDWR` for `open()`
* W : `O_WRONLY`, `O_RDWR`, `O_TRUNC` for `open()`
### UID/GID

* A process could have more than one ID.
* ID type
* Real UID/GID
* from /etc/passwd
* Effective UID/GID, Supplementary GID’s
* determine file access permissions for processes
* whether effective UID/GID == st_uid/st_gid (at least one applies)
* when a program/file is executed(no set-uid/set-gid)
* the process's effective UID/GID = real UID/GID
* each time a process executes, creates, opens, or deletes a file, the kernel performs the file access test. The checks run in order from 1 to 4, and the first one that matches decides the outcome; later checks are skipped.
1. effective UID == 0 (superuser): access is granted
2. effective UID == st_uid (UID of the file): the owner permission bits decide
3. effective GID or any supplementary group ID == st_gid (GID of the file): the group permission bits decide
4. otherwise: the other permission bits decide
* Saved Set-User/Group-ID
* copies of the effective UID/GID
* saved by exec()
* Set-User-ID (setuid), Set-Group-ID (setgid), sticky bit
* setuid, setgid: if the bit is set to on, when this file is executed
* the process's effective UID/GID = the owner of the file(st_uid/st_gid)
* S_ISUID/S_ISGID is set
* S_ISUID (04000) on st_mode
* Symbolic : `--s --- ---`
* S_ISGID (02000) on st_mode
* Symbolic : `--- --s ---`
* e.g., setuid/setgid program allows normal users to have root permission to update /etc/passwd
* Sticky bit (01000) on st_mode
* Symbolic : `--- --- --t`
* will be mentioned [here](#Sticky-bit-S_ISVTX-01000)
* Example 1 :
```bash
$ ls -alt /usr/bin/passwd
-rwsr-xr-x 1 root root 25692 May 24 ...
```
* Example 2 :

* if permission of File A1 = 0755
* All of the IDs are B; the process cannot read File A2.
* If File A2’s permission is 0644, the process can read it.
* if permission of File A1 = 4755
* Real User/Group ID is B.
* Effective User ID is A.
* The process can read File A2
* Create a new file
* UID of a file (st_uid) = the effective UID of the process
* GID of a file (st_gid): POSIX.1 allows either of these two
1. the effective GID of the process
2. the GID of the directory in which the file is being created
* Linux uses 1 by default and 2 when the directory's set-group-ID bit is set; FreeBSD and Mac OS X always use 2
* If 2 is used, all files and dirs created in a dir have the same GID as that dir
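
The four-step file access test described earlier in this section can be sketched as a pure function. This is a simplified model, not kernel code: every name below is hypothetical, the superuser step ignores special cases such as execute permission, and `want` is a 3-bit request (4 = read, 2 = write, 1 = execute). The point it illustrates is that the first matching step decides, with no fall-through to later steps.

```c
#include <stddef.h>
#include <sys/types.h>

/* Sketch of the file access test (simplified; all names hypothetical).
 * Returns 1 if access would be granted, 0 otherwise. */
int access_test(uid_t euid, gid_t egid, const gid_t *groups, size_t ngroups,
                uid_t file_uid, gid_t file_gid, mode_t file_mode, unsigned want)
{
    size_t i;

    if (euid == 0)                        /* 1. superuser (simplified) */
        return 1;
    if (euid == file_uid)                 /* 2. owner bits decide */
        return ((file_mode >> 6) & want) == want;
    if (egid == file_gid)                 /* 3. group bits decide */
        return ((file_mode >> 3) & want) == want;
    for (i = 0; i < ngroups; i++)         /* 3. supplementary groups */
        if (groups[i] == file_gid)
            return ((file_mode >> 3) & want) == want;
    return (file_mode & want) == want;    /* 4. other bits decide */
}
```

For example, an owner whose file has mode 0044 is denied read access by this model even though group and other may read, because step 2 matches first.
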
---
### access(), umask()
```c
#include <unistd.h>
// 0: OK, -1: on error
int access(const char *pathname, int mode);
```
* mode :
* R_OK : test for read permission
* W_OK : test for write permission
* X_OK : test for execute permission
* F_OK : test for existence of file
* check if the real UID/GID has access to a file
```c
#include <sys/stat.h>
mode_t umask(mode_t cmask);
```
* return : the previous value of the mask
* Bits that are **on** in the mask are turned **off** in the mode (st_mode) of newly created files
* cmask = bitwise-or S_IRUSR, ...
* file mode creation mask: per process
* child's inherited from the parent ([here](#fork-vs-exec))
* changing the mask does not affect the masks of other processes
* A built-in command in a shell
* Example :
```bash
// example 1
$ umask 022
requested mode: 0777
result:         0755
// a.out source code (RWRWRW = S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH)
umask(0);
creat("foo", RWRWRW); // creates "foo" as -rw-rw-rw-
umask(S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH);
creat("bar", RWRWRW); // creates "bar" as -rw-------
------------
$ umask
002
$ ./a.out
$ ls -l foo bar
-rw-rw-rw- 1 sar 0 Dec 7 21:20 foo
-rw------- 1 sar 0 Dec 7 21:20 bar
$ umask
002 // the umask() calls in a.out did not change the mask of its parent (the shell)
```
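
The same effect can be checked from inside a program. The sketch below is a hypothetical helper (`mode_after_umask` is not a library function) that creates a file under a temporary mask and reports the permission bits the file actually received, which should equal `mode & ~cmask` on an ordinary file system without default ACLs.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create `path` with requested `mode` under the
 * temporary mask `cmask`, then report the permission bits it ended up with.
 * Returns -1 on error. The caller's original mask is restored. */
int mode_after_umask(const char *path, mode_t cmask, mode_t mode)
{
    struct stat sb;
    mode_t old = umask(cmask);           /* umask() returns the previous mask */
    int fd;

    unlink(path);                        /* ignore failure: file may not exist */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
    umask(old);                          /* restore before any early return */
    if (fd == -1)
        return -1;
    if (fstat(fd, &sb) == -1) {
        close(fd);
        return -1;
    }
    close(fd);
    return (int)(sb.st_mode & 0777);
}
```
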
---
### chmod(), fchmod()
```c
#include <sys/types.h>
#include <sys/stat.h>
// 0: OK, -1: on error
int chmod(const char *pathname, mode_t mode);
int fchmod(int fd, mode_t mode);
```
* mode : bitwise-or S_IRUSR, ..., S_ISVTX (sticky bit), S_IS[UG]ID, etc
* chmod() updates on i-nodes
* Callers must be a superuser or effective UID = file UID
* chmod() automatically clears two permission bits under the following conditions
* Setting the sticky bit (S_ISVTX) on a regular file without superuser privileges
* the sticky bit in the mode is automatically turned off
* GID of a newly created file != the effective GID (or any supplementary GID) of the calling process, and the process is not a superuser
* the set-group-ID bit is cleared
* a non-superuser process writes to a set-uid/gid file
* the set-user/group-ID bits are cleared
---
### Sticky bit (S_ISVTX, 01000)
* executable file with S_ISVTX :
* Used to save a copy of a S_ISVTX executable in the swap area to speed up the execution next time, when the process terminates
* Not needed for a system with a virtual memory system and fast file system
* no effect in modern system
* directory with S_ISVTX :
* To remove or rename a file in the directory, a user needs both of the following
* `w` permission for the directory
* and one of the following
* is the superuser (root)
* owns the file
* owns the directory
* Symbolic : `--- --- --t`
* Example :
```bash
$ ls -ld /tmp
drwxrwxrwt 4 root sys 485 Nov 10 06:01 /tmp
```
---
### chown(), fchown(), lchown()
```c
#include <sys/types.h>
#include <unistd.h>
// 0: OK, -1: on error
int chown(const char *pathname, uid_t owner, gid_t group);
int fchown(int filedes, uid_t owner, gid_t group);
int lchown(const char *pathname, uid_t owner, gid_t group);
```
* change a file's UID/GID
* pathname is symbolic link
* lchown(): change symbolic link file itself
* chown(): dereference the symbolic link
* owner or group = -1: that ID is not changed.
* ability to change
* UID: with the CAP_CHOWN capability, e.g., superuser(root)
* GID:
* with the CAP_CHOWN capability
* owner of the file may change the group to any group which it is a member
* set-user/group-ID bits would be cleared if chown is called by nonsuper users.
---
### sysconf(), pathconf(), fpathconf() --- Run-Time Limits
```c
#include <unistd.h>
long sysconf(int name);
long pathconf(const char *pathname, int name);
long fpathconf(int filedes, int name);
```
* sysconf()
* name : _SC_CHILD_MAX, _SC_OPEN_MAX, etc.
* pathconf(), fpathconf()
* name : _PC_LINK_MAX, _PC_PATH_MAX, _PC_PIPE_BUF, _PC_NAME_MAX, _PC_SYMLINK_MAX, etc.
* Return -1 and set errno if any error occurs; -1 with errno unchanged means the limit is indeterminate
* EINVAL if the name is incorrect.
---
### File Size
* File Sizes --- st_size, only meaningful for the following
* Regular files – 0~max (off_t)
* Directory files – multiples of 16 or 512
* Symbolic links – pathname length
* File Holes
Disk space is allocated to a file in multiples of blocks (the last block is only partly used if `st_size` is not a multiple of the block size)
* st_blksize: preferred block size for efficient filesystem I/O
* st_blocks: number of blocks allocated to the file (counted in 512-byte units on Linux)
* st_size: file size in bytes
If a file has a hole (e.g., created by `lseek()` past the end of the file followed by a write), `st_size` can exceed the space actually allocated; reading from the hole returns null bytes
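
A hole can be produced with a few calls. The following sketch uses a hypothetical helper, `make_sparse_file`, that seeks past the end of a new file and writes one byte, then returns the file's attributes so `st_size` and `st_blocks` can be compared.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create `path` with a hole of `hole` bytes followed by
 * one data byte, and fill `sb` with the file's attributes.
 * Returns 0 if OK, -1 on error. */
int make_sparse_file(const char *path, off_t hole, struct stat *sb)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    int ok;

    if (fd == -1)
        return -1;
    ok = lseek(fd, hole, SEEK_SET) != (off_t)-1   /* seek past end of file */
      && write(fd, "x", 1) == 1                    /* data after the hole */
      && fstat(fd, sb) == 0;
    close(fd);
    return ok ? 0 : -1;
}
```

After a call with a 1 MiB hole, `sb.st_size` is 1 MiB + 1. On file systems that support holes, `sb.st_blocks * 512` is typically far smaller than that, but the exact block count depends on the file system.
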
---
### truncate(), ftruncate()
```c
#include <sys/types.h>
#include <unistd.h>
int truncate(const char *pathname, off_t length);
int ftruncate(int fd, off_t length);
```
* file must be writable
* size not equal:
* file size > length: truncate
* file size < length: file size increase, create hole
---
### File System



### i-node and data blocks




* i-node (fixed size) :
* i-node size:
* Version 7: 64B, 4.3+BSD:128B, S5:64B, UFS:128B
* File type, access permission, file size, data blocks, link count, etc.
* Predefined number of files that can be created
* A file system can still have free data blocks and yet be unable to create files because the i-node table is full.
* `ls -i filename` (show i-node number)
* Link count --- hard link
* How many pointers from files in directories to a specific i-node
* the file is deleted only when the link count reaches 0
* contained in st_nlink in stat
* LINK_MAX: maximum value for link count
* Moving files among directories
* Move within the partition :
* only directory block is updated (add new entry and unlink old entry)
* link count remains the same
* Move between the partitions :
* the files are moved physically (the data blocks are copied)
### Hard Link vs. Soft Link


* Hard Link
* Filesize = data size
* Is a different name for the same set of data blocks
* limitation
* require both pathnames and link to be on the same file system
* only superuser can link/unlink to a directory
* Soft Link (Symbolic Link)
* Filesize = pathname length
* Can be a directory or file
* Is a pointer to a set of data blocks
* File type is S_IFLNK
* relative path:
* the path name is relative to the directory containing the symbolic link

* Problem 1 : dangling pointers
* Example : if `/usr/joe/foo` was deleted, `/usr/sue/bar` becomes a dangling pointer
* Problem 2 : infinite loop

* functions follow symbolic link or not:

---
### link(), unlink(), rename(), remove()
```c
int link(const char *existingpath, const char *newpath);
```
* return
* 0: OK, -1: on error (e.g., if newpath already exists)
* need permission `w+x` for the directory
* create a new hard link
* atomic operation : creation of the new dir entry and increment of the link count
```c
int unlink(const char *pathname);
```
* need permission `w+x` for the directory
* if the sticky bit is set on the containing dir:
* need the same permissions as for deleting files in that dir (see sticky bit)
* pathname is a symbolic link:
* only unlink the symbolic link itself
* Checking if any process has the file open
* If open, the directory entry is removed immediately, but the file's contents are not deleted until every reference to it has been closed
```c
int remove(const char *pathname);
```
* pathname is a file : unlink
* pathname is a dir : rmdir (ANSI C)
```c
int rename(const char *oldname, const char *newname);
```
* need `w+x` for the directories containing oldname and newname
* condition:
* oldname: file
* newname exists and not a dir: file(newname) removed
* newname exists and is a dir: error
* oldname: dir
* newname exists and is an empty dir: dir(newname) removed
* newname exists and is a file, or oldname is a prefix of newname: error
* e.g., `"/usr/foo"` is a prefix of `"/usr/foo/testdir"`
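
The relationship between `link()`, `unlink()`, and `st_nlink` can be observed directly. The sketch below uses two hypothetical helpers, `link_count` and `link_demo`, and assumes both paths are on the same file system.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return st_nlink for `path`, or -1 on error. */
long link_count(const char *path)
{
    struct stat sb;
    return stat(path, &sb) == 0 ? (long)sb.st_nlink : -1;
}

/* Hypothetical demo: create `a`, hard-link it as `b`, then remove `a`.
 * Stores the link count seen after each step in counts[0..2].
 * Returns 0 if OK, -1 on error. */
int link_demo(const char *a, const char *b, long counts[3])
{
    int fd;

    unlink(a);                           /* start from a clean state */
    unlink(b);
    if ((fd = open(a, O_WRONLY | O_CREAT | O_EXCL, 0600)) == -1)
        return -1;
    close(fd);
    counts[0] = link_count(a);           /* one name */
    if (link(a, b) == -1)
        return -1;
    counts[1] = link_count(a);           /* two names, same i-node */
    if (unlink(a) == -1)
        return -1;
    counts[2] = link_count(b);           /* data still reachable through b */
    return 0;
}
```
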
---
### symlink(), readlink()
```c
int symlink(const char *actualpath, const char *sympath);
```
* actualpath does not need to exist
* actualpath and sympath can be in different file systems
```c
ssize_t readlink(const char *pathname, char *buf, size_t bufsize);
```
* read the contents of the symbolic link into buf
* readlink combines the actions of open, read, and close; the contents placed in buf are not null-terminated
* `readlinkat()`, `symlinkat()`
* analogous to `open()` v.s. `openat()`
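
Because `readlink()` does not null-terminate its result, callers usually add the terminator themselves. A minimal sketch, assuming a hypothetical wrapper named `read_link_target`:

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read the target of symbolic link `path` into `buf`
 * and null-terminate it. Returns the target length, or -1 on error or if
 * `buf` may be too small to hold the whole target. */
ssize_t read_link_target(const char *path, char *buf, size_t size)
{
    ssize_t n;

    if (size == 0)
        return -1;
    n = readlink(path, buf, size - 1);   /* leave room for the terminator */
    if (n == -1 || (size_t)n == size - 1)
        return -1;                       /* error, or possibly truncated */
    buf[n] = '\0';                       /* readlink() does not add this */
    return n;
}
```

The wrapper also works on a dangling link, since `readlink()` only reads the link's own contents and never resolves the target.
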
---
### File Times, utime(), utimes()
1. **File Times**
| Field | Description | Example | ls-option |
| -------- | ----------------------- | ------------ | --------- |
| st_atim | last-access-time | read | -lu |
| st_mtim | last-content-modification-time | write | -l |
| st_ctim | last-i-node-change-time | chmod, chown | -lc |
* `access()` and `stat()` don't change file time
* changing the access permissions, user ID, link count, etc, only affects the i-node (st_ctim)
* ctime is modified automatically! (stat & access are for reading)
* effect of functions on file times:

2. **utime()**
```c
#include <sys/types.h>
#include <utime.h>
#include <sys/time.h> // for utimes() and struct timeval
int utime(const char *pathname, const struct utimbuf *times);
int utimes(const char *pathname, const struct timeval times[2]);
struct utimbuf {
time_t actime; // access time
time_t modtime; // modification time
};
// time_t: number of seconds since 1970 Jan 1 00:00:00 UTC
struct timeval {
time_t tv_sec; // seconds
long tv_usec; // microseconds
};
```
* utimes()
* finer grained resolution than utime()
* `times[0]` = access time
* `times[1]` = modification time
* time values are in seconds since the Epoch
* times :
* == null : set both times to the current time
* requires effective UID = file UID, or `w` permission for the file
* != null : set as requested
* requires effective UID = file UID, or superuser; `w` permission alone is not sufficient
---
### mkdir(), rmdir(), opendir(), readdir(), rewinddir(), closedir()
1. **mkdir()**
```c
#include <sys/stat.h>
// 0: OK, -1: on error
int mkdir(const char *pathname, mode_t mode);
```
* umask, UID/GID setup (works the same as file)
* . and .. are automatically created
2. **rmdir()**
```c
#include <unistd.h>
// 0: OK, -1: on error
int rmdir(const char *pathname);
```
* delete an empty directory; the space is freed when
* the link count of the dir reaches 0
* no other process has the dir open
* if the directory is not empty
* fails with ENOTEMPTY or EEXIST
3. **opendir(), readdir(), rewinddir(), closedir()**
```c
#include <sys/types.h>
#include <dirent.h>
// returns pointer if OK, NULL on error
DIR *opendir(const char *pathname);
// returns pointer if OK, NULL at the end of the dir or on error
struct dirent *readdir(DIR *dp);
void rewinddir(DIR *dp);
// 0: OK, -1: on error
int closedir(DIR *dp);
struct dirent {
ino_t d_ino; /* not in POSIX.1 */
char d_name[NAME_MAX+1];
};
// implementation dependent
```
* Only the kernel can write to a dir
* `w+x` for creating/deleting a file
* ftw()/nftw()
* recursively traversing the file system
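
A typical read loop over a directory looks like the sketch below. The helper name `count_entries` is hypothetical; the pattern of clearing `errno` before each `readdir()` call is what distinguishes end-of-directory from an error, since both return `NULL`.

```c
#include <dirent.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: count the entries in `dirpath`, skipping "." and "..".
 * Returns the count, or -1 on error. */
long count_entries(const char *dirpath)
{
    DIR *dp = opendir(dirpath);
    struct dirent *ent;
    long n = 0;

    if (dp == NULL)
        return -1;
    for (;;) {
        errno = 0;                       /* distinguish end-of-dir from error */
        ent = readdir(dp);
        if (ent == NULL)
            break;
        if (strcmp(ent->d_name, ".") != 0 && strcmp(ent->d_name, "..") != 0)
            n++;
    }
    if (errno != 0)                      /* readdir() failed part-way */
        n = -1;
    closedir(dp);
    return n;
}
```
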
---
### chdir(), fchdir(), getcwd()
a process has a current working directory - where all relative pathnames begin (per process)
1. **chdir(), fchdir()**
```c
#include <unistd.h>
// 0:OK, -1: on error
int chdir(const char *pathname);
int fchdir(int fd);
```
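* `cd` must be built into the shell: `chdir()` only changes the working directory of the calling process, so an external command could not change the shell's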
* The kernel only maintains the i-node number and dev ID for the current working directory
2. **getcwd()**
```c
#include <unistd.h>
char *getcwd(char *buf, size_t size);
```
* The buffer must be large enough, or an error returns
* chdir follows symbolic links, and getcwd has no idea of symbolic links
---
### st_ino (I-node number)
* Each file has a unique i-node number (index number)
* The i-node number can be used to look up a file’s information (i-node) in a system table (the i-list)
* A file's i-node contains :
* user and group ids of its owner
* permission bits
* etc
---
## Process Control
- user process v.s. kernel process

- process memory layout

- PID: non-negative integer
- 0: swapper or scheduler/idle process
* kernel process
* no program on the disk corresponds to this process
- 1: init process
* user process with superuser
* bring up the Unix system after the kernel has been bootstrapped
* initialize system services, login processes, etc
* run init scripts
- 2: pagedaemon or kthreadd
* kernel process
* page the virtual memory system
- bootstrap
- computer power on
- CPU executes the firmware from ROM
- firmware(BIOS/UEFI) initialize hardware(e.g., RAM)
- loads software from the storage device(boot partition in hard drive) to RAM

- CPU executes loaded programs(bootloader(grub, u-boot))
- the bootloader loads the OS into RAM and continues the bootstrap
- the OS initializes more hardware
- the OS creates pid 1, 2
- the system is brought up for multi-user operation

- process control block (PCB)
- each process is represented in the OS kernel by PCB

- process scheduling


### getpid(), getppid()
```c
#include <unistd.h>
pid_t getpid(void);
pid_t getppid(void);
```
* return:
- getpid(): PID of calling process
- getppid(): PID of parent process of calling process
- the OS kernel tracks the parent process of every process except for pid 0, 1, 2
### fork(), vfork()


```c
#include <unistd.h>
pid_t fork(void);
```

- return:
- 0: if in the child
- pid of child: if in the parent
- -1: on error
- a process can create many children, but each child has only a single parent
- parent v.s. child memory
- parent and child share the same memory address layout (discuss later)
- child get a copy of parent's data, text segment, heap, stack
- a copy or share
- copy: data, heap, stack
- share: text segment
- share the same file offset

- when child terminates, any shared fd's offsets will update accordingly
- close() in parent or child neither interferes with the other's open fds
- cannot predict whether child or parent to run first
- decided by the kernel scheduler
- if synchronization is needed, use `wait()`, signals, or pipes (`sleep()` gives no guarantee)
- reason for fork() to fail
- too many processes in the system
- total number of processes for a real UID > `CHILD_MAX` (system's limit)
```c
#include <unistd.h>
pid_t vfork(void);
```
- return:
- same as `fork()`
- optimization of `fork()`
- share the address space until changes are required
- the same as `fork()` except
- runs the same address space until child calls `exec()` or `exit()`
- child always runs first
- parent is blocked until child calls `exec()` or `exit()`
- modern Unix systems employ copy-on-write (COW)
- if `fork()` already use COW, `vfork()` does not add much performance gain

- behavior of `vfork()` is undefined if the child
- modifies any data except the variable used for the return value from `vfork()`
- makes function calls
- returns without calling `exec()` or `_exit()`
- example:
```c
// fork()
if ((pid = fork()) < 0) {
err_sys("fork error");
} else if (pid == 0) { /* child */
globvar++; /* modify variables */
var++;
} else { /* parent */
sleep(2); /* let the child print first */
}
printf("pid = %ld, glob = %d, var = %d\n", (long)getpid(), globvar, var);
// output: (the child changed its own copy of memory)
// pid = 430, glob = 7, var = 89
// pid = 429, glob = 6, var = 88
// vfork()
if ((pid = vfork()) < 0) {
err_sys("vfork error");
} else if (pid == 0) { /* child */
globvar++; /* modifies the parent's variables */
var++;
_exit(0);
}
printf("pid = %ld, glob = %d, var = %d\n", (long)getpid(), globvar, var);
// output: (the child ran in the parent's address space)
// pid = 29039, glob = 7, var = 89
```
### exit(), atexit()

- process termination
- normal termination
- return from main()
- call `exit()`, `_exit()` or `_Exit()`
- return of the last thread from its start routine (later)
- calling pthread_exit from the last thread (later)
- abnormal termination
- call `abort()` (generating the `SIGABRT` signal) ([here](#abort-sleep))
- terminated by a signal ([here](#Signal))
- response of the last thread to a cancellation request (later)
- the kernel eventually deallocates all of the process's memory, closes its open fds, etc.
- the parent calls `wait()` to reap the resources of the child
- the kernel notifies the parent by sending `SIGCHLD`
```c
#include <stdlib.h> // specified in ISO C
void exit(int status);
void _Exit(int status);
#include <unistd.h> // specified in POSIX.1
void _exit(int status);
```
- status
- transmitted by the OS kernel to parent to tell it how child is terminated
- by convention, a status of 0 means success
- cleanup tasks
- `_Exit()` and `_exit()` do not perform them
- `exit()` performs them
- calls all exit handlers
- clean shutdown of the standard I/O library: `fclose()` all open I/O streams
```c
#include <stdlib.h>
// Returns 0 if OK, nonzero on error
int atexit(void (*func)(void));
```
- registers `func` as an exit handler
- the same function can be registered several times
- handlers are called in reverse order of their registration
- a child inherits copies of its parent's registrations
- registrations are cleared on `exec()`
### wait(), waitpid()
```c
#include <sys/wait.h>
// Both return: process ID if OK, 0 (WNOHANG and no child has changed state), or -1 on error
pid_t wait(int *statloc);
pid_t waitpid(pid_t pid, int *statloc, int options);
```
- pid:
- pid < -1: wait for any child whose process group ID == |pid|
- pid = -1: wait for any child, `wait(&status) == waitpid(-1, &status, 0)`
- pid > 0: wait for the child whose PID == pid
- pid = 0: wait for any child whose process group ID == the caller's process group ID
- options
- `WCONTINUED`, `WNOHANG` (nonblocking), `WUNTRACED`
- wait for state changes of child
- child terminated, child stopped by a signal, child resumed by a signal
- the termination status is stored in `*statloc`
- pass `statloc = NULL` to ignore the status
- process calls `wait()`, `waitpid()`
- block: if all children are still running
- return immediately with status value: any child has changed state
- return immediately with an error: if no child
- `wait()` v.s. `waitpid()`
- `wait()`: block until a child changes
- `waitpid()`: block until a specific child change (modifiable, e.g., nonblocking, wait for any child, set via `pid` and `options`)
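
Putting `fork()`, `_exit()`, and `waitpid()` together, the following sketch shows how a parent retrieves a child's exit status. The helper name `child_exit_status` is hypothetical; `WIFEXITED` and `WEXITSTATUS` are the standard macros from `<sys/wait.h>` for decoding the status value.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, then reap it.
 * Returns the child's exit status, or -1 if fork/waitpid fails or the child
 * did not terminate normally. */
int child_exit_status(int code)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);                     /* child: exit without stdio cleanup */

    /* parent: block until this specific child changes state */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the parent always waits, the child in this sketch never lingers as a zombie.
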
### zombie and orphan process

- zombie
- child has been terminated, but parent has not yet waited for it (zombie state)
- zombies do not use much memory, but they consume PIDs
- PIDs are limited resources
- orphan
- child remains running, while parent has terminated
- the OS kernel sets the parent of an orphan to the init process (pid == 1)
- children of init never become zombies
- init calls one of the `wait()` functions to fetch the status
### exec()
```c
#include <unistd.h>
// All return -1 on error; no return on success
int execl(const char *pathname, const char *arg0, ... /* (char *)0 */ );
int execv(const char *pathname, char *const argv[]);
int execle(const char *pathname, const char *arg0, ... /* (char *)0, char *const envp[] */ );
int execve(const char *pathname, char *const argv[], char *const envp[]);
int execlp(const char *filename, const char *arg0, ... /* (char *)0 */ );
int execvp(const char *filename, char *const argv[]);
```
- call `exec()` to execute a program
- `exec()` replaces the process's text (code), data segments (global variables), heap, and stack
- PID not changed
- new program execute `main()`
- `l, v, p, e` after the function name:
- l: list, list of arguments terminated by a null pointer
- e.g., `execl("/bin/ls", "ls", "-l", (char *)NULL);`
- v: vector, arguments passed via an array which terminated by a null pointer
- e.g.,
```c
char *cmd = "/bin/ls";
char *args[] = {cmd, "-l", NULL};
execv(cmd, args);
```
- e: environment, environ variable for the new process
- e.g.,
```c
char *cmd = "/bin/ls";
char *args[] = {cmd, "-l", NULL};
char *env[] = {"PATH=/bin", "HOME=/home/user", NULL};
execve(cmd, args, env);
```
- p: path, if `filename` has no slash(/), search the directories in the `PATH` environment variable; if `filename` contains a slash(/), it is used as a pathname and `PATH` is not searched
- e.g.,
```c
char *cmd = "ls";
char *args[] = {"ls", "-l", NULL};
execvp(cmd, args);
// if the file found is not a machine executable, it is run as a shell script via /bin/sh
```
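
In practice `exec()` is almost always paired with `fork()` so that the caller survives. The sketch below assumes a hypothetical helper named `run_command`; the exit code 127 for a failed exec follows the common shell convention and is a choice made for this example, not something required by `exec()`.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (searched in PATH) with arguments argv
 * and wait for it. Returns its exit status, 127 if exec failed, or -1 on
 * other errors. */
int run_command(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);           /* returns only on failure */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as `run_command((char *[]){ "ls", "-l", NULL })` runs `ls -l` in a child while the parent keeps its own program image.
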
### fork() v.s. exec()


### pipe
- Inter-Process Communication (IPC): lets multiple processes exchange data and coordinate with each other

```c
#include <unistd.h>
// Returns 0 if OK, -1 on error
int pipe(int fd[2]);
```
- a pipe has a read end and a write end
- write to the write end -> buffered by the kernel -> read from the read end

- returned `fd[2]`:
- `fd[0]`: read end
- `fd[1]`: write end
- output of `fd[1]` = input for `fd[0]`
- the file type of both is FIFO
- accessing closed fds of pipes:
- write end closed: read from a pipe will see the EOF (returns 0)
- read end closed: write to a pipe triggers `SIGPIPE` signal sent to the process
- if `SIGPIPE` is ignored, `write()` return -1, errno = `EPIPE`
- pipe capacity:
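- `PIPE_BUF`: maximum size of a write that is guaranteed to be atomic, 4096 bytes in Linux (the pipe's total capacity is a separate, larger limit)
- write data size <= `PIPE_BUF` bytes: data is written contiguously (atomic)
- write data size > `PIPE_BUF` bytes: data may be interleaved with writes from other processes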
- `pipe()` and then `fork()`

- cause blocking:
- read from an empty pipe: block until data is available
- write to a full pipe: block until sufficient data has been read from the pipe
- I/Os on pipes are slow system calls: they can block indefinitely, because the caller cannot tell when the pipe will be ready for reading or writing
- two limitations:
- half duplex: data flow in one direction
- can only be used between processes that have the same ancestor: `fork()` from the same parent who creates the pipe
- solution:
- socket (stream pipes): address both
- FIFOs: address the second
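
The `pipe()` then `fork()` pattern can be sketched as follows. The helper name `pipe_from_child` is hypothetical. Note that each side closes the end it does not use; the parent only sees EOF once every write end, including its own copy, is closed.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child writes `msg` into a pipe; the parent reads
 * until EOF into `buf` (null-terminated). Returns bytes read, or -1 on error. */
ssize_t pipe_from_child(const char *msg, char *buf, size_t size)
{
    int fd[2];
    pid_t pid;
    ssize_t n, total = 0;

    if (size == 0 || pipe(fd) == -1)
        return -1;
    if ((pid = fork()) == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                      /* child: writer */
        close(fd[0]);                    /* unused read end */
        if (write(fd[1], msg, strlen(msg)) == -1)
            _exit(1);
        _exit(0);
    }
    close(fd[1]);                        /* parent: close write end to see EOF */
    while ((size_t)total < size - 1 &&
           (n = read(fd[0], buf + total, size - 1 - (size_t)total)) > 0)
        total += n;
    close(fd[0]);
    buf[total] = '\0';
    waitpid(pid, NULL, 0);               /* reap the child */
    return total;
}
```
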
### FIFO
- a kind of file
- also called named pipes
- `S_ISFIFO`: used for check by stat()
- data kept internally in the kernel (no write to file system): FIFO file has no contents
```c
#include <sys/types.h>
#include <sys/stat.h>
// Returns 0 if OK, -1 on error
int mkfifo(const char *path, mode_t mode);
```
- path:
- name of FIFO
- mode:
- FIFO's permissions, same as `open()`
- can use file system-related I/O (e.g., `write()`, ...) on FIFO
- both the read end and the write end must be opened; by default, each `open()` blocks until the other end is opened
- read end: open with `O_RDONLY`
- write end: open with `O_WRONLY`
- if a process writes to a FIFO that has no readers, the signal `SIGPIPE` is sent to the process
- FIFO and `O_NONBLOCK`
- open without `O_NONBLOCK`:
- open for read blocks until another process opens for write
- open for write blocks until another process opens for read
- open with `O_NONBLOCK`:
- open for read returns immediately
- open for write returns -1 with errno = `ENXIO` if no process has the FIFO open for read
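
The `O_NONBLOCK` behaviour for the write end can be observed with a short experiment. The helper below, `fifo_open_without_reader`, is a hypothetical name; it creates a FIFO, attempts a nonblocking open for writing while no reader exists, and reports the resulting errno.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create a FIFO at `path` and try to open it for writing
 * with O_NONBLOCK while no reader exists. Returns the errno from open()
 * (expected: ENXIO), 0 if the open unexpectedly succeeded, or -1 if the
 * FIFO could not be created. The FIFO is removed before returning. */
int fifo_open_without_reader(const char *path)
{
    int fd, err = 0;

    unlink(path);                        /* ignore failure: may not exist */
    if (mkfifo(path, 0600) == -1)
        return -1;
    fd = open(path, O_WRONLY | O_NONBLOCK);
    if (fd == -1)
        err = errno;                     /* save before unlink() can change it */
    else
        close(fd);
    unlink(path);
    return err;
}
```
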
## Signal

- defined by a name(begins with `SIG`) and a number(positive integer constants)
- signal types

- terminal-generated
- hardware exceptions
- software
- the process can tell the kernel to do one of these dispositions(actions): ignore, catch with own handler, perform default actions(default handler)

- the process can receive signal at any place in the program
- `SIGKILL`, `SIGSTOP` cannot be ignored or caught by the program's own handler
- core dump file
- generated when the program crashes or exits abnormally, used for debugging
- not generated if
- set-UID/GID process: real UID/GID != program file’s UID/GID
- no write permission to the directory
- the generated core dump file is too big and the file system has no space left
- signal inheritance

- interrupted system calls
- a signal can be delivered while a process is blocked in a slow system call; then either
- the system call returns an error, errno = `EINTR`
- or the system call is automatically restarted after the signal handler returns, if the `SA_RESTART` flag is set (`sigaction()`)
- pending and blocking

- pending, delivered:
- pending: the signal has been generated but not yet delivered
- delivered: the action for the signal has been taken
- a process has the option of blocking signal delivery; if the signal is blocked and the disposition is not ignore, the signal remains pending until
- the process unblocks the signal
- disposition becomes ignore
- `sigpending()` can determine which signals are blocked and pending
- if a signal is generated more than once before being unblocked
- POSIX.1 allows the signal to be delivered either once or more than once (signals are queued)
- Linux does not queue standard signals: the signal is delivered only once (real-time signals are queued)
- if more than one signal is ready to be delivered to a process
- POSIX.1 does not specify the order for signal delivery
- POSIX.1 suggests that signals related to the current state of a process (e.g., `SIGSEGV`) should be delivered first
### signal()
```c
#define SIG_ERR (void (*)())-1
#define SIG_DFL (void (*)())0
#define SIG_IGN (void (*)())1
#include <signal.h>
typedef void (*sighandler_t) (int); // we typedef a new data type of signal handlers
sighandler_t signal(int signum, sighandler_t handler);
```
- return:
- old disposition of the `signum` if OK
- `SIG_ERR` on error
- `handler` (disposition of the `signum`):
- function pointer
- `SIG_IGN`: ignore the signal
- `SIG_DFL`: signal handled with default actions
- `SIGKILL` and `SIGSTOP` cannot be caught or ignored
### Reentrant Functions
- the signal handler cannot tell where the process was executing before the signal was caught
- a function could be called twice, once before the signal occurs, and once by the signal handler
- reentrant functions: functions that can be safely interrupted and called again before the first call completes
- make sure the signal handler calls only reentrant functions
- non-reentrant functions cases:
- use static or global variables & data structures
- calls `malloc()` or `free()`: both functions use a static global data structure to track which memory blocks are allocated or free
- modifies the errno without saving and restoring it
- calls non-reentrant functions
- reentrant functions specified by the Single Unix Specification

### Signal Sets
```c
#include <signal.h>
// All return: 0 if OK, -1 on error
int sigemptyset(sigset_t *set);
int sigfillset(sigset_t *set);
int sigaddset(sigset_t *set, int signo);
int sigdelset(sigset_t *set, int signo);
// Return 1 if true, 0 if false, -1 on error
int sigismember(const sigset_t *set, int signo);
typedef struct {
unsigned long sig[_NSIG_WORDS];
} sigset_t;
```
- defined to manipulate signal sets (do not manipulate directly)
- initialize a signal set: `sigemptyset()`, `sigfillset()`
- add or delete a signal to an existing set: `sigaddset()`, `sigdelset()`
- test if a signal is in a set: `sigismember()`
- POSIX.1 defines the data type `sigset_t` to contain a signal set
### Signal Mask
- defines the set of signals currently blocked from delivery to that process
```c
#include <signal.h>
// Returns 0 if OK, -1 on error
int sigprocmask(int how, const sigset_t *set, sigset_t *oset);
```
- a process can get and change its signal mask by `sigprocmask()`
- `how`:
- `SIG_BLOCK`: new mask = current mask ∪ `set`
- `SIG_UNBLOCK`: new mask = current mask - `set`
- `SIG_SETMASK`: new mask = `set`
- `set`, `oset`
- if `oset` != NULL: previous value of the signal mask is stored in `oset`
- if `set` == NULL: the signal mask is unchanged and `how` is ignored
- if `set` != NULL: the `how` indicates how the current signal mask is modified
- after the call, if any pending signals are now unblocked, at least one of them is delivered to the process before `sigprocmask()` returns
### sigpending(), kill(), raise()
```c
#include <signal.h>
// Returns 0 if OK, -1 on error
int sigpending(sigset_t *set);
```
- returns the signals that are currently pending for the calling process
- update the set of signals in `set`
- return error if set points to an invalid memory address
```c
#include <signal.h>
// Returns 0 if OK, -1 on error
int kill(pid_t pid, int signo);
int raise(int signo);
```
- `kill()`: sends a signal to a process or a group of processes
- `pid`:
- `pid` > 0: to the process pid
- `pid` == 0: to all processes in the same process group as the calling process (excluding process pid = 0, 1, 2)
- `pid` < -1: to all processes whose process group ID == |pid|
- `pid` == -1: to all processes the caller has permission to signal (broadcast)
- `signo`:
- `signo` == 0 (POSIX null signal), no actual signal is sent
- can be used to test whether a specific process exists
- not exist: return -1, `errno` = `ESRCH`
- not atomic: by the time that kill() returns, the process might have exited
- permission
- superuser can send a signal to any process
- otherwise, the caller's real or effective UID must equal the receiver's real or effective UID
- if the implementation supports `_POSIX_SAVED_IDS`: the receiver's saved set-user-ID is checked instead of its effective UID
- `raise()`: sends a signal to itself (caller)
- `raise(signo) == kill(getpid(), signo);`
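
The null-signal test described above might look like the sketch below. `process_exists` is a hypothetical helper; treating `EPERM` as "exists" is a choice made for this example, based on the fact that the permission check can only fail for a process that is present.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>   /* getpid(), for callers such as process_exists(getpid()) */

/* Hypothetical helper: 1 if a process with this pid exists,
 * 0 if it does not, -1 on any other error. */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)    /* signo 0: checks only, nothing is sent */
        return 1;
    if (errno == EPERM)       /* exists, but we may not signal it */
        return 1;
    if (errno == ESRCH)       /* no such process */
        return 0;
    return -1;
}
```

Because the check is not atomic, a result of 1 only says the process existed at the moment of the call.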
### alarm(), pause()
```c
#include <unistd.h>
unsigned int alarm(unsigned int seconds);
```
- sets a timer that will expire in the future
- when the timer expires, `SIGALRM` signal is generated and sent to the calling process
- `SIGALRM`: default action is to terminate the process
- return:
- number of seconds left for the previously scheduled alarm
- 0: no previously scheduled alarm
- only one alarm clock per process
- `seconds` == 0: any pending alarm is canceled
- `seconds` != 0: if the previous alarm has not yet expired, the alarm clock is set to new value
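
The return-value rules above can be observed without waiting for the timer to fire. This is a sketch only; `schedule_then_cancel` is a hypothetical name, and the exact number returned depends on how the implementation rounds the remaining time.

```c
#define _POSIX_C_SOURCE 200809L
#include <unistd.h>

/* Hypothetical helper: schedule an alarm, then cancel it right away.
 * Returns the number of seconds that were left on the canceled alarm. */
unsigned int schedule_then_cancel(unsigned int seconds)
{
    alarm(seconds);    /* replaces any earlier alarm of this process */
    return alarm(0);   /* cancels it and reports the time that was left */
}
```

A second `alarm(0)` afterwards returns 0, because no alarm is scheduled any more.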
```c
#include <unistd.h>
int pause(void);
```
- suspends the calling process until a signal is caught
- returns only if a signal handler is executed and that handler returns; in this case `pause()` returns -1 with `errno` set to `EINTR`
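
A minimal sketch combining `alarm()` and `pause()`, with hypothetical names `on_alarm` and `wait_for_alarm`. It assumes the timer is long enough that the signal cannot arrive between the `alarm()` and `pause()` calls; with very short timers that window is a known race.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t got_alarm = 0;

static void on_alarm(int signo)
{
    (void)signo;
    got_alarm = 1;    /* only set a flag inside the handler */
}

/* Hypothetical helper: wait about `seconds` using alarm() + pause().
 * Returns 1 if pause() returned -1/EINTR after the handler ran,
 * 0 otherwise, -1 if the handler could not be installed. */
int wait_for_alarm(unsigned int seconds)
{
    struct sigaction sa;

    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return -1;

    got_alarm = 0;
    alarm(seconds);
    int rc = pause();           /* suspended until a handler has returned */
    return rc == -1 && errno == EINTR && got_alarm;
}
```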
### sigaction()
```c
#include <signal.h>
// Returns 0 if OK, -1 on error
int sigaction(int signo, const struct sigaction *act, struct sigaction *oact);
struct sigaction {
void (*sa_handler)(int); /* handler address, SIG_DFL, or SIG_IGN */
sigset_t sa_mask; /* additional signals to block in the handler */
int sa_flags; /* signal options */
/* alternate handler */
void (*sa_sigaction)(int, siginfo_t *, void *);
};
struct siginfo {
int si_signo; /* signal number */
int si_errno; /* error number */
int si_code; /* signal code */
pid_t si_pid; /* sending process ID */
uid_t si_uid; /* sending process's real user ID */
void *si_addr; /* address of faulting instruction */
int si_status; /* exit value or signal */
union sigval si_value; /* signal value */
/* some other fields */
};
```
- examine or modify (or both) the action associated with a particular signal
- `signo`: signal number being examined or modified
- cannot change the action for `SIGKILL` and `SIGSTOP`
- `act`:
- `act` != NULL: the new action for `signo` is installed from `act`
- `oact` != NULL: the previous/current action (depending on the value of act) for `signo` is saved in `oact`
- `sigaction()` supersedes `signal()` and should be used in preference
- `struct sigaction`:
- `sa_handler`: address of the signal handler, or SIG_DFL, SIG_IGN
- `sa_mask`: additional signals to block during the execution of the signal handler (added to the signal mask of the process)

- (A) OS kernel adds the following signals to the signal mask of the process before invoking its signal handler
- the current signal to be delivered (by default), unless `SA_NODEFER` is set in the `sa_flags`
- the signals specified in the `sa_mask`
- (B) When signal handling function returns, the signal mask of the process is reset to its previous value (before (A))

- `sa_flags`: signal options
- `sa_sigaction`: if `SA_SIGINFO` is specified in `sa_flags`, `sa_sigaction` is used instead of `sa_handler`
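
A sketch of installing a handler through `sa_sigaction` with `SA_SIGINFO`, using the hypothetical names `on_usr1` and `install_and_trigger`. It assumes `SIGUSR1` is not blocked, so the signal sent with `kill()` to the calling process is delivered before `kill()` returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t last_signo = 0;
static volatile sig_atomic_t sender_is_self = 0;

/* Three-argument handler, selected by SA_SIGINFO. */
static void on_usr1(int signo, siginfo_t *info, void *context)
{
    (void)context;
    last_signo = signo;
    sender_is_self = (info->si_pid == getpid());
}

/* Hypothetical helper: install the handler, send SIGUSR1 to ourselves,
 * restore the old action. Returns 1 if the handler saw SIGUSR1 sent by
 * this process, 0 otherwise, -1 on error. */
int install_and_trigger(void)
{
    struct sigaction act, oact;

    act.sa_sigaction = on_usr1;
    sigemptyset(&act.sa_mask);
    sigaddset(&act.sa_mask, SIGINT);   /* also block SIGINT in the handler */
    act.sa_flags = SA_SIGINFO;         /* use sa_sigaction, not sa_handler */
    if (sigaction(SIGUSR1, &act, &oact) == -1)
        return -1;

    kill(getpid(), SIGUSR1);

    sigaction(SIGUSR1, &oact, NULL);   /* put the previous action back */
    return last_signo == SIGUSR1 && sender_is_self;
}
```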
### Nonlocal jumps
- cannot `goto` a label that is in another function
- nonlocal jumps enable program control transfer to an arbitrary program location (nonlocal gotos)
```c
#include <setjmp.h>
int setjmp(jmp_buf env);
void longjmp(jmp_buf env, int val);
```
- `setjmp()`: establishes the target to which control will later be transferred
- saves the calling environment (stack and CPU registers) in the `env` (`jmp_buf`: data type in some form of an array)
- `env` (usually a global variable): set first by `setjmp()` and later used by `longjmp()`
- multiple `setjmp()` calls may share the same `env` (a better practice: each `setjmp()` should use its own `env`)
- return:
- 0: if called directly
- nonzero ( == `val` in `longjmp()`): if returning from a call to `longjmp()`
- `longjmp()`: performs the transfer of execution
- uses `env` to transfer control back to the point where setjmp() was called and to restore the environment to its state at the time of the setjmp() call
- `val`: fake return value for `setjmp()`
- variable value:
- stored in memory: = values at the time of `longjmp()`
- stored in CPU and floating-point registers: = values at the time of `setjmp()`
- values of automatic (local) and register variables are indeterminate
- compiler optimization could put local and register variables in CPU registers (rolled back)
- declare a variable `volatile` if its value must not be rolled back
- global, `static`, and `volatile` variables are unchanged after the fake return
- can only `longjmp()` to a `setjmp()` point in a function that has not yet returned
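
The control flow and the `volatile` rule above can be sketched as follows. The names `fail_deep_inside` and `run_with_recovery` are hypothetical, and the value 7 is an arbitrary `val` chosen for the example.

```c
#include <setjmp.h>

static jmp_buf env;   /* shared between setjmp() and longjmp() */

static void fail_deep_inside(void)
{
    longjmp(env, 7);  /* never returns; setjmp() appears to return 7 */
}

/* Hypothetical helper: returns the value of `attempts` seen after the
 * fake return, or -1 if control flow did not go as expected. */
int run_with_recovery(void)
{
    /* volatile: keeps the value it had at the time of longjmp() */
    volatile int attempts = 0;

    switch (setjmp(env)) {
    case 0:                    /* direct call */
        attempts = 1;
        fail_deep_inside();
        return -1;             /* not reached */
    case 7:                    /* fake return caused by longjmp(env, 7) */
        return attempts;
    default:
        return -1;
    }
}
```

Without `volatile`, the value of `attempts` after the jump would be indeterminate, since the variable is modified between the `setjmp()` and `longjmp()` calls.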


### sigsetjmp(), siglongjmp()
```c
#include <setjmp.h>
// Returns: 0 if called directly, nonzero if returning from a call to siglongjmp
int sigsetjmp(sigjmp_buf env, int savemask);
void siglongjmp(sigjmp_buf env, int val);
```
- POSIX does not specify whether `setjmp()` and `longjmp()` save and restore signal masks
- In FreeBSD 8.0 and Mac OS X 10.6.8: yes
- In Linux: no
- POSIX provides these two functions to support saving and restoring signal masks (behave the same as `setjmp()` and `longjmp()`)
- `savemask`:
- `savemask` != 0: `sigsetjmp()` saves the current signal mask of the calling process in `env`, and `siglongjmp()` restores it
### sigsuspend()
```c
#include <signal.h>
// Returns -1 with errno set to EINTR (if it returns to the caller)
int sigsuspend(const sigset_t *sigmask);
```
- `sigprocmask()` + `pause()` in a single atomic operation: replaces the signal mask with `sigmask`, then suspends the process until a signal is caught or a signal terminates the process
- returns after the signal handler returns
- does not return if the process is terminated
- restores the signal mask to its value before the `sigsuspend()` call when it returns
### abort(), sleep()
```c
#include <stdlib.h>
// The function never returns
void abort(void);
```
- cause abnormal termination
- unblocks `SIGABRT` and raises it for the calling process
- default disposition: terminate the process
- if `SIGABRT` is ignored, or caught by a handler that returns: the process still terminates
- `abort()` restores the default disposition and raises the signal a second time
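
To observe that ignoring `SIGABRT` does not prevent termination, one option is to call `abort()` in a child process and inspect its wait status in the parent. This is a sketch with a hypothetical helper name, `child_abort_signal`; depending on system settings the child may also leave a core file.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that ignores SIGABRT and then calls
 * abort(). Returns the signal number that terminated the child, or -1 on
 * error or if the child did not die from a signal. */
int child_abort_signal(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        signal(SIGABRT, SIG_IGN);   /* ignoring does not stop abort() */
        abort();                    /* never returns */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```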
```c
#include <unistd.h>
// Returns: 0 or the number of unslept seconds
unsigned int sleep(unsigned int seconds);
```
- causes the calling process to be suspended until either:
- `seconds` passed: return 0
- a signal is caught by the process and the signal handler returns (returns the unslept seconds)
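
The second case, an early return with the unslept time, can be sketched by letting a child process interrupt the parent's `sleep()`. The names `on_usr1` and `interrupted_sleep` and the 1-second and 5-second durations are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void on_usr1(int signo)
{
    (void)signo;    /* nothing to do: returning is enough to end sleep() */
}

/* Hypothetical helper: sleep(5) in the parent, interrupted after about
 * one second by SIGUSR1 from a child.
 * Returns the unslept seconds reported by sleep(), or -1 on error. */
int interrupted_sleep(void)
{
    struct sigaction sa, oldsa;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, &oldsa) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        sleep(1);                     /* give the parent time to sleep */
        kill(getppid(), SIGUSR1);
        _exit(0);
    }

    unsigned int left = sleep(5);     /* cut short by the caught signal */

    waitpid(pid, NULL, 0);
    sigaction(SIGUSR1, &oldsa, NULL);
    return (int)left;
}
```

The returned value is greater than 0 and less than 5; the exact number depends on timing and on how the implementation rounds the remaining time.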