# NTU CSIE Sysyem Programming Final Notes
## I. Signals
### 0. Basics
- asynchronous notifications of an event to processes
- **anytime, any order** (no wait)
- can be *blocked* or *ignored*; else kernel delivers it to process
### 1. Signal Handler
> **Disposition of a signal**: (1) ignore (2) default action (3) catch w/ per-process **signal handler**
>> `SIGKILL`/`SIGSTOP`: ==always default action==
- inherited thru `fork()`; reset to default thru `exec()`
- **Core Dump:** consists of the recorded program states, generated when it crashes or exits abnormally
- `signal()` sets the disposition of `signum` to `handler`:
```c!
sighandler_t signal(int signum, sighandler_t handler);
//handler: a function ptr / SIG_IGN(ignore) / SIG_DFL(default)
```
- `pause()` suspends a process until a signal is delivered (i.e. waits for a signal to be caught)
### 2. Interrupted System Calls
- Case 1: sys call returns error (`EINTR`)
- Case 2: sys call **automatically restarts**
- if `SA_RESTART` flag ($\S$ 6) is set on a signal handler
- some specific sys calls (eg. `ioctl()`, `read()`, `wait()`, ...), which can be disabled ==per-signal==
### 3. Reentrant Functions
:::danger
The signal handler does **not** know where the process is executing as the signal was caught.
:::
- a.k.a. *async-signal safe* func.
- can be safely called *recursively*
- non-reentrant:
- static/global memory (variables & data structure)
- standard I/O library
- `malloc()`/`free()`
- modifies *errno* w/o backups
- calls non-reentrant functions
### 4. Pending Signals & Signal Mask
#### 4.1
```mermaid
---
title: Life of a Signal
---
graph LR
Generated --> Blocked/Pending;
Generated --> Ignored;
Blocked/Pending --> Delivered;
Blocked/Pending --> Ignored;
```
- **blocking** (pending): a process can block a signal from delivery until it is either (1) *unblocked* or (2) *ignored*
#### 4.2 Signal Mask
> (==per-process==) defines the set of signals currently blocked
- Signal Sets:
- `sig[empty/fill]set()`: initialization
- `sig[add/del]set()`
- `sigismember()`: test if a signal is in the set
- `sigprocmask()`: modify the signal mask
- To get the current sig mask: `sigprocmask(-, NULL, &set)`
```c!
int sigprocmask(int how, const sigset_t *set, sigset_t *oset);
// how: SIG_[BLOCK(add)/UNBLOCK(del)/SETMASK(set new)]
// set: NULL-> ignore 'how'
// oset: store prev sig mask
```
- `sigpending()`: returns currently pending signals
### 5. Sending Signals
- `kill(pid, signo)`: send signal to a group of processes:
- real/effective UID of sender == receiver
- signo == 0: check if a process exists (errno `ESRCH` if not exist), not atomic
- `raise()`: send signal to self
- `alarm(seconds)`: sets a timer that generates `SIGALRM` (*only 1 timer per process*)
- prev alarm is replaced by new values and functions returns # of seconds left for that alarm
- default disposition for `SIGALARM`: terminating the process
- `seconds == 0`: cancel all pending alarms
:::warning
#### Race Condition
> eg. implement a `sleep1()` func w/ `alarm` & `pause`
Happens when a signal is received & handled ==before== `pause()` is called
:::
- `abort()`: unblocks `SIGABRT` and raise`SIGABRT` for the calling process
- ==Always terminates the process.== If `SIGABRT` is ignored or caught by a handler, `abort()` will restore the default disposition for it and raise the signal again
### 6. `sigaction()`
> to examine/modify the action on a particular signal.
- `signal()` ($\S$ 1) can be implemented using `sigaction()`
```c!
int sigaction(int signo, const struct sigaction *act, struct sigaction *oact);
struct sigaction{
void (*sa_handler)(int); // addr of signal-handling func
sigset_t sa_mask;
int sa_flags; // "SA_..." flags
void (*sa_sigaction)(int, siginfo_t *, void *);
}
```
- `sa_mask`: specifies the additional signals to block ***during the execution of the signal handler***,
:::info
**Before** invoking a signal handler,
* the ***current signal*** being delivered
* signals specified in `sa_mask`
are added to the signal mask.
:::
- `sa_sigaction`:
- used instead of `sa_handler` w/ `SA_SIGINFO` on
- `siginfo_t` contains **info about why the signal was generated**
- struct sigaction: handler, mask, flags (`SA_`)
### 7. Use Case
- parent-child synchronization
| properties | `fork()` | `exec()` |
| -------- | -------- | -------- |
| signal mask | Inherited from the parent | unchanged |
| pending signals/alarms | emptied | unchanged |
| signal action | Inherited from the parent | not ignored -> default; ignored -> ignored |
---
## II. Nonlocal Jumps
### 0. Stack Frame
- **Calling Convention**: defines that for a function, how args are placed, where return values go, what register func may use, how to allocate local variables
- caller/callee (==currently-executing func is a callee==)
- **Stack Frame**: a new room in the stack which is reserved when a func is called and reclaimed when it returns. It is consist of:
- return address (to the caller)
- arguments (passed by the caller)
- saved registers
- automatic variables: **local variables** which are allocated/deallocated automatically when program flow enters/leaves their scope
### 1. Nonlocal Jumps
#### 1.1 Objectives
(1) error-handling in deeply nested function calls
(2) signal handlers can goto a specific point
> In C, *goto* a label in another function is not valid. Nonlocal jumps (aka nonlocal *goto*s) enable program control transfer to an arbitrary location.
#### 1.2 Implementation
- `setjmp(env)`: dynamically set the target to which control will be transfered
- the calling environment (stack & registers) are saved in the `env`
- returns
- `0` when called directly
- `val` when returning from a call to `longjump()` (==fake return==)
- `longjmp(env, val)`: performs the transfer
- rewind the environment w/ `env`
##### Variables
| Storage Location | Memory | CPU or float-point *registers* |
| --- | --- | ---- |
| **Type** | global, volatile, static | ==compiled with optimization==: local(auto), register |
| **Value** | **remains** the same as of the time of `longjmp()` | **restored** to when `setjump()` was called |
:::danger
#### Warning
We can only `longjmp()` to the env of a func that **has been called but not yet completed**!
> Or else its stack frame is reclaimed and not in the call stack
:::
:::warning
#### Race Condition
> (1) from $\S$ I.5
> (2) `read()` w/ a timeout by `alarm()`
can both be solved w/ nonlocal jumps
:::
### 2. Signal Continuation
> **Problem 1**: `setjmp()` & `longjmp()` does **not** save/restore the signal mask (not specified in POSIX)
- `sigsetjmp()` & `siglongjmp()`: support saving & restoring signal masks to `env` (of type `sigjmp_buf`)
> **Problem 2**: updating signal mask and then `pause()` is ==not atomic==
- `sigsuspend()`: updates the sig mask and pauses the process in a ***atomic*** operation
- returns **after** the signal handler returns
- after returning, restores the sig mask to the value before the call to `sigsuspend()`
---
## III. Threads
### 0. Basics
- *lightweight process*
- a process can consist of multiple *threads of control*, performing multiple tasks **within the same environment**
- per-thread:
- TID
- stack
- signal mask
- *errno*
- shared by threads in the same process:
- PID, UID, GID
- environment variables
- text, heap
- open fd table
- file record locks
- signal handlers, signals/alarms
- benefits: improves responsiveness & throughput, resource sharing, computing-intensive tasks, ...
| Stack | Heap |
| -------- | -------- |
| local variables / function args | gloabal & static variables |
| per-thread | per-process |
### 1. Multithreading Models
> User threads: managed w/o OS kernel support
> Kernel threads: supported & managed directly by kernel
- **Many-to-one**: Maps many user threads to one kernel thread (which doesn't know about multi-thread)
- Pros: no modification in the kernel
- Cons: other threads can be blocked by slow system call of a thread; threads cannot run in parallel on multiple CPUs
- **One-to-one**: Maps each user thread to a respective kernel thread (this model is implemented in Linux)
- Pros: multiple threads can run in parallel and not block each other
- Cons: thread creation & destroy at higher cost (more kernel thread)
### 2. POSIX Threads
#### 2.0 Identification
`pthread_self()`: TID
`pthread_equal(TID1, TID2)`: compare TIDs (==not `==` !==)
- UNIX implementations may use a ptr to data structer for `pthread_t`
#### 2.1 Creation
- a process starts in the *main thread*
- a master thread would create worker(peer) threads
- ==no guarantees of execution order among threads==
```c!
int pthread_create(pthread_t *tidp, pthread_attr_t *attr, void *(*start_rtn)(void*), void *arg);
```
- `start_rtn` is the ***start routine*** function that a new thread starts running at, and `arg` is passed to it.
#### 2.2 Termination
- Termintating a process and **all** of its threads:
- any thread calls `exit()`, `_Exit()`, `_exit()`
- a signal with disposition set to termination
- return from `main()` in the main thread
- Termintating a specific thread:
- return from the thread's ***start routine*** function
- calls `pthread_exit()` itself
- others call `pthread_cancel()`
- `pthread_exit(void *rval_ptr)` returns a exit status `rval_ptr` to another thread that calls `pthread_join()`
- The resources of a process are not released until the last thread of it terminates
#### 2.3 Join/Detach
> ##### *joinable* thread
> can be reaped and killed by others; resources are not freed until it is reaped
`pthread_join()`: caller reaps exit status (**blocking**)
`pthread_detach()`: kernel reaps resources & exit status
- join or detach an already detached thread -> **undefined**
#### 2.4 Cancellation
`pthread_cancel(tid)` (==`pthread_exit(PTHREAD_CANCELED)`): just a ***request***; does not block
The thread being canceled can ignore or control the cancellation by:
- cancelability state: `pthread_setcancelstate()`
- ENABLE
- DISABLE: request is queued until ENABLE
- cancelability type: `pthread_setcanceltype()`
- ASYNCHRONOUS: can be canceled anytime
- DEFFERED: the request is **pending** until the thread next call a **cancellation point** func
`pthread_testcancel()`: immediately creates a cancellation point and respond to currently pending cancel requests; no effect if `CANCEL_DISABLE` or no request pending
#### 2.5 Attributes
- `pthread_attr_[init/destroy]()`: initialize/disinitialize the structure `pthread_attr_t`
- `pthread_attr_[set/get]detachstate()`: set/check if the thread should start in *detached state* or *joinable*
#### 2.6 Clean-up Handlers
- `pthread_cleanup_[push/pop]()`: register a cleanup handler (recorded in stack)
- called when:
1. `pthread_exit()`
2. responding to `pthread_cancel()` request
3. `pthread_cleaup_pop(execute)` w/ nonzero `execute`
:::danger
`return` doesn't trigger clean-up handlers!
:::
#### 2.7 Process vs Thread
| Process | Thread |
| -------- | -------- |
| `fork` | `pthread_create` |
| `exit` | `pthread_exit` |
| `waitpid` | `pthread_join` |
| `atexit` | `pthread_cleanup_push` |
| `getpid` | `pthread_self` |
| `abort` | `pthread_cancel` |
### 3. Thread Synchronization
> **non-atomic** operation will have inconsistent result if it is made at the same time by many threads
:::success
Goal: ***Sequential Consistency***
:::
#### 3.1 Mutex (Mutual Exclusion Interface)
- basically serves as a lock
- other thread will **block** until mutex is released as only one thread could acquire a mutex at a time
- `pthread_mutex_[init/destroy]()`
- `pthread_mutex_init(mutex, attr)` $\approx$ `mutex = PTHREAD_MUTEX_INITIALIZER`
- `_lock/trylock/unlock()`
- `lock` will **block** if mutex is locked; `trylock` **returns `EBUSY` immediately**
- `unlock` some lock the caller does not hold is **undefined**
##### Attributes
- `pthread_mutexattr_[init/destroy]()`
- `_[get/set]pshared()`: set/check if the mutex is ==shared between processes==
##### Deadlock Cases & Avoidance
1. One tries to lock the same mutex twice $\rightarrow$ set attr `PTHREAD_MUTEX_ERRORCHECK`
2. One tries to lock mutexes in the opposite order from another
- All threads should employ the **same lock ordering**
- Use `pthread_mutex_trylock` in a polling fashion
:::warning
###### Remarks:
coarse-grained <---------> fine-grained
bad concurrency <---------> high complexity
:::
#### 3.2 Reader-Writer Locks
> ***concurrent read & exclusive write***
> shared read lock & exclusive write lock
- `pthread_rwlock_[init/destroy]()`
- `pthread_rwlock_init(rwlock, attr)` $\approx$ `rwlock= PTHREAD_RWLOCK_INITIALIZER`
- `_[rd/wr]lock()`, `_unlock()`, `_try[rd/wr]lock()`
:::info
The functions are basically the same as mutex.
:::
#### 3.3 Condition Variables
> Threads blocks, atomically acquire a mutexblock, and then change/evaluate a **condition** (event) until the condition is satisfied
- `pthread_cond_[init/destroy]()`
- `pthread_cond_init(cond, attr)` $\approx$ `cond = PTHREAD_COND_INITIALIZER`
- `_wait(cond, mutex)`: **blocks** caller, wait by specified cond. variable to be *signaled* or *broadcast*
- atomatically **unlocks mutex** and **wait**
- when returns the caller **locks mutex**
- `_timedwait(cond, mutex, timespec)`: specifies a wait time and returns an error if time is up before wait finishes
- `_signal/broadcast(cond)`: effective only when there's any thread blocked on `cond`
- **signal** wakes up (at least) **1** thread
- **broadcast** wakes up **all** threads blocked on the specified `cond`
- threads calling these 2 func should ==locks the mutex in `_wait()` first== for *predictable scheduling behaviour*
```mermaid
---
title: Common Usage of Conditional Variable
---
flowchart TD
classDef c fill:#99f
subgraph _wait
subgraph Atomic
top1[Unlocks mutex] --> bottom1[sleep]
end
bottom1--"Waken up by notification"-->Q[Mutex acquired?]
Q -- Yes -->F[Returns]
Q -- No -->bottom1
end
A[Locks mutex]
A-->B["`**Iteratively** checks a cond`"]
B-->C[Calls _wait]
C:::c-.-top1
F -.- G[Unlocks mutex]
G:::c
```
##### Attributes
- `pthread_condattr_init/destroy()`
- `_get/setpshared()`: set/check if the mutex is ==shared between processes==
##### Spurious Wakeup
When a thread is waken up from signaled cond variable but ==the condition it is waiting for is not satisfied==. Possible reasons:
- between the cond variable is signaled & the waiting thread ran, another thread changed the cond
- OS implementation
Hence the **iteratively checking**.
### 4. Other Stuff & Threads
#### 4.1 Threads & `fork`
After `fork()`, only the caller thread exists in the child. Child inherits every mutex, rw lock and cond var.
- `pthread_atfork()`: registers fork handlers, eg. to unlock mutexes locked by other threads
#### 4.2 Threads & Signals
- **Signal mask**: per-thread (inherited from the creator)
- `pthread_sigmask()`
- `sigwait(set, signo)`: atomatically *unblocks* (==so must blocks beforehand, or else undefined==) & wait for 1 or more signals. Signal mask is restored before return.
- Does **not** trigger signal handlers
- Multiple threads `sigwait()`ing the same signal $\to$ **only 1** one of them will return at delivery
- **Signal dispositions/actions**: per-process, and any thread can modify them
- `pthread_kill(thread, signo)`: send a signal `signo` to speicified thread
#### 4.3 Threads & I/O
Since file descriptors are shared between threads, cocurrent access to the same fd cause race condition $\to$ use `p[read/write]()`
#### 4.4 Thread-safe vs Reentrant
**Thread-safe**: can be safely called by multiple threads concurrently
**Reentrant**: can be recursively called
#### 4.5 Pthread Spinlock
A busy-waiting lock alternative. (Acquiring mutex is blocked by sleeping)
---
## IV. Process ENV
### 1. Memory Addressing
- **Physical Address**: On RAM, each byte has a unique PA
- **Virtual Address**: CPU access RAM by using VA through a translator **MMU** (Memory Management Unit)
```mermaid
graph LR
A(CPU)
B(MMU)
C(RAM)
A--"VA"-->B--"PA"-->C
```
#### 1.1 Process Address Space
**Address Space** (per-process): an array of contiguous byte-size VAs.
==**Memory Mapping**==:
- The OS kernel connects a VA to a PA and stores the mapping in a **page table**
- VAs (in [0, `MAX_ADDR`]) are either *mapped/unmapped*
- Same VA in different process could map to:
- Same PA: shared memory, eg. COW after `fork()`
- Different PAs: eg. after `exec()`
#### 1.2 Memory Layout of a C Program
- **Memory Allocation**: `malloc()`, `realloc()`, `free()` -> (de)allocates memory from the heap ==making NO system call==
- `sbrk()` sys call: to expand/contract the heap
- `mmap()`: creates a new **memory region** (i.e. a new mapping) in the address space of caller
- The **memory region** could be map to *a file/device*, or else *the memory* (not allocating memory from the heap)
- Inherited through `fork()`
#### 1.3 Memory-mapped I/O
> A **memory-mapped** file: a *memory buffer* created by `mmap()` that maps to a file, from/to which reading/writing corresponds to the same bytes of the file
```c!
void *mmap(void *addr, size_t len, int prot, int flags, int fd, off_t off);
```
- returns the starting address or `MAP_FAILED`
- `addr`: specify the starting address
- NULL: kernel picks for ya
- not NULL: works as a *hint*
- `len` > 0; `off` is offset
- `fd` is fd of the file/device; closing it does not unmap the region
- `prot`: specify the memory protection (`PROT_[READ/WRITE/EXEC/NONE]`); `SIGSEGV` generated if invalid accesses
- `mprotect()` can modify the permissions later
- `flags`:
- `MAP_SHARED`: a shared mapping; updates to this mapping is visible to other processes and **carried through** to the respective file
- `MAP_PRIVATE`: a private COW mapping; updates to this mapping is invisible to other processes and **not carried through** to the respective file
- `MAP_FIXED`: create exactly at `addr`, or else failed
- `MAP_ANONYMOUS`: mapping is backed by memory and `fd` should be -1; contents of the mapping are init to 0.
- cf. open `/dev/zero`
- `munmap()`: unmap memory-mapped region. The contents of the MMR will not be write back to the file on the disk.
### 2. Program Compilation
> source program/code -> machine instructions/code and data, packaged into an **object file**
#### 2.1 Object Files
Format (linux): ELF (Executable and Linkable Format)
- ELF Header: specifies info about the file, eg. file type, entry point.
- `.symtab`: a symbol table about functions and global vars
- one entry per symbol, including type (func/var), size, name
- `.rel.text`: *relocation entries* for code. A list of locations in `.text` that need to be modified during linking, eg. calling an external function, referencing global vars
- `.rel.data`: *relocation entries* for data. Info for any global vars that are referenced or defined by the OF
Types
- **Executable** obj file: can be copied directly into memory and executed
- **Relocatable** obj file: can be combined w/ other relocatable OFs at compile time to create an executable
- **Shared** obj file: a special type of relocatable that can be loaded into memory and linked **dynamically** at either load/run time
#### 2.2 Linking
- **Dynamic**: Combining various relocatables into a single executable ==at program load time (by loader)/runtime (by program)==
- **Static**: Combining various files into a single executable ==at compile time (by linker)==
Pros: Code maintainance, splitting big programs into smaller modules.
#### 2.3 Static Linking
- **Linker**: make sure the executable can access **symbols**
- **Symbols**: consider an obj file *F*
- Global (linker) symbols: defined by *F*, can be referenced by others
- Global externals: referenced by *F*, defined by other obj files
- Local symbols: defined by *F*, can **NOT** be referenced by others
- **Static Libraries**: consist of multiple related OFs, and are stored as the *archieve* format `.a`
- `ar` can be used to create a static lib
- linker copies only the OFs that are referenced by program
##### Step 1: Symbol Resolution
Linker associates each symbol ref with one def by searching in the symbol table, and copies from static libraries (only the OFs that are referenced by program).
##### Stpe 2 Symbol Relocation
1. Relocating sections and symbol def.
i. Merge sections from each OFs into one
ii. Assign runtime mem addr to the new sections
2. Relocating symbol ref within sections
i. linker checks *relocation entries* in `.rel` and updates the symbol ref in program to correct runtime addr
#### 2.4 Dynamic Linking
> Cons of Static Linking:
> - Static libraries requires maintenance and update -> relink every time it is updated!!
> - Libraries are copied to executables -> bigger size
- **Shared Libraries**: can be loaded at an arbitrary memory and linked at load/runtime by **Dynamic Linker**; its mapping can be shared by multiple processes. (`.so`, i.e. shared OFs)
| *Linking* | Dynamic | Static |
| -------- | -------- | -------- |
| *Pros* | Size efficient, simple maintenance | Less runtime overhead |
#### 2.5 Loading
- **Loader**: copies code and data in the executable from disk to *memory* (i.e. process address space) and jump to its *entry point*
- For C, the entry point is `_start()`, which calls `main()` later
- `exec()` invokes kernel to load respective program into PAS