NTU CSIE Sysyem Programming Final Notes

I. Signals

0. Basics

asynchronous notifications of an event to processes
anytime, any order (no wait)
can be blocked or ignored; else kernel delivers it to process

1. Signal Handler

Disposition of a signal: (1) ignore (2) default action (3) catch w/ per-process signal handler

SIGKILL/SIGSTOP: always default action

inherited thru fork(); reset to default thru exec()
Core Dump: consists of the recorded program states, generated when it crashes or exits abnormally
signal() sets the disposition of signum to handler:

sighandler_t signal(int signum, sighandler_t handler);
//handler: a function ptr / SIG_IGN(ignore) / SIG_DFL(default)

pause() suspends a process until a signal is delivered (i.e. waits for a signal to be caught)

2. Interrupted System Calls

Case 1: sys call returns error (EINTR)
Case 2: sys call automatically restarts
- if SA_RESTART flag (
  $§$ 6) is set on a signal handler
- some specific sys calls (eg. ioctl(), read(), wait(), …), which can be disabled per-signal

3. Reentrant Functions

The signal handler does not know where the process is executing as the signal was caught.

a.k.a. async-signal safe func.
can be safely called recursively
non-reentrant:
- static/global memory (variables & data structure)
  - standard I/O library
- malloc()/free()
- modifies errno w/o backups
- calls non-reentrant functions

4. Pending Signals & Signal Mask

4.1

blocking (pending): a process can block a signal from delivery until it is either (1) unblocked or (2) ignored

4.2 Signal Mask

(per-process) defines the set of signals currently blocked

Signal Sets:
- sig[empty/fill]set(): initialization
- sig[add/del]set()
- sigismember(): test if a signal is in the set
sigprocmask(): modify the signal mask
- To get the current sig mask: sigprocmask(-, NULL, &set)

int sigprocmask(int how, const sigset_t *set, sigset_t *oset);
// how: SIG_[BLOCK(add)/UNBLOCK(del)/SETMASK(set new)]
// set: NULL-> ignore 'how'
// oset: store prev sig mask

sigpending(): returns currently pending signals

5. Sending Signals

kill(pid, signo): send signal to a group of processes:
- real/effective UID of sender == receiver
- signo == 0: check if a process exists (errno ESRCH if not exist), not atomic
raise(): send signal to self
alarm(seconds): sets a timer that generates SIGALRM (only 1 timer per process)
- prev alarm is replaced by new values and functions returns # of seconds left for that alarm
- default disposition for SIGALARM: terminating the process
- seconds == 0: cancel all pending alarms

Race Condition

eg. implement a sleep1() func w/ alarm & pause

Happens when a signal is received & handled before pause() is called

abort(): unblocks SIGABRT and raiseSIGABRT for the calling process
- Always terminates the process. If SIGABRT is ignored or caught by a handler, abort() will restore the default disposition for it and raise the signal again

6. `sigaction()`

to examine/modify the action on a particular signal.

signal() (
$§$ 1) can be implemented using sigaction()

int sigaction(int signo, const struct sigaction *act, struct sigaction *oact);
struct sigaction{
    void (*sa_handler)(int); // addr of signal-handling func
    sigset_t sa_mask;
    int sa_flags; // "SA_..." flags
    void (*sa_sigaction)(int, siginfo_t *, void *);
}

sa_mask: specifies the additional signals to block during the execution of the signal handler,

Before invoking a signal handler,

the current signal being delivered
signals specified in sa_mask

are added to the signal mask.

sa_sigaction:
- used instead of sa_handler w/ SA_SIGINFO on
- siginfo_t contains info about why the signal was generated
struct sigaction: handler, mask, flags (SA_)

7. Use Case

parent-child synchronization

properties	`fork()`	`exec()`
signal mask	Inherited from the parent	unchanged
pending signals/alarms	emptied	unchanged
signal action	Inherited from the parent	not ignored -> default; ignored -> ignored

II. Nonlocal Jumps

0. Stack Frame

Calling Convention: defines that for a function, how args are placed, where return values go, what register func may use, how to allocate local variables
- caller/callee (currently-executing func is a callee)
Stack Frame: a new room in the stack which is reserved when a func is called and reclaimed when it returns. It is consist of:
- return address (to the caller)
- arguments (passed by the caller)
- saved registers
- automatic variables: local variables which are allocated/deallocated automatically when program flow enters/leaves their scope

1. Nonlocal Jumps

1.1 Objectives

(1) error-handling in deeply nested function calls
(2) signal handlers can goto a specific point

In C, goto a label in another function is not valid. Nonlocal jumps (aka nonlocal gotos) enable program control transfer to an arbitrary location.

1.2 Implementation

setjmp(env): dynamically set the target to which control will be transfered
- the calling environment (stack & registers) are saved in the env
- returns
  - 0 when called directly
  - val when returning from a call to longjump() (fake return)
longjmp(env, val): performs the transfer
- rewind the environment w/ env

Variables

Storage Location	Memory	CPU or float-point registers
Type	global, volatile, static	compiled with optimization: local(auto), register
Value	remains the same as of the time of `longjmp()`	restored to when `setjump()` was called

Warning

We can only longjmp() to the env of a func that has been called but not yet completed!

Or else its stack frame is reclaimed and not in the call stack

Race Condition

(1) from
$§$ I.5
(2) read() w/ a timeout by alarm()

can both be solved w/ nonlocal jumps

2. Signal Continuation

Problem 1: setjmp() & longjmp() does not save/restore the signal mask (not specified in POSIX)

sigsetjmp() & siglongjmp(): support saving & restoring signal masks to env (of type sigjmp_buf)

Problem 2: updating signal mask and then pause() is not atomic

sigsuspend(): updates the sig mask and pauses the process in a atomic operation
- returns after the signal handler returns
- after returning, restores the sig mask to the value before the call to sigsuspend()

III. Threads

0. Basics

lightweight process
a process can consist of multiple threads of control, performing multiple tasks within the same environment
per-thread:
- TID
- stack
- signal mask
- errno
shared by threads in the same process:
- PID, UID, GID
- environment variables
- text, heap
- open fd table
- file record locks
- signal handlers, signals/alarms
benefits: improves responsiveness & throughput, resource sharing, computing-intensive tasks, …

Stack	Heap
local variables / function args	gloabal & static variables
per-thread	per-process

1. Multithreading Models

User threads: managed w/o OS kernel support
Kernel threads: supported & managed directly by kernel

Many-to-one: Maps many user threads to one kernel thread (which doesn't know about multi-thread)
- Pros: no modification in the kernel
- Cons: other threads can be blocked by slow system call of a thread; threads cannot run in parallel on multiple CPUs
One-to-one: Maps each user thread to a respective kernel thread (this model is implemented in Linux)
- Pros: multiple threads can run in parallel and not block each other
- Cons: thread creation & destroy at higher cost (more kernel thread)

2. POSIX Threads

2.0 Identification

pthread_self(): TID
pthread_equal(TID1, TID2): compare TIDs (not == !)

UNIX implementations may use a ptr to data structer for pthread_t

2.1 Creation

a process starts in the main thread
a master thread would create worker(peer) threads
no guarantees of execution order among threads

int pthread_create(pthread_t *tidp, pthread_attr_t *attr, void *(*start_rtn)(void*), void *arg);

start_rtn is the start routine function that a new thread starts running at, and arg is passed to it.

2.2 Termination

Termintating a process and all of its threads:
- any thread calls exit(), _Exit(), _exit()
- a signal with disposition set to termination
- return from main() in the main thread
Termintating a specific thread:
- return from the thread's start routine function
- calls pthread_exit() itself
- others call pthread_cancel()
pthread_exit(void *rval_ptr) returns a exit status rval_ptr to another thread that calls pthread_join()
The resources of a process are not released until the last thread of it terminates

2.3 Join/Detach

joinable thread

can be reaped and killed by others; resources are not freed until it is reaped

pthread_join(): caller reaps exit status (blocking)
pthread_detach(): kernel reaps resources & exit status

join or detach an already detached thread -> undefined

2.4 Cancellation

pthread_cancel(tid) (==pthread_exit(PTHREAD_CANCELED)): just a request; does not block

The thread being canceled can ignore or control the cancellation by:

cancelability state: pthread_setcancelstate()
- ENABLE
- DISABLE: request is queued until ENABLE
cancelability type: pthread_setcanceltype()
- ASYNCHRONOUS: can be canceled anytime
- DEFFERED: the request is pending until the thread next call a cancellation point func

pthread_testcancel(): immediately creates a cancellation point and respond to currently pending cancel requests; no effect if CANCEL_DISABLE or no request pending

2.5 Attributes

pthread_attr_[init/destroy](): initialize/disinitialize the structure pthread_attr_t
pthread_attr_[set/get]detachstate(): set/check if the thread should start in detached state or joinable

2.6 Clean-up Handlers

pthread_cleanup_[push/pop](): register a cleanup handler (recorded in stack)
called when:
1. pthread_exit()
2. responding to pthread_cancel() request
3. pthread_cleaup_pop(execute) w/ nonzero execute

return doesn't trigger clean-up handlers!

2.7 Process vs Thread

Process	Thread
`fork`	`pthread_create`
`exit`	`pthread_exit`
`waitpid`	`pthread_join`
`atexit`	`pthread_cleanup_push`
`getpid`	`pthread_self`
`abort`	`pthread_cancel`

3. Thread Synchronization

non-atomic operation will have inconsistent result if it is made at the same time by many threads

Goal: Sequential Consistency

3.1 Mutex (Mutual Exclusion Interface)

basically serves as a lock
other thread will block until mutex is released as only one thread could acquire a mutex at a time
pthread_mutex_[init/destroy]()
- pthread_mutex_init(mutex, attr)
  $\approx$ mutex = PTHREAD_MUTEX_INITIALIZER
_lock/trylock/unlock()
- lock will block if mutex is locked; trylock returns EBUSY immediately
- unlock some lock the caller does not hold is undefined

Attributes

pthread_mutexattr_[init/destroy]()
_[get/set]pshared(): set/check if the mutex is shared between processes

Deadlock Cases & Avoidance

One tries to lock the same mutex twice
$\to$ set attr PTHREAD_MUTEX_ERRORCHECK
One tries to lock mutexes in the opposite order from another
- All threads should employ the same lock ordering
- Use pthread_mutex_trylock in a polling fashion

Remarks:

coarse-grained <––––-> fine-grained
bad concurrency <––––-> high complexity

3.2 Reader-Writer Locks

concurrent read & exclusive write
shared read lock & exclusive write lock

pthread_rwlock_[init/destroy]()
- pthread_rwlock_init(rwlock, attr)
  $\approx$ rwlock= PTHREAD_RWLOCK_INITIALIZER
_[rd/wr]lock(), _unlock(), _try[rd/wr]lock()

The functions are basically the same as mutex.

3.3 Condition Variables

Threads blocks, atomically acquire a mutexblock, and then change/evaluate a condition (event) until the condition is satisfied

pthread_cond_[init/destroy]()
- pthread_cond_init(cond, attr)
  $\approx$ cond = PTHREAD_COND_INITIALIZER
_wait(cond, mutex): blocks caller, wait by specified cond. variable to be signaled or broadcast
- atomatically unlocks mutex and wait
- when returns the caller locks mutex
- _timedwait(cond, mutex, timespec): specifies a wait time and returns an error if time is up before wait finishes
_signal/broadcast(cond): effective only when there's any thread blocked on cond
- signal wakes up (at least) 1 thread
- broadcast wakes up all threads blocked on the specified cond
- threads calling these 2 func should locks the mutex in _wait() first for predictable scheduling behaviour

Attributes

pthread_condattr_init/destroy()
_get/setpshared(): set/check if the mutex is shared between processes

Spurious Wakeup

When a thread is waken up from signaled cond variable but the condition it is waiting for is not satisfied. Possible reasons:

between the cond variable is signaled & the waiting thread ran, another thread changed the cond
OS implementation

Hence the iteratively checking.

4. Other Stuff & Threads

4.1 Threads & `fork`

After fork(), only the caller thread exists in the child. Child inherits every mutex, rw lock and cond var.

pthread_atfork(): registers fork handlers, eg. to unlock mutexes locked by other threads

4.2 Threads & Signals

Signal mask: per-thread (inherited from the creator)
- pthread_sigmask()
- sigwait(set, signo): atomatically unblocks (so must blocks beforehand, or else undefined) & wait for 1 or more signals. Signal mask is restored before return.
  - Does not trigger signal handlers
  - Multiple threads sigwait()ing the same signal
    $\to$ only 1 one of them will return at delivery
Signal dispositions/actions: per-process, and any thread can modify them
- pthread_kill(thread, signo): send a signal signo to speicified thread

4.3 Threads & I/O

Since file descriptors are shared between threads, cocurrent access to the same fd cause race condition

\to

use p[read/write]()

4.4 Thread-safe vs Reentrant

Thread-safe: can be safely called by multiple threads concurrently
Reentrant: can be recursively called

4.5 Pthread Spinlock

A busy-waiting lock alternative. (Acquiring mutex is blocked by sleeping)

IV. Process ENV

1. Memory Addressing

Physical Address: On RAM, each byte has a unique PA
Virtual Address: CPU access RAM by using VA through a translator MMU (Memory Management Unit)

1.1 Process Address Space

Address Space (per-process): an array of contiguous byte-size VAs.
Memory Mapping:

The OS kernel connects a VA to a PA and stores the mapping in a page table
VAs (in [0, MAX_ADDR]) are either mapped/unmapped
Same VA in different process could map to:
- Same PA: shared memory, eg. COW after fork()
- Different PAs: eg. after exec()

1.2 Memory Layout of a C Program

Memory Allocation: malloc(), realloc(), free() -> (de)allocates memory from the heap making NO system call
- sbrk() sys call: to expand/contract the heap
mmap(): creates a new memory region (i.e. a new mapping) in the address space of caller
- The memory region could be map to a file/device, or else the memory (not allocating memory from the heap)
- Inherited through fork()

1.3 Memory-mapped I/O

A memory-mapped file: a memory buffer created by mmap() that maps to a file, from/to which reading/writing corresponds to the same bytes of the file

void *mmap(void *addr, size_t len, int prot, int flags, int fd, off_t off);

returns the starting address or MAP_FAILED
addr: specify the starting address
- NULL: kernel picks for ya
- not NULL: works as a hint
len > 0; off is offset
fd is fd of the file/device; closing it does not unmap the region
prot: specify the memory protection (PROT_[READ/WRITE/EXEC/NONE]); SIGSEGV generated if invalid accesses
- mprotect() can modify the permissions later
flags:
- MAP_SHARED: a shared mapping; updates to this mapping is visible to other processes and carried through to the respective file
- MAP_PRIVATE: a private COW mapping; updates to this mapping is invisible to other processes and not carried through to the respective file
- MAP_FIXED: create exactly at addr, or else failed
- MAP_ANONYMOUS: mapping is backed by memory and fd should be -1; contents of the mapping are init to 0.
  - cf. open /dev/zero
munmap(): unmap memory-mapped region. The contents of the MMR will not be write back to the file on the disk.

2. Program Compilation

source program/code -> machine instructions/code and data, packaged into an object file

2.1 Object Files

Format (linux): ELF (Executable and Linkable Format)

ELF Header: specifies info about the file, eg. file type, entry point.
.symtab: a symbol table about functions and global vars
- one entry per symbol, including type (func/var), size, name
.rel.text: relocation entries for code. A list of locations in .text that need to be modified during linking, eg. calling an external function, referencing global vars
.rel.data: relocation entries for data. Info for any global vars that are referenced or defined by the OF

Types

Executable obj file: can be copied directly into memory and executed
Relocatable obj file: can be combined w/ other relocatable OFs at compile time to create an executable
Shared obj file: a special type of relocatable that can be loaded into memory and linked dynamically at either load/run time

2.2 Linking

Dynamic: Combining various relocatables into a single executable at program load time (by loader)/runtime (by program)
Static: Combining various files into a single executable at compile time (by linker)

Pros: Code maintainance, splitting big programs into smaller modules.

2.3 Static Linking

Linker: make sure the executable can access symbols
Symbols: consider an obj file F
- Global (linker) symbols: defined by F, can be referenced by others
- Global externals: referenced by F, defined by other obj files
- Local symbols: defined by F, can NOT be referenced by others
Static Libraries: consist of multiple related OFs, and are stored as the archieve format .a
- ar can be used to create a static lib
- linker copies only the OFs that are referenced by program

Step 1: Symbol Resolution

Linker associates each symbol ref with one def by searching in the symbol table, and copies from static libraries (only the OFs that are referenced by program).

Stpe 2 Symbol Relocation

Relocating sections and symbol def.
i. Merge sections from each OFs into one
ii. Assign runtime mem addr to the new sections
Relocating symbol ref within sections
i. linker checks relocation entries in .rel and updates the symbol ref in program to correct runtime addr

2.4 Dynamic Linking

Cons of Static Linking:

Static libraries requires maintenance and update -> relink every time it is updated!!

Libraries are copied to executables -> bigger size

Shared Libraries: can be loaded at an arbitrary memory and linked at load/runtime by Dynamic Linker; its mapping can be shared by multiple processes. (.so, i.e. shared OFs)

Linking	Dynamic	Static
Pros	Size efficient, simple maintenance	Less runtime overhead

2.5 Loading

Loader: copies code and data in the executable from disk to memory (i.e. process address space) and jump to its entry point
- For C, the entry point is _start(), which calls main() later
exec() invokes kernel to load respective program into PAS

NTU CSIE Sysyem Programming Final Notes

I. Signals

0. Basics

1. Signal Handler

2. Interrupted System Calls

3. Reentrant Functions

4. Pending Signals & Signal Mask

4.1

4.2 Signal Mask

5. Sending Signals

Race Condition

6. sigaction()

7. Use Case

II. Nonlocal Jumps

0. Stack Frame

1. Nonlocal Jumps

1.1 Objectives

1.2 Implementation

Variables

Warning

Race Condition

2. Signal Continuation

III. Threads

0. Basics

1. Multithreading Models

2. POSIX Threads

2.0 Identification

2.1 Creation

2.2 Termination

2.3 Join/Detach

joinable thread

2.4 Cancellation

2.5 Attributes

2.6 Clean-up Handlers

2.7 Process vs Thread

3. Thread Synchronization

3.1 Mutex (Mutual Exclusion Interface)

Attributes

Deadlock Cases & Avoidance

Remarks:

3.2 Reader-Writer Locks

3.3 Condition Variables

Attributes

Spurious Wakeup

4. Other Stuff & Threads

4.1 Threads & fork

4.2 Threads & Signals

4.3 Threads & I/O

4.4 Thread-safe vs Reentrant

4.5 Pthread Spinlock

IV. Process ENV

1. Memory Addressing

1.1 Process Address Space

1.2 Memory Layout of a C Program

1.3 Memory-mapped I/O

2. Program Compilation

2.1 Object Files

2.2 Linking

2.3 Static Linking

Step 1: Symbol Resolution

Stpe 2 Symbol Relocation

2.4 Dynamic Linking

2.5 Loading

Read more

拉人更輕鬆 -- g0v 基礎建設更新計畫

Self Intro

6. `sigaction()`

4.1 Threads & `fork`