NTU CSIE Sysyem Programming Final Notes

I. Signals

0. Basics

  • asynchronous notifications of an event to processes
  • anytime, any order (no wait)
  • can be blocked or ignored; else kernel delivers it to process

1. Signal Handler

Disposition of a signal: (1) ignore (2) default action (3) catch w/ per-process signal handler

SIGKILL/SIGSTOP: always default action

  • inherited thru fork(); reset to default thru exec()
  • Core Dump: consists of the recorded program states, generated when it crashes or exits abnormally
  • signal() sets the disposition of signum to handler:
sighandler_t signal(int signum, sighandler_t handler);
//handler: a function ptr / SIG_IGN(ignore) / SIG_DFL(default)
  • pause() suspends a process until a signal is delivered (i.e. waits for a signal to be caught)

2. Interrupted System Calls

  • Case 1: sys call returns error (EINTR)
  • Case 2: sys call automatically restarts
    • if SA_RESTART flag (
      §
      6) is set on a signal handler
    • some specific sys calls (eg. ioctl(), read(), wait(), ), which can be disabled per-signal

3. Reentrant Functions

The signal handler does not know where the process is executing as the signal was caught.

  • a.k.a. async-signal safe func.
  • can be safely called recursively
  • non-reentrant:
    • static/global memory (variables & data structure)
      • standard I/O library
    • malloc()/free()
    • modifies errno w/o backups
    • calls non-reentrant functions

4. Pending Signals & Signal Mask

4.1

Generated
Blocked/Pending
Ignored
Delivered
Life of a Signal
  • blocking (pending): a process can block a signal from delivery until it is either (1) unblocked or (2) ignored

4.2 Signal Mask

(per-process) defines the set of signals currently blocked

  • Signal Sets:
    • sig[empty/fill]set(): initialization
    • sig[add/del]set()
    • sigismember(): test if a signal is in the set
  • sigprocmask(): modify the signal mask
    • To get the current sig mask: sigprocmask(-, NULL, &set)
int sigprocmask(int how, const sigset_t *set, sigset_t *oset);
// how: SIG_[BLOCK(add)/UNBLOCK(del)/SETMASK(set new)]
// set: NULL-> ignore 'how'
// oset: store prev sig mask
  • sigpending(): returns currently pending signals

5. Sending Signals

  • kill(pid, signo): send signal to a group of processes:
    • real/effective UID of sender == receiver
    • signo == 0: check if a process exists (errno ESRCH if not exist), not atomic
  • raise(): send signal to self
  • alarm(seconds): sets a timer that generates SIGALRM (only 1 timer per process)
    • prev alarm is replaced by new values and functions returns # of seconds left for that alarm
    • default disposition for SIGALARM: terminating the process
    • seconds == 0: cancel all pending alarms

Race Condition

eg. implement a sleep1() func w/ alarm & pause

Happens when a signal is received & handled before pause() is called

  • abort(): unblocks SIGABRT and raiseSIGABRT for the calling process
    • Always terminates the process. If SIGABRT is ignored or caught by a handler, abort() will restore the default disposition for it and raise the signal again

6. sigaction()

to examine/modify the action on a particular signal.

  • signal() (
    §
    1) can be implemented using sigaction()
int sigaction(int signo, const struct sigaction *act, struct sigaction *oact);
struct sigaction{
    void (*sa_handler)(int); // addr of signal-handling func
    sigset_t sa_mask;
    int sa_flags; // "SA_..." flags
    void (*sa_sigaction)(int, siginfo_t *, void *);
}
  • sa_mask: specifies the additional signals to block during the execution of the signal handler,

Before invoking a signal handler,

  • the current signal being delivered
  • signals specified in sa_mask

are added to the signal mask.

  • sa_sigaction:
    • used instead of sa_handler w/ SA_SIGINFO on
    • siginfo_t contains info about why the signal was generated
  • struct sigaction: handler, mask, flags (SA_)

7. Use Case

  • parent-child synchronization
properties fork() exec()
signal mask Inherited from the parent unchanged
pending signals/alarms emptied unchanged
signal action Inherited from the parent not ignored -> default; ignored -> ignored

II. Nonlocal Jumps

0. Stack Frame

  • Calling Convention: defines that for a function, how args are placed, where return values go, what register func may use, how to allocate local variables
    • caller/callee (currently-executing func is a callee)
  • Stack Frame: a new room in the stack which is reserved when a func is called and reclaimed when it returns. It is consist of:
    • return address (to the caller)
    • arguments (passed by the caller)
    • saved registers
    • automatic variables: local variables which are allocated/deallocated automatically when program flow enters/leaves their scope

1. Nonlocal Jumps

1.1 Objectives

(1) error-handling in deeply nested function calls
(2) signal handlers can goto a specific point

In C, goto a label in another function is not valid. Nonlocal jumps (aka nonlocal gotos) enable program control transfer to an arbitrary location.

1.2 Implementation

  • setjmp(env): dynamically set the target to which control will be transfered
    • the calling environment (stack & registers) are saved in the env
    • returns
      • 0 when called directly
      • val when returning from a call to longjump() (fake return)
  • longjmp(env, val): performs the transfer
    • rewind the environment w/ env
Variables
Storage Location Memory CPU or float-point registers
Type global, volatile, static compiled with optimization: local(auto), register
Value remains the same as of the time of longjmp() restored to when setjump() was called

Warning

We can only longjmp() to the env of a func that has been called but not yet completed!

Or else its stack frame is reclaimed and not in the call stack

Race Condition

(1) from

§ I.5
(2) read() w/ a timeout by alarm()

can both be solved w/ nonlocal jumps

2. Signal Continuation

Problem 1: setjmp() & longjmp() does not save/restore the signal mask (not specified in POSIX)

  • sigsetjmp() & siglongjmp(): support saving & restoring signal masks to env (of type sigjmp_buf)

Problem 2: updating signal mask and then pause() is not atomic

  • sigsuspend(): updates the sig mask and pauses the process in a atomic operation
    • returns after the signal handler returns
    • after returning, restores the sig mask to the value before the call to sigsuspend()

III. Threads

0. Basics

  • lightweight process
  • a process can consist of multiple threads of control, performing multiple tasks within the same environment
  • per-thread:
    • TID
    • stack
    • signal mask
    • errno
  • shared by threads in the same process:
    • PID, UID, GID
    • environment variables
    • text, heap
    • open fd table
    • file record locks
    • signal handlers, signals/alarms
  • benefits: improves responsiveness & throughput, resource sharing, computing-intensive tasks,
Stack Heap
local variables / function args gloabal & static variables
per-thread per-process

1. Multithreading Models

User threads: managed w/o OS kernel support
Kernel threads: supported & managed directly by kernel

  • Many-to-one: Maps many user threads to one kernel thread (which doesn't know about multi-thread)
    • Pros: no modification in the kernel
    • Cons: other threads can be blocked by slow system call of a thread; threads cannot run in parallel on multiple CPUs
  • One-to-one: Maps each user thread to a respective kernel thread (this model is implemented in Linux)
    • Pros: multiple threads can run in parallel and not block each other
    • Cons: thread creation & destroy at higher cost (more kernel thread)

2. POSIX Threads

2.0 Identification

pthread_self(): TID
pthread_equal(TID1, TID2): compare TIDs (not == !)

  • UNIX implementations may use a ptr to data structer for pthread_t

2.1 Creation

  • a process starts in the main thread
  • a master thread would create worker(peer) threads
  • no guarantees of execution order among threads
int pthread_create(pthread_t *tidp, pthread_attr_t *attr, void *(*start_rtn)(void*), void *arg);
  • start_rtn is the start routine function that a new thread starts running at, and arg is passed to it.

2.2 Termination

  • Termintating a process and all of its threads:

    • any thread calls exit(), _Exit(), _exit()
    • a signal with disposition set to termination
    • return from main() in the main thread
  • Termintating a specific thread:

    • return from the thread's start routine function
    • calls pthread_exit() itself
    • others call pthread_cancel()
  • pthread_exit(void *rval_ptr) returns a exit status rval_ptr to another thread that calls pthread_join()

  • The resources of a process are not released until the last thread of it terminates

2.3 Join/Detach

joinable thread

can be reaped and killed by others; resources are not freed until it is reaped

pthread_join(): caller reaps exit status (blocking)
pthread_detach(): kernel reaps resources & exit status

  • join or detach an already detached thread -> undefined

2.4 Cancellation

pthread_cancel(tid) (==pthread_exit(PTHREAD_CANCELED)): just a request; does not block

The thread being canceled can ignore or control the cancellation by:

  • cancelability state: pthread_setcancelstate()
    • ENABLE
    • DISABLE: request is queued until ENABLE
  • cancelability type: pthread_setcanceltype()
    • ASYNCHRONOUS: can be canceled anytime
    • DEFFERED: the request is pending until the thread next call a cancellation point func

pthread_testcancel(): immediately creates a cancellation point and respond to currently pending cancel requests; no effect if CANCEL_DISABLE or no request pending

2.5 Attributes

  • pthread_attr_[init/destroy](): initialize/disinitialize the structure pthread_attr_t
  • pthread_attr_[set/get]detachstate(): set/check if the thread should start in detached state or joinable

2.6 Clean-up Handlers

  • pthread_cleanup_[push/pop](): register a cleanup handler (recorded in stack)
  • called when:
    1. pthread_exit()
    2. responding to pthread_cancel() request
    3. pthread_cleaup_pop(execute) w/ nonzero execute

return doesn't trigger clean-up handlers!

2.7 Process vs Thread

Process Thread
fork pthread_create
exit pthread_exit
waitpid pthread_join
atexit pthread_cleanup_push
getpid pthread_self
abort pthread_cancel

3. Thread Synchronization

non-atomic operation will have inconsistent result if it is made at the same time by many threads

Goal: Sequential Consistency

3.1 Mutex (Mutual Exclusion Interface)

  • basically serves as a lock
  • other thread will block until mutex is released as only one thread could acquire a mutex at a time
  • pthread_mutex_[init/destroy]()
    • pthread_mutex_init(mutex, attr)
      mutex = PTHREAD_MUTEX_INITIALIZER
  • _lock/trylock/unlock()
    • lock will block if mutex is locked; trylock returns EBUSY immediately
    • unlock some lock the caller does not hold is undefined
Attributes
  • pthread_mutexattr_[init/destroy]()
  • _[get/set]pshared(): set/check if the mutex is shared between processes
Deadlock Cases & Avoidance
  1. One tries to lock the same mutex twice
    set attr PTHREAD_MUTEX_ERRORCHECK
  2. One tries to lock mutexes in the opposite order from another
    • All threads should employ the same lock ordering
    • Use pthread_mutex_trylock in a polling fashion
Remarks:

coarse-grained <-> fine-grained
bad concurrency <-> high complexity

3.2 Reader-Writer Locks

concurrent read & exclusive write
shared read lock & exclusive write lock

  • pthread_rwlock_[init/destroy]()
    • pthread_rwlock_init(rwlock, attr)
      rwlock= PTHREAD_RWLOCK_INITIALIZER
  • _[rd/wr]lock(), _unlock(), _try[rd/wr]lock()

The functions are basically the same as mutex.

3.3 Condition Variables

Threads blocks, atomically acquire a mutexblock, and then change/evaluate a condition (event) until the condition is satisfied

  • pthread_cond_[init/destroy]()

    • pthread_cond_init(cond, attr)
      cond = PTHREAD_COND_INITIALIZER
  • _wait(cond, mutex): blocks caller, wait by specified cond. variable to be signaled or broadcast

    • atomatically unlocks mutex and wait
    • when returns the caller locks mutex
    • _timedwait(cond, mutex, timespec): specifies a wait time and returns an error if time is up before wait finishes
  • _signal/broadcast(cond): effective only when there's any thread blocked on cond

    • signal wakes up (at least) 1 thread
    • broadcast wakes up all threads blocked on the specified cond
    • threads calling these 2 func should locks the mutex in _wait() first for predictable scheduling behaviour
_wait
Atomic
Waken up by notification
Yes
No
Mutex acquired?
Returns
sleep
Unlocks mutex
Locks mutex

Iteratively checks a cond

Calls _wait
Unlocks mutex
Common Usage of Conditional Variable
Attributes
  • pthread_condattr_init/destroy()
  • _get/setpshared(): set/check if the mutex is shared between processes
Spurious Wakeup

When a thread is waken up from signaled cond variable but the condition it is waiting for is not satisfied. Possible reasons:

  • between the cond variable is signaled & the waiting thread ran, another thread changed the cond
  • OS implementation

Hence the iteratively checking.

4. Other Stuff & Threads

4.1 Threads & fork

After fork(), only the caller thread exists in the child. Child inherits every mutex, rw lock and cond var.

  • pthread_atfork(): registers fork handlers, eg. to unlock mutexes locked by other threads

4.2 Threads & Signals

  • Signal mask: per-thread (inherited from the creator)

    • pthread_sigmask()
    • sigwait(set, signo): atomatically unblocks (so must blocks beforehand, or else undefined) & wait for 1 or more signals. Signal mask is restored before return.
      • Does not trigger signal handlers
      • Multiple threads sigwait()ing the same signal
        only 1 one of them will return at delivery
  • Signal dispositions/actions: per-process, and any thread can modify them

    • pthread_kill(thread, signo): send a signal signo to speicified thread

4.3 Threads & I/O

Since file descriptors are shared between threads, cocurrent access to the same fd cause race condition

use p[read/write]()

4.4 Thread-safe vs Reentrant

Thread-safe: can be safely called by multiple threads concurrently
Reentrant: can be recursively called

4.5 Pthread Spinlock

A busy-waiting lock alternative. (Acquiring mutex is blocked by sleeping)


IV. Process ENV

1. Memory Addressing

  • Physical Address: On RAM, each byte has a unique PA
  • Virtual Address: CPU access RAM by using VA through a translator MMU (Memory Management Unit)
VA
PA
CPU
MMU
RAM

1.1 Process Address Space

Address Space (per-process): an array of contiguous byte-size VAs.
Memory Mapping:

  • The OS kernel connects a VA to a PA and stores the mapping in a page table
  • VAs (in [0, MAX_ADDR]) are either mapped/unmapped
  • Same VA in different process could map to:
    • Same PA: shared memory, eg. COW after fork()
    • Different PAs: eg. after exec()

1.2 Memory Layout of a C Program

  • Memory Allocation: malloc(), realloc(), free() -> (de)allocates memory from the heap making NO system call

    • sbrk() sys call: to expand/contract the heap
  • mmap(): creates a new memory region (i.e. a new mapping) in the address space of caller

    • The memory region could be map to a file/device, or else the memory (not allocating memory from the heap)
    • Inherited through fork()

1.3 Memory-mapped I/O

A memory-mapped file: a memory buffer created by mmap() that maps to a file, from/to which reading/writing corresponds to the same bytes of the file

void *mmap(void *addr, size_t len, int prot, int flags, int fd, off_t off);
  • returns the starting address or MAP_FAILED
  • addr: specify the starting address
    • NULL: kernel picks for ya
    • not NULL: works as a hint
  • len > 0; off is offset
  • fd is fd of the file/device; closing it does not unmap the region
  • prot: specify the memory protection (PROT_[READ/WRITE/EXEC/NONE]); SIGSEGV generated if invalid accesses
    • mprotect() can modify the permissions later
  • flags:
    • MAP_SHARED: a shared mapping; updates to this mapping is visible to other processes and carried through to the respective file
    • MAP_PRIVATE: a private COW mapping; updates to this mapping is invisible to other processes and not carried through to the respective file
    • MAP_FIXED: create exactly at addr, or else failed
    • MAP_ANONYMOUS: mapping is backed by memory and fd should be -1; contents of the mapping are init to 0.
      • cf. open /dev/zero
  • munmap(): unmap memory-mapped region. The contents of the MMR will not be write back to the file on the disk.

2. Program Compilation

source program/code -> machine instructions/code and data, packaged into an object file

2.1 Object Files

Format (linux): ELF (Executable and Linkable Format)

  • ELF Header: specifies info about the file, eg. file type, entry point.
  • .symtab: a symbol table about functions and global vars
    • one entry per symbol, including type (func/var), size, name
  • .rel.text: relocation entries for code. A list of locations in .text that need to be modified during linking, eg. calling an external function, referencing global vars
  • .rel.data: relocation entries for data. Info for any global vars that are referenced or defined by the OF

Types

  • Executable obj file: can be copied directly into memory and executed
  • Relocatable obj file: can be combined w/ other relocatable OFs at compile time to create an executable
  • Shared obj file: a special type of relocatable that can be loaded into memory and linked dynamically at either load/run time

2.2 Linking

  • Dynamic: Combining various relocatables into a single executable at program load time (by loader)/runtime (by program)
  • Static: Combining various files into a single executable at compile time (by linker)

Pros: Code maintainance, splitting big programs into smaller modules.

2.3 Static Linking

  • Linker: make sure the executable can access symbols
  • Symbols: consider an obj file F
    • Global (linker) symbols: defined by F, can be referenced by others
    • Global externals: referenced by F, defined by other obj files
    • Local symbols: defined by F, can NOT be referenced by others
  • Static Libraries: consist of multiple related OFs, and are stored as the archieve format .a
    • ar can be used to create a static lib
    • linker copies only the OFs that are referenced by program
Step 1: Symbol Resolution

Linker associates each symbol ref with one def by searching in the symbol table, and copies from static libraries (only the OFs that are referenced by program).

Stpe 2 Symbol Relocation
  1. Relocating sections and symbol def.
    i. Merge sections from each OFs into one
    ii. Assign runtime mem addr to the new sections
  2. Relocating symbol ref within sections
    i. linker checks relocation entries in .rel and updates the symbol ref in program to correct runtime addr

2.4 Dynamic Linking

Cons of Static Linking:

  • Static libraries requires maintenance and update -> relink every time it is updated!!
  • Libraries are copied to executables -> bigger size
  • Shared Libraries: can be loaded at an arbitrary memory and linked at load/runtime by Dynamic Linker; its mapping can be shared by multiple processes. (.so, i.e. shared OFs)
Linking Dynamic Static
Pros Size efficient, simple maintenance Less runtime overhead

2.5 Loading

  • Loader: copies code and data in the executable from disk to memory (i.e. process address space) and jump to its entry point
    • For C, the entry point is _start(), which calls main() later
  • exec() invokes kernel to load respective program into PAS