System Programming notes - 3

# System Programming - 3

## Thread & Process

- process
    - a program in execution
    - an entity scheduled by the OS kernel for execution on CPUs
- thread
    - performs multiple tasks within the environment of a single process
    - threads of the same process share resources
    - properties of POSIX threads (pthreads)
    ![image](https://hackmd.io/_uploads/H1aABV4Ua.png)
- single-threaded v.s. multi-threaded process
![image](https://hackmd.io/_uploads/By7eUNEL6.png =500x)
- process v.s. threads
![image](https://hackmd.io/_uploads/ByjlLEVUp.png =500x)
- process v.s. threads memory layout
![image](https://hackmd.io/_uploads/BJVWU4V8p.png =300x)
- multithreading program
![image](https://hackmd.io/_uploads/Byp-IVNUp.png =500x)
- multithreading models
    - user threads
        - supported above the kernel and managed without OS kernel support
    - kernel threads
        - supported and managed directly by the OS kernel
    - many-to-one model (very few systems employ this now)
    ![image](https://hackmd.io/_uploads/SJwY44NLp.png =250x)
        - the kernel knows nothing about the multiple threads
        - a thread library does thread management and scheduling in user space
        - Pros:
            - no modification is required in the kernel to support multiple threads
        - Cons:
            - if one thread makes a slow (blocking) system call, all other threads are blocked
            - multiple threads cannot run in parallel on multiple CPUs
        - examples:
            - run-time system, thread table
    - one-to-one model (implemented in Linux and Windows)
    ![image](https://hackmd.io/_uploads/SJ45VVEIT.png =250x)
        - the kernel knows about threads and manages them
        - the creation of a user thread requires the creation of a corresponding kernel thread
        - the user-kernel thread mapping is fixed
        - Pros:
            - multiple threads can run in parallel on multiple CPUs and do not block each other
        - Cons:
            - higher cost than the many-to-one model in thread creation and destruction
- threads and I/O
    - concurrent accesses to the same file descriptor cause race conditions
    - solution: use `pread()` and `pwrite()`
- thread-safe v.s. reentrant
    - a function is thread-safe if it can be safely called by multiple threads at the same time
    - a function is reentrant if it can be safely re-entered before a previous invocation has completed (e.g., called recursively or from a signal handler)
    - a thread-safe function may not be reentrant
    ![image](https://hackmd.io/_uploads/BJ0jNVEUp.png =500x)
- POSIX threads
    - when a program runs, it starts as a process with a single thread (main thread), which creates more threads (peer threads) during its execution
    - the execution is interleaved among different threads
    ![image](https://hackmd.io/_uploads/SJ2nEE48a.png =400x)
- process v.s. thread primitives
![image](https://hackmd.io/_uploads/SJ76ENVIa.png =700x)

### pthread_self(), pthread_equal()

```c
#include <pthread.h>

// pthread_self() always succeeds
pthread_t pthread_self(void);

// Returns: nonzero if equal, 0 otherwise
int pthread_equal(pthread_t tid1, pthread_t tid2);
```

- `pthread_t`: thread ID
    - thread IDs only have significance within the context of the process to which they belong
    - POSIX.1 allows implementations to choose the data type of `pthread_t`, e.g., a non-negative integer or a structure
- `pthread_self()`: get the ID of the calling thread
- `pthread_equal()`: compare two thread IDs
    - cannot portably use `==`, because `pthread_t` may be a structure

### pthread_create()

```c
#include <pthread.h>

// Returns: 0 if OK, error number on failure
int pthread_create(pthread_t *tidp, const pthread_attr_t *attr,
                   void *(*start_rtn)(void *), void *arg);
```

- creates a new thread and specifies the procedure for it to run
- `tidp`: stores the new thread ID when the call succeeds
- `attr`: customizes thread attributes (NULL: default)
- `start_rtn`: address of the function (start routine) that the new thread starts running at
- `arg`: argument passed to `start_rtn()`
- Order of execution: no guarantee about which thread will run first (similar to `fork()`)
![image](https://hackmd.io/_uploads/HkpTVEN8T.png)

### pthread_exit(), pthread_join(), pthread_detach()

- thread termination
    - terminate both thread and process
        - any thread within a process calls `exit()`, `_Exit()`, or `_exit()`
        - a signal is sent to a thread (with the default action set to terminate the process)
        - return from `main()` in the main thread
    - terminate only the thread
        - return from the thread’s start routine
        - the thread calls `pthread_exit()`
        - canceled by another thread in the same process (calling `pthread_cancel()`)
- thread control
![image](https://hackmd.io/_uploads/H1SCVEV8a.png =500x)

```c
#include <pthread.h>

void pthread_exit(void *rval_ptr);

// Returns: 0 if OK, error number on failure
int pthread_join(pthread_t thread, void **rval_ptr);
```

- `pthread_exit()`: terminates the calling thread
    - `rval_ptr`: a return value that is available to another thread in the same process that calls `pthread_join()`
        - called the "return code" or "exit status"
    - when a thread terminates, the resources (e.g., file descriptors, etc.) of the process are not released if other threads are still alive!
        - released after the last thread in a process terminates
- `pthread_join()`: blocks the calling thread until `thread` exits
    - `rval_ptr` == NULL: if we are not interested in the thread’s return value
    - `rval_ptr` != NULL:
        - `*rval_ptr` = return code (exit status)
        - `*rval_ptr` = `PTHREAD_CANCELED` if the thread was canceled (discussed later)
- By default, threads are created with the `joinable` attribute:
    - a `joinable` thread can be reaped and killed by other threads. Its memory resources (stack, ...) are not freed until it is reaped by another thread.
    - to avoid memory leaks, each joinable thread should be reaped by another thread calling `pthread_join()`
- a thread can also be set to `detached` if you don’t want it to be joined

```c
#include <pthread.h>

// Returns: 0 if OK, error number on failure
int pthread_detach(pthread_t tid);
```

- when a thread is detached, its resources (e.g., exit status) are reclaimed automatically as soon as it terminates
- undefined behavior if `pthread_join()` is called to wait for a detached thread or `pthread_detach()` is called on an already detached thread

### pthread_cancel(), cancelstate, canceltype

```c
#include <pthread.h>

// Returns: 0 if OK, error number on failure
int pthread_cancel(pthread_t tid);
```

- requests that the thread `tid` in the same process be canceled (terminated)
    - behaves as if the thread `tid` had called `pthread_exit(PTHREAD_CANCELED)`
    - the thread `tid` can choose to ignore the request or control how it is canceled
        - by two attributes: cancelability state, cancelability type
- `pthread_cancel()` only makes the request: it does not block and wait for the thread to terminate
- cancellation steps:
    1. clean-up handlers are popped and called (discussed later)
    2. thread-specific data destructors are called
    3. the thread is terminated
- call `pthread_join()` to tell whether a thread was canceled (exit status = `PTHREAD_CANCELED`)

```c
#include <pthread.h>

// Returns: 0 if OK, error number on failure
int pthread_setcancelstate(int state, int *oldstate);
```

- sets the current cancelability state to `state`, and stores the previous state in `oldstate`
- `state` can be either
    - `PTHREAD_CANCEL_ENABLE` (default state): cancelable
    - `PTHREAD_CANCEL_DISABLE`: not cancelable.
      If a cancellation request is received, it is held pending until cancelability is enabled

```c
#include <pthread.h>

// Returns: 0 if OK, error number on failure
int pthread_setcanceltype(int type, int *oldtype);
```

- sets the current cancelability type to `type` and stores the previous cancelability type in `oldtype`
- `type` can be either
    - `PTHREAD_CANCEL_ASYNCHRONOUS`: can be canceled at any time
    - `PTHREAD_CANCEL_DEFERRED` (default type): the cancellation request is deferred (i.e., pending) until the thread next calls a function that is a "cancellation point"
- cancellation point
    - cancellation point (must)
    ![image](https://hackmd.io/_uploads/H1XmHVELa.png)
    - optional cancellation point (may)
    ![image](https://hackmd.io/_uploads/ryrgW8d86.png)
- cancellation cases:
    - cancelability state == `PTHREAD_CANCEL_DISABLE`:
        - the request is queued until the cancelability state becomes `PTHREAD_CANCEL_ENABLE`
    - cancelability state == `PTHREAD_CANCEL_ENABLE`:
        - cancelability type == `PTHREAD_CANCEL_ASYNCHRONOUS`:
            - the thread is canceled immediately
        - cancelability type == `PTHREAD_CANCEL_DEFERRED`:
            - the cancellation is deferred until the thread reaches a cancellation point

```c
#include <pthread.h>

void pthread_testcancel(void);
```

- creates a cancellation point within the calling thread

### Thread cleanup handlers

```c
#include <pthread.h>

void pthread_cleanup_push(void (*rtn)(void *), void *arg);
void pthread_cleanup_pop(int execute);
```

- `pthread_cleanup_push()`: registers a thread cleanup handler (`rtn`) with an argument (`arg`)
    - `rtn` will be called when the thread:
        - calls `pthread_exit()`
        - responds to a cancellation request
        - calls `pthread_cleanup_pop()` with `execute` != 0
    - more than one cleanup handler can be established for a thread
        - the handlers are recorded in a stack and later executed in last-in, first-out order
- `pthread_cleanup_pop()`: removes the handler at the top of the stack and executes it if `execute` != 0

### Thread attributes

```c
#include <pthread.h>

// Returns: 0 if OK, error number on failure
int pthread_attr_init(pthread_attr_t *attr);
int pthread_attr_destroy(pthread_attr_t *attr);
```

- `pthread_attr_init()`: initializes the `pthread_attr_t` object `attr` to the default values for all thread attributes
    - after this call, individual attributes of the object can be set using various related functions
    - `attr` can later be used by `pthread_create()`
- `pthread_attr_destroy()`: deinitializes `attr`
    - sets `attr` to invalid values
    - destroying `attr` has no effect on threads that were created using that object
- thread attributes
    - POSIX.1 thread attributes
    ![image](https://hackmd.io/_uploads/SyFHSVELa.png)
    - Linux functions to set individual thread attributes
    ![image](https://hackmd.io/_uploads/BJCSrEVLa.png)

```c
#include <pthread.h>

// Returns: 0 if OK, error number on failure
int pthread_attr_getdetachstate(const pthread_attr_t *attr, int *detachstate);
int pthread_attr_setdetachstate(pthread_attr_t *attr, int detachstate);
```

- `pthread_attr_setdetachstate()`: sets the detach state attribute of `attr`
    - `detachstate`: `PTHREAD_CREATE_DETACHED`, `PTHREAD_CREATE_JOINABLE`
- `pthread_attr_getdetachstate()`: stores the detach state attribute of `attr` in the buffer pointed to by `detachstate`

### Pthread Mutex

- mutex: ensures that only one thread can access the shared resources at a time
    - effectively serves as a lock
    - a thread sets (locks/acquires) the mutex before accessing shared data and unsets (unlocks/releases) it when it is done
    - if the mutex is locked, other threads that try to set it will block until the mutex is released
    - if more than one thread is blocked when we unlock the mutex
        - all threads blocked on the lock will be made runnable
        - the first thread that gets to run will be able to lock the mutex
        - the other threads will find the mutex locked and block again

```c
#include <pthread.h>

// returns 0 if OK, error number on failure
int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *attr);
int pthread_mutex_destroy(pthread_mutex_t *mutex);
int pthread_mutexattr_init(pthread_mutexattr_t *attr);
int pthread_mutexattr_destroy(pthread_mutexattr_t *attr);
int pthread_mutexattr_getpshared(const pthread_mutexattr_t *attr, int *pshared);
int pthread_mutexattr_setpshared(pthread_mutexattr_t *attr, int pshared);
```

- initialize a mutex (two ways)
    - use macro: `pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;`
    - use `pthread_mutex_init()`: init with `attr`
        - set `attr` to NULL to use the default attributes
- `pthread_mutexattr_init()`: initializes `attr` with the default attributes
- `pshared`:
    - `PTHREAD_PROCESS_PRIVATE` (default): mutex scope = one process (the process of the thread that initialized the mutex)
    - `PTHREAD_PROCESS_SHARED`: mutex scope = multiple processes

```c
#include <pthread.h>

// returns 0 if OK, error number on failure
int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_trylock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);
```

- `pthread_mutex_lock()`, `pthread_mutex_trylock()`: lock a mutex
    - mutex is unlocked:
        - both functions lock the mutex without blocking and return 0
    - mutex is locked:
        - `pthread_mutex_lock()`: blocks until the mutex is unlocked
        - `pthread_mutex_trylock()`: returns `EBUSY` immediately
- `pthread_mutex_unlock()`: unlock a mutex
    - unlocking a mutex that the calling thread does not hold => undefined behavior
- deadlock avoidance:
    - one thread tries to lock the same mutex twice
        - pthread provides features to identify self-deadlocking (`PTHREAD_MUTEX_ERRORCHECK`)
    - one thread tries to lock the mutexes in the opposite order from another thread
        - no deadlock if all threads employ the same lock ordering
        - use `pthread_mutex_trylock()`: if it fails, release the locks already held and try again later
    ![image](https://hackmd.io/_uploads/HJwLSEEL6.png =400x)

### Reader-Writer Locks

- three states: read lock, write lock, none
![image](https://hackmd.io/_uploads/By2DBEVLa.png =300x)

```c
#include <pthread.h>

// returns 0 if OK, error number on failure
int pthread_rwlock_init(pthread_rwlock_t *rwlock, const pthread_rwlockattr_t *attr);
int pthread_rwlock_destroy(pthread_rwlock_t *rwlock);
int pthread_rwlock_rdlock(pthread_rwlock_t *rwlock);
int pthread_rwlock_wrlock(pthread_rwlock_t *rwlock);
int pthread_rwlock_unlock(pthread_rwlock_t *rwlock);
int pthread_rwlock_tryrdlock(pthread_rwlock_t *rwlock);
int pthread_rwlock_trywrlock(pthread_rwlock_t *rwlock);
```

- similar to the mutex version
- initialize a reader-writer lock (two ways)
    - use macro: `pthread_rwlock_t rwlock = PTHREAD_RWLOCK_INITIALIZER;`
    - use `pthread_rwlock_init()`: init with `attr`
        - set `attr` to NULL to use the default attributes
- `pthread_rwlock_tryrdlock()`, `pthread_rwlock_trywrlock()`:
    - return `EBUSY` if the lock cannot be acquired

### Condition Variables

- enable threads to block until a condition is satisfied, testing the condition under the protection of a mutex and releasing that mutex atomically when they block
- support two primary operations: wait & notify (signal/broadcast)
    - wait: the mutex locked by the thread is released, and the thread blocks on the condition variable
    - notify: a thread that changes the condition notifies (by signal or broadcast) the condition variable to unblock the waiting threads (discussed later)

```c
#include <pthread.h>

// returns 0 if OK, error number on failure
int pthread_cond_init(pthread_cond_t *cond, const pthread_condattr_t *attr);
int pthread_cond_destroy(pthread_cond_t *cond);
int pthread_condattr_init(pthread_condattr_t *attr);
int pthread_condattr_destroy(pthread_condattr_t *attr);
int pthread_condattr_getpshared(const pthread_condattr_t *attr, int *pshared);
int pthread_condattr_setpshared(pthread_condattr_t *attr, int pshared);
```

- similar to the mutex version
- initialize a condition variable (two ways)
    - use macro: `pthread_cond_t cond = PTHREAD_COND_INITIALIZER;`
    - use `pthread_cond_init()`: init with `attr`
        - set `attr` to NULL to use the default attributes
- `pthread_condattr_init()`: initializes `attr` with the default
  attributes
- `pshared`:
    - `PTHREAD_PROCESS_PRIVATE` (default): condition variable scope = one process (the process of the thread that initialized the condition variable)
    - `PTHREAD_PROCESS_SHARED`: condition variable scope = multiple processes

```c
#include <pthread.h>

// returns 0 if OK, error number on failure
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
int pthread_cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *mutex,
                           const struct timespec *tsptr);
```

- `pthread_cond_wait()`: blocks the calling thread, waiting for `cond` to be signaled or broadcast
    - must be called with `mutex` locked by the calling thread, otherwise undefined behavior
    - atomically unlocks `mutex` and waits
    - when `pthread_cond_wait()` returns, `mutex` has been locked again by the calling thread
- `pthread_cond_timedwait()`: similar to `pthread_cond_wait()`, except that an error (`ETIMEDOUT`) is returned if the absolute time `tsptr` has passed
- usage procedure:
    - acquire `mutex`
    - check a condition (iteratively): e.g., whether a boolean flag is set
    - call `pthread_cond_wait()`
        - `mutex` is released; the thread goes to sleep until notified
        - `mutex` is reacquired when the thread is woken up by the notification (return from wait)
    - release `mutex`

```c
#include <pthread.h>

// returns 0 if OK, error number on failure
int pthread_cond_signal(pthread_cond_t *cond);
int pthread_cond_broadcast(pthread_cond_t *cond);
```

- unblock (wake up) the thread(s) blocked on `cond`
    - `pthread_cond_signal()`: at least one thread
    - `pthread_cond_broadcast()`: all threads
- no effect if there are no threads presently blocked on `cond`
- `pthread_cond_broadcast()` and `pthread_cond_signal()` can be called by a thread that does not currently hold the mutex that is passed to `pthread_cond_wait()`
- predictable scheduling behavior: make sure the signal/broadcast results in a “fair scheduling” of the threads blocked on the condition
    - if predictable scheduling behavior is required by the program, the associated mutex should be locked by the thread calling `pthread_cond_broadcast()` or `pthread_cond_signal()`
- spurious wakeup: when a thread is woken up, it finds that the condition isn't satisfied
    - reason:
        - between the time the thread was signaled and the time it finally ran, another thread ran and changed the condition
        - OS implementations may allow a condition variable wait to return even if it was not signaled
    - solution:
        - wrap the condition wait in a loop; after waking up, always check whether the awaited condition is satisfied

### Threads and fork()

- a thread calls `fork()`: only a single thread (the thread that called `fork()`) exists in the child process
- the child inherits the state of every mutex, reader-writer lock, and condition variable from the parent process
    - a mutex locked by (a thread of) the parent will be locked in the child, even though the thread that holds it does not exist there

```c
#include <pthread.h>

// returns 0 if OK, error number on failure
int pthread_atfork(void (*prepare)(void), void (*parent)(void), void (*child)(void));
```

- registers fork handlers that are to be executed when `fork()` is called
    - `prepare`: a handler called in the parent before `fork()` creates the child
    - `parent`: a handler called in the parent after `fork()` creates the child, but before `fork()` has returned
    - `child`: a handler called in the child before `fork()` has returned
    - each argument can be NULL if the handler is not needed

### Threads and Signals

- each thread has its own signal mask
    - inherited from the thread that calls `pthread_create()`
- all threads share the signal dispositions
    - any of the threads can modify the disposition associated with a given signal
- signal delivery
    - to a single thread: signals caused by a hardware fault (e.g., divide by zero) go to the thread that caused it
    - to an arbitrary thread: others

```c
#include <signal.h>

// returns 0 if OK, error number on failure
int pthread_sigmask(int how, const sigset_t *set, sigset_t *oset);
int sigwait(const sigset_t *set, int *signop);
```

- `pthread_sigmask()`: works like `sigprocmask()`, but for threads
- `sigwait()`: wait for one or more signals to
  occur:
    - `set`: the set of signals that the thread is waiting for
    - `signop`: stores the number of the signal that was delivered
    - the signal handler of a signal waited for by `sigwait()` will not be called
    - a thread must block the signals before calling `sigwait()`, otherwise undefined behavior
    - atomically unblocks the signals in the set for the calling thread and waits until one of them is delivered
    - before return:
        - removes the signal from the set of signals pending for the process
        - restores the thread’s signal mask
    - if multiple threads `sigwait()` for the same signal, only one of the threads will return

```c
#include <signal.h>

// returns 0 if OK, error number on failure
int pthread_kill(pthread_t thread, int signo);
```

- sends a signal `signo` to `thread`
- if the default action for a signal is to terminate the process, then sending the signal to a thread terminates the entire process

### Pthread Spinlock

```c
#include <pthread.h>

// returns 0 if OK, error number on failure
int pthread_spin_init(pthread_spinlock_t *lock, int pshared);
int pthread_spin_destroy(pthread_spinlock_t *lock);
int pthread_spin_lock(pthread_spinlock_t *lock);
int pthread_spin_trylock(pthread_spinlock_t *lock);
int pthread_spin_unlock(pthread_spinlock_t *lock);
```

- blocks a thread by busy-waiting (spinning), whereas a mutex blocks a thread by putting it to sleep
- can waste CPU resources, so a spinlock should be held only for a short period of time
- many mutex implementations are so efficient that their performance is equivalent to that of spin locks

## Process Environment

- memory addressing
![image](https://hackmd.io/_uploads/B1tXu4VUa.png =350x)
    - physical address
    - virtual address
    - MMU: memory management unit
- each process has its own address space
![image](https://hackmd.io/_uploads/BJZOcE4IT.png =350x)
    - an array of contiguous byte-size virtual addresses
    - a given VA is either mapped to a PA (colored boxes) or unmapped (white boxes)
    - the OS kernel manages the page tables to build the VA->PA mapping
    - the MMU walks the page tables to perform the address translation
- different process address spaces
    - same set of VAs: [0:MAX_ADDR]
    - the same VA maps to either the same or a different PA
        - same: processes share memory, e.g., COW after `fork()`, shared `mmap()` memory region
        - different: e.g., heap, layout after `exec()`
- process address space after `fork()`
    - the child inherits all memory mappings from the parent
    - COW on the memory mappings

### mmap(), munmap(), mprotect()

```c
#include <sys/mman.h>

// Returns: the starting address of mapped region if OK, MAP_FAILED on error
void *mmap(void *addr, size_t len, int prot, int flags, int fd, off_t off);
```

- `addr`: starting address for the new mapping
    - == NULL: let the kernel choose
    - != NULL: the kernel takes `addr` as a hint for memory allocation
        - if another mapping already exists there: the kernel picks a new address
- `len`: length of the mapping (>0)
    - POSIX: if `len = 0`, `mmap()` shall fail
- `prot`: memory protection of the new mapping
    - `prot` == `PROT_NONE` or the bitwise OR of `PROT_READ`, `PROT_WRITE`, `PROT_EXEC`
    - cannot violate the `open()` mode
        - e.g., `PROT_WRITE` for a file opened read-only
    - the kernel rejects memory accesses that violate the access permission
        - e.g., a memory write to a read-only memory-mapped region
        - `SIGSEGV` is generated
- `fd`: file/device object to be mapped
- `off`: byte offset into the object
- `flags`: attributes of the memory-mapped region
    - whether modifications to the mapped region are shared with other processes
        - `MAP_SHARED`: updates are visible to other processes
        - `MAP_PRIVATE`: private copy-on-write mapping
    - `MAP_FIXED`: tells the kernel to create the mapping at exactly `addr`
        - `mmap()` fails if the kernel cannot satisfy the request
    - `MAP_ANONYMOUS`: allocates a memory buffer.
      The mapping is not backed by any file or device but by memory; the contents of the mapping are initialized to zero
- creates a new memory region (new mapping) in the address space of the calling process
    - the new memory region can be mapped to a file or a device
    - if not mapped to a file or a device, it is backed by memory
        - similar to `malloc()`, but the memory is not allocated from the heap
    - users can specify the address at which to create the memory region, or let the OS kernel decide
- memory-mapped I/O
    - `mmap()` creates a memory buffer mapped to a file
    - as if the program could read/write the storage device (via the mapped file)
    - for a memory-mapped file
        - bytes read/written correspond to the same bytes of the file
        ![image](https://hackmd.io/_uploads/SJ4hgUVU6.png =450x)
        - after `mmap()` returns, `fd` can be closed immediately; this does not invalidate the mapping

```c
#include <sys/mman.h>

// Returns: 0 if OK, -1 on error
int munmap(void *addr, size_t len);
```

- deletes the memory mapping for the specified address range
    - after `munmap()` succeeds, further accesses to the address range are invalid
    - memory-mapped regions are automatically unmapped when the process terminates
    - closing `fd` does not unmap the region
- `munmap()` does not affect the object that was mapped
    - calling `munmap()` does not by itself cause the contents of the memory-mapped region to be written back to the file on disk

```c
#include <sys/mman.h>

// Returns: 0 if OK, -1 on error
int mprotect(void *addr, size_t len, int prot);
```

- changes the permissions on a memory region in the process address space
    - `addr`: start address
    - `len`: size of the memory region
    - `prot`: `PROT_NONE` or the bitwise OR of `PROT_READ`, `PROT_WRITE`, `PROT_EXEC`

### Program Compilation

- object file: comes in different formats (PE, Mach-O, ELF)
    - executable object file (executable): can be loaded into memory and executed
    - relocatable object file: can be combined with other relocatable object files at compile time to create an executable
    - shared object file (shared object, shared library): a special type of relocatable object file that can be loaded into memory and linked dynamically at either load time or run time
- compilation with GCC
![image](https://hackmd.io/_uploads/HkfQM9HUp.png)
    - preprocessor
        - modifies the C program according to the `#` directives, e.g., `#include <stdio.h>`: inserts the content of stdio.h into the program text
        - result: a new C program (with .i suffix)
    - compiler
        - result: an assembly-language program (with .s suffix)
    - assembler
        - translates the assembly code into machine-language instructions and packages them into a relocatable object file
        - result: a relocatable object file (with .o suffix)
    - linker
        - handles the merging of relocatable object files
        - result: executable object file

### Symbols

- three types of symbols that the linker cares about (consider an object file m):
    - global (linker) symbols: symbols defined by m that other object files can reference
    - global externals: symbols referenced by m but not defined by m (i.e., defined by other object files)
    - local symbols: symbols that are defined and referenced exclusively by m
        - e.g., local static variables
- symbol table (for an ELF object file)
![image](https://hackmd.io/_uploads/HyR_giBIp.png =400x)
    - holds information about the symbols of the object file
    - the entry for each symbol contains:
        - symbol type (function or variable), symbol size, symbol name, etc.
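
The three symbol kinds above can be seen in a small translation unit. This is only an illustrative sketch: the names `shared_counter`, `file_private`, `helper`, `next_id`, and `report` are made up for this example, and the exact symbol-table output depends on the compiler and optimization level.

```c
#include <stdio.h>   /* printf() is referenced below but defined in libc */

int shared_counter = 0;        /* global symbol: defined here, visible to other object files */
static int file_private = 42;  /* local symbol: `static` limits it to this object file */

/* local symbol (function): cannot be referenced from other object files */
static int helper(int x)
{
    return x + file_private;
}

/* global symbol (function): other object files may call it */
int next_id(void)
{
    static int calls = 0;      /* local static variable: kept in .data/.bss, not on the stack */
    int id = helper(calls);    /* automatic variable: lives on the stack, no .symtab entry */

    calls++;
    shared_counter++;
    return id;
}

/* global symbol that references an external symbol (printf) */
void report(void)
{
    printf("issued %d ids\n", shared_counter);
}
```

Assuming the file is compiled without optimization using `gcc -c`, running `nm` (or `readelf -s`) on the resulting object file would typically list `next_id`, `report`, and `shared_counter` as global symbols, `helper`, `file_private`, and `calls` (often with a compiler-generated suffix) as local symbols, and `printf` as undefined. The automatic variable `id` does not appear at all.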
### ELF

![image](https://hackmd.io/_uploads/r1OkccrLT.png =300x)

- ELF header: specifies information about the file (for the linker to parse and interpret the file)
![image](https://hackmd.io/_uploads/ryFOc5BIa.png =350x)
    - contains:
        - size of the ELF header (e_ehsize)
        - object file type (e_type)
        - program entry point (e_entry)
    - the ELF header can be inspected with the command `readelf`
- section header table: describes the size and location of the various sections
    - .text: the machine code
    - .data: initialized global and static variables
    - .rodata: read-only data (e.g., the format string in printf())
    - .bss: uninitialized global and static variables + any global or static variables that are initialized to zero
    - .symtab: a symbol table with information about functions and global variables defined and referenced in the program; no automatic variables
    - .rel.text: relocation entries for code: a list of locations in the .text section that will need to be modified when the linker combines this object file with others
        - includes locations of the code that call an external function or reference global variables
    - .rel.data: relocation entries for data: relocation information for any global variables that are referenced or defined by the object file

### Linking

- the linker makes sure the executable can access its symbols
- static linking: performed at compile time (by the linker); generates a fully linked executable object file
![image](https://hackmd.io/_uploads/SkX1PTS8p.png =500x)
    - symbol resolution: associates each symbol reference with exactly one symbol definition
        - searches for symbol definitions in the symbol tables of the input relocatable object files
        - if no definition can be found, the linker outputs an error message and terminates
    - symbol relocation:
    ![image](https://hackmd.io/_uploads/SJ7ii6BIT.png =500x)
        - relocate sections and symbol definitions
            - merge the sections from each object file into a new aggregated section of the same type
            - assign run-time memory addresses to the new sections
        - relocate symbol references within sections
            - the linker checks the relocation entries and updates the symbol references in the program so that they point to the correct run-time addresses
    - static library:
        - a file that consists of multiple related object files
        - stored on disk in the archive format (file names with .a suffix)
            - use the tool called `ar` to create a static library
        - linkers take static libraries as input and link them with relocatable object files
        - updates in a static library: relink with the updated library
        - the library code is copied into the executable object file -> bigger size
- dynamic linking: performed by the dynamic linker at program load time (by the loader) or at run time (by the program)
    - do some of the linking statically when the executable file is created, then complete the linking process dynamically when the program is loaded
    ![image](https://hackmd.io/_uploads/ryOkTEkwa.png =400x)
    - `ld-linux.so` does the following to finish the linking task
        - relocate the text and data of `libc.so` and `libvector.so` into different memory segments
        - relocate any references in `prog21` to symbols defined by `libc.so` and `libvector.so`
        - pass control to the original binary's entry point
    - shared library (with .so suffix):
        - an object file that can be loaded at an arbitrary memory address and linked with a program at either load time or run time
        - loaded into the process address space. The mapping for the same shared library can be shared by the multiple processes that use it
- static linking v.s. dynamic linking
    - library sharing
        - shared library: a single copy of the library is shared by all executables
        - static library: the content of the library is copied and embedded in each executable
    - relink after library updates
        - shared library: no relink is needed
        - static library: relink the program after library updates

### Loading

- loading an ELF executable object file
![image](https://hackmd.io/_uploads/Hkzp6pSLp.png =700x)
    - copies the code and data in the executable from the storage device into memory (i.e., the process address space) and runs the program by jumping to its first instruction at the program’s entry point
    - for C programs, the entry point is the address of the `_start()` function (defined in the system object file crt1.o), which eventually calls the `main()` function of the program
    - whenever `exec()` is called by a process, the kernel invokes the loader for the respective object file format (e.g., ELF) to load the program into the process address space and finally executes it
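
The header fields that the loader relies on (`e_type`, `e_entry`, `e_ehsize`) can also be read from a program, much as `readelf -h` does. The following is a minimal sketch, assuming a Linux system that provides `<elf.h>` and assuming the file uses the same byte order as the machine running the code. `struct elf_summary` and `read_elf_summary()` are hypothetical names introduced only for this example.

```c
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical summary type for this sketch; not part of any library. */
struct elf_summary {
    int      is_64bit;  /* 1 for ELFCLASS64, 0 for ELFCLASS32 */
    uint16_t type;      /* e_type: ET_REL, ET_EXEC, ET_DYN, ... */
    uint16_t ehsize;    /* e_ehsize: size of the ELF header in bytes */
    uint64_t entry;     /* e_entry: address of the entry point */
};

/* Reads the ELF header of `path`. Returns 0 on success, -1 on failure. */
int read_elf_summary(const char *path, struct elf_summary *out)
{
    unsigned char buf[sizeof(Elf64_Ehdr)];
    FILE *fp = fopen(path, "rb");
    size_t n;

    if (fp == NULL)
        return -1;
    n = fread(buf, 1, sizeof(buf), fp);
    fclose(fp);

    /* Every ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'. */
    if (n < sizeof(Elf32_Ehdr) || memcmp(buf, ELFMAG, SELFMAG) != 0)
        return -1;

    if (buf[EI_CLASS] == ELFCLASS64 && n >= sizeof(Elf64_Ehdr)) {
        Elf64_Ehdr h;
        memcpy(&h, buf, sizeof(h));
        out->is_64bit = 1;
        out->type = h.e_type;
        out->ehsize = h.e_ehsize;
        out->entry = h.e_entry;
    } else if (buf[EI_CLASS] == ELFCLASS32) {
        Elf32_Ehdr h;
        memcpy(&h, buf, sizeof(h));
        out->is_64bit = 0;
        out->type = h.e_type;
        out->ehsize = h.e_ehsize;
        out->entry = h.e_entry;
    } else {
        return -1;
    }
    return 0;
}
```

On Linux, calling `read_elf_summary("/proc/self/exe", &s)` inspects the running program itself. The reported `entry` is the address of `_start()`, not `main()`. For a position-independent executable, `type` is `ET_DYN` and `entry` is relative to the load address chosen at run time.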