Ch6:Process Synchronization

--- title: Ch6:Process Synchronization --- # 作業系統Ch6: Process Synchronization ###### tags: `Operating System` `Review` `Ch6` ## Background - Concurrent access to 共享 data 導致 **data inconsistency** - 維護 data consistency 需要確保按順序地執行 cooperating processes 的機制。 ### Consumer & Producer Problem - 決定是否 buffer 是空的或滿的 - 過去: use in, out position - 現在: use counter ```c= /* producer */ while (1) { nextItem = getItem(); while (counter == BUFFER_SIZE); // buffer 滿了 buffer[in] = nextItem; in = (in + 1) % BUFFER_SIZE; counter++; // 生產一個，加一個 } /* consumer */ while (1) { while (counter == 0); // buffer 為空 item = buffer[out]; out = (out + 1) % BUFFER_SIZE; counter--; // 消耗一個，減一個 } ``` > 但是這樣會有 synchronization 問題，因為 counter++ 語句和 counter--語句在組語下指令可能是多行組成，如下: > ```c= > /* counter++ */ > move ax, counter > add ax, 1 > move counter, ax > /* counter-- */ > move bx, counter > sub bx, 1 > move counter, bx > ``` 用例子去看，如果一開始給定 $counter\ =\ 5$，過程可能會是如下這樣: ``` producer: move ax, counter //ax = 5 producer: add ax, 1 //ax = 6 context switch consumer: move bx, counter //bx = 5 consumer: sub bx, 1 //bx = 4 context switch producer: move counter, ax //counter = 6 context switch consumer: move counter, bx //counter = 4 ``` 最後這種情況的 $counter\ =\ 4$，而不是我們預期的$counter\ =\ 5$，甚至也可能產生 $5$ 和 $6$ 結果，取決於 runtime 時的情況。 ### Race Condition - 多個 processes 並行 (**concurrently**) 存取並操弄 shared data 的情況。 - shared data 的最終值會取決於process 最後完成的。 - 為了避免 race condition， concurrent processes 一定需要 synchronized - 在單一 processor 機器上，我們可以 disable interrupt or use non-preemptive CPU scheduling > disable interrupt 會影響到在系統上不同的使用者，所以不會是一個好方法。 - 通常描述為 **critical section problem** ### Critical Section - 目標: 為 processes 協調的一套協議 (protocol) - 問題描述: - $N$ processes 為使用某個 shared data 而競爭 - 每一個 process 有一個稱作 **critical section** 的 code segment，shared data 在 critical section 被存取。 - 確保當一個 process 在它的 critical section 執行時，沒有其他 process 被允許進入在它的 critical section 執行。 $\rightarrow$ **mutually exclusive** - 一般 code section structure - Only one process can be in a critical section ```c= do { remainder section // entry section: Gets entry permission critical section // Modify shared data // exit section: Release entry permission remainder section } while (1); ``` #### Critical Section Requirements 1. Mutual Exclusion: 如果 process P 在 CS 中正在執行，沒有其他 processes 可以在它們的 CS 中執行。 2. Progress: 如果沒有 process 在 CS 中正在執行，存在某些 processes 想要進入它們的 CS，這些 processes 不能被**無限期拖延**。 > 其中一個 process 要趕快進入 CS。 3. Bounded Waiting: 一個 process 做了要進入它的 CS 請求之後，允許其他 processes 進入其 CS 的次數必須存在限制。 > 換言之，已經請求的 process 不能「不確定地」無限等待。 #### Algorithm for Two Process - 只有兩個 processes， $P_{0}\ and\ P_{1}$ - Shared variable - `int turn;` //initially `turn = 0` - `turn = i` $\implies\ P_{i}\ can\ enter\ its\ critical\ section$ ```c= /* Process 0 */ do { while (turn != 0); // entry section critical section turn = 1; // exit section remainder section } while (1); /* Process 1 */ do { while (turn != 1); // entry section critical section turn = 0; // exit section remainder section } while (1); ``` 1. Mutual exclusion ? Yes，`turn` 只能是 `1 or 0` 2. Progress ? No，如果 Process 0 離開 CS ，很有可能 Process 1 還沒有意願要做 do-while，所以 Process 0 有意願要繼續進去 CS 時，會卡在 `while (turn != 0)`，不滿足在 CS 為空時，有意願進去的可以進去。 3. Bounded Waiting ? Yes，因為是輪流進去，最多就是等一個 process。 ### Peterson’s Solution for Two Processes - Shared variables - `int turn;` //initially `turn = 0` - `turn = i` $\implies\ P_{i}\ can\ enter\ its\ critical\ section$ - `bool flag[2];` //initially `flag[0] = flag[1] = false;` - `flag[i] = true` $\implies\ P_{i}\ ready\ to\ enter\ its\ critical\ section$ ```c= // Pi: do { flag[i] = true; turn = j; while (flag[j] && turn == j); critical section // 進入 CS 當 (1) process 得到它的 turn //(2) 其他 process 沒有要進去 flag = false flag[i] = false; remainder section } while (1); ``` > 當 P0 做 do-while 時，想要進去 CS，所以 `flag` 會設為 true，且 `turn` 這個許可權會先給對方(自己在內部 `while` 等待)，如果 P1 已經在內部`while`等待時，對方會先進去，之後再把 `turn = 0` 許可權還給 P0；如果 P1 此時正在 remainder section 或者正要進入 `do-while` 的話，很快就會將`turn = 0` 許可權還給 P0。 1. Mutual exclusion ? Yes - if $P_{0}$ enter CS $\implies$ `flag[1] == false || turn == 0` - if $P_{1}$ enter CS $\implies$ `flag[0] == false || turn == 1` - 如果兩個 processes 都想進 CS $\implies$ `flag[0] = flag[1] = true` - `turn = 0` for $P_{0}$ to enter, turn = 1 for $P_{1}$ to enter > 然而，`turn` 只會是 `1 or 0` 因為它的值是設給兩個 processes，但是只有一個值能夠持續。 > 因此， $P_{0}$ 和 $P_{1}$ 不能同時待在 CS ![](https://i.imgur.com/fZEZ2OC.png) 2. Progress ? Yes $P_{0}$ 想要進去 CS - (1). if $P_{1}$ 還不想進去 $\implies$ `flag[1] = false` $\implies$ $P_{0}$ 可以進去 - (2). if both 都想進去 $\implies$ `flag[0] = flag[1] = true` - If `turn = 0` then $P_{0}$ enters, otherwise $P_{1}$ enters - 無論哪種情況，都有一些等待 process 可以進入 CS！ ![](https://i.imgur.com/puDew9n.png) 3. Bounded waiting ? Yes $P_{0}$ 想要進去 CS - (1). 一旦 $P_{1}$ 離開 CS $\implies$ `flag[1] = false` $\implies$ $P_{0}$ 可以進去 - (2). 如果 $P_{1}$ 離開 CS 且重設 `flag[1] = true` $\implies$ `turn = 0` (overwrite $P_{0}$ setting) $\implies$ $P_{0}$ 可以進去 - $P_{0}$ 不會無止盡等待 ![](https://i.imgur.com/TUVFBpJ.png) ### Back To Producer/Consumer Problem ```c= /* producer */ while (1) { // entry-section(); (1) nextItem = getItem(); while (counter == BUFFER_SIZE); // buffer 滿了 buffer[in] = nextItem; in = (in + 1) % BUFFER_SIZE; // entry-section(); (2) entry-section(); //(3) counter++; // 生產一個，加一個 exit-section(); //~(3) computing(); //context switch // exit-section(); ~(2) // exit-section(); ~(1) } /* consumer */ while (1) { // entry-section(); (1) while (counter == 0); // buffer 為空 item = buffer[out]; out = (out + 1) % BUFFER_SIZE; // entry-section(); (2) entry-section(); //(3) counter--; // 消耗一個，減一個 exit-section(); //~(3) computing(); //context switch // exit-section(); ~(2) // exit-section(); ~(1) } ``` 1. 第一種情況會導致 deadlock，如果 consumer 先進去 CS。 2. 第二種情況雖然正確，但是效能低落，因為會進入到 computing()。 3. 第三種情況正確且極大化 concurrent performance。 ### Bakery Algorithm (n processes) - 進入它的 CS 之前，每一個 process 會收到一個 number > 類似取餐號碼牌 - 數字最小的持有者可以進入 CS。 - 分發數字方案總是以非遞減順序產生數字，i.e., 1,2,**3,3**,4,**5,5,5**。 - 如果 processes $P_{i}\ and\ P_{j}$ 收到同樣數字，那麼如果 $i\ <\ j$，則 $P_{i}$ 會先進去。 - Notation: $(a,\ b)\ <\ (c,\ d)\ \ \ \ if\ a\ <\ c\ \ or\ \ if\ a\ =\ c\ \&\&\ b\ <\ d$ ```c= /* Process i: */ do { choosing[i] = true; //lock num[i] = max(num[0], num[1], ..., num[n - 1]) + 1; // Get tickets choosing[i] = false; //unlock for (j = 0; j < n; j++) { while (choosing[j]); // cannot compare when num is being modified while ((num[j] != 0) && ((num[j],j) < (num[i], i))); // FCFS } critical section num[i] = 0; // release ticket remainder section } while (1); ``` 1. Mutual exclusion ? Yes 只有數字最小的人能進去，如果有兩個或多個數字最小的，則看 Process 的編號(有唯一性)，不會有兩個 processes 同時在它們自己的 CS。 2. Progress ? Yes 如果 CS 為空時，有一群 Process 拿到數字，其中拿到最小數字的 process (編號最小的) 一定可以 break 掉 for 迴圈裡的兩個 while，因此就可以進入 CS，不會無限期拖延(可以進去，但是卻沒辦法進去)。 3. Bounded-waiting ? Yes 因為 processes 進入 CS 是以 First come First serve，排隊進入，所以可以有限制的等待，而非不明確的等待。 - 是否有可能許多 processes 收到相同 number? 有可能，因為取號碼牌大致可以分成三個動作 1. 挑最大值 2. 進行加 1 3. 賦值給相對應的nums[i] | | $P_{i}$ | $P_{j}$ | | -------- | ----------------- | ----------------- | | $T_{0}$ | choosing[i]= true | | | $T_{1}$ | | choosing[j]= true | | $T_{2}$ | 挑選最大值並進行加1，未賦值 | | | $T_{3}$ | | 挑選最大值並進行加1，未賦值 | | $T_{4}$ | 賦值 | | | $T_{5}$ | | 賦值 | - 為什麼當 number 在修改時不能比較? - Without locking > 假如 5 是目前最大 number，如果 $P_{1}\ and\ P_{4}$ 一起要獲得數字，但是 $P_{4}$ 比 $P_{1}$ 早完成 ( 也就是，$P_{1}\ =\ 0,\ P_{4}\ =\ 6\ \rightarrow\ P_{4}\ will\ enter\ the\ CS$ )，在 $P_{1}$ 拿到數字後， $P_{1}\ =\ P_{4}\ =\ 6$， $P_{1}$ 也將會進入 CS。 (實際上，$P_{1}\ 要比\ P_{4}\ 早進入才是正確的$) - With locking > $P_{4}$ 將必須等待到 $P_{1}$ 拿到數字，在比較之前，$P_{1}\ and\ P_{4}$ 都會拿到數字 6。 ### Pthread Lock/Mutex Routines - 為了使用 mutex，它必須宣告型態為 `pthread_mutex_t` 和以 `pthread_mutex_init()` 初始化。 - 以 `pthread_mutex_destroy()` 摧毀了一個 mutex。 - 一 critical section 可以使用 `pthread_mutex_lock()` 和 `pthread_mutex_unlock()` 受保護。 Example: ```c= #include <pthread.h> pthread_mutex_t mutex; pthread_mutex_init(&mutex, NULL); //NULL: specify default attribute for the mutex pthread_mutex_lock(&mutex); //enter critical section critical section pthread_mutex_unlock(&mutex); //leave critical section pthread_mutex_destroy(&mutex); ``` prototype `int pthread_mutex_init(pthread_mutex_t *restrict mutex, const pthread_mutexattr_t *restrict attr);` 對於屬於允許 set or get，相關內容可見 [what is the "attribute" of a pthread mutex?](https://stackoverflow.com/questions/4252005/what-is-the-attribute-of-a-pthread-mutex)。值得一提的是 `NULL`作為預設屬性，且對於屬性中的 type 是 PTHREAD_MUTEX_DEFAULT，但是實際上這並沒有真的定義，可以是任何，所以需進行[測試](https://stackoverflow.com/questions/29624365/what-is-the-default-mutex-attributes-of-the-pthread-mutex)。 ### Condition Variables (CV) - CV 表示**某情況**一 thread 可以: - Wait on 直到 condition 發生，或 - 通知其他等待 threads， condition 已經發生了。 - 在 Condition Variables　上的三種操作: - `wait()` --- Block 直到另一 thread 呼叫 signal() or broadcast() on the CV. - `signal()` --- 叫醒在 CV 上等待的一 thread。 - `broadcast()` --- 叫醒在 CV 上等待的所有 thread。 - 在 Pthread 中， CV 型態是一 `pthread_cond_t` - 使用 `pthread_cond_init()` 初始化。 - `pthread_cond_wait(&theCV, &somelock)` > prototype: int pthread_cond_wait(pthread_cond_t *restrict cond, pthread_mutex_t *restrict mutex); - `pthread_cond_signal(&theCV)` > prototype: int pthread_cond_signal(pthread_cond_t *cond); - `pthread_cond_broadcast(&theCV)` > prototype: int pthread_cond_broadcast(pthread_cond_t *cond); Example: - 一 thread 可以設計成 **take action 當 $x\ =\ 0$**。 - 另一 thread 有責任為 counter 遞減一。 - 所有 condition variable 操作 ** 一定**執行於 mutex 已經有上了鎖。 ``` c= #include <pthread.h> pthread_cond_t cond; pthread_cond_init(cond, NULL); pthread_mutex_t mutex; pthread_mutex_init(mutex, NULL); action() { pthread_mutex_lock(&mutex); while(x != 0) pthread_cond_wait(cond, mutex); /* some code here.... */ pthread_mutex_unlock(&mutex); take_action(); } counter() { pthread_mutex_lock(&mutex); x--; if (x == 0) pthread_cond_signal(cond); pthread_mutex_unlock(&mutex); } ``` > 如果 `action( )` 的 thread 先開始，前面已經 `mutex` 的物件會上鎖，接著因為 `x` 不等於 `0`，所以會進入 while 迴圈中，並執行 `pthread_cond_wait(cond, mutex)` 的 statement，這個 statement 會使 `action( )` 執行的 thread 進入 sleep 狀態，並且 **releases the lock of action() thread**，然後這時的 `counter( )` 的 thread 進行 `mutex` 的上鎖 (與 `action ( )` 中 `mutex` 是同一物件)，接著 `action( )` thread 會一直處於 sleep 狀態，而 `counter( )` thread 會持續對 `x` 的值進行減一，直到當 `x = 0`時，它會發送 `Signal( )` 喚醒 `action( )` 的 thread，但是此時 `counter` 的 thread 仍然是上鎖的，就是等到 `counter` 的 thread 執行 releases the lock，接著 `action( )` thread 會 **Re-acquire lock**，恢復執行，然後`action( )` thread 才會 Release the lock。 ### ThreadPool implementation **Task structure :** ```c= typedef struct { void (*function)(void *); void *argument; } threadpool_task_t; ``` **Threadpool structure :** ```c= struct threadpool_t { pthread_mutex_t lock; pthread_cond_t notify; pthread_t *threads; threadpool_task_t *queue; int thread_count; int queue_size; int head; int tail; int count; int shutdown; //Flag indicating if the pool is shutting down int started; }; ``` **Allocate thread and task queue :** Threadpool 一開始建立的時候就決定 Thread Pool 和 tasks 的最大數量。 ```c= pool->threads = (pthread_t *) malloc(sizeof(pthread_t) * thread_count); pool->queue = (threadpool_task_t *) malloc(sizeof(threadpool_task_t) * queue_size); ``` **每個 Thread 要執行的 Function :** ```c= static void *threadpool_thread(void *threadpool) { threadpool_t *pool = (threadpool_t *)threadpool; threadpool_task_t task; for(;;) { /* Lock must be taken to wait on conditional variable */ pthread_mutex_lock(&(pool->lock)); /* Wait on condition variable, check for spurious wakeups. When returning from pthread_cond_wait(), we own the lock. */ while((pool->count == 0) && (!pool->shutdown)) { pthread_cond_wait(&(pool->notify), &(pool->lock)); } if((pool->shutdown == immediate_shutdown) || ((pool->shutdown == graceful_shutdown) && (pool->count == 0))) { break; } /* Grab our task */ task.function = pool->queue[pool->head].function; task.argument = pool->queue[pool->head].argument; pool->head += 1; pool->head = (pool->head == pool->queue_size) ? 0 : pool->head; pool->count -= 1; /* Unlock */ pthread_mutex_unlock(&(pool->lock)); /* Get to work */ (*(task.function))(task.argument); } pool->started--; pthread_mutex_unlock(&(pool->lock)); pthread_exit(NULL); return(NULL); } ``` > 在 `for(;;)` 裡面，Thread 第一件要做的事情就是去搶奪 pool 的 lock，當搶到 lock 的 Thread 發現沒有工作可以做的時候，就會執行 `pthread_cond_wait` 來等待通知。這時候 `pool->lock` 會被 Unlock，因此這時候其他 Thread 也可以進來這個區域。所以在完全沒有工作的情況下，所有的 Thread 都會在這邊 Waiting。 > 當 Thread 被透過 `pthread_cond_signal( )` 喚醒的時候，該 Thread 就會重新取得 `pool->lock`。這時他就可以安心的取出 queue 中的 task，等待取出完畢之後，再 unlock 讓其他被喚醒的 Thread 也可以去取得 Task。 > 之後就是執行 task 中的 function pointer 做該做的工作。\[1]\[2] ## Synchronization Hardware ### HW Support - HW support: **atomic instructions** - atomic: 作為一不可中斷的單位(硬體實作出來的) - examples: TestAndSet(var), Swap(a,b) $\rightarrow$ 假設的指令，實際上 Computer system 會有其他的指令。 ### Atomic TestAndSet() ```c= /** * execute atomically: return the value of "lock" and * set "lock" to TRUE. **/ bool TestAndSet(bool &lock) { bool value = lock; lock = true; return value; } ``` > 不管原先 lock 值是甚麼，就是上鎖，然後回傳原本 lock 的值。 Shared data: `bool lock;` //initially `lock = false;` ```c= /* P0 */ do { while (TestAndSet(lock)); //obtain lock critical section lock = false; remainder section // release lock } while (1); /* P1 */ do { while (TestAndSet(lock)); critical section lock = false; remainder section } while (1); ``` > $P_{0}$ 進入 do-while 會先遇到 while 的條件式 `TestAndSet( )`，由於一開始 `lock` 為 `false`，所以會 break the while 進入 critical section，此時 lock 會是 true，所以 $P_{1}$ 在 while 的條件式 `TestAndSet( )` 會回傳為 `true`，等待 $P_{0}$ 離開 critical section 並將 `lock` 設為 `false` ，這樣 $P_{1}$才會 break the while 進入 critical section。 1. Mutual exclusion ? Yes `lock` 只會為 `true` or `false`，當在一個 process 為 `false` 可以進去 critical section 時，其他 process 會因為 `TestAndSet` 將 lock 設為 true，而不得進去。 2. Progress ? Yes 當 critical section 沒有 process 時，`lock` 設為 `false`，只要有任何 process 去做 `while (TestAndSet (lock) ) ;` 都可以進去 critical section ，不會無限期 delay。 3. Bounded-Wait ? No 當 $P_{0}$ 離開 critical section 且`lock` 設為 `false`，有想要進去的其他 processes 並不是排隊進去，而是靠搶奪的方式誰先執行 `while (TestAndSet (lock) ) ;` ，誰就能先進去。 ### Atomic Swap() - idea: enter CS if `lock = false` Shared data: `bool lock;` //initially `lock = false;` ```c= /* P0 */ do { key0 = true; while (key0 == true) Swap(lock, key0); critical section lock = false; remainder section // release lock } while (1); /* P1 */ do { key1 = true; while (key1 == true) Swap(lock, key1); critical section lock = false; remainder section } while (1); ``` > 當 $P_{0}$ 的 `key0` 一開始為 `true` 並且 `lock = false`，進行 `swap` 交換，使 `key0` 為 false ，可以 break the while，並且進入 critical section；其他 processes 的 `key1` 會由於 `swap` 交換到仍是`true`，所以卡在`while` 迴圈中，直到 $P_{0}$ 離開並將 `lock` 設為 `false`。 1. Mutual exclusion ? Yes 理由同 `TestAndSet`，`lock` 只會為 `true` or `false`，只允許一個 process 進入 critical section。 2. Progress ? Yes 理由同 `TestAndSet`。 3. Bounded-Wait ? No 理由同 `TestAndSet`，也是需要用搶奪的方式執行交換。 ## Semaphores - 一般化 synchronization problem 的工具 - 更具體來說是「有多少可用的特定資源的 record」。 - if record # = 1 $\rightarrow$ binary semaphore, mutex lock - if record # > 1 $\rightarrow$ counting semaphore - 只存取透過 2 atomic ops: **wait & signal** - Spinlock implementation: > Spinlock: 對於 user 而言看起來沒有在動，但是對於 CPU 而言，卻是一直在重複跑 `while (S <= 0)`。 - Semaphore is an integer variable #### busy waiting ```c= wait(S) { while (S <= 0); //busy waiting S--; } signal(S) { S++; } ``` > 浪費 CPU 資源 ### POSIX Semaphore - Semaphore 是 POSIX standard 的一部分但是不屬於`Pthread` - It can be used with or **without** thread - POSIX Semaphore routines: - **sem_init(sem_t \*sem, int pshared, unsigned int value)** $\rightarrow$ Initial value of semaphore - **sem_wait(sem_t \*sem)** - **sem_post\(sem_t \*sem)** - **sem_getvalue(sem_t \*sem, int \*valptr)** $\rightarrow$ Current value of the semaphore - **sem_destory(sem_t \*sem)** - Example: ```c= #include<semaphore.h> sem_t sem; sem_init(&sem); sem_wait(&sem); critical section sem_post(&sem); sem_destory(&sem); ``` ### n-Process Critical Section Problem - Shared data: - semaphore mutex; //initially mutex = 1 Process $P_{i}$: ```c= do { wait(mutex); //pthread_mutex_lock(&mutex); critical section signal(mutex); //pthread_mutex_unlock(&mutex); remainder section } while (1); ``` 1. Mutual exclusion ? Yes 2. Progress ? Yes 3. Bounded waiting ? Yes It depends on the implementation `wait( )` ### Non-busy waiting Implementation - Semaphore 是 data struct with a queue - 可以使用任何 queuing strategy (FIFO, FILO, etc) ```c= typedef struct { int value; //init to 0 struct process *L; //"PCB" queue } semaphore; ``` ![](https://i.imgur.com/2jNEwg2.png) - wait( ) and signal( ) - use system calls: sleep() and wakeup() - 一定是 atomically 執行。 ```c= void wait(semaphore S) { S.value--; //subtract first if (S.value < 0) { add this process to S.L; sleep(); } } void signal(semaphore S) { S.value++; if (S.value <= 0) { remove a process P from S.L; wakeup(P); } } ``` - How to ensure atomic wait & signal ops? - Signal-processor: diable interrupts > 影響其他使用者 - Multi-processor: - HW support (e.g., TestAndSet, Swap) - SW solution (Peterson's solution, Bakery algorithm) ### Semaphore with Critical Section ```c= void wait (semaphore S) { entry-section(); S.value--; if (S.value < 0) { add this process to S.L; exit-section(); sleep(); } else { exit-section(); } } void signal (semaphore S) { entry-section(); S.value++; if (S.value <= 0) { remove a process P from S.L; exit-section(); wakeup(P); } else { exit-section(); } } ``` > Busy waiting for entry-section() ? > 僅限於 `wait` 和 `signal` 的 CS（~10 條 instructions） $\rightarrow$ very short period of time。Non-busy waiting 會牽扯到 system call ，所以指令會比 busy waiting 多很多，所以一般來說適合時間比較長的情況下，使用 Non-busy waiting，而時間比較短的情況下用 Busy waiting。 System call 不用擔心 synchronization 問題，因為 OS 會去處理。 ### Cooperation Synchronization - P1 executes S1； P2 executes S2 - S2 只執行在 S1 完成後。 - Implementation - shared var: `semaphore sync; //initially sync = 0;` ![](https://i.imgur.com/iK8E3Ir.png) > 即使 P2 先執行也會卡住，等待 P1 執行完 S1 後，發 signal 去喚醒 P2。 #### A More Complicated Example 一開始，所有 semaphores 都是 $0$ ![](https://i.imgur.com/ediS0ny.png) > edge 代表著 semaphore，outcome 代表著 `signal`，而 income 代表著 `wait`。 ### Deadlocks & Starvation - Deadlocks: 2 processes 正在無止盡等待彼此釋出資源。 - Starvation: 例如: LIFO queue in semaphore process queue。 > 這是對於單一 process 來說。而 Deadlocks 則是對於多個 processes，但是同樣也會造成 starvation。 ![](https://i.imgur.com/vVZ57xp.png) > 如果 semaphore 的值都是 $0$，則 $P_{0}\ and\ P_{1}$ 會連第一個 `wait`都會卡住；如果 semaphore 的值都是 $1$，則 $P_{0}\ and\ P_{1}$ 通過第一個 `wait`，但是卡在第二個。全都是 Deadlock 的情況。 ## Classical Problems of Synchronization 以經典 Synchronization 的問題，去用於測試新提出的同步方案。 - Bounded-Buffer (Producer-Consumer) Problem - Reader-Writers Problem - Dining-Philosopher Problem ### Bounded-Buffer Problem 同 Producer-Consumer 問題 ### Reader-Writers Problem - A set of shared data objects - A group of processes - reader processes (read shared objects) - writer processes (update shared objects) - a writer process 有獨佔存取一個 shared object > 也就是在更新 shared object，不會給 reader 讀取。 - 牽扯到優先權的不同版本 - first RW problem: 沒有 reader 需要等待，也就是現在有一個 reader 正在對於 a shared object，其他 reader 也可以進去讀取，除非現在是 writer 正在更新 a shared object。 > first RW 問題會有 writer 餓死情況發生。 - second RW problem: 一旦 writer 準備好，只要 shared object 被 released，它馬上就會執行更新。 - writer has higher priority than reader - 一旦 writer 準備好，沒有新的 reader 可以開始 reading。 #### First Reader-Writer Algorithm - Readers 可以共享單一 wrt lock ```c= /* mutual exclusion for write */ semaphore wrt = 1; /* mutual exclusion for readcount */ semaphore mutex = 1; int readcount = 0; Writer() { while (true) { wait(wrt); // Writer Code signal(wrt); } } Reader() { while (true) { wait(mutex); readcount++; /* Acquire write lock if "reads" haven't */ if (readcount == 1) wait(wrt); signal(mutex); // Reader Code wait(mutex); readcount--; /* release write lock if no more reads */ if (readcount == 0) signal(wrt); signal(mutex); } } ``` > Writer 和 Reader 競爭 lock wrt，如果是 Reader 先 `wait(wrt)`，則 Writer 只能先進入 sleep，等待 Reader 的 `readcount = 0` (實際上是最後一個 reader)時發生 signal 喚醒在 waiting queue 中的 writer，Reader 在這之前都會持續進行 Reader code 的執行，而 Writer code 的執行要等到收到 Signal 後才行。 ### Dining-Philosophers Problem - 5 個人坐在椅子上，桌上有 5 根筷子 - 一個人不是正在 thinking 就是正在 eating - thinking: 沒有與其他人有互動 > idle - eating: 需要兩根筷子在手 - 一個人一次「可能」只拿到一根筷子 $\rightarrow$ 可能身旁的人正在吃，所以他不能吃。 - 完成 eating: 放下手上兩根筷子 - deadlock problem - 一根筷子作為一個 semaphore - starvation problem - 有人吃不到 ![](https://i.imgur.com/LEiRSUi.png) > 解決方案，在接下來小節。 ## Monitors ### Motivation - 雖然 semaphores 提供了一個方便而且有效的 synchronization 機制，它的正確性卻是仰賴於 programmer - 所有 processes 存取一共享 data object 一定 execute `wait()` and `signal()` 在正確的順序和地方上。 - 這可能不會是成立，因為編程錯誤或不合作的 programmer #### Monitor --- A high-level language construct - a monitor 型態的表示組成由 - 變數的宣告，其值定義這型態的實例的狀態 (state)。 - **Procedures/functions** 在類型上實作 operations。 - The monitor 類型類似於 class in O.O. language - 在 monitor 內的 a procedure 只能存取 local variables and the formal variables。 - monitor 的 local variables 只可以被 local procedures 使用。 - 但是，monitor 確保一次只能有一個 process 在 monitor 內 **active**。 ### Monitor - High-level synchronization construct 允許在 concurrent processes 之中安全共享 abstract data type。 ``` Syntax monitor monitor-name { //shared variable declarations procedure body P1(...){ ... } procedure body P2(...){ ... } procedure body Pn(...){ ... } initialization code{ } } ``` **Schematic View** ![](https://i.imgur.com/mkNb6dx.png) ### Monitor Condition Variables - 為了允許一個 process wait **within** the monitor, a condition variable 一定要宣告，如 - `condition x, y;` - Condition variable 只能與 operations `wait()` and `signal()` 一起使用 - `x.wait();` > 意思呼叫這個操作的 process 被暫停直到另一個 process invokes。 - `x.signal();` > 僅恢復一個暫停中的 process。如果沒有 process 是暫停的，則 `signal` operation 沒有作用。 (相較之下，signal 總是改變 semaphore 的狀態） #### Monitor With Condition Variables ![](https://i.imgur.com/vpAIPdb.png) ### Dining Philosophers Example ```c= monitor dp { enum {thinking, hungry, eating} state[5]; //current state condition self[5]; //delay eating if can't obtain chopsticks void pickup(int i); //pickup chopsticks void putdown(int i); //putdown chopsticks void test(int i); //try to eat void init() { for (int i = 0; i < 5; i++) state[i] = thinking; } } void pickup(int i) { state[i] = hungry; test[i]; //try to eat if (state[i] != eating) self[i].wait(); //wait to eat } void putdown(int i) { state[i] = thinking; /* checking if neighbors are waiting to eat */ test((i + 4) % 5); test((i + 1) % 5); } /* try to let Pi eat (if it it hungry) */ void test(int i) { if ((state[(i + 4) % 5] != eating) && (state[(i + 1) % 5] != eating) && (state[i] == hungry)) { /* No neighbors are eating and Pi is hungry */ state[i] = eating; /** * If Pi is suspended, resume it * If Pi is not suspended, no effect **/ self[i].signal(); } } ``` #### An illustration ![](https://i.imgur.com/xLiEbLn.png) ``` P1: P2: DiningPhilosophers.pickup(1) DiningPhilosophers.pickup(2) eat eat DiningPhilosophers.putdown(1) DiningPhilosophers.putdown(2) ``` > P1 開始做 `pickup(1)`，它的 `state[1] = hungry;`，接著做 `test(1)`，發現左右兩邊的人 `state` 都不是 eating，且自己 `state` 是 hungry，所以將自己 `state` 改成 `eating`，並且執行`self[1].signal` (no effect)。換 P2 開始做 `pickup(2)`，它的 `state[2] = hungry;`，接著做 `test(2)`，發現 `test(2)` if 條件不滿足，所以 `self[2].wait();`。P1 吃完後放下筷子 `DiningPhilosophers.putdown(1)`，它的 `state[1] = thinking;`，檢查 ( `test`）左右兩邊的人 (P2 and P0 `thinking`)，其中 P2 的 `state` 會變成 `eating`，然後恢復暫停中的 P2， P2 可以去吃，然後 P2 吃完放下筷子`DiningPhilosophers.putdown(2)`。 ### Synchronized Tools in JAVA - Synchronized Methods (Monitor) - Synchronized method uses the method receiver as a lock - 在同一物件上，synchronized methods 的兩次呼叫不可能交錯(要等到一個做完，才能做另一個)。 ```java= public class SynchronizedCounter { private int c = 0; public synchronized void increment() { c++; } public synchronized void decrement() { c--; } public synchronized int value() { return c; } } ``` - Synchronized Statement (Mutex Lock) - Synchronized blocks uses the expression as a lock - A synchronized Statement: the thread 獲得的鎖只能夠用於在語句中已經提及到的 object or class > 下面的程式碼中，p1 前後會有 lock 和 unlock 去解決 synchronization problem - useful for improving concurrency with fine-grained ```java= public void run() { synchronized(p1) { int i = 10; // statement without locking requirement p1.display(s1); } } ``` ## Atomic Transactions ### System Model - Transaction: 執行單個邏輯功能的指令的集合。 - Atomic Transaction: operations happen as a single logical unit of work, in its entirely, or not at all。 > 在單一 logical 工作單元只有一次做完 transaction 或者根本不做。 - 對於 database system 而言，特別關注於 Atomic Transaction。 ### File I/O Example - Transaction 是 read 和 write 一連串的 operations。 - 由 `commit` (transaction successful) 或者由 `abort` (transaction failed) 來中止。 - Aborted transaction 一定要回滾到它未做任何改變的執行前。 ### Log-Based Recovery - 記錄到 stable storage 關於所有由 transaction (交易) 所修改的資訊 > stable storage: never lost its stored data - Write-ahead logging (預寫式日誌): 每個 log record 描述單一 transaction 寫入 operation。 - Transaction name - Data item name - Old & new values - Special events: $<T_{i},\ starts>,\ <T_{i},\ commits>$ - log 用於重建由 transactions 修改的 data items 的狀態。 - 使用 undo ($T_{i}$), redo($T_{i}$) to recover data ### Checkpoints - 當失敗發生時，必須查看 log 以確定必須重新執行哪些 transactions。 - 搜尋 process 是耗時的。 - 並非所有 transactions 都需要重做 - 使用 checkpoints 減少上述開銷 (overhead): - 輸出所有 log records 到 stable storage - 輸出所有 modified data 到 stable storage - 輸出一 log record \<checkpoint> 到 stable storage > 簡單來說，就是類似於將系統還原的還原點，不需要從 0 開始做，從有受到影響之前開始再做即可。 ## Ref [1] [軟件開發平台及語言筆記大全(超詳細)](https://www.cntofu.com/book/46/index.html) [2] [threadpool/threadpool.c at master · mbrossard/threadpool · GitHub](https://github.com/mbrossard/threadpool/blob/master/src/threadpool.c) [3] [上課影片 link](https://www.youtube.com/playlist?list=PLS0SUwlYe8czigQPzgJTH2rJtwm0LXvDX) [4] [投影片 link](https://ocw.nthu.edu.tw/ocw/index.php?page=course_news_content&cid=141&id=999)