Chap. 06 - Process synchronization

# Chap. 06 - Process synchronization > 課程內容 : 清華大學開放式課程周志遠教授 > 參考書目 : Operating System Concepts (9th), Abraham Silberschatz, Peter Baer, Galvin, Greg Gagne > > 其他科目內容請參見 [[Here]](https://hackmd.io/@ChenZE/By4SOO6Jyl) **Note:** [#race condition](#Q-請說明何謂-race-condition-與-critical-section-) [#critical section](#Q-請說明何謂-race-condition-與-critical-section-) [#mutex lock](#Q-請說明互斥鎖mutex-lock的概念-) [#H.W. support](#Q-除了用軟體的方式解決同步化的問題外，能不能使用其他方式-) [#semaphore](#Q-請說明同步化中的-semaphore-) [#spinlock](#Q-請說明何謂自旋鎖spinlock-) [#monitor](#Monitor) ## Content * [Background](#Background) * [1. Race condition](#1-Race-condition) * [Critical section](#Critical-section) * [1. Critical section problem](#1-Critical-section-problem) * [2. Critical section requirements](#2-Critical-section-requirements) * [3. Software solution](#3-Software-solution) * [3.1 Peterson's solution](#31-Petersons-solution) * [3.2 Producer/consumer problem](#32-Producerconsumer-problem) * [3.3 Bakery algorithm](#33-Bakery-algorithm) * [3.4 Condition variable](#34-Condition-variable) * [Synchronization hardware](#Synchronization-hardware) * [1. Hardware support](#1-Hardware-support) * [2. Automic `TestAndSet()`](#2-Automic-TestAndSet()) * [3. Atomic `Swap()`](#3-Atomic-Swap()) * [Semaphores](#Semaphores) * [1. Non-busy waiting implement](#1-Non-busy-waiting-implement) * [2. Semaphore with critical section](#2-Semaphore-with-critical-section) * [3. Cooperation synchronization](#3-Cooperation-synchronization) * [4. Deadlock and starvation](#4-Deadlock-and-starvation) * [Classical problem of synchronization](#Classical-problem-of-synchronization) * [1. Bounded-buffer problem](#1-Bounded-buffer-problem) * [2. Reader-writer problem](#2-Reader-writer-problem) * [2.1 First RW problem](#21-First-RW-problem) * [3. Dining-Philosophers problem](#3-Dining-Philosophers-problem) * [Monitor](#Monitor) * [1. Monitor condition variables](#1-Monitor-condition-variables) * [2. Dining Philosophers example](#2-Dining-Philosophers-example) ## Background 當有兩個以上的 process **同時存取(access)共享(shared)的資料**時，會發生資料的不一致性(data inconsistency)，最主要的原因是**執行==順序不同==**。為了維持資料間的一致性，需要確保**使用 shared data 的 processes 之間的==執行順序==**。以消費者-生產者問題(consumer - producer problem)為例： ```c= // producer while (1) { nextLitem = getItem(); whil (counter == BUFFER_SIZE); buffer[in] = nextItem; in = (in + 1) % BUFFER_SIZE; counter++; } // consumer while (1) { while (counter == 0); item = buffer[out]; out = (out + 1) % BUFFER_SIZE; counter--; } ``` 將 `counter++`/`counter--` 分解成 assembly code 有 3 個步驟 ``` ; counter++ move ax, counter ; move from memory(counter) to register(ax) add ax, 1 ; plus 1 move counter, ax ; move from register(ax) to memory(counter) ; counter-- move bx, counter ; move from memory(counter) to register(bx) sub bx, 1 ; plus 1 move counter, bx ; move from register(bx) to memory(counter) ``` 綜合交錯的指令順序如下： ``` # block 1 producer: move ax, counter -> ax = 5 producer: add ax, 1 -> ax = 6 #----- context switch -----# # block 2 consumer: move bx, counter -> bx = 5 consumer: sub bx, 1 -> bx = 4 #----- context switch -----# # block 3 producer: move counter, ax -> counter = 6 #----- context switch -----# # block 4 consumer: move counter, b -> counter = 4 ``` 儘管使用一個全域變數 counter 來控制增減，但在 assembly code 的最後階段一樣會產生同步問題，因為兩者都要從暫存器(ax/bx)寫入記憶體(counter)中，依據**不同的指令順序 `counter` 會產生不同的結果** * block: 1 -> 2 -> 3 -> 4, counter = 4 * block: 1 -> 2 -> 4 -> 3, counter = 6 * block: 1 -> 3 -> 2 -> 4, counter = 5 ### 1. Race condition **競爭條件(race condition)** 指的是當有**多個 process 同時存取**與控制 shared data 時，最終**結果會取決於==存取時的順序==**。為了避免 race condition 引發的問題，**同時執行的 process** 必須做 ==**同步化(synchronized)**== 的處理，這類問題通常也稱為 critical section problem。若一個系統中含有 n 個 processes，每一個 process 都有**一段程式碼**會涉及到**與其他 process 共享資料**(Ex. 存取/更新)，則這段程式碼就稱為**臨界區間(critical seciton)**。(以下簡稱 C.S.) (註 : 每個 process 都有自己的臨界區間) :::info * Critical section 指的是**可能產生 race condition 的區域** * 對**單一核心的系統**而言，可以直接**阻擋** interrupt 的訊號(disable interrupt)或是使用 non-preemptive CPU scheduling 預防 race condition。這也是適合 OS kernel 的解決辦法之一 ::: > [!Tip] **Interview Ques.** > ##### Q: 請說明何謂 race condition 與 critical section ? > Race condition 指的是有多個 process 同時存取或修改共享資源的時候，因為執行順序不同導致結果不可預期。 > Critical section 指的是涉及到與其他 process 共享資料的區段。 ## Critical section ### 1. Critical section problem Critical section 的目的是**制定**一個所有涉及存取共享資料的 processes 都必須遵守的**協議(protocol)**。 * 有 n 個 process 正在競爭使用某些共享資料 * 每個 process 都有一塊被稱為 critical section 的程式碼，共享資料會在這個區塊進行存取動作 * 需要確保**一次只有一個** process 能夠在 C.S. 中執行 :::info 會造成同步問題的區段稱為 critical section，需在 C.S. 前後加上一些協議(程式)保證只能有一個 process 能在 C.S. 中執行。 ::: 一般 critical section 的程式碼結構如下： * 要求允許進入 C.S. 的入口(entry) * C.S. * 離開 C.S. 的出口(exit) * 剩餘程式碼(remainder) ``` do { entry section # get entry premission critical section # modify shared data exit section # release entry permission remainder section } while (1); ``` ### 2. Critical section requirements C.S. 的設計需要滿足以下三個**必要條件**： **(1) Mutual exclusion** (互斥) 當某個 process P 正在自己的 C.S. 中執行，其他 process 不能進入自己的 C.S. 中執行。**==確保所有 C.S. 中一次只能有一個 process 正在執行==**。如果此條件不滿足可能會導致 race condition。 **(2) Progress** 如果 C.S. 中**沒有** process 正在執行，且此時有某個 process 希望進入 C.S. 中執行，則此 process 可以 **==隨時進入不會被阻擋==**。如果此條件不滿足可能會導致 **dead lock**。 **(3) Bounded waiting** Process **==等待進入 C.S. 的時間必須要有限制==**，不可以無限制的讓 process 等待。如果此條件不滿足可能會發生 **starvation**。 ### 3. Software solution 假設有兩個 process P$_0$ 與 P$_1$ 和一個共享變數 `turn`(初始值 = 0)。當 `turn = i` 時表示 P$_i$ 進入自己的 C.S. 中。每個 process 各自的 code segment 如下圖： ![image](https://hackmd.io/_uploads/ryR6NpT_Jx.png) ```= # process P0 do { while (turn != 0); # entry section(busy waiting) critical section turn = 1; # exit section remainder section } while (1); # process P1 do { while (turn != 1); # entry section(busy waiting) critical section turn = 0; # exit section remainder section } while (1); ``` * 對 process P$_0$ * 當 `turn != 0` 時不會進入 P$_0$ 的 C.S.，會在外層(第 3 行)的 `while` 迴圈中等待(無窮迴圈)，直到取得鑰匙(`turn == 0`) * 當 P$_0$ 進入 C.S. 執行完離開後需要**交出鑰匙(`turn = 1`)讓其他 process 可以進入**，且 P$_0$ 自己繼續執行剩下的 section * 對 process P$_1$ * 當 `turn != 1` 時不會進入 P$_1$ 的 C.S.，會在外層(第 12 行)的 `while` 迴圈中等待(無窮迴圈)，直到取得鑰匙(`turn == 1`) * 當 P$_1$ 進入 C.S. 執行完離開後需要**交出鑰匙(`turn = 0`)讓其他 process 可以進入**，且 P$_1$ 自己繼續執行剩下的 section 以上述演算法來說 * Mutual exclusion :heavy_check_mark: * Progess :heavy_multiplication_x: * 當 P$_0$ 執行完交出鑰匙(`turn = 1`)後但 P$_1$ 還不需要執行，此時如果 P$_0$ 又要執行就會無法進入 C.S. 中 * Bounded waiting :heavy_check_mark: #### 3.1 Peterson's solution 將上述基礎方法做部分改進(shared variable) * (原有) `int turn;` 初始值 `turn = 0` * `turn = i` 表示 P$_i$ **可以進入** C.S. * (新增) `bool flag[2];` 初始值 `flag[0] = flag[1] = False` * `flag[i] = True` 表示 P$_i$ **準備進入** C.S 中 ```= # for process Pi do { flag[i] = True; turn = j; while (flag[j] && turn == j); critical section flag[i] = False; remainder section } while (1); ``` 當某個 process P$_i$ 想要執行 C.S. 時 * (第 3 行)準備進入(`flag[i] = True`) * (第 4 行)把鑰匙交出去給別人 `turn = j` * 目的是**檢查有沒有其他的 process P$_j$ 要進入 C.S. 中**，如果有就先給別人執行 * (第 5 行)檢查其他 process P$_j$ 的條件 * 如果 P$_j$ 準備好(`flag[j] = True`) * 且 P$_j$ 也拿到鑰匙(`turn == j`) * 則不會執行 P$_i$ 的 C.S.，會在外層(第 5 行)的 `while` 迴圈等待(無窮迴圈) * (第 7 行)當 P$_i$ 執行完 C.S 後設置為不想進入 C.S. 中(`flag[i] = False`) ##### Proof (1) Mutual exclusion :heavy_check_mark: 假設同時有兩個 process 都在 C.S. 中，則 `flag[0] = flag[1] = True;`，但 `turn` 不可能同時為 `1` 與 `0`，因此不會有兩個 process 同時在 C.S. 的情況發生。 (2) Progess :heavy_check_mark: 假設 P$_0$ 準備進入 C.S. 中(`flag[0] = True`) * 若 P$_1$ 尚未準備進入(`flag[1] = False`)，則 P$_0$ 可以進入 * 若 P$_1$ 也準備進入(`flag[1] = True`) * `turn == 0` 則 P$_0$ 進入；`turn == 1` 則 P$_1$ 進入不論哪個 case，只要有 process **準備好正在等待就可以隨時進入** C.S.。 (3) Bounded waiting :heavy_check_mark: 假設 P$_0$ 準備進入 C.S. 中(`flag[0] = True`) * 若 P$_1$ 離開 C.S.(`flas[1] = False`)，則 P$_0$ 可以進入 * 若 P$_1$ 離開 C.S. 且又準備再次進入(`flag[1] = True`) * 鑰匙在 P$_0$ 手上(`turn == 0`)，只有 P$_0$ 可以進入就算 P$_1$ 想馬上再次進入 C.S. 中，但因為一開始已經把鑰匙交出去(`turn == 0`)，所以無法馬上執行，P$_0$ 不會一直在等待。(所以 `turn` 才要放前面) #### 3.2 Producer/consumer problem 以 producer/consumer problem 為例，嘗試將 C.S. 放在不同位置 (1) 將整段程式碼放在 C.S. 之中 ```c= // Producer while (TRUE){ entry-section(); // enter critical section nextItem = getItem(); while (counter == BUFFER_SIZE); buffer[in] = nextItem; in = (in + 1) % BUFFER_SIZE; counter++; remainder(); // remainder code exit-section; // exit critical section } // Consumer while (TRUE){ entry-section(); // enter critical section while (counter == 0); item = buffer[out]; out = (out + 1) % BUFFER_SIZE; counter--; remainder(); // remainder code exit-section(); // exit critical section } ``` 可能產生 deadlock。如果 consumer 先進入 C.S.，此時 buffer 是空的(counter == 0)，所以會停下來等待 producer 產生資料，但同時 producer 又正在等 consumer 結束後進入 C.S. 中。 (2) 將 race condition 後的程式碼放到 C.S. 中 ```c= // Producer while (TRUE){ nextItem = getItem(); while (counter == BUFFER_SIZE); buffer[in] = nextItem; in = (in + 1) % BUFFER_SIZE; entry-section(); // enter critical section counter++; remainder(); // remainder code exit-section; // exit critical section } // Consumer while (TRUE){ while (counter == 0); item = buffer[out]; out = (out + 1) % BUFFER_SIZE; entry-section(); // enter critical section counter--; remainder(); // remainder code exit-section(); // exit critical section } ``` 改善了 deadlock 但效能變差，因為還沒釋放 C.S. 就開始做其他計算，會導致 C.S. 的區間擴大，降低效能。 (3) 只將 race condition 的部分放到 C.S. 中 ```c= // Producer while (TRUE){ nextItem = getItem(); while (counter == BUFFER_SIZE); buffer[in] = nextItem; in = (in + 1) % BUFFER_SIZE; entry-section(); // enter critical section counter++; exit-section; // exit critical section remainder(); // remainder code } // Consumer while (TRUE){ while (counter == 0); item = buffer[out]; out = (out + 1) % BUFFER_SIZE; entry-section(); // enter critical section counter--; exit-section; // exit critical section remainder(); // remainder code } ``` 正確做法，維持效能且不會產生 deadlock。 #### 3.3 Bakery algorithm 前述方法只比較**兩個 process 之間的同步問題**，現在考慮 n 個 process 之間的同步問題(bakery algorithm) * 在進入 C.S. 之前，所有 process 會抽取一個號碼牌(#) * 只有最小 (#) 的 process 才能進入 C.S. 中 * (#) 以非遞減的方式產生，可能會有相同 (#) 狀況需要做額外處理 * 若兩個 porcess 的 (#) 相同，則比較兩者的 PID(越小的先執行) * Notion: (a, b) = (#, PID) 以下段程式碼為例 * `bool choosing[n]`: 全域變數，告訴其他的 process 目前有哪個 process 正在抽 (#) * `choosing[i] = True` 表示正在抽牌 * `choosing[i] = False` 表示沒有在抽牌(預設) * `int num[n]`: 全域變數，表示 process 目前的號碼，預設為 0 ```c= // for process Pi do { choosing[i] = True; num[i] = max(num[0], num[1], ..., num[n-1]) + 1; choosing[i] = False; for (int j = 0; j < n; j++){ while (choosing[j]); while ((num[j] != 0) && ((num[i], i) < (num[j], j))); } critical section; num[i] = 0; remainder section; } while (1); ``` * (第 3 行)：告訴其他 process 目前 P$_i$ 正在抽 (#) * (第 4 行)：P$_i$ 的 (#) = 目前最大 (#) + 1 * (第 5 行)：P$_i$ 結束抽 (#) * (第 6 行)：跟其他的 process 比較 * (第 7 行)：如果 `choosing[j] = True` 表示 P$_j$ 正在抽牌，(#) 有可能改變 * 需等待 P$_j$ 抽完後 P$_i$ 才能繼續(進入無窮迴圈等待) * (第 8 行)：跟其他 P$_j$ 比較 (#)，若 (#) 相同再比較 PID(i for P$_i$ and j for P$_j$) * 需要 `num[j] != 0` 比較才有意義，因為 `num[j] = 0` 表示 P$_j$ 根本沒抽牌 * 如果 (#) 相同且 PID 比較大(i > j)則等待別人先執行(進入無窮迴圈) * (第 11 行)：執行完 C.S. 後釋放號碼牌 :::info **`choosing[i]` 的用意** * 若缺少 `choosing` 陣列(without locking) * 假設 5 是目前最大號碼 * 若 P$_1$ 與 P$_4$ 一起抽 (#) 且 P$_4$ 先完成 * 此時 P$_1$ = 0 且 P$_4$ = 6 所以 P$_4$ 先進入 C.S. 中 * 此時 P$_1$ 也抽完(P$_1$ = 6)所以也會進入 C.S. 中 * 加上 `choosing` 陣列後(with locking) * P$_4$ 抽完牌後會被擋住直到 P$_1$ 抽完，此時 P$_1$ = P$_4$ = 6，兩者再做比較並選擇一個進入 C.S. 中 ::: ##### Example 在 C 中的 `<pthread.h>` 函式庫有實作 mutual exclusion 的相關函式(POSIX protocol) * 使用 `mutual exclusion(mutex)` 前需要先宣告一個名為 `pthread_mutex_t` 的型態並初始化 `mutex` * `pthread_mutex_destroy()` 用來結束 mutex * C.S. 會被夾在 `pthread_mutex_lock()` 與 `pthread_mutex_unlock()` 中保護 * `pthread_mutex_lock()` 會讓 process 取得權限並進入 C.S. * `pthread_mutex_unlock()` 會讓 process 釋放權限並離開 C.S. ```c= // Example pthread #include <pthread.h> pthread_mutex_t mutex; pthread_mutex_init(&mutex, NULL); pthread_mutex_lock(&mutex); critical section; pthread_mutex_unlock(&mutex); pthread_mutex_destroy(&mutex); ``` > [!Tip] **Interview Ques.** > ##### Q: 請說明互斥鎖(mutex lock)的概念 ? > Mutex lock 是一種保護共享資源的機制，避免多個 process 同時進入 critical section 導致 race condition。 #### 3.4 Condition variable Condition variable 指的是當某個條件發生時不同的 processes/threads 會執行對應的動作 * 某些 threads 會持續等待直到某個 condition variable(C.V.)發生 * 當 condition 發生時，某些 thread 會去通知其他的 waiting threads 通常有三種運算 * `wait()`: thread 會被阻擋直到其他的 threads 呼叫 `signal()` 或 `broadcase()` * `signal()`: 喚醒某個正在等待 C.V. 的 thread * `broadcast()`: 喚醒所有正在等待 C.V. 的 thread 在 `<pthread.h>` 函式庫中支援以上 3 種運算，且使用前同樣需要宣告一個 `pthread_cond_t` 的型態並初始化 * `pthread_cond_init(&theCV)` * `pthread_cond_wait(&theCV, &someLock)` * `pthread_cond_signal(&theCV)` * `pthread_cond_broadcast(&theCV)` ##### Example Condition variable 通常搭配 mutual exclusion 一起使用。假設有兩個 threads，其中一個當 `x = 0` 時會執行動作，另一個 thread 負責將 `x` 值遞減 ```c= // Example pthread_cond_t cond; pthread_cond_init(cond, NULL); pthread_mutex_t mutex; pthread_mutex_init(mutex, NULL); // thread 1 action(){ pthread_mutex_lock(&mutex); if (x != 0) pthread_cond_wait(cond, mutex); pthread_mutex_unlock(&mutex); tack_action(); } // thread 2 counter(){ pthread_mutex_lock(&mutex); x--; if (x == 0) pthread_cond_signal(cond); pthread_mutex_unlock(&mutex); } ``` * (第 12 行)當 `x!= 0` 時 thread 1 會進入 sleep 並釋放權限 * 直到 `x == 0` 時 thread 2 會喚醒(第 22 行)正在等待的 thread 並釋放權限 * (第 12 行)當 thread 1 被喚醒後會重新取得權限並執行 C.S. * (第 13 行)執行完 C.S. 後釋放權限並執行 remainder section ## Synchronization hardware > [!Tip] **Interview Ques** > ##### Q: 除了用軟體的方式解決同步化的問題外，能不能使用其他方式 ? > 也可以使用硬體支援的方式解決同步化問題。可以在底層硬體或組合語言中設計一個**原子指令(atomic instruction)**，確保在 critical section 的操作不會被其他 process 中斷。 > ### 1. Hardware support C.S. problem 的問題是共享資料的修改會被打斷(interrupt)，硬體上的實作可以使用中斷干擾(disable interrupts)，但僅適用在 single-core 上，多核心系統無法使用。另一種 H.W. support 的解決方法：**atomic instrucitons**，一種不會被打斷的指令。以 `TestAndSet()` 與 `Swap()` 兩個概念說明。 ### 2. Automic `TestAndSet()` ```c boolean TestAndSet(bool &lock){ bool value = lock; // 儲存 lock 原來的值 lock =True; // 儲存後鎖起來 return value; } ``` :::info 上述的 `TestAndSet()` 與下面的 `Swap()` 函式只是一個**概念**，通常會**用 assembly code 或更底層的硬體**將此概念實作成單一條指令(instruction) ::: ```c= // Process P0 do { while (TestAndSet(lock)); critical section lock = False; remainder section } while (1); // Process P1 do { while (TestAndSet(lock)); critical section lock = False; remainder section } while (1) ``` * shared data: `boolean lock` * 初始值為 `False` * (第 3 行)與(第 11 行)檢查 `lock` 的狀況 * 當 `lock == False`(沒鎖)：進入 C.S. * 當 `lock == True`(鎖)：等待(無窮迴圈) * 只有第一個執行的 process 才能拿到鑰匙(`lock == False`)，因為此時 `lock = True`，其他 process 會持續等待 * (第 5 行)與(第 13 行)歸還鑰匙(`lock = False`) ### 3. Atomic `Swap()` ```c= // Process P0 do { key0 = True; while (key0 == True) Swap(lock, key0); /* critical section */ lock = False; /* remainder section */ } while (1); // Process P1 do { key1 = True; while (key1 == True) Swap(lock, key1); /* critical section */ lock = False; /* remainder section */ } while (1); ``` * Shared data: `boolean lock` * 初始值為 `False` * 當 `lock == False` 時才能進入 C.S. * 只有第一個呼叫 `swap()` 的才可以進入 C.S. * (第 7 行)與(第 17 行)：歸還鑰匙(`lock = False`) ## Semaphores > Semaphore (n): 號誌 ; 旗號 Semaphore 是一種**透過==控制 resource 的數量==處理同步問題**的方法，更具體的說是**紀錄目前還有多少(#record)的 resources 可以提供 processes 使用**。 * 若 #record = 1：目前**只有一個資源**可以使用，稱為 binary semaphore，類似 mutex lock * 若 #record > 1：目前有**多個資源**可以使用，稱為 counting semaphore recources 的存取仰賴**兩個 atomic operations**: `wait()` 與 `signal()` | Operaions | Intro | | :- | :- | | `wait()` | process 準備**開始執行**(需要 resource)，使可用**資源減少** | | `signal()` | process **結束執行**(釋放 resource)，使可用**資源增加** | ```c= wait (S){ while (S <= 0); // busy-waiting S--; // take out resource } signal(S){ s++; // release resource } ``` :::danger 這邊 `wait()` 與 `signal()` 的意義與 condition variable 不同，不要搞錯。 ::: > [!Tip] **Interview Ques.** > ##### Q: 請說明同步化中的 semaphore ? > Semaphore 是透過控制資源數量來處理同步問題的方法，它會紀錄目前還有多少的資源可以提供給 process 使用。提供兩種 atomic instruction : > * `wait()` 取得資源，數量減 1 > * `signal()` 釋放資源，數量加 1 上面 `wait()` 與 `signal()` 的實作同樣是一種**概念**，實際上會用**更底層的方式**將其包成 atomic instruction 進行，這種使用 busy-waiting 的實作俗稱 ==**spinlock(自旋鎖)**==。因為當沒有資源可用時(第 2 行)會持續**等待**(無窮迴圈)，造成 **CPU 連續執行不中斷**。 > [!Tip] **Interview Ques.** > ##### Q: 請說明何謂自旋鎖(spinlock) ? > 自旋鎖也是一種保護 critical section 的同步機制。當 process 嘗試取得被佔用的鎖時，系統會進入 busy-waiting 的無窮迴圈中持續檢查等待。與之相比 mutex lock 在等待期間可能會進入睡眠狀態，等其他 process 結束後才會被喚醒執行。 > > ##### Q: 自旋鎖(spinlock)有何優點 ? > 因為 CPU 需要不斷偵測檢查，所以適合用在鎖(lock)的時間比較短的情況，減少 process 被喚醒時的成本。 > POSIX 也有提供 semaphore 的相關函式庫，且不會跟 pthread 函式庫綁在一起，可以在 process 的層級上使用，不一定只能用在 thread level。 **POSIX semapohre routines:** * `sem_init(sem_t *sem, int pshared, unsigned int value)` * `value`: 是 semaphore 的初始值(#record) * `sem_wait(sem_t *sem)` * `sem_post(sem_t *sem)` * `sem_getvalue(sem_t *sem, int *valptr)` * `valptr`: 目前 semaphore 的值 * `sem_destroy(sem_t *sem)` ```c= // Example #include <semaphore.h> sem_t sem; sem_init(&sem); sem_wait(&sem); // critical section sem_post(&sem); sem_destroy(&sem); ``` :::info 當 semaphore 的初始值(#record)設為 1 時其實就是 mutex lock 的概念，所以 #record = 1 時建議還是使用 mutex lock (如下) ```c= semaphore mutex = 1; // shared data, initialized mutex = 1 do { wait(mutex); // similar to pthread_mutex_lock(&mutex) /* critical section */ signal(mutex); // similar to pthread_mutex_unlock(&mutex) /* remainder section */ } while (1); ``` ::: ### 1. Non-busy waiting implement Non-busy waiting 的 semaphore 實作**不會使用到 while loop**，CPU 負擔小，但缺點是會**使用到 system call 所以速度較慢**，如果需要較長的等待時間在用這個方法。與 busy waiting 的方法相比需要修改兩處 : 1. 將 semaphore 以 queue 實作(對比 busy waiting，semaphore 是一個整數)，將目前正在**等待資源的 processes 放到 queue 中** 2. `wait()` 與 `signal()` 需要改為不使用 while loop 的方法 ##### 以 queue 實作 semaphore ```c= tydedef struct { int value = 0; // initialized to 0 struct process *L; } semaphore; ``` * 當 `value = 0` 表示沒有 process 在等待 * 當 `value = n > 0` 表示有 n 個 resources 可以使用 * 當 `value = -n < 0` 表示有 n 個 process 正在等待(即 queue 的長度) <img src="https://hackmd.io/_uploads/SkHsvIlYke.png" width=500> ##### 修改 `wait()` 與 `signal()` `wait()` 與 `signal()` 的實作**仰賴 system call**: `sleep()` 與 `wakeup()`，且也必須是 atomic operations ```c= void wait(semaphore S){ S.value--; if (S.value < 0){ // 目前沒有 recource 可用 add this process to S.L; // 去排隊 sleep(); // 放到 wait queue 中等待，不耗費 CPU 資源 } } void signal(semaphore S){ S.value++; if (S.value <= 0){ // 有人在 queue 中排隊 remove a process P from S.L; wakeup(P); // 移到 ready queue 準備執行 } } ``` :::info 這邊的 `sleep()` 與 `wakeup()` 兩個 system call 只是 pseudo code 的概念。 ::: ### 2. Semaphore with critical section 因為 semaphore 是**控制多個 processes 使用 shared resources 的方法**，所以雖然能夠使用 semaphore 處理同步問題，但 semaphore 的機制本身也會有同步問題，需要將可能產生 race condition 的共享變數放入 critical section 中 ```c= void wait(semaphore S){ entry-section(); S.value--; if (S.value < 0){ add this process to S.L; exit-section(); sleep(); } else { exit-section(); } } void signal(semaphore S){ S.value++; if (S.value <= 0){ remove a process P from S.L.; exit-section(); wakeup(P); } else { exit-section(); } } ``` ### 3. Cooperation synchronization 假設 P$_1$ 執行 S$_1$，P$_2$ 執行 S$_2$ 且 S$_2$ 只有在 S$_1$ 完成後才能執行 (1) Simple implement ``` semaphore sync; # shared variable and initialized to 0 P1: P2: -------------------------------------------------- S1; wait(sync); signal(sync); ---------------------> S2; P2 execute S2 until P1 call signal() ``` (2) Complicated example ``` processes No. | execute -------------------------------------------------- P1 | S1; signal(a); signal(b); P2 | wait(a); S2; signal(c); P3 | wait(b); S3; signal(d); P4 | wait(c); S4; signal(e); signal(f); P5 | wait(e); S5; signal(g); P6 | wait(f); wait(d); S6; signal(h); P7 | wait(g); wait(h); S7; ``` <img src="https://hackmd.io/_uploads/Sy37bvgY1g.png" width=200> ### 4. Deadlock and starvation Semaphore 是控制資源的手段與方法，當多個 process **請求 semaphore 的順序不當**時，可能會造成 process 之間彼此等待對方釋放 resource，產生 **deadlock**。 ![img](https://hackmd.io/_uploads/HyBrEDgt1g.png) Starvation 是因為 queue 的設計不佳，如果是 LIFO 機制的 queue，會導致先進入 queue 的 process 一直無法執行，因為後面進入 queue 的都會被優先執行。 ## Classical problem of synchronization ### 1. Bounded-buffer problem 就是 producer-consumer problem 暫略 ### 2. Reader-writer problem 假設前提： * Set of shared data * Group of processes * Reader processes: 負責讀取 shared data * Writer processes: 負責更新 shared data * Writer processes 在操作時，其他的 reader 都不能使用(i.e., exclusive access) Reader-writer problem 有兩種變型： * First RW problem(reader priority) * 多個 reader 可以同時存取 shared data，因為讀取本身不影響 shared data * Writer 必須等待直到所有的 reader 讀取完才能進行 update * Second RW problem(writer priority) * 一旦 writer 準備好，則 reader 就無法讀取 * 當 writer 更新完成後 reader 才能繼續讀取 shared data #### 2.1 First RW problem First RW problem 涉及到兩個同步化問題，因此會需要兩個 mutex lock 分別處理這兩個問題。 * 一個是 reader 與 writer 之間的順序導致的同步問題 * Race condition 是 writer 更新 shared data 的過程 * 需要確保 writer 的寫入動作被包含在 C.S. 中 * 另一個是多個 reader 之間讀取 shared data 的順序所導致的同步問題 * Race condition 為 `readCount` 變數，需要將其包含在 C.S. 中 ```c= // mutual exclusion for writer semaphore wrt = 1; // mutex lock for writer writer(){ while(True){ wait(wrt); // writer code signal(wrt); } } ``` * (第 2 行)writer 的 mutex lock * (第 6 行)確保 writer 拿到鑰匙後獨佔資源後才進行 update shared data ```c= // mutex exclusion for multi-reader semaphore nutex = 1; int readCounter = 0; reader(){ while(True){ wait(mutex); // entry critical section readCount++; if (readCount == 1){ wait(wrt); // exit critical section } signal(mutex); /* reader code */ wait(mutex); // entry critical section readerCount--; if (readCount == 0){ signal(wrt); } signal(mutex); // exit critical section } } ``` * Shared data * `mutex`: 的目的是保護 `readCounter` 避免多個 reader 之間的 race condition * `readCount`: 計算目前有多少的 reader 正在等待 * `reader()` 開始讀取，先進入 C.S. 讓 `readCounter` 加 1(第 7 行) * (第 9 行)如果目前的 `reader()` 是第一個(`readCount == 1`) * 必須先搶到 writer 的 mutex lock 讓 writer 無法執行(第 10 行) * 搶到後離開 C.S. 並開始執行當前 reader 的動作(第 12 行) * (第 16 行) reader 的動作結束後再次進入 C.S. 中讓 `readCount` 減 1 * (第 18 行)若此時沒有 reader 在等待執行(`readCount == 0`)，則可以開始進行 writer 的動作 * (第 19 行)釋放 writer 的 mutex locl * (第 21 行)離開 reader 的 C.S. > [!Warning] > Writer 可能發生 starvation problem，因為如果持續有 reader 在讀取 shared data，那 writer 將永遠無法更新 shared data ### 3. Dining-Philosophers problem 5 個人坐在 5 個椅子使用 5 支筷子吃飯，每個人只有兩種狀態: (1) thinking/waiting 與 (2) eating。需要同時拿到兩支筷子才能夠 eating，每個時間點只能一次拿起一支筷子。 <img src="https://hackmd.io/_uploads/S1Yf7a-FJl.png" width=300> (後面再詳述) ## Monitor Monitor 是比較高級的處理同步問題的方式，類似 O.O. 的概念，可以視為一種特殊的 class，需要保護的變數等同於 class 中的局部變數，必須操作方法(method)才能更改 local variable。但 monitor 需要確保一次只能直營一個方法。 > [!Note] > 嚴格來說不是一次執行一個，而是只有一個能夠活動(active)，其他都要等待(wait) ### 1. Monitor condition variables 結合 monitor 與 condition variables，允許 process 在 monitor 中等待，直到某個條件滿足後才繼續執行。condition variable 只能被 `wait()` 與 `signal()` 兩種方法使用 * `wait()`: * 將 process 加到 queue 中 * 將呼叫此方法的 process 中止，直到其他 process 呼叫 `signal()` 方法 * `signal()`: * 將 process 移除 queue 中 * 喚醒某個正在 `wait()` 的 process 開始動作，如果 `wait()` 中沒有 process 在等待則不會產生任何效果 > [!Note] > Monitor 與 semaphore 的差別在於 monitor 是用 queue 實作，而 semaphore 是用 counter 實作。以 `signal()` 運算來說，當沒有 process 在等待時，monitor 的 `signal()` 不會有任何動作，但 semaphore 會將 counter 加 1。 ### 2. Dining Philosophers example ``` monitor DP{ enum {think, hungry, earing} state[5]; condition self[5]; void pickup(int i); void putdown(int i); void test(int i); void init(){ for (int i = 0; i < 5; i++) state[i] = think; } } ``` 假設有 5 個人(`state[5]`)且每人可能有三種狀況 * `thinging`: 思考要不要吃(idle) * `hungry`: 要吃但正在等筷子 * `eating`: 正在吃(有兩筷) 每個人可執行以下動作 * `pickup(int i)`: 拿起筷子 * `putdown(int i)`: 放下筷子 * `test(int i)`: 測試能不能吃一開始初始化所有人都是 thinking 狀態。 ```c= void pickup(int i){ state[i] = hungry; test(i); if (state[i] != eating) self[i].wait(); } void putdown(int i){ state[i] = thinking; test((i + 4) % 5); // check left neighbor test((i + 1) % 5); // check right neighbor } void test(int i){ if ((state[(i + 4) % 5] != eating) && (state[(i + 1) % 5] != eating) && state[i] == hungry){ state[i] = eating; self[i].signal(); } } ``` * `pickup(int i)` 表示第 i 個人拿起筷子 * 拿起筷子表示準備要吃，狀態設為 `hungry` * 第 3 行的 `test(i)` 用來測試自己的狀況能不能吃 * 如果不能吃 `state[i] != eating` 則自己中止並進入 waiting queue 等別人吃完 * `putdown(int i)` 表示第 i 個人放下筷子(吃完) * 吃完後狀態改為 `thinking` * 第 12 行與 13 行測試左右兩邊的相鄰者能不能吃 * `test(int i)` 用來測試第 i 個人能不能吃(如果他想吃的話) * 如果左鄰居不吃(`state[(i + 4) % 5] != eating`)且右鄰居不吃(`state[i + 1] % 5 != eating`)且測試者想吃(`state[i] == hungry`) * 如果都滿足就將第 i 個人狀態改為 `eating` 表示開始吃 * 並叫醒第 i 個測試者開始動作 * `test(int i)` 的用意是測試第 i 個人能不能吃，且呼叫者跟受試者不見得相同以 Dining Philosophers problem 為例 ```= eat 表示與 synchronization 有關的 code P1: P2: ---------------------------- DP.pickup(1) | DP.pickup(2) eat eat DP.putdown(1) | DP.putdown(2) ``` * (第 5 行)當 P1 呼叫 `pickup(1)` 時狀態轉為 `hungry`，經過 `pickup(1)` 內部測試後開始吃 * 當 P1 吃到一半(第 6 行)時 P2 呼叫 `pickup(2)` 準備開始吃(第 5 行) * 此時 P2 狀態轉為 `hungry`，但經過 `pickup(2)` 內部測試後失敗(`state[2] != eating`) * 進入 waiting queue 等待 * 當 P1 結束後放下筷子呼叫 `putdown(1)`，在 `putdown(1)` 內部檢查左右鄰居是否想吃 * 經過 `putdown(1)` 內部調整後喚醒 P2 開始吃