## Deadlock - A situation in which **every** process in a set of processes is **waiting** for an event that can be caused only by another process in the set :::success **Ex.** - 2 processes and 2 tape drivers - Each process holds a tape drive - Each process requests another tape drive ::: ## System Model A **thread** may use a resource in the following sequence - **Request** - 要求分配某些資源 - **Use** - 使用分配到的資源 - **Release** - 釋出分配到的資源 --- - Resources types $R_1,R_2,...R_m$ - E.g., CPU, Disk, I/O devices, ... - Each resource type $R_i$ has $W$ instances - E.g, a computer has 4 CPU cores, 1 networking interface ... ## Deadlock in Multithreaded Applications ```c /* thread_one runs in this function */ void *do_work_one(void *param) { pthread_mutex_lock(&first_mutex); pthread_mutex_lock(&second_mutex); /** * Do some work */ pthread_mutex_unlock(&second_mutex); pthread_mutex_unlock(&first_mutex); pthread_exit(0); } /* thread_two runs in this function */ void *do_work_two(void *param) { pthread_mutex_lock(&second_mutex); pthread_mutex_lock(&first_mutex); /** * Do some work */ pthread_mutex_unlock(&first_mutex); pthread_mutex_unlock(&second_mutex); pthread_exit(0); } ``` - 若 work 1 等到 `second_mutex` 前, work 2 已取得 `second_mutex`,就會產生 deadlock - work1 持有 `first_mutex`,並等待 work 2 持有的 `second_mutex`,但 work2 也在等 work1 的 `first_mutex` - work2 需要 work1 釋出 `first_mutex`,才能釋出 `second_mutex`,但同時 `second_mutex` 要先被釋出 work1 才會釋出 `first_mutex` ### Live lock ```c /* thread_one runs in this function */ void *do_work_one(void *param) { int done = 0; while (!done) { pthread_mutex_lock(&first_mutex); if (pthread_mutex_trylock(&second_mutex)) { /** * Do some work */ pthread_mutex_unlock(&second_mutex); pthread_mutex_unlock(&first_mutex); done = 1; } else { pthread_mutex_unlock(&first_mutex); } } pthread_exit(0); } /* thread_two runs in this function */ void *do_work_two(void *param) { int done = 0; while (!done) { pthread_mutex_lock(&second_mutex); if (pthread_mutex_trylock(&first_mutex)) { /** * Do some work */ pthread_mutex_unlock(&first_mutex); pthread_mutex_unlock(&second_mutex); done = 1; } else { pthread_mutex_unlock(&second_mutex); } } pthread_exit(0); } ``` ## Deadlock Characterization ### Necessary Conditions - A deadlock can arise when four conditions hold simultaneously - **Mutual Exclusion** - Only 1 process at a time can use a resource - 一個 resource 一次只能被一個 process 使用 - **Hold** & **wait** - A process holding some resources and is waiting for another resource - Process 至少取得一個 resource,然後等其他 process 握住的 resource 釋出 - **No preemption** - a resource can be only released by a process voluntarily - 釋放資源時,是 process 自願的,沒有被中斷 - **Circular wait** - there exists a set $\{P_0,P_1,...,P_n\}$ of waiting processes such that $P_0 \rightarrow P_1 \rightarrow ... \rightarrow P_n \rightarrow P_0$ - 每個 process 都在等另一個 process 握住的 resource 釋放,這也代表 single process 不會有 deadlock :::warning **All 4 conditions** must hold for **possible deadlock** (要形成deadlock,必須同時滿足 4 個條件) ::: ### Resource-Allocation Graph - System resource-allocation graph - Request edge - Assignment edge ![image](https://hackmd.io/_uploads/BJfDQh-wp.png) ![image](https://hackmd.io/_uploads/S1_OX3bwT.png) ![image](https://hackmd.io/_uploads/rJVKQnWPT.png) ![image](https://hackmd.io/_uploads/SyY57hWPT.png) ## Methods for Handling Deadlocks - Deal with the deadlock problem in one of three way - Ignore the problem - Use a protocol to **prevent** or **avoid** deadlocks, ensuring that the system will never enter a deadlock state - Allow the system to enter a deadlock state, **detect** it, and **recover** ## Deadlock prevention - For a deadlock to occur, each of the four conditions necessary conditions must hold - By ensuring that **at least one of these conditions cannot hold**, we can prevent the deadlock - 設計時就不讓 deadlock 產生 ### Mutual exclusion - sharable resources - Non-sharable resources - 對**不可共用**的資源類型(例如寫檔)而言,**互斥一定成立** - 而可共用的資源類型(例如讀檔,因為可以同時讀取相同檔案),一定不會產生 ### Hold and wait - Process 必須保證在要求一項資源時,不能佔用其他資源,除非他可以一次取得所有資源 - Require each process to request and be allocated all its resources before it begins execution; or, allows a process to request resources only when it has none - Disadvantages - Low resource utilization - Starvation is possible ### No Preemption - 只要 process 握著一些資源,但得不到其他的,就要先全部釋放,再重新申請 - If a process is holding some resources and requests another resources that cannot be immediately allocated to it, then all resources the process is currently holding are preempted ### Circular wait - 對 resource type 強制安排線性順序,不讓 circular wait 的條件達成 - Impose a total ordering of all resource types and to require that each process requests resources in an increasing order of enumeration ```c void transaction(Account from, Account to, double amount) { mutex lock1, lock2; lock1 = get_lock(from); lock2 = get_lock(to); acquire(lock1); acquire(lock2); withdraw(from, amount); deposit(to, amount); release(lock2); release(lock1); } ``` ## Deadlock Avoidence ### Safe state - Safe sequence ![image](https://hackmd.io/_uploads/ByVsw3ZPp.png) ### Resource-Allocation Graph Algorithm - Single instance of each resource type - Use a variant of the resource-allocation graph - 以下圖為例,方框表示 resource、圓形表示 process - 從 process 指向 resource 的虛線箭頭: 表示 process 想要該資源 - 從 resource 指向 process 的實線箭頭: 表示資源被分配給該 process ![image](https://hackmd.io/_uploads/HkKxOhZvT.png) ![image](https://hackmd.io/_uploads/HyE-d2-D6.png) ### Banker's Algorithm - Data structures - Available: 目前系統中可用的閒置資源 - Max: 某個 process 要完成執行,需要的資源數量 - Allocation: 某個 process 已分配到的資源數量 - Need: Max - Allocation,即某 process 還需要多少資源以完成執行 <!-- example to do --> ## Deadlock Detection ### Single Instance of Each Resource Type - Wait for graph ![image](https://hackmd.io/_uploads/Bkj2jhZP6.png) ### Several Instances of a Resource Type - Deadlock detection algorithm - Data structure - Available - Allocation - Request ### Detection-Algorithm Usage - How often is a deadlock likely to occur? - How many processes will be affected by a deadlock when it happens? ## Recovery from Deadlock ### Process and Thread Termination - Abort **all** deadlocked processes - Abort **one** process at a time **until the deadlock cycle is eliminated** --- - The factors may affect which process is chosen - What the **priority** of the process is? - **How long** process has computed and **how much longer** the process will compute before completing its designated task? - How many and what types of **resources** the process has used? - **How many more resources** the process needs to complete? - **How many** processes will **need to be terminated**? - Whether the process is **interactive or batch**? - 停止 batch process 使用者比較無感 - QoE (Quality of Experience) ### Resource Preemption - Three issues need to be addressed - Selecting a victim - 選擇哪個 process 要被搶佔 (preempted)。 - 確保搶佔的順序以最小化成本。 - Rollback - 從某個 process 搶佔資源可能會使其無法正常執行,必須 rollback 回到某個安全的狀態再繼續。 - 最簡單的 rollback 是直接中止這個 process;較有效率的方法是僅在必要時打破 deadlock,但系統需要保存更多關於每個 process 的 state 的資訊。 - Starvation - 如何保證資源不會總是被同一個 process 搶佔?