Ch05: CPU Scheduling - HackMD

<style> p { display: inline; } .red { color: rgb(226, 62, 48); font-weight: bold; } .blue { color: rgb(41, 122, 236); font-weight: bold; } .green { color: rgb(32, 145, 63); font-weight: bold; } </style> [延伸閱讀](https://hackmd.io/@Chang-Chia-Chi/OS/https%3A%2F%2Fhackmd.io%2F%40Chang-Chia-Chi%2FOS-CH5) # CPU Scheduling ## Basic Concepts - CPU scheduling is the basis of multi-programmed operating systems - By **switching CPU among processes**, the OS can make the computer more productive - Objective - Have some process running at all times to **maximum CPU utilization** ### CPU-I/O Burst Cycle ![](https://hackmd.io/_uploads/SyPSwhEz6.png) - Cycle - CPU burst - I/O burst ![](https://hackmd.io/_uploads/S156DnVzT.png) - A large number of short CPU bursts and a small number of long CPU bursts - Selecting proper CPU scheduling algorithm ### CPU Scheduler - <font class='red'>Short-term scheduler (or CPU scheduler)</font> - Selecting ==a== process from the processes in the memory that are ready to execute and allocates the CPU to that processes ### Preemptive and Nonpreemptive Scheduling - CPU-scheduling decision may take place under the following circumstance - ==考點：影響決策的時間點== - <font class='blue'>running -> waiting</font> - <font class='blue'>running -> ready</font> - <font class='blue'>waiting -> ready</font> - <font class='blue'>terminate</font> - Non-preemptive scheduling - Preemptive scheduling ### Dispatcher - The dispatcher - The module that gives control of the CPU to the process selected by the short-term scheduler - Function - Switching context - Switching to user mode - Jumping to the proper location in the user program to restart that program - <font class='green'>Dispatch latency - The time it takes for the dispatcher to stop one process and start another running</font> ![](https://hackmd.io/_uploads/H1_sqnEf6.png) ## Scheduling Criteria - **CPU utilization** - keep the CPU as busy as possible - 讓 CPU 盡可能保持忙碌 - **Throughput** - \# of processes that complete their execution per time unit - 每單位時間完成執行的 process 數量 - **Turnaround time** - amount of time to execute a particular process - 執行特定 process 的時間量 - **Waiting time** - amount of time a process has been waiting in the ready queue - Process 在 ready queue 中等待的時間量 - **Response time** - amount of time it takes from when a request was submitted until the first response is produced - 從提交 request 到產生第一個 response 所花費的時間 ## Scheduling Algorithms ### FCFS (First Come, First Served) scheduling - 依 (Burst time) 抵達順序執行(先來的先執行) ![image.png](https://hackmd.io/_uploads/rJTDKzPQT.png) - Waiting time: P1 = 0, P2 = 24, P3 = 27 - Average Waiting Time (AWT): (0 + 24 + 27) / 3 = 17 - **Convoy effect** 護航效應 - Occur when all the other processes **wait for the one big process** to get off the CPU, results in lower CPU and device utilization. - 很多 process 在等待一個很長 CPU Time 的 process，造成平均等待時間大幅增加的不良現象 ### SJF (Shortest Job First) Scheduling - 時間最短的 process 先處理 - The SJF is optimal - It gives the minimum average time for a given set of process - Difficulty - 不容易執行，因為不好預測未來需要的時間 - Knowning the length of the next CPU request - The next CPU burst is predicted as an exponential average of the measured lengths of previous CPU bursts - <font class="red">$\tau_{n+1}\ =\ \alpha t_n\ +\ (1\ -\ \alpha)\tau_n$</font> - t~n~ = actual length of n^th^ CPU burst - $\tau_n$ = 預測的 length of n^th^ CPU burst - $\tau_{n+1}$ = 預測的 length of n+1 CPU burst - $0 \le \alpha \le 1$, 通常為 1/2 - The next CPU time is predicted as exponential average of the measured lengths of previous CPU bursts ![image.png](https://hackmd.io/_uploads/r16ZbMDmp.png) ### RR (Round Robin) - Round Robin - Time quantum or time slice - $N$ processes, time quantum is $q$ - $(N - 1)*q$ time units until its next time quantum ![image.png](https://hackmd.io/_uploads/BkVXmGvX6.png =600x) ![image.png](https://hackmd.io/_uploads/rkwNQzvX6.png =600x) ### Priority - SJF is a special case of the general priority scheduling - Problem - Indefinite blocking or starvation - Solution: **Aging** - Gradually increasing the priority of processes that wait in the system for a long time - Combining RR and priority scheduling ### Multilevel Queue - Foreground (interactive) processes and background (batch) processes ![image.png](https://hackmd.io/_uploads/BJy04fPmT.png =400x) ![image.png](https://hackmd.io/_uploads/Byh0EGD7T.png =600x) ### Multilevel Feedback Queue - Allowing a process to move between queues - If a process **uses too much CPU time**, it will be **moved to a lower-priority queue** - A process that **waits too long** in a lower-priority queue may be **moved to a higher-priority queue** - Aging - Prevent starvation ![image.png](https://hackmd.io/_uploads/SJNkufPma.png) --- - Defined by the following parameters - The number of queues - The scheduling algorithm for each queue - The method used to determine when to **upgrade** a process to a **higher-priority** queue - The method used to determine when to **demote** a process to a **lower-priority** queue - The method used to determine which queue a process will enter when that process needs service ## Thread Scheduling - User level thread - Kernel level thread ### Contention Scope - **PCS** (Process contention scope) - Competition for the CPU takes place among **threads belonging to <font color=red>the same process</font>** (屬於相同 process 的 thread 會發生 CPU 競爭) - Thread will be scheduled on an [**LWP**](https://hackmd.io/NUX5wREyQzW-LbvJXUlO-Q#Scheduler-Activation) - **Many-to-one** and **many-to-many** - **SCS** (System contention scope) - Competition for the CPU with SCS scheduling takes place among **all threads in the system** - **One-to-one** ## Multiple-Processor Scheduling - Load sharing - Multiprocessor - Traditional - Systems that provided **multiple physical processors**, where each processor contained one **single-core** CPU - Modern - Multicore CPUs - Multithreaded cores - NUMA (non-uniform memory access) systems - Heterogeneous multiprocessing ### Approaches to Multiple Processor Scheduling 考比較或==選擇題== - **Asymmetric multiprogramming** - <font class='red'>所有系統的執行會由一個 master processor 掌控</font> - <font class='red'>其他 processor 只執行 user code (由 master 配置)</font> - Has all scheduling decisions, I/O processing, and other **system activities** handled by a single processor - i.e. the <p class="green">master server</p> - The other processors execute only **user code** - Benefit: simple - <p class="blue">Only one processor accesses the system data structures, reducing the need for data sharing</p> - Disadvantage - The **master server** becomes the performance **bottleneck** - **Symmetric MultiProgramming** (SMP) - <p class="red">每一個 processor 都是 self-scheduling</p> - <font class='red'>所有 processors 共用 ready queue，或是每一個 processor 有自己的 ready queue</font> - <font class='red'>需要同步機制</font> - Scheduling proceeds by having the scheduler for each processor examine the ready queue and select a process to execute - Two strategies for organizing the threads eligible to be scheduled ![image.png](https://hackmd.io/_uploads/H1HcxXv76.png =500x) - All threads may be in a **common ready queue** - Must ensure that two separate processors do not choose to schedule the same thread - Locking - Each processor may have its own **private queue** of threads - Balancing ### Multicore Processors - **Memory stall** - 當存取記憶體時，花費許多時間在等待資料 available (ex. cache miss) ![image.png](https://hackmd.io/_uploads/S1UcMmDQ6.png =500x) - **Multithreaded multicore system** - 兩個或更多的硬體 thread 被 assign 給每一個 core - 當一個 thread 在存取記憶體時，其他的 thread 就可以執行 CPU 指令 ![image.png](https://hackmd.io/_uploads/B1FKQQwm6.png =500x) - Hardware thread - **Chip MultiThreading** (CMT) ![image.png](https://hackmd.io/_uploads/BktfVXwXa.png =250x) - 若一個 processor 有四個 core，每個 core 有兩個 hardware thread，OS 會當成有八個可以分配的 CPU。 --- - Two ways to multithread a processing core - **Coarse-grained**（粗顆粒） multithreading - 當 memory-stall 發生時切換到另一個 thread，因為 instruction pipeline 必須被 flush 所以成本很高 - a thread executes on a core until a long-latency event such as a memory stall occurs. - **Fine-grained**（細顆粒） multithreading (interleaved) - 把 pipeline 的狀態保留並切換到另一個 thread 執行，需要更多的 register 來保留資料 - Two levels of scheduling ![image.png](https://hackmd.io/_uploads/HJ9c47D7p.png =300x) ### Load Balancing - **Push migration** - 定期檢查 processor load，若發現不平衡就將 thread 推到比較不忙的 processor 來達到平衡。 - **Pull migration** - 空閒的 processor 從忙碌的 processor 拉取正在 waiting 的工作。 - 通常兩者並行 ### Processor Affinity - Processor affinity（親和力） - process 想儘量長時間運行在某個指定的 CPU 上，且不被轉移至其他 CPU 的傾向性 - A process has an affinity for the processor on which it is **currently running** - **Soft affinity** - 盡量保持在同一個 processor 運行 - When an OS has a policy of attempting to keep a process running on the same processor but **not guaranteeing** that it will do so - **Hard affinity** - 只能在一個 processor 運行 - Some systems provide system calls that support hard affinity - Allowing a process to specify a subset of processors on which it may run - The main-memory architecture of a system can affect processor affinity issue ![image.png](https://hackmd.io/_uploads/H1sgD7DmT.png) - NUMA (Non-Uniform Memory Access) - A CPU has faster access to its local memory than to memory local to another CPU ### Heterogeneous Multiprocessing - Heterogeneous multiprocessing (HMP) - Heterogeneous: 異構 - 運算由不同指令集、架構的 processor、或是不同的電路(GPU 或其他運算電路)完成 - E.g, CPU 和 GPU 同時進行運算 ## Real Time CPU Scheduling - Real-time 不代表越快越好，重點在 deadline 前要完成 - Soft real-time systems：限制反應時間，但可容許一些延遲 - 雖然不希望超過 deadline，但並不會馬上出問題 - Ex: streaming, online game - Hard real-time systems：硬性要求反應時間要在規定內 - 若超過 deadline 則導致 fundamental failure [16. 即時作業系統 (Real-time Operating System) | 宅學習](https://sls.weco.net/node/21340) ### Minimizing Latency (這個圖感覺會考) - Event latency ![Event latency](https://hackmd.io/_uploads/SJ5aDXvXa.png =400x) - Two types of latencies affect the performance of real-time systems - **Interrupt latency** ![image.png](https://hackmd.io/_uploads/HJBZO7Dm6.png =400x) - **Dispatch latency** ![image.png](https://hackmd.io/_uploads/HkfNuQv7a.png =400x) - **Conflict phase** - Two components of the conflict phase - **Preemption** of any process running in the kernel - Release by low-priority processes of resources needed by a high-priority process - **Dispatch phase** ### Priority Based Scheduling - Priority-based algorithm with preemption ![image.png](https://hackmd.io/_uploads/Hyn0xVv7a.png =600x) - Periodic task - $0 \le t \le d \le p$ - $t$: processing time - $d$: deadline - $p$: period - Rate - $\cfrac{1}{p}$ - **Admission Control** - process may have to announce its deadline requirements to the scheduler - admit: guaranteeing that the process will **complete on time**. - reject: **cannot guarantee** that the task will be serviced by its deadline. ### Rate Monotonic Scheduling - 依據頻率的大小來做排程 - 更短的週期 (週期是固定的數值，在 run time 期間不會變動) ---> 更高的優先度 - Scheduling periodic tasks using a static priority policy with preemption - Each periodic task is assigned a priority **inversely based on its period** - The shorter the period, the higher the priority ![image](https://hackmd.io/_uploads/SJ_s6rgEp.png =500x) ![image](https://hackmd.io/_uploads/r1Zg0Sx4p.png =500x) ![image](https://hackmd.io/_uploads/S1OrCHxE6.png =500x) ### Earliest Deadline First Scheduling - 更早 deadline ---> 更高優先度 - Earliest Deadline First (EDF) scheduling dynamically assigns priorities according to deadline - The earlier the deadline, the higher the priority ![image](https://hackmd.io/_uploads/Hyn3kLgVa.png =500x) ### Proportional Share Scheduling - Proportional Share schedulers operate by allocating $T$ shares among all application - An application can receive $N$ shares of time - The application will have $N/T$ of the total processor time ## Algorithm Evaluation - CPU utilization - Response time - Throughput - Turnaround time ### Deterministic Modeling - Analytic evaluation - Takes a particular predetermined workload and defines the performance of each algorithm for that workload. - Example ![image](https://hackmd.io/_uploads/B1Ps35SPp.png) ### Queueing Models - Queueing network analysis - <font class='red'>Little’s formula</font> - $n = \lambda \times W$ - $n$: the average queue length - $\lambda$: the average arrival rate - $W$: the average waiting time in the queue ### Simulations - Trace files ![image](https://hackmd.io/_uploads/r1G3fSSva.png) ### Implementation - 將演算法實作出來，藉由觀察實際結果評估演算法的優劣 - High cost