Operating System - 期中Note

# Operating System - 期中Note ###### `大三上` `作業系統` ## CH1 Introduction ### What Operating Systems Do #### Computer system: * Hardware >CPU, memory, I/O devices * Operating system * System and Application programs >Compilers, assembler, word processors, your first hello world program… * Users ![](https://hackmd.io/_uploads/BkjGlWkla.png) #### What Operating Systems Do? * 是個software * 管理computer hardware * 提供環境給application programs執行 ![](https://hackmd.io/_uploads/B1h6lZ1ea.png) #### Defining Operating System * 沒有通用的定義 * 常常用的定義 * a program running at all times(***kernel***) #### 電腦系統組織 * 經由共用記憶體的公用匯流排(bus)連接在一起 * Device drivers: software * OS上都有device driver(裝置驅動程式) * CPU & controller會搶memory => 記憶體控制器(能同步存取) * Device controller: hardware * 有local buffer & special-purpose registers ![](https://hackmd.io/_uploads/ByM0NZ1ep.png) ![](https://hackmd.io/_uploads/Sk8lHbyx6.png) #### 自問自答時間 >![](https://hackmd.io/_uploads/HkHESWyxp.png) --- ### Computer-System Organization #### Interrupts 中斷控制器(Device controller)要通知驅動程式操作已經完成 => 中斷 1. 控制器在**中斷請求線**上(interrupt-request line)發出signal 2. CPU讀取(catches)中斷編號 -> CPU dispatches it to the **中斷處理常式**(interrupt handler) 3. handlers提供serve給裝置 ![](https://hackmd.io/_uploads/Skpqwb1lp.png) :::info :bulb: ISR(Interrupt service routine) 或interrupt handler ::: * ISR * 保存狀態 * Service the interrupt * Executes a 'retrun_from_interrupt' instruction * CPU 返回中斷前的狀態 ![](https://hackmd.io/_uploads/HJm2KW1ep.png) :::success 中斷可做優先層級priority levels考量 OS用中斷處理非同步事件 ::: #### Storage Structure 儲存體結構 * organized in hierarchy(層級) * 最讚的 = **semi-conductor memory** * Primary storage * 主記憶體(隨機存取記憶體RAM) >用動態隨機存取記憶體(DRAM)的半導體技術製作 * RAM是揮發性(volatile)的 => 關閉電腦會內容消失 >用EEPROM(電子抹除式複寫唯讀記憶體) & firmware(韌體)取代 >但是EEPROM很慢 * Secondary storage 輔助儲存器 * Electrical-based: 非揮發性記憶體設備Nonvolatile Memory (NVM) * Flash memory (SSD): 最常見的方法 * Mechanical-based: hard-disk drives (HDD) (or called magnetic disks) * 最常見的輔助儲存器 ![](https://hackmd.io/_uploads/HyBgoZkga.png) * Tertiary storage * 儲存備份用 * I/O operation * Device driver : 在裝置控制器中載入適當的暫存器 * Device controller : ```markmap 1. 檢查reg的內容 -> 決定做甚麼action 2. 開始動作 -> 從裝置傳送data到本地緩衝區(local buffer) 3. 完成了 -> Device controller利用interrupt通知CPU ``` --- ### Computer-System Architecture #### Single-Processor Systems 單一處理系統 * 只有一個通用處理器 * 特殊用途處理器 >磁碟鍵盤圖形控制器 #### Multiprocessor Systems * 優點: increased throughput => 完成更多工作量 * 兩種管理方案: * Asymmetric Multiprocessing >主處理器控制系統, 定義了**master-slave** relationship * Symmetric Multiprocessing (SMP): >最常見的系統 >每個處理器都是對等的(peer), **不**存在master-slave relationship * Approaches * Multi-processors: 每個處理器有一個單核心CPU >![](https://hackmd.io/_uploads/rk_s5TwfT.png) >![](https://hackmd.io/_uploads/S17R5TPzT.png) * Multi-core systems: 單一晶片上的多個內核 >比"很多個單核還有效率" >**On-chip communication** 比 **between-chip communication** 快 * Non-uniform memory access (NUMA): >在多處理器系統上增加太多CPU >* 系統匯流排爭用成為瓶頸 >* 性能下降 >* ![](https://hackmd.io/_uploads/SJ-6aTDG6.png) >Alternative approach(替代方法): NUMA >Provide each CPU with its local memory >* Accessed via a local bus >越來越受歡迎 >![](https://hackmd.io/_uploads/Bysk1AwGp.png) * Blade server: >多個**processor board** are placed in the **same box** >每個processor board 獨立啟動並運行自己的OS >![](https://hackmd.io/_uploads/rk-He0PG6.png) #### Clustered Systems (叢集系統) 透過區域網路 (LAN) 連接的多台電腦 >![](https://hackmd.io/_uploads/BkJcTeOMp.png) * The key of a clustered system is high availability(高可用性) * Graceful degradation(優雅降級): 繼續提供與surviving hardware水準成比例的服務 >![](https://hackmd.io/_uploads/rkOPaguMp.png) * Fault tolerant(容錯): 即使任何一個組件故障，也能繼續提供服務 --- ### Operating-System Operations >作業系統的基本運作方法: OS is event-driven * Interrupt: driven by hardware(硬體驅動) >![](https://hackmd.io/_uploads/rJox1-_fT.png) * Trap (or exception): a software-generated interrupt >* 例如: Error：除以零，非法指示 >![](https://hackmd.io/_uploads/H1vukbuza.png) >* System calls or monitor calls >作業系統提供的功能，為應用程式提供服務 >read(), write(), open()…… >![](https://hackmd.io/_uploads/rkhFybOza.png) #### Multiprogramming and Multitasking * Single-programming systems * One job at a time: 一個工作做完才做下一個 * Very inefficient : 當正在執行的程式執行 I/O 時 CPU, 處於閒置(idle)狀態 ![](https://hackmd.io/_uploads/BJZw-bdz6.png) * Multi-programmed systems * Why use multiprogramming >efficiency: 有效率多了(跟Single-programming systems比) >執行多個程式使 CPU 或 I/O 裝置始終處於忙碌(busy)狀態 * Memory Layout(佈局) for a Multiprogramming System >A set of jobs is kept in **memory** >![](https://hackmd.io/_uploads/rkIHz-OM6.png) * Time-sharing (or multitasking) systems * **Problem** of Multi-programming systems >Multi-programming 有些問題: >Does not provide user interaction(用戶交互) >![](https://hackmd.io/_uploads/HkR7NZdfT.png) * **Solutions**: **Timesharing (Multitasking)** >* 改善方法: Timesharing >* Logical extension of multiprogramming >* CPU switches jobs so **frequently** (直到要執行 I/O) >* Users can **interact** with each job **frequently** >* Create an **interactive**(互動式) computing environment (EX: using keyboards or mouse) >* **Response time** is short >![](https://hackmd.io/_uploads/HJ1NSb_M6.png) #### Dual-Mode and Multi-mode Operation * To ensure proper execution of the system >必須區分**execution of OS code** & **user program code** * **Solution:** CPU supports dual-mode or multimode operations * **Dual-Mode Operation** >現代CPU至少有兩種執行模式 >![](https://hackmd.io/_uploads/ry7pPzdM6.png) * User mode: 代表使用者執行 * Monitor mode: 代表作業系統執行 >* 也稱為supervisor mode, system mode, or privileged mode >* 可以執行所有指令, 包含**privileged instructions** * Controlled by a mode bit(模式位元) * Privileged instructions >* 可能cause harm的說明 >* 只能在supervisor模式下執行 >* Example: I/O control, timer management, interrupt management * Why uses Dual-mode operation? >* 保護 >* **Privileged instruction**只能在kernel mode執行 >如果在使用者模式下執行, 硬體會引發**trap** >Trap導致CPU跳到作業系統, 從而終止應用程式 * Entering Monitor Mode (the same as entering OS) * As the result of an interrupt * Through a system call trap * As a result of an error trap * Multi-Mode >EX: >* x86 CPU has four protection rings (modes) >* ARM8 has seven modes #### Timers * **Question**: 如何防止用戶程式永遠獲得CPU？ >也就是說如何實現time sharing systems * We must have a way to enter OS >System call traps >Error Traps >Interrupts * **Solution: timer** >引起timer interrupt periodically >CPU回傳給作業系統 >修改定時器內容的指令是有特權的(privileged) :::success :collision: 作業系統的基本運作方法和CPU/硬體有密切關係 ::: --- ### Security and Protection * **Protection** – **controlling** access to resources >區分**授權**使用和**未經授權**的使用 * **Security** – **防禦**系統免受攻擊 * 為了實現保護和安全, 系統必須**區分用戶** * **User identifier** (user IDs): unique, one per user >使用者ID與使用者的所有file和processes相關聯 * **Group identifier** (group ID):unique, one per group >也與每個process和file關聯 #### Open-Source Operating Systems * 以原始碼格式提供的作業系統 >Windows: just binary, **closed-source** * 由自由軟體基金會 **(FSF)** 發起，該基金會擁有「copyleft」GNU 公共授權 **(GPL)** * Examples include **GNU/Linux**, **BSD UNIX**, and **Sun Solaris** --- ### 來個CH1總結吧~ * Computer System Structure電腦系統結構 >Hardware, operating system, system and application programs, users * Computer-System Organization電腦系統組織 >Interrupts >Storage structure * Computer-System Architecture電腦系統架構 >Single-Processor Systems >Multiprocessor Systems >Clustered Systems * Operating-System Operations作業系統操作 >Multiprogramming and Multitasking >Dual-Mode and Multi-mode Operation >Timers --- ## CH2 Operating-System Structures ### Operating System Services * 作業系統為程式和使用者提供服務 >EX: >Program execution >Manage hardware --- ### Operating-System Interface to Users * 對於用戶來說, 與作業系統互動的基本方法 * Command-line interface **(CLI)** (or command interpreter) >**命令解釋器** 允許直接鍵入命令 * 從用戶Fetch命令並執行 * 作為**系統程式(system program)** 實現 >**Not** part of OS * Linux中實作了多種選擇－**shells** >Bourne shell (bash), C shell (csh), Korn shell (ksh) * Batch interface (批次介面) * Graphical User Interface (GUI) * 使用者友善的**桌面**介面 >Mouse-based的視窗和選單 * Touch-Screen Interface * Touchscreen devices >觸控螢幕裝置 * Voice commands --- ### System Calls * 作業系統提供的Functions, 為user applications提供服務 * Example: open(), read(), write(), close()…… * 就算是簡單的程式也是可能用很多system call的 :+1: >![](https://hackmd.io/_uploads/H1Bw_LYGT.png) * 通常每個system call都關聯一個數字 * OS維護一個根據這些數字索引的表 :grey_question: * APIs * 系統呼叫通常透過Application Program Interface(API)**間接存取(indirectly)** * 不是直接call system call * fopen(), fread(), fwrite(), printf()…都是APIs >紅: system call; 藍: APIs >![](https://hackmd.io/_uploads/rkCEi8KMa.png) * API就是一組可供程式呼叫的function * 來一鍋常見的API 1. **Win32** API for Windows 2. **POSIX** API for POSIX-based systems >All versions of UNIX, Linux, and Mac OS X 3. **Java** API for the Java virtual machine (JVM) * Supported by **library** >之前組語說的Kernel32.lib就是Win32 API的函式庫 :::success :accept: **為啥選API 不是選system call咧?** * **Portability (可攜帶性)** * system call的interface不同 >![](https://hackmd.io/_uploads/r1pFRLYG6.png) * API不用修改source code >![](https://hackmd.io/_uploads/rJ4n0LFMp.png) * **Easier to use** >程式設計師不用知道系統呼叫是如何實現的 ::: * System Call Parameter Passing * 傳參數(parameters)給OS的3種方法 1. Pass the parameters in **registers** >參數可能比reg多 2. Parameters儲存在block in memory, and address of block passed as a parameter in a register 3. 參數由program壓入stack並由OS從stack彈出 * Block and stack方法不限制參數的數量或長度 * CPU Registers >![](https://hackmd.io/_uploads/r1_PGvYGT.png) * Memory Block >![](https://hackmd.io/_uploads/B1XKfvYGa.png) * Stack >![](https://hackmd.io/_uploads/Sk05GPFza.png) --- ### System Services * System services (or system utilities) >為程式開發和執行提供便利的環境 * Categories * File management * Status information >Windows工作管理員 * File modification * Programming language support >Compiler, assembler, interpreter * Program loading and execution >Loader and debugger * Communications * Background services >Usually called **services**, **subsystems**, or **daemons** --- ### Operating-System Design and Implementation * 設計和實現(implementing)作業系統時遇到的問題 * Design goal * 受影響的有: >1. Hardware >2. Type of system：傳統桌上型電腦/筆記型電腦、mobile、distributed或real time * **User goals** and **System goals** * There is no unique way to achieve all goals ⇨ compromises(妥協) may be taken * Mechanism and policies * mechanism跟policies分離的重要原則 * Mechanism >怎做到? >**insensitive(不敏感)** to changes in policy * Policy >將會做啥？ >可能會隨著地點或時間的推移而改變 :question: * Separation這兩家伙是為了實現flexibility * Implementation * OS Implementation >以前OS是用組語寫的 :shocked_face_with_exploding_head: 像是MS-DOS >現在多用高階語言 :smile: * Major performance improvement >**不是透過assembly語言改進performance** >compiler變先進了現在只剩**bottleneck routine**是用assembly寫的 >是透過better **data structure** and **algorithms** :anger: --- ### Operating-System Structure #### Monolithic (龐大的/單一的\[from課本\]) structure * 沒有明確的結構 >Examples: >* MS-DOS >* Original UNIX => 在system call interface跟physical hardware之間都是**kernel**:+1: >![](https://hackmd.io/_uploads/r1QAl-nGT.png) :::info :+1: :明顯的性能優勢(performance advantage),system-call interface的開銷非常小,kernel通訊速度快 :-1: :不好implement and extend ::: * UNIX, Linux and Windows are monolithic kernels #### Layered approach * OS分很多層 * Layer approach: 每一層only使用lower-level layers的功能和服務 * Advantages: * 方便 construct & debug: Starts from the lowest layer * Information hiding: 不必了解其他層的details * Layered File System(分層檔案系統) >![](https://hackmd.io/_uploads/BJYjmWnGT.png) * Difficulties * 很難確定層數 * 很難定義每一層的功能 * 效率較低 * 一個服務可能涉及多個layers, 但多加layer會增加開銷 * Trend: 少少layers, 多多functionality #### Microkernel * **Motivation**: Micro是因為UNIX變得龐大且難以管理 * 想法：盡可能從核心移動到**user space** * 小點的kernel * CMU * What a microkernel **should provide** ? * **minimal process** + **memory management** + **communication facility** * Communication facility * 提供**客戶端程式**和**各種服務**（稱為伺服器）之間的通信 * 透過**訊息傳遞**提供 ![](https://hackmd.io/_uploads/HyWUUWhz6.png) :::info :+1: :**容易擴展**作業系統 + **容易將作業系統移植到新架構**(architecture) + **更可靠、更安全** :-1: :服務之間**傳遞訊息**的效能開銷 + **Windows NT**：microkernel/**Windows XP**：monolithic kernel ::: >教授挖了坑, 吊大家胃口 :wave: >![](https://hackmd.io/_uploads/B15WYbhza.png) #### Modules * 目前OS設計中最佳的方法 * OS含有: * A **core kernel** * 一組**loadable kernel modules (LKMs)** <= 可在運行時dynamically loadable * 類似於分層系統 * more **flexible** * 類似於microkernel approach * more **efficient**:point_left:因為Kernel module通訊是function calls, 而不是message passing ![](https://hackmd.io/_uploads/BytWhbhGa.png) #### Hybrid systems(混合系統) * 現代作業系統不採用(adopt)單一結構 * Hybrid systems結合了多種結構 * *Linux* is **monolithic** * modules for dynamic loading of functionality * *Windows* is mostly **monolithic** * personalities (user-level services) **(microkernel)** * modular for dynamic loading of functionality(這跟Linux一樣:shocked_face_with_exploding_head: ) --- ## CH3 Processes ### Process Concept #### The Process * **Process又叫jobs, tasks** * **Program – passive** * **Process – active** >Program - 是.exe(存在記憶體中的) >Process - 是程式的運行實例, 儲存在RAM中(工作管理員看到的) >![](https://hackmd.io/_uploads/ByuRCZhfT.png) * 一個process需要certain resources(一定的資源) * CPU >目前活動的狀態：PC（Program Counter）、暫存器… * Memory >**Text section**: program code >**Data section**: global variable >**Stack section**: temporary data >**Heap section**: memory that are dynamically allocated by calling **malloc()** and **free()** >![](https://hackmd.io/_uploads/rJfR2a2GT.png) * Files ![](https://hackmd.io/_uploads/B1FlrRnzp.png) * C Program的Actual Memory Layout(實際記憶體佈局) * **global data section(全域資料區)** is divided into **initialized(初始)** data and **uninitialized(未初始)** data sections >為*argc*和*argv*參數提供了separate section ![](https://hackmd.io/_uploads/SkqyrCnG6.png) * 取得segment of a program的大小 ```linux size executable-file-name ``` ![reference link](https://hackmd.io/_uploads/HJFzM02zp.png) -- **text**：程式碼段 -- **data**：初始化的全域數據 -- **bss(block start by symbol)**：未初始化的全域數據 -- **dec(10) and hex(16)**：三個部分的總和 #### Process State * 當process執行時，它會改變狀態(state) * **new:** The process is being created * **running:** Instructions are being executed * **waiting:** The process is waiting for some event to occur * **ready:** The process is waiting to be assigned to a processor :point_right: 程式 is runable(隨時可以跑跑跑了~) * **terminated:** The process has finished execution * 一個core上只能運行**一個process** * Many processes may be **ready** or **waiting** >被打斷的running狀態變成ready >碰上I/O的running變成waiting:point_right:完成後又變ready ![](https://hackmd.io/_uploads/BJ1aN0nfa.png) #### Process Control Block ***(PCB)*** * 對於每個process, OS維護一個稱為**process control block(task control block, process descriptor, task descriptor)** 的資料結構來保存與每個process的資訊 * Keep **information** associated with each **process** * PC: **address of the next instruction**(下一指令的位址) * 當發生interrupt或context switches時儲存 * CPU registers * 當發生interrupt或context switches時儲存((教授又賣關子:/ * CPU scheduling information: process優先權 * 記憶體管理資訊 * Accounting information * **CPU time** and **real time** used * I/O狀態訊息 * Open files and devices * PCB大概長這樣XDD >![](https://hackmd.io/_uploads/HJqr0Cnza.png) * Process Representation(表示) in Linux ![](https://hackmd.io/_uploads/rJg61y6zT.png) --- ### Process Scheduling * Process scheduler (or CPU scheduler) * 選擇一個準備好的processes在CPU core上執行 * Degree of multiprogramming * 當前記憶體中的processes數量 #### Scheduling Queues * OS maintains **scheduling queues** of processes * **Ready queue**:point_right:residing in main memory * **A set of wait queues**:point_right:set of processes waiting for events * Processes在**ready queue和various wait queues之間**遷移(migrate) * **Queuing-Diagram** Representation of Process Scheduling * 來~process執行中會發生什麼呢? 讓我們繼續看下去 :satisfied: * The process issue an I/O request >process 被放在 I/O wait queue * The process creates a new child process >process 被丟在 wait queue 等待 child 結束 * The process is removed from the core due to interrupt >process 被放在 interrupt wait queue * The process is removed from the core, due to its time slice expire >process 被放回 ready queue >![](https://hackmd.io/_uploads/BkxpAJpf6.png) #### CPU Scheduling * **CPU Scheduler** (or Process Scheduler) * 從ready queues中選擇一個process並為其allocates CPU core >如果process執行太久:point_right:換人 >![](https://hackmd.io/_uploads/SymQexTGT.png) * CPU scheduler is invoked very frequently ,但不能一直用(:question: * 某些作業系統引入了調度的中間(intermediate)形式:point_right:**Swapping** * 如果memory空間用完, 就**remove process** from memory * process 也可以 re-introduced into memory >先暫時從記憶體丟出去 ,再等有空間的時候重新抓回來記憶體中 >![](https://hackmd.io/_uploads/B1tx8Waf6.png) #### Context Switch * CPU從一個process**切換**到另一個process * 作業系統必須在其 PCB 中**保存(save)** old process 的 **state(context)** * 作業系統必須從 PCB **恢復(restore)** 新process **saved state(context)** * **Context (state):** * CPU registers(PC) * Memory areas * Various tables * 構成process的environment or context :::info :i_love_you_hand_sign: 切換Process時 **(Context Switch)**，必須要儲存當下Process的state，然後復原接下來要執行的Process的state! :point_right: 儲存的state記錄會在**PCB**當中 ::: 來!圖片解釋: P0換到P1, 先把P0存到PCB, 再讓P1下去跑 ![](https://hackmd.io/_uploads/ryzjYWpMa.png) :::warning :zap:比較: ISR 就沒有要保存/恢復的狀態 ::: * Context-switch time is overhead (開銷大) * 切換時, 系統**不執行**使用者應用程式 (浪費了) * 用於減少context switch overhead的Hardware支撐 * save/load所有暫存器的單一指令 :::success :100: 複習：Related Instructions **1. PUSHFD and POPFD** :point_right: 壓入和彈出 EFLAGS 暫存器 **2. PUSHAD** pushes 32-bit 通用暫存器 on stack :point_right: Order: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI **3. POPAD** 以相反的順序將相同的暫存器從堆疊中彈出 :point_right: PUSHA 和 POPA 對 16 位元暫存器做同樣的事情 ::: --- ### Operations on Processes #### Process creation :::success :key: **問題：** 當Double Click某一個應用程式，發生了什麼事情呢? **解答:** Windows檔案總管(or Shell in Linux)會create一個process(or 請OS creates一個process)，執行此一應用程式 ::: * 一個process可能會建立多個新process * Forming a tree of processes >![image.png](https://hackmd.io/_uploads/SJyuD5l7T.png) >作業系統根據唯一的process identifier（或pid）來識別processes * Execution * Parent waits 要等 children terminate * 用 **wait()** * Address space * fork()系統呼叫在Linux中建立一個new process * exec()系統呼叫以新程式取代process的記憶體空間 * UNIX Process Creation * 每個process都有一個唯一的process ID **（PID）** * 透過 **fork()** 系統呼叫建立一個new process * Child複製parent的address space * system call * **child：** fork的回傳值為**0** * **Parent：** fork的回傳值為**child的PID** * **execlp()** 系統呼叫用於將new binary file載入到記憶體中 - 銷毀舊程式碼 ![image.png](https://hackmd.io/_uploads/BkesahqlQT.png) #### Process termination * Process invokes **exit()** system call * 回傳 **status value** with **its process identifier** 給 parent * Parent透過 **wait()** 系統呼叫取得child的status value和process identifier * Then, process的資源被作業系統釋放 ![image.png](https://hackmd.io/_uploads/BJLs_BZma.png) * 當子process終止且他的父process還沒呼叫 wait() * 子process => **zombie(殭屍) process** >**process control block (PCB)** 仍然存在用於儲存**exit status** value >![image.png](https://hackmd.io/_uploads/Skqdcr-m6.png) * BUT, 如果父process**沒有呼叫 wait()** 並且被終止 * child process => **orphans(孤兒)** >為了避免孤兒出現: Linux 將child的**parent設定為 init** >**init定期呼叫wait()** 來收集child的退出狀態值 >![image.png](https://hackmd.io/_uploads/Hkt99rbma.png) --- ### Interprocess Communication (IPC) * Independent process(獨立) >不會跟別人share data >不被影響也不影響別人 * Cooperating process(合作) >會Share data >可被影響OR影響別人 :::info :bulb: **為何要cooperating?** 1. Information sharing 資訊共享 2. Computation speedup 計算加速 3. Modularity 模組化 ::: * Cooperating processes 需要 **interprocess communication (IPC) mechanism(機制)** * 2種常見models(os_lab7做過的) 1. Message passing (a) 2. Shared memory (b) >![image.png](https://hackmd.io/_uploads/HkqyTHWm6.png) * Interprocess Communication (IPC) * 比較Message passing & Shared memory | \ | Message passing | Shared memory| | -------- | -------- | -------- | | **速度** | 慢(需system call)| 快(不需system call) | | **實現** | 簡單 | 難 | | **衝突** | No conflict | 需要protection and synchronization| | **其他** | 交換少量資料很有用 | 共享記憶體| :::success :microscope: **複習: 為何Microkernel效能不好？** Performance overhead by message passing between services (效能開銷) :point_right: 因為system call 很多 ::: --- ### IPC in Shared-Memory Systems * Shared-Memory >* Communicating processes 需要**建立**共享記憶體區域 >* 使用該區域的其他process必須將其**附加(attach)到其位址空間** * 共享記憶體**消除**了一個process無法存取另一個process記憶體的限制 `教授PPT跳過Example - consumer & producer` --- ### IPC in Message Passing Systems * 2種 1. send(message) 2. receive(message) #### Synchronous(同步) or asynchronous(非同步) communication * *blocking* or *non-blocking* >**同步 or 非同步** * **Blocking(等待)** - process 暫停直到 I/O 完成 * 易於使用 but 某些地方仍然不優呀~ * **Nonblocking(非等待)** - I/O 立即呼叫返回, 並返回盡可能多的可用數據 * return value指示傳輸了多少位元組(bytes) * | Blocking | Non-blocking | | -------- | -------- | | 又稱synchronous | 又稱asynchronous | 1. | Blocking send | Blocking receive | | -------- | -------- | | 發送者被阻止，直到接收者收到訊息|接收者block直到有訊息可用| 2. |Non-blocking send |Non-blocking receive| | -------- | -------- | |發送方發送訊息並恢復操作|接收者檢索到有效訊息或空訊息| --- ## CH4 Thread & Concurrency ### Overview :::warning :question: **Problem: Process is high overhead** 1. minimize code size or shorten execution time(最小化程式碼大小或縮短執行時間) 2. Resource sharing(資源共享): by **threads** ::: * Process: **High overhead(高開銷)** abstraction to execute a program >E.g., 建立process是一項expensive的操作 * Solutions: **thread** >CPU利用率(utilization)的基本單位 >跟process很像, 也需要thread ID, program counter(PC), register set, and stack >**某些資源是和相同process的執行緒們共享** >* **Code section**, **data section**, and other **OS resources** * Single and Multithreaded Processes ![image.png](https://hackmd.io/_uploads/H10KYI-7a.png) * Create a Process v.s Create a Thread >可以共用**Code section**, **data section**, and other **OS resources** ![image.png](https://hackmd.io/_uploads/SylxcLZ7T.png) * **Thread Control Block (TCB)**(or Thread Descriptor) >每個thread都有自己的TCB，就像每個process都有自己的PCB一樣 >:::success >:bulb:: 為啥同一個Process下的Threads**不共享Program Counters(PC)和Registers Values**咧? >:a:: Threads也有自己的執行狀態,所以,*Context Switch*時必須 >1.將目前正在執行Thread的狀態儲存到TCB >2.復原接下來要執行Thread的狀態 >::: >![image.png](https://hackmd.io/_uploads/H1MKcUZm6.png) * Thread 的 Benefits * Responsiveness(反應能力) >如果一個執行緒被blocked或執行冗長的操作, 其他執行緒可以繼續運行 * Resource Sharing(資源共享) >執行緒共享它們所屬process的記憶體（code和data）和資源 * Economy(經濟) >**Create** and **context-switch** thread 比 processes 快得多 >與processes相比, *creating速度快30倍*, *context switch速度快5倍* * Scalability(可擴展性) >multi-CPU machine上的多執行緒可提高並發性(concurrency) >每個執行緒可以parallel(平行) * Examples * 很多 software packages 都是 multi-threaded * *File servers* or *web servers* * 需要在短時間內處理多個請求 1. Process sequentially：不好 2. 為每個請求建立一個process => 耗時又佔用資源 3. 為每個請求建立一個thread => 有效率 --- ### Multicore Programming #### Concurrency v.s. Parallelism * 在single-processor系統上, **processes是concurrently**運行的, 但**不是parallel**運行的 * **Concurrency**支援多個任務取得進展 * multicore系統, processes可以**parallel** * Parallelism意味著系統可以**同時(simultaneously)執行多個任務** ![image.png](https://hackmd.io/_uploads/r16JZFGQT.png) #### Types of Parallelism :::info :palm_tree: 教授言: 今日的電腦都是Multi-Core, 所以我們以後寫程式要Parallel Execution, 以充分利用每一個Core！ ::: * 兩種 parallelism: **data parallelism v.s. task parallelism** 1. Data parallelism * 在資料的不同組成部分上運行**相同的任務** :::warning :bulb:EX：對大小為 1000 的陣列求和（如果有兩個核心）一個核心上的執行緒 A 對元素 [0]…[499] 求和另一個核心上的執行緒 B 對元素 [500]…..[999] 求和 ::: 2. Task parallelism * 對**相同資料**運行多個任務 :::warning :bulb:EX：對大小為 1000 的陣列求和並相乘（如果有兩個核心）一個核心上的執行緒 A 對元素 [0]…[999] 求和另一個核心上的執行緒 B 執行元素 [0]x [1]x…x[999] ::: >![image.png](https://hackmd.io/_uploads/rkT8cYzXT.png) * Hybrid Most applications use **data parallelism + task parallelism** :::warning :warning: 例如：對大小為 1000 的陣列求和並相乘（如果有四個核心）一個核心上的執行緒 A 對元素 [0]…[999] 求和另一個核心上的執行緒 B 執行元素 [0]x [1]x…x[333] 另一個核心上的執行緒 C 執行元素 [334]x [335]x…x[666] 另一個核心上的執行緒 D 執行元素 [667]x [668]x…x[999] ::: #### Multi-Core Programming Challenges * **Divide activities** >將應用程式劃分為單獨的線程 * **Balance** >每個執行緒執行同等價值的同等工作 * **Data splitting** >每個執行緒存取的資料必須劃分在每個核心上運行 * **Data dependency** >Chapter 6 才說 * **Testing and debugging** >比單執行緒應用程式更困難 --- ### Multithreading Models :::success :tanabata_tree: **Question: How to Support Thread?** 1. **fork() is used to create process** => 程式設計師需要一個function來create thread, 假設此一function 為 *pthread_create()* 2. **fork() function is provided by OS** => *pthread_create()* is provided by **OS & User-level library** ::: * **User threads** * user-level library * user level * **Kernel threads** * OS * kernel level ![image.png](https://hackmd.io/_uploads/r1FXkqfXa.png) * **Many-to-One** * 用於不支援kernel threads的系統 * 執行緒管理由user space的thread library完成：**Efficient** * **Many-to-One:** kernel process * 傳統沒有使用thread的processes我們視為single thread process * **Blocking problem:** * 如果一個執行緒進行blocked系統調用, 整個行程就會被blocking * 不支援在 MP 上並行運行 * kernel 只看到一個 thread (process) >![image.png](https://hackmd.io/_uploads/Sy-4ZczXp.png) * 很少有系統繼續使用這種模型 >無法利用multiple cores的優勢 * **One-to-One** * 用於支援kernel threads的系統 * 每個user-level thread 映射(map)到一個kernel thread * **Good: more concurrency(並發性)** * 無blocking系統呼叫問題 >如果一個線程進行 blocking 系統調用, kernel可以運行另一個thread * 允許多個執行緒在多處理器上parallel * **Bad:** * 創建用戶線程需要creating a kernel thread * if 核心執行緒很多, overhead會很高 >![image.png](https://hackmd.io/_uploads/H1mKX7m7a.png) * **Many-to-Many** :::warning :dizzy: **前兩個models的問題:** **Many-to-one:** 沒有獲得真正的 concurrency **One-to-one:** 不能建立過多的 user threads ::: * Many-to-Many 允許 ***M*** 個使用者層級的 threads map 到 ***N*** 個 kernel threads, 其中 ***N <= M*** * 保證真正的 concurrency * 核心執行緒可以在 multiprocessor 上 parallel * 沒有 blocking system call 的問題 * 無限數量的使用者 threads * 用戶可以創建盡可能多的 user threads >![image.png](https://hackmd.io/_uploads/Sk85sXXXa.png) >使用者執行緒和核心執行緒的數量可能隨時間而變化。 >但無論有多少用戶線程, kernel thread的最大數量都限制為3。 ##### **Two-level Model** * 多對多的一種變體 * **many-to-many** + **one-to-one** models * 雖然many-to-many 是最 flexible 的 1. 但很難 implement 2. 限制 kernel threads 的數量變得不再那麼重要 * 因為處理核心(processing core)數量不斷增加 3. 大部分 OS use one-to-one model >![image.png](https://hackmd.io/_uploads/BkmIaQQm6.png) :::info :earth_asia: **Summary：使用者Thread與核心Thread** * **User thread** *(many-to-one model)* >由**user-level threads library**完成執行緒管理 >快速建立和管理 => 因為是 **local function call** in user space >阻止 **blocking system call** 的所有執行緒 >*無法*利用**多處理器(multi-processors)** * Examples >POSIX Pthreads >Win32 threads >Java threads * **Kernel thread** *(one-to-one model)* >由核心支援ㄉ >創建和管理速度較慢 => 因為是 **system call** >**沒有** blocking system call 的問題 >*可以*利用**多處理器 (multi-processor)** 環境 * Examples >Windows >Solaris >Linux ::: --- ### Thread Library * 提供用於**creating** and **managing** threads 的 ***APIs*** * Two approaches(兩種方法) 1. User-level thread 2. Kernel-level thread * Three main thread libraries 1. POSIX Pthreads (Pthreads) 2. Win32 threads 3. Java threads * Strategies for Creating Threads * *Asynchronous(非同步) threading v.s. synchronous(同步) threading* 1. **Asynchronous(非同步) threading** >創建 child thread 後, **parent thread 恢復(resumes)** 執行 >parent thread 和 child thread 同時運行 2. **Synchronous(同步) Threading** >創建 child thread 後, parent thread **等待(waits)** 所有 children threads 終止 >Called **fork-join** strategy >=> 1 & 2 是 child thread, 0 是 parent thread >![image.png](https://hackmd.io/_uploads/ByLd8VX7T.png) #### Pthreads * thread行為的規範(specification), 而不是實現(implementation) * **API** 指定 thread **library** 的行為 * **Implementation** 取決於 library 的發展 * 常見於 **UNIX** 作業系統 * A Pthreads example * thread 以 main() 開始 * main() 透過呼叫 **pthread_create()** 建立第二個 thread * 兩個 thread 共享全域變數 *sum* * 透過呼叫 **pthread_join()** 等待執行緒終止 --- ### Implicit Threading * 由於 multicore computers, 應用程式可能包含數百甚至數千個 threads * 設計這樣一個 multi-threading 應用程式並不容易 * **Solutions:** implicit threading * **執行緒的建立和管理**是由***編譯器***和***函式庫***完成的, 而不是程式設計師 #### OpenMP * Define parallel regions * 可以 parallel 運行的程式碼區塊 * 程式設計師只需插入 **compiler directives** 來識別 parallel regions * 這些指令使 OpenMP **library parallel** 執行該區域 >![image.png](https://hackmd.io/_uploads/HJ2ooB7X6.png) #### Grand Central Dispatch * *蘋果的啦~* Apple technology for Mac OS X and iOS * 允許**識別 parallel** 運行的 **section of code(tasks)** >By defining **"block"** >Like OpenMP #### Intel **T**hread **B**uilding **B**locks **(TBB)** * 用於設計平行 C++ 程式的函式庫 * EX：將陣列 v 中的每個值傳遞給 **apply()** 函數 >apply() 函數為每個輸入參數 +1 >![image.png](https://hackmd.io/_uploads/ryroRB7mT.png) * **Method 1:** 簡單 for 迴圈的串列版本 * :-1:: 耗時, 無法利用多核心處理 * **Method 2:** uses TBB * TBB提供了**parallel_for模板** * TBB將循環迭代劃分為多個任務（工作） * TBB創建許多執行緒來parallel運行這些任務 ![image.png](https://hackmd.io/_uploads/SypsJUQQp.png) #### Fork Join * 這裡所描述的 fork-join 模型是**明確的(explicit)** * 例如, Pthreads、Windows 執行緒… * 多個線程**被分叉(forked)**, 然後**連接(joined)** * **fork-join 模型**也可以是**隱式的(implicit)** > 呼叫fork不會產生對應的thread, 而是產生對應的tasks > A library > 對於 **recursive 分而治之(divide-and-conquer)** 演算法很有用 * **General algorithm(通用演算法)** for fork-join strategy >* **Parallelism** >![image.png](https://hackmd.io/_uploads/SysK-UQ76.png) >--- >* 在 **fork** 階段建立 **tasks** 而**不是 threads** >![image.png](https://hackmd.io/_uploads/H1FgMIQ76.png) ***切***成4個任務由多個執行緒執行 ![image.png](https://hackmd.io/_uploads/BJpq48XQT.png) #### Summary * Implicit Threading * 開發人員**只需要識別可以 parallel 運行的tasks**, 而不是執行緒 * **run-time library/compiler** 將任務maps到線程, 並確定thread的**創建**和**管理** * :+1: : >1. 開發人員只需要確定什麼 operations 可以 run in parallel >2. library 管理細節 --- ### Threading Issues #### Thread Pools * Consider 一個使用執行緒為 each request 提供服務的 Web 伺服器 * 動態建立和銷毀執行緒－**耗時(time consuming)** * 無限數量的執行緒－**耗盡系統資源(exhaust system resources)** * **Sol.: *Thread Pools*** * 預先在 pool **創建一定數量的 threads** * 收到請求 :point_right: 從 **pool 中喚醒一個執行緒** :point_right: **將請求傳遞給 service** * 一旦 thread **完成**其服務 :point_right: **return to the pool** * if **沒有可用 thread** :point_right: 伺服器 **waits 直到有空閒(free)** >1. ![image.png](https://hackmd.io/_uploads/r14zuIQQa.png) >:point_down::point_down::point_down: **去pool抓threads來用** >2. ![image.png](https://hackmd.io/_uploads/SkNX_L7X6.png) :::info **Advantages** :+1: * 更快地處理請求 >透過現有thread 而不是建立新thread * 允許應用程式中的執行緒數與pool的大小綁定(bound) ::: #### Semantics(語意) of fork() and exec() system calls * **fork()** in a **multi-thread** application? * If one thread calls ***fork()*** >複製所有 thread 或 >複製呼叫 fork() 的 thread * How about **exec()**? >如果執行緒調用 exec() 系統調用 >替換**整個(entire)** process —— **包括 all threads** #### Signal handling(訊號處理) * **Events** - 訊號在 UNIX 系統中用於通知process發生了事件 * Illegal memory access (非法記憶體訪問) * Divided by zero… (除以零...) * Ctrl-C to terminate a process (Ctrl-C 終止進程) * Timer expires (定時器到期) * 所有訊號都遵循**相同的模式(pattern)** 1. 訊號是由特定事件**產生(generated)** 的 2. 訊號**傳遞(delivered)** 給進程 3. 訊號由進程的**訊號處理程序(signal hander)**來**處理(handled)** * 每個訊號由兩個 handlers 之一處理 * kernel 中的 **default signal handler** * 使用者程式中 **user-defined signal handler** * In **single-threaded** programs * 訊號被傳遞到 process * In a **multi-threaded** programs * 將訊號傳送到**訊號應用(signal applies)** 的 thread. >EX: 非法記憶體訪問, 除以零 * 將訊號傳遞給 process 中的**每個 thread** >EX: Ctrl-C 非同步訊號 * 將訊號傳遞給 process 中的 **certain threads** >pthread_kill( pthread_t tid, int signal) * 分配一個 **specific thread** 來接收 process 的所有訊號 #### Thread cancellation * 在 thread 完成之前終止該 thread >EX: 使用多執行緒搜尋資料庫, 猜密碼 >**找到就 cancel, 猜到就 cancel** * **Target thread:** 要取消的線程 * 取消的兩種方法: 1. Asynchronous cancellation(取消非同步) * 立即終止target thread * May have problems in **resource reclaim(資源回收)** >target thread 正在**持有資源** >target thread 正在**更新共享數據** 2. Deferred cancellation(延期取消) * target thread **periodically** 檢查是否應該終止 * 允許 target thread 在 **safe point（cancellation point）** 自行終止 * **EX:** Pthread * 預設類型是**延遲的(deferred)** >收到取消請求後, 只有當執行緒到達 cancellation points 時才會發生取消 #### Thread-Local Storage * 雖然屬於同一 process 的 threads **share** the data >每個執行緒可能需要自己的某些資料副本 * Solution: **Thread-local Storage (TLS)** >允許每個 thread 擁有自己的資料副本 * 與 local variables 不同 * Local variables 僅在關閉函數期間可見 * TLS 在函數呼叫中可見 * 類似於 **static** 數據 * TLS 對於每個執行緒來說是**唯一的（不共用）** ![image.png](https://hackmd.io/_uploads/H1UVY_mQT.png) --- ### Operating-System Examples `Linux Threads` * Linux 將它們稱為 tasks 而不是 threads * 執行緒的創建是透過 clone() 系統呼叫完成的 * clone() 允許父子任務之間共享多少 * 如果設定了以上所有flags * clone() **類似於創建 thread** * 如果沒有給出任何flags * **clone() 與 fork() 類似** * A Sets of Flags to clone() >![image.png](https://hackmd.io/_uploads/SydRcu7Qp.png) --- ## CH5 CPU Scheduling ### Basic Concepts #### CPU-I/O Burst Cycle * Background * **CPU–I/O Burst Cycle** – process 執行由 CPU 執行和 I/O **wait** 週期組成 >EX: CPU burst, I/O burst, CPU burst, I/O burst...(一直重複:leaves: ) * **CPU burst distribution** – 通常有大量的短 CPU bursts 和少量的長 CPU bursts * **I/O bound:** 很多短 CPU bursts, 很少長 I/O bursts >花在 I/O 上的時間 > 計算的時間 * **CPU-bound:** 很少長時間的 CPU burst >花更多時間進行計算 #### CPU Scheduler * ***Term Definition(術語定義)*** * **Process scheduler** (or CUP scheduler) * 從準備執行的進程中選擇, 即**在 ready queue 中** * **Degree of multiprogramming** * 當前記憶體中的process數量 * ***Process scheduler 應滿足以下目標*** * **Objective of multi-programming** * 讓一些 process 一直在運行, 以最大化 CPU 使用率 * **Objective of timesharing** * process 之間的切換必須 frequent, 才能與各個程式進行交互 #### Preemptive and Nonpreemptive Scheduling * 當 process 出現以下情況時, 可能會做出 **CPU scheduling decisions**： 1. Switches **from running to waiting** state >e.g. perform **I/O** or **invoke wait()** 2. Switches **from running to ready** state >e.g., when **quantum expires** 3. Switches **from waiting to ready** state >e.g., at **completion of I/O** 4. Terminates >**複習: Process State** >![image.png](https://hackmd.io/_uploads/SkEYBtmXp.png) > >--- > >**複習: Process Scheduling** >![image.png](https://hackmd.io/_uploads/BJfArKQ76.png) * **Nonpreemptive v.s Preemptive** * **Nonpreemptive** or **cooperative** scheduling >當調度僅在 1 和 4 下進行時(上面那個數字) >Process 保留 CPU 直到 releases CPU >**對 time-sharing 系統不利** * **Preemptive** scheduling >在所有情況下都會進行調度（1 到 4） >已經獲得CPU的 process 可能會 **be forced to** release CPU >與 shared data 存取相關的成本 => process 可能會在 unsafe point 被搶佔 #### Dispatcher(分派器) * 將***CPU控制權***交給短程排班器***選出process時所採用的的model*** * This involves: * 將context從一個process切換到另一個process * Switching to user mode * 跳到user program中的正確位置以恢復(resume)該程式 * **Dispatch latency** – 調度程序**stop one process**並**start another running**運行所需的時間 * 應該盡可能快 * The Role of the Dispatcher >![image.png](https://hackmd.io/_uploads/S1172K776.png) --- ### Scheduling Criteria * 用於判斷scheduling algorithm的效能 * **CPU utilization** * 盡可能保持CPU繁忙 * **Throughput** * 每個時間單位完成的process數 * **Turnaround time** * 執行特定行程的時間量 * 從process提交到process終止 * **Waiting time** * process在就緒佇列中等待的時間量 * **Response time** * 從提交請求到產生response所花費的時間 * 對於interactive流程, 我們可能不關心turnaround time --- ### Scheduling Algorithms #### **1. First-Come, First-Served(FCFS)** Scheduling * Ready queue is a **FIFO queue** * EX: >**用Gantt Chart看:** >Waiting time for **P1 = 0**; **P2 = 24**; **P3 = 27** >Average waiting time: **(0 + 24 + 27)/3 = 17** >![image.png](https://hackmd.io/_uploads/S1gXRK77a.png) >![image.png](https://hackmd.io/_uploads/HkhH0tmma.png) >***換一種順序, 平均花費的時間好多了*** >Waiting time for **P1 = 6**; **P2 = 0**; **P3 = 3** >Average waiting time: **(6 + 0 + 3)/3 = 3** >![image.png](https://hackmd.io/_uploads/SJcZJcQQT.png) * FCFS scheduling 是**不可搶先的(nonpreemptive)** >**不適合分時系統(time-sharing systems)** * **Convoy effect (護衛）:** 許多短process等待一個大process離開CPU * 考慮一個 CPU-bound process 和多個 I/O-bound process #### **2. Shortest-Job-First(SJF)** Scheduling * 選擇 ***"下一個"*** 最小的CPU burst length >**shortest-next-CPU-burst first algorithm** * Two schemes: * **nonpreemptive(不可搶先)** >看到達時間排隊 >![image.png](https://hackmd.io/_uploads/SyKQIcQ7a.png) * **preemptive(可搶先)** >到達時間到的話可以搶先 >若新process到達時 CPU burst length 小於目前執行行程的**剩餘時間**, >則搶先(preempt)目前行程, 也稱為**最短剩餘時間優先 (SRTF) 演算法** >![image.png](https://hackmd.io/_uploads/rkm4U977T.png) * **Problem:** 如何實施？ * 不容易知道下一次CPU burst的長度 * **Sol.:** predict >**Predict method:** >使用previous CPU bursts的長度, 使用**exponential averaging** >![image.png](https://hackmd.io/_uploads/Sy6Uc9XQa.png) * 預測下一次 CPU Burst 的長度 >![image.png](https://hackmd.io/_uploads/BkYOnc7mT.png) >1. **α = 0**: 最近的歷史不計算在內 >2. **α = 1**: 僅實際最後一次 CPU burst counts #### 3. Round-Robin Scheduling * 每個process獲得一小部分CPU時間 * 稱為**time quantum**或**time slice** * 通常為 **10-100 毫秒**(milliseconds) * 經過這個時間後, 進程**被搶先**並添加到就緒隊列的尾部 * **RR is preemptive(可搶先)** * FCFS is non-preemptive * **Question:** * 作業系統如何知道time quantum已過期(expired)？ * => 定時器設定為time slice value * EX: RR with **Time Quantum = 20** >以20秒為界, 時間到了沒跑完就先丟到尾端乖乖排隊 >![image.png](https://hackmd.io/_uploads/Hylvk37mp.png) * 平均週轉時間比 SJF 更長, 但better response * Performance >**time quantum 越大 => 越趨近 FCFS** >**time quantum 必須夠大 => 以減少上下文切換開銷** * **經驗法則(Rule of thumb):** 80% 的 CPU bursts 應該短於時間量 * Time Quantum 大的話, Context Switches 就小 >**`上下文切換不是免費的！`** >![image.png](https://hackmd.io/_uploads/SJObN37XT.png) :::success :dart: 複習: **Context Switch** >![image.png](https://hackmd.io/_uploads/HkfxH3m7a.png) ::: :::info :information_desk_person: **Turnaround Time Varies With The Time Quantum** >q 取 80% 的區間才是最小的 Turnaround Time >80% 是經驗法則 ![image.png](https://hackmd.io/_uploads/SJW8U37Xa.png) ::: #### 4. Priority Scheduling * 每個process都有一個優先權編號（整數） >CPU被指派給優先權最高的process :::info :face_palm: Problem: **starvation(飢餓)** or **indefinite blocking(無限期阻塞)** 低優先權process可能永遠不會執行 :point_down: >有趣小故事: >MIT, 有一項 1967 年提交的job在 1973 年 IBM 7094 系統關閉時都還沒運行XD:satisfied: :ballot_box_with_check: Solution : **aging** >隨著時間的推移, process的優先順序會增加 ::: >**Example of Priority Scheduling:** >![image.png](https://hackmd.io/_uploads/Sk52thQmT.png) #### 5. Multilevel Queue Scheduling * **Questions:** 如果有兩個process有一樣的priority? * **Sol.:** 結合 **RR** & **priority** scheduling >運行具有最高優先權的process >具有**相同優先權的process使用循環法(round-robin)** :::info :cactus: **Priority Scheduling & Round-Robin** >**按照優先權執行, 如果優先權相同, 就用RR, 直到下一個優先權level~** >![image.png](https://hackmd.io/_uploads/Skkyo2mm6.png) ::: * Question: 資料結構怎麼樣？ >如果是單一隊列, 每次需要O(n)搜尋 * sol.: 使用Multilevel queue代替 >![image.png](https://hackmd.io/_uploads/BkF_hnQQa.png) :::info :wave: **A Two-Level Queue Scheduling** >![image.png](https://hackmd.io/_uploads/r1a0Cn7XT.png) ::: * 就緒佇列被劃分為***單獨的佇列*** * 行程被***永久***分配到一個佇列 * 每個隊列都有***自己的***調度演算法 * 調度必須在隊列***之間***進行 * **Fixed priority preemptive scheduling** * Foreground隊列比background隊列有**絕對(absolute)優先權** * 只有當Foreground隊列為空時, 才執行background隊列 * **Starvation** problem * **Time-slice based** * 每個佇列獲得一定量的CPU時間 #### 6. Multilevel Feedback-Queue Scheduling * process可以在各個隊列之間移動 :::info :bulb: **EX: 根據CPU bursts的特徵劃分process** * Q0 – RR 時間量為 8 毫秒 * Q1 – RR 時間量 16 毫秒 * Q2 – FCFS >**新作業**進入由 RR 提供服務的佇列 Q0 >當它獲得 CPU 時, 作業會收到 8 毫秒的時間 >如果未在 8 毫秒內完成, 作業將**移至佇列 Q1 的尾端** >如果 Q0 為空, 則 Q1 由 RR 提供服務並額外接收 16 毫秒 >如果**仍未完成, 則被搶佔並移至佇列 Q2** >![image.png](https://hackmd.io/_uploads/HyFVM6X7p.png) ::: * From above example: * process使用過多 CPU 時間 => 移至較低優先權佇列 * 將 **I/O 綁定**和**交互**process留在較高優先權佇列中 * 在低優先權 Q 中等待太長時間 => 轉移到更高優先權佇列 * Aging * It is the **most generic(最通用)** algorithm * 可以配置以匹配特定系統 * It is the **most complex(最複雜)** algorithm * 必須為每個參數選擇合適的值 --- ### Thread Scheduling * User threads由**thread library**管理和調度 * Kernel threads由**作業系統**管理和調度 * **Process-contention scope (PCS) - Process爭用範圍** * Threads library調度用戶thread以放置到可用的kernel thread上 * 同一 process 的執行緒之間的競爭 * 當選擇user thread到kernel thread上 >並不意味著執行緒實際上在CPU上運行 * **System-contention scope (SCS) - 系統爭用範圍** * 核心決定在CPU核心上運行哪個kernel thread * 系統中所有kernel threads之間的競爭 --- ### Multiple-Processor Scheduling * 當多個CPU可用時CPU調度更加複雜 * Multi-processor可以是以下任一種架構: * homogeneous(同質) >Multicore CPUs(多核心CPU) >Multithreaded cores(多執行緒核心) >NUMA systems * Heterogeneous multiprocessing(異構多處理) #### Approaches to Multiple-Processor Scheduling * **Asymmetric multiprocessing(非對稱多處理)** :::success :gem: **只有master processor處理調度演算法** 優點: 簡單缺點: master成為瓶頸 ::: * **Symmetric Multiprocessing(SMP) - (對稱多處理)** * 每個處理器做出***自己的***調度決策 * 標準方法 => **Windows、Linux、Mac OS** 採用 * SMP中就緒佇列的組織 >![image.png](https://hackmd.io/_uploads/HJ-V8AQQa.png) * A common **ready queue** * 需要某種形式的鎖定來protect the queue * 訪問隊列是performance bottleneck * 每個處理器都有其**private ready queue** * Good >不受protection overhead的影響 >允許更有效地使用**cache memory** * Bad >不同大小的工作負載**透過load balance解決** `Most commonly used` #### Multicore Processors :::info :bouquet: **Multicore processor:** 同一chip上的多個處理器核心 * Problem: **memory stall** (使動彈不得) >等待數據可用 * Solution: **Chip-multithreading 晶片多執行緒 (CMT)** >一個核心中有兩個或多個**hardware threads** >如果一個硬體thread停止, core會切換到另一個硬體thread >每個硬體執行緒對於作業系統來說都表現為一個**logical processor** >Intel 處理器上的 **Hyper-threading**（也稱為 **simultaneously multithreading 或 SMT**）技術 ::: * A Dual-Threaded, Quad-Core System (雙執行緒、四核心系統) >**作業系統的八個邏輯處理器** >![image.png](https://hackmd.io/_uploads/rkgfYA7Xa.png) * If **M-to-M or M-to-1 Model** (如果是多對多或多對一模式) * 改為使用三級調度 **Level 1:** 選擇每個kernel thread上執行哪個使用者執行緒 => Done by **thread library** **Level 2:** 選擇在每個logical processor上執行哪個kernel thread => Done by the **OS** **Level 3:** 指定每個core如何決定要執行哪個hardware thread => Done by the **hardware** * If **One-to-One Model** (如果是一對一模型) * 多執行緒多核心處理器中的兩級調度 **Level 1:** 選擇在每個logical processor上執行哪個軟體執行緒（kernel thread） => Done by the **OS** **Level 2:** 指定每個core如何決定要執行哪個硬體thread => Done by the **hardware** >![image.png](https://hackmd.io/_uploads/SJwsoAm7a.png) #### Load Balancing * 讓工作負載 **均勻分布(evenly distributed)** 在所有處理器上 * 負載平衡是透過 **process migration(遷移)** 來完成的 * 僅在**private就緒佇列**系統上需要 >In common-ready-Q systems, the load is already balanced #### Processor Affinity(處理器親和力/喜好) * 當一個行程migrates到另一個處理器時 * *Old cache memory* 必須 **invalidated** * 必須 **re-populated** *New cache memory* * 大多數 SMP 系統避免處理程序從一個處理器遷移到另一個處理器 * **Processor affinity:** 保持process/thread在同一處理器上運行 * **Per-processor ready queues:** 免費提供處理器親和力！ * **A common ready queue:** 需要特殊處理 :::warning :bell: **但負載平衡抵消了處理器親和力的好處** * 負載平衡是**透過process遷移來完成**的 * 處理器親和力**盡量不要遷移process** ::: #### Heterogeneous Multiprocessing(異構多重處理) * 上述四個問題假設**homogeneous** multi-processors * **Heterogeneous Multiprocessing(HMP)** - 異構多處理 * 某些系統是使用運行**相同指令集**的核心設計的，但其**clock speed**和**power management**各不相同 * 特別適用於mobile devices * EX: ARM big.L ITTLE architecture * **高效能big cores** & **節能LITTLE cores** 結合 :::success :t-rex: **優點:** * **CPU scheduler** * 將效能要求不高, 但需要運行時間較長的任務分配給小核 * 將互動任務分配給大核心 * **Power management** * 如果處於省電模式, 可以停用大核心, 系統僅依賴小核心 ::: ---