# Operation System ## Ch01 Introduction ### Introduction * OS是user programs/application和硬體之間的程式,將硬體細節隱藏並分配 (fair and efficient resource): * resouce allocator / control program ![未命名2.png](https://hackmd.io/_uploads/BkZq8rq7T.png) * Startup:Bootstrap program = boot loader (位於 ROM 或 EEPROM)load kernel * Computer System Organization ![55DD12BB-49F9-405D-95E8-8021004D6F02.jpeg](https://hackmd.io/_uploads/SJh053_Xp.jpg) * Device driver 也是 OS 的一部分,Device controller 完成 I/O 後會用 **Interrupt** 通知 CPU (由 OS 處理)。 ### Interrupt #### Common Funtion of Interrupt * 在 CPU 和 Device 的互動中,給予 Device 任務後可以 * Polling ( request process wait Device ) = **Synchronous I/O** * Interrupt = **Asynchronous I/O**。 * 當任務完成,Device controller 發 Interrupt Request (**IRQ**) 給 CPU,收到 IRQ 後去 Interrupt vector table (**IVT**) 查詢 Interrupt instruction = Interrupt serivce routine (**ISR**) = Interrupt handler 具體 Instrution 或其 address。 * Interrupt 也可以和軟體交互(也就是traps): * Error, Eg: Division by Z, Non-Authorization Access Memory. * User request, Eg: Ctrl+C. * **可以說OS是Interrupt driven!!** #### Interrupt Handling 1. 硬體或 OS **Save** CPU 狀態(Eg: Register, PC, IR...) 2. 尋找 ISR or Instruction address from IVT 3. 執行ISR 4. 回復 CPU 狀態 ![82835CDF-1576-4D12-89EF-3844E10068DD.jpeg](https://hackmd.io/_uploads/S1B156dma.jpg) * 這個 **save-restore** 過程就是 Interrupt 成本 #### I/O Sub-system * OS 控管 HW 的系統: I/O subsystem * Driver interface * Memory Management: buffering, caching ### Storage Structure #### Storage Hierarchy ![未命名2.png](https://hackmd.io/_uploads/rkxt_r5Q6.png) * [Direct Memory Access (**DMA**)](https://blog.csdn.net/as480133937/article/details/104927922): The process of transferring data without the involvement of the processor (CPU) itself. * Transfered by Device controller. * Raise Interrupt when done. * Only Primary Storage in storage hierarchy would be accessed by CPU directly. * Caching: Information in use copied from slower to faster storage temporarily. #### Memory Management * **Data** and **Instruction** are in memory * File-System management / Access control * OS's job: * Stores files into the storage devices. * Creating and deleting files and directories * Primitives (操作, Eg: read/write/append, set/get status/permission ) to manipulate files and dirs. * Free-space management, Storage allocation, Disk scheduling ### Computer-System Archutecture * Multi-processor systems = parallel systems = tightly-coupled systems. * 不同 processor 共用 main memory。 * [Symmetric (reliability) / Asymmetric (Master/Slave)](https://tomchen.me/2020/03/19/OS/%5BOS%5D%20%E4%BD%9C%E6%A5%AD%E7%B3%BB%E7%B5%B1%E7%AD%86%E8%A8%98-%E7%B3%BB%E7%B5%B1%E6%9E%B6%E6%A7%8B/) ![未命名2.png](https://hackmd.io/_uploads/ryTJpBqQp.png) * Multi-core Systems * 單一處理器 `processor` 上有多個 `CPU`,處理器上共用 `L2 cache`,`CPU` 有自己的 `L1 cache`。 ![未命名2.png](https://hackmd.io/_uploads/Sy4gprqX6.png) * Non-Uniform Memory Access (**NUMA**) * Porblem: Multi-core + Multi-processor = system bus 上出現效能瓶頸。 * Solution: local memory,Access 本地資料加快;Access其他資料變慢 $\Rightarrow$ 需要 OS 謹慎的安排 task / data. ![未命名2.png](https://hackmd.io/_uploads/ryOG0H97a.png) * Clustered Systems * 通過 Local area network (LAN) 或其他方法連接很多電腦。 * High availability, Fault tolerant, Graceful degradation. * Symmetric (efficient) / Asymmetric (hot-standby mode) ![未命名2.png](https://hackmd.io/_uploads/BydjRS976.png) ### Program Scheduling * **Long-term scheduler = Job scheduler** (不頻繁) * 多工處理, CPU 可以執行多個 Process,以最大化利用 CPU,由它決定哪個 Process 從 job queue 到 ready queue (admit) * Process被分成兩種,最好是混合 1. I/O-bound = short CPU bursts 2. CPU-bound = long CPU bursts * Windows 和 UNIX 沒有,所有 job 都被丟進 memory。 * 為了 Improve process mix => Midterm Scheduling * **Short-term scheduler = CPU scheduler** (頻繁,millisec.) * 決定 ready queue 上的 proocess 哪個該被執行 * 經常做 Context Switch ### OS Operation #### Dual-Mode Operation * OS 通過 Dual-mode Operation 保護自己 * User mode = non-privileged mode * Kernel mode = privileged mode (for OS) * Transition from user mode to kernel mode (by system call) * Eg: read file, send pkg, etc. ![未命名2.png](https://hackmd.io/_uploads/r1JDBI5Q6.png) #### Process Management * **Process = Program in Exection** * Single-thread process (1 PC) / Multi-thread process (Multiple PC, 1PC per thread) * Create / Delete / Suspend / resume / synchronization / communication / deadlock handling #### Protection and Security * 每個 user 有自己的 userID,用於處理 Process 和 File 的 access control。 * Group ID 可以讓多的 user 擁有 Process / File 權限 * **Privilege escalation (特權提升)** 使 user 可以暫時提高權限 * Eg: Setuid (UNIX), [passwd (UNIX)](https://blog.gtwang.org/linux/linux-passwd-command-examples/) ![未命名2.png](https://hackmd.io/_uploads/rJa6EvcQp.png) ### Virtualization * 為硬體 (單一電腦) 提供不同的執行環境 (Create Illusion of environment) $\Rightarrow$ **Allows OSes to run on other OSes** 1. Emulation (仿真): 通過軟體模擬硬體去支援另一種OS,較慢 2. Virtualization (虛擬化): 由 VMM (Virtual Machine Manager) = Hypervisor 支援此服務 **可以(不)通過 host OS** 直接取用 HW source * Type I: 不通過 host OS,直接由 VMM 銜接,VMs 有自己的 kernel ![未命名2.png](https://hackmd.io/_uploads/rJ6cwwqX6.png) * Type II: 通過 host OS,分成 host 和 guest ![未命名2.png](https://hackmd.io/_uploads/SJ38PD976.png) ### Computing Environment * 除了 Traditoinal Computing / Mobile Computing,還有: 1. Client-Server Computing * Application-server: 提供 Client interface for request service * File-server: 提供 store / retrieve files insterface for Client 2. Peer-to-Peer Computing * peers 概念無 Server / Client, Eg: Skype 3. Cloud Computing * Categories of Services 1. Software as a Service (SaaS) 2. Platform as a Service (PaaS) 3. Infrastructure as a Srevice (IaaS) * Categories of Authority: Public / Private / Hybrid 4. Real-Time Embedded System * No OS / Tiny embedded OS / Standard OS ## Ch02 System Structure ### Operating System Services * Functions helpful to user / applications: * User Interface: CLI / GUI * Program execution: Load program into memory and run, terminate execution. * I/O operations (involve file or I/O device) * File-system manipulation (操作): read / write / create / delete / search / list info. / permission management * Communications, between processes or computer * Error detection * Hardware Errors (CPU, memory, I/O devices) * Software Error (Invalid memory access) * Finctions ensures efficient operation of system * Resource allocation * Logging / Accouting * Protection / Security ![未命名2.png](https://hackmd.io/_uploads/HJxrAQdc7a.png) ### Interface Provided by OS 1. User Interface: * CLI: Shell, fetch command from user and executes it. * GUI :+1: 2. Programming Interface * **System call**: Programming Interface provided by OS. * System call 有**編號 (number)**,以 table 紀錄 * Caller 不知道 System call 如何運作,傳進 OS 後由 sys. call handler 處理。 ![未命名2.png](https://hackmd.io/_uploads/rykiLd576.png) * 通常user使用更high-level的API, Eg: Windows API, POSIX API, JavaAPI * 不同 OS 的 sys. call 參數/名稱不同,但 API 可以用於多數 OS,圖上 C library 就是包裝過後的 system call ![未命名2](https://hackmd.io/_uploads/Bk01YuqXp.png) ![image](https://hackmd.io/_uploads/HkcIt_5m6.png) * System call 如果要傳參數 1. Register: 太小 :-1: 2. Memory Address / Table: 參數放在 Table 裡,只傳 Address :+1: 3. Stack :+1: ![未命名2](https://hackmd.io/_uploads/BknL0u9mp.png) * 分類: 1. Process Control 2. File Management 3. Information Maintenance 4. Communication 5. Protection ### System Programs * 部分是 interface of sys. call, 部分更複雜 * 功能: 1. File management / modification, Eg: File manager 2. Status information, Eg: date, time... 3. Programming language support / loading and execution, Eg: compiler, assembler, ELF loader... 4. Communication, Eg: Chrome * [Linker and Loader ](https://hackmd.io/@alan303138/Sy26KTAJY#%E9%80%A3%E7%B5%90%E5%99%A8) * Linker: 將許多目的檔連結成一個執行檔的程式 * Loader: 將可執行檔載入到記憶體當中執行的程式。 ![未命名2](https://hackmd.io/_uploads/B1G9fKq7a.png) ### OS Design & Implementation 1. User Goal (convenient to use) vs. System Goal (easy to design) 2. 因為 Compiler 的進步,是否使用組語寫 OS 差異減小,主要是資料結構和算法影響較大。 ### OS Structure * Simple Structure * Eg.1: MS-DOS * Eg.2: UNIX (Monolithic kernel) ![未命名2](https://hackmd.io/_uploads/BJRzmq5mp.png) * Layered Approach * 將 OS 分層,上層 call 下層,難以定義每層為何且效率較低。 ![image](https://hackmd.io/_uploads/rk1RXqcQT.png) * Microkernels [2022 OS midterm] * 將 OS 部分功能(Eg: File System, Device Driver...)上推至User mode與Applications同級,而 Basic IPC, memory, CPU Scheduling...等等功能維持在kernel里共用,使被分區的功能之間通過Kernel mode 傳遞 message (message passing)。 * pros: 容易擴展功能、OS較能適應別的架構、kernel mode code 減少 $\Rightarrow$ reliale & secure * cons: inter-subsystem communication 成本增加,component分得越細,成本越高。 * [More Detail](https://hackmd.io/@sysprog/microkernel-design) ![image](https://hackmd.io/_uploads/SyNYI9qQ6.png) ![image](https://hackmd.io/_uploads/H1x9UqcXT.png) * [Modules (Object-Oriented)](http://cc.ee.ntu.edu.tw/~farn/courses/OS/slides/ch02.pdf) * 現在大多數系統使用 loadable kernel modules,Core Components 是分離的,需要時可以被 kernel 動態加載 ![image](https://hackmd.io/_uploads/HJZFv9c76.png) * Hybrid Systems * 就是上面的都來一點以兼顧不同 Policy * Linux: monolithic + modular * Windos: monolithic (most) + microkernel + modular ![1280px-OS-structure2.svg](https://hackmd.io/_uploads/HJoaL55QT.png) ### Virtual Machines * 為硬體 (單一電腦) 提供不同的執行環境 (Create Illusion of environment) ⇒ **Allows OSes to run on other OSes** 1. Emulation (仿真): 通過軟體模擬虛擬機硬體,慢 2. Virtualization (虛擬化): VMM (Virtual Machine Manager) = Hypervisor **可以不通過 host OS** 直接取用 HW source ![image](https://hackmd.io/_uploads/H15gK55Qp.png) ![未命名2](https://hackmd.io/_uploads/SkWQK5c7p.png) ## Ch03 Process ### Process Concept * job = process (active) = program (passive stored on disk (**executable file**)) in execution * Memory Layout of C Program ![未命名2](https://hackmd.io/_uploads/Hkqchc9XT.png) #### Process State * new: being created * ready: runnable, waiting to be assigned CPU * running: being executed, owns the CPU is one of running status * waiting: waiting for event occur * terminated: finished execution ![image](https://hackmd.io/_uploads/S1DeAccQ6.png) #### Process Control Block **(PCB)** =Task Control Block * 一種資料結構,每一個process有自己的PCB,存儲資料如下: 1. Process Status 2. CPU register, Eg: PC, IR etc 3. Memory-management information, memory allocated to the process 4. Accounting & identification information, Eg: CPU time, PID 5. I/O status information ![image](https://hackmd.io/_uploads/rJ-7kscXa.png) #### Threads $\Rightarrow$ Ch04 * a process is a program that performs **a single thread of execution** #### **Process Queues** 1. Job queue: all process in system 2. Ready queue = run queue: queue 所有 ready state 的 process 3. Device queues: queue 在等 I/O 的 process 4. Event queues: queue 在等特定 event 的 process, 各種 event 有自己的 event queue ![image](https://hackmd.io/_uploads/HJCIWi9Xp.png) ### Process Scheduling #### Goal 1. Multi-programming: High CPU utilization 2. Time-sharing = Multi-tasking: Short response time #### Implement * Context Switch (around millisec) * 當 CPU switch,需要替換 PCB 資料 (做SL) 這就是 Context Switch,也就是switch process 的成本 * 可以通過增加 CPU register 數量降低 Switch 成本 ![未命名2](https://hackmd.io/_uploads/BkO9XiqQp.png) * Schedulers * **Long-term sceduler = Job scheduler** (不頻繁) * 多工處理, CPU 可以執行多個 Process,以最大化利用 CPU,由它決定哪個 Process 從 job queue 到 ready queue (admit) * Process被分成兩種,最好是混合 1. I/O-bound = short CPU bursts 2. CPU-bound = long CPU bursts * Windows 和 UNIX 沒有,所有 job 都被丟進 memory。 * 當希望保有但 degree of multi-programming 降級:Midterm Scheduling * **Short-term scheduler = CPU scheduler** (頻繁,millisec.) * 決定 ready queue 上的 proocess 哪個該被執行 * 經常做 Context Switch (PCB 資料轉換) * **mid-term scheduler** ![image](https://hackmd.io/_uploads/Hy_HU7sm6.png) 原因是:**improving process mix** * 可能因為一開始沒有正確評估出 memory 大小 * 可能不確定是 I/O bound 還是 CPU bound ### Operations on Process #### Creation ![image](https://hackmd.io/_uploads/Hk5vLj5QT.png) * 根據Creation關係,可以畫出 tree of processes, 他們的 **pid** 不同 * Child 的 資源可以由 OS 或 Parent 給予,後者的好處有可以避免 Child 一直做 Creation 吃掉系統資源 * Parent 和 Child 的關係可以有 * 資源分配: 共用、Child使用部分Parent資源、不共用(Child資源由OS提供) * 執行: concurrently (同時)、Parent wait for child termination * [Address space](https://ntust-csie-islab.github.io/106-OS-homework1/%E9%9A%A8%E7%8F%AD%E9%99%84%E8%AE%80m121355/): 相同 / 不同 * 舉例(UNIX): * **fork()** system call creates new process * [**exec()**](https://jasonblog.github.io/note/linux_system/linuxxi_tong_bian_cheng_zhi_jin_cheng_ff08_wu_ff09.html) system call used after a fork() to **replace** the process’ memory space with a new program ``` int main() { pid_t ret, dead; int status; ret = fork(); // fork another process, ret = child pid if (ret < 0) { // Error occured fprint(stderr, "Fork Failed."); exit(-1); } else if (ret == 0) { // child process execlp("/bin/ls", "ls", NULL); // 用/bin/ls的檔案取代接下來的code } else { // parent process dead = wait(&status); //dead = -1 代表 Child terminal,否則會回傳 Child pid printf("Child Complete."); exit(0); } } ``` ![image](https://hackmd.io/_uploads/rJ0czh5m6.png) #### Termination * Process 結束時,通過 `exit()` system call 呼叫 OS 刪除自己 (Free / Deallocate PCB and resource),其 Parent 可以通過 Wait 指令 get exit status * Parent 可以主動 terminate Children * 可能因為吃太多資源或不需要了 * 有的 OS 在 Parent Terminate 時會讓所有 Children termiated $\Rightarrow$ cascading termination, Eg: VMS * 也可能是將 Parent 設成 init/systemd (pid = 1), Eg: UNIX ### [Inter-Process Communication (IPC)](https://tomchen.me/2020/06/07/OS/%5BOS%5D%20%E4%BD%9C%E6%A5%AD%E7%B3%BB%E7%B5%B1%E7%AD%86%E8%A8%98-Process%E9%96%93%E7%9A%84%E6%BA%9D%E9%80%9A/) * Independent process 不受其他 process 影響,也不影響其他 Process;Cooperating Process 則相反 * 任何共用 Data 的 Process 都是 Cooperating Process,對於需要切割 job 後平行跑的 MP system 而言是必要功能* Communications Models * [兩種model](https://hackmd.io/@YiZjennnnn/ipc_interprocess_communication#Message-Passing) * Shared Memory: 更快,不用 System Call for read / write * Message Passing: 更慢,需要 System Call for sending / receiving,但更容易實現 ![image](https://hackmd.io/_uploads/Sy95I25Qa.png) #### Shared Memory * Memory region 需要先宣告,然後被 attached to processes 的 address space (不再使用時被 detach) * Producer-Consumer Problem * 實作時因為 Memory 存在容量問題,根據盛裝 Data 的 Buffer 分類為 * unbounded-buffer: Producer 不用等, Consumer 可能要等 * bounded-buffer: 兩個都可能要等 * Shared Memory: ``` #define BUFFER_SIZE 10 // 實際可以有 BUFFER_SIZE-1 個 elements typedef { ... } item; // 宣告 Data 的資料結構 item buffer[BUFFER_SIZE]; int in = 0, out = 0; if (in == out) { // empty } else if ((in + 1) % BUFFER_SIZE == out) { // full } ``` * Producer: ``` while (1) { while ((in + 1) % BUFFER_SIZE == out) { //full, do nothing } buffer[in] = item; in = (in + 1) % BUFFER_SIZE; } ``` * Comsumer: ``` while (1) { while (in == out) { // do nothing, nothing in buffer } item = buffer[out]; out = (out + 1) % BUFFER_SIZE; } ``` > skip the POSIX code,寫了我也不會記得 :+1: #### Message Passing * Operation: Send and Receive * Sender 和 Reciever 要有 Communication link,Message size 可變可不變 * 分成 Direct Communication 和 Indirect Communication * 也分成 Synchronization = Blocking 和 Asynchronous = Non-Blocking * Blocking 和 Non-Blocking 可以成對,比如 Blocking Sender + Non-Blocking Receiver ##### Direct Communication 通過 Link 連接一對 processes,必須一邊 call send 一邊 call recieve 才能溝通 * Properties: 1. Processes 之間必須**明確**知道收 / 送件人 2. Link 自動建立,一對 process pair 一個 Link 3. Symmetric Addressing 4. Pid 嚴格規範,如果 Pid 改變就出事 ##### Indirect Communication 通過 Mailboxes = ports 運送 msg,mailbox 有自己的id,只有共享 mailbox 的 Processes 能交流 * Properties: 1. 共享 mailbox 的 processes 有 Link (不局限於兩個 Processes,同時 pair of processes 之間也可能不只一個 Link) 2. Link 可以是單向也可以是雙向 3. Mailbox owner 可能是 creator 或 OS * Creator: 代表 Mailbox 存在該 Process 的 address space 且該 Process 是唯一的 reciever;當 Process terminal,Mailbox 消失 * OS: OS 會給 User 刪除 Mailbox 的方法,即使 Creator Process terminal,Mailbox 仍可能存在 ##### Synchronization = Blocking * Blocking Sender: 一定要確認 receiver 有收到 message ,才能 return * Blocking Receiver:一定要拿到 message 才能 return ##### ASynchronization = Non-Blocking * Non-Blocking Sender: 反正我是送出去了,但可能有 drop * Non-Blocking Receiver: **多 return 一個 token**,檢查到底有沒有真的收到訊息 ##### Buffer 1. **0 capacity**:Sender 要等 receiver 2. **Bounded capacity**:Full 時 Sender 等;Empty 時 Receiver 等 3. **Unbounded capacity**:Sender都不用等;Empty 時 Receiver 等 > skip POSIX and System V API again :+1: ### Examples of IPC Systems #### Pipes = Anonymous Pipes in Windows * Require **Parent-Child relationship**, 通常單向 ![image](https://hackmd.io/_uploads/Hyokwaqm6.png) ``` int main(int argc, char *argv[]) { int pipefd[2]; pid_t cpid; char buf; if (argc != 2) { fprintf(stdrr, ": %s<string>\n, argv[0]."); exit(EXIT_FAILURE); } if (pipe(pipefd) == -1) { perror("pipe"); exit(EXIT_FAILURE); } cpid = fork(); if (cpid == -1) { perror("fork"); exit(EXIT_FAILURE); } if (cpid == 0) { // Child read from pipe close(pipefd[1]); // Close unused write end while(read(pipefd[0], &buf, 1) > 0) { write(STDOUT, &buf, 1); } write(STDOUT_FILENO, "\n", 1); close(pipefd[0]); exit(EXIT_SUCCESS); } else{ close(pipefd[0]); wrtie(pipefd[1], argv[1], strlen(argv[1])); close(pipefd[1]); wait(NULL); exit(EXIT_SUCCESS); } } ``` > 關於 [argc, argv](https://blog.gtwang.org/programming/c-cpp-tutorial-argc-argv-read-command-line-arguments/),argc 的報錯指定只有兩個參數 > argv[1] 是要丟進 pipe 的 data,argv[0] 則指向程序運行的全路徑名 > 操作解析: > 1. 檢查兩個參數、檢查 pipe 兩端有 process > 2. Fork (檢查是否失敗) > 3. cpid = 0 $\Rightarrow$ Child: 關閉 pipe[1]\(sender),將 pipe[0]資料儲存,關閉 pipe[0],EXIT(0) > 4. cpid != 0 $\Rightarrow$ Parent: 關閉 pipe[0]\(reciever),將 argv[1] 資料寫入 pipe[1] ,關閉 pipe[1],EXIT(0) #### Named Pipes = FIFOs * **不必須是 Parent-Child relationship** ### Communication in Client-Server Systems #### Socket * 定義:Endpoint for Communication,由 IP address 和 port 組成 #### Remote Procedure Calls (RPC) * 讓 Client 如同呼叫本機一樣呼叫 remote host,msg 通過 RPC daemon;msg 包含 function id and parameter,結果再以 msg 方式回傳 ![399673684_707274467647895_8600112520730956596_n](https://hackmd.io/_uploads/Skm9bR9m6.jpg) * Issues 1. 協定,需要統一 Data 存放、表現方式,由 marshaling 處理 2. Error handling * 可能因為網路錯誤而 fail 或 丟出很多次 msg * ALO && AMO * **At Least Once (ALO)**: Server 收到會回 Client **Ack**,即"至少成功傳出一次" * **At Most Once (AMO)**: Server 收到一樣 (timestamp) 的 msg 會直接 drop 掉 3. Blinding: Client 不知道 Server 的 Port?會有 daemon 在監聽固定的 port 去確認 server port #### Remote Method Invocation (RMI, for Java) > 就 Java 版,不想寫 ![image](https://hackmd.io/_uploads/rysnmCqXp.png) ## Ch04 Threads & Concurrency ### Overview * Process 是 OS 分配資源的最小單位;Thread 則是 CPU 運用的基本單元 * Process 是 Thread 的容器,Process 分為 Single- / Multi-Threaded Process ![image](https://hackmd.io/_uploads/H1xWtCqQp.png) ### Multithreading * 因為 Process Creation 很耗時,當我們想要一個 Server 提供給大量 Client 就會很卡 $\Rightarrow$ 改用更輕量的 Thread * 在 RPC (Remote Procedure Calls) 中 Thread 也被使用 ![image](https://hackmd.io/_uploads/HJNecA9Qa.png) * 優點(前二者也是 multi-process 的優點,後二者針對 multi-thread): * Responsiveness: 可以服務很多 Client,不會被一個 Client 卡死 * Utilization of MP Architecture,可以實現平行運算 * Resource Sharing * 更輕量 = Economy * System 處理 Task 可以分成 * Concurrency: 跑跑 A 再跑跑 B,算都有 support * Parallelism: 同時在跑很多 task ![image](https://hackmd.io/_uploads/r1Zk6Aq7p.png) * Data Parallelism: 切割 Data 後對 Data 做一樣的事 * Task Parallelism: 對同一份 Data,不同任務自己拉要用的去跑 ![未命名2](https://hackmd.io/_uploads/ryv-R0cXT.png) #### Amdahl’s Law 根據系統的 parallel 和 serial 比例以及 core 數推估速度提升 ![image](https://hackmd.io/_uploads/HyG3GkiQa.png) #### Thread Types 1. User thread: done by **thread libarary** 2. Kernel thread: Supported by kernel ![image](https://hackmd.io/_uploads/S1rjmyoQ6.png) #### Multi-thread models 1. Many-to-One * 好處: thread 管理上比較 efficient * 壞處: 只要一個 user thread 做 blocking system call,就會卡住其他所有 thread $\Rightarrow$ 無法做 Multi-Thread ![image](https://hackmd.io/_uploads/HJ53NysQT.png) 2. One-to-One * 好處: 沒有上面的 blocking 問題 * 壞處: 過多的 kernel thread 對 kernel 負擔過大 ![image](https://hackmd.io/_uploads/H1VEBJjXa.png) 3. Many-to-Many (M user thread $\Rightarrow$ N kernel thread, N $<=$ M) * 好處: 沒有 Blocking 問題、Kernel 負擔不過重 * 壞處: 設計複雜 ![image](https://hackmd.io/_uploads/B1D3S1jXT.png) 4. Two-level Model(類似M:M,只是允許少部分 thread 單獨 map 成 kernel thread) ![image](https://hackmd.io/_uploads/H1GQ8yoQT.png) ### Thread Libarary * 提供 API 去創建和管理 thread * 分成 * User-level library: code 和 data structure 都在 user space,library call 不會產生 system call * Kernel-level library: code 和 data structure 都在 kernel space,library call 通常會產生 system call * Eg: 1. Pthread (User / Kernel) ``` int sum; // global variable, 不同 threds 共用 void *runner(void *param); // sub-thread code main(int argc, char *argv[]) { pthread_t tid; //thread id pthread_attr_t attr; //thread attribute if (argc != 2) { fprintf(stderr, "usage: a.out <integer value>\n"); } if (atoi(argv[1]) < 0) { fprintf(stderr, "%d must be <= 0.\n", atoi(argv[1])); } pthread_attr_init(&attr); // get default attribute value pthread_create(&tid, &attr, runner, argv[1]); // 回傳新建 tid,賦其 attr 以及 function name and input pthread_join(tid, NULL); // 等新 thread 結束 printf("sum = %d.\n", sum); } void *runner(void *param) { int upper = atoi(param); int i; // These are local var. sum = 0; //This is global var. if (upper > 0) { for (i = 1; i <= upper; i++) sum += i; } pthread_exit(0); // end thread } ``` 2. Windows (Kernel) ``` CreateThread() => create the thread WaitForSingleObject() => wait for thread to finish ``` 3. Java threads > Skip, 我肯定不會記得 #### Implicit Threading ($\Leftrightarrow$ Explicit threading) 告訴系統這塊要平行,但是不自己寫 Eg: OpenMP `#pragma omp parallel` 用於宣告 parallel region ### Threading Issue #### Semantics of fork() and exec() * Question: 當一個 process 有 5 個 thread,其中一個 call fork(),該怎麼處理? Ans. 1: 複製全部 Ans. 2: 只複製 call fork() 的 caller thread $\Rightarrow$ 當該 thread 接下來會 call exec(),可以只複製 caller,其他情況則複製全部 thread #### Thread cancelling * 想要在一個 thread 結束前 terminate * Solution: 1. Asynchronous Cancellation: * 馬上 terminate thread * 缺點:資源已被分配、該 thread 可能正在更新共用的資料 2. Deffered Cancellation: * 讓 thread 在安全點檢查是不是有人叫他刪掉: `pthread_testcancel()` * 確保 thread 是在 Cancellation Point (save point) = `open()/close()/read()/write()` 被刪除 3. 就不給刪 ![image](https://hackmd.io/_uploads/Hkik0yjQT.png) #### [Signal handing](https://hackmd.io/@Chang-Chia-Chi/OS/https%3A%2F%2Fhackmd.io%2F%40Chang-Chia-Chi%2FOS-CH4) Signal are used in UNIX systems to notify a process that a particular event has occurred **A signal can be handled only once!!** 1. Signal 由特定事件產生 2. Signal 會被送給 Process (Signal destination) 3. Signal is handled by signal handlers * What if a multi-threaded process 收到 Signal? 1. 給 signal 指定的 thread. - For syn signals, 要送給搞出 signal 的 thread 2. 給 process 裡所有 thread - For Ctrl-C, 要送給所有 thread 3. 給 process 某個特定 thread 4. 設定一個 thread 接收所有 process - 直接由 main thread 來處理 signal,如 file handler 的操作 #### Thread pools 考慮一個面對大量 Client 的 web server,如果每個 Client 來都創一個 thread 給他: 1. Thread creation and termination 都要成本 (Thread 有自己的 register 和 Stack) 2. Thread 沒有上限 Solution: Thread Pools * 先建好 Pool of Thread,創好的 thread 放在裡面等著被用 * \# of thread in pools 取決於 (部分 pool 架構的 thread 數量是可調整的): * CPU 數量 * 實際 memory 大小 * 預期的使用量 #### Thread specific data = Thread-Local Storage (TLS) * 讓 Thread 擁有自己的一份 Data copy * 注意:封裝性 1. Global variable: 不同 threads 都能看到 2. Thread specific data: 僅有該 thread 和被該 thread call 的 fuction 才能看到該 variable 3. Local variables: 只有 during single function invocation 可視 ### Operating-System Example ![image](https://hackmd.io/_uploads/S1G5PloQ6.png) > 放棄背 code :crying_cat_face: ## Ch05 CPU Scheduling ### Basic Concepts * Scheduling: **basis of multi-programming** = switch CPU between process to improve CPU utilization * CPU burst cycle: cycle of CPU execution, 包含執行指令以及等待被從 rea待被從ready 選回 running 的時間 * I/O burst cycle: cycle of I/O waI/O wait ![image](https://hackmd.io/_uploads/B18Ysgs7T.png) * CPU Scheduler = short-term scheduler: 從 ready to execute 的 process 中選擇,並給它 CPU。出現時機: 1. Process 在等 I/O 或 children: 從running 移到 waiting (主動放棄) 2. Process 跑太久了 (time expire): 從 running 移到 ready (被 Scheduler 趕走) 3. Process 完成 I/O: 從 waiting 回到 ready (被 Scheduler 趕走) 4. 在跑的 Process 做完了: Terminate (主動放棄) * Non-preemptive vs. Preemptive * Non-preemptive Scheduling = Cooperative Scheduling: * 只要不主動放棄 CPU,就沒人能搶 * 需要 Process 自主合作 * Code 好寫、不用額外硬體支援、USER感受可能較差 * Preemptive Scheduling: * 對於 Time-sharing system 或 real-time system 較合適 * 更多的 context switch (cost)、額外的硬體、複雜的 code ### Scheduling Criteria 1. CPU Utilization ($\Uparrow$ , better) * CPU 有多少比例的時間是非閒置的 2. Throughput ($\Uparrow$ , better) * 單位時間可以完成多少 process 3. Turnaround time ($\Downarrow$, better) * 對於一個 process,從 process submission 到 process termination 的時間 (即開始到結束) 4. Waiting time ($\Downarrow$, better) * 從進 ready queue 到被選走的時間 5. Response time ($\Downarrow$, better) * 從 request submmitted 到第一個 response 的時間 ### Scheduling Algorithm 1. FCFS * 先到先做 * 缺點: non-preemptive * 護衛效應(_Convoy Effect_): 如果第一個 process 很長,會使所有 process waiting time 上升 * 低 CPU 利用率 2. SJF * Burst短的先做 * 分成兩種: * Non-preemptive: 不檢查 new process 的時間,只在每次拿 Process 時挑最短的,一定會把手頭上的做完 * Preemptive = **Shortest-Remaining-Time-First (SRTF)**: 發現新的 process 進來,所有 process 的剩餘時間進行比較 3. Predicting of Next CPU Burst ![image](https://hackmd.io/_uploads/SkdNf-ima.png) 4. **Round Robin (RR)** * 設定一個時間 Time Quantum **q**,大家都上來跑,不能超過 **q**,否則就 Time expire 請下去 ready queue * 較差的 average turnaround time 和較好的 response time * **q** 太大 $\Rightarrow$ FIFO **q** 太小 $\Rightarrow$ 太多 Context Switch 消耗 * 經驗法則:CPU burst * 80% < **q** 5. Priority Scheduling * 進來的 Process 有 Priority 編號,由小到大做;其實 SJF 就是 Priority 的一種 * 缺點:**Starvation** - 低優先度的可能永遠做不到 Solution: **Aging** 6. Multiple-Processor Scheduling * 當 Process 可以被輕鬆分類時使用,將Process 分在不同的 Queue 並依據特性決定算法 * Eg: Foreground (有互動需求) 和 Background (可以分批處理),FG可以有比較高的 Priority $\Rightarrow$ FG uses RR + BG uses FCFS * Scheduling 要兼顧不同的 Queue * Eg: 固定 Priority 先做 FG 再做 BG;或 Time slice,大多數時間做 FG、少部分做 BG **$\overrightarrow{\rm improved}$ Multilevel Feedback Queue:** 配合已經處理的資訊換 Queue * Eg: 如果 CPU bound Process 一直占用資源 $\Rightarrow$ 降級去低優先級的 Queue;如果低優先級 Queue 裡有 Process 一直沒拿到 CPU 就:$\overrightarrow{\rm aging}$ 移動到更高優先級的 Queue * The most **generic and complex** algorithm ## Reference Mainly Based on CSIE, NCKU OS class 2023 ### Ch01 1. [陳士杰-作業系統(Operating Systems) (Operating Systems)](http://debussy.im.nuu.edu.tw/sjchen/OS/97Spring/Ch_1.pdf) 2. [作業系統筆記-電腦系統架構](https://tomchen.me/2020/03/19/OS/%5BOS%5D%20%E4%BD%9C%E6%A5%AD%E7%B3%BB%E7%B5%B1%E7%AD%86%E8%A8%98-%E7%B3%BB%E7%B5%B1%E6%9E%B6%E6%A7%8B/) ### Ch02 1. [系統程式ch5:Linker/Loader](https://hackmd.io/@alan303138/Sy26KTAJY#%E9%80%A3%E7%B5%90%E5%99%A8) 2. https://hackmd.io/@sysprog/microkernel-design 3. http://cc.ee.ntu.edu.tw/~farn/courses/OS/slides/ch02.pdf ## Midterm ### 簡稱練習 1. RPC 2. PCB 3. NUMA 4. IVT 5. DMA 6. SJF / SRTF 7. ALO & AMO 8. ISR 9. RR 10. TLS ### 名詞練習 1. Storage Hierarchy 2. Multi-processor systems 3. Multi-core systems 4. Clustered Systems 5. Long-term scheduler & short-term scheduler & mid-term scheduler 6. Dual-Mode Operation 7. Privilege escalation 8. Emulation & Virtualization (type I & II) 1. Linker and Loader 2. Simple Structure - Layered Approach - Microkernels - Modules 3. Process State Diagram 4. Process Queues 5. Context Switch 6. Tree of processes Diagram 7. Parent & Children fork() & exec() Diagram 8. Shared Memory & Msg passing 9. direct Communication & Indirect Communication 10. Pipes & Named Pipes 11. Socket 12. Multithreading 13. Throughput 14. Multilevel Feedback Queue 15. Turnaround time 16. Non-preemptive vs. Preemptive 17. Thread Pools 18. Deffered Cancellation 19. Concurrency 20. Implicit Threading