# OS 筆記 contributed by < `limaoxd` > # 第一章 ## OS介紹 - app 和 hardware 的中介 - os的目標 - 讓 app 不用直接去處理硬體 - os 有更好的效率去處理 resource sharing - computer 分成 4 個區塊 - Hardware ( cpu, mem, I/O device - Operating system - System and application program ( App - User ( People, machine ## Operating System - OS is - Recource allocator - 管理所有的資源 - App 不用直接去管理 Mem - 在**效率(effcient** 和 **公平(fair**的衝突中做資料的使用 - Control program - 控制執行中的程式來避免出現 error 和 improper use - 通常是 Bootstrap program 去載入 os - 通常存在 ROM 或 EELPROM - Job - 初始化所有的 aspect of system - 載入 OS kernel 和執行 OS execution - Bootstrap(in-disk) 通常由 PC, BIOS 載入 - I/O device 和 CPU 可以同時執行 (concurrently - 每個 device controller 都是 charge of a particular device type - 每個 device controller 有自己的 buffer (local buffer - For data write - **Device driver (part of OS** 在 cpu 上都從 Main memory 移至 Local buffer - one of the jobs of an op is **Manege Interrupt** ## Common Function Interrupts - 執行 interrupt 透過 **Interrupt Vector ( table** 去使用 **ISR** - interrupt vector 可能包含 address or instructions > X86 is address, Arm is instruction - Processor 必須存 Address of **Interrupted Instruction** 在跳去 ISR 之前 - 雖然可以由 I/O device controller 去 interrupt 但 Software 也能generate/raise interrupt - **trap**通常是 **error 或 user request** 產生的 interrupt - Error : devision by zero, invalid mem acess - User request - **int** instruction in x86 - **svc/swi** instruction in Arm - An op system is **interrupt driven** - 通常是呼叫 ISR 或 interrupt handler 去執行 interrupt - **interrupt Handling** - save the cpu state >done by HW or OS - continue the interrupt work by 恢復(restoring the Cpu state - saving/restoring cpu state - ![](https://i.imgur.com/gl41kjy.png) - interrupt-drivce I/O cycle - ![](https://i.imgur.com/qDaa6Ll.png) - interrupt timeline - ![](https://i.imgur.com/GNwmwOg.png) - 同步或非同步的 I/O - ![](https://i.imgur.com/HbotdWI.png) ## DMA (Direct Mem Acess - Device controller 傳送 block of data 從 Local Mem **直接**傳到 Main Mem > without **CPU intervention** - 當 DMA 結束時產生 Interrupt ## Storage Structure - storage system 的階級 - the layers differ in **Size**, **Speed**, **Cost**, **Volatility ( 揮發性** - ![](https://i.imgur.com/C0B8yRC.png) - **Primary storage** 最主要存的 - Main mem > CPU 能**直接 (via load/store instruction**取得的 large storage media - **Secondary storage** 次要 - extension of main mem that provides large **Nonevolatile(非揮發性** storage capacity - **HDD ( Magnetic disk** - **NVM ( Nonvolatile mem** - SSD (solid state drive - **Tertia storage** > 用作備份的 ## Caching - 被用來從慢的 storage 到快的 storage 抓出來檔案 (臨時用的 > main mem 可當 secondary storage 的 cache - cache 首先確認是否有此檔案資訊存在 - yes, 檔案直接從 cache 載入 - no, 複製檔案到 cache 再載入 - cache 的大小 < 被抓取的 storage - 所以 cache 的管理是很重要問題 - Cache Size and Replacement Policy - 可在 **many levels in computer system** 中使用 >e.g. hardware, os, software - Perfomance of various levels of Storage - ![](https://i.imgur.com/vGcTbFB.png) - > DRAM 由電晶體組成是動態記憶體 - data A 從 Disk 到 Register 的**移植( migration** - ![](https://i.imgur.com/OXydKrK.png) ## Computer-System Architecture - 許多系統使用 **General-Purpose Processors** - 有些系統則是用 **Special-purpose processor** - **Multi-Pocessor systems** 使用上的成長和重要性 - **Parallel System**, **Tightly-Coupleed Systems** - 優點 1. 增加 throughput 2. 增加 reliablity > **graceful degration** 或 **fault tolerance** 3. **Ecocomy of Scale** > 一樣的性能下多核比起強大的單核便宜 - Types 1. **Asymmetric Multiprocessing** > e.g. Qualcomm Snapdragon, MTK Helio 2. **Symmetric Multiprocessing** > e.g. Intel i7 based PC - Symmetric Multiprocessing Architecture - ![](https://i.imgur.com/nJ586t9.png) ## Multi-core systems - **Mult-core**: 數個 **compute cores** 在單個 chip 上 - 比 multiple sing-core chips 更高效 - onechip 的內部溝通比 cross-chip 溝通快 - ![](https://i.imgur.com/QEC5JFa.png) - Graceful Degradation - ![](https://i.imgur.com/rH7crqB.png) ## NUMA ( Non-Uniform Mem Acess - 增加太多 CPU 在 **multiprocessor systems** 會導致 - 對 **system bus** 的**爭搶 (contention** - >每個 CPU 想要 **acess** Main Mem - 越多性能會下降 - NUMA 是一種好方法 - 提供每個 CPU 有自己的 Local Mem - access own mem bus - 越來越受到 server 和 high-end computing systems 青睞 - ![](https://i.imgur.com/aaTsk1Z.png) - problem - 讀取時間 Remote mem >> access local mem - 所以 OS 需要調配好 tasks/data - > **Co-locates (共同** a task and its data ## Clustered Systems - 多台電腦連接透過 **LAN(local area network** 或 **Faster Interconnection Network** - clustered system 的關鍵是 **High Avaliablity(可用性?** - **Fault tolerant** - 承受 **single component** 和 **continue operation** 的錯誤 - **graceful degradation** - 效能遞減會是成比例的下降 - general structure of a vlusteres system - ![](https://i.imgur.com/WXSihtf.png) - **Asymmetric Clustering** - 有些機器在 **Hot-standby mode** 而其他的在跑 app - Hot-standby machine 監控其他機器是否 active 或 fails - **Symmetric Clustering** - 機器再跑 app 時順便互相監控其他的機器 - 能更有效發揮那些能用的機器 ## MultiplePromming & Timesharing - Multiprogramming is need for **efficiency** - 讓 single job 不能整天使 CPU 和 I/O device **busy** - Multiprogramming organizes job(code and data) 能至少保持有事可做 - >增加 CPU 的使用率 - 其他的 jobs 持續在 **on-disk job pool** 裡 - job 被選擇和放入記憶體裡透過 **job scheduling** - Memory Layout for a Multiprogrammmed System - ![](https://i.imgur.com/V4EU77v.png) - **Timesharing (multitasking)** - CPU **頻繁**切換 jobs 的 **logical extension** - 透過增加 **interactive computing** 讓使用者能在 job running 時 interact - response time < 1 second - 如果 several jobs 準配好且在同個時間時 -> **CPU scheduling** - > If processes don’t fit in memory, swapping moves them in and out to run – Virtual memory allows execution of processes not completely in memory • Allows a program that is larger than the physical memory to ru n ## Operating System Operation - OS 必須防止 user jobs/process 破壞 OS 的正常運作 - **Dual-mode operation** - 使 OS 能保護自己 - Processor 提供 **user mode** 和 **kernel mode** - > os 跑 kernel mode(privileged mode), process 跑 user mode(non-privileged mode) - **privilleged instruction** - e.g. I/O control, special register,interrupt related instruction ... - a **mode bit** is provided 來區分系統再跑哪個 mode - **System call** 改變模式到 **kernel mode** , 然後在 return 切回去 - Transition from User to Kernel mode - ![](https://i.imgur.com/iK8juBS.png) - Modes are provided by HW(CPU) - > intel's 8088 只有 1 個 mode - 現代的 x86 則多於 4 個 mode - ![](https://i.imgur.com/Zt4mrFl.png) ## Timers - 概念上, user processes 跑在 OS 最上層. 但其實是**直接跑在 CPU 上!** - 工作都丟出去, os 可能會無法執行 - timer 用來避免 process **hogging resources** 卡在無限的迴圈 - set timeout priod ## Process Management - process 是執行中的 program, 是 system working 的基本單位 - >process is acitive entity, program is passive entity - process 必須要 data 塞進 Mem 才能完成 task - process 的中止需要 reclaiming reusable resources. - **Single-threaded** - one PC 用來指定下一個 instruction 的位置用來執行 - **Multi-threaded** - 有多個 PC, 每個 program 各有一個 per **Thread** - 通常會有多個 program 同時進行在一個 CPU 中, 透過 **Multiplexing( 多路複用在 process/threads 中同時執行** - Activities 跟 process management 的關係 - 創造 and 刪除 process - 暫停(suspending and 回復(resume process - providing machanisms(機制 給 process 的同步 - providing machanisms 給 communication - providing machanisms 給鎖死處理(deadlock ## Mem Management - Data 要被先放到 mem 在被處理(process 之前 - Instruction 則是放進 mem 在被執行之前 - mem management 決定 mem 裡有甚麼 - goal: 優化 CPU 的使用率和回應給 users - **Memory management activities** - 在需要時去收集和釋放 memory space - 持續追蹤記憶體使否**被用**和**被誰用** - 決定哪個 process 移進或移出記憶體 ## Storage Management - OS 提供 uniform, logical 的方式給資料儲存 - diff types of storage devices (ssd, hdd,tape drives - 將 **physical properties** 提取至 **logical storage unit - file** - free format e.g. text files - fixed format e.g. executable files - files 通常被組織進 directories - **Access control** 在系統中決定**誰**可以做**何**操作 - **OS activities** include - 創建或刪除盪案和資料夾 - Primitives to manipulate(操作 files and dirs >read/write/append files, set/get file status,set/get file permissions - **Mass-Storage Management (focus on efficient operation)** - 通常是 disks - Mass storages 被用來儲存 - 那些 **not fit** 在 main mem 的 data - 或那些要長期儲存的 data - OS activies - Free space management - Storage allocation - Disk scheduling - 對 system performance 的重要性 - disk 通常是效能的瓶頸 - Latency (潛伏 在電腦運算通常依靠 disk subsystem 和他的演算法 ## I/O Subsystem - OS其中一個目的是對 users 隱藏硬體的特點 - **I/O subsystem** - **Drivers** - **Memory management** of I/O including - **buffering** 在資料被傳輸時儲存 - **caching** 為了效能儲存部分資料到 faster storage ## Protection and Security - **protection** - 控制 **access**(of processes or users) to **resourse** 的機制 - 被 authorizes 或 unauthorized 區分 - **Security** - 防禦系統內或外的攻擊 - huge range, including denial(拒絕 of service, viruses... - 通常是先分辨 users 去決定他能做啥 - each user has user id - user id 關係到所有檔案和process 去決定 access control - process owners and file owner are all uesers - Group id 允許設定或管理 users, 也關係到 process 和 file - **Privilege escalation (特權升級** 暫時允許 effective ID 有更多權力 - 但容易當城市被攻擊時變得脆弱 > ls -al passwd 能設定 userid ## Virtualization - 提取 **the hardware of a single computer** 到數個 diff 執行環境 - 製造一個 **illusion** 讓各個 environment 跑在私人電腦上 - allow OSes to run on other OSes - ![](https://i.imgur.com/QazITjv.png) - **Emulation** - 在 CPU type 不同時使用 - slow - **Virtualization** - 當 OS **natively compiled for a CPU**, 而那個 OS 也將直接跑在 CPU 上 - **VMM (Virtual Machine Manager)** 提供 virtualization services - also **Hypervisor** - can run natively without host OS - ![](https://i.imgur.com/qUe1Pv7.png) ## Kernel Data Structure - Lists, Stacks, Queues - Trees - Hash Tables - linux data structure define **include** files > linux/list.h , linux/kfifo.h , linux/rbtree.h - **List** - ![](https://i.imgur.com/lVX1o8z.png) - **Stack** : **LIFO** - push - pop - **Queue** : **FIFO** - **Tree** - general tree - binary tree - binary search tree - balanced binary search tree - **Hashing and Bitmap** - **Hash function can create a hash map** - **Bitmap** - string of n binary digits representing the status of n items (這大概每個人都會做吧) ## Computing Environments - types - Traditonal Computing - Mobile Computing - Client-Server Computing - Peer-to-Peer Computing - Cloud Computing - Real-Time Embedded Systems - **Traditional Computing** - office environment - PCs connected to a network, **terminals** 附在 **mainframe** or **minicomputers** 提供 **batch(分批** 和 **timesharing** - 新的 **portals** 允許相近的 systems 去存取 internal resource - home networks - 被用在單一系統再來是 **modems** - 現在是 firewalled, networked - **Mobile Computing** - smartphones, tablets - issues - limited mem - slower processors - I/O constraints: small keyboard - Limited power - extra features - GPS, accelerometers, and 陀螺儀 - leaders 是 apple ios, google android - **Client-Server Computing** - **Servers**: serving request, **Clients**: issuing requests - **application-server** : 提供介面給 client request的服務 - **File-servers** : 提供介面給 client 去存或取回(retrieve 檔案 - 對 client 來說, **dumb** terminals 被 PCs 或 smartphne 取代 - **Peer-to-Peer Computing** - 取代節點並視為 **peers** - 各個 peer 被當作 client 或 server - Node must join P2P network - 優點 - in client-server system, server 可能是瓶頸. 在 P2P 裡 server 卻能由其他的 nodes 提供 - e.g. skype - Node must join P2P network - 註冊他的 service(s) 在 Central lookup service(中央查找服務) on network - 廣播 request for services 和回應 services via **discovery protocol** - **Cloud Computing** - Deliveres 運算, 儲存, 甚至是 app as a service across a network - **virtualization**的邏輯擴張 (Logical extension - 通常使用在 virtualization - ![](https://i.imgur.com/0vPaogn.png) - diff types - **public cloud** - 給那些付錢的人用 - **Private cloud** - 公司自用 - **Hybrid cloud** - 包含上述兩種的 cloud components - Catogories - **SaaS (Software as a Service ** - 一個或更多 app available via cloud - **PaaS (Platform as a Service** - software stack 準備給那些透過 cloud 的 app - **IaaS (Infrastructure** - servers or storage avalable via cloud (storage fot avaliable backup - cloud computing 由 traditional OSes, plus VMMs, plus cloud management tools 組成 - Internet connectivity 需要 **security** 像 firewalls - **Load balancer** 橫跨 multiple apps - **Real-Time Embedded Systems** - 有時候跑 **real-time OS** - Processing 必須有 **deadlines** 去完成作業 - 否則 system will fail - e.g. 導彈系統 - **Open-Source OS** : linux, freeBSD, OpenSolaris, Illumos... # 第二章 ## Outline - OP Service - Interfave provide by an OP - System Program - OP 設計和實作 - OP 的結構 ## **OP Services** - 對 user 或 app 有用的 Function - **UI (User interface** - **Program execution** - load program in mem, terminate execute,either normally or abnormally (代表失敗嘞) - **I/O operation** - 運行的程式可能要使用 I/O, 可能涉及 file 或 I/O - **File-System manipulation** - Program 需要讀寫檔案和資料夾, 創造或刪除 item, 查詢, 列出檔案資訊和權限管理 - **Communications** - Process 會交換資訊, 在同個電腦或其他甚至是 cloud - **Error detection** - OS 必須立即知道可能的 errors - hardware errors - software errors - invalid mem access - each type of error,OS 必須慘取適當的行動確保正確和持續運算 - Functions that ensures the **efficient operation of the system itself** via resource sharing - **Resource allocation** - 當多個 users 和多個 job 同時進行時, 資源必須被 allocated 給他們 - **Logging/Accounting** - 保持追蹤 users 用了哪種和多少的電腦資源 - **Protection & Security** - 確保所有的 access 到 system resources 被控制嘞 - 那些同時跑的 process 不能影響到彼此 - user authentication - 保護資源避免被違法的使用 - A view of OP services - ![](https://i.imgur.com/69xONiU.png) - User Interface provided by an OS - **CLI** - CLI 提供直接輸入指令 - 有時候被 kernel 執行,有時候是 system program - 有時實現出多種 **flavor** - bash, c shell, TC shell, Korn shell - 主要從 user 讀取 instruction 並執行 - 有些是 built-in, 有些是executable program (後者可以不需要 shell 的改動 - user Interface provided by an OS - **GUI** - user-friendly **desktop** metaphor interface - mouse, keyboard, monitor - **Icons** 是代表物件像 files, program, actions - 多種不同的 actions 可以被展現在 objects 上 > e.g. provide information, execute function, open dir(folder) - Invented at Xerox PARC - 現有許多系統包含 GUI 和 CLI interface ## **Programming Interface of an OS - System calls** 應該挺重要的 - system call - programming interface provided by OS - 通常寫在高階語言 - In many cases, user program 被稱為 high-level **Application Program Interface - API** 比起被稱作 system calls - API: a set of functions that can be invoked by the programs - 3 個最常見的 API 是 : **Windows API** for windows, **POSIX API** for POSIX-based systems (包含許多版本的 UNIX, Linux, and Mac OS X) 和 **Java API** for JVM - 用 API 比起用 System calls - 更簡單使用 - 可移植性 (Portability - Each OS has it own name for each system call - System call 序列複製內容到另一個 file - ![](https://i.imgur.com/pn6MfCj.png) ## **System Call Implementation** - 通常, **number** 關連到各個 system call - System-call interface 留有 table 去 indexed 他們的 **number** - The system call interface **invoke (調用** intend(打算 system call handler(處理程序 in the OS kernel (system call interface 調用在kernal 那些打算使用 system call handler 的程序 - After system call handler finishes, 接下來會回傳底下的... - status of the system call - return values - The **Caller** 不知道 system call 是如何被實現的 - 只要遵循 the **High-Level API** or **System call interface** 然後將知道 OS 會做啥 - 但 details of the OS system call interface 被 API 隱藏嘞 - managed by run-time support lib - System call 的實現 - ![](https://i.imgur.com/0ax2yX3.png) -Standard C lib e.g. -C program invokes the printf() libary call, 在 system call 是被做 **write()** - ![](https://i.imgur.com/bu7PJFD.png) > c lib 其實還是在 user mode 裡 - System call on x86 - ![](https://i.imgur.com/tSeljoz.png) > in modern x86, the **syscall** instruct 可以實現更快速的 system call - replcement of int 0x80 - **Passing Parameters (參數傳遞** in system call - 通常, 有更多的資訊被system call 需要比起簡單地 ID/number - type 和很多的資訊包含 OS 和 call - 通常有 3 種方式在 OS 裡被用來傳遞參數 - Simplest : 傳遞 parameter in **registers** - in some cases, there may be more parameters than register - Paremeter 存**block/table** 在 mem 裡, 且將 address of the block 被當作 parameter 在 register 裡 - Parameter 被placed或 pushed on stack, 通過 program 和**popped off** the stack 在 OS 裡 - Block 和 stack methods 沒有限制當 paremeter 被傳遞時的**數量**或**長度** - parameter passing via block/table - ![](https://i.imgur.com/yW6P4oc.png) - **MS-DOS Execution - Single-Tasking** - ![](https://i.imgur.com/zlhE5OS.png) - **FreeBSD Execution - Multi-Tasking** - ![](https://i.imgur.com/lwN61y7.png) - **Types of System Calls** - **process control** - creare/execute/terminate process - get/set prosses/terminate processes - Wait for time - Wait signals/events - Allocate and free mem - **File management** - Create/delete files - Open/close files - read, write, reposition (seek) - Get/set attributes - **Device management** - Request/release devices - read, write, reposition (seek) - Get/set attributes - Attach/detach devices - **Information maintenance** - Get/set date or time - Get/set system info. - **Communications** - Create/delete connetions - Send/receice messages - Transfer status information - **Protection** - Control access to resources - Get and set permissions - Allow and deny user access - **Examples of System Calls (會考** - ![](https://i.imgur.com/hHzxu1l.png) ## **System Programs** - **Most users's view of an OS** 是被宣告在 system programs, 不是真的 system calls - 有些時候底下的 function 被提供給 **system utilities** 或 **application programs** - System programs 提供一個方便的環境給 program development 和 execution, 可被分成底下 - **File management** - Create, delete, copy, rename, print, dump, list, and generally manipulate files and directories. >e.g. file manager - **Status information** - Some provide general status info. - date, time, amount of available mem, disk space, number of users - Others provide detailed performance, logging, and debugging info. - Typically, these programs **format and print** the output to the terminal or other output devices - Some system implement a **registry** - used to store and retrive(取回 configuration information - e.g., **task manager** and **registry** in windows, **top** utility(效用 in Linux - **File modification** - Text editors 去造或更改 files - 特殊的指令是去**尋找**內容(contents or **展示** transformations of the text - **Programming language support** - Compiliers, assemblers, debugger and interpreters sometimes provided - **Program loading and execution** - ELF (Extendible Linking Format) loader - **Communications** - 允許使用者傳送訊息到其他的 螢幕, 網頁瀏覽介面, 傳送 eletronic-mail messages, log in remotely, 從這個機器傳送檔案(files 到其他的機器 - E.g., Outlook, Chrome/Edge... ## **Linkers and Loaders** - source code 編譯進 object files - **Linker** 合併 object files 進 single binary **executable** file (program) - Program 像 binary execytable 住進(resides **secondary storage** - program 被 **loader** 執行時一定要被帶進 (brought into memory - Object, executable files 有基礎格式, 所以 OP 知道如何載入和啟動它們 - 現代 system 通常的目的是不需要 link libraries 進執行檔 - 並且, **DLLs(dynamically linked libraries)** 只要需要就被載入, 分享給那些用相同版本相同 lib 的程式 (loaded once) - **Role of the Linker and Loader** - ![](https://i.imgur.com/sDjW8Kp.png) ## Operating System Design and Implementation - OS Design - 不同種類 OP 的內部結構 can very widely - Stary by **defining goals** and **specifications (規格** - 被選擇的系統型態或硬體影響 - **User goals** 和 **Sustem goals** - User goals - OP 應該**簡單的被使用**, 能被**簡單的學習**, **可靠** **安全**且**快速** - System goals - OP 應該簡單的**被設計**, **實現**和**持續(maintain**以及**彈性**, **可靠**, **error-free** 和**有效率** - OS design - 被分成 **policy** 和 **mechanism** - **Policy**: **甚麼**將被完成 - **Mechanism**: **如何**去做到 - Mechanisms 決定如何去做某事, Policies 決定甚麼被完成 - 分離 policy 從 mechanism 是非常重要的決定, 將會允許 **maximal flexiblity** 如果之後 **policy decisions are to be changed** - OS Implementation - 在早期之前, OS 是組合語言寫的 - MS-DOS was written in Intel 8088 assembly language - 至今許多的 OS 都是由高階語言寫的 - C/C++ - >90% 的 linux code 是由 C 寫的 - Portability(移植性 matters! - OS 主要的效率進步由 - Better date structure and algorithms - Not from using assembly language - because of **advanced compiler techniques** - although some performance-critical code are still assembly code... ## Operating System Structures - **Simple Structure** - MS-DOS: 提供最有功能性的被寫進最少的空間 - Not divided into modules - MS-DOS Structure - ![](https://i.imgur.com/fgnBmv8.png) - UNIX - 被硬體的功能性限制, 原始的 UNIX OP 是 limited structuring - The UNIX OS 包括 - System programs - The kernel - 包含所有東西且在 system call interface 下和硬體之上 - 提供 file system, CPU scheduing, memory management, 和其他 OP functions; a large number of function in one level - Monolithic kernel (單片核心 - UNIX System Structure - ![](https://i.imgur.com/ozjsE4B.png) - **Layered Approach** - 像 layers of network protocol, OP 被分進 **a number of layers(levels)**, 每份被建在 lower layers 的最上面 - The bottom layer(layer 0) is hardware; The highest leyer is the user interface - with **modularity**, layers 被選則像是每一層調用 Operations/services of only the lower-level layer - 優點 - 簡單建構和除錯 - start from the lowest layer - 不知道其他層的細節 - hide the details from the other layers - 難點 - hard define the layers - Less efficient - services 可能橫跨很多層 - **Fewer layers** are applied (少量的 layers 被應用 - Layered Operating System - ![](https://i.imgur.com/6KTCNMd.png) - **Microkernels** - kernel 將一些功能下放到 **user** space - ![](https://i.imgur.com/BxvIcqX.png) - Communication 取得 user modules 的空間(place 透過 **message passing** - 好處 - 好移植 - 容易擴大 - 較少程式需要跑 - 更安全 - 弊病 - 使用 communication 跨太多次 kernekl mode 和 user mode 使效能降低 - 對於內部系統的溝通 (file system invoke disk driver) - 1 function call + return -> 2 system calls and 2 upcalls - **fine grined(粒狀 compoments** -> high overhead - **Microkernel system structure** - ![](https://i.imgur.com/fxJwPWQ.png) - **Mac OS X Structure** - ![](https://i.imgur.com/UisxK5J.png) - **Modules** - 現代最多的 OP 實作 kernel modules - 使用 object-oriented approach (物件導向) - Each core component is seperated - Each talks to the others through **known interface** - Each is **dynamically** **loadable as needed** within the kernel - 總而言之 , 像 layers 但更具彈性 - **Solaris kernel** - ![](https://i.imgur.com/I0tfLcX.png) - Hybrid Systems - 最現代化的 OP 不用去採用單一, 且嚴格定義的結構 - **Hybrid** systems 結合多個 structures to address **performance, usablity, flexibility....** - linux is **monolithic(單體的** - +**microkernel** for diff **personalities (user-level services)** - +**modular** for dynamic loading of functionality - Windows NT Kernel - ![](https://i.imgur.com/kSLewZi.png) ## Virtual machines - Multiple **virtual** machines on a **physical** machine - A VM 提供 interface **identical** 給 underlying(深層 bare hardware - each VM can have its own OP - 硬體的資源被 vm 分享 - 好處 - machine consolidation(合併 (簡單的管理, 減少 cost) - useful for **OP rearch and development** - VM 提供 **complete protection** of system resource. Each VM is **isolated** from other VMs. - 系統可以開發在 vm, 取代在在實體上開發並且也不會正常系統的運作 - remain avalible 在 hardware **maintenance/upgrade** - The VM providing services can migrate to physical machine - VM ![](https://i.imgur.com/HAqNZae.png) - VMware Aechitecture - Workstation ![](https://i.imgur.com/KvZDzQc.png) # 第三章 ## Process Concept - 一個 OP 執行user programs - Batch system (批次系統 - jobs - Time-sharing systems - processes or tasks - We use the terms(條款 **job** 和 **process** almost interchangeably (可交換的 - Process - 一個執行中的 program; process execution 必須 progress (進展 in sequential fashion (方法 - A process includes - text (i.r., code section(部份) - **date section** - **heap** - **stack** - **program counter** and the content of the **processor registers** - Program - **passive**; Process - **active** - Process in Mem ![](https://i.imgur.com/wjhPvhf.png) > more than one processes can be associated with the same program - Memory Layout of a C Program ![](https://i.imgur.com/sePJmyc.png) - **Prpcess State** - process 執行時, 會改變自身的狀態 - **new**: The process is being created - **ready**: The process is waiting to be **assigend** to a CPU - the process is runnable - **running**: Instruction are being executed (i.e., **owns the CPU** - **Waiting**: The process is **waiting** for some **event occur** - **terminated**: The process has finished execution - Diagram of Process State ![](https://i.imgur.com/EXs6vAd.png) - **Process Control Block (PCB)** - **Information 關係到各個 process** - process state - program counter - The address of next instruction - 必需再 interrupt 產生時**儲存** - CPU registers - 多數的 number and type, 根據 **processor architecture** 上 > Accumulators, index registers, stack pointers, general purpose registers - 必需再 interrupt 產生時**儲存** - CPU scheduling information - Priority, pointers to scheduling queues, other scheduling paremeters - Memory-management information - Accounting &identification information - **CPU time** and **real time used** - Time limits - process numbers - I/O status information - Open files and devices... - also called **Task Control Block(TCB)** ![](https://i.imgur.com/XwhOHSv.png) - **Threads** - **So far**, process 是一個 program 執行在 **Single Thead** - Process executes in sequential fashion - 許多現代的 OP 提供了 **multi-threaded** processes ## Process Scheduling - Goals of - **Multiprogramming** - High CPU utilization - **Time-sharing** - Short response time - process scheduler 有責任的跑選擇的程式, 嘗試去達成上面的目標 - **job queue** - set of all processes **in the system** - **Ready queue (or called run queue)** - set of all process **residing in main mem, ready** and **wating** to execute - **Device queues** - set of processes **waiting for I/O device** - **Event queues** - set of processes **wait** for an event - **process 移植在 queue 之中** - **Ready Queue and Various I/O device Queues** ![](https://i.imgur.com/bUjuEvH.png) - **Queuing-Diagram Representation of Process Scheduling** ![](https://i.imgur.com/o2oqPbs.png) - **Context (框架 Switch** - 當 cpu 切換到其他 process, the system 需 **save** 舊 process 的狀態在 PCB 中和 **load** 儲存的給新的 process - this is called **context switch** - Context-switch time is overhead; the system 在切換時不能有效的工作 - Time 依靠硬體提供 - e.g. SUN UltraSPARC provide multiple set of **registers** - context switch == chage the pointer to the **current register set** - Typically, context switch 需要 **a few microseconds** - **CPU Switch from Process to Process** ![](https://i.imgur.com/mmJddgk.png) ## **schedulars** - **Long-Term(長期 scheduler** (or job scheduler) - 決定那些 process 需要從 job queue 推進 ready queue - **invoked very infrequently (seconds, minutes) -> (can be slow)** - Control the **degree of multiprogramming** - 應該去選擇好的 **process mix** - 去調適 CPU-bound 和 I/O-bound - **Short-Term scheduler** (or CPU scheduler) - 決定哪些在 ready queue 的 process 是下一個應該要執行的 (excute on CPU) - **invoked very freqently(milliseconds) -> (must be fast)** - E.g., 100 ms for each process, 10 ms for performing schedling - 10 / (10 + 100) = 9 % CPU time wasted on the scheduling - Processes can be describes as either - **I/O-bounded process** - spends more time doing I/O than computationis, many short **CPU bursts** - **CPU-bounded process** - spends more time doing computations; few very long CPU bursts - **Windows and UNIX have no long-term schedulers** - put all the jobs into mem - Addition of Medium Term Scheduling ![](https://i.imgur.com/Y5ILDiO.png) ## Operations on Process - We 介紹兩種 Operation 1. **Process Creation** - parent process 產生 children processes, 其他的產生 process 形成 **tree** of process - Parent and Child - Resource allocation - a child can obtain its **resources** (CPU times, mem, files, IO dev...) from (a** OS** (b **Its Parent** - If a child can **only** obtain its resources from it's parent - 防止從 **overloading system** 生太多 process - Resource sharing - 父子共享資源 - 孩子分享資源從父的 subset(子集 - 父子不共享資源 - Execution - 父子**同時**執行 - 父**等待**子結束 - **Address space** - 子跟隨父一樣 run 在同個 address 或 - 子有自己的 address space - UNIX e.g. - **fork** system call 創造新的 process - **exec** system call - 創造新的 process 並 **replace** 舊的 process mem 位置 - 通常用在 fork 之後 - **C Program Forking Another Processs** ![](https://i.imgur.com/v0M0ish.png) - **Execution Flow of the Program** ![](https://i.imgur.com/xTqosx6.png) 2. **Process Termination** - Process **執行**最後的狀態並且**告訴** OP 去刪除這個 process (via the **exit()** system call) - 輸出 **data/status** 給他的 **parent** - Process resources 被 OP 給解除分配 (deallocate - 父可能終止子的 process - Reasons : - Child has exceeded allocated resources - 不需要再指派 Task 給 child - If parent is exiting - some OP(VMS) 不允許 子程序繼續當父程序已終止 - All children terminated - **cascading termination** - In unix, **cascading termination** is not required - Child's **parent** 被設定成 **init/systemd** process (pid = 1 ## Inter-Process Communications (IPC) - **Independent** process 不能被其他執行的 process 影響或被影響 - **Cooperating** process 則能被其他執行的 process 影響或被影響 - Any processes that **share data** with others are **cooperating process** - Process cooperating 的優勢 - 資訊分享 - 計算加速 - divide the jobs and run them in parallel on MultProcess systemn - 模組化 ( Modularity - IPC 允許 processes 去交換 data - 2 種基本的模型 1. **Shared mem** - Faster > Requires **no** system when reading/writing data (重要!! 2. **Message passing** - Useful for exchanging **smaller data** - Easier to implement - **Slower**; requires **system calls** for sending/receiving messages - **Communications Models** ![](https://i.imgur.com/Su9ihFm.png) ## Shared Memory - shared mem region 必須先被創建 - 然後,shared mem 是**attached** 在 processes 的 **address space** - Mem 存取能夠自由的在範圍內完成 - 原本 OS 是防止 address of a process 被其他 address 存取 - OS 不管那些在 shared mem region 的資料使否改寫 - The region can be **detached** 當 process 沒有要用時 - **Producer-Consumer** Problem - Paradigm(範例 for cooperating process, **producer** process 生成資訊 (i.e. data items)that is consumed by a **consumer** process - 使用 buffer 去保留 data items - **unbounded-buffer** 沒有實際的大小限制 - Producer 不需要等待, 但 consumer 可能就要嘞 - **bounded-buffer** 假定那裏有 fixed 的大小 - producer 和 consumer 就**可能**需要等待 - **Bounded-Buffer** - share mem solution ![](https://i.imgur.com/wiHrftt.png) - **Bounded-Buffer** - Producer ![](https://i.imgur.com/BV81g4A.png) - **Bounded-Buffer** - Consumer ![](https://i.imgur.com/QQxYpk4.png) - **shared mem API** - System V API - shmget() - shmat() - shmdt() - shmctl() - POSIX (Portable Operating System Interface) - shm_open(), mmap(), munmap(), close(), shm_unlink() ## Message Passing - Mesage system - process 沒有儲存 shared variables/mem 的互相溝通 - 2 major operation - **send(message)** - **recieve(message)** - message 的大小可以被修正或可變的 - If P and Q wish to communicate, they need to - have a **communications link** between them - 交換資訊透過 send/receive ## Direct Communication - processes 必須互相明確(explicitly 的命名 - send(P, message) - send a message to process P - receive(Q, message) - receive a message from Q - Communicaton link 的性質 - Link 是**自動 (automatically** 建立的 - 需要知道互相的 id - 連結聯繫到準確的一對溝通的 processes - Between each pair there exits exactly one link - The API above 在**addressing**中是**同步 (symmetric**的 - **Asymmetric addressing** - **send** (P, message) - 傳送訊息到 P - **recieve** (&id, message) - 接收訊息並**回傳**給 sender - 弊病 - Hard-coding process id - 改動 preocess id 可能會產生問題 ## Indirect Communication - Messages 透過 **Mailboxes(ports)** 做傳送和接收 - each mailbox has a unique id - Processes can communicate only if they **share** a mailbox - Communication link 的性質 - Link 只有建立在 process 共享 mailbox 時 - A link 可能聯繫到多個 processes - 每一對 processes 可能分享**數個** communication links - Link 可能是**單向 (unidirections**或**雙向 (bi-directional** - Operations - create a new mailbox - send and recieve message **through** mailbox - destroy a mailbox - Primitives are defined as: - **send**(A, message) - send a message to **mailbox A** - **receive**(A, message) - receive a message from **mailbox A** - **Mailbox Ownership** - A mailbox 被**Creator** or **OS** 所持有 - Owned by creator - mailbox 是 creator's address space 的一部分 - **The creator is the only receiver** - Creator terminate -> mailbox disappers... - Owned by OS - 當 creator 被終止後可能持續存在 - OS 應該提供使用者一個機制去**明確 (explicitly**的刪掉 mailbox ## Synchronization - message passing 可能 **blocking** 或 **non-blocking** - **blocking** is considered **synchronous** - **Blocking send** has the seder block until the message is recieve - **Blocking receive** has the receiver block until a message is available - **Non-Blocking** is considered **asychronous** - **Non-blocking send** has the sender send the message and continue -> **does not wait for the receiver** - **Non-blocking receive** has the receiver receive a valid message or null - **Message Buffering** - **A queue of messages** attached to the link - Implemented in one of 3 ways - 0 空間 - 0 message sender must wait for receicer **No buffering** - Bounded capacity - **finite length of n messages** Sender must **wait** if link full - Unbounded capacity - infinite length Sender never waits **automatic buffering** - Reciever have to wait if there is no message(case of **blocking receive**) - **Message Passing APIs** - System V API - msgget() - msgsnd() - msgrcv() - msgctl() - POSIX Message Queue API - **mq_xxx()** functions - mq_open(), mq_send(), mq_receive(), mq_close(), mq_unlink()... ## Pipes - a container for byte stream - Producer writes to one end (the **write end** of the pipe) - Consumer reads from the other end (the **read-end** of the pipe) - Pipes are **typically unidirectional** - but some systems are bidirectional - 要求 **parent-child relationship** 在溝通中的 processes - Window calls thesr **anonymous (匿名 pipes** - **IPC Using Pipes** ![](https://i.imgur.com/eAaH46x.png) - **Named Pipes** - also called FIFOs - On linux, name pipes are created by **mkfifo()** - on Linux, name pipes ae more powerful than ordinary pipes - **NO parent-child relationship** 在溝通中的 processes 是必要的 - Any process can open the FIFO for reading or writing since it has a **name** - 在 unix 和 windows systems 都有提供 ## Client-Server Communication - Sockets - Remote Procedure Calls - Remote Method Invocation(同 invoke 調用) (Java) ## Sockets - A socket is defined as an **endpoint for communication** - Concatenation(並聯) of **IP Address** and **port** - The socket 161.25.19.8 refer to port 1625 on host 161.25.19.8 - Communication 關係到一對的 sockets ![](https://i.imgur.com/nGuhG7O.png) ## **Remote Procedure Calls (RPC** - RPC 摘取 procedure calls 在 **process on networked systems** 之間 - Allow a client to **invoke a procedure** 在**鄰近的 host** as it would invoke a procedure **locally** - **Based on message passing** - each message is sent to the rpc daemon listening to a port - Result are sent **back** as separated massages - **Stubs** - transforms procedure calls to msgs, and **vice versa( 反之亦然** - The **client-side stub** locates the server and **marshals (元帥** the parameters - The **server-side stub** receives this message, **unpacks** the marshaales parameters, and performs the procedure on the server - 回傳值被執行在 similar manner( 類似的方式 - Issue about RPC - data representation - XDR - Error handling - RPC can fail, or be executed more than once due to **network error** - How does the client knows the port number of the server - Fixed port number - inflexible >> server can't change its port easily - Execution of RPC ![](https://i.imgur.com/M407nqD.png) ## Remote Method Invocation (RMI - RMI is javaa mechanism similar RPC - RMI allow a Java program **on one mechine** invoke a method on a remote object - Differ from RPC, RMI is - object-based - **Possible to pas objects as parameter** ![](https://i.imgur.com/yFKHhT7.png) ## Marshallin Parameter ![](https://i.imgur.com/2YBoyEh.png) > parcel (包裹 # 第四章 ## Single and Multithreaded Process ![](https://i.imgur.com/LSvRmwz.png) ## Motivation - if a web server is **single-threaded** - serve only one client at a time - If it processes the requests one-by-one -> a long waiting time - **Multi-process solution** - Common before threasd become popular - process creation is **time consuming** and **resource intensive (密集的資料** - **Multi-threaded solution** - More efficient - thread more lightweight than process - A multi-threaded web server may - Use a **thread for listening for client request** - Create a thread 來處理各個要求 - Multithreaded Server Architecture ![](https://i.imgur.com/h2c2mWw.png) - Threads 對 RPC systems 也很重要 - RPC servers are multi-threaded - Java's RMI is also multi-threaded - each thread performs a specific task - 好處 - Responsiveness - 允許程式持續型 when block - Usilization of MP Architectures - Threads can run in parallel on diff processorss - allows a multithread app to tun on the top of multiple processors - Resource Sharing - Threads share mem & resources - Lightweight communication through mem sharing - Economy - more economic to create/switch threads - The former two also **apply to multi-process architectures**, while the latter two are more **specific** to multi-threading ## Multiprocessor Programming - multiprocessor 和 multicore 對程式員來說是挑戰 - Dividing activities - Balance - Data splitting - Data dependency - Testing and debugging - **Paarallelism** impies a system can perform more than on task simultaneously (同時 - 執行不只一個 task - **Concurrency (並發** supports more than one task making progress - single processor/core, scheduler providing concurrency - 一個 task 分配成不同的小任務 - Concurrency vs Parallism ![](https://i.imgur.com/DlYVJYK.png) - Type of parrallelism - Data parallelism - 將不同 data 交給不同 cores 執行**相同的運算** - Task parallelism - 所有 cores 會處理各個檔案的不同片段 - ![](https://i.imgur.com/34Mx8cZ.png) ## Amdahl's Law - indentifies performance gains from adding additonal cores to an app that has both **Serial** and **Parallel** ![](https://i.imgur.com/8ZaHkjV.png) > e.g. if app is 75% Parallel and 25% Serial, moving from 1 to 2 cores results in speedup of 1.6 times > if N approach infinity, speedup approaches 1/S ![](https://i.imgur.com/JgMnnDS.png) ## User Threads - **user threads** - User-level threads - Thread mangement done by **thread libary** - Common thread libaries (並非都是由 user-level 實現) - POSIX Pthreads (Portable Operating System Interface - Windows threads - Java threads - **kernel threads** - support by the Kernel - e.g. - Windows - SAolarids - Linux - Tru64 UNIX (Formerly digital UNIX) - Mac OS X - **User and Kernel Threads** ![](https://i.imgur.com/uDPw3R2.png) ## Multithreading Models 1. **Many-to-One** - 許多 user-level threads mapped 到單一個 kernel thread - Thread managemant is done by the **thread Lib** in the **user space** - 在 thread 管理上更有效 - 但如果一個 thread 卡住 system call, 在同個 kernel thread的其他 thread 會被卡住 - Multiple thread 不能跑 parallel 在 multiple processors - 對些 thread 來說, 只有一個 kernel thread - ![](https://i.imgur.com/VadFHDk.png) 2. **One-to-One** - 每個 user thread 對應 map 到一個 kernel thread - 其他 thread 能在 system call block 時 run - 過多的 kernel threads 會阻礙 app 的效能 - ![](https://i.imgur.com/TdtURmi.png) 3. **Many-to-Many** - 讓 M 個 user thread map 到 N 個 kernel threads, N <= M - 允許 OP 創造足夠的 kernal threads - 讓 app 可以創造需要的 user threads 的數量 - kernel thread 可以執行 parallel on multiprocessors - blocking system call 不會擋住所有的 process - ![](https://i.imgur.com/NT3JBug.png) 4. **Two-level Model** - similar to m:m 但允許 user thread 可以 **bound** to kernel thread - ![](https://i.imgur.com/X56xUl1.png) ## Thread Libraries - Provide an API for creating and managing threads - 2種主要方法去實現 - **User-level lib** - code and data structure reside entirely in the user space - **a lib call does not result in a system call** - **Kernel-level lib** - Code and data structure reside in the kernel mode - A lib call typically results in a system call - Common Thread Lib - Pthreads - 提供 user 或 kernel-level lib - Windows thread - Kernel-level lib - Java threads - create and manage java threads - 使用本地的 threading support - use windows thread in Windows - use Pthreads in UNIX or Linux ## Pthread - POSIX threads - A specification for thread behavior - not an implementation - OS 設計者可以實作 specification - Numbers of implementations in - Solaris, Linux, Mac OS X, Tru64 UNIX - **Shareware implementatins in windows** - A Pthreads example - a thread begins with main() - main() creates a second thread by **pthread_creare()** - Both threads share the gloval variavle **sum** - Wait for a thread to terminate by **pthread_join** - ![](https://i.imgur.com/mgMR71M.png) - ![](https://i.imgur.com/lILmj7M.png) ## Windows Threads - Threads are created by **CreateThread()** - **WaitforSingleObject()** - 等待指定的 thread 中止 ## Java Threads - 2 techniques to create Java threads - create a class deriving from the **Thread** calss and override its **run()** method - Define a class that implements the **Runnable** interface > Public interface Runnable >{ >Public abstract void run(); >} - Threads are actually created when the **start()** method of the thread object is invoked - The start() method - Allocates mem and init a new thread in the JVM - Invoke **join()** method of the thread object to **wait for the thread to exit** ## Implicit Threading - 變得相當受歡迎 - 只要 thread 增加, 程式的正確率耿南被 difficult wth 明確的 threads - Creation and Management of **threads done by compilers** and **run-time lib** 比起 programmers. ## Threading Issues - Semantics(語意 of **fork()** 和 **exec()** system calls - Thread ancellation - Signal handling - thread pools - thread specific data ## Thread Cancelllation - termiating a thread **before** it has finished - e.g. - Searching a database - if one thread finds the result, cancel the downloading threads - Web browser - if user press the cancel button, cancel the downloading thread - Two genetal approaches for thread canllation - **Asynchronous cancellation** terget thread **immediately** - 可能的問題 - Resources have been allocated (已分配 to the target thread - The target thread is updating a shared data - **deferred (延遲 cancellation** allows the target thread to **check later** if it should be cancelled - Allow the target a thread to be cnacelled in a **safe point(cancellation point)** - Cancellation mode depend on **state** and **type** - ![](https://i.imgur.com/8ockf43.png) - **Cancellation points** for deferred cancellation - certain functions must, and certain other functions may, be cancellation points - Insert a cancellation point: **pthread_testcancel()** ## Signal Handling - signals are used in UNIX system to **notify a process** that a particular **event** has occurred - A signal may be received either synchronously or asynchronously - **Synchronous** signals - E.g., illegal memory access, divided by 0 - process 自己產生 - **Asynchronus** signals - E.g., terminate a process with Ctrl-C, timer expire - 由外部事件生成 - A **signal handler** is used to process signals - Every signal has a **default handler** that kernel runs when handling signal - User-defined signal handler can override default - Signal generation & handlin 1. Signal is generated by particular event 2. Signal is delivered to a process (signal destination) 3. Signal is handled - Signal destination for a mult-threaded prosses - Deliver the signal to the thread to which the signal applies - Deliver the signal to every thread in the process - Deliver the signal to certain threads in the process - Assign a specific thread to receive all signals for the process - **A signal can be handled only once!** - For **syn signals**, it should be **delivered to the thread** that **causes the signal** - For asyn signals, it depends... - E.g., Ctrl-Cthe signal should be sent to all the threads - Many OSes allow a **thread** to **specify** which signals to accept and which signalls to block - A signal can be delivered to the **first found** thread that accept it - A signal can be handled only once! - Sending a signal to a process in UNIX - kill (pid, sighal) - Sending a signal to a thread in Pthreads - pthread_kill (tid, signal) - **Asynchronous Procedure Call (APC)** - in Windows systems - Similar to asyn signals - Allow a thread to specify a function to be called - delivered to a **thread**, not a porcess ## Thread Pools - consider a web server that **serves each request with seperated thread** - Thread creation and termination are not free - **No upper limit** on the number of threads - Solution: **Thread Pool** - 創造一池的 thread 等待 work - 優點 - usually slightly(稍微 faster to serve a request with an existing thread **than create a new thread** - Allows the number of threads in the app to be bound to the size of the pool - The **number of threads in the pool** can be based on - Number of cpu - Amount of physical mem - Expected number of requests - Some thread pool architecture can adjust the number of threads in the pool according to usage patterns or ## Thread-Specific Data (TSD - allows each thread to have it's own copy of data - pthread_setspecific() - pthread_getspecific() ## Windows Threads - implements the one-to-one mapping - also support a **fiber lib** that provides the M:M model - Each thread contains - A thread id - Register set - Separate user and kernel stacks - Private data storage area (i.e. thread specific data) - The register set, stacks, and private storage area are known as the **context** of the threads - The primary data structures of a thread include: - ETHREAD (executive thread block) - KTHREAD (kernel thread block) - TEB (thread environment block) - data structures of a windows thread ![](https://i.imgur.com/N951yrU.png) ## Linux Threads - Linux refer to them as **tasks** rather than threads or processes - Thread creation is done through **clone()** **system call** - **clone()** allows a child task to share the address space of the parent task(process)