作業系統 CH3 Process

tags: `作業系統 Operating System`

Process Concept

Process 與 Program 最主要的差異，在於是否處於執行狀態

Program (Passive entity) 只是儲存於 disk 中等待被執行的程式碼
Process (Active entity) 則是位於記憶體中的正在被執行的 Program

一個 Process 由以下內容組成：

Code segment (text section) (將 program (已編譯為 instruction) 讀到記憶體中，等待被 CPU fetch 並執行 )
Data section –> 全域變數
Stack –> 局部變數以及函數 (方便使用後直接 pop 掉)
Heap –> 動態配置記憶體的變數或是類 (classes)
Current Activity (管理 Process 的 Meta data) –> Program counter, Register Contents
A set of associated resources –> 相關資源的 set，如 Open file handlers (檔案的開啟、或是跟 OS 要某些硬體的控制權)

以下為 Process 執行時記憶體的狀態

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Threads

Threads 又稱為 Lightweight process，是使用 CPU 的最小單位，同 Process 的 Threads 有共享的記憶體空間。在 Parent Process 創造 Threads 時就會 allocate ，因此省去在空間管理及處理上的執行動作。

同 Process 中，Threads 共用以下內容
- code section
- data section
- OS resources (e.g. open files and signals)
但以下部份各自獨立
- stack
- register set –>
  可能 instruction 的執行位置不同，甚至執行不同的 function call
- thread ID
- program counter

Process State

Process 在其生命週期中，共有 5 個狀態

New：當 process 被創建時，將 code 讀到記憶體中，並將前述記憶體狀態初始化
Ready： Process 競爭的資源是 CPU 的核心，管理的方式為佇列 (Queue)，在等待被執行的階段，會被放在佇列當中，此狀態稱為 Ready
Running：被 CPU 排程選到，執行 instructions。OS 為了確保主控權，一段時間就會將 CPU switch 給 scheduler 而==打斷(interrupt)==程序
Waiting：有些指令不會直接使用 CPU (如 I/O)，因為不需要 CPU ，又需等待某些指令完成，所以進入 Waiting 狀態，待完成後重新放回 Ready
Terminated：Process 完成執行，釋放資源給 OS。

一個 Processor 一次只執行一個 Process ; 但同時可能有多個 Process 處於 Ready 或 Waiting 的狀態。

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Process Control Block

User 創建 Process 後，OS 會自動建立 PCB 來做管理。我們說將 Process 放到佇列當中，指得是將 Process 的 pointer 放到佇列中，而非記憶體空間，使用的資料結構為 Linked list。PCB 包含以下內容：

Process state
Program counter
CPU registers

此外還有：

CPU scheduling information
memory management information (base & limit register) –>
只有 process 在執行時才會把這兩個值從 memory load 進 CPU 的 register。
I/O status/information
accounting information

因為這些資訊只有 OS 需要知道，所以會放在 kernel space 的記憶體空間

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Context Switch

CPU 一次只能執行一個 Process，要切換給另一個 Process 時，必須將舊 Process 的資訊 (e.g. PCB) 儲存起來，並載入新 Process 的資訊，這個動作稱為「Context Switch」。
CPU 完整切換的流程如下:

執行 P0
P0 被打斷讓出 CPU ，此時處於 idle 狀態
進行 Context Switch
- 將 P0 的狀態存於 PCB
- 並讀取 P1 的 PCB。
- 完成 Context Switch
執行 P1

Context switch 的過程中， CPU 沒有執行任何 instruction，因此 Context Switch 其實是 overhead，浪費 CPU 的資源。但為了 CPU sharing 及 time sharing，Context switch 無法避免地經常發生，所以我們只能盡可能縮短時間：

memory speed
number of registers (越少則存取的次數降低，但現在的系統中為了更快速的計算，register 數量是變多的)
合併 load & save PCB 的 instruction，讓 CPU 一次做完
硬體支援 –> multiple sets of registers，一次記住多個 Process 的狀態，可以快速的在 registers 間進行切換，無須透過 memory access
Image Not Showing Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →

Process Scheduling

Multiprogramming：多工處理， CPU 可以執行多個 Process，以最大化利用 CPU
Time sharing：分時系統，為一種資源的共享方式，系統可以頻繁地進行 Context Switch，讓多個 User 可以同時使用 Program

由於 Process 共享 CPU，使用的先後順序就是由 scheduler 來決定。

Process Scheduling Queue

Job queue (New State)：系統中所有的 Process
Ready queue (Ready State)：在主記憶體中正在及等待執行的 Process
Device queue (Wait State)：等待 I/O 處理的 Process

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Process Scheduling Diagram

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Schedulers

Scheduling 分為 CPU 及 Job Scheduling

CPU Scheduling : 因為 time sharing 的原因，發生頻率非常高，又稱為 Short-term scheduler。CPU Scheduler 會選擇該被執行的 Process 並 Load 到 CPU 當中，從 Ready State 轉換成 Run State
Job Scheduling : 可能幾秒或幾分鐘才需要 scheduling，又稱為 Long-term scheduler，通常是使用者人為觸發，新的程式被執行，產生程序時才需要被排程，從 New State 轉換成 Ready State
Medium-term : 跟虛擬記憶體結合產生的一種 scheduler，因為 memory 空間是有限的，有時會將 disk 的空間拿來使用，當 memory 不夠時，就會將部份 Process 從 memory 踢回 disk，這個動作即由 Medium-term scheduler 來處理，由 Ready state 轉換為 Wait state

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Long-Term Scheduler

控制在記憶體中 Process 的數量 (degree of multi-programming)
當 degree 太低時， CPU 有很多的時間在 idle
當 degree 太高時，會發生 Thrashing，有太多的 Process 在爭搶有限的 memory，導致 Process 一直在 memory 跟 disk 之間 swap，不斷的做 I/O
因我們希望 CPU 跟 I/O 執行的時間差不多，負載盡可能平衡而最大化系統效能，所以 Process 要能夠在 CPU-Bound & I/O-Bound 良好的混合
在 UNIX/NT 系統中，並沒有 Long-Term Scheduler
- 直接將 Process 放到 short-term scheduler 中 (現今電腦的記憶體空間通常是足夠的)
- Multipropramming degree 受硬體限制 (e.g. # of terminals) 或著由使用者自己調整

Short-Term Scheduler

執行頻率很高，大約 100ms 執行一次
由演算法來縮短每個 Process 的等待時間
因為效率要高，演算法也不能太過複雜，避免 CPU overhead 過高
- overhead = (time of pick) / (time of execution + time of pick)

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Medium-Term Schduler

因現代 memory 空間的增長以及虛擬記憶體的觀念引入，過往由 Long-Term Schduler 處理的動作多改由 Medium-Term Schduler 執行

swap out : 將 Process 由 memory 搬到 disk 中
swap in : 將 Process 由 disk 搬到 memory 中
propose : 改善 Process mix / 減少記憶體中的 Process 數量，降低 degree釋放記憶體空間
Image Not Showing Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →

Operations on Processes

回到 programmer 及 user 角度，我們到底如何跟 Process 溝通

Tree of Processes

無論什麼作業系統， Processes 一定可以畫成如下的樹狀結構
每一個 Process 由 unique processor identifier (pid) 識別
Image Not Showing Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →

Process Creation

資源之間的關聯性可能如下
- Process 的 Parent 及 Child 共享資源
- Child Process 共享部份 Parent 的資源
- Parent 及 Child 完全不分享資訊
兩種可能的執行方式
- Parent 及 Children 同時執行
- Parent 等到 Children 結束後才執行
兩種可能的記憶體空間
- Child 為 Parent 的複製 (執行碼、counter 等與 Parent 完全一樣)，透過 sharing variable 溝通 (address 不同)
- 將特定的 program 丟進 Child 的 code section，重設 counter，讓 Child 變成完全不同狀態的 Process，透過 message passing 溝通

UNIX/Linux Process Creation

fork system call (Create Process)
- 創建新的 (child) process
- 該 process 複製其 parent 的記憶體空間
- Child 與 Parent 在 fork 後同時執行 (execution concurrently)
- Child 在 fork 的回傳值為 0
- Parent 在 fork 的回傳值為 Child process 的 PID
execlp system call (Reset Process)
- 先創建 Process 及 OS 要記憶體空間，並註冊新的 PID
- 將新的 binary file 讀入記憶體，並將內容如變數 (heap、stack)、counter 等全部清空 (reset) ，等於讓新的 Child Process 執行不同的程式
wait system call
- 因 Child & Parent 在 Unix 中是 concurrent 執行，若要控制執行的順序，必須用 wait system call，強制 parent 直到其中一個 Child 的 process 完成後才執行
Memory space of fork()
- 舊的實作 Child 為 Parent 的完整複製
- 新的實作使用 copy-on-wirte 技術，在 Run time 過程中，儲存 Child 的與 Parent 的不同處 (A's Child 在執行過程中會慢慢增長)
- 若 call execlp，則會產生完全獨立的 Child Process (本例中的 C)

以下為 UNIX/Linux Process Creation 的範例程式碼

以下範例簡單示意 fork 的特性， Child Process 會保存 fork() 呼叫當下 Parent Process 的狀態。注意如果過程中呼叫 execlp ，則該分支會斷掉，樹不會繼續往下生長

Process Termination

當 Process 執行到最後一個指令或是呼叫 exit()，都會中止程序

該 Process 所有相關資源，如記憶體空間 (physical, virtual)、I/O buffer、open files 等都會清空並將資源還給 OS
Parent 可利用 PID(abort) 來中止特定 Child Process
- 當 Child 配置太多記憶體
- Child Process 執行的任務不再需要
Cascading termination：因為是樹狀結構，所以當 Parent 被中止，則 Child Process 也會被中止，這也是為什麼 ctrl+C 可以強制停止 Process，因為其他程序都是由 console 程序創建，更進階的是呼叫 kill，但需要管理者權限

Interprocess Communication

定義：在多個 Process (或 Threads) 間溝通的機制
目的：
- 資訊的共享
- 加快計算速度 (not always true)
- 便利性 (performs several tasks at one time)
- modularity (e.g. micro-kernel)

Communication

要做到 IPC 用 memory 分別有兩種方式

Shared memory：共用記憶體空間
- 需要小心處理 synchronization
- 利用 memory address 實作，好處就是快
- 透過 memory 的 address 存取資料
Message passing：透過 memory 的 copy，網路或跨電腦的程式通常都是以這種作法溝通
- 沒有 conflict 的問題，但若資料小，用這個方法的話可以省去 share memory 的 lock 之類的額外資源消耗，反而會略快（且不用在意 sync 的問題）
- 使用 send/recv message
- 以 system call 實作，通常較慢

Communication Methods

Sockets
- 透過 IP & Port 來 identify 使用者，port number 指得就是 Process
- 藉由未結構化的位元資料來溝通 (unstructed stream of bytes)
- 會有 client & server
- server 要先打開 client 才能連
- client 進行 connect 之後，就可以盡情 read / write
- 補充：
  server 在 accept 之後，會動態開 thread，如此才能一次處理多個 request
Remote Procedure Calls (RPC)
- 可以呼叫其他的 Process 甚至其他電腦的 Process
- 參數以及回傳值以 message 傳遞
- 與 socket 不同，傳輸的資料是有 data type 的
- Library 通常會有 stubs，在 client 端和 server 端會各有一隻小程式，（server 端的叫 skeleton）會負責 implement 細節
- 其實通常下面也可能用 socket，只是幫 programmer 包起來而已
- parameter mashaling
  把 params 包進 message 且嚴謹的管理。要把兩台電腦串在一起其實有許多細節要處理，RPC 就要負責幫使用者把這些繁瑣的事情解決掉
In distributed computing, a remote procedure call (RPC) is when a computer program causes a procedure (subroutine) to execute in a different address space (commonly on another computer on a shared network), which is coded as if it were a normal (local) procedure call, without the programmer explicitly coding the details for the remote interaction.

Shared Memory

Process 要能夠 …

創建 shared memory 的區塊
- 共享的記憶體區塊通常會在創建 shared memory 的 Process 的附近
- 透過這個 shared memory 溝通的 Processes，要告知 OS 說我要共用這塊記憶體位置 (attach)
資料格式為 bytes，由 process 自行定義， OS 不處理
Processes 不能同時對同一塊記憶體做讀寫的動作

Consumer & Producer Problem

兩個 Process 共用一塊 Buffer 記憶體
Buffer 是大小為 B 的 circular array
因為 Buffer 有空間限制，必須有以下資訊來控制
- next free: in
- first available: out
- empty: in = out (out 拿光)
- full: (in + 1) % B = out
- 這邊為了 sync，避免無法判斷 in = out 時是 empty 還是 full，必須使用 in+1 留一個空格。之後會提到以 locking 的方式使用所有的 Buffer 空間

Message-Passing System

讓 Process 之間可以溝通且同步
IPC 一般來說 OS 會提供兩種操作：Send/Receive 的方式達到同步化，並在背後做各種 memory copy 的處理
- Send(message)
- Receive(message)
Message system : Process 不使用 shared variable 達到溝通的目的
為了 Process 間的溝通
- 必須建立 communication link
- 藉由 Send/Receive 交換資訊
實作 communication link 的方式
- physical (e.g. shared memory, HW bus or network)
- logical (e.g. logical properties, programmer 在意的事情)
  - communication 是否有方向性
  - 角色是否相對
  - Blocking / non-blocking：傳輸行為執行完成才會回傳，或不管對方是否有收到立刻回傳
  - send by copy or by reference
  - 固定大小 / 可變大小的 message

Direct communication (如打電話)

Processes 必須要有對方的名稱
- Send(P, message) : 傳輸 message 給 Process P
- Receive(Q, message) : 從 Process Q 接收 message
communication link 的特性
- Links 通常為自動建立
- links & processes 為一對一的關係
- link 可以是單向，但通常是雙向
Direct communicatioｎ　最大的缺點是關係唯一對一，且連線的對象無法變更，除非重新建立，所以比較難以模組化 (modularity)

Indirect communication (如寄電子郵件)

Messsages 由 mailboxes(或稱為 port) 導向及接受
- 每一個 mailbox 有自己的 ID
- Processes 若共享 mailbox 則可以互相溝通
- Send(A, message)：沒有指定接收者，單純將 message 放到 mailbox A
- Receive(A, message)：沒有指定傳送者，單純將 message 由 mailbox A 接收
communication link 的特性
- Link 必須由 programmer 建立，透過 Process 間共享 mailbox 來溝通
- 因為沒有指定接收者/傳送者，所以 Link 與 Process 間為 Many-to-Many
- Link 可以為單向或雙向
- Mailbox 可以由 Process 或 OS 擁有，通常處理 message 的是另一隻程式
Mailbox Sharing
- 因為可能同時有很多人去讀取 message，但理論上一個 message 應該只能有一個人拿到，有以下幾種解決方法：
  - Link 只允許兩個 Process (一個 send 及一個 receive，等同 direct，最糟的解法)
  - 利用 lock 一次只允許一個 Process 拿取 message
  - 讓 mailbox 自行決定誰來接收。receiver 會先提出要求， mailbox 決定接收者後， sender 會被告知 receiver 是誰，再告知 receiver 來取得 message

Synchronization

Message passing 可以為 blocking (synchronous）或是 non-blocking (asynchronous)
- Blocking send: sender 被 blocked 直到 message 被 receiver 或 mailbox 接收才 return
- Nonblocking send: sender 送出 message 之後繼續執行
- Blocking receive: receiver 被 blocked 直到皆收到 message 才 return
- Nonblocking receive:若 sender 先 call，則 receiver 正常接收 message ; 若 sender 前 call，則 receiver 只會收到之前的值。所以 function call 通常還會 return 一個 token ，來確定是否真的收到 message
Buffer implementation : non-blocking 通常透過 buffer 實作，只要 buffer 空間沒有滿，就可以一直 send ; 對 receiver 來說就是一直接收，不管 buffer 有沒有東西
- Zero capacity ：通常是 blocking send/receive
- Bounded capacity ：若 buffer 滿了， sender 會被 blocked，或是 return 一個 error
- Unbounded capacity ：一直送直到系統滿了為止

作業系統 CH3 Process

tags: 作業系統 Operating System

Process Concept

Threads

Process State

Process Control Block

Context Switch

Process Scheduling

Process Scheduling Queue

Process Scheduling Diagram

Schedulers

Long-Term Scheduler

Short-Term Scheduler

Medium-Term Schduler

Operations on Processes

Tree of Processes

Process Creation

UNIX/Linux Process Creation

Process Termination

Interprocess Communication

Communication

Communication Methods

Shared Memory

Consumer & Producer Problem

Message-Passing System

Direct communication (如打電話)

Indirect communication (如寄電子郵件)

Synchronization

Refernece

tags: `作業系統 Operating System`