Ch8:Memory Management

--- title: Ch8:Memory Management --- # 作業系統Ch8: Memory Management ###### tags: `Operating System` `Review` `Ch8` ## Background - CPU 可以直接存取的只有 Main memory and registers。 - 在 disk 上等待被帶入 memory 和被執行的 processes 集合。 - 多個 programs 帶入到 memory 以改善資源使用率和 response time to users。 - A process 可能在執行過程中的 disk 和 memory 之間移動。 ## Multistep Processing of a User Program 在不同時間，可能會進行不同 address binding。 ![](https://i.imgur.com/q8lJufh.png) ## Address Binding ### Address Binding - Compile time - Compiler translates symbolic code into **absolute code** > absolute code 使用絕對位址 - If starting location changes $\rightarrow$ recompile - e.g., MS-DOS .COM format binary 記憶體位址從 Compile 完，就會全部都知道，然後再 load 到 memory。 ![](https://i.imgur.com/cXk4Ogh.png) ### Address Binding - Load time - Compiler translates symbolic code into **relocatable code** > relocatable code: 機器語言可以從任何記憶體位址運行。 - If starting location changes $\rightarrow$ reload 如果不是在 compile time 就知道 process 的 data 或者 instruction 位於記憶體的位址，那麼會產生 relocatable code ，且 binding 會到 load time 時才做。 ![](https://i.imgur.com/qjXqoRd.png) ### Address Binding - Execution time - Compiler translates symbolic code into logical-address (i.e. virtual-address) code - 這方法需要硬體 MMU。 - 通用目的的 OS 使用這方法。 Program 在 compile 完後會產生 logical address，而 CPU 讀取 memory content 將 logical address load 到 Memory 前必須要透過 MMU 才能轉換成 Physical address ![](https://i.imgur.com/sMlFM9l.png) ### Memory-Management Unit (MMU) - 硬體裝置用來 virtual address 映射到 physical address. - physical address 是由 user process 產生的每一個 address 被送入到記憶體時加上在 relocation register 中的值。 ![](https://i.imgur.com/kWZpt3I.png) ### Logical vs. Physical Address - Logical address – generated by CPU - a.k.a. virtual address - Physical address - seen by the memory module - compile-time & load-time address binding - logical addr $=$ physcial addr - Execution-time address binding - logical addr $\neq$ physical addr - user program 處理 logical addr，它永遠不會看見 the real physcial addr。 ## static/dynamic loading and linking ### Dynamic loading **Static loading**: 將所有 data 和 function 全部放入到 memory ，為了去執行。 **Dynamic loading**: A routine is loaded into memory when it is called。 - 更好記憶體空間使用率 - unused routine is never loaded - Particularly useful when large amounts of code are infrequently used (e.g., error handling code) - No special support from OS is required implemented through program (library, API calls) Dynamic Loading Example in C > dlopen: 開啟 library 且為使用準備它。 > dlsym: 在開啟的 library 中查找 the value of a symbol。 > dlclose: 關閉 DL library。 ```c= #include<dlfcn.h> int main() { double (*cosine)(double); void *handle = dlopen("/lib/libm.so.6", RTLD_LAZY); cosine = dlsym(handle, "cos"); printf("%f\n", (*cosine)(2.0)); dlclose(handle); } ``` 概念示意圖如下，如果 function B 和 C 是 Dynamic loading，則一開始只有 function A 會載入到記憶體。 ![](https://i.imgur.com/WxYy7SA.png) ### Static linking **Static linking**: ~~loader~~(此處恐龍書有 typing mistake，應為linker) 會將 libraries 混入 program 在 memory image。 > A memory image is simply a copy of the process's virtual memory, saved in a file. - Waste memory: duplicated code - Faster during execution time > 因為不用去尋找。 Static linking + Dynamic loading: 仍然會有 duplicated code。 > Dynamic loading 是針對一 process 在呼叫時才載入，可是如果今天有多個 process 需要進行 printf() 的 function call 時，這每個 process 同樣都會進行Dynamic loading，因此記憶體還是會產生重複 code。 ![](https://i.imgur.com/5YzzBhk.png) ![](https://i.imgur.com/nem9PLG.png) ### Dynamic Linking **Dynamic Linking**: 等到執行時再進行 linking (也就是延遲 linking)。 - Only one code copy in memory and shared by everyone - A stub is included in the program in memory image for each lib reference - 當呼叫函數時，會先跳入 stub 中，而 stub call 會檢查被參考 lib 是否在記憶體中，如果不在的話，則載入 lib，接著執行lib 中的 function。 - DLL (Dynamic link library) on Windows；在 UNIX/Linux 中的動態連結函式庫則被稱為 Share Objects，其附檔名通常是 .so。 ![](https://i.imgur.com/BOVXU1R.png) ## Swapping ### Swapping - a process 可以被 swapped out of memory to a **backing store**，之後 swapped into memory for continuous execution。 - Also used by midterm scheduling. - Backing store - 一塊 disk，跟檔案系統是分隔開來的，提供直接 access to these images。 - Why Swap a process: - Free up memory - Roll out, roll in: swap lower-priority process with a higher one - Swap back memory location - if binding is done at compile/load time - swap back memory address must be the same > 這樣就會產生麻煩，因為有可能 memory address 被其他的占用。 - If binding is done at execution time - swap back memory address can be different A process to be swapped 一定要是 idle，也不能進行 I/O > 如果正在等待 I/O時，此時卻 swapped，會產生一些問題。 > 解決方式就是: > (1) 不要 swap 正在 pending I/O 的 process > (2) 透過 OS buffers 來完成 I/O operations >> OS 為 process create 一個 OS buffer(為 I/O 寫入的資料) ，位在 OS 的記憶體空間，接著將 I/O 資料搬到 OS memory，在搬的時候，process 沒有 touch 到它本身的 memory，所以 process 可以被 swap 掉，等到 I/O 做完了，再把 process swapped into memory，接著 OS buffer 的內容 copy back 那個 process 的記憶體空間。 - 大部分 swap time 的時間都是進行 transfer time，總 transfer time 跟 memory swapped 的量成正比。 ![](https://i.imgur.com/dCxI0Jj.png) ## Contiguous Memory Allocation ### Memory Allocation - Fixed-partition allocation: - Each process loads into one partition of fixed-size。 - multiprogramming 的程度受限於 partitions 的數量。 - Variable-size allocation: - Hole: 連續 free memory 的區塊 - 多個不同大小的 holes 會散落在記憶體中。 - OS 會維護資訊 - 正在使用的和有空閒的 hole - A free hole 可以和另外一個 hole 合併形成較大的 hole。 ![](https://i.imgur.com/AjuLKyC.png) ### Dynamic Storage Allocation Problem 如何從一串 free holes 來滿足有一個大小為 n 的分配請求? - First-fit: 看到第一個滿足它的就分配給它。 - Best-fit: 分配最小的 hole，但是可以滿足它。 - 需要搜尋一遍 - Worst-fit: 分配最大的 hole。 - 需要搜尋一遍前兩者比後者在速度和空間使用率上更好。 ### Fragmentation - External fragmentation - 總 free mem space 足夠大以滿足 a request，但是不連續。 - 發生在 variable-size allocation ![](https://i.imgur.com/3KOdOIf.png) - Internal fragmentation - partition 的內部有記憶體未被使用 - 發生在 Fixed-partition allocation ![](https://i.imgur.com/3IhQNUU.png) Solution: - Compaction - 在 run time，重組記憶體內容來將所有 free memory 放置到一塊大的 block。 - Only if binding is done at execution time ![](https://i.imgur.com/0kDHGXA.png) ## Non-Contiguous Memory Allocation - Page ### Paging Concept Method: - Physical memory 分割成固定大小的 blocks 稱作 frames。 - logical address space 切成同樣大小的 blocks 稱作 pages。 - 為了跑 n pages 的 program，就需要找到 n free frames 且載入 program - 追蹤 free frames - 設立 a page table 去轉換 logical to physical address > 每一個 process 會有自己的 page table，由 OS 維護。 Page table: - Each entry maps to the base address of a page in physical memory - A structure maintained by OS **for each process** - Page table 只包含由 process 擁有的 pages - A process 不能存取在它之外的空間得 memory ![](https://i.imgur.com/MKZhT31.png) Benefit: - 允許 process 的 physical address space 是 **noncontinuous** - 避免 external fragmentation - 有限 internal fragmentation - 提供共享 memory / pages ### Address Translation Scheme - Logical address 分成兩個部分 - Page number$p$: - 當作 page table 的 index ，包含在實體記憶體中每一 page 的 base address。 - N bits 代表 a process 可以分配到最多 $2^{N}$ pages > $2^{N}\times\ page\ size\ memory\ size$ 代表一個 process 能夠擁有的記憶體空間。 - Page offset$d$: - 結合 base address 去定義送到 memory unit 的實體記憶體。 > 在實體記憶體的 page 大小 - N bits 代表 the page size 是 $2^{N}$ $Physical\ addr\ =\ page\ base\ addr\ +\ page\ offset$ Address Translation Architecture ![](https://i.imgur.com/fQZf1n8.png) ### Free Frames free-frame list 是 linked list，每次有新的 process 分成 pages，去逐一對應從 linked list 頭開始。 ![](https://i.imgur.com/iYwommr.png) ### Page / Frame Size - page (frame) size 由硬體定義: - 通常是 a power of 2 - 4KB/8 KB 常用 - page size 太大會有 internal fragmentation，page table 變小，更多浪費空間。 - page size 太小會造成 page 數量變多，page table 變大，造成搜尋時間上的增加。 ### Paging Summary - Paging 分隔使用者的視角和真實 physical memory。 - OS maintains a copy of the page table for each process. - OS maintains a frame table for managing physical memory. ### Implementation of Page Table - Page table 保留在 memory 中。 - Page-table base register (PTBR) - The physical memory address of the page table - PTBR 值儲存在 PCB - 在 Context-switch 期間，改變 PTBR 的值 - 每次記憶體參考會導致 2 memory reads。 - The 2-access problem can be solved by - Translation Look-aside Buffers (TLB) which is implemented by Associative memory (HW) ### Associative Memory - 所有 memory entries 可以同時存取。$O(1)$ - entries 的數量有限。($64\ \sim\ 1024$) ![](https://i.imgur.com/gr8OfPR.png) ### Translation Look-aside Buffer - A cache for page table shared by all processes > 如果 cache hit，可以少一次 access；如果 cache miss，則依然需要 2 access ，然後再 cache 住。 - TLB must be flushed after a context switch - Otherwise, TLB entry must has a PID field (address-space identifiers (ASIDs)) ![](https://i.imgur.com/xVAe3AZ.png) ### Effective Memory-Access Time 有 TLB 的話可以有效降低 EMAT。 - 20 ns for TLB search - 100 ns for memory access - 70% TLB hit-ratio: - EMAT = 0.70 x (20 + 100) + (1-0.70) * (20+100+100) = 150 ns - 98% TLB hit-ratio: - EMAT = 0.98 x 120 + 0.02 x 220 = 122 ns > 如果兩次 memory access 是 200 ns ### Memory Protection - 每 page 與在 page table 中的一組 protection bit 有關。 - Common use: valid-invalid bit - Valid: the page/frame 是在 process 中的 logical address space，因此是合法的 page - Invalid: the page/frame **不**在 process 中的 logical address space - Potential issues: - 未使用的 page entry 造成記憶體浪費 $\rightarrow$ 使用 page table length register(PTLR) > 指出 the size of the page table (以 byte 為單位)。它的值會針對每個 logical address 檢查去驗證是否 address 在對於 process 而言有效的 range。測試失敗的話會對於OS 產生 error trap。 - Process memory may NOT be on the boundary of a page $\rightarrow$ memory limit register is still needed ![](https://i.imgur.com/9uoL1Oi.png) ### Shared Pages - Page 允許 processes 共享 common code 且一定要是 reentrant。 - Reentrant code (pure code) - 它在執行期間不會改變 - e.g., text editors, compilers, web servers, etc - 只會在 physical memory 保持一份 shared code。 - 兩個以上 virtual addr 會映射到同一個 physical address。 - Process 會保有它的私有 data and code 一份。 - Shared code 會出現在所有 processes 的 logical addr space 相同位置。 ![](https://i.imgur.com/uaZLQm4.png) ### Page Table Memory Structure - Page table 可能太大且難以載入 > 假設有 4GB ($2^{32}$) logical address 且有 4KB ($2^{12}$) 大小的 page，將會有 1 million ($2^{20}$) page table entries。 > 又假設每一 entry 需要有 4 bytes (32 bits) 去做紀錄，則需要 Total size = 4MB。在實體記憶體中不一定能夠找到連續的 4MB 大小空間。 > 因此需要拆成很多較小的 page table，在單一 page size 更好的範圍內。(i.e. 4KB) 或者減少 total size of page table (很多分配而沒用到的) - Solutions: - Hierarchical Paging - Hash Page Tables - Inverted Page Table ### Hierarchical Paging - Break up the logical address space into multiple page tables. - Two-level paging (32-bit address with 4KB ($2^{12}$) page size) - 12-bit offset (d) $\rightarrow$ 4KB ($2^{12}$) page size - 10-bit outer page number $\rightarrow$ 1KB ($2^{10}$) page table entries - 10-bit inner page number $\rightarrow$ 1KB ($2^{10}$) page table entries - 3 memory accesses ![](https://i.imgur.com/wojukG8.png) 第二層的 p2 的值對應到 physical address，然後透過 page offset 得到在 physical address 的佔用空間。 ![](https://i.imgur.com/UmbYVJB.png) 假如一個 entry 需要佔 4 B，那麼原先我們需要找到 $2^{22}$ B，也就是 4MB 才能儲存 page table，運用兩層的 page table，我們只需要找 4KB就可以儲存一個小的 page table，我們實際上是把 page table 的大小變小，但是原先 page table 數量只有一個，現在是變成有 $2^{10} + 1$ page tables，而且如果去看 total entries 數量，也是比原先大的，因為我們需要用額外 entries (實則level 1 的 entries) 去追蹤原先每 $2^{10}$ 劃分成小 page table。總而言之，如果單看總記憶體使用量和 entries 數，完全是比原先還要多，但是使用兩層的好處就是它更容易儲存於記憶體中。概念示意圖如下: ![](https://i.imgur.com/EnH8Gg9.png) 如果考慮到 64-bit Address 分兩層 page tables 如下: - 42 (p1) + 10 (p2) + 12 (offset) > outer table requires $2^{42}$ x 4B = 16TB contiguous memory!!! 分五層 page tables 如下: - 12 (p1)+10 (p2)+10 (p3)+10 (p4)+10 (p5)+12 (offset) > outer table requires $2^{12}$ x 4B = 16KB contiguous memory > 6 memory accesses > 實務上，分四層已經是很多了，不過因為可能會有頭尾都有用到，但是中間卻是 free 的狀況，但是還是要分配成 page，所以很浪費記憶體空間。 ### Hashed Page Table (> 32 bits) - Virtual page number is hashed into a hash table - The size of the hash table varies - Larger hash table $\rightarrow$ smaller chains in each entry - Each entry in the hashed table contains - $(Virtual\ Page\ Number,\ Frame\ Number,\ Next\ Pointer)$ - Pointers waste memory - Traverse linked list waste time & cause additional memory references ![](https://i.imgur.com/nTw4wGC.png) ![](https://i.imgur.com/Bkhbkmh.png) Hashed Page Table Address Translation 概念示意圖如下: ![](https://i.imgur.com/dvLt5aU.png) --- 考慮到當 page 很多時，可能會產生在某一個 hash function 後的值會是一長串，因此需要花費很多時間去 traverse，因此我們需要改變成 hash array 的方式，將它一次scan 多個並 load 到 MMU 去做處理(comparison)。 Improved Hashed Page Table Implementation 概念示意圖如下: ![](https://i.imgur.com/UZFw250.png) ### Inverted Page Table - 對於每個 process 不維護 page table - 對於整個實體記憶體維護一個 frame table - Each entry in the frame table has - $(PID,\ Page\ Number)$ - Eliminate the memory needed for page tables but increase memory access time - 每次存取都需要搜尋整個 frame table - Solution: use **hashing** for the frame table - Hard to support shared page/memory > 要做到一個 frame 對到多個 pages 很難 Inverted Page Table Addr Translation 概念示意圖如下: ![](https://i.imgur.com/oG6blBj.png) ## Non-Contiguous Memory Allocation — Segmentation ### Segmentation - 使用者觀點的 memory > 也就是隨需要調整大小的 memory。 - A program 是 a collection of segments. A segment is a logical unit such as: - main program - function, object - local/global variables - stack, symbol table - arrays, etc... ![](https://i.imgur.com/epEDL19.png) ### Segmentation Table - Logical address: $(seg\#,\ offset)$ - offset has the same length as physical addr - Segmentation table - maps two-dimensional physical addr; each table entry has: - Base (4 bytes): physical addr 起始位址 - Limit (4 bytes): segment 的長度 - Segment-table base register (STBR): - the physcial addr of the segmentation table - Segment-table length register (STLR): - the $\#$ of segments ### Segmentation Hardware - limit register: 檢查 offset length - MMU 分配 memory 透過指定一個適當的 base addr 對於每一個 segment - Physical addr 不能 overlap between segments ![](https://i.imgur.com/B2vjxBB.png) ### Address Translation Comparison ![](https://i.imgur.com/Mss53ty.png) ### Example of Segmentation ![](https://i.imgur.com/3csasdv.png) ### Sharing of Segments ![](https://i.imgur.com/KqoYHRL.png) ### Protection & Sharing - Protection bits 與 segments 相關 - Read-only segment (code) - Read-write segments (data, heap, stack) - Code 共享只發生在 segment level - shared memory communication - shared library - 在兩個 segment tables 中，共享 segment 擁有 same base ## Non-Contiguous Memory Allocation - Segmentation with Paging ### Basic Concept - 運用 segmentation 在 logical addr space - 運用 paging 在 physical addr space ![](https://i.imgur.com/Ji0mYsZ.png) ### Address Translation Segmentation 和 paging unit 都在 MMU 裡面 ![](https://i.imgur.com/0sc779Q.png) ### Example: The Intel Pentium - Logical address space 分成兩 partitions: - 1st: 8K($2^{13}$) segments (private), local descriptor table (LDT) - 2nd: 8K($2^{13}$) segments (shared), global descriptor table (GDT) > 可透過 OS 跟其他 segment 共享 - logical address: - max $\#$ of segments per process = $2^{14}$ = 16K - size of a segment $\leq$ $\ 2^{32}$ = 4GB ![](https://i.imgur.com/2MWJaP7.png) > protection info 代表 Read-only, Read-write, execute ### Intel Pentium Segmentation - Segment descriptor - Segment base address and length - Access right and privileged level ![](https://i.imgur.com/9M1gzMP.png) > 實際上 selector 找到在 table 上面的 index，會有 base and length (limit)，然後 offset 如果沒有超過的話，就跟 base 相加(兩者同樣都是 32 bits)，得到 32-bit linear address ### Intel Pentium Paging (Two-Level) - Page size can be either 4KB or 4MB - Each page directory entry has a flag for indication > 設立一個 bit 去決定要 4KB or 4MB ![](https://i.imgur.com/yGMHbuu.png) > 透過兩層 page， outer and inner 大小都是 10 bits，offset 則是 12 bits。將 linear address 分成三段，如果今天 flag 設為 0，則採用 4KB 方案，跟兩層 page 作法一樣；flag 設為 1 時，則將 inner 和 offset 的部份合併在一起 (22 bits) 當作 offset，對應到 4MB page。 ### Example Question > Q: Let the physical mem size is 512B, the page size is 32B and the logical address of a program can have 8 segments. Given a 12 bits hexadecimal logical address “448”, translate the addr. With blow page and segment tables. > A: $(448)_{16} = (01001001000)_{2}$ > 根據題意，最高位 bit 頭三個是 seg $\#$，其餘都是 seg offset > 也就是 $(010)_{2}$ 代表第二個 segment，去找到第二個 segment 對應到> 的 base address，再加上 offset 即可獲得 linear address = $(010111110)_{2}$。 > 接著根據題意，最高位 bit 頭四個為 page $\#$，其餘都是 page offset， > 也就是 $(0101)_{2}$ 代表第五個 page，對應到 frame $\#$，找到相對應的位置(即 frame number 乘上 page size)，再加上 offset $(11110)_{2}$，即是 physical address。 ![](https://i.imgur.com/Ee2CsbQ.png) ## Ref [1] [上課影片 link](https://www.youtube.com/playlist?list=PLS0SUwlYe8czigQPzgJTH2rJtwm0LXvDX) [2] [投影片 link](https://ocw.nthu.edu.tw/ocw/index.php?page=course_news_content&cid=141&id=999)