# Memory Management
## Memory Management
### Background

$\star$ Main memory and registers are the only storage CPU can access directly
$\star$ Collection of **processes** are waiting on disk to be **brought into memory and be executed**
$\star$ **Multiple programs are brought into memory** to improve resource utilization and response time to users
$\star$ A process may be **moved between disk and memory** during its execution
### Outline
$\star$ How to refer memory in a program $\rightarrow$ address binding (把program和memory對在一起)
$\star$ How to load a program into memory $\rightarrow$ static/dynamic loading and linking
$\star$ How to move a program between memory & disk? $\rightarrow$ swap
$\star$ How to allocate memory? $\rightarrow$ paging, segment
### Adress binding
#### In...
(1) In Compile time
(2) In Load time
(3) In Executation time (run time)
#### Comparison
(1) Logical address = virtual address - generated by **CPU**
(2) Physical address - send by the memory module
(3) Compile time and Load time address binding:
$\rightarrow$ Logical address = physical address
(4)Execution time address binding:
$\rightarrow$ Logical address $\neq$ physical address
(5) The user program deals with logical addresses, never sees the real physical addresses. **(全部丟給MMU就ok)**
### Static/Dynamic loading and linking
#### Dynamic Loading
$\star$ We can use dynamic-loading to avoid all the program be in the memory for it to execute.
$\star$ Better memory-space utilization:
Partircularly useful when large amounts of code are infrequently used like error handling code.
$\star$ Prevent the waste of memory.

Include dynamic function的header,先define cosine這個function,給他一個pointer。打開這些需要做dynamic loading的function compile組成的library。在runtime的時候把cosine和library中的cos binding。只有當call 到printf這行,才會把cosine這個dynamic function load到memory。
dlopen(): opens a library and prepares it for use.
desym(): looks up the value of a symbol in a given(opened) library.
dlclose(): closes a DL library.
#### Stack Linking

$\star$ libraries are combined by the loader into the program in memory image.
$\star$
#### Dynamic Linking

$\star$ Linking postponed until execution time.
$\star$ A stub is included in the program for each lib reference, it will check whether the library is in the memory, if not, load the library.
$\star$ DLL(Dynamic link library) on windows
因為在execution(run) time的時候用stub 去找dynamic library是否有重複的program,所以花費的時間會稍長。
### Swap
#### swapping
$\star$ A process can be swapped out of memory to a chunk of disk **backing store**
$\star$ Why?
(1) Free up memory
(2) Roll out, roll in: swap lower-priority process with a higher one.
$\star$ The location of swapping back:
(1) In compile/load time, the swap back memory must be same, so it is not flexible.
(2) In execution time, the swap back memory can be different, so it is flexible.
(3) Total transfer time is direcly proportional to the amount of memory swapped.
### Contiguous Memory Allocation (連續的空間把記憶體分配給程式)
#### Fixed-partition allocation
Each process loads into one partition of fixed-size.
#### Variable-size partition
$\star$ The block of contiguous free memory called Hole.
$\star$ Algorithm of allocate:
(1) **First-fit: allocate the 1st hole that fits**
(2) Best-fit: allocate the smallest hole that fits through searching the whole list
(3) Worst-fit: allocate the largest hole through searching the whole list
### Non-Contigunous Memory Allocation
#### Paging
$\star$ Divide **physical memory** into fixed-sized blocks called **frames**.
$\star$ Divide **logical address space** into blocks of the same size called pages.
$\star$ Keep track of free frames(找閒置的frames才能分配)
$\star$ Set up a **page table** to translate logical to physical addresses.(mapping page number and frame number)

$\star$ Benifit:
(1) Allow the physical-address space of a process to be noncontiguous.(user看到的logical是連續的,但可以根據page table對應到不連續的frame)
(2) Avoid external fragmentation(因為page是fixed-size,一個蘿蔔一個坑,所以不會有external fragmentation)
(3) Limited internal fragmentation(page切越小,maximum fragmentation會被限制)
(4) Provide **shared memory/pages**
#### Address Translation Scheme

$\star$ Logical address is divided into two parts:
(1)Page number $(p)$ :
>$\star$ The index into a page table which contains base address of each page in physical memory.
>$\star$ N bits means that each process can allocate $2^N$ pages. In other words, The memory size allocated by each process is $2^N \times pagesize$.
(2)Page offset $(d)$:
>$\star$ N bits means that the page size is $2^N$.
#### Address Translation
For example, given 32 bits logical address, 36 bits physical address and 4KB page size.
>Page table size: $2^{32}/2^{12}=2^{20}$ entries
>Max program memory: $2^{32}=4$GB
>Total physical memory size: $2^{36}=64$GB
>Number of bits for page number: $2^{20} entries \rightarrow 20\ bits$
>Number of bits for frame number: $2^{24} entries \rightarrow 24\ bits$
#### Free Frames

$\star$ A free-frame list maintained by OS
(當有一個新的process要被create,他的page要去找一個frame去擺的時候,因為這些frames是Non-Contigunous,所以只要從這個free-frame list中找一個,一起放進去page table做assign, allocate)
#### Page/Frame Size
$\star$ A power of 2
$\star$ 4KB / 8KB page is commonly used (32bit / 64bit)
$\star$ Large page size $\rightarrow$ More space waste(Internal fragmentation)
$\star$ Page sizes have grown over time:
(1) memory, process, datasets have became larger(大的page size可以一次從disk中抓取較多的資料)
(2) Better I/O (during page fault)
(3) page table is smaller(因為當page size小,page number大,會需要大的空間存儲page table)
#### Paging Summary
$\star$ Paging helps separate user’s view of memory and the actual physical memory.
$\star$ User view’s memory: one single contiguous space(對於使用者來說logical space是連續的,在不影響使用者操作的複雜度下,可以幫助作業系統有效地使用physical memory space)
$\star$ OS maintains a copy of the page table for each process
$\star$ OS maintains a frame table for managing physical memory(相反於page table,每一個frame存的是哪個process的page)
#### Implementation of Page Table
$\star$ Page table is kept in **memory**
$\star$ Doing the translation between logical address and physical address is MMU and it is a hardware. Memory contents need to load in Page-table base register(PTBR).(page table 存在memory的什麼位置供MMU使用)
$\star$ The pthsical memory address of the page table.
$\star$ PTBR value is stored in PCB(Process Control Block)
$\star$ With PTBR, each memory reference results in 2 memory reads:
(1) Find the page table, load it to MMU and finish the translation.(讀page table)
(2) Find the real address(physical address)(mapping後才讀真正的資料位置)
但這種2-access method會花費較多的時間,因此解決方法:
$\star$ Translation Look-aside Buffers(TLB) which is implemented by Associative memory.
#### Associative Memory

TLB是一個Associative Memory,它可以緩存曾經查詢的page-frame。
(1) 沒有順序性,page3不一定會在第4的位置,所要搜尋所有的entries去找到所需要的page number是否有在緩存中。但是,它的hardware logic設計是可以同時查詢所有的entries,所以它的look up time 是order 1,才可以滿足效能的要求。
(3) 因為它的設計相對一般memory複雜,所以它的entries數量有限制,正常情況下沒辦法把所有process的page table塞進去這個cache所以後續會說到flush。
#### Translation Look-aside Buffer (TLB)

前面提到page table存在memory,需要先從memory中load page table去完成translation,接著再load physical address,這是2-access的方法較耗時,因此利用TLB來解決。
TLB是MMU的cache,MMU translate一個page後會把page-frame關聯緩存在TLB中,在後續的translation中先從TLB中查詢是否已緩存在內,如果有就TLB hit,可以直接translate;如果沒有就只能從page table查詢做2-access的 memory read。
因為不同process有不同page table並共用一個TLB,所以不同page table的相同page number對應到的frame number不會相同,因此有電腦系統有兩個方法處理:
(1) Flush TLB:直接清空TLB中的所有緩存
(2) Process ID:在TLB中另外加Process ID,在查詢過程中除了page number要相同外,process ID也要相同才可以TLB hit 然後translate。通常是用flush,比較省時。而這也是為什麼context switch耗時,因為當TLB flush後要花很多的時間去填TLB,在那期間TLB miss會耗費很多時間在做2-access的memory read然後做translate。