NCTU OSDI Discussion - Memory Management I

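# NCTU OSDI Discussion - Memory Management I
###### tags: `OSDI`

* About the `malloc` mentioned in the video: does the `sbrk` system call that `malloc` calls internally return a physical address or a virtual address?
    * `malloc` is user space requesting memory, so everything it gets is a virtual address.
* To access a physical address inside the kernel, you would have to turn the MMU off, but then wouldn't the CPU be unable to find the next instruction, since the kernel code itself runs at virtual addresses?
    * Kernel space uses a one-to-one virtual-to-physical mapping with a simple fixed offset.
    ```c
    /* Kernel linear mapping: translation is a constant offset, no table walk needed. */
    phys_addr = virt_to_phys(virt_addr);
    virt_addr = phys_to_virt(phys_addr);

    #define __virt_to_phys(x) (((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
    #define __phys_to_virt(x) ((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
    ```
* The compiler that builds the kernel does not know its output will be used as a kernel, so how is the kernel made to use that designated 1G of virtual memory?
    * https://elinux.org/Kernel_Size_Tuning_Guide
* If another memory module is installed, how does the MMU know that more memory is available, and how does the kernel know? If they don't know, adding more modules would be pointless?
    * The MMU is inside the CPU.
* Since a user program is not given all of its memory at once, how much does it get when it is first loaded: code size + data size + stack size? Or are code and data not fully loaded either?
    * How much it occupies initially depends on the compiler. Everything else is only really allocated when that memory is touched; until then the allocation is not backed by real memory.
    * Code should be loaded at load time, but the libraries it uses are not necessarily; dynamic shared libraries should be loaded only when they are used.
    * ![](https://i.imgur.com/3kdYgi8.png)
* In Lab3, one step of the start-up assembly disables the MMU through the system control register, yet the addresses defined before Lab3 (such as the mbox and UART addresses) need no change. Is that because the OS has not told the MMU how to map logical addresses, so the MMU makes the physical address equal to the logical address? (I only copied this step from the reference, so I am not sure why.)
    * Before the MMU is enabled there are no logical addresses. All addresses are physical addresses.
* Does "user and kernel share the 4G space" mean that every process has 1G reserved for kernel space, and that kernel space contains the kernel code and the kernel stack needed during a system call? **Yes**
    * The kernel stack is in kernel space. Besides kernel code and kernel stacks, many kernel data structures also occupy kernel space.
    * The memory used by the kernel always grows: the more processes are started, the more TCBs and similar structures there are, so it grows gradually.
* Why does every virtual address space contain kernel space, yet these kernel spaces are not mapped into physical space?
    * All kernel space has physical memory mapped.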
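    * User space is dynamically mapped to scattered physical memory. Kernel space is mapped to physical memory whenever it holds something. The kernel portion of every user address space is shared.
    * Everyone's 1G of kernel space should map to the same physical memory.
    * So a computer needs at least 1G of memory to run? No: the kernel grows gradually. By the same logic as user space, only 1G of logical addressing range is reserved.
    * ![](https://i.imgur.com/JcsTEdn.png)
* For boards with little memory, should the kernel/user addressing ranges be adjusted?
    * Physical memory is whatever size it is. What needs tuning is how much memory the kernel and user space use, so you might change which libraries are used, for example.
    * Logical memory is just addressing capability. The 1G:3G split is a logical layout and does not need to change.
    * 3G+1G virtual memory allocation on a device with small physical memory
* The lecturer talked about logical memory, i.e. moving the MMU behind the cache, but the computer organization course says this causes aliasing problems in the cache. Does it or not?
    * The aliasing problem appears when a virtually addressed cache is used.
    * Besides "virtually indexed, physically tagged", what other solutions do modern designs use?
        * VIPT can also have aliasing problems.
        * To avoid aliasing entirely, use physically indexed, physically tagged.
        * Aliases require a cache flush.
* The MMU translates memory addresses, but isn't paging part of the OS? So does the MMU have to ask the CPU to run the paging logic and then hand the real address back to the CPU?
    * ![](https://i.imgur.com/35fcQvo.png)
    * Address translation is done by hardware, which uses the page table set up by the kernel. If translation required the CPU to run code, how would the addresses of that code itself be translated in the meantime? I think that would increase software complexity and reduce performance.
    * Much of the table data behind the TLB lives in memory. It is a complicated situation.
* Do the multi-level page tables in the OS include the segment part?
    * It depends on the OS implementation.
    * It is up to whether the OS supports it.
    * Most people consider page tables and segments redundant, and many systems enable only one.
* Would using both a logical and a physical cache improve performance?
    * Increasing the cache size would help. Splitting the same cache size into a logical cache and a physical cache does not improve performance (I feel), but it does increase hardware cost.
    * The two approaches are different.
    * In hardware terms, cache is expensive. More cache means more money.
    * Splitting the same size into two pieces does not change performance much, but it increases cost a lot.
* With compile-time address binding, if different programs are bound to the same range of memory addresses, can they still run at the same time?
    * Virtual address or physical address?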
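    * Applications are bound to virtual addresses. The actual location is only known at linking time.
* The maximum amount of memory an application can use is the OS's addressing capability minus the address space taken by the kernel. Is this understanding correct? **Yes** (assuming no swap)
    * Even with swap space, the limit is still the addressing space minus kernel space. Swap space does not enlarge an application's address space; it lets a computer without enough physical memory hold all running processes.
    * Swap space does not increase addressing capability. It only allows many programs to be open at once even when their combined memory is very large.
* Is address translation inside a VM handled by the host CPU's MMU? If the core is shared with the host, who does the address translation?
    * Good question: who translates the addresses inside a VM?
    * Presumably the guest OS hands it over to the host OS.
    * VMs usually need OS support, so for the host CPU to help with translation there is probably a dedicated part designed for it.
    * You will need to read up on VMs and Docker separately.
* Most caches are non-programmable. Why are they designed that way? (Solved: perhaps this refers to x86, which mostly lacks such a design; ARM and MIPS both seem to support locking cache lines so that part of the cache can be used by the user.)
    * Many caches today are programmable.
* Why does x86 do segment translation first and then page translation? Is using both better than choosing one?
    * Historical reasons.
* Why did the lecturer say "managing the page table and TLB is a hardware design issue, and software only needs to care about when they get flushed"? Is the split of a logical address into "selector | offset" and of a linear address into "d | p | offset" decided by hardware? **Yes**
    * Software (more specifically, kernel) design issues
    * The logical address gets split,
    * The OS decides how memory is managed.
    * The hardware leaves a certain amount of room for configuration.
* 1G of virtual memory is kernel space. Do the physical addresses of these kernel spaces point to the same place?
    * Not necessarily.
* Following up on the previous question: if so, which parts of the OS need to live in kernel space? Interrupt or exception handlers? If not, what data does each program's kernel space hold?
    * A lot, e.g. files and kernel stacks.
* What is the benefit of x86 going through both segment and page translation? Doesn't that lengthen translation time?
* Why does Linux make little use of segmentation?
    * https://unix.stackexchange.com/questions/469253/does-linux-not-use-segmentation-but-only-paging
    * In short, it is considered redundant.
* The video says each user process gets 4G of virtual memory, 1G for the kernel and 3G for the user. If every process has 1G for the kernel, why not simply give the kernel 1G of physical addresses directly?
    * Already answered above.
* Why not back up the cache contents when a process is context-switched out, so that it sees no cache misses when it runs again? Wouldn't that give better performance?
    * How would you back up the cache? By saving it to memory. And how is memory accessed? Through the cache. The overhead of saving and restoring the cache is larger than the cost of the cache misses.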
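    * A cache helps because it avoids going to memory. If you back the cache up to memory and fetch it again later, that is not very useful: moving it back into the cache costs two extra transfers, and the data may not even be used. At most this pays off for data that is certain to be used, such as kernel data, which is why some cache lines are sometimes locked.
* Memory keeps growing, and 32G in a personal PC is increasingly common. When memory is large enough, is virtual memory no longer really needed? After all, the extra translation layer adds overhead. (I have turned virtual memory off myself.)
    * Could you share how you turned off virtual memory? I think you disabled swap, not the MMU.
    * What you turned off should be swap space, not virtual memory, because virtual memory also provides protection.
    * Without virtual memory there can be no swap space.
    * Virtual memory exists for protection.
    * Swap space exists to extend insufficient memory.
* The video mentions that with segments, some regions are read-only and some are read/write, so they can hold different things. Is this layout fixed? **Usually not.** Perhaps certain locations are always read-only, but then how would that cope with different memory sizes?
    * The hardware gives a lot of control for configuration.
    * Things like a memory control block.
* The video says that if a process accesses an address outside its own space, it is caught and gets a segmentation fault. But when developing on Windows, an array access can go out of bounds without any complaint and the program finishes normally. Is Windows badly written?
    * Everything a process accesses is in its own space, since these are all virtual addresses; it cannot touch another process's space. The only question is whether the address is accessible.
    ```c
    #include <stdio.h>

    int main(void) {
        int *i, j[80];
        i = (int *)0x00000000;
        printf("%d", *i);      // illegal address access: address 0 is not mapped for the process
        printf("%d", j[1000]); // the VMA does not cover that much space, so the VMA check blocks it
        printf("%d", j[80]);   // often runs: memory is requested in page units, so slightly past the
                               // end is still inside the VMA and usually does not fault
        printf("%d", j[79]);   // last valid element
        return 0;
    }
    ```
    * As long as it stays within the stack limit it should work... I think. If stack + 1000 was already in use before reaching the entry point, it should be fine.
    * `i` and `j` are allocated in the stack segment rather than the data segment, because they are inside a function.
    * What if it is
    ```c
    int *i, j[80];
    int main(void) {
        //.....
    }
    ```
    * Where would they be placed then?
    * What if it is
    ```c
    const int *i, j[80];
    int main(void) {
        //.....
    }
    ```
    * Variables that have an initial value are placed in the data segment.
    * ![](https://i.imgur.com/hAhoK6y.png)
    * The access went past the array but not past the memory allocated to user space. Try dereferencing a random pointer directly and you will easily get a segmentation fault. The array bound you exceeded is only a range defined by the program; exceeding it may violate your program's logic, but memory management does not regard the access itself as a problem.
    * Is accessing unallocated space legal or not? The VMA check decides. In general, even pre-allocated space is not really there; it is only obtained on access, so such an access should fail.
* `movl $0x0, -0x4(%rbp)`: the lecturer said in the video that this is where variable b is stored, but the original program does not seem to give b an initial value?
    * Never mind, the example in the video has a small mistake.
* Why is 5+6=11 stored at -0xc instead of back at b's original location -0x4(%rbp)?
* Would the I-cache and D-cache each prefer to be a logical or a physical cache, or does it depend on design requirements?
    * On the Cortex-A53 the I-cache is VIPT and the D-cache is PIPT.
* Is the memory limit the main reason 64-bit software cannot run on a 32-bit system?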
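    * The ISA may differ.
* Where does the 4GB limit of a 32-bit process come from, and what is the maximum addressing of a 64-bit process? [ref](http://blog.linux.org.tw/~jserv/archives/001463.html)
* Later the video says that because kernel space is mapped 1:1, the physical address is commonly obtained by subtracting an offset, so it is unaffected by the TLB/page table flush required on a context switch between processes.
* The part of the course on the MPU and segmentation says that code and data are separated and the text segment is protected. Would self-modifying code then cause a segmentation fault? Does running such code require turning the MPU off?
    * Good question. JIT is a very special case.
    * The MPU can be configured.
* Why does ARM11 use a physical cache?
    * The cache is physically addressed, which solves many cache aliasing problems and reduces context switch overhead.
    * [wiki](https://en.wikipedia.org/wiki/ARM11)
* It was said that MMU versus MPU is a matter of how much is implemented. What are the main differences between the two?
    * The MMU is more advanced than the MPU and has additional features.
    * The MMU covers everything the MPU does.
    * The MPU is simpler than the MMU.
    * [ref](https://www.cnblogs.com/thammer/p/10570301.html)
* What are the detailed pros and cons of segmentation and paging? [Solved] Ref
* Are logical and physical caches related to the L1, L2, L3 caches?
* Does swapping happen only in the page fault handler? Does a page that was swapped out but never used again simply never get swapped back in?
* When a process's memory access causes a page fault, how does the OS handle the subsequent retry of that access? Does the OS have to stop and wait for the CPU to notify it, or can it keep going and even schedule other processes?
    * The OS runs on the CPU.
    * On a page fault, execution traps into the OS, which finds memory and then re-executes the instruction. If swapping is needed, the process is suspended, because that involves a lot of work and takes a long time; it is resumed later.
* I don't quite understand how the GDT/LDT are handled and why it is done this way.
    * [Solved] Found a fairly complete introduction on Zhihu: https://zhuanlan.zhihu.com/p/69504370
* How is memory management designed on ARM?
    * [Solved] https://developer.arm.com/architectures/learn-the-architecture/memory-management/the-memory-management-unit-mmu
* [solved] What are the GDT, LDT, and TSS?
* [solved] How is the split between the data cache and the instruction cache achieved? Are there two physically separate caches in hardware, or do we have to partition and configure them logically ourselves?
    * -> [ARMv8 processor properties](https://developer.arm.com/docs/den0024/a/armv8-a-architecture-and-processors/armv8-a-processor-properties).
    * It is done in the hardware wiring.
    * Newer ARM cores also let you configure how large the instruction and data caches are.
* Caches are split into data and instruction caches. How does the CPU decide whether a word it reads is data or an instruction, and put it into the right cache?
    * The CPU knows whether its own operation is an instruction fetch or a memory access.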
      These operations happen in different pipeline stages.
    * The CPU always knows whether it is fetching an instruction or data.
* Can physical memory also have a mechanism for swapping to disk? (Solved: I think it could be designed.)
    * Wrong: the lecturer answered above that it cannot; virtual memory is required.
* Could kernel space use physical memory while user space uses virtual memory, to speed up the OS? Or is the overhead already small enough because the kernel-to-physical mapping is just an offset? Or does the CPU only have all-on or all-off modes?
    * Answered above.
* In this era of large RAM, is virtual memory really better than using physical memory directly? One advantage of virtual memory is that programs still run when they need more space than physical memory, but in practice that is usually much slower, and people usually just buy more RAM.
    * Already answered.
    * Having a lot of memory is a different matter from virtual memory.
    * Virtual memory ensures that processes' address spaces do not conflict.
* Wouldn't using physical memory directly better guarantee that my program occupies truly contiguous memory, rather than memory that looks contiguous but actually sits in different pages?
    * The kernel is responsible for managing memory. If programs use physical addresses directly, how can the kernel restrict a user program? The kernel would need a software MMU to keep physical memory under control. Or by "virtual memory" do you mean the swap mechanism?
    * So with physical addresses the kernel cannot control which addresses the user reads? (I did not mean the swap mechanism.)
    * An MPU could be used, but it is then hard to provide each process with memory to run in.
* An instruction converted to 0/1 is 32 bits in total, but in machine code it may be 1 byte, 3 bytes, ...; are the two stored separately after conversion?
* What is swap space? (Solved)
* A process can be divided into segments and then into pages. Are they stored separately? Doesn't that waste more space?
* I found that x86-64 no longer uses segmentation. Why did IA32 originally mix segmentation and paging?
* When an OS is installed, how does it know that the underlying hardware has an MMU?
    * It is known at kernel design time.
    * The OS asks the hardware.
* Why can't other parts of a program be loaded mid-execution when using physical memory, if I reserved part of the memory at the start?
    * If you allocate or reserve memory at boot, you can load other programs at run time.
* Why does tagging the TLB with a PID avoid flushing it? Isn't the TLB flushed because there is only one TLB?
    * With an ASID, the kernel can decide whether or not to flush the TLB.
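
The ASID answer above can be illustrated with a toy model. This is only a sketch under stated assumptions: the names `tlb_entry`, `tlb_insert`, `tlb_lookup`, and `tlb_flush_all`, the 4-entry size, and the round-robin replacement are all hypothetical and do not describe any real MMU. It shows that when each entry is tagged with an ASID, translations of different address spaces can coexist, so a context switch only changes the ASID used for lookups instead of emptying the TLB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 4  /* hypothetical size; real TLBs are larger */

struct tlb_entry {
    bool     valid;
    uint16_t asid;  /* address space ID of the owning process */
    uint32_t vpn;   /* virtual page number */
    uint32_t pfn;   /* physical frame number */
};

static struct tlb_entry tlb[TLB_ENTRIES];
static unsigned next_victim;  /* round-robin replacement index */

/* Invalidate every entry: what a TLB without ASIDs needs on each context switch. */
void tlb_flush_all(void)
{
    for (unsigned i = 0; i < TLB_ENTRIES; i++)
        tlb[i].valid = false;
}

/* Cache one translation for the given address space. */
void tlb_insert(uint16_t asid, uint32_t vpn, uint32_t pfn)
{
    tlb[next_victim] = (struct tlb_entry){ true, asid, vpn, pfn };
    next_victim = (next_victim + 1) % TLB_ENTRIES;
}

/* Return true and store the frame in *pfn on a hit.
 * A hit requires both the ASID and the virtual page number to match. */
bool tlb_lookup(uint16_t asid, uint32_t vpn, uint32_t *pfn)
{
    assert(pfn != NULL);
    for (unsigned i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].asid == asid && tlb[i].vpn == vpn) {
            *pfn = tlb[i].pfn;
            return true;
        }
    }
    return false;
}
```

In this model, two processes with ASIDs 1 and 2 can both cache a translation for the same virtual page number without interfering; `tlb_flush_all` is only needed when the kernel chooses to flush, for example when an ASID is reused.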
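* Why must kernel space and user space share the same address space? Doesn't kernel space have its own way of finding physical addresses?
    * On 32-bit, they use the same page table.
    * On Armv8-A, kernel space and user space use different page tables.
* In an earlier OS class we learned that spatial locality can make cache use more efficient. However, a region that is contiguous in virtual memory is not necessarily contiguous in physical memory. Does that mean that sometimes, even if we design a program with spatial locality in mind, we may not get its benefits?
    * There is a misunderstanding here: virtual-to-physical translation maps one block to one block (2^N aligned), so a certain degree of spatial locality is necessarily preserved.
    * A follow-up question: in a cache that is not fully associative, we know that two adjacent blocks in physical memory (say A and B) map to different cache lines.
    * But adjacent virtual memory does not necessarily correspond to two adjacent physical blocks. If the physical blocks happen to map to the same cache location, wouldn't alternately accessing these two pieces of virtual memory keep missing in the cache? (Assume direct mapping; n-way is a similar scenario.)
    * https://stackoverflow.com/questions/52955107/does-spatial-locality-provided-by-a-cache-refer-to-virtual-memory-physical-memo
    * Cache coloring: https://en.wikipedia.org/wiki/Cache_coloring
    * Cache coloring makes sure that adjacent blocks in virtual memory are not mapped to physical memory blocks that use the same cache location, which avoids the problem described above.
* [solved] Curious about Intel memory management
    * [It uses the PML4, page directory pointer tables, page directories, and page tables to translate a linear address into a physical address]
* [solved] Curious about ARM memory management
    * [It uses level 0 to level 3 tables to translate a VA into a PA; ARM can also be configured for two-stage translation, VA -> IPA -> PA]
* [solved] In theory a 32-bit CPU can only address 4GB, but PAE allows the addressable space to exceed 4G
    * [https://superuser.com/questions/556008/memory-limits-in-16-32-and-64-bit-systems]
* After a fork, the child and parent still share the same memory. Whose PID should the TLB record?
    * I think an ASID is the better choice.
* When a process exits, does its data in the page table or TLB need to be cleared?
    * The page table can be freed.
* The memory available to the OS is assigned up front. If some events make the OS use more and more until it exceeds the limit, what happens?
    * OOM (out of memory).
* Is bigger swap space always better?
    * No.
* [Solved] Are mixed addresses ever used between the cache and memory, for example a tag taken from the physical address and an index taken from the virtual address?
    * (Ans. Yes: VIPT, PIPT, VIVT)
* What is a reasonable page size? It seems 2MB is common?
    * Usually 4K; huge pages can be 2M or even larger.
* How large is a swap partition usually? (It seems swap is often not configured any more?)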
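    * It depends.
* The virtual addresses produced by the compiler are translated to physical ones before the CPU uses them. Two instances of the same program have the same virtual addresses; how does that relate to paging?
    * They are mapped to different physical addresses.
* The page table lives in memory. Does the MMU use physical addresses when it accesses the page table?
    * Yes
* Are data structures and code inside the OS also accessed through virtual addresses?
    * Yes
* Is page fault handling a blocking wait? If not, the TLB is flushed on the context switch and control returns to the original program after completion; wouldn't that easily create a loop?
* To avoid flushing the TLB on a context switch, a PID can be added as a tag. But how is memory with the same physical address and different virtual addresses handled?
    * Aliases are handled by flushing the cache.
* For every user process there is a 4G - X region. That range of addresses should not be accessible to it, right? Does the page table then contain entries for this address range?
    * User code cannot access it, but the page table still needs entries for it so that the kernel can use it.
* Why do the kernel's virtual addresses map contiguously onto physical addresses?
    * It makes physical memory easier to manage.
* Does the kernel space in different user processes' virtual memory map to the same location in physical memory?
    * Not always
* About the difference between a logical cache and a physical cache: one goes straight from the virtual address to the cache, while the other first translates to a physical address and then checks the cache. Doesn't using a physical cache mean ordinary RAM must first be accessed to look up the table and obtain the physical address? But the purpose of a cache is to bridge the large speed gap between CPU and memory. If every memory access with a physical cache still needs at least one access to ordinary memory, is the cache really effective enough? Or is it that context switches are so frequent in practice that a physical cache comes out ahead?
    * That is why the hardware has a TLB.
* If the page table has few levels, is the TLB almost useless?
    * The TLB is always useful.
* The course says server-class machines generally need more kernel space. I don't quite follow: aren't server programs also written as user-space programs? Or are there OSes/kernels made specifically for servers that build the server functionality right in?
    * Many programs that run on servers are written directly into the kernel.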
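
Two of the page-table answers above (the MMU reads the page table through physical addresses, and two copies of the same program share virtual addresses but map to different physical addresses) can be sketched with a toy two-level walk. Everything here is an assumption made for illustration: the 10/10/12 "d | p | offset" split, the 64 KiB `phys_mem` array standing in for RAM, and the helper names `map_page` and `translate` are hypothetical and do not correspond to any real kernel or hardware interface.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define ENTRIES     1024            /* 10-bit index per level */
#define PTE_PRESENT 0x1u
#define PHYS_WORDS  (16 * ENTRIES)  /* toy RAM: 16 frames of 4 KiB */

static uint32_t phys_mem[PHYS_WORDS];  /* indexed by physical address / 4 */

/* Word access by physical address, as a hardware table walker would do. */
static uint32_t phys_read(uint32_t paddr)
{
    assert(paddr / 4 < PHYS_WORDS);
    return phys_mem[paddr / 4];
}

static void phys_write(uint32_t paddr, uint32_t val)
{
    assert(paddr / 4 < PHYS_WORDS);
    phys_mem[paddr / 4] = val;
}

/* Map the page containing vaddr to the frame at frame_paddr.
 * root is the physical address of the page directory and pt_paddr the
 * physical address of the second-level table to use; the directory
 * entry is simply overwritten in this sketch. */
void map_page(uint32_t root, uint32_t pt_paddr, uint32_t vaddr, uint32_t frame_paddr)
{
    uint32_t d = vaddr >> 22;
    uint32_t p = (vaddr >> PAGE_SHIFT) & (ENTRIES - 1);
    phys_write(root + d * 4, pt_paddr | PTE_PRESENT);
    phys_write(pt_paddr + p * 4, frame_paddr | PTE_PRESENT);
}

/* Two-level walk. Returns 0 and stores the result in *paddr,
 * or returns -1 if an entry is not present (a page fault). */
int translate(uint32_t root, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t d   = vaddr >> 22;
    uint32_t p   = (vaddr >> PAGE_SHIFT) & (ENTRIES - 1);
    uint32_t off = vaddr & 0xfffu;

    uint32_t pde = phys_read(root + d * 4);
    if (!(pde & PTE_PRESENT))
        return -1;
    uint32_t pte = phys_read((pde & ~0xfffu) + p * 4);
    if (!(pte & PTE_PRESENT))
        return -1;
    *paddr = (pte & ~0xfffu) | off;
    return 0;
}
```

Giving each process its own `root` is what lets the same virtual address resolve to different frames; switching processes in this model amounts to switching which `root` is passed to `translate`.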