# Linux Memory Management - vm_area_struct

[TOC]

## Citations

### The `vm_area_struct`

```c
/* Abridged excerpt; the full definition is in include/linux/mm_types.h. */
struct vm_area_struct {
	/* VMA covers [vm_start; vm_end) addresses within mm */
	unsigned long vm_start;
	unsigned long vm_end;

	struct mm_struct *vm_mm;	/* The address space we belong to. */
	const vm_flags_t vm_flags;

	struct list_head anon_vma_chain;
	struct anon_vma *anon_vma;

	/* Function pointers to deal with this struct. */
	const struct vm_operations_struct *vm_ops;

	struct file *vm_file;		/* File we map to (can be NULL). */
	void *vm_private_data;		/* was vm_pte (shared mem) */
};
```

I'm trying to draw a parallel with how virtual pages are represented, but unfortunately it's not so straightforward, because what Linux actually sees here is an area. A virtual memory area with a beginning, with an end, and an array of virtual pages in between. So it's kind of complicated to draw this parallel between how a physical page is represented and how a virtual one is represented, because we don't have a single struct to represent this concept.

-- [5:05, Inspecting and Optimizing Memory Usage in Linux - João Marcos Costa, Bootlin](https://youtu.be/pIR1H7ZyWe4?si=R3BiqwGD4A06Tads&t=305)

### The `mm_struct`

The `mm_struct` pointer in the `task_struct` indicates the virtual memory areas of a process, and a `vm_area_struct` pointer in the `mm_struct` records the memory mapping information. In the `mm_struct`, there's a `pgd` and a `mmap`. The `pgd` is the top-level page table. The `mmap` points to the process's virtual memory areas. These can be sections of a process such as the stack, data, heap, etc.

-- [0:51, Tracing on Page Table - YuHsiang Tseng & ChinEn Lin, National Taiwan Ocean University](https://youtu.be/Of1KXLHpZDg?si=pUCnw8PUeyyvjIUG)

### The hierarchy of `mm_struct` and `vm_area_struct`

Physical memory is described by `struct page`. Each virtual memory area is described by a `vm_area_struct`. A `vm_area_struct` describes a contiguous block in the virtual address space.
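
The half-open range `[vm_start; vm_end)` noted in the struct comment can also be observed from user space. The following is a minimal sketch, not kernel code: it assumes a Linux system with procfs mounted and reads a `maps` file such as `/proc/self/maps`, where each line begins with `start-end perms` for one mapped area. The names `vma_line`, `parse_maps_line` and `print_vmas` are hypothetical helpers invented for this illustration. Calling `print_vmas("/proc/self/maps")` from a `main` function should print one range per area.

```c
#include <assert.h>
#include <stdio.h>

/* User-space view of one maps line (hypothetical helper type, not a kernel struct). */
struct vma_line {
	unsigned long start;	/* first address of the area, like vm_start */
	unsigned long end;	/* first address past the area, like vm_end */
	char perms[5];		/* permission string, e.g. "r-xp" */
};

/* Parse the leading "start-end perms" fields of one line.
 * Returns 1 on success and 0 if the line does not match. */
int parse_maps_line(const char *line, struct vma_line *out)
{
	assert(line != NULL && out != NULL);
	return sscanf(line, "%lx-%lx %4s",
		      &out->start, &out->end, out->perms) == 3;
}

/* Print every area listed in the file at path.
 * Returns the number of areas printed, or -1 if the file cannot be opened. */
int print_vmas(const char *path)
{
	/* Assumption: 8 KiB is enough for one line, including a long path name. */
	char line[8192];
	int count = 0;
	FILE *fp = fopen(path, "r");

	if (fp == NULL)
		return -1;

	while (fgets(line, sizeof line, fp) != NULL) {
		struct vma_line v;

		if (!parse_maps_line(line, &v))
			continue;	/* skip anything that is not a maps line */

		/* The range is half-open, so the size is simply end - start. */
		printf("[%#lx, %#lx) %s %lu KiB\n",
		       v.start, v.end, v.perms, (v.end - v.start) / 1024);
		count++;
	}

	fclose(fp);
	return count;
}
```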
No two `vm_area_struct`s of a process overlap in the virtual address space. Each `vm_area_struct` has a type. It can be anonymous memory, a memory-mapped file, device memory and so on. Those types are defined by the `VM_*` flags in `include/linux/mm.h`.

### Two views of the `vm_area_struct`

The kernel needs two things from the set of `vm_area_struct`s of a process: convenient traversal of every area, and fast lookup of the area that contains a given address. Kernels before Linux 6.1 provided both by linking the areas in two ways at once: as a linked list, through the `vm_next` and `vm_prev` pointers, and as a red-black tree (`mm_rb`). The kernel could then choose whichever data structure suited the operation at hand: the linked list is convenient for traversing the entire structure, while the tree offers better search complexity. Since Linux 6.1, a single [maple tree](https://youtu.be/RaXhP-QLUxI?si=x8PTeRpTQ-l68J5k), the `mm_mt` field of the `mm_struct`, replaces both the list and the red-black tree and serves both purposes.

[![Screenshot from 2024-10-21 20-26-18](https://hackmd.io/_uploads/Bkr3upXgJl.png)](https://www.slideshare.net/slideshow/process-address-space-the-way-to-create-virtual-address-page-table-of-userspace-application-251425396/251425396)

You can read the `/proc/$PID/maps` file of a process to see all of its `vm_area_struct`s.

## References

### [Tracing on Page Table - YuHsiang Tseng & ChinEn Lin, National Taiwan Ocean University](https://youtu.be/Of1KXLHpZDg)

{%youtube Of1KXLHpZDg %}

### [Lower Response Time of Fork by Extending Copy-on-write to the Page Table - Chih-En Lin](https://youtu.be/J53PxYfpro4)

{%youtube J53PxYfpro4 %}

### [Structures to manage the stack, virtual memory, and process ids in the Linux kernel](https://youtu.be/yaxcsAt8Mhw)

{%youtube yaxcsAt8Mhw %}

### [Hiding Process Memory via Anti-Forensic Techniques](https://youtu.be/tMxCfxjtvnk)

{%youtube tMxCfxjtvnk %}

### [Operating System Design and Implementation - Lec 07 Memory Management – Part III](https://youtu.be/1zMipcUhsOs)

{%youtube 1zMipcUhsOs %}