vm_area_struct
I'm trying to draw a parallel with how virtual pages are represented, but unfortunately it's not so straightforward, because what Linux actually sees here is an area. A virtual memory area with a beginning, with an end, and an array of virtual pages in between. So it's kind of complicated to draw this parallel between how a physical page is represented and how a virtual one is represented, because we don't have a single struct to represent this concept.
– 5:05, Inspecting and Optimizing Memory Usage in Linux - João Marcos Costa, Bootlin
mm_struct
The mm_struct pointer in the task_struct indicates the virtual memory areas of a process, and a vm_area_struct pointer in the mm_struct records the memory mapping information.
In the mm_struct, there's a pgd and a mmap. The pgd is the top-level page table. The mmap points to its virtual memory areas. These can be sections of a process such as the stack, data, heap, etc.
– 0:51, Tracing on Page Table - YuHsiang Tseng & ChinEn Lin, National Taiwan Ocean University
mm_struct and vm_area_struct
Each page of physical memory is described by a struct page. Each virtual memory area is described by a vm_area_struct, which covers one contiguous block of the virtual address space. No two areas of the same address space overlap.
Each vm_area_struct has a type. It can be anonymous memory, a memory-mapped file, device memory, and so on. The properties that distinguish these types, such as permissions, sharing, and device mappings, are recorded in the VM_* flags defined in include/linux/mm.h.
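To make the flag idea concrete, here is a minimal userspace sketch that decodes a flag word into a permission string in the style of /proc/$PID/maps. The TOY_* names and the helper function are hypothetical; the four bit values are copied from the low VM_* bits as I understand them, and the real kernel decides the last column from VM_MAYSHARE rather than VM_SHARED, so treat this as an illustration rather than the kernel's own code.

```c
#include <stddef.h>

/* Hypothetical copies of the low VM_* bits from include/linux/mm.h.
 * This is an illustration, not the kernel header itself. */
#define TOY_VM_READ   0x00000001UL
#define TOY_VM_WRITE  0x00000002UL
#define TOY_VM_EXEC   0x00000004UL
#define TOY_VM_SHARED 0x00000008UL

/* Render a flag word as a four-character permission string such as
 * "r-xp". The caller provides a buffer of at least 5 bytes. */
void toy_vm_flags_to_perms(unsigned long flags, char out[5])
{
    out[0] = (flags & TOY_VM_READ)   ? 'r' : '-';
    out[1] = (flags & TOY_VM_WRITE)  ? 'w' : '-';
    out[2] = (flags & TOY_VM_EXEC)   ? 'x' : '-';
    out[3] = (flags & TOY_VM_SHARED) ? 's' : 'p';  /* shared or private */
    out[4] = '\0';
}
```

Calling toy_vm_flags_to_perms(TOY_VM_READ | TOY_VM_EXEC, buf) fills buf with the string for a readable, executable, private area, which is what a typical text segment looks like.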
vm_area_struct
How a process links its vm_area_struct instances depends on the kernel version. Before Linux 6.1, they were linked in two ways at once: by a doubly linked list and by a red-black tree, so the kernel could use whichever structure suited the operation at hand.
For example, the linked list is convenient for traversing every area in address order, while the tree lets the kernel find the area for a given address in logarithmic time.
In those older kernels, vm_next and vm_prev are the pointers that connect the vm_area_struct instances as a linked list. Since Linux 6.1, a maple tree, stored as mm_mt in the mm_struct, replaces both structures: it serves lookups and in-order traversal alike, and vm_next and vm_prev no longer exist.
You can read the /proc/$PID/maps file of a process to see all of its vm_area_struct instances, one per line.
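As a small sketch of reading that file from a program, the following C code parses the start address, end address, and permission column of each line in /proc/self/maps, the maps file of the calling process. The function names are my own, and the parser deliberately ignores the remaining columns (offset, device, inode, path).

```c
#include <stdio.h>

/* Parse the first three fields of one /proc/<pid>/maps line.
 * Returns 1 on success, 0 if the line does not match the format. */
int parse_maps_line(const char *line, unsigned long *start,
                    unsigned long *end, char perms[5])
{
    return sscanf(line, "%lx-%lx %4s", start, end, perms) == 3;
}

/* Print every area of the calling process. Returns the number of
 * areas found, or -1 if the file cannot be opened (e.g. not Linux). */
int dump_self_maps(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    char line[512];  /* assumed long enough for typical lines */
    int count = 0;
    while (fgets(line, sizeof line, f)) {
        unsigned long start, end;
        char perms[5];
        if (parse_maps_line(line, &start, &end, perms)) {
            printf("%#lx-%#lx %s %lu KiB\n",
                   start, end, perms, (end - start) / 1024);
            count++;
        }
    }
    fclose(f);
    return count;
}
```

Calling dump_self_maps() from main prints one line per area, so the count it returns is the number of areas the process has at that moment.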