
Linux Memory Management - Zones

Citations

NOTICE NOTICE NOTICE

The Citations section contains excerpts from the talks listed in the References. The origins are labeled at the end of the paragraphs, and the contents are attributed to their respective authors.

NUMA nodes

If we look at the physical memory of your system, we can see that it is subdivided into so-called nodes.

[Figure: a simplified NUMA system with two nodes]

Here we see a simplified example of a NUMA system, with two nodes. Part of the memory is at node 0, and part of the memory is at node 1. Both nodes together make up the total memory space of this system.

You can see that each node also has various CPUs connected to it. These CPUs can access the memory in the same node very fast, and they can also access memory in the other node, but that goes via the interconnect, which is slower.

43:23, Tutorial: Linux Memory Management and Containers - Gerlof Langeveld, AT Computing
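
(A minimal user-space sketch, not from the talk, assuming libnuma is installed: it allocates a buffer backed by memory on a specific node, so a CPU on that node gets the fast local access described above. Build with gcc numa_node_demo.c -lnuma.)

/*
 * Sketch: allocate a buffer whose pages come from NUMA node 0.
 * Assumes libnuma (numa.h) is available; node 0 is used for illustration.
 */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	size_t size = 64 * 1024 * 1024;	/* 64 MB */
	void *buf;

	if (numa_available() < 0) {
		fprintf(stderr, "NUMA is not available on this system\n");
		return 1;
	}

	/* Ask for the pages to come from node 0 */
	buf = numa_alloc_onnode(size, 0);
	if (!buf) {
		fprintf(stderr, "numa_alloc_onnode failed\n");
		return 1;
	}

	memset(buf, 0, size);	/* touch the pages so they are actually allocated */
	printf("Allocated %zu bytes on node 0 (max node id: %d)\n",
	       size, numa_max_node());

	numa_free(buf, size);
	return 0;
}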

Zones

So memory is subdivided into nodes, and nodes are subdivided into zones for Linux memory management.

cat /proc/buddyinfo 
Node 0, zone      DMA      1      0      0      1      0      1      0      0      0      1      2 
Node 0, zone    DMA32      4      8      6      8      6      7      7      5      3      5    341 
Node 0, zone   Normal   4240   1274   1804   1471    746    462    210     81     24     24   5971

ZONE_DMA

What we see here is that the first zone in memory is the so-called DMA zone. That's the first 16 MB of memory. That is still rather precious memory if you are still using ISA controllers: ISA controllers can only generate 24-bit addresses (2^24 bytes = 16 MB), so they can only do DMA in the first 16 MB of memory. So that's a separate zone.

44:59, Tutorial: Linux Memory Management and Containers - Gerlof Langeveld, AT Computing

(We can see that physical memory is not quite a homogeneous pool of addresses. That's where we start abstracting it.)

5:42, Inspecting and Optimizing Memory Usage in Linux - João Marcos Costa, Bootlin
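
(A hypothetical driver fragment, not from the talks: a device behind an ISA-style controller can only DMA into the first 16 MB, so its buffer can be requested from ZONE_DMA by adding the GFP_DMA modifier to the kmalloc() flags.)

/*
 * Sketch: request a small buffer that is guaranteed to lie in ZONE_DMA
 * (below the 16 MB boundary), as an ISA-style device would need.
 */
#include <linux/slab.h>
#include <linux/gfp.h>

static void *isa_buf;

static int isa_demo_alloc(void)
{
	/* 4 KB buffer taken from ZONE_DMA */
	isa_buf = kmalloc(4096, GFP_KERNEL | GFP_DMA);
	return isa_buf ? 0 : -ENOMEM;
}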

ZONE_DMA32

Then we have another zone, which is the DMA32 zone, and that goes from 16 MB to 4 GB, which is addressable with 32 bits. 32-bit controllers that might do DMA have to have their buffers there.

ZONE_NORMAL

The rest of your memory is, in fact, the NORMAL zone.

/proc/buddyinfo

Also see the proc_buddyinfo(5) man page.

You can find this information about nodes and zones in the buddyinfo file in procfs.

# cat /proc/buddyinfo
Node 0, zone Normal 28 13 8 3 3 1 2 2 2 3 2 51

I got this from a 32-bit Arm board. Here we can see that we only have the Normal zone, because we haven't hit the roughly 900 MB limit.

For each of those columns, we have the number of available consecutive memory chunks of a certain size. Each column corresponds to an order. We have 28 chunks of 4K size, because the order is 0 (a single page). In the next column we have 13 chunks of double the page size (8K), and so on and so forth.

This is also a way to get an idea of how fragmented your memory is, because on the leftmost side you have the smaller chunks, and on the rightmost side you have the bigger chunks of memory.

7:17, Inspecting and Optimizing Memory Usage in Linux - João Marcos Costa, Bootlin

(Another example of the /proc/buddyinfo from my laptop):

Node 0, zone      DMA      1      0      0      1      0      1      0      0      0      1      2 
Node 0, zone    DMA32      4      8      6      8      6      7      7      5      3      5    341 
Node 0, zone   Normal   4240   1274   1804   1471    746    462    210     81     24     24   5971
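
(A small user-space sketch, not from the talks: it parses /proc/buddyinfo output like the above and converts each column into bytes, using the rule that column n holds free blocks of page_size << n bytes.)

/*
 * Sketch: read /proc/buddyinfo and print, for each zone, how much free
 * memory sits in each order, plus a per-zone total.
 */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	FILE *f = fopen("/proc/buddyinfo", "r");
	char line[512];
	long page_size = sysconf(_SC_PAGESIZE);

	if (!f) {
		perror("fopen");
		return 1;
	}

	while (fgets(line, sizeof(line), f)) {
		char node[32], zone[32];
		int offset = 0, order = 0;
		long long total = 0;

		/* e.g. "Node 0, zone   Normal   28 13 8 ..." */
		if (sscanf(line, "Node %31[^,], zone %31s%n", node, zone, &offset) < 2)
			continue;

		const char *p = line + offset;
		long count;
		int consumed;

		while (sscanf(p, "%ld%n", &count, &consumed) == 1) {
			/* blocks in column 'order' are page_size << order bytes each */
			long long bytes = (long long)count * (page_size << order);

			printf("node %s zone %-8s order %2d: %6ld blocks (%lld KB)\n",
			       node, zone, order, count, bytes / 1024);
			total += bytes;
			p += consumed;
			order++;
		}
		printf("node %s zone %-8s total free: %lld KB\n\n",
		       node, zone, total / 1024);
	}
	fclose(f);
	return 0;
}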

Allocation and zones

If you are allocating memory (for DMA), ultimately you're going to end up allocating pages. That's what we use for the DMA. When allocating a big chunk of contiguous memory, you can ask the page allocator for the memory to come from a specific zone.

At startup the kernel partitions the memory into different zones in order to give some amount of granularity with regard to the location of memory allocations, and that's the best granularity we're going to get for placing memory at a specific location.

27:25, SUSE Labs Conference 2020 - DMA mapping for the Raspberry Pi 4
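
(A hypothetical kernel-module sketch, not from the talks: it asks the page allocator for an order-2 block from a specific zone by adding the GFP_DMA32 zone modifier to the GFP flags. For real device DMA, drivers would normally go through the DMA mapping API, e.g. dma_alloc_coherent(), which takes the device's addressing limits into account.)

/*
 * Sketch: request four contiguous pages (order 2) from ZONE_DMA32
 * via the page allocator, and free them again on module exit.
 */
#include <linux/module.h>
#include <linux/gfp.h>
#include <linux/mm.h>

static struct page *dma32_block;

static int __init zone_alloc_demo_init(void)
{
	/* zone modifier GFP_DMA32 restricts the allocation to ZONE_DMA32 */
	dma32_block = alloc_pages(GFP_KERNEL | GFP_DMA32, 2);
	if (!dma32_block)
		return -ENOMEM;

	pr_info("zone_alloc_demo: order-2 block mapped at %p\n",
		page_address(dma32_block));
	return 0;
}

static void __exit zone_alloc_demo_exit(void)
{
	__free_pages(dma32_block, 2);
}

module_init(zone_alloc_demo_init);
module_exit(zone_alloc_demo_exit);
MODULE_LICENSE("GPL");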

References

Also see:

  1. Physical Memory in the Linux Kernel Documentation.
  2. Hierarchical NUMA on LPC2017 and its audio recording.
  3. How the Linux kernel divides up your RAM in Chris's Wiki.

Tutorial: Linux Memory Management and Containers - Gerlof Langeveld, AT Computing (43:23)


Inspecting and Optimizing Memory Usage in Linux - João Marcos Costa, Bootlin (5:42)


SUSE Labs Conference 2020 - DMA mapping for the Raspberry Pi 4 (27:25)
