Linux Paging - VA to PA

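This note records how to translate a Virtual Address into a Physical Address by hand, using QEMU + gdb.

It covers the key points of the theory first and then walks through the hands-on experiment.

Both the theory and the experiment are based on Linux 4.17.0 on x86-64.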

Paging

  • Based on the AMD manual, see Ref 1

(The figures for this section come from the AMD manual.)

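  • A VA actually uses only 48 bits; the remaining upper bits are sign extension
    • 9 bits PML4I (Page-Map Level-4 Index)
    • 9 bits PDPI (Page Directory Pointer Index)
    • 9 bits PDI (Page Directory Index)
    • 9 bits PTI (Page Table Index)
    • 12 bits Physical Page Offset
      • "Index" is the more fitting term for the four fields above; the figure in the AMD manual labels them "Offset"
  • The base page size on x86-64 is 4 KB, so the Physical Page Offset needs 12 bits
  • With four-level paging (PML4, PDP, PD, PT)
    • Each table occupies 1 page (4 KB)
    • Each entry is 8 bytes
    • So each table holds 512 entries
    • So the entry index into each table needs 9 bits
  • The VA to PA translation roughly goes as follows
    • First, read the Page-Map Level-4 Table Base from CR3
    • Entry number PML4I of the PML4 table holds the base of the next-level table, the PDP
    • Entry number PDPI of the PDP table holds the base of the next-level table, the PD
    • Entry number PDI of the PD table holds the base of the next-level table, the PT
    • Entry number PTI of the PT holds the Physical Page Frame Base
    • Physical Page Frame Base + Physical Page Offset completes the translation
  • Translating a 2 MB page works the same way, just with one fewer table level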
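
The bit layout above can be sketched in a few lines of Python. This is only an illustration, not part of the original experiment; the names `split_va`, `is_canonical` and the `large_page` flag are made up for this example.

```python
def split_va(va, large_page=False):
    """Split a canonical x86-64 virtual address into its paging fields."""
    fields = {
        "pml4i": (va >> 39) & 0x1FF,  # bits 47:39
        "pdpi": (va >> 30) & 0x1FF,   # bits 38:30
        "pdi": (va >> 21) & 0x1FF,    # bits 29:21
    }
    if large_page:
        # 2 MB page: there is no PT level, the low 21 bits are the offset
        fields["offset"] = va & 0x1FFFFF
    else:
        fields["pti"] = (va >> 12) & 0x1FF  # bits 20:12
        fields["offset"] = va & 0xFFF       # bits 11:0
    return fields


def is_canonical(va):
    """Bits 63:47 must all be copies of bit 47 (sign extension)."""
    top = va >> 47
    return top == 0 or top == 0x1FFFF


print(split_va(0xFFFFFFFF8111C398))
print(is_canonical(0xFFFFFFFF8111C398))
```

Each 9-bit index selects one of the 512 entries in a table, which is why the shifts step down by 9 bits per level.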

CR3

  • CR3 means different things in different x86 modes; here we assume long mode
  • The layout of CR3 is documented in the AMD manual (Ref 1)

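  • The Page-Map Level-4 Table Base Address field starts at bit 12
    • This follows from the 4 KB page size: the table is 4 KB aligned, so the low 12 bits of its address are always zero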
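
As a small sketch of that point, the table base can be recovered from a raw CR3 value by masking off the low 12 bits (which hold flags such as PWT and PCD) and the bits above the physical address field. The mask name `ADDR_MASK` and the helper `pml4_base` are hypothetical, and the mask assumes the architectural maximum of 52 physical address bits.

```python
# Bits 51:12 hold a 4 KB aligned physical address (assumes a 52-bit maximum).
ADDR_MASK = 0x000FFFFFFFFFF000


def pml4_base(cr3):
    """Return the physical base address of the PML4 table stored in CR3."""
    return cr3 & ADDR_MASK


# 0x7a7e000 is the CR3 value used later in this note.
print(hex(pml4_base(0x7A7E000)))
```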

PML4 Entry

  • PML4 (Page Map Level-4)
  • In the Linux kernel this level is called PGD (Page Global Directory)
  • The entry format is the same for 2 MB and 4 KB pages

PDP Entry

  • PDP (Page Directory Pointer)
  • In the Linux kernel this level is called PUD (Page Upper Directory)
  • The entry format is the same for 2 MB and 4 KB pages

PD Entry

  • PD (Page Directory)
  • In the Linux kernel this level is called PMD (Page Middle Directory)
  • The entry format differs between 2 MB and 4 KB pages
  • Bit 7 (the PS bit) tells the two cases apart

PT Entry

  • PT (Page Table)
  • The Linux kernel uses the same name for this level
  • The 2 MB and 4 KB page cases differ here
    • With 2 MB pages this table level does not exist

Page-Translation-Table Entry Fields

  • P (Present)
  • R/W (Read/Write)
  • U/S (User/Supervisor)
  • PWT (Page-level WriteThrough)
  • PCD (Page-level Cache Disable)
  • A (Accessed)
  • D (Dirty)
  • PS (Page Size)
  • G (Global Page)
  • AVL (AVaiLable to software)
  • PAT (Page-Attribute Table)
  • MPK (Memory Protection Key)
  • NX (No eXecute)
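
To make the field list concrete, here is a rough decoder for the single-bit flags. The bit positions follow the common x86-64 entry layout; the table `FLAG_BITS` and the function `decode_flags` are names invented for this sketch. The multi-bit AVL and MPK fields are left out, and bit 7 is reported as PS even though in a 4 KB PT entry the same bit is PAT.

```python
# Single-bit flags and their bit positions in a page-translation-table entry.
FLAG_BITS = {
    "P": 0, "R/W": 1, "U/S": 2, "PWT": 3, "PCD": 4,
    "A": 5, "D": 6, "PS": 7, "G": 8, "NX": 63,
}


def decode_flags(entry):
    """Return the names of the flag bits that are set in a raw 64-bit entry."""
    return [name for name, bit in FLAG_BITS.items() if (entry >> bit) & 1]


# The three entries read during the walk later in this note.
for raw in (0x1816067, 0x1817063, 0x10001E1):
    print(hex(raw), decode_flags(raw))
```

Not every bit is meaningful at every level (for example, D is ignored in entries that point to another table), so the output is a raw view of the bits rather than an interpretation.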

gdb-pt-dump

Memory Region

  • See Ref 8
Virtual memory map with 4 level page tables:

0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
hole caused by [47:63] sign extension
ffff800000000000 - ffff87ffffffffff (=43 bits) guard hole, reserved for hypervisor
ffff880000000000 - ffffc7ffffffffff (=64 TB) direct mapping of all phys. memory
ffffc80000000000 - ffffc8ffffffffff (=40 bits) hole
ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space
ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole
ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
... unused hole ...
ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
... unused hole ...
				    vaddr_end for KASLR
fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
fffffe8000000000 - fffffeffffffffff (=39 bits) LDT remap for PTI
ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
... unused hole ...
ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
... unused hole ...
ffffffff80000000 - ffffffff9fffffff (=512 MB)  kernel text mapping, from phys 0
ffffffffa0000000 - fffffffffeffffff (1520 MB) module mapping space
[fixmap start]   - ffffffffff5fffff kernel-internal fixmap range
ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI
ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
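
A quick way to use this map is to look up which region an address falls into. The sketch below copies only a handful of the ranges listed above and assumes the default, non-randomized layout; `REGIONS` and `region_of` are hypothetical names.

```python
# (start, end, description) for a subset of the ranges in the map above.
REGIONS = [
    (0x0000000000000000, 0x00007FFFFFFFFFFF, "user space"),
    (0xFFFF880000000000, 0xFFFFC7FFFFFFFFFF, "direct mapping of all phys. memory"),
    (0xFFFFC90000000000, 0xFFFFE8FFFFFFFFFF, "vmalloc/ioremap space"),
    (0xFFFFFFFF80000000, 0xFFFFFFFF9FFFFFFF, "kernel text mapping"),
    (0xFFFFFFFFA0000000, 0xFFFFFFFFFEFFFFFF, "module mapping space"),
]


def region_of(va):
    """Return the description of the region containing va, or None."""
    for start, end, name in REGIONS:
        if start <= va <= end:
            return name
    return None


print(region_of(0xFFFFFFFF8111C398))
```

The address examined in the next section lands in the kernel text mapping, which the map describes as starting "from phys 0".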

Translate VA to PA

  • The experiment runs an x86-64 Linux kernel under QEMU
  • Pick an arbitrary memory address to work on

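  • In this example, the Virtual Address 0xffffffff8111c398 is the address we are interested in
  • Let's work out its Physical Address
  • Then check that the value stored at the PA matches the value stored at the VA, which verifies the result
  • First, decompose the VA, assuming four-level paging for now
    • 0xffffffff8111c398
    • Convert it to binary and add separators between the fields
    • 0b1111111111111111 | 111111111 | 111111110 | 000001000 | 100011100 | 001110011000

Field    Binary            Decimal
PML4I    0b111111111       511
PDPI     0b111111110       510
PDI      0b000001000       8
PTI      0b100011100       284
Offset   0b001110011000    920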

  • Read CR3, which holds the PML4 Base Address; note that this is a PA
    • 0x7a7e000
  • The address of the corresponding PML4 Entry is
    • 0x7a7e000 + 511 * 8 = 0x7a7eff8
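
The same arithmetic as a tiny helper, since it repeats at every level. `ENTRY_SIZE` and `entry_addr` are names made up for this sketch. The gdb command in the comment is one plausible way to read the entry through the QEMU monitor and is an assumption, not a transcript of the original session.

```python
ENTRY_SIZE = 8  # every page-table entry is 8 bytes


def entry_addr(table_base, index):
    """Physical address of entry number `index` in the table at `table_base`."""
    return table_base + index * ENTRY_SIZE


pml4e_addr = entry_addr(0x7A7E000, 511)
print(hex(pml4e_addr))
# In gdb this could then be read with something like:
#   monitor xp/gx 0x7a7eff8
```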

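  • gdb's monitor command passes the rest of the line to the remote gdbserver (here, the one built into QEMU)
  • xp is a command provided by QEMU
    • Physical memory dump starting at addr.
  • For the commands QEMU provides, see Ref 7
  • The PML4 Entry reads 0x1816067
    • Clearing the low 12 flag bits gives the base of the next-level table, 0x1816000
  • The address of the corresponding PDP Entry is
    • 0x1816000 + 510 * 8 = 0x1816ff0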
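
Going from an entry to the next table is a mask plus the same index arithmetic. The sketch below assumes a 52-bit physical address field; `ADDR_MASK` and `next_table_base` are hypothetical names, and the Present check is an addition that the manual walk above does implicitly.

```python
ADDR_MASK = 0x000FFFFFFFFFF000  # bits 51:12 of an entry
PRESENT = 1 << 0                # P bit


def next_table_base(entry):
    """Strip the flag bits from an entry to get the next-level table base."""
    if not entry & PRESENT:
        raise ValueError("entry is not present")
    return entry & ADDR_MASK


pdp_base = next_table_base(0x1816067)
pdpe_addr = pdp_base + 510 * 8
print(hex(pdp_base), hex(pdpe_addr))
```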

  • The PDP Entry reads 0x1817063
    • So the base of the next-level table is 0x1817000
  • The address of the corresponding PD Entry is
    • 0x1817000 + 8 * 8 = 0x1817040

  • The PD Entry reads 0x10001e1
    • Bit 7 (the PS bit) is 1, so this entry maps a 2 MB page
    • We therefore drop the earlier four-level assumption and use only three levels

Field    Binary                     Decimal
PML4I    0b111111111                511
PDPI     0b111111110                510
PDI      0b000001000                8
Offset   0b100011100001110011000    1164184

  • The Physical Page Base Address is 0x1000000
  • The PA corresponding to the VA is
    • 0x1000000 + 1164184 = 0x111c398
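
The whole walk can be replayed offline as a sanity check. In this sketch, `PHYS_MEM` is a hypothetical stand-in for guest physical memory that contains only the three entries read above, and `walk` is an illustrative name. It handles 4 KB and 2 MB pages only; 1 GB pages (PS set in the PDP entry) are not handled.

```python
# The three page-table entries observed in this note, keyed by physical address.
PHYS_MEM = {
    0x7A7EFF8: 0x1816067,  # PML4 entry 511
    0x1816FF0: 0x1817063,  # PDP entry 510
    0x1817040: 0x10001E1,  # PD entry 8 (PS = 1, maps a 2 MB page)
}

MASK_4K = 0x000FFFFFFFFFF000  # bits 51:12
MASK_2M = 0x000FFFFFFFE00000  # bits 51:21
PRESENT = 1 << 0
PS = 1 << 7


def read_entry(table_base, index, mem):
    """Read one 8-byte entry and make sure it is present."""
    entry = mem[table_base + index * 8]
    if not entry & PRESENT:
        raise ValueError("entry is not present")
    return entry


def walk(cr3, va, mem=PHYS_MEM):
    """Translate va to a physical address by walking the tables in mem."""
    pml4e = read_entry(cr3 & MASK_4K, (va >> 39) & 0x1FF, mem)
    pdpe = read_entry(pml4e & MASK_4K, (va >> 30) & 0x1FF, mem)
    pde = read_entry(pdpe & MASK_4K, (va >> 21) & 0x1FF, mem)
    if pde & PS:
        # 2 MB page: the PD entry holds the page base, low 21 bits are the offset
        return (pde & MASK_2M) + (va & 0x1FFFFF)
    pte = read_entry(pde & MASK_4K, (va >> 12) & 0x1FF, mem)
    return (pte & MASK_4K) + (va & 0xFFF)


print(hex(walk(0x7A7E000, 0xFFFFFFFF8111C398)))
```

With a real target, the dictionary lookup would be replaced by a read of guest physical memory, for example through the QEMU monitor.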

  • Now look at the contents at the VA

  • They are indeed identical, so the translation is correct

Related CTF Challenge

  • midnightCTF-2021 Brohammer
    • Flip the U/S bit of a page-table entry
    • so that user mode can read a region it normally cannot
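
As a rough illustration of what setting the U/S bit means at the bit level, the sketch below patches the entries from the walk above. It is not the exploit from the challenge; `make_user_accessible` and `user_can_access` are invented names. User-mode access requires U/S to be set in the entry at every level of the walk, so a single cleared bit anywhere blocks it.

```python
US_BIT = 1 << 2  # U/S: 1 = user-mode access allowed


def make_user_accessible(entry):
    """Return the entry with the U/S bit set."""
    return entry | US_BIT


def user_can_access(entries):
    """U/S must be set in the entry at every level of the walk."""
    return all(entry & US_BIT for entry in entries)


chain = [0x1816067, 0x1817063, 0x10001E1]  # PML4E, PDPE, PDE from above
print(user_can_access(chain))
patched = [make_user_accessible(e) for e in chain]
print([hex(e) for e in patched], user_can_access(patched))
```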

Ref

  1. https://www.amd.com/system/files/TechDocs/24593.pdf
  2. https://francescoquaglia.github.io/TEACHING/AOS/SLIDES/kernel-level-memory-management.pdf
  3. https://0xax.gitbooks.io/linux-insides/content/Theory/linux-theory-1.html
  4. https://www.cnblogs.com/muahao/p/10297852.html
  5. https://lwn.net/Articles/717293/
  6. https://github.com/martinradev/gdb-pt-dump/
  7. https://qemu.readthedocs.io/en/latest/system/monitor.html
  8. https://github.com/torvalds/linux/blob/v4.17/Documentation/x86/x86_64/mm.txt
  9. https://hxp.io/blog/82/Midnightsun-CTF-2021-Brohammer/
  10. https://xz.aliyun.com/t/7625