harunanase
Memory Paging
===

###### tags: `linux` `kernel` `memory`

Study notes on memory paging. Only paging is covered here; segmentation is not.

## Table of Contents

[TOC]

## Regular Paging

### Introduction

- Starting with the Intel 80386, pages are 4KB.
- The 32-bit linear address is divided as follows:

| Field | Description |
| :-----: | :-------: |
| Directory | the most significant 10 bits |
| Table | the intermediate 10 bits |
| Offset | the least significant 12 bits |

![](https://i.imgur.com/CxjpEY5.png)

- The physical address of the Page Directory is stored in the **cr3** register.
- Each page is 4096 bytes (2^12^).

### Page directory / table entry fields

The entries of Page Directories and Page Tables have the same structure.

![](https://i.imgur.com/40coQsO.png)

![](https://i.imgur.com/iNgDSCa.png)

NOTE: Avail. is a reserved field that the OS may fill with its own values.

1. Present Flag
    - If set, the corresponding Page Table or page is in main memory.
    - If clear, it is not in main memory, and the OS may use the remaining bits of the entry for its own purposes.
    - If a linear address whose Present flag is clear is accessed, the paging unit stores that linear address in the control register **cr2** and generates `Exception 14: The Page Fault Exception`.
2. Field containing the 20 most significant bits of a page frame physical address
    - Because the offset is 12 bits and each page is 4KB, the low 12 bits of the physical address of a page table or page frame are always 0 (that is, the page frame's physical address must be a multiple of 4096). The low 12 bits of the final address come from the offset, so the entry only stores the top 20 bits and uses the remaining bits for flags.
3. Accessed Flag
    - Set each time the paging unit addresses the corresponding page frame.
    - It can ONLY be reset by the OS (which may use this flag when selecting pages to be swapped out); the paging unit never resets it.
4. Dirty Flag
    - Applies only to Page Table entries. It is set each time a write operation is performed on the page frame. As with the Accessed flag, Dirty may be used by the operating system when selecting pages to be swapped out. The paging unit never resets this flag; this must be done by the operating system.
5. Read/Write Flag
    - If set, the corresponding page table or page can be read and written.
    - If clear, it can ONLY be READ.
6. User/Supervisor Flag
    - If clear, the page can only be accessed when "CPL" < 3 (that is, in kernel mode).
    - If set, it can always be accessed.
7. PCD and PWT Flag
    - PCD: page cache disable; specifies whether the cache must be enabled or disabled while accessing data included in the page frame.
    - PWT: page write through; specifies whether the write-back or the write-through strategy must be applied while writing data into the page frame.
    - Linux clears these two bits in page directory and page table entries, so by default caching is enabled and the write-back strategy is used.
8. Page Size Flag
    - Applies only to page directory entries.
    - If set, the entry refers to a 2MB (PAE enabled) or 4MB (extended paging, PAE disabled) page frame. It is used for "Extended Paging".
9. Global Flag
    - Applies only to Page Table entries.
    - Prevents frequently used pages from being flushed from the TLB.
    - It works only if the **PGE** (Page Global Enable) flag of register **cr4** is set.

*[CPL]: Current Privilege Level
*[PAE]: Physical Address Extension, check out the section of it.
*[TLB]: translation lookaside buffer

## Extended Paging

### Introduction

- Starting with the Intel Pentium, page frames can be 4MB instead of 4KB.
- Advantages
    1. Saves memory (no intermediate page table)
    2. Preserves TLB entries

### Structure

| Field | Description |
| :--: | :--: |
| Directory | the 10 most significant bits |
| Offset | the remaining 22 bits |

![](https://i.imgur.com/SMTgwj4.png)

### Notes

- The **Page Size** flag must be set; this is the flag described above among the page directory / table entry fields.
- Extended paging coexists with regular paging; it is enabled by setting the **PSE** flag of the **cr4** register.

## The PAE mechanism (Physical Address Extension)

### Introduction

- On the original 32-bit x86, RAM is limited to 4GB. This limit comes from the number of pins connected to the address bus.
- Starting with the Pentium Pro, Intel raised the number of address pins from 32 to 36, so the RAM limit rose from 4GB to 64GB (2^36^).
- PAE is activated by setting the Physical Address Extension (PAE) flag in the **cr4** control register (setting bit 5 of **cr4** enables PAE).
- The command `$ cpuid` can be used to check whether the processor reports PAE support.
- The Page Size flag in the page directory entry enables the large page size (2MB when PAE is enabled).

### Changes to the paging mechanism

- Physical addresses grow from 32 bits to 36 bits, which means the paging mechanism must change as well.
- What changes:
    1. The 64GB of RAM is split into 2^24^ distinct page frames, and the page frame number field is extended from 20 to 24 bits.
    2. The 24-bit page frame number plus the 12 flag bits ([see the earlier figure](#page-directory--table-entry-fields)) gives 36 bits, which no longer fits in a 32-bit entry, so each entry is extended to 64 bits. Because the entry size doubles, a 4KB PAE page directory holds 512 entries instead of 1024; the same applies to page tables.
    3. A new level called the PDPT (Page Directory Pointer Table) is introduced, see the figures below.

        ![](https://i.imgur.com/iu4Ojez.png)
        ▲ PAE enabled, page size 4KB, without extended paging

        ![](https://i.imgur.com/ZnHwBK9.png)
        ▲ PAE enabled, page size 2MB, with extended paging (Page Size flag set)

    4. **cr3** uses 27 bits to store the address of the PDPT. Because the PDPT is stored in the first 4GB of memory and is aligned to a multiple of 32 bytes (2^5^), 27 bits are enough to represent the PDPT base address.
- Although PAE extends the total amount of memory, the linear address (virtual address) is still only 32 bits, which means the same linear addresses must be REUSED to map different areas of RAM.

*[PDPT]: Page Directory Pointer Table

## Paging for 64-bit Architecture

- The paging schemes above work on 32-bit systems, but on 64-bit systems two-level paging (page directory and page table) would need far too many entries.
- Example: on a 64-bit system with 4KB pages, 64-12 = 52 bits are left for the page directory / table indices. With two levels, each level would still have 2^52/2^ = 2^26^ entries, which is too many.
- So 64-bit architectures do not actually use two-level paging. For example, x86_64 on Linux uses four levels and 48 bits (9+9+9+9+12), where the last 12 bits are the offset (that is, the page size); not all 64 bits are used.

![](https://i.imgur.com/dBrcQ9Z.png)

## Additional Terms

### TLB

- Translation Lookaside Buffer
- Translating a linear address to a physical address is slow, because it requires accessing the Page Directory / Table, and memory accesses are expensive.
- So the CPU provides a cache, the TLB, to hold these translations.
- After a linear address is translated for the first time, its physical address is stored in the TLB to speed up later lookups, because cache access is much faster than memory access.
- On a multiprocessor system, each CPU has its own TLB, and the TLBs do not need to be synchronized. Different processes run on different CPUs and may access the same linear address, but the corresponding physical addresses differ, so synchronization is unnecessary.
- When the **cr3** control register is updated, the hardware automatically invalidates the entries of the local TLB, because a new set of page tables is now in use; otherwise the TLB would still hold the old linear-to-physical mappings.

## Paging in Linux

### Introduction

Starting with kernel 2.6.11, Linux uses 4-level paging (see Paging for 64-bit Architecture). See the figure below:

![](https://i.imgur.com/scMMRqC.png)

Linux therefore splits the linear address into 5 parts, but how many bits each part has depends on the computer architecture.

#### 32-bit system

On a 32-bit system (without PAE), two-level paging is enough, so Linux sets the number of bits in the Page Upper Directory and Page Middle Directory fields to 0 (only the Page Global Directory and the Page Table are used).
Their position in the sequence of pointers is kept, however, so that the same code can run on both 32-bit and 64-bit machines.

With PAE enabled, the Page Global Directory corresponds to the 80x86 PDPT described earlier, the Page Upper Directory is eliminated, the Page Middle Directory corresponds to the 80x86 Page Directory, and the Page Table corresponds to the 80x86 Page Table.

#### 64-bit system

3- or 4-level paging is used (see Table 2-4 above); different hardware does it differently.

### Why does Linux use paging for memory management?

- Paging assigns a different physical address space to each process. Since each process has its own pages, each process gets its own memory protection.
- It distinguishes pages (groups of data) from page frames (physical addresses in main memory).

## Physical Memory Layout

The Linux kernel is installed in RAM starting from the physical address 0x00100000 (that is, after the first 1MB, starting from the second megabyte).
The loaded kernel usually takes less than 3MB.
For example, `_text` corresponds to the physical address 0x00100000.

NOTE: the numbers in the figure are page frame numbers, not addresses!

![](https://i.imgur.com/x51ilCH.png)

The symbols in the figure above are not defined in the Linux source code; they are produced while compiling the kernel. They can be listed with `$ cat /boot/System.map-$(uname -r)`.

![](https://i.imgur.com/y6Id3CM.png)
▲ Example in Ubuntu x86_64

### Why isn't the kernel loaded starting from the first megabyte (0x00000000)?

- Page frame 0 is often used by the BIOS to store the hardware configuration detected during POST (Power-On Self-Test); some laptops even write data to it after the OS has been loaded.
- The range 0x000a0000 ~ 0x000fffff is reserved and cannot be used by the OS. It is usually reserved for BIOS routines or used to map ISA graphics cards. This range, 640KB ~ (1MB-1) when converted from the addresses above, is the well-known memory hole of IBM-compatible PCs, though those are very old machines by now.

*[POST]: Power-On Self-Test
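
The address ranges above can be related to the page frame numbers shown in the figures with a short calculation. The following Python sketch assumes 4KB page frames; the helper name `page_frame_number` is hypothetical and exists only for this illustration.

```python
PAGE_SHIFT = 12                # 4KB page frames: 2**12 bytes each
PAGE_SIZE = 1 << PAGE_SHIFT


def page_frame_number(phys_addr):
    """Return the number of the 4KB page frame that contains phys_addr."""
    return phys_addr >> PAGE_SHIFT


# Reserved "memory hole": 0x000a0000 ~ 0x000fffff (640KB up to 1MB - 1)
hole_first = page_frame_number(0x000A0000)
hole_last = page_frame_number(0x000FFFFF)
hole_frames = hole_last - hole_first + 1

# Physical address where the kernel image starts (_text)
kernel_first = page_frame_number(0x00100000)

print(hex(hole_first), hex(hole_last), hole_frames, hex(kernel_first))
# prints: 0xa0 0xff 96 0x100
```

Under these assumptions the hole spans 96 page frames (384KB), and the kernel image starts at page frame 0x100, the first frame of the second megabyte.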
*[ISA]: Industry Standard Architecture

The kernel invokes BIOS procedures to find out the size of physical memory.
It then executes `machine_specific_memory_setup()`, which builds the physical address map, as shown below:

![](https://i.imgur.com/dJKTjCT.png)

If the BIOS does not provide its own map, this default setup is used, and 0x0009ffff (LOWMEMSIZE()) ~ 0x00100000 (HIGH_MEMORY) is marked as Reserved.

After `machine_specific_memory_setup()` has been called, `setup_memory()` is called to set several variables that describe the kernel memory layout, as shown below:

![](https://i.imgur.com/NvU3Kqt.png)

### Example

Using the figure (Table 2-9) as an example:

1. 0x07ff0000 ~ 0x07ff2fff stores the hardware information written by the BIOS during POST.
2. 0x07ff3000 ~ 0x07ffffff is mapped to the ROM.
3. 0xffff0000 ~ 0xffffffff is mapped by the hardware to the BIOS's ROM chip.
4. The BIOS does not report every physical address range. For example, 0x000a0000 ~ 0x000effff does not appear in the table above; Linux treats such ranges as unusable by default.

### Actual Physical Memory Layout

- < 1MB

    ![](https://i.imgur.com/wneBf7f.png)

- After 1MB

    ![](https://i.imgur.com/06gknLZ.png)

- See [Reference [4]](#Reference)

## Process Page Tables

For each process, the linear address space is divided into two parts.
Note: these are linear addresses! The discussion of physical addresses is over.

1. 0x00000000 ~ 0xbfffffff
can be accessed when the process runs in User Mode or in Kernel Mode.
2. 0xc0000000 ~ 0xffffffff
can only be accessed in Kernel Mode.

## Reference

1. Understanding the Linux Kernel, Third Edition: Daniel P. Bovet, Marco Cesati.
2. [wiki - PAE](https://en.wikipedia.org/wiki/Physical_Address_Extension)
3. [OSDev wiki - Paging](https://wiki.osdev.org/Paging)
4. [OSDev wiki - Memory Map(x86)](https://wiki.osdev.org/Memory_Map_(x86))
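
As a supplement to the Regular Paging, Paging for 64-bit Architecture, and Process Page Tables sections, the field splits described in these notes can be sketched in Python. This is only an illustration of the bit arithmetic (10/10/12 for regular paging, 9+9+9+9+12 for x86_64), not kernel code; all function names and the constant `KERNEL_START` are hypothetical and chosen for this example.

```python
KERNEL_START = 0xC0000000  # first linear address reserved for Kernel Mode (32-bit layout in these notes)


def split_regular(linear):
    """Split a 32-bit linear address into (directory, table, offset): 10/10/12 bits."""
    directory = (linear >> 22) & 0x3FF   # most significant 10 bits
    table = (linear >> 12) & 0x3FF       # intermediate 10 bits
    offset = linear & 0xFFF              # least significant 12 bits
    return directory, table, offset


def split_four_level(linear):
    """Split a 48-bit x86_64 linear address into four 9-bit indices and a 12-bit offset."""
    indices = [(linear >> shift) & 0x1FF for shift in (39, 30, 21, 12)]
    offset = linear & 0xFFF
    return (*indices, offset)


def kernel_only(linear):
    """True if a 32-bit linear address lies in the range accessible only in Kernel Mode."""
    return linear >= KERNEL_START


print(split_regular(0x08048123))   # prints: (32, 72, 291)
print(split_regular(KERNEL_START)) # prints: (768, 0, 0)
print(kernel_only(0xBFFFFFFF), kernel_only(KERNEL_START))  # prints: False True
```

With this split, the kernel-only range 0xc0000000 ~ 0xffffffff corresponds to Page Directory indices 768 through 1023, that is, the last 256 of the 1024 directory entries.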
