Bin
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee
    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Versions and GitHub Sync Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee
  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       owned this note    owned this note      
    Published Linked with GitHub
    Subscribed
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    Subscribe
    # XV6 A simple, Unix-like Teaching Operating System Chapter 2 --- > [TOC] --- ## Chapter 2 Page tables --- ![](https://i.imgur.com/G15T5zK.png) --- :::info - Paging - 由於分割法無法避免碎片的產生,只能透過分割策略改善程度的輕重,為此,發展出另一系列的技術,允許同一個程式載入不連續的記憶體空間,只要可用空間的總大小足以容納該程式即可。分頁法的原理是將 **記憶體** 劃分成相同大小的區塊,稱為 **頁框(frame)** ,然後將程式亦劃分成相同大小的區塊,稱為 **分頁(page)** ,頁框的大小和分頁的大小通常是一樣的。當程式準備執行時,它的分頁會被載入記憶體的頁框,每個分頁對應一個頁框,但不一定是要連續的頁框,分散或順序顛倒亦無妨。為了記錄程式的每個分頁是載入到記憶體的哪些頁框,所以需要一個 **分頁表(page table)** 。 - 分頁(英語:Paging),是一種作業系統裡記憶體管理的一種技術,可以使電腦的主記憶體可以使用儲存在輔助記憶體中的資料。作業系統會將輔助記憶體(通常是磁碟)中的資料分割成固定大小的區塊,稱為「頁」(pages)。當不需要時,將分頁由主記憶體(通常是記憶體)移到輔助記憶體;當需要時,再將資料取回,載入主記憶體中。相對於分段,分頁允許記憶體儲存於不連續的區塊以維持檔案系統的整齊。分頁是磁碟和記憶體間傳輸資料塊的最小單位。 - 一個Page為4096字節(Byte = 8 bit),圖中第一個Page Table 叫 Page Directory,用來存放實體位址。Offset取決於Page大小,在x64下可能有4K,2M,1G等四種,x86為4K,2M,4M。(以4K為例:Offset = 32 - 12) - 4K-page 頁面下是二級轉換結構,如上面的圖所示: 1. CR3 寄存器提供 PDT 表的基地址 2. 線性地址的PDTE index 在 PDT 表中索引得到 PDTE 3. PDTE 提供 PT 的基地址 4. PTE index 在PT 表里索引得到 PTE ,PTE 裡提供page 基地址 5. page 基地址+ offset 值得到最終的物理地址 - PDT(頁目錄表)表 - 在4K-page頁面下,整個 page-translation tables結構裡只有1個 PDT。 - PDT index有10位,那麼可索引1024個 PDTE(從0 - 1023),也就是說:PDT 可能容納1024個 PDTE - 在32位下的4K-page結構裡,每個PDTE size是 4 bytes,那麼 PDT 的大小為:1024 * 4 = 4K bytes - PDT 表基地址 - CR3寄存器指出PDT表的base物理地址,這個PDT base地址可以在物理空間的任意位置,前提是: :::warning 由於PDT 表的大小為4K bytes,因此PDT 表是在4K 邊界上,那麼CR3 給出的PDT base address 將會在4K 邊界上 ::: - 這種情況下,CR3 寄存器的低 12 位是無效的,除了其中的兩位控制著整個頁表轉換結構的cache屬性: - Bit 3 控制 PWT 屬性,PWT = 0是writeback(回寫),PWT = 1是writethrough(通寫) - Bit 4 控制 PCD 屬性,PCD = 0是cacheable(可緩衝),PCD = 1是non-cacheable(不可緩衝) ![](https://i.imgur.com/m3FYj9L.png) - 上圖說明了CR3寄存器與PDT表的關係。上圖所示:系統構造了5個Page-Translation Tables(頁轉換錶)結構,每個頁轉換錶結構中只有1個PDT表。 - 圖中已經很好的解釋了:為什麼4K-page頁面下CR3寄存器的的高20位指出PDT表的地址?答案就是:每個PDT的大小是4K,每個PDT表需要在4K邊界上,否則會出現重疊的情形出現(兩個PDT表重疊在一起,這是不可能允許出現的情況) - 由CR3得到PDT base address的物理地址低12位會被置為0,當寫CR3寄存器時,CR3寄存器低12位會被清0,除了PWT和PCD標誌位外。 ::: --- ### Paging hardware - 轉址過程: - Logical -> Virtual -> Physical :::warning 此過程皆為硬體!!! ::: - 用運用兩個硬件 - Segment Unit : Logical -> Virtua - Paging Unit : Virtual -> Physical :::info - 補充 - **實模式下的內存尋址:** 段首地址×16+偏移量=物理地址 段首地址×16:CPU將段寄存器中的數值向左偏移4-bit,放到20-bit的地址線上就成為20-bit的Base Address - **保護模式下分段機制的內存尋址:** 保護模式下分段機制是利用一個稱作段選擇符的偏移量,從而到描述符表找到需要的段描述符,而這個段描述符中就存放著真正的段的物理首地址,再加上偏移量 - **實模式:** 在bootloader接手BIOS的工作後,當前的PC系統處於實模式(16位模式)運行狀態,在這種狀態下軟件可訪問的物理內存空間不能超過1MB,且無法發揮Intel 80386以上級別的32位CPU的4GB內存管理能力。 實模式將整個物理內存看成分段的區域,程序代碼和數據位於不同區域,操作系統和用戶程序並沒有區別對待,而且每一個指針都是指向實際的物理地址。這樣,用戶程序的一個指針如果指向了操作系統區域或其他用戶程序區域,並修改了內容,那麼其後果就很可能是災難性的。通過修改A20地址線可以完成從實模式到保護模式的轉換。有關A20的進一步信息可參考附錄“關於A20 Gate”。 - **保護模式:** 只有在保護模式下,80386的全部32根地址線有效,可尋址高達4G字節的線性地址空間和物理地址空間,可訪問64TB(有2^14^ 個段,每個段最大空間為2^32^字節)的邏輯地址空間,可採用分段存儲管理機制和分頁存儲管理機制。這不僅為存儲共享和保護提供了硬件支持,而且為實現虛擬存儲提供了硬件支持。通過提供4個特權級和完善的特權檢查機制,既能實現資源共享又能保證代碼數據的安全及任務的隔離。 ::: --- ### Process address space ![](https://i.imgur.com/ymts65w.png) --- ### Code: creating an address space - kvmalloc 創建並切換到一個擁有內核運行所需的 KERNBASE 以上映射的 page - setupkvm 完成 kvmalloc 大多數的工作 - mappages 建立內核需要的映射 - walkpgdir 找到該地址對應的PTE地址 ### - Structure ```c= // This table defines the kernel's mappings, which are present in // every process's page table. static struct kmap { void *virt; uint phys_start; uint phys_end; int perm; } kmap[] = { { (void*)KERNBASE, 0, EXTMEM, PTE_W}, // I/O space { (void*)KERNLINK, V2P(KERNLINK), V2P(data), 0}, // kern text+rodata { (void*)data, V2P(data), PHYSTOP, PTE_W}, // kern data+memory { (void*)DEVSPACE, DEVSPACE, 0, PTE_W}, // more devices }; // Memory layout #define EXTMEM 0x100000 // Start of extended memory #define PHYSTOP 0xE000000 // Top physical memory #define DEVSPACE 0xFE000000 // Other devices are at high addresses // Key addresses for address space layout (see kmap in vm.c for layout) #define KERNBASE 0x80000000 // First kernel virtual address #define KERNLINK (KERNBASE+EXTMEM) // Address where kernel is linked ``` ### - kvmalloc() Kernel page table ```c= // File:vm.c // Allocate one page table for the machine for the kernel address // space for scheduler processes. void kvmalloc(void) { kpgdir = setupkvm(); switchkvm(); } ``` - 在這裡 kpgdir 的型態是 pde_t *,是一個 uint,因此是 32 bit 如下圖接到一個 32 bit 的 pgdir,相當於 4K 的大小當目錄存所有的 Table。 - 放入 CR3 ![](https://i.imgur.com/I2ZgRjU.png) ### - setupkvm() ```c= // Set up kernel part of a page table. pde_t* setupkvm(void) { pde_t *pgdir; struct kmap *k; if((pgdir = (pde_t*)kalloc()) == 0) return 0; memset(pgdir, 0, PGSIZE); if (p2v(PHYSTOP) > (void*)DEVSPACE) panic("PHYSTOP too high"); for(k = kmap; k < &kmap[NELEM(kmap)]; k++) if(mappages(pgdir, k->virt, k->phys_end - k->phys_start, (uint)k->phys_start, k->perm) < 0) return 0; return pgdir; } ``` - kalloc 一次要 4K Bytes - 程式中的 PHYSTOP,DEVSPACE 對應著上圖 Figure 2-2 ```c= #define PHYSTOP 0xE000000 // Top physical memory #define DEVSPACE 0xFE000000 // Other devices are at high addresses ``` ### - mappages() ```c= // Create PTEs for virtual addresses starting at va that refer to // physical addresses starting at pa. va and size might not // be page-aligned. static int mappages(pde_t *pgdir, void *va, uint size, uint pa, int perm) { char *a, *last; pte_t *pte; a = (char*)PGROUNDDOWN((uint)va); last = (char*)PGROUNDDOWN(((uint)va) + size - 1); for(;;){ if((pte = walkpgdir(pgdir, a, 1)) == 0) return -1; if(*pte & PTE_P) panic("remap"); *pte = pa | perm | PTE_P; if(a == last) break; a += PGSIZE; pa += PGSIZE; } return 0; } ``` ### - walkpgdir() ```=c // Return the address of the PTE in page table pgdir // that corresponds to virtual address va. If alloc!=0, // create any required page table pages. static pte_t * walkpgdir(pde_t *pgdir, const void *va, int alloc) { pde_t *pde; pte_t *pgtab; pde = &pgdir[PDX(va)]; if(*pde & PTE_P){ pgtab = (pte_t*)p2v(PTE_ADDR(*pde)); } else { if(!alloc || (pgtab = (pte_t*)kalloc()) == 0) return 0; // Make sure all those PTE_P bits are zero. memset(pgtab, 0, PGSIZE); // The permissions here are overly generous, but they can // be further restricted by the permissions in the page table // entries, if necessary. *pde = v2p(pgtab) | PTE_P | PTE_W | PTE_U; } return &pgtab[PTX(va)]; } ``` ### - switchkvm() ```c= // Switch h/w page table register to the kernel-only page table, // for when no process is running. void switchkvm(void) { lcr3(v2p(kpgdir)); // switch to the kernel page table } ``` --- ### Physical memory allocation - Kernel 在運行時必須為以下物件分配 Physical Memory - Page Tables - Process User Memory - Kernel Stacks - Pipe Buffers --- ### Code: Physical memory allocator - 相關文件 - kalloc.c - mmu.h ```c= // Physical memory allocator, intended to allocate // memory for user processes, kernel stacks, page table pages, // and pipe buffers. Allocates 4096-byte pages. void freerange(void *vstart, void *vend); extern char end[]; // first address after kernel loaded from ELF file struct run { struct run *next; }; struct { struct spinlock lock; int use_lock; struct run *freelist; } kmem; ``` ### - 分配器的初始化 kinit1(), kinit2(). ```c= // File:kalloc // Initialization happens in two phases. // 1. main() calls kinit1() while still using entrypgdir to place just // the pages mapped by entrypgdir on free list. // 2. main() calls kinit2() with the rest of the physical pages // after installing a full page table that maps them on all cores. void kinit1(void *vstart, void *vend) { initlock(&kmem.lock, "kmem"); kmem.use_lock = 0; freerange(vstart, vend); } void kinit2(void *vstart, void *vend) { freerange(vstart, vend); kmem.use_lock = 1; } ``` ### - freerange() ```c= void freerange(void *vstart, void *vend) { char *p; p = (char*)PGROUNDUP((uint)vstart); for(; p + PGSIZE <= (char*)vend; p += PGSIZE) kfree(p); } ``` ### - kfree() ```c= //PAGEBREAK: 21 // Free the page of physical memory pointed at by v, // which normally should have been returned by a // call to kalloc(). (The exception is when // initializing the allocator; see kinit above.) void kfree(char *v) { struct run *r; if((uint)v % PGSIZE || v < end || v2p(v) >= PHYSTOP) panic("kfree"); // Fill with junk to catch dangling refs. memset(v, 1, PGSIZE); if(kmem.use_lock) acquire(&kmem.lock); r = (struct run*)v; r->next = kmem.freelist; kmem.freelist = r; if(kmem.use_lock) release(&kmem.lock); } ``` --- ## 20170825 Meeting - 不熟悉 - 暫存器必須找到其英文全名,並了解每個暫存器功用。 - Context Switch 的理解有誤,必須再去重看。 --- ### User part of an address space ![](https://i.imgur.com/aKxbHII.png) --- ### Code: exec - 相關文件 - elf.h - exec.c - exec 是創建地址空間中用戶部分的系統調用。它根據文件系統中保存的某個文件來初始化用戶部分。(第六章會說明) :::warning 在這部份是關於 ELF 檔案的讀取方式,應該不用多做改變。 ::: ### - Structure ```c= // Format of an ELF executable file #define ELF_MAGIC 0x464C457FU // "\x7FELF" in little endian // File header struct elfhdr { uint magic; // must equal ELF_MAGIC uchar elf[12]; ushort type; ushort machine; uint version; uint entry; uint phoff; uint shoff; uint flags; ushort ehsize; ushort phentsize; ushort phnum; ushort shentsize; ushort shnum; ushort shstrndx; }; // Program section header struct proghdr { uint type; uint off; uint vaddr; uint paddr; uint filesz; uint memsz; uint flags; uint align; }; // Values for Proghdr type #define ELF_PROG_LOAD 1 // Flag bits for Proghdr flags #define ELF_PROG_FLAG_EXEC 1 #define ELF_PROG_FLAG_WRITE 2 #define ELF_PROG_FLAG_READ 4 ``` - exec - 用 namei 打開 elf 二進位文件 - 檢查格式並計算大小 - 設定參數 並對齊4 - Size 加上一個 Page (因為 Stack 大小是一個 Page) - 分配 Program 需要的記憶體 - 將 Program 讀取至主記憶體 - 初始化 Stack - 詳細資訊參考 User Stack 資訊圖 - 最後將 elf.entry (程式進入點) 放至 eip 中,eip 是記錄現在程式執行到哪裡,即將會執行的地方。 ```c= int exec(char *path, char **argv) { char *s, *last; int i, off; uint argc, sz, sp, ustack[3+MAXARG+1]; struct elfhdr elf; struct inode *ip; struct proghdr ph; pde_t *pgdir, *oldpgdir; if((ip = namei(path)) == 0) return -1; ilock(ip); pgdir = 0; // Check ELF header if(readi(ip, (char*)&elf, 0, sizeof(elf)) < sizeof(elf)) goto bad; if(elf.magic != ELF_MAGIC) goto bad; if((pgdir = setupkvm()) == 0) goto bad; // Load program into memory. sz = 0; for(i=0, off=elf.phoff; i<elf.phnum; i++, off+=sizeof(ph)){ if(readi(ip, (char*)&ph, off, sizeof(ph)) != sizeof(ph)) goto bad; if(ph.type != ELF_PROG_LOAD) continue; if(ph.memsz < ph.filesz) goto bad; if((sz = allocuvm(pgdir, sz, ph.vaddr + ph.memsz)) == 0) goto bad; if(loaduvm(pgdir, (char*)ph.vaddr, ip, ph.off, ph.filesz) < 0) goto bad; } iunlockput(ip); ip = 0; // Allocate two pages at the next page boundary. // Make the first inaccessible. Use the second as the user stack. sz = PGROUNDUP(sz); if((sz = allocuvm(pgdir, sz, sz + 2*PGSIZE)) == 0) goto bad; clearpteu(pgdir, (char*)(sz - 2*PGSIZE)); sp = sz; // Push argument strings, prepare rest of stack in ustack. for(argc = 0; argv[argc]; argc++) { if(argc >= MAXARG) goto bad; sp = (sp - (strlen(argv[argc]) + 1)) & ~3; if(copyout(pgdir, sp, argv[argc], strlen(argv[argc]) + 1) < 0) goto bad; ustack[3+argc] = sp; } ustack[3+argc] = 0; ustack[0] = 0xffffffff; // fake return PC ustack[1] = argc; ustack[2] = sp - (argc+1)*4; // argv pointer sp -= (3+argc+1) * 4; if(copyout(pgdir, sp, ustack, (3+argc+1)*4) < 0) goto bad; // Save program name for debugging. for(last=s=path; *s; s++) if(*s == '/') last = s+1; safestrcpy(proc->name, last, sizeof(proc->name)); // Commit to the user image. oldpgdir = proc->pgdir; proc->pgdir = pgdir; proc->sz = sz; proc->tf->eip = elf.entry; // main proc->tf->esp = sp; switchuvm(proc); freevm(oldpgdir); return 0; bad: if(pgdir) freevm(pgdir); if(ip) iunlockput(ip); return -1; } ``` --- ### Real world ---

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully