陳昭詣, 周姵彣
Porting the xv6-riscv Operating System to the RISC-V RV32I Architecture and Running it on QEMU Emulator.
(riscv32-unknown-elf-gcc) to ensure compatibility with the RV32I architecture.
(1) entry.S:
(csrr mhartid) by replacing them with RV32I-compatible instructions.
(2) printf.c and trap.c:
(3) VirtIO Driver:
(>> 32) to fit the 32-bit architecture.
(1) ACLINT/CLINT:
(2) Page Table Entries:
(3) Memory Management:
(4) File System:
(5) User Mode:
(6) Trap:
(7) Interrupt & Device Driver:
(1) Use the RISC-V toolchain (e.g., riscv32-unknown-elf-gcc) to compile the modified xv6 code into an executable kernel for the RV32I architecture.
(2) Resolve compilation errors and ensure the generated files are fully compatible with the 32-bit architecture.
(1) Run the compiled xv6 kernel on the QEMU emulator:
qemu-system-riscv32 -machine virt -kernel kernel/kernel.elf
(2) Test the kernel's core functionalities, including:
(3) Validate that the system boots successfully and identify any issues that may require further adjustments.
https://github.com/mit-pdos/xv6-riscv
https://github.com/nananapo/xv6-rv32
https://github.com/jserv/xv6-riscv
https://github.com/harihitode/ladybird-xv6/commits/ladybird/
csrr a1, mhartid
This code retrieves the hartid from the mhartid register.
li a1, 0
The hartid is set directly to 0, assuming a single-core processor environment.
The port avoids the csrr mhartid instruction: mhartid is a machine-mode (privileged) CSR, and the port treats it as unavailable in its 32-bit environment.
Setting hartid to 0 is based on the assumption of a single-core environment, where the default hardware thread ID is 0.
Replace long long and unsigned long long in the functions with int and unsigned int to adapt to a 32-bit architecture.
When porting to a 32-bit architecture, long long and unsigned long long are 64-bit data types, which are wider than the 32-bit registers of RV32I.
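The idea of formatting integers with 32-bit types only can be sketched as follows. This is a minimal illustration, not the code from the port's printf.c: the helper name `format_uint32` and its buffer-based interface are assumptions made so the example can run on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not xv6's printint): formats a 32-bit value in the
 * given base (2..16) into out, using only 32-bit division and remainder.
 * Returns the number of characters written, excluding the terminator. */
static size_t format_uint32(uint32_t x, unsigned base, char *out, size_t cap)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[32];               /* worst case: 32 binary digits */
    size_t n = 0, i = 0;

    assert(out != NULL && cap > 0 && base >= 2 && base <= 16);

    /* Produce digits least-significant first. */
    do {
        tmp[n++] = digits[x % base];
        x /= base;
    } while (x != 0);

    /* Reverse into the caller's buffer, leaving room for the terminator. */
    while (n > 0 && i + 1 < cap)
        out[i++] = tmp[--n];
    out[i] = '\0';
    return i;
}
```

Every intermediate value here fits in a 32-bit register, so no 64-bit arithmetic is required.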
Replace incompatible pointer-to-integer and integer-to-pointer conversions with uintptr_t.
Cast the unused variable trampoline_userret to void to mark it as used and avoid warnings.
Use uintptr_t to standardize conversions between pointers and integers, ensuring cross-platform compatibility.
Modify 1: features_low and features_high
features was used to represent the VirtIO device's feature fields.
The features field has been split into two 32-bit variables.
(void)features_high; is used to avoid compiler warnings, as currently only the low feature field needs to be processed.
Modify 2: Operating on features_low
The code that operated on features now operates on features_low to disable unnecessary VirtIO features.
Modify 3: Handling high and low parts of physical addresses
Use uintptr_t instead of uint64.
Cast pa_start and PGROUNDUP to uintptr_t, and then cast to char* to ensure type consistency.
Original Code (kernel/riscv.h):
(kernel/vm.c)
Modified Code(kernel/riscv.h):
(kernel/vm.c)
(1) Adjust bits and values to 32-bit.
(2) walk() function adjustment: for the memory allocation pointer returned by kalloc(), use uintptr_t and proper casting to handle 32-bit addresses.
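As a rough illustration of the uintptr_t casting described above, the sketch below rounds addresses to page boundaries without assuming a pointer width. The function names are hypothetical and only mirror the role of xv6's PGROUNDUP and PGROUNDDOWN macros; they are not taken from the port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096u

/* uintptr_t is 32 bits wide on RV32 and 64 bits wide on RV64, so the same
 * source works on both without pointer/integer size warnings. */
static uintptr_t pg_round_up(uintptr_t a)
{
    return (a + PGSIZE - 1) & ~(uintptr_t)(PGSIZE - 1);
}

static uintptr_t pg_round_down(uintptr_t a)
{
    return a & ~(uintptr_t)(PGSIZE - 1);
}

/* Hypothetical wrapper: first page-aligned byte at or after p. */
static char *first_page_at_or_after(void *p)
{
    assert(p != NULL);
    return (char *)pg_round_up((uintptr_t)p);
}
```

Going through uintptr_t keeps the integer exactly as wide as a pointer, which is the property the casts in the port rely on.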
Reason for modification:
The modifications are necessary because the original code was designed for a 64-bit architecture, whereas in a 32-bit environment, the address space, data types, and page table format are different. Without these changes, the following issues may arise:
(1) Address calculation errors (overflow or incorrect addresses).
(2) Page table entry access failures.
(3) Memory allocation and management errors.
(4) Inability to properly execute virtual memory management functions.
A CPU has multiple modes, each with different privileges. In RISC-V, there are three modes:
Mode | Overview | xv6 |
---|---|---|
Machine mode | Highest privilege level. | Startup, initialization, and timer interrupts. |
Supervisor mode | Mode in which the kernel operates. | All kernel code runs in this mode, including privileged instructions. |
User mode | Mode in which applications operate. | All user code runs in this mode. |
Opcode and operands | Overview |
---|---|
csrr rd, csr | Read from CSR |
csrw csr, rs | Write to CSR |
csrrw rd, csr, rs | Read from and write to CSR at once |
sret | Return from trap handler (restoring program counter, operation mode, etc.) |
sfence.vma | Clear Translation Lookaside Buffer (TLB) |
Name | Description |
---|---|
mhartid | Hardware thread ID. |
mstatus | Machine status register. |
mstatush | Additional machine status register, RV32 only. Contains the same fields found in bits 62:36 of the RV64 mstatus. |
mtvec | Machine trap-handler base address. |
mepc | Machine exception program counter. |
mscratch | Scratch register for machine trap handlers. |
mie | Machine interrupt-enable register. |
medeleg | Machine exception delegation register. |
medelegh | Upper 32 bits of medeleg, RV32 only. |
mideleg | Machine interrupt delegation register. |
pmpcfg0 | Physical memory protection configuration. |
pmpcfg1 | Physical memory protection configuration, RV32 only. |
pmpaddr0 | Physical memory protection address register. |
RISC-V has three types of page table:
Reference: Sv32
Sv32 uses a two-level page table to enable virtual memory. A virtual address contains two Virtual Page Numbers (VPN) and an offset. The Physical Page Number (PPN) of the leaf Page Table Entry (PTE) is combined with the offset to form the physical address.
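The address split described above can be sketched in C. The bit positions follow the Sv32 layout (VPN[1] in bits 31:22, VPN[0] in bits 21:12, offset in bits 11:0, PTE PPN in bits 31:10); the function names are illustrative and are not the macros used in the port's riscv.h. Physical addresses are held in a uint64_t here only because Sv32 can describe up to 34 physical address bits.

```c
#include <assert.h>
#include <stdint.h>

#define PGSHIFT 12
#define PTE_V   (1u << 0)   /* valid bit */

/* Split a 32-bit virtual address into its Sv32 fields. */
static uint32_t vpn1(uint32_t va)  { return (va >> 22) & 0x3FFu; }
static uint32_t vpn0(uint32_t va)  { return (va >> 12) & 0x3FFu; }
static uint32_t pgoff(uint32_t va) { return va & 0xFFFu; }

/* PTE bits 31:10 hold the 22-bit physical page number. */
static uint64_t pte_to_pa(uint32_t pte)
{
    return (uint64_t)(pte >> 10) << PGSHIFT;
}

static uint32_t pa_to_pte(uint64_t pa)
{
    assert((pa & 0xFFFu) == 0 && pa < (1ull << 34));
    return (uint32_t)(pa >> PGSHIFT) << 10;
}

/* Combine the leaf PTE's PPN with the page offset of the virtual address. */
static uint64_t leaf_to_pa(uint32_t leaf_pte, uint32_t va)
{
    return pte_to_pa(leaf_pte) | pgoff(va);
}
```

A walk uses vpn1() to index the root table, vpn0() to index the second-level table, and leaf_to_pa() on the PTE found there.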
There are three kinds of events that cause the CPU to set aside ordinary execution of instructions and force a transfer of control to special code that handles the event.
Reference: user/user.h
Reference: kernel/riscv.h & RISC-V Instruction Set Manual
stvec (Supervisor Trap Vector Base Address Register): The kernel writes the address of its trap handler here; RISC-V jumps to the address in stvec to handle a trap.
sepc (Supervisor Exception Program Counter): When a trap occurs, RISC-V saves the program counter here (since the pc is then overwritten with the value in stvec). The sret (return from trap) instruction copies sepc to the pc. The kernel can write sepc to control where sret goes.
scause (Supervisor Cause Register): RISC-V puts a number here that describes the reason for the trap.
sscratch (Supervisor Scratch Register): Temporary storage that a trap handler can use to save a register (such as the stack pointer) on trap entry and restore it later. For example, it can hold the address of the trapframe page while the hart is executing user code.
sstatus (Supervisor Status Register): The SIE (Supervisor Interrupt Enable) bit controls whether device interrupts are allowed. If the kernel clears the SIE bit, RISC-V defers device interrupts until SIE is set again. The SPP bit indicates whether a trap came from user mode or supervisor mode, and controls the mode to which sret returns.
satp (Supervisor Address Translation and Protection Register): Holds the physical page number of the root page table, together with the translation mode.
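(1) If the trap is a device interrupt and the sstatus SIE bit is clear, skip the following steps.
(2) Disable interrupts by clearing the SIE bit in sstatus.
(3) Copy the pc to sepc.
(4) Save the current mode (user or supervisor) in the SPP bit in sstatus.
(5) Set scause to reflect the trap's cause.
(6) Set the mode to supervisor.
(7) Copy stvec to the pc.
Structure of scause: if the trap was caused by an interrupt, the Interrupt field will be 1. The Exception Code field (a WLRL field) contains a code that identifies the most recent exception or interrupt.
Note that the CPU doesn't switch to the kernel page table or to a kernel stack. Kernel software must perform these tasks.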
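The hardware steps above can be modeled with a small simulation. This is a simplified sketch under stated assumptions: `struct hart` and `take_trap` are hypothetical names, only the fields mentioned in the steps are modeled, and details such as saving SIE into SPIE are left out. The bit positions (SIE = bit 1, SPP = bit 8, Interrupt = bit 31 of scause on RV32) follow the privileged specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SSTATUS_SIE      (1u << 1)
#define SSTATUS_SPP      (1u << 8)
#define SCAUSE_INTERRUPT (1u << 31)

enum mode { MODE_USER = 0, MODE_SUPERVISOR = 1 };

/* Simplified hart state: only what the trap-entry steps touch. */
struct hart {
    uint32_t pc, sstatus, sepc, scause, stvec;
    enum mode mode;
};

/* Returns false when a device interrupt is deferred because SIE is clear. */
static bool take_trap(struct hart *h, bool is_interrupt, uint32_t code)
{
    assert(h != NULL);
    if (is_interrupt && !(h->sstatus & SSTATUS_SIE))
        return false;

    h->sstatus &= ~SSTATUS_SIE;          /* disable interrupts */
    h->sepc = h->pc;                     /* remember where to resume */
    if (h->mode == MODE_SUPERVISOR)      /* record the previous mode in SPP */
        h->sstatus |= SSTATUS_SPP;
    else
        h->sstatus &= ~SSTATUS_SPP;
    h->scause = (is_interrupt ? SCAUSE_INTERRUPT : 0u) | code;
    h->mode = MODE_SUPERVISOR;
    h->pc = h->stvec;                    /* jump to the trap handler */
    return true;
}
```

Note that nothing in this model touches satp or sp, which matches the remark that page table and stack switching are left to kernel software.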
In user space, a trap may occur if the user program makes a system call (ecall), does something illegal, or if a device interrupts.
RISC-V hardware does not switch page tables when it forces a trap. This means that the user page table must include a mapping for uservec so that it can execute. In user space, stvec should hold a pointer to uservec.
Xv6 satisfies these requirements using a trampoline page which contains uservec. The trampoline page is mapped in both the user and kernel page tables.
(1) write
(2) The trap enters the handler that stvec points to.
(3) usertrap() (kernel/trap.c)
(4) usertrap() calls the function syscall().
(5) syscall()
(6) usertrapret (kernel/trap.c)
(7) usertrapret's call to userret (kernel/trampoline.S) passes TRAPFRAME in a0 and the process's user page table in a1.
(8) sret to return to user space.
struct trapframe {
/* 0 */ uint32 kernel_satp; // kernel page table
/* 4 */ uint32 kernel_sp; // top of process's kernel stack
/* 8 */ uint32 kernel_trap; // usertrap()
/* 12 */ uint32 epc; // saved user program counter
/* 16 */ uint32 kernel_hartid; // saved kernel tp
/* 20 */ uint32 ra;
/* 24 */ uint32 sp;
/* 28 */ uint32 gp;
/* 32 */ uint32 tp;
... ALL 31 general purpose register
/* 140 */ uint32 t6;
};
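The offsets used by the assembly (for example 56 for a0 and 140 for t6) can be checked at compile time. The sketch below spells out the elided registers in the order used by upstream xv6, which is an assumption about this port; the struct name `trapframe32` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed register order (taken from upstream xv6, with 4-byte fields). */
struct trapframe32 {
    uint32_t kernel_satp, kernel_sp, kernel_trap, epc, kernel_hartid; /* 0..16 */
    uint32_t ra, sp, gp, tp;                                          /* 20..32 */
    uint32_t t0, t1, t2, s0, s1;                                      /* 36..52 */
    uint32_t a0, a1, a2, a3, a4, a5, a6, a7;                          /* 56..84 */
    uint32_t s2, s3, s4, s5, s6, s7, s8, s9, s10, s11;                /* 88..124 */
    uint32_t t3, t4, t5, t6;                                          /* 128..140 */
};

/* These must agree with the constants hard-coded in trampoline.S. */
static_assert(offsetof(struct trapframe32, kernel_sp) == 4, "kernel_sp offset");
static_assert(offsetof(struct trapframe32, ra) == 20, "ra offset");
static_assert(offsetof(struct trapframe32, a0) == 56, "a0 offset");
static_assert(offsetof(struct trapframe32, t6) == 140, "t6 offset");
static_assert(sizeof(struct trapframe32) == 144, "trapframe size");
```

Compile-time checks like these catch a mismatch between the C struct and the assembly offsets when field widths change from 8 to 4 bytes.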
After ecall executes, the hart switches to supervisor mode and the program counter is set to the address that stvec points to, which is in the trampoline page (TRAMPOLINE; 0x3ffffff000 in the 64-bit xv6, an address that does not fit in 32 bits, so the RV32 port must use a lower address). Execution begins at the start of the trampoline page, which is uservec.
Set the satp register to point to the kernel page table.
Jump to usertrap() to handle the trap.
uservec:
# trap.c sets stvec to point here, so
# traps from user space start here,
# in supervisor mode, but with a
# user page table.
#
# save user a0 in sscratch so
# a0 can be used to get at TRAPFRAME.
csrw sscratch, a0
# each process has a separate p->trapframe memory area,
# but it's mapped to the same virtual address
# (TRAPFRAME) in every process's user page table.
li a0, TRAPFRAME
# save the user registers in TRAPFRAME
sw ra, 20(a0)
sw sp, 24(a0)
sw gp, 28(a0)
...
# save the user a0 in p->tf->a0
csrr t0, sscratch
sw t0, 56(a0)
# restore kernel stack pointer from p->tf->kernel_sp
lw sp, 4(a0)
# make tp hold the current hartid, from p->tf->kernel_hartid
lw tp, 16(a0)
# load the address of usertrap(), p->tf->kernel_trap
lw t0, 8(a0)
# restore kernel page table from p->tf->kernel_satp
lw t1, 0(a0)
csrw satp, t1
sfence.vma zero, zero
# a0 is no longer valid, since the kernel page
# table does not specially map p->tf.
# jump to usertrap(), which does not return
jr t0
Save a0 into sscratch.
a0 now points to TRAPFRAME, allowing operations on TRAPFRAME to be performed through the a0 register.
Each process's p->trapframe is mapped at the same virtual address (TRAPFRAME) in every process's user page table.
Save the user registers, including the user a0.
Load sp and tp from TRAPFRAME: read the kernel stack pointer and hart ID.
usertrap() handles the specific behavior corresponding to the cause of the trap.
Install the kernel trap vector (kernelvec) in the stvec register.
Read the scause register and handle traps based on their cause:
usertrap(void)
{
int which_dev = 0;
if((r_sstatus() & SSTATUS_SPP) != 0)
panic("usertrap: not from user mode");
// send interrupts and exceptions to kerneltrap(),
// since we're now in the kernel.
w_stvec((uint32)kernelvec);
struct proc *p = myproc();
// save user program counter.
p->tf->epc = r_sepc();
if(r_scause() == 8){
// system call
if(p->killed)
exit(-1);
// sepc points to the ecall instruction,
// but we want to return to the next instruction.
p->tf->epc += 4;
// an interrupt will change sstatus &c registers,
// so don't enable until done with those registers.
intr_on();
syscall();
} else if((which_dev = devintr()) != 0){
// ok
} else {
printf("usertrap(): unexpected scause %p pid=%d\n", r_scause(), p->pid);
printf(" sepc=%p stval=%p\n", r_sepc(), r_stval());
p->killed = 1;
}
if(p->killed)
exit(-1);
// give up the CPU if this is a timer interrupt.
if(which_dev == 2)
yield();
usertrapret();
}
Check the SSTATUS_SPP bit of sstatus. This bit indicates the processor mode before the trap (0: user mode, 1: supervisor mode).
The first step in returning to user space is the call to usertrapret. This function sets up the RISC-V control registers to prepare for a future trap from user space.
It also records the kernel's sp and tp values in the trapframe (kernel_sp and kernel_hartid) for the next trap.
usertrapret(void)
{
  struct proc *p = myproc();

  // turn off interrupts, since we're switching
  // now from kerneltrap() to usertrap().
  intr_off();

  // send syscalls, interrupts, and exceptions to trampoline.S
  w_stvec(TRAMPOLINE + (uservec - trampoline));

  // set up trapframe values that uservec will need when
  // the process next re-enters the kernel.
  p->tf->kernel_satp = r_satp();         // kernel page table
  p->tf->kernel_sp = p->kstack + PGSIZE; // process's kernel stack
  p->tf->kernel_trap = (uint32)usertrap;
  p->tf->kernel_hartid = r_tp();         // hartid for cpuid()

  // set up the registers that trampoline.S's sret will use
  // to get to user space.

  // set S Previous Privilege mode to User.
  unsigned long x = r_sstatus();
  x &= ~SSTATUS_SPP; // clear SPP to 0 for user mode
  x |= SSTATUS_SPIE; // enable interrupts in user mode
  w_sstatus(x);

  // set S Exception Program Counter to the saved user pc.
  w_sepc(p->tf->epc);

  // tell trampoline.S the user page table to switch to.
  uint32 satp = MAKE_SATP(p->pagetable);

  // jump to trampoline.S at the top of memory, which
  // switches to the user page table, restores user registers,
  // and switches to user mode with sret.
  uint32 fn = TRAMPOLINE + (userret - trampoline);
  ((void (*)(uint32,uint32))fn)(TRAPFRAME, satp);
}
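Two of the register values prepared in usertrapret can be computed in isolation. The sketch below is an illustration with hypothetical function names; it assumes the Sv32 satp layout from the privileged specification (MODE in bit 31, PPN in the low 22 bits) and a page-aligned 32-bit physical address for the root page table.

```c
#include <assert.h>
#include <stdint.h>

#define SSTATUS_SPIE (1u << 5)   /* interrupt enable to restore on sret */
#define SSTATUS_SPP  (1u << 8)   /* previous privilege: 0 = user */
#define SATP_SV32    (1u << 31)  /* MODE = Sv32 */

/* sstatus value that makes sret return to user mode with interrupts on. */
static uint32_t sstatus_for_user_return(uint32_t sstatus)
{
    sstatus &= ~SSTATUS_SPP;
    sstatus |= SSTATUS_SPIE;
    return sstatus;
}

/* satp value for a root page table at physical address root_pa. */
static uint32_t make_satp_sv32(uint32_t root_pa)
{
    assert((root_pa & 0xFFFu) == 0);   /* root table must be page aligned */
    return SATP_SV32 | (root_pa >> 12);
}
```

For instance, a root page table at 0x80001000 yields a satp value of 0x80080001.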
userret is assembly code located on the trampoline page, which is mapped in both the user and kernel page tables because userret switches page tables.
Before jumping here, usertrapret clears the SPP bit to select user mode and sets the SPIE bit to enable interrupts.
userret:
# userret(TRAPFRAME, pagetable)
# switch from kernel to user.
# usertrapret() calls here.
# a0: TRAPFRAME, in user page table.
# a1: user page table, for satp.
# switch to the user page table.
csrw satp, a1
sfence.vma zero, zero
# put the saved user a0 in sscratch, so we
# can swap it with our a0 (TRAPFRAME) in the last step.
lw t0, 56(a0)
csrw sscratch, t0
# restore all but a0 from TRAPFRAME
lw ra, 20(a0)
lw sp, 24(a0)
...
# restore user a0, and save TRAPFRAME in sscratch
csrrw a0, sscratch, a0
# return to user mode and user pc.
# usertrapret() set up sstatus and sepc.
sret
Xv6 handles traps from kernel code in a different way than traps from user code.
Q: How did we get into supervisor mode?
Ans: A trap occurs in user mode, and then we enter supervisor mode.
usertrap(void)
{
int which_dev = 0;
if((r_sstatus() & SSTATUS_SPP) != 0)
panic("usertrap: not from user mode");
// send interrupts and exceptions to kerneltrap(),
// since we're now in the kernel.
w_stvec((uint32)kernelvec);
...
We can see that usertrap() is already running in supervisor mode at this point, and it has set stvec to the address of kernelvec.
Because we are in supervisor mode, with the kernel page table and a kernel stack already in place, the trap handler can rely on sp and satp directly.
When a trap occurs while the CPU is executing kernel code, the stvec register points to kernelvec (located in kernel/kernelvec.S).
kernelvec saves the 31 general-purpose registers (x1 through x31) onto the stack of the interrupted kernel thread. This ensures the interrupted thread can resume execution without interference from the trap.
It then jumps to kerneltrap.
.globl kerneltrap
.globl kernelvec
.align 4
kernelvec:
// make room to save registers.
addi sp, sp, -128
// save the registers.
sw ra, 0(sp)
sw sp, 4(sp)
sw gp, 8(sp)
...
sw t6, 120(sp)
// call the C trap handler in trap.c
call kerneltrap
kerneltrap()
{
int which_dev = 0;
uint32 sepc = r_sepc();
uint32 sstatus = r_sstatus();
uint32 scause = r_scause();
if((sstatus & SSTATUS_SPP) == 0)
panic("kerneltrap: not from supervisor mode");
if(intr_get() != 0)
panic("kerneltrap: interrupts enabled");
if((which_dev = devintr()) == 0){
printf("scause %p\n", scause);
printf("sepc=%p stval=%p\n", r_sepc(), r_stval());
panic("kerneltrap");
}
// give up the CPU if this is a timer interrupt.
if(which_dev == 2 && myproc() != 0 && myproc()->state == RUNNING)
yield();
// the yield() may have caused some traps to occur,
// so restore trap registers for use by kernelvec.S's sepc instruction.
w_sepc(sepc);
w_sstatus(sstatus);
}
devintr() = 1 (UART/DISK)
devintr() = 2 (Timer)
devintr() = 0 (other)
If the trap is caused by a timer interrupt and a process's kernel thread is running (that is, the interrupted thread is a process's kernel thread rather than the scheduler thread), kerneltrap calls yield() to let other threads execute.
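The yield decision can be written as a small predicate. The enum names below are hypothetical labels for the devintr() return values listed above, and `has_proc` stands in for the `myproc() != 0` check; this is a sketch of the condition, not the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Labels for the devintr() return values: 0 other, 1 UART/disk, 2 timer. */
enum devintr_result { DEV_NONE = 0, DEV_EXTERNAL = 1, DEV_TIMER = 2 };

/* Subset of process states, enough for this check. */
enum procstate { UNUSED, SLEEPING, RUNNABLE, RUNNING };

/* True when kerneltrap should give up the CPU: a timer interrupt arrived
 * while a process's kernel thread (not the scheduler) was running. */
static bool kernel_should_yield(enum devintr_result which_dev,
                                bool has_proc, enum procstate state)
{
    assert(which_dev >= DEV_NONE && which_dev <= DEV_TIMER);
    return which_dev == DEV_TIMER && has_proc && state == RUNNING;
}
```

The scheduler thread has no current process, so the `has_proc` term is what prevents it from yielding to itself.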
Q: What happens in yield()?
Ans: The current thread gives up the CPU. Another thread is scheduled to run, and eventually the original thread resumes execution in kerneltrap.
Control then returns to kernelvec, which pops the saved registers from the stack and executes the sret instruction.
sret copies the value of sepc to the program counter (PC), so execution resumes at the address where the trap occurred.
// restore registers.
lw ra, 0(sp)
lw sp, 4(sp)
lw gp, 8(sp)
// not this, in case we moved CPUs: lw tp, 12(sp)
lw t0, 16(sp)
lw t1, 20(sp)
...
// return to whatever we were doing in the kernel.
sret
A page fault is triggered when the CPU fails to translate a virtual address to a valid physical address.
RISC-V has three different types of page faults:
(Reference: scause CSR)
Important Registers:
Xv6’s Handling of Page Faults:
User space: the kernel kills the faulting process.
Kernel space: the kernel panics.
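The policy above can be sketched as a classification on scause. The exception codes 12, 13 and 15 (instruction, load, and store/AMO page fault) come from the RISC-V privileged specification; the enum and function names are hypothetical and do not appear in xv6.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCAUSE_INTERRUPT (1u << 31)   /* top bit of scause on RV32 */

enum fault_kind { FAULT_NONE, FAULT_INSTRUCTION, FAULT_LOAD, FAULT_STORE };
enum fault_action { ACT_NOT_A_PAGE_FAULT, ACT_KILL_PROCESS, ACT_PANIC };

/* Map an scause value to the kind of page fault it reports, if any. */
static enum fault_kind page_fault_kind(uint32_t scause)
{
    if (scause & SCAUSE_INTERRUPT)
        return FAULT_NONE;            /* interrupts are never page faults */
    switch (scause) {
    case 12: return FAULT_INSTRUCTION;
    case 13: return FAULT_LOAD;
    case 15: return FAULT_STORE;      /* store or AMO page fault */
    default: return FAULT_NONE;
    }
}

/* The handling described above: kill in user space, panic in the kernel. */
static enum fault_action page_fault_action(uint32_t scause, bool from_supervisor)
{
    if (page_fault_kind(scause) == FAULT_NONE)
        return ACT_NOT_A_PAGE_FAULT;
    return from_supervisor ? ACT_PANIC : ACT_KILL_PROCESS;
}
```

Features such as copy-on-write fork or lazy allocation would replace the kill branch with code that repairs the mapping and returns.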
Goal: Share physical pages between parent and child processes to improve memory efficiency during fork.
Mechanism:
Advantages: Reduces memory usage and speeds up fork
Goal: Delay physical memory allocation until it is actually used.
Mechanism:
Advantages: Prevents memory waste for unused allocations, and reduces overhead for large memory requests.
Goal: Improve application startup times by loading memory pages only when accessed.
Mechanism:
Advantages:
Goal: Manage memory usage when the total demand exceeds physical RAM.
Mechanism:
Advantages: Enables efficient memory utilization, allowing more processes to run simultaneously.
Devices that need attention from the operating system can usually be configured to generate interrupts, which are one type of trap. The kernel trap handling code recognizes when a device has raised an interrupt and calls the driver’s interrupt handler.
Recall the Trap Roadmap. System calls, page faults, and interrupts all use the same trap mechanism, so device interrupts are dispatched in devintr (kernel/trap.c).
The following figure is from SiFive's manual for the FU540-C000 SoC.
All devices are connected to the processor, which handles device interrupts through the Platform-Level Interrupt Controller (PLIC).
As shown in the upper left corner of the figure, there are 53 different device interrupts. After these interrupts arrive at the PLIC, the PLIC routes them to the harts. The Core-Local Interruptor (CLINT) generates local interrupts.
Platform-Level Interrupt Controller (PLIC): the global interrupt controller in a RISC-V system.
Global interrupts are routed through the PLIC, which can direct interrupts to any hart in the system via the external interrupt.
Reference: PLIC.adoc
QEMU "implements" the PLIC in two devices:
Core-Local Interruptor (CLINT): Local interrupts (software and timer interrupts) are signaled directly to an individual hart with a dedicated interrupt value.
Reference: Core Local Interrupt (CLINT)
mip
kernel/start.c
void
timerinit()
{
// each CPU has a separate source of timer interrupts.
int id = r_mhartid();
// ask the CLINT for a timer interrupt.
uint32 interval = 1000000; // cycles; about 1/10th second in qemu.
*(uint64*)CLINT_MTIMECMP(id) = *(uint64*)CLINT_MTIME + interval;
// prepare information in scratch[] for timervec.
// scratch[0..3] : space for timervec to save registers.
// scratch[4] : address of CLINT MTIMECMP register.
// scratch[5] : desired interval (in cycles) between timer interrupts.
uint32 *scratch = &mscratch0[32 * id];
scratch[4] = CLINT_MTIMECMP(id);
scratch[5] = interval;
w_mscratch((uint32)scratch);
// set the machine-mode trap handler.
w_mtvec((uint32)timervec);
// enable machine-mode interrupts.
w_mstatus(r_mstatus() | MSTATUS_MIE);
// enable machine-mode timer interrupts.
w_mie(r_mie() | MIE_MTIE);
}
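On RV32, a 64-bit access to MTIME or MTIMECMP is carried out as two 32-bit accesses. The sketch below shows one common way to make that explicit, following the access pattern suggested in the RISC-V privileged specification; the function names are hypothetical, and it assumes the low word sits at the lower address. It is an alternative illustration, not the code used in timerinit().

```c
#include <stdint.h>

/* Read a 64-bit counter exposed as two 32-bit words (low word at index 0).
 * If the high word changes while the low word is read, a carry happened
 * in between, so read again. */
static uint64_t read_mtime32(const volatile uint32_t *mtime)
{
    uint32_t hi, lo;
    do {
        hi = mtime[1];
        lo = mtime[0];
    } while (mtime[1] != hi);
    return ((uint64_t)hi << 32) | lo;
}

/* Write a 64-bit compare value using 32-bit stores. The low word is first
 * set to all ones so that the intermediate value is never smaller than the
 * new one, which avoids a spurious timer interrupt between the stores. */
static void write_mtimecmp32(volatile uint32_t *mtimecmp, uint64_t value)
{
    mtimecmp[0] = 0xFFFFFFFFu;
    mtimecmp[1] = (uint32_t)(value >> 32);
    mtimecmp[0] = (uint32_t)value;
}
```

The `>> 32` here is the same high-word extraction that the ACLINT/CLINT changes in this port rely on.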
Advanced Core Local Interruptor (ACLINT). The RISC-V ACLINT specification defines a set of memory mapped devices which provide inter-processor interrupts (IPI) and timer functionalities for each HART on a multi-HART RISC-V platform.
The SiFive Core-Local Interruptor (CLINT) device has been widely adopted in the RISC-V world to provide machine-level IPI and timer functionalities.
Unfortunately, the SiFive CLINT has a unified register map for both IPI and timer functionalities, and it does not provide supervisor-level IPI functionality.
The RISC-V ACLINT specification therefore takes a more modular approach by defining separate memory mapped devices for IPI and timer functionalities.
Name | Privilege Level | Functionality |
---|---|---|
MTIMER | Machine | Fixed-frequency counter and timer events |
MSWI | Machine | Inter-processor (or software) interrupts |
SSWI | Supervisor | Inter-processor (or software) interrupts |