
RISC-V Hypervisor

徐崇智, 邱家浩, 林育丞

Goals

  • Explore RISC-V hypervisor behavior using the riscv_hypervisor repository.
  • Update the QEMU and GNU toolchain versions.
  • Run multiple xv6 instances on QEMU.
  • Document all encountered issues and troubleshooting steps.

What's QEMU

QEMU (Quick Emulator) is a free and open-source machine emulator and virtualizer that leverages dynamic binary translation to emulate a computer's processor. It allows operating systems and applications built for one architecture to run on another by translating binary code during runtime.
QEMU offers an extensive range of hardware and device models for virtual machines, supporting the emulation of various architectures, including x86, ARM, PowerPC, RISC-V, and more.

Environment Setting

Environment: Ubuntu 24.04.1 LTS

NOTE: watch out for hardcoded absolute paths (QEMU and GNU toolchain).

Install QEMU 9.2.0

$ wget https://download.qemu.org/qemu-9.2.0.tar.xz
$ tar -xvf qemu-9.2.0.tar.xz
$ cd qemu-9.2.0

Here we bump the QEMU version to 9.2.0.

To support the RISC-V architecture required for running the xv6 operating system, we specify --target-list=riscv64-softmmu when configuring QEMU.

$ sudo apt install make (if not already installed)
$ sudo apt install libpixman-1-dev
$ ./configure --target-list=riscv64-softmmu
$ make
$ sudo make install

Install and Compile riscv-gnu-toolchain

Create a new folder for the RISC-V toolchain, and install the standard packages needed to build the toolchain.

$ cd ~
$ mkdir riscv-tools
$ cd riscv-tools
$ git clone https://github.com/riscv/riscv-gnu-toolchain
$ sudo apt-get install autoconf automake autotools-dev curl python3 python3-pip python3-tomli libmpc-dev libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev ninja-build git cmake libglib2.0-dev libslirp-dev

The build defaults to targeting RV64imafdc with glibc. Then, simply run the following commands:

$ cd riscv-gnu-toolchain
$ ./configure
$ sudo make (this may take a long time)

Install RISC-V Hypervisor

$ cd ~
$ git clone https://github.com/krizmanmarko/riscv_hypervisor
$ cd riscv_hypervisor
$ git submodule update --init --recursive

Build xv6-riscv-guest on QEMU

$ cd src/guest/xv6-riscv-guest
$ make qemu
(base) neat@neat:~/riscv_hypervisor/src/guest/xv6-riscv-guest$ make kernel/kernel fs.img
/home/marko/shit/riscv-gnu-toolchain-from-source/mybuild/bin/riscv64-unknown-elf-gcc    -c -o kernel/entry.o kernel/entry.S
make: /home/marko/shit/riscv-gnu-toolchain-from-source/mybuild/bin/riscv64-unknown-elf-gcc: No such file or directory
make: *** [<builtin>: kernel/entry.o] Error 127

When I ran make kernel/kernel fs.img, I encountered this error. It's worth noting that the path to the tool riscv64-unknown-elf-gcc points to a strange location: /home/marko/shit/riscv-gnu-toolchain-from-source/mybuild/bin/riscv64-unknown-elf-gcc. This looks like a hard-coded personal path from someone else's machine.

To resolve the issue, I edited the Makefile located at /home/neat/riscv_hypervisor/src/guest/xv6-riscv-guest/Makefile. In this file, I modified the TOOLPREFIX and QEMU variables to point to the correct locations on my system:

TOOLPREFIX = /usr/local/bin/riscv64-unknown-elf-

# ......

QEMU = /home/neat/qemu-8.0.2/build/qemu-system-riscv64

Build hypervisor

Before we build the hypervisor, we need to run the relevant make commands to generate the required files, such as xv6.bin, keygrab.bin, and printer.bin.

Remember to change the Makefile in ~/riscv_hypervisor/src/guest/keygrab as well:

CROSS_COMPILE = /usr/local/bin/riscv64-unknown-elf-

$ cd ~/riscv_hypervisor/src/guest/keygrab
$ make
# produces keygrab.bin

$ cd ../printer
$ make
# produces printer.bin

When I ran the command make, I encountered the following error:

(base) neat@NEAT-LAB:~/riscv_hypervisor/src/guest/keygrab$ make
[+] driver/plic.c -> build/driver/plic.o
[+] driver/uart.c -> build/driver/uart.o
[+] core/main.S -> build/core/main_asm.o
[+] Successfuly built build/keygrab!
make: ctags: No such file or directory
make: *** [Makefile:57: tags] Error 127

This error indicates that the ctags command is missing on the system. Run the following commands to install it:

$ sudo apt update
$ sudo apt install universal-ctags
(base) neat@NEAT-LAB:~/riscv_hypervisor/src/guest/keygrab$ make
[i] created tags
(base) neat@NEAT-LAB:~/riscv_hypervisor/src/guest/printer$ make
[+] core/main.S -> build/core/main_asm.o
[+] Successfuly built build/printer!
[i] created tags

xv6.bin was already produced in the Build xv6-riscv-guest step.

$ cd ../imgs
$ ./refresh.sh

The refresh.sh script performs the following actions:

  1. Deletes all .bin files in the current directory (guest/imgs).
  2. Searches for .bin files starting from the parent directory using find .. -name '*.bin'.
  3. Copies the found .bin files to the current directory.

So, you should now see your imgs folder structured as follows:

imgs/
├── xv6.bin
├── keygrab.bin
├── printer.bin
└── refresh.sh

Finally, we can build the RISC-V hypervisor with the following commands:

$ cd ~/riscv_hypervisor/src
$ make
Result:
[+] core/main.c -> ../build/core/main.o
[+] core/trap.c -> ../build/core/trap.o
[+] core/vm_run.c -> ../build/core/vm_run.o
[+] driver/cpu.c -> ../build/driver/cpu.o
[+] driver/pci.c -> ../build/driver/pci.o
[+] driver/plic.c -> ../build/driver/plic.o
[+] driver/uart.c -> ../build/driver/uart.o
[+] guest/vm_config.c -> ../build/guest/vm_config.o
[+] lib/bits.c -> ../build/lib/bits.o
[+] lib/lock.c -> ../build/lib/lock.o
[+] lib/printf.c -> ../build/lib/printf.o
[+] lib/sbi.c -> ../build/lib/sbi.o
[+] lib/string.c -> ../build/lib/string.o
[+] mem/kmem.c -> ../build/mem/kmem.o
[+] mem/vmem.c -> ../build/mem/vmem.o
[+] virtual/vcpu.c -> ../build/virtual/vcpu.o
[+] virtual/vplic.c -> ../build/virtual/vplic.o
[i] created structs_in_asm.h
[+] core/sboot.S -> ../build/core/sboot_asm.o
[+] core/trap.S -> ../build/core/trap_asm.o
[+] lib/spinlock.S -> ../build/lib/spinlock_asm.o
[+] Successfuly built ../build/hypervisor!

Launch xv6 on QEMU

$ cd ~/riscv_hypervisor/
$ make kernel

When I ran the command make kernel, I encountered the following error:

(base) neat@neat:~/riscv_hypervisor$ make kernel 
qemu-system-riscv64: -chardev socket,id=pciserial1,host=127.0.0.1,port=1337,server=on: info: QEMU waiting for connection on: disconnected:tcp:127.0.0.1:1337,server=on
qemu-system-riscv64: -drive file=src/guest/xv6-riscv-guest/fs.img,if=none,format=raw,id=x0: Could not open 'src/guest/xv6-riscv-guest/fs.img': No such file or directory
make: *** [Makefile:3: kernel] Error 1

The issue seems to arise because we built only one instance of the xv6 OS, while two image files are referenced at the same time. We are still missing the file fs2.img.

We went back to the Build xv6-riscv-guest step and ran make kernel/kernel fs2.img.

Expected Output (On QEMU 8.0.2):

After successfully booting xv6, we should see the following:

# OpenSBI Boot Information
Booting!
xv6 kernel is booting

init: starting sh
$

Observed Output (Bumped QEMU to 9.2.0):

Following the QEMU update, the output unexpectedly changed to:

Booting!
xv6 kernel is booting

It appears that an issue occurred during initialization, preventing the system from booting successfully.

GDB Observation

We used GDB to find out where the program gets stuck during execution.
We discovered that it hangs right after the w_satp() call inside kvminithart(), which is invoked early from main().

// Switch h/w page table register to the kernel's page table,
// and enable paging.
void
kvminithart()
{
  // wait for any previous writes to the page table memory to finish.
  sfence_vma();

  uint64 satp_value = MAKE_SATP(kernel_pagetable);
  w_satp(satp_value);

  // flush stale entries from the TLB.
  sfence_vma();

  printf("Exiting kvminithart\n");  // debug print added for tracing
}

The purpose of w_satp():
In RISC-V, the satp CSR holds the physical page number of the root page table, the ASID, and the address-translation mode; in xv6 the mode is Sv39.

  • satp format:
 |63  60|59  44|43                   0|
 +------+------+----------------------+
 | mode | ASID | physical page number |  
 +------+------+----------------------+ 
  • Sv39 virtual address format:
|38    30|29    21|20    12|11          0|
+--------+--------+--------+-------------+ 
| VPN[2] | VPN[1] | VPN[0] | page offset | 
+--------+--------+--------+-------------+

Once w_satp() is executed, the CPU immediately switches to using the new Page Table for instruction fetching and data access translation.

Therefore, immediately after w_satp(), when the CPU fetches the next instruction from memory, it uses the new Page Table for address resolution.
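
For reference, xv6 constructs the satp value and writes the CSR with helpers roughly like the ones below (paraphrased from xv6-riscv's kernel/riscv.h; the guest fork used here may differ in small details, and uint64 is xv6's own typedef):

// satp mode field: the value 8 in bits 63:60 selects Sv39 translation.
#define SATP_SV39 (8L << 60)

// Build a satp value: Sv39 mode plus the physical page number of the
// root page table (its physical address shifted right by 12).
#define MAKE_SATP(pagetable) (SATP_SV39 | (((uint64)pagetable) >> 12))

// Write the satp CSR; from this instruction onward the CPU translates
// every fetch and access through the new page table.
static inline void
w_satp(uint64 x)
{
  asm volatile("csrw satp, %0" : : "r" (x));
}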

In our case, we observed a crash right after switching to the new page table. To investigate, we attached GDB and examined several key registers and CSRs:

(gdb) p/x $stval
$1 = 0x80100f56

(gdb) p/x $scause
$2 = 0xc

According to the RISC-V privileged specification, scause = 0xc indicates an Instruction Page Fault: the CPU failed to fetch an instruction at the address recorded in stval, 0x80100f56.

From the privileged spec (Volume II):
If stval is written with a nonzero value when an instruction access-fault or page-fault exception occurs on a system with variable-length instructions, then stval will contain the virtual address of the portion of the instruction that caused the fault, while sepc will point to the beginning of the instruction.

This strongly hinted that our newly switched page table disallowed execution at that address.

After observing the Instruction Page Fault and noticing it occurred right after switching to the new page table, we suspected that the CPU considered the instruction's virtual address "non-executable".

Typically, this could be because:

  1. The PTE lacks the Execute (X) Permission, or
  2. The PTE has A = 0 (Accessed bit cleared) and the CPU does not automatically update it.

In many RISC-V kernels (like xv6), kernel code pages are mapped with X permission, so scenario (1) is less likely. Hence, scenario (2), the CPU failing to auto-update the A bit, became the prime suspect.
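
To make the two scenarios concrete, the relevant flag bits of a leaf PTE can be inspected like this (a minimal sketch using the standard Sv39 PTE bit layout; the PTE_* masks and check_pte are ours for illustration, not code from the repository):

#include <stdio.h>
#include <stdint.h>

// Standard RISC-V PTE flag bits (xv6 defines a subset of these in kernel/riscv.h).
#define PTE_V (1ULL << 0)   // valid
#define PTE_X (1ULL << 3)   // executable
#define PTE_A (1ULL << 6)   // accessed
#define PTE_D (1ULL << 7)   // dirty

// Print the bits relevant to an instruction page fault.
static void check_pte(uint64_t pte)
{
    printf("V=%d X=%d A=%d D=%d\n",
           !!(pte & PTE_V), !!(pte & PTE_X),
           !!(pte & PTE_A), !!(pte & PTE_D));
    if (!(pte & PTE_X))
        printf("scenario (1): page is not executable\n");
    else if (!(pte & PTE_A))
        printf("scenario (2): A bit clear; faults when hardware A/D updating is disabled\n");
}

int main(void)
{
    // Hypothetical PTE for the faulting kernel text page: valid and executable,
    // but with the A bit still clear.
    check_pte(PTE_V | PTE_X);
    return 0;
}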

To verify this, we inspected two hypervisor-mode configuration registers:

Machine Environment Configuration Register (menvcfg):

(figure: menvcfg register bit layout)

Hypervisor Environment Configuration Register (henvcfg):

(figure: henvcfg register bit layout)

QEMU 8.0.2:

(gdb) p/x $menvcfg
$1 = 0xa0000000000000f0

(gdb) p/x $henvcfg
$2 = 0xa000000000000000

Under QEMU 8.0.2, henvcfg was 0xa000000000000000, indicating that bit 61 (ADUE) was set, so the CPU would automatically update the A and D bits.

By contrast, on QEMU 9.2.0, we observed:
QEMU 9.2.0:

(gdb) p/x $menvcfg
$1 = 0x80000000000000f0

(gdb) p/x $henvcfg
$2 = 0x8000000000000000

This means bit 61 was not set at reset time.
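
The difference is easy to confirm by masking bit 61 of the two observed values (a small illustrative check; ENVCFG_ADUE is just a local name for the bit the spec calls ADUE and older QEMU calls HADE):

#include <stdio.h>
#include <stdint.h>

// Hardware A/D-bit update enable, bit 61 of menvcfg/henvcfg (ADUE, formerly HADE).
#define ENVCFG_ADUE (1ULL << 61)

int main(void)
{
    uint64_t henvcfg_802 = 0xa000000000000000ULL;  // value observed on QEMU 8.0.2
    uint64_t henvcfg_920 = 0x8000000000000000ULL;  // value observed on QEMU 9.2.0

    printf("8.0.2: ADUE=%d\n", !!(henvcfg_802 & ENVCFG_ADUE));  // prints 1
    printf("9.2.0: ADUE=%d\n", !!(henvcfg_920 & ENVCFG_ADUE));  // prints 0
    return 0;
}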

In the QEMU source file target/riscv/cpu.c, we found the key difference between the two versions.

//===============8.0.2===============//
static void riscv_cpu_reset_hold(Object *obj)
{
    env->menvcfg = (cpu->cfg.ext_svpbmt ? MENVCFG_PBMTE : 0) |
                   (cpu->cfg.ext_svadu ? MENVCFG_HADE : 0);
    env->henvcfg = (cpu->cfg.ext_svpbmt ? HENVCFG_PBMTE : 0) |
                   (cpu->cfg.ext_svadu ? HENVCFG_HADE : 0);
}
//===============9.2.0===============//
static void riscv_cpu_reset_hold(Object *obj, ResetType type)
{
    env->menvcfg = (cpu->cfg.ext_svpbmt ? MENVCFG_PBMTE : 0) |
                   (!cpu->cfg.ext_svade && cpu->cfg.ext_svadu ?
                    MENVCFG_ADUE : 0);
    env->henvcfg = 0;
}

HADE was renamed to ADUE in QEMU commit <ed67d637>

In 9.2.0, henvcfg is explicitly zeroed out instead of automatically setting bit 61.
A related QEMU commit <148189ff> states:

The hypervisor should decide what it wants to enable.
Zero all configuration enable bits on reset.

Hence, we had to explicitly enable ADUE in source file src/core/vm_run.c:

// v8.0.2: only bit 63 needed to be set explicitly
CSRS(henvcfg, 1ULL << 63);

// changed for v9.2.0: also set bit 61 (ADUE) so the CPU keeps updating
// the A/D bits during guest (VS-stage) address translation
CSRS(henvcfg, (1ULL << 63) | (1ULL << 61));

After modifying the code above, we were able to successfully boot multiple xv6 instances on the new version of QEMU.

Reference