Implement Linux userspace RV32 emulation for RVVM

Computer Architecture
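  • RVVM provides an ELF loader, but it may be incomplete, especially for ELF32 executables. Try to get the prebuilt executables of rv32emu to load in RVVM. You may run into technical problems along the way; record them and, where appropriate, report them through GitHub Issues.
  • Improve the robustness of the RVVM ELF loader. For example, an ELF executable should still load and run after being stripped; if it does not, modify the RVVM code to meet this requirement.
  • Once RVVM can load the prebuilt executables provided by rv32emu (Quake and Doom excluded), compare the performance of RVVM (RV32 + JIT) with that of rv32emu (w/ JIT). Note that you may need to add system calls to RVVM, otherwise time-related system calls cannot be handled.
  • Record in detail the problems you encounter and describe how you resolved them.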

Trace the relevant code of the ELF loader

To find out whether the ELF loader is complete, it is necessary to trace the relevant code of the ELF loader.

bool elf_load_file(rvfile_t* file, elf_desc_t* elf)
{
    uint8_t tmp[64];
    WRAP_ERR(file && elf, "Invalid arguments to elf_load_file()");
    WRAP_ERR(rvread(file, tmp, 64, 0) == 64, "Failed to read ELF header");
    WRAP_ERR(read_uint32_le_m(tmp) == 0x464c457F, "Not an ELF file");
    WRAP_ERR(tmp[4] == 1 || tmp[4] == 2, "Invalid ELF class");
    WRAP_ERR(tmp[5] == 1, "Not a little-endian ELF");

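The code above shows that the ELF file is read with rvread and verified with read_uint32_le_m, which compares the first four bytes against the ELF magic number. The next two checks accept both ELF classes (1 for 32-bit, 2 for 64-bit) but only little-endian files. Besides, the commit message below suggests that the ELF loader of RVVM was written with userspace (userland) loading in mind. Therefore, the first step is to build RVVM.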

elf_load: Userland ELF loading, use VMA helpers

  • Implement mapping an ELF into userspace VMA
  • Relocatable ELF_DYN
  • Relocate elf->entry & elf->phdr against elf->base
  • Fix ELF interpreter parsing

Build & Run RVVM

According to the RVVM guide, running RVVM requires three things: fw_jump.bin from OpenSBI, a Linux kernel image, and a root filesystem. First, fw_jump.bin can be taken from a release build of OpenSBI (OpenSBI - release). After downloading the release, fw_jump.bin is found under the share directory.

Configure & Build Linux kernel

To build the Linux kernel we should configure and build the GNU Toolchain first.

$  git clone https://github.com/riscv-collab/riscv-gnu-toolchain.git --recursive
$  cd riscv-gnu-toolchain
$  ./configure --prefix=/opt/riscv32-linux --with-arch=rv32imac --enable-linux
$  sudo make linux

After building the GNU toolchain, add the following environment variables to ~/.bashrc so that the kernel build can find the cross compiler.

$  nano ~/.bashrc
...
export PATH=$PATH:/opt/riscv32-linux/bin
export CROSS_COMPILE=/opt/riscv32-linux/bin/riscv32-unknown-linux-gnu-
export ARCH=riscv
$  source ~/.bashrc

Then download the Linux source from GitHub, check out the version you want, and build the Linux kernel image.

$  git clone https://github.com/torvalds/linux
$  cd linux
$  git checkout v5.18
$  make ARCH=riscv CROSS_COMPILE=riscv32-unknown-linux-gnu- -j$(nproc)
...
LD [M]  drivers/virtio/virtio_dma_buf.ko
LD [M]  fs/efivarfs/efivarfs.ko
LD [M]  fs/nls/nls_iso8859-1.ko
Kernel: arch/riscv/boot/Image.gz is ready

Create root filesystem image

The preparation is not finished yet; the final step is to create the root filesystem. The root filesystem is the first filesystem mounted when a computer or a virtual machine boots, and it usually contains the directories below.

root
├──bin/
├──dev/
├──sbin/
├──proc/
├──tmp/
├──var/
├──lib/
├──etc/
└──mnt/

In this case, I use Buildroot and BusyBox to build the root filesystem. I download Buildroot first, then take the Buildroot .config file and the BusyBox .config file from semu to build the root filesystem image. To choose the output format of the root filesystem image, use the following command.

$  make menuconfig


Then run make; the resulting rootfs.ext2 root filesystem image is placed in the output directory.

Debug & Run

With the preparation complete, run RVVM by following the README.md.

$  make
$  cd release.linux.riscv
$  ./rvvm_riscv fw_jump.bin -kernel ../linux_rv32 -rv32 -image ../rootfs.ext2 -mem 1G -smp 2 -res 1280x720

Then an error occurs: the kernel panics because it cannot mount the root filesystem.

RVVM/release.linux.riscv$ ./rvvm_riscv fw_jump.bin -kernel ../linux_rv32 -rv32 -image ../rootfs.ext2
...
[    0.297044] NET: Registered PF_PACKET protocol family
[    0.302607] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
[    0.306567] ALSA device list:
[    0.307740]   No soundcards found.
[    0.310877] VFS: Cannot open root device "nvme0n1" or unknown-block(0,0): error -6
[    0.313292] Please append a correct "root=" boot option; here are the available partitions:
[    0.315810] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[    0.318519] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.1-rvvm #1
[    0.319944] Hardware name: RVVM v0.6-16e4574 (DT)
[    0.321245] Call Trace:
[    0.322520] [<c00032c6>] dump_backtrace+0x1a/0x22
[    0.323994] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]---

I had no idea where the error came from, so I asked the developer of the project, who suggested using a specific kernel config file that was written for riscv64. I took that config file and adjusted it for RV32.

$  make menuconfig

In the Base ISA entry of the menu, I change the selection from CONFIG_ARCH_RV64I to CONFIG_ARCH_RV32I, then build another Linux kernel image from this configuration.

./rvvm_riscv fw_jump.bin -kernel ../Image -rv32 -image ../rootfs.ext2 -mem 1G -smp 2 -res 1280x720

Finally, RVVM can run successfully.

...
[    1.012844] kallsyms_selftest: start
[    1.018321] Run /sbin/init as init process
[    1.164654] EXT4-fs (nvme0n1): re-mounted 2ade8330-a00b-457a-b9ad-40a1f03277c1. Quota mode: disabled.
[    2.648478] kallsyms_selftest:  ---------------------------------------------------------
[    2.650191] kallsyms_selftest: | nr_symbols | compressed size | original size | ratio(%) |
[    2.651839] kallsyms_selftest: |---------------------------------------------------------|
[    2.653424] kallsyms_selftest: |      24927 |        271546   |       461124  |  58.88   |
[    2.655542] kallsyms_selftest:  ---------------------------------------------------------
[    2.960294] kallsyms_selftest: kallsyms_lookup_name() looked up 24927 symbols
[    2.961988] kallsyms_selftest: The time spent on each symbol is (ns): min=900, max=511600, avg=11656
[    2.968950] kallsyms_selftest: kallsyms_on_each_symbol() traverse all: 5326200 ns
[    2.970850] kallsyms_selftest: kallsyms_on_each_match_symbol() traverse all: 35500 ns
[    2.972932] kallsyms_selftest: finish


Load ELF file

What is an ELF file? ELF is short for Executable and Linkable Format, a common standard file format for executable files, object code, shared libraries and core dumps. It usually consists of three parts: the ELF header, the program header table and the section header table. The ELF header holds general information about the file, the program header table describes the segments used at run time, and the section header table lists the sections.


ELF header

The ELF header layout differs between RV32 (ELF32) and RV64 (ELF64), but the first fields, up to and including the entry point offset, are identical. From the entry point onward the addresses and file offsets are 4 bytes wide in ELF32 and 8 bytes wide in ELF64, so the later fields shift.

| Field | RV32 offset | RV64 offset | Content |
|---|---|---|---|
| e_ident[0..3] | 0x00 | 0x00 | 0x7f 0x45 0x4c 0x46 (magic number) |
| e_ident[4] | 0x04 | 0x04 | class: 1 for RV32, 2 for RV64 |
| e_ident[5] | 0x05 | 0x05 | 1: little endian, 2: big endian |
| e_ident[6] | 0x06 | 0x06 | 1 |
| e_ident[7] | 0x07 | 0x07 | ABI number (e.g. Linux = 3) |
| e_ident[8] | 0x08 | 0x08 | ABI version |
| e_ident[9..15] | 0x09 | 0x09 | unused |
| e_type | 0x10 | 0x10 | object file type (e.g. executable file = 2) |
| e_machine | 0x12 | 0x12 | ISA |
| e_version | 0x14 | 0x14 | 1 |
| e_entry | 0x18 | 0x18 | entry point |
| e_phoff | 0x1c | 0x20 | pointer to the start of the program header table |
| e_shoff | 0x20 | 0x28 | pointer to the start of the section header table |
| e_flags | 0x24 | 0x30 | depends on the target architecture |
| e_ehsize | 0x28 | 0x34 | size of this header |
| e_phentsize | 0x2a | 0x36 | size of a program header table entry |
| e_phnum | 0x2c | 0x38 | number of entries in the program header table |
| e_shentsize | 0x2e | 0x3a | size of a section header table entry |
| e_shnum | 0x30 | 0x3c | number of entries in the section header table |
| e_shstrndx | 0x32 | 0x3e | index of the section header entry that holds the section names |
| (end) | 0x34 | 0x40 | end of ELF header |
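
The ELF32 column of the layout above can be turned into a small decoder. The following is only a sketch: elf32_hdr_info_t, rd16, rd32 and parse_elf32_header are hypothetical names made up for this note and are not part of RVVM, and only little-endian ELF32 files are handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical container for the ELF32 header fields discussed in this note.
typedef struct {
    uint16_t type, machine;
    uint32_t entry, phoff, shoff;
    uint16_t phentsize, phnum, shentsize, shnum, shstrndx;
} elf32_hdr_info_t;

// Assemble little-endian values byte by byte, independent of host endianness.
static uint16_t rd16(const uint8_t* p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Decode an ELF32 header. Returns false if the buffer is shorter than the
// 52-byte header or is not a little-endian ELF32 image.
static bool parse_elf32_header(const uint8_t* buf, size_t len, elf32_hdr_info_t* out)
{
    assert(buf != NULL && out != NULL);
    if (len < 0x34) return false;                  // header ends at 0x34
    if (rd32(buf) != 0x464c457F) return false;     // 0x7f 'E' 'L' 'F'
    if (buf[4] != 1 || buf[5] != 1) return false;  // class 1 (32-bit), little endian
    out->type      = rd16(buf + 0x10);
    out->machine   = rd16(buf + 0x12);
    out->entry     = rd32(buf + 0x18);
    out->phoff     = rd32(buf + 0x1c);
    out->shoff     = rd32(buf + 0x20);
    out->phentsize = rd16(buf + 0x2a);
    out->phnum     = rd16(buf + 0x2c);
    out->shentsize = rd16(buf + 0x2e);
    out->shnum     = rd16(buf + 0x30);
    out->shstrndx  = rd16(buf + 0x32);
    return true;
}
```

For a RISC-V executable the decoded machine field is 243, which is the value of EM_RISCV.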

Program header table

The ELF header contains a pointer to the start of the program header table. The program header table describes the segments of the file, which is mostly the information needed at execution time. As with the ELF header, the layout differs slightly between RV32 and RV64.

| Field | RV32 offset | RV64 offset | Content |
|---|---|---|---|
| p_type | 0x00 | 0x00 | Type of the segment |
| p_flags | 0x18 | 0x04 | Segment-dependent flags |
| p_offset | 0x04 | 0x08 | Offset of the segment in the file image |
| p_vaddr | 0x08 | 0x10 | Virtual address where it should be loaded |
| p_paddr | 0x0c | 0x18 | Physical address where it should be loaded |
| p_filesz | 0x10 | 0x20 | Size in the file |
| p_memsz | 0x14 | 0x28 | Size in memory |
| p_align | 0x1c | 0x30 | Alignment |
| (end) | 0x20 | 0x38 | End of the program header |
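
The main practical difference between the two layouts is that the flags field moves from the end of the entry (ELF32) to right after the type (ELF64). A minimal sketch of a decoder that handles both, assuming little-endian input; elf_phdr_info_t, ph_rd32, ph_rd64 and parse_phdr are hypothetical names, not RVVM functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical container wide enough for both ELF32 and ELF64 entries.
typedef struct {
    uint32_t type, flags;
    uint64_t offset, vaddr, paddr, filesz, memsz, align;
} elf_phdr_info_t;

static uint32_t ph_rd32(const uint8_t* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t ph_rd64(const uint8_t* p)
{
    return (uint64_t)ph_rd32(p) | ((uint64_t)ph_rd32(p + 4) << 32);
}

// Decode one program header entry. The caller must provide at least
// 32 bytes for ELF32 (class64 == false) or 56 bytes for ELF64.
static void parse_phdr(const uint8_t* p, bool class64, elf_phdr_info_t* out)
{
    assert(p != NULL && out != NULL);
    out->type = ph_rd32(p);
    if (class64) {
        out->flags  = ph_rd32(p + 0x04);
        out->offset = ph_rd64(p + 0x08);
        out->vaddr  = ph_rd64(p + 0x10);
        out->paddr  = ph_rd64(p + 0x18);
        out->filesz = ph_rd64(p + 0x20);
        out->memsz  = ph_rd64(p + 0x28);
        out->align  = ph_rd64(p + 0x30);
    } else {
        out->offset = ph_rd32(p + 0x04);
        out->vaddr  = ph_rd32(p + 0x08);
        out->paddr  = ph_rd32(p + 0x0c);
        out->filesz = ph_rd32(p + 0x10);
        out->memsz  = ph_rd32(p + 0x14);
        out->flags  = ph_rd32(p + 0x18);
        out->align  = ph_rd32(p + 0x1c);
    }
}
```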

Load elf file as a root file

According to the previous section, RVVM is started with fw_jump.bin, a Linux kernel and a root filesystem. Let us focus on the root filesystem: it contains a program that the kernel runs as the initial process of the machine, as the kernel message shows.

[    0.794604] Run /sbin/init as init process

The message shows that /sbin/init is the first program to be loaded. So, instead of building the root filesystem with BusyBox and Buildroot, I build a root filesystem that contains only the ELF file I want to run.

$ dd if=/dev/zero of=root.ext2 bs=1M count=10
$ mkfs.ext2 root.ext2
$ sudo mkdir -p /mnt/root
$ sudo mount root.ext2 /mnt/root
$ sudo mkdir -p /mnt/root/sbin
$ sudo cp hello.elf /mnt/root/sbin/init
$ sudo umount /mnt/root

The commands above, which are based on an RVVM issue (RVVM - issue), create a filesystem that contains only the ELF file. Then run RVVM, and the ELF file is loaded as the initial program of the machine.

2024-01-07 22-04-22 screenshot

After loading the ELF file as the init program of RVVM, a question comes up: can the ELF file be loaded in some other way?

Trace elf load file code

To figure out how the ELF file is loaded, the loading code must be traced. After reading it, I have to say it does not look like a typical ELF loader: the ELF loader of RVVM provides only what is essential for execution. What is essential? As the struct elf_desc_t below shows, base and buf_size are the address and the size of the buffer the ELF file is loaded into. Apart from these, elf_desc_t only reports the entry point, the path of the ELF interpreter (the dynamic linker), and the address and number of entries of the program header table.

typedef struct {
    // Pass a buffer for objcopy, NULL for userland loading
    // Receive base ELF address for userland
    void*  base;
    // Objcopy buffer size
    size_t buf_size;

    // Various loaded ELF info
    size_t entry;
    char*  interp_path;
    size_t phdr;
    size_t phnum;
} elf_desc_t;

With readelf you can look up the same information in the file itself: the segment addresses, the dynamic linking information, and the program header entries.

$ readelf -a hello.elf
...
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOPROC+0x3     0x001049 0x00000000 0x00000000 0x0001a 0x00000 R   0x1
  LOAD           0x001000 0x00000000 0x00000000 0x00049 0x00049 R E 0x1000
...
There is no dynamic section in this file.
...

The program header table describes the segments of the file; here the LOAD segment holds the machine code of the program.

bool bin_objcopy(rvfile_t* file, void* buffer, size_t size, bool try_elf)
{
    uint8_t mag[4] = {0};
    if (try_elf && rvread(file, mag, 4, 0) == 4 && read_uint32_le_m(mag) == 0x464c457F) {
        elf_desc_t elf = {
            .base = buffer,
            .buf_size = size,
        };
        if (elf_load_file(file, &elf)) return true;
    }
    return rvread(file, buffer, size, 0);
}

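bin_objcopy first checks whether the file is an ELF file by reading the magic number 0x464c457F. If it is, the file is loaded with elf_load_file; otherwise, or if ELF loading fails, the raw file contents are copied into the buffer.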
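
The constant 0x464c457F is simply the four identification bytes 0x7F 'E' 'L' 'F' read as one little-endian 32-bit word. A minimal sketch of that check, where read_u32_le is a hypothetical stand-in for RVVM's read_uint32_le_m and is_elf_magic is a name made up for this note:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for read_uint32_le_m(): build a little-endian
// 32-bit value from four bytes, independent of host endianness.
static uint32_t read_u32_le(const uint8_t* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// True if the buffer starts with the bytes 0x7F 'E' 'L' 'F'.
static bool is_elf_magic(const uint8_t* buf, size_t len)
{
    assert(buf != NULL);
    return len >= 4 && read_u32_le(buf) == 0x464c457F;
}
```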

bool elf_load_file(rvfile_t* file, elf_desc_t* elf)
{
    uint8_t tmp[64];
    WRAP_ERR(file && elf, "Invalid arguments to elf_load_file()");
    WRAP_ERR(rvread(file, tmp, 64, 0) == 64, "Failed to read ELF header");
    WRAP_ERR(read_uint32_le_m(tmp) == 0x464c457F, "Not an ELF file");
    WRAP_ERR(tmp[4] == 1 || tmp[4] == 2, "Invalid ELF class");
    WRAP_ERR(tmp[5] == 1, "Not a little-endian ELF");

    // Parse ELF header
    bool objcopy = !!elf->base;
    bool class64 = (tmp[4] == 2);
    uint16_t elf_type = read_uint16_le_m(tmp + 16);
    uint64_t elf_entry = class64 ? read_uint64_le_m(tmp + 24) : read_uint32_le_m(tmp + 24);
    uint64_t elf_phoff = class64 ? read_uint64_le_m(tmp + 32) : read_uint32_le_m(tmp + 28);
    //uint64_t elf_shoff = class64 ? read_uint64_le_m(tmp + 40) : read_uint32_le_m(tmp + 32);
    size_t elf_phnsz = class64 ? 56 : 32;
    size_t elf_phnum = read_uint16_le_m(tmp + (class64 ? 56 : 44));
    ...
    for (size_t i=0; i<elf_phnum; ++i) {
        uint64_t elf_phent_off = elf_phoff + (elf_phnsz * i);
        WRAP_ERR(rvread(file, tmp, elf_phnsz, elf_phent_off) == elf_phnsz, "Failed to read ELF phent");
        uint32_t p_type = read_uint32_le_m(tmp);
        uint64_t p_offset = class64 ? read_uint64_le_m(tmp + 8) : read_uint32_le_m(tmp + 4);
        uint64_t p_vaddr = class64 ? read_uint64_le_m(tmp + 16) : read_uint32_le_m(tmp + 8);
        uint64_t p_fsize = class64 ? read_uint64_le_m(tmp + 32) : read_uint32_le_m(tmp + 16);
        uint64_t p_memsz = class64 ? read_uint64_le_m(tmp + 40) : read_uint32_le_m(tmp + 20);
        //uint32_t p_flags = class64 ? read_uint32_le_m(tmp + 4) : read_uint32_le_m(tmp + 24);
        if (p_type == ELF_PT_LOAD || p_type == ELF_PT_PHDR) {
            // Load ELF program segment or PHDR segment
            if (objcopy) {
                p_vaddr -= elf_loaddr;
                WRAP_ERR(p_vaddr + p_memsz <= elf->buf_size, "ELF does not fit in objcopy buffer");
            }
            void* vaddr = ((uint8_t*)elf->base) + p_vaddr;
            if (!objcopy) {
                WRAP_ERR(vma_alloc(vaddr, p_memsz, VMA_RDWR | VMA_FIXED) == vaddr, "Failed to allocate ELF VMA");
            }
            WRAP_ERR(rvread(file, vaddr, p_fsize, p_offset) == p_fsize, "Failed to read ELF segment");
        }
        if (p_type == ELF_PT_INTERP && !objcopy && !elf->interp_path) {
            // Get ELF interpreter path
            WRAP_ERR(p_fsize < 1024, "ELF interpreter path is too long");
            elf->interp_path = safe_new_arr(char, p_fsize + 1);
            WRAP_ERR(rvread(file, elf->interp_path, p_fsize, p_offset) == p_fsize, "Failed to read ELF interp_path");
        }
    }
    return true;
}

elf_load_file clearly does support loading ELF32 files: whenever class64 is false, every field is read at its ELF32 offset. The first part of the function, up to the computation of elf_phnum, parses the ELF header, and the loop that follows walks the program headers and maps the file contents into the virtual memory of RVVM. So why is it not a typical ELF loader? Because the function does not provide information such as the ELF header, the program header table or the symbols to its caller. Therefore it cannot read the symbol information of an ELF file that has not been stripped.

Support reading elf file with symbol table

The difference between stripped file and not stripped file

To handle both stripped and unstripped files, we must know the difference between them: a stripped file lacks the symbol table.

$ file hello.elf
hello.elf: ELF 32-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, not stripped
$ readelf -s hello.elf

Symbol table '.symtab' contains 13 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 00000000     0 SECTION LOCAL  DEFAULT    1 
     2: 0000003c     0 SECTION LOCAL  DEFAULT    2 
     3: 00000000     0 SECTION LOCAL  DEFAULT    3 
     4: 00000000     0 FILE    LOCAL  DEFAULT  ABS hello.o
     5: 0000005d     0 NOTYPE  LOCAL  DEFAULT  ABS SYSEXIT
     6: 00000040     0 NOTYPE  LOCAL  DEFAULT  ABS SYSWRITE
     7: 0000003c     0 NOTYPE  LOCAL  DEFAULT    2 str
     8: 0000000d     0 NOTYPE  LOCAL  DEFAULT  ABS str_size
     9: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $xrv32i2p1
    10: 00000010     0 NOTYPE  LOCAL  DEFAULT    1 loop
    11: 00000030     0 NOTYPE  LOCAL  DEFAULT    1 end
    12: 00000000     0 NOTYPE  GLOBAL DEFAULT    1 _start
    
$ file pi.elf
pi.elf: ELF 32-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, stripped
$ readelf -s pi.elf
<nothing>

Reading the symbol table requires reading the full ELF header, which holds the location and the number of the section header entries.
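
What "stripped" means at the file level can be checked by walking the section headers and looking for a section of type SYMTAB. The sketch below assumes the whole little-endian ELF32 file has already been read into memory and validated; elf32_has_symtab, sh_rd16 and sh_rd32 are hypothetical helper names, not RVVM functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHT_SYMTAB 2  // section type of a symbol table in the ELF specification

static uint16_t sh_rd16(const uint8_t* p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t sh_rd32(const uint8_t* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Returns true if the ELF32 image contains a SHT_SYMTAB section,
// i.e. the file has not been stripped of its symbol table.
static bool elf32_has_symtab(const uint8_t* img, size_t len)
{
    assert(img != NULL);
    if (len < 0x34) return false;                 // no complete ELF32 header
    uint32_t shoff     = sh_rd32(img + 0x20);     // e_shoff
    uint16_t shentsize = sh_rd16(img + 0x2e);     // e_shentsize
    uint16_t shnum     = sh_rd16(img + 0x30);     // e_shnum
    if (shentsize < 40) return false;             // ELF32 section header is 40 bytes
    for (uint16_t i = 0; i < shnum; ++i) {
        uint64_t off = (uint64_t)shoff + (uint64_t)shentsize * i;
        if (off + 40 > len) return false;         // section header table is truncated
        if (sh_rd32(img + off + 4) == SHT_SYMTAB) return true;  // sh_type is at +4
    }
    return false;
}
```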

Support reading full elf header file

The loading process consists of parsing the ELF header and the program headers of the ELF file, while the section headers are needed for tasks such as disassembling the program. The first two parts, parsing the ELF header and mapping the program segments into memory, are already done in RVVM.
However, the section headers are ignored during loading, so the section information, and in particular the symbol table of a file that has not been stripped, cannot be read.

/* rvvm_types.h */
// Max XLEN/SXLEN values
#ifdef USE_RV64
typedef uint64_t maxlen_t;
typedef int64_t smaxlen_t;
#define MAX_XLEN 64
#define MAX_SHAMT_BITS 6
#define PRIxXLEN PRIx64
#else
typedef uint32_t maxlen_t;
typedef int32_t smaxlen_t;


/* elf_load.h */

struct ELFEhdr{  
    uint8_t  e_ident[EI_NIDENT]; 
    uint16_t e_type;                 
    uint16_t e_machine;              
    uint32_t e_version;              
    maxlen_t e_entry;            
    maxlen_t e_phoff;            
    maxlen_t e_shoff;            
    uint32_t e_flags;             
    uint16_t e_ehsize;               
    uint16_t e_phentsize;            
    uint16_t e_phnum;                
    uint16_t e_shentsize;            
    uint16_t e_shnum;                
    uint16_t e_shstrndx;             
};

struct ELFShdr{
    uint32_t sh_name;                
    uint32_t sh_type;                
    maxlen_t sh_flags;           
    maxlen_t sh_addr;            
    maxlen_t sh_offset;          
    uint32_t sh_link;                
    uint32_t sh_info;                
    maxlen_t sh_addralign;       
    maxlen_t sh_entsize;         
};
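
The structs above use maxlen_t, so their layout follows the USE_RV64 build option rather than the class of the file being read. As a possible alternative, and only as a sketch, the header can be declared with fixed-width fields for each class and the sizes checked at compile time; elf32_ehdr_t and elf64_ehdr_t are hypothetical names, and EI_NIDENT is defined here as 16 to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EI_NIDENT 16

// ELF32 header: addresses and file offsets are 32 bits wide.
typedef struct {
    uint8_t  e_ident[EI_NIDENT];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint32_t e_entry;
    uint32_t e_phoff;
    uint32_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
} elf32_ehdr_t;

// ELF64 header: same fields, with 64-bit addresses and file offsets.
typedef struct {
    uint8_t  e_ident[EI_NIDENT];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint64_t e_entry;
    uint64_t e_phoff;
    uint64_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
} elf64_ehdr_t;

// Compile-time checks against the on-disk layout (C11 static_assert).
static_assert(sizeof(elf32_ehdr_t) == 52, "ELF32 header must be 52 bytes");
static_assert(sizeof(elf64_ehdr_t) == 64, "ELF64 header must be 64 bytes");
static_assert(offsetof(elf32_ehdr_t, e_shnum) == 0x30, "ELF32 e_shnum offset");
static_assert(offsetof(elf64_ehdr_t, e_shnum) == 0x3c, "ELF64 e_shnum offset");
```

With both layouts available, the class byte e_ident[4] could select the struct at run time, so that a build with USE_RV64 would still be able to describe an ELF32 file.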

Then write a function to load the ELF header, and separate the ELF checks from elf_load_file into a function of their own that verifies the ELF header.

bool verify_elf(rvfile_t* file)
{
    WRAP_ERR(file, "Invalid arguments to elf_load_file()");
#ifdef USE_RV64
    uint8_t tmp[64] = {0};
    WRAP_ERR(rvread(file, tmp, 64, 0) == 64, "Failed to read 64bit ELF header");
#else
    uint8_t tmp[52] = {0};
    WRAP_ERR(rvread(file, tmp, 52, 0) == 52, "Failed to read 32bit ELF header");
#endif
    WRAP_ERR(read_uint32_le_m(tmp) == 0x464c457F, "Not an ELF file");
    WRAP_ERR(tmp[4] == 1 || tmp[4] == 2, "Invalid ELF class");
    WRAP_ERR(tmp[5] == 1, "Not a little-endian ELF");
    return true;
}

To load the ELF header, I use a simple approach: I read the beginning of the ELF file directly into a struct ELFEhdr, whose size matches the size of the ELF header for the selected build option. The struct has static storage so that the returned pointer remains valid after the function returns.

struct ELFEhdr* load_ehdr(rvfile_t* file)
{
    // Static storage: a pointer to a local buffer would dangle after return.
    // The price is that the function is not reentrant.
    static struct ELFEhdr ehdr;
    WRAP_ERR(verify_elf(file), "The file is not a valid ELF file");
    // sizeof(ehdr) is determined by the make option:
    // 64 bytes with USE_RV64, 52 bytes otherwise
    WRAP_ERR(rvread(file, &ehdr, sizeof(ehdr), 0) == sizeof(ehdr), "Failed to read ELF header");
    return &ehdr;
}

To check the result, I use readelf and hexdump to dump the raw data of hello.elf.

$ readelf -a hello.elf
...

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 001000 00003c 00  AX  0   0  4
  [ 2] .rodata           PROGBITS        0000003c 00103c 00000d 00   A  0   0  1
  [ 3] .riscv.attributes RISCV_ATTRIBUTE 00000000 001049 00001a 00      0   0  1
  [ 4] .symtab           SYMTAB          00000000 001064 0000d0 10      5  12  4
  [ 5] .strtab           STRTAB          00000000 001134 000042 00      0   0  1
  [ 6] .shstrtab         STRTAB          00000000 001176 00003b 00      0   0  1
...

The decoding of unwind sections for machine type RISC-V is not currently supported.

Symbol table '.symtab' contains 13 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 00000000     0 SECTION LOCAL  DEFAULT    1 
     2: 0000003c     0 SECTION LOCAL  DEFAULT    2 
     3: 00000000     0 SECTION LOCAL  DEFAULT    3 
     4: 00000000     0 FILE    LOCAL  DEFAULT  ABS hello.o
     5: 0000005d     0 NOTYPE  LOCAL  DEFAULT  ABS SYSEXIT
     6: 00000040     0 NOTYPE  LOCAL  DEFAULT  ABS SYSWRITE
     7: 0000003c     0 NOTYPE  LOCAL  DEFAULT    2 str
     8: 0000000d     0 NOTYPE  LOCAL  DEFAULT  ABS str_size
     9: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $xrv32i2p1
    10: 00000010     0 NOTYPE  LOCAL  DEFAULT    1 loop
    11: 00000030     0 NOTYPE  LOCAL  DEFAULT    1 end
    12: 00000000     0 NOTYPE  GLOBAL DEFAULT    1 _start
...

$ hexdump hello.elf
0000000 457f 464c 0101 0001 0000 0000 0000 0000
0000010 0002 00f3 0001 0000 0000 0000 0034 0000
0000020 11b4 0000 0000 0000 0034 0020 0002 0028
0000030 0007 0006 0003 ...

Then I modify the code in rvvm_main so that RVVM takes the ELF file name from the command line.

char*  elf_filename = NULL;
...
for (int i=1; i<argc; i+=arg_size) {
        arg_size = get_arg(argv + i, &arg_name, &arg_val);
        if (cmp_arg(arg_name, "m") || cmp_arg(arg_name, "mem")) {
            if (rvvm_strlen(arg_val)) {
                mem = ((size_t)str_to_int_dec(arg_val)) << mem_suffix_shift(arg_val[rvvm_strlen(arg_val)-1]);
            }
        } else if (cmp_arg(arg_name, "s") || cmp_arg(arg_name, "smp")) {
             smp = str_to_int_dec(arg_val);
        } else if (cmp_arg(arg_name, "e") || cmp_arg(arg_name, "elf")) {
            elf_filename = arg_val;
            ...
                

rvfile_t *elf_file = rvopen(elf_filename, RVFILE_RW);
struct ELFEhdr* ehdr = load_ehdr(elf_file);

With the experiment set up, I pick three members of ehdr at random, e_machine, e_shnum and e_shstrndx, and print them to see whether the code runs correctly. According to the hexdump above, the expected values are 243 (0x00f3), 7 and 6.

$ ./rvvm_riscv ../fw_jump.bin -kernel ../Image -rv32 -image ../rootfs.ext2 -mem 1G  -res 1280x720 -elf ../elf/hello.elf
...
243 7 6

Executing the elf file in RVVM

Executing the ELF file is a big problem in RVVM, because RVVM currently does not support a userspace ELF loader. Userspace emulation is different from the virtual machine in RVVM: the two are independent platforms that share the same CPU emulation code. After I discussed this with the developer of RVVM, they released a work-in-progress implementation of running an ELF file in userspace (see commit cdfea8b). That part still has a lot of bugs, so the debugging continues.
The ELF loading API in RVVM has two modes. The first one loads the kernel file into the memory of the virtual machine. In this mode the ELF file is copied into the memory of the machine, so that, for example, vmlinux can be used as the kernel.

    ...
    if (machine->kernel_file) {
        size_t kernel_offset = machine->rv64 ? 0x200000 : 0x400000;
        size_t kernel_size = machine->mem.size > kernel_offset ? machine->mem.size - kernel_offset : 0;
        bin_objcopy(machine->kernel_file, machine->mem.data + kernel_offset, kernel_size, elf);
    }
    rvvm_addr_t dtb_addr = rvvm_get_opt(machine, RVVM_OPT_DTB_ADDR);
    if (machine->dtb_file) {
        size_t dtb_size = rvfilesize(machine->dtb_file);
        size_t dtb_offset = machine->mem.size > dtb_size ? machine->mem.size - dtb_size : 0;
        dtb_addr = machine->mem.begin + dtb_offset;
        rvread(machine->dtb_file, machine->mem.data + dtb_offset, machine->mem.size - dtb_offset, 0);
    }
    ...

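The second mode runs a userspace process. Here base is left NULL, which tells the loader to map the ELF file into userspace memory instead of copying it into a buffer.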

elf_desc_t elf = {
    .base = NULL,
};
elf_desc_t interp = {
    .base = NULL,
};

To run a userspace process, we also need a struct exec_desc_t that describes the process and program information.

typedef struct {
    // Self explanatory
    size_t argc;
    const char** argv;
    const char** envp;

    size_t base;         // Main ELF base address (relocation)
    size_t entry;        // Main ELF entry point
    size_t interp_base;  // ELF interpreter (aka linker usually) base address
    size_t interp_entry; // ELF interpreter entry point
    size_t phdr;         // Address of ELF PHDR section
    size_t phnum;        // Number of PHDRs
} exec_desc_t;

According to the developer of RVVM, they have tested that an x86_64 ELF file can be loaded in the user process. The test method is to add the following code to main and then pass the ELF file you want to run.

int rvvm_user(int argc, char** argv, char** envp);

int main(int argc, char** argv, char** envp)
{
    if (argc > 2 && rvvm_strcmp(argv[1], "-user")) {
        rvvm_set_loglevel(LOG_INFO);
        return rvvm_user(argc - 2, argv + 2, envp);
    }
    ...

Then start the test with the command below.

$ make CFLAGS="-static" && rvvm -user hello.elf
  ██▀███   ██▒   █▓ ██▒   █▓ ███▄ ▄███▓
 ▓██ ▒ ██▒▓██░   █▒▓██░   █▒▓██▒▀█▀ ██▒
 ▓██ ░▄█ ▒ ▓██  █▒░ ▓██  █▒░▓██    ▓██░
 ▒██▀▀█▄    ▒██ █░░  ▒██ █░░▒██    ▒██ 
 ░██▓ ▒██▒   ▒▀█░     ▒▀█░  ▒██▒   ░██▒
 ░ ▒▓ ░▒▓░   ░ █░     ░ █░  ░ ▒░   ░  ░
   ░▒ ░ ▒░   ░ ░░     ░ ░░  ░  ░      ░
   ░░   ░      ░░       ░░  ░      ░   
    ░           ░        ░         ░   
               ░        ░              

Detected OS: Linux
Detected CC: GCC 9
Target arch: riscv
Version:     RVVM 0.6-f241437-dirty
...
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libX11.a(xim_trans.o): in function `_XimXTransSocketINETConnect':
(.text+0xcfc): warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libX11.a(OpenDis.o): in function `OutOfMemory':
(.text+0x3c4): undefined reference to `xcb_disconnect'
...

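The linker output shows that the static build links against the x86_64 libX11.a of the host, where I expected RISC-V to be involved, and it fails with undefined references such as xcb_disconnect. I have not figured out the cause of this yet.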