# Implement Linux userspace RV32 emulation for [RVVM](https://github.com/LekKit/RVVM)
###### Computer Architecture

:::info
- [ ] RVVM provides an ELF loader, but it may be incomplete, especially for ELF32 executables. Try to get the prebuilt executables from rv32emu to load in RVVM. You may run into technical problems along the way; record them and file GitHub Issues where appropriate.
- [ ] Improve the robustness of the RVVM ELF loader. For example, an ELF executable should still load and run after being stripped; if it does not, modify the RVVM code to meet this requirement.
- [ ] Once RVVM can load the prebuilt executables provided by rv32emu (Quake and Doom excluded), compare the performance of RVVM (RV32 + JIT) with rv32emu (w/ JIT). Note that you may need to add system calls to RVVM, otherwise time-related system calls cannot be handled.
- [ ] Document in detail the problems you encounter and how you resolved them.
:::

## Trace the relevant code of the ELF loader
To find out whether the ELF loader is complete, we first need to trace the relevant code.
```c
bool elf_load_file(rvfile_t* file, elf_desc_t* elf)
{
    uint8_t tmp[64];
    WRAP_ERR(file && elf, "Invalid arguments to elf_load_file()");
    WRAP_ERR(rvread(file, tmp, 64, 0) == 64, "Failed to read ELF header");
    WRAP_ERR(read_uint32_le_m(tmp) == 0x464c457F, "Not an ELF file");
    WRAP_ERR(tmp[4] == 1 || tmp[4] == 2, "Invalid ELF class");
    WRAP_ERR(tmp[5] == 1, "Not a little-endian ELF");
    ...
```
The code above shows that the ELF file is read with `rvread` and verified with `read_uint32_le_m`. In addition, the commit message of this code tells us that the ELF loader of [RVVM](https://github.com/LekKit/RVVM) is implemented for userspace (userland) loading. Therefore, the first step is to build [RVVM](https://github.com/LekKit/RVVM).

>elf_load: Userland ELF loading, use VMA helpers
>- Implement mapping an ELF into userspace VMA
>- Relocatable ELF_DYN
>- Relocate elf->entry & elf->phdr against elf->base
>- Fix ELF interpreter parsing

## Build & Run [RVVM](https://github.com/LekKit/RVVM)
According to the guidelines of [RVVM](https://github.com/LekKit/RVVM), building and running [RVVM](https://github.com/LekKit/RVVM) requires three things: `fw_jump.bin` from [OpenSBI](https://github.com/riscv-software-src/opensbi), a Linux kernel image, and a root filesystem.
First, we can get `fw_jump.bin` from the release version of [OpenSBI - release](https://github.com/riscv-software-src/opensbi/releases/tag/v1.4). After downloading the release version of OpenSBI, `fw_jump.bin` is under the `share` directory.

### Configure & Build Linux kernel
To build the Linux kernel, we first need to configure and build the GNU toolchain.
```shell
$ git clone https://github.com/riscv-collab/riscv-gnu-toolchain.git --recursive
$ cd riscv-gnu-toolchain
$ ./configure --prefix=/opt/riscv32-linux --with-arch=rv32imac --enable-linux
$ sudo make linux
```
After building the GNU toolchain, add the following environment variables to `~/.bashrc` so that the kernel build can find the cross compiler.
```shell
$ nano ~/.bashrc
...
export PATH=$PATH:/opt/riscv32-linux/bin
export CROSS_COMPILE=/opt/riscv32-linux/bin/riscv32-unknown-linux-gnu-
export ARCH=riscv

$ source ~/.bashrc
```
Then download the Linux source code from GitHub, check out whichever version you want, and generate the Linux kernel image.
```shell
$ git clone https://github.com/torvalds/linux
$ cd linux
$ git checkout v5.18
$ make ARCH=riscv CROSS_COMPILE=riscv32-unknown-linux-gnu- -j$(nproc)
...
  LD [M]  drivers/virtio/virtio_dma_buf.ko
  LD [M]  fs/efivarfs/efivarfs.ko
  LD [M]  fs/nls/nls_iso8859-1.ko
  Kernel: arch/riscv/boot/Image.gz is ready
```
### Create root filesystem image
The preparation is not finished yet; the final step is to create the root filesystem. The root filesystem is the first filesystem mounted when a computer or a virtual machine boots. It usually contains the directories below.
```
root
├──bin/
├──dev/
├──sbin/
├──proc/
├──tmp/
├──var/
├──lib/
├──etc/
└──mnt/
```
In this case, I use [buildroot](https://buildroot.org/) and [busybox](https://www.busybox.net/) to build the root filesystem.
I download [buildroot](https://buildroot.org/) first, and take the buildroot `.config` file and the busybox `.config` file from [semu](https://github.com/sysprog21/semu) to build the root filesystem image. To configure the output format of the root filesystem, use the following command.
```shell
$ make menuconfig
```
![2023-12-30 21-07-49 screenshot](https://hackmd.io/_uploads/SJr68q6Dp.png)

Then run `make`; the resulting `rootfs.ext2` root filesystem image is placed in the output directory.

### Debug & Run
With the preparation complete, run RVVM by following the guidelines in `README.md`.
```shell
$ make
$ cd release.linux.riscv
$ ./rvvm_riscv fw_jump.bin -kernel ../linux_rv32 -rv32 -image ../rootfs.ext2 -mem 1G -smp 2 -res 1280x720
```
Then an error appears.
```shell
RVVM/release.linux.riscv$ ./rvvm_riscv fw_jump.bin -kernel ../linux_rv32 -rv32 -image ../rootfs.ext2
...
[    0.297044] NET: Registered PF_PACKET protocol family
[    0.302607] debug_vm_pgtable: [debug_vm_pgtable ]: Validating architecture page table helpers
[    0.306567] ALSA device list:
[    0.307740]   No soundcards found.
[    0.310877] VFS: Cannot open root device "nvme0n1" or unknown-block(0,0): error -6
[    0.313292] Please append a correct "root=" boot option; here are the available partitions:
[    0.315810] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[    0.318519] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.1-rvvm #1
[    0.319944] Hardware name: RVVM v0.6-16e4574 (DT)
[    0.321245] Call Trace:
[    0.322520] [<c00032c6>] dump_backtrace+0x1a/0x22
[    0.323994] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]---
```
I could not tell where the error came from, so I [asked the developer of the project](https://github.com/LekKit/RVVM/issues/57#issuecomment-1871896448). The suggestion was to use a [specific config file](https://github.com/LekKit/patches-misc/blob/master/linux-rvvm-configs/config_tiny_6.2), which targets a riscv64 kernel, so I took that config file and adjusted it.
```shell
$ make menuconfig
```
![2023-12-30 21-32-43 screenshot](https://hackmd.io/_uploads/BJefn5pDp.png)

As the figure above shows, I change the `Base ISA` option from `CONFIG_ARCH_RV64I` to `CONFIG_ARCH_RV32I`, then build another Linux kernel image from this configuration.
```
$ ./rvvm_riscv fw_jump.bin -kernel ../Image -rv32 -image ../rootfs.ext2 -mem 1G -smp 2 -res 1280x720
```
Finally, RVVM runs successfully.
```shell
...
[    1.012844] kallsyms_selftest: start
[    1.018321] Run /sbin/init as init process
[    1.164654] EXT4-fs (nvme0n1): re-mounted 2ade8330-a00b-457a-b9ad-40a1f03277c1. Quota mode: disabled.
[    2.648478] kallsyms_selftest:  ---------------------------------------------------------
[    2.650191] kallsyms_selftest: | nr_symbols | compressed size | original size | ratio(%) |
[    2.651839] kallsyms_selftest: |---------------------------------------------------------|
[    2.653424] kallsyms_selftest: |      24927 |          271546 |        461124 |    58.88 |
[    2.655542] kallsyms_selftest:  ---------------------------------------------------------
[    2.960294] kallsyms_selftest: kallsyms_lookup_name() looked up 24927 symbols
[    2.961988] kallsyms_selftest: The time spent on each symbol is (ns): min=900, max=511600, avg=11656
[    2.968950] kallsyms_selftest: kallsyms_on_each_symbol() traverse all: 5326200 ns
[    2.970850] kallsyms_selftest: kallsyms_on_each_match_symbol() traverse all: 35500 ns
[    2.972932] kallsyms_selftest: finish
```
![2023-12-30 21-40-41 screenshot](https://hackmd.io/_uploads/r1ZZ09aPp.png)

## Load ELF file
What is an ELF file? ELF is short for **Executable and Linkable Format**, a common standard file format for executable files, object code, shared libraries and core dumps. It usually consists of three parts: the ELF header, the program header table and the section header table. The ELF header holds information about the file as a whole, the program header table describes the segments used at run time, and the section header table lists the sections.

![Elf-layout--en](https://hackmd.io/_uploads/B1Io95-dT.svg)

### ELF header
The ELF header layout differs between RV32 and RV64, but the first fields are identical. Fields marked "(continued)" are 8 bytes wide on RV64 and span more than one row.

| Offset | RV32 | RV64 |
|:------:|:---------------------------------------------:|:---------------------------------------------:|
| 0x0 | `0x7f 0x45 0x4c 0x46` | `0x7f 0x45 0x4c 0x46` |
| 0x4 | 1 (32-bit class) | 2 (64-bit class) |
| 0x5 | 1: little endian, 2: big endian | 1: little endian, 2: big endian |
| 0x6 | 1 (ELF version) | 1 (ELF version) |
| 0x7 | ABI number (e.g. Linux=3) | ABI number (e.g. Linux=3) |
| 0x8 | ABI version | ABI version |
| 0x9 | unused (padding) | unused (padding) |
| 0x10 | object file type (e.g. executable file=2) | object file type (e.g. executable file=2) |
| 0x12 | ISA (RISC-V=0xF3) | ISA (RISC-V=0xF3) |
| 0x14 | 1 (ELF version) | 1 (ELF version) |
| 0x18 | entry point | entry point |
| 0x1c | pointer to the start of program header table | entry point (continued) |
| 0x20 | pointer to the start of section header table | pointer to the start of program header table |
| 0x24 | flags, depends on the target architecture | pointer to the start of program header table (continued) |
| 0x28 | size of this header | pointer to the start of section header table |
| 0x2a | size of a program header table entry | pointer to the start of section header table (continued) |
| 0x2c | number of entries in the program header table | pointer to the start of section header table (continued) |
| 0x2e | size of a section header table entry | pointer to the start of section header table (continued) |
| 0x30 | number of entries in the section header table | flags, depends on the target architecture |
| 0x32 | index of the section header entry that holds the section names | flags, depends on the target architecture (continued) |
| 0x34 | end of ELF header | size of this header |
| 0x36 | | size of a program header table entry |
| 0x38 | | number of entries in the program header table |
| 0x3a | | size of a section header table entry |
| 0x3c | | number of entries in the section header table |
| 0x3e | | index of the section header entry that holds the section names |
| 0x40 | | end of ELF header |

### Program header table
The ELF header contains a pointer to the start of the program header table. The program header table gives further details about the ELF file, mostly the information needed to execute it. As with the ELF header, the layout differs slightly between RV32 and RV64.
| Offset | RV32 | RV64 |
|:------:|:------------------------------------------:|:------------------------------------------:|
| 0x00 | Type of the segment | Type of the segment |
| 0x04 | Offset of the segment in the file image | Segment-dependent flags |
| 0x08 | Virtual address where it should be loaded | Offset of the segment in the file image |
| 0x0c | Physical address where it should be loaded | (continued) |
| 0x10 | Size in the file | Virtual address where it should be loaded |
| 0x14 | Size in memory | (continued) |
| 0x18 | Segment-dependent flags | Physical address where it should be loaded |
| 0x1c | Alignment | (continued) |
| 0x20 | End of the program header | Size in the file |
| 0x28 | | Size in memory |
| 0x30 | | Alignment |
| 0x38 | | End of the program header |

### Load the ELF file as the init program
As described in the previous section, RVVM boots with `fw_jump.bin`, a Linux kernel and a root filesystem. Let's focus on the root filesystem: it contains a file that the kernel runs as the initial process of the machine, as the kernel message shows.
```
[    0.794604] Run /sbin/init as init process
```
The message shows that `/sbin/init` is the first file to be loaded. So, instead of building the root filesystem with busybox and buildroot, I build a root filesystem that contains only the ELF file I want to run.
```
$ dd if=/dev/zero of=root.ext2 bs=1M count=10
$ mkfs.ext2 root.ext2
$ sudo mkdir -p /mnt/root
$ sudo mount root.ext2 /mnt/root
$ sudo mkdir -p /mnt/root/sbin
$ sudo cp hello.elf /mnt/root/sbin/init
$ sudo umount /mnt/root
```
The commands above create a filesystem image that contains only the ELF file. They are based on the [RVVM - issue](https://github.com/LekKit/RVVM/issues/57). Then run RVVM, and the ELF file is loaded as the initial program of the machine.

![2024-01-07 22-04-22 screenshot](https://hackmd.io/_uploads/HyQFkV_u6.png)

After loading the ELF file as the init program of RVVM, a question arises: can I load the ELF file in some other way?
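As a cross-check of the RV32 column of the ELF header table above, the offsets can be applied to real bytes. The following is only a sketch and not RVVM code: `rd16`, `rd32`, `elf32_summary_t` and `elf32_parse_header` are hypothetical names made up for this note, and `hello_elf_header` is the first 52 bytes of `hello.elf`, transcribed by hand from the `hexdump` output shown later in this note (`hexdump` prints 16-bit little-endian words, so each pair of bytes appears swapped there).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian readers; byte-wise, so host endianness and alignment do not matter. */
static uint16_t rd16(const uint8_t* p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical summary type for this sketch; it is not RVVM's elf_desc_t. */
typedef struct {
    uint16_t type;      /* 0x10: object file type */
    uint16_t machine;   /* 0x12: ISA */
    uint32_t entry;     /* 0x18: entry point */
    uint32_t phoff;     /* 0x1c: program header table offset */
    uint32_t shoff;     /* 0x20: section header table offset */
    uint16_t phentsize; /* 0x2a: size of a program header entry */
    uint16_t phnum;     /* 0x2c: number of program header entries */
    uint16_t shentsize; /* 0x2e: size of a section header entry */
    uint16_t shnum;     /* 0x30: number of section header entries */
    uint16_t shstrndx;  /* 0x32: index of the section name string table */
} elf32_summary_t;

/* Decode a little-endian ELF32 header; returns 0 on success, -1 on error. */
static int elf32_parse_header(const uint8_t* h, size_t len, elf32_summary_t* out)
{
    if (len < 0x34) return -1;                   /* ELF32 header is 52 bytes */
    if (memcmp(h, "\177ELF", 4) != 0) return -1; /* magic: 0x7f 'E' 'L' 'F' */
    if (h[4] != 1 || h[5] != 1) return -1;       /* 32-bit class, little endian */
    out->type      = rd16(h + 0x10);
    out->machine   = rd16(h + 0x12);
    out->entry     = rd32(h + 0x18);
    out->phoff     = rd32(h + 0x1c);
    out->shoff     = rd32(h + 0x20);
    out->phentsize = rd16(h + 0x2a);
    out->phnum     = rd16(h + 0x2c);
    out->shentsize = rd16(h + 0x2e);
    out->shnum     = rd16(h + 0x30);
    out->shstrndx  = rd16(h + 0x32);
    return 0;
}

/* First 52 bytes of hello.elf, transcribed from the hexdump in this note. */
static const uint8_t hello_elf_header[0x34] = {
    0x7f, 0x45, 0x4c, 0x46, 0x01, 0x01, 0x01, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x02, 0x00, 0xf3, 0x00, 0x01, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x34, 0x00, 0x00, 0x00,
    0xb4, 0x11, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x34, 0x00, 0x20, 0x00, 0x02, 0x00, 0x28, 0x00,
    0x07, 0x00, 0x06, 0x00,
};
```

Calling `elf32_parse_header(hello_elf_header, sizeof hello_elf_header, &summary)` fills in `machine = 243` (RISC-V), `phoff = 0x34`, `phnum = 2`, `shnum = 7` and `shstrndx = 6`, which agrees with what `readelf` reports for the same file.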
### Trace elf load file code
To figure out how the ELF file is loaded, the loading code must be traced. After reading the code, I have to say that it does not look like a typical ELF loader: the ELF loader of RVVM only provides what is essential for execution. And what is essential for execution? As the `elf_desc_t` struct below shows, `base` and `buf_size` are the address and the size of the buffer the ELF file is loaded into. Beyond that, `elf_desc_t` only records the entry address, the path of the dynamic linker (interpreter), and the address and number of the program header entries.
```c
typedef struct {
    // Pass a buffer for objcopy, NULL for userland loading
    // Receive base ELF address for userland
    void* base;
    // Objcopy buffer size
    size_t buf_size;

    // Various loaded ELF info
    size_t entry;
    char* interp_path;
    size_t phdr;
    size_t phnum;
} elf_desc_t;
```
`readelf` shows where the addresses, the dynamic linking information and the program header entries come from.
```
$ readelf -a hello.elf
...
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOPROC+0x3     0x001049 0x00000000 0x00000000 0x0001a 0x00000 R   0x1
  LOAD           0x001000 0x00000000 0x00000000 0x00049 0x00049 R E 0x1000
...
There is no dynamic section in this file.
...
```
The program header table describes the segments of the ELF file, including the one that holds the machine code.
```c
bool bin_objcopy(rvfile_t* file, void* buffer, size_t size, bool try_elf)
{
    uint8_t mag[4] = {0};
    if (try_elf && rvread(file, mag, 4, 0) == 4 && read_uint32_le_m(mag) == 0x464c457F) {
        elf_desc_t elf = {
            .base = buffer,
            .buf_size = size,
        };
        if (elf_load_file(file, &elf)) return true;
    }
    return rvread(file, buffer, size, 0);
}
```
`bin_objcopy` checks whether the file is an ELF file by reading the magic number `0x464c457F`. If it is, the file is loaded with `elf_load_file`; otherwise the file is copied into the buffer as a raw binary.
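The magic check can be reproduced in isolation. The sketch below is a stand-alone illustration, not RVVM code: `read_le32` is a hypothetical stand-in for RVVM's `read_uint32_le_m`, and `looks_like_elf`, `elf_magic` and `raw_blob` are names made up for this note. It shows why the four bytes `0x7f 'E' 'L' 'F'` at the start of a file compare equal to `0x464c457F` once they are read as a little-endian 32-bit value: the first byte in the file becomes the least significant byte of the number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for RVVM's read_uint32_le_m(): assemble a 32-bit
 * value from four bytes, least significant byte first. */
static uint32_t read_le32(const uint8_t* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the buffer starts with the ELF magic, 0 otherwise. */
static int looks_like_elf(const uint8_t* buf, size_t len)
{
    return len >= 4 && read_le32(buf) == 0x464c457FU;
}

/* 0x7f 'E' 'L' 'F' = 0x7f 0x45 0x4c 0x46, the first four bytes of an ELF file. */
static const uint8_t elf_magic[4] = { 0x7f, 'E', 'L', 'F' };

/* Arbitrary non-ELF bytes, standing in for a raw binary image. */
static const uint8_t raw_blob[4] = { 0x13, 0x00, 0x00, 0x00 };
```

With these definitions, `looks_like_elf(elf_magic, 4)` is 1 and `looks_like_elf(raw_blob, 4)` is 0, which mirrors the two paths `bin_objcopy` takes.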
```c=
bool elf_load_file(rvfile_t* file, elf_desc_t* elf)
{
    uint8_t tmp[64];
    WRAP_ERR(file && elf, "Invalid arguments to elf_load_file()");
    WRAP_ERR(rvread(file, tmp, 64, 0) == 64, "Failed to read ELF header");
    WRAP_ERR(read_uint32_le_m(tmp) == 0x464c457F, "Not an ELF file");
    WRAP_ERR(tmp[4] == 1 || tmp[4] == 2, "Invalid ELF class");
    WRAP_ERR(tmp[5] == 1, "Not a little-endian ELF");

    // Parse ELF header
    bool objcopy = !!elf->base;
    bool class64 = (tmp[4] == 2);
    uint16_t elf_type = read_uint16_le_m(tmp + 16);
    uint64_t elf_entry = class64 ? read_uint64_le_m(tmp + 24) : read_uint32_le_m(tmp + 24);
    uint64_t elf_phoff = class64 ? read_uint64_le_m(tmp + 32) : read_uint32_le_m(tmp + 28);
    //uint64_t elf_shoff = class64 ? read_uint64_le_m(tmp + 40) : read_uint32_le_m(tmp + 32);
    size_t elf_phnsz = class64 ? 56 : 32;
    size_t elf_phnum = read_uint16_le_m(tmp + (class64 ? 56 : 44));
    ...
    for (size_t i=0; i<elf_phnum; ++i) {
        uint64_t elf_phent_off = elf_phoff + (elf_phnsz * i);
        WRAP_ERR(rvread(file, tmp, elf_phnsz, elf_phent_off) == elf_phnsz, "Failed to read ELF phent");
        uint32_t p_type = read_uint32_le_m(tmp);
        uint64_t p_offset = class64 ? read_uint64_le_m(tmp + 8) : read_uint32_le_m(tmp + 4);
        uint64_t p_vaddr = class64 ? read_uint64_le_m(tmp + 16) : read_uint32_le_m(tmp + 8);
        uint64_t p_fsize = class64 ? read_uint64_le_m(tmp + 32) : read_uint32_le_m(tmp + 16);
        uint64_t p_memsz = class64 ? read_uint64_le_m(tmp + 40) : read_uint32_le_m(tmp + 20);
        //uint32_t p_flags = class64 ? read_uint32_le_m(tmp + 4) : read_uint32_le_m(tmp + 24);

        if (p_type == ELF_PT_LOAD || p_type == ELF_PT_PHDR) {
            // Load ELF program segment or PHDR segment
            if (objcopy) {
                p_vaddr -= elf_loaddr;
                WRAP_ERR(p_vaddr + p_memsz <= elf->buf_size, "ELF does not fit in objcopy buffer");
            }
            void* vaddr = ((uint8_t*)elf->base) + p_vaddr;
            if (!objcopy) {
                WRAP_ERR(vma_alloc(vaddr, p_memsz, VMA_RDWR | VMA_FIXED) == vaddr, "Failed to allocate ELF VMA");
            }
            WRAP_ERR(rvread(file, vaddr, p_fsize, p_offset) == p_fsize, "Failed to read ELF segment");
        }
        if (p_type == ELF_PT_INTERP && !objcopy && !elf->interp_path) {
            // Get ELF interpreter path
            WRAP_ERR(p_fsize < 1024, "ELF interpreter path is too long");
            elf->interp_path = safe_new_arr(char, p_fsize + 1);
            WRAP_ERR(rvread(file, elf->interp_path, p_fsize, p_offset) == p_fsize, "Failed to read ELF interp_path");
        }
    }
    return true;
}
```
`elf_load_file` clearly does support ELF32 loading. Lines 3 to 18 parse the ELF header, and the loop starting at line 20 maps each loadable segment of the ELF file into memory. So why is it not a typical ELF loader? Because the function does not expose information such as the full ELF header, the program header table or the symbols. As a result, it could not load an ELF file that has not been stripped.

### Support reading elf file with symbol table
![image](https://hackmd.io/_uploads/rJEXZukt6.png)

#### The difference between stripped file and not stripped file
To read both stripped and unstripped files, we must know how they differ: a stripped file lacks the symbol table.
```
$ file hello.elf
hello.elf: ELF 32-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, not stripped
$ readelf -s hello.elf

Symbol table '.symtab' contains 13 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 SECTION LOCAL  DEFAULT    1
     2: 0000003c     0 SECTION LOCAL  DEFAULT    2
     3: 00000000     0 SECTION LOCAL  DEFAULT    3
     4: 00000000     0 FILE    LOCAL  DEFAULT  ABS hello.o
     5: 0000005d     0 NOTYPE  LOCAL  DEFAULT  ABS SYSEXIT
     6: 00000040     0 NOTYPE  LOCAL  DEFAULT  ABS SYSWRITE
     7: 0000003c     0 NOTYPE  LOCAL  DEFAULT    2 str
     8: 0000000d     0 NOTYPE  LOCAL  DEFAULT  ABS str_size
     9: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $xrv32i2p1
    10: 00000010     0 NOTYPE  LOCAL  DEFAULT    1 loop
    11: 00000030     0 NOTYPE  LOCAL  DEFAULT    1 end
    12: 00000000     0 NOTYPE  GLOBAL DEFAULT    1 _start

$ file pi.elf
pi.elf: ELF 32-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, stripped
$ readelf -s pi.elf
<nothing>
```
To read the symbol table, the loader has to read the full ELF header, which tells it where the section header table is.

#### Support reading full elf header file
The loading process consists of parsing the ELF header and the program headers of the ELF file, while the section headers are used to disassemble the program. The first two parts, parsing the ELF header and mapping the program segments into memory, are already done in RVVM. However, the section headers are ignored during loading, so the ELF file cannot be read correctly, especially an unstripped one.
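The stripped versus unstripped distinction can also be checked programmatically by walking the section header table and looking for a section of type `SHT_SYMTAB` (value 2 in the ELF specification). The following is a minimal sketch for little-endian ELF32 images only and is not RVVM code: `elf32_has_symtab`, `make_test_image` and the small read/write helpers are hypothetical names for this note, and the test image is synthetic (an ELF32 header followed by two section headers), not one of the rv32emu binaries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHT_SYMTAB 2 /* section type of .symtab in the ELF specification */

static uint16_t rd16(const uint8_t* p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
static void wr16(uint8_t* p, uint16_t v) { p[0] = (uint8_t)v; p[1] = (uint8_t)(v >> 8); }
static void wr32(uint8_t* p, uint32_t v)
{
    p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}

/* Returns 1 if a little-endian ELF32 image has a SHT_SYMTAB section,
 * 0 if it has none (stripped), -1 if the image is malformed. */
static int elf32_has_symtab(const uint8_t* img, size_t len)
{
    if (len < 0x34 || memcmp(img, "\177ELF", 4) != 0) return -1;
    if (img[4] != 1 || img[5] != 1) return -1;  /* ELF32, little endian */
    uint32_t shoff     = rd32(img + 0x20);      /* section header table offset */
    uint16_t shentsize = rd16(img + 0x2e);      /* size of one section header */
    uint16_t shnum     = rd16(img + 0x30);      /* number of section headers */
    if (shnum == 0) return 0;
    if (shentsize < 40) return -1;              /* an ELF32 section header is 40 bytes */
    if (shoff > len || (size_t)shnum * shentsize > len - shoff) return -1;
    for (uint16_t i = 0; i < shnum; ++i) {
        const uint8_t* sh = img + shoff + (size_t)i * shentsize;
        if (rd32(sh + 4) == SHT_SYMTAB) return 1; /* sh_type is at offset 4 */
    }
    return 0;
}

/* Build a synthetic 132-byte image: ELF32 header (52 bytes) plus two
 * section headers (40 bytes each); the second one gets the given sh_type. */
static void make_test_image(uint8_t img[132], uint32_t second_type)
{
    memset(img, 0, 132);
    memcpy(img, "\177ELF", 4);
    img[4] = 1;                 /* ELF32 */
    img[5] = 1;                 /* little endian */
    img[6] = 1;                 /* ELF version */
    wr32(img + 0x20, 52);       /* section headers start right after the header */
    wr16(img + 0x2e, 40);       /* size of one section header */
    wr16(img + 0x30, 2);        /* two section headers */
    wr32(img + 52 + 40 + 4, second_type);
}
```

On a real file, the same function would be applied to the whole file contents; for `hello.elf` the `.symtab` entry shown by `readelf` is what the loop would find, while a stripped file such as `pi.elf` is expected to have section headers but no `SHT_SYMTAB` entry.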
```c
/* rvvm_types.h */
// Max XLEN/SXLEN values
#ifdef USE_RV64
typedef uint64_t maxlen_t;
typedef int64_t smaxlen_t;
#define MAX_XLEN 64
#define MAX_SHAMT_BITS 6
#define PRIxXLEN PRIx64
#else
typedef uint32_t maxlen_t;
typedef int32_t smaxlen_t;
...

/* elf_load.h */
struct ELFEhdr {
    uint8_t e_ident[EI_NIDENT];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    maxlen_t e_entry;
    maxlen_t e_phoff;
    maxlen_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
};

struct ELFShdr {
    uint32_t sh_name;
    uint32_t sh_type;
    maxlen_t sh_flags;
    maxlen_t sh_addr;
    maxlen_t sh_offset;
    uint32_t sh_link;
    uint32_t sh_info;
    maxlen_t sh_addralign;
    maxlen_t sh_entsize;
};
```
Then I write a function that loads the ELF header, and move the ELF checks out of `elf_load_file` into a separate function that verifies the ELF header.
```c
bool verify_elf(rvfile_t* file)
{
    WRAP_ERR(file, "Invalid arguments to verify_elf()");
#ifdef USE_RV64
    uint8_t tmp[64] = {0};
    WRAP_ERR(rvread(file, tmp, 64, 0) == 64, "Failed to read 64bit ELF header");
#else
    uint8_t tmp[52] = {0};
    WRAP_ERR(rvread(file, tmp, 52, 0) == 52, "Failed to read 32bit ELF header");
#endif
    WRAP_ERR(read_uint32_le_m(tmp) == 0x464c457F, "Not an ELF file");
    WRAP_ERR(tmp[4] == 1 || tmp[4] == 2, "Invalid ELF class");
    WRAP_ERR(tmp[5] == 1, "Not a little-endian ELF");
    return true;
}
```
To load the ELF header, I take a simple approach: the raw header bytes are read into a buffer that has exactly the size of `struct ELFEhdr`, and the fields are then accessed through a `struct ELFEhdr*`.
```c
struct ELFEhdr* load_ehdr(rvfile_t* file)
{
    WRAP_ERR(verify_elf(file), "The file is not a valid ELF file");

    // The header must outlive this function, so it is allocated on the heap;
    // returning a pointer to a local buffer would leave it dangling
    struct ELFEhdr* ehdr = safe_new_arr(struct ELFEhdr, 1);

    // sizeof(struct ELFEhdr) is determined by the USE_RV64 make option:
    // 64 bytes for a 64-bit ELF header, 52 bytes for a 32-bit one
    if (rvread(file, ehdr, sizeof(*ehdr), 0) != sizeof(*ehdr)) {
        free(ehdr);
        WRAP_ERR(false, "Failed to read ELF header");
    }
    return ehdr;
}
```
To check the result, I use `readelf` and `hexdump` to dump the contents of `hello.elf`.
```
$ readelf -a hello.elf
...
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 001000 00003c 00  AX  0   0  4
  [ 2] .rodata           PROGBITS        0000003c 00103c 00000d 00   A  0   0  1
  [ 3] .riscv.attributes RISCV_ATTRIBUTE 00000000 001049 00001a 00      0   0  1
  [ 4] .symtab           SYMTAB          00000000 001064 0000d0 10      5  12  4
  [ 5] .strtab           STRTAB          00000000 001134 000042 00      0   0  1
  [ 6] .shstrtab         STRTAB          00000000 001176 00003b 00      0   0  1
...
The decoding of unwind sections for machine type RISC-V is not currently supported.

Symbol table '.symtab' contains 13 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 SECTION LOCAL  DEFAULT    1
     2: 0000003c     0 SECTION LOCAL  DEFAULT    2
     3: 00000000     0 SECTION LOCAL  DEFAULT    3
     4: 00000000     0 FILE    LOCAL  DEFAULT  ABS hello.o
     5: 0000005d     0 NOTYPE  LOCAL  DEFAULT  ABS SYSEXIT
     6: 00000040     0 NOTYPE  LOCAL  DEFAULT  ABS SYSWRITE
     7: 0000003c     0 NOTYPE  LOCAL  DEFAULT    2 str
     8: 0000000d     0 NOTYPE  LOCAL  DEFAULT  ABS str_size
     9: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $xrv32i2p1
    10: 00000010     0 NOTYPE  LOCAL  DEFAULT    1 loop
    11: 00000030     0 NOTYPE  LOCAL  DEFAULT    1 end
    12: 00000000     0 NOTYPE  GLOBAL DEFAULT    1 _start
...
$ hexdump hello.elf
0000000 457f 464c 0101 0001 0000 0000 0000 0000
0000010 0002 00f3 0001 0000 0000 0000 0034 0000
0000020 11b4 0000 0000 0000 0034 0020 0002 0028
0000030 0007 0006 0003 ...
```
Then I modify the code in `rvvm_main` so that RVVM accepts the ELF file name as a command-line argument.
```c
char* elf_filename = NULL;
...
for (int i=1; i<argc; i+=arg_size) {
    arg_size = get_arg(argv + i, &arg_name, &arg_val);
    if (cmp_arg(arg_name, "m") || cmp_arg(arg_name, "mem")) {
        if (rvvm_strlen(arg_val)) {
            mem = ((size_t)str_to_int_dec(arg_val)) << mem_suffix_shift(arg_val[rvvm_strlen(arg_val)-1]);
        }
    } else if (cmp_arg(arg_name, "s") || cmp_arg(arg_name, "smp")) {
        smp = str_to_int_dec(arg_val);
    } else if (cmp_arg(arg_name, "e") || cmp_arg(arg_name, "elf")) {
        elf_filename = arg_val;
...
rvfile_t *elf_file = rvopen(elf_filename, RVFILE_RW);
struct ELFEhdr* ehdr = load_ehdr(elf_file);
```
With the experiment set up, I pick three members of `ehdr` at random, `e_machine`, `e_shnum` and `e_shstrndx`, and print them to see whether the code runs correctly. The values match the `readelf` and `hexdump` output: machine 243 (0xf3, RISC-V), 7 section headers, and section name string table index 6.
```
$ ./rvvm_riscv ../fw_jump.bin -kernel ../Image -rv32 -image ../rootfs.ext2 -mem 1G -res 1280x720 -elf ../elf/hello.elf
...
243 7 6
```
### Executing the elf file in RVVM
Execution is the big problem, because RVVM currently does not support a userspace ELF loader. Userspace emulation differs from the full virtual machine in RVVM: the two are independent platforms that share the same CPU implementation. After I discussed this with the developer of RVVM, they released a work-in-progress implementation for running ELF files in userspace; see [commit cdfea8b](https://github.com/LekKit/RVVM/commit/cdfea8bff67a2400e849d91c064f0b61b4c1a1ca). This part still has many bugs, so debugging continues.

The ELF loading API in RVVM has two modes. The first one loads the kernel file and the image file into the memory of the virtual machine. In this mode, the ELF file is written into the machine's memory and used as `vmlinux`.
```c
...
if (machine->kernel_file) {
    size_t kernel_offset = machine->rv64 ? 0x200000 : 0x400000;
    size_t kernel_size = machine->mem.size > kernel_offset ? machine->mem.size - kernel_offset : 0;
    bin_objcopy(machine->kernel_file, machine->mem.data + kernel_offset, kernel_size, elf);
}
rvvm_addr_t dtb_addr = rvvm_get_opt(machine, RVVM_OPT_DTB_ADDR);
if (machine->dtb_file) {
    size_t dtb_size = rvfilesize(machine->dtb_file);
    size_t dtb_offset = machine->mem.size > dtb_size ? machine->mem.size - dtb_size : 0;
    dtb_addr = machine->mem.begin + dtb_offset;
    rvread(machine->dtb_file, machine->mem.data + dtb_offset, machine->mem.size - dtb_offset, 0);
}
...
```
The second mode runs a userspace process. Here `base` is `NULL`, which selects userland loading.
```c
elf_desc_t elf = {
    .base = NULL,
};
elf_desc_t interp = {
    .base = NULL,
};
```
To run a userspace process, we also need an `exec_desc_t` struct that describes the process and program information.
```c
typedef struct {
    // Self explanatory
    size_t argc;
    const char** argv;
    const char** envp;

    size_t base;         // Main ELF base address (relocation)
    size_t entry;        // Main ELF entry point
    size_t interp_base;  // ELF interpreter (aka linker usually) base address
    size_t interp_entry; // ELF interpreter entry point
    size_t phdr;         // Address of ELF PHDR section
    size_t phnum;        // Number of PHDRs
} exec_desc_t;
```
According to the developer of RVVM, loading an x86_64 ELF file as a user process has been tested. The test method is to add the following code to `main`, then pass in the ELF file you want to run.
```c
int rvvm_user(int argc, char** argv, char** envp);

int main(int argc, char** argv, char** envp)
{
    if (argc > 2 && rvvm_strcmp(argv[1], "-user")) {
        rvvm_set_loglevel(LOG_INFO);
        return rvvm_user(argc - 2, argv + 2, envp);
    }
    ...
```
Then start the test with the command below.
```
$ make CFLAGS="-static" && rvvm -user hello.elf

██▀███ ██▒ █▓ ██▒ █▓ ███▄ ▄███▓
▓██ ▒ ██▒▓██░ █▒▓██░ █▒▓██▒▀█▀ ██▒
▓██ ░▄█ ▒ ▓██ █▒░ ▓██ █▒░▓██ ▓██░
▒██▀▀█▄ ▒██ █░░ ▒██ █░░▒██ ▒██
░██▓ ▒██▒ ▒▀█░ ▒▀█░ ▒██▒ ░██▒
░ ▒▓ ░▒▓░ ░ █░ ░ █░ ░ ▒░ ░ ░
░▒ ░ ▒░ ░ ░░ ░ ░░ ░ ░ ░
░░ ░ ░░ ░░ ░ ░
░ ░ ░ ░ ░

Detected OS: Linux
Detected CC: GCC 9
Target arch: riscv
Version: RVVM 0.6-f241437-dirty
...
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libX11.a(xim_trans.o): in function `_XimXTransSocketINETConnect':
(.text+0xcfc): warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libX11.a(OpenDis.o): in function `OutOfMemory':
(.text+0x3c4): undefined reference to `xcb_disconnect'
...
```
It seems that the libraries being linked belong to the x86_64 architecture, whereas I expected RISC-V ones, and I do not yet understand why this happens.