To determine whether the ELF loader is complete, it is necessary to trace the relevant ELF loader code.
The code above shows that the ELF file can be loaded by `rvread` and verified by `read_uint32_le_m`. Besides, the commit message of the code tells us that the ELF loader of RVVM is implemented as a userspace application. Therefore, the first step is to build RVVM.
elf_load: Userland ELF loading, use VMA helpers
- Implement mapping an ELF into userspace VMA
- Relocatable ELF_DYN
- Relocate elf->entry & elf->phdr against elf->base
- Fix ELF interpreter parsing
According to the RVVM guide, getting RVVM up and running requires three things: `fw_jump.bin` from OpenSBI, a Linux kernel image, and a root filesystem. First, `fw_jump.bin` can be taken from an OpenSBI release. After downloading the release, `fw_jump.bin` is under the `share` directory.
To build the Linux kernel, we should configure and build the GNU Toolchain first.
After building the GNU Toolchain, add the environment variable to the `bashrc`. The purpose of this step is to help the compiler find the GNU Toolchain.
Then download the Linux source from GitHub, check out whichever version you want, and generate the Linux kernel Image.
The preparation is not finished yet: the final step is to create the root filesystem. The root filesystem is the first filesystem mounted when a computer or a virtual machine starts, and it often includes the directories below.
In this case, I use buildroot and busybox to build the root filesystem. I download buildroot first, then take the buildroot `.config` file and the busybox `.config` file from semu to build the root filesystem image. To configure the output format of the root filesystem, use the following command.
Then `make` the project, and you get a `rootfs.ext2` root filesystem in the output directory.
With the preparation complete, run RVVM by following the guide in `README.md`.
Then an error occurs.
I had no idea where the error came from, so I asked the developer of the project and got a suggestion to use a specific kernel config file that was built for riscv64. So I took that config file and adjusted it.
As the figure above shows, I changed the Base ISA option from `CONFIG_ARCH_RV64I` to `CONFIG_ARCH_RV32I`, then built another Linux kernel image from this configuration.
Finally, RVVM can run successfully.
What is an ELF file? ELF is short for Executable and Linkable Format, a common standard file format for executable files, object code, shared libraries and core dumps. It usually consists of three parts: the ELF header, the program header table and the section header table. The ELF header holds the general information about the ELF file, the program header table describes the segments used at run time, and the section header table lists the sections.
The ELF header layout differs between RV32 and RV64, but the fields before the entry point sit at the same offsets in both.
Offset | RV32 | RV64 |
---|---|---|
0x0 | 0x7f 0x45 0x4c 0x46 | 0x7f 0x45 0x4c 0x46 |
0x4 | 1 | 2 |
0x5 | 1 : little endian 2 : big endian | 1 : little endian 2 : big endian |
0x6 | 1 | 1 |
0x7 | ABI number (eg. Linux=3) | ABI number (eg. Linux=3) |
0x8 | ABI version | ABI version |
0x9 | unused | unused |
0x10 | object file type (eg. Executable file=2) | object file type (eg. Executable file=2) |
0x12 | ISA | ISA |
0x14 | 1 | 1 |
0x18 | entry point | entry point |
0x1c | pointer to the start of program header table | entry point (continued) |
0x20 | pointer to the start of section header table | pointer to the start of program header table |
0x24 | depends on the target architecture | pointer to the start of program header table (continued) |
0x28 | size of this header | pointer to the start of section header table |
0x2a | size of program header table entry | pointer to the start of section header table (continued) |
0x2c | number of entries in the program header table | pointer to the start of section header table (continued) |
0x2e | size of a section header table entry | pointer to the start of section header table (continued) |
0x30 | number of entries in the section header table | depends on the target architecture |
0x32 | index of the section header table entry that holds the section names | depends on the target architecture (continued) |
0x34 | end of ELF Header | size of this header |
0x36 | | size of program header table entry |
0x38 | | number of entries in the program header table |
0x3a | | size of a section header table entry |
0x3c | | number of entries in the section header table |
0x3e | | index of the section header table entry that holds the section names |
0x40 | | end of ELF Header |
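Reading header fields by offset can be sketched in C as follows. This is only an illustration using the standard ELF32 and ELF64 offsets; `struct elf_header_view` and `elf_parse_header` are hypothetical names that do not come from RVVM, and the sketch assumes a little-endian file.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian readers; only little-endian ELF files are handled here. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p) { return (uint32_t)rd16(p) | ((uint32_t)rd16(p + 2) << 16); }
static uint64_t rd64(const uint8_t *p) { return (uint64_t)rd32(p) | ((uint64_t)rd32(p + 4) << 32); }

/* Hypothetical class-independent view of a few header fields. */
struct elf_header_view {
    int      is_64;     /* 0: ELF32, 1: ELF64 */
    uint16_t machine;   /* ISA */
    uint64_t entry;     /* entry point */
    uint64_t phoff;     /* start of program header table */
    uint64_t shoff;     /* start of section header table */
    uint16_t phnum;
    uint16_t shnum;
    uint16_t shstrndx;
};

/* Returns 0 on success, -1 if the buffer is too short or not a
 * little-endian ELF32/ELF64 header. */
static int elf_parse_header(const uint8_t *buf, size_t len, struct elf_header_view *out)
{
    if (len < 0x34 || rd32(buf) != 0x464c457fu) return -1;
    if (buf[5] != 1) return -1;                 /* 1 = little endian */
    out->machine = rd16(buf + 0x12);            /* same offset in both classes */
    if (buf[4] == 1) {                          /* ELF32: 4-byte addresses */
        out->is_64    = 0;
        out->entry    = rd32(buf + 0x18);
        out->phoff    = rd32(buf + 0x1c);
        out->shoff    = rd32(buf + 0x20);
        out->phnum    = rd16(buf + 0x2c);
        out->shnum    = rd16(buf + 0x30);
        out->shstrndx = rd16(buf + 0x32);
        return 0;
    }
    if (buf[4] == 2 && len >= 0x40) {           /* ELF64: 8-byte addresses */
        out->is_64    = 1;
        out->entry    = rd64(buf + 0x18);
        out->phoff    = rd64(buf + 0x20);
        out->shoff    = rd64(buf + 0x28);
        out->phnum    = rd16(buf + 0x38);
        out->shnum    = rd16(buf + 0x3c);
        out->shstrndx = rd16(buf + 0x3e);
        return 0;
    }
    return -1;
}
```

Reading byte by byte instead of casting the buffer to a struct avoids alignment and host-endianness concerns, at the cost of spelling out every offset.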
The ELF header contains a pointer to the start of the program header table. The program header table gives further details of the ELF file, mostly the information needed to execute it. As with the ELF header, the layout differs slightly between RV32 and RV64.
Offset | RV32 | RV64 |
---|---|---|
0x00 | Type of the segment | Type of the segment |
0x04 | Offset of the segment in the file image | Segment-dependent flags |
0x08 | Virtual address where it should be loaded | Offset of the segment in the file image |
0x0c | Physical address where it should be loaded | |
0x10 | Size on file | Virtual address where it should be loaded |
0x14 | Size on memory | |
0x18 | Segment-dependent flags | Physical address where it should be loaded |
0x1c | Alignment | |
0x20 | End of the program header | Size on file |
0x28 | | Size on memory |
0x30 | | Alignment |
0x38 | | End of the program header |
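The main practical difference between the two layouts is where the flags field sits and how wide the address fields are. A minimal C sketch of decoding one program header entry is given below, using the standard ELF32 and ELF64 offsets; `struct elf_phdr_view` and `elf_parse_phdr` are hypothetical names, not RVVM code, and a little-endian file is assumed.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t rd64(const uint8_t *p)
{
    return (uint64_t)rd32(p) | ((uint64_t)rd32(p + 4) << 32);
}

/* Hypothetical class-independent view of one program header entry. */
struct elf_phdr_view {
    uint32_t type, flags;
    uint64_t offset, vaddr, paddr, filesz, memsz, align;
};

/* Decodes one entry: 0x20 bytes for ELF32, 0x38 bytes for ELF64.
 * Returns 0 on success, -1 if the buffer is too short. */
static int elf_parse_phdr(const uint8_t *p, size_t len, int is_64,
                          struct elf_phdr_view *out)
{
    if (len < (is_64 ? 0x38u : 0x20u)) return -1;
    out->type = rd32(p);
    if (is_64) {
        out->flags  = rd32(p + 0x04);   /* flags come second in ELF64 */
        out->offset = rd64(p + 0x08);
        out->vaddr  = rd64(p + 0x10);
        out->paddr  = rd64(p + 0x18);
        out->filesz = rd64(p + 0x20);
        out->memsz  = rd64(p + 0x28);
        out->align  = rd64(p + 0x30);
    } else {
        out->offset = rd32(p + 0x04);
        out->vaddr  = rd32(p + 0x08);
        out->paddr  = rd32(p + 0x0c);
        out->filesz = rd32(p + 0x10);
        out->memsz  = rd32(p + 0x14);
        out->flags  = rd32(p + 0x18);   /* flags come near the end in ELF32 */
        out->align  = rd32(p + 0x1c);
    }
    return 0;
}
```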
According to the previous section, RVVM can be started with `fw_jump.bin`, a Linux kernel and a root filesystem. Let's focus on the root filesystem: under the root there is a directory whose program runs as the initial process of the machine. This can be seen in the kernel message.
The message shows that the file in that directory is the first file to be loaded. So, instead of building the root filesystem with busybox and buildroot, I build a root filesystem that contains only the ELF file I want.
The command above creates a filesystem with only the ELF file in it. It was provided in an RVVM issue. Then run RVVM, and the ELF file is loaded as the initial program of the machine.
After loading the ELF file as the initial program of RVVM, a question arises: can the ELF file be loaded in some other way?
To figure out how the ELF file is loaded, the loading code must be traced. After reading it, I have to say it does not look like a typical ELF loader: the ELF loader of RVVM provides only what is essential for execution. And what is essential for execution? As the `struct elf_desc_t` shown below indicates, `base` and `buf_size` are the address and the size of the ELF file. Beyond that, `struct elf_desc_t` only provides the physical address, the dynamically linked file name, and the number of entries in the program header table.
With `readelf` you can see what the physical address, the dynamically linked file name, and the number of program header entries are.
The program header table describes the segments of the ELF file, including the ones that hold the machine code.
`bin_objcopy` loads the ELF file and checks whether it really is an ELF file by reading the magic number `0x464c457F`.
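The magic number is simply the first four bytes of the file, `0x7f 'E' 'L' 'F'`, read as a little-endian 32-bit value. A minimal sketch of that check is shown below; `read_u32_le` and `has_elf_magic` are hypothetical helpers written for illustration, not the RVVM functions.

```c
#include <stddef.h>
#include <stdint.h>

#define ELF_MAGIC_LE 0x464c457fu   /* bytes 7f 45 4c 46 read little-endian */

/* Assembles a 32-bit value from four bytes, least significant byte first. */
static uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the buffer starts with the ELF magic, 0 otherwise. */
static int has_elf_magic(const uint8_t *buf, size_t len)
{
    return len >= 4 && read_u32_le(buf) == ELF_MAGIC_LE;
}
```

Because the bytes are assembled explicitly, the check gives the same result on big-endian and little-endian hosts.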
In `elf_load`, it is clear that ELF32 loading is supported. From line 3 to line 18 the ELF header is parsed, and the lines after line 20 map the ELF file into the virtual memory of RVVM. So why is it not a typical ELF loader? Because the function does not expose information such as the ELF header, the program header table and the symbols. As a result, it cannot properly handle an ELF file that has not been stripped.
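The mapping step of a loader usually comes down to copying each loadable segment to its target address and zero-filling the part that exists only in memory (such as `.bss`). The sketch below illustrates that idea with a flat byte buffer standing in for guest memory. This is an assumption made for brevity: RVVM itself uses VMA helpers, and `struct segment` and `load_segment` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEGMENT_TYPE_LOAD 1u   /* PT_LOAD in the ELF specification */

/* Hypothetical subset of a decoded program header entry. */
struct segment {
    uint32_t type;
    uint64_t offset, vaddr, filesz, memsz;
};

/* Copies one loadable segment from `file` into `mem`, where mem[0]
 * corresponds to guest address `mem_base`. Non-loadable segments are
 * skipped. Returns 0 on success, -1 if the segment is out of bounds. */
static int load_segment(const uint8_t *file, size_t file_len,
                        uint8_t *mem, size_t mem_len, uint64_t mem_base,
                        const struct segment *seg)
{
    if (seg->type != SEGMENT_TYPE_LOAD) return 0;
    if (seg->filesz > seg->memsz) return -1;
    if (seg->offset > file_len || seg->filesz > file_len - seg->offset) return -1;
    if (seg->vaddr < mem_base) return -1;

    uint64_t dst = seg->vaddr - mem_base;
    if (dst > mem_len || seg->memsz > mem_len - dst) return -1;

    /* File-backed part, then the zero-initialised tail. */
    memcpy(mem + dst, file + seg->offset, (size_t)seg->filesz);
    memset(mem + dst + seg->filesz, 0, (size_t)(seg->memsz - seg->filesz));
    return 0;
}
```

A caller would run this once per program header entry and then jump to the entry point taken from the ELF header.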
To read both stripped and non-stripped files, we must know how they differ. The difference is that a stripped file lacks the symbol table.
Reading the symbol table requires reading the full ELF header to obtain the location and number of the section header entries.
The loading process consists of parsing the ELF header and the program headers of the ELF file, while the section headers are what is needed to disassemble the program. The first two parts of the loading process, parsing the ELF header and mapping the program segments into memory, are already done in RVVM.
However, the section headers are ignored during loading, so the ELF file cannot be read completely, especially a non-stripped one.
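Whether a file still carries its symbol table can be checked by walking the section header table and looking for a section of type `SHT_SYMTAB` (value 2). The sketch below assumes the standard section header entry sizes (0x28 bytes for ELF32, 0x40 bytes for ELF64, with the type field at offset 4 in both) and a little-endian file; `elf_has_symtab` is a hypothetical helper, and `shoff` and `shnum` are expected to come from the ELF header.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTION_TYPE_SYMTAB 2u   /* SHT_SYMTAB in the ELF specification */

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if a symbol table section exists, 0 if none is found
 * (the file looks stripped), -1 if the table does not fit in the file. */
static int elf_has_symtab(const uint8_t *file, size_t len,
                          uint64_t shoff, uint16_t shnum, int is_64)
{
    size_t entsize = is_64 ? 0x40 : 0x28;   /* assumes standard entry size */

    if (shoff > len || (uint64_t)shnum * entsize > len - shoff) return -1;
    for (uint16_t i = 0; i < shnum; i++) {
        const uint8_t *sh = file + shoff + (size_t)i * entsize;
        if (rd32(sh + 4) == SECTION_TYPE_SYMTAB) return 1;   /* sh_type */
    }
    return 0;
}
```

A stricter version would take the entry size from the ELF header instead of assuming the standard value.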
Then I write a function to load the ELF header, and also separate the ELF check from `elf_load` so that it serves as a verification of the ELF header.
To load the ELF header, I take a simple approach: I point `struct ELFEhdr* ehdr` at the ELF file and set the buffer size to the size of `struct ELFEhdr`.
To check the result, I use `hexdump` and `readelf` to dump the raw data in `hello.elf`.
Then I modify the code in `rvvm_main` so that RVVM can take the argument from the terminal.
After setting up the experiment, I randomly pick three members of `ehdr`, namely `e_machine`, `e_shnum` and `e_shstrndx`, to see whether the code runs successfully.
The execution part is a big problem in RVVM, because RVVM currently does not support a userspace ELF loader. Userspace emulation is different from the virtual machine in RVVM: the two are independent platforms that share the same CPU code. After I discussed this with the developer of RVVM, they released a WIP for running an ELF file in userspace (see commit cdfea8b). That part still has a lot of bugs, so debugging continues.
The ELF load API in RVVM has two modes. The first one loads the kernel file and image file into the virtual memory of the virtual machine. In this mode, the ELF file can be written into the memory of the machine and used as vmlinux.
The second one runs a userspace process.
To run a userspace process, we also need a `struct exec_desc_t` to describe the process and program information.
According to the developer of RVVM, loading an x86_64 ELF file as a user process has been tested. The test method is to add the following code to `main`, then pass in the ELF file you want.
Then begin the test with the command below.
It seems that the dynamic linking file that gets attached is for the x86_64 architecture, although it was supposed to be RISC-V, and I do not know why.