# Refine system emulation for rv32emu
> 陳禹丞

## Objective
1. Study the [Virtio Specification](https://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.pdf), and use the previous study to understand the internal behavior of semu.
2. Analyze and list the preparation tasks for migrating Virtio device emulation from semu to rv32emu, covering differences in structures and access interfaces.

## Virtio Specification
### Introduction
Virtio is an open standard and framework that defines virtualization-friendly abstractions for drivers and hardware. It is primarily used in virtualized environments to efficiently virtualize I/O devices, including network adapters, disk controllers, etc.

### Structure Specifications
According to §1.4 of the Virtio specification, all in-memory structures are assumed to be without additional padding. That is, the corresponding C structures should be declared with the GNU extension `__attribute__((packed))`.

### Virtqueues
The mechanism for bulk data transport on Virtio devices is called a virtqueue. Each device can have zero or more virtqueues. For example, a network device can have two virtqueues, for transmitting and receiving data, respectively.

There are two types of virtqueues:
1. Split Virtqueue
    * The only format supported by Virtio v1.0 and earlier.
    * The format separates the virtqueue into several parts, where each part is writable by either the driver or the device, but not both.
    * Multiple parts and/or locations within a part need to be updated when making a buffer available and when marking it as used.
    * Since the current semu virtio implementation employs the split virtqueue, this note will primarily focus on discussing the split virtqueue.
2. Packed Virtqueue
    * An alternative compact virtqueue layout using read-write memory, that is, memory that is both read and written by both the host and the guest.
    * As the split virtqueue design is not cache-friendly, the [virtio: support packed ring](https://lore.kernel.org/lkml/20181121100330.24846-1-tiwei.bie@intel.com/) patch reported a 30% performance gain when using the packed virtqueue.

#### Split Virtqueue
Each split virtqueue consists of three parts:
1. Descriptor Table: occupies the Descriptor Area
    * Device Read-Only
    * Even though the virtio specification (§2.7.5.3) allows indirect descriptors, semu does not implement the `VIRTQ_DESC_F_INDIRECT` feature. Therefore, this note will not cover it.
    * If the `VRING_DESC_F_WRITE` flag of a descriptor is set, the buffer it points to is write-only for the device; otherwise, the buffer is read-only for the device.
2. Available Ring: occupies the Driver Area
    * Device Read-Only
    * `used_event` shall be ignored if `VIRTIO_F_EVENT_IDX` is not negotiated.
3. Used Ring: occupies the Device Area
    * Driver Read-Only
    * `avail_event` shall be ignored if `VIRTIO_F_EVENT_IDX` is not negotiated.

```graphviz
digraph split_vq{
    rankdir=LR
    node [shape=record];
    queue [label="<s0>Descriptor Table|<s1>Available Ring \n(...padding)|<s2>Used Ring"];
    vring_desc [label="le64 addr|le32 len|le16 flags|le16 next"]
    vring_avail [label="le16 flags|le16 idx|le16 ring[]|le16 used_event"]
    vring_used [label="le16 flags|le16 idx|<ring>struct vring_used_elem ring[]|le16 avail_event"]
    vring_used_elem [label="le32 id|le32 len"]
    queue:s0 -> vring_desc
    queue:s1 -> vring_avail
    queue:s2 -> vring_used
    vring_used:ring -> vring_used_elem
}
```

## Preparation for Migrating from semu to rv32emu
### Todo List
- [x] Migrate virtio.h & virtio-blk.c
- [x] Modify MMIO function-like macros.
- [x] Modify Device Tree
    * To align with semu, the virtio-blk device is located at 0x4200000; 0x4100000 is reserved for a future virtio-net implementation.
- [x] Integrate virtio-related declarations/definitions into device/virtio.h
    * Originally located in device.h in semu, but rv32emu does not have a corresponding file.
- [x] Enable Virtio Linux config
    * Already enabled by the default config of rv32emu.
- [x] Parse the virtio block device argument
    * As 'b' for block device and 'd' for disk are already taken by other arguments, 'v' is chosen for the virtio-blk device.
- [x] Map the image file into memory

### MMU
In semu, all memory accesses are performed using `mem_store`, `mem_load`, and `mem_fetch`. rv32emu, in contrast, distinguishes between access widths and provides variants such as `mmu_write_w/s/b` for the different operations.

Additionally, semu does not factor virtual-to-physical address translation into a separate function, resulting in huge `mem_store` and `mem_load` functions. rv32emu, on the other hand, has an `mmu_translate` function and also checks whether the address belongs to the PLIC or UART region within the `MMIO_READ`/`MMIO_WRITE` function-like macros.

Note that, in semu's implementation of virtio registers, the `RV_EXC_LOAD_MISALIGN` exception is raised if the access width is not 4.

Also, `ram` in semu is a raw pointer to a `uint32_t` array, whereas in rv32emu it is a pointer to `memory_t`:
```c
typedef struct {
    uint8_t *membase;
    uint64_t mem_size;
} memory_t;
```

### Misalignment and illegal instruction handling
In semu, PLIC and UART I/O accept only word and byte instructions, respectively, and raise the `RV_EXC_LOAD_MISALIGN` exception for any other access width. However, rv32emu does not verify whether the accessing instruction is a word or byte instruction.

```c
/* semu implementation of plic_read */
void plic_read(hart_t *vm,
               plic_state_t *plic,
               uint32_t addr,
               uint8_t width,
               uint32_t *value)
{
    switch (width) {
    case RV_MEM_LW:
        if (!plic_reg_read(plic, addr >> 2, value))
            vm_set_exception(vm, RV_EXC_LOAD_FAULT, vm->exc_val);
        break;
    case RV_MEM_LBU:
    case RV_MEM_LB:
    case RV_MEM_LHU:
    case RV_MEM_LH:
        vm_set_exception(vm, RV_EXC_LOAD_MISALIGN, vm->exc_val);
        return;
    default:
        vm_set_exception(vm, RV_EXC_ILLEGAL_INSN, 0);
        return;
    }
}
```

## Development Progress
The virtio-blk device has been successfully migrated into rv32emu. The code is currently being cleaned up.
```
[    1.700327] virtio_blk virtio0: 1/0/0 default/read/poll queues
[    1.702067] virtio_blk virtio0: [vda] 524288 512-byte logical blocks (268 MB/256 MiB)
```

## Future Work: Migrate virtio-net into rv32emu
Currently, the virtio-net device in semu does not function properly, so it is not covered in this note.

## Issue
### 1. Linking Error with GCC
Linking rv32emu with the `-O0` flag fails. The Makefile change and the resulting error message are as follows:
```diff
$ git diff Makefile
diff --git a/Makefile b/Makefile
index 9921389..ee0e244 100644
--- a/Makefile
+++ b/Makefile
@@ -7,7 +7,7 @@ BIN := $(OUT)/rv32emu
 CONFIG_FILE := $(OUT)/.config
 -include $(CONFIG_FILE)
-CFLAGS = -std=gnu99 -O2 -Wall -Wextra -Werror
+CFLAGS = -std=gnu99 -O0 -Wall -Wextra -Werror
 CFLAGS += -Wno-unused-label
 CFLAGS += -include src/common.h -Isrc/
```

```bash
$ cc --version
cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ make BUILD_SYSTEM=1
...
/usr/bin/ld: /tmp/ccQKnLSJ.ltrans0.ltrans.o: in function `fdt_del_node':
<artificial>:(.text+0x25fae): undefined reference to `fdt_node_end_offset_'
collect2: error: ld returned 1 exit status
make: *** [Makefile:314: build/rv32emu] Error 1
```

:::warning
Check whether the latest `master` branch still has the above issue. If so, create an issue on GitHub.
> [name=otteryc] After discussing the linking process and referencing [c-linker-loader](/@sysprog/c-linker-loader), we found that Link-Time Optimization (LTO) was the culprit. GCC worked fine after adding the `-fno-lto` flag. I am not quite sure whether it should be reported as an issue on GitHub.
:::

However, the build succeeds with clang-18:
```bash
$ CC=clang-18 make BUILD_SYSTEM=1
...
  CC      build/dtc/libfdt/fdt_rw.o
  CC      build/main.o
  CC      build/devices/plic.o
  CC      build/devices/uart.o
  LD      build/rv32emu
$ echo $?
0
```

### 2. Readability Issue
The code from semu accesses the available ring and the used ring by computing offsets manually, which makes the code difficult to read. For instance:
```c
uint32_t vq_used_addr =
    queue->QueueUsed + 1 + (new_used % queue->QueueNum) * 2;
ram[vq_used_addr] = buffer_idx; /* virtq_used_elem.id  (le32) */
ram[vq_used_addr + 1] = len;    /* virtq_used_elem.len (le32) */
queue->last_avail++;
new_used++;
```
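
One possible way to make this more readable is to describe the used ring with packed structures, as the specification does, and let `offsetof`/`sizeof` derive the offsets. The following is only a minimal sketch, not the code adopted in rv32emu: the helper name `used_ring_push` and the `PACKED` macro are hypothetical, `queue_used` is assumed to be a word index into `ram` (as `QueueUsed` is in the semu snippet above), and the host is assumed to be little-endian so that the `le32` fields can be stored directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PACKED __attribute__((packed)) /* hypothetical shorthand */

/* Used ring layout of a split virtqueue, following the Virtio specification. */
struct virtq_used_elem {
    uint32_t id;  /* le32: head index of the used descriptor chain */
    uint32_t len; /* le32: number of bytes written into the buffer */
} PACKED;

struct virtq_used {
    uint16_t flags;
    uint16_t idx;
    struct virtq_used_elem ring[]; /* queue_num entries */
    /* le16 avail_event follows the ring if VIRTIO_F_EVENT_IDX is negotiated */
} PACKED;

/* The layout must match the "+ 1" and "* 2" word offsets used by semu. */
_Static_assert(sizeof(struct virtq_used_elem) == 8, "element is two words");
_Static_assert(offsetof(struct virtq_used, ring) == 4, "ring follows one word");

/* Hypothetical helper: store one element into slot (new_used % queue_num).
 * Assumes a little-endian host and that queue_used is a word index into ram.
 */
static inline void used_ring_push(uint32_t *ram,
                                  uint32_t queue_used,
                                  uint16_t queue_num,
                                  uint16_t new_used,
                                  uint32_t id,
                                  uint32_t len)
{
    assert(queue_num != 0);
    const struct virtq_used_elem elem = {.id = id, .len = len};
    const size_t slot = new_used % queue_num;
    uint8_t *used = (uint8_t *) &ram[queue_used];
    /* memcpy avoids aliasing ram through a struct pointer. */
    memcpy(used + offsetof(struct virtq_used, ring) + slot * sizeof(elem),
           &elem, sizeof(elem));
}
```

With such a helper, the two `ram[...]` stores above could be written as `used_ring_push(ram, queue->QueueUsed, queue->QueueNum, new_used, buffer_idx, len);`, leaving the offset arithmetic to the structure definitions.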