---
tags: 2022 computer architecture
---

# rv32emu Development Log

contributed by < [`Risheng1128`](https://github.com/Risheng1128) >

:::info
Goals:
- [x] Pass the [riscv-arch-test](https://github.com/riscv-non-isa/riscv-arch-test) suite for the Standard Extension for Compressed Instructions (when this work started, `c.ebreak` did not pass)
- [x] Resolve [Avoid duplications in RISC-V exception handlers #61](https://github.com/sysprog21/rv32emu/issues/61)
- [x] Pass the privilege instruction tests
- [ ] Resolve [Migrate to latest RISC-V Architecture Test #49](https://github.com/sysprog21/rv32emu/issues/49)
- [ ] Improve the execution performance of rv32emu
:::

## Test environment

```shell
$ riscv64-unknown-elf-gcc --version
riscv64-unknown-elf-gcc (g2ee5e430018) 12.2.0

$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

$ lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          12
On-line CPU(s) list:             0-11
Thread(s) per core:              2
Core(s) per socket:              6
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           141
Model name:                      11th Gen Intel(R) Core(TM) i5-11400H @ 2.70GHz
Stepping:                        1
CPU MHz:                         2700.000
CPU max MHz:                     4500.0000
CPU min MHz:                     800.0000
BogoMIPS:                        5376.00
Virtualization:                  VT-x
L1d cache:                       288 KiB
L1i cache:                       192 KiB
L2 cache:                        7.5 MiB
L3 cache:                        12 MiB
NUMA node0 CPU(s):               0-11
```

## Environment setup

First, download the RISC-V GNU toolchain, [riscv-gnu-toolchain](https://github.com/riscv-collab/riscv-gnu-toolchain). The installation steps are as follows:

```shell
sudo apt install autoconf automake autotools-dev curl gawk git build-essential bison flex texinfo gperf libtool patchutils bc git libmpc-dev libmpfr-dev libgmp-dev gawk zlib1g-dev libexpat1-dev
git clone --recursive https://github.com/riscv/riscv-gnu-toolchain
cd riscv-gnu-toolchain
mkdir -p build && cd build
../configure --prefix=/opt/riscv --enable-multilib
sudo make -j$(nproc)
```

:::info
**[xPack GNU RISC-V Embedded GCC](https://xpack.github.io/riscv-none-elf-gcc/)** is another toolchain option. It provides prebuilt binaries for MS-Windows, Linux, and macOS.
:::

Once the build finishes, add the toolchain path to the `PATH` environment variable:

```shell
vim ~/.bashrc
# append the following line at the bottom
export PATH=$PATH:/opt/riscv/bin
```

After completing the steps above, run the following command to check that the installation succeeded:

```shell
riscv64-unknown-elf-gcc -v
```

Expected output:

```shell
$ riscv64-unknown-elf-gcc -v
Using built-in specs.
COLLECT_GCC=riscv64-unknown-elf-gcc
COLLECT_LTO_WRAPPER=/opt/riscv/libexec/gcc/riscv64-unknown-elf/12.2.0/lto-wrapper
Target: riscv64-unknown-elf
...
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 12.2.0 (g2ee5e430018)
```

## Downloading [rv32emu](https://github.com/Risheng1128/rv32emu/tree/master)

With the toolchain installed, the rv32emu project can be used. Clone the repository:

> git clone https://github.com/Risheng1128/rv32emu.git

Install the [SDL2 library](https://www.libsdl.org/):

> sudo apt install libsdl2-dev

In `mk/toolchain.mk`, change `CROSS_COMPILE` to the toolchain in use:

```diff
- CROSS_COMPILE ?= riscv32-unknown-elf-
+ CROSS_COMPILE ?= riscv64-unknown-elf-
```

:::warning
Following the approach in `tests/gdbstub.sh`, a list of [target triplets](https://wiki.osdev.org/Target_Triplet) could be provided so that the build system detects and uses an available target.
> The complete change is in [Detect toolchain automatically](https://github.com/sysprog21/rv32emu/commit/99c9e34389ab65240db6b26f857ddd2b524b5aca)
:::

Build the emulator:

```shell
make
```

Expected output:

```
CC build/map.o
make -C src/mini-gdbstub/ O=/home/benson/rv32emu/build/mini-gdbstub/
make[1]: Entering directory '/home/benson/rv32emu/src/mini-gdbstub'
cc -c -Iinclude -Wall -Wextra -MMD -O3 ./lib/conn.c -o /home/benson/rv32emu/build/mini-gdbstub//conn.o
cc -c -Iinclude -Wall -Wextra -MMD -O3 ./lib/packet.c -o /home/benson/rv32emu/build/mini-gdbstub//packet.o
cc -c -Iinclude -Wall -Wextra -MMD -O3 ./lib/gdbstub.c -o /home/benson/rv32emu/build/mini-gdbstub//gdbstub.o
cc -c -Iinclude -Wall -Wextra -MMD -O3 ./lib/utils/csum.c -o /home/benson/rv32emu/build/mini-gdbstub//csum.o
cc -c -Iinclude -Wall -Wextra -MMD -O3 ./lib/utils/translate.c -o /home/benson/rv32emu/build/mini-gdbstub//translate.o
ar -rcs /home/benson/rv32emu/build/mini-gdbstub//libgdbstub.a /home/benson/rv32emu/build/mini-gdbstub//conn.o /home/benson/rv32emu/build/mini-gdbstub//packet.o /home/benson/rv32emu/build/mini-gdbstub//gdbstub.o /home/benson/rv32emu/build/mini-gdbstub//csum.o /home/benson/rv32emu/build/mini-gdbstub//translate.o
make[1]: Leaving directory '/home/benson/rv32emu/src/mini-gdbstub'
CC build/emulate.o
CC build/io.o
CC build/elf.o
CC build/main.o
CC build/syscall.o
CC build/syscall_sdl.o
CC build/gdbstub.o
CC build/breakpoint.o
LD build/rv32emu
```

`make check` runs the built-in test targets. Expected output:

```
Running hello.elf ... [OK]
Running puzzle.elf ... [OK]
Running pi.elf ... [OK]
```

## [rv32emu](https://github.com/sysprog21/rv32emu) - RISC-V emulator with ELF support

rv32emu is a 32-bit RISC-V instruction set emulator: it models a RISC-V processor in software and reproduces the behavior of each instruction. For an introduction to building a simple RISC-V emulator, see [Write a simple RISC-V emulator in plain C](https://fmash16.github.io/content/posts/riscv-emulator-in-c.html).

### Main execution flow of rv32emu

This section walks through the `main` function of rv32emu and concentrates on the design of the RISC-V processor model. Everything else can be found in [`src/main.c`](https://github.com/sysprog21/rv32emu/blob/master/src/main.c).

`main` starts by parsing the command-line arguments:

```c
if (!parse_args(argc, args)) {
    print_usage(args[0]);
    return 1;
}
```

It then opens the executable named by the user:

```c
/* open the ELF file from the file system */
elf_t *elf = elf_new();
if (!elf_open(elf, opt_prog_name)) {
    fprintf(stderr, "Unable to open ELF file '%s'\n", opt_prog_name);
    return 1;
}
```

Next it sets up the I/O interface of the RISC-V emulator:

```c
/* install the I/O handlers for the RISC-V runtime */
const struct riscv_io_t io = {
    /* memory read interface */
    .mem_ifetch = MEMIO(ifetch),
    .mem_read_w = MEMIO(read_w),
    .mem_read_s = MEMIO(read_s),
    .mem_read_b = MEMIO(read_b),

    /* memory write interface */
    .mem_write_w = MEMIO(write_w),
    .mem_write_s = MEMIO(write_s),
    .mem_write_b = MEMIO(write_b),

    /* system */
    .on_ecall = syscall_handler,
    .on_ebreak = rv_halt,
};
```

The structure `riscv_io_t` is defined in `src/riscv.h`, as shown below. Every member is a function pointer, and the suffixes `w`, `s`, and `b` stand for word, half-word, and byte accesses respectively.

```c
/* memory read handlers */
typedef riscv_word_t (*riscv_mem_ifetch)(struct riscv_t *rv, riscv_word_t addr);
typedef riscv_word_t (*riscv_mem_read_w)(struct riscv_t *rv, riscv_word_t addr);
typedef riscv_half_t (*riscv_mem_read_s)(struct riscv_t *rv, riscv_word_t addr);
typedef riscv_byte_t (*riscv_mem_read_b)(struct riscv_t *rv, riscv_word_t addr);

/* memory write handlers */
typedef void (*riscv_mem_write_w)(struct riscv_t *rv,
                                  riscv_word_t addr,
                                  riscv_word_t data);
typedef void (*riscv_mem_write_s)(struct riscv_t *rv,
                                  riscv_word_t addr,
                                  riscv_half_t data);
typedef void (*riscv_mem_write_b)(struct riscv_t *rv,
                                  riscv_word_t addr,
                                  riscv_byte_t data);

/* system instruction handlers */
typedef void (*riscv_on_ecall)(struct riscv_t *rv);
typedef void (*riscv_on_ebreak)(struct riscv_t *rv);

/* RISC-V emulator I/O interface */
struct riscv_io_t {
    /* memory read interface */
    riscv_mem_ifetch mem_ifetch;
    riscv_mem_read_w mem_read_w;
    riscv_mem_read_s mem_read_s;
    riscv_mem_read_b mem_read_b;

    /* memory write interface */
    riscv_mem_write_w mem_write_w;
    riscv_mem_write_s mem_write_s;
    riscv_mem_write_b mem_write_b;

    /* system */
    riscv_on_ecall on_ecall;
    riscv_on_ebreak on_ebreak;
};
```

The emulator itself is then created:

```c
state_t *state = state_new();

/* find the start of the heap */
const struct Elf32_Sym *end;
if ((end = elf_get_symbol(elf, "_end")))
    state->break_addr = end->st_value;

/* create the RISC-V runtime */
struct riscv_t *rv = rv_create(&io, state);
if (!rv) {
    fprintf(stderr, "Unable to create riscv emulator\n");
    return 1;
}
```

This uses `riscv_t`, the structure that manages the whole emulator. It is defined in `src/riscv_private.h`, as shown below, and holds the processor state: the I/O interface, the general-purpose registers, the program counter, and so on.

```c
struct riscv_t {
    bool halt;

    /* I/O interface */
    struct riscv_io_t io;

    /* integer registers */
    riscv_word_t X[RV_NUM_REGS];
    riscv_word_t PC;

    /* user provided data */
    riscv_user_t userdata;

#if RV32_HAS(GDBSTUB)
    /* gdbstub instance */
    gdbstub_t gdbstub;

    /* GDB instruction breakpoint */
    breakpoint_map_t breakpoint_map;
#endif

#if RV32_HAS(EXT_F)
    /* float registers */
    union {
        riscv_float_t F[RV_NUM_REGS];
        uint32_t F_int[RV_NUM_REGS]; /* integer shortcut */
    };
    uint32_t csr_fcsr;
#endif

    /* csr registers */
    uint64_t csr_cycle;
    uint32_t csr_mstatus;
    uint32_t csr_mtvec;
    uint32_t csr_misa;
    uint32_t csr_mtval;
    uint32_t csr_mcause;
    uint32_t csr_mscratch;
    uint32_t csr_mepc;
    uint32_t csr_mip;
    uint32_t csr_mbadaddr;

    /* current instruction length */
    uint8_t insn_len;
};
```

The executable is loaded through a memory abstraction, so that everything `rv32emu` does afterwards operates on that object:

```c
/* load the ELF file into the memory abstraction */
if (!elf_load(elf, rv, state->mem)) {
    fprintf(stderr, "Unable to load ELF file '%s'\n", args[1]);
    return 1;
}
```

The execution mode is chosen from the user's options:

```c
/* run based on the specified mode */
if (opt_trace) {
    run_and_trace(rv, elf);
}
#if RV32_HAS(GDBSTUB)
else if (opt_gdbstub) {
    rv_debug(rv);
}
#endif
else {
    run(rv);
}
```

In test mode, the result of the run is dumped:

```c
/* dump test result in test mode */
if (opt_arch_test)
    dump_test_signature(rv, elf);
```

Finally, the dynamically allocated resources are released:

```c
    /* finalize the RISC-V runtime */
    elf_delete(elf);
    rv_delete(rv);
    state_delete(state);

    return 0;
}
```

### RISC-V emulator implementation details

This section looks at how rv32emu executes the input executable, that is, the part of `main` shown below. Only plain execution is covered, which is the branch that calls `run`:

```c
/* run based on the specified mode */
if (opt_trace) {
    run_and_trace(rv, elf);
}
#if RV32_HAS(GDBSTUB)
else if (opt_gdbstub) {
    rv_debug(rv);
}
#endif
else {
    run(rv);
}
```

The function `run` simply loops until the processor halts. When the processor is halted, its `halt` flag changes from false to true, `rv_has_halted` returns true, and the loop ends.

```c
static void run(struct riscv_t *rv)
{
    const uint32_t cycles_per_step = 100;
    for (; !rv_has_halted(rv);) { /* run until the flag is done */
        /* step instructions */
        rv_step(rv, cycles_per_step);
    }
}
```

This leads to `rv_step`, the core function of rv32emu. It contains a lot of detail, so before diving into the code it helps to recall how a processor operates. [Writing a simple RISC-V emulator in plain C - the main file](https://fmash16.github.io/content/posts/riscv-emulator-in-c.html#the-main-file) shows a simplified processor:

```c
// Initialize cpu, registers and program counter
struct CPU cpu;
cpu_init(&cpu);
// Read input file
read_file(&cpu, argv[1]);

// cpu loop
while (1) {
    // fetch
    uint32_t inst = cpu_fetch(&cpu);
    // Increment the program counter
    cpu.pc += 4;
    // execute
    if (!cpu_execute(&cpu, inst))
        break;

    dump_registers(&cpu);

    if(cpu.pc==0)
        break;
}
```

The code above boils the processor down to a few steps:

1. Fetch instruction: read the instruction stored at the address held in pc. This is the function `cpu_fetch`.
2. Decode instruction: decode the fetched instruction. This is implemented inside `cpu_execute`.
3. Execute instruction: execute the decoded instruction. This is also implemented inside `cpu_execute`.
4. Advance pc to the next instruction: the statement `cpu.pc += 4;`.

With these basics in place, we can analyze `rv_step`. The function is long, so it is examined piece by piece.

First, the macro `RV32_HAS` selects between computed goto and function pointers. The former builds a jump table from the addresses of labels; for details see [你所不知道的 C 語言: goto 和流程控制篇](https://hackmd.io/@sysprog/c-control-flow#switch-%E8%83%8C%E5%BE%8C%E7%9A%84-goto-%E5%92%8C%E5%AF%A6%E4%BD%9C%E8%80%83%E9%87%8F). The latter dispatches through function calls. The rest of this section uses the computed goto variant as the example.

```c
/* Feature test macro */
#define RV32_HAS(x) RV32_FEATURE_##x

void rv_step(struct riscv_t *rv, int32_t cycles)
{
    assert(rv);
    const uint64_t cycles_target = rv->csr_cycle + cycles;
    uint32_t insn;

#define OP_UNIMP op_unimp
#if RV32_HAS(COMPUTED_GOTO)
#define OP(insn) &&op_##insn
#define TABLE_TYPE const void *
#define TABLE_TYPE_RVC const void *
#else /* !RV32_HAS(COMPUTED_GOTO) */
#define OP(insn) op_##insn
#define TABLE_TYPE const opcode_t
#define TABLE_TYPE_RVC const c_opcode_t
#endif
```

Next come the instruction tables. With computed goto, each entry stores the address of a label.
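As a side illustration of the two dispatch styles selected by `RV32_HAS(COMPUTED_GOTO)`, here is a minimal sketch that is not rv32emu code. It is a toy interpreter whose opcodes (`TOY_INC`, `TOY_DEC`, `TOY_HALT`) and function names are hypothetical, and it assumes GCC or Clang, because `&&label` and `goto *ptr` belong to the "labels as values" extension rather than standard C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy opcodes for illustration; not rv32emu's instruction set. */
enum { TOY_INC = 0, TOY_DEC = 1, TOY_HALT = 2, TOY_NUM_OPS };

/* Style 1: function-pointer table (portable C).
 * Each handler returns 1 to continue and 0 to stop. */
typedef int (*toy_handler_t)(int32_t *acc);

static int toy_inc(int32_t *acc) { (*acc)++; return 1; }
static int toy_dec(int32_t *acc) { (*acc)--; return 1; }
static int toy_halt(int32_t *acc) { (void) acc; return 0; }

static int32_t toy_run_fnptr(const uint8_t *code)
{
    static const toy_handler_t table[TOY_NUM_OPS] = {toy_inc, toy_dec,
                                                     toy_halt};
    int32_t acc = 0;
    assert(code);
    /* fetch an opcode, call its handler, repeat until a handler returns 0;
     * opcodes are assumed to be valid (< TOY_NUM_OPS) */
    while (table[*code++](&acc))
        ;
    return acc;
}

/* Style 2: computed goto (GCC/Clang extension).
 * The table stores label addresses, and each handler jumps straight to
 * the next one instead of returning to a central loop. */
static int32_t toy_run_goto(const uint8_t *code)
{
    static const void *const table[TOY_NUM_OPS] = {&&op_inc, &&op_dec,
                                                   &&op_halt};
    int32_t acc = 0;
    assert(code);

#define TOY_DISPATCH() goto *table[*code++]

    TOY_DISPATCH();
op_inc:
    acc++;
    TOY_DISPATCH();
op_dec:
    acc--;
    TOY_DISPATCH();
op_halt:
    return acc;

#undef TOY_DISPATCH
}
```

Both functions compute the same result for the same program; the byte sequence `TOY_INC, TOY_INC, TOY_INC, TOY_DEC, TOY_HALT` yields 2 in either style. The computed goto version avoids one function call per instruction and ends every handler with its own indirect jump, which is the usual reason interpreters adopt this technique.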
The table layout is taken from the chapters **RV32/64G Instruction Set Listings** and **"C" Standard Extension for Compressed Instructions** of [The RISC-V Instruction Set Manual Volume I: User-Level ISA](https://github.com/riscv/riscv-isa-manual/releases/download/Ratified-IMAFDQC/riscv-spec-20191213.pdf).

First, RV32G, where G is the combination of the I, M, A, F, and D extensions. Comparing the figure below with the code makes the table definition clear:

- Rows correspond to bits [6:5] of the instruction encoding
- Columns correspond to bits [4:2] of the instruction encoding

![](https://i.imgur.com/DWvVgcT.png)

```c
    /* clang-format off */
    static TABLE_TYPE jump_table[] = {
    //  000         001            010         011            100         101         110        111
        OP(load),   OP(load_fp),   OP(unimp),  OP(misc_mem),  OP(op_imm), OP(auipc),  OP(unimp), OP(unimp), // 00
        OP(store),  OP(store_fp),  OP(unimp),  OP(amo),       OP(op),     OP(lui),    OP(unimp), OP(unimp), // 01
        OP(madd),   OP(msub),      OP(nmsub),  OP(nmadd),     OP(fp),     OP(unimp),  OP(unimp), OP(unimp), // 10
        OP(branch), OP(jalr),      OP(unimp),  OP(jal),       OP(system), OP(unimp),  OP(unimp), OP(unimp), // 11
    };
```

Next is RV32C. The table below shows how the specification classifies RVC instructions for RV32, RV64, and RV128; only RV32C is considered from here on.

![](https://i.imgur.com/65re320.png)

The following figures map onto the table in the code:

- Rows correspond to bits [15:13] of the instruction encoding
- Columns correspond to bits [1:0] of the instruction encoding

![](https://i.imgur.com/cEwhkgi.png)
![](https://i.imgur.com/clH4XTR.png)
![](https://i.imgur.com/HJ4g3jY.png)

```c
#if RV32_HAS(EXT_C)
    static TABLE_TYPE_RVC jump_table_rvc[] = {
    //  00             01             10          11
        OP(caddi4spn), OP(caddi),     OP(cslli),  OP(unimp),  // 000
        OP(cfld),      OP(cjal),      OP(cfldsp), OP(unimp),  // 001
        OP(clw),       OP(cli),       OP(clwsp),  OP(unimp),  // 010
        OP(cflw),      OP(clui),      OP(cflwsp), OP(unimp),  // 011
        OP(unimp),     OP(cmisc_alu), OP(ccr),    OP(unimp),  // 100
        OP(cfsd),      OP(cj),        OP(cfsdsp), OP(unimp),  // 101
        OP(csw),       OP(cbeqz),     OP(cswsp),  OP(unimp),  // 110
        OP(cfsw),      OP(cbnez),     OP(cfswsp), OP(unimp),  // 111
    };
#endif
    /* clang-format on */
```

The macro `DISPATCH` does the following:

1. Fetches the instruction (line 6)
2. Checks whether the fetched data is an uncompressed instruction (line 8)
3. Jumps to the corresponding label (line 11, and inside the macro `DISPATCH_RV32C`)

```c=
#define DISPATCH() \
    { \
        if (unlikely(rv->csr_cycle >= cycles_target || rv->halt)) \
            return; \
        /* fetch the next instruction */ \
        insn = rv->io.mem_ifetch(rv, rv->PC); \
        /* standard uncompressed instruction */ \
        if ((insn & 3) == 3) { \
            uint32_t index = (insn & INSN_6_2) >> 2; \
            rv->insn_len = INSN_32; \
            goto *jump_table[index]; \
        } else { \
            /* Compressed Extension Instruction */ \
            DISPATCH_RV32C() \
        } \
    }
```

Consider the RV32 path first, lines 8 to 11 above:

- Line 8: checks whether the instruction is uncompressed. For RV32, the figure below shows that bits [1:0] of the encoding decide whether an instruction is 16 or 32 bits wide; the value `11` marks a 32-bit instruction.
  ![](https://i.imgur.com/NOK4nvF.png)
- Line 9: uses bits [6:2] of the encoding as the table index
- Line 10: sets the current instruction width to 32 bits
- Line 11: jumps to the corresponding label

The RV32C path is implemented by the macro `DISPATCH_RV32C`:

- Line 4: masks off the upper half, bits [31:16]
- Line 5: computes the table index by combining funct3 (shifted down into bits [4:2]) with the two opcode bits, which matches the four-column layout of `jump_table_rvc`
- Line 6: sets the current instruction width to 16 bits
- Line 7: jumps to the corresponding label

```c=
#if RV32_HAS(COMPUTED_GOTO)
#if RV32_HAS(EXT_C)
#define DISPATCH_RV32C() \
    insn &= 0x0000FFFF; \
    int16_t c_index = (insn & FC_FUNC3) >> 11 | (insn & FC_OPCODE); \
    rv->insn_len = INSN_16; \
    goto *jump_table_rvc[c_index];
#else
#define DISPATCH_RV32C()
#endif
```

After jumping to a label, the instruction is executed through the macro `EXEC`, shown in full below:

- Line 4: calls the handler of the specific instruction

```c=
#define EXEC(instr) \
    { \
        /* dispatch this opcode */ \
        if (unlikely(!op_##instr(rv, insn))) \
            return; \
        /* increment the cycles csr */ \
        rv->csr_cycle++; \
    }
```

The macro `TARGET` wraps the label, `EXEC`, and `DISPATCH` together:

```c
#define TARGET(instr) \
    op_##instr : EXEC(instr); \
    DISPATCH();
```

`TARGET` is then instantiated for every instruction group. This is where the main execution loop lives:

```c
/* main loop */
TARGET(load)
TARGET(op_imm)
TARGET(auipc)
TARGET(store)
TARGET(op)
TARGET(lui)
TARGET(branch)
TARGET(jalr)
TARGET(jal)
TARGET(system)
#if RV32_HAS(EXT_C)
TARGET(caddi4spn)
TARGET(caddi)
TARGET(cslli)
TARGET(cjal)
TARGET(clw)
TARGET(cli)
TARGET(clwsp)
TARGET(clui)
TARGET(cmisc_alu)
TARGET(ccr)
TARGET(cj)
TARGET(csw)
TARGET(cbeqz)
TARGET(cswsp)
TARGET(cbnez)
#endif
...
```

Finally, take `op_load` as an example. Its comments already make the behavior clear:

- Decode the fields of the instruction about to be executed
- Use `funct3` from the encoding to determine which load it is and act accordingly
- Advance pc by the width of the instruction

```c
/* RV32I Base Instruction Set
 *
 * bits  0-6:  opcode
 * bits  7-10: func3
 * bit  11: bit 5 of func7
 */

static inline bool op_load(struct riscv_t *rv, uint32_t insn UNUSED)
{
    /* I-type
     *  31       26        21         16         11  9    6          0
     * [ rd     5][ rs1   5][ immhi  5][ immlo  7][fun3][ opcode  7]
     */
    const int32_t imm = dec_itype_imm(insn);
    const uint32_t rs1 = dec_rs1(insn);
    const uint32_t funct3 = dec_funct3(insn);
    const uint32_t rd = dec_rd(insn);

    /* load address */
    const uint32_t addr = rv->X[rs1] + imm;

    /* dispatch by read size
     *
     * imm[11:0] rs1 000 rd 0000011 LB
     * imm[11:0] rs1 001 rd 0000011 LH
     * imm[11:0] rs1 010 rd 0000011 LW
     * imm[11:0] rs1 011 rd 0000011 LD
     * imm[11:0] rs1 100 rd 0000011 LBU
     * imm[11:0] rs1 101 rd 0000011 LHU
     * imm[11:0] rs1 110 rd 0000011 LWU
     */
    switch (funct3) {
    case 0: /* LB: Load Byte */
        rv->X[rd] = sign_extend_b(rv->io.mem_read_b(rv, addr));
        break;
    case 1: /* LH: Load Halfword */
        if (addr & 1) {
            rv_except_load_misaligned(rv, addr);
            return false;
        }
        rv->X[rd] = sign_extend_h(rv->io.mem_read_s(rv, addr));
        break;
    case 2: /* LW: Load Word */
        if (addr & 3) {
            rv_except_load_misaligned(rv, addr);
            return false;
        }
        rv->X[rd] = rv->io.mem_read_w(rv, addr);
        break;
    case 4: /* LBU: Load Byte Unsigned */
        rv->X[rd] = rv->io.mem_read_b(rv, addr);
        break;
    case 5: /* LHU: Load Halfword Unsigned */
        if (addr & 1) {
            rv_except_load_misaligned(rv, addr);
            return false;
        }
        rv->X[rd] = rv->io.mem_read_s(rv, addr);
        break;
    default:
        rv_except_illegal_insn(rv, insn);
        return false;
    }

    /* step over instruction */
    rv->PC += rv->insn_len;

    /* enforce zero register */
    if (rd == rv_reg_zero)
        rv->X[rv_reg_zero] = 0;
    return true;
}
```

To summarize, the core of rv32emu is the macro `DISPATCH`, which fetches an instruction and jumps to the matching label, followed by the macro `EXEC`, which carries out the instruction and advances pc by the instruction width. These two steps repeat until the processor halts.

### How rv32emu handles ELF executables

:::warning
TODO: fill in the ELF details
:::

### How [riscv-arch-test](https://github.com/riscv-non-isa/riscv-arch-test) tests work

Besides the internals of rv32emu, it is worth understanding how the project that tests the emulator operates. First, determine which version of the test suite rv32emu currently uses by running `git submodule status`. The output is:

```shell
 4acb4c5d79e02532c642509f476f975465632863 src/mini-gdbstub (4acb4c5)
 6f7f47bdc61c0c51c0cbf75789678a1235eeefc2 tests/riscv-arch-test (2.7.4-2-g6f7f47b)
```

The hash identifies the version shown below, which is also the head of the [old-framework-2.x](https://github.com/riscv-non-isa/riscv-arch-test/commits/old-framework-2.x) branch.

![](https://i.imgur.com/UOtsKuk.png)

## [riscv-arch-test](https://github.com/Risheng1128/rv32emu#riscv-arch-test)

Before working on the goals, it helps to see how an instruction set that already passes is tested. This section analyzes the Base Integer Instruction Set, using the command `make arch-test RISCV_DEVICE=I`.

### Makefile workflow

Running `make arch-test RISCV_DEVICE=I` starts the top-level Makefile, which pulls in `mk/riscv-arch-test.mk` through the following lines:

```makefile
# RISC-V Architecture Tests
include mk/riscv-arch-test.mk
```

`mk/riscv-arch-test.mk` contains the part that handles the command. The complete file is:

```makefile
ARCH_TEST_DIR ?= tests/riscv-arch-test
ARCH_TEST_BUILD := $(ARCH_TEST_DIR)/Makefile
export RISCV_TARGET := tests/arch-test-target
export RISCV_PREFIX ?= $(CROSS_COMPILE)
export TARGETDIR := $(shell pwd)
export XLEN := 32
export JOBS ?= -j
export WORK := $(TARGETDIR)/build/arch-test

$(ARCH_TEST_BUILD):
	git submodule update --init $(dir $@)

arch-test: $(BIN) $(ARCH_TEST_BUILD)
ifndef CROSS_COMPILE
	$(error GNU Toolchain for RISC-V is required. Please check package installation)
endif
	$(Q)$(MAKE) --quiet -C $(ARCH_TEST_DIR) clean
	$(Q)$(MAKE) --quiet -C $(ARCH_TEST_DIR)
```

The rule for `$(ARCH_TEST_BUILD)` runs `git submodule update` to fetch the `tests/riscv-arch-test` submodule into rv32emu. The last two recipe lines of `arch-test` each invoke `tests/riscv-arch-test/Makefile`, first with the `clean` target and then with the default target.

In `tests/riscv-arch-test/Makefile`, `$(RISCV_DEVICE)` decides whether the target `all_variant` or the target `variant` is used. The former tests every instruction set found in the directory `tests/arch-test-target/device/rv32i_m`; the latter tests one specific instruction set.

```makefile
RISCV_ISA_ALL = $(shell ls $(TARGETDIR)/$(RISCV_TARGET)/device/rv$(XLEN)i_m)
RISCV_ISA_OPT = $(subst $(space),$(pipe),$(RISCV_ISA_ALL))

RISCV_ISA_ALL := $(filter-out Makefile.include,$(RISCV_ISA_ALL))

ifeq ($(RISCV_DEVICE),)
    RISCV_DEVICE = I
    DEFAULT_TARGET=all_variant
else
    DEFAULT_TARGET=variant
endif
...
variant: simulate verify

all_variant:
	@for isa in $(RISCV_ISA_ALL); do \
		$(MAKE) $(JOBS) RISCV_TARGET=$(RISCV_TARGET) RISCV_TARGET_FLAGS="$(RISCV_TARGET_FLAGS)" RISCV_DEVICE=$$isa variant; \
		rc=$$?; \
		if [ $$rc -ne 0 ]; then \
			exit $$rc; \
		fi \
	done

simulate:
	$(MAKE) $(JOBS) \
		RISCV_TARGET=$(RISCV_TARGET) \
		RISCV_DEVICE=$(RISCV_DEVICE) \
		run -C $(SUITEDIR)

verify: simulate
	riscv-test-env/verify.sh
```

Taking the RV32I-only case as the example, the targets `simulate` and `verify` run in that order. The former executes `tests/riscv-arch-test/riscv-test-suite/rv32i_m/I/Makefile`, and the latter runs the shell script `riscv-test-env/verify.sh`.

The `simulate` target builds and runs each assembly test case. The suite Makefile pulls in the shared rules from `Makefile.include`:

```makefile
include ../../Makefile.include

$(eval $(call compile_template,-march=rv32i -mabi=ilp32 -DXLEN=$(XLEN)))
```

The file `tests/arch-test-target/device/rv32i_m/I/Makefile.include`, shown below, is where the required assembly is compiled and rv32emu is run:

```makefile
RUN_TARGET= $(TARGETDIR)/build/rv32emu $(<) \
    $(RISCV_TARGET_FLAGS) \
    --arch-test $(*).signature.output \
    1>$(@) 2>&1

RISCV_GCC ?= $(RISCV_PREFIX)gcc
RISCV_GCC_OPTS ?= \
    -march=rv32g \
    -mabi=ilp32 \
    -static \
    -mcmodel=medany \
    -fvisibility=hidden \
    $(RVTEST_DEFINES) \
    -nostdlib \
    -nostartfiles

COMPILE_TARGET = \
    $$(RISCV_GCC) $(1) $$(RISCV_GCC_OPTS) \
        -I$(ROOTDIR)/riscv-test-suite/env/ \
        -I$(TARGETDIR)/$(RISCV_TARGET)/ \
        -T$(TARGETDIR)/$(RISCV_TARGET)/link.ld \
        $$(<) -o $$(@);
```

The `verify` target compares the `.reference_output` files in `tests/riscv-arch-test/riscv-test-suite/rv32i_m/I/references` with the `.signature.output` files produced by the actual runs in `build/arch-test/rv32i_m/I` to decide whether each test passed.

### Design approach for RVC

From the earlier discussion of the Makefile workflow, we know that the build ends up in the directory `tests/arch-test-target/device/rv32i_m`, picks the extension matching the user's input, and runs rv32emu. The command that runs rv32emu is:

```makefile
RUN_TARGET= $(TARGETDIR)/build/rv32emu $(<) \
    $(RISCV_TARGET_FLAGS) \
    --arch-test $(*).signature.output \
    1>$(@) 2>&1
```

Together with the earlier discussion of the main execution flow of rv32emu, this gives a working understanding of the emulator. The next question is what RVC actually is.

According to the chapter **"C" Standard Extension for Compressed Instructions, Version 2.0** of [The RISC-V Instruction Set Manual Volume I: User-Level ISA](https://github.com/riscv/riscv-isa-manual/releases/download/Ratified-IMAFDQC/riscv-spec-20191213.pdf), the "C" extension is the RISC-V standard compressed instruction set extension. As the name suggests, it shortens instruction encodings to increase code density, which matters in resource-constrained environments.

> The C extension can be added to any of the base ISAs (RV32, RV64, RV128), and we use the generic term “RVC” to cover any of these.

As for the RVC instruction formats, CR, CI, and CSS can use any RV32I register (x0 ~ x31), while CIW, CL, CS, and CB can use only eight of them (x8 ~ x15), as the figure below shows.

> CR, CI, and CSS can use any of the 32 RVI registers, but CIW, CL, CS, and CB are limited to just 8 of them.

![](https://i.imgur.com/IWysnJE.png)

Which eight registers are available is shown in the following figure:

![](https://i.imgur.com/PhqhNC2.png)

Since the goal here is to implement `c.ebreak`, let us look at that instruction closely.

![](https://i.imgur.com/yoQA4Qo.png)
![](https://i.imgur.com/Fs7CLvr.png)

`c.ebreak` transfers control to the debugger and can be used to interrupt a running program; in other words, it is the instruction that triggers a breakpoint. The figure above also shows that `c.ebreak` shares its opcode with `c.add`, so it can be classified as a CR-format instruction.

> Debuggers can use the C.EBREAK instruction, which expands to ebreak, to cause control to be transferred back to the debugging environment. C.EBREAK shares the opcode with the C.ADD instruction, but with rd and rs2 both zero, thus can also use the CR format.

The implementation of `c.ebreak` in [riscv-isa-sim](https://github.com/riscv-software-src/riscv-isa-sim) is in the file `c_ebreak.h`:

```cpp
require_extension('C');
if (!STATE.debug_mode &&
    ((STATE.prv == PRV_M && STATE.dcsr->ebreakm) ||
     (STATE.prv == PRV_S && STATE.dcsr->ebreaks) ||
     (STATE.prv == PRV_U && STATE.dcsr->ebreaku))) {
  throw trap_debug_mode();
} else {
  throw trap_breakpoint(STATE.v, pc);
}
```

This is hard to follow without the specification. The chapter **Introduction** of [The RISC-V Instruction Set Manual Volume II: Privileged Architecture](https://github.com/riscv/riscv-isa-manual/releases/download/Priv-v1.12/riscv-privileged-20211203.pdf) explains that a RISC-V hardware thread (hart) runs at one of several privilege levels, usually the three listed in the table below.

![](https://i.imgur.com/XEiokZV.png)

The three modes differ as follows:

- The machine level has the highest privileges and is the only mandatory privilege level for a RISC-V hardware platform. Code run in machine-mode (M-mode) is usually inherently trusted, as it has low-level access to the machine implementation.
  M-mode can be used to manage secure execution environments on RISC-V
- User-mode (U-mode) and supervisor-mode (S-mode) are intended for conventional application and operating system usage respectively

![](https://i.imgur.com/mw988Ad.png)

With this background, the earlier code becomes clear. The `if` condition checks whether the processor is in Debug mode and what its current privilege level is:

- If the processor is not in Debug mode and the `dcsr` bit for the current privilege level is set (`ebreakm` for M-mode, `ebreaks` for S-mode, `ebreaku` for U-mode), the processor enters Debug mode
- Otherwise, a breakpoint exception is raised

So there are two cases: entering Debug mode and raising a breakpoint. The former is implemented in [riscv-isa-sim](https://github.com/riscv-software-src/riscv-isa-sim) as follows:

```cpp
void processor_t::enter_debug_mode(uint8_t cause)
{
  state.debug_mode = true;
  state.dcsr->write_cause_and_prv(cause, state.prv);
  set_privilege(PRV_M);
  state.dpc->write(state.pc);
  state.pc = DEBUG_ROM_ENTRY;
}
```

The latter is shown next. Only the part that handles the trap in Machine mode is excerpted. It mostly configures various CSRs, whose roles are described in detail in the implementation below.

```cpp
// Handle the trap in M-mode
set_virt(false);
reg_t vector = (state.mtvec->read() & 1) && interrupt ? 4 * bit : 0;
state.pc = (state.mtvec->read() & ~(reg_t)1) + vector;
state.mepc->write(epc);
state.mcause->write(t.cause());
state.mtval->write(t.get_tval());
state.mtval2->write(t.get_tval2());
state.mtinst->write(t.get_tinst());

reg_t s = state.mstatus->read();
s = set_field(s, MSTATUS_MPIE, get_field(s, MSTATUS_MIE));
s = set_field(s, MSTATUS_MPP, state.prv);
s = set_field(s, MSTATUS_MIE, 0);
s = set_field(s, MSTATUS_MPV, curr_virt);
s = set_field(s, MSTATUS_GVA, t.has_gva());
state.mstatus->write(s);
if (state.mstatush) state.mstatush->write(s >> 32);  // log mstatush change
set_privilege(PRV_M);
```

### Passing the `c.ebreak` test

In rv32emu, the original implementation of `c.ebreak` is in the function `op_ccr`. It simply halts the processor:

```c
if (rs1 == 0 && rs2 == 0) /* C.EBREAK */
    rv->io.on_ebreak(rv);
```

`on_ebreak` was pointed at the function `rv_halt` during initialization:

```c
/* install the I/O handlers for the RISC-V runtime */
const struct riscv_io_t io = {
    ...
    /* system */
    .on_ecall = syscall_handler,
    .on_ebreak = rv_halt,
};
```

After reading the specification, the final implementation is in [Implement c.ebreak properly](https://github.com/sysprog21/rv32emu/commit/331ce724e32ac1482a0ccad96cf2376117ed1724). The key function is `rv_except_breakpoint`, explained below with reference to [The RISC-V Instruction Set Manual Volume II: Privileged Architecture](https://github.com/riscv/riscv-isa-manual/releases/download/Priv-v1.12/riscv-privileged-20211203.pdf).

```c=
static void rv_except_breakpoint(struct riscv_t *rv, uint32_t old_pc)
{
    /* mtvec (Machine Trap-Vector Base Address Register)
     * mtvec[MXLEN-1:2]: vector base address
     * mtvec[1:0] : vector mode
     */
    const uint32_t base = rv->csr_mtvec & ~0x3;
    const uint32_t mode = rv->csr_mtvec & 0x3;

    /* Exception Code: Breakpoint */
    const uint32_t code = 3;

    /* mepc (Machine Exception Program Counter)
     * mtval(Machine Trap Value Register) : Breakpoint */
    rv->csr_mepc = old_pc;
    rv->csr_mtval = old_pc;

    switch (mode) {
    case 0: /* DIRECT: All exceptions set PC to base */
        rv->PC = base;
        break;
    case 1: /* VECTORED: Asynchronous interrupts set PC to base + 4 * code */
        rv->PC = base + 4 * code;
        break;
    }

    /* mcause (Machine Cause Register): store exception code */
    rv->csr_mcause = code;
}
```

Start with lines 7 and 8. The bit map of the `mtvec` register is shown below, and with it the code is straightforward: the two lines extract the BASE and MODE fields.

![](https://i.imgur.com/m2TDk92.png)

The meaning of BASE and MODE is given in the next figure. The logic is:

- If MODE is 0, pc is set to BASE
- If MODE is 1, pc is set to BASE + 4 * cause, where cause is the value of the Machine Cause Register (`mcause`); as the comment in the code notes, the specification defines this vectored offset for asynchronous interrupts

This corresponds to lines 19 to 25.

![](https://i.imgur.com/Eti4D3B.png)

The variable `code` is the exception code. According to the table below, the exception code of a breakpoint exception is 3.

![](https://i.imgur.com/dHw72gm.png)

For the `mepc` register, the specification text below says that when a trap is taken into M-mode, `mepc` holds the virtual address of the instruction that caused it. Here that is `rv->PC` at the time of the exception, passed in as `old_pc`.

> When a trap is taken into M-mode, mepc is written with the virtual address of the instruction that was interrupted or that encountered the exception.
Otherwise, mepc is never written by the implementation, though it may be explicitly written by software. 暫存器 `mepc` 的部份，由以下原文可以得知，當發生的 trap 為 breakpoint 時，此時的贊存器 `mtval` 其值為造成 trap 的指令的 virtual address ，在這裡就是指 `rv->PC` > If mtval is written with a nonzero value when a breakpoint, address-misaligned, access-fault, or page-fault exception occurs on an instruction fetch, load, or store, then mtval will contain the faulting virtual address. 經過這樣的修改後，目前 Compressed Instructions 已經都測試通過 ```shell Check cadd-01 ... OK Check caddi-01 ... OK Check caddi16sp-01 ... OK Check caddi4spn-01 ... OK Check cand-01 ... OK Check candi-01 ... OK Check cbeqz-01 ... OK Check cbnez-01 ... OK Check cebreak-01 ... OK Check cj-01 ... OK Check cjal-01 ... OK Check cjalr-01 ... OK Check cjr-01 ... OK Check cli-01 ... OK Check clui-01 ... OK Check clw-01 ... OK Check clwsp-01 ... OK Check cmv-01 ... OK Check cnop-01 ... OK Check cor-01 ... OK Check cslli-01 ... OK Check csrai-01 ... OK Check csrli-01 ... OK Check csub-01 ... OK Check csw-01 ... OK Check cswsp-01 ... OK Check cxor-01 ... OK -------------------------------- OK: 27/27 RISCV_TARGET=tests/arch-test-target RISCV_DEVICE=C XLEN=32 ``` 上述程式碼修改對應到 [Issue #60](https://github.com/sysprog21/rv32emu/pull/60) ## [Avoid duplications in RISC-V exception handlers #61](https://github.com/sysprog21/rv32emu/issues/61) 根據 [Avoid duplications in RISC-V exception handlers #61](https://github.com/sysprog21/rv32emu/issues/61) 的敘述，可以清楚了解該 Issue 的目的，由於目前處理 exception 的函式都寫得非常相似，有非常多重複的程式碼，因此可以使用巨集進行改寫，而需要修改的函式如下所示 ``` 1. rv_except_insn_misaligned 2. rv_except_load_misaligned 3. rv_except_store_misaligned 4. rv_except_illegal_insn 5. 
rv_except_breakpoint ``` 完整的修改可參考 [Avoid duplications in RISC-V exception handlers](https://github.com/Risheng1128/rv32emu/commit/21c09f0fb2255f84ee85a12a7dbbf9cc3a529214) ，以下為主要的巨集函式 ```c= #define GET_EXCEPTION_CODE(type) rv_exception_code_##type #define EXCEPTION_HANDLER_IMPL(type) \ UNUSED static void rv_except_##type(struct riscv_t *rv, uint32_t mtval) \ { \ /* mtvec (Machine Trap-Vector Base Address Register) \ * mtvec[MXLEN-1:2]: vector base address \ * mtvec[1:0] : vector mode \ */ \ const uint32_t base = rv->csr_mtvec & ~0x3; \ const uint32_t mode = rv->csr_mtvec & 0x3; \ /* Exception Code */ \ const uint32_t code = GET_EXCEPTION_CODE(type); \ /* mepc (Machine Exception Program Counter) \ * mtval (Machine Trap Value Register) \ */ \ rv->csr_mepc = rv->PC; \ rv->csr_mtval = mtval; \ switch (mode) { \ case 0: /* DIRECT: All exceptions set PC to base */ \ rv->PC = base; \ break; \ /* VECTORED: Asynchronous interrupts set PC to base + 4 * code */ \ case 1: \ rv->PC = base + 4 * code; \ break; \ } \ /* mcause (Machine Cause Register): store exception code */ \ rv->csr_mcause = code; \ } ``` 需要注意的地方在於上方的第 12 ~ 17 行，首先第 12 行的部份主要是取得 exception 的 exception code ，這裡建立一個列舉並透過巨集函式 `GET_EXCEPTION_CODE` 來取得，而參考的表格如下 ![](https://i.imgur.com/oROhSdP.png) 而列舉的定義如下 ```c /* RISC-V exception code list */ #define RV_EXCEPTION_LIST \ _(insn_misaligned) /* Instruction address misaligned */ \ _(insn_fault) /* Instruction access fault */ \ _(illegal_insn) /* Illegal instruction */ \ _(breakpoint) /* Breakpoint */ \ _(load_misaligned) /* Load address misaligned */ \ _(load_fault) /* Load access fault */ \ _(store_misaligned) /* Store/AMO address misaligned */ enum { #define _(type) GET_EXCEPTION_CODE(type), RV_EXCEPTION_LIST #undef _ }; ``` 接著是第 16 行的部份，可參考 [The RISC-V Instruction Set Manual Volume II: Privileged Architecture](https://github.com/riscv/riscv-isa-manual/releases/download/Priv-v1.12/riscv-privileged-20211203.pdf) 裡提到的暫存器 `mepc` ，其中有一段句子如下 > When a trap is taken into 
M-mode, mepc is written with the virtual address of the instruction that was interrupted or that encountered the exception. Otherwise, mepc is never written by the implementation, though it may be explicitly written by software. 可以得知執行在 Machine 模式發生 trap ，此時的 `mepc` 會被設定為被中斷或是遇到 exception 的指令地址，而 rv32emu 目前只有執行 Machine 模式，因此可以設定 `mepc` 為 `rv->PC` 最後是第 17 行，參考規格書的暫存器 `mtval` 的敘述，可以發現 `mtval` 會根據不同的 exception 被設定不同的值，以下做簡單的分類 - Breakpoint, address-misaligned, access-fault, or page-fault exception occurs on an instruction fetch, load, or store: `mtval` will contain the **faulting virtual address** - Misaligned load or store causes an access-fault or page-fault exception: `mtval` will contain the **virtual address of the portion of the access** that caused the fault - Instruction access-fault or page-fault exception occurs on a system with variable-length instructions: `mtval` will contain the **virtual address of the portion of the instruction** that caused the fault, while mepc will point to the beginning of the instruction - If mtval is written with a nonzero value when an illegal-instruction exception occurs, then mtval will contain the shortest of: 1. the actual faulting instruction 2. the first ILEN bits of the faulting instruction 3. the first MXLEN bits of the faulting instruction 這裡就沿用原本的實作，也就是讓不同的指令計算其 `mtval` 的數值，當發生 exception 時將 `mtval` 的值用傳進 exception handler 裡 ## 通過 privilege instruction 測試 ### 處理 `ebreak` 指令在[通過 `c.ebreak` 的測試](#通過-`c.ebreak`-的測試)理，已經通過了 RV32C 裡 `ebreak` 指令的測試，但很奇怪的是 RV32I 裡的 `ebreak` 指令卻無法通過測試，使用命令 `make arch-test RISCV_DEVICE=privilege` 做測試，以下為測試結果 ```shell Check ebreak ... FAIL Check ecall ... FAIL Check misalign1-jalr-01 ... OK Check misalign2-jalr-01 ... OK Check misalign-beq-01 ... OK Check misalign-bge-01 ... OK Check misalign-bgeu-01 ... OK Check misalign-blt-01 ... OK Check misalign-bltu-01 ... OK Check misalign-bne-01 ... OK Check misalign-jal-01 ... OK Check misalign-lh-01 ... OK Check misalign-lhu-01 ... 
OK Check misalign-lw-01 ... FAIL Check misalign-sh-01 ... OK Check misalign-sw-01 ... FAIL ```

Since `ebreak` fails, the next step is to see which instructions rv32emu actually executes. Running `riscv64-unknown-elf-objdump -d ebreak.elf` shows the following key part:

```
80000104 <rvtest_code_begin>:
80000104:   00001097    auipc   ra,0x1
80000108:   f4c08093    addi    ra,ra,-180 # 80001050 <begin_signature>
8000010c:   11111137    lui     sp,0x11111
80000110:   11110113    addi    sp,sp,273 # 11111111 <value+0x11111101>
80000114:   9002        ebreak
80000116:   0001        nop
80000118:   0001        nop
8000011a:   0000a023    sw      zero,0(ra)
8000011e:   0020a223    sw      sp,4(ra)
80000122:   0001        nop
80000124:   00000013    nop
80000128:   00000013    nop
8000012c:   00000013    nop
```

The key instruction `ebreak` has been compiled into the 16-bit RV32C instruction `c.ebreak` (encoding `9002`), whose bit layout is:

```
 15   13  12   11       7    6       2    1 0
[ 1 0 0 ][ 1 ][ 0 0 0 0 0 ][ 0 0 0 0 0 ][ 1 0 ] c.ebreak
```

The compiler options used to build the privilege tests are in `tests/arch-test-target/device/rv32i_m/privilege/Makefile.include`. The RV32C part is removed there:

```diff
-    -march=rv32gc \
+    -march=rv32g \
```

Inspecting the rebuilt executable, again with `riscv64-unknown-elf-objdump -d ebreak.elf`, gives:

```shell
80000104 <rvtest_code_begin>:
80000104:   00001097    auipc   ra,0x1
80000108:   f6c08093    addi    ra,ra,-148 # 80001070 <begin_signature>
8000010c:   11111137    lui     sp,0x11111
80000110:   11110113    addi    sp,sp,273 # 11111111 <value+0x11111101>
80000114:   00100073    ebreak
80000118:   00000013    nop
8000011c:   00000013    nop
80000120:   0000a023    sw      zero,0(ra)
80000124:   0020a223    sw      sp,4(ra)
80000128:   00000013    nop
8000012c:   00000013    nop
```

The compiler now emits the 32-bit `ebreak` we want. The test results are now: ```shell Check ebreak ... FAIL Check ecall ... FAIL Check misalign1-jalr-01 ... OK Check misalign2-jalr-01 ... OK Check misalign-beq-01 ... OK Check misalign-bge-01 ... OK Check misalign-bgeu-01 ... OK Check misalign-blt-01 ... OK Check misalign-bltu-01 ... OK Check misalign-bne-01 ... OK Check misalign-jal-01 ... OK Check misalign-lh-01 ... OK Check misalign-lhu-01 ... OK Check misalign-lw-01 ... OK Check misalign-sh-01 ... OK Check misalign-sw-01 ...
OK ```

This shows that tests such as `misalign-lw` and `misalign-sw` had also been failing because of the compiler option, but `ebreak`, the main target of the fix, still fails.

It turned out that the original implementation performs an unnecessary step after executing `ebreak`, namely line 15 of the code below:

```c=
switch (funct3) {
case 0:
    switch (imm) { /* dispatch from imm field */
    case 0: /* ECALL: Environment Call */
        rv->io.on_ecall(rv);
        break;
    case 1: /* EBREAK: Environment Break */
        rv->io.on_ebreak(rv);
        break;
    ...
    }
    ...
}
/* step over instruction */
rv->PC += rv->insn_len;
```

The fix is therefore to return directly once `ebreak` has executed:

```diff
-    break;
+    return true;
```

Testing again, `ebreak` now passes:

```shell
Check ebreak ... OK
Check ecall ... FAIL
Check misalign1-jalr-01 ... OK
Check misalign2-jalr-01 ... OK
Check misalign-beq-01 ... OK
Check misalign-bge-01 ... OK
Check misalign-bgeu-01 ... OK
Check misalign-blt-01 ... OK
Check misalign-bltu-01 ... OK
Check misalign-bne-01 ... OK
Check misalign-jal-01 ... OK
Check misalign-lh-01 ... OK
Check misalign-lhu-01 ... OK
Check misalign-lw-01 ... OK
Check misalign-sh-01 ... OK
Check misalign-sw-01 ... OK
```

The complete changes described above correspond to [Improve compliance for privileged instructions](https://github.com/sysprog21/rv32emu/commit/1de60d76df8f244e90ba37aef427954d82555c25).

### Handling the `ecall` instruction

Following [Handling the `ebreak` instruction](#處理-`ebreak`-指令), `ecall` is the last instruction that still fails the privilege instruction tests. First, look at the current implementation.

In `src/emulate.c`, the function `op_system` determines whether the current instruction is `ecall`: ```c /* dispatch by func3 field */ switch (funct3) { case 0: switch (funct12) { /* dispatch from imm field */ case 0: /* ECALL: Environment Call */ rv->io.on_ecall(rv); break; ... } ...
} ```

The function pointer `on_ecall` then calls `syscall_handler`, which lives in `src/syscall.c`:

```c
void syscall_handler(struct riscv_t *rv)
{
    /* get the syscall number */
    riscv_word_t syscall = rv_get_reg(rv, rv_reg_a7);

    switch (syscall) { /* dispatch system call */
#define _(name, number)     \
    case SYS_##name:        \
        syscall_##name(rv); \
        break;
        SUPPORTED_SYSCALLS
#undef _
    default:
        fprintf(stderr, "unknown syscall %d\n", (int) syscall);
        rv_halt(rv);
        break;
    }
}
```

In short, the current implementation only reads register `a7` as the syscall number and dispatches to the matching handler. It never updates the control and status registers (CSRs).

As shown in [Avoid duplications in RISC-V exception handlers #61](#Avoid-duplications-in-RISC-V-exception-handlers-#61), the CSRs whose contents differ between exceptions are `mcause`, which records the exception code, and `mtval`, which records exception-specific information. The rest of this section focuses on these two.

For `mcause`, the table of `mcause` values in [The RISC-V Instruction Set Manual Volume II: Privileged Architecture](https://github.com/riscv/riscv-isa-manual/releases/download/Priv-v1.12/riscv-privileged-20211203.pdf) shows that the relevant entry is Environment call from M-mode, with exception code 11.

![](https://i.imgur.com/aEve226.png)

The exception list in rv32emu is extended accordingly:

```diff
 #define RV_EXCEPTION_LIST \
     _(insn_misaligned)  /* Instruction address misaligned */ \
     _(insn_fault)       /* Instruction access fault */ \
     _(illegal_insn)     /* Illegal instruction */ \
     _(breakpoint)       /* Breakpoint */ \
     _(load_misaligned)  /* Load address misaligned */ \
     _(load_fault)       /* Load access fault */ \
     _(store_misaligned) /* Store/AMO address misaligned */ \
-    _(store_fault)      /* Store/AMO access fault */
+    _(store_fault)      /* Store/AMO access fault */ \
+    _(ecall_U)          /* Environment call from U-mode */ \
+    _(ecall_S)          /* Environment call from S-mode */ \
+    _(reserved)         /* Reserved */ \
+    _(ecall_M)          /* Environment call from M-mode */
```

Next is `mtval`, which holds different data depending on the exception. For `ecall`, the following passage from the specification applies. It makes clear that `mtval` is currently set to 0 when an `ecall` exception occurs:

> For other traps, mtval is set to zero, but a future standard may redefine mtval’s setting for other traps.
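
As a sanity check on the extended exception list, here is a minimal, self-contained sketch of the same X-macro pattern. It is not the rv32emu source: the sentinel `rv_exception_code_count` and the helper `exception_name` are hypothetical names added only for illustration, while the list entries and `GET_EXCEPTION_CODE` mirror the snippets in this note. Because the enumerators are numbered from 0 in list order, the `reserved` placeholder is what makes `ecall_M` land on code 11.

```c
#include <assert.h>
#include <stddef.h>

#define GET_EXCEPTION_CODE(type) rv_exception_code_##type

/* Same order as the extended list; the position in the list is the code. */
#define RV_EXCEPTION_LIST \
    _(insn_misaligned)  /* 0 */  \
    _(insn_fault)       /* 1 */  \
    _(illegal_insn)     /* 2 */  \
    _(breakpoint)       /* 3 */  \
    _(load_misaligned)  /* 4 */  \
    _(load_fault)       /* 5 */  \
    _(store_misaligned) /* 6 */  \
    _(store_fault)      /* 7 */  \
    _(ecall_U)          /* 8 */  \
    _(ecall_S)          /* 9 */  \
    _(reserved)         /* 10 */ \
    _(ecall_M)          /* 11 */

enum {
#define _(type) GET_EXCEPTION_CODE(type),
    RV_EXCEPTION_LIST
#undef _
    rv_exception_code_count /* hypothetical sentinel, not in rv32emu */
};

/* Hypothetical helper: map an exception code back to its list name.
 * Returns NULL for codes outside the list.
 */
static const char *exception_name(unsigned code)
{
    static const char *const names[] = {
#define _(type) #type,
        RV_EXCEPTION_LIST
#undef _
    };
    assert(sizeof(names) / sizeof(names[0]) == rv_exception_code_count);
    return code < rv_exception_code_count ? names[code] : NULL;
}
```

Expanding the same list twice (once into enumerators, once into strings) keeps the codes and the names in sync, which is the main reason to prefer the X-macro over a hand-numbered enum.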

A new function `ecall_handler` is added here. It passes 0 to `rv_except_ecall_M`, which is the value that will be stored in `mtval`:

```c
void ecall_handler(struct riscv_t *rv)
{
    assert(rv);
    syscall_handler(rv);
    rv_except_ecall_M(rv, 0);
}
```

The complete change is in [Implement environment call properly](https://github.com/sysprog21/rv32emu/commit/137df3f2cacb1f6f4ed244da60bac98d0ae96d7e). The privilege instruction test results are shown below. rv32emu now passes every privilege test in riscv-arch-test.

```shell
Check ebreak ... OK
Check ecall ... OK
Check misalign1-jalr-01 ... OK
Check misalign2-jalr-01 ... OK
Check misalign-beq-01 ... OK
Check misalign-bge-01 ... OK
Check misalign-bgeu-01 ... OK
Check misalign-blt-01 ... OK
Check misalign-bltu-01 ... OK
Check misalign-bne-01 ... OK
Check misalign-jal-01 ... OK
Check misalign-lh-01 ... OK
Check misalign-lhu-01 ... OK
Check misalign-lw-01 ... OK
Check misalign-sh-01 ... OK
Check misalign-sw-01 ... OK
```

### Unexpected failure of the `ecall` instruction

Problem: with the implementation from [Handling the `ecall` instruction](#處理-`ecall`-指令), the privilege instruction tests pass, but running `hello.elf`, `puzzle.elf`, and `pi.elf` through `make check` enters an infinite loop.

First, recall how RISC-V handles an exception; see [RISC-V: 中斷與異常處理](https://ithelp.ithome.com.tw/articles/10268967?sc=rss.iron) (RISC-V: interrupt and exception handling). A key point is that when an exception occurs, the program counter is changed according to the `mtvec` register so that the exception handler runs. The corresponding rv32emu code is:

```c
        switch (mode) { \
        case 0: /* DIRECT: All exceptions set PC to base */ \
            rv->PC = base; \
            break; \
        /* VECTORED: Asynchronous interrupts set PC to base + 4 * code */ \
        case 1: \
            rv->PC = base + 4 * code; \
            break; \
        }
```

To locate the problem, the passing `ecall.elf` is compared with the failing `hello.elf` by disassembling both.

Start with the passing `ecall.elf`. The excerpt below covers the parts that set `mtvec` and execute `ecall`: ```= 8000015c <init_mtvec>: 8000015c: 00000317 auipc t1,0x0 80000160: 08030313 addi t1,t1,128 # 800001dc <mtrampoline> 80000164: 00001e97 auipc t4,0x1 80000168: ef8e8e93 addi t4,t4,-264 # 8000105c <mtvec_save> 8000016c: 305313f3 csrrw t2,mtvec,t1 80000170: 007ea023 sw t2,0(t4) 80000174: 30502e73 csrr t3,mtvec 80000178: e86e0ae3 beq t3,t1,8000000c <rvtest_prolog_done> 80000104 <rvtest_code_begin>: 80000104: 
00001097 auipc ra,0x1 80000108: f6c08093 addi ra,ra,-148 # 80001070 <begin_signature> 8000010c: 11111137 lui sp,0x11111 80000110: 11110113 addi sp,sp,273 # 11111111 <value+0x11111101> 80000114: 00000073 ecall 80000118: 00000013 nop 8000011c: 00000013 nop 80000120: 0000a023 sw zero,0(ra) 80000124: 0020a223 sw sp,4(ra) 80000128: 00000013 nop 8000012c: 00000013 nop ```

In lines 2 to 9 of the listing above (the `init_mtvec` sequence), `mtvec` is set to 0x800001dc. When `ecall` executes, the new pc is derived from the base and mode currently held in `mtvec`.

Now `hello.elf`. Its disassembly is:

```
00000000 <.text>:
   0:   00000293    li      t0,0
   4:   00500313    li      t1,5
   8:   0040006f    j       0xc
   c:   00000013    nop
  10:   02628263    beq     t0,t1,0x34
  14:   04000893    li      a7,64
  18:   00100513    li      a0,1
  1c:   00000597    auipc   a1,0x0
  20:   02458593    addi    a1,a1,36 # 0x40
  24:   00d00613    li      a2,13
  28:   00000073    ecall
  2c:   00128293    addi    t0,t0,1
  30:   fe1ff06f    j       0x10
  34:   05d00893    li      a7,93
  38:   00000513    li      a0,0
  3c:   00000073    ecall
  40:   6548        .2byte  0x6548
  42:   6c6c        .2byte  0x6c6c
  44:   6f57206f    j       0x72f38
  48:   6c72        .2byte  0x6c72
  4a:   2164        .2byte  0x2164
  4c:   000a        .2byte  0xa
```

`hello.elf` clearly never touches the control and status registers (CSRs), and that is the problem. As noted earlier, when an exception occurs RISC-V sets the program counter from `mtvec` in order to run the exception handler. In the code above, `mtvec` is still 0 when `ecall` executes, so pc is set to 0 and the first instruction is fetched again. The loop counter in `t0` is reset each time, so the cycle repeats forever.

With the cause known, the next step is the fix. According to [riscv-arch-test](https://github.com/riscv-non-isa/riscv-arch-test/blob/main/spec/TestFormatSpec.adoc#the-architectural-test) and [RISC-V Exception and Interrupt implementation](https://mullerlee.cyou/2020/07/09/riscv-exception-interrupt/), the current test programs are missing the `mtvec` setup. As a first step `hello.S` is modified. The complete change is: ```diff= .text _start: + # init mtvec + la t0, trap_entry + csrw mtvec, t0 # write trap entry to mtvec li t0, 0 li t1, 5 ... 
loop: beq t0, t1, end li a7, SYSWRITE # "write" syscall li a0, 1 # 1 = standard output (stdout) la a1, str # load address of hello string li a2, str_size # length of hello string ecall # invoke syscall to print the string addi t0, t0, 1 j loop +# dummy trap handler +trap_entry: + csrr t2, mepc # read pc from mepc + addi t2, t2, 4 # move to next instruction + csrw mepc, t2 # write new pc to mepc + mret end: li a7, SYSEXIT # "exit" syscall add a0, x0, 0 # Use 0 return code ecall # invoke syscall to terminate the program ```

In the diff above, the lines added at the top of `_start` give `mtvec` an initial value, and the added `trap_entry` block is a dummy trap handler. It advances `mepc` by 4 so that execution resumes at the instruction after the `ecall`, then uses `mret` to copy `mepc` back into pc.

After this change, rebuilding and running under rv32emu gives the expected output:

```shell
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
inferior exit code 0
```

Although the change above works, modifying every executable is not a good approach. The final solution gives `mtvec` a default value of 0 at startup. If `mtvec` is still 0 when an exception occurs, the executable never set it up, and a default handler runs instead. The complete change is in [Implement environment call properly](https://github.com/sysprog21/rv32emu/commit/c54dcde37ce3447d8ad32ed85346fb512c1deda0).

## Resolving [Migrate to latest RISC-V Architecture Test #49](https://github.com/sysprog21/rv32emu/issues/49)

Before tackling this issue, it helps to get familiar with [RISCOF](https://riscof.readthedocs.io/en/latest/), the RISC-V test framework the issue mentions. Installation and testing notes are in [使用 RISCOF 測試 rv32emu](https://hackmd.io/@Risheng/RISCOF) (Testing rv32emu with RISCOF).

## Improving rv32emu performance

The goal is to improve rv32emu's performance by studying other simple emulators such as wip-fastrv32 and refactor-rv32. [CoreMark](https://en.wikipedia.org/wiki/Coremark) is used as the benchmark. The results for each emulator are shown below.

wip-fastrv32: ```shell 2K performance run parameters for coremark. CoreMark Size : 666 Total ticks : 16869865 Total time (secs): 16.869865 Iterations/Sec : 1185.545942 Iterations : 20000 Compiler version : GCC11.1.0 Compiler flags : -O2 -DPERFORMANCE_RUN=1 Memory location : Please put data memory location here (e.g. code in flash, data on heap etc) seedcrc : 0xe9f5 [0]crclist : 0xe714 [0]crcmatrix : 0x1fd7 [0]crcstate : 0x8e3a [0]crcfinal : 0x382f Correct operation validated. 
See README.md for run and reporting rules. CoreMark 1.0 : 1185.545942 / GCC11.1.0 -O2 -DPERFORMANCE_RUN=1 / Heap ```

refactor-rv32:

```shell
./rv32 coremark.elf
2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 12096173
Total time (secs): 12.096173
Iterations/Sec   : 909.378528
Iterations       : 11000
Compiler version : GCC11.1.0
Compiler flags   : -O2 -DPERFORMANCE_RUN=1
Memory location  : Please put data memory location here
                   (e.g. code in flash, data on heap etc)
seedcrc          : 0xe9f5
[0]crclist       : 0xe714
[0]crcmatrix     : 0x1fd7
[0]crcstate      : 0x8e3a
[0]crcfinal      : 0x33ff
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 909.378528 / GCC11.1.0 -O2 -DPERFORMANCE_RUN=1 / Heap
inferior exit code 0
```

rv32emu:

```shell
2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 19845160
Total time (secs): 19.845160
Iterations/Sec   : 554.291323
Iterations       : 11000
Compiler version : GCC11.1.0
Compiler flags   : -O2 -DPERFORMANCE_RUN=1
Memory location  : Please put data memory location here
                   (e.g. code in flash, data on heap etc)
seedcrc          : 0xe9f5
[0]crclist       : 0xe714
[0]crcmatrix     : 0x1fd7
[0]crcstate      : 0x8e3a
[0]crcfinal      : 0x33ff
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 554.291323 / GCC11.1.0 -O2 -DPERFORMANCE_RUN=1 / Heap
inferior exit code 0
```

rv32emu reaches about 554 iterations/sec, compared with about 909 for refactor-rv32 and about 1186 for wip-fastrv32, so there is still considerable room for improvement.

### Decoupling the decoder in rv32emu from `emulate.c`

refactor-rv32 uses the concept of a [basic block](https://en.wikipedia.org/wiki/Basic_block): the instructions about to be executed are grouped into blocks, which reduces frequent memory accesses. For examples of basic blocks, see [basic blocks in compiler design](https://www.geeksforgeeks.org/basic-blocks-in-compiler-design/).

To keep the problem manageable, the work is split into three steps:

1. Decouple the decoding logic in rv32emu from `emulate.c`
   > The complete change is in [Decouple instruction decoding from emulation unit (#79)](https://github.com/sysprog21/rv32emu/commit/549f0692eb725ba09f61f9859cd83b980317b499)
2. 
   Implement basic blocks in rv32emu
   > The complete change is in [Introduce basic block](https://github.com/sysprog21/rv32emu/commit/bb63d931ad3feea03ba1aa8967340ea4e125e941)

:::warning
rv32emu currently performs far worse than `refactor-rv32` on the [Dhrystone](https://en.wikipedia.org/wiki/Dhrystone) benchmark, even though the two implementations follow similar logic. The results are:
```shell
refactor-rv32 : 1415 DMIPS
rv32emu : 999 DMIPS
```
TODO: use Perf to find out why rv32emu is slower
:::

3. Use computed goto in the function `rv_emulate`

:::warning
TODO: compare with the fastrv32 implementation
:::

## References

- [The RISC-V Instruction Set Manual Volume I: User-Level ISA](https://github.com/riscv/riscv-isa-manual/releases/download/Ratified-IMAFDQC/riscv-spec-20191213.pdf)
- [The RISC-V Instruction Set Manual Volume II: Privileged Architecture](https://github.com/riscv/riscv-isa-manual/releases/download/Priv-v1.12/riscv-privileged-20211203.pdf)
- [Write a simple RISC-V emulator in plain C](https://fmash16.github.io/content/posts/riscv-emulator-in-c.html)
- [RISCOF](https://riscof.readthedocs.io/en/stable/index.html)
- [riscv64-sim](https://github.com/iamlouk/riscv64-sim)

---

## Meeting notes

### 20220923 meeting

- [riscv-isa-sim](https://github.com/riscv-software-src/riscv-isa-sim)
- [Write a simple RISC-V emulator in plain C](https://fmash16.github.io/content/posts/riscv-emulator-in-c.html)
- [RISC-V calling convention](https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf)

### 20221003 meeting

- [riscv-tests](https://github.com/riscv-software-src/riscv-tests)
- [basic block](https://en.wikipedia.org/wiki/Basic_block)
- [HotpathVM: An Effective JIT Compiler for Resource-constrained Devices](https://www.usenix.org/legacy/events/vee06/full_papers/p144-gal.pdf) (to be discussed at the end of October)
- [Static Single-Assignment form](https://en.wikipedia.org/wiki/Static_single-assignment_form) (SSA)

![](https://i.imgur.com/6bw2tQp.png)

### 20221013 meeting

- Add a default CSR handler to rv32emu (set up before rv32emu starts fetching instructions)
- Boot ROM & OpenSBI
- SIE & SIP

### 20221024 meeting

1. 
   Update the riscv-arch-test suite in rv32emu to the latest version
   - Conform to the RISCOF test standard
   - Understand the architecture and implementation details of RISCOF described in [RISCOF - Overview](https://riscof.readthedocs.io/en/stable/overview.html)
   - Decision: as in RISCOF, use Python to drive both the compilation of the tests and the execution of the emulator
   - Keep the existing rv32emu usage, such as `make arch-test`
2. Improve rv32emu performance (introduce JIT into rv32emu)
   - Compare against the fastrv32 and refactor-rv32 implementations
   - Try decoupling the existing decode implementation from `emulate.c`

### 20221104 meeting

- [Techniques for reaching a 50x speedup through JIT compilation](https://github.com/seal9055/sfuzz/blob/main/docs/code_gen.md)
- [Dynamic binary translation, SW + HW](https://github.com/srokicki/HybridDBT)
- [A RISC-V emulator from the University of Cambridge that uses DBT to translate RISC-V instructions into x86-64 instructions at run time for efficient execution](https://github.com/nbdd0121/r2vm)
  > Corresponding paper: [Accelerate Cycle-Level Full-System Simulation of Multi-Core RISC-V Systems with Binary Translation](https://carrv.github.io/2020/papers/CARRV2020_paper_6_Guo.pdf) → **==remember to read this==**
- [ria-jit](https://github.com/ria-jit/ria-jit)
  > Companion reading: [ria-jit 重點摘要](https://hackmd.io/@WeiCheng14159/rkCixiYnv) (ria-jit key points)
- macro instruction or superinstruction
- [hisimu](https://github.com/xzhong86/hisimu)
- IR-lifting
- [Static Single-Assignment form](https://en.wikipedia.org/wiki/Static_single-assignment_form) (SSA)
- Graph-coloring allocation
- Register allocation
- static binary translation (SBT)
- dynamic binary translation (DBT) → lower memory overhead and simpler to implement than SBT
- Macro-op
- Redundant sign extension elimination (example: RISC-V loads perform sign extension frequently)
- De-optimisation (needed for debugging)

:::warning
TODO:
- Improve the commit messages (the key idea is the IR), for example by explaining the benefits of and motivation for introducing an IR
- Leaving some TODO or FIXME notes is fine
- The computed goto in decode can be dropped (computed goto may not help decode much), but emulate definitely needs it
:::

:::warning
TODO:
- Write RISC-V code for [abs](https://man7.org/linux/man-pages/man3/abs.3.html)
- Write RISC-V code for memcpy
:::
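
Since the notes above single out computed goto as something the emulation loop needs, the following is a minimal sketch of that dispatch technique on a toy interpreter. It is not taken from rv32emu or fastrv32: the opcodes `OP_INC`, `OP_DEC`, `OP_DOUBLE`, `OP_HALT` and the function `run_toy` are hypothetical and exist only for illustration. The sketch assumes a compiler that supports the GNU C "labels as values" extension (GCC or Clang).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy opcodes, unrelated to the RISC-V encoding. */
enum { OP_INC, OP_DEC, OP_DOUBLE, OP_HALT };

/* Run a toy program and return the accumulator.
 * The program must end with OP_HALT.
 * Uses the GNU C "labels as values" extension (&&label, goto *ptr).
 */
static int32_t run_toy(const uint8_t *pc)
{
    /* One label address per opcode, indexed by the opcode value. */
    static const void *const dispatch[] = {
        [OP_INC] = &&do_inc,
        [OP_DEC] = &&do_dec,
        [OP_DOUBLE] = &&do_double,
        [OP_HALT] = &&do_halt,
    };
    int32_t acc = 0;

    assert(pc);

/* Fetch the next opcode and jump straight to its handler. */
#define DISPATCH() goto *dispatch[*pc++]

    DISPATCH();
do_inc:
    acc += 1;
    DISPATCH();
do_dec:
    acc -= 1;
    DISPATCH();
do_double:
    acc *= 2;
    DISPATCH();
do_halt:
    return acc;
#undef DISPATCH
}
```

Compared with a single `switch` inside a loop, each handler ends with its own indirect jump, so there is no shared dispatch point to return to. Whether this pays off in `rv_emulate` would still need to be measured.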