# Implement RISC-V compressed instruction support for rv32emu-next
contributed by < [`kaeteyaruyo`](https://github.com/kaeteyaruyo) >

## Introduction
The **RISC-V standard compressed instruction set extension**, **"RVC"** for short, is a RISC-V standard extension that reduces static and dynamic code size by adding short 16-bit instruction encodings for common operations. This document explains how to support the RVC extension in [rv32emu-next](https://github.com/sysprog21/rv32emu-next). You can find the source code [here](https://github.com/kaeteyaruyo/rv32emu-next/tree/feature-RVC).

## Instruction Fetch Alignment
To support this extension, I introduce two attributes into the virtual machine: `inst_buffer` and `inst_length`.

* `inst_buffer`: Stands for "instruction buffer". In RV32I, each instruction fetch returns exactly one 32-bit instruction. In RV32C, a single fetch may return more than one instruction (the possible cases are listed in the table below), so we need to buffer the part that we are not processing yet. Since we distinguish a 32-bit instruction from a 16-bit one by checking its lowest 2 bits (`(inst_buffer & 3) == 3 ? 32-bit : 16-bit`), we need to make sure that anything left in the buffer is placed in its lower 16 bits.
  ![](https://i.imgur.com/5u9MzSR.png)
* `inst_length`: This field indicates how many bytes the PC should advance for the current instruction. If the instruction is 2 bytes long, the PC should increase by 2; otherwise it should increase by 4.

The instruction fetch process now looks like this:

```flow
st=>start: rv_step()
cond1=>condition: buffer empty?
op1=>operation: fetch instruction
cond2=>condition: compressed instruction?
op3=>operation: cut down lower 2 bytes as whole instruction
op4=>operation: shift buffer right by 16 bits
sub1=>subroutine: decompress instruction
op5=>operation: set inst_length = 2
cond3=>condition: buffer empty?
op6=>operation: get lower 2 bytes
op7=>operation: fetch next instruction
op8=>operation: get higher 2 bytes
op9=>operation: shift buffer right by 16 bits
op10=>operation: get entire instruction
op11=>operation: clear buffer
op12=>operation: set inst_length = 4
op13=>operation: dispatch instruction

st->cond1
cond1(yes)->op1->cond2
cond1(no)->cond2
cond2(yes)->op3->op4->sub1->op5
cond2(no)->cond3
cond3(yes)->op10->op11->op12
cond3(no)->op6->op7->op8->op9->op12
op5->op13
op12->op13
```

```c=
while (rv->csr_cycle < cycles_target && !rv->halt) {
    uint32_t inst = 0;
#ifdef ENABLE_RV32C
    if (rv->inst_buffer == 0) {
        // buffer is empty: fetch the next 32-bit word at PC
        rv->inst_buffer = rv->io.mem_ifetch(rv, rv->PC);
    }

    if ((rv->inst_buffer & 3) != 3) {
        // cut the 16-bit instruction out of the buffer
        inst = rv->inst_buffer & 0x0000FFFF;
        rv->inst_buffer >>= 16;

        // TODO: filter illegal instruction
        if (inst == 0) {
            rv->halt = true;
            break;
        }

        // decompress it: index = opcode (bits 1:0) << 3 | funct3 (bits 15:13)
        const uint8_t index = ((inst & 0x0003) << 3) | ((inst & 0xE000) >> 13);
        const decompressor_t decompressor = decompressors[index];
        assert(decompressor);
        inst = decompressor(inst);
        rv->inst_length = 2;
    } else {
        if (rv->inst_buffer != 0) {
            // take what the buffer holds, then fetch the word at PC + 2:
            // its lower half is the upper half of this instruction
            inst = rv->inst_buffer;
            rv->inst_buffer = rv->io.mem_ifetch(rv, rv->PC + 2);
            inst |= rv->inst_buffer << 16;
            // keep the remaining 16 bits for the next iteration
            rv->inst_buffer >>= 16;
        } else {
            // note: not reachable as written, because
            // (inst_buffer & 3) == 3 implies a non-zero buffer
            inst = rv->inst_buffer;
            rv->inst_buffer = 0;
        }
        rv->inst_length = 4;
    }
#else
    inst = rv->io.mem_ifetch(rv, rv->PC);
#endif
    const uint32_t index = (inst & INST_6_2) >> 2;

    // dispatch this opcode
    const opcode_t op = opcodes[index];
    assert(op);
    if (!op(rv, inst))
        break;

    rv->csr_cycle++;
}
```

## Instruction Decompression
According to [this slide](http://ec2-18-188-66-21.us-east-2.compute.amazonaws.com/call_file/A-08%E3%80%90%E7%A4%BA%E7%AF%84%E6%95%99%E6%9D%90%E3%80%91RISC-V%E6%8C%87%E4%BB%A4%E9%9B%86%E6%9E%B6%E6%A7%8B%E5%AF%A6%E4%BD%9C%E8%88%87%E7%A1%AC%E9%AB%94%E6%9E%B6%E6%A7%8B%E8%A8%AD%E8%A8%88%E6%A8%A1%E7%B5%84%E6%95%99%E6%9D%90%E7%B0%A1%E6%98%93%E7%89%88.pdf), it is a better approach
to **decompress** the compressed instructions rather than **decode** them directly. So my approach is to translate each incoming compressed instruction into its standard 32-bit equivalent. The emulator can then decode it like any other standard instruction. The decompression process consists of the following steps:

1. Combine `funct3` and `opcode` to form an index for the table lookup.
2. Look up the decompressor table to find the appropriate instruction decompressor.
3. In the decompressor, decode every field we need (such as `rd`, `rs1`, `rs2`, and `imm`) and extend them if necessary, e.g. `rd = rd' | 0b01000`, which maps the 3-bit register field to `x8`-`x15`.
4. Encode the fields into the corresponding 32-bit instruction and return it.

The decoding and encoding functions can be found in [`rvc_private.h`](https://github.com/kaeteyaruyo/rv32emu-next/blob/feature-RVC/rvc_private.h).

### Implemented Instruction List
I only implement the instructions that exist in RV32C. Instructions in the RV32FC, RV64C, and RV128C extensions are ignored. Some instructions share the same `funct3` and `opcode` values, so their decompressors need an additional parsing step to tell them apart. The decompressors can be found in [`rvc.c`](https://github.com/kaeteyaruyo/rv32emu-next/blob/feature-RVC/rvc.c).

| `funct3` | `opcode` | RV32C inst. | RV32I inst. | RV32I format |
|-|-|-|-|-|
| 000 | 00 | C.ADDI4SPN | ADDI | I-type |
| 010 | 00 | C.LW | LW | I-type |
| 110 | 00 | C.SW | SW | S-type |
| 000 | 01 | C.NOP | NOP | I-type |
| 000 | 01 | C.ADDI | ADDI | I-type |
| 001 | 01 | C.JAL | JAL | J-type |
| 010 | 01 | C.LI | ADDI | I-type |
| 011 | 01 | C.ADDI16SP | ADDI | I-type |
| 011 | 01 | C.LUI | LUI | U-type |
| 100 | 01 | C.SRLI | SRLI | I-type |
| 100 | 01 | C.SRAI | SRAI | I-type |
| 100 | 01 | C.ANDI | ANDI | I-type |
| 100 | 01 | C.SUB | SUB | R-type |
| 100 | 01 | C.XOR | XOR | R-type |
| 100 | 01 | C.OR | OR | R-type |
| 100 | 01 | C.AND | AND | R-type |
| 101 | 01 | C.J | JAL | J-type |
| 110 | 01 | C.BEQZ | BEQ | B-type |
| 111 | 01 | C.BNEZ | BNE | B-type |
| 000 | 10 | C.SLLI | SLLI | I-type |
| 010 | 10 | C.LWSP | LW | I-type |
| 100 | 10 | C.JR | JALR | I-type |
| 100 | 10 | C.MV | ADD | R-type |
| 100 | 10 | C.EBREAK | EBREAK | I-type |
| 100 | 10 | C.JALR | JALR | I-type |
| 100 | 10 | C.ADD | ADD | R-type |
| 110 | 10 | C.SWSP | SW | S-type |

### Compliance Test Result
I use the [RISC-V Compliance](https://github.com/riscv/riscv-compliance) tests to check the correctness of the decompressor implementation. All instructions except `C.EBREAK` pass the compliance tests.

#### Setup
To run the compliance tests, we need to prepare a few script files. You can check the [official documentation](https://github.com/riscv/riscv-compliance/tree/master/doc) for details. Here I only show the contents of the files I use.

* In `~/.bashrc`, add the path of the `rv32emu-next` executable to `$PATH`.
* `/riscv-compliance/Makefile.include`
```cmake=11
# ...
export RISCV_TARGET ?= rv32emu-next
# set the RISCV_DEVICE environment to a single extension you want to compile, simulate and/or verify.
# Leave this blank if you want to iterate through all the supported extensions available in the target
export RISCV_DEVICE ?= C
# ...
```
* `/riscv-compliance/riscv-target/rv32emu-next/device/rv32i_m/C/Makefile.include`
```cmake=
TARGET_SIM ?= rv32emu
TARGET_FLAGS ?= $(RISCV_TARGET_FLAGS)
ifeq ($(shell command -v $(TARGET_SIM) 2> /dev/null),)
    $(error Target simulator executable '$(TARGET_SIM)' not found)
endif

RUN_TARGET=\
    $(TARGET_SIM) +signature=$(*).signature.output \
    $< > $(*).signature.output 2> $@;

RISCV_PREFIX   ?= riscv32-unknown-linux-gnu-
RISCV_GCC      ?= $(RISCV_PREFIX)gcc
RISCV_OBJDUMP  ?= $(RISCV_PREFIX)objdump
RISCV_GCC_OPTS ?= -g -static -mcmodel=medany -fvisibility=hidden -nostdlib -nostartfiles $(RVTEST_DEFINES)

COMPILE_CMD = $$(RISCV_GCC) $(1) $$(RISCV_GCC_OPTS) \
              -I$(ROOTDIR)/riscv-test-suite/env/ \
              -I$(TARGETDIR)/$(RISCV_TARGET)/ \
              -T$(TARGETDIR)/$(RISCV_TARGET)/link.ld \
              $$(<) -o $$@

OBJ_CMD = $$(RISCV_OBJDUMP) $$@ -D > $$@.objdump; \
          $$(RISCV_OBJDUMP) $$@ --source > $$@.debug

COMPILE_TARGET=\
    $(COMPILE_CMD); \
    if [ $$$$? -ne 0 ] ; \
    then \
        echo "\e[31m$$(RISCV_GCC) failed for target $$(@) \e[39m" ; \
        exit 1 ; \
    fi ; \
    $(OBJ_CMD); \
    if [ $$$$? -ne 0 ] ; \
    then \
        echo "\e[31m $$(RISCV_OBJDUMP) failed for target $$(@) \e[39m" ; \
        exit 1 ; \
    fi ;
```
* `/riscv-compliance/riscv-target/rv32emu-next/link.ld`
```=
OUTPUT_ARCH( "riscv" )
ENTRY(rvtest_entry_point)

SECTIONS
{
  . = 0x80000000;
  .text.init : { *(.text.init) }
  . = ALIGN(0x1000);
  .tohost : { *(.tohost) }
  . = ALIGN(0x1000);
  .text : { *(.text) }
  . = ALIGN(0x1000);
  .data : { *(.data) }
  .data.string : { *(.data.string)}
  .bss : { *(.bss) }
  _end = .;
}
```
* `/riscv-compliance/riscv-target/rv32emu-next/model_test.h`
```c=
#ifndef _COMPLIANCE_MODEL_H
#define _COMPLIANCE_MODEL_H

#if XLEN == 64
  #define ALIGNMENT 3
#else
  #define ALIGNMENT 2
#endif

#define RVMODEL_DATA_SECTION \
        .pushsection .tohost,"aw",@progbits; \
        .align 8; .global tohost; tohost: .dword 0; \
        .align 8; .global fromhost; fromhost: .dword 0; \
        .popsection; \
        .align 8; .global begin_regstate; begin_regstate: \
        .word 128; \
        .align 8; .global end_regstate; end_regstate: \
        .word 4;

#define RVMODEL_HALT

#define RVMODEL_BOOT

//RV_COMPLIANCE_DATA_BEGIN
#define RVMODEL_DATA_BEGIN \
  .align 4; .global begin_signature; begin_signature:

//RV_COMPLIANCE_DATA_END
#define RVMODEL_DATA_END \
  .align 4; .global end_signature; end_signature: \
  RVMODEL_DATA_SECTION

//RVTEST_IO_INIT
#define RVMODEL_IO_INIT
//RVTEST_IO_WRITE_STR
#define RVMODEL_IO_WRITE_STR(_R, _STR)
//RVTEST_IO_CHECK
#define RVMODEL_IO_CHECK()
//RVTEST_IO_ASSERT_GPR_EQ
#define RVMODEL_IO_ASSERT_GPR_EQ(_S, _R, _I)
//RVTEST_IO_ASSERT_SFPR_EQ
#define RVMODEL_IO_ASSERT_SFPR_EQ(_F, _R, _I)
//RVTEST_IO_ASSERT_DFPR_EQ
#define RVMODEL_IO_ASSERT_DFPR_EQ(_D, _R, _I)

#define RVMODEL_SET_MSW_INT \
 li t1, 1; \
 li t2, 0x2000000; \
 sw t1, 0(t2);

#define RVMODEL_CLEAR_MSW_INT \
 li t2, 0x2000000; \
 sw x0, 0(t2);

#define RVMODEL_CLEAR_MTIMER_INT

#define RVMODEL_CLEAR_MEXT_INT

#endif // _COMPLIANCE_MODEL_H
```

#### Result
With all these files in place, we can run `make` in the root directory of RISC-V Compliance and check the result.

```
kinoe@:~/riscv-compliance$ make
...
Compare to reference files ...

Check cadd-01 ... OK
Check caddi-01 ... OK
Check caddi16sp-01 ... OK
Check caddi4spn-01 ... OK
Check cand-01 ... OK
Check candi-01 ... OK
Check cbeqz-01 ... OK
Check cbnez-01 ... OK
Check cebreak-01 1,6c1,6
< 00000000
< 11111111
< 0000008f
< 00000003
< 00000108
< 00000108
---
> deadbeef
> deadbeef
> deadbeef
> deadbeef
> deadbeef
> deadbeef
... FAIL
Check cj-01 ... OK
Check cjal-01 ... OK
Check cjalr-01 ... OK
Check cjr-01 ... OK
Check cli-01 ... OK
Check clui-01 ... OK
Check clw-01 ... OK
Check clwsp-01 ... OK
Check cmv-01 ... OK
Check cnop-01 ... OK
Check cor-01 ... OK
Check cslli-01 ... OK
Check csrai-01 ... OK
Check csrli-01 ... OK
Check csub-01 ... OK
Check csw-01 ... OK
Check cswsp-01 ... OK
Check cxor-01 ... OK
--------------------------------
FAIL: 1/27 RISCV_TARGET=rv32emu-next RISCV_DEVICE=C XLEN=32
...
```

## Reference
* [RISC-V Instruction Set Manual](https://riscv.org//wp-content/uploads/2017/05/riscv-spec-v2.2.pdf)
* [National Taiwan University, Prof. An-Yeu Wu (吳安宇), RISC-V Instruction Set Architecture Implementation and Hardware Architecture Design](http://ec2-18-188-66-21.us-east-2.compute.amazonaws.com/call_file/A-08%E3%80%90%E7%A4%BA%E7%AF%84%E6%95%99%E6%9D%90%E3%80%91RISC-V%E6%8C%87%E4%BB%A4%E9%9B%86%E6%9E%B6%E6%A7%8B%E5%AF%A6%E4%BD%9C%E8%88%87%E7%A1%AC%E9%AB%94%E6%9E%B6%E6%A7%8B%E8%A8%AD%E8%A8%88%E6%A8%A1%E7%B5%84%E6%95%99%E6%9D%90%E7%B0%A1%E6%98%93%E7%89%88.pdf)
* [riscv-compliance - GitHub](https://github.com/riscv/riscv-compliance)
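
## Appendix: Decompression Sketch
To make the four decompression steps from the Instruction Decompression section more concrete, here is a minimal, self-contained sketch for two instructions, `C.LI` and `C.LW`. It is not the code from `rvc.c` or `rvc_private.h`: the helper names (`rvc_index`, `enc_itype`, `decompress_c_li`, `decompress_c_lw`, `decompress`) and the `uint16_t` parameter type are assumptions made for illustration. Only the index expression is taken from the fetch loop shown earlier, and the bit layouts follow the RVC chapter of the RISC-V specification.

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical; the real helpers live in
 * rvc_private.h and rvc.c of the feature-RVC branch. */

/* Step 1: index = opcode (bits 1:0) << 3 | funct3 (bits 15:13). */
static uint8_t rvc_index(uint16_t inst)
{
    return (uint8_t) (((inst & 0x0003) << 3) | ((inst & 0xE000) >> 13));
}

/* Step 4 helper: encode a standard RV32I I-type instruction. */
static uint32_t enc_itype(uint32_t imm, uint32_t rs1, uint32_t funct3,
                          uint32_t rd, uint32_t opcode)
{
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) |
           (rd << 7) | opcode;
}

/* C.LI rd, imm  ->  ADDI rd, x0, imm */
static uint32_t decompress_c_li(uint16_t inst)
{
    uint32_t rd = (inst >> 7) & 0x1F;
    /* imm[5] = inst[12], imm[4:0] = inst[6:2] */
    uint32_t imm = (((inst >> 12) & 0x1) << 5) | ((inst >> 2) & 0x1F);
    /* sign-extend the 6-bit immediate */
    int32_t simm = (int32_t) (imm ^ 0x20) - 0x20;
    return enc_itype((uint32_t) simm, 0, 0x0, rd, 0x13);
}

/* C.LW rd', offset(rs1')  ->  LW rd, offset(rs1) */
static uint32_t decompress_c_lw(uint16_t inst)
{
    /* Step 3: extend the 3-bit register fields to x8..x15 */
    uint32_t rd = ((inst >> 2) & 0x7) | 0x8;
    uint32_t rs1 = ((inst >> 7) & 0x7) | 0x8;
    /* offset[5:3] = inst[12:10], offset[2] = inst[6], offset[6] = inst[5] */
    uint32_t offset = (((inst >> 10) & 0x7) << 3) |
                      (((inst >> 6) & 0x1) << 2) |
                      (((inst >> 5) & 0x1) << 6);
    return enc_itype(offset, rs1, 0x2, rd, 0x03);
}

typedef uint32_t (*decompressor_t)(uint16_t);

/* Step 2: lookup table; this sketch fills in only two of the 32 slots. */
static const decompressor_t decompressors[32] = {
    [0x02] = decompress_c_lw, /* opcode 00, funct3 010 */
    [0x0A] = decompress_c_li, /* opcode 01, funct3 010 */
};

static uint32_t decompress(uint16_t inst)
{
    const decompressor_t d = decompressors[rvc_index(inst)];
    assert(d); /* unimplemented slot in this sketch */
    return d(inst);
}
```

For example, `decompress(0x4515)` (`c.li a0, 5`) yields `0x00500513` (`addi a0, x0, 5`), and `decompress(0x41C8)` (`c.lw a0, 4(a1)`) yields `0x0045A503` (`lw a0, 4(a1)`). Instructions that share a `funct3`/`opcode` pair, such as `C.JR` and `C.MV`, would need an extra check inside their decompressor, which this sketch does not cover.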