# Implement RISC-V compressed instruction support for rv32emu-next
contributed by < [`kaeteyaruyo`](https://github.com/kaeteyaruyo) >
## Introduction
The **RISC-V standard compressed instruction set extension**, **"RVC"** for short, is a standard RISC-V extension that reduces static and dynamic code size by adding short 16-bit encodings for common operations. This document explains how RVC support was added to [rv32emu-next](https://github.com/sysprog21/rv32emu-next).
You can find the source code [here](https://github.com/kaeteyaruyo/rv32emu-next/tree/feature-RVC).
## Instruction Fetch Alignment
To support this extension, I introduced two attributes into the virtual machine: `inst_buffer` and `inst_length`.
* `inst_buffer`: Stands for "instruction buffer". In RV32I, each instruction fetch returns exactly one 32-bit instruction. In RV32C, a single 32-bit fetch may contain more than one instruction, or only part of one (the possible cases are listed in the table below), so we need to buffer the part that is not being processed yet. Since we distinguish a 32-bit instruction from a 16-bit one by checking its lowest 2 bits (`(inst_buffer & 3) == 3 ? 32-bit : 16-bit`), whatever is left in the buffer must be kept in its lower 16 bits.
![](https://i.imgur.com/5u9MzSR.png)
* `inst_length`: The number of bytes by which the PC should advance for the current instruction. If the instruction is 2 bytes long, the PC increases by 2; otherwise it increases by 4.
The instruction fetch process now looks like this:
```flow
st=>start: rv_step()
cond1=>condition: buffer empty?
op1=>operation: fetch instruction
cond2=>condition: compressed
instruction?
op3=>operation: cut down lower 2 bytes
as whole instruction
op4=>operation: shift buffer to
right by 16 bits
sub1=>subroutine: decompress
instruction
op5=>operation: set inst_length = 2
cond3=>condition: buffer empty?
op6=>operation: get lower 2 bytes
op7=>operation: fetch next instruction
op8=>operation: get higher 2 bytes
op9=>operation: shift buffer to
right by 16 bits
op10=>operation: get entire instruction
op11=>operation: clear buffer
op12=>operation: set inst_length = 4
op13=>operation: dispatch instruction
st->cond1
cond1(yes)->op1->cond2
cond1(no)->cond2
cond2(yes)->op3->op4->sub1->op5
cond2(no)->cond3
cond3(yes)->op10->op11->op12
cond3(no)->op6->op7->op8->op9->op12
op5->op13
op12->op13
```
```c=
while (rv->csr_cycle < cycles_target && !rv->halt) {
    uint32_t inst = 0;
#ifdef ENABLE_RV32C
    if (rv->inst_buffer == 0) {
        // fetch the next instruction
        rv->inst_buffer = rv->io.mem_ifetch(rv, rv->PC);
    }
    if ((rv->inst_buffer & 3) != 3) {
        // cut 16 bit instruction from buffer
        inst = rv->inst_buffer & 0x0000FFFF;
        rv->inst_buffer >>= 16;
        // TODO: filter illegal instruction
        if (inst == 0) {
            rv->halt = true;
            break;
        }
        // decompress it
        const uint8_t index =
            ((inst & 0x0003) << 3) | ((inst & 0xE000) >> 13);
        const decompressor_t decompressor = decompressors[index];
        assert(decompressor);
        inst = decompressor(inst);
        rv->inst_length = 2;
    } else {
        if (rv->inst_buffer != 0) {
            inst = rv->inst_buffer;
            rv->inst_buffer = rv->io.mem_ifetch(rv, rv->PC + 2);
            inst |= rv->inst_buffer << 16;
            rv->inst_buffer >>= 16;
        } else {
            inst = rv->inst_buffer;
            rv->inst_buffer = 0;
        }
        rv->inst_length = 4;
    }
#else
    inst = rv->io.mem_ifetch(rv, rv->PC);
#endif
    const uint32_t index = (inst & INST_6_2) >> 2;
    // dispatch this opcode
    const opcode_t op = opcodes[index];
    assert(op);
    if (!op(rv, inst))
        break;
    rv->csr_cycle++;
}
```
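To see the buffering in isolation, here is a minimal, self-contained sketch that mirrors the compressed-aware part of the loop above on a tiny hard-coded instruction stream. The `vm_t` struct, `ifetch()`, `next_inst()`, and the `mem` array are hypothetical stand-ins for illustration and are not part of rv32emu-next. Like the listing, the sketch assumes that an all-zero buffer means "empty" and that instruction memory can be read at 2-byte-aligned addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical little-endian instruction memory (not rv32emu-next's I/O layer). */
static const uint8_t mem[] = {
    0x05, 0x05,             /* 0x0: c.addi a0, 1     (16-bit)                 */
    0x13, 0x05, 0x15, 0x00, /* 0x2: addi   a0, a0, 1 (32-bit, crosses a word) */
    0x08, 0x41,             /* 0x6: c.lw   a0, 0(a0) (16-bit)                 */
    0x00, 0x00, 0x00, 0x00, /* padding so reads near the end stay in bounds   */
};

/* Read 32 bits at a 2-byte-aligned address. */
static uint32_t ifetch(uint32_t addr)
{
    assert(addr % 2 == 0 && addr + 4 <= sizeof(mem));
    return (uint32_t) mem[addr] | (uint32_t) mem[addr + 1] << 8 |
           (uint32_t) mem[addr + 2] << 16 | (uint32_t) mem[addr + 3] << 24;
}

/* Hypothetical subset of the VM state used by the fetch logic. */
typedef struct {
    uint32_t pc;
    uint32_t inst_buffer; /* 0 means "empty", as in the listing above */
    uint8_t inst_length;
} vm_t;

/* Return the raw bits of the instruction at vm->pc and set vm->inst_length. */
uint32_t next_inst(vm_t *vm)
{
    uint32_t inst;

    if (vm->inst_buffer == 0)
        vm->inst_buffer = ifetch(vm->pc);

    if ((vm->inst_buffer & 3) != 3) {
        /* 16-bit instruction: take the low half, keep the rest buffered. */
        inst = vm->inst_buffer & 0xFFFF;
        vm->inst_buffer >>= 16;
        vm->inst_length = 2;
    } else {
        /* 32-bit instruction: the buffer holds at least its low half. The
         * fetch at pc + 2 supplies the high half in its low 16 bits (OR-ing
         * is harmless if the buffer already held it) and the following
         * halfword in its high 16 bits, which stays buffered. */
        inst = vm->inst_buffer;
        vm->inst_buffer = ifetch(vm->pc + 2);
        inst |= vm->inst_buffer << 16;
        vm->inst_buffer >>= 16;
        vm->inst_length = 4;
    }
    return inst;
}
```

Starting from a zeroed `vm_t` and adding `inst_length` to `pc` after each call, three successive calls to `next_inst()` return `0x0505`, `0x00150513`, and `0x4108`, with lengths 2, 4, and 2.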
## Instruction Decompression
According to [this slide](http://ec2-18-188-66-21.us-east-2.compute.amazonaws.com/call_file/A-08%E3%80%90%E7%A4%BA%E7%AF%84%E6%95%99%E6%9D%90%E3%80%91RISC-V%E6%8C%87%E4%BB%A4%E9%9B%86%E6%9E%B6%E6%A7%8B%E5%AF%A6%E4%BD%9C%E8%88%87%E7%A1%AC%E9%AB%94%E6%9E%B6%E6%A7%8B%E8%A8%AD%E8%A8%88%E6%A8%A1%E7%B5%84%E6%95%99%E6%9D%90%E7%B0%A1%E6%98%93%E7%89%88.pdf), it is better to **decompress** compressed instructions than to **decode** them directly. My approach is therefore to translate each incoming compressed instruction into its standard 32-bit equivalent, so that the emulator can decode it like any other standard instruction.
The decompression process consists of the following steps:
1. Combine `funct3` and `opcode` to form an index for the table lookup.
2. Look up the decompressor table to find the appropriate instruction decompressor.
3. In the decompressor, decode every field we need (such as `rd`, `rs1`, `rs2`, `imm`, etc.) and extend it if necessary, e.g. the 3-bit register field `rd'` selects `x8` to `x15`, so `rd = rd' | 0b1000`.
4. Encode the fields into the corresponding 32-bit instruction and return it.
The decoding and encoding functions can be found in [`rvc_private.h`](https://github.com/kaeteyaruyo/rv32emu-next/blob/feature-RVC/rvc_private.h).
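The four steps can be sketched for two instructions, `C.ADDI` and `C.LW`, as follows. This is a minimal illustration written against the RVC encoding, not a copy of the project's code: the helper names `sign_extend`, `enc_itype`, `c_addi`, `c_lw`, and `decompress` are hypothetical, and only the index expression is taken from the fetch loop shown earlier.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t (*decompressor_t)(uint16_t);

/* Sign-extend the low `bits` bits of x (higher bits of x must be zero). */
static int32_t sign_extend(uint32_t x, unsigned bits)
{
    const uint32_t m = 1u << (bits - 1);
    return (int32_t) ((x ^ m) - m);
}

/* Step 4: encode an I-type instruction (imm[11:0], rs1, funct3, rd, opcode). */
static uint32_t enc_itype(int32_t imm, uint32_t rs1, uint32_t funct3,
                          uint32_t rd, uint32_t opcode)
{
    return (((uint32_t) imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) |
           (rd << 7) | opcode;
}

/* C.ADDI rd, imm  ->  ADDI rd, rd, sext(imm)   (also covers C.NOP, rd = x0) */
static uint32_t c_addi(uint16_t inst)
{
    const uint32_t rd = (inst >> 7) & 0x1F;
    /* imm[5] is inst[12], imm[4:0] is inst[6:2] */
    const uint32_t raw = (((inst >> 12) & 1u) << 5) | ((inst >> 2) & 0x1Fu);
    return enc_itype(sign_extend(raw, 6), rd, 0x0, rd, 0x13);
}

/* C.LW rd', uimm(rs1')  ->  LW rd, uimm(rs1) */
static uint32_t c_lw(uint16_t inst)
{
    /* Step 3: 3-bit register fields select x8..x15. */
    const uint32_t rd = ((inst >> 2) & 0x7u) | 0x8u;
    const uint32_t rs1 = ((inst >> 7) & 0x7u) | 0x8u;
    /* uimm[5:3] is inst[12:10], uimm[2] is inst[6], uimm[6] is inst[5] */
    const uint32_t uimm = (((inst >> 10) & 0x7u) << 3) |
                          (((inst >> 6) & 1u) << 2) | (((inst >> 5) & 1u) << 6);
    return enc_itype((int32_t) uimm, rs1, 0x2, rd, 0x03);
}

/* Step 2: table indexed by (opcode << 3) | funct3; only two slots are filled
 * in this sketch. */
static const decompressor_t decompressors[32] = {
    [0x02] = c_lw,   /* opcode 00, funct3 010 */
    [0x08] = c_addi, /* opcode 01, funct3 000 */
};

uint32_t decompress(uint16_t inst)
{
    /* Step 1: combine opcode (inst[1:0]) and funct3 (inst[15:13]). */
    const uint8_t index =
        (uint8_t) (((inst & 0x0003) << 3) | ((inst & 0xE000) >> 13));
    const decompressor_t d = decompressors[index];
    assert(d && "instruction not covered by this sketch");
    return d(inst);
}
```

For example, `decompress(0x0505)` (`c.addi a0, 1`) yields `0x00150513` (`addi a0, a0, 1`), and `decompress(0x4108)` (`c.lw a0, 0(a0)`) yields `0x00052503` (`lw a0, 0(a0)`).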
### Implemented Instruction List
I only implemented the instructions that exist in RV32C. Instructions from the RV32FC, RV64C, and RV128C extensions are ignored.
Some instructions share the same `funct3` and `opcode` values, so an additional parsing function is needed to distinguish them.
The decompressors can be found in [`rvc.c`](https://github.com/kaeteyaruyo/rv32emu-next/blob/feature-RVC/rvc.c).
| `funct3` | `opcode` | RV32C inst. | RV32I inst. | RV32I format |
|-|-|-|-|-|
| 000 | 00 | C.ADDI4SPN | ADDI | I-type |
| 010 | 00 | C.LW | LW | I-type |
| 110 | 00 | C.SW | SW | S-type |
| 000 | 01 | C.NOP | NOP | I-type |
| 000 | 01 | C.ADDI | ADDI | I-type |
| 001 | 01 | C.JAL | JAL | J-type |
| 010 | 01 | C.LI | ADDI | I-type |
| 011 | 01 | C.ADDI16SP | ADDI | I-type |
| 011 | 01 | C.LUI | LUI | U-type |
| 100 | 01 | C.SRLI | SRLI | I-type |
| 100 | 01 | C.SRAI | SRAI | I-type |
| 100 | 01 | C.ANDI | ANDI | I-type |
| 100 | 01 | C.SUB | SUB | R-type |
| 100 | 01 | C.XOR | XOR | R-type |
| 100 | 01 | C.OR | OR | R-type |
| 100 | 01 | C.AND | AND | R-type |
| 101 | 01 | C.J | JAL | J-type |
| 110 | 01 | C.BEQZ | BEQ | B-type |
| 111 | 01 | C.BNEZ | BNE | B-type |
| 000 | 10 | C.SLLI | SLLI | I-type |
| 010 | 10 | C.LWSP | LW | I-type |
| 100 | 10 | C.JR | JALR | I-type |
| 100 | 10 | C.MV | ADD | R-type |
| 100 | 10 | C.EBREAK | EBREAK | I-type |
| 100 | 10 | C.JALR | JALR | I-type |
| 100 | 10 | C.ADD | ADD | R-type |
| 110 | 10 | C.SWSP | SW | S-type |
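
Several rows in the table share the same `funct3` and `opcode`. As an illustration of the extra parsing this requires, the sketch below separates the five instructions with `funct3 = 100` and `opcode = 10` by looking at bit 12 and the two 5-bit register fields, following the RVC encoding. The enum and the function name `classify_cr` are hypothetical and do not correspond to the parsing function in `rvc.c`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef enum {
    RVC_JR,
    RVC_MV,
    RVC_EBREAK,
    RVC_JALR,
    RVC_ADD,
    RVC_RESERVED,
} rvc_cr_t;

/* Classify a 16-bit instruction whose funct3 is 100 and opcode is 10. */
rvc_cr_t classify_cr(uint16_t inst)
{
    assert((inst & 0xE003) == 0x8002); /* funct3 = 100, opcode = 10 */

    const unsigned bit12 = (inst >> 12) & 1u;
    const unsigned rs1 = (inst >> 7) & 0x1Fu; /* doubles as rd */
    const unsigned rs2 = (inst >> 2) & 0x1Fu;

    if (!bit12) {
        if (rs2 != 0)
            return RVC_MV;                       /* c.mv rd, rs2 */
        return rs1 != 0 ? RVC_JR : RVC_RESERVED; /* c.jr rs1 */
    }
    if (rs2 != 0)
        return RVC_ADD;                          /* c.add rd, rs2 */
    return rs1 != 0 ? RVC_JALR : RVC_EBREAK;     /* c.jalr rs1 / c.ebreak */
}
```

For instance, `0x8082` (`c.jr ra`) is classified as `RVC_JR`, `0x852e` (`c.mv a0, a1`) as `RVC_MV`, and `0x9002` as `RVC_EBREAK`.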
### Compliance Test Result
I use the [RISC-V Compliance](https://github.com/riscv/riscv-compliance) tests to check the correctness of the decompressor implementation. All instructions except `C.EBREAK` pass the compliance tests.
#### Setup
To run the compliance tests, a few script files need to be prepared. You can check out the [official document](https://github.com/riscv/riscv-compliance/tree/master/doc) for details. Here I only show the content of the script files I use.
* In `~/.bashrc`, add the path of the `rv32emu-next` executable to `$PATH`
* `/riscv-compliance/Makefile.include`
```cmake=11
# ...
export RISCV_TARGET ?= rv32emu-next
# set the RISCV_DEVICE environment to a single extension you want to compile, simulate and/or verify.
# Leave this blank if you want to iterate through all the supported extensions available in the target
export RISCV_DEVICE ?= C
# ...
```
* `/riscv-compliance/riscv-target/rv32emu-next/device/rv32i_m/C/Makefile.include`
```cmake=
TARGET_SIM ?= rv32emu
TARGET_FLAGS ?= $(RISCV_TARGET_FLAGS)
ifeq ($(shell command -v $(TARGET_SIM) 2> /dev/null),)
$(error Target simulator executable '$(TARGET_SIM)` not found)
endif
RUN_TARGET=\
$(TARGET_SIM) +signature=$(*).signature.output \
$< > $(*).signature.output 2> $@;
RISCV_PREFIX ?= riscv32-unknown-linux-gnu-
RISCV_GCC ?= $(RISCV_PREFIX)gcc
RISCV_OBJDUMP ?= $(RISCV_PREFIX)objdump
RISCV_GCC_OPTS ?= -g -static -mcmodel=medany -fvisibility=hidden -nostdlib -nostartfiles $(RVTEST_DEFINES)
COMPILE_CMD = $$(RISCV_GCC) $(1) $$(RISCV_GCC_OPTS) \
-I$(ROOTDIR)/riscv-test-suite/env/ \
-I$(TARGETDIR)/$(RISCV_TARGET)/ \
-T$(TARGETDIR)/$(RISCV_TARGET)/link.ld \
$$(<) -o $$@
OBJ_CMD = $$(RISCV_OBJDUMP) $$@ -D > $$@.objdump; \
$$(RISCV_OBJDUMP) $$@ --source > $$@.debug
COMPILE_TARGET=\
$(COMPILE_CMD); \
if [ $$$$? -ne 0 ] ; \
then \
echo "\e[31m$$(RISCV_GCC) failed for target $$(@) \e[39m" ; \
exit 1 ; \
fi ; \
$(OBJ_CMD); \
if [ $$$$? -ne 0 ] ; \
then \
echo "\e[31m $$(RISCV_OBJDUMP) failed for target $$(@) \e[39m" ; \
exit 1 ; \
fi ;
```
* `/riscv-compliance/riscv-target/rv32emu-next/link.ld`
```=
OUTPUT_ARCH( "riscv" )
ENTRY(rvtest_entry_point)
SECTIONS
{
. = 0x80000000;
.text.init : { *(.text.init) }
. = ALIGN(0x1000);
.tohost : { *(.tohost) }
. = ALIGN(0x1000);
.text : { *(.text) }
. = ALIGN(0x1000);
.data : { *(.data) }
.data.string : { *(.data.string)}
.bss : { *(.bss) }
_end = .;
}
```
* `/riscv-compliance/riscv-target/rv32emu-next/model_test.h`
```c=
#ifndef _COMPLIANCE_MODEL_H
#define _COMPLIANCE_MODEL_H
#if XLEN == 64
#define ALIGNMENT 3
#else
#define ALIGNMENT 2
#endif
#define RVMODEL_DATA_SECTION \
.pushsection .tohost,"aw",@progbits; \
.align 8; .global tohost; tohost: .dword 0; \
.align 8; .global fromhost; fromhost: .dword 0; \
.popsection; \
.align 8; .global begin_regstate; begin_regstate: \
.word 128; \
.align 8; .global end_regstate; end_regstate: \
.word 4;
#define RVMODEL_HALT
#define RVMODEL_BOOT
//RV_COMPLIANCE_DATA_BEGIN
#define RVMODEL_DATA_BEGIN \
.align 4; .global begin_signature; begin_signature:
//RV_COMPLIANCE_DATA_END
#define RVMODEL_DATA_END \
.align 4; .global end_signature; end_signature: \
RVMODEL_DATA_SECTION \
//RVTEST_IO_INIT
#define RVMODEL_IO_INIT
//RVTEST_IO_WRITE_STR
#define RVMODEL_IO_WRITE_STR(_R, _STR)
//RVTEST_IO_CHECK
#define RVMODEL_IO_CHECK()
//RVTEST_IO_ASSERT_GPR_EQ
#define RVMODEL_IO_ASSERT_GPR_EQ(_S, _R, _I)
//RVTEST_IO_ASSERT_SFPR_EQ
#define RVMODEL_IO_ASSERT_SFPR_EQ(_F, _R, _I)
//RVTEST_IO_ASSERT_DFPR_EQ
#define RVMODEL_IO_ASSERT_DFPR_EQ(_D, _R, _I)
#define RVMODEL_SET_MSW_INT \
li t1, 1; \
li t2, 0x2000000; \
sw t1, 0(t2);
#define RVMODEL_CLEAR_MSW_INT \
li t2, 0x2000000; \
sw x0, 0(t2);
#define RVMODEL_CLEAR_MTIMER_INT
#define RVMODEL_CLEAR_MEXT_INT
#endif // _COMPLIANCE_MODEL_H
```
#### Result
With all these files in place, we can run `make` in the root directory of RISC-V Compliance and check the result.
```
kinoe@:~/riscv-compliance$ make
...
Compare to reference files ...
Check cadd-01 ... OK
Check caddi-01 ... OK
Check caddi16sp-01 ... OK
Check caddi4spn-01 ... OK
Check cand-01 ... OK
Check candi-01 ... OK
Check cbeqz-01 ... OK
Check cbnez-01 ... OK
Check cebreak-011,6c1,6
< 00000000
< 11111111
< 0000008f
< 00000003
< 00000108
< 00000108
---
> deadbeef
> deadbeef
> deadbeef
> deadbeef
> deadbeef
> deadbeef
... FAIL
Check cj-01 ... OK
Check cjal-01 ... OK
Check cjalr-01 ... OK
Check cjr-01 ... OK
Check cli-01 ... OK
Check clui-01 ... OK
Check clw-01 ... OK
Check clwsp-01 ... OK
Check cmv-01 ... OK
Check cnop-01 ... OK
Check cor-01 ... OK
Check cslli-01 ... OK
Check csrai-01 ... OK
Check csrli-01 ... OK
Check csub-01 ... OK
Check csw-01 ... OK
Check cswsp-01 ... OK
Check cxor-01 ... OK
--------------------------------
FAIL: 1/27 RISCV_TARGET=rv32emu-next RISCV_DEVICE=C XLEN=32
...
```
## Reference
* [RISC-V Instruction Set Manual](https://riscv.org//wp-content/uploads/2017/05/riscv-spec-v2.2.pdf)
* [Prof. An-Yeu Wu, National Taiwan University, "RISC-V Instruction Set Architecture Implementation and Hardware Architecture Design"](http://ec2-18-188-66-21.us-east-2.compute.amazonaws.com/call_file/A-08%E3%80%90%E7%A4%BA%E7%AF%84%E6%95%99%E6%9D%90%E3%80%91RISC-V%E6%8C%87%E4%BB%A4%E9%9B%86%E6%9E%B6%E6%A7%8B%E5%AF%A6%E4%BD%9C%E8%88%87%E7%A1%AC%E9%AB%94%E6%9E%B6%E6%A7%8B%E8%A8%AD%E8%A8%88%E6%A8%A1%E7%B5%84%E6%95%99%E6%9D%90%E7%B0%A1%E6%98%93%E7%89%88.pdf)
* [riscv-compliance - GitHub](https://github.com/riscv/riscv-compliance)