Implement Vector extension for RVVM

contributed by < ranvd >

V extension

Spec.: https://github.com/riscv/riscv-v-spec

In this section I give a brief overview of the vector extension and the basics of how vector operations work. In my term project for RVVM I have implemented only the vset{i}vl{i} instructions. I am not familiar with RVVM's JIT (just-in-time) compilation, so the JIT is not covered.

In the RISC-V Vector (RVV) extension, several vector Control and Status Registers (CSRs) govern how vector operations behave. The most important are vtype and vl. Before going into the register encoding, a few RVV-specific terms need to be introduced.

VLEN: the number of bits in a single vector register.
ELEN: the maximum bit width of a single vector element.
sew: selected element width, i.e. the element width chosen through vtype.
eew: effective element width. In some cases an operand uses an element width different from sew (e.g. vwaddu, vnsrl).
lmul: vector register group multiplier, specifying how many vector registers are grouped together.
emul: effective LMUL. It relates to lmul in the same way that eew relates to sew.

vtype register

The vtype register can only be written through the vset{i}vl{i} instructions. If the requested vsew and vlmul values are not supported, the vill bit at position XLEN-1 is set, which marks the configuration as illegal.
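Since the register layout figure is missing here, the field positions can be sketched in code. The following is a minimal sketch based on the V specification layout (vlmul in bits 2:0, vsew in bits 5:3, vta in bit 6, vma in bit 7). The type vtype_fields_t and the function vtype_decode are hypothetical names for illustration and are not part of RVVM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type for illustration, not part of RVVM. */
typedef struct {
    uint32_t sew;       /* selected element width in bits: 8 << vsew */
    int      lmul_log2; /* log2(LMUL), so LMUL = 2^lmul_log2 */
    bool     vta;       /* tail-agnostic policy bit */
    bool     vma;       /* mask-agnostic policy bit */
    bool     reserved;  /* true for encodings the spec marks as reserved */
} vtype_fields_t;

static vtype_fields_t vtype_decode(uint64_t vtype)
{
    vtype_fields_t f;
    uint32_t vlmul = vtype & 0x7;        /* vtype[2:0] */
    uint32_t vsew  = (vtype >> 3) & 0x7; /* vtype[5:3] */

    f.sew = 8u << vsew;
    /* Sign-extend the 3-bit vlmul field: 0..3 stay 0..3, 5..7 become -3..-1 */
    f.lmul_log2 = (vlmul & 0x4) ? (int)vlmul - 8 : (int)vlmul;
    f.vta = (vtype >> 6) & 1;
    f.vma = (vtype >> 7) & 1;
    /* vsew >= 4 and vlmul == 0b100 are reserved encodings */
    f.reserved = (vsew > 3) || (vlmul == 4);
    return f;
}
```

For instance, vtype_decode(0xc1) yields SEW = 8 and LMUL = 2 with both vta and vma set.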

vl register

The vl register holds the number of elements a vector instruction operates on. It can only be updated by the vset{i}vl{i} instructions and by the fault-only-first vector load variants.

Other registers

vlenb: Specifies the length of vector registers in bytes.
vstart: Holds the index of the first element to be executed by a vector instruction.
vxrm: Controls the rounding mode for fixed-point vector arithmetic.
vxsat: Indicates whether a fixed-point instruction has had to saturate an output value to fit into a destination format.
vcsr: Contains the vxsat and vxrm fields.

vset{i}vl{i}

In the Makefile, I define the macro USE_RVV as the control flag responsible for enabling or disabling the Vector extension.

# Makefile
# Vector extension
ifeq ($(USE_RVV), 1)
override CFLAGS += -DUSE_RVV
endif

Several vector CSRs need to be added to the csr struct inside the rvvm_hart_t struct. At present, I have only implemented the vtype and vl registers. These two registers hold the key parameters: element width, register grouping, tail and mask policy, and the vector length.

// rvvm.h
struct rvvm_hart_t {
...
    
    struct {
        maxlen_t hartid;
        maxlen_t isa;
        maxlen_t status;
        maxlen_t edeleg[PRIVILEGES_MAX];
        maxlen_t ideleg[PRIVILEGES_MAX];
        maxlen_t ie;
        maxlen_t tvec[PRIVILEGES_MAX];
        maxlen_t scratch[PRIVILEGES_MAX];
        maxlen_t epc[PRIVILEGES_MAX];
        maxlen_t cause[PRIVILEGES_MAX];
        maxlen_t tval[PRIVILEGES_MAX];
        maxlen_t ip;
        maxlen_t fcsr;
+# ifdef USE_RVV
+        maxlen_t vstart;
+        maxlen_t vxsat;
+        maxlen_t vxrm;
+        maxlen_t vcsr;
+        maxlen_t vl;
+        maxlen_t vtype;
+        maxlen_t vlenb;
+# endif
    } csr;
...
}

When the vector extension is enabled, the mstatus.VS field must hold a value other than Off (0), i.e. Initial (1) or Clean (2). Once the CPU executes a vector instruction that changes vector state, mstatus.VS is updated to Dirty (3). During hart initialization, a few fixed parameters must also be set: VLEN, ELEN, SEW_min, and mstatus.VS.

static inline void rvv_set_vs(rvvm_hart_t* vm, uint8_t value)
{
    vm->csr.status = bit_replace(vm->csr.status, 9, 2, value);
}

rvvm_hart_t* riscv_hart_init(rvvm_machine_t* machine)
{
...
#ifdef USE_RVV
    vm->VLEN = 0b1 << 16;
    vm->ELEN = 0b1 << 16;
    vm->SEW_min = 8;
    rvv_set_vs(vm, RVV_INITIAL);
#endif
...
    } else {
        vm->csr.isa = CSR_MISA_RV32;
        riscv_decoder_init_rv32(vm);
    }

    riscv_tlb_flush(vm);
    riscv_priv_init(vm);
    return vm;
}
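The constants RVV_INITIAL and RVV_DIRTY used above are not shown in this note. As a standalone sketch of the same idea, the snippet below writes and reads the two-bit VS field at bits 10:9 of mstatus. The enum names and the helpers status_set_vs and status_get_vs are hypothetical; only the values 0 to 3 and the bit position follow the text and the rvv_set_vs helper above.

```c
#include <stdint.h>

/* mstatus.VS encodings (hypothetical names; values as described in the text) */
enum { VS_OFF = 0, VS_INITIAL = 1, VS_CLEAN = 2, VS_DIRTY = 3 };

#define MSTATUS_VS_SHIFT 9
#define MSTATUS_VS_MASK  (UINT64_C(3) << MSTATUS_VS_SHIFT)

/* Return status with the VS field replaced by vs, other bits untouched */
static uint64_t status_set_vs(uint64_t status, unsigned vs)
{
    return (status & ~MSTATUS_VS_MASK)
         | (((uint64_t)vs & 3) << MSTATUS_VS_SHIFT);
}

/* Extract the current VS field */
static unsigned status_get_vs(uint64_t status)
{
    return (unsigned)((status & MSTATUS_VS_MASK) >> MSTATUS_VS_SHIFT);
}
```

An emulator can use status_get_vs to reject vector instructions while the field is Off, which this project does not do yet.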

Within the RVVM project, all privileged instructions are registered through the riscv_priv_init() function. Consequently, I have registered the vset{i}vl{i} instructions within this function.

void riscv_priv_init(rvvm_hart_t* vm)
{
...
+#ifdef USE_RVV
+    riscv_install_opcode_ISB(vm, RV_PRIV_V_VSETVL, riscv_v_vsetvl);
+#endif
}

Because vsetvl, vsetvli, and vsetivli share the same opcode and funct3 values, the three variants are told apart inside the riscv_v_vsetvl() function, using the upper bits of the instruction. In my implementation, a switch-case statement handles this decoding.
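The decoding step can be sketched on its own. This is a minimal sketch assuming the encoding from the V specification: bit 31 clear means vsetvli, bits 31:30 equal to 10 mean vsetvl, and 11 mean vsetivli. The names vset_kind_t, vset_classify, bits, and vset_vtype_field are hypothetical and do not exist in RVVM.

```c
#include <stdint.h>

typedef enum { VSET_VSETVLI, VSET_VSETVL, VSET_VSETIVLI } vset_kind_t;

/* Classify by instruction bits 31:30: 0x -> vsetvli, 10 -> vsetvl, 11 -> vsetivli */
static vset_kind_t vset_classify(uint32_t insn)
{
    if ((insn >> 31) == 0)
        return VSET_VSETVLI;
    return ((insn >> 30) & 1) ? VSET_VSETIVLI : VSET_VSETVL;
}

/* Extract `width` bits starting at bit `pos` (width must be below 32) */
static uint32_t bits(uint32_t insn, unsigned pos, unsigned width)
{
    return (insn >> pos) & ((UINT32_C(1) << width) - 1);
}

/* Field that supplies vtype: an immediate for vsetvli and vsetivli,
 * the rs2 register index for vsetvl */
static uint32_t vset_vtype_field(uint32_t insn)
{
    switch (vset_classify(insn)) {
    case VSET_VSETVLI:  return bits(insn, 20, 11); /* zimm[10:0], bits 30:20 */
    case VSET_VSETIVLI: return bits(insn, 20, 10); /* zimm[9:0],  bits 29:20 */
    default:            return bits(insn, 20, 5);  /* rs2 index,  bits 24:20 */
    }
}
```

The field widths differ per variant (11, 10, and 5 bits), which is easy to mix up when the three cases are written out by hand.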

The application vector length (AVL) depends on the rs1 and rd fields. If rs1 is not x0, AVL is the value in x[rs1]. If rs1 is x0 and rd is not x0, vl is set to VLMAX. If both are x0, the current vl is kept. For vsetivli, AVL is the 5-bit immediate in the rs1 position. vl is set to AVL when AVL does not exceed VLMAX; otherwise this implementation caps it at VLMAX. VLMAX is determined by the sew and lmul values in the vtype register, so vtype must be configured before vl is computed.
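The vl selection rule can be sketched as two small functions. This is a minimal sketch under stated assumptions: LMUL is passed as a log2 value, and vl is simply min(AVL, VLMAX), which is one of the choices the specification permits when AVL exceeds VLMAX. The names vlmax_of and vset_new_vl are hypothetical and are not RVVM functions.

```c
#include <stdint.h>

/* VLMAX = LMUL * VLEN / SEW, with LMUL given as log2 (negative = fractional) */
static uint64_t vlmax_of(uint64_t vlen, uint32_t sew, int lmul_log2)
{
    uint64_t per_reg = vlen / sew; /* elements in one vector register */
    return lmul_log2 >= 0 ? per_reg << lmul_log2
                          : per_reg >> -lmul_log2;
}

/* New vl for vsetvli/vsetvl, given the rd and rs1 register indices,
 * the value held in x[rs1], the current vl, and VLMAX */
static uint64_t vset_new_vl(unsigned rd, unsigned rs1, uint64_t rs1_val,
                            uint64_t cur_vl, uint64_t vlmax)
{
    if (rs1 != 0)                 /* AVL comes from x[rs1] */
        return rs1_val < vlmax ? rs1_val : vlmax;
    if (rd != 0)                  /* rs1 = x0, rd != x0: request VLMAX */
        return vlmax;
    return cur_vl;                /* rs1 = x0, rd = x0: keep existing vl */
}
```

With VLEN = 65536, SEW = 8, and LMUL = 2, vlmax_of returns 16384, which is the 0x4000 printed in the Result section.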

#ifdef USE_RVV
static void riscv_v_vsetvl(rvvm_hart_t *vm, const uint32_t instruction)
{
    rvv_set_vs(vm, RVV_DIRTY);
    regid_t rds = bit_cut(instruction, 7, 5);
    maxlen_t vtype, vl, vlmax;

    switch (instruction & RV_PRIV_V_VSETVL_MASK)
    {
    case vsetvli_0: // 0b0x
    case vsetvli_1:
        // set vtype from zimm[10:0] (instruction bits 30:20)
        vtype = bit_cut(instruction, 20, 11);
        if (!riscv_csr_op(vm, 0xC21 /* vtype */, &vtype, CSR_SWAP))
            riscv_trap(vm, TRAP_ILL_INSTR, instruction);

        // set vl; at this point vl temporarily holds the rs1 register index
        vl = bit_cut(instruction, 15, 5);
        vlmax = rvv_vlmax(vm, vm->csr.vtype);

        if (vl == 0){
            vl = vlmax;
            if (rds == 0)
                vl = (vlmax > vm->csr.vl)? vm->csr.vl : vlmax;
        } else {
            vl = vm->registers[vl] > vlmax ? vlmax : vm->registers[vl];
        }
        // VLMAX = LMUL*VLEN/SEW
        if (riscv_csr_op(vm, 0xC20 /* vl */, &vl, CSR_SWAP))
            vm->registers[rds] = vl;
        else
            riscv_trap(vm, TRAP_ILL_INSTR, instruction);
        break;
    case vsetvl: // 0b10
        // set vtype from x[rs2]; rs2 is the 5-bit field at bits 24:20
        vtype = bit_cut(instruction, 20, 5);
        vtype = vm->registers[vtype];
        if (!riscv_csr_op(vm, 0xC21 /* vtype */, &vtype, CSR_SWAP))
            riscv_trap(vm, TRAP_ILL_INSTR, instruction);
        
        // set vl
        vl = bit_cut(instruction, 15, 5);
        vlmax = rvv_vlmax(vm, vm->csr.vtype);

        if (vl == 0){
            vl = vlmax;
            if (rds == 0)
                vl = (vlmax > vm->csr.vl)? vm->csr.vl : vlmax;
        } else {
            vl = vm->registers[vl] > vlmax ? vlmax : vm->registers[vl];
        }
        
        // VLMAX = LMUL*VLEN/SEW
        if (riscv_csr_op(vm, 0xC20 /* vl */, &vl, CSR_SWAP))
            vm->registers[rds] = vl;
        else
            riscv_trap(vm, TRAP_ILL_INSTR, instruction);
        break;
    case vsetivli: // 0b11
        // set vtype from zimm[9:0] (instruction bits 29:20, 10 bits wide)
        vtype = bit_cut(instruction, 20, 10);
        if (!riscv_csr_op(vm, 0xC21 /* vtype */, &vtype, CSR_SWAP))
            riscv_trap(vm, TRAP_ILL_INSTR, instruction);

        // set vl; uimm[4:0] at bits 19:15 is the AVL itself
        vl = bit_cut(instruction, 15, 5);
        vlmax = rvv_vlmax(vm, vm->csr.vtype);

        vl = (vlmax > vl)? vl : vlmax;
        if (riscv_csr_op(vm, 0xC20 /* vl */, &vl, CSR_SWAP))
            vm->registers[rds] = vl;
        else
            riscv_trap(vm, TRAP_ILL_INSTR, instruction);
        break;
    default:
        break;
    }
}
#endif

In RVVM, each Control and Status Register (CSR) has its own handler. Currently, I have implemented handlers for the vtype and vl registers; the remaining vector CSRs are stubbed with riscv_csr_zero.

void riscv_csr_global_init()
{
...
+#ifdef USE_RVV /* TODO: action when r/w csr */
+    riscv_csr_list[0x008] = riscv_csr_zero;     // vstart
+    riscv_csr_list[0x009] = riscv_csr_zero;     // vxsat
+    riscv_csr_list[0x00A] = riscv_csr_zero;     // vxrm
+    riscv_csr_list[0x00F] = riscv_csr_zero;     // vcsr
+    riscv_csr_list[0xC20] = riscv_csr_vl;       // vl
+    riscv_csr_list[0xC21] = riscv_csr_vtype;    // vtype
+    riscv_csr_list[0xC22] = riscv_csr_zero;     // vlenb
+#endif
...
}

The vtype CSR handler must validate the value being written. If the value is invalid, the vill bit, vtype[XLEN-1], is set so that the illegal configuration can be detected later.
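For comparison, a fuller validity check can be sketched as follows. This is a sketch under assumptions and is not the check used in this project: it rejects reserved vsew and vlmul encodings and nonzero reserved bits, and requires SEW <= ELEN and, for fractional LMUL, SEW <= LMUL * ELEN, following the V specification. On failure it returns only the vill bit, since the specification says the remaining vtype bits are zeroed. The name vtype_validate and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return vtype unchanged if supported, otherwise a value with only
 * vill (bit xlen-1) set. xlen is 32 or 64; elen is a power of two. */
static uint64_t vtype_validate(uint64_t vtype, unsigned xlen, uint32_t elen)
{
    uint32_t vlmul = vtype & 0x7;        /* vtype[2:0] */
    uint32_t vsew  = (vtype >> 3) & 0x7; /* vtype[5:3] */

    /* Reserved upper bits, reserved vsew, reserved vlmul = 0b100 */
    bool illegal = (vtype >> 8) != 0 || vsew > 3 || vlmul == 4;

    if (!illegal) {
        uint32_t sew = 8u << vsew;
        if (sew > elen) {
            illegal = true;
        } else if (vlmul > 4 && sew > (elen >> (8 - vlmul))) {
            /* Fractional LMUL = 1/2^(8 - vlmul): need SEW <= LMUL * ELEN */
            illegal = true;
        }
    }
    return illegal ? UINT64_C(1) << (xlen - 1) : vtype;
}
```

Taking xlen as a parameter keeps the vill position correct on both RV32 and RV64.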

#ifdef USE_RVV
static bool riscv_csr_vtype(rvvm_hart_t* vm, maxlen_t* dest, uint8_t op)
{
    maxlen_t val = vm->csr.vtype;
    csr_helper(&val, dest, op);

    // check that vtype is valid
    int32_t lmul = rvv_raw_vlmul(val);
    
    // lmul must be greater than or equal to log2(SEW_min) - log2(ELEN)
    if (lmul < ((int32_t)bit_log2(vm->SEW_min) - (int32_t)bit_log2(vm->ELEN))){
        // NOTE: vill is bit XLEN-1; bit 31 is only correct on RV32
        val = bit_replace(val, 31, 1, 1);
    }
    vm->csr.vtype = val;
    return true;
}

static bool riscv_csr_vl(rvvm_hart_t* vm, maxlen_t* dest, uint8_t op)
{
    // check vl is valid
    maxlen_t val = vm->csr.vl;
    csr_helper(&val, dest, op);
    vm->csr.vl = val;
    return true;
}
#endif

Result

I simply inserted printf calls to inspect the vl and vtype values.

PC 32: 80000000
vtype: 0
vl: 0
---------------
Read vsetvli: c107557
PC 32: 80000004
vtype: c1
vl: 4000

Also, while reading the rvv-intrinsic-doc, I opened a pull request to rvv-intrinsic-doc to fix a typo. XD

Limitation: vector-specific instructions other than vset{i}vl{i} are not covered yet.