# Implement Vector extension for [RVVM](https://github.com/LekKit/RVVM)

contributed by < `ranvd` >

## V extension

> Spec.: https://github.com/riscv/riscv-v-spec

This section gives a brief overview of the vector extension and the basics of how vector operations work. In my term project for RVVM I implemented only the `vset{i}vl{i}` instructions, and I am not familiar with the JIT (Just-In-Time) compilation part of RVVM.

In the RISC-V Vector (RVV) extension, several vector Control and Status Registers (CSRs) govern how vector operations are handled. The most important are `vtype` and `vl`. Before looking at the register encoding, some RVV-specific terms need to be introduced.

* VLEN: the number of bits in a single vector register.
* ELEN: the maximum bit width of a single vector element.
* sew: selected element width, i.e. the element width requested through `vtype`.
* eew: effective element width. In some cases an operand has a different element width than sew (e.g. `vwaddu`, `vnsrl`).
* lmul: vector register group multiplier; specifies how vector registers are grouped.
* emul: effective LMUL. It relates to lmul the same way eew relates to sew.

### vtype register

![image](https://hackmd.io/_uploads/rke8wXpdp.png)

The `vtype` register can only be configured through the `vset{i}vl{i}` instructions. The `vill` bit at position XLEN-1 is set if the requested `vsew` and `vlmul` values are not supported, so that an invalid configuration can be detected.

### vl register

The `vl` register can only be updated by the `vset{i}vl{i}` instructions (and by the fault-only-first vector load variants). It holds the vector length used by subsequent vector operations.

### Other registers

* vlenb: the length of a vector register in bytes (VLEN/8).
* vstart: the index of the first element to be executed by a vector instruction.
* vxrm: controls the rounding mode for fixed-point arithmetic instructions.
* vxsat: indicates whether a fixed-point instruction had to saturate an output value to fit the destination format.
* vcsr: contains the `vxsat` and `vxrm` fields.

## vset{i}vl{i}

In the Makefile, the `USE_RVV` variable enables or disables the Vector extension. When it is set to 1, the `USE_RVV` macro is passed to the compiler.

```makefile
# Makefile
# Vector extension
ifeq ($(USE_RVV), 1)
override CFLAGS += -DUSE_RVV
endif
```

Several vector CSRs need to be added to the `csr` struct inside the `rvvm_hart_t` struct. So far I have only implemented the `vtype` and `vl` registers. These two registers define the element width, the register grouping, the tail and mask policies, and the vector length.

```diff
// rvvm.h
 struct rvvm_hart_t {
     ...
     struct {
         maxlen_t hartid;
         maxlen_t isa;
         maxlen_t status;
         maxlen_t edeleg[PRIVILEGES_MAX];
         maxlen_t ideleg[PRIVILEGES_MAX];
         maxlen_t ie;
         maxlen_t tvec[PRIVILEGES_MAX];
         maxlen_t scratch[PRIVILEGES_MAX];
         maxlen_t epc[PRIVILEGES_MAX];
         maxlen_t cause[PRIVILEGES_MAX];
         maxlen_t tval[PRIVILEGES_MAX];
         maxlen_t ip;
         maxlen_t fcsr;
+# ifdef USE_RVV
+        maxlen_t vstart;
+        maxlen_t vxsat;
+        maxlen_t vxrm;
+        maxlen_t vcsr;
+        maxlen_t vl;
+        maxlen_t vtype;
+        maxlen_t vlenb;
+# endif
     } csr;
     ...
 }
```

When the Vector extension is enabled, the `mstatus.VS` field must not be Off(0); it is set to either Initial(1) or Clean(2). When the CPU executes a vector instruction that modifies vector state, `mstatus.VS` changes to Dirty(3). During hart initialization, some fixed parameters must be set: `VLEN`, `ELEN`, `SEW_min`, and `mstatus.VS`.

![image](https://hackmd.io/_uploads/S1pnSXTOp.png)

```c
static inline void rvv_set_vs(rvvm_hart_t* vm, uint8_t value)
{
    // mstatus.VS occupies bits [10:9]
    vm->csr.status = bit_replace(vm->csr.status, 9, 2, value);
}

rvvm_hart_t* riscv_hart_init(rvvm_machine_t* machine)
{
    ...
#ifdef USE_RVV
    vm->VLEN = 0b1 << 16;
    vm->ELEN = 0b1 << 16;
    vm->SEW_min = 8;
    rvv_set_vs(vm, RVV_INITIAL);
#endif
    ...
    } else {
        vm->csr.isa = CSR_MISA_RV32;
        riscv_decoder_init_rv32(vm);
    }
    riscv_tlb_flush(vm);
    riscv_priv_init(vm);
    return vm;
}
```

In RVVM, privileged instructions are registered in the `riscv_priv_init()` function. I registered the `vset{i}vl{i}` instructions in the same function.

```diff
 void riscv_priv_init(rvvm_hart_t* vm)
 {
     ...
+#ifdef USE_RVV
+    riscv_install_opcode_ISB(vm, RV_PRIV_V_VSETVL, riscv_v_vsetvl);
+#endif
 }
```

Because `vsetvl`, `vsetvli`, and `vsetivli` share the same opcode and `funct3` values, the three variants are told apart inside `riscv_v_vsetvl()`. My implementation uses a switch-case statement for this decoding.

![Screenshot from 2024-01-10 15-52-17](https://hackmd.io/_uploads/HyQmpao_T.png)

The Application Vector Length (AVL) depends on the `rs1` and `rd` fields, following the encoding policy below. `vl` is set to AVL only if AVL does not exceed VLMAX; otherwise this implementation clamps it to VLMAX. VLMAX is determined by the sew and lmul values in the `vtype` register, so `vtype` must be configured before `vl` is set.

![image](https://hackmd.io/_uploads/Hy8KvvTuT.png)

```c
#ifdef USE_RVV
static void riscv_v_vsetvl(rvvm_hart_t *vm, const uint32_t instruction)
{
    rvv_set_vs(vm, RVV_DIRTY);
    regid_t rds = bit_cut(instruction, 7, 5);
    maxlen_t vtype, vl, vlmax;

    switch (instruction & RV_PRIV_V_VSETVL_MASK) {
        case vsetvli_0: // 0b0x
        case vsetvli_1:
            // set vtype from zimm[10:0] (bits 30:20)
            vtype = bit_cut(instruction, 20, 11);
            if (!riscv_csr_op(vm, 0xC21 /* vtype */, &vtype, CSR_SWAP))
                riscv_trap(vm, TRAP_ILL_INSTR, instruction);

            // set vl; at this point vl holds the rs1 index (bits 19:15)
            vl = bit_cut(instruction, 15, 5);
            vlmax = rvv_vlmax(vm, vm->csr.vtype); // VLMAX = LMUL*VLEN/SEW
            if (vl == 0) {
                // rs1 == x0: use VLMAX, or keep the current vl when rd == x0
                vl = vlmax;
                if (rds == 0) vl = (vlmax > vm->csr.vl) ? vm->csr.vl : vlmax;
            } else {
                // AVL comes from x[rs1], clamped to VLMAX
                vl = vm->registers[vl] > vlmax ? vlmax : vm->registers[vl];
            }
            if (riscv_csr_op(vm, 0xC20 /* vl */, &vl, CSR_SWAP))
                vm->registers[rds] = vl;
            else
                riscv_trap(vm, TRAP_ILL_INSTR, instruction);
            break;
        case vsetvl: // 0b10
            // set vtype from x[rs2]; rs2 is bits 24:20
            vtype = bit_cut(instruction, 20, 5);
            vtype = vm->registers[vtype];
            if (!riscv_csr_op(vm, 0xC21 /* vtype */, &vtype, CSR_SWAP))
                riscv_trap(vm, TRAP_ILL_INSTR, instruction);

            // set vl; same rs1/rd policy as vsetvli
            vl = bit_cut(instruction, 15, 5);
            vlmax = rvv_vlmax(vm, vm->csr.vtype); // VLMAX = LMUL*VLEN/SEW
            if (vl == 0) {
                vl = vlmax;
                if (rds == 0) vl = (vlmax > vm->csr.vl) ? vm->csr.vl : vlmax;
            } else {
                vl = vm->registers[vl] > vlmax ? vlmax : vm->registers[vl];
            }
            if (riscv_csr_op(vm, 0xC20 /* vl */, &vl, CSR_SWAP))
                vm->registers[rds] = vl;
            else
                riscv_trap(vm, TRAP_ILL_INSTR, instruction);
            break;
        case vsetivli: // 0b11
            // set vtype from zimm[9:0] (bits 29:20)
            vtype = bit_cut(instruction, 20, 10);
            if (!riscv_csr_op(vm, 0xC21 /* vtype */, &vtype, CSR_SWAP))
                riscv_trap(vm, TRAP_ILL_INSTR, instruction);

            // set vl; AVL is the 5-bit immediate uimm[4:0] (bits 19:15)
            vl = bit_cut(instruction, 15, 5);
            vlmax = rvv_vlmax(vm, vm->csr.vtype);
            vl = (vlmax > vl) ? vl : vlmax;
            if (riscv_csr_op(vm, 0xC20 /* vl */, &vl, CSR_SWAP))
                vm->registers[rds] = vl;
            else
                riscv_trap(vm, TRAP_ILL_INSTR, instruction);
            break;
        default:
            break;
    }
}
#endif
```

In RVVM, each Control and Status Register (CSR) has its own handler. So far I have implemented handlers for the `vtype` and `vl` registers; the other vector CSRs use the `riscv_csr_zero` placeholder.

```diff
 void riscv_csr_global_init()
 {
     ...
+#ifdef USE_RVV /* TODO: handle reads/writes of these CSRs */
+    riscv_csr_list[0x008] = riscv_csr_zero;  // vstart
+    riscv_csr_list[0x009] = riscv_csr_zero;  // vxsat
+    riscv_csr_list[0x00A] = riscv_csr_zero;  // vxrm
+    riscv_csr_list[0x00F] = riscv_csr_zero;  // vcsr
+    riscv_csr_list[0xC20] = riscv_csr_vl;    // vl
+    riscv_csr_list[0xC21] = riscv_csr_vtype; // vtype
+    riscv_csr_list[0xC22] = riscv_csr_zero;  // vlenb
+#endif
     ...
 }
```

The `vtype` CSR handler has to validate the value being written. If the value is invalid, the `vtype[XLEN-1]` bit (`vill`) is set so that the invalid configuration is visible to later instructions.

```c
#ifdef USE_RVV
static bool riscv_csr_vtype(rvvm_hart_t* vm, maxlen_t* dest, uint8_t op)
{
    maxlen_t val = vm->csr.vtype;
    csr_helper(&val, dest, op);

    // check that vtype is valid
    int32_t lmul = rvv_raw_vlmul(val);
    // LMUL must be >= SEW_min/ELEN (compared as log2 values)
    if (lmul < ((int32_t)bit_log2(vm->SEW_min) - (int32_t)bit_log2(vm->ELEN))) {
        // set vill; bit 31 is XLEN-1 on RV32 only
        val = bit_replace(val, 31, 1, 1);
    }
    vm->csr.vtype = val;
    return true;
}

static bool riscv_csr_vl(rvvm_hart_t* vm, maxlen_t* dest, uint8_t op)
{
    // TODO: check that vl is valid
    maxlen_t val = vm->csr.vl;
    csr_helper(&val, dest, op);
    vm->csr.vl = val;
    return true;
}
#endif
```

## Result

I simply inserted `printf` calls to inspect the `vl` and `vtype` values.

```
PC 32: 80000000
vtype: 0
vl: 0
---------------
Read vsetvli: c107557
PC 32: 80000004
vtype: c1
vl: 4000
```

Also, while reading the rvv-intrinsic-doc, I submitted a pull request to [rvv-intrinsic-doc](https://github.com/riscv-non-isa/rvv-intrinsic-doc/commit/6be31ff8cf5e2f468922d02be0645987c67c2ef9) to fix a typo. XD

:::warning
Vector-specific instructions other than `vset{i}vl{i}` are not covered yet.
:::
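
The values logged in the Result section can be cross-checked by hand against the formula VLMAX = LMUL*VLEN/SEW. The sketch below is a standalone C illustration, not RVVM code: the names `vset_kind`, `vset_decode`, `vsetvli_zimm`, and `vlmax` are hypothetical and exist only for this example. It assumes the vtype layout from the spec (`vlmul` in bits 2:0, `vsew` in bits 5:3) and that bits 31:30 of the instruction select the variant, as in the switch-case above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names for this sketch; none of them come from RVVM. */
enum vset_kind { VSETVLI, VSETVL, VSETIVLI };

/* Bits 31:30 select the variant: 0x -> vsetvli, 10 -> vsetvl, 11 -> vsetivli. */
static enum vset_kind vset_decode(uint32_t insn)
{
    if ((insn >> 31) == 0)
        return VSETVLI;
    return ((insn >> 30) & 1) ? VSETIVLI : VSETVL;
}

/* vsetvli carries an 11-bit vtype immediate, zimm[10:0], in bits 30:20. */
static uint32_t vsetvli_zimm(uint32_t insn)
{
    return (insn >> 20) & 0x7FF;
}

/* VLMAX = LMUL * VLEN / SEW for a vtype value with non-reserved fields. */
static uint64_t vlmax(uint64_t vlen, uint32_t vtype)
{
    uint32_t vlmul = vtype & 0x7;        /* vtype[2:0] */
    uint32_t vsew  = (vtype >> 3) & 0x7; /* vtype[5:3] */
    uint64_t sew;

    /* vlmul = 0b100 and vsew = 0b1xx are reserved encodings. */
    assert(vlmul != 4 && vsew < 4);

    sew = (uint64_t)8 << vsew;           /* SEW = 8, 16, 32, 64 */
    if (vlmul < 4)
        return (vlen << vlmul) / sew;    /* LMUL = 1, 2, 4, 8 */
    return (vlen >> (8 - vlmul)) / sew;  /* LMUL = 1/8, 1/4, 1/2 */
}
```

For the logged instruction `0x0c107557`, `vset_decode` yields `VSETVLI` and `vsetvli_zimm` yields `0xc1`, which encodes SEW=8 and LMUL=2 with the tail-agnostic and mask-agnostic bits set. With VLEN = 2^16 as configured in `riscv_hart_init()`, `vlmax(1 << 16, 0xc1)` gives 2 * 65536 / 8 = 16384 = `0x4000`, which matches the logged `vtype: c1` and `vl: 4000`.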