---
tags: Computer_Architecture, csapp, cs61c, jserv, RISC-V
---

# CA2022 Project: RISCV-Atom and implement RV32M
contributed by <[`鄒崴丞StevenChou`](https://github.com/StevenChou499)>, <[`王漢祺WangHanChi`](https://github.com/WangHanChi)>

## Prerequisites
### Install Dependencies
1. Git clone the source code
> git clone https://github.com/saursin/riscv-atom.git
2. Install git, make, python3, gcc & other tools
> sudo apt install git python3 build-essential
3. Install Verilator (Version = 5.002)
> cd $HOME
> git clone https://github.com/verilator/verilator
> cd verilator
> git checkout stable
> export VERILATOR_ROOT=$(pwd)
> autoconf
> ./configure
> make
> export VERILATOR_ROOT=\$HOME/verilator
> export PATH=\$VERILATOR_ROOT/bin:\$PATH
4. Install GTK Wave
> sudo apt install gtkwave
5. Install Screen
> sudo apt install screen
6. Install RISC-V GNU Toolchain

:warning: Before doing this, we need to add a file to `riscv-atom`. The file is [here](https://github.com/riscv-collab/riscv-gnu-toolchain/blob/master/configure), or you can follow the steps [here](https://hackmd.io/KMhQQ8FZSnmOsBGDSebhlw?view#Fix-a-little-error).
> cd riscv-atom
> sudo chmod +x install-toolchain.sh
> sudo ./install-toolchain.sh -x

or install from source: [riscv-gnu-toolchain](https://github.com/riscv-collab/riscv-gnu-toolchain)

7. Install Doxygen
> sudo apt install doxygen
8. Install LaTeX-related packages
> sudo apt -y install texlive-latex-recommended texlive-pictures texlive-latex-extra latexmk
9. Install sphinx & other python dependencies
> cd docs/ && pip install -r requirements.txt
10. Install socat packages
> sudo apt install socat

### Building RISC-V Atom
1. RISC-V Atom environment variables
> cd riscv-atom
> source sourceme
> echo "source \<YOUR-PATH>/sourceme" >> ~/.bashrc
2. Building the Simulator
> make soctarget=atombones
> atomsim --help

::: warning
:warning: Our `verilator` was installed from source instead of with `sudo apt install verilator`.
So we need to change the Verilator include path in `~/riscv-atom/sim/makefile`.
```makefile=38
####################################################
# Verilog Configs

VC := verilator
VFLAGS := -cc -Wall --relative-includes --trace -D__ATOMSIM_SIMULATION__

# This should point to verilator include directory
#VERILATOR_INCLUDE_PATH := /usr/share/verilator/include
VERILATOR_INCLUDE_PATH := /home/hanchi/verilator/include

ifeq ("$(wildcard $(VERILATOR_INCLUDE_PATH)/verilated_vcd_c.cpp)","")
$(error Verilator include path invalid; Set correct Verilator include path in sim/Makefile)
endif

# Core files
VSRCS = $(RTL_DIR)/Timescale.vh
VSRCS += $(RTL_DIR)/core/Utils.vh
VSRCS += $(RTL_DIR)/core/Defs.vh
VSRCS += $(RTL_DIR)/core/AtomRV.v
VSRCS += $(RTL_DIR)/core/Alu.v
VSRCS += $(RTL_DIR)/core/Decode.v
VSRCS += $(RTL_DIR)/core/RegisterFile.v
VSRCS += $(RTL_DIR)/core/CSR_Unit.v
```
:::

### Fix a little error
::: warning
:warning: Because there is a bug in the author's `install-toolchain.sh`, I installed riscv-gnu-toolchain from [source](https://github.com/riscv-collab/riscv-gnu-toolchain.git), which provides the `configure` file in the riscv-gnu-toolchain directory. We can therefore modify `install-toolchain.sh` at **line 60** as follows.
```shell=50
echo -e "${GREEN}Building toolchain... ${NOCOLOR}"

# Create build directory
rm -rf ${BUILD_DIR}
mkdir -p ${BUILD_DIR}

# Build toolchain
cd ${BUILD_DIR}
echo "../../configure --prefix=$(pwd) ${TOOLCHAIN_CONFIG}"
#../../configure --prefix=$(pwd) ${TOOLCHAIN_CONFIG}
~/riscv-gnu-toolchain/configure --prefix=$(pwd) ${TOOLCHAIN_CONFIG}

echo "make -j${BUILD_JOBS}"
make -j${BUILD_JOBS}

# Copy files to the installation directory
echo -e "${GREEN}Copying... 
${NOCOLOR}"
rsync -r --progress * ${TOOLCHAIN_INSTALL_PATH}/
cd ${CWDIR}
```
:::

## Running Examples on AtomSim
### Hello World Example
Switch to the examples directory
> cd ~/riscv-atom/sw/examples

Compile with the RISC-V gcc cross-compiler, which generates `hello.elf` in the `hello-asm` directory
> make soctarget=atombones ex=hello-asm compile

Run the example
> atomsim hello-asm/hello.elf

### Alternatively, use make run to run the example
> make soctarget=atombones ex=hello-asm run

The syntax is as follows:
> make soctarget=\<TARGET> ex=\<EXAMPLE> compile
> make soctarget=\<TARGET> ex=\<EXAMPLE> run

### The Runexamples Script
Automatically compile and simulate all examples
> atomsim-runexamples
> make soctarget=atombones run-all

### Using Atomsim Vuart
> atomsim-gen-vports
> screen $RVATOM/userport 9600

In another terminal
> atomsim hello-asm/hello.elf --vuart=$RVATOM/simport

To close the screen command press `ctrl+a`, type `:quit` and press `Enter`.

### Test AtomSim(CPU core)
Running `atomsim --help` lists all the options of the Atom simulator.
```
AtomSim v2.2
Interactive RTL Simulator for Atom based systems [ atombones ]
Usage:
  atomsim [OPTION...] 
input

 Backend Config options:
      --vuart arg       use provided virtual uart port (default: "")
      --vuart-baud arg  Specify virtual uart port baudrate (default: 9600)
      --imemsize arg    Specify size of instruction memory to simulate (in KB) (default: 65536)
      --dmemsize arg    Specify size of data memory to simulate (in KB) (default: 65536)

 Debug options:
  -v, --verbose         Turn on verbose output
  -d, --debug           Start in debug mode
  -t, --trace           Enable VCD tracing
      --trace-file arg  Specify trace file (default: trace.vcd)
      --dump-file arg   Specify dump file (default: dump.txt)
      --ebreak-dump     Enable processor state dump at hault
      --signature arg   Enable signature dump at hault (Used for riscv compliance tests) (default: "")

 General options:
  -h, --help       Show this message
      --version    Show version information
      --soctarget  Show current AtomSim SoC target
      --no-color   Don't show colored output
  -i, --input arg  Specify an input file

 Sim Config options:
      --maxitr arg  Specify maximum simulation iterations (default: 1000000)
```
1. `-t` writes the waveform file `trace.vcd` into `~/riscv-atom/`
2. `-v` prints detailed output

---

## Analyze RISCV-Atom
### 2-Stage Pipeline
In the [RISC-V Atom (Core)](https://riscv-atom.readthedocs.io/en/latest/pages/documentation/riscv_atom/riscv_atom.html) documentation, the author says that the design was inspired by the Arm Cortex-M0+.

According to [MICROCHIP DEVELOPER HELP](https://microchipdeveloper.com/32arm:m0-pipeline), the two-stage pipeline of the M0+ minimizes its branch penalty. It also reduces the number of Flash accesses, and since Flash access usually accounts for a large part of a microcontroller's power consumption, the core can work at ultra-low power.

![](https://i.imgur.com/GGoTjTb.png)

This picture shows the pipeline of the Arm Cortex-M0+. The Decode stage is divided into Pre-decode and Main decode, and this design minimizes the branch penalty.
![](https://i.imgur.com/yVuFXJU.png)

Atom inherits this advantage. As the figure above shows, it is divided into two stages: stage 1 fetches the instruction, and stage 2 is responsible for Decode, Execute & Write-back. This reduces its branch penalty to 1 cycle.

### Design motivation
Putting Decode in stage 2 reduces the use of the register file, which in turn reduces LUT usage.
A LUT (lookup table) is a common FPGA component. It behaves like a ROM: the results of a logic function are stored in advance, and the input signals are used as the address that selects the stored result.

![](https://i.imgur.com/dFvUMW7.png)

LUT usage can be reduced because an FPGA is composed of basic blocks called CLBs (Configurable Logic Blocks) or LABs (Logic Array Blocks). The figure above shows the typical structure of a CLB, which includes a full adder (FA), a D-type flip-flop, three multiplexers (mux) and two three-input lookup tables (3-LUTs). Therefore, using fewer CLBs also means using fewer LUTs.

---

## Validate RISC-V Atom by Verilator, including Dhrystone
### Validate
The version of Verilator in our environment is 5.002
```makefile
hanchi@hanchi:~$ verilator --version
Verilator 5.002 2022-10-29 rev v5.002-29-gdb39d70c7
```
As described above, we can type `make sim` in the ROOTDIR of riscv-atom to build atomsim. The corresponding rule in the makefile is
```makefile=85
# ======== AtomSim ========
.PHONY : sim
sim:                       ## Build atomsim for given target [default: atombones]
	@echo "$(COLOR_GREEN)>> Building Atomsim for soctarget=$(soctarget) $(COLOR_NC)"
	make $(MKFLAGS) -C $(sim_dir) soctarget=$(soctarget) DEBUG=1
```
After building **atomsim**, we can type `atomsim --help` to get the user guide:
```makefile
hanchi@hanchi:~/riscv-atom$ atomsim --help
AtomSim v2.2
Interactive RTL Simulator for Atom based systems [ atombones ]
Usage:
  atomsim [OPTION...] 
input

 Backend Config options:
      --vuart arg       use provided virtual uart port (default: "")
      --vuart-baud arg  Specify virtual uart port baudrate (default: 9600)
      --imemsize arg    Specify size of instruction memory to simulate (in KB) (default: 65536)
      --dmemsize arg    Specify size of data memory to simulate (in KB) (default: 65536)

 Debug options:
  -v, --verbose         Turn on verbose output
  -d, --debug           Start in debug mode
  -t, --trace           Enable VCD tracing
      --trace-file arg  Specify trace file (default: trace.vcd)
      --dump-file arg   Specify dump file (default: dump.txt)
      --ebreak-dump     Enable processor state dump at hault
      --signature arg   Enable signature dump at hault (Used for riscv compliance tests) (default: "")

 General options:
  -h, --help       Show this message
      --version    Show version information
      --soctarget  Show current AtomSim SoC target
      --no-color   Don't show colored output
  -i, --input arg  Specify an input file

 Sim Config options:
      --maxitr arg  Specify maximum simulation iterations (default: 1000000)
```
Now we can run a simple example to check atomsim
> atomsim sw/examples/hello-asm/hello.elf

which prints
```makefile=
hanchi@hanchi:~/riscv-atom$ atomsim sw/examples/hello-asm/hello.elf
Loading segment 1 [base=0x00000000, sz= 456 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x04000000, sz= 40 bytes, at=0x04000000] ... done
Hello World!
 -- from Assembly
```
We can also add flags such as `-v`, `-d` and `-t`.
This completes the validation of RISC-V Atom with Verilator.

### Dhrystone
The author, [saursin](https://github.com/saursin), had already ported Dhrystone, but some bugs in the code prevented it from running. I commented out some code to fix it. Now we can type `make dhrystone` in the ROOTDIR to get the Dhrystone result.
The code below is the modified fragment of `dhry_1.c`.
```c=97
  /* Initializations */
  //#ifdef RISCV
  //serial_init(UART_BAUD_9600);
  //#endif

  Next_Ptr_Glob = (Rec_Pointer) malloc (sizeof (Rec_Type));
  Ptr_Glob = (Rec_Pointer) malloc (sizeof (Rec_Type));
```
Here is the result of Dhrystone
```makefile
hanchi@hanchi:~/riscv-atom$ make dhrystone
atomsim --maxitr 100000000 -t sw/examples/dhrystone/dhrystone.elf
Loading segment 1 [base=0x00000000, sz= 11436 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x04000000, sz= 1068 bytes, at=0x04000000] ... done

Dhrystone Benchmark, Version 2.1 (Language: C)

Program compiled without 'register' attribute

Execution starts, 2000 runs through Dhrystone
Execution ends

Final values of the variables used in the benchmark:

Int_Glob:            5
        should be:   5
Bool_Glob:           1
        should be:   1
Ch_1_Glob:           A
        should be:   A
Ch_2_Glob:           B
        should be:   B
Arr_1_Glob[8]:       7
        should be:   7
Arr_2_Glob[8][7]:    2010
        should be:   Number_Of_Runs + 10
Ptr_Glob->
  Ptr_Comp:          67120132
        should be:   (implementation-dependent)
  Discr:             0
        should be:   0
  Enum_Comp:         2
        should be:   2
  Int_Comp:          17
        should be:   17
  Str_Comp:          DHRYSTONE PROGRAM, SOME STRING
        should be:   DHRYSTONE PROGRAM, SOME STRING
Next_Ptr_Glob->
  Ptr_Comp:          67120132
        should be:   (implementation-dependent), same as above
  Discr:             0
        should be:   0
  Enum_Comp:         1
        should be:   1
  Int_Comp:          18
        should be:   18
  Str_Comp:          DHRYSTONE PROGRAM, SOME STRING
        should be:   DHRYSTONE PROGRAM, SOME STRING
Int_1_Loc:           5
        should be:   5
Int_2_Loc:           13
        should be:   13
Int_3_Loc:           7
        should be:   7
Enum_Loc:            1
        should be:   1
Str_1_Loc:           DHRYSTONE PROGRAM, 1'ST STRING
        should be:   DHRYSTONE PROGRAM, 1'ST STRING
Str_2_Loc:           DHRYSTONE PROGRAM, 2'ND STRING
        should be:   DHRYSTONE PROGRAM, 2'ND STRING

Number Of Runs: 2000
cycles Elapsed: 1718085
Dhrystones_Per_Second_Per_MHz: 1164
DMIPS_Per_MHz: 0.662
```
:::warning
TODO : Add [coremark](https://github.com/riscv-boom/riscv-coremark) and so on.
:::

---

## Study Atomsim
AtomSim is the main part of riscv-atom. It has two main parts, core and uncore.
The core is the "core" part of the processor, and the uncore is the remaining part of the processor.
### Core
First we take a look at the core part of Atomsim. The CPU is a 2-stage pipeline: the first stage is the fetch stage, and the second stage covers decode, execute, memory and write-back.

* `Defs.vh` :

`Defs.vh` defines all the macros used in the other files, mainly the instruction types, the ALU operations and the comparator operations.

* `Decode.v` :

`Decode.v` decodes the fetched instruction. It extracts the `opcode`, `func3` and `func7` fields and determines the destination register `rd` and the source registers `rs1` and `rs2`.

```verilog=35
// Decode fields
wire    [6:0]   opcode = instr_i[6:0];
wire    [2:0]   func3 = instr_i[14:12];
wire    [6:0]   func7 = instr_i[31:25];

assign  mem_access_width_o = func3;
assign  csru_op_sel_o = func3;

assign  rd_sel_o    = instr_i[11:7];
assign  rs1_sel_o   = instr_i[19:15];
assign  rs2_sel_o   = instr_i[24:20];

reg     [2:0]   imm_format;
```

After decomposing the fetched instruction, the decoder works out which operation is required, enables or disables the register-file write, and selects the comparison type.

```verilog=71
always @(*) begin
    // DEFAULT VALUES
    jump_en_o = 1'b0;
    comparison_type_o = `CMP_FUNC_UN;
    rf_we_o = 1'b0;
    rf_din_sel_o = 3'd0;
    a_op_sel_o = 1'b0;
    b_op_sel_o = 1'b0;
    cmp_b_op_sel_o = 1'b0;
    alu_op_sel_o = `ALU_FUNC_ADD;
    mem_we_o = 1'b0;
    d_mem_load_store = 1'b0;
    imm_format = `RV_IMM_TYPE_U;
    csru_we_o = 0;

    casez({func7, func3, opcode})
        /* LUI */
        17'b???????_???_0110111:
        begin
            rf_we_o = 1'b1;
            rf_din_sel_o = 3'd0;
            imm_format = `RV_IMM_TYPE_U;
        end
        ...
            `ifdef verilator
                if(opcode != 7'b1110011) // EBREAK
                    $display("!Warning: Unimplemented Opcode: %b", opcode);
            `endif
        end
    endcase
end
```

With the register-file controls and the comparison type decided, the last thing the decoder needs to do is generate a 32-bit immediate for later calculation.
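
To make the bit positions easier to check against the Verilog that follows, here is a minimal Python sketch of the same five immediate formats. It is a software model written for this note, not part of riscv-atom: the helper names `bits`, `sign_extend` and `decode_imm` are hypothetical, and the function returns signed Python integers instead of 32-bit patterns.

```python
def bits(value, hi, lo):
    """Return value[hi:lo] (inclusive), like a Verilog part-select."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend(value, width):
    """Interpret the low `width` bits of value as a two's-complement number."""
    sign = 1 << (width - 1)
    return (value & (sign - 1)) - (value & sign)


def decode_imm(instr, fmt):
    """Extract the immediate of a 32-bit RISC-V instruction for format fmt."""
    if fmt == "I":
        return sign_extend(bits(instr, 31, 20), 12)
    if fmt == "S":
        return sign_extend((bits(instr, 31, 25) << 5) | bits(instr, 11, 7), 12)
    if fmt == "B":
        raw = ((bits(instr, 31, 31) << 12) | (bits(instr, 7, 7) << 11)
               | (bits(instr, 30, 25) << 5) | (bits(instr, 11, 8) << 1))
        return sign_extend(raw, 13)
    if fmt == "U":
        return sign_extend(bits(instr, 31, 12) << 12, 32)
    if fmt == "J":
        raw = ((bits(instr, 31, 31) << 20) | (bits(instr, 19, 12) << 12)
               | (bits(instr, 20, 20) << 11) | (bits(instr, 30, 21) << 1))
        return sign_extend(raw, 21)
    raise ValueError(f"unknown immediate format: {fmt}")


# addi x1, x0, -1 is encoded as 0xFFF00093
print(decode_imm(0xFFF00093, "I"))
```

Each branch concatenates the same instruction bits as the corresponding `RV_IMM_TYPE_*` case; the only difference is that sign extension is done arithmetically instead of by replicating `instr_i[31]`.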
```verilog=49
/* Decode Immediate */
reg [31:0] getExtImm;

always @(*) /*COMBINATORIAL*/
begin
    case(imm_format)
        `RV_IMM_TYPE_I : getExtImm = {{21{instr_i[31]}}, instr_i[30:25], instr_i[24:21], instr_i[20]};
        `RV_IMM_TYPE_S : getExtImm = {{21{instr_i[31]}}, instr_i[30:25], instr_i[11:8], instr_i[7]};
        `RV_IMM_TYPE_B : getExtImm = {{20{instr_i[31]}}, instr_i[7], instr_i[30:25], instr_i[11:8], 1'b0};
        `RV_IMM_TYPE_U : getExtImm = {instr_i[31], instr_i[30:20], instr_i[19:12], {12{1'b0}}};
        `RV_IMM_TYPE_J : getExtImm = {{12{instr_i[31]}}, instr_i[19:12], instr_i[20], instr_i[30:25], instr_i[24:21], 1'b0};
        default: getExtImm = 32'd0;
    endcase
end

assign imm_o = getExtImm;
```

* `RegisterFile.v` :

`RegisterFile.v` stores all the RISC-V CPU registers and provides a reset function. Within one cycle we can write one register and read two registers at the same time. Because a RISC-V CPU has 32 registers, we need 5 bits to address each of them. The first register, `x0`, is hard-wired to zero, which means it can never hold any value other than zero.

```verilog=38
localparam REG_COUNT = 2**REG_ADDR_WIDTH;

`ifdef RF_R0_IS_ZERO
reg [REG_WIDTH-1:0] regs [1:REG_COUNT-1] /*verilator public*/;   // register array
`else
reg [REG_WIDTH-1:0] regs [0:REG_COUNT-1] /*verilator public*/;   // register array
`endif
```

A write happens on every rising clock edge (once per clock cycle) on which the write enable signal `Data_We_i` is high; the register to be written is chosen by the `Rd_Sel_i` signal. If the reset signal `Rst_i` is high, all registers are cleared to zero.
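
The write and reset rules just described can be modelled in a few lines of Python before reading the Verilog that follows. This is only a behavioural sketch: the class `RegisterFile` and its `clock` and `read` methods are hypothetical names, it assumes the `RF_R0_IS_ZERO` configuration, and it assumes that reading `x0` returns zero (the read ports are not shown in this note).

```python
class RegisterFile:
    """Behavioural model: 32 registers, x0 hard-wired to zero."""

    def __init__(self, count=32, width=32):
        self.mask = (1 << width) - 1
        self.regs = [0] * count

    def clock(self, rst=False, we=False, rd=0, data=0):
        """One rising clock edge; reset takes priority over a write."""
        if rst:
            self.regs = [0] * len(self.regs)
        elif we and rd != 0:              # writes to x0 are dropped
            self.regs[rd] = data & self.mask

    def read(self, rs1, rs2):
        """Two read ports that can be used in the same cycle."""
        return self.regs[rs1], self.regs[rs2]


rf = RegisterFile()
rf.clock(we=True, rd=5, data=7)
rf.clock(we=True, rd=0, data=123)         # ignored, x0 stays zero
print(rf.read(5, 0))
```

As in the hardware, the reset branch is tested before the write-enable branch, so a write request that arrives together with reset is lost.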
```verilog=16
integer i;

/* === WRITE PORT ===
    Synchronous write */
always @ (posedge Clk_i) begin
    if (Rst_i) begin
        for(i=1; i<REG_COUNT; i=i+1)
            regs[i] <= {REG_WIDTH{1'b0}};
    end
    else if(Data_We_i) begin
        `ifdef RF_R0_IS_ZERO
        if(Rd_Sel_i != 0)
            regs[Rd_Sel_i] <= Data_i;
        `else
        regs[Rd_Sel_i] <= Data_i;
        `endif
    end
end
```

* `Alu.v` :

`Alu.v` is responsible for all the main calculations of the RISC-V CPU. Its operands are selected by the decoder and are either register values or an immediate. The supported operations are add, subtract, logical shift left, logical and arithmetic shift right, and bitwise AND, OR and XOR. The ALU uses the select signal `sel_i` to choose the operation.

```verilog=30
wire sel_add = (sel_i == `ALU_FUNC_ADD);
wire sel_sub = (sel_i == `ALU_FUNC_SUB);
wire sel_xor = (sel_i == `ALU_FUNC_XOR);
wire sel_or  = (sel_i == `ALU_FUNC_OR);
wire sel_and = (sel_i == `ALU_FUNC_AND);
wire sel_sll = (sel_i == `ALU_FUNC_SLL);
wire sel_srl = (sel_i == `ALU_FUNC_SRL);
wire sel_sra = (sel_i == `ALU_FUNC_SRA);
```

Because addition and subtraction are almost the same operation at the hardware level, they share one adder: subtraction simply adds the two's complement of `b_i`.

```verilog=39
// Result of arithmetic calculations (ADD/SUB)
wire [31:0] arith_result = a_i + (sel_sub ? ((~b_i)+1) : b_i);
```

An arithmetic right shift pads with the most significant bit, while a logical shift only pads with zeros. The shift amount is taken from the bottom 5 bits of the signal `b_i`.

```verilog=53
// Input to the universal shifter
reg signed [32:0] shift_input;
always @(*) begin
    if (sel_srl)
        shift_input = {1'b0, a_i};
    else if (sel_sra)
        shift_input = {a_i[31], a_i};
    else // if (sel_sll) // this case includes "sel_sll"
        shift_input = {1'b0, reverse(a_i)};
end

/* verilator lint_off UNUSED */
wire [32:0] shift_output = shift_input >>> b_i[4:0];    // Universal shifter
/* verilator lint_on UNUSED */
```

After doing all the calculations, we can use the select signal to output the correct answer.
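
Before the output mux in the Verilog that follows, a small software model may help to show how a single right shifter can serve all three shifts. The Python below is a sketch under assumptions: the names `alu` and `reverse32` are hypothetical, operations are selected by strings instead of `sel_i` codes, and the left-shift result is assumed to be bit-reversed back after the shift, which is what the `reverse(a_i)` input suggests.

```python
MASK32 = 0xFFFFFFFF


def reverse32(x):
    """Reverse the bit order of a 32-bit value."""
    return int(f"{x & MASK32:032b}"[::-1], 2)


def alu(op, a, b):
    """Model of the RV32I ALU operations on 32-bit operands."""
    a &= MASK32
    b &= MASK32
    shamt = b & 0x1F                       # bottom 5 bits of b
    if op == "add":
        return (a + b) & MASK32
    if op == "sub":                        # a + two's complement of b
        return (a + ((~b + 1) & MASK32)) & MASK32
    if op == "srl":                        # pad with zeros
        return a >> shamt
    if op == "sra":                        # pad with the sign bit
        signed_a = a - (1 << 32) if a & 0x80000000 else a
        return (signed_a >> shamt) & MASK32
    if op == "sll":                        # reverse, shift right, reverse back
        return reverse32(reverse32(a) >> shamt)
    if op == "xor":
        return a ^ b
    if op == "or":
        return a | b
    if op == "and":
        return a & b
    raise ValueError(f"unknown ALU operation: {op}")


print(hex(alu("sra", 0x80000000, 4)))
```

Unlike the hardware, which falls back to the adder result for an unknown select value, the sketch raises an error so that a typo in the operation name is noticed.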
```verilog=78
// Final output mux
always @(*) begin
    if (sel_add | sel_sub)
        result_o = arith_result;
    else if (sel_sll | sel_srl | sel_sra)
        result_o = final_shift_output;
    else if (sel_xor)
        result_o = a_i ^ b_i;
    else if (sel_or)
        result_o = a_i | b_i;
    else if (sel_and)
        result_o = a_i & b_i;
    else
        result_o = arith_result;
end
```

---

## Implement the rv32M-Extension
In order to implement RV32M, we first have to understand what RV32M stands for.
RV32M is an optional standard extension on top of RV32I. It provides integer multiplication, division and remainder instructions, for both signed and unsigned operands.

### Read the risc-v spec
![](https://i.imgur.com/wvcgpXb.png)

The picture gives the machine encoding of the M-extension instructions.

![](https://i.imgur.com/per8rlU.png)

We can also see that all M instructions are R-type.

According to the description on page 44 of the RISC-V specification :
> REM and REMU provide the remainder of the corresponding division operation. For REM, the sign of the result equals the sign of the dividend.
> ...
> For both signed and unsigned division, it holds that dividend = divisor × quotient + remainder.

Therefore, for REM the remainder must have the same sign as the dividend (REMU works on unsigned operands, so no sign is involved), and every division must satisfy dividend = divisor × quotient + remainder.

* In addition, we need to pay attention to two special cases when doing division. The first is a divisor of 0, and the second is the overflow of signed division.
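
The sign rule and the two special cases can be captured in a small reference model, which is handy for cross-checking the hardware results. The Python below is a sketch written for this note, not code from riscv-atom; the function names are hypothetical, and every function takes and returns 32-bit unsigned bit patterns.

```python
MASK32 = 0xFFFFFFFF
INT_MIN = -(1 << 31)


def to_signed(x):
    """Reinterpret a 32-bit pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def div(a, b):
    """DIV: signed division, rounding toward zero."""
    a, b = to_signed(a), to_signed(b)
    if b == 0:
        return MASK32                     # division by zero gives -1
    if a == INT_MIN and b == -1:
        return a & MASK32                 # overflow gives the dividend
    q = abs(a) // abs(b)
    return (-q if (a < 0) != (b < 0) else q) & MASK32


def rem(a, b):
    """REM: signed remainder, with the sign of the dividend."""
    a, b = to_signed(a), to_signed(b)
    if b == 0:
        return a & MASK32                 # division by zero gives the dividend
    if a == INT_MIN and b == -1:
        return 0                          # overflow gives 0
    r = abs(a) % abs(b)
    return (-r if a < 0 else r) & MASK32


def divu(a, b):
    """DIVU: unsigned division; division by zero gives all ones."""
    a, b = a & MASK32, b & MASK32
    return MASK32 if b == 0 else a // b


def remu(a, b):
    """REMU: unsigned remainder; division by zero gives the dividend."""
    a, b = a & MASK32, b & MASK32
    return a if b == 0 else a % b


# -7 / 2: quotient -3 and remainder -1, so 2 * (-3) + (-1) = -7
print(to_signed(div(-7 & MASK32, 2)), to_signed(rem(-7 & MASK32, 2)))
```

Python's `//` rounds toward negative infinity, so the signed functions divide the absolute values and fix the sign afterwards to get the round-toward-zero behaviour that RISC-V specifies.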
The following are the operation results in the special cases, where $L$ is the operand width in bits ($L = 32$ for RV32M):

| Condition              | Dividend   | Divisor | `DIVU`  | `REMU` | `DIV`      | `REM` |
|:----------------------:|:----------:|:-------:|:-------:|:------:|:----------:|:-----:|
| Division by zero       | $x$        | $0$     | $2^L-1$ | $x$    | $-1$       | $x$   |
| Overflow (signed only) | $-2^{L-1}$ | $-1$    | --      | --     | $-2^{L-1}$ | $0$   |

### Modify the [Defs.vh](https://github.com/WangHanChi/riscv-atom/blob/main/rtl/core/Defs.vh)

We need to extend the ALU function codes for our M instructions. RV32I already uses all eight 3-bit codes, so we remove the old definitions and add one bit to cover the **I + M instructions**. For example, `ALU_FUNC_ADD` changes from `3'd0` to `4'd0`.

```diff=48
- `define ALU_FUNC_ADD    3'd0
- `define ALU_FUNC_SUB    3'd1
- `define ALU_FUNC_XOR    3'd2
- `define ALU_FUNC_OR     3'd3
- `define ALU_FUNC_AND    3'd4
- `define ALU_FUNC_SLL    3'd5
- `define ALU_FUNC_SRL    3'd6
- `define ALU_FUNC_SRA    3'd7
```
```diff=48
+ `define ALU_FUNC_ADD    4'd0
+ `define ALU_FUNC_SUB    4'd1
+ `define ALU_FUNC_XOR    4'd2
+ `define ALU_FUNC_OR     4'd3
+ `define ALU_FUNC_AND    4'd4
+ `define ALU_FUNC_SLL    4'd5
+ `define ALU_FUNC_SRL    4'd6
+ `define ALU_FUNC_SRA    4'd7
// ALU_M_EXTENSION
+ `define ALU_FUNC_MUL    4'd8
+ `define ALU_FUNC_MULH   4'd9
+ `define ALU_FUNC_MULHSU 4'd10
+ `define ALU_FUNC_MULHU  4'd11
+ `define ALU_FUNC_DIV    4'd12
+ `define ALU_FUNC_DIVU   4'd13
+ `define ALU_FUNC_REM    4'd14
+ `define ALU_FUNC_REMU   4'd15
```

### Modify the [AtomRV.v](https://github.com/WangHanChi/riscv-atom/blob/main/rtl/core/AtomRV.v)

Only a small change is needed in this Verilog file, because it does not do any **Decode** work itself: we widen the wire that carries the decoded ALU operation select.
```diff=242
      wire            d_a_op_sel;
      wire            d_b_op_sel;
      wire            d_cmp_b_op_sel;
-     wire    [2:0]   d_alu_op_sel;
+     wire    [3:0]   d_alu_op_sel;
      wire    [2:0]   d_mem_access_width;
      wire            d_mem_load_store;
      wire            d_mem_we;
```

### Modify the [Decode.v](https://github.com/WangHanChi/riscv-atom/blob/main/rtl/core/Decode.v)

```diff=13
  module Decode
  (
-     input   wire    [31:0]  instr_i,
+     input   wire    [31:0]  instr_i,            // A full instruction

-     output  wire    [4:0]   rd_sel_o,
-     output  wire    [4:0]   rs1_sel_o,
-     output  wire    [4:0]   rs2_sel_o,
+     output  wire    [4:0]   rd_sel_o,           // rd
+     output  wire    [4:0]   rs1_sel_o,          // rs1
+     output  wire    [4:0]   rs2_sel_o,          // rs2

      output  wire    [31:0]  imm_o,

-     output  reg             jump_en_o,
-     output  reg     [2:0]   comparison_type_o,
+     output  reg             jump_en_o,          // check jump or not
+     output  reg     [2:0]   comparison_type_o,  // check comparison type
      output  reg             rf_we_o,
      output  reg     [2:0]   rf_din_sel_o,
      output  reg             a_op_sel_o,
      output  reg             b_op_sel_o,
      output  reg             cmp_b_op_sel_o,
-     output  reg     [2:0]   alu_op_sel_o,
+     output  reg     [3:0]   alu_op_sel_o,
      output  wire    [2:0]   mem_access_width_o,
      output  reg             d_mem_load_store,
      output  reg             mem_we_o,
```
```diff=43
  assign  mem_access_width_o = func3;
  assign  csru_op_sel_o = func3;

- assign  rd_sel_o    = instr_i[11:7];
- assign  rs1_sel_o   = instr_i[19:15];
- assign  rs2_sel_o   = instr_i[24:20];
+ assign  rd_sel_o    = instr_i[11:7];    // rd
+ assign  rs1_sel_o   = instr_i[19:15];   // rs1
+ assign  rs2_sel_o   = instr_i[24:20];   // rs2

  reg     [2:0]   imm_format;
```
```diff=427
+ /////////////////////////////////////////////////////////////////////////
+ /* M-Extension Instructions */
+ /* MUL */
+ 17'b0000001_000_0110011:
+ begin
+     rf_we_o = 1'b1;
+     rf_din_sel_o = 3'd2;
+     a_op_sel_o = 1'b0;
+     b_op_sel_o = 1'b0;
+     alu_op_sel_o = `ALU_FUNC_MUL;
+ end
+ /* MULH */
+ 17'b0000001_001_0110011:
+ begin
+     rf_we_o = 1'b1;
+     rf_din_sel_o = 3'd2;
+     a_op_sel_o = 1'b0;
+     b_op_sel_o = 1'b0;
+     alu_op_sel_o = `ALU_FUNC_MULH;
+ end
+ /* MULHSU */
+ 17'b0000001_010_0110011:
+ begin
+     rf_we_o = 1'b1;
+     rf_din_sel_o = 3'd2;
+     a_op_sel_o = 1'b0;
+     b_op_sel_o = 1'b0;
+     alu_op_sel_o = `ALU_FUNC_MULHSU;
+ end
+ /* MULHU */
+ 17'b0000001_011_0110011:
+ begin
+     rf_we_o = 1'b1;
+     rf_din_sel_o = 3'd2;
+     a_op_sel_o = 1'b0;
+     b_op_sel_o = 1'b0;
+     alu_op_sel_o = `ALU_FUNC_MULHU;
+ end
+ /* DIV */
+ 17'b0000001_100_0110011:
+ begin
+     rf_we_o = 1'b1;
+     rf_din_sel_o = 3'd2;
+     a_op_sel_o = 1'b0;
+     b_op_sel_o = 1'b0;
+     alu_op_sel_o = `ALU_FUNC_DIV;
+ end
+ /* DIVU */
+ 17'b0000001_101_0110011:
+ begin
+     rf_we_o = 1'b1;
+     rf_din_sel_o = 3'd2;
+     a_op_sel_o = 1'b0;
+     b_op_sel_o = 1'b0;
+     alu_op_sel_o = `ALU_FUNC_DIVU;
+ end
+ /* REM */
+ 17'b0000001_110_0110011:
+ begin
+     rf_we_o = 1'b1;
+     rf_din_sel_o = 3'd2;
+     a_op_sel_o = 1'b0;
+     b_op_sel_o = 1'b0;
+     alu_op_sel_o = `ALU_FUNC_REM;
+ end
+ /* REMU */
+ 17'b0000001_111_0110011:
+ begin
+     rf_we_o = 1'b1;
+     rf_din_sel_o = 3'd2;
+     a_op_sel_o = 1'b0;
+     b_op_sel_o = 1'b0;
+     alu_op_sel_o = `ALU_FUNC_REMU;
+ end
```

### Modify the [Alu.v](https://github.com/WangHanChi/riscv-atom/blob/main/rtl/core/Alu.v)

```diff=22
  (
      input   wire    [31:0]  a_i,
      input   wire    [31:0]  b_i,
-     input   wire    [2:0]   sel_i,
+     input   wire    [3:0]   sel_i,
      output  reg     [31:0]  result_o
  );

  wire sel_add = (sel_i == `ALU_FUNC_ADD);
  wire sel_sub = (sel_i == `ALU_FUNC_SUB);
  wire sel_xor = (sel_i == `ALU_FUNC_XOR);
  wire sel_or  = (sel_i == `ALU_FUNC_OR);
  wire sel_and = (sel_i == `ALU_FUNC_AND);
  wire sel_sll = (sel_i == `ALU_FUNC_SLL);
  wire sel_srl = (sel_i == `ALU_FUNC_SRL);
  wire sel_sra = (sel_i == `ALU_FUNC_SRA);
+ // Add M-Extension instructions
+ wire sel_mul    = (sel_i == `ALU_FUNC_MUL);
+ wire sel_mulh   = (sel_i == `ALU_FUNC_MULH);
+ wire sel_mulhsu = (sel_i == `ALU_FUNC_MULHSU);
+ wire sel_mulhu  = (sel_i == `ALU_FUNC_MULHU);
+ wire sel_div    = (sel_i == `ALU_FUNC_DIV);
+ wire sel_divu   = (sel_i == `ALU_FUNC_DIVU);
+ wire sel_rem    = (sel_i == `ALU_FUNC_REM);
+ wire sel_remu   = (sel_i == `ALU_FUNC_REMU);

  // Result of arithmetic calculations (ADD/SUB)
  wire [31:0] arith_result = a_i + (sel_sub ? 
  ((~b_i)+1) : b_i);
```
```diff=86
      else
          final_shift_output = shift_output[31:0];
  end

+ // M-Extension mux
+ wire [63:0] result_mul;
+ /* verilator lint_off UNUSEDSIGNAL */
+ wire [63:0] result_mulsu;
+ /* verilator lint_off UNUSEDSIGNAL */
+ wire [63:0] result_mulu;
+ /* verilator lint_off UNUSEDSIGNAL */
+ wire [31:0] result_div;
+ wire [31:0] result_divu;
+ wire [31:0] result_rem;
+ wire [31:0] result_remu;

+ assign result_mul[63:0]   = $signed  ({{32{a_i[31]}}, a_i[31: 0]}) *
+                             $signed  ({{32{b_i[31]}}, b_i[31: 0]});
+ assign result_mulu[63:0]  = $unsigned({{32{1'b0}}, a_i[31: 0]}) *
+                             $unsigned({{32{1'b0}}, b_i[31: 0]});
+ assign result_mulsu[63:0] = $signed  ({{32{a_i[31]}}, a_i[31: 0]}) *
+                             $unsigned({{32{1'b0}}, b_i[31: 0]});

+ assign result_div[31:0]   = (b_i == 32'h00000000) ? 32'hffffffff :
+                             ((a_i == 32'h80000000) && (b_i == 32'hffffffff)) ? 32'h80000000 :
+                             $signed  ($signed  (a_i) / $signed  (b_i));
+ assign result_divu[31:0]  = (b_i == 32'h00000000) ? 32'hffffffff :
+                             $unsigned($unsigned(a_i) / $unsigned(b_i));
+ assign result_rem[31:0]   = (b_i == 32'h00000000) ? a_i :
+                             ((a_i == 32'h80000000) && (b_i == 32'hffffffff)) ? 32'h00000000 :
+                             $signed  ($signed  (a_i) % $signed  (b_i));
+ assign result_remu[31: 0] = (b_i == 32'h00000000) ? 
+                             a_i :
+                             $unsigned($unsigned(a_i) % $unsigned(b_i));

  // Final output mux
  always @(*) begin
      if (sel_add | sel_sub)
          result_o = arith_result;
      else if (sel_sll | sel_srl | sel_sra)
          result_o = final_shift_output;
      else if (sel_xor)
          result_o = a_i ^ b_i;
      else if (sel_or)
          result_o = a_i | b_i;
      else if (sel_and)
          result_o = a_i & b_i;
+     else if(sel_mul)        // M start
+         result_o = result_mul[31:0];
+     else if(sel_mulh)
+         result_o = result_mul[63:32];
+     else if(sel_mulhsu)
+         result_o = result_mulsu[63:32];
+     else if(sel_mulhu)
+         result_o = result_mulu[63:32];
+     else if(sel_div)
+         result_o = result_div[31:0];
+     else if(sel_divu)
+         result_o = result_divu[31:0];
+     else if(sel_rem)
+         result_o = result_rem[31:0];
+     else if(sel_remu)
+         result_o = result_remu[31:0];   // M end
      else
          result_o = arith_result;
  end
```

### riscv-tests

Finally, we use [riscv-tests](https://github.com/riscv-software-src/riscv-tests) to test **atomsim**.
Several places need to be modified before the tests can run.

1. Because the author of riscv-atom did not implement the `fence` instruction, we modify **[Decode.v](https://github.com/WangHanChi/riscv-atom/blob/main/rtl/core/Decode.v)** so that the `fence` opcode (`7'b0001111`) no longer triggers the unimplemented-opcode warning.

```diff=519
  default: begin
      jump_en_o = 0;
      comparison_type_o = `CMP_FUNC_UN;
      rf_we_o = 0;
      rf_din_sel_o = 0;
      a_op_sel_o = 0;
      b_op_sel_o = 0;
      cmp_b_op_sel_o = 0;
      alu_op_sel_o = 0;
      mem_we_o = 1'b0;
      imm_format = 0;
      csru_we_o = 0;

      `ifdef verilator
-         if(opcode != 7'b1110011) // EBREAK
+         if(opcode != 7'b1110011 && opcode != 7'b0001111) // EBREAK
              $display("!Warning: Unimplemented Opcode: %b", opcode);
      `endif
  end
```

2. Modify the linker script for atomsim. To match the linker script of riscv-tests, we make the `.text` section begin at `0x00000000`.

```diff=18
  SECTIONS
  {
      /* ==== ROM ==== */
      .text :
      {
          *(.boot*)
-         . = ORIGIN(ROM) + 0x100;
+         . 
+           = ORIGIN(ROM) + 0x000;
          _svector = .;
          KEEP(*(.vector*))   /* Keep all interrupt vector tables at very start of text section */
          *(.text)            /* Load all text sections (from all files) */
          *(.rodata)
          . = ALIGN(4);
          _etext = .;
      } > ROM
```

3. git clone the riscv-tests project and put it in **riscv-atom/test**, then follow the steps from [README.md](https://github.com/riscv-software-src/riscv-tests/blob/master/README.md)

```shell
$ git clone https://github.com/riscv/riscv-tests
$ cd riscv-tests
$ git submodule update --init --recursive
$ autoconf
$ ./configure --prefix=$RISCV/target
$ make
$ sudo make install
```

4. Modify the [link script](https://github.com/riscv/riscv-test-env/blob/0666378f353599d01fc48562b431b1dd049faab5/p/link.ld) in `riscv-tests/env/p`. The reason is that the start addresses of the sections must match between atomsim and riscv-tests. We follow the author's memory layout: `riscv-atom/sw/lib/linklink.ld` defines a 64 MB ROM and a 64 MB RAM. **Therefore we should move the `.data` section to `0x04000000`.**

```linker script=1
/*
    LINKER SCRIPT

    @See : https://sourceware.org/binutils/docs/ld/Basic-Script-Concepts.html
    @See : https://interrupt.memfault.com/blog/how-to-write-linker-scripts-for-firmware
*/

OUTPUT_ARCH( "riscv" )
ENTRY(_start)

/* MEMORY LAYOUT */
MEMORY
{
    ROM (rx) : ORIGIN = 0x00000000, LENGTH = 64M    /* 64 MB @ 0x0*/
    RAM (rwx): ORIGIN = 0x04000000, LENGTH = 64M    /* 64 MB @ 0x10000 (0x04000000)*/
}
```

```diff=1
  OUTPUT_ARCH( "riscv" )
  ENTRY(_start)

  SECTIONS
  {
-   . = 0x80000000;
+   . = 0x00000000;
    .text.init : { *(.text.init) }
    . = ALIGN(0x1000);
    .tohost : { *(.tohost) }
    . = ALIGN(0x1000);
    .text : { *(.text) }
    . = ALIGN(0x1000);
+   . = 0x04000000;
    .data : { *(.data) }
    .bss : { *(.bss) }
    _end = .;
  }
```

5. Modify the test header so that it prints 0 or 1 to show whether a test passed. We only change the end part, which **does not change the test code itself**.

First, we modify `riscv_test.h` in `riscv-tests/env/p`.
The first change replaces `unimp`: atomsim stops when it decodes `ebreak`, so we change `unimp` into `ebreak`.

```diff=229
  #define RVTEST_CODE_END\
-         unimp;
+         ebreak;
```

The original macros do not print anything to tell us whether a test passed, so we create similar pass and fail macros that print `1` (pass) or `0` (fail) followed by a newline, by storing the ASCII characters to address `0x08000000`. This also makes it easy to check the results with a shell script.
We modify the same file, `riscv_test.h` in `riscv-tests/env/p`.

```diff
  #define RVTEST_PASS \
          fence; \
          li TESTNUM, 1; \
          li a7, 93; \
          li a0, 0; \
-         ecall
+         ebreak;

+ #define MY_RVTEST_PASS \
+         fence; \
+         li TESTNUM, 1; \
+         addi gp, gp, 48; \
+         li t1, 0x08000000; \
+         sb gp, 0(t1); \
+         li gp, 10; \
+         sb gp, 0(t1); \
+         li a7, 93; \
+         li a0, 0; \
+         ebreak;

  #define TESTNUM gp
  #define RVTEST_FAIL \
          fence; \
  1:      beqz TESTNUM, 1b; \
          sll TESTNUM, TESTNUM, 1; \
          or TESTNUM, TESTNUM, 1; \
          li a7, 93; \
          addi a0, TESTNUM, 0; \
-         ecall
+         ebreak;

+ #define MY_RVTEST_FAIL \
+         fence; \
+         li TESTNUM, 0; \
+         addi gp, gp, 48; \
+         li t1, 0x08000000; \
+         sb gp, 0(t1); \
+         li gp, 10; \
+         sb gp, 0(t1); \
+         li a7, 93; \
+         li a0, 0; \
+         ebreak;
```

6. Modify the test macros so that they print 0/1: we change the end of `TEST_PASSFAIL` to use our new macros.

```diff=737
  #-----------------------------------------------------------------------
  # Pass and fail code (assumes test num is in TESTNUM)
  #-----------------------------------------------------------------------

  #define TEST_PASSFAIL \
          bne x0, TESTNUM, pass; \
  fail: \
-         RVTEST_FAIL; \
+         MY_RVTEST_FAIL; \
  pass: \
-         RVTEST_PASS \
+         MY_RVTEST_PASS \
```

One more file needs to be modified because it calls the pass macro directly to check the simplest case, so we replace the call with our pass macro.
The file is `simple.S` in `riscv-tests/isa/rv64ui`. We modify the rv64ui file because the rv32ui tests reuse the same source code from rv64ui, so the change has to be made in rv64ui instead of rv32ui.
```diff=12
 #include "riscv_test.h"
 #include "test_macros.h"

 RVTEST_RV64U
 RVTEST_CODE_BEGIN

-RVTEST_PASS
+MY_RVTEST_PASS

 RVTEST_CODE_END
```

7. We write a shell script in `riscv-atom/scripts` that automatically checks whether each test passes. Run `make riscv-test` to get the result!

```shell
#!/bin/bash
# Run the rv32ui-p and rv32um-p riscv-tests on atomsim and count the passes.

# Terminal colors
RED="\e[31m"
GREEN="\e[32m"
ORANGE="\e[33m"
CYAN="\e[36m"
NOCOLOR="\e[0m"

# riscv-test dir
cd "/home/$USER/riscv-atom/test/riscv-tests/isa" || exit 1

declare -a isa_RV32UI_p=(
    "add" "addi" "and" "andi" "auipc" "beq" "bge" "bgeu" "blt" "bltu"
    "bne" "jal" "jalr" "lb" "lbu" "lh" "lhu" "lui" "lw" "or" "ori"
    "sb" "sh" "sll" "simple" "slli" "slt" "slti" "sltiu" "sltu"
    "sra" "srai" "srl" "srli" "sub" "sw" "xor" "xori"
)
declare -a isa_RV32UM_p=( "div" "divu" "mul" "mulh" "mulhu" "mulhsu" "rem" "remu" )

suss_rv32ui=0
fail_rv32ui=0
total_rv32ui=0
suss_rv32um=0
fail_rv32um=0
total_rv32um=0

echo "Now test rv32ui for riscv-atom!!"
for isa_RV32UI in "${isa_RV32UI_p[@]}"
do
    echo testing instruction ${isa_RV32UI}
    atomsim rv32ui-p-${isa_RV32UI}              # first run: show the output
    output=$(atomsim rv32ui-p-${isa_RV32UI})    # second run: capture the output
    length=${#output}
    substring=${output:length-1:1}              # last character: 1 = pass, 0 = fail
    if [ "$substring" == "1" ]                  # quoted so an empty output does not break the test
    then
        suss_rv32ui=$(($suss_rv32ui+1))
    else
        fail_rv32ui=$(($fail_rv32ui+1))
    fi
    total_rv32ui=$(($total_rv32ui+1))
    echo
done

echo
echo "Now test rv32um for riscv-atom!!"
for isa_RV32UM in "${isa_RV32UM_p[@]}"
do
    echo testing instruction ${isa_RV32UM}
    atomsim rv32um-p-${isa_RV32UM}              # first run: show the output
    output=$(atomsim rv32um-p-${isa_RV32UM})    # second run: capture the output
    length=${#output}
    substring=${output:length-1:1}              # last character: 1 = pass, 0 = fail
    if [ "$substring" == "1" ]
    then
        suss_rv32um=$(($suss_rv32um+1))
    else
        fail_rv32um=$(($fail_rv32um+1))
    fi
    total_rv32um=$(($total_rv32um+1))
    echo
done

# Summary
echo "=============================="
echo "rv32ui-p instruction set :"
echo -e "${NOCOLOR}The pass rate is ${GREEN}$suss_rv32ui/$total_rv32ui"
echo -e "${NOCOLOR}The fail rate is ${RED}$fail_rv32ui/$total_rv32ui${NOCOLOR}"
if [ "$fail_rv32ui" == "0" ]
then
    echo -e "${CYAN}Pass rv32ui-p testing! ${NOCOLOR}"
else
    echo -e "${CYAN}Fail rv32ui-p testing! ${NOCOLOR}"
fi
echo "=============================="
echo "rv32um-p instruction set :"
echo -e "${NOCOLOR}The pass rate is ${GREEN}$suss_rv32um/$total_rv32um"
echo -e "${NOCOLOR}The fail rate is ${RED}$fail_rv32um/$total_rv32um${NOCOLOR}"
if [ "$fail_rv32um" == "0" ]
then
    echo -e "${CYAN}Pass rv32um-p testing! ${NOCOLOR}"
else
    echo -e "${CYAN}Fail rv32um-p testing! ${NOCOLOR}"
fi
echo "=============================="
```

:::spoiler Result of `make riscv-test`
```makefile
hanchi@hanchi:~/riscv-atom$ make riscv-test
./scripts/riscv-test.sh
Now test rv32ui for riscv-atom!!
testing instruction add
Loading segment 1 [base=0x00000000, sz= 1724 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction addi
Loading segment 1 [base=0x00000000, sz= 1148 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction and
Loading segment 1 [base=0x00000000, sz= 1724 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction andi
Loading segment 1 [base=0x00000000, sz= 956 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction auipc
Loading segment 1 [base=0x00000000, sz= 564 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction beq
Loading segment 1 [base=0x00000000, sz= 1212 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction bge
Loading segment 1 [base=0x00000000, sz= 1276 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction bgeu
Loading segment 1 [base=0x00000000, sz= 1340 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction blt
Loading segment 1 [base=0x00000000, sz= 1212 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction bltu
Loading segment 1 [base=0x00000000, sz= 1212 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction bne
Loading segment 1 [base=0x00000000, sz= 1212 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction jal
Loading segment 1 [base=0x00000000, sz= 572 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction jalr
Loading segment 1 [base=0x00000000, sz= 700 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction lb
Loading segment 1 [base=0x00000000, sz= 1084 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
Loading segment 3 [base=0x04000000, sz= 16 bytes, at=0x04000000] ... done
1

testing instruction lbu
Loading segment 1 [base=0x00000000, sz= 1084 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
Loading segment 3 [base=0x04000000, sz= 16 bytes, at=0x04000000] ... done
1

testing instruction lh
Loading segment 1 [base=0x00000000, sz= 1148 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
Loading segment 3 [base=0x04000000, sz= 16 bytes, at=0x04000000] ... done
1

testing instruction lhu
Loading segment 1 [base=0x00000000, sz= 1212 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
Loading segment 3 [base=0x04000000, sz= 16 bytes, at=0x04000000] ... done
1

testing instruction lui
Loading segment 1 [base=0x00000000, sz= 572 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction lw
Loading segment 1 [base=0x00000000, sz= 1212 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
Loading segment 3 [base=0x04000000, sz= 16 bytes, at=0x04000000] ... done
1

testing instruction or
Loading segment 1 [base=0x00000000, sz= 1724 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction ori
Loading segment 1 [base=0x00000000, sz= 956 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction sb
Loading segment 1 [base=0x00000000, sz= 1596 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
Loading segment 3 [base=0x04000000, sz= 16 bytes, at=0x04000000] ... done
1

testing instruction sh
Loading segment 1 [base=0x00000000, sz= 1788 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
Loading segment 3 [base=0x04000000, sz= 32 bytes, at=0x04000000] ... done
1

testing instruction sll
Loading segment 1 [base=0x00000000, sz= 1852 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction simple
Loading segment 1 [base=0x00000000, sz= 444 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction slli
Loading segment 1 [base=0x00000000, sz= 1148 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction slt
Loading segment 1 [base=0x00000000, sz= 1724 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction slti
Loading segment 1 [base=0x00000000, sz= 1084 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction sltiu
Loading segment 1 [base=0x00000000, sz= 1084 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction sltu
Loading segment 1 [base=0x00000000, sz= 1724 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction sra
Loading segment 1 [base=0x00000000, sz= 1916 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction srai
Loading segment 1 [base=0x00000000, sz= 1212 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction srl
Loading segment 1 [base=0x00000000, sz= 1916 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction srli
Loading segment 1 [base=0x00000000, sz= 1148 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction sub
Loading segment 1 [base=0x00000000, sz= 1724 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction sw
Loading segment 1 [base=0x00000000, sz= 1788 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
Loading segment 3 [base=0x04000000, sz= 48 bytes, at=0x04000000] ... done
1

testing instruction xor
Loading segment 1 [base=0x00000000, sz= 1724 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction xori
Loading segment 1 [base=0x00000000, sz= 956 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1


Now test rv32um for riscv-atom!!
testing instruction div
Loading segment 1 [base=0x00000000, sz= 700 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction divu
Loading segment 1 [base=0x00000000, sz= 700 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction mul
Loading segment 1 [base=0x00000000, sz= 1724 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction mulh
Loading segment 1 [base=0x00000000, sz= 1724 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction mulhu
Loading segment 1 [base=0x00000000, sz= 1724 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction mulhsu
Loading segment 1 [base=0x00000000, sz= 1724 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction rem
Loading segment 1 [base=0x00000000, sz= 700 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

testing instruction remu
Loading segment 1 [base=0x00000000, sz= 700 bytes, at=0x00000000] ... done
Loading segment 2 [base=0x00001000, sz= 72 bytes, at=0x00001000] ... done
1

==============================
rv32ui-p instruction set :
The pass rate is 38/38
The fail rate is 0/38
Pass rv32ui-p testing!
==============================
rv32um-p instruction set :
The pass rate is 8/8
The fail rate is 0/8
Pass rv32um-p testing!
==============================
```
:::

## Issue and Pull Request

**We plan to open an issue with the [author](https://github.com/saursin).**

We encountered some problems that need to be solved:

1. Verilator path error
    - In `~/riscv-atom/sim/makefile`, our Verilator include path differs from the author's because we installed Verilator from source. Hence, we need to change the path back to the author's default before pushing to GitHub and its CI/CD.

    ```makefile=44
    # This should point to verilator include directory
    VERILATOR_INCLUDE_PATH := /usr/share/verilator/include
    #VERILATOR_INCLUDE_PATH := /home/hanchi/verilator/include
    ```

2. Verilator command
    - The author's Verilator version is not the same as ours. Hence, we need to change `/*verilator lint_off UNUSEDSIGNAL*/` to `/* verilator lint_off UNUSED */`.
    - This version passes the GitHub CI/CD:

    ```verilog=92
    /* verilator lint_off UNUSED */
    wire    [63:0]  result_mulsu;
    /* verilator lint_off UNUSED */
    wire    [63:0]  result_mulu;
    /* verilator lint_off UNUSED */
    wire    [31:0]  result_div;
    ```

    - This version runs in our environment:

    ```verilog=92
    /*verilator lint_off UNUSEDSIGNAL*/
    wire    [63:0]  result_mulsu;
    /*verilator lint_off UNUSEDSIGNAL*/
    wire    [63:0]  result_mulu;
    /*verilator lint_off UNUSEDSIGNAL*/
    wire    [31:0]  result_div;
    ```

## Reference Links

- [Computer Architecture 2022: Term Project](https://hackmd.io/@sysprog/arch2022-projects)
- [saursin / riscv-atom](https://github.com/saursin/riscv-atom)
- [RISC-V Atom Documentation & User Manual](https://riscv-atom.readthedocs.io/en/latest/index.html)
- [srv32](https://github.com/sysprog21/srv32/blob/devel/rtl/riscv.v)
- [riscv-spec](https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf)
- [dhrystone](https://github.com/riscv-software-src/riscv-tests/tree/master/benchmarks/dhrystone)
- [Cortex m0+](https://www.2cm.com.tw/2cm/zh-tw/tech/76DD94733BF744AE81137AAFDCD070A3)
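
## Appendix: Sketch of the pass/fail check

The checker script in step 7 decides pass or fail from the last character of the captured atomsim output (`1` is printed by `MY_RVTEST_PASS`, `0` by `MY_RVTEST_FAIL`). The following is only a minimal sketch of that check in isolation. The helper names `is_pass` and `count_passes` are hypothetical and not part of riscv-atom, and the sample strings stand in for real simulator output so that the snippet runs without atomsim.

```shell
#!/bin/bash
# Hypothetical helper: succeed when the last character of the captured
# output is "1". Command substitution strips trailing newlines, so the
# digit is the final character of the captured string.
is_pass() {
    local output="$1"
    [ -n "$output" ] && [ "${output: -1}" = "1" ]
}

# Hypothetical helper: count how many of the given outputs report a pass.
count_passes() {
    local passed=0 out
    for out in "$@"; do
        if is_pass "$out"; then
            passed=$((passed + 1))
        fi
    done
    echo "$passed"
}

# Sample strings (not real logs): one pass, one fail, one empty output.
count_passes "... done
1" "... done
0" ""
```

An empty output is counted as a failure here, which is an assumption of this sketch; it covers the case where the simulator prints nothing at all.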