# CA2022 Project : Validate the pipeline design of kleine-riscv and Implement RV32M
###### tags: `Computer Architecture 2022`
contributed by <[`aa12551`](https://github.com/aa12551)>
## Goal
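1. kleine-riscv currently implements mainly the RV32I instructions and is verified with a modified version of riscv-tests. You should reproduce these tests.
2. kleine-riscv uses a classic 5-stage pipeline design. With reference to last year's report, verify it on Verilator and explain the internal design, especially how hazards are handled (note: this differs from the course material).
3. For RV32M, extend kleine-riscv with the corresponding mul, div, and rem instructions and verify them with riscv-tests.
4. Record the process in detail, including the problems you encountered and how you solved them. Where appropriate, report them on GitHub to the original kleine-riscv developer.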
## Prerequisites
According to the [reference](https://hackmd.io/@HFmqOuSxQjKLe9lBinlkmw/SJgSrgQpF), we need to do the following steps.
- Download the [kleine-riscv](https://github.com/rolandbernard/kleine-riscv) project from GitHub.
```
git clone https://github.com/rolandbernard/kleine-riscv.git
```
- Install clang and LLVM
```
sudo apt-get install clang
git clone https://github.com/llvm/llvm-project llvm-project
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS=lld -DCMAKE_INSTALL_PREFIX=/usr/local ../llvm-project/llvm
```
:::info
`cmake` may fail because some packages are missing. The terminal output shows which packages need to be installed to make it work.
:::
- We need to link `ld` to `ld.lld`. Because `/usr/bin/ld` is already a link to another path, we need to delete the original link and create a new one.
```
sudo unlink /usr/bin/ld # delete the original link (needs root privileges)
sudo ln -s /home/hung/llvm-project/build/bin/ld.lld /usr/bin/ld
```
- Install Verilator
```
cd $HOME
git clone https://github.com/verilator/verilator
cd verilator
git checkout stable
export VERILATOR_ROOT=`pwd`
autoconf # create the ./configure script; without it ./configure cannot be found
./configure
make
export VERILATOR_ROOT=$HOME/verilator
export PATH=$VERILATOR_ROOT/bin:$PATH
```
- Then we can install Verilator with `make install`
```
sudo make install # may require root privileges
```
- Build problem
When we run `make` for kleine-riscv, the following error may appear.
```
Building build/Vcore
../sim/memory.cpp:4:10: fatal error: gelf.h: No such file or directory
4 | #include <gelf.h>
| ^~~~~~~~
compilation terminated.
make[1]: *** [Vcore.mk:66:memory.o] Error 1
%Error: make -C build -f Vcore.mk -j 1 exited with 2
%Error: Command Failed /home/hung/verilator/bin/verilator_bin -Isrc -Isrc/pipeline -Isrc/units -Wall -Mdir build --cc --exe --build -LDFLAGS -lelf src/core.v sim/simulator.cpp sim/core.cpp sim/memory.cpp
make: *** [makefile:43:build/Vcore] Error 2
```
The header `gelf.h` belongs to libelf, which the simulator uses to read `.elf` files, so we need to install its development package.
```
sudo apt-get install libelf-dev
```
## Architecture
We can read the `makefile` to understand the overall execution flow.
- It uses `verilator` to translate the `.v` files into `.h` and `.cpp` files in the `build` directory and to build the simulator executable `build/Vcore` from them. These files let us simulate the circuit.
- It uses the `clang` compiler to turn the `.S` and `.c` files into `.elf` files.
- The main function is in `simulator.cpp`. It connects the `.elf` file produced by the `clang` compiler with the model generated by `verilator`.
- It checks whether the computed result equals the expected answer. If it does not, the following message is shown in the terminal.
```
Running test jal
Running test and
Failed test case #5
make: *** [makefile:56:RUNTEST.tests/build/rv32ui/and] Error 1
```
## Validate the pipeline design
I arranged the `.v` files into layers according to which module instantiates which.
```
core.v
├──pipeline.v
│ ├── csr.v
│ ├── regfile.v
│ ├── hazard.v
│ ├── fetch.v
│ ├── decode.v
│ ├── execute.v
│ │ ├── cmp.v
│ │ └── alu.v
│ ├── memory.v
│ └── writeback.v
└── busio.v
```
`core.v` instantiates `pipeline.v` and `busio.v`. `pipeline.v` consists of the components `csr.v`, `regfile.v`, and `hazard.v` and the stages `fetch.v`, `decode.v`, `execute.v`, `memory.v`, and `writeback.v`. `execute.v` in turn includes the components `alu.v` and `cmp.v`.
Below, we focus on how the design deals with hazards.
### Structural hazard
- A structural hazard occurs when two instructions need the same hardware unit in the same cycle. Here, one instruction wants to write data to a register while another instruction wants to read register data, and both need the register file.
- Solution by class:
- 1. Build the `RegFile` with independent read and write ports.
- 2. Split the `RegFile` access in two: prepare to write during the first half of the clock cycle, write on the falling edge, and read during the second half.

- Solution by `kleine-riscv`:
- `regfile.v` has separate read and write ports, so it uses solution 1 to deal with this hazard.
``` diff=1
module regfile (
input clk,
// from decode (read ports)
input [4:0] rs1_address,
input [4:0] rs2_address,
// to decode (read ports)
output reg [31:0] rs1_data,
output reg [31:0] rs2_data,
// from writeback (write port)
input [4:0] rd_address,
input [31:0] rd_data
);
```
### Data hazard
- A data hazard occurs when an instruction needs a value that an earlier instruction has computed but not yet saved to the register file.

- Solution by class
- 1. Stalling (wait for two cycles)

- 2. The compiler can arrange the code to avoid the hazard and the stall
- 3. Forward the result as soon as it is available (after the ALU), even though it is not stored in the `RegFile` yet (needs hardware support)
- Solution by `kleine-riscv`
- `decode.v` checks whether the `rs1` and `rs2` of the instruction in the decode stage are equal to the `rd` of the instruction in the memory or writeback stage.
```diff=412
case (rs1_address)
0: begin
rs1_bypassed_out <= 1;
rs1_bypass_out <= 0;
end
bypass_memory_address: begin
rs1_bypassed_out <= 1;
rs1_bypass_out <= bypass_memory_data;
end
bypass_writeback_address: begin
rs1_bypassed_out <= 1;
rs1_bypass_out <= bypass_writeback_data;
end
default: begin
rs1_bypassed_out <= 0;
rs1_bypass_out <= 0;
end
endcase
case (rs2_address)
0: begin
rs2_bypassed_out <= 1;
rs2_bypass_out <= 0;
end
bypass_memory_address: begin
rs2_bypassed_out <= 1;
rs2_bypass_out <= bypass_memory_data;
end
bypass_writeback_address: begin
rs2_bypassed_out <= 1;
rs2_bypass_out <= bypass_writeback_data;
end
default: begin
rs2_bypassed_out <= 0;
rs2_bypass_out <= 0;
end
endcase
```
- This method handles the following cases. The bypass from writeback to decode is needed because, in the cycle in which writeback writes a register and decode reads it, we do not know whether the write or the read happens first. Decode might read the old value, so the data must be bypassed from writeback to decode.
```
// Bypass from memory to decode
add t0,t1,t2
instruction 1
add t3,t0,t1
// Bypass from writeback to decode
add t0,t1,t2
instruction 1
instruction 2
add t3,t0,t1
```
- But this method can't deal with the following data hazard.
```
// Bypass from execute to decode
add t0,t1,t2
add t3,t0,t1
```
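The priority order of the two `case` statements in `decode.v` can be sketched in C++ as follows. This is only an illustrative model: `Bypass` and `select_bypass` are hypothetical names, and the model assumes that the memory stage reports address 0 when it has nothing to bypass, as the `bypass_address` assignment in `memory.v` suggests.

```cpp
#include <cstdint>

// Result of the bypass lookup for one source register (hypothetical type).
struct Bypass {
    bool bypassed;   // true when the value comes from a bypass path (or is x0)
    uint32_t value;  // forwarded value; 0 when not bypassed
};

// Same priority as the Verilog `case`: x0 first, then the memory stage,
// then the writeback stage, otherwise fall back to the register file.
Bypass select_bypass(uint32_t rs,
                     uint32_t mem_addr, uint32_t mem_data,
                     uint32_t wb_addr, uint32_t wb_data) {
    if (rs == 0) return {true, 0};               // x0 always reads as zero
    if (rs == mem_addr) return {true, mem_data}; // newest value wins
    if (rs == wb_addr) return {true, wb_data};
    return {false, 0};                           // read the register file
}
```

When both stages hold the same destination register, the memory stage wins because it contains the younger instruction and therefore the newer value.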
- Next, I studied `hazard.v`. Lines 65 to 67 show that if `rd_address_execute` equals `rs1_address_decode` or `rs2_address_decode`, a data hazard is raised and the instruction in decode is stalled. There is no bypass from the execute stage, which is different from what I learned in class.
```diff=64
wire data_hazard = valid_decode && (
(valid_execute && rd_address_execute != 0 && (
uses_rs1 && rs1_address_decode == rd_address_execute
|| uses_rs2 && rs2_address_decode == rd_address_execute
))
|| (valid_memory && rd_address_memory != 0 && !bypass_memory && (
uses_rs1 && rs1_address_decode == rd_address_memory
|| uses_rs2 && rs2_address_decode == rd_address_memory
))
|| uses_csr && (
csr_write_execute && valid_execute
|| csr_write_memory && valid_memory
|| csr_write_writeback && valid_writeback
));
```
- This means that the second instruction in each of the following pairs is stalled for at least one cycle.
```
// ALU result needed by the next instruction: no bypass from execute, so it stalls
add t0,t1,t2
add t3,t0,t1
// Load result needed by the next instruction: no bypass from execute, so it stalls
lw t0,0(t1)
add t2,t0,t1
```
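To make the stall condition easier to follow, here is a small C++ model of the register part of `data_hazard`. It is a sketch, not the project's code: the struct and function names are hypothetical, and the CSR terms of the Verilog expression are left out on purpose.

```cpp
#include <cstdint>

// Hypothetical summary of a producer stage (execute or memory).
struct StageInfo {
    bool valid;
    uint32_t rd;      // destination register of the instruction in this stage
    bool can_bypass;  // models bypass_memory; only used for the memory stage
};

// Hypothetical summary of the instruction in the decode stage.
struct DecodeInfo {
    bool valid;
    bool uses_rs1, uses_rs2;
    uint32_t rs1, rs2;
};

// True when the decode-stage instruction reads register rd.
bool reads(const DecodeInfo& d, uint32_t rd) {
    return (d.uses_rs1 && d.rs1 == rd) || (d.uses_rs2 && d.rs2 == rd);
}

// Register part of data_hazard: any producer in execute stalls the consumer,
// a producer in memory stalls it only when its result cannot be bypassed.
bool data_hazard(const DecodeInfo& d, const StageInfo& ex, const StageInfo& mem) {
    bool ex_conflict = ex.valid && ex.rd != 0 && reads(d, ex.rd);
    bool mem_conflict = mem.valid && mem.rd != 0 && !mem.can_bypass && reads(d, mem.rd);
    return d.valid && (ex_conflict || mem_conflict);
}
```

With `t0 = x5` and `t1 = x6`, an `add t3,t0,t1` in decode reports a hazard while `add t0,t1,t2` is in execute, but not once that `add` has moved to the memory stage with `can_bypass` set.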
- What about a load instruction whose result is bypassed from memory or from writeback to decode?
- From memory to decode
```
// Load instruction bypass from memory to decode
lw t0,0(t1)
instruction 1
add t2,t0,t1
```
- In `memory.v`, it seems that the data loaded from memory is not bypassed.
```diff=75
assign bypass_address = (valid_in && bypass_memory_in) ? rd_address_in : 5'h0;
assign bypass_data = write_select_in[0] ? csr_data_in : alu_data_in;
```
- In `hazard.v`, when the instruction in the memory stage cannot bypass its result to the decode stage, `bypass_memory` is zero. This raises a data hazard and stalls the dependent instruction.
```diff=69
(valid_memory && rd_address_memory != 0 && !bypass_memory && (
uses_rs1 && rs1_address_decode == rd_address_memory
|| uses_rs2 && rs2_address_decode == rd_address_memory
))
```
- From writeback to decode
```
// Load instruction bypass from writeback to decode
lw t0,0(t1)
instruction 1
instruction 2
add t2,t0,t1
```
- It seems that the writeback stage receives the data loaded from memory and bypasses it to decode when it is needed.
```diff=86
always @(*) begin
case (write_select_in)
WRITE_SEL_ALU: rd_data = alu_data_in;
WRITE_SEL_CSR: rd_data = csr_data_in;
WRITE_SEL_LOAD: rd_data = load_data_in;
WRITE_SEL_NEXT_PC: rd_data = next_pc_in;
endcase
end
```
### Control Hazard
- When we execute a branch instruction, we must wait for the decision whether to jump to the label or to continue, so the two instructions after the branch have to wait for that decision.

- Solution by class
- The branch decision is made in the 2nd stage, so only one nop is needed instead of two
- Branch prediction: predict a result and continue with the next instruction. If the prediction is correct, keep running. If it is wrong, flush the pipeline and flip the prediction.
- Solution by `kleine-riscv`
- It seems that the decision is made in the execute stage (`execute.v`).
```diff=86
cmp ex_cmp (
.clk(clk),
.input_a(acctual_rs1),
.input_b(acctual_rs2),
.function_select(cmp_function_in),
.result(cmp_output_out)
);
```
- `cmp.v` only compares the two operands. It does not contain any branch prediction.
```diff=18
wire is_equal = (input_a == input_b);
wire is_less = ($signed({usign ? 1'b0 : input_a[31], input_a}) < $signed({usign ? 1'b0 : input_b[31], input_b}));
always @(posedge clk) begin
negate <= function_select[0];
quasi_result <= less ? is_less : is_equal;
end
assign result = negate ? !quasi_result : quasi_result;
```
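The comparison in `cmp.v` can be mirrored by a short C++ function. The excerpt only shows that `negate` is bit 0 of `function_select`; the sketch below assumes that `usign` is bit 1 and `less` is bit 2, which matches the standard RISC-V branch `funct3` encoding (`beq` 000, `bne` 001, `blt` 100, `bge` 101, `bltu` 110, `bgeu` 111). The function name `branch_taken` is hypothetical.

```cpp
#include <cstdint>

// Decide whether a branch is taken from its funct3 field and both operands.
// Assumed bit layout: bit 2 = less, bit 1 = usign, bit 0 = negate.
bool branch_taken(uint32_t funct3, uint32_t a, uint32_t b) {
    bool negate = (funct3 & 1) != 0;
    bool usign = (funct3 & 2) != 0;
    bool less = (funct3 & 4) != 0;

    bool is_equal = (a == b);
    // Signed comparison reinterprets the 32-bit patterns as two's complement.
    bool is_less = usign ? (a < b)
                         : (static_cast<int32_t>(a) < static_cast<int32_t>(b));

    bool quasi_result = less ? is_less : is_equal;
    return negate ? !quasi_result : quasi_result;
}
```

For example, with `a = 0xFFFFFFFF` and `b = 1`, `blt` is taken because `a` is -1 as a signed value, while `bltu` is not taken.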
- In `hazard.v`, if the branch is taken, the fetch, decode, and execute stages are invalidated.
```diff=79
assign invalidate_fetch = reset || branch_invalidate || (!fetch_ready && !data_hazard);
assign invalidate_decode = reset || branch_invalidate || data_hazard;
assign invalidate_execute = reset || branch_invalidate;
assign invalidate_memory = reset || trap_invalidate || (!mem_ready && load_store);
```
## Implement RV32M
- **Step 1 : Modify Makefile**
- We need to modify the makefile in the `tests` folder and change the parameter `-march=rv32i` to `-march=rv32g`, because with `-march=rv32i` the toolchain cannot translate `rv32m` instructions into machine code.
```
# == Tools
CC := clang --target=riscv32 -march=rv32g
LD := lld
# ==
```
- **Step 2 : Modify Verilog**
- We need to add some hardware to identify the `rv32m` instructions and compute the correct results. The following is the code I modified.
- decode.v
- We declare a register that records whether the instruction belongs to `rv32m` or `rv32i`.
```diff=56
output reg [2:0] cmp_function_out,
+ output reg alu_select_I_M_out, // for RV32M
output reg jump_out,
```
- Then we add hardware to decode the `rv32m` instructions. They share the opcode `0110011` with the `rv32i` register-register instructions, so `instr[25]` (the lowest bit of funct7, which is set for `rv32m`) is used to tell them apart.
```diff=283
7'b0110011 : begin // OP
alu_function_out <= instr[14:12];
alu_function_modifier_out <= instr[30];
alu_select_a_out <= ALU_SEL_REG;
alu_select_b_out <= ALU_SEL_REG;
write_select_out <= WRITE_SEL_ALU;
rd_address_out <= rd_address;
bypass_memory_out <= 1;
+ alu_select_I_M_out <= instr[25]; // for RV32M
+ if(!instr[25]) begin // If it is rv32i , it need to judge the following condition.
if (instr[31:25] != 0 && (instr[31:25] != 7'b0100000 || (instr[14:12] != 0 && instr[14:12] != 3'b101))) begin
ecause_out <= 2;
exception_out <= 1;
end
+ end
end
```
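The distinction can be checked on concrete encodings with a few lines of C++. The helper names below are hypothetical. Unlike the Verilog change above, which looks only at `instr[25]`, this sketch compares the whole funct7 field with `0000001`, the value the RISC-V specification assigns to the M extension.

```cpp
#include <cstdint>

constexpr uint32_t OPCODE_OP = 0x33;  // 7'b0110011, shared by RV32I OP and RV32M

// True for any register-register instruction with opcode OP.
bool is_op(uint32_t instr) { return (instr & 0x7f) == OPCODE_OP; }

// True when the instruction is an RV32M instruction (funct7 == 0000001).
bool is_rv32m(uint32_t instr) { return is_op(instr) && (instr >> 25) == 1; }

// funct3 selects the operation: 000 mul, 001 mulh, 010 mulhsu, 011 mulhu,
// 100 div, 101 divu, 110 rem, 111 remu.
uint32_t funct3(uint32_t instr) { return (instr >> 12) & 7; }
```

For instance, `mul x14, x1, x2` encodes as `0x02208733` and `add x14, x1, x2` as `0x00208733`. The two words differ only in bit 25.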
- execute.v
- We add an input to receive the information from the `decode` stage.
```diff=19
input [1:0] alu_select_b_in,
+ input alu_select_I_M_in, // for RV32M
input [2:0] cmp_function_in,
```
- Then we connect it to a new input of the `alu` component.
```diff=112
alu ex_alu (
.clk(clk),
.input_a(alu_input_a),
.input_b(alu_input_b),
.function_select(alu_function_in),
.function_modifier(alu_function_modifier_in),
+ .function_select_I_M(alu_select_I_M_in), // for RV32M
.add_result(alu_addition_out),
.result(alu_data_out)
);
```
- alu.v
- We need to implement the `rv32m` operations in `alu.v`.
- I currently describe them at the behavioral level and did not implement the multiplier and divider myself, which causes some problems.
- `*` and `/` at the behavioral level are unsigned operations here. I need to add some `if-else` logic to get correct signed results.
- If the divisor is zero, the result is wrong, so this case needs special handling.
```diff=
module alu (
input clk,
input [31:0] input_a,
input [31:0] input_b,
input [2:0] function_select,
input function_modifier,
+ // for RV32M - input that carries the RV32I/RV32M select from `execute`
+ input function_select_I_M,
// 1st cycle output
output [31:0] add_result,
// 2nd cycle output
output reg [31:0] result
);
localparam ALU_ADD_SUB = 3'b000;
localparam ALU_SLL = 3'b001;
localparam ALU_SLT = 3'b010;
localparam ALU_SLTU = 3'b011;
localparam ALU_XOR = 3'b100;
localparam ALU_SRL_SRA = 3'b101;
localparam ALU_OR = 3'b110;
localparam ALU_AND_CLR = 3'b111;
+ // for rv32m - Declare the localparam to distinguish which operation is the instruction to perform.
+ localparam ALU_MUL = 3'b000;
+ localparam ALU_MULH = 3'b001;
+ localparam ALU_MULHSU = 3'b010;
+ localparam ALU_MULHU = 3'b011;
+ localparam ALU_DIV = 3'b100;
+ localparam ALU_DIVU = 3'b101;
+ localparam ALU_REM = 3'b110;
+ localparam ALU_REMU = 3'b111;
/* verilator lint_off UNUSED */ // The first bit [32] will intentionally be ignored
wire [32:0] tmp_shifted = $signed({function_modifier ? input_a[31] : 1'b0, input_a}) >>> input_b[4:0];
/* verilator lint_on UNUSED */
assign add_result = result_add_sub;
reg [31:0] result_add_sub;
reg [31:0] result_sll;
reg [31:0] result_slt;
reg [31:0] result_xor;
reg [31:0] result_srl_sra;
reg [31:0] result_or;
reg [31:0] result_and_clr;
reg [2:0] old_function;
+ // for rv32m - Set the register to save the result of the operation.
+ reg [63:0] result_mul;
+ /* verilator lint_off UNUSEDSIGNAL */ // Only the upper 32 bits are used, but `*` needs a 64-bit register.
+ reg [63:0] result_mulhsu;
+ reg [63:0] result_mulhu;
+ /* verilator lint_on UNUSEDSIGNAL */
+ reg [31:0] result_div;
+ reg [31:0] result_divu;
+ reg [31:0] result_rem;
+ reg [31:0] result_remu;
+ // for rv32m - temp register
+ reg [63:0] unsigned_input_a;
+ reg [63:0] unsigned_input_b;
+ reg [63:0] signed_input_a;
+ reg [63:0] signed_input_b;
+ // for rv32m - Extend 32 bit to 64 bit
+ assign unsigned_input_a = {32'b0,input_a};
+ assign unsigned_input_b = {32'b0,input_b};
+ assign signed_input_a = {{32{input_a[31]}}, {input_a}};
+ assign signed_input_b = {{32{input_b[31]}}, {input_b}};
always @(posedge clk) begin
old_function <= function_select;
result_add_sub <= input_a + (function_modifier ? -input_b : input_b);
result_sll <= input_a << input_b[4:0];
result_slt <= {
{31{1'b0}},
(
$signed({function_select[0] ? 1'b0 : input_a[31], input_a})
< $signed({function_select[0] ? 1'b0 : input_b[31], input_b})
)
};
result_xor <= input_a ^ input_b;
result_srl_sra <= tmp_shifted[31:0];
result_or <= input_a | input_b;
result_and_clr <= (function_modifier ? ~input_a : input_a) & input_b;
end
+ always @(posedge clk) begin
+ result_mul <= signed_input_a * signed_input_b;
+ result_mulhu <= unsigned_input_a * unsigned_input_b;
+ result_mulhsu <= signed_input_a * unsigned_input_b;
+ if(input_b == 32'h00000000) begin
+ result_divu <= -1;
+ result_remu <= input_a;
+ end else begin
+ result_divu <= input_a / input_b;
+ result_remu <= input_a % input_b;
+ end
+ end
+ // rv32m for signed operation of div and rem
+ always @(*) begin
+ if(input_a[31] == 0) begin
+ if(input_b[31] == 0) begin
+ result_div = input_a / input_b;
+ result_rem = input_a % input_b;
+ end else begin
+ result_div = -(input_a / (-input_b));
+ result_rem = input_a % (-input_b);
+ end
+ end else begin
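+ if(input_b[31] == 0) begin
+ result_div = -(-input_a / input_b);
+ result_rem = -((-input_a) % input_b); // take the magnitude of the negative dividend before the unsigned %
+ end else begin
+ result_div = (-input_a)/(-input_b);
+ result_rem = -((-input_a) % (-input_b)); // the remainder takes the sign of the dividend
+ end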
+ end
+ if(input_b == 32'h00000000) begin
+ result_div = -1;
+ result_rem = input_a;
+ end
+ end
always @(*) begin
+ if(!function_select_I_M) begin
case (old_function)
ALU_ADD_SUB: result = result_add_sub;
ALU_SLL: result = result_sll;
ALU_SLT,
ALU_SLTU: result = result_slt;
ALU_XOR: result = result_xor;
ALU_SRL_SRA: result = result_srl_sra;
ALU_OR: result = result_or;
ALU_AND_CLR: result = result_and_clr;
endcase
+ end else begin
+ case(old_function)
+ ALU_MUL: result = result_mul [31:0];
+ ALU_MULH: result = result_mul [63:32];
+ ALU_MULHSU: result = result_mulhsu [63:32];
+ ALU_MULHU: result = result_mulhu [63:32];
+ ALU_DIV: result = result_div;
+ ALU_DIVU: result = result_divu;
+ ALU_REM: result = result_rem;
+ ALU_REMU: result = result_remu;
+ endcase
+ end
+ end
endmodule
```
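When debugging the ALU it helps to have a software reference for the eight operations. The following C++ sketch follows the RV32M rules as I understand them from the RISC-V specification: division by zero returns all ones for the quotient and the dividend for the remainder, the overflow case `-2^31 / -1` returns `-2^31` with remainder 0, and signed division truncates toward zero. The `rv_*` names are hypothetical and not part of kleine-riscv.

```cpp
#include <cstdint>

// Reinterpret a 32-bit pattern as a two's-complement signed value.
static int32_t as_signed(uint32_t x) { return static_cast<int32_t>(x); }

uint32_t rv_mul(uint32_t a, uint32_t b) { return a * b; }  // low 32 bits

uint32_t rv_mulh(uint32_t a, uint32_t b) {  // signed x signed, high 32 bits
    int64_t p = static_cast<int64_t>(as_signed(a)) * as_signed(b);
    return static_cast<uint32_t>(static_cast<uint64_t>(p) >> 32);
}

uint32_t rv_mulhsu(uint32_t a, uint32_t b) {  // signed x unsigned, high 32 bits
    int64_t p = static_cast<int64_t>(as_signed(a)) * static_cast<int64_t>(b);
    return static_cast<uint32_t>(static_cast<uint64_t>(p) >> 32);
}

uint32_t rv_mulhu(uint32_t a, uint32_t b) {  // unsigned x unsigned, high 32 bits
    uint64_t p = static_cast<uint64_t>(a) * b;
    return static_cast<uint32_t>(p >> 32);
}

uint32_t rv_div(uint32_t a, uint32_t b) {
    if (b == 0) return 0xFFFFFFFFu;                          // divide by zero
    if (a == 0x80000000u && b == 0xFFFFFFFFu) return a;      // overflow case
    return static_cast<uint32_t>(as_signed(a) / as_signed(b));
}

uint32_t rv_rem(uint32_t a, uint32_t b) {
    if (b == 0) return a;                                    // divide by zero
    if (a == 0x80000000u && b == 0xFFFFFFFFu) return 0;      // overflow case
    return static_cast<uint32_t>(as_signed(a) % as_signed(b));
}

uint32_t rv_divu(uint32_t a, uint32_t b) { return b == 0 ? 0xFFFFFFFFu : a / b; }
uint32_t rv_remu(uint32_t a, uint32_t b) { return b == 0 ? a : a % b; }
```

A useful check that goes beyond powers of two is `-7 rem 3`, which must give `-1` (`0xFFFFFFFF`), while `remu` on the same bit patterns gives 0.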
- pipeline.v
- Because we want to pass the information from the `decode` stage to the `execute` stage, we need a wire between these stages.
- Connect the output register of `decode`
```diff=275
.alu_select_b_out(decode_to_execute_alu_select_b),
+ .alu_select_I_M_out(decode_to_execute_alu_select_I_M), // for RV32M
.cmp_function_out(decode_to_execute_cmp_function),
```
- Set the wire from `decode` to `execute`
```diff=318
wire [1:0] decode_to_execute_alu_select_b;
+ wire decode_to_execute_alu_select_I_M; // for RV32M
wire [2:0] decode_to_execute_cmp_function;
```
- Set the input register to `execute`
```diff=359
.alu_select_b_in(decode_to_execute_alu_select_b),
+ .alu_select_I_M_in(decode_to_execute_alu_select_I_M), // for RV32M
.cmp_function_in(decode_to_execute_cmp_function),
```
- **Step 3 : Add riscv-test file**
- The test files come from [riscv-test](https://github.com/riscv-software-src/riscv-tests), the same source that the [Kleine-riscv](https://github.com/rolandbernard/kleine-riscv) project uses for its own tests.
- We added the eight test files for the `rv32m` instructions to the `rv32um` directory.

- **Step 4 : Makefile**
- If the computed result differs from the expected answer, the following is displayed. It tells us which test case failed, so we can modify the code until this test case passes.
```
Running test jal
Running test and
Failed test case #5
make: *** [makefile:56:RUNTEST.tests/build/rv32ui/and] Error 1
```
- Below I explain how the result is checked.
- In the test file, we can see many macro calls like the following.
```
// add.S
TEST_RR_OP( 2, add, 0x00000000, 0x00000000, 0x00000000 );
```
- Then we look at the definition of this function which is in `tests/include/test_macros.h`.
```
#define MASK_XLEN(x) ((x) & ((1 << (__riscv_xlen - 1) << 1) - 1))
#define TEST_CASE( testnum, testreg, correctval, code... ) \
test_ ## testnum: \
code; \
li x7, MASK_XLEN(correctval); \
li TESTNUM, testnum; \
bne testreg, x7, fail;
#define TEST_RR_OP( testnum, inst, result, val1, val2 ) \
TEST_CASE( testnum, x14, result, \
li x1, MASK_XLEN(val1); \
li x2, MASK_XLEN(val2); \
inst x14, x1, x2; \
)
```
- Now we expand `TEST_RR_OP`. The macro loads `val1` and `val2`, executes the instruction under test, and compares the result with `correctval`. If they differ, it jumps to the `fail` label.
```
/* `TEST_RR_OP` after macro expansion */
test_2:
li x1, MASK_XLEN(val1);
li x2, MASK_XLEN(val2);
inst x14, x1, x2; // the instruction we want to test
li x7, MASK_XLEN(correctval);
li TESTNUM, testnum;
bne x14, x7, fail;
```
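The `MASK_XLEN` macro truncates a constant to the register width. A small C++ equivalent shows why it shifts twice instead of shifting by `xlen` directly. This is a sketch: `mask_xlen` is a hypothetical function that takes the width as a parameter instead of reading `__riscv_xlen`.

```cpp
#include <cstdint>

// Keep only the low `xlen` bits of x (xlen is 32 or 64).
// Shifting by (xlen - 1) and then by 1 avoids a single shift by 64,
// which would be undefined for a 64-bit operand.
constexpr uint64_t mask_xlen(uint64_t x, unsigned xlen) {
    return x & (((uint64_t{1} << (xlen - 1)) << 1) - 1);
}
```

With `xlen = 32`, a sign-extended constant such as `0xFFFFFFFF80000000` becomes `0x80000000`, which is the value a 32-bit register can actually hold. With `xlen = 64` the mask is all ones and the value is unchanged.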
- If every test case passes, `RVTEST_PASS` sets the `gp` register to 1 before the `ecall`. If a test case fails, `RVTEST_FAIL` leaves `gp` unchanged, so `gp` still holds `testnum`, the number of the failing test case.
```
/* test_macros.h */
#define TEST_PASSFAIL \
bne x0, TESTNUM, pass; \
fail: \
RVTEST_FAIL; \
pass: \
RVTEST_PASS \
```
```
/* riscv-test.h */
#define TESTNUM gp /* gp is the global pointer register (x3) */
#define RVTEST_PASS \
fence; \
li TESTNUM, 1; \
ecall
#define RVTEST_FAIL \
fence; \
ecall
```
- In `README` written by [kleine-riscv](https://github.com/rolandbernard/kleine-riscv), it said
:::info
The simulator will also write every byte written to the address 0x10000000 to stderr (this is a placeholder for a real UART device).
:::
- This means that the value left in the `gp` register serves as the error code of the test.
- Then we compile the `.S` file to an `.elf` file and load it in `simulator.cpp`, where the main function is located. The main function does the following.
- Use `addRamHandler` to allocate the memory.
- Use `loadFromElfFile` to load the ELF file into that memory.
- Use `addHandlers` to add the console and exit handlers to the core's memory.
- Use `core.cycle()` to run the core cycle by cycle.
```C++=242
/* simulator.cpp */
int main(int argc, const char** argv) {
uint32_t memory_size = 1 << 20;
int cycle_limit = 0;
int latency = 0;
bool add_exit = false;
const char* elf = NULL;
parseArguments(argc, argv, &memory_size, &cycle_limit, &latency, &add_exit, &elf);
if (elf == NULL) {
return 1;
} else {
Core core;
core.memory_latency = latency;
uint32_t* ram = core.memory.addRamHandler(0x80000000, memory_size);
if (!loadFromElfFile(elf, ram, 0x80000000, memory_size)) {
return 1;
}
addHandlers(core.memory, add_exit);
core.reset();
for (int i = 0; i < cycle_limit || cycle_limit == 0; i++) {
core.cycle();
}
std::cerr << "terminated after " << cycle_limit << " cycles" << std::endl;
delete[] ram;
return 1;
}
}
```
- When `core.cycle()` runs and no memory latency is pending, it calls `handleRequest`, which passes the memory request to the first handler whose address range contains the requested address.
```C++=14
/* core.cpp */
void Core::cycle() {
if (memory_wait == 0) {
memory.handleRequest(core_logic);
memory_wait = memory_latency;
} else {
memory.delayRequest(core_logic);
memory_wait--;
}
core_logic.eval();
core_logic.clk = 1;
core_logic.eval();
core_logic.clk = 0;
}
```
```C++=9
/* memory.cpp */
void MagicMemory::handleRequest(Vcore &core) {
if (core.ext_valid) {
uint32_t address = core.ext_address;
for (auto& handler : mapping) {
if (address >= handler.start && address < handler.start + handler.length) {
if (core.ext_write_strobe == 0) {
core.ext_read_data = handler.handle_read(address - handler.start);
} else {
handler.handle_write(address - handler.start, core.ext_write_data, core.ext_write_strobe);
}
break;
}
}
core.ext_ready = true;
} else {
core.ext_ready = false;
}
}
```
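The address lookup in `handleRequest` is a linear search over address ranges. A stripped-down C++ sketch of the read path is shown below. `Handler` and `dispatch_read` are hypothetical names, and the range test is written as `address - start < length`, a small variation on the original `address < start + length` that cannot overflow for a region ending at the top of the 32-bit address space.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical, reduced version of a memory-mapped handler (reads only).
struct Handler {
    uint32_t start;
    uint32_t length;
    std::function<uint32_t(uint32_t)> read;  // receives the offset into the region
};

// Find the first handler whose range contains `address` and read from it.
// Returns false when no handler is mapped at that address.
bool dispatch_read(const std::vector<Handler>& mapping, uint32_t address,
                   uint32_t& out) {
    for (const auto& h : mapping) {
        if (address >= h.start && address - h.start < h.length) {
            out = h.read(address - h.start);
            return true;  // like the `break` in handleRequest: first match wins
        }
    }
    return false;
}
```

A read at `0x80000004` with a RAM region starting at `0x80000000` reaches the RAM handler with offset 4, which is how the simulator turns bus addresses into indices of its `ram` array.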
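- `addHandlers` registers two memory-mapped handlers. Bytes written to `0x10000000` are printed to stderr (the console). When `add_exit` is set, a write to `0x11000000` ends the simulation: the value 1 exits successfully, a value with bit `0x100` set is handled separately, and any other value prints `Failed test case #` followed by that value and ends the program with exit code 1.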
```C++=175
/* simulator.cpp */
static void addHandlers(MagicMemory& memory, bool add_exit) {
MagicMappedHandler console = {
.start = 0x10000000,
.length = 4,
.handle_read = [](uint32_t addres) {
return 0;
},
.handle_write = [](uint32_t address, uint32_t data, uint8_t strobe) {
if (strobe & 0b0001) {
std::cerr << (char)(data & 0xff);
}
}
};
memory.addHandler(console);
if (add_exit) {
MagicMappedHandler exiter = {
.start = 0x11000000,
.length = 4,
.handle_read = [](uint32_t addres) {
return 0;
},
.handle_write = [](uint32_t address, uint32_t data, uint8_t strobe) {
if (data == 1) {
exit(EXIT_SUCCESS);
} else if ((data & 0x100) != 0) {
/* another error handle */
} else {
std::cerr << "Failed test case #" << data << std::endl;
exit(1);
}
}
};
memory.addHandler(exiter);
}
}
```
- The failure message then looks like the following.
```
Running test jal
Running test and
Failed test case #5
make: *** [makefile:56:RUNTEST.tests/build/rv32ui/and] Error 1
```
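The decision made by the exit handler can be summarised in a few lines of C++. This is a sketch of the branching shown in `addHandlers`, not a replacement for it: `ExitOutcome`, `classify_exit_write`, and `describe_exit_write` are hypothetical names, and the case with bit `0x100` set is only labelled "other" because its handling is not shown in the excerpt.

```cpp
#include <cstdint>
#include <string>

enum class ExitOutcome { Pass, Other, Fail };

// Same order of checks as the write handler mapped at 0x11000000.
ExitOutcome classify_exit_write(uint32_t data) {
    if (data == 1) return ExitOutcome::Pass;            // RVTEST_PASS wrote 1
    if ((data & 0x100) != 0) return ExitOutcome::Other; // separate error path
    return ExitOutcome::Fail;                           // data is the test number
}

// Text the simulator would report for a given value (illustrative only).
std::string describe_exit_write(uint32_t data) {
    switch (classify_exit_write(data)) {
        case ExitOutcome::Pass: return "pass";
        case ExitOutcome::Other: return "other";
        default: return "Failed test case #" + std::to_string(data);
    }
}
```

Writing 5 therefore produces the `Failed test case #5` line seen in the log above, while writing 1 ends the run silently with success.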
- After these modifications, the core supports the `RV32M` instructions, and the following tests run without errors.
```
Running test div
Running test mulhsu
Running test divu
Running test mul
Running test remu
Running test rem
Running test mulh
```
## Reference
- [Clang command line argument reference](https://clang.llvm.org/docs/ClangCommandLineReference.html#include-path-management)
- [Summary of how to use define in C](https://pxnet2768.pixnet.net/blog/post/143425336)
- [Makefile syntax and examples](/XdF11cBIR8aahe8bxU7gfw)
- [Hello Verilator: introduction and tutorial for a high-quality, open-source SystemVerilog (Verilog) simulator](https://ys-hayashi.me/2020/12/verilator-2/)
- [Linux ln command tutorial and examples for creating links](https://officeguide.cc/linux-ln-create-link-command-tutorial-examples/)
- [Verilog HDL tutorial notes](https://hom-wang.gitbooks.io/verilog-hdl/content/Chapter_05.html)
- [RV32M,RV64M Instructions](https://msyksphinz-self.github.io/riscv-isadoc/html/rvm.html)
- [riscv-test](https://github.com/riscv-software-src/riscv-tests)
- [Errors and Warnings — Verilator 5.005 documentation](https://verilator.org/guide/latest/warnings.html)