# Computer-Architecture term_project riscv SOC
# pre-work
On the afternoon of the day I chose this topic, I dropped two of my classes, because I knew I wouldn't be having much fun in the next month and would need a lot of time to complete this term project.
## sample
[從零開始的RISC-V SoC架構設計與Linux核心運行 - 硬體篇](https://hackmd.io/@w4K9apQGS8-NFtsnFXutfg/B1Re5uGa5)
## VIVADO
[guide](https://www.youtube.com/watch?v=fBFn32Al0yw)
Install Vivado (use the ML edition; it does not require a license).
**Please use a version released after 2020.**
The 2018 version gave me corrupted files when downloading (personal experience: I re-ran the download three times).
## VSCODE
I use VSCODE to edit my code. (notepad is super bad)
## install make on windows
[follow this](https://blog.csdn.net/weixin_45903371/article/details/113886121)
## fpga
ARTY_A7_100T
Be sure to use the Vivado version mentioned above.
Specification requirements:
![](https://i.imgur.com/8cERv7j.png)
[board file](https://digilent.com/reference/programmable-logic/guides/installing-vivado-and-sdk) (the page includes a guide)
## References
[ithome](https://ithelp.ithome.com.tw/users/20141480/ironman/4772)
[bilibili](https://www.bilibili.com/video/BV1Ve411x75W/?spm_id_from=333.337.search-card.all.click&vd_source=03de631ca969e8af97eafc9d8d816f56) Well, I know learning how to write Verilog on this website sounds funny, but I'M VERY BAD AT English :(
[datasheet](https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf) It is very useful: you don't have to memorize the instruction types when this document is on your computer.
[exp from github](https://github.com/ucb-bar/riscv-mini)
[book1](https://www.books.com.tw/products/0010933946) (this book is very useful for learning some basic knowledge) **MUST READ CH1-CH3 VERY CAREFULLY**
![](https://i.imgur.com/sROhDA6.jpg)
[book2](https://www.books.com.tw/products/CN11542677)
[github exp](https://github.com/yutongshen/RISC-V_SoC)
[stackoverflow](https://stackoverflow.com/)
**A 5-STAGE_cpu code from my friend.(x86)**
**A NTU EE MASTER friend**
**A lot of money(fpga board is very expensive QQ).**
[handshakes's RTL](https://www.cnblogs.com/mikewolf2002/p/11345401.html)
## first of all
I made two versions of the CPU (yes, you read that right).
Since I didn't know how to write Verilog, I trained myself from scratch.
At the beginning, I wrote a 3-stage pipeline CPU that could execute some RV32I instructions (everything except the load and store types).
Then I tried to write a three-stage pipeline CPU together with the SoC, and successfully ran the RV32I tests in ModelSim.
Then came the final version: I added division and multiplication instructions and some SoC peripherals. It is simulated with two top-level files: one integrates all the components of the CPU, and the other connects the SoC and the CPU.
The instruction tests run successfully on Windows, and the bitstream is generated successfully in Vivado.
BUT!!!!!!
**My board hasn't arrived yet**
![](https://i.imgur.com/SvBNHKD.jpg)
Alright, that's a bit too much rambling. As the teacher said:
***"talk is meaningless,show me the code!"***
# The first version
## Purpose:
Pass the RISC-V RV32I instruction set tests.
### generated (taking addi_test as an example)
We turn this RV32I code into a binary file and use it to test the instruction fetch, decode, execute, and memory-access stages of our RTL code.
```
generated/rv32ui-p-addi: file format elf32-littleriscv
Disassembly of section .text.init:
00000000 <_start>:
0: 00000d13 li s10,0
4: 00000d93 li s11,0
00000008 <test_2>:
8: 00000093 li ra,0
c: 00008f13 mv t5,ra
10: 00000e93 li t4,0
14: 00200193 li gp,2
18: 27df1c63 bne t5,t4,290 <fail>
0000001c <test_3>:
1c: 00100093 li ra,1
20: 00108f13 addi t5,ra,1
24: 00200e93 li t4,2
28: 00300193 li gp,3
2c: 27df1263 bne t5,t4,290 <fail>
...
```
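To read the hex words in the listing above by hand, it helps to split one into its I-type fields. The following is a minimal Python sketch (the helper name `decode_i_type` is made up for illustration) that decodes `00108f13`, which the disassembly shows as `addi t5,ra,1`; in the RISC-V register naming, `t5` is x30 and `ra` is x1.

```python
def decode_i_type(word):
    """Split a 32-bit I-type instruction word into its fields."""
    imm = (word >> 20) & 0xFFF          # inst[31:20]
    if imm & 0x800:                     # sign-extend the 12-bit immediate
        imm -= 0x1000
    return {
        "opcode": word & 0x7F,          # inst[6:0]
        "rd": (word >> 7) & 0x1F,       # inst[11:7]
        "funct3": (word >> 12) & 0x7,   # inst[14:12]
        "rs1": (word >> 15) & 0x1F,     # inst[19:15]
        "imm": imm,
    }


fields = decode_i_type(0x00108F13)      # "addi t5,ra,1" from the listing
print(fields)
```

The same bit slices appear as `assign` statements in the `id` module.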
## RTL introduction (CPU)
## pc_reg
Determines the address of the next instruction to fetch.
```
module pc_reg(
input wire clk,
input wire rst,
input wire[31:0] jump_addr_i,
input wire jump_en,
output reg[31:0] pc_o
);
always @(posedge clk) begin
if (rst==1'b0) //active-low reset: rst==0 resets the PC
pc_o<=32'b0;
else if(jump_en)
pc_o<=jump_addr_i;
else
pc_o<=pc_o+3'd4;
end
endmodule
```
## if_id
```
module if_id(
input wire clk,
input wire rst,
input wire [31:0] inst_i,
input wire hold_flag_i,
input wire [31:0] inst_addr_i,
output wire[31:0] inst_addr_o,
output wire[31:0] inst_o
);
reg rom_flag;
always @(posedge clk) begin
if(!rst|hold_flag_i)
rom_flag<=1'b0;
else
rom_flag<=1'b1;
end
assign inst_o=rom_flag?inst_i:`INST_NOP;//pass the ROM instruction when rom_flag==1, otherwise insert a NOP
dff_set #(32)dff2(clk,rst,hold_flag_i,32'b0,inst_addr_i,inst_addr_o);
endmodule
```
## id
The opcode determines the instruction type.
The function codes (func3 and func7) then determine the exact instruction, and the corresponding signals are generated for the next stage.
```
`include "defines.v"
module id(
//from if_id
input wire[31:0] inst_i ,
input wire[31:0] inst_addr_i ,
// to regs
output reg[4:0] rs1_addr_o ,
output reg[4:0] rs2_addr_o ,
// from regs
input wire[31:0] rs1_data_i ,
input wire[31:0] rs2_data_i ,
//to id_ex
output reg[31:0] inst_o ,
output reg[31:0] inst_addr_o ,
output reg[31:0] op1_o ,
output reg[31:0] op2_o ,
output reg[4:0] rd_addr_o ,
output reg reg_wen ,
output reg[31:0] base_addr_o ,
output reg[31:0] addr_offset_o ,
//to mem read
output reg mem_rd_req_o ,
output reg[31:0] mem_rd_addr_o
);
wire[6:0] opcode;
wire[4:0] rd ;
wire[2:0] func3 ;
wire[4:0] rs1 ;
wire[4:0] rs2 ;
wire[6:0] func7 ;
wire[11:0]imm ;
wire[4:0] shamt ;
assign opcode = inst_i[6:0];
assign rd = inst_i[11:7];
assign func3 = inst_i[14:12];
assign rs1 = inst_i[19:15];
assign rs2 = inst_i[24:20];
assign func7 = inst_i[31:25];
assign imm = inst_i[31:20];
assign shamt = inst_i[24:20];
always @(*)begin
inst_o = inst_i;
inst_addr_o = inst_addr_i;
case(opcode)
`INST_TYPE_I:begin
mem_rd_req_o = 1'b0 ;
mem_rd_addr_o = 32'b0;
base_addr_o = 32'b0;
addr_offset_o = 32'b0;
case(func3)
`INST_ADDI,`INST_SLTI,`INST_SLTIU,`INST_XORI,`INST_ORI,`INST_ANDI:begin
rs1_addr_o = rs1;
rs2_addr_o = 5'b0;
op1_o = rs1_data_i;
op2_o = {{20{imm[11]}},imm};
rd_addr_o = rd;
reg_wen = 1'b1;
end
`INST_SLLI,`INST_SRI:begin
rs1_addr_o = rs1;
rs2_addr_o = 5'b0;
op1_o = rs1_data_i;
op2_o = {27'b0,shamt};
rd_addr_o = rd;
reg_wen = 1'b1;
end
default:begin
rs1_addr_o = 5'b0;
rs2_addr_o = 5'b0;
op1_o = 32'b0;
op2_o = 32'b0;
rd_addr_o = 5'b0;
reg_wen = 1'b0;
end
endcase
end
`INST_TYPE_R_M:begin
mem_rd_req_o = 1'b0 ;
mem_rd_addr_o = 32'b0;
base_addr_o = 32'b0;
addr_offset_o = 32'b0;
case(func3)
`INST_ADD_SUB,`INST_SLT,`INST_SLTU,`INST_XOR,`INST_OR,`INST_AND:begin
rs1_addr_o = rs1;
rs2_addr_o = rs2;
op1_o = rs1_data_i;
op2_o = rs2_data_i;
rd_addr_o = rd;
reg_wen = 1'b1;
end
`INST_SLL,`INST_SR:begin
rs1_addr_o = rs1;
rs2_addr_o = rs2;
op1_o = rs1_data_i;
op2_o = {27'b0,rs2_data_i[4:0]};
rd_addr_o = rd;
reg_wen = 1'b1;
end
default:begin
rs1_addr_o = 5'b0;
rs2_addr_o = 5'b0;
op1_o = 32'b0;
op2_o = 32'b0;
rd_addr_o = 5'b0;
reg_wen = 1'b0;
end
endcase
end
`INST_TYPE_B:begin
mem_rd_req_o = 1'b0 ;
mem_rd_addr_o = 32'b0;
case(func3)
`INST_BNE,`INST_BEQ,`INST_BLT,`INST_BGE,`INST_BLTU,`INST_BGEU:begin
rs1_addr_o = rs1;
rs2_addr_o = rs2;
op1_o = rs1_data_i;
op2_o = rs2_data_i;
rd_addr_o = 5'b0;
reg_wen = 1'b0;
base_addr_o = inst_addr_i;
addr_offset_o = {{19{inst_i[31]}},inst_i[31],inst_i[7],inst_i[30:25],inst_i[11:8],1'b0};
end
default:begin
rs1_addr_o = 5'b0;
rs2_addr_o = 5'b0;
op1_o = 32'b0;
op2_o = 32'b0;
rd_addr_o = 5'b0;
reg_wen = 1'b0;
base_addr_o = 32'b0;
addr_offset_o = 32'b0;
end
endcase
end
`INST_TYPE_L:begin
case(func3)
`INST_LW,`INST_LH,`INST_LB,`INST_LHU,`INST_LBU:begin
mem_rd_req_o = 1'b1 ;
mem_rd_addr_o = rs1_data_i + {{20{imm[11]}},imm};
rs1_addr_o = rs1;
rs2_addr_o = 5'b0;
op1_o = 32'b0;
op2_o = 32'b0;
rd_addr_o = rd;
reg_wen = 1'b1;
base_addr_o = rs1_data_i;
addr_offset_o = {{20{imm[11]}},imm};
end
default:begin
mem_rd_req_o = 1'b0 ;
mem_rd_addr_o = 32'b0 ;
rs1_addr_o = 5'b0 ;
rs2_addr_o = 5'b0 ;
op1_o = 32'b0 ;
op2_o = 32'b0 ;
rd_addr_o = 5'b0 ;
reg_wen = 1'b0 ;
end
endcase
end
`INST_TYPE_S:begin
case(func3)
`INST_SW,`INST_SH,`INST_SB:begin
mem_rd_req_o = 1'b0 ;
mem_rd_addr_o = 32'b0 ;
rs1_addr_o = rs1 ;
rs2_addr_o = rs2 ;
op1_o = 32'b0 ;
op2_o = rs2_data_i ;
rd_addr_o = 5'b0 ;
reg_wen = 1'b0 ;
base_addr_o = rs1_data_i ;
addr_offset_o = {{20{inst_i[31]}},inst_i[31:25],inst_i[11:7]};
end
default:begin
mem_rd_req_o = 1'b0 ;
mem_rd_addr_o = 32'b0 ;
rs1_addr_o = 5'b0 ;
rs2_addr_o = 5'b0 ;
op1_o = 32'b0 ;
op2_o = 32'b0 ;
rd_addr_o = 5'b0 ;
reg_wen = 1'b0 ;
base_addr_o = 32'b0;
addr_offset_o = 32'b0;
end
endcase
end
`INST_JAL:begin
mem_rd_req_o = 1'b0 ;
mem_rd_addr_o = 32'b0;
rs1_addr_o = 5'b0;
rs2_addr_o = 5'b0;
op1_o = inst_addr_i;
op2_o = 32'h4;
rd_addr_o = rd;
reg_wen = 1'b1;
base_addr_o = inst_addr_i;
addr_offset_o = {{12{inst_i[31]}}, inst_i[19:12], inst_i[20], inst_i[30:21], 1'b0};
end
`INST_LUI:begin
mem_rd_req_o = 1'b0 ;
mem_rd_addr_o = 32'b0;
rs1_addr_o = 5'b0;
rs2_addr_o = 5'b0;
op1_o = {inst_i[31:12],12'b0};
op2_o = 32'b0;
rd_addr_o = rd;
reg_wen = 1'b1;
base_addr_o = 32'b0;
addr_offset_o = 32'b0;
end
`INST_JALR:begin
mem_rd_req_o = 1'b0 ;
mem_rd_addr_o = 32'b0;
rs1_addr_o = rs1;
rs2_addr_o = 5'b0;
op1_o = inst_addr_i;
op2_o = 32'h4;
rd_addr_o = rd;
reg_wen = 1'b1;
base_addr_o = rs1_data_i;
addr_offset_o = {{20{imm[11]}},imm};
end
`INST_AUIPC:begin
mem_rd_req_o = 1'b0 ;
mem_rd_addr_o = 32'b0;
rs1_addr_o = 5'b0;
rs2_addr_o = 5'b0;
op1_o = {inst_i[31:12],12'b0};
op2_o = inst_addr_i;
rd_addr_o = rd;
reg_wen = 1'b1;
base_addr_o = 32'b0;
addr_offset_o = 32'b0;
end
default:begin
mem_rd_req_o = 1'b0 ;
mem_rd_addr_o = 32'b0;
rs1_addr_o = 5'b0;
rs2_addr_o = 5'b0;
op1_o = 32'b0;
op2_o = 32'b0;
rd_addr_o = 5'b0;
reg_wen = 1'b0;
base_addr_o = 32'b0;
addr_offset_o = 32'b0;
end
endcase
end
endmodule
```
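The branch, store, and jump offsets in `id` are built by concatenating scattered instruction bits. As a cross-check, here is a small Python sketch of the same bit shuffling (the function names are hypothetical). It is verified against the two `bne` words from the addi_test listing, both of which branch to `290 <fail>`.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def b_type_offset(word):
    # {inst[31], inst[7], inst[30:25], inst[11:8], 1'b0}
    imm = (((word >> 31) & 0x1) << 12) | (((word >> 7) & 0x1) << 11) \
        | (((word >> 25) & 0x3F) << 5) | (((word >> 8) & 0xF) << 1)
    return sign_extend(imm, 13)


def s_type_offset(word):
    # {inst[31:25], inst[11:7]}
    imm = (((word >> 25) & 0x7F) << 5) | ((word >> 7) & 0x1F)
    return sign_extend(imm, 12)


def j_type_offset(word):
    # {inst[31], inst[19:12], inst[20], inst[30:21], 1'b0}
    imm = (((word >> 31) & 0x1) << 20) | (((word >> 12) & 0xFF) << 12) \
        | (((word >> 20) & 0x1) << 11) | (((word >> 21) & 0x3FF) << 1)
    return sign_extend(imm, 21)


# bne at address 0x18 in the listing jumps to 0x290
print(hex(0x18 + b_type_offset(0x27DF1C63)))
```

The branch target is `base_addr_o + addr_offset_o`, which is why `id` forwards the instruction address as the base for B-type and JAL.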
## id_ex
```
module id_ex(
input wire clk,
input wire rst,
//from id
input wire[31:0] inst_i,
input wire[31:0] inst_addr_i,
input wire[31:0] op1_i,
input wire[31:0] op2_i,
input wire[4:0] rd_addr_i,
input wire reg_wen_i,
input wire[31:0] base_addr_i,
input wire[31:0] addr_offset_i,
//from ctrl
input wire hold_flag_i,
//to ex
output wire[31:0] inst_o,
output wire[31:0] inst_addr_o,
output wire[31:0] op1_o,
output wire[31:0] op2_o,
output wire[4:0] rd_addr_o,
output wire[31:0] base_addr_o,
output wire reg_wen_o,
output wire[31:0] addr_offset_o
);
dff_set #(32) dff1(clk,rst,hold_flag_i,`INST_NOP,inst_i,inst_o);
dff_set #(32) dff2(clk,rst,hold_flag_i,32'b0,inst_addr_i,inst_addr_o);
dff_set #(32) dff3(clk,rst,hold_flag_i,32'b0,op1_i,op1_o);
dff_set #(32) dff4(clk,rst,hold_flag_i,32'b0,op2_i,op2_o);
dff_set #(5) dff5(clk,rst,hold_flag_i,5'b0,rd_addr_i,rd_addr_o);
dff_set #(1) dff6(clk,rst,hold_flag_i,1'b0,reg_wen_i,reg_wen_o);
dff_set #(32) dff7(clk,rst,hold_flag_i,32'b0,base_addr_i,base_addr_o);
dff_set #(32) dff8(clk,rst,hold_flag_i,32'b0,addr_offset_i,addr_offset_o);
endmodule
```
## ex
At first I wrote every calculation directly inside the case logic, without declaring the shared results (adder, comparators, shifters) in advance, and then my friends said I was stupid XD
I later verified that doing so slows down the overall CPU performance.
```
module ex(
//from id_ex
input wire[31:0] inst_i,
input wire[31:0] inst_addr_i,
input wire[31:0] op1_i,
input wire[31:0] op2_i,
input wire[4:0] rd_addr_i,
input wire rd_wen_i,
input wire[31:0] base_addr_i,
input wire[31:0] addr_offset_i,
//to regs
output reg[4:0] rd_addr_o,
output reg[31:0]rd_data_o,
output reg rd_wen_o,
//to ctrl
output reg[31:0]jump_addr_o,
output reg jump_en_o,
output reg hold_flag_o,
//to mem write
output reg mem_wr_req_o,
output reg[3:0] mem_wr_sel_o,
output reg[31:0]mem_wr_addr_o,
output reg[31:0]mem_wr_data_o,
//from memread
input wire[31:0]mem_rd_data_i
);
wire[6:0] opcode;
wire[4:0] rd;
wire[2:0] func3;
wire[4:0] rs1;
wire[4:0] rs2;
wire[6:0] func7;
wire[11:0] imm;
wire[4:0] shamt;
assign opcode=inst_i[6:0];
assign rd=inst_i[11:7];
assign func3 =inst_i[14:12];
assign func7 =inst_i[31:25];
assign rs1=inst_i[19:15];
assign rs2=inst_i[24:20];
assign imm=inst_i[31:20];
assign shamt=inst_i[24:20];
//branch
//wire[31:0] jump_imm={{19{inst_i[31]}},inst_i[31],inst_i[7],inst_i[30:25],inst_i[11:8],1'b0};
wire op1_i_equal_op2_i;
wire op1_i_less_op2_i_signed;
wire op1_i_less_op2_i_unsigned;
assign op1_i_less_op2_i_signed = ($signed(op1_i) < $signed(op2_i))?1'b1:1'b0;
assign op1_i_less_op2_i_unsigned = (op1_i < op2_i)?1'b1:1'b0;
assign op1_i_equal_op2_i = (op1_i == op2_i)?1'b1:1'b0;
// logic units
wire[31:0] op1_i_add_op2_i;
wire[31:0] op1_i_and_op2_i;
wire[31:0] op1_i_xor_op2_i;
wire[31:0] op1_i_or_op2_i;
wire[31:0] op1_i_shift_left_op2_i;
wire[31:0] op1_i_shift_right_op2_i;
wire[31:0] base_addr_add_addr_offset;
assign op1_i_add_op2_i=op1_i+op2_i;
assign op1_i_and_op2_i=op1_i&op2_i;
assign op1_i_xor_op2_i=op1_i^op2_i;
assign op1_i_or_op2_i=op1_i|op2_i;
assign op1_i_shift_left_op2_i=op1_i<<op2_i;
assign op1_i_shift_right_op2_i=op1_i>>op2_i;
assign base_addr_add_addr_offset=base_addr_i+addr_offset_i;
// type I
wire[31:0] SRA_mask;
assign SRA_mask = (32'hffff_ffff) >> op2_i[4:0];
wire[1:0]store_index = base_addr_add_addr_offset[1:0];
wire[1:0]load_index = base_addr_add_addr_offset[1:0];
```
```
`INST_TYPE_I:begin
jump_addr_o=32'b0;//write wen
jump_en_o=1'b0;
hold_flag_o=1'b0;
mem_wr_req_o=1'b0;
mem_wr_sel_o=4'b0;
mem_wr_addr_o=32'b0;
mem_wr_data_o=32'b0;
case(func3)
`INST_ADDI:begin//same instruction structure
rd_data_o=op1_i_add_op2_i;
rd_addr_o=rd_addr_i;
rd_wen_o=1'b1;
end
`INST_SLTI:begin
rd_data_o={31'b0,op1_i_less_op2_i_signed};//31 zero bits + 1 result bit = 32 bits
rd_addr_o=rd_addr_i;
rd_wen_o=1'b1;
end
`INST_SLTIU:begin
rd_data_o={31'b0,op1_i_less_op2_i_unsigned};//31 zero bits + 1 result bit = 32 bits
rd_addr_o=rd_addr_i;
rd_wen_o=1'b1;
end
`INST_XORI:begin
rd_data_o=op1_i_xor_op2_i;
rd_addr_o=rd_addr_i;
rd_wen_o=1'b1;
end
`INST_ORI:begin
rd_data_o=op1_i_or_op2_i;
rd_addr_o=rd_addr_i;
rd_wen_o=1'b1;
end
`INST_ANDI:begin
rd_data_o= op1_i_and_op2_i;
rd_addr_o=rd_addr_i;
rd_wen_o=1'b1;
end
`INST_SLLI:begin
rd_data_o=op1_i_shift_left_op2_i;
rd_addr_o=rd_addr_i;
rd_wen_o=1'b1;
end
`INST_SRI:begin
if (func7[5]==1'b1) begin //SRAI (only 5 bit limit)
rd_data_o=((op1_i_shift_right_op2_i) & SRA_mask) | ({32{op1_i[31]}} & (~SRA_mask));
rd_addr_o=rd_addr_i;
rd_wen_o=1'b1;
end
else begin//SRLI
rd_data_o=op1_i_shift_right_op2_i;
rd_addr_o=rd_addr_i;
rd_wen_o=1'b1;
end
end
default:begin
rd_data_o=32'b0;
rd_addr_o=5'b0;
rd_wen_o=1'b0;
end
endcase
end
```
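The SRAI branch above builds an arithmetic right shift out of a logical shift and `SRA_mask`. A minimal Python sketch of the same trick, assuming 32-bit values (the function name `sra32` is made up for illustration):

```python
MASK32 = 0xFFFFFFFF


def sra32(value, shamt):
    """Arithmetic right shift built from a logical shift plus a mask."""
    shamt &= 0x1F                                  # only 5 bits of shift amount
    sra_mask = MASK32 >> shamt                     # ones where shifted data lands
    logical = (value & MASK32) >> shamt            # plain logical shift (SRLI)
    sign_fill = MASK32 if (value >> 31) & 1 else 0 # {32{op1_i[31]}}
    # keep the shifted bits, fill the vacated top bits with the sign
    return (logical & sra_mask) | (sign_fill & ~sra_mask & MASK32)


print(hex(sra32(0x80000000, 4)))
```

For a positive operand the fill is zero, so the result is the same as the logical shift.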
## ram
External memory device used by the first version.
```
module ram(
input wire clk,
input wire rst,
input wire [3:0] wen,
input wire [32-1:0]w_addr_i,
input wire [32-1:0]w_data_i,
input wire ren,
input wire [32-1:0]r_addr_i,
output wire [32-1:0]r_data_o
);
wire[11:0] w_addr = w_addr_i[13:2];
wire[11:0] r_addr = r_addr_i[13:2];
dual_ram #(
.DW(8),
.AW(12),
.MEM_NUM(4096)
)
ram_byte0
(
.clk (clk ),
.rst (rst ),
.wen (wen[0] ),
.w_addr_i (w_addr ),
.w_data_i (w_data_i[7:0] ),
.ren (ren ),
.r_addr_i (r_addr ),
.r_data_o (r_data_o[7:0] )
);
dual_ram #(
.DW(8),
.AW(12),
.MEM_NUM(4096)
)
ram_byte1
(
.clk (clk ),
.rst (rst ),
.wen (wen[1] ),
.w_addr_i (w_addr ),
.w_data_i (w_data_i[15:8] ),
.ren (ren ),
.r_addr_i (r_addr ),
.r_data_o (r_data_o[15:8] )
);
dual_ram #(
.DW(8),
.AW(12),
.MEM_NUM(4096)
)
ram_byte2
(
.clk (clk ),
.rst (rst ),
.wen (wen[2] ),
.w_addr_i (w_addr ),
.w_data_i (w_data_i[23:16]),
.ren (ren ),
.r_addr_i (r_addr ),
.r_data_o (r_data_o[23:16])
);
dual_ram #(
.DW(8),
.AW(12),
.MEM_NUM(4096)
)
ram_byte3
(
.clk (clk ),
.rst (rst ),
.wen (wen[3] ),
.w_addr_i (w_addr ),
.w_data_i (w_data_i[31:24]),
.ren (ren ),
.r_addr_i (r_addr ),
.r_data_o (r_data_o[31:24])
);
endmodule
```
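The `ram` module splits each 32-bit word across four byte-wide memories, and each bit of `wen` enables one byte lane. As a rough illustration, the Python sketch below models one write; the function name `ram_write` and the list-based memory are assumptions made for this example, and read timing is ignored.

```python
def ram_write(memory, byte_addr, data, wen):
    """Write `data` into the word at byte_addr; wen is a 4-bit lane select."""
    word_index = (byte_addr >> 2) & 0xFFF       # w_addr_i[13:2], 4096 words
    word = memory[word_index]
    for lane in range(4):
        if (wen >> lane) & 1:                   # lane enabled: replace this byte
            lane_mask = 0xFF << (8 * lane)
            word = (word & ~lane_mask) | (data & lane_mask)
    memory[word_index] = word & 0xFFFFFFFF


memory = [0] * 4096
memory[1] = 0x11223344
ram_write(memory, 0x4, 0xAABBCCDD, 0b0010)      # only byte lane 1 is written
print(hex(memory[1]))
```

Note that the data must already be aligned to its lane, because each byte memory only sees its own slice of `w_data_i`.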
## rom
External instruction memory used by the first version.
```
module rom(
input wire clk,
input wire rst,
input wire wen,
input wire[32-1:0] w_addr_i,
input wire[32-1:0] w_data_i,
input wire ren,
input wire[32-1:0] r_addr_i,
output wire[32-1:0] r_data_o
);
wire[11:0] w_addr = w_addr_i[13:2];
wire[11:0] r_addr = r_addr_i[13:2];
dual_ram#(
.DW(32),
.AW(12),
.MEM_NUM(4096)
)
rom_32bit(
.clk(clk),
.rst(rst),
.wen(wen),
.w_addr_i(w_addr),
.w_data_i(w_data_i),
.ren(ren),
.r_addr_i(r_addr),
.r_data_o(r_data_o)
);
endmodule
```
## ctrl
Controls the jumps of B-type instructions and holds the pipeline whenever a jump or a hold request comes from ex.
```
module ctrl (
input wire[31:0]jump_addr_i,
input wire jump_en_i,
input wire hold_flag_ex_i,
output reg[31:0]jump_addr_o,
output reg jump_en_o,
output reg hold_flag_o
);
always @(*)begin
jump_addr_o = jump_addr_i;
jump_en_o = jump_en_i;
if( jump_en_i || hold_flag_ex_i)begin
hold_flag_o = 1'b1;
end
else begin
hold_flag_o = 1'b0;
end
end
endmodule
```
### risc-v top
Connects the CPU modules to each other.
```
module open_risc_v(
input wire clk ,
input wire rst ,
//inst
input wire [31:0] inst_i ,
output wire [31:0] inst_addr_o ,
//read mem
output wire mem_rd_req_o ,
output wire [31:0] mem_rd_addr_o ,
input wire [31:0] mem_rd_data_i ,
//write mem
output wire mem_wr_req_o ,
output wire [3:0] mem_wr_sel_o ,
output wire [31:0] mem_wr_addr_o ,
output wire [31:0] mem_wr_data_o
);
//pc to rom
wire[31:0] pc_reg_pc_o;
assign inst_addr_o = pc_reg_pc_o;
//if to if_id
wire[31:0] if_inst_addr_o;
wire[31:0] if_inst_o;
// if_id to id
wire[31:0] if_id_inst_addr_o;
wire[31:0] if_id_inst_o;
//ex to regs
wire[4:0] ex_rd_addr_o;
wire[31:0] ex_rd_data_o;
wire ex_reg_wen_o;
//id to regs
wire[4:0] id_rs1_addr_o;
wire[4:0] id_rs2_addr_o;
//id to id_ex
wire[31:0] id_inst_o;
wire[31:0] id_inst_addr_o;
wire[31:0] id_op1_o;
wire[31:0] id_op2_o;
wire[4:0] id_rd_addr_o;
wire id_reg_wen;
wire[31:0] id_base_addr_o ;
wire[31:0] id_addr_offset_o ;
//regs to id
wire[31:0] regs_reg1_rdata_o;
wire[31:0] regs_reg2_rdata_o;
//id_ex to ex
wire[31:0] id_ex_inst_o;
wire[31:0] id_ex_inst_addr_o;
wire[31:0] id_ex_op1_o;
wire[31:0] id_ex_op2_o;
wire[4:0] id_ex_rd_addr_o;
wire id_ex_reg_wen;
wire[31:0] id_ex_base_addr_o ;
wire[31:0] id_ex_addr_offset_o;
//ex to ctrl
wire[31:0] ex_jump_addr_o;
wire ex_jump_en_o;
wire ex_hold_flag_o;
//ctrl to pc_reg
wire[31:0] ctrl_jump_addr_o;
wire ctrl_jump_en_o;
//ctrl to if_id id_ex
wire ctrl_hold_flag_o;
pc_reg pc_reg_inst(
.clk (clk ),
.rst (rst ),
.jump_addr_i (ctrl_jump_addr_o ),
.jump_en (ctrl_jump_en_o ),
.pc_o (pc_reg_pc_o )
);
if_id if_id_inst(
.clk (clk ),
.rst (rst ),
.hold_flag_i (ctrl_hold_flag_o ),
.inst_i (inst_i ),
.inst_addr_i (pc_reg_pc_o ),
.inst_addr_o (if_id_inst_addr_o ),
.inst_o (if_id_inst_o )
);
//id to rom
id id_inst(
.inst_i (if_id_inst_o ),
.inst_addr_i (if_id_inst_addr_o ),
.rs1_addr_o (id_rs1_addr_o ),
.rs2_addr_o (id_rs2_addr_o ),
.rs1_data_i (regs_reg1_rdata_o ),
.rs2_data_i (regs_reg2_rdata_o ),
.inst_o (id_inst_o ),
.inst_addr_o (id_inst_addr_o ),
.op1_o (id_op1_o ),
.op2_o (id_op2_o ),
.rd_addr_o (id_rd_addr_o ),
.reg_wen (id_reg_wen ),
.base_addr_o (id_base_addr_o ),
.addr_offset_o (id_addr_offset_o ),
.mem_rd_req_o (mem_rd_req_o ),
.mem_rd_addr_o (mem_rd_addr_o )
);
regs regs_inst(
.clk (clk ),
.rst (rst ),
.reg1_raddr_i (id_rs1_addr_o ),
.reg2_raddr_i (id_rs2_addr_o ),
.reg1_rdata_o (regs_reg1_rdata_o ),
.reg2_rdata_o (regs_reg2_rdata_o ),
.reg_waddr_i (ex_rd_addr_o ),
.reg_wdata_i (ex_rd_data_o ),
.reg_wen (ex_reg_wen_o )
);
id_ex id_ex_inst(
.clk (clk ),
.rst (rst ),
.hold_flag_i (ctrl_hold_flag_o ),
.inst_i (id_inst_o ),
.inst_addr_i (id_inst_addr_o ),
.op1_i (id_op1_o ),
.op2_i (id_op2_o ),
.rd_addr_i (id_rd_addr_o ),
.reg_wen_i (id_reg_wen ),
.base_addr_i (id_base_addr_o ),
.addr_offset_i (id_addr_offset_o ),
.inst_o (id_ex_inst_o ),
.inst_addr_o (id_ex_inst_addr_o ),
.op1_o (id_ex_op1_o ),
.op2_o (id_ex_op2_o ),
.rd_addr_o (id_ex_rd_addr_o ),
.reg_wen_o (id_ex_reg_wen ),
.base_addr_o (id_ex_base_addr_o ),
.addr_offset_o (id_ex_addr_offset_o)
);
ex ex_inst(
.inst_i (id_ex_inst_o ),
.inst_addr_i (id_ex_inst_addr_o ),
.op1_i (id_ex_op1_o ),
.op2_i (id_ex_op2_o ),
.rd_addr_i (id_ex_rd_addr_o ),
.rd_wen_i (id_ex_reg_wen ),
.base_addr_i (id_ex_base_addr_o ),
.addr_offset_i (id_ex_addr_offset_o),
.rd_addr_o (ex_rd_addr_o ),
.rd_data_o (ex_rd_data_o ),
.rd_wen_o (ex_reg_wen_o ),
.jump_addr_o (ex_jump_addr_o ),
.jump_en_o (ex_jump_en_o ),
.hold_flag_o (ex_hold_flag_o ),
.mem_wr_req_o (mem_wr_req_o ),
.mem_wr_sel_o (mem_wr_sel_o ),
.mem_wr_addr_o (mem_wr_addr_o ),
.mem_wr_data_o (mem_wr_data_o ),
.mem_rd_data_i (mem_rd_data_i )
);
ctrl ctrl_inst(
.jump_addr_i (ex_jump_addr_o ),
.jump_en_i (ex_jump_en_o ),
.hold_flag_ex_i (ex_hold_flag_o ),
.jump_addr_o (ctrl_jump_addr_o ),
.jump_en_o (ctrl_jump_en_o ),
.hold_flag_o (ctrl_hold_flag_o )
);
endmodule
```
### dual_ram
```
module dual_ram #(
parameter DW = 32,
parameter AW = 12,
parameter MEM_NUM = 4096
)
(
input wire clk,
input wire rst,
input wire wen,
input wire[AW-1:0] w_addr_i,
input wire[DW-1:0] w_data_i,
input wire ren,
input wire[AW-1:0] r_addr_i,
output wire[DW-1:0] r_data_o
);
wire[DW-1:0] r_data_wire ;
reg rd_equ_wr_flag ;
reg[DW-1:0] w_data_reg ;
assign r_data_o = (rd_equ_wr_flag) ? w_data_reg : r_data_wire;
always @(posedge clk)begin
if(!rst)
w_data_reg <= 'b0;
else
w_data_reg <= w_data_i;
end
//switch
always @(posedge clk)begin
if(rst && wen && ren && w_addr_i == r_addr_i )
rd_equ_wr_flag <= 1'b1;
else if(rst && ren)
rd_equ_wr_flag <= 1'b0;
end
dual_ram_template #(
.DW (DW),
.AW (AW),
.MEM_NUM (MEM_NUM)
)dual_ram_template_isnt
(
.clk(clk),
.rst(rst),
.wen(wen),
.w_addr_i(w_addr_i ),
.w_data_i(w_data_i ),
.ren(ren),
.r_addr_i(r_addr_i ),
.r_data_o(r_data_wire)
);
endmodule
module dual_ram_template #(
parameter DW = 32,
parameter AW = 12,
parameter MEM_NUM = 4096
)
(
input wire clk,
input wire rst,
input wire wen,
input wire[AW-1:0] w_addr_i,
input wire[DW-1:0] w_data_i,
input wire ren,
input wire[AW-1:0] r_addr_i,
output reg[DW-1:0] r_data_o
);
reg[DW-1:0] memory[0:MEM_NUM-1];
always @(posedge clk)begin
if(rst && ren)
r_data_o <= memory[r_addr_i];
end
always @(posedge clk)begin
if(rst && wen)
memory[w_addr_i] <= w_data_i;
end
endmodule
```
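`dual_ram` wraps a synchronous memory and adds a bypass for the case where the same address is read and written in one cycle. The Python sketch below models one rising clock edge, assuming `rst` is high (not in reset); the class name `DualRamModel` and its `step` method are hypothetical and only meant to show the behaviour.

```python
class DualRamModel:
    """One-clock-edge model of dual_ram with rst held high."""

    def __init__(self, depth=4096):
        self.memory = [0] * depth
        self.r_data_wire = 0         # output register of dual_ram_template
        self.w_data_reg = 0          # registered copy of the write data
        self.rd_equ_wr_flag = False  # set when read and write addresses collide

    def step(self, wen, w_addr, w_data, ren, r_addr):
        # Non-blocking semantics: every next value uses the pre-edge state.
        next_read = self.memory[r_addr] if ren else self.r_data_wire
        if wen and ren and w_addr == r_addr:
            next_flag = True
        elif ren:
            next_flag = False
        else:
            next_flag = self.rd_equ_wr_flag
        if wen:
            self.memory[w_addr] = w_data
        self.w_data_reg = w_data
        self.r_data_wire = next_read
        self.rd_equ_wr_flag = next_flag

    @property
    def r_data_o(self):
        # Bypass: on a collision return the data that was just written.
        return self.w_data_reg if self.rd_equ_wr_flag else self.r_data_wire


ram = DualRamModel(depth=16)
ram.step(wen=1, w_addr=5, w_data=0xAA, ren=1, r_addr=5)
print(hex(ram.r_data_wire), hex(ram.r_data_o))
```

Without the bypass the reader would see the stale value from `r_data_wire` for one cycle.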
### top_soc
Connect each module of SOC.
```
module open_risc_v_soc(
input wire clk,
input wire rst,
input wire uart_rxd,
input wire debug_button,
output wire led_debug,
output wire led2
);
// open_risc_v to rom
wire[31:0] open_risc_v_inst_addr_o;
//rom to open_risc_v
wire[31:0] rom_inst_o;
// open_risc_v to ram
wire open_risc_v_mem_wr_req_o ;
wire[3:0] open_risc_v_mem_wr_sel_o ;
wire[31:0] open_risc_v_mem_wr_addr_o;
wire[31:0] open_risc_v_mem_wr_data_o;
wire open_risc_v_mem_rd_req_o ;
wire[31:0] open_risc_v_mem_rd_addr_o;
//ram to open_risc_v
wire[31:0] ram_rd_data_o;
//uart_debug to rom
wire uart_debug_ce;
wire uart_debug_wen;
wire[31:0] uart_debug_addr_o;
wire[31:0] uart_debug_data_o;
//debug_button_debounce to debug
wire debug;
debug_button_debounce debug_button_debounce_inst(
.clk(clk),
.rst(rst),
.debug_button(debug_button),
.debug(debug),
.led_debug(led_debug)
);
open_risc_v open_risc_v_inst(
.clk(clk),
.rst(rst),
.inst_i(rom_inst_o),
.inst_addr_o(open_risc_v_inst_addr_o),
.mem_rd_req_o(open_risc_v_mem_rd_req_o),
.mem_rd_addr_o(open_risc_v_mem_rd_addr_o),
.mem_rd_data_i(ram_rd_data_o),
.mem_wr_req_o(open_risc_v_mem_wr_req_o),
.mem_wr_sel_o(open_risc_v_mem_wr_sel_o),
.mem_wr_addr_o(open_risc_v_mem_wr_addr_o),
.mem_wr_data_o(open_risc_v_mem_wr_data_o)
);
assign led2 = open_risc_v_mem_wr_data_o[2];
ram ram_inst(
.clk(clk),
.rst(rst ),
.wen(open_risc_v_mem_wr_sel_o),
.w_addr_i(open_risc_v_mem_wr_addr_o),
.w_data_i(open_risc_v_mem_wr_data_o),
.ren(open_risc_v_mem_rd_req_o),
.r_addr_i(open_risc_v_mem_rd_addr_o),
.r_data_o(ram_rd_data_o)
);
rom rom_inst(
.clk(clk),
.rst(debug),
.wen(uart_debug_wen),//ins_write
.w_addr_i(uart_debug_addr_o),
.w_data_i(uart_debug_data_o),
.ren(1'b1),//ins_read
.r_addr_i(open_risc_v_inst_addr_o ),
.r_data_o(rom_inst_o )
);
uart_debug uart_debug_inst(
.clk(clk),
.debug(debug),
.uart_rxd(uart_rxd),
.ce(uart_debug_ce),
.wen(uart_debug_wen),
.addr_o(uart_debug_addr_o),
.data_o(uart_debug_data_o)
);
endmodule
```
### testing insts
```
module tb;
reg clk;
reg rst;
// 32-bit wires, so the full register values are visible (a plain wire is 1 bit wide)
wire[31:0] x3 = tb.open_risc_v_soc_inst.open_risc_v_inst.regs_inst.regs[3];
wire[31:0] x26 = tb.open_risc_v_soc_inst.open_risc_v_inst.regs_inst.regs[26];
wire[31:0] x27 = tb.open_risc_v_soc_inst.open_risc_v_inst.regs_inst.regs[27];
always #10 clk = ~clk;
initial begin
clk <= 1'b1;
rst <= 1'b0;
#30;
rst <= 1'b1;
end
//rom start_value
initial begin
$readmemh("./generated/rv32ui-p-lhu.txt",tb.open_risc_v_soc_inst.rom_inst.rom_32bit.dual_ram_template_isnt.memory);
end
//get wave
initial begin
$dumpfile("tb.vcd");
$dumpvars(0, tb);
end
integer r;
initial begin
wait(x26 == 32'b1);
#200;
if(x27 == 32'b1) begin
$display("############################");
$display("######## pass !!!#########");
$display("############################");
end
else begin
$display("############################");
$display("######## fail !!!#########");
$display("############################");
$display("fail testnum = %2d", x3);
for(r = 0;r < 32; r = r + 1)begin //dump all 32 registers, x0 to x31
$display("x%2d register value is %d",r,tb.open_risc_v_soc_inst.open_risc_v_inst.regs_inst.regs[r]);
end
end
$finish;
end
open_risc_v_soc open_risc_v_soc_inst(
.clk (clk),
.rst (rst)
);
endmodule
```
## SIM
### compile and sim
```
import os
import subprocess
import sys
def list_binfiles(path):
files = []
list_dir = os.walk(path)
for maindir, subdir, all_file in list_dir:
for filename in all_file:
apath = os.path.join(maindir, filename)
if apath.endswith('.bin'):
files.append(apath)
return files
def bin_to_mem(infile, outfile):
binfile = open(infile, 'rb')
binfile_content = binfile.read(os.path.getsize(infile))
datafile = open(outfile, 'w')
index = 0
b0 = 0
b1 = 0
b2 = 0
b3 = 0
for b in binfile_content:
if index == 0:
b0 = b
index = index + 1
elif index == 1:
b1 = b
index = index + 1
elif index == 2:
b2 = b
index = index + 1
elif index == 3:
b3 = b
index = 0
array = []
array.append(b3)
array.append(b2)
array.append(b1)
array.append(b0)
datafile.write(bytearray(array).hex() + '\n')
binfile.close()
datafile.close()
def compile():
rtl_dir = os.path.abspath(os.path.join(os.getcwd(), ".."))
iverilog_cmd = ['iverilog']
iverilog_cmd += ['-o', r'out.vvp']
iverilog_cmd += ['-I', rtl_dir + r'/rtl']
iverilog_cmd.append(rtl_dir + r'/tb/tb.v')
iverilog_cmd.append(rtl_dir + r'/rtl/defines.v')
iverilog_cmd.append(rtl_dir + r'/rtl/pc_reg.v')
iverilog_cmd.append(rtl_dir + r'/rtl/if_id.v')
iverilog_cmd.append(rtl_dir + r'/rtl/id.v')
iverilog_cmd.append(rtl_dir + r'/rtl/id_ex.v')
iverilog_cmd.append(rtl_dir + r'/rtl/ex.v')
iverilog_cmd.append(rtl_dir + r'/rtl/regs.v')
iverilog_cmd.append(rtl_dir + r'/rtl/ctrl.v')
# iverilog_cmd.append(rtl_dir + r'/rtl/ram.v')
iverilog_cmd.append(rtl_dir + r'/rtl/rom.v')
iverilog_cmd.append(rtl_dir + r'/rtl/ifetch.v')
iverilog_cmd.append(rtl_dir + r'/rtl/open_risc_v.v')
iverilog_cmd.append(rtl_dir + r'/utils/dff_set.v')
# iverilog_cmd.append(rtl_dir + r'/utils/dual_ram.v')
iverilog_cmd.append(rtl_dir + r'/tb/open_risc_v_soc.v')
process = subprocess.Popen(iverilog_cmd)
process.wait(timeout=5)
def sim():
compile()
vvp_cmd = [r'vvp']
vvp_cmd.append(r'out.vvp')
process = subprocess.Popen(vvp_cmd)
try:
process.wait(timeout=10)
except subprocess.TimeoutExpired:
print('!!!Fail, vvp exec timeout!!!')
def run(test_binfile):
rtl_dir = os.path.abspath(os.path.join(os.getcwd(), ".."))
out_mem = rtl_dir + r'/sim/generated/inst_data.txt'
bin_to_mem(test_binfile, out_mem)
sim()
if __name__ == '__main__':
sys.exit(run(sys.argv[1]))
```
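The core of `bin_to_mem` is a byte swap: the `.bin` file stores each instruction little-endian, while `$readmemh` expects one hex word per line. A more compact sketch of the same conversion, assuming the whole file fits in memory (the function name `bytes_to_hex_words` is made up for this example):

```python
def bytes_to_hex_words(data):
    """Turn little-endian bytes into 8-digit hex words, one per instruction."""
    words = []
    usable = len(data) - len(data) % 4       # trailing partial word is dropped
    for offset in range(0, usable, 4):
        word = int.from_bytes(data[offset:offset + 4], "little")
        words.append(f"{word:08x}")
    return words


# First two instructions of the addi test: li s10,0 and li s11,0
raw = bytes.fromhex("130d0000930d0000")
print("\n".join(bytes_to_hex_words(raw)))
```

Like the original loop, this ignores any bytes left over after the last complete 4-byte word.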
### test_all
```
import os
import subprocess
import sys
from compile_and_sim import compile
from compile_and_sim import list_binfiles
from compile_and_sim import sim
from compile_and_sim import bin_to_mem
def main():
rtl_dir = os.path.abspath(os.path.join(os.getcwd(), ".."))
all_bin_files = list_binfiles(rtl_dir + r'/sim/generated/')
for file_bin in all_bin_files:
cmd = r'python compile_and_sim.py' + ' ' + file_bin
f = os.popen(cmd)
r = f.read()
index = file_bin.index('-p-')
print_name = file_bin[index + 3:-4]
if r.find('pass') != -1:
print(' ins ' + print_name.ljust(10, ' ') + ' PASS')
else:
print('ins ' + print_name.ljust(10, ' ') + ' !!!FAIL!!!')
f.close()
if __name__ == '__main__':
main()
```
![](https://i.imgur.com/5lKB87Y.png)
### test_one_inst
```
from compile_and_sim import compile
from compile_and_sim import list_binfiles
from compile_and_sim import sim
from compile_and_sim import bin_to_mem
import os  # needed for os.path and os.getcwd below
import sys
def main(name='addi'):
rtl_dir = os.path.abspath(os.path.join(os.getcwd(), ".."))
all_bin_files = list_binfiles(rtl_dir + r'/sim/generated/')
for file in all_bin_files:
if file.find(name) != -1 and file.find('.bin') != -1:
test_binfile = file
out_mem = rtl_dir + r'/sim/generated/inst_data.txt'
# bin2mem
bin_to_mem(test_binfile, out_mem)
sim()
# get wave
# gtkwave_cmd = [r'gtkwave']
# gtkwave_cmd.append(r'tb.vcd')
# process = subprocess.Popen(gtkwave_cmd)
if __name__ == '__main__':
sys.exit(main(sys.argv[1]))
```
# The second version
## Purpose 1: Complete the mul, div, CSR, and other instructions, and add more SoC peripherals
## Purpose 2: Generate a bitstream to load onto the Arty A7-100T
Building on the first version, I added JTAG, a bus, UART, a timer, and a divider to implement the CSR, div, mul, and other instructions (but I haven't overcome the OVERFLOW problem yet).
I have published some snippets of the code to record my implementation process (I haven't slept well for a month...).
## STRUCTURE
![](https://i.imgur.com/XuBwv3T.png)
## RTL
## CORE
### pc_regs
![](https://i.imgur.com/nSI2cuP.png)
Its main job is to operate on the address signal of the instruction memory: reset, jump, pause (hold), and address increment. In other words, it produces the value of the PC register, which is used as the instruction memory address to read the instruction from the ROM.
```
always @ (posedge clk) begin
if (rst == 1'b0 || jtag_reset_flag_i == 1'b1) begin
pc_o <= 32'h0;
end else if (jump_flag_i == 1'b1) begin
pc_o <= jump_addr_i;
end else if (hold_flag_i >= 3'b001) begin
pc_o <= pc_o;
end else begin
pc_o <= pc_o + 4'h4;
end
end
```
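The `if`/`else if` chain above fixes a priority order: reset first, then jump, then hold, then the normal increment. A small Python sketch of that next-state function (the name `next_pc` is hypothetical; the arguments mirror the RTL signal names):

```python
MASK32 = 0xFFFFFFFF


def next_pc(pc, rst, jtag_reset_flag_i, jump_flag_i, jump_addr_i, hold_flag_i):
    """Value of pc_o after the next rising clock edge."""
    if rst == 0 or jtag_reset_flag_i == 1:   # active-low reset or JTAG reset
        return 0
    if jump_flag_i == 1:                     # a jump overrides a hold
        return jump_addr_i & MASK32
    if hold_flag_i >= 1:                     # any hold level freezes the PC
        return pc
    return (pc + 4) & MASK32                 # next sequential instruction


print(hex(next_pc(0x10, rst=1, jtag_reset_flag_i=0,
                  jump_flag_i=0, jump_addr_i=0, hold_flag_i=0)))
```

Because the jump test comes before the hold test, a jump still takes effect while the pipeline is being held.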
### rom
![](https://i.imgur.com/bOKbq94.png)
Its main function is to store the programmed instruction codes and output the instruction selected by the value of the PC register.
The storage is defined as a 32 × 4096 two-dimensional array.
Each entry holds one 32-bit instruction code, up to 4096 instruction codes can be stored, and the index into the 4096 entries is the address of the instruction code.
**When actually porting to an FPGA, pay attention to the resource capacity of the FPGA you use and adjust the size accordingly.**
```
module rom(
input wire clk,
input wire rst,
input wire we_i,// write enable
input wire[31:0] addr_i,// addr
input wire[31:0] data_i,
output reg[31:0] data_o// read data
);
reg[31:0] _rom[0:4095];//rom total-1
always @ (posedge clk) begin
if (we_i == 1'b1) begin
_rom[addr_i[31:2]] <= data_i;
end
end
always @ (*) begin
if (rst == 1'b0) begin
data_o = 32'h0;
end
else begin
data_o = _rom[addr_i[31:2]];
end
end
endmodule
```
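Two numbers are worth checking before resizing the ROM: how many bits the array needs, and which byte addresses still map inside it. The Python sketch below works both out for the 32 × 4096 array; the helper names are made up for illustration, and how the array maps onto FPGA memory resources depends on the tool and device, so this only gives the raw size.

```python
def memory_footprint(depth, width_bits):
    """Return (total bits, total bytes) of a depth x width memory array."""
    total_bits = depth * width_bits
    return total_bits, total_bits // 8


def rom_index(byte_addr, depth=4096):
    """Word index selected by addr_i[31:2], or None if it is out of range."""
    index = (byte_addr & 0xFFFFFFFF) >> 2    # drop the two byte-offset bits
    return index if index < depth else None


bits, size_bytes = memory_footprint(4096, 32)
print(bits, size_bytes)                      # raw storage needed by _rom
print(rom_index(0x3FFC), rom_index(0x4000))  # last valid word, first invalid one
```

With 4096 words the valid byte addresses run from 0x0000 to 0x3FFC; anything above that indexes outside `_rom`.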
### ex
![](https://i.imgur.com/oLbRve1.png)
1. Execute the operation that corresponds to the current instruction (addition, subtraction, multiplication, division, shift, etc.). For example, the add instruction adds the value of register 1 to the value of register 2.
2. If it is a jump instruction, issue a jump signal.
3. If it is a memory load instruction, read the memory data at the corresponding address.
```
always @ (*) begin//deal div inst
div_dividend_o = reg1_rdata_i;
div_divisor_o = reg2_rdata_i;
div_op_o = fun3;
div_reg_waddr_o = reg_waddr_i;
if ((opcode == 7'b0110011) && (fun7 == 7'b0000001)) begin
div_wenable = 1'b0;
div_wdata = 32'h0;
div_waddr = 32'h0;
case (fun3)
3'b100, 3'b101, 3'b110, 3'b111: begin
div_start = 1'b1;
div_j_flag = 1'b1;
div_h_flag = 1'b1;
div_j_addr = op1_j_add_op2_j_res;
end
default: begin
div_start = 1'b0;
div_j_flag = 1'b0;
div_h_flag = 1'b0;
div_j_addr = 32'h0;
end
endcase
end
else begin
div_j_flag = 1'b0;
div_j_addr = 32'h0;
if (div_busy_i == 1'b1) begin
div_start = 1'b1;
div_wenable = 1'b0;
div_wdata = 32'h0;
div_waddr = 32'h0;
div_h_flag = 1'b1;
end else begin
div_start = 1'b0;
div_h_flag = 1'b0;
if (div_ready_i == 1'b1) begin
div_wdata = div_result_i;
div_waddr = div_reg_waddr_i;
div_wenable = 1'b1;
end else begin
div_wenable = 1'b0;
div_wdata = 32'h0;
div_waddr = 32'h0;
end
end
end
end
```
### Division (this took me several nights...)
![](https://i.imgur.com/XCOFbjh.png)
![](https://i.imgur.com/5UN5Ry9.jpg)
I implemented this state machine in RTL and added it to the core. The biggest trouble I ran into along the way was interrupt control, which involves the bus, ex, and ctrl, because my model takes multiple cycles to run through the entire finite state machine.
Each division operation requires at least 39 clock cycles.
**important:**
For signed operations, negative operands are stored in two's complement, so they are first negated (invert all bits and add one). The purpose is to turn every negative number into a positive one before calculating, because the two's complement form carries a sign bit and cannot be fed directly into the shift-and-subtract loop. The raw result is therefore always non-negative. Finally, the signs of the dividend and the divisor decide whether the quotient has to be negated again (invert and add one); the remainder takes the sign of the dividend.
```
case(fsm_st)
FSM_IDEL:begin
if (start_i == 1'b1) begin
op_r <= op_i;
dividend <= dividend_i;
divisor <= divisor_i;
reg_waddr_o <=reg_waddr_i;
fsm_st <= FSM_START;
busy_o <= 1'b1;
end
else begin
op_r<=3'h0;
reg_waddr_o <=32'h0;
dividend <=32'h0;
divisor <=32'h0;
ready_o <= 1'b0;
result_o <=32'h0;
busy_o <= 1'b0;
end
end
FSM_START:begin
if (start_i==1'b1) begin
if (divisor==32'h0) begin
if (div_op|divu_op) begin
result_o<=32'hffffffff;
end
else begin
result_o<=dividend;
end
ready_o <=1'b1 ;
fsm_st <=FSM_IDEL;
busy_o <=1'b0 ;
end
else begin
busy_o <=1'h1 ;
ct <=32'h40000000 ;
fsm_st <=FSM_CALC ;
div_result <=32'h0 ;
div_remain <=32'h0 ;
if (div_op|rem_op) begin
if (dividend[31]==1'b1) begin
dividend<=dividend_neg;
minen<=dividend_neg[31];
end
else begin
minen<=dividend[31];
end
if (divisor[31]==1'b1) begin
divisor<=divisor_neg;
end
end
else begin
minen<=dividend[31];
end
if ((div_op && (dividend[31] ^ divisor[31] == 1'b1))||(rem_op && (dividend[31] == 1'b1))) begin
invert_result<=1'b1;
end
else begin
invert_result<=1'b0;
end
end
end
else begin
fsm_st<=FSM_IDEL;
result_o<=32'h0;
ready_o<=1'b0;
busy_o<=1'b0;
end
end
FSM_CALC:begin
if (start_i==1'b1) begin
dividend<={dividend[30:0], 1'b0};
div_result<=div_result_temp;
ct<={1'b0,ct[31:1]};
if (|ct) begin
minen<= {minen_temp[30:0], dividend[30]};
end
else begin
fsm_st<=FSM_END;
if (minen_divisor) begin
div_remain<=minen_sub_res;
end
else begin
div_remain<=minen;
end
end
end
else begin
fsm_st <= FSM_IDEL;
result_o<= 32'h0;
ready_o <= 1'b0;
busy_o <= 1'b0;
end
end
FSM_END:begin
if (start_i==1'b1) begin
ready_o<=1'b1;
fsm_st<=FSM_IDEL;
busy_o<=1'b0;
if (div_op|divu_op) begin
if (invert_result) begin
result_o<=(-div_result);
end
else begin
result_o<=div_result;
end
end
else begin
if (invert_result) begin
result_o<=(-div_remain);
end
else begin
result_o<=div_remain;
end
end
end
else begin
fsm_st<=FSM_IDEL;
result_o<=32'h0;
ready_o<=1'b0;
busy_o<=1'b0;
end
end
endcase
end
end
```
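The sign handling described above can be checked against a small software model. This is a sketch, not a cycle-accurate copy of my FSM: `restoring_divide` and `rv32_div` are names I made up, and the loop is a plain shift-and-subtract divider that produces one quotient bit per step. The divide-by-zero results follow the FSM_START branch above (all ones for div/divu, the dividend for rem/remu).

```python
MASK = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit value as a two's complement signed integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def restoring_divide(dividend, divisor):
    """Unsigned 32-bit shift-and-subtract division, one bit per iteration."""
    quotient, remainder = 0, 0
    for bit in range(31, -1, -1):
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder


def rv32_div(op, a, b):
    """op is 'div', 'divu', 'rem' or 'remu'; a and b are 32-bit register values."""
    a &= MASK
    b &= MASK
    if b == 0:
        # Division by zero: quotient is all ones, remainder is the dividend.
        return MASK if op in ('div', 'divu') else a
    signed = op in ('div', 'rem')
    neg_a = signed and bool(a & 0x80000000)
    neg_b = signed and bool(b & 0x80000000)
    # Negate negative operands (invert and add one) so the core loop
    # only ever sees magnitudes.
    mag_a = (-a) & MASK if neg_a else a
    mag_b = (-b) & MASK if neg_b else b
    q, r = restoring_divide(mag_a, mag_b)
    if op in ('div', 'divu'):
        return (-q) & MASK if neg_a != neg_b else q
    return (-r) & MASK if neg_a else r
```

For example, `to_signed(rv32_div('div', -7 & MASK, 2))` gives -3 and the matching `rem` gives -1, so quotient times divisor plus remainder still equals the dividend.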
### Interrupt
Interrupt types:
External interrupt: an interrupt generated by a peripheral, i.e. one that originates outside the processor core.
Timer interrupt (treated here as one of the external interrupts): enabled by the MTIE field in the mie register.
Software interrupt: an interrupt triggered by the software itself (for example from C code).
Debug interrupt: an interrupt raised while debugging.
Interrupt masking: the mie register enables or masks each type of interrupt (external, timer, software).
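As a rough illustration of masking, the sketch below combines the global enable bit (mstatus bit 3, the same `status[3]` used in csr_reg) with the per-source enable bits of mie. The bit positions for MSIE, MTIE and MEIE are the standard machine-mode ones from the RISC-V privileged specification; the function name `interrupt_taken` is hypothetical, and the clint code shown in this note only checks the global enable and `int_flag_i`, so treat this as the general rule rather than a copy of my RTL.

```python
MSTATUS_MIE = 1 << 3   # global machine interrupt enable
MIE_MSIE = 1 << 3      # machine software interrupt enable
MIE_MTIE = 1 << 7      # machine timer interrupt enable
MIE_MEIE = 1 << 11     # machine external interrupt enable


def interrupt_taken(mstatus, mie, pending):
    """Return True if at least one pending interrupt is enabled.

    `pending` uses the same bit layout as mie.
    """
    if not (mstatus & MSTATUS_MIE):
        return False  # everything is masked while the global enable is off
    return bool(mie & pending & (MIE_MSIE | MIE_MTIE | MIE_MEIE))
```

A pending timer interrupt is therefore only taken when both mstatus bit 3 and mie bit 7 are set.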
### ctrl
![](https://i.imgur.com/Wt9bnSL.png)
A jump changes the value of the PC register.
Because whether to jump is only known in the execution stage, the pipeline needs to be held when a jump is taken, so that the instructions already fetched behind the jump do not take effect.
```
//hold_flag[2:0]
module ctrl(
input wire rst,
// from ex
input wire jump_flag_i,
input wire[31:0] jump_addr_i,
input wire hold_flag_ex_i,
// from rib
input wire hold_flag_rib_i,
// from jtag
input wire jtag_halt_flag_i,
// from clint
input wire hold_flag_clint_i,
output reg[2:0] hold_flag_o,
// to pc_reg
output reg jump_flag_o,
output reg[31:0] jump_addr_o
);
always @ (*) begin
jump_addr_o = jump_addr_i;
jump_flag_o = jump_flag_i;
hold_flag_o = 3'b000;
if (jump_flag_i == 1'b1 || hold_flag_ex_i == 1'b1 || hold_flag_clint_i == 1'b1) begin
hold_flag_o = 3'b011;
end else if (hold_flag_rib_i == 1'b1) begin
hold_flag_o = 3'b001;
end else if (jtag_halt_flag_i == 1'b1) begin
hold_flag_o = 3'b011;
end else begin
hold_flag_o = 3'b000;
end
end
endmodule
```
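The priority chain in the ctrl module can be mirrored in a few lines of Python. This sketch only reproduces the if/else order above and returns the raw 3-bit code; the function name `hold_flag` is mine, and I do not assign a meaning to each code here because that lives in the pipeline registers that consume `hold_flag_o`.

```python
def hold_flag(jump, hold_ex, hold_rib, jtag_halt, hold_clint):
    """Mirror the if/else chain of the ctrl module and return hold_flag_o."""
    if jump or hold_ex or hold_clint:
        return 0b011
    if hold_rib:
        return 0b001
    if jtag_halt:
        return 0b011
    return 0b000
```

Note that a bus hold wins over a jtag halt when both are asserted, simply because it comes first in the chain.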
### csr_reg
![](https://i.imgur.com/hzkQhtJ.png)
Machine Trap Vector (mtvec) = t_vec
Machine Exception Cause (mcause)
Machine Exception PC (mepc)
Machine Status (mstatus)
Write register: according to the low 12 bits of the write address, store the data coming from the ex or clint module into the corresponding control and status register (CSR).
```
always @(posedge clk) begin//what kind of reg we need to write
if (rst==1'b0) begin
t_vec<=32'h0;
cause<=32'h0;
epc<=32'h0;
mei<=32'h0;
status<=32'h0;
scratch<=32'h0;
end
else begin
if (we_i==1'b1) begin
case (waddr_i[11:0])
12'h305:begin
t_vec<=data_i;
end
12'h342:begin
cause<=data_i;
end
12'h341:begin
epc<=data_i;
end
12'h304:begin
mei<=data_i;
end
12'h300:begin
status<=data_i;
end
12'h340:begin
scratch<=data_i;
end
default:begin
end
endcase
end
else if(clint_we_i == 1'b1) begin
case (clint_waddr_i[11:0])
12'h305:begin
t_vec<=clint_data_i;
end
12'h342:begin
cause<=clint_data_i;
end
12'h341:begin
epc<=clint_data_i;
end
12'h304:begin
mei<=clint_data_i;
end
12'h300:begin
status<=clint_data_i;
end
12'h340:begin
scratch<=clint_data_i;
end
default:begin
end
endcase
end
end
end
```
Read register (combinational logic): the read address comes from the decode (id) module, and the data read from the register is sent back to the id module (selected by the low 12 bits of the read address).
A second read path takes its address from the interrupt (clint) module, and the data read is sent to the clint module (again selected by the low 12 bits of the read address).
```
assign global_int_en_o=(status[3]==1'b1)?1'b1:1'b0;
assign clint_csr_mtvec = t_vec ;
assign clint_csr_mepc = epc ;
assign clint_csr_mstatus = status ;
```
### clint
![](https://i.imgur.com/68uC8hF.png)
RISC-V interrupts are divided into two types here. Synchronous interrupts are generated by instructions such as ECALL and EBREAK; asynchronous interrupts are generated by peripherals such as GPIO and UART.
When an interrupt (or an interrupt return) is detected, first hold the entire pipeline and set the jump address to the interrupt entry address, then read and write the necessary CSR registers (mstatus, mepc, mcause, etc.). Once these CSR accesses are done, release the pipeline so that the processor can fetch instructions from the interrupt entry address and enter the interrupt service routine.
1. Priority
Synchronous interrupt > asynchronous interrupt > interrupt return
2. CSR register state machine jump
Latch the interrupt return address and the code that caused the interrupt, and advance the CSR state machine.
3. Write CSR registers (status, epc, cause)
First write the interrupt return address to mepc.
Then write mstatus to turn off the global interrupt enable.
Then write the interrupt exception code to the mcause register.
On interrupt return, the global interrupt enable bit has to be restored at the same time (status[3]=status[7]).
4. Send interrupt signal to ex
int_assert_o: interrupt valid signal; when it is 1, the interrupt handler starts to run.
int_flag_i: the interrupt flag signal of the timer interrupt.
```
// finite state machine definitions
localparam INT_IDLE = 4'b0001;
localparam INT_SYNC = 4'b0010;
localparam INT_ASYNC = 4'b0100;
localparam INT_MERT = 4'b1000;
// CSR register state definitions
localparam CSR_IDEL =5'b00001;
localparam CSR_STAT =5'b00010;
localparam CSR_MEPC =5'b00100;
localparam CSR_STMT =5'b01000;
localparam CSR_CAUS =5'b10000;
```
```
always @(*) begin// interrupt control
if(rst==1'b0)begin
int_st=INT_IDLE;
end
else begin
if(inst_i==32'h73||inst_i==32'h00100073)begin
if(div_started_i==1'b0)begin
int_st=INT_SYNC;
end
else begin
int_st=INT_IDLE;
end
end
else if(int_flag_i!=8'h0&&global_int_en_i==1'b1)begin
int_st=INT_ASYNC;
end
else if(inst_i==32'h30200073)begin
int_st=INT_MERT;
end
else begin
int_st=INT_IDLE;
end
end
end
```
```
always @(posedge clk) begin// CSR finite state machine control
if (rst==1'b0) begin
csr_st <=CSR_IDEL;
cause <= 32'h0;
int_addr <= 32'h0;
end
else begin
case(csr_st)
CSR_IDEL:begin
if (int_st==INT_SYNC) begin
csr_st<=CSR_MEPC;
if (jump_flag_i==1'b1) begin
int_addr<=jump_addr_i-4'h4;
end
else begin
int_addr<=inst_addr_i;
end
case(inst_i)
32'h73:begin
cause<=32'd11;
end
32'h00100073:begin
cause<=32'd3;
end
default:begin
cause<=32'd10;
end
endcase
end
else if (int_st==INT_ASYNC) begin
cause<=32'h80000004;
csr_st<=CSR_MEPC;
if (jump_flag_i==1'b1) begin
int_addr<=jump_addr_i;
end
else if(div_started_i==1'b1) begin
int_addr<=inst_addr_i-4'h4;
end
else begin
int_addr<=inst_addr_i;
end
end
else if (int_st==INT_MERT) begin
csr_st<=CSR_STMT;
end
end
CSR_STAT:begin
csr_st<=CSR_CAUS;
end
CSR_MEPC:begin
csr_st<=CSR_STAT;
end
CSR_STMT:begin
csr_st<=CSR_IDEL;
end
CSR_CAUS:begin
csr_st<=CSR_IDEL;
end
default:begin
csr_st<=CSR_IDEL;
end
endcase
end
end
```
```
// synchronous interrupt & asynchronous interrupt & send signal to the EX module
if (rst == 1'b0) begin
we_o <= 1'b0;
waddr_o <= 32'h0;
data_o <= 32'h0;
end
else begin
case(csr_st)// the interrupt return address needs IMM_PC_ADDR+4
CSR_MEPC:begin
we_o<=1'b1;
waddr_o<={20'h0,12'h341};
data_o<=int_addr;
end
CSR_CAUS:begin
we_o<=1'b1;
waddr_o<={20'h0,12'h342};
data_o<=cause;
end
CSR_STAT:begin
we_o<=1'b1;
waddr_o<={20'h0,12'h300};
data_o<={csr_mstatus[31:4],1'b0,csr_mstatus[2:0]};
end
CSR_STMT:begin
we_o<=1'b1;
waddr_o<={20'h0,12'h300};
data_o<={csr_mstatus[31:4],csr_mstatus[7],csr_mstatus[2:0]};
end
case(csr_st)
CSR_CAUS:begin
int_assert_o<=1'b1;
int_addr_o<=csr_mtvec;
end
CSR_STMT:begin
int_assert_o<=1'b1;
int_addr_o<=csr_mepc;
end
```
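To double check the bit manipulation in the mstatus writes, here is a small Python model of the CSR values that get written on trap entry and on interrupt return. The CSR addresses are the ones used in the code above (0x300, 0x341, 0x342); the function names `trap_entry_writes` and `mret_mstatus` are hypothetical, and the write order simply follows the steps listed in the text.

```python
CSR_MSTATUS = 0x300
CSR_MEPC = 0x341
CSR_MCAUSE = 0x342
MASK = 0xFFFFFFFF


def trap_entry_writes(mstatus, return_addr, cause):
    """List of (csr address, value) writes for entering the handler."""
    return [
        (CSR_MEPC, return_addr & MASK),
        # {mstatus[31:4], 1'b0, mstatus[2:0]}: clear the global enable bit
        (CSR_MSTATUS, mstatus & ~(1 << 3) & MASK),
        (CSR_MCAUSE, cause & MASK),
    ]


def mret_mstatus(mstatus):
    """{mstatus[31:4], mstatus[7], mstatus[2:0]}: copy bit 7 into bit 3."""
    bit7 = (mstatus >> 7) & 1
    return (mstatus & ~(1 << 3) & MASK) | (bit7 << 3)
```

For a timer interrupt the cause value would be 0x80000004, the same constant the CSR state machine loads in the INT_ASYNC branch.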
### registers
regs
![](https://i.imgur.com/Nqk2en6.png)
Temporary data storage for decoding and execution
A register file regs with a width of 32 bits and a depth of 32 entries is defined in the program.
```
reg[31:0] regs[0:32 - 1];
```
1. Write register: store the data from ex or jtag into the register file regs
```
always @ (posedge clk) begin
if (rst == 1'b1) begin
if ((we_i == 1'b1) && (waddr_i != 5'h0)) begin
regs[waddr_i] <= wdata_i;
end else if ((jtag_we_i == 1'b1) && (jtag_addr_i != 5'h0)) begin
regs[jtag_addr_i] <= jtag_data_i;
end
end
end
```
2. Read register (combinational logic):
The read address comes from the decode (id) module, and the data read from the register is sent back to the id module (regs).
The read address comes from the jtag module, and the data read from the register is sent to the jtag module (jtag read register).
```
always @ (*) begin//we==write_enable
if (raddr1_i == 5'h0) begin
r_data1_o = 32'h0;
end
else if (raddr1_i == waddr_i && we_i == 1'b1) begin
r_data1_o = wdata_i;
end
else begin
r_data1_o = regs[raddr1_i];
end
end
always @ (*) begin
if (raddr2_i == 5'h0) begin
r_data2_o = 32'h0;
end else if (raddr2_i == waddr_i && we_i == 1'b1) begin
r_data2_o = wdata_i;
end
else begin
r_data2_o = regs[raddr2_i];
end
end
always @ (*) begin
if (jtag_addr_i == 5'h0) begin
jtag_data_o = 32'h0;
end
else begin
jtag_data_o = regs[jtag_addr_i];
end
end
```
---
Because of the pipeline, while the current instruction is in the execution stage, the next instruction is already in the decoding stage. The execution stage does not write the register immediately; the register write only happens when the next clock edge arrives.
If the instruction in the decoding stage needs the result of the previous instruction, the register value it reads at this point is stale.
For example, take the two instructions add x1, x2, x3 and add x4, x1, x5: the second instruction depends on the result of the first. To solve this problem, if the register being read is the same as the register being written, the value about to be written is returned directly to the read port.
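The bypass described above can be sketched in Python as follows. This is only a behavioural model of the read logic in regs (x0 always reads zero, a pending write to the same register is forwarded); the class and method names are mine, not part of the RTL.

```python
class RegFile:
    """Behavioural model of a 32 x 32-bit register file with write bypass."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, raddr, we=False, waddr=0, wdata=0):
        """Combinational read; we/waddr/wdata describe the pending write."""
        if raddr == 0:
            return 0                   # x0 is hard-wired to zero
        if we and raddr == waddr:
            return wdata & 0xFFFFFFFF  # forward the value about to be written
        return self.regs[raddr]

    def clock(self, we, waddr, wdata):
        """Rising clock edge: commit the pending write (never to x0)."""
        if we and waddr != 0:
            self.regs[waddr] = wdata & 0xFFFFFFFF
```

With x2 = 5 and x3 = 7, the first add has a pending write of 12 to x1, so the second add already reads 12 for x1 in the decode stage even though `regs[1]` is still 0 until the next clock edge.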
### databus
Suppose every peripheral has its own address bus and data bus and there are N peripherals in total. The processor core would then need N address buses and N data buses, and every additional peripheral would require a (not so small) change to the core code.
With a shared bus, the processor core needs only one address bus and one data bus, which greatly simplifies the connection between the processor core and the peripherals.
1. Bus arbitration mechanism
First, each master sends an access request req to the bus; the bus then grants access according to a fixed priority that we assign to each master based on need.
The master is selected in priority order with an if/else chain and then performs its access.
The priority order of the masters is: uart serial port download, ex.v execution module, jtag module, pc_reg fetch module.
Why this order?
2. uart program download
Since the program itself is being replaced, it does not matter which step the current program has reached.
There is no need to consider requests from other modules: download directly and re-run the new program (the pipeline must be held).
3. ex.v execution module (memory read and write requests)
Unless new program code is being downloaded, i.e. as long as the program stays unchanged, the current instruction must be allowed to complete so that subsequent operations do not go wrong (the pipeline must be held).
4. jtag module
Once the previous instruction has finished, the jtag debug module can modify debug parameters and control program execution, including instruction fetch, because setting a breakpoint during debugging may suspend instruction fetch (the pipeline must be held).
5. pc_reg instruction fetch module
Instruction fetch is the first step for everything above and serves the masters listed above, so it has the lowest priority.
### riscv_bus
![](https://i.imgur.com/Egm3i8h.png)
Select the slave device that needs to be operated through the case statement, and then pass the write_enable of the master to the slave to be written.
The bus supports multi-master and multi-slave connections, but only supports one master and one slave communication at the same time. A fixed priority arbitration mechanism is adopted between each master device on the RIB bus.
The highest 4 bits of the bus address determine which slave device to access, so up to 16 slave devices are supported.
```
master_addr_i[31:28]
```
```
case (slave_needed)
grant0: begin
case (m0_addr_i[31:28])
slave_0: begin
s0_we_o = m0_we_i;
s0_addr_o = {{4'h0}, {m0_addr_i[27:0]}};
s0_data_o = m0_data_i;
m0_data_o = s0_data_i;
end
slave_1: begin
s1_we_o = m0_we_i;
s1_addr_o = {{4'h0}, {m0_addr_i[27:0]}};
s1_data_o = m0_data_i;
m0_data_o = s1_data_i;
end
slave_2: begin
s2_we_o = m0_we_i;
s2_addr_o = {{4'h0}, {m0_addr_i[27:0]}};
s2_data_o = m0_data_i;
m0_data_o = s2_data_i;
end
slave_3: begin
s3_we_o = m0_we_i;
s3_addr_o = {{4'h0}, {m0_addr_i[27:0]}};
s3_data_o = m0_data_i;
m0_data_o = s3_data_i;
end
slave_4: begin
s4_we_o = m0_we_i;
s4_addr_o = {{4'h0}, {m0_addr_i[27:0]}};
s4_data_o = m0_data_i;
m0_data_o = s4_data_i;
end
slave_5: begin
s5_we_o = m0_we_i;
s5_addr_o = {{4'h0}, {m0_addr_i[27:0]}};
s5_data_o = m0_data_i;
m0_data_o = s5_data_i;
end
default: begin
end
endcase
end
```
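The two decisions the bus makes (which master is granted, and which slave the address selects) can be sketched like this. It is a simplified Python model, assuming that the request list is ordered from highest to lowest priority; the function names `arbitrate` and `decode` are hypothetical.

```python
def arbitrate(requests):
    """Fixed-priority arbitration.

    `requests` is a list of booleans, index 0 being the highest priority
    master (an assumption of this sketch). Returns the granted index or None.
    """
    for master, req in enumerate(requests):
        if req:
            return master
    return None


def decode(addr):
    """Split a 32-bit address into (slave index, 28-bit offset).

    The top 4 bits select the slave, matching master_addr_i[31:28]; the slave
    sees the address with those bits zeroed, like {4'h0, addr[27:0]}.
    """
    addr &= 0xFFFFFFFF
    return addr >> 28, addr & 0x0FFFFFFF
```

For example, address 0x30000004 selects slave 3 with offset 0x4, and with four address bits there can be at most 16 slaves.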
## perips
### [GPIO](http://wiki.csie.ncku.edu.tw/embedded/GPIO)
![](https://i.imgur.com/E1bTj5W.png)
Every 2 bits control 1 IO mode, supporting up to 16 IOs
0: high impedance, 1: output, 2: input
Step1: First design two registers: gpio_ctrl (control GPIO input and output mode); gpio_data (store GPIO input or output data).
```
reg[31:0] gpio_ctrl;
reg[31:0] gpio_data;
```
Step2: Plan addresses for these two registers.
```
localparam CTRL = 4'h0;
localparam DATA = 4'h4;
```
Step3: Through register addressing, write to the two registers defined above, and realize the input and output of GPIO by configuring the gpio_ctrl register.
```
always @ (posedge clk) begin
if (rst == 1'b0) begin
gpio_data <= 32'h0;
gpio_ctrl <= 32'h0;
end else begin
if (we_i == 1'b1) begin
case (addr_i[3:0])
CTRL: begin
gpio_ctrl <= data_i;
end
DATA: begin
gpio_data <= data_i;
end
endcase
end else begin
if (gpio_ctrl[1:0] == 2'b10) begin
gpio_data[0] <= iopin_i[0];
end
if (gpio_ctrl[3:2] == 2'b10) begin
gpio_data[1] <= iopin_i[1];
end
end
end
end
always @ (*) begin
if (rst == 1'b0) begin
data_o = 32'h0;
end else begin
case (addr_i[3:0])
CTRL: begin
data_o = gpio_ctrl;
end
DATA: begin
data_o = gpio_data;
end
default: begin
data_o = 32'h0;
end
endcase
end
end
```
Note that the following concepts need to be kept in mind when simulating TOP
```
gpio[0] = (gpio_ctrl[1:0] == 2'b01)? gpio_data[0]: 1'bz;
```
When the configuration field gpio_ctrl[1:0] is 1, the GPIO is in output mode and gpio_data[0] is driven onto the corresponding IO pin. If gpio_ctrl[1:0] is not 1, it is 0 or 2, which correspond to high-impedance mode and input mode; in both cases the GPIO pin is set to the high-impedance state, for the following reason:
High impedance is a common term in digital circuits. It is an output state that is neither high nor low, so the pin has the same effect on the circuit as if it were not connected. If you measure it with a multimeter it may read high or low, depending on what is connected behind it.
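A quick way to convince yourself of the 2-bits-per-pin layout is a tiny Python model. The function names `gpio_mode` and `pin_drive` are mine, and the string 'z' is just this sketch's stand-in for the Verilog high-impedance value.

```python
def gpio_mode(gpio_ctrl, pin):
    """Extract the 2-bit mode field of a pin (0: high-Z, 1: output, 2: input)."""
    return (gpio_ctrl >> (2 * pin)) & 0b11


def pin_drive(gpio_ctrl, gpio_data, pin):
    """Value driven onto the pin: 0 or 1 in output mode, otherwise 'z'."""
    if gpio_mode(gpio_ctrl, pin) == 1:
        return (gpio_data >> pin) & 1
    return 'z'
```

With gpio_ctrl = 0b1001, pin 0 is an output and pin 1 is an input, so only pin 0 is driven by gpio_data.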
### SPI
[wiki](https://en.wikipedia.org/wiki/Serial_Peripheral_Interface)
[youtube](https://www.youtube.com/watch?v=TR0Pw89EuGk)
![](https://i.imgur.com/sv4zSrv.png)
The SPI protocol specifies 4 logical signal interfaces:
SCLK (Serial Clock, will be issued by the master)
MOSI (Master Out, Slave In)
MISO (Master In, Slave Out)
CS (Chip Select, because a master can communicate with several slaves, so CS is needed to select the slave to communicate with, and usually CS is enabled at low potential)
![](https://i.imgur.com/0PUKJd6.png)
step1: generate the enable signal en (set by spi_ctrl[0], cleared when the transfer is done)
```
always @ (posedge clk) begin
if (rst == 1'b0) begin
en <= 1'b0;
end
else begin
if (spi_ctrl[0] == 1'b1) begin
en <= 1'b1;
end else if (done == 1'b1) begin
en <= 1'b0;
end else begin
en <= en;
end
end
end
```
step2: clock divider, count clk cycles
```
always @ (posedge clk) begin
if (rst == 1'b0) begin
clk_cnt <= 9'h0;
end
else if (en == 1'b1) begin
if (clk_cnt == div_cnt) begin
clk_cnt <= 9'h0;
end
else begin
clk_cnt <= clk_cnt + 1'b1;
end
end
else begin
clk_cnt <= 9'h0;
end
end
```
step3: count SPI_CLK edges
```
always @ (posedge clk) begin
if (rst == 1'b0) begin
spi_clk_cnt <= 5'h0;
spi_clk_level <= 1'b0;
end
else if (en == 1'b1) begin
if (clk_cnt == div_cnt) begin
if (spi_clk_cnt == 5'd17) begin
spi_clk_cnt <= 5'h0;
spi_clk_level <= 1'b0;
end
else begin
spi_clk_cnt <= spi_clk_cnt + 1'b1;
spi_clk_level <= 1'b1;
end
end
else begin
spi_clk_level <= 1'b0;
end
end
else begin
spi_clk_cnt <= 5'h0;
spi_clk_level <= 1'b0;
end
end
```
step4:write regs
```
always @ (posedge clk) begin
if (rst == 1'b0) begin
spi_ctrl <= 32'h0;
spi_data <= 32'h0;
spi_status <= 32'h0;
end else begin
spi_status[0] <= en;
if (we_i == 1'b1) begin
case (addr_i[3:0])
SPI_CTRL: begin
spi_ctrl <= data_i;
end
SPI_DATA: begin
spi_data <= data_i;
end
default: begin
end
endcase
end
else begin
spi_ctrl[0] <= 1'b0;
if (done == 1'b1) begin
spi_data <= {24'h0, rdata};
end
end
end
end
```
### timer
[youtube](https://www.youtube.com/watch?v=qQdZrY5mhkU)
![](https://i.imgur.com/nlAbdAo.png)
Step1: Define three registers
1. Control register: CTRL=4'h0
2. Counting threshold register: VALUE=4'h4
3. Current count value register (read-only): COUNT=4'h8 (named CT in the code)
Step2: register read & write
counter
```
// counter
always @ (posedge clk) begin
if (rst == 1'b0) begin
t_ct <= 32'h0;
end
else begin
if (t_ctrl[0] == 1'b1) begin
t_ct <= t_ct + 1'b1;
if (t_ct >= t_val) begin
t_ct <= 32'h0;
end
end
else begin
t_ct <= 32'h0;
end
end
end
```
R&W
```
always @ (*) begin
if (rst == 1'b0) begin
data_o = 32'h0;
end
else begin
case (addr_i[3:0])
VALUE: begin
data_o = t_val;
end
CTRL: begin
data_o = t_ctrl;
end
CT: begin
data_o = t_ct;
end
default: begin
data_o = 32'h0;
end
endcase
end
end
```
```
always @ (posedge clk) begin
if (rst == 1'b0) begin
t_ctrl <= 32'h0;
t_val <= 32'h0;
end
else begin
if (we_i == 1'b1) begin
case (addr_i[3:0])
CTRL: begin
t_ctrl <= {data_i[31:3], (t_ctrl[2] & (~data_i[2])), data_i[1:0]};
end
VALUE: begin
t_val <= data_i;
end
endcase
end
else begin
if ((t_ctrl[0] == 1'b1) && (t_ct >= t_val)) begin
t_ctrl[0] <= 1'b0;
t_ctrl[2] <= 1'b1;
end
end
end
end
```
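Putting the counter and the control register together, the timer behaves roughly like the Python model below. It is a sketch of the two always blocks above for the case where no bus write happens in the same cycle; the class name `Timer` is mine, and I only model bit 0 (enable) and bit 2 (the flag set when the count reaches the threshold) of the control register.

```python
class Timer:
    """Behavioural model of the timer registers (no bus writes modelled)."""

    def __init__(self):
        self.ctrl = 0    # bit 0: enable, bit 2: set when the threshold is hit
        self.value = 0   # counting threshold (t_val)
        self.count = 0   # current count (t_ct)

    def tick(self):
        """One rising clock edge."""
        if self.ctrl & 1:
            if self.count >= self.value:
                self.count = 0
                self.ctrl = (self.ctrl & ~1) | 0b100  # stop and raise the flag
            else:
                self.count += 1
        else:
            self.count = 0
```

With a threshold of 3, the count goes 1, 2, 3 and on the fourth tick it wraps to 0 while the enable bit is cleared and bit 2 is set.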
### uart
![](https://i.imgur.com/X74Nlt8.png)
![](https://i.imgur.com/AkzkzGV.png)
[EXP](https://nandland.com/uart-serial-port-module/)
1. UART stands for Universal Asynchronous Receiver Transmitter.
2. Synchronous serial communication requires both sides to transfer data under the control of the same clock; in asynchronous serial communication each side uses its own clock to control sending and receiving.
3. A UART frame consists of 4 parts: start bit, data bits, parity bit, and stop bit.
The speed of serial communication is expressed as the baud rate, which is the number of binary bits transmitted per second, in bps.
Next, TX sending:
```
always @ (posedge clk) begin
if (rst == 1'b0) begin
FSM_ <= FSM_IDLE;
cycle_count <= 16'd0;
tx_reg <= 1'b0;
bit_count <= 4'd0;
tx_data_ready <= 1'b0;
end
else begin
if (FSM_ == FSM_IDLE) begin
tx_reg <= 1'b1;
tx_data_ready <= 1'b0;
if (tx_data_valid == 1'b1) begin
FSM_ <= FSM_START;
cycle_count <= 16'd0;
bit_count <= 4'd0;
tx_reg <= 1'b0;
end
end
else begin
cycle_count <= cycle_count + 16'd1;
if (cycle_count == uart_baud[15:0]) begin
cycle_count <= 16'd0;
case (FSM_)
FSM_START: begin
tx_reg <= tx_data[bit_count];
FSM_ <= FSM_SEND_BYTE;
bit_count <= bit_count + 4'd1;
end
FSM_SEND_BYTE: begin
bit_count <= bit_count + 4'd1;
if (bit_count == 4'd8) begin
FSM_ <= FSM_STOP;
tx_reg <= 1'b1;
end
else begin
tx_reg <= tx_data[bit_count];
end
end
FSM_STOP: begin
tx_reg <= 1'b1;
FSM_ <= FSM_IDLE;
tx_data_ready <= 1'b1;
end
endcase
end
end
end
end
```
RX reception (partial)
```
assign rx_neg_edge = rx_q1 && ~rx_q0;
always @ (posedge clk) begin
if (rst == 1'b0) begin
rx_q0 <= 1'b0;
rx_q1 <= 1'b0;
end
else begin
rx_q0 <= rx_pin;
rx_q1 <= rx_q0;
end
end
always @ (posedge clk) begin
if (rst == 1'b0) begin
rx_start <= 1'b0;
end
else begin
if (uart_ctrl[1]) begin
if (rx_neg_edge) begin
rx_start <= 1'b1;
end
else if (rx_clk_count == 4'd9) begin
rx_start <= 1'b0;
end
end else begin
rx_start <= 1'b0;
end
end
end
```
Specific process:
a. While the transmitter is idle (not sending data), the TX line is held at 1, as the protocol requires. When there is valid data to send (the C code writes the data to be sent into the UART_TXDATA register), the transmitter sends the start bit 0 for one bit period.
b. The threshold of the clock divider counter is set from the agreed baud rate. The data bits are sent low bit first, high bit last. After the data, the TX line is set to 1, which is the stop bit in the frame, and the corresponding bit of the status register is updated: UART_STATUS[0] <= 0.
c. Wait for the next transmission (that is, the next data valid signal).
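The process above can be written down as a short Python sketch that builds the line levels of one frame. It assumes the frame that the TX state machine actually sends (start bit, 8 data bits low bit first, stop bit, no parity); `uart_frame` and `cycles_per_bit` are hypothetical helper names, and the 50 MHz clock in the usage note is only an example value, not necessarily the clock of my SoC.

```python
def uart_frame(byte):
    """Line levels for one frame: start bit 0, 8 data bits LSB first, stop bit 1."""
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]


def cycles_per_bit(clk_hz, baud):
    """Number of system clock cycles that one bit stays on the line."""
    return round(clk_hz / baud)
```

For instance, with an assumed 50 MHz clock and 115200 baud each bit lasts about 434 clock cycles, and sending 0x55 produces the alternating pattern 0,1,0,1,0,1,0,1,0,1 on the TX line.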
tips (from my friend)
The serial input and output pins of the FPGA use TTL-style levels: 3.3V represents logic "1" and 0V represents logic "0". The computer serial port uses RS-232 levels, which are negative logic.
That is, -15V~-5V represents logic "1", and +5V~+15V represents logic "0". Therefore, a level conversion chip is needed when the computer communicates with the FPGA.
## SIM
![](https://i.imgur.com/bS4d420.png)
### test_all_inst.py
find all bin files
```
import os
import subprocess
import sys
def list_binfiles(path):
    files = []
    list_dir = os.walk(path)
    for maindir, subdir, all_file in list_dir:
        for filename in all_file:
            apath = os.path.join(maindir, filename)
            if apath.endswith('.bin'):
                files.append(apath)
    return files
```
test all bin files
```
def main():
    bin_files = list_binfiles(r'../tests/isa/generated')
    anyfail = False
    for file in bin_files:
        # run the simulation script on each bin file and capture its output
        cmd = r'python new_nw.py' + ' ' + file + ' ' + 'inst.data'
        f = os.popen(cmd)
        r = f.read()
        f.close()
        if (r.find('TEST_PASS') != -1):
            print(file + ' nlnlsofun')
        else:
            print(file + ' !!! Off to bear jail, because you failed !!!')
            anyfail = True
            break
    if (anyfail == False):
        print('Darn bear, padding your hours again, All PASS...')
if __name__ == '__main__':
    sys.exit(main())
```
### step2 new_nw.py
turn bin files to mem files
```
cmd = r'python ../tools/Bin2Mem_CLI.py' + ' ' + sys.argv[1] + ' ' + sys.argv[2]
f = os.popen(cmd)
f.close()
```
compile rtl files
### step3 use Iverilog
```
def main():
    rtl_dir = sys.argv[1]
    if rtl_dir != r'..':
        tb_file = r'/tb/compliance_test/cwwppb_soc_tb.v'
    else:
        tb_file = r'/tb/cwwppb_soc_tb.v'
    # iverilog process
    iverilog_cmd = ['iverilog']
    ...
    ...
    ...
```
## most of the trouble
### 1.define is not necessarily very convenient, the risc_v manual is your good friend
When you need many "types" of names that all map to the same VALUE, coding becomes very inconvenient. You have to keep clicking on the editor hints, and then you go crazy wondering why wire & reg were designed like this.
### 2.[latch](https://zh.wikipedia.org/zh-hant/%E9%94%81%E5%AD%98%E5%99%A8)
As a novice, I have never learned logic design. I remember that I was stuck for 14 hours on the fourth day because of a wrong judgment.
EXP
```
always @(al or b)
begin
if(al) q <= b;
end
```
1. In this "always" block, the if statement ensures that q takes the value of b only when al = 1. The code does not say what to do when al = 0, so what happens then? The variable q keeps its original value, which means a latch is generated.
2. Improper use of case statements (where I got stuck)
A latch is also generated when the default item is missing from a case statement.
The function of a case statement is to assign different values to a signal (q in this example) as another signal (sel in this example) takes different values. For example, when sel=00, q takes the value of a, and when sel=11, q takes the value of b.
What is not specified in such an example is which value q gets if sel takes a value other than 00 or 11. In Verilog HDL the default is to keep the original value of q, which automatically generates a latch.
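The same effect can be imitated in software, which helped me see why the missing default means "memory". This is only a Python analogy, not Verilog: `make_latch_like_case` keeps hidden state because some values of sel assign nothing, while `comb_case` always returns a value and therefore needs no storage. Both names are made up for this sketch.

```python
def make_latch_like_case():
    """Case statement without a default: q must remember its old value."""
    state = {'q': 0}

    def step(sel, a, b):
        if sel == 0b00:
            state['q'] = a
        elif sel == 0b11:
            state['q'] = b
        # no default branch: for sel = 01 or 10, q silently keeps its old value
        return state['q']

    return step


def comb_case(sel, a, b, default=0):
    """Case statement with a default: the output depends on the inputs only."""
    if sel == 0b00:
        return a
    if sel == 0b11:
        return b
    return default
```

Calling `step(0b00, 1, 0)` and then `step(0b01, 0, 0)` still returns 1 the second time, which is exactly the storage element the synthesizer has to add.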
### 3. **vivado is super invincible and difficult to use**, modelsim can be very good for you to test normally.
### 4. Do not directly apply the board file as your project mode
I use the part name starting with xc7a100... to write my xdc file
![](https://i.imgur.com/S8KBm2a.png)
### 5. When designing a finite state machine, be sure to search for information from multiple sources
When I was writing the division module, my logic had a big problem because I was following a guide from a strange Chinese website, until my NTU EE friend told me that I would not finish it in 20,000 years that way (I was trying to solve it by multiplying divisors... super dumb)
### 6.J_TAG VS UART
[here is the answer look it properly](https://www.quora.com/What-is-the-difference-between-JTAG-and-UART)
### 7.If you want to know something about a board, go read the datasheet!
When generating the bitstream, I always thought I had enough IO ports, until I read the datasheet...
[arty a7 100T](https://digilent.com/reference/programmable-logic/arty-a7/reference-manual)
## TEST C code
### GCC toolchains compare and try:
[how to use them](https://hackmd.io/ADNQPiEFSPC_daP2GJ_sSQ)
I first took **riscv64-unknown-elf-gcc** as my toolchain, but the bin file turned out to be too big for our SoC, so I tried compiling with -Os
and reducing the size after linking using [**strip**](https://zh.wikipedia.org/zh-tw/Strip_(Unix)), but sadly both failed. So I switched to the toolchain for MCUs ([riscv-none-embed-](https://xpack.github.io/riscv-none-elf-gcc/)), which means I have to give up some system calls in my C code to fit the toolchain :(
### set your toolchain
put your toolchain in tools [download](https://gnu-mcu-eclipse.github.io/blog/2019/05/21/riscv-none-gcc-v8-2-0-2-2-20190521-released/)
I took my [homework 1's](https://hackmd.io/h-6-OkMpRxOAIeN3ebF4Jg) C code as test code
### how to test it
1.test_all_isa
go to sim folder and do this instruction
```
python test_all_isa.py
```
![](https://i.imgur.com/WCD1Wzo.png)
![](https://i.imgur.com/Xex3A3m.png)
2.test C code
go to sim folder and do this instruction
```
python sim.py ..\tests\example\simple\C_test.bin inst.data
```
Because I use riscv-none-embed-gcc as my tool on Windows, I do not need "newlib", and I gave up some system calls such as printf(). That said, riscv-none-embed-gcc can work with "newlib"; leaving it out is just my personal choice.
![](https://i.imgur.com/l4itbnE.png)
If it succeeds, you can see this on your computer:
![](https://i.imgur.com/qH7KwBC.png)
```
#include<stdio.h>
#include"..\lib\utils.h"
int trap(int* height, int heightSize); /* forward declaration: trap() is defined after main() */
int main(){
int arr[]={20,1,0,2,1,16,1,3,2,1,2,17};
int height=12;
int ans=trap(arr,height);
if (ans == 141)
set_test_pass();
else
set_test_fail();
return 0;
/*printf("%d\n",ans);*/
}
int trap(int* height, int heightSize){
int maxh=0,maxhi;
if(heightSize==0||heightSize==1)
return 0;
for(int i=0;i<heightSize;i++){
if(height[i]>maxh){
maxh=height[i];
maxhi=i;
}
}
int water_l=0;
int rain=0;
for(int i=0;i<maxhi;i++){
if(height[i]>water_l){
water_l=height[i];
}
rain+=water_l-height[i];
}
water_l=0;
for(int i=heightSize-1;i>maxhi;i--){
if(height[i]>water_l){
water_l=height[i];
}
rain+=water_l-height[i];
}
return rain;
}
```
clips from the dump file of C_test.c (with -Os in CFLAGS)
```
000001d8 <trap>:
1d8: 00100793 li a5,1
1dc: 08b7fe63 bgeu a5,a1,278 <trap+0xa0>
1e0: 00000793 li a5,0
1e4: 00000693 li a3,0
1e8: 02b7c463 blt a5,a1,210 <trap+0x38>
1ec: 00000693 li a3,0
1f0: 00000793 li a5,0
1f4: 00000613 li a2,0
1f8: 0306cc63 blt a3,a6,230 <trap+0x58>
1fc: fff58593 addi a1,a1,-1
200: 00000693 li a3,0
204: 04b84863 blt a6,a1,254 <trap+0x7c>
208: 00078513 mv a0,a5
20c: 00008067 ret
210: 00279713 slli a4,a5,0x2
214: 00e50733 add a4,a0,a4
218: 00072703 lw a4,0(a4)
21c: 00e6d663 bge a3,a4,228 <trap+0x50>
220: 00078813 mv a6,a5
224: 00070693 mv a3,a4
228: 00178793 addi a5,a5,1
22c: fbdff06f j 1e8 <trap+0x10>
230: 00269713 slli a4,a3,0x2
234: 00e50733 add a4,a0,a4
238: 00072703 lw a4,0(a4)
23c: 00e65463 bge a2,a4,244 <trap+0x6c>
240: 00070613 mv a2,a4
244: 40e60733 sub a4,a2,a4
248: 00e787b3 add a5,a5,a4
24c: 00168693 addi a3,a3,1
250: fa9ff06f j 1f8 <trap+0x20>
254: 00259713 slli a4,a1,0x2
258: 00e50733 add a4,a0,a4
25c: 00072703 lw a4,0(a4)
260: 00e6d463 bge a3,a4,268 <trap+0x90>
264: 00070693 mv a3,a4
268: 40e68733 sub a4,a3,a4
26c: 00e787b3 add a5,a5,a4
270: fff58593 addi a1,a1,-1
274: f91ff06f j 204 <trap+0x2c>
278: 00000793 li a5,0
27c: f8dff06f j 208 <trap+0x30>
```
### perips testing
working...
## GITHUB
I'm still working on my project...
here is [Version 2.00](https://github.com/Chiwawachiwawa/cwwppb_RISCV_SOC)
[video](https://youtu.be/3COX5TDpetI)
## arty-a7-100t testing(still working)
1.[what is xdc???](https://digilent.com/reference/programmable-logic/guides/vivado-xdc-file)
2.[reference: how do I write an xdc file](https://github.com/Digilent/digilent-xdc)
We succeeded in generating the bitstream...
It was not in vain that I slept less than six hours almost every day this month, and even dropped two courses QAQ
After 14 hours I finally solved the last problem...
![](https://i.imgur.com/Dm4m6Go.png)
### non-os booting
Someone asked me how to boot a non-OS machine,
it is a great question.
In the official RISC-V datasheet
# boot Linux (in Chinese for now; I will translate it to English once finished) (20230201)
### add eth_ip
### cwwppb_v1.02 bitstream
### PetaLinux
During installation, be sure to read the datasheet instead of following strange tutorials from the internet; match the versions and install the required libraries.
# Reflections:
If you are a student who wants to get stronger and is willing to put in the time, I highly recommend this class. You never know how far the teacher can push your limits. The teacher is very serious in class and the prepared teaching materials are very good. Learning the material is only part of it; some of the teacher's values are worth learning too. I was once scolded by the teacher with a sentence I still remember: "**Are you talking like an engineer? How can an engineer use "should" and "probably" to describe his thoughts?**" In short, I think you should take this course; you will learn everything you want to learn!