# Assignment3: Single-cycle RISC-V CPU
contributed by < [p96114175](https://github.com/p96114175/ca2023-lab3) >
## Checklist
- [x] Complete the Single-cycle RISC-V CPU from [lab3](https://hackmd.io/47qzwlJYR0CBBkKNOoV6XQ?view)
- [x] Provide a concise summary of the various test cases, outlining the aspects of the CPU they evaluate, the techniques employed for loading test program instructions, and the outcomes of these test cases.
### Single-cycle CPU architecture diagram

## Instruction Fetch

#### Test InstructionFetch
The test runs a `for` loop over `0 to 100` (101 iterations). In each iteration, a random number (0 or 1) decides whether the iteration simulates a jump.
```scala
val entry = 0x1000
var pre = entry
var cur = pre
c.io.instruction_valid.poke(true.B)
for (_ <- 0 to 100) {
  Random.nextInt(2) match {
    case 0 => // no jump: the PC advances by 4
      cur = pre + 4
      c.io.jump_flag_id.poke(false.B)
      c.clock.step()
      c.io.instruction_address.expect(cur)
      pre = pre + 4
    case 1 => // jump: the PC is set to the jump address
      c.io.jump_flag_id.poke(true.B)
      c.io.jump_address_id.poke(entry)
      c.clock.step()
      c.io.instruction_address.expect(entry)
      pre = entry
  }
}
```
> sbt "testOnly riscv.singlecycle.InstructionFetchTest"
#### Analyze with GTKWave
When `io_jump_flag_id` is 0 at a rising clock edge, the program counter advances by 4. For example, the PC goes from 0x1000 to 0x1004.

When `io_jump_flag_id` is 1 at a rising clock edge, the program counter is set to the jump address, which is `entry` (0x1000) in this test. For example, the PC goes from 0x1004 to 0x1000.
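
The two cases observed in the waveform can be summarized by a small software model. This is only a plain Scala sketch of the behavior the test checks, not the Chisel module from the lab; the names `PcModel` and `nextPc` are hypothetical.

```scala
// Software model of the program-counter update checked by the fetch test.
object PcModel {
  val Entry = 0x1000 // entry address used by the test

  // PC value visible after the next rising clock edge.
  def nextPc(pc: Int, jumpFlag: Boolean, jumpAddress: Int): Int =
    if (jumpFlag) jumpAddress else pc + 4
}
```

For instance, `PcModel.nextPc(0x1000, jumpFlag = false, jumpAddress = 0)` gives 0x1004, and `PcModel.nextPc(0x1004, jumpFlag = true, jumpAddress = PcModel.Entry)` gives 0x1000.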

## Instruction Decode

The decoder sets the memory control signals (MemoryRE and MemoryWE) according to the instruction type.
For a load instruction, `memory_read_enable` is set to true.
For a store instruction, `memory_write_enable` is set to true.
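
As a rough illustration of this rule, the following plain Scala model derives both enables from the opcode field. It assumes the standard RV32I opcodes (0b0000011 for loads, 0b0100011 for stores); `MemoryControlModel` and `memoryEnables` are hypothetical names, and the lab's decoder is written in Chisel rather than like this.

```scala
// Software model: derive the memory enables from the 7-bit opcode.
object MemoryControlModel {
  val LoadOpcode  = 0x03 // 0b0000011: lb, lh, lw, lbu, lhu
  val StoreOpcode = 0x23 // 0b0100011: sb, sh, sw

  // Returns (memoryReadEnable, memoryWriteEnable) for a 32-bit instruction word.
  def memoryEnables(instruction: Long): (Boolean, Boolean) = {
    val opcode = (instruction & 0x7f).toInt
    (opcode == LoadOpcode, opcode == StoreOpcode)
  }
}
```

With the store instruction `0x00a02223`, `memoryEnables` returns `(false, true)`; for an R-type instruction such as `0x002081b3`, both enables are false.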
:::danger
:warning: **Refrain from copying and pasting your solution directly into the HackMD note**. Instead, provide a concise summary of the various test cases, outlining the aspects of the CPU they evaluate, the techniques employed for loading test program instructions, and the outcomes of these test cases.
:::
:::info
Sorry, I will pay attention to this.
:::
#### Test InstructionDecode
This test checks whether the decoder produces the correct control signals.
It validates three instructions: an S-type store, `lui`, and `add`.
```scala
c.io.instruction.poke(0x00a02223L.U) // S-type
c.io.ex_aluop1_source.expect(ALUOp1Source.Register)
c.io.ex_aluop2_source.expect(ALUOp2Source.Immediate)
c.io.regs_reg1_read_address.expect(0.U)
c.io.regs_reg2_read_address.expect(10.U)
c.clock.step()
c.io.instruction.poke(0x000022b7L.U) // lui
c.io.regs_reg1_read_address.expect(0.U)
c.io.ex_aluop1_source.expect(ALUOp1Source.Register)
c.io.ex_aluop2_source.expect(ALUOp2Source.Immediate)
c.clock.step()
c.io.instruction.poke(0x002081b3L.U) // add
c.io.ex_aluop1_source.expect(ALUOp1Source.Register)
c.io.ex_aluop2_source.expect(ALUOp2Source.Register)
c.clock.step()
```
> src/test/scala/riscv/singlecycle/InstructionDecoderTest.scala
#### Analyze with GTKWave
* S-type instruction `0x00a02223`
1. The instruction is split into its fields: opcode, funct3, funct7, rd, rs1, and rs2.
```
val opcode = io.instruction(6, 0)
val funct3 = io.instruction(14, 12)
val funct7 = io.instruction(31, 25)
val rd = io.instruction(11, 7)
val rs1 = io.instruction(19, 15)
val rs2 = io.instruction(24, 20)
```
2. A `Mux` checks the opcode and assigns `regs_reg1_read_address`: 0 for `lui`, and `rs1` otherwise.
```
io.regs_reg1_read_address := Mux(opcode === Instructions.lui, 0.U(Parameters.PhysicalRegisterAddrWidth), rs1)
```
In this case, `io.regs_reg1_read_address` is 0x0, because the `rs1` field of `0x00a02223` is 0.

`c.io.regs_reg1_read_address` meets the test requirement below.
```diff
c.io.instruction.poke(0x00a02223L.U) // S-type
c.io.ex_aluop1_source.expect(ALUOp1Source.Register)
c.io.ex_aluop2_source.expect(ALUOp2Source.Immediate)
+ c.io.regs_reg1_read_address.expect(0.U)
c.io.regs_reg2_read_address.expect(10.U)
```
> src/test/scala/riscv/singlecycle/InstructionDecoderTest.scala

`c.io.regs_reg2_read_address` is 10 (the `rs2` field), so all expectations for this instruction pass, as highlighted below.
```diff
c.io.instruction.poke(0x00a02223L.U) // S-type
+ c.io.ex_aluop1_source.expect(ALUOp1Source.Register)
+ c.io.ex_aluop2_source.expect(ALUOp2Source.Immediate)
+ c.io.regs_reg1_read_address.expect(0.U)
+ c.io.regs_reg2_read_address.expect(10.U)
```
* lui instruction `0x000022b7`

`c.io.regs_reg1_read_address` meets the test requirements as below.
```diff
c.io.instruction.poke(0x000022b7L.U) // lui
+ c.io.regs_reg1_read_address.expect(0.U)
+ c.io.ex_aluop1_source.expect(ALUOp1Source.Register)
+ c.io.ex_aluop2_source.expect(ALUOp2Source.Immediate)
```
* add instruction `0x002081b3`

```diff
c.io.instruction.poke(0x002081b3L.U) // add
+ c.io.ex_aluop1_source.expect(ALUOp1Source.Register)
+ c.io.ex_aluop2_source.expect(ALUOp2Source.Register)
```
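
The field extraction and the `lui` special case described above can be checked by hand with a plain Scala model. This is a sketch only: `DecodeModel` and its helpers are hypothetical names that mirror the bit ranges shown earlier, not the lab's Chisel source.

```scala
// Software model of the decoder's field extraction.
object DecodeModel {
  val LuiOpcode = 0x37 // 0b0110111

  // Extract bits [hi:lo], like io.instruction(hi, lo) in Chisel.
  def bits(instruction: Long, hi: Int, lo: Int): Int =
    ((instruction >> lo) & ((1L << (hi - lo + 1)) - 1)).toInt

  def opcode(i: Long): Int = bits(i, 6, 0)
  def rd(i: Long): Int     = bits(i, 11, 7)
  def rs1(i: Long): Int    = bits(i, 19, 15)
  def rs2(i: Long): Int    = bits(i, 24, 20)

  // Mirrors the Mux: lui reads register 0, every other opcode reads rs1.
  def reg1ReadAddress(i: Long): Int =
    if (opcode(i) == LuiOpcode) 0 else rs1(i)
}
```

For `0x00a02223` this gives `rs1 = 0` and `rs2 = 10`, matching the expectations in the test; for `0x002081b3` (add) it gives `rs1 = 1`, `rs2 = 2`, and `rd = 3`.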
## Execution

#### Test InstructionExecution
The test covers three cases.
**x3 = x2 + x1**
This case runs a `for` loop over `0 to 100` (101 iterations). Each iteration uses `Random` to generate `op1` and `op2`, then pokes them into `c.io.reg1_data` and `c.io.reg2_data`.
Finally, it expects `c.io.mem_alu_result` to equal `result` (`op1 + op2`) and `c.io.if_jump_flag` to be 0.
**branch equal test**
With equal register values, the test expects `c.io.if_jump_flag` to be 1 and `c.io.if_jump_address` to be 4 (instruction address 2 plus immediate 2).
**branch not equal test**
With different register values, the test expects `c.io.if_jump_flag` to be 0, while `c.io.if_jump_address` is still 4.
```scala
c.io.instruction.poke(0x001101b3L.U) // x3 = x2 + x1
for (_ <- 0 to 100) {
  val op1 = scala.util.Random.nextInt(429496729)
  val op2 = scala.util.Random.nextInt(429496729)
  val result = op1 + op2
  val addr = scala.util.Random.nextInt(32)
  c.io.reg1_data.poke(op1.U)
  c.io.reg2_data.poke(op2.U)
  c.clock.step()
  c.io.mem_alu_result.expect(result.U)
  c.io.if_jump_flag.expect(0.U)
}
// beq test
c.io.instruction.poke(0x00208163L.U) // pc + 2 if x1 === x2
c.io.instruction_address.poke(2.U)
c.io.immediate.poke(2.U)
c.io.aluop1_source.poke(1.U)
c.io.aluop2_source.poke(1.U)
c.clock.step()
// equal
c.io.reg1_data.poke(9.U)
c.io.reg2_data.poke(9.U)
c.io.if_jump_flag.expect(1.U)
c.io.if_jump_address.expect(4.U)
// not equal
c.io.reg1_data.poke(9.U)
c.io.reg2_data.poke(19.U)
c.clock.step()
c.io.if_jump_flag.expect(0.U)
c.io.if_jump_address.expect(4.U)
```
#### Analyze with GTKWave
**x3 = x2 + x1**

After the execute stage, `c.io.mem_alu_result` equals `result` and `c.io.if_jump_flag` is 0, as in the example below.
[Hex calculator for validation](https://miniwebtool.com/zh-tw/hex-calculator/?number1=109ECD74&operate=1&number2=0C6DE8B7)

```
// Test file excerpt for x3 = x2 + x1
c.io.mem_alu_result.expect(result.U)
c.io.if_jump_flag.expect(0.U)
```
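
The sum from the calculator link above can also be reproduced with a short Scala model of a 32-bit add. This is a sketch under the assumption that the ALU keeps only the low 32 bits of the result; `AddModel` and `add32` are hypothetical names.

```scala
// Software model of a 32-bit ALU add.
object AddModel {
  // Keep only the low 32 bits, as a 32-bit ALU output would.
  def add32(op1: Long, op2: Long): Long = (op1 + op2) & 0xffffffffL
}
```

With the operands from the link, `AddModel.add32(0x109ECD74L, 0x0C6DE8B7L)` evaluates to `0x1D0CB62BL`. The test's operands are each below 429496729, so their sum always fits in a Scala `Int` and never wraps.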
**branch equal test**
When `c.io.instruction` is `0x00208163` with `c.io.reg1_data = 9` and `c.io.reg2_data = 9`, `c.io.if_jump_flag` is 1 and `c.io.if_jump_address` is 4.

**branch not equal test**
When `c.io.instruction` is `0x00208163` with `c.io.reg1_data = 9` and `c.io.reg2_data = 19` (0x13), `c.io.if_jump_flag` is 0 and `c.io.if_jump_address` is still 4.
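
Both branch cases follow the same rule: the jump flag depends on the register comparison, while the jump address is always the instruction address plus the immediate. A minimal Scala sketch of that rule, with the hypothetical names `BranchModel` and `beq`:

```scala
// Software model of the beq outcome checked by the test.
object BranchModel {
  // Returns (jumpFlag, jumpAddress) for a beq instruction.
  def beq(pc: Long, immediate: Long, reg1: Long, reg2: Long): (Boolean, Long) =
    (reg1 == reg2, (pc + immediate) & 0xffffffffL)
}
```

Using the test values, `BranchModel.beq(2, 2, 9, 9)` yields `(true, 4)` and `BranchModel.beq(2, 2, 9, 19)` yields `(false, 4)`.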

## Test Results
When all tests of the single-cycle CPU pass, sbt prints the following message.
```shell
[info] InstructionDecoderTest:
[info] InstructionDecoder of Single Cycle CPU
[info] - should produce correct control signal
[info] InstructionFetchTest:
[info] ExecuteTest:
[info] InstructionFetch of Single Cycle CPU
[info] Execution of Single Cycle CPU
[info] - should fetch instruction
[info] - should execute correctly
[info] ByteAccessTest:
[info] Single Cycle CPU
[info] - should store and load a single byte
[info] FibonacciTest:
[info] Single Cycle CPU
[info] - should recursively calculate Fibonacci(10)
[info] QuicksortTest:
[info] Single Cycle CPU
[info] - should perform a quicksort on 10 numbers
[info] RegisterFileTest:
[info] Register File of Single Cycle CPU
[info] - should read the written content
[info] - should x0 always be zero
[info] - should read the writing content
[info] Run completed in 7 seconds, 107 milliseconds.
[info] Total number of tests run: 9
[info] Suites: completed 7, aborted 0
[info] Tests: succeeded 9, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 31 s, completed Dec 1, 2023, 1:58:51 AM
```
## Run Verilator
Run the commands below to convert the Chisel files to Verilog files and simulate them with Verilator.
The simulation loads `hello.asmbin`, runs for 1500 cycles, and saves the waveform to `dump.vcd`.
> Note that the `-time` value is twice the number of cycles, so 1500 cycles corresponds to `-time 3000`.
```shell
$ make verilator
$ ./run-verilator.sh -instruction src/main/resources/hello.asmbin -time 3000 -vcd dump.vcd
```
Then run `gtkwave dump.vcd` to inspect the waveform.

In the waveform, `io_instruction` begins with `00000000` followed by `00001137`. To verify this, dump the start of `hello.asmbin` in hexadecimal:
```shell
$ hexdump src/main/resources/hello.asmbin | head -1
```
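The output below matches the waveform. By default, `hexdump` prints 16-bit words in little-endian order, so `1137 0000` corresponds to the 32-bit word `0x00001137`.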
```
0000000 1137 0000 00ef 5d40 006f 0000 0297 0000
```
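
To compare the dump with `io_instruction` more easily, the 16-bit words can be paired back into 32-bit instruction words. The following is a plain Scala sketch, assuming the default `hexdump` format on a little-endian host (offset column first, then 16-bit words, low half before high half); `HexdumpModel` and `instructions` are hypothetical names.

```scala
// Rebuild 32-bit instruction words from one line of default `hexdump` output.
object HexdumpModel {
  def instructions(line: String): List[Long] = {
    val words = line.trim
      .split("\\s+")
      .drop(1) // drop the leading file offset
      .map(w => java.lang.Long.parseLong(w, 16))
    // Each pair is (low half, high half) of one little-endian 32-bit word.
    words.grouped(2).collect { case Array(lo, hi) => (hi << 16) | lo }.toList
  }
}
```

Applied to the line above, the first two words are `0x00001137` and `0x5d4000ef`, so the first instruction matches the `00001137` seen in the waveform.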
## Fundamental Concepts behind Chisel
Reference: [Lab3: Construct a single-cycle RISC-V CPU with Chisel](https://hackmd.io/@sysprog/r1mlr3I7p)
Chisel is a domain specific language (DSL) implemented using Scala’s macro features. Therefore, all programming related to circuit logic must be implemented using the macro definitions provided in the Chisel library, rather than directly using Scala language keywords.
> [Chisel Cheatsheet](https://github.com/freechipsproject/chisel-cheatsheet/releases/latest/download/chisel_cheatsheet.pdf)
## Relationship Between Chisel and Verilog
Reference: [Lab3: Construct a single-cycle RISC-V CPU with Chisel](https://hackmd.io/@sysprog/r1mlr3I7p)
[Chisel](https://www.chisel-lang.org/) is not strictly an equivalent replacement for Verilog but rather a generator language. Hardware circuits written in Chisel need to be compiled into Verilog files and then synthesized into actual circuits through EDA (Electronic Design Automation) software. Furthermore, for the sake of generality, some features found in Verilog, such as negative-edge triggering and multi-clock simulation, are not fully supported or not supported at all in Chisel.
