# Assignment3: Single-Cycle RISC-V CPU
contributed by [david965154](https://github.com/david965154/ca2023-lab3.git)
## Hello World in Chisel
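```scala
import chisel3._

class Hello extends Module {
  val io = IO(new Bundle {
    val led = Output(UInt(1.W))
  })
  // Number of cycles between toggles, minus one
  val CNT_MAX = (50000000 / 2 - 1).U
  val cntReg = RegInit(0.U(32.W)) // cycle counter
  val blkReg = RegInit(0.U(1.W))  // current LED state

  cntReg := cntReg + 1.U
  when(cntReg === CNT_MAX) {
    // Last connection wins: reset the counter and toggle the LED
    cntReg := 0.U
    blkReg := ~blkReg
  }
  io.led := blkReg
}
```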
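The code above toggles the output signal `led` once every `CNT_MAX + 1` clock cycles. The process is as follows.
At elaboration: a `Hello` module is instantiated and every `val` is built into hardware. From then on, the effective connections in each clock cycle are
```scala
cntReg := cntReg + 1.U
io.led := blkReg
```
While `cntReg` is below `CNT_MAX`: the same connections apply in every cycle
```scala
cntReg := cntReg + 1.U
io.led := blkReg
```
In the cycle where `cntReg === CNT_MAX`:
```scala
cntReg := cntReg + 1.U
cntReg := 0.U
blkReg := ~blkReg
io.led := blkReg
```
In this cycle the `when` block is active, so `cntReg := 0.U` overrides `cntReg := cntReg + 1.U` (the last connection wins) and `blkReg` is assigned `~blkReg`. Registers update at the next clock edge, so `io.led` shows the toggled value starting from the following cycle and holds it for `CNT_MAX + 1` cycles, until the next toggle.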
### Enhance it by incorporating logic circuit
Here is my logic circuit:

To implement:
```scala
import chisel3._

class Hello extends Module {
  val io = IO(new Bundle {
    val led = Output(UInt(1.W))
  })
  val CNT_MAX = (50000000 / 2 - 1).U
  val cntReg = RegInit(0.U(32.W))
  val blkReg = RegInit(0.U(1.W))

  // Compare cntReg === CNT_MAX to produce the select signal for both MUXes
  blkReg := Mux(
    cntReg === CNT_MAX,
    ~blkReg,
    blkReg
  )
  // cntReg += 1 by the adder; the false branch must be cntReg + 1.U,
  // because this is the only connection that decides the next cntReg
  cntReg := Mux(
    cntReg === CNT_MAX,
    0.U,
    cntReg + 1.U
  )
  // io.led follows blkReg
  io.led := blkReg
}
```
## Problems
**OS:`macos 14`** **`terminal`**
**Problem:**
1. GTKWave
I tried the solutions on [reddit](https://www.reddit.com/r/FPGA/comments/16tqja3/gtkwave_on_macos_sonoma/) and [github](https://github.com/gtkwave/gtkwave/issues/250#issuecomment-1738097260), but still could not resolve the error.
The author replied:
> OK, you're using the most recent version that I built...which was three years ago.
As I said, I don't have a Mac, so unfortunately, I can't build a binary that will work for you. Setting up a build environment on Mac using GTK-OSX is a fairly involved process unless it's been simplified. Another user might be able to help you.
2. Java
After I updated my device and figured out how to work around problem 1, `sbt test` could no longer run. Here is the error message:
> java.lang.IllegalArgumentException: requirement failed: /Users/chenjinzhun/ca2023-lab3/verilog/sb.asmbin.txt is not a relative path
Why:
If the error came from my code, the message should report that a value does not equal the expected value. Instead, the message indicates that a relative path was expected but an absolute path was given. I am fairly sure this error did not appear before I performed the steps above.
## Modify the C code from HW2 to be compatible with the Single-Cycle RISC-V CPU
**First:**
Modify the C code and put it into `csrc`
To be compatible with `sbt test`, I replaced each `printf` with a word store to a specific memory address, so that `sbt test` can check whether the stored value equals the expected value.
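```c
#include <stdint.h>

/* Binary search for the highest set bit.
   pace starts at 16 and moves by 8, 4, 2, 1, so only bits 0..31 are covered;
   all test inputs below fit in 32 bits. */
uint16_t count_leading_zeros(uint64_t x)
{
    int pace = 16;
    int adj = 16;
    int y;
    while (x > 1) {
        y = x >> pace;
        adj >>= 1;
        if (y > 1) {
            pace += adj;          /* highest bit is above pace */
        }
        else if (y < 1) {
            pace -= adj;          /* highest bit is below pace */
        }
        else if (y == 1) {
            return (64 - pace);   /* highest bit is exactly at pace */
        }
    }
    return 64;                    /* x is 0 or 1 */
}

int main() {
    /* Store each result at a fixed address so the test can read it back */
    *((volatile int *) (4)) = 64 - count_leading_zeros(1);
    *((volatile int *) (8)) = 64 - count_leading_zeros(129);
    *((volatile int *) (12)) = 64 - count_leading_zeros(32768);
    *((volatile int *) (16)) = 64 - count_leading_zeros(8393732);
    *((volatile int *) (20)) = 64 - count_leading_zeros(4294967295);
    return 0;
}
```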
### or...
Modify the `.S` code and put it into `csrc`
```shell
.org 0
# Provide program starting address to linker
.data
data_0: .word 0b00000000000000000000000000000001
data_1: .word 0b00000000000000000000000010000001
data_2: .word 0b00000000000000001000000000000000
data_3: .word 0b00000000100000000001010000000100
data_4: .word 0b11111111111111111111111111111111
nextline: .ascii "\n"
.set str_size, .-nextline
buffer: .byte 0
.text
.global main
main:
mv t1, a0
addi sp, sp, -20
# push the five test words onto the stack
lw t0, data_0
sw t0, 0(sp)
lw t0, data_1
sw t0, 4(sp)
lw t0, data_2
sw t0, 8(sp)
lw t0, data_3
sw t0, 12(sp)
lw t0, data_4
sw t0, 16(sp)
addi s0, x0, 5 # s0 is the iteration count (5 test cases)
addi s1, x0, 0 # s1 is counter
addi s2, sp, 0 # s2 starts at 0(sp)
li a5, 2
li a6, 0
loop:
lw a1, 0(s2) # load data into a1
li a0, 16
li t2, 16
addi s2, s2, 4 # s2 move to next data
addi s1, s1, 1 # counter++
blt a1, a6, ulimitcase
blt a1, a5, llimitcase
jal clz
right:
sub a0, a0, t2
jal clz
left:
add a0, a0, t2
clz:
srl a2, a1, a0
srl t2, t2, 1
beq a2, x0, right
bge a2, a5, left
jal det
llimitcase:
li a0, 0
jal print
ulimitcase:
li a0, 31
print:
slli t2, s1, 2
sw a0, 0(t2)
bne s1, s0, loop
beq s1, s0, end
end:
li a7, 93 # Exit system call
ecall
```
**Second:**
Modify the Makefile
To compile my C code and `.S` code to `.o` and `.asmbin`, I need to add my program's `.asmbin` under `BINS` in this file. Then run `make` (produces the `.o` and `.asmbin`) and `make update` (places the `.asmbin` under the directory `src/main/resources/`)
```makefile
...
BINS = \
	fibonacci.asmbin \
	hello.asmbin \
	mmio.asmbin \
	quicksort.asmbin \
	sb.asmbin \
	priorityencoder.asmbin # newly added entry
...
```
**Last:**
Modify the CPUTest
Following the pattern of the existing example tests, I added the code below.
**What's important**
The part:
```scala
for (i <- 1 to 500) {
  c.clock.step(1000)
  c.io.mem_debug_read_address.poke((i * 4).U) // Avoid timeout
}
```
In my opinion, this part is needed so that the program finishes executing before the checks that follow: it advances the clock by 500 × 1000 cycles, and the `poke` after each step keeps the test from timing out.
```scala
class PriorityEncoderTest extends AnyFlatSpec with ChiselScalatestTester {
  behavior.of("Single Cycle CPU")
  it should "implement priority encoder using CLZ" in {
    test(new TestTopModule("priorityencoder.asmbin")).withAnnotations(TestAnnotations.annos) { c =>
      for (i <- 1 to 500) {
        c.clock.step(1000)
        c.io.mem_debug_read_address.poke((i * 4).U) // Avoid timeout
      }
      c.io.mem_debug_read_address.poke(4.U)
      c.clock.step()
      c.io.mem_debug_read_data.expect(0x0.U)
      c.io.mem_debug_read_address.poke(8.U)
      c.clock.step()
      c.io.mem_debug_read_data.expect(0x7.U)
      c.io.mem_debug_read_address.poke(12.U)
      c.clock.step()
      c.io.mem_debug_read_data.expect(0xf.U)
      c.io.mem_debug_read_address.poke(16.U)
      c.clock.step()
      c.io.mem_debug_read_data.expect(0x17.U)
      c.io.mem_debug_read_address.poke(20.U)
      c.clock.step()
      c.io.mem_debug_read_data.expect(0x1f.U)
    }
  }
}
```
## **Waveform**
### 1. InstructionFetch

The `io_instruction_address` depends on `io_jump_flag_id`:
`0`: pc = pc + 4 (the address advances by 4)
`1`: pc = jump address
### 2. InstructionDecoder

Looking at `io_instruction`, the differences between instruction types are reflected in `io_regs_reg1_read_address`, `io_regs_reg2_read_address`, `io_memory_read_enable`, `io_memory_write_enable` and some other signals.
`L type` (loads) : `io_memory_read_enable` = 1
`S type` (stores) : `io_memory_write_enable` = 1
### 3. Execute

In this `TestOnly` run, the instructions `add` and `beq` are executed. The earlier part of the waveform is `add` and the short tail part is `beq`, where you can see that `io_if_jump_flag` is asserted.
### 4. PriorityEncoder

In my `priorityencoder` test, `io_mem_debug_read_address` selects the memory address to read, and the value returned on `io_mem_debug_read_data` is checked against the expected value.
### 5. Simulate with Verilator
```shell
$ make verilator
$ ./run-verilator.sh -instruction src/main/resources/priorityencoder.asmbin \
    -time 2000 -vcd dump.vcd
```
This prints:
```
-time 2000
-memory 1048576
-instruction src/main/resources/priorityencoder.asmbin
```
Then open the dump in GTKWave:
```shell
$ gtkwave dump.vcd
```

#### Hexadecimal representation of `priorityencoder.asmbin`:
```shell
$ hexdump src/main/resources/priorityencoder.asmbin | head -1
```
```
0000000 1137 0000 00ef 2600 006f 0000 0297 0000
```