# Assignment3: Single-cycle RISC-V CPU
<style>
/* remove invis after publish*/
.invis {
display:none
}
.vis {
color:#000000;
user-select:none;
}
</style>
Contributed by [raphael89918](https://github.com/raphael89918)
## Finish MyCPU
Follow [Assignment3: single-cycle RISC-V CPU](https://hackmd.io/@sysprog/2023-arch-homework3) to finish MyCPU by completing four parts: Instruction Fetch, Instruction Decode, Execute, and Combining into a CPU.
For an R-type RISC-V 32-bit integer instruction, the format from bit 31 down to bit 0 is as follows:
```[ funct7 ] [ rs2 ] [ rs1 ] [ funct3 ] [ rd ] [ opcode ]```
- opcode: The lowest 7 bits of the instruction (bits 6-0), used to specify the type of instruction.
- rd: Destination register (bits 11-7), used to store the result of the operation.
- funct3: Function code (bits 14-12), used to select a specific operation under the opcode.
- rs1 and rs2: Source registers (bits 19-15 and bits 24-20), used to specify the operands of the operation.
- funct7: For certain operations (bits 31-25), it further distinguishes the operation, for example add versus sub.
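The field layout above can be sketched as a few shifts and masks. This is a minimal Python illustration, not the Chisel code used in MyCPU; the function name `decode_r_type` is hypothetical.

```python
def decode_r_type(inst: int) -> dict:
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "opcode": inst & 0x7F,          # bits 6..0
        "rd": (inst >> 7) & 0x1F,       # bits 11..7
        "funct3": (inst >> 12) & 0x7,   # bits 14..12
        "rs1": (inst >> 15) & 0x1F,     # bits 19..15
        "rs2": (inst >> 20) & 0x1F,     # bits 24..20
        "funct7": (inst >> 25) & 0x7F,  # bits 31..25
    }


# 0x001101B3 is the word examined in the Execute section (add x3, x2, x1).
fields = decode_r_type(0x001101B3)
print(fields)
```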
### Instruction Fetch

- pc: During sequential execution, the PC advances by the size of one instruction every clock cycle. RISC-V base instructions are 32 bits wide, so the PC increments by 4 each cycle unless a jump or taken branch redirects it.
- instruction address: In the waveform, the instruction address tracks the PC cycle by cycle, because the PC is the address used to fetch the next instruction.
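The PC update described above can be modeled in a few lines. This is a rough behavioral sketch in Python, assuming a jump flag and a jump target are supplied by a later stage; the names `next_pc`, `jump_flag`, and `jump_address` are illustrative and not taken from the MyCPU source.

```python
MASK32 = 0xFFFFFFFF


def next_pc(pc: int, jump_flag: bool, jump_address: int) -> int:
    """Return the PC value for the next clock cycle."""
    if jump_flag:
        # A jump or taken branch redirects the fetch.
        return jump_address & MASK32
    # Sequential execution: step over one 32-bit (4-byte) instruction.
    return (pc + 4) & MASK32


pc = 0x1000
for _ in range(3):
    pc = next_pc(pc, jump_flag=False, jump_address=0)
print(hex(pc))
```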
### Instruction Decode

- io_instruction:
For ```000022B7```, the fields do not fall on hex-digit boundaries, so it helps to expand the low 12 bits (```2B7``` = ```0010 1011 0111```):
- ```00002``` (bits 31-12): The upper immediate (imm[31:12]).
- ```00101``` (bits 11-7): Destination register address (rd), i.e. ```x5```.
- ```0110111``` (bits 6-0): opcode, ```0x37```.
In RISC-V, opcode ```0110111``` identifies LUI (Load Upper Immediate), a U-type instruction. LUI places the 20-bit immediate in the upper 20 bits of the destination register and sets the lower 12 bits to 0. In summary, ```000022B7``` decodes to ```lui x5, 0x2```, which writes ```0x00002000``` to ```x5```.
- io_reg_write_address: ```io_reg_write_address = 05```, this value indicates that the address of the destination register is ```x5```. For U-type instructions, the destination register is used to store the result of the operation.
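As a cross-check of the decode step, the following Python sketch extracts the U-type fields of a LUI word. It only illustrates the bit slicing and is not the decoder used in MyCPU; the function name `decode_lui` is hypothetical.

```python
LUI_OPCODE = 0b0110111  # 0x37


def decode_lui(inst: int) -> tuple:
    """Return (rd, value written to rd) for a LUI instruction word."""
    if inst & 0x7F != LUI_OPCODE:
        raise ValueError("not a LUI instruction")
    rd = (inst >> 7) & 0x1F    # bits 11..7
    value = inst & 0xFFFFF000  # imm[31:12] kept in place, low 12 bits zero
    return rd, value


rd, value = decode_lui(0x000022B7)
print(f"lui x{rd}, {hex(value >> 12)} writes {hex(value)}")
```

For ```000022B7``` this gives ```rd = 5```, which matches the ```io_reg_write_address = 05``` seen in the waveform.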
### Execute

- io_instruction:
The io_instruction is ```001101B3```, which encodes ```add x3, x2, x1``` in RISC-V. The decoding process is as follows:
- opcode (the lowest 7 bits): 0110011 indicates that this is an R-type integer register-register operation.
- rd (the next 5 bits): 00011 indicates that the result will be stored in register x3.
- funct3 (the next 3 bits): 000 selects the add/sub group of operations.
- rs1 (the next 5 bits): 00010 indicates that the first operand comes from register x2.
- rs2 (the next 5 bits): 00001 indicates that the second operand comes from register x1.
- funct7 (the top 7 bits): 0000000 selects addition; with the same funct3, 0100000 would select subtraction.
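The role of funct3 and funct7 in the execute stage can be sketched as a small selection function. This Python model covers only add and sub and is meant as an illustration of the decoding above, not as the ALU implemented in MyCPU; `execute_r_type` is a hypothetical name.

```python
MASK32 = 0xFFFFFFFF


def execute_r_type(funct3: int, funct7: int, op1: int, op2: int) -> int:
    """Compute the result of an R-type add or sub on 32-bit operands."""
    if funct3 == 0b000 and funct7 == 0b0000000:
        return (op1 + op2) & MASK32  # add
    if funct3 == 0b000 and funct7 == 0b0100000:
        return (op1 - op2) & MASK32  # sub, wraps modulo 2**32
    raise NotImplementedError("only add and sub are modeled in this sketch")


# add x3, x2, x1 with hypothetical register contents x2 = 2 and x1 = 1
print(execute_r_type(0b000, 0b0000000, 2, 1))
```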
## quicksort.asmbin modification
Run ```$ ./run-verilator.sh -instruction src/main/resources/quicksort.asmbin -time 2000 -vcd dump.vcd``` to generate ```dump.vcd```, then run ```$ gtkwave dump.vcd``` to observe the waveform.

- io_instruction:
This signal contains the instruction currently being decoded or executed. In the instruction fetch stage, it usually comes from the memory address pointed to by the Program Counter (PC). In the instruction decode stage, this signal is used to parse various fields of the instruction, such as opcode, register addresses, immediate values, etc.
- io_memory_bundle_read_data:
This signal represents the data read from memory. When executing memory-related instructions (like load instructions), it contains the data read from the specified memory address. For the instruction fetch stage, ```io_instruction``` is usually obtained from memory through ```io_memory_bundle_read_data```.
- io_memory_bundle_write_data:
This signal is used to indicate the data to be written into memory. When executing memory write instructions (like store instructions), it contains the data to be written into memory. This signal is directly related to io_instruction, as some instructions (like store instructions) specify the data to be written into memory.

When the processor executes a load instruction, ```io_instruction``` specifies which memory address to read from, and ```io_memory_bundle_read_data``` carries the data read from that address.
When the processor executes a store instruction, ```io_instruction``` specifies which memory address to write to and which register supplies the data, and ```io_memory_bundle_write_data``` carries the data to be written to that address.
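To make the link between a store instruction and the write data concrete, here is a small Python sketch that decodes an S-type word and computes the address and data it would drive. The register contents and the helper names (`sign_extend`, `decode_store`, `store_effect`) are assumptions for illustration only.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode_store(inst: int) -> dict:
    """Extract rs1, rs2 and the 12-bit offset of an S-type instruction."""
    imm = (((inst >> 25) & 0x7F) << 5) | ((inst >> 7) & 0x1F)
    return {
        "rs1": (inst >> 15) & 0x1F,  # base address register
        "rs2": (inst >> 20) & 0x1F,  # register holding the data to store
        "offset": sign_extend(imm, 12),
    }


def store_effect(inst: int, regs: list) -> tuple:
    """Return (memory address, write data) for a store instruction."""
    f = decode_store(inst)
    address = (regs[f["rs1"]] + f["offset"]) & MASK32
    return address, regs[f["rs2"]]


regs = [0] * 32
regs[2] = 0x1000      # sp, hypothetical value
regs[9] = 0xBFC00000  # s1, hypothetical value
# 0x00912223 encodes sw s1, 4(sp)
print([hex(v) for v in store_effect(0x00912223, regs)])
```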
## hw2 modification
```code=
.data
dataset1: .word 0x3fcccccd #1.6
dataset2: .word 0xbfc00000 #-1.5
dataset3: .word 0x3fb33333 #1.4
dataset4: .word 0xbfa66666 #-1.3
dataset5: .word 0x3f99999a #1.2
dataset6: .word 0xbf8ccccd #-1.1
array_size: .word 0x00000006
sign: .word 0x80000000
exp: .word 0x7F800000
man: .word 0x007FFFFF
bf16man: .word 0x007F0000
bf16format: .word 0xFFFF0000
_EOL: .string "\n"
.global main
main:
addi sp,sp,-24
lw s0, dataset1
lw s1, dataset2
lw s2, dataset3
lw s3, dataset4
lw s4, dataset5
lw s5, dataset6
lw a0, sign
lw a1, exp
lw a2, man
lw a3, bf16man
lw a4, bf16format
sw s0, 0(sp)
sw s1, 4(sp)
sw s2, 8(sp)
sw s3, 12(sp)
sw s4, 16(sp)
sw s5, 20(sp)
lw s11, array_size
jal ShellSort
ShellSort:
lw t0, array_size
srli t0, t0, 1
Whileinterval:
beq t0, zero, End # leave the sort once the interval reaches 0
add t1, zero, t0
Foriarraysize:
beq t1, s11, Interval_div2
add t2, zero, t1
slli s0, t1, 2
add s1, sp, s0
lw t5, 0(s1)
ReadData_Temp_done:
sub t3, t2, t0
slli s0, t3, 2
add s1, sp, s0
lw t4, 0(s1)
jal fp32_to_bf16
WhilejandFlag:
beq s10, zero, arrayjtotemp
bltu t2, t0, arrayjtotemp
and t4, t4, a4
slli s0, t2, 2
add s1, sp, s0
sw t4, 0(s1)
sub t2, t2, t0
jal ReadData_Temp_done
arrayjtotemp:
and t5, t5, a4
slli s0, t2, 2
add s1, sp, s0
sw t5, 0(s1)
addi t1, t1, 1
jal Foriarraysize
Interval_div2:
srli t0, t0, 1
jal Whileinterval
fp32_to_bf16:
and s0, a0, t4
and s1, a0, t5
and s2, a1, t4
and s3, a1, t5
and s4, a2, t4
and s5, a2, t5
and s6, a3, t4
and s7, a3, t5
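# Fall through into BOS. BIG and SMALL return directly to the caller,
# so ra must not be overwritten by another jal on the way.
BOS:
blt s0, s1, SMALL
beq s0, s1, Sigsame
j BIG
Sigsame:
beq s2, s3, Expsame
beq s0, zero, Exp1
j Exp2
Expsame:
beq s0, zero, Man1
j Man2
Exp1:
# both positive: the smaller exponent is the smaller value
blt s2, s3, SMALL
j BIG
Exp2:
# both negative: the smaller exponent is the larger value
blt s2, s3, BIG
j SMALL
Man1:
blt s6, s7, SMALL
j BIG
Man2:
blt s6, s7, BIG
j SMALL
BIG:
li s10, 1
ret
SMALL:
li s10, 0
ret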
End:
# Exit code can be placed here
li a7, 10
ecall
```
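The constants in the ```.data``` section can be checked with a short Python sketch. It reproduces the FP32 bit patterns of the dataset and applies the same masks as the assembly: ```bf16format``` keeps the upper 16 bits (plain truncation, no rounding), and ```sign```, ```exp```, and ```bf16man``` split a value into the fields that the comparison routine inspects. The function names are hypothetical.

```python
import struct

SIGN = 0x80000000         # sign
EXP = 0x7F800000          # exp
BF16_MAN = 0x007F0000     # bf16man: the 7 mantissa bits kept by bfloat16
BF16_FORMAT = 0xFFFF0000  # bf16format: upper 16 bits of the FP32 word


def fp32_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def to_bf16_bits(bits: int) -> int:
    """Truncate an FP32 bit pattern to bfloat16, as `and t4, t4, a4` does."""
    return bits & BF16_FORMAT


def split_fields(bits: int) -> tuple:
    """Return the masked (sign, exponent, bf16 mantissa) fields."""
    return bits & SIGN, bits & EXP, bits & BF16_MAN


for value in (1.6, -1.5, 1.4, -1.3, 1.2, -1.1):
    bits = fp32_bits(value)
    print(f"{value:5} fp32={bits:#010x} bf16={to_bf16_bits(bits):#010x}")
```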
Update ```csrc/Makefile``` so that ```hw2.asmbin``` is listed in ```BINS```:
```
...
BINS = \
fibonacci.asmbin \
hello.asmbin \
mmio.asmbin \
quicksort.asmbin \
sb.asmbin \
hw2.asmbin
...
```
Generate the VCD file:
```
$ ./run-verilator.sh -instruction src/main/resources/hw2.asmbin -time 2000 -vcd dump01.vcd
-time 2000
-memory 1048576
-instruction src/main/resources/hw2.asmbin
[-------------------->] 100%
```
