# Assignment1: RISC-V Assembly and Instruction Pipeline
contributed by < [MingJun0609](https://github.com/MingJun0609) >
## Move Zeroes (LeetCode 283)
Given an integer array nums, **move all 0's to the end** of it while maintaining the relative order of the non-zero elements.
*Note that you must do this in-place without making a copy of the array.*
## Implementaion
[source code](https://github.com/MingJun0609/Computer-Architecture).
### C code
Use a variable **index** to record the starting point of a non-zero number, and use the variable i to traverse nums[].
When encounter nums[i] is non-zero, then **nums[index++] = nums[i]**, so all non-zero numbers will be in front of nums[].
After the loop is over, if **index < numsSize**, the remaining part is filled with zeros.
:::warning
:warning: Improve your grammar skills.
:notes: jserv
:::
```c
void moveZeroes(int nums[], int numsSize){
int index=0;
for (int i = 0; i < numsSize; ++i){
if (nums[i])
nums[index++] = nums[i];
}
while (index < numsSize)
nums[index++] = 0;
}
```
### Assembly code
```c
.data
nums: .word 0, 1, 0, 3, 12
numsSize: .word 5
space: .string " "
before: .string "Before move zeroes = "
after: .string "After move zeroes = "
nextline: .string "\n"
.text
main:
la s0, nums # s0 = nums[0]
lw s1, numsSize # s1 = numsSize
la a0, before # print .string
li a7, 4
ecall
jal print # print nums[] before move
la a0, nextline
li a7, 4
ecall
jal moveZeroes # moveZeroes
la a0, after # print .string
li a7, 4
ecall
jal print # print nums[] after move
li a7, 10 # exit
ecall
moveZeroes:
addi a1, zero, 0 # a1: index = 0
addi t0, zero, -1 # t0: i = -1
Loop:
addi t0, t0, 1 # i += 1
bge t0, s1, While # i >= len -> While:
slli t1, t0, 2 # t1: i << 2
add t1, t1, s0 # t1 = nums[i]
lw t1, 0(t1)
beq t1, zero, Loop # nums[i] == 0 -> Loop:
slli t2, a1, 2 # else, t2: index << 2
add t2, t2, s0 # t2 = nums[index]
sw t1, 0(t2) # nums[index] = nums[i]
addi a1, a1, 1 # index += 1
j Loop
While:
bge a1, s1, Exit # index >= numsSize : Exit
slli t2, a1, 2 # t2: index << 2
add t2, t2, s0 # t2: nums[index]
sw zero, 0(t2) # nums[index] = 0
addi a1, a1, 1 # index += 1
j While
Exit:
jr ra
print:
addi t0, zero, 0 # i = 0
loopi:
bge t0, s1, end # i >= numsSize, end
slli t1, t0, 2 # t1: i << 2
add t1, t1, s0 # t1: nums[i]
lw a0, 0(t1)
li a7, 1 # print int
ecall
la a0, space
li a7, 4
ecall
addi t0, t0, 1 # i += 1
j loopi
end:
jr ra
```
:::warning
Can you use fewer insructions?
:notes: jserv
:::
## Results
Result of executing assembly code on Ripes is same as C code

## Ripes Simulator
In Ripes, there are several processor, like **single cycle processor**, **5-stage with hazard detection......**.etc.
Here we choose the **5 stage processor**. Its block diagram look like this:

The **“5-stage”** means this processor using five-stage pipeline to parallelize instructions. Each stages are:
1. IF: Fetch instruction from memory.
2. ID: Read registers and decode the instruction.
3. EXE: Execute the operation or calculate an address.
4. MEM: Access an operand in data memory (if necessary).
5. WB: Write the result into a register (if necessary).
How RSIC-V assembly code works on Ripes
---
Below, I will use the **moveZeroes block** to explain how insrtuction run in the pipeline.
Here is the moveZeroes Pseudo instruction:
```clike=
00000031 <moveZeroes>:
31: 00000593 addi x11 x0 0
35: fff00293 addi x5 x0 -1
00000039 <Loop>:
39: 00128293 addi x5 x5 1
3d: 0292d463 bge x5 x9 40 <While>
41: 00229313 slli x6 x5 2
45: 00830333 add x6 x6 x8
49: 00032303 lw x6 0 x6
4d: fe0306e3 beq x6 x0 -20 <Loop>
51: 00259393 slli x7 x11 2
55: 008383b3 add x7 x7 x8
59: 0063a023 sw x6 0 x7
5d: 00158593 addi x11 x11 1
61: fd9ff06f jal x0 -40 <Loop>
00000065 <While>:
65: 0095dc63 bge x11 x9 24 <Exit>
69: 00259393 slli x7 x11 2
6d: 008383b3 add x7 x7 x8
71: 0003a023 sw x0 0 x7
75: 00158593 addi x11 x11 1
79: fedff06f jal x0 -20 <While>
```
Pipeline Process
---
#### IF: Instruction Fetch

1. When **jal moveZeroes** at EXE stage, it execute result is the absolute address of 31: . Passing address by the datapath and set the below green dot of mulitipelxer, so in the next cycle, we can get the **addi x11 x0 0**.
2. Program Counter **added by 4**, set the green dot 00: of multiplexer, so the next cycle get the **addi x5 x0 -1**.
#### ID: Instruction Decode

(1) We can see the green dot on, cause **jal instr.** at WB stage, it will store the next instrution of jal into the **ra** register, so we know where to jump return after the moveZeroes function end.
(2) Decode the instrution.
(3) Pass corresponding bits **rs1** and **rs2**.
(4) Store the corresponding bits **rd** into pipeline register, so at the WB stage, the instrution can know which register to write.
#### EXE: Execution

(1) The green dot depends on three way,
* 00: if the data at the WB stage and prepare to write in reg. It can solve data Dependencies.
* 01: Data in the Reg1.
* 10: Data is execute by the front instrution. It can solve data Dependencies.
(2) Similar to (1)
(3) The green dot depends on two way,
* 00: program counter
* 01: Data in the Reg1
(4) The green dot depends on two way,
* 00: Data in the Reg2
* 01: Imm., like I-type insrt. need pass bit[31:12]
(5) In this case, **beq instr.** at EXE, it need compare the Reg1(1 forwarding from WB) and Reg2. And the ALU need to calcualte the PC-relative address by add PC and Imm. Because Reg1==Reg2, so this branch need to jump -20, it means go back to previous five instrutions. Therefore it has control-dependencies, and need to flush the wrong instr. at IF and ID.

#### MEM: Memory Access

(1) In this case, **sw** need to write data(x6) into memory address 0(x7), so the dot is green.
(2) **lw** instr. load the data from memory, but not write data to memory, so the dot is red.
(3) If the instrution is **not lw/sw**, then goto MEM/WB register directly.
#### WB - Register write back

Write back has three possible value,
* 00: data from pc+4
* 01: data need to write back to Reg
* 10: data from memory
Reference
---
* [RISC-V Instruction Set Manual](https://riscv.org//wp-content/uploads/2017/05/riscv-spec-v2.2.pdf)
* [Computer Organization and Design RISC-V](http://home.ustc.edu.cn/~louwenqi/reference_books_tools/Computer%20Organization%20and%20Design%20RISC-V%20edition.pdf)
- **printResult**
The instruction "ecall" will evoke OS to do the I/O stuff. In Ripes, you put the corresponding value into register a0 and a7 before "ecall" can help. In my code, I tried to print out integers and strings to the console.

***