# Assignment1: RISC-V Assembly and Instruction Pipeline contributed by < [MingJun0609](https://github.com/MingJun0609) > ## Move Zeroes (LeetCode 283) Given an integer array nums, **move all 0's to the end** of it while maintaining the relative order of the non-zero elements. *Note that you must do this in-place without making a copy of the array.* ## Implementaion [source code](https://github.com/MingJun0609/Computer-Architecture). ### C code Use a variable **index** to record the starting point of a non-zero number, and use the variable i to traverse nums[]. When encounter nums[i] is non-zero, then **nums[index++] = nums[i]**, so all non-zero numbers will be in front of nums[]. After the loop is over, if **index < numsSize**, the remaining part is filled with zeros. :::warning :warning: Improve your grammar skills. :notes: jserv ::: ```c void moveZeroes(int nums[], int numsSize){ int index=0; for (int i = 0; i < numsSize; ++i){ if (nums[i]) nums[index++] = nums[i]; } while (index < numsSize) nums[index++] = 0; } ``` ### Assembly code ```c .data nums: .word 0, 1, 0, 3, 12 numsSize: .word 5 space: .string " " before: .string "Before move zeroes = " after: .string "After move zeroes = " nextline: .string "\n" .text main: la s0, nums # s0 = nums[0] lw s1, numsSize # s1 = numsSize la a0, before # print .string li a7, 4 ecall jal print # print nums[] before move la a0, nextline li a7, 4 ecall jal moveZeroes # moveZeroes la a0, after # print .string li a7, 4 ecall jal print # print nums[] after move li a7, 10 # exit ecall moveZeroes: addi a1, zero, 0 # a1: index = 0 addi t0, zero, -1 # t0: i = -1 Loop: addi t0, t0, 1 # i += 1 bge t0, s1, While # i >= len -> While: slli t1, t0, 2 # t1: i << 2 add t1, t1, s0 # t1 = nums[i] lw t1, 0(t1) beq t1, zero, Loop # nums[i] == 0 -> Loop: slli t2, a1, 2 # else, t2: index << 2 add t2, t2, s0 # t2 = nums[index] sw t1, 0(t2) # nums[index] = nums[i] addi a1, a1, 1 # index += 1 j Loop While: bge a1, s1, Exit # index >= numsSize : Exit slli t2, a1, 2 # t2: index << 2 add t2, t2, s0 # t2: nums[index] sw zero, 0(t2) # nums[index] = 0 addi a1, a1, 1 # index += 1 j While Exit: jr ra print: addi t0, zero, 0 # i = 0 loopi: bge t0, s1, end # i >= numsSize, end slli t1, t0, 2 # t1: i << 2 add t1, t1, s0 # t1: nums[i] lw a0, 0(t1) li a7, 1 # print int ecall la a0, space li a7, 4 ecall addi t0, t0, 1 # i += 1 j loopi end: jr ra ``` :::warning Can you use fewer insructions? :notes: jserv ::: ## Results Result of executing assembly code on Ripes is same as C code ![](https://i.imgur.com/3Ct46j2.jpg) ## Ripes Simulator In Ripes, there are several processor, like **single cycle processor**, **5-stage with hazard detection......**.etc. Here we choose the **5 stage processor**. Its block diagram look like this: ![](https://i.imgur.com/uajbELM.jpg) The **“5-stage”** means this processor using five-stage pipeline to parallelize instructions. Each stages are: 1. IF: Fetch instruction from memory. 2. ID: Read registers and decode the instruction. 3. EXE: Execute the operation or calculate an address. 4. MEM: Access an operand in data memory (if necessary). 5. WB: Write the result into a register (if necessary). How RSIC-V assembly code works on Ripes --- Below, I will use the **moveZeroes block** to explain how insrtuction run in the pipeline. Here is the moveZeroes Pseudo instruction: ```clike= 00000031 <moveZeroes>: 31: 00000593 addi x11 x0 0 35: fff00293 addi x5 x0 -1 00000039 <Loop>: 39: 00128293 addi x5 x5 1 3d: 0292d463 bge x5 x9 40 <While> 41: 00229313 slli x6 x5 2 45: 00830333 add x6 x6 x8 49: 00032303 lw x6 0 x6 4d: fe0306e3 beq x6 x0 -20 <Loop> 51: 00259393 slli x7 x11 2 55: 008383b3 add x7 x7 x8 59: 0063a023 sw x6 0 x7 5d: 00158593 addi x11 x11 1 61: fd9ff06f jal x0 -40 <Loop> 00000065 <While>: 65: 0095dc63 bge x11 x9 24 <Exit> 69: 00259393 slli x7 x11 2 6d: 008383b3 add x7 x7 x8 71: 0003a023 sw x0 0 x7 75: 00158593 addi x11 x11 1 79: fedff06f jal x0 -20 <While> ``` Pipeline Process --- #### IF: Instruction Fetch ![](https://i.imgur.com/XAmup0k.png =40%x)![](https://i.imgur.com/Q3PLkd6.png =60%x) 1. When **jal moveZeroes** at EXE stage, it execute result is the absolute address of 31: . Passing address by the datapath and set the below green dot of mulitipelxer, so in the next cycle, we can get the **addi x11 x0 0**. 2. Program Counter **added by 4**, set the green dot 00: of multiplexer, so the next cycle get the **addi x5 x0 -1**. #### ID: Instruction Decode ![](https://i.imgur.com/LEq74CE.jpg) (1) We can see the green dot on, cause **jal instr.** at WB stage, it will store the next instrution of jal into the **ra** register, so we know where to jump return after the moveZeroes function end. (2) Decode the instrution.![](https://i.imgur.com/lXy0Nvo.png) (3) Pass corresponding bits **rs1** and **rs2**. (4) Store the corresponding bits **rd** into pipeline register, so at the WB stage, the instrution can know which register to write. #### EXE: Execution ![](https://i.imgur.com/Z0c0IOW.jpg) (1) The green dot depends on three way, * 00: if the data at the WB stage and prepare to write in reg. It can solve data Dependencies. * 01: Data in the Reg1. * 10: Data is execute by the front instrution. It can solve data Dependencies. (2) Similar to (1) (3) The green dot depends on two way, * 00: program counter * 01: Data in the Reg1 (4) The green dot depends on two way, * 00: Data in the Reg2 * 01: Imm., like I-type insrt. need pass bit[31:12] (5) In this case, **beq instr.** at EXE, it need compare the Reg1(1 forwarding from WB) and Reg2. And the ALU need to calcualte the PC-relative address by add PC and Imm. Because Reg1==Reg2, so this branch need to jump -20, it means go back to previous five instrutions. Therefore it has control-dependencies, and need to flush the wrong instr. at IF and ID. ![](https://i.imgur.com/A0HWNhH.png =50%x)![](https://i.imgur.com/7fjgdJl.png =50%x) #### MEM: Memory Access ![](https://i.imgur.com/OICtXwQ.jpg =80%x) (1) In this case, **sw** need to write data(x6) into memory address 0(x7), so the dot is green. (2) **lw** instr. load the data from memory, but not write data to memory, so the dot is red. (3) If the instrution is **not lw/sw**, then goto MEM/WB register directly. #### WB - Register write back ![](https://i.imgur.com/y8ed9bX.jpg) Write back has three possible value, * 00: data from pc+4 * 01: data need to write back to Reg * 10: data from memory Reference --- * [RISC-V Instruction Set Manual](https://riscv.org//wp-content/uploads/2017/05/riscv-spec-v2.2.pdf) * [Computer Organization and Design RISC-V](http://home.ustc.edu.cn/~louwenqi/reference_books_tools/Computer%20Organization%20and%20Design%20RISC-V%20edition.pdf) - **printResult** The instruction "ecall" will evoke OS to do the I/O stuff. In Ripes, you put the corresponding value into register a0 and a7 before "ecall" can help. In my code, I tried to print out integers and strings to the console. ![](https://i.imgur.com/OBjROWt.png) ***