# Lab1: RV32I Simulator contributed by < [geniuseric](https://github.com/geniuseric) > ## Requirement [Assignment1: RISC-V Assembly and Instruction Pipeline](https://hackmd.io/@sysprog/2021-arch-homework1) ## Search Insert Position - Given a sorted array of distinct integers and a target value, return the index if the target is found. If not, return the index where it would be if it were inserted in order. - You must write an algorithm with O(log n) runtime complexity. - You can try your own code [here](https://leetcode.com/problems/search-insert-position/). ## Implementation - The sorted array is given as {2, 3, 8, 11, 13, 19} and a target value is set to 13. Since 13 is at index 4 of the array, the program should return the index 4. - The source code is written in [here](https://github.com/geniuseric/Computer_Architecture/tree/master/HW1). - Environment calls are listed [here](https://github.com/mortbopet/Ripes/wiki/Environment-calls). ### C Code ```clike= #include <stdio.h> int searchInsert(int* nums, int numsSize, int target){ int down = 0, up = numsSize - 1, mid = 0; while (down <= up){ mid = (down + up) / 2; if (target == nums[mid]) return mid; else if (target < nums[mid]) up = mid - 1; else down = mid + 1; } return down; } int main(void) { int data[] = {2, 3, 8, 11, 13, 19}; int size = 6, target = 13; int index = searchInsert(data, size, target); printf("The insert position is %d\n", index); return 0; } ``` ### Assembly Code ```asm= .data data: .word 2, 3, 8, 11, 13, 19 # data[] = {2, 3, 8, 11, 13, 19} size: .word 6 # array size = 6 tar: .word 13 # target = 13 str: .string "The insert position is " .text # s1 = data base address # s2 = array size # s3 = target # s4 = index # t0 = down # t1 = up # t2 = mid # t3 = address offset # t4 = date point address # t5 = data[mid] main: la s1, data # s1 = address lw s2, size # s2 = 6 lw s3, tar # s3 = 13 add t0, x0, x0 # t0 = 0 addi t1, s2, -1 # t1 = size - 1 = 5 jal ra, loop # jump and link to loop jal ra, print # jump and link to print li a7, 10 # end program ecall loop: add t2, t0, t1 # mid = down + up srli t2, t2, 1 # mid = mid / 2 slli t3, t2, 2 # offset = mid * 4 add t4, s1, t3 # point address = base address + offset lw t5, 0(t4) # t5 = data[mid] beq s3, t5, equal # if target == data[mid], go to equal blt s3, t5, less # if target < data[mid], go to less addi t0, t2, 1 # if target > data[mid], down = mid + 1 j end # jump to end less: addi t1, t2, -1 # if target < data[mid], up = mid - 1 end: bge t1, t0, loop # if up >= down, go to loop mv s4, t0 # index = down ret # return to main equal: mv s4, t2 # index = mid ret # return to main print: la a0, str # load string li a7, 4 # print string ecall add a0, s4, x0 # load result li a7, 1 # print integer ecall ret # return to main ``` ### Console Output The output result is the same as I thought. ![](https://i.imgur.com/7BX5cth.png) ## Analysis We test our assembly code using [Ripes](https://github.com/mortbopet/Ripes) simulator. ### Disassembled By Ripes - The code below is disassebled by Ripes simulator. - Pseudoinstructions are convertued to base instructions, such as `la` and `mv`. - Register names are changed from application binary interface (ABI) to x1-x31. - [RISC-V Assembly Programmer’s Manual](https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md) ```asm= 00000000 <main>: 0: 10000497 auipc x9 0x10000 4: 00048493 addi x9 x9 0 8: 10000917 auipc x18 0x10000 c: 01092903 lw x18 16 x18 10: 10000997 auipc x19 0x10000 14: 00c9a983 lw x19 12 x19 18: 000002b3 add x5 x0 x0 1c: fff90313 addi x6 x18 -1 20: 010000ef jal x1 16 <loop> 24: 048000ef jal x1 72 <print> 28: 00a00893 addi x17 x0 10 2c: 00000073 ecall 00000030 <loop>: 30: 006283b3 add x7 x5 x6 34: 0013d393 srli x7 x7 1 38: 00239e13 slli x28 x7 2 3c: 01c48eb3 add x29 x9 x28 40: 000eaf03 lw x30 0 x29 44: 03e98063 beq x19 x30 32 <equal> 48: 01e9c663 blt x19 x30 12 <less> 4c: 00138293 addi x5 x7 1 50: 0080006f jal x0 8 <end> 00000054 <less>: 54: fff38313 addi x6 x7 -1 00000058 <end>: 58: fc535ce3 bge x6 x5 -40 <loop> 5c: 00028a13 addi x20 x5 0 60: 00008067 jalr x0 x1 0 00000064 <equal>: 64: 00038a13 addi x20 x7 0 68: 00008067 jalr x0 x1 0 0000006c <print>: 6c: 10000517 auipc x10 0x10000 70: fb450513 addi x10 x10 -76 74: 00400893 addi x17 x0 4 78: 00000073 ecall 7c: 000a0533 add x10 x20 x0 80: 00100893 addi x17 x0 1 84: 00000073 ecall 88: 00008067 jalr x0 x1 0 ``` ### 5-Stage Pipeline Processor There are several processors we can choose from Ripes. I choose 5-stage processor for analysis, which is a pipelined processor with hazard detection/elimination and forwarding. ![](https://i.imgur.com/vGwWAKd.png) ### Processor Block Diagram This is the block diagram of the 5-stage processor. Different processors look different. ![](https://i.imgur.com/OLrY8FE.png) ### Pipeline Diagram The figure below shows how 5-stage pipeline works for parallel processing. ![](https://i.imgur.com/lB9ycK6.png) ### Instruction Explanation - Take instruction `jal ra, print` for example. It is diassembled to `jal x1 72`. - According to [RISC-V Manual](https://riscv.org/wp-content/uploads/2019/06/riscv-spec.pdf), the jump and link (JAL) instruction uses the J-type format. - Let's see how does this instruction go thorough each stage. 1. Instruction Fetch (IF) - Program counter (PC) is now at 0x24, which means `jal x1 72` is located at address 0x24. - The instruction in instruction memory is 0x048000ef. ![](https://i.imgur.com/Qk64Gmz.png) 2. Instruction Decode (ID) - J-type format ![](https://i.imgur.com/ymGtpAY.png) - After decoding, the `opcode` is JAL, `dest` is register x1(ra) and signed `offset` is 0x48. ![](https://i.imgur.com/MvCWDmd.png) 3. Execution (EX) - Operand `Op1` is the current PC 0x24 and pperand `Op2` is the signed offset 0x48. - Arithmetic logic unit (ALU) adds two operands together. - The reuslt `Res` equals 0x6c, which is the next PC. ![](https://i.imgur.com/ZRJHLOk.png) 4. Memroy (MEM) - JAL instruction doesn't access memory, there is no operation in this stage. ![](https://i.imgur.com/ksFfT0b.png) 5. Write Back (WB) - JAL stores the address of the instruction following the jump (pc+4) into `dest`. ![Register](https://i.imgur.com/4zx1jbQ.png) - Register x1(ra) is 0x28 after WB. ![WB](https://i.imgur.com/RaYbTBu.png) ## Reference - [LeetCode 35](https://leetcode.com/problems/search-insert-position/) - [Environment Calls](https://github.com/mortbopet/Ripes/wiki/Environment-calls) - [Ripes Simulator](https://github.com/mortbopet/Ripes) - [RISC-V Assembly Programmer’s Manual](https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md) - [RISC-V Instruction Set Manual](https://riscv.org/wp-content/uploads/2019/06/riscv-spec.pdf)