# Lab1: RV32I simulator ###### tags: `jserv Computer Architecture 2021` ## Power of Four ([LeetCode 342](https://leetcode.com/problems/power-of-four/)) * Problem Definition:Given an integer n, return true if it is a power of four. Otherwise, return false. An integer n is a power of four, if there exists an integer x such that n == 4^x^. * The C code implementation is as follow. ```c bool isPowerOfFour(int n){ if (n==0) return false; if (n==1 || n==4) return true; if (n%4==0) return isPowerOfFour(n/4); else return false; } ``` :::warning TODO: Reduce the number of branches. :notes: jserv ::: ## Power of Four Assembly (old, 5 branches) ```c .data argument: .word 7 # input number str1: .string "Number " str2: .string " is a power of 4" str3: .string " is not a power of 4" .text main: lw a0, argument # Load argument from static data jal ra, isPowerOfFour # Jump-and-link to the 'isPowerOfFour' label jal ra, printResult # Exit program li a7, 10 ecall isPowerOfFour: addi sp, sp, -4 sw a0, 0(sp) addi t3, t3, 1 # t3 = 1 beq a0, t3 , true # if(n == 1) return True beq a0, zero , false # if(n == 0) return False blt a0, zero , false # if(n < 0) return False loop: andi t4 , a0 , 3 # t4 = n % 4 bne t4,zero,false # if(n % 4 != 0 ) return False srli a0,a0,2 # n = n / 4 beq a0,t3,true # while(n != 1) j loop # do loop true: addi t5, t5 , 1 j res false: addi t5, t5 , 0 res: addi a1,t5,0 # return 0/1 lw a0, 0(sp) addi sp, sp, 4 jr ra printResult: # print mv t0, a0 mv t1, a1 la a0, str1 li a7, 4 ecall mv a0, t0 li a7, 1 ecall beq t1, zero, f la a0, str2 li a7, 4 ecall ret f: la a0, str3 li a7, 4 ecall ret ``` ## Power of Four Assembly (new, 4 branches) ```clike= .data argument: .word 5 str1: .string "Number " str2: .string " is a power of 4" str3: .string " is not a power of 4" .text main: lw a0, argument # Load argument from static data jal ra, isPowerOfFour # Jump-and-link to the 'fact' label lw a0, argument jal ra, printResult # Exit program li a7, 10 ecall isPowerOfFour: addi sp, sp, -4 sw ra, 0(sp) beq a0, zero , false addi t3, zero, 1 addi t6, zero, 4 beq a0, t3 , true beq a0, t6 , true loop: andi t4 , a0 , 3 bne t4,zero,false srli a0,a0,2 jal ra, isPowerOfFour add a1,t5,a1 j res2 true: addi t5, t5 , 1 j res2 false: addi t5, t5 , 0 res2: addi a1,t5,0 lw ra, 0(sp) addi sp, sp, 4 end: jr ra printResult: mv t0, a0 mv t1, a1 la a0, str1 li a7, 4 ecall mv a0, t0 li a7, 1 ecall beq t1, zero, f la a0, str2 li a7, 4 ecall ret f: la a0, str3 li a7, 4 ecall ret ``` ### Result we can check result in console ![](https://i.imgur.com/PyjZOFc.png) ### How to use ? You can change the value you want on the argument( from number 2 row of the assembly code ). ![](https://i.imgur.com/FusaxPz.png) ### Assembly Explained Main idea: To compute the number if it is a power of four, I apply the idea that n must be multiplied by many of 4's if TRUE . So I take the remainder of n whether it is 0 or not . if it is zero , then do n / = 4 by recursively; over and over again in order to ensure divisors of n are all four . There’re 7 sections this program namely `main`, `isPowerOfFour`, `loop`, `true`, `false`, `res2`, `printResult` . * `main`: * Load input value to the `a0` argument register. * Jump to `isPowerOfFour` . * Jump to `printResult` . * Exit program . * `isPowerOfFour`: * Enter to the procedure, and save `ra` argument register to stack. * if (n < 0 ) branch `false`. * if (n == 1 || n ==4) branch `true`. * `loop`: * Do the remainder of n and put it into `t4` temporary register. * if ( t4 != 0 ) branch `false`. * else do n = n / 4 . * call `isPowerOfFour` * get return value * jump to `res2` * `true`: * Move integer 1 to `t5` temporary register. * `false`: * Move integer 0 to `t5` temporary register. * `res2`: * Move `t5` temporary register to return register `a1` * set return register `a1`. * Pop up stack value. load `ra` argument register. * return to `ra` . * `printResult`: * The `ecall` instuction represented system call in RISC-V. * We can check the system call by looking up the system service table. The system call code is loaded by `a7` register. * The system call code corresponding to print string is 4. * The system call code corresponding to print integer is 1. ![](https://i.imgur.com/cOUGC4c.png) ## Analyze ### RISC-V Pipline Ripes is a 5-stages pipeline CPU ![](https://i.imgur.com/Av3X5lV.png) * IF * In this stage, the CPU reads instructions from the address in the instruction memory. * ID * In this stage, instruction is decoded and get the value from the registers. * EX * In this stage, ALU operations are performed. * MEM * In this stage, memory operands are read and written from/to the data memory. * WB * In this stage, computed/fetched value is written back to the register. ### Pipeline Hazard * Structure Hazard * Different instructions want to use the same resource at the same time. * Data Hazard * An instruction in the pipeline needs to use a result that has not been produced by the instruction in the previous stage. * Control Hazard * When the Branch instruction has not yet decided whether to branch, but the subsequent instructions have entered the pipeline, the execution sequence is wrong. #### How to handle hazard * Jump instruction ![](https://i.imgur.com/4nhDa2K.png) when we call `jal`, it means it will jump to other farther address. But, jump condition will be handled in `MEM` stage to modify the program counter. So, the pipeline will still fetch the next two instructions in `IF` and `ID` stage. ![](https://i.imgur.com/9fqvjX8.png) In fact, these two instructions(`jal printResult` and `addi`) must be flushed to be `NOP`, because these are not actually correct instructions after `jal isPowerOfFour`. ![](https://i.imgur.com/4SawBQv.png) * Data hazard(saved by forwarding) ![](https://i.imgur.com/wSKS403.png) `andi` is a write-back instruction, so `t4` register may be updated in the `WB` stage. But, the next `bne` instruction also needs `t4` register value in the `ID` stage. It won't be possible for `bne` to get the value in time. As you can see, ![](https://i.imgur.com/5EgLsqh.png) To solve this problem, forwarding is useful. Forwarding can send the calculated value `t4` from `EX` stage immediately to the next instruction `EX` stage, then `bne` can get the value in time. This is so-called `EX-hazard`. ![](https://i.imgur.com/x6YauLh.png) * Control hazard ![](https://i.imgur.com/JwH0m5w.png) `bne` will decided in `EXE` stage whether it will branch or not. ![](https://i.imgur.com/KRwqqbv.png) If it branches, it will occur two intructions (`srli` and `beq`) flushed into `nop`. In the other hand, if not branch, it will not occur `control hazard`. ![](https://i.imgur.com/i7IJ4BR.png) ## Reference https://leetcode.com/problems/power-of-four/ https://king0980692.medium.com/computer-architecture-cheat-sheet-pipeline-hazard-ee27d0d66e89