# Assignment3: SoftCPU
###### contributed by <`huang-me`>
###### tags: `Computer Architecture`

[TOC]

---

## Workflow

```graphviz
digraph g {
    rankdir="LR"
    {
        c [label="hw3.c"]
        gcc [label="xPack GNU RISC-V Embedded GCC", shape=box]
        elf [label="hw3.elf"]
        rank=same
    }
    sim [label="CPU simulator", shape=box]
    {
        ext [label="execution time"]
        ic [label="instruction count"]
        ec [label="execution cycles"]
        res2 [label="wave.fst"]
        rank=same
    }
    c->gcc
    gcc->elf [xlabel=" generate "]
    elf->sim
    sim->{ext, ic, ec} [label="print"]
    sim->res2 [xlabel=generate]
}
```

## [268. Missing Number](https://leetcode.com/problems/missing-number/)

At -O3 the compiler computes the result at compile time, so every measurement in this homework is taken at -O2.

### Original solution

```c=
int missingNumber(int *nums, int len) {
    int res = len;
    for(int i=0; i<len; ++i) {
        res ^= i;
        res ^= nums[i];
    }
    return res;
}
```

```
missing: 2
Excuting 1610 instructions, 2086 cycles, 1.295 CPI
Program terminate
- ../rtl/../testbench/testbench.v:418: Verilog $finish

Simulation statistics
=====================
Simulation time  : 0.043 s
Simulation cycles: 2097
Simulation speed : 0.0487674 MHz
```

![](https://i.imgur.com/RSFU86N.png)
> Most of the branch mispredictions occur in the for loop.

### Optimization

```c=
int missingNumber(int *nums, int len) {
    int res = len;
    // Handle two elements per iteration to halve the number of loop branches.
    for(int i=0; i<len-1; i+=2) {
        res ^= i;
        res ^= nums[i];
        res ^= i+1;
        res ^= nums[i+1];
    }
    // When len is odd, the last element is not covered by the loop above.
    if(len%2) {
        res ^= nums[len-1];
        res ^= len-1;
    }
    return res;
}
```

```
missing: 2
Excuting 1604 instructions, 2068 cycles, 1.289 CPI
Program terminate
- ../rtl/../testbench/testbench.v:418: Verilog $finish

Simulation statistics
=====================
Simulation time  : 0.038 s
Simulation cycles: 2079
Simulation speed : 0.0547105 MHz
```

![](https://i.imgur.com/VI9e0dR.png)
> I use loop unrolling to reduce the number of loop branches that have to be predicted, which lowers the cycle count by 18 (from 2086 to 2068) and the instruction count by 6 (from 1610 to 1604).
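
Before comparing cycle counts, it may help to confirm on the host that the unrolled loop returns the same value as the original for both even and odd lengths. The following is a minimal sketch, not part of the submitted `hw3.c`: the names `missing_number_ref`, `missing_number_unrolled`, and `check_versions_agree` are hypothetical, and the generated arrays are assumed test inputs, since the report does not show the array that was actually used.

```c
#include <assert.h>

/* Reference version: one index/value XOR pair per iteration. */
int missing_number_ref(const int *nums, int len)
{
    int res = len;
    for (int i = 0; i < len; ++i) {
        res ^= i;
        res ^= nums[i];
    }
    return res;
}

/* Unrolled by two, with a tail step for the leftover element when len is odd. */
int missing_number_unrolled(const int *nums, int len)
{
    int res = len;
    int i = 0;
    for (; i + 1 < len; i += 2) {
        res ^= i ^ nums[i];
        res ^= (i + 1) ^ nums[i + 1];
    }
    if (i < len) {
        res ^= i ^ nums[i];
    }
    return res;
}

/* For every length 0..MAX_LEN and every possible missing value, build the
 * array 0..len without that value and assert that both versions find it. */
void check_versions_agree(void)
{
    enum { MAX_LEN = 16 };
    int nums[MAX_LEN];

    for (int len = 0; len <= MAX_LEN; ++len) {
        for (int missing = 0; missing <= len; ++missing) {
            int k = 0;
            for (int v = 0; v <= len; ++v) {
                if (v != missing) {
                    nums[k++] = v;
                }
            }
            assert(missing_number_ref(nums, len) == missing);
            assert(missing_number_unrolled(nums, len) == missing);
        }
    }
}
```

Calling `check_versions_agree()` from a host-side `main` exercises both the even-length path and the odd-length tail, so any cycle difference measured in the simulator can be attributed to the loop structure rather than to a change in behaviour.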