# Lab3: srv32 - RISCV RV32IM Soft CPU

**Building C source files**

```
#include <stdio.h>

int main()
{
    int n = 7;
    /* volatile forces every access to go through memory,
       so the compiler cannot optimize the loop away */
    volatile int i;
    volatile int a = 1;
    volatile int b = 2;
    volatile int c;

    for (i = 2; i < n; i++) {
        c = a + b;
        a = b;
        b = c;
    }
    printf("%d", c);
    return 0;
}
```

**Building environment**

Follow the setup guide at https://hackmd.io/@eecheng/B1fEgnQwF

**Modify customized Makefile**

```
include ../common/Makefile.common

EXE      = .elf
SRC      = ClimbingStairs.c
CFLAGS  += -L../common
LDFLAGS += -T ../common/default.ld
TARGET   = ClimbingStairs
OUTPUT   = $(TARGET)$(EXE)

.PHONY: all clean

all: $(TARGET)

$(TARGET): $(SRC)
	$(CC) $(CFLAGS) -o $(OUTPUT) $(SRC) $(LDFLAGS)
	$(OBJCOPY) -j .text -O binary $(OUTPUT) imem.bin
	$(OBJCOPY) -j .data -O binary $(OUTPUT) dmem.bin
	$(OBJCOPY) -O binary $(OUTPUT) memory.bin
	$(OBJDUMP) -d $(OUTPUT) > $(TARGET).dis
	$(READELF) -a $(OUTPUT) > $(TARGET).symbol

clean:
```

I ran into a problem: when I used **make** in **~/srv32/tools**, I got an illegal instruction error:

![](https://i.imgur.com/wzGOPB5.png)

Running in **~/srv32/sim** also failed:

![](https://i.imgur.com/BaLi7nG.png)

The problem went away after I changed the C code to:

```
#include <stdio.h>

int main()
{
    int n = 7;
    volatile int i;
    volatile int a = 1;
    volatile int b = 2;
    volatile int c;

    for (i = 2; i < n; i++) {
        c = a + b;
        a = b;
        b = c;
    }
    /* printf("%d", c) removed: it triggered the errors above */
    return 0;
}
```

In the end, I found that when the `printf` call contains a variable (as in `printf("%d",c)`), the flow fails with the errors shown above; without that call it works.

**Running results**

**Run RTL sim**

![](https://i.imgur.com/b7i1RE6.png)

**Run ISS sim**

![](https://i.imgur.com/us4FXsB.png)

**gtk wave**

At 117 ps, regs[15] changes from 0 to 1. The run ends at 425 ps.

![](https://i.imgur.com/wBG04u8.png)

Mapping from C variables to RISC-V registers:

* int a => regs[15]
* int b => regs[14]
* int c => regs[15]

At 161 ps, regs[15] changes from 1 to 3, so the variable `c` does not get a register of its own: it is also computed in regs[15].

![](https://i.imgur.com/oNwJNSt.png)

We can also see that one loop iteration takes 60 ps.
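
To cross-check the values read off the waveform, it can help to tabulate what the loop computes in each iteration. The following is a minimal host-side sketch, not part of the lab code: the function name `climb_trace` and its `trace`/`cap` parameters are hypothetical and only mirror the recurrence from the C source above.

```c
#include <stddef.h>

/* Hypothetical helper (not in the original lab code): runs the same
 * recurrence as the lab program and records the value of c produced
 * by each loop iteration. Returns the number of iterations executed;
 * at most `cap` values are written to `trace`. */
size_t climb_trace(int n, int *trace, size_t cap)
{
    int a = 1;
    int b = 2;
    int c;
    size_t count = 0;

    for (int i = 2; i < n; i++) {
        c = a + b;
        a = b;
        b = c;
        if (count < cap)
            trace[count] = c;   /* value a register holding c would show */
        count++;
    }
    return count;
}
```

For `n = 7` the loop runs five times and `c` takes the values 3, 5, 8, 13, 21. The first value, 3, is consistent with the 1 to 3 transition observed on regs[15].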
**Wave analysis**

The branch misprediction penalty is 2 cycles.

![](https://i.imgur.com/8zrkT63.png)

The `bltu` instruction incurs a branch penalty of two cycles. The picture above shows that after the branch is taken, `ex_flush` is asserted for two cycles.

**Optimization**

Change the C code to:

```
#include <stdio.h>

int main()
{
    int n = 7;
    volatile int i;
    volatile int a = 1;
    volatile int b = 2;
    volatile int c;

    /* the five loop iterations (i = 2..6), written out by hand */
    c = a + b; a = b; b = c;
    c = a + b; a = b; b = c;
    c = a + b; a = b; b = c;
    c = a + b; a = b; b = c;
    c = a + b; a = b; b = c;
}
```

**Loop unrolling**

Advantages:

* Branch mispredictions are reduced.
* If the statements in the loop body have no data dependences on each other, unrolling increases the chance of executing them concurrently.
* The loop can also be unrolled dynamically at run time, which covers cases where the iteration count is not known at compile time.

Disadvantages:

* Code bloat.
* Code readability is reduced unless the compiler performs loop unrolling transparently.
* Recursion inside the loop body may reduce the benefit of loop unrolling.

**Result**

![](https://i.imgur.com/fItES5q.png)

The cycle count drops from 110 to 80, a saving of 110 - 80 = 30 cycles.
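
Before comparing cycle counts, it is worth confirming that the unrolled code still computes the same value as the loop. The following is a minimal host-side sketch under that assumption: the function names `climb_loop` and `climb_unrolled5` are hypothetical, and the unrolled variant hard-codes the five iterations that the loop performs for `n = 7`.

```c
/* Loop form, wrapping the recurrence from the original program.
 * c is initialised to 0 here (an addition for this sketch) so the
 * return value is defined even when the loop body never runs. */
int climb_loop(int n)
{
    volatile int i;
    volatile int a = 1;
    volatile int b = 2;
    volatile int c = 0;

    for (i = 2; i < n; i++) {
        c = a + b;
        a = b;
        b = c;
    }
    return c;
}

/* Hand-unrolled form: the same body repeated five times,
 * with no loop counter and no backward branch. */
int climb_unrolled5(void)
{
    volatile int a = 1;
    volatile int b = 2;
    volatile int c;

    c = a + b; a = b; b = c;
    c = a + b; a = b; b = c;
    c = a + b; a = b; b = c;
    c = a + b; a = b; b = c;
    c = a + b; a = b; b = c;
    return c;
}
```

Both functions return 21 for the `n = 7` case. The unrolled form is only valid for this fixed iteration count; a different `n` needs a different number of repeated statements.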