# Assignment2 RISC-V Toolchain
###### tags: `Computer Architecture`

## Greatest common divisor

You can find more detailed information by clicking the link **[here](https://hackmd.io/@cccccs100203/lab1-RISC-V).**

```clike=
.data
num1:   .word 120          # number 1 (u)
num2:   .word 78           # number 2 (v)
str1:   .string "\n"
nums:   .word 30 10

.text
main:
        lw   s0, num1      # u
        lw   s1, num2      # v

        # check whether u or v is 0
        beqz s1, print     # v == 0: the result is u
        beqz s0, assign    # u == 0: the result is v

        # use the Euclidean algorithm
loop:
        mv   s2, s1        # temp = v
        rem  s1, s0, s1    # v = u % v
        mv   s0, s2        # u = temp
        bnez s1, loop
        j    print

assign:
        mv   s0, s1
        j    print

print:
        add  a0, s0, x0    # integer to print (the gcd)
        addi a7, x0, 1     # print_int
        ecall
exit:
```

## Translate the assembly programs into C manually

```clike=
int gcd(int, int);

int _start() {
    int res = gcd(120, 78);
    /* memory-mapped output address used in this assignment */
    volatile int *tx = (volatile int *)0x40002000;
    *tx = res;
    return 0;
}

int gcd(int a, int b) {
    int temp;
    while (b) {
        temp = b;
        b = a % b;
        a = temp;
    }
    return a;
}
```

## Execute assembly programs on rv32emu emulator

When I tried to compile the C program, some errors occurred. The command I entered in the terminal was:

```
riscv-none-embed-gcc -march=rv32i -mabi=ilp32 -O3 -nostdlib gcd.c -o gcd
```

Then I found out that the standard libraries were not being linked (because `-nostdlib` was specified!). Since I'm not familiar with this compiler, I decided to keep the `-nostdlib` option. The consequence is that `*`, `/`, and `%` cannot be used directly: RV32I has no multiply or divide instructions, so the compiler lowers these operators to calls to libgcc helper routines (for example `__modsi3` for `%`), and `-nostdlib` leaves libgcc out of the link. Subtraction is different: RV32I has a native `sub` instruction, so `-` needs no helper. To make the build work, any missing operator has to be implemented as an ordinary function.

![image](https://i.imgur.com/NQBJ1XH.png)

So I implemented the modulo operator (`%`) myself as a function:
```clike=
int gcd(int, int);
int mod(int, int);
char int_to_char(int);

int _start() {
    int res = gcd(120, 78);
    const char *str = "gcd(120, 78) = ";
    volatile int *tx = (volatile int *)0x40002000;

    /* print the label one character at a time */
    while (*str) {
        *tx = *str;
        str++;
    }
    /* the result is a single digit here, so one character is enough */
    *tx = int_to_char(res);
    return 0;
}

/* a % b without using '%'; assumes a >= 0 and b > 0 */
int mod(int a, int b) {
    int temp = 0;
    /* find the smallest multiple of b that is greater than a */
    while (temp <= a) {
        temp = temp + b;
    }
    /* a + b - temp, with the subtraction written as two's complement */
    return (a + b) + ((~temp) + 1);
}

int gcd(int a, int b) {
    int temp;
    while (b) {
        temp = b;
        b = mod(a, b);
        a = temp;
    }
    return a;
}

/* converts a single digit 0..9 to its character; anything else gives '-' */
char int_to_char(int num) {
    char str;
    switch (num) {
        case 0: str = '0'; break;
        case 1: str = '1'; break;
        case 2: str = '2'; break;
        case 3: str = '3'; break;
        case 4: str = '4'; break;
        case 5: str = '5'; break;
        case 6: str = '6'; break;
        case 7: str = '7'; break;
        case 8: str = '8'; break;
        case 9: str = '9'; break;
        default: str = '-'; break;
    }
    return str;
}
```

With these changes, the program compiles.

## Compare the assembly listing between handwritten and compiler optimized one

The generated assembly code looks like this:

![image](https://i.imgur.com/AaNBIbS.png)
![image](https://i.imgur.com/Vw8nVk9.png)

There are several differences between the compiler-generated assembly code and the handwritten one. First, the compiler-generated code contains more instructions than the handwritten version. Second, compared with the handwritten assembly, the generated code is less intuitive, which may make debugging harder.

## The statistics of execution flow

After the compiler-generated assembly code was executed on the emu-rv32i emulator, we can see the summary on the terminal:

![image](https://i.imgur.com/r0HsnDt.png)

The size of the ELF file:

![image](https://i.imgur.com/LP1yAlS.png)

Summary of execution:

![image](https://i.imgur.com/c1FbXM4.png)

## Optimize the generated assembly

Then I checked the execution of the size-optimized generated assembly code.

![image](https://i.imgur.com/0KZOmiB.png)

It seems that the speed-optimized assembly code executes faster than the size-optimized one. The speedup is about 1.16.
This is probably because the size-optimized code executes about 2.68 times as many jump instructions as the speed-optimized code. On a pipelined processor, fewer jumps mean fewer instructions flushed from the pipeline, which may lead to a shorter execution time.
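
Since the hand-written `mod` function replaces `%` in the `-nostdlib` build, it can be worth checking it on the host machine against the built-in operator before trusting the emulator output. The following is only a minimal sketch under stated assumptions: it covers non-negative `a` and positive `b` (the only inputs the helper is meant for), and `count_mod_mismatches` is a hypothetical helper introduced here for illustration, not part of the assignment code.

```c
/* Same subtraction-free remainder as in this note: valid for a >= 0, b > 0. */
int mod(int a, int b)
{
    int temp = 0;
    while (temp <= a)
        temp = temp + b;              /* smallest multiple of b greater than a */
    return (a + b) + ((~temp) + 1);   /* a + b - temp via two's complement */
}

/* Euclidean algorithm built on top of mod(), as in the note. */
int gcd(int a, int b)
{
    while (b) {
        int temp = b;
        b = mod(a, b);
        a = temp;
    }
    return a;
}

/* Hypothetical helper: counts the pairs (a, b) with 0 <= a <= limit and
 * 1 <= b <= limit for which mod(a, b) disagrees with the host's a % b. */
int count_mod_mismatches(int limit)
{
    int mismatches = 0;
    for (int a = 0; a <= limit; a++)
        for (int b = 1; b <= limit; b++)
            if (mod(a, b) != a % b)
                mismatches++;
    return mismatches;
}
```

Compiled with an ordinary host compiler, a call such as `count_mod_mismatches(200)` should report zero mismatches if the helper agrees with `%` on that range. Keeping the limit small avoids any risk of `temp + b` overflowing.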