# Lab2: RISC-V Toolchain
###### tags: `Course` `RISC-V` `Computer Architecture 2022`

## Rewrite [Decode XORed Array](https://leetcode.com/problems/decode-xored-array/)

* **Problem :** I chose **Decode XORed Array** from [黃冠宇](https://hackmd.io/@ZLQisilvQvOh2DclLmk1bg/SyjKI7sZi).
* **Motivation :** I think this problem is tricky: decoding works only because XOR is its own inverse, so from `encoded[i] = arr[i] ^ arr[i+1]` each element can be recovered as `arr[i+1] = arr[i] ^ encoded[i]`.

```c=
#include <stdio.h>
#include <stdlib.h>  /* malloc */

int* decode(int* encoded, int encodedSize, int first, int* returnSize){
    int* result = (int*)malloc(sizeof(int)*(encodedSize+1));
    result[0]=first;
    for(int i=0;i<encodedSize;i++){
        result[i+1]=result[i] ^ encoded[i];
    }
    *returnSize = encodedSize+1;
    return result;
}
```

* **Modified :** His C code does not include any test data, so I dropped the `returnSize` parameter and added a `main` with a test input, as shown below.

```c=
#include <stdlib.h>
#include <stdio.h>

/* Rebuild the original array: result[i+1] = result[i] ^ encoded[i]. */
int* decode(int* encoded, int encodedSize, int first){
    int* result = (int*)malloc(sizeof(int)*(encodedSize+1));
    result[0]=first;
    for(int i=0;i<encodedSize;i++){
        result[i+1]=result[i] ^ encoded[i];
    }
    return result;
}

int main(){
    int nums1[4] = {6,2,7,3};            /* encoded input */
    int* result1 = decode(nums1, 4, 4);  /* first element is 4 */
    printf("The result is : ");
    for(int j=0; j<5; j++){              /* decoded array has 4+1 elements */
        printf("%d ", result1[j]);
    }
    return 0;
}
```

![](https://i.imgur.com/kY7F86G.png)

## Using RISC-V gcc

* Set up the environment variables
> cd $HOME
> source riscv-none-elf-gcc/setenv
* Compile (swap `-O3` for `-O1`, `-O2`, or `-Os` to produce the other builds compared below)
> riscv-none-elf-gcc -march=rv32i -mabi=ilp32 -O3 -o HW2 HW2.c
* Disassemble and save the output to a text file
> riscv-none-elf-objdump -d ./HW2 > asb.txt
* Inspect the ELF header
> riscv-none-elf-readelf -h ./HW2
* Check the section sizes
> riscv-none-elf-size ./HW2
* Run the program
> build/rv32emu HW2

## Compare Assembly Code

### Original Code (From [黃冠宇](https://hackmd.io/@ZLQisilvQvOh2DclLmk1bg/SyjKI7sZi))

### -O1 Optimized Assembly Code

![](https://i.imgur.com/GC6mZgH.png)

* Assembly code

```RISCV=
00010208 <main>:
10208:  fe010113  addi sp,sp,-32
1020c:  00112e23  sw ra,28(sp)
10210:  00812c23  sw s0,24(sp)
10214:  00912a23  sw s1,20(sp)
10218:  01212823  sw s2,16(sp)
1021c:  000217b7  lui a5,0x21
10220:  73878793  addi a5,a5,1848 # 21738 <__clzsi2+0xa4>
10224:  0007a603  lw a2,0(a5)
10228:  0047a683  lw a3,4(a5)
1022c:  0087a703  lw a4,8(a5)
10230:  00c7a783  lw a5,12(a5)
10234:  00c12023  sw a2,0(sp)
10238:  00d12223  sw a3,4(sp)
1023c:  00e12423  sw a4,8(sp)
10240:  00f12623  sw a5,12(sp)
10244:  00400613  li a2,4
10248:  00400593  li a1,4
1024c:  00010513  mv a0,sp
10250:  f35ff0ef  jal ra,10184 <decode>
10254:  00050493  mv s1,a0
10258:  00021537  lui a0,0x21
1025c:  72050513  addi a0,a0,1824 # 21720 <__clzsi2+0x8c>
10260:  209000ef  jal ra,10c68 <printf>
10264:  00048413  mv s0,s1
10268:  01448493  addi s1,s1,20
1026c:  00021937  lui s2,0x21
10270:  00042583  lw a1,0(s0)
10274:  73490513  addi a0,s2,1844 # 21734 <__clzsi2+0xa0>
10278:  1f1000ef  jal ra,10c68 <printf>
1027c:  00440413  addi s0,s0,4
10280:  fe9418e3  bne s0,s1,10270 <main+0x68>
10284:  00000513  li a0,0
10288:  01c12083  lw ra,28(sp)
1028c:  01812403  lw s0,24(sp)
10290:  01412483  lw s1,20(sp)
10294:  01012903  lw s2,16(sp)
10298:  02010113  addi sp,sp,32
1029c:  00008067  ret

00010184 <decode>:
10184:  fe010113  addi sp,sp,-32
10188:  00112e23  sw ra,28(sp)
1018c:  00812c23  sw s0,24(sp)
10190:  00912a23  sw s1,20(sp)
10194:  01212823  sw s2,16(sp)
10198:  01312623  sw s3,12(sp)
1019c:  00050413  mv s0,a0
101a0:  00058913  mv s2,a1
101a4:  00060993  mv s3,a2
101a8:  00158493  addi s1,a1,1
101ac:  00249493  slli s1,s1,0x2
101b0:  00048513  mv a0,s1
101b4:  1f0000ef  jal ra,103a4 <malloc>
101b8:  01352023  sw s3,0(a0)
101bc:  03205863  blez s2,101ec <decode+0x68>
101c0:  00050793  mv a5,a0
101c4:  00040693  mv a3,s0
101c8:  ffc48593  addi a1,s1,-4
101cc:  00a585b3  add a1,a1,a0
101d0:  0007a703  lw a4,0(a5)
101d4:  0006a603  lw a2,0(a3)
101d8:  00c74733  xor a4,a4,a2
101dc:  00e7a223  sw a4,4(a5)
101e0:  00478793  addi a5,a5,4
101e4:  00468693  addi a3,a3,4
101e8:  feb794e3  bne a5,a1,101d0 <decode+0x4c>
101ec:  01c12083  lw ra,28(sp)
101f0:  01812403  lw s0,24(sp)
101f4:  01412483  lw s1,20(sp)
101f8:  01012903  lw s2,16(sp)
101fc:  00c12983  lw s3,12(sp)
10200:  02010113  addi sp,sp,32
10204:  00008067  ret
```

* Observation
    * LOC (lines of code, including the two function labels) : `73`
    * Stack allocation : `32` bytes each in `main` and `decode`
    * Registers used : `ra`, `s0`~`s3`, `a0`~`a5` (11 registers, not counting `sp`)
    * Number of `lw` and `sw` instructions : `16` and `15`
* Execution & CSR count

```rv32emu=
yuchengwang@yuchengwang-virtual-machine:~/Desktop/rv32emu$ build/rv32emu --stats HW2.1
The result is : 4 2 0 7 4 inferior exit code 0
CSR cycle count: 4532
```

### -O2 Optimized Assembly Code

![](https://i.imgur.com/wZIxgnZ.png)

* Assembly code

```RISCV=
000100c4 <main>:
100c4:  000217b7  lui a5,0x21
100c8:  73078793  addi a5,a5,1840 # 21730 <__clzsi2+0xa4>
100cc:  0007a803  lw a6,0(a5)
100d0:  0047a683  lw a3,4(a5)
100d4:  0087a703  lw a4,8(a5)
100d8:  00c7a783  lw a5,12(a5)
100dc:  fe010113  addi sp,sp,-32
100e0:  00400613  li a2,4
100e4:  00400593  li a1,4
100e8:  00010513  mv a0,sp
100ec:  00112e23  sw ra,28(sp)
100f0:  00812c23  sw s0,24(sp)
100f4:  00912a23  sw s1,20(sp)
100f8:  01212823  sw s2,16(sp)
100fc:  01012023  sw a6,0(sp)
10100:  00d12223  sw a3,4(sp)
10104:  00e12423  sw a4,8(sp)
10108:  00f12623  sw a5,12(sp)
1010c:  10c000ef  jal ra,10218 <decode>
10110:  00050413  mv s0,a0
10114:  00021537  lui a0,0x21
10118:  71850513  addi a0,a0,1816 # 21718 <__clzsi2+0x8c>
1011c:  345000ef  jal ra,10c60 <printf>
10120:  01440913  addi s2,s0,20
10124:  000214b7  lui s1,0x21
10128:  00042583  lw a1,0(s0)
1012c:  72c48513  addi a0,s1,1836 # 2172c <__clzsi2+0xa0>
10130:  00440413  addi s0,s0,4
10134:  32d000ef  jal ra,10c60 <printf>
10138:  ff2418e3  bne s0,s2,10128 <main+0x64>
1013c:  01c12083  lw ra,28(sp)
10140:  01812403  lw s0,24(sp)
10144:  01412483  lw s1,20(sp)
10148:  01012903  lw s2,16(sp)
1014c:  00000513  li a0,0
10150:  02010113  addi sp,sp,32
10154:  00008067  ret

00010218 <decode>:
10218:  fe010113  addi sp,sp,-32
1021c:  01212823  sw s2,16(sp)
10220:  00158913  addi s2,a1,1
10224:  00291913  slli s2,s2,0x2
10228:  00812c23  sw s0,24(sp)
1022c:  00050413  mv s0,a0
10230:  00090513  mv a0,s2
10234:  00912a23  sw s1,20(sp)
10238:  01312623  sw s3,12(sp)
1023c:  00060493  mv s1,a2
10240:  00112e23  sw ra,28(sp)
10244:  00058993  mv s3,a1
10248:  154000ef  jal ra,1039c <malloc>
1024c:  00952023  sw s1,0(a0)
10250:  03305663  blez s3,1027c <decode+0x64>
10254:  ffc90613  addi a2,s2,-4
10258:  00040793  mv a5,s0
1025c:  00450713  addi a4,a0,4
10260:  00c40633  add a2,s0,a2
10264:  0007a683  lw a3,0(a5)
10268:  00470713  addi a4,a4,4
1026c:  00478793  addi a5,a5,4
10270:  00d4c4b3  xor s1,s1,a3
10274:  fe972e23  sw s1,-4(a4)
10278:  fec796e3  bne a5,a2,10264 <decode+0x4c>
1027c:  01c12083  lw ra,28(sp)
10280:  01812403  lw s0,24(sp)
10284:  01412483  lw s1,20(sp)
10288:  01012903  lw s2,16(sp)
1028c:  00c12983  lw s3,12(sp)
10290:  02010113  addi sp,sp,32
10294:  00008067  ret
```

* Observation
    * LOC (lines of code, including the two function labels) : `71`
    * Stack allocation : `32` bytes each in `main` and `decode`
    * Registers used : `ra`, `s0`~`s3`, `a0`~`a6` (12 registers, not counting `sp`)
    * Number of `lw` and `sw` instructions : `15` and `15`
* Execution & CSR count

```rv32emu=
yuchengwang@yuchengwang-virtual-machine:~/Desktop/rv32emu$ build/rv32emu --stats HW2.2
The result is : 4 2 0 7 4 inferior exit code 0
CSR cycle count: 4527
```

### -O3 Optimized Assembly Code

![](https://i.imgur.com/kN37fx0.png)

* Assembly code

```RISCV=
000100c4 <main>:
100c4:  000217b7  lui a5,0x21
100c8:  73078793  addi a5,a5,1840 # 21730 <__clzsi2+0xa4>
100cc:  0007a803  lw a6,0(a5)
100d0:  0047a683  lw a3,4(a5)
100d4:  0087a703  lw a4,8(a5)
100d8:  00c7a783  lw a5,12(a5)
100dc:  fe010113  addi sp,sp,-32
100e0:  00400613  li a2,4
100e4:  00400593  li a1,4
100e8:  00010513  mv a0,sp
100ec:  00112e23  sw ra,28(sp)
100f0:  00812c23  sw s0,24(sp)
100f4:  00912a23  sw s1,20(sp)
100f8:  01212823  sw s2,16(sp)
100fc:  01012023  sw a6,0(sp)
10100:  00d12223  sw a3,4(sp)
10104:  00e12423  sw a4,8(sp)
10108:  00f12623  sw a5,12(sp)
1010c:  10c000ef  jal ra,10218 <decode>
10110:  00050413  mv s0,a0
10114:  00021537  lui a0,0x21
10118:  71850513  addi a0,a0,1816 # 21718 <__clzsi2+0x8c>
1011c:  345000ef  jal ra,10c60 <printf>
10120:  01440913  addi s2,s0,20
10124:  000214b7  lui s1,0x21
10128:  00042583  lw a1,0(s0)
1012c:  72c48513  addi a0,s1,1836 # 2172c <__clzsi2+0xa0>
10130:  00440413  addi s0,s0,4
10134:  32d000ef  jal ra,10c60 <printf>
10138:  ff2418e3  bne s0,s2,10128 <main+0x64>
1013c:  01c12083  lw ra,28(sp)
10140:  01812403  lw s0,24(sp)
10144:  01412483  lw s1,20(sp)
10148:  01012903  lw s2,16(sp)
1014c:  00000513  li a0,0
10150:  02010113  addi sp,sp,32
10154:  00008067  ret

00010218 <decode>:
10218:  fe010113  addi sp,sp,-32
1021c:  01212823  sw s2,16(sp)
10220:  00158913  addi s2,a1,1
10224:  00291913  slli s2,s2,0x2
10228:  00812c23  sw s0,24(sp)
1022c:  00050413  mv s0,a0
10230:  00090513  mv a0,s2
10234:  00912a23  sw s1,20(sp)
10238:  01312623  sw s3,12(sp)
1023c:  00060493  mv s1,a2
10240:  00112e23  sw ra,28(sp)
10244:  00058993  mv s3,a1
10248:  154000ef  jal ra,1039c <malloc>
1024c:  00952023  sw s1,0(a0)
10250:  03305663  blez s3,1027c <decode+0x64>
10254:  ffc90613  addi a2,s2,-4
10258:  00040793  mv a5,s0
1025c:  00450713  addi a4,a0,4
10260:  00c40633  add a2,s0,a2
10264:  0007a683  lw a3,0(a5)
10268:  00470713  addi a4,a4,4
1026c:  00478793  addi a5,a5,4
10270:  00d4c4b3  xor s1,s1,a3
10274:  fe972e23  sw s1,-4(a4)
10278:  fef616e3  bne a2,a5,10264 <decode+0x4c>
1027c:  01c12083  lw ra,28(sp)
10280:  01812403  lw s0,24(sp)
10284:  01412483  lw s1,20(sp)
10288:  01012903  lw s2,16(sp)
1028c:  00c12983  lw s3,12(sp)
10290:  02010113  addi sp,sp,32
10294:  00008067  ret
```

* Observation
    * LOC (lines of code, including the two function labels) : `71`
    * Stack allocation : `32` bytes each in `main` and `decode`
    * Registers used : `ra`, `s0`~`s3`, `a0`~`a6` (12 registers, not counting `sp`)
    * Number of `lw` and `sw` instructions : `15` and `15`
* Execution & CSR count

```rv32emu=
yuchengwang@yuchengwang-virtual-machine:~/Desktop/rv32emu$ build/rv32emu --stats HW2.3
THe result is : 4 2 0 7 4 inferior exit code 0
CSR cycle count: 4527
```

### -Os Optimized Assembly Code

![](https://i.imgur.com/xomLz0z.png)

* Assembly code

```RISCV=
000100c4 <main>:
100c4:  fe010113  addi sp,sp,-32
100c8:  000215b7  lui a1,0x21
100cc:  01000613  li a2,16
100d0:  71058593  addi a1,a1,1808 # 21710 <__clzsi2+0xa0>
100d4:  00010513  mv a0,sp
100d8:  00112e23  sw ra,28(sp)
100dc:  00812c23  sw s0,24(sp)
100e0:  00912a23  sw s1,20(sp)
100e4:  01212823  sw s2,16(sp)
100e8:  231000ef  jal ra,10b18 <memcpy>
100ec:  00400613  li a2,4
100f0:  00400593  li a1,4
100f4:  00010513  mv a0,sp
100f8:  10c000ef  jal ra,10204 <decode>
100fc:  00050413  mv s0,a0
10100:  00021537  lui a0,0x21
10104:  6f850513  addi a0,a0,1784 # 216f8 <__clzsi2+0x88>
10108:  4e1000ef  jal ra,10de8 <printf>
1010c:  01440493  addi s1,s0,20
10110:  00021937  lui s2,0x21
10114:  00042583  lw a1,0(s0)
10118:  70c90513  addi a0,s2,1804 # 2170c <__clzsi2+0x9c>
1011c:  00440413  addi s0,s0,4
10120:  4c9000ef  jal ra,10de8 <printf>
10124:  fe9418e3  bne s0,s1,10114 <main+0x50>
10128:  01c12083  lw ra,28(sp)
1012c:  01812403  lw s0,24(sp)
10130:  01412483  lw s1,20(sp)
10134:  01012903  lw s2,16(sp)
10138:  00000513  li a0,0
1013c:  02010113  addi sp,sp,32
10140:  00008067  ret

00010204 <decode>:
10204:  ff010113  addi sp,sp,-16
10208:  00912223  sw s1,4(sp)
1020c:  00050493  mv s1,a0
10210:  00158513  addi a0,a1,1
10214:  00251513  slli a0,a0,0x2
10218:  00812423  sw s0,8(sp)
1021c:  01212023  sw s2,0(sp)
10220:  00112623  sw ra,12(sp)
10224:  00060913  mv s2,a2
10228:  00058413  mv s0,a1
1022c:  154000ef  jal ra,10380 <malloc>
10230:  01252023  sw s2,0(a0)
10234:  00050713  mv a4,a0
10238:  00000793  li a5,0
1023c:  00470713  addi a4,a4,4
10240:  0087ce63  blt a5,s0,1025c <decode+0x58>
10244:  00c12083  lw ra,12(sp)
10248:  00812403  lw s0,8(sp)
1024c:  00412483  lw s1,4(sp)
10250:  00012903  lw s2,0(sp)
10254:  01010113  addi sp,sp,16
10258:  00008067  ret
1025c:  00279693  slli a3,a5,0x2
10260:  00d486b3  add a3,s1,a3
10264:  0006a683  lw a3,0(a3)
10268:  ffc72603  lw a2,-4(a4)
1026c:  00178793  addi a5,a5,1
10270:  00c6c6b3  xor a3,a3,a2
10274:  00d72023  sw a3,0(a4)
10278:  fc5ff06f  j 1023c <decode+0x38>
```

* Observation
    * LOC (lines of code, including the two function labels) : `64`
    * Stack allocation : `32` bytes in `main` and `16` bytes in `decode`
    * Registers used : `ra`, `s0`~`s2`, `a0`~`a5` (10 registers, not counting `sp`)
    * Number of `lw` and `sw` instructions : `11` and `10`
* Execution & CSR count

```rv32emu=
yuchengwang@yuchengwang-virtual-machine:~/Desktop/rv32emu$ build/rv32emu --stats HW2
The result is : 4 2 0 7 4 inferior exit code 0
CSR cycle count: 4562
```
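
Every build above prints the same decoded array, `4 2 0 7 4`. As a host-side sanity check of that result, the sketch below decodes the lab input and provides the inverse step, so the decoded array can be re-encoded and compared with the input. It is only an illustration under stated assumptions: `decode_xored` and `encode_xored` are hypothetical helper names that do not appear in the original code, and unlike the lab version the decoder reports an allocation failure by returning `NULL`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the lab code): rebuild the array from
 * its XOR-encoded form. Because encoded[i] == arr[i] ^ arr[i + 1], it
 * follows that arr[i + 1] == arr[i] ^ encoded[i].
 * Returns a malloc'd array of n + 1 ints (the caller frees it),
 * or NULL if the allocation fails. */
int *decode_xored(const int *encoded, int n, int first)
{
    assert(n >= 0);
    int *arr = malloc(sizeof *arr * ((size_t)n + 1));
    if (arr == NULL)
        return NULL;
    arr[0] = first;
    for (int i = 0; i < n; i++)
        arr[i + 1] = arr[i] ^ encoded[i];
    return arr;
}

/* Hypothetical helper: the inverse direction, used only for the
 * round-trip check. Writes len - 1 encoded values into out. */
void encode_xored(const int *arr, int len, int *out)
{
    assert(len >= 1);
    for (int i = 0; i + 1 < len; i++)
        out[i] = arr[i] ^ arr[i + 1];
}
```

With the lab input `{6, 2, 7, 3}` and `first = 4`, `decode_xored` yields `4 2 0 7 4`, and calling `encode_xored` on that result gives back `6 2 7 3`.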