Lab2: RISC-V RV32I[MA] emulator with ELF support

# Lab2: RISC-V RV32I[MA] emulator with ELF support

## Build a RISC-V GNU Compiler Toolchain

Get the toolchain from the [RISC-V GNU Compiler Toolchain](https://github.com/riscv/riscv-gnu-toolchain) GitHub repository.

```shell=
git clone --recursive https://github.com/riscv/riscv-gnu-toolchain
```

Downloading all the required files from the repository takes a long time (about 4 GB of data). When the download is complete, go to the downloaded folder and build the toolchain according to your needs (this homework requires `--with-arch=rv32i` and `--with-abi=ilp32`):

```shell=
cd riscv-gnu-toolchain
export PATH=$PATH:/opt/riscv/bin
./configure --prefix=/opt/riscv --with-arch=rv32i --with-abi=ilp32
sudo make
```

Building the toolchain takes about 45 minutes. When `make` is complete, check the folder to find the corresponding RISC-V compiler, which we can then use for the homework.

# Rewrite Multiplier

**mul.c**:

```clike=
/*
   riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -O3 -nostdlib test1.c -o test1
*/
int mul(int, int);
int power(int, int);

int _start()
{
    int i;
    int res = mul(5, 4);
    /* Memory-mapped output port: each byte written here is printed */
    volatile char *tx = (volatile char *)0x40002000;
    const char *result1 = "The value of result is ";

    while (*result1) {
        *tx = *result1;
        result1++;
    }
    /* Print the result as 32 binary digits, most significant bit first */
    for (i = 31; i >= 0; i--) {
        if ((res >> i) & 0x01)
            *tx = '1';
        else
            *tx = '0';
    }
    *tx = '\n';
    return 0;
}

/* Shift-and-add multiplication: add mul1 << i for every set bit i of mul2 */
int mul(int mul1, int mul2)
{
    int result = 0;
    int i;
    for (i = 0; i < 32; i++) {
        if ((mul2 >> i) & 0x1)
            result = result + (mul1 << i);
    }
    return result;
}
```

With `mul(5, 4)` as in the code above, running the program in emu-rv32i gives the following result (binary 10100 is 20):

```shell=
./emu-rv32i mul
The value of result is 00000000000000000000000000010100

>>> Execution time: 161403 ns
>>> Instruction count: 496 (IPS=3073053)
>>> Jumps: 116 (23.39%) - 31 forwards, 85 backwards
>>> Branching T=114 (76.00%) F=36 (24.00%)
```

Display the section headers of the ELF file:

```shell=
$ riscv-none-embed-objdump -h mul

mul:     file format elf32-littleriscv

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         000000d0  00010054  00010054  00000054  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       0000001c  00010124  00010124  00000124  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .comment      00000033  00000000  00000000  00000140  2**0
```

Use objdump to see the compiled RISC-V assembly:

```shell=
$ riscv-none-embed-objdump -d mul

mul:     file format elf32-littleriscv

Disassembly of section .text:

00010054 <_start>:
   10054:  00000693  li    a3,0
   10058:  00100793  li    a5,1
   1005c:  00400813  li    a6,4
   10060:  00500513  li    a0,5
   10064:  02000593  li    a1,32
   10068:  40f85733  sra   a4,a6,a5
   1006c:  00177713  andi  a4,a4,1
   10070:  00f51633  sll   a2,a0,a5
   10074:  00178793  addi  a5,a5,1
   10078:  00070463  beqz  a4,10080 <_start+0x2c>
   1007c:  00c686b3  add   a3,a3,a2
   10080:  feb794e3  bne   a5,a1,10068 <_start+0x14>
   10084:  000107b7  lui   a5,0x10
   10088:  12478793  addi  a5,a5,292 # 10124 <mul+0x30>
   1008c:  05400713  li    a4,84
   10090:  40002637  lui   a2,0x40002
   10094:  00e60023  sb    a4,0(a2) # 40002000 <__global_pointer$+0x3fff06c0>
   10098:  00178793  addi  a5,a5,1
   1009c:  0007c703  lbu   a4,0(a5)
   100a0:  fe071ae3  bnez  a4,10094 <_start+0x40>
   100a4:  01f00793  li    a5,31
   100a8:  400025b7  lui   a1,0x40002
   100ac:  03000813  li    a6,48
   100b0:  03100513  li    a0,49
   100b4:  fff00613  li    a2,-1
   100b8:  0100006f  j     100c8 <_start+0x74>
   100bc:  00a58023  sb    a0,0(a1) # 40002000 <__global_pointer$+0x3fff06c0>
   100c0:  fff78793  addi  a5,a5,-1
   100c4:  00c78e63  beq   a5,a2,100e0 <_start+0x8c>
   100c8:  40f6d733  sra   a4,a3,a5
   100cc:  00177713  andi  a4,a4,1
   100d0:  fe0716e3  bnez  a4,100bc <_start+0x68>
   100d4:  01058023  sb    a6,0(a1)
   100d8:  fff78793  addi  a5,a5,-1
   100dc:  fec796e3  bne   a5,a2,100c8 <_start+0x74>
   100e0:  400027b7  lui   a5,0x40002
   100e4:  00a00713  li    a4,10
   100e8:  00e78023  sb    a4,0(a5) # 40002000 <__global_pointer$+0x3fff06c0>
   100ec:  00000513  li    a0,0
   100f0:  00008067  ret

000100f4 <mul>:
   100f4:  00000793  li    a5,0
   100f8:  00000613  li    a2,0
   100fc:  02000813  li    a6,32
   10100:  40f5d733  sra   a4,a1,a5
   10104:  00177713  andi  a4,a4,1
   10108:  00f516b3  sll   a3,a0,a5
   1010c:  00178793  addi  a5,a5,1
   10110:  00070463  beqz  a4,10118 <mul+0x24>
   10114:  00d60633  add   a2,a2,a3
   10118:  ff0794e3  bne   a5,a6,10100 <mul+0xc>
   1011c:  00060513  mv    a0,a2
   10120:  00008067  ret
```

For comparison, here is the assembly code credited to [王昱翔](https://hackmd.io/atHu-zjwSbKJeF45nBYp6g):

```clike=
	.file	"mul2.c"
	.option nopic
	.text
	.align	2
	.globl	main
	.type	main, @function
main:
	addi	sp,sp,-32
	sw	s0,28(sp)
	addi	s0,sp,32
	li	a5,-22
	sw	a5,-28(s0)
	li	a5,-5
	sw	a5,-32(s0)
	sw	zero,-20(s0)
	sw	zero,-24(s0)
	j	.L2
.L4:
	lw	a4,-32(s0)
	lw	a5,-24(s0)
	sra	a5,a4,a5
	andi	a5,a5,1
	beqz	a5,.L3
	lw	a4,-28(s0)
	lw	a5,-24(s0)
	sll	a5,a4,a5
	lw	a4,-20(s0)
	add	a5,a4,a5
	sw	a5,-20(s0)
.L3:
	lw	a5,-24(s0)
	addi	a5,a5,1
	sw	a5,-24(s0)
.L2:
	lw	a4,-24(s0)
	li	a5,31
	ble	a4,a5,.L4
	li	a5,0
	mv	a0,a5
	lw	s0,28(sp)
	addi	sp,sp,32
	jr	ra
	.size	main, .-main
	.ident	"GCC: (GNU) 7.2.0"
```

Because the compared version includes stdio.h, the linker pulls C library code into the final executable, so its disassembly contains many more instructions. My version is built with `-nostdlib` and references no libraries, so the assembly code obtained with objdump is much more streamlined. The compared listing also keeps every local variable on the stack (note the `lw`/`sw` pairs around each operation), while my `-O3` build keeps them in registers.

# Rewrite Bit Reversal

Bit reversal is an operation needed in many programs that reverses the order of bits in a word. For example, reversing the word **0x55000011** results in **0x880000AA**. The following C code shows a naive implementation of this function.

**reverse.c**:

```clike=
/*
   riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -O3 -nostdlib test1.c -o test1
*/
int reverse(int);

int _start()
{
    int i;
    int res = reverse(0x55000011);
    /* Memory-mapped output port: each byte written here is printed */
    volatile char *tx = (volatile char *)0x40002000;
    const char *result1 = "The value of result is ";

    while (*result1) {
        *tx = *result1;
        result1++;
    }
    /* Print the result as 32 binary digits, most significant bit first */
    for (i = 31; i >= 0; i--) {
        if ((res >> i) & 0x01)
            *tx = '1';
        else
            *tx = '0';
    }
    *tx = '\n';
    return 0;
}

/* Shift r left and append bit i of v, so bit 0 of v ends up as bit 31 of r */
int reverse(int v)
{
    int i, r = 0;
    for (i = 0; i < 32; i++) {
        r <<= 1;
        r |= ((v >> i) & 0x1);
    }
    return r;
}
```

Execution result:

```shell=
./emu-rv32i revserse
The value of result is 10001000000000000000000010101010

>>> Execution time: 168316 ns
>>> Instruction count: 501 (IPS=2976544)
>>> Jumps: 87 (17.37%) - 1 forwards, 86 backwards
>>> Branching T=85 (70.83%) F=35 (29.17%)
```

Display the section headers of the ELF file:

```shell=
$ riscv-none-embed-objdump -h revserse

revserse:     file format elf32-littleriscv

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         000000c8  00010054  00010054  00000054  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       0000001c  0001011c  0001011c  0000011c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .comment      00000033  00000000  00000000  00000138  2**0
                  CONTENTS, READONLY
```

Use objdump to see the compiled RISC-V assembly:

```shell=
$ riscv-none-embed-objdump -d revserse

revserse:     file format elf32-littleriscv

Disassembly of section .text:

00010054 <_start>:
   10054:  55000637  lui   a2,0x55000
   10058:  00000713  li    a4,0
   1005c:  00000793  li    a5,0
   10060:  01160613  addi  a2,a2,17 # 55000011 <__global_pointer$+0x54fee6d9>
   10064:  02000593  li    a1,32
   10068:  40e656b3  sra   a3,a2,a4
   1006c:  00179793  slli  a5,a5,0x1
   10070:  0016f693  andi  a3,a3,1
   10074:  00170713  addi  a4,a4,1
   10078:  00f6e7b3  or    a5,a3,a5
   1007c:  feb716e3  bne   a4,a1,10068 <_start+0x14>
   10080:  00010737  lui   a4,0x10
   10084:  11c70713  addi  a4,a4,284 # 1011c <reverse+0x2c>
   10088:  05400693  li    a3,84
   1008c:  40002637  lui   a2,0x40002
   10090:  00d60023  sb    a3,0(a2) # 40002000 <__global_pointer$+0x3fff06c8>
   10094:  00170713  addi  a4,a4,1
   10098:  00074683  lbu   a3,0(a4)
   1009c:  fe069ae3  bnez  a3,10090 <_start+0x3c>
   100a0:  01f00713  li    a4,31
   100a4:  400025b7  lui   a1,0x40002
   100a8:  03000813  li    a6,48
   100ac:  03100513  li    a0,49
   100b0:  fff00613  li    a2,-1
   100b4:  0100006f  j     100c4 <_start+0x70>
   100b8:  00a58023  sb    a0,0(a1) # 40002000 <__global_pointer$+0x3fff06c8>
   100bc:  fff70713  addi  a4,a4,-1
   100c0:  00c70e63  beq   a4,a2,100dc <_start+0x88>
   100c4:  40e7d6b3  sra   a3,a5,a4
   100c8:  0016f693  andi  a3,a3,1
   100cc:  fe0696e3  bnez  a3,100b8 <_start+0x64>
   100d0:  01058023  sb    a6,0(a1)
   100d4:  fff70713  addi  a4,a4,-1
   100d8:  fec716e3  bne   a4,a2,100c4 <_start+0x70>
   100dc:  400027b7  lui   a5,0x40002
   100e0:  00a00713  li    a4,10
   100e4:  00e78023  sb    a4,0(a5) # 40002000 <__global_pointer$+0x3fff06c8>
   100e8:  00000513  li    a0,0
   100ec:  00008067  ret

000100f0 <reverse>:
   100f0:  00000793  li    a5,0
   100f4:  00000713  li    a4,0
   100f8:  02000613  li    a2,32
   100fc:  40e556b3  sra   a3,a0,a4
   10100:  00179793  slli  a5,a5,0x1
   10104:  0016f693  andi  a3,a3,1
   10108:  00170713  addi  a4,a4,1
   1010c:  00f6e7b3  or    a5,a3,a5
   10110:  fec716e3  bne   a4,a2,100fc <reverse+0xc>
   10114:  00078513  mv    a0,a5
   10118:  00008067  ret
```

For comparison, here is the assembly code credited to [林家葦](https://hackmd.io/MFuJDtz9RkW6QF2_fxJs-g):

```clike=
.data
str1:    .string " reversei bit is "
example: .4byte 0x12345678
shift1:  .4byte 0xffff0000
shift2:  .4byte 0xff00ff00
shift3:  .4byte 0xf0f0f0f0
shift4:  .4byte 0xcccccccc
shift5:  .4byte 0xaaaaaaaa

.text
main:
    lw   a0,example
    jal  bitreverse
    lw   a1,example
    jal  print
    li   a0,10
    ecall

bitreverse:
    lw   t3,shift1
    and  t1,a0,t3
    srli t1,t1,16
    srli t3,t3,16
    and  t2,a0,t3
    slli t2,t2,16
    or   t0,t1,t2
    lw   t3,shift2
    and  t1,t0,t3
    srli t1,t1,8
    srli t3,t3,8
    and  t2,t0,t3
    slli t2,t2,8
    or   t0,t1,t2
    lw   t3,shift3
    and  t1,t0,t3
    srli t1,t1,4
    srli t3,t3,4
    and  t2,t0,t3
    slli t2,t2,4
    or   t0,t1,t2
    lw   t3,shift4
    and  t1,t0,t3
    srli t1,t1,2
    srli t3,t3,2
    and  t2,t0,t3
    slli t2,t2,2
    or   t0,t1,t2
    lw   t3,shift5
    and  t1,t0,t3
    srli t1,t1,1
    srli t3,t3,1
    and  t2,t0,t3
    slli t2,t2,1
    or   t0,t1,t2
    mv   a0,t0
    ret

print:
    mv   t0,a0
    li   a0,1
    ecall
    la   a1,str1
    li   a0,4
    ecall
    mv   a1,t0
    li   a0,1
    ecall
    ret
```

Because my reverse.c relies almost entirely on shift and bitwise operations, the RISC-V compiler can translate it into assembly that closely mirrors the source operations, so the resulting assembly code is also much more streamlined.
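
The compared assembly reverses a word by swapping progressively smaller groups of bits, using the masks `shift1` through `shift5`. As a rough illustration, the sketch below expresses that mask-and-swap idea in C next to the naive loop, so the two approaches can be checked against each other on a host machine. The names `reverse_loop`, `reverse_swap`, and `check_reverse` are hypothetical and do not come from either author's code. Using `uint32_t` instead of `int` is my own assumption, made so that the shifts are well defined for all inputs. Because it uses `assert`, this sketch is meant for a normal host compiler, not for the `-nostdlib` build.

```c
#include <assert.h>
#include <stdint.h>

/* Naive loop, same idea as reverse() in reverse.c, on an unsigned word
 * (assumption: unsigned avoids shifting into the sign bit). */
uint32_t reverse_loop(uint32_t v)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r <<= 1;
        r |= (v >> i) & 0x1u;
    }
    return r;
}

/* Mask-and-swap: exchange 16-bit halves, then bytes, nibbles,
 * bit pairs, and finally adjacent bits (five steps, no loop). */
uint32_t reverse_swap(uint32_t v)
{
    v = ((v & 0xffff0000u) >> 16) | ((v & 0x0000ffffu) << 16);
    v = ((v & 0xff00ff00u) >> 8)  | ((v & 0x00ff00ffu) << 8);
    v = ((v & 0xf0f0f0f0u) >> 4)  | ((v & 0x0f0f0f0fu) << 4);
    v = ((v & 0xccccccccu) >> 2)  | ((v & 0x33333333u) << 2);
    v = ((v & 0xaaaaaaaau) >> 1)  | ((v & 0x55555555u) << 1);
    return v;
}

/* Hypothetical self-check: both versions must agree with the example
 * from the text and with each other on the compared code's input. */
void check_reverse(void)
{
    assert(reverse_loop(0x55000011u) == 0x880000AAu);
    assert(reverse_swap(0x55000011u) == 0x880000AAu);
    assert(reverse_swap(0x12345678u) == reverse_loop(0x12345678u));
}
```

Calling `check_reverse()` from a host-side `main` exercises both functions. The loop version executes 32 iterations of shift, AND, and OR, while the swap version uses a fixed five steps at the cost of ten mask constants.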