# Lab2: RISC-V RV32I[MA] emulator with ELF support
The **POW function** was written in assembly by Mr. 李佶龍 in the last homework.
I rewrote the power function in **C**:
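```c
int power(int base, int exponent);

/* Entry point: the program is built with -nostdlib, so _start is used instead of main. */
int _start()
{
    /* volatile forces these values to be stored to and loaded from memory */
    volatile int base = 5;
    volatile int exponent = 3;
    volatile int Value = power(base, exponent);
    /* address the result is written to */
    volatile char* tx = (volatile char*) 0x40002000;
    volatile int out = Value;
    *tx = out;  /* only the low 8 bits are stored */
    return 0;
}

/* Compute base^exponent by repeated multiplication (exponent >= 0). */
int power(int base, int exponent)
{
    int result = 1;
    for (; exponent > 0; exponent--)
    {
        result = result * base;
    }
    return result;
}
```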
``` shell
surajubuntu@surajubuntu:~/rv32emu$ riscv-none-embed-objdump -d NewPow
NewPow: file format elf32-littleriscv
Disassembly of section .text:
00010054 <_start>:
10054: 1141 addi sp,sp,-16
10056: 4795 li a5,5
10058: c03e sw a5,0(sp)
1005a: 478d li a5,3
1005c: c23e sw a5,4(sp)
1005e: 4682 lw a3,0(sp)
10060: 4712 lw a4,4(sp)
10062: 4785 li a5,1
10064: 00e04f63 bgtz a4,10082 <_start+0x2e>
10068: c43e sw a5,8(sp)
1006a: 47a2 lw a5,8(sp)
1006c: 40002737 lui a4,0x40002
10070: 4501 li a0,0
10072: c63e sw a5,12(sp)
10074: 47b2 lw a5,12(sp)
10076: 0ff7f793 andi a5,a5,255
1007a: 00f70023 sb a5,0(a4) # 40002000 <__global_pointer$+0x3fff0764>
1007e: 0141 addi sp,sp,16
10080: 8082 ret
10082: 02d787b3 mul a5,a5,a3
10086: 177d addi a4,a4,-1
10088: bff1 j 10064 <_start+0x10>
0001008a <power>:
1008a: 4785 li a5,1
1008c: 00b04463 bgtz a1,10094 <power+0xa>
10090: 853e mv a0,a5
10092: 8082 ret
10094: 02a787b3 mul a5,a5,a0
10098: 15fd addi a1,a1,-1
1009a: bfcd j 1008c <power+0x2>
```
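In the listing above, `andi a5,a5,255` followed by `sb` corresponds to the assignment `*tx = out;`: the `int` value is narrowed to a single byte before it is stored. A minimal C sketch of that narrowing is shown here; the helper name `low_byte` is hypothetical and not part of the original program.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 8 bits of a value,
   mirroring `andi a5,a5,255` before the `sb` store. */
static uint8_t low_byte(int value)
{
    return (uint8_t)(value & 0xFF);
}
```

For this program the result 5^3 = 125 fits in one byte, so `low_byte(125)` is 125. A larger result such as 5^4 = 625 would be stored as 625 & 0xFF = 113.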
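Difference between the `-O3` and `-Os` options.
Comment:
With `-O3` the execution time is longer (205 ns vs. 193 ns) and the instructions per second (IPS) is lower (4878048 vs. 5181347) than with `-Os`. Both runs report an instruction count of 1, so this difference may simply be timing noise rather than an effect of the optimization level.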
```shell
surajubuntu@surajubuntu:~/rv32emu$ riscv-none-embed-gcc -O3 -nostdlib PowFunc.c -o NewPow
surajubuntu@surajubuntu:~/rv32emu$ ./emu-rv32i NewPow
>>> Execution time: 205 ns
>>> Instruction count: 1 (IPS=4878048)
>>> Jumps: 0 (0.00%) - 0 forwards, 0 backwards
>>> Branching T=0 (-nan%) F=0 (-nan%)
surajubuntu@surajubuntu:~/rv32emu$ riscv-none-embed-gcc -Os -nostdlib PowFunc.c -o NewPow
surajubuntu@surajubuntu:~/rv32emu$ ./emu-rv32i NewPow
>>> Execution time: 193 ns
>>> Instruction count: 1 (IPS=5181347)
>>> Jumps: 0 (0.00%) - 0 forwards, 0 backwards
>>> Branching T=0 (-nan%) F=0 (-nan%)
```
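The IPS values in the output above are consistent with dividing the instruction count by the execution time. The sketch below assumes that formula (count × 10^9 / time in ns, truncated to an integer); it is inferred from the printed numbers, not taken from the emulator source, and the function name `ips` is hypothetical.

```c
#include <stdint.h>

/* Instructions per second from an instruction count and an elapsed time in ns.
   Assumption: IPS = count * 1e9 / elapsed_ns, truncated to an integer. */
static uint64_t ips(uint64_t instruction_count, uint64_t elapsed_ns)
{
    return instruction_count * 1000000000ULL / elapsed_ns;
}
```

With the values from the two runs, `ips(1, 205)` gives 4878048 and `ips(1, 193)` gives 5181347, matching the reported figures.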
```shell
surajubuntu@surajubuntu:~/rv32emu$ riscv-none-embed-readelf -h NewPow
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: RISC-V
Version: 0x1
Entry point address: 0x10054
Start of program headers: 52 (bytes into file)
Start of section headers: 560 (bytes into file)
Flags: 0x1, RVC, soft-float ABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 1
Size of section headers: 40 (bytes)
Number of section headers: 6
Section header string table index: 5
```
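The `readelf -h` fields above come from fixed offsets in the 52-byte ELF32 header. As a rough illustration, the sketch below checks the magic bytes, class, and data encoding, then reads the entry point from offset 24. The helper names `read_le32` and `elf32_entry` are hypothetical, and the code handles only little-endian ELF32 headers.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: return the entry point of a little-endian ELF32
   header, or 0 if the magic, class, or data encoding does not match. */
static uint32_t elf32_entry(const unsigned char *hdr, size_t len)
{
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};
    if (len < 52 || memcmp(hdr, magic, 4) != 0)
        return 0;
    if (hdr[4] != 1 || hdr[5] != 1)   /* 1 = ELF32, 1 = little endian */
        return 0;
    return read_le32(hdr + 24);       /* e_entry is at byte offset 24 */
}
```

For a header beginning with `7f 45 4c 46 01 01 01` whose bytes 24 to 27 are `54 00 01 00`, the function returns 0x10054, the entry point reported above.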
Output of the `size` command:
```shell
surajubuntu@surajubuntu:~/rv32emu$ riscv-none-embed-size NewPow
text data bss dec hex filename
72 0 0 72 48 NewPow
```
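The `text` size of 72 bytes can be cross-checked against the disassembly: `_start` begins at 0x10054, and the last instruction of `power` is the 2-byte compressed `j` at 0x1009a, so the code ends at 0x1009c. The small sketch below does that subtraction; the function name `region_size` is hypothetical, and it assumes the text section holds only the two functions shown.

```c
#include <stdint.h>

/* Size in bytes of a code region, given its start address and the
   address just past its last instruction. */
static uint32_t region_size(uint32_t start, uint32_t end)
{
    return end - start;
}
```

Here `region_size(0x10054, 0x1009c)` is 72 (0x48), split into 54 bytes for `_start` and 18 bytes for `power`.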
This assembly program was written by Mr. 李佶龍.
```assembly
.data
argument1: .word 5 # base number in pow function
argument2: .word 3 # power number in pow function
str1: .string " raised to the power of "
str2: .string " is "
.text
main:
# Load arguments from static data
lw a0, argument1
lw a1, argument2
# Jump-and-link to the 'pow' label
jal ra, pow
# Print the result to console
add a2, a0, zero
lw a0, argument1
lw a1, argument2
jal ra, printResult
# Exit program
li a0, 10
ecall
pow:
# Save original register value
addi sp, sp, -8
sw ra, 4(sp)
sw s0, 0(sp)
addi s0, zero, 1 # initialize result
addi t2, zero, 1 # loop bound for multiplyBase
bne a1, zero, loop
ret
# base loop multiply itself
loop:
mv t0, a0
mv t1, s0
jal ra, multiplyBase
addi a1, a1, -1
bne a1, zero, loop
# restore return address
mv a0, s0
lw s0, 0(sp)
lw ra, 4(sp)
addi sp, sp, 8
ret
multiplyBase:
add s0, s0, t1
addi t0, t0, -1
bne t0, t2, multiplyBase
ret
# expects:
# a0: Value of base number
# a1: Value of power number
# a2: power of base result
printResult:
mv t0, a0
mv t1, a1
mv t2, a2
add a1, t0, zero
li a0, 1
ecall
la a1, str1
li a0, 4
ecall
add a1, t1, zero
li a0, 1
ecall
la a1, str2
li a0, 4
ecall
add a1, t2, zero
li a0, 1
ecall
ret
```
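The handwritten `pow` above does not use `mul`; `multiplyBase` builds each multiplication out of repeated `add` instructions. A rough C sketch of that idea follows. It is not a line-by-line translation of the assembly, the names `mul_by_add` and `power_by_add` are hypothetical, and it assumes non-negative arguments.

```c
/* Multiply a by b using only addition (assumes b >= 0). */
static int mul_by_add(int a, int b)
{
    int sum = 0;
    for (int i = 0; i < b; i++)
        sum += a;
    return sum;
}

/* Compute base^exponent, with every multiplication done by repeated
   addition (assumes base >= 0 and exponent >= 0). */
static int power_by_add(int base, int exponent)
{
    int result = 1;
    while (exponent > 0) {
        result = mul_by_add(result, base);
        exponent--;
    }
    return result;
}
```

With the arguments used in this lab, `power_by_add(5, 3)` returns 125, the same value the compiled C version produces with `mul`.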
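My opinion:
The code compiled for the rv32 emulator contains many `lw` and `sw` instructions, because the `volatile` locals must be stored to and reloaded from the stack. At the start, `addi sp,sp,-16` reserves 16 bytes of stack for the four `volatile` locals, whereas the handwritten assembly reserves only 8 bytes (for `ra` and `s0`), so the compiled version uses 8 more bytes. The compiled code uses the `mul` instruction to multiply two registers. (The professor mentioned in class that `mul` is not in RISC-V, which confused me. `mul` is not part of the base RV32I instruction set; it comes from the M extension, which the RV32I[MA] emulator supports.)
If the loop branch is predicted as always taken, there is only one misprediction, at loop exit, which I estimate as about 90% branch accuracy. There is no data hazard.
There is no `ecall` in the compiled code; the result is written to the address 0x40002000 instead.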
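The branch-accuracy estimate can be made concrete with a small calculation. The sketch below assumes an always-taken predictor and a loop branch that is taken on every execution except the last; the function name and this prediction model are illustrative assumptions, not something measured by the emulator.

```c
/* Accuracy (in percent, rounded down) of an always-taken predictor on a
   loop branch that executes `executions` times and is taken every time
   except the last. */
static int always_taken_accuracy_percent(int executions)
{
    if (executions <= 0)
        return 0;
    return 100 * (executions - 1) / executions;
}
```

Under this model, 10 branch executions with one miss give 90%. For `exponent = 3`, the `bgtz` in the compiled loop executes four times (taken three times, not taken once), for which the same formula gives 75%.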