# RISC-V Toolchain (Fibonacci number)
contributed by < `eddie9712` >
## Experiment environment
```
virtual machine : Oracle VM virtualBox 5.2
operating system : Ubuntu 20.04
compiler : riscv-none-embed-gcc
GNU make : GNU Make 4.2.1
```
## Download the RISC-V RV32I[MA] emulator with ELF support
* Follow the tutorial from [this page](https://hackmd.io/@sysprog/rJAufgHYS).
* We can copy the contents of `setenv` into `~/.bashrc`, so we don't need to source it every time we reopen the terminal.
* When I compiled `emu-rv32i`, I got an error message:
```
emu-rv32i.c:11:10: fatal error: gelf.h: No such file or directory
11 | #include <gelf.h>
| ^~~~~~~~
```
**Solution:** refer to this page: https://github.com/sslab-gatech/opensgx/issues/20
The header `gelf.h` is provided by the libelf development package, so run the command:
`sudo apt install libelf-dev`
* After I fixed the issue, I got the following result from `make check`:
```
./emu-rv32i test1
Hello RISC-V!
>>> Execution time: 27525 ns
>>> Instruction count: 62 (IPS=2252497)
>>> Jumps: 14 (22.58%) - 0 forwards, 14 backwards
>>> Branching T=13 (92.86%) F=1 (7.14%)
```
## (1)Program rewrite:
### 1.Fibonacci number:
The code below was contributed by [Jason Lin](https://hackmd.io/V3467NpAQWq_SOrx_AT2IQ?view):
```clike=
.data
argument: .word 10
str1: .string "Fibonacci("
str2: .string ") = "
.text
main: # initial value
lw a0, argument # n = 10
li s0, 1 # for comparison with n (n <= 1)
jal ra, fib # call fib(10)
mv a1, a0 # a1 : final value
lw a0, argument # a0 : argument
jal ra, printResult # print result
j exit # go to exit
fib:
ble a0, s0, L1 # if(n <= 1)
addi sp, sp, -12 # push the stack
sw ra, 8(sp) # store return address
sw a0, 4(sp) # store argument n
addi a0, a0, -1 # argument = n - 1
jal ra, fib # call fib(n - 1)
sw a0, 0(sp) # store return value of fib(n - 1)
lw a0, 4(sp) # load argument n
addi a0, a0, -2 # argument = n - 2
jal ra, fib # call fib(n - 2)
lw t0, 0(sp) # load return value of fib(n - 1)
add a0, a0, t0 # fib(n - 1) + fib(n - 2)
lw ra, 8(sp) # load return address
addi sp, sp, 12 # pop the stack
ret # return
L1:
ret # return
printResult: # Fibonacci(10) = 55
mv t0, a0
mv t1, a1
la a0, str1
li a7, 4
ecall # print string str1
mv a0, t0
li a7, 1
ecall # print int argument n
la a0, str2
li a7, 4
ecall # print string str2
mv a0, t1
li a7, 1
ecall # print int result
ret
exit:
li a7, 10
ecall # exit
```
To understand how the emulator works, I looked at `emu-rv32i.c` and found `#define UART_TX_ADDR 0x40002000`. This is the address to which the program stores its 8-bit (`char`) output data. The emulator intercepts every store to this address and prints the byte:
```clike=
// code segment taken from emu-rv32i.c
if (addr == UART_TX_ADDR) {
/* test for UART output, compatible with QEMU */
printf("%c", val);
}
```
To verify my understanding, I added a debugging message to the emulator:
`printf("receive byte:%c\n", val);`
Then I saw the output below:
```
receive byte:H
receive byte:e
receive byte:l
receive byte:l
receive byte:o
receive byte:
receive byte:R
receive byte:I
receive byte:S
receive byte:C
receive byte:-
receive byte:V
receive byte:!
receive byte:
>>> Execution time: 166835 ns
>>> Instruction count: 62 (IPS=371624)
>>> Jumps: 14 (22.58%) - 0 forwards, 14 backwards
>>> Branching T=13 (92.86%) F=1 (7.14%)
```
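The UART mechanism described above can be tried out on a host with a small model. This is only a sketch: `emu_store8`, `guest_putc`, `guest_puts` and the capture buffer `uart_log` are hypothetical names introduced for illustration, and only `UART_TX_ADDR` comes from `emu-rv32i.c`. On the real target, the guest side would simply store the byte through a `volatile char *` pointing at `UART_TX_ADDR`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define UART_TX_ADDR 0x40002000u /* same constant as in emu-rv32i.c */

/* Hypothetical host-side stand-in: bytes are captured in a buffer
 * instead of being printed, so the behaviour can be checked. */
static char uart_log[64];
static size_t uart_len;

/* Emulator side: called for every byte store the guest performs. */
static void emu_store8(uint32_t addr, uint8_t val)
{
    if (addr == UART_TX_ADDR && uart_len < sizeof(uart_log) - 1) {
        uart_log[uart_len++] = (char) val; /* emu-rv32i does printf("%c", val) here */
        uart_log[uart_len] = '\0';
    }
    /* stores to other addresses would go to guest RAM; omitted in this sketch */
}

/* Guest side: on the real target this would be
 *     *(volatile char *) UART_TX_ADDR = c;
 * here the store is routed through emu_store8() so it runs on a host. */
static void guest_putc(char c)
{
    emu_store8(UART_TX_ADDR, (uint8_t) c);
}

/* Send a NUL-terminated string one byte at a time. */
static void guest_puts(const char *s)
{
    while (*s)
        guest_putc(*s++);
}
```

Calling `guest_puts("Hello RISC-V!\n")` produces 14 single-byte stores, which corresponds to the 14 `receive byte:` lines in the output above.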
After that, I started to implement `fib.c`. Implementing the Fibonacci number recursively in C was easy. However, `fib()` returns an **integer**, while the UART receives one **char** at a time, so I needed to convert the integer into a string. Because I could not use the **gcc library**, I had to do the conversion myself. I found a recursive way to do the conversion [here](https://www.geeksforgeeks.org/convert-a-string-to-an-integer-using-recursion/), but I was worried that too much recursion would overflow the stack, so I simply assumed that the result of fib(10) is a two-digit number.
```clike=
int i = fib(10);        // the result of fib(10)
char str[2];            // assumes the result is a two-digit number (no terminating '\0')
str[0] = i / 10 + '0';  // tens digit
str[1] = i % 10 + '0';  // ones digit
```
Then I got a linker error when I tried to convert the digits into chars:
```
home/eddie/riscv-none-embed-gcc/8.2.0-3.1/bin/../lib/gcc/riscv-none-embed/8.2.0/../../../../riscv-none-embed/bin/ld: /tmp/ccEdQZxS.o: in function `.L43':
fib.c:(.text+0x180): undefined reference to `__divsi3'
/home/eddie/riscv-none-embed-gcc/8.2.0-3.1/bin/../lib/gcc/riscv-none-embed/8.2.0/../../../../riscv-none-embed/bin/ld: fib.c:(.text+0x194): undefined reference to `__modsi3'
collect2: error: ld returned 1 exit status
make: *** [Makefile:15: fib] Error 1
```
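This error happens because RV32I has no hardware multiply or divide instructions, so the compiler turns `/` and `%` into calls to the libgcc helper routines `__divsi3` and `__modsi3` (and `*` into `__mulsi3`). Since `-nostdlib` is used, libgcc is not linked and these helpers are missing, so I needed to implement them myself: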
```clike=
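int divide(int a, int b) // division: a / b, assuming a >= 0 and b > 0
{
    int quotient = 0;
    while (a >= b) { // repeated subtraction; never terminates if b <= 0
        a -= b;
        ++quotient;
    }
    return quotient;
}
int mul(int a, int b) // multiplication: a * b (shift-and-add)
{
    unsigned int res = 0;
    int i;
    for (i = 0; i < 32; i++) {
        // add a << i for every set bit i of b;
        // unsigned arithmetic wraps around instead of overflowing
        if (((unsigned int) b >> i) & 0x1)
            res = res + ((unsigned int) a << i);
    }
    return (int) res;
}
int modulo(int a, int b) // modulo: a % b, same assumptions as divide()
{
    int quo = divide(a, b);
    int mod = a - mul(quo, b);
    return mod;
}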
After that, the program printed the correct answer:
```
eddie@eddie-VirtualBox:~/rv32emu$ ./emu-rv32i fib
55
>>> Execution time: 289820 ns
>>> Instruction count: 1951 (IPS=6731764)
>>> Jumps: 299 (15.33%) - 72 forwards, 227 backwards
>>> Branching T=132 (54.10%) F=112 (45.90%)
```
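The two-digit assumption can be dropped without recursion. Below is a minimal sketch, not the code used in `fib.c`: `udivmod` and `utoa10` are hypothetical helper names, and 32-bit `unsigned` is assumed. The division uses shift-and-subtract, so it needs at most 32 iterations and no `/`, `%` or `*` operator, which means no libgcc helper is required.

```c
/* Host-only headers for trying the sketch out; drop them for a -nostdlib build. */
#include <assert.h>
#include <string.h>

/* Shift-and-subtract division: returns n / d and stores n % d in *rem.
 * Assumes 32-bit unsigned and 0 < d < 2^31. */
static unsigned udivmod(unsigned n, unsigned d, unsigned *rem)
{
    unsigned q = 0, r = 0;
    int i;
    for (i = 31; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1u); /* bring down the next bit of n */
        if (r >= d) {                   /* d fits: subtract and set quotient bit */
            r -= d;
            q |= 1u << i;
        }
    }
    *rem = r;
    return q;
}

/* Write the decimal digits of v into buf and return the number of digits.
 * buf needs room for 11 bytes: 10 digits plus the terminating '\0'. */
static int utoa10(unsigned v, char *buf)
{
    char tmp[10];
    int n = 0, i;
    do {
        unsigned rem;
        v = udivmod(v, 10u, &rem);
        tmp[n++] = (char) ('0' + rem); /* least significant digit first */
    } while (v != 0);
    for (i = 0; i < n; i++)            /* reverse into the caller's buffer */
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```

With this, `utoa10(fib(10), buf)` would fill `buf` with `"55"`, and each character could then be stored to `UART_TX_ADDR` in a loop.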
## (2)Disassemble the ELF files generated by C compiler
### 1.fibonacci number
I observed the result of running `riscv-none-embed-objdump -d fib` with different `RV32I_CFLAGS`:
1. flag `-O3`
When I used the flag `-O3` (optimize for speed), the execution time was about 280000 ns (over many runs) and the instruction count was 1951.
```
>>> Execution time: 288700 ns
>>> Instruction count: 1951 (IPS=6757880)
>>> Jumps: 299 (15.33%) - 72 forwards, 227 backwards
>>> Branching T=132 (54.10%) F=112 (45.90%)
```
2. flag `-Os`
When I used the flag `-Os` (optimize for size), the execution time was about 330000 ns (averaged over many runs) and the instruction count was 2197.
```
>>> Execution time: 311260 ns
>>> Instruction count: 2197 (IPS=7058407)
>>> Jumps: 441 (20.07%) - 132 forwards, 309 backwards
>>> Branching T=160 (63.24%) F=93 (36.76%)
```
At first it seemed strange to me that the build optimized for speed executed fewer instructions than the build optimized for size. When I checked the objdump, the text size with `-O3` was indeed bigger than with `-Os`, so the difference had to come from how the code is executed. My explanation is below:
* Thought:
I observed that with the flag `-Os` the generated assembly was almost the same as the hand-written assembly contributed by the other student: both compute fib(n) with two plain recursive calls (fib(9) and fib(8) for fib(10)), just as the graph below shows for F(6):
```graphviz
strict digraph G
{
1[label="F(6)"]
2[label="F(4)"]
3[label="F(5)"]
4[label="F(2)"]
5[label="F(3)"]
6[label="F(3)"]
7[label="F(4)"]
8[label="F(0)", style=filled]
9[label="F(1)", style=filled]
10[label="F(1)", style=filled]
11[label="F(2)"]
12[label="F(1)", style=filled]
13[label="F(2)"]
14[label="F(2)"]
15[label="F(3)"]
16[label="F(0)", style=filled]
17[label="F(1)", style=filled]
18[label="F(0)", style=filled]
19[label="F(1)", style=filled]
20[label="F(0)", style=filled]
21[label="F(1)", style=filled]
22[label="F(1)", style=filled]
23[label="F(2)"]
24[label="F(0)", style=filled]
25[label="F(1)", style=filled]
1 -> {2, 3}
2 -> {4, 5}
3 -> {6, 7}
4 -> {8, 9}
5 -> {10, 11}
6 -> {12, 13}
7 -> {14, 15}
11 -> {16, 17}
13 -> {18, 19}
14 -> {20, 21}
15 -> {22, 23}
23 -> {24, 25}
}
```
As you can see, some terms are calculated redundantly many times. Optimizing for size only shrinks the code; it keeps the plain recursion, so the program still performs all the redundant calculations shown in the graph above. Therefore, it executes more instructions than I expected.
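The amount of redundant work can be made concrete by counting calls. The sketch below is for illustration only: `calls`, `fib_naive` and `fib_iter` are hypothetical names, and `fib_iter` is just a contrast case, not a claim about what `-O3` actually generates.

```c
#include <assert.h>

static unsigned long calls; /* number of fib_naive() invocations */

/* Plain recursion, the same shape as the call graph above. */
static int fib_naive(int n)
{
    calls++;
    if (n <= 1)
        return n;
    return fib_naive(n - 1) + fib_naive(n - 2);
}

/* Iterative version: n additions, no repeated subproblems. */
static int fib_iter(int n)
{
    int a = 0, b = 1, i;
    for (i = 0; i < n; i++) {
        int next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```

`fib_naive(6)` makes 25 calls, one per node of the graph above, and `fib_naive(10)` makes 177 calls to produce 55, while `fib_iter(10)` needs only 10 loop iterations.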
## (3)Check the results of emu-rv32i for the statistics of execution flow and explain the internal counters such as true_counter, false_counter (crucial for branch prediction), jump_counter, etc.
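1. `true_counter`: counts the conditional branches whose condition was true, i.e. the branches that actually jumped. It is printed as `T=` in the `Branching` line.
2. `jump_counter`: counts the branch and jump instructions that transferred control. It is printed in the `Jumps` line, split into forward and backward jumps.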
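How such counters could be maintained is sketched below. This is a simplified model that is consistent with the printed statistics, not a copy of the `emu-rv32i.c` source: the struct `flow_stats` and the functions `count_transfer`, `record_branch` and `record_jump` are hypothetical names, and the rule that only taken branches are added to `jump_counter` is an assumption.

```c
#include <assert.h>
#include <stdint.h>

struct flow_stats {
    uint64_t jump_counter;     /* control transfers that were followed */
    uint64_t forward_counter;  /* ... whose target is above the current pc */
    uint64_t backward_counter; /* ... whose target is at or below the current pc */
    uint64_t true_counter;     /* conditional branches taken */
    uint64_t false_counter;    /* conditional branches not taken */
};

/* Shared bookkeeping for anything that changes the pc. */
static void count_transfer(struct flow_stats *s, uint32_t pc, uint32_t target)
{
    if (target > pc)
        s->forward_counter++;
    else
        s->backward_counter++;
    s->jump_counter++;
}

/* Conditional branch: counted as a jump only when taken (assumption). */
static void record_branch(struct flow_stats *s, uint32_t pc, uint32_t target, int taken)
{
    if (taken) {
        s->true_counter++;
        count_transfer(s, pc, target);
    } else {
        s->false_counter++;
    }
}

/* Unconditional jump (jal/jalr): always a control transfer. */
static void record_jump(struct flow_stats *s, uint32_t pc, uint32_t target)
{
    count_transfer(s, pc, target);
}
```

The percentages in the output follow from these counters: for `test1`, `Jumps` is 14 / 62 instructions = 22.58%, and `T` is 13 / (13 + 1) branches = 92.86%. Such taken versus not-taken ratios are what a branch predictor tries to exploit.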