# Lab2: Binary search tree with RISC-V RV32I[MA] emulator ## Source code ```cpp void _start(){ volatile int arr[11] = {1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31}; volatile int target = 3; volatile int low_i = 0; volatile int high_i = 10; volatile int mean = (low_i + high_i)/2; volatile int iter = 4; volatile int i =0; while(i < iter) { if(arr[mean] == target) break; if(arr[mean] > target) high_i = mean - 1; else low_i = mean + 1; mean = (low_i + high_i)/2; i++; } if(i == 4){ volatile char* tx = (volatile char*) 0x40002000; const char* hello = "Failed!\n"; while (*hello) { *tx = *hello; hello++; } } } ``` ## `-O3` and `-Os` ```shell 00010054 <_start>: Initial: 10054: 000107b7 lui a5,0x10 10058: 1c878793 addi a5,a5,456 # 101c8 <_start+0x174> 1005c: 0007a703 lw a4,0(a5) 10060: 0047a603 lw a2,4(a5) 10064: fb010113 addi sp,sp,-80 10068: 0087a683 lw a3,8(a5) 1006c: 02e12223 sw a4,36(sp) 10070: 00c7a703 lw a4,12(a5) 10074: 02c12423 sw a2,40(sp) 10078: 0107a603 lw a2,16(a5) 1007c: 02d12623 sw a3,44(sp) 10080: 0147a683 lw a3,20(a5) 10084: 02e12823 sw a4,48(sp) 10088: 0187a703 lw a4,24(a5) 1008c: 02c12a23 sw a2,52(sp) 10090: 01c7a603 lw a2,28(a5) 10094: 02d12c23 sw a3,56(sp) 10098: 0207a683 lw a3,32(a5) 1009c: 02e12e23 sw a4,60(sp) 100a0: 0247a703 lw a4,36(a5) 100a4: 04c12023 sw a2,64(sp) 100a8: 0287a783 lw a5,40(a5) 100ac: 04d12223 sw a3,68(sp) 100b0: 04e12423 sw a4,72(sp) 100b4: 04f12623 sw a5,76(sp) 100b8: 00300793 li a5,3 100bc: 00f12623 sw a5,12(sp) 100c0: 00012823 sw zero,16(sp) binary-search: 100c4: 00a00793 li a5,10 100c8: 00f12a23 sw a5,20(sp) 100cc: 01012783 lw a5,16(sp) 100d0: 01412703 lw a4,20(sp) 100d4: 00e78733 add a4,a5,a4 100d8: 01f75793 srli a5,a4,0x1f 100dc: 00e787b3 add a5,a5,a4 100e0: 4017d793 srai a5,a5,0x1 100e4: 00f12c23 sw a5,24(sp) 100e8: 00400793 li a5,4 100ec: 00f12e23 sw a5,28(sp) 100f0: 02012023 sw zero,32(sp) 100f4: 02012703 lw a4,32(sp) 100f8: 01c12783 lw a5,28(sp) 100fc: 04f74463 blt a4,a5,10144 <_start+0xf0> 10100: 08c0006f j 1018c <_start+0x138> assign_mean_to_high: 10104: 01812783 lw a5,24(sp) 10108: fff78793 addi a5,a5,-1 1010c: 00f12a23 sw a5,20(sp) assign_mean_to_low: 10110: 01012783 lw a5,16(sp) 10114: 01412703 lw a4,20(sp) 10118: 00e78733 add a4,a5,a4 1011c: 01f75793 srli a5,a4,0x1f 10120: 00e787b3 add a5,a5,a4 10124: 4017d793 srai a5,a5,0x1 10128: 00f12c23 sw a5,24(sp) 1012c: 02012783 lw a5,32(sp) 10130: 00178793 addi a5,a5,1 10134: 02f12023 sw a5,32(sp) 10138: 02012703 lw a4,32(sp) 1013c: 01c12783 lw a5,28(sp) while_loop: 10140: 04f75663 bge a4,a5,1018c <_start+0x138> check_equal: 10144: 01812783 lw a5,24(sp) 10148: 05010713 addi a4,sp,80 1014c: 00279793 slli a5,a5,0x2 10150: 00f707b3 add a5,a4,a5 10154: fd47a703 lw a4,-44(a5) 10158: 00c12783 lw a5,12(sp) 1015c: 02f70863 beq a4,a5,1018c <_start+0x138> greater: 10160: 01812783 lw a5,24(sp) 10164: 05010713 addi a4,sp,80 10168: 00279793 slli a5,a5,0x2 1016c: 00f707b3 add a5,a4,a5 10170: fd47a703 lw a4,-44(a5) 10174: 00c12783 lw a5,12(sp) 10178: f8e7c6e3 blt a5,a4,10104 <_start+0xb0> less: 1017c: 01812783 lw a5,24(sp) 10180: 00178793 addi a5,a5,1 10184: 00f12823 sw a5,16(sp) 10188: f89ff06f j 10110 <_start+0xbc> Not_found: 1018c: 02012703 lw a4,32(sp) 10190: 00400793 li a5,4 10194: 00f70663 beq a4,a5,101a0 <_start+0x14c> 10198: 05010113 addi sp,sp,80 1019c: 00008067 ret Print_fail: 101a0: 000107b7 lui a5,0x10 101a4: 1f478793 addi a5,a5,500 # 101f4 <_start+0x1a0> 101a8: 04600713 li a4,70 101ac: 400026b7 lui a3,0x40002 101b0: 00e68023 sb a4,0(a3) # 40002000 <__global_pointer$+0x3fff0600> 101b4: 00178793 addi a5,a5,1 101b8: 0007c703 lbu a4,0(a5) 101bc: fe071ae3 bnez a4,101b0 <_start+0x15c> 101c0: 05010113 addi sp,sp,80 101c4: 00008067 ret ``` This assembly code is quite hard for human to understand, because it uses plenty of sw and lw insn to store the variables and return back. While the optimization flag reduces the number of register to used, it only use two registers to manipulate the binary search tree. The variables are stored in the stack and use the sw and lw to store and restore the value, repectively. ### Statistics #### When finding the target value ```shell >>> Execution time: 20666 ns >>> Instruction count: 118 (IPS=5709861) >>> Jumps: 5 (4.24%) - 2 forwards, 3 backwards >>> Branching T=2 (22.22%) F=7 (77.78%) ``` The array is like ```cpp arr[11] = {1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31}; ``` And the target is `28`. And, there are four branch statement in the source code. We can know that it only needs 2 iteration to find the target value. - In the first one branch statement has never taken. T += 1, F += 2 ```cpp while(i < iter){...} ``` ```shell bge a4,a5,1018c blt a4,a5,1018c ``` - In the second branch statement which will be 2 not taken and 1 taken. T += 1, F += 2. ```cpp if(arr[mean] == target) break; ``` ```shell beq a4,a5,1018c ``` - In the third branch statement which will 2 not taken. T += 0, F += 2. ```cpp if(arr[mean] > target) high_i = mean - 1; else low_i = mean + 1; ``` ```shell blt a5,a4,10104 ``` - Out of the while loop, there will be another branch statement, that is, which not taken 1 times. T += 0, F += 1 ```cpp if( i == 4){...} ``` ```shell beq a4,a5,101a0 ``` - Accuracy : - T : 2 - F : 2 + 2 + 2 +1 \begin{split} Accuracy \ = \ \dfrac{T}{T+F} = \dfrac{2}{2+7} = 0.22 \end{split} #### When failing at finding the target value ```shell Failed! >>> Execution time: 106262 ns >>> Instruction count: 206 (IPS=1938604) >>> Jumps: 15 (7.28%) - 3 forwards, 12 backwards >>> Branching T=13 (59.09%) F=9 (40.91%) ``` I assume the target value is `3`. And the array is the same as previous explaination. ```cpp arr[11] = {1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31}; ``` The while loop will run until the `i` equals to 4. - In the first one branch statement which will be 1 taken for `blt` and 3 not taken and 1 taken for `bge`. T += 2, F += 3 ```cpp while(i < iter){...} ``` ```shell bge a4,a5,1018c blt a4,a5,1018c ``` - In the second branch statement which will be 4 not taken. T += 0, F += 4 ```cpp if(arr[mean] == target) break; ``` ```shell beq a4,a5,1018c ``` - In the third branch statement which will be 4 taken. T += 4, F += 0. ```cpp if(arr[mean] > target) high_i = mean - 1; else low_i = mean + 1; ``` ```shell blt a5,a4,10104 ``` - Out of the while loop, there will be another branch statement, that is, which 1 taken. T += 1, F += 0. ```cpp if( i == 4 ){...} ``` ```shell beq a4,a5,101a0 ``` - Because of the while loop is true, it will print out the string,`Failed`. T += 6, F += 1 ```cpp while (*hello) { *tx = *hello; hello++; } ``` ```shell bnez a4,101b0 ``` - Accuracy : - T : - F : 1 + 2 + 2 + 1 \begin{split} Accuracy \ = \ \dfrac{T}{T+F} = \dfrac{2}{2+7} = 0.22 \end{split} Reference: - [undefined to memcpy](https://stackoverflow.com/questions/53176254/undefined-reference-to-memcpy)