# Assignment 2-1 Toolchain ###### tags: `Computer Architecture` contributed by <`joe-U16`> ## Binary Search program contributed by [<江承緯>](https://hackmd.io/aAz3-wW7SeGEmq-e1v_N6w) ## (1)Rewrite into C implementations ```c= #define SIZE 11 int binarySearch( int arr[], int low, int high, int x) { while( low <= high ) { unsigned int mid = (low + high) >> 1; if( arr[mid] == x ) return mid; else if( x < arr[mid] ) return binarySearch( arr, low, mid - 1, x ); return binarySearch( arr, mid + 1, high, x ); } return -1; } void _start() { int arr[SIZE] = { 1, 3, 4, 6, 9, 12, 14, 15, 17, 19, 24 }; int x = 8; int result = binarySearch(arr, 0, SIZE-1, x ); volatile char* tx = (volatile char*) 0x40002000; if (result != -1) { char *text = "found at index "; while (*text) { *tx = *text; text++; } *tx = (char) ((result + '0') & 0x000000FF); } else { char* text = "search element "; char* text2 = " not found\n"; while(*text){ *tx = *text; text++; } *tx = (char) ((x + '0') & 0x000000FF); while(*text2){ *tx = *text2; text2++; } } } ``` After rewrite, it can work by [emu-rv32i](https://github.com/sysprog21/rv32emu) successfully. ``` ➜ rv32emu git:(master) ✗ ./emu-rv32i binary search element 8 not found >>> Execution time: 123556 ns >>> Instruction count: 530 (IPS=4289552) >>> Jumps: 51 (9.62%) - 15 forwards, 36 backwards >>> Branching T=34 (80.95%) F=8 (19.05%) ``` ## (2) Disassemble the ELF files generated by C compiler **Use command to see the optimization flags** ``` ➜ riscv-none-embed-gcc --help=optimizers ``` We have optimization flags: > -O\<number> Set optimization level to \<number>. -Ofast Optimize for speed disregarding exact standards compliance. -Og Optimize for debugging experience rather than speed or size. -Os Optimize for space rather than speed. * optimization flags = `-O1` (optimized for size) ``` ➜ rv32emu git:(master) ✗ ./emu-rv32i binary1 search element 8 not found >>> Execution time: 94138 ns >>> Instruction count: 230 (IPS=2443221) >>> Jumps: 42 (18.26%) - 6 forwards, 36 backwards >>> Branching T=28 (70.00%) F=12 (30.00%) ``` --- * optimization flags = `-O3` (optimized for speed) * The flags `-O2` and `-O3` has the same result. ``` ➜ rv32emu git:(master) ✗ ./emu-rv32i binary3 search element 8 not found >>> Execution time: 90073 ns >>> Instruction count: 185 (IPS=2053889) >>> Jumps: 30 (16.22%) - 2 forwards, 28 backwards >>> Branching T=29 (76.32%) F=9 (23.68%) ``` The average execution time with flags `-O1` and `-O3` is close. So we see the assembly code with both flags. **Using GNU Toolchain** - [ ] [objdump](http://man7.org/linux/man-pages/man1/objdump.1.html) > `-d`: Display the assembler mnemonics for the machine instructions * Assembly code with flags `-O1` ``` ➜ rv32emu git:(master) ✗ riscv-none-embed-objdump -d binary1 binary1: file format elf32-littleriscv Disassembly of section .text: 00010054 <binarySearch>: 10054: 04b64a63 blt a2,a1,100a8 <binarySearch+0x54> 10058: 00050713 mv a4,a0 1005c: 00c587b3 add a5,a1,a2 10060: 4017d513 srai a0,a5,0x1 10064: 00050813 mv a6,a0 10068: 00251793 slli a5,a0,0x2 1006c: 00f707b3 add a5,a4,a5 10070: 0007a783 lw a5,0(a5) 10074: 02d78e63 beq a5,a3,100b0 <binarySearch+0x5c> 10078: ff010113 addi sp,sp,-16 1007c: 00112623 sw ra,12(sp) 10080: 00070513 mv a0,a4 10084: 00f6cc63 blt a3,a5,1009c <binarySearch+0x48> 10088: 00180593 addi a1,a6,1 1008c: fc9ff0ef jal ra,10054 <binarySearch> 10090: 00c12083 lw ra,12(sp) 10094: 01010113 addi sp,sp,16 10098: 00008067 ret 1009c: fff80613 addi a2,a6,-1 100a0: fb5ff0ef jal ra,10054 <binarySearch> 100a4: fedff06f j 10090 <binarySearch+0x3c> 100a8: fff00513 li a0,-1 100ac: 00008067 ret 100b0: 00008067 ret 000100b4 <_start>: . . . The <_start> section will print string, but it is too long. So we ignore that. ``` * Assembly code with flags `-O3` ``` ➜ rv32emu git:(master) ✗ riscv-none-embed-objdump -d binary3 binary3: file format elf32-littleriscv Disassembly of section .text: 00010054 <binarySearch>: 10054: 06b64463 blt a2,a1,100bc <binarySearch+0x68> 10058: 00c58833 add a6,a1,a2 1005c: 40185813 srai a6,a6,0x1 10060: 00281793 slli a5,a6,0x2 10064: 00f507b3 add a5,a0,a5 10068: 0007a703 lw a4,0(a5) 1006c: 00080793 mv a5,a6 10070: 02e68663 beq a3,a4,1009c <binarySearch+0x48> 10074: 02e6d863 bge a3,a4,100a4 <binarySearch+0x50> 10078: fff80613 addi a2,a6,-1 1007c: 00c587b3 add a5,a1,a2 10080: 4017d793 srai a5,a5,0x1 10084: 00279713 slli a4,a5,0x2 10088: 00e50733 add a4,a0,a4 1008c: 02b64863 blt a2,a1,100bc <binarySearch+0x68> 10090: 00072703 lw a4,0(a4) 10094: 00078813 mv a6,a5 10098: fcd71ee3 bne a4,a3,10074 <binarySearch+0x20> 1009c: 00078513 mv a0,a5 100a0: 00008067 ret 100a4: 00180593 addi a1,a6,1 100a8: 00c587b3 add a5,a1,a2 100ac: 4017d793 srai a5,a5,0x1 100b0: 00279713 slli a4,a5,0x2 100b4: 00e50733 add a4,a0,a4 100b8: fcb65ce3 bge a2,a1,10090 <binarySearch+0x3c> 100bc: fff00793 li a5,-1 100c0: 00078513 mv a0,a5 100c4: 00008067 ret 000100c8 <_start>: . . . The <_start> section will print string, but it is too long. So we ignore that. ``` ### Illustrate the assembly code with flags `-O3`: * `blt a2,a1,100bc <binarySearch+0x68>` * `a1 = low`, `a2 = high` * When `a1 < a2`, it will branch to `100bc` * `10058` and `1005c` * in c: `int mid = (low + high) >> 1;` * `slli a5,a6,0x2` * `a5` stored same as `mid` * `slli a5,a6,0x2` * let `a5 x 4`, because every data in `arr[]` has 4 bytes. * `add a5,a0,a5` * begin at `a0` * `lw a4,0(a5)` * load data from arr[mid] The execution time between flags `-O1` and `-O3` is so close. After observation the assembly code, the code is close too. - [ ] readelf(http://man7.org/linux/man-pages/man1/readelf.1.html) flags = `-O3` `➜ riscv-none-embed-readelf -h binary3` ``` ➜ rv32emu git:(master) ✗ riscv-none-embed-readelf -h binary3 ELF Header: Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 Class: ELF32 Data: 2's complement, little endian Version: 1 (current) OS/ABI: UNIX - System V ABI Version: 0 Type: EXEC (Executable file) Machine: RISC-V Version: 0x1 Entry point address: 0x100c8 Start of program headers: 52 (bytes into file) Start of section headers: 1044 (bytes into file) Flags: 0x0 Size of this header: 52 (bytes) Size of program headers: 32 (bytes) Number of program headers: 1 Size of section headers: 40 (bytes) Number of section headers: 7 Section header string table index: 6 ``` - [ ] size > list the section sizes—and the total size—for each of the object or archive files objfile in its argument list. flags = `-O3` `➜ riscv-none-embed-size binary3` ``` ➜ rv32emu git:(master) ✗ riscv-none-embed-size binary3 text data bss dec hex filename 524 0 0 524 20c binary3 ```