# Lab2: RISC-V RV32I[MA] emulator with ELF support for Tower of Hanoi ###### tags: `Computer Architecture` ## Tower of Hanoi ### Problems encountered In the begining, I want to write a hanoi C program which can accept number of plates. But I don't know how to implement "scanf" in this emulator. However, I decide to implement a simple hanoi program with 3 plates. Below is my C program ```= volatile char* tx = (volatile char*) 0x40002000; const char* show = "Move sheet from "; const char* to = " to "; const char* blank= "\n"; char *d; int n=3; int x=0; void main(){ hanoi(n,'A','B','C'); } void hanoi(int n,char A,char B,char C){ if(n==1){ while(*show){ *tx = *show; show++; x++; } while(x){ show--; x--; } d = &A; *tx = *d; while(*to){ *tx = *to; to++; x++; } while(x){ to--; x--; } d = &C; *tx = *d; *tx = *blank; } else{ hanoi(n-1,A,C,B); hanoi(1,A,B,C); hanoi(n-1,B,A,C); } } ``` ### result without O3 optimization ``` ./emu-rv32i hanoi Move sheet from A to C Move sheet from A to B Move sheet from C to B Move sheet from A to C Move sheet from B to A Move sheet from B to C Move sheet from A to C >>> Execution time: 306239 ns >>> Instruction count: 3478 (IPS=11357142) >>> Jumps: 341 (9.80%) - 41 forwards, 300 backwards >>> Branching T=283 (88.99%) F=35 (11.01%) ``` ### Objdump without O3 optimization ``` hanoi: file format elf32-littleriscv Disassembly of section .text: 00000100 <_start>: 100: 00000093 li ra,0 104: 00000113 li sp,0 108: 00000193 li gp,0 10c: 00000213 li tp,0 110: 00000293 li t0,0 114: 00000313 li t1,0 118: 00000393 li t2,0 11c: 00000413 li s0,0 120: 00000493 li s1,0 124: 00000513 li a0,0 128: 00000593 li a1,0 12c: 00000613 li a2,0 130: 00000693 li a3,0 134: 00000713 li a4,0 138: 00000793 li a5,0 13c: 00000813 li a6,0 140: 00000893 li a7,0 144: 00000913 li s2,0 148: 00000993 li s3,0 14c: 00000a13 li s4,0 150: 00000a93 li s5,0 154: 00000b13 li s6,0 158: 00000b93 li s7,0 15c: 00000c13 li s8,0 160: 00000c93 li s9,0 164: 00000d13 li s10,0 168: 00000d93 li s11,0 16c: 00000e13 li t3,0 170: 00000e93 li t4,0 174: 00000f13 li t5,0 178: 00000f93 li t6,0 0000017c <init_stack>: 17c: 00001117 auipc sp,0x1 180: 23810113 addi sp,sp,568 # 13b4 <_stack> 00000184 <Init>: 184: 010000ef jal ra,194 <main> 188: 0040006f j 18c <Exit> 0000018c <Exit>: 18c: 00000073 ecall 190: 00008067 ret 00000194 <main>: 194: ff010113 addi sp,sp,-16 198: 00112623 sw ra,12(sp) 19c: 00812423 sw s0,8(sp) 1a0: 01010413 addi s0,sp,16 1a4: 3a802783 lw a5,936(zero) # 3a8 <n> 1a8: 04300693 li a3,67 1ac: 04200613 li a2,66 1b0: 04100593 li a1,65 1b4: 00078513 mv a0,a5 1b8: 018000ef jal ra,1d0 <hanoi> 1bc: 00000013 nop 1c0: 00c12083 lw ra,12(sp) 1c4: 00812403 lw s0,8(sp) 1c8: 01010113 addi sp,sp,16 1cc: 00008067 ret 000001d0 <hanoi>: 1d0: fe010113 addi sp,sp,-32 1d4: 00112e23 sw ra,28(sp) 1d8: 00812c23 sw s0,24(sp) 1dc: 02010413 addi s0,sp,32 1e0: fea42623 sw a0,-20(s0) 1e4: 00058793 mv a5,a1 1e8: 00068713 mv a4,a3 1ec: fef405a3 sb a5,-21(s0) 1f0: 00060793 mv a5,a2 1f4: fef40523 sb a5,-22(s0) 1f8: 00070793 mv a5,a4 1fc: fef404a3 sb a5,-23(s0) 200: fec42703 lw a4,-20(s0) 204: 00100793 li a5,1 208: 10f71063 bne a4,a5,308 <hanoi+0x138> 20c: 02c0006f j 238 <hanoi+0x68> 210: 39c02703 lw a4,924(zero) # 39c <show> 214: 39802783 lw a5,920(zero) # 398 <__data_end> 218: 00074703 lbu a4,0(a4) 21c: 00e78023 sb a4,0(a5) 220: 39c02783 lw a5,924(zero) # 39c <show> 224: 00178713 addi a4,a5,1 228: 38e02e23 sw a4,924(zero) # 39c <show> 22c: 3b002783 lw a5,944(zero) # 3b0 <x> 230: 00178713 addi a4,a5,1 234: 3ae02823 sw a4,944(zero) # 3b0 <x> 238: 39c02783 lw a5,924(zero) # 39c <show> 23c: 0007c783 lbu a5,0(a5) 240: fc0798e3 bnez a5,210 <hanoi+0x40> 244: 01c0006f j 260 <hanoi+0x90> 248: 39c02783 lw a5,924(zero) # 39c <show> 24c: fff78713 addi a4,a5,-1 250: 38e02e23 sw a4,924(zero) # 39c <show> 254: 3b002783 lw a5,944(zero) # 3b0 <x> 258: fff78713 addi a4,a5,-1 25c: 3ae02823 sw a4,944(zero) # 3b0 <x> 260: 3b002783 lw a5,944(zero) # 3b0 <x> 264: fe0792e3 bnez a5,248 <hanoi+0x78> 268: feb40713 addi a4,s0,-21 26c: 3ae02623 sw a4,940(zero) # 3ac <d> 270: 3ac02703 lw a4,940(zero) # 3ac <d> 274: 39802783 lw a5,920(zero) # 398 <__data_end> 278: 00074703 lbu a4,0(a4) 27c: 00e78023 sb a4,0(a5) 280: 02c0006f j 2ac <hanoi+0xdc> 284: 3a002703 lw a4,928(zero) # 3a0 <to> 288: 39802783 lw a5,920(zero) # 398 <__data_end> 28c: 00074703 lbu a4,0(a4) 290: 00e78023 sb a4,0(a5) 294: 3a002783 lw a5,928(zero) # 3a0 <to> 298: 00178713 addi a4,a5,1 29c: 3ae02023 sw a4,928(zero) # 3a0 <to> 2a0: 3b002783 lw a5,944(zero) # 3b0 <x> 2a4: 00178713 addi a4,a5,1 2a8: 3ae02823 sw a4,944(zero) # 3b0 <x> 2ac: 3a002783 lw a5,928(zero) # 3a0 <to> 2b0: 0007c783 lbu a5,0(a5) 2b4: fc0798e3 bnez a5,284 <hanoi+0xb4> 2b8: 01c0006f j 2d4 <hanoi+0x104> 2bc: 3a002783 lw a5,928(zero) # 3a0 <to> 2c0: fff78713 addi a4,a5,-1 2c4: 3ae02023 sw a4,928(zero) # 3a0 <to> 2c8: 3b002783 lw a5,944(zero) # 3b0 <x> 2cc: fff78713 addi a4,a5,-1 2d0: 3ae02823 sw a4,944(zero) # 3b0 <x> 2d4: 3b002783 lw a5,944(zero) # 3b0 <x> 2d8: fe0792e3 bnez a5,2bc <hanoi+0xec> 2dc: fe940713 addi a4,s0,-23 2e0: 3ae02623 sw a4,940(zero) # 3ac <d> 2e4: 3ac02703 lw a4,940(zero) # 3ac <d> 2e8: 39802783 lw a5,920(zero) # 398 <__data_end> 2ec: 00074703 lbu a4,0(a4) 2f0: 00e78023 sb a4,0(a5) 2f4: 3a402703 lw a4,932(zero) # 3a4 <blank> 2f8: 39802783 lw a5,920(zero) # 398 <__data_end> 2fc: 00074703 lbu a4,0(a4) 300: 00e78023 sb a4,0(a5) 304: 0600006f j 364 <hanoi+0x194> 308: fec42783 lw a5,-20(s0) 30c: fff78793 addi a5,a5,-1 310: feb44703 lbu a4,-21(s0) 314: fe944603 lbu a2,-23(s0) 318: fea44683 lbu a3,-22(s0) 31c: 00070593 mv a1,a4 320: 00078513 mv a0,a5 324: eadff0ef jal ra,1d0 <hanoi> 328: feb44783 lbu a5,-21(s0) 32c: fe944683 lbu a3,-23(s0) 330: fea44703 lbu a4,-22(s0) 334: 00070613 mv a2,a4 338: 00078593 mv a1,a5 33c: 00100513 li a0,1 340: e91ff0ef jal ra,1d0 <hanoi> 344: fec42783 lw a5,-20(s0) 348: fff78793 addi a5,a5,-1 34c: feb44603 lbu a2,-21(s0) 350: fe944683 lbu a3,-23(s0) 354: fea44703 lbu a4,-22(s0) 358: 00070593 mv a1,a4 35c: 00078513 mv a0,a5 360: e71ff0ef jal ra,1d0 <hanoi> 364: 00000013 nop 368: 01c12083 lw ra,28(sp) 36c: 01812403 lw s0,24(sp) 370: 02010113 addi sp,sp,32 374: 00008067 ret ``` ### Result with O3 optimization ``` ./emu-rv32i hanoi Move sheet from A to C Move sheet from A to B Move sheet from C to B Move sheet from A to C Move sheet from B to A Move sheet from B to C Move sheet from A to C >>> Execution time: 348744 ns >>> Instruction count: 1819 (IPS=5215860) >>> Jumps: 151 (8.30%) - 12 forwards, 139 backwards >>> Branching T=129 (72.47%) F=49 (27.53%) ``` ### Objdump with O3 optimization ``` hanoi: file format elf32-littleriscv Disassembly of section .text: 00000100 <_start>: 100: 00000093 li ra,0 104: 00000113 li sp,0 108: 00000193 li gp,0 10c: 00000213 li tp,0 110: 00000293 li t0,0 114: 00000313 li t1,0 118: 00000393 li t2,0 11c: 00000413 li s0,0 120: 00000493 li s1,0 124: 00000513 li a0,0 128: 00000593 li a1,0 12c: 00000613 li a2,0 130: 00000693 li a3,0 134: 00000713 li a4,0 138: 00000793 li a5,0 13c: 00000813 li a6,0 140: 00000893 li a7,0 144: 00000913 li s2,0 148: 00000993 li s3,0 14c: 00000a13 li s4,0 150: 00000a93 li s5,0 154: 00000b13 li s6,0 158: 00000b93 li s7,0 15c: 00000c13 li s8,0 160: 00000c93 li s9,0 164: 00000d13 li s10,0 168: 00000d93 li s11,0 16c: 00000e13 li t3,0 170: 00000e93 li t4,0 174: 00000f13 li t5,0 178: 00000f93 li t6,0 0000017c <init_stack>: 17c: 00001117 auipc sp,0x1 180: 1c410113 addi sp,sp,452 # 1340 <_stack> 00000184 <Init>: 184: 16c000ef jal ra,2f0 <main> 188: 0040006f j 18c <Exit> 0000018c <Exit>: 18c: 00000073 ecall 190: 00008067 ret 00000194 <hanoi>: 194: fe010113 addi sp,sp,-32 198: 00112e23 sw ra,28(sp) 19c: 00812c23 sw s0,24(sp) 1a0: 00912a23 sw s1,20(sp) 1a4: 00b107a3 sb a1,15(sp) 1a8: 00d10723 sb a3,14(sp) 1ac: 00100793 li a5,1 1b0: 0cf51c63 bne a0,a5,288 <hanoi+0xf4> 1b4: 33002603 lw a2,816(zero) # 330 <show> 1b8: 00064683 lbu a3,0(a2) 1bc: 12068263 beqz a3,2e0 <hanoi+0x14c> 1c0: 33402703 lw a4,820(zero) # 334 <tx> 1c4: 00d70023 sb a3,0(a4) 1c8: 33002683 lw a3,816(zero) # 330 <show> 1cc: 33c02703 lw a4,828(zero) # 33c <x> 1d0: 00168613 addi a2,a3,1 1d4: 00170713 addi a4,a4,1 1d8: 32c02823 sw a2,816(zero) # 330 <show> 1dc: 32e02e23 sw a4,828(zero) # 33c <x> 1e0: 0016c683 lbu a3,1(a3) 1e4: fc069ee3 bnez a3,1c0 <hanoi+0x2c> 1e8: 00070863 beqz a4,1f8 <hanoi+0x64> 1ec: 40e60733 sub a4,a2,a4 1f0: 32002e23 sw zero,828(zero) # 33c <x> 1f4: 32e02823 sw a4,816(zero) # 330 <show> 1f8: 00f14683 lbu a3,15(sp) 1fc: 33402703 lw a4,820(zero) # 334 <tx> 200: 00f10613 addi a2,sp,15 204: 32c02c23 sw a2,824(zero) # 338 <d> 208: 00d70023 sb a3,0(a4) 20c: 32c02603 lw a2,812(zero) # 32c <to> 210: 00064683 lbu a3,0(a2) 214: 0c068a63 beqz a3,2e8 <hanoi+0x154> 218: 33402703 lw a4,820(zero) # 334 <tx> 21c: 00d70023 sb a3,0(a4) 220: 32c02683 lw a3,812(zero) # 32c <to> 224: 33c02703 lw a4,828(zero) # 33c <x> 228: 00168613 addi a2,a3,1 22c: 00170713 addi a4,a4,1 230: 32c02623 sw a2,812(zero) # 32c <to> 234: 32e02e23 sw a4,828(zero) # 33c <x> 238: 0016c683 lbu a3,1(a3) 23c: fc069ee3 bnez a3,218 <hanoi+0x84> 240: 00070863 beqz a4,250 <hanoi+0xbc> 244: 40e60733 sub a4,a2,a4 248: 32e02623 sw a4,812(zero) # 32c <to> 24c: 32002e23 sw zero,828(zero) # 33c <x> 250: 00e14703 lbu a4,14(sp) 254: 33402783 lw a5,820(zero) # 334 <tx> 258: 00e10693 addi a3,sp,14 25c: 32d02c23 sw a3,824(zero) # 338 <d> 260: 00e78023 sb a4,0(a5) 264: 32802703 lw a4,808(zero) # 328 <blank> 268: 33402783 lw a5,820(zero) # 334 <tx> 26c: 00074703 lbu a4,0(a4) 270: 00e78023 sb a4,0(a5) 274: 01c12083 lw ra,28(sp) 278: 01812403 lw s0,24(sp) 27c: 01412483 lw s1,20(sp) 280: 02010113 addi sp,sp,32 284: 00008067 ret 288: 00060493 mv s1,a2 28c: 00f14583 lbu a1,15(sp) 290: 00e14603 lbu a2,14(sp) 294: fff50413 addi s0,a0,-1 298: 00048693 mv a3,s1 29c: 00040513 mv a0,s0 2a0: ef5ff0ef jal ra,194 <hanoi> 2a4: 00e14683 lbu a3,14(sp) 2a8: 00f14583 lbu a1,15(sp) 2ac: 00048613 mv a2,s1 2b0: 00100513 li a0,1 2b4: ee1ff0ef jal ra,194 <hanoi> 2b8: 00e14683 lbu a3,14(sp) 2bc: 00f14603 lbu a2,15(sp) 2c0: 00048593 mv a1,s1 2c4: 00040513 mv a0,s0 2c8: ecdff0ef jal ra,194 <hanoi> 2cc: 01c12083 lw ra,28(sp) 2d0: 01812403 lw s0,24(sp) 2d4: 01412483 lw s1,20(sp) 2d8: 02010113 addi sp,sp,32 2dc: 00008067 ret 2e0: 33c02703 lw a4,828(zero) # 33c <x> 2e4: f05ff06f j 1e8 <hanoi+0x54> 2e8: 33c02703 lw a4,828(zero) # 33c <x> 2ec: f55ff06f j 240 <hanoi+0xac> 000002f0 <main>: 2f0: 32402503 lw a0,804(zero) # 324 <__data_paddr_start> 2f4: 04300693 li a3,67 2f8: 04200613 li a2,66 2fc: 04100593 li a1,65 300: e95ff06f j 194 <hanoi> ``` ### Instruction state without O3 optimization ``` Instructions Stat: AUIPC = 1 JAL = 47 JALR = 11 BNE = 318 LW = 1229 LBU = 342 SB = 191 SW = 606 ADDI = 731 ECALL = 1 LI* = 58 Five Most Frequent: 1) LW = 1229 (35.34%) 2) ADDI = 731 (21.02%) 3) SW = 606 (17.42%) 4) LBU = 342 (9.83%) 5) BNE = 318 (9.14%) ``` ### Instruction state with O3 optimization ``` Instructions Stat: AUIPC = 1 JAL = 12 JALR = 10 BEQ = 28 BNE = 150 LW = 493 LBU = 193 SB = 181 SW = 352 ADDI = 383 SUB = 14 ECALL = 1 LI* = 47 Five Most Frequent: 1) LW = 493 (27.10%) 2) ADDI = 383 (21.06%) 3) SW = 352 (19.35%) 4) LBU = 193 (10.61%) 5) SB = 181 (9.95%) ``` ### Readlf ``` ELF Header: Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 Class: ELF32 Data: 2's complement, little endian Version: 1 (current) OS/ABI: UNIX - System V ABI Version: 0 Type: EXEC (Executable file) Machine: RISC-V Version: 0x1 Entry point address: 0x100 Start of program headers: 52 (bytes into file) Start of section headers: 1584 (bytes into file) Flags: 0x0 Size of this header: 52 (bytes) Size of program headers: 32 (bytes) Number of program headers: 1 Size of section headers: 40 (bytes) Number of section headers: 12 Section header string table index: 11 ``` ### Size ``` text data bss dec hex filename 548 20 4104 4672 1240 hanoi ``` ### Optimization comparison | | Compile with -O3| Compile without optimization | | -------- | -------- | -------- | | total instruction | 1819 | 3478 | |Jump |151|341| |Jump forwards|12|41| |Jump backwards|139|300| |Branch True|129|283| |Branch False|49|35|