# Assignment 3: 林家葦
###### tags: `Computer Architecture`

## Bubble Sort
* **ELF file**
    * The [testbench source](https://github.com/PulseRain/Reindeer/blob/master/sim/verilator/tb_PulseRain_RV2T.cpp) shows that the test files (ELF) are generated from [riscv-compliance](https://github.com/riscv/riscv-compliance).
    * Clone that project with git to generate the ELF file.
* **Source code**

```C=
#include "compliance_test.h"
#include "compliance_io.h"
#include "test_macros.h"

RV_COMPLIANCE_RV32M
RV_COMPLIANCE_CODE_BEGIN
    RVTEST_IO_INIT
    RVTEST_IO_ASSERT_GPR_EQ(x31, x0, 0x00000000)
    RVTEST_IO_WRITE_STR(x31, "Test Begin\n")

    addi sp,sp,-48
    sw s0,44(sp)
    addi s0,sp,48
    la t0,test_1_data
    lw a5,0(t0)
    sw a5,-48(s0)
    la t0,test_2_data
    lw a5,0(t0)
    sw a5,-44(s0)
    la t0,test_3_data
    lw a5,0(t0)
    sw a5,-40(s0)
    la t0,test_4_data
    lw a5,0(t0)
    sw a5,-36(s0)
    la t0,test_5_data
    lw a5,0(t0)
    sw a5,-32(s0)
    sw zero,-20(s0)
    j .L2
.L6:
    // init j to memory
    sw zero,-24(s0)
    j .L3
.L5:
    // load arr[j] to a4
    lw a5,-24(s0)
    slli a5,a5,2
    addi a4,s0,-16
    add a5,a4,a5
    lw a4,-32(a5)
    // load arr[j+1] to a5
    lw a5,-24(s0)
    addi a5,a5,1
    slli a5,a5,2
    addi a3,s0,-16
    add a5,a3,a5
    lw a5,-32(a5)
    // if arr[j+1] >= arr[j], skip the swap
    bge a5,a4,.L4
    // arr[j+1] < arr[j]: swap
    // load arr[j] to a5
    lw a5,-24(s0)
    slli a5,a5,2
    addi a4,s0,-16
    add a5,a4,a5
    lw a5,-32(a5)
    // store arr[j] to tmp
    sw a5,-28(s0)
    // load arr[j+1] to a4
    lw a5,-24(s0)
    addi a5,a5,1
    slli a5,a5,2
    addi a4,s0,-16
    add a5,a4,a5
    lw a4,-32(a5)
    // arr[j] = arr[j+1]
    lw a5,-24(s0)
    slli a5,a5,2
    addi a3,s0,-16
    add a5,a3,a5
    sw a4,-32(a5)
    // store tmp to arr[j+1]
    lw a5,-24(s0)
    addi a5,a5,1
    slli a5,a5,2
    addi a4,s0,-16
    add a5,a4,a5
    lw a4,-28(s0)
    sw a4,-32(a5)
.L4:
    // j++
    lw a5,-24(s0)
    addi a5,a5,1
    sw a5,-24(s0)
.L3:
    // check j < 4 - i
    li a4,4
    lw a5,-20(s0)
    sub a5,a4,a5
    lw a4,-24(s0)
    blt a4,a5,.L5
    // i++
    lw a5,-20(s0)
    addi a5,a5,1
    sw a5,-20(s0)
.L2:
    // check i < 4
    lw a4,-20(s0)
    li a5,3
    bge a5,a4,.L6
    // return: store the sorted array to the signature area (test_1_res)
    lw t0, -48(s0)
    la t1, test_1_res
    sw t0, 0(t1)
    lw t0,-44(s0)
    sw t0, 4(t1)
    lw t0,-40(s0)
    sw t0, 8(t1)
    lw t0,-36(s0)
    sw t0, 12(t1)
    lw t0,-32(s0)
    sw t0, 16(t1)
    li a5,0
    mv a0,a5
    lw s0,44(sp)
    addi sp,sp,48
    jr ra

    RVTEST_IO_WRITE_STR(x31, "Test End\n")
    RV_COMPLIANCE_HALT
RV_COMPLIANCE_CODE_END

    .data
test_1_data : .word 0x00000003
test_2_data : .word 0x00000005
test_3_data : .word 0x00000001
test_4_data : .word 0x00000002
test_5_data : .word 0x00000004

RV_COMPLIANCE_DATA_BEGIN
test_1_res:
    .fill 5, 4, -1
RV_COMPLIANCE_DATA_END
```

* **Compile the source code**

```
$ cd riscv-compliance/riscv-test-suite/rv32i
$ make bubble.elf
riscv-none-embed-gcc -march=rv32i -mabi=ilp32 -static -mcmodel=medany -fvisibility=hidden -nostdlib -nostartfiles -I../..//riscv-test-env/ -I../..//riscv-test-env/p/ -I../../riscv-target//riscvOVPsim/ -T../..//riscv-test-env/p/link.ld src/T.S -o ../..//work//T.elf; \
riscv-none-embed-objdump -D ../..//work//T.elf > ../..//work//T.elf.objdump
```

* **Execute**

```
$ make build
$ make test T
====> Testing ./obj_dir/VPulseRain_RV2T_MCU
TEST CASE: T
testing ../compliance/T.elf -r ../compliance/references/T.reference_output
=============================================================
=== PulseRain Technology, RISC-V RV32IM Test Bench
=============================================================
elf file : ../compliance/T.elf
reference : ../compliance/references/T.reference_output
start address = 0x80000000
begin signature address = 0x80002020
end signature address = 0x80002040

=============> reset...
=============> init stack ...
=============> load elf file...
...
...
...
========> Matching signature ...
80002020 00000001 PASS
80002024 00000002 PASS
80002028 00000003 PASS
8000202c 00000004 PASS
80002030 00000005 PASS
======> Signature ALL MATCH!!!
```

* **Memory Layout**

![](https://i.imgur.com/lwmLSlp.png)

## GTKwave
1. Reset the CPU and put it into the hold state. By default it has access to the UART TX port.
However, a valid debug frame sent from the host PC lets the OCD reconfigure the mux and switch the UART TX to the OCD side, so that the memory can be accessed and control frames can be exchanged.
![](https://i.imgur.com/gq8uuBE.png)
2. Call the toolchain to extract the code/data from the .elf file of the test case. The OCD then loads the binary code into memory while the CPU is paused.
![](https://i.imgur.com/M6TxIBf.png)

```
Loading section .text.init ... 2c4 bytes, LMA = 0x80000000
80000000 04c0006f
80000004 34202f73
80000008 00800f93
8000000c 03ff0a63
80000010 00900f93
80000014 03ff0663
80000018 00b00f93
8000001c 03ff0263
80000020 80000f17
80000024 fe0f0f13
80000028 000f0463
8000002c 000f0067
80000030 34202f73
80000034 000f5463
80000038 0040006f
8000003c 5391e193
80000040 00001f17
80000044 fc3f2023
80000048 ff9ff06f
8000004c f1402573
80000050 00051063
80000054 00000297
80000058 01028293
```

3. Once the OCD has loaded all code/data into memory, it raises the start signal to wake up the CPU.
![](https://i.imgur.com/vys95Xh.png)
4. Start executing the program.
![](https://i.imgur.com/hRduwkf.png)
5. When the CPU finishes the program, reset the CPU and put it into the hold state for the second time.
![](https://i.imgur.com/Mu02NIj.png)
6. Read the data out of memory and compare it against the reference signature.
![](https://i.imgur.com/ISpDOV3.png)

## How Reindeer works with Verilator
Verilator is a free and open-source software tool which converts Verilog (a hardware description language) to a cycle-accurate behavioral model in C++ or SystemC.

Look at `sim/verilator/tb_PulseRain_RV2T.cpp`:

```cpp
uut = new UUT; // Create instance
VerilatedVcdC* tfp = new VerilatedVcdC;
```

The `uut` object is the soft CPU itself. The testbench simulates the hold state and initializes the registers:
```cpp
if (!trace_file.empty()) {
    Verilated::traceEverOn(true);
    uut->trace (tfp, 99);
    tfp->open (trace_file.c_str());
} else {
    tfp = NULL;
}

testbench t {10, uut, tfp};

uut->reset_n = 0; // Set some inputs
uut->sync_reset = 0;

std::cout << "\n=============> reset..." << "\n";
uut->sync_reset = 0;
uut->ocd_reg_we = 0;
uut->ocd_reg_read_addr = 0;
uut->ocd_reg_write_addr = 0;
uut->ocd_reg_write_data = 0;
uut->ocd_read_enable = 0;
uut->ocd_write_enable = 0;
uut->ocd_rw_addr = 0;
uut->ocd_write_word = 0;
uut->start = 0;
uut->start_address = start_address;
t.reset();

std::cout << "=============> init stack ..." << "\n";
uut->ocd_reg_we = 1;
uut->ocd_reg_write_addr = 2; // SP
uut->ocd_reg_write_data = DEFAULT_STACK_INIT_VALUE;
t.run();
uut->ocd_reg_we = 0;
t.run();

std::cout << "=============> load elf file..." << "\n";
load_elf_sections(&t, uut);
t.run();
```

After the OCD has loaded the program into memory, the start signal is raised and the circuit starts executing the program.

```cpp
load_elf_sections(&t, uut);
t.run();

std::cout << "\n=============> start running ..." << "\n";
uut->start = 1;
t.run();
uut->start = 0;
t.run();
```

## Hold and Load
![](https://i.imgur.com/SsIjTrE.png)

After reset, the soft CPU is put into a hold state, and it has access to the UART TX port by default. However, a valid debug frame sent from the host PC lets the OCD reconfigure the mux and switch the UART TX to the OCD side, so that the memory can be accessed and control frames can be exchanged. A new software image can be loaded into memory while the CPU is held, which gives rise to the name "hold-and-load".
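
The hold-and-load sequence (reset, load, start, reset again, compare the signature) can be sketched as a tiny software model. This is only an illustration under stated assumptions: `ToyMcu`, `CpuState`, `bubble_sort` and `run_test` are hypothetical names that do not exist in the Reindeer sources, and the "program" is simply a C++ bubble sort that uses the same loop bounds as the assembly in the Bubble Sort section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical names throughout; nothing here is taken from the Reindeer sources.
enum class CpuState { Hold, Running };

// Reference bubble sort with the same loop bounds as the assembly:
// i counts passes, j runs while j < (n - 1) - i, signed comparison like bge.
void bubble_sort(std::vector<std::int32_t>& arr) {
    const std::size_t n = arr.size();
    if (n < 2) return;  // avoids unsigned underflow in n - 1
    for (std::size_t i = 0; i < n - 1; ++i) {
        for (std::size_t j = 0; j < n - 1 - i; ++j) {
            if (arr[j + 1] < arr[j]) std::swap(arr[j], arr[j + 1]);
        }
    }
}

struct ToyMcu {
    CpuState state = CpuState::Hold;
    std::vector<std::int32_t> memory;

    // Reset puts the CPU into the hold state; memory contents are kept.
    void reset() { state = CpuState::Hold; }

    // Loading an image is only accepted while the CPU is held.
    bool load(const std::vector<std::int32_t>& image) {
        if (state != CpuState::Hold) return false;
        memory = image;
        return true;
    }

    // The start pulse releases the CPU; the "program" here is just the sort.
    void start() {
        state = CpuState::Running;
        bubble_sort(memory);
    }
};

// reset -> load -> start -> reset -> compare memory with the reference.
bool run_test(const std::vector<std::int32_t>& image,
              const std::vector<std::int32_t>& reference) {
    ToyMcu mcu;
    mcu.reset();
    if (!mcu.load(image)) return false;
    mcu.start();
    mcu.reset();
    assert(mcu.state == CpuState::Hold);  // memory is read back while held
    return mcu.memory == reference;
}
```

Calling `run_test({3, 5, 1, 2, 4}, {1, 2, 3, 4, 5})` with the five data words from the test returns `true`, which corresponds to the "Signature ALL MATCH" result. The model refuses `load` while the CPU is running, which is the point of holding the CPU before a new image is written.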