Testing for Memory Leaks in rv32emu

Detailed steps of the experiment:

  1. Install Debian GNU/Linux 12 (bookworm) in VirtualBox 7.0.12 on Windows 10 Home.

  2. Install the dependencies with the apt command.
    For the commands in this document: curl git make.
    For compiling rv32emu: libsdl2-dev libsdl2-mixer-dev.
    For debugging programs with Valgrind: libc6-dbg.

        sudo apt install -y curl git make libsdl2-dev libsdl2-mixer-dev libc6-dbg
    
  3. Get and build the latest version of sysprog21/rv32emu from its GitHub repository. (The latest commit on the master branch was 90b42a6 as of 2023-12-11.) The sed command replaces the -O2 flag passed to cc with -O0 -g, because the Valgrind documentation recommends compiling with debugging information and without optimization so that error reports point to exact source lines.

        mkdir -p ~/ca
        cd ~/ca
        git clone https://github.com/sysprog21/rv32emu
        cd rv32emu
        sed -i -e "s/\(CFLAGS.*\)-O2\(.*\)/\1-O0 -g\2/" Makefile

        make
    
  4. Get, build, and install the latest release of Valgrind from its official website. (It was 3.22.0 as of 2023-12-11.)

        cd /tmp
        curl -o valgrind.tar.bz2 https://sourceware.org/pub/valgrind/valgrind-3.22.0.tar.bz2
        tar -xjf valgrind.tar.bz2
        mv valgrind-3.22.0 valgrind
        echo 3.22.0 > valgrind/version.txt
        rm valgrind.tar.bz2

        cd valgrind
        mkdir -p ~/.local/valgrind
        ./configure --prefix="$HOME/.local/valgrind"
        make
        make install
        echo 'export PATH="$HOME/.local/valgrind/bin:$PATH"' >> ~/.bashrc
        . ~/.bashrc
        cd
        rm -rf /tmp/valgrind
    
  5. On Debian, 4 out of 5 tests in rv32emu ended in segmentation faults, and the hello.elf test reported Failed.

        cd ~/ca/rv32emu
        make check

    Results:

        Segmentation fault
        Segmentation fault
        Segmentation fault
        Segmentation fault
        Running hello.elf ... Failed.
        make: *** [Makefile:185: check] Error 1
    
  6. According to the check target in the Makefile, each test binary is executed with $(BIN) $(OUT)/$(e).elf. Focusing on the failing binary hello and expanding these variables, that command is equivalent to build/rv32emu build/hello.elf.

  7. Following the instructions from Valgrind™ Developers (2023), run the command that produces the segmentation fault under valgrind.

        valgrind --leak-check=yes build/rv32emu build/hello.elf
    
    Output:

        ==24784== Memcheck, a memory error detector
        ==24784== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
        ==24784== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
        ==24784== Command: build/rv32emu build/hello.elf
        ==24784== 
        ==24784== Invalid read of size 8
        ==24784==    at 0x11F826: memory_write.lto_priv.1 (io.h:49)
        ==24784==    by 0x11FE72: rv_reset (riscv.c:206)
        ==24784==    by 0x11FCE0: rv_create (riscv.c:126)
        ==24784==    by 0x123400: main (main.c:232)
        ==24784==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
        ==24784== 
        ==24784== 
        ==24784== Process terminating with default action of signal 11 (SIGSEGV)
        ==24784==  Access not within mapped region at address 0x0
        ==24784==    at 0x11F826: memory_write.lto_priv.1 (io.h:49)
        ==24784==    by 0x11FE72: rv_reset (riscv.c:206)
        ==24784==    by 0x11FCE0: rv_create (riscv.c:126)
        ==24784==    by 0x123400: main (main.c:232)
        ==24784==  If you believe this happened as a result of a stack
        ==24784==  overflow in your program's main thread (unlikely but
        ==24784==  possible), you can try to increase the size of the
        ==24784==  main thread stack using the --main-stacksize= flag.
        ==24784==  The main thread stack size used in this run was 8388608.
        ==24784== 
        ==24784== HEAP SUMMARY:
        ==24784==     in use at exit: 121,693 bytes in 281 blocks
        ==24784==   total heap usage: 380 allocs, 99 frees, 131,285 bytes allocated
        ==24784== 
        ==24784== LEAK SUMMARY:
        ==24784==    definitely lost: 0 bytes in 0 blocks
        ==24784==    indirectly lost: 0 bytes in 0 blocks
        ==24784==      possibly lost: 0 bytes in 0 blocks
        ==24784==    still reachable: 119,677 bytes in 260 blocks
        ==24784==         suppressed: 0 bytes in 0 blocks
        ==24784== Reachable blocks (those to which a pointer was found) are not shown.
        ==24784== To see them, rerun with: --leak-check=full --show-leak-kinds=all
        ==24784== 
        ==24784== For lists of detected and suppressed errors, rerun with: -s
        ==24784== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
        Segmentation fault
    

    In short, the error is a read from an invalid memory location, address 0x0, which points to a null pointer dereference. The stack trace is as follows.

        Invalid read of size 8
           at 0x11F826: memory_write.lto_priv.1 (io.h:49)
           by 0x11FE72: rv_reset (riscv.c:206)
           by 0x11FCE0: rv_create (riscv.c:126)
           by 0x123400: main (main.c:232)
         Address 0x0 is not stack'd, malloc'd or (recently) free'd
    
  8. Right before src/io.h:49, print every pointer involved to find out which one is invalid. After editing, the function is as follows. (The stdio.h header is included only to make gcc accept the calls to printf() and fflush().)

        #include <stdio.h> /* for printf() and fflush() */

        static inline void memory_write(memory_t *m,
                                        uint32_t addr,
                                        const uint8_t *src,
                                        uint32_t size)
        {
            /* Print each pointer and flush at once, so the last line shown
             * before the crash identifies the invalid pointer. */
            printf("src: %p\n", (const void *) src);
            fflush(stdout);
            printf("m: %p\n", (void *) m);
            fflush(stdout);
            printf("m->mem_base: %p\n", (void *) m->mem_base);
            fflush(stdout);
            memcpy(m->mem_base + addr, src, size);
        }
    
  9. Rebuild rv32emu and run hello.elf.

        make && build/rv32emu build/hello.elf

    Output:

        ... (building info)
        src: 0x7ffd91d49b40
        m: (nil)
        Segmentation fault

    Aha! m is nil, that is, a null pointer. We need to find out why.
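
The debug prints in step 8 stop at the first use of m, because evaluating m->mem_base with a nil m is itself the crash. A minimal sketch of a guard that names the offending pointer before any dereference might look like the following. The type demo_memory_t and the functions first_null_pointer() and demo_memory_write() are hypothetical stand-ins for illustration, not rv32emu's actual definitions.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for rv32emu's memory_t; only mem_base is modeled. */
typedef struct {
    uint8_t *mem_base;
} demo_memory_t;

/* Return the name of the first null pointer, or NULL if all are valid.
 * m is checked before m->mem_base, so nothing is dereferenced too early. */
static const char *first_null_pointer(const demo_memory_t *m,
                                      const uint8_t *src)
{
    if (!src)
        return "src";
    if (!m)
        return "m";
    if (!m->mem_base)
        return "m->mem_base";
    return NULL;
}

/* Guarded write: report the bad pointer on stderr instead of crashing.
 * Returns 0 on success and -1 if any pointer is null. */
static int demo_memory_write(demo_memory_t *m,
                             uint32_t addr,
                             const uint8_t *src,
                             uint32_t size)
{
    const char *bad = first_null_pointer(m, src);
    if (bad) {
        fprintf(stderr, "memory_write: %s is NULL\n", bad);
        return -1;
    }
    memcpy(m->mem_base + addr, src, size);
    return 0;
}
```

Because stderr is unbuffered by default, the message appears without an explicit fflush() call.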

  10. In other words, trace backward to find where m is initialized.

    1. src/riscv.c:206: s->mem is the argument m of memory_write().
    2. src/riscv.c:196: state_t *s = rv_userdata(rv);.
    3. rv_userdata(rv): If rv is not nil, return rv->userdata. This is not our target, but tells us to trace rv->userdata.
    4. src/riscv.c:153: rv is the parameter of rv_reset(), which is the current function. So, we need to trace its caller, rv_create().
    5. src/riscv.c:126: rv is the argument rv of rv_reset().
    6. src/riscv.c:112: rv->userdata = userdata;.
    7. src/riscv.c:99: userdata is the parameter of rv_create(), so trace its caller, main.
    8. main.c:232: state is the argument userdata of rv_create().
    9. main.c:223: state_t *state = state_new();.
    10. src/state.h:28: In state_new(), s->mem = memory_new();. This is our target! Look into it.
    11. src/io.c:50: return mem;.
    12. src/io.c:33: memory_t *mem = malloc(sizeof(memory_t));, where sizeof(memory_t) is 16, which shouldn't cause a segmentation fault.
    13. src/io.c:42: However, in the same function, the line data_memory_base = malloc(MEM_SIZE); is interesting, because it tries to allocate $2^{32} - 1$ bytes (~4 GiB) of memory at once.
  11. rv32emu assumes that the operating system can commit at least 4 GiB of memory without trapping or returning nil. Booting Debian and Ubuntu VMs with different memory sizes leads to the following conclusion.

    If the available memory (Mem_total - Mem_used + Swap_available) is smaller than 4 GiB, rv32emu hits a segmentation fault in some cases on Debian. On Ubuntu, by contrast, the segmentation fault does not occur.
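
The available-memory estimate can be sketched in C by parsing text in the format of /proc/meminfo. This is only an approximation under a stated assumption: the kernel's MemAvailable field stands in for Mem_total - Mem_used, and SwapFree stands in for Swap_available. The function names meminfo_kib(), available_kib(), and below_4gib() are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Return the value in KiB of a "Key:   12345 kB" line from
 * /proc/meminfo-style text, or -1 if the key is absent or malformed. */
static long long meminfo_kib(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        /* Match the key only at the start of a line, followed by ':'. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long long value;
            if (sscanf(line + klen + 1, "%lld", &value) == 1)
                return value;
            return -1;
        }
        const char *newline = strchr(line, '\n');
        line = newline ? newline + 1 : NULL;
    }
    return -1;
}

/* Rough estimate: memory reported as available plus free swap, in KiB. */
static long long available_kib(const char *text)
{
    long long mem = meminfo_kib(text, "MemAvailable");
    long long swap = meminfo_kib(text, "SwapFree");
    if (mem < 0 || swap < 0)
        return -1;
    return mem + swap;
}

/* Return 1 if the estimate is below 4 GiB, 0 otherwise or on error. */
static int below_4gib(const char *text)
{
    long long kib = available_kib(text);
    return kib >= 0 && kib < 4LL * 1024 * 1024;
}
```

To use it on a live system, read /proc/meminfo into a NUL-terminated buffer (for example with fopen() and fread()) and pass that buffer to below_4gib().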

    If rv32emu checked for a nil return value from malloc()-related functions and reported an error, the segmentation fault would not occur.
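
What such a check could look like is sketched here. It is an illustration under assumptions, not rv32emu's code: demo_memory_t, demo_memory_new(), demo_memory_free(), and demo_run() are hypothetical names, and the real memory_new() may be structured differently.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for rv32emu's memory_t. */
typedef struct {
    uint8_t *mem_base;
    size_t mem_size;
} demo_memory_t;

/* Allocate the descriptor and its backing store.
 * Returns NULL, without leaking, if size is 0 or an allocation fails. */
static demo_memory_t *demo_memory_new(size_t size)
{
    if (size == 0)
        return NULL;

    demo_memory_t *mem = malloc(sizeof(*mem));
    if (!mem)
        return NULL;

    mem->mem_base = malloc(size);
    if (!mem->mem_base) {
        free(mem); /* undo the first allocation */
        return NULL;
    }
    mem->mem_size = size;
    return mem;
}

/* Release both allocations; accepts NULL like free() does. */
static void demo_memory_free(demo_memory_t *mem)
{
    if (!mem)
        return;
    free(mem->mem_base);
    free(mem);
}

/* Caller side: turn a failed allocation into an error message and a
 * non-zero status instead of passing NULL further down the call chain. */
static int demo_run(size_t size)
{
    demo_memory_t *mem = demo_memory_new(size);
    if (!mem) {
        fprintf(stderr, "cannot allocate %zu bytes of guest memory\n", size);
        return 1;
    }
    mem->mem_base[0] = 0; /* safe: both pointers were checked */
    demo_memory_free(mem);
    return 0;
}
```

With this shape, a failed 4 GiB request would surface as an error message and a non-zero exit status at the call site, rather than as a nil m reaching memory_write().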