# Execution visualization for rv32emu
> 唐文駿
## Objective
The primary goal of this project is to provide a visual representation of the behavior of RISC-V programs while incorporating hardware performance estimations, enabling deeper insights into program execution and system performance.
## Execution Plan
### 1. Study the Reference Project
- Analyze [ama-riscv-sim](https://github.com/AleksandarLilic/ama-riscv-sim) to understand its mechanisms for:
  - Logging executed instructions and runtime profiling.
  - Visualizing program behavior with Python scripts.
### 2. Analyze the Current Emulator
- Evaluate the visualization output that rv32emu already provides.
### 3. Apply Visualization Methods to the Current Emulator
- Integrate a logging mechanism in rv32emu to capture the sequence of executed instructions, program counters, and execution times.
## Visualization in [ama-riscv-sim](https://github.com/AleksandarLilic/ama-riscv-sim/blob/master/script/README.md)
ama-riscv-sim uses [./run_analysis.py](https://github.com/AleksandarLilic/ama-riscv-sim/blob/master/script/README.md) to implement its visualization.
This code aims to visualize the behavior of RISC-V instruction execution and memory access patterns through charts and data analysis. Below are the main functionalities implemented in the code, along with explanations.
#### 1. User Input Parsing and Command Validation
The program parses user-provided command-line arguments and performs corresponding actions based on the parameters, such as processing instruction logs or execution trace files.
```python
import argparse


def parse_args() -> argparse.Namespace:
    """Parse the command-line options of the analysis script."""
    parser = argparse.ArgumentParser(description="Analysis of memory access logs and traces")
    parser.add_argument('-i', '--inst_log', type=str, help="Path to JSON instruction count log with profiling data")
    parser.add_argument('-t', '--trace', type=str, help="Path to binary execution trace")
    # Additional parameters omitted...
    return parser.parse_args()
```
---
#### 2. File Processing
The provided execution trace file is read and parsed into a format suitable for analysis: the binary trace is converted into a DataFrame for further processing.
```python
import numpy as np
import pandas as pd


def load_bin_trace(bin_log, args) -> pd.DataFrame:
    """Read a binary trace made of fixed-size records of five uint32 fields."""
    dtype = np.dtype([
        ('pc', np.uint32), ('isz', np.uint32),
        ('dmem', np.uint32), ('dsz', np.uint32), ('sp', np.uint32),
    ])
    data = np.fromfile(bin_log, dtype=dtype)
    df = pd.DataFrame(data, columns=['pc', 'isz', 'dmem', 'dsz', 'sp'])
    # Additional processing logic omitted...
    return df
```
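The record layout above (five unsigned 32-bit fields, so 20 bytes per record) can also be decoded with the standard library alone. The following is a minimal sketch, not code from ama-riscv-sim: it assumes little-endian records, and the name `iter_trace_records` and the sample values are hypothetical.

```python
import struct

# One trace record: pc, isz, dmem, dsz, sp, each an unsigned 32-bit integer.
# Little-endian byte order is an assumption made for this sketch.
RECORD = struct.Struct("<5I")
FIELDS = ("pc", "isz", "dmem", "dsz", "sp")


def iter_trace_records(raw: bytes):
    """Yield one dict per 20-byte record in a raw trace buffer."""
    usable = len(raw) - len(raw) % RECORD.size  # ignore a truncated tail
    for values in RECORD.iter_unpack(raw[:usable]):
        yield dict(zip(FIELDS, values))


# Hypothetical two-record trace, packed the same way it is read back.
sample = (RECORD.pack(0x10000, 4, 0, 0, 0x20000)
          + RECORD.pack(0x10004, 4, 0x20010, 4, 0x20000))
records = list(iter_trace_records(sample))
print(records[1]["pc"] - records[0]["pc"])  # distance between the two PCs
```

Decoding record by record like this is slower than `np.fromfile`, but it makes the on-disk format explicit, which helps when another emulator has to emit the same trace.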
---
#### 3. Visualization Generation
Two main types of visualizations are generated: a histogram of instruction execution frequency and a time series chart of instruction execution behavior. The time series chart shows how instruction execution changes over the program's runtime, enabling analysis of instruction access patterns.
##### 3.1. Histogram Generation
```python
import matplotlib.pyplot as plt


def draw_inst_log(df, hl_groups, title, args) -> plt.Figure:
    """Draw a horizontal bar chart of execution counts per instruction type."""
    df_g = df[['i_type', 'count']].groupby('i_type').sum()
    fig, ax = plt.subplots()
    ax.barh(df_g.index, df_g['count'], color="skyblue")
    # Additional chart settings omitted...
    return fig
```
##### 3.2. Time Series Chart Generation
```python
import matplotlib.pyplot as plt


def draw_exec(df, hl_groups, title, symbols, args, ctype) -> plt.Figure:
    """Plot the program counter against the trace sample index."""
    fig, ax_t = plt.subplots()
    ax_t.step(df.index, df['pc'], where='pre', lw=1.5)
    # Additional chart settings omitted...
    return fig
```
> Which file records these dynamic changes?
---
#### 4. Execution and Data Saving
The main function orchestrates the above functionalities and, based on user options, saves the generated data or charts to files.
```python
def run_main(args) -> None:
    # hl_groups and title are prepared earlier in the script (omitted here).
    if args.inst_log:
        df, fig = run_inst_log(args.inst_log, hl_groups, title, args)
        # Save CSV or images...
    elif args.trace:
        df, figs_dict = run_bin_trace(args.trace, hl_groups, title, args)
        # Save charts...
```
### Environment
```shell
$ riscv64-unknown-elf-gcc --version
riscv64-unknown-elf-gcc (g04696df09) 14.2.0
$ gcc --version
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
$ sysctl -a | grep machdep.cpu
machdep.cpu.cores_per_package: 8
machdep.cpu.core_count: 8
machdep.cpu.logical_per_package: 8
machdep.cpu.thread_count: 8
machdep.cpu.brand_string: Apple M2
```
### Step 1: Try to execute ama-riscv-sim on an example file
```shell
../src/ama-riscv-sim ../sw/baremetal/vector_ew_mac_uint8/basic.bin --out_dir_tag=basic_test
```
> problem 1: `make` fails under `src`
```shell
main.cpp:4:10: fatal error: 'external/cxxopts/include/cxxopts.hpp' file not found
#include "external/cxxopts/include/cxxopts.hpp"
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
make: *** [build/main.o] Error 1
```
> The file `cxxopts.hpp` was not found because the `external/cxxopts` submodule had probably not been initialized. cxxopts is a C++ library for handling command-line parameters and is usually included in projects as a git submodule.
> Solution: initialize the git submodules (for example with `git submodule update --init --recursive`) so that cxxopts is fetched. The build then stops at a different error:
```shell
not used [-Werror,-Wunused-private-field]
main_memory* mem;
^
1 error generated.
make: *** [build/hw_models/cache.o] Error 1
```
> problem 2: In `cache.h`, there is an unused private member variable `mem`.
> In this project, the purpose of the `mem` pointer is to enable the `cache` class to interact with the main memory. This is evident from the following key points in the code:
```c++
#if CACHE_MODE == CACHE_MODE_FUNC
act_line.data = mem->rd_line(addr - BASE_ADDR);
#endif
```
> When the cache requires a write-back, dirty data needs to be written back to the main memory:
```c++
#if CACHE_MODE == CACHE_MODE_FUNC and defined(CACHE_VERIFY)
mem->wr_line((act_line.tag << tag_off) - BASE_ADDR, act_line.data);
#endif
```
> However, these operations are surrounded by conditional compilation directives:
```c++
#if CACHE_MODE == CACHE_MODE_FUNC
#if CACHE_MODE == CACHE_MODE_FUNC and defined(CACHE_VERIFY)
```
> This indicates that the `mem` pointer is only used in functional simulation mode (`CACHE_MODE_FUNC`).
> Solution:
> First, add the `mem` member variable in `cache.h`:
```c++
class cache {
    private:
        uint32_t sets;
        uint32_t ways;
        // ... other members ...
        main_memory* mem; // Add this line
        bool speculative_exec_active;
```
> Modify the constructor in `cache.cpp`:
```c++
cache::cache(uint32_t sets, uint32_t ways, std::string cache_name, main_memory* mem) :
    sets(sets), ways(ways), cache_name(cache_name), mem(mem)
{
    validate_inputs(sets, ways);
    // ... rest of the constructor ...
}
```
> Update the initialization in `main_memory.cpp`:
```c++
#ifdef ENABLE_HW_PROF
,
icache(hw_cfg.icache_sets, hw_cfg.icache_ways, "icache", this),
dcache(hw_cfg.dcache_sets, hw_cfg.dcache_ways, "dcache", this)
#endif
```
> problem 3: the software under `sw/` cannot be built
```shell
% make -j
../Makefile.inc:63: *** commands commence before first target. Stop.
```
> solution 3: build `main.c` under `sw/baremetal/` manually with `riscv64-unknown-elf-gcc`
:::danger
Don't put screenshots which contain plaintext only.
:::
### Step 1 Result

The run produces the following files, which are used in Step 2:

- `exec.log` records the detailed execution process of the program (timeline).
- `inst_profiler.json` captures the usage frequency of each instruction.
- `hw_stats.json` logs hardware performance metrics (e.g., cache hit rates).
```json
{
"add": {"count": 1024},
"sub": {"count": 0},
"sll": {"count": 0},
"srl": {"count": 0},
"sra": {"count": 0},
...
```
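To illustrate how a profile of this shape can be summarized before plotting, here is a minimal stdlib-only sketch. The sample string is a hypothetical excerpt in the same `{"name": {"count": N}}` shape, and `top_instructions` is an illustrative helper, not a function of `run_analysis.py`.

```python
import json

# Hypothetical excerpt in the same shape as inst_profiler.json.
sample_json = (
    '{"add": {"count": 1024}, "sub": {"count": 0}, '
    '"lw": {"count": 512}, "beq": {"count": 64}}'
)


def top_instructions(profile: dict, limit: int = 3):
    """Return (name, count, share) tuples for the most frequent instructions."""
    counts = {
        name: entry["count"]
        for name, entry in profile.items()
        if isinstance(entry, dict) and "count" in entry
    }
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:limit]
    return [(name, count, count / total if total else 0.0)
            for name, count in ranked]


for name, count, share in top_instructions(json.loads(sample_json)):
    print(f"{name:<6}{count:>8}{share:>8.1%}")
```

Entries without a `count` field are skipped, so metadata keys in the real file (if any) would not break the summary.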
### Step 2: Visualize the ama-riscv-sim logs of the example file

figure 1 : instructions profiled

figure 2 : PC frequency profiled
We need to identify how the simulator generates the instruction count JSON file. Once this is clear, the same approach can be adapted for rv32emu, and the same Python script can then visualize its instructions.
## Log Generation Mechanism for [ama-riscv-sim](https://github.com/AleksandarLilic/ama-riscv-sim/blob/master/script/README.md)
The three simulation log files (`exec.log`, `inst_profiler.json`, and `hw_stats.json`) are produced by `profilers.cpp`.
### Execution Flow Summary
#### 1. Launching the Simulator
When executing `./ama-riscv-sim` to start the simulator, the program enters `main()` and creates the memory and core:
```cpp
int main(int argc, char* argv[]) {
    // ... Parse command-line arguments ...
    // Create main memory and core
    memory mem(test_bin, hw_cfg);
    core rv32(&mem, gen_log_path(test_bin, out_dir_tag), cfg, hw_cfg);
    rv32.exec(); // Start execution
}
```
#### 2. Core Execution and Profiler Initialization
The `core` begins executing the program. Inside the `core` class, the `profiler` is initialized:
```cpp
class core {
    profiler prof; // Instruction profiler

    void step() {
        uint32_t inst = fetch(); // Fetch instruction
        prof.new_inst(inst);     // Notify profiler of new instruction
        // Record statistics during instruction execution
        switch(inst_type) {
            case BRANCH:
                prof.log_inst(opc_j::i_beq, taken, direction);
                break;
            default:
                prof.log_inst(opc);
                break;
        }
    }
};
```
**Key Points:**
- The `profiler` maintains counters for each instruction type and execution count.
- Special handling is implemented for branch instruction behavior.
#### 3. Profiler Behavior During Execution
The `profiler` records instruction types and counts during execution:
```cpp
class profiler {
    // Instruction counter array
    struct prof_g_t {
        std::string name;
        uint64_t count;
    } prof_g_arr[NUM_INST];

    // Branch instruction counters
    struct prof_j_t {
        std::string name;
        uint64_t count_taken;
        uint64_t count_taken_fwd;
        uint64_t count_not_taken;
        uint64_t count_not_taken_fwd;
    } prof_j_arr[NUM_BRANCH];

    void log_inst(opc_g opc) {
        prof_g_arr[TO_U32(opc)].count++; // Increment count
    }

    void log_inst(opc_j opc, bool taken, b_dir_t direction) {
        if (taken) {
            prof_j_arr[TO_U32(opc)].count_taken++;
            if (direction == b_dir_t::forward)
                prof_j_arr[TO_U32(opc)].count_taken_fwd++;
        }
    }
};
```
**Key Points:**
- The `profiler` tracks all executed instructions and branch instruction details, including directions and taken counts.
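
The counting scheme described above can be modeled in a few lines of Python. This is only an analogue for experimenting with the idea, not the simulator's API: the class and method names are hypothetical, and the not-taken branch path is an assumption, since the C++ excerpt only shows the taken path.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class BranchStats:
    taken: int = 0
    taken_fwd: int = 0
    not_taken: int = 0
    not_taken_fwd: int = 0


class Profiler:
    """Python analogue of the counters above; not the simulator's API."""

    def __init__(self):
        self.general = Counter()  # instruction name -> execution count
        self.branches = {}        # branch name -> BranchStats

    def log_inst(self, name: str) -> None:
        self.general[name] += 1

    def log_branch(self, name: str, taken: bool, forward: bool) -> None:
        stats = self.branches.setdefault(name, BranchStats())
        if taken:
            stats.taken += 1
            stats.taken_fwd += int(forward)
        else:
            # Assumption: the not-taken path mirrors the taken path.
            stats.not_taken += 1
            stats.not_taken_fwd += int(forward)


prof = Profiler()
prof.log_inst("add")
prof.log_inst("add")
prof.log_branch("beq", taken=True, forward=False)
prof.log_branch("beq", taken=False, forward=True)
print(prof.general["add"], prof.branches["beq"])
```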
#### 4. Outputting Execution Statistics
When the program completes execution, the `profiler` outputs statistics:
- **`inst_profiler.json`**: Contains execution statistics for all instructions, including detailed branch prediction information.
```cpp
void profiler::log_to_file() {
    // Open output file
    ofs.open(log_path + "inst_profiler.json");

    // Output general instruction statistics
    for (const auto &i : prof_g_arr) {
        ofs << "\"" << i.name << "\": {\"count\": " << i.count << "}," << std::endl;
    }

    // Output branch instruction statistics
    for (const auto &e : prof_j_arr) {
        ofs << "\"" << e.name << "\": {"
            << "\"count\": " << e.count_taken + e.count_not_taken << ","
            << "\"breakdown\": {"
            << "\"taken\": " << e.count_taken << ","
            << "\"taken_fwd\": " << e.count_taken_fwd
            << "}}," << std::endl;
    }
}
```
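The excerpt writes each entry by hand, which requires care with braces and trailing commas. As a hedged illustration of the same output shape, the Python sketch below builds the structure as a dictionary and lets `json.dumps` handle the syntax. The helper `build_profile` and the sample counts are hypothetical; only the field names (`count`, `breakdown`, `taken`, `taken_fwd`) come from the excerpt.

```python
import json


def build_profile(general: dict, branches: dict) -> str:
    """Serialize counters into the shape written by the excerpt above."""
    profile = {name: {"count": count} for name, count in general.items()}
    for name, b in branches.items():
        profile[name] = {
            "count": b["taken"] + b["not_taken"],
            "breakdown": {"taken": b["taken"], "taken_fwd": b["taken_fwd"]},
        }
    return json.dumps(profile, indent=2)


# Hypothetical counter values for demonstration.
text = build_profile(
    {"add": 1024, "sub": 0},
    {"beq": {"taken": 48, "taken_fwd": 16, "not_taken": 16}},
)
print(text)
```

A tool that emits this format from C (as `rv_pyvisual` would have to) needs the same care with the final comma and the enclosing braces, which is worth checking by loading the file with `json.load` once.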
## Current visualization in [rv32emu](https://github.com/sysprog21/rv32emu)
rv32emu already has its own visualization tool. This section examines how it works so that the Python visualization code can be applied to it.
### Step 1: Run `rv_histogram`
We will compile `tests/nqueens.c` into an ELF file and analyze it using `rv_histogram`. The steps are as follows:
> issue 1: `build/rv_histogram -a nqueens.elf` fails with `Failed to open nqueens.elf`.

> solution:
From the code in `src/elf.c`, it is clear that the simulator expects a 32-bit RISC-V ELF file:
```c
/* must be 32bit ELF */
if (e->hdr->e_ident[EI_CLASS] != ELFCLASS32)
    return false;
/* check if machine type is RISC-V */
if (e->hdr->e_machine != EM_RISCV)
    return false;
```
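The same two checks can be reproduced from Python to inspect a file before handing it to `rv_histogram`. The sketch below relies only on ELF header constants (`EI_CLASS` at index 4, `ELFCLASS32 = 1`, `e_machine` at offset 18, `EM_RISCV = 243`); the function names and the synthetic headers are hypothetical and exist only for the demonstration.

```python
import struct

EI_CLASS, EI_DATA = 4, 5  # indices into e_ident
ELFCLASS32 = 1
ELFDATA2LSB = 1
EM_RISCV = 243


def is_rv32_elf(header: bytes) -> bool:
    """Apply the two checks from the excerpt to the first 20 header bytes."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return False
    if header[EI_CLASS] != ELFCLASS32:
        return False
    # e_machine is a 16-bit field at offset 18; its byte order follows EI_DATA.
    fmt = "<H" if header[EI_DATA] == ELFDATA2LSB else ">H"
    (e_machine,) = struct.unpack_from(fmt, header, 18)
    return e_machine == EM_RISCV


def make_header(elf_class: int, machine: int) -> bytes:
    """Build a synthetic little-endian 20-byte header prefix for the demo."""
    e_ident = b"\x7fELF" + bytes([elf_class, ELFDATA2LSB, 1]) + bytes(9)
    return e_ident + struct.pack("<HH", 2, machine)


print(is_rv32_elf(make_header(ELFCLASS32, EM_RISCV)))  # 32-bit RISC-V header
print(is_rv32_elf(make_header(2, EM_RISCV)))           # 64-bit class header
```

For a real file, the first 20 bytes can be read with `open(path, "rb").read(20)` and passed to `is_rv32_elf`.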
>**Recompile using RV32:**
```bash
riscv64-unknown-elf-gcc -O2 -march=rv32im -mabi=ilp32 -static tests/nqueens.c -o build/nqueens.elf
```
1. **Compile `nqueens.c` into an ELF file** with the RV32 command shown above.
2. **Analyze it using `rv_histogram`:**
- **Display usage statistics for all instructions:**
```bash
build/rv_histogram -a nqueens.elf
```
- **Display register usage statistics:**
```bash
build/rv_histogram -r nqueens.elf
```
- **Display both instruction and register usage statistics simultaneously:**
```bash
build/rv_histogram -ar nqueens.elf
```
### Result for step 1:

figure 3: build/rv_histogram -a nqueens.elf

figure 4: build/rv_histogram -r nqueens.elf
## Histogram Generation Mechanism for rv32emu
The primary goal of `rv_histogram.c` is to parse ELF files, collect and analyze the usage frequency of RV32 instructions or registers, and finally visualize the results using histograms.
### Steps to Process and Analyze Data
#### 1. Parsing Command-Line Arguments
The program begins by parsing command-line arguments using the `parse_args()` function. It determines the target ELF file and whether to perform analysis on registers or instructions.
#### 2. Loading and Parsing the ELF File
In the `main()` function, the program loads the ELF file and extracts its program sections using the following logic:
```c
elf_t *e = elf_new();
if (!elf_open(e, elf_prog)) { /* Handle ELF loading */ }

uint8_t *elf_first_byte = get_elf_first_byte(e);
const struct Elf32_Shdr **shdrs =
    (const struct Elf32_Shdr **) &elf_first_byte[hdr->e_shoff];
```
#### 3. Extracting Executable Instruction Sections
The program iterates through each section in the ELF file, checking if the section type (`sh_type`) is `SHT_PROGBITS` and if the section contains executable instructions (`sh_flags` with `SHF_EXECINSTR`). Identified executable sections are further analyzed:
```c
while (ptr < exec_end_addr) {
    insn = *((uint32_t *) ptr);
    ptr += 4; /* advance to the next instruction; 16-bit compressed handling omitted */
    rv_decode(&ir, insn);
    hist_record(&ir);
}
```
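To make the scanning loop concrete, here is a simplified Python sketch that walks a code buffer and counts instructions by major opcode. It is not a port of `rv_decode`: it only looks at the low 7 bits of each 32-bit instruction, treats any halfword whose low two bits are not `11` as a 16-bit compressed instruction, and assumes little-endian code. The function name and the sample buffer are hypothetical.

```python
import struct
from collections import Counter

# Major opcodes (bits 6..0) from the RV32I base encoding.
MAJOR_OPCODES = {
    0x03: "LOAD", 0x13: "OP-IMM", 0x17: "AUIPC", 0x23: "STORE",
    0x33: "OP", 0x37: "LUI", 0x63: "BRANCH", 0x67: "JALR", 0x6F: "JAL",
}


def count_major_opcodes(code: bytes) -> Counter:
    """Walk a little-endian code buffer and count major opcode classes."""
    counts = Counter()
    ptr = 0
    while ptr + 2 <= len(code):
        (half,) = struct.unpack_from("<H", code, ptr)
        if (half & 0x3) != 0x3:  # low bits != 11: 16-bit compressed
            counts["COMPRESSED"] += 1
            ptr += 2
            continue
        if ptr + 4 > len(code):
            break  # truncated 32-bit instruction at the end
        (insn,) = struct.unpack_from("<I", code, ptr)
        counts[MAJOR_OPCODES.get(insn & 0x7F, "OTHER")] += 1
        ptr += 4
    return counts


# addi x0,x0,0 (nop); add x0,x0,x0; c.nop; jal x0,0
sample = struct.pack("<IIHI", 0x00000013, 0x00000033, 0x0001, 0x0000006F)
print(count_major_opcodes(sample))
```

Counting by major opcode is much coarser than the per-instruction statistics of `rv_histogram`, but it shows why the pointer must advance by 2 or 4 bytes depending on the instruction length.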
#### 4. Instruction and Register Frequency Analysis
For each instruction, the program performs frequency analysis based on the `show_reg` parameter. The logic updates the frequency statistics as follows:
- **Instruction Frequency:** Updates `rv_insn_stats` using `insn_hist_incr()`.
- **Register Frequency:** Updates `rv_reg_stats` using `reg_hist_incr()`.
```c
static void insn_hist_incr(const rv_insn_t *ir);
static void reg_hist_incr(const rv_insn_t *ir);
```
#### 5. Calculating Frequencies and Generating Histograms
The program calculates the highest instruction or register frequency using `find_max_freq()` and generates histograms for each instruction or register based on the statistics:
```c
/* In the caller: find the maximum frequency, then print the histogram */
find_max_freq(rv_insn_stats, N_RV_INSNS + 1);
print_hist_stats(rv_insn_stats, N_RV_INSNS + 1);

static void print_hist_stats(const rv_hist_t *stats, size_t stats_size)
{
    char hist_bar[max_col * 3 + 1];
    float percent;
    size_t idx = 1;

    for (size_t i = 0; i < stats_size; i++) {
        const char *insn_reg = stats[i].insn_reg;
        size_t freq = stats[i].freq;

        percent = ((float) freq / total_freq) * 100;
        if (percent < 1.00) /* skip entries below 1% */
            continue;

        printf(fmt, idx, insn_reg, percent, freq,
               gen_hist_bar(hist_bar, sizeof(hist_bar), freq, max_freq, max_col,
                            used_col));
        idx++;
    }
}
```
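The printing logic above (compute a percentage, skip entries below 1%, scale the bar to the largest frequency) can be sketched in Python as follows. The bar character, the column layout, and the function name `render_histogram` are assumptions for illustration; the sample frequencies are made up.

```python
def render_histogram(freqs: dict, max_col: int = 40, min_percent: float = 1.0):
    """Return text rows: index, name, percentage, count and a scaled bar."""
    total = sum(freqs.values())
    max_freq = max(freqs.values(), default=0)
    rows = []
    idx = 1
    for name, freq in freqs.items():
        percent = freq / total * 100 if total else 0.0
        if percent < min_percent:
            continue  # same 1% cut-off as the C code
        bar_len = freq * max_col // max_freq if max_freq else 0
        rows.append(f"{idx:3}. {name:<8}{percent:6.2f}% [{freq:<6}] {'#' * bar_len}")
        idx += 1
    return rows


# Hypothetical frequencies; "ecall" falls below the 1% threshold.
sample = {"addi": 500, "lw": 300, "sw": 195, "ecall": 5}
for row in render_histogram(sample):
    print(row)
```

Scaling against the maximum frequency rather than the total keeps the longest bar at exactly `max_col` characters, so the chart always uses the full width.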
> issue 2: `build/rv32emu -p build/nqueens.elf` fails with `Error: Bad address`.
## Implementation for RV32emu: rv_pyvisual
### 1. Add `rv_pyvisual.c`
`rv_pyvisual.c` writes a statistics file named `output.json` under `build/pyvisual` and then calls `run_analysis.py` to generate figures for the input ELF file.
### 2. Add a build rule to `mk/tools.mk` (built by `make tool`)
```diff
$ git diff HEAD mk/tools.mk
+PYVIS_BIN := $(OUT)/rv_pyvisual
+PYVIS_OBJS := \
+ riscv.o \
+ utils.o \
+ map.o \
+ elf.o \
+ decode.o \
+ mpool.o \
+ utils.o \
+ rv_pyvisual.o
+
+PYVIS_OBJS := $(addprefix $(OUT)/, $(PYVIS_OBJS))
+deps += $(PYVIS_OBJS:%.o=%.o.d)
-TOOLS_BIN += $(HIST_BIN)
+$(PYVIS_BIN): $(PYVIS_OBJS)
+ $(VECHO) " LD\t$@\n"
+ $(Q)$(CC) -o $@ -D RV32_FEATURE_GDBSTUB=0 $^ $(LDFLAGS)
+
+TOOLS_BIN += $(HIST_BIN) $(PYVIS_BIN)
```
### 3. Add python visualization code `run_analysis.py`
### 4. Running rv_pyvisual (Basic Syntax)
```shell
$ build/rv_pyvisual [options] <elf_file_path> [options]
```
Example:
```shell
$ build/rv_pyvisual -i build/nqueens.elf -l ""
```
Or run the instruction log analysis with highlighted instruction groups:
```shell
build/rv_pyvisual -i build/nqueens.elf -l "lw,lh,lb,lhu,lbu,sw,sh,sb bne,beq,blt,bge,bgeu,bltu jal,jalr"
```
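Judging from the example above, the `-l` argument appears to hold space-separated groups, each a comma-separated list of mnemonics. The sketch below parses a string under that assumption; `parse_highlight_groups` is a hypothetical helper and is not taken from `run_analysis.py`.

```python
def parse_highlight_groups(spec: str) -> list:
    """Split 'a,b c,d' into [['a', 'b'], ['c', 'd']]; an empty spec gives []."""
    return [
        [name for name in group.split(",") if name]
        for group in spec.split()
    ]


groups = parse_highlight_groups(
    "lw,lh,lb,lhu,lbu,sw,sh,sb bne,beq,blt,bge,bgeu,bltu jal,jalr"
)
print(len(groups), groups[2])
```

With this reading, the example highlights three groups: memory accesses, conditional branches, and jumps.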

:::danger
Don't put screenshots which contain plaintext only!
:::
### 5. Result

### Code
[GitHub: PochariChun / rv32emu visualization](https://github.com/PochariChun/rv32emu)
[gist:visualization:rv_pyvisual.c](https://gist.github.com/PochariChun/124d8fe4d50db5afcabff61c19cbdf7b)
[gist:visualization:run_analysis.py](https://gist.github.com/PochariChun/5866a68d9f5500a50c6e43bd0d1167d4)
## Reference
* [ama-riscv-sim](https://github.com/AleksandarLilic/ama-riscv-sim)