# TERM PROJECT
###### tags: `Computer Architecture 2022`
## Intro
In this final project, I chose ckb-vm-aot as the subject of my analysis. First, I compiled it and made sure that the test code works. Then I went further and analyzed the internal implementation of the project. ckb-vm-aot compiles RISC-V programs ahead of time (AOT) into x86-64 code and executes them. It is written in Rust and built with tools such as cargo.
## Environment Setting
```
git clone https://github.com/mohanson/ckb-vm-aot.git
```
First, I downloaded the project to my home directory.

However, the tests could not be run yet because the VM did not recognize the `cargo` command.
```
porkbeeflamb@lcw-vm:~/ckb-vm-aot$ curl https://sh.rustup.rs -sSf | sh
```
```
info: downloading installer
Welcome to Rust!
This will download and install the official compiler for the Rust
programming language, and its package manager, Cargo.
Rustup metadata and toolchains will be installed into the Rustup
home directory, located at:
/home/porkbeeflamb/.rustup
This can be modified with the RUSTUP_HOME environment variable.
The Cargo home directory is located at:
/home/porkbeeflamb/.cargo
This can be modified with the CARGO_HOME environment variable.
The cargo, rustc, rustup and other commands will be added to
Cargo's bin directory, located at:
/home/porkbeeflamb/.cargo/bin
This path will then be added to your PATH environment variable by
modifying the profile files located at:
/home/porkbeeflamb/.profile
/home/porkbeeflamb/.bashrc
You can uninstall at any time with rustup self uninstall and
these changes will be reverted.
Current installation options:
default host triple: x86_64-unknown-linux-gnu
default toolchain: stable (default)
profile: default
modify PATH variable: yes
1) Proceed with installation (default)
2) Customize installation
3) Cancel installation
>1
info: profile set to 'default'
info: default host triple is x86_64-unknown-linux-gnu
warning: Updating existing toolchain, profile choice will be ignored
info: syncing channel updates for 'stable-x86_64-unknown-linux-gnu'
info: default toolchain set to 'stable-x86_64-unknown-linux-gnu'
stable-x86_64-unknown-linux-gnu unchanged - rustc 1.66.0 (69f9c33d7 2022-12-12)
Rust is installed now. Great!
To get started you may need to restart your current shell.
This would reload your PATH environment variable to include
Cargo's bin directory ($HOME/.cargo/bin).
To configure your current shell, run:
source "$HOME/.cargo/env"
```
So I installed Rust using rustup.
```
porkbeeflamb@lcw-vm:~/ckb-vm-aot$ rustc --version
rustc 1.66.0 (69f9c33d7 2022-12-12)
porkbeeflamb@lcw-vm:~/ckb-vm-aot$ cargo --version
cargo 1.66.0 (d65d197ad 2022-11-15)
```
Then I ran these two commands to confirm that the installation succeeded.
## Compile and Run
```
porkbeeflamb@lcw-vm:~/ckb-vm-aot$ make ci
cargo fmt --all -- --check
cargo check --all --all-targets --all-features
Finished dev [unoptimized + debuginfo] target(s) in 0.03s
cargo clippy --all -- -D warnings -D clippy::clone_on_ref_ptr -D clippy::enum_glob_use -A clippy::collapsible-else-if -A clippy::upper_case_acronyms -A clippy::unusual_byte_groupings -A clippy::inconsistent_digit_grouping -A clippy::large_digit_groups -A clippy::suspicious_operation_groupings
Finished dev [unoptimized + debuginfo] target(s) in 0.08s
cargo test --all -- --nocapture
Finished test [unoptimized + debuginfo] target(s) in 0.13s
Running unittests src/lib.rs (target/debug/deps/ckb_vm_aot-f42f067cdb190725)
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running tests/machine_build.rs (target/debug/deps/machine_build-bd6acd7561fcdc50)
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running tests/test_aot.rs (target/debug/deps/test_aot-339f932553a4c107)
running 21 tests
test test_aot_alloc_many ... ok
test test_aot_cycles_overflow ... ok
test test_aot_ebreak ... ok
test test_aot_flat_crash_64 ... ok
test test_aot_invalid_read64 ... ok
test test_aot_jump0 ... ok
test test_aot_load_elf_crash_64 ... ok
test test_aot_load_elf_section_crash_64 ... ok
test test_aot_load_malformed_elf_crash_64 ... ok
test test_aot_misaligned_jump64 ... ok
test test_aot_mulw64 ... ok
test test_aot_outofcycles_in_syscall ... ok
test test_aot_rvc_pageend ... ok
test test_aot_simple64 ... ok
test test_aot_simple_cycles ... ok
test test_aot_simple_max_cycles_reached ... ok
test test_aot_trace ... ok
test test_aot_with_custom_syscall ... ok
test test_aot_write_large_address ... ok
test test_aot_wxorx_crash_64 ... ok
test test_aot_chaos_seed ... ok
test result: ok. 21 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.06s
Running tests/test_b_extension.rs (target/debug/deps/test_b_extension-a50c2dbe6a96a6d8)
running 6 tests
test test_clmul_bug ... ok
test test_clzw_bug ... ok
test test_orc_bug ... ok
test test_pcnt ... ok
test test_rorw_in_end_of_aot_block ... ok
test test_sbinvi_aot_load_imm_bug ... ok
test result: ok. 6 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.01s
Running tests/test_mop.rs (target/debug/deps/test_mop-65f64fa245e161a1)
running 10 tests
test test_mop_far_jump ... ok
test test_mop_adc ... ok
test test_mop_ld_signextend_32_overflow_bug ... ok
test test_mop_sbb ... ok
test test_mop_random_adc_sbb ... ok
test test_mop_wide_div_zero ... ok
test test_mop_secp256k1 ... ok
test test_mop_wide_mul_zero ... ok
test test_mop_wide_divide ... ok
test test_mop_wide_multiply ... ok
test result: ok. 10 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.09s
Running tests/test_reset.rs (target/debug/deps/test_reset-1b843c2b1d124910)
running 1 test
test test_reset_aot ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Doc-tests ckb-vm-aot
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
git diff --exit-code Cargo.lock
```
After the installation finished, I ran `make ci`, which compiles the Rust files and runs the five test files machine_build.rs, test_aot.rs, test_b_extension.rs, test_mop.rs and test_reset.rs in turn. These tests run the programs in the tests/programs folder.
## Self generated RISCV execution code

First, I wrote a file mysimple.c that checks whether the ^ (XOR) operation computes correctly on int and uint64_t values.
```
riscv-none-elf-gcc -march=rv64i -mabi=lp64 -o mysimple64 mysimple.c
```

Then I used the command above to build the RV64 executable mysimple64.

After that, I created a Rust file mytest.rs in the tests folder, which runs the mysimple64 program (lines 10 to 19) and collects the result. According to the Rust code, if the operations are correct, the return value should be 0. This is checked on line 21.

If I change the expected value on line 21 from 0 to 1, the test fails.

After changing it back, the test passes.
## Specify the load file through the command line


Since my test function is named test_mysimple64, I could modify the Makefile to filter out the other tests by replacing the --all argument.

Then I modified mytest.rs so that it accepts the file name as an argument, using the std::env module.

I set a variable in the Makefile and created a new make target that runs only the mysimple64 test and filters out the other test functions.
```
make mytest var=KFCYUMMY
```

If I set var to KFCYUMMY, the test fails with an error because the file KFCYUMMY doesn't exist.
```
make mytest var=mysimple64
```


If I set var to the correct file name mysimple64, the test succeeds. The other tests are filtered out because of the Makefile change.
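The two outcomes above (missing file versus existing file) can be illustrated with a minimal, standard-library-only sketch. This is not the actual mytest.rs: the helper names `program_path` and `load_program` and the environment variable name `PROGRAM` are assumptions made for illustration, and the sketch only reads the file instead of running it in the VM.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Builds the path of a test program inside a programs directory.
/// (Hypothetical helper, not part of ckb-vm-aot.)
fn program_path(dir: &Path, name: &str) -> PathBuf {
    dir.join(name)
}

/// Reads the program bytes; a missing file surfaces as an io::Error.
/// (Hypothetical helper, not part of ckb-vm-aot.)
fn load_program(dir: &Path, name: &str) -> io::Result<Vec<u8>> {
    fs::read(program_path(dir, name))
}

fn main() {
    // "PROGRAM" is an assumed variable name; fall back to mysimple64 when unset.
    let name = env::var("PROGRAM").unwrap_or_else(|_| "mysimple64".to_string());
    match load_program(Path::new("tests/programs"), &name) {
        Ok(bytes) => println!("loaded {} ({} bytes)", name, bytes.len()),
        Err(e) => println!("cannot load {}: {}", name, e),
    }
}
```

Returning `io::Result` keeps the "file not found" case as an ordinary error value, so a test can decide whether to fail or to report it.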
## Disassemble the execution file
```
riscv-none-elf-objdump -d mysimple64 > asm.txt
```
I disassembled the executable into assembly code with the command above.
```mipsasm=
00000000000101d8 <main>:
101d8: fe010113 addi sp,sp,-32
101dc: 00813c23 sd s0,24(sp)
101e0: 02010413 addi s0,sp,32
101e4: fe042623 sw zero,-20(s0)
101e8: fec42783 lw a5,-20(s0)
101ec: 00f7879b addiw a5,a5,15
101f0: fef42623 sw a5,-20(s0)
101f4: fec42783 lw a5,-20(s0)
101f8: 00f7c793 xori a5,a5,15
101fc: fef42623 sw a5,-20(s0)
10200: fec42783 lw a5,-20(s0)
10204: 0007879b sext.w a5,a5
10208: 00078663 beqz a5,10214 <main+0x3c>
1020c: 00100793 li a5,1
10210: 02c0006f j 1023c <main+0x64>
10214: fe043023 sd zero,-32(s0)
10218: fe043783 ld a5,-32(s0)
1021c: 00f7c793 xori a5,a5,15
10220: fef43023 sd a5,-32(s0)
10224: fe043703 ld a4,-32(s0)
10228: 00f00793 li a5,15
1022c: 00f70663 beq a4,a5,10238 <main+0x60>
10230: 00200793 li a5,2
10234: 0080006f j 1023c <main+0x64>
10238: 00000793 li a5,0
1023c: 00078513 mv a0,a5
10240: 01813403 ld s0,24(sp)
10244: 02010113 addi sp,sp,32
10248: 00008067 ret
```
The main function in the disassembly looks like the listing above.
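To check that the binary tests what was intended, the control flow of `main` can be read back from the listing. The Rust sketch below mirrors that flow. It is a reconstruction from the assembly, not the original mysimple.c, and the function name `mysimple_main` is hypothetical.

```rust
/// Mirrors the control flow of `main` in the disassembly.
/// Return values 1 and 2 mark which check failed; 0 means both passed.
fn mysimple_main() -> i8 {
    // sw zero,-20(s0) ; addiw a5,a5,15 ; xori a5,a5,15
    let mut x: i32 = 0;
    x = x.wrapping_add(15);
    x ^= 15;
    if x != 0 {
        return 1; // li a5,1 ; j 1023c
    }

    // sd zero,-32(s0) ; xori a5,a5,15 ; compare with li a5,15
    let mut y: u64 = 0;
    y ^= 15;
    if y != 15 {
        return 2; // li a5,2 ; j 1023c
    }

    0 // li a5,0 ; mv a0,a5
}

fn main() {
    println!("exit code = {}", mysimple_main());
}
```

Both XOR checks hold, so the function returns 0, which matches the value the Rust test expects from the VM.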
## Debugging ckb-vm-aot with GDB

First, I moved my Rust test code mytest.rs to the examples folder under ckb-vm-aot.

Then I revised the Rust code, changing the test function into a main function so that it can be compiled as an example.

I compiled my Rust code and got the executable.


The executable is created in the target/debug/examples folder, and I copied the mysimple64 RV64 file there for the later runs.

The run fails if the file named by the argument doesn't exist.

The run succeeds if the file named by the argument exists.

Now we can prepare for debugging with GDB. First, we check that rust-gdb exists.

Then we let rust-gdb read the symbols of mytest.


Then we can inspect the key symbols in the Rust code. I set breakpoints on main and run (line 21).

The first main is not the main function of mytest.rs, so I just continued. The second main is our target.

```
layout src
layout split
```
Typing the two commands above lets you follow the corresponding source code and assembly code side by side.

If you lose track of the current position in the assembly code, you can print the value of $rip (x86-64) or $pc (generic) to get the current program counter.
## Analyze CKB-VM-AOT


My goal is to analyze the behavior of the AOT machine, so I continued and stopped at the AotMachine::run function.
```rust=
pub fn run(&mut self) -> Result<i8, Error> {
    if self.machine.isa() & ISA_MOP != 0 && self.machine.version() == VERSION0 {
        return Err(Error::InvalidVersion);
    }
    let mut decoder = build_decoder::<u64>(self.machine.isa(), self.machine.version());
    self.machine.set_running(true);
    while self.machine.running() {
        if self.machine.reset_signal() {
            decoder.reset_instructions_cache();
            self.aot_code = None;
        }
        let result = if let Some(aot_code) = &self.aot_code {
            if let Some(offset) = aot_code.labels.get(self.machine.pc()) {
                let base_address = aot_code.base_address();
                let offset_address = base_address + u64::from(*offset);
                let f = unsafe {
                    std::mem::transmute::<u64, fn(*mut AsmCoreMachine, u64) -> u8>(base_address)
                };
                f(&mut (**self.machine.inner_mut()), offset_address)
            } else {
                unsafe { ckb_vm_x64_execute(&mut (**self.machine.inner_mut())) }
            }
        } else {
            unsafe { ckb_vm_x64_execute(&mut (**self.machine.inner_mut())) }
        };
        match result {
            RET_DECODE_TRACE => {
                let pc = *self.machine.pc();
                let slot = calculate_slot(pc);
                let mut trace = Trace::default();
                let mut current_pc = pc;
                let mut i = 0;
                while i < TRACE_ITEM_LENGTH {
                    let instruction = decoder.decode(self.machine.memory_mut(), current_pc)?;
                    let end_instruction = is_basic_block_end_instruction(instruction);
                    current_pc += u64::from(instruction_length(instruction));
                    trace.instructions[i] = instruction;
                    trace.cycles += self.machine.instruction_cycle_func()(instruction);
                    let opcode = extract_opcode(instruction);
                    // Here we are calculating the absolute address used in direct threading
                    // from label offsets.
                    trace.thread[i] = unsafe {
                        u64::from(
                            *(ckb_vm_asm_labels as *const u32).offset(opcode as u8 as isize),
                        ) + (ckb_vm_asm_labels as *const u32 as u64)
                    };
                    i += 1;
                    if end_instruction {
                        break;
                    }
                }
                trace.instructions[i] = blank_instruction(OP_CUSTOM_TRACE_END);
                trace.thread[i] = unsafe {
                    u64::from(
                        *(ckb_vm_asm_labels as *const u32).offset(OP_CUSTOM_TRACE_END as isize),
                    ) + (ckb_vm_asm_labels as *const u32 as u64)
                };
                trace.address = pc;
                trace.length = (current_pc - pc) as u8;
                self.machine.inner_mut().traces[slot] = trace;
            }
            RET_ECALL => self.machine.ecall()?,
            RET_EBREAK => self.machine.ebreak()?,
            RET_DYNAMIC_JUMP => (),
            RET_MAX_CYCLES_EXCEEDED => return Err(Error::CyclesExceeded),
            RET_CYCLES_OVERFLOW => return Err(Error::CyclesOverflow),
            RET_OUT_OF_BOUND => return Err(Error::MemOutOfBound),
            RET_INVALID_PERMISSION => return Err(Error::MemWriteOnExecutablePage),
            RET_SLOWPATH => {
                let pc = *self.machine.pc() - 4;
                let instruction = decoder.decode(self.machine.memory_mut(), pc)?;
                execute_instruction(instruction, &mut self.machine)?;
            }
            _ => return Err(Error::Asm(result)),
        }
    }
    Ok(self.machine.exit_code())
}
```
This is the run function of the AOT machine.
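The first decision inside the loop is whether the current pc has precompiled code: if `aot_code` holds a label for the pc, execution enters the AOT code at `base_address + offset`, otherwise it falls back to `ckb_vm_x64_execute`. The toy model below isolates just that decision. The types `AotCode` and `Dispatch` and the function `choose_path` are simplified stand-ins written for this report, not the real ckb-vm-aot types, and no machine code is executed.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the AOT result: a base address plus
/// a map from RISC-V pc to an offset into the generated code.
struct AotCode {
    base_address: u64,
    labels: HashMap<u64, u32>,
}

/// Which execution path the run loop would take for a given pc.
#[derive(Debug, PartialEq)]
enum Dispatch {
    /// Enter the precompiled code at this absolute address.
    Aot { entry: u64 },
    /// No label for this pc (or no AOT code at all): use the interpreter.
    Interpreter,
}

fn choose_path(aot_code: Option<&AotCode>, pc: u64) -> Dispatch {
    if let Some(code) = aot_code {
        if let Some(offset) = code.labels.get(&pc) {
            return Dispatch::Aot {
                entry: code.base_address + u64::from(*offset),
            };
        }
    }
    Dispatch::Interpreter
}

fn main() {
    // Made-up numbers, only to exercise both branches.
    let mut labels = HashMap::new();
    labels.insert(0x100e8, 0x40);
    let code = AotCode { base_address: 0x1000, labels };

    println!("{:?}", choose_path(Some(&code), 0x100e8));
    println!("{:?}", choose_path(Some(&code), 0x100ec));
    println!("{:?}", choose_path(None, 0x100e8));
}
```

In the real function both the missing-label case and the missing-`aot_code` case end in the same `ckb_vm_x64_execute` call, which is why the model folds them into one `Interpreter` variant.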



Now we add two breakpoints (breakpoints 3 and 4) on decode and execute_instruction, which are central to the execution of the AOT machine.


After continuing four times, we can see that the two decoded and executed instructions are at pc=65768 (dec) and pc=65772 (dec) respectively.

In the disassembly of mysimple64, the first and second instructions are at pc=100e8 (hex) and pc=100ec (hex), which are exactly 65768 (dec) and 65772 (dec) respectively. So the program is loaded correctly.
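GDB prints the pc in decimal while objdump prints addresses in hexadecimal, so it helps to convert between the two when comparing them. A small sketch (the helper name `to_hex` is just for illustration):

```rust
/// Formats a pc value the way objdump shows addresses, with a 0x prefix.
fn to_hex(pc: u64) -> String {
    format!("{:#x}", pc)
}

fn main() {
    // The pc values observed in GDB.
    for pc in [65768u64, 65772, 65776] {
        println!("{} = {}", pc, to_hex(pc));
    }
}
```

The same conversion can be done inside GDB with `p/x`, which prints a value in hexadecimal.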
### decode operation



Let us focus on the instruction at pc=65776 (hex=100f0). The decode function selects decode_raw as its decoding path.



The decode_raw function transforms instruction_bits (the machine code at pc=65776, whose hex value is 0x813023 as shown above) into the instruction value 0x20200082c (hex).
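To see what the raw machine code 0x813023 means before ckb-vm repacks it, it can be split by hand using the standard RISC-V S-type (store) field layout. The sketch below is not ckb-vm's decoder. `SType` and `decode_stype` are names made up for this illustration, and only the bit positions come from the RISC-V base ISA.

```rust
/// Fields of a RISC-V S-type (store) instruction.
#[derive(Debug, PartialEq)]
struct SType {
    opcode: u32, // bits 6..0
    funct3: u32, // bits 14..12 (3 selects SD in the STORE group)
    rs1: u32,    // bits 19..15, base address register
    rs2: u32,    // bits 24..20, register being stored
    imm: i32,    // bits 31..25 and 11..7, sign-extended offset
}

fn decode_stype(bits: u32) -> SType {
    let imm_lo = (bits >> 7) & 0x1f;
    let imm_hi = (bits >> 25) & 0x7f;
    // Assemble the 12-bit immediate, then sign-extend it to 32 bits.
    let imm = (((imm_hi << 5) | imm_lo) as i32) << 20 >> 20;
    SType {
        opcode: bits & 0x7f,
        funct3: (bits >> 12) & 0x7,
        rs1: (bits >> 15) & 0x1f,
        rs2: (bits >> 20) & 0x1f,
        imm,
    }
}

fn main() {
    // Machine code at pc=0x100f0 as seen in GDB.
    println!("{:?}", decode_stype(0x0081_3023));
    // `sd s0,24(sp)` from the main listing, for comparison.
    println!("{:?}", decode_stype(0x0081_3c23));
}
```

For 0x813023 this gives opcode 0x23 (STORE), funct3 3 (SD), rs1 = x2 (sp), rs2 = x8 (s0) and offset 0, that is `sd s0,0(sp)`.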
### execute_instruction operation


In the execute_instruction function, our argument inst is 8623491116 (dec), which is 20200082c (hex) as above.


After the match statement, the op value is 44, so execution jumps to the OP_SD arm and executes it.


We can confirm that 44 is indeed defined as the SD operation and that the instruction at pc=100f0 (hex) is an sd instruction.
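The numbers seen in GDB can be cross-checked by hand. The sketch below confirms that 8623491116 is 0x20200082c and that its low byte is 44 (0x2c), the op value observed in the match. Treating the low byte as the opcode is based only on what was observed here, and `low_byte` is a hypothetical helper, not a ckb-vm function. The remaining bytes are printed for inspection without claiming what ckb-vm stores in them.

```rust
/// Low byte of the packed instruction value, which matched the op value 44 in GDB.
fn low_byte(inst: u64) -> u8 {
    (inst & 0xff) as u8
}

fn main() {
    let inst: u64 = 8623491116; // value of `inst` seen in execute_instruction
    println!("inst = {:#x}", inst);
    println!("op   = {}", low_byte(inst));

    // Dump every byte, least significant first, for manual inspection.
    for (i, b) in inst.to_le_bytes().iter().enumerate() {
        println!("byte {} = {:#04x}", i, b);
    }
}
```

In the dump, bytes 1 and 4 are 0x08 and 0x02, the same numbers as registers s0 (x8) and sp (x2) used by this sd instruction. That suggests where the register indices are kept, but it is an inference from a single value, not something checked against the ckb-vm source.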
## Summary
From the observed behavior of CKB-VM-AOT, we can see that it reads instructions one by one along the program counter, decodes them, and executes each one through the handler for its operation. It does deliver the functionality it claims.