# Achieving cross-platform stack determinism for PVF execution
## Overview
One of the biggest problems for Wasm VM implementations in the context of PVF execution is the non-deterministic call stack. Different hardware architectures impose different stack layout requirements, so the same code can hit a stack overflow at different execution points on different platforms. That is unacceptable for PVF execution, which must be fully deterministic.
This write-up describes a possible approach to that problem in the context of the [PVF executor](https://github.com/paritytech/pvf-executor) PoC implementation, which implements a naive stack machine and uses the machine stack as a mixed value/call stack.
## Minimal example
Let's consider the following Wasm code snippet. A function taking nine arguments loads them and calls another function that also takes nine arguments:
```wasm
(func (param i32) (param i32) (param i32) (param i32) (param i32) (param i32) (param i32) (param i32) (param i32)
local.get 0
local.get 1
...
local.get 8
call $other
)
```
Since the VM uses the actual machine stack, the stack layout on `x64` after the arguments are loaded looks like this:
![](https://hackmd.io/_uploads/SydfmeNWa.png)
All the functions compiled by this implementation are guaranteed to be ABI-compliant. The x64 ABI (System V) enforces the following rules: the first six arguments are passed in registers; the rest are passed on the stack in reverse order; the stack must be 16-byte aligned before the call. So, at this point, a tiny snippet of machine code is executed that loads the arguments into registers and transforms the stack frame in place to comply with the ABI[^1]:
![](https://hackmd.io/_uploads/ry4LQxN-T.png)
`old rsp` is the value of `rsp` before the frame transformation started. The offset to this value is known once the call has returned; thus, the whole call frame can be easily discarded after the call.
Now, let's consider the same example with the `aarch64` ABI. It requires us to pass the first eight arguments in registers; the rest are passed on the stack, and the stack must be 16-byte aligned at every execution point where `sp` is used to access memory, including the function call. So it could look like the following (I'm keeping the x64 frame alongside for easy comparison):
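To make the argument split concrete, here is a minimal sketch of the bookkeeping involved. It is not taken from the PoC: `Abi`, `split_args`, and `alignment_padding` are hypothetical names, and the model assumes that every value occupies one 64-bit slot and that stack depth is counted in qwords from a 16-byte aligned base.

```rust
/// Hypothetical, simplified description of a calling convention.
/// Every value is assumed to occupy one 64-bit slot.
#[derive(Clone, Copy)]
struct Abi {
    /// Number of leading arguments passed in registers.
    reg_args: usize,
}

/// x64: the first six arguments go to registers.
const X64: Abi = Abi { reg_args: 6 };

/// Splits `n_args` into (arguments in registers, arguments on the stack).
fn split_args(abi: Abi, n_args: usize) -> (usize, usize) {
    let in_regs = n_args.min(abi.reg_args);
    (in_regs, n_args - in_regs)
}

/// Returns how many qwords of padding are needed to make a stack depth
/// of `depth_qwords` 16-byte aligned. The depth is assumed to be measured
/// from a 16-byte aligned base, so only its parity matters.
fn alignment_padding(depth_qwords: usize) -> usize {
    depth_qwords % 2
}
```

For the nine-argument call from the example, `split_args(X64, 9)` returns `(6, 3)`: six values are moved to registers and three stay on the stack. If those three qwords sit on top of an aligned base, `alignment_padding(3)` asks for one more qword before the call.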
![](https://hackmd.io/_uploads/ry9Ktx4Za.png)
Uh, that doesn't look deterministic anymore. It could fail at a different execution point on ARM.
The trick is to comply not with the rules of a single ABI but with the intersection of all the ABIs supported by the implementation. That's not as easy as it sounds, but I believe it's possible. In this very case, it's enough to additionally pad the stack on `aarch64`:
![](https://hackmd.io/_uploads/H1VB9gNWa.png)
and the stack is deterministic again.
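One way to think about the padding is as "everyone pays the price of the most demanding target". The sketch below is a deliberately simplified model under my own assumptions, not the PoC's code: it ignores the in-place frame transformation and only counts the stack-passed arguments, rounded up to 16 bytes. `Target`, `natural_qwords`, `common_qwords`, and `extra_padding` are hypothetical names.

```rust
/// Hypothetical per-target description; one 64-bit slot per value.
#[derive(Clone, Copy)]
struct Target {
    /// Number of leading arguments passed in registers.
    reg_args: usize,
}

const X64: Target = Target { reg_args: 6 };
const AARCH64: Target = Target { reg_args: 8 };

/// Qwords a call with `n_args` arguments would occupy on `target` alone:
/// the stack-passed arguments, rounded up to a 16-byte (two-qword) boundary.
fn natural_qwords(target: Target, n_args: usize) -> usize {
    let stack_args = n_args.saturating_sub(target.reg_args);
    stack_args + stack_args % 2
}

/// The footprint every target must use: the maximum over all supported targets.
fn common_qwords(all: &[Target], n_args: usize) -> usize {
    all.iter()
        .map(|t| natural_qwords(*t, n_args))
        .max()
        .unwrap_or(0)
}

/// Extra qwords `target` must add to match the common footprint.
/// Assumes `target` is one of the entries in `all`.
fn extra_padding(target: Target, all: &[Target], n_args: usize) -> usize {
    common_qwords(all, n_args) - natural_qwords(target, n_args)
}
```

In this model, a nine-argument call needs four qwords on `x64` and two on `aarch64`, so `extra_padding(AARCH64, &[X64, AARCH64], 9)` is two qwords while `x64` needs none. The exact numbers in the real frames depend on details this model leaves out.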
## It's not that easy!
Truly, it's not. That was one simplified example; the real story goes much further. For instance, on function entry the return address is on the stack on `x64` but in a register on `aarch64`. The requirements for establishing the function stack frame also differ: `x64` needs a single qword on the stack, while `aarch64` needs either one or two qwords, depending on whether `LR` is stored on the stack. All those small details must be evaluated and intersected, and corresponding padding must be added where needed. Still, in the end, I believe it is possible to achieve cross-platform stack determinism with this approach.
[^1]: All the values are coerced to raw 64-bit values by this implementation, including floating-point values, so there's no difference in argument passing between different argument types.