These are notes on what I've learned from roughly a week of using debug_traceCall while building a simplistic tx simulator. They are by no means complete and may contain errors.
The best-known and simplest way to simulate the result of a message call is the standard eth_call RPC method. If all you care about is the return value of a tx, or what a specific set of storage slot values might be at the end of execution, it's great! If you're building an application that needs finer-grained detail about the internals of the tx (e.g. an indexer or explorer), eth_call isn't going to cut it.
This is where debug_traceCall comes in.
Yes, it comes with your favorite eth_call features, like state overrides and the ability to choose which parent block to execute on. It also has features of its own, including toggles for returning the current frame's stack, memory, storage, and return data for every opcode during execution. Some applications might find that granularity overwhelming (e.g. an indexer that only cares about the internal calls or logs that a specific call could produce). Luckily, the RPC method also supports passing a custom tracer.
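To make the options above concrete, here is a rough sketch of a debug_traceCall request body. The addresses are placeholders, the buildTraceCall helper is a name I made up for illustration, and the option names (enableMemory, enableReturnData, disableStack, disableStorage, stateOverrides) are the ones I understand geth's trace config to accept, so double-check them against your node version.

```javascript
// Build a debug_traceCall JSON-RPC payload.
// Params are [call object, block identifier, trace config].
// buildTraceCall is a hypothetical helper; the addresses are placeholders.
function buildTraceCall(call, block, config) {
  return { jsonrpc: "2.0", id: 1, method: "debug_traceCall", params: [call, block, config] };
}

const sender = "0x1111111111111111111111111111111111111111";

const request = buildTraceCall(
  {
    from: sender,
    to: "0x2222222222222222222222222222222222222222",
    data: "0x",
  },
  "latest",
  {
    // Per-opcode logger toggles (assumed option names, see the note above).
    enableMemory: true,
    enableReturnData: true,
    disableStack: false,
    disableStorage: false,
    // Pretend the sender holds 1 ETH (10^18 wei) for this simulated call only.
    stateOverrides: {
      [sender]: { balance: "0xde0b6b3a7640000" },
    },
  }
);

console.log(JSON.stringify(request, null, 2));
```

The payload can then be POSTed to any node that exposes the debug namespace.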
The basic structure of the passed tracer object is as follows:
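A minimal sketch of that shape, assuming the step/fault/result interface described in these notes. The steps counter is a hypothetical field added only so the skeleton returns something.

```javascript
// Skeleton of a custom tracer object. When sent over RPC, the object literal
// is passed as a string in the `tracer` field of the trace config.
const tracer = {
  steps: 0, // hypothetical field, just so there is something to report

  // Called for every opcode the EVM executes.
  step: function (log, db) {
    this.steps++;
  },

  // Called when execution hits an error.
  fault: function (log, db) {},

  // Whatever this returns is serialized to JSON and sent back to the caller.
  result: function (ctx, db) {
    return { steps: this.steps };
  },
};
```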
step defines a function that is called for every step of the EVM. The log argument exposes a lot of information, including the current opcode (log.op), information about the current frame (log.contract), and the current stack and memory (log.stack and log.memory). Those fields have their own sets of methods, and I'd encourage you to take a look at the geth documentation for more information. Two of the more useful methods are log.memory.slice(start, end) and log.stack.peek(idx), which let you copy a range of memory and read items on the stack, respectively. The db argument allows access to storage and isn't limited to the current frame of execution.
fault defines a function that is called when EVM execution encounters an error. log.getError() can be used to get more information about the error in question.
result defines the JSON-serializable output to be returned from the RPC method.
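Here is a sketch that puts log.stack.peek, log.memory.slice, and log.getError to work: it records every SSTORE, the bytes handed to RETURN, and the first fault. The writes, returned, error, and hex fields are hypothetical names of mine, and mockLog is a stand-in for the log object geth would supply, included only so the example runs outside a node.

```javascript
const tracer = {
  writes: [],     // slot/value pairs seen at SSTORE
  returned: null, // hex of the bytes passed to the last RETURN
  error: null,    // details of the fault, if any

  // Hex-encode a byte array without relying on anything outside this object.
  hex: function (bytes) {
    var out = "0x";
    for (var i = 0; i < bytes.length; i++) {
      out += (bytes[i] < 16 ? "0" : "") + bytes[i].toString(16);
    }
    return out;
  },

  step: function (log, db) {
    var op = log.op.toString();
    if (op === "SSTORE") {
      // SSTORE takes the slot key from the top of the stack, then the value.
      this.writes.push({
        slot: "0x" + log.stack.peek(0).toString(16),
        value: "0x" + log.stack.peek(1).toString(16),
      });
    } else if (op === "RETURN") {
      // RETURN takes the memory offset from the top of the stack, then the length.
      var offset = Number(log.stack.peek(0).toString());
      var length = Number(log.stack.peek(1).toString());
      this.returned = this.hex(log.memory.slice(offset, offset + length));
    }
  },

  fault: function (log, db) {
    this.error = { op: log.op.toString(), pc: log.getPC(), message: String(log.getError()) };
  },

  result: function (ctx, db) {
    return { writes: this.writes, returned: this.returned, error: this.error };
  },
};

// Stand-in for geth's log object (assumption: only the methods used above).
function mockLog(opName, stack, memory, error) {
  return {
    op: { toString: function () { return opName; } },
    // Index 0 is the top of the stack, i.e. the last array element.
    stack: { peek: function (i) { return stack[stack.length - 1 - i]; } },
    memory: { slice: function (start, end) { return memory.slice(start, end); } },
    getPC: function () { return 42; },
    getError: function () { return error; },
  };
}

tracer.step(mockLog("SSTORE", [7n, 1n], new Uint8Array(0)), null);
tracer.step(mockLog("RETURN", [2n, 1n], Uint8Array.of(0x00, 0xab, 0xcd, 0x00)), null);
tracer.fault(mockLog("POP", [], new Uint8Array(0), "stack underflow"), null);
console.log(JSON.stringify(tracer.result({}, null)));
```

Inside geth the stack values are big integers rather than native BigInt, which is why the sketch goes through toString instead of doing arithmetic on them directly.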
Finally, you can also specify enter and exit properties in addition to, or instead of, step (more information on which can be found here).
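As a rough illustration of enter and exit, the sketch below records each internal call's type, nesting depth, gas used, and error without touching individual opcodes. It assumes the frame objects expose getType, getGasUsed, and getError as I understand the geth docs to describe, and that enter and exit are defined together. The calls and open fields, and the frame/frameResult mocks, are hypothetical and exist only for the example.

```javascript
const tracer = {
  calls: [], // flat list of internal calls, in order of entry
  open: [],  // indices into `calls` for frames that have not exited yet

  // Called when execution steps into an internal call.
  enter: function (frame) {
    this.open.push(this.calls.length);
    this.calls.push({
      type: frame.getType(),
      depth: this.open.length, // 1 for calls made directly by the top-level frame
      gasUsed: null,
      error: null,
    });
  },

  // Called when the most recently entered call returns.
  exit: function (res) {
    var idx = this.open.pop();
    this.calls[idx].gasUsed = res.getGasUsed();
    var err = res.getError();
    if (err) {
      this.calls[idx].error = String(err);
    }
  },

  fault: function (log, db) {},

  result: function (ctx, db) {
    return this.calls;
  },
};

// Stand-ins for the objects geth would pass to enter and exit.
function frame(type) {
  return { getType: function () { return type; } };
}
function frameResult(gasUsed, error) {
  return {
    getGasUsed: function () { return gasUsed; },
    getError: function () { return error; },
  };
}

tracer.enter(frame("CALL"));
tracer.enter(frame("STATICCALL"));
tracer.exit(frameResult(100, undefined));            // STATICCALL returns
tracer.exit(frameResult(500, "execution reverted")); // outer CALL reverts
console.log(JSON.stringify(tracer.result({}, null)));
```

For an indexer that only cares about internal calls, this avoids paying for a step callback on every opcode.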
All of the above is covered in the geth docs, but there are a few features I've found useful that aren't: passing non-standard fields as part of the tracer, and using the global JS functions bundled with the tracer JS module in geth.
Passing non-standard fields as part of the RPC call is useful if you want to persist data or computations across steps, and is probably a key reason many people use a custom tracer. To take a concrete example from the geth codebase, imagine you want to trace a transaction and build a histogram of the opcodes used during execution (this comes directly from the codebase; I did not write it).
You can define arbitrary fields as part of the tracer and use them in the step/fault/result functions. In this case, the geth team defined hist as a map whose keys are opcodes and whose values are the number of times each was encountered. The map is updated each time step is called and returned via result.
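A sketch of such a histogram tracer follows. It is written from the description above rather than copied from geth, so treat the details as mine. The loop at the bottom feeds it a hand-written opcode sequence in place of a real trace.

```javascript
const tracer = {
  // hist maps opcode name -> number of times it was executed.
  hist: {},

  step: function (log, db) {
    var op = log.op.toString();
    this.hist[op] = (this.hist[op] || 0) + 1;
  },

  fault: function (log, db) {},

  result: function (ctx, db) {
    return this.hist;
  },
};

// Outside geth: simulate five steps with a minimal stand-in for `log`.
["PUSH1", "PUSH1", "MSTORE", "PUSH1", "RETURN"].forEach(function (name) {
  tracer.step({ op: { toString: function () { return name; } } }, null);
});

console.log(tracer.result({}, null)); // { PUSH1: 3, MSTORE: 1, RETURN: 1 }
```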
Perhaps the feature I've seen the least documentation or awareness of is that a few global functions are available to all JS tracers. The one I've used most frequently is toHex, which makes addresses in responses much easier to read: a single hex string versus the raw bytes you get without it.
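To illustrate, here is a small tracer that collects every contract address that executed code, hex-encoded with toHex. Inside geth, toHex is already a global; the function defined at the top is only a stand-in of mine so the sketch runs in Node, and the seen field is a hypothetical name.

```javascript
// Stand-in for geth's global toHex. Remove this when sending the tracer to geth.
function toHex(bytes) {
  var out = "0x";
  for (var i = 0; i < bytes.length; i++) {
    out += (bytes[i] < 16 ? "0" : "") + bytes[i].toString(16);
  }
  return out;
}

const tracer = {
  seen: {}, // hex address -> true, used as a set

  step: function (log, db) {
    // getAddress() returns raw bytes; toHex turns them into a readable string.
    this.seen[toHex(log.contract.getAddress())] = true;
  },

  fault: function (log, db) {},

  result: function (ctx, db) {
    return Object.keys(this.seen);
  },
};

// Outside geth: two steps inside the same placeholder contract.
const address = new Uint8Array(20).fill(0x11);
const log = { contract: { getAddress: function () { return address; } } };
tracer.step(log, null);
tracer.step(log, null);
console.log(tracer.result({}, null));
```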
Everything above also applies to debug_traceTransaction, which can be used to trace a transaction that has already been executed. The geth docs cover both debug_traceCall and debug_traceTransaction.
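Since both methods take the tracer as a string in their config object, the same tracer source can be reused for simulating a call and for replaying a mined transaction. The sketch below only builds the two request bodies. The rpc helper, the address, and the transaction hash are placeholders of mine.

```javascript
// The tracer is sent as source text, not as a live object.
const tracerSource = `{
  count: 0,
  step: function (log, db) { this.count++; },
  fault: function (log, db) {},
  result: function (ctx, db) { return this.count; }
}`;

// Hypothetical helper that wraps a method and params in a JSON-RPC envelope.
function rpc(method, params) {
  return { jsonrpc: "2.0", id: 1, method: method, params: params };
}

// Simulate a call on top of the latest block: [call object, block, config].
const simulate = rpc("debug_traceCall", [
  { to: "0x2222222222222222222222222222222222222222", data: "0x" },
  "latest",
  { tracer: tracerSource },
]);

// Replay an already-executed transaction: [tx hash, config].
const placeholderTxHash = "0x" + "ab".repeat(32);
const replay = rpc("debug_traceTransaction", [placeholderTxHash, { tracer: tracerSource }]);

console.log(JSON.stringify([simulate, replay], null, 2));
```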
If I missed anything or you wanna chat more on this or literally anything else, feel free to DM on twitter or email me at rajiv@framework.ventures