debug_traceCall

These are notes on stuff I've learned from using debug_traceCall for around a week while building a simplistic tx simulator. They are by no means complete and may contain errors.

What it does and why it's useful

The best-known and simplest way to simulate the result of a message call is the standard eth_call RPC method. If all you care about is the return value of a tx or what a specific set of storage slot values might be at the end of execution, it's great! If you're building an application that needs finer-grained detail about the internals of the tx (e.g. an indexer/explorer), eth_call isn't going to cut it.

This is where debug_traceCall comes in.

Yes, it comes with your favorite eth_call features, like the ability to choose which parent block to execute on and to apply state overrides. It also has its own features, including the ability to toggle returning the current frame's stack, memory, storage, and returndata for every opcode during execution. Some applications might find that granularity overwhelming (e.g. an indexer that only cares about the internal calls or logs produced by a specific call). Luckily, the RPC method also supports passing a custom tracer.
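To make the built-in toggles concrete, here is a rough sketch of what a request payload might look like. The option names (enableMemory, disableStack, disableStorage, enableReturnData, stateOverrides) are the ones I understand geth to accept, so double check them against your node's docs. The buildTraceCallRequest helper, the address, and the balance are made up for illustration, and nothing is sent over the network.

```javascript
// Hypothetical helper: wraps the three debug_traceCall params
// (call object, block, trace options) in a JSON-RPC envelope.
function buildTraceCallRequest(call, block, options) {
  return {
    jsonrpc: "2.0",
    id: 1,
    method: "debug_traceCall",
    params: [call, block, options],
  };
}

// Placeholder address, not a real contract of interest.
const target = "0x000000000000000000000000000000000000dEaD";

const request = buildTraceCallRequest(
  { to: target, data: "0x" },
  "latest",
  {
    // Assumed geth option names for the per-opcode logger.
    enableMemory: true,
    disableStack: false,
    disableStorage: false,
    enableReturnData: true,
    // Same idea as eth_call state overrides: pretend the target holds 1 ETH.
    stateOverrides: {
      [target]: { balance: "0xde0b6b3a7640000" },
    },
  }
);

console.log(JSON.stringify(request, null, 2));
```

The resulting object can be POSTed to a node that exposes the debug namespace using whatever HTTP or RPC client you already have.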

A custom tracer

The basic structure of the passed tracer object is as follows:

    {
        step: function(log, db),
        fault: function(log, db),
        result: function(ctx, db)
    }

step defines a function that is called for every step of the EVM. The log exposes a lot of information, including the current opcode (log.op), information about the current frame (log.contract), and the current stack and memory space (log.stack and log.memory). Those fields have their own sets of methods, and I'd encourage you to take a look at the geth documentation for more information. Two of the more useful methods are log.memory.slice(start, end) and log.stack.peek(idx), which let you copy a region of memory and read items on the stack, respectively. db allows access to storage and isn't limited to the current frame of execution.

fault defines a function that is called when EVM execution encounters an error. log.getError() can be used to get more information about the error in question.

result defines a function that returns the JSON-serializable output of the RPC method.
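Here is a minimal sketch of how the three functions fit together: a tracer that records the slot key of every SLOAD and any error message. It assumes step runs before the opcode executes (so the key is still on top of the stack) and that log.stack.peek returns a big-integer object with toString(16). The mockLog helper is purely hypothetical and only exists so the tracer can be exercised outside of geth.

```javascript
// Tracer in the step/fault/result shape, plus two fields of our own.
const sloadTracer = {
  sloads: [],
  errors: [],
  step: function (log, db) {
    if (log.op.toString() === "SLOAD") {
      // Assumption: the slot key is the top stack item at this point.
      this.sloads.push(log.stack.peek(0).toString(16));
    }
  },
  fault: function (log, db) {
    this.errors.push(String(log.getError()));
  },
  result: function (ctx, db) {
    return { sloads: this.sloads, errors: this.errors };
  },
};

// Hypothetical stand-in for the log object geth passes in.
function mockLog(opName, stackItems, error) {
  return {
    op: { toString: () => opName },
    stack: { peek: (idx) => stackItems[idx] },
    getError: () => error,
  };
}

// Drive the tracer by hand the way the EVM would.
sloadTracer.step(mockLog("PUSH1", [1n]), null);
sloadTracer.step(mockLog("SLOAD", [255n]), null);
sloadTracer.fault(mockLog("SLOAD", [0n], "out of gas"), null);

const sloadOutput = sloadTracer.result({}, null);
console.log(sloadOutput);
```

Only the sloadTracer object would be sent to the node; the mock and the manual calls are just a local test harness.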

Finally, there's also the ability to specify enter and exit properties in addition to, or instead of, step (more information on these can be found in the geth docs).
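As a rough sketch of the enter/exit style, the tracer below records the type of each internal call and fills in its gas usage when the frame returns. The method names (getType, getGasUsed, getError) are what I understand geth to expose on the frame objects, so treat them as assumptions. The mock frames at the bottom are hypothetical and only there to run the tracer locally.

```javascript
// Tracer that only looks at call frames, never at individual opcodes.
const internalCallTracer = {
  calls: [], // one entry per internal call, in the order entered
  open: [],  // indexes into calls for frames that have not exited yet
  enter: function (frame) {
    this.open.push(this.calls.length);
    this.calls.push({ type: frame.getType(), gasUsed: null, error: null });
  },
  exit: function (frameResult) {
    // The most recently entered frame is the one exiting.
    var call = this.calls[this.open.pop()];
    call.gasUsed = frameResult.getGasUsed();
    call.error = frameResult.getError() || null;
  },
  fault: function (log, db) {},
  result: function (ctx, db) {
    return this.calls;
  },
};

// Hypothetical stand-ins for the objects geth passes to enter and exit.
const mockFrame = (type) => ({ getType: () => type });
const mockFrameResult = (gasUsed, error) => ({
  getGasUsed: () => gasUsed,
  getError: () => error,
});

// Simulate a CALL that makes a nested STATICCALL.
internalCallTracer.enter(mockFrame("CALL"));
internalCallTracer.enter(mockFrame("STATICCALL"));
internalCallTracer.exit(mockFrameResult(100));
internalCallTracer.exit(mockFrameResult(500));

const internalCalls = internalCallTracer.result({}, null);
console.log(internalCalls);
```

Tracking open frames with a small stack keeps nested calls matched to the right exit, which is the main thing to get right with this style.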

Features and patterns not detailed in the docs

All of the above is covered in the geth docs, but there are a few features which I've found to be useful that aren't covered: passing non-standard fields as part of the tracer and using global JS functions bundled as part of the tracer js module in geth.

Passing non-standard fields as part of the tracer is useful if you want to persist data or computations across steps, and is probably a key reason many people use a custom tracer. To take a concrete example from the geth codebase, imagine you want to trace a transaction and create a histogram of the opcodes used during execution (this is taken directly from the codebase; I did not write it).

    {
        // hist is the map of opcodes to counters
        hist: {},
        // nops counts number of ops
        nops: 0,
        // step is invoked for every opcode that the VM executes.
        step: function(log, db) {
            var op = log.op.toString();
            if (this.hist[op]){
                this.hist[op]++;
            } else {
                this.hist[op] = 1;
            }
            this.nops++;
        },
        // fault is invoked when the actual execution of an opcode fails.
        fault: function(log, db) {},
        // result is invoked when all the opcodes have been iterated over and returns
        // the final result of the tracing.
        result: function(ctx) {
            return this.hist;
        },
    }

You can define arbitrary fields as part of the tracer and use them in the step/fault/result functions. In this case, the geth team defined hist as a map whose keys are opcodes and values are number of times encountered. The map is updated each time step is called and returned via result.

Perhaps the feature I've seen the least documentation or awareness of is that there are a few global functions available to all JS tracers. The one I've used most frequently is toHex which makes it much easier to read addresses in responses:

[Two screenshots, not available here: the same tracer response with and without toHex applied to the addresses]
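To give a feel for the difference, here is a small local sketch. Inside geth, toHex is already a global, so the function defined below is only a stand-in I wrote to mimic what I understand it to do (bytes in, 0x-prefixed hex string out). The address bytes are made up, and I'm assuming raw addresses come back as byte arrays that JSON-serialize index by index.

```javascript
// Local stand-in for geth's global toHex; omit this inside a real tracer.
function toHex(bytes) {
  return (
    "0x" +
    Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("")
  );
}

// A made-up 20-byte address ending in 0xdead.
const addressBytes = new Uint8Array(20);
addressBytes[18] = 0xde;
addressBytes[19] = 0xad;

// Without toHex: a byte array serializes as an object keyed by index.
const rawResponse = JSON.stringify({ address: addressBytes });

// With toHex: a single readable hex string.
const readableResponse = JSON.stringify({ address: toHex(addressBytes) });

console.log(rawResponse);
console.log(readableResponse);
```

In a tracer this usually ends up as something like toHex(log.contract.getAddress()) wherever an address is pushed into the result.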

Random other stuff

  • If you're using Ethers.js and want to pass the tracer as an argument, it needs to be a string containing a JavaScript object literal. If you're iterating quickly on a custom tracer, I'd recommend taking the extra 5 minutes to write a script that does the formatting for you, as doing it manually can get quite aggravating. Here's an example of what the formatting looks like, taken from Stack Overflow:
    tracer: "{\n" +
        " data: [],\n" +
        " fault: function (log) {\n" +
        " },\n" +
        " step: function (log) {\n" +
        " var topicCount = (log.op.toString().match(/LOG(\\d)/) || [])[1];\n" +
        " if (topicCount) {\n" +
        " var res = {\n" +
        " address: log.contract.getAddress(),\n" +
        " data: log.memory.slice(parseInt(log.stack.peek(0)), parseInt(log.stack.peek(0)) + parseInt(log.stack.peek(1))),\n" +
        " };\n" +
        " for (var i = 0; i < topicCount; i++)\n" +
        " res['topic' + i.toString()] = log.stack.peek(i + 2);\n" +
        " this.data.push(res);\n" +
        " }\n" +
        " },\n" +
        " result: function () {\n" +
        " return this.data;\n" +
        " }\n" +
        "}",
  • There are a bunch of JS tracers in the geth codebase! Depending on your needs, you might be able to use those as-is or modify them lightly.
  • All of the custom tracing stuff is also applicable to debug_traceTransaction, which can be used to trace a transaction that has already been executed.
  • Erigon also has support for debug_traceCall and debug_traceTransaction.
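On the Ethers.js formatting point above, one way to avoid hand-writing the string is a tiny serializer like the sketch below. serializeTracer is a hypothetical helper of my own, not part of Ethers.js or geth. It assumes every function in the tracer is written as a plain function expression (not method shorthand or an arrow function) and that every other field is JSON-serializable.

```javascript
// Hypothetical helper: turn a tracer object into object-literal source text.
function serializeTracer(tracer) {
  const fields = Object.keys(tracer).map((key) => {
    const value = tracer[key];
    // Functions keep their source; plain data goes through JSON.
    const source =
      typeof value === "function" ? value.toString() : JSON.stringify(value);
    return JSON.stringify(key) + ": " + source;
  });
  return "{" + fields.join(", ") + "}";
}

// Write the tracer as normal JavaScript so your editor can lint it.
const opCountTracer = {
  nops: 0,
  step: function (log, db) {
    this.nops++;
  },
  fault: function (log, db) {},
  result: function (ctx, db) {
    return this.nops;
  },
};

const tracerString = serializeTracer(opCountTracer);
console.log(tracerString);
```

The resulting tracerString can then go into the tracer field of the options object, for example via provider.send("debug_traceCall", [call, "latest", { tracer: tracerString }]) on an Ethers.js JSON-RPC provider, where call is your own call object.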

If I missed anything or you wanna chat more on this or literally anything else, feel free to DM me on Twitter or email me at rajiv@framework.ventures.