# EPF5 Final Report
My project proposal is here: [jsonrpc-enhancements-in-geth](https://github.com/eth-protocol-fellows/cohort-five/blob/main/projects/jsonrpc-enhancements-in-geth.md#json-rpc-enhancements-in-geth), summarized below:
>Introducing `trace_*` namespace and `eth_getTransactionBySenderAndNonce` into geth, to enhance the transaction and trace querying capabilities.
## Project details
### Period 0: Proposal
- [Update 1](https://hackmd.io/@jsvisa/rkNslE3HR)
- [Update 2](https://hackmd.io/@jsvisa/rkmWSpBUA)
- [Update 3](https://hackmd.io/@jsvisa/epf5-week3)
- [Update 4](https://hackmd.io/@jsvisa/epf5-week4)
- [Update 5](https://hackmd.io/@jsvisa/epf5-week5)
### Period 1: Design the first approach
In this period, I was thinking of storing the data directly in the freezer db: once a chain reorg is detected, truncate the head of the frozen data and then re-insert the data of the newly produced blocks.
For the serialization format used to persist the data on disk, I first considered JSON, but @s1na later mentioned that RLP may perform better, so I wrote a simple tool, [rlp-vs-json](https://github.com/jsvisa/rlp-vs-json), to benchmark the difference in performance and data size between JSON and RLP. The results are below:

Based on the benchmark results, we chose RLP as the serialization format.
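To give a rough feel for why the two formats differ in size, here is a minimal sketch in Python (not the rlp-vs-json tool itself). It implements the standard RLP rules for byte strings and lists and compares the encoded size of one call frame against a hex-string JSON encoding. The `frame` fields and values are hypothetical, chosen only for illustration, and a single size comparison says nothing about encode/decode speed.

```python
import json


def rlp_encode(item):
    """Encode bytes or (nested) lists of bytes using Ethereum's RLP rules."""
    if isinstance(item, bytes):
        # A single byte below 0x80 is its own encoding.
        if len(item) == 1 and item[0] < 0x80:
            return item
        return _length_prefix(len(item), 0x80) + item
    if isinstance(item, (list, tuple)):
        payload = b"".join(rlp_encode(x) for x in item)
        return _length_prefix(len(payload), 0xC0) + payload
    raise TypeError(f"cannot RLP-encode {type(item).__name__}")


def _length_prefix(length, offset):
    # Short payloads (< 56 bytes) store the length in the prefix byte itself.
    if length < 56:
        return bytes([offset + length])
    # Longer payloads store the big-endian length after the prefix byte.
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


# Hypothetical call frame: two 20-byte addresses, a gas value, 4 bytes of input.
frame = {
    "from": bytes.fromhex("11" * 20),
    "to": bytes.fromhex("22" * 20),
    "gas": (21000).to_bytes(2, "big"),
    "input": bytes.fromhex("a9059cbb"),
}

# JSON has to carry field names and hex-encode every binary value.
as_json = json.dumps({k: "0x" + v.hex() for k, v in frame.items()}).encode()
# RLP stores the raw bytes in a fixed field order, without field names.
as_rlp = rlp_encode(list(frame.values()))

print("json bytes:", len(as_json), "rlp bytes:", len(as_rlp))
```

The size gap comes mostly from hex encoding, which doubles every binary field, plus the repeated field names in JSON.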
- [Update 6](https://hackmd.io/@jsvisa/epf5-week6)
- [Update 7](https://hackmd.io/@jsvisa/epf5-week7)
- [Update 8](https://hackmd.io/@jsvisa/epf5-week8)
- [Update 9](https://hackmd.io/@jsvisa/epf5-week9)
### Period 2: Design the second approach
I couldn't handle reorgs gracefully with the first approach, so I switched to a two-db approach:
1. `kvdb`: used to store the newly created block traces
    - is a generic Key-Value DB (ethdb)
    - the key schema is `BlkNum || BlkHash || TraceType`
        - `TraceType` is one of: `callTracer`, `flatCallTracer`, `parityTracer`, `prestateTracer`
    - the value is the RLP-encoded trace data
2. `frdb`: used to store the canonical block traces
    - uses `TraceType` as the table name

The data is first stored in the KV store (based on Pebble); then, on each finalized block, the `head-64` blocks are moved from `kvdb` into `frdb`.
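The two-db flow above can be sketched roughly as follows. This is a Python illustration, not the code in the PR: plain dicts stand in for `kvdb` and `frdb`, the 8-byte big-endian block number is an assumed encoding, and "blocks at or below `head - 64`" is my reading of the migration rule. The names `make_key`, `migrate` and `canonical_hashes` are hypothetical.

```python
import struct

TRACE_TYPES = ("callTracer", "flatCallTracer", "parityTracer", "prestateTracer")
FINALITY_LAG = 64  # assumption: blocks at or below head-64 are safe to freeze


def make_key(number, block_hash, trace_type):
    """Build BlkNum || BlkHash || TraceType (assumed: 8-byte big-endian number)."""
    if trace_type not in TRACE_TYPES:
        raise ValueError(f"unknown trace type: {trace_type}")
    if len(block_hash) != 32:
        raise ValueError("block hash must be 32 bytes")
    return struct.pack(">Q", number) + block_hash + trace_type.encode()


def migrate(kvdb, frdb, head, canonical_hashes):
    """Move canonical traces at or below head-64 from kvdb into frdb.

    kvdb: dict mapping key -> encoded trace (stand-in for the KV store)
    frdb: dict mapping trace type -> {block number: encoded trace}
    canonical_hashes: dict mapping block number -> canonical block hash
    """
    limit = head - FINALITY_LAG
    # Big-endian block numbers make the byte order of keys match block order.
    for key in sorted(kvdb):
        number = struct.unpack(">Q", key[:8])[0]
        if number > limit:
            break
        block_hash, trace_type = key[8:40], key[40:].decode()
        value = kvdb.pop(key)
        # Traces of blocks dropped by a reorg are discarded instead of frozen.
        if canonical_hashes.get(number) == block_hash:
            frdb.setdefault(trace_type, {})[number] = value
```

Keeping the block hash in the key lets traces of competing blocks at the same height coexist in `kvdb`, so a reorg never requires truncating `frdb`.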
- [Update 10](https://hackmd.io/@jsvisa/epf5-week10)
- [Update 11](https://hackmd.io/@jsvisa/epf5-week11)
- [Update 12-13](https://hackmd.io/@jsvisa/epf5-week12-13)
- [Update 14-16](https://hackmd.io/@jsvisa/epf5-week14-16)
## Project status
### Setup
> how to use this feature
Currently this feature is not merged into geth master; see [PR#30255](https://github.com/ethereum/go-ethereum/pull/30255). Anyone interested in this feature can build geth from the [jsvisa/trace-filter](https://github.com/jsvisa/go-ethereum/tree/trace-filter) branch instead.
After building geth with the trace-filter feature, start geth with `--vmtrace=live --vmtrace.jsonconfig=xxx` to enable live tracing, e.g.:
```bash=
geth ... --http.api=...,trace \
  --vmtrace=live \
  --vmtrace.jsonconfig='{
    "path": "/data/live-trace",
    "enableNonceTracer": true,
    "maxKeepBlocks": 100000,
    "config": {
      "callTracer": { "withLog": true },
      "parityTracer": {},
      "prestateTracer": { "diffMode": true }
    }
  }'
```
The configuration options are:
- `path`: the directory used to store the traced data. To mitigate the impact of I/O bandwidth limitations, use a disk separate from the one holding the chaindata.
- `enableNonceTracer`: whether or not to record the hashes of the transactions initiated by each sender, used by the `eth_getTransactionBySenderAndNonce` RPC
- `maxKeepBlocks`: keep at most N blocks; older ones are pruned periodically
- `config`: a map of the tracers you want to run on this geth node. Each tracer has its own configuration; see [builtin-tracers](https://geth.ethereum.org/docs/developers/evm-tracing/built-in-tracers#native-tracers) for more details.
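If you start geth from a script, hand-quoting the JSON is easy to get wrong. A minimal sketch in Python that generates the two flags from the options above; the helper name `build_vmtrace_args` is hypothetical, and the option names are the ones listed in this section:

```python
import json
import shlex


def build_vmtrace_args(path, tracers, enable_nonce_tracer=True, max_keep_blocks=100000):
    """Return the geth arguments that enable live tracing with these settings."""
    config = {
        "path": path,
        "enableNonceTracer": enable_nonce_tracer,
        "maxKeepBlocks": max_keep_blocks,
        "config": tracers,
    }
    # Compact separators keep the flag value short.
    encoded = json.dumps(config, separators=(",", ":"))
    return ["--vmtrace=live", f"--vmtrace.jsonconfig={encoded}"]


args = build_vmtrace_args(
    "/data/live-trace",
    {
        "callTracer": {"withLog": True},
        "parityTracer": {},
        "prestateTracer": {"diffMode": True},
    },
)
# shlex.join quotes the JSON so it survives a shell command line.
print(shlex.join(args))
```

The returned list can be appended directly to the argument list passed to `subprocess.run`, in which case no shell quoting is needed at all.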
### RPC retrieving
You need to append the `trace` namespace to geth's `--http.api` command-line argument; then you can retrieve the block traces as follows:
```json
{
  "id": "1",
  "jsonrpc": "2.0",
  "method": "trace_block",
  "params": [
    "0x1"
  ]
}
```
Besides `trace_block`, the following RPCs are also supported:
- `trace_filter`: filter by a range of blocks or a range of addresses
- `trace_transaction`: retrieve the trace of a single transaction instead of a whole block

The usage of these RPCs is compatible with [Parity's](https://openethereum.github.io/JSONRPC-trace-module).
Furthermore, besides Parity's call tracer, we can also retrieve the results of the other tracers, e.g.:
> retrieve prestate tracer
```json
{
  "id": "1",
  "jsonrpc": "2.0",
  "method": "trace_block",
  "params": [
    "0x1",
    {"tracer": "prestateTracer"}
  ]
}
```
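The requests above can also be sent from a script. Below is a minimal Python sketch using only the standard library. The helper names `trace_block_payload` and `call` are hypothetical, and the endpoint in the usage comment assumes geth's default HTTP-RPC address on the local machine.

```python
import json
import urllib.request


def trace_block_payload(block, tracer=None, request_id="1"):
    """Build a trace_block request; block is an int or a hex string."""
    params = [hex(block) if isinstance(block, int) else block]
    if tracer is not None:
        # Optional second parameter selecting a non-default tracer.
        params.append({"tracer": tracer})
    return {"id": request_id, "jsonrpc": "2.0", "method": "trace_block", "params": params}


def call(url, payload, timeout=30):
    """POST a JSON-RPC payload and return the decoded response body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())


# Usage (needs a running node with the trace namespace enabled):
#   reply = call("http://localhost:8545", trace_block_payload(1, "prestateTracer"))
#   print(reply.get("result", reply.get("error")))
```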
Next, let's talk about performance. The `debug_traceXXX` RPCs need to replay the transaction at request time, so recalculating the state takes a lot of time. With live traces, we only need O(1) time to retrieve the stored result from disk. Let's test this in more detail.
Here we start two geth nodes:
1. run in normal mode, with `live` trace enabled
```bash=
geth1 --vmtrace=live --vmtrace.jsonconfig=xxx
```
2. run in archive mode
```bash=
geth2 --syncmode=full --gcmode=archive --state.scheme=hash
```
After the two nodes were fully synced, we used [blockchain-etl](https://github.com/jsvisa/blockchain-etl/tree/geth-trace) to fetch the trace data:
1. for the normal node, use the `trace_block` RPC
2. for the archive node, use the `debug_traceBlockByNumber` RPC

Start the ETL process as below (dumping about 43k blocks):
```bash=
./etl dump2 \
--chain=ethereum \
--lag=64 \
--start-block=21010865 \
--end-block=21054092 \
--period-seconds=60 \
--max-workers=10 \
--block-batch-size=10 \
--batch-size=1 \
...
```

Meanwhile, we used Prometheus and Grafana to measure the difference in RPC performance:

## Future plans
- Short-Term Goals:
    - push [#30255](https://github.com/ethereum/go-ethereum/pull/30255) to be merged into the geth master branch
    - after the [geth exex plugin](https://github.com/ethereum/go-ethereum/pull/30611) is merged, refine the codebase to adapt to the plugin architecture
- Long-Term Goals: keep contributing to the Ethereum ecosystem, especially the execution clients (geth, reth and erigon), by developing new features and fixing bugs.
## Self evaluation
During my time working on the EPF5 project, I designed and implemented a live tracer framework for Geth, which improves how traces and sender-nonce transaction lookups are queried. Getting this feature up and running has been a great achievement.
Beyond my main project, I've also contributed to other important Ethereum projects like reth and erigon.
In the coming weeks, I aim to make the trace API a standard across different Ethereum clients. This will make development easier and more consistent for everyone in the Ethereum community.
I've enjoyed working with all of you and look forward to continuing our efforts to make Ethereum better.
## Feedback about EPF
I am grateful to the EPF for offering me the unique opportunity to contribute to the development of Ethereum Core up close. Throughout this journey, I have had the pleasure of forging connections with friends from various nations. While my attendance has been sporadic, each weekly standup meeting has been a significant learning opportunity. Notably, the discussions in group settings prior to the standup have been especially instrumental in enhancing my understanding and collaboration skills.