# Introduction

There's a broader shift in how Ethereum is scaling, not only through L2s, but also by improving the [scalability of Ethereum L1](https://vitalik.eth.limo/general/2025/02/14/l1scaling.html) itself. Sepolia has already reached 60 Mgas, with other testnets expected to follow soon. Some core devs and researchers have even suggested increasing the gas limit to 100 Mgas by the end of the year.

In this context, all EL client teams need to focus on meeting these scaling targets across various dimensions—such as network bandwidth, memory, CPU, disk usage, and block execution time, i.e., block processing performance.

In our [previous blog post](https://lf-hyperledger.atlassian.net/wiki/spaces/BESU/pages/22156583/2024+-+Besu+Performance+Improvements+since+the+Merge), we shared how crucial block processing performance is for attestation performance and maximizing staking rewards. Currently, attestation performance is very good—even on modest hardware. For instance, our Besu validators are running on 4 CPU / 8 GiB VMs for both the CL and EL, and are still achieving over [98% attestation performance at the time of writing](https://explorer.rated.network/o/Besu%20Client%20Team?network=mainnet&timeWindow=1d&idType=nodeOperator).

In this blog post, we'll focus specifically on block processing performance, without diving into resource usage. Our goal is to share block execution performance numbers across different Ethereum clients on a powerful machine, and to provide insight into the top consumers and our methodology for improving block processing performance.

# Hardware setup

To measure block processing performance, we used a high-end machine with the following specifications:

* AMD Ryzen 9 9950X
* Linux Ubuntu 24.04
* GIGABYTE X870 EAGLE WIFI7
* 64 GB Corsair VENGEANCE DDR5 5200 MHz (2 × 32 GB)
* 2 TB SAMSUNG 990 PRO M.2, PCIe 4.0 NVMe

Of course, we're not suggesting that this is the recommended hardware for running a validator. For those looking to get started with Ethereum validation, we encourage you to check the recommended hardware specifications [here](https://hackmd.io/G3MvgV2_RpKxbufsZO8VVg?view). We will also share [Besu numbers](#Besu-block-processing-numbers-on-Nuc-14-Pro) on one of the recommended hardware configurations.

So why did we choose this setup? For two main reasons:

- Some client teams have started publishing benchmark numbers using this specific hardware, and we wanted to evaluate how Besu performs in the same environment.
- We wanted to explore the upper bounds of throughput (in Mgas/s) achievable on a high-performance setup.

📝 **One important note**: we used the same hardware setup to collect performance data for all five clients—Besu, Erigon, Geth, Nethermind, and Reth. To do this, we ran the clients at different times, allowing each node to fully sync, run for a period (~2 hours), and then collected block processing metrics. While the results aren't 100% identical in terms of the specific blocks processed, we found that the numbers were consistent across multiple runs for each client.

# Block processing numbers

To collect the data, we relied on logs from all five clients. Fortunately, each client logs the Mgas/s value for every processed block. For each client, we gathered logs during block execution after the node had fully synced to the head of the chain—we did not measure sync time or block import time during the sync phase.
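For readers who want to reproduce this kind of analysis, the sketch below shows one way to pull the per-block Mgas/s values out of a client log and compute the summary statistics we report. It is only illustrative: the class name and the regex are ours, and each client formats its block-import lines differently, so the pattern needs to be adapted per client.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/**
 * Illustrative extraction of per-block Mgas/s values from a client log,
 * followed by the summary statistics used in this post. The log pattern is
 * hypothetical; adapt it to the client you are analyzing.
 */
public final class MgasLogStats {

  // Matches e.g. "... 123.45 Mgas/s ..." (assumed format, not any specific client's).
  private static final Pattern MGAS_PATTERN =
      Pattern.compile("(\\d+(?:\\.\\d+)?)\\s*Mgas/s", Pattern.CASE_INSENSITIVE);

  public static void main(final String[] args) throws IOException {
    final List<Double> sorted =
        Files.readAllLines(Path.of(args[0])).stream()
            .map(MGAS_PATTERN::matcher)
            .filter(Matcher::find)
            .map(m -> Double.parseDouble(m.group(1)))
            .sorted()
            .collect(Collectors.toList());

    final double average =
        sorted.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);

    System.out.printf(
        "min=%.2f median=%.2f avg=%.2f p95=%.2f p99=%.2f max=%.2f%n",
        sorted.get(0),
        percentile(sorted, 50),
        average,
        percentile(sorted, 95),
        percentile(sorted, 99),
        sorted.get(sorted.size() - 1));
  }

  // Nearest-rank percentile over an ascending list of values.
  private static double percentile(final List<Double> sorted, final double p) {
    final int index = (int) Math.ceil(p / 100.0 * sorted.size()) - 1;
    return sorted.get(Math.max(0, Math.min(index, sorted.size() - 1)));
  }
}
```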
Once the Mgas/s metrics were extracted, we visualized the data by plotting the Mgas/s timeline, a histogram, and summary statistics—including the minimum, average, median (50th percentile), 95th percentile, 99th percentile, and maximum. Below, you'll find a table summarizing the results across the different clients.

| Metric          |    Besu | Erigon |   Geth | Nethermind |   Reth |
| --------------- | ------: | -----: | -----: | ---------: | -----: |
| Min             |  107.74 |  59.09 |  33.52 |     173.22 |  60.31 |
| Median          |  271.50 | 144.66 | 262.03 |     654.54 | 236.36 |
| Average         |  282.25 | 147.34 | 237.23 |     626.12 | 236.02 |
| 95th Percentile |  389.13 | 198.52 | 368.08 |    1013.00 | 322.26 |
| 99th Percentile |  651.97 | 223.79 | 430.30 |    1175.55 | 392.33 |
| Max             | 1242.63 | 266.88 | 461.98 |    1357.34 | 401.33 |

![image](https://hackmd.io/_uploads/rJOZ4cUC1l.png)

## Some key takeaways

* **Fastest average throughput**: Nethermind at 626.12 Mgas/s
* **Lowest 99th percentile**: Erigon at 223.79 Mgas/s
* **Most consistent performance**: Reth, with median and average very close (~236 Mgas/s)
* **Highest peak performance**: Nethermind, reaching up to 1357.34 Mgas/s
* **Reliable mid-range performer**: Besu, delivering steady throughput with decent headroom at the higher percentiles
* **Good throughput and competitive consistency**: Geth, with good overall throughput and a relatively tight spread between median and 99th percentile, indicating stable behavior under load

For each client, we also include a timeline of Mgas/s over time and a histogram showing the distribution of block processing rates. These visualizations help illustrate performance stability and typical throughput patterns beyond summary statistics.

## Besu

### Version

```
besu-perf@besu:/opt/besu/besu-25.3.0/bin$ ./besu --version
besu/v25.3.0/linux-x86_64/openjdk-java-21
```

### Service configuration

```
[Unit]
Description=Besu Enterprise Ethereum java client
After=syslog.target network.target

[Service]
User=besu
Group=besu
Environment=HOME=/home/besu
Environment=LOG4J_CONFIGURATION_FILE=/opt/log4j/besu-log-config.xml
Type=simple
ExecStart=/bin/sh -c "/opt/besu/besu-25.3.0/bin/besu --Xbonsai-parallel-tx-processing-enabled=true --Xplugin-rocksdb-high-spec-enabled --config-file=/etc/besu/config.toml >> /var/log/besu/besu.log 2>&1"
SuccessExitStatus=143
Restart=on-failure
RestartSec=10s

[Install]
WantedBy=multi-user.target
```

### Mgas/s timeline and histogram

![image](https://hackmd.io/_uploads/SJBzlcLCJg.png)
![image](https://hackmd.io/_uploads/HJGSg9LAyg.png)

## Geth

### Version

```
besu-perf@besu:/opt/geth/geth-linux-amd64-1.15.5-4263936a$ ./geth --version
geth version 1.15.5-stable-4263936a
```

### Service configuration

```
[Unit]
Description=Geth Execution Client (Mainnet)
After=network.target
Wants=network.target

[Service]
User=geth
Group=geth
Type=simple
Restart=always
RestartSec=5
TimeoutStopSec=600
ExecStart=/opt/geth/geth-linux-amd64-1.15.5-4263936a/geth \
  --mainnet \
  --datadir /data/geth \
  --authrpc.jwtsecret /etc/jwt.hex \
  --db.engine pebble \
  --state.scheme path

[Install]
WantedBy=default.target
```

### Mgas/s timeline and histogram

![image](https://hackmd.io/_uploads/HyM1-9UCJg.png)
![image](https://hackmd.io/_uploads/SJXG-c8Cke.png)

## Erigon

### Version

```
besu-perf@besu:/opt/erigon_v3.0.0_linux_amd64$ ./erigon --version
erigon version 3.00.0-57625b40
```

### Service configuration

```
[Unit]
Description=Erigon Execution Client (Mainnet)
After=network.target
Wants=network.target

[Service]
User=erigon
Group=erigon
Type=simple
Restart=always
RestartSec=5
ExecStart=/opt/erigon_v3.0.0_linux_amd64/erigon \
  --chain=mainnet \
  --datadir=/data/erigon \
  --authrpc.jwtsecret=/etc/jwt.hex \
  --externalcl

[Install]
WantedBy=default.target
```

### Mgas/s timeline and histogram

![image](https://hackmd.io/_uploads/Skx2b98C1e.png)
![image](https://hackmd.io/_uploads/HJ6bfcL0yl.png)

## Nethermind

### Version

```
besu-perf@besu:/opt/nethermind/nethermind$ ./nethermind --version
Version: 1.31.5+ace60000
Commit: ace6000081fb615a59aabe50d1a8cc80ed294d30
Build date: 2025-03-14 16:08:25Z
Runtime: .NET 9.0.3
Platform: Linux x64
```

### Service configuration

```
[Unit]
Description=Nethermind Execution Client (Mainnet)
After=network.target
Wants=network.target

[Service]
User=nethermind
Group=nethermind
Type=simple
Restart=always
RestartSec=5
WorkingDirectory=/data/nethermind
Environment="DOTNET_BUNDLE_EXTRACT_BASE_DIR=/data/nethermind"
ExecStart=/opt/nethermind/nethermind/nethermind \
  --config mainnet \
  --datadir /data/nethermind \
  --Sync.SnapSync true \
  --JsonRpc.JwtSecretFile /etc/jwt.hex \
  --Pruning.Mode Hybrid \
  --Pruning.FullPruningTrigger VolumeFreeSpace \
  --Pruning.FullPruningThresholdMb 285000

[Install]
WantedBy=default.target
```

### Mgas/s timeline and histogram

![image](https://hackmd.io/_uploads/S1yDV5UCJx.png)
![image](https://hackmd.io/_uploads/SkK_EqL0yx.png)

## Reth

### Version

```
besu-perf@besu:/opt/reth$ ./reth --version
reth Version: 1.3.4
Commit SHA: 90c514ca818a36eb8cd36866156c26a4221e9c4a
Build Timestamp: 2025-03-21T19:27:56.033317818Z
Build Features: asm_keccak,jemalloc
Build Profile: maxperf
```

### Service configuration

```
[Unit]
Description=Reth Execution Client (Mainnet)
After=network.target
Wants=network.target

[Service]
User=reth
Group=reth
Type=simple
Restart=always
RestartSec=5
ExecStart=/opt/reth/reth node \
  --full \
  --datadir /data/reth \
  --authrpc.jwtsecret /etc/jwt.hex \
  --log.file.directory /var/log/reth \
  --log.file.max-size 50 \
  --log.file.max-files 1

[Install]
WantedBy=default.target
```

### Mgas/s timeline and histogram

![image](https://hackmd.io/_uploads/Sk3ekiIAyg.png)
![image](https://hackmd.io/_uploads/S1Uf1iLC1g.png)

# Besu block processing numbers on Nuc 14 Pro

## The hardware setup

Of course, not all home stakers or node operators have access to high-end server hardware. To better reflect realistic setups, we also gathered metrics on Besu running on a recommended consumer-grade configuration:

* ASUS NUC 14 Pro (Intel Core Ultra 7 155H)
* 2 × 32 GB RAM
* Samsung 990 PRO 4TB

This setup is one of the recommended hardware configurations listed [here](https://hackmd.io/G3MvgV2_RpKxbufsZO8VVg?view), defined by the informational [EIP-7870](https://eips.ethereum.org/EIPS/eip-7870).

## Mgas/s timeline and histogram

Below are Besu's block processing statistics on the Nuc 14 Pro:

* Min: 102.94 Mgas/s
* Median: 192.55 Mgas/s
* Average: 195.02 Mgas/s
* 95th Percentile: 260.15 Mgas/s
* 99th Percentile: 321.40 Mgas/s
* Max: 405.09 Mgas/s

![image](https://hackmd.io/_uploads/r18Xr5cRkx.png)
![image](https://hackmd.io/_uploads/rkLBr9cAke.png)

# Block processing performance profile (on Besu)

The profiling below was done on the Ryzen 9 setup using wall clock profiling with 5 ms sampling over 300 seconds, with parallel transaction execution enabled.

The two most significant CPU consumers are root hash calculation and EVM opcode execution. The root hash calculation, responsible for finalizing the world state Merkle Trie, consumed 105 samples (~39%), highlighting the computational expense of hashing and persisting trie nodes — particularly with Bonsai state storage.
Closely following, EVM opcode execution took 104 samples (~39%), with SLOAD alone accounting for 21 samples, indicating that state access remains a key bottleneck during transaction execution. These two phases combined account for nearly 78% of total block processing time, showing that world state manipulation and contract execution dominate the performance profile.

The table below summarizes the CPU sample distribution during Besu's block processing, highlighting the most time-consuming operations:

| Category                         | Samples | Approx Time | Percentage |
|----------------------------------|---------|-------------|------------|
| **Total**                        | 269     | ~54 ms      | 100%       |
| Root Hash Calculation            | 105     | ~19.5 ms    | 39.0%      |
| EVM Execution                    | 104     | ~19.2 ms    | 38.7%      |
| SLOAD (subset of EVM Execution)  | 21      | ~3.9 ms     | 7.8%       |
| Conflict Merge                   | 15      | ~2.8 ms     | 5.6%       |
| Log Bloom Filter Update          | 7       | ~1.3 ms     | 2.6%       |
| Block Data Storage               | 6       | ~1.1 ms     | 2.2%       |
| Block Body Validation            | 5       | ~0.9 ms     | 1.9%       |
| Get "to" Account (state read)    | 4       | ~0.7 ms     | 1.5%       |
| Precompile Execution             | 3       | ~0.6 ms     | 1.1%       |
| ETH Transfer                     | 1       | ~0.2 ms     | 0.4%       |
| Store Beacon Root                | 1       | ~0.2 ms     | 0.4%       |

![Screenshot 2025-04-14 at 11.33.29](https://hackmd.io/_uploads/rJGiaFc0ke.png)

The block processing profile on a node running without parallel transaction execution looks quite different.

# Our methodology for improving block processing performance

To continuously improve block processing performance in Besu, we rely on a structured, iterative methodology composed of four key techniques. Other methodologies are applied to other use cases, such as RPC nodes.

### 1. Performance Monitoring & Observability

We leverage a Prometheus/Grafana stack to collect real-world performance metrics, primarily using two dashboards: the [Besu Full dashboard](https://grafana.com/grafana/dashboards/16455-besu-full/) and the [Node Exporter dashboard](https://grafana.com/grafana/dashboards/1860-node-exporter-full/). These allow us to assess potential regressions by comparing different releases under identical load conditions. One of Ethereum's key strengths is the ability to test multiple versions of the client under production traffic, without impacting the network or other clients — enabling high-confidence performance validation in real-world scenarios.

### 2. Profiling (CPU / Wall Clock)

When a potential regression is detected or a performance improvement is introduced, we use profiling tools to validate the impact and pinpoint the root cause. After resolving the issue or deploying the improvement, we run a new profiling session to confirm that the expected behavior has been achieved. Depending on the context, we use both CPU time profiling and wall clock profiling to capture accurate data.

### 3. Micro-Benchmarking

When profiling narrows down the scope, we isolate methods and components to benchmark them in controlled environments. This validates whether observed regressions or gains are reproducible at a micro level. As Besu is written in Java, we use [JMH](https://github.com/openjdk/jmh) (Java Microbenchmark Harness) for micro-benchmarking at the method level (a sketch follows at the end of this section).

### 4. Static Performance Tuning

Finally, we experiment with JVM flags and runtime configurations to understand their impact on performance and fine-tune the system for specific workloads or environments.
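To make the micro-benchmarking step concrete, here is a minimal JMH sketch in the spirit of step 3. The class and the byte-array concatenation it measures are hypothetical and only illustrate the shape of such a benchmark; the `jvmArgsAppend` option on `@Fork` also hints at how the JVM-flag experiments from step 4 can be folded into the same harness.

```java
import java.nio.ByteBuffer;
import java.util.SplittableRandom;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;

/**
 * Hypothetical method-level benchmark comparing two ways of concatenating
 * 32-byte values, the kind of experiment we run once profiling has narrowed
 * the hot path down to a specific component.
 */
@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 5, time = 1)
// Forking with extra JVM flags also lets us compare static tuning options (step 4).
@Fork(value = 2, jvmArgsAppend = {"-XX:+AlwaysPreTouch"})
public class BytesConcatBenchmark {

  private byte[] left;
  private byte[] right;

  @Setup
  public void setup() {
    final SplittableRandom random = new SplittableRandom(42);
    left = new byte[32];
    right = new byte[32];
    for (int i = 0; i < 32; i++) {
      left[i] = (byte) random.nextInt(256);
      right[i] = (byte) random.nextInt(256);
    }
  }

  @Benchmark
  public void concatWithArrayCopy(final Blackhole bh) {
    final byte[] out = new byte[left.length + right.length];
    System.arraycopy(left, 0, out, 0, left.length);
    System.arraycopy(right, 0, out, left.length, right.length);
    bh.consume(out);
  }

  @Benchmark
  public void concatWithByteBuffer(final Blackhole bh) {
    final ByteBuffer buffer = ByteBuffer.allocate(left.length + right.length);
    buffer.put(left).put(right);
    bh.consume(buffer.array());
  }
}
```

Comparing the two variants, and re-running them under different JVM flag combinations, gives us reproducible, method-level numbers before a change is promoted into the client.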
This step-by-step approach allows us to iterate quickly, validate assumptions with data, and ensure that improvements are robust and repeatable.

# What's next

* **Performance goal**: Our goal is to make Besu the fastest Ethereum execution client.
* **State root optimization**: We're exploring techniques to speed up root hash calculation.
* **Byte manipulation refactoring**: We're refactoring the byte handling library — a known bottleneck for Besu EVM performance.
* **SIMD acceleration**: We're evaluating manual SIMD/vectorization strategies, as Java's auto-vectorization still leaves room for improvement.
* **Plenty of headroom**: Early findings show meaningful opportunities for performance gains.

# Conclusion

Achieving fast, consistent block processing performance is critical for Ethereum's long-term scalability—especially as we move toward higher gas limits and increasingly demanding workloads.

Our study, based on mainnet data collected from various Ethereum clients on a high-end hardware setup (not the Ethereum hardware target configuration), demonstrates that Besu is one of the fastest clients under production-level loads, while also revealing key opportunities for further performance improvements. For context, while this high-end machine allowed us to assess the upper performance bounds, one of the target hardware configurations for validators is the Nuc 14 Pro, for which we also share detailed performance numbers.

Whether it's managing 60 Mgas blocks or gearing up for even more ambitious throughput targets, we remain committed to ensuring that Besu stands out as a top-performing execution client. Stay tuned for more updates as we roll out the next wave of improvements.