# Benchmark SSZ-QL (without proof inclusion)

{%preview https://github.com/syjn99/prysm/pull/10 %}

## Introduction

Since this [PR](https://github.com/OffchainLabs/prysm/pull/15888) was merged, Prysm supports **SSZ Query** features, without Merkle proof inclusion. This means that a user who wants to query the validator at index 2045754 on mainnet can simply send a POST request like:

```
POST /eth/v1/beacon/states/head/query
```

With request body:

```json
{
  "query": ".validators[2045754]",
  "include_proof": false
}
```

---

In this write-up, I would like to share a benchmark of the SSZ-QL engine and the main bottleneck in the API handler.

## Rationale when writing code for benchmarking

- Fetching a `BeaconState` from DB/memory is out of scope for this benchmark, so I simplified the handler logic and assumed that the `BeaconState` is already available. See `BenchmarkQueryBeaconState` ([Link](https://github.com/syjn99/prysm/blob/f4dfef191f31254985a030198f5446ff44ae9228/beacon-chain/rpc/prysm/beacon/benchmark_test.go#L36-L70)).
- I also added several isolated benchmarks for individual steps such as marshalling the full state and analyzing the protobuf object. See `Benchmark_A_ToProtoUnsafe`, `Benchmark_B_AnalyzeObject`, `Benchmark_C_CalculateOffsetAndLength`, `Benchmark_D_MarshalSSZ_BeaconState`, `Benchmark_E_MarshalSSZ_SSZQueryResponse` ([Link](https://github.com/syjn99/prysm/blob/f4dfef191f31254985a030198f5446ff44ae9228/beacon-chain/rpc/prysm/beacon/benchmark_test.go#L72-L166)).
- `TestPrintSszInfo` just pretty-prints the analyzed object. See the result [below](#Analyzed-BeaconState).

## How to run

1. Check out the `ssz-ql-benchmark` branch in `syjn99/prysm`: https://github.com/syjn99/prysm/tree/ssz-ql-benchmark
2. Download a state from [tracoor](https://tracoor.mainnet.ethpandaops.io/), and place it at `beacon-chain/rpc/prysm/beacon/state_*.ssz`. I used the `BeaconState` of [slot 12852789](https://beaconcha.in/slot/12852789).
3. Modify the constant values at the top of `beacon-chain/rpc/prysm/beacon/benchmark_test.go` if you want.

   ```go
   const (
   	stateFilePath  = "./state_12852789.ssz"
   	validatorIndex = 2045754
   	targetPath     = ".validators[2045754]"
   )
   ```

4. From the root directory, run this command:

   ```bash
   go test -bench=. -run=^# -benchtime=2s beacon-chain/rpc/prysm/beacon/benchmark_test.go -cpuprofile=cpu.out -memprofile=mem.out
   ```

5. See the result.

## Result

```
goos: darwin
goarch: arm64
cpu: Apple M4 Pro
BenchmarkQueryBeaconState-12                            15    150825878 ns/op    933128148 B/op    6581681 allocs/op
--- BENCH: BenchmarkQueryBeaconState-12
    benchmark_test.go:177: Running one-time setup...
    benchmark_test.go:208: Setup complete.
Benchmark_A_ToProtoUnsafe-12                           186     13050653 ns/op     57977184 B/op      82690 allocs/op
Benchmark_B_AnalyzeObject-12                         64981        36990 ns/op        40984 B/op        775 allocs/op
Benchmark_C_CalculateOffsetAndLength-12          198119467        12.06 ns/op            0 B/op          0 allocs/op
Benchmark_D_MarshalSSZ_BeaconState-12                   14    145811625 ns/op    875111157 B/op    6498209 allocs/op
Benchmark_E_MarshalSSZ_SSZQueryResponse-12        58596447        40.29 ns/op          256 B/op          2 allocs/op
PASS
ok      command-line-arguments  24.499s
```

- Processing one request end to end takes about 150ms. As the result of `Benchmark_D_MarshalSSZ_BeaconState` indicates (about 146ms and 875MB allocated per operation), most of the bottleneck is marshalling the full `BeaconState`.
    - This was predictable: see the Pros & Cons section of this approach [here](https://hackmd.io/@junsong/Byr3_lfPxl#Pros-amp-Cons6).
    - Open question: marshalling a ~300MB `BeaconState` is a real overhead in terms of both time and memory. Can we avoid or optimize it?
- `ToProtoUnsafe` takes about 13ms. I think this is a reasonable trade-off for our approach, because we need a protobuf object to analyze.
- `AnalyzeObject` takes under **50 microseconds**, which seems negligible.

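
To put the numbers above side by side, here is a small sketch that expresses each isolated benchmark as a fraction of the end-to-end `BenchmarkQueryBeaconState` result. The `result` type and the `share` helper are illustrative names of mine and are not part of the benchmark code. Since each step was measured in its own benchmark run, the percentages are approximate and are not expected to add up to exactly 100%.

```go
package main

import "fmt"

// result holds one row of the `go test -bench` output shown above.
// This type is a hypothetical helper for illustration only.
type result struct {
	name    string
	nsPerOp float64
	bPerOp  float64
}

// share returns part/total as a percentage.
func share(part, total float64) float64 {
	return part / total * 100
}

func main() {
	// End-to-end handler benchmark.
	total := result{"BenchmarkQueryBeaconState", 150825878, 933128148}

	// Isolated step benchmarks, copied from the output above.
	steps := []result{
		{"Benchmark_A_ToProtoUnsafe", 13050653, 57977184},
		{"Benchmark_B_AnalyzeObject", 36990, 40984},
		{"Benchmark_C_CalculateOffsetAndLength", 12.06, 0},
		{"Benchmark_D_MarshalSSZ_BeaconState", 145811625, 875111157},
		{"Benchmark_E_MarshalSSZ_SSZQueryResponse", 40.29, 256},
	}

	for _, s := range steps {
		fmt.Printf("%-42s %6.2f%% of time, %6.2f%% of bytes\n",
			s.name,
			share(s.nsPerOp, total.nsPerOp),
			share(s.bPerOp, total.bPerOp))
	}
}
```

With these inputs, `Benchmark_D_MarshalSSZ_BeaconState` comes out at roughly 97% of the time and 94% of the allocated bytes of a full request, while `Benchmark_A_ToProtoUnsafe` is roughly 9% of the time.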
## Appendix

### Analyzed `BeaconState`

```
BeaconStateElectra (Variable-size / size: 299898616)
├─ genesis_time (offset: 0) uint64 (Fixed-size / size: 8)
├─ genesis_validators_root (offset: 8) Bytes32 (Fixed-size / size: 32)
├─ slot (offset: 40) Slot (Fixed-size / size: 8)
├─ fork (offset: 48) Fork (Fixed-size / size: 16)
│  ├─ previous_version (offset: 0) Bytes4 (Fixed-size / size: 4)
│  ├─ current_version (offset: 4) Bytes4 (Fixed-size / size: 4)
│  └─ epoch (offset: 8) Epoch (Fixed-size / size: 8)
├─ latest_block_header (offset: 64) BeaconBlockHeader (Fixed-size / size: 112)
│  ├─ slot (offset: 0) Slot (Fixed-size / size: 8)
│  ├─ proposer_index (offset: 8) ValidatorIndex (Fixed-size / size: 8)
│  ├─ parent_root (offset: 16) Bytes32 (Fixed-size / size: 32)
│  ├─ state_root (offset: 48) Bytes32 (Fixed-size / size: 32)
│  └─ body_root (offset: 80) Bytes32 (Fixed-size / size: 32)
├─ block_roots (offset: 176) Vector[Bytes32, 8192] (Fixed-size / size: 262144)
├─ state_roots (offset: 262320) Vector[Bytes32, 8192] (Fixed-size / size: 262144)
├─ historical_roots (offset: 2736713) List[Bytes32, 16777216] (Variable-size / length: 758, size: 24256)
├─ eth1_data (offset: 524468) Eth1Data (Fixed-size / size: 72)
│  ├─ deposit_root (offset: 0) Bytes32 (Fixed-size / size: 32)
│  ├─ deposit_count (offset: 32) uint64 (Fixed-size / size: 8)
│  └─ block_hash (offset: 40) Bytes32 (Fixed-size / size: 32)
├─ eth1_data_votes (offset: 2760969) List[Eth1Data, 2048] (Variable-size / length: 1574, size: 113328)
├─ eth1_deposit_index (offset: 524544) uint64 (Fixed-size / size: 8)
├─ validators (offset: 2874297) List[Validator, 1099511627776] (Variable-size / length: 2113172, size: 255693812)
├─ balances (offset: 258568109) List[uint64, 1099511627776] (Variable-size / length: 2113172, size: 16905376)
├─ randao_mixes (offset: 524560) Vector[Bytes32, 65536] (Fixed-size / size: 2097152)
├─ slashings (offset: 2621712) Vector[uint64, 8192] (Fixed-size / size: 65536)
├─ previous_epoch_participation (offset: 275473485) List[uint8, 1099511627776] (Variable-size / length: 2113172, size: 2113172)
├─ current_epoch_participation (offset: 277586657) List[uint8, 1099511627776] (Variable-size / length: 2113172, size: 2113172)
├─ justification_bits (offset: 2687256) Bitvector[8] (Fixed-size / size: 1)
├─ previous_justified_checkpoint (offset: 2687257) Checkpoint (Fixed-size / size: 40)
│  ├─ epoch (offset: 0) Epoch (Fixed-size / size: 8)
│  └─ root (offset: 8) Bytes32 (Fixed-size / size: 32)
├─ current_justified_checkpoint (offset: 2687297) Checkpoint (Fixed-size / size: 40)
│  ├─ epoch (offset: 0) Epoch (Fixed-size / size: 8)
│  └─ root (offset: 8) Bytes32 (Fixed-size / size: 32)
├─ finalized_checkpoint (offset: 2687337) Checkpoint (Fixed-size / size: 40)
│  ├─ epoch (offset: 0) Epoch (Fixed-size / size: 8)
│  └─ root (offset: 8) Bytes32 (Fixed-size / size: 32)
├─ inactivity_scores (offset: 279699829) List[uint64, 1099511627776] (Variable-size / length: 2113172, size: 16905376)
├─ current_sync_committee (offset: 2687381) SyncCommittee (Fixed-size / size: 24624)
│  ├─ pubkeys (offset: 0) Vector[Bytes48, 512] (Fixed-size / size: 24576)
│  └─ aggregate_pubkey (offset: 24576) Bytes48 (Fixed-size / size: 48)
├─ next_sync_committee (offset: 2712005) SyncCommittee (Fixed-size / size: 24624)
│  ├─ pubkeys (offset: 0) Vector[Bytes48, 512] (Fixed-size / size: 24576)
│  └─ aggregate_pubkey (offset: 24576) Bytes48 (Fixed-size / size: 48)
├─ latest_execution_payload_header (offset: 296605205) ExecutionPayloadHeaderDeneb (Variable-size / size: 603)
│  ├─ parent_hash (offset: 0) Bytes32 (Fixed-size / size: 32)
│  ├─ fee_recipient (offset: 32) Bytes20 (Fixed-size / size: 20)
│  ├─ state_root (offset: 52) Bytes32 (Fixed-size / size: 32)
│  ├─ receipts_root (offset: 84) Bytes32 (Fixed-size / size: 32)
│  ├─ logs_bloom (offset: 116) Bytes256 (Fixed-size / size: 256)
│  ├─ prev_randao (offset: 372) Bytes32 (Fixed-size / size: 32)
│  ├─ block_number (offset: 404) uint64 (Fixed-size / size: 8)
│  ├─ gas_limit (offset: 412) uint64 (Fixed-size / size: 8)
│  ├─ gas_used (offset: 420) uint64 (Fixed-size / size: 8)
│  ├─ timestamp (offset: 428) uint64 (Fixed-size / size: 8)
│  ├─ extra_data (offset: 584) List[uint8, 32] (Variable-size / length: 19, size: 19)
│  ├─ base_fee_per_gas (offset: 440) Bytes32 (Fixed-size / size: 32)
│  ├─ block_hash (offset: 472) Bytes32 (Fixed-size / size: 32)
│  ├─ transactions_root (offset: 504) Bytes32 (Fixed-size / size: 32)
│  ├─ withdrawals_root (offset: 536) Bytes32 (Fixed-size / size: 32)
│  ├─ blob_gas_used (offset: 568) uint64 (Fixed-size / size: 8)
│  └─ excess_blob_gas (offset: 576) uint64 (Fixed-size / size: 8)
├─ next_withdrawal_index (offset: 2736633) uint64 (Fixed-size / size: 8)
├─ next_withdrawal_validator_index (offset: 2736641) ValidatorIndex (Fixed-size / size: 8)
├─ historical_summaries (offset: 296605808) List[HistoricalSummary, 16777216] (Variable-size / length: 810, size: 51840)
├─ deposit_requests_start_index (offset: 2736653) uint64 (Fixed-size / size: 8)
├─ deposit_balance_to_consume (offset: 2736661) Gwei (Fixed-size / size: 8)
├─ exit_balance_to_consume (offset: 2736669) Gwei (Fixed-size / size: 8)
├─ earliest_exit_epoch (offset: 2736677) Epoch (Fixed-size / size: 8)
├─ consolidation_balance_to_consume (offset: 2736685) Gwei (Fixed-size / size: 8)
├─ earliest_consolidation_epoch (offset: 2736693) Epoch (Fixed-size / size: 8)
├─ pending_deposits (offset: 296657648) List[PendingDeposit, 134217728] (Variable-size / length: 16833, size: 3231936)
├─ pending_partial_withdrawals (offset: 299889584) List[PendingPartialWithdrawal, 134217728] (Variable-size / length: 231, size: 5544)
└─ pending_consolidations (offset: 299895128) List[PendingConsolidation, 262144] (Variable-size / length: 218, size: 3488)
```
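
As a worked example of reading this tree, the sketch below locates `.validators[2045754]` inside the serialized state using only the numbers printed above. It is a minimal illustration of the arithmetic for a list of fixed-size elements, not Prysm's actual `CalculateOffsetAndLength` implementation. The `elementRange` function and the constant names are hypothetical. The element size of 121 bytes is derived from the tree (255693812 bytes / 2113172 validators) and matches the fixed SSZ size of a `Validator`.

```go
package main

import "fmt"

// Values read from the analyzed BeaconState tree above.
const (
	validatorsOffset = 2874297 // where the validators list data starts in the serialized state
	validatorsLength = 2113172 // number of validators in this state
	validatorSize    = 121     // 255693812 bytes / 2113172 validators
)

// elementRange returns the byte offset and size of element i in a list of
// fixed-size elements that starts at listOffset. It is a hypothetical helper
// for illustration, not a function from Prysm.
func elementRange(listOffset, elemSize, length, i uint64) (offset, size uint64, err error) {
	if i >= length {
		return 0, 0, fmt.Errorf("index %d out of range (length %d)", i, length)
	}
	return listOffset + i*elemSize, elemSize, nil
}

func main() {
	off, size, err := elementRange(validatorsOffset, validatorSize, validatorsLength, 2045754)
	if err != nil {
		fmt.Println(err)
		return
	}
	// prints: validators[2045754]: offset=250410531 size=121
	fmt.Printf("validators[2045754]: offset=%d size=%d\n", off, size)
}
```

As a sanity check on the tree itself, `validatorsOffset + validatorsLength*validatorSize` equals 258568109, which is exactly the offset of `balances`, the next variable-size field. Because a lookup like this is pure integer arithmetic, it is consistent with `Benchmark_C_CalculateOffsetAndLength` taking about 12 ns with zero allocations.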