Try   HackMD

Ethereum Block Building Workload

Background

The Ethereum Virtual Machine (EVM) is a transacton based state machine. The state of the system started with some genesis and was incrementally updated through executing transactions. The world state,

σ, can be thought of as all state used in the virtual machine such as accounts, balances, and more. Transactions

The state transition function can be expressed as:

(1)σt+1Υ(σt,T)

(2)σt+1Π(σt,B)

(3)B(...,(T0,T1,...),...)

(4)Π(σ,B)Υ(Υ(σ,T0),T1)

Where

Υ is the Ethereum state transition function. In Ethereum,
Υ
, together with
σ
are considerably more powerful than any existing comparable system;
Υ
allows components to carry out arbitrary computation, while
σ
allows components to store arbitrary state between transactions. Transactions are collated into blocks; blocks are chained together using a cryptographic hash as a means of reference.

Where

B is this block, which includes a series of transactions amongst some other components and
Π
is the block-level state-transition function for transactions. This is the basis of the blockchain paradigm, a model that forms the backbone of not only Ethereum, but all decentralised consensus-based transaction systems to date.

Data Structures

An Ethereum block

B is defined as a tuple of four components:

B(BH,BT,BU,BW)

where:

  • BH
    (block header): A collection of metadata and accumulators that capture the state transitions and properties of the Ethereum state machine
  • BT
    (transactions): The ordered list of transactions included in this block
  • BU
    (ommers): A deprecated field from pre-Paris hard fork that contained headers of blocks sharing a grandparent with the current block
  • BW
    (withdrawals): A collection of validator withdrawals from the consensus layer, introduced in the Shanghai hard fork

The block header

BH is formally defined as:

BH(Hp,Ho,Hc,Hr,Ht,He,Hb,Hd,Hi,Hl,Hg,Hs,Hx,Ha,Hn,Hf,Hw)

where each component is defined as:

  • Hp
    (parentHash): Keccak 256-bit hash of the parent block's header
  • Ho
    (ommersHash): Deprecated 256-bit hash field, now constant at KEC(RLP(()))
  • Hc
    (beneficiary): 160-bit address receiving this block's priority fees
  • Hr
    (stateRoot): Keccak 256-bit hash of the state trie's root node after executing all transactions and withdrawals
  • Ht
    (transactionsRoot): Keccak 256-bit hash of the trie's root node containing all block transactions
  • He
    (receiptsRoot): Keccak 256-bit hash of the trie's root node containing all transaction receipts
  • Hb
    (logsBloom): Bloom filter of indexable data (addresses and topics) from all transaction receipt logs
  • Hd
    (difficulty): Deprecated scalar field, now constant at 0
  • Hi
    (number): Scalar count of ancestor blocks (0 for genesis block)
  • Hl
    (gasLimit): Scalar maximum gas expenditure allowed for this block
  • Hg
    (gasUsed): Scalar total gas consumed by all transactions in this block
  • Hs
    (timestamp): Scalar Unix timestamp at block creation
  • Hx
    (extraData): Arbitrary byte array of block-relevant data, maximum 32 bytes
  • Ha
    (prevRandao): Latest RANDAO mix from the previous block's post-beacon state
  • Hn
    (nonce): Deprecated 64-bit field, now constant at 0x0000000000000000
  • Hf
    (baseFeePerGas): Scalar wei amount burned per unit of gas consumed
  • Hw
    (withdrawalsRoot): Keccak 256-bit hash of the trie's root node containing all consensus layer withdrawals

Block Validity

A block is valid if and only if:

Valid(B)EmptyOmmers(B)ConsistentHeader(B)

A block header

BH is considered consistent if and only if all required fields match their derived values after block execution.
ConsistentHeader(B)
returns true if the following conditions hold simultaneously:

  1. BU() 

  2. HrTRIE(LS(Π(σ,B))) 

  3. HtTRIE({i<BT,iN:pT(i,BT[i])}) 

  4. HeTRIE({i<BR,iN:pR(i,BR[i])}) 

  5. HwTRIE({i<BW,iN:pW(i,BW[i])}) 

  6. HbrBRrb

Each coniditon is explained as follows:

  1. Empty ommers field

  2. State Root Consistency:

    HrTRIE(LS(Π(σ,B)))

    • Where
      σ
      is the base state
    • Π(σ,B)
      represents state after executing all transactions and withdrawals
    • LS
      converts state to trie entries
    • Must match final post-execution state root
  3. Transaction Root Consistency:

    HtTRIE({i<BT,iN:pT(i,BT[i])})

    • Each transaction is RLP encoded with its index
    • Must match root of transaction trie
    • Includes special handling for EIP-2718 transactions
  4. Receipts Root Consistency:

    HeTRIE({i<BR,iN:pR(i,BR[i])})

    • Each receipt is RLP encoded with its index
    • Must match root of receipt trie
    • Generated from actual transaction execution results
  5. Withdrawals Root Consistency:

    HwTRIE({i<BW,iN:pW(i,BW[i])})

    • Each withdrawal is RLP encoded with its index
    • Must match root of withdrawals trie
    • Reflects consensus layer withdrawals
  6. Logs Bloom Consistency:

    HbrBRrb

    • Bloom filter must match union of all receipt logs
    • Combines log entries from all transaction receipts
    • Includes logger addresses and log topics

State Root Consistency

The state root must satisfy:

TRIE(LS(σ))=P(BH)Hr

where:

  • TRIE(LS(σ))
    is the Merkle Patricia tree root hash of state
    σ
  • P(BH)
    is the parent block
  • Values are RLP encoded
  • σ
    is the state after executing all transactions and withdrawals

Execution Order Dependencies

For consistency to hold:

  1. All transactions must be executed in order:
    BT[0]BT[1]...BT[n1]
  2. All withdrawals must be processed after transactions:
    BW[0]BW[1]...BW[m1]
  3. State root (
    Hr
    ) must reflect final state after both transactions and withdrawals
  4. Receipt root (
    He
    ) must include results from all transaction executions
  5. Logs bloom (
    Hb
    ) must include all logs from all receipts

Validation Process

ValidateHeader(B):=

  1. Execute all transactions in order
  2. Process all withdrawals in order
  3. Compute final state root
  4. Compute transaction trie root
  5. Compute receipt trie root
  6. Compute withdrawals trie root
  7. Compute logs bloom filter
  8. Verify all computed values match header fields

A header is consistent if and only if all computed values exactly match their corresponding header fields.

Computation

Merkle Patricia Tries

There are four Modified Merkle Patricia Tries (MPT) used in maintenance of the Ethereum World State

σ:

  • State Trie
  • Transaction Trie
  • Receipt Trie
  • Withdrawal Trie

Transaction Execution


Symbol Key:

  • σt
    = state at time
    t
  • Π
    = block-level state transition function
  • B
    = block
  • Tn
    = transaction
    n
  • Υ
    = transaction-level state transition function

Any transactions executed first pass the initial tests of intrinsic validity. (as of October 22nd)

  1. The transaction is well-formed RLP, with no additional trailing bytes;
  2. the transaction signature is valid;
  3. the transaction nonce is valid (equivalent to the
    sender account’s current nonce);
  4. the sender account has no contract code deployed
    (see EIP-3607 by Feist et al. [2021]);
  5. the gas limit is no smaller than the intrinsic gas,
    g0, used by the transaction;
  6. the sender account balance contains at least the
    cost, v0, required in up-front payment;
  7. the maxFeePerGas, Tm, in the case of type 2
    transactions, or gasPrice, Tp, in the case of type 0 and type 1 transactions, is greater than or equal to the block’s base fee, Hf; and
  8. for type 2 transactions, maxPriorityFeePerGas, Tf , must be no larger than maxFeePerGas,
    Tm.

The process of finalising a block involves three stages:

  1. executing withdrawals;
  2. validate transactions;
  3. verify state.

Time Complexity

Block Creation Complexity=O(nkh+ns+nlogn)

where:

  • n
    = number of transactions
  • k
    = accounts touched per transaction
  • h
    = height of state trie
    log(A)
  • s
    = signature verification cost (constant)
  • A
    = total number of accounts in state

Component Breakdown

  1. State Updates:

    O(nklog(A))

    • n
      transactions
    • k
      accounts per tx
    • log(A)
      trie depth
  2. Signature Verifications:

    O(n)

    • One verification per tx
    • Constant time per verification
  3. Merkle Tree:

    O(nlogn)

    • Binary tree construction
    • n
      leaf nodes

Memory=O(nk)

Instruction Level Analysis

Empirical Values

For typical values:

  • When
    n=100
    transactions
  • k=2
    accounts per tx
  • A=224
    accounts in state
  • h=log(A)=24
  • average transaction size = 200 bytes

The dominant term becomes:

O(100224)=O(4800) for state updates

Block Header Size

Field Symbol Size in Bits Size in Bytes Notes
parentHash
Hp
256 32 Keccak hash
ommersHash
Ho
256 32 Keccak hash
beneficiary
Hc
160 20 Address
stateRoot
Hr
256 32 Keccak hash
transactionsRoot
Ht
256 32 Keccak hash
receiptsRoot
He
256 32 Keccak hash
logsBloom
Hb
2048 256 Bloom filter
difficulty
Hd
256 32 Zero value
number
Hi
256 32 Block number
gasLimit
Hl
256 32 Gas limit
gasUsed
Hg
256 32 Gas used
timestamp
Hs
256 32 Unix timestamp
extraData
Hx
≤256 ≤32 Variable, max 32
prevRandao
Ha
256 32 RANDAO value
nonce
Hn
64 8 Zero value
baseFeePerGas
Hf
256 32 Base fee
withdrawalsRoot
Hw
256 32 Keccak hash

Total Size

Total Fixed Bits   = 5600 bits
Total Fixed Bytes  = 700 bytes

This breaks down as:

  • 13 fields of 256 bits (32 bytes) each = 3328 bits (416 bytes)
  • 1 field of 2048 bits (256 bytes) = 2048 bits (256 bytes)
  • 1 field of 160 bits (20 bytes) = 160 bits (20 bytes)
  • 1 field of 64 bits (8 bytes) = 64 bits (8 bytes)
  • 1 variable field of up to 256 bits (32 bytes)

Maximum total size including variable field:

Maximum Total Bits  = 5600 bits
Maximum Total Bytes = 700 bytes

In practice, the actual size might be slightly less due to:

  1. extraData often being smaller than 32 bytes
  2. RLP encoding overhead not included in this calculation
  3. Some number fields potentially using fewer bytes when encoded