
Deep into EIP-4844

tags: Blockchain, Research

1. Execution Layer

A. Transaction Type

We introduce a new transaction type, BLOB_TX_TYPE, via EIP-2718.

A1. Message & Signature

class SignedBlobTransaction(Container):
    message: BlobTransaction
    signature: ECDSASignature
    
class BlobTransaction(Container):
    chain_id: uint256
    nonce: uint64
    max_priority_fee_per_gas: uint256
    max_fee_per_gas: uint256
    gas: uint64
    to: Union[None, Address] # Address = Bytes20
    value: uint256
    data: ByteList[MAX_CALLDATA_SIZE]
    access_list: List[AccessTuple, MAX_ACCESS_LIST_SIZE]
    max_fee_per_data_gas: uint256
    blob_versioned_hashes: List[VersionedHash, MAX_OBJECT_LIST_SIZE]

class AccessTuple(Container):
    address: Address # Bytes20
    storage_keys: List[Hash, MAX_ACCESS_LIST_STORAGE_KEYS]

class ECDSASignature(Container):
    y_parity: boolean
    r: uint256
    s: uint256

A2. Converting a blob to its corresponding KZG commitment:

def blob_to_kzg(blob: Vector[BLSFieldElement, CHUNKS_PER_BLOB]) -> KZGCommitment:
    computed_kzg = bls.Z1
    for value, point_kzg in zip(blob, KZG_SETUP_LAGRANGE):
        assert value < BLS_MODULUS
        computed_kzg = bls.add(
            computed_kzg,
            bls.multiply(point_kzg, value)
        )
    return computed_kzg
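The commitment above is a linear combination of trusted-setup points weighted by the blob's field elements, so it is linear in the blob. The following is a toy sketch of that structure using integers modulo a small prime in place of BLS12-381 group points; `TOY_SETUP` and `TOY_MODULUS` are illustrative stand-ins, not spec constants.

```python
# Toy sketch of the linear-combination structure of blob_to_kzg.
# Integers mod a small prime stand in for BLS12-381 group operations.
TOY_MODULUS = 101
TOY_SETUP = [3, 7, 13, 29]  # stand-in for KZG_SETUP_LAGRANGE

def toy_commit(blob):
    assert len(blob) == len(TOY_SETUP)
    commitment = 0  # stand-in for bls.Z1 (the identity element)
    for value, point in zip(blob, TOY_SETUP):
        assert value < TOY_MODULUS
        commitment = (commitment + point * value) % TOY_MODULUS
    return commitment

# The commitment is homomorphic: commit(a) + commit(b) == commit(a + b)
a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
ab = [(x + y) % TOY_MODULUS for x, y in zip(a, b)]
assert (toy_commit(a) + toy_commit(b)) % TOY_MODULUS == toy_commit(ab)
```

This linearity is what makes aggregated proofs and data-availability-sampling style techniques possible later on.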

A3. Converting a KZG commitment into a versioned hash:

def kzg_to_versioned_hash(kzg: KZGCommitment) -> VersionedHash:
    return BLOB_COMMITMENT_VERSION_KZG + hash(kzg)[1:]

VersionedHash is used here instead of the raw KZG commitment for forward compatibility. If KZG is ever replaced (for example by a STARK-based commitment, whose format is very different), the 32-byte VersionedHash stays the same and only the version byte changes, so the precompile interface and proof formats elsewhere do not need to be replaced.
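A runnable sketch of the versioned-hash layout, assuming SHA-256 as the hash function and 0x01 as the KZG version byte (both are assumptions taken from the EIP draft):

```python
import hashlib

# Sketch of the versioned-hash layout: one version byte followed by the
# last 31 bytes of the commitment hash. The hash function (SHA-256) and
# version byte value (0x01) are assumptions from the EIP draft.
BLOB_COMMITMENT_VERSION_KZG = b"\x01"

def kzg_to_versioned_hash(kzg_commitment: bytes) -> bytes:
    return BLOB_COMMITMENT_VERSION_KZG + hashlib.sha256(kzg_commitment).digest()[1:]

vh = kzg_to_versioned_hash(b"\x00" * 48)  # a KZG commitment is 48 bytes
assert len(vh) == 32
assert vh[0] == 0x01  # a future scheme (e.g. STARK-based) would bump this byte
```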

B. Opcode to get versioned hashes

New opcode: DATA_HASH (with byte value HASH_OPCODE_BYTE).

It takes one stack argument, index, and returns tx.header.blob_versioned_hashes[index] if index < len(tx.header.blob_versioned_hashes), and zero otherwise.

The opcode has a gas cost of HASH_OPCODE_GAS.

C. Precompile

Note: the blob_verification_precompile from the original design has been removed:

https://github.com/ethereum/EIPs/commit/f45bd0c101944dc703bd8a80c6b064b47e1f7390

C1. Point evaluation precompile

def point_evaluation_precompile(input: Bytes) -> Bytes:
    """
    Verify p(z) = y given commitment that corresponds to the polynomial p(x) and a KZG proof.
    Also verify that the provided commitment matches the provided versioned_hash.
    """
    # The data is encoded as follows: versioned_hash | z | y | commitment | proof |
    versioned_hash = input[:32]
    z = input[32:64]
    y = input[64:96]
    commitment = input[96:144]
    kzg_proof = input[144:192]

    # Verify commitment matches versioned_hash
    assert kzg_to_versioned_hash(commitment) == versioned_hash

    # Verify KZG proof
    assert verify_kzg_proof(commitment, z, y, kzg_proof)

    # Return FIELD_ELEMENTS_PER_BLOB and BLS_MODULUS as padded 32 byte big endian values
    return Bytes(U256(FIELD_ELEMENTS_PER_BLOB).to_be_bytes32() + U256(BLS_MODULUS).to_be_bytes32())
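The 192-byte input layout implied by the slicing above can be checked by assembling it field by field (the byte values here are placeholders):

```python
# Sketch of the 192-byte precompile input layout:
# versioned_hash (32) | z (32) | y (32) | commitment (48) | proof (48)
versioned_hash = b"\xaa" * 32
z = b"\xbb" * 32
y = b"\xcc" * 32
commitment = b"\xdd" * 48
proof = b"\xee" * 48

precompile_input = versioned_hash + z + y + commitment + proof
assert len(precompile_input) == 192

# The slices used by the precompile recover each field:
assert precompile_input[:32] == versioned_hash
assert precompile_input[32:64] == z
assert precompile_input[96:144] == commitment
assert precompile_input[144:192] == proof
```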

2. Consensus Layer

A. Node

  • Beacon chain: process updated beacon blocks and ensure blobs are available.
  • P2P network: gossip and sync updated beacon block types and new blobs sidecars.
  • Honest validator: produce beacon blocks with blobs, publish the blobs sidecars.

B. Beacon chain

On the consensus-layer the blobs are now referenced, but not fully encoded, in the beacon block body. Instead of embedding the full contents in the body, the contents of the blobs are propagated separately, as a “sidecar”.

This “sidecar” design provides forward compatibility for further data increases by black-boxing is_data_available(): with full sharding is_data_available() can be replaced by data-availability-sampling (DAS) thus avoiding all blobs being downloaded by all beacon nodes on the network.

B1. Constructing the BeaconBlockBody

We add a BeaconBlockBody struct:

class BeaconBlockBody(Container):
    randao_reveal: BLSSignature
    eth1_data: Eth1Data  # Eth1 data vote
    graffiti: Bytes32  # Arbitrary data
    # Operations
    proposer_slashings: List[ProposerSlashing, MAX_PROPOSER_SLASHINGS]
    attester_slashings: List[AttesterSlashing, MAX_ATTESTER_SLASHINGS]
    attestations: List[Attestation, MAX_ATTESTATIONS]
    deposits: List[Deposit, MAX_DEPOSITS]
    voluntary_exits: List[SignedVoluntaryExit, MAX_VOLUNTARY_EXITS]
    sync_aggregate: SyncAggregate
    # Execution
    execution_payload: ExecutionPayload  # [Modified in EIP-4844]
    bls_to_execution_changes: List[SignedBLSToExecutionChange, MAX_BLS_TO_EXECUTION_CHANGES]
    blob_kzg_commitments: List[KZGCommitment, MAX_BLOBS_PER_BLOCK]  # [New in EIP-4844]

Nodes broadcast this over the network instead of plain beacon blocks. When a node receives it, it first calls validate_blobs_and_kzg_commitments, and if this call passes, it runs the usual processing on the beacon block.

B2. Verifying that each blob's KZG commitment is correct.

def validate_blobs_and_kzg_commitments(execution_payload: ExecutionPayload,
                                       blobs: Sequence[Blob],
                                       blob_kzg_commitments: Sequence[KZGCommitment]) -> None:
    # Optionally sanity-check that the KZG commitments match the versioned hashes in the transactions
    assert verify_kzg_commitments_against_transactions(execution_payload.transactions, blob_kzg_commitments)

    # Optionally sanity-check that the KZG commitments match the blobs (as produced by the execution engine)
    assert len(blob_kzg_commitments) == len(blobs)
    # Note: use all(...) here - asserting a (non-empty) list is always truthy
    assert all(blob_to_kzg_commitment(blob) == commitment
               for blob, commitment in zip(blobs, blob_kzg_commitments))

If valid, set block.body.blob_kzg_commitments = blob_kzg_commitments.

B3. Constructing the SignedBeaconBlockAndBlobsSidecar

Set signed_beacon_block_and_blobs_sidecar.beacon_block = block, where block is obtained above.

def get_blobs_sidecar(block: BeaconBlock, blobs: Sequence[Blob]) -> BlobsSidecar:
    return BlobsSidecar(
        beacon_block_root=hash_tree_root(block),
        beacon_block_slot=block.slot,
        blobs=blobs,
        kzg_aggregated_proof=compute_aggregate_kzg_proof(blobs),
    )

This signed_beacon_block_and_blobs_sidecar is then published to the global beacon_block_and_blobs_sidecar topic.

After publishing, peers on the network may request the sidecar through sync requests, or a local user may be interested in it. The validator MUST hold on to sidecars for MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS epochs and serve them when capable, to ensure the data availability of these blobs throughout the network.

After MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS epochs, nodes MAY prune the sidecars and/or stop serving them.
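As a back-of-envelope check of what this retention window means in wall-clock time, assuming a draft value of MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS = 4096 (an assumption; the final constant may differ) and mainnet slot/epoch timings:

```python
# Back-of-envelope retention window for blob sidecars.
# MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS = 4096 is an assumed draft value;
# slot/epoch timings are mainnet constants.
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32
MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS = 4096  # assumption

retention_seconds = (MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS
                     * SLOTS_PER_EPOCH * SECONDS_PER_SLOT)
retention_days = retention_seconds / 86400
assert 18 < retention_days < 19  # roughly two and a half weeks
```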

C. Cross-validation

We also add a cross-validation requirement that checks equivalence between the BeaconBlock and its contained ExecutionPayload (this check can be done later than the verification of the beacon block and of the payload).

Verify that the versioned hashes match the KZG commitments in the block body:

def cross_validate(block: SignedBeaconBlock):
    body = block.message.body
    all_versioned_hashes = []
    for tx in body.execution_payload.blob_transactions:
        all_versioned_hashes.extend(tx.header.blob_versioned_hashes)
    assert all_versioned_hashes == [
        kzg_to_versioned_hash(kzg) for kzg in body.blob_kzgs
    ]

C1. Engine API

https://github.com/ethereum/execution-apis/pull/197

BlobsBundleV1
blockHash: DATA, 32 bytes
kzgs: Array of DATA - array of KZG commitments
blobs: Array of DATA - array of blobs, each blob SSZ-encoded (DATA)

Engine_getBlobsBundleV1

This method retrieves the blobs and their respective KZG commitments corresponding to the versioned_hashes included in the blob transactions of the referenced execution payload.

engine_getBlobsBundleV1
params:
	payloadId: DATA, 8 bytes  # identifier of the payload build process
result: BlobsBundleV1

Engine_getPayloadV1

engine_getPayloadV1
params:
	payloadId: DATA, 8 bytes
result: ExecutionPayloadV1

engine_getBlobsBundleV1 may be combined with engine_getPayloadV1 into an engine_getPayloadV2 in a later stage of EIP-4844. The separation of concerns aims to minimize changes during the testing phase of the EIP.

Engine_getPayloadV2

engine_getPayloadV2
params:
	payloadId: DATA, 8 bytes
result: Object{
	payload: ExecutionPayloadV1
	blobs: BlobsBundleV1
}

3. The Mempool

Blob data exists only in the network-wrapper representation of the transaction. From the perspective of the execution layer, blob data is not persisted and is not accessible in the EVM; it is purely meant for data availability. In the EVM, the data can only be proven to be present via its VersionedHash.

The validity of the blobs is guaranteed outside the execution layer. This matters in the long run, because in the future these blobs will be broadcast on separate subnets (DAS & reconstruction) 📢.

Transactions are presented as TransactionType || TransactionNetworkPayload on the execution-layer network; the payload is an SSZ-encoded container:

class BlobTransactionNetworkWrapper(Container):
    tx: SignedBlobTransaction
    # KZGCommitment = Bytes48
    blob_kzgs: List[KZGCommitment, MAX_TX_WRAP_KZG_COMMITMENTS]
    # BLSFieldElement = uint256
    blobs: List[Vector[BLSFieldElement, FIELD_ELEMENTS_PER_BLOB], LIMIT_BLOBS_PER_TX]
    # KZGProof = Bytes48
    kzg_aggregated_proof: KZGProof

We do network-level validation of BlobTransactionNetworkWrapper objects as follows:

def validate_blob_transaction_wrapper(wrapper: BlobTransactionNetworkWrapper):
    versioned_hashes = wrapper.tx.message.blob_versioned_hashes
    commitments = wrapper.blob_kzgs
    blobs = wrapper.blobs
    # note: assert blobs are not malformatted
    assert len(versioned_hashes) == len(commitments) == len(blobs)

    # Verify that commitments match the blobs by checking the KZG proof
    assert verify_aggregate_kzg_proof(blobs, commitments, wrapper.kzg_aggregated_proof)

    # Now that all commitments have been verified, check that versioned_hashes matches the commitments
    for versioned_hash, commitment in zip(versioned_hashes, commitments):
        assert versioned_hash == kzg_to_versioned_hash(commitment)

4. Rollup Integration

A. Architecture

L2 <> L1

Blob Lifecycle

A1. ZK-Rollup

In a ZK rollup we could pass the blob data as a private input and verify it against the KZG commitment inside the SNARK, using an elliptic-curve linear combination (or pairing). But this is costly and very inefficient.

Instead, we can use a proof-of-equivalence protocol to prove that the blob's KZG commitment and the ZK rollup's data commitment point to the same data.

Choosing the point z as a hash of all the commitments ensures that there is no way to manipulate the data or the commitments after learning z (this is standard Fiat-Shamir reasoning).

Note that ZK rollups do not verify the KZG commitment directly; they verify through the point evaluation precompile. This way, if KZG is later replaced by a quantum-resistant scheme, ZK rollups will not face new difficulties.


The idea here is that we need a random point that neither the producer nor the verifier can choose, and to check that the data evaluated at that point is the same in both the KZG blob commitment and the ZK rollup's data commitment.
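A toy sketch of this proof-of-equivalence idea, over a small prime field with hash-based stand-ins for the two commitment schemes (all names and constants here are illustrative, not the real protocol):

```python
import hashlib

# Toy proof-of-equivalence sketch. Two "commitments" to the same data
# (here just hashes of two encodings); z is derived by hashing both
# commitments (Fiat-Shamir), so neither side can pick z; then both sides
# evaluate their polynomial at z and compare.
P = 10007  # toy field modulus, not BLS_MODULUS

def eval_poly(coeffs, z):
    # Horner evaluation of sum(coeffs[i] * z**i) mod P
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * z + c) % P
    return acc

data = [3, 1, 4, 1, 5]  # shared underlying data
commit_a = hashlib.sha256(b"blob-commitment" + bytes(data)).digest()
commit_b = hashlib.sha256(b"rollup-commitment" + bytes(data)).digest()

# Fiat-Shamir: z is a hash of all commitments, reduced into the field
z = int.from_bytes(hashlib.sha256(commit_a + commit_b).digest(), "big") % P

# Equal evaluations at a random z imply equality of the polynomials w.h.p.
assert eval_poly(data, z) == eval_poly(list(data), z)

tampered = [3, 1, 4, 1, 6]  # differs in one coefficient
if z != 0:
    # the difference polynomial is z**4, which is nonzero for any z != 0 mod P
    assert eval_poly(data, z) != eval_poly(tampered, z)
```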

5. Fee Market

FYI:

Gas fees are the pricing of Ethereum's resources. At present we use a one-dimensional pricing method (only a base fee), but with the growth of historical state and with data sharding, this method is very inefficient.

The resources of Ethereum can be classified into:

  • Burst limits
    • Bandwidth
    • Compute
    • State access
    • Memory
  • Sustained limits
    • State growth
    • History growth

For EIP-4844 we will mainly take up two resources:

  • Bandwidth
  • History growth

History growth is not a big problem: blobs are not stored permanently, only for a roughly one-month validity period.

Pricing for bandwidth will be a bigger issue.

EIP-4844 introduces a multi-dimensional EIP-1559 fee market, where there are two resources, gas and blobs, with separate floating gas prices and separate limits.

See PR 5707 (Fee Market Update):

https://github.com/ethereum/EIPs/commit/7e8d2629508c4d571f0124e4fc67a9ac13ee8b9a

Introduce data gas as a second type of gas, used to charge for blobs (1 byte = 1 data gas).

Data gas has its own EIP-1559-style dynamic pricing mechanism:

  • MAX_DATA_GAS_PER_BLOCK, with the target set to half of that
  • Transactions specify max_fee_per_data_gas: uint256
  • No separate tip for simplicity
  • MIN_DATA_GASPRICE so that one blob costs at least ~0.00001 ETH
  • Track excess_data_gas instead of basefee

We use the excess_data_gas header field to store persistent data needed to compute the data gas price. For now, only blobs are priced in data gas.

def calc_data_fee(tx: SignedBlobTransaction, parent: Header) -> int:
    return get_total_data_gas(tx) * get_data_gasprice(parent)

def get_total_data_gas(tx: SignedBlobTransaction) -> int:
    return DATA_GAS_PER_BLOB * len(tx.message.blob_versioned_hashes)

def get_data_gasprice(header: Header) -> int:
    return fake_exponential(
        MIN_DATA_GASPRICE,
        header.excess_data_gas,
        DATA_GASPRICE_UPDATE_FRACTION
    )
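The fake_exponential helper referenced above approximates factor * e**(numerator / denominator) using only integer arithmetic. One possible implementation, following the Taylor-series approach in the EIP draft (the specific test values below are illustrative):

```python
def fake_exponential(factor: int, numerator: int, denominator: int) -> int:
    """Integer Taylor-series approximation of factor * e**(numerator / denominator)."""
    i = 1
    output = 0
    numerator_accum = factor * denominator
    while numerator_accum > 0:
        output += numerator_accum
        # next Taylor term: multiply by numerator/denominator, divide by i
        numerator_accum = (numerator_accum * numerator) // (denominator * i)
        i += 1
    return output // denominator

# With zero excess data gas the price sits at the floor:
assert fake_exponential(1, 0, 1) == 1  # e**0 == 1
# The price grows roughly exponentially in excess_data_gas:
assert fake_exponential(1, 100, 100) == 2  # ~e**1 == 2.718..., floored
```

Because the price is exponential in excess_data_gas, sustained demand above the target pushes blob prices up quickly, while the MIN_DATA_GASPRICE floor keeps a single blob from ever being effectively free.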