# Deep into EIP-4844

###### tags: `Blockchain Research`

----

# 1. Execution Layer

## A. Transaction Type

We introduce a new transaction type `BLOB_TX_TYPE` via [EIP-2718](https://eips.ethereum.org/EIPS/eip-2718).

![](https://i.imgur.com/Q0FM9jI.png)

### A1. Message & Signature

```python
class SignedBlobTransaction(Container):
    message: BlobTransaction
    signature: ECDSASignature

class BlobTransaction(Container):
    chain_id: uint256
    nonce: uint64
    max_priority_fee_per_gas: uint256
    max_fee_per_gas: uint256
    gas: uint64
    to: Union[None, Address]  # Address = Bytes20
    value: uint256
    data: ByteList[MAX_CALLDATA_SIZE]
    access_list: List[AccessTuple, MAX_ACCESS_LIST_SIZE]
    max_fee_per_data_gas: uint256
    blob_versioned_hashes: List[VersionedHash, MAX_OBJECT_LIST_SIZE]

class AccessTuple(Container):
    address: Address  # Bytes20
    storage_keys: List[Hash, MAX_ACCESS_LIST_STORAGE_KEYS]

class ECDSASignature(Container):
    y_parity: boolean
    r: uint256
    s: uint256
```

### A2. Converting a blob to its corresponding KZG point

```python
def blob_to_kzg(blob: Vector[BLSFieldElement, CHUNKS_PER_BLOB]) -> KZGCommitment:
    computed_kzg = bls.Z1
    for value, point_kzg in zip(blob, KZG_SETUP_LAGRANGE):
        assert value < BLS_MODULUS
        computed_kzg = bls.add(
            computed_kzg,
            bls.multiply(point_kzg, value)
        )
    return computed_kzg
```

### A3. Keccak256: Converting a KZG point into a versioned hash

```python
def kzg_to_versioned_hash(kzg: KZGCommitment) -> VersionedHash:
    return BLOB_COMMITMENT_VERSION_KZG + hash(kzg)[1:]
```

A VersionedHash is used here instead of the raw KZG commitment for forward compatibility. A KZG commitment is 48 bytes, while a VersionedHash is a fixed 32 bytes, so if KZG is ever replaced (for example by a STARK-based scheme with a very different proof format), the commitment scheme can be swapped by bumping the version byte, without changing the precompile interface or other formats.

<aside>
💡 The EVM uses a stack-based architecture. The word size (that is, the size of a data element on the stack) is 256 bits (32 bytes). This facilitates 256-bit computations such as Keccak hashing and elliptic-curve operations.
</aside>

## B. **Opcode to get versioned hashes**

New opcode: `DATA_HASH` (with byte value `HASH_OPCODE_BYTE`). It takes one stack argument `index` and returns `tx.header.blob_versioned_hashes[index]` if `index < len(tx.header.blob_versioned_hashes)`, and zero otherwise. The opcode has a gas cost of `HASH_OPCODE_GAS`.

## C. Precompile

Note: We removed `blob_verification_precompile` as originally designed: [https://github.com/ethereum/EIPs/commit/f45bd0c101944dc703bd8a80c6b064b47e1f7390](https://github.com/ethereum/EIPs/commit/f45bd0c101944dc703bd8a80c6b064b47e1f7390)

### C1. Point evaluation precompile

```python
def point_evaluation_precompile(input: Bytes) -> Bytes:
    """
    Verify p(z) = y given commitment that corresponds to the polynomial p(x) and a KZG proof.
    Also verify that the provided commitment matches the provided versioned_hash.
    """
    # The data is encoded as follows: versioned_hash | z | y | commitment | proof
    versioned_hash = input[:32]
    z = input[32:64]
    y = input[64:96]
    commitment = input[96:144]
    kzg_proof = input[144:192]

    # Verify commitment matches versioned_hash
    assert kzg_to_versioned_hash(commitment) == versioned_hash

    # Verify KZG proof
    assert verify_kzg_proof(commitment, z, y, kzg_proof)

    # Return FIELD_ELEMENTS_PER_BLOB and BLS_MODULUS as padded 32 byte big endian values
    return Bytes(U256(FIELD_ELEMENTS_PER_BLOB).to_be_bytes32() + U256(BLS_MODULUS).to_be_bytes32())
```

# 2. Consensus Layer

## A. **Node**

- Beacon chain: process updated beacon blocks and ensure blobs are available.
- P2P network: gossip and sync updated beacon block types and new blob sidecars.
- Honest validator: produce beacon blocks with blobs, publish the blob sidecars.

## B. Beacon chain

On the consensus layer the blobs are now referenced, but not fully encoded, in the beacon block body.
Instead of embedding the full contents in the body, the contents of the blobs are propagated separately, as a "sidecar". This "sidecar" design provides forward compatibility for further data increases by black-boxing `is_data_available()`: with full sharding, `is_data_available()` can be replaced by data-availability sampling (DAS), thus avoiding all blobs being downloaded by all beacon nodes on the network.

### B1. **Constructing the `BeaconBlockBody`**

We modify the `BeaconBlockBody` container:

```python
class BeaconBlockBody(Container):
    randao_reveal: BLSSignature
    eth1_data: Eth1Data  # Eth1 data vote
    graffiti: Bytes32  # Arbitrary data
    # Operations
    proposer_slashings: List[ProposerSlashing, MAX_PROPOSER_SLASHINGS]
    attester_slashings: List[AttesterSlashing, MAX_ATTESTER_SLASHINGS]
    attestations: List[Attestation, MAX_ATTESTATIONS]
    deposits: List[Deposit, MAX_DEPOSITS]
    voluntary_exits: List[SignedVoluntaryExit, MAX_VOLUNTARY_EXITS]
    sync_aggregate: SyncAggregate
    # Execution
    execution_payload: ExecutionPayload  # [Modified in EIP-4844]
    bls_to_execution_changes: List[SignedBLSToExecutionChange, MAX_BLS_TO_EXECUTION_CHANGES]
    blob_kzg_commitments: List[KZGCommitment, MAX_BLOBS_PER_BLOCK]  # [New in EIP-4844]
```

Nodes broadcast this over the network instead of plain beacon blocks. When a node receives it, it first calls `validate_blobs_and_kzg_commitments`, and if this call passes it runs the usual processing on the beacon block.

### B2. Verifying each blob KZG commitment is correct

```python
def validate_blobs_and_kzg_commitments(execution_payload: ExecutionPayload,
                                       blobs: Sequence[Blob],
                                       blob_kzg_commitments: Sequence[KZGCommitment]) -> None:
    # Optionally sanity-check that the KZG commitments match the versioned hashes in the transactions
    assert verify_kzg_commitments_against_transactions(execution_payload.transactions, blob_kzg_commitments)

    # Optionally sanity-check that the KZG commitments match the blobs (as produced by the execution engine)
    assert len(blob_kzg_commitments) == len(blobs)
    assert all(blob_to_kzg_commitment(blob) == commitment
               for blob, commitment in zip(blobs, blob_kzg_commitments))
```

If valid, set `block.body.blob_kzg_commitments = blob_kzg_commitments`.

### B3. Constructing the `SignedBeaconBlockAndBlobsSidecar`

Set `signed_beacon_block_and_blobs_sidecar.beacon_block = block` where `block` is obtained above.

```python
def get_blobs_sidecar(block: BeaconBlock, blobs: Sequence[Blob]) -> BlobsSidecar:
    return BlobsSidecar(
        beacon_block_root=hash_tree_root(block),
        beacon_block_slot=block.slot,
        blobs=blobs,
        kzg_aggregated_proof=compute_aggregate_kzg_proof(blobs),
    )
```

This `signed_beacon_block_and_blobs_sidecar` is then published to the global `beacon_block_and_blobs_sidecar` topic. After publishing, peers on the network may request the sidecar through sync requests, or a local user may be interested. The validator MUST hold on to sidecars for `MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS` epochs and serve them when capable, to ensure the data availability of these blobs throughout the network. After `MIN_EPOCHS_FOR_BLOBS_SIDECARS_REQUESTS` epochs, nodes MAY prune the sidecars and/or stop serving them.

## C. Cross-validation

We also add a cross-validation requirement to check equivalence between the `BeaconBlock` and its contained `ExecutionPayload` (this check can be done later than the verifications of the beacon block and the payload).

**Verifying that the versioned hashes match the KZG commitments in the block body.**

```python
def cross_validate(block: SignedBeaconBlock):
    body = block.message.body
    all_versioned_hashes = []
    for tx in body.execution_payload.blob_transactions:
        all_versioned_hashes.extend(tx.header.blob_versioned_hashes)
    assert all_versioned_hashes == [
        kzg_to_versioned_hash(kzg) for kzg in body.blob_kzg_commitments
    ]
```

### C1. Engine API

[https://github.com/ethereum/execution-apis/pull/197](https://github.com/ethereum/execution-apis/pull/197)

```markdown
BlobsBundleV1
  blockHash: DATA, 32 Bytes
  kzgs: Array of DATA - Array of KZG commitments
  blobs: Array of DATA - Array of blobs, each blob is SSZ encoded (DATA)
```

**engine_getBlobsBundleV1**

This method retrieves the blobs and their respective KZG commitments corresponding to the `versioned_hashes` included in the blob transactions of the referenced execution payload.

```markdown
engine_getBlobsBundleV1
  params: payloadId: DATA, 8 Bytes  # Identifier of the payload build process
  result: BlobsBundleV1
```

**engine_getPayloadV1**

```markdown
engine_getPayloadV1
  params: payloadId: DATA, 8 Bytes
  result: ExecutionPayloadV1
```

This method may be combined with `engine_getPayloadV1` into an `engine_getPayloadV2` in a later stage of EIP-4844. The separation of concerns aims to minimize changes during the testing phase of the EIP.

**engine_getPayloadV2**

```markdown
engine_getPayloadV2
  params: payloadId: DATA, 8 Bytes
  result: Object {
    payload: ExecutionPayloadV1
    blobs: BlobsBundleV1
  }
```

# 3. The Mempool

The blob data exists only in the network-wrapper representation of the transaction. From the perspective of the execution layer, the blob data is not persisted and is not accessible to the EVM. Blob data is purely meant for data availability.
In the EVM, the data can only be proven to exist via its `VersionedHash`; the validity of the blob itself is guaranteed elsewhere in the architecture. This matters in the long run, because in the future these blobs will be broadcast on another subnet (DAS & reconstruction) 📢.

![](https://i.imgur.com/APMWKNV.png)

Transactions are presented as `TransactionType || TransactionNetworkPayload` on the execution layer network; the payload is an SSZ-encoded container:

```python
class BlobTransactionNetworkWrapper(Container):
    tx: SignedBlobTransaction
    # KZGCommitment = Bytes48
    blob_kzgs: List[KZGCommitment, MAX_TX_WRAP_KZG_COMMITMENTS]
    # BLSFieldElement = uint256
    blobs: List[Vector[BLSFieldElement, FIELD_ELEMENTS_PER_BLOB], LIMIT_BLOBS_PER_TX]
    # KZGProof = Bytes48
    kzg_aggregated_proof: KZGProof
```

We do network-level validation of `BlobTransactionNetworkWrapper` objects as follows:

```python
def validate_blob_transaction_wrapper(wrapper: BlobTransactionNetworkWrapper):
    versioned_hashes = wrapper.tx.message.blob_versioned_hashes
    commitments = wrapper.blob_kzgs
    blobs = wrapper.blobs
    # note: assert blobs are not malformed
    assert len(versioned_hashes) == len(commitments) == len(blobs)

    # Verify that commitments match the blobs by checking the KZG proof
    assert verify_aggregate_kzg_proof(blobs, commitments, wrapper.kzg_aggregated_proof)

    # Now that all commitments have been verified, check that versioned_hashes match the commitments
    for versioned_hash, commitment in zip(versioned_hashes, commitments):
        assert versioned_hash == kzg_to_versioned_hash(commitment)
```

# 4. Rollup Integration

<aside>
💡 Instead of putting rollup block data in transaction calldata, rollups would expect rollup block submitters to put the data into blobs. This guarantees availability (which is what rollups need) but would be much cheaper than calldata. Rollups need data to be available once, long enough to ensure honest actors can construct the rollup state, but not forever.
</aside>

## A. Architecture

**L2 <> L1**

![](https://i.imgur.com/JVodvGv.png)

**Blob Lifecycle**

![](https://i.imgur.com/a4yMMWP.png)

### A1. ZK-Rollup

In a ZK-rollup we could pass the blob data as a private input and perform the elliptic-curve linear combination (or pairing) inside the SNARK to verify the KZG commitment, but this is costly and very inefficient. Instead, we can use the [proof of equivalence](https://ethresear.ch/t/easy-proof-of-equivalence-between-multiple-polynomial-commitment-schemes-to-the-same-data/8188) protocol to prove that the blob's KZG commitment and the ZK proof's commitment point to the same data.

![](https://i.imgur.com/b3DWySr.png)

Choosing the point `z` as a hash of all the commitments ensures that there is no way to manipulate the data or the commitments after you learn `z` (this is standard Fiat-Shamir reasoning).

Note that a ZK-rollup does not verify the KZG commitment directly; it uses the **point evaluation precompile** (as defined in section 1, C1 above). This way, if KZG is later replaced by a quantum-resistant scheme, ZK-rollups will not face new trouble.

The idea here is that we need a random point that neither the producer nor the verifier can choose, and we evaluate both the KZG blob commitment and the ZK-rollup data commitment at that point to check that they commit to the same data.
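The Fiat-Shamir step above can be sketched in a few lines. This is an illustrative helper, not the protocol's actual encoding: the function name `compute_challenge_point` and the plain concatenate-then-hash scheme are assumptions for clarity; only `BLS_MODULUS` (the BLS12-381 scalar field order) comes from the spec.

```python
import hashlib

# BLS12-381 scalar field modulus, the field over which blob elements live
BLS_MODULUS = 52435875175126190479447740508185965837690552500527637822603658699938581184513

def compute_challenge_point(kzg_commitment: bytes, rollup_commitment: bytes) -> int:
    """Derive the evaluation point z by hashing both commitments (Fiat-Shamir).

    Because z is fixed only after both commitments are fixed, neither party
    can craft data or commitments that agree at z without agreeing everywhere.
    """
    digest = hashlib.sha256(kzg_commitment + rollup_commitment).digest()
    return int.from_bytes(digest, "big") % BLS_MODULUS
```

Both systems then open their commitments at `z` and the verifier checks that they return the same value `y`; on the L1 side this opening is exactly what the point evaluation precompile verifies.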
![](https://i.imgur.com/dlvsIZx.png)

# 5. Fee Market

FYI:

- [EIP-4488 Mining Strategy & Stipend Analysis](https://hackmd.io/@adietrichs/4488-mining-analysis)
- [Multidimensional EIP 1559](https://ethresear.ch/t/multidimensional-eip-1559/11651)
- [Exponential EIP-1559](https://dankradfeist.de/ethereum/2022/03/16/exponential-eip1559.html)

Gas fees are the pricing of Ethereum's resources. At present we use a one-dimensional pricing method (a single base fee), but with the growth of history and the arrival of data sharding this becomes very inefficient.

Ethereum's resources can be classified as:

- Burst limits
    - Bandwidth
    - Compute
    - State access
    - Memory
- Sustained limits
    - State growth
    - History growth

EIP-4844 mainly consumes two of these resources:

- Bandwidth
- History growth

History growth is not a big problem: blobs are not stored permanently, only for a validity period of roughly one month. Pricing bandwidth is the bigger issue. EIP-4844 therefore introduces a **[multi-dimensional EIP-1559 fee market](https://ethresear.ch/t/multidimensional-eip-1559/11651)**, in which there are **two resources, gas and blobs, with separate floating gas prices and separate limits**:

![](https://i.imgur.com/sXxsN3V.png)

PR 5707: Fee Market Update: [https://github.com/ethereum/EIPs/commit/7e8d2629508c4d571f0124e4fc67a9ac13ee8b9a](https://github.com/ethereum/EIPs/commit/7e8d2629508c4d571f0124e4fc67a9ac13ee8b9a)

It introduces data gas as a second type of gas, used to charge for blobs (1 byte = 1 data gas). Data gas has its own EIP-1559-style dynamic pricing mechanism:

- `MAX_DATA_GAS_PER_BLOCK`, with a target of half of that
- Transactions specify `max_fee_per_data_gas: uint256`
- No separate tip, for simplicity
- `MIN_DATA_GASPRICE`, so that one blob costs at least ~0.00001 ETH
- Track `excess_data_gas` instead of a basefee

We use the `excess_data_gas` header field to store the persistent data needed to compute the data gas price. For now, only blobs are priced in data gas.
```python
def calc_data_fee(tx: SignedBlobTransaction, parent: Header) -> int:
    return get_total_data_gas(tx) * get_data_gasprice(parent)

def get_total_data_gas(tx: SignedBlobTransaction) -> int:
    return DATA_GAS_PER_BLOB * len(tx.message.blob_versioned_hashes)

def get_data_gasprice(header: Header) -> int:
    return fake_exponential(
        MIN_DATA_GASPRICE,
        header.excess_data_gas,
        DATA_GASPRICE_UPDATE_FRACTION
    )
```
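The pricing code above depends on `fake_exponential`, an integer approximation of `factor * e**(numerator / denominator)` computed via the Taylor series of the exponential. A sketch, following the EIP's helper of the same name:

```python
def fake_exponential(factor: int, numerator: int, denominator: int) -> int:
    """Approximate factor * e**(numerator / denominator) using integer arithmetic.

    Accumulates the Taylor series e**x = sum(x**i / i!) term by term until the
    running term rounds down to zero, then undoes the denominator scaling.
    """
    i = 1
    output = 0
    # Running term, scaled by `denominator` to stay in integers
    numerator_accum = factor * denominator
    while numerator_accum > 0:
        output += numerator_accum
        # Next Taylor term: multiply by x = numerator/denominator, divide by i
        numerator_accum = (numerator_accum * numerator) // (denominator * i)
        i += 1
    return output // denominator
```

With `excess_data_gas = 0` the price floors at `MIN_DATA_GASPRICE`, and every additional `DATA_GASPRICE_UPDATE_FRACTION` of excess data gas multiplies the price by roughly *e*, which is what makes the mechanism respond exponentially to sustained demand.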