Directly starting from the state transition function:
$$\boldsymbol{\sigma}_{t+1} \equiv \Upsilon(\boldsymbol{\sigma}_t, T)$$
where $\Upsilon$ is the Ethereum state transition function. In Ethereum, $\Upsilon$, together with $\boldsymbol{\sigma}$, is considerably more powerful than any existing comparable system; $\Upsilon$ allows components to carry out arbitrary computation, while $\boldsymbol{\sigma}$ allows components to store arbitrary state between transactions.
$\Upsilon$ is the state-transition function;
$\boldsymbol{\sigma}$ is the state (the state that is stored);
$B$ is the block;
$\Pi$ is the block-level state-transition function;
$T$ is a tuple representing an Ethereum transaction;
is a nonced Transaction;
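$$\boldsymbol{\sigma}_{t+1} \equiv \Pi(\boldsymbol{\sigma}_t, B), \qquad B \equiv (\ldots, (T_0, T_1, \ldots), \ldots), \qquad \Pi(\boldsymbol{\sigma}, B) \equiv \Upsilon(\Upsilon(\boldsymbol{\sigma}, T_0), T_1) \ldots$$
Where $B$ is this block, which includes a series of transactions amongst some other components, and $\Pi$ is the block-level state-transition function. This is the basis of the blockchain paradigm, a model that forms the backbone of not only Ethereum, but all decentralised consensus-based transaction systems to date.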
Since Ethereum is a decentralized platform, any participant can attempt to add a new block to an existing chain of blocks. This creates a branching structure of blocks resembling a tree. To determine the main path from the root (the initial genesis block) to the leaf (the most recent block), a consensus mechanism is needed. If nodes disagree on which path represents the official blockchain, this disagreement results in a fork — a split where different nodes might follow different histories beyond a certain point, each considering their chosen history as the correct one. This divergence can lead to incompatible records of transactions, undermining trust in the system.
Since the Paris hard fork, Ethereum manages consensus through a protocol known as the Beacon Chain. This is part of Ethereum's consensus layer, which sets the rules for identifying the valid sequence of blocks.
The yellow paper (and therefore these notes) discusses the execution layer of Ethereum, which governs interactions and updates to the state of the Ethereum Virtual Machine (EVM). For more details on the consensus mechanism, refer to the consensus specifications.
Occasionally actors do not agree on a protocol change, and a permanent fork occurs. In order to distinguish between diverged blockchains, EIP-155 by Buterin [2016] introduced the concept of chain ID, which we denote by $\beta$.
For the Ethereum main network, $\beta = 1$.
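Upper-case Greek letters such as $\Upsilon$ denote functions operating on highly structured values, like the Ethereum state-transition function;
$\boldsymbol{\sigma}$ is the global state (the state that is stored); $\boldsymbol{\mu}$ is the machine state;
$C$ is the cost function; $C_{\text{SSTORE}}$ is the cost function for the SSTORE operation;
$B$ is the block; $\Pi$ is the block-level state-transition function;
$T$ is a tuple representing an Ethereum transaction; is a nonced transaction;
$\boldsymbol{\mu}_{\mathbf{s}}$ is the stack; $\boldsymbol{\mu}_{\mathbf{m}}$ is the memory.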
The world state in Ethereum is essentially a mapping that links addresses (which are 160-bit identifiers) to their respective account states. These account states are structured data serialized as Recursive Length Prefix (RLP), detailed in Appendix B. While this mapping is not stored directly on the blockchain, it is maintained within a modified Merkle Patricia tree, or trie. This trie functions atop a simple database backend that maps byte arrays to byte arrays, termed the state database.
The root node of this trie structure is crucial as its cryptographic hash reflects all the internal data, thus serving as a secure identifier for the entire system's state. The immutable nature of this data structure enables retrieval of any previous state by adjusting the root hash, provided all root hashes are logged on the blockchain, allowing for straightforward reversion to past states.
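Each account state comprises four main fields: nonce (the number of transactions sent from the address or, for an account with code, the number of contract creations it has made), balance (the number of Wei owned by the address), storageRoot (the 256-bit root hash of the trie that encodes the account's storage), and codeHash (the hash of the account's EVM code).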
The system introduces a method to collapse the trie's data into a concise hash using a specific function, which helps in identifying the world state more efficiently.
Additionally, an account is "empty" when it has no code (its codeHash is the Keccak-256 hash of the empty byte string), a zero nonce, and a zero balance. An account is "dead" when its account state is either non-existent or empty.
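The account fields and the "empty" and "dead" conditions can be sketched in a few lines of Python. This is only an illustration, not client code: the class and function names are made up for these notes, the world state is modelled as a plain dictionary rather than a trie, and the two hash constants are the well-known values for empty code and an empty trie.

```python
from dataclasses import dataclass

# Keccak-256 of the empty byte string: the codeHash of an account without code.
EMPTY_CODE_HASH = bytes.fromhex(
    "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470"
)
# Root hash of an empty trie: the storageRoot of an account without storage.
EMPTY_TRIE_ROOT = bytes.fromhex(
    "56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421"
)


@dataclass
class AccountState:
    """Hypothetical in-memory stand-in for the four account fields."""
    nonce: int = 0
    balance: int = 0
    storage_root: bytes = EMPTY_TRIE_ROOT
    code_hash: bytes = EMPTY_CODE_HASH

    def is_empty(self) -> bool:
        # Empty: no code, zero nonce, zero balance.
        return (
            self.code_hash == EMPTY_CODE_HASH
            and self.nonce == 0
            and self.balance == 0
        )


def is_dead(world_state: dict, address: bytes) -> bool:
    """Dead: the account does not exist in the mapping, or it is empty."""
    account = world_state.get(address)
    return account is None or account.is_empty()
```

Here `world_state` is just a dictionary from 20-byte addresses to `AccountState` values; a real client keeps this mapping in the state trie.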
This structure enables Ethereum to manage its complex state efficiently, ensuring data integrity and enabling historical data access through its unique tree-based storage and hashing mechanisms.
A transaction in Ethereum, termed $T$, is a cryptographically-signed instruction initiated by an entity outside the Ethereum network. The initiator is typically human, assisted by software tools for creating and sending the transaction. Importantly, the sender of a transaction cannot be a contract.
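EIP-2718, introduced by Zoltu in 2020, brought forth the concept of multiple transaction types. As of the London upgrade, Ethereum supports three primary transaction types: type 0 (legacy transactions), type 1 (access-list transactions from EIP-2930), and type 2 (fee-market transactions from EIP-1559).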
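Each transaction type includes several common fields: type (the EIP-2718 transaction type), nonce (the number of transactions sent by the sender), gasLimit (the maximum amount of gas that may be used in executing the transaction), to (the 160-bit address of the recipient, or empty for a contract creation), value (the number of Wei to transfer, or the endowment for a contract creation), and the signature values r and s.
Transactions of type 1 and type 2 have additional fields: accessList (the addresses and storage keys to warm up), chainId, and yParity (the signature's Y parity).
Legacy transactions do not include an access list, while chainId and yParity for legacy transactions are combined into a single value:
w: A scalar value encoding Y parity and possibly chain ID.
Gas pricing fields also differ by type: type 2 transactions use the more refined maxFeePerGas and maxPriorityFeePerGas, while type 0 and type 1 transactions specify a single gasPrice.
Contract creation transactions, regardless of the type, contain init, the EVM code for the account initialization procedure.
Message call transactions, on the other hand, include data, the input data of the message call.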
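To make the combined legacy value w concrete, here is a small sketch of how it can be split back into Y parity and chain ID. It assumes the two encodings given in the yellow paper and EIP-155, namely w = 27 + yParity (no chain ID) and w = 2 × chainId + 35 + yParity; the function name is invented for these notes.

```python
from typing import Optional, Tuple


def decode_legacy_w(w: int) -> Tuple[int, Optional[int]]:
    """Split a legacy transaction's w into (y_parity, chain_id).

    chain_id is None for signatures without replay protection.
    """
    if w in (27, 28):
        # Encoding without a chain ID: w = 27 + y_parity
        return w - 27, None
    if w >= 37:
        # EIP-155 encoding: w = 2 * chain_id + 35 + y_parity, with chain_id >= 1
        return (w - 35) % 2, (w - 35) // 2
    raise ValueError(f"w = {w} is not a valid legacy signature value")
```

For mainnet (chain ID 1) the only EIP-155 values are 37 and 38, so `decode_legacy_w(38)` gives Y parity 1 and chain ID 1.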
In addition to these transaction types, the Dencun upgrade introduced, through EIP-4844, a new transaction type known as the "Blob Transaction". These blob-carrying transactions are designed to handle large amounts of data more efficiently: the blob data is kept for a limited time by consensus-layer nodes rather than stored permanently in execution-layer state, which makes it cheaper while still keeping it available to the network during that window. This is particularly beneficial for scaling solutions and complex dApps requiring significant data throughput without overly burdening the main blockchain.
Here is how this integrates into the existing transaction format: a blob transaction (type 3) carries the fields of a type 2 transaction plus max_fee_per_blob_gas and blob_versioned_hashes, and its to field must be an address, so it cannot be used to create a contract.
The addition of blob transactions represents a significant enhancement in Ethereum's capabilities, enabling it to handle larger data requirements efficiently and supporting more complex applications on its platform.
Please note that the large amount of data carried by a blob-carrying transaction cannot be accessed by EVM execution; only a commitment to it can be accessed. The format is intended to be fully compatible with the format that will be used in full sharding.
This section has too much mathematical notation, so I'm condensing it a lot.
In Ethereum, a block is a data structure that primarily consists of a block header $H$, the transactions it contains, and a list of ommer headers (a component that is deprecated since the Paris hard fork). Here's a simplified breakdown of what a block in Ethereum includes:
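Each transaction in a block has a corresponding receipt that logs execution outcomes, facilitating proof creation or indexing. Receipts include: the transaction type, the status code, the cumulative gas used in the block up to and including this transaction, the set of logs created during execution, and a Bloom filter built from those logs.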
The validity of a block in Ethereum is confirmed if:
Ethereum's transition from proof of work to proof of stake (post-Paris hard fork) significantly altered block header properties: the ommers list and the difficulty and nonce fields are deprecated and fixed to constant values, and the former mixHash field is repurposed as prevRandao to align with the new consensus mechanism. This transition reflects Ethereum's evolving technology and adaptation to more energy-efficient and scalable network operations.
To manage network abuse and address the complexities of Turing completeness, Ethereum imposes a fee structure for any executable computation, including contract creations, message calls, storage operations, and other virtual machine executions. This fee structure is denominated in "gas," which serves as a unit of computational effort (details on specific fees for different operations can be found in Appendix G of the yellow paper).
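Each transaction on the Ethereum network is associated with a specific "gasLimit," which represents the maximum amount of gas the sender is willing to use for the transaction. This gasLimit effectively acts as a cap on the total computational cost the sender is prepared to incur. Gas for the full gasLimit is purchased up front from the sender's account balance at a rate defined as the "effective gas price," and any gas not used by the end of the transaction is refunded to the sender at the same rate.
The introduction of EIP-1559 in the London hard fork established a new fee model involving a base fee and a priority fee. The base fee is set by the protocol for each block, rising or falling depending on how much gas the parent block used relative to its target, and it is burned rather than paid to anyone. The priority fee is paid on top of the base fee to the block's beneficiary as an incentive for including the transaction.
For transaction types introduced post-EIP-1559 (type 2 transactions), users can specify two parameters: maxFeePerGas (the maximum total fee per unit of gas the sender will pay, covering both the base fee and the priority fee) and maxPriorityFeePerGas (the maximum priority fee per unit of gas).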
Older transaction types (type 0 and type 1) only use a single fee parameter, "gasPrice," which must cover both the base fee and any priority fee the sender wishes to pay.
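The difference between the two pricing styles is easier to see in code. The sketch below follows the yellow paper's definitions of the priority fee and the effective gas price; the dictionary keys and function names are my own choice for illustration, and only types 0, 1, and 2 are handled.

```python
def priority_fee_per_gas(tx: dict, base_fee: int) -> int:
    """Fee per gas that goes to the block's beneficiary."""
    if tx["type"] in (0, 1):
        # A single gasPrice covers the base fee; the rest is the priority fee.
        return tx["gas_price"] - base_fee
    if tx["type"] == 2:
        # The tip is capped both by maxPriorityFeePerGas and by what is
        # left of maxFeePerGas after the base fee.
        return min(tx["max_priority_fee_per_gas"], tx["max_fee_per_gas"] - base_fee)
    raise ValueError("unsupported transaction type in this sketch")


def effective_gas_price(tx: dict, base_fee: int) -> int:
    """Amount of Wei the sender actually pays per unit of gas."""
    if tx["type"] in (0, 1):
        return tx["gas_price"]
    return priority_fee_per_gas(tx, base_fee) + base_fee
```

With a base fee of 30, a type 2 transaction with maxFeePerGas 100 and maxPriorityFeePerGas 2 pays 32 per gas; if the base fee climbs to 99, the same transaction pays 100 and the tip shrinks to 1.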
Validators have the discretion to choose which transactions to include in a block, typically favoring those with higher priority fees due to the direct financial incentive. Thus, while senders can set any priority fee, higher fees increase the likelihood of timely transaction inclusion, creating a trade-off for users between cost and speed of execution.
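The execution of a transaction in Ethereum is complex because it involves a series of checks and operations that define the state transition function $\Upsilon$. Before any transaction can be executed, it must first pass several tests to ensure its intrinsic validity: it must be well-formed RLP with no trailing bytes, carry a valid signature, have a nonce equal to the sender account's current nonce, come from a sender account with no deployed contract code, have a gas limit no smaller than the intrinsic gas, and come from a sender whose balance covers the upfront cost. In addition, the gas price offered (gasPrice, or maxFeePerGas for type 2) must be at least the block's base fee, and for type 2 transactions maxFeePerGas must be at least maxPriorityFeePerGas.
When a transaction is executed, the new state $\boldsymbol{\sigma}'$ is derived from the current state $\boldsymbol{\sigma}$ using $\Upsilon$. The function also determines the gas used $\Upsilon^{g}$, the logs generated during the transaction $\Upsilon^{\mathbf{l}}$, and the resulting status code $\Upsilon^{z}$.
During execution, an "accrued substate" $A$ is collected, which includes: the self-destruct set (accounts to be discarded after the transaction), the log series, the set of touched accounts, the refund balance, the set of accessed account addresses, and the set of accessed storage keys.
The intrinsic gas required for the transaction is calculated from the transaction's data bytes (zero and non-zero bytes are priced differently), a fixed base charge, an extra charge for contract creation, and the entries of the access list. The effective gas price is determined by the transaction type: gasPrice for type 0 and type 1, and the base fee plus the priority fee for type 2. The upfront cost is the product of the gas limit and the highest gas price the sender may have to pay (gasPrice, or maxFeePerGas for type 2), plus the transaction value.
The execution begins with an irrevocable change to the state: the sender's nonce is incremented by one and their balance is reduced by the up-front gas cost (the gas limit multiplied by the effective gas price). The gas available for the computation is then the gas limit minus the intrinsic gas. The computation results in a new provisional state, which after adjustments becomes the final state upon completion of the transaction.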
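The intrinsic gas and upfront cost described above can be sketched as follows. The constants are the fee-schedule values as I remember them from Appendix G of the yellow paper (with the per-word init code charge added in Shanghai), so treat them as assumptions to double-check; the function names and the access-list shape (a list of `(address, storage_keys)` pairs) are made up for this example.

```python
# Fee-schedule values (assumed from the yellow paper's Appendix G).
G_TRANSACTION = 21000        # paid by every transaction
G_TXCREATE = 32000           # extra charge for contract-creating transactions
G_TXDATAZERO = 4             # per zero byte of data or init code
G_TXDATANONZERO = 16         # per non-zero byte of data or init code
G_ACCESSLISTADDRESS = 2400   # per address in the access list
G_ACCESSLISTSTORAGE = 1900   # per storage key in the access list
G_INITCODEWORD = 2           # per 32-byte word of init code


def intrinsic_gas(data: bytes, is_creation: bool, access_list=()) -> int:
    """Gas a transaction must pay before any code is executed."""
    gas = G_TRANSACTION
    gas += sum(G_TXDATAZERO if byte == 0 else G_TXDATANONZERO for byte in data)
    if is_creation:
        words = (len(data) + 31) // 32  # init code length rounded up to words
        gas += G_TXCREATE + G_INITCODEWORD * words
    for _address, storage_keys in access_list:
        gas += G_ACCESSLISTADDRESS + G_ACCESSLISTSTORAGE * len(storage_keys)
    return gas


def upfront_cost(gas_limit: int, max_price_per_gas: int, value: int) -> int:
    """Balance the sender must hold: gas at the price cap plus the value sent.

    max_price_per_gas is gasPrice for types 0 and 1, maxFeePerGas for type 2.
    """
    return gas_limit * max_price_per_gas + value
```

A plain Ether transfer with no data therefore has an intrinsic gas of exactly 21000, which is why that number shows up as the gas used by simple transfers.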
The entire process ensures that transactions are securely and accurately processed, maintaining the integrity and state of the Ethereum blockchain.
During the execution of a transaction, certain operations can lead to a situation where not all the gas allocated (or "purchased") by the transaction is used. Ethereum has a mechanism to handle such scenarios by refunding a portion of the gas not used. Here are the key details:
Refund for Certain Operations: Some operations in the Ethereum Virtual Machine (EVM) reduce the amount of state that nodes have to store, and the protocol rewards them for it. For example, when a storage slot is reset from a non-zero value to zero, the SSTORE operation adds to a "refund counter." This counter tracks the amount of gas to be refunded.
Calculation of Refund Amount: The refund amount is determined by specific rules set out in Ethereum's protocol. The basic formula takes the lesser of the refund counter accumulated during execution and one fifth of the gas actually used by the transaction (the cap set by EIP-3529 in the London upgrade).
Final Gas Calculation: The gas returned to the sender is the gas remaining after execution plus the refund amount; the gas charged is the gas limit minus that total.
Impact on Transaction Fee: Since the transaction fee is calculated based on the total gas used (multiplied by the gas price), refunds reduce the fee paid by the sender, although never by more than the cap allows. This mechanism encourages users to clean up unnecessary storage, as they can receive refunds for doing so.
Let's consider a transaction where a user interacts with a smart contract to change storage values. If the user resets previously non-zero storage slots to zero, the operation qualifies for a gas refund. Here's how it would be calculated:
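A worked version of this example, as a rough sketch: it assumes the post-London rules (refund capped at one fifth of the gas used, and a refund of 4800 gas per cleared slot as set by EIP-3529). The numbers for gas limit and gas remaining are invented for illustration.

```python
R_SCLEAR = 4800          # assumed refund per storage slot cleared (EIP-3529)
MAX_REFUND_QUOTIENT = 5  # refund is capped at gas_used // 5 (EIP-3529)


def settle_gas(gas_limit: int, gas_remaining: int, refund_counter: int):
    """Return (gas_returned_to_sender, gas_charged) after applying the refund."""
    gas_used = gas_limit - gas_remaining
    refund = min(gas_used // MAX_REFUND_QUOTIENT, refund_counter)
    return gas_remaining + refund, gas_used - refund


# Example: gas limit 100000, 40000 gas left after execution,
# and two storage slots were reset to zero along the way.
returned, charged = settle_gas(100_000, 40_000, 2 * R_SCLEAR)
print(returned, charged)  # 49600 50400
```

Here the transaction used 60000 gas, so the cap is 12000; the refund counter of 9600 is below the cap and is refunded in full. Had only 20000 gas been used, the cap of 4000 would apply instead.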
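Creating an account in Ethereum involves several intrinsic parameters that define how the account is set up and initialized. These parameters are critical in determining the result of the account creation process, governed by the Ethereum protocol. The parameters are: the sender ($s$), the original transactor ($o$), the available gas ($g$), the effective gas price ($p$), the endowment ($v$), the initialization EVM code ($\mathbf{i}$), the current depth of the message-call/contract-creation stack ($e$), the salt for the new account's address ($\zeta$), and the permission to modify the state ($w$).
The creation function $\Lambda$ processes these inputs along with the current blockchain state $\boldsymbol{\sigma}$ and the accrued substate $A$ to produce a new state $\boldsymbol{\sigma}'$, the remaining gas $g'$, a new accrued substate $A'$, a status code $z$, and output data $\mathbf{o}$.
The address of the newly created account depends on whether it's a CREATE or CREATE2 operation:
CREATE: The address is the rightmost 160 bits of the Keccak-256 hash of the RLP encoding of the sender's address and the sender's nonce.
CREATE2: The address is the rightmost 160 bits of the Keccak-256 hash of the byte 0xff, the sender's address, the 32-byte salt, and the Keccak-256 hash of the initialization code, concatenated in that order.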
Once the address is determined and the account is preliminarily set up with its nonce set to 1, balance set to the endowment, and other initial settings, the initialization code is executed. This execution can modify the account's storage, create further accounts, and make additional message calls. The execution environment is defined with specific parameters that include the sender, gas price, and others relevant to the transaction context.
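The address rules for CREATE and CREATE2 can be sketched as below. Two caveats: Python's standard library has no Keccak-256 (`hashlib.sha3_256` uses a different padding and gives different results), so the hash is passed in as a parameter and you would plug in a real Keccak-256 from a third-party library; and the tiny RLP helpers only cover the short inputs needed here. All names are invented for these notes.

```python
from typing import Callable, List

HashFn = Callable[[bytes], bytes]  # expected to be Keccak-256 in real use


def rlp_bytes(item: bytes) -> bytes:
    """RLP-encode a byte string shorter than 56 bytes."""
    if len(item) == 1 and item[0] < 0x80:
        return item
    if len(item) >= 56:
        raise ValueError("long strings are not handled in this sketch")
    return bytes([0x80 + len(item)]) + item


def rlp_list(items: List[bytes]) -> bytes:
    """RLP-encode a list of byte strings with a payload shorter than 56 bytes."""
    payload = b"".join(rlp_bytes(item) for item in items)
    if len(payload) >= 56:
        raise ValueError("long lists are not handled in this sketch")
    return bytes([0xC0 + len(payload)]) + payload


def create_address(sender: bytes, nonce: int, keccak256: HashFn) -> bytes:
    """CREATE: last 20 bytes of keccak256(rlp([sender, nonce]))."""
    # Integers are RLP-encoded as minimal big-endian bytes; zero is the empty string.
    nonce_bytes = nonce.to_bytes((nonce.bit_length() + 7) // 8, "big")
    return keccak256(rlp_list([sender, nonce_bytes]))[12:]


def create2_address(sender: bytes, salt: bytes, init_code: bytes,
                    keccak256: HashFn) -> bytes:
    """CREATE2: last 20 bytes of keccak256(0xff ++ sender ++ salt ++ keccak256(init_code))."""
    if len(salt) != 32:
        raise ValueError("salt must be 32 bytes")
    return keccak256(b"\xff" + sender + salt + keccak256(init_code))[12:]
```

The useful property of CREATE2 is visible in the signature: the address depends only on the sender, the salt, and the init code, not on the sender's nonce, so it can be computed before the contract exists.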
Gas consumption during initialization is critical. If the gas runs out, or if the code-deposit cost (charged per byte for storing the code returned by the initialization code) exceeds the remaining gas, an out-of-gas exception occurs. The creation then fails and the state is left as it was immediately before the creation was attempted.
If the initialization completes successfully and there are no exceptions, the code-deposit cost is paid out of the remaining gas and the output of the initialization code is stored as the new account's code.
If any exceptions occur during initialization, such as out-of-gas or errors in the initialization code, the creation reverts all of its changes, effectively leaving the blockchain state as it was before attempting the creation.
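The final step of a creation, paying the code-deposit cost, can be sketched like this. The constants are assumptions taken from my reading of the yellow paper and the relevant EIPs (200 gas per byte of deployed code, a 24576-byte code size limit from EIP-170, and the rejection of code starting with 0xEF from EIP-3541); the function name and return shape are invented.

```python
G_CODEDEPOSIT = 200     # assumed gas per byte of code placed into state
MAX_CODE_SIZE = 24576   # assumed maximum size of deployed code (EIP-170)


def finalize_creation(returned_code: bytes, gas_remaining: int):
    """Return (success, gas_left, deployed_code) for a finished init-code run."""
    deposit_cost = G_CODEDEPOSIT * len(returned_code)
    failed = (
        deposit_cost > gas_remaining           # cannot afford the code deposit
        or len(returned_code) > MAX_CODE_SIZE  # code too large
        or returned_code[:1] == b"\xef"        # reserved prefix (EIP-3541)
    )
    if failed:
        # Treated like an exceptional halt: no code is stored, no gas is left.
        return False, 0, b""
    return True, gas_remaining - deposit_cost, returned_code
```

So returning 2 bytes of runtime code with 1000 gas left succeeds and leaves 600 gas, while the same code with only 399 gas left fails the creation.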
This detailed process ensures that account creation in Ethereum is secure and that each step—from checking the initial conditions to finalizing the state—adheres to the protocol’s stringent requirements. The complexity of this process underlines the flexibility and security considerations inherent in Ethereum’s design, particularly in how it handles new account creation and the execution of initialization code.
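In Ethereum, executing a message call is a process that involves several parameters, with each playing a specific role in how the call is processed and how it interacts with the Ethereum Virtual Machine (EVM). The parameters are: the sender, the transaction originator, the recipient, the account whose code is to be executed (usually the same as the recipient), the available gas, the value, the effective gas price, the input data of the call, the present depth of the message-call/contract-creation stack, and the permission to modify the state.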
During the execution of a message call, a few important processes occur:
State and Accrued Substate Evaluation: The message call leads to a transition to a new blockchain state $\boldsymbol{\sigma}'$ and a new accrued substate $A'$, along with the amount of gas remaining $g'$ after the call and a resulting status code $z$.
Value Transfer: The value specified in the call is transferred from the sender to the recipient, adjusting their balances accordingly unless the sender and recipient are the same. If the recipient does not exist, it is created with a zero balance and nonce.
Code Execution: The code associated with the recipient account is executed. This code can be a smart contract residing at the recipient's address. The code to run is looked up through the code hash stored in the account's state.
Exception Handling: If an exception occurs due to reasons like out-of-gas, stack underflows, or invalid operations, the call fails: the state is reverted to the point immediately before the call, and none of the gas supplied to the call is returned to the caller.
Output Data: In scenarios where a message call is triggered by VM code execution, the output data from the call is captured and can be used further by the EVM.
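The value transfer and the revert-on-exception behaviour listed above can be sketched with a toy model. Everything here is a simplification I made up for illustration: the state is just a dictionary from address to balance, the "code" is a Python callable, and the REVERT opcode (which also rolls back state but returns unused gas) is not modelled.

```python
class ExceptionalHalt(Exception):
    """Stand-in for out-of-gas, stack underflow, invalid opcode, and so on."""


def message_call(state: dict, sender, recipient, value: int, gas: int, code=None):
    """Return (new_state, gas_left, status, output) for a toy message call.

    state maps address -> balance; code is an optional callable
    (state, gas) -> (gas_left, output) that may raise ExceptionalHalt.
    """
    working = dict(state)             # work on a copy so the original is a checkpoint
    working.setdefault(recipient, 0)  # a missing recipient starts with zero balance
    if sender != recipient:
        working[sender] -= value
        working[recipient] += value
    if code is None:
        return working, gas, 1, b""   # plain transfer, nothing to execute
    try:
        gas_left, output = code(working, gas)
    except ExceptionalHalt:
        # Exceptional halt: discard every change and consume all the gas.
        return state, 0, 0, b""
    return working, gas_left, 1, output
```

Calling `message_call({"alice": 10}, "alice", "bob", 3, 1000)` yields balances of 7 and 3 with status 1; if the recipient's code raises `ExceptionalHalt`, the original state comes back untouched with status 0 and no gas left.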
The execution model handles various types of operations based on the code to be executed, which may involve specific precompiled contracts or custom smart contract code. The process ensures that the execution adheres to the gas constraints and correctly processes the input data and transfers value as intended.
This structured approach to message calls in Ethereum provides a robust framework for executing and interacting with smart contracts, ensuring that each step—from transferring value to executing complex operations—is handled securely and predictably within the EVM environment. This system supports Ethereum's capability to execute decentralized applications securely and efficiently.
The execution model of the Ethereum Virtual Machine (EVM) is designed to process bytecode within a structured environment, following a set series of operations dictated by the EVM’s quasi-Turing-complete nature. This limitation, primarily imposed by the gas system, serves to limit the total computation that can be performed.
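The EVM operates on a stack-based architecture with a 256-bit word size, chosen to suit cryptographic functions such as the Keccak-256 hash and elliptic curve computations. It features a stack with a maximum depth of 1024 items, a volatile memory (a word-addressed byte array), and a persistent storage (a word-addressed word array) that is maintained as part of the system state.
The EVM does not follow a von Neumann architecture; instead, it stores program code in a separate virtual ROM, accessible only through specific instructions. Execution within the EVM can halt exceptionally for reasons such as stack underflows, invalid instructions, or running out of gas, with the state changes of the current execution being discarded and the failure reported to whatever invoked it (the transaction processor or the calling execution environment).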
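Fees in the EVM, denominated in gas, are incurred under three main circumstances:
Intrinsic Operation Cost: The first and most common is the fee intrinsic to the computation of the operation itself.
Subordinate Calls and Creations: Gas may be deducted to pay for a subordinate message call or contract creation; this applies to CREATE, CREATE2, CALL, and CALLCODE, where gas forms part of the payment.
Memory Expansion: Gas may be charged for an increase in the usage of memory.
Storage operations are particularly cost-sensitive. To encourage efficient use of storage, the fee for an operation that clears a storage entry is not only waived but also partly refunded, promoting optimization of state usage.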
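The execution environment of a transaction or operation in the EVM includes several key pieces of information: the address of the account that owns the executing code, the address of the transaction's original sender, the effective gas price, the input data, the address of the account that caused the code to execute, the value passed with the execution, the machine code to run, the current block header, the depth of the present message call or contract creation, and the permission to modify the state.
The function $\Xi$ computes the resulting state ($\boldsymbol{\sigma}'$), remaining gas ($g'$), accrued substate ($A'$), and output ($\mathbf{o}$) based on these environmental inputs.
Execution within the EVM is modeled as an iterative process involving a recursive function $X$ that repeatedly applies the iterator function $O$ (the result of a single cycle of the state machine), the function $Z$ (which determines whether the present state is an exceptional halting state), and the function $H$ (which specifies the output data when the present state is a normal halting state).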
The EVM manages execution through a series of operational cycles, with each cycle potentially altering the machine's state, adjusting the stack, or expanding memory, guided by specific instruction mnemonics.
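To get a feel for these operational cycles, here is a deliberately tiny interpreter that handles only three opcodes. It is a sketch of the fetch, check, charge, execute loop and nothing more: memory, storage, jumps, and the real gas schedule beyond these three costs are left out, and the function and exception names are my own.

```python
STOP, ADD, PUSH1 = 0x00, 0x01, 0x60
GAS_COST = {STOP: 0, ADD: 3, PUSH1: 3}  # costs for the three supported opcodes
STACK_LIMIT = 1024


class ExceptionalHalt(Exception):
    """Raised for the exceptional halting conditions this toy checks."""


def run(code: bytes, gas: int):
    """Execute code and return (stack, gas_left) on a normal halt."""
    stack, pc = [], 0
    while True:
        # Fetch: reading past the end of the code behaves like STOP.
        op = code[pc] if pc < len(code) else STOP
        # Check for exceptional halting before doing any work.
        if op not in GAS_COST:
            raise ExceptionalHalt("invalid instruction")
        if gas < GAS_COST[op]:
            raise ExceptionalHalt("out of gas")
        gas -= GAS_COST[op]
        if op == STOP:
            return stack, gas
        if op == PUSH1:
            if len(stack) >= STACK_LIMIT:
                raise ExceptionalHalt("stack overflow")
            # The byte after PUSH1 is immediate data, not an instruction.
            stack.append(code[pc + 1] if pc + 1 < len(code) else 0)
            pc += 2
        elif op == ADD:
            if len(stack) < 2:
                raise ExceptionalHalt("stack underflow")
            a, b = stack.pop(), stack.pop()
            stack.append((a + b) % 2**256)  # arithmetic wraps at 256 bits
            pc += 1


# PUSH1 2, PUSH1 3, ADD, STOP
print(run(bytes([0x60, 2, 0x60, 3, 0x01, 0x00]), 100))  # ([5], 91)
```

Each loop iteration corresponds to one cycle: the gas goes down by the instruction's cost, the stack changes, and the program counter moves to the next instruction.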
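The state of the machine during execution is represented as a tuple ($\boldsymbol{\mu}$) consisting of the gas available ($g$), the program counter ($pc$), the memory contents ($\mathbf{m}$), the active number of words in memory ($i$), the stack contents ($\mathbf{s}$), and the return data of the most recent call ($\mathbf{o}$).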
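The EVM can halt execution exceptionally due to various issues:
Insufficient Gas: There is not enough gas left to pay for the next operation.
Invalid Instruction: The next byte of code is not a defined operation.
Stack Underflow or Overflow: The operation needs more stack items than are available, or would grow the stack beyond 1024 items.
Invalid Jump Destination: A JUMP or JUMPI operation targets an invalid instruction point.
RETURNDATACOPY Issues: If this operation tries to access more return data than is available.
State Modification in a Static Call: A state-changing operation is attempted while the permission to modify the state has not been granted.
SSTORE with Low Gas: An SSTORE is attempted when the gas left is less than or equal to the call stipend.
The validity of jump destinations is crucial for the secure and correct execution of jump operations. A valid jump destination is defined as:
JUMPDEST: Only positions explicitly marked with a JUMPDEST instruction are valid jump targets.
Not Inside Push Data: The position must be an instruction in its own right, not a byte in the immediate data of one of the PUSH operations.
Normal halting occurs when execution completes without any runtime errors:
Halting Instructions: STOP, RETURN, REVERT, and SELFDESTRUCT lead to normal halts.
RETURN and REVERT: These return control to the calling function, potentially with data.
$H_{\text{RETURN}}$ Function: Specifically manages the data returned by RETURN and REVERT, ensuring that it is correctly handled or cleared if the execution context changes.
During the execution cycle, stack items are added to or removed from the top of the stack, the gas is reduced by the instruction's cost, and for most instructions the program counter advances to the next instruction; JUMP and JUMPI set it explicitly.
The function $O$ defines how the EVM processes each step in the execution.
The entire execution model of the EVM is designed to ensure that each transaction is processed securely and deterministically, adhering to the strict rules laid out by the Ethereum protocol. This structure supports the complex functionalities of smart contracts while maintaining the decentralized integrity of the Ethereum blockchain.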
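The rule for valid jump destinations is a nice one to see in code, because the subtle part is skipping over push data. The sketch below scans the bytecode once and collects every position that holds a JUMPDEST and is not inside the immediate bytes of a PUSH1 to PUSH32 instruction; the function name is made up for these notes.

```python
JUMPDEST = 0x5B
PUSH1, PUSH32 = 0x60, 0x7F


def valid_jump_destinations(code: bytes) -> set:
    """Return the set of code positions that JUMP or JUMPI may target."""
    valid, pc = set(), 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST:
            valid.add(pc)
        if PUSH1 <= op <= PUSH32:
            # PUSHn is followed by n bytes of data; skip them so that a
            # 0x5b byte inside the data is never mistaken for a JUMPDEST.
            pc += op - PUSH1 + 1
        pc += 1
    return valid


# PUSH1 0x5b, JUMPDEST, STOP: only position 2 is a real JUMPDEST.
print(valid_jump_destinations(bytes([0x60, 0x5B, 0x5B, 0x00])))  # {2}
```

Clients typically compute this set once per contract and consult it whenever a JUMP or JUMPI is executed.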
The Paris hard fork marked a significant transition for Ethereum, shifting its consensus mechanism from Proof of Work (PoW) to Proof of Stake (PoS). This change represents a fundamental shift in how blocks are validated and new transactions are added to the blockchain.
Unlike previous hard forks in Ethereum, which typically occurred at a predetermined block height, the Paris hard fork was designed to activate based on a specific condition known as the "terminal total difficulty" (TTD). This approach was chosen to mitigate potential risks associated with the transition:
Avoiding Malicious Forks: By using total difficulty instead of block height, the transition avoids scenarios where a minority of hash power could potentially extend a competing PoW chain to reach a predefined block height first, thus creating a malicious fork. This method ensures that the transition to PoS would occur only when the cumulative difficulty of mined blocks reached a critical, predefined threshold, making it much harder for any minority group to influence or hijack the transition.
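Definition of Terminal Block: The terminal block, which is the last block mined using PoW, was defined by the following criteria: its total difficulty is greater than or equal to the terminal total difficulty (58750000000000000000000 in this case), and the total difficulty of its parent block is less than the terminal total difficulty.
Total difficulty ($B_t$) of a block in the PoW system was calculated recursively as:
$$B_t \equiv B'_t + B_d, \qquad B' \equiv P(B_H)$$
where $B'$ is the parent block of $B$ and $B_d$ is the difficulty of $B$ itself.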
This calculation accumulates the difficulty of each block, adding up to a total that reflects the overall computational effort expended to reach the current state of the blockchain.
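The terminal block condition boils down to two comparisons against the terminal total difficulty. A minimal sketch, with function names of my own choosing:

```python
TERMINAL_TOTAL_DIFFICULTY = 58750000000000000000000


def total_difficulty(parent_total_difficulty: int, difficulty: int) -> int:
    """Total difficulty of a block: the parent's total plus its own difficulty."""
    return parent_total_difficulty + difficulty


def is_terminal_block(parent_total_difficulty: int, difficulty: int) -> bool:
    """True if this block is the first one to reach the terminal total difficulty."""
    block_td = total_difficulty(parent_total_difficulty, difficulty)
    return parent_total_difficulty < TERMINAL_TOTAL_DIFFICULTY <= block_td
```

Requiring the parent to be below the threshold is what makes the terminal block unique on any given chain: every later PoW block would also exceed the threshold, but its parent would too.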
Upon reaching the terminal block:
Beacon Chain Takes Over: The Beacon Chain, already running in parallel to the Ethereum mainnet, assumes responsibility for processing new blocks. Under PoS, blocks are validated by validators who stake their ETH to participate in the consensus mechanism, rather than by miners solving cryptographic puzzles.
Security and Efficiency: This transition not only aims to enhance the security of the Ethereum network by making it more decentralized but also significantly reduces its energy consumption, addressing one of the major criticisms of traditional PoW systems.
New Consensus Mechanism: The consensus under PoS is achieved through a combination of staking, attestation by validators, and algorithms that randomly select block proposers and committees to ensure the network remains secure and transactions are processed efficiently.
The Paris hard fork was a pivotal event in Ethereum's history, setting the stage for more scalable, sustainable, and secure operations. It represents Ethereum's commitment to innovation and its responsiveness to the broader societal concerns about the environmental impact of cryptocurrency mining.
Before the Paris hard fork, Ethereum's canonical blockchain was defined as the block path with the greatest total difficulty, as described above. This measure of total difficulty accumulated from the difficulty values of individual blocks under the Proof of Work (PoW) system.
However, following the transition to Proof of Stake (PoS) at the Paris hard fork, the rule of the greatest total difficulty was discontinued. Instead, the new rule implemented is known as LMD-GHOST, which requires different information for determining the canonical Ethereum blockchain. This information comes from the Beacon Chain, which is not detailed in this document but is essential for the post-transition blockchain.
After the transition, the canonical chain is identified each time a POS_FORKCHOICE_UPDATED event is emitted by the Beacon Chain, starting with the first event at the transition block detailed in section 10. The chain is defined as starting from the genesis block and ending at the block nominated by the POS_FORKCHOICE_UPDATED event as the head of the chain.
The head of the chain should only be updated following a POS_FORKCHOICE_UPDATED event. Any updates to the head of the chain should only set the head to the block specified by the event, and no optimistic updates should be made without such an event. Each POS_FORKCHOICE_UPDATED event also references a finalized block, which should then be recognized as the most recent finalized block.
Additionally, the canonical blockchain must contain the block with the hash and number of the terminal block as defined in the section above, which marks the end of PoW and the full transition to PoS.
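The head-update rule can be illustrated with a small bookkeeping class. This is a toy model under my own assumptions: blocks are identified by short strings, the class and method names are invented, and the event is represented as a plain method call rather than anything resembling a real client API.

```python
class ChainView:
    """Toy execution-layer view that only moves its head on fork-choice events."""

    def __init__(self, genesis: str):
        self.parents = {genesis: None}  # block hash -> parent hash
        self.head = genesis
        self.finalized = None           # unknown until the first event arrives

    def add_block(self, block_hash: str, parent_hash: str) -> None:
        # Learning about a block never changes the head by itself.
        self.parents[block_hash] = parent_hash

    def on_pos_forkchoice_updated(self, head: str, finalized: str) -> None:
        # The event is the only place where head and finalized block change.
        if head not in self.parents or finalized not in self.parents:
            raise KeyError("event refers to an unknown block")
        self.head = head
        self.finalized = finalized

    def canonical_chain(self) -> list:
        """Blocks from genesis up to the current head."""
        chain, current = [], self.head
        while current is not None:
            chain.append(current)
            current = self.parents[current]
        return chain[::-1]
```

If blocks `a`, `b` (child of `a`) and a competing `c` (also a child of `a`) are added, the head stays at genesis until an event names `b` as head, at which point the canonical chain becomes genesis, `a`, `b`.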
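Finalizing a block in Ethereum involves two main steps: validating the transactions and verifying the state of the blockchain. Here's a detailed explanation of these processes:
Transaction Validation: The gasUsed value recorded in the block header must equal the cumulative gas used by the transactions listed in the block.
State Validation: Execution starts from the final state of the parent block (for the genesis block, the genesis state), and the transactions are applied to it one after another in block order.
For each transaction $T_n$: the state-transition function $\Upsilon$ is applied to the state left by the previous transaction, and the transaction's receipt records the cumulative gas used so far, the logs, and the status code.
Finally, the block's resulting state is defined as the last state produced, that is, the state after the final transaction has been processed. This entire mechanism outlines how a block transitions from an initial to a final state before it is validated and added to the blockchain, ensuring all transactions are properly executed and the resulting state is consistent with the block's data.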
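Putting the pieces together, applying a block is essentially a loop over its transactions that threads the state through and builds receipts along the way. The sketch below assumes a caller-supplied `apply_tx` function standing in for the transaction-level state transition; that parameter, the `Receipt` class, and the function name are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Receipt:
    status: int
    cumulative_gas_used: int
    logs: list


def apply_block(state, transactions, header_gas_used: int, apply_tx):
    """Apply transactions in order and return (final_state, receipts).

    apply_tx(state, tx) must return (new_state, gas_used, logs, status).
    """
    receipts, cumulative_gas = [], 0
    for tx in transactions:
        # Each transaction starts from the state left by the previous one.
        state, gas_used, logs, status = apply_tx(state, tx)
        cumulative_gas += gas_used
        receipts.append(Receipt(status, cumulative_gas, logs))
    # Transaction validation: the header's gasUsed must match the total.
    if cumulative_gas != header_gas_used:
        raise ValueError("gasUsed in header does not match executed transactions")
    return state, receipts
```

A real client would additionally compare the resulting state root, receipts root, and logs bloom against the header; this sketch only checks the gas total.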
Note: These notes are a good starting point, but please read the yellow paper as well. I've omitted a lot of details and abstracted away much of the math here and there. Refer to the appendices of the yellow paper as well.