# ETH2

### Byzantine

When we draw diagrams of block chains and block trees, it is easy to assume that they somehow depict "the state" of the whole system. But these diagrams only ever represent the local view of a single participant in the system. My node's view of the system is likely to differ from your node's view, if only temporarily, because we operate over an unreliable network. For example, you will see blocks at different times from when I see them, or in a different order, or even different blocks from those that I see.

These treacherous generals (from the classic Byzantine generals thought experiment) exhibit what we've come to call "Byzantine behaviour", or "Byzantine faults". They can act in any arbitrary way: delaying messages, reordering messages, outright lying, sending contradictory messages to different recipients, failing to respond at all, or any other behaviour we can think of.

## Blockchain

A block's contents (its payload) may vary according to the protocol. The payload of a block on Ethereum's execution chain is a list of user transactions. In Bitcoin, the block size is limited by the number of bytes of data in the block. In Ethereum's execution chain, the block size is limited by the block gas limit (that is, the amount of work needed to process the transactions in the block). Beacon block sizes are limited by hard-coded constants.

The main benefit of bundling transactions into blocks comes from the interval between them (12 seconds in Eth2, 10 minutes on average in Bitcoin). This interval provides time for the network to converge - for as many nodes as possible to see each block and therefore to come to agreement on which block is the head of the chain.

## Block Trees

In real networks we can end up with something more like a block tree than a block chain. In this example very few blocks are built on their "obvious" parent. Why did the proposer of block C build on A rather than B?

- It may be that the proposer of C had not received block B by the time it was ready to make its proposal.
- It may be that the proposer of C deliberately wanted to exclude block B from its chain, for example to steal its transactions, or to censor some transaction in B.
- It may be that the proposer of C thought that block B was invalid for some reason.

The various branches in the block tree are called "forks". Forks happen naturally as a consequence of network and processing delays. But they can also occur due to client faults, malicious client behaviour, or protocol upgrades that change the rules, making old blocks invalid with respect to the new rules. The last of these is sometimes called a "hard fork". If you were to consult nodes that are following different forks, they would give you different answers regarding the state of the system.

Non-forking consensus protocols exist, such as PBFT in the classical consensus world and Tendermint in the blockchain world. These protocols always produce a single linear chain and are thus formally "safe". However, they sacrifice liveness on asynchronous networks such as the Internet.

## Fork Choice Rules

https://eth2book.info/capella/part2/consensus/preliminaries/#:~:text=Given%20a%20block,their%20descendent%20blocks.

In short: given a block tree, a fork choice rule selects a single leaf block - and hence the branch ending at it - that a node treats as the canonical head of the chain.
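To make this concrete, here is a minimal sketch (my own illustration, not from the book) of a block tree with a naive longest-chain rule standing in for the fork choice. Blocks A, B, and C follow the example above; block D is hypothetical, added so that one branch is longer than the other.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Block:
    name: str
    parent: Optional["Block"]  # None for the genesis block

def height(block: Block) -> int:
    # Distance from genesis: a fork choice needs some notion of weight.
    return 0 if block.parent is None else 1 + height(block.parent)

def choose_head(leaves: list[Block]) -> Block:
    # A fork choice rule maps a node's local view of the block tree to a
    # single leaf; here, naive longest-chain (ties broken arbitrarily).
    return max(leaves, key=height)

# The fork from the text: B and C are both children of A.
genesis = Block("genesis", None)
a = Block("A", genesis)
b = Block("B", a)
c = Block("C", a)
d = Block("D", c)  # hypothetical extra block on C's branch

# Two branches end at leaves B and D; longest-chain picks D.
print(choose_head([b, d]).name)  # "D"
```

LMD GHOST, covered below, replaces the `height` weighting with vote-derived subtree weights, but keeps the same shape: a local view in, a single head out.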
## Nodes

Consensus is formed by validators, which (in true Ethereum style) are horribly misnamed, as they don't really validate anything - that's done by the nodes. Each validator represents an initial 32 ETH stake. It has its own secret key, and the related public key that is its identity in the protocol.

Validators are attached to nodes, and a single node can host anything from zero to hundreds or thousands of validators. Validators attached to the same node do not act independently; they share the same view of the world.

The two most important intervals are the slot, which is 12 seconds exactly, and the epoch, which is 32 slots, or 6.4 minutes.

## Blocks and Attestations

Every slot, exactly one validator is selected to propose a block. The block contains updates to the beacon state, including attestations that the proposer knows about, as well as the execution payload containing Ethereum user transactions.

Every epoch, every validator gets to share its view of the world exactly once, in the form of an attestation. An attestation contains a vote for the head of the chain that will be used by the LMD GHOST protocol, and votes for checkpoints that will be used by the Casper FFG protocol. Spreading the attestation workload across all 32 slots of an epoch keeps resource usage low: in each slot, committees comprising only 1/32 of the validators make attestations.

## Ghosts in the machine

In essence, LMD GHOST provides slot-by-slot liveness (it keeps the chain running), while Casper FFG provides safety (it protects the chain from long reversions). LMD GHOST allows us to keep churning out blocks on top of one another, but it is forkful and therefore not formally safe.

## History

The original plan was to apply Casper FFG as a proof of stake overlay on top of Ethereum's proof of work consensus. Casper FFG would confer finality - a property that proof of work chains lack - on the chain on a periodic basis, say, every 100 blocks. This was intended to be the first step in weaning Ethereum off proof of work.

## LMD GHOST

The name LMD GHOST comprises two acronyms: "Latest Message Driven" and "Greedy Heaviest-Observed Sub-Tree".

### GHOST

In short, GHOST's fork choice doesn't follow the heaviest chain, but the heaviest subtree. It recognises that a vote for a block is not only a vote for that block, but implicitly a vote for each of its ancestors as well, so whole subtrees have an associated weight.

Bitcoin never adopted GHOST, and (despite that paper stating otherwise) neither did Ethereum under proof of work, although it had originally been planned, and the old proof of work "uncle" rewards were related to it.

Increasing the block size, or decreasing the interval between blocks, makes the chain more susceptible to forking in a network that has uncontrolled latency (delays), like the Internet. Chains that fork have more reorgs, and reorgs are bad for transaction security. Replacing Bitcoin's longest chain fork choice rule with the GHOST fork choice was shown to be more stable in the presence of latency, allowing block production to be more frequent.

### LMD

In proof of work, the voters are the block proposers. They vote for a branch by building their own block on top of it. In our proof of stake protocol, all validators are voters, and each casts a vote for its view of the network once per epoch (every 6.4 minutes) by publishing an attestation. So, under PoS, we have a lot more information available about participants' views of the network. This is what it means to be "message driven", giving us the MD in LMD. Only each validator's most recent vote is counted, giving us the L for "Latest". Within each epoch, the validator set is split up so that only 1/32 of the validators are attesting at each slot.
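A rough sketch of the resulting fork choice, under simplifying assumptions: votes are weighted by count rather than by each validator's effective balance as in the real protocol, and the tree and votes are plain dicts. Only each validator's latest message is retained, and we descend greedily into the heaviest subtree.

```python
from collections import Counter

def lmd_ghost_head(root, children, latest_votes):
    # latest_votes: validator -> block named in its latest attestation.
    # Only the most recent message per validator counts (the "LMD" part).
    direct = Counter(latest_votes.values())

    def weight(block):
        # A vote for a block is implicitly a vote for its ancestors, so a
        # subtree's weight is its own votes plus those of all descendants.
        return direct[block] + sum(weight(c) for c in children.get(block, []))

    head = root
    while children.get(head):
        # Greedily pick the heaviest child until we reach a leaf (GHOST).
        head = max(children[head], key=weight)
    return head

# Tree: genesis -> A -> {B, C -> D}; three validators' latest head votes.
children = {"genesis": ["A"], "A": ["B", "C"], "C": ["D"]}
votes = {"v1": "B", "v2": "D", "v3": "D"}
print(lmd_ghost_head("genesis", children, votes))  # "D"
```

Note that B has more direct votes than C (one versus zero), yet C's subtree wins because its descendant D carries two votes - exactly the heaviest-subtree idea.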
## Proposer Slashing

https://eth2book.info/capella/part2/consensus/lmd_ghost/#img_annotated_forkchoice_lmd_ghost_2:~:text=Proposer%20slashing,eventually%20becomes%20canonical

Proposer equivocation is not detected in-protocol, but relies on a third party constructing a proof of equivocation in the form of a ProposerSlashing object. The proof comprises just two signed beacon block headers: that's enough to prove that the proposer signed off on two blocks in the same slot. A subsequent block proposer will include the proof in a block (and be well rewarded for it), and the protocol will slash the offending validator's balance and eject it from the active validator set.

## Casper FFG

Casper FFG really comes down to two big ideas: first, the two-phase commit (justification and finalisation) and, second, accountable safety. Casper FFG functions as a "finality gadget", and we use it to add finality to LMD GHOST.

We have a total of n validators, of which a number, f, may be faulty or adversarial in some way. To preserve liveness, we need to be able to make a decision after hearing from only n − f validators, since the f faulty ones might withhold their votes. But this is an asynchronous environment, so the f non-responders may simply be delayed, and not faulty at all. Therefore we must account for up to f of the n − f responses we have received being from faulty or adversarial validators. To guarantee that we can always achieve a simple majority of honest validators after hearing from n − f validators, we require that (n − f) / 2 > f, that is, n > 3f.

To summarise, like all classical Byzantine fault tolerant (BFT) protocols, Casper FFG is able to provide finality when less than one-third of the total validator set is faulty or adversarial.

## Epochs and Checkpoints

To keep the messaging workload manageable, rather than having every validator vote at once, voting is spread out through the duration of an epoch, which, in Eth2, is 32 slots of 12 seconds each. At each slot, 1/32 of the total validator set is scheduled to broadcast a vote, so each validator is scheduled to cast a vote exactly once per epoch. For efficiency, we bundle each validator's Casper FFG vote with its LMD GHOST vote, although that's by no means necessary.

For the time being we will assume that every slot has a block in it. This is because the original Casper FFG's checkpoints are based on block heights rather than slot numbers.

## Justification and Finalisation

To summarise, for me to be absolutely certain that the whole network agrees that a block will not be reverted requires the following steps.

Round 1 (ideally leading to justification):

- I tell the network what I think is the best checkpoint.
- I hear from the network what all the other validators think is the best checkpoint.
- If I hear that 2/3 of the validators agree with me, I justify the checkpoint.

Round 2 (ideally leading to finalisation):

- I tell the network my justified checkpoint, the collective view I gained from round 1.
- I hear from the network what all the other validators think the collective view is, their justified checkpoints.
- If I hear that 2/3 of the validators agree with me, I finalise the checkpoint.

Under ideal conditions, each round lasts an epoch, so it takes an epoch to justify a checkpoint and a further epoch to finalise a checkpoint. At the start of epoch N we are aiming to have justified checkpoint N−1 and to have finalised checkpoint N−2.

Quantifying that, it takes 12.8 minutes, two epochs, to finalise a checkpoint in-protocol. In Casper FFG the two rounds are overlapped and pipelined, so that, although it takes 12.8 minutes from end to end to finalise a checkpoint, we can finalise a checkpoint every 6.4 minutes, once per epoch. It can be possible from outside the protocol to see that a checkpoint is likely to be finalised a little earlier than the full 12.8 minutes, assuming that there is no long chain reorg. Specifically, it is possible to have collected enough votes by 2/3 of the way through the second round, that is, after about 11 minutes. However, in-protocol justification and finalisation is done only during end-of-epoch processing.
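A toy sketch of this pipelined bookkeeping, under stated simplifications: checkpoints are bare epoch numbers, and the supermajority-link check (that votes flow from the current justified source) performed by real epoch processing is omitted.

```python
def process_epoch(justified, finalized, epoch, target_stake, total_stake):
    """At the start of `epoch`, tally the stake that voted for checkpoint
    epoch - 1 as its FFG target. A simplification of the real transition."""
    if 3 * target_stake >= 2 * total_stake:  # a 2/3 supermajority
        # Round 1 result: checkpoint N-1 becomes justified.
        # Round 2 result (pipelined): justifying N-1 finalises the
        # previously justified checkpoint N-2.
        if justified == epoch - 2:
            finalized = epoch - 2
        justified = epoch - 1
    return justified, finalized

# Ideal run: at the start of epoch 5 we justify checkpoint 4 and
# finalise checkpoint 3, matching the N-1 / N-2 pattern above.
justified, finalized = process_epoch(3, 2, epoch=5,
                                     target_stake=700, total_stake=1000)
print(justified, finalized)  # 4 3
```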
## Sources and conflicts

https://eth2book.info/capella/part2/consensus/casper_ffg/#conflicting-justification:~:text=Source%20and%20target%20votes,never%20to%20revert%20it.

https://eth2book.info/capella/part2/consensus/casper_ffg/#conflicting-justification:~:text=There%20are%20some%20specific%20criteria,attestations%20included%20in%20a%20block.

## Read the full Casper commandments

https://eth2book.info/capella/part2/consensus/casper_ffg/#conflicting-justification:~:text=a%20little%20later.-,The%20Casper%20commandments,-In%20the%20Casper

## Justification and Finalisation

https://eth2book.info/capella/part2/consensus/casper_ffg/#conflicting-justification:~:text=is%20fairly%20straightforward.-,Justification,to%20be%20finalised.,-In%20other%20words

## Slashing

Violations of the commandments are potentially difficult to detect in-protocol. In particular, detection of surround votes might require searching a substantial history of validators' previous votes. For this reason, we rely on external slashing detection services to detect slashing condition violations and submit the evidence to block proposers. There needs to be only one such service on the network, as long as it is reliable. In practice we have more, but it is certainly not necessary for every node operator to run a slashing detector.

https://eth2book.info/capella/part2/consensus/casper_ffg/#:~:text=The%20protocol%20as,its%20final%20value.

## Accountable safety

Classical PBFT consensus can guarantee safety only when less than one-third of validators are adversarial (faulty). If more than one-third are adversarial then it makes no promises at all. Casper FFG comes with essentially the same safety guarantee when validators controlling less than one-third of the stake are adversarial: finalised checkpoints will never be reverted. In addition, it provides the further guarantee that, if conflicting checkpoints are ever finalised, validators representing at least one-third of the staked Ether will be slashed. This is called "accountable safety".

## Plausible Liveness

All block production and chain building is the responsibility of the underlying consensus mechanism, LMD GHOST in our case. However, there is a sense in which we want Casper FFG to be live: we always want to be able to continue justifying and finalising checkpoints if at least two-thirds of validators are honest, without any of those validators getting slashed. In Vitalik's words, plausible liveness basically means that "it should not be possible for the algorithm to get 'stuck' and not be able to finalize anything at all".

https://eth2book.info/capella/part2/consensus/casper_ffg/#answer-to-the-exercise:~:text=For%20me%20to%20justify,globally%20safe%20from%20reversion.
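The double vote and surround vote rules behind the Slashing section above can be pinned down in a few lines. This sketch follows the shape of the consensus spec's `is_slashable_attestation_data` predicate, with bare epoch numbers standing in for the spec's full checkpoint objects.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FFGVote:
    source: int  # epoch of the justified source checkpoint
    target: int  # epoch of the target checkpoint

def is_slashable(v1: FFGVote, v2: FFGVote) -> bool:
    # Double vote: two distinct votes with the same target epoch.
    double_vote = v1 != v2 and v1.target == v2.target
    # Surround vote: v1's source-to-target span strictly contains v2's.
    surround_vote = v1.source < v2.source and v2.target < v1.target
    return double_vote or surround_vote

print(is_slashable(FFGVote(2, 7), FFGVote(3, 5)))  # True: a surround vote
print(is_slashable(FFGVote(3, 4), FFGVote(3, 4)))  # False: the same vote twice is fine
```

A whistleblower only needs to find one such pair among a validator's signed attestations; as noted above, searching for surround votes across a long history is why detection is left to external services.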
#### Study this if you've time:

https://eth2book.info/capella/part2/consensus/casper_ffg/#answer-to-the-exercise:~:text=have%20been%20fine.-,Now%20with%20adversarial%20action,-To%20see%20why

### **Leaders in PBFT vs. Casper FFG**

- Both **PBFT** and **Casper FFG** use a **leader** to help create or agree on blocks.
- In **Casper FFG**, the leader is the **block proposer**, and it changes **every round** (automatically, on a regular schedule).
- In **PBFT**, the leader is called the **primary**, and it **only changes if it's not working properly** (for example, being offline or faulty). Changing the leader in PBFT requires a special process called a **view change**.

---

### **What happens if many validators go offline?**

- In **PBFT**, if **more than one-third** of validators go offline, the protocol can't continue. It gets **stuck**, because it can no longer assemble the quorums it needs (even for a view change). This is because PBFT **prioritises safety** (not making mistakes) over liveness (keeping things moving).
- In **Casper FFG**, if more than one-third go offline:
  - **Finalisation stops** (the network can't finalise new checkpoints).
  - But the **chain can still grow** - blocks can still be proposed and added, thanks to the **underlying consensus mechanism** (LMD GHOST in our case).

So, **Casper FFG can still provide liveness** even if finalisation pauses. That's because it works **on top of** another system that keeps producing blocks.

### Committees

To keep the attestation workload manageable, the beacon network is subdivided into committees, which are subsets of the active validator set that distribute the overall work. Committees have a minimum size of 128 validators, and up to 64 committees are assigned per slot, as the sketch below shows.
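The committee arithmetic can be sketched along the lines of the beacon chain spec's `get_committee_count_per_slot`, using the mainnet constants:

```python
SLOTS_PER_EPOCH = 32
TARGET_COMMITTEE_SIZE = 128   # committees aim for at least this many members
MAX_COMMITTEES_PER_SLOT = 64

def committees_per_slot(active_validators: int) -> int:
    # Every active validator attests once per epoch, so divide the set
    # across 32 slots, then into committees of ~128, capped at 64 per slot.
    return max(1, min(
        MAX_COMMITTEES_PER_SLOT,
        active_validators // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE,
    ))

print(committees_per_slot(500_000))  # 64: the cap binds at this scale
print(committees_per_slot(100_000))  # 24: fewer validators, fewer committees
```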