Work in progress.
We study a modification to Ethereum forkchoice algorithm by which we allow any tip descending from the last justified checkpoint to become head of the chain. We show that accountable safety and plausible liveness continue to hold. We also show that there are no possible attestations deadlocks and that none of the currently known FFG-induced reorgs are exploitable.
This document grew out of discussions with M. Khalinin and the teachings of F. Damato. It is the outcome of trying to understand a proposal by F. Damato (hereby called Francesco's hack) in the hopes that it is equivalent to a formulation we discussed with M. Khalinin.
The document is laid out as a story instead of a research article. We start by giving some basic definitions and recalling the basic attacks in Preliminaries, we then establish some desired properties of forkchoice in Wish List and introduce the main proposed change to remove the block filtering in Remove Justification Filtering. After that we show that this change has a series of unintended consequences that need to be dealt in order to maintain the desired properties of forkchoice. We describe the discrepancy between the store's point of view and the head state's point of view with regard of justification in Store and State separation, this in turn leads to a redefinition of the notion of a supermajority link, by allowing multiple sources towards justifying a new checkpoint. We then show that Accountable Safety still holds in this context and describe an implementation problem that arises with Plausible Liveness. In Process Justification we describe a possible implementation mechanism in order to correctly track validators' votes towards multiple-sources supermajority links and in Weak Liveness we describe the probabilistic nature of Plausible liveness in this setup. We end this document showing that this forkchoice algorithm is not susceptible to any of the known attacks.
As with any forkchoice implementation, Ethereum's deals with the interplay between certain information that is available on-chain (the BeaconState
, imported BeaconBlocks
, etc.) an information that is only available off-chain and is therefore dependent on the participant's current view of the network. Some attestations/blocks may have been seen by some validators and not others, some blocks may have been timely to some part of the network, but late to others, etc. To keep track of all off-chain information needed for a validator to decide what is its canonical head, Ethereum uses a forkchoice store which is specified as a python class containing all this necessary information:
class Store(object):
time: uint64
genesis_time: uint64
justified_checkpoint: Checkpoint
finalized_checkpoint: Checkpoint
best_justified_checkpoint: Checkpoint
proposer_boost_root: Root
equivocating_indices: Set[ValidatorIndex]
blocks: Dict[Root, BeaconBlock] = field(default_factory=dict)
block_states: Dict[Root, BeaconState] = field(default_factory=dict)
checkpoint_states: Dict[Checkpoint, BeaconState] = field(default_factory=dict)
latest_messages: Dict[ValidatorIndex, LatestMessage] = field(default_factory=dict)
An honest validator calls the function get_head(store)
to obtain its canonical chain and perform its duties. Ethereum's mechanism to choose this head is a mixture of a finality gadget (FFG) [1] and a forkchoice rule (LMD GHOST) [2]. The finality gadget works by using on-chain information to label certain blocks in the block-chain as finalized and others as justified. It then provides assurances that a finalized block will always be part of the canonical blockchain unless some identifiable actor breaks the rules and therefore losing all of it's stake in the process. The forkchoice rule keeps track of off-chain information to decide which block belongs to the canonical chain. This rule is layered with the finality gadget: only blocks such that their corresponding beacon state has the same justified checkpoint as the highest justified checkpoint observed are allowed to be considered for head. It is in this interface between FFG and LMD GHOST that friction arises.
The following attack (which was found by Paul Hauner) illustrates this friction well. Consider the following diagram
The attack succeeds because honest validators filter blocks as posible head, only those that have Epoch 11 justified and Epoch 10 finalized will be allowed.
store(t)
be the forkchoice store at time t
.head_state(t)
be the BeaconState
with block root equal to the result of get_head(store(t))
.We will impose the following conditions on our forkchoice algorithm:
Monotonicity of justification:
store(t).justified_checkpoint.epoch > store(t').justified_checkpoint
if t > t'.
Monotonicity of finalization:
store(t).finalized_checkpoint.epoch > store(t').finalized_checkpoint
if t > t'.
Accountable safety and plausible liveness as defined in [1]
Notice that we are requiring monotonicity for the justified checkpoint of the forkchoice store but not of the head state. These two justified checkpoints are currently enforced to be equal in Ethereum's specification but we will relax this condition as we will see below.
Currently the result of get_head(store)
requires that a node is viable for head only if its justified checkpoint equals the current store.justified_checkpoint
and similarly for finalization. This is enforced in the following code snippet.
def filter_block_tree(store: Store, block_root: Root, blocks: Dict[Root, BeaconBlock]) -> bool:
...
correct_justified = (
store.justified_checkpoint.epoch == GENESIS_EPOCH
or head_state.current_justified_checkpoint == store.justified_checkpoint
)
correct_finalized = (
store.finalized_checkpoint.epoch == GENESIS_EPOCH
or head_state.finalized_checkpoint == store.finalized_checkpoint
)
# If expected finalized/justified, add to viable block-tree and signal viability to parent.
if correct_justified and correct_finalized:
blocks[block_root] = block
return True
...
We will relax this with the following change:
We will allow head_state
to be viable for head if the store's justified checkpoint is a descendant of head_state.justified_checkpoint
and the analogous statement for the finalized checkpoint.
Notice that this not only means that the block with root equal to store.justified_checkpoint.root
is a descendant of the block with root head_state.justified_checkpoint.root
but also that store.justified_checkpoint
is an actual checkpoint for the epoch store.justified_checkpoint.epoch
in the chain whose tip is head_state
.
A possible implementation of this is as follows, define the following helper function that returns whether a checkpoint is a checkpoint from the point of view of the given head state.
def is_checkpoint(store: Store, root: Root, checkpoint: Checkpoint) -> Bool
return get_ancestor(store, root, compute_start_slot_at_epoch(checkpoint.epoch)) == checkpoint.root
and modify the above snippet to be
def filter_block_tree(store: Store, block_root: Root, blocks: Dict[Root, BeaconBlock]) -> bool:
...
correct_justified = (
store.justified_checkpoint.epoch == GENESIS_EPOCH
or is_checkpoint(store, block_root, store.justified_checkpoint)
)
...
if correct_justified and correct_finalized:
blocks[block_root] = block
return True
...
A first immediate consequence of the change in the previous section is that we always have
store.justified_checkpoint.epoch >= head_state.justified_checkpoint.epoch
but these are not necessarily equal.
This leads to the following problem: since we may have different descendants from the same store.justified_checkpoint
but with different head state's justified checkpoint, we obtain as consequence that the head state justified checkpoint's epoch may decrease in time. This would lead an honest validator to attest surrounding itself if it uses the state's justification as a source of its attestation. We propose then the following
store.justified_checkpoint
as the source of their attestation.Remark. This is not the only possible change to allow for this. The above mentioned Francesco's hack consists of allowing validators to still use the state's justified checkpoint, but with a modified epoch.
Remark. An implementation detail related to this induced change is the fact that store.justified_checkpoint
needs to be updated during on_tick
, that is, as soon as we cross into a new epoch. This is because honest validators advance their head state and would otherwise revert the above inequality.
The monotonicity of the store's justified checkpoint guarantees that there cannot be any deadlocks, that is, honest validators will always have a head to attest to without slashing themselves.
An unfortunate consequence of this Induced change 1 is that honest validators will be attesting with different source checkpoints during the same epoch, even if trying to justify the same target. This leads to an accounting problem, honest validators on different forks with different justified checkpoints (from the point of view of their head states) cannot import attestations from the other branch, as the source is invalid. This is currently enforced in the specification in the following snippet
def get_attestation_participation_flag_indices(state: BeaconState,
data: AttestationData,
inclusion_delay: uint64) -> Sequence[int]:
...
is_matching_source = data.source == justified_checkpoint
is_matching_target = is_matching_source and data.target.root == get_block_root(state, data.target.epoch)
is_matching_head = is_matching_target and data.beacon_block_root == get_block_root_at_slot(state, data.slot)
assert is_matching_source
...
return participation_flag_indices
If we want to allow attestations from different sources to be processed in forkchoice and count them towards justifying the same target, we are faced with separate decisions to make
We propose then the following:
That is we propose to change that snippet above to be changed to:
def get_attestation_participation_flag_indices(state: BeaconState,
data: AttestationData,
inclusion_delay: uint64) -> Sequence[int]:
...
source_is_checkpoint = is_checkpoint(store, data.source.root, data.source)
source_is_descendant = is_checkpoint(store, data.source.root, justified_checkpoint)
justified_is_descendant = is_checkpoint(store, justified_checkpoint.root, data.source)
is_matching_source = source_is_checkpoint and ( source_is_descendant or justified_is_descendant )
is_matching_target = is_matching_source and data.target.root == get_block_root(state, data.target.epoch)
is_matching_head = is_matching_target and data.beacon_block_root == get_block_root_at_slot(state, data.slot)
assert is_matching_source
...
return participation_flag_indices
This induced change has in itself some consequences. We now count different links, with different sources but the same target.
TODO: do not mix store with the beacon-chain spec, implement a common ancestor and check that it is one of the two checkpoints instead of checking with two loops that they are not contending
By counting votes with different justified checkpoints we have effectively changed the notion of a supermajority link. Indeed we have now the following
Definition. A supermajority link is an ordered pair of checkpoints (a1,a2,…,ak;b), also written a•→b, Such that at least 2⁄3 of the validators (by deposit) have published votes with source some ai and target b.
A checkpoint b is called justified if (1) it is the genesis checkpoint or (2), there exists a supermajority link a•→b where all checkpoints ai are justified.
The notion of finalized is not changed: a checkpoint b is finalized if (1) it is the genesis checkpoint or (2) it is justified and there is a supermajority link (b = b1; c) with c a direct child of b.
Theorem (Accountable Safety) Under the same assumptions as in [1], that is if < 1⁄3 of the validators by weight violate a slashing condition, two conflicting checkpoints a,b cannot both be finalized.
We first prove that with these conditions we still have:
Property (i) If a•→b and c•→d are two different supermajority links, then b.epoch != d.epoch
.
Indeed if b.epoch = c.epoch
then during that epoch at least 1⁄3 of the validators doubled voted.
The second property is slightly different than in [1]:
Property (ii) If a•→b and c•→d are two different supermajority links, then the inequality ak.epoch < c1.epoch < d.epoch < b.epoch cannot hold, where ak is the last checkpoint in a•
At least 2⁄3 of the validators by weight voted for d during d.epoch
. These validators used some ci as source for their votes, with an epoch c1.epoch or higher. At least 2⁄3 of the validators voted for b during b.epoch
and they used as source some ai.epoch, that is an epoch at most ak.epoch. This means that at least 1⁄3 of the validators must have cast a surround vote in order to satisfy the stated inequality.
Property (iii) There exists at most one supermajority link a•→b with b.epoch = n
.
This is a direct consequence of Property (i)
Property (iv) There exists at most one justified checkpoint b with b.epoch = n
This is a direct consequence of Property (iii)
Proof of the Theorem. The proof goes with minor modifications to that in [1]. Let a=am (with direct child am+1) and b=bn with direct child bn+1 be conflicting finalized checkpoints. If a.epoch = b.epoch
then it is clear that 1⁄3 violated the double voting condition, hence we may assume a.epoch < b.epoch
. Notice that this implies b.epoch > am+1.epoch.
Let r→b1•→b2•→…→bn, be a chain of supermajority links, that is, there exist a supermajority link r→b1r1 and supermajority links bi•→bi+1ri+1 for each i=1,…,n, some ri, r the genesis checkpoint and bn• = bn1 = bn = b. This is the main difference with [1], here the source of a supermajority link has a source a chain of checkpoints, with all elements in the chain required to be justified. We know that no bij.epoch equals a.epoch nor am+1.epoch because that would violate property (iv). Let j be the lowest integer such that bjk.epoch > am+1.epoch some k, and let k be the minimum with this property. Let bj-1•→bjk be a supermajority link.
We claim that we must have bj-1i.epoch < a.epoch for every i. Indeed if for some i we had bj-1i.epoch > a.epoch, then we could consider the supermajority link bj-1•→bjk contradicting the minimality of k. The claim now follows from Property (iv).
Now since 2⁄3 of the validators voted to justify bjk using as source some of the bj-1i and 2⁄3 voted to justify am+1 using a as source, it follows that at least 1⁄3 violated the surrounding condition due to property (ii)
The Plausible liveness theorem from [1] still holds in this setting, its proof goes verbatim thus is omitted.
Theorem. Supermajority links can always be added to produce new finalized checkpoints, provided there exist children extending the finalized chain.
store.justified_checkpoint
is not justified in the canonical chain. Then honest validators will never be able to justify a new checkpoint since they will continue voting with a not-justified source. The protocol therefore needs to provide guarantees that the store.justified_checkpoint
will be eventually justified in the canonical chain.We can only give a probabilistic formal proof that this is the case. By contradiction we assume that 2⁄3 of the validators by weight have seen checkpoint a justified. Let us suppose however that the votes that justified a where included in blocks that have not become canonical. The chain as advanced in a different branch descending from a and this branch did not include enough votes to justify a.
In this example the votes to justify a where included in b1 but the chain continued to advance to ci. Honest validators will vote a→ci and never justify again unless a is justified in this chain. We add the following rule
Honest validators must include all attestations with target the highest justified checkpoint in their view, from all branches they have seen
With this rule validators on the blocks ci must eventually include the votes in bi that justified a. Indeed, let us assume that the chain advances and these attestations are never included, then that means that at least 2⁄3 have seen the votes in bi. Eventually honest validators will propose in the canonical chain and will include these votes.
An implementation problem posed by our decision of allowing different sources, possibly not justified, at the point of including these attestations in blocks, is that we do not hold all the information necessary to count these votes towards justification at the time of insertion. These sources could be justified later. Suppose a validator receives an attestation a→b during b.epoch
that it passes our checks in Store and State Separation but such that at the time of arrival, a has not been justified in the chain of b. Justification and finalization is processed during epoch transition from b.epoch
to b.epoch + 1
. At the time of this transition, the chain may have included enough votes to justify a, the problem we are faced is that the BeaconState object does not have this information on the last seen votes by validators.
Since the Altair fork, validator participation is tracked in the two fields
class BeaconState(Container):
...
previous_epoch_participation: List[ParticipationFlags, VALIDATOR_REGISTRY_LIMIT] # [Modified in Altair]
current_epoch_participation: List[ParticipationFlags, VALIDATOR_REGISTRY_LIMIT] # [Modified in Altair]
...
Where the ParticipationFlags
is a bitfield of 3
bits that are used to track correctly voted source, head and target.
The reason why we have two different epoch participation bitlists is because attestations are allowed to be included for an entire epoch. That is, an attestation for a→b may be included during b.epoch + 1
and thus at the epoch transition to b.epoch + 2
, the list previous_epoch_participation
will have information about that vote. Another consequence of this is that the checkpoint b
can be justified during the transition b.epoch → b.epoch + 1
or during the transition b.epoch + 1 → b.epoch + 2
. In particular, if we had a.epoch + 1 = b.epoch
it may happen that during the epoch transition b.epoch → b.epoch + 1
, the chain has seen enough votes to justify a, then to create a supermajority link a → b, thus justifying b (and finalizing b in the process).
Since we now would allow supermajority links of the form a•→b with multiple sources, and in order to count for justification of b, the checkpoints ai need to be already justified. Then either ai is already justified at the time of insertion epoch = b.epoch
(or epoch = b.epoch + 1
) or we must have had
Since otherwise ai could not be justified during the next epoch transition. This suggest that we modify our attestation filter above, to something like
def get_attestation_participation_flag_indices(state: BeaconState,
data: AttestationData,
inclusion_delay: uint64) -> Sequence[int]:
...
source_is_checkpoint = is_checkpoint(store, data.source.root, data.source)
source_is_descendant = is_checkpoint(store, data.source.root, justified_checkpoint)
justified_is_descendant = is_checkpoint(store, justified_checkpoint.root, data.source)
source_is_ffg_compatible = justified_or_justifiable(state, data.source.root)
is_matching_source = source_is_ffg_compatible and source_is_checkpoint and ( source_is_descendant or justified_is_descendant )
is_matching_target = is_matching_source and data.target.root == get_block_root(state, data.target.epoch)
is_matching_head = is_matching_target and data.beacon_block_root == get_block_root_at_slot(state, data.slot)
assert is_matching_source
...
return participation_flag_indices
where the function
def justified_or_justifiable(state: BeaconState, root: Root) -> bool:
returns true if the source's epoch satisfies the above inequality or if it is already justified.
In order to process justification and finalization correctly, we would need to use an extra bit from ParticipationFlags
.
Name | Value |
---|---|
TIMELY_SOURCE_FLAG_INDEX |
0 |
TIMELY_TARGET_FLAG_INDEX |
1 |
TIMELY_HEAD_FLAG_INDEX |
2 |
SOURCE_JUSTIFIED_INDEX |
3 |
So a validator would count the vote towards justification of the current or previous epoch by if SOURCE_JUSTIFIED_INDEX
is set. If this bit is not set however, then the only possible option is that the attestation was of the form a→b with a.epoch + 1 = b.epoch
and we are currently processing the epoch transition b.epoch -> b.epoch + 1
. In this case the validator will have to check first if the previous epoch, that is a.epoch
was justified. If it was, then it will count the attestation towards justification of b
.
Notice that this setup only allows for supermajority links of the form a•→b where at most one ai (that is the last checkpoint) is not justified at the time of attestation insertion time. It is possible to allow for more resilience. To allow for k
such checkpoints then the BeaconState
object will have to have k+1
lists
class BeaconState(Container):
...
epoch_participation_0: List[ParticipationFlags, VALIDATOR_REGISTRY_LIMIT] # [Modified in Altair]
epoch_participation_1: List[ParticipationFlags, VALIDATOR_REGISTRY_LIMIT] # [Modified in Altair]
...
epoch_participation_k: List[ParticipationFlags, VALIDATOR_REGISTRY_LIMIT] # [Modified in Altair]
...
where the epoch_participation_k
tracks the performance of the validators during current_epoch - k
. Participation flags will now need to have not only the above 4 bits, but also include a multiple-bit set SOURCE_EPOCH
, with enough bits so that k
fits within them. In this case, during the epoch transition, a validator would use the SOURCE_JUSTIFIED
flag first. If this flag is set then the attestation counts towards justification of its target. If it is not set, then the SOURCE_EPOCH
bits uniquely identify the source checkpoint and the validator can check if it is currently justified.
Remark When accounting for justification, we need to consider the validators' effective balance, since this is a varying amount, a different balances can be chosen for a vote a→b. A natural choice would be to chose the balance of the validator during a. But since we are allowing votes with multiple sources a•→b, this suffers from implementation. We will consider instead the balances at the justified checkpoint during the epoch transitions (eg. b.epoch → b.epoch + 1
). Here justified checkpoint means from the point of view of the corresponding BeaconState
, that is state.current_justified_checkpoint
.
As noted in Plausible Liveness, in order to guarantee Liveness of the network we need to guarantee that honest validators will be able to justify their store.justified_checkpoint
, and this means that they will be able to propose blocks including the attestations they have already seen on blocks in different branches. This leads to a witholding attack problem where the attacker witholds the votes justifying a checkpoint and releases their blocks at a later time. In the example in that section, the attacker holds the blocks b1 and b2 and releases them only after the chain has advanced on the blocks ci. There is an interplay here between the time of release and the stake of the attacker.
store.justified_checkpoint
is justified on chain (which we can assume by induction)Thus, choosing the number of epochs that we allow old attestations to be inserted (the parameter k
in the previous section) provides protection against larger attackers. This protection can only be meassured statistically, since in theory an attacker with a single validator can both withdold the block bi and be assigned every single block ci until the threshold of the k epochs is past.
In this section we analyze the behavior of the above proposed mechanism against known attacks on forkchoice.
The attack mentioned in Preliminaries does not succed with this mecanism. Since both the attacker's block (in Epoch 12) and the canonical chain (the last block of Epoch 11) are descendant of the last justified checkpoint (the first block of epoch 11, shaded red) they are both allowed to be head. The chain will advance based on Epoch 11, and the very next block included will justify Epoch 11:
The following deadlock (also found by P. Hauner) is a variation of the above attack:
During Epoch 12, before the first block arrives, the situation is as follows:
Attesters do not see any incoming block in Epoch 12, so they attest to their last head, as a target checkpoint for epoch 12. That is, they vote 11→12. At this point a malicious attacker releases a timed block
This block misses the orange block in Epoch 11, so it does not carry enough information to justify Epoch 11, but it can justify Epoch 10 (as it included the first block of epoch 11). The canonical chain hasn't performed the epoch transition, hence the justified checkpoint is still in Epoch 9 or before. Thus, the attacker's block becomes the canonical head.
This attack is malicious in two ways:
The mechanism we are proposing here solves both attacks. The attacker's block adn the canonical head block are both descendant of the last justified checkpoint seen, that is the first block of Epoch 11. And LMD votes will make the canonical chain the valid one and likely to continue.
Even if the attacker's chain manages to continue being canonical, those attesters that voted for 11→12 at the beginning of the Epoch, will not be surrounded during Epoch 13, since they will now vote with their Store's justified checkpoint, which will still point to Epoch 11 (even if the head state justified checkpoint is Epoch 10). So during Epoch 13 these attesters will continue to vote for 11→13 without surrounding themselves.
Consider the following forkchoice diagram
unrealized_justified_checkpoint.epoch = 11
. But the realized justification checkpoint is still at Epoch 10.In this scenario, the attacker releases his blocks:
An honest validator seeing these blocks, since they are from the previous Epoch, will pull tips, realizing the justified checkpoint with Epoch 11. Thus, honest validators will see two valid chains, one with an unrealized justified checkpoint at epoch 11, and the attacker's chain with realized justified checkpoint at epoch 11.
Honest validators are forced, by FFG rules, to switch to the attacker's branch
Remark It was noticed by D. Ryan that this attack can happen with relatively the same frequency even without pull-tips.
This attack has no effect with this mechanism as every possible head descends from the same Justified checkpoint, albeit the canoical chain not having actually justified Epoch 11.
Consider the following scenario
In this situation, the canonical chain head, block C has justified_checkpoint.Epoch = N-3
(if pulled tips are implemented we have unrealized ones: u_justified_checkpoint.Epoch == N-1
and u_finalized_checkpoint.Epoch < N-3
)
store.justified_checkpoint.epoch = N-1
store.finalized_checkpoint.epoch = N-3
Now the canonical head block C and A are both descendants of the last justified checkpoint, therefore both are contending to be canonical blocks.
In the current forkchoice implementation this leads to the finalized checkpoint reverting, from N-2 in the attacker's branch, to < N-3 in the canonical branch. But with this fix, since actions from validators are taken from the store's checkpoint, the store's finalized checkpoint is what matters, and this will still be epoch N-2, an ancestor to the canonical chain. Eventually when the canonical chain advances this endpoint will be finalized on state.
The current proposed fix for these attacks involve imposing the condition that the finalized checkpoint needs to be updated only if it increases finality and force the attacker's branch to win in this scenario.
The following attack is due to R. Saltini.
The canonical chain is in Black at the top. Block B is justified using A as source. Epoch N+2 is missing entirely. During N+3 attesters in the canonical chain voted to justify block C using B as source. After importing the canonical head D, the justification checkpoint is C, in epoch N+3 and the finalized checkpoint is A.
At this moment an attacker reveals the red chain. This chain contains blocks in the missed epoch N+2. This chain contains enough votes to justify E using B as source, so it finalizes B. When inserting the block F in N+5, this chain has finalized checkpoint B and and justified checkpoint E.
In the current specification, the justified checkpoint from the canonical chain will switch from C in epoch N+3 to E in epoch N+2 since it will follow the finalized checkpoint increasing. And this can lead to honest validators deadlocking themselves by having voted N+3→N+4
With the current proposal there are a few points to cover. There is no deadlock invoved as even from the validators in the canonical chain switching over, since their store's justified checkpoint will still be N+3.
In fact, as specified in the PR the validators in the canonical chain will not switch to the attacker's branch, as the head justified checkpoint is not an ancestor of the store's justified checkpoint in this case. They can however move their store's finalized checkpoint to be B in epoch N+1 as that's compatible with their chain.
On the other hand the validators that were on the attackers branch and did not see the canonical branch, they will also not switch over the canonical branch since C is not a descendant of their justified checkpoint E. So this situation will lead to a network split in this scenario.
[1] V. Buterin and V. Griffith. Casper the Friendly Finality Gadget. Available online https://arxiv.org/abs/1711.09437
[2] V. Buterin, D. Hernandez, T. Kamphefner, K. Pham, Z. Qiao, D. Ryan, J. Sin, Y. Wang, Y. X. Zhang. Combining GHOST and Casper. Available online https://arxiv.org/abs/2003.03052