Whisk (ethresearch post) is a single secret leader election (SSLE) protocol that provides proposer privacy at the protocol level.
In simple terms, for proposers to remain private, the chain must know a point P for each validator such that P = k * G, where k is only known by the validator. That may not sound complicated, but for a live network with hundreds of thousands of indexes, it adds some practical problems.
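As a toy illustration of the P = k * G relation, here is a sketch using the multiplicative group of integers mod a prime in place of BLS12-381 curve points (all parameters are illustrative, not protocol values):

```python
# Toy sketch of the P = k * G relation. The multiplicative group mod a
# prime stands in for the BLS12-381 curve group; all values are illustrative.
p = 2**127 - 1        # toy prime modulus (a Mersenne prime)
G = 5                 # toy generator, playing the role of the curve generator

k = 31415926535       # secret scalar, known only to the validator
P = pow(G, k, p)      # public commitment, stored on-chain

# Recovering k from P and G is the discrete logarithm problem,
# which is what keeps the validator's identity private.
```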
For the chain to have access to this set of points P, it can either:

- derive P from existing on-chain data, or
- have validators submit a new point P.
This post explores the problems with each bootstrapping strategy and proposes an extended strategy that solves them.
## Deriving P from existing on-chain data

Whisk uses a specific ZK shuffle argument protocol: curdleproofs, developed specifically for Ethereum's use case. We won't go into details, except for the requirement that "Curdleproofs runs over a public coin setup in any group where the decisional Diffie–Hellman assumption holds".
What could this point P be? There are many suitable curves out there that satisfy this requirement. To minimize the development cost of Whisk, let's pick BLS12-381, the curve already widely used in the beacon chain. This curve has three groups: G1, G2, and Gt. Note that each group's points have different sizes and computational costs, ordered by cost Gt >> G2 > G1.
What data is available to us? From the point of view of Whisk (a fork after genesis), validators enter the system through:

- the existing validator set at the fork, or
- new deposits.

Tackling the latter first, all the data we get from depositing validators is:
```python
class DepositData(Container):
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32
    amount: Gwei
    signature: BLSSignature  # Signing over DepositMessage
```
Conveniently, deposits include both G1 and G2 points such that P = k * G, where k is only known to the validator. But don't get too excited yet.
An astute reader may have noticed: "Hey, the beacon chain already holds a point P in G1, the validator's public key!" Indeed, but due to some pairing magic, using the existing validator pubkey is not sound. Once a validator signs a single message, its privacy in Whisk is broken forever (details and explanation here). So we must use a new point P in G1, neither known nor derivable from existing chain data.
Each newly depositing validator submits a G2 point via its DepositData; can we use that? Unfortunately no, this point is also unsound due to the same oracle attack described above.
Given a tracker (rG2, rkG2) and a public key kG1, you can de-anonymize the tracker with the pairing check:

e(kG1, rG2) ?= e(G1, rkG2)
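To see why this check works, here is a toy model in which group elements are represented by their discrete logs and the pairing by exponentiation in a target group. All parameters are illustrative; the real BLS12-381 pairing only needs the same bilinearity property that this model captures.

```python
# Toy model of the pairing de-anonymization check (illustrative only):
# a group element a*G is represented by its discrete log a, and the
# pairing e(a*G1, b*G2) by gt**(a*b) mod p.
p = 2**127 - 1           # toy prime modulus for the target group
gt = 7                   # toy generator of the target group

def e(a, b):
    """Toy bilinear pairing: e(a*G1, b*G2) = gt^(a*b)."""
    return pow(gt, a * b, p)

k = 123456789            # validator's secret key (pubkey = k*G1)
r = 987654321            # shuffler's blinding factor

# Tracker (r*G2, r*k*G2), represented by its discrete logs:
tracker = (r, r * k)

# De-anonymization: does e(k*G1, r*G2) == e(G1, r*k*G2)?
assert e(k, tracker[0]) == e(1, tracker[1])
```

The check passes exactly when the tracker belongs to the pubkey kG1, so an observer can test every known pubkey against a tracker and strip its anonymity.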
Similarly, we can't use existing G2 points from chain data. And if we have to resort to new point submissions, it makes sense to use G1, due to its smaller size.
A unique Gt point can be derived from existing on-chain data (credits to Justin Drake for the idea). However, Gt is a cyclic group whose elements live in the extension field Fp12, which makes the curdleproofs protocol much more expensive. Since this option is not proven to be feasible, I will ignore it for now.
## Having validators submit a new point P

Since we can't change the deposit contract, validators will have to submit their point P through some mechanism post-deposit:

- included in their own block proposals, or
- broadcast on a dedicated gossip p2p topic.
With either of those methods, the asynchrony between validator onboarding and submitting P causes the validator to exist in two states:

- validators that have not yet submitted P, and
- validators that have submitted P.
This distinction becomes relevant if you answer yes to: should validators participate in shuffling rounds before submitting P?
For a validator to participate in the secret shuffling it must have some unique point. If it has not yet submitted P, the chain must derive one deterministically, meaning that this validator will have no proposer privacy until it submits k. This is bad, as it can expose validators to even more severe DOS risks, or can cause missed slots.
### Submission via own proposals

Expecting each validator to submit its message in its own block proposal simplifies the design, since no dedicated p2p topic is needed.

However, there's a nasty dependency: a validator must propose once to gain access to proposer privacy. During the initial unsafe bootstrapping phase, a validator's proposer slot is known well in advance. A motivated attacker could persistently DOS a specific validator, preventing it from ever achieving proposer privacy.
Going back to the question: should validators participate in shuffling rounds before submitting P?

The answer here is yes: unsafely bootstrapped validators must participate in the shuffle to include their message. This forces the protocol into assigning a deterministic yet unique secret k to new validators. Having to assign a point that is unique at a specific (past) time significantly increases the complexity for participants. The document expensive_duty_discovery explains why it forces validators to perform hundreds of thousands of point scalar multiplications for each slot and registered index.
### Submission via gossip p2p

Similar to the bls_to_execution_change topic, each validator submits a message at most once to register their point P. Any new p2p topic adds complexity and security considerations, raising the development cost of the fork.

Submission via p2p allows for a much shorter bootstrapping window, since each block can include multiple messages. It also allows targeted DOS'ed validators to achieve privacy, preventing the issue described above.

However, managing the load on this new p2p topic will not be trivial. Since the inclusion of your message grants you privacy, there is an incentive to get yours included as soon as possible. The bls_to_execution_change topic caused similar overloading issues right after the Capella fork.
Let's compare the different possible bootstrapping strategies:

| Strategy | Permanent DOS attack | Expensive duty discovery | Complexity |
|---|---|---|---|
| own proposals | ❌ vulnerable | ❌ vulnerable | ✅ Low |
| gossip p2p | ✅ safe | ❌ vulnerable | ❌ Moderate |
| gossip p2p with delayed start | ✅ safe | ✅ safe | ❌ High |
There's no jackpot row of all greens, unfortunately. I'll discuss the trade-offs for each, to help readers make an informed decision on what strategy is the most suitable.
### Expensive duty discovery

Initialize validators with a deterministic k that is unique to the set of trackers at the assigned time.
```python
def get_unique_whisk_k(state, validator_index):
    counter = 0
    while True:
        k = get_initial_whisk_k(validator_index, counter)
        if is_k_commitment_unique(state, BLSG1ScalarMultiply(k, BLS_G1_GENERATOR)):
            return k  # unique by trial and error
        counter += 1
```
Then on the first proposal, each validator submits a new tracker for a secret k of their choice. This k may be the outcome of get_initial_whisk_k for another validator. In the extreme case, in a network with N validators they could set their k to k(i) = get_initial_whisk_k(N, i). Then, the validator with index N will need to try N times to discover its k.
In a BN / VC split where the VC never submits k (TBA doc on BN/VC interaction), this increases the computational complexity of one or both components. To cover the worst case, get_initial_whisk_k() must be checked N times (index count).
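The griefing scenario above can be sketched with toy stand-ins for the spec helpers (the hash-to-scalar and G1 commitment are simplified; the names mirror the spec but the implementations here are illustrative):

```python
import hashlib

def get_initial_whisk_k(validator_index: int, counter: int) -> int:
    # Toy stand-in for the spec helper: hash index and counter to a scalar.
    data = validator_index.to_bytes(8, "little") + counter.to_bytes(8, "little")
    return int.from_bytes(hashlib.sha256(data).digest(), "little")

def commitment(k: int) -> int:
    # Toy stand-in for BLSG1ScalarMultiply(k, BLS_G1_GENERATOR).
    return k  # in the real protocol this is a G1 point

N = 1000
# Adversarial registrations: validator i claims the k meant for index N, counter i.
taken = {commitment(get_initial_whisk_k(N, i)) for i in range(N)}

# Validator N must now scan counters until it finds an unused commitment.
counter = 0
while commitment(get_initial_whisk_k(N, counter)) in taken:
    counter += 1
print(counter)  # N tries in the worst case
```

Every honest participant checking duties must repeat this kind of search, which is where the hundreds of thousands of scalar multiplications per slot come from.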
### Gossip p2p with delayed start

Going back to the question: should validators participate in shuffling rounds before submitting P?

This time we will answer no. Validators will only be selected into the shuffling if they have already submitted their point P. To not break liveness, a shuffling round can only start once it has a minimum number of participants, i.e. once count_non_zero_trackers(state) > MIN_TRACKERS.
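A minimal sketch of the delayed-start condition, with a hypothetical threshold value and container shape (the real spec values and types differ):

```python
# Hypothetical threshold; the real spec value would be chosen separately.
MIN_TRACKERS = 16384

def count_non_zero_trackers(trackers: list) -> int:
    # Trackers not yet submitted are represented as None here.
    return sum(1 for t in trackers if t is not None)

def shuffling_active(trackers: list) -> bool:
    # Shuffling rounds only begin once enough validators have registered P.
    return count_non_zero_trackers(trackers) > MIN_TRACKERS
```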
While this strategy solves the attacks described above, it introduces a two-step fork with added complexity. It also adds a much stronger incentive to get your Whisk registration message included first. This can cause security and stability issues that should be researched.
While the construction of curdleproofs is very elegant, applying it to a live network forces the need for cumbersome bootstrapping strategies. While the current spec chose the simplest option in terms of complexity, the attacks it opens up are problematic. The proposed gossip-with-delayed-start strategy is more complex but addresses all the issues known to date.