To recap the problem Bracha'87 is trying to solve:
There are parties, out of whom at most are byzantine, with a designated leader/broadcaster for each protocol run. The protocol runs in an asynchronous[1] and permissioned[2] network without the help of Public Key Infrastructure (PKI)/signatures to authenticate messages.
The leader starts with some input value from a predefined value space (usually we consider ) and a party that terminates needs to output a value .
The two properties our broadcast protocol wants to achieve are:
Notice that:
Now, let's recall the protocol (skip rephrasing the validity and agreement proof) and explore a few extension questions:
Bracha has reminded us of the famous FLP impossibility result in Section 8: "byzantine general protocols are impossible in asynchronous system even in the presence of a single fault". Another way to put it: to achieve consensus in asynchronous model, even a single crash failure would result in infinite execution.
Intutively, we can use the "indistinguishability argument" to quickly see why. In world 1, the leader is faulty and crashes without sending any messages. In world 2, all parties are honest, but the message from leader got delayed for aribrary amount of time. If we want termination, then there must be a priori time bound at which all parties should output a value. With a delay in asynchrony, honest parties can't distinguish the two worlds (since they recieve no message in both worlds) and will output something different than leader's input, thus breaking validity. In order to avoid violating validity or agreement, honest parties have no choice but keep waiting indefinitely.
To circumvent the impossibility, people usually weaken the properties required to be held. In Bracha's case, we use a weaker (but still good enough) notion of termination:
Excercise: Show an attack that cause the protocol to not terminate for some parties.
Apparently, if a faulty leader doesn't send anything, then there will be never enough (at most ) <Echo, v>
for all honest parties, thus never a single vote from any honest party due to 3A, and thus never more than <Vote, v>
sent and 3B will never be satisfied, and no honest parties will terminate.
However, this is a correct but not super insightful attack, let's see a few more interesting ones, which help us internalize Bracha protocol later.
For simpler presentation, I'd like to assume , and we refer quorum A~C from left to right.
In the attack below, Quorum B will get 2f-1=n-f-1
echo, and Quorum C will only get f
echo, since the faulty Quorum A only echo to Quorum B. Therefore, no honest parties will ever vote (since neither condition 3A nor 3B can be satisfied) –> can't terminate.
Here's an even more revealing attack: Quorum C knows that enough () honest parties heard about <echo, 1>
(since they receive <echo, 1>
, meeting condition 3A), therefore they vote for it. However, we don't have enough of a Quorum C to convince Quorum B to vote. Note that Quorum C didn't receive anything from the leader, thus they won't echo; Quorum B didn't receive enough echo to meet condition 3A and only counting on condition 3B to vote which unfortunately we fall short of.
Another note: another extreme case is Quorum B has honest parties, and Quorum C only has honest party in which case there will only be 1 vote from Quorum C and also a non-terminating situation.
Excercise: Modify Bracha's broadcast primitive to make it a terminating protocol in a synchronous network, and provide security proof.
Given the lower bound from FL82, we can't acheive interactive consistency (a problem reducible from/to BB & BA) that tolerates byzantine parties with fewer than rounds. Therefore, we are inspired to modify Bracha to the following:
Proof:
Here's how I personally internalize Bracha's design:
Let denotes the event that "there are enough () honest parties receive <echo, v>
".
<echo,v>
'" which could prevent attack 2 above.Excercise: If we drop step 3B, which property fail to hold? Show an attack.
From the validity proof of Bracha broadcast, we know Step 3A is enough to ensure validity, thus we are looking for an attack on agreement. Given that Step 3A ensures no-conflicting votes among honest parties, we won't be using "equivocation attack" but rather "dropping message" attack.
Receiving votes in step 4 implies that at least honest parties voted.
By removing step 3B, the only source of honest vote is step 3A, meaning at least honest parties received echo. Problem is what about the other honest parties, can they also reach terminating state by receiving enough votes? From this line of logic, here's the attack:
In the attack above, Quorum B all received enough echos in 3A to vote, and later all receive enough votes to terminate. However, Quorum C can never terminate since they only have votes and echoes and get stuck forever, thus violating agreement.
Excercise: Is the following "optimized" Bracha protocol correct? Show seucrity proof or an attack.
From our summarized interalization above, it's easy to notice that modified step 3 only require echo, which doesn't guarantee non-conflicting votes from honest nodes. Therefore our attack idea is to break agreement by misleading honest parties to conflict each other using equivocating byzantine nodes.