The beacon chain protocol relies on time to coordinate validator activity. A distributed system by definition lacks a global clock, so validator clocks should be synchronized periodically (since on-board clocks are typically not stable enough to keep time over the long term). BFT requirements also mean that any global or centralized time solution is unacceptable, since it introduces a single point of failure.
If the discrepancy between validator clocks becomes large, protocol liveness can degrade (correctness??). In that sense, the validator clock synchronization assumption is critical. Thus, for the beacon chain protocol to be robustly BFT, time synchronization must be BFT too.
NTP is semi-centralized and is not a Byzantine-fault-tolerant protocol. NTP-level attacks are therefore possible and can seriously degrade beacon chain protocol performance.
To harden the beacon chain protocol, a BFT time sync protocol should be designed and implemented, either a novel one or one based on existing solutions. It's important to analyze the time requirements, so that such a protocol can be implemented efficiently.
Besides coordinating validator activities, time is used in other ways in Ethereum 2.0. We define two broad (intersecting) categories of time usage: coordinating distributed activities and associating timestamps with events.
Different goals mean that specific time requirements can differ too. For example, the required precision, accuracy and time scale can differ. Thus, it may be beneficial to design separate protocols or procedures to suit particular needs.
We've identified four sources of time requirements for Ethereum 2.0:
Beacon chain specs rely on clocks to coordinate validator activities. Also, blocks and messages are in some sense timestamped with their associated slots (periods of time).
The network layer also timestamps packets and treats them differently based on that (e.g. it may enqueue an early block).
Validator rewards are granted each epoch (a coarser-grained time period consisting of several slots). So the reward rate is tied to epoch/slot duration, and time keeping should maintain the proper slot duration in order to match cryptoeconomic expectations.
We will ignore smart contracts/EVM for now; however, their requirements are likely to be similar to the cryptoeconomic ones.
The fork choice specs assume validator clocks are synchronized within SECONDS_PER_SLOT. However, clock disparity acts similarly to network delay, reducing the time available to propagate messages. If a validator is slow, there is less time to propagate its messages to others. If it is fast, there is less time to receive prior messages from others, which makes it more likely for the validator to hold a different view of the beacon state (which reduces the probability of reaching consensus).
Therefore, our understanding is that clock discrepancy should be on the order of magnitude of typical network delays or even less, i.e. 100-200ms or so.
As validator activity happens each slot, time syncs should be frequent enough that the clock skew accrued between consecutive syncs stays low.
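As a rough illustration of how sync frequency follows from a disparity budget, the sketch below computes the longest interval between syncs that keeps the accrued skew of a single clock under a given bound. The function name and the example drift figure are assumptions chosen for illustration, not values taken from the specs.

```python
def max_sync_interval_s(skew_budget_ms: float, drift_ppm: float) -> float:
    """Longest time between syncs (in seconds) that keeps accrued skew in budget.

    drift_ppm is the assumed worst-case rate error of the local clock in
    parts per million (1 ppm = 1 microsecond gained or lost per second).
    """
    if drift_ppm <= 0:
        raise ValueError("drift_ppm must be positive")
    drift_s_per_s = drift_ppm * 1e-6
    return (skew_budget_ms / 1000.0) / drift_s_per_s


# Hypothetical figures: a 100 ms budget and a clock drifting by 20 ppm
# (roughly 1.7 seconds per day).
interval = max_sync_interval_s(skew_budget_ms=100, drift_ppm=20)
print(f"sync at least every {interval / 3600:.2f} hours")
```

Under these assumed figures the interval is about 1.4 hours. Two clocks drifting in opposite directions diverge from each other at up to twice the single-clock rate, so a budget on pairwise disparity would halve the interval.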
Network messages are typically timestamped. However, in this document we consider only the aspects that affect the beacon chain protocol. Recent changes in the p2p-interface specs assume a maximum clock disparity of 500ms and mandate that early blocks and attestations (exceeding the 500ms disparity) be delayed.
So, network layer time requirements are quite similar to those in the previous section regarding maximum discrepancy values, periodicity of syncs and BF tolerance. The last is required because a node that is too slow can block or delay timely message propagation, since it will treat on-time messages as early. Thus, node clock synchronization should be tolerant to attacks.
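The effect of a slow clock on early-message handling can be sketched as follows. This is a simplified model, not the actual p2p-interface validation rule: the function and constant names are hypothetical, only the 500ms bound comes from the text, and the 12-second slot is assumed from the 32*12 second epoch used elsewhere in this document.

```python
SECONDS_PER_SLOT = 12          # assumed slot duration (32*12 s epochs)
MAX_CLOCK_DISPARITY_MS = 500   # disparity bound cited from the p2p-interface specs


def classify_by_slot(msg_slot: int, now_ms: int, genesis_ms: int) -> str:
    """Return "process" if the message's slot has started according to the
    local clock (allowing for clock disparity), otherwise "queue"."""
    slot_start_ms = genesis_ms + msg_slot * SECONDS_PER_SLOT * 1000
    if slot_start_ms <= now_ms + MAX_CLOCK_DISPARITY_MS:
        return "process"
    return "queue"


genesis_ms = 0
# A block for slot 10 (which starts at 120 000 ms) seen 300 ms early is tolerated.
print(classify_by_slot(10, now_ms=119_700, genesis_ms=genesis_ms))  # process
# A node whose clock is 2 s slow sees the same on-time block as early.
print(classify_by_slot(10, now_ms=118_000, genesis_ms=genesis_ms))  # queue
```

The second call shows why the synchronization itself needs to resist attacks: an adversary who can slow down a node's clock makes that node delay messages that honest peers sent on time.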
The main requirement is that the reward rate should match the specified amount, i.e. the average epoch duration should be 32*12 seconds, around 82K epochs per year. A deviation of 0.1% in rewards, in either direction, is arguably acceptable.
So, the cryptoeconomic requirement is that the resulting average clock rate after synchronization should not differ from the world time standard by more than 0.1%.
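The arithmetic behind this requirement can be checked directly. With 32 slots of 12 seconds each, an epoch lasts 384 seconds, which gives roughly 82K epochs in a non-leap year. The constant names below mirror the spec style but are written here only for illustration.

```python
SLOTS_PER_EPOCH = 32
SECONDS_PER_SLOT = 12
SECONDS_PER_YEAR = 365 * 24 * 3600  # non-leap year, for a rough estimate
RATE_TOLERANCE = 0.001              # the 0.1% bound discussed above

epoch_s = SLOTS_PER_EPOCH * SECONDS_PER_SLOT
epochs_per_year = SECONDS_PER_YEAR / epoch_s

# Acceptable range for the average epoch duration under the 0.1% bound.
min_epoch_s = epoch_s * (1 - RATE_TOLERANCE)
max_epoch_s = epoch_s * (1 + RATE_TOLERANCE)

print(f"epoch duration: {epoch_s} s")
print(f"epochs per year: {epochs_per_year:.0f}")
print(f"allowed average epoch duration: {min_epoch_s:.3f} to {max_epoch_s:.3f} s")
```

So the 0.1% bound corresponds to the average epoch duration staying within about 0.384 seconds of the nominal 384 seconds.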
From the cryptoeconomic point of view, time sync need not be frequent, as the 0.1% bound is much larger than typical on-board clock instability.
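To get a feel for the margin, the 0.1% bound can be expressed in seconds per day and compared with an on-board clock drift. The drift figure below (5 seconds per day) is an assumption chosen to stand in for "a few seconds per day", not a measured value.

```python
SECONDS_PER_DAY = 86_400
RATE_TOLERANCE = 0.001  # the 0.1% cryptoeconomic bound


def drift_fraction(seconds_per_day: float) -> float:
    """Convert a drift given in seconds per day into a fractional rate error."""
    return seconds_per_day / SECONDS_PER_DAY


# Drift that would exactly exhaust the 0.1% bound.
allowed_s_per_day = RATE_TOLERANCE * SECONDS_PER_DAY

# Assumption: an unsynchronized crystal clock drifting by 5 s/day.
assumed_drift_s_per_day = 5.0
margin = RATE_TOLERANCE / drift_fraction(assumed_drift_s_per_day)

print(f"allowed drift: {allowed_s_per_day:.1f} s/day")
print(f"margin over assumed drift: about {margin:.0f}x")
```

Under this assumption the rate bound leaves a margin of more than an order of magnitude, which is why the cryptoeconomic requirement alone does not force frequent syncs.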
The requirement applies in the BFT context, i.e. an adversary should not be able to affect the rate (assuming limited adversary resources, per the general assumptions). Outside the BFT context, e.g. for debugging, standard synchronization protocols like NTP may be used.
It can also be beneficial, from the cryptoeconomic point of view, to synchronize clock offset in addition to clock rate (frequency). This likely requires the same effort.
NTP is the first-choice protocol for synchronizing clocks. However, in a BFT context it is not that simple, because relying heavily on NTP weakens BFT properties as a consequence of possible NTP-level attacks.
The BFT properties of such a system should at least be thoroughly analyzed and, if needed, hardened.
GPS and radio clock synchronization is performed via radio waves, which are much more difficult for an attacker to control (throughout the world). However, in the end both rely on the same semi-decentralized world time infrastructure as NTP. So the lack of UDP/IP is a plus, but semi-decentralization remains. While this is progress, additional hardware requirements are hardly appropriate for an open network like Ethereum 2.0 (though they may be acceptable for private or consortium networks based on Ethereum technology/infrastructure).
An ideal choice would be atomic clocks on validator nodes, which are synchronized before GENESIS (or when joining the system) and used for time keeping. Extremely stable atomic clocks would make it possible to keep clock disparity low for a long time, with only occasional clock syncs (which may still be needed for recovery or when joining the system). While relatively inexpensive chip-scale atomic clocks exist, the additional hardware requirement is also unacceptable for an open network (as it is an obvious entry barrier).
Most modern computers have a Real Time Clock (RTC) on board, which is a battery-backed crystal oscillator (XO), typically a quartz XO. Even without an RTC, an XO is used to drive the CPU (or other subsystems). So, an RTC or XO can be used for time keeping. An on-board clock (i.e. XO + counter) has one important advantage in the BFT context: it is impossible (or at least extremely difficult) for an adversary to affect it. However, typical XOs and/or RTCs are relatively inaccurate and unstable, losing or gaining up to several seconds per day (link??).
Time is ubiquitous in distributed systems, which by definition (link??) lack a global clock. Distributed processes need ways to coordinate their distributed activity, and time is an extremely useful and convenient tool for that. One can also coordinate activity using message passing, which quickly leads to the concept of a logical clock. Another prominent use of time is assigning timestamps to events.
Distributed processes typically form a plesiochronous system (link??), since they possess clocks with the same nominal rate whose real rates drift for various reasons. This causes clock phase/offset to drift too, which means such a system cannot reliably coordinate its activity and/or timestamp events on the same timescale. Some effort is needed to synchronize clocks.
Reasoning about event ordering in a distributed system often relies on logical clocks and/or the happens-before relation (causal order). Current Ethereum 2.0 specs do not rely on this explicitly; however, slots/epochs form natural sequences, and the fork choice and p2p-interface specs assume queuing of early messages, so a form of causal order is implicitly present.
We will concentrate on physical clocks; however, logical clocks are an important tool, which may be employed for implementation or optimization reasons.
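For readers unfamiliar with logical clocks, a minimal sketch of a Lamport clock is given below. It is a textbook construction shown only to illustrate the concept; it is not part of the Ethereum 2.0 specs, and the class and method names are illustrative.

```python
class LamportClock:
    """Minimal Lamport logical clock: a counter ordered by message passing."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Advance the clock and return the timestamp to attach to a message."""
        return self.tick()

    def receive(self, msg_time: int) -> int:
        """Merge the sender's timestamp, then advance past it."""
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
t1 = a.send()         # a's clock becomes 1
t2 = b.receive(t1)    # b's clock becomes 2, ordered after the send
t3 = b.send()         # b's clock becomes 3
t4 = a.receive(t3)    # a's clock becomes 4
print(t1, t2, t3, t4)  # 1 2 3 4
```

If one event happens before another, its timestamp is smaller, without any reference to physical time. The converse does not hold: a smaller timestamp alone does not imply a causal relation.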
Ethereum 2.0 specs rely on a clock to recognize slot starts and offsets within slots (and the start of beacon chain operation, Genesis), in order to trigger validator activities. Also, the network layer uses a clock to timestamp incoming (outgoing) messages (and to filter/delay them - link??). An accurate and consistent timescale may be required at the application layer (e.g. smart contracts).
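A minimal sketch of how a local clock reading maps to a slot number and an offset within the slot is shown below. The function name and the genesis timestamp are hypothetical, and the 12-second slot is assumed from the 32*12 second epoch used in this document.

```python
SECONDS_PER_SLOT = 12  # assumed slot duration (32*12 s epochs)


def slot_and_offset_ms(now_ms: int, genesis_ms: int) -> tuple[int, int]:
    """Map a local clock reading to (slot number, milliseconds into the slot)."""
    if now_ms < genesis_ms:
        raise ValueError("clock reading is before genesis")
    return divmod(now_ms - genesis_ms, SECONDS_PER_SLOT * 1000)


genesis_ms = 1_600_000_000_000             # hypothetical genesis time (Unix ms)
true_now = genesis_ms + 5 * 12_000 - 100   # 100 ms before slot 5 starts

print(slot_and_offset_ms(true_now, genesis_ms))        # (4, 11900)
print(slot_and_offset_ms(true_now + 200, genesis_ms))  # clock 200 ms fast: (5, 100)
```

The second call shows that a validator whose clock runs 200 ms fast already believes slot 5 has started, and would act on it before its peers do.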
Ethereum 2.0 core messages (blocks and attestations) are "timestamped" with slots (explicitly or implicitly), in the sense that messages should be issued during the appropriate slots (periods of time). Validators receive rewards each epoch (a group of consecutive slots), so the reward rate is determined by slot duration. If slot duration varies relative to the universal time standard (which could be a consequence of clock synchronization efforts, depending on implementation details), it breaks the expectations of system designers, validators and general users.
Overall, time requirements can be split into two aspects:
The first one is needed so that (honest) validators behave in lockstep. If a validator is too slow, its messages will be ignored by others.
And if it is too fast, it will act without information from other validators, which leads to an additional view of the beacon state and thus reduces the chances that validators reach consensus on the state.
While this should not affect the correctness of the beacon chain protocol, it can definitely affect liveness. Moreover, many such out-of-sync validators, together with other factors, may lead to problems with epoch justification and finalization, which would render the whole system useless.
There may be other reasons for validator clocks to be synchronized, e.g. network layer or smart contract needs.
If validator clocks are synchronized, in the sense that the disparity of clock offsets is within prescribed bounds, the resulting clock rate is also more or less the same across validators. However, the resulting slot duration can differ from the prescribed value (SECONDS_PER_SLOT) in absolute terms, i.e. the validators' clock rate can differ from the world time standard.
For example, it may be beneficial for a node to send messages earlier, because there will be more time for them to reach other nodes. At the same time, the risk of missing the latest attestations is relatively small. Depending on protocol details, other nodes may decide that they are late relative to such an early sender, especially if there are many such early senders. This can cause slot duration to shrink. And the shrinkage is itself beneficial to validators, since they will receive rewards more often.
NB This means uncoordinated rational majority assumptions may not be enough.
Overall, cryptoeconomics requires that the clock synchronization protocol have global clock syntonization properties, i.e. after validator clock synchronization, the resulting rate should stay within prescribed bounds of the world time standard.
There may be other requirements relative to the world time clock. E.g. around GENESIS, clock synchronization (i.e. phase/offset synchronization) is required, and it can be desirable afterwards as well.
We separate the different needs for clock synchronization because they require different accuracy/precision and are relevant on different time scales. That means they can possibly be implemented using different means and/or different protocols.
In non-byzantine context, it would not be a problem, since there is a default and straightforward way to implement time
Validator clocks can be viewed as reference (world, global) clock estimators.
Then the requirements should specify:
precision, i.e. discrepancy between validator clocks
accuracy or trueness, i.e. bias of the resulting clock rate (after possible clock synchronization) relative to the reference clock
accuracy/trueness relative to the reference clock; required around GENESIS and definitely beneficial after, however, it is not clear whether it must be kept very accurate in a BFT context
reference clock, global clock, world time standard: assumed to be UTC
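One possible way to make the distinction between precision and accuracy/trueness concrete is sketched below. It treats each validator clock as an estimate of the reference clock and measures the spread between validators and their common bias separately. The function names and the offset values are hypothetical, and max-minus-min and the mean are only one choice of statistics.

```python
from statistics import mean


def precision_ms(offsets_ms: list[float]) -> float:
    """Largest pairwise discrepancy between validator clocks."""
    return max(offsets_ms) - min(offsets_ms)


def bias_ms(offsets_ms: list[float]) -> float:
    """Mean offset of the validator clocks from the reference clock."""
    return mean(offsets_ms)


# Hypothetical offsets of five validator clocks from the reference clock, in ms.
spread_out = [-40, 10, 25, 60, 95]
print(precision_ms(spread_out), bias_ms(spread_out))  # 135 30

# Clocks can agree closely with each other and still be far from the reference.
shifted = [990, 1000, 1005, 1010, 995]
print(precision_ms(shifted), bias_ms(shifted))  # 20 1000
```

The second set is what a synchronization protocol without a trueness guarantee could produce: validators stay in lockstep with each other while drifting together away from the reference clock.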