
Initial investigation into local network latency and capturing live reorgs

Since the last update, I have continued monitoring metrics related to the performance of the Eth network. Some of the key metrics I am tracking are the latency of blocks and attestations, with the aim of providing further insight into how delays in these processes can impact the overall network, causing partitions and reorgs. As an initial step, I set up a Lighthouse node and streamed and recorded data for over 11,600 slots from the gossip network. To skip straight to the performance summary for this investigation, click here.

In the majority of cases, the data points for reorged blocks and late attestations are not available, as they get erased from the node's history entirely. As a result, large platforms such as beaconcha.in also do not keep track of these data points, making it challenging to conduct holistic assessments of network partitions and reorgs, or to monitor real-time network performance.

It is important to note that the CL data collected for this investigation is real-time data gathered from the gossip network. Hence, parts of the data may not be identical to the data on the permanent chain. For example, some attestations may never have made it onto the permanent chain because they were poorly aggregated, published too late, or propagated slowly.

The scope of the project is to add more geographically distributed nodes and track the metrics using the ELK stack I have running. This is to enable real-time monitoring and querying of network performance.

How blocks and attestations are released

In Eth PoS, a validator is pseudo-randomly selected to build a block every slot (12s). For every epoch (32 slots), validators are split into random committees and are incentivised to vote on what they deem to be the head of the chain, the previously justified block and the previously finalized block (a quick summary of Eth PoS and the fork choice algorithm Gasper can be found in the previous update). Since incorrect attestations, late blocks and late attestations lead to penalties and missed rewards, validators are advised to stick to the spec when fulfilling their duties.

The recommended steps involved in releasing a block and attestation are broken down as follows:


  1. The block proposer at the start of every slot is advised to build a block and broadcast it within 4s of the start of the slot.
  2. At 4s into the slot, validators should cast their vote on which block they see as the head of the chain (regardless of whether they have received the latest block) and publish their attestation to the other validators within their committee.
  3. At 8s into the slot, a subset of validators from each committee aggregate the attestation votes and publish them to the entire network.
  4. At 12s a new slot begins and the new proposer is selected to build a block, including the best and most recent aggregates they have seen in gossip, plus any unaggregated attestations they happened to see due to being subscribed to subnets. This process is called packing and is an NP-hard problem (the packing algorithm differs per client, as each tries to maximise the number of attestations included in every proposed block given the limited block space).
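To make the schedule above concrete, below is a minimal sketch (not part of the original tooling) that converts the intra-slot deadlines into wall-clock times for any slot, using the mainnet beacon chain genesis timestamp.

```python
# Minimal sketch of the intra-slot schedule described above: given the mainnet
# beacon chain genesis timestamp, compute the UTC times by which the block
# should be broadcast, attestations cast, and aggregates published for a slot.
from datetime import datetime, timezone

GENESIS_TIMESTAMP = 1606824023   # mainnet beacon chain genesis (1 Dec 2020)
SECONDS_PER_SLOT = 12

def slot_schedule(slot: int) -> dict:
    start = GENESIS_TIMESTAMP + SECONDS_PER_SLOT * slot
    as_utc = lambda ts: datetime.fromtimestamp(ts, tz=timezone.utc)
    return {
        "slot_start": as_utc(start),                # proposer builds and broadcasts the block
        "attestation_deadline": as_utc(start + 4),  # committee members cast their head votes
        "aggregate_deadline": as_utc(start + 8),    # aggregators publish aggregates to the network
        "next_slot": as_utc(start + 12),            # next proposer packs the best aggregates it has seen
    }

print(slot_schedule(5399750))
```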

It is important to note that not all attestations in gossip get included. This can be due to several factors including:

  • Suboptimal attestation aggregation by the aggregators. For example, there might be a high overlap between the different aggregates released by committee aggregators, which leads the proposer to leave out individual attestations due to the limited block space (the proposer can usually only include the 128 best and most recent aggregated attestations they can see).
  • Late release of attestations means that the proposer does not see the attestation in time, which usually leads to an inclusion delay beyond one slot. To incentivise validators to release attestations on time, the reward is reduced for every slot of inclusion delay. If the attestation is not included within 32 slots, it gets dropped by the node and can never be included.
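To illustrate the packing trade-off described above, here is a hedged sketch of a simple greedy approach: repeatedly pick the aggregate that adds the most not-yet-covered votes until the 128-attestation budget is exhausted. Clients use their own, more sophisticated packing algorithms; the bitfields are modelled as plain Python sets purely for illustration.

```python
# Greedy packing sketch: choose up to MAX_ATTESTATIONS aggregates, always taking
# the one that covers the most validators not already covered. Heavily
# overlapping aggregates add little new coverage and therefore get skipped.
MAX_ATTESTATIONS = 128

def greedy_pack(aggregates: list[set[int]], limit: int = MAX_ATTESTATIONS) -> list[set[int]]:
    covered: set[int] = set()
    packed: list[set[int]] = []
    remaining = list(aggregates)
    while remaining and len(packed) < limit:
        best = max(remaining, key=lambda agg: len(agg - covered))
        if not (best - covered):   # nothing new to add, stop packing
            break
        packed.append(best)
        covered |= best
        remaining.remove(best)
    return packed

# two heavily overlapping aggregates: only one of them materially adds coverage
print(greedy_pack([{1, 2, 3, 4}, {2, 3, 4, 5}, {10, 11}], limit=2))
```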

Measuring local block latency

Time interval between blocks

As outlined before, I ran a Lighthouse node and used the standardized Eth Beacon API to listen to live blocks and attestations being propagated through the gossip network. For this initial investigation, I collected data for 11,603 slots with the intention of analysing network latency as well as capturing live reorgs and network partitions. To measure the local block latency, I tracked the time delay between consecutive blocks to spot any deviations from the spec time of 12s.
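As a rough, hedged sketch of how such a recording can be done with the standard Beacon API, the node's server-sent events endpoint (/eth/v1/events) can be subscribed to for the block and attestation topics, timestamping each event locally on arrival. The port and field handling below are illustrative and not necessarily identical to my recording setup.

```python
# Hedged sketch: subscribe to the Beacon API event stream and attach a local
# arrival timestamp to every block/attestation event as it is received.
import json
import time
import requests

BN_API = "http://localhost:5052"   # assumed beacon node HTTP API port

def stream_events(topics=("block", "attestation")):
    url = f"{BN_API}/eth/v1/events?topics={','.join(topics)}"
    with requests.get(url, stream=True, headers={"Accept": "text/event-stream"}) as resp:
        resp.raise_for_status()
        event_type = None
        for line in resp.iter_lines(decode_unicode=True):
            if line.startswith("event:"):
                event_type = line.split(":", 1)[1].strip()
            elif line.startswith("data:"):
                payload = json.loads(line.split(":", 1)[1])
                # the local timestamp is what the latency measurements below rely on
                yield {"type": event_type, "arrival_time": time.time(), "data": payload}

# example: record the first few events to a list for later analysis
# events = [e for _, e in zip(range(10), stream_events())]
```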

As we can see from the scatter plot, the time delay between blocks is largely clustered around 12 seconds. The histogram below captures this more clearly, as the vast majority of the data falls in the 11.9-11.95s range, which is highly congruent with the specification that a block should be produced every 12s.

Looking at the scatter plot, we can also see that a considerable number of blocks arrive beyond the 12s mark. Observed more closely, they seem to follow a pattern, aligning at the 24s mark. After some checks, most of these delays are caused by a skipped slot, including the 40s-delayed block (which was preceded by two skipped slots). In the 11,603 slots that were monitored, 48 skipped slots were observed; they are illustrated by the dark grey lines in the scatter plot, indicating that the proposer was either offline or failed to build and propagate the block to the network.
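For reference, a hedged sketch of how the interval and skipped-slot readings above can be derived from the recorded stream; `blocks_data` is assumed to hold one record per received block with its slot number and local arrival timestamp (field names are illustrative).

```python
# Compute the arrival interval between consecutive received blocks and flag
# any slots that were skipped in between (no block ever arrived for them).
def block_intervals(blocks_data: list[dict]) -> list[dict]:
    ordered = sorted(blocks_data, key=lambda b: b["slot"])
    readings = []
    for prev, curr in zip(ordered, ordered[1:]):
        readings.append({
            "slot": curr["slot"],
            "interval_s": curr["arrival_time"] - prev["arrival_time"],
            "skipped_slots": curr["slot"] - prev["slot"] - 1,   # 0 when no slot was missed
        })
    return readings
```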

Interval between local block and scheduled slot time

To get a better sense of when block proposers publish blocks to the network, we compare the block arrival time against Ethereum's specified slot schedule. Since slot times are pre-determined in PoS Eth, we can check the difference between the block arrival time and the scheduled start time for a given slot number. This provides an estimate of how far into the slot block proposers are releasing their blocks.

```python
import time
import datetime
from datetime import date, timedelta

# Beacon chain genesis timestamp (1 Dec 2020, 12:00:23 UTC)
Beacon_Chain_Gen_Timestamp = 1606824023

def get_slot_datetime(slot):
    # each slot starts exactly 12s after the previous one
    return datetime.datetime.utcfromtimestamp(Beacon_Chain_Gen_Timestamp + 12 * slot)
```
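Using the same genesis constant, the delay into the slot can then be estimated by subtracting the scheduled slot start from the locally recorded arrival timestamp (a hedged sketch; the 'arrival_time' field is illustrative):

```python
# Estimate how far into its slot each recorded block arrived.
def seconds_into_slot(slot: int, arrival_time: float) -> float:
    scheduled_start = Beacon_Chain_Gen_Timestamp + 12 * slot   # unix timestamp of the slot start
    return arrival_time - scheduled_start

# e.g. delays = [seconds_into_slot(b["slot"], b["arrival_time"]) for b in blocks_data]
```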

As we can see from the histogram, approximately 90% of all blocks arrive within 4.8s into the slot, suggesting that the vast majority of block proposers are indeed following the recommended specification. The steep drop after 5s can be explained by validators beginning to cast their vote for the head of the chain 4s into the slot. Hence, a block released beyond the 5s mark risks not receiving any attestations, as some validators may simply attest to the previous block if it is delayed further than 4-5s. In addition, the next block proposer can use proposer LMD score boosting to reorg out the preceding block, given that it has accumulated fewer votes than the boosted block (more details about the implementation can be found here also). For these reasons, proposers are highly incentivised to publish their block within the recommended 4-5s.

Moreover, looking at the overall shape of the histogram, the sharp spike for blocks that arrive at 3.5-4.5s is quite noticeable. This may be due to block proposers being signed up to MEV relays, which may take longer to build a block (since the execution payload is built separately). In addition, every second spent listening to the mempool when building a block is highly valuable, as the MEV reward can equal the block reward and at times can be orders of magnitude larger. Flashbots finds that, on average, each additional second of waiting is worth 0.034 ETH in additional miner payment. This can indeed lead to the artificial spike we see in the chart, as proposers and builders may try to maximise listening time in the mempool to increase their total rewards (some of these concerns were mentioned in the following issue).

Measuring attestation latency

As previously outlined, validators are incentivised to release their attestation as early as 4s into the slot to ensure that committee aggregators aggregate and publish it in time. This increases the chance of the validator's attestation being included in the next slot, avoiding any further loss in profits due to inclusion delay.

In the chart below we compare the median attestation arrival time to the block arrival time. This gives an estimate of how long aggregators wait after the proposer has released the block to the network.

Looking at the plot, the median attestation arrives 6.9s after the block, and around 98% of the attestations received arrive within the 8s mark. The attestations situated at zero or below refer to attestations that arrived before the block did. These correspond to late blocks submitted to the network, where validators can be observed attesting before receiving the block. Such blocks will likely have very low vote counts, since most validators would have voted for the preceding block as the head of the chain.
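A hedged sketch of the measurement itself: for each slot, take the difference between every recorded attestation arrival and the block arrival, then the per-slot median (negative values correspond to attestations that arrived before the block). Field names are illustrative.

```python
from collections import defaultdict
from statistics import median

def median_attestation_delay(blocks_data: list[dict], attestations_data: list[dict]) -> dict:
    block_arrival = {b["slot"]: b["arrival_time"] for b in blocks_data}
    per_slot = defaultdict(list)
    for att in attestations_data:
        slot = att["slot"]
        if slot in block_arrival:
            # negative delay => attestation seen before the block arrived
            per_slot[slot].append(att["arrival_time"] - block_arrival[slot])
    return {slot: median(delays) for slot, delays in per_slot.items()}
```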

Attestation votes

To gain an overview of how the network is behaving in real time, we look at the attestation votes and observe the disagreements between the block proposer and the validators. This can be used as a proxy for the validators' current view of the network, giving insight into partitions. As disagreements approach 50%, the network is effectively split in two for the given slot, which increases the likelihood of a reorg when the next proposer builds the next block. A small sketch of the disagreement metric is shown below.
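A minimal sketch of the disagreement metric, assuming each recorded attestation carries the beacon_block_root it voted for (an illustrative structure, not the exact recording format):

```python
def disagreement_rate(attestations_for_slot: list[dict], proposed_block_root: str) -> float:
    """Percentage of attestations in a slot whose head vote differs from the proposed block."""
    if not attestations_for_slot:
        return 0.0
    disagree = sum(1 for att in attestations_for_slot
                   if att["beacon_block_root"] != proposed_block_root)
    return 100.0 * disagree / len(attestations_for_slot)
```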

In the 11,603 slots there are more instances where validators disagreed with the block proposer entirely than instances of partial disagreement between 20-80%. The histogram illustrates this well, as the disagreement counts mostly lie at either ~0% or ~100%. This highlights a strong consensus in identifying missed slots or delayed blocks in the network, which is a very healthy signal. It is when the attestations are partially split in the 20-80% region that the likelihood of reorgs increases, which may lead to missed rewards and a reduction in the quality of service of the network.

Capturing Reorgs

As outlined in the previous update, when the network experiences higher latency the likelihood of reorgs increases. In these instances, validators located close to the proposer may vote in favour of the globally late (but locally on-time) block, whilst validators and proposers further away may attest to the previous block, causing a partition and eventually a reorg. To ensure safety and liveness, the fork choice rule LMD GHOST is applied to determine the head of the chain and maintain consensus in the midst of a fork in the chain.
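For intuition, here is a very simplified, hedged sketch of the GHOST part of the rule: starting from the justified root, repeatedly descend into the child whose subtree carries the most latest-message attestation weight (real clients also apply justification filtering, proposer boost and effective-balance weighting).

```python
# Toy LMD-GHOST head selection over a block tree given per-block vote weights.
def subtree_weight(block: str, children: dict[str, list[str]], weight: dict[str, int]) -> int:
    return weight.get(block, 0) + sum(subtree_weight(c, children, weight)
                                      for c in children.get(block, []))

def ghost_head(root: str, children: dict[str, list[str]], weight: dict[str, int]) -> str:
    head = root
    while children.get(head):
        head = max(children[head], key=lambda c: subtree_weight(c, children, weight))
    return head

# two competing children of `root`: the heavier subtree wins the head
print(ghost_head("root",
                 children={"root": ["a", "b"], "a": [], "b": ["c"], "c": []},
                 weight={"a": 100, "b": 60, "c": 70}))   # -> "c"
```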

At the moment, the network experiences around 5 reorgs per 24 hours, so reorgs occur quite frequently. However, since all data corresponding to reorged blocks gets deleted from the node's history, we have to actively monitor and record blocks in real time through the gossip network in order to analyse them later.

Reorgs are identified by cross-comparing the blocks received through the gossip network with the blocks on the canonical chain. To fetch the blocks that are part of the canonical chain, we simply walk back iteratively, following each block's parent_root starting from the head of the canonical chain. Any block received from the gossip network that does not match a block on the canonical chain must therefore have been reorged. A simple walkthrough of the code is displayed below:

```python
import json
import requests

# beacon node HTTP API endpoint
bn_api_url = 'http://localhost:8545'

# get block header info given the block root
def get_block_info(block_hash: str, bn_api_url: str) -> dict:
    resp = requests.get(f"{bn_api_url}/eth/v1/beacon/headers/{block_hash}")
    resp.raise_for_status()
    return resp.json()

# walk back from the head of the canonical chain, following each block's parent_root
canonical_blocks = []
block_info = get_block_info(last_block_on_chain, bn_api_url)
block_info_hash = block_info["data"]["header"]["message"]["parent_root"]
canonical_blocks.append(block_info_hash)

for i in range(min_slot, last_slot):
    block_info = get_block_info(canonical_blocks[-1], bn_api_url)
    block_parent_root = block_info["data"]["header"]["message"]["parent_root"]
    canonical_blocks.append(block_parent_root)
    with open('canonical_blocks_f.txt', 'a') as f:
        f.write(json.dumps(block_parent_root) + ",")

# wrap the comma-separated roots in brackets so the file parses as a JSON list
with open('canonical_blocks_f.txt', 'r') as file:
    data = file.read()[:-1]
with open('canonical_blocks_f.txt', 'w') as file:
    file.write('[')
    file.write(data)
    file.write(']')

# blocks_data refers to the blocks that were captured in real time from the gossip network
streamed_block_hashes = []
for i in blocks_data:
    streamed_block_hashes.append(i['block'])

# any streamed block that is not part of the canonical chain must have been reorged
forked_blocks = [block for block in streamed_block_hashes if block not in canonical_blocks]
# drop the most recent streamed blocks, which fall outside the fetched canonical range
forked_blocks = forked_blocks[:-2]
```

The reorged blocks

In the 11,603 slots that were monitored, two reorgs were observed, occurring fairly close to one another. The first reorg took place at slot 5399750, followed by the second at 5399938.

From the block latency scatter plot, the first reorged block seems broadly in line with the rest of the canonical blocks, since it arrives with a 14s delay, which suggests that there may be causal factors other than latency that triggered the reorg. The second reorg, however, shows a significant deviation in block latency compared to its neighbouring blocks.

To get a more in-depth view of how the network was behaving in these particular regions we will zoom in and analyze each reorg in more detail in the following sections.

The first reorg

Looking closely at the first reorg, captured at slot 5399750, the latency between blocks is 14s. Additionally, the block seems to have been released 6.8s into the slot when compared to the scheduled slot time. The additional ~2 seconds of delay may well have contributed to the block getting reorged; however, it is important to note that there are numerous blocks with similar metrics which are still part of the canonical chain. This suggests that there may be other factors that led to the reorg, which we will look into further.

When we query the beacon node for the attestation data for slot 5399750, it returns {"code":404,"message":"NOT_FOUND: beacon block at slot 5399750","stacktraces":[]}. This is because the attestation data is also deleted once the block gets reorged. Consequently, for this investigation we will use the attestation data recorded from the gossip network while the block was being propagated in real time. The attestation readings for slot 5399750 can be observed below:

The above chart shows the attestations published during slot 5399750. The orange lines indicate the scheduled slot boundaries, whilst the green line indicates when the block for slot 5399750 was received. All of the attestations arrive after the block, suggesting that all validators had indeed seen the block before attesting. In addition, the vast majority of the attestations are received within the allocated slot time. From a high-level overview, there do not seem to be any significant variations or anomalies in the latency of the blocks and attestations for slot 5399750.

However, this changes when we look at the attestation votes, as large weaknesses in the block proposed in slot 5399750 become apparent.

From the data that was captured, all of the attestations received from the committee validators are in disagreement with the designated proposer for slot 5399750. This becomes much clearer when we observe the beacon_block_root of the reorged block: the proposer tries to build on top of the block proposed in slot 5399744, effectively attempting to reorg 6 blocks in a row.

Observing the table above, this resulted in a total of 9433 votes from 58 different committees disagreeing with the view of the proposer and declaring the block from the previous slot (5399749) to be the head of the chain. The following block proposer, at slot 5399751, builds on top of 5399749 and obtains 9872 votes, with only 182 opposing this view. Under the fork choice rule LMD GHOST, the block proposed at slot 5399750 effectively gets reorged and the block proposed at slot 5399751 becomes the head of the canonical chain due to being much heavier. At slot 5399752 the proposer builds on top of slot 5399751, and all validators seem to be in agreement once again about the head of the chain, as there are no disagreements. A simplified illustration of the process can be viewed below:


The second reorg

The second reorg seems quite different from the first, as the block proposed in slot 5399938 arrives after more than double the spec allocated slot time. The first intuition is to check whether the previous slot was skipped; however, this is not the case, as the block proposed in slot 5399937 is a valid block and part of the canonical chain. Hence, we can conclude that the block did experience a significant delay, which could well have contributed to the reorg, as we will see below.

From the attestation chart, the block arrives 14s after slot 5399938 begins. This is captured by the red line in the chart, as it essentially arrives 2 seconds into the next slot. We also observe that the vast majority of the attestations arrive within the allocated slot time and well before the block does. This highlights that the majority of validators had cast their vote before receiving the delayed block, which effectively makes the late block very vulnerable to a reorg.

The vote split for the slot is broken down in the table: in slot 5399938, a total of 62 committees are in agreement and vote for the block proposed in slot 5399937 as the head of the chain. At this point, a total of 11182 votes are recorded for the block proposed at slot 5399937, compared to 0 for the current slot 5399938. This leads the next block proposer, in slot 5399939, to build on top of the heaviest block from 5399937, completing the reorg. From the table we can see that there is broad consensus with this view of the chain, as slot 5399939 receives 13582 votes with only 12 votes opposing. A simplified illustration of the process can be viewed below:


Execution layer data

Although we were able to get a more thorough understanding of how each reorg unfolded on the consensus layer, the fact that they occurred very close to one another within the 11,603 slots we monitored seems very peculiar. For this reason, we will investigate data from the execution layer and dive deeper to find any underlying drivers which may have contributed to the reorgs.

Some of the key metrics that were tracked are shown in the table below:

MEV

MEV refers to the profit proposers and block builders can earn by manipulating and reordering transactions (or by censoring certain transactions). By analyzing the MEV reward per block, we can determine if any large MEV opportunities were captured in the vicinity where the reorgs occurred.

Looking at the plot, we can observe that it is fairly common for the MEV reward to equal the block reward. The histogram shows that the MEV captured for the median block makes up around 50% of the total reward; however, MEV rapidly decreases beyond this point. The distribution's long tail highlights that MEV rewards are not evenly distributed among blocks but rather concentrated in a small number of blocks. It is in these areas that the network can become congested and lead to partitions, as validators may attempt to reorg blocks if the MEV reward is significantly large.

Interestingly, the succeeding blocks that reorged out the faulty blocks have relatively low MEV per block compared to their peers. The block proposed at slot 5399751 has an MEV reward equal to the block reward, and the block proposed at slot 5399939 has an MEV reward of 0. This suggests that MEV was not the primary driver of the reorgs, since the largest MEV captured in the region occurred in slot 5399566, 185 slots earlier (nearly 40 minutes before the initial reorg). This implies that other factors, such as the computational complexity of the transactions and smart contract execution, could be the main drivers for the reorgs.

Gas

The amount of gas used per block provides insight into the computational complexity of the blocks built by proposers. In Eth PoS the maximum gas limit per block is fixed at 30,000,000. The full gas limit is only utilised when there is exceptionally high demand to include a series of transactions or to make changes to large smart contracts. This causes the base fee to rise, which can incentivise proposers to include a higher number of transactions per block and more complex transactions than usual.
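As a hedged sketch (not the exact collection code used here), the gas, base fee and transaction-count readings in this and the following sections can be pulled from an execution client over standard JSON-RPC:

```python
import requests

EL_RPC = "http://localhost:8545"   # assumed execution client JSON-RPC endpoint

def get_el_block_metrics(block_number: int) -> dict:
    payload = {"jsonrpc": "2.0", "id": 1,
               "method": "eth_getBlockByNumber",
               "params": [hex(block_number), False]}   # False -> return tx hashes only
    block = requests.post(EL_RPC, json=payload).json()["result"]
    return {
        "gas_used": int(block["gasUsed"], 16),
        "gas_limit": int(block["gasLimit"], 16),
        "gas_used_pct": 100 * int(block["gasUsed"], 16) / int(block["gasLimit"], 16),
        "base_fee_gwei": int(block["baseFeePerGas"], 16) / 1e9,
        "tx_count": len(block["transactions"]),
    }
```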

The plots show that the majority of blocks use around ~50% of the maximum available gas limit (30,000,000), which is expected given that the base fee mechanism targets 15,000,000 gas per block. However, towards the end of the histogram there is a high accumulation of blocks that use 100% of the allocated gas limit. As mentioned earlier, this is due to proposers utilising the maximum allocated gas limit during periods of high demand, which can be highly profitable events for them. In these periods the network may also get congested, as there will be a high number of pending transactions competing to be included in the next block.

By exploring the blocks that were successfully added to the canonical chain, it can be observed that the proposers at slots 5399751 and 5399939 both used an unusually high amount of gas. Specifically, the proposer at slot 5399751 used 87% of the total gas available, while the proposer at slot 5399939 used 67%. This suggests that these proposers were processing transactions and executing smart contracts that required a greater amount of computational resources, indicating that demand for the network's resources may have been elevated, as we will see in the following section.

Base fee

The base fee is put in place to protect the network from being flooded with large numbers of low-cost transactions, which could clog the network and make it difficult for legitimate transactions to be processed. Users of the network pay this fee for that protection; the base fee itself is burned, while the priority fee paid on top incentivises validators to run the resources required to process the very transactions being sent. The base fee is dynamically adjusted based on users' demand for the underlying network resources to process transactions and run smart contracts.
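The adjustment rule itself is mechanical: each block's base fee moves by at most 1/8 depending on how far the parent block's gas usage deviated from the 15M target. A sketch of the EIP-1559 update, using integer arithmetic as in the spec:

```python
GAS_LIMIT = 30_000_000
GAS_TARGET = GAS_LIMIT // 2                    # elasticity multiplier of 2
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8

def next_base_fee(parent_base_fee: int, parent_gas_used: int) -> int:
    if parent_gas_used == GAS_TARGET:
        return parent_base_fee
    delta = abs(parent_gas_used - GAS_TARGET)
    change = parent_base_fee * delta // GAS_TARGET // BASE_FEE_MAX_CHANGE_DENOMINATOR
    if parent_gas_used > GAS_TARGET:
        return parent_base_fee + max(change, 1)   # full blocks push the fee up (at most +12.5%)
    return parent_base_fee - change               # underfull blocks pull it down
```

Since a run of near-full blocks can compound at up to ~12.5% per block, a roughly 10x rise over the ~40 slots discussed below is well within what sustained heavy utilisation can produce.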

When we look at the readings, it can be observed that the base fee exhibits numerous periods of high volatility. These regions indicate areas of high competition for block space, where users try to outbid each other to ensure that their transactions get included in the following block. This can lead to a build-up of pending transactions in the mempool, as users may continue to create new bids whilst also broadcasting their transactions to as many nodes in the network as possible, congesting the network and causing the base fee to rise sharply.

The most significant rise in the base fee occurs between slots 5399689 and 5399729, where it increases by nearly 10x, going from 10.97 gwei to 93.65 gwei within 8 minutes. Shortly after, in slot 5399750, the first reorg takes place, suggesting that the increased competition for block space may well have introduced extra load on the network and contributed to the reorg.

Tx count per block

When we observe the transaction count per block, we see a similar pattern to the base fee plots. Whilst the average block across the 11,603 slots observed includes around 100 transactions, in the regions of high demand this shoots up by roughly 10x, reaching 986 transactions per block. Block proposers seem to have included as many transactions as possible to maximise their profitability whilst at the same time addressing the high traffic demand to avoid further congestion of the network.

This also means that the blocks produced in these high-demand regions are much heavier than the average block. This can increase latency in the network, as heavier blocks propagate more slowly. Combined with the increasing number of pending transactions being broadcast across the network, this may introduce further latency, potentially causing certain validators with limited resources to fall behind momentarily.

This seems to be one of the underlying drivers of the first reorg, as the proposer in slot 5399750 attempts to reorg six blocks that had already obtained a very high number of attestations, which is very unusual behaviour (since it will inevitably lead to their own block being reorged). It is highly likely that the validator was not up to date with the canonical chain and fell behind momentarily, leading to an incorrect beacon_block_root being used. Similarly, whilst demand for network resources remained high, the second reorg takes place at slot 5399938 due to the proposer failing to propagate the block across the network in time. The block arrives 14s after the scheduled slot time, leaving no window for attesters to attest to it.

Performance summary

CL
  • When comparing the block arrival local time to the scheduled slot time, over 90% of the blocks are released within 5s into the slot, suggesting the majority of designated block proposers are following the spec (see chart).
  • In the 11,603 slots monitored, 48 skipped slots were observed (see chart).
  • 85% of block-to-block intervals are within 12.9s (excluding the occasional skipped slots, which skew the data; see chart).
  • The time interval between block arrival and the median attestation arrival is 7.2s (see chart).
  • The majority of attestation votes for every slot have a very low disagreement rate. Approximately 90% of the votes across the 11,603 slots analysed have disagreements below 0.5%, which highlights that large network partitions (near 50%) are quite rare, although they do occur (see charts).
EL
  • The network seems to have been under extra load between slots 5399689 and 5399729 due to exceptionally high demand for network resources, which led the base fee to rise by nearly 10x (see charts).
  • The tx count per block rose in line with the base fee, as each block contained around 500-900 transactions (up from ~100 tx per block; see charts).
  • The competition for block space increased the number of pending transactions in the network, which introduced some latency as they continued to propagate throughout the network.
Reorgs
  • During the period of high demand, two reorgs were captured, at slots 5399750 and 5399938 (see chart).
  • The initial reorg seems to have been caused by the validator falling behind the canonical chain as they effectively tried to reorg 6 blocks that had already received a large number of attestations (see section).
  • The second reorg was caused by the proposer failing to propagate the block in time as it arrives 14s into the slot (see section).
  • In both reorg instances there was 100% agreement amongst the committee validators when determining the head of the chain, and in both cases they voted against the view of the proposer. This suggests a strong consensus in identifying missed slots or delayed blocks in the network, which is a very healthy signal (see chart).

Next steps

  • Set up the infrastructure to monitor the beacon nodes' (Lighthouse, Prysm, Nimbus and more) OS and traffic data and integrate it with the ELK stack I have running.
  • Continue to track more metrics. Some of the metrics to include are EL data points such as block transactions and the size of the pending transactions in the mempool, while continuing to add more of the CL Beacon API metrics.
  • Detect anomalies more efficiently by normalizing the bulk of the data and feeding it to an unsupervised classification model.