Purging BeaconState ever-growing Lists

In our approach to providing the Validator withdrawal feature, we use a BeaconState List to store withdrawal receipts from processed requests and automatic withdrawal tasks. This method assumes the data grows over time and nothing limits its size. The obvious downside is the growing size of BeaconState: every non-light client needs to keep old Withdrawals in state. We propose to purge old Withdrawals according to client settings, so some clients could store all historical Withdrawals while others could store only the latest few thousand. A similar approach could be applied not only to Withdrawals but also to other growing parts of the state.

Accumulators

Accumulators are an already proposed solution for growing state data. They work in the following manner: there is a circular buffer for items, holding, say, 16384 entries, and a list of old buffer roots. This list still grows with time, but 16384 times slower in this example.
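The accumulator scheme described above can be sketched in a few lines. This is only an illustration: the class name `Accumulator`, the tiny buffer size, and the plain SHA-256 binary tree are assumptions made for the example, not the layout used by the beacon chain specification.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Root of a binary Merkle tree over a power-of-two number of 32-byte leaves."""
    layer = list(leaves)
    while len(layer) > 1:
        layer = [sha256(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


class Accumulator:
    """Hypothetical accumulator: a circular buffer plus a list of old buffer roots."""

    def __init__(self, buffer_size: int):
        # Power-of-two size keeps the Merkle tree perfectly balanced.
        assert buffer_size > 0 and buffer_size & (buffer_size - 1) == 0
        self.buffer_size = buffer_size
        self.buffer = [bytes(32)] * buffer_size
        self.roots = []  # grows buffer_size times slower than the item count
        self.count = 0

    def append(self, item: bytes) -> None:
        # Overwrite the oldest slot with the hash of the new item.
        self.buffer[self.count % self.buffer_size] = sha256(item)
        self.count += 1
        # Each time the buffer fills up, record its root.
        if self.count % self.buffer_size == 0:
            self.roots.append(merkle_root(self.buffer))


acc = Accumulator(buffer_size=4)  # 16384 in the example above
for i in range(10):
    acc.append(i.to_bytes(8, "little"))
```

After ten appends with a buffer of four, two roots have been recorded, and the items from the first buffer have been overwritten; anyone who still needs them has to obtain them elsewhere.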

We see several downsides in accumulators that we want to solve with the purging approach:

  • Old items are still needed by some clients. There should be a mechanism to sync them, and a mechanism for clients to request an old item that is further back than the buffer size.
  • Buffer size is part of the specification. Every client must follow it and cannot have its own custom history settings.
  • Accumulators still grow with time. This may not be an issue at launch, but with growing usage it could become one too.

Purging

Our approach relies on per-client settings for the size of stored list data. Old items are collapsed to the minimal part of the tree structure required to construct proofs for all non-purged items. So some clients could keep all historical items, while others could switch to a small list of, say, the 16384 latest items, and keep the size of this chunk of BeaconState within a few MB. There are several challenges in purging:

  • an efficient implementation of a List-like structure with an anticipated fallback when an item is not found
  • a sync mechanism for BeaconState that correctly handles different numbers of real items in the data of different nodes and can sync old items from the nodes that have them
  • RPC clients should be able to tell clearly whether a requested object with proof is unavailable on the current client or never existed at all (???)

Let's examine these challenges in detail:

Implementation

We have implemented PurgeableList in Python, based on the remerkleable library, which is used for typing in the Ethereum 2 specification.
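To illustrate the idea of collapsing purged items while keeping proofs for the rest, here is a minimal standalone sketch. It is not the PurgeableList implementation and does not use the remerkleable API; the class `PurgeableTree`, its methods, and the fixed-depth SHA-256 tree (without the length mix-in of a real SSZ List) are assumptions made for the example.

```python
import hashlib


def h(a: bytes, b: bytes) -> bytes:
    return hashlib.sha256(a + b).digest()


class PurgeableTree:
    """Hypothetical fixed-depth Merkle tree whose oldest leaves can be purged."""

    def __init__(self, depth: int, items):
        self.depth = depth
        self.items = dict(enumerate(items))  # position -> 32-byte leaf value
        self.nodes = {}                      # generalized index -> hash
        first = 1 << depth
        for i in range(first):
            self.nodes[first + i] = self.items.get(i, bytes(32))
        for g in range(first - 1, 0, -1):
            self.nodes[g] = h(self.nodes[2 * g], self.nodes[2 * g + 1])

    def root(self) -> bytes:
        return self.nodes[1]

    def purge_before(self, index: int) -> None:
        """Drop items at positions < index, keeping only hashes needed for later proofs."""
        for i in [i for i in self.items if i < index]:
            del self.items[i]
        for g in list(self.nodes):
            if g == 1:
                continue
            parent = g >> 1
            level = parent.bit_length() - 1
            span = 1 << (self.depth - level)          # leaves under the parent
            start = (parent - (1 << level)) * span    # first leaf under the parent
            if start + span <= index:
                # The parent's whole subtree is purged: its hash alone is enough.
                del self.nodes[g]

    def proof(self, index: int):
        if index not in self.items:
            raise KeyError(f"item {index} is purged or does not exist")
        g = (1 << self.depth) + index
        branch = []
        while g > 1:
            branch.append(self.nodes[g ^ 1])  # sibling hash at each level
            g >>= 1
        return branch


def verify(leaf: bytes, index: int, branch, root: bytes) -> bool:
    node = leaf
    for sibling in branch:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node == root


leaves = [hashlib.sha256(bytes([i])).digest() for i in range(8)]
tree = PurgeableTree(depth=3, items=leaves)
root_before = tree.root()
tree.purge_before(5)
```

In this example the root is unchanged after purging, `verify(leaves[6], 6, tree.proof(6), tree.root())` still succeeds, and `tree.proof(2)` raises `KeyError`. Only 9 of the original 15 node hashes remain stored.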

Sync

TODO

RPC updates

So that clients clearly understand why their request for proof data failed, we propose several enhancements:

  • When a Withdrawal is created, the GIndex of the new Withdrawal is published.
  • When the RPC fails to provide a Withdrawal with proof, it returns the leftmost GIndex among the stored Withdrawals, so the client can tell whether the requested Withdrawal was purged on this particular client or the input data is incorrect.
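A client could interpret such a failure response roughly as follows. The `ProofFailure` payload and `explain_failure` function are hypothetical names for this sketch, and it assumes all Withdrawal leaves sit at the same tree depth, so that their GIndex values compare in list order.

```python
from dataclasses import dataclass


@dataclass
class ProofFailure:
    """Hypothetical error payload returned when a proof cannot be served."""
    requested_gindex: int
    leftmost_stored_gindex: int


def explain_failure(failure: ProofFailure) -> str:
    # Anything left of the leftmost stored Withdrawal was purged on this node.
    if failure.requested_gindex < failure.leftmost_stored_gindex:
        return "purged"
    # Otherwise the node should have had it, so the request itself is wrong.
    return "invalid"


purged = explain_failure(ProofFailure(requested_gindex=10, leftmost_stored_gindex=13))
invalid = explain_failure(ProofFailure(requested_gindex=15, leftmost_stored_gindex=13))
```

In the "purged" case the client can retry against a node with a longer history setting; in the "invalid" case retrying elsewhere will not help.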