*Thanks to Guillaume Ballet, Gottfried Herold, Łukasz Rozmej, and Josh Rudolf for the fruitful discussions.*

# Introduction

This document consolidates many discussions about state migration in the Verkle Tree EIP. It isn’t a proposal or an argument, just a description of the problem and of some options being considered.

Most of what’s described here was proposed and discussed by different people in previous calls and meetings. I’ve added many other considerations while consolidating them.

This document doesn’t assume readers have followed recent Verkle Tree implementation progress. The goal is for it to be self-contained, so that more people can join the discussion with a good understanding of the problem and the proposals. Everybody is invited to join the [Verkle Implementers Call](https://www.notion.so/c40bb46ff7d44f898a2ba9ec99327868?pvs=21) to participate, since this topic is still being actively discussed.

Before we jump into the *why* and the *hows*, it’s helpful to state the goal we are trying to achieve; hopefully, it will become clear why that’s the case:

***How can we ensure that network validators have the preimages database when the Sweeping phase starts to migrate state from MPT to VKT?***

# Context

Moving Ethereum to Verkle Trees requires changes at many layers. Apart from the data structure, gas model, and cryptography changes, the chain will have to migrate the state of the current Merkle Patricia Trie (MPT) to the new Verkle Tree (VKT), assuming no State Expiry is implemented.

Without getting into details about the migration method, the current path is to do it via the *Overlay tree method*. This means we’ll migrate the estimated 1 billion key-values from the MPT to the VKT, a fixed number X of them on each block. I highly recommend watching [this EthCC 2023 talk](https://www.youtube.com/watch?v=F1Ne19Vew6w) if you want more details.
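To get a rough sense of scale, the duration of the migration follows directly from the per-block quota. The sketch below assumes the ~1 billion key-value estimate quoted above, a 12-second slot time, and a block in every slot. The values of X are purely illustrative, since this document doesn’t fix the actual number.

```python
import math

TOTAL_KEY_VALUES = 1_000_000_000  # estimate quoted above
SECONDS_PER_SLOT = 12             # mainnet slot time


def sweep_duration_days(key_values_per_block: int) -> float:
    """Days needed to migrate everything at a fixed per-block quota X.

    Assumes every slot contains a block; missed slots would lengthen this.
    """
    blocks = math.ceil(TOTAL_KEY_VALUES / key_values_per_block)
    return blocks * SECONDS_PER_SLOT / 86_400


# Hypothetical values of X, for illustration only.
for x in (1_000, 10_000, 100_000):
    print(f"X={x:>7}: ~{sweep_duration_days(x):.1f} days")
```

The relationship is inversely proportional: multiplying X by ten divides the length of the migration by ten, at the cost of more migration work per block.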
So, what is the relationship between migrating X key-values per block and *preimages*?

On each block, nodes will deterministically walk the MPT, taking the next key-values to migrate, and insert these same key-values into the VKT. But there’s a catch: the *keys* in the MPT are Keccak-hashed values, and we need their preimages to calculate the corresponding new keys in the VKT.

More concretely, suppose one of the key-values to migrate has the MPT key `0xabddde...`, which results from `keccak(someAddress)`. To know where to store the account information of `someAddress` in the new VKT, [we need to apply another kind of hashing (*Pedersen Hash*)](https://notes.ethereum.org/@vbuterin/verkle_tree_eip#Header-values) to `someAddress`. Walking the MPT only gives us `0xabddde...`, but we need the *preimage* of that hash (i.e: `someAddress`) to be able to migrate it.

Many client implementations have a flag to enable *preimage recording*. As an example, you can check the [Geth documentation regarding preimage recording](https://geth.ethereum.org/docs/faq#what-is-a-preimage). Since this is an optional feature (not enabled by default), most validators don’t record preimages; thus, they can’t do the reverse lookup that transforms `0xabddde...` into `someAddress`. Other clients, such as Erigon, store their accounts as preimages, so they don’t need to fetch preimages from external sources.

In summary, the *Overlay tree* migration strategy requires all validators to have access to a complete set of preimages for the last (frozen) MPT from which data will be migrated. Thus, we need some strategy to generate and distribute the preimages to all nodes before the migration starts.

# A mental model of the migration timeline

Before describing the currently discussed solutions, it helps to have a visual timeline of how the Verkle Trees EIP will unfold, and to define some milestones and terms.
![](https://hackmd.io/_uploads/BkkhOnAj3.png)

[Link to bigger image.](https://hackmd.io/_uploads/BkkhOnAj3.png)

Let’s unpack the above diagram, since it will be the *base diagram* for later explaining the potential options for generating and distributing preimages.

## Slot A

In Slot A, we see the *VKT activation*. This *activation* means that the Overlay Tree gets activated, which implies validators will have two trees:

- A freshly created (empty) Verkle Tree (overlay tree).
- A read-only MPT (base tree).

Between Slots A and C, the only key-values that will be inserted in the VKT are:

- Any new writes to the Ethereum state (e.g: `SSTORE`).
- Any read of a key that isn’t present in the VKT and needs to fall back to the MPT; the value read is copied to the VKT implicitly. That is, reads indirectly migrate state to the VKT.

Note that the *Sweeping phase* **hasn’t** started yet. We’ll soon explain what this phase is about.

## Slot B

This isn’t a relevant slot for the VKT EIP, since nothing changes in the validators’ logic. It’s just an observation that, from this point forward, no reorg will make clients revert to a state before the *VKT activation*.

## Slot C

The *Sweeping phase* starts. In the *Sweeping phase*, we migrate X key-values from the MPT to the VKT on each block.

For simplicity, the diagram above shows *Slot B* < *Slot C*, but whether that’s strictly needed depends on the solution (i.e: Option 1, to be explained soon, doesn’t require this condition).

This is a critical slot because network validators **must** have the complete preimages database before starting the *Sweeping phase*. If that isn’t the case, then while walking the MPT to migrate key-values they can find a key that can’t be resolved to its preimage, which will completely block the client from executing the block.

Theoretically, you could start this phase even if you have a partial preimages database, since it’s expected to be used in order (sequentially).
Exploiting this might be up to each client, but it comes with risks compared to planning for validators to pull the entire database with enough time. It can make sense for validators who join the network close to this slot.

## Slot D

The *Sweeping phase* finishes, meaning all the MPT key-values have been migrated to the VKT. The VKT now has the complete state of the Ethereum chain.

We could also imagine some Slot E where this event gets finalized, and clients can do further cleanup tasks knowing that no reorg can jump back to the *Sweeping phase*.

# Potential approaches

Now that we understand why these preimages are needed and have a mental model of how the Verkle Trees deployment will (probably) unfold, let’s remember our goal:

***How can we ensure that network validators have the preimages database when the Sweeping phase starts?***

Note that there are two angles to this question:

- Who generates this database, and how?
- How is this database distributed to the rest of the network?

In the following two sub-sections, we’ll discuss two proposals. Right after, we’ll go through some general questions to help compare the tradeoffs of both options.

Remember, this is an active discussion, and better approaches might be discovered. What “better” means is not precisely defined, so we want to open this discussion as much as possible. The options are described in no particular order.

## Option 1 - Unfinalized MPT preimage generation

Let’s look at the following diagram explaining this option:

![](https://hackmd.io/_uploads/rJqvwh0o3.png)

[Link to bigger image.](https://hackmd.io/_uploads/rJqvwh0o3.png)

*Note: the red elements were added compared to our base diagram. The distance between slots might not be accurate. Slot B is irrelevant (it can be ignored; it might happen before or after Slot C).*

We first define a loosely defined period where all EL clients will release a *highly recommended* version that will enable preimage recording by default.
There’s no strict point at which all clients should coordinate, but this period should be well ahead of *Slot A*.

At a similar time, some network actors (to be explained later) will start generating successive versions of the *preimage database*, each based on a finalized MPT. These *preimage databases* will be published and available for the network to download.

What a validator has to download depends on when it joined:

- Validators that were already synced (following the tip): after upgrading to the new EL client version with preimage recording, they are only required to download (once) any *preimage database* published **after** they started recording preimages. For these validators, that probably means the first published database.
- Any new validator (e.g. joining closer to *Slot A*): the rule is the same, but note that many of the published *preimage databases* are **invalid** for this validator. Only the ones published strictly after they joined the network are valid.

The explanation of why this works is simple, but we’ll dive into more details later.

Some general observations about this solution:

- At *Slot C*, the union of the preimages database and the recorded preimages will be a superset of the preimages needed to perform the migration. This is the case since the MPT is still being mutated before Slot A.
- Between preimage recording activation and *Slot A*, reorgs might record preimages that end up being garbage (this actually depends on how clients handle recorded preimages on reorgs). This shouldn’t be a big deal since there aren’t many reorgs, and the size overhead should be negligible.
- Note that there isn’t a strict need to separate Slot A from Slot C; both could be the same slot. It might be worth exploring whether defining Slot C after the MPT finalizes gives some benefit (might it be safer? easier for clients to know the MPT won’t be rolled back? facilitate syncing?).
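The rule for which published databases a validator can rely on reduces to a timestamp comparison. The following is a minimal sketch of that rule. The `PublishedDatabase` type, its fields, and the use of slot numbers as timestamps are assumptions made for illustration and are not taken from any client. It also assumes a database’s publication slot is the slot of the MPT state it was generated from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishedDatabase:
    name: str
    published_at_slot: int  # hypothetical: slot of the MPT state this database covers


def valid_databases(published, recording_start_slot):
    """Return the databases a validator may rely on under Option 1.

    A database is only useful if it was published strictly after the
    validator started recording preimages; otherwise, preimages created
    between publication and the start of recording would be missing.
    """
    return [db for db in published if db.published_at_slot > recording_start_slot]


published = [
    PublishedDatabase("db-1", 100),
    PublishedDatabase("db-2", 200),
    PublishedDatabase("db-3", 300),
]

# A validator that started recording at slot 150 can use db-2 or db-3, but not db-1.
usable = [db.name for db in valid_databases(published, 150)]
print(usable)
```

A validator that started recording before the first publication can use any of the databases, which is why already-synced validators will most likely only need the first one.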
## Option 2 - Finalized MPT preimage generation

Let’s look at the following diagram explaining this option:

![](https://hackmd.io/_uploads/Hy8Ed2Ajn.png)

[Link to bigger image.](https://hackmd.io/_uploads/Hy8Ed2Ajn.png)

*Note: the red elements were added compared to our base diagram. The distance between slots might not be accurate. The red “circle” means the uniquely generated and published database.*

This proposal avoids the need for preimage recording in validators by generating the preimages database only after *Slot B*, when the MPT keys are final. Compared to Option 1, EL clients don’t need to ship a new release to enable preimage recording.

To put it concretely, the preimages generated in this option can be viewed as those generated in Option 1 plus the preimages recorded by validators. When the *Sweeping phase* starts, all validators must be able to find every required preimage in this generated database (i.e: they don’t have to check in “two places”).

## Comparing options

To understand the tradeoffs between these options, let’s compare them from different angles.

### Correctness

By *correctness*, we mean that any validator at *Slot C* and onward can resolve **any preimage** in the MPT that will be walked to migrate the state.

**Option 1:** After enabling preimage recording and pulling a valid *preimage database* (i.e: published after the recording was activated), the validator can resolve any MPT preimage. For example, let’s assume we’re at some *Slot β* after the validator has enabled recording and has pulled a database:

- If this preimage was included in the MPT at or before the *preimage database* timestamp, it should be present in the generated database.
- If this preimage was added after, then it should be found among the ones recorded by the validator.

This means that validators will have a complete set of preimages to perform the migration at *Slot C* (i.e: *Slot β* = *Slot C*).
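The two cases above suggest how a client could resolve a preimage under Option 1: look in the downloaded database first, then fall back to the locally recorded preimages. The following is a minimal sketch in which both stores are plain dictionaries. Note that `hashlib.sha3_256` is only a stand-in for Ethereum’s Keccak-256 (the two use different padding and produce different digests), and all names are hypothetical.

```python
import hashlib


def stand_in_hash(data: bytes) -> bytes:
    # NOT Ethereum's Keccak-256; a stand-in so the sketch needs only the stdlib.
    return hashlib.sha3_256(data).digest()


class MissingPreimage(Exception):
    """Raised when a hashed MPT key cannot be resolved; this would block the sweep."""


def resolve_preimage(hashed_key, downloaded_db, recorded):
    """Resolve a hashed MPT key the way Option 1 requires: check two places."""
    preimage = downloaded_db.get(hashed_key)
    if preimage is None:
        preimage = recorded.get(hashed_key)  # second lookup, only on a miss
    if preimage is None:
        raise MissingPreimage(hashed_key.hex())
    return preimage


old_address = bytes.fromhex("11" * 20)  # in the state before the database timestamp
new_address = bytes.fromhex("22" * 20)  # first touched after recording was enabled

downloaded_db = {stand_in_hash(old_address): old_address}
recorded = {stand_in_hash(new_address): new_address}

for address in (old_address, new_address):
    assert resolve_preimage(stand_in_hash(address), downloaded_db, recorded) == address
```

Under Option 2 the fallback branch would not be needed, since the single published database is expected to contain every required preimage.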
**Option 2:** This option creates the preimage database after the MPT is read-only and finalized. Between *Slot B* and *Slot C*, no keys can be modified: (1) the MPT is read-only since we’re after *Slot A*, and (2) no reorg should modify the consensus around the finalized state.

### Impact on Ethereum users

Note that between *Slot A* and *Slot C*, validators will have two active trees. We’ll need to charge more gas for reading state during this time, since accessing state potentially means accessing two trees, which is more costly for validators.

In Option 1, the time between both slots is shorter than in Option 2. The reason is that Option 2 packs most of the preimage database generation **and** distribution between these slots. In Option 1, the preimage database generation and a big part of the distribution happen before *Slot A*; thus, the time between *Slot A* and *Slot C* can be shorter.

This means Option 1 could be considered better than Option 2 regarding UX.

### **How to figure out when most of the network will be ready for *Slot C*?**

This is a critical question to answer. If *Slot C* is defined incorrectly and most of the network doesn’t have all the needed preimages, there can be a liveness problem.

In Option 1, validators have more time to start recording preimages and downloading a *preimage database*, probably long before *Slot A* happens. *Slot C* will be defined when the client release for the Verkle EIP is published, so we don’t have to strictly “guess” upfront what *Slot C* should be.

In Option 2, there’s naturally a shorter time between *Slot B* and *Slot C*. Additionally, as mentioned in the *Impact on Ethereum users* section above, there’s a tension between giving the network more time and asking users to pay an overhead in gas. Option 2 might need social coordination to know when the network is ready while not waiting *too long*, to avoid this UX impact.
### Implementation complexity

Some points about this:

- Option 2 doesn’t require EL clients to release a new version to start recording preimages. That said, this should be an easy release since all clients already implement preimage recording, so it shouldn’t be a big deal in Option 1.
- Option 2 has a *single source of truth*. The finalized version of the MPT keys is a unique and precise set of key preimages that validators should have. In Option 1, we have multiple databases being published, and any of them could be invalid (e.g: 99% valid but 1% (un)intentionally invalid).
- Option 2 requires some extra coordination to define when *Slot C* will happen, since that depends on concluding when validators have downloaded the *preimages database*. If *Slot C* is hardcoded upfront with a guess, there’s a risk that something bad happens and an emergency release is needed to bump that slot number. In Option 1, since recording and generation start earlier, *Slot C* could already be set with high confidence when the *Slot A* hard-fork version is published (i.e. less “rollback”/“emergency release” risk).
- Option 2 provides a complete database for validators to use in the *Sweeping phase*: 100% of lookups in this table will succeed. In Option 1, a lookup in the *preimage database* can fail, requiring a second check in the recorded preimages. This means more than one disk lookup, which can affect performance (by how much? TBD, just making the theoretical point).
- Option 1 uses more storage space than Option 2. Relative to the optimal size (i.e: the Option 2 database), the overhead is probably negligible. The current estimate for the preimage database is ~4GiB.
- Option 1 adds extra overhead in block processing since recording preimages requires extra disk writes. This can have some impact on syncing speeds. Option 2 doesn’t include this extra overhead.
- Option 2
might be more sensitive to “bad actors” or “mistakes” in publishing an incorrect *preimage database*. If something like this happens and a big part of the network has an invalid database, the time between *Slot B* and *Slot C* can become tight. This increases the probability of needing an emergency release.

### Who generates the preimage database?

Any validator that has been running with preimage recording enabled, or that runs a client whose design makes it possible to dump the preimages (e.g: Erigon).

### Should validators trust this generated preimages database?

No. This database is easily verifiable by any validator. The way to verify it is to walk the MPT and check that all the preimages needed for the *Sweeping phase* can be resolved. This verification must happen before the *Sweeping phase* and can potentially be done in the background by validators. Recall that missing even one preimage can completely block a validator during the *Sweeping phase*, so this verification step is essential.

### How is this preimage database distributed to validators?

This is still being discussed, and we need more opinions from the community, but some discussed options are:

1. Out of band: make this database downloadable as a file via CDN/torrent.
2. In protocol: build a sub-protocol allowing clients to share this file with peers.

Both options are possible. Below I’ll list some general questions that we’ve already touched on and that can help continue the conversation:

- What if someone syncs from scratch in 10 years?
  - Out of band: From the VKT EIP onward, syncing from scratch involves asking validators to always record preimages so they build their own database. In 10 years, nobody can promise an out-of-band source will still exist, so it’s better that these nodes rely on their own. This can be enabled in the planned releases of the EIP.
  - In protocol: it shouldn’t be a problem.
- Are we asking validators to do “manual” operations?
What if they do it incorrectly?
  - Out of band: Downloading the preimage database from an external source doesn’t necessarily mean “manual” work. The client software will do the download automatically.
  - In protocol: no manual operations are needed.
- How much bandwidth overhead is added to the network?
  - Out of band: probably less than in protocol. The bandwidth usage is outside the protocol’s bandwidth usage, so it’s more efficient.
  - In protocol: to be discovered.
- What are the implementation costs and risks?
  - Out of band: mainly devops. For example, relying on a CDN takes some resources and time.
  - In protocol: implement some sub-protocol logic in all clients. The database can be considered a file, and the sub-protocol should allow clients to pull the file from other peers (no trees or healing needed).
- If we rely on some out-of-band solution (e.g: CDN), isn’t that risky?
  - Yes, if the complete network relies on a **single** CDN. If the CDN goes down for any reason, it could create a liveness problem. Ideally, there should be multiple sources if we rely on out-of-band solutions. Who controls these sources? That’s another topic, which should be considered part of the inherent complexity of out-of-band solutions.

Take everything said here as brainstorming. Time spent on this topic was mainly ping-pong conversations.

### How is this “preimage database” encoded?

It will probably be a flat file with some trivial encoding. It won’t be a database with some complex format or similar. Ignacio and Guillaume have experimented with a Geth database generator and a very simple format; more will be shared soon in the VKT implementers call.

### Can you extend more on why Option 1 needs to republish databases?

Option 1 keeps publishing updated *preimage databases* for two main reasons.

First, let’s imagine that we only published this database once. This means a new validator joining the chain can only sync from a point equal to or before this published timestamp.
If that isn’t the case, there would be a gap (i.e: missing preimages) in the *preimages database + preimage recording* set.

Second, republishing databases also requires less coordination between EL clients publishing new versions with preimage recording enabled. Let’s imagine two extreme cases:

1. We start generating these preimage databases today: this is fine, even if no EL client has published versions with preimage recording enabled. The only condition is that databases are still being published after validators start recording preimages.
2. Some EL client *X* is falling behind and is late publishing a version with preimage recording enabled. This doesn’t block validators running other clients from starting to record preimages, or even from having already downloaded databases.

The only downside of this republishing is that devops teams or archive nodes participating in publishing databases will have ongoing costs. The actual processing to generate a *preimage database* isn’t costly: a standard machine can generate this database multiple times daily without any problem. Most of the cost might be in infrastructure to automate the generation and uploading, plus bandwidth, anti-DoS, and related expenses for parties that want to be download sources.

### Does this preimage database have other benefits?

Yes. Although the primary goal of generating this database is to be sure every validator has the necessary data to do the *Sweeping phase* correctly, using it can have another big benefit: improving clients’ efficiency in the *Sweeping phase*.

This preimages database will be constructed in a specific way such that its usage in the client implementations of the *Sweeping phase* logic has the least possible disk IO overhead. For example, if we used a usual Geth-like preimage database, resolving preimages would require random disk lookups that aren’t optimizable by the host OS.
The preimage database we plan to generate is laid out so that resolving preimages can be implemented as a sequential forward-only read of a flat file, which is very efficient and doesn’t require random lookups. Random lookups could also interfere with other planned in-order walks of the MPT using “flat snapshots” in clients.

In summary, besides providing all the necessary information, this database can also be more efficient than the generic forms of preimage databases in some clients. Using the generated database for performance reasons isn’t mandatory, just an indirect benefit that can be taken advantage of.

# Closing

This is still an actively discussed topic. We want more people to [share their opinions](https://www.notion.so/c40bb46ff7d44f898a2ba9ec99327868?pvs=21)!