The `snap` protocol runs on top of RLPx, facilitating the exchange of Ethereum state
snapshots between peers. The protocol is an optional extension for peers supporting (or
caring about) the dynamic snapshot format.

The current version is `snap/1`.
The `snap` protocol is designed for semi-real-time data retrieval. Its goal is to make
dynamic snapshots of recent states available for peers. The `snap` protocol does not take
part in chain maintenance (block and transaction propagation); it is meant to be run
side-by-side with the `eth` protocol, not standalone (e.g. chain progression is
announced via `eth`).
The protocol itself is simplistic by design (take note, the supporting implementation is
anything but simple). At its crux, `snap` supports retrieving a contiguous segment of
accounts from the Ethereum state trie, or a contiguous segment of storage slots from one
particular storage trie. Both replies are Merkle proven for immediate verification. In
addition, batches of bytecodes can also be retrieved, similarly to the `eth` protocol.
The synchronization mechanism the protocol enables is for peers to retrieve and verify all
the account and storage data without downloading intermediate Merkle trie nodes. The final
state trie is reassembled locally. An additional complexity nodes must be aware of is
that state is ephemeral and moves with the chain, so syncers need to support reassembling
partially consistent state segments. This is supported by trie node retrieval similar to
`eth`, which can be used to heal trie inconsistencies (more on this later).
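As a rough, non-normative illustration of that flow, the Go sketch below wires the two phases together (range download, then healing) against hypothetical `SnapPeer` and `Database` interfaces; proof verification, storage ranges and bytecode retrieval are elided.

```go
// Illustrative sketch only: a minimal two-phase snap sync driver. The
// SnapPeer/Database interfaces are assumptions, not part of the spec.
package snapsync

import "errors"

// Hash is a 32-byte trie path / state root (hypothetical helper type).
type Hash [32]byte

// Account is a decoded slim-format account body (fields elided for brevity).
type Account struct{ Body []byte }

// SnapPeer abstracts two of the retrieval primitives described in this document.
type SnapPeer interface {
	// AccountRange returns consecutive accounts starting at origin, plus a
	// "done" flag once the end of the trie is reached (in practice detected
	// via the boundary proofs, which are elided here).
	AccountRange(root, origin Hash, respBytes uint64) (hashes []Hash, accounts []Account, done bool, err error)
	// TrieNodes returns the requested trie nodes by path, used for healing.
	TrieNodes(root Hash, paths [][]byte, respBytes uint64) ([][]byte, error)
}

// Database is a hypothetical sink for downloaded state.
type Database interface {
	StoreAccount(hash Hash, acc Account)
	MissingTrieNodePaths(root Hash, max int) [][]byte // paths still absent locally
}

// SyncState sketches the overall flow: phase 1 sweeps the account hash space
// in contiguous ranges; phase 2 heals the locally reassembled trie against a
// recent root via trie node requests.
func SyncState(peer SnapPeer, db Database, root Hash) error {
	var origin Hash // 0x00..0
	for {
		hashes, accounts, done, err := peer.AccountRange(root, origin, 512*1024)
		if err != nil {
			return err
		}
		for i, h := range hashes {
			db.StoreAccount(h, accounts[i])
		}
		if done || len(hashes) == 0 {
			break
		}
		// Continue just past the last returned account hash.
		origin = next(hashes[len(hashes)-1])
	}
	// Phase 2: heal inconsistencies caused by the state moving during phase 1.
	for {
		paths := db.MissingTrieNodePaths(root, 256)
		if len(paths) == 0 {
			return nil // trie complete and consistent for `root`
		}
		nodes, err := peer.TrieNodes(root, paths, 512*1024)
		if err != nil {
			return err
		}
		if len(nodes) == 0 {
			return errors.New("healing stalled")
		}
		_ = nodes // persisting the healed nodes into the local trie is elided
	}
}

// next returns the hash immediately after h (big-endian increment).
func next(h Hash) Hash {
	for i := len(h) - 1; i >= 0; i-- {
		h[i]++
		if h[i] != 0 {
			break
		}
	}
	return h
}
```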
The `snap` protocol permits downloading the entire Ethereum state without having to
download all the intermediate Merkle proofs, which can be regenerated locally. This
reduces the networking load enormously:
- The data downloaded is reduced from `O(accounts * log account + SUM(states * log states))`
  (intermediate Merkle trie nodes) to `O(accounts + SUM(states))` (actual state data).
- The data uploaded is reduced from `O(accounts * log account + SUM(states * log states)) * 32 bytes`
  (Merkle trie node hashes) to `O(accounts + SUM(states)) / 100000 bytes` (number of range request packets).
- The number of round trips is reduced from `O(accounts * log account + SUM(states * log states)) / 384`
  (states retrieval packets) to `O(accounts + SUM(states)) / 100000 bytes` (number of range retrieval packets).

To put some numbers on the above abstract orders of magnitude, synchronizing Ethereum
mainnet state (i.e. ignoring blocks and receipts, as those are the same) with `eth` vs.
the `snap` protocol:
Block ~#11,177,000:

|        |   Time |  Upload | Download |  Packets | Serving disk reads* |
|:------:|-------:|--------:|---------:|---------:|--------------------:|
| `eth`  | 10h50m | 20.38GB |  43.8GB  |   1607M  |       15.68TB       |
| `snap` |  2h6m  |  0.15GB | 20.44GB  |  0.099M  |       0.096TB       |
|        | -80.6% | -99.26% | -53.33%  | -99.993% |       -99.39%       |

*Also accounts for other peer requests during the time span.
Post-`snap` state heal:
The `snap` protocol is a dependent satellite of `eth` (i.e. to run `snap`, you need to
run `eth` too), not a fully standalone protocol. This is a deliberate design decision:

- `snap` is meant to be a bootstrap aid for newly joining full nodes. By enforcing all
  `snap` peers to also speak `eth`, we can avoid non-full nodes from lingering attached
  to `snap` indefinitely.
- `eth` already contains well-established chain and fork negotiation mechanisms, as well
  as other machinery needed during sync; `snap` can benefit from all these mechanisms
  without having to duplicate them.

This satellite status may be changed later, but it's better to launch with a more
restricted protocol first and then expand if need be vs. trying to withdraw depended-upon
features.
The `snap` protocol is not an extension / next version of `eth`, as it relies on the
availability of a snapshot acceleration structure that can iterate accounts and storage
slots linearly. It also serves one specific sync method that might not be suitable
for all clients. Keeping `snap` as a separate protocol permits every client to decide
whether to pursue it or not, without hindering their capacity to participate in the `eth`
protocol.
The crux of the snapshot synchronization is making contiguous ranges of accounts and
storage slots available for remote retrieval. The sort order is the same as the state trie
iteration order, which makes it possible to not only request N subsequent accounts, but
also to Merkle prove them. Some important properties of this simple algorithm:
The gotcha of the snapshot synchronization is that serving nodes need to be able to
provide fast iterable access to the state of the most recent N
(128) blocks.
Iterating the Merkle trie itself might be functional, but it's not viable (iterating the
state trie at the time of writing takes 9h 30m on an idle machine). Geth introduced
support for dynamic snapshots, which allows iterating all the accounts in 7m
(see blog for more). Some important properties of the dynamic snapshots:
- Maintaining the snapshot alongside the chain requires `O(n)` extra operations, and more
  disk space.
- In exchange, the snapshot provides `O(1)` direct access to any account and storage slot,
  which also makes EVM state reads (`SLOAD`) cheaper.
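To make the "fast iterable access" requirement concrete, here is a toy flat layout in Go (a simplification, not Geth's actual snapshot format or API): entries keyed by account hash (and account hash plus slot hash for storage) are kept sorted, so both linear range iteration and near-direct single-entry lookup are cheap.

```go
// Illustrative only: a toy flat snapshot, not Geth's actual on-disk format.
package snapshot

import (
	"bytes"
	"sort"
)

// Key layout (assumed for this sketch):
//   account entry:  keccak(address)                    -> slim account RLP
//   storage entry:  keccak(address) ++ keccak(slotKey) -> slot value
type FlatSnapshot struct {
	keys   [][]byte // kept sorted, so contiguous ranges can be iterated linearly
	values [][]byte
}

// Put inserts an entry, keeping keys sorted (O(n) here; real implementations
// use a flat database plus in-memory diff layers for recent blocks).
func (s *FlatSnapshot) Put(key, value []byte) {
	i := sort.Search(len(s.keys), func(i int) bool { return bytes.Compare(s.keys[i], key) >= 0 })
	s.keys = append(s.keys, nil)
	s.values = append(s.values, nil)
	copy(s.keys[i+1:], s.keys[i:])
	copy(s.values[i+1:], s.values[i:])
	s.keys[i], s.values[i] = key, value
}

// Range returns up to max entries starting at origin: this is exactly the
// iteration primitive the account/storage range requests below rely on, and
// the same binary search gives near-direct access to any single entry.
func (s *FlatSnapshot) Range(origin []byte, max int) (keys, values [][]byte) {
	i := sort.Search(len(s.keys), func(i int) bool { return bytes.Compare(s.keys[i], origin) >= 0 })
	for ; i < len(s.keys) && len(keys) < max; i++ {
		keys = append(keys, s.keys[i])
		values = append(values, s.values[i])
	}
	return keys, values
}
```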
The caveat of the snapshot synchronization is that, as with fast sync (and as opposed to
warp sync), the available data constantly moves (as new blocks arrive). The probability
of finishing sync before the 128-block window (15m) moves out is asymptotically zero. This
is not a problem, because we can self-heal. It is fine to import state snapshot chunks
from different tries, because the inconsistencies can be fixed by running a
fast-sync-style state sync on top of the assembled semi-correct state afterwards. Some
important properties of the self-healing:
The accounts in the `snap` protocol are analogous to the Ethereum RLP consensus encoding
(same fields, same order), but in a slim format:

- The code hash is the `empty list` instead of `Keccak256("")` for accounts without code.
- The storage root is the `empty list` instead of `Hash(<empty trie>)` for accounts without storage.

This is done to avoid having to transfer the same 32+32 bytes for all plain accounts over
the network.
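As a non-normative sketch, the conversion below shows the slim transformation for a hypothetical `Account` struct; the two constants are the well-known hashes of empty code and of an empty trie.

```go
// Illustrative sketch of the slim account transformation; the field names and
// the toSlim helper are not part of the spec.
package slim

import (
	"bytes"
	"encoding/hex"
	"math/big"
)

// Well-known consensus constants: the root hash of an empty trie and the
// Keccak256 hash of empty code.
var (
	emptyRoot, _     = hex.DecodeString("56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421")
	emptyCodeHash, _ = hex.DecodeString("c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470")
)

// Account mirrors the consensus account encoding (same fields, same order).
type Account struct {
	Nonce    uint64
	Balance  *big.Int
	Root     []byte // storage trie root, 32 bytes
	CodeHash []byte // Keccak256 of the contract code, 32 bytes
}

// toSlim drops the two constant hashes for plain accounts so they never have
// to be transferred over the network; the receiver restores them when the
// fields come back empty.
func toSlim(acc Account) Account {
	if bytes.Equal(acc.Root, emptyRoot) {
		acc.Root = nil
	}
	if bytes.Equal(acc.CodeHash, emptyCodeHash) {
		acc.CodeHash = nil
	}
	return acc
}
```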
GetAccountRange (0x00)

`[reqID: P, rootHash: B_32, startingHash: B_32, responseBytes: P]`

Requests an unknown number of accounts from a given account trie, starting at the
specified account hash and capped by the maximum allowed response size in bytes. The
intended purpose of this message is to fetch a large number of subsequent accounts from a
remote node and reconstruct a state subtrie locally.

- `reqID`: Request ID to match up responses with
- `rootHash`: Root hash of the account trie to serve
- `startingHash`: Account hash of the first to retrieve
- `responseBytes`: Soft limit at which to stop returning data
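The following server-side sketch illustrates the soft-limit semantics under one common interpretation (the entry that crosses the limit is still included, so a response is never empty); `AccountIterator` and the other identifiers are assumptions, not part of the protocol.

```go
// Non-normative sketch: assembling an account range reply up to the soft
// responseBytes limit. AccountIterator and accountEntry are assumptions.
package accountserve

// AccountIterator walks accounts in hash order; it is assumed to already be
// positioned at the requested startingHash.
type AccountIterator interface {
	Next() bool      // advance; false when the trie is exhausted
	Hash() [32]byte  // keccak(address), i.e. the trie path
	SlimRLP() []byte // account body in slim RLP format
}

type accountEntry struct {
	Hash [32]byte
	Body []byte
}

// serveAccountRange gathers consecutive accounts until the accumulated
// payload crosses responseBytes. The limit is treated as soft: the entry that
// crosses it is still included, so the requester always makes progress even
// with a tiny limit. Boundary proof generation is elided.
func serveAccountRange(it AccountIterator, responseBytes uint64) []accountEntry {
	var (
		entries []accountEntry
		size    uint64
	)
	for it.Next() {
		body := it.SlimRLP()
		entries = append(entries, accountEntry{Hash: it.Hash(), Body: body})
		size += uint64(32 + len(body))
		if size >= responseBytes {
			break // soft limit reached; stop after the current account
		}
	}
	return entries
}
```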
AccountRange (0x01)

`[reqID: P, accounts: [[accHash: B_32, accBody: B], ...], proof: [node_1: B, node_2, ...]]`
Returns a number of consecutive accounts and the Merkle proofs for the entire range
(boundary proofs). The left-side proof must be for the requested origin hash (even if an
associated account does not exist) and the right-side proof must be for the last returned
account.
- `reqID`: ID of the request this is a response for
- `accounts`: List of consecutive accounts from the trie
  - `accHash`: Hash of the account address (trie path)
  - `accBody`: Account body in slim format
- `proof`: List of trie nodes proving the account range

Notes:
- If the requested range covers the entire state (the requested origin was `0x00..0` and all
  accounts fit into the response), no proofs need to be sent along the response.
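A consuming client might sanity-check a reply along these lines before running a proper Merkle range-proof verification; `verifyRangeProof` is a placeholder for whatever trie library the client uses.

```go
// Non-normative sketch: basic client-side checks on an account range reply.
package rangecheck

import (
	"bytes"
	"errors"
)

type accountEntry struct {
	Hash [32]byte // trie path (keccak of the address)
	Body []byte   // slim-format account body
}

// verifyRangeProof is a placeholder for a trie range-proof verifier (e.g. the
// one provided by the client's trie implementation); not implemented here.
func verifyRangeProof(root, origin [32]byte, entries []accountEntry, proof [][]byte) error {
	return errors.New("range-proof verification not implemented in this sketch")
}

func checkAccountRange(root, origin [32]byte, entries []accountEntry, proof [][]byte) error {
	// Accounts must be consecutive, i.e. strictly increasing by hash, and the
	// first one must not precede the requested origin.
	for i, e := range entries {
		if i == 0 && bytes.Compare(e.Hash[:], origin[:]) < 0 {
			return errors.New("first account precedes requested origin")
		}
		if i > 0 && bytes.Compare(entries[i-1].Hash[:], e.Hash[:]) >= 0 {
			return errors.New("accounts not in strictly ascending hash order")
		}
	}
	// The left boundary proof must cover the origin (even if no account exists
	// at that exact hash) and the right one the last returned account.
	return verifyRangeProof(root, origin, entries, proof)
}
```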
GetStorageRanges (0x02)

`[reqID: P, rootHash: B_32, accountHashes: [B_32], startingHash: B, responseBytes: P]`
Requests the storage slots of multiple accounts' storage tries. Since certain contracts
have huge state, the method can also request storage slots from a single account, starting
at a specific storage key hash. The intended purpose of this message is to fetch a large
number of subsequent storage slots from a remote node and reconstruct a state subtrie
locally.
- `reqID`: Request ID to match up responses with
- `rootHash`: Root hash of the account trie to serve
- `accountHashes`: Account hashes of the storage tries to serve
- `startingHash`: Storage slot hash of the first to retrieve
- `responseBytes`: Soft limit at which to stop returning data
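The two request shapes described above (many small accounts at once vs. resuming one huge contract at a specific slot hash) could be composed as in this sketch; the struct and helper names are assumptions.

```go
// Non-normative sketch: composing storage range requests. The request struct
// mirrors the message fields; the helper names are assumptions.
package storagereq

type storageRangeRequest struct {
	ReqID         uint64
	RootHash      [32]byte
	AccountHashes [][32]byte
	StartingHash  []byte // empty means "from the first slot"
	ResponseBytes uint64
}

// batchRequest asks for the storage of several small accounts at once,
// starting every storage trie from its first slot.
func batchRequest(id uint64, root [32]byte, accounts [][32]byte, limit uint64) storageRangeRequest {
	return storageRangeRequest{ReqID: id, RootHash: root, AccountHashes: accounts, ResponseBytes: limit}
}

// resumeRequest continues a single huge contract whose previous reply was cut
// off: exactly one account hash is given, plus the slot hash to resume from.
func resumeRequest(id uint64, root [32]byte, account, nextSlot [32]byte, limit uint64) storageRangeRequest {
	return storageRangeRequest{
		ReqID:         id,
		RootHash:      root,
		AccountHashes: [][32]byte{account},
		StartingHash:  nextSlot[:],
		ResponseBytes: limit,
	}
}
```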
StorageRanges (0x03)

`[reqID: P, slots: [[[slotHash: B_32, slotData: B], ...], ...], proof: [node_1: B, node_2, ...]]`
Returns a number of consecutive storage slots for the requested accounts (i.e. a list of
lists of slots) and optionally the Merkle proofs for the last range (boundary proofs) if it
only partially covers the storage trie. The left-side proof must be for the requested
origin slot (even if it does not exist) and the right-side proof must be for the last
returned slot.
- `reqID`: ID of the request this is a response for
- `slots`: List of lists of consecutive slots from the trie (one list per account)
  - `slotHash`: Hash of the storage slot key (trie path)
  - `slotData`: Data content of the slot
- `proof`: List of trie nodes proving the slot range
GetByteCodes (0x04)

`[reqID: P, hashes: [hash1: B_32, hash2: B_32, ...], bytes: P]`
Requests a number of contract bytecodes by hash. This is analogous to the `eth/63`
`GetNodeData`, but restricted to bytecode only, to break the generality that causes issues
with database optimizations. The intended purpose of this request is to allow retrieving
the code associated with accounts retrieved via `GetAccountRange`, but it's needed during
healing too.
- `reqID`: Request ID to match up responses with
- `hashes`: Code hashes to retrieve the code for
- `bytes`: Soft limit at which to stop returning data

This functionality was duplicated into `snap` from `eth/65` to permit `eth` in the long
term to become a chain maintenance protocol only and to move synchronization primitives out
into satellite protocols.
Notes:

- Unavailable bytecodes should simply be omitted from the response rather than substituted
  with `nil` or other placeholders.

Caveats:

- `bytes / 6KB` is a rough estimate of the number of codes that fit into a single response.
ByteCodes (0x05)

`[reqID: P, codes: [code1: B, code2: B, ...]]`
Returns a number of requested contract codes. The order is the same as in the request, but
there might be gaps if not all codes are available, or there might be fewer if QoS limits
are reached.
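Since replies may omit unavailable codes, a client cannot rely on position alone to pair results with requests; one simple approach, sketched below, is to hash every returned blob and match it against the outstanding hashes (Keccak256 via `golang.org/x/crypto/sha3`).

```go
// Non-normative sketch: reconciling a bytecode reply with the hashes that
// were requested.
package codefetch

import "golang.org/x/crypto/sha3"

func keccak256(data []byte) [32]byte {
	h := sha3.NewLegacyKeccak256()
	h.Write(data)
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// matchByteCodes returns the delivered codes keyed by their hash, plus the
// requested hashes that were not delivered and therefore need re-requesting
// (the reply keeps request order but may contain gaps or be cut short).
func matchByteCodes(requested [][32]byte, codes [][]byte) (delivered map[[32]byte][]byte, missing [][32]byte) {
	delivered = make(map[[32]byte][]byte, len(codes))
	for _, code := range codes {
		delivered[keccak256(code)] = code
	}
	for _, want := range requested {
		if _, ok := delivered[want]; !ok {
			missing = append(missing, want)
		}
	}
	return delivered, missing
}
```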
GetTrieNodes (0x06)

`[reqID: P, rootHash: B_32, paths: [[accPath: B, slotPath1: B, slotPath2: B, ...]...], bytes: P]`
Requests a number of state (either account or storage) Merkle trie nodes by path. This
is analogous in functionality to the `eth/63` `GetNodeData`, but restricted to tries only
and queried by path, to break the generality that causes issues with database
optimizations.
- `reqID`: Request ID to match up responses with
- `rootHash`: Root hash of the account trie to serve
- `paths`: Trie paths to retrieve the nodes for, grouped by account
- `bytes`: Soft limit at which to stop returning data

The `paths` field is one array of trie node paths to retrieve per account (i.e. a list of
lists of paths). Each list in the array special-cases the first element as the path in the
account trie and the remaining elements as paths in the storage trie. To address an account
node, the inner list should have a length of 1, consisting of only the account path. Partial
paths (<32 bytes) should be compact encoded per the Ethereum wire protocol; full paths
should be plain binary encoded.
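A sketch of assembling the grouped `paths` value, with hex-prefix (compact) encoding of partial paths elided; the helper names are illustrative only.

```go
// Non-normative sketch: building the grouped `paths` argument for a trie-node
// request. Compact encoding of partial paths is elided.
package triereq

// pathGroup is one inner list: the account trie path first, then any storage
// trie paths belonging to that same account.
type pathGroup [][]byte

// accountNodePath addresses a node in the account trie itself: the inner list
// has length 1, holding only the account path.
func accountNodePath(accPath []byte) pathGroup {
	return pathGroup{accPath}
}

// storageNodePaths addresses nodes inside one account's storage trie: the
// account path comes first, followed by the storage trie paths.
func storageNodePaths(accPath []byte, slotPaths ...[]byte) pathGroup {
	return append(pathGroup{accPath}, slotPaths...)
}

// buildTrieNodeRequest assembles the full `paths` value for the request
// (one group per account, so the account path is never repeated).
func buildTrieNodeRequest(groups ...pathGroup) [][][]byte {
	out := make([][][]byte, 0, len(groups))
	for _, g := range groups {
		out = append(out, [][]byte(g))
	}
	return out
}
```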
This functionality was mutated into `snap` from `eth/65` to permit `eth` in the long term
to become a chain maintenance protocol only and to move synchronization primitives out into
satellite protocols.

Rationale:

- Grouping storage paths under their account avoids repeating the shared `account || storage`
  path prefix for every requested node.
TrieNodes (0x07)

`[reqID: P, nodes: [node1: B, node2: B, ...]]`
Returns a number of requested state trie nodes. The order is the same as in the request,
but there might be fewer if QoS limits are reached.
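Because the reply preserves request order but may be cut short, a healing client can pair nodes with requests by position and verify each against the hash its parent references, retrying whatever is left; this is a non-normative sketch with assumed helper types.

```go
// Non-normative sketch: consuming a trie-node reply during healing.
package trieheal

import (
	"bytes"
	"errors"

	"golang.org/x/crypto/sha3"
)

func keccak256(data []byte) [32]byte {
	h := sha3.NewLegacyKeccak256()
	h.Write(data)
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// healRequest pairs each requested path with the node hash expected there
// (known locally from the parent node that referenced it).
type healRequest struct {
	Path         []byte
	ExpectedHash [32]byte
}

// processTrieNodes returns the verified nodes keyed by path, plus the requests
// that still need retrying (missing from a truncated reply, or invalid).
func processTrieNodes(reqs []healRequest, nodes [][]byte) (healed map[string][]byte, retry []healRequest, err error) {
	if len(nodes) > len(reqs) {
		return nil, nil, errors.New("more nodes than requested")
	}
	healed = make(map[string][]byte, len(nodes))
	for i, req := range reqs {
		if i >= len(nodes) {
			retry = append(retry, req) // reply truncated by the QoS limit
			continue
		}
		h := keccak256(nodes[i])
		if !bytes.Equal(h[:], req.ExpectedHash[:]) {
			retry = append(retry, req) // unexpected content, ask again
			continue
		}
		healed[string(req.Path)] = nodes[i]
	}
	return healed, retry, nil
}
```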
Version 1 was the introduction of the snapshot protocol.