
Neighbors lookup in Discovery V5

  1. Intro (you are here)
  2. Closeness: how it works
  3. Do we need neighbors and closeness in Discovery?
  4. Neighbors lookup change in Discovery V5
  5. Simulator
    A. Simulator description
    B. Simulator simplifications
    C. Simulator test cases
  6. Simulation results
    A. Modern network
    B. Old network
  7. Discussion
  8. References

Discovery is a family of p2p protocols, a fork of the well-known Kademlia[1] protocol, which was designed as a decentralized distributed hash table (DHT). The previous version, Discovery V4[2], is already used in the Ethereum[3] network for node lookup and discovery, but it has a few obvious flaws and requires some improvements. Discovery V5[4] is introduced to move forward and provide a better tool for the peer knowledge propagation mechanism.

The Discovery V5 protocol changes the node lookup[5] procedure known from Discovery V4 and original Kademlia, which has been well examined by a number of researchers. The classic function takes keyHash (which could be a public node address or the hash of a lookup word) as its only parameter and returns the k nodes closest to the input (k refers to the k-bucket from original Kademlia: the maximum number of nodes in each bucket).

In V5, the initiating node A (assuming it searches for the nodes closest to itself) calculates the distance d between itself and a responding node B. A sends FINDNODE(d) to B and, after receiving an answer, calculates which way it should go to collect a total of k nodes closest to its ID. So next it could send FINDNODE(d+1), or FINDNODE(d-1), and maybe FINDNODE(d+2), etc., if there are still fewer than k nodes in total[6]. In comparison, a node in Kademlia and V4 sends only one query to get the same outcome: FindNeighbors(Node ID of A).
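The bucket walking described above can be sketched as follows. This is a simplified Java illustration, not the algorithm from the specification; in particular, the expansion order after the first query (d+1 before d-1) is one plausible choice, not something the protocol mandates.

```java
import java.util.ArrayList;
import java.util.List;

public class LookupOrder {
    // Expands outward from the initial log-distance d: d, d+1, d-1, d+2, d-2, ...
    // clamped to the valid bucket range [1, 256], until enough queries are planned.
    static List<Integer> distanceSequence(int d, int maxQueries) {
        List<Integer> order = new ArrayList<>();
        order.add(d);
        for (int step = 1; order.size() < maxQueries && step <= 256; step++) {
            if (d + step <= 256) order.add(d + step);
            if (order.size() >= maxQueries) break;
            if (d - step >= 1) order.add(d - step);
        }
        return order;
    }

    public static void main(String[] args) {
        // Collecting k nodes around distance 255 may take several FINDNODE queries.
        System.out.println(distanceSequence(255, 5)); // [255, 256, 254, 253, 252]
    }
}
```

In V4, by contrast, this whole sequence collapses into a single FindNeighbors message.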

Closeness: how it works

First, let's shed some light on the distance term in the Kademlia and Discovery protocol family. From the specification of Kademlia:

Given two 160-bit identifiers, x and y, Kademlia defines the distance between them as their bitwise exclusive or (XOR) interpreted as an integer,
d(x, y) = x⊕y.


For each 0 ≤ i < 160, every node keeps a list of (IP address, UDP port, Node ID) triples for nodes of distance between 2^i and 2^(i+1) from itself. We call these lists k-buckets.

The Discovery protocol simplifies this picture while preserving the idea[7]: the distance is i, the exponent of 2 in the bucket definition, making the bucket index equal to the distance. We call it logarithmic distance in this case. Also, Discovery uses 256-bit keys, but that changes nothing except the number of possible buckets. So, distance is just a function of two node IDs; it doesn't respect geographical closeness, latency between nodes, etc.
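As a concrete illustration, both metrics are a couple of lines of code. This is a Java sketch; the names `distance` and `logDistance` are chosen here for clarity, not identifiers from the specification.

```java
import java.math.BigInteger;

public class Distance {
    // Kademlia's XOR metric: d(x, y) = x xor y, interpreted as an integer.
    static BigInteger distance(BigInteger x, BigInteger y) {
        return x.xor(y);
    }

    // Discovery's logarithmic distance: the position of the highest differing
    // bit, i.e. the bucket index. It equals the bitLength of the XOR; identical
    // IDs give 0.
    static int logDistance(BigInteger x, BigInteger y) {
        return x.xor(y).bitLength();
    }

    public static void main(String[] args) {
        BigInteger a = BigInteger.valueOf(0b1011); // 11
        BigInteger b = BigInteger.valueOf(0b0010); // 2
        System.out.println(distance(a, b));        // 9 (1011 xor 0010 = 1001)
        System.out.println(logDistance(a, b));     // 4 (highest differing bit)
    }
}
```

With full 256-bit IDs the log distance falls between 1 and 256, which is exactly the range of bucket indexes seen later in this article.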

Closeness is, respectively, the comparative measure of distance. So, if the (logarithmic) distance(A, B) = 256 and distance(A, C) = 250, we say that C is closer to A than B. The Kademlia table is designed to prioritize the closest known nodes. A node is never dropped if it's close enough and alive, and we try to refresh the state of the closest nodes first. When you have thousands or even millions of nodes, setting some priority is an obvious necessity: you will never be able to keep up-to-date information about every node in the network. So, there are two main reasons for prioritizing the closest nodes:

  1. Creating small worlds[8] of the same size around every node. These worlds scale with the network size, are predictable, and shape the topology with a clear and consistent balance of efficiency and redundancy.
  2. Kademlia was designed as a distributed hash table, so information about a search term is supposed to be stored only on the nodes closest to the hash of that search term.

But Discovery V4 doesn't have any kind of search, and we proceed to the next question with this fact in mind.

Do we need neighbors and closeness in Discovery?

A recent paper targeting Ethereum network attacks[9] asks: why keep the closeness mechanism of Kademlia in a node discovery system that doesn't use its properties but inherits all of its vulnerabilities? There are several reasons why we should stay with it:

Network graph. The rule by which nodes look for their p2p exchange partners defines the whole network topology. A recent study[10] found that the Bitcoin network structure is not a random graph, as might be expected from its random selection of nodes. Instead, the Bitcoin network topology is heavily reshaped by the mining establishment. Another study confirms[11] that the real-world Ethereum network graph is less clustered. It must therefore be concluded that anonymity and randomness in lookup don't improve the topology of Bitcoin; instead, they mask the imbalance.

Security and anonymity. Since Ethereum node closeness data is open, anyone can query a node for its neighbors and calculate the distance from one peer to another, which enables several kinds of attacks[12], especially when a node is new to the network or has had 1+ day of downtime. In contrast, Bitcoin's p2p lookup preserves the anonymity of node connections, but observation methods can reconstruct the node mesh[13], questioning its anonymity advantage.

Lookup using closeness. While not using key hash lookup as originally proposed in the Kademlia paper, Discovery V5 suggests[14] using it for advertisement topic location search. Whether we keep it as currently designed or use Kademlia's original scheme, both require an open distance measurement mechanism. The only approach that could possibly omit it is random distribution of ad topics, which is not well researched today.

Neighbors lookup change in Discovery V5

Looking at the previous section, there are no significant reasons to ditch the closeness property. The question is why change its interface, and what is the cost of this change. The "Discovery V5 Rationale" paper gives[15] the only reason behind it:

Discovery V4 trusts other nodes to return neighbors according to an agreed distance metric. Mismatches in implementation can make it hard for nodes to join the network, or lead to network fragmentation.

First of all, even a random distribution doesn't lead to serious imbalance unless some forces have an interest in discrimination, that is, unless a node answers with peers specially selected to cause such fragmentation.

One could object: in V4 we were not able to verify the distances of the nodes in a reply, so we get a security improvement in V5, where we can verify a NODES answer. But what could prevent a V5 node from replying with fewer nodes than it knows? Nothing. Take an answer which we wouldn't like in V4: 2 nodes in bucket 254, 2 in 255, 2 in 256 and 10 in 253, 16 nodes total in one Neighbors reply, which doesn't look like a real node distribution. We could receive the same data in 4 messages when asking that node in V5 using FINDNODE. While the node knows about 16 nodes in each of the 254, 255 and 256 buckets, it could return just 2 from each in both protocols.

Could a V5 implementation, in reply to a query with distance d, send nodes that do not lie at that distance? Definitely, yes. Both in the case of misinterpretation of the specification by a client developer (which should lead to documentation improvements) and in the case of a malicious node designed to violate the specification. Both cases can easily be handled by simple distance verification of each node in the answer.
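Such a verification might look like this. This is a hypothetical Java sketch, not code from any client: it simply drops every node in a reply whose log distance from us differs from the requested one.

```java
import java.math.BigInteger;
import java.util.List;
import java.util.stream.Collectors;

public class ReplyFilter {
    // Logarithmic distance: position of the highest differing bit of the XOR.
    static int logDistance(BigInteger x, BigInteger y) {
        return x.xor(y).bitLength();
    }

    // Keeps only the node IDs from a NODES reply that actually lie at the
    // requested distance from our own ID; everything else is discarded.
    static List<BigInteger> verified(BigInteger self, int requestedDistance,
                                     List<BigInteger> reply) {
        return reply.stream()
                .filter(id -> logDistance(self, id) == requestedDistance)
                .collect(Collectors.toList());
    }
}
```

The check is cheap (one XOR per node), so there is little reason for a client to skip it.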

Next, the change of the input parameter doesn't provide any anonymity improvement, that is, the ability of a node to hide part of its known peers: you can recursively fetch the whole Kademlia table of a node in both cases.

Overall, there are no visible benefits to changing the lookup parameter, but there are definite drawbacks: more messages, more latency, more traffic.

Let's take an example of Kademlia tables and their buckets, as filled on 3 nodes among 10,000 peers:

k-bucket  node #1  node #2  node #3
256       16       16       16
255       16       16       16
254       16       16       16
253       16       16       16
252       16       16       16
251       16       16       16
250       16       16       16
249       16       16       16
248       13       16       16
247       10       8        11
246       9        3        4
245       6        3        4
244       0        1        0
243       1        0        1
242       0        0        0
241       0        1        1
240       0        0        0
239       0        1        0
238..1    0        0        0

What can we notice from it? First of all, every node in a 10,000-node network keeps information about only 160-170 of them, less than 2%; we call this small-world topology. Secondly, buckets with index less than 248 are only partially filled. This means that any node falling into bucket 247 or lower of such a node will definitely be kept in the table while it's alive, because the lower the bucket index, the smaller the chance of falling into it. Thirdly, we can notice that the nodes are stocked in a similar manner, but if we look at the exact node IDs (not displayed here), we will see that the 3 tables have no, or almost negligible, intersection. This happens because of the properties of the distance function.
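This bucket fullness pattern follows directly from the XOR metric: for uniformly random 256-bit IDs, a random peer falls into bucket i with probability 2^(i-257), so the expected occupancy of bucket i in an n-node network is n·2^(i-257), capped by the bucket size k. A quick back-of-the-envelope check in Java (an estimate, not simulator code):

```java
public class BucketOccupancy {
    // Expected number of peers (out of n uniformly random 256-bit IDs) whose
    // log-distance from a given node equals bucket index i: n * 2^(i - 257),
    // capped at the bucket size k.
    static double expected(int n, int bucket, int k) {
        double e = n * Math.pow(2.0, bucket - 257);
        return Math.min(e, k);
    }

    public static void main(String[] args) {
        for (int i = 256; i >= 244; i--) {
            System.out.printf("bucket %d: %.1f%n", i, expected(10_000, i, 16));
        }
    }
}
```

For n = 10,000 this gives about 19.5 candidates for bucket 248 (full at k = 16), about 9.8 for 247 and about 4.9 for 246, which matches the shape of the three tables above quite well.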

If you look up neighbors in buckets #249-256, you need one query in both V4 and V5. Moving down, we may need 3 messages or more in V5 compared to just one in V4; moreover, starting from bucket #244 and below, we cannot tell from the node's reply whether we should continue down to lower-numbered buckets. And that was the case with 3 long-lived, well-filled nodes. But the network consists of different peers. Some of them are much younger.

To measure network traffic and latency and compare them with V4, a simulator was built and a number of simulations were executed. Let's see what we can expect in a real network when moving to V5.

Simulator

The simulator[16] was made to compare different approaches to neighbors discovery. As we don't have real network data for each case, we need to make several assumptions and simplify the processes occurring between nodes during lookup.

Simulator description

The simulator is written in the Kotlin[17] language, a modern rethink of Java. It doesn't use any kind of network stack; it just calls the appropriate KademliaTable methods[18] of another peer and records the estimated size of outgoing and incoming messages only on the initiator peer. In a p2p network without any broadcast, this is enough to get whole-network statistics with zero duplication. To reduce the complexity of real network processes, several other simplifications were made:

Simulator simplifications

  1. Peers' uptime, which leads to variety in the fullness of Kademlia tables
    We use a simulated Pareto distribution of peers among Kademlia tables, because p2p peer uptime tends to follow a Pareto distribution[19]. Two configurations were chosen. One with α = 0.18, Xm = 1, which leads to a distribution where 40% of peers have passed all other peers through their tables (having about 8 highest buckets full and another 8 partially filled), down to the youngest peers, which know only about 4 other peers. Let's call it the "Modern network". Another was made with α = 0.24, Xm = 10, which leads to 50% of "heard about everyone" nodes, with the youngest nodes knowing 35 peers, almost 3 full highest buckets: the "Old network". Though this kind of simulated distribution leaves room for doubt, building a real network explorer is not part of this research, but it could be done in the future if needed. According to a research paper on the real Ethereum network, it takes about 4 days[20] to pass all nodes through a peer's Kademlia table, so it's a big bet that more than 50% of peers in the network have done so. Moreover, a real network contains malicious nodes, buggy old client versions, etc., which makes the first configuration better at representing the real network.

  2. Liveness test
    Original Kademlia and both V4 and V5 Discovery implementations require testing whether the head peer is alive when adding a new peer to a full bucket[21]. The old peer is dropped if it's not alive; the new peer is dropped if the old head is alive. For simulation purposes we use Random.nextBoolean() to decide whether the head is alive. It would be more realistic if the chance of dropping a peer decreased with each subsequent check (a peer that has been online for 2 hours is more likely to still be online in an hour than a peer that has been online for 1 hour, according to the original Kademlia paper[22]), but simulating such behavior increases complexity and is set aside for further improvements.

  3. Choosing a peer to query
    The simulator performs a non-recursive examination of all nodes in a peer's Kademlia table one by one, starting from the closest (by log distance). As peers are added to the table in random order and fullness varies with simulated uptime, this looks very similar to the real distribution of queries and their direction.

  4. Other assumptions
    Simulations are never identical to real network data, and both can vary greatly between measurements taken at different times, or with different inputs in the case of simulation. So, some minor simplifications don't affect the simulator results greatly. Thus, in the simulator, latency is a fixed number, packet sizes are fixed numbers (except NODES, which is fixed for each node count), handshake traffic is not counted as we assume a session is already established and never breaks, etc.

Simulator test cases

The simulation was run with 10,000 peers, a number of nodes of the same order as in the real Ethereum network (about 30,000 currently[23]). As mentioned above, two configurations were used for the distribution of node knowledge among peers. With these configurations, 100 rounds of lookups were performed for each node. 3 kinds of neighbors lookup were used:

  • FindNeighbors. The classic Kademlia method, also used in Discovery V4. Node A requests a list of neighbors from peer B, submitting a PeerId as input. B finds the surroundings of PeerId in its Kademlia table and sends back up to k (the number of nodes in a bucket) nodes in one message (which is actually split into several messages due to UDP protocol properties).
  • FindNodesStrict. The Discovery V5 method for neighbors lookup. It's called FINDNODE in the Discovery V5 documentation[24], but we call it FindNodesStrict for clarity among the other simulated methods. Node A requests the list of nodes that lie at the input distance from peer B, so nodes from B's #distance bucket are returned in the reply. After that, node A calculates whether it should request the surrounding buckets to get more nodes. In the end we want to get the same k nodes near A's peer ID.
  • FindNodesDown. A slightly modified Discovery V5 API method. Node A requests the list of nodes that lie at the input distance from peer B, plus those in buckets with smaller indexes, up to k in total. The idea is to reduce the number of messages and traffic, especially when looking for neighbors in non-top buckets, which usually hold fewer than k nodes. It's a straightforward improvement of the V5 method with some implementation complexity, but we hope it reduces traffic and total lookup time compared to the V5 method.
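The reply side of FindNodesDown can be sketched as follows. This is a hypothetical Java helper, not simulator code: peers are plain strings here, while the real method would operate on node records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class FindNodesDown {
    // Returns the peers in bucket `distance`, then keeps descending to buckets
    // with smaller indexes until k peers are gathered or the buckets run out.
    // Missing bucket indexes are treated as empty.
    static List<String> reply(Map<Integer, List<String>> buckets, int distance, int k) {
        List<String> result = new ArrayList<>();
        for (int d = distance; d >= 1 && result.size() < k; d--) {
            for (String peer : buckets.getOrDefault(d, List.of())) {
                if (result.size() == k) break;
                result.add(peer);
            }
        }
        return result;
    }
}
```

A single such reply covers what FindNodesStrict would need several round trips for, which is exactly the saving the method aims at in sparsely filled tables.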

After 100 rounds of lookups the simulator is stopped, and estimations of total traffic and lookup time (called cumulative latency here) are calculated.

Simulation results

Modern network

Let's start with a young-generation network with α = 0.18, Xm = 1, which leads to a distribution where 40% of peers have passed all other peers through their Kademlia tables, but the youngest peers in the simulation know only about 4 other peers. It's the kind of network where a significant number of participants come and go. In the future, when there are a lot of light nodes or stateless peers, the network could be characterized by an even greater percentage of "newbie" nodes.

Figure 1a

Let's put the statistics from each peer on a graph (Figure 1a), so the area under the curve shows cumulative network traffic. As we see, Kademlia's FindNeighbors is the most efficient, followed by FindNodesDown. FindNodesStrict is the least efficient. But the interesting question is why about 10% of the nodes using Discovery V5 lookup produce significantly more traffic, up to 5 times more than V4. The answer is easier to get from the next graph, latency.

Figure 1b

The latency graph (Figure 1b) gives us an idea (later verified in the logs) of what's happening: nodes perform a lot of requests. But why? First, there are peers with fewer than k (the bucket size) nodes in the whole table, across all buckets. So, looking for k neighbors in such a table using the distance method, we send and receive 256 (the number of buckets) requests. Second, we don't know whether we should go down to smaller buckets, or whether we have already received everything from this node and all subsequent buckets are empty. FindNodesDown fixes this issue: about 3x the cumulative latency of V4's FindNeighbors, but still significantly smaller than the V5 implementation. Remember that in a real network, malicious nodes and buggy versions join the ranks of the "newbies".

But what about a mature network? We should see more and more light clients in the Ethereum network in the near future, but anyway, let's check whether a network without recently joined nodes could be a good fit for V5 neighbors lookup.

Old network

This time we use a Pareto-like knowledge distribution with α = 0.24, Xm = 10, which leads to 50% of "heard about everyone" nodes, with the youngest nodes knowing 35 peers. This set of peers represents a mature network with a lot of old nodes. It could be close to the current state of Ethereum, but it leaves no room for light clients and malicious nodes. Let's see how the different lookup APIs stack up here in terms of overall network traffic.

Figure 2a

Things go much better here (see Figure 2a): FindNodesStrict traffic is only about 1/4 over the old FindNeighbors. And, interestingly, FindNodesDown is the least efficient here; by a slight margin, but for the first time it produces more traffic. So, FindNodesDown is not always better than FindNodesStrict: it's worse in a mature network without recently joined peers. But what about lookup time? Let's check another graph.

Figure 2b

Looking at the latency graph (Figure 2b) with the naked eye, we can say that FindNodesDown takes about 2x the time of V4's FindNeighbors, and FindNodesStrict about 2.5x. Taking both traffic and latency into account, we can say that in a simulated mature network there is no point in replacing FindNodesStrict with FindNodesDown, because we get more traffic in exchange for less time, with more implementation complexity. But V4's FindNeighbors still outperforms V5's FindNodesStrict here in both traffic and delays. The difference is smaller than in a network with a lot of newbies, but still visible.

Discussion

As the simulation results show, both FindNodesStrict and FindNodesDown, the latter being an experiment in improving the former, use more bandwidth than the original Kademlia's FindNeighbors. You could reduce the FindNodesStrict overhead by limiting the total number of queries to 3, as is done in the Geth client but omitted from the protocol specification; this way you will see results similar to the Old Network simulation. And these results are still worse than those measured with Discovery V4.

While the absolute bandwidth numbers are not big in any of the cases and shouldn't be an issue on a typical network connection, the additional queries increase the time of any lookup, worsening secondary features like the ability to find some topic in a minimal amount of time.

It is also worth mentioning that if we stick with the Discovery V5 protocol, received nodes should be verified to lie within the input distance bounds, which is already done in the Geth client but is not part of the specification. Such verification drops nodes that could be included in the answer but don't match the required distance, and protects from basic attacks by malicious nodes.

Finally, V5 nodes can send malicious answers to lookup queries in the same way V4 nodes can, simply by understating the fullness of their buckets to an external observer in answers to exact-distance queries.

References

  1. Petar Maymounkov and David Mazières, "Kademlia: A Peer-to-peer Information System Based on the XOR Metric", 2002
  2. Discovery V4 protocol reference
  3. Wikipedia: Ethereum
  4. Discovery V5 protocol reference
  5. FINDNODE request description in Discovery V5
  6. Discovery V5 lookup protocol description
  7. Node Discovery Protocol v5 - Theory: Nodes, Records and Distances
  8. Wikipedia: Small-world network
  9. Yuval Marcus, Ethan Heilman, Sharon Goldberg, "Low-Resource Eclipse Attacks on Ethereum's Peer-to-Peer Network", 2018, p.14
  10. Andrew Miller, James Litton, Andrew Pachulski, Neal Gupta, Dave Levin, Neil Spring, Bobby Bhattacharjee, "Discovering Bitcoin's Public Topology and Influential Nodes", 2015, p.8-9
  11. Adem Efe Gencer, Soumya Basu, Ittay Eyal, Robbert Van Renesse, Emin Gün Sirer, "Decentralization in Bitcoin and Ethereum Networks", 2018, p.10-11
  12. Yuval Marcus, Ethan Heilman, Sharon Goldberg, "Low-Resource Eclipse Attacks on Ethereum's Peer-to-Peer Network", 2018, p.7-10
  13. Andrew Miller, James Litton, Andrew Pachulski, Neal Gupta, Dave Levin, Neil Spring, Bobby Bhattacharjee, "Discovering Bitcoin's Public Topology and Influential Nodes", 2015, p.12-16
  14. Discovery V5 Theory: Ad placement and topic radius
  15. Discovery V5 Rationale
  16. GitHub repository of Discovery V5 simulator
  17. Wikipedia: Kotlin
  18. KademliaTable class of Discovery Simulator
  19. Stefan Saroiu, Steven D. Gribble, Krishna P. Gummadi, "A Measurement Study of Peer-to-Peer File Sharing Systems", 2002
  20. Yuval Marcus, Ethan Heilman, Sharon Goldberg, "Low-Resource Eclipse Attacks on Ethereum's Peer-to-Peer Network", 2018, p.5
  21. Petar Maymounkov and David Mazières, "Kademlia: A Peer-to-peer Information System Based on the XOR Metric", 2002, p.5
  22. Petar Maymounkov and David Mazières, "Kademlia: A Peer-to-peer Information System Based on the XOR Metric", 2002, p.5
  23. Yuval Marcus, Ethan Heilman, Sharon Goldberg, "Low-Resource Eclipse Attacks on Ethereum's Peer-to-Peer Network", 2018, p.9
  24. Discovery V5 wire protocol: FINDNODE message