# libp2p and Prysm
This document outlines some of the areas where Prysm is seeing networking issues as the number of nodes in its testnet increases, and how the Prysm team believes libp2p could help by allowing information gathered at the protocol layer to inform the network layer.
# Ethereum 2 protocol layer
When one Ethereum 2 node connects to another, the first thing they do is exchange handshake messages. These messages allow the peers to understand something about each other at the protocol level, and are used to decide whether the peers want to talk to each other or disconnect.
Some nodes are considered "bad" at the protocol layer. This can be for a number of reasons, for example:
- not responding to a handshake request
- responding to a handshake request with incorrect configuration (e.g. fork version, genesis hash)
- responding to a handshake request with incompatible information (e.g. supported protocol version)
Other nodes may have qualitative differences, for example:
- response time
- number of bad responses/timeouts
- supported protocols (e.g. one peer supports v1, another peer supports v1 and v2, we support both but prefer v2)
- latest block synced
These qualitative values can themselves be moderated by the global state. For example, a slow peer may not be of interest if a node has many other connected peers, but if it has no other connected peers then a slow peer is better than nothing.
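One way to picture a qualitative value being moderated by global state is a latency limit that relaxes as the node runs short of peers. The following is a minimal sketch only: `PeerStats`, `MaxTolerableLatency`, `WorthKeeping`, and both latency constants are hypothetical names and values chosen for illustration, not part of Prysm or libp2p.

```go
package main

import "time"

// PeerStats holds a hypothetical per-peer measurement gathered at the protocol layer.
type PeerStats struct {
	ResponseTime time.Duration // typical time the peer takes to answer a request
}

// Illustrative limits only; real values would need tuning.
const (
	strictLatency  = 500 * time.Millisecond // limit applied once we have enough peers
	lenientLatency = 5 * time.Second        // limit applied when we have no peers at all
)

// MaxTolerableLatency relaxes the latency limit linearly as the number of
// connected peers falls below the target.
func MaxTolerableLatency(connected, target int) time.Duration {
	if target <= 0 || connected >= target {
		return strictLatency
	}
	if connected < 0 {
		connected = 0
	}
	span := lenientLatency - strictLatency
	return lenientLatency - span*time.Duration(connected)/time.Duration(target)
}

// WorthKeeping reports whether a peer is acceptable given the current global state.
func WorthKeeping(p PeerStats, connected, target int) bool {
	return p.ResponseTime <= MaxTolerableLatency(connected, target)
}
```

With these illustrative numbers, a peer that takes two seconds to respond is kept when the node has no other peers but rejected once the node has reached its target.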
As the number of nodes in Ethereum 2 grows so will the number of bad or poor nodes.
# Situations that benefit from interaction between layers
The information held at the protocol layer as outlined above can have a positive impact on choices made by libp2p in deciding which peers to connect with, use as part of a pubsub mesh, and from which to disconnect. Specific situations are examined below; these are probably not exhaustive but do cover the main issues faced today.
## Connection gating
Connections are not free: they come with a cost in terms of CPU and bandwidth. At present, stable nodes see a continual stream of connects and disconnects due to a mismatch between the network and protocol layers: the network layer tries to connect to any and all peers it finds, while the protocol layer is more discerning and mindful of resources.
### Bad peers
Even at this early stage of the network there are peers that cause problems, either through misconfiguration or bugs in implementation. The protocol layer detects these peers and disconnects them, but this does not stop them from attempting to reconnect:
```
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:23+01:00"}
{"activePeers":1,"direction":1,"level":"debug","msg":"Peer handshaking","multiAddr":"/ip4/54.160.155.134/tcp/13000/p2p/16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:23+01:00"}
{"error":"stream reset","level":"debug","msg":"Handshake failed","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:23+01:00"}
{"active":1,"level":"debug","msg":"Peer disconnected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:23+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:23+01:00"}
{"activePeers":1,"direction":1,"level":"debug","msg":"Peer handshaking","multiAddr":"/ip4/54.160.155.134/tcp/13000/p2p/16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:23+01:00"}
{"error":"stream reset","level":"debug","msg":"Handshake failed","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:23+01:00"}
{"active":1,"level":"debug","msg":"Peer disconnected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:23+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:40+01:00"}
{"activePeers":10,"direction":1,"level":"debug","msg":"Peer handshaking","multiAddr":"/ip4/54.160.155.134/tcp/33192/p2p/16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:40+01:00"}
{"error":"stream reset","level":"debug","msg":"Handshake failed","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:40+01:00"}
{"level":"debug","msg":"Peer has given too many bad responses; will ignore in future","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:40+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:41+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:52+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:16:52+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:17:03+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:17:03+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:17:15+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:17:17+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:17:20+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:17:21+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:17:38+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:17:38+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:17:42+01:00"}
{"level":"debug","msg":"Connected","peer":"16Uiu2HAmKKTgBtBAkfi3B3QjtoxXR1pVwh5Ym1Q3G8vjWmiDFnqz","prefix":"p2p","time":"2019-12-13T02:17:43+01:00"}
```
As can be seen above, although the protocol layer has marked the peer as bad, the node still has to deal with it: the peer makes repeated connection attempts, each of which has to be shut down.
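The missing piece in the log above is a check that the network layer can consult before it spends resources on a known-bad peer. A minimal sketch of such a check follows; `BadPeerGate` and its methods are hypothetical and do not correspond to an existing Prysm or libp2p API.

```go
package main

import "sync"

// BadPeerGate is a hypothetical record of handshake failures, keyed by peer ID.
type BadPeerGate struct {
	mu        sync.Mutex
	threshold int
	failures  map[string]int
}

// NewBadPeerGate creates a gate that rejects peers once they reach threshold failures.
func NewBadPeerGate(threshold int) *BadPeerGate {
	return &BadPeerGate{threshold: threshold, failures: make(map[string]int)}
}

// RecordFailure would be called by the protocol layer after a failed handshake.
func (g *BadPeerGate) RecordFailure(peerID string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.failures[peerID]++
}

// Allow is what the network layer would call before dialing a peer or
// accepting an inbound connection from it.
func (g *BadPeerGate) Allow(peerID string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.failures[peerID] < g.threshold
}
```

If the network layer called `Allow` before each of the later "Connected" events in the log, the peer could be turned away before a connection is established rather than after.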
### Maximum capacity node
A single Ethereum 2 node has a maximum number of peers it wishes to connect with. This may be due to network or other constraints, or to ensure balance between connectivity and resources to process incoming messages. At present, situations like the one below are common:
```
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAm1oYE38s5JE9TkhoY9xGm9w6FH469hpJyXmpApaAeUSFX","prefix":"p2p","time":"2019-12-17T06:34:20+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAm1oYE38s5JE9TkhoY9xGm9w6FH469hpJyXmpApaAeUSFX","prefix":"p2p","time":"2019-12-17T06:34:21+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAm2yRH4W9BxiQP7Mp99ybroYUvhvkzY88nLLKREFQFKanP","prefix":"p2p","time":"2019-12-17T06:34:22+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAm2yRH4W9BxiQP7Mp99ybroYUvhvkzY88nLLKREFQFKanP","prefix":"p2p","time":"2019-12-17T06:34:22+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAmEm8J4H39S4T3YMHjA57JRHuG1GjDgk39s5yTqjrqFLzp","prefix":"p2p","time":"2019-12-17T06:34:23+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAmEm8J4H39S4T3YMHjA57JRHuG1GjDgk39s5yTqjrqFLzp","prefix":"p2p","time":"2019-12-17T06:34:23+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAmRU6SJpe5aER1ieNQV7rGD3E4M6Vj34aMwdydvsTU2JBQ","prefix":"p2p","time":"2019-12-17T06:34:29+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAmUVvgkZMYj2qriHeYP4FwjwdG87479eYdhdjy2pTAdnbY","prefix":"p2p","time":"2019-12-17T06:34:30+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAmUVvgkZMYj2qriHeYP4FwjwdG87479eYdhdjy2pTAdnbY","prefix":"p2p","time":"2019-12-17T06:34:31+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAmRU6SJpe5aER1ieNQV7rGD3E4M6Vj34aMwdydvsTU2JBQ","prefix":"p2p","time":"2019-12-17T06:34:31+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAmCVsokwnprH7QJyCMh9FuSVnQwV9a4hQ5RjzmZnbNyuYv","prefix":"p2p","time":"2019-12-17T06:34:32+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAkyNfwi6sRPwoG6qP1viXcHAssNs1apGueCQ6MbQ9dGC3v","prefix":"p2p","time":"2019-12-17T06:34:34+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAm1oYE38s5JE9TkhoY9xGm9w6FH469hpJyXmpApaAeUSFX","prefix":"p2p","time":"2019-12-17T06:34:36+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAmCVsokwnprH7QJyCMh9FuSVnQwV9a4hQ5RjzmZnbNyuYv","prefix":"p2p","time":"2019-12-17T06:34:36+01:00"}
{"level":"debug","msg":"We have enough peers; disconnecting","peer":"16Uiu2HAm1oYE38s5JE9TkhoY9xGm9w6FH469hpJyXmpApaAeUSFX","prefix":"p2p","time":"2019-12-17T06:34:38+01:00"}
```
Here it can be seen that multiple peers make continual attempts to connect to a node even though it has already reached its maximum number of peers.
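A capacity check of the kind the protocol layer applies here could be exposed so that it runs before a connection is fully set up. The sketch below assumes a hypothetical `PeerLimiter` type; the name and its methods are illustrative only.

```go
package main

import "sync"

// PeerLimiter is a hypothetical tracker for the set of peers a node is willing to hold.
type PeerLimiter struct {
	mu    sync.Mutex
	limit int
	peers map[string]struct{}
}

// NewPeerLimiter creates a limiter that admits at most limit distinct peers.
func NewPeerLimiter(limit int) *PeerLimiter {
	return &PeerLimiter{limit: limit, peers: make(map[string]struct{})}
}

// TryAdd reports whether the peer may be connected. Peers that are already
// counted are always admitted; new peers are refused once the limit is reached.
func (l *PeerLimiter) TryAdd(peerID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, ok := l.peers[peerID]; ok {
		return true
	}
	if len(l.peers) >= l.limit {
		return false
	}
	l.peers[peerID] = struct{}{}
	return true
}

// Remove frees the slot held by a peer, for example after it disconnects.
func (l *PeerLimiter) Remove(peerID string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.peers, peerID)
}
```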
## Connection selection
There are situations where libp2p needs to connect to a number of peers from a given candidate list. At present this selection is random as far as the protocol layer is concerned, which can result in poor or bad peers being chosen over known good peers. Poor peer selection can result in more disconnects and resultant network instability.
Notwithstanding the above, there is a requirement to give new peers the chance to talk to us, so that nodes can build an understanding of their peers even if they do not wish to talk to them at present. This means that simplistic systems, such as a count of peers, would not suffice for all situations.
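One possible way to balance preference for known good peers against giving new peers a chance is to reserve a few slots for peers with no history. The sketch below is an assumption about how that might look: `Candidate`, `SelectPeers`, and the `reserved` parameter are hypothetical, and unknown peers are taken in list order purely to keep the example deterministic (a real system would more likely pick them at random).

```go
package main

import "sort"

// Candidate is a hypothetical view of a peer as seen by the protocol layer.
type Candidate struct {
	ID    string
	Score float64 // higher is better; ignored when Known is false
	Known bool    // false for peers we have never talked to
}

func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// SelectPeers picks up to n peers: the best-scoring known peers first, with up
// to reserved slots kept for unknown peers. Slots that one group cannot fill
// are handed to the other group.
func SelectPeers(candidates []Candidate, n, reserved int) []string {
	if n <= 0 {
		return nil
	}
	if reserved < 0 {
		reserved = 0
	}
	if reserved > n {
		reserved = n
	}
	var known, unknown []Candidate
	for _, c := range candidates {
		if c.Known {
			known = append(known, c)
		} else {
			unknown = append(unknown, c)
		}
	}
	// Order known peers from best to worst score.
	sort.SliceStable(known, func(i, j int) bool { return known[i].Score > known[j].Score })

	nUnknown := minInt(reserved, len(unknown))
	nKnown := minInt(n-nUnknown, len(known))
	// If there were too few known peers, give the spare slots to unknown peers.
	nUnknown = minInt(n-nKnown, len(unknown))

	selected := make([]string, 0, nKnown+nUnknown)
	for _, c := range known[:nKnown] {
		selected = append(selected, c.ID)
	}
	for _, c := range unknown[:nUnknown] {
		selected = append(selected, c.ID)
	}
	return selected
}
```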
## Pubsub mesh
libp2p forms a mesh of connections between peers to allow for gossiping of messages. Ideally the mesh should be relatively long-lived to avoid either over-connectivity or under-connectivity. However, because libp2p builds the mesh from peers that are connected at the network layer, problems can arise when the protocol layer disconnects one of those peers for some reason.
# Suggested approach
The general approach is to allow the protocol layer to inform the network layer. Specifically, there are three situations where protocol layer information can help:
- At the point that libp2p needs to select a peer for connection from a list of candidates, it could ask the protocol layer to order the list in terms of preference*
- At the point that libp2p is about to start a connection attempt, or once it has received a connection from a remote peer, it could ask the protocol layer if it should continue or disconnect immediately
- At the point that libp2p needs to decide on peers for its mesh it could ask the protocol layer to provide a list of peers that are currently connected at the protocol layer, and only use these as candidates for connection
These three changes, working in conjunction with the existing libp2p features, should result in less time spent needlessly connecting and disconnecting peers, and allow for longer-lived connections and more reliable gossiping of messages.
*It is recognized that this ordering can have a negative impact on building a p2p network, with new peers unable to join due to having no score. However, this can be mitigated by good design of the scoring system which is part of the Ethereum 2 network.
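The three suggested hooks could be pictured as a single interface that the protocol layer implements and the network layer consults. This is a sketch under stated assumptions: `ProtocolAdvisor`, its method names, `ChooseMeshPeers`, and `staticAdvisor` are all hypothetical and are not an existing libp2p interface.

```go
package main

// ProtocolAdvisor is a hypothetical interface through which the network layer
// asks the protocol layer for guidance.
type ProtocolAdvisor interface {
	// OrderCandidates returns the candidates sorted by protocol-layer preference.
	OrderCandidates(candidates []string) []string
	// AllowConnection reports whether a connection to or from the peer should proceed.
	AllowConnection(peerID string) bool
	// MeshCandidates lists peers currently connected at the protocol layer.
	MeshCandidates() []string
}

// ChooseMeshPeers keeps only those network-layer peers that the protocol layer
// also considers connected, preserving their original order.
func ChooseMeshPeers(networkPeers []string, advisor ProtocolAdvisor) []string {
	connected := make(map[string]bool)
	for _, id := range advisor.MeshCandidates() {
		connected[id] = true
	}
	var mesh []string
	for _, id := range networkPeers {
		if connected[id] {
			mesh = append(mesh, id)
		}
	}
	return mesh
}

// staticAdvisor is a trivial in-memory implementation used only for illustration.
type staticAdvisor struct {
	banned  map[string]bool
	healthy []string
}

// OrderCandidates leaves the list unchanged; a real advisor would sort by score.
func (a staticAdvisor) OrderCandidates(candidates []string) []string { return candidates }

// AllowConnection rejects peers that the protocol layer has marked as bad.
func (a staticAdvisor) AllowConnection(peerID string) bool { return !a.banned[peerID] }

// MeshCandidates returns the peers that completed a protocol-layer handshake.
func (a staticAdvisor) MeshCandidates() []string { return a.healthy }
```

Keeping the three questions behind one interface would let libp2p stay ignorant of Ethereum 2 specifics while still benefiting from what the protocol layer knows.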