Gossipsub v1.1 Mesh Promises Scoring Extension Proposal

# Gossipsub v1.1 Mesh Promises Scoring Extension Proposal ## Motivation A possible attack that tries to weaken a gossipsub network is introducing malicious nodes that just send back to each peer the messages they received from that peer. We will call such an attacker node *Copycat*. Currently the speced "Mesh Message Delivery" scorings are expected to penalize such a copycat but this scoring mechanism has some practical drawbacks. To be efficient the `MeshMessageDeliveriesWindow` must be smaller than the time it needs for an average message to get delivered to the attacker and back to the peer which should be in the ballpark of `200ms` for a well-connected attacker. If validating gossipsub messages is a non-trivial task (like in Eth2.0) honest nodes with slow hardware might end up with validation times `>200ms` and then get penalized for not sending any messages within the `MeshMessageDeliveriesWindow`. Furthermore, also honest nodes with a slow internet connection might end up getting penalized. One could argue that it is a feature that slow honest nodes get penalized and thrown out of the mesh, but this might lead to real problems to establish a meaningful mesh for slow nodes and will lead to even more load for them. ## New Scoring Mechanism Description The main idea is to not forward a message with a low probability and then track if all our mesh peers send us the message within some time frame. We formally specify the following parameters (per topic): | Parameter | Type | Description | Constraints | | --------- | ---- | ----------- | ----------- | | `MeshPromiseWeight` | Float | Weight of new scoring mechanism |Must be <0| | `MeshPromiseProbability` | Float | Probability for tracking a message instead of forwarding it | Must be between 0 and 1 | | `MeshPromiseDecay` | Float | Decay factor for the moving relative error | Must be between 0 and 1 | | `MeshPromiseRelativeThreshold` | Float | If the relative error rate is above this threshold the peer gets penalized | Must be between 0 and 1 | | `MeshPromiseMinTotal` | Float | The peer only gets penalized if the total amount of tracked messages is above this value | | `MeshPromiseWindow` | Duration | The time interval each mesh peer has time to send us a message after we received the message the first time | After a new message got validated we randomly decide if we want to track that message as a promise or if we want to forward this message to all peers and promote it via `IHAVE` messages. The probability for tracking the message is `MeshPromiseProbability`. If we decide to track we do not forward the message and store it in an appropriate data structure. After `MeshPromiseWindow` elapsed we check if all mesh peers have sent us the message. We increase a total counter for each mesh peer and recalculate a relative error counter based on whether the peer sent us the message or not. The total counter for each peers decays with decay factor `MeshPromiseDecay` while keeping a minimum of `MeshPromiseMinTotal`. Therefore, the calculated relative error weighs the history with some decaying factor and adds new satisfied/broken promises to it. When the relative counter is above `MeshPromiseRelativeThreshold` while the total is also above `MeshPromiseMinTotal` we penalize the peer by `MeshPromiseWeight * (relative_counter - MeshPromiseRelativeThreshold) ^ 2`. After the `MeshPromiseWindow` passed for some message we forward that message to all mesh peers that broke the promise and didn't send the message to us and we also promote the message via `IHAVE`s starting from this moment. ## Extension: Semantically equivalent messages with different ids In Eth2.0 we have an additional difficulty if we want to use the above mechanism. For some topics we have a lot of rules that should ignore a message if we received already another message that is semantically equal. This will lead to the situation that different nodes have completely different sets of valid gossipsub messages depending on which of the equivalent messages they received first. Since we still want to validate all those messages we cannot give them the same gossipsub id. To fix this problem we propose introducing a new kind of id which we call `semantic_id`. When validating a message the user can optionally provide a `semantic_id`. Two messages with the same `gossipsub_id` must always have the same `semantic_id` but it may be that two messages with the same `semantic_id` have different `gossipsub_id`s. We consider now a mesh promise as fulfilled if a peer sends us at least one message that has the same `semantic_id` as the tracked message. ## Example Implementation We implemented this scoring mechanism and the semantic id extension in `rust-libp2p`, see this branch: https://github.com/sigp/rust-libp2p/tree/mesh-promises. ## TODO - [ ] Test effectiveness of the mechanism in various attacking scenarios and compare it with the Mesh Message Deliveries mechanism. - [ ] Test the impact of not forwarding some of the messages with low probability on the health of a network if all nodes use this mechanism. - [ ] Get to an agreement whether this proposed mechanism is worth the spec changes. - [ ] Also add the mechanism to other gossipsub implementations (especially the go-implemenation)