# ERC-4337 P2P Spec Feedback

**Edit Date:** 2023/10/12
**Spec Commit:** [2351f17](https://github.com/eth-infinitism/bundler-spec/blob/2351f175b1f58054306d29a3b48e7238715e13bd/p2p-specs/p2p-interface.md)

The following is a collection of feedback for the ERC-4337 P2P specification. I'll be adding to this as I learn more.

## Discovery

There is currently no described mechanism for a node to find peers that support a particular mempool. The discovery process should provide at least some efficiency mechanisms to improve this.

`mempool_nets` is undefined in the spec beyond "mempool subnet subscriptions", which has no meaning. There is a partial definition of how this looks in the ENR [here](https://github.com/eth-infinitism/bundler-spec/blob/main/p2p-specs/p2p-interface.md#mempools-bitfield).

### Status Quo

A reasonable goal for a node is to try to maintain N peers per mempool to improve latency for receiving new UOs. To do this with the current protocol:

1. Discover any peer (no way to filter)
2. Dial that peer
3. Request status/metadata
4. Determine mempool matches, then either keep the peer or drop it.

### Proposal

We are limited to 300 bytes of information that we can pack into an [ENR](https://eips.ethereum.org/EIPS/eip-778#rlp-encoding). Thus, we cannot put each supported mempool ID into the ENR directly.

Instead, we can define a "mempool subnet" as a mempool bucket. A mempool is assigned to a bucket by: `mempool_id % MEMPOOL_ID_SUBNET_COUNT`.

The ENR structure is:

| Key             | Value                                    |
| --------------- | ---------------------------------------- |
| mempool_subnets | SSZ `Bitvector[MEMPOOL_ID_SUBNET_COUNT]` |

Where `MEMPOOL_ID_SUBNET_COUNT` = 1024 (128 bytes as SSZ).

**Note:** A much more efficient structure would be a "sparse" bitvector. Nothing says we have to use SSZ here. We could store only the indexes/values of the non-zero bytes in the bitvector. If `MEMPOOL_ID_SUBNET_COUNT` >> `MAX_SUPPORTED_MEMPOOLS`, this should compress very nicely. This is worth exploring further, as increasing `MEMPOOL_ID_SUBNET_COUNT` reduces the chance of a false positive: a peer that supports the subnet but not the particular mempool.

Now, to discover a peer that supports a given mempool ID:

1. Map the mempool ID to its subnet
2. Discover peers, filtering for ENRs that advertise support for that subnet
3. Dial the peer
4. Request status/metadata
5. Determine mempool matches, then either keep the peer or drop it.

This should reduce the amount of dialing needed to find a peer that supports a particular mempool by a factor of `MEMPOOL_ID_SUBNET_COUNT`.

## Request/Response

### `status`/`metadata`

`status` should be reserved for more dynamic data while `metadata` should be used for more static data. `metadata` should only be requested upon initial handshake, and when a `ping` notices that the `metadata_seq_number` has changed.

We should move `supported_mempools` to `metadata` and add syncing data to `status`.

`status`:

```
(
  block_hash: u256
  block_number: u256
)
```

`block_hash`: Hash of the last processed block.

`block_number`: Number of the last processed block.

**Rationale:** `status` can be used by nodes to determine the syncing progress of a peer. Peers need to keep their mempools up to date or they risk sending invalid user operations. By adding these fields to `status`, a node can determine when a peer is unhealthy. This better tracks how both [devp2p](https://github.com/ethereum/devp2p/blob/master/caps/eth.md#status-0x00) and the [consensus spec](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#status) define their `status` endpoints.
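As a concrete illustration of how a node might consume these fields, here is a minimal sketch of a peer-health check. The `Status` container, the helper name, and the lag threshold are all hypothetical, not part of the spec:

```python
# Hypothetical health check built on the proposed `status` fields.
# The container, helper name, and threshold below are illustrative only.
from dataclasses import dataclass

MAX_BLOCK_LAG = 5  # assumption: how far a peer may lag before we treat it as unhealthy


@dataclass
class Status:
    block_hash: bytes  # hash of the peer's last processed block
    block_number: int  # number of the peer's last processed block


def is_peer_healthy(peer_status: Status, local_block_number: int) -> bool:
    """A peer that is still syncing risks propagating user operations that are no longer valid."""
    return local_block_number - peer_status.block_number <= MAX_BLOCK_LAG
```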
`metadata`:

```
(
  seq_number: uint64
  mempool_nets: Bitvector[MEMPOOL_ID_SUBNET_COUNT]
  supported_mempools: List[Bytes32, MAX_SUPPORTED_MEMPOOLS]
)
```

`seq_number`: Metadata sequence number.

`mempool_nets`: Mempool subnets as defined above.

`supported_mempools`: List of supported mempool IDs.

**Rationale:** Changing the list of supported mempools should be a relatively rare occurrence and should require special processing by the node, including incrementing its metadata sequence number.

#### Current implementation

If we keep the current implementation, we should change the framing for the `status` endpoint. Its `request`/`response` should be sent in a single chunk, not:

```
Responses that consist of a single SSZ-list send each list item as a response_chunk.
```

### `pooled_user_ops_hashes`

Suggestion: drop the `s` => `pooled_user_op_hashes`

#### Drop `mempool` from the request

Peers have already negotiated the common mempools on their connection. Peers should only need to send a single `pooled_user_op_hashes` request and should receive hashes for each mempool in common. This will limit the number of requests a node needs to send.

It is not important which mempool a UO is associated with. The peer needs to re-validate the UO anyway, and can determine its mempool association at that point.

New request:

```
(
  offset: uint64
)
```

`offset`: Offset into a list of UO hashes. This list should be determined once by a peer and should time out.

#### Offset list timeout

The `offset` field only has meaning in the context of a previously sent `pooled_user_op_hashes` request. We should define this as follows:

1. When `offset = 0`, a node should retrieve a list of UO hashes on shared mempools, up to a limit (maybe `MAX_OPS_PER_REQUEST * 8`). If this list is longer than `MAX_OPS_PER_REQUEST`, it should persist the portion that it hasn't yet sent.
2. When `offset != 0`, the node should return UO hashes from the above list, skipping any already sent.
3. After N seconds a node should time out this list, and any request with `offset != 0` should be treated as invalid. Suggestion: `N = 1`.
4. Any request with `offset = 0` causes a new list retrieval and timer restart.

### `pooled_user_ops_by_hash`

The size of this response is currently unbounded. We need to cap the maximum size of the response and mark it as a protocol violation if a sender sends more than this size.

Suggested cap: 10MB

## Gossip

### `user_ops_with_entry_point`

As defined, `user_ops_with_entry_point` will incur large amounts of inefficiency in the gossip network.

Libp2p `gossipsub` deduplicates messages by message ID. Nodes gossip [`IHAVE` messages](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.0.md#ihave) to notify peers of the message IDs that they currently have for a topic. Peers then look at their local cache of message IDs, and if they have not yet received a message for one of those message IDs they issue an [`IWANT` message](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.0.md#iwant) to the peer that gossiped the message ID.
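To make that deduplication flow concrete, here is a rough sketch of the two decision points; the function names and data structures are hypothetical and not taken from any libp2p implementation:

```python
# Illustrative sketch of the gossipsub deduplication flow described above.
# Names and structures are hypothetical, not from libp2p.

def ids_to_request(advertised_ids: list[bytes], seen_cache: set[bytes]) -> list[bytes]:
    """Given the message IDs advertised in an IHAVE, return the ones worth an IWANT."""
    return [mid for mid in advertised_ids if mid not in seen_cache]


def should_accept(message_id: bytes, seen_cache: set[bytes]) -> bool:
    """Return True if a received message is new and should be delivered and forwarded."""
    if message_id in seen_cache:
        return False  # duplicate: already seen (on any topic), drop it
    seen_cache.add(message_id)
    return True
```

All of this hinges on how the message ID is computed.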
The current spec defines message IDs as:

> SHA256(MESSAGE_DOMAIN_VALID_SNAPPY + snappy_decompress(message.data))[:20]

and `user_ops_with_entry_point` as:

```
class UserOperationsWithEntryPoint(Container):
    entry_point_contract: Address
    verified_at_block_hash: uint256
    chain_id: uint256
    user_operations: List[UserOp, MAX_OPS_PER_REQUEST]
```

If two messages have a different list of user ops, they will have different message IDs. Peers will be required to issue `IWANT` messages for these full lists, and then do the `UserOp` deduplication internally. This will massively increase network utilization.

#### Suggestion

The gossip message for user operations should be redefined as `verified_user_operation`:

```
class VerifiedUserOperation(Container):
    verified_at_block_hash: uint256
    user_operation: UserOperation
```

**Note:** Even this adds some inefficiency, as there may be multiple messages for a given UO verified at different block hashes. Ideally a UO should only be verified at 1-2 different blocks, so this shouldn't add too much overhead to the network.

Gossipsub internally batches message IDs into its `IHAVE` and `IWANT` messages, so there is no need to do batching in this message itself.

`entry_point_contract` and `chain_id` are redundant pieces of information. `entry_point_contract` should be implied by the `mempool` definition. `chain_id` should be implied by the network itself. If we feel we need `chain_id` somewhere, we could add it to the ENR or to the `metadata`.

### Mempool Topic Deduplication

The libp2p `Gossipsub` specification contains the [following](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.0.md#message-cache):

> In all pubsub implementations, we can first check the seen cache before forwarding messages to avoid wastefully republishing the same message multiple times.

This `seen` cache is shared across topics and caches message IDs. If a caller attempts to publish the same message ID on two different topics, the 2nd attempt will be deduplicated and will fail.

[Rust implementation](https://github.com/libp2p/rust-libp2p/blob/497f393987db80972f8b832fc2f9e3a57745b576/protocols/gossipsub/src/behaviour.rs#L645)

[Go implementation](https://github.com/libp2p/go-libp2p-pubsub/blob/d13e24ddc9f288557732c9c02b143c09b86c6778/pubsub.go#L1150)

The definition of message ID above doesn't incorporate the topic string it's being sent on, and thus if the same message is sent to two mempool topics, it will be deduplicated on the 2nd publish and will not reach subscribers. If a node has a UO that matches two separate mempools, it can only send that UO to the subscribers of a single pool.

More recent versions of the consensus spec now include the message topic in the message ID. See [here](https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/p2p-interface.md#topics-and-messages).
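For comparison, here is a sketch of the current construction next to a topic-aware one modeled on the Altair-style definition. The helper names are hypothetical, and the exact byte layout for the 4337 spec would still need to be specified:

```python
# Sketch comparing the current message-ID definition with a topic-aware one,
# modeled on the consensus spec's Altair construction. Illustrative only.
import hashlib

MESSAGE_DOMAIN_VALID_SNAPPY = bytes.fromhex("0100000000")  # value suggested in Misc below


def current_message_id(decompressed_data: bytes) -> bytes:
    # Topic is not part of the hash: the same payload published on two mempool
    # topics collides in the shared `seen` cache.
    return hashlib.sha256(MESSAGE_DOMAIN_VALID_SNAPPY + decompressed_data).digest()[:20]


def topic_aware_message_id(topic: str, decompressed_data: bytes) -> bytes:
    # Altair-style: mix the topic length and topic bytes into the hash so the
    # same payload on two topics yields two distinct message IDs.
    topic_bytes = topic.encode()
    return hashlib.sha256(
        MESSAGE_DOMAIN_VALID_SNAPPY
        + len(topic_bytes).to_bytes(8, "little")
        + topic_bytes
        + decompressed_data
    ).digest()[:20]
```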
#### Problem

If we incorporate the topic into the message ID, peers lose the ability to deduplicate their `IWANT` messages. Each UO will be requested by each peer on each mempool that it matches and the peer supports.

#### Options

1. The current message ID definition should change to incorporate the topic ID. This will at least ensure that a UO is sent to each mempool it matches on. The downside here will be network inefficiencies.
2. A small change to `Gossipsub` would remove this issue. The `seen` cache should be split into a read-side cache and a write-side cache. The read-side cache can be cross-topic. The write-side cache should be per topic. This would ensure that a message ID is broadcast on all topics it applies to, but a peer knows to only read that message once. I asked a question about this on the [libp2p forum](https://discuss.libp2p.io/t/gossipsub-message-id-across-topics/2102). It would be helpful to talk to the maintainers.
3. This is a more radical suggestion, but we could switch to an entirely different pubsub implementation, or just use the request/response domain.

For now, I'm much more inclined to go with option (1) and push for the change described in (2) before considering this.

## Misc

`MESSAGE_DOMAIN_VALID_SNAPPY` is undefined. Suggestion: `0x0100000000`

`MESSAGE_DOMAIN_INVALID_SNAPPY` is undefined. Suggestion: `0x0000000000`
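For reference, a sketch of how these two domains are conventionally used in message-ID computation, following the consensus-spec pattern of hashing under the invalid-snappy domain when decompression fails. The constant values are the suggestions above, not spec-defined, and `python-snappy` is assumed purely for illustration:

```python
# Sketch of message-ID domain selection, following the consensus-spec convention:
# hash the decompressed payload under the "valid" domain when snappy decompression
# succeeds, otherwise hash the raw payload under the "invalid" domain.
# The constant values are the suggestions above, not yet spec-defined.
import hashlib

import snappy  # assumption: python-snappy, block (non-framed) format

MESSAGE_DOMAIN_VALID_SNAPPY = bytes.fromhex("0100000000")
MESSAGE_DOMAIN_INVALID_SNAPPY = bytes.fromhex("0000000000")


def message_id(raw_data: bytes) -> bytes:
    try:
        payload = MESSAGE_DOMAIN_VALID_SNAPPY + snappy.decompress(raw_data)
    except Exception:
        payload = MESSAGE_DOMAIN_INVALID_SNAPPY + raw_data
    return hashlib.sha256(payload).digest()[:20]
```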