Portal Summit Berlin 2025
Day 1 - March 10th, 2025
Portal deep dive by Piper
Portal Network goals
- Stewardship of Ethereum data
- Ensure decentralized storage and access to Ethereum’s history and state data
- Serve as a 3rd pillar of Ethereum client development, alongside EL and CL clients
- Lightweight access to Ethereum data
- EL clients can serve JSON-RPC requests from Portal Network for both history & state data
- Not aimed at the hot path
- Portal is not designed for time-sensitive tasks (eg. block building)
How Portal works
- Network basics
- UDP-based communication
- Portal Network is built on UDP, rather than TCP, meaning no delivery or order guarantees
- Allow for connectionless, stateless communication, reducing peer pool limitations
- DHT & Discv5 basics
- Portal uses DHT to store & retrieve content
- Node distance is measured using an XOR metric
- Routing table: Nodes maintain buckets of peers sorted by distance; closer peers can store & retrieve data more efficiently
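The XOR metric and bucket placement above can be sketched in a few lines (illustrative Python; `node_distance` and `bucket_index` are hypothetical helper names, not Portal client APIs):

```python
import hashlib

def node_distance(id_a: bytes, id_b: bytes) -> int:
    """XOR distance between two 256-bit node IDs, as an integer."""
    return int.from_bytes(id_a, "big") ^ int.from_bytes(id_b, "big")

def bucket_index(distance: int) -> int:
    """Kademlia-style bucket: index of the highest differing bit
    (undefined for identical IDs, where distance is 0)."""
    return distance.bit_length() - 1

# Example: two node IDs derived by hashing arbitrary labels
a = hashlib.sha256(b"node-a").digest()
b = hashlib.sha256(b"node-b").digest()
bucket = bucket_index(node_distance(a, b))
```

Peers in higher-numbered buckets are farther away; lookups walk toward lower buckets.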
- Portal content storage & retrieval
- Content keys
- Data in Portal is identified by content keys, which map to content IDs (hashes of the content)
- Nodes store content that is close to them in the keyspace, ensuring efficient retrieval
- Content storage: data radius
- Each node in Portal has the agency to decide how much data it wants to store, based on its data radius (area around a node in the network address space)
- Data radius can be defined as a fixed storage capacity (eg. 2GB) or a fixed percentage of the total data (eg. 7%)
- Content retrieval
- Use iterative DHT routing, where a node queries progressively closer peers until it reaches one within its data radius that stores & serves the requested content
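A minimal sketch of the storage decision described above, assuming sha256-derived content IDs (helper names are hypothetical):

```python
import hashlib

def content_id(content_key: bytes) -> int:
    """Derive a content ID by hashing the content key (sha256 here)."""
    return int.from_bytes(hashlib.sha256(content_key).digest(), "big")

def in_radius(node_id: int, cid: int, radius: int) -> bool:
    """A node stores/serves content whose XOR distance from its own ID
    falls within its advertised data radius."""
    return (node_id ^ cid) <= radius

# A node covering the whole keyspace advertises the maximum radius;
# fixed-capacity nodes shrink their radius as storage fills up.
MAX_RADIUS = 2**256 - 1
```

During retrieval, a querying node iterates toward peers whose `in_radius` check passes for the requested content ID.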
- Discv5 protocol
- Based on EIP-778 (ENR), Discv5 provides node discovery and encrypted communication over UDP
- Enable data retrieval, peer routing and content queries through message pairs (PING/ PONG, FINDNODES/ NODES, TALKREQ/ TALKRESP)
- Portal Network primarily relies on the TALKREQ/ TALKRESP message pair for custom protocol communication
- Portal Wire Protocol
- Overlay Networks
- Use separate routing tables for different networks (eg. state, history, and beacon), each maintaining its own DHT on top of Discv5
- Portal Wire Protocol messages
- Rely on 4 message pairs (PING/PONG, FINDNODES/NODES, OFFER/ACCEPT, FINDCONTENT/FOUNDCONTENT) built on Discv5's TALKREQ/TALKRESP for liveness checks, routing, content retrieval, and data sharing
- Portal Wire Protocol data transfer
- Use uTP for large payloads to overcome UDP's packet size limitation
- Use UDP for small payloads that fit in the packet limit
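The transport choice reduces to a size check; a sketch, using the roughly 1,000-byte Discv5 packet budget as an assumed threshold (the exact limit varies with encoding overhead):

```python
# Assumed illustrative threshold, not a spec constant: Discv5 TALKREQ
# payloads are limited to roughly 1,000 bytes, so larger content must
# fall back to a uTP stream.
MAX_UDP_PAYLOAD = 1000

def choose_transport(payload: bytes) -> str:
    """Pick plain UDP for small payloads, uTP for large ones."""
    return "udp" if len(payload) <= MAX_UDP_PAYLOAD else "utp"
```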
Portal Network
- 3 core pillars
- Beacon network
- History network
- Storage and retrieval of historical headers, block bodies, and receipts
- Use cases
- Enable JSON-RPC API access to Ethereum's full history
- Support full history sync for EL clients
- Ephemeral block header update
- The ephemeral block header feature allows tracking the chain's head while ensuring only verifiable data is stored by ignoring the ancestor_count in content ID generation
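A minimal sketch of that content-ID rule, assuming a simple hash-of-block-hash derivation (not the spec's exact key encoding):

```python
import hashlib

def ephemeral_header_content_id(block_hash: bytes) -> bytes:
    """The ancestor_count field of the content key is deliberately left
    out of the hash, so the same header maps to the same content ID
    regardless of how many ancestors a node advertises with it."""
    return hashlib.sha256(block_hash).digest()
```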
- State network
- Access to recent and archival account state and contract storage
- Two storage models: Full-trie model & Flat model
- Full-trie model (near ready for deployment): Store the entire history trie, allowing for archive-level access and exclusion proofs
- Flat model (early R&D): Enable O(1) direct access to the state, designed for lower latency and faster response
- Relation among the 3 sub-networks
- Each subnetwork operates as a separate DHT with its own routing table and data storage
- State network relies on History network to prove the validity of state data
- History network relies on Beacon network to verify the canonical head of the chain
- Portal light client use cases
- Wallets, CLI tools, IDEs: Light clients can use Portal to access history & state data without running a full node
- EL clients: Serve JSON-RPC requests using data retrieved from the Portal, even before completing a full sync
Open discussion
- How does state expiry improve storage efficiency?
- State expiry would improve the State Network by creating a clear separation between active (hot) state and historical (frozen) state
- Frozen state data becomes static and easier to store efficiently, while active state remains dynamic and limited in size
- Allow for dense historical storage and efficient handling of state changes over time
- Why does Portal need to keep up with the head state?
- It allows light clients to read head state data without relying on a centralized provider
- The head state is the most valuable but also the most challenging to handle due to frequent changes and reorgs
- Need further exploration on the architecture design
- How should transactions be indexed for fast lookup?
- Potentially a dedicated network (canonical transaction indexing) mapping transaction hash to block hash in the future
Execution client integration plan review
Nethermind
- Pre-merge data: Ready to drop
- Sync strategy: Head backward syncing (from the latest block to the merge), support fresh sync only
- JSON-RPC handling: Rely on Portal Network
- Portal integration: Developing its own Portal implementation
- Devp2p handling: Undecided on how to handle pre-merge data requests on the devp2p layer
- Era file: Use era files for downloading large chunks of historical data
Geth
- Pre-merge data: Plan to drop
- Sync strategy: Sync from the merge
- JSON-RPC handling: Returns errors for JSON-RPC queries related to pre-merge data
- Portal integration: Plan to link Shisui's codebase into Geth later
- Devp2p handling: Working on a new protocol version
- Era file: Use era files for loading full history
Reth
- Pre-merge data: May retain pre-merge data, allowing users to choose
- Sync strategy: Supports full sync via centralized services
- JSON-RPC handling: Returns errors for JSON-RPC queries related to missing data
- Portal integration: May integrate with Trin for full sync and RPC queries
- Devp2p handling: May continue to support, or proxy to Portal
- Era file: Undecided for now
Nimbus & Ethereum JS
- Pre-merge data: Ready to drop
- Sync strategy: Will use era files or Portal for syncing
- JSON-RPC handling: Rely on Portal Network
- Portal integration: Plan to integrate Portal client for JSON-RPC first, then block syncing
- Devp2p handling: May continue to support devp2p, answering requests if data is available, or proxy to Portal Network
- Era file: Use era files for block syncing
Erigon & Besu
- No updates provided (absent from the Summit)
State network deep dive by Milos and Kolby
Current status of the State Network
- State coverage
- State network currently stores the first 1 million blocks of state data and has seeded 21 million blocks of state data using 16 large nodes
- State network is catching up from 21 million blocks to the head of the chain
- In a few months, the network will catch up to the head with a delay of 8-16 blocks
- Ideally the target will be ~1.5 slots delay for Portal clients
- Gossip strategy
- Currently each node attempts to gossip slices of data to 2 nodes in parallel, with a fallback to 6 more nodes if initial attempts fail or get rejected
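A sequential sketch of that fanout-with-fallback strategy (the real clients issue the first 2 offers in parallel; `offer` is a hypothetical callback returning True on acceptance):

```python
import random

def gossip_offer(content, peers, offer, parallel=2, fallback=6):
    """Offer `content` until `parallel` peers accept, drawing from up to
    `fallback` extra candidates when an offer fails or is rejected.
    Returns the number of peers that accepted."""
    candidates = random.sample(peers, min(len(peers), parallel + fallback))
    accepted = 0
    for peer in candidates:
        if offer(peer, content):
            accepted += 1
            if accepted == parallel:
                break
    return accepted
```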
Current challenges and Potential solutions
- Challenges
- State propagation
- Propagating state updates (~4,000 key-value pairs per block) across the network efficiently is challenging, with current success rates ranging from 50% to 90%
- Gossiping failure rates increase when nodes reject data as they already have it
- Performance bottlenecks
- The current Portal client implementation is inefficient (unsaturated CPU, disk, or I/O usage observed from Trin)
- Solutions
- Client implementation: Optimize clients to improve state propagation (eg. change mutex for read-write locks etc.)
- Gossip protocol: Add protocol-level feedback to distinguish between “already have the data” and “reject the data”
- EL client contribution: EL clients can contribute to Portal by randomly gossiping slices of state data, using their node IDs to determine their assigned portion. Goal is to achieve emergent coordination among clients, avoiding overlap and saturation
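The emergent-coordination idea can be sketched as a deterministic node-ID-to-slice mapping (a simple modulo here; the actual assignment scheme is still undecided):

```python
def assigned_slice(node_id: int, num_slices: int) -> int:
    """Deterministically map a node ID to one of `num_slices` gossip
    slices, so clients spread the work without explicit coordination."""
    return node_id % num_slices

# Each EL client would gossip only the state slice its node ID points
# at; with enough participants, coverage emerges without heavy overlap.
```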
Flat model intro
- Goal and scope
- Enable faster access to state info without traversing the full trie
- Allow data correctness with proofs
- Mainly apply to recent state history rather than the entire state trie, as scaling the entire trie is impractical
- Data mapping & storage structure
- The state data is mapped to a content ID space
- Nodes store continuous slices of the trie, incl. leaves and intermediary nodes, within a certain window (eg. 128 - 8,000 blocks)
- Updates are propagated along paths, and nodes maintain recent state data
- Proofs and verification
- Nodes store proofs to verify the correctness of data
- If a node doesn’t have the requested data, it provides an exclusion proof, and the requester falls back to querying other nodes or to the full-trie model
- Routing and network consideration
- Use a distance-based approach (eg. distance ± 5) with proportional buckets to organize peers in the network
- To avoid hotspots, the flat model shifts the starting point of storage trees in the address space, ensuring even distribution of storage among nodes
- Complement with Stateless
- The flat model provides fast access to recent state, and the stateless model ensures nodes can reconstruct state without storing the entire history
- But the flat model still requires nodes to reinitialize older state data and store recent state data to ensure full accessibility
- Implementation timeline
- Still in the R&D phase, with implementation expected to begin in the coming months
Open questions
- How to balance the trade-off between state freshness and network scalability as the network grows?
- What is the optimal size for the recent state window in the flat model (eg. 128, 256, or 8,000 blocks)? Or should it be configurable?
- How should the transition from the current MPT to future stateless model be managed to ensure fast data access, avoid gaps, and determine the optimal point where old model can be completely phased out?
Discv5 upgrade by Felix
Issue with current Discv5.1
- TALKREQ limitation
- Originally designed for LES, for single-packet request/response interactions
- Portal Network is repurposing TALKREQ for message transport, but packet size is limited (~1,000 bytes) and allows only a single response per request
- Session management
- Discv5 assumes that responses are served quickly from memory, but this becomes problematic when disk access or locks are involved, leading to timeouts
- The protocol lacks clear semantics for session establishment, especially when sessions are dropped or re-established, causing issues like handshake failures and message resends
- Need better handling of session initiation and re-establishment when nodes forget session state
- NAT challenge
- Nodes behind NATs struggle to establish direct connections, as Discv5 relies on UDP packets that may be blocked by firewalls
- The current version does not handle NAT traversal efficiently, leading to connectivity issues for nodes in restrictive network environments
Proposed upgrade on Discv5.2
- NAT traversal
- Relay mechanism
- A relay node is used to facilitate communication between two nodes behind NATs
- The relay must have an active session with both nodes and maintain regular communication to keep NAT holes open
- Require a new session notification packet that allows nodes to initiate communication without requiring a direct response
- Liveness check
- Increase the frequency of liveness checks between nodes to maintain NAT holes
- Potentially introduce lightweight ping mechanisms to ensure nodes remain reachable without overloading the network
- Sub-protocol data transmission
- Session-based communication
- Introduce a subprotocol mechanism to handle longer-term, data-intensive interactions (eg. file transfers or large data retrievals)
- Allow nodes to establish dedicated sessions for specific tasks, bypassing the limitations of the TALKREQ
- Asymmetric communication
- Add support for one-way notifications (eg. session notifications) that do not require immediate responses, reducing latency and improving efficiency
- Optimize UTP integration
- Need to modify the uTP spec and create a custom version
- Optimize overhead and performance through potential simplified uTP headers, skipping initial uTP connection setup, and multiplexing on the same socket
- Protocol enhancements
- Packet flags
- Add flags to packets to indicate whether they are part of an existing session or a new session, improving session management and reducing handshake failures
- Introduce a response flag to distinguish between requests and responses, preventing unnecessary handshake attempts
Implementation plan
- High priority for Portal teams to implement or make significant progress on Discv5.2 by the end of 2025
EIP-4444 implementation plan review

Context
- EL clients are working on implementing EIP-4444, which allows dropping pre-merge block bodies and receipts
- The discussion focus is on post-merge data and how to handle the rolling window
Key consideration
- Sync performance & Finality
- A rolling window complicates sync strategies, especially for clients that rely on snap sync
- The rolling window must align with finality guarantees to ensure that dropped data does not compromise chain security
- Rolling Window Size
- Keeping only the last 8192 blocks (~27h) is considered aggressive and may not align with finality guarantees
- A larger window (eg. 5 months as a lower bound) might be more reliable for syncing and data recovery
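The ~27h figure for the 8192-block window follows directly from the 12-second slot time:

```python
SECONDS_PER_SLOT = 12
WINDOW_BLOCKS = 8192

# 8192 blocks * 12 s = 98,304 s, i.e. just over a day
window_hours = WINDOW_BLOCKS * SECONDS_PER_SLOT / 3600
```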
- Data availability
- Era file
- Era files are effective for pre-merge data but may not be suitable for post-merge data
- A standardized format supported by multiple clients is needed for post-merge data
- Portal Network
- Portal serves as the p2p solution for data storage & retrieval
- Deposit contract logs
- Accessing deposit contract logs is an issue until the Pectra upgrade
- Header retention
- Should headers be dropped in the future, or is the storage savings too minimal to justify the engineering effort?
Devconnect 2025 target
- Will finalize the rolling window decision based on testing and client readiness
Day 2 - March 11th, 2025
Purpose and background
- The goal of EIP-7745 is to reduce the cost of updating and searching logs while maintaining the ability to search for specific log addresses, topics, and patterns etc.
- Existing search structures (eg. Bloom filters) are inefficient for searching historical logs
- EIP-7745 introduces Filter Maps, a 2D log filter data structure that balances update cost vs search efficiency
- A prototype exists in Geth, showing significant performance improvement in log search (in milliseconds rather than in minutes)
Tradeoff of structuring log data
- Linear list approach: Cheap to update, but expensive to search, DHT-friendly
- Single tree approach: Expensive to update, but cheap to search, not DHT-friendly
- Need a better tradeoff between update cost and search efficiency
The solution: Filter maps
- Fixed-sized structure with probabilistic collision filtering
- Logs are grouped into epochs, making them DHT-friendly and efficiently distributed
- If a single tree fits into memory, update costs remain close to a linear index
- The fixed tree shape allows for simple generalized merkle proofs
Key features & design choice
- Efficient database access and merkle proofs
- Queries now take only milliseconds instead of minutes
- Support provable JSON-RPC API responses
- Row mapping and collision mitigation
- Use row mapping to locate log values
- Use multi-layer mapping structure to avoid collision, especially for heavily used log values (eg. ERC20 transfer)
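A toy version of the row/column mapping idea (constants and hashing below are illustrative, not the EIP's actual parameters):

```python
import hashlib

MAP_HEIGHT = 2**12  # rows per filter map (illustrative)
MAP_WIDTH = 2**16   # columns per filter map (illustrative)

def row_of(log_value: bytes, layer: int = 0) -> int:
    """Map a log value (address or topic) to a row; bumping `layer`
    remaps heavily used values to mitigate collisions."""
    h = hashlib.sha256(log_value + layer.to_bytes(4, "big")).digest()
    return int.from_bytes(h[:4], "big") % MAP_HEIGHT

def column_of(log_value: bytes, log_index: int) -> int:
    """The column encodes where in the map's range the log occurred,
    so a row scan recovers candidate positions to verify."""
    h = hashlib.sha256(log_value + log_index.to_bytes(8, "big")).digest()
    return int.from_bytes(h[:4], "big") % MAP_WIDTH
```

Searching a value then means scanning its row across maps and checking candidate columns, rather than probing a Bloom filter per block.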
- Minimal state requirement
- Memory footprint hard-capped (~21MB for minimal state updates)
- Prevent log indexing from getting denser over time
- Anchor the log index root with different options
- Preferred: Replace log bloom filter with the log index root hash (require protocol upgrade)
- Other options
- Generate zkp of the log index root
- Sign log index roots with a set of trusted keys
- Full nodes can take advantage of the log indexing tech without anchoring
- Distribute and access the log index data
- Epochs are self-contained and can be distributed in the DHT based on epoch index
- Non-finalized epochs require updates before finalization
- Each node can serve the standard pattern matching request for stored epochs
- Transaction by hash lookup with exclusion proof can also be implemented
Open discussion
- Search speed improvement
- Current log searches take tens of minutes, but Filter Maps can reduce this to milliseconds
- Efficient log search (eg. search a specific event across all history)
- Allow users to retrieve data by downloading only ~2 MB for a full-history search, instead of GBs of headers and receipts
- JSON-RPC API design enhancements
- Need better API design to support efficient historical searches and provable responses
- Portal Network integration
- The main challenge will be handling the data at the head of the chain and anchoring it effectively
- Potential solution could be adding a field to the header to store the root hash of the log index structure
- Future work
- Need further documentation, testing, EELS spec or second implementation for cross-client validation and wider adoption