Ensure decentralized storage and access to Ethereum’s history and state data
Serve as a 3rd pillar of Ethereum client development, alongside EL and CL clients
Lightweight access to Ethereum data
EL clients can serve JSON-RPC requests for both history & state data using the Portal Network
Not aimed at the hot path
Portal is not designed for time-sensitive tasks (eg. block building)
How Portal works
Network basics
UDP-based communication
Portal Network is built on UDP, rather than TCP, meaning no delivery or order guarantees
Allows for connectionless, stateless communication, reducing peer pool limitations
DHT & Discv5 basics
Portal uses DHT to store & retrieve content
Node distance is measured using an XOR metric
Routing table: Nodes maintain a bucketed list of peers sorted by distance, and closer peers store & retrieve data more efficiently
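As a rough illustration of the XOR metric and bucketed routing table described above, here is a toy Python sketch; the bucket size and helper names (`log2_distance`, `ToyRoutingTable`) are assumptions, not taken from any Portal client.

```python
# Toy illustration of the Kademlia-style XOR distance used by Discv5/Portal.
# Node and content IDs are 256-bit integers.

def xor_distance(a: int, b: int) -> int:
    """Distance between two IDs is their bitwise XOR."""
    return a ^ b

def log2_distance(a: int, b: int) -> int:
    """Bucket index: position of the highest differing bit (0 if equal)."""
    return xor_distance(a, b).bit_length()

class ToyRoutingTable:
    """Peers grouped into buckets by log2 distance from the local node."""

    def __init__(self, local_id: int, bucket_size: int = 16):
        self.local_id = local_id
        self.bucket_size = bucket_size
        self.buckets = {i: [] for i in range(257)}  # one bucket per bit position

    def add_peer(self, peer_id: int) -> None:
        bucket = self.buckets[log2_distance(self.local_id, peer_id)]
        if peer_id not in bucket and len(bucket) < self.bucket_size:
            bucket.append(peer_id)

    def closest(self, target_id: int, count: int = 3) -> list:
        peers = [p for b in self.buckets.values() for p in b]
        return sorted(peers, key=lambda p: xor_distance(p, target_id))[:count]
```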
Portal content storage & retrieval
Content keys
Data in Portal is identified by content keys, which map to content IDs (hashes of the content key)
Nodes store content that is close to them in the keyspace, ensuring efficient retrieval
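A minimal sketch of the content key → content ID mapping: in the Portal specs the content ID is derived by hashing the encoded content key (the exact key encodings are per-sub-network SSZ structures, so the selector-byte layout below is only illustrative).

```python
import hashlib

# Illustrative only: real Portal content keys are SSZ-encoded and vary per
# sub-network; here a key is just a selector byte plus a block hash.
def content_key(selector: int, block_hash: bytes) -> bytes:
    return bytes([selector]) + block_hash

def content_id(key: bytes) -> int:
    """Content ID = hash of the content key, read as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(key).digest(), "big")

# Nodes then store content whose ID is close (by XOR distance) to their own
# node ID, which is what makes lookups in the keyspace efficient.
```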
Content storage: data radius
Each node in Portal has the agency to decide how much data it wants to store, based on its data radius (area around a node in the network address space)
Data radius can be defined as a fixed storage capacity (eg. 2GB of the total data) or a fixed percentage capacity (eg. 7% of the total data)
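A hedged sketch of how a data radius could gate storage decisions; the fraction-to-radius conversion and function names are assumptions for illustration, not how any specific client implements it.

```python
MAX_ID = 2**256 - 1  # size of the 256-bit node/content ID space

def radius_from_fraction(fraction: float) -> int:
    """Eg. a 7% radius covers 7% of the keyspace around the node's ID."""
    return int(MAX_ID * fraction)

def within_radius(node_id: int, cid: int, radius: int) -> bool:
    """Store (and serve) content only if it falls inside the node's radius."""
    return (node_id ^ cid) <= radius

# With a fixed storage budget (eg. 2GB) a client would instead shrink its
# radius as the local store fills; with a fixed percentage it stays constant.
radius = radius_from_fraction(0.07)
```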
Content retrieval
Use iterative DHT routing, where a node queries progressively closer peers until it reaches one within its data radius that stores & serves the requested content
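A toy version of that iterative lookup, assuming a `send_findcontent` callback that stands in for real FINDCONTENT requests and returns either the content or a list of closer peers.

```python
def iterative_lookup(start_peers, target_cid, send_findcontent, max_rounds=16):
    """Ask progressively closer peers about target_cid until one serves it."""
    queried = set()
    candidates = list(start_peers)
    for _ in range(max_rounds):
        remaining = [p for p in candidates if p not in queried]
        if not remaining:
            return None  # nobody left to ask
        peer = min(remaining, key=lambda p: p ^ target_cid)  # closest first
        queried.add(peer)
        kind, payload = send_findcontent(peer, target_cid)
        if kind == "content":
            return payload          # peer was within its radius and had the data
        candidates.extend(payload)  # otherwise: closer peers to try next round
    return None
```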
Discv5 protocol
Based on EIP-778 (ENR), Discv5 provides node discovery and encrypted communication over UDP
Enables data retrieval, peer routing and content queries through message pairs (PING/ PONG, FINDNODE/ NODES, TALKREQ/ TALKRESP)
Portal Network primarily relies on the TALKREQ/ TALKRESP message pair for custom protocol communication
Portal Wire Protocol
Overlay Networks
Use separate routing tables for different networks (eg. state, history, and beacon), each maintaining its own DHT on top of Discv5
Portal Wire Protocol messages
Relies on 4 message pairs (PING/ PONG, FINDNODES/ NODES, OFFER/ ACCEPT, FINDCONTENT/ FOUNDCONTENT) built on Discv5's TALKREQ/ TALKRESP for liveness checks, routing, data sharing, and content retrieval
Portal Wire Protocol data transfer
Use uTP for large payloads to overcome UDP's packet size limitation
Use UDP for small payloads that fit within the packet limit
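A sketch of that size-based decision; the ~1,000-byte threshold matches the packet limit mentioned later in these notes, and the callback names are made up.

```python
import os

MAX_INLINE_PAYLOAD = 1_000  # roughly what fits in one TALKREQ/TALKRESP packet

def respond_with_content(payload: bytes, send_inline, start_utp_transfer) -> None:
    """send_inline / start_utp_transfer are stand-ins for real transport calls."""
    if len(payload) <= MAX_INLINE_PAYLOAD:
        send_inline(payload)  # small payload: a single UDP response is enough
    else:
        # Large payload: reply with a uTP connection ID, then stream the data
        # over that uTP connection instead of a single packet.
        connection_id = os.urandom(2)
        send_inline(connection_id)
        start_utp_transfer(connection_id, payload)
```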
History network
Storage and retrieval of historical headers, block bodies, and receipts
Use cases
Enable JSON-RPC API access to Ethereum's full history
Support full history sync for EL clients
Ephemeral block header update
The ephemeral block header feature allows tracking the chain's head while ensuring only verifiable data is stored, by ignoring ancestor_count when generating the content ID
State network
Access to recent and archival account state and contract storage
Two storage models: Full-trie model & Flat model
Full-trie model (nearly ready for deployment): Stores the entire historical trie, allowing for archive-level access and exclusion proofs
Flat model (early R&D): Enables O(1) direct access to the state, designed for lower latency and faster responses
Relation among the 3 sub-networks
Each subnetwork operates as a separate DHT with its own routing table and data storage
State network relies on History network to prove the validity of state data
History network relies on Beacon network to verify the canonical head of the chain
Portal light client use cases
Wallets, CLI tools, IDEs: Light clients can use Portal to access history & state data without running a full node
EL clients: Serve JSON-RPC requests using data retrieved from the Portal, even before completing a full sync
Open discussion
How does state expiry improve storage efficiency?
State expiry would improve the State Network by creating a clear separation between active (hot) state and historical (frozen) state
Frozen state data becomes static and easier to store efficiently, while active state remains dynamic and limited in size
Allow for dense historical storage and efficient handling of state changes over time
Why does Portal need to keep up with the head state?
It allows light clients to read head state data without relying on a centralized provider
The head state is the most valuable but also the most challenging to handle due to frequent changes and reorgs
Need further exploration on the architecture design
How should transactions be indexed for fast lookup?
Potentially a dedicated network (canonical transaction indexing) mapping transaction hash to block hash in the future
State network currently stores the first 1 million blocks of state data and has seeded 21 million blocks of state data using 16 large nodes
State network is catching up from 21 million blocks to the head of the chain
In a few months, the network will catch up to the head with a delay of 8-16 blocks
Ideally, the target is a delay of ~1.5 slots for Portal clients
Gossip strategy
Currently each node attempts to gossip slices of data to 2 nodes in parallel, with a fallback to 6 more nodes if initial attempts fail or get rejected
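A rough sketch of that fan-out (2 peers first, up to 6 fallback peers on failure/rejection); the `offer` callback stands in for an OFFER/ACCEPT exchange and the sequential loop is a simplification of the parallel attempts.

```python
def gossip_slice(content_key, candidate_peers, offer, initial=2, fallback=6):
    """offer(peer, key) returns True if the peer accepted the offered data."""
    accepted = 0
    first = candidate_peers[:initial]
    rest = candidate_peers[initial:initial + fallback]
    for peer in first:                      # initial attempts (parallel in practice)
        if offer(peer, content_key):
            accepted += 1
    if accepted == 0:                       # fall back only if nothing was accepted
        for peer in rest:
            if offer(peer, content_key):
                accepted += 1
                break
    return accepted
```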
Current challenges and potential solutions
Challenges
State propagation
Propagating state updates (~4,000 key-value pairs per block) across the network efficiently is challenging, with current success rates ranging from 50% to 90%
Gossiping failure rates increase when nodes reject data as they already have it
Performance bottlenecks
The current Portal client implementation is inefficient (unsaturated CPU, disk, or I/O usage observed in Trin)
Solutions
Client implementation: Optimize clients to improve state propagation (eg. replace mutexes with read-write locks, etc.)
Gossip protocol: Add protocol-level feedback to distinguish between “already have the data” and “reject the data”
EL client contribution: EL clients can contribute to Portal by randomly gossiping slices of state data, using their node IDs to determine their assigned portion. The goal is to achieve emergent coordination among clients, avoiding overlap and saturation
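One way the node-ID-based partitioning could look, sketched below: each EL client derives its assigned slice from its own node ID, so slices end up covered roughly evenly without explicit coordination. The slice count and modulo scheme are assumptions.

```python
NUM_SLICES = 256  # assumed number of state-data slices, purely illustrative

def assigned_slice(node_id: int) -> int:
    """Each client derives its slice from its node ID; no coordination needed."""
    return node_id % NUM_SLICES

def should_gossip(node_id: int, slice_index: int) -> bool:
    return assigned_slice(node_id) == slice_index
```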
Flat model intro
Goal and scope
Enable faster access to state info without traversing the full trie
Allow verifying data correctness with proofs
Mainly apply to recent state history rather than the entire state trie, as scaling the entire trie is impractical
Data mapping & storage structure
The state data is mapped to a content ID space
Nodes store contiguous slices of the trie, including leaves and intermediate nodes, within a certain window (eg. 128 - 8,000 blocks)
Updates are propagated along paths, and nodes maintain recent state data
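A minimal sketch, assuming state keys are hashed straight into the content ID space and nodes only keep data inside a recent-block window; the window size and hashing are illustrative.

```python
import hashlib

RECENT_WINDOW = 8_000  # assumed recent-state window, in blocks

def flat_content_id(address: bytes, storage_slot: bytes = b"") -> int:
    """Map an account/storage key directly into the 256-bit content ID space."""
    return int.from_bytes(hashlib.sha256(address + storage_slot).digest(), "big")

def in_recent_window(update_block: int, head_block: int) -> bool:
    """Flat-model nodes only keep data for blocks inside the recent window."""
    return head_block - update_block <= RECENT_WINDOW
```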
Proofs and verification
Nodes store proofs to verify the correctness of data
If a node doesn’t have the requested data, it provides an exclusion proof, and the requester falls back to querying other nodes or to the full-trie model
Routing and network consideration
Use a distance-based approach (eg. distance ± 5) with proportional buckets to organize peers in the network
To avoid hotspots, the flat model shifts the starting point of storage trees in the address space, ensuring even distribution of storage among nodes
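A hedged sketch of the hotspot-avoidance idea: each node rotates the start of its stored slice around the address space by an offset derived from its own node ID. The offset derivation and wrap-around handling are assumptions.

```python
KEYSPACE = 2**256  # size of the 256-bit address space

def shifted_start(node_id: int, base_start: int) -> int:
    """Rotate the slice start by a node-specific offset to spread hot regions."""
    return (base_start + node_id) % KEYSPACE

def stores_key(node_id: int, key: int, base_start: int, slice_size: int) -> bool:
    start = shifted_start(node_id, base_start)
    offset = (key - start) % KEYSPACE  # handles wrap-around at the end of the space
    return offset < slice_size
```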
Complement with Stateless
The flat model provides fast access to recent state, and the stateless model ensures nodes can reconstruct state without storing the entire history
But the flat model still requires nodes to reinitialize older state data and store recent state data to ensure full accessibility
Implementation timeline
Still in the R&D phase, with implementation expected to begin in the coming months
Open questions
How to balance the trade-off between state freshness and network scalability as the network grows?
What is the optimal size for the recent state window in the flat model (eg. 128, 256, or 8,000 blocks)? Or should it be configurable?
How should the transition from the current MPT to a future stateless model be managed to ensure fast data access, avoid gaps, and determine the optimal point at which the old model can be completely phased out?
Originally designed for LES, supporting only single-packet request/ response interactions
Portal Network is repurposing TALKREQ for message transport, but packet size is limited (~1,000 bytes) and allows only a single response per request
Session management
Discv5 assumes that responses are served quickly from memory, but this becomes problematic when disk access or locks are involved, leading to timeouts
The protocol lacks clear semantics for session establishment, especially when sessions are dropped or re-established, causing issues like handshake failures and message resends
Need better handling of session initiation and re-establishment when nodes forget session state
NAT challenge
Nodes behind NATs struggle to establish direct connections, as Discv5 relies on UDP packets that may be blocked by firewalls
The current version does not handle NAT traversal efficiently, leading to connectivity issues for nodes in restrictive network environments
Proposed upgrades in Discv5.2
NAT traversal
Relay mechanism
A relay node is used to facilitate communication between two nodes behind NATs
The relay must have an active session with both nodes and maintain regular communication to keep NAT holes open
Requires a new session notification packet that allows nodes to initiate communication without expecting a direct response
Liveness check
Increase the frequency of liveness checks between nodes to maintain NAT holes
Potentially introduce lightweight ping mechanisms to ensure nodes remain reachable without overloading the network
Sub-protocol data transmission
Session-based communication
Introduce a subprotocol mechanism to handle longer-term, data-intensive interactions (eg. file transfers or large data retrievals)
Allow nodes to establish dedicated sessions for specific tasks, bypassing the limitations of the TALKREQ
Asymmetric communication
Add support for one-way notifications (eg. session notifications) that do not require immediate responses, reducing latency and improving efficiency
Optimize uTP integration
Need to modify the uTP spec and create a Portal-specific version
Optimize overhead and performance through potential simplified uTP headers, skipping initial uTP connection setup, and multiplexing on the same socket
Protocol enhancements
Packet flags
Add flags to packets to indicate whether they are part of an existing session or a new session, improving session management and reducing handshake failures
Introduce a response flag to distinguish between requests and responses, preventing unnecessary handshake attempts
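A minimal sketch of what session/response flags could look like; the flag bits and header layout below are invented for illustration and are not part of any Discv5.2 draft.

```python
from dataclasses import dataclass

FLAG_EXISTING_SESSION = 0b01  # packet claims to belong to an established session
FLAG_IS_RESPONSE      = 0b10  # packet is a response, so never start a handshake

@dataclass
class PacketHeader:
    flags: int
    payload: bytes

def handle_packet(header: PacketHeader, have_session: bool) -> str:
    if header.flags & FLAG_IS_RESPONSE:
        return "deliver to the waiting request"       # avoids spurious handshakes
    if (header.flags & FLAG_EXISTING_SESSION) and not have_session:
        return "ask the sender to re-establish the session"  # we forgot the session
    if not have_session:
        return "start handshake"
    return "decrypt and process"
```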
Implementation plan
High priority for Portal teams to implement or make significant progress on Discv5.2 by the end of 2025
The goal of EIP-7745 is to reduce the cost of updating and searching logs while maintaining the ability to search for specific log addresses, topics, and patterns etc.
Existing search structures (eg. Bloom filters) are inefficient for searching historical logs
EIP-7745 introduces Filter Maps, a 2D log filter data structure that balances update cost vs search efficiency
A prototype exists in Geth, showing significant performance improvement in log search (in milliseconds rather than in minutes)
Tradeoff of structuring log data
Linear list approach: Cheap to update, but expensive to search, DHT-friendly
Single tree approach: Expensive to update, but cheap to search, not DHT-friendly
Need a better tradeoff between update cost and search efficiency
The solution: Filter maps
Fixed-size structure with probabilistic collision filtering
Logs are grouped into epochs, making them DHT-friendly and efficiently distributed
If a single tree fits into memory, update costs remain close to a linear index
The fixed tree shape allows for simple generalized merkle proofs
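A toy two-dimensional filter map to show the shape of the idea: each log value (address or topic) hashes to a row, and a column mark encodes where in the epoch the log occurred, so a search only scans a few rows and then verifies candidates against receipts. The row/column derivation here is deliberately simplified and is not the EIP-7745 mapping.

```python
import hashlib

MAP_ROWS = 4096    # illustrative parameters; the real ones come from EIP-7745
MAP_WIDTH = 2**16

def _value_hash(log_value: bytes) -> bytes:
    return hashlib.sha256(log_value).digest()

def row_index(log_value: bytes) -> int:
    """Map a log value (address or topic) to one row of the filter map."""
    return int.from_bytes(_value_hash(log_value)[:4], "big") % MAP_ROWS

def column_index(log_value: bytes, position: int) -> int:
    """Mix the log's position with a value-derived mask; invertible when searching."""
    mask = int.from_bytes(_value_hash(log_value)[4:6], "big")
    return (position % MAP_WIDTH) ^ mask

filter_map = {}  # row index -> set of column marks

def add_log(log_value: bytes, position: int) -> None:
    filter_map.setdefault(row_index(log_value), set()).add(column_index(log_value, position))

def search(log_value: bytes) -> set:
    """Candidate positions (mod MAP_WIDTH); collisions are filtered against receipts."""
    mask = int.from_bytes(_value_hash(log_value)[4:6], "big")
    return {col ^ mask for col in filter_map.get(row_index(log_value), set())}
```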
Key features & design choice
Efficient database access and merkle proofs
Queries now take only milliseconds instead of minutes
Support provable JSON-RPC API responses
Row mapping and collision mitigation
Use row mapping to locate log values
Use a multi-layer mapping structure to avoid collisions, especially for heavily used log values (eg. ERC20 transfers)
Minimal state requirement
Memory footprint hard-capped (~21MB for minimal state updates)
Prevent log indexing from getting denser over time
Options for anchoring the log index root
Preferred: Replace the log bloom filter with the log index root hash (requires a protocol upgrade)
Other options
Generate a zk proof of the log index root
Sign log index roots with a set of trusted keys
Full nodes can take advantage of the log indexing tech without anchoring
Distribute and access the log index data
Epochs are self-contained and can be distributed in the DHT based on epoch index
Non-finalized epochs require updates before finalization
Each node can serve standard pattern-matching requests for stored epochs
Transaction-by-hash lookups with exclusion proofs can also be implemented
Open discussion
Search speed improvement
Current log searches take tens of minutes, but Filter Maps can reduce this to milliseconds
Efficient log search (eg. search a specific event across all history)
Allow users to retrieve data by downloading only ~2 MB for a full-history search, instead of downloading GB-level headers and receipts
JSON-RPC API design enhancements
Need better API design to support efficient historical searches and provable responses
Portal Network integration
The main challenge will be handling the data at the head of the chain and anchoring it effectively
Potential solution could be adding a field to the header to store the root hash of the log index structure
Future work
Need further documentation, testing, EELS spec or second implementation for cross-client validation and wider adoption