# 4444s MVP - May 1st Delivery

Clients MAY drop pre-merge blocks and receipts (not headers) after May 1st 2025.

https://notes.ethereum.org/_XVO7jmXTGOwZmhR5-3T9Q

Besu Status Quo Use Cases
---------
1. Staking use case
    - SNAP -> spec-compliant
    - CHECKPOINT @ [deposit contract block] -> non-spec-compliant
2. Archive use case
    - FULL + BONSAI -> all trie logs, but can't roll back or serve history over RPC (512 limit)
        - only useful if you don't trust fast-sync block verification
        - do these users exist?
        - ~1 month sync time - kind of infeasible?
    - FULL + BONSAI_ARCHIVE -> all trie logs and snapshots for performant rollback
        - works for everything other than eth_getProof
        - ~1 month sync time - kind of infeasible?
    - FULL + FOREST is infeasible: months to sync and >> 1 TB

---

May 1st - 4444s MVP
-------------------

What is the impact of some/most/all of the network dropping pre-merge block bodies and receipts?

At the interop, Geth, Nethermind and Besu said they _will_ drop the data after May 1st.

If we do nothing, we risk not having access to devp2p data in order to complete a sync.

Just moving the checkpoint to the merge block (and enabling a "post-merge genesis" config) would technically work for dropping the whole block pre-merge, but wouldn't be spec-compliant, similar to CHECKPOINT.

MVP solution...

1. Staking use case
    - CHECKPOINT @ [merge block] and backfill headers

Other options:
- era1, but only store the headers
- "geth init"-sub-command-style bootstrap of pre-merge headers using some as-yet-undefined data format we make available
- do nothing and risk SNAP sync not working

```
MVP for May 1st
- move checkpoint and backfill headers
- assume current archive mode is already infeasible for mainnet
```

Next
----

- What are the blockers for dropping headers by May 1st?
- era1 for bonsai archive full sync?
    - semi-automatic MVP:
        - instruct user to download era1 from somewhere
        - subcommand to import era1 -> db
        - start client, which should continue full sync
    - develop alternative archive sync (iterate on bonsai archive)
- validator sync
    - as previously specified, e.g. backfill blocks/receipts after starting to attest

Notes
---

EIP-7801
- advantage: no need to test a new stack, since it's over devp2p

era1
- desirable to have a canonical era1 distribution location (not GitHub)
    - similar to https://github.com/eth-clients/checkpoint-sync-endpoints ?
- can precommit with 1 hash, no need to verify after that, then can use an S3 bucket

headers-only "era1"
- 2.5 million blocks a year @ 12 seconds
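As a quick sanity check on that growth rate (not from the meeting, just the arithmetic implied by a 12-second slot time):

$$
\frac{365.25 \times 24 \times 3600\ \text{s/year}}{12\ \text{s/slot}} \approx 2.63 \times 10^{6}\ \text{slots/year}
$$

so the "2.5 million blocks a year" figure is roughly the annual slot count, rounded down with some allowance for missed slots.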
---

### Samba Questions
- what still needs to be done so we could use Portal for chain history in Besu?
- continued funding?
    - applied for a grant - EF grant?
    - Derek part time until Feb, but could go full time from April
    - meldsun can work full time for now, but will need funding eventually
        - Americas timezone
- gradle, maybe Spring Boot
    - dockerize
- next step: testing with hive
- uTP: time/effort
    - trin is a good reference client
- 1-2 mins to get 60 peers
- use Samba as a library
- reusing some Besu code for RocksDB
    - would be nice to keep it database-agnostic

### Portal Questions
- What about the 4444s MVP for testnets?
    - no plan for Sepolia/Holesky yet
    - ideally need a small, high-storage node topology, but that's not how it currently works
- Archive sync additional latency?
- State network / snap sync plans?
- Performance impact on critical-path client operations?
    - Nethermind further along?
    - can derive from benchmarking trin?
        - e.g. trace the header chain back, log latency numbers
        - e.g. pull bodies in parallel
    - trin good for concurrency
- If we opt in to the Portal history network, does that imply opting in to other Portal networks?
    - no, all opt-in, e.g. could opt out of the state network
- If the client/user can choose what data to store, and everyone chooses the minimal amount, how do we guarantee all data is stored across the network?
- How long will Portal "subsidise" the network?
    - what nodes are they running now, and what will they run after May 1st?
    - will subsidise for a number of years, if necessary
- What's the disk footprint?
    - every node on the network is the same size - 30GB (fixed storage size model)
    - two storage models:
        1. fixed-radius storage - advertise a radius size
        2. fixed-size storage - accepts content until it hits the limit, then evicts the furthest-away data
    - up to us, e.g. could store 100GB, still a saving
    - could do a combination of both storage models
- Next steps for the 4444s MVP
    - remove headers
    - remove more data - rolling window
    - "must" vs "may"
- Portal validates data from different networks, so you can't break mainnet by sending testnet data
- for sync, just need to retrieve data faster than your execution speed
- currently the replication factor is 1-3
    - if Geth, Nethermind, Besu run it, then it should be running at ~50 (50 TB of distributed storage)
- ensure even distribution with a hash (sha256) based addressing scheme (see the sketch at the end of these notes)
    - content id - location on the network - 32 bytes - sha256 of the content key?
    - content key - machine readable, variable length, represents a query, e.g. getBlockByHash
- Portal monitoring: https://glados.ethdevops.io/
- each client maintains its own routing table
    - DHT routing
- ways to incrementally reduce latencies
    - a known header, body, block number, etc. can be translated to a content key
- 8192 ephemeral headers at the head
- GS: just use Portal to follow the chain head with an EL?
- beacon Portal network - implementation of a beacon chain light client
- Portal hive test suite - https://portal-hive.ethdevops.io/

Interop in April??
- Portal interop is a parallel development effort
- 3-4 months with a small team (2-3 devs) start to finish for the history network

Impact on DIN
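To make the addressing and fixed-radius points above concrete, here is a minimal sketch (not from the meeting). It assumes the history-network content key is a one-byte selector (0x00 for header-by-hash) followed by the 32-byte block hash, that the content id is sha256 of the content key (as the notes suggest with a question mark), and that a fixed-radius node keeps content whose XOR distance from its node id is within the advertised radius. Treat the selector value and key layout as assumptions to verify against the Portal spec.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

/**
 * Sketch of Portal history-network addressing:
 * content key -> content id (sha256) -> XOR distance -> radius check.
 * Constants and layout are assumptions, not confirmed by these notes.
 */
public final class PortalAddressingSketch {

    // Assumed selector for "block header by hash" content keys.
    private static final byte HEADER_BY_HASH_SELECTOR = 0x00;

    /** content key = selector || block hash (assumed layout). */
    static byte[] headerContentKey(byte[] blockHash) {
        byte[] key = new byte[1 + blockHash.length];
        key[0] = HEADER_BY_HASH_SELECTOR;
        System.arraycopy(blockHash, 0, key, 1, blockHash.length);
        return key;
    }

    /** content id = sha256(content key): fixes the content's position in the DHT key space. */
    static byte[] contentId(byte[] contentKey) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(contentKey);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** XOR distance between two 32-byte ids, read as an unsigned 256-bit integer. */
    static BigInteger distance(byte[] a, byte[] b) {
        byte[] xored = new byte[a.length];
        for (int i = 0; i < a.length; i++) {
            xored[i] = (byte) (a[i] ^ b[i]);
        }
        return new BigInteger(1, xored);
    }

    public static void main(String[] args) {
        HexFormat hex = HexFormat.of();
        // Hypothetical block hash and node id, just to exercise the functions.
        byte[] blockHash = contentId("example-block".getBytes(StandardCharsets.UTF_8));
        byte[] nodeId = contentId("example-node".getBytes(StandardCharsets.UTF_8));

        byte[] key = headerContentKey(blockHash);
        byte[] id = contentId(key);
        BigInteger d = distance(nodeId, id);

        // A fixed-radius node stores the content only if distance <= its advertised radius.
        BigInteger radius = BigInteger.TWO.pow(254); // example radius: 1/4 of the key space
        System.out.println("content key: 0x" + hex.formatHex(key));
        System.out.println("content id : 0x" + hex.formatHex(id));
        System.out.println("in radius  : " + (d.compareTo(radius) <= 0));
    }
}
```

The same distance function would also drive the fixed-size model described above: when a node is over its storage limit, it evicts the content whose distance from its node id is greatest.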