# Nimbus-eth1 Updates
- Portal Alpha History Network Launch: https://github.com/status-im/nimbus-eth1/milestone/3
- Portal Alpha History Network Launch Nice-to-haves: https://github.com/status-im/nimbus-eth1/milestone/4
### Portal Network Master issues:
- Overlay Network Functionality: https://github.com/status-im/nimbus-eth1/issues/898
- State Network Functionality: https://github.com/status-im/nimbus-eth1/issues/830
### Portal Network open issues for Fluffy:
### April - December 2022 Development Update:
We have not published development updates for quite a while, so here is a higher-level summary of the items we worked on from April through December.
- NeighborhoodGossip improvements:
- Improvements to the gossip algorithm + updates in specification: closest locally stored nodes + potential lookup
- Concurrency added
- Implementation of POKE mechanism
- uTP tuning and bug fixing
- uTP adjustments to allow for multiple content items: Specification changes + implementation
- Dynamic radius adjustments + db pruning accordingly (static radius is still possible)
- Adding support for block receipts in the history network
- Validation of data in history network for block bodies and receipts
- Block body and receipts SSZ type updates
- Experiments with bulk data seeding / gossiping.
- Multiple content items in one Offer
- Added the prerequisite of a content id range query on the content database.
- Adding support for the header `MasterAccumulator` and `EpochAccumulator` content types and related functionality
- Building of the finite/static (until merge) `MasterAccumulator` and adding it in the Fluffy binary
- Evolution from using an infinite master accumulator that could be requested over the network to the finite master accumulator that is only to be used for blocks up until the pre-merge point.
- Implemented the headers with proof content type:
- Use this type instead of the current header type for all headers pre-merge.
- The proof allows verifying that the header is part of the canonical chain. This is more efficient than requesting the required `EpochAccumulator` for each header to be verified.
- Adjust specification to use this new type instead of the current and specify how the proof works
- `eth_data_exporter` tool:
- Functionality to get the block data from an execution layer JSON-RPC endpoint and store it in JSON or e2s format
- Functionality to get all headers and store them in files per epoch
- Functionality to build the master accumulator and all epoch accumulators from these header files
- Support for injecting block data (headers, bodies, receipts and epoch accumulators) into the network from local files
- Support for injecting block data into the network via JSON-RPC call `portal_historyOffer` (e.g. by using existing eth-portal bridge tool)
- Implement `eth_getLogs` JSON-RPC API call.
- Implement `eth_getBlockByNumber` by requesting the right EpochAccumulator
- Review & fix the state of the Portal JSON RPC API implementation. Add missing calls to Portal JSON RPC API implementation and make it compatible with existing Portal Hive tests + Fluffy Docker image improvements for Portal Hive usage.
- Add `blockwalk` tool to walk down blocks starting from a specific hash and test their availability on the history Portal network.
- Add `content_verifier` tool that verifies availability of all epoch accumulators on the network (should eventually be able to verify all types and might be merged with `blockwalk` tool).
- Proof-of-concept for beacon chain based execution block proofs for the Portal network (relies on the beacon state historical_roots field). Could be used for post-merge canonical chain verification.
- Initial version of the Portal Beacon Light Client Network:
- Help with specifications & review
- First iteration (mostly according to current DRAFT specs) implemented
- First iteration of a bridge to get the content from the consensus libp2p network into the Portal network implemented
- Fluffy Documentation updates & improvements
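
As an illustration of the `eth_getBlockByNumber` item above: resolving a pre-merge block number starts with working out which `EpochAccumulator` covers it. The following is a minimal Python sketch, not Fluffy's Nim code. It assumes an epoch size of 8192 headers and accumulator records of the form (block hash, total difficulty), as in the Portal history network specification; the function names are hypothetical.

```python
from typing import List, Tuple

EPOCH_SIZE = 8192  # assumption: number of headers per EpochAccumulator

# Assumption: each accumulator record is a (block_hash, total_difficulty) pair.
HeaderRecord = Tuple[bytes, int]


def locate_in_epoch(block_number: int) -> Tuple[int, int]:
    """Return (epoch_index, index_within_epoch) for a block number."""
    if block_number < 0:
        raise ValueError("block number must be non-negative")
    return divmod(block_number, EPOCH_SIZE)


def block_hash_for_number(block_number: int,
                          epoch_records: List[HeaderRecord]) -> bytes:
    """Look up the block hash in the epoch accumulator covering the block."""
    _, index = locate_in_epoch(block_number)
    if index >= len(epoch_records):
        raise LookupError("block not covered by this epoch accumulator")
    return epoch_records[index][0]
```

Once the hash is known, the header can be requested from the network by hash like any other header.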
### February - March 2022 Development Update
Over the past two months, most work has gone into:
- Getting the uTP implementation well tested and tuned
- Adding code for the initial seeding of block data into the history sub-network and gossiping it around.
- Getting Fluffy prepared for setting up a testnet
- Setting up a small fleet of nodes (64x Fluffy nodes) to have our own small public testnet
Seeding data for the history sub-network was first tested successfully with the local testnet script. This script launches n (default 64) nodes on the machine where you run it and then seeds them with block data (only 20 blocks by default). It then tests whether these blocks can be retrieved through lookups in the network.
Next, once the fleet of Fluffy nodes was up and running, data was seeded into that network. Only 2500 mainnet blocks (headers + bodies) were seeded initially, and these blocks are retrievable from this public network.
This means that one can start a Fluffy node, connect to this testnet and retrieve the first 2500 blocks from the network. More will be added soon after some optimisations.
A small testing tool, "blockwalk", was added to test the availability of a range of blocks on the Portal history sub-network (a sort of CLI baby block explorer).
Work also went into testing interoperability with the Ultralight and Trin clients, which resulted in some Portal [spec clarifications](https://github.com/ethereum/portal-network-specs/pull/137).
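
The idea behind such a block walk can be sketched as follows. This is a minimal Python illustration, not the actual tool: `lookup` is a hypothetical callback that returns the parent hash of a block whose header could be retrieved from the network, or `None` when the block is unavailable.

```python
from typing import Callable, Optional, Tuple

# Parent hash of the genesis block; the walk stops when it is reached.
ZERO_HASH = "0x" + "00" * 32

# Hypothetical lookup: block hash -> parent hash, or None if unavailable.
HeaderLookup = Callable[[str], Optional[str]]


def walk_blocks(start_hash: str, lookup: HeaderLookup,
                max_blocks: int) -> Tuple[int, Optional[str]]:
    """Walk down parent links from start_hash.

    Returns (blocks_found, first_missing_hash); the second element is None
    when every visited block was available.
    """
    current = start_hash
    found = 0
    while found < max_blocks and current != ZERO_HASH:
        parent = lookup(current)
        if parent is None:
            return found, current
        found += 1
        current = parent
    return found, None
```

For a quick local check, a plain dictionary mapping hashes to parent hashes can stand in for the network by passing its `get` method as `lookup`.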
The Fluffy fleet is now also linked with the Trin and Ultralight bootstrap nodes, which means you can join the network by bootstrapping from any of [these bootstrap nodes](https://github.com/ethereum/portal-network-specs/blob/master/testnet.md#bootnodes).
- Preparation for testnet + Fluffy fleet:
- Updated docs on how to join the testnet
- Improved Portal wire protocol logging
- Load bootstrap nodes at compile time
- Added cli option to start fluffy with a netkey file
- Added radius cli option to Fluffy and use it in local testnet script
- Get a Fluffy fleet of 64 nodes running
- Work on seeding data into the network:
- Load block header data and propagate in a Portal history network, basically by implementing neighborhoodGossip
- Implement `eth_getBlockByHash`
- Added block bodies to the propagation and lookups
- Added concurrency to the content offers of neighborhoodGossip proc
- Blockwalk cli tool to test the availability of a range of blocks on the Portal history sub-network
- uTP improvements, fixes and testing:
- Added uTP test app which uses network simulator similar to the one in [QUIC interop testing](https://github.com/marten-seemann/quic-interop-runner) setup, to test uTP over discv5 in different network conditions
- Added uTP tests that stress concurrent reading and writing over uTP socket
- Refactored uTP internals to use one event loop to manage all uTP events
- Added possibility to configure size of uTP packets
- Fixed uTP bug which could introduce never ending read from uTP socket
- Fixed uTP bug which caused socket to accept data packet in not connected state
- Fixed uTP bug which caused not releasing resources when closing uTP socket
- Fixed uTP bug which caused a defect exception when packet sequence numbers wrapped around
- Added metrics to uTP
- Added metrics to the Portal wire protocol
- Added Grafana dashboard to track metrics of a Fluffy node
- Added basic validation of history network
- Refactor uTP / Portal stream connection process in Portal wire protocol
- Add initial Docker file for Fluffy
- Content Database:
- Added api to get db size of content db
- Added api to get n furthest elements from db
- Improved the local testnet docs, including more details on how it works.
- Fixed issues:
- Let Portal wire protocol adhere to the discv5 packet size limits
- Avoid opening an uTP stream when no content was accepted
- Fix runtime loglevel selection for Fluffy
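
The "n furthest elements" query listed under Content Database is the building block for pruning content that falls outside a node's radius. A minimal Python sketch of the idea follows. It assumes XOR distance between the node id and the content id, as used by the history network, and an in-memory list in place of the real database; the function names are hypothetical.

```python
import heapq
from typing import Iterable, List


def xor_distance(node_id: int, content_id: int) -> int:
    """Distance metric assumed here: bitwise XOR of the two ids."""
    return node_id ^ content_id


def furthest_content_ids(node_id: int, content_ids: Iterable[int],
                         n: int) -> List[int]:
    """Return the n content ids furthest from node_id, furthest first."""
    return heapq.nlargest(
        n, content_ids, key=lambda cid: xor_distance(node_id, cid))
```

When the database grows past its size limit, the ids returned by such a query are the natural candidates to delete first.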
### December - January Development Update
- Added data radius cache per Portal protocol.
- Added populate history db option to Fluffy: a way to (offline) populate the database with history data (currently only block headers from a json file).
- uTP: Additional feature implementation:
- Handling of selective ACKs, both for sending and receiving.
- Fast resends: when packets are missed.
- Fast resends: due to selective acks.
- uTP: Create Portal [test vectors specification](https://github.com/ethereum/portal-network-specs/blob/master/utp-wire-test-vectors.md) for encoding and decoding uTP packets and implement tests for those test vectors.
- uTP: Investigate possible ways of testing uTP with different network fault scenarios.
- uTP: Implement uTP test app and adapt tooling developed for [QUIC interop testing](https://github.com/marten-seemann/quic-interop-runner) to test different network conditions (delays, packet drop rates, etc.).
- uTP: Fix bugs found during testing:
- [Missed ACKs for duplicated packets](https://github.com/status-im/nim-eth/pull/462)
- [Missed `talkresp`, causes missed uTP packets due to node being removed from routing table](https://github.com/status-im/nim-eth/pull/464)
- [Sudden window drop when timeout](https://github.com/status-im/nim-eth/pull/465)
- Integration of uTP over Discovery v5 in Fluffy.
- Add options to Fluffy cli regarding routing table ip limits and bits per hop value configuration.
- Update history network content keys + adjust testing according to test vectors.
- Add/update Fluffy [build docs](https://github.com/status-im/nimbus-eth1/blob/master/fluffy/README.md) and [interop docs](https://github.com/status-im/nimbus-eth1/blob/master/fluffy/docs/protocol_interop.md).
- Tested initial interop of the Portal wire protocol messages with Trin & Ultralight.
- Added NodeId resolve call for Portal Networks.
- Use json rpc client to run tests on the Portal local testnet.
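
To make the selective ACK handling listed above more concrete, here is a small Python sketch of how a receiver's bitmask can be decoded. It assumes the layout described in BEP 29: the first bit stands for `ack_nr + 2`, the least significant bit of each byte is the lowest sequence number, and sequence numbers are 16 bits wide. The function name is hypothetical and this is not the nim-eth implementation.

```python
from typing import List


def sacked_sequence_numbers(ack_nr: int, bitmask: bytes) -> List[int]:
    """Return the sequence numbers marked as received in a selective ACK."""
    acked = []
    for byte_index, byte in enumerate(bitmask):
        for bit in range(8):
            if byte & (1 << bit):
                # Bit 0 of byte 0 is ack_nr + 2; wrap at 16 bits.
                seq = (ack_nr + 2 + byte_index * 8 + bit) & 0xFFFF
                acked.append(seq)
    return acked
```

Packets in the send window that are below the highest selectively acknowledged number but not in this list are the candidates for a fast resend.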
### November Development Update
- (**Ongoing**) Implementation of uTorrent Transport Protocol (uTP)
- Implement the full uTP API, i.e. reading stream of bytes, writing stream of bytes, opening stream and closing
- Generalize uTP to work over UDP transport and Discovery v5 transport (talkreq messages).
- Add all necessary fluffy specific extensions:
- connecting to remote peer with requested connection id.
- allowing incoming connections only from specified peers.
- Initial testing against trin and reference implementation.
- Initial work on congestion control and correct backpressure semantics
- Further implemented most of the Discovery v5 JSON-RPC specification.
- Several items in the specification PR are still unclear or inconsistent. See comments made in https://github.com/ethereum/portal-network-specs/pull/88
- Initiation of Portal JSON-RPC endpoints:
- Currently only `portal_<network>_nodeInfo`, `portal_<network>_routingTableInfo` and `portal_<network>_recursiveFindNodes`.
- No specification exists here yet, but we can implement similar calls as the discv5 specification.
- This was made generic so that common calls can be easily instantiated for the different Portal networks.
- Creation of a local testnet script which launches n nodes that connect to each other over the discovery v5 and portal networks (state & history).
- Random lookups (recursive FindNodes) are launched for each node via JSON-RPC calls.
- State of the routing table is checked after that again via JSON-RPC calls.
- No content is served and/or checked yet. Neighbourhood gossip needs to be implemented still too.
- Added support to pass a bootstrap file (containing ENRs) on cli (as opposed to only passing the ENRs on the command line before) + write own ENR file in data-dir.
- Updated the wire protocol according to the latest wire protocol specification.
- Usage of SSZ Unions in wire protocol and for state content keys.
- Implemented Generic SSZ Union solution that maps on Nim case objects (= Object Variants) + move of SSZ code to nim-ssz-serialization repository that is now commonly used between nimbus-eth1 and nimbus-eth2 projects.
- Specification adjustments + adjustments to tests / test vectors.
- Add basic wire protocol test vectors in Portal specifications repository.
- Improve the populating of the Portal networks routing table by looking for specific nodes in the discovery v5 routing table on incoming messages.
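
For readers unfamiliar with SSZ Unions as mentioned above: on the wire, a union value is a one-byte selector followed by the serialization of the selected option, which is why it maps naturally onto a Nim case object with a discriminator field. A minimal Python sketch follows, assuming the selector range 0..127 from the SSZ specification and treating the payload as already-serialized bytes; the function names are hypothetical.

```python
from typing import Tuple

MAX_SELECTOR = 127  # assumption: SSZ allows at most 128 union options


def encode_union(selector: int, payload: bytes) -> bytes:
    """Serialize a union value: selector byte, then the option's bytes."""
    if not 0 <= selector <= MAX_SELECTOR:
        raise ValueError("selector out of range")
    return bytes([selector]) + payload


def decode_union(data: bytes) -> Tuple[int, bytes]:
    """Split serialized union bytes into (selector, payload)."""
    if not data:
        raise ValueError("empty input has no selector byte")
    if data[0] > MAX_SELECTOR:
        raise ValueError("selector out of range")
    return data[0], data[1:]
```

The decoder only splits the bytes; interpreting the payload depends on which type the selector refers to.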
### August - September - October Development Update
- (**Ongoing**) Implementation of uTorrent Transport Protocol (uTP)
- Initially over raw UDP to test against reference implementation
- Development happening here: https://github.com/status-im/nim-eth/tree/master/eth/utp
- (**Ongoing**) Update the wire protocol according to the latest wire protocol specification
- Lacking proper SSZ Unions
- Lacking latest content key SSZ sedes
- Implement `getContent` call for state and history network:
- Check local database
- Do the recursive content lookup
- Store data if in radius
- Add the history network code:
- Now uses the separated wire protocol code
- Implemented latest content key SSZ sedes
- Separate the wire protocol from the state network code
- Drafted this and other option(s) before suggesting specification changes
- Allow for passing specific Portal network bootstrap nodes by CLI argument
- Add a simple ContentDB (before there was no persistent storage)
- kvstore using sqlite as backend
- Improve Fluffy CLI option descriptions
- Implement custom distance function of Portal state network
- Allow for custom distance functions in routing table
- Implement recursive content lookup (in Portal state network)
- Initiation of Discovery v5 JSON-RPC endpoints:
- Currently only `discv5_nodeInfo`
- Routing table node validation based on Portal network responses.
- Generalize network layer of Portal Networks (abstract link with storage away)
- Add History network message types
- Add merkle tree implementation, with proof generation and verification. These utilities will be needed to implement the double-batched merkle accumulator
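
The three steps of the `getContent` call listed above (check the local database, do the recursive lookup, store if in radius) can be sketched as follows. This is a Python illustration rather than Fluffy's Nim code: the class, the `network_lookup` callback and the dictionary standing in for the ContentDB are all hypothetical, and XOR is used as a default distance function only as an assumption.

```python
from typing import Callable, Dict, Optional


def xor_distance(a: int, b: int) -> int:
    return a ^ b


class ContentNode:
    def __init__(self, node_id: int, radius: int,
                 network_lookup: Callable[[int], Optional[bytes]],
                 distance: Callable[[int, int], int] = xor_distance):
        self.node_id = node_id
        self.radius = radius
        self.network_lookup = network_lookup  # stands in for the recursive lookup
        self.distance = distance
        self.db: Dict[int, bytes] = {}  # stands in for the ContentDB

    def in_radius(self, content_id: int) -> bool:
        return self.distance(self.node_id, content_id) <= self.radius

    def get_content(self, content_id: int) -> Optional[bytes]:
        # 1. Check the local database first.
        local = self.db.get(content_id)
        if local is not None:
            return local
        # 2. Otherwise look the content up on the network.
        content = self.network_lookup(content_id)
        # 3. Keep a copy only if the content falls within our radius.
        if content is not None and self.in_radius(content_id):
            self.db[content_id] = content
        return content
```

Passing the distance function as a parameter mirrors the fact that the state network uses a custom distance function while the routing table code stays generic.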
#### Master issue on State network:
#### Portal Network open issues for Fluffy:
### July Development Update
Most of the development focus went to providing Fluffy with a json-rpc proxy and starting to support state content searches.
- Providing fluffy with a json-rpc proxy that can relay specific json-rpc calls to an Eth1 client
- Adding initial tests for the state content network by having a node that starts from a genesis json file
- Further implementation of `FoundContent` & `FoundNodes` responses
- SSZ sedes work for `node_key` / `node_id`
- Move of Portal wire protocol code from nim-eth to nimbus-eth1/fluffy
- Stubs for the json-rpc bridge client + adding experimental call: `bridge_getBlockWitness`
- Merkleization fork/port of nimbus-eth1 code to nim-eth, adding tests and merkle verification
- Adding a routing table + lookup code to the Portal protocol
## Nimbus-eth1 core
### July+August Development Update
Development is mainly in these areas, with the usual little issues and fixes throughout the code tree not listed here (see GitHub for those).
The last couple of months have seen some bold deep dives:
- Aiming to pass the Ethereum Hive testsuite completely
- Going very well
- We pass 100% of the consensus tests which is the largest suite by far, including latest London hardfork tests
- Clique proof-of-authority
- Tested against part of Hive and some 50k early blocks from Goerli
- Not fully tested on Goerli yet though
- Research on transaction pool for The Merge
- Nimbus-eth1 does not do mining, but we will need blocks to propose for Nimbus-eth2
- GraphQL implementation
- GraphQL API is now implemented, mirroring the JSON-RPC API
- It passes many Hive tests
- As with the JSON-RPC API, we need to go through the list of calls and check it matches the emerging Ethereum RPC specifications, and expectations of other clients
- WebSockets implementation
- Another way to make RPC calls similar to JSON-RPC and GraphQL, with the added function of being able to wait for events
- Large memory savings during some transaction executions
- We used to crash on running out of memory during some tests, now we use very little
- Faster network sync and storage
- "FastBeamSnap" - simultaneous Beam, Snap and Trie sync balanced dynamically
- Several sync approaches previously researched by others have been combined into a unified model, that theoretically balances to perform efficiently in a wide variety of conditions
- Chain head tracking of each peer
- Fast sync (as defined by Geth) bootstrapped statistical consensus followed by state fetch
- Snap sync (as defined by Geth 1.10+, with some added performance tweaks)
- Beam sync (fetch state on request during transactions, simultaneous with Snap sync)
- "BeamSnap" uses `snap/1` protocol where available to speed up Beam sync further
- Stateless participation mode, while sync continues in the background
- Fast recovery after a long period of downtime, whether due to the program not running, program aborted suddenly, battery ran out, or network not available
- When resuming aborted sync, recovery should not need to do any large trie scans
- Beam sync and stateless transaction execution
- Runs transactions of blocks before all the state is available locally
- Fetches extra state over the network on request by transactions
- Depends on our "asynchronous EVM" capability
- Runs hundreds or thousands of EVMs in parallel to reduce overall latency
- Allows the "trustless" consensus state to be reached more quickly
- Stateless participation mode in parallel with sync
- When Beam sync is going, it's as if the whole network state is available on request, so transactions can be run and consensus can be performed the same way as a full node, without having a full copy of the whole state locally
- This mode is reached much more quickly than full sync takes to complete
- Consensus and transaction behaviours are the same as the fully synced state, but slower
- Similar to Fluffy, or light clients, this allows the user to participate fairly quickly after bootstrapping, but this mode is designed to keep fetching the whole state in the background, and when that's done, switch over to being a full node
- This is work in progress and we don't know how well it works yet
- Snap sync but not the way Geth does it
- Using `snap/1` the standard protocol
- We have identified some issues with the protocol spec and Geth implementation, and found workarounds. Will report in due course
- Simultaneous trie-node traversal and snap-range retrieval with different peers according to their capabilities and response times ensures good syncing as conditions and data availability varies, and naturally falls back to classic trie-node retrieval when necessary
- Pie slicer
- Fetching state is coordinated by the "pie slicer", which is responsible for:
- Adaptively pipelining snap-range and sub-trie-traversal requests to avoid network idle time that makes sync take longer, and to keep the peers' storage I/O queues active for best throughput
- Sequencing pipelined `snap/1` requests so as to minimise I/O load on peers is an interesting problem
- Intentional "leaf locality" separately per peer, to assist peers to have the most local and sequential I/O without wasted readahead and RAM on ranges that end up requested from different peers, as would happen with a naive load balancer
- "Pie assembly" at the end of the major traversal, as hundreds to 10,000s of stored "pie slices" with boundary proofs are combined into a single state trie. This is equivalent to Geth's snap trie-heal phase, with the main difference being a different representation and the option to defer some calculations to make the first phase faster and use less memory
- State representation and storage model
- Instead of state + diffs-in-RAM representation, we are using a shared-subtree state representation as it has desirable algorithmic scaling properties
- Our storage model is designed to store arbitrary ranges of blocks and their overlapping state-tries, while keeping data mostly-sorted by leaf-path for efficient snap range scans
- The storage model implements rapid pruning of old or unwanted states in a novel way that uses little I/O, while keeping the on-disk storage sorted by leaf-path and values shared among many blocks
- Space usage is relatively stable and predictable
- If we have done our homework right, the database should be resistant to corruption, including in disk full conditions (a weakness of Geth) and sudden power failure
- Excellent tracking of hardforks, Berlin, London, later EIPs
- We keep on top of these well, as we take care to pass all the execution test suites
- EIP-1559 is implemented of course
- We intend to be more active in the implementor meetups in due course, and to add Nimbus-eth1 in the various EIP/hardfork support matrices online, but currently we are "head down", filling out the essentials to ship a 1.0 alpha-testable release
- Sync progress estimation
- The pie slicer approach to snap _or_ trie based sync, combined with statistical properties of the Ethereum state trie, allows us to estimate progress in each phase of sync quite accurately, even to the point of predicting time left and storage required
- An attractive and useful status panel below the logging area
- Fun with terminal control codes, but will also work with Windows console
- This is the user interface in practice until there's a GUI to control Nimbus-eth1
- Useful means showing sync progress, connected peers and other basic things users tend to ask about
- EVMC compatibility for the EVM
- Our EVM is being made fully compatible with the EVMC API specification, version 9
- It's a good specification
- By providing a clean separation between EVM and host, it has helped us make the EVM fully asynchronous with respect to network state requests, even though EVMC does not support that capability itself
- We use our EVMC extensions when Nimbus-eth1 is using its own EVM for:
- Asynchronous EVM (for network state fetches)
- Pre-emptible EVM (to avoid hogging the CPU for too long)
- Skip byte-endian conversion of 256-bit numeric values
- We can successfully load an alternative EVM as a shared library and use it instead of the Nimbus EVM
- Using the Nimbus EVM as a shared library in another program is not currently possible due to Nim language runtime issues, but we are looking into it
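
To illustrate the sync progress estimation item above: state trie keys are hashes, so they are spread close to uniformly over the 256-bit key space, and the fraction of the key space already fetched is a reasonable proxy for overall progress. The Python sketch below is only a rough illustration under that uniformity assumption plus a constant fetch rate; it is not the Nimbus-eth1 estimator and all names are hypothetical.

```python
from typing import List, Optional, Tuple

KEY_SPACE = 2 ** 256  # number of possible leaf paths


def estimate_progress(covered_ranges: List[Tuple[int, int]],
                      elapsed_seconds: float
                      ) -> Tuple[float, Optional[float]]:
    """Estimate (fraction_done, seconds_remaining).

    covered_ranges holds inclusive, non-overlapping (start, end) leaf-path
    ranges that have been fetched so far.
    """
    covered = sum(end - start + 1 for start, end in covered_ranges)
    fraction = covered / KEY_SPACE
    if fraction <= 0:
        return 0.0, None  # nothing fetched yet, no basis for a prediction
    remaining = elapsed_seconds * (1 - fraction) / fraction
    return fraction, remaining
```

The same proportional reasoning can be applied to bytes stored so far to predict the total storage required.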