[Quick contemporaneous notes by Ben Edgington; fka "Eth2 Implementers' Call"]
Agenda: https://github.com/ethereum/pm/issues/443
Livestream: https://youtu.be/izyYW9-HbNk
Kintsugi testnet finalising! It had not been finalising for the last few days due to errors found by the fuzzing exercise. The forks have now been pruned and almost all client combos are working. Pari has a lot of data preserved from the failures for anyone who wants to investigate further. We have captured some testcases from these - are we cataloguing these anywhere? Pari is working on a timeline. Mariusz is adding cases to Hive.
[Mikhail] The blockhash issue occurred due to a lack of clarity in the spec. An update to address this is pending. Any time an execution payload is inserted there should be a "well-formedness check" before proceeding any further.
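One plausible reading of the "well-formedness check", sketched in Python: recompute the block hash from the payload's other fields and reject the payload on mismatch before doing anything else with it. This is an illustration only, not the spec text. The real block hash is Keccak-256 over the RLP-encoded header; `stand_in_hash`, the field names, and the "ACCEPTED_FOR_EXECUTION" label are hypothetical stand-ins that keep the sketch self-contained.

```python
import hashlib
import json


def stand_in_hash(fields: dict) -> str:
    """Hypothetical stand-in for the real block hash function.

    ASSUMPTION: SHA-256 over sorted JSON is used here only so the sketch
    runs with the standard library; real clients hash the RLP header
    with Keccak-256.
    """
    encoded = json.dumps(fields, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def is_well_formed(payload: dict) -> bool:
    """True only if the claimed blockHash matches the recomputed one."""
    fields = {k: v for k, v in payload.items() if k != "blockHash"}
    return payload.get("blockHash") == stand_in_hash(fields)


def insert_payload(payload: dict) -> str:
    # Reject malformed payloads before touching state or fork choice.
    if not is_well_formed(payload):
        return "INVALID"
    # Hypothetical label: the payload may now proceed to execution.
    return "ACCEPTED_FOR_EXECUTION"
```

The point of the ordering is that a payload with a wrong hash never reaches execution or storage, whatever the rest of its contents look like.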
Some discussion around how to handle blocks older than the execution layer's finality checkpoint. This could happen if the consensus layer were to resync from genesis, say. Seems to be straightforward to handle. Also can probably ignore forkChoiceUpdated() calls that go backwards.
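A toy Python sketch of that handling, under stated assumptions: "older than the finality checkpoint" is modelled as a block number at or below the finalised height, and a "backwards" call is one whose head or finalised height is below what the execution layer has already finalised. The class, method names, and returned labels are hypothetical and are not Engine API values.

```python
from dataclasses import dataclass


@dataclass
class ToyExecutionLayer:
    """Hypothetical model tracking only head and finalised heights."""

    finalized_number: int = 0
    head_number: int = 0

    def on_payload(self, block_number: int) -> str:
        # A payload at or below the finalised height cannot change the
        # canonical chain, so there is no need to process it again.
        if block_number <= self.finalized_number:
            return "SKIPPED_PRE_FINALITY"
        return "PROCESS"

    def on_fork_choice_updated(self, head_number: int, finalized_number: int) -> str:
        # Ignore updates that would move the head or the finalised
        # checkpoint behind what is already finalised locally.
        if head_number < self.finalized_number or finalized_number < self.finalized_number:
            return "IGNORED_BACKWARDS"
        self.head_number = head_number
        self.finalized_number = finalized_number
        return "UPDATED"
```

For example, a consensus client resyncing from genesis would replay old payloads and fork choice updates; in this sketch they are all skipped until it catches up with the execution layer's finalised height.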
Further topic. This may not be true for all execution clients, but the default is for a client to execute a branch only once its PoW difficulty is greater than that of the previous head. There is a mismatch between this expectation and the way that the consensus layer drives the execution client, which led to high load on the very forked testnet. [Peter S] In PoW mode, when a block does not build on our current chain, if we have the state for the parent we execute it, if not we store the block for later. [Danny] We may need to modify the semantics of the engine API calls to allow for this kind of behaviour: allow for mandatory or optional execution in forkChoiceUpdated() and executePayload() respectively (maybe?). Geth stores 128 past states which makes the change in the semantics much more of an edge case. Erigon does not store multiple states, and big reorgs are relatively heavy - it uses reverse state diffs. Note that Geth is moving in the same direction (with reverse diffs), though it will still maintain 128 past states. Action: document any required changes to the engine API and get review. Could be useful to add VALID and INVALID returns for forkChoiceUpdated().
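A minimal Python sketch of the semantics being floated, not the agreed design: executePayload() executes only when the parent state is at hand and otherwise stores the block, while forkChoiceUpdated() must execute whatever stored blocks lead to the new head. `ToyClient` and its fields are hypothetical; "execution" is modelled as simply recording that a block's state exists, so INVALID never arises here; and returning VALID from forkChoiceUpdated() is only the suggestion noted above.

```python
class ToyClient:
    """Hypothetical execution client keyed by block hash strings."""

    def __init__(self) -> None:
        self.states = {"genesis"}  # blocks whose post-state is available
        self.pending = {}          # stored, unexecuted blocks: hash -> parent hash
        self.head = "genesis"

    def execute_payload(self, block_hash: str, parent_hash: str) -> str:
        # Optional execution: run the block only if the parent state exists.
        if parent_hash in self.states:
            self.states.add(block_hash)
            return "VALID"
        # Otherwise keep the block for later, as in PoW side-chain handling.
        self.pending[block_hash] = parent_hash
        return "SYNCING"

    def fork_choice_updated(self, head_hash: str) -> str:
        # Mandatory execution: walk back from the new head to a known state.
        to_execute = []
        cursor = head_hash
        while cursor not in self.states:
            if cursor not in self.pending:
                return "SYNCING"  # an ancestor is missing entirely
            to_execute.append(cursor)
            cursor = self.pending[cursor]
        # Execute the stored blocks oldest first, then switch head.
        for block_hash in reversed(to_execute):
            del self.pending[block_hash]
            self.states.add(block_hash)
        self.head = head_hash
        return "VALID"
```

With this split, a heavily forked network only costs execution time on branches the consensus layer actually selects as head, which is the load problem described above.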
[Peter S] Do we have any environment for testing these scenarios - e.g. spin up a non-finalising chain?
[Danny] Hive is approaching being able to handle this kind of scenario, and partitions. Let Lightclient know if you have any specific configurations that would be useful.
There is also some third-party simulation work going on.
[Peter S] There are some ordering dependencies of calls in the Engine API. What is the desired behaviour in the case of protocol violation that breaks this ordering?
[Mikhail] forkChoiceUpdated() will just return SYNCING if it's called out of order. The current API should handle all edge cases gracefully. Need to double check that all cases are covered in practice.
[Lightclient] Which consensus clients support starting from a Merge checkpoint? It should be possible for all clients now.
Spec updates coming, main thing being the call semantics above. Until then, keep Kintsugi alive. Probably one more testnet like Kintsugi, then Merge-fork public testnets.
Aim to freeze spec asap.
[Pari] Do all clients support the Bellatrix naming? (See chat)
[Saulius] What changes needed to be made to clients as a result of testnet incidents? Would be good to have a list. Also, should we think about making the fork choice more like PoW since fork handling in PoW has proven very robust? On PoS forking is cheap, not so under PoW. [Marius] There was some discussion in Greece about whether proposing bad blocks would be penalised, things that would be difficult for the execution layer to handle, such as not building on the current head. [Danny] Likely this is possible since we have a block roots accumulator. Haven't thought deeply about it. Action: Saulius invites further discussion
Lodestar
New team member. New release.
Grandine
Finally joined Kintsugi (just before the crash). Back on with 1000 validators. Currently testing only with Geth. Looks good right now.
Prysm
Optimistic sync. Fork choice proposer boost and new spec tests by end of Jan. Switching beacon state to Go structures to save memory. Implementing Web3Signer API, by end of Jan. Work on Key Management API.
Nimbus
Prepping new release today or tomorrow. Mostly performance, but also ships key manager API. Optimised use of Nim garbage collector, now ~1GB on mainnet.
Want to work on light client. Have a server compatible with Lodestar, and will put a light client mode into Nimbus. Have plans to contribute to the spec.
REST API cache now speeds up calls significantly.
Teku
v22.1.0 release has network config --kintsugi. Optimisations. Lots of work on optimistic sync. Starting to find lots of great corner cases in optimistic sync. More confident that it will work out well now.
Working on key manager API - can be enabled in the release build. Just need to add authentication and SSL. Should be fully supported in next release.
Lighthouse
Release candidate in next day, release next week. Big update! Performance work, moving slasher DB to new platform. Optimistic sync close to mergeable. Flashbots PoC. Client diversity analysis. Proposer boost and Bellatrix rename ready to go.
[Marius] LH database grew 7GB a day during non-finalisation, but was not pruned when finalisation occurred - is this expected? [Paul H] This ought to work; will look into it. [Adrian S] Teku's default for finalised states is tree-mode which deduplicates automatically. [Jacek] Nimbus does deduplication for validator states as well.
(Discussion of Teku's state tree storage mode ensues… [Jacek] Nimbus has been playing with daily "era files", which are effectively complete checkpoints.)
Dankrad
Published a draft for simplified sharding design. Please take a look at the draft PR and give feedback. Should be much faster to implement.
Also - contact Danny if you want to do some R&D work on DHTs.
New Single Secret Leader Election post on Ethresear.ch today.