
What's New in Eth2 - 22 August 2020

Ben Edgington (PegaSys/ConsenSys, but views expressed are all my own)

Edition 50 at eth2.news

Medalla Meltdown redux

In case you missed it, things have been turbulent on the Medalla beacon chain testnet. I wrote a mostly accurate initial account earlier in the week. Read that first if you've no idea what happened. Prysm has since published their own detailed account of things from their viewpoint. The long and short of it is that the Medalla testnet suffered some big shocks and a great deal of stress, beginning on Friday the 14th, and continuing for five days while we recovered things.

The incident


My Teku node's view of Medalla participation in the 6 hours after the incident. Time is UTC on 2020-08-14.

What you see here is an initial massive drop off of validators over a thirty minute or so period. This is the Prysm nodes all disappearing from view because their clocks got set to four hours in the future. The horizontal red line is the 66.7% participation that allows the network to finalise: we dropped way, way below that.
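To make that red line concrete, here is a minimal sketch of the two-thirds supermajority check. The function name and the balances are made up for illustration. It is a simplification: real clients weigh participation by effective balance, and finality also requires successive justified epochs, not just a single check like this one.

```python
def meets_supermajority(attesting_balance: int, total_active_balance: int) -> bool:
    """Return True if the attesting stake is at least 2/3 of the active stake.

    Hypothetical helper for illustration only. Integer cross-multiplication
    avoids floating-point rounding right at the threshold.
    """
    return attesting_balance * 3 >= total_active_balance * 2


# Illustrative numbers only, not measurements from Medalla.
print(meets_supermajority(67, 100))  # just above the line
print(meets_supermajority(66, 100))  # just below the line
```

Note that 66% is not enough: the threshold is exactly two-thirds, which the 66.7% on the chart rounds to.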

Things slowly begin to recover as people transfer keys from their Prysm validators to other clients. Then a bit more rapidly from about 2 hours after the incident as the clock error was resolved by Cloudflare, and Prysm validators that had not been updated came back into view. As per Prysmatic Labs' account, the update designed to fix the original issue contained a critical flaw that unfortunately rendered updated Prysm clients inactive, compounding the problems.

Then, just over four hours after the initial issue, chaos breaks out and participation dumps again. Nodes of all types were struggling to process all the "attestations from the future" that the Prysm validators had produced, and which had now become newly valid. The network begins to fragment as nodes try to make sense of all this. Memory gradually bloats up in Teku and Lighthouse, causing performance issues and crashes; updated Prysm nodes are still not participating. It's a mess.
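A rough sketch of why those attestations suddenly became valid about four hours later. It assumes the Phase 0 values of 12-second slots and 32-slot epochs; the genesis time, the timestamps and the helper names are hypothetical, and real clients do more than this (clock-disparity tolerance, queueing, and so on).

```python
SECONDS_PER_SLOT = 12  # Phase 0 value
SLOTS_PER_EPOCH = 32   # Phase 0 value


def slot_at(unix_time: int, genesis_time: int) -> int:
    """Slot number that a node's clock implies at the given time."""
    return (unix_time - genesis_time) // SECONDS_PER_SLOT


def is_from_future(attestation_slot: int, local_time: int, genesis_time: int) -> bool:
    """True if the attestation's slot is ahead of the local clock's slot."""
    return attestation_slot > slot_at(local_time, genesis_time)


genesis = 0                # hypothetical genesis time
now = 1_000_000            # hypothetical correct clock reading
skew = 4 * 60 * 60         # a clock running four hours fast

skewed_slot = slot_at(now + skew, genesis)
slots_ahead = skewed_slot - slot_at(now, genesis)
print(slots_ahead, slots_ahead / SLOTS_PER_EPOCH)      # 1200 slots, 37.5 epochs

print(is_from_future(skewed_slot, now, genesis))         # True: rejected for now
print(is_from_future(skewed_slot, now + skew, genesis))  # False: valid four hours on
```

So a four-hour clock error puts a validator 1200 slots ahead of everyone else, and everything it signed in that time turns up as a backlog for correctly-clocked nodes once real time catches up.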

But it's also one of the greatest tests I can imagine! If we had called a meeting about testing the clients and brainstormed some extreme ways to break the network, I very much doubt that we would have come up with something as exciting as this. The "time transport" aspect led to some fascinating features, not least a large number of validators getting slashed.

As a result, all sorts of previously unexplored code-paths were exercised. Huge improvements have been made across the board. Everything is better than it was. We learned a lot of valuable lessons.

Latest status

Over the early part of this week, the client teams got busy hardening their clients to handle this newly hostile environment. Agonisingly slowly, we crept up towards the magic 66.7% participation rate we need for the network to finalise and be considered healthy again. Eventually we got there around 6.30pm UTC on Wednesday the 19th, almost exactly 5 days after the original incident. Throughout that time, there were always clients able to build the beacon chain: it never stopped running. The beacon chain is robust; the beacon chain can recover :tada:

Some articles and writings on the whole train of events:

Phase 0: The beacon chain

All quiet on the spec front: nothing new to report.

My brilliant R&D colleagues at ConsenSys published a nice progress report on their efforts to formally verify the Phase 0 specification.

And the latest news on the multi-client fuzz testing work by Sigma Prime.

The Great Explainers

Here's something fun: you can learn about Eth2 and earn yourself a POAP at the same time! Info is here. Get yourself onto the Ethstaker Discord and check out the #eth2-studymaster channel. Full announcement here. The idea is, over ten weeks, to read ten Eth2 articles and then answer questions on them. Do well enough and you get the POAP. I'm really hoping that Eth2 devs are not excluded :joy:

Somer Esat continued his excellent series of guides with an in-depth article on setting up Teku on Medalla with Ubuntu. This is now my go-to reference :slightly_smiling_face: Somer and Super Phiz made a video walkthrough of the whole thing. Super Phiz has also done a video with Cayman for setting up the Lodestar client. Coinmonks has a guide to setting up Lighthouse on Ubuntu.

On the subject of Teku, I made a quick Teku trouble-shooting guide. Keep the feedback coming, people!

Simon de la Rouviere produced a nice tweet-form overview of Eth2 for those who like things bite-sized. And the Narkasa exchange published a nice, gentle "What Is Ethereum 2.0?" article.

Research

It barely qualifies as research, but I did an analysis on the effectiveness of the mechanism used by the Eth2 beacon chain to agree on the state of the Eth1 chain. Tl;dr: there's room for improvement and I make a suggestion.

The ConsenSys TX/RX team has made a simulator that runs an Eth1 client in an Eth2 sharded environment (so-called Phase 1.5). Now you can run it too. It's built on Teku for the Eth2 part and Catalyst (a fork of Geth by Guillaume Ballet) for the Eth1 part. Danny did a live demo of this during his EDCON talk last week.

Also on ethresear.ch:

  • Vitalik defends the beacon chain's preference for liveness over consistency when a choice needs to be made. An interesting discussion ensues.

Media

It seems like a long time ago already :sweat_smile:, but the talks from EDCON are available. Lots of good Eth2 stuff:

  • Vitalik from Eth1 to Eth2, opportunities and challenges.
  • Watch Danny wrestle with his cat while demonstrating a simulation of the Eth1/Eth2 merger.
  • Hsiao-Wei talks Phase 1 shard data chains.
  • Aditya on weak subjectivity.
  • Afri on the final road to Eth2.
  • And Terence (Prysm) and Paul (Lighthouse) with more client-focused presentations.

Paradigm did a nice interview with Danny. Find out what he likes to do when not wrestling with cats (it involves vegetables).

Vitalik in CoinTelegraph: Ethereum 2.0 Presents a "Much Harder" Challenge Than We Thought. Nobody panic! We love a good challenge :grinning:

Regular Calls

Implementers

Call #46 took place on the 20th of August.

We did only a very brief review of the Medalla situation since everyone had been in constant contact pretty much throughout, and time was a bit constrained.

Thoughts are turning towards beacon chain launch-readiness. Afri walked through some proposals he has around launch preparation. And Hsiao-Wei is tracking tasks: see the project board for a glimpse behind the scenes.

Client team stuff

Lots of client team updates this week, recounting our adventures on Medalla: update from Nimbus, and a follow-up; and from Lighthouse; and from Lodestar. Prysm's latest update is the write-up and analysis of the Medalla testnet incident.

And finally

What better than to combine a cosy chat about Eth2 with a little wine tasting? Quantstamp and Cred have teamed up to do a remote wine tasting and panel discussion: "Wine Not Talk About Ethereum 2.0?" :wine_glass:

Info and registration here. They ship the wine to you.

I am on the panel, alongside some of my favourite people in the Eth2 world. It turns out to be starting at 2am my local time, so I'll likely be fairly drunk before we even begin. Should be fun!


Follow me on Twitter to hear when the next edition is out 🙌.

 We also have an RSS feed.