# Who Moved My ~~Cheese~~ Testnet?

The recent incidents on the Holesky and Sepolia networks have made me think about the current state of our testnets and who is most affected by these issues. They also made me reflect on our current approaches to testing hard forks and other coordinated rollouts, and where we can improve. In this post, I hope to share my view on the current state of our testnets and offer some suggestions that I believe can help us move forward.

## TLDR

- Both the Holesky and Sepolia Testnets are important for external entities testing their software and infrastructure; they have evolved from being simply a testing playground into something closer to a staging environment.
- We need clarity on the requirements for the Sepolia and Holesky Testnets. Each network has its purpose, and different people rely on them for different use cases.
- Core-devs need a way of testing rollouts and network upgrades that is as close as possible to Mainnet. The fewer gaps we have between Mainnet and these testing environments, the lower the risk of breaking important things.

## What are Ethereum Testnets?

This is a quick primer for those unfamiliar with Ethereum testnets and their purpose. As of today, Ethereum has two (three if you include Ephemery) test networks.

**Sepolia**

[Sepolia](https://github.com/eth-clients/sepolia) is a permissioned network with 1,780 validators (as of 12 March 2025). Permissioned does not mean it is a private network; it means that only specific entities are allowed to join the network as validators. The network has a modified version of the deposit smart contract (this is the reason why [we had an issue](https://mariusvanderwijden.github.io/blog/2025/03/08/Sepolia/) when upgrading the network to Pectra) that requires a specific token to create a deposit. Only people holding that token are allowed to create a deposit.

The benefit of having control over the validator set is that we can "enforce" operators to run their validators as well as they can, even without the monetary incentives a network like Mainnet would provide. It also makes it a lot easier to roll out changes in a coordinated fashion: we can have the whole validator set running a specific version of client software within a short period. This makes Sepolia the recommended network for application developers, as it is the most stable test network we have.

**Holesky**

[Holesky](https://github.com/eth-clients/holesky) is a public network with 1,750,086 validators (as of 12 March 2025). It is by far our largest live network (larger than Mainnet). In fact, the whole reason we created Holesky was to have a network that would show us what Mainnet might look like as it grows, giving us core-devs a chance to identify issues on Holesky before they hit Mainnet.

Because Holesky is public, anyone can run a validator on the network, making it the perfect testing ground for home stakers who want to gain experience running a node before committing to running a Mainnet validator. It is also a great testing ground for other entities such as L2s or staking pools, as they can test their systems in an adversarial environment that simulates Mainnet.

## How do client teams test their software?

One of the most fundamental things about Ethereum (and blockchains in general) is its distributed nature. This presents a real testing challenge for us client developers, because most of the core features of the protocol, and the software we write, are designed to interoperate with other clients.
In the early years of Ethereum, each team would come up with its own way of running large-scale tests, relying on its internal devops team to support spinning up development networks. We had some public Testnets like Ropsten and Rinkeby where we could test our code in conditions as close to Mainnet as possible. As Ethereum evolved, so did the topological complexity of running nodes and large-scale Testnets: we now have Consensus Clients, Execution Clients, mev-boost, relays, DVT, etc.

Thankfully, in the past few years, the EF has given us [ethPandaOps](https://ethpandaops.io/), a team of very talented people that has helped client teams tremendously with lots of automation, testing and monitoring tools. With their tools, we are now able to run small networks locally on our laptops (using [Kurtosis](https://github.com/kurtosis-tech/kurtosis) and the [ethereum-package](https://github.com/ethpandaops/ethereum-package)), and we can also spin up huge networks for testing specific features (we call those devnets).

From a very simplistic point of view, these are the testing "scopes" we have for a specific fork and for each of the features included in it (e.g. EIPs). The larger the circle, the more complex the deployment and coordination required to test.

![Testing scopes, from unit tests out to public Testnets](https://hackmd.io/_uploads/HyxVT_ynkl.png)

For example, while working on a specific feature, I will start by relying on unit tests and reference tests to ensure the code I am writing is doing what I expect it to do and, more importantly, not doing something that breaks the specification. Once I am happy with it, I will start a local interop network (using Kurtosis) to test how my client interacts with other clients that have already implemented the same feature.

One great feature of Kurtosis is that the local network is defined by a YAML config file that is easy to share with other teams. So if I find a bug or some unexpected behaviour when interacting with other clients, I can send them the config I am using and they can have a replica of the local network I was running on their computer, all within minutes!
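As an illustration, a minimal two-client network could be described with something like the following. The field names here follow the ethereum-package docs at the time of writing and may differ between versions, so treat this as a sketch rather than a definitive config:

```yaml
# network_params.yaml — a minimal two-node interop network.
# Field names may vary between ethereum-package versions.
participants:
  - el_type: geth        # Execution client
    cl_type: lighthouse  # Consensus client
  - el_type: nethermind
    cl_type: teku
network_params:
  # Schedule the fork under test at an early epoch so it activates quickly.
  electra_fork_epoch: 1
```

You would then bring the network up with something like `kurtosis run github.com/ethpandaops/ethereum-package --args-file network_params.yaml`, and sharing that single file is enough for another team to reproduce the same topology.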
The next step is where core-dev teams start to coordinate interop testing using short-lived networks called devnets. These can be specific to a particular feature or cover the complete feature set of a fork. ethPandaOps is responsible for spinning up these devnets, and we all have a chance to test our clients and identify any remaining interop issues.

Sometimes we use what we call Shadowforks for testing interop. Shadowforks are like a network "clone": using them, we can start a network from another existing network, including Mainnet. They are useful when we want world state or validator data from real environments, and are particularly good at surfacing bugs that only appear on networks with existing history. If you want to learn more about Shadowforks, I suggest taking a look at [this article](https://www.alchemy.com/overviews/shadow-fork).

We also have [Hive tests](https://github.com/ethereum/hive) to run integration tests across different clients. Hive is a tool for running end-to-end tests against Ethereum clients. The Ethereum Foundation maintains two public Hive instances, constantly testing consensus, p2p and client compatibility. These tests are particularly important because this is usually where we would see issues like Execution Engine API divergence between clients, and other things that can be hard to find without testing the full matrix of Consensus and Execution clients.
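To make that concrete, this is roughly what running Hive locally against a pair of execution clients looks like. The simulator and client names are examples from the Hive repository, and the exact flags may vary between versions:

```sh
# Build the hive binary from the repository.
git clone https://github.com/ethereum/hive
cd hive
go build .

# Run the Engine API simulation against two execution clients, so any
# divergence between their implementations shows up as test failures.
./hive --sim ethereum/engine --client go-ethereum,nethermind
```

Repeating the same simulation across every client pairing is what lets us cover the full compatibility matrix without doing it by hand.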
Next in line, we have our Testnets. This is where things get more serious, as these networks are all used by other entities and users, and we try hard to ensure our upgrades are harmless (although we [did not succeed this time](https://etherworld.co/2025/03/03/aftermath-of-holesky-testnet-incident-lessons-learned/) testing the Pectra fork). The ideal scenario is a flawless rollout, with no interop issues between clients or any protocol breaches. All Testnet rollouts are coordinated and scheduled in advance, so node operators have time to upgrade their clients.

## Lessons learned after the Holesky and Sepolia incidents

One problem during the Holesky and Sepolia incidents is that these networks aren't used just by core-devs to test their deployment and network upgrade logic. They are actively used by other entities to test their software (think L2s, staking pools and other systems). So by the time things went bad on Holesky, all the testing pipelines of these entities were blocked. This problem is even worse because many of these entities require specific infrastructure deployed on the network, including smart contracts, oracles, monitoring systems and more, so it isn't trivial for them to just shift their testing to another network.

On the other hand, I have heard from some core-devs things like: "We need a Testnet that we can break", or "We broke Holesky? This is what a Testnet is for!". And to be fair, they are not wrong. We do need a way to test our rollouts without having to worry about "cleaning up" after ourselves when things go wrong.

In this particular instance, both the Holesky and Sepolia incidents were caused by divergences between our Testnet and Mainnet configurations. It is funny to think that neither of these issues would have happened on Mainnet. Maybe next time we just press the button and go straight to Mainnet (nope, we won't) 🙂.

What is clear to me is that we need better alignment on the requirements of each Testnet, and on who its intended audience is. We also need to set clear expectations with users of our existing Testnets that we will treat these networks as we would treat Mainnet. During the Holesky incident, there were calls for a hard-fork fix of the network, something that would be unacceptable on Mainnet, so why were we even talking about it? If we were going to fix Holesky, we would need to do it the same way we would on Mainnet. After all, this is what the protocol is designed to do. If it can't survive something like this, we need to go back to the drawing board and fix it (spoiler alert: [the network is fixed and finalizing again](https://x.com/ethPandaOps/status/1899184122825765187)).

Another important task is to revisit our testing strategy and identify all the areas where our deployment, configuration or anything else differs from Mainnet. In the Sepolia incident, the [reason for the failure](https://mariusvanderwijden.github.io/blog/2025/03/08/Sepolia/) was that the deposit contract on Sepolia is slightly different from the one on Mainnet, and most people didn't even know they were different. The issue that happened on Holesky was not caught in earlier testing because the way we configure our devnets is different from how we bootstrap our Testnet configuration.

The tighter the gap between our testing environments (Kurtosis, devnets and Testnets) and Mainnet, the less likely we are to be surprised when rolling things out. The complexity of the protocol is growing with each new fork, so we are doing ourselves a favour by spending time identifying those differences and removing as many of them as possible.

## Where do we go now?

At the moment, core-devs are working with ethPandaOps on a replacement for Holesky that can be used by the community to test all Pectra-related changes in a timely manner. [Some ideas have been proposed](https://notes.ethereum.org/@ethpandaops/post-finality-path-forward-holesky) and soon we will have something up and running.

With regards to the existing Holesky network, it isn't clear what the future holds for it. Although it has recovered finality, in the process heaps of validators were slashed and others were ejected after leaking their balances too low. We could leave it online and let it run its course, but it wouldn't be fulfilling its primary goal of working as a staging environment for the community.

The past few weeks have been an excellent learning opportunity for us core-devs, and every team has had the chance to improve their clients to handle these catastrophic scenarios. We have had devnets in the past to test non-finality, but I think we have never run a scenario at the scale of what happened on Holesky. And the fact that we were able to eventually stabilise the network and recover finality is awesome. On the other hand, there is a huge cost associated with keeping the network alive, so I believe we will sunset Holesky soon (of course, we will have a replacement for it).

Last but not least, everything that happened these past few weeks has helped us identify areas where we can improve our testing, closing gaps between our devnets and Mainnet, and I am sure all client teams are coming out of it more experienced in disaster recovery and better at responding to incidents. It is a huge moment for us to show how we can learn from our mistakes and improve the protocol, our processes and our code.

# References

- [Marius's blog post: Sepolia Pectra fork incident recap](https://mariusvanderwijden.github.io/blog/2025/03/08/Sepolia/)
- [Kurtosis GitHub repository](https://github.com/kurtosis-tech/kurtosis)
- [ethereum-package for Kurtosis GitHub repository](https://github.com/ethpandaops/ethereum-package)
- [Alchemy article: What is a Shadowfork?](https://www.alchemy.com/overviews/shadow-fork)
- [EtherWorld article: Aftermath of Holesky Testnet Incident: Lessons Learned](https://etherworld.co/2025/03/03/aftermath-of-holesky-testnet-incident-lessons-learned/)
- [ethPandaOps blog post: Post-finality Holesky Path Forward](https://notes.ethereum.org/@ethpandaops/post-finality-path-forward-holesky)
- [Pari and Marius interview at Galaxy talking about Pectra testing](https://www.youtube.com/watch?v=MFoWn4Lf5OQ)