E2E Testing with Kurtosis in Prysm: Exploring Feasibilities

{%preview https://github.com/syjn99/prysm/tree/poc/e2e-kurtosis %}
*Branch: `poc/e2e-kurtosis`*

## Current E2E Structure Review

{%preview https://prysm.offchainlabs.com/docs/learn/dev-concepts/end-to-end/ %}

[Prysm](https://github.com/offchainlabs/prysm/) maintains its own E2E package in [`testing/endtoend`](https://github.com/OffchainLabs/prysm/tree/develop/testing/endtoend). The following code is a typical entry point for an E2E test:

```go!
func TestEndToEnd_MinimalConfig(t *testing.T) {
	r := e2eMinimal(t, types.InitForkCfg(version.Bellatrix, version.Electra, params.E2ETestConfig()), types.WithCheckpointSync())
	r.run()
}
```

1. First, it initializes a test runner with a fork schedule, a network config, and options such as `WithCheckpointSync()`. For example, `WithCheckpointSync()` adds an extra step that checks whether a new node can sync from a checkpoint provided by the existing nodes. (See the `testCheckpointSync` function.)
2. Then, it calls the `run` method on the test runner:

```go
// run is the stock test runner
func (r *testRunner) run() {
	r.runBase([]runEvent{r.defaultEndToEndRun})
}
```

- `runBase` is responsible for orchestrating all the components needed to launch a new devnet. There are (literally) *tons* of components: a bootnode, a depositor (which sends transactions to the network), and of course multiple Ethereum nodes (execution client (= geth) + beacon client + validator client).
- `runBase` receives a list of `runEvent`s. Tests like `testCheckpointSync` are called from each `runEvent`.

### Drawbacks

![image](https://hackmd.io/_uploads/HJGCBD2ZWe.png)
*Over 100 PRs tagged as `e2e-tests`...*

- **Maintenance cost**: The E2E package is big, so a decent amount of dev resources must be allocated to it, especially when a new fork arrives.

![image](https://hackmd.io/_uploads/Hy4O8Pn-Wg.png)
*Marked as deprecated but we are still using these...*

- **Reliance on deprecated code**: gRPC API support specifically.
  Because the current E2E package has a long history, it relies heavily on existing gRPC-related code (gRPC client, gRPC calls...).
- **Non-declarative**: What do we expect from E2E? To answer this, we need to read the Go code line by line.

## What Kurtosis & `ethereum-package` provide

[`ethereum-package`](https://github.com/ethpandaops/ethereum-package) is a Kurtosis package for deploying a private network with ease. Besides spinning up a new devnet, it provides useful tools for testing and monitoring. Kurtosis orchestrates all those services based on Docker images.

I'm pretty sure that `ethereum-package` can replace **most** of the code that exists only for E2E tests, for example, `setup()` in `component_handler_test.go`.

One of the core benefits of using `ethereum-package` is customizability. It can:

- Specify participants in the network with different client types and images
- Configure network parameters like `seconds_per_slot`
- Test a remote signer (`web3signer`)
- Run transaction spamming services (`spamoor`, `tx_fuzz`)

---

```bash
bazel test //testing/endtoend-kurtosis:go_default_test --test_output=streamed
```

The current version of my WIP branch can successfully launch a new devnet using:

- Docker images that are built on demand: [They are provided as `data` in `BUILD.bazel`](https://github.com/syjn99/prysm/blob/13fe56d3e2362591d086772be27397f1d091c3d0/testing/endtoend-kurtosis/BUILD.bazel#L13-L17), thanks to the Bazel build system.
- A [default network parameter](https://github.com/syjn99/prysm/blob/poc/e2e-kurtosis/testing/endtoend-kurtosis/network-config/default.yaml) file with only two nodes
- And of course, `ethereum-package`.

## Thinking of Assertions...

Now we can easily spin up a new devnet using the existing Bazel environment. The next step is straightforward: we need to "assert" on the devnet. For example, can this network process a new deposit transaction and add one more validator to its registry?

```go!
// Evaluator defines the structure of the evaluators used to
// conduct the current beacon state during the E2E.
type Evaluator struct {
	Name   string
	Policy func(currentEpoch primitives.Epoch) bool
	// Evaluation accepts one or many/all conns, depending on what is needed by the set of evaluators.
	Evaluation func(ec *EvaluationContext, conn ...*grpc.ClientConn) error
}
```

Prysm uses the concept of [evaluators](https://prysm.offchainlabs.com/docs/learn/dev-concepts/end-to-end/#evaluators) with fine-grained [policies](https://prysm.offchainlabs.com/docs/learn/dev-concepts/end-to-end/#policies). At every epoch that satisfies the given policy, the evaluator runs its `Evaluation` function. In most cases, the `Evaluation` function fetches data from the beacon node using a gRPC client and returns an `error` if something unexpected happens.

Then, given a Kurtosis enclave where a devnet is deployed, how can we test the scenarios we want?

### 1. Introducing Assertoor

[Assertoor](https://github.com/ethpandaops/assertoor/) provides a toolbox with [handy methods](https://github.com/ethpandaops/assertoor/wiki#task-categories) for:

- Running a task
- Asserting a specific condition (e.g., `check_clients_are_healthy`)
- Generating objects like transactions, voluntary exits, or consolidation requests.

As it is also maintained by ethPandaOps, it is well integrated with `ethereum-package`.

<details>
<summary>assertoor parameters in ethereum-package</summary>
<div markdown="1">

```yaml!
# Configuration place for the assertoor testing tool - https://github.com/ethpandaops/assertoor
assertoor_params:
  # Assertoor docker image to use
  # Defaults to the latest image
  image: "ethpandaops/assertoor:latest"

  # Check chain stability
  # This check monitors the chain and succeeds if:
  # - all clients are synced
  # - chain is finalizing for min. 2 epochs
  # - >= 98% correct target votes
  # - >= 80% correct head votes
  # - no reorgs with distance > 2 blocks
  # - no more than 2 reorgs per epoch
  run_stability_check: false

  # Check block proposals
  # This check monitors the chain and succeeds if:
  # - all client pairs have proposed a block
  run_block_proposal_check: false

  # Run normal transaction test
  # This test generates random EOA transactions and checks inclusion with/from all client pairs
  # This test checks for:
  # - block proposals with transactions from all client pairs
  # - transaction inclusion when submitting via each client pair
  # test is done twice, first with legacy (type 0) transactions, then with dynfee (type 2) transactions
  run_transaction_test: false

  # Run blob transaction test
  # This test generates blob transactions and checks inclusion with/from all client pairs
  # This test checks for:
  # - block proposals with blobs from all client pairs
  # - blob inclusion when submitting via each client pair
  run_blob_transaction_test: false

  # Run all-opcodes transaction test
  # This test generates a transaction that triggers all EVM OPCODES once
  # This test checks for:
  # - all-opcodes transaction success
  run_opcodes_transaction_test: false

  # Run validator lifecycle test (~48h to complete)
  # This test requires exactly 500 active validator keys.
  # The test will cause a temporary chain unfinality when running.
  # This test checks:
  # - Deposit inclusion with/from all client pairs
  # - BLS Change inclusion with/from all client pairs
  # - Voluntary Exit inclusion with/from all client pairs
  # - Attester Slashing inclusion with/from all client pairs
  # - Proposer Slashing inclusion with/from all client pairs
  # all checks are done during finality & unfinality
  run_lifecycle_test: false

  # Run additional tests from external test definitions
  # Entries may be simple strings (link to the test file) or dictionaries with more flexibility
  # eg:
  #   - https://raw.githubusercontent.com/ethpandaops/assertoor/master/example/tests/block-proposal-check.yaml
  #   - file: "https://raw.githubusercontent.com/ethpandaops/assertoor/master/example/tests/block-proposal-check.yaml"
  #     config:
  #       someCustomTestConfig: "some value"
  tests: []
```

</div>
</details>

Assertoor can be launched with `ethereum-package` when we explicitly list `"assertoor"` in `additional_services`. `ethereum-package` provides [pre-built test suites](https://github.com/ethpandaops/ethereum-package/tree/main/static_files/assertoor-config/tests) like `run_lifecycle_test` as well. There are a bunch of YAML files written by the ethPandaOps team in the [playbooks](https://github.com/ethpandaops/assertoor/tree/master/playbooks) directory.

```yaml!
id: test1
name: "Test 1"
timeout: 1h
config:
  # walletPrivkey: ""
  # validatorPairNames: []
configVars: {}
tasks: []
cleanupTasks: []
schedule:
  startup: true
  cron:
    - "* * * * *"
```

If we want to run a specific assertion periodically (e.g., every epoch), we might use the `cron` field in `schedule`.

### 2. Reusing `evaluators`

We can think of reusing the current code by making `Evaluator` more flexible. We can access any service in the Kurtosis enclave, as Kurtosis automatically exposes ports to the user. We might use a typical HTTP client with the REST API to communicate with each service in the enclave, so that `Evaluator` accepts an HTTP connection instead of a gRPC connection.
## Challenges & Open Questions

There are a few specific tests that `ethereum-package` doesn't provide:

- Sync test ([`testBeaconChainSync`](https://github.com/OffchainLabs/prysm/blob/61de11e2c4749e78b36054924453bea39225cd28/testing/endtoend/endtoend_test.go#L359-L418)): Add a new node that syncs from genesis and test whether it catches up to the head correctly.
- Checkpoint test ([`testCheckpointSync`](https://github.com/OffchainLabs/prysm/blob/61de11e2c4749e78b36054924453bea39225cd28/testing/endtoend/endtoend_test.go#L295-L357)): Add a new node that syncs from a checkpoint (provided by one of the nodes in the devnet) and test whether it catches up to the head correctly.
- Doppelganger test ([`testDoppelGangerProtection`](https://github.com/OffchainLabs/prysm/blob/61de11e2c4749e78b36054924453bea39225cd28/testing/endtoend/endtoend_test.go#L420-L455)): Add a new validator node with the same validator key and see whether the validator detects the duplicate instance.
- Offline test (e.g., [`multiScenario`](https://github.com/OffchainLabs/prysm/blob/61de11e2c4749e78b36054924453bea39225cd28/testing/endtoend/endtoend_test.go#L769-L846)): Take a few participants offline and see whether the network remains stable. Also test that those participants get back on track after they resume.

---

Here are some open questions on my mind, listed in no particular order.

- When are we going to be **confident** in this solution?
- How can we say this new E2E testing package has more coverage than the current one?
- Do we need a dedicated branch for this? Or should we just merge it into `develop`?