Second Opinion ZK Oracle

# Second Opinion ZK Oracle: Technical Design and Architecture

## Abstract

This specification gives an overview of the ZK Oracle project based on [LIP-23](https://github.com/lidofinance/lido-improvement-proposals/blob/develop/LIPS/lip-23.md). The main idea is to build a robust oracle that uses zero-knowledge technology to report the consensus layer validators' balance and the Withdrawal Vault's balance, and to plug it into the protocol as a second opinion that confirms a large negative rebase for stETH.

## Interface

The Oracle interface is implemented as:

```
interface SecondOpinionOracle {
    function getReport(uint256 refSlot)
        external
        view
        returns (
            bool success,
            uint256 clBalanceGwei,
            uint256 withdrawalVaultBalanceWei,
            uint256 totalDepositedValidators,
            uint256 totalExitedValidators
        );
}
```

NOTE: `totalDepositedValidators` and `totalExitedValidators` aren't used in the current implementation and are therefore omitted from the ZK Oracle report.

## ZK Technology used

The current implementation of the Second Opinion ZK Oracle is based on SuccinctLabs' [SP1](https://docs.succinct.xyz/docs/sp1/introduction).

SP1 is a zero‑knowledge virtual machine (zkVM) that proves the correct execution of programs compiled for the RISC-V architecture. This means it can run and prove programs written in Rust, C++, C, or any language that compiles to RISC-V.

According to its documentation, SP1 is feature-complete, consistently delivers state-of-the-art performance on industry-standard benchmarks, and has been rigorously audited by top security firms. It's trusted in production by leading teams across blockchains, cryptography, and beyond.

## System components

The overall diagram for the components:

![image](https://hackmd.io/_uploads/SkFBVduVex.png)

The ZK Oracle consists of the following components:

1. Program - Rust code executed inside the SP1 zkVM. This is the main oracle source code.
2. Script - Rust code used to manually bootstrap execution of the Program inside the zkVM.
3. Service - Rust code that executes the Program inside the zkVM on a predefined cadence and under predefined conditions. For example, this component can run the Oracle computations once a day after a particular reference slot reaches finality.
4. Shared - Rust code shared between Program, Script and Service. It's a library with the extracted common functionality.
5. Contracts - Solidity code that provides a contract implementing the Second Opinion interface and accumulating the results of ZK Oracle runs.

## Program Flows

### SubmitReportData()

![NM_SubmitReportData](https://hackmd.io/_uploads/SkFOrlaBxl.png)

### Circuit Flow Diagram

![image](https://hackmd.io/_uploads/HJTwLeaHgl.png)

## GateSeal for disabling Second Opinion reporting in case of an emergency

ZK technology is still young and may contain bugs and vulnerabilities. To protect the Lido protocol from potential exposure, it's necessary to introduce a GateSeal that can pause Second Opinion oracle operations when a vulnerability is discovered, for a timespan long enough for Lido Governance to intervene.

If the ZK vulnerability is confirmed, Lido Governance should detach the compromised second opinion oracle contract from the protocol via voting. In case of a false positive, there is no need to take action: the second opinion will be unpaused after a period of time.

It's proposed to have the same [GateSeal Committee](https://docs.lido.fi/contracts/gate-seal) for this GateSeal as for the main Lido protocol.

To let the GateSeal pause the Second Opinion contract, the contract uses an Access Control pattern: the DAO Agent holds the admin role and grants the pause and unpause roles to the GateSeal contract and the [Reseal Manager](https://github.com/lidofinance/dual-governance/blob/docs/update-known-risks-and-limitaitons/docs/specification.md#contract-resealmanager) (part of Dual Governance).
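The pause behaviour described in the GateSeal section (a role-gated pause that expires on its own, plus an explicit resume) can be illustrated with a small off-chain model. This is a minimal sketch under stated assumptions, not the contract's actual code: the names `SecondOpinionPause`, `Role`, `PauseError`, `pause_for`, `resume` and the `PAUSE_INFINITELY` sentinel are hypothetical and chosen only for illustration.

```rust
/// Seconds in one day, used to express pause durations.
const DAY: u64 = 24 * 60 * 60;
/// Hypothetical sentinel duration meaning "paused until explicitly resumed".
const PAUSE_INFINITELY: u64 = u64::MAX;

/// Hypothetical roles mirroring the pause/unpause split described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    Pause,
    Resume,
}

#[derive(Debug, PartialEq, Eq)]
enum PauseError {
    MissingRole,
    AlreadyPaused,
    NotPaused,
    ZeroDuration,
}

/// Illustrative model of a contract that can be paused for a limited time.
struct SecondOpinionPause {
    /// Timestamp from which the contract is active again (0 = never paused).
    resume_since: u64,
}

impl SecondOpinionPause {
    fn new() -> Self {
        Self { resume_since: 0 }
    }

    /// The pause expires on its own once `now` reaches `resume_since`.
    fn is_paused(&self, now: u64) -> bool {
        now < self.resume_since
    }

    /// Pause for `duration` seconds; only a caller holding `Role::Pause` may do so.
    fn pause_for(&mut self, caller: &[Role], now: u64, duration: u64) -> Result<(), PauseError> {
        if !caller.contains(&Role::Pause) {
            return Err(PauseError::MissingRole);
        }
        if self.is_paused(now) {
            return Err(PauseError::AlreadyPaused);
        }
        if duration == 0 {
            return Err(PauseError::ZeroDuration);
        }
        // Saturating add keeps PAUSE_INFINITELY at u64::MAX instead of overflowing.
        self.resume_since = now.saturating_add(duration);
        Ok(())
    }

    /// Lift the pause early; only a caller holding `Role::Resume` may do so.
    fn resume(&mut self, caller: &[Role], now: u64) -> Result<(), PauseError> {
        if !caller.contains(&Role::Resume) {
            return Err(PauseError::MissingRole);
        }
        if !self.is_paused(now) {
            return Err(PauseError::NotPaused);
        }
        self.resume_since = now;
        Ok(())
    }
}

fn main() {
    let gate_seal = [Role::Pause];
    let reseal_manager = [Role::Pause, Role::Resume];
    let mut state = SecondOpinionPause::new();

    // The GateSeal pauses for a limited period; without action the pause expires.
    state.pause_for(&gate_seal, 1_000, 14 * DAY).unwrap();
    println!("paused one day later: {}", state.is_paused(1_000 + DAY));
    println!("paused after expiry:  {}", state.is_paused(1_000 + 14 * DAY));

    // A holder of both roles can extend to an indefinite pause and resume later.
    state.pause_for(&reseal_manager, 2_000_000, PAUSE_INFINITELY).unwrap();
    state.resume(&reseal_manager, 3_000_000).unwrap();
    println!("paused after resume:  {}", state.is_paused(3_000_000));
}
```

In this model a false positive needs no follow-up action, because `is_paused` turns false once the chosen duration has elapsed, while a confirmed vulnerability leaves Governance the whole pause window to detach the contract.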
## Service lifetime and monitoring consideration

The ZK Oracle service runs constantly in a Docker environment. It uses an internal cron to periodically wake up, check parameters and, if they match, start preparing the report.

It's proposed that second opinion ZK reports are synchronized with the traditional oracle committee reports and happen once every 24 hours. Reports can be made less frequent using the internal cron. They cannot be made more frequent, because each report is tied to a reference slot reported by HashConsensus.

Usually a report is submitted to `Sp1LidoAccountingReportContract` after 30-40 minutes of computation. After a fresh install, when no cache is available, a run can take up to 2-3 hours.

The following measures help with oracle maintenance:

- A mechanism triggers script execution for the correct reference slots: poll the HashConsensus contract for a new refslot and run the report once the slot has reached finalization.
- Metadata request endpoints:
  - `/health` - simple health check, returns 200 OK if alive
  - `/metrics` - endpoint for collecting metrics, see Metrics section for details
- Dry-run mode for a hot standby: the instance gathers and prepares all data but doesn't submit to the external prover or the contract. The mode is controlled via an env var; promoting an instance to normal mode requires changing the env var in the container and restarting the service.

## Deployment with Docker

- Two-stage build
- Second stage based on an Alpine image
- All parameters (including sensitive ones) injected through ENV
- Healthcheck pointing to the `/health` endpoint
- P2: Exponential backoff for restarts

## Dependencies

1. [SP1 Verifying contract](https://github.com/succinctlabs/sp1-contracts/blob/main/contracts/src/v5.0.0/SP1VerifierGroth16.sol) ([deployments](https://github.com/succinctlabs/sp1-contracts/blob/main/contracts/deployments/))
2. [SP1 Prover Network](https://docs.succinct.xyz/docs/sp1/prover-network/intro)

### Rust dependencies

```
alloy = { version = "1", features = ["contract", "json", "providers", "signer-local", "signers", "sol-types", "network"] }

# ==== cargo edition and rustc compatibility ====
# These dependencies are pinned to a concrete version as a compatibility mechanism between new versions of the crates
# (moving on to edition=2024 and rust-version=1.85) and the sp1 compiler that's based on rustc=1.82
# This can be unpinned when sp1 rustc moves to at least 1.85
# 1.2.0 requires rust >= 1.85, sp1 uses 1.82-dev
# Note: there are overrides in script and service Cargo.toml in dev-dependencies - needs to be updated there as well
alloy-primitives = { version = "=1.1.3", features = ["serde", "rlp"] }
syn-solidity = "=1.1.3"
# ==== cargo edition and rustc compatibility ====

alloy-sol-types = "1"
arbitrary = "1.4"
alloy-rlp = { version = "0.3.10", features = ["derive"] }
anyhow = "1.0"
chrono = { version = "0.4", features = ["clock"] }
chrono-tz = "0.10.3"
derive_more = { version = "2.0", features = ["debug"] }
dotenvy = "0.15.7"
ethereum_hashing = "0.7.0"
ethereum_serde_utils = "0.8"
ethereum_ssz = "0.9"
ethereum_ssz_derive = "=0.9"
ethereum-types = { version = "0.15.1", features = ["arbitrary"] }
eth_trie = "0.6.0"
eyre = "0.6.12"
hex = "0.4.3"
hex-literal = "1"
itertools = "0.14.0"
json-subscriber = "0.2.4"
k256 = "0.13.3"
lazy_static = "1.5"
log = "0.4.27"
prometheus = "0.14"
proptest = "1.6"
proptest-arbitrary-interop = "0.1"
rand = "0.9"
reqwest = "0.12"
rs_merkle = "1.5"
serde = { version = "1.0", default-features = false, features = ["derive"] }
serde_derive = "1.0"
serde_json = { version = "1.0", default-features = false, features = ["alloc"] }
simple_logger = "5.0"
sp1-derive = "5"
sp1-helper = "5"
sp1-sdk = { version = "5", features = ["network"] }
sp1-zkvm = "5"
thiserror = "2.0"
tokio = "1.45"
tracing = "0.1.41"
tracing-forest = "0.1.6"
tracing-subscriber = { version = "0.3.19", features = ["std", "fmt", "json"] }
tree_hash = "0.10"
tree_hash_derive = "0.10"
typenum = "1.18"
# ssz_types = { version = "0.12.0", features = ["arbitrary", "cap-typenum-to-usize-overflow"], path = "../../ssz_types" }
ssz_types = { git = "https://github.com/lidofinance/ssz_types", features = ["arbitrary", "cap-typenum-to-usize-overflow"] }
```

## Roles and Access Control

| Role               | Assignee               |
|--------------------|------------------------|
| DEFAULT_ADMIN_ROLE | Aragon Agent           |
| PAUSE_ROLE         | GateSeal contract      |
|                    | ResealManager contract |
| RESUME_ROLE        | ResealManager contract |

`Sp1LidoAccountingReportContract` is built with the OpenZeppelin AccessControl pattern. The admin role is intended to be transferred to the Lido DAO Agent after deployment. During deployment it's also intended to grant the pause role to the GateSeal contract and both the pause and resume roles to ResealManager.

The proposed default **pause is 14 days**, to give Governance time to unplug a malicious or broken second opinion contract from the Lido protocol.

The main `submitReportData()` function is permissionless, so anyone can call it and submit a report as long as the contract is not paused. Inside the function, a call to the SP1 Verifier contract checks the report's validity and provability. A malicious or incorrect report shouldn't pass this check.

## Failure Modes

1. Succinct Prover Network is down
2. Lido ZK Oracle service is down

## Dual Governance relation

As mentioned in [LIP-23 regarding the Dual Governance](https://github.com/lidofinance/lido-improvement-proposals/blob/develop/LIPS/lip-23.md#dual-governance-clashing), there is a risk of deadlock. Another risk is that Dual Governance may block removing the compromised ZK Oracle and block re-plugging a fixed version. If that is considered a major issue, the default GateSeal pause should be longer or infinite.
## Sanity checkers improvements: Zero rebase

The sanity checker's reliance on the Second Opinion can be increased by modifying the current values of the `initialSlashingAmountPWei` and `inactivityPenaltiesAmountPWei` parameters. These values exist to tolerate the initial possible validator penalties, so that the second opinion is not consulted for small negative rebases. However, they could be reduced down to 0, in which case every negative rebase is checked against the second opinion. There are pros and cons to this change.

The pros:

- Second Opinion could be used to prevent even a small negative rebase.
- It becomes possible to remove the complex SanityChecker logic that preserves rebases for the last 54 days.

As for the cons:

- If the second opinion is paused or unplugged, every small negative rebase will block the oracle reports and will require the DAO to either change the parameters or re-plug the Second Opinion Oracle. That said, there have been no negative rebases in the wild so far.
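The effect of setting both parameters to 0 can be sketched with a deliberately simplified threshold model. It is an illustration only and not the actual SanityChecker logic, which per the text above also tracks rebases over the last 54 days. The struct `Limits`, the functions `tolerated_decrease_wei` and `needs_second_opinion`, and the example values 1000 and 101 are assumptions made for this sketch; 1 PWei is taken as 10^15 wei.

```rust
/// 1 PWei = 10^15 wei.
const WEI_PER_PWEI: u128 = 1_000_000_000_000_000;
/// 1 ETH = 10^18 wei.
const WEI_PER_ETH: u128 = 1_000_000_000_000_000_000;

/// Per-validator tolerances, named after the parameters discussed above.
struct Limits {
    initial_slashing_amount_pwei: u128,
    inactivity_penalties_amount_pwei: u128,
}

/// Simplified tolerance: per-validator allowance times the number of validators.
/// (Assumption for this sketch; the real checker is more involved.)
fn tolerated_decrease_wei(limits: &Limits, validators: u128) -> u128 {
    let per_validator_pwei =
        limits.initial_slashing_amount_pwei + limits.inactivity_penalties_amount_pwei;
    validators
        .saturating_mul(per_validator_pwei)
        .saturating_mul(WEI_PER_PWEI)
}

/// Returns true when the CL balance decrease exceeds the tolerated amount,
/// i.e. when the report would have to be confirmed by the second opinion.
fn needs_second_opinion(limits: &Limits, validators: u128, pre_wei: u128, post_wei: u128) -> bool {
    if post_wei >= pre_wei {
        return false; // not a negative rebase
    }
    pre_wei - post_wei > tolerated_decrease_wei(limits, validators)
}

fn main() {
    // Illustrative values only.
    let tolerant = Limits {
        initial_slashing_amount_pwei: 1000,
        inactivity_penalties_amount_pwei: 101,
    };
    let zero = Limits {
        initial_slashing_amount_pwei: 0,
        inactivity_penalties_amount_pwei: 0,
    };

    let (pre, post) = (320 * WEI_PER_ETH, 315 * WEI_PER_ETH); // a 5 ETH decrease
    println!("tolerant limits: {}", needs_second_opinion(&tolerant, 10, pre, post));
    println!("zero limits:     {}", needs_second_opinion(&zero, 10, pre, post));
}
```

With non-zero tolerances a small decrease passes without consulting the second opinion. With both values at 0, any decrease at all requires it, which is exactly why a paused or unplugged second opinion would then block reports on every negative rebase.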