# Trustless Bitcoin Oracle with zkSPV Contract
*If your chain doesn't interact with Bitcoin, what are you busy with?*
When a cross-chain application needs to be built between Bitcoin and another system, a method is required to synchronize Bitcoin's history with the designated network. The most straightforward approach is an oracle (centralized or federated), but few would consider it reliable. At the same time, we have clear rules for (1) validating Bitcoin blocks and the correctness of the entire chain, (2) detecting the mainchain, and (3) handling reorganizations. So we can program Bitcoin's verification rules in the form of a smart contract, allowing anyone to submit blocks and have them accepted if they are correct.
We could have a full node in the form of a smart contract (verifying all blocks and transactions, managing reorgs and updates, etc.), but its maintenance cost raises questions. Fortunately, Satoshi proposed a much lighter approach: an SPV node.
In this write-up, we describe the concept of the SPV contract and approaches to implementing it. Happy journey!
> We used Noir to build the Bitcoin prover (which we are constantly referring to in the write-up). We also mention some specific approaches we applied and limitations we had with the framework and our implementation: https://github.com/distributed-lab/bitcoin-prover. We kindly ask the reader not to limit themselves to this specific proposal and instead use and apply whatever they prefer, based on the requirements and desired properties they want to achieve. A lot of work is still in progress; we are open to any ideas and contributions, so this version of the write-up definitely isn't the latest one.
## Back to the SPV node design
The SPV node concept was proposed in the original Bitcoin [whitepaper](https://bitcoin.org/bitcoin.pdf). It allows for reliable syncing of Bitcoin history without maintaining a full node, simply by syncing the block headers and verifying their **correctness**. Since every block header includes the Merkle root of the included transactions, any of these transactions can then be reliably proven.
For transaction verification, the SPV node asks a full node to return a proof of the transaction's inclusion in a block; the proof is a Merkle branch. The SPV node then performs a list of verifications to ensure the transaction is valid:
1. Block header verification:
* Structure and existence of all fields: `version` (4 bytes), `prev_block` (32 bytes), `root` (32 bytes), `timestamp` (4 bytes), `bits` (4 bytes), `nonce` (4 bytes)
* The block is part of the chain and refers to the existing previous block
* The timestamp exceeds the median timestamp of the previous 11 blocks; full nodes also won't accept blocks with timestamps more than two hours in the future according to their own clocks
* Difficulty target $^1$
* The double-SHA256 hash of the 80-byte header (all fields above, including the nonce) must equal the block hash and satisfy the difficulty target parameter
> $^1$ The double hash of the block's header must satisfy the defined difficulty parameter (i.e., be less than the target value). The difficulty target is adjusted every 2,016 blocks to keep the block mining time in line with the current network hash rate: the total number of minutes miners took to mine the last 2,015 blocks is divided by the protocol's desired goal of 20,160 minutes (2,016 blocks × 10 minutes), and the resulting ratio is multiplied by the current difficulty to produce the new one. If the correction factor is greater than 4 (or less than 1/4), then 4 (or 1/4) is used instead to prevent the change from being too abrupt.
2. Merkle branch verification. When the SPV node receives a transaction together with its Merkle branch, it verifies that the path leads to an existing Merkle root (defined in a mainchain block header)
3. The SPV node must verify that the block is included in the mainchain (the heaviest chain), which means that a sufficient number of blocks have been built on top of the provided block.
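For intuition, the proof-of-work check and the Merkle-branch check from the list above can be sketched outside the circuit (plain Python, not the Noir implementation; it ignores the exotic corner cases of the compact `bits` encoding):

```python
import hashlib

def dsha256(b: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def bits_to_target(bits: int) -> int:
    """Decode the compact 'bits' field into the 256-bit target."""
    exponent = bits >> 24
    mantissa = bits & 0xFFFFFF
    return mantissa << (8 * (exponent - 3))

def check_header_pow(header80: bytes) -> bool:
    """Verify the 80-byte header hashes at or below its own encoded target."""
    assert len(header80) == 80
    bits = int.from_bytes(header80[72:76], "little")
    block_hash = int.from_bytes(dsha256(header80), "little")
    return block_hash <= bits_to_target(bits)

def merkle_branch_verify(txid: bytes, branch: list, index: int, root: bytes) -> bool:
    """Climb the Merkle branch: at each level, concatenate with the sibling
    (order given by the index bit) and double-SHA256."""
    h = txid
    for sibling in branch:
        h = dsha256(sibling + h) if index & 1 else dsha256(h + sibling)
        index >>= 1
    return h == root
```

For example, `check_header_pow` accepts the genesis block header. A real SPV node additionally checks the median-time-past rule and the retarget schedule, which are omitted here.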
> We will use the following notation:
> * $\mathsf{bh\_verify}(h)\to \{0,1\}$ - the function that verifies the block header $h$ according to the rules described above
> * $\mathsf{hbatch\_verify}(h_l,\dots,h_h) \to 0\,/\,\{d,\mathsf{root}\}$ - the function that verifies the sequence of headers from $h_l$ to $h_h$ and returns the difficulty $d$ and the Merkle root if the sequence is correct
> * $\pi_{\mathcal{R}}$ is a proof for the relation $\mathcal{R} = \{(a, b, c, ...; x, y, z,...): f(a, b, c, ..., x, y, z,...)\}$, where $a$, $b$, $c$, … are public variables, $x$, $y$, $z$, … are private variables. To prove the relation, the prover will show the knowledge of $x$, $y$, $z$, …, such that $f(a, b, c, ..., x, y, z, ...)$ is true
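The difficulty retarget described in footnote $^1$ can be sketched as follows (Python; the function works on the *target* rather than the difficulty, so the time ratio is applied directly and clamped):

```python
def retarget(old_target: int, actual_timespan_sec: int) -> int:
    """Recompute the target after a 2016-block epoch.

    The adjustment ratio (actual time / desired time) is clamped to
    [1/4, 4], mirroring the rule described above; real nodes also cap
    the result at the maximum allowed target, which we omit here."""
    desired = 2016 * 600  # 20,160 minutes, in seconds
    actual = max(desired // 4, min(actual_timespan_sec, desired * 4))
    return old_target * actual // desired
```

If blocks arrive twice as fast as intended, the target halves (difficulty doubles); extreme timespans hit the 4x clamp.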
## SPV Contract v0
The most straightforward way is to implement all the mentioned verifications in the form of a smart contract. In this case, anyone can submit a Bitcoin block header to the contract, and it is accepted if it passes all the described verifications. At the same time, anyone can reliably access information about any stored block header using the contract's read method, with assurance of its accuracy.

An interesting point here is the cost of the header verification (based on the reference implementation you can find together with EIP [here](https://ethereum-magicians.org/t/erc-8002-simplified-payment-verification-gateway/25038)).
Let's take:
* ETH price: $4,400
* Gas price: 2.5 Gwei

Then one block header verification costs ~130k gas => 0.000325 ETH ≈ $1.43.
Not so much, right? But with ~915,000 blocks in existence, syncing the entire Bitcoin history would cost $1.43 × 915,000 ≈ $1,308,450. Oooops...
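The arithmetic above, spelled out (same assumptions: $4,400/ETH, 2.5 Gwei, ~130k gas per header):

```python
GAS_PER_HEADER = 130_000        # approximate on-chain verification cost
GAS_PRICE_GWEI = 2.5
ETH_USD = 4_400
BLOCKS = 915_000

eth_per_header = GAS_PER_HEADER * GAS_PRICE_GWEI * 1e-9  # Gwei -> ETH
usd_per_header = eth_per_header * ETH_USD                # ~= $1.43
usd_full_sync = usd_per_header * BLOCKS                  # ~= $1,308,450
```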
## SPV Contract v1
That's actually the reason why we couldn't launch an SPV contract in this form. The good news: we have SNARKs. So let's change the model from "please verify the entire history" to "here is a proof that I know the valid history up to some checkpoint block $N$; please verify only the difference up to the latest Bitcoin block". In this case, an SPV contract:
1. Verifies the proof of the history validity up to the checkpoint
2. Verifies the proof of correctness of the Merkle Tree with all block headers
3. Verifies the batch of the latest block headers
4. All new blocks can be added one by one according to the SPV v0 logic
Cost? Say we prove 914,900 blocks and verify a batch of the 100 latest on-chain: ~\$5 + 100 × \$1.43 ≈ \$148 -- roughly 9,000 times cheaper.
Proving time (Noir + UltraHonk): 109:53:24
> [Raito](https://github.com/starkware-bitcoin/raito) made it within [6.5 hours](https://x.com/dimahledba/status/1965006118587179342) with stwo (and that is amazing; I especially like the verification possibility on a [Raspberry Pi](https://x.com/dimahledba/status/1968666625189564796)!). Unfortunately, an extra SNARK wrapper with Groth16/PLONK/Ultragroth is required to make it verifiable on Ethereum. That's not a problem for the one-time proof presumed by SPV v1, but it isn't quite compatible with v2, which we will describe later.
> P.S. We are going to build a similar ZK Bitcoin client with Binius, which shows incredible benchmarks -- for instance, P2PKH + ECDSA verification takes 175.89ms to prove and 12.32ms to verify the proof on Apple M2 Max. The only thing that blocks us now is the absence of recursion (gonna be ready by the end of the year).
### Here recursion comes
We couldn't benchmark how much RAM is needed to prove the entire history in one cycle; a rough estimate is "a lot". Instead, the history can be chunked and proven with recursive proofs of the form:
$$\mathcal{R}_i = \{(h_h, \mathsf{root}_i, d, \mathsf{pp}_i; \pi_{i-1},\vec{h}_i): \\ \mathsf{proofVerify}(\mathsf{pp}_i, \pi_{i-1}) \to 1 \land \mathsf{hbatch\_verify}(\vec{h_i})\to \{d,\mathsf{root}_i\}\},$$
where $h_h$ is the last block header in the chunk, $\mathsf{root}_i$ is the root built from the chunk's headers, $d$ is the resulting difficulty, $\mathsf{pp}_i$ are the public parameters (verifier's key, etc.), $\pi_{i-1}$ is the proof of the previous recursion step, and $\vec{h}_i$ are the block headers in the chunk.
> Actually, the statement is a bit more complicated because of proper difficulty recalculation and constructing the global root based on the chunks' trees. But we don't want to overload the reader; the specific logic can always be found [here](https://github.com/distributed-lab/bitcoin-prover/tree/main/app/blocks_recursive).
In practice, chunks of 1024 blocks work fine, with no need to mortgage your house to rent computing power for proving.
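To make the chunked recursion concrete, here is a toy Python emulation where a hash chain stands in for the SNARK: each "proof" binds the previous one and the current chunk's root, mirroring the shape of $\mathcal{R}_i$ (the names and the hash-chain stand-in are ours, not the bitcoin-prover code):

```python
import hashlib

def dsha256(b: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def chunk_root(headers: list) -> bytes:
    """Merkle root over a chunk's header hashes (pairwise double-SHA256,
    Bitcoin-style duplication of an odd last element)."""
    level = [dsha256(h) for h in headers]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def prove_history(headers: list, chunk_size: int = 1024) -> bytes:
    """Emulate the recursion: step i 'verifies' the previous proof and
    commits to chunk i's root. A hash chain replaces the real SNARK."""
    proof = b"\x00" * 32  # base case: no previous proof
    for i in range(0, len(headers), chunk_size):
        proof = dsha256(proof + chunk_root(headers[i:i + chunk_size]))
    return proof
```

Changing any header in any chunk changes every subsequent "proof", which is exactly the property the recursive SNARK enforces.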
### Merkle Forest
We use a two-tier Merkleized structure for managing block headers. The first-layer trees $T^1_i$ aggregate headers within a chunk; the second-layer tree $T^2$ aggregates the roots of all $T^1_i$'s.

For first-layer trees, we use [SMT](https://docs.iden3.io/publications/pdfs/Merkle-Tree.pdf) construction. However, the use of classic Merkle trees or SMTs for the global tree would be inefficient from a proving perspective (due to the need to rebuild them within each proving cycle). Instead, we use [Incremental Merkle Trees](https://docs.solarity.dev/docs/getting-started/guides/libs/data-structures/incremental-merkle-tree), which transfer frontiers between proving cycles and allow us to append new leaves with $O(\log n)$ complexity.
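A minimal sketch of the frontier-based append (Python with SHA-256 instead of a circuit-friendly hash; modeled on the classic incremental-tree algorithm, not the Solarity code itself):

```python
import hashlib

def H(a: bytes, b: bytes) -> bytes:
    return hashlib.sha256(a + b).digest()

class IncrementalMerkleTree:
    """Append-only fixed-depth Merkle tree storing only the 'frontier'
    (one filled subtree per level), so each append is O(log n)."""

    def __init__(self, depth: int):
        self.depth = depth
        self.size = 0
        # zeros[i] = root of an empty subtree of height i
        self.zeros = [b"\x00" * 32]
        for _ in range(depth - 1):
            self.zeros.append(H(self.zeros[-1], self.zeros[-1]))
        self.frontier = list(self.zeros)
        self.current_root = H(self.zeros[-1], self.zeros[-1])

    def append(self, leaf: bytes) -> None:
        assert self.size < 2 ** self.depth, "tree is full"
        index, node = self.size, leaf
        for level in range(self.depth):
            if index % 2 == 0:
                self.frontier[level] = node           # left child: remember it
                node = H(node, self.zeros[level])     # right sibling still empty
            else:
                node = H(self.frontier[level], node)  # right child: use stored left
            index //= 2
        self.current_root = node
        self.size += 1
```

Only the frontier (depth hashes) crosses proving-cycle boundaries, which is what makes appending new chunk roots cheap inside the circuit.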
SPV v1 contract is deployed [here](https://etherscan.io/address/0x4c8D4e3C45870Df8b707c2dE9F2b3444971710F5#code): `0x4c8D4e3C45870Df8b707c2dE9F2b3444971710F5`
## SPV contract v2
WIP
**Observation #1.** Do we really need to store headers and manage reorgs on-chain? What if we could simplify the SPV contract to a pure ZK verifier of the statement "*I know the Bitcoin mainchain because I know the heaviest chain created according to the Bitcoin rules*".
Under these conditions, users don't provide new headers to the SPV contract, and they don't perform reorganizations. They just prove knowledge of a ~~longer~~ heavier chain than the one last known to the SPV contract.
**Observation #2.** A chunk can be as small as one block. This generalizes the model to per-block recursion, proving the statement "this block is correct, and the proof for the previous block is correct". In this case, (1) the chain of headers can additionally be proven from any height up to the latest block; (2) the proving logic can be much richer, e.g., executing scripts, verifying the UTXO set, etc.
$$\mathcal{R}_i = \{(h_i, \mathsf{root}_i, d, \mathsf{pp}_i; \pi_{i-1}): \\ \mathsf{proofVerify}(\mathsf{pp}_i, \pi_{i-1}) \to 1 \land \mathsf{hbatch\_verify}(h_i)\to \{d,\mathsf{root}_i\} \land f_{ext}(\vec{\mathsf{tx}}, \vec{\mathsf{script}}, \vec{\mathsf{UTXO}})\to 1\},$$
which leads to the Bitcoin mainchain trace table:
| # | Proof | Roots |UTXO dataset|
| -------- | -------- | -------- |-------- |
|...|...|...|...|
|$n$ | $\pi_{\mathcal{R}_n=\{\mathsf{proofVer}(\pi_{\mathcal{R}_{n-1}=\{...\}})\to1,\mathsf{bh\_verify}(h_n)\to 1\}}$ | $T^1_n,T^2$ | $\mathsf{UTXO}_n$ |
|...|...|...|...|
A reorganization can thus be achieved by producing a proper recursive proof from any height.
## What else to prove
In addition to the SPV logic, Bitcoin has so many interesting things to prove!
### Transactions
**TX** is an important structure in our case. It arrives in the system as a raw hex string, which we parse inside Noir to create a corresponding TX object. To access TX fields, we use an offset and size for each sub-object (e.g., any input):
* **Inputs:** Each input contains a reference to a previous transaction output `txid` and `vout`, and a `sequence` number. Inputs are stored as fixed-size arrays, parameterized by the compile-time input count
* **Outputs:** Each output contains an `amount` (8 bytes) and a variable-length `scriptPubKey`. As with inputs, outputs are stored as an array sized at compile time
* **Witness Data:** For SegWit transactions, an optional Witness structure is included. A witness consists of a stack size and a collection of witness items (typically signatures and redeem scripts). This is essential for script validation under SegWit rules
* **Metadata:** Additional fields include `version`, optional SegWit `marker` and `flag`, and the `lock_time`
All transaction parameters, such as total size, number of inputs, outputs, and maximum witness size, are specified as compile-time constants (global parameters). This allows Noir to enforce bounds checking without relying on runtime variability, ensuring both determinism and security in a zero-knowledge setting.
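For reference, here is what the offset-and-size access pattern looks like outside the circuit: a Python sketch parsing a plain legacy (non-SegWit) transaction, with illustrative names (the Noir version uses compile-time constants instead of the runtime loops below):

```python
def read_varint(raw: bytes, offset: int):
    """Bitcoin CompactSize integer: returns (value, new_offset)."""
    prefix = raw[offset]
    if prefix < 0xFD:
        return prefix, offset + 1
    size = {0xFD: 2, 0xFE: 4, 0xFF: 8}[prefix]
    return int.from_bytes(raw[offset + 1:offset + 1 + size], "little"), offset + 1 + size

def parse_legacy_tx(raw: bytes) -> dict:
    """Walk the raw bytes, tracking the offset of every sub-object."""
    o = 0
    version = int.from_bytes(raw[o:o + 4], "little"); o += 4
    n_in, o = read_varint(raw, o)
    inputs = []
    for _ in range(n_in):
        txid = raw[o:o + 32][::-1].hex(); o += 32  # txids display big-endian
        vout = int.from_bytes(raw[o:o + 4], "little"); o += 4
        slen, o = read_varint(raw, o)
        script_sig = raw[o:o + slen]; o += slen
        sequence = int.from_bytes(raw[o:o + 4], "little"); o += 4
        inputs.append((txid, vout, script_sig, sequence))
    n_out, o = read_varint(raw, o)
    outputs = []
    for _ in range(n_out):
        amount = int.from_bytes(raw[o:o + 8], "little"); o += 8  # satoshis
        slen, o = read_varint(raw, o)
        outputs.append((amount, raw[o:o + slen])); o += slen
    lock_time = int.from_bytes(raw[o:o + 4], "little"); o += 4
    return {"version": version, "inputs": inputs, "outputs": outputs, "lock_time": lock_time}
```

A SegWit parse additionally checks for the `00 01` marker/flag after the version and reads the witness stacks between the outputs and `lock_time`.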
> In order to handle different types of Bitcoin transactions (such as P2TR, P2MS, P2WPKH, P2SH-in-P2WSH, etc.), the Noir implementation requires a large set of compile-time constants. Noir enforces strict rules regarding variable definitions and computations that must occur before or at compile time, which makes it impractical to calculate these parameters dynamically inside the circuit itself.
> To address this, a dedicated `globals.nr` file is generated for each transaction type (technically, for each new proof). This file contains all necessary transaction-related constants, including transaction lengths, script sizes, witness data sizes, and opcode counts. These values are tailored for the specific transaction being proven, ensuring correctness of parsing and signature verification within the ZK circuit.
> Because each transaction structure differs in length, script format, and witness data, the parameters cannot be shared across transaction types. Instead, they are automatically created through a Python generator that parses the raw transaction data (both current and previous transactions). The generator fills pre-defined Noir templates with the correct constants, producing the final `.nr`and `.toml` files. This automation avoids manual calculation errors and guarantees that the circuit has access to all required values at compile time.
> This design pattern allows Noir to remain both flexible and efficient: the proving circuits stay generic while the transaction-specific constants are injected via automatically generated global files.
### Bitcoin Virtual Machine
WIP
> Not to forget about Simplicity proving =)
### Addresses
Bitcoin supports multiple address and locking script types, each defining different spending conditions and verification logic. In our system, these are represented as dedicated Noir circuits, where for each supported address type we generate a corresponding `globals.nr` file. The generation process is automated through a Python-based generator, which constructs the appropriate circuit depending on the transaction under verification.
At the current stage, we support most of the Bitcoin address and script types:
| Address | Description |
| ------- | ----------- |
| Pay-to-Multisig (P2MS)|Classical M-of-N multisignature scheme, locking outputs to multiple keys. The scheme checks whether there are enough valid signatures.|
|Pay-to-PubKey (P2PK)|Simplest legacy type, locking directly to a public key. The scheme checks the signature against the public key.|
|Pay-to-PubKey-Hash (P2PKH)|Legacy type that locks outputs to a hash of a public key, requiring signature and key for spending. The key difference is that we have linked the legacy and segwit schemes and check within one scheme whether the transaction is SegWit or not. Depending on the answer, we take the signature and public key either from the witness data or from scriptSig.|
|Pay-to-Script-Hash (P2SH)|Encapsulates arbitrary redeem scripts by committing only to their hash. The key difference is the same as in the previous type. We have combined the legacy and segwit schemes, and depending on this, the last element of the script will be either redeemScript (legacy) or witness script (segwit)|
|P2SH nested SegWit (P2SH-in-P2WPKH)|Compatibility form of SegWit v0, embedding Pay-to-Witness-PubKey-Hash inside a P2SH envelope. Simply executes redeemScript, which is located inside scriptPubKey.|
|P2SH nested SegWit (P2SH-in-P2WSH)|Similar to the above, but embedding Pay-to-Witness-Script-Hash inside a P2SH envelope. Same as in previous type, executes redeemScript, which is located inside scriptPubKey|
|Pay-to-Taproot (P2TR, key path)|Native SegWit v1 address type based on Schnorr signatures, supporting single-key spends via the key path. The scheme verifies the Schnorr signature according to BIP-341 rules|
|Pay-to-Taproot (P2TR, script path)|Alternative Taproot spending path using Merkle-committed scripts. The scheme verifies the tweaked key firstly, and then executes the script|
Actually, we conducted an interesting experiment on a task derived from Project 11: a client-side ZK PQ prover for a P2PKH address. Here are the benchmarks:

> $^*$ It doesn't include Binius benchmarks because of the absence of ZK there for now, but it's gonna be ready by the end of the year.
### UTXO set
WIP
## Cases
* [Wrapless](https://arxiv.org/abs/2507.06064)
* Proof-of-reserve
* Crosschain order-based DEX