Formal verification of a zk gadget

--- title: Formally verifying zk-circuits tags: zk-proof, soundess, completenss breaks: false --- [Franck Cassez](https://franck44.github.io), Head of Research at [Mantle](https://www.mantle.xyz). — all views expressed are my own. [![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/franckDownunder.svg?style=social&label=Follow%20%40FranckDownunder)](https://twitter.com/franckDownunder) The [zk-bug-tracker](https://github.com/0xPARC/zk-bug-tracker) repository is an excellent resource to learn about bugs and vulneriabilities in ZK components. I was looking for a simple known bug to check whether, with some _deductive verification_ techniques, I could identify it, fix it and formally prove _correctness_. The simplest bug I could find is the [PSE & Scroll zkEVM: Missing Overflow Constraint](https://github.com/0xPARC/zk-bug-tracker?tab=readme-ov-file#15-pse--scroll-zkevm-missing-overflow-constraint) bug. So I set out to implement the code and verify it with my favourite verification-friendly language: [Dafny](https://github.com/dafny-lang/dafny). # Background In a zk-proof system, a _prover_ wants to convince a _verifier_ that they have computed a function correctly. They both know the function, and in the example of the [PSE & Scroll zkEVM: Missing Overflow Constraint](https://github.com/0xPARC/zk-bug-tracker?tab=readme-ov-file#15-pse--scroll-zkevm-missing-overflow-constraint) this is the function `Mod` defined by (the code is this article is written in [Dafny](https://github.com/dafny-lang/dafny)): ```rust= function Mod(a: u256, n: u256): (r: u256) { if n == 0 then 0 else a % n // standard int modulo operator } ``` The prover wants to convince the verifier that they have computed `r` and that `r == Mod(a, n)` for given `a` and `n`. Note that all the values are unsigned integers over 256 bits (`u256`). The verifier has a _gadget_ to verify this claim in the form of a _set of constraints_. The constraints must have some predefined form (R1CS, QAP, AIR, etc) and represent what is usually called a _circuit_. In this instance, the gadget/circuit for verifying the `Mod` function takes 4 input values `k, n, r, a` and has one constraint^[`2^256` is `2` to the power `256`.] `k * n + r == a [modulo 2^256]`. Both the prover and the verifier know `a` and `n`. The protocol to verify the correctness of the computation is: 1. the prover provides a _proof_ in the form of two values `k, r`. 2. The verifier _accepts_ the proof if and only if the predicate`k * n + r == a [modulo 2^256]` evaluates to `true`. :::info :bulb: What is expected from this protocol is that it is _sound_ and _complete_: - _Soundness_ means that if the verifier accepts the proof then `r == Mod(a, n)`; - _Completeness_ means that if `r == Mod(a, n)` the prover can supply a proof `k, r` that is accepted by the verifier. ::: # Specification and verification How can we check whether the protocol is sound and/or complete? ## Constraints as predicates We can specify the verifier's gadget with a Dafny _predicate_, a function that evaluates to either`true` or `false`: ```rust= predicate MulAddWordsGadget(a: u256, b: u256, c: u256, d: u256) { (a as nat * b as nat + c as nat) % TWO_256 == (d as nat) % TWO_256 } ``` In this predicate, - `TWO_256` is a constant equal to `2^256`, - the type `u256` is the set of unsigned integers over 256 bits (from `0` to `TWO_256 - 1`), - `nat` is the type for _mathematical unbounded_ natural numbers, and - `%` is the usual modulo operator on `nat`. The predicate reads: "compute `a * b + c` _as an unbounded natural number_, take the `modulo TWO_256` and check that it is equal to `d % TWO_256`." Note that `d as nat % TWO_256` is `d` as `d` is less than `TWO_256` (because it is a `u256`). ## Soundness and completeness Soundness and completeness are properties of the protocol and they can conveniently be specified as _lemmas_ (theorems) in [Dafny](https://github.com/dafny-lang/dafny). Here is how the soundness property can be written: ```rust= // Soundness proof lemma SoundnessProof(k: u256, n: u256, r: u256, a: u256) requires MulAddWordsGadget(k, n, r, a) ensures r == Mod(a, n) { // Proof to be supplied ... } ``` The `requires` is a _precondition_ (assumption) of the lemma `SoundnessProof` and the `ensures` the _postcondition_ (conclusion). As a result the lemma specifies the property: "If the verifier accepts then `r == Mod(a, n)`". Completeness is specified in a similar manner: ```rust= // Completeness proof lemma CompletenessProof(n: u256, r: u256, a: u256) requires r == Mod(a, n) ensures exists k: u256:: MulAddWordsGadget(k, n, r, a) { // Proof to be supplied ... } ``` The `CompletenessProof` lemma reads: "if `r == Mod(a, n)` then there exists a value of `k` such that the verifier accepts the proof". As a result if the prover has correctly computed `r` they can supply a value of `k` such that `MulAddWordsGadget(k, n, r, a)` evaluates to true (the verifier accepts). ### Verification of soundness We can now try to _verify_ the lemmas. By _verify_, I mean formally prove with the rules of logic (and the arithmetic properties of `Mod`) that for _all inputs_ that satisfy the preconditions, the postcondition is satisfied. The rules of logic and arithmetic properties are used by the Dafny verification engine to verify each step of a proof. Moreover, in Dafny, a proof is a program, so if you can write programs, you can write a proof. Let's first try with a weaker version (I'll explain why later) of the `SoundnessProof` lemma, where we assume that `n > 0` (not zero) and we only want to ensure that `Mod(r, n) == Mod(a, n)` rather than `r == Mod(a, n)`: ```rust= // Soundness proof - weak version lemma SoundnessProofWeak(k: u256, n: u256, r: u256, a: u256) requires n > 0 requires MulAddWordsGadget(k, n, r, a) // instead of r == Mod(a, n), we require the modulus to be equal ensures Mod(r, n) == Mod(a, n) { // Proof to be supplied } ``` :::warning :warning: When verifying this lemma, Dafny reports that the it does not hold, and automatically synthesises a counter-example: ::: ```rust= k = 2 n = 3 r = 115792089237316195423570985008687907853269984665640564039457584007913129639933 a = n ``` The value of `r` is actually `2^256 - 3`, and `k * n + r % TWO_256` evaluates to `3`, same as `a % TWO_256` so the verifier accepts. However `Mod(r, n) == 1` but `Mod(a, n) == 0` so the prover did not compute the correct value. The prover managed to supply `k, r` that verifies but they have not computed the correct value of `r`. The problem has already been identified in this [issue-996](https://github.com/privacy-scaling-explorations/zkevm-circuits/issues/996). The reason is that the `MulAddWordsGadget` gadget is _under constrained_. Note that the counter example provided in [issue-996](https://github.com/privacy-scaling-explorations/zkevm-circuits/issues/996) is different to the one provided by Dafny. The fix is to add a constraint to ensure that no overflow occurs in the computation of `(a as nat * b as nat + c as nat)`. Let's add another gadget (a predicate) that is true if and if the computation overflows: ```rust= predicate OverflowMulAddGadget(a: u256, b: u256, c: u256) { a as nat * b as nat + c as nat >= TWO_256 } ``` The (weak) soundness proof now becomes: ```rust= // Soundness proof - weak version lemma SoundnessProofWeak(k: u256, n: u256, r: u256, a: u256) requires n > 0 requires !OverflowMulAddGadget(k, n, r) requires MulAddWordsGadget(k, n, r, a) ensures Mod(r, n) == Mod(a, n) { Lemma1(k, n, r); } // Helper. lemma {:axiom} Lemma1(k: u256, n: u256, r: u256) requires k as nat * n as nat + r as nat < TWO_256 ensures Mod(k * n + r, n) == Mod(r, n) ``` :::success :+1: With the help^[I have assumed the folks arithmetic lemmas here for the sake of simplicity but we can include them from the Dafny standard libraries.] of an arithmetic lemma `Lemma1`, Dafny automatically _verifies_ `SoundnessProofWeak`. ::: Now let's try to verify the original lemma `SoundnessProof`. ```rust= // Soundness proof lemma SoundnessProof(k: u256, n: u256, r: u256, a: u256) requires n > 0 requires !OverflowMulAddGadget(k, n, r) requires MulAddWordsGadget(k, n, r, a) ensures r == Mod(a, n) { Lemma1(k, n, r); } ``` :::warning :warning: Unfortunately Dafny reports a counter-example `k, n, r, a` again and we perform some simple computation (e.g. with Python) to understand the counter-example better: ::: ```python3 >>> k = 115792089237316195423570985008687907853269984665640564039457584007913129639934 >>> n = 1 >>> r = n >>> a = 115792089237316195423570985008687907853269984665640564039457584007913129639935 >>> (k * n + r) % 2**256 115792089237316195423570985008687907853269984665640564039457584007913129639935 >>> a % 2**256 115792089237316195423570985008687907853269984665640564039457584007913129639935 >>> a % n 0 >>> r 1 >>> 2**256 - k 2 >>> 2**256 - a 1 ``` The issue here (I assume it has already been identified) is that we do **NOT** require `r < n`. So the `Mod` gadget is still _under constrained_. And this is the reason why I started with a weaker version of the soundness proof. :::success :+1: Replacing `requires n > 0` (line 3) with `requires r < n` fixes the problem and this time Dafny _verifies_ lemma `SoundnessProof`. Remember that _verifies_ here means _for all inputs_ the lemma holds. ::: There is one case with have not yet dealt with: `n == 0`. In this case we have to add the constraint `r == 0` to the gadget (I assume this has also been identified before). So overall, here is the final version of the lemma: ```rust= // Soundness proof lemma SoundnessProofWithZero(k: u256, n: u256, r: u256, a: u256) requires (n == 0 && r == 0) || r < n requires !OverflowMulAddGadget(k, n, r) requires MulAddWordsGadget(k, n, r, a) ensures r == Mod(a, n) { Lemma1(k, n, r); } ``` :::success :bulb: To ensure soundness, the gadget for the verification of `Mod` requires 3 constraints (and circuits) given by lines 3--5 in the `SoundnessProofWithZero` lemma. ::: ### Verification of completeness Now let's check completeness. Here is the specification (as above) augmented with a proof, using a simple lemma `ModuloLemma2` that states that: if two numbers `a` and `b` are equal modulo `n` then their difference is a multiple of `n`. ```rust= // Completeness proof lemma CompletenessProof(n: u256, r: u256, a: u256) requires n > 0 requires r == Mod(a, n) ensures exists k: u256:: MulAddWordsGadget(k, n, r, a) && !OverflowMulAddGadget(k, n, r) { if n <= a { var k := ModuloLemma2(r, a, n); assert MulAddWordsGadget(k, n, r, a) ; } else { assert MulAddWordsGadget(0, n, r, a); } } lemma {:axiom} ModuloLemma2(a: u256, b: u256, n: u256) returns (k: u256) requires n > 0 requires a < n <= b requires Mod(a, n) == Mod(b, n) ensures k as nat * n as nat < TWO_256 && b - a == k * n ``` :::success :+1: Dafny confirms that the `CompletenessProof` lemma is valid. ::: What about the case `n == 0`? In that case we **cannot prove completeness**. And indeed, the prover cannot build a proof except if `a == 0.`Assume `n == 0` and `a != 0`. Then `Mod(a, n) == 0`. So the prover should provide a proof of the form `k, 0`. In that case the verifier computes`k * n + r == k * 0 + 0 = 0` but `a % TWO_256 != 0` and the verifier rejects the proof. The formal proof of _incompleteness_ can also be specified and _verified_ with Dafny: ```rust= lemma InCompletenessProofZero(k: u256, n: u256, r: u256, a: u256) requires n == 0 && a > 0 requires r == Mod(a, n) ensures !MulAddWordsGadget(k, n, r, a) { // Thanks Dafny for automatically proving this lemma. } ``` The previous lemma states that for any value of `k`, `MulAddWordsGadget(k, n, r, a)` evaluates to `false`, implying that **the prover cannot provide any `k`** to build an accepting proof (although they computed `r` correctly as per line 3). I also assume that this is known fact. # Conclusion The example above demosntrates that deductive formal verification is practical and applicable to proving zk-circuits. I did not have to write any _proof code_, only to import already proved arithmetic lemmas. The counter-examples were synthesised automatically. The result above is a **machine-checkable proof of soundness and completeness** for the `Mod` gadget. To the best of my knowledge this is the first proof of this type for this gadget. Unfortunately I have not found any _formal proof_ bounty, and as a matter of fact you can get rewards for finding bugs, but not for proving the absence of bugs :disappointed:. [^fn]:

Read more

Fair Sequencer

Verified Control Flow Graphs for EVM Bytecode

Control Flow Graph reconstruction for EVM bytecode

ByteSpector Grant Application - EF