# Provable Watermark Extraction
Introducing **zkDL++**, a novel framework designed for provable AI. Leveraging zkDL++, we address a key challenge in generative AI watermarking—maintaining privacy while ensuring provability. By enhancing the watermarking system developed by Meta, zkDL++ solves the problem of needing to keep watermark extractors private to avoid attacks, offering a more secure solution. Beyond watermarking, zkDL++ proves the integrity of any deep neural network (DNN) with high efficiency. In this post, we outline our approach, evaluate its performance, and propose avenues for further optimization.
## Introduction
[Stable Signature](https://arxiv.org/pdf/2303.15435.pdf) is a watermarking (WM) scheme introduced by Meta AI to allow for identification of images generated by their latent diffusion model (LDM). It uses a Convolutional Neural Network(CNN) to extract the WM, and has the advantage of being very robust to image tampering, allowing for detection even if the image was filtered, cropped, compressed, etc. However, Stable Signature suffers from the drawback of having to keep the extractor model private, as exposing it makes the WM susceptible to attacks. This drawback can be significant if at any point there is a controversy about ownership of a certain image. While Meta, or any other company using such a WM method, can run the extractor and see that a certain image was generated by their model, they cannot prove it to a third party such as social media users, unless they expose the weights of the extractor model. Such exposure may lead to obsolescence of the WM system.
<div style="text-align: center;">
<img src="https://hackmd.io/_uploads/SyWq1mRnA.jpg" alt="The WM extraction relation" />
</div>
We propose using the emerging technology of Zero-Knowledge Proof (ZKP) as a solution to this problem. In a ZKP scheme, a prover wishes to convince a verifier that they have knowledge of a witness $W$, which obeys a certain functional relation $F(X,W)=0$, without revealing $W$. The relation $F$ is generally represented by an arithmetic circuit and is public. The variable $X$ contains public input and output to the circuit. In the case of Stable Signature, the relation $F$ is the architecture of the WM extractor, with its output compared to the key, and it is satisfied if-and-only-if the extracted WM from the input image matches the key. The public variable $X=(x_{in},x_{out})$ includes the image and the Boolean output, and the private witness $W=(w,k)$ includes the weights of the extractor and the key as is depicted in the diagram above. A verifier can then run a verification algorithm that tests the proof and returns True if the relation is indeed satisfied, and False otherwise.
An additional challenge is ensuring that the weights weren’t deliberately chosen to produce a specific result for a given image. Fortunately, this issue can be easily resolved by the following:
1. Compute a cryptographic commitment to the weights $C=c(W)$. This can be thought of as the hash of $W$. The commitment is published together with the WM system.
2. Add $C$ to $X$ in $F$. This is the representation of $W$ that is included in the proof, so the proof is only valid if indeed $C=c(W)$.
### Real-world application of provable watermarking extraction
According to Meta, the end goal is to be able to label AI generated content on social platforms. Technologies like Stable Signature can significantly enhance detection accuracy by embedding robust watermarks into AI-generated media. Meta can add these watermarks to content generated by their models, but users cannot verify them independently because the extractor model's weights must remain private. Moreover, this challenge is compounded by the fact that each company, like Google, has its own proprietary watermarking system that is incompatible with others.
ZKPs complement perfectly the shortcomings of such a system. For AI-generated images (the same approach applies to [audio](https://ai.meta.com/research/publications/proactive-detection-of-voice-cloning-with-localized-watermarking/) as well), Meta can attach a proof of watermark extraction, akin to a digital signature, without revealing sensitive model details. Clients can then run ZK verification for each image in their feed, which enables automatic labeling of AI-generated content. Additionally, other companies like Google can integrate their own watermark extraction proofs, allowing the system to distinguish between different AI models.
Meta will handle the resource-intensive task of running the prover, while clients—such as mobile devices—can handle the lightweight verification. Since the watermark is robust against transformations (like cropping or compression), Meta can rerun the prover to account for any transformations and produce new proofs, as demonstrated with approaches like [VIMz](https://eprint.iacr.org/2024/1063).
## Integrity Proofs for DNNs
In [Meta's Stable Signature](https://ai.meta.com/research/publications/the-stable-signature-rooting-watermarks-in-latent-diffusion-models/), the watermark extractor is a CNN specifically designed to analyze watermarked images. Let’s focus on images sized $128\times 128$ pixels. The diagram below shows the architecture of Meta’s extractor DNN, which consists of $9$ convolutional layers, an average pooling layer, and a fully connected layer. This model contains approximately $38$ million parameters.
![upload_83c42667cd1991497f2a60c838fca752](https://hackmd.io/_uploads/Bk8vg7A2R.png)
Current industry solutions for proving DNN inference often rely on [zkVMs](https://dev.risczero.com/api/zkvm/). However, zkVMs are not application-specific and introduce unnecessary complexity, making them less suitable for this task. In contrast, two recent academic works, [zkCNN](https://dl.acm.org/doi/pdf/10.1145/3460120.3485379) and [zkDL](https://eprint.iacr.org/2023/1174.pdf), propose SNARK frameworks tailored to proving DNN inference, both based on the [Sumcheck](https://zkproof.org/2020/03/16/sum-checkprotocol/) protocol.
As shown in the diagram below, zkCNN leverages the [GKR](https://people.cs.georgetown.edu/jthaler/GKRNote.pdf) protocol, while zkDL independently proves each layer of the DNN. The trade-offs between the two are notable:
- zkCNN commits only to the inputs and outputs of the network, whereas zkDL requires commitment to all intermediate inter-layer values.
- However, zkDL allows for concurrent processing of layer proofs, while zkCNN must process them sequentially.
- Another advantage of zkDL is its flexibility—each layer-proof can use a different proving system, as long as shared commitments are maintained.
![upload_c76d397025596573c24610fcb53163b3](https://hackmd.io/_uploads/BJFixQA30.png)
In the next section, we outline how we integrated concepts from both zkDL and zkCNN to develop a proof-of-concept for a DNN-SNARK framework, which we call zkDL++.
Before diving into our implementation, let's first compare the proof construction run-times for a network of two fully-connected layers. This analysis evaluates zkDL/zkDL++, [Jolt zkVM](https://github.com/a16z/jolt), and [SP1 zkVM](https://github.com/succinctlabs/sp1). As illustrated in the diagram below, the input to the network of size $L$ is reduced to size $\frac{L}{4}$ in the first fully-connected layer, which then passes through a second fully-connected layer that produces an output of size $\frac{L}{8}$. Both inputs and network parameters are randomized uniformly.
<div style="text-align: center;">
<img src="https://hackmd.io/_uploads/r1Tne7RnC.png" width="40%" />
</div>
We start by implementing the model in all three frameworks. We sample the input vector as a random byte-vector and arithmetize the fully-connected layers computation (i.e. matrix-vector products) in the three frameworks. This is followed by proof generation and benchmarking. The code for our evaluation is available at [zkDL++](https://github.com/ingonyama-zk/zkdl-plus/tree/sb/bench-op) (available soon), [Jolt](https://github.com/suyash67/jolt/tree/sb/compare), and [SP1](https://github.com/suyash67/sp1-examples/tree/sb/init/simple-cnn).
The comparison shows that zkDL/zkDL++ outperforms both Jolt and SP1 in prover run-time performance, especially as input sizes increase. While Jolt and SP1 suffer significant performance degradation with larger inputs, zkDL/zkDL++ maintains relatively low run-times, making it ideal for systems that demand high performance and scalability. For example, when the input size is $2^{10}$, zkDL generates a proof around $200$ times faster than SP1, as is presented in the prover run-time graph below.
<div style="text-align: center;">
<img src="https://hackmd.io/_uploads/HJQzDW0hA.png" width="80%" />
</div>
In this scenario, the number of model parameters is $0.3$ million—still much lower than a typical DNN model. In such cases, zkDL++ would likely be the only framework capable of practically generating proofs of integrity.
It is worth noting all the experiments were performed on the same hardware configuration, which consists of a powerful CPU (13th Gen Intel Core i9-13900K) and a high-end GPU (NVIDIA GeForce RTX 4080). Although the machine is equipped with a capable GPU, the experiments were not specifically optimized to fully utilize the GPU's potential. GPU optimization could potentially yield faster results, but that wasn’t the focus of our experiments.
## zkDL++
zkDL++ can be viewed as a hybrid of zkDL and zkCNN, designed to optimize both approaches. While zkDL and zkCNN represent two extremes—proving each layer independently versus proving the entire network as a whole—zkDL++ strikes a balance between them. This flexibility allows for high computational parallelism while maintaining a manageable commitment cost.
Our proof-of-concept is built on a fork of [zkDL](https://github.com/SafeAILab/zkDL). Initially, zkDL only supported primitives for fully connected and ReLU layers. By adding support for convolutional and average pooling layers, we were able to construct a SNARK for the watermark extractor DNN. In the following sections, we present the run-time results of our proof-of-concept, delve into the mathematical foundations of the proving primitives, and conclude with a discussion on limitations and future directions.
### Results and Methodology
We run an end-to-end SNARK prover to prove the correctness of the inference of the extractor DNN. The computation is split into the following steps.
1. **Preprocessing**: We commit to the parameters of the DNN, i.e. the weights and biases of the trained DNN model. Once the model is trained, this is a one-time computation.
2. **Inference**: We run the inference on randomized input (of size $3 \times 128 \times 128$) with the extractor DNN. In the context of SNARKs, this step is referred to as witness generation.
3. **Witness commitments**: We start generating the proof by first committing to the witnesses. Specifically, we commit to the input and output of each layer individually.
4. **Sumcheck proofs**: We then compute the sumcheck proofs for each layer that attests to the correctness of each layer individually. This is followed by the computation of opening proofs for the witness polynomials in each layer.
We benchmark the zkDL++ framework on the same hardware as described in the previous section. The following table shows the run-times of each of these steps.
| Computation | Time (in seconds) |
| ------------- | ----------------- |
| Preprocessing | 5.1 |
| Inference | 0.5 |
| Witness commitments | 143.5 |
| Sumcheck proofs | 172 |
The total time for generating the proof is approximately 5.4 minutes. Next, we discuss the overview of our techniques. The objective is to arithmetise the primitives used in DNNs.
### Convolution
To keep things concise, we’ll focus on the arithmetization of 1D convolution. Let $u$ and $v$ be vectors of length $n$, representing the input and output of the convolution, respectively, and let $f$ be a filter of length $2t + 1$.
\begin{align}
v(i) = \sum_{j = 0}^{2t} f(j) \cdot u(i + t -j).
\end{align}
We aim to represent this convolution between the input $u$ and the filter $f$ as a polynomial equation. Let $p_u(x)$ and $p_f(x)$ be the polynomials corresponding to the coefficients of $u$ and $f$, respectively as
\begin{align}
p_u(x) = \sum_{k=0}^{n-1} u_k \cdot x^k, & & p_f(x) = \sum_{k = 0}^{2t} f_k \cdot x^k.
\end{align}
Multiplying $p_u(x)$ and $p_f(x)$ yields the convolution of $u$ and $f$. To visualize this, consider the following example from Wikipedia.
![image](https://hackmd.io/_uploads/Skpk-7A2R.png)
When multiplying two polynomials $P$ and $Q$, summing terms of the same color gives the convolution. Note, however, that tail terms appear as remnants from the multiplication, which are not part of the convolution. Using this approach, we obtain
\begin{align}
p_u (x) \cdot p_f (x) = \Delta_1 (x) + x^t \cdot p_v(x) + x^{n+ t} \cdot \Delta_2(x),
\end{align}
where $p_v(x)$ is the polynomial representing the output vector $v$, and $\Delta_1(x)$ and $\Delta_2(x)$ are the tail terms. The prover needs to demonstrate that this relationship between $p_u(x), p_f(x)$, and $p_v(x)$ holds at a random point $x = r$ chosen by the verifier.
We use the sumcheck protocol to prove this relationship. However, the main challenge for the prover lies in sending evaluations of univariate polynomials, while the sumcheck protocol typically handles multivariate polynomials. To address this, we note that evaluating $p_u(r)$ for $r \in \mathbb{F}$ can be interpreted as an inner product of the vector $u$ with the vector $\vec{r} := (1, r, r^2, \dots, r^{n-1})$ as
\begin{align}
p_u(r) = \sum_{k=0}^{n-1} u(k) \cdot r^k = \langle \ u, \ \vec{r} \ \rangle.
\end{align}
The sumcheck protocol can now be applied to verify this inner product. The prover uses this approach to compute the evaluations $p_u(r), p_f(r),$ and $p_v(r)$, while the verifier checks that the relationship holds. Note that the prover can send the $\Delta_1$ and $\Delta_2$ polynomials directly to the verifier, as their size is relatively small.
This analysis can be trivially extended to the case of 2d convolution. For 2d convolution, we multiply the bi-variate polynomials $p_u(x, y)$ and $p_f(x, y)$ to get the convolution output polynomial $p_v(x, y)$. In this case, if we view the coefficients of $p_u(x, y) \cdot p_f(x, y)$ as a matrix where the $(i, j)$ entry is the coefficient for $y^ix^j$ then the output polynomial coefficients are present in the centre of the matrix, as shown in the figure below. Notice that we get $8$ tail terms.
![image](https://hackmd.io/_uploads/ByMG-mC2C.png)
### Average Pooling
To arithmetize the average pooling layer, we need to represent it as a sumcheck instance. Let $u$ be the input of size $D$, and $v$ the output of size $d$ of the average pooling layer. The output is defined for each $i \in \{1, \dots, d\}$ as
\begin{equation*}
v_i = \frac{1}{t} \cdot \sum_{j=1}^{s} u_{ti + j}
\end{equation*}
where $t := \frac{D}{d}$ is the subsampling factor (i.e. the size of the filter). This operation can be represented as the following matrix-vector product
\begin{align*}
\begin{bmatrix}
v_1 \\
v_2 \\
\vdots \\
v_d
\end{bmatrix}=
\begin{bmatrix}
\frac{1}{t} & \ldots & \frac{1}{t} & & & & & & \\
& & & \frac{1}{t} & \ldots & \frac{1}{t} & & & & \\
& & & & & \ddots & & & & \\
& & & & & & & \frac{1}{t} & \ldots & \frac{1}{t}
\end{bmatrix}_{d \times D}
\begin{bmatrix}
u_1 \\
u_2 \\
u_3 \\
\vdots \\
\vdots \\
u_D
\end{bmatrix}.
\end{align*}
Matrix multiplication can be efficiently proven using sumcheck as demonstrated [here](https://people.cs.georgetown.edu/jthaler/OptimalMatMult.pdf) by Justin Thaler. Therefore, the correctness of the average pooling operation can be verified using sumcheck. However, a minor challenge arises as the verifier would need to evaluate a bi-variate polynomial, corresponding to the constant matrix in the above equation, on a challenge $y = r_y \in \mathbb{F}^D$, which implies linear work for the verifier. To avoid this and ensure efficient verification, we can exploit the structure of the matrix as
\begin{align*}
\begin{bmatrix}
\frac{1}{t} & \ldots & \frac{1}{t} & & & & & & \\
& & & \frac{1}{t} & \ldots & \frac{1}{t} & & & & \\
& & & & & \ddots & & & & \\
& & & & & & & \frac{1}{t} & \ldots & \frac{1}{t}
\end{bmatrix}_{d \times D}=
I_{d \times d} \otimes
\underbrace{
\begin{bmatrix}
\frac{1}{t} & \ldots & \frac{1}{t}
\end{bmatrix}_{1 \times t}
}_{:= B}.
\end{align*}
The prover can leverage this tensor product structure to construct a more efficient sumcheck since it is essentially a multiplication of two multi-linear polynomials in different variables.
### Fully-Connected
A fully-connected layer applies a weight matrix on an input vector to generate an output vector. Let $u$ be the input vector and $v$ be the output vector, as shown in the diagram below.
![image](https://hackmd.io/_uploads/rkcXZmAn0.png)
This operation is represented as a matrix-vector product as
\begin{align*}
\begin{bmatrix}
v_1 \\
v_2 \\
\vdots \\
\vdots \\
v_n
\end{bmatrix}=
\begin{bmatrix}
w_{11} & & \ldots & & w_{n1} \\
w_{21} & & \ldots & & w_{n2} \\
\vdots & & \vdots & & \vdots \\
\vdots & & \vdots & & \vdots \\
w_{n1} & & \ldots & & w_{nn}
\end{bmatrix}
\begin{bmatrix}
u_1 \\
u_2 \\
\vdots \\
\vdots \\
u_n
\end{bmatrix}.
\end{align*}
As noted in the average pooling case, we can use the sumcheck technique for matrix-multiplication to prove the correctness of the fully-connected layer. Note that the weight matrix is committed to as a part of pre-processing.
### ReLU
The zkDL paper gives a technique to arithmetise the ReLU function as a sumcheck instance. Since ReLU is a non-linear operation, we need a slightly different approach in its arithmetisation. For the purpose of exposition, we will demonstrate arithmetisation for a single ReLU function. Let $u$ be the input to the ReLU function and $v$ be its output. The ReLU function is applied on a scaled version of the input as follows
\begin{equation*}
v := \textsf{ReLU}\left(\left\lfloor \frac{u}{2^R} \right\rceil\right).
\end{equation*}
where $R$ is a constant and $\left\lfloor\cdot\right\rceil$ is the rounding operator. Let $Q_u$ be the quotient and $R_u$ be the remainder when we divide $u$ by $2^R$. Thus, we have
\begin{equation}
u = 2^R \cdot Q_u + R_u.
\end{equation} By definition, $R_u$ must be an R-bit integer. Let $Q_u$ be a $Q$-bit signed integer defined as
\begin{equation}
Q_u := (B_{Q-1} \ \| \ B_{Q-2} \ \| \ \dots \ \| \ B_0),
\end{equation} where the most significant bit $B_{Q−1}$ encodes the sign of $Q_u$. By definition of the ReLU function, we have
\begin{equation}
v = (1 - B_{Q-1}) \cdot Q_u.
\end{equation} To prove the correctness of the ReLU layer, we must prove that the above equations hold. To prove the bit decomposition of the integer $Q_u$, we can use sumcheck to prove the following relations.
\begin{equation}
Q_u := \sum_{i=0}^{Q-1} B_i \cdot 2^i \qquad \text{ and } \qquad B_i \cdot (B_i - 1) = 0 \ \forall i.
\end{equation} Similarly, we must prove that the bit decomposition of the remainder $R_u$ is correct and that $R_u < 2^{R}$. We prove the relation between quotient $Q_u$ and the output $v$ for all neurons across a layer by batching the equalities and then using sumcheck, i.e.,
\begin{equation}
\sum_{j=1}^{n} \textsf{eq}(\tau, j) \cdot (v_j - (1 - B_{Q_j-1}) \cdot Q_{u_j}) = 0.
\end{equation} Similarly, the bit-decomposition relations for $Q_u, B_i$ and $R_u$ can be batched together to create a single sumcheck instance for an entire ReLu layer.
## Limitations and Future work
The current proof generation process, taking approximately 5.4 minutes, can appear resource-intensive when scaled for content production. A significant portion of this time is spent on two key tasks: computing witness commitments and generating sumcheck proofs. At present, these operations are performed sequentially, which introduces inefficiencies that can be addressed.
1. **Sequential Witness Commitments**: Proof generation involves computing commitments for 13 witness vectors, corresponding to 9 convolution layers, 1 average pooling layer, 1 fully connected layer, and the input and output. These commitments are currently calculated layer by layer in a linear fashion, starting from the input and progressing to the output. This sequential process creates bottlenecks, particularly when dealing with large models. Parallelizing the computation of these witness commitments is a straightforward and promising area for improvement.
2. **Batching Sumcheck Proofs**: Each layer also requires the generation of sumcheck proofs, which independently verify different properties of that layer. Presently, sumcheck proofs are computed one after the other, further contributing to the extended proof generation time. There is scope to parallelize the sumcheck proofs across layers, as they are independent. Additionally, within each layer, multiple sumcheck proofs are needed to prove various statements. Batching these sumcheck computations—both within individual layers and across different layers—can help reduce redundancy and improve overall performance.
3. **Choice of commitment scheme**: The current implementation uses the Hyrax multi-linear commitment scheme for witness commitments and sumcheck opening proofs. While Hyrax is effective, exploring alternative schemes such as [Zeromorph](https://eprint.iacr.org/2023/917) could be beneficial, particularly for applications where evaluation proofs need to be zero-knowledge. Zeromorph offers potential advantages in terms of performance and security in zero-knowledge settings.
4. **Choice of the field**: The current setup uses the BLS12-381 elliptic curve and its corresponding scalar field for computations. While this curve is widely adopted for cryptographic purposes, exploring alternative fields, such as Mersenne primes or Baby Bear (and their extensions), could provide speedups in the proof generation process.
There are several promising avenues for improving the efficiency of proof generation. Parallelizing witness commitments and sumcheck proofs, exploring alternative commitment schemes, and experimenting with smaller fields represent significant opportunities for future optimization.
### What's Next?
In this work, we demonstrated zkDL++ with a real-world example involving a CNN. However, the zkDL++ framework is flexible and can be extended to support additional layers, making it applicable to any deep neural network (DNN). In our proof of concept for private watermark extraction, we achieved prover runtimes in the range of minutes for a standard CNN. With further optimizations, we believe this can be reduced to seconds, meeting the performance requirements for large-scale deployment by companies like Meta.
Our code is production-ready and available for testing. We encourage collaborators, researchers, and developers to explore zkDL++ with us and help advance this powerful framework.
# Acknowledgements
We would like to thank the authors of the zkDL and Stable Signature papers. Special thanks goes to Pierre Fernandez and Haochen Sun :heart: