This is a follow-up investigation of our PPD Project Plan back in Jan 2025.
Recall that we are in the situation where a weak client, who holds some secret witness to a (public) statement, would like to generate a (public verifiable) zero-knowledge proof of the knowledge of the witness (and that it satisfies some relation), but not able to do so, due to resource constraint such as low computational power, shortage of power (battery life), or low memory. While computational power or battery life are somewhat reasonable to circumvent, memory cap remains a hard constraint of such weak client.
In such a situation, the weak client, would like to find a way to delegate the proof generation to a strong server, while, at the same time, would like to keep the witness its own secret.
We call such service "Private Proof Delegation; and our team embarked on the quest to fulfill the wish of the client.
PPD is an instance of Secure 2-party Computation (2PC) and is solvable using Secret-Sharing-based MPC technique such as Collaborative zk-SNARKs (coSNARKs), firstly introduced in 2021, and used for Private Proof Delegation in 2023. Software-wise, such solutions are available at TACEO and EZKL.
Roughly speaking, coSNARKs have 2 deployment model: client delegates proof generation to 2-3 coSNARKs servers, or client participate as one of the coSNARKs servers. The first model encounters trust issue in the coSNARKs servers while the 2nd model takes the PPD issue back to square one: the client needs to have the capacity of a server.
Hence, we looked into Fully Homomorphic Encryption, FHE, -based solutions and investigate the landscape of FHE-SNARKs.
As it can be summarized, a modern SNARK is composed of two components: Polynomial Interactive Oracle Proof (PIOP) and Polynomial Commitment Scheme (PCS). Thus composing FHE-SNARKs is equivalent to compose FHE-PIOP + FHE-PCS.
The first issue turned out to be that field size matters, A LOT! The MHEGA project by TACEO demonstrated infeasibility of computing FHE-FFT on big prime field, where the plaintext space has to be 254 bits which pumped up the ciphertext space of HE to be 1045 bits and even with that can only accomodate one level of HE multiplication. The very big ciphertext bitwidth constitute extremely high computational cost even though MHEGA had done many tricks to cut down multiplication depth of the FFT however at the cost of requiring many rotation keys.
Here we cite their MHEGA project's benchmarking results:
The results are somewhat sobering. The overhead of homomorphic encryption is more than 6 orders of magnitude despite using the optimized algorithms for evaluating the matrix-vector multiplications. A still comparably small FFT of size 32768 blows up from 12.35 ms without encryption to 1514.2 s with encryption.
Unfortunately, CPU overhead is not the only drawback of using HE, RAM requirements are also huge. The final benchmark with size 32768 required nearly everything of the 100 GB RAM of the benchmark server. The reason for that is (among the general size of HE ciphertexts) the size of the many key-switching keys required to evaluate all the ciphertext rotations in the babystep-giantstep optimized diagonal method. These need to be created by the party knowing the decryption key, thus, when HE is used for outsourcing computations, the communication of these keys already requires to upload several tens of GBs of data.
OK, let's forget about big field size and focus on small field size such as Goldilocks. This means that we can only do FHE-SNARKs for field-agnostic SNARKs such as STARKs or the similar.
There is a series of FHE-STARK (FHE-FRI-ish) in the recent years that shows feasibility for FHE-FRI-PIOP:
Aligning the security parameters of HE and FRI is a difficult task for which we introduce several optimizations. We demonstrate their efficiency with a proof-of-concept implementation and show that we can run FRI's commit phase for 4096 encrypted Reed Solomon codewords with degree bound in just 5.4 seconds (using 32 threads) on a c6i.metal instance using less than 4GB of memory. Verification takes just 12.3 milliseconds (single-threaded) for the same parameter set and can be reduced to just 5.6ms with parameters optimized for the verifier.
All the work above however only show how to compute FHE-PIOP on small field but they are only half the way to the true PPD.
In the following screenshot we highlight the gist of the FHE-SNARKs so far:
Per the diagram the Merkle Tree roots are computed from the ciphertexts of the witness not the witness itself. Hence the verifier needs to decrypt to be able to verify the generated proof, which makes the proof only designated verifier (to the client who has the FHE key). To circumvent the issue, the client needs to decrypt the queries and attach a proof of decryption.
Yes, we are somewhat back to square one, the client still needs to generate a zk-SNARK albeit of a smaller size. This at the same time also change the zk-SNARKs scheme itself to a new scheme.
It is unfortunate that the best known methodology is only half the way to PPD:
We need to continue our investigation for PPD which should be now focused on the FHE-PCS part:
Several potential paths include: