A Witness Encryption Construction over KZG-Committed Data

During my internship at Ingonyama, I was tasked with making a proof of concept for a trustless dark chess. Dark chess is basically just chess, but with a fog-of-war mechanic, meaning you can only see the squares that you can attack. <div style="text-align: center;"> <img src="https://hackmd.io/_uploads/Byfljxeqke.png" alt="Alt text" width="300"> <p style="font-style: italic;">This is the squares white can see, the rest is behind a "fog-of-war".</p> </div> This is usually done by a trusted coordinator (like chess.com), having access to the whole board and sending the relevant squares to both players. But, as cryptographers, we don't like having to rely on a central authority, so let's fix this. The first thing that comes to mind would be to use ZK. Unfortunately, here, ZK is not sufficient. Yes you could ask the opponent to reveal their pieces that are on squares you control without revealing your own pieces using ZK. But by knowing which squares you have access to, the opponent learns a lot about where your pieces are. <div style="text-align: center;"> <img src="https://hackmd.io/_uploads/rJnrhel51l.png" alt="Alt text" width="300"> <p style="font-style: italic;">If you prove that you have access to the red squares it is pretty easy for the opponent to guess that you have a bishop on f4 and king on a3</p> </div> To counter this we would need for the opponent to send his entire board, in such a way that you can only read the squares that you have access to. That's where Witness Encryption comes into play. informally, Witness Encryption is encrypting a message that can only be decrypted by someone who has a solution to a problem (a tad more formally, someone who has a valid witness to an NP statement). Or you can just think of it as: people can only decrypt it if they can make a specific function evaluate to true. Now this would solve our problem, because the opponent can encrypt his squares in a way that you can only decrypt if a function encoding visibility rules evaluates to true on that square. Unfortunately General Purpose Witness Encryption is still an open problem. There are, however, WE protocols for specific problems or statements. This [paper](https://eprint.iacr.org/2024/264) presents a protocol for witness encrypting to the proof of the evaluation of a KZG-committed polynomial. Here's the basic idea: Say you have an encryptor Eve and decryptor Derrick. Derrick has a polynomial P and has published its KZG commitment $com(P)$. Eve can then use $com(P)$ and values $\beta, \alpha$ to encrypt a message that Derrick will only be able to decrypt if he can prove that $P(\beta)=\alpha$. (You don't need to know the exact protocol to follow this post, but essentially it works by adding randomness to the KZG commitment equality.) While insuficient on its own, it allowed me to come up with a special-purpose protocol for dark chess. But this is not what this post is actually about. After thinking about how to make it general purpose I came up with the following: Say Eve has a boolean function $f$ and a commitment $com(P)$ that represents Derrick's data x. She wants to encrypt a message that he can only decrypt if $f(x)=true$. We'll assume Derrick's data is encoded in the polynomial P as bits, meaning it only evaluates to zero or one on the root-of-unity domain. In other words, if $(x_1,...,x_n) \in \{0,1\}^n$ is Derrick's data, then the polynomial P will be such that $P(\zeta_i) = x_i$ where the zetas are roots of unity. The first step is to transform f into a SAT formula over the input bits - specifically a CNF-SAT formula (this isn't strictly necessary but makes things easier). Meaning it takes the form: $$\bigwedge_{i=1}^m \left(\bigvee_{j=1}^{j_i} l_{ij}\right)$$ So basically a big m-wide AND gate containing *clauses*, which are varying-size OR gates of *literals*, where a literal $l_{ij}$ is either $x_k$ or $\neg x_k$ for a $k\leq n$ For the formula to evaluate to true, all m clauses need to be true, which means at least one of the inner literals in each clause must be true. for example this formula evaluates to true if the number you get by considering $x = (x_1,x_2,x_3)$ as an integer is $\geq 5$ $$ x_1 \wedge(x_2\vee x_3) $$ **Encryption** Let's look at how the encryption process works. Eve starts by choosing $m$ uniformly random keys $k_1,...,k_m$, one for each clause in our SAT formula. She then combines these keys by XORing them $K = \bigoplus_{i=1}^m k_i$. This K will serve as the main encryption key - Eve uses it to encrypt her message M with her preferred symmetric encryption scheme and sends the cipher. Now comes the interesting part. For each clause $i$ in our formula, Eve needs to ensure that Derrick can only obtain $k_i$ if that clause evaluates to true. She does this by encrypting $k_i$ multiple times, once for each literal in the clause, using the KZG-based witness encryption protocol from the paper: - For a positive literal $x_j$, she encrypts $k_i$ requiring a proof that $P(\zeta_j)=1$ - For a negative literal $\neg x_j$, she encrypts $k_i$ requiring a proof that $P(\zeta_j)=0$ Eve then sends all these encrypted versions of the keys to Derrick. Derrick will need to decrypt at least one version of each $k_i$ to reconstruct K and access the message. In our example we get two keys $k_1$ and $k_2$ and encrypt $k_1$ on the condition that $P(\zeta_1)=1$ to get $ct_1$. We then encrypt $k_2$ two times, once for $P(\zeta_2)=1$ and for $P(\zeta_3)=1$ to get $ct_2$ and $ct_3$ **Decryption** The decryption process follows naturally from the encryption. If Derrick's data satisfies the formula, then by definition every clause must evaluate to true. This means that for each clause $i$, at least one of its literals matches his committed polynomial P. For each clause, Derrick just needs to pick any matching literal and use it to decrypt the corresponding $k_i$ - he does this by generating a proof of the required polynomial evaluation ($P(\zeta_j)=1$ or $P(\zeta_j)=0$ depending on whether it was a positive or negative literal). Once he has all the $k_i$, he can XOR them together to recover K and decrypt the message M. In our previous example, say Derrick has the number 5, meaning $x_1=1,x_2=0,x_3=1$. He will first create a proof that $P(\zeta_1)=1$ and use it to decrypt $ct_1$ and get $k_1$. He can't decrypt $ct_2$ but he will get $k_2$ from $ct_3$ instead with the proof that $P(\zeta_3)=1$. Now he can XOR them, get K, decrypt M. Since CNF-SAT is NP-complete, this protocol effectively lets us encrypt a message for any problem in NP. Eve doesn't need to know a solution, crucially, she doesn't even need to know a solution exist! - she can encrypt towards a formula that is not satisfiable. This might start to sound like general Witness Encryption, but there's a crucial difference: Eve isn't encrypting in a way that anyone with a solution can decrypt. Only Derrick will be able to decrypt, and only if the polynomial he already committed to encodes a solution. You could extend this to have interactive WE by having Eve re-encrypt the message for anyone with a claimed solution (or even set up a MPC that does this automatically, removing the need for Eve to stay online or handle the computation). However, if you're willing to make it interactive, there are more efficient approaches - for instance, using 1-of-2 Oblivious Transfer where you prove bit consistency with a zero-knowledge proof. This protocol also comes with significant computational costs: up to $2n$ pairings for the encryptor and $n$ pairings plus $n$ KZG proof generations for the decryptor. Communication overhead is also substantial - you need to send a $G_2$ group element per evaluation, on top of all the encrypted $k_i$s, adding on the order of kilobits of overhead per bit of the solution. However not only are pairings much faster now, [reaching sub-millisecond](https://hackmd.io/@gnark/eccbench), but the computation is highly parallelizable. You can compute every needed pairings at the same time, same goes for the evaluation proofs. Despite these limitations, I think the protocol could still make sense for some specific use cases, particularly where we can optimize the computational costs. Here's why: The computation actually scales with the relevant parameters rather than your total data size. Say you have a polynomial representing some identity - for the sake of simplicity, imagine it's just a name and country. While the name might require hundreds of bits, the country can be encoded in less than 10. If somebody wants to encrypt a message based only on the country, they only need to compute 10 pairings. You can also optimize decryption by storing and reusing your KZG opening proofs for as long as your data stays the same. The decryption cost, if you already have your opening proofs, also only depends on the number of parameters that are relevant. Of course, storing all those proofs can get expensive, but depending on your system, you can always choose to store the most frequently used parts of the polynomial and recompute the others. In our identity example, you might only store country bit proofs since you're more likely to receive messages that depend on your country than your name. One interesting application would be attribute-based encryption. The protocol relies on KZG commitments, so it's compatible with some existing proof systems, which means you could easily zk prove the content of your polynomial. You could imagine a group of people publishing a commitment to their attributes, along with a zero knowledge proof that it was properly constructed, on-chain. And now anyone could encrypt a message for everyone with certain attributes by making a SAT instance that checks the attribute, and encrypting the message with it for all polynomials in the group. It doesn't require the receivers to interact and any sender can send as many messages as they want on the attributes that they want. What comes to mind here are on-chain games, because in a game the set of participants is relatively small and they already need to prove and verify zk proofs anyway (if the game has hidden information at least), so a few pairings more shouldn't make that much of a difference. In fully trustless games, we already have hidden information with ZK, however, this isn't sufficient for fog-of-war type games, where the opponent has access to part of your hidden information, but you don't know what part (think Starcraft). This protocol seems to offer a solution for those kinds of situations. (PoC code can be found [here](https://github.com/vladfdp/sat-we))