crystals-dilithium zk proofs

## simple crystals-dilithium zk proofs Considerations when implementing zk proofs of [crystals-dilithium](https://pq-crystals.org/dilithium/data/dilithium-specification.pdf) signatures. ![image](https://hackmd.io/_uploads/Bkl2nvDwke.png) We'll consider only the simplified scheme as described above. Discussion should be applicable to the full scheme. ### Proving signature With crystals-dilithium we must contend with a variable length signing procedure. Specifically we have to rejection sample based on the value of $\textbf{y}$, and generate a ball polynomial $c$ based on a hash. Both of these operations are non-deterministic. e.g. it's expected that they must be retried (specifically the authors target 4-7 iterations). Inside a zk circuit we can avoid variable length execution by doing the rejection sampling at witness generation time and providing a suitable $\textbf{y}$ as an input. The circuit does **not** need to range check the $\textbf{z}$ value inside the proof. Outputting a value outside the range leaks the secret key, so it's important that the witness calculation function operates correctly. In fact, we don't need to constrain the calculation of the $c$ value either. The verification procedure is robust and allows us to constrain relatively little of the signing procedure. Specifically we can take $c$, $\textbf{y}$, and $\textbf{s}_1$ as inputs directly. We must constrain only the following: - $\textbf{z} = \textbf{y} + c\textbf{s}_1$ ($\ell$ polynomial additions and $\ell$ polynomial multiplications) ($\ell \approx 5$) #### unconstrained $\textbf{w}_1$ and $c$ calculation The verifier calculates $\textbf{A}\textbf{z} - c\textbf{t}$. We can substitute $\textbf{z}$ and $\textbf{t}$ in this equation to get $\textbf{A}(\textbf{y} + c\textbf{s}_1) - c(\textbf{A}\textbf{s}_1 + \textbf{s}_2) = \textbf{A}\textbf{y} + \textbf{A}c\textbf{s}_1 - \textbf{A}c\textbf{s}_1 - c\textbf{s}_2$ The middle terms cancel leaving $\textbf{A}\textbf{y} - c\textbf{s}_2$ $c$ and $\textbf{s}_2$ are both of small norm and are filtered out by decomposing into the high bits of the coefficients yielding $\textbf{w}'_1$. Because the verifier calculates the $\textbf{w}'_1$ value, it's able to ensure that $c$ was hashed appropriately even without explicit constraints in the zk circuit. ### misc notes Polynomials are in a ring with modulus $X^{256} + 1$ with coefficients in a field with scalar modulus $\approx 2^{23}$. Depending on the architecture of the execution system we can do lazy reduction for the scalar operations. This is a compiler level optimization that is particularly relevant in systems with large native types (e.g. $\approx 2^{256}$) that can't efficiently support montgomery/crt representation.