# zkSummit8 - 15/9/22
Here are some notes from my visit to zkSummit.
## 1. Demystifying zk Programming
> Howard Wu
This talk was an overview of the Aleo (Autonomous Ledger Executions Off-chain) blockchain. Their aims for the platform are
1. Improve efficiency by avoiding re-execution
2. Support privacy
3. Flexibility (user defined programs)
They do this, of course, by sprinkling zkSNARKs over a blockchain.
Starting with an account-based model they update the account contents like so
```text
balance => commit(balance),
code => predicate(code),
storage => encrypted(storage),
```
Then they identify bottlenecks for both efficiency and concurrency that motivate the construction of a **record-based model**.
The bottlenecks are envisioned in the context of a dapp with a large number of users. In this scenario, each user of the dapp has some user-specific state associated with it.
When a user interacts with the dapp their state must be updated. The problems are threefold:
1. *State updates are not efficient* - Predicates must consume the entire program state in order to update it (e.g. the state for all users using the dapp)
2. *Doesn't support concurrency* - Due to "merkleization", one user's transaction invalidates any others that were submitted concurrently
3. *Doesn't have account privacy* - The users updating the dapp are public
Their solution is a *record-based model* where, within a given app, each user's records are separate. This simultaneously means that: a) predicates do not need to consume the entire program state, and b) users can concurrently update their records.
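As a rough mental model of the difference (my own sketch with made-up types, not Aleo's actual data structures):

```python
# Hypothetical sketch contrasting the two state models (not Aleo's real types).
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class AccountModelApp:
    # One global map of user state; a predicate that updates any single user
    # still has to consume (and re-commit to) this entire structure.
    user_state: Dict[str, int]

@dataclass
class Record:
    owner: str
    payload: int       # user-specific state (encrypted on chain)
    commitment: bytes  # what actually appears on chain

@dataclass
class RecordModelApp:
    # Each user's state lives in their own records, so a transaction only
    # consumes/produces that user's records and concurrent users don't clash.
    records: List[Record]
```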
**Questions:**
1. How does this compare with our **parties-based model**?
2. Programs can still have global state, for example, the exchange rate in a constant product market maker liquidity pool; so how do they support and handle updating application-global state?
* I asked Howard this question and the answer is that they have a *finalization hook* behind which some things are serialized.
**Spec**
* Transactions: ~1kb
* Verification: < 50ms
* Throughput: "thousands of tx / block"
* Flexibility
* User-defined programs
* Inter-process communication (programs can interact)
* Uses Marlin
* Setup
* Trusted
* Universal
* Sampled Aleo curves
* BLS12-377
* Edwards-BLS12
## 2. Caulk - Linear-map vector commitments
> Arantxa Zapico
This talk actually changed its title to
**<center>Caulk - Lookup arguments in sublinear time</center>**
Surprise to me! Anyway, this was essentially about the lookup research we already met with Mary Maller to discuss.
While plookup is good for small tables, the main advantage of Caulk is answering membership queries where the query is a small subset of elements of a large table.
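To pin down the setting (a toy statement of the relation, not Caulk's actual protocol): the prover wants to show that every element of a small query vector is a member of a large committed table, with costs driven by the query size rather than the table size.

```python
# Toy (non-succinct) statement of the lookup relation: every queried value
# appears in the committed table. Caulk proves this with cost sublinear in
# len(table), whereas a plookup-style argument pays for the whole table.
table = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]   # large committed table
query = [5, 9, 3]                            # small subset of queried values

def lookup_relation_holds(table, query):
    members = set(table)
    return all(v in members for v in query)

assert lookup_relation_holds(table, query)
```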
**Questions:**
1. How does this compare with *Pointproofs*?
* Pointproofs do not support subset queries as efficiently (you have to look each element up separately or aggregate them)
* Pointproofs are unable to reuse existing trusted setups (because they require you to remove one of the elements)
**Spec**
* Pairing-based
* 100x faster (for specific goal) than Merkle tree approach
* Better for bigger tables with small query sets
## 3. A gentle introduction to Hasse's theorem / cool facts about elliptic curves
> Ariel Gabizon
This was a different kind of talk for this conference. He simply shared with us some mathematical concepts that he thinks are interesting and cool. In doing so, he outlined a proof of Hasse's theorem using endomorphisms over the complex numbers.
It would be nice if there were a recording of this online somewhere.
## 4. Hyperplonk
> Benedikt Bünz
This was a talk about a new Plonk construction where the authors improve performance by avoiding FFTs.
This new polynomial IOP can be combined with the Bulletproofs, KZG or FRI polynomial commitment schemes to create SNARKs.
Aims to support
* Custom gates
* Lookups
* Halo2 arithmetization (access multiple rows)
Whereas Plonk uses univariate polynomials and codes things over a multiplicative subgroup, Hyperplonk uses multi-linear polynomials defined over the boolean hypercube.
Copy constraints are enforced using a permutation argument which builds on the product argument from Quarks. https://eprint.iacr.org/2020/1275
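As a refresher on how copy constraints reduce to a product check (this is the generic Plonk-style identity; the Quarks piece, as I understand it, is how that product is then proven over the hypercube), the prover shows that the wire values give the same grand product whether positions are indexed by the identity or by the copy permutation:

```python
# Toy numeric check of the copy-constraint grand-product identity
#   prod_i (w_i + beta*i + gamma) == prod_i (w_i + beta*sigma(i) + gamma)
# which holds (with overwhelming probability over beta, gamma) exactly when
# the wire values respect the copy permutation, i.e. w_i == w_sigma(i).
import random

p = 2**61 - 1                   # toy prime modulus
w = [7, 7, 5, 5, 9]             # wire values
sigma = [1, 0, 3, 2, 4]         # copy permutation: 0<->1, 2<->3, 4 fixed

beta, gamma = random.randrange(p), random.randrange(p)
lhs = rhs = 1
for i, wi in enumerate(w):
    lhs = lhs * (wi + beta * i + gamma) % p
    rhs = rhs * (wi + beta * sigma[i] + gamma) % p
assert lhs == rhs               # the copies agree, so the products match
```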
The case against FFTs is
* $O(n\log(n))$
* Requires FFT-friendly multiplicative subgroup $\omega^n = 1$
* Complex (i.e. inefficient) memory access patterns
* Impacts parallelization
Basically, they argue that FFTs are a bottleneck for large instances because they have superlinear scaling.
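For context on the second bullet (my own illustration, nothing Hyperplonk-specific): a radix-2 FFT over a prime field needs a domain of $n = 2^k$ points forming a multiplicative subgroup, which exists only when $2^k$ divides $p - 1$; SNARK fields are usually chosen with a large power-of-two factor in $p - 1$ precisely for this reason.

```python
# A subgroup of order n exists in the multiplicative group of F_p
# iff n divides p - 1 (the group is cyclic of order p - 1).
def has_radix2_domain(p: int, k: int) -> bool:
    return (p - 1) % 2**k == 0

p = 97                               # toy prime: p - 1 = 96 = 2^5 * 3
assert has_radix2_domain(p, 5)       # a 32-point FFT domain exists
assert not has_radix2_domain(p, 6)   # ...but not a 64-point one
```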
Hyperplonk aims to support very complicated custom gates
* So-called **high-degree custom gates**
The premise is that in Plonk if you make your gate too large, then you end up paying for it in group operations.
Concretely, in terms of group operations
* *Plonk:* $O(n \cdot d)$
* *Hyperplonk:* $O(n)$ plus $O(n \cdot d \log^2(d))$ field operations
**Proof size:** a bit larger than with Plonk: $\sim 4\log(n)$ where $n$ is the number of constraints (<4kb for $n = 2^{20}$ with KZG)
So, how does this multi-linear polynomial look?
Suppose that in Plonk the domain $\mathcal{H} = \{1, \omega, \omega^2, \ldots, \omega^7\}$ and the polynomial $\mathcal{P}(x)$ is like this
$$
\begin{aligned}
\text{Column 0} &~& \text{Column 1} \\
\mathcal{P}(1) &= 7 ~ &\mathcal{P}(\omega) = 17 \\
\mathcal{P}(\omega^2) &= -5 ~ &\mathcal{P}(\omega^3) = 19 \\
\mathcal{P}(\omega^4) &= 12 ~ &\mathcal{P}(\omega^5) = 23 \\
\mathcal{P}(\omega^6) &= 4 ~ &\mathcal{P}(\omega^7) = 11
\end{aligned}
$$
for $n = |\mathcal{H}| = 8$. Then for the multi-linear approach the polynomial will have $\log_2(n) = 3$ variables and appear something like this
$$
\begin{aligned}
\text{Column 0} &~& \text{Column 1} \\
\mathcal{P}(0,0,0) &= 7 ~ &\mathcal{P}(0,0,1) = 17 \\
\mathcal{P}(0,1,0) &= -5 ~ &\mathcal{P}(0,1,1) = 19 \\
\mathcal{P}(1,1,0) &= 12 ~ &\mathcal{P}(1,1,1) = 23 \\
\mathcal{P}(1,0,0) &= 4 ~ &\mathcal{P}(1,0,1) = 11
\end{aligned}
$$
where the values of the variables correspond to the vertices of the boolean hypercube. So, for $n$ rows $\mathcal{P}$ requires $\log_2(n)$ boolean variables. To handle **shifting** (i.e. $\mathcal{P}(X)$ and $\mathcal{P}(\omega X)$ in Plonk-speak), they adopt a strategy that involves encoding the column number into $\mathcal{P}$'s arguments (n.b. in this example the 3rd parameter encodes the column number).
The resulting polynomial is linear in every variable, which is good for efficiency, but the proof size has a factor of $\log_2(n)$ overhead.
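A small sketch of what working with the table as a multilinear polynomial means (my own code, over the integers rather than a prime field): the $2^m$ table entries determine a unique polynomial that is linear in each variable and agrees with the table on the cube.

```python
# Multilinear extension: given evaluations on {0,1}^m (ordered lexicographically),
# P(x) = sum_{b in {0,1}^m} evals[b] * prod_i ( x_i if b_i else 1 - x_i ).
from itertools import product

def mle_eval(evals, point):
    m = len(point)
    assert len(evals) == 2**m
    total = 0
    for idx, b in enumerate(product([0, 1], repeat=m)):
        weight = 1
        for x_i, b_i in zip(point, b):
            weight *= x_i if b_i else (1 - x_i)
        total += evals[idx] * weight
    return total

# The table from above, ordered (0,0,0), (0,0,1), (0,1,0), ..., (1,1,1).
evals = [7, 17, -5, 19, 4, 11, 12, 23]
assert mle_eval(evals, (0, 1, 1)) == 19   # agrees with the table on the cube
print(mle_eval(evals, (0, 0, 0.5)))       # 12.0, halfway between 7 and 17
```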
They try to work on $\mathcal{P}$ only in evaluation form and use the **sum-check** protocol for the zero-check. The prover sends a degree 3 polynomial in each round.
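And here is a bare-bones sketch of the sum-check flow itself (over the integers, with the prover and verifier collapsed into one function and no Fiat-Shamir; the round polynomials here are degree 1 because the toy $g$ is multilinear, whereas Hyperplonk's are degree 3):

```python
# Minimal sum-check sketch: convince a verifier that
# sum_{x in {0,1}^m} g(x) == claim, touching g only once at the end.
import random

def sumcheck(g, m, claim):
    fixed = []
    expected = claim
    for round_i in range(m):
        # Prover: restrict g to the current variable, sum out the remaining ones.
        def round_poly(t):
            tail_len = m - round_i - 1
            return sum(
                g(tuple(fixed) + (t,) + tuple((b >> j) & 1 for j in range(tail_len)))
                for b in range(2**tail_len)
            )
        # Verifier: consistency check at 0 and 1, then a random challenge.
        if round_poly(0) + round_poly(1) != expected:
            return False
        r = random.randrange(0, 1000)
        expected = round_poly(r)
        fixed.append(r)
    # Final check: a single evaluation of g at the random point.
    return g(tuple(fixed)) == expected

g = lambda x: x[0] * x[1] + 3 * x[2]   # toy polynomial, degree 1 per variable
claim = sum(g((a, b, c)) for a in (0, 1) for b in (0, 1) for c in (0, 1))
assert sumcheck(g, 3, claim)
```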
Adapts nicely to custom gates.
> In Plonk it's advantageous to use complex custom gates, because it reduces the number of gates and the size of the witness, which reduces the number of group operations. However, it also increases the degree of the quotient polynomial, which increases the number of group operations.
(They claim that Plonk only really works for custom gates up to degree 5.)
However, in Hyperplonk (custom) gates only affect the sum-check, which involves only field operations. Group operations are about 1000x more expensive than field operations, so this is a big speed-up.
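Putting that rough 1000x factor together with the operation counts above gives a crude back-of-envelope comparison (my own arithmetic; constants and the actual mix of operations are ignored):

```python
# Crude cost model in "field-operation equivalents": 1 group op ~ 1000 field ops.
from math import log2

GROUP_OP = 1000   # assumed relative cost of a group operation

def plonk_cost(n, d):
    return GROUP_OP * n * d                       # O(n*d) group operations

def hyperplonk_cost(n, d):
    return GROUP_OP * n + n * d * log2(d) ** 2    # O(n) group ops + field-op term

n, d = 2**20, 16  # a million constraints, degree-16 custom gates (made-up numbers)
print(f"Plonk      ~ {plonk_cost(n, d):.1e} field-op equivalents")       # ~1.7e10
print(f"Hyperplonk ~ {hyperplonk_cost(n, d):.1e} field-op equivalents")  # ~1.3e9
```

So for large instances with high-degree gates, the saving in group operations dominates, which is the talk's point.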
The sum-check prover is quadratic in $d$, the degree of the custom gates, so they do some optimizations to reduce this to something like $O(d\log^2(d))$. These optimizations resort to some FFTs.
## 5. SAFE: Faster and simpler hashing for ZKP
> JP Aumasson
They have developed a sponge API for field elements, a variant of the duplex model with field element interfaces. Their aim is to help eliminate common pain points and security bugs found in implementations:
* Security flaws, such as domain separation failures
* Inconsistent APIs that are hard to use securely
* The padding vs performance issue
The framework aims at
* Standardization
* Interoperability
* I/O patterns
* Specification of the sponge state
* Pseudocode and models
I think it may also have some functionality related to dealing with transcripts (i.e. the Fiat-Shamir heuristic) safely.
https://safe-hash.dev
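I haven't looked at the spec in detail, so the following is only the general shape of a fixed-IO-pattern sponge over field elements as I imagine it (hypothetical names and a stubbed permutation, not the actual SAFE API):

```python
# Hypothetical field-element duplex sponge sketch (NOT the real SAFE API).
# The IO pattern and a domain string are fixed up front and folded into the
# initial state, which is the kind of domain separation SAFE standardizes.
P = 2**61 - 1            # stand-in prime field
RATE, CAPACITY = 2, 1    # stand-in sponge parameters

def permutation(state):
    # Placeholder; a real instantiation would use e.g. Poseidon here.
    return [(s * s + sum(state) + i) % P for i, s in enumerate(state)]

class Sponge:
    def __init__(self, io_pattern: str, domain: str):
        tag = int.from_bytes(f"{io_pattern}|{domain}".encode(), "big") % P
        self.state = [tag] + [0] * (RATE + CAPACITY - 1)
        self.pos = 0

    def absorb(self, elems):
        for e in elems:
            if self.pos == RATE:
                self.state = permutation(self.state)
                self.pos = 0
            self.state[self.pos] = (self.state[self.pos] + e) % P
            self.pos += 1

    def squeeze(self, n):
        out = []
        while len(out) < n:
            self.state = permutation(self.state)
            out.extend(self.state[:RATE])
        self.pos = RATE   # force a permutation before any further absorbs
        return out[:n]

# Usage following a declared pattern "absorb 2, absorb 1, squeeze 1":
sponge = Sponge(io_pattern="A2 A1 S1", domain="my-protocol-v1")
sponge.absorb([5, 7])
sponge.absorb([11])
challenge = sponge.squeeze(1)[0]   # e.g. a Fiat-Shamir challenge
```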
This could be nice to adopt, for example, by integrating it with our Poseidon implementation; however, because it would mandate its own domain separation strings it would not be backwards compatible.
## 6. New directions in ZK hashing
> Dmitry Khovratovich
I thought this was a very cool talk. It's about a new SNARK-friendly cryptographic hash function called Reinforced Concrete.
https://eprint.iacr.org/2021/1038
The problem they are trying to solve is that SNARK-friendly hashes are currently extremely slow when run outside of the circuit.
Their approach is to design their hash function using lookup arguments.
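My rough understanding of the trick, with made-up parameters (an illustration, not the actual Reinforced Concrete layer): decompose a field element into small digits, pass each digit through a small table-based S-box, and recompose. In a circuit each digit costs one lookup, and natively it is just an array index rather than a high-degree field operation.

```python
# Toy lookup-friendly layer (made-up parameters, not Reinforced Concrete's).
DIGIT = 256
SBOX = [(7 * d + 3) % DIGIT for d in range(DIGIT)]   # toy invertible table

def lookup_layer(x: int, num_digits: int = 4) -> int:
    # Decompose into base-256 digits, substitute each via the table, recompose.
    digits = [(x >> (8 * i)) & 0xFF for i in range(num_digits)]
    substituted = [SBOX[d] for d in digits]
    return sum(s << (8 * i) for i, s in enumerate(substituted))

print(hex(lookup_layer(0x01020304)))   # cheap natively; one lookup per digit in-circuit
```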
Their hash has some nice properties
* High algebraic degree
* Irreducible to a small system of low degree
* Looks like an "old structure from cryptanalysis"
It could be faster than Poseidon in circuit, but it depends who you ask. JP says the speed will be about the same while Dmitry seemed to suggest it could be twice as fast.
They may be looking for a proof system with lookups on which to implement it and gain practical insights! Perhaps this is something we should investigate or maybe even reach out to them.