ZKEVM - `halo2` Customization for Aggregation

## Issues ### 1. Hard to split/parallelize Because all circuits are using "random linear combination" trick to synchronize the dynamic-size witnesses, it's important to commit the witnesses in both sides to generate the randomness. However, this makes it hard to split/parallelize the workloads because they have some shared states in the middle of proving. ### 2. Hard to build the aggregation tree Because some circuits are serving self as lookup table for others, we need to make sure the dependents are using the exact same commitments from dependencies' ones. However, the aggregation circuit might not have enough capacity to aggregate all inner circuits, we might need to build a tree of aggregation to incrementally aggregate all proofs. So some dependencies' commitments need to be exposed as public inputs and share between different leafs, and transcript state also needs to be shared. ## Approaches ### 1. Define all circuits as a single circuit In current API, we can connect all circuits by defining them as a single one, and share the expressions around without any issue (tho we still need the challenge API). We then need to build our own meta structure to describe columns dependency, parallelizable workloads, and aggregation tree for customized prover algorithm. Even we can parallize the prover in this approach, we still have the same cost to share transcript state and commitments. So second approach seems to be a natural solution. ### 2. Define higher order circuits to connect inner circuits While verifying each inner circuit, we ignore all "random linear combination" part. After each inner circuit's encapsulated constraints are verified, we then reuse their commitments for higher order circuits' advice columns. Then we verify these commitments indeed satisfy some relation (subset, shuffle, etc...). > in the higher-order circuit, do we treat the commitments of the inner circuit as fixed columns? or instance columns? > [name=ying tong] > They should be instance no? As RNG will change in-between circuit instanciations and we need to keep the same Prover/Verifier keys. Than can't be done with `Fixed` columns no? > [name=CPerezz] > It should still be advice columns (updated above), the different is it must be reused from those exposed from inner circuits. > [name=han] Then parallelization of inner circuit will be no pain, and higher order circuits will have most columns committed and should be able to skip some workload, but passing the orginal commitments (instead of processed) will add some overhead. To mitigate the overhead, the best practice to build inner circuit is to put (or duplicate) everything that connected to other circuit in the same columns. For Keccak circuit as an example, it cost ~1300 rows to absorb 136 bytes, then we have enough capacity to duplicate all 136 bytes input and 32 bytes output in the same column (with some 2 boolean columns to show their position), then we only need to expose the state, 2 booleans and bytes (overall 4 columns) to the higher order circuit. For EVM circuit as an example, it currently does random linear combination of shuffle/lookup input itself. In this approach, it should just move the actual values and expose those shuffle/lookup input columns for higher order circuits. Let's take another naive example, say EVM circuit wants to do 128-bits xor in a single lookup, that's not possible by a fixed table, but possible by a advice table. So we create a Xor circuit to help to build such advice table, the illustration will be like: ![](https://hackmd.io/_uploads/SJf23yLv9.png) The necessary for such pattern is because Aggregation circuit #1 might not be able to also aggregate Xor circuit or other dependency circuits, so we need next layer of aggregation (Aggregation circuit #3), at the same time, to keep them connected. In each level, the pink part can be discarded (the expensive part), only the yellow part need to be exposed as public input for higher order circuit in next level. Those workloads at the same level can be parallelized with no pain, and some of workloads are overlapping (the yellow part). To build a PoC of such approach, we have some todos: - `halo2` - [ ] Blinding factors need to be removed to make sure higher order circuits can make the same commitments of inner circuits. > [name=CPerezz] > I think this is already the case. One of the latest PRs to halo2 added the ZK property as a trait bound for the Circuit structure. > See: https://github.com/privacy-scaling-explorations/halo2/pull/76 - Aggregation circuit - [x] Aggregate proofs from different circuits. - [x] Can be configured columns mapping from different vks and make sure their commitments are same by copy constraint automatically. Also to make sure inner circuits compatible or easier to integrate in such scheme, we have to do some adjustments: - The shared part needs to be modularized as a chip with its own config. - The shared part needs to automatically interpolate some polynomials to larger domain if its source circuit has lower degree.