paper: https://eprint.iacr.org/2020/1247.pdf
### [[arbitrary-arithmetic-circuits]]
- for [[layered-arithmetic-circuits]] each gate can only take its inputs from the layer directly below.
- for the purposes of [[virgo++]], an [[arbitrary-arithmetic-circuits]] is defined as follows
- the circuit can be seen as a directed acyclic graph
- with fan-in 2 nodes (meaning each gate takes a maximum of 2 inputs)
- the depth of the circuit $d$ is the length of the longest path in the graph
- input gates are at layer d, output gates are at layer 0
- each gate in the graph can be assigned a layer id via topological sorting
- we can determine a gate's layer via the following rule (see the sketch after this list): layer(g) = min(layer(u), layer(v)) - 1
- where $u$ and $v$ are the two inputs to gate $g$
- note: if $g$ is an input gate then its layer id is just $d$
- this leads to the conclusion that every gate at layer i must have an input from layer i + 1
- rough intuition
- inputs are fixed at layer d
- all other gates depend on the inputs or depend on things that have been derived from the input.
- if a gate depends only on the inputs then it sits at layer d - 1
- if it depends on something that itself depends on the inputs (one step removed) then layer d - 2
- the idea here is that the layers represent computational steps: a gate has to wait a layer for all its needed inputs to be ready, and only then can it perform its computation.
- hence the layer id can represent how long you have to wait for all your inputs to be ready!!
- topological sorting puts every gate in the exact layer they need to be, nothing more, nothing less (in terms of input ready wait time).
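- a minimal sketch (Python, hypothetical helper names, not from the paper) of the layer-assignment rule above: inputs are pinned at layer $d$ and every other gate takes $\min(layer(u), layer(v)) - 1$.
```python
# Sketch: assign layer ids to a fan-in-2 DAG circuit via memoized DFS
# (equivalent to processing gates in topological order).
def assign_layers(gates, inputs, d):
    """gates: dict gate_id -> (u, v), the two input wires of each gate.
    inputs: set of input gate ids. d: circuit depth (longest path length)."""
    layer = {g: d for g in inputs}          # input gates sit at layer d

    def layer_of(g):
        if g not in layer:                  # layer(g) = min(layer(u), layer(v)) - 1
            u, v = gates[g]
            layer[g] = min(layer_of(u), layer_of(v)) - 1
        return layer[g]

    for g in gates:
        layer_of(g)
    return layer

# toy circuit of depth 2: g2 = x0 * x1, g3 = g2 + x0
layers = assign_layers({"g2": ("x0", "x1"), "g3": ("g2", "x0")}, {"x0", "x1"}, d=2)
assert layers == {"x0": 2, "x1": 2, "g2": 1, "g3": 0}
```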
### generalizing [[gkr]] for [[arbitrary-arithmetic-circuits]] (naive)
- the idea is to perform statement reductions from the output layer to the input layer, where the verifier can then confirm the last reduced statement on the input.
- we do this for [[layered-arithmetic-circuits]] as follows
- given that layer i depends only on layer i + 1, we construct a relation that computes every value in layer i using only values from layer i + 1
- we then prove this relation via sumcheck, which leads to a reduced statement about layer i + 1
- we continue this process recursively until we get to a reduced statement about the input layer
- which the verifier can check directly. fin.
- the key thing here is that we have some relation that can produce all the elements at layer i, but doesn't make use of layer i itself.
- for the [[arbitrary-arithmetic-circuits]] layer i depends not only on layer i + 1 but potentially on all layers $j > i$
- so to generate some relation that produces all elements at layer i, we need to capture all layers $j > i$
- construction notes
- a given gate is either an addition gate or a multiplication gate
- the gate takes in two inputs
- one of those inputs must come from layer i + 1
- the other input can come from any layer $j > i$
- hence the relation just has to have all possible combination of layers
- $add_{i,j}(z, x, y) = 1$ if gate $z$ at layer $i - 1$ is an addition gate whose inputs are gate $x$ from layer $i$ and gate $y$ from layer $j$ (and 0 otherwise); $mul_{i,j}$ is defined analogously for multiplication gates.
- The relation is given as follows $$
\begin{aligned}
V_i(z) = \sum_{x, y \in \{0,1\}^s} \; & add_{i+1, i+1}(z, x, y)\big(V_{i+1}(x) + V_{i+1}(y)\big) + mul_{i+1,i+1}(z, x, y)\big(V_{i+1}(x) \cdot V_{i+1}(y)\big) \; + \\
& add_{i+1,i+2}(z, x, y)\big(V_{i+1}(x) + V_{i+2}(y)\big) + mul_{i+1,i+2}(z, x, y)\big(V_{i+1}(x) \cdot V_{i+2}(y)\big) + \dots + \\
& add_{i+1,d}(z, x, y)\big(V_{i+1}(x) + V_d(y)\big) + mul_{i+1,d}(z, x, y)\big(V_{i+1}(x) \cdot V_d(y)\big)
\end{aligned}
$$
- running sumcheck on the above relation will lead to claims for every $V_j$ where $j > i$
- hence once we reach some layer $V_i$ to reduce, we'd have accumulated multiple claims for that layer.
### problems with naively generalizing [[gkr]] for [[arbitrary-arithmetic-circuits]]
- assuming that all layers have S gates
- the polynomial size for a given layer in the generalized relation is O((d - i)S)
- and the sumcheck cost is linear in polynomial size
- hence the total cost of running sumcheck for all layers is given as follows
- $O(dS + (d - 1)S + ... + S) = O(d^2S) = O(d|C|)$
- recall S is the total number of gates in a layer if we have d layers then $dS = |C|$
- we need to somehow reduce the size of the polynomial at each layer; since sumcheck cost is linear in polynomial size, this reduces the cost of the protocol.
### generalizing [[gkr]] for [[arbitrary-arithmetic-circuits]] (optimal)
- assuming layer i has S gates and each gate has fan-in 2, layer i is defined by at most 2S gate values from later layers.
- below is a relation that can still allow you compute the correct $V_i(z)$ at some point z based on layers $j > i$, but with cheaper sumcheck costs.
- rather than considering all gates at later layers as inputs when evaluating a layer, we only consider the subset of gates that actually have wires into the target layer.
- we define $V_{i,j}$ as the subset of gates from layer $V_j$ that have wirings to gates in $V_i$
- this will affect the way the $add_i$ and $mul_i$ equations are constructed.
- the relation is given as follows
- $$
\begin{aligned}
V_i(z) = \sum_{x \in \{0, 1\}^{s_{i,i+1}}} \Big( & \sum_{y \in \{0, 1\}^{s_{i,i+1}}} add_{i+1, i+1}(z, x, y)\big(V_{i,i+1}(x) + V_{i,i+1}(y)\big) + \dots + \sum_{y \in \{0,1\}^{s_{i, d}}} add_{i+1, d}(z, x, y)\big(V_{i,i+1}(x) + V_{i,d}(y)\big) \\
+ & \sum_{y \in \{0, 1\}^{s_{i,i+1}}} mul_{i+1, i+1}(z, x, y)\big(V_{i,i+1}(x) \cdot V_{i,i+1}(y)\big) + \dots + \sum_{y \in \{0,1\}^{s_{i, d}}} mul_{i+1, d}(z, x, y)\big(V_{i,i+1}(x) \cdot V_{i,d}(y)\big)\Big)
\end{aligned}
$$
- ![[Pasted image 20250528155932.png]]
- to prove the [[arbitrary-arithmetic-circuits]]
- we'd need to prove each relation above, which will produce claims about the subset polynomials $V_{i,j}$ (a sketch of constructing these subsets follows this list).
- the claimed values are given as hints by the prover; when the corresponding layer's turn comes, the verifier will need to verify the accumulated subset claims,
- reducing them to one claim about the full layer polynomial, which can then be proved via the standard virgo process.
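- a minimal sketch (Python, assumed/toy data layout, not the paper's implementation) of building a subset $V_{i,j}$: keep only the layer-$j$ gate values that are wired into some gate at layer $i$, in a fixed order.
```python
# Sketch: extract the subset V_{i,j} of layer-j values that feed layer i.
def build_subset(layer_values, wires, i, j):
    """layer_values[j]: list of gate values at layer j.
    wires: list of (out_layer, out_idx, in_layer, in_idx) tuples."""
    used = sorted({in_idx for (out_l, _, in_l, in_idx) in wires
                   if out_l == i and in_l == j})
    # V_{i,j} keeps only the used layer-j values, in a fixed (sorted) order
    return [layer_values[j][k] for k in used], used

values = {2: [3, 5, 7], 1: [15, 8]}        # toy values per layer
wires = [(0, 0, 1, 0), (0, 0, 2, 2)]       # gate 0 at layer 0 reads V_1[0] and V_2[2]
subset, indices = build_subset(values, wires, 0, 2)
assert subset == [7] and indices == [2]    # V_{0,2} contains only V_2[2]
```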
### performing sumcheck on the relation
- phase 1
- we recall libra phase 1
- we are trying to prove a relation of the following form
- $H(a) = \sum_{b, c} f(a, b, c) \cdot v(b) \cdot v(c)$
- the goal is to not materialize $f(a, b, c)$ as it is sparse and we should be able to take advantage of that sparsity.
- to achieve efficiency we compute the sumcheck messages in phases, first for the b variables (phase 1) and then for c variables (phase 2).
- we perform the following rewrite
- $H(a) = \sum_b v(b) \cdot h_a(b)$
- where $h_a(b) = \sum_c f(a, b, c) \cdot v(c)$
- we turn $f(a, b, c)$ into a vector over the [[boolean-hypercube]] as follows
- $h_a(b) = \sum_{z, c} I(a, z) \cdot f[z, b, c] \cdot v(c)$
- constructing $h_a(b)$
- first we initialize a vector $A_{h_a}$ of $2^{|b|}$ zeroes
- we iterate over the non-zero entries of the sparse vector $f[z, b, c]$, and for each entry perform the following
- $A_{h_a}[b] += I_a[z] \cdot v[c]$
- once we have the table $A_{h_a}$ and the table for $v(b)$ we just apply [[sumcheck-over-products]] (a sketch follows this list)
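- a minimal sketch (Python, toy prime field, assuming the non-zero wiring entries equal 1) of the phase-1 bookkeeping loop above:
```python
p = 2**61 - 1  # toy prime modulus for field arithmetic

def eq_table(r):
    """All evaluations of eq(r, z) for z over the boolean hypercube
    (little-endian: bit 0 of the index corresponds to r[0])."""
    table = [1]
    for ri in r:
        table = [t * (1 - ri) % p for t in table] + [t * ri % p for t in table]
    return table

def phase1_table(sparse_f, a, v_c, num_b_vars):
    """sparse_f: (z, b, c) indices with f[z, b, c] = 1 (wiring predicate).
    a: challenge point for the z variables. v_c: table of v over the c hypercube."""
    I_a = eq_table(a)                        # bookkeeping table for I(a, z)
    A = [0] * (1 << num_b_vars)              # A_{h_a}, initialised to zeroes
    for (z, b, c) in sparse_f:
        A[b] = (A[b] + I_a[z] * v_c[c]) % p  # A_{h_a}[b] += I_a[z] * v[c]
    return A

assert eq_table([0]) == [1, 0]               # a = 0 selects z = 0
```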
- effect of the c variable padding for phase 1
- what if instead we wanted to run phase 1 sumcheck on the following relation
- $H(a) = \sum_{b, c} c_{k+1} \cdot ... \cdot c_l \cdot f(a, b, c_0, ... c_k) \cdot v(b) \cdot v(c_0, ..., c_k)$
- applying the rewrite process
- $H(a) = \sum_b v(b) \cdot h_a'(b)$
- $h_a'(b) = \sum_c c_{k+1} \cdot ... \cdot c_l \cdot f(a, b, c_0, ..., c_k) \cdot v(c_0, ..., c_k)$
- we showed in [[variable-padding-for-sumcheck]] that variable padded sumcheck has the same claimed sum.
- hence $h_a(b) == h'_a(b)$
- which means that, as far as the construction of $A_{h_a}$ is concerned, variable padding doesn't matter.
- how do we determine the size of the $A_{h_a}$ table
- it depends entirely on the number of variables of the first-phase variable ($b$ above, $x$ in the [[virgo++]] relation), since that is what the table is indexed by
- in [[virgo++]] the number of $x$ variables is the same for every term in the layer equation sum i.e. the number of variables of $V_{i+1}$
- to fully solve phase 1 for [[virgo++]] we need to know the effect of sum of sums on table generation.
- we look at the following relation
- $\begin{matrix} H(a) = \sum_{b,c} f_1(a, b, c) \cdot v_1(b) \cdot v_2(c) + f_2(a, b, c) \cdot v_1(b) \cdot v_3(c) + \\ ... + f_n(a, b, c) \cdot v_1(b) \cdot v_n(c) \end{matrix}$
- where $f_1(..), f_2(..), ..., f_n(..)$ are sparse
- given that $v_1(b)$ is common across all we can apply the following rewrite
- $H(a) = v_1(b) \cdot h_a(b)$
- where $h_a(b) = \sum_c f_1(a, b, c) \cdot v_2(c) + f_2(a, b, c) \cdot v_3(c) + ... + f_n(a, b, c) \cdot v_n(c)$
- we can turn the sparse polynomials into vectors over the [[boolean-hypercube]] by extracting $I(a, z)$
- $\begin{matrix} h_a(b) = \sum_c I(a, z) \cdot f_1[z, b, c] \cdot v_2(c) + I(a, z) \cdot f_2[z, b, c] \cdot v_3(c) \\ + ... + I(a, z) \cdot f_n[z, b, c] \cdot v_n(c) \end{matrix}$
- an obvious solution is to process the pairs one after the other, as follows (a sketch follows this list)
- first we set up the $A_{h_a}$ table with all zeroes
- we also initialize the $I_a(z)$ table
- then we start with the first pair $(f_1[z, b, c], v_2(c))$
- applying the following routine $A_{h_a}[b] += I_a[z] \cdot v_2(c)$
- repeat for all relevant pairs
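- a minimal sketch (Python, reusing the hypothetical `eq_table` helper and modulus `p` from the phase-1 sketch above) of the pair-by-pair accumulation: all terms share one $A_{h_a}$ table.
```python
def phase1_table_multi(sparse_fs, a, v_tables, num_b_vars):
    """sparse_fs[i]: non-zero (z, b, c) indices of f_i.
    v_tables[i]: hypercube table of the v polynomial paired with f_i."""
    I_a = eq_table(a)                        # shared I_a(z) table
    A = [0] * (1 << num_b_vars)              # one A_{h_a} table for the whole sum
    for sparse_f, v_c in zip(sparse_fs, v_tables):
        for (z, b, c) in sparse_f:           # accumulate pair after pair
            A[b] = (A[b] + I_a[z] * v_c[c]) % p
    return A
```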
- phase 2
- in phase 2 we start to see the effect of the variable padding, we showed in [[variable-padding-for-sumcheck]] how to perform sumcheck on a variable padded polynomial.
- we also know that you can perform sumcheck on sums of polynomials by just adding the round messages together and applying the same challenges to each polynomial.
- this also works for sums of products of polynomials: we just treat each product as a single polynomial (via abstraction), which reduces to the previous case.
- the only thing left for phase 2 is figuring out how to build the bookkeeping tables.
- $\sum_{i = 0}^m \sum_c c_{k+1} \cdot ... \cdot c_l \cdot f_i(a, u, c_0, ..., c_k) \cdot t(u) \cdot v_i(c_0, ..., c_k)$
- where $t(u) = V_{i + 1}(u)$ is a scalar, since $u$ is the challenge fixed in phase 1
- since we are performing [[variable-padding-for-sumcheck]] we can ignore the padded variables
- $\sum_{i = 0}^m \sum_c f_i(a, u, c_0, ..., c_k) \cdot t(u) \cdot v_i(c_0, ..., c_k)$
- we can take advantage of sparsity
- $\sum_{i = 0}^m \sum_{z, b, c} I_a(z) \cdot I_u(b) \cdot f_i[z, b, c_0, ..., c_k] \cdot t(u) \cdot v_i(c_0, ..., c_k)$
- we already have the table representing $v_i(..)$
- we can construct $A_{f_i}$ as follows (a sketch follows this list):
- where $A_{f_i} [c_0, ..., c_k] += I_a[z] \cdot I_u[b] \cdot t(u)$
- for all non-zero entries in $f$
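- a minimal sketch (Python, same assumptions and helpers as the phase-1 sketches) of the phase-2 bookkeeping table: one $A_{f_i}$ per sparse predicate, indexed by the unpadded $c$ variables.
```python
def phase2_table(sparse_f, a, u, t_u, num_c_vars):
    """sparse_f: non-zero (z, b, c) indices of f_i (c over the unpadded c_0..c_k).
    a: the original claim point, u: the phase-1 challenge, t_u: the scalar t(u)."""
    I_a, I_u = eq_table(a), eq_table(u)
    A = [0] * (1 << num_c_vars)              # A_{f_i}, indexed by c_0..c_k
    for (z, b, c) in sparse_f:
        A[c] = (A[c] + I_a[z] * I_u[b] * t_u) % p   # A_{f_i}[c] += I_a[z] * I_u[b] * t(u)
    return A
```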
### combining multiple claims into 1
- when it is time to prove any layer (except the output layer), we'd already have a set of claims about the layer or about subsets of the layer.
- specifically if we try to prove layer i, we have claims from $V_{0, i}, V_{1, i}, ..., V_{i-1,i}, V_{i-1,i}'$
- we have two claims for $V_{i-1,i}$
- the objective is to prove that all claims are valid.
- first we focus on proving a single subset claim.
- we have the following claim $V_{j, i}(r^{(j,i)}) = H_{j,i}$ where $V_{j,i} \subseteq V_i$
- we need to prove two things
1. $V_{j,i}$ was constructed correctly from $V_i$ based on the circuit description
2. Evaluating this subset $V_{j,i}$ at $r^{(j,i)}$ gives the target value $H_{j,i}$
- to extract the subset, we can construct a function $C_{j,i} : \{0,1\}^{s_{j,i}} \times \{0,1\}^{s_i} \rightarrow \{0,1\}$
- where z indexes the subset and x indexes the main polynomial
- $C_{j,i}(z, x) = 1$ if $V_{j,i}[z] == V_i[x]$ i.e. they represent the same gate
- $C_{j,i}(z, x) = 0$ otherwise
- we can compute the sum of subset values as follows
- $\sum_{z \in \{0,1\}^{s_{j,i}},\ x \in \{0,1\}^{s_i}} C_{j,i}[z, x] \cdot V_i(x)$
- rough intuition
- start with the first subset index, e.g. z = 0
- pair z = 0 with every entry in the main array $V_i(x)$
- exactly one x satisfies $C_{j,i}[0, x] = 1$, namely the position in the main array of subset entry 0
- allowing that $V_i(x)$ to contribute to the sum (a tiny numeric check follows this list).
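- a tiny numeric check (Python, toy values, not from the paper) of the selector sum: summing $C_{j,i}[z, x] \cdot V_i[x]$ over both indices recovers the sum of the subset entries.
```python
V_i = [10, 20, 30, 40]                  # full layer-i values
selector = {(0, 1): 1, (1, 3): 1}       # C_{j,i}: subset entry 0 is V_i[1], entry 1 is V_i[3]
subset = [20, 40]                       # V_{j,i}

total = sum(val * V_i[x] for (z, x), val in selector.items())
assert total == sum(subset) == 60
```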
- to fully prove the subset claim, we also have to prove correct evaluation at $r^{(j,i)}$
- evaluation of a multilinear polynomial can be seen as a dot product of its hypercube values and the Lagrange basis evaluated at the point
- subset entries are already ordered by $z$ i.e. $z_0$ is the first entry, $z_1$ the second, ...
- as long as we multiply the right subset entry with the right basis element we should get the same evaluation
- this leads to the following function
- $H_{j, i} = \sum_{z \in \{0,1\}^{s_{j,i}},\ x \in \{0, 1\}^{s_i}} I_{r^{(j,i)}}(z) \cdot C_{j,i}[z, x] \cdot V_i(x)$
- where $I_{r^{(j,i)}}(z)$ represents the Lagrange basis derived from $r^{(j,i)}$
- rough intuition
- z tells us the current subset entry we are working with
- for that z we find the single x with $C_{j, i}[z, x] = 1$
- hence we contribute $I_r(z) \cdot V_i(x)$ to the sum
- as long as $I_r(z)$ produces the correct basis element, the above across all z will lead to the evaluation of $V_{j, i}$
- conclusion
- running sumcheck on:
- $H_{j, i} = \sum_{z \in \{0,1\}^{s_{j,i}},\ x \in \{0, 1\}^{s_i}} I_r(z) \cdot C_{j,i}[z, x] \cdot V_i(x)$
- is sufficient to prove the subset claim $H_{j,i} = V_{j, i}(r^{(j,i)})$
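- a tiny numeric check (Python, reusing the toy `V_i` / `selector` / `subset` values above and the `eq_table` helper and modulus `p` from the phase-1 sketch) that this relation reproduces the subset evaluation.
```python
r = [5]                                  # one z variable, since the subset has 2 entries
I_r = eq_table(r)                        # Lagrange basis over z derived from r
lhs = sum(I_r[z] * val * V_i[x] for (z, x), val in selector.items()) % p
rhs = sum(I_r[z] * subset[z] for z in range(len(subset))) % p   # direct MLE eval of V_{j,i} at r
assert lhs == rhs
```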
- next we need to handle proving multiple claims
- we can do this via [[batch-sumcheck]]
- we sample a random $\alpha$ for every claim, take a random linear combination and prove the combined claim with a single sumcheck.
- as follows:
- sample $\alpha_{0,i}, \alpha_{1,i}, ..., \alpha_{i-1, i}, \alpha_{i-1, i}'$
- compute the new claimed sum as follows:
- $H = \alpha_{0, i} \cdot V_{0, i}(r^{(0,i)}) + ... + \alpha_{i-1, i} \cdot V_{i-1, i}(r^{(i-1, i)}) + \alpha_{i-1, i}' \cdot V_{i-1, i}(r^{(i-1, i)'})$
- the sumcheck body:
- $\begin{matrix} \Big(\sum_{j = 0}^{i - 1} \alpha_{j, i} \sum_{z \in \{0,1\}^{s_{j,i}},\ x \in \{0, 1\}^{s_i}} I_{r^{(j,i)}}(z) \cdot C_{j,i}[z, x] \cdot V_i(x)\Big) \\ + \alpha_{i-1, i}' \cdot \sum_{z \in \{0,1\}^{s_{i-1,i}},\ x \in \{0,1\}^{s_i}} I_{r^{(i-1, i)'}}(z) \cdot C_{i-1, i}[z, x] \cdot V_i(x) \end{matrix}$
- given that $V_i(x)$ is common across all terms we can rewrite:
- $\sum_x V_i(x)g_i(x)$
- $g_i(x) = \begin{matrix} \Big(\sum_{j = 0}^{i - 1} \alpha_{j, i} \sum_{z \in \{0, 1\}^{s_{j,i}}} I_{r^{(j,i)}}(z) \cdot C_{j,i}[z, x]\Big) \\ + \alpha_{i-1, i}' \cdot \sum_{z \in \{0,1\}^{s_{i-1,i}}} I_{r^{(i-1, i)'}}(z) \cdot C_{i-1, i}[z, x] \end{matrix}$
- computing the $g_i(x)$ bookkeeping table (a sketch follows this list)
- initialize the table to all zeroes, indexed by x
- for each subset
- compute the basis vector indexed by z from $r^{(j,i)}$, call it $G_j$
- for every entry in $C_{j,i}[z, x]$ that is non-zero
- compute: $A_{g_i}[x] += \alpha_{j, i} \cdot G_j[z]$
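- a minimal sketch (Python, same helpers as above; the selector lists, points and $\alpha$ values are hypothetical inputs) of the $A_{g_i}$ bookkeeping table that batches all subset claims into one sumcheck over $V_i$.
```python
def g_table(selectors, r_points, alphas, num_x_vars):
    """selectors[j]: non-zero (z, x) entries of C_{j,i}.
    r_points[j]: evaluation point r^{(j,i)}. alphas[j]: batching scalar alpha_{j,i}."""
    A = [0] * (1 << num_x_vars)          # A_{g_i}, indexed by x
    for C_ji, r, alpha in zip(selectors, r_points, alphas):
        G = eq_table(r)                  # basis vector G_j indexed by z, from r^{(j,i)}
        for (z, x) in C_ji:              # non-zero entries of C_{j,i}
            A[x] = (A[x] + alpha * G[z]) % p   # A_{g_i}[x] += alpha_{j,i} * G_j[z]
    return A
```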
- [ ] clean up the combining multiple claims to 1, there is something deeper going on here, also look into jagged poly (feel they must be related).