paper: https://eprint.iacr.org/2020/1247.pdf
### [[arbitrary-arithmetic-circuits]]
- for [[layered-arithmetic-circuits]] each gate can only take its inputs from the layer directly below.
- for the purposes of [[virgo++]], an [[arbitrary-arithmetic-circuits]] is defined as follows
- the circuit can be seen as a directed acyclic graph
- with fan-in 2 nodes (meaning each gate takes a maximum of 2 inputs)
- the depth of the circuit $d$ is the length of the longest path in the graph
- input gates are at layer d, output gates are at layer 0
- each gate in the graph can be assigned a layer id via topological sorting
- we can determine a gate's layer via the following rule (see the sketch after this list): layer(g) = min(layer(u), layer(v)) - 1
- where $u$ and $v$ are the two inputs to gate $g$
- note: if $g$ is an input gate then its layer id is just $d$
- this leads to the conclusion that every gate at layer i must have an input from layer i + 1
- rough intuition
- inputs are fixed at layer d
- all other gates depend on the inputs or depend on things that have been derived from the input.
- if a gate depends only on the inputs then it sits at layer d - 1
- if it depends on something that itself depends on the inputs (one step removed) then layer d - 2
- the idea here is that the layers represent computational steps: a gate has to wait a layer for all its needed inputs to be ready, and only then can it perform its computation.
- hence the layer id can represent how long you have to wait for all your inputs to be ready!!
- topological sorting puts every gate in the exact layer they need to be, nothing more, nothing less (in terms of input ready wait time).
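- a minimal sketch (Python, hypothetical helper names, not from the paper) of the layer-assignment rule above: inputs are pinned at layer $d$ and every other gate takes $\min(layer(u), layer(v)) - 1$.
```python
# Sketch: assign layer ids to a fan-in-2 DAG circuit via memoized DFS
# (equivalent to processing gates in topological order).
def assign_layers(gates, inputs, d):
    """gates: dict gate_id -> (u, v), the two input wires of each gate.
    inputs: set of input gate ids. d: circuit depth (longest path length)."""
    layer = {g: d for g in inputs}          # input gates sit at layer d

    def layer_of(g):
        if g not in layer:                  # layer(g) = min(layer(u), layer(v)) - 1
            u, v = gates[g]
            layer[g] = min(layer_of(u), layer_of(v)) - 1
        return layer[g]

    for g in gates:
        layer_of(g)
    return layer

# toy circuit of depth 2: g2 = x0 * x1, g3 = g2 + x0
layers = assign_layers({"g2": ("x0", "x1"), "g3": ("g2", "x0")}, {"x0", "x1"}, d=2)
assert layers == {"x0": 2, "x1": 2, "g2": 1, "g3": 0}
```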
### generalizing [[gkr]] for [[arbitrary-arithmetic-circuits]] (naive)
- the idea is to perform statement reductions from the output layer to the input layer, where the verifier can then confirm the last reduced statement on the input.
- we do this for [[layered-arithmetic-circuits]] as follows
- given that layer i depends only on layer i + 1, we construct a relation that computes every value in layer i using only values from layer i + 1
- we then prove this relation via sumcheck, which leads to a reduced statement about layer i + 1
- we continue this process recursively until we get to a reduced statement about the input layer
- which the verifier can check directly. fin.
- the key thing here is that we have some relation that can produce all the elements at layer i, but doesn't make use of layer i itself.
- for the [[arbitrary-arithmetic-circuits]] layer i depends not only on layer i + 1 but potentially on all layers $j > i$
- so to generate some relation that produces all elements at layer i, we need to capture all layers $j > i$
- construction notes
- a given gate is either an addition gate or a multiplication gate
- the gate takes in two inputs
- one of those inputs must come from layer i + 1
- the other input can come from any layer $j > i$
- hence the relation just has to have all possible combination of layers
- $add_{i,j}(z, x, y) = 1$ if gate $z$ at layer $i - 1$ is an addition gate whose inputs are gate $x$ from layer $i$ and gate $y$ from layer $j$ (and 0 otherwise); $mul_{i,j}$ is defined analogously for multiplication gates.
- The relation is given as follows $$
\begin{aligned}
V_i(z) = \sum_{x, y \in \{0,1\}^s} \; & add_{i+1, i+1}(z, x, y)\big(V_{i+1}(x) + V_{i+1}(y)\big) + mul_{i+1,i+1}(z, x, y)\big(V_{i+1}(x) \cdot V_{i+1}(y)\big) \; + \\
& add_{i+1,i+2}(z, x, y)\big(V_{i+1}(x) + V_{i+2}(y)\big) + mul_{i+1,i+2}(z, x, y)\big(V_{i+1}(x) \cdot V_{i+2}(y)\big) + \dots + \\
& add_{i+1,d}(z, x, y)\big(V_{i+1}(x) + V_d(y)\big) + mul_{i+1,d}(z, x, y)\big(V_{i+1}(x) \cdot V_d(y)\big)
\end{aligned}
$$
- running sumcheck on the above relation will lead to claims for every $V_j$ where $j > i$
- hence once we reach some layer $V_i$ to reduce, we'd have accumulated multiple claims for that layer.
### problems with naively generalizing [[gkr]] for [[arbitrary-arithmetic-circuits]]
- assuming that all layers have S gates
- the polynomial size for a given layer in the generalized relation is O((d - i)S)
- and the sumcheck cost is linear in polynomial size
- hence the total cost of running sumcheck for all layers is given as follows
- $O(dS + (d - 1)S + ... + S) = O(d^2S) = O(d|C|)$
- recall S is the total number of gates in a layer if we have d layers then $dS = |C|$
- we need to somehow reduce the size of the polynomial at each layer; since sumcheck cost is linear in polynomial size, this reduces the cost of the protocol.
### generalizing [[gkr]] for [[arbitrary-arithmetic-circuits]] (optimal)
- assuming layer i has S gates and each gate has fan-in 2, layer i is defined by at most 2S gate values from later layers.
- below is a relation that can still allow you compute the correct $V_i(z)$ at some point z based on layers $j > i$, but with cheaper sumcheck costs.
- rather than considering all gates at later layers as inputs when evaluating a layer, we only consider the subset of gates that actually have wires into the target layer.
- we define $V_{i,j}$ as the subset of gates from layer $V_j$ that have wirings to gates in $V_i$
- this will affect the way the $add_i$ and $mul_i$ equations are constructed.
- the relation is given as follows
- $$
\begin{aligned}
V_i(z) = \sum_{x \in \{0, 1\}^{s_{i,i+1}}} \Big( & \sum_{y \in \{0, 1\}^{s_{i,i+1}}} add_{i+1, i+1}(z, x, y)\big(V_{i,i+1}(x) + V_{i,i+1}(y)\big) + \dots + \sum_{y \in \{0,1\}^{s_{i, d}}} add_{i+1, d}(z, x, y)\big(V_{i,i+1}(x) + V_{i,d}(y)\big) \\
+ & \sum_{y \in \{0, 1\}^{s_{i,i+1}}} mul_{i+1, i+1}(z, x, y)\big(V_{i,i+1}(x) \cdot V_{i,i+1}(y)\big) + \dots + \sum_{y \in \{0,1\}^{s_{i, d}}} mul_{i+1, d}(z, x, y)\big(V_{i,i+1}(x) \cdot V_{i,d}(y)\big)\Big)
\end{aligned}
$$
- ![[Pasted image 20250528155932.png]]
- to prove the [[arbitrary-arithmetic-circuits]]
- we'd need to prove each relation above, which will produce claims about the subset polynomials $V_{i,j}$ (a sketch of constructing these subsets follows this list).
- the claimed values are given as hints by the prover; when the corresponding layer's turn comes, the verifier will need to verify the accumulated subset claims,
- reducing them to one claim about the full layer polynomial, which can then be proved via the standard virgo process.
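- a minimal sketch (Python, assumed/toy data layout, not the paper's implementation) of building a subset $V_{i,j}$: keep only the layer-$j$ gate values that are wired into some gate at layer $i$, in a fixed order.
```python
# Sketch: extract the subset V_{i,j} of layer-j values that feed layer i.
def build_subset(layer_values, wires, i, j):
    """layer_values[j]: list of gate values at layer j.
    wires: list of (out_layer, out_idx, in_layer, in_idx) tuples."""
    used = sorted({in_idx for (out_l, _, in_l, in_idx) in wires
                   if out_l == i and in_l == j})
    # V_{i,j} keeps only the used layer-j values, in a fixed (sorted) order
    return [layer_values[j][k] for k in used], used

values = {2: [3, 5, 7], 1: [15, 8]}        # toy values per layer
wires = [(0, 0, 1, 0), (0, 0, 2, 2)]       # gate 0 at layer 0 reads V_1[0] and V_2[2]
subset, indices = build_subset(values, wires, 0, 2)
assert subset == [7] and indices == [2]    # V_{0,2} contains only V_2[2]
```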
### performing sumcheck on the relation
- phase 1
- we recall libra phase 1
- we are trying to prove a relation of the following form
- $H(a) = \sum_{b, c} f(a, b, c) \cdot v(b) \cdot v(c)$
- the goal is to not materialize $f(a, b, c)$ as it is sparse and we should be able to take advantage of that sparsity.
- to achieve efficiency we compute the sumcheck messages in phases, first for the b variables (phase 1) and then for c variables (phase 2).
- we perform the following rewrite
- $H(a) = \sum_b v(b) \cdot h_a(b)$
- where $h_a(b) = \sum_c f(a, b, c) \cdot v(c)$
- we turn $f(a, b, c)$ into a vector over the [[boolean-hypercube]] as follows
- $h_a(b) = \sum_{z, c} I(a, z) \cdot f[z, b, c] \cdot v(c)$
- constructing $h_a(b)$
- first we initialize a vector $A_{h_a}$ of $2^{|b|}$ zeroes
- we iterate over the non-zero entries of the sparse vector $f[z, b, c]$, and for each entry perform the following
- $A_{h_a}[b] += I_a[z] \cdot v[c]$
- once we have the table $A_{h_a}$ and the table for $v(b)$ we just apply [[sumcheck-over-products]] (a sketch follows this list)
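- a minimal sketch (Python, toy prime field, assuming the non-zero wiring entries equal 1) of the phase-1 bookkeeping loop above:
```python
p = 2**61 - 1  # toy prime modulus for field arithmetic

def eq_table(r):
    """All evaluations of eq(r, z) for z over the boolean hypercube
    (little-endian: bit 0 of the index corresponds to r[0])."""
    table = [1]
    for ri in r:
        table = [t * (1 - ri) % p for t in table] + [t * ri % p for t in table]
    return table

def phase1_table(sparse_f, a, v_c, num_b_vars):
    """sparse_f: (z, b, c) indices with f[z, b, c] = 1 (wiring predicate).
    a: challenge point for the z variables. v_c: table of v over the c hypercube."""
    I_a = eq_table(a)                        # bookkeeping table for I(a, z)
    A = [0] * (1 << num_b_vars)              # A_{h_a}, initialised to zeroes
    for (z, b, c) in sparse_f:
        A[b] = (A[b] + I_a[z] * v_c[c]) % p  # A_{h_a}[b] += I_a[z] * v[c]
    return A

assert eq_table([0]) == [1, 0]               # a = 0 selects z = 0
```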
- effect of the c variable padding for phase 1
- what if instead we wanted to run phase 1 sumcheck on the following relation
- $H(a) = \sum_{b, c} c_{k+1} \cdot ... \cdot c_l \cdot f(a, b, c_0, ... c_k) \cdot v(b) \cdot v(c_0, ..., c_k)$
- applying the rewrite process
- $H(a) = \sum_b v(b) \cdot h_a'(b)$
- $h_a'(b) = \sum_c c_{k+1} \cdot ... \cdot c_l \cdot f(a, b, c_0, ..., c_k) \cdot v(c_0, ..., c_k)$
- we showed in [[variable-padding-for-sumcheck]] that variable padded sumcheck has the same claimed sum.
- hence $h_a(b) == h'_a(b)$
- which means that, as far as the construction of $A_{h_a}$ is concerned, variable padding doesn't matter.
- how do we determine the size of the $A_{h_a}$ table
- it depends entirely on the number of variables of the first-phase variable ($b$ above, $x$ in the [[virgo++]] relation), since that is what the table is indexed by
- in [[virgo++]] the number of $x$ variables is the same for every term in the layer equation sum i.e. the number of variables of $V_{i+1}$
- to fully solve phase 1 for [[virgo++]] we need to know the effect of sum of sums on table generation.
- we look at the following relation
- $\begin{matrix} H(a) = \sum_{b,c} f_1(a, b, c) \cdot v_1(b) \cdot v_2(c) + f_2(a, b, c) \cdot v_1(b) \cdot v_3(c) + \\ ... + f_n(a, b, c) \cdot v_1(b) \cdot v_n(c) \end{matrix}$
- where $f_1(..), f_2(..), ..., f_n(..)$ are sparse
- given that $v_1(b)$ is common across all we can apply the following rewrite
- $H(a) = v_1(b) \cdot h_a(b)$
- where $h_a(b) = \sum_c f_1(a, b, c) \cdot v_2(c) + f_2(a, b, c) \cdot v_3(c) + ... + f_n(a, b, c) \cdot v_n(c)$
- we can turn the sparse polynomials into vectors over the [[boolean-hypercube]] by extracting $I(a, z)$
- $\begin{matrix} h_a(b) = \sum_c I(a, z) \cdot f_1[z, b, c] \cdot v_2(c) + I(a, z) \cdot f_2[z, b, c] \cdot v_3(c) \\ + ... + I(a, z) \cdot f_n[z, b, c] \cdot v_n(c) \end{matrix}$
- an obvious solution is to process the pairs one after the other, as follows (a sketch follows this list)
- first we set up the $A_{h_a}$ table with all zeroes
- we also initialize the $I_a(z)$ table
- then we start with the first pair $(f_1[z, b, c], v_2(c))$
- applying the following routine $A_{h_a}[b] += I_a[z] \cdot v_2(c)$
- repeat for all relevant pairs
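- a minimal sketch (Python, reusing the hypothetical `eq_table` helper and modulus `p` from the phase-1 sketch above) of the pair-by-pair accumulation: all terms share one $A_{h_a}$ table.
```python
def phase1_table_multi(sparse_fs, a, v_tables, num_b_vars):
    """sparse_fs[i]: non-zero (z, b, c) indices of f_i.
    v_tables[i]: hypercube table of the v polynomial paired with f_i."""
    I_a = eq_table(a)                        # shared I_a(z) table
    A = [0] * (1 << num_b_vars)              # one A_{h_a} table for the whole sum
    for sparse_f, v_c in zip(sparse_fs, v_tables):
        for (z, b, c) in sparse_f:           # accumulate pair after pair
            A[b] = (A[b] + I_a[z] * v_c[c]) % p
    return A
```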
- phase 2
- in phase 2 we start to see the effect of the variable padding, we showed in [[variable-padding-for-sumcheck]] how to perform sumcheck on a variable padded polynomial.
- we also know that you can perform sumcheck on sums of polynomials by just adding the round messages together and applying the same challenges to each polynomial.
- this also works for sums of products of polynomials: we just treat each product as a single polynomial (via abstraction), which reduces to the previous case.
- the only thing left for phase 2 is figuring out how to build the bookkeeping tables.
- $\sum_{i = 0}^m \sum_c c_{k+1} \cdot ... \cdot c_l \cdot f_i(a, u, c_0, ..., c_k) \cdot t(u) \cdot v_i(c_0, ..., c_k)$
- where $t(u) = V_{i + 1}(u)$ is a scalar, since $u$ is the challenge fixed in phase 1
- since we are performing [[variable-padding-for-sumcheck]] we can ignore the padded variables
- $\sum_{i = 0}^m \sum_c f_i(a, u, c_0, ..., c_k) \cdot t(u) \cdot v_i(c_0, ..., c_k)$
- we can take advantage of sparsity
- $\sum_{i = 0}^m \sum_{z, b, c} I_a(z) \cdot I_u(b) \cdot f_i[z, b, c_0, ..., c_k] \cdot t(u) \cdot v_i(c_0, ..., c_k)$
- we already have the table representing $v_i(..)$
- we can construct $A_{f_i}$ as follows (a sketch follows this list):
- where $A_{f_i} [c_0, ..., c_k] += I_a[z] \cdot I_u[b] \cdot t(u)$
- for all non-zero entries in $f$
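- a minimal sketch (Python, same assumptions and helpers as the phase-1 sketches) of the phase-2 bookkeeping table: one $A_{f_i}$ per sparse predicate, indexed by the unpadded $c$ variables.
```python
def phase2_table(sparse_f, a, u, t_u, num_c_vars):
    """sparse_f: non-zero (z, b, c) indices of f_i (c over the unpadded c_0..c_k).
    a: the original claim point, u: the phase-1 challenge, t_u: the scalar t(u)."""
    I_a, I_u = eq_table(a), eq_table(u)
    A = [0] * (1 << num_c_vars)              # A_{f_i}, indexed by c_0..c_k
    for (z, b, c) in sparse_f:
        A[c] = (A[c] + I_a[z] * I_u[b] * t_u) % p   # A_{f_i}[c] += I_a[z] * I_u[b] * t(u)
    return A
```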
### combining multiple claims into 1
- when it is time to prove any layer (except the output layer), we'd already have a set of claims about the layer or about subsets of the layer.
- specifically if we try to prove layer i, we have claims from $V_{0, i}, V_{1, i}, ..., V_{i-1,i}, V_{i-1,i}'$
- we have two claims for $V_{i-1,i}$
- the objective is to prove that all claims are valid.
- first we focus on proving a single subset claim.
- we have the following claim $V_{j, i}(r^{(j,i)}) = H_{j,i}$ where $V_{j,i} \subseteq V_i$
- we need to prove two things
1. $V_{j,i}$ was constructed correctly from $V_i$ based on the circuit description
2. Evaluating this subset $V_{j,i}$ at $r^{(j,i)}$ gives the target value $H_{j,i}$
- to extract the subset, we can construct a function $C_{j,i} : \{0,1\}^{s_{j,i}} \times \{0,1\}^{s_i} \rightarrow \{0,1\}$
- where z indexes the subset and x indexes the main polynomial
- $C_{j,i}(z, x) = 1$ if $V_{j,i}[z] == V_i[x]$ i.e. they represent the same gate
- $C_{j,i}(z, x) = 0$ otherwise
- we can compute the sum of subset values as follows
- $\sum_{z \in \{0,1\}^{s_{j,i}},\ x \in \{0,1\}^{s_i}} C_{j,i}[z, x] \cdot V_i(x)$
- rough intuition
- start with the first subset index, e.g. z = 0
- pair z = 0 with every entry in the main array $V_i(x)$
- exactly one x satisfies $C_{j,i}[0, x] = 1$, namely the position in the main array of subset entry 0
- allowing that $V_i(x)$ to contribute to the sum (a tiny numeric check follows this list).
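- a tiny numeric check (Python, toy values, not from the paper) of the selector sum: summing $C_{j,i}[z, x] \cdot V_i[x]$ over both indices recovers the sum of the subset entries.
```python
V_i = [10, 20, 30, 40]                  # full layer-i values
selector = {(0, 1): 1, (1, 3): 1}       # C_{j,i}: subset entry 0 is V_i[1], entry 1 is V_i[3]
subset = [20, 40]                       # V_{j,i}

total = sum(val * V_i[x] for (z, x), val in selector.items())
assert total == sum(subset) == 60
```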
- to fully prove the subset claim, we also have to prove correct evaluation at $r^{(j,i)}$
- evaluation of a multilinear polynomial can be seen as a dot product of its hypercube values and the Lagrange basis evaluated at the point
- subset entries are already ordered by $z$ i.e. $z_0$ is the first entry, $z_1$ the second, ...
- as long as we multiply the right subset entry with the right basis element we should get the same evaluation
- this leads to the following function
- $H_{j, i} = \sum_{z \in \{0,1\}^{s_{j,i}},\ x \in \{0, 1\}^{s_i}} I_{r^{(j,i)}}(z) \cdot C_{j,i}[z, x] \cdot V_i(x)$
- where $I_{r^{(j,i)}}(z)$ represents the Lagrange basis derived from $r^{(j,i)}$
- rough intuition
- z tells us the current subset entry we are working with
- for that z we find the single x with $C_{j, i}[z, x] = 1$
- hence we contribute $I_r(z) \cdot V_i(x)$ to the sum
- as long as $I_r(z)$ produces the correct basis element, the above across all z will lead to the evaluation of $V_{j, i}$
- conclusion
- running sumcheck on:
- $H_{j, i} = \sum_{z \in \{0,1\}^{s_{j,i}},\ x \in \{0, 1\}^{s_i}} I_r(z) \cdot C_{j,i}[z, x] \cdot V_i(x)$
- is sufficient to prove the subset claim $H_{j,i} = V_{j, i}(r^{(j,i)})$
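- a tiny numeric check (Python, reusing the toy `V_i` / `selector` / `subset` values above and the `eq_table` helper and modulus `p` from the phase-1 sketch) that this relation reproduces the subset evaluation.
```python
r = [5]                                  # one z variable, since the subset has 2 entries
I_r = eq_table(r)                        # Lagrange basis over z derived from r
lhs = sum(I_r[z] * val * V_i[x] for (z, x), val in selector.items()) % p
rhs = sum(I_r[z] * subset[z] for z in range(len(subset))) % p   # direct MLE eval of V_{j,i} at r
assert lhs == rhs
```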
- next we need to handle proving multiple claims
- we can do this via [[batch-sumcheck]]
- we sample a random $\alpha$ for every claim, take a random linear combination and prove the combined claim with a single sumcheck.
- as follows:
- sample $\alpha_{0,i}, \alpha_{1,i}, ..., \alpha_{i-1, i}, \alpha_{i-1, i}'$
- compute the new claimed sum as follows:
- $H = \alpha_{0, i} \cdot V_{0, i}(r^{(0,i)}) + ... + \alpha_{i-1, i} \cdot V_{i-1, i}(r^{(i-1, i)}) + \alpha_{i-1, i}' \cdot V_{i-1, i}(r^{(i-1, i)'})$
- the sumcheck body:
- $\begin{matrix} \Big(\sum_{j = 0}^{i - 1} \alpha_{j, i} \sum_{z \in \{0,1\}^{s_{j,i}},\ x \in \{0, 1\}^{s_i}} I_{r^{(j,i)}}(z) \cdot C_{j,i}[z, x] \cdot V_i(x)\Big) \\ + \alpha_{i-1, i}' \cdot \sum_{z \in \{0,1\}^{s_{i-1,i}},\ x \in \{0,1\}^{s_i}} I_{r^{(i-1, i)'}}(z) \cdot C_{i-1, i}[z, x] \cdot V_i(x) \end{matrix}$
- given that $V_i(x)$ is common across all terms we can rewrite:
- $\sum_x V_i(x)g_i(x)$
- $g_i(x) = \begin{matrix} \Big(\sum_{j = 0}^{i - 1} \alpha_{j, i} \sum_{z \in \{0, 1\}^{s_{j,i}}} I_{r^{(j,i)}}(z) \cdot C_{j,i}[z, x]\Big) \\ + \alpha_{i-1, i}' \cdot \sum_{z \in \{0,1\}^{s_{i-1,i}}} I_{r^{(i-1, i)'}}(z) \cdot C_{i-1, i}[z, x] \end{matrix}$
- computing the $g_i(x)$ bookkeeping table (a sketch follows this list)
- initialize the table to all zeroes, indexed by x
- for each subset
- compute the basis vector indexed by z from $r^{(j,i)}$, call it $G_j$
- for every entry in $C_{j,i}[z, x]$ that is non-zero
- compute: $A_{g_i}[x] += \alpha_{j, i} \cdot G_j[z]$
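- a minimal sketch (Python, same helpers as above; the selector lists, points and $\alpha$ values are hypothetical inputs) of the $A_{g_i}$ bookkeeping table that batches all subset claims into one sumcheck over $V_i$.
```python
def g_table(selectors, r_points, alphas, num_x_vars):
    """selectors[j]: non-zero (z, x) entries of C_{j,i}.
    r_points[j]: evaluation point r^{(j,i)}. alphas[j]: batching scalar alpha_{j,i}."""
    A = [0] * (1 << num_x_vars)          # A_{g_i}, indexed by x
    for C_ji, r, alpha in zip(selectors, r_points, alphas):
        G = eq_table(r)                  # basis vector G_j indexed by z, from r^{(j,i)}
        for (z, x) in C_ji:              # non-zero entries of C_{j,i}
            A[x] = (A[x] + alpha * G[z]) % p   # A_{g_i}[x] += alpha_{j,i} * G_j[z]
    return A
```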
- [ ] clean up the combining multiple claims to 1, there is something deeper going on here, also look into jagged poly (feel they must be related).