Marlin + Plookup (High-Level Summary)

# Marlin + Plookup (High-Level Summary) ###### tags: `Crypto` `ZK` We apply the Marlin SUMCHECK-based indexing argument together with a novel JITed matrix approach allowing for plookup to be performed easily. To achieve JIT of Marlin, we do the following. ### High-Level Summary #### Marlin JIT indexer Let $K$ be the subgroup over which one indexes the non-zero values of the matrix $M$. Let $\bar{K}$ be the indexes not involved in JITed values. Let $K_0,...,K_i,..,K_N$ be the sets of index values for $M$ involved in JITed values $x_i$. Let them be pairwise disjoint. Let $K = \bar{K} \cup \bigcup_i K_i$. Let $m(\kappa) = u_H(X, \mathsf{row}(\kappa))u_H(Y, \mathsf{col}(\kappa))\mathsf{val}(\kappa)$ Define. $M(X, Y) = \sum_{\kappa \in \bar{K}} m(\kappa) + \sum_{i=0..N} x_i \sum_{\kappa\in K_i} m(\kappa)$ Then, when doing the SUMCHECK protocol to find $f_3$ and $g_3, \sigma_3$, both the prover and validator create the following low-degree preprocessed polynomials in place of $\hat{\mathsf{val}}$: \begin{align} \bar{ \mathsf{val}}(\kappa)= \begin{cases} \mathsf{val}(\kappa) & \kappa \in \bar{K}\\ 0, & \kappa \in K\setminus\bar{K} \end{cases} \end{align} \begin{align} \mathsf{val}_i(\kappa)= \begin{cases} \mathsf{val}(\kappa) & \kappa \in K_i\\ 0, & \kappa \in K\setminus K_i \end{cases} \end{align} They are preprocessed polynomials and only contribute to the proving key size, but not the proof size. Then, following the Marlin linear relation protocol (see: [Marlin](https://eprint.iacr.org/2019/1047.pdf) pg. 24.) ceteris paribus, one checks the following polynomial identity, with the JIT matrix $\mathsf{Instance}$ values $x_i$: $v_H(β_2)v_H(β_1)(\bar{\mathsf{val}}(X) + \sum_i x_i \mathsf{val}_i(X)) − (β_2 − \mathsf{row} (X))(β_1 − \mathsf{col}(X))(Xg_3(X) + σ_3/|K|) = h_3(X)v_K(X)$ #### Plookup Column Aggregation Via the above method of inserting JITed values into a matrix, one can efficiently check the following statement necessary to encode plookup vector lookups and multiple tables (see: [plookup](https://eprint.iacr.org/2020/315.pdf), pg. 7): Over the evaluation domain $L\subset H$, check that $M\cdot z \stackrel{?}{=} z^{out}$, where the rows of the matrix $M: \mathbb{F}^{|H|} \rightarrow \mathbb{F}^{|L|}$ encode the following linear relations: $\mathsf{tableindex} + \sum_{i = 1..r} \gamma^i z^{in}_i \stackrel{?}{=} z^{out}$ where $z^{out}$ is our plookup aggregated table value witness. Whereas we have checked via JIT Marlin that $z_M = M\cdot z$ over $H$, we now check $z_M(X) - z^{out}(X) = g(X)v_L(X)$ for some low-degree $g(X)$, where $v_L(X)$ is the vanishing polynomial for $L$. Then we are done. This allows us to perform the randomised column aggregation step of plookup, generating the aggregate $z^{out}$. There are two ways to construct $z^{out}$: 1. One aligns $z^{out}$ in the witness values to some subgroup $L$ of $H$. For instance, in a pow 2 subgroup $H$, $z^{out}$ are the evaluations of $z(X)$ in $L\equiv 2^kH$ for some $k$. More precisely, we use $A$ to encode $T$ at the strides of $L$, and set the rows of $B, C$ to be $\vec{0}$ at every row $i$ s.t. where $g$ generates $H$, $g^i \in L$, we can obtain our plookup commitment as $z_A(X) |_{L \subset H}$. 2. One commits to a second witness polynomial $z^{out}$. The latter is simpler but increases the proof size by 1 $\mathbb{G}$. The size of this subgroup $L$ represents the least power of 2 bounding the total number of plookups in the circuit. #### Running the Plookup Protocol Then, one can run the plookup protocol over the subgroup $L$, to show that $\mathsf{tableindex}(X) + \sum_{i = 1..r} \gamma^i z^{in}_i(X) \ \bar{\in} \ t_0 + \sum_{i = 1..r} \gamma^i t_i$ Here, $f \bar{\in} g$ means $\forall l_1 \in L. \exists l_2 \in L. f(l_1) = g(l_2)$ Since $\gamma$ is derived via the Fiat-Shamir heuristic, this is true $\iff$ $\exists$ a lookup mapping $p: \mathbb{F}_{|L|} \rightarrow \mathbb{F}_{|L|}$ such that $\forall j$ $(\mathsf{tableindex}(\mathbf{g}^j), z^{in}_1(\mathbf{g}^j), ..., z^{in}_r(\mathbf{g}^j)) = (t_0(\mathbf{g}^{\mathsf{p}(j)}), t_1(\mathbf{g}^{\mathsf{p}(j)}), ..., t_r(\mathbf{g}^{\mathsf{p}(j)}))$ This can be performed efficiently by generating the sorted version of $z_s(X)$ of $z^{out}$ and using a grand product argument on them combined with extra randomness. The maximum capacity (in rows) of all the tables combined is $|L|$ and the maximum number of columns in a table is $r$. Typically, one expects $r \in \{2, 3\}$ representing unary and binary operation lookups. --- <p style="text-align: center">End of High-Level Summary</p>p> ---