Understand PLONK: Plonkish Arithmetization

# Understand PLONK: Plonkish Arithmetization This is a key component of the PLONK protocol, before diving in, let's explore the term "Arithmetization" in the context of Zero Knowledge; Arithmetization is the process of converting computational statements into mathematical equations, specifically polynomial equations, that can be efficiently proven and verified. It's like translating a computer program into a mathematical language that zero-knowledge proofs can understand. Here is the basic building blocks of the concept of Arithmetization; - Variables: Represent values in computation - Constraints: Mathematical relationships between variables - Polynomials: Used to encode relationships and checks - Fields: Mathematical structures where computation occurs Over the years many form of Arithmetization has been laid out, let's explore a few; 1. Rank-1 Constrant System (R1CS) - Used in protocols like Groth16 - Represents computation as: $A⋅x ∘ B⋅x = C⋅x$ - Where: * $A, B, C$ are matrices * $x$ is the witness vector * $∘$ represents element-wise multiplication 2. Plonkish Arithmetization - Used in PLONK and its variants - Gate format: $q_L⋅a + q_R⋅b + q_O⋅c + q_M⋅(a⋅b) + q_C = 0$ - Where: * $a, b, c$ are witness values * $q_L, q_R, q_O, q_M, q_C$ are selector polynomials, indicating the gate operations 3. Algebraic Intermediate Representation - Used in STARKs - Represents computation as polynomial constraints over execution trace - Constraints form: $f(x_1, x_2, ..., x_n) = 0$ **Common Steps in Arithmetization** Step 1: Statement → Circuit - Convert computation into logical/arithmetic circuit - Identify inputs, outputs, and intermediate values Step 2: Circuit → Constraints - Transform circuit gates into mathematical constraints - Ensure constraints capture computational logic Step 3: Constraints → Polynomials - Convert constraints into polynomial equations - Ensure polynomial degrees are manageable Step 4: Optimization - Reduce number of constraints - Minimize polynomial degrees - Optimize for proof size/verification time ## Back to PLONK Arithmetization Say we are given this computation; $com = xy(p+q)$ it circuit execution trace would be as follows; $z = x * y$ $r = p + q$ $out = z * r$ <a href="https://imgbb.com/"><img src="https://i.ibb.co/khxZxwR/plonk-cir.png" alt="plonk-cir" border="0"></a> when the circuit is being excecuted all this variables would be taking a concrate vaule, forming the execution trace table with which proof would would be generated on; This is how the circuit would look like when $x = 2$, $y = 3$, $p = 4$, and $q = 5$. | A | B | C | |-----|----|-----| | 2| 3| 6| | 4| 5| 9| | 6| 9| 54| uisng this execution trace, we can now introduce the constrant gate, encoding the gate operation and ensuring correct execution. This would introduce the vanilla PLONK gate equation. $q_L⋅a + q_R⋅b + q_O⋅c + q_M⋅(a⋅b) + q_C = 0$ where the $a, b,$ and $c$ are the left, right, and output wires of the gate. To say $a+b=c$, we let $q_L$ and $q_R$ equal $1$, while $q_O$ equals $-1$. The other coefficients wouldn't be used, so we set $q_M$ and $q_C$ to $0$. To say $a×b = c$, we set $q_O$ equal to $-1$, set $q_M$ equal $1$, and set the rest of the coefficients to $0$. To bind a variable to a public value, we let $q_R, q_O, q_M$ equal to $0$, $q_L$ equal to 1, and $q_C$ equal to the public value. so for the 3 gates we are working with the gate equations would look like; gate 1: $0⋅2 + 0⋅3 + -1⋅6 + 1⋅(2⋅3) + 0 = 0$; this equation holds. gate 2: $1⋅4 + 1⋅5 + -1⋅9 + 0⋅(4⋅5) + 0 = 0$; this equation holds. gate 3: $0⋅6 + 0⋅9 + -1⋅54 + 1⋅(6⋅9) + 0 = 0$; this equation holds. next is the collect all this vectors so we can build just one relationship this protocol can progress with; $q_L = (0,1,1)$ $q_R = (0,1,0)$ $q_O = (-1,-1,-1)$ $q_M = (1,0,1)$ $q_C = (0,0,0)$ $a = (2,4,6)$ $b = (3,5,9)$ $c = (6,9,54)$ Next up is to interpolate these vectors to obtain a polynomial, the nature of this interpolation is quite unique, this interpolation would be done on the `Domain` of the roots of unity. But out example would encounter a problem seeing that the size if the vector is $3$ which is not a power of $2$ which is needed to manipulate/obtain a multiplicative sub-group over the `Domain` root of unity which would be useful in the protocol as we progress. To solve out little problem would extend the vectors with `Zero` before carrying out the interpolation over the root of unity $\omega$ when $n$ is 4; This is the resulting vectors after the extension; $q_L = (0,1,1,0)$ $q_R = (0,1,0,0)$ $q_O = (-1,-1,-1,0)$ $q_M = (1,0,1,0)$ $q_C = (0,0,0,0)$ $a = (2,4,6,0)$ $b = (3,5,9,0)$ $c = (6,9,54,0)$ Using this technique, we can successfully prove and verify this circuit execution trace with the help of these 8 polynomials no matter how big the circuit is (Number of constraints). But that is one more thing! we have only been able to mathematically prove that these gates are present in a circuit but we have not established how these gates are wired to other gates and signals in the circuit. This introduces us to the term "Copy Constraints". ## Copy Constraints Copy constraints are a fundamental mechanism in the PLONK protocol that ensure certain wire values are equal across different parts of a circuit. They solve the critical problem of enforcing value consistency without revealing the actual values. Key Characteristics: 1. Verify that specific wire values are identical 2. Maintain zero-knowledge properties 3. Efficiently check value equality across circuit > We need to check that the wire vaules are copied correctly, this can be checked by making use of the permutation argument Bayer, Groth12. Looking into this slide; <a href="https://ibb.co/5YKJ4FK"><img src="https://i.ibb.co/jVJjfRJ/pa1.png" alt="pa1" border="0"></a> <a href="https://ibb.co/TtGTgGw"><img src="https://i.ibb.co/3cbsCbT/pa2.png" alt="pa2" border="0"></a> <a href="https://ibb.co/Rj5mn72"><img src="https://i.ibb.co/DrqyXL4/pa3.png" alt="pa3" border="0"></a> <a href="https://ibb.co/sQMzDGf"><img src="https://i.ibb.co/nf5tVvG/pa4.png" alt="pa4" border="0"></a> Now we've got a good understanding on how permuatation check is designed, let's explore it taliored to the PLONK protocol; The best method for explaining this concept I found was documented by the Lambdaclass team, I would be explaining this PLONKISH wire relationship using this concept also. We would be representing the circuit structure and the witnesses in the form of a matrix. $q_L = (0,1,1,0)$ $q_R = (0,1,0,0)$ $q_M = (1,0,1,0)$ $q_O = (-1,-1,-1,0)$ $q_C = (0,0,0,0)$ Imtroducing the $Q$ matix | $Q_L$ | $Q_R$ | $Q_M$ | $Q_O$ | $Q_C$ | |-------|-------|--------|-------|-------| | 0 | 0 | 1 | -1 | 0 | | 1 | 1 | 0 | -1 | 0 | | 1 | 0 | 1 | -1 | 0 | | 0 | 0 | 0 | -1 | 0 | **The $V$ Matix** The claim in the previous section is not an "if and only if" statement because the following trace columns do satisfy the equations but do not correspond to a valid execution: | A | B | C | |-----|----|-----| | 2| 3| 6| | 4| 5| 9| | 6| 9| 54| The $V$ matrix encodes the carry of the results from one gate to the right or left operand of a subsequent one. These are called wirings. Like the $Q$ matrix, it's independent of the individual evaluation. It consists of indices for all input and intermediate variables. In this case, that matrix is: | L | R | 0 | |-----|----|-----| | 1| 2| 0| | 4| 5| 3| | 0| 3| 6| Here $0$ is the index of $z$, $1$ is the index of $x$, $2$ is the index of $y$, $3$ is the index of $v$, $4$ is the index of $p$, $5$ is the index of $q$, and $6$ is the index of the output $out$. Now we can update the claim to have an "if and only if" statement. **New PLONK Claim**: Let $T$ be a matrix with columns $A$, $B$, $C$. It corresponds to a proper evaluation of the circuit if and only if; 1. for all $i$ the following equality holds $Q_{Li}⋅A_i + Q_{Ri}⋅B_i + Q_{i}⋅C_i + Q_{Mi}⋅(A_i⋅B_i) + Q_{Ci} = 0$ 2. for all $i,j,k,l$ such that $V_{i,j} = V_{k,l}$ we have $T_{i,j} = T_{k,l}$ With this we can successfully encode the circiut excecution and enforce it wiring, just what we need for the PLONK protocol. In Zero Knoweldge proof protocols, we almost alway operate on polynomials and we would like to do the same here, we need to convert these `Claim` equations to polynomials. **_Starting with Claim 1_**; for all $i$ the following equality holds $Q_{Li}⋅A_i + Q_{Ri}⋅B_i + Q_{i}⋅C_i + Q_{Mi}⋅(A_i⋅B_i) + Q_{Ci} = 0$ Let $ω$ be a primitive $N-th$ root of unity and let $H=ω^i:0≤i<N$. Let $a,b,c,q_L,q_R,q_M,q_O,q_C$ be the polynomials of degree at most $N$ that interpolate the columns $A,B,C,Q_L,Q_R,Q_M,Q_O,Q_C$ at the domain $H$. This means for example that $a(ω^i) = A_i$ for all $i$, and similarly for all the other columns. With this, `Claim` $1$ can be expressed as follows; $a(x)q_L(x) + b(x)q_R(x) + a(x)b(x)q_M(x) + c(x)q_O(x) + q_C(x) = 0$ for all $x$ in domain $H$. Interestingly, in the world of polynomials, this is also equivalent to; $aq_L + bq_R + abq_M + cq_O + q_C = z_{H^t}$, where $z_H$ is the polynomial $X^N - 1$. **_Transforming Claim 2_**; To perfrom this transformation, recall our discussion on Permutation check. A permutation is a rearrangement of a set, usually denoted $\sigma$. For finite sets, it is a map from a set to itself that takes all values. In our case, the set will be the set of all pairs; $I = (i,j):$ such that $0 ≤ i < N$, and $0 ≤ j < 3$ The matrix $V$ induces a permutation of this set where $σ((ij))$ is equal to the indices of the next occurrence of the value at position $(i,j)$. If you are already at the last occurrence, go to the first one. By next, we mean the following occurrence, as if the columns were stacked on each other. Let's see how this works in the example circuit. Recall $V$ is; | L | R | 0 | |-----|----|-----| | 1| 2| 0| | 4| 5| 3| | 0| 3| 6| The permutation expression for this case can be represented as such; - $\sigma((0,0)) = (0,0)$ - $\sigma((0,1)) = (0,1)$ - $\sigma((0,2)) = (2,0)$ - $\sigma((1,0)) = (1,0)$ - $\sigma((1,1)) = (1,1)$ - $\sigma((1,2)) = (2,1)$ - $\sigma((2,0)) = (0,2)$ - $\sigma((2,1)) = (1,2)$ - $\sigma((2,2)) = (2,2)$ It can be easily observed `Check` 2 can be expressed as follows; for all $(i,j) \in I$; $T_{i,j} = T_{\sigma((i,j))}$ This mathematically implies that set $A$ and $B$ are equal given that; $A = \{((i,j), T_{i,j}): (i,j) \in I\}$ $B = \{(\sigma((i,j)), T_{i,j}): (i,j) \in I\}$ This is great, but the problem we had initally was that we would like to represent `Check` 2 problem as a Polynomial Identity Representation which would be more convenient for this protocol, this problem would introduce the concept of **Equality of Sets** First explore [this](https://hackmd.io/@0xdeveloperuche/HJX8bvAz1x), a deep dive into multiset equality check, this would give a very good overview on this concept before diving into what this has to do with our polynomail identity problem. ### Equality of sets Problem: How to check if two sets $A = \{a_0, a_1, \ldots, a_{k-1}\}$ and $B = \{b_0, b_1, \ldots, b_{k-1}\}$ of field elements are equal? Naive Attempt: Multiplication You might think to multiply all elements of A and B , then compare the results: $\prod_{i=0}^{k-1} a_i \quad \text{and} \quad \prod_{i=0}^{k-1} b_i$. If they match, $A$ and $B$ might be equal. However, this test fails in some cases. For instance: • $A = \{4, 15\}$ and $B = \{6, 10\}$ both yield $60$ as the product, but clearly $A \neq B$. Better Approach: **Polynomials** Instead of comparing products, we use polynomial roots to represent the sets. For $A = \{a_0, a_1, \ldots, a_{k-1}\}$ , construct the polynomial: $f_A(X) = (a_0 + X)(a_1 + X)\cdots(a_{k-1} + X)$. Do the same for $B$ : $f_B(X) = (b_0 + X)(b_1 + X)\cdots(b_{k-1} + X)$. These polynomials encode the sets $A$ and $B$ . The sets $A$ and $B$ are equal if and only if: $f_A(X)$ = $f_B(X)$, which holds because of the unique factorization theorem in polynomial algebra. Checking Equality Efficiently Instead of comparing the polynomials symbolically, we use a random evaluation strategy. Pick a random field element $\gamma$ and evaluate: $f_A(\gamma) \quad \text{and} \quad f_B(\gamma)$. If $f_A(\gamma)$ = $f_B(\gamma)$, then with high probability, $A = B$. This relies on the Schwartz-Zippel Lemma, which states that if two distinct polynomials are evaluated at a random point, the chance of them being equal is very small. **Generalization: Using Roots of Unity** When the sets $A$ and $B$ are large, we can optimize further using a special domain $H$ based on roots of unity. Define: • $\omega$ : A primitive $k -th$ root of unity. • $H = \{1, \omega, \omega^2, \ldots, \omega^{k-1}\}$ : The evaluation points. For $A$, construct the polynomial $f$ that interpolates the values: $\{a_0 + \gamma, a_1 + \gamma, \ldots, a_{k-1} + \gamma\}$ at points in $H$ . Similarly, construct $g$ for $B$ . The sets $A$ and $B$ are equal if: $f(h) = g(h) \quad \forall h \in H$. Polynomial $Z(X)$ To simplify, introduce a polynomial $Z(X)$ that relates $f$ and $g$ : $Z(X)f(X) = g(X)Z(\omega X)$, with $Z(1) = 1$. By evaluating $Z(X)$ recursively at $H$ , we conclude $A = B$ if: $\prod_{i=0}^{k-1}(a_i + \gamma) = \prod_{i=0}^{k-1}(b_i + \gamma)$. _Note: if you find $g(X)Z(\omega X)$ confusing, explore [this](https://hackmd.io/@0xdeveloperuche/HyxpMJa-Xyg)_ This method scales efficiently and leverages the structure of polynomials and roots of unity for large sets. Next thing to do now is to extend this expression in that is would not just accomodate an equality check for two set of field elements but rather to a set of tuples of field elements, with this we get much more close to the plonkish permutation argument. Lemma: Let $A = \{\bar{a}0, \ldots, \bar{a}{k-1}\}$ and $B = \{\bar{b}0, \ldots, \bar{b}{k-1}\}$ be sets of pairs of field elements, where each $\bar{a}i = (a{i,0}, a_{i,1})$ and similarly for $\bar{b}_i$. We aim to determine whether $A$ and $B$ are equal using polynomial techniques. 1. Mapping Tuples to Field Elements: Let $\beta, \gamma \in F$ be random field elements. Each pair $(a_{i,0}, a_{i,1})$ in A is mapped to a single field element: $a_{i,0} + a_{i,1}\beta + \gamma$, and similarly for $B$. 2. Interpolating Polynomials: Define $H = \{1, \omega, \omega^2, \ldots, \omega^{k-1}\}$, where $\omega$ is a $k -th$ root of unity. Construct polynomials $f(X)$ and $g(X)$ that interpolate the values: $\{a_{i,0} + a_{i,1}\beta + \gamma \mid i = 0, \ldots, k-1\}$, and $\{b_{i,0} + b_{i,1}\beta + \gamma \mid i = 0, \ldots, k-1\}$, respectively, over the evaluation points in $H$. 3. Equality Verification: To verify that $A$ and $B$ are equal, we check for the existence of a polynomial $Z(X)$ satisfying: $Z(\omega^0) = 1$, and $Z(X)f(X) = g(X)Z(\omega X)$, for all $X \in H$. > If such a polynomial $Z(X)$ exists, then with overwhelming probability, the sets $A$ and $B$ are equal. This probability is high due to the randomness of $\beta$ and $\gamma$ and the properties of polynomials over finite fields. ### With this, we are set for appling this to the PLONK problem Recall the permutation check problem we had initially, set $A$ and $B$ are equal given that; $A = \{((i,j), T_{i,j}): (i,j) \in I\}$ $B = \{(\sigma((i,j)), T_{i,j}): (i,j) \in I\}$ We cannot apply this principle we just explored yet because it was defined to work on field elements and an extension of a tuple of field elements but what we've got now are tuples indexes. The next subsections would do justice to this! **Mapping Indexed Sets to Field Elements and Verifying Equality** Recall that we aim to rephrase `Check` 2 in terms of polynomials. Specifically, `Check` 2 is equivalent to the equality of two sets $A$ and $B$ , defined as: $A = \{((i, j), T_{i,j}) : (i, j) \in I\}$, $B = \{(\sigma((i, j)), T_{i,j}) : (i, j) \in I\}$. **Mapping Indexed Pairs to Field Elements** The sets $A$ and $B$ are not directly composed of field elements or pairs of field elements; instead, their elements include an index pair $(i, j)$ and a field element $v$ . To leverage the results from the previous section, we need to convert these sets into pairs of field elements. **Mapping Strategy** For each element $((i, j), v)$, we leave $v$ as it is and map the index $(i, j)$ to a field element $a_0$ such that different index pairs map to distinct field elements. Let $\eta$ be a $3N -th$ primitive root of unity. Define: $a_0 = \eta^{3i + j}, \quad a_1 = v$. Thus, we map $((i, j), v)$ to the pair $(\eta^{3i + j}, v)$, which is a pair of field elements. **Constructing the New Sets** Using this mapping, the sets $A$ and $B$ become: $A = \{(\eta^{3i + j}, T_{i,j}) : (i, j) \in I\}$, $B = \{(\eta^{3k + l}, T_{i,j}) : (i, j) \in I, \sigma((i, j)) = (k, l)\}$. `Check` 2 is equivalent to the equality of these sets $A$ and $B$. **Equality Verification via Polynomials** We now apply the polynomial-based method from the previous section to verify the equality of $A$ and $B$. **Polynomials for Interpolation** Let $\eta$ be a $3N -th$ root of unity, $\beta, \gamma \in F$ be random field elements, and $D = \{1, \eta, \eta^2, \ldots, \eta^{3N-1}\}$ . Define $f(X)$ and $g(X)$ as the polynomials interpolating the following values at $D$ : $f(X) \text{ interpolates } \{T_{i,j} + \eta^{3i + j}\beta + \gamma : (i, j) \in I\}$, $g(X) \text{ interpolates } \{T_{i,j} + \eta^{3k + l}\beta + \gamma : (i, j) \in I, \sigma((i, j)) = (k, l)\}$. **Verifying Equality** If there exists a polynomial $Z(X)$ such that: $Z(\eta^0) = 1$, $Z(d)f(d) = g(d)Z(\eta d), \quad \forall d \in D$, then the sets $A$ and $B$ are equal with overwhelming probability. **Additional Definitions for the Protocol** Let $\omega$ = $\eta^3$ be a primitive $N -th$ root of unity and $H = \{1, \omega, \omega^2, \ldots, \omega^{N-1}\}$. Define $S_\sigma^1, S_\sigma^2, S_\sigma^3$ as the interpolations at H of the sets of values: $S_\sigma^1 : \{\eta^{3k + l} : (i, 0) \in I, \sigma((i, 0)) = (k, l)\}$, $S_\sigma^2 : \{\eta^{3k + l} : (i, 1) \in I, \sigma((i, 1)) = (k, l)\}$, $S_\sigma^3 : \{\eta^{3k + l} : (i, 2) \in I, \sigma((i, 2)) = (k, l)\}$. Before concluding this session, it is necessary to clearly define the structure `CommonPreprocessedInput`, this would be essential in the proof generation phase of this protocol. CommonPreprocessedInput { $S_\sigma^1$, $S_\sigma^2$, $S_\sigma^3$, $q_L$, $q_R$, $q_O$, $q_M$, $q_C$, $a$, $b$, $c$ } With these, PLONKish arithmetization is complete. references: https://research.metastate.dev/plonk-by-hand-part-1 https://www.youtube.com/watch?v=-SXMd6S0r0I https://blog.lambdaclass.com/all-you-wanted-to-know-about-plonk/ (life savers)