# Vortex: A Batched Polynomial Commitment Scheme This blog gives an introduction to Vortex, a batched polynomial commitment scheme. Using Reed-Solomon encoding and a lattice-based hash function, Vortex ensures post-quantum security and efficiency for large inputs, making it well-suited for future-proof cryptographic applications. ## Parameters We begin by defining key parameters for quick reference throughout the blog. - $k$: message length, number of columns in the matrix $W$, assumed to be a power of two for FFT implementation. <!--(If necessary, random padding can be added to the end of the message to reach this size. Check the code, see how it handle)(Assume $k$ is a power of 2, we can pad zeros (or randoms? the additional points are not points the verifier wants to open) into the input to achieve it.)--> - $n$: codeword length, a power of 2 greater than $k$. - $\omega$: a primitive $n$-th root of unity. - $\log_2 b$: bit size of each limb. - $\mathbb{F}_q$: base field. - $d$: modulo polynomial degree. - $\mathcal{R}=\frac{\mathbb{F}_q[X]}{X^d+1}$: ring. - $m$: number of ring elements - $W$: the original trace to be committed, the size is $M=N\cdot k$ - $W'$: the encoded trace, the size is $N\cdot n$ - $M=N\cdot k$: witness/matrix size - $N$: polynomial numbers, number of rows in the matrix $W$, the number of finite field elements to be hashed - $t$: the number of columns that should be opened later. ## Preliminaries ### Row Encoding: Reed-Solomon Encoding Algorithm [Reed-Solomon encoding](https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction#Constructions_(encoding)) transforms the input message into a structured codeword by evaluating a polynomial at a specific set of points. We consider the message in an evaluation domain. #### **1. Initial Setting** - $k$: message length, assumed to be a power of two for FFT implementation. - $n$: codeword length, a power of 2 greater than $k$. - Given an input message $a = (a_0, a_1, \dots, a_{k-1})$, - $D_k=(\omega^{n/k})$ is a subgroup of order $k$ within $D=(\omega)$, where $D$ is a subgroup of size $n$ within $\mathbb{F}_q^*$. :::info In this [implementation](https://github.com/Consensys/linea-monorepo/tree/main/prover), $\mathbb{F}_q$ is the BLS12-377 scalar field. We can further extend it to a smaller field such as KoalaBear field. ::: #### **2. Construct the Polynomial from $a$** We interpolate the input message over the domain $D_k$ using the Inverse FFT: $$ P(X) = IFFT(a, D_k) $$ #### **3. Encode $P(x)$ at $n$ Points** The codeword is obtained by evaluating $P(x)$ at $n$ Points using the FFT: $$ \text{codeword} = FFT(P,D)=(P(1), P(\omega), P(\omega^2), \dots, P(\omega^{n-1})) $$ --- ### Column Hashing: Lattice-Based Hash The hashing process using a polynomial ring. #### **1. Setup** - $\log_2 b$: bit size of each limb. - $\mathbb{F}_q$: message field. - $N$: the number of finite field elements to be hashed. - $d$: modulo polynomial degree. - $q$: ring modulus. The proof and message are the same field elements. - Define the ring: $\mathcal{R}=\frac{\mathbb{F}_q[X]}{X^d+1}$. - Set $m=\frac{N}{d} \cdot \frac{\log_2q}{\log_2b}$. - Generate a random matrix $A=(A_i)\gets\mathcal{R}^m$. <!-- Ring-SIS parameters - `LogTwoBound`: bit size of each limb, $b$ - `LogTwoDegree`: The polynomial degree is $d=2^{LogTwoDegree}$ - $m$ is calulated by `maxNbRows=32`,... e.g. `LogTwoBound`=$b$=16, `LogTwoDegree`=6 ($d=64$) each field element are encoded into 16-bit limbs --> #### **2. Hash** **Input**: a message $\mathbf{x}=(x_0,x_1,...)\in\mathbb{F}_q^N$. 1. Convert Field Elements to Limbs: - each $x_i\in\mathbb{F}_q$ is split into $\frac{\log_2q}{\log_2b}$ limbs $l_{i,j}$ of size $\log_2 b$ bits. 2. Rearrange as Polynomials: - Group every $d$ limbs into a polynomial. Since there are a total of $N\frac{\log_2q}{\log_2b}$ limbs, this results in $m=\frac{N}{d} \cdot \frac{\log_2q}{\log_2b}$ polynomials. Define the vector of polynomials $L=(L_i)\in \mathcal{R}^m$. 3. Compute the Hash Function: - compute polynomial multiplications $h=\sum_{i=0}^{m-1} A_i\cdot L_i$ using FFT - Extract and return the coefficients of $h$ **Example: Hashing a Column** Consider an example where: - $d=4$, $\log_2q=8$, $\log_2b=4$. - Each field element splits into 2 limbs. - Since the polynomial degree is 4. So, each polynomial multiplication "hashes" 2 field elements at once. The hashing structure is illustrated below: ![](https://i.imgur.com/kGAh3pG.png) --- ## Vortex Vortex is a **batched** polynomial commitment scheme designed for UniEval PIOP (Universal Evaluation Polynomial Interactive Oracle Proofs). It allows a *prover* to commit to multiple polynomials $W_0(X),W_1(X),...$, later on, the prover can compute openings to prove to the *verifier* that the commited polynomials can be evaluated to a claimed value $y=(y_0,y_1,...)$ at a **single** point $x$, i.e. $y_i = W_i(x)$. ![](https://i.imgur.com/YKwJcht.png) ### Commit(W) $N,k$ are set to be $O(\sqrt{M})$. Given an $N\times k$ matrix $W$: ![](https://i.imgur.com/pamxi3k.png) <!-- W:[]smartvectors.SmartVector, an abstraction over vectors of field elements W':EncodedMatrix []smartvectors.SmartVector Hashes:Element represents elements in the BLS12-377 scalar field. Uses field.Element to store hashed column values. --> $Commit(W)$ outputs a commitment $cm$ to polynomials $W_0(X),W_1(X),...$, where $W_i(X)=IFFT(W_i,D_k)$. **Step 1: Encode Each Row** - Encode each row of $W$ and obtain $$ W'= \begin{bmatrix} Encode(W_0) \\ Encode(W_1) \\ \vdots \\ Encode(W_{N-1}) \end{bmatrix} $$ where - $Encode(W_i)=FFT(W_i(X),D)$, - $W_i(X)=IFFT(W_i,D_k)$. ![](https://i.imgur.com/C001Hwv.png) **Step 2: Hash Each Column** - Each column of $W'$ is hashed using a lattice-based hash function to obtain the commitment: $$h=(h_0,h_1,...,h_{n-1}),$$ where - $h_i=H(w'_{0,i},w'_{1,i},...)$. ![](https://i.imgur.com/swVih6y.png) :::info - Each cell element in $W$ and $W'$ belongs to $\mathbb{F}_q$. - The hash values $h_i$ belong to the ring $\frac{\mathbb{F}_q[X]}{X^d+1}.$ ::: --- ### The Batch-Opening Phase: OpenEval$(x,y)$ It is an interatctive protocol between the prover and the verifier. - The prover runs the Open algorithm on input $(pp,W,W',x,y)$. - The verifier runs the Verify algorithm on input $(pp,cm, x,y)$ ![](https://i.imgur.com/oQXopMz.png) --- **The Prover: Open Commitment - Open$(pp,W,W',x,y)$** - Define the linear combination of rows: $$u=W_0 + \alpha \cdot W_1+\alpha^2 \cdot W_2+...+\alpha^{N-1} \cdot W_{N-1}$$ At index $q_i$, the encoded $u$ gives $$u'_{q_i}=W'_0[q_i] + \alpha \cdot W'_1[q_i]+\alpha^2 \cdot W'_2[q_i]+...,$$ where $W'$ is the encoded $W$. - Let $\mathbf{q}$ be the set of queried column indices. - For each queried column $s_i\in\mathbf{q}$, extract: $$\mathbf{col}_i=(W'_0[s_i],W'_1[s_i],W'_2[s_i],...)^T$$ - $\pi=(u,\mathbf{col}=\{\mathbf{col}_{s_i}\}_{s_i\in\mathbf{q}})$ --- **The Verifier: Verify Proof - Verify$(pp,cm, x,y)$** The verifier checks two conditions: **1. Proximity Check** - Ensure that the commitment is consistent with the provided column: $$H(\mathbf{col}_i)=h_{q_i}$$ - Verify the linear combination: $$\mathbf{col}_i[0] + \alpha \cdot \mathbf{col}_i[1]+\alpha^2 \cdot \mathbf{col}_i[2]+...=u'_{q_i}$$ **2. Evaluation Check** - Interpolate $u$ to obtain a polynomial $P_u(X)$, note that $$P_u(X)=W_0(X)+\alpha\cdot W_1(X)+\alpha^2 \cdot W_2(X)+...+\alpha^{N-1} \cdot W_{N-1}(X)$$ - Verify the evaluation: $$P_u(x)=y_0 + \alpha \cdot y_1+\alpha^2 \cdot y_2+...+\alpha^{N-1} \cdot y_{N-1},$$ ensuring that $$W_i(x)=y_i$$ ## Selection of Parameters ### Soundness of Vortex The soundness error is given by: $$\epsilon_{soundness}\leq \epsilon_{collision}+\epsilon_{opening}+\epsilon_{commit},$$ where - **Hash Collision Probability** $\epsilon_{collision}$: This is bounded by the probability of breaking the SIS assumption. The [Linea Prover Documentation](https://eprint.iacr.org/2022/1633.pdf) analyzes SIS attacks, including SVP, BKZ, and CPW attacks. For computing security levels, refer to the [SIS parameter estimation code](https://github.com/Consensys/linea-monorepo/blob/main/prover/sis_estimation/sis_parameter_estimation.ipynb). - **Opening Phase Error** $\epsilon_{opening}=(1-\theta)^t$: where $\theta=1-\sqrt{\rho}-\frac{\sqrt{\rho}}{2m'},\rho=k/n$ is the rate of the Reed-Solomon Code, with Johnson proximity $m'\geq 3$, and $t$ is the number of columns that need be opened - **Commit Phase Error** $\epsilon_{commit}\leq(N-1)\frac{(m'+0.5)^7}{3\rho^{1.5}}\frac{|D|^2}{|\mathbb{F}|}$: where $N$ is the polynomial numbers, $D$ is a subgroup of size $n$ within $\mathbb{F}_q^*$, and $\mathbb{F}$ is the extension field where challenges are sampled. ### 128 bits security :::warning This section demonstrates parameter selection for illustrative purposes only. Production parameters should be carefully chosen for real-world applications. ::: For a target security level of $2^{-\lambda}=2^{-128}$. - We choose SIS parameters $\log_2 q, \log_2 b, d$, such that $$\epsilon_{collision}\leq \frac{1}{4}\cdot 2^{-\lambda}$$ The following table lists SIS parameters that achieve at least 128-bit security for each attack type: <div style="display: table; margin: auto;"> | $\log_2(q)$ | $\log_2(b)$ | $d$ | SVP (L2) | SVP (L_$\infty$) | BKZ attack | CPW attack | |-------------|---------------|------|-----------|----------------|-------------|-------------| | 32 | 8 | 512 | 614.42 | 1269.76 | 551.69 | 1272.31 | | 32 | 16 | 512 | 315.41 | 634.88 | 158.74 | 1272.31 | | 32 | 16 | 1024 | 614.42 |1269.76 |344.85 |2741.67 | </div> :::info Given a fixed ring modulus $q$, the security generally increases if bit size of each limb $\log_2 b$ decreases or modulo polynomial degree $d$ increases. ::: - To compute the opening phase error, we fix the Johnson proximity $m'= 3$, and choose the number of opening columns $t$ and the rate of the Reed-Solomon Code $\rho=k/n$ such that $$\epsilon_{opening}=(1-\theta)^t\leq \frac{1}{4}\cdot 2^{-\lambda}$$ <div style="display: table; margin: auto;"> | $-\log_2(\rho)$ | $t$ | $-\log_2(\epsilon_{opening})$ | |-------------|---------------|------| | 2 | 107 | 130.79 | | 4 | 59 | 131.12 | | 6 | 41 | 132.12 | | 8 | 31 | 130.89 | </div> :::info To achieve the same level of security, a larger inverse rate $(-\log_2(\rho))$ reduces the number of required openings, resulting in a smaller proof size. However, it also increases the prover time since the blobUpFactor = inverseRate. This trade-off should be carefully considered in real-world applications. ::: - For the commit phase, we select $m'=3,N=n=2^{20},\rho=2^{-8}$, such that $$\epsilon_{commit}\leq(N-1)\frac{(m'+0.5)^7}{3\rho^{1.5}}\frac{n^2}{||\mathbb{F}||}\leq \frac{1}{4}\cdot 2^{-\lambda}$$ :::info Notice that a big field size for the extension field $(\mathbb{F})$ plays a dominant role in determining the security level. ::: <!-- ## TODO: - [ ] add parameter analysis section to clarify how the parameters affect security and implementation, consider binding, hiding, soundness: - [ ] sis parameters (log_2 b, log_q, d. How do the chosen parameters `b` and `d` affect the security?) - [ ] proof size related parameters, square or rectangular? any security impact from the shape? practical data? - [ ] key size m, how to determine it? - m should be big, And the parameters of SIS are set to be secure with "m=\infnty" which is crucial. Otherwise, we have a circular dependency. I am saying that we need to pick parameters that work for a very large "m" So that the m' = NextPowerOf2(#Row * log2(|F|) / log2(B) / sis.modulusDegree) that we compute later, once we know the number of rows. Is guaranteed to be lower than the initial one - m = NextPowerOf2(#Row * log2(|F|) / log2(B) / sis.modulusDegree)... - [ ] open all columns? two types of opening, - the verifier attempts to open polynomials at a single point $W_0(x),W_1(x),...$ - to prove above values truely equal to the opened values, the verifier open multiple columns in the vortex protocol to verify the relation $W_0(x)=y_0,W_1(x)=y_1,...$, where $(y_0,y_1,...)$ are openings prodived by the prover. Size of opened columns??? - [ ] example parameters for 128 bits security: soundness, sis --> ## Key Takeaways :::info **Why Reed-Solomon (RS) Codes**: - **Efficient Encoding**: RS code encoding enables fast arithmetic operations using FFT, which first efficiently recovers the original polynomial from its initial evaluations and then computes the codewords efficiently in a larger domain. - **Efficient Proximity Testing and Error Detection**: RS codes support efficient proximity testing, ensuring that small deviations in codewords can be detected quickly. For appropriately chosen parameters, the number of opened columns $t<100$, is negligible compared to the total number of columns $n\approx 2^{20}$ **Why Ring-SIS Hash**: - **Post-Quantum Security**: Ring-SIS is based on hard lattice problems, providing strong security against quantum attacks. - **Computational Efficiency**: Ring-SIS hash functions are efficient to compute, particularly when implemented over polynomial rings. This allows for fast arithmetic operations via the Fast Fourier Transform (FFT), minimizing computational overhead during the hashing process. ::: ✔ **Efficient and Compact Commitments**: Vortex enables efficient commitment to **multiple** polynomials simultaneously, reducing communication complexity between the prover and verifier. The messages exchanged between the prover and verifier are of size $O(M^{1/2})$ for a witness of size $M$, making it ideal for large-scale applications like zk-EVMs, where witness sizes can reach hundreds of millions of field elements. ✔ **Small Proofs**: The proof size remains compact, benefiting from self-recursive compression to $O(\log\log M)$. We do not cover the details of self-recursion in this post; see [[Section 9]](https://eprint.iacr.org/2022/1633.pdf) for more information. <!--✔ **Fast Verification**: enabling batch evaluation proofs instead of proving each polynomial separately. Hash consistency and evaluation checks ensure correctness.--> ✔ **Post Quantum Secure**: Vortex is designed to be secure against quantum attacks. ✔ **Field Compatibility**: In ECC-based verification, proofs and witnesses are often defined in different fields. In contrast, lattice-based proof systems ensure that both the proof and witness lie within the **same** field, simplifying implementation and significantly reducing potential security risks. ✔ **Flexible Field Size (Small Field Support)**: Unlike ECC-based PCS, which needs 256-bit fields for 128-bit security, Vortex is a FRI-like PCS that can work with much smaller fields (e.g., 32-bit KoalaBear fields) while maintaining the same security level. The use of small fields enables **efficient arithmetic**, resulting in faster computations and lower resource usage. # Application of Vortex The chart below outlines the steps required to prove a computation using Vortex. ```mermaid graph LR Arithmetization --> Wizard-IOP:global-constraints -- Arcane --> PIOP -- UniEval --> UniEval-PIOP --> Vortex ``` ## Arithmetization We present a simple example to illustrate the process above. Suppose we want to prove that we know the solution to the 7th term of the Fibonacci sequence, given two user-specified inputs. **Fibonacci Problem**: Given two inputs $f_0,f_1$, compute the 7th term of the Fibonacci sequence modulo 193. The problem is defined as follows: - **Input**: - $f_0=1$ - $f_1=1$ - **Arithmetization**: $f_{i+2} = f_{i+1}+f_{i}$ - **Output**: $f_9=55$ ## Arithmetization --> Vectors & Constraints ### Vectors/ Trace We transform the arithmetization into a trace table. The initial inputs are stored in $W_0[0]$ and $W_1[0]$. $W_2$ stores the caculated values from the previous two Fibonacci numbers. We then copy values from $W_1$ and $W_2$ into the next column (in $W_0$ and $W_1$, resp.) to caculate the next Fibonacci number. :::warning In other work, data is typically recorded in columns. However, to align with the Vortex design, which uses **row** encoding, we transpose the columns into rows for data recording and encoding. ::: ![](https://i.imgur.com/93KM22H.png) ### Polynomial Constraints The values in the above table satisfy the following constraints. **Local constraints** (**fixed** column positions): - $W_0[0] - 1=0$ - $W_1[0] - 1=0$ - $W_2[7] - 55=0$ **Global constraints** ensure the correct relationship among all column positions: - $W_2-W_1-W_0=0$ (or $W_2[i]-W_1[i]-W_0[i]=0$ for all $i$) - $Shift(W_0,1)-W_1=0$ (or $W_0[i+1] - W_1[i]=0$ for all $i$) - $Shift(W_1,1)-W_2=0$ (or $W_1[i+1] - W_2[i]=0$ for all $i$) ### IOPs (Preliminary) #### Wizard-IOP: - Prover Messages: The prover messages are represented as **vectors** of values (e.g. $W_0, W_1, W_2$) over a finite field. These can be viewed as evaluations of polynomials at specific points (e.g., roots of unity). - Verifier Queries: The verifier can perform a **wide** range of queries on these columns, including inclusion, permutation, inner-product, and **global constraints**. - Focus: The model is more **general and expressive**, allowing for complex protocols to be specified in a modular and abstract way. It extends the polynomial-IOP model by adding higher-level abstractions and tools for protocol design. #### Polynomial IOP: - Prover Messages: All prover messages are treated as **polynomials** over a finite field. - Verifier Queries: The verifier queries the evaluations of these polynomials at **specific points**. - Focus: The model is primarily focused on **polynomial-based proofs**, allowing for the design of complex protocols in a modular and abstract manner. It extends the polynomial-IOP model by adding higher-level abstractions and tools for protocol design. ## Arcane Compiler: Wizard-IOP --> PIOP We show how the global constraints can be reduced using a standard technique from the work of [Plonk](https://eprint.iacr.org/2019/953). Let $W_0,W_1,···$ be vectors and $C(W_0,W_1,···)$ be a multi-variate arithmetic circuit of degree $c$ (i.e. $C=W_2-W_1-W_0$). We denote $W_i(X)$ as the polynomials interpolated from rows $W_i$. We have the following **equivalence**: - In the Wizard-IOP model, the global constraint $C(W_0,W_1,...)$ is satisfied. - (By the Schwartz-Zippel Lemma) There exists a polynomial $Q(X)$ of degree at most $(c−1)n$ such that, $$C(W_0(x), W_1(x),...)=(x^n-1)Q(x),$$ where $x$ is a random point selected by the verifier. So the prover only needs to show the evaluation of polynomials at a random point satisfies the above equation. **Reduction of ONE Global Constraint $(C,W_0,W_1,W_2)$**: We can use **Vortex.Commit** to create a compact commitment and use **Vortex.OpenEval** to generate and verify proof. ![](https://i.imgur.com/4gidHKn.png) In the **Polynomial Commitments**: The prover **commits** to polynomials $W_0(X),W_1(X),W_2(X),Q(X)$. Later, the prover can **open** the commitment by revealing the evaluation of $W_i(X)$ and $Q(X)$ at a point $x$ and providing proofs that - The evaluations $y_i=W_i(x)$ and $y_Q=Q(x)$ are **consistent** with their commitment $cm$. The **global constraint** is satisfied by the evaluation check. :::warning Note that above analysis is a **univariate** query. $W_i(X)$ may be opened at multiple points, such as $x,\omega^{-1} x$. In the Fabonacci example, - $W_1(X), W_2(X)$ are opened at $x,\omega^{-1} x$ (due to the Shift), - $W_0(X)$ is only opened at $x$ ::: ## UniEval Compiler: from PIOP to UniEval PIOP To open the commitment at **multiple** points, we could create a proof for each opening point queried. However, this would be inefficient and result in a large proof size. We optimize multiple-point queries into single-point evaluations to improve efficiency in verification and reduce proof sizes. Let $S_i$ be the set of points where the verifier wants to open $W_i(X)$. **Claim**. $W_i(X)$ opens at all points $x\in S_i$ if and only if: $$Z_{S_i}~\text{devides}~ (W_i(X)-R_i(X)),$$ where $Z_{S_i}=\prod_{x\in S_i}(X-x)$ and $R_i(x)=W_i(x)$ for all $x\in S_i$. <!-- - The verifier samples $\beta\gets \mathbb{F}_q$ - The prover computes and sends oracle-access to: $Q'(X)=\sum_{i}\beta^i \frac{W_i(X)-R_i(X)}{\prod_{x\in S_i}(X-x)}$ - The verifier samples $z\gets \mathbb{F}_q$ $$y_{Q'}(z^n-1)=\sum_{i}\big(\beta^i (y_i-R_i(z))\prod_{x\not\in S_i}(z-x)\big)$$ --> The multi-point to single-point reduction process: ![](https://i.imgur.com/AWFnlAU.png) Additionally, note that we do not need to generate $cm'$ from the entire $W$ and $Q$. Due to the encoding and hash structure, we can extend $cm$ and compute: $$cm'=cm+encode(Q)\cdot A_{next}$$ ### Acknowledge Special thanks to Alexandre Belling, Gautam Botrel, Ivo Kubjas and Thomas Piellard for their feedback and review. ## Reference <!-- - [Fibonacci Example Using Vortex](https://docs.google.com/spreadsheets/d/1fxRiyrttQrID-XtplEaac-RHpw18IxMGITy7py-KxO8/edit?usp=sharing)--> - [Linea Prover Documentation](https://eprint.iacr.org/2022/1633.pdf) - [Vortex: A List Polynomial Commitment and its Application to Arguments of Knowledge](https://eprint.iacr.org/2024/185.pdf) - [Linea monorepo](https://github.com/Consensys/linea-monorepo/tree/main/prover) - [Plonk](https://eprint.iacr.org/2019/953)