# Proposal for Handling The Memory of zkVM _Hanh Tang (NTU), Minh Pham (Orochi Network), and Chiro Hiro (Orochi Network)_ ## Generalize the memory for zkVM (zkMemory) The idea is to create an independent module that can be used by any zkVM. You might aware that the memory can be constructed as a simple state machine with $2$ instructions `READ` and `WRITE`, and configurable `WORD_SIZE`. Our memory state machine is only able access the exactly `WORD_SIZE` for every executed instruction. That is, if you want to access arbitrary data size, it must be translated to multiple accesses. These instructions need to be satisfied following conditions: - **`READ` instruction** - `READ` on a memory was not wrote should return `0` - Every`READ` access for the same location, must have the value to be equal to the previous `WRITE`. - **`WRITE` instruction** - Every `WRITE` access must write on writable memory chunks _(some areas of the memmory might be read only)_. > **Questions:** > - *How could we handle the memory boundaries?* > - *Do we need to deal with memory allocation/deallocation?* > - *How could we deal with configurable WORD_SIZE?* ### Memory trace To prove the accuracy and consistent of the memory, every memory accesss needs to be recorded. We propose the following trace table, *(this table might be a well know among many zkVM projects)*. | Address | Time Log | Instruction | Value | | ------------------ | -------- | ----------- | ------------------ | | 0x0000000000000000 | 1 | READ | 0x0000000000000000 | | 0x0000000000000000 | 2 | WRITE | 0x0000000000000a20 | | 0x0000000000000020 | 3 | WRITE | 0x0000000000000010 | | 0x0000000000000020 | 4 | READ | 0x0000000000000010 | | 0x... | .. | ... | ... | We're epxecting these tuple for every memory access. $$(\text{address}, \text{time_log}, \text{instruction}, \text{value})$$ - **address:** The location that we applied the memory instruction. - **time_log:** Incremental value that will be increased for every access. - **instruction:** `READ`, `WRITE`. - **value:** Value of size `WORD_SIZE`. ### Proposing for the first implementation We suppose to implement the first version of zkMemory following this wishlist: - Building up the memory trace. - Committing every state of memory to the Verkle tree. - Providing the witness after each state is committed allowing the prover to prove the correctness of entier memory. ### KZG Polynomial Commitment KZG polynomial commitment scheme was introduced by [Kate, Zaverucha and Goldberg](https://www.iacr.org/archive/asiacrypt2010/6477178/6477178.pdf). It allows us to commit to a polynomial $p(X)$. Then open $p$'s evaluations at specific points later. KZG is widely used due to various reasons: - KZG commitment and opening proof sizes are constant, even when opening at multiple points. The proof of evaluation only consists of one group element. Thus, the scheme achieves constant proof size. - Verification time is also constant, namely, only a pairing operation. - The commitment is homomorphic. Given the commitment $c$ and $c'$ and opening $\pi$ and $\pi'$ of $p$ and $p'$, then the commitment and openings of $p+p'$ is just $c+c'$ and $\pi+\pi'$. $\Rightarrow$ Using KZG commitment scheme greatly reduces communication cost. Now, we give a formal description to the KZG commitment scheme. Notice that for indexing, we use $\omega^i$ instead of $i$, where $\omega$ is a primitive root. - $\mathsf{Setup}(1^\lambda)$: Sample $s \in \mathbb{F}$ and output $\left(\{[s^i]_1\}_{i \in \{0,\dots, k-1\}},\{[s^i]_2\}_{i \in \{0,\dots, k-1\}}\right)$ - $\mathsf{Commit}(p(X) \in \mathbb{F}_{<k}[X],crs)$: For $p(X)=\sum_{i=0}^{k - 1}p_iX^i$, output $C=[p(s)]_1=\sum_{i=0}^{k -1}p_i[s^i]_1$ - $\mathsf{Open}(C,p(X))$: Output $p(X)$. - $\mathsf{Verify}(crs,C,p(X))$: Check if $[p(s)]_1=\sum_{i \in [k]}p_i[s^i]_1=C$ - $\mathsf{OpenWitness}(p(X),\omega^i,crs)$: To open $p(X)$ at index $\omega^i$, let $h_i(X)=\dfrac{p(X)-p(\omega^i)}{X-\omega^i}$. Then, output $\pi =\left(\omega^i,p(\omega^i),[h_i(s)]_2\right).$ - $\mathsf{VerifyWitness}(C,\pi=(\omega^i,y,h_i(s)_2),crs)$: Verify $p(\omega^i)=y$ by checking $e(C-[y]_1,[1]_2)=e([s]_1-[\omega^i]_1,[h_i(s)]_2)$. ### Commitment to Verkle Tree Verkle tree was introduced by [John Kuszmaul](https://math.mit.edu/research/highschool/primes/materials/2018/Kuszmaul.pdf). It is a $k-$ary tree having $\ell+1$ layers, where: - Each leaf node contains a key and a value. - Each parent node $\mathbf{N}_j^{(i)}$ has $k$ children and the value $\mathbf{v}_j^{(i)}$ of $\mathbf{N}_j^{(i)}$ of is the commitment to its children. We see that, unlike Merkle tree, which is a binary tree, in a Verkle tree each parent node can have many more children. To provide proof for a Verkle tree, we just need to provide a path from the node to the root. To do so, we employ the KZG commitment scheme. Now, we will describe how to commit and open a path in a Verkle tree. It is possible to open multiple paths using the same technique. More details can be found in [Dankrad Feist's blog](https://dankradfeist.de/ethereum/2021/06/18/pcs-multiproofs.html). - $\mathsf{VerkleSetup}$: Sample $s \in \mathbb{F}$ and output $\left(\{[s^i]_1\}_{i \in \{0,\dots, k-1\}},\{[s^i]_2\}_{i \in \{0,\dots, k-1\}}\right)$. - $\mathsf{VerkleCommit}$: For integers $k, j$, let $j' = k\cdot j$. For each node $\mathbf{N}_j^{(i)}$, where $i \in \{0,\dots,k - 1\}$, with children whose values are $v_{j'}^{(i+1)}, v_{ j'+1}^{(i+1)},\dots,v_{j'+k-1}^{(i+1)}$, find a polynomial $p^{(i)}_{j}(X)$ such that $p^{(i)}_{j}(\omega^t)=v_{j'+t}^{(i+1)}$ for $t=0,1,...,k-1$. Then the value $v^{(i)}_{j}$ of $\mathbf{N}_j^{(i)}$ is $\mathsf{Commit}\left(crs,p^{(i)}_{j}(X)\right)$. For the leaf layer, namely, layer $\ell$, the value $v^{(\ell)}_{j}$ for $j \in \{0, \dots, 2^\ell - 1\}$ is simply the original value at location $j$. Output $v_0^{(0)}$, the root of the tree. - $\mathsf{VerkleOpen}$: For a path $\mathbf{v}_{j_0}^{(0)} \rightarrow \mathbf{v}_{j_1}^{(1)} \rightarrow \mathbf{v}_{j_2}^{(2)} \rightarrow...\rightarrow \mathbf{v}_{j_\ell}^{(\ell)}$, we have to prove that $p^{(i)}_{j_i}(\omega^{j_i})=\mathbb{v}_{j_{i+1}}^{(i+1)} ~\forall i \in \{0,\dots, \ell - 1\}$. We proceed the following steps: -- Let $h^{(i)}_{j_i}(X)=\dfrac{p^{(i)}_{j_i}(X)-\mathbb{v}_{j_{i+1}}^{(i+1)}}{X-\omega^{j_i}}$ and $r=H(\mathbf{v}_{0}^{(0)},...,\mathbf{v}_{j_{\ell-1}}^{(\ell-1)},\mathbf{v}_{j_1},...,\mathbf{v}_{j_{\ell}}^{(\ell)},\omega^{j_0},...,\omega^{j_{h-1}})$. -- Let $G(X)=\sum_{i \in [h]}r^ih_{i,j_i}(X)$, compute $D=[G(s)]_1$. -- Let $r'=H(D,r)$ and $h(X)=\sum_{i \in [\ell]}r^i\dfrac{p_{i,j_i}(X)}{r'-\omega^{j_i}}$. Compute $E=[h(s)]_1$. Note that $E$ can be computed by both prover and verifier, si nce the verifier has already known $[p_i{i,j_i}(s)]$, which belongs to the opening path. -- Finally, let $y=\sum_{i \in [\ell]}r^i\dfrac{\mathbb{v}_{j_{i+1}}^{(i+1)}}{r'-\omega^{j_i}}$. -- Output $D,\pi=\left[\dfrac{h(s)-G(s)-y}{s-r'}\right]_1$ - $\mathsf{VerkleVerify:}$ Check if the root node element is equal to the first opening element and for $r$, $r'$, $y$ defined earlier, compute $E=\sum_{i \in [\ell]}r^i\dfrac{[v_{j_i}^{(i)}]_1}{r'-\omega^{j_i}}$ check whether $e(E-D-[y]_1,[1]_2)=e(\pi,[s]_2-[r']_2)$ There are several benefits of commiting to a Verkle tree using KZG than hashing in a Merkle tree. - Because the width Verkle tree is larger, the opening path is much shorter. - The opening proof of Verkle tree is constant sized. As mentioned by Buterin in [his blog](https://vitalik.ca/general/2021/06/18/verkle.html),the proof size is much smaller than a Merkle proof (6 to 8 times smaller). ### Proposal of Folding Scheme for Vector Commitment We construct a folding scheme for correct evaluation of [vector commitment](https://eprint.iacr.org/2011/495.pdf) by R1CS-in-the-exponent. The vector commitment to a vector $\mathbf{x} = (x_0, \dots, x_{k - 1})$ in our solution employs the [KZG polynomial commitment scheme](https://www.iacr.org/archive/asiacrypt2010/6477178/6477178.pdf) and is realized by committing to a polynomial $p(X)$ whose evaluations at $\omega^0, \dots, \omega^{k - 1}$, where $\omega$ is some reasonable primitive root, are equal to $x_0, \dots, x_{k - 1}$, respectively. > **Remark.** The solution we propose here, following [Nova](https://eprint.iacr.org/2021/370.pdf), is not complete for R1CS-in-the-exponent for vector commitment with respect to KZG commitment scheme and Verkle tree. It requires additonal adaptations to make it suitable with the form of relaxed R1CS-in-the-exponent. To make polynomial commitment suitable with folding scheme, we remind the polynomial representations (i) by coefficients and (ii) by Lagrange basis, and show how to commit such a polynomial with respect to these presentations. #### Polynomial representation by coefficients The coefficient representation of a polynomial $p(X) \in \mathbb{F}_{<k}[X]$ of degree at most $k - 1$ is represented by \begin{equation} p(X) = p_0 + p_1X + p_2X^2 + \dots + p_{k - 1}X^{k - 1}. \end{equation} #### Polynomial representation by Lagrange basis In case we would like to construct a polynomial $p(X) \in \mathbb{F}_{<k}[X]$ satisfying $p(\omega^i) = x_i$ for all $i \in \{0, \dots, k - 1\}$, using Lagrange basis $\{L_i(X)\}_{i \in \{0, \dots, k -1\}}$ helps construct such polynomial in a cleaner way. That is, \begin{equation} p(X) = \sum_{i = 0}^{k - 1} x_i L_i(X). \end{equation} #### Polynomial commitment by $2$ ways of representations Thus, in KZG polynomial commitment scheme, we can compute commitment to $p(X)$ by computing either \begin{equation} [p(X)]= \sum_{i = 0}^{k - 1} p_i\cdot [s^i] \quad\text{ or }\quad [p(X)] = \sum_{i = 0}^{k - 1}x_i \cdot [L_i(X)] \end{equation} by using the common reference string $\{[s^i]\}_{i \in \{0, \dots, k - 1\}}$. However, commitment to $p(X)$ by using Lagrange basis asks us to prepare $\{[L_i(X)]\}_{i \in \{0, \dots, k - 1\}}$ in advance. ### Proposal of Folding Scheme for Memory Accesses with Polynomial Commitments In this section, we discuss the technique for constructing folding scheme for memory accesses with respect to polynomial commitments. In particular, we assume that our memory is a $k$-element array $\mathbf{v} = (v_0, \dots, v_{k - 1})$. We commit the entire memory by using the KZG commitment scheme and describe the proof of correct accesses, namely, reading and writing, to the memory. Using the KZG commitment scheme, we assume that the commitment to the memory is equal to \begin{equation} c = [p(X)] = \sum_{i = 0}^{k - 1}x_i \cdot [L_i(X)] \end{equation} by using $\{[s^i]\}_{i \in \{0,\dots, k - 1\}}$. This computation is in fact equivalent to evaluating $p(X)$ at a secret point $s$. We first describe the technique for handling `READ` access to the memory. Specifically, we would like to prove that the $i$-th element of array $\mathbf{v}$, namely, $v_i$, is equal to $y$ for some public value $y$. In case of polynomial commitment, it is equivalent to proving that $p(\omega^i) = y_i$. The proof for opening at the $i$-th position is computed as \begin{equation} w_i = \left[\frac{p(X) - p(\omega^i)}{X -\omega^i}\right]. \end{equation} And, to verify the correctness of the proof, we simply check that \begin{equation} e([s] - [\omega^i], w_i) = e(c - [y_i], [1]). \end{equation} <!-- **Defining R1CS-in-the-exponent.** We now make an attempt to transform the above verification into the form of R1CS. Let $n$ be some positive integer. Since the verification procedure is involved by some pairing evaluation, for vectors $\mathbf{a} = ([a_0], \dots, [a_{n - 1}]), \mathbf{b}= ([b_0], \dots, [b_{n - 1}]),\mathbf{c}= ([c_0], \dots, [c_{n - 1}])$ and $\mathbf{d}= ([d_0], \dots, [d_{n - 1}])$ over $\mathbb{G}$, we define the notation $e(\mathbf{a}, \mathbf{b}) = e(\mathbf{c}, \mathbf{d})$ to imply that $e([a_i], [b_i]) = e([c_i], [d_i])$ for all $i \in \{0, \dots, n - 1\}$. In case that $\mathbf{A} = ([a_{i,j}])_{i \in \{0, \dots, n - 1\}, j\in\{0, \dots, m - 1\}}$, $\mathbf{B} = ([b_{i,j}])_{i \in \{0, \dots, n - 1\}, j\in\{0, \dots, m - 1\}}$, $\mathbf{C} = ([c_{i,j}])_{i \in \{0, \dots, n - 1\}, j\in\{0, \dots, m - 1\}}$, $\mathbf{D} = ([d_{i,j}])_{i \in \{0, \dots, n - 1\}, j\in\{0, \dots, m - 1\}}$ are matrices of the same dimension over $\mathbb{G}$, we say that \begin{equation} e(\mathbf{A}, \mathbf{B}) = e(\mathbf{C}, \mathbf{D}) \end{equation} if and only if $e(a_{i, j}, b_{i, j}) = e(c_{i,j}, d_{i,j})$ for all $i \in \{0,\dots, n -1\}$ and all $j \in \{0,\dots, m - 1\}$. We hence define a new form of R1CS, namely, R1CS-in-the-exponent, such that the R1CS form is written by \begin{equation} e(\mathbf{A}\cdot \mathbf{w}, \mathbf{B}\cdot\mathbf{w}) = e(\mathbf{C}\cdot \mathbf{w}, [\mathbf{1}]) \end{equation} where $[\mathbf{1}]$ is a vector of $[1]$ and $\mathbf{w}$ is a vector of elements of group $\mathbb{G}$. Respectively, in case of witness matrix $\mathbf{Z}$ over $\mathbb{G}$, we also can define a in a similar notation \begin{equation} e(\mathbf{A}\cdot\mathbf{Z}, \mathbf{B}\cdot\mathbf{Z}) = e(\mathbf{C}\cdot\mathbf{Z}, [\mathbf{1}]) \end{equation} where $[\mathbf{1}]$ in this case is thought of to be an $n\times m$-dimensional matrix of $[1]$. **Relaxed R1CS-in-the-exponent.** We imitate the approach in [Nova](https://eprint.iacr.org/2021/370.pdf) original paper to define relaxed R1CS-in-the-exponent of the following form: \begin{equation} e(\mathbf{A}\cdot\mathbf{Z}, \mathbf{B}\cdot\mathbf{Z}) = u\cdot e(\mathbf{C}\cdot\mathbf{Z} + \mathbf{E}, [\mathbf{1}]) \end{equation} where $\mathbf{E} = (e_{i,j})_{i \in \{0, \dots, n - 1\}, j\in\{0, \dots, m - 1\}}$ over $\mathbb{G}$. --> **R1CS-in-the-exponent for vector commitment.** It is now reasonable for us to define the correct form of matrices $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$, and witness vector $\mathbf{w}$. - We first realize the structure of witness vector $\mathbf{w}$. Notice that the verification of evaluation at $\omega^i$ require the involvement of $[s], c, [\omega^i], [y_i], w_i$ and $[1]$. Since there is no secret element here, we simply define the witness vector $\mathbf{w}$ to be \begin{equation} \mathbf{w} = \begin{pmatrix} [s]\\ c\\ [\omega^i]\\ [y_i]\\ w_i\\ [1]\\ \end{pmatrix}. \end{equation} - The matrix $\mathbf{A}$ simply computes $[s] - [\omega^i]$. Hence, \begin{equation} \mathbf{A} = \begin{pmatrix} 1 & 0 & -1 & 0 & 0 & 0 \end{pmatrix}. \end{equation} - Similarly, matrix $\mathbf{B}$ is defined to be \begin{equation} \mathbf{B} = \begin{pmatrix} 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}. \end{equation} - Matrix $\mathbf{C}$ is defined to be \begin{equation} \mathbf{C} = \begin{pmatrix} 0 & 1 & 0 & -1 & 0 & 0 \end{pmatrix}. \end{equation} #### Handling `WRITE` Access Notice that, since $c = [p(X)] = \sum_{i = 0}^{k - 1}x_i \cdot [L_i(X)]$ is computed by Lagrange basis, to update $x_i$ to $x'_i$, we simply compute \begin{align} c'&= x'_i \cdot [L_i(X)] + \sum_{j \not= i}x_j \cdot [L_j(X)]\\ &= \left(x_i \cdot [L_i(X)] + \sum_{j \not= i}x_j \cdot [L_j(X)]\right) + (x'_i - x_i) \cdot [L_i(X)]\\ &= c + (x'_i - x_i)\cdot [L_i(X)] \end{align} which is a vector commitment to the vector $(x_0, \dots, x_{i - 1}, x'_i, x_{i + 1}, \dots, x_{k - 1})$. Hence, to update $c$ to $c'$, we simply make the addition for $c$ and $(x'_i - x_i)\cdot [L_i(X)]$. ### Proposal of Verkle Tree with R1CS-in-the-Exponent Recall that a $k$-ary [Verkle tree](https://math.mit.edu/research/highschool/primes/materials/2018/Kuszmaul.pdf) of height $\ell$ has $\ell + 1$ layers, indexed from $0$ to $\ell$, of the following structure: - The unique node in level $0$ is the root of the tree and its value is denoted by $v_{0}^{(0)}$ which is a vector commitment to the sequence of values $\left(v^{(1)}_{0}, \dots, v^{(1)}_{k - 1}\right)$ of its direct $k$ child nodes in layer $1$, to be described later. - For each intermediate layer from $1$ to $\ell - 1$, the $i$-th layer has exactly $k^i$ nodes with values denoted by $v^{(i)}_{j}$ for $j \in \{0, \dots, k^i - 1\}$. The value $v^{(i)}_{j}$ is a vector commitment to the sequence of values $\left(v^{(i + 1)}_{k\cdot j}, \dots, v^{(i + 1)}_{k\cdot (j + 1) - 1}\right)$ in the $(i+1)$-th layer. - The last layer, namely, layer $\ell$, has exactly $k^\ell$ nodes with values denoted by $v^{(\ell)}_{0}, \dots, v^{(\ell)}_{k^\ell - 1}$. To prove correct opening of Mekle tree, we need to prove correct opening of each respective vector commitment in each layer from $0$ to $\ell - 1$. Hence, in this way, we can concatenate all component of all verifications, i.e., for all opening according to a path in Verkle tree, to make it an R1CS form. ## Conclusion - We can provide a library/crate that handle memory commitment for multiple prover, especially the prove that friendly with KZG. - The library/crate we're developing can be used in any zkVM with configurable `WORD_SIZE`. - Verkle tree will redurce the overhead to commit and open at arbitrary node. - Constructing folding scheme for vector commitment.