# Understanding KZG Commitment ###### tags: `Public` ## Introduction to Principle The set of vector data that we want to prove (basically a list of data) $$ [d_0,d_1,d_2...,d_{n-1}] $$ Let’s set up a function that converts our data to an integer$f(d_i)$ We get a series of points in the Cartesian coordinate system $$ [(0,f(d_0)),(1,f(d_1)),(2,f(d_2))...(n-1,f(d_{n-1}))] $$ The corresponding polynomial function can be obtained by [Lagrangian interpolation](https://en.wikipedia.org/wiki/Polynomial_interpolation) $$ p(X) = \sum_{i=0}^{n} p_i X^i $$ Define elliptic curve pairing $$ e: G_1 \times G_2 \rightarrow G_T $$ $g_1$ and $g_2$ are the generation points of the group $G_1$ and $G_2$ respectively. We define an identifier (this identifier is very important and will be used frequently, please get familiar with it). $$ [x]_1=x*g_1 \in G_1 , [x]_2=x*g_2 \in G_2, x \in F_p $$ ### Trusted Setup Generate a key $s$, and then we calculate separately, which involves the MPC scenario $$ [s^i]_1 \to [s^0]_1, [s^1]_1,[s^2]_1...[s^{n-1}]_1 $$ $$ [s^i]_2 \to [s^0]_2, [s^1]_2,[s^2]_2...[s^{n-1}]_2 $$ An important feature here is that even if all of the above data is exposed, we can't know what the value of $s$ is, but we can do some specific logic based on the exposed data such as $$ c[s^i]_1=cs^ig_1=[cs^i]_1 $$ $$ c[s^i]_1 + d[s^j]_1=[cs^i+ds^j]g_1=[cs^i+ds^j]_1 $$ Prover can calculate the following polynomials $$ [p(s)]_1 = [\sum_{i=0}^{n} p_i s^i]_1 = \sum_{i=0}^{n} p_i [s^i]_1 $$ Let's understand what's behind this equation - When we did the Trusted Setup, we already had ****$[s^i]_1$ - When prover wants to compute $[p(s)]_1$ , You can evaluate it without knowing what s is We define it here as a Commitment to the polynomial $p(X)$ $$ C=[p(s)]_1 $$ ### Commitment Security Analysis Whether there is a polynomial $q(X)$ that $q(X) \ne p(X)$ , satisfied with $[p(s)]_1 = [q(s)]_1$ ? Obviously, this problem is equivalent to $p(s)=q(s)$, also $0 = p(s)-q(s)$ Let's say there's a very numerical polynomial like this $p(X)$(Note that $p(X)$ here has the same order as $q(X)$, set to $n$) We construct $r(X)=p(X)-q(X)$, the order is the same Let’s make $r(X)=0$, so let's say $z$ one of the zeros,then $(X-z)$ divides $r(X)$, means that $r(X)$ can be represented by the following equation $$ r(X)=(X-z)*t(X) $$ It’s obvious that the order of $t(X)$ must be $n-1$ .By analogy, since the order of the polynomial decreases by one for every zero linear factor it divisible, we reach an important conclusion, $r(X)$ has at most $n$ zero points, so $p(X)$ and $q(X)$ intersect at most $n$ points Therefore, if an attacker wants to construct a false Commitment, the probability is $n/p$, p is the order of Group $G_1$ , $p \approx 2^{256}$ As you can see, the probability is very small ### Prover The process of building Proof - Prover selects a point and satisfies $p(a)=b$ - Calculate proof data $\pi = [q(s)]_1$ - $q(x)= \frac {p(x)-b}{x-a}$ , (Note that the concept of division of polynomials is involved here) Set $t (x) = p (x) - b$ , have $t(x)=0$ solution for $x=a$ , so $(x-a)$ must be divided exactly by polynomial $p(x)-b$ ### Verifier verifies correctness Parameters received - $\pi$ - Point (a,b) The following equations need to be computationally validated, and if both sides are equal, the Verification passes $$ e(\pi, [s-a]_2) = e(C-[b]_1, g_2) $$ Let's do it in a form that's easier to understand $$ e([q(s)_1],[s-a]_2) = e([p(z)]_1-[b]_1, g_2) \to \\ e([q(s)]*g_1,[s-a]*g_2) = e(([p(z)]-[b])*g_1, g_2) \to \\ e(g_1,g_2)^{q(s)(s-a)} = e(g_1, g_2)^{p(s) - b} \to \\ q(s)(s-a) = p(s) - b $$ Let's analyze it - $[s-a]_2$ In fact, it can be obtained by the combination of $[s]_2 + [-a]_2$, and is the point on $G_2$ - $[s]_2$ was obtained in the previous Trusted Setup, $[a]_2$ is provided by prover and is publicly available, and the symmetric transformation of it on the elliptic curve with respect to the X-axis is obtained $[-a]_2$ - Commitment $C$ is a point on the Group $G_1$ This allows Verivier to verify that Prover knows the polynomial structure without disclosing it and without everyone exposing $s$ The verification process is carried out by on-chain contracts, and there is space for gas optimization. For details, please refer to this - **[A minimum-viable KZG polynomial commitment scheme implementation](https://ethresear.ch/t/a-minimum-viable-kzg-polynomial-commitment-scheme-implementation/7675)** ## **pros and cons** It can flexibly perform sampling check on any Data point. When Merkle Tree checks the data of a node, it needs to attach the bypass Hash Data in the verification path, which has much less storage and transportation space than the data But the calculation amount of on-chain verification is relatively large, and there are some problems of delay and Gas consumption ## **usage scenario** - Responsible for important functions in the future Ethereum EIP-4844 proto-dank sharding sharding solution - The scroll L2 project also uses this zk algorithm - It is also a dependency of zk algorithm plonk ## reference - [https://dankradfeist.de/ethereum/2020/06/16/kate-polynomial-commitments.html](https://dankradfeist.de/ethereum/2020/06/16/kate-polynomial-commitments.html) - [https://scroll.io/blog/kzg](https://scroll.io/blog/kzg) - [https://notes.ethereum.org/@dankrad/kzg_commitments_in_proofs#ZKPs-over-non-aligned-fields](https://notes.ethereum.org/@dankrad/kzg_commitments_in_proofs#ZKPs-over-non-aligned-fields) - [https://ethresear.ch/t/a-minimum-viable-kzg-polynomial-commitment-scheme-implementation/7675](https://ethresear.ch/t/a-minimum-viable-kzg-polynomial-commitment-scheme-implementation/7675)