# Understanding KZG Commitment
###### tags: `Public`
## Introduction to Principle
The set of vector data that we want to prove (basically a list of data)
$$
[d_0,d_1,d_2...,d_{n-1}]
$$
Let’s set up a function that converts our data to an integer$f(d_i)$
We get a series of points in the Cartesian coordinate system
$$
[(0,f(d_0)),(1,f(d_1)),(2,f(d_2))...(n-1,f(d_{n-1}))]
$$
The corresponding polynomial function can be obtained by [Lagrangian interpolation](https://en.wikipedia.org/wiki/Polynomial_interpolation)
$$
p(X) = \sum_{i=0}^{n} p_i X^i
$$
Define elliptic curve pairing
$$
e: G_1 \times G_2 \rightarrow G_T
$$
$g_1$ and $g_2$ are the generation points of the group $G_1$ and $G_2$ respectively. We define an identifier (this identifier is very important and will be used frequently, please get familiar with it).
$$
[x]_1=x*g_1 \in G_1 , [x]_2=x*g_2 \in G_2, x \in F_p
$$
### Trusted Setup
Generate a key $s$, and then we calculate separately, which involves the MPC scenario
$$
[s^i]_1 \to [s^0]_1, [s^1]_1,[s^2]_1...[s^{n-1}]_1
$$
$$
[s^i]_2 \to [s^0]_2, [s^1]_2,[s^2]_2...[s^{n-1}]_2
$$
An important feature here is that even if all of the above data is exposed, we can't know what the value of $s$ is, but we can do some specific logic based on the exposed data such as
$$
c[s^i]_1=cs^ig_1=[cs^i]_1
$$
$$
c[s^i]_1 + d[s^j]_1=[cs^i+ds^j]g_1=[cs^i+ds^j]_1
$$
Prover can calculate the following polynomials
$$
[p(s)]_1 = [\sum_{i=0}^{n} p_i s^i]_1 = \sum_{i=0}^{n} p_i [s^i]_1
$$
Let's understand what's behind this equation
- When we did the Trusted Setup, we already had ****$[s^i]_1$
- When prover wants to compute $[p(s)]_1$ , You can evaluate it without knowing what s is
We define it here as a Commitment to the polynomial $p(X)$
$$
C=[p(s)]_1
$$
### Commitment Security Analysis
Whether there is a polynomial $q(X)$ that $q(X) \ne p(X)$ , satisfied with $[p(s)]_1 = [q(s)]_1$ ?
Obviously, this problem is equivalent to $p(s)=q(s)$, also $0 = p(s)-q(s)$
Let's say there's a very numerical polynomial like this $p(X)$(Note that $p(X)$ here has the same order as $q(X)$, set to $n$)
We construct $r(X)=p(X)-q(X)$, the order is the same
Let’s make $r(X)=0$, so let's say $z$ one of the zeros,then $(X-z)$ divides $r(X)$, means that $r(X)$ can be represented by the following equation
$$
r(X)=(X-z)*t(X)
$$
It’s obvious that the order of $t(X)$ must be $n-1$ .By analogy, since the order of the polynomial decreases by one for every zero linear factor it divisible, we reach an important conclusion, $r(X)$ has at most $n$ zero points, so $p(X)$ and $q(X)$ intersect at most $n$ points
Therefore, if an attacker wants to construct a false Commitment, the probability is $n/p$, p is the order of Group $G_1$ , $p \approx 2^{256}$
As you can see, the probability is very small
### Prover The process of building Proof
- Prover selects a point and satisfies $p(a)=b$
- Calculate proof data $\pi = [q(s)]_1$
- $q(x)= \frac {p(x)-b}{x-a}$ , (Note that the concept of division of polynomials is involved here)
Set $t (x) = p (x) - b$ , have $t(x)=0$ solution for $x=a$ , so $(x-a)$ must be divided exactly by polynomial $p(x)-b$
### Verifier verifies correctness
Parameters received
- $\pi$
- Point (a,b)
The following equations need to be computationally validated, and if both sides are equal, the Verification passes
$$
e(\pi, [s-a]_2) = e(C-[b]_1, g_2)
$$
Let's do it in a form that's easier to understand
$$
e([q(s)_1],[s-a]_2) = e([p(z)]_1-[b]_1, g_2) \to \\ e([q(s)]*g_1,[s-a]*g_2) = e(([p(z)]-[b])*g_1, g_2) \to \\ e(g_1,g_2)^{q(s)(s-a)} = e(g_1, g_2)^{p(s) - b} \to \\ q(s)(s-a) = p(s) - b
$$
Let's analyze it
- $[s-a]_2$ In fact, it can be obtained by the combination of $[s]_2 + [-a]_2$, and is the point on $G_2$
- $[s]_2$ was obtained in the previous Trusted Setup, $[a]_2$ is provided by prover and is publicly available, and the symmetric transformation of it on the elliptic curve with respect to the X-axis is obtained $[-a]_2$
- Commitment $C$ is a point on the Group $G_1$
This allows Verivier to verify that Prover knows the polynomial structure without disclosing it and without everyone exposing $s$
The verification process is carried out by on-chain contracts, and there is space for gas optimization. For details, please refer to this
- **[A minimum-viable KZG polynomial commitment scheme implementation](https://ethresear.ch/t/a-minimum-viable-kzg-polynomial-commitment-scheme-implementation/7675)**
## **pros and cons**
It can flexibly perform sampling check on any Data point. When Merkle Tree checks the data of a node, it needs to attach the bypass Hash Data in the verification path, which has much less storage and transportation space than the data
But the calculation amount of on-chain verification is relatively large, and there are some problems of delay and Gas consumption
## **usage scenario**
- Responsible for important functions in the future Ethereum EIP-4844 proto-dank sharding sharding solution
- The scroll L2 project also uses this zk algorithm
- It is also a dependency of zk algorithm plonk
## reference
- [https://dankradfeist.de/ethereum/2020/06/16/kate-polynomial-commitments.html](https://dankradfeist.de/ethereum/2020/06/16/kate-polynomial-commitments.html)
- [https://scroll.io/blog/kzg](https://scroll.io/blog/kzg)
- [https://notes.ethereum.org/@dankrad/kzg_commitments_in_proofs#ZKPs-over-non-aligned-fields](https://notes.ethereum.org/@dankrad/kzg_commitments_in_proofs#ZKPs-over-non-aligned-fields)
- [https://ethresear.ch/t/a-minimum-viable-kzg-polynomial-commitment-scheme-implementation/7675](https://ethresear.ch/t/a-minimum-viable-kzg-polynomial-commitment-scheme-implementation/7675)