owned this note
owned this note
Published
Linked with GitHub
# Pedersen Hashing in TurboPlonk
**Disclaimer:** This documentation was written on March, 2022. It is intended to give readers a high-level understanding. The codebase is the canonical source of truth, and over time this document might fall behind the implementation details of the code.
###### tags: `audit`
##### Authors: Suyash and Arijit
The aim of this document is to chalk out all the details of how Pedersen hashing in TurboPlonk works. This should serve as a standalone document to understand Pedersen hash implementation in TurboPlonk. Credits to Cody, Ariel and Zac for previous documentations in different places.
### wNAF Representation
The $w$NAF form is used to break a large scalar into smaller quads for enabling efficient scalar multiplication in Elliptic Curve Cryptography. In the context of Aztec, we define the following functions to compute the $w$NAF of a field scalar $b \in \mathbb{F}_r \equiv \{0,1,2, \dots, r-1\}$:
$$
\begin{aligned}
\textsf{wnaf_decompose}(b) &:= (\{q_{127}, q_{126}, \dots, q_0\}, s) \\
b = \textsf{wnaf_reconstruct}\Big(\{q_i\}_{i=0}^{127}, \ s\Big) &:= \left(\sum_{i=0}^{127}q_i4^i\right) - s.
\end{aligned}
$$
where we refer to $\{q_i\}_{i=0}^{127}$ as quads and $s$ as the skew. To understand how $w$NAF representations work in detail, we refer the reader to the $w$NAF [documentation](https://hackmd.io/TEjx7DC6T22g2W5ZJFKfog?view).
A brief summary is as follows. In case of an even $b$, the function $\textsf{wnaf_decompose}$ computes decomposition of $b+1$ (the next odd number) and sets $s=1$. In case of an odd $b$, the function $\textsf{wnaf_decompose}$ computes decomposition of $b$ and sets $s=0$.
In case of an odd $b$, the function $\textsf{wnaf_reconstruct}$ gets back original $b$ using the reconstruction formula. In case of an odd $b$, $\textsf{wnaf_reconstruct}$ gets back $b+1$, and obtains $b$ by subtracting $s=1$ from it.
### Scalar Multiplication Using Quads
Before describing our pedersen hash in the next section, we describe a simpler and more straightforward primitive of scalar multiplication.
Given a scalar $b \in \mathbb{F}_r$, we wish to compute $B = bP$. To do this efficiently, we use the quad representation of $b$. We do the following:
1. Input: $b \in \mathbb{F}_r$.
3. Compute the $w$NAF of the modified input:
$$
\textsf{wnaf_decompose}(b) = \Big(\{q_i\}_{i=0}^{127}, s\Big)
$$
where $\{q_i\}_{i=0}^{127}$ are the quads (for $b$ if $b$ is odd, for $b+1$ if $b$ is even) and $s$ is the skew ($s=1$ if the input $b$ is even). We have
$$
b = \textsf{wnaf_reconstruct}\Big(\{q_i\}_{i=0}^{127}, s\Big) = \left(\sum_{i=0}^{127}4^iq_i\right) - s
$$
Therefore, the scalar multiplication can be written as:
$$
B = bP = \sum_{i = 0}^{127} q_i \cdot 4^{i}P -sP
$$
It it worth noting an important observation from the $w$NAF [documentation](https://hackmd.io/TEjx7DC6T22g2W5ZJFKfog?view): The most significant quad $q_{127}=1$ for any scalar $b\in \mathbb{F}_r$. Thus, we can rewrite $B$ as:
$$
B = bP = 4^{127} + \sum_{i = 0}^{126} q_i \cdot 4^{i}P - sP.
$$
Now, if we have pre-computed the following group elements: $-3P_i, -P_i, P_i, 3P_i$ where $P_i = 4^iP$, then we can compute the scalar multiplication in our circuit by cumulative point additions. Notice that we only need to compute scalar multiplications of quad values $1$ and $3$, i.e. $P_i$ and $3P_i$ . The scalar multiplications for $-1$ and $-3$ can be obtained by negating the y-coordinates of that of $P_i$ and $3P_i$ respectively while keeping the x coordinates the same. We store these pre-computed group elements in the `hash_ladder` struct in the code.
### From scalar multiplication to Pedersen
In a particular Pedersen hash (say for computing a commitment to a `value_note`), you need to hash together multiple field scalars each with a unique and uniformly generated set of group generators. This is simplified in the code using something called as `hash_index`. A `hash_index` is a tuple of integers which chooses the generator $P \in \mathbb{G}_1$ and auxiliary generator $P_{\text{aux}} \in \mathbb{G}_1$ for computing Pedersen hashes. We also have a `skew_generator` for a particular `hash_index`.
But why do we need three generators for computing a Pedersen hash of a single scalar? The reason is we want a collision resistant hash function. Since we take only the x-coordinate of the Pedersen hash as output, we need to ensure that the input $b < p/2$ where $p=|\mathbb{G}_1|$ (For a proof see the answers of [this](https://crypto.stackexchange.com/questions/96275/two-elliptic-curve-points-having-the-same-x-coordinate?answertab=active#tab-top), collision resistance is formally proved below. See the exact value of $p$ (Grumpkin group order) in the [Aztec Yellow paper](https://hackmd.io/@aztec-network/ByzgNxBfd#◈-First-pairing-source-group)). In the current setting, the input value can exceed $p/2$. For instance, let $b_{max}$ denote the maximum value of the scalar you can represent using our $w$NAF form (taking $n=127$ below).
$$
b_{max} = 4^{n} + 3(4^{n-1} + 4^{n-2} + \dots + 4 + 1)-s = 2\cdot 4^{n}-1-0 = 2^{255}-1.
$$
Clearly, $b_{max}>p/2$. Let's drop the least significant quad $q_{0}$ and consider only the remaining 127 quads. In that case, we can have powers of $4$ till $4^{n-1}$, so
$$
b_{max} = 4^{n-1} + 3(4^{n-2} + 4^{n-1} + \dots + 4 + 1)-s = 2\cdot 4^{n-1}-1-0 = 2^{253}-1.
$$
and hence $b_{max}>p/2$. Similarly, let's drop the two least significant quads $q_{0},q_{1}$ and the skew $s$ and see what is the maximum value we can represent.
$$
b_{max} = 4^{n-2} + 3(4^{n-3} + 4^{n-4} + \dots + 4 + 1) = 2\cdot 4^{n-2}-1 = 2^{251}-1.
$$.
In this case, we do have $b_{max}<p/2$. Therefore, we need to separate the original input $b$ in two parts: $\gamma$ which contains the quads $( q_{0}, q_{1})$ and $\lambda$ which contains $(q_2, q_3\dots, q_{126}, q_{127}=1)$. If the skew $s$ is 1, we use a `skew_generator` $P_{\text{skew}}$. Thus, we define the Pedersen hash of a scalar $b\in \mathbb{F}_r$ as:
$$
\begin{align*}
H_P(b) &= \underbrace{\left(4^{125} + q_{126}4^{124} + \dots + q_{3}4 + q_{2}\right)P}_{\lambda P} + \underbrace{\left( q_{1}4 + q_{0}\right)P_{\text{aux}}}_{\gamma P_{\text{aux}}} - sP_{\text{skew}} \\
&= \underbrace{(4^{125}P - sP_{\text{skew}})}_{\text{Gate }0} + \underbrace{(q_{126}4^{124} + \dots + q_{3}4 + q_{2})P}_{\text{Gates }1, 2, \dots, 125} + \underbrace{(4q_1 + q_0)P_{\text{aux}}}_{\text{Gate }126, 127}
\end{align*}
$$
From the above equation, it is evident that $1 \le \lambda < 2^{251} < \frac{p}{2}$, $-15 \le \gamma \le 15$, $\gamma \neq 0$).
In the code, the powers of the generators are pre-computed and stored in a struct called `hash_ladder`. For generators $P,P_{\text{aux}}\in \mathbb{G}_1$ we define a `hash_ladder` as follows:
| index | quad | $\texttt{one}$ | $\texttt{three}$ |
| -------- | -------------- | ----------------- | ------------------------- |
| 0 | $q_{127}(= 1)$ | $4^{125}P$ | $3 \cdot 4^{125}P$ |
| 1 | $q_{126}$ | $4^{124}P$ | $3 \cdot 4^{124}P$ |
| 2 | $q_{125}$ | $4^{123}P$ | $3 \cdot 4^{123}P$ |
| $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |
| 124 | $q_{3}$ | $4P$ | $3 \cdot 4P$ |
| 125 | $q_{2}$ | $1P$ | $3P$ |
| 126 | $q_{1}$ | $4P_{\text{aux}}$ | $3 \cdot 4P_{\text{aux}}$ |
| 127 | $q_{0}$ | $1P_{\text{aux}}$ | $3P_{\text{aux}}$ |
To confirm the structure of the above table, we have a added a test called `hash_ladder_structure` in `pedersen.test.cpp`.
### Proving Collision Resistance
We prove this by contradiction. Let $a,b \in \mathbb{F}_r$ be two distinct field scalars for which we get a collision (Pedersen hashes yield curve points whose x-coordinates are the same). Thus, we have:
$$
\begin{aligned}
H_P(a) &= \gamma_1 P_{\text{aux}}+\lambda_1P -s_1P_{\text{skew}} = A \implies (x,y_1) \\
H_P(b) &= \gamma_2 P_{\text{aux}}+\lambda_2P -s_2P_{\text{skew}} = B \implies (x,y_2).
\end{aligned}
$$
For collision to occur, we need $y^2_1 = y^2_2 \implies y_1 = \pm y_2$ (as both $A$ and $B$ satisfy the curve equation $y^2 = x^3 - 17$). If $y_1 = -y_2$, $A$ and $B$ are inverses of each other i.e.
$$
\begin{aligned}
A + B &= \mathcal{O} \\
\implies (\lambda_1+\lambda_2)P + (\gamma_1+\gamma_2) P_{\text{aux}} -(s_1+s_2)P_{\text{skew}} &= \mathcal{O}
\end{aligned}
\tag{1}
$$
Since $1 \le \lambda_1, \lambda_2 < 2^{251} < \frac{p}{2}$, we know that $0 < (\lambda_1 + \lambda_2) < p$. Hence to satisfy equation (1) with uniform generators $P$, $P_{\text{aux}}$, and $P_{\text{skew}}$, an adversary needs to come up with non-zero scalar solution $(a_1,a_2,a_3)$ such that
$$
a_1P+a_2P_{\text{aux}}+a_3P_{\text{skew}} = \mathcal{O}
$$
holds. Here $a_1$ is constrained to be non-zero ensuring that the adversary needs to produce non-zero scalar solution. Coming up with such a solution is equivalent to solving DL problem as shown in this [document](https://hackmd.io/urZOnB1gQimMqsMdf7ZBvw?view).
If $y_1=y_2$, we must have
$$
\begin{aligned}
&A = B \\
&\implies \lambda_1P + \gamma_1P_{\text{aux}} - s_1P_{\text{skew}} = \lambda_2P + \gamma_2P_{\text{aux}} - s_1P_{\text{skew}} \\
&\implies (\gamma_1-\gamma_2)P_{\text{aux}} +(\lambda_2-\lambda_1)P +(s_2-s_1)P_{\text{skew}} = \mathcal{O}.
\end{aligned}
\tag{2}
$$
Let ${c = (\gamma_1-\gamma_2), d = (\lambda_2-\lambda_1)}$, and $e = (s_2-s_1)$. All $c$, $d$, and $e$ cannot be zero simultaneously since then $a = b$, a contradiction to our assumption of having two distinct scalars producing collision. Again, $e$ should ideally lie in the set $\{-1,0,1\}$. But we do not need to enforce that since that will make the adversary more powerful. Coming up with non-zero scalar solution such that equation (2) holds is equivalent to solving DL problem as shown in this [document](https://hackmd.io/urZOnB1gQimMqsMdf7ZBvw?view).
Hence our Pedersen hash function is collision resistant. $\blacksquare$
### Gate Structure:
In this section, we describe the gate structure for computing a fixed-base scalar multiplication in circuits. We call this as `fixed_group_add_gate` in the code. For input $b = t + \sum_{i = 0}^{126} b_i \cdot 4^{i}$ and generator $P\in \mathbb{G}_1$, we have $n=127$ quads (excluding $q_{127}=1$) and we need a total of $n+1 = 128$ following gates.
| $w_1$ | $w_2$ | $w_3$ | $w_4$ |
| -------- | -------- | -------- | --- |
| $x_0$ | $y_0$ | $4^{-n}$ | $a_0 = (1 - s4^{-n}) \in \{1, 1 - 4^{-n}\}$ |
| $x_1$ | $y_1$ | $x_{\alpha, 0}$ | $a_1 = 4a_0 + q_{126}$ |
| $x_2$ | $y_2$ | $x_{\alpha, 1}$ | $a_2 = 4^2a_0 + 4q_{126} + q_{125}$ |
| $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |
| $x_i$ | $y_i$ | $x_{\alpha, i-1}$ | $a_{i} = 4^{i}a_0 + 4^{i-1}q_{126} + 4^{i-2}q_{125} + \dots + 4q_{127-i+1} + q_{127-i}$ |
| $x_{i+1}$ | $y_{i+1}$ | $x_{\alpha, i}$ | $a_{i+1} = 4^{i+1}a_0 + 4^{i}q_{126} + 4^{i-1}q_{125} + \dots + 4q_{127-i} + q_{127-i-1}$ |
| $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |
| $x_{n}$ | $y_{n}$ | $x_{\alpha, n-1}$ | $a_{n} = 4^{n}a_0 + 4^{n-1}q_{126} + 4^{n-2}q_{125} + \dots + 4^2q_2 + 4q_{1} + q_{0} =: b$ |
For the $i$th gate, we have the wires values $w_1 = x_i, w_2 = y_i$ such that $(x_i,y_i) \in \mathbb{G}_1$ denotes the cumulative addition until that gate. In the third wire of the *next* gate, we store $x_{\alpha,i}$ - the x-coordinate of the next point to be added, i.e. $\{-3P_{124-i}, -P_{124-i}, P_{124-i}, 3P_{124-i}\}$. Thus, we have:
$$
(x_{i+1},y_{i+1}) = (x_{i},\ y_{i}) \ +_{ecc} \ (x_{\alpha, i},\ y_{\alpha, i})
$$
The forth wire values are the cumulative sums of the quad values. We use cumulative sums for efficiency reasons. Therefore, the wire value $w_4$ of the last gate should be equal to the original input $b$. Note that in the last gate, we have $4^{n}a_0 = 4^n(1 - s4^{-n}) = 4^n - s$ gives $4^{n}$ when $b$ is odd and $4^{n}-1$ when $b$ is even, so the skew term is taken care of. We describe the constraints on the wire values (i.e. witnesses) in the next section.
### Constraints
To prove that the fixed-base scalar multiplication is correctly computed, we need to enforce the following constraints on the $i$th gate for all $i \in \{0,1,\dots, n-1\}$:
![](https://hackmd.io/_uploads/Hy3G2KHOY.png)
1. Check if the quad value is indeed $\{-3,-1,1,3\}$. We have $\delta_i = q_{126-i} = a_{i+1} - 4 a_i$
$$
\boxed{(\delta_i+3)(\delta_i+1)(\delta_i-1)(\delta_i-3) = 0}
\tag{a}
$$
2. Check if the x-coordinate of the point to be added next is correct. Now, we know the point to be added has to be $(x_{\alpha,i}, y_{\alpha,i}) \in \left\{(x_{\beta}, y_{\beta}), (x_{\beta}, -y_{\beta}), (x_{\gamma}, y_{\gamma}), (x_{\gamma}, -y_{\gamma})\right\}$ where $(x_{\beta}, y_{\beta}) \equiv P_{124-i}$ and $(x_{\gamma}, y_{\gamma}) \equiv 3P_{124-i}$ are pre-computed constants (note that $\delta_i^2$ can either be 9 or 1 below).
$$
\begin{aligned}
x_{\alpha,i} :=& \left(\frac{\delta_i^2 - 9}{-8}\right)x_\beta + \left(\frac{\delta_i^2 - 1}{8}\right)x_{\gamma}
\\
=& \bigg(\underbrace{\frac{x_\gamma-x_\beta}{8}}_{q_{1,i}}\bigg) \delta_i^2 + \bigg(\underbrace{\frac{9x_\beta - x_\gamma}{8}}_{q_{2,i}}\bigg)
\end{aligned}
$$
$$
\therefore \quad \boxed{q_{1,i}\delta_i^2 + q_{2,i} - x_{\alpha,i} = 0}
\tag{b}
$$
3. Check if x-coordinate of the resulting point addition is correct. For this, we use the elliptic curve point addition formula:
$$
\begin{gather}
x_{i+1} = \left(\frac{y_{\alpha,i} - y_i}{x_{\alpha,i} - x_i}\right)^2 - x_{\alpha,i} - x_i \\
\implies (x_{i+1} + x_{\alpha,i} + x_i)(x_{\alpha,i} - x_i)^2 - (y_{\alpha,i} - y_i)^2 = 0 \\
\implies (x_{i+1} + x_{\alpha,i} + x_i)(x_{\alpha,i} - x_i)^2 - y_{\alpha,i}^2 + 2y_{\alpha,i}y_i - y_i^2 = 0 \tag{2}
\end{gather}
$$
We can derive $y_{\alpha,i}$ from $x_{\alpha,i}$ as
$$
\begin{aligned}
y_{\alpha,i} :=& \ \frac{(x_{\alpha,i}-x_\gamma)\delta_i}{(x_\beta - x_\gamma)}y_\beta+ \frac{(x_{\alpha,i} - x_\beta)\delta_i}{3(x_\gamma - x_\beta)}y_\gamma \\
=& \ \bigg( \underbrace{\frac{3y_\beta - y_\gamma}{3(x_\beta - x_\gamma)}}_{q_{ecc,i}} \bigg) x_{\alpha,i}\delta_i + \bigg( \underbrace{\frac{x_\beta y_\gamma - 3x_\gamma y_\beta}{3(x_\beta - x_\gamma)}}_{q_{3,i}} \bigg)\delta_i
\end{aligned}
$$
Substituting $y_{\alpha,i} = q_{ecc,i}x_{\alpha,i}\delta_i + q_{3,i}$ and $y_{\alpha,i}^2 = x_{\alpha,i}^3 + b_{\text{grumpkin}}$ in $(2)$, we get:
$$
\boxed{(x_{i+1} + x_{\alpha,i} + x_i)(x_{\alpha,i} - x_i)^2 - (x_{\alpha,i}^3 + b_{\text{grumpkin}}) + 2y_i(q_{ecc,i}x_{\alpha,i}\delta_i + q_{3,i}) - y_i^2 = 0}
\tag{c}
$$
4. Check if the y-coordinate of the resulting point addition is correct.
$$
\begin{gather}
y_{i+1} = \bigg(\frac{y_{\alpha,i} - y_i}{x_{\alpha,i} - x_i}\bigg)\big(x_i - x_{i+1}\big) - y_i
\\
\therefore \quad \boxed{(y_{i+1} + y_i)(x_{\alpha,i} - x_i) - (q_{ecc,i}x_{\alpha,i}\delta_i + q_{3,i} - y_i)(x_i - x_{i+1}) = 0}
\tag{d}
\end{gather}
$$
5. Initialisation for $i=0$: we need to check $a_0 \in \{1, 1-4^{-n}\}$. Note that the selector $q_{c,i} = 0$ for $i\neq 0$.
$$
\begin{gather}
\boxed{q_{c,i}(1 - a_i)(1 - 4^{-n} - a_i) = 0}
\tag{e}
\end{gather}
$$
5. Initialisation for $i=0$: $x_0$ must be the x-coordinate of either of $Q_0=4^{125}P$ or $Q_1=4^{125}P - P_{\text{skew}}$. Define selector values for $i=0$:
$$
\begin{gather}
q_{4,0}=Q_{0, x}, \quad q_{5,0}=(Q_{0,x}-Q_{1,x})
\\
q_{m,0}=Q_{0, y}, \quad q_{c,0}=(Q_{0,y}-Q_{1,y})
\\ \\
\boxed{q_{c,i} (w_{3,i}(q_{4,i} - x_i) + q_{5,i}(1 - a_i)) = 0}
\tag{f}
\end{gather}
$$
7. Initialisation for $i=0$: $y_0$ must be the y-coordinate of either of $Q_0=4^{125}P$ or $Q_1=4^{125}P- P_{\text{skew}}$.
$$
\boxed{q_{c,i} (w_{3,i}(q_{m,i} - y_i) + q_{c,i}(1 - a_i)) = 0}
\tag{g}
$$
We combine the constraints in equations $(a,b,c,d,e,f, g)$ using powers of a challenge. Linear and non-linear terms are separately computed in `transition_widgets/turbo_fixed_base_widget.hpp`.
### Range Constraining the Input to Pedersen Hashes
Suppose our input to the hash function is $b \in \mathbb{Z}_2^{254}$, we need to prove that $b \in \mathbb{F}_r$. For that, we can split the bit-wise representation of $b$ in two parts:
$$
b = \big(\underbrace{b_{hi}}_{128} \ \| \ \underbrace{b_{lo}}_{126} \big) \in \mathbb{F}_r
$$
Now, suppose we split the field modulus in a same manner as $r = (r_{hi} \ \| \ r_{lo})$. If we prove either of the following condition holds:
1. $r_{hi} - b_{hi} >0$
2. $r_{hi} - b_{hi} =0$ and $r_{lo} - b_{lo} >0$
then we effectively prove that $b < r$. Only question remains to be answered is how do we compute $b_{hi}$ and $b_{lo}$ from the accumulating $w$NAF sums.
#### Computing Upper Half $b_{hi}$
We can compute the upper half $b_{hi}$ from $a_{64}$ in the following way. For $i=64$, we have:
$$
\begin{aligned}
a_{64} &= 4^{64}a_0 + 4^{63}q_{126} + 4^{62}q_{125} + \dots + 4q_{64} + q_{63} \\
&= 4^{64}\left(1 - s \cdot 4^{-127}\right) + 4^{63}q_{126} + 4^{62}q_{125} + \dots + 4q_{64} + q_{63} \\
&= -s \cdot 4^{-63} + \underbrace{4^{64}\cdot 1 + 4^{63}q_{126} + 4^{62}q_{125} + \dots + 4q_{64} + q_{63}}_{b_{hi}} \\
&= -s \cdot 4^{-63} + b_{hi}
\end{aligned}
$$
Note that the most significant quad in $b_{hi}$ too is 1 for the same reasons why it was 1 for $b$.
$$
\boxed{\therefore \quad b_{hi} = a_{64} + s \cdot 2^{-126}}
$$
#### Computing Lower Half $b_{lo}$
We compute the lower half as follows
$$
\begin{aligned}
b_{lo} &= a_{127} - 2^{126} \cdot b_{hi} \\
&= a_{127} - 2^{126}\left(a_{64} + s \cdot 2^{-126}\right)
\end{aligned}
$$
$$
\boxed{\therefore \quad b_{lo} = a_{127} - 2^{126}a_{64} - s}
$$
### References
1. Barrtenberg implementation:
a. [`stdlib/hash/pedersen/pedersen.cpp`](https://github.com/AztecProtocol/aztec2-internal/blob/defi-bridge-project/barretenberg/src/aztec/stdlib/hash/pedersen/pedersen.cpp)
b. [`plonk/proof_system/widgets/transition_widgets/turbo_fixed_base_widget.hpp`](https://github.com/AztecProtocol/aztec2-internal/blob/defi-bridge-project/barretenberg/src/aztec/plonk/proof_system/widgets/transition_widgets/turbo_fixed_base_widget.hpp)
2. The Turbo-PLONK program syntax for specifying SNARK programs, Ariel G and Zac W.
Link: https://docs.zkproof.org/pages/standards/accepted-workshop3/proposal-turbo_plonk.pdf
3. Fixed-base Multiplication gate, Cody G.
Link: https://hackmd.io/MCmV2bipRYelT1WUNLj02g?view