# What changes for BW6-761?
*Authors: [Youssef El Housni](https://yelhousni.github.io/) ([ConsenSys](https://consensys.net/), [GRACE](https://team.inria.fr/grace/)) and [Aurore Guillevic](https://members.loria.fr/AGuillevic/) ([Inria](https://www.inria.fr/en/caramba), [Aarhus University](https://cs.au.dk/research/cryptography-and-security/))*
Last year in [[HG20](https://eprint.iacr.org/2020/351.pdf)], we introduced the BW6-761 curve which is a pairing-friendly curve that forms a 2-chain with BLS12-377 (previously introduced in [ZEXE](https://eprint.iacr.org/2018/962.pdf)). That is, BW6-761 has a subgroup order $r$ precisely equal to the characteristic $q$ of the base field on which BLS12-377 is defined. In a recent work [[HG21](https://eprint.iacr.org/2021/1359.pdf)], we generalised the curve construction, the pairing formulas ($e : \mathbb{G}_1 \times \mathbb{G}_2 → \mathbb{G}_T$) and the group operations to any BW6 curve defined on top of any BLS12 curve, forming a family of 2-chain pairing-friendly curves. We also investigated other possible 2-chain families, and much more (read the [paper](https://eprint.iacr.org/2021/1359.pdf)!).
Since BW6-761 is already implemented in few projects ([gnark-crypto](https://github.com/ConsenSys/gnark-crypto/tree/master/ecc/bw6-761), [arkworks](https://github.com/arkworks-rs/curves/tree/master/bw6_761), [EY/libff](https://github.com/EYBlockchain/zk-swap-libff/tree/ey/libff/algebra/curves/bw6_761), [kilic/bw6](https://github.com/kilic/bw6) and [constantine](https://github.com/mratsim/constantine/tree/master/constantine/curves)) and is used in production ([Celo](https://celo.org/), [Aleo](https://aleo.org/)), natural questions arise: *What changes for BW6-761 after this work? Is the new formulas backward-compatible with the current implementations? If not, does it worth switching?*
The new general construction where BW6-761 would fall, is coherent with the groups operations and pairing formulae found previously in [HG20]. However, we found new pairing algorithms that are faster. In this note, we try to shed light on the differences and the backward compatibilty.
## Pairing computation
Given $P \in \mathbb{G}_1$ and $Q \in \mathbb{G}_2$, a pairing computation for BW6 is
$$ e(P,Q) = m(P,Q)^{(q^6-1)/r}$$
where $m(P,Q)$ is the Miller loop and $z \mapsto z^{(q^6-1)/r}$ is the final exponentiation.
### Final exponentiation
The final exponentiation corresponds to raising the Miller loop result to $(q^6-1)/r$ or to a multiple $h\cdot (q^6-1)/r$ (where $r \nmid h$). In [HG20], we raise to $3(u^3-u^2+1)(q^6-1)/r$ which gives a fast algorithm described in appendix B (alg.6) in that paper. It is the one implemented in all the code bases (AFAWK). In the new work, we give a generic formula for which the BW6-761 specification would be Equation (28) of that paper. This corresponds to raising to $3(u+1)(q^6-1)/r$ which gives a different result than previously. However, this new algortihm, while generic, is less efficient (previously we took exact values of $h_t$ and $h_y$ and this gave a better LLL decomposition).
**$\rightarrow$ conclusion**: There is nothing to change regarding BW6-761 final exponentiation.
### Miller loop
Previously, we presented an optimal ate Miller loop
$$
m(P,Q) = f_{u+1,Q}(P) \cdot f^q_{u^3-u^2-u,Q}(P)
\tag{1}
$$
which can be optimized in
$$
m(P,Q) = f_{u,Q}(P) \cdot l_{[u]Q,Q}(P) \cdot (f_u)^q_{u^2-u-1,[u]Q}(P)
\tag{2}
$$
Where $u=\texttt{0x8508c00000000001}$ is the BW6-761 (and BLS12-377) seed. We note that
integer | bitsize | Hamming weight | 2-NAF hamming weight |
------- | ------- | -------------- | -------------------- |
$u$ or $u+1$ | 64 | 7 | 7
$u^3-u^2-u$ | 190 | 136 | 31
$u^2-u-1$ | 127 | 66 | 19
Equations (1) and (2) give the same result but it seems that the fastest algorithm is equation (2) with $f_{u,Q}(Q)$ loop in binary and $(f_u)^q_{u^2-u-1,[u]Q}(P)$ loop in 2-NAF. Note that implementing the second loop in 2-NAF requires precomputing $f_{u,Q}^{-1}$ (needed when the second loop integer has a bit $-1$ in its 2-NAF form) but this inverse can be replaced by a conjugate in $\mathbb{F}_{q^6}$.
All mentioned code bases implement Equation (1) with two independent Miller loops, the first one in binary and the second in 2-NAF (no inverse/conjugate when the two loops are independent).
In [a fork of gnark-crypto](https://github.com/yelhousni/gnark-crypto), we implement all three options and report timings on an AWS z1d.large (3.4 GHz Intel Xeon) below
Equation | option (1st loop - 2nd loop) | timing (ms)
-------- | ------------ | ------
[(1)](https://github.com/yelhousni/gnark-crypto/blob/e19af89e10fd6b421b06f9d5072434bbba89f003/ecc/bw6-761/pairing.go#L946) | binary - 2NAF | 1.62
[(2)](https://github.com/yelhousni/gnark-crypto/blob/e19af89e10fd6b421b06f9d5072434bbba89f003/ecc/bw6-761/pairing.go#L704) | binary - binary| 1.59
[(2')](https://github.com/yelhousni/gnark-crypto/blob/e19af89e10fd6b421b06f9d5072434bbba89f003/ecc/bw6-761/pairing.go#L819) | binary - 2NAF | 1.27
So in practice, the third option is the fastest because $\texttt{HW}_{\texttt{2NAF}}(u^2-u-1)=19$ compared to $66$ in binary. This is backward compatible with existing codes.
Now, let's take a look at the new algorithms introduced in [HG21]. First, there is a different generic formula (e.g. equation (23)) when applied to BW6-761 gives
$$
m(P,Q) = f^q_{u+1,Q}(P) \cdot f_{u^3-u^2+1}(P)
\tag{3}
$$
This is slower than before (for the specific BW6-761 choice of $u$) because $\texttt{HW}_{\texttt{2NAF}}(u^3-u^2+1)=37$ and we cannot share the computation over the two loops. Moreover, it is not backward compatible.
Next, we remark that $(u-1)^2$ has a lower Hamming weight and derive the formula
$$
m(P,Q) = f_{u+1,Q}(P) \cdot (f_{u+1})^q_{(u-1)^2,[u+1]Q}(P) \cdot l^q_{[(u+1)(u-1)^2]Q,-Q}(P)
\tag{4}
$$
This should be faster because $\texttt{HW}_{\texttt{2NAF}}((u-1)^2)=12$ or $15$ in binary. In practice,
Equation | option (1st loop - 2nd loop) | timing (ms)
-------- | ------------ | ------
[(4)](https://github.com/yelhousni/gnark-crypto/blob/e19af89e10fd6b421b06f9d5072434bbba89f003/ecc/bw6-761/pairing.go#L584) | binary - binary | 1.250
[(4')](https://github.com/yelhousni/gnark-crypto/blob/e19af89e10fd6b421b06f9d5072434bbba89f003/ecc/bw6-761/pairing.go#L459) | binary - 2NAF | 1.224
However, this is not backward compatible with equations (1) and (2) as the result is different.
Finally, we derive a novel **optimal Tate** (or twisted ate) Miller loop (presented in Algorithm 2 in [HG21]) that leverages the endomorphism $\phi: (x,y) \mapsto (\omega x,-y)$ on $\mathbb{G}_1$ with eigenvalue $\lambda=t-1$, where $t$ is the Frobenius trace. We show that in general it is faster or at least competitive with previous algorithms. For BW6-761, we have two formulae
$$
m_{Tate}(P,Q) = f_{u^3-u^2+1-\lambda\cdot(u+1),P}(Q)
\tag{5}
$$
$$
m_{ate}(P,Q) = f_{u^3-u^2+1-\lambda\cdot(u+1),Q}(P)
\tag{5'}
$$
and
$$
m_{Tate}(P,Q) = f_{u+1+\lambda\cdot(u^3-u^2-u),P}(Q)
\tag{6}
$$
$$
m_{ate}(P,Q) = f_{u+1+\lambda\cdot(u^3-u^2-u),Q}(P)
\tag{6'}
$$
Equation (5) and (5') are slightly slower than respectively (6) and (6') because $\texttt{HW}_{\texttt{2NAF}}(u^3-u^2+1)=37$ while $\texttt{HW}_{\texttt{2NAF}}(u^3-u^2-u)=31$. We report the following timings
Equation | option | timing (ms)
-------- | ------ | ------
[(5)](https://github.com/yelhousni/gnark-crypto/blob/e19af89e10fd6b421b06f9d5072434bbba89f003/ecc/bw6-761/pairing.go#L312) | 2NAF | 1.28
[(6)](https://github.com/yelhousni/gnark-crypto/blob/e19af89e10fd6b421b06f9d5072434bbba89f003/ecc/bw6-761/pairing.go#L163) | 2NAF | 1.26
[(5')](https://github.com/yelhousni/gnark-crypto/blob/67e6bcd98549096529a977ff04d317f763643c68/ecc/bw6-761/pairing.go#L165) | 2NAF | 1.29
[(6')](https://github.com/yelhousni/gnark-crypto/blob/67e6bcd98549096529a977ff04d317f763643c68/ecc/bw6-761/pairing.go#L311) | 2NAF | 1.28
To sum up, taking the fastest options:
Equation| option | timing (ms) | backward compatibily
--------| ------- | ----------- | --------------------
(1)* | binary - 2NAF | 1.62 | Yes
(2') | binary - 2NAF | 1.27 | Yes
(4') | binary - 2NAF | 1.22 | No
(5) | 2NAF | 1.28 | No
(6) | 2NAF | 1.26 | No
(5') | 2NAF | 1.29 | No
(6') | 2NAF | 1.28 | Yes
*the one currently implemented in most code bases.
Note that these are the timings for single pairings. For multi-pairings (product of multiple single pairings), equation (2) and (4) might be slower depending on the number of pairs. In fact, we cannot share squares in the first loop as the second one needs the result of the first one for each independant pair. Here are, for example, the timings of a multi-pairing of size 3:
Equation | multi-pairing size |option | timing (ms) | backward compatibily
-------- | ------ | ------ | --|-
(1)* | 3 | binary-2NAF | 3.81 | Yes
(2') | 3| binary-2NAF | 3.82 | Yes
(4') | 3 | binary-binary | 3.67 | No
(5) | 3 | 2NAF | 2.98 | No
(6) | 3 | 2NAF | 2.95 | No
(5') | 3 | 2NAF | 3.03 | No
(6') | 3 | 2NAF | 2.99| Yes
**$\rightarrow$ conclusion** If you need,
- single pairings and backward compatibility choose (2') or (6')
- single pairings and **no** backward compatibility choose (4')
- large multi-pairings and backward compatibility choose (6')
- large multi-pairings and **no** backward compatibility choose (6)
*Remark*: For BW6 constructed over BLS24, optimal Tate Miller loops are the fastest for single and multi-pairings.

With my co-authors Aurore Guillevic and Diego F. Aranha, we published on ePrint mid-May a survey of elliptic curves for proof systems (ePrint 2022/586). Recently, there have been many tailored constructions of these curves that aim at efficiently implementing different kinds of proof systems. In that survey we provide the reader with a comprehensive view on existing work and revisit the contributions in terms of efficiency and security. We present an overview at three stages of the process: curves to instantiate a SNARK, curves to instantiate a recursive SNARK, and curves to express an elliptic-curve related statement. In this post, we are only interested in the first case. For pairing-based SNARKs, such as the circuit-specific Groth16 [1], bilinear pairings ($e:\mathbb{G}_1\times \mathbb{G}_2 \rightarrow \mathbb{G}_T$) are needed for the proof verification. For universal SNARKs, such as PLONK [2] or Marlin [3], a polynomial commitment scheme is needed. An example of an efficient one is the Kate-Zaverucha-Goldberg (KZG) scheme [4]. The verifier needs to verify some polynomial openings using bilinear pairings. So far, for both circuit-specific and universal constructions, the verification algorithms boil down mainly to pairing computations. However, the prover work is not the same. For Groth16, it is mainly multi-scalar-multiplications (MSM) in both $\mathbb{G}_1$ and $\mathbb{G}_2$ while for KZG-based schemes it is mainly MSMs in $\mathbb{G}_1$ only. In section 4 of the survey paper, we look at a particular family of pairing-friendly elliptic curves suitable for KZG. We derive algorithm 6 to tailor this family to the SNARK context. Note that KZG is also useful for other compelling applications such as Verkle trees for blockchain stateless clients. Pairing-friendly curves

6/14/2022Authors: Youssef El Housni (ConsenSys, GRACE) and Aurore Guillevic (inria) Following the recent work on FFT over non-smooth fields (ECFFT), it might be viable to instantiate a SNARK with an elliptic curve of non-smooth subgroup order. Particularly, one layer SNARK composition can be achieved with less constraints for the choice of curves. In this note we propose a 2-chain based on the widely used BL12-381 curve. The outer curve is a BW6 with a subgroup order 2-adicity equal to one. ECFFT For smooth finite fields $\mathbb{F_q}$ (i.e., when $q−1$ factors into small primes) the Fast Fourier Transform (FFT) leads to the fastest known algebraic algorithms for many basic polynomial operations, such as multiplication, division, interpolation and multi-point evaluation. However, the same operations over fields with no smooth order root of unity suffer from an asymptotic slowdown. [In a recent work by Ben-Sasson, Carmon, Kopparty and Levit, called ECFFT, the authors proposed a new approach to fast algorithms for polynomial operations over all large finite fields.] The key idea is to replace the group of roots of unity with a set of points $L \subset F$ suitably related to a well-chosen elliptic curve group (the set $L$ itself is not a group). The key advantage of this approach is that elliptic curve groups can be of any size in the Hasse-Weil interval $[q+1\pm2\sqrt{q}]$ and thus can have subgroups of large, smooth order, which an FFT-like divide and conquer algorithm can exploit. (From the paper's abstract) 2-chains

6/2/2022last benchmark update: 29/01/21 There are several pairing-friendly elliptic curves libraries used in zero-knowledge proofs (ZKP) projects. Typically, the most important elliptic curve operations in ZKP schemes are "multi scalar multiplication" (MSM) and "pairing product" (PP). This writeup is a try to benchmark and compare the building blocks of these operations in two different architechtures. In fact, since MSM and PP are size-dependant, we focus on timings of mixed addition in G1/G2, doubling in G1/G2, Miller loop and Final exponentiation. The libraries are written in different languages with different software and mathematical optimizations, and with more or less assembly code. The target curves are: BN254, BLS12-381, BLS12-377 and BW6-761. The chosen libraries are:

4/22/2021In this note we describe PLONK, a recent zero-knowledge proof system. The goal is to walk through the scheme step by step and explain the cryptographic primitives at play. Arithmetization The arithmetization a program consists in expressing a program as a sequence of constraint which have a simple mathematic form. There are lots of ways to do this, but in this post we will look at a variation of Rank 1 Constraint System (R1CS). Namely, the $i-th$ constraint binds variables $(w_i)$ by relations of this form: \begin{align} (q_L)iw{i_a}+(q_R)iw{i_b}+(q_M)iw{i_a}w_{i_b}+d_i=(q_c)iw{i_c} \end{align}

12/18/2020
Published on ** HackMD**