Authors: Youssef El Housni (ConsenSys, GRACE) and Aurore Guillevic (Inria, Aarhus University)
Last year in [HG20], we introduced the BW6-761 curve which is a pairing-friendly curve that forms a 2-chain with BLS12-377 (previously introduced in ZEXE). That is, BW6-761 has a subgroup order
Since BW6-761 is already implemented in few projects (gnark-crypto, arkworks, EY/libff, kilic/bw6 and constantine) and is used in production (Celo, Aleo), natural questions arise: What changes for BW6-761 after this work? Is the new formulas backward-compatible with the current implementations? If not, does it worth switching?
The new general construction where BW6-761 would fall, is coherent with the groups operations and pairing formulae found previously in [HG20]. However, we found new pairing algorithms that are faster. In this note, we try to shed light on the differences and the backward compatibilty.
Given
where
The final exponentiation corresponds to raising the Miller loop result to
Previously, we presented an optimal ate Miller loop
which can be optimized in
Where
integer | bitsize | Hamming weight | 2-NAF hamming weight |
---|---|---|---|
64 | 7 | 7 | |
190 | 136 | 31 | |
127 | 66 | 19 |
Equations (1) and (2) give the same result but it seems that the fastest algorithm is equation (2) with
All mentioned code bases implement Equation (1) with two independent Miller loops, the first one in binary and the second in 2-NAF (no inverse/conjugate when the two loops are independent).
In a fork of gnark-crypto, we implement all three options and report timings on an AWS z1d.large (3.4 GHz Intel Xeon) below
Equation | option (1st loop - 2nd loop) | timing (ms) |
---|---|---|
(1) | binary - 2NAF | 1.62 |
(2) | binary - binary | 1.59 |
(2') | binary - 2NAF | 1.27 |
So in practice, the third option is the fastest because
Now, let's take a look at the new algorithms introduced in [HG21]. First, there is a different generic formula (e.g. equation (23)) when applied to BW6-761 gives
This is slower than before (for the specific BW6-761 choice of
Next, we remark that
This should be faster because
Equation | option (1st loop - 2nd loop) | timing (ms) |
---|---|---|
(4) | binary - binary | 1.250 |
(4') | binary - 2NAF | 1.224 |
However, this is not backward compatible with equations (1) and (2) as the result is different.
Finally, we derive a novel optimal Tate (or twisted ate) Miller loop (presented in Algorithm 2 in [HG21]) that leverages the endomorphism
and
Equation (5) and (5') are slightly slower than respectively (6) and (6') because
Equation | option | timing (ms) |
---|---|---|
(5) | 2NAF | 1.28 |
(6) | 2NAF | 1.26 |
(5') | 2NAF | 1.29 |
(6') | 2NAF | 1.28 |
To sum up, taking the fastest options:
Equation | option | timing (ms) | backward compatibily |
---|---|---|---|
(1)* | binary - 2NAF | 1.62 | Yes |
(2') | binary - 2NAF | 1.27 | Yes |
(4') | binary - 2NAF | 1.22 | No |
(5) | 2NAF | 1.28 | No |
(6) | 2NAF | 1.26 | No |
(5') | 2NAF | 1.29 | No |
(6') | 2NAF | 1.28 | Yes |
*the one currently implemented in most code bases.
Note that these are the timings for single pairings. For multi-pairings (product of multiple single pairings), equation (2) and (4) might be slower depending on the number of pairs. In fact, we cannot share squares in the first loop as the second one needs the result of the first one for each independant pair. Here are, for example, the timings of a multi-pairing of size 3:
Equation | multi-pairing size | option | timing (ms) | backward compatibily |
---|---|---|---|---|
(1)* | 3 | binary-2NAF | 3.81 | Yes |
(2') | 3 | binary-2NAF | 3.82 | Yes |
(4') | 3 | binary-binary | 3.67 | No |
(5) | 3 | 2NAF | 2.98 | No |
(6) | 3 | 2NAF | 2.95 | No |
(5') | 3 | 2NAF | 3.03 | No |
(6') | 3 | 2NAF | 2.99 | Yes |
Remark: For BW6 constructed over BLS24, optimal Tate Miller loops are the fastest for single and multi-pairings.