owned this note
owned this note
Published
Linked with GitHub
# Bandersnatch Implementation Notes
## Scope
This note contains notes regarding the implementation of the Bandersnatch elliptic curve. This curve was introduced by Simon Masson, Antonio Sanso and Zhenfei Zhang in https://eprint.iacr.org/2021/1152.pdf [^Bandersnatch] and we assume some familiarity with that paper
[^Bandersnatch]: "Bandersnatch: a fast elliptic curve built over the
BLS12-381 scalar field" by Simon Masson, Antonio Sanso and Zhenfei Zhang https://eprint.iacr.org/2021/1152.pdf
The target audience of this note is developers who wish to implement this curve. This is particularly important for implementations that wish to be compatible with the Go implementation that is supposed to go into Geth. We only have proof sketches here.
## The Bandersnatch curve
The bandersnatch curve is an elliptic curve whose field of definition is a prime Field $\mathbb{F} = GF(r_{BLS})$, where $r_{BLS}$ is the size of relevant 255-bit prime subgroup (aka scalar field) of the BLS12-381 curve, which is the de-facto standard curve used in applications using pairing-friendly elliptic curves. This choice makes Zero-Knowledge type arguments and proofs involving field elements from $GF(r_{BLS})$ very efficient with current techniques. The bandersnatch curve was specifically chosen to allow an efficient degree-2 endomorphism, so that the GLV-technique can be used to speed up exponentiation.
The bandersnatch curve allows a representation as an twisted Edwards curve and this is the type of curve model that we will exclusively use here. Explicitly, this means that points on the bandersnatch curve satisfy the equation
$$ax^2 + y^2 = 1+dx^2y^2$$
for curve parameters $a,d$. Concretely, we have $a = -5$ (chosen small for efficiency) and $\frac{a^2}{d^2} = \sqrt{2} -1$ in $GF(r_{BLS})$. The latter choice is to allow the endomorphism.
Note that both $a,d$ are actually non-squares, making this an _incomplete_ twisted Edwards curve. Most of the literature such as Bernstein et al.[^IntroEdwards], which introduced them to crypto, is mostly concerned with the case where $a$ is a square and $d$ is a non-square and many results and desirable properties of twisted Edwards curve only hold in that good case. Bandersnatch's non-standard choice is needed to allow the endomorphism.
[^IntroEdwards]: "Twisted Edwards Curves" by Daniel J. Bernstein, Peter Birkner, Marc Joye, Tanja Lange and Christiane Peters https://eprint.iacr.org/2008/013.pdf
#### (incomplete) twisted Edwards curves and coordinates
Twisted Edwards curves are, strictly speaking, not elliptic curves: the reason is that the (naive projectivization of the) curve equation $ax^2 + y^2 = 1+dx^2y^2$ actually has singularities at infinity. Notably, it has two self-intersections at infinity. For $a$ square, $d$ non-square, these self-intersection points do not correspond to rational points and we can usually ignore them. For $a,d$ both non-squares, we are not so lucky.
To properly speak about and treat those singularities, we use projective extended twisted edward coordinates, i.e. we consider $X,Y,T,Z$ not all zero, satisfying
$$a X^2 + Y^2 = dT^2 + Z^2\\
XY = TZ$$
Recall that these are projective coordinates, so the represenation is not unique. Indeed, $(X:Y:T:Z)$ = $(cX:cY:cT:cZ)$ represent the same point for any non-zero $c$
This corresponds to the usual case by just setting $T = XY/Z$, with $Z=1$ corresponding to the affine points. These coordinates are fairly popular, since storing T actually improves efficiency. See Hisil-Wang-Carter-Dawson[^HWCD] for details. For the discussion at hand, we note that the introduction of T also serves to actually desingularize the curve and allows us to properly talk about points at infinity.
The "standard" addition law, taken from the [Explicit Formula Database](https://eprint.iacr.org/2008/013.pdf), which takes it from [^HWCD], $P_3 = P_1 + P_2$ in these coordinates is given by
$$
X_3 = (X_1 Y_2 + X_2 Y_1)(Z_1Z_2 -dT_1T_2)\\
Y_3 = (Y_1Y_2 -aX_1X_2) (Z_1Z_2 + dT_1T_2)\\
T_3 = (X_1Y_2 +Y_1X_2) (Y_1Y_2-aX_1X_2)\\
Z_3 = (Z_1Z_2 -dT_1T_2)(Z_1Z_2 + dT_1T_2)
$$
Variations of this are possible by using the curve equations. The negative of $(X:Y:T:Z)$ is $(-X:Y:-T:Z)$
Note for clarification: There is no *need* to use these coordinates to implement the curve, since the points at infinity are never supposed to appear anyway. For our purposes, they are just needed to properly *talk about* some properties we care about. We will see we could do without extended coordinates ex-posteriori, but we use them at first to correctly argue that.
#### Special points
The curve has exactly 4 points at infinity, not all rational, with coordinates given in $(X:Y:T:Z)$-form by
$$I_1 = (0: \sqrt{d} : 1 : 0)\\
I_2 = (0: -\sqrt{d} : 1 :0)\\
E_1 = (\sqrt{d/a} : 0 : 1 : 0) = (1:0:\sqrt{a/d}:0)\\
E_2 = (-\sqrt{d/a}: 0 : 1 : 0) = (1:0:\sqrt{a/d}:0) $$
for some consistent choices of square roots.
For usual twisted Edwards curves, none of these are rational (as the square roots do not exist in the base field). For us, $E_1$, and $E_2$ actually are rational. We also observe that, without the $T$ coordinate, the those points points would come in pairs with identical coordinates (due to the original curve equation having two self-intersections at infinity) and, more importantly, the points would (wrongly!) appear to be all rational with x=z=0, y=1 resp. y=z=0, x=1.
The set of rational points of the bandersnatch has order $2\cdot 2\cdot p_{253}$, where $p_{253}$ is a 253-bit prime. The structure as an abelian group is $Z/2 \times Z/2 \times Z/p_{253}$.
We have a full rational 2-torsion group with a subgroup of 4 points of order 1 or 2 given by
$$ N = (0:1:0:1)\\
A = (0:-1:0:1)\\
E_1\\
E_2$$
$N$ is the neutral element of the curve, $A$ is the affine order-2-point that has a very important role for us; $E_1,E_2$ are the rational points at infinity defined above.
## Singularities of curve operations
One of the main appeals of using (twisted) Edwards curves and the main reason why they are used in crypto is that there exists complete formulas for the addition laws, i.e. expressions to compute (the coordinates of) $P+Q$ from (coordinates of) $P$ and $Q$ that work for at least all rational cases, including $P=\pm Q$; note that e.g. for short Weierstrass form, we need to special-case $P=Q$. This is a constant source of both bugs and attack vectors for side-channel attacks.
Alas, this nice property of twisted Edwards curves, relies on $a$ being a square and $d$ being a non-square. A more precise analysis that I could not find in the literature [^ErrorInHWCG] of the standard addition law $P_3 = P_1 + P_2$ shows that:
The "standard" addition law given above for $P_1 + P_2$ is singular (i.e. gives the invalid answer 0:0:0:0) if and only if $P_1 - P_2$ is a point at infinity. (Note the negative sign: $P_1+P_2$ itself may or may not be at infinity)
Of course, other formulas and coordinates might have other singular cases; in particular, coordinate systems that do not cover the points at infinity or do not desingularize (and hence cannot tell $E_1$ and $E_2$ apart) are not expected to work if either $P_1$, $P_2$ or $P_1 + P_2$ is at infinity.
A more precise statement is that if $P_1 - P_2$ is one of $E_1, E_2$, then both $Y_1Y_2 - aX_1X_2$ and $Z_1Z_2 - dT_1T_2$ are 0. This corresponds to a pair of points of the form $(X:Y:T:Z)$ and $(\frac{1}{\sqrt{ad}}Y:\frac{a}{\sqrt{ad}}X:\frac{1}{d}Z:T)$, where $\sqrt{ad}$ is one of the two square roots of $ad$. In the other case where $P_1 - P_2$ is one of the other two (non-rational) points at infinity, then both $X_1Y_2 + X_2Y_1$ and $Z_1Z_2 + dT_1T_2$ are 0, but this case cannot happen if both $P_1$ and $P_2$ are rational.
For the efficient degree-2 endomorphism $\Psi$ that was described in the Bandersnatch paper [^Bandersnatch], there are various possible formulas that all tend to have potential singularities at 2-torsion points.
For those, the correct answers are
$\Psi(N) = N, \Psi(A) = N, \Psi(E_1) = A, \Psi(E_2) = A$
We strongly suggest that test vectors for any implementation covers these special cases appropriately.
## Avoiding singularities by staying on the subgroup.
Let us denote by $G$ subgroup of rational points of large prime order $p_{253}$ and denote by $G'$ the subgroup of size $2\cdot p_{253}$ that is generated by $G$ and the point $A$. Let $\mathcal{E}$ denote the group of *all* rational Bandersnatch points.
Clearly, we have $G\subset G' \subset \mathcal{E}$.
A consequence of the singularity condition given above is that no singular cases for the addition law appear as long as we stay on $G'$, since $G'$ is closed under subtraction and contains no points at infinity.
Furthermore, for cryptographic purposes, we actually want to only compute on the prime-order subgroup $G$ anyway. In particular, we need to ensure that (potentially untrusted) output that we receive is verified or preprocessed to ensure that it is on $G$.
## A Decaf-Style representation of cofactor-4 incomplete twisted Edwards curves.
In order to ensure that we only work on the subgroup and also get extremely fast deserialization of untrusted input, we use the following decaf-style (to my knowledge novel, in the context of cofactor-4 incomplete twisted Edwards curves) trick:
We represent $G$ as $G\cong G'/\{N,A\}$, i.e. we work only in $G'$, but we identify $P$ with $P+A$.
Explicitly, for $P=(X:Y:T:Z)$, we have $P+A = (-X:-Y:T:Z) = (X:Y:-T:-Z)$.
Of course, this means that we no longer have a unique representation of curve points, but we did not have that to start with due to using projective coordinates. Of course, for every element $P\in G'$, exactly one of $P$ and $P+A$ is in $G$ and we could just choose that one; however, this is actually hard to determine and the point is that we do not need to know which one. If a unique representative is really needed, we suggest to choose the one with $Y \in [0, \frac{p-1}{2}]$ and $Z=1$
Now, why is this a good idea? We have the following key properties:
- On $G'$, all points have $Y\neq 0$ and $Z\neq 0$.
- It is **much** easier to check whether an element is in $G'$ than to check whether it is in $G$.
- Checking whether two curve points $P$ and $Q$ are equal / $P$ is the neutral elements is essentially equally difficult whether we work modulo $A$ or not.
- All curve operations are compatible with this identification.
Let us argue why these properties hold:
The last bullet point is easy: This holds simply because $\{N, A\}$ is a subgroup.
The first one is also easily verified by plugging either $Y=0$ or $Z=0$ into the curve equations. In fact, $E_1$ and $E_2$ are the only rational points with either $Y=0$ or $Z=0$.
The other properties are a bit harder: First, we make the following observation: The map
$$\mathcal{E} \to \mathbb{F}, P \mapsto \frac{X}{Y}$$
is 2:1. Indeed the preimage sets (including non-rational points) are exactly of the form $\{P, P+A, -P+I_1, -P+I_2\}$, where $I_1$, $I_2$ are the non-rational points at infinity, as one can easily verify. On $\mathcal{E}$, we hence get a 2:1 map.
In particular, when quotienting out $\{N, A\}$ and on $G$, the map is injective.
This means that to check equality of points $P_1 = P_2$ modulo our identification, we just need to check whether $\frac{X_1}{Y_1} = \frac{X_2}{Y_2}$, or equivalently $X_1Y_2 = X_2Y_1$. Note that this is just as hard or easy as not doing the identification, where we would need to divide by $Z$ (due to working with projective coordinates) instead of by $Y$. To check whether a point is (equivalent to) the neutral element, we just need to check whether $X=0$.
For the subgroup check, we make the following observation:
For any rational point P = (X:Y:T:Z), the following are equivalent:
1. One of $P$, $P+A$ is in $G$
2. $Z^2-aX^2$ is a square in $\mathbb{F}$
3. $Z^2-dX^2$ is a square in $\mathbb{F}$
4. $(aZ^2-dY^2)(a-d)$ is a square in $\mathbb{F}$ and $Z\neq 0$
5. $(aZ^2-dY^2)$ is a non-square in $\mathbb{F}$ and $Z\neq 0$.
Note that we can restrict our attention to $Z\neq 0$. Setting $x= \frac{X}{Z}$, $y = \frac{Y}{Z}$, we have $x^2 = \frac{1-y^2}{a-dy^2}$ resp. $y^2 = \frac{1-ax^2}{1-dx^2}$. Together with checking that $a-d$ is a non-square, this is enough to show that 2--5 are equivalent. The tricky part is equivalence with 1. The easiest way is probably observing that $P$ is in $G$ iff $P = Q + Q$ has a solution for rational $Q$. Writing this out shows that 2--5 are neccessary. To show that it is sufficient, observe that conditions 2--5 hold for exactly one of $P, P+E_1$. This is also true for condition 1 due to the group structure of $\mathcal{E}$. See note [^SubgroupProof] below for more details.
This observation allows us to test membership in $G'$ by a computation of a Jacobi symbol, which is considerably faster than computing a square root in $\mathbb{F}$ or any exponentiation involving $P$.
## Serialization format and MapToFieldElement
The above makes it neccessary to use a serialization format that is compatible with the identification of $P$ with $P+A$. This precludes us from just using the $\frac{X}{Z}$ or $\frac{Y}{Z}$ coordinates directly. We use two different serialization formats, a long and a short one (there is a trade-off speed vs. bandwidth here) What we want to have are the following desiderata:
- The short serialization format consists of a single (serialized) field element, i.e. 256 bit
- the long serialization format contains the short serialzation format as an exact substring
- the format is prefix-free, even when short and long formats are mixed: this means that we can determine automatically from an input stream which format was used.
- serializing and deserializing is an exact round-trip (modulo P=P+A, of course), without needing to clear cofactors.
- Serializing a point in a given format (short or long) gives a unique bitstring. -- No two valid diferent bitstrings of a given format deserialize to the same point.
- Serializing a point is very efficient (The most expensive part is a simple conversion to affine coordinates, meaning 1 division -- this seems hard to avoid if we want a unique bitstring)
- Verifying untrusted input upon deserializing is very efficient: The most expensive part is a single Legendre-Symbol computation. No square roots are required for the check!
- Deserializing is very efficient: Deserializing from short format requires 1 square root operation (which is essentially the minimum possible -- you cannot have a rational function here), deserializing from long format is basically free (apart from potential subgroup check)
- Certain leading bits of the serialization format are always 0 (Notably: If it starts (counting bits within bytes high-endian) with with 1..., then the next bit must be 0, the format is long and the 257th bit is 0). This allows (if desired) to add other formats while retaining prefix-freeness.
Let us write bits within bytes in high-endian notataion and use high-endian byte-order to serialize integers. Then we serialize a point P = (X:Y:T:Z) in short format as
$$\textrm{SerializeFieldElement}_{256}\Bigl(\frac{X}{Z} \cdot \textrm{Sign}(\frac{Y}{Z})\Bigr)$$
where SerializeFieldElement just interprets the field element $u:=\frac{X}{Z} \cdot \textrm{Sign}(\frac{Y}{Z})$ as an 256-bit integer in $[0, r_{BLS}-1]$ and writes it byte by byte in high-endian order. Sign(u) is $+1$ if $u$ is in $[1,\frac{r_{BLS}-1}{2}]$, and Sign(u) is -1 otherwise. Note that we never take the Sign of 0. In fact, $\textrm{Sign}$ is just an arbitrary $\pm 1$-valued function satisfying $\textrm{Sign}(-u) = -\textrm{Sign}(u)$ here.
For the long serialization format, we serialize as
$$10, \textrm{SerializeFieldElement}_{254}\Bigl(\frac{Y}{Z}\cdot\textrm{Sign}(\frac{Y}{Z})\Bigr),\textrm{SerializeFieldElement}_{256}\Bigl(\frac{X}{Z} \cdot \textrm{Sign}(\frac{Y}{Z})\Bigr)$$
i.e. we write $v:=\frac{Y}{Z}\cdot\textrm{Sign}(\frac{Y}{Z})$ as a 254-bit integer, prepend the bits "10" and follow by the short serialization format. Observe here that $v$ indeed fits into 254 bits due to the way we defined $\textrm{Sign}$.
Observe that this format does not (allow to) distinguish between $P$ and $P+A$.
Also, observe that anything in short serialization format starts with a zero bit, anything in long format starts with the sequence 10.
To deserialize from the short format (We ignore the T coo here), first recover $u=\frac{X}{Z} \cdot \textrm{Sign}(\frac{Y}{Z})$.
Verify that it is in $[0, r_{BLS}-1]$. Set $X:=u$.
Set $Z:=1$.
Compute $Y$ satisfying the curve equation using $Y^2 = \frac{1-aX^2}{1-dX^2}$ and choose the square root with positive sign. (If no square root exists, abort)
Check that $X$ corresponds to a point in $G'$ by computing a Jacobi symbol.
To deserialize from the long format (ignore the T coo here), check that the input stream starts with 10 and recover $u$ and $v$. Then
- Check that $u, v$ are in $[0, r_{BLS}-1]$
- Check that $v\neq 0$ and $\textrm{Sign}(v)$ = +1
- Set X:=u, Y:=v, Z:=1
- Check that X,Y,Z satisfy the curve equation(s)
- Check that X corresponds to a point on $G'$ by computing a Jacobi symbol
### MapToFieldElement
Our choice of short serialization format was guided by trying to minimize deserialization costs. The more natural choice $\frac{X}{Y}$, which we argued above to be injective, would work as well. Unfortunately, this would require computing two square roots (rather than one) during deserialization. In situations where we need just an injective map to the field of definition and do not need to deserialize or if we need to run NIZKs or SNARKs, we indeed suggest to use $\frac{X}{Y}$ instead of the serialization format, as the Sign operation is not as NIZK-friendly.
[^ErrorInHWCG]: Unfortunately, the proof for this in [^HWCD], Theorem 1, which is widely cited, is flawed: They (wrongly) state that addition is non-singular "if points at infinity are not involved" and the proof assumes that whenever a division by zero occurs (for their affine version of the group law), then the result must be a point at infinity. For the actual singular cases where $P_1 - P_2$ is at infinity, we usually have that neither of $P_1, P_2$ or $P_1 + P_2$ is at infinity. Furthermore, what can happen and indeed does happens in the affine group law is that we get an expression of the form $0/0$ -- and the correct result would be a perfectly normal finite rational point on the curve. Note that the **statement** of the theorem is actually still "accidentially" correct, because they only work with odd-order points.
[^HWCD]: "Twisted Edwards Curves Revisited" by
Huseyin Hisil, Kenneth Koon-Ho Wong, Gary Carter, and Ed Dawson https://iacr.org/archive/asiacrypt2008/53500329/53500329.pdf
[^SubgroupProof]: Consider $P = (X:Y:T:Z)$ with $Z$ non-zero. Then exactly one of $P$, $P+A$, $P+E_1$, $P+E_2$ is in $G$. Consider first the case that $P$ is in $G$. Then there exists $Q = (X':Y':T':Z')$ with $P= Q + Q$. Since $Q-Q$ is not at infinity, our addition formulas as correct and we can write $X,Y,T,Z$ in terms of $X', Y', T', Z'$. This gives
$$Z^2 - aX^2 = (Z'^2-dT'^2)^2\bigl((Z'^2 + dT'^2)^2-a\cdot(2X'Y')^2\bigr) =\\
(Z'^2-dT'^2)^2\bigl( (aX'^2 + Y'^2)^2 - 4aX'^2Y'^2\bigr) = \\
(Z'^2-dT'^2)^2\bigl( (aX'^2 - Y'^2)^2\bigr)\ ,$$ which is a square.
If P+A is in $G$, then $Z^2 - aX^2$ is also a square, simply because $P+A = (-X:-Y:T:Z)$, so $X^2$ and $Z^2$ are the same for $P$ and $P+A$.
Conversely, for $(\tilde{X}:\tilde{Y}:\tilde{T}:\tilde{Z}) = P+E_1$, we have $P+E_1 = (Y:aX:\frac{\sqrt{ad}}{d}Z:\sqrt{ad}T)$.
Now, computing $\tilde{Z}^2-a\tilde{X}^2$ gives
$$\tilde{Z}^2-a\tilde{X}^2 = adT^2 - aY^2 = \frac{a}{Z^2}(dT^2Z^2 - Y^2Z^2) =\\
\frac{a}{Z^2}(dX^2Y^2 - Y^2Z^2) = \frac{-aY^2}{Z^2}(Z^2-dX^2)\ .$$
Now if $P+E_1$ is in $G$, then, by the above argument for the $P\in G$-case, the term $\tilde{Z}^2-a\tilde{X}^2$ is a square. Since $-a$ is a non-square, this means that $Z^2-dX^2$ is not a square, as it should be, as neither $P$ nor $P+A$ are in $G$. The case $P+E_2 = P+E_1+A$ is analogous.