owned this note
owned this note
Published
Linked with GitHub
# ECC operation optimizations
Suppose the scalar field of the curve
$$y^2 = x^3 + b$$
is $F_q$ where $q$ has $B$ bits and the baby Jubjub prime is $p$ which is slightly less 254-bit. Choose $n$ and $k$ so that
* $nk >= B$
* $3n + log_2(2k^2) + \epsilon < 254$
where $\epsilon$ is a small constant.
We represent BigInts with $k$ registers of $n$ bits each. For register size $X = 2^n$, define addition and multiplication without carries by:
* $\sum_i a_i X^i +' \sum_i b_i X^i = \sum_i c_i X^i$ with $c_i = a_i + b_i$
* $\sum_i a_i X^i *' \sum_i b_i X^i = \sum_i d_i X^i$ with $d_i = \sum_{j = 0}^i a_j * b_{i - j}$
### Tracking overflows
Define $n_{z_i} = \lceil log_2(z_i) \rceil$ for a signal $x$. This function represents the number of bits needed to represent $x$.
For an array $z = \sum_i z_i X^i$, define $n_z = \max_i (n_{z_i})$. Tracking $n_z$ allows us to ensure that we are never overflowing the maximum theoretical register size of 253 bits, even when representing numbers in overflow representation.
Additionally, define $k_z$ to be the number of such $n_z$-bit registers used to represent $k$.
Suppose that $c = a +' b$. Then $n_c \leq \max(n_a, n_b) + 1$, and $k_c = \max(k_a, k_b)$.
Now suppose that $d = a *' b$. Then $n_d \leq n_a + n_b + \lceil log_2(\min(k_a, k_b)) \rceil$, and $k_d = k_a + k_b - 1$.
## AddUnequal
For input points $(x_1, y_1)$ and $(x_2, y_2)$ with sum $(x_3, y_3)$, we verify
1. $x_3 = \lambda^2 - x_2 - x_1$ where $\lambda = \frac{y_2-y_1}{x_2-x_1}$
- $(x_1 + x_2 + x_3)(x_2-x_1)^2 - (y_2-y_1)^2 = 0$
2. $\frac{y_3 + y_1}{x_3 - x_1} = \frac{y_3 + y_2}{x_3 - x_2}$, which is equivalent to:
- $y_1 * x_3 + (y_3 + y_2) * x_1 = y_2 * x_3 + (y_3 + y_1) * x_2$
4. $x_3, y_3$ are 4x64 bit numbers
To verify these statement, we do the following:
* Evaluate both sides using $+'$ and $*'$ so that no carrying takes place.
* We now wish to verify the identity $\sum_i a_i X^i = \sum_i b_i X^i$ mod $q$ where $0 \leq a_i, b_i < C_1X^3$ for a small constant $C_1$
* ~~Find a multiple of $q$ of the form $t = \sum_i t_i X^i$ where $C_1X^3 < t_i < p - C_1 X^3$, where $p$ is the babyjubjub prime.~~ no longer needed because of negative carries
* Verify that there is a bigint $r = \sum_i r_i X^i$ for which
$$Carry(\sum_i (a_i + t_i - b_i) X^i) = Carry(\sum_i r_i X^i *' \sum_i q_i X^i)$$
Possibly the last two carries can be performed in a mixed fashion.
- is it possible to do away with $t$ by just adapting our carry algorithm to work with negative registers? Then we'd only need to do one carry
- Is checking Carry(x) = Carry(y) same as checking Carry(x-y) so we only need to check Carry(x) == 0? [Check carry to zero implementation](https://zkrepl.dev/?gist=e168291f275526db7511dc2d049f7219)
- constrain $x_3$ and $y_3$ to be 64 bits per register and also in $[0,p-1]$ by rangechecking that all registers are in $[0,2^{64}-1]$ and if upper registers are all equal to $2^{64}-1$ then checking lowest register is in $[0, 2^{64}-2^{32} - 2^9 -2^8 - 2^7 - 2^6 -2^4 -1 -1]$. [Check range implementation](https://zkrepl.dev/?gist=86d694a9583cc18db77a01a82bcaf884)
## Double
For input points $(x_1, y_1)$ with double $(x_3, y_3)$, we verify
1. $y_3^2 = x_3^3 + b$
2. $\frac{- y_3 - y_1}{x_3 - x_1} = \frac{3 x_1^2}{2y_1}$, which is equivalent to:
- $-2y_1\cdot (y_3+y_1) = 3x_1^2\cdot(x_3-x_1)$
3. $(x_3,y_3) \neq (x_1,-y_1)$
- only need to do $x_3 \neq x_1$
4. $x_3, y_3$ are 4x64 bit numbers and in the range [0, p-1]
### [ignore] Picking $n$, $k$, $t$
How do we find $t$ and pick our register sizes? We'll have to do some tracking. Assume that $x_1, y_1, x_2, y_2, x_3, y_3, b$ all have $k$ registers of $n$ bits.
In the first equation:
- $k_{LHS} \leq 2k-1$
- $n_{LHS} \leq 2n + \lceil log(k) \rceil$.
- $k_{RHS} \leq 3k-2$
- $n_{RHS} \leq 3n + \lceil log(k) \rceil + \lceil log(2k-1) \rceil + 1 \leq 3n + 2 \lceil log(k) \rceil + 2$
In the second equation:
- $k_{LHS} \leq 2k-1$
- $n_{RHS} \leq 2n + \lceil log(k) \rceil + 2$
- $k_{LHS} \leq 2k-1$
- $n_{RHS} \leq 2n + \lceil log(k) \rceil + 2$
All of the $n$ values should be less than 254. Simultaneously, $nk$ must be at least 256. Therefore $k$ should be at least 4. So we arrive at $n=64$, $k=4$.
Now it is the case that we must pick a $t$ such that all $t_i$ are large enough to ensure that $a_i + t_i - b_i \geq 0$ for all $i$. Since $a_i - b_i$ can be as low as $-2^{3n + 2\lceil log(k) \rceil + 2} \geq -2^{192+4+2} = -2^{198}$, we should pick $t$ such that all values of $t_i \geq 2^{198}$.
We also have an upper bound on $t$. Note that $a_i + t_i - b_i < p$, so we should pick $t < p - 2^{198}$ as well.
So now we must find an array $t_0, t_1, \dots, t_9$ (length 10, since our max number of registers is $3k-2 = 10$) such that:
- $2^{198} \leq t_i < p - 2^{198}$
- $q | \sum t_i X^i$ for $X = 2^{64}$
## Prime trick
Consider $X = 2^{64}$ and
$$S = \sum_{i=0}^9 a_i X^i$$
Where each of the $a_i$ is up to $192+\epsilon$ bits (probably fine to say $\epsilon \leq 6$).
We wish to find some $S'$ that is congruent to $S$ modulo $p$ (secp256k1 prime) with few registers.
The most $p = 2^{256} - k$ trick is:
$$2^{256} * Z \equiv k * Z \mod p$$
Since the secp256k1 prime has $k = 2^{256} - 2^{32} - \delta$ for some small $\delta \approx 2^{10}$, we can write the following congruence:
$$S$$
$$\equiv a_9X^9 + a_8X^8 + a_7X^7 + a_6X^6 + a_5X^5 + a_4X^4 + a_3X^3 + a_2X^2 + a_1X + a_0$$
$$\equiv (2^{64}+2^{33}\delta + \delta^2)a_9X + (2^{64}+2^{33}\delta + \delta^2)a_8 + (2^{32}+\delta)a_7X^3 + (2^{32}+\delta)a_6X^2 + (2^{32}+\delta)a_5X + (2^{32}+\delta)a_4 + a_3X^3 + a_2X^2 + a_1X + a_0$$
Since $(2^{32}+\delta)^2 = 2^{64} + 2^{33}\delta + \delta^2$.
Let $C_i$ be the coefficient of $X^i$ after terms are accumulated in this last equation. We have:
$$C_3 = (2^{32}+\delta)a_7 + a_3$$
$$C_2 = (2^{32}+\delta)a_6 + a_2$$
$$C_1 = (2^{64}+2^{33}\delta+\delta^2)a_9 + (2^{32}+\delta)a_5 + a_1$$
$$C_0 = (2^{64}+2^{33}\delta+\delta^2)a_8 + (2^{32}+\delta)a_4 + a_0$$
Let's replace $2^{64}$ with $X$ in the expressions for $C_1$ and $C_0$ and re-accumulate:
$$C_3' = (2^{32}+\delta)a_7 + a_3$$
$$C_2' = (2^{32}+\delta)a_6 + a_2 + a_9$$
$$C_1' = (2^{33}\delta+\delta^2)a_9 + (2^{32}+\delta)a_5 + a_1 + a_8$$
$$C_0' = (2^{33}\delta+\delta^2)a_8 + (2^{32}+\delta)a_4 + a_0$$
We add an additional 44 bits of overflow at most to the registers (the worst coefficient of an $a_i$ involved in a sum is $2^{33}\delta$). Since our $a_i$ are at most 200 bits approximately, the $C_i$ do not overflow the babyjubjub prime (253 bits). So now we have a representation $S' = C_3'X^3 + C_2'X^2 + C_1'X + C_0'$ of a number that is congruent to $S$, such that all of the $C_i'$ are at most 244 bits. After subtracting off some multiple of $p$ we can check that after carries this is equivalent to $0$ in $244 * 4 \approx 1000$ constraints.