owned this note
owned this note
Published
Linked with GitHub
# Nearly full-width variable-base scalar mul
[TOC]
ECC gadget: https://hackmd.io/KfD81XFQQqqUJI7ZoICk7w
Copied from https://github.com/zcash/zcash/issues/3924 (with some variable name changes):
> Consider the following algorithm:
>
> Acc := [2] T
> for i from n-1 down to 0 {
> U := r_i ? T : −T
> Acc := (Acc + U) + Acc
> }
>
> This computes $[2^{n+1} - (2^n - 1) + 2 \cdot r] T = [2^n + 1 + 2 \cdot r] T$.
>
> Suppose that we actually want to compute $[2^n + k]$ T, where $k < 2^{n+1}$ and $T \neq \mathcal{O}$. Without loss of generality, assume $k$ is odd (if it is even then add one to $k$ and subtract $T$ from the result). Let $k = 1 + 2 \cdot r$, and solve to give $r = (k - 1)/2$. Conveniently, this is equivalent to setting $r = k >> 1$ where $>>$ is the bitwise right-shift operator.
>
> So the full algorithm is:
>
> Acc := [2] T
> for i from n-1 down to 0 {
> U := k_{i+1} ? T : −T
> Acc := (Acc + U) + Acc
> }
> return (k_0 = 0) ? (Acc - T) : Acc
In the Orchard circuit we need to check $\mathsf{pk_d} = [\mathsf{ivk}] \mathsf{g_d}$ where $\mathsf{ivk} \in [0, p)$ and the scalar field is $\mathbb{F}_q$ with $p < q$.
We have $p = 2^{254} + t_p$ and $q = 2^{254} + t_q$, for $t_p, t_q < 2^{128}$.
We're trying to compute $[\alpha] T$ for $\alpha \in [0, q)$.
Set $k = t_q + \alpha$ as an integer and $n = 254$. Then we can compute
$\begin{align*}
[2^{254} + t_q + \alpha] T &= [q + \alpha] T \\
&= [\alpha] T
\end{align*}$
provided that $\alpha + t_q \in [0, 2^{n+1})$, i.e. $\alpha < 2^{n+1} - t_q$ which covers the whole range we need because in fact $p - 1 + t_q < 2^{255}$.
We will encounter a potential overflow problem when trying to decompose $\alpha + t_q$ into bits. This should be a decomposition in the integer range $[t_q, p + t_q)$; not $[0, p)$. We'll discuss this in more detail below.
We'll also find that we need to use complete addition in the last 3 steps. Continuing the analysis of incomplete addition:
> It remains to check that the x-coordinates of each pair of points to be added are distinct.
>
> When adding points in the large prime-order subgroup, we can rely on Theorem 3 from Appendix C of the [Halo paper](https://eprint.iacr.org/2019/1021.pdf), which says that if we have two such points with nonzero indices wrt a given odd-prime order base, where the indices taken in the range $-(q-1)/2..(q-1)/2$ are distinct disregarding sign, then they have different x-coordinates. This is helpful, because it is easier to reason about the indices of points occurring in the scalar multiplication algorithm than it is to reason about their x-coordinates directly.
> So, the required check is equivalent to saying that the following "indexed version" of the above algorithm never asserts:
>
> acc := 2
> for i from n-1 down to 0 {
> u = k_{i+1} ? 1 : −1
> assert acc ≠ ± u
> assert (acc + u) ≠ acc // X
> acc := (acc + u) + acc
> assert 0 < acc ≤ (q-1)/2
> }
> if k_0 = 0 {
> assert acc ≠ 1
> acc := acc - 1
> }
The maximum value of $acc$ is:
```
<---n 1s--->
1011111...1111
= 1100000...0000 - 1
```
= $2^{n+1} + 2^n - 1$
> The assertion labelled X obviously cannot fail because $u \neq 0$. It is possible to see that acc is monotonically increasing except in the last conditional. It reaches its largest value when $k$ is maximal, i.e. $2^{n+1} + 2^n - 1$.
So to entirely avoid exceptional cases in addition, we would need $2^{n+1} + 2^n - 1 < (q-1)/2$. But we can use $n$ larger by $c$ if the last $c$ iterations use complete addition.
The first $i$ for which the algorithm using **only** incomplete addition fails is going to be $252$, since $2^{252+1} + 2^{252} - 1 > (q - 1)/2$. We need $n = 254$ to make the wraparound technique above work.
> sage: q = 0x40000000000000000000000000000000224698fc0994a8dd8c46eb2100000001
sage: 2^253 + 2^252 - 1 < (q-1)//2
False
sage: 2^252 + 2^251 - 1 < (q-1)//2
True
So the last three iterations of the loop ($i = 2..0$) need to use complete addition, as does the conditional subtraction at the end. Writing this out using ⸭ for incomplete addition (as we do in the spec), we have:
Acc := [2] T
for i from 253 down to 3 {
U := k_{i+1} ? T : −T
Acc := (Acc ⸭ U) ⸭ Acc
}
for i from 2 down to 0 {
U := k_{i+1} ? T : −T
Acc := (Acc + U) + Acc // complete addition
}
return (k_0 = 0) ? (Acc + (-T)) : Acc // complete addition
## Algorithm with incomplete addition
Let $\mathbf{z}_i = \sum_{h=i}^{n} (\mathbf{k}_{h} \cdot 2^{h-i})$, where $n = 254$.
Then we have
$\begin{array}{rcl}\hspace{1.5em}
\mathbf{z}_{n+1} &=& 0 \\
\mathbf{z}_{i} &=& 2\cdot \mathbf{z}_{i+1} + \mathbf{k}_i, \text{ for } 0 \leq i \leq n \\
\mathbf{z}_0 &=& k \\
&=& \alpha + t_q \text{ (as an integer!)}
\end{array}$
Note that since we are witnessing the boolean decomposition of $k = \alpha + t_q$ in big-endian order, the corresponding running sum $\mathbf{z}_{i}$ will also be assigned from $i = n + 1$ down to $0$.
Let $k = \sum\limits_{h=0}^{254} \mathbf{k}_h \cdot 2^h$ as an integer.
## Overflow check for variable-base scalar multiplication
$\mathbf{z}_i$ cannot overflow for any $i \geq 1$, because it is a weighted sum of bits only up to $2^{n-1} = 2^{253}$, which is smaller than $p$ (and also $q$).
However, $\mathbf{z}_0 = \alpha + t_q$ *can* overflow $[0, p)$.
Since overflow can only occur in the final step that constrains $\mathbf{z}_0 = 2 \cdot \mathbf{z}_1 + \mathbf{k}_0$, we have $\mathbf{z}_0 = k \pmod{p}$. It is then sufficient to also check that $\mathbf{z}_0 = \alpha + t_q \pmod{p}$ (so that $k = \alpha + t_q \pmod{p}$) and that $k \in [t_q, p + t_q)$. These conditions together imply that $k = \alpha + t_q$ as an integer, and so $2^{254} + k = \alpha \pmod{q}$ as required.
> Note: the bits $\mathbf{k}_{254..0}$ do not represent a value reduced modulo $q$, but rather a representation of the unreduced $\alpha + t_q$.
### Optimized check for $k \in [t_q, p + t_q)$
Since $t_p + t_q < 2^{130}$, we have $[t_q, p + t_q) = [t_q, t_q + 2^{130}) \;\cup\; [2^{130}, 2^{254}) \;\cup\; \big([2^{254}, 2^{254} + 2^{130}) \;\cap\; [p + t_q - 2^{130}, p + t_q)\big).$
We may assume that $k = \alpha + t_q \pmod{p}$.
Therefore,
$\begin{array}{rcl}
k \in [t_q, p + t_q) &\Leftrightarrow& \big(k \in [t_q, t_q + 2^{130}) \;\vee\; k \in [2^{130}, 2^{254})\big) \;\vee\; \\
& & \big(k \in [2^{254}, 2^{254} + 2^{130}) \;\wedge\; k \in [p + t_q - 2^{130}, p + t_q)\big) \\
&\Leftrightarrow& \big(\mathbf{k}_{254} = 0 \implies (k \in [t_q, t_q + 2^{130}) \;\vee\; k \in [2^{130}, 2^{254}))\big) \;\wedge \\
& & \big(\mathbf{k}_{254} = 1 \implies (k \in [2^{254}, 2^{254} + 2^{130}) \;\wedge\; k \in [p + t_q - 2^{130}, p + t_q)\big) \\
&\Leftrightarrow& \big(\mathbf{k}_{254} = 0 \implies (\alpha \in [0, 2^{130}) \;\vee\; k \in [2^{130}, 2^{254})\big) \;\wedge \\
& & \big(\mathbf{k}_{254} = 1 \implies (k \in [2^{254}, 2^{254} + 2^{130}) \;\wedge\; (\alpha + 2^{130}) \bmod p \in [0, 2^{130}))\big) \;\;Ⓐ
\end{array}$
> Given $k \in [2^{254}, 2^{254} + 2^{130})$, we prove equivalence of $k \in [p + t_q - 2^{130}, p + t_q)$ and $(\alpha + 2^{130}) \bmod p \in [0, 2^{130})$ as follows:
> * shift the range by $2^{130} - p - t_q$ to give $k + 2^{130} - p - t_q \in [0, 2^{130})$;
> * observe that $k + 2^{130} - p - t_q$ is guaranteed to be in $[2^{130} - t_p - t_q, 2^{131} - t_p - t_q)$ and therefore cannot overflow or underflow modulo $p$;
> * using the fact that $k = \alpha + t_q \pmod{p}$, observe that $(k + 2^{130} - p - t_q) \bmod p = (\alpha + t_q + 2^{130} - p - t_q) \bmod p = (\alpha + 2^{130}) \bmod p$.
>
> (We can see in a different way that this is correct by observing that it checks whether $\alpha \bmod p \in [p - 2^{130}, p)$, so the upper bound is aligned as we would expect.)
Now, we can continue optimizing from $Ⓐ$:
$\begin{array}{rcl}
k \in [t_q, p + t_q) &\Leftrightarrow& \big(\mathbf{k}_{254} = 0 \implies (\alpha \in [0, 2^{130}) \;\vee\; k \in [2^{130}, 2^{254})\big) \;\wedge \\
& & \big(\mathbf{k}_{254} = 1 \implies (k \in [2^{254}, 2^{254} + 2^{130}) \;\wedge\; (\alpha + 2^{130}) \bmod p \in [0, 2^{130}))\big) \\
& & \big(\mathbf{k}_{254} = 0 \implies (\alpha \in [0, 2^{130}) \;\vee\; \mathbf{k}_{253..130} \text{ are not all } 0)\big) \;\wedge \\
& & \big(\mathbf{k}_{254} = 1 \implies (\mathbf{k}_{253..130} \text{ are all } 0 \;\wedge\; (\alpha + 2^{130}) \bmod p \in [0, 2^{130}))\big)
\end{array}$
Constraining $\mathbf{k}_{253..130}$ to be all-$0$ or not-all-$0$ can be implemented almost "for free", as follows.
Recall that $\mathbf{z}_i = \sum_{h=i}^{n} (\mathbf{k}_{h} \cdot 2^{h-i})$, so we have:
$\begin{array}{rcl}
\mathbf{z}_{130} &=& \sum_{h=130}^{254} (\mathbf{k}_h \cdot 2^{h-130}) \\
\mathbf{z}_{130} &=& \mathbf{k}_{254} \cdot 2^{254-130} + \sum_{h=130}^{253} (\mathbf{k}_h \cdot 2^{h-130}) \\
\mathbf{z}_{130} - \mathbf{k}_{254} \cdot 2^{124} &=& \sum_{h=130}^{253} (\mathbf{k}_h \cdot 2^{h-130})
\end{array}$
So $\mathbf{k}_{253..130}$ are all $0$ exactly when $\mathbf{z}_{130} = \mathbf{k}_{254} \cdot 2^{124}$.
Finally, we can merge the $130$-bit decompositions for the $\mathbf{k}_{254} = 0$ and $\mathbf{k}_{254} = 1$ cases by checking that $(\alpha + \mathbf{k}_{254} \cdot 2^{130}) \bmod p \in [0, 2^{130})$.
### Overflow check constraints
Let $s = \alpha + \mathbf{k}_{254} \cdot 2^{130}$. The constraints for the overflow check are:
$\mathbf{z}_0 = \alpha + t_q \pmod{p}$
$\mathbf{k}_{254} = 1 \implies \big(\mathbf{z}_{130} = 2^{124} \;\wedge\; s \bmod p \in [0, 2^{130})\big)$.
$\mathbf{k}_{254} = 0 \implies \big(\mathbf{z}_{130} \neq 0 \;\vee\; s \bmod p \in [0, 2^{130})\big)$.
Define $\mathsf{inv0}(x) = \begin{cases} 0, &\text{if } x = 0 \\ 1/x, &\text{otherwise.} \end{cases}$
Witness $\eta = \mathsf{inv0}(\mathbf{z}_{130})$, and decompose $s \bmod p$ as $\mathbf{s}_{129..0}$.
Then the needed gates are:
$s = \alpha + \mathbf{k}_{254} \cdot 2^{130}$
$\mathbf{z}_0 = \alpha + t_q \pmod{p}$
$\mathbf{k}_{254} \cdot (\mathbf{z}_{130} - 2^{124}) = 0$
$\mathbf{k}_{254} \cdot (s - \sum\limits_{i=0}^{129} 2^i \cdot \mathbf{s}_i) = 0$
$(1 - \mathbf{k}_{254}) \cdot (1 - \mathbf{z}_{130} \cdot \eta) \cdot (s - \sum\limits_{i=0}^{129} 2^i \cdot \mathbf{s}_i) = 0$
where $\sum\limits_{i=0}^{129} 2^i \cdot \mathbf{s}_i$ can be computed by another running sum. [TODO: be explicit about how to do that.]
## Main constraint system
let $(x_T, y_T) = T$.
$\mathbf{z}_{255} = 0$
$A_{254} = [2] T$
for $i$ from $254$ down to $4$:
$(\mathbf{k}_i)(\mathbf{k}_i-1) = 0$
$\mathbf{z}_{i} = 2\mathbf{z}_{i+1} + \mathbf{k}_{i}$
$x_{U,i} = x_T$
$y_{U,i} = (2 \mathbf{k}_i - 1) \cdot y_T$ # conditionally negate
$\lambda_{1,i} \cdot (x_{A,i} - x_{U,i}) = y_{A,i} - y_{U,i}$
$\lambda_{1,i}^2 = x_{R,i} + x_{A,i} + x_{U,i}$
$(\lambda_{1,i} + \lambda_{2,i}) \cdot (x_{A,i} - x_{R,i}) = 2 y_{\mathsf{A},i}$
$\lambda_{2,i}^2 = x_{A,i-1} + x_{R,i} + x_{A,i}$
$\lambda_{2,i} \cdot (x_{A,i} - x_{A,i-1}) = y_{A,i} + y_{A,i-1}$
TODO: do the final three bits
Output $(x_{A,0}, y_{A,0})$
After substitution of $y_{P,i}$, $x_{R,i}$, $y_{A,i}$, and $y_{A,i+1}$, this becomes:
$A_{254} = [2] T$
for $i$ from $254$ down to $4$:
// let $\mathbf{k}_i = \mathbf{z}_{i+1} - 2\mathbf{z}_i$
// let $x_{R,i} = (\lambda_{1,i}^2 - x_{A,i} - x_T)$
// let $y_{A,i} = \frac{(\lambda_{1,i} + \lambda_{2,i}) \cdot (x_{A,i} - (\lambda_{1,i}^2 - x_{A,i} - x_T))\hspace{2em}}{2}$
$(\mathbf{z}_{i+1} - 2\mathbf{z}_i)(\mathbf{z}_{i+1} - 2\mathbf{z}_i - 1) = 0$
$\lambda_{1,i} \cdot (x_{A,i} - x_T) = \frac{(\lambda_{1,i} + \lambda_{2,i}) \cdot (x_{A,i} - (\lambda_{1,i}^2 - x_{A,i} - x_T))\hspace{2em}}{2} - (2 \cdot (\mathbf{z}_{i+1} - 2\mathbf{z}_i) - 1) \cdot y_T$
$\lambda_{2,i}^2 = x_{A,i-1} + (\lambda_{1,i}^2 - x_{A,i} - x_T) + x_{A,i}$
if $i > 3$ then $2 \cdot \lambda_{2,i} \cdot (x_{A,i} - x_{A,i-1}) =$
$(\lambda_{1,i} + \lambda_{2,i}) \cdot (x_{A,i} - (\lambda_{1,i}^2 - x_{A,i} - x_T)) +$
$(\lambda_{1,i-1} + \lambda_{2,i-1}) \cdot (x_{A,i-1} - (\lambda_{1,i-1}^2 - x_{A,i-1} - x_T))$
$\lambda_{2,3} \cdot (x_{A,3} - x_{A,2}) = \frac{(\lambda_{1,3} + \lambda_{2,3}) \cdot (x_{A,3} - (\lambda_{1,3}^2 - x_{A,3} - x_T))\hspace{2em}}{2} + y_{A,2}$
TODO: do the final three bits
Output $(x_{A,0}, y_{A,0})$
$(x_T, y_T, \lambda_1, \lambda_2, x_{A,i}, \mathbf{z}_i)$
### Incomplete addition gate:
Let $n = 254$
\begin{array}{|c|c|c|c|c|c|c|c|c|c|}
\hline
x_P & y_P & z^{hi} & x_A^{hi} & \lambda_1^{hi} & \lambda_2^{hi} & z^{lo} & x_A^{lo} & \lambda_1^{lo} & \lambda_2^{lo} \\\hline
& & 0 & 2[T]_x & & & \mathbf{z}_{129} & x_{A,129} & & \\\hline
x_P & y_P & \mathbf{z}_{254} & x_{A,254} & \lambda_{1,254} & \lambda_{2,254} & \mathbf{z}_{128} & x_{A,128} & \lambda_{1,128} & \lambda_{2,128} \\\hline
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\\hline
x_P & y_P & \mathbf{z}_{130} & x_{A,130} & \lambda_{1,130} & \lambda_{2,130} & \mathbf{z}_4 & x_{A,4} & \lambda_{1,4} & \lambda_{2,4} \\\hline
x_P & y_P & \mathbf{z}_{129} & x_{A,129} & \lambda_{1,129} & \lambda_{2,129} & \mathbf{z}_3 & x_{A,3} & \lambda_{1,3} & \lambda_{2,3} \\\hline
\end{array}
\begin{array}{|c|l|}
\hline
\text{Degree} & \text{Constraint} \\\hline
2 & q_{ECC,i} \cdot (\mathbf{z}_{i+1} - 2\mathbf{z}_i - \mathbf{k}_i) = 0 \\\hline
3 & q_{ECC,i} \cdot \mathbf{k}_i \cdot (\mathbf{k}_i - 1) = 0 \\\hline
4 & q_{ECC,i} \cdot \left(\lambda_{1,i} \cdot (x_{A,i} - x_{P,i}) - y_{A,i} + (2\mathbf{k}_i - 1) \cdot y_{P,i}\right) = 0 \\\hline
4 & q_{ECC,i} \cdot \left((\lambda_{1,i} + \lambda_{2,i}) \cdot (x_{A,i} - (\lambda_{1,i}^2 - x_{A,i} - x_{P,i})) - 2 y_{A,i}\right) = 0 \\\hline
3 & q_{ECC,i} \cdot \left(\lambda_{2,i}^2 - x_{A,i+1} - (\lambda_{1,i}^2 - x_{A,i} - x_{P,i}) - x_{A,i}\right) = 0 \\\hline
3 & q_{ECC,i} \cdot \left(\lambda_{2,i} \cdot (x_{A,i} - x_{A,i+1}) - y_{A,i} - y_{A,i+1}\right) = 0 \\\hline
2 & q_{ECC,i} \cdot \left(x_{P,i} - x_{P,i-1}\right) = 0 \\\hline
2 & q_{ECC,i} \cdot \left(y_{P,i} - y_{P,i-1}\right) = 0 \\\hline
\end{array}
## Implementing complete addition
$\begin{array}{rcll}
\mathcal{O} &+& \mathcal{O} &= \mathcal{O} ✓\\
\mathcal{O} &+& (x_q, y_q) &= (x_q, y_q) ✓\\
(x_p, y_p) &+& \mathcal{O} &= (x_p, y_p) ✓\\
(x, y) &+& (x, y) &= [2] (x, y) ✓\\
(x, y) &+& (x, -y) &= \mathcal{O} ✓\\
(x_p, y_p) &+& (x_q, y_q) &= (x_p, y_p) \;⸭\; (x_q, y_q), \text{if } x_p \neq x_q ✓
\end{array}$
Suppose that we represent $\mathcal{O}$ as $(0, 0)$. ($0$ is not an $x$-coordinate of a valid point because we would need $y^2 = x^3 + 5$, and $5$ is not square in $\mathbb{F}_q$. Also $0$ is not a $y$-coordinate of a valid point because $-5$ is not a cube in $\mathbb{F}_q$.)
$$
\begin{aligned}
P + Q &= R\\
(x_p, y_p) + (x_q, y_q) &= (x_r, y_r) \\
\lambda &= \frac{y_q - y_p}{x_q - x_p} \\
x_r &= \lambda^2 - x_p - x_q \\
y_r &= \lambda(x_p - x_r) - y_p
\end{aligned}
$$
For the doubling case, $\lambda$ has to instead be computed as $\frac{3x^2}{2y}$.
Recall that $\mathsf{inv0}(x) = \begin{cases} 0, &\text{if } x = 0 \\ 1/x, &\text{otherwise.} \end{cases}$
Witness $\alpha, \beta, \gamma, \delta, \lambda$ where:
* $\alpha = \mathsf{inv0}(x_q - x_p)$
* $\beta = \mathsf{inv0}(x_p)$
* $\gamma = \mathsf{inv0}(x_q)$
* $\delta = \begin{cases}
\mathsf{inv0}(y_q + y_p), &\text{if } x_q = x_p \\
0, &\text{otherwise}
\end{cases}$
* $\lambda = \begin{cases}
\frac{y_q - y_p}{x_q - x_p}, &\text{if } x_q \neq x_p \\[0.5ex]
\frac{3{x_p}^2}{2y_p} &\text{if } x_q = x_p \wedge y_p \neq 0 \\[0.5ex]
0, &\text{otherwise.}
\end{cases}$
### Constraints
\begin{array}{|c|rcl|l|}
\hline
\text{Degree} & \text{Constraint}\hspace{7em} &&& \text{Meaning} \\\hline
4 & q_\mathit{add} \cdot (x_q - x_p) \cdot ((x_q - x_p) \cdot \lambda - (y_q - y_p)) &=& 0 & x_q \neq x_p \implies \lambda = \frac{y_q - y_p}{x_q - x_p} \\\hline
5 & q_\mathit{add} \cdot (1 - (x_q - x_p) \cdot \alpha) \cdot \left(2y_p \cdot \lambda - 3{x_p}^2\right) &=& 0 & \begin{cases} x_q = x_p \wedge y_p \neq 0 \implies \lambda = \frac{3{x_p}^2}{2y_p} \\ x_q = x_p \wedge y_p = 0 \implies x_p = 0 \end{cases} \\\hline
6 & q_\mathit{add} \cdot x_p \cdot x_q \cdot (x_q - x_p) \cdot (\lambda^2 - x_p - x_q - x_r) &=& 0 & x_p \neq 0 \wedge x_q \neq 0 \wedge x_q \neq x_p \implies x_r = \lambda^2 - x_p - x_q \\
6 & q_\mathit{add} \cdot x_p \cdot x_q \cdot (x_q - x_p) \cdot (\lambda \cdot (x_p - x_r) - y_p - y_r) &=& 0 & x_p \neq 0 \wedge x_q \neq 0 \wedge x_q \neq x_p \implies y_r = \lambda \cdot (x_p - x_r) - y_p \\
6 & q_\mathit{add} \cdot x_p \cdot x_q \cdot (y_q + y_p) \cdot (\lambda^2 - x_p - x_q - x_r) &=& 0 & x_p \neq 0 \wedge x_q \neq 0 \wedge y_q \neq -y_p \implies x_r = \lambda^2 - x_p - x_q \\
6 & q_\mathit{add} \cdot x_p \cdot x_q \cdot (y_q + y_p) \cdot (\lambda \cdot (x_p - x_r) - y_p - y_r) &=& 0 & x_p \neq 0 \wedge x_q \neq 0 \wedge y_q \neq -y_p \implies y_r = \lambda \cdot (x_p - x_r) - y_p \\\hline
4 & q_\mathit{add} \cdot (1 - x_p \cdot \beta) \cdot (x_r - x_q) &=& 0 & x_p = 0 \implies x_r = x_q \\
4 & q_\mathit{add} \cdot (1 - x_p \cdot \beta) \cdot (y_r - y_q) &=& 0 & x_p = 0 \implies y_r = y_q \\\hline
4 & q_\mathit{add} \cdot (1 - x_q \cdot \gamma) \cdot (x_r - x_p) &=& 0 & x_q = 0 \implies x_r = x_p \\
4 & q_\mathit{add} \cdot (1 - x_q \cdot \gamma) \cdot (y_r - y_p) &=& 0 & x_q = 0 \implies y_r = y_p \\\hline
4 & q_\mathit{add} \cdot (1 - (x_q - x_p) \cdot \alpha - (y_q + y_p) \cdot \delta) \cdot x_r &=& 0 & x_q = x_p \wedge y_q = -y_p \implies x_r = 0 \\
4 & q_\mathit{add} \cdot (1 - (x_q - x_p) \cdot \alpha - (y_q + y_p) \cdot \delta) \cdot y_r &=& 0 & x_q = x_p \wedge y_q = -y_p \implies y_r = 0 \\\hline
\end{array}
Max degree: 6
\begin{array}{|c|c|c|c|c|c|}
\hline
x_p & y_p & x_q & y_q & \lambda & \\\hline
\alpha & \beta & x_r & y_r & \gamma & \delta \\\hline
\end{array}
or
\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline
x_p & y_p & x_q & y_q & x_r & y_r & \lambda & \alpha & \beta & \gamma & \delta \\\hline
\end{array}
or
\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline
x_p & y_p & x_q & y_q & \lambda & \alpha & \beta & \gamma & \delta \\\hline
& & x_r & y_r & & & & & \\\hline
\end{array}
-----
### Old version (buggy! 🐛)
Witness $\lambda, \alpha, \beta, \gamma, \delta, A, B, C, D$.
\begin{array}{rcl|l}
&&& Meaning \\\hline
A \cdot (1-A) &=& 0 & A \in \mathbb{B} \\
B \cdot (1-B) &=& 0 & B \in \mathbb{B} \\
C \cdot (1-C) &=& 0 & C \in \mathbb{B} \\
D \cdot (1-D) &=& 0 & D \in \mathbb{B} \\
(x_q - x_p) \cdot \alpha &=& 1-A & x_q = x_p \implies A \\
x_p \cdot \beta &=& 1-B & x_p = 0 \implies B \\
B \cdot x_p &=& 0 & B \implies x_p = 0 \\
x_q \cdot \gamma &=& 1-C & x_q = 0 \implies C \\
C \cdot x_q &=& 0 & C \implies x_q = 0 \\
(y_q + y_p) \cdot \delta &=& 1-D & y_q = -y_p \implies D \\
(x_q - x_p) \cdot ((x_q - x_p) \cdot \lambda - (y_q - y_p)) &=& 0 & x_q \neq x_p \implies \lambda = \frac{y_q - y_p}{x_q - x_p} \\
A \cdot \left(2y_p \cdot \lambda - 3{x_p}^2\right) &=& 0 & A \wedge y_p \neq 0 \implies \lambda = \frac{3{x_p}^2}{2y_p} \\
(1-B) \cdot (1-C) \cdot (\lambda^2 - x_p - x_q - x_r) + B \cdot (x_r - x_q) &=& 0 & (¬B \wedge ¬C \implies x_r = \lambda^2 - x_p - x_q) \wedge (B \implies x_r = x_q) \\
\textsf{^ do we need (1-D) as well? (Ying Tong)} &\\
(1-B) \cdot (1-C) \cdot (\lambda \cdot (x_p - x_r) - y_p - y_r) + B \cdot (y_r - y_q) &=& 0 & (¬B \wedge ¬C \implies y_r = \lambda \cdot (x_p - x_r) - y_p) \wedge (B \implies y_r = y_q) \\
\textsf{^ do we need (1-D) as well? (Ying Tong)} &\\
C \cdot (x_r - x_p) &=& 0 & C \implies x_r = x_p \\
C \cdot (y_r - y_p) &=& 0 & C \implies y_r = y_p \\
D \cdot x_r &=& 0 & D \implies x_r = 0 \\
D \cdot y_r &=& 0 & D \implies y_r = 0 \\
\end{array}
Max degree: 4
\begin{array}{|c|c|c|c|c|c|c|c|c|c|}
\hline
x_P & y_P & x_A & y_A & a & b & c & d & \lambda \\\hline
x_p & y_p & x_q & y_q & A & B & C & D & \lambda \\\hline
& & x_r & y_r & \alpha & \beta & \gamma & \delta & \\\hline
\end{array}
#### How was this buggy?
As well as the missing $(1-D)$ that Ying Tong points out, this version was incorrectly assuming that the case of adding a point to its negation $(D)$ occurs exactly when $y_q = -y_p$. But this is insufficient because there are multiple points (in fact 3 points, related by the cubic endomorphism) that have the same $y$ but different $x$. Hence the last two constraints above, intended only to cover $(x, y) + (x, -y)$, would also have been incorrectly applied to $(x, y) + (\zeta x, -y)$ and $(x, y) + (\zeta^2 x, -y)$, where $\zeta$ is a cube root of unity in $\mathbb{F}_p$.