ZIP 2005 Spendability Proof Attempt

# ZIP 2005 Spendability Proof Attempt <center>Taylor Hornby 2026-01-31 </center> This is an attempt at patching up the [Spendability argument in ZIP 2005](https://zips.z.cash/zip-2005#securityargumentforspendability). The current argument, as of 2026-01-31 [relies on a false assumption](https://github.com/zcash/zips/issues/1150) that $f$ behaves like a random oracle. Here, we attempt to remove that assumption and lay out a complete argument for the case `split_flag=false, use_qsk=false`. Our argument is still in the ROM, so it still remains to: - consider quantum adversaries - ideally not use the ROM - consider cases other than `split_flag=false` and `use_qsk=false`. - check if this argument is correct (**I've likely made serious errors/omissions!**) The way this argument works is essentially by reduction: we will show that if there exists an adversary who can break the nullifier binding property of the protocol specified in ZIP 2005 (which we call "QRP") where... $nf = Extract([H^{rcm}(rseed, leadByte, noterepr) + f(rseed, leadByte, noterepr) + g(nk, rho, rseed)]R)$ ...and $H^{rcm}$ is a random oracle, then there is an adversary who can break the binding property of the nullifier in a different protocol QRP* where the nullifier is computed as... $nf = Extract([H^{new2}(rseed, leadByte, noterepr, sk)]R)$ ...where $H^{new2}$ is a random oracle. Since QRP*'s nullifier obviously binds $rseed$, $leadByte$, $noterepr$, and $sk$, and since $noterepr$ contains $\rho$, then QRP's nullifier must bind all of its inputs $(rseed, leadByte, noterepr, \rho, nk = H^{nk}(sk))$ as well. There is one important caveat: the reduction involves computing discrete logs. ### Adversary's Inputs & Recovery Circuit Constraints **In the case where split_flag=false and use_qsk=false**, the adversary can control $\rho$, $g_d$, split_flag, rsplit, is_internal_rivk, use_qsk, and $sk$. The recovery circuit ensures that the adversary knows auxilliary inputs such that the following constraints (among others) hold: - $ak^{P} = [H^{ask}(sk)]G^{orchard}$, $nk = H^{nk}(sk)$, $rivk_{ext} = H^{rivk}_{legacy}(sk)$ - $rivk \in \{rivk_{ext}, H^{rivk_{int}}_{rivk_{ext}}(ak, nk)\}$ - $ak = Extract(ak^P)$, $ivk = Commit^{ivk}_{rivk}(ak, nk)$, $pk_d = [ivk]g_d$. - $nf = Extract([H^{rcm}(rseed, leadByte, noterepr) + f(rseed, leadByte, noterepr) + g(nk, rho, rseed)]R)$ where $f$ and $g$ are defined based on discrete log relations as discussed in ZIP 2005. Note that this is not the complete set of constraints, but it's all we need for the following proof of nullifier binding to work. This is also not how the actual protocol computes $nf$, but it is equivalent. ### Proof We start with the game in which an adversary tries to break the nullifier binding property of QRP where all of the hash functions are random oracles. Then we modify the random oracles in such a way that they are still indistinguishable from random oracles, so that when these new oracles are used, the adversary must still be successful at breaking nullifier binding. But the new oracles are designed so that when this is done, the adversary is breaking the nullifier binding property of QRP*. At the start of time, we initialize a list $SKs = []$. Then, the oracle for each hash function that takes an $sk$ as an input is modified to append its $sk$ input to the list $SKs$. Next, we modify the $H^{rcm}$ oracle to work as follows: ``` R := [] L := [] BAD_INPUTS := [] H^rcm(rseed, leadByte, noterepr): (g_d, pk_d, v, rho, psi) := noterepr sks := { sk in SKs | pk_d derives from sk and g_d for any value of is_internal_rivk } if |sks| == 0 BAD_INPUTS = BAD_INPUTS ∪ {(rseed, leadByte, noterepr)} sk := ⊥ offset := 0 else if |sks| == 1 if (rseed, leadByte, noterepr) ∈ BAD_INPUTS bad event! sk := the sk ∈ sks nk := H^nk(sk) // here's where the reduction needs to compute discrete logs offset := -f(rseed, leadByte, noterepr) - g(nk, rho, rseed) else (|sks| >= 2) bad event! if L[rseed, leadByte, noterepr, sk] is undefined R[rseed, leadByte, noterepr, sk] <- random sample L[rseed, leadByte, noterepr, sk] <- R[rseed, leadByte, noterepr, sk] + offset return L[rseed, leadByte, noterepr, sk] ``` We need to argue that the modified random oracles are indistinguishable from actual random oracles. The ones that just record their $sk$ inputs are obviously indistinguishable. The modified $H^{rcm}$'s output is uniformly distributed; it can be distinguished only if at least one of the bad events occur: - If the $|sks| \geq 2$ bad event in $H^{rcm}$ is triggered, the adversary has found two different spending keys for the same address, meaning they can violate Balance by spending the same note with two different nullifiers, so we can assume that doesn't happen. - If the adversary can arrange to call $H^{rcm}$ on some input that first goes down the $|sks| == 0$ branch and then later call it on the same input (after adding more $sk$s to $SKs$) and it goes down the $|sks| == 1$ branch. When this happens, the adversary must provide an $(g_d, pk_d)$ pair to the input of $H^{rcm}$ before a corresponding $sk$ has been passed to any oracle. Each $(g_d, pk_d)$ uniquely determines $ivk$, so under the heuristic that $ivk$ is essentially a random function of $sk$, this is a multi-target preimage search. If the adversary makes $N$ oracle queries in total, there are at most $N$ targets, and so the success probability of each attempt is at most $2*2N/(r_P−1)$, which is secure against classical and Grover search, as argued in the Informal Security Argument section of the ZIP. (The extra factor of two is coming from both possible values of is_internal_rivk). For classical dlog-breaking adversaries, since there are at most $N$ attempts, the probability is at most $4N^2/(r_P-1)$ which is negligible. That $ivk$ is a random function of $sk$ is justified by the constraints listed above: $rivk$ is always some sort of RO hash involving $sk$, as are the other two inputs to $Commit^{ivk}$ $ak$ and $nk$. So, aside from these cases which have negligible probability for a bounded dlog-breaking adversary, no bounded dlog-breaking adversary can distinguish between our modified oracles and the original random oracles. Now suppose there is an adversary $A$ who violates spendability for QRP by breaking the binding property of the nullifier with nonnegligible probability. When we swap out the actual random oracles for our modified ones, the adversary must still break nullifier binding with nonnegligible probability, or else we have a distinguisher for the oracles. When $A$ succeeds in their attack, they must output two different openings of the same nullifier, i.e. they output $X = (rseed_x, leadByte_x, noterepr_x)$ and $Y = (rseed_y, leadByte_y, noterepr_y)$ such that $X \neq Y$, the nullifiers derived from $X$ and $Y$ are both the same value $nf$, and they know auxiliary values $X_{aux} = (..., sk_x, ...)$ and $Y_{aux} = (..., sk_y, ...)$ which cause the QRP constraint checks to pass. Suppose for the sake of contradiction that $A$ has succeeded and at the end of their attack, either $X \in BAD\_INPUTS$ or $Y \in BAD\_INPUTS$. Then since the constraint checks on $(X, X_{aux})$ and $(Y, Y_{aux})$, pass, it must be the case that the address in $noterepr_X$ derives from $sk_x$ and the address in $noterepr_Y$ derives from $sk_y$. We could therefore modify $A$ to call $H^{nk}$ with $sk_x$ and $sk_y$ to store them in $SKs$ (if they aren't already) then call $H^{rcm}$ again on $X$ and $Y$. Now, each of these calls must go down the $|sks| == 1$ or $|sks| \geq 2$ branches. Since $X \in BAD\_INPUTS$ or $Y \in BAD\_INPUTS$, this must result in one of the bad events being triggered, which we argued above has negligible probability. So with all but negligible probability, $X \notin BAD\_INPUTS$ and $Y \notin BAD\_INPUTS$. Now, notice that when we use the modified oracles, the nullifier calculation simplifies to... ``` nf = Extract([H^{rcm}(rseed, leadByte, noterepr) + f(rseed, leadByte, noterepr) + g(nk, rho, rseed)]R) = Extract([H^{new}(rseed, leadByte, noterepr) - f(rseed, leadByte, noterepr) - g(nk, rho, rseed) + f(rseed, leadByte, noterepr) + g(nk, rho, rseed)]R) = Extract([H^{new}(rseed, leadByte, noterepr)]R) ``` ...where $H^{new}$ is the oracle defined by the $R$ table in $H^{rcm}$. It remains to argue that this is actually binding of $sk$ as well. Since $X \notin BAD\_INPUTS$ and $Y \notin BAD\_INPUTS$, $sk \neq \bot$, where $sk$ is the internal variable for $H^{new}$ in any call that produced $nf$. Inspecting $H^{new}$ reveals that it's just a lazy random dictionary for the input $(rseed, leadByte, noterepr, sk)$. This means that the nullifier calculation is equivalent to... $nf = Extract([H^{new2}(rseed, leadByte, noterepr, sk)]R)$ ...where $H^{new2}$ is a random oracle defined by the same $R$ table, except we have argued that we can consider $sk$ as part of the input. So, with overwhelming probability, an adversary who breaks nullifier binding for QRP will also be able to break nullifier binding for QRP*. (FIXME: there is something handwavy about the above justification that we can consider $sk$ to be part of the input. The full argument is something like: there is some procedure for transforming $A$ so that it always gives $sk$ as part of the input without breaking its attack, which is possible because $x \notin BAD\_INPUTS$ and $Y \notin BAD\_INPUTS$ so $A$ has access to the $sk$ at the time of the call. Actually... can't we just, without even doing a reduction, conclude it's unlikely since the result must be two randomly-sampled values in R[] colliding?) (Note that the oracle substitution is *also* cancelling the f(...) term and adding a -g(...) term to the note commitment! But that's fine because this argument for the nullifier binding its inputs isn't at all relying on the note commitment with the -g(...) term being binding! There might be problems with this argument in a game where the adversary's goal is to break *either* Balance or Spendability.) For split_flag=false and use_qsk=true I think you can do something similar, but instead of saving the $sk$s from oracle calls you save all the $(qk, nk, ak)$. split_flag=true remains TODO. This proof is incomplete because it's not sufficient to argue for the cases `use_qsk=false` and `use_qsk=true` separately. An adversary could use a strategy that involves mixing both of those cases. The full proof would need to alter the oracles to save $sk$s *as well as* $(qk, nk, ak)$s, and $H^{rcm}$ would have to find *either* an $sk$ or $(qk, nk, ak)$ that's compatible with the $(g_d, pk_d)$ pair, and then the argument that the adversary cannot distinguish our modified $H^{rcm}$ by switching from $|sks| == 0$ to $|sks| == 1$ (or the equivalent thing involving $(qk, nk, ak)$) becomes much more complicated.)