owned this note changed 3 years ago
Linked with GitHub

Full Cost of PublicKey-Contract Collision Attack

Dmitry Khovratovich

Problem statement

Ethereum addresses as functions of public keys are computed as 160-bit hashes:
\[ A_{PK} = Keccak_{256}(K)[0..160] \]
where \(K=[k]P\) is the public key derived from secret key \(k\) and \([0..160]\) indicates that we take only 160 bits of the output. Note that \(Keccak_{256}\) is slightly different from the standardized SHA-3.

Ethereum contract addresses are created in two ways. First, when a contract is regularly deployed it is a hash of a nonce and the creator address:
\[ A_{CT} = Keccak_{256}(encode_1(A,N))[0..160] \]
where \(N\) is a nonce starting from 1.

Second, via CREATE2 instruction where the contract code \(C_d\) is also hashed:
\[ A_{CR} = Keccak_{256}(encode_2(A,S,Keccak_{256}(C_d))) \]
where \(S\) is arbitrary 32-byte salt.

It is obvious that by trying \(2^{80}\) variants \(K_i = K+[i]P\) of \(K\) and \(2^{80}\) salt values \(S_j\) one might obtain the address collision
\[ A_{PK}^i = Keccak_{256}(K_i)[0..160] = A_{CR}^j = Keccak_{256}(encode_2(A,S_j,Keccak_{256}(C_d))). \]

Note that the \(encode\) functions ensure that the preimages for \(A_{CR}\) and \(A_{PK}\) are domain-separated so there can't be a collision in inputs.

We investigate how expensive it would be to create such a collision.

Parallel Distinguished Points Method

There is simple memoryless method of collision search. It is based on the rho method for the collision in a single function \(f\):

  • Start at input \(x\);
  • Use two pointers: one iterates \(f(x)\), the other iterates \(f(f(x))\) at the same time.
  • Whenever two pointers collide, there must have been a collision on their paths.

The collision search between two functions \(f_1\) and \(f_2\) works the same as we just switch between \(f_1\) and \(f_2\) at a pseudo-random fashion when iterating. The functions \(f_1,f_2\) are exactly the functions to derive \(A_{PK}\) and \(A_{CR}\) respectively with the only caveat that they must map the input space to the same domain. This is achieved by fixing all inputs except for \([i]\) and \(S_j\) and mapping the hash outputs to those. For example,
\[ f_1(x) = Keccak_{256}(K+[x]P)[0..160] \]

The found collision is then with probability 1/2 a collision between \(f_1\) and \(f_2\). Constant memory is required. Note that function \(f_1\) takes a 256-bit input and maps it to value \(i\), which is not low-weight. Thus a single iteration of \(f_1\) is a curve point multiplication.

The downside of the method is that it would require time equivalent to \(2^{80}\) Keccak+KeyDerivation calls, which is clearly enormous. There is a parallel version of collision search due to van Orrschot and Wiener (1996).

Let us denote the number of available Keccak processors by \(m\). Then each processor starts at its own point \(x_0\) and iterates \(f\) until a distinguished point (say last 10 bits are 0) is reached. It is then stored in common memory and, if there is no collision with the stored ones, the search is restarted. The proportion of distinguished points is \(\theta\) so that the average iteration length is \(1/\theta\).

The run-time of the attack with \(m\) processors is about
\[ \frac{2^{81}}{m}+\frac{2.5}{\theta} \]
calls to \(f\) (which is upper bounded as the cost of EC multiplication+hashing).

The expected number of distinguished points to be produced is \(\theta 2^{81}\). We may assume that every point stores about 60 bits about \(x_0\) and \(160 + \log_2 \theta\) bits about itself.

Concrete Numbers

Cost of Single Iteration

The most expensive part is the computation of \(f_1\). We need to perform a 160-bit fixed-base scalar multiplication and Keccak-256 hashing. The latter is estimated to take 40 kGates. We then assume that the chip is big enough so we can store \(2^6\) 510-bit points of form \([t]P\) for \(t<2^6\),
so that we need only 160 doublings and 30 additions, i.e. about half of regular exponentiation. The latter is reported to take 3 ms, 300K cycles, and 11K gates on ASIC with 120Mhz frequency. We thus presume that one can make \(2^{12}\) calls to \(f_1\) per second, or \(2^{37}\)
calls per year.

Example

Setting \(m=2^{44}\) and \(\theta=2^{-37}\) we obtain that the running time is \(2^{37}\) calls to \(f_1\) (i.e. a year) and the total storage is about 160 TB.

The cost can be estimated as follows. Assuming that SHA-2 has about \(2^{11}\) smaller latency that the \(f_1\) processor, we obtain that the total cost of the attack should be equivalent to \(2^{92}\) SHA-2 hashes. Note that the Bitcoin mining network computes \(2^{67}\) double SHA-2 hashes per second, so that the attack cost is equivalent to \(2^{25}\) seconds, or 1 year, of Bitcoin mining. This is $10 billion in June 2021.

Select a repo