Dmitry Khovratovich
Ethereum addresses as functions of public keys are computed as 160-bit hashes:
\[
A_{PK} = Keccak_{256}(K)[0..160]
\]
where \(K=[k]P\) is the public key derived from secret key \(k\) and \([0..160]\) indicates that we take only 160 bits of the output. Note that \(Keccak_{256}\) is slightly different from the standardized SHA-3.
Ethereum contract addresses are created in two ways. First, when a contract is regularly deployed it is a hash of a nonce and the creator address:
\[
A_{CT} = Keccak_{256}(encode_1(A,N))[0..160]
\]
where \(N\) is a nonce starting from 1.
Second, via CREATE2 instruction where the contract code \(C_d\) is also hashed:
\[
A_{CR} = Keccak_{256}(encode_2(A,S,Keccak_{256}(C_d)))
\]
where \(S\) is arbitrary 32-byte salt.
It is obvious that by trying \(2^{80}\) variants \(K_i = K+[i]P\) of \(K\) and \(2^{80}\) salt values \(S_j\) one might obtain the address collision
\[
A_{PK}^i = Keccak_{256}(K_i)[0..160] = A_{CR}^j = Keccak_{256}(encode_2(A,S_j,Keccak_{256}(C_d))).
\]
Note that the \(encode\) functions ensure that the preimages for \(A_{CR}\) and \(A_{PK}\) are domain-separated so there can't be a collision in inputs.
We investigate how expensive it would be to create such a collision.
There is simple memoryless method of collision search. It is based on the rho method for the collision in a single function \(f\):
The collision search between two functions \(f_1\) and \(f_2\) works the same as we just switch between \(f_1\) and \(f_2\) at a pseudo-random fashion when iterating. The functions \(f_1,f_2\) are exactly the functions to derive \(A_{PK}\) and \(A_{CR}\) respectively with the only caveat that they must map the input space to the same domain. This is achieved by fixing all inputs except for \([i]\) and \(S_j\) and mapping the hash outputs to those. For example,
\[
f_1(x) = Keccak_{256}(K+[x]P)[0..160]
\]
The found collision is then with probability 1/2 a collision between \(f_1\) and \(f_2\). Constant memory is required. Note that function \(f_1\) takes a 256-bit input and maps it to value \(i\), which is not low-weight. Thus a single iteration of \(f_1\) is a curve point multiplication.
The downside of the method is that it would require time equivalent to \(2^{80}\) Keccak+KeyDerivation calls, which is clearly enormous. There is a parallel version of collision search due to van Orrschot and Wiener (1996).
Let us denote the number of available Keccak processors by \(m\). Then each processor starts at its own point \(x_0\) and iterates \(f\) until a distinguished point (say last 10 bits are 0) is reached. It is then stored in common memory and, if there is no collision with the stored ones, the search is restarted. The proportion of distinguished points is \(\theta\) so that the average iteration length is \(1/\theta\).
The run-time of the attack with \(m\) processors is about
\[
\frac{2^{81}}{m}+\frac{2.5}{\theta}
\]
calls to \(f\) (which is upper bounded as the cost of EC multiplication+hashing).
The expected number of distinguished points to be produced is \(\theta 2^{81}\). We may assume that every point stores about 60 bits about \(x_0\) and \(160 + \log_2 \theta\) bits about itself.
The most expensive part is the computation of \(f_1\). We need to perform a 160-bit fixed-base scalar multiplication and Keccak-256 hashing. The latter is estimated to take 40 kGates. We then assume that the chip is big enough so we can store \(2^6\) 510-bit points of form \([t]P\) for \(t<2^6\),
so that we need only 160 doublings and 30 additions, i.e. about half of regular exponentiation. The latter is reported to take 3 ms, 300K cycles, and 11K gates on ASIC with 120Mhz frequency. We thus presume that one can make \(2^{12}\) calls to \(f_1\) per second, or \(2^{37}\)
calls per year.
Setting \(m=2^{44}\) and \(\theta=2^{-37}\) we obtain that the running time is \(2^{37}\) calls to \(f_1\) (i.e. a year) and the total storage is about 160 TB.
The cost can be estimated as follows. Assuming that SHA-2 has about \(2^{11}\) smaller latency that the \(f_1\) processor, we obtain that the total cost of the attack should be equivalent to \(2^{92}\) SHA-2 hashes. Note that the Bitcoin mining network computes \(2^{67}\) double SHA-2 hashes per second, so that the attack cost is equivalent to \(2^{25}\) seconds, or 1 year, of Bitcoin mining. This is $10 billion in June 2021.
or
or
By clicking below, you agree to our terms of service.
New to HackMD? Sign up
Syntax | Example | Reference | |
---|---|---|---|
# Header | Header | 基本排版 | |
- Unordered List |
|
||
1. Ordered List |
|
||
- [ ] Todo List |
|
||
> Blockquote | Blockquote |
||
**Bold font** | Bold font | ||
*Italics font* | Italics font | ||
~~Strikethrough~~ | |||
19^th^ | 19th | ||
H~2~O | H2O | ||
++Inserted text++ | Inserted text | ||
==Marked text== | Marked text | ||
[link text](https:// "title") | Link | ||
![image alt](https:// "title") | Image | ||
`Code` | Code |
在筆記中貼入程式碼 | |
```javascript var i = 0; ``` |
|
||
:smile: | Emoji list | ||
{%youtube youtube_id %} | Externals | ||
$L^aT_eX$ | LaTeX | ||
:::info This is a alert area. ::: |
This is a alert area. |
On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?
Please give us some advice and help us improve HackMD.
Syncing