Impact Analysis of Neutering SELFDESTRUCT - Dev Update #1

SELFDESTRUCT is an opcode that a contract can use to delete itself: both its code and its storage are removed from the state tree, and all ETH held by the contract is sent to a specified address. The opcode introduces several complications. In particular, allowing contracts to selfdestruct complicates the switch to Verkle tries, and the current Verkle trie proposal requires neutering the opcode. Neutering means that the opcode is renamed to SENDALL, and the only thing it does is send the contract balance to the specified address.

Goal of the project

Neutering SELFDESTRUCT is a backward-incompatible change. The goal of this project is to analyze all existing uses of SELFDESTRUCT to evaluate the potential effects of this change.

What has been done

First, I downloaded the contract code of all contracts deployed on the Mainnet from the Public BigQuery Ethereum dataset. This dataset was created using Ethereum ETL. I then created a PostgreSQL database where I inserted this data. I will use this database to store metadata about the contracts, as I go along. Currently, I have added the following metadata:

  • has_selfdestruct: Does the code have the SELFDESTRUCT opcode?
  • selfdestruct_recipients: The potential recipients of the contract balance, extracted by static analysis (see below).
  • has_callcode: Does the code have the CALLCODE opcode? This is relevant because a contract can also selfdestruct by using CALLCODE to run the code of another contract that uses SELFDESTRUCT.
  • has_delegatecall: Does the code have the DELEGATECALL opcode? This is relevant for the same reason as CALLCODE.
  • call_addresses: The potential addresses that can be called by CALLCODE or DELEGATECALL, extracted by static analysis (see below).
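
As an illustration of how the three boolean flags could be computed, here is a minimal sketch of a linear sweep over the bytecode. This is not necessarily how the database was actually populated; the function names `opcodes_present` and `contract_flags` are hypothetical. The sweep skips PUSH immediates so that a data byte such as 0xff is not mistaken for SELFDESTRUCT, but it can still over-report, because data or metadata appended to the code is scanned as if it were instructions.

```python
# Opcode values as defined by the EVM.
SELFDESTRUCT, CALLCODE, DELEGATECALL = 0xFF, 0xF2, 0xF4
PUSH1, PUSH32 = 0x60, 0x7F


def opcodes_present(code: bytes) -> set:
    """Return the set of opcode bytes found by a linear sweep of `code`."""
    found = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        found.add(op)
        # PUSHn is followed by n bytes of immediate data, which are not opcodes.
        if PUSH1 <= op <= PUSH32:
            pc += op - PUSH1 + 1
        pc += 1
    return found


def contract_flags(code: bytes) -> dict:
    """Compute the has_* metadata flags for one contract code (hypothetical helper)."""
    ops = opcodes_present(code)
    return {
        "has_selfdestruct": SELFDESTRUCT in ops,
        "has_callcode": CALLCODE in ops,
        "has_delegatecall": DELEGATECALL in ops,
    }
```

For example, `contract_flags(bytes.fromhex("33ff"))` (CALLER, SELFDESTRUCT) sets `has_selfdestruct`, while `contract_flags(bytes.fromhex("60ff00"))` (PUSH1 0xff, STOP) does not, because the 0xff byte is push data.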

Static analysis of contract code

In order to perform static analysis on contract code, I have implemented a very primitive EVM interpreter in Haskell. This interpreter performs symbolic execution of EVM code. The interpreter is able to enumerate all possible inputs of a given opcode, which is used to find all possible recipients of the contract balance, and all possible addresses called by CALLCODE or DELEGATECALL. The plan is to release the source code as part of the final analysis.
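
To give a feel for the idea, the following toy sketch in Python tracks an abstract stack over straight-line code and reports what is on top of the stack at each SELFDESTRUCT. It is not the Haskell interpreter described above and is far weaker than real symbolic execution: it does not follow jumps, fork at branches, or model memory and storage. Any opcode it does not model simply resets the stack, so the recipient is reported as "UNKNOWN". The function name and the symbolic labels are assumptions made for this example.

```python
def selfdestruct_recipients(code: bytes) -> set:
    """Report the abstract value on top of the stack at each SELFDESTRUCT."""
    stack = []          # abstract values: hex constants or symbolic names
    recipients = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:            # PUSH1..PUSH32: constant operand
            n = op - 0x5F
            stack.append("0x" + code[pc + 1 : pc + 1 + n].hex())
            pc += n
        elif op == 0x30:                  # ADDRESS
            stack.append("ADDRESS")
        elif op == 0x32:                  # ORIGIN
            stack.append("ORIGIN")
        elif op == 0x33:                  # CALLER
            stack.append("CALLER")
        elif op == 0x50:                  # POP
            if stack:
                stack.pop()
        elif 0x80 <= op <= 0x8F:          # DUP1..DUP16
            n = op - 0x7F
            stack.append(stack[-n] if len(stack) >= n else "UNKNOWN")
        elif op == 0xFF:                  # SELFDESTRUCT: top of stack is the recipient
            recipients.add(stack.pop() if stack else "UNKNOWN")
        else:
            # Unmodelled opcode (including JUMPDEST, where control may arrive
            # from elsewhere): forget everything we knew about the stack.
            stack = []
        pc += 1
    return recipients
```

For the two-byte code CALLER, SELFDESTRUCT (`33ff`) the sketch reports `{"CALLER"}`; for a PUSH20 of a constant address followed by SELFDESTRUCT it reports that constant.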

Some numbers

There are a total of 43,621,454 deployed contracts on Ethereum Mainnet (as of June 12, 2021). The contracts have 398,220 distinct contract codes. Among the distinct codes, there are

  • 20,565 (~5%) codes having the SELFDESTRUCT opcode,
  • 32,032 (~8%) codes that do not have SELFDESTRUCT, but have either CALLCODE or DELEGATECALL, and
  • 345,623 (~87%) codes that do not have any of the three opcodes and are therefore indestructible.
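
As a quick sanity check of the figures above, the three groups should add up to the number of distinct codes, and the rounded percentages should follow from the counts. The variable names are only for illustration.

```python
total_distinct_codes = 398_220

groups = {
    "has SELFDESTRUCT": 20_565,
    "CALLCODE or DELEGATECALL only": 32_032,
    "none of the three opcodes": 345_623,
}

# The three groups partition the set of distinct codes.
assert sum(groups.values()) == total_distinct_codes

# Share of each group in percent, rounded to one decimal place.
shares = {name: round(100 * n / total_distinct_codes, 1) for name, n in groups.items()}
for name, share in shares.items():
    print(f"{name}: {share}%")
```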

There will be more numbers to come in the upcoming updates.

What remains to be done

Please let me know if you think I should add anything to this list.

  • Find out which of the contracts having CALLCODE or DELEGATECALL can be proven indestructible. We can prove this recursively if

    1. the contract does not contain SELFDESTRUCT,
    2. all possible contracts that can be called by the contract via CALLCODE or DELEGATECALL can be determined by static analysis, and
    3. all possible contracts that can be called are themselves proven to be indestructible.

  • Obtain the Solidity code of all selfdestructable contracts that are available from etherscan.io.

  • Identify which contracts are metamorphic, meaning that they are able to selfdestruct and be redeployed at the same address. Metamorphic contracts will no longer be possible after neutering SELFDESTRUCT, so this will be an important part of the analysis. If a contract is metamorphic, the contract must have been either

    1. deployed using CREATE2 or
    2. deployed by a metamorphic contract

    In order to analyze this, I need to obtain information about how each contract was deployed. I still need to check whether Ethereum ETL can be used for this.

  • Obtain indicators for the popularity of the destructible contracts, such as balance and the number of transactions interacting with the contract.

  • Improve the EVM analyzer such that it can tell under which conditions the contract may selfdestruct, for instance only when CALLER is equal to a specific address.

  • Are there any cases where EXTCODECOPY, EXTCODEHASH or EXTCODESIZE are used to query the code of destructible contracts? For what reasons? Do any of the use cases depend on the queried contract being able to selfdestruct?

  • Publish the code on GitHub. All of the analysis should be replicable.
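
The three conditions for proving a contract indestructible (the first item in the list above) can be sketched as a fixpoint computation. This is only one possible reading of the recursive criterion, not the final implementation. The `Contract` type and the function name are hypothetical, and `call_targets` stands in for the `call_addresses` metadata, with `None` meaning that static analysis could not enumerate the targets.

```python
from typing import Dict, NamedTuple, Optional, Set


class Contract(NamedTuple):
    has_selfdestruct: bool
    # Addresses reachable via CALLCODE/DELEGATECALL, or None if static
    # analysis could not enumerate them.
    call_targets: Optional[Set[str]]


def proven_indestructible(contracts: Dict[str, Contract]) -> Set[str]:
    """Return the addresses that satisfy all three conditions."""
    # Conditions 1 and 2: no SELFDESTRUCT, and all call targets are known.
    safe = {
        addr
        for addr, c in contracts.items()
        if not c.has_selfdestruct and c.call_targets is not None
    }
    # Condition 3: repeatedly drop contracts with a target that is not proven
    # safe. Targets missing from `contracts` are conservatively treated as unsafe.
    changed = True
    while changed:
        changed = False
        for addr in list(safe):
            if not contracts[addr].call_targets <= safe:
                safe.remove(addr)
                changed = True
    return safe
```

One design choice worth noting: because the sketch starts from all candidates and removes violators, two contracts that only call each other and contain no SELFDESTRUCT are both kept. A strictly inductive proof that starts from contracts without any call targets would reject such cycles.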