# Making ASE work with an Edict

This document explains a particular approach to Address Space Extension. Readers of this document should be familiar with previous work by [Vitalik Buterin](https://ethereum-magicians.org/t/increasing-address-size-from-20-to-32-bytes/5485) and [Ipsilon](https://notes.ethereum.org/@ipsilon/address-space-extension-exploration).

It attempts to solve the thorny problem of preventing loss of funds due to failure to distinguish between legacy addresses that refer to legacy accounts and those that refer to long addresses. This is done by condensing the problems as much as possible into a single requirement placed upon users (in practice, upon tooling), referred to as The Edict, which guarantees funds will not be lost.

It is implicitly assumed that legacy and long address contracts have the ability to call each other. This is presumably achieved by having the long address contracts convert addresses between legacy and long form. The details of how this might be achieved are not discussed. Other issues such as `ECRECOVER` and `SELFDESTRUCT` are considered out of scope.

This scheme should be compatible with any scheme for state expiry, with the translation table living in epoch 0.

## Legacy and Long Addresses

Currently all accounts in Ethereum have 160-bit addresses, referred to as *legacy addresses*. Address Space Extension is the proposal to introduce new 256-bit addresses, referred to as *long addresses*. Rather than specify exactly how long addresses are formatted, it is simply assumed that they have the following properties:

1. There is a subset of long addresses that are just encoded legacy addresses. We will define the functions `encode()` and `decode()` that implement this transformation.
2. There is a function `compress()` such that `compress(encode(legacy_address)) == legacy_address`, and which otherwise behaves as a cryptographic hash function.
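
The two properties above can be illustrated with a minimal sketch. The address format is deliberately left unspecified here, so everything concrete in the code is an assumption made only for illustration: the all-zero 12-byte prefix that marks an encoded legacy address, the helper `is_encoded_legacy_address()`, and the use of SHA-256 as the hash.

```python
import hashlib

LEGACY_LEN = 20  # 160-bit legacy address
LONG_LEN = 32    # 256-bit long address
# Hypothetical marker: an all-zero 12-byte prefix flags an encoded legacy address.
ENCODED_PREFIX = bytes(LONG_LEN - LEGACY_LEN)

def encode(legacy_address: bytes) -> bytes:
    """Embed a legacy address in the long address space."""
    assert len(legacy_address) == LEGACY_LEN
    return ENCODED_PREFIX + legacy_address

def is_encoded_legacy_address(long_address: bytes) -> bool:
    assert len(long_address) == LONG_LEN
    return long_address.startswith(ENCODED_PREFIX)

def decode(long_address: bytes) -> bytes:
    """Inverse of encode(); only defined on encoded legacy addresses."""
    if not is_encoded_legacy_address(long_address):
        raise ValueError("not an encoded legacy address")
    return long_address[len(ENCODED_PREFIX):]

def compress(long_address: bytes) -> bytes:
    """Map any long address to a 160-bit legacy address."""
    if is_encoded_legacy_address(long_address):
        return decode(long_address)
    # SHA-256 is a stand-in for whatever hash a real design would choose.
    return hashlib.sha256(long_address).digest()[-LEGACY_LEN:]

legacy = bytes.fromhex("11" * LEGACY_LEN)
native_long = bytes.fromhex("ab" * LONG_LEN)
assert compress(encode(legacy)) == legacy          # property 2, first half
assert len(compress(native_long)) == LEGACY_LEN    # hashed down to 160 bits
```

Only the relationship between `encode()`, `decode()` and `compress()` matters for the rest of the document; any format with the same properties would do.
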
## Collisions

It is possible, using a collision-finding algorithm, to find a collision between two 160-bit addresses. This attack is hugely expensive, on a similar scale to performing a 51% attack on Bitcoin, and it is highly unlikely it will ever be attempted. Nonetheless it is necessary to mitigate it.

In the event of such an attack both addresses will have been generated by the same malicious actor. For this reason, there is no need to protect EOAs from loss in the event of a collision attack, but contracts can be relied on by innocent parties.

## The Translation Table

Accounts start in an `UNINITIALIZED` state until they are claimed by either a legacy address or a long address. Once an address has left the `UNINITIALIZED` state, its state cannot be changed again.

`UNINITIALIZED`
: The default state that accounts start in.

`LEGACY`
: This address has been used by a legacy account (contract creation or EOA transaction sent).

`TRANSLATION(long_address)`
: This address refers to a long address account.

The test for `LEGACY` status could alternatively be "has code deployed".

## Adding entries to the Translation Table

There are two routines for converting between legacy and long addresses. They are available as precompiles, but the EVM will also call them internally. An appropriate gas cost is charged for their usage.

### `COMPRESS`

`COMPRESS` takes a long address and returns the corresponding legacy address. It claims the legacy address for the long address and reverts if it has already been claimed by another account.

```python
def COMPRESS(long_address):
    legacy_address = compress(long_address)
    if state[legacy_address].state == UNINITIALIZED:
        # First use: claim the legacy address for this long address.
        state[legacy_address] = TRANSLATION(long_address)
    elif state[legacy_address].state == LEGACY:
        # Already claimed by a legacy account.
        revert()
    elif state[legacy_address].state == TRANSLATION:
        # Already claimed: it must be by the same long address.
        require(state[legacy_address].long_address == long_address)
    return legacy_address
```

### `DECOMPRESS`

`DECOMPRESS` takes a legacy address and turns it into a long address based on its state.
If the legacy address is `UNINITIALIZED`, it is assumed not to correspond to a long address.

```python
def DECOMPRESS(legacy_address):
    if state[legacy_address].state == TRANSLATION:
        return state[legacy_address].long_address
    else:
        # LEGACY or UNINITIALIZED: treat as a plain legacy address.
        return encode(legacy_address)
```

### When is `COMPRESS` called?

There are 3 ways `COMPRESS` can be called.

1. The EVM implicitly calls `COMPRESS` when it needs to pass a long address to a legacy contract (`COINBASE`, `CALLER`, etc.).
2. Smart contracts can call `COMPRESS` through a precompile.
3. Transactions can include a list of long addresses they want added to the translation table in the `new_translations` header field. The EVM will call `COMPRESS` on each of them prior to starting execution. If `COMPRESS` reverts then the entire transaction reverts.

## The Edict

Because legacy addresses are assumed to point to legacy accounts, it is important that no one uses `compress(long_address)` without first adding it to the translation table. Users who fail to do this will potentially suffer loss of funds. This should be enforced by tooling that does not let users calculate `compress(long_address)` unless they meet the following requirement:

> You MUST NOT distribute the return value of `compress(long_address)` unless one of the following holds:
> 1. You have included `long_address` in the `new_translations` header field.
> 2. You are using the `COMPRESS` precompile of the EVM.
> 3. You have confirmed that the state of Ethereum contains a translation for `compress(long_address)` that points to `long_address`.

Because this is so important we will refer to it as "The Edict".

### If you obey The Edict...

#### ... you can get full counterfactual security for long addresses

Suppose that Alice has a legacy account that she wishes to use to send 1000 units of a legacy ERC20, `WDOGE`, to a long address contract that has been counterfactually generated by Chuck.
Suppose that when Chuck generated his contract he used a collision-finding algorithm to create an EOA he controls with the same legacy address.

When Alice creates her transaction she obeys The Edict by including the `long_address` of the counterfactual contract in the `new_translations` field of the transaction. The body of her transaction calls `transfer(chucks_contract_compressed, 1000)` on the `WDOGE` contract. Chuck also sends a transaction claiming the legacy address for himself.

If Alice's transaction is processed first, it causes a translation to be entered pointing `chucks_contract_compressed` to `chucks_contract`. This renders Chuck's collision worthless and will cause Chuck's transaction to fail.

If Chuck's transaction is processed first, then Alice's transaction will fail, costing her gas fees only.

#### ... you cannot lose funds to incorrect address decompression.

If The Edict is obeyed, any legacy address of the form `compress(long_address)` must have its long address entered into the translation table. This prevents any mislocation of funds.

Note that if a user trusts someone else to give them an address honestly, they must also trust them to obey The Edict.

### Dealing with Edict violations

Edict violations cause two problems. Firstly, counterfactual security is lost (Chuck's attack would have worked). Secondly, funds can be lost due to `DECOMPRESS` incorrectly returning `encode(legacy_address)`.

In order to cause loss of funds two things must happen:

1. A long address gets compressed in violation of The Edict.
2. The corresponding legacy address is decompressed to an encoded legacy address.

If the recipient is a legacy address or the thing being sent is a legacy ERC20, loss is impossible.

#### Misplaced ETH

ETH that has been misplaced can be recovered by automatic account merging. This is implemented by moving any ETH stored in an `UNINITIALIZED` legacy account to the corresponding long address when a translation is entered for it.
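
Automatic account merging can be sketched as a small extension of `COMPRESS`. This is only an illustration under stated assumptions: the `LegacySlot` record, the `long_balances` mapping, the function name `compress_and_merge()`, and the SHA-256 stand-in for `compress()` are all hypothetical and not part of the proposal.

```python
import hashlib
from dataclasses import dataclass

UNINITIALIZED, LEGACY, TRANSLATION = "UNINITIALIZED", "LEGACY", "TRANSLATION"

@dataclass
class LegacySlot:
    state: str = UNINITIALIZED
    long_address: bytes = b""
    balance: int = 0  # ETH sent to the bare legacy address

def compress(long_address: bytes) -> bytes:
    # Stand-in hash; a real design would also special-case encoded legacy addresses.
    return hashlib.sha256(long_address).digest()[-20:]

def compress_and_merge(slots, long_balances, long_address):
    """COMPRESS with merging: entering a translation sweeps misplaced ETH."""
    legacy_address = compress(long_address)
    slot = slots.setdefault(legacy_address, LegacySlot())
    if slot.state == UNINITIALIZED:
        slot.state, slot.long_address = TRANSLATION, long_address
        # Merge: ETH misplaced at the legacy address moves to the long account.
        long_balances[long_address] = long_balances.get(long_address, 0) + slot.balance
        slot.balance = 0
    elif slot.state == LEGACY:
        raise RuntimeError("revert: claimed by a legacy account")
    elif slot.long_address != long_address:
        raise RuntimeError("revert: claimed by another long address")
    return legacy_address

slots, long_balances = {}, {}
bob = bytes.fromhex("cd" * 32)
# Edict violation: 5 units of ETH were sent to compress(bob) before any translation existed.
slots[compress(bob)] = LegacySlot(balance=5)
compress_and_merge(slots, long_balances, bob)
assert long_balances[bob] == 5
assert slots[compress(bob)].balance == 0
```

The merge happens exactly once, at the moment the slot leaves the `UNINITIALIZED` state, which matches the rule that a slot's state never changes again afterwards.
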
#### Misplaced Tokens

Misplaced tokens cannot be recovered without assistance from the holding contract. Since legacy contracts cannot contain misplaced ETH this is not an issue with those contracts. Long address contracts can implement recovery using the following function.

```solidity
// Pseudocode: `long_address` is a hypothetical 256-bit address type.
function recover_edict_violation(long_address wrong_address) public {
    // Only encoded legacy addresses can hold misplaced tokens.
    require(is_encoded_legacy_address(wrong_address));
    long_address right_address = decompress(compress(wrong_address));
    // A translation must exist, otherwise there is nothing to recover.
    require(wrong_address != right_address);
    _transfer(wrong_address, right_address, _balances[wrong_address]);
}
```

`recover_edict_violation()` can be called by anyone because its only effect is to transfer assets from an account that cannot legally be created to the rightful owner of those assets.

#### Attempting to call contracts

Currently, if a call is attempted to a non-existent contract, the call will be treated as a transfer, potentially causing loss of ether. Calling a long address contract by its legacy address (in violation of The Edict) could cause loss of funds in the same way. This could be fixed by adding a "recipient must be a contract" flag to transactions.

It would be prudent to encourage contract creators to call `COMPRESS` on their contracts to reduce the chance of issues.

#### Other cases

It is not possible for the EVM to fix Edict violations in the general case, because the EVM cannot know what is and is not an address or what invariants it must maintain when fixing Edict violations.

This is unlikely to be a problem in practice, because recoverable Edict violations involve accounts that have never sent a transaction and are unlikely to be involved in complicated contract interactions.

### How can Edict violations occur?

Fundamentally, the responsibility for preventing Edict violations lies with tooling. The function `compress()` is a function for highly sophisticated users only.
Tooling should use long addresses everywhere internally and only call the `compress()` function when assembling transactions (thus obeying clause 1 of The Edict). User interfaces must not provide a facility for users to calculate compressed addresses unless they have complied with clause 3 of The Edict.

#### Cross chain issues

If an address is compressed on one chain, it is an Edict violation to use it on another. Assuming an address on one chain is valid on another is already a major source of issues (e.g. Binance defaulting to BSC for withdrawals). A similar issue occurs with testing environments like Hardhat.

#### Reverted and pending transactions

A transaction can be created without violating The Edict if it contains compressed addresses that are not in the Translation Table, as long as it includes the corresponding long address in the `new_translations` field of the transaction. If someone, say a user examining Etherscan, takes the address out of a pending or reverted transaction and uses it, they are violating The Edict.

The reverted transaction case can be resolved by specifying that the processing of the `new_translations` field happens in a preliminary stage of transaction processing and its effects are preserved even if the body of the transaction reverts.
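
The two-stage processing described in the last paragraph might look roughly like the sketch below. It is a simplified model, not a specification: the `table` dictionary, the `Revert` exception, the status strings, the SHA-256 stand-in for `compress()`, and the choice to apply the `new_translations` list atomically are all assumptions made for illustration.

```python
import hashlib

class Revert(Exception):
    """Raised by a transaction body to signal a revert."""

def compress(long_address: bytes) -> bytes:
    # Stand-in hash; a real design would also special-case encoded legacy addresses.
    return hashlib.sha256(long_address).digest()[-20:]

def process_transaction(table, new_translations, body):
    """table maps a legacy address to "LEGACY" or to its translated long address."""
    # Stage 1: enter translations on a scratch copy so a failure leaves no trace.
    staged = dict(table)
    for long_address in new_translations:
        legacy_address = compress(long_address)
        current = staged.get(legacy_address)  # None means UNINITIALIZED
        if current is None:
            staged[legacy_address] = long_address
        elif current != long_address:
            return "reverted"  # COMPRESS reverted, so the body never runs
    table.update(staged)  # translations are committed before the body runs

    # Stage 2: run the body; a revert here does not undo stage 1.
    try:
        body()
    except Revert:
        return "body reverted"
    return "ok"

def failing_body():
    raise Revert("transfer failed")

table = {}
target = bytes.fromhex("ee" * 32)
status = process_transaction(table, [target], failing_body)
assert status == "body reverted"
assert table[compress(target)] == target  # translation survived the revert

claimed = {compress(target): "LEGACY"}
assert process_transaction(claimed, [target], failing_body) == "reverted"
```

With this ordering, an address copied from a reverted transaction is already backed by a translation, so reusing it no longer violates The Edict. Pending transactions are not helped by this change, since none of their stages has run yet.
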