Recompiled YUL contracts are mostly compatible with the existing pallet-contracts; here are some thoughts on what's missing or could be further optimized.
EVM/ETH use keccak256 for code hashes. So we need: EXTCODEHASH. Just using blake2 or anything other than keccak256 does not work for the recompiler because it would break the semantics of existing contracts.
The hash of a contract is not always just some opaque bytes. For example, a contract may compare a code hash against the "empty code hash" (keccak256('')) to figure out whether the address is a code account, or it might expect a specific code hash at a specific address. This falls apart if we use a different hash function.
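To illustrate why the hash function leaks into contract logic, here is a minimal sketch of the check many contracts effectively perform on the result of EXTCODEHASH. The function name `has_code` is made up for illustration; the constant is the well-known keccak256 hash of the empty byte string.

```rust
/// keccak256 of the empty byte string. EVM contracts compare against this
/// value to detect accounts that exist but carry no code.
pub const EMPTY_CODE_HASH: [u8; 32] = [
    0xc5, 0xd2, 0x46, 0x01, 0x86, 0xf7, 0x23, 0x3c, 0x92, 0x7e, 0x7d, 0xb2, 0xdc, 0xc7, 0x03, 0xc0,
    0xe5, 0x00, 0xb6, 0x53, 0xca, 0x82, 0x27, 0x3b, 0x7b, 0xfa, 0xd8, 0x04, 0x5d, 0x85, 0xa4, 0x70,
];

/// Hypothetical helper mirroring the Solidity idiom
/// `codehash != 0x0 && codehash != keccak256("")`.
/// EXTCODEHASH yields zero for non-existent accounts.
pub fn has_code(ext_code_hash: &[u8; 32]) -> bool {
    *ext_code_hash != [0u8; 32] && *ext_code_hash != EMPTY_CODE_HASH
}
```

If the runtime returned a blake2 hash instead, the comparison against `EMPTY_CODE_HASH` would never match and every account without code would be misclassified as a contract.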
STATICCALL
In a staticcall context, the runtime throws if the contract calls into any state-modifying function. This restriction is propagated down the call stack (EIP-214).
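The propagation rule could be sketched roughly as follows. `Frame`, `ExecError` and `ensure_mutable` are hypothetical names for illustration and not the pallet's actual types.

```rust
#[derive(Debug, PartialEq)]
pub enum ExecError {
    StateChangeDenied,
}

/// One entry of the call stack (hypothetical and heavily simplified).
pub struct Frame {
    pub read_only: bool,
}

impl Frame {
    pub fn root() -> Self {
        Frame { read_only: false }
    }

    /// Enter a nested call. Once any ancestor was entered via STATICCALL,
    /// every descendant frame stays read-only.
    pub fn nested(&self, is_static_call: bool) -> Frame {
        Frame { read_only: self.read_only || is_static_call }
    }

    /// Guard that every state-modifying host function would call first.
    pub fn ensure_mutable(&self) -> Result<(), ExecError> {
        if self.read_only {
            Err(ExecError::StateChangeDenied)
        } else {
            Ok(())
        }
    }
}
```

A regular call made from inside a static context still ends up read-only, which is the "propagated down the call stack" part.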
CALL
Solidity specifies callees using addresses and not code hashes. The runtime should do the lookup rather than the contract.
EVM does balance transfers via calls to EOAs. So instead of failing calls to accounts without code, the balance should be transferred in that case.
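A rough sketch of the intended dispatch, assuming a toy in-memory state. All types and names here (`State`, `CallOutcome`, `CallError`) are illustrative; actual code execution and overflow handling are left out.

```rust
use std::collections::HashMap;

pub type Address = [u8; 20];

#[derive(Debug, PartialEq)]
pub enum CallOutcome {
    /// The callee has code, which would be executed here.
    Executed,
    /// The callee has no code (EOA): only the value is moved.
    PlainTransfer,
}

#[derive(Debug, PartialEq)]
pub enum CallError {
    InsufficientBalance,
}

#[derive(Default)]
pub struct State {
    pub balances: HashMap<Address, u128>,
    pub code: HashMap<Address, Vec<u8>>,
}

impl State {
    /// The runtime resolves the code by address; the contract never
    /// has to know the code hash of the callee.
    pub fn call(&mut self, from: Address, to: Address, value: u128) -> Result<CallOutcome, CallError> {
        let from_balance = self.balances.get(&from).copied().unwrap_or(0);
        if from_balance < value {
            return Err(CallError::InsufficientBalance);
        }
        self.balances.insert(from, from_balance - value);
        *self.balances.entry(to).or_insert(0) += value;
        if self.code.contains_key(&to) {
            Ok(CallOutcome::Executed)
        } else {
            Ok(CallOutcome::PlainTransfer)
        }
    }
}
```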
RETURNDATA{COPY,SIZE}
The output of the last call context should be kept around because contracts can request it.
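One way to picture this is a per-frame buffer that the runtime overwrites after every nested call. The sketch below uses hypothetical names; the out-of-bounds behaviour follows EIP-211, where reading past the end of the return data fails instead of being zero-padded.

```rust
#[derive(Debug, PartialEq)]
pub struct OutOfBounds;

/// Hypothetical buffer holding the output of the most recent nested call.
#[derive(Default)]
pub struct ReturnData {
    last_output: Vec<u8>,
}

impl ReturnData {
    /// Called by the runtime whenever a nested call or instantiation returns.
    pub fn set(&mut self, output: Vec<u8>) {
        self.last_output = output;
    }

    /// RETURNDATASIZE
    pub fn size(&self) -> usize {
        self.last_output.len()
    }

    /// RETURNDATACOPY: unlike CALLDATACOPY, an out-of-range read is an error.
    pub fn copy(&self, offset: usize, len: usize) -> Result<&[u8], OutOfBounds> {
        let end = offset.checked_add(len).ok_or(OutOfBounds)?;
        self.last_output.get(offset..end).ok_or(OutOfBounds)
    }
}
```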
TLOAD / TSTORE
Transient ("temporary") contract storage that is always discarded after the transaction (the outermost call) ends.
https://eips.ethereum.org/EIPS/eip-1153#specification
Must be implemented so that it is cheaper than ordinary SSTORE / SLOAD on contract storage, as solc will use those opcodes for optimizations. Must be implemented to exactly resemble the semantics on EVM, otherwise it can introduce security risks.
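A minimal in-memory sketch of the semantics, assuming a journal-based design (which is only one possible approach). Writes are undone when a frame reverts and everything is dropped at the end of the transaction. Keying by contract address is omitted for brevity, and all names are hypothetical.

```rust
use std::collections::HashMap;

type Key = [u8; 32];
type Value = [u8; 32];

#[derive(Default)]
pub struct TransientStorage {
    current: HashMap<Key, Value>,
    /// Undo log: (key, previous value) for every write.
    journal: Vec<(Key, Value)>,
}

impl TransientStorage {
    /// TLOAD: unset keys read as zero.
    pub fn tload(&self, key: &Key) -> Value {
        self.current.get(key).copied().unwrap_or([0u8; 32])
    }

    /// TSTORE: remember the previous value so the write can be undone.
    pub fn tstore(&mut self, key: Key, value: Value) {
        let previous = self.tload(&key);
        self.journal.push((key, previous));
        self.current.insert(key, value);
    }

    /// Taken when entering a call frame.
    pub fn checkpoint(&self) -> usize {
        self.journal.len()
    }

    /// Undo all writes made since `checkpoint` when a frame reverts.
    pub fn revert_to(&mut self, checkpoint: usize) {
        while self.journal.len() > checkpoint {
            let (key, previous) = self.journal.pop().expect("length checked above");
            self.current.insert(key, previous);
        }
    }

    /// Discard everything once the transaction is over.
    pub fn clear(&mut self) {
        self.current.clear();
        self.journal.clear();
    }
}
```

Because nothing is ever written to the trie, such a structure can be priced well below regular storage access.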
BLOBHASH / BLOBBASEFEE
New opcodes related to sharding on ETH. The idea of proto-danksharding is to provide more data short term (data too expensive to store for all the rollups long term). It is questionable if/how we can support that. However, ETH folks are always getting creative with whatever new functionality they put in EVM and abuse it for something else. I expect this to be the case here too, so we might want to support it anyway if we can. IMO not the highest priority though.
https://github.com/ethereum/EIPs/blob/master/EIPS/eip-4844.md
https://eips.ethereum.org/EIPS/eip-4844#gas-accounting
https://www.eip4844.com/
CHAINID
Returns some number (identifier) for the chain. Ideally we don't clash with any existing ones.
BLOCKHASH
BLOCKHASH(blockNumber) returns the hash of block number blockNumber (only valid for blockNumber within the newest 256 blocks).
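The validity window can be sketched as below. On EVM, requests outside the window (including the current block itself) return zero. The `lookup` closure is a stand-in for however the runtime would actually fetch historic block hashes.

```rust
/// Returns the hash of block `requested`, or zero if it is not one of the
/// 256 most recent blocks preceding `current`.
pub fn block_hash(current: u64, requested: u64, lookup: impl Fn(u64) -> [u8; 32]) -> [u8; 32] {
    let in_window = requested < current && current - requested <= 256;
    if in_window {
        lookup(requested)
    } else {
        [0u8; 32]
    }
}
```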
GASLIMIT
Returns the block's gas limit.
CODESIZE / EXTCODESIZE
Returns the size of the running code blob (CODESIZE) or the size of the code at the specified address (EXTCODESIZE), respectively (analogous to the existing code_hash / ext_code_hash).
We can return a u32 here (the code size cannot exceed it anyway and it is much cheaper to zero-extend this into an i256 than allocating and loading from stack space).
INVALID
The INVALID opcode (0xFE) reverts but also consumes all remaining gas. Could maybe be implemented via return flags.
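As a sketch of the return-flags idea: a second flag bit next to the revert bit could tell the runtime to burn the remaining gas. The flag names and bit values below are assumptions for illustration only.

```rust
/// Assumed flag: the call reverts and state changes are rolled back.
pub const FLAG_REVERT: u32 = 0b01;
/// Assumed new flag: additionally consume all remaining gas (INVALID).
pub const FLAG_CONSUME_ALL_GAS: u32 = 0b10;

pub struct Outcome {
    pub reverted: bool,
    pub gas_left: u64,
}

/// Hypothetical runtime-side handling when a contract returns with `flags`.
pub fn finish(flags: u32, gas_left: u64) -> Outcome {
    let consume_all = (flags & FLAG_CONSUME_ALL_GAS) != 0;
    Outcome {
        // INVALID always reverts, so consuming all gas implies a revert.
        reverted: (flags & FLAG_REVERT) != 0 || consume_all,
        gas_left: if consume_all { 0 } else { gas_left },
    }
}
```

The compiler would then lower INVALID to a return with both bits set, while REVERT keeps refunding the unused gas.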
CREATE / CREATE2
The runtime should use the same address derivation as on EVM. Contract code might assume this:
https://github.com/Uniswap/v2-periphery/blob/0335e8f7e1bd1e8d8329fd300aea2ef2f36dd19f/contracts/libraries/UniswapV2Library.sol#L18
Additionally, we have many parameters in fn instantiate in the contracts pallet which don't matter for EVM. So we could have simpler create / create2 API methods that behave exactly like on Ethereum and take the same parameters:
fn create2(
    code_hash_ptr: u32,   // ptr to the keccak256 hash of the contract code
    value_ptr: u32,       // ptr to the i256 balance to be transferred
    input_data_ptr: u32,  // ptr to the constructor calldata
    input_data_len: u32,  // constructor calldata length
    address_ptr: u32,     // output buffer (20 bytes) = keccak256(0xff + sender_address + salt + keccak256(initialisation_code))[12:]
    salt_ptr: u32,        // ptr to the 32-byte salt
) -> Result<()>
Where CREATE2 writes the zero address on failure. Analogous for CREATE.
This would also benefit code size, as there are only 6 parameters, which doesn't require spilling.
Another thing to note is that we likely just ignore the output of the constructor (on EVM, the constructor output is the runtime code to be deployed; however, we assume the code is already on-chain and execute the constructor in the context of the new instance, discarding any output).
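The CREATE2 address derivation (EIP-1014) that contracts such as the Uniswap library rely on can be sketched as follows. The Rust standard library has no keccak256, so the hash function is passed in as a parameter here; a real implementation would plug in an actual keccak256 from a crate. The function name and parameter layout are illustrative.

```rust
/// address = keccak256(0xff ++ sender ++ salt ++ code_hash)[12..]
pub fn create2_address(
    keccak256: impl Fn(&[u8]) -> [u8; 32], // supplied by the caller (assumption)
    sender: &[u8; 20],
    salt: &[u8; 32],
    code_hash: &[u8; 32],
) -> [u8; 20] {
    // The preimage is always 1 + 20 + 32 + 32 = 85 bytes long.
    let mut preimage = Vec::with_capacity(85);
    preimage.push(0xff);
    preimage.extend_from_slice(sender);
    preimage.extend_from_slice(salt);
    preimage.extend_from_slice(code_hash);

    // The address is the last 20 bytes of the 32-byte hash.
    let hash = keccak256(&preimage);
    let mut address = [0u8; 20];
    address.copy_from_slice(&hash[12..]);
    address
}
```

Since the sender address is part of the preimage, the runtime would also need 20-byte Ethereum-style addresses for the result to match what contracts compute off-chain or in other contracts.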
BALANCE
Currently, the balance seal API returns the balance of the executing account. EVM takes the account (address) as a parameter.
Best to check what frontier/moonbeam do for the following opcodes. I'm not sure off the top of my head what they do for those, but if frontier can emulate them then we can surely find a solution too. Worst case, we don't support some of them and just emit a compiler error at the cost of sacrificing compatibility.
PREVRANDAO
COINBASE
ORIGIN
GASPRICE
DIFFICULTY (currently set to constant 2500000000000000)
BASEFEE (currently set to constant 0)
On EVM there are CALLDATALOAD(i) -> calldata[i] to load a single word from calldata at offset i and CALLDATACOPY(destOffset, offset, size), which is essentially a memcopy (offset is the offset from the start of calldata and destOffset the offset into the EVM linear heap memory). CALLDATASIZE() -> size returns the size of the calldata in bytes. CALLVALUE() -> value returns the balance transferred with this call.
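For reference, the EVM semantics of the two calldata accessors can be modelled like this. Bytes beyond the end of the calldata read as zero. The helper `selector` is an illustrative addition showing what the dispatcher does with the first word.

```rust
/// CALLDATACOPY: fill `dest` from `calldata` starting at `offset`;
/// out-of-range bytes are zero.
pub fn calldatacopy(calldata: &[u8], dest: &mut [u8], offset: usize) {
    dest.fill(0);
    if offset < calldata.len() {
        let n = dest.len().min(calldata.len() - offset);
        dest[..n].copy_from_slice(&calldata[offset..offset + n]);
    }
}

/// CALLDATALOAD: a single 32-byte word at `offset`, zero-padded on the right.
pub fn calldataload(calldata: &[u8], offset: usize) -> [u8; 32] {
    let mut word = [0u8; 32];
    calldatacopy(calldata, &mut word, offset);
    word
}

/// The function selector is the first 4 bytes of the word at offset 0.
pub fn selector(calldata: &[u8]) -> [u8; 4] {
    let word = calldataload(calldata, 0);
    [word[0], word[1], word[2], word[3]]
}
```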
Ideas discussed so far:
CALLDATALOAD(0) at minimum because this is required for the selector check. This could spare calling into seal_input in cases where the code doesn't use CALLDATACOPY at all, as the compiler can optimize CALLDATALOAD(0) away if the offset 0 is static (which it always is during selector check).
On ETH the deploy code can insert immutables into the code, which we can't, so they need to be stored somewhere. My naive approach would be to just store them in regular contract storage under a 4-byte index key (ETH storage keys are always 32 bytes, so this can never collide), but runtime performance is penalized by doing that.
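The naive immutables approach could look roughly like the following toy model, assuming the storage backend accepts variable-length keys. The type and method names are made up for illustration.

```rust
use std::collections::HashMap;

/// Toy model of contract storage with variable-length keys.
#[derive(Default)]
pub struct ContractStorage {
    items: HashMap<Vec<u8>, [u8; 32]>,
}

impl ContractStorage {
    /// Regular Solidity storage always uses 32-byte slots.
    pub fn sstore(&mut self, slot: [u8; 32], value: [u8; 32]) {
        self.items.insert(slot.to_vec(), value);
    }

    pub fn sload(&self, slot: &[u8; 32]) -> [u8; 32] {
        self.items.get(&slot[..]).copied().unwrap_or([0u8; 32])
    }

    /// Immutables live under a 4-byte index key, which can never equal a
    /// 32-byte slot key because the lengths differ.
    pub fn set_immutable(&mut self, index: u32, value: [u8; 32]) {
        self.items.insert(index.to_be_bytes().to_vec(), value);
    }

    pub fn immutable(&self, index: u32) -> Option<[u8; 32]> {
        self.items.get(&index.to_be_bytes()[..]).copied()
    }
}
```

Every read of an immutable then costs a storage access, which is the runtime penalty mentioned above.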