# Runtime for recompiled YUL contracts
Recompiled YUL contracts are _mostly_ compatible with the existing pallet-contracts; here are some thoughts on what's missing or could be further optimized.
## Required additions
### Keccak256 code hashes
EVM/ETH use keccak256 for code hashes. So we need:
- An API to get the keccak256 hash of a blob for supporting `EXTCODEHASH`.
- The deploy function to accept keccak256 hashes.
Just using blake2 or anything other than keccak256 does not work for the recompiler because it would break the semantics of existing contracts.
The hash of a contract is not always just some opaque bytes. For example, a contract may compare a code hash against the "empty code hash" (`keccak256('')`) to figure out whether the address is an account with code, or it may expect a specific code hash at a specific address. This falls apart if we use a different hash function.
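
To illustrate why the hash function matters, here is a minimal sketch of the kind of check a contract performs on the result of `EXTCODEHASH`. The helper name `has_code` is hypothetical; the constant is the well-known value of `keccak256('')`. The zero-hash case assumes EVM behaviour, where `EXTCODEHASH` yields zero for non-existent accounts.

```rust
/// keccak256 of the empty byte string, used on Ethereum for accounts without code.
pub const EMPTY_CODE_HASH: [u8; 32] = [
    0xc5, 0xd2, 0x46, 0x01, 0x86, 0xf7, 0x23, 0x3c,
    0x92, 0x7e, 0x7d, 0xb2, 0xdc, 0xc7, 0x03, 0xc0,
    0xe5, 0x00, 0xb6, 0x53, 0xca, 0x82, 0x27, 0x3b,
    0x7b, 0xfa, 0xd8, 0x04, 0x5d, 0x85, 0xa4, 0x70,
];

/// Hypothetical helper mirroring a typical contract-side check:
/// an address "has code" if its code hash is neither zero
/// (non-existent account) nor the empty code hash.
pub fn has_code(ext_code_hash: &[u8; 32]) -> bool {
    *ext_code_hash != [0u8; 32] && *ext_code_hash != EMPTY_CODE_HASH
}
```

If the runtime returned a blake2 hash of the empty blob instead, the comparison against `EMPTY_CODE_HASH` baked into existing contracts would silently give the wrong answer.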
### `STATICCALL`
In a `staticcall` context, the runtime throws if the contract calls into _any_ state-modifying function. This restriction propagates down the call stack. See [EIP-214](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-214.md).
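
A minimal sketch of how the static flag could propagate, assuming a hypothetical `Frame` type per call context (none of these names exist in pallet-contracts as written here). Every state-modifying host function would call `ensure_mutable` first.

```rust
#[derive(Debug, PartialEq)]
pub enum ExecError {
    StateChangeInStaticContext,
}

/// Hypothetical call frame; only the static flag matters for this sketch.
pub struct Frame {
    pub is_static: bool,
}

impl Frame {
    /// The outermost frame of a regular (non-static) call.
    pub fn root() -> Self {
        Frame { is_static: false }
    }

    /// Enter a nested call. Once a frame is static, every frame below stays static.
    pub fn nested(&self, static_call: bool) -> Frame {
        Frame { is_static: self.is_static || static_call }
    }

    /// Guard to run at the top of every state-modifying host function.
    pub fn ensure_mutable(&self) -> Result<(), ExecError> {
        if self.is_static {
            Err(ExecError::StateChangeInStaticContext)
        } else {
            Ok(())
        }
    }
}
```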
### `CALL`
#### Address vs. code hash
Solidity specifies callees using addresses and not code hashes. The runtime should do the lookup from address to code hash, rather than the contract.
#### Balance transfers
EVM does balance transfers via calls to EOAs. So instead of failing calls to accounts without code, the balance should be transferred in that case.
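
The two points above (address lookup by the runtime, and plain transfers to accounts without code) could be sketched as follows. `World`, `CallOutcome` and the `u128` balance type are assumptions for illustration only; EVM values are 256 bits wide.

```rust
use std::collections::HashMap;

pub type Address = [u8; 20];
pub type CodeHash = [u8; 32];

#[derive(Debug, PartialEq)]
pub enum CallOutcome {
    /// The callee has code; the runtime would now execute the blob with this hash.
    Execute(CodeHash),
    /// The callee has no code; only the value was transferred.
    TransferOnly,
}

#[derive(Debug, PartialEq)]
pub enum CallError {
    InsufficientBalance,
}

/// Hypothetical in-memory stand-in for the runtime's account state.
#[derive(Default)]
pub struct World {
    pub code: HashMap<Address, CodeHash>,
    pub balances: HashMap<Address, u128>,
}

impl World {
    /// `CALL`: the runtime resolves the address and moves the value.
    /// A missing code entry is not an error, it is a plain transfer.
    pub fn call(
        &mut self,
        caller: Address,
        callee: Address,
        value: u128,
    ) -> Result<CallOutcome, CallError> {
        let from = self.balances.get(&caller).copied().unwrap_or(0);
        if from < value {
            return Err(CallError::InsufficientBalance);
        }
        self.balances.insert(caller, from - value);
        *self.balances.entry(callee).or_insert(0) += value;
        Ok(match self.code.get(&callee) {
            Some(hash) => CallOutcome::Execute(*hash),
            None => CallOutcome::TransferOnly,
        })
    }
}
```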
#### `RETURNDATA{COPY,SIZE}`
The output of the last call context should be kept around because contracts can request it.
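
A minimal sketch of such a buffer, with hypothetical names. It assumes the EVM rule that `RETURNDATACOPY` fails when reading past the end of the buffer instead of zero-padding.

```rust
#[derive(Debug, PartialEq)]
pub enum ReturnDataError {
    OutOfBounds,
}

/// Hypothetical per-frame buffer holding the output of the most recent nested call.
#[derive(Default)]
pub struct ReturnDataBuffer {
    last: Vec<u8>,
}

impl ReturnDataBuffer {
    /// Called by the runtime after every nested call, replacing the previous output.
    pub fn set(&mut self, output: Vec<u8>) {
        self.last = output;
    }

    /// `RETURNDATASIZE`
    pub fn size(&self) -> usize {
        self.last.len()
    }

    /// `RETURNDATACOPY`: copies `dest.len()` bytes starting at `offset`.
    /// Reading past the end of the buffer is an error.
    pub fn copy(&self, offset: usize, dest: &mut [u8]) -> Result<(), ReturnDataError> {
        let end = offset
            .checked_add(dest.len())
            .ok_or(ReturnDataError::OutOfBounds)?;
        let src = self
            .last
            .get(offset..end)
            .ok_or(ReturnDataError::OutOfBounds)?;
        dest.copy_from_slice(src);
        Ok(())
    }
}
```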
### `TLOAD` / `TSTORE`
Transient ("temporary") contract storage that is always discarded after the transaction (the outermost call) ends.
https://eips.ethereum.org/EIPS/eip-1153#specification
Must be implemented so that it is cheaper than ordinary `SSTORE` / `SLOAD` on contract storage, as `solc` will use those opcodes for optimizations. Must also exactly match the semantics on EVM, otherwise it can introduce security risks.
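
A rough sketch of the semantics described in EIP-1153, using a hypothetical in-memory `TransientStorage` type: writes made by a frame are undone if that frame reverts, and everything is discarded at the end of the transaction. The journal-based design is one possible approach and not necessarily how the pallet would implement it.

```rust
use std::collections::HashMap;

pub type Address = [u8; 20];
pub type Word = [u8; 32];

/// Hypothetical transient storage, kept purely in memory for one transaction.
#[derive(Default)]
pub struct TransientStorage {
    values: HashMap<(Address, Word), Word>,
    /// Journal of (slot, previous value) so that reverted frames can be undone.
    journal: Vec<((Address, Word), Word)>,
}

impl TransientStorage {
    /// `TLOAD`: unset slots read as zero.
    pub fn load(&self, contract: Address, key: Word) -> Word {
        self.values.get(&(contract, key)).copied().unwrap_or([0u8; 32])
    }

    /// `TSTORE`: records the previous value before overwriting.
    pub fn store(&mut self, contract: Address, key: Word, value: Word) {
        let previous = self.load(contract, key);
        self.journal.push(((contract, key), previous));
        self.values.insert((contract, key), value);
    }

    /// Take a checkpoint when entering a call frame.
    pub fn checkpoint(&self) -> usize {
        self.journal.len()
    }

    /// Undo all writes made since `checkpoint` (the frame reverted).
    pub fn revert_to(&mut self, checkpoint: usize) {
        while self.journal.len() > checkpoint {
            let (slot, previous) = self.journal.pop().expect("length checked above");
            self.values.insert(slot, previous);
        }
    }

    /// Discard everything at the end of the transaction.
    pub fn clear(&mut self) {
        self.values.clear();
        self.journal.clear();
    }
}
```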
### `BLOBHASH` / `BLOBBASEFEE`
New opcodes related to sharding on ETH. The idea of proto-danksharding is to provide more data in the short term (data that is too expensive to store long term for all the rollups). It is questionable if/how we can support that. However, ETH folks are _always_ getting creative with whatever new functionality they put in EVM and abuse it for something else. I expect this to be the case here too, so we might want to support it anyway if we can. IMO not the highest priority though.
https://github.com/ethereum/EIPs/blob/master/EIPS/eip-4844.md
https://eips.ethereum.org/EIPS/eip-4844#gas-accounting
https://www.eip4844.com/
### `CHAINID`
Returns some number (identifier) for the chain. Ideally we don't clash with any existing ones.
### `BLOCKHASH`
`BLOCKHASH(blockNumber)` returns the hash of block number `blockNumber` (only valid for the 256 most recent blocks).
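
A small sketch of the window check, assuming the EVM convention of returning zero for block numbers outside the window (including the current block). The `lookup` closure is a hypothetical stand-in for however the runtime fetches historic block hashes.

```rust
/// `BLOCKHASH`: returns the hash of `requested` if it is one of the 256 most
/// recent blocks before `current`, and zero otherwise.
pub fn block_hash(
    current: u64,
    requested: u64,
    lookup: impl Fn(u64) -> [u8; 32],
) -> [u8; 32] {
    let in_window = requested < current && current - requested <= 256;
    if in_window {
        lookup(requested)
    } else {
        [0u8; 32]
    }
}
```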
### `GASLIMIT`
Returns the block's gas limit.
### `CODESIZE` / `EXTCODESIZE`
Returns the size of the currently running code blob (`CODESIZE`) or the size of the code at the specified address (`EXTCODESIZE`), analogous to the existing code_hash / ext_code_hash.
We can return a u32 here (the code size cannot exceed it anyway and it is much cheaper to zero-extend this into an i256 than to allocate and load from stack space).
### `INVALID`
The `INVALID` opcode (`0xFE`) reverts but also consumes all remaining gas. Could maybe be implemented via return flags.
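
One way the return-flag idea might look, as a sketch only. `CONSUME_ALL_GAS` is a hypothetical flag invented for this example, and the `REVERT` bit value is an assumption as well.

```rust
/// Hypothetical return flags passed by the contract when it returns.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ReturnFlags(pub u32);

impl ReturnFlags {
    /// Revert all state changes of this call (assumed bit value).
    pub const REVERT: u32 = 0x1;
    /// Hypothetical new flag: additionally burn all remaining gas.
    pub const CONSUME_ALL_GAS: u32 = 0x2;

    pub fn contains(self, flag: u32) -> bool {
        self.0 & flag == flag
    }
}

/// Gas handed back to the caller once the callee has returned.
pub fn gas_left_after_return(flags: ReturnFlags, gas_left: u64) -> u64 {
    if flags.contains(ReturnFlags::CONSUME_ALL_GAS) {
        0
    } else {
        gas_left
    }
}
```

With this, the compiler would lower `INVALID` to a return with `REVERT | CONSUME_ALL_GAS` set, while a plain `REVERT` keeps the remaining gas.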
### `CREATE`/`CREATE2`
The runtime should use the same address derivation as on EVM. Contract code might assume this:
https://github.com/Uniswap/v2-periphery/blob/0335e8f7e1bd1e8d8329fd300aea2ef2f36dd19f/contracts/libraries/UniswapV2Library.sol#L18
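
For reference, the `CREATE2` derivation that such contract code hard-codes can be sketched like this. The hash function is passed in as a parameter (an assumption made to keep the sketch free of external crates); in practice it would be a real keccak256 implementation.

```rust
/// `CREATE2` address derivation:
/// keccak256(0xff ++ sender ++ salt ++ keccak256(init_code))[12..]
///
/// `keccak256` is supplied by the caller so the sketch stays dependency-free.
pub fn create2_address(
    sender: &[u8; 20],
    salt: &[u8; 32],
    init_code_hash: &[u8; 32],
    keccak256: impl Fn(&[u8]) -> [u8; 32],
) -> [u8; 20] {
    // 1 byte prefix + 20 byte sender + 32 byte salt + 32 byte code hash.
    let mut preimage = [0u8; 85];
    preimage[0] = 0xff;
    preimage[1..21].copy_from_slice(sender);
    preimage[21..53].copy_from_slice(salt);
    preimage[53..85].copy_from_slice(init_code_hash);

    // The address is the last 20 bytes of the hash.
    let hash = keccak256(&preimage);
    let mut address = [0u8; 20];
    address.copy_from_slice(&hash[12..]);
    address
}
```
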
Additionally, `fn instantiate` in the contracts pallet has many parameters which don't matter for EVM. So we could have simpler `create`/`create2` API methods that behave exactly like on Ethereum and take the same parameters:
```rust=
fn create2(
code_hash_ptr: u32, // keccak256 hash image of the contract code
value_ptr: u32, // i256 ptr to balance to be transferred
input_data_ptr: u32, // constructor calldata ptr
input_data_len: u32, // constructor calldata length
address_ptr: u32, // output buffer (20 bytes) = keccak256(0xff + sender_address + salt + keccak256(initialisation_code))[12:]
salt_ptr: u32, // 32byte ptr of salt
) -> Result<()>
```
Where `CREATE2` writes the zero address on failure. Analogous for `CREATE`.
This would also benefit code size, as there are only 6 parameters, which doesn't require spilling.
Another thing to note is that we likely just ignore the output of the constructor (on EVM, the constructor output is the runtime code to be deployed; however, we assume the code is already on-chain and execute the constructor in the context of the new instance, discarding any output).
### `BALANCE`
Currently, the `balance` seal API returns the balance of the executing account. EVM has the account (address) as parameter.
## Unclear
Best to check what frontier/moonbeam do for those. I'm not sure off the top of my head what they do, but if frontier can emulate them then we can surely find a solution too. Worst case we don't support some of them and just emit a compiler error, at the cost of sacrificing compatibility.
- `PREVRANDAO`
  - https://eips.ethereum.org/EIPS/eip-4399
  - On zkSync they just return a constant so might be fine for us too (https://docs.zksync.io/build/developer-reference/differences-with-ethereum.html#difficulty-prevrandao)
  - Currently set to constant 2500000000000000
- `COINBASE`
- `ORIGIN`
- `GASPRICE`
- `DIFFICULTY` (currently set to constant 2500000000000000)
- `BASEFEE` (currently set to constant 0)
## Potential Optimizations
### Calldata and callvalue
On EVM there are `CALLDATALOAD(i) -> calldata[i]` to load a single word from calldata at offset `i` and `CALLDATACOPY(destOffset, offset, size)`, which is essentially a memcopy (`offset` is the offset from the start of calldata and `destOffset` the offset into the EVM linear heap memory). `CALLDATASIZE() -> size` returns the size of the calldata in bytes. `CALLVALUE() -> value` returns the balance transferred with this call.
Ideas discussed so far:
- The selector check requires the calldatasize, so we could provide it at a fixed memory location or in a register or on the stack at the start of the execution to spare a host API call in virtually any case.
- Callvalue is used regularly, as the contract reverts if non-payable functions are called with value.
- The runtime could provide calldata[0] at a fixed memory location. All contracts are expected to do at least a `CALLDATALOAD(0)` because this is required for the selector check. This could spare calling into `seal_input` in cases where the code doesn't use `CALLDATACOPY` at all, as the compiler can optimize `CALLDATALOAD(0)` away if the offset 0 is static (which it always is during the selector check).
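
The selector check that these ideas target could be sketched as below, assuming the runtime hands the contract the calldata size and the first calldata word up front (both parameters are hypothetical, as is the function name). No host call is needed in this path.

```rust
/// Extract the 4 byte function selector from the pre-provided first calldata word.
/// Returns `None` if the calldata is too short to contain a selector, in which
/// case the contract would fall through to its fallback or revert.
pub fn selector(calldata_size: u32, first_word: &[u8; 32]) -> Option<[u8; 4]> {
    if calldata_size < 4 {
        return None;
    }
    // `CALLDATALOAD(0)` yields a big-endian word, so the selector is its top 4 bytes.
    let mut sel = [0u8; 4];
    sel.copy_from_slice(&first_word[..4]);
    Some(sel)
}
```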
### Immutables
On ETH the deploy code can insert immutables into the code, which we can't, so they need to be stored somewhere. My naive approach would be to just store them in regular contract storage under a 4-byte index key (ETH storage keys are always 32 bytes, so this can never collide), but runtime performance is penalized by doing that.
- Could access them lazily and keep them on the stack so the penalty is only significant for the first access.
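
A sketch of the lazy access idea, with hypothetical names throughout. A `HashMap` stands in for contract storage, and a second map stands in for the stack slots the compiler would actually use as a cache; `storage_reads` only exists to make the saved reads visible.

```rust
use std::collections::HashMap;

pub type Word = [u8; 32];

/// Hypothetical lazy cache for immutables stored under 4 byte index keys.
#[derive(Default)]
pub struct Immutables {
    cache: HashMap<u32, Word>,
    /// Number of (expensive) storage reads performed so far.
    pub storage_reads: u32,
}

impl Immutables {
    /// Load the immutable at `index`, hitting contract storage only on first access.
    pub fn get(&mut self, index: u32, storage: &HashMap<Vec<u8>, Word>) -> Word {
        if let Some(value) = self.cache.get(&index) {
            return *value;
        }
        self.storage_reads += 1;
        // 4 byte key: cannot collide with the 32 byte keys used by ETH storage.
        let key = index.to_be_bytes().to_vec();
        let value = storage.get(&key).copied().unwrap_or([0u8; 32]);
        self.cache.insert(index, value);
        value
    }
}
```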