# RFB: Detecting Ethereum RPC censorship programmatically

**UPDATE**: I was wrong, Infura doesn't censor RPC reads! This post caught the eye of Infura, and they wrote [a blog post](https://blog.infura.io/post/how-to-use-ethereum-proofs-2), which you should check out. The party isn't over yet, though: the option to censor is still there unless we use secure RPC. That is why I ended up building this, check it out: https://github.com/liamzebedee/eth-verifiable-rpc

[Liam Zebedee](https://glissblog.vercel.app/) (@liamzebedee).

This is a spec, and a request for either (1) grants or (2) builders. Please reach out on the Twitter thread or over DMs if you're interested in either.

## Introduction

Recently, Ethereum node providers like Infura/Alchemy started censoring parts of the Ethereum database from being read via the [JSON-RPC APIs](https://eth.wiki/json-rpc/API). This proposal is to programmatically detect this, by building a local EVM shim that **verifiably** loads state from a remote node during execution.

## Problem

**Example**: the [ENS](ens.domains) entry for `tornadocash.eth`

On a censoring provider like Infura, the contenthash key for `tornadocash.eth` returns 0, when in fact we know it to be nonzero. You can verify this using `cast` from the [Foundry](https://github.com/foundry-rs/foundry) toolbelt:

```sh!
ETH_RPC_URL=https://mainnet.infura.io/v3/84842078b09946638c03157f83405213 cast call 0x226159d592E2b063810a10Ebf6dcbADA94Ed68b8 "contenthash(bytes32 node)" tornadocash.eth
0x00000000000000000000000000000000000000000000000000000000000000200000000000000000000000000000000000000000000000000000000000000000
```

As part of my work on [Dappnet](https://twitter.com/liamzebedee/status/1578127982173908992), I know that it's being censored. But this isn't being talked about. **What is worse** is that we don't know how to detect it.
So I'm implementing an RPC provider marketplace, and I'm unable to tell which providers will, at a moment's notice, block users from accessing their money/dapps.

## How can we detect censorship?

This section outlines **(1) how Ethereum works** and then **(2) how we can detect censorship**.

### (1) How Ethereum works

What is happening when we call `cast call 0x226... "contenthash(bytes32 node)" tornadocash.eth`?

* Ethereum is a database with a microservices layer called smart contracts, which run on the EVM.
* To write to the database, we send transactions. To read from the database, we call these smart contracts and get data.
* The read/write messages are sent over an RPC protocol called [JSON-RPC](https://ethereum.org/en/developers/docs/apis/json-rpc/) to an Ethereum node.
* The Ethereum node tracks two things - consensus (the hash of the latest block of transactions in the database) and execution (the world state and processing of txs).
* `cast call` translates to an `eth_call` RPC, which translates to running the EVM with the following message (as the EVM is a message-passing model):
    * ENS is the domain name system, mapping `(name => (key => value))`. To look a record up, we call a contract called the resolver. The resolver's address is `0x226159d592E2b063810a10Ebf6dcbADA94Ed68b8`, which we'll call `ENS_RESOLVER`.
    * We are calling `contenthash(bytes32 node)` ([impl](https://github.com/ensdomains/resolvers/blob/master/contracts/profiles/ContentHashResolver.sol)), a function on the resolver contract.
    * Our call data is encoded according to the EVM [calling convention](https://en.wikipedia.org/wiki/Calling_convention), wherein we concatenate the 4-byte function selector with its ABI-encoded arguments.
        * `cast abi-encode "contenthash(bytes32 node)(bytes memory)" $(cast --from-ascii "tornadocash.eth")`
    * This creates our message for the EVM to execute - `Message(from=0x0, to=$ENS_RESOLVER, data=0x746f726e61646f636173682e6574680000000000000000000000000000000000, value=0 ether)`.
* When `eth_call` is run, the EVM executes the bytecode of the contract, and returns data from the storage.

How does storage work?

* Traditional databases use SQL to represent data, and we write SQL in order to read/write it. In Ethereum, the language is EVM bytecode, and it operates purely on a key-value basis (`sstore`, `sload` opcodes), with no relational model (joins, etc.). Smart contracts are like programs that natively use the database for storing their data structures.
* The EVM has two notions of memory locations - `memory` aka RAM, and `storage` aka disk.
* Every contract has its own private namespace for `storage`. Other contracts cannot read it; they must use contract calls to interface with each other.

How does consensus work?

* Ethereum is a blockchain, meaning the latest block hash represents the state of your entire system - all of the transactions it has processed, the latest state of the database, the balances, the smart contract programs, etc.
* An easy way to think about it - each block represents a tick of the system, and the block hash is like the time.
* In Ethereum 1.0, the clock was based on proof-of-work. But since September 2022, it's been upgraded to a new protocol, proof-of-stake.
* **You can track the clock without tracking the rest of the database**. This is called a _light node_, but since Eth 2.0, it just means running a "consensus node" - since the consensus layer has been split from the execution layer.

What does the block hash represent?
* The block hash represents the cryptographically authenticated state of Ethereum - which is a fancy way of saying, it's a big fat merkle tree, and you can prove anything in the database by revealing a path from the root to the leaf.
* The [**_seminal_ diagram for the Ethereum world state is here**](https://ethereum.stackexchange.com/questions/268/ethereum-block-architecture/6413#6413). Seriously, this was made in 2018 and is just _that_ fucking good.
* Simply put, Ethereum's world state is split into 3 tries - accounts, code, and storage.
* This looks like:
    * `state => (Accounts(address => balance), Code(address => bytes), Storage(contract_address => (bytes32 => bytes)))`

### (2) How we can detect censorship

Concept:

* If we have a consensus node, we know the block hash.
* If we know the block hash, we can verify a proof of anything in the database.
* Looking up the `contenthash` for `tornadocash.eth` is simply running a very small amount of EVM code that interacts with a very small amount of state:
    * `state.Code[ENSResolver]`
    * `state.Code[ContentHashResolver]`
    * `state.Storage[ContentHashResolver][hashes][tornadocash.eth]`
* If we ask the RPC node for this state using `eth_getStorageAt`, we can **trivially** verify if it was censored or not. How?
    * By requesting a merkle proof of the path: `(block_hash, storage, ContentHashResolver, hashes, tornadocash.eth)`
    * If the hash check fails, then we know the state leaf isn't authentic.

Ideation:

* A lightweight consensus node like [Helios](https://github.com/a16z/helios).
* Requesting state directly from the RPC node using `eth_getStorageAt`, with proofs from [`eth_getProof`](https://docs.alchemy.com/reference/eth-getproof).
* Executing EVM `eth_call` client side (i.e. something like [Wei/FUCory's work](https://twitter.com/fucory/status/1608193056725139456?s=61&t=boSMYnkV-3i-5YN_FeTIoQ)), and [lazily loading](https://en.wikipedia.org/wiki/Lazy_loading) the storage from the remote execution node.
    * Load the `msg.to` contract's code.
    * Execute a local EVM.
    * When we encounter a `CALL`, load the corresponding contract's code.
    * When we encounter an `SLOAD`, load the corresponding storage key.
    * Verify both of these through Merkle proofs, so we can detect inauthentic state.
    * Return the value of the call like normal, i.e. `contenthash(xx)`.

Next steps:

* Sanity check that this could work.
* Build this.
* Run it against every publicly available RPC provider - Infura, QuickNode, Alchemy, POKT.

**Why?** Because while we know which nodes censor transactions to Tornado, we don't know which nodes censor read-access to Tornado.
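
To make the "lazily load storage and verify it" idea concrete, here is a minimal sketch in Python. It is a toy under loud assumptions, not the real thing: it uses a plain binary Merkle tree with SHA-256 as a stand-in for Ethereum's Merkle-Patricia tries with Keccak-256, and every name in it (`HonestNode`, `CensoringNode`, `VerifiedStorage`, `get_storage_with_proof`) is hypothetical. The trusted root is simply taken from the honest node here; in the design above it would come from a consensus node.

```python
import hashlib


def h(data: bytes) -> bytes:
    # ASSUMPTION: SHA-256 stands in for Keccak-256 (not in Python's stdlib).
    return hashlib.sha256(data).digest()


def leaf_hash(key: bytes, value: bytes) -> bytes:
    return h(b"\x00" + key + value)


def node_hash(left: bytes, right: bytes) -> bytes:
    return h(b"\x01" + left + right)


def build_levels(leaves):
    """Return every level of the toy tree, leaves first, root level last."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:  # odd level: pair the last node with itself
            cur = cur + [cur[-1]]
        levels.append([node_hash(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def make_proof(levels, index):
    """Collect the sibling hash at each level on the way up to the root."""
    proof = []
    for level in levels[:-1]:
        sib = index ^ 1
        proof.append(level[sib] if sib < len(level) else level[index])
        index //= 2
    return proof


def verify(root, key, value, index, proof):
    """Recompute the root from the claimed (key, value) and compare."""
    acc = leaf_hash(key, value)
    for sibling in proof:
        acc = node_hash(acc, sibling) if index % 2 == 0 else node_hash(sibling, acc)
        index //= 2
    return acc == root


class HonestNode:
    """Hypothetical RPC node that serves storage values with proofs."""

    def __init__(self, storage):
        self.storage = storage
        self.keys = sorted(storage)
        self.levels = build_levels([leaf_hash(k, storage[k]) for k in self.keys])
        self.root = self.levels[-1][0]

    def get_storage_with_proof(self, key):
        i = self.keys.index(key)
        return self.storage[key], i, make_proof(self.levels, i)


class CensoringNode(HonestNode):
    """Same node, but it returns zero for blocked slots."""

    def __init__(self, storage, blocked):
        super().__init__(storage)
        self.blocked = blocked

    def get_storage_with_proof(self, key):
        value, i, proof = super().get_storage_with_proof(key)
        return (bytes(32) if key in self.blocked else value), i, proof


class InauthenticState(Exception):
    pass


class VerifiedStorage:
    """Lazily loads slots from a remote node, checking each against a trusted root."""

    def __init__(self, node, trusted_root):
        self.node, self.trusted_root, self.cache = node, trusted_root, {}

    def sload(self, key):
        if key not in self.cache:
            value, index, proof = self.node.get_storage_with_proof(key)
            if not verify(self.trusted_root, key, value, index, proof):
                raise InauthenticState(f"bad proof for slot {key.hex()}")
            self.cache[key] = value
        return self.cache[key]


SLOT = b"tornadocash.eth".ljust(32, b"\x00")
OTHER = b"example.eth".ljust(32, b"\x00")
state = {SLOT: b"\x11" * 32, OTHER: b"\x22" * 32}

honest = HonestNode(state)
trusted_root = honest.root  # ASSUMPTION: would come from a consensus node

censor = CensoringNode(state, blocked={SLOT})
try:
    VerifiedStorage(censor, trusted_root).sload(SLOT)
    detected = False
except InauthenticState:
    detected = True
print("censorship detected:", detected)
```

The point of the design is that the client never trusts the value itself, only the root. A node that swaps in zeros cannot produce a proof that hashes back to the trusted root, so the read fails loudly instead of silently returning 0.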