# Ethercluster
__Situation:__ Cloudflare is used to provide the services at ethercluster.com (et al) by fetching and returning original block information from Rivet.
__Problem:__ 25k unique users/day and 25M requests per day is getting expensive, driven mostly by the Rivet cost.
## Load balancing
__Solution:__ Use the RPC server of an existing always-on geth instance of our own
instead of the expensive Rivet origin. The self-hosted instance can use a caching proxy in front of it to further mitigate origin request counts.
__Requirements:__
- [ ] Ensure consistency between origins at some regular interval, using Rivet as the canonical/control and as the fallback origin in case of mismatch.
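The consistency check above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Head` shape, the `chooseOrigin` name and the `maxLag` tolerance are hypothetical and not part of any existing code. It treats Rivet as canonical and falls back to it whenever the self-hosted node is unreachable, disagrees on the hash at the same height, or drifts too far in height.

```typescript
// Hypothetical shape: the head block as reported by one origin.
interface Head {
  number: number;
  hash: string;
}

type Origin = "self" | "rivet";

// Decide which origin to serve from, treating Rivet as the control.
// maxLag is an assumed tolerance (in blocks) between the two heads.
function chooseOrigin(ours: Head | null, rivet: Head, maxLag: number = 2): Origin {
  // Self-hosted node could not be reached: use the fallback.
  if (ours === null) return "rivet";
  if (ours.number === rivet.number) {
    // Same height but a different hash means the two nodes disagree.
    return ours.hash.toLowerCase() === rivet.hash.toLowerCase() ? "self" : "rivet";
  }
  // Different heights: tolerate a small gap, otherwise fall back.
  return Math.abs(rivet.number - ours.number) <= maxLag ? "self" : "rivet";
}
```

A scheduled job could call this at the chosen interval and store the result, so that request handling only reads the current choice.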
## Caching
__Solution:__ Use Cloudflare Workers to cache chain data, avoiding redundant calls to Rivet.
OpenRPC?
Some method calls return network-static data.
These are VERY time insensitive.
These can be cached indefinitely, or for the longest allowable time.
- eth_chainId
- eth_coinbase (?)
- eth_mining (?)
- net_listening
Some method calls return SOMEWHAT time insensitive data.
These can be cached at some reasonable non-tiny interval, eg. 5m.
- eth_mining
- web3_clientVersion
- net_version
- eth_protocolVersion
- eth_gasPrice (?)
- eth_syncing (?)
- net_peerCount (?)
Some method calls are INVALID.
Some method calls return ADMINISTRATIVE data.
These should be dropped.
Instead of compiling a list, as below, we could "blacklist" request methods
based on their responses; eg. `response: -32601 method does not exist`.
Any calls returning this error once would be dropped forever (OK, for a long time).
This way Rivet gets to tell us what is and is not allowed.
- miner_start
- eth_setCoinbase
- admin_xxx
- personal_xxx
- debug_xxx
- miner_xxx
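The response-driven "blacklist" idea could look something like the sketch below. It assumes the origin signals an unknown or disallowed method with the JSON-RPC 2.0 "Method not found" code `-32601`; the `LearnedDenylist` class and its `ttlMs` parameter are hypothetical names chosen for illustration.

```typescript
// JSON-RPC 2.0 reserves -32601 for "Method not found".
const METHOD_NOT_FOUND = -32601;

interface RpcError {
  code: number;
  message: string;
}

interface RpcResponse {
  jsonrpc: "2.0";
  id: number | string | null;
  result?: unknown;
  error?: RpcError;
}

class LearnedDenylist {
  // method name -> epoch millis after which the entry is forgotten
  private readonly blocked = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Record a method once the origin has rejected it as non-existent.
  observe(method: string, response: RpcResponse, now: number = Date.now()): void {
    if (response.error !== undefined && response.error.code === METHOD_NOT_FOUND) {
      this.blocked.set(method, now + this.ttlMs);
    }
  }

  // True while the method is still within its "dropped" window.
  isBlocked(method: string, now: number = Date.now()): boolean {
    const until = this.blocked.get(method);
    if (until === undefined) return false;
    if (now >= until) {
      this.blocked.delete(method);
      return false;
    }
    return true;
  }
}
```

Using a long but finite `ttlMs` matches the "forever (OK, for a long time)" note: if the origin starts supporting a method later, the entry eventually lapses.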
Finally, some method calls return freshness-required data.
The finest granularity at which state ever needs to be maintained is the block level,
where any change in the response to `eth_getBlockByNumber("latest", false)` should
cause these caches to bust. Note that `eth_blockNumber` alone is not sufficient (reorgs).
- eth_blockNumber
- eth_getBalance
- eth_getBlockByNumber
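Taken together, the method groups above amount to a lookup from method name to caching policy. The sketch below is one possible encoding, with assumptions called out: the `Policy` type and `policyFor` name are hypothetical, `eth_mining` (listed in two groups above) is placed in the 5m group, and methods that appear in no list are passed through uncached.

```typescript
type Policy =
  | { kind: "forever" }                // network-static data
  | { kind: "ttl"; seconds: number }   // somewhat time-insensitive data
  | { kind: "drop" }                   // invalid or administrative calls
  | { kind: "per-block" }              // must bust when the head block changes
  | { kind: "passthrough" };           // assumption: anything unlisted

const NETWORK_STATIC = new Set<string>(["eth_chainId", "eth_coinbase", "net_listening"]);
const SLOW_CHANGING = new Set<string>([
  "eth_mining", // listed in both groups in the notes; the shorter TTL is assumed here
  "web3_clientVersion",
  "net_version",
  "eth_protocolVersion",
  "eth_gasPrice",
  "eth_syncing",
  "net_peerCount",
]);
const DROP_EXACT = new Set<string>(["eth_setCoinbase"]);
const DROP_PREFIXES = ["admin_", "personal_", "debug_", "miner_"];
const PER_BLOCK = new Set<string>(["eth_blockNumber", "eth_getBalance", "eth_getBlockByNumber"]);

function policyFor(method: string): Policy {
  // Dropping is checked first so no administrative call can ever be cached.
  if (DROP_EXACT.has(method) || DROP_PREFIXES.some((p) => method.startsWith(p))) {
    return { kind: "drop" };
  }
  if (NETWORK_STATIC.has(method)) return { kind: "forever" };
  if (SLOW_CHANGING.has(method)) return { kind: "ttl", seconds: 300 }; // 5m
  if (PER_BLOCK.has(method)) return { kind: "per-block" };
  return { kind: "passthrough" };
}
```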
Specifics:
### `eth_getBalance`
`eth_getBalance` is the MOST USED call, by far (eg. 380k vs the number-two spot at 29k / TODO: FACTCHECK).
Cache the call by params: `address` and `blockNumber`.
Note that an unlimited cache presents a vulnerability to cache explosion. We'll need to enforce a cache size limit.
```
ethrpc eth_getBalance --help
Params:
- (0): [Required] <address>
type: string,
pattern: ^0x[a-fA-F\\d]{40}$
- (1):<blockNumber>
description: The hex representation of the block's height,
type: string,
pattern: ^0x[a-fA-F\\d]+$ # SIC: special strings "latest", "pending", and "earliest" are allowed. "pending" is "latest"+1
Returns:
title: getBalanceResult,
oneOf: [
description: Hex representation of the integer,
type: string,
pattern: ^0x[a-fA-F0-9]+$,
,
description: Null,
type: null,
```
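The cache size limit mentioned above could be enforced with a small least-recently-used map, wherever we hold entries ourselves (for example in Worker memory). This is a generic sketch, not a description of how Cloudflare's own cache evicts; `LruCache` and `maxEntries` are hypothetical names.

```typescript
// Bounded cache: once maxEntries is exceeded, the least recently used key is evicted.
// Relies on Map preserving insertion order.
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    if (maxEntries < 1) throw new RangeError("maxEntries must be >= 1");
    this.maxEntries = maxEntries;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

With a cap in place, a flood of requests for distinct `address` and `blockNumber` pairs can only push out older entries, not grow the cache without bound.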
## Cache key
For the methods referred to above that return freshness-required data, we can generate the cache key so that it contains all the unique request info plus the requested **block_hash**.
> NOTE: we need to translate the "latest" and "pending" tags to the appropriate block_hash as well. "pending" might be left out of caching.
Using the following request as an example use case:
```
'{"jsonrpc":"2.0","method":"eth_getBalance","params":["0xc94770007dda54cF92009BFF0dE90c06F603a09f", "latest"],"id":0}'
```
The cache key will be `eth_getBalance|0xc628dc616c51384ea399e18d248120a795d486a6145f3bac766d23616cbb8efa|0xc94770007dda54cF92009BFF0dE90c06F603a09f` based on the following schema `<method>|<block_hash>|<...unique_args>`.
Generally, this defines the cache key as some serialization of `method+params...`, where `+` is
some simple concatenation (either empty or some delimiter).
We know, however, that SOME params mean the request is dynamic, for example:
- `eth_getBalance("0xc94770007dda54cF92009BFF0dE90c06F603a09f", "latest")`
This request's cache should bust whenever the value of `"latest"` changes.
On the other hand, some parameters make the request one for a constant response, for example:
- `eth_getBalance("0xc94770007dda54cF92009BFF0dE90c06F603a09f", "0x430e43e4e4c09dbc8aa722cbb833a0c707d2ed281320b64ee83558094a5060c3")`.
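A minimal sketch of the `<method>|<block_hash>|<...unique_args>` schema follows. The `cacheKey` function, the `blockParamIndex` argument and the `resolveHash` callback are hypothetical: the callback stands in for whatever lookup turns a tag such as "latest" or a block number into a block hash. "pending" is left out of caching, as suggested in the note above.

```typescript
// A 32-byte hash written as 0x followed by 64 hex characters.
const BLOCK_HASH_RE = /^0x[a-fA-F\d]{64}$/;

// Returns the cache key, or null when the request should not be cached.
function cacheKey(
  method: string,
  params: string[],
  blockParamIndex: number,
  resolveHash: (ref: string) => string | null,
): string | null {
  // A missing block parameter is treated as "latest" (assumption).
  const ref = params[blockParamIndex] ?? "latest";
  if (ref === "pending") return null;
  // A hash is already constant; tags and numbers must be resolved first.
  const hash = BLOCK_HASH_RE.test(ref) ? ref : resolveHash(ref);
  if (hash === null) return null;
  // Every other parameter is part of the request's unique identity.
  const rest = params.filter((_, i) => i !== blockParamIndex);
  return [method, hash, ...rest].join("|");
}
```

Because "latest" is replaced by the current head hash, the key changes by itself when the head changes, so older entries are simply never hit again and can be left to expire.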
## Expiration
All cached keys have to expire at some point, as we don't want entries consuming cache space forever.
Based on the importance and freshness requirements of each call, we set the expiry time as appropriate.
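One way to express a per-call expiry is as a `Cache-Control` header value set on the response before it is stored. The sketch below assumes the cache honours `s-maxage`; the `cacheControlFor` name and the one-year cap standing in for "the longest allowable time" are assumptions, not documented limits.

```typescript
// Assumed upper bound for "cache indefinitely": one year, in seconds.
const MAX_TTL_SECONDS = 31536000;

// Build a Cache-Control value for a response that may be cached for ttlSeconds.
function cacheControlFor(ttlSeconds: number): string {
  // Zero, negative and NaN all mean "do not cache".
  if (!(ttlSeconds > 0)) return "no-store";
  const clamped = Math.min(Math.floor(ttlSeconds), MAX_TTL_SECONDS);
  // s-maxage targets shared caches only, leaving client behaviour untouched.
  return clamped > 0 ? `public, s-maxage=${clamped}` : "no-store";
}
```

For example, a 5m method would get `public, s-maxage=300`, while block-scoped entries could use a TTL of a few block times since their keys already change with the head.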
## External cache setter
Methods have to be introduced so that external services (which we own) can make authenticated calls to clean the cache(s) under certain circumstances.
In case of a reorg, we might want to call a method to set/change the head block info. Optionally, we might also want to clear cached data for the forked blocks, but if sensible expiration times are set, we can simply let those entries expire.
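As a rough illustration of such a setter, the sketch below validates a bearer token and a new head before updating shared state. Everything here is hypothetical: the `handleSetHead` name, the `Bearer` scheme, the payload shape and the status codes are assumptions made for the example, and transport concerns (HTTP parsing, secret storage) are left out.

```typescript
interface HeadState {
  number: number;
  hash: string;
}

type SetHeadResult =
  | { status: 200; head: HeadState }
  | { status: 400 | 401; error: string };

const HEAD_HASH_RE = /^0x[a-fA-F\d]{64}$/;

// Compares every character instead of returning at the first mismatch.
function safeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i % (b.length || 1));
  }
  return diff === 0;
}

// Update the head block info if the caller is authenticated and the payload is valid.
function handleSetHead(
  authorization: string | null,
  body: unknown,
  secret: string,
  state: { head: HeadState | null },
): SetHeadResult {
  if (secret === "" || authorization === null || !safeEqual(authorization, `Bearer ${secret}`)) {
    return { status: 401, error: "unauthorized" };
  }
  if (typeof body !== "object" || body === null) {
    return { status: 400, error: "body must be an object" };
  }
  const { number, hash } = body as { number?: unknown; hash?: unknown };
  if (typeof number !== "number" || !Number.isInteger(number) || number < 0) {
    return { status: 400, error: "number must be a non-negative integer" };
  }
  if (typeof hash !== "string" || !HEAD_HASH_RE.test(hash)) {
    return { status: 400, error: "hash must be a 32-byte hex string" };
  }
  const head: HeadState = { number, hash };
  state.head = head; // later "latest" lookups resolve to this hash
  return { status: 200, head };
}
```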
## Split of batched requests
Almost 30% of the requests at Rivet are batched, which means a single HTTP request may carry more than one JSON-RPC request. Rivet bills on individual JSON-RPC requests.
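Since billing is per JSON-RPC request, one approach is to split each batch, answer what we can from cache, and forward only the misses. The sketch below shows that idea as two pure functions; `planBatch`, `mergeBatch` and the `lookup` callback are hypothetical names, and forwarded ids are rewritten to batch positions so that duplicate client ids cannot collide.

```typescript
type RpcId = number | string | null;

interface RpcRequest {
  jsonrpc: "2.0";
  id: RpcId;
  method: string;
  params?: unknown[];
}

interface RpcResponse {
  jsonrpc: "2.0";
  id: RpcId;
  result?: unknown;
  error?: { code: number; message: string };
}

interface BatchPlan {
  responses: (RpcResponse | null)[]; // null marks a cache miss
  forward: RpcRequest[];             // misses, with id rewritten to the batch index
  originalIds: RpcId[];
}

// lookup returns the cached result, or undefined on a miss.
function planBatch(batch: RpcRequest[], lookup: (req: RpcRequest) => unknown): BatchPlan {
  const responses: (RpcResponse | null)[] = [];
  const forward: RpcRequest[] = [];
  batch.forEach((req, index) => {
    const cached = lookup(req);
    if (cached !== undefined) {
      responses.push({ jsonrpc: "2.0", id: req.id, result: cached });
    } else {
      responses.push(null);
      forward.push({ ...req, id: index });
    }
  });
  return { responses, forward, originalIds: batch.map((r) => r.id) };
}

// Recombine cached hits with origin responses, restoring order and client ids.
function mergeBatch(plan: BatchPlan, fromOrigin: RpcResponse[]): RpcResponse[] {
  const byIndex = new Map<number, RpcResponse>();
  for (const r of fromOrigin) {
    if (typeof r.id === "number") byIndex.set(r.id, r);
  }
  return plan.responses.map((hit, index): RpcResponse => {
    if (hit !== null) return hit;
    const id = plan.originalIds[index] ?? null;
    const r = byIndex.get(index);
    if (r === undefined) {
      // -32603 is the JSON-RPC 2.0 "Internal error" code.
      return { jsonrpc: "2.0", id, error: { code: -32603, message: "missing origin response" } };
    }
    return { ...r, id };
  });
}
```

Only `plan.forward` would be sent to the origin, so a batch whose calls are all cached costs nothing at Rivet.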
## Good to check
- [x] Check if CF cache allows us to clear cache using wildcard cache keys. e.g.: clean all `eth_getBalance|0xdead|*`.
- Cache Tags are used for that, and it's an Enterprise feature (we don't have Enterprise).
# Development
## References
- https://developers.cloudflare.com/workers/examples/cache-using-fetch
- https://developers.cloudflare.com/workers/examples/cache-post-request
- https://developers.cloudflare.com/workers/examples/cache-api
- https://developers.cloudflare.com/cache/about/cache-control#cache-control-directives
- https://developers.cloudflare.com/workers/runtime-apis/cache