# Gateway fork recovery

### Assumptions

The new arweave.net is an entirely new codebase and runs entirely outside the current Erlang context.

The gateway will be deployed with at least 2 arweave nodes, which it will communicate with and trust. The gateway is not interested in crawling the network or communicating with other, untrusted nodes.

The gateway fetches data from arweave nodes using the existing HTTP endpoints (like any external web client), and arweave nodes push events about new transactions and blocks to the gateway using HTTP webhooks. The gateway has no other way of getting data out of a node or out of the network.

The gateway will handle logic such as domain sandboxing, custom domains, caching, and other user-land concerns.

The gateway will proxy most requests to an arweave node and cache the responses, but will serve arql requests directly from its own database.

The gateway is not concerned with maintaining an independent or fully comprehensive view of the entire chain state; it is only concerned with tracking the state relevant for serving arql requests.

The gateway is more concerned with serving requests quickly and including new data quickly, so pending data should be included in arql responses. Data that went into forks should be removed as quickly as possible, but in edge cases where it is ambiguous, we should favour holding on to that data and continuing to serve it.

Ideally the gateway shouldn't have complicated fork resolution logic, and should rely on nodes for the more complicated blockchain operations.

The gateway code will run on explicitly short-lived containers (currently AWS Lambda), so we should favour stateless, idempotent, event-driven data flows responding to HTTP events, rather than polling for changes and long-running or stateful processes. It is possible to have cron-style jobs run on schedules, but this should be avoided as much as possible.

Gateway behaviours are not part of the core protocol and will change over time.
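The preference for stateless, idempotent handling of webhook events can be illustrated with a minimal sketch. It assumes payloads shaped like the `transaction` and `block` webhook events described in this document, that a transaction carries a `tags` array of `{ name, value }` objects, and that a block's `indep_hash` serves as its key. The names `planUpserts` and `applyOperations`, and the in-memory `Map` standing in for the database, are hypothetical.

```javascript
// Hypothetical sketch: turn a webhook payload into keyed upsert operations.
// The payload shapes and field names here are assumptions, not a confirmed API.
function planUpserts(event) {
  if (event.transaction) {
    const tx = event.transaction;
    const tags = tx.tags || [];
    return [
      { table: 'transactions', key: tx.id, row: tx },
      // One row per tag, keyed by (tx_id, index) so replays overwrite in place.
      ...tags.map((tag, index) => ({
        table: 'tags',
        key: `${tx.id}/${index}`,
        row: { tx_id: tx.id, index, name: tag.name, value: tag.value },
      })),
    ];
  }
  if (event.block) {
    // Assumption: the block's `indep_hash` is used as the `blocks.id` key.
    return [{ table: 'blocks', key: event.block.indep_hash, row: event.block }];
  }
  return []; // Unknown event types are ignored rather than rejected.
}

// Stand-in for the database: a Map keyed by table and primary key.
// Writing the same operation twice leaves the store unchanged.
function applyOperations(store, operations) {
  for (const op of operations) {
    store.set(`${op.table}:${op.key}`, op.row);
  }
  return store;
}

// Usage: delivering the same event twice yields the same state.
const store = new Map();
const event = {
  transaction: { id: 'tx1', owner: 'owner1', tags: [{ name: 'App', value: 'demo' }] },
};
applyOperations(store, planUpserts(event));
applyOperations(store, planUpserts(event));
console.log(store.size); // 2 (one transaction row, one tag row)
```

Because every write is keyed by a primary key, a retried or duplicated webhook delivery needs no special handling in this sketch.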
### Approx layout of components and actors

![](https://i.imgur.com/K8XceBj.png)

The gateway is actually a collection of services and components, including CDN/caching, but for the purposes of this document it can be considered a single unit that simply responds to HTTP requests.

### Current webhook events

```javascript
{
    transaction: {
        // all tx fields minus data
    }
}
```

```javascript
{
    block: {
        // all block fields minus wallet_list
    }
}
```

### Proposed new webhook event

This would contain an array of the last n blocks, with each block containing the same fields as the regular block endpoint.

```javascript
{
    event: 'fork-recovery',
    blocks: [
        {
            // all block fields minus wallet_list
        }
    ]
}
```

### Postgres database

A Postgres database will be used to serve arql queries.

We should try to keep operations and processes that interact with the database as simple and stateless as possible. Idempotent upserts with little or no branching logic, which can be run over and over again, should be preferred over flows like _select x_, _fetch some other data over an api_, _maybe insert some data back into the table_.

E.g.

```sql
begin;

insert into "transactions" ("content_type", "data_root", "data_size", "data_tree", "id", "last_tx", "owner", "quantity", "reward", "signature", "tags", "target")
values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
on conflict (id) do update set
    signature = excluded.signature,
    owner = excluded.owner,
    target = excluded.target,
    reward = excluded.reward,
    last_tx = excluded.last_tx,
    tags = excluded.tags,
    content_type = excluded.content_type,
    quantity = excluded.quantity,
    data_size = excluded.data_size,
    data_root = excluded.data_root,
    data_tree = excluded.data_tree;

insert into "tags" ("index", "name", "tx_id", "value")
values (?, ?, ?, ?), (?, ?, ?, ?), (?, ?, ?, ?), (?, ?, ?, ?), (?, ?, ?, ?)
on conflict (tx_id, index) do update set
    name = excluded.name,
    value = excluded.value;

commit;
```

#### Transactions

```
id, owner, content_type, data_size, tags, ...
```

Primary key = `id`

#### Tags

```
tx_id, index, name, value
```

`index` = the position of the tag in the transaction's tags array, which is also its sort order.

Primary key = composite key of `tx_id, index`

#### Blocks

```
id, height, timestamp, ...
```

An arql request looks something like this:

```sql
select distinct(tx_id)
from tags
where (tags.name = :arql_name and tags.value = :arql_value)
```

Where clauses are generated dynamically based on the user-supplied query.
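The dynamic generation of where clauses could be sketched as follows. This assumes arql queries arrive as nested objects of the form `{ op, expr1, expr2 }`, with `op` being `equals`, `and`, or `or`. The helper name `buildWhere` is hypothetical, and matching each tag condition through a subquery is one possible approach, not necessarily what the gateway does.

```javascript
// Hypothetical sketch: compile an arql expression into a parameterised where clause.
// Assumes expressions shaped like { op, expr1, expr2 } with op in equals/and/or.
function buildWhere(expr, params = []) {
  switch (expr.op) {
    case 'equals':
      // Bind the tag name and value as parameters, never interpolate them.
      params.push(expr.expr1, expr.expr2);
      return {
        sql: 'tx_id in (select tx_id from tags where tags.name = ? and tags.value = ?)',
        params,
      };
    case 'and':
    case 'or': {
      // Compile left then right so parameter order matches placeholder order.
      const left = buildWhere(expr.expr1, params).sql;
      const right = buildWhere(expr.expr2, params).sql;
      return { sql: `(${left} ${expr.op} ${right})`, params };
    }
    default:
      throw new Error(`Unsupported arql op: ${expr.op}`);
  }
}

// Usage: two tag conditions combined with `and`.
const query = {
  op: 'and',
  expr1: { op: 'equals', expr1: 'App-Name', expr2: 'demo' },
  expr2: { op: 'equals', expr1: 'Content-Type', expr2: 'text/html' },
};
const { sql, params } = buildWhere(query);
console.log(`select distinct(tx_id) from tags where ${sql}`);
console.log(params); // [ 'App-Name', 'demo', 'Content-Type', 'text/html' ]
```

Each row in `tags` holds a single name/value pair, so two conditions combined with `and` on the same row would never match. The subquery form in this sketch avoids that by testing each condition against all of a transaction's tags.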