## Fork-aware transaction pool

### Recap: current transaction pool architecture

This section is for readers who may not be familiar with the current design of the transaction pool. It briefly presents the state of the art.

#### Transaction: provides and requires tags

In Substrate, transaction ordering is implemented through the *required* and *provided* tags (introduced [here](https://github.com/paritytech/substrate/issues/728)). A tag is an opaque vector of bytes; the transaction pool does not interpret it. Tags are computed by the runtime during the call to the [`validate_transaction`](https://github.com/paritytech/polkadot-sdk/blob/ea4085ab7448bb557a1558a25af164cf364e88d6/substrate/primitives/transaction-pool/src/runtime_api.rs#L49-L53) function and returned in the [`ValidTransaction` struct](https://github.com/paritytech/polkadot-sdk/blob/ea4085ab7448bb557a1558a25af164cf364e88d6/substrate/primitives/runtime/src/transaction_validity.rs#L263-L272). Each transaction may require certain tags, and each transaction provides some tags. A transaction is ready to execute if its *required* tags are empty or satisfied (by another transaction, or a chain of transactions, that are also ready). Otherwise the transaction is future. A common example is the [account nonce](https://github.com/paritytech/polkadot-sdk/blob/d4c426afd46f43b81115911657ccc0002a361ddb/substrate/frame/system/src/extensions/check_nonce.rs#L116-L121).

#### Submit

New transactions are submitted to the transaction pool at a specific block. The pool validates the transaction with the runtime against the state of this block, and subsequently places it in either the *future* or *ready* pool. Upon importing a new transaction into the transaction pool, a notification is dispatched to the listeners.
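The tag-based ordering above can be sketched as follows. This is a minimal illustration with simplified types (`Tag`, `ValidTransaction` and `is_ready` here are stand-ins, not the real polkadot-sdk definitions):

```rust
// A tag is an opaque vector of bytes; the pool never interprets it.
type Tag = Vec<u8>;

// Simplified stand-in for the runtime's validity result.
struct ValidTransaction {
    requires: Vec<Tag>,
    provides: Vec<Tag>,
}

/// A transaction is ready when its required tags are empty or every
/// required tag is already provided (by chain state or by other
/// ready transactions); otherwise it is future.
fn is_ready(tx: &ValidTransaction, satisfied: &[Tag]) -> bool {
    tx.requires.iter().all(|t| satisfied.contains(t))
}

fn main() {
    // Nonce-style tags: "account:nonce" encoded as bytes.
    let tx = ValidTransaction {
        requires: vec![b"alice:0".to_vec()],
        provides: vec![b"alice:1".to_vec()],
    };
    // Nothing provides "alice:0" yet, so the transaction is future.
    assert!(!is_ready(&tx, &[]));
    // Once "alice:0" is provided, the transaction becomes ready.
    assert!(is_ready(&tx, &[b"alice:0".to_vec()]));
    println!("ok");
}
```

With `CheckNonce`, the provided tag of the transaction with nonce *n* is exactly the required tag of the transaction with nonce *n + 1*, which is what chains same-account transactions in order.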
#### Single view on ready/future transactions

The singular instance of the structure called [`BasicPool`](https://github.com/paritytech/polkadot-sdk/blob/ea4085ab7448bb557a1558a25af164cf364e88d6/substrate/client/transaction-pool/src/graph/base_pool.rs#L211-L221) holds two sets of transactions: *ready* and *future*. This single *view* of the transactions within the pool is continuously updated. The update process is driven by two events: *new best block* and *finalized*. These events activate two internal processes in the transaction pool, maintenance and revalidation, which are responsible for updating the *ready* and *future* sets.

#### Maintenance

A process triggered by the *new best block* or *finalized block* notification. This process:

- removes (prunes) the transactions included in the notified block from the transaction pool,
- checks what tags were provided by the transactions included in the notified block, and updates the *ready* and *future* sets within the transaction pool.

*New best block* (or *finalized block*) events may be reported for blocks originating from a different fork than the one the transaction pool is aware of. If that is the case, transactions from blocks on the retracted fork need to be resubmitted, and the transactions on the enacted fork should be pruned.

#### Revalidation

A process that periodically validates transactions from the pool against the tip of the chain that the transaction pool is aware of. It also updates the ready and future sets, and is triggered by every maintenance run.

#### `ready_at`

Provides a future that resolves to an iterator over the ready transactions for the requested block. `ready_at` operates on the block height and, by itself, does not perform any computations. It internally waits for the maintenance process to finish.

#### `import_notification_stream`

Every transaction imported into the transaction pool is sent to this public stream, allowing listeners to be notified about newly imported transactions.
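The fork-switch part of maintenance can be sketched as below. This is an illustrative reduction (transactions are plain ids, `Block`, `Pool` and `maintain` are hypothetical names), not the real `sc-transaction-pool` code:

```rust
#[derive(Clone)]
struct Block {
    txs: Vec<u32>, // ids of transactions included in this block
}

struct Pool {
    ready: Vec<u32>, // the single ready set of the current design
}

impl Pool {
    /// On a fork switch: resubmit transactions from retracted blocks,
    /// then prune everything included in the enacted blocks.
    fn maintain(&mut self, retracted: &[Block], enacted: &[Block]) {
        for block in retracted {
            for tx in &block.txs {
                if !self.ready.contains(tx) {
                    self.ready.push(*tx);
                }
            }
        }
        for block in enacted {
            self.ready.retain(|tx| !block.txs.contains(tx));
        }
    }
}

fn main() {
    let mut pool = Pool { ready: vec![1, 2] };
    // Fork switch: a block containing tx 3 is retracted,
    // a block containing tx 1 is enacted.
    pool.maintain(&[Block { txs: vec![3] }], &[Block { txs: vec![1] }]);
    assert_eq!(pool.ready, vec![2, 3]);
    println!("ok");
}
```

Note that this work happens only when the notification arrives, which is exactly why `ready_at` has to wait for maintenance in the current design.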
#### Problems

The `ready_at` method operates on block height. It is not synced with the maintenance process in terms of forks. This results in two major issues:

- when building new blocks on top of a block which is not *best* or *finalized*, the `invalid::stale` error will occur. This is because transaction pruning was not executed on block import:

  ```
  A -- B0[u0] -- C0[..]  // If the maintenance was *not* triggered for B0,
                         // ready_at will provide u0 when building C0
                         // (which is stale from B0's perspective)
  ```

- when constructing new blocks on an alternative fork, the `invalid::future` error might arise. This occurs when blocks on the alternative fork lack transactions that serve as prerequisites for transactions present in the ready pool. As the maintained fork contains these prerequisite transactions, the ready set would comprise transactions that are considered future on the alternative fork. See the figure below:

  ```
    B1[u0,u1]--C1[u2]  //u3 is ready, after maintenance was triggered for C1
   /
  A
   \
    B0[t0,t1]--C0      //when building C0 ready_at will provide u3,
                       //which is future from B0's perspective
  ```

### Fork awareness proposal

#### View

All transactions are simply kept in a transaction pool internal array as long as they are referenced. The view is a structure that contains two sets of transactions, *ready* and *future*, at a given block (similar to the existing `BasePool` struct). The view keeps references. The view provides a *ready* iterator that can be used by the block builder. The *graph* part of the existing transaction pool implementation can be re-used for this.

The view is created for imported blocks. This can be done on every block, but it can also be customized: for consensus which builds only on top of the best block, the view may be created for best blocks only (to mimic the current behaviour).

#### `ready_at`

`ready_at` operates on the block hash, and simply returns an iterator for the view associated with the requested block.
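The per-block views and the hash-based `ready_at` can be sketched as follows. All names here (`View`, `ViewStore`, the simplified `BlockHash`/`TxId` types) are illustrative assumptions, not the proposed API:

```rust
use std::collections::HashMap;

type BlockHash = u64; // simplified; a real hash in practice
type TxId = u32;      // stand-in for a reference into the tx array

/// Ready/future sets at one specific block (akin to `BasePool`).
struct View {
    ready: Vec<TxId>,
    future: Vec<TxId>,
}

/// One view per imported block, keyed by block hash.
struct ViewStore {
    views: HashMap<BlockHash, View>,
}

impl ViewStore {
    /// `ready_at` now takes a block *hash* and just hands out the
    /// ready iterator of the matching view - no waiting for the
    /// maintenance process.
    fn ready_at<'a>(&'a self, at: BlockHash) -> Option<impl Iterator<Item = &'a TxId> + 'a> {
        self.views.get(&at).map(|v| v.ready.iter())
    }
}

fn main() {
    let mut views = HashMap::new();
    views.insert(0xb0, View { ready: vec![1], future: vec![2] });
    views.insert(0xb1, View { ready: vec![3], future: vec![] });
    let store = ViewStore { views };

    // Each fork gets its own ready set.
    let ready_b0: Vec<_> = store.ready_at(0xb0).unwrap().copied().collect();
    assert_eq!(ready_b0, vec![1]);
    assert!(store.ready_at(0xff).is_none()); // no view for unknown block
    println!("ok");
}
```

Because each fork has its own view, the `invalid::stale` and `invalid::future` scenarios from the previous section disappear: the block builder always sees a ready set computed against the exact block it builds on.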
#### Destroying views

Views are destroyed when:

- a block is finalized (all views for blocks with height lower than the finalized block are destroyed),
- the state for a given block is pruned (not sure if we have a notification for this).

This may not be enough for some consensus (e.g. PoW), so an LRU cache of views can also be maintained. Maybe it should be a hybrid solution.

#### Submit

When a new transaction is submitted, we update all views for blocks with height greater than the *at block* used for submission. The same applies when a transaction is replaced.

#### Open questions

*How to handle transaction longevity?* Simply check when including into *ready* or *future*. If the transaction is stale, we simply do not include it. (And we have the valid-till field.)

*Transaction replacement:* Implementation detail: do we need a map `tx-hash -> Vec<block-hash>`? It could help to find the views that need to be updated.

*Is resubmission needed?* No. In this approach resubmission would mean that we take transactions from every imported block and put them into the transaction pool if they are not already included. This should not be required - transactions (intended to broadcast) were already gossiped. This would also implicitly put the transactions to the other nodes, even those which are not intended to [propagate](https://github.com/paritytech/polkadot-sdk/blob/ea4085ab7448bb557a1558a25af164cf364e88d6/substrate/primitives/runtime/src/transaction_validity.rs#L283).

*Is revalidation needed?* No. Every block has a valid, up-to-date set of *future* and *ready* transactions. There is no need for an additional revalidation process.

*Race between block import and ready_at call?* Are we able to prepare the *ready* set quickly enough? The solution to this potential problem: `ready_at` could be a channel (or an iterator over a lock-free queue) which can be fed by the view. So validation at a given block and block building upon this block can happen partially in parallel.

*Do we need tree-route?* Yes.
If blocks are not reported one-by-one, then we may need a pre-computed tree-route to optimize view building.

#### Are all scenarios addressed?

- new block is imported,
- new best block is notified,
- finalized block is notified,
- block is pruned,
- new transaction is submitted,
- transaction is replaced,
- warp sync (CPU usage, don't do useless work),
- sync (CPU usage, don't do useless work),
- *anything missing?*
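The channel-based `ready_at` mentioned in the open questions (as a fix for the race between block import and the `ready_at` call) can be sketched as below. The function name and types are hypothetical; the point is only that the view can feed ready transactions while the block builder already consumes them:

```rust
use std::sync::mpsc;
use std::thread;

/// View side: validates transactions for a block and streams the
/// ready ones out as soon as each is validated, instead of returning
/// a fully materialized ready set.
fn ready_at_stream(txs: Vec<u32>) -> mpsc::Receiver<u32> {
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        for id in txs {
            // ... per-transaction validation against the view's block
            // state would happen here ...
            if sender.send(id).is_err() {
                break; // the block builder went away
            }
        }
        // dropping `sender` closes the stream
    });
    receiver
}

fn main() {
    // Block-builder side: starts consuming before validation ends,
    // so view building and block building overlap in time.
    let built: Vec<u32> = ready_at_stream(vec![1, 2, 3]).iter().collect();
    assert_eq!(built, vec![1, 2, 3]);
    println!("ok");
}
```

An iterator over a lock-free queue would serve the same purpose without the blocking receiver; the channel is used here only because it is the simplest standard-library way to show the producer/consumer overlap.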