---
title: DB refactor and push syncing
tags: swarm
---
# DB refactor and push syncing
This is the initial design document for improving storage, capacity management and syncing efficiency.
Requirements for combining correctness with efficient resource utilisation:
- *fast syncing*
- newly uploaded chunks arrive at their nearest neighbourhood fast
- lean on bandwidth, but also resilient to node dropouts or eclipse attacks
- chunks cached on delivery route do not enter the syncpool
- *garbage collection*
- items that are not synced yet never get purged
- items which are least profitable are purged first
- items that are more distant are purged first
- first approximation to reconcile these is neighbourhood updates sent from caching node, later POC payments
- *total local replication*
    - nearest neighbours always synchronise all chunks historically within their area of responsibility
- optionally start filling up higher bins
- for security there might be a need to do (pull) syncing between each pair of peers
## DB structure
### Counters
The database for chunks is relying on DB abstractions using shed. Shed provides ways to define global fields and indexes over `IndexItem`.
An index item has the following fields:
- accessTimestamp
- time of latest chunk access
- represents recency of access
- mutable
- for collision resistance should be combined with hash
    - 0 is null value
- storedTimestamp:
- time of chunk storage
- represents chunk age
- immutable
- for collision resistance should be combined with hash
    - 0 is null value
- address:
    - the chunk's address (either content address or feed update address)
- to be validated against chunk data by localstore validators
    - empty slice is null value
- data:
- chunk data
- to be validated against address by localstore validators unless switched off for testing with globalstore
    - empty slice is null value
- useMockStore
- pointer to bool to indicate to the encode/decode functions
- nil is null value
`shed.Get(ctx, item)` merges the fields decoded from the stored value into `item`. Therefore all fields of `IndexItem` are treated as nullable.
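The merge semantics can be sketched as follows. This is an illustrative stand-in, not the actual shed implementation: only non-null fields of the stored value overwrite the caller's item, using the null values listed above.

```go
package main

import "fmt"

// IndexItem mirrors the fields listed above. All fields are nullable:
// 0 for timestamps, empty slices for Address/Data, nil for the
// mock-store flag.
type IndexItem struct {
	Address         []byte
	Data            []byte
	AccessTimestamp int64
	StoredTimestamp int64
	UseMockStore    *bool
}

// Merge copies only the non-null fields of stored into item, which is
// the behaviour attributed to shed.Get above: fetched values are merged
// into the caller's item rather than replacing it wholesale.
func (item *IndexItem) Merge(stored IndexItem) {
	if stored.AccessTimestamp != 0 {
		item.AccessTimestamp = stored.AccessTimestamp
	}
	if stored.StoredTimestamp != 0 {
		item.StoredTimestamp = stored.StoredTimestamp
	}
	if len(stored.Address) > 0 {
		item.Address = stored.Address
	}
	if len(stored.Data) > 0 {
		item.Data = stored.Data
	}
	if stored.UseMockStore != nil {
		item.UseMockStore = stored.UseMockStore
	}
}

func main() {
	item := IndexItem{Address: []byte("chunk")}
	item.Merge(IndexItem{StoredTimestamp: 42})
	fmt.Println(item.StoredTimestamp, string(item.Address)) // 42 chunk
}
```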
### Indexes
The new localstore's main component is a dbstore using shed.
We will not use a memory store in the beginning. First benchmark, then use cachesize settings if needed.
The db defines the following shed indexes
- RETRIEVAL index: `hash` -> `storedTimestamp|chunk`
- ACCESS index: `hash -> accessTimestamp`
- PULL syncing index: `po|storeTimestamp|hash` -> `nil`
- PUSH syncing index: `storeTimestamp|hash` -> `nil`
- GC index: `accessedTimestamp|storedTimestamp|hash` -> `nil`
### Operations
Using the shed's indexes, the localstore dbstore will provide
several modes of update and access. Practically, calling a `mode` on the localstore will return a `ChunkStore` interface
```golang
type ChunkStore interface {
	Put(context.Context, storage.Chunk) error
	Get(context.Context, storage.Address) (storage.Chunk, error)
}
```
The importance of modes of update/access is clear once we consider that, depending on its origin, adding a new chunk requires slightly different operations.
For generality, we define updates of existing entries and removal of entries as modes of update.
- UPDATE
* **SYNCING**: when a chunk is received via syncing, it is put in
- RETRIEVAL, PULL
* **UPLOAD**: when a chunk is created by local upload it is put in
- RETRIEVAL, PULL and PUSH
* **REQUEST**: when a chunk is received as a result of retrieve request and delivery, it is put only in
  - RETRIEVAL and GC index; i.e., it does not enter the syncpool
* **ACCESS**: when an update request is received for a chunk or chunk is retrieved for delivery
- PULL and GC updated
* **SYNCED**: when push sync receipt is received
- delete from PUSH and put to GC
* **REMOVAL**: when GC-d,
- delete from RETRIEVAL, PULL, GC
- ACCESS
* **REQUEST**: when accessed for retrieval
  - UPDATE GC index with a new accessedAt (asynchronously) only if it exists
* **SYNCING**: when accessed for syncing or proof of custody request
- no update of accessedAt
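The mode-to-index mapping for the update modes can be sketched as follows. The `Accessor` name and the set-of-strings return type are illustrative assumptions; the real localstore would return the `ChunkStore` interface shown above and perform batched index writes.

```go
package main

import "fmt"

// Mode selects which indexes a Put touches, per the update modes above.
type Mode int

const (
	ModeSyncing Mode = iota // received via syncing
	ModeUpload              // created by local upload
	ModeRequest             // received via retrieve request + delivery
)

// DB stands in for the localstore; db.Accessor(mode) returns a view
// that would implement the ChunkStore interface (Put shown, Get elided).
type DB struct{}

type accessor struct {
	db   *DB
	mode Mode
}

func (db *DB) Accessor(mode Mode) *accessor { return &accessor{db, mode} }

// Put returns the indexes the chunk is entered into for this mode.
func (a *accessor) Put(addr []byte) (indexes []string) {
	switch a.mode {
	case ModeSyncing:
		indexes = []string{"RETRIEVAL", "PULL"}
	case ModeUpload:
		indexes = []string{"RETRIEVAL", "PULL", "PUSH"}
	case ModeRequest:
		// does not enter the syncpool
		indexes = []string{"RETRIEVAL", "GC"}
	}
	return indexes
}

func main() {
	db := &DB{}
	fmt.Println(db.Accessor(ModeUpload).Put(nil)) // [RETRIEVAL PULL PUSH]
}
```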
### Garbage collection
- GC is discussed at length in https://hackmd.io/t-OQFK3mTsGfrpLCqDrdlw
## Current syncing
### Peer connections and syncing
Currently syncing is a protocol using the stream package.
When two nodes are connected they will start syncing both ways.
On each peer connection there is bidirectional chunk traffic.
The two directions of syncing are managed by distinct and independent stream subscriptions. In the context of a subscription, the subscriber or client is called the downstream peer, while the server is the upstream peer.
Subscriptions are per proximity order requested by the upstream peer depending on whether the downstream peer is considered to belong to the nearest neighbours. Nearest neighbours are the connected peers that belong to kademlia proximity order higher or equal to depth. Depth is defined as the proximity order such that there are at least R nearest neighbours at any one time. R is a redundancy parameter (defaults to 2, i.e. neighbourhoods contain at least 3 nodes).
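Given the definition above, the depth can be computed as a function of the connected peers' proximity orders. This is a simplified illustration of the stated definition, not the actual kademlia implementation:

```go
package main

import "fmt"

// depth returns the largest proximity order d such that at least r of
// the given peers have proximity order >= d. Peers at po >= depth form
// the nearest neighbourhood; r is the redundancy parameter.
func depth(peerPOs []int, r int) int {
	d := 0
	for {
		n := 0
		for _, po := range peerPOs {
			if po >= d+1 {
				n++
			}
		}
		if n < r {
			return d
		}
		d++
	}
}

func main() {
	// peers at proximity orders 0,1,2,3,3 with redundancy r=2:
	// two peers sit at po >= 3 but none at po >= 4, so depth is 3
	fmt.Println(depth([]int{0, 1, 2, 3, 3}, 2)) // 3
}
```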
### Subscriptions to PO bins
Nearest neighbours share an area of responsibility and replicate the contents of this area for redundancy. Therefore nearest neighbours subscribe to all PO bins higher or equal to depth.
Peers that are not nearest neighbours subscribe to the bin they belong to and, as a result, only receive chunks which are closer to them than to the upstream peer.
### Historical and live syncing
Currently each stream (subscription) allows two parallel substreams: history and live.
When the subscription is started the highest storeTimestamp in the PO bin is taken as the division point (called `sessionAt`). Already stored chunks which have storeTimestamp less than `sessionAt` are synced via the history substream whereas newly arriving chunks are synced as part of the live substream.
The streamer protocol uses priority queues and prioritises live sync over history. When throughput is reasonable, live sync stays idle, i.e. the latest stored chunk has already been synced downstream.
Historical syncing keeps intervals: the subscriber makes sure all data is synced and no interval is synced twice. The intervals are based on the upstream peer's storeTimestamp and are therefore specific to a peer connection.
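The interval bookkeeping can be sketched like this. It is a simplified stand-in for the stream package's interval store (names and the merged-range representation are assumptions):

```go
package main

import "fmt"

// Intervals records which storeTimestamp ranges have already been synced
// from one upstream peer, so no range is fetched twice. Ranges are kept
// sorted and non-overlapping.
type Intervals struct {
	ranges [][2]uint64 // sorted, disjoint [from, to] pairs
}

// Add merges [from, to] into the recorded ranges.
func (iv *Intervals) Add(from, to uint64) {
	merged := [][2]uint64{}
	for _, r := range iv.ranges {
		if r[1]+1 < from || to+1 < r[0] { // disjoint: keep as is
			merged = append(merged, r)
			continue
		}
		if r[0] < from { // overlapping or adjacent: absorb
			from = r[0]
		}
		if r[1] > to {
			to = r[1]
		}
	}
	merged = append(merged, [2]uint64{from, to})
	// restore sort order: only the appended range can be out of place
	for i := len(merged) - 1; i > 0 && merged[i][0] < merged[i-1][0]; i-- {
		merged[i], merged[i-1] = merged[i-1], merged[i]
	}
	iv.ranges = merged
}

// Next returns the first timestamp not yet covered, i.e. where history
// syncing should resume after a reconnect.
func (iv *Intervals) Next() uint64 {
	next := uint64(0)
	for _, r := range iv.ranges {
		if r[0] > next {
			return next
		}
		if r[1]+1 > next {
			next = r[1] + 1
		}
	}
	return next
}

func main() {
	iv := &Intervals{}
	iv.Add(0, 9)
	iv.Add(20, 29)
	fmt.Println(iv.Next()) // 10: the gap between synced ranges
}
```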
### The OfferedHashes--WantedHashes roundtrip
As a result of syncing streams on each peer connection, a chunk can and will be offered to a peer by multiple upstream peers. In order to save bandwidth by not sending chunk data to peers that already have it, the stream protocol implements a roundtrip: before sending chunks, the upstream peer offers a batch of hashes, to which the downstream peer responds by stating which hashes in the offered batch it actually needs.
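The downstream half of the roundtrip can be sketched as a bit vector over the offered batch (the actual WantedHashes message encoding may differ; this only illustrates the bandwidth-saving idea):

```go
package main

import "fmt"

// want computes the downstream peer's response to an OfferedHashes
// batch: a bit vector with one bit per offered hash, set when the chunk
// is not yet stored locally, so upstream sends only the missing data.
func want(offered [][]byte, have func([]byte) bool) []byte {
	bv := make([]byte, (len(offered)+7)/8)
	for i, h := range offered {
		if !have(h) {
			bv[i/8] |= 1 << uint(i%8)
		}
	}
	return bv
}

func main() {
	local := map[string]bool{"aa": true}
	offered := [][]byte{[]byte("aa"), []byte("bb"), []byte("cc")}
	bv := want(offered, func(h []byte) bool { return local[string(h)] })
	fmt.Printf("%08b\n", bv[0]) // 00000110: want "bb" and "cc" only
}
```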
### Caveats
Live and history streams are separate primarily to fast-track newly uploaded chunks so that they become available for retrieval sooner. This however breaks down if a peer disconnects before live syncing completes. Upon reestablishing the connection, or when connecting to another peer, the chunks that were previously in the live stream become history. What's more, since they are recent history, they will be synced only after the entire earlier history completes.
Pull syncing is node-centric as opposed to chunk-centric, i.e., it makes sure each connected peer's storage is filled with the items closest to it. It does not, however, make sure that any particular chunk has arrived in its local neighbourhood.
## Push Syncing
Push syncing is applied to rectify these shortcomings of pull/stream syncing. It is a protocol followed by the uploader to make sure each newly uploaded chunk reaches its destination and gets replicated across the nodes in whose area of responsibility it belongs.
Push sync is a proactive protocol: each chunk is sent towards its local neighbourhood and a statement-of-storage message is expected as a response.
### Retries
In pull syncing, a chunk travels through intermediate nodes which store it and pass it on asynchronously.
Push syncing, however, uses neighbourhood addressing, ie., the chunk is sent directly to storer nodes and not even cached by relaying nodes.
Uploaders can keep monitoring responses to chunks they sent to the network. After some time, the chunk is pushed to local neighbourhood again.
### Tracking progress
Since a message is expected from storers to indicate that chunks have been synced, the uploader can make sure the file/collection is fully retrievable before publishing, without having to poll and actually retrieve data.
### Security, Attacks
Since responses are statements of custody, there is a possibility of a griefing attack where a malicious node eclipses the route for some chunks and responds with statements reassuring the uploader while not actually forwarding the message.
This is best mitigated by having pull syncing in place, which channels the chunks to every closer peer and is much harder to attack.
Alternatively, nodes could be punished with a fingerpointing mechanism, but that requires stake and registration.
### POC protocol
The uploader uses the PUSH syncing index (`storeTimestamp|hash` -> `nil`).
They iterate over the index from the beginning and send the chunks to the nodes in whose area of responsibility each chunk falls. For this they use `pss` neighbourhood addressing with no encryption, on the `SYNC` topic.
Storer nodes subscribe to the `SYNC` topic and pick up messages addressed to a destination that falls within their area of responsibility.
Once a storer picks up a chunk in their area of responsibility, they send back a receipt to the uploader using unencrypted direct messaging on the `RECEIPT` topic.
When the uploader (subscribed to the `RECEIPT` topic) receives a receipt, they delete the item from the PUSH index and enter it into the GC index; as a result, the chunk can now be deleted.
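The index transition on receipt can be sketched with the indexes simplified to sets keyed by chunk address (an illustration of the rule above, not the shed-backed implementation):

```go
package main

import "fmt"

// handleReceipt applies the transition described above: on a receipt
// the chunk leaves the PUSH syncing index and enters the GC index,
// making it eligible for garbage collection. Duplicate or unknown
// receipts are ignored.
func handleReceipt(addr string, pushIndex, gcIndex map[string]bool) {
	if !pushIndex[addr] {
		return
	}
	delete(pushIndex, addr)
	gcIndex[addr] = true
}

func main() {
	push := map[string]bool{"c1": true}
	gc := map[string]bool{}
	handleReceipt("c1", push, gc)
	fmt.Println(len(push), gc["c1"]) // 0 true: c1 is now GC-eligible
}
```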
Anonymous uploaders lose the ability to track synced status through receipts.
### Components
The current WIP implementation of push syncing uses the following components. The code is found here: https://github.com/ethersphere/go-ethereum/pull/971
- dispatcher:
- fed chunks and sends them to neighbourhood using pub to SYNCING topic
- receives receipts using sub on RECEIPT topic
- storer
- receives chunks using sub on SYNCING topic
  - stores the chunks
- sends receipts to uploader/sender using pub to RECEIPT topic
- pubsub
- provides pub/sub
- pss adapter using raw (unencrypted) send
- use neighbourhood addressing for CHUNK topic
- use direct addressing with RECEIPT topic
- db
  - to interface with localstore/netstore
- subscribes to PUSH syncing index iteration (forever)
- feeds the dispatcher
- tags
  - chunk state machine SPLIT|STORED|SENT|SYNCED
- tags associated with chunks (to indicate file/collection)
- and have states with a counter associated with them
  - as chunks change state, the count of chunks that have reached that state for the tag is incremented
- counts are monotonic allowing for progress bar functionality
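The tag counters above can be sketched as follows (names are illustrative, not the WIP code): per-state counts only ever increase, which is what makes a monotonic progress bar possible.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// State is a chunk's position in the push-sync lifecycle.
type State int

const (
	Split State = iota
	Stored
	Sent
	Synced
	numStates
)

// Tag counts, per file or collection, how many chunks have reached each
// state. Counts are monotonically increasing.
type Tag struct {
	Total  int64 // number of chunks in the file/collection
	counts [numStates]int64
}

// Inc records that one more chunk has reached state s.
func (t *Tag) Inc(s State) { atomic.AddInt64(&t.counts[s], 1) }

// Progress reports the fraction of chunks that have reached state s,
// i.e. the value a progress bar would display.
func (t *Tag) Progress(s State) float64 {
	return float64(atomic.LoadInt64(&t.counts[s])) / float64(t.Total)
}

func main() {
	t := &Tag{Total: 4}
	for i := 0; i < 4; i++ {
		t.Inc(Stored)
	}
	t.Inc(Synced)
	fmt.Println(t.Progress(Stored), t.Progress(Synced)) // 1 0.25
}
```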
This needs to be refactored:
- db should be just a feed implemented by netstore
- tags should move to the storage package and be handled there
- the push sync 'protocol' could be part of the storage package on a par with fetcher. It should be part of netstore and could be called pusher.