# DWN Sync

TL;DR - Give me all the messages that I don't already have.

A synchronization interaction is limited to a single DID. What do I mean by this? DWNs are multi-tenanted. This means that a single DWN can store data for multiple DIDs. A synchronization interaction is only concerned with syncing the data of a single tenant.

```mermaid
sequenceDiagram
    autonumber
    participant A as DWN A
    participant B as DWN B
    A->>B: SYNC_INIT
    B->>B: Validate / Verify Message
    B->>A: ACK
    B->>B: Retrieve all messages that match provided scope
    B->>A: Send messages that don't exist in filter
```

## `SYNC_INIT`

Should contain the following:

### Authentication

Some means to prove that the requesting DWN is allowed to sync the data of a tenant residing on the receiving DWN. There are a handful of different approaches to achieve this. Two that come to mind are:

* a compact JWS signed using a key associated with the DID whose data is being synced
* [DWN Permissions](https://identity.foundation/decentralized-web-node/spec/#permissions)

### Scope

Some means to identify the data that should be synced, e.g.

```json=
{
  "schema": "https://schema.org/Note"
}
```

This implies that **only** messages that have the schema `https://schema.org/Note` should be considered for sync.

:information_source: _There should be a way to say "sync everything". Maybe the absence of a scope?_

:information_source: _`scope` also exists in [DWN Permissions](https://identity.foundation/decentralized-web-node/spec/#permissions). This may be enough of a reason to use the DWN Permissioning model for sync._

:information_source: _Scope could be included in the payload of the JWS used for authn._

### Filter

A serialized Bloom filter (likely an XOR filter) that is pre-populated with the CIDs of all in-scope messages that the requesting DWN already has.

:information_source: _A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set.
False positive matches are possible, but false negatives are not._

:information_source: _XOR filters are more space-efficient than classic Bloom filters. They are particularly useful as read-only sets._

:information_source: _[open-source javascript lib](https://www.npmjs.com/package/bloom-filters)_

## Considerations

* Constructing, serializing, and transmitting a Bloom filter will become increasingly I/O intensive (linearly) as a dataset grows.
    * :warning: **TODO**: Identify whether this is actually a problem and find a solution if it is.
* Sync may require a bi-directional connection in order to sync in both directions.
* Cloud DWNs don't have a way to `SYNC_INIT` with "local" DWNs for two reasons:
    * Local DWNs are not publicly addressable
    * Cloud DWNs don't have the means necessary to authenticate (by design)
        * AKA no keys
* Some interface messages are more important than others (e.g. `Permissions*`)

## Blockers

* Tenant gating is not implemented
* Permissions are not fully implemented

## Open Questions

* **[step 5]** How should messages be synced? A few options come to mind:
    * yeet the actual messages over the wire?
        * stream?
        * http response
    * send only the message CIDs, add `GET_CID` functionality, and expect the requesting DWN to `GET_CID` all CIDs
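
## Sketch: scope + filter selection

To make steps 4 and 5 of the diagram concrete, here is a minimal sketch of how the receiving DWN might pick the messages to send back. It is an illustration under stated assumptions, not the DWN SDK: `BloomFilter`, `DwnMessage`, `SyncScope`, `buildSyncFilter`, and `selectMessagesToSend` are all hypothetical names, the filter is a hand-rolled classic Bloom filter (not an XOR filter, and not the `bloom-filters` package), serialization is skipped, and an absent `schema` is assumed to mean "sync everything".

```typescript
// Hypothetical sketch: none of these names come from the DWN spec or SDK.
class BloomFilter {
  private readonly bits: Uint8Array;
  private readonly sizeInBits: number;
  private readonly hashCount: number;

  constructor(sizeInBits: number, hashCount: number) {
    this.sizeInBits = sizeInBits;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(Math.ceil(sizeInBits / 8));
  }

  // 32-bit FNV-1a over UTF-16 code units, varied by a seed.
  private static fnv1a(value: string, seed: number): number {
    let hash = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < value.length; i++) {
      hash ^= value.charCodeAt(i);
      hash = Math.imul(hash, 0x01000193) >>> 0;
    }
    return hash;
  }

  // Double hashing: position_i = (h1 + i * h2) mod sizeInBits.
  private positions(value: string): number[] {
    const h1 = BloomFilter.fnv1a(value, 0);
    const h2 = (BloomFilter.fnv1a(value, 0x9e3779b9) | 1) >>> 0;
    const result: number[] = [];
    for (let i = 0; i < this.hashCount; i++) {
      result.push((h1 + i * h2) % this.sizeInBits);
    }
    return result;
  }

  add(value: string): void {
    for (const pos of this.positions(value)) {
      this.bits[pos >> 3] |= 1 << (pos & 7);
    }
  }

  // May return true for a value that was never added (false positive),
  // but never returns false for a value that was added.
  has(value: string): boolean {
    return this.positions(value).every(
      (pos) => (this.bits[pos >> 3] & (1 << (pos & 7))) !== 0,
    );
  }
}

interface DwnMessage {
  cid: string;
  schema: string;
}

// Assumption: a missing `schema` means "sync everything".
interface SyncScope {
  schema?: string;
}

// Requesting DWN: populate the filter with the CIDs it already has.
function buildSyncFilter(knownCids: string[]): BloomFilter {
  const filter = new BloomFilter(Math.max(1024, knownCids.length * 16), 4);
  for (const cid of knownCids) {
    filter.add(cid);
  }
  return filter;
}

// Receiving DWN: keep in-scope messages whose CID is not in the filter.
function selectMessagesToSend(
  stored: DwnMessage[],
  scope: SyncScope,
  filter: BloomFilter,
): DwnMessage[] {
  return stored
    .filter((m) => scope.schema === undefined || m.schema === scope.schema)
    .filter((m) => !filter.has(m.cid));
}

// Usage: the requester already holds cid-a and cid-b.
const requesterFilter = buildSyncFilter(["cid-a", "cid-b"]);
const responderStore: DwnMessage[] = [
  { cid: "cid-a", schema: "https://schema.org/Note" },
  { cid: "cid-b", schema: "https://schema.org/Note" },
  { cid: "cid-c", schema: "https://schema.org/Note" },
  { cid: "cid-d", schema: "https://schema.org/Person" },
];
const toSend = selectMessagesToSend(
  responderStore,
  { schema: "https://schema.org/Note" },
  requesterFilter,
);
console.log(toSend.map((m) => m.cid));
```

Because the filter has no false negatives, `cid-a` and `cid-b` are never re-sent. A false positive, however, would cause the receiving DWN to skip a message the requester is actually missing. The sketch does not handle that case, so it is worth keeping in mind when choosing the filter size.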