Alan Lundeen
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
      • Invitee
    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Versions and GitHub Sync Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
Invitee
Publish Note

Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

Your note will be visible on your profile and discoverable by anyone.
Your note is now live.
This note is visible on your profile and discoverable online.
Everyone on the web can find and read all notes of this public team.
See published notes
Unpublish note
Please check the box to agree to the Community Guidelines.
View profile
Engagement control
Commenting
Permission
Disabled Forbidden Owners Signed-in users Everyone
Enable
Permission
  • Forbidden
  • Owners
  • Signed-in users
  • Everyone
Suggest edit
Permission
Disabled Forbidden Owners Signed-in users Everyone
Enable
Permission
  • Forbidden
  • Owners
  • Signed-in users
Emoji Reply
Enable
Import from Dropbox Google Drive Gist Clipboard
   owned this note    owned this note      
Published Linked with GitHub
Subscribed
  • Any changes
    Be notified of any changes
  • Mention me
    Be notified of mention me
  • Unsubscribe
Subscribe
# Storage & Replication Proposal This document outlines a proposed system for storing, modifying, and replicating documents ("objects") in a user's Identity Hub. ## Overview In our proposed storage system, a user's Hub consists of a set of document revisions ("commits") which each represent a change (creation, update, or deletion) of a single Hub object. Each commit is signed, immutable, and content-addressable (stored and referenced by its hash). The set of commits representing an object is generally append-only, with certain exceptions made to allow garbage collection of older commits. A single Hub object consists of a linear chain of commits which can be combined via a determinstic merge algorithm to arrive at a final object state. Replication of state between Hubs is acheived by ensuring that all relevant commits have been distributed to each of a user's Hubs. The presence of all commits, combined with the merge algorithm, ensures that all Hubs eventually converge to an identical state. #### Scope The following are not in scope for this document: * How Hub objects are encrypted * The underlying transport protocol between Hubs during replication ## Commits Each Hub object consists of a series of commits which each represent an immutable, atomic update of that object. The following diagram illustrates an object `3a9de008` which consists of 3 commits: ![](https://i.imgur.com/owH9ET1.png) Each commit contains change metadata and an optional payload. For example, the following commit could represent the creation of object `3a9de008`: ```json { "rev": "3a9de008f526d239d89905e2203fa484f6e68dfc096a7c051eb80f15", "operation": "create", "committed_at": 1530308239, "parents": [], "object": "3a9de008f526d239d89905e2203fa484f6e68dfc096a7c051eb80f15", "state": { // ... }, "patch": { // ... } } ``` Followed by an update to the same object: ```json { "rev": "fe4fd3240ff1c68a85730b3cf35742c8a241a0a70bd98e8869b4b593", "operation": "update", "committed_at": 1530309810, "parents": ["3a9de008f526d239d89905e2203fa484f6e68dfc096a7c051eb80f15"], "object": "3a9de008f526d239d89905e2203fa484f6e68dfc096a7c051eb80f15", "state": { // ... }, "patch": { // ... } } ``` Commit objects contain the following fields: | Field | Type | Description | | ------ | -------- | ----------- | | rev | string | The revision ID (SHA3-256 hash of the commit object). | | committed_at | number | Commit timestamp in [unix time](https://en.wikipedia.org/wiki/Unix_time); used for roughly ordering commits within the history of an individual object. | | parents | array<string> | Array of hash(es) of the commit(s) upon which this one was based *(optional)*. | | object | string | The ID of the object that is modified by this commit, or `new` when creating an object. | | state | object | The full state of the object at this point *(optional)*. | | patch | object | A [JSON Patch](http://jsonpatch.com/) or [JSON Merge Patch](https://tools.ietf.org/html/rfc7396) object (format TBD) representing the changes in this commit *(optional)*. | | operation | string | The operation being performed on the object (`create`, `update`, or `delete`). | Typically a commit should specify at least one of the `state` or `patch` fields, or else have `operation` set to `delete`. :::warning Open: Hashing a JSON object requires a canonicalization step. This could be avoided with an approach like JWT where the hash is computed from a serialized (e.g. base64) form, however Hubs would likely be required to save two copies of each commit (the original serialized base64 plus an expanded, index-able JSON format). ::: > [name=Danny Strockis] D: Should we just use the CouchDB commit schema, or the parts of it that we need? > A: we kinda wanted to be db agnostic too... ### Linear object history The set of commits pertaining to an object are organized into a linear state based on a well-known determinstic merge algorithm. This ensures that all Hubs eventually converge on an identical object history, regardless of the order in which commits are received. The merge algorithm logic is: 1. A commit with `operation: create` always precedes (comes before) any other commit. 2. A commit with `operation: delete` always supersedes (comes after) any other commit. 3. Commits with the same `operation` are ordered based on their `committed_at` fields, with higher timestamps superseding lower timestamps. 4. In the event of two commits with identical `operation` and `committed_at` fields, the commit with the lexicographically highest `rev` field supersedes the other. Wall-clock timestamps are known to be unreliable in distributed systems. They were chosen over other approaches (like vector clocks) for the merge algorithm because they can be intuitively reasoned about by end users. The merge algorithm must only be consistent, not necessarialy "correct" based on some concept of real-world time, so wall-clock timestamps are sufficient. :::warning Open: Should the `parents` field be considered? If `CommitB` lists `CommitA` in its `parents`, `CommitB` should clearly come after `CommitA` regardless of timestamps. However, this will complicate the implementation. ::: > [name=Danny Strockis] D: Clarify that the above and below logic pertains to a single object's history, and that the ordering of all changes across a hub is up to the hub to decide/implement. ### Implicit graph-based object history In addition to the canonical linear commit state of an object, the values in the `parents` field of each commit allows the construction of a parallel history graph which more richly represents the causality of changes. The following diagram illustrates a series of commits of a single object, along with each commit's `parents` field (`p`) and `committed_at` time (`t`). The top diagram shows the commits in linear form, and the bottom shows the same commits expanded into the graph form: ![](https://i.imgur.com/gIZpg8I.png) The implicit graph history allows apps and user agents to optionally support more intelligent merge semantics for conflicting updates. For example, if the illustration above represents edits to a shared grocery list, the conflicting commits `C` and `D` could be resolved by an intelligent user agent by creating the new merged commit `E`. > A: If we make the assumption that REST-style simplicity of requests has gone out the window due to encryption requirements, and clients therefore need an SDK already, does it still make sense to keep the simpler timestamp based ordering? We could just have the SDK handle the parenting logic. > [name=Danny Strockis] D: So it's basically last-write-wins (assuming clocks are aligned). Why would anyone specify a parent commit in a change, if it's not required? > > A: This was an attempt to offer both a simple PUT/GET interface as well as a more advanced ordering system for clients that care about conflicts. > [name=Danny Strockis] Are you sure you want to base the order of changes to an object on the timestamp of the client that generated the change? I wonder if some unexpected behavior might occur, like an update not taking effect b/c the timestamp was lower than an existing commit. Or like a client specifying a change in the future that overrides any other changes made. ### Patch-based commits In the simplest usage, each commit specifies a `state` field representing the complete state of the object at that point in time. The current state of the object is thus represented by the `state` field of the final commit in the object's history (as determined by the linear merge algorithm). This approach is simple but not always ideal. For example, in the scenario of a shared grocery list that is simultaneously updated by two participants, one of the edits will "win" based on the merge algorithm and the other will be "lost" (although still available in the object's history). To accomodate this scenario, we introduce the idea of *patch commits*. A commit can specify a `patch` field containing a [JSON Patch](http://jsonpatch.com/) or [JSON Merge Patch](https://tools.ietf.org/html/rfc7396) object (the exact format is still open). If a collision is detected between multiple patch-based commits (i.e. where multiple commits reference the same parent), an intelligent app can resolve the conflict by combining the patch operations into a new merged commit. A single commit can specify both `patch` and `state` fields to indicate both the difference from the prior commit as well as the current up-to-date state of the object. > [name=Danny Strockis] D: I'm not sure I see the point of offering both the `state` and the `patch` methods of making a request. Can't we just pick one? It seems to me like the problem of conflict resolution between two commits is the same whether I use `state` or `patch`. > > A: Again, trying to give a simple and an advanced way to make changes. Advanced patch way makes it easier to resolve conflicting changes, like if two changes add different values. > > D: Worried about cases where apps try to use state & patch in combination/simultaenously. Also not sure the easier conflict resolution gained by using patch is worth the complexity of having two ways to do things. > [name=Danny Strockis] How does a client become aware of a conflict? Do they need to scan previous revisions at all times? > > The hub could send a signal in responses like a flag if the most recent version references the same parent as another commit. > [name=Danny Strockis] If a patch is submitted, does the hub do the auto-merge with previous commits to return the current object state on a GET request? > > Nope, because of encryption. This means if just one client uses patch to change an object, all clients should use patch to change/read the object. Unless we mandate that clients using patch must also submit state. ### Object deletion An object is deleted by issuing a commit for that object with the `operation` field set to `delete`. A deletion commit intentionally supersedes any other commit, which allows a Hub to be certain that the deletion is final and cannot be un-done by another commit. This allows a Hub to optionally reclaim space from deleted objects by saving only the final deletion commit and freeing any earlier commit data. Apps which want to present an "Undo delete" option can accomplish this by re-creating an identical object which will be assigned a new ID. > [name=Danny Strockis] Wouldn't we want to allow someone to un-delete an object with the same ID? > > Having a finalized delete state makes it easy to garbage collect, apps that care about undo can maybe do a soft delete. ### Garbage collection Our proposal allows a Hub to intelligently discard old commit states as needed to accomodate storage limitations (e.g. in the case of a storage-limited free plan). Any such limitations should be clearly communicated to the user. A commit is generally "safe" to delete as long as there exists an un-conflicted commit later in the linear history which contains the `state` field, since this `state`-based commit can be used as the base for further `patch`-only commits In other words, a chain of `patch`-only commits should not be broken, as missing patches will make it impossible to recalculate the final state of the object. Note that it is not possible to absolutely guarantee the safety of any deletion - there is always the possibility that a conflicting commit has been made in an offline Hub but has not been replicated yet. In practice, however, deletions may be achievable given a sufficient delay (e.g. 1-2 years) to ensure a high likelihood that no conflicts exist. As an alternative to automatic garbage collection, a Hub could present the user with a list of potentially "safe" deletions based on this logic and request confirmation. When garbage collecting old commits, the commit IDs should be retained ("tombstoned") in order to avoid re-adding the commit when replicating state from another Hub which may have a longer retention policy. ### Identifiers #### Commit identifiers Each commit is identified by the hash of its canonical JSON form without the `rev` field. Because equivalent JSON objects may differ in ordering and whitespace depending on the platform, a canonical representation is necessary to ensure a consistent hash. We propose using the rules outlined by the [json-stable-stringify](https://github.com/substack/json-stable-stringify) and [fast-json-stable-stringify](https://www.npmjs.com/package/fast-json-stable-stringify) libraries as they are well-used and simple to implement (~60 LOC). #### Object identifiers The choice of how to construct object identifiers in the system is an important one. Identifiers are ideally deterministic and verifiable. We propose to use the revision hash (`rev`) of the initial commit of a new object as the object's ID. Since the revision identifier is based on the hash of the content, this ensures that the object ID is also deterministic and verifiable. This introduces a slight chicken-and-egg problem, as commits are expected to contain an `object` field with the ID of the associated object, but the commit hash (and thus the object ID) is not known until the commit is fully constructed. To solve this, we propose that, for a commit which creates a new object, the hash of the commit should be calculated without the value of the `object` field. Thus the steps to generate the ID for such a commit are: 1. Construct the commit without the `object` or `rev` fields 2. Hash the canonicalized commit to obtain the revision hash value 3. Add both the `rev` and `object` fields to the commit Similarly, such a commit can be verified with the following steps: 1. Delete the `object` and `rev` fields from the commit object 2. Hash the canonicalized commit to obtain the revision hash value 3. Verify that both `object` and `rev` match the resulting hash An alternative approach would be to replace the `object` field in the initial commit with a special value (e.g. `new`) with the understanding that the object ID of such a commit is defined to be equal to the revision ID (`rev`) field. > Vote for`new` or`0` > [name=AF] ## Replication Because our set of commits is generally append-only and each commit is immutable, the problem of replication is simplified. Hubs need only ensure they collect the full set of available commits, which can be done by periodically contacting each peer Hub to identify any new commits which must be replicated. Once the full set of commits is obtained, the determinstic merge algorithm will ensure all Hubs converge to an identical state. Our design proposes a simple log-based replication protocol for negotiating state between pairs, but also leaves open the possibility of expanding the replication system with more efficient protocols in the future. ### Log-based replication In this approach, each Hub instance maintains an opaque state token string (a.k.a. [watermark](https://en.wikipedia.org/wiki/Watermark_(data_synchronization))) representing the current state of that Hub. A Hub should save an independent state token from each of its peers. During replication, if the Hub possesses a state token previously returned from a specific peer, the Hub can present the token back to that same peer in order to discover what commits have occurred on the peer since that time. #### State tokens We propose using opaque state tokens to offer the greatest flexibility to Hub implementers. External inspection of a state token is not guaranteed to reveal any information about the internal state of a Hub; the only supported action is to replay the token to the Hub which created it in order to identify changes since the original token was issued. The simplest implementation of a state token on a single-machine Hub could be a monotonically increasing integer which is incremented each time a commit is seen. For a complex cloud-based Hub provider spanning multiple regions and data centers, the coordination and synchronization requirements for a monotonic integer may be prohibitive. In this case, the opacity of the token allows the Hub implementer to choose a different mechanism (e.g. serializing a vector clock) which is more appropriate to their internal storage system. An alternative approach would be to specify a well-known state token format to be shared by all Hub implementations. This has the advantage of reducing bandwidth and computation needs: if a peer presents a state token which is identical to a Hub's own token, the Hub can be assured that the peer's internal state is also identical. However, it is challenging to design a format that is both simple enough for a naive Hub implementation but also expressive enough to accomodate complex multi-region providers. > [name=Danny Strockis] If I have this right, the order of the state token does not need to match the linear object history, right? So replication from a hub might return commits "out of order"? > > Correct. > [name=Danny Strockis] Are state tokens specific to the filters used? For example, what if I'm using filters A & B, but then change my query and use filter A only. Should I start my sync over, or use my previous state token? > > No, for now let's say if you change your filters you should start the sync over from zero. #### Filters To support the case of Hubs which may only wish to replicate a subset of data, we also introduce the concept of "filters". A filter is a mechanism for a Hub to specify the specific objects or hierarchies of interest during replication. The initial implementation of filters consists of a simple list of desired objects and/or areas: ```json [ { // All collections objects with the MusicPlaylist schema "interface": "collections", "schema": "schema.org/MusicPlaylist" }, { // A specific object "object": "3a9de008f526d239d8990..." }, // ... ] ``` All conditions within a given filter object must match, and at least one of the given filters must match for an object to be included. :::warning Open: Should the options for filter conditions match the permissions filters? ::: It will also be possible to expand the supported filter types in the future. For example, a filter based on [Bloom filters](https://en.wikipedia.org/wiki/Bloom_filter) could allow the space-efficient representation of a potentially large set of object IDs with an allowable false positive rate. #### Replication flow Replication occurs as a one-way process between two peers where new commits in the source are replicated to the target. A full two-way synchronization can be performed by initiating two one-way replication sessions. The one-way nature of this protocol is desirable in certain situations, for example a Hub located on a mobile phone which needs to refresh state from a Hub in the cloud but does not yet want to upload its own updates due to bandwidth and battery constraints. The protocol described here is a slightly simplified version of the [CouchDB replication protocol](http://docs.couchdb.org/en/stable/replication/protocol.html). It defines replication in terms of APIs available from each Hub which allows orchestration to be done by a third-party Replicator, assuming the third-party has sufficient permissions to access the necessary APIs in each Hub. In practice the Replicator will likely reside alongside one of the Hubs participating in the replication flow. In this case, API calls that would be made to the local Hub may be replaced by internal calls that have the same effect. The one-way replication flow from source to target follows several steps: ##### 1. Identify changed documents A request for a list of commits is sent to the *source*. This request may optionally include: * A state token previously received from the *source*, in which case the response should include only commits which *were first seen by the source* after the state token was generated. * A list of `filters`, in which case the response should include only commits on objects which match the provided filter(s). * A list of `fields` to return, to minimize the amount of metadata transferred. An example request: ```json { "iss": "did:hub.id", "aud": "did:user.id", "@type": "Replication/Read", "request": { "fields": ["rev"], "filters": [ { "since": "aGM3ODlmcVokQDM3Yzk4bzNxNDdmaDk=" }, { "interface": "collections", "schema": "schema.org/MusicPlaylist" } ], } } ``` The response should list commit IDs which match the specified request parameters, e.g.: ```json { "@type": "Replication/Response", "response": { "state": "ZHNmc2RmZGZzZGZzYWRmcedA2RzZmE=", "commits": [ {"rev": "3a9de008f526d239d89905e2203fa484f6e68dfc096a7c051eb80f15"}, {"rev": "fe4fd3240ff1c68a85730b3cf35742c8a241a0a70bd98e8869b4b593"}, {"rev": "7eb02e738ec6eb95542b6e0555c2231ad299ccda124e056ed850c88e"} ] } } ``` :::warning TODO: Define pagination options (just update `since` field?) and possibly a continuous feed option. ::: > [name=Danny Strockis] How does this Replication API adhere to authorization rules? Does it evaluate all permissions to determine which objects can/can't be included in this response? > > The thought was that this API/protocol was not meant for clients, only for user agents and hubs that have access to all your data. We could revisit this if we think clients need to partake in replication. > > How did the hub get permissions to read objects in the first place? Did it ask for consent to all objects somehow? ##### 2. Calculate revision difference Any commit IDs already seen by the *target* should be subtracted from the list of commits returned by the *source*. The remaining difference represents the state in the *source* which must be replicated to the *target*. This is done by sending a request to the target with the change feed details from the source: ```json { "iss": "did:hub.id", "aud": "did:user.id", "@type": "Replication/Diff", "request": { "commits": [ {"rev": "3a9de008f526d239d89905e2203fa484f6e68dfc096a7c051eb80f15"}, {"rev": "fe4fd3240ff1c68a85730b3cf35742c8a241a0a70bd98e8869b4b593"}, {"rev": "7eb02e738ec6eb95542b6e0555c2231ad299ccda124e056ed850c88e"} ] } } ``` The target should respond with a list of missing commits which it would like to receive: ```json { "@type": "Replication/Response", "response": { "commits": [ {"rev": "fe4fd3240ff1c68a85730b3cf35742c8a241a0a70bd98e8869b4b593"}, {"rev": "7eb02e738ec6eb95542b6e0555c2231ad299ccda124e056ed850c88e"} ] } } ``` ##### 3. Replicate change data After determining the set of missing commits, an additional request should be issued to retrieve the full details of the relevant commits: ```json { "iss": "did:hub.id", "aud": "did:user.id", "@type": "Replication/Read", "request": { "since": "aGM3ODlmcVokQDM3Yzk4bzNxNDdmaDk=", "filters": [ {"rev": "fe4fd3240ff1c68a85730b3cf35742c8a241a0a70bd98e8869b4b593"}, {"rev": "7eb02e738ec6eb95542b6e0555c2231ad299ccda124e056ed850c88e"} ] } } ``` With no `fields` parameter specified, the response should return the full details of each matching commit: ```json { "@type": "Replication/Response", "response": { "state": "ZHNmc2RmZGZzZGZzYWRmcedA2RzZmE=", "commits": [ { "rev": "fe4fd3240ff1c68a85730b3cf35742c8a241a0a70bd98e8869b4b593", "committed_at": 1530309810, "parents": ["3a9de008f526d239d89905e2203fa484f6e68dfc096a7c051eb80f15"], "object": "3a9de008f526d239d89905e2203fa484f6e68dfc096a7c051eb80f15", "state": { // ... }, "patch": { // ... } }, { "rev": "7eb02e738ec6eb95542b6e0555c2231ad299ccda124e056ed850c88e", "committed_at": 1530309923, "parents": ["fe4fd3240ff1c68a85730b3cf35742c8a241a0a70bd98e8869b4b593"], "object": "3a9de008f526d239d89905e2203fa484f6e68dfc096a7c051eb80f15", "state": { // ... }, "patch": { // ... } } ] } } ``` Once all missing commits have been replicated to the `target`, the one-way replication process is complete. :::warning Open: Pagination is still an open question at the Hub protocol level. The pagination scheme chosen for the Hub overall should also apply to the `Replication` interface. ::: ### Sample replication flow The following diagram illustrates a sample replication flow that might occur between a Hub on a user's mobile device (`Phone Hub`) and a Hub hosted by a cloud provider (`Cloud Hub`). Initially, the user chooses to sync only music playlists to their `Phone Hub` : ```sequence participant Phone Hub as PH participant Cloud Hub as CH PH -> CH: Request all changes for filter {"schema": "schema.org/MusicPlaylist"} since the beginning of time CH -> PH: Return state token "abc123" and list of commit IDs PH -> CH: Request full commit details for filter {"rev":"537ad1af"} CH -> PH: Return full state for matching changes ``` The `Phone Hub` goes offline and then comes back online at a later date and wishes to discover any relevant changes since the last replication: ```sequence participant Phone Hub as PH participant Cloud Hub as CH PH -> CH: Request all changes for filter {"schema": "schema.org/MusicPlaylist"} since state token "abc123" CH -> PH: Return state token "def456" and list of commit IDs PH -> CH: Request full commit details for filter {"rev":"7af33ed9"} CH -> PH: Return full state for matching changes ``` Later, the user also chooses to synchronize their recipes collection to their phone: ```sequence participant Phone Hub as PH participant Cloud Hub as CH PH -> CH: Request all changes for filter {"schema": "schema.org/Recipe"} since the beginning of time CH -> PH: Return state token "ghi789" and list of commit IDs PH -> CH: Request full commit details for filter {"rev":"8ae4e543"} CH -> PH: Return full state for matching changes ``` > [name=Danny Strockis] Note that for client apps to partake in replication, we'll likely have to build SDKs. > [name=Danny Strockis] So I understand that this replication protocol is meant to work as a two-way sync between mobile clients and servers, where neither is the "master". But I only see a pull-style replication flow here. How does the phone sync its changes to the hub? Right now it seems like the hub(s) is/are the master instance(s), and the phone pushes changes to the server hubs before other instances can become aware of them. Does this really achieve the censorship protection goals you had? > > Can add couchDB style push changes into spec. > [name=Danny Strockis] On a related note, how do clients/oher hubs figure out which hub to contact to sync/get data, or to write a change? Will they always just pick the first in the DDO? Doesn't that first hub then retain censorship power? > > Similarly, if one hub sends back a 404 not found for a document, how does the client know it should check a different hub for the same document? Do clients need to always partake in replication and sync changes from multiple hubs to protect against censorship? How many hubs? ### Future replication enhancements #### Continuous log-based replication A future enhancement could add support for continuous log-based replication based on keepalive connections in the `Replication/Read` request. Rather than closing the connection after processing new commits, the connection would be kept open by periodic heartbeats, and further commits could be be sent as soon as they are available. This would require modification of the Hub request/response protocol to support streaming responses where each chunk of data is encrypted as a separate blob. #### Additional replication protocols The log-based replication system is a reasonably simple approach that has been used with success in other systems like CouchDB. However, if Hub usage grows signifigantly, it may make sense to investigate the possibility of adding more efficient but more complex replication options. As an example, [Invertible Bloom Filters](https://www.ics.uci.edu/~eppstein/pubs/EppGooUye-SIGCOMM-11.pdf) containing commit IDs could be used to efficiently identify commits missing from the *target*, after which the standard replication protocol could be used to transfer the necessary commit data.

Import from clipboard

Paste your markdown or webpage here...

Advanced permission required

Your current role can only read. Ask the system administrator to acquire write and comment permission.

This team is disabled

Sorry, this team is disabled. You can't edit this note.

This note is locked

Sorry, only owner can edit this note.

Reach the limit

Sorry, you've reached the max length this note can be.
Please reduce the content or divide it to more notes, thank you!

Import from Gist

Import from Snippet

or

Export to Snippet

Are you sure?

Do you really want to delete this note?
All users will lose their connection.

Create a note from template

Create a note from template

Oops...
This template has been removed or transferred.
Upgrade
All
  • All
  • Team
No template.

Create a template

Upgrade

Delete template

Do you really want to delete this template?
Turn this template into a regular note and keep its content, versions, and comments.

This page need refresh

You have an incompatible client version.
Refresh to update.
New version available!
See releases notes here
Refresh to enjoy new features.
Your user state has changed.
Refresh to load new user state.

Sign in

Forgot password

or

By clicking below, you agree to our terms of service.

Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
Wallet ( )
Connect another wallet

New to HackMD? Sign up

Help

  • English
  • 中文
  • Français
  • Deutsch
  • 日本語
  • Español
  • Català
  • Ελληνικά
  • Português
  • italiano
  • Türkçe
  • Русский
  • Nederlands
  • hrvatski jezik
  • język polski
  • Українська
  • हिन्दी
  • svenska
  • Esperanto
  • dansk

Documents

Help & Tutorial

How to use Book mode

Slide Example

API Docs

Edit in VSCode

Install browser extension

Contacts

Feedback

Discord

Send us email

Resources

Releases

Pricing

Blog

Policy

Terms

Privacy

Cheatsheet

Syntax Example Reference
# Header Header 基本排版
- Unordered List
  • Unordered List
1. Ordered List
  1. Ordered List
- [ ] Todo List
  • Todo List
> Blockquote
Blockquote
**Bold font** Bold font
*Italics font* Italics font
~~Strikethrough~~ Strikethrough
19^th^ 19th
H~2~O H2O
++Inserted text++ Inserted text
==Marked text== Marked text
[link text](https:// "title") Link
![image alt](https:// "title") Image
`Code` Code 在筆記中貼入程式碼
```javascript
var i = 0;
```
var i = 0;
:smile: :smile: Emoji list
{%youtube youtube_id %} Externals
$L^aT_eX$ LaTeX
:::info
This is a alert area.
:::

This is a alert area.

Versions and GitHub Sync
Get Full History Access

  • Edit version name
  • Delete

revision author avatar     named on  

More Less

Note content is identical to the latest version.
Compare
    Choose a version
    No search result
    Version not found
Sign in to link this note to GitHub
Learn more
This note is not linked with GitHub
 

Feedback

Submission failed, please try again

Thanks for your support.

On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

Please give us some advice and help us improve HackMD.

 

Thanks for your feedback

Remove version name

Do you want to remove this version name and description?

Transfer ownership

Transfer to
    Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

      Link with GitHub

      Please authorize HackMD on GitHub
      • Please sign in to GitHub and install the HackMD app on your GitHub repo.
      • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
      Learn more  Sign in to GitHub

      Push the note to GitHub Push to GitHub Pull a file from GitHub

        Authorize again
       

      Choose which file to push to

      Select repo
      Refresh Authorize more repos
      Select branch
      Select file
      Select branch
      Choose version(s) to push
      • Save a new version and push
      • Choose from existing versions
      Include title and tags
      Available push count

      Pull from GitHub

       
      File from GitHub
      File from HackMD

      GitHub Link Settings

      File linked

      Linked by
      File path
      Last synced branch
      Available push count

      Danger Zone

      Unlink
      You will no longer receive notification when GitHub file changes after unlink.

      Syncing

      Push failed

      Push successfully