Holochain RSM Migration Guide

Deprecated!

Go to https://holochain-open-dev.github.io/blog/holochain-rsm-migration-guide/ instead. If you want to suggest changes, fork holochain-open-dev/blog and submit as a PR.

Holochain has always been about individuals making changes to their local state, which then transform global state: first, agents commit entries to their source chain, then those entries are transformed into DHT operations and broadcast to the appropriate authorities. An entry can be considered 'a thing that's been said', while a commit represents 'the act of saying something'. This is a peer-to-peer example of the Event Sourcing pattern: participants' source chains are streams of events that affect both their local states and a global DHT state. If all the source chains were replayed up to any point in their histories, they'd recreate the DHT at that point.
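
The replay idea can be sketched with a toy model. This is not the Holochain API: `Event`, `Dht`, and `replay` are hypothetical names, a commit is identified by its position on the author's chain rather than by a real header hash, and a delete may only target the author's own earlier commit.

```rust
use std::collections::BTreeMap;

/// A toy source-chain event (real headers carry far more information).
#[derive(Clone, Debug, PartialEq)]
pub enum Event {
    /// "The act of saying something": commits an entry with this content.
    Create(String),
    /// Deletes an earlier commit on the same chain, identified by its position.
    Delete(usize),
}

/// Toy global state: every live commit, keyed by (agent, position on chain).
pub type Dht = BTreeMap<(String, usize), String>;

/// Rebuild the toy DHT by replaying the first `upto` events of every chain.
pub fn replay(chains: &BTreeMap<String, Vec<Event>>, upto: usize) -> Dht {
    let mut dht = Dht::new();
    for (agent, chain) in chains {
        for (seq, event) in chain.iter().take(upto).enumerate() {
            match event {
                Event::Create(content) => {
                    dht.insert((agent.clone(), seq), content.clone());
                }
                Event::Delete(target) => {
                    dht.remove(&(agent.clone(), *target));
                }
            }
        }
    }
    dht
}
```

Replaying with a smaller `upto` yields the global state as it stood earlier in history, which is the property the Event Sourcing pattern relies on.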

The biggest change with Holochain RSM is more attention to this reality. It's now more explicit in the design of the new API. For instance, when you commit an entry, you get back a header hash, not an entry hash. What Holochain is saying is "here's the ID of the event that represents you speaking this entry into existence". And when you update or delete something, once again you're acting on a commit. That means that Holochain RSM is better at modelling the multi-perspective nature of the real world; Alice and Bob can now both create a to-do item saying "Buy milk after work", and Alice can mark hers as done without affecting Bob's (or all the previous times she created that same item). No need to disambiguate with extra timestamp fields in your entry structs.
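
A minimal sketch of why a header hash disambiguates where an entry hash cannot. Everything here is an assumption made for illustration: `toy_hash` uses the standard library's 64-bit hasher instead of Holochain's real hashing, and `Header` and `toy_commit` are hypothetical, heavily simplified stand-ins.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for a content hash (not the hashing Holochain actually uses).
fn toy_hash<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical simplified header: who committed what, and where on their chain.
#[derive(Hash)]
pub struct Header {
    pub author: String,
    pub chain_seq: u32,
    pub entry_hash: u64,
}

/// Commit an entry and return (header_hash, entry_hash).
/// The entry hash depends only on the content; the header hash also
/// depends on the author and the position on their chain.
pub fn toy_commit(author: &str, chain_seq: u32, content: &str) -> (u64, u64) {
    let entry_hash = toy_hash(&content);
    let header = Header {
        author: author.to_string(),
        chain_seq,
        entry_hash,
    };
    (toy_hash(&header), entry_hash)
}
```

In this model, Alice's and Bob's identical to-do items share one entry hash but get distinct header hashes, so an update or delete aimed at a header hash touches only one person's act of committing.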

Holochain RSM also hides fewer things behind abstractions. For instance, canonical updates no longer exist. Now, updates are simply accumulated for your code to retrieve and make sense of. This requires a bit more thinking, but there are a couple of benefits:

  • Because there are fewer 'leaky abstractions' to try to make guesses about, things are clearer and you can reason about them more correctly. The dreaded 'update loop' bug no longer exists.
  • With access to lower-level building blocks, you're free to create patterns that work best for your use case. Consider two styles of wiki, one that has 'official' versions of each article like Wikipedia and another that captures different people's perspectives like Federated Wiki.
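
The two wiki styles can be sketched as two selection functions over the same accumulated updates. This is a toy model, not HDK code: the `Update` struct and both function names are hypothetical, and trusting author-supplied timestamps is a simplification a real app would need to think through.

```rust
use std::collections::BTreeMap;

/// Hypothetical record of one accumulated update to an article.
#[derive(Clone, Debug, PartialEq)]
pub struct Update {
    pub author: String,
    pub timestamp: u64,
    pub content: String,
}

/// "Official version" style: pick the single most recent update, if any.
pub fn latest(updates: &[Update]) -> Option<&Update> {
    updates.iter().max_by_key(|u| u.timestamp)
}

/// "Many perspectives" style: keep each author's most recent update.
pub fn latest_per_author(updates: &[Update]) -> BTreeMap<&str, &Update> {
    let mut by_author: BTreeMap<&str, &Update> = BTreeMap::new();
    for update in updates {
        let current = by_author.entry(update.author.as_str()).or_insert(update);
        if update.timestamp > current.timestamp {
            *current = update;
        }
    }
    by_author
}
```

Both functions read the same data; only the selection logic differs, and that choice now belongs to the app rather than to the framework.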

Finally, the HDK is no longer necessary. It's still strongly recommended, because getting data across the WebAssembly boundary is still fiddly, but the core API is now simple enough for you to use directly if the HDK gets in your way or is too heavy for your use case.

Language

Holochain-Redux | Holochain-RSM | Comment
DNA instance | Cell | A DNA paired with an agent key, running in the conductor
- | Element | A commit, which is a header + optional entry data. Each commit to a source chain is represented as an element.
Commit | Create | While they're still the same thing, we felt 'create' was more in line with the CRUD language you're used to from other frameworks.
Remove | Delete | See above
Update | Update + Redirect | The previous concept of canonical updates will be replaced with updates carrying a 'redirect' flag, which are not yet available.
- | Update | RSM now allows you to do 'multivalent' updates; that is, many branching non-canonical realities.

Data structures

Holochain-Redux | Holochain-RSM | Comment
Headers contain the signature of the entry | Agents now sign the header and the (Header, Signature) pair is what gets distributed to the DHT | -
There is only one Header type; system actions are special entry types | There are multiple Header types, and some system actions (DNA, links, deletes) are completely contained in the header | This reduces DHT chatter. Header structs
Entries and headers don't have a joint struct since they are not used together very often | An Element is defined as a Header, alongside its Entry if the header type contains one | The majority of HDK calls accept and return Elements

Design

Holochain-Redux | Holochain-RSM | Comment
Link definitions are static, can only contain type and tag string | Link definitions are dynamic, can contain any arbitrary data | -
You can only validate an agent with their public key and nickname | There is a new MembraneProof entry which contains data that you pass when installing a DNA to prove that you have permission to join the DHT | -
You have no way to retrieve all the headers committed by an agent | In the future there will be a way to get all the activity for one agent (all their headers) | This is one of the pillars of Holochain's integrity model; peer witnessing of agent activity prevents malicious agents from counterfeiting their history

CRUD

Holochain-Redux | Holochain-RSM | Comment
Update and delete operate on an entry hash | Update and delete operate on a header hash | This is a big shift in focus from data to state changes, and disambiguates between the same data written at different times by different people
An entry can only be updated or deleted once, and that status change is canonical; conflicting changes can't be resolved and result in an inconsistent DHT | An entry can be updated or deleted multiple times for diverging realities, a la Git | In the future, updates will get a 'redirect' flag and a conflict resolution callback to emulate canonical updates
Update, delete, and get follow the update chain to the latest version of the specified entry before performing actions | Update, delete, and get operate directly on the specified element | This prevents redirect loops and lets you implement your own selection logic.
If you delete an entry, that entry is dead forever (tombstone set) | An entry is still alive until all headers are deleted (reference counting) | -
Creation is called 'commit' and deletion is called 'remove' | Creation is called 'create' (e.g., create_entry) and deletion is called 'delete' | -
Only app entries can be updated and deleted | App entries, agent public keys, and capability grants/claims (or rather, their elements) can be updated and deleted | -
Deleted data stays on the DHT | In the future you may be able to scrub data from the DHT: 'withdraw' will redact a mistakenly committed element of your own, and 'purge' will erase unsafe/illegal content created by others | Data scrubbing will still depend on good faith; malicious nodes will be able to ignore these operations.
Anyone can write the same entry multiple times | In the future, we'll introduce simple 'CRDT' types for entries that should only have one author | This is good for scarce/rivalrous resources, such as usernames.
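
The reference-counting rule in the table above (an entry stays alive until every header that created it has been deleted) can be illustrated with a toy store. `ToyDht` and its methods are hypothetical and only model liveness; header IDs are plain counters standing in for header hashes.

```rust
use std::collections::{HashMap, HashSet};

/// Toy store tracking which create-headers still point at each entry.
#[derive(Default)]
pub struct ToyDht {
    /// Entry content -> IDs of the commits of that entry that are not deleted.
    live_headers: HashMap<String, HashSet<u64>>,
    next_header_id: u64,
}

impl ToyDht {
    /// Commit an entry; returns a fresh header ID (stand-in for a header hash).
    pub fn create(&mut self, content: &str) -> u64 {
        let id = self.next_header_id;
        self.next_header_id += 1;
        self.live_headers
            .entry(content.to_string())
            .or_default()
            .insert(id);
        id
    }

    /// Delete one commit, identified by its header ID, not by entry content.
    pub fn delete(&mut self, header_id: u64) {
        for headers in self.live_headers.values_mut() {
            headers.remove(&header_id);
        }
    }

    /// An entry is alive while at least one of its commits is undeleted.
    pub fn is_alive(&self, content: &str) -> bool {
        self.live_headers
            .get(content)
            .map_or(false, |headers| !headers.is_empty())
    }
}
```

If Alice and Bob both create the same to-do item and Alice deletes hers, `is_alive` still returns true for that content until Bob deletes his as well.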

Development

Holochain-Redux | Holochain-RSM | Comment
Interaction with the host is complicated and requires the HDK | The host API, and the API that the host expects the guest zome to implement, are simple enough to work with directly if the HDK gets in the way | It's still preferable to use the HDK in most cases, because it hides away the boilerplate code required to work around the Rust compiler and transfer data through the WASM boundary.
Individual entry types are defined with a callback tagged with the #[entry_def] macro and return a ValidatingEntryType struct; the entry! macro helps construct it | All entry definitions are simple structs, returned in a single entry_defs callback | The HDK makes it easy to treat Rust structs as entries using the #[hdk_entry] macro.
The number of required validations cannot be specified | The number of required validations can be specified per entry type definition | Not currently hooked up to the DHT layer
Capabilities are statically defined beside the zome function, never fully implemented | Capabilities can be dynamically granted or revoked for any function to any agent, for enforcing security on function calls | In the future, granted capabilities may prepopulate function parameters to limit callers' privileges
UIs make zome calls freely (behind the scenes, the conductor applies the 'author' capability grant) | UIs must use a valid capability claim to make a zome call | Currently not fully enforced
Nodes communicate with send/receive, passing JsonString messages to each other | call_remote allows one agent to call another agent's function as if it were her own | You can still emulate send/receive with a receive zome function that has an unrestricted capability grant
Anchors are defined in a separate library | Anchors are available in the HDK | Anchors are a specialization of a new 'path' pattern, and can be sharded to reduce DHT hotspots. Anchor, Path, sharding
Zome functions are tagged with the #[zome_fn] macro and must be defined inside the #[zome] module | Zome functions can be defined anywhere in your code as long as you make them externally visible | Managing WASM data and the Rust compiler is tricky; the HDK has tools to make this easier.
Zome functions can take multiple input parameters | There can only be one input parameter for zome functions | Functions that need multiple parameters should define a special struct to hold them.
Validation callbacks are defined alongside the entries and links definitions | Validation callbacks are defined just like zome functions and conform to a naming convention | You can define multiple validation functions with varying specificity to cover all app and system types (validate), a specific CRUD operation on all app or system types (validate_update), or a specific app entry type (validate_update_agent, validate_delete_entry). In the future you will be able to specify an entry type, but until then you can try to parse the entry's data and match on it to handle different entry types. Callbacks will be tried in reverse specificity until one returns a failure. Links don't participate in the specificity cascade; you can only define validate_create_link and validate_delete_link. As with zome functions, it's easiest to use the HDK to help you define these.
JsonString is used for input/output parameters and entry content. Types used in all these cases must implement DefaultJson | SerializedBytes is used for input/output parameters and entry content. Types used in all these cases must implement SerializedBytes. | SerializedBytes data uses MessagePack by default, is smaller than JSON, and can contain raw binary data without needing to be Base64-encoded. Primitive types can't be used, but you can wrap them in a simple struct.
Instances are identified by an arbitrary string instance_id | Cells (instances) are identified by the pair [DNA_HASH, AGENT_PUB_KEY] | -
All hashes and public keys are identified by the Address type | Each type of hash has a dedicated Rust type (EntryHash, HeaderHash); you can also use AnyDhtHash when needed | holo_hash crate
The only callbacks available are init, validate_agent and the validation callbacks | There will be a lot of useful "hooks": app_install, app_uninstall, post_commit | -
The instances running in the conductor are defined and maintained in the conductor-config.toml | The conductor-config.toml only contains initial environment settings; the cell information is stored in a dynamic database | -
Zome functions are not transactional: an initial commit can succeed and stay committed even if a following commit from the same function call fails | All zome functions are transactional: the call fails and rolls back all state changes if anything fails inside that call | Local state available to a zome function is a snapshot of the source chain, while global state from the DHT is dynamic.
If get_entry fails for a required dependency in a validation rule, there's no way to retry | If a get fails or returns None, your validation function can return the hash of the missing dependency, and Holochain will pause the validation and retry in the future | -
UIs call conductor admin functions and the DNA's zome functions via a local JSON-RPC call over WebSocket | UIs call functions by sending MsgPack-serialised messages over WebSocket | WireMessage envelope for function calls, admin API request enum, zome call invocation struct, holochain-conductor-api for JavaScript-based UIs
Signals are sent as JSON-serialised objects over WebSocket | Signals are sent as MsgPack-serialised objects over WebSocket | Signal message struct
JavaScript clients can use hc-web-client to make zome or admin calls and listen for signals | JavaScript clients can use holochain-conductor-api | -
All of the host's exposed API functions are shadowed by Rust functions in the HDK | All host API functions can be used directly, but are shadowed by macros in the HDK to facilitate transfer of data between host and zome | These macros create usability problems with IDEs that support Rust code intel via RLS; they may become functions in the future to fix this issue.
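
The transactional behaviour described in the table above (a failed zome call rolls back every state change made inside it) can be modelled with a scratch copy of the chain. This is a sketch under stated assumptions, not how the conductor is implemented: `call_zome` is a hypothetical helper and the source chain is reduced to a `Vec<String>`.

```rust
/// Run `zome_fn` against a scratch copy of the chain and keep its changes
/// only if it succeeds. On error, the original chain is left untouched.
pub fn call_zome<F>(chain: &mut Vec<String>, zome_fn: F) -> Result<(), String>
where
    F: FnOnce(&mut Vec<String>) -> Result<(), String>,
{
    // The function sees a snapshot of local state, never the live chain.
    let mut scratch = chain.clone();
    zome_fn(&mut scratch)?;
    // Every step succeeded, so publish all changes at once.
    *chain = scratch;
    Ok(())
}
```

A closure that pushes one commit and then returns `Err` leaves the caller's chain exactly as it was, whereas under the old non-transactional behaviour the first commit would have stayed.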

HDK calls

Holochain-Redux | Holochain-RSM | Comment
commit_entry returns the entry hash | create_entry returns the header hash | -
get_entry returns only the Entry | get on a header hash returns the full Element (Header + Entry); get on an entry hash returns the latest written Element for that entry | -
hdk::AGENT_ADDRESS gets you the initial public key of the agent | agent_info() gets you both the initial and the latest public key for the agent | -
hdk::DNA_ADDRESS gets you the hash of the DNA | zome_info gets you the DNA name and hash, and the zome name | -
There are multiple variations of some host functions: get_entry, get_entry_result, get_entry_as_type | There are two variations of get (get and get_details) and get_links (get_links and get_links_details) | In RSM, the details calls retrieve all the information about the entry/link that is available in the DHT (for example, all its headers, updates/deletes, etc.) while simple ones get you the latest element
Random numbers had to come from the UI or be hacked by asking the keystore to generate a new secret | The host API now has a random_bytes function | -
Timestamps in app entries had to come from the UI | The host API now has a sys_time function | -
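
The difference between the simple and the 'details' lookups can be sketched over an in-memory list. This toy does not call the HDK: `Element`, `toy_get`, and `toy_get_details` are hypothetical, entries are looked up by content instead of by hash, and "latest" is approximated by the highest header ID.

```rust
/// Hypothetical simplified element: one commit of some content.
#[derive(Clone, Debug, PartialEq)]
pub struct Element {
    /// Stand-in for a header hash; a higher value means written later.
    pub header_id: u64,
    pub author: String,
    pub content: String,
}

/// Simple lookup: the most recently written element for this content, if any.
pub fn toy_get<'a>(store: &'a [Element], content: &str) -> Option<&'a Element> {
    store
        .iter()
        .filter(|e| e.content == content)
        .max_by_key(|e| e.header_id)
}

/// Detailed lookup: every element that was ever written for this content.
pub fn toy_get_details<'a>(store: &'a [Element], content: &str) -> Vec<&'a Element> {
    store.iter().filter(|e| e.content == content).collect()
}
```

The simple form suits the common case of wanting one answer; the detailed form hands back the full history so that the app can apply its own selection logic.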

Developer tooling

Holochain-Redux | Holochain-RSM | Comment
hc init scaffolds a new DNA directory with a dna.json manifest file and Tryorama test template | No DNA scaffolding function yet | -
hc generate scaffolds a new zome with Cargo tooling and a build script | No zome scaffolding function yet, but zomes are simple Rust library crates and no longer need a complex build script | The build script was necessary for optimising the compiled WASM
hc package builds a DNA manifest and a collection of zomes into a DNA package | Zomes are compiled into WASM, their build artifacts are placed in a DNA workdir, and dna-util compiles them into a DNA package | -
hc test runs the Tryorama test script | You can run a Tryorama test with npm run test | There is no scaffolding tool for the Tryorama test script; take a look at a sample
hc run starts up a development conductor with an instance, RPC interface, and UI server | There is no equivalent yet | holochain-open-dev/holochain-run-dna is a temporary solution
Uses Rust nightly | Uses Rust stable | Committed to always targeting Rust stable for both Holochain and the HDK.
Holoscape can be used for dev diagnostics and user-friendly app management | Currently no equivalent, but there will eventually be two tray apps, one for devs and one for end-users | -
Keystore is integrated into Holochain | Keystore is a separate binary and can be shared by multiple conductors, similar to ssh-agent or Pageant | -
hApp bundles (DNAs + UI) can be specified using a bundle.toml manifest file | Manifest file not yet supported; the conductor admin API is used to install all the DNAs in a hApp | -