# Holochain documentation map

###### tags: `holochain` `documentation` `dev portal`

The purpose of this doc is to map out all the things we think we ought to talk about in the Holochain developer documentation.

## Missing sections (names TBD)

* Developer guide (how to use entries, how to use links, how to write validation functions, lifecycle of a zome call, etc, etc, etc)
* Patterns (best practices for composing building blocks into usable things)
* `hc` guide/manpage
* Conductor API reference
    * local WebSocket
        * idiosyncrasies of socket binding
        * Already somewhat documented via [`holochain/holochain-client-js`](https://github.com/holochain/holochain-client-js/blob/develop/docs/client.md), though that's lib-specific and off-site.
        * Also documented via the [`holochain_conductor_api` crate Rustdoc](https://docs.rs/holochain_conductor_api), but it places the burden on devs to translate from Rust types to MsgPack to client-side structs, and doesn't even document the envelope format anywhere.
        * Concern: How do we document the API endpoints in a readable way, given the messages are all MsgPack? Show MsgPack alongside a JSON representation?
    * call/response envelope format
    * UI signal envelope format
* DNA/hApp manifest reference

## Developer guide topics (organised to match roughly with the Core Concepts)

* Application architecture
    * Stubbing a hApp, DNA, and pair of zome crates (use the scaffolding tool for all/some of this?)
    * Execution environment
        * WASM VM
        * Host/guest model
            * Single argument per function
            * Various kinds of return value (maybe deal with in each individual function signature)
        * How to call a function
            * source chain head moved error?
        * (lack of) state persistence across calls
        * Call signature / envelope format
    * Role of the conductor
    * Composing zomes and DNAs into an app
        * DNA and app manifest formats
        * Coordinator swapping
        * cell provisioning strategies
            * `Create` -- always create a new cell
            * `UseExisting` -- requires existing cell under same key derivation tree (not implemented?)
            * `CreateIfNotExists` -- try `UseExisting` followed by `Create` (not implemented?)
            * `CloneOnly` -- defer instantiation until first clone. How is this different from `Create` with `deferred` and no effective instantiation?
            * Common flags
                * `deferred` -- don't create one until asked (not implemented? how do you ask?)
                * `clone_limit` -- how many clones can be created?
    * App/cell lifecycle

        ```mermaid
        sequenceDiagram
            participant User
            participant Client
            participant Launcher
            participant Conductor
            participant Network

            Note over User,Network: Installation
            User->>Launcher: app bundle and network seed
            activate Launcher
            Launcher->>Conductor: app bundle, optional network seed, and optional membrane proof
            deactivate Launcher
            activate Conductor
            Conductor->>Conductor: generate new agent key pair
            opt
                Conductor->>Conductor: register new key pair with DPKI service
            end
            loop for each role in app bundle
                create participant Integrity zome
                Conductor->>Integrity zome: store integrity zome bytecode in database
                create participant Coordinator zome
                Conductor->>Coordinator zome: store coordinator zome bytecode in database
                Conductor->>Conductor: spin up WASM VM with integrity zome bytecode
                Conductor->>Integrity zome: call `entry_defs()`
                Integrity zome-->>Conductor: entry type definitions
                Conductor->>Conductor: store metadata, WASM bytecode, and entry type definitions in database
                opt provisioning is not set to deferred
                    Conductor->>Integrity zome: call `genesis_self_check(membrane_proof)`
                    break `genesis_self_check()` fails
                        Conductor->>Conductor: cell is disabled
                    end
                    Conductor->>Conductor: initialize source chain with `DNA`, agent public key, and `AgentValidationPkg` actions (genesis actions)
                    Note over Conductor: Do not self-validate genesis actions, dependencies are inaccessible
                    Conductor->>Network: Discover list of initial network peers
                    Network-->>Conductor: initial network peers
                    Conductor->>Network: Establish connections with peers
                    Note over Conductor,Network: publish genesis actions -- see elsewhere for publish sequence
                end
            end

            Note over User,Network: Usage
            Note over User,Coordinator zome: function call -- simplified sequence, see below for full sequence
            Client->>Conductor: call function `foo()` in coordinator zome
            activate Conductor
            Conductor->>Conductor: check capability token
            opt no function has been called in app yet
                Conductor->>Coordinator zome: call `init()`
                activate Coordinator zome
                Coordinator zome-->>Conductor: result of `init()`
                deactivate Coordinator zome
                break ❌ result is failure
                    Conductor-->>Client: failure
                end
            end
            Conductor->>Coordinator zome: call `foo()`
            activate Coordinator zome
            Coordinator zome-->>Conductor: result of call
            deactivate Coordinator zome
            Conductor->>Conductor: validate written data
            par publish data
                Conductor->>Network: send written data for validation
            and return result of call
                Conductor-->>Client: result of call
                deactivate Conductor
            end

            Note over Client,Conductor: (future) activating deferred cells
            Client->>Conductor: call app API function `????` on app role `bar`
            activate Conductor
            Conductor->>Conductor: instantiate cell from DNA that fills role `bar`, name it `bar_qux`
            Conductor-->>Client: result of activation attempt
            deactivate Conductor
            Note right of Conductor: activating a deferred cell follows same sequence as instantiating app's initial cells above

            Note over Client,Conductor: cloning a DNA
            Client->>Conductor: call app API function `CreateCloneCell` on app role `bar` with name `bar_qux`
            activate Conductor
            Conductor->>Conductor: clone and instantiate cell from DNA that fills role `bar`, name it `bar_qux`
            Note right of Conductor: instantiating a cloned cell follows same sequence as instantiating app's initial cells above
            Conductor-->>Client: result of clone attempt

            Note over Client,Conductor: disabling a clone cell
            Client->>Conductor: call `DisableCloneCell` with clone cell ID `baz_qux`
            Conductor->>Conductor: wait for any running functions in cell `baz_qux` to finish
            Conductor->>Conductor: disable network communications and function bindings for cell
            Conductor-->>Client: result of attempt to disable

            Note over Client,Conductor: enabling a cell
            Client->>Conductor: call `EnableCloneCell` with clone cell ID `baz_qux`
            Conductor->>Conductor: resume network communications for cell and re-bind cell's functions
            Conductor-->>Client: result of attempt to enable

            Note over Client,Network: conductor is unable to connect to other peers, either because of network failure or because peers have blocked agent
            Coordinator zome->>Conductor: request DHT data
            activate Conductor
            Conductor-xNetwork: ❌ connection attempt fails
            Conductor->>Conductor: attempt to fetch DHT data from local store or cache
            Conductor-->>Coordinator zome: locally stored data or empty result

            Note over Client,Network: uninstalling an app
            loop for each cell in app
                Conductor->>Conductor: disable cell
                Conductor->>Conductor: remove cell's source chain data
                opt cell is last locally installed instance of DNA
                    Conductor->>Conductor: remove DNA's validated and/or cached DHT data
                    destroy Coordinator zome
                    Conductor-xCoordinator zome: remove bytecode from database
                    destroy Integrity zome
                    Conductor-xIntegrity zome: remove bytecode from database
                end
            end
            Conductor->>Conductor: remove app bundle data from database
            Network-xConductor: ❌ connection attempt fails
        ```

        1. (optional) User specifies network seed/per-DNA properties
        2. New keypair generated
            1. (Optional) registered in DPKI
        3. App installed
        4. Cell(s) marked for immediate instantiation are instantiated
            1. In each cell, `entry_defs`, `link_defs`, and `genesis_self_check` called
            2. Peer communication established, gossip starts, genesis entries published
            3. Cells not totally initialised
        5. `init` lazy-executed on first zome call
        6. During life of app:
            * deferred cells can be activated
            * clones can be made from active or deferred-yet-uninstantiated role
                * network seed (not guaranteed to be unique among all apps using same parent DNA)
                * properties (only useful if clone's functionality should be different, although non-functional modifier can be used to keep network seed same and uniquify clone)
                * origin time (good way of uniquifying clone)
                * quantum time (obscure, don't use for cloning unless you know what it does)
            * DNA modifiers
            * (future) DPKI hash
            * Migration considerations
            * the client can disable and re-enable cells
        7. End-of-life
            * Cells can be disabled by app's client
            * Cells/app and their data can only be deleted by 'orchestrator' (settle on a name for Launcher, Kangaroo, et al)
            * Can use cell until it's disabled
            * Can participate in network once connected to peers, as long as peers can be contacted -- can be blocked by peers either by warrant or app-level blocking
    * Integrity zome
        * Determinism and protections against non-determinism (host call permissions)
        * Callbacks to implement
            * `entry_defs`
            * `link_defs`
            * `genesis_self_check`
            * `validate`
    * Coordinator zome
        * Depending on an integrity zome so you can write/deserialise its defined entry/link types
        * Defining your zome's API with public functions
        * How to call host functions and what happens when they error
        * How to address a cell
            * from the UI
            * from another zome in the same cell
            * from another cell
            * from another peer in the same network
        * Short-circuiting execution easily with `?`
        * Lifecycle of a zome call

            ```mermaid
            sequenceDiagram
                participant Client
                participant Conductor
                participant Coordinator zome
                participant Integrity zome
                participant DHT

                Client->>Conductor: call `foo(input)` in zome `bar` in cell `baz` with capability token `x`
                Conductor->>Conductor: check capability token x
                break ❌ capability token x is invalid
                    Conductor-->>Client: authentication failure message
                end
                opt `init()` hasn't been run on cell
                    activate Coordinator zome
                    Note over Conductor,Coordinator zome: call `init()` in coordinator zome as if it were a regular function (see below for sequence)
                    deactivate Coordinator zome
                    break ❌ `init()` returns failure
                        Conductor-->>Client: failure value from `init()`
                    end
                    Conductor->>Conductor: write `InitZomesComplete` action to source chain
                end

                Note over Client,Integrity zome: `foo(input)` function call begins here
                Conductor->>Conductor: create snapshot of local source chain state ('scratch space')
                activate Coordinator zome
                Conductor->>Conductor: spin up WASM VM, load zome `bar` bytecode into VM
                Conductor->>Coordinator zome: call `foo(input)`
                opt `foo` calls host function `qux`
                    activate Conductor
                    Coordinator zome->>Conductor: call host function
                    opt host function `qux` writes action
                        Conductor->>Conductor: add action to scratch space and mark as dirty
                    end
                    opt host function `qux` reads local state
                        Conductor->>Conductor: query scratch space, not actual source chain state
                    end
                    Conductor-->>Coordinator zome: result (data or error)
                end
                Coordinator zome-->>Conductor: output (value or error)
                deactivate Coordinator zome
                break ❌ output from `foo()` is error
                    Conductor-->>Client: error output from `foo()`
                end
                opt scratch space is dirty
                    opt source chain has changed since snapshot was taken
                        break ❌ at least one action in scratch space has been written with `ChainTopOrdering::Strict`
                            Conductor-->>Client: chain top moved error
                        end
                        Conductor->>Conductor: create new scratch space
                        Conductor->>Conductor: rebase pending actions from scratch space of `foo()` on new scratch space
                    end
                    activate Integrity zome
                    Conductor->>Conductor: spin up WASM VM, load integrity zome bytecode into VM
                    loop for each pending action in scratch space
                        Conductor->>Conductor: Create DHT operations for action
                        loop for each DHT operation
                            Conductor->>Integrity zome: call `validate(operation)`
                            Integrity zome-->>Conductor: validation result
                            break ❌ result is `Invalid`
                                Conductor-->>Client: Validation error
                            end
                            break ❔ result is `UnresolvedDependencies`
                                Conductor-->>Client: Validation error
                            end
                        end
                    end
                    Conductor->>Conductor: write ('flush') validated actions to source chain
                    par publish DHT operations to peers
                        loop for each DHT operation
                            Conductor->>DHT: operation
                            DHT->>DHT: Validate operation
                            DHT-->>Conductor: validation receipt
                        end
                        Note right of DHT: this task requeues itself until enough validation receipts have been collected for each DHT operation -- for entries, this is specified by the `RequiredValidations` for the entry type
                    and call `post_commit()`
                        Conductor->>Coordinator zome: call `post_commit()` with written actions
                    end
                end
                Conductor-->>Client: output of `foo()`
            ```

            soooooo... this is rather long and hard to read
        * Capabilities
            * How to generate one
                * functions covered
            * How to get one (app patterns, plus saving cap claim)
                * UI vs zome
            * How to use one
                * UI vs zome
            * Types
                * Anonymous
                    * no token needed
                    * still need to sign call
                * Transferable
                    * works just like a trad capability -- supply the token when making a call
                    * what can a token look like?
                * Assigned
                    * need a valid signature from an allowlisted pubkey
                * Author
                    * Special case
                    * Also covers UI in some execution environments
                        * Launcher
                        * Holo
        * source chain head moved error
        * Scheduling tasks
            * how to schedule a function
                * No arguments received; pass state to scheduled function via source chain
                * idempotent
            * how to mark up a schedulable function
                * infallible
                * must receive and return a schedule
        * Callbacks to implement
            * `init`
                * can fail, in which case app is disabled
            * `post_commit`
                * no write privileges
                * infallible
            * `recv_remote_signal`
                * just another zome function, but calling it is treated specially by the conductor
        * Swappability
* Source chain
    * Querying
        * `query` for local state querying
        * anything else?
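
The flush-or-rebase decision near the end of the zome call lifecycle can be modelled in a few lines of plain Rust. This is a rough sketch, not conductor code: `Scratch`, `PendingAction`, `FlushError`, and the use of chain length as a stand-in for the chain head are all hypothetical names and simplifications made for illustration, and the validation step is left out. Only `ChainTopOrdering::Strict` is named in the outline; `Relaxed` is assumed here as its counterpart.

```rust
/// Ordering requested for a write (`Strict` is named in the outline; `Relaxed` is assumed).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ChainTopOrdering {
    Strict,
    Relaxed,
}

/// A pending write held in the scratch space. `payload` stands in for real action data.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct PendingAction {
    pub payload: String,
    pub ordering: ChainTopOrdering,
}

/// Hypothetical scratch space: remembers how long the chain was when the snapshot was taken.
#[derive(Debug)]
pub struct Scratch {
    pub snapshot_len: usize,
    pub pending: Vec<PendingAction>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum FlushError {
    /// Corresponds to the "chain top moved" error in the lifecycle.
    ChainHeadMoved,
}

impl Scratch {
    /// Take a snapshot of the current chain state.
    pub fn snapshot(chain: &[String]) -> Self {
        Scratch { snapshot_len: chain.len(), pending: Vec::new() }
    }

    /// Record a write in the scratch space instead of on the chain itself.
    pub fn write(&mut self, payload: &str, ordering: ChainTopOrdering) {
        self.pending.push(PendingAction { payload: payload.to_string(), ordering });
    }

    /// The scratch space is dirty once at least one action has been written to it.
    pub fn is_dirty(&self) -> bool {
        !self.pending.is_empty()
    }

    /// Flush pending actions to the chain, returning how many were written.
    /// If the chain moved since the snapshot, any `Strict` action aborts the flush;
    /// otherwise the pending actions are rebased onto the current head.
    pub fn flush(self, chain: &mut Vec<String>) -> Result<usize, FlushError> {
        if !self.is_dirty() {
            return Ok(0);
        }
        let chain_moved = chain.len() != self.snapshot_len;
        let any_strict = self
            .pending
            .iter()
            .any(|a| a.ordering == ChainTopOrdering::Strict);
        if chain_moved && any_strict {
            return Err(FlushError::ChainHeadMoved);
        }
        let written = self.pending.len();
        chain.extend(self.pending.into_iter().map(|a| a.payload));
        Ok(written)
    }
}
```

In this model, a call that writes only with `Relaxed` ordering still succeeds when another call has advanced the chain in the meantime, while a single `Strict` write is enough to fail the whole flush and leave the chain untouched.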
* DHT
    * Gets
        * `get_agent_activity`, includes chain status, warrants, current state, and valid/rejected actions
    * The operations that a publish produces, who they each go to, and the way the DHT is transformed as a result
        * Not 1:1 correspondence with actual DHT ops produced; some ops go to two authorities but are collapsed into one type of op for validation simplicity
        * Authorities / the addresses they're responsible for
            * Agent activity / author's public key
            * Entry / hash of entry
            * Record / hash of action
        * All actions
            * `RegisterAgentActivity`
                * goes to agent activity authority
                * contains action, optionally contains entry data if action is a new-entry action and `cache_at_agent_activity` is true for entry type
                * adds an action to agent activity
                * invalid if generic action data is incorrect; flags author if fork is detected
            * `StoreRecord`
                * goes to record authority
                * contains action and entry
                * stores action at base, stores entry too but doesn't become responsible for it
                * invalid if generic action data is incorrect
        * Public entries
            * Create
                * `StoreEntry`
                    * goes to entry authority
                    * contains entry and action
                    * stores entry data _and_ action at base
            * Update -- create, plus
                * `RegisterUpdate`
                    * collapses `RegisterUpdatedContent` (goes to original entry's base) and `RegisterUpdatedRecord` (goes to original entry's record)
                    * contains action and entry
                    * invalid if entry types differ
                    * marks action at both entry base and record base as updated, which contains new entry address and can be hashed to get new action address
            * Delete
                * `RegisterDelete`
                    * collapses `RegisterDeletedBy` (goes to deleted entry's base) and `RegisterDeletedEntryAction` (goes to deleted entry's action)
                    * contains action
                    * invalid when?
                    * marks action at entry base as deleted, with the consequence that the entry is marked dead if all of its actions are deleted. Also marks action at previous action base as deleted.
        * Private entries
            * No `StoreEntry`; `RegisterUpdate`/`RegisterDelete` still go to both record and entry base but (I assume) the entry authorities reject them.
            * Can't use private entry as a validation dependency; author-side validation will give different result from authority-side
        * Links
            * Create
                * `RegisterCreateLink`
                    * goes to link base (entry/action/agent/external ref) address
                    * adds link as metadata, even if base doesn't exist
            * Delete
                * `RegisterRemoveLink`
                    * goes to link base
                    * marks deleted link as dead
    * Publish sequence

        ```mermaid
        sequenceDiagram
            participant Author
            participant Network

            loop for each action written to source chain
                Author->>Author: transform action into DHT operations
                loop for each DHT operation
                    Author->>Author: sign operation with private key
                    Author->>Author: determine DHT base address of operation (depends on operation type)
                    opt author isn't connected to any DHT authority peers responsible for base address
                        Author->>Network: request agent IDs for responsible authorities
                        Network-->>Author: list of agent IDs
                    end
                    loop for each known authority, or until number of collected validation receipts >= required validations for operation
                        create participant Authority as DHT authority peer
                        Author->>Authority: publish signed operation
                        Authority->>Authority: validate operation
                        alt ❌ operation is invalid
                            Authority->>Authority: add author to block list
                            Authority->>Authority: create negative validation receipt
                        else ✔️ operation is valid
                            Authority->>Authority: integrate operation into DHT store
                            Authority->>Authority: create positive validation receipt
                        end
                        Authority->>Authority: sign validation receipt
                        destroy Authority
                        Authority-->>Author: validation receipt
                    end
                end
            end
            par in background
                loop until number of collected validation receipts >= required validations
                    Note over Author,Network: do publish loop as above, skipping authorities who have already sent a receipt
                end
            end
        ```
* Links, paths, and anchors
    * `create_link`
        * base/target don't need to exist
    * `delete_link`
    * `get_links` and `get_link_details`
        * filtering by type and tag
    * Paths and anchors
        * can't specify relaxed chain top ordering
* Entries and CRUD
    * Unique identifiers are hashes
        * Action hash or entry hash? you get to decide (maybe a collection of patterns)
        * for update and delete, metadata is attached to both original entry base and original action base
    * Max payload size
    * `create`
        * Entry content is deduplicated
    * `update`
        * Pattern for jumping the update chain (put in pattern section instead?)
        * what happens when you update an `AgentID` entry?
            * old one is retired
            * (future) sys validation with DPKI
    * `delete`
        * Nothing's actually deleted
    * `get`, `get_details`
    * (future) actual removal
        * `withdraw`, remove one's own actions after an accidental source chain fork
        * `purge`, remove entry content (but not the actions that wrote it)
    * relaxed chain top ordering
        * eliminates source chain head moved errors, but all writes in function call have to have it
* Zome functions and calls
    * `#[hdk_extern]`
        * single parameter
        * return value type
        * short-circuiting fallible functions with `?`
    * Remote calls
    * Cross-zome/cell calls
    * Capabilities
    * Specially supported pattern: fire-and-forget remote signals
    * `post_commit`
        * infallible, read-only, but can `call()` another function
    * Warning against sending signals in a zome call unless you know what you're doing (writes may not succeed)
    * source chain head moved errors
* Integrity
    * Validation
        * Determinism
            * dependencies
                * `must_get_*` functions
                    * `must_get_valid_record` is the only one that checks for validation receipts/warrants because an entry is only valid/invalid in the context of an action
                        * this also means it trusts the word of others (1-of-n validation)
        * short-circuiting with `?`
        * Operate on ops
            * flat ops convenience thingy
        * Considerations re: what should be validated on each op (IOW, what authorities should be responsible for what things)
        * Things that can be validated
            * membrane proof
            * entry structure (`entry_def` macro gives you deserialisation and error short-circuiting for free with `?` operator)
            * permission
            * rate limiting with `weight` field
            * dependencies, incl source chain history
                * Inductive validation for costly dep trees (pattern?)
        * Limitations
            * Cannot `must_get` links or actions on a base
            * Cannot currently co-validate multiple actions (can only validate an action based on prior valid actions)
            * Cannot validate the non-existence of something, because that can always change
            * (future) source chain restructured to atomic bundle of actions, co-validated
        * sys validation
        * Lifecycle of a validation
            * At author time
            * At publish time
            * At gossip time
        * Membrane proof
            * Genesis self-check -- not a 'true' validation function, just a way to guard yourself against copy/paste mistakes and other things that can permanently hose your chance of joining a network
            * (future) Handled specially -- restricts/grants access to a network; validated at handshake time (turns out this is not currently implemented, and there are questions about how to implement it in a way that doesn't carry a performance hit with each new peer connection -- and there may be lots of them in a big heavily sharded DHT)
            * `AgentValidationPkg` is the only action for which an honest person can get warranted, because they try to join the network and publish it before they're able to fetch deps
    * Blocking
        * What does and doesn't happen
        * (future) sticky validation
        * Consequence of validation failure
        * App-level blocks can also be applied arbitrarily without needing proof of invalid activity, and can be removed
        * All blocks are agent-centric; a network-wide block is a consequence of each individual peer blocking the agent
* Signals
    * Local signals to UI
    * Remote signals to peers on same DHT
        * needs cap grant on `recv_remote_signal`
        * (future) pub/sub to replace most use cases for remote signals
    * not supported between zomes or between cells :(
* (future) DPKI and other conductor services
    * What do they do?
    * Involved in sys validation
* Sys validation
    * What does it validate?
        * Source chain continuity
        * Timestamp monotonicity
        * Contiguity of authorship
        * (future) if DPKI enabled, sys validation checks that changes of agent ID are reflected in DPKI DNA as-at action timestamp
        * What else?
* App lifecycle
    * Cell initialisation
    * Peer discovery -- connecting to other peers via bootstrap server, mDNS, or injected network addresses
    * NAT holepunching -- WebSocket proxy server
* DNA properties
    * What can you do with them?
    * Where/how are they accessed by your code?
    * How can the user modify them?
        * What happens then? (answer: a fork)
* `*_info` functions
* Fun little helpers
    * Crypto
        * hashing
        * Ed25519 signing/verifying
        * box encryption
    * Randomisation
    * Timestamp
    * Tracing
* Lair and what it can do
    * Supplies crypto primitives to HDK
    * Generates agent keys for conductor, signs actions
* Countersigning
    * Automated process; agreement to countersign a given piece of data should have already been made
        * online and interactive (in distributed systems terms)
    * Document the heck out of the lifecycle

        ```mermaid
        sequenceDiagram
            box Alice
                participant ACell as Alice's cell
                participant ACond as Alice's conductor
            end
            box Bob
                participant BCond as Bob's conductor
                participant BCell as Bob's cell
            end
            ACell->>ACell: prepare entry to be countersigned
            ACell->>BCell: send entry data (signal or zome function)
            BCell-->>ACell: acknowledge, store, and approve of entry data
            ACell->>ACell: create preflight request
            Note right of ACell: contains parties involved, hash of entry to be signed, time window, action stub, arbitrary preflight bytes
            ACell->>BCell: call arbitrary zome function called `receive_preflight` with prepared preflight request
            BCell->>
        ```
    * Patterns for implementing the interactions that aren't handled at the system level

## Patterns

* Lobby
    * gaining access to privileged DHTs
    * getting directory of DNA clones
    * mediated access to other DHTs via `call_remote`
* Blockchain binding
* File chunking
* DNA composability
    * One DHT per file/channel
    * Throwaway DHT
    * Per-role DNAs (combine with Lobby for mediated access)
* Various CRUD patterns
    * See [https://github.com/mjbrisebois/blog/blob/master/drafts/holochain-architectural-pattern-vectors.md](https://github.com/mjbrisebois/blog/blob/master/drafts/holochain-architectural-pattern-vectors.md)
    * Canonical updates
        * Simple resolution strategy (e.g., LWW)
        * Branch pointer (requires consensus protocol)
        * 'latest update' link
    * Update trees
* Communications
    * [Asynchronous private messaging](https://forum.holochain.org/t/asynchronous-private-messaging/1085)
    * [Mailbox](https://forum.holochain.org/t/mailbox/1084)
* Access control / privileges / privacy / secrecy
    * Benevolent dictator
    * Progenitor
    * [Capability delegation](https://forum.holochain.org/t/capability-delegation/1082)
    * Memproofs for DHT read / peer connection privileges
        * Token gating
        * Group password (better handled by network seed TBH)
        * Certificate chains
        * Lobby can be used to distribute memproofs
    * DHT writes
        * Only author can edit
        * Read-only public DHT with memproof for write privileges (necessary for Holo resilience nodes to work)
    * Encrypted DHT data
        * key rotation for forward secrecy
            * double ratchet
            * manual rotation
    * Private data and `call_remote`
    * Homomorphic encryption
    * ZKP
    * Negotiation of shared secret, e.g., dice roll
        * SMPC (e.g., ECDSA)
        * just sharing the darn secret
    * Shared secrets, public disclosure/auditing
* Validation
    * Inductive validation
* Remote signals
    * Heartbeat / presence indicators
        * Online-ish-ness -- dealing with ambiguous definition of 'online' in P2P context
            * swarm connectivity -- how connected am I to the people I'm doing work with?
            * beacon or sentinel -- am I connected to 'the internet' (or at least the portion that I expected to see)?
    * Anchor pub/sub
    * Publish frontrunning -- sending out a signal in `post_commit` to let listeners know that a DHT entry should exist there eventually
* Real-time collaboration
    * Syn v1 (scribe that collects CRDTs from collaborators, periodically commits change sets)
    * Syn v2 (no scribe needed; don't know how this'll work yet)
* Branching revisions
* Consensus protocols
    * Countersigning-based
        * Single notary
        * Notary pool (m-of-n optional witnesses)
    * Hardened metadata (will be supported by Holochain eventually; curious if it could be implemented as an app-level pattern)
    * Blockchain binding
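
As one illustration of the 'Canonical updates' item in the CRUD patterns above, a simple last-writer-wins (LWW) resolution strategy might look roughly like the sketch below. It is plain Rust with no Holochain dependency; the `Update` struct, its two fields, and `resolve_lww` are hypothetical names chosen for this example, and the tie-break on action hash is an assumption added so that every peer picks the same winner.

```rust
/// One candidate update to the same original entry. Both fields are hypothetical
/// stand-ins for data a real app would read from the update's action.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Update {
    /// Author-claimed timestamp, in microseconds.
    pub timestamp_micros: i64,
    /// Hash of the update action, used here only as a deterministic tie-breaker.
    pub action_hash: Vec<u8>,
}

/// Pick the canonical update: the latest timestamp wins, and ties are broken
/// by the lexicographically greatest action hash. Returns `None` if there are
/// no updates, in which case the original entry would still be canonical.
pub fn resolve_lww(updates: &[Update]) -> Option<&Update> {
    updates.iter().max_by(|a, b| {
        a.timestamp_micros
            .cmp(&b.timestamp_micros)
            .then_with(|| a.action_hash.cmp(&b.action_hash))
    })
}
```

Because the comparison depends only on the updates themselves and not on the order in which a peer happened to receive them, two peers that have seen the same set of updates resolve to the same one. Timestamps are claimed by authors, so this strategy is only suitable where a misreported time does little harm.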