# Holochain documentation map
###### tags: `holochain` `documentation` `dev portal`
The purpose of this doc is to map out all the things we think we ought to talk about in the Holochain developer documentation.
## Missing sections (names TBD)
* Developer guide (how to use entries, how to use links, how to write validation functions, lifecycle of a zome call, etc, etc, etc)
* Patterns (best practices for composing building blocks into usable things)
* `hc` guide/manpage
* Conductor API reference
* local WebSocket
* idiosyncrasies of socket binding
* Already somewhat documented via [`holochain/holochain-client-js`](https://github.com/holochain/holochain-client-js/blob/develop/docs/client.md), though that's lib-specific and off-site.
* Also documented via the [`holochain_conductor_api` crate Rustdoc](https://docs.rs/holochain_conductor_api), but it places the burden on devs to translate from Rust types to MsgPack to client-side structs, and doesn't even document the envelope format anywhere.
* Concern: How do we document the API endpoints in a readable way, given the messages are all MsgPack? Show MsgPack alongside a JSON representation?
* call/response envelope format
* UI signal envelope format
* DNA/hApp manifest reference
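
One possible answer to the MsgPack readability concern above is to show the raw bytes as hex next to a JSON rendering of the same value. The sketch below hand-encodes a tiny map using only the MessagePack `fixmap`, `fixstr`, and `positive fixint` formats. The field names `id` and `type` and all function names are assumptions made for illustration; this is not the actual conductor envelope format, which still needs documenting.

```rust
/// Encode a string of at most 31 bytes as a MessagePack `fixstr`.
fn fixstr(s: &str) -> Vec<u8> {
    assert!(s.len() <= 31, "fixstr holds at most 31 bytes");
    let mut out = vec![0xa0 | s.len() as u8];
    out.extend_from_slice(s.as_bytes());
    out
}

/// Encode an integer in 0..=127 as a MessagePack `positive fixint`.
fn fixint(n: u8) -> Vec<u8> {
    assert!(n <= 0x7f, "positive fixint holds 0..=127");
    vec![n]
}

/// Encode a hypothetical two-field envelope as a MessagePack `fixmap`.
/// The keys `id` and `type` are illustrative assumptions only.
pub fn encode_envelope(id: u8, kind: &str) -> Vec<u8> {
    let mut out = vec![0x80 | 2]; // fixmap holding two key/value pairs
    out.extend(fixstr("id"));
    out.extend(fixint(id));
    out.extend(fixstr("type"));
    out.extend(fixstr(kind));
    out
}

/// Render the same hypothetical envelope as JSON text.
pub fn envelope_as_json(id: u8, kind: &str) -> String {
    format!("{{\"id\":{},\"type\":\"{}\"}}", id, kind)
}

/// Render bytes as space-separated hex for display in a reference page.
pub fn hex(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|b| format!("{:02x}", b))
        .collect::<Vec<_>>()
        .join(" ")
}
```

Calling `hex(&encode_envelope(1, "request"))` and `envelope_as_json(1, "request")` gives the two renderings that a reference page could place side by side.
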
## Developer guide topics
(organised to match roughly with the Core Concepts)
* Application architecture
* Stubbing a hApp, DNA, and pair of zome crates (use the scaffolding tool for all/some of this?)
* Execution environment
* WASM VM
* Host/guest model
* Single argument per function
* Various kinds of return value (maybe deal with in each individual function signature)
* How to call a function
* source chain head moved error?
* (lack of) state persistence across calls
* Call signature / envelope format
* Role of the conductor
* Composing zomes and DNAs into an app
* DNA and app manifest formats
* Coordinator swapping
* cell provisioning strategies
* `Create` -- always create a new cell
* `UseExisting` -- requires existing cell under same key derivation tree (not implemented?)
* `CreateIfNotExists` -- try `UseExisting` followed by `Create` (not implemented?)
* `CloneOnly` -- defer instantiation until first clone. How is this different from `Create` with `deferred` and no effective instantiation?
* Common flags
* `deferred` -- don't create one until asked (not implemented? how do you ask?)
* `clone_limit` -- how many clones can be created?
* App/cell lifecycle
```mermaid
sequenceDiagram
participant User
participant Client
participant Launcher
participant Conductor
participant Network
Note over User,Network: Installation
User->>Launcher: app bundle and network seed
activate Launcher
Launcher->>Conductor: app bundle, optional network seed, and optional membrane proof
deactivate Launcher
activate Conductor
Conductor->>Conductor: generate new agent key pair
opt
Conductor->>Conductor: register new key pair with DPKI service
end
loop for each role in app bundle
create participant Integrity zome
Conductor->>Integrity zome: store integrity zome bytecode in database
create participant Coordinator zome
Conductor->>Coordinator zome: store coordinator zome bytecode in database
Conductor->>Conductor: spin up WASM VM with integrity zome bytecode
Conductor->>Integrity zome: call `entry_defs()`
Integrity zome-->>Conductor: entry type definitions
Conductor->>Conductor: store metadata, WASM bytecode, and entry type definitions in database
opt provisioning is not set to deferred
Conductor->>Integrity zome: call `genesis_self_check(membrane_proof)`
break `genesis_self_check()` fails
Conductor->>Conductor: cell is disabled
end
Conductor->>Conductor: initialize source chain with `DNA`, agent public key, and `AgentValidationPkg` actions (genesis actions)
Note over Conductor: Do not self-validate genesis actions, dependencies are inaccessible
Conductor->>Network: Discover list of initial network peers
Network-->>Conductor: initial network peers
Conductor->>Network: Establish connections with peers
Note over Conductor,Network: publish genesis actions -- see elsewhere for publish sequence
end
end
Note over User,Network: Usage
Note over User,Coordinator zome: function call -- simplified sequence, see below for full sequence
Client->>Conductor: call function `foo()` in coordinator zome
activate Conductor
Conductor->>Conductor: check capability token
opt no function has been called in app yet
Conductor->>Coordinator zome: call `init()`
activate Coordinator zome
Coordinator zome-->>Conductor: result of `init()`
deactivate Coordinator zome
break ❌ result is failure
Conductor-->>Client: failure
end
end
Conductor->>Coordinator zome: call `foo()`
activate Coordinator zome
Coordinator zome-->>Conductor: result of call
deactivate Coordinator zome
Conductor->>Conductor: validate written data
par publish data
Conductor->>Network: send written data for validation
and return result of call
Conductor-->>Client: result of call
deactivate Conductor
end
Note over Client,Conductor: (future) activating deferred cells
Client->>Conductor: call app API function `????` on app role `bar`
activate Conductor
Conductor->>Conductor: instantiate cell from DNA that fills role `bar`, name it `bar_qux`
Conductor-->>Client: result of activation attempt
deactivate Conductor
Note right of Conductor: activating a deferred cell follows same sequence as instantiating app's initial cells above
Note over Client,Conductor: cloning a DNA
Client->>Conductor: call app API function `CreateCloneCell` on app role `bar` with name `bar_qux`
activate Conductor
Conductor->>Conductor: clone and instantiate cell from DNA that fills role `bar`, name it `bar_qux`
Note right of Conductor: instantiating a cloned cell follows same sequence as instantiating app's initial cells above
Conductor-->>Client: result of clone attempt
Note over Client,Conductor: disabling a clone cell
Client->>Conductor: call `DisableCloneCell` with clone cell ID `baz_qux`
Conductor->>Conductor: wait for any running functions in cell `baz_qux` to finish
Conductor->>Conductor: disable network communications and function bindings for cell
Conductor-->>Client: result of attempt to disable
Note over Client,Conductor: enabling a cell
Client->>Conductor: call `EnableCloneCell` with clone cell ID `baz_qux`
Conductor->>Conductor: resume network communications for cell and re-bind cell's functions
Conductor-->>Client: result of attempt to enable
Note over Client,Network: conductor is unable to connect to other peers, either because of network failure or because peers have blocked agent
Coordinator zome->>Conductor: request DHT data
activate Conductor
Conductor-xNetwork: ❌ connection attempt fails
Conductor->>Conductor: attempt to fetch DHT data from local store or cache
Conductor-->>Coordinator zome: locally stored data or empty result
Note over Client,Network: uninstalling an app
loop for each cell in app
Conductor->>Conductor: disable cell
Conductor->>Conductor: remove cell's source chain data
opt cell is last locally installed instance of DNA
Conductor->>Conductor: remove DNA's validated and/or cached DHT data
destroy Coordinator zome
Conductor-xCoordinator zome: remove bytecode from database
destroy Integrity zome
Conductor-xIntegrity zome: remove bytecode from database
end
end
Conductor->>Conductor: remove app bundle data from database
Network-xConductor: ❌ connection attempt fails
```
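
The 'Usage' part of the diagram above (capability check, lazy `init()` on the first call, then the requested function) can be sketched as a toy state machine. Everything here is a hypothetical model: `ToyCell`, `CallOutcome`, and the `cap_ok` flag are invented for illustration and are not conductor or HDK types. Whether a failed `init()` is retried on a later call is also an assumption of this sketch.

```rust
/// Outcome of a zome function call in this toy model.
#[derive(Debug, PartialEq)]
pub enum CallOutcome {
    Ok(String),
    InitFailed,
    Unauthorized,
}

/// A toy cell that only tracks whether `init()` has completed.
pub struct ToyCell {
    init_done: bool,
    /// Stand-in for whatever the real `init()` callback would decide.
    init_succeeds: bool,
    /// How many times `init()` has been attempted.
    pub init_runs: u32,
}

impl ToyCell {
    pub fn new(init_succeeds: bool) -> Self {
        ToyCell { init_done: false, init_succeeds, init_runs: 0 }
    }

    /// Check the capability, run `init()` lazily if it has not completed,
    /// then run the requested function.
    pub fn call(&mut self, fn_name: &str, cap_ok: bool) -> CallOutcome {
        if !cap_ok {
            return CallOutcome::Unauthorized;
        }
        if !self.init_done {
            self.init_runs += 1;
            if !self.init_succeeds {
                // The failure is returned to the client; the function is not run.
                return CallOutcome::InitFailed;
            }
            self.init_done = true;
        }
        CallOutcome::Ok(format!("result of {}()", fn_name))
    }
}
```

Calling `call("foo", true)` twice on `ToyCell::new(true)` runs `init()` only once, which is the lazy-execution behaviour the diagram describes.
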
1. (optional) User specifies network seed/per-DNA properties
2. New keypair generated
1. (Optional) registered in DPKI
3. App installed
4. Cell(s) marked for immediate instantiation are instantiated
1. In each cell, `entry_defs`, `link_defs`, and `genesis_self_check` called
2. Peer communication established, gossip starts, genesis entries published
3. Cells not totally initialised
5. `init` lazy-executed on first zome call
6. During life of app:
* deferred cells can be activated
* clones can be made from active or deferred-yet-uninstantiated role
* network seed (not guaranteed to be unique among all apps using same parent DNA)
* properties (only useful if clone's functionality should be different, although non-functional modifier can be used to keep network seed same and uniquify clone)
* origin time (good way of uniquifying clone)
* quantum time (obscure, don't use for cloning unless you know what it does)
* DNA modifiers
* (future) DPKI hash
* Migration considerations
* the client can disable and re-enable cells
7. End-of-life
* Cells can be disabled by app's client
* Cells/app and their data can only be deleted by 'orchestrator' (settle on a name for Launcher, Kangaroo, et al)
* Can use cell until it's disabled
* Can participate in network once connected to peers, as long as peers can be contacted -- can be blocked by peers either by warrant or app-level blocking
* Integrity zome
* Determinism and protections against non-determinism (host call permissions)
* Callbacks to implement
* `entry_defs`
* `link_defs`
* `genesis_self_check`
* `validate`
* Coordinator zome
* Depending on an integrity zome so you can write/deserialise its defined entry/link types
* Defining your zome's API with public functions
* How to call host functions and what happens when they error
* How to address a cell
* from the UI
* from another zome in the same cell
* from another cell
* from another peer in the same network
* Short-circuiting execution easily with `?`
* Lifecycle of a zome call
```mermaid
sequenceDiagram
participant Client
participant Conductor
participant Coordinator zome
participant Integrity zome
participant DHT
Client->>Conductor: call `foo(input)` in zome `bar` in cell `baz` with capability token `x`
Conductor->>Conductor: check capability token x
break ❌ capability token x is invalid
Conductor-->>Client: authentication failure message
end
opt `init()` hasn't been run on cell
activate Coordinator zome
Note over Conductor,Coordinator zome: call `init()` in coordinator zome as if it were a regular function (see below for sequence)
deactivate Coordinator zome
break ❌ `init()` returns failure
Conductor-->>Client: failure value from `init()`
end
Conductor->>Conductor: write `InitZomesComplete` action to source chain
end
Note over Client,Integrity zome: `foo(input)` function call begins here
Conductor->>Conductor: create snapshot of local source chain state ('scratch space')
activate Coordinator zome
Conductor->>Conductor: spin up WASM VM, load zome `bar` bytecode into VM
Conductor->>Coordinator zome: call `foo(input)`
opt `foo` calls host function `qux`
activate Conductor
Coordinator zome->>Conductor: call host function
opt host function `qux` writes action
Conductor->>Conductor: add action to scratch space and mark as dirty
end
opt host function `qux` reads local state
Conductor->>Conductor: query scratch space, not actual source chain state
end
Conductor-->>Coordinator zome: result (data or error)
end
Coordinator zome-->>Conductor: output (value or error)
deactivate Coordinator zome
break ❌ output from `foo()` is error
Conductor-->>Client: error output from `foo()`
end
opt scratch space is dirty
opt source chain has changed since snapshot was taken
break ❌ at least one action in scratch space has been written with `ChainTopOrdering::Strict`
Conductor-->>Client: chain top moved error
end
Conductor->>Conductor: create new scratch space
Conductor->>Conductor: rebase pending actions from scratch space of `foo()` on new scratch space
end
activate Integrity zome
Conductor->>Conductor: spin up WASM VM, load integrity zome bytecode into VM
loop for each pending action in scratch space
Conductor->>Conductor: Create DHT operations for action
loop for each DHT operation
Conductor->>Integrity zome: call `validate(operation)`
Integrity zome-->>Conductor: validation result
break ❌ result is `Invalid`
Conductor-->>Client: Validation error
end
break ❔ result is `UnresolvedDependencies`
Conductor-->>Client: Validation error
end
end
end
Conductor->>Conductor: write ('flush') validated actions to source chain
par publish DHT operations to peers
loop for each DHT operation
Conductor->>DHT: operation
DHT->>DHT: Validate operation
DHT-->>Conductor: validation receipt
end
Note right of DHT: this task requeues itself until enough validation receipts have been collected for each DHT operation -- for entries, this is specified by the `RequiredValidations` for the entry type
and call `post_commit()`
Conductor->>Coordinator zome: call `post_commit()` with written actions
end
end
Conductor-->>Client: output of `foo()`
```
soooooo... this is rather long and hard to read
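
A smaller way to explain the 'scratch space is dirty' branch of the diagram might be a toy model like the one below. It only covers the snapshot, the chain-top-moved check, and the rebase; validation, publishing, and `post_commit()` are left out. The types `ToyChain`, `Scratch`, and `PendingAction` are hypothetical, and `Ordering` here is a local stand-in that mirrors `ChainTopOrdering::Strict` / `Relaxed`, not the HDK type itself.

```rust
/// Local stand-in for the two chain top ordering modes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Ordering {
    Strict,
    Relaxed,
}

#[derive(Debug, Clone, PartialEq)]
pub struct PendingAction {
    pub label: String,
    pub ordering: Ordering,
}

#[derive(Debug, PartialEq)]
pub enum FlushError {
    ChainTopMoved,
}

/// Toy source chain: just a list of action labels.
pub struct ToyChain {
    pub actions: Vec<String>,
}

/// Scratch space: chain length at snapshot time plus pending writes.
pub struct Scratch {
    snapshot_len: usize,
    pub pending: Vec<PendingAction>,
}

impl Scratch {
    /// Record a write in the scratch space instead of on the chain.
    pub fn write(&mut self, label: &str, ordering: Ordering) {
        self.pending.push(PendingAction { label: label.to_string(), ordering });
    }
}

impl ToyChain {
    /// Take a snapshot at the start of a function call.
    pub fn snapshot(&self) -> Scratch {
        Scratch { snapshot_len: self.actions.len(), pending: Vec::new() }
    }

    /// Flush pending actions, returning how many were written.
    pub fn flush(&mut self, scratch: Scratch) -> Result<usize, FlushError> {
        if scratch.pending.is_empty() {
            return Ok(0); // scratch space is not dirty
        }
        let moved = self.actions.len() != scratch.snapshot_len;
        let any_strict = scratch.pending.iter().any(|a| a.ordering == Ordering::Strict);
        if moved && any_strict {
            return Err(FlushError::ChainTopMoved);
        }
        // Rebase: append on top of whatever the chain head is now.
        let written = scratch.pending.len();
        self.actions.extend(scratch.pending.into_iter().map(|a| a.label));
        Ok(written)
    }
}
```

Two snapshots taken from the same chain head show the difference: if both write with `Strict`, the second flush fails; if the second writes only with `Relaxed`, it is rebased and succeeds.
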
* Capabilities
* How to generate one
* functions covered
* How to get one (app patterns, plus saving cap claim)
* UI vs zome
* How to use one
* UI vs zome
* Types
* Anonymous
* no token needed
* still need to sign call
* Transferable
* works just like a trad capability -- supply the token when making a call
* what can a token look like?
* Assigned
* need a valid signature from an allowlisted pubkey
* Author
* Special case
* Also covers UI in some execution environments
* Launcher
* Holo
* source chain head moved error
* Scheduling tasks
* how to schedule a function
* No arguments received; pass state to scheduled function via source chain
* idempotent
* how to mark up a schedulable function
* infallible
* must receive and return a schedule
* Callbacks to implement
* `init`
* can fail, in which case app is disabled
* `post_commit`
* no write privileges
* infallible
* `recv_remote_signal`
* just another zome function, but calling it is treated specially by the conductor
* Swappability
* Source chain
* Querying
* `query` for local state querying
* anything else?
* DHT
* Gets
* `get_agent_activity`, includes chain status, warrants, current state, and valid/rejected actions
* The operations that a publish produces, who they each go to, and the way the DHT is transformed as a result
* Not 1:1 correspondence with actual DHT ops produced; some ops go to two authorities but are collapsed into one type of op for validation simplicity
* Authorities / the addresses they're responsible for
* Agent activity / author's public key
* Entry / hash of entry
* Record / hash of action
* All actions
* `RegisterAgentActivity`
* goes to agent activity authority
* contains action, optionally contains entry data if action is a new-entry action and `cache_at_agent_activity` is true for entry type
* adds an action to agent activity
* invalid if generic action data is incorrect; flags author if fork is detected
* `StoreRecord`
* goes to record authority
* contains action and entry
* stores action at base, stores entry too but doesn't become responsible for it
* invalid if generic action data is incorrect
* Public entries
* Create
* `StoreEntry`
* goes to entry authority
* contains entry and action
* stores entry data _and_ action at base
* Update -- create, plus
* `RegisterUpdate`
* collapses `RegisterUpdatedContent` (goes to original entry's base) and `RegisterUpdatedRecord` (goes to original entry's record)
* contains action and entry
* invalid if entry types differ
* marks action at both entry base and record base as updated, which contains new entry address and can be hashed to get new action address
* Delete
* `RegisterDelete`
* collapses `RegisterDeletedBy` (goes to deleted entry's base) and `RegisterDeletedEntryAction` (goes to deleted entry's action)
* contains action
* invalid when?
* marks action at entry base as deleted, with the consequence that the entry is marked dead if all of its actions are deleted. Also marks action at previous action base as deleted.
* Private entries
* No `StoreEntry`; `RegisterUpdate`/`RegisterDelete` still go to both record and entry base but (I assume) the entry authorities reject them.
* Can't use private entry as a validation dependency; author-side validation will give different result from authority-side
* Links
* Create
* `RegisterCreateLink`
* goes to link base (entry/action/agent/external ref) address
* adds link as metadata, even if base doesn't exist
* Delete
* `RegisterRemoveLink`
* goes to link base
* marks deleted link as dead
* Publish sequence
```mermaid
sequenceDiagram
participant Author
participant Network
loop for each action written to source chain
Author->>Author: transform action into DHT operations
loop for each DHT operation
Author->>Author: sign operation with private key
Author->>Author: determine DHT base address of operation (depends on operation type)
opt author isn't connected to any DHT authority peers responsible for base address
Author->>Network: request agent IDs for responsible authorities
Network-->>Author: list of agent IDs
end
loop for each known authority, or until number of collected validation receipts >= required validations for operation
create participant Authority as DHT authority peer
Author->>Authority: publish signed operation
Authority->>Authority: validate operation
alt ❌ operation is invalid
Authority->>Authority: add author to block list
Authority->>Authority: create negative validation receipt
else ✔️ operation is valid
Authority->>Authority: integrate operation into DHT store
Authority->>Authority: create positive validation receipt
end
Authority->>Authority: sign validation receipt
destroy Authority
Authority-->>Author: validation receipt
end
end
end
par in background
loop until number of collected validation receipts >= required validations
Note over Author,Network: do publish loop as above, skipping authorities who have already sent a receipt
end
end
```
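
The 'operations that a publish produces' list above could also be presented as a lookup from action type to DHT operations and their authorities. The sketch below restates that list as a toy function; `ActionKind`, `Visibility`, `DhtOp`, `ops_for`, and `authority` are hypothetical names for illustration (the op names follow the collapsed validation-side names used in the list, not necessarily the ops actually sent over the wire).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Visibility {
    Public,
    Private,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ActionKind {
    Create(Visibility),
    Update(Visibility),
    Delete,
    CreateLink,
    DeleteLink,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DhtOp {
    RegisterAgentActivity,
    StoreRecord,
    StoreEntry,
    RegisterUpdate,
    RegisterDelete,
    RegisterCreateLink,
    RegisterRemoveLink,
}

/// Operations produced for an action, following the list above.
pub fn ops_for(action: ActionKind) -> Vec<DhtOp> {
    // Every action produces these two operations.
    let mut ops = vec![DhtOp::RegisterAgentActivity, DhtOp::StoreRecord];
    match action {
        ActionKind::Create(v) => {
            // Private entries produce no `StoreEntry`.
            if v == Visibility::Public {
                ops.push(DhtOp::StoreEntry);
            }
        }
        ActionKind::Update(v) => {
            // An update is a create, plus `RegisterUpdate`.
            if v == Visibility::Public {
                ops.push(DhtOp::StoreEntry);
            }
            ops.push(DhtOp::RegisterUpdate);
        }
        ActionKind::Delete => ops.push(DhtOp::RegisterDelete),
        ActionKind::CreateLink => ops.push(DhtOp::RegisterCreateLink),
        ActionKind::DeleteLink => ops.push(DhtOp::RegisterRemoveLink),
    }
    ops
}

/// Which address each operation is sent to, as described in the list above.
pub fn authority(op: DhtOp) -> &'static str {
    match op {
        DhtOp::RegisterAgentActivity => "author's public key",
        DhtOp::StoreRecord => "hash of action",
        DhtOp::StoreEntry => "hash of entry",
        DhtOp::RegisterUpdate => "original entry hash and original action hash",
        DhtOp::RegisterDelete => "deleted entry hash and deleted action hash",
        DhtOp::RegisterCreateLink | DhtOp::RegisterRemoveLink => "link base",
    }
}
```

For example, `ops_for(ActionKind::Create(Visibility::Private))` yields only `RegisterAgentActivity` and `StoreRecord`, which matches the note that private entries skip `StoreEntry`.
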
* Links, paths, and anchors
* `create_link`
* base/target don't need to exist
* `delete_link`
* `get_links` and `get_link_details`
* filtering by type and tag
* Paths and anchors
* can't specify relaxed chain top ordering
* Entries and CRUD
* Unique identifiers are hashes
* Action hash or entry hash? you get to decide (maybe a collection of patterns)
* for update and delete, metadata is attached to both original entry base and original action base
* Max payload size
* `create`
* Entry content is deduplicated
* `update`
* Pattern for jumping the update chain (put in pattern section instead?)
* what happens when you update an `AgentID` entry?
* old one is retired
* (future) sys validation with DPKI
* `delete`
* Nothing's actually deleted
* `get`, `get_details`
* (future) actual removal
* `withdraw`, remove one's own actions after an accidental source chain fork
* `purge`, remove entry content (but not the actions that wrote it)
* relaxed chain top ordering
* eliminates source chain head moved errors, but all writes in function call have to have it
* Zome functions and calls
* `#[hdk_extern]`
* single parameter
* return value type
* short-circuiting fallible functions with `?`
* Remote calls
* Cross-zome/cell calls
* Capabilities
* Specially supported pattern: fire-and-forget remote signals
* `post_commit`
* infallible, read-only, but can `call()` another function
* Warning against sending signals in a zome call unless you know what you're doing (writes may not succeed)
* source chain head moved errors
* Integrity
* Validation
* Determinism
* dependencies
* `must_get_*` functions
* `must_get_valid_record` is the only one that checks for validation receipts/warrants because an entry is only valid/invalid in the context of an action
* this also means it trusts the word of others (1-of-n validation)
* short-circuiting with `?`
* Operate on ops
* flat ops convenience thingy
* Considerations re: what should be validated on each op (IOW, what authorities should be responsible for what things)
* Things that can be validated
* membrane proof
* entry structure (`entry_def` macro gives you deserialisation and error short-circuiting for free with `?` operator)
* permission
* rate limiting with `weight` field
* dependencies, incl source chain history
* Inductive validation for costly dep trees (pattern?)
* Limitations
* Cannot `must_get` links or actions on a base
* Cannot currently co-validate multiple actions (can only validate an action based on prior valid actions)
* Cannot validate the non-existence of something, because that can always change
* (future) source chain restructured to atomic bundle of actions, co-validated
* sys validation
* Lifecycle of a validation
* At author time
* At publish time
* At gossip time
* Membrane proof
* Genesis self-check -- not a 'true' validation function, just a way to guard yourself against copy/paste mistakes and other things that can permanently hose your chance of joining a network
* (future) Handled specially -- restricts/grants access to a network; validated at handshake time (turns out this is not currently implemented, and there are questions about how to implement it in a way that doesn't carry a performance hit with each new peer connection -- and there may be lots of them in a big heavily sharded DHT)
* `AgentValidationPkg` is the only action for which an honest person can get warranted, because they try to join the network and publish it before they're able to fetch deps
* Blocking
* What does and doesn't happen
* (future) sticky validation
* Consequence of validation failure
* App-level blocks can also be applied arbitrarily without needing proof of invalid activity, and can be removed
* All blocks are agent-centric; a network-wide block is a consequence of each individual peer blocking the agent
* Signals
* Local signals to UI
* Remote signals to peers on same DHT
* needs cap grant on `recv_remote_signal`
* (future) pub/sub to replace most use cases for remote signals
* not supported between zomes or between cells :(
* (future) DPKI and other conductor services
* What do they do?
* Involved in sys validation
* Sys validation
* What does it validate?
* Source chain continuity
* Timestamp monotonicity
* Contiguity of authorship
* (future) if DPKI enabled, sys validation checks that changes of agent ID are reflected in DPKI DNA as-at action timestamp
* What else?
* App lifecycle
* Cell initialisation
* Peer discovery -- connecting to other peers via bootstrap server, mDNS, or injected network addresses
* NAT holepunching -- WebSocket proxy server
* DNA properties
* What can you do with them?
* Where/how are they accessed by your code?
* How can the user modify them?
* What happens then? (answer: a fork)
* `*_info` functions
* Fun little helpers
* Crypto
* hashing
* Ed25519 signing/verifying
* box encryption
* Randomisation
* Timestamp
* Tracing
* Lair and what it can do
* Supplies crypto primitives to HDK
* Generates agent keys for conductor, signs actions
* Countersigning
* Automated process; agreement to countersign a given piece of data should have already been made
* online and interactive (in distributed systems terms)
* Document the heck out of the lifecycle
```mermaid
sequenceDiagram
box Alice
participant ACell as Alice's cell
participant ACond as Alice's conductor
end
box Bob
participant BCond as Bob's conductor
participant BCell as Bob's cell
end
ACell->>ACell: prepare entry to be countersigned
ACell->>BCell: send entry data (signal or zome function)
BCell-->>ACell: acknowledge, store, and approve of entry data
ACell->>ACell: create preflight request
Note right of ACell: contains parties involved, hash of entry to be signed, time window, action stub, arbitrary preflight bytes
ACell->>BCell: call arbitrary zome function called `receive_preflight` with prepared preflight request
%% BCell->>
```
* Patterns for implementing the interactions that aren't handled at the system level
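
The three sys validation checks listed under 'What does it validate?' above (source chain continuity, timestamp monotonicity, contiguity of authorship) can be illustrated with a toy chain checker. This is a sketch under stated assumptions, not Holochain's sys validation: `ChainAction`, `ChainError`, and `check_chain` are hypothetical, hashes are plain integers, 'monotonicity' is assumed to mean strictly increasing timestamps, and 'contiguity of authorship' is read as 'every action has the same author'.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct ChainAction {
    /// Stand-in for a real action hash.
    pub hash: u64,
    /// Hash of the previous action; `None` only for the first action.
    pub prev: Option<u64>,
    pub author: String,
    pub timestamp: u64,
}

/// Each variant carries the index of the offending action.
#[derive(Debug, PartialEq)]
pub enum ChainError {
    BrokenLink(usize),
    TimestampNotIncreasing(usize),
    AuthorChanged(usize),
}

/// Check each action against the one immediately before it.
pub fn check_chain(chain: &[ChainAction]) -> Result<(), ChainError> {
    for (i, pair) in chain.windows(2).enumerate() {
        let (prev, next) = (&pair[0], &pair[1]);
        let idx = i + 1; // index of `next` in `chain`
        // Source chain continuity: `next` must point at `prev`.
        if next.prev != Some(prev.hash) {
            return Err(ChainError::BrokenLink(idx));
        }
        // Timestamp monotonicity (assumed strict).
        if next.timestamp <= prev.timestamp {
            return Err(ChainError::TimestampNotIncreasing(idx));
        }
        // Contiguity of authorship (assumed: same author throughout).
        if next.author != prev.author {
            return Err(ChainError::AuthorChanged(idx));
        }
    }
    Ok(())
}
```

A two-action chain where the second action's `prev` matches the first action's `hash`, has a later timestamp, and has the same author passes; changing any one of those three produces the corresponding error.
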
## Patterns
* Lobby
* gaining access to privileged DHTs
* getting directory of DNA clones
* mediated access to other DHTs via `call_remote`
* Blockchain binding
* File chunking
* DNA composability
* One DHT per file/channel
* Throwaway DHT
* Per-role DNAs (combine with Lobby for mediated access)
* Various CRUD patterns
* See [https://github.com/mjbrisebois/blog/blob/master/drafts/holochain-architectural-pattern-vectors.md](https://github.com/mjbrisebois/blog/blob/master/drafts/holochain-architectural-pattern-vectors.md)
* Canonical updates
* Simple resolution strategy (e.g., LWW)
* Branch pointer (requires consensus protocol)
* 'latest update' link
* Update trees
* Communications
* [Asynchronous private messaging](https://forum.holochain.org/t/asynchronous-private-messaging/1085)
* [Mailbox](https://forum.holochain.org/t/mailbox/1084)
* Access control / privileges / privacy / secrecy
* Benevolent dictator
* Progenitor
* [Capability delegation](https://forum.holochain.org/t/capability-delegation/1082)
* Memproofs for DHT read / peer connection privileges
* Token gating
* Group password (better handled by network seed TBH)
* Certificate chains
* Lobby can be used to distribute memproofs
* DHT writes
* Only author can edit
* Read-only public DHT with memproof for write privileges (necessary for Holo resilience nodes to work)
* Encrypted DHT data
* key rotation for forward secrecy
* double ratchet
* manual rotation
* Private data and `call_remote`
* Homomorphic encryption
* ZKP
* Negotiation of shared secret, e.g., dice roll
* SMPC (e.g., ECDSA)
* just sharing the darn secret
* Shared secrets, public disclosure/auditing
* Validation
* Inductive validation
* Remote signals
* Heartbeat / presence indicators
* Online-ish-ness -- dealing with ambiguous definition of 'online' in P2P context
* swarm connectivity -- how connected am I to the people I'm doing work with?
* beacon or sentinel -- am I connected to 'the internet' (or at least the portion that I expected to see)?
* Anchor pub/sub
* Publish frontrunning -- sending out a signal in `post_commit` to let listeners know that a DHT entry should exist there eventually
* Real-time collaboration
* Syn v1 (scribe that collects CRDTs from collaborators, periodically commits change sets)
* Syn v2 (no scribe needed; don't know how this'll work yet)
* Branching revisions
* Consensus protocols
* Countersigning-based
* Single notary
* Notary pool (m-of-n optional witnesses)
* Hardened metadata (will be supported by Holochain eventually; curious if it could be implemented as an app-level pattern)
* Blockchain binding
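
As an example of what a pattern write-up might include, here is a minimal sketch of the 'simple resolution strategy (e.g., LWW)' item under canonical updates above. Given all known updates to an entry, every peer picks the one with the latest timestamp and breaks ties on the action hash, so peers that see the same set of updates agree on the winner regardless of arrival order. `Update` and `last_writer_wins` are hypothetical names, and hashes are plain strings here.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Update {
    /// Stand-in for the hash of the action that wrote this update.
    pub action_hash: String,
    /// Author-supplied timestamp of the update.
    pub timestamp: u64,
    pub content: String,
}

/// Pick the winning update: latest timestamp, ties broken by larger hash.
/// Returns `None` if there are no updates.
pub fn last_writer_wins(updates: &[Update]) -> Option<&Update> {
    updates.iter().max_by(|a, b| {
        a.timestamp
            .cmp(&b.timestamp)
            .then_with(|| a.action_hash.cmp(&b.action_hash))
    })
}
```

Because the tie-break is on the hash rather than on arrival order, reversing the input slice does not change the result. Note that timestamps are chosen by authors, so this strategy trusts them not to post-date their updates.
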