# Thread happ design Summary

## Requirements

Given these requirements for one year of use:

- Group of < 100 members
- Using < 50 applets
- Max threads per entry: 100
- Max number of beads: 400k
- Max number of threads: 3k
- Max size of a thread: 200k beads

Any agent can start a thread from any topic, even off of other threads.

## Design

### Paths

![](https://i.imgur.com/rJ1raMs.png)

### Threads

![](https://i.imgur.com/YSQcK7j.png)

### SemanticTopic

![](https://i.imgur.com/7qejyGx.png)

## Summary

A Thread is a sequence of *beads* starting from a *ParticipationProtocol*. A *ParticipationProtocol* links off a *Subject*, which can be any `AnyLinkableHash`.

The *ParticipationProtocol* (PP) states the purpose of the thread (default is "general discussion") and the participation rules, and holds the type of the subject. Since a subject can be anything, we need to know what type of object we are linking against (agent, entry, another bead, external data, etc.) to determine behavior in code and UI.

A bead is any type of message. For a start, only the text message type is implemented; audio messages could be the next step for a proof-of-concept. The *ParticipationProtocol* could define the maximum allowed length of a text message, how many beads an agent can add to a thread (spam limit), or even who is allowed to add beads (white list).

The happ defines a `SemanticTopic` entry-type, which is a simple public Entry with only a title string value. When committed, it is linked from a "semantic topics" anchor.

A bead holds the ActionHash of the PP it is attached to, and also the ActionHash of the previous known bead in the thread. If it is the first bead of the thread, then that value is null and the bead is considered the root bead.

When a bead is committed, it is linked from a time-index based on its PP's ActionHash (i.e. there is one time-index per PP). The granularity is 1 day.

When "opening" a thread in the UI, we grab the 10 last beads from the time-index.
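
As a rough illustration of the per-PP time-index and the "grab the 10 last beads" step, here is a minimal in-memory sketch in Rust. It is not the happ's actual code: `ThreadTimeIndex`, `BeadLink` and `latest` are hypothetical names, a `BTreeMap` stands in for the links on the DHT, and timestamps are assumed to be seconds since the Unix epoch.

```rust
use std::collections::BTreeMap;

/// Seconds per day: the granularity of the per-thread time-index.
const DAY_SECS: i64 = 86_400;

/// Hypothetical stand-in for a link to a bead:
/// its creation time (seconds since epoch) and its hash as a string.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct BeadLink {
    pub created_at: i64,
    pub bead_hash: String,
}

/// In-memory model of one thread's time-index:
/// day bucket -> beads created during that day.
#[derive(Default)]
pub struct ThreadTimeIndex {
    buckets: BTreeMap<i64, Vec<BeadLink>>,
}

impl ThreadTimeIndex {
    /// File a bead under the day bucket of its creation time.
    pub fn insert(&mut self, link: BeadLink) {
        let day = link.created_at.div_euclid(DAY_SECS);
        self.buckets.entry(day).or_default().push(link);
    }

    /// Return up to `n` beads, newest first, visiting day buckets
    /// from the most recent one to the oldest one.
    pub fn latest(&self, n: usize) -> Vec<BeadLink> {
        let mut out = Vec::new();
        for bucket in self.buckets.values().rev() {
            let mut day: Vec<BeadLink> = bucket.clone();
            // Newest bead of the day first.
            day.sort_by(|a, b| b.created_at.cmp(&a.created_at));
            for link in day {
                if out.len() == n {
                    return out;
                }
                out.push(link);
            }
        }
        out
    }
}
```

Walking the buckets from newest to oldest means that opening a thread only touches the most recent days, and a later query for older beads can resume from the oldest bucket already visited.
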
When scrolling upwards to go further back in time, we query the time-index based on the oldest known bead's creation time.

When a thread is "closed" in the UI, or after a lot of beads have been retrieved for a given thread, we commit a private `ThreadQueryLog` entry that logs the creation time and ActionHash of the newest known bead for that thread. That way, the next time the user "opens" the thread, we first retrieve the latest `ThreadQueryLog` from the agent's source-chain, and query the time-index from that logged time to the current time. Any new bead since the last query can be marked "unread" by the UI.

Beads committed offline and shared at a later time will be detected if several beads have the same "reply_of" value. It's up to the UI to decide how to display these inner-thread forks.

Any participating agent can flag any bead as invalid. It's up to the UI to decide how to display "invalid" beads.

When a bead is committed, it is also linked from a global time-index. The granularity is 1 hour. The link tag must hold the value of the bead's PP's ActionHash. By querying the global time-index for the latest beads, we can determine which threads have unread beads by just parsing the link tags.

PPs are linked from the global time-index as well, in order to detect new threads. New threads are discovered by looking for the latest PPs in the global time-index. A private `GlobalQueryLog` entry is committed to store the time of the last known PP.

Given the size of the global time-index, it is not recommended to traverse the whole index (i.e. starting from the dna origin-time). Thus, Subjects are also indexed per SubjectType per dnaHash. To find all available threads for an applet, we can simply query all anchor links from its dnaHash. This should be a lot faster than querying the global time-index from the dna's "origin-time".

Finally, a PP is linked from its subject, with the subject hash as tag. This allows querying a subject for all its threads.
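
The unread-thread detection on the global time-index can be sketched as follows. This is only an illustration under assumptions: `GlobalBeadLink` and `threads_with_unread` are hypothetical names, hashes are plain strings, and the cutoff time is assumed to come from a previously committed query log.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a link found in the global time-index.
/// `pp_hash` models the link tag, which holds the ActionHash of the bead's PP.
pub struct GlobalBeadLink {
    pub created_at: i64,
    pub pp_hash: String,
}

/// Collect the threads (identified by PP hash) that received beads after
/// `last_query_time`. Only the link tags are read; no bead is fetched.
pub fn threads_with_unread(
    links: &[GlobalBeadLink],
    last_query_time: i64,
) -> BTreeSet<String> {
    links
        .iter()
        .filter(|link| link.created_at > last_query_time)
        .map(|link| link.pp_hash.clone())
        .collect()
}
```

Because the result is a set, a thread that received many new beads shows up once, which is all the UI needs in order to mark it as having unread beads.
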
If an applet's UI wants to display a subject's threads, it would simply need to query for "threads" links off the AnyLinkableHash and display the result accordingly.

Given the accumulation of query log entries, we expect to "migrate" the source-chain after a period of time (1 year) to flush out obsolete data. It is perhaps preferable to have the UI store the query logs, as that would take less storage space, but this design implies a UI-agnostic solution.

# Tree Index

## Terminology

#### From Holochain

- A **Component** is arbitrary bytes (`Vec<u8>`) to be hashed together in a predictable way.
- A **Path** is a vector of **Components**.
- A **TypedPath** is a **Path** with a **ScopedLinkType** (bundled `link_index` & `zome_index`).
- An **Anchor** is a String-based **Path** (i.e. all components are strings).

#### Anchor

- An **Anchor** is the string representation of a **Path** (e.g. `all_profiles.dan.`).
- A **TypedAnchor** is an **Anchor** with a **ScopedLinkType**.
- A **RootAnchor** is an **Anchor** with only one **Component**.
- A **LeafAnchor** is an **Anchor** without *Anchor* children (e.g. `all_profiles.dan.daniel`).
- An **InternalAnchor** is an **Anchor** with *Anchor* children.
- An **Item** is any object that links off a **LeafAnchor** (and that is not itself an *Anchor*).
- A **FurnishedAnchor** is an **Anchor** with **Items** linking from it.
- An **ItemLink** is the **Link** between a **LeafAnchor** and an **Item**. It is typed. LeafAnchor type != ItemLink type.

#### Time

A **TimePath** is a timestamp representation of a **Path**.
All components are a single `i32`.

### Verbs

- **probe**: Query the DHT for data
- **get**
- **find**:
- **search**: Look up a specific data value
- **query**
- **retrieve**
- **look_for**
- **scan**
- **Insert**: Add an item to a tree
- **Remove**: Remove an item from the tree
- **Walk**
- **Traverse**
- **Collect**
- **Enumerate**

Tree operations:

- Enumerating all the items
- Enumerating a section of a tree
- Searching for an item
- Adding a new item at a certain position on the tree
- Deleting an item
- Pruning: Removing a whole section of a tree
- Grafting: Adding a whole section to a tree
- Finding the root for any node
- Finding the lowest common ancestor of two nodes
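
To make the anchor terminology and one of the tree operations concrete, here is a small sketch that treats an anchor as a dotted string, as in the `all_profiles.dan.daniel` example. The helper names (`components`, `is_root`, `lowest_common_ancestor`) are hypothetical, and ignoring empty segments such as a trailing `.` is an assumption of this sketch, not a rule taken from Holochain.

```rust
/// Split a dotted anchor string such as "all_profiles.dan.daniel" into its
/// components. Empty segments (e.g. from a trailing '.') are ignored.
pub fn components(anchor: &str) -> Vec<&str> {
    anchor.split('.').filter(|c| !c.is_empty()).collect()
}

/// A RootAnchor is an anchor with exactly one component.
pub fn is_root(anchor: &str) -> bool {
    components(anchor).len() == 1
}

/// Lowest common ancestor of two anchors: their longest shared prefix of
/// components. Returns `None` when the anchors do not share a root.
pub fn lowest_common_ancestor(a: &str, b: &str) -> Option<String> {
    let shared: Vec<&str> = components(a)
        .into_iter()
        .zip(components(b))
        .take_while(|(x, y)| x == y)
        .map(|(x, _)| x)
        .collect();
    if shared.is_empty() {
        None
    } else {
        Some(shared.join("."))
    }
}
```

Under this definition, when one anchor is an ancestor of the other, the ancestor itself is returned.
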