Radicle Protocol Overview (Heartwood Release)

# Radicle Protocol Overview (Heartwood Release)

[TOC]

---

**Changelog & Notes**

| Date | Name | Notes |
|---|---|---|
| 2-Jan-2024 | Stellar | All of the sections now have more comprehensive summaries. There are more details on images as well. Please review the comments. |
| 31-Dec-2023 | Stellar | Couple minor things left - I made some comments on sentences/paragraphs/images that need more work. It did end up finalizing on a format where each section has an intro/summary paragraph after all. For the copy comments, let me know if you want me to spend more time on it or if you want to take over. For the image items, I'll give it a little more thought on Jan 2nd. |
| 23-Dec-2023 | Stellar | Made a new document version based on the older document here: https://hackmd.io/ksY_LAkaSV-4RDcbwzviwQ |

Note: When I mention "side bar" below, I mean formatted as information in the right hand side of the page, like the supplementary links in the SSB guide: https://ssbc.github.io/scuttlebutt-protocol-guide/#keys-and-identities

**TODOs**

- [ ] Identity document section - add a little blurb that talks about custom signing schemes and how those can also be set/configured, whether tied to pgp, crypto tokens, etc.
- [ ] P2P section - mention more technicalities related to transport... udp or tcp ... future work to come on nat holepunching ? Also rationale for not using libp2p or other existing frameworks and why building a new one?
- [ ] P2P intro /ssb - consider mentioning publish/subscribe paradigm.
- [ ] Seeding repositories - Do you think this should remain in the Nodes section, or move to P2P?

---

## Summary

The Heartwood release of the Radicle protocol establishes a sovereign data network for code collaboration and publishing, built on top of Git. In Radicle, users maintain local copies of the repositories they are interested in, along with related social artifacts like issues and patches.
Instead of depending on a centralized service like GitHub, each participant in Radicle operates a node that can run on a personal computer and is connected to a peer-to-peer network. Nodes, identified by public keys, host and synchronize Git repositories across the network, using a novel gossip protocol for peer and repository discovery, alongside the Git protocol for data replication. In combination, this allows peers to locate, replicate, and verify any repository published to the network, provided at least one other peer seeding the repository is online. Since Radicle is built on Git, it can easily interoperate with existing tools and workflows.

Radicle's architecture is local-first, ensuring continuous access to one's repositories directly from one's own device, regardless of internet connectivity. Repositories have unique identifiers and are self-certifying: all actions, from committing code to adding a comment to an issue, are performed locally and cryptographically signed, so that once they have propagated to the network, peers can verify their authenticity and provenance. This allows trust to be established without reliance on a centralized authority.

Radicle is designed for extensibility, allowing for diverse use cases without requiring modifications at the protocol level. This guide covers the capabilities of Radicle’s Heartwood release, which focuses on code collaboration and code publishing. Nonetheless, a range of other applications is foreseen in the future, including knowledge sharing, project coordination, and data set collaboration.

`[🖼️Img: Display a high level network diagram. New website image described - This image should be slightly inspired by the current "radicle stack" image on the home page, but adding a few more dimensions or peers that illustrates how radicle replicates and syncs data across nodes (e.g. a visual diagram of the 'seed' action, partially) - has collaborative objects (COBs) in the diagram)]`

## 1. Introduction

Git, the most widely-used distributed version control system, enables users to maintain and modify personal copies of data repositories, commonly for source code. Direct user-to-user collaboration with Git, while feasible, is often cumbersome because Git focuses primarily on version control rather than collaboration. As a result, users frequently opt for centralized services like GitHub or GitLab, which offer enhanced interfaces and collaborative tools on top of Git, such as project management and code review. This dependency, however, can result in vendor lock-in since it places a project's social artifacts (e.g. issues, comments, pull requests) under corporate control, potentially compromising data sovereignty and censorship resistance.

Traditional self-hosting options like Gitea or GitLab Self-Managed provide more sovereignty but often lead to fragmented collaboration environments, as users must create separate profiles for each self-hosted instance. This also limits a project's exposure to the wider open source community, a key advantage of platforms like GitHub, which have grown significantly due to network effects. This isolation can impact the visibility and collaborative potential of projects that choose traditional self-hosting solutions.

The Radicle protocol, in contrast, extends Git's capabilities with a decentralized identity system, a novel gossip protocol, and integrated social artifacts, forming a *self-hosted network for code collaboration*. The protocol locates, serves, and replicates Git repositories across a peer-to-peer network while maintaining data authenticity via cryptographic signatures, so peers can directly exchange data without the need for a trusted third party. This enables communities to both self-host and share their repositories across a distributed protocol, contributing to the emergence of a new sovereign network for code collaboration and more.

## 2. Nodes

Radicle is a peer-to-peer system, which means that there is no traditional client-server model. Peers on the Radicle network are referred to as *nodes*, and are indistinguishable from users at the protocol level. Nodes are identified by their public keys, also referred to as [Node IDs (NIDs)](#Node-Identifier-NID), and are responsible for seeding Git repositories, each identified by a unique [Repository ID (RID)](#Repository-Identifier-RID). The seeding process involves both hosting the repository data and synchronizing changes with other nodes.

Every Radicle user, irrespective of their role or activity, operates a node by installing a [Radicle client](#Radicle-Clients) on their device. No specialized equipment is necessary for operating a node as a typical end-user; a node can run on a personal computer without requiring an always-on server.

`[🖼️Img: Display 3 nodes, displaying NIDs and differing repo inventories. One of the nodes should be a seed node with a very large repo inventory whereas another just has 4 repos on the list.]`

![image](https://hackmd.io/_uploads/S1_RCK1O6.png)

### Seeding Repositories

Users configure nodes with a *seeding policy* which specifies the list of repositories they are interested in seeding, in addition to retention rules. Nodes therefore do not seed arbitrary repositories: users actively choose which repository data is stored on their device. The typical end-user may choose to only seed the repositories they are actively collaborating on. However, more dedicated users may opt to run an always-on server as a [seed node](#url), offering their infrastructure to the wider Radicle network or to their community.

Seed nodes significantly enhance the network's capacity to provide continuous access to a broad range of repositories. They can vary in their seeding policies, from *public seed nodes* that openly seed all repositories to *community seed nodes* that selectively seed repositories from a group of trusted peers.
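
As a rough illustration of how a seeding policy separates an end-user node from a public seed node, the decision can be modelled as a small predicate over repository IDs. This is a minimal sketch, not Radicle's configuration format: the `SeedingPolicy` class and its `seed_all`, `allowed`, and `blocked` fields are hypothetical names chosen for the example, and retention rules are left out.

```python
from dataclasses import dataclass, field


@dataclass
class SeedingPolicy:
    """Hypothetical model of a seeding policy (not Radicle's real config)."""

    # True for a node that seeds every repository it learns about,
    # in the spirit of a "public seed node".
    seed_all: bool = False
    # Repository IDs the user explicitly chose to seed.
    allowed: set[str] = field(default_factory=set)
    # Repository IDs the user refuses to seed, even when seed_all is set.
    blocked: set[str] = field(default_factory=set)

    def should_seed(self, rid: str) -> bool:
        """Decide whether this node hosts and synchronizes `rid`."""
        if rid in self.blocked:
            return False
        return self.seed_all or rid in self.allowed


# An end-user who only seeds the project they collaborate on.
end_user = SeedingPolicy(allowed={"rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5"})
# An always-on node that seeds everything.
public_seed = SeedingPolicy(seed_all=True)
```

A community seed node would sit between the two, for example by filling `allowed` from the repositories published by a set of trusted peers.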
[`📖Side bar:` For more details on how to run a seed node, refer to the [Seed Node Guide](#).]

### Node Identifier (NID)

Radicle node identities are based on public-key cryptography[^pkc], so it's easy to verify message and data authenticity from each node within the network through signatures generated by their secret keys. This also allows for consistent identification even as a user's network address varies.

When setting up a node, users create a profile by generating an Ed25519 key pair[^id01]. The public key serves as the node's unique `NodeId` (NID) and is encoded as a Decentralized Identifier (DID)[^did01] using the `did:key` method[^did02]. Creating a profile requires no permission or coordination: the key pair can be created while offline without providing an email address or any personal identifying information. It is important to safeguard the secret key: if it is lost or compromised, the user will have to generate a new `NodeId`. A changeable, non-unique `alias` can optionally be associated with each profile, for easier recognition across the network.

[`📖Side bar:` DIDs are a new identifier standard established by the W3C that are used for interoperability across various systems and provide flexibility for incorporating different types of identifiers in the future.]

[`📖Side bar:` In the Radicle protocol, the terms "Node ID", "Peer ID", and "Public Key" all mean the same thing and can be used interchangeably.]

`[🖼️Img: show something like this, yet have a DID and alias on it]`

![image](https://hackmd.io/_uploads/SyxmI0yIT.png)

### Radicle Clients

To operate a node, users install software that enables their device to function as both a client and a server within the network. For simplicity this is referred to as *Radicle client software* or *client software*. Client software is lightweight and suitable for use on both end-user and seed nodes.
The sole reference implementation is the [Radicle CLI](https://app.radicle.xyz/nodes/seed.radicle.xyz/rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5), both a Radicle node and a command line interface tool, which can be optionally supplemented by a web application (`app.radicle.xyz`).

Radicle is released under the open source MIT and Apache 2.0 licenses, to encourage the development of diverse clients and applications. All client software adheres to the Radicle protocol specification, as outlined in the [Radicle Improvement Proposals](https://app.radicle.xyz/nodes/seed.radicle.xyz/rad:z3trNYnLWS11cJWC6BbxDs5niGo82) (RIPs), ensuring consistent functionality across different implementations.

[`📖Side bar:` For a guided install of the Radicle CLI, check out either the Quick Start or Getting Started guides.]

[^id01]: https://ed25519.cr.yp.to/
[^did01]: https://www.w3.org/TR/did-core/
[^did02]: https://w3c-ccg.github.io/did-method-key/
[^pkc]: https://en.wikipedia.org/wiki/Public-key_cryptography

## 3. P2P Protocol

Radicle adopts a local-first, peer-to-peer (P2P) architecture, which draws inspiration from Scuttlebutt's gossip protocol[^ssb], in which data transmission relies on a publish/subscribe system and peers only host and synchronize the data they have subscribed to. Peer connections in Radicle are secured with the Noise protocol[^np01].

Unlike Scuttlebutt's focus on social networking via append-only logs, Radicle focuses on code collaboration by incorporating Git's repository structure and replication capabilities into a P2P context. This model not only leverages Git's proven version control capabilities but also gives users complete autonomy over their social artifacts. Radicle's P2P architecture, in contrast to federated systems, has no centralized points of failure, allowing the network to persist as long as users operate nodes.
`[🖼️Img: Display a few clusters of peer nodes that are seeding different repositories, so we can see that peers that have the same repositories are regularly talking to each other. Perhaps have different colors (or line types) representing each of the repos and different types of data being exchanged (gossip protocol messages + Git objects)]`

[^ssb]: https://ssbc.github.io/scuttlebutt-protocol-guide/#discovery

### Gossip Protocol

The Radicle networking layer is designed as a gossip protocol, where messages are sent between peers to build routing tables that aid in repository discovery and replication. The core functionality is achieved with three message types, each fulfilling a distinct role:

1. **Node Announcements** are used for broadcasting all network addresses on which a node is publicly reachable, to facilitate peer discovery.
2. **Inventory Announcements** are used for broadcasting repository inventories and constructing the routing table, which maps out which repositories are hosted where.
3. **Reference Announcements** are used for broadcasting updates to repositories, relayed only to nodes seeding the relevant repository.

To prevent endless propagation, nodes drop any message they have already encountered. However, so that messages can be relayed to nodes that join later, gossip messages may be temporarily stored. Each message includes the originating `NodeId` along with a `Signature` and a `Timestamp`, allowing network participants to verify the authenticity of each message they receive.

[`📖Side bar:` Refer to [RIP-1: Heartwood](https://app.radicle.xyz/nodes/seed.radicle.xyz/rad:z3trNYnLWS11cJWC6BbxDs5niGo82/tree/0001-heartwood.md) to understand more details about Radicle's protocol design in the Heartwood release.]

`[🖼️Img: Show an image with the three gossip messages and different data that goes into each of the messages]`

![image](https://hackmd.io/_uploads/S1NSm9Jua.png)

### Transport Encryption & Privacy

Connections between peers in the Radicle network are secured using the Noise protocol. A connection begins with two peers exchanging handshake messages to establish a shared secret key. After the handshake phase, each peer uses this shared key to send encrypted *transport messages*[^np02], so that all further communication between the two peers is encrypted.

Radicle also has a Tor[^tor] integration that users can leverage for network address privacy, so that a node is identified by an `.onion` address[^oni] instead of a standard IP address.

`[🖼️Img: Show something like this diagram below, but also some more details like their network address - like have one of the peers be on Tor with an .onion address and the other one having a standard IP address]`

![image](https://hackmd.io/_uploads/B1RZYKJOa.png)

### Replication via Git

While gossip is used to exchange messages, the actual repository data is transferred through replication using the Git protocol. The process begins when a node receives a *reference announcement* about a repository update and establishes a secure connection to one or more of the repository's seeds. Once connected, the node initiates a `git-fetch` operation, using the Git protocol, to download the relevant Git objects into the node’s storage, which makes them available to other interested nodes. This fetch operation is tunneled over the same secure connection initially formed between the nodes, effectively *multiplexing*[^mp] the same physical connection for both activities.

`[🖼️Img: Create a diagram that shows this multiplexing/tunneling related to the reference announcement and a peer getting new Git objects via git fetch.]`

### Bootstrap Nodes

A node joining the network for the first time will not know of any peers.
Hence, the reference implementation of the Radicle client has been pre-configured with two *Bootstrap Nodes*: `seed.radicle.garden` and `seed.radicle.xyz`, which are registered DNS names that resolve to node addresses on the network. In the bootstrapping process, nodes resolve these names to obtain a set of addresses to initially connect to, and once they have established a connection with a peer, they use the regular peer discovery process to find more peers.

`[🖼️Img: Create a diagram that shows the three phases: from `seed.radicle.garden` and `seed.radicle.xyz`, to the resolved node addresses, to the peer discovery ]`

[^tor]: https://spec.torproject.org/
[^np01]: http://www.noiseprotocol.org/
[^np02]: http://www.noiseprotocol.org/noise.html#introduction
[^mp]: https://en.wikipedia.org/wiki/Multiplexing
[^oni]: https://tb-manual.torproject.org/onion-services/

### Federation vs. P2P

In contrast to federated systems like atproto[^atp], which rely on intermediary relay servers often managed by volunteers or companies, Radicle's structure does not require such dependencies. These federated models, despite their decentralization, can become less reliable when a relay's operational costs outweigh the operator's incentives, leading to potential shutdowns. Additionally, relays depend on domain names, which can be, and regularly are, seized by governments.

Radicle does feature public seed nodes that may operate as always-on servers to aid in repository discovery, but these nodes are not a central dependency and function like any end-user node in the network. In this manner, the Radicle protocol has the advantages of a federated system while mitigating its inherent risks.
`[🖼️Img: Create a diagram that looks similar to the one below, with federation on the left and p2p on the right, except for the 'p2p' version there will be some nodes that look bigger & have more connections, representing the seed nodes. This concept of resilience in this design may work well as an animation that there is less user interruption in p2p design as they dont have to make choice of which seed node to associate to if one goes down and their data is on a couple of them.]`

![image](https://hackmd.io/_uploads/r1Ji9Fk_T.png)

[^atp]: https://atproto.com/

## 4. Repositories

**Repositories** are central to the Radicle network, serving as the primary data object exchanged by peers. A repository in Radicle is fundamentally a Git repository, supplemented with a unique repository identifier (RID) and metadata essential for validating the authenticity of its contents. These repositories, which can be either public or private, can accommodate diverse content including source code, documentation, and data sets.

All repositories are initialized with an [identity document](#identity-document) that defines the repository's settings, such as its name, description, and [delegates](#Delegates) (considered the repository maintainers). The initial version of this document is used to deterministically derive the RID.

`[🖼️Img: show some diagram that displays information architecture of a repo + the additional radicle metadata]`

### Delegates

Repositories are managed by delegates, who are identified by their Node IDs and can be individuals, groups, or bots. Delegates are responsible for critical tasks such as merging patches, addressing issues, and modifying repository settings. A repository always begins with one delegate, its creator, and can remain at that size for smaller projects, or can eventually involve multiple delegates.

### Identity Document

Before a repository can be published on Radicle, it needs to be initialized with an identity document.
This document, a canonical JSON file found at `refs/rad/id`, encapsulates key metadata such as the repository’s `name`, `description`, and `defaultBranch`. It also includes the public keys of the repository's `delegates` and the `threshold` number of delegate signatures required for authorizing changes. An example of such a document for the Heartwood repository would look like this:

```
{
  "delegates": ["did:key:z6MknSLrJoTcukLrE435hVNQT4JUhbvWLX4kUzqkEStBU8Vi"],
  "threshold": 1,
  "payload": {
    "xyz.radicle.project": {
      "name": "heartwood",
      "description": "Radicle Heartwood Protocol & Stack ❤️🪵",
      "defaultBranch": "master"
    }
  }
}
```

### Private Repositories

Radicle supports **private repositories** where access is restricted to a designated group of trusted peers. This is achieved by setting the `visibility` as `private` and defining an optional `allow` list within the repository identity document. This ensures only nodes in the privacy set can replicate and access the data, maintaining confidentiality. While the data is not encrypted at rest, these repositories rely on selective replication through the allow list for privacy, which renders them invisible and inaccessible to other nodes in the Radicle network. An example of what is added to an identity document to make a repository private would look like this:

```
"visibility": {
  "type": "private",
  "allow": [
    "did:key:z6Mkt67GdsW7715MEfRuP4pSZxJRJh6kj6Y48WRqVv4N1tRk"
  ]
}
```

### Repository Identifier (RID)

To ensure uniqueness and easy identification of repositories, a stable and globally unique identifier, known as the Repository Identifier (RID), is assigned to each repository. The RID is deterministically derived from the initial version of the repository's identity document. This process involves using Git’s hash-object command to produce a SHA-1 hash of the document.
The hash is then encoded using `multibase`[^mb] encoding with the `base-58-btc` alphabet, the same encoding used for the `did:key` method, and prefixed with `rad:`, creating a valid URN[^urn]. For example, here's the RID for the Heartwood repository: `rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5`.

[^urn]: https://datatracker.ietf.org/doc/html/rfc8141
[^mb]: https://w3c-ccg.github.io/multibase/

[`📖Side bar:` Refer to [RIP-2: Identity](https://app.radicle.xyz/nodes/seed.radicle.xyz/rad:z3trNYnLWS11cJWC6BbxDs5niGo82/tree/0002-identity.md) to understand more details about repository identity in the Radicle protocol.]

## 5. Local-First Storage

Since storage and replication are tightly coupled, and replication makes use of Git, so does storage. The storage layout is designed in such a way that it's easy to transfer data between peers over the network, using an unmodified Git protocol. Peer data is stored within the same Git repository using `gitnamespaces`[^gns], where Node IDs are used as the namespace.

This allows storage to be managed through a partitioned approach where each user maintains their own *local fork* of a repository, as well as any other forks they have an interest in, all within the same Git repository. These forks are then shared among users across the network. Each repository fork has a single owner and writer, and users are only permitted to make changes to their respective forks.

Storage is accessed directly by the node to report its repository inventory to other nodes, and by the end user through either specialized tooling or `git`. Users typically interact with two potentially diverging copies of a repository on their device: the *working copy* and a hidden copy called the *stored copy*. The two are synchronized with `push` and `pull` commands through Radicle's *git remote helper*[^grh] named `rad`, which works even when offline.
Changes to the stored copy are automatically propagated to the network when the user is connected to the internet. This local-first design not only enhances the user experience by making offline work more frictionless, but also eliminates the need for centralized servers.

`[🖼️Img: show some diagram that is something like a combination of the two ascii diagrams in the following sections.]`

### Storage Layout

Radicle's storage layout is designed to support multiple repositories and multiple peers per repository. Each repository is a bare Git repository, stored under a common base directory and uniquely identified by its Repository ID or `RID`.

Instead of each of the repository's peers storing data in a separate Git repository with a separate object database (ODB), peer data is stored within the same Git repository using the `gitnamespaces` feature. For each peer, including the local peer, their unique Node ID, known as the `NID`, is used as the namespace within each repository to keep their Git references separate. Thus, each peer has its own namespace for references (i.e. `heads`, `tags`, and `notes`), while sharing the objects with other peers via a shared ODB.

Since the underlying storage uses Git, the storage layout below is represented as a file tree on the file-system, with `<storage>` representing the storage root, or top-level directory under which all repositories are stored on a user's device.

For every repository, each peer associated with that repository must have a separate, logical Git source tree, which contains all the usual reference namespaces. This *logical repository* is also known as the repository *fork* or *view*, and allows peers to maintain different sets of changes for the same physical repository. This design ensures only one copy of each object is stored across all repository forks.
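
To make the namespacing concrete, the sketch below shows how a peer's reference name can be qualified with, and recovered from, its namespace, following the `refs/namespaces/<nid>/refs/...` shape of the storage tree. The helper names `namespaced_ref` and `split_namespaced_ref` are hypothetical and exist only for this illustration; they are not part of any Radicle or Git API.

```python
NAMESPACE_PREFIX = "refs/namespaces/"


def namespaced_ref(nid: str, ref: str) -> str:
    """Qualify `ref` (e.g. 'refs/heads/master') with the namespace of peer `nid`."""
    if not ref.startswith("refs/"):
        raise ValueError(f"expected a fully qualified ref, got {ref!r}")
    return f"{NAMESPACE_PREFIX}{nid}/{ref}"


def split_namespaced_ref(full_ref: str) -> tuple[str, str]:
    """Split a namespaced ref back into (nid, ref)."""
    if not full_ref.startswith(NAMESPACE_PREFIX):
        raise ValueError(f"not a namespaced ref: {full_ref!r}")
    nid, sep, ref = full_ref[len(NAMESPACE_PREFIX):].partition("/")
    if not nid or not sep or not ref.startswith("refs/"):
        raise ValueError(f"malformed namespaced ref: {full_ref!r}")
    return nid, ref


# Two peers each own a `master` branch inside the same repository.
first_nid = "z6MknSLrJoTcukLrE435hVNQT4JUhbvWLX4kUzqkEStBU8Vi"
second_nid = "z6Mkt67GdsW7715MEfRuP4pSZxJRJh6kj6Y48WRqVv4N1tRk"
first_master = namespaced_ref(first_nid, "refs/heads/master")
second_master = namespaced_ref(second_nid, "refs/heads/master")
```

Both names refer to a branch called `master`, yet they never collide, because each lives under its owner's Node ID while the underlying objects are shared.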
[^gns]: https://git-scm.com/docs/gitnamespaces

```
<storage>                     # Storage root containing all local repositories
├─ <rid>                      # Storage for first repository
│  └─ refs                    # All Git references locally stored
│     └─ namespaces           # All peer source trees or "forks"
│        ├─ <nid>             # First node's source tree
│        │  └─ refs           # First node's Git references
│        │     ├─ heads       # First node's branches
│        │     │  └─ master   # First node's master branch
│        │     ├─ tags        # First node's tags
│        │     │  ...
│        │     └─ rad
│        │        └─ id       # First node's version of the repository identity document
│        │
│        └─ <nid>             # Second node's source tree
│           ├─ refs           # Second node's references
│           └─ ...
├─ <rid>                      # Storage for second repository
│  ...
└─ <rid>                      # etc.
   ...
```

Though this storage tree is browsable with standard file system commands, users should not modify it directly, since doing so risks corrupting the data. Additionally, Git is free to pack the objects, which means they may not always appear as individual files.

[`📖Side bar:` Refer to [RIP-3: Storage Layout](https://app.radicle.xyz/nodes/seed.radicle.xyz/rad:z3trNYnLWS11cJWC6BbxDs5niGo82/tree/0003-storage-layout.md) to understand more details about how storage works in the Radicle protocol.]

### Working & Stored Copies

Users will typically have *two* copies of a repository on their device: a *working copy*, which they regularly update, and a hidden *stored copy*, which is the copy propagated to the network and which the user interacts with via `push` and `pull` commands.

The working copy is set up so that it is linked to storage via a *git remote helper*[^grh] named `rad`. Publishing code is then a matter of running `git push rad`, for example. With this architecture, users can publish code even when offline: changes to the stored copy are automatically propagated to the Radicle network once the user connects to the internet, requiring no extra steps.
`(Note: this diagram below is not very intuitive and should be improved)`

```
┌╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┐          ┌╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┐
┆ ┌───────────────────┐ ┌──────┐ ┆          ┆ ┌──────┐ ┌─────────────────┐ ┆
┆ │      Storage      │ │      │ ┆   Git    ┆ │      │ │     Storage     │ ┆
┆ │                   ├╸┆╸╸╸╸╸╸┆╸╸╸╸╸╸╸╸╸╸╸╸╸╸┆╸╸╸╸╸╸┆╸┤                 │ ┆
┆ │ ┌──────┐ ┌─────┐ ┌│ │      │ ┆ protocol ┆ │      │ │ ┌─────┐ ┌─────┐ │ ┆
┆ │ │repo  │ │repo │ ││ │      │ ┆          ┆ │      │ │ │repo │ │repo │ │ ┆
┆ │ ├──────┤ ├─────┤ ├│ │      │ ┆          ┆ │      │ │ ├─────┤ ├─────┤ │ ┆
┆ └─┴───╿──┴─┴───┬─┴─┴┘ │      │ ┆          ┆ │      │ └─┴───┬─┴─┴───╿─┴─┘ ┆
┆       │        │      │      │ ┆  gossip  ┆ │      │       │       │     ┆
┆       │        │      │ Node ├╸╸╸╸╸╸╸╸╸╸╸╸╸╸┤ Node │       │       │     ┆
┆       │        │      │      │ ┆ protocol ┆ │      │       │       │     ┆
┆     push     pull     │      │ ┆          ┆ │      │     pull    push    ┆
┆       │        │      │      │ ┆          ┆ │      │       │       │     ┆
┆       │        │      │      │ ┆          ┆ │      │       │       │     ┆
┆       │        │      │      │ ┆          ┆ │      │       │       │     ┆
┆  ┌────┴───┐ ┌──╽─────┐│      │ ┆          ┆ │      │ ┌─────╽──┐ ┌──┴────┐┆
┆  │working │ │working ││      │ ┆          ┆ │      │ │working │ │working│┆
┆  │copy    │ │copy    ││      │ ┆          ┆ │      │ │copy    │ │copy   │┆
┆  └────────┘ └────────┘└──────┘ ┆          ┆ └──────┘ └────────┘ └───────┘┆
└╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┘          └╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┘
```

[^grh]: https://git-scm.com/docs/gitremote-helpers

### URL Scheme

The Radicle protocol uses a custom URL scheme to pinpoint specific repository remotes within the network, structured as `rad://<rid>/<nid>`, which is essentially the Repository ID followed by the Node ID. The remote helper is what allows Git to interpret URLs with the `rad://` scheme. By using this scheme with Git, the user instructs Git to invoke the `git-remote-rad` executable during `git push` or `git fetch`, which allows the user to interact with the network through the storage layer. The `<nid>` component is what the `--namespace` option will be set to.
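
The structure of such a URL can be illustrated with a few lines of parsing code. This is a minimal sketch assuming only the `rad://<rid>/<nid>` shape described above, with the `<nid>` part treated as optional; `RadUrl` and `parse_rad_url` are hypothetical names for this example and do not mirror the remote helper's implementation.

```python
from typing import NamedTuple, Optional

SCHEME = "rad://"


class RadUrl(NamedTuple):
    rid: str            # Repository ID component
    nid: Optional[str]  # Node ID component, or None when the URL names no peer


def parse_rad_url(url: str) -> RadUrl:
    """Split a `rad://<rid>[/<nid>]` URL into its components."""
    if not url.startswith(SCHEME):
        raise ValueError(f"not a rad:// URL: {url!r}")
    rid, _, nid = url[len(SCHEME):].partition("/")
    if not rid:
        raise ValueError("missing repository ID")
    if "/" in nid:
        raise ValueError("unexpected extra path segment")
    return RadUrl(rid=rid, nid=nid or None)


with_peer = parse_rad_url(
    "rad://z42hL2jL4XNk6K8oHQaSWfMgCL7ji/"
    "z6MknSLrJoTcukLrE435hVNQT4JUhbvWLX4kUzqkEStBU8Vi"
)
without_peer = parse_rad_url("rad://z42hL2jL4XNk6K8oHQaSWfMgCL7ji")
```

Plain string handling is used on purpose: both identifiers are case-sensitive base58 strings, so nothing in the URL should be normalized or lowercased.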
Here's an example URL for repository `z42hL2jL4XNk6K8oHQaSWfMgCL7ji` and peer `z6MknSLrJoTcukLrE435hVNQT4JUhbvWLX4kUzqkEStBU8Vi`:

```
rad://z42hL2jL4XNk6K8oHQaSWfMgCL7ji/z6MknSLrJoTcukLrE435hVNQT4JUhbvWLX4kUzqkEStBU8Vi
```

If `<nid>` is not specified, Git will interact with the [repository's canonical references](#Canonical-Repository-Version), also known as the default or authoritative repository state. Here's a sample URL for the same repository above, that will retrieve its canonical references:

```
rad://z42hL2jL4XNk6K8oHQaSWfMgCL7ji
```

## 6. Trust with Self-Certification

Based on this local-first storage layout, there are technically no shared branches in Radicle. Instead, each branch is owned by a single user, and branches are partitioned by Node ID (NID) in the repository hierarchy. In a project with multiple delegates, for example Alice, Bob and Eve, each would have their own `master` branch stored under their `<nid>`, and each can write only to their own branch.

Note that top-level, *canonical* references may still exist (i.e. `<rid>/refs/{heads,tags}`), yet peers do not directly write to this space. Instead, the canonical or 'authoritative' version of the repository is established *dynamically*, based on a threshold of delegates having the same commit. Repositories in Radicle are self-certifying, with delegates' cryptographic signatures over major changes recorded as "signed refs". These signed refs facilitate updates to the canonical references, ensuring data integrity.

```
<storage>
└─ <rid>
   └─ refs
      ├─ HEAD                 # Canonical head reference
      ├─ heads                # Canonical branches
      │  └─ master            # Canonical master branch
      ├─ tags
      │  └─ v1.0.0            # Canonical v1.0.0 release tag
      ├─ rad
      │  └─ id                # Canonical identity reference
      └─ namespaces           # All peer source trees
         ├─ <nid>             # First node's source tree
         │  └─ refs           # First node's Git references
         │     ├─ heads       # First node's branches
         │     │  └─ master   # First node's master branch
         │     ...
         ├─ <nid>             # Second node's source tree
         │  └─ refs           # Second node's Git references
         │     ├─ heads       # Second node's branches
         │     │  └─ master   # Second node's master branch
         │     ...
         ├─ <nid>             # Third node's source tree
         │  └─ refs           # Third node's Git references
         │     ├─ heads       # Third node's branches
         │     │  └─ master   # Third node's master branch
         │     ...
         ...
```

### Canonical Repository Version

On centralized forges such as GitHub, repositories are deemed authentic based on their location (e.g. `https://github.com/bitcoin/bitcoin`). In a distributed network like Radicle, location is not enough. Instead, we need a way to automatically verify the data we get from *any given location*. Radicle's approach hinges on the self-certifying nature of its repositories, anchored in the repository [identity document](#identity-document).

The canonical repository version is established *dynamically* based on the delegate threshold defined in this identity document. For example, if a `threshold` of two out of three delegates is set, with the `defaultBranch` specified as `master`, and both Alice and Bob have the same commit in their `master` branches, that specific commit is recognized as the authoritative, current state of the repository.

In the current Heartwood release, the dynamic establishment of an authoritative version is limited to a single `defaultBranch`. Future releases of the protocol may extend this to additional branches.

`[🖼️Img: show alice, bob and eve where alice and bob have same commit hash for master and that ends up being the commit hash that the canonical reference points to (as opposed to eve's commit)]`

### Signed Refs

Together, a repository's RID and its identity document create a verifiable identity that serves as the basis for an *ownership proof* for the repository.
For repositories to be *self-certifying*, delegates identified in the identity document authenticate changes by cryptographically signing repository heads, tags, and other pertinent Git references, as well as changes to the identity document itself. These signatures are termed *signed refs* and are stored under the `refs/rad/sigrefs` reference.

Signed refs are key to establishing a repository's canonical state and are updated whenever a change to the repository is authenticated by the threshold of delegates. This ensures the integrity of all subsequent updates: any state change in the document can be verified against its previous version. Given an RID and a set of signed refs, anyone can retrieve the initial identity document and authenticate all subsequent repository updates without a trusted third party.

This verification model draws inspiration from The Update Framework (TUF)[^tuf], a framework designed to secure software update systems.

`[🖼️Img: show something like the image below]`

![image](https://hackmd.io/_uploads/SkaLXh1Op.png)

[^tuf]: https://theupdateframework.github.io/specification/latest/

## 7. Collaborative Objects

In the Radicle protocol, Collaborative Objects (COBs) supplement Git with social artifacts such as issues and patches, which Git does not support natively. These artifacts are typically confined to centralized platforms like GitHub or GitLab, whereas in Radicle, COBs are stored within each repository as Git objects, using Conflict-Free Replicated Data Types (CRDTs) for data consistency. Updates to social artifacts are cryptographically signed, so collaborative interactions within Radicle's distributed architecture can be independently verified for authenticity. Heartwood includes three predefined COBs to support code collaboration: `issue`, `patch`, and `id`, but users have full control to customize them or define entirely new datatypes.
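To make the consistency claim above concrete, here is a minimal Python sketch of three replicas converging on the same issue thread. It is not Radicle's implementation: it substitutes a simple grow-only set of changes, merged by set union and replayed in `(timestamp, change_id)` order. The `Change` class, its fields, and the ordering rule are all assumptions made for illustration, and signature verification is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One update to an issue. All field names are hypothetical."""
    change_id: str  # stands in for the Git object ID of the change
    author: str     # stands in for the author's public key
    timestamp: int
    comment: str


def merge(local: frozenset, remote: frozenset) -> frozenset:
    """Merge two replicas by set union (commutative, associative, idempotent)."""
    return local | remote


def replay(changes: frozenset) -> list:
    """Rebuild the comment thread by applying changes in a deterministic order."""
    ordered = sorted(changes, key=lambda c: (c.timestamp, c.change_id))
    return [f"{c.author}: {c.comment}" for c in ordered]


# Alice, Bob and Eve each comment while disconnected from one another.
alice = frozenset({Change("c1", "alice", 1, "I can reproduce this.")})
bob = frozenset({Change("c2", "bob", 2, "Fix incoming.")})
eve = frozenset({Change("c3", "eve", 2, "Same here.")})

# The replicas later sync in different orders.
alice_view = merge(merge(alice, bob), eve)
eve_view = merge(eve, merge(bob, alice))

# Both replicas reconstruct the same thread.
assert replay(alice_view) == replay(eve_view)
```

Because union ignores the order and repetition of merges, every replica that has seen the same set of changes rebuilds the same thread. The tie between Bob's and Eve's comments, which share a timestamp, is broken by `change_id`, so the result does not depend on which peer synced first.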
[`📖Side bar:` The `id` COB is used to derive the repository identity document. ]

`[🖼️Img: Show an image like the one below, in a horizontal orientation - except it is for people adding a comment to an issue. Show their cryptographic signatures, so each node can verify the source of the comment. "Alice, Bob and Eve add comments to issue 10 on repository xxx. Alice and Bob were offline, and Eve was online. When Alice and Bob connect to the internet, everyone sees each other's comments."]`

![image](https://hackmd.io/_uploads/S1wst6WOa.png)

### Conflict-Free Replicated Data Types (CRDTs)

Radicle's CRDTs, inspired by Ink & Switch's Automerge JavaScript library[^crdt] yet implemented in Rust, allow users to update social artifacts independently while minimizing merge conflicts. This mechanism supports collaborative features like issue commenting, where multiple users can interact without coordinating with one another.

Each COB records the initial version of an object and tracks all subsequent modifications made across the network. Each modification is stored as a separate Git object, so that the CRDT change graph is compatible with Git's synchronization processes. To retrieve the current version of an object, the system replays all of its changes in a deterministic order.

[^crdt]: https://automerge.org/

### Customizing COBs

Radicle's three predefined COBs, `issue`, `patch`, and `id`, are stored as Git objects under the `refs/cobs` hierarchy. Each is associated with a unique namespace in reverse domain name notation, such as `xyz.radicle.issue` or `xyz.radicle.patch`, to allow for extensibility. For example, issues are stored under `refs/cobs/xyz.radicle.issue/<ISSUE-ID>`, but it's also possible to define a custom issue COB under the namespace `org.YourOrg.issue` or an entirely new COB such as `org.YourOrg.YourCOB`. This means that Radicle can be extended with new collaborative data types without changing the protocol version.
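The ref layout described above can be sketched in Python as follows. The helper names `cob_ref` and `parse_cob_ref`, the type-name pattern, and the object ID `d1f7a3` are hypothetical and exist only for this example; the rules Radicle actually enforces for type names and object IDs may differ.

```python
import re

# Assumption: a type name is two or more dot-separated labels in reverse
# domain name notation, e.g. "xyz.radicle.issue". Illustrative pattern only.
TYPE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]*(?:\.[A-Za-z0-9][A-Za-z0-9-]*)+")


def cob_ref(type_name: str, object_id: str) -> str:
    """Build the Git ref for a COB as refs/cobs/<type-name>/<object-id>."""
    if not TYPE_NAME.fullmatch(type_name):
        raise ValueError(f"not a reverse-domain type name: {type_name!r}")
    if not object_id or "/" in object_id:
        raise ValueError(f"invalid object id: {object_id!r}")
    return f"refs/cobs/{type_name}/{object_id}"


def parse_cob_ref(ref: str) -> tuple:
    """Split a COB ref back into (type_name, object_id)."""
    prefix = "refs/cobs/"
    if not ref.startswith(prefix):
        raise ValueError(f"not a COB ref: {ref!r}")
    type_name, sep, object_id = ref[len(prefix):].partition("/")
    if not sep or not object_id or "/" in object_id:
        raise ValueError(f"malformed COB ref: {ref!r}")
    if not TYPE_NAME.fullmatch(type_name):
        raise ValueError(f"not a reverse-domain type name: {type_name!r}")
    return type_name, object_id


# A predefined COB and a custom one share the same layout.
builtin = cob_ref("xyz.radicle.issue", "d1f7a3")
custom = cob_ref("org.YourOrg.YourCOB", "d1f7a3")
```

Since the type name is just one path component of the ref, a custom COB needs no new machinery in this sketch: it is stored and looked up in exactly the same way as `xyz.radicle.issue`.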
[ `📖Side bar:` For more details on how to create or customize Collaborative Objects, refer to the [COB Guide](#). ]

## 8. Conclusion

The Heartwood release of the Radicle protocol introduces a new approach to code collaboration and publishing rooted in sovereignty. Built upon Git's well-established versioning protocol, Radicle can easily interoperate with the systems and workflows users already know. Users have full control and ownership over their identities and repositories. Repositories are self-certifying data structures, meaning updates are cryptographically signed and can be verified by anyone, without needing a trusted third party. Every user in Radicle self-hosts their repositories while remaining connected to a wider network.

Radicle is highly extensible: its Collaborative Objects give users full control over workflows and datatypes, and open the possibility for Radicle to support a wider range of functionality, potentially including knowledge sharing, project coordination, and data set collaboration.

## References