# Deterministic Data: An Intro to dCBOR

If you missed the first part, you can read it here: https://hackmd.io/@leonardocustodio/json-vs-cbor

In the world of distributed systems and blockchains, "almost the same" is often as bad as "completely different." If you are building cryptographic applications, you know that a single byte of difference in serialized data changes its hash completely, breaking signatures and invalidating consensus.

**dCBOR** (Deterministic CBOR) is a strict profile of the CBOR data format designed to solve this exact problem.

Let's explore what dCBOR is, why it is critical for cryptographic engineering, and look at **bcts**, a comprehensive TypeScript implementation of these specifications.

## The Problem with Standard CBOR

**CBOR** (Concise Binary Object Representation, RFC 8949) is a binary data serialization format loosely based on JSON. It is small, fast, and easy to parse.

However, standard CBOR allows for **non-deterministic encoding**: the same piece of data can be encoded into valid CBOR bytes in several different ways.

For example, a standard CBOR encoder might handle the number `10` in several ways:

* As an integer: `10`
* As a float: `10.0`
* As a 32-bit float or as a 64-bit float

While these all represent the same *value*, they result in **different byte sequences**: the integer is the single byte `0x0a`, while the 64-bit float takes nine bytes, `0xfb4024000000000000`. If you hash these sequences, you get different hashes. In a blockchain context, where data identity is defined by its hash, this ambiguity is fatal.

## The Solution: Deterministic CBOR (dCBOR)

**dCBOR** is an application profile for CBOR specified by Blockchain Commons. Its primary goal is to ensure that **one specific data item always encodes to one specific byte sequence.**

### Key Rules of dCBOR

To achieve determinism, dCBOR applies narrowing rules on top of the standard CBOR rules, according to [Section 2 of the dCBOR spec](https://datatracker.ietf.org/doc/draft-mcnally-deterministic-cbor):

1. **No Indefinite Length Items**: Arrays, maps, byte strings, and text strings **MUST** have definite length.
2. **Preferred Serialization**: CBOR allows multiple possible encodings for the same data. dCBOR encoders must emit only the preferred serialization, and decoders must **REJECT** any input that does not conform.
3. **Ordered Map Keys**: Keys in a map **MUST** be sorted in bytewise lexicographic order of their encoded form. You cannot serialize `{ "a": 1, "b": 2 }` as `{ "b": 2, "a": 1 }`.
4. **No Duplicate Keys**: Any map that contains duplicate keys must be rejected.
5. **Numeric Reduction**: Variable-length integers **MUST** be as short as possible. Floating-point values **MUST** use the shortest form that preserves the value, and a float with no fractional part is reduced to an integer (e.g., `10.0` becomes `10`).
6. **Simple Values**: CBOR (major type 7) allows both `null` and `undefined`; dCBOR prohibits the use of `undefined`.
7. **NFC Strings**: Text strings must be in Unicode Normalization Form C (NFC).

By enforcing these rules, dCBOR ensures that any compliant encoder, regardless of programming language or platform, will produce the exact same byte string for the same data. This makes it safe for:

* **Cryptographic Signatures** (signing the data, not just a hash).
* **Content-Addressable Storage** (IPFS, Merkle Trees).
* **Reproducible Builds**.

## Tooling

While Blockchain Commons provides reference implementations in Rust and Swift, the JavaScript/TypeScript ecosystem now has a powerful ally in **[bcts](https://github.com/leonardocustodio/bcts)**.

### What is inside bcts?

It is not just a dCBOR library; it is a suite of interoperable cryptographic primitives:

* **`@bcts/dcbor`**: The core package for deterministic CBOR encoding/decoding.
* **`@bcts/ur`**: Implementation of **Uniform Resources**, a way to encode binary data efficiently as text (for example, for use in QR codes).
* **`@bcts/envelope`**: Support for **Gordian Envelopes**, a smart document format that supports hashed elision (privacy-preserving selective disclosure).
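
To get a feel for what a deterministic encoder has to do under the hood, here is a minimal hand-rolled sketch covering a small subset of the rules listed earlier: shortest-form integers, NFC text strings, and maps with sorted, duplicate-free keys. It is an illustration only and does not use or mirror the `@bcts/dcbor` API. The names `Value`, `encodeHead`, `compareBytes`, `encode`, and `toHex` are hypothetical helpers, and the sketch assumes a runtime with a global `TextEncoder` (such as a recent Node.js or a browser).

```typescript
// Toy deterministic encoder for a tiny CBOR subset. Illustration only:
// every name here is hypothetical and this is NOT the @bcts/dcbor API.
type Value = number | string | Map<string, Value>;

// Encode a CBOR head (major type + argument) in its shortest form.
function encodeHead(major: number, n: number): number[] {
  const mt = major << 5;
  if (n < 24) return [mt | n];
  if (n < 0x100) return [mt | 24, n];
  if (n < 0x10000) return [mt | 25, n >> 8, n & 0xff];
  if (n < 0x100000000) {
    return [mt | 26, (n >>> 24) & 0xff, (n >>> 16) & 0xff, (n >>> 8) & 0xff, n & 0xff];
  }
  throw new RangeError("sketch only handles arguments below 2^32");
}

// Bytewise lexicographic comparison of two encoded items.
function compareBytes(a: number[], b: number[]): number {
  const len = Math.min(a.length, b.length);
  for (let i = 0; i < len; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return a.length - b.length;
}

function encode(value: Value): number[] {
  if (typeof value === "number") {
    if (!Number.isInteger(value) || value < 0) {
      throw new RangeError("sketch only handles non-negative integers");
    }
    return encodeHead(0, value); // major type 0: unsigned integer
  }
  if (typeof value === "string") {
    // Normalize to NFC before encoding as UTF-8 (major type 3).
    const utf8 = Array.from(new TextEncoder().encode(value.normalize("NFC")));
    return encodeHead(3, utf8.length).concat(utf8);
  }
  // Map (major type 5): encode entries, then sort by the encoded key bytes.
  const entries = Array.from(value.entries()).map(([k, v]) => ({
    key: encode(k),
    val: encode(v),
  }));
  entries.sort((x, y) => compareBytes(x.key, y.key));
  let out = encodeHead(5, entries.length);
  for (let i = 0; i < entries.length; i++) {
    // After sorting, duplicate keys (post-normalization) are adjacent.
    if (i > 0 && compareBytes(entries[i - 1].key, entries[i].key) === 0) {
      throw new Error("duplicate map key");
    }
    out = out.concat(entries[i].key, entries[i].val);
  }
  return out;
}

function toHex(bytes: number[]): string {
  return bytes.map((b) => b.toString(16).padStart(2, "0")).join("");
}

// Insertion order differs, encoded bytes do not.
const m1 = new Map<string, Value>([["b", 2], ["a", 1]]);
const m2 = new Map<string, Value>([["a", 1], ["b", 2]]);
console.log(toHex(encode(m1))); // a2616101616202
console.log(toHex(encode(m1)) === toHex(encode(m2))); // true
```

A real implementation also has to cover negative integers, floats with numeric reduction, byte strings, arrays, tags, and the decoding side, where non-conforming input must be rejected. The sketch only shows why two logically equal values end up as identical bytes.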

You can also visit the [playground](https://bcts.dev) to visualize how dCBOR structures are parsed and inspect the "diagnostic notation" (a human-readable form of CBOR).

## Conclusion

As we move toward a more decentralized web, the ability to agree on data representation is foundational. dCBOR provides the strictness required to turn JSON-like flexibility into cryptographic rigidity.

Whether you are building a wallet, a verifiable credential system, or a distributed ledger, adhering to deterministic standards is a must. Thanks to the work of Blockchain Commons and community efforts, implementing these specifications is getting easier than ever.

### References

* **dCBOR specification**: [datatracker.ietf.org/doc/draft-mcnally-deterministic-cbor](https://datatracker.ietf.org/doc/draft-mcnally-deterministic-cbor/)
* **BCTS repository**: [github.com/leonardocustodio/bcts](https://github.com/leonardocustodio/bcts)
* **Blockchain Commons**: [developer.blockchaincommons.com/dcbor](https://developer.blockchaincommons.com/dcbor)