# Nix Binary Caches
## The current state of things
Most of the bandwidth used when substituting Nix store paths from a binary cache is spent transferring NAR files, which contain the contents of a store path in a canonical serialization.
Because that file has neither an index nor per-file hashes, there is no support for partial substitution: a client has to download the whole NAR even if it only needs, or is only missing, a few of the files in it.
Nix communicates metadata in `.narinfo` files, which contain signatures over the NAR hash along with some other metadata.
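To make the metadata side concrete, here is a minimal sketch of reading a `.narinfo` file, which is a plain list of `Key: value` lines. The sample values (store path, URL, hash, signature) are made-up placeholders, and `parse_narinfo` is a hypothetical helper, not part of any Nix tooling.

```python
# Hypothetical sample; every value below is a placeholder, not real cache data.
SAMPLE = """\
StorePath: /nix/store/00000000000000000000000000000000-example-1.0
URL: nar/example.nar.xz
Compression: xz
NarHash: sha256:0000000000000000000000000000000000000000000000000000
NarSize: 1234
References: 
Sig: cache.example.org-1:AAAA
"""


def parse_narinfo(text):
    """Parse the 'Key: value' lines of a .narinfo file into a dict."""
    info = {"References": [], "Sig": []}
    for line in text.splitlines():
        # Split on the first colon only; values such as NarHash contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank or malformed lines
        key, value = key.strip(), value.strip()
        if key == "References":
            info["References"] = value.split()
        elif key == "Sig":
            info["Sig"].append(value)  # a narinfo may carry several signatures
        elif key in ("NarSize", "FileSize"):
            info[key] = int(value)
        else:
            info[key] = value
    return info


info = parse_narinfo(SAMPLE)
print(info["NarHash"], info["NarSize"], info["Sig"])
```

Note that everything here describes the NAR as a whole: there is one hash and one signature for the entire archive, and nothing that addresses an individual file inside it.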
## A new approach
While hacking on https://flokli.de/posts/2022-06-30-store-protocol/, and figuring out the low-level details on how to represent paths and trees, I ended up deciding on a serialization format similar to git:
Every directory in git is represented by a tree object, which refers to the hashes of other git objects, essentially building a Merkle tree.
For my first attempts at https://github.com/nix-community/go-nix/pull/95, I used a serialization format very similar to git's (without sha1 as the hash function, and without actually writing the serialized form to disk).
With this, we get deduplication on a per-file level (not just at output path granularity), as well as deduplication of common directory trees, all in a content-addressed store.
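The deduplication effect can be sketched in a few lines. This is an illustration under stated assumptions, not the format used in go-nix: SHA-256 stands in for the hash function, tree entries are encoded as JSON, executable bits and symlinks are ignored, and `put`, `put_tree`, `pkg_a` and `pkg_b` are hypothetical names.

```python
import hashlib
import json


def put(store, data):
    """Store raw bytes under their SHA-256 digest and return the digest."""
    digest = hashlib.sha256(data).hexdigest()
    store.setdefault(digest, data)  # identical content is stored only once
    return digest


def put_tree(store, node):
    """Store a file (bytes) or a directory (dict of name -> node) recursively."""
    if isinstance(node, bytes):
        return put(store, b"blob\0" + node)
    entries = []
    for name in sorted(node):
        child = node[name]
        kind = "blob" if isinstance(child, bytes) else "tree"
        # A directory only records names and child hashes: a Merkle tree.
        entries.append([name, kind, put_tree(store, child)])
    return put(store, b"tree\0" + json.dumps(entries).encode())


# Two hypothetical store paths that share a file and a whole subdirectory.
pkg_a = {"bin": {"hello": b"#!/bin/sh\necho hello\n"}, "share": {"LICENSE": b"MIT"}}
pkg_b = {"bin": {"world": b"#!/bin/sh\necho world\n"}, "share": {"LICENSE": b"MIT"}}

store = {}
root_a = put_tree(store, pkg_a)
root_b = put_tree(store, pkg_b)
print(len(store))  # 8 objects instead of 10: LICENSE and share/ are stored once
```

Each root hash identifies a complete tree, so two store paths with identical contents would also collapse into a single root object.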
In addition to the savings from deduplication, this also means that those content-addressed blocks can be substituted in a trustless fashion from a multitude of sources, including peer-to-peer network neighbors (with the substitution happening lazily in an intermediate layer exposing the same interface).
Very little metadata still needs to be signed: essentially just a pointer to a content-addressed object.
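Why untrusted sources are acceptable can be shown with a small sketch: the client already knows the digest it wants (from the signed pointer), so any response can be checked locally. The names `fetch_verified`, `lying_peer` and `honest_mirror` are hypothetical, and SHA-256 is an assumed hash function.

```python
import hashlib


def fetch_verified(digest, sources):
    """Try each source in turn and accept the first response matching the digest."""
    for fetch in sources:
        data = fetch(digest)
        if data is None:
            continue  # this source does not have the object
        if hashlib.sha256(data).hexdigest() == digest:
            return data  # content matches its address, so the source is irrelevant
        # Otherwise the source lied or the data was corrupted; try the next one.
    raise LookupError("no source returned valid data for " + digest)


blob = b"some file contents"
wanted = hashlib.sha256(blob).hexdigest()


def lying_peer(digest):
    return b"malicious replacement"


def honest_mirror(digest):
    return blob if digest == wanted else None


result = fetch_verified(wanted, [lying_peer, honest_mirror])
print(result == blob)
```

Only the digest itself has to come from a trusted, signed source; the bulk data can come from anywhere.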
Last week I successfully implemented `.nar` file synthesis and decomposition. This means conventional `.nar` files can still be generated for backwards compatibility with clients that do not support the new protocol.
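As a rough illustration of the synthesis direction, the sketch below serializes an in-memory tree into NAR bytes. It is not the go-nix implementation: it assumes the tree is given as nested Python values (bytes for a file, dict for a directory), it leaves out executable files and symlinks, and `nar_str`, `nar_node` and `to_nar` are hypothetical names. It follows the NAR framing in which every token is a 64-bit little-endian length followed by the bytes, zero-padded to a multiple of 8.

```python
import struct


def nar_str(s):
    """Encode one NAR token: 64-bit LE length, the bytes, zero padding to 8."""
    pad = (8 - len(s) % 8) % 8
    return struct.pack("<Q", len(s)) + s + b"\0" * pad


def nar_node(node):
    """Serialize a file (bytes) or a directory (dict of name -> node)."""
    out = nar_str(b"(")
    if isinstance(node, dict):
        out += nar_str(b"type") + nar_str(b"directory")
        # Entries are emitted in sorted order so the output is canonical.
        for name in sorted(node, key=str.encode):
            out += nar_str(b"entry") + nar_str(b"(")
            out += nar_str(b"name") + nar_str(name.encode())
            out += nar_str(b"node") + nar_node(node[name])
            out += nar_str(b")")
    else:
        out += nar_str(b"type") + nar_str(b"regular")
        out += nar_str(b"contents") + nar_str(node)
    return out + nar_str(b")")


def to_nar(root):
    """Prefix the serialized root node with the NAR magic string."""
    return nar_str(b"nix-archive-1") + nar_node(root)


single_file = to_nar(b"hello")
tree = to_nar({"bin": {"hello": b"echo hello\n"}, "README": b"docs"})
print(len(single_file), len(tree))
```

Because the file contents are passed in as plain bytes, they could equally be looked up by hash in a content-addressed store while the archive is being written out.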
---
However, the underlying interface is much more powerful, and can be generalized into a library for storing content-addressed objects.
I'd like to spend some time reworking this code to provide a generic frontend that stores arbitrary structured data, as long as a serialization method and a hash function are provided. Once that's done, I'd implement the Nix store as a consumer of this library; the amount of code needed for that should be quite manageable.
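One possible shape for such a frontend is sketched below. This is a guess at an interface, not a design taken from the existing code: `ContentStore` is a hypothetical class, the in-memory dict stands in for a real backend, and the explicit `deserialize` argument is an addition needed to read objects back.

```python
import hashlib
import json


class ContentStore:
    """A store that is generic over the serialization and the hash function."""

    def __init__(self, serialize, deserialize, hash_fn):
        self._serialize = serialize
        self._deserialize = deserialize
        self._hash = hash_fn
        self._objects = {}  # stand-in for a disk or network backend

    def put(self, obj):
        data = self._serialize(obj)
        key = self._hash(data)
        self._objects[key] = data
        return key

    def get(self, key):
        data = self._objects[key]
        if self._hash(data) != key:
            raise ValueError("stored data does not match its address")
        return self._deserialize(data)


# Example consumer: JSON with sorted keys as a canonical form, SHA-256 as hash.
objects = ContentStore(
    serialize=lambda obj: json.dumps(obj, sort_keys=True).encode(),
    deserialize=lambda data: json.loads(data),
    hash_fn=lambda data: hashlib.sha256(data).hexdigest(),
)

key = objects.put({"name": "hello", "version": "1.0"})
same = objects.put({"version": "1.0", "name": "hello"})
print(key == same, objects.get(key))
```

The serialization has to be canonical (here via `sort_keys=True`), otherwise equal objects would end up under different addresses and deduplication would be lost.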
## Other applications
Another obvious application for this concept would be an engine storing chain data blocks in a distributed fashion.
Multiple distributed components could listen for new blocks to be produced, and store them in the content-addressed store.
Only the pointer to the current head would need to be propagated in the non-content-addressed store.
Clients tracking account balances could subscribe to these updates and only fetch the blocks relevant to a transaction, considerably reducing the storage requirements of the system as a whole.
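The split between immutable blocks and a single mutable head pointer can be sketched as follows. The block layout (a JSON object with `parent` and `txs`), the account names and the helpers `append_block` and `blocks_since` are all assumptions made for illustration.

```python
import hashlib
import json


def append_block(blocks, parent, transactions):
    """Store a block that points at its parent and return its digest."""
    data = json.dumps({"parent": parent, "txs": transactions}, sort_keys=True).encode()
    digest = hashlib.sha256(data).hexdigest()
    blocks[digest] = data  # content-addressed, never modified afterwards
    return digest


def blocks_since(blocks, head, last_seen):
    """Walk back from the head and return the blocks a client has not seen yet."""
    fetched = []
    while head is not None and head != last_seen:
        block = json.loads(blocks[head])
        fetched.append(block)
        head = block["parent"]
    return list(reversed(fetched))  # oldest first


blocks = {}
b1 = append_block(blocks, None, [["alice", "bob", 5]])
b2 = append_block(blocks, b1, [["bob", "carol", 2]])
b3 = append_block(blocks, b2, [["carol", "alice", 1]])

head = b3  # the only value that lives outside the content-addressed store
new = blocks_since(blocks, head, last_seen=b1)
print(len(new))
```

A client that last saw `b1` only needs the new head to discover and fetch the two blocks it is missing; everything reachable from a block it already has can be skipped.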