owned this note
owned this note
Published
Linked with GitHub
# Deterministic CID grouping
## Motivation
Users who want to store relatively small[1] DAGs on Filecoin often find it difficult to have their deal accepted, even in the presence of Fil+. One solution to this is grouping multiple root CIDs into a "batching structure" which is then used to make a deal with a specific miner. The downside is that the batching structure will naturally have a new root CID, which complicates both discovery and retrieval. Moreover there are limits to how large such a structure can be before becoming unwieldy in other IPLD context ( e.g. IPFS )
## Scope
The design of the "batching structure" needs to at address the following:
- Must have
- Structure is something existing tooling can understand. This effectively limits it to UnixFSv1.
- Structure must cater to least-common-denominator among available transports. This means individual block limits within 1MiB.
- Structure must present reasonably browsable names for the individual customer-provided dag roots within it
- Structure is simple to produce and navigate with minimal or no tooling. This effectively means no IPLD-level sharding
- For the clearly necessary "logical sharding" the choice of indirection should be such that a user knowing just their own CID and the resulting final bundle-root CID[2] can craft a selector for partial retrieval of only their content.
- Nice to have
- Structure has a placeholder or mark of some sort, hinting miners at what is and isn't reasonably indexable within it.
- Limits in consideration
- A car file accepted by the Filecoin network currently can have a maximum payload size of about 65,024MiB ( 64<<30 / 128 * 127 ). For the sake of argument let's assume a future with 128GiB sectors, which gives an exact maximum payload size of 136,365,211,648 bytes. Assuming ~60 bytes for a car header, **shortest safe CID** representation of a cidv0 sha2-256 at 34 bytes, and payload of 1024 bytes per block ( ridiculously small NFTs ), gives us an upper bound of (136365211648 - 60) / ( 2 + 34 + 1024) ~~ 128,646,426 ~~ an upper bound of 2^27 individual CIDs that could be in a deal and all be individually addressable.
- The **longest textual representation** of a "tenuously common" CID would be a `b`-multibase base32 representation of a [blake2b-512](https://cid.ipfs.io/#bafymbzacid777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777), which clocks at 113 bytes. This in turn means that a typical UnixFS directory can contain:
1048576 = ( 4b dir hdr ) + N * ( 2b pbframe hdr + 2b CID prefix + 70b CID + 2b name prefix + 113b text-CID + ~5 bytes for prefixed size ) ~~ 5405 such names without sharding, before going over the 1MiB libp2p limit. In order to be super-conservative assume a target of 2^12 entries per "batch shard". If we go with the more reasonable 256-bit hashes, we arrive at ~9363 names which translates to 2^13 entries per batch shard.
- The common textual representation of CIDs is base32, each character of which represents exactly 5 bits. This means that sharding on 2 base32 characters gives a rough distribution of 2^10 per shard, fitting comfortably within the above considerations.
## Rough spec
The above limits, combined with the perfect distribution of hashes in CIDs, means that one can safely store a "super-sector" full of addressable CIDs by having 2 layers of directories, the first layer by 2 base32 characters, the second layer by 4 base32 characters, and the final container having the full CIDs pointing to the content ( 2^(10+10+10) > 2^(27). This does not even take into account the vast overestimation of sector and CID sizes.
Therefore a "common batching directory" has the following rough spec:
- Every CID is represented as base32 cidv1 ( upgrading Qm's... this shouldn't be a problem )
- For browseability/recognizeability purposes the **leading** 3 characters of a cidv1 are combined with the 2 and 4 sharding characters for each directory sublevel
This means a directory roughly like:
```
- bafyBundleRootCid
- baf...aa
- baf...aaaa
- baf...abaa
...
- baf...77aa
- baf...ab
- baf...aaab
...
- baf...77ab
...
- baf...77
- baf...7777
- bafymbzacid777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777
```
When a user want to contruct a retrieval selector, knowing the DealID and their "starting CID" they reverse the process to determine the 2 individual directory sublevels.
Additionally this does not preclude a bundler service from adding a top-level METADATA.json or somesuch with various miner instructions (e.g. what to put in the deal `Label` field)
[1] ~4GiB or less after deduplication
[2] In practice all they would need is the assigned-on-chainpublish sequential numeric dealID, which in turn contains the final payload CID