Deterministic CID grouping

Motivation

Users who want to store relatively small[1] DAGs on Filecoin often find it difficult to have their deal accepted, even in the presence of Fil+. One solution to this is grouping multiple root CIDs into a "batching structure" which is then used to make a deal with a specific miner. The downside is that the batching structure will naturally have a new root CID, which complicates both discovery and retrieval. Moreover there are limits to how large such a structure can be before becoming unwieldy in other IPLD context ( e.g. IPFS )

Scope

The design of the "batching structure" needs to at address the following:

  • Must have

    • Structure is something existing tooling can understand. This effectively limits it to UnixFSv1.
    • Structure must cater to least-common-denominator among available transports. This means individual block limits within 1MiB.
    • Structure must present reasonably browsable names for the individual customer-provided dag roots within it
    • Structure is simple to produce and navigate with minimal or no tooling. This effectively means no IPLD-level sharding
    • For the clearly necessary "logical sharding" the choice of indirection should be such that a user knowing just their own CID and the resulting final bundle-root CID[2] can craft a selector for partial retrieval of only their content.
  • Nice to have

    • Structure has a placeholder or mark of some sort, hinting miners at what is and isn't reasonably indexable within it.
  • Limits in consideration

    • A car file accepted by the Filecoin network currently can have a maximum payload size of about 65,024MiB ( 64<<30 / 128 * 127 ). For the sake of argument let's assume a future with 128GiB sectors, which gives an exact maximum payload size of 136,365,211,648 bytes. Assuming ~60 bytes for a car header, shortest safe CID representation of a cidv0 sha2-256 at 34 bytes, and payload of 1024 bytes per block ( ridiculously small NFTs ), gives us an upper bound of (136365211648 - 60) / ( 2 + 34 + 1024) ~~ 128,646,426 ~~ an upper bound of 2^27 individual CIDs that could be in a deal and all be individually addressable.
    • The longest textual representation of a "tenuously common" CID would be a b-multibase base32 representation of a blake2b-512, which clocks at 113 bytes. This in turn means that a typical UnixFS directory can contain:
      1048576 = ( 4b dir hdr ) + N * ( 2b pbframe hdr + 2b CID prefix + 70b CID + 2b name prefix + 113b text-CID + ~5 bytes for prefixed size ) ~~ 5405 such names without sharding, before going over the 1MiB libp2p limit. In order to be super-conservative assume a target of 2^12 entries per "batch shard". If we go with the more reasonable 256-bit hashes, we arrive at ~9363 names which translates to 2^13 entries per batch shard.
    • The common textual representation of CIDs is base32, each character of which represents exactly 5 bits. This means that sharding on 2 base32 characters gives a rough distribution of 2^10 per shard, fitting comfortably within the above considerations.

Rough spec

The above limits, combined with the perfect distribution of hashes in CIDs, means that one can safely store a "super-sector" full of addressable CIDs by having 2 layers of directories, the first layer by 2 base32 characters, the second layer by 4 base32 characters, and the final container having the full CIDs pointing to the content ( 2^(10+10+10) > 2^(27). This does not even take into account the vast overestimation of sector and CID sizes.

Therefore a "common batching directory" has the following rough spec:

  • Every CID is represented as base32 cidv1 ( upgrading Qm's this shouldn't be a problem )
  • For browseability/recognizeability purposes the leading 3 characters of a cidv1 are combined with the 2 and 4 sharding characters for each directory sublevel

This means a directory roughly like:

- bafyBundleRootCid
    - baf...aa
        - baf...aaaa
        - baf...abaa
            ...
        - baf...77aa
    - baf...ab
        - baf...aaab
        ...
        - baf...77ab
      ...
    - baf...77
        - baf...7777
            - bafymbzacid777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777

When a user want to contruct a retrieval selector, knowing the DealID and their "starting CID" they reverse the process to determine the 2 individual directory sublevels.

Additionally this does not preclude a bundler service from adding a top-level METADATA.json or somesuch with various miner instructions (e.g. what to put in the deal Label field)

[1] ~4GiB or less after deduplication
[2] In practice all they would need is the assigned-on-chainpublish sequential numeric dealID, which in turn contains the final payload CID

Select a repo