filesystemmmmms
===============
:::info
Major loci of choices
---------------------
- metadata: in-tree or side-tree
- attributes: how many, what are they
- functional focus: filters
- chunking?
- dev nodes: wtf? sakurity?
:::
And, one very big, core note:
**you can design an archival filesystem, or a filesystem for getting things done -- and not both**.
trees, trees everywhere
-----------------------
### Metadata in-tree
- happy: the more 'obvious' choice
- sad: changes in attribs on leaf nodes may cause a lot of parent nodes to diverge in hash.
    - theory might say "that's O(log n), i.e. the tree height", but practice seems to indicate chown hits whole swaths of files, and that causes *all* the intermediate nodes to update, which means roughly 1/2 of all nodes diverge.
- sad: a recursive fetch of all nodes gets metadata and data -- heavy if you only wanted the metadata.
    - debatable: IPLD Selectors can get you this anyway (it's just less smackingly obvious).
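
A toy illustration of the divergence problem, not a real IPLD encoding: each node's hash covers its own attribs plus its children's hashes, so a `chown` on a few leaves re-hashes every ancestor. The tree shape, the `node_hash` and `make_tree` helpers, and the JSON-plus-sha256 encoding are all assumptions made for this sketch.

```python
import hashlib
import json

def node_hash(node, out, path=""):
    """Hash a node (attribs stored in-tree) and record each node's hash by path in `out`."""
    children = {
        name: node_hash(child, out, path + "/" + name)
        for name, child in sorted(node.get("children", {}).items())
    }
    body = json.dumps(
        {"attribs": node["attribs"], "data": node.get("data", ""), "children": children},
        sort_keys=True,
    )
    digest = hashlib.sha256(body.encode()).hexdigest()
    out[path or "/"] = digest
    return digest

def make_tree(uid):
    """Two directories of two files each; only the files under 'a' get the given uid."""
    def f(data, owner):
        return {"attribs": {"uid": owner}, "data": data}
    return {
        "attribs": {"uid": 0},
        "children": {
            "a": {"attribs": {"uid": 0}, "children": {"x": f("1", uid), "y": f("2", uid)}},
            "b": {"attribs": {"uid": 0}, "children": {"x": f("3", 0), "y": f("4", 0)}},
        },
    }

before, after = {}, {}
node_hash(make_tree(uid=0), before)
node_hash(make_tree(uid=1000), after)  # simulate `chown 1000 a/*`
changed = sorted(p for p in before if before[p] != after[p])
print(changed)  # the two touched leaves plus every ancestor up to the root
```

The untouched sibling directory `b` keeps its hash, which is the dedup we want; everything on the path from the changed leaves to the root does not.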
### Metadata in parallel tree
- happy: Easier to fetch a tree of **just** the metadata.
    - debatable: IPLD Selectors can get you this anyway (it's just less smackingly obvious).
    - still, definitely makes it more obvious how to map this onto other transport/packing formats, anyway.
        - imagine 'car' formats: of course you put a full tree in them.
- misleadingly seems happy: can dedup the full-content-body bytes more easily since metadata is separate.
    - absolutely misleading. the leaves will dedup in IPLD no matter what. (not a _bad_ thing -- just not a good thing unique to this approach.)
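
A rough sketch of the side-tree layout, under the same toy assumptions (sha256 over JSON standing in for CIDs, a hypothetical `merkle` helper): names and content live in one tree, attribs in a mirror tree, and the root just links to both.

```python
import hashlib
import json

def merkle(node):
    """Hash a nested dict bottom-up: every dict becomes a map of name -> child hash."""
    if isinstance(node, dict):
        node = {name: merkle(child) for name, child in node.items()}
    return hashlib.sha256(json.dumps(node, sort_keys=True).encode()).hexdigest()

data_tree = {"a": {"x": "content-1", "y": "content-2"}}
meta_before = {"a": {"x": {"uid": 0}, "y": {"uid": 0}}}
meta_after = {"a": {"x": {"uid": 1000}, "y": {"uid": 0}}}  # chown 1000 a/x

root_before = {"data": merkle(data_tree), "meta": merkle(meta_before)}
root_after = {"data": merkle(data_tree), "meta": merkle(meta_after)}

print(root_before["data"] == root_after["data"])  # data side is untouched by the chown
print(root_before["meta"] == root_after["meta"])  # metadata side diverges
```

Fetching "just the metadata" here means walking from `root["meta"]` and never touching the data link.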
### flatstitch root tree
- e.g. `[{"name":"x","type":"dir"},{"name":"x/y","type":"file"},...]`
- sad: flat lists like this don't trivially compose and scale.
    - makes 'root' special and that's awkward.
- happy: most updates to attribs on a leaf will change that leaf and the 'root' and nothing else.
    - debatable: just how consequential this is in practice.
### variations on in-tree
Is it useful to put attribs (other than name) in their own block, so when they're the same, the CID is simply the same?
Also makes attribs "fixed size" which might be useful (but names still aren't, so, eh).
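
A minimal sketch of that idea, assuming a sha256-over-JSON stand-in for CIDs and a hypothetical `entry` layout: when attribs live in their own block, entries with identical attribs end up linking to the very same block.

```python
import hashlib
import json

def cid(obj):
    """Stand-in for a CID: sha256 over a canonical JSON encoding (an assumption)."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()

def entry(name, attribs, content):
    """A directory entry that links to its attribs block instead of inlining it."""
    return {"name": name, "attribs": cid(attribs), "content": cid(content)}

common = {"mtime": 0, "posix": 0o644, "uid": 0, "gid": 0}
e1 = entry("README", common, "hello")
e2 = entry("LICENSE", dict(common), "legal text")
e3 = entry("run.sh", {**common, "posix": 0o755}, "#!/bin/sh")

distinct_attrib_blocks = {e["attribs"] for e in (e1, e2, e3)}
print(len(distinct_attrib_blocks))  # fewer blocks than entries when attribs repeat
```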
attribyooots
------------
How many different ways to do attributes *are* there?
### one
```
type Attribs struct {
  mtime Int
  posix Int # Ye standard 0777 mask packing here?
  sticky Bool (implicit: false)
  setuid Bool (implicit: false)
  setgid Bool (implicit: false)
  uid Int
  gid Int
}
```
### two
```
type Attribs struct {
  mtime Int
  posix Int # Ye standard 0777 mask packing here?
  sticky Bool (implicit: false)
}
```
- diff vs one: actively disregards / no-support-for uid and gid.
### three
```
type Attribs struct {
  mtime optional Int
}
```
- questionable: would one ever really want a tree with mixtures of mtime and not? what does absent mean? what would cause one to produce such a mixed tree?
### four
```
type Attribs struct {
}
```
- worth pointing out: this is absolutely an option.
### five
```
type Attribs struct {
  mtime Int
  posix Int # Ye standard 0777 mask packing here?
  sticky Bool (implicit: false)
  uid Int
  gid Int
}
```
- diff vs one: we can still describe owner uid and gid, but setuid and setgid are excluded from the schema entirely. So you might use this schema when you know you're not going to accept inputs with those properties.
    - this set of attributes probably also makes the most sense paired with a tree schema that can't describe device nodes, etc, either, per that same user story about intentionally low-priv/low-effect modes (wording TBD).
### more...
### notes on attributes
- 'ctime' exists. But you can't set it from userspace; you'd have to be a driver. So this is one of those things that's an "archival, or fit-for-purpose" choice.
- 'btime' is a recently added property! wow! but it's the same issue as 'ctime'.
- 'xattrs' are unbounded in size. that's cool.
- 'xattrs' are not all equally writeable (see "`security.*`"). So this prompts "archival, or fit-for-purpose" again.
- 'xattrs' are not all *readable* unconditionally(!??!)!
    - some are not readable without privileges...
    - some actually get string-munged on write in a form of "namespacing"...
- hardlinks exist but most people agree that preserving them is a mistake.
    - hardlinks are for a system's local optimizations of storage dedup and should not be used for semantics.
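
The 'ctime' point is easy to check from userspace. A small sketch, assuming a POSIX-like system; the `set_times_to_epoch` helper is made up for the example. `os.utime` only takes atime and mtime, and the kernel stamps ctime itself as a side effect of the change.

```python
import os
import tempfile

def set_times_to_epoch(path):
    """Try to pin a file's timestamps to 0; return (mtime, ctime) as the OS reports them."""
    os.utime(path, (0, 0))  # sets atime and mtime; there is no ctime argument at all
    st = os.stat(path)
    return st.st_mtime, st.st_ctime

with tempfile.NamedTemporaryFile() as f:
    mtime, ctime = set_times_to_epoch(f.name)

print(mtime)  # the value we asked for
print(ctime)  # whatever the kernel decided, i.e. roughly "now"
```

So a format that records ctime can describe an existing filesystem faithfully, but an unpacker running as an ordinary process has no way to reproduce it.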
chunking
--------
### chunking vs external integration
Basically: chunking is Good. Except: people like using basic-bitch hashes on per-file granularity because it's easier for everyone to agree on. What do?
(Not clear there's actually anything great possible here period.)
### which chunking to use?
Rolling checksums are the name of the game. There are several of these. Correctly implemented rolling checksums for chunkfinding should be very very fast, nearly invisible in performance compared to cryptographic hashing.
Update: some buzhash based things have been pushed into go-ipfs-chunker! Probably we should use them.
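
To make "rolling checksum for chunkfinding" concrete, here is a minimal sketch of content-defined chunking. It uses a plain polynomial rolling hash rather than buzhash, and it is not the go-ipfs-chunker implementation; the window, mask, and size limits are arbitrary values picked for the example.

```python
import hashlib

def chunk(data, window=16, mask=0x3F, min_size=32, max_size=1024):
    """Split bytes at content-defined boundaries using a polynomial rolling hash."""
    base, mod = 257, (1 << 31) - 1
    top = pow(base, window - 1, mod)  # weight of the byte about to leave the window
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i - start >= window:
            h = (h - data[i - window] * top) % mod  # drop the oldest byte
        h = (h * base + byte) % mod                 # add the newest byte
        size = i - start + 1
        # Cut when the hash of the last `window` bytes hits the mask, or at max_size.
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks

# Deterministic pseudo-random data, then the same data with one byte prepended.
data = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(200))
edited = b"!" + data
shared = set(chunk(data)) & set(chunk(edited))
print(len(chunk(data)), "chunks;", len(shared), "unchanged after a one-byte insert")
```

Because each cut decision depends only on the last few bytes, boundaries tend to resynchronize shortly after an edit, so most chunks (and their hashes) should survive it. Fixed-size chunking would shift every boundary instead.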
dev nodes: seriously, argh
--------------------------
- clearly need 'em for some root filesystem and container-related work
- they're opaque ints -- that's "easy" enough
- but some of those opaque ints are more consistently meaningful than others
- loop device nodes, for example, are effectively pointers to global vars in the kernel, so they're meaningless to transport anywhere else. But surely it would be inappropriate for our understanding of posixy filesystems to take note of this! nngh.
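
For reference, the "opaque ints" are a major/minor pair. A small sketch, assuming a Unix-like Python (`os.major`, `os.minor`, and `os.makedev` are not available elsewhere); the `host_local` flag is a hypothetical policy invented for this example, not something any spec defines.

```python
import os

LOOP_MAJOR = 7  # Linux's registered major number for loop block devices

def describe_rdev(rdev, is_block):
    """Split a device number into (major, minor) and flag devices that only mean something locally."""
    major, minor = os.major(rdev), os.minor(rdev)
    host_local = is_block and major == LOOP_MAJOR
    return {"major": major, "minor": minor, "host_local": host_local}

print(describe_rdev(os.makedev(7, 0), is_block=True))   # the numbers /dev/loop0 carries on Linux
print(describe_rdev(os.makedev(1, 3), is_block=False))  # the numbers /dev/null carries on Linux
```

Storing the pair is trivial; deciding whether a filesystem format should know that some pairs don't travel is the part that hurts.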
misc references
---------------
- https://www.cyphar.com/blog/post/20190121-ociv2-images-i-tar
    - the entire thing. Just read the entire thing.
- `casync` has an (overly?) comprehensive list of things you could conceivably save metadata on: https://github.com/systemd/casync/blob/e4a3c5efc8f11e0e99f8cc97bd417665d92b40a9/src/caformat.h#L82-L125
- Timeless Stack `rio` has a bunch of code around filesystem attribute normalizers, which came out of real-world user-story discovery around needing to hash things deterministically+convergently while handling full container filesystems: https://github.com/polydawn/go-timeless-api/blob/0ece408663edb9dbc6109ef8690f8425f0d8f5f4/filesetFilters.go
    - especially take a gander at the preset groups that ended up getting used in different user stories: https://github.com/polydawn/go-timeless-api/blob/0ece408663edb9dbc6109ef8690f8425f0d8f5f4/filesetFilters.go#L32-L40
    - *We will probably have filters like this in IPFS for **all** future paths of development, for any and all of the above choice matrices.* Anything we do that's not exactly posix -> there will be filters/mappings. Know 'em.
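
What a filter of this kind might look like, as a hedged sketch: this is not `rio`'s API, and `apply_filters`, the `KEEP` sentinel, and both preset names are invented for illustration. Each filter either passes a value through or forces it to a fixed one, so trees packed on different hosts can converge on the same hash.

```python
KEEP = object()  # sentinel: pass the value through untouched

def apply_filters(attribs, uid=KEEP, gid=KEEP, mtime=KEEP, drop_setid=False):
    """Return a normalized copy of `attribs`; each filter is either KEEP or a forced value."""
    out = dict(attribs)
    for key, forced in (("uid", uid), ("gid", gid), ("mtime", mtime)):
        if forced is not KEEP:
            out[key] = forced
    if drop_setid:
        out["setuid"] = False
        out["setgid"] = False
    return out

# Two hypothetical presets: keep everything, or flatten host-specific details.
PRESETS = {
    "archival": {},
    "flatten": {"uid": 1000, "gid": 1000, "mtime": 0, "drop_setid": True},
}

raw = {"uid": 501, "gid": 20, "mtime": 1700000000, "setuid": True, "setgid": False}
print(apply_filters(raw, **PRESETS["archival"]))
print(apply_filters(raw, **PRESETS["flatten"]))
```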
perhaps a summary
-----------------
1. There is definitely more than one meaningful filesystem spec possible. Let's do several.
    - We are uniquely empowered to do this and support it well and share dedup. We can rock; therefore, we should rock.
2. Probably "containerfs" is a big one; "plainfiles" might be another; "posixlowpower" a third?
    - This is an example enumeration mapping onto user stories, but it might not be correct. Figuring out which of these (or which not-listed-here combinations) we actually want to focus on is still the name of the problem -- we've done discovery in this document, not made choices yet.
3. Decisions about the overall tree topology should be made *now*. All three (or more) filesystem specs should share the topology.
    - Variations in the attributes are fine; variations in the topology are extremely undesirable. (Concretely: we want the same selectors to work on all of the filesystem formats!)
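
A tiny sketch of why shared topology matters, with a plain path walk standing in for an IPLD Selector (it is not the selector engine, and the node layout is an assumption): the traversal only looks at `children`, so it works unchanged on trees whose attribs follow different schemas.

```python
def walk(node, path=""):
    """Yield every path in a tree, looking only at the shared 'children' topology."""
    yield path or "/"
    for name, child in sorted(node.get("children", {}).items()):
        yield from walk(child, path + "/" + name)

# Same topology, different attribute schemas (rich vs. empty).
containerfs = {
    "attribs": {"uid": 0, "gid": 0, "posix": 0o755, "mtime": 0},
    "children": {
        "bin": {
            "attribs": {"uid": 0, "gid": 0, "posix": 0o755, "mtime": 0},
            "children": {"sh": {"attribs": {"uid": 0, "gid": 0, "posix": 0o755, "mtime": 0}}},
        }
    },
}
plainfiles = {
    "attribs": {},
    "children": {"bin": {"attribs": {}, "children": {"sh": {"attribs": {}}}}},
}

print(list(walk(containerfs)))
print(list(walk(plainfiles)))
```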