# Sourmash signatures - metadata thoughts
tl;dr? Keep the core signature format lean and mean, with a few required fields; put everything else in the `metadata` attribute, which will be a list of dictionaries. This metadata attribute would be passed along with the signature internally, and reading/writing routines would need to leave it unchanged, but it would not affect md5sums or equality of signatures.
## Required fields in a signature:
* class
* license
* hash function
* signatures
* version
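
To make the "metadata does not affect md5sums or equality" idea concrete, here is a minimal sketch. It is not how sourmash computes its md5sums: the key spellings (`hash_function`, the `block` key used to name a metadata block), the field values, and the `core_md5` helper are all assumptions made up for illustration.

```python
import hashlib
import json

# Assumed key spellings for the required fields listed above.
REQUIRED_FIELDS = ("class", "license", "hash_function", "signatures", "version")


def core_md5(record):
    """md5 over the required fields only; `metadata` is deliberately skipped."""
    core = {key: record[key] for key in REQUIRED_FIELDS}
    payload = json.dumps(core, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


# Hypothetical record; every value here is a placeholder.
record = {
    "class": "sourmash_signature",
    "license": "CC0",
    "hash_function": "0.murmur64",
    "signatures": [{"ksize": 31, "mins": [11, 42, 97]}],
    "version": 0.4,
    "metadata": [{"block": "sample", "name": "example sample"}],
}

# Adding a metadata block leaves the core md5 untouched.
tagged = dict(record, metadata=record["metadata"] + [{"block": "tags", "values": ["soil"]}])

# Changing the signature data itself does change it.
changed = dict(record, signatures=[{"ksize": 31, "mins": [11, 42, 98]}])
```

With this layout, `core_md5(record) == core_md5(tagged)` holds while `core_md5(changed)` differs, which is the property the tl;dr asks for.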
## Reserved block names
We should identify some reserved block names that have special meaning. Obvious ones include:
* `sample` (for information about the sample that was computed upon) - number of bases, name, filename, maybe a download URL, etc.
* `provenance` (for provenance info, maybe free form?) - what command was run on what system, etc.
* `ipfs` for IPFS data file retrieval information.
* `ncbi` or `ncbi_taxonomy` for accessions and/or taxonomy IDs and other such information.
The content of these blocks should be described more completely and then encoded in software, along with a software validator.
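
A software validator for reserved blocks might look roughly like the sketch below. The block contents have not been specified yet, so the schemas here (key names such as `num_bases`, `command`, `system`) and the `block` key used to name each dictionary are hypothetical placeholders, not a proposal for the real layout.

```python
# Hypothetical schemas: allowed keys and their expected types per reserved block.
RESERVED_SCHEMAS = {
    "sample": {"name": str, "filename": str, "num_bases": int, "url": str},
    "provenance": {"command": str, "system": str},
}


def validate_metadata(metadata):
    """Return a list of problems found; an empty list means nothing was flagged."""
    if not isinstance(metadata, list):
        return ["metadata must be a list of dictionaries"]

    problems = []
    for index, block in enumerate(metadata):
        if not isinstance(block, dict):
            problems.append(f"block {index}: not a dictionary")
            continue

        name = block.get("block")
        schema = RESERVED_SCHEMAS.get(name) if isinstance(name, str) else None
        if schema is None:
            continue  # unreserved blocks are passed through unchecked

        for key, value in block.items():
            if key == "block":
                continue
            expected = schema.get(key)
            if expected is None:
                problems.append(f"block {index} ({name}): unknown key {key!r}")
            elif not isinstance(value, expected):
                problems.append(
                    f"block {index} ({name}): {key!r} should be {expected.__name__}"
                )
    return problems
```

Blocks with unreserved names are ignored on purpose, so that people can still attach their own metadata without the validator complaining.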
### Reserved metadata block proposal: `tags`
tl;dr? Can we build a useful tagging system in, to support a [folksonomy](https://en.wikipedia.org/wiki/Folksonomy)?
See [Better file organization around tags not hierarchies](https://www.nayuki.io/page/designing-better-file-organization-around-tags-not-hierarchies) via @luizirber.
One thought: here it would be nice to have something that doesn't change the actual signature file content, so that the hash doesn't change (e.g. for IPFS distribution). Can this be done via some IPFS mechanism ([IPNS](https://github.com/ipfs/faq/issues/16)?) or [hypothesis](https://web.hypothes.is/), e.g. by specifying a unique URI for each signature that could then be annotated in hypothesis? (It looks like the term we want here is "external metadata".)
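
As a rough illustration of tags kept as external metadata, the sketch below stores tags in a separate index keyed by signature identifier, so the signature content (and its hash) is never touched. The `TagIndex` class and the short identifiers are hypothetical; where such an index would actually live (IPNS, hypothesis, or something else) is exactly the open question above.

```python
from collections import defaultdict


class TagIndex:
    """Tags stored outside the signature files, keyed by signature id."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def tag(self, signature_id, *tags):
        """Attach one or more tags to a signature id."""
        for name in tags:
            self._by_tag[name].add(signature_id)

    def find(self, *tags):
        """Return the signature ids that carry every one of the given tags."""
        if not tags:
            return set()
        matches = [self._by_tag.get(name, set()) for name in tags]
        return set.intersection(*matches)


index = TagIndex()
index.tag("5e665d", "soil", "metagenome")  # placeholder signature ids
index.tag("48d23d", "soil")
soil_metagenomes = index.find("soil", "metagenome")
```

Because lookups go through tags rather than a directory tree, a signature can sit under as many tags as people care to give it.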
## Conundrum: how do we do forwarding?
Another use for external metadata would be forwarding between signatures (e.g. "signature 5e665d is from a sample that has been updated; new signature is 48d23d".) Again, we want to avoid updating the signature content with this information because that would change the hash.
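
One way to picture forwarding as external metadata is a separate table that maps an old signature id to its replacement, which readers consult without ever rewriting a signature. This is only a sketch: the `forwards` table, the `resolve` helper, and the idea of following chains of forwards are assumptions, and the ids are the placeholders from the example above.

```python
def resolve(signature_id, forwards):
    """Follow forwarding entries until reaching an id with no newer version."""
    seen = {signature_id}
    current = signature_id
    while current in forwards:
        current = forwards[current]
        if current in seen:
            raise ValueError(f"forwarding loop detected at {current}")
        seen.add(current)
    return current


# External table, stored apart from the signatures themselves.
forwards = {"5e665d": "48d23d"}

latest = resolve("5e665d", forwards)
```

The loop check matters because the table is maintained outside the signatures, so nothing else prevents two entries from pointing at each other.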
## Luiz comments
- IPLD allows traversing IPFS objects.
  I worked a bit on making the SBT JSON valid for IPLD too;
  they have examples using git: https://github.com/ipfs/js-ipfs/tree/master/examples/traverse-ipld-graphs
- The annoying thing with metadata in IPFS objects is that any change will generate another hash =/
But not sure how to best represent it outside, either (if it is another IPFS object, we still need to update the signature to point to it, which defeats the purpose).
* oh, we could save the minhash in one object, and let different signatures point to the same data.
this way people can make their own metadata or extras on the signatures,
but everyone has the same values for the minhash.
- I really like the idea of making tags or extra metadata using hypothesis,
  how easy is it to access their data outside the browser?
- Add a 'previous_version' field, pointing to the previous IPFS object.
  This way we can keep some simple versioning info.
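
Two of the ideas above (one shared minhash object that several signatures point to, and a `previous_version` pointer) can be sketched with a toy content-addressed store. This does not use IPFS or IPLD: an in-memory dictionary keyed by SHA-256 stands in for content addressing, and the object layout (`minhash`, `metadata`, `block`, `values`) is a guess for illustration only.

```python
import hashlib
import json

store = {}  # toy stand-in for a content-addressed store


def put(obj):
    """Store an object under the hash of its canonical JSON; return the address."""
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    address = hashlib.sha256(data).hexdigest()
    store[address] = obj
    return address


def history(address):
    """Walk `previous_version` links from newest to oldest."""
    chain = []
    while address is not None:
        chain.append(address)
        address = store[address]["previous_version"]
    return chain


# The minhash data is stored once...
minhash_addr = put({"ksize": 31, "mins": [11, 42, 97]})

# ...and each signature version points at it with its own metadata.
v1 = put({
    "minhash": minhash_addr,
    "metadata": [{"block": "tags", "values": ["soil"]}],
    "previous_version": None,
})
v2 = put({
    "minhash": minhash_addr,
    "metadata": [{"block": "tags", "values": ["soil", "metagenome"]}],
    "previous_version": v1,
})
```

Here `v1` and `v2` get different addresses because their metadata differs, yet both point at the same `minhash_addr`, and `history(v2)` walks back to `v1`.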