# zarr correspondence 2022-09-13

Dear all,

Firstly, a massive thank you to everyone who has provided feedback on the v3 spec; your comments and insights are hugely appreciated.

Secondly, please accept my apologies for not being able to engage fully in the discussions or provide timely responses. Unfortunately I'm not able to give nearly as much time to Zarr as I would like to at the moment. Please forgive me and correct me if I have missed any important points of view or misunderstood anything.

In terms of the ZEP process, Jonathan and I, as authors of ZEP1, see our role as attempting to reach a position with the greatest possible consensus and support. We are also very conscious that work on this specification has been in progress for a long time, and everyone is keen to reach a resolution as soon as possible.

With that in mind, Jonathan and I have been working through all comments and suggestions received, with help from Sanket and Josh. To keep things together, please find below a list of points we've considered so far, together with proposed resolutions and explanations.

This is not the final word: if you have any follow-up comments, questions or suggestions, please let us know. However, we are hoping that there will be no serious objections, and that we will be able to move forwards to the next stage.

- 1. Core metadata and user attributes together or separate?
  See [#72](https://github.com/zarr-developers/zarr-specs/issues/72).
  We propose to retain the approach currently defined in the spec, where user attributes and core metadata are stored together within the same document. However, we note that some use cases do involve relatively large volumes of data stored as user attributes, and this may cause performance issues. If this situation does arise, it would be possible to define an extension which specifies a mechanism for storing some or all user attributes within a separate document.
- 2. Boolean data type in extensions or not?
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r927297719).
  We propose to amend the spec to add Boolean as a core data type, and to remove the current mention of Boolean as a possible extension data type.
- 3. Complex number data type in extensions or not?
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r927298207).
  We propose to amend the spec to add complex number as a core data type.
- 4. Datetime data type in extensions?
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r927298570).
  We propose to leave datetime data types to be defined via an extension.
- 5. Named dimensions part of the core metadata spec?
  See [#73](https://github.com/zarr-developers/zarr-specs/issues/73) and [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r927300522).
  We propose to amend the spec to add an array metadata property to store dimension names. The value of this property should be a JSON array of strings with the same length as the number of dimensions.
- 6. Since there are no extensions in ZEP1, should we remove specific storage transformers (sharding) from ZEP1 and add it as an extension?
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r927497241).
  We would like to find a way to make it clear that the v3 core spec is the first spec to go through the ZEP process (via ZEP1). We foresee a range of possible extensions, and we hope that these can each be proposed and discussed via subsequent ZEPs. The first example of this is a sharding storage transformer, proposed via ZEP2. However, to avoid any complicated dependencies between specs, we will remove references to any specific extensions from the core spec. That way it should hopefully be clear that the core spec comes first, and extension specs build upon it.
- 7. Have an example of what a store is.
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r928313883).
  We will edit language in the spec to give an example of what a store is.
- 8. Have a clearer definition of `Storage Transformers`, perhaps a diagram?
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r928314024).
  We will edit language in the spec and, if possible, add a diagram to help explain storage transformers.
- 9. Constraints on node names.
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r928314250).
  We propose to remove the length constraint on node names.
- 10. `Core data type - r*`: using them for extension type fallbacks. Data stored in extensions should be readable by Zarr implementations that haven't implemented the given extension.
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r928314682).
  We propose to edit the spec to add some more explanation of how fallback data types should be handled by an implementation.
- 11. `Chunk Grids`: Being explicit about the size of border chunks.
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r928315391).
  We propose to edit the spec section on regular grids to make clear the size of border chunks.
- 12. Why separate `zarr_format` and `metadata_encoding` properties in entry point metadata?
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r928315793).
  The idea for the `zarr_format` property is that it should identify which version of the core specification the data conforms to. The idea for the `metadata_encoding` property is that this is an extension point, allowing extensions to define alternative metadata encodings. That is, if an extension defines a new metadata encoding, then we would expect the extension spec to mint a new URI to identify the new encoding, and use that URI as the value of the `metadata_encoding` property.
  That URI should be a persistent URI which redirects to the extension spec document. It is potentially confusing that both of these properties adopt the same URI value if the v3 format is used with JSON metadata encoding. This is simply because the v3 core spec defines both the v3 format and the JSON metadata encoding. We propose to add some text explaining that the `metadata_encoding` property is an extension point and how it could be used.
- 13. `metadata_key_suffix`.
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r928315935).
  We propose to remove the `metadata_key_suffix` property, because it could lead to confusing situations where the suffix and the metadata encoding do not correspond. We will amend the spec to state that the suffix should always be ".json" if the metadata encoding is JSON. For any other metadata encoding, the metadata key suffix should be specified by the extension spec defining the encoding.
- 14. Clarification on `root node` name.
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r928339043).
  We propose to edit the spec to clarify that the root node name is the empty string.
- 15. Explicitly state that the path is a string?
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r928339225).
  We propose to edit the spec to clarify that the node path is a string.
- 16. `Memory layout`?
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r929135744).
  To keep implementation relatively simple and achieve broad interoperability, we propose to retain the definition of "C" and "F" memory layouts in the core spec. Note that this requires implementations to support reading of data using either of these memory layouts. We also think it is reasonable for an implementation to support only one of these memory layouts when writing data.
  If support for arbitrary memory layouts is potentially valuable, this could be added via an extension. An extension could achieve this because extensions can define new metadata properties or modify the allowed values of existing metadata properties within array metadata documents, provided the appropriate value of `must_understand` is given when the extension is used.
- 17. Different naming scheme for data types?
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r929140806); *by Jeremy*
  - `Four 👍🏻` in favour
  - **Propose:** change the spec so that the naming scheme uses the expanded format.
  - Remaining question is whether to restrict all systems to store little endian only (in which case metadata does not need to say anything about endianness) or permit storage of either big endian or little endian (in which case metadata would need to record which endianness has been used for storage).
- 21. Not ending path with `/`?
  See [comment](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r931267280) *by Trevor Manz*
  - Alistair: Paths will never end with a slash by definition of path, because node names cannot contain a slash.
  - Sanket: What should an implementation do if a user provides a path written with a trailing slash? Strip it and continue processing? Raise an error?
  - **Propose:** Clarify that node paths will never end with a slash, because node names cannot contain a slash. I.e., a string ending in a trailing slash is not a valid zarr node path.
- Milestone RC1
- **Action:** Alistair to communicate proposed resolutions to all above points.
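
Some of the proposed resolutions above (points 5, 9, 14, 15 and 21) amount to small validation rules, so they can be sketched in a few lines of Python. This is only an illustration of the wording in these notes, not text from the spec or code from any Zarr implementation: the function names `validate_dimension_names`, `is_valid_node_name` and `ends_with_slash` are hypothetical, and the sketch deliberately does not decide whether an implementation should strip a trailing slash or raise an error, since that question is left open above.

```python
def validate_dimension_names(shape, dimension_names):
    """Point 5: a list of strings (JSON array), one per array dimension."""
    if not isinstance(dimension_names, list):
        raise TypeError("dimension names must be a list (JSON array)")
    if not all(isinstance(name, str) for name in dimension_names):
        raise TypeError("every dimension name must be a string")
    if len(dimension_names) != len(shape):
        raise ValueError(
            f"expected {len(shape)} dimension names, got {len(dimension_names)}"
        )
    return dimension_names


def is_valid_node_name(name):
    """Points 9 and 14: no slash, no length limit; the root name is ''."""
    return isinstance(name, str) and "/" not in name


def ends_with_slash(path):
    """Points 15 and 21: a path is a string; a trailing slash is flagged.

    Assumption: this follows the proposal's wording literally and says
    nothing about how the root node's path is written.
    """
    if not isinstance(path, str):
        raise TypeError("node path must be a string")
    return path.endswith("/")
```

For example, `validate_dimension_names((100, 200), ["y", "x"])` returns the list unchanged, while `ends_with_slash("foo/bar/")` is `True`, leaving the caller to either strip the slash or reject the path.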