
ZEP1 Meeting @ 5/9

ZEP1 PR 🔗 https://github.com/zarr-developers/zarr-specs/pull/149

Attending: Sanket Verma (SV), Josh Moore (JM), Jonathan Striebel (JS), Alistair Miles (AM)

Initial remarks:

  • SV: no contention seen during preparation of this agenda. Several proposals and changes to language but no large blockers.
  • AM: see this as an editor meeting. Take each point in hand and decide on resolution amongst ourselves.

General feedback and comments 👇🏻

    1. Core metadata and user attributes together or separate? See #72
    • SV: was an open question in the original ZEP (from 2020).
    • AM: no votes against, and a clear vote for from Dennis.
    • Propose: No change.
    1. Boolean in extensions or not? See comment; by Stephan Hoyer and Jeremy
    • SV: JMS says in core. Two
    • Propose: Change spec to add Boolean as core data type (and edit spec to remove mention of Boolean as possible extension).
    1. Complex numbers in extensions? See comment; by Stephan Hoyer
    • Three: 👍🏻 in favour
    • Propose: Change spec to add complex numbers as core data type.
    1. Leaving datetime out of extensions? See comment; by Stephan Hoyer
    • Four: 👍🏻 in favour
    • Propose: No change. Leave for an extension.
    1. Named dimensions part of the core metadata spec? See #73 and comment; by Stephan Hoyer
    • Five: 👍🏻 in favour
    • Propose: Change spec to add a dimension names array metadata property. Value should be JSON array with same length as number of dimensions. Values should be unique. Can any/all values be null? Edits required to array metadata section, and maybe also in definitions at the top.
    1. Since there are no extensions in ZEP1, should we remove specific storage transformers (sharding) from ZEP1 and add it as an extension? See comment; by Jonathan
    • Need to clean up zarr-specs to remove specs not associated with a ZEP
    • Do we do that now? Some kind of middle ground?
    • When do we decide to approve ZEP1? Can do as long as no veto from ZIC and approval from ZSC
    • N.B., we can accept ZEP1 first, but then implementation starts, and hold off publicly advertising zarr v3 until implementations are mature
    • "Provisional" := "freeze" in which only blockers are accepted
    • Propose: Remove all references to specific extensions from the core spec. (Although could still mention as idea that could be done). State for ZEP1 will become provisionally accepted. N.B., spec will be merged once it becomes provisionally accepted, and will have a clear label as provisional. (TODO Jonathan)
    1. Have an example of what a store is. See comment; by Constantin Pape
    1. Have a clearer definition of Storage Transformers, perhaps a diagram? See comment; by Constantin Pape and Ryan A.
    1. Constraints on node names. See comment; by Constantin Pape, Jeremy and Ryan A.
    • Ryan voted against Windows-dependency
    • AM: could make this informative rather than normative, i.e., add informative note that long node names and paths may cause problems on some storage systems
    • JS: Node name restriction does not even solve the Windows path limit problem.
    • Propose: Make this an informative note rather than normative.
    1. Core data type - r*: using them for extension type fallbacks. Data stored in extensions should be readable by Zarr implementations that haven't implemented the given extension. See comment; by Constantin Pape and Mark
    • Propose: Edit spec to add some more explanation of how fallback data types should be handled by an implementation.
    1. Chunk Grids: Being explicit about the size of border chunks. See comment; by Constantin Pape
    • Propose: Edit spec section on regular grids to make clear the size of border chunks.
    1. Why a separate entry point metadata document for zarr_format and metadata_encoding? See comment
    • Propose: Add comment with explanation of why this is the way it is and what we would expect to happen if an extension defines a new metadata encoding. (TODO Alistair)
    1. metadata_key_suffix. See comment
    • Propose: Edit spec to remove the metadata_key_suffix property, and state that suffix should always be .json if metadata encoding is json. For any other metadata encoding, the suffix should be specified by the extension spec defining the encoding.
    1. Clarification on root node name. See comment
    • Propose: Edit spec to clarify that root node name is the empty string.
    1. Explicitly state that the path is a string? See comment
    • Propose: Edit spec to clarify that the path is a string formed by
    1. Memory layout - support any arbitrary layouts? See comment; by Jeremy
    • JS: strongly for more than one. Very limited if you have other data.
    • SV: several discussions around this. (issue 126)
    • JS: (longish)
    • AM: agreed. Could see not supporting writing.
    • SV: and JMS' "arbitrary orders"?
    • AM: suppose not supporting it (have never experienced it). "Alistair as a litmus test": arbitrary order leads to mind bending.
    • Propose: No change. Retain the definition of C and F memory layouts in core spec. Support for reading both C and F is required. Suggest to Jeremy that arbitrary memory layouts are defined via an extension. (TODO Jonathan)
    1. Different naming scheme for data types? See comment; by Jeremy
    • Four 👍🏻 in favour
    • Propose: change spec to change naming scheme to use expanded format.
    • Remaining question is whether to restrict all systems to store little endian only (in which case metadata does not need to say anything about endianness) or permit storage of either big endian or little endian (in which case metadata would need to record which endianness has been used for storage).
    1. Not ending path with /? See comment by Trevor Manz
    • Alistair: Paths will never end with a slash by definition of path, because node names cannot contain slash.
    • Sanket: What should an implementation do if a user provides a path which the user has written with a trailing slash? Strip it and continue processing? Raise an error?
    • Propose: Clarify that node paths will never end with a slash, because node names cannot contain a slash. I.e., a string ending in a trailing slash is not a valid zarr node path.
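
Sanket's question above (strip a trailing slash, or raise an error?) could be sketched as follows. This is only an illustration in Python, not spec text: the function name `normalize_node_path` and the `strict` flag are hypothetical, and passing a bare `/` through unchanged is an assumption, since these notes do not settle how the root path is written.

```python
def normalize_node_path(user_path: str, strict: bool = False) -> str:
    """Return a node path with no trailing slash.

    Hypothetical helper. With strict=True a trailing slash is rejected,
    otherwise it is stripped and processing continues.
    """
    if not isinstance(user_path, str):
        raise TypeError("node path must be a string")
    # Assumption: a bare "/" is left alone; the notes do not define this case.
    if user_path == "/":
        return user_path
    if user_path.endswith("/"):
        if strict:
            raise ValueError(f"not a valid node path (trailing slash): {user_path!r}")
        # Lenient policy: drop every trailing slash.
        user_path = user_path.rstrip("/")
    return user_path


print(normalize_node_path("/foo/bar/"))  # /foo/bar
```

Whichever policy is chosen, the stored path itself never carries the trailing slash; the two options only differ in how forgiving the user-facing API is.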
  • Milestone RC1

    • Action: Alistair communicate proposed resolutions to all above points.
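
The dimension names proposal above (a JSON array with the same length as the number of dimensions, with unique values) might be checked along these lines. A rough sketch in Python under stated assumptions: the function name and the `allow_null` flag are hypothetical, whether nulls are permitted is still an open question in these notes, and exempting nulls from the uniqueness check is a guess.

```python
import json


def validate_dimension_names(names, ndim, allow_null=True):
    """Check a dimension names array against the proposed constraints.

    Hypothetical helper; `allow_null` models the open question of
    whether any or all values may be null.
    """
    if not isinstance(names, list):
        raise ValueError("dimension names must be a JSON array")
    if len(names) != ndim:
        raise ValueError(f"expected {ndim} names, got {len(names)}")
    named = []
    for name in names:
        if name is None:
            if not allow_null:
                raise ValueError("null dimension names are not allowed")
            continue  # assumption: nulls do not count towards uniqueness
        if not isinstance(name, str):
            raise ValueError(f"dimension name must be a string or null: {name!r}")
        named.append(name)
    if len(set(named)) != len(named):
        raise ValueError("dimension names must be unique")
    return names


# Example: names for a 3-dimensional array, parsed from JSON.
validate_dimension_names(json.loads('["time", "lat", "lon"]'), ndim=3)
```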

    1. Optional support for float16? See comment; by Jeremy
    • Two 👍🏻 in favour
    • Action: Ask whether conversion to/from float16 is possible in programming languages that don't support float16.
    • Option 1: Make float16 optional.
    • Option 2: Include float16 as mandatory, expect conversion to happen on platforms that don't have native support. (Favoured by Alistair.)
    • Action: Alistair discuss via comments.
    1. Clarity on use of 0 as minor version number. See comment by Jeremy
    • Interesting point, we would generally want folks to use extensions to add new features, rather than create a new minor version of the core spec.
    • Option 1: Drop the minor version. Only way to add new features is to use extensions. (Favoured by Sanket and Alistair.)
    • Option 2: Keep the minor version, but say we strongly encourage the community to use extensions to add new features.
    • Action: Sanket will discuss with Josh.
    1. Storage Keys - naming scheme drawback. See comment; by Jeremy
    • Alistair: I don't understand what "not a good way to have a path directly to a non-root array" means.
    • Alistair: I don't understand what "it would be helpful if the path were a real filesystem path" means.
    • Action: Alistair ask for clarification. Proposals would seem to break some important performance issues, like getting view of hierarchy.
    • Action: Sanket revisit notes from relevant community calls.
    1. Main issues according to Ryan A. See comment
    • Define input/output of storage transformer. See comment; by Ryan
    • Missing reference for root node. See comment; by Ryan
    • C-style layout on disk. See comment; by Ryan
    • Query on SEMVER. See comment; by Ryan
    • Proposal to remove this. See comment; by Ryan
    • Language edit to remove root. See comment; by Ryan
    1. Broaden the storage transformers to act on entire store. See comment
    • Two 👍🏻 in favour
    1. Josh additions
    • .zarray/.zgroup version .zsomething
    • meta/ root question
    • optional dimensions (shoyer 276)
    • use dev version number (joshmoore 1020)
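
On the float16 question above: assuming float16 means IEEE 754 binary16, conversion does not strictly need native language support, because a value can be decoded from its 16-bit pattern with integer arithmetic alone. A minimal sketch in Python; `half_bits_to_float` is a hypothetical helper, not part of any Zarr implementation.

```python
def half_bits_to_float(bits: int) -> float:
    """Decode a 16-bit IEEE 754 binary16 pattern into a Python float."""
    sign = -1.0 if (bits >> 15) & 1 else 1.0
    exponent = (bits >> 10) & 0x1F  # 5 exponent bits
    fraction = bits & 0x3FF         # 10 fraction bits
    if exponent == 0:
        # Zero or subnormal: no implicit leading 1.
        return sign * (fraction / 1024) * 2.0 ** -14
    if exponent == 0x1F:
        # All exponent bits set: infinity or NaN.
        return sign * float("inf") if fraction == 0 else float("nan")
    # Normal number: implicit leading 1, exponent bias of 15.
    return sign * (1 + fraction / 1024) * 2.0 ** (exponent - 15)


print(half_bits_to_float(0x3E00))  # 1.5
```

This covers reading only; writing would also need a rounding rule for values that binary16 cannot represent exactly.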

Language edits 👇🏻

Miscellaneous issues 👇🏻

V3 Extensions 👇🏻

  • Awkward Arrays as extensions? See #89
  • Consolidate metadata as an extension. See comment; by Stephan Hoyer

Open PRs in zarr-specs repo 👇🏻

  • Revise how the domain of an array is specified - see #144; by Jeremy
    • Discussed at one of the community calls; the majority is not in favour of this
  • Require fill_value to be defined - see #145; by John A. Kirkham
    • In favour
  • Support a list of codecs in place of a single compressor field - see #153; by Jeremy
    • In favour - see discussion here
  • Change data type names and endianness
    • In favour - see discussion here; by Jeremy
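
To illustrate the idea behind #153 (a list of codecs in place of a single compressor field), here is a rough sketch in Python. It assumes codecs are applied in list order when encoding and in reverse order when decoding; the `Codec` class, the function names and the toy byte-reversal codec are all hypothetical and do not reflect the metadata layout proposed in the PR.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Codec:
    """Hypothetical codec: a pair of bytes-to-bytes functions."""
    encode: Callable[[bytes], bytes]
    decode: Callable[[bytes], bytes]


def encode_chunk(data: bytes, codecs: List[Codec]) -> bytes:
    # Apply each codec in the order listed.
    for codec in codecs:
        data = codec.encode(data)
    return data


def decode_chunk(data: bytes, codecs: List[Codec]) -> bytes:
    # Undo the codecs in reverse order.
    for codec in reversed(codecs):
        data = codec.decode(data)
    return data


# Toy pipeline: a byte-reversal "filter" followed by zlib compression.
reverse_bytes = Codec(encode=lambda b: b[::-1], decode=lambda b: b[::-1])
deflate = Codec(encode=zlib.compress, decode=zlib.decompress)
pipeline = [reverse_bytes, deflate]

raw = bytes(range(16)) * 4
stored = encode_chunk(raw, pipeline)
restored = decode_chunk(stored, pipeline)
```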