---
tags: zarr, zsc
---
# ZSC 2021-09-27
Attending: Alistair, Josh, Ryan Abernathey, Ryan W
* [TODOs from last time: 2021-09-01](https://hackmd.io/IRjXZ8lRRzCyQD76R9wyhw?view#Additional-paper-notes-amp-TODOs)
* JupyterBook Intro
- https://distill.pub/ (https://distill.pub/2021/distill-hiatus/)
- Use of templates, etc. web first.
- need a couple of designers.
* Governance (from last time)
- Twitter: tweetdeck & credentials
- invited Ryan A to tweetdeck.
* carbonplan (mapbox)
- Alistair: nice post
- Ryan: think we should focus on web-first
- Ryan: want smaller chunks for web (--> caterva)
- That's what we would wave a wand at
- Want to work together
- Alistair: clarity on functionality? blosc2 vs. caterva
- blosc2 is n-dim aware...
- RA: hard to understand. (we aren't blosc experts anyway)
- header stuff is useful (metadata layer that we don't need)
- most of what we need is in blosc2 (we avoid the header because it's redundant on chunks)
- good to try to get caterva demo
- (post-meeting) still need array order for blocks. might be fun to experiment. space-filling curve for batching
- similar to partial reads. need to know range
- RA: don't love how it's coded.
- https://github.com/zarr-developers/zarr-python/blob/78eb8b728e92cf5cbb6ff58d7da0d4a26c54a0ec/zarr/util.py#L548
- map array slice to bytes within (chunk) file
- John has different opinion: completely bypass numcodecs. Expose caterva array directly in zarr (already ND)
- would work but caterva wouldn't be a codec (leak in abstraction)
- propose instead to augment numcodecs with possible awareness of the underlying arrays & slicing capabilities
- chicken & egg: playing with tools to figure out what we want to achieve
- Josh: trying a non-blosc sharded backend?
- Vital: keep clear abstractions (similar for V3)
- **Is there a missing concept?** Key question.
- Blosc2 Python example of chunked data: https://github.com/Blosc/python-blosc2/blob/main/examples/schunk.py
- Josh: explanation of webknossos format (see last community call)
- RyanA: Extend to support uncompressed. Then we know how to find something. Pass-through codec for flat binary. Propose to structure the API so that anything that lives within a single file is accessed through numcodecs (a slight expansion of its scope). But numcodecs is for reading blobs. Zarr is responsible for coordinating many of those blobs.
- Alistair: have the feeling that going all through the numcodecs API isn't quite right, since it assumes you've already retrieved the data (i.e. just a sequence of bytes). Something is needed in the storage part of the API.
- Josh: translation like fsspec-reference-maker? key --> (url, offset, length) ... and multiple chunks?!
- RyanA: don't believe we can outsource it. (just like V3)
- Alistair: how do we solve these difficult design problems then? Previously AM, RA, Matt Rocklin, Stephan Hoyer. People were engaged. We need a forum. We need some input. Experience and knowledge.
- Josh: (1) comm. mgr. to run design meetings? (2) think we can get WK to join the community
- Josh: working on xarray SoW. Not sure how to balance it. (conflict)
- roughly "support zarr multiscale in xarray"
- Ryan: forcing a third-party library to read & write image pyramids would force us to formalize the convention. xarray has a richer model.
- being able to encode that in a round-trip-able way
- show that it can be read by downstream software
- increase adoption of format
- forces us to make zarr better (named dimensions)
- cf. fsspec, came through xarray
- repository for zarr-conventions (registry, discovery)
- sidelined by the v3 spec.
- AM: keep extension and convention separate. need a home.
- JM: https://data-apis.org/ ?
- JM: file-loading? anything else?
- AM: some conventions in genomics community. our names.
- JM: tabular data...
- AM: include b-open in review, percentage of their time.
- RA: hear about companies using zarr. people don't know.
- "looking for maintainers."
- AM: could try to help mentor maintainers. 1hr/wk
- community manager to coordinate. advertise. build capacity. schedule with someone
- RA: https://medium.com/pangeo/supporting-new-xarray-contributors-6c42b12b0811
* Governance
- numfocus updates (hire, meetings)
- V3 Spec & Unidata funding (6-8 months until it begins)
- any vetoes on
- logo
- trademarking
- zarr-format.org: [issue585](https://github.com/zarr-developers/zarr-python/issues/585)
* GitHub: (day-to-day business) (from last time)
- Dependabot and other "noise"
- CODEOWNERS
- Reviewers....
* Brainstorming
- Corporate involvement
- Google 20% / xarray-beam
- Protocol Labs
- Contractors
- Webknossos
- b-open
- blosc
* Zarr backlog
* Please list issues here!
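
Sketch for the partial-read discussion under carbonplan ("map array slice to bytes within (chunk) file" and "key --> (url, offset, length)"). This is a minimal illustration, not the zarr-python implementation linked above. It assumes an uncompressed, C-ordered chunk, where a slice along the first axis is one contiguous byte range. The names `row_byte_range`, `resolve`, `REFERENCES` and the `file:///tmp/example.bin` URL are all hypothetical.

```python
import math

def row_byte_range(chunk_shape, itemsize, start, stop):
    """Return (offset, length) in bytes for rows [start, stop) of a chunk.

    Assumes the chunk is stored uncompressed in C order, so rows along
    the first axis are laid out back to back.
    """
    if not 0 <= start <= stop <= chunk_shape[0]:
        raise ValueError("row slice falls outside the chunk")
    # Bytes occupied by one index along the first axis.
    row_nbytes = itemsize * math.prod(chunk_shape[1:])
    return start * row_nbytes, (stop - start) * row_nbytes

# Hypothetical reference table: chunk key -> (url, offset, length).
# Two 24-byte chunks are packed one after another in a single file.
REFERENCES = {
    "0.0": ("file:///tmp/example.bin", 0, 24),
    "0.1": ("file:///tmp/example.bin", 24, 24),
}

def resolve(key, chunk_shape, itemsize, start, stop):
    """Translate (chunk key, row slice) into (url, offset, length)."""
    url, chunk_offset, chunk_length = REFERENCES[key]
    offset, length = row_byte_range(chunk_shape, itemsize, start, stop)
    if offset + length > chunk_length:
        raise ValueError("requested range exceeds the stored chunk")
    return url, chunk_offset + offset, length

# Check against an in-memory chunk of shape (4, 6) with 1-byte items.
chunk_bytes = bytes(range(24))
offset, length = row_byte_range((4, 6), 1, 1, 3)
partial = chunk_bytes[offset:offset + length]
```

Here `partial` holds rows 1 and 2 of the chunk, and `resolve("0.1", (4, 6), 1, 1, 3)` yields a range inside the second packed chunk. Once a compressor is involved, the range depends on codec-internal block boundaries, which is the abstraction question raised in the notes.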