---
tags: zarr, ZSC, meeting, notes
---

# ZSC Bi-weekly meeting

--> [Zoom](https://openmicroscopy-org.zoom.us/j/84422642407?pwd=TWpBNCttVzR6SFY0WGcxbjVlZFBHQT09&from=addon#success)

| Meeting | AM | JK | JM | RA | RW | SV |
| ------------------------- | :-: | :-: | :-: | :-: | :-: | :-: |
| [2024-04-11](#2024-04-11) | 🏖️ | x | x | x | | x |
| [2024-03-14](#2024-03-14) | x | x | x | x | x | x |
| [2024-02-15](#2024-02-15) | | | x | x | x | x |
| [2024-01-18](#2024-01-18) | x | | x | x | x | x |
| [2023-12-21](#2023-12-21) | | | x | | | x |
| [2023-12-07](#2023-12-07) | x | x | x | x | x | |
| [2023-11-23](#2023-11-23) | 🦃 | 🦃 | 🦃 | 🦃 | 🦃 | 🦃 |
| [2023-11-09](#2023-11-09) | | x | x | | | x |
| [2023-10-26](#2023-10-26) | | x | | x | | x |
| [2023-10-12](#2023-10-12) | | x | x | | | x |
| [2023-09-28](#2023-09-28) | | | x | | | x |
| [2023-09-14](#2023-09-14) | | | x | | | x |
| [2023-08-31](#2023-08-31) | x | 😷 | x | x | | x |
| [2023-08-17](#2023-08-17) | | | | x | x | x |
| [2023-08-03](#2023-08-03) | | x | | x | x | x |
| [2023-07-20](#2023-07-20) | | | x | x | x | x |
| [2023-07-06](#2023-07-06) | | x | x | | | x |
| [2023-06-22](#2023-06-22) | x | x | x | x | | x |
| [2023-06-08](#2023-06-08) | x | x | | | | x |
| [2023-05-25](#2023-05-25) | x | | x | x | | L |
| [2023-05-11](#2023-05-11) | x | x | x | | | x |
| [2023-04-27](#2023-04-27) | | | x | x | | 🌴 |
| [2023-04-13](#2023-04-13) | | x | 🌴 | x | | x |
| [2023-03-30](#2023-03-30) | - | - | - | - | - | - |
| [2023-03-16](#2023-03-16) | | x | x | x | | x |
| [2023-03-10](#2023-03-10) | | x | x | | | |
| [2023-03-02](#2023-03-02) | | | x | | | x |
| [2023-02-16](#2023-02-16) | - | - | - | - | - | - |
| [2023-02-02](#2023-02-02) | - | - | - | - | - | - |
| [2023-01-19](#2023-01-19) | | x | x | x | | x |
| [2023-01-05](#2023-01-05) | | x | x | | | x |
| [2022-12-22](#2022-12-22) | 🎄 | 🎄 | x | 🎄 | 🎄 | x |
| [2022-12-08](#2022-12-08) | | x | x | x | | x |
| [2022-11-24](#2022-11-24) | x | 🦃 | x | 🦃 | 🦃 | x |
| [2022-11-09](#2022-11-09) | | x | x | x | x | x |
| [2022-10-27](#2022-10-27) | | | 🥾 | | | |
| [2022-10-13](#2022-10-13) | | x | x | x | | |
| [2022-09-29](#2022-09-29) | | x | x | | | L |
| [2022-09-15](#2022-09-15) | | | x | x | | ✈️ |
| [2022-09-01](#2022-09-01) | L | x | x | x | | x |
| [2022-08-18](#2022-08-18) | | x | 🏕 | | | x |
| [2022-08-04](#2022-08-04) | | x | x | | | x |
| [2022-07-21](#2022-07-21) | x | x | x | | x | x |
| [2022-07-07](#2022-07-07) | x | x | x | | | 🏔 |
| [2022-06-23](#2022-06-23) | | x | x | x | | x |
| [2022-06-09](#2022-06-09) | x | x | 😷 | x | | x |
| [2022-05-26](#2022-05-26) | x | x | x | x | | x |
| [2022-05-12](#2022-05-12) | | | x | x | | x |
| [2022-04-28](#2022-04-28) | x | x | x | x | x | x |
| [2022-04-14](#2022-04-14) | | x | x | x | | x |
| [2022-04-07](#2022-04-07) | | x | x | x | | x |
| [2022-03-17](#2022-03-17) | x | x | x | | | x |
| [2022-03-03](#2022-03-03) | | | x | | x | x |
| [2022-02-16](#2022-02-16) | x | x | x | x | x | x |
| [2022-02-02](#2022-02-02) | | x | x | x | x | |
| [2021-11-30] | x | x | x | x | x | |
| [2021-09-27] | x | | x | x | x | |
| [2021-08-30] | x | x | x | x | x | |

## 2024-04-11

- SV: Please respond to the poll here: https://whenisgood.net/tswj9kd
- Zarr-Python V3.0 Release schedule: https://github.com/zarr-developers/zarr-python/issues/1777
- JM: EOSS6 feedback and follow ups (F7, etc)
    - CZI
        - JM: no feedback (as usual)
        - JK: They had said they were going to distribute to the other funders.
        - RA: just carry on with confidence.
    - NASA
        - SV: ... POWER ...
        - RA: POWER is 1% of Zarr use. Not the mainstream usage.
        - SV: sample application to look at?
        - RA: community by itself is not a strong sell.
    - Features
        - RA: is NGFF multiscale standardized?
        - RA: how do we re-use the existing viewers?
        - JM: https://link.springer.com/article/10.1007/s00418-023-02209-1
        - RA: nice story about bringing it to the geo community
        - RA: convention, but nice to have an object via extension mechanism
- JM: CZI Open Science meeting in Boston this May (Anything on SciPy, other meetings?)
- Anything at a high-level to discuss on the v3 front?
- RA: ZEP0 refactor status
    - Anything we can do?
    - SV: waiting on feedback. Only a few comments. If that's it, we can go ahead and merge it.
    - RA: :+1: for just going ahead. People who care have responded.
    - SV: sending a final reminder to the ZSC, then will merge. A few tasks remain, like getting rid of the ZEP website, so that things work.

## 2024-03-14

- JM: No word from CZI on EOSS6
- RA: [revisions to ZEP0](https://github.com/zarr-developers/zeps/pull/59)
    - SV: big changes
        - no clear plan for extensions previously.
        - simplified for codecs, etc.
        - combination of repos (see [prototype](https://docs-test-sanket.readthedocs.io/en/latest/))
        - discussion was previously across multiple issues.
        - goal of having more engagement from the ZIC
    - SV: see also [flowchart](https://zeps--59.org.readthedocs.build/en/59/active/ZEP0000.html#how-does-a-zep-become-accepted)
    - RA: good to hear
    - AM: good to hear a summary of RA's opinion
    - RA: like where it's trending. have been on different ends of a spectrum. have wanted one PR/issue for everything on a ZEP. merged gives me the outcome. on lean, how do we decide what's lean?
    - SV: v4, variable chunking, sharding 2.0 would be non-lean.
    - SV: if no implementation for a significant ZEP, then we wouldn't be in a great place.
    - AM: walkthrough of adding a new ZEP?
        - SV: open a PR against zarr-specs with two files
        - author decides how much reading time there is for getting comments.
          up to a month
        - when there are no further controversial issues, then move from the reading phase to the implementation phase
        - there needs to be an implementation at this point, including from someone other than the author, implying they need to gain traction within the community beforehand
        - then move to the voting phase
        - voting options: yes, endorse, abstain, veto
    - AM/RA/JM: easy enough to use CI to deploy PRs on the website
    - AM: sounds like best of both worlds
    - JK: if we have a successful process, then hopefully we'll have a lot of ZEPs and will need an overview
    - RA: no org does this better than conda-forge (see commands)
    - AM: livesite built from main along with PRs with some labeling for sanity checking
- RA: https://earthmover.io/blog/cloud-native-dataloader
- _RA leaves at half-past_
- AM: on lean ZEPs
    - avoiding sticking points
    - JM: keep a list of the easy/lean things, and those can be added
    - JK: if codecs are the easy path, will everything become a codec? other easy paths?
    - AM: anything that is an extension could be lean
    - JM: and positively that encourages us to expand the extensibility framework
    - AM: "if you can reference the v3 spec pointing at something with extensibility"
    - AM: does this let us collapse the types of specs? some conflation since lean could introduce small improvements or corrections.
    - SV: does that mean that extensions like sharding don't need an implementation?
    - JK: think even extensions need implementations (codec example)
- SV: John mentioned small development grants
    - Someone submitted adding a store to Zarrita (Cornelius)
    - A Graham something signed off on this.

## 2024-02-15

- Hackathon
    - RA: about half showed up
    - SV: lightning talks are good. hybrid worked well
- Zarr-python
    - JM: movement on v2 so that's good. looking forward to v3.
    - RA: need to decide how much backwards compatibility
        - edges. Store API, utility functions, etc.
        - not far enough to make a list of those questions
        - push is to get v3 feature complete (async :tada:)
- RA: what is https://github.com/zarr-developers/steering-council?
    - JM: unsure. i'll take care of that.
- Open Collective platform (JM)
- RA
- RW: job searching (health insurance!)
    - possible position tiledb. last call before an offer yesterday.
    - RA: don't have to resign just because you take a job there.
    - JM: seconded.
    - https://github.com/single-cell-data/TileDB-SOMA
    - RW: CZI & cellxgene background story
    - RA: company culture?
    - RW: barely used the product
        - remote first emphasis
        - prominent open source people to work on the stack
        - raised Series B (40M)
    - RA: hired Sean G. from the geospatial world (rasterio maintainer)
    - RW: mentioned by Charles Stern (sp?)
- Chat platform decision (SV)
    - JK: decisions on github. don't care where chats happen. one value should be transparency.
    - RA: allergic to proliferation of platforms. default is stick to github. see need for realtime chat.
        - zulip overlaps with github discussions. between chat and forum.
        - cool kiddies are on discord. people have accounts there.
        - no one I know uses it. hesitant that no one will come.
    - SV: use discord - it's good. but:
        - see https://alexn.org/blog/2022/04/09/scala-gitter-discord-mistake/
        - (summary: move from gitter to discord was the worst decision)
        - no google, etc.
        - discord policy that they don't plan to be open
        - refactor team is not liking discord, clunky
    - RA: inclined to trust SV's recommendation.
    - SV: agree with making/recording decisions on GitHub
    - refactoring meeting question from nvidia about "where to ask simple question?"
- numcodecs
    - JK: past month and a half have been tough.
    - RA: anything we can do to support you. not really.
    - JM: v2 / v3 explanation
        - https://github.com/zarr-developers/zarr-python/pull/1588
    - RA: create a shim?
    - RA: very stable, excellent open source package.
        - how many people are using it without zarr?
        - JK: in conda-forge, at least 3 that are numcodecs only.
        - minutes: https://hackmd.io/Rbh6oae8S7mNU-CPWWwPaw
    - RA: bus factor with numcodecs
    - RA: best thing is the packaging of all codecs into one wheel
    - JK: hoping to have more time in March.
    - SV: can provide a summary for the past 4-5 weeks. JK: send as an email
- Other offline duties
    - RA: refactoring ZEP process?
        - SV: read comments? Yes.
        - community/zep meetings: people like the idea of phases
        - trying to merge the two proposals
        - everything in one repo, merged PRs is "accepted"
        - Action for RA is to engage with that.
    - tiledb -> log ZEP?

## 2024-01-18

ZSC tickets (Josh)

* Any general thoughts on the tickets?
    * RA: didn't engage too much, but still like the idea.
    * JM: like Alistair's resolution style. Thank you.
    * AM: some of these are annoyingly large. needs narrowing down.
* Large and needs splitting
    * ZSC#1 - "Process for specification ZEPs"
        * RA: simplifying discussions, etc.
            - https://github.com/zarr-developers/zeps/issues/55
            - has explicit list of desires:
                - one repo for all discussions
                - one PR for every extension or ZEP
                - when PR is merged then it's over
            - ZEP website is another thing to maintain, more friction
        * RA: Allowed to update a ZEP? New ZEP? (There's no versioning)
            - Or go out of bounds and just do things.
        * AM: agree that technical part of how to contribute could be simpler so agree there.
            - also social process of making decisions could be simpler and fit with the community process.
            - to move that forward, one of the ZSC should open a ZEP to propose that
            - SV: discussion came just after sharding. discussions with JMS, NR, JM
                - there was a proposal of phases
                - an issue is the need to ping people to get their votes
                - reading phase, implementing phase, voting phase
                - ZEP1 was finalized before being implemented in python
        * RA: where does the process come from?
            - SV: phases comes from the ZEP meetings.
            - ZEP0 comes from NEP+PEP+STAC
            - RA: doesn't feel like this is how software development works
                - successful open source software moves forward from suggestions that drive forward
                - once there's consensus, then it's merged.
                - want to go farther in simplifying it.
                - remove bureaucracy around ZEPs
            - SV: agree about simpler, but there should be some yes/no way to show interest
            - JM: GitHub isn't built for specs (see NGFF pains)
            - RA: haven't been overly successful, so don't feel overly wedded to it.
            - RA: care a lot more about the mechanics than what we want to call it.
                - don't like the idea of more tracks. want to take away from the process.
                - e.g., don't like the fast track zep
        * AM: from POV, ZSC trying to solve this
            - agree that we want to make a change
            - but it's a tricky problem
            - SWE won't fit perfectly
            - not enough time to work through it perfectly
            - RESOLVE: to make some changes, nominate someone to find the solution and they will consult with key people (RA, JBMS).
            - RA: feel that they were doing that on 55, but SV says "working on something else"
            - SV: not public yet, but discussed in the ZEP meetings.
            - AM: supportive of 55 but feel there are details that need to get worked out
                - articulate those? use ZEPs for that?
            - RA: agreed that it's not ready to implement. Use this issue.
            - AM: feels like there's not time here, so finding the way to continue the conversation. working to consensus? Could SV surface a few more details.
            - RA: we're not communicating well
            - AM: ask SV to surface all of his thinking. looking for something as healthy as well-known processes where people are really engaged.
                - we don't see that kind of activity yet
            - RA: danger in modeling on a larger and more active community
                - that's why we need fewer steps, more readily understandable
* Issues where some RESOLUTION has already been ACTIONed:
    * ZSC#3 - "ZSC meetings".
      ACTIONed: calendar entry now monthly (and public)
* Likely no discussion needed, RESOLUTIONs need thumbs up & ACTIONs:
    * ZSC#2 - "ZSC votes on ZEPs".
        * (Josh Q: RESOLUTIONs aren't in the description so we vote on the proposal?)
        * Proposed action: "Update ZEP0"
        * RA: who votes on a ZEP is important and we should think about it a lot (and also that what we come up with won't be correct)
            - cf. STAC more like a GitHub. Maintainers can approve PRs. Anyone can be nominated. Steering committee can step in when consensus is not reachable.
            - No specific proposal on how.
            - JM: emotional response to "no proposal"
            - RA: don't feel like I have support.
            - RA: main message is we should be looking at people
            - AM: how can ZSC work on an effective way of working? clear what we're trying to solve and what opinion everyone has. could work to get into a rhythm. get into habit of specific questions, are there any draft resolutions. what does everyone think about it, etc.
                - also: agree in general.
        * RA: back to the question, ZIC isn't an equal representation (e.g., only 1/3 engaged)
            - alternative is to have some other body. stakeholders who should have a voice.
            - open body of maintainers who get there via engagement and nomination.
            - could also be bigger steering council
        * AM: what's the proposal?
            - RA: actually want the ZSC being more responsive
        * Tabled?
* Likely no discussion needed, RESOLUTIONs need thumbs up & ACTIONs:
    * ZSC#6 - "Extension mechanism"
        * once voted on, Alistair will post (to zarr-specs?)
    * ZSC#5 - "ZIC membership"
        * (Josh: This one may require more.)
* RESOLUTIONs & ACTIONs need updating:
    * ZSC#4 - "Zarr-python decision making"

## 2023-12-21

Cancelled. Moving to monthly meetings. Next meeting on January 18th.

## 2023-12-07

Meeting notes in a separate document

## 2023-11-23

Cancelled because of Thanksgiving!
🦃

## 2023-11-09

*Josh will likely need to leave after 30 minutes but might be able to talk from the car*

- Keeping SC in sync (Sanket)
    - Shared calendar for SC for sharing important updates (JK: +1)
    - Sending bi-weekly/monthly emails to SC
    - e.g. difficulty getting ZEP0002 vote
    - JK: in conda-forge, votes (private) are emailed, pinging on matrix
    - JK: slow since trying to figure out HDF support. didn't want to vote until late.
- SV: ZarrCon
    - JK: Janelia would be a good venue, also Allen Institute
- Other synchronizations (Josh) - TABLED
    - ZIC members (see Jeremy's comment about v2/v3)
    - zarr-python core devs (e.g. [roadmap](https://hackmd.io/0DVKP6d9QI-VaHc0zvOuxw))
        - JK: A lil' worried about async functionality
    - JK: updates to the steering council

## 2023-10-26

- SV: Catching up with R.A.
    - Previous weeks' updates
    - Vote for ZEP0002
- JK: Improve the compression for chunking
    - Get that to work in parallel
    - RA: Aligned with what's going on in B&P group - recommendation would be to go to the meeting
    - RA: Strong favour to write it all in Rust and bind it to Python
    - JK: We can also try to reuse Blosc2 - these libraries already have logic to do what we want
    - JK: Not tracking buffers is not good for performance - working on the buffer case from the spec side could also help
- RA: Zarr-Python Refactor
    - Lot of uncertainty
    - Tempting to throw away Zarr-Python
    - Want to write it from scratch using high performance language and then bind it using Python
    - Kerchunk has changed a lot how I look at Zarr
    - https://github.com/gauteh/hidefix
- Scientific Python [SPEC0000](https://scientific-python.org/specs/spec-0000/) endorsement
    - SV to ask what ZSC thinks of it
    - Ask Jarrod and Stéfan what's the exact window for dropping Python 3.9
- Zarr-Python Benchmarking and Performance meeting rescheduled
- ZEP0002 Voting

## 2023-10-12

- JK: meeting slots
    - alternating with benchmarking. i.e. ZEP & benchmarking overlap less than benchmarking and ZSC.
    - Sanket is conflicted
    - JK: try 30/15 minute offsets?
    - JK: shorten to 30 minutes?
    - JM: provide concentrated status during ZSC?
    - SV: push notes like with ZSC & community-calls?
    - JK: gave up data-api meeting.
    - SV to ask Jack to push back 30 minutes.
    - SV to check in with Ryan Abernathey
- JK: asked nvidia about zarr
    - recently agreed and now trying to figure out what to focus on
    - met with Mads about kvikio and what needs working on.
    - likely to work on batch work. i.e. performance issues. (compression, read at once)
    - trying to add that logic to numcodecs / zarr. perhaps code paths that are friendly to blosc2, but starting to go down that path.
    - JM: can the groups help John with anything
    - JK: interesting how they are broken down. related in my mind.
    - JK: perhaps could write up thinking in issues
- JK: numcodecs release
    - tried to kick that off
    - bumpy because of release notes, etc. (known issue)
    - now done.
    - there was a bug in the importlib code path.
        - https://github.com/zarr-developers/numcodecs/pull/475
        - hard to unregister
        - ok to have
    - JM: mac build failing
        - JK: sporadically. AVX instructions.
    - JK: ok to drop 3.8? JM: think so.
    - JM -> SV: something to blog about
- Endorsing [SPEC0000](https://scientific-python.org/specs/spec-0000/) and [SPEC0004](https://scientific-python.org/specs/spec-0004/)
    - Generally yes
    - Some concern about whether or not we must implement
- ZEP0002 voting
    - Ping from Norman. Can ZSC vote.
- EOSS6 LOI (JM/SV)
- NSF/Unidata (JM)
- Wikipedia: NEXT TIME

## 2023-09-28

- JM and SV joined for the first 10 mins. and then switched to Zarr-Python Benchmarking and Performance meeting

## 2023-09-14

- Updates from NF Summit 2023 by SV

## 2023-08-31

- Cricket blah blah blah :smile: 🏏
    - https://en.wikipedia.org/wiki/Narendra_Modi_Stadium
- Zarr-python
    - Working groups
        - SV: blog post to get more contributors
        - RA: don't want to be mentoring new people. (ship stuff)
        - SV: Joe opened to try to get new people.
        - RA: will discuss with him.
          His choice. (They should write those blog posts)
        - RA: don't want to announce plans, but outcomes. ("50% faster", "integrates with pytorch")
            - also when inviting feedback.
        - SV: trying to keep transparency (two-way communication)
        - RA: they can go to GitHub
        - AM: agree that what will make or break is that the implementations are good and that people use them.
            - the openness / community element is why (all other things being equal) someone would choose it
        - RA: agree. don't think the WG work is ready. it's risky. we're trying something. trying to jump start engagement by shipping exciting things. Wasn't a good look to have V3 sitting there for 2-3 years.
        - AM: point taken.
        - cf. https://jack-kelly.com/blog/2023-07-28-speeding-up-zarr
        - **RA: for things to spend time on get more involved in the development. review PRs, etc. good growth area.**
- Chat platform
    - JM: dislike Discussions
    - RA: too many locations also
        - not big enough for own discourse
        - use pangeo & image.sc?
    - JM: chunked format location?
    - SV: blosc would join
    - **Plan: try GH Discussions more.**

## 2023-08-17

- Discussion place for ZEP0004
    - R.A. will create a PR in zarr-specs
- Zarr-Python refactor Kick-off
    - Recap of the meeting held yesterday
- NumFOCUS Summit 2023
    - If no one can make it, then we let it be
- HDF5 Email response
    - https://hackmd.io/@MSanKeys963/rkJ_XIl33
    - RA: Seems fine
    - CC Mark Kittisopikul (kittisopikulm@janelia.hhmi.org) as well
    - RA: Too late to modify the existing sharding spec
        - Looking for another version of sharding?
    - SV to send an email to discuss the versioning of extensions
- RA: Geo-Zarr OGC working group kick-off meeting: https://portal.ogc.org/files/?artifact_id=105667&version=1
    - To develop an official OGC standard
    - Already testing GeoZarr - but formalise it as a standard
    - Outcome: Geospatial software supporting GeoZarr
    - Hope to converge via OGC
    - ZEP0004 is required for GeoZarr

## 2023-08-03

- Ryan A. and Ryan W.
  met last week in New York
- Zarr-Sparse array support - discussion b/w Ryan A. and Ryan W.
- Ryan A. closed seed round just recently for Earthmover.io
- John K. will be coming to NumFOCUS Summit on behalf of conda
- Ryan W. is coming up for air and jumping back to OS
- SciPy sparse array support: https://github.com/scipy/scipy/releases/tag/v1.11.0
- Zarr-Python Scientific Python involvement: https://github.com/scientific-python/specs/pull/254
- Fundraising efforts for Zarr with NumFOCUS - email thread with Nolan from NumFOCUS
- Sanket V. to send an email to ZSC regarding the summit participation
    - Asking to invite Jonathan to join him

## 2023-07-20

- Zarr-python effort (RA)
    - asked: use cases, challenges, available effort
    - nvidia has the most FTEs. ("20hrs a week")
    - i.e. others are interested in becoming contributors
    - identified two teams: performance/benchmarking and tech debt/code quality
    - no volunteers beyond tentatively jack for benchmarking
    - what's next?
        - Janelia to dedicate 50% FTE role for Zarr - maybe Mark K.
        - RA's suggestion: Bi-weekly sprints and retro after sprints
        - Get the time commitment upfront
    - need a project board / manager?
        - Sanket for that role
        - SA: need some training on expectations
        - see https://github.com/zarr-developers/community/issues/54
- NumFOCUS Summit (Sanket)
    - Sept. 11-13 Europe.
    - SA can go with his shiny visa.
    - Is Alistair available?
    - Jonathan, Norman?
- Josh: good discussions with companies here in California
- Sanket: Argonne & HDF5
    - RA: spun out of Argonne. strong buy-in
    - avoid positioning as competitor, "meta-format"
    - more like tensorstore, layering cloud-native on existing
    - shards as valid HDF5 files. great architecture.
- Sanket: discussions with Francesc
    - joining blosc2 and sharding
    - sharding on top of blosc2
    - RA: there will likely be several backends
- JM: HDF "cloud-optimized HDF" at meeting
- RA: tiledb as unique in this space
- Josh: funding
    - OSSci

## 2023-07-06

- Hackmd getting slow.
    - Splitting document
    - Migrating notes to webpage
- zarr-python mtg
    - JK: conflict
    - SV: discussing GPU use cases
    - JK: chatted with Joe. started with get_items. rust store? post retrieval you want to async the decompression. works better if you can use threads.
    - SV: where to draw the boundaries?
- scipy
    - plenary session?
    - last time: https://docs.google.com/presentation/d/1oBLAiQQTvtqP11D4is0qYXc1adMOc55iHvjJIu9KEGs/edit
        - https://www.youtube.com/watch?v=XiW1y18eMso
        - Zarr: https://www.youtube.com/watch?v=TirPmOyHrmA&t=911s
    - josh:
        - ZEP1 accepted. who's working. simple table?
        - ZEP2 etc.
        - conventions / datasets
    - sanket: what do others do? just updates.
- funding sanket's position
    - have at least 6 months so some time
    - but should still start thinking about what strategy we're going to follow
- zarr office hours (Sanket)
    - Oregon professor of h/w security
        - research paper this year for http://www.hostsymposium.org/
    - Novartis, Bayer, ThermoFisher using ...
        - https://github.com/zarr-developers/community/issues/19#issuecomment-1622653262
        - John: https://github.com/ap--
        - see https://www.fiercepharma.com/pharma/top-20-pharma-companies-2022-revenue#64199168-d3e3-44ea-8bd6-c4c17f896b38
    - JK: Nvidia gave money to conda-forge and dask
        - contributing via numfocus? Josh: then we should spend some time on the numfocus representation
    - SV: Jonathan is up for spending time on the website

## 2023-06-22

- Revisit the ZSC expansion discussion
    - Ryan A: Email to Jeremy and Jonathan to gauge interest in participation
    - Sanket volunteers himself to join SC
        - Discussion of whether this is appropriate given he's paid by the project--consensus is yes
        - Will Sanket stay involved if his funding runs out? YES
    - Discussion of sustaining funding for Sanket?
        - Current grant goes through October, will get a NCE
        - CZI EOSS 6 happening soon
        - Sanket suggested a German grant
        - Could consider fundraising more directly via
- Zarr blog post
    - See https://hackmd.io/@MSanKeys963/H1ukZ98Dh
    - Discussion of benchmarking
        - https://github.com/scalableminds/zarrita/blob/async/BENCHMARK.md
    - Long discussion here - consensus was to:
        - Shorten it
        - Make the audience and message clear - this is not a software release, the point is to celebrate our openness and collaborative process
        - Don't say anything negative about zarr and performance; instead just say "work on completing the implementation is ongoing"
- Roadmap for the V3 implementation
    - Discussion of current gaps in implementation and performance
    - Who is going to do the work?
    - Alistair's suggestion: look for where is the most healthy / active zarr implementation

## 2023-06-08

- SV part of [SPEC steering committee](https://scientific-python.org/specs/steering-committee/) and plan to add Zarr as one of the [core projects](https://scientific-python.org/specs/core-projects/). Thoughts?
    - AM: Thinks it's a good idea - when you start the project you kind of copy stuff but SP is a good resource
    - JK: Seen a gap when working with MATLAB and Python - kind of going back and forth - but SP is helping to make a cohesive ecosystem
    - SV: Resonate with the idea and motive at SP ecosystem - think we can help them and also get benefitted by them
- Adding Jeremy and Jonathan to the SC
    - SV: :thumbsup: for Jonathan
    - AM: Jonathan is good! +1 for him
        - Jeremy helped us a lot during the ZEP1 phase
        - Would defer Jeremy's inclusion to others in SC
    - JK:
        - +1 for Jonathan
        - Jeremy has been working really hard on SPEC
        - He has gone through a phase where you have a lot of responsibility in open-source project - nice to see his evolution of thinking and handling everything
    - SV: What's the process of adding members to the SC?
        - AM: The [process](https://github.com/zarr-developers/governance/blob/main/GOVERNANCE.md#zarr-steering-council) to nominate a SC member
    - AM: Happy to nominate Jonathan 🎉
    - JK: Happy to nominate Jeremy 🎉
    - **AM and JK will send out nomination emails to ZSC** 🙏🏻
- Zarr Python's V3 implementation
    - SV: There's community work that needs to be taken care of if we go with Zarrita
    - JK: Issue to keep track of the sync: https://github.com/zarr-developers/zarr-python/issues/1290
    - AM: Ryan's issues - performance regression - where are the performance issues and how do we address those?
        - Zarr-Python has been a good experiment place to do new work and push boundaries
        - Zarrita could be a new place to work fresh - but AM also agrees with SV's point
    - AM: Adding V2 and V3 support to Zarr-Python makes it complex and maybe difficult for new developers to come and add more features to it
    - SV: Josh's point of motivation - it needs to come from within
    - SV: Zarrita is currently maintained by scalableminds - what will happen when the funding runs out?
        - Are they still going to maintain Zarrita in the long run?
    - AM: Who are the key individuals up for the work?
        - The question might be right for the Zarr-Python core developers?
    - AM: Is it about the motivation? Or funding?
        - SV: Both of them!
    - AM: Need to find lead developers for Python implementation, and help them in any way we can!
    - SV: How about SC for the work?
        - AM: Ryan is interested!
            - Unfortunately, I won't be able to
    - JK: Zarrita is playground for Zarr, right?
        - AM: Right! Significant code is copied from Zarr-Python
    - JK: Did you re-use the store logic?
        - AM: Yeah!
            - The protocol layer is different - there were a lot of roundtrips while creating a hierarchy - so rewrote it to save roundtrips - by splitting metadata and data
        - AM: But the final version of the V3 has gone back to how we did in V2 - deliberately split metadata and data file in V3
    - JK: So you think we're back at the square one?
        - AM: Bit of a shame we went back, but I understand there must be reasons to do it - and also I have missed a large part of discussion in between
        - AM: Which may mean that Zarr-Python current V2 code could be closer to V3 spec hence easing the work for us
        - SV: More of a reason to fix Zarr-Python

## 2023-05-25

- earthmover
    - JM: pointed someone to earthmover for metadata versioning.
    - RA: talked to Janelia folks. need to hire someone. (doing great with weather)
    - OSS versioning system? (eventually, once compute is there)
    - can't live without versioning. tx-capability is great. propagating atomic updates!
    - JM: linking RA & CodeOcean (via Matt McCormick). sure. no files!
- zarr-python
    - RA: pending action item to mobilize py devs for v3
        - perspectives on v3, sharding, tech. debt, etc.
        - what's the right path forward
        - also native support for sharding
    - RA: delegating to zarrita from zarr-python? JM: plugin interfaces are going to be tricky
    - AM: right architecture. "ChunkGridManager". Not there yet. Needs exploratory work. Good to have a clean slate for that.
    - RA: can virtualize v2 as v3 (i.e. v2 is forward compatible)
        - can use pydantic
        - JM: agreed, helps with migration cost but that's still there
    - RA: if you can "import zarrita as zarr"
    - ...sidebar about the root metadata ...
    - AM: defining clear goals (with real datasets): performance, backwards compatibility, cleanliness
    - RA: two motivations are extensibility (extension-points) & interoperability (no-python-specifics)
        - Sharding!
        - AM: Decentralized versioning!
    - JM: defining large blocks (ChunkGridManager, Sharding, ..)
    - AM: still need (big) goals and designs/(higher-level) roadmap. state them clearly. (wasn't specific enough when v3 was started)
    - AM: cf. new python type system where new person came in
    - Blocks
        - zarrita completeness (how is this defined)
        - zarr3 compatibility (running all existing tests)
        - sharding backend
        - ...
    - SV: someone asking about sparse (anndata user)
        - Discussed with Isaac
        - Needs cleaning in zarr-python (separating stores, removing spaghetti code)
    - JM: intrinsic motivation is needed
    - SV: pushing for zarr as a scientific python package. community effort. have funding.
    - JM: question of governance of zarr-python (should define now) to make these decisions. AM: little council.

## 2023-05-11

- Topics: SciPy 2023, v3 finalization (also Python changes?), ...
- Trademark
    - Josh talked to Nicole about providing a royalty-free license for OME-Zarr, GeoZarr, nczarr
- Swag on the website
    - Debugging duck! plushy
- ZEP0001
    - AM: ZIC on 227 sounds great. (read comments)
        - Huge congrats
        - Strong mandate
    - Need vote from steering council
        - SV: two missing votes (Brochart and Saalfeld)
        - JM
        - AM: held off since addressed to the ZIC.
        - SV: good to have ZSC on that issue
    - JM: then blog etc. :tada:
- SciPy
    - Swag for SciPy
    - AM: sadly not joining (haven't been since 2019)
    - JK: won't be attending unfortunately either.
        - SV: JK listed as a speaker.
        - JK: already backed out of a previous talk.
    - JM: RA's comments on performance
        - AM: v2 should be unchanged; v3 should be able to be faster
        - open question of move to zarrita, strip out v2 code paths, etc.
        - where to invest time (in the remaining next 2 months)
        - SV: scalableminds/zarrita is the most up-to-date v3
        - AM: zarrita API is structurally different
            - introduces Hierarchy; matches the v3 spec better
            - consider it _better_
- Funding sidebar
    - STF
    - CZI
    - etc.

## 2023-04-27

- Josh: lots of meetings and talking and papers
- Ryan: earthmover close to move focus to upstream OSS dev
    - product is like a store, arraylake rather than s3 (version control, web UI)
    - crazy performance issues in zarr-python (post refactoring in v3?)
    - refactoring, coordinating, focusing now that v3 is on the path to getting out
    - restart from zarrita?
- What are the main performance improvements needed: - Hierarchy traversal -- listing groups, opening children, etc. - Array and group creation -- need a fast path for creating many things at once - Sharding - What do we want to show off at SciPy? - **outcomes** of the very slow, painful Zarr refactor - Performance - 10x faster for cases x, y, z - Sharding is part of this - Also "low latency stores" - Extensibility - A concrete example of an extension that has been _implemented_ - Awkward arrays? - Funny dtypes? - Josh: Synchrotron presentation - They're invested in HDF5 - But they were excited by how simple it was to implement a new codec - ThermoFisher - Who ## 2023-04-13 - Ryan was here. No one else showed up. Left after 5 minutes. - John was 5 mins late. So unfortunately missed Ryan :( Then also left. - **Then after 15 mins** → Ryan, Sanket and John synced via email and got together. Here's what we discussed: - Discussed Zarr releases → 2.15.0 - Ryan wanted to get a release out soon and include Mads' PR [#1131](https://github.com/zarr-developers/zarr-python/pull/1131) - Ryan merged the PR - John merged [#1378](https://github.com/zarr-developers/zarr-python/pull/1378) to update the release notes - Joe's PR [#1383](https://github.com/zarr-developers/zarr-python/pull/1383) was failing due to `codecov`; John added [#1391](https://github.com/zarr-developers/zarr-python/pull/1391) to fix it - Sanket finally added [#1392](https://github.com/zarr-developers/zarr-python/pull/1392) to update the release notes again - John suggested that we should go for 2.15.0a1 since we merged [#1131](https://github.com/zarr-developers/zarr-python/pull/1131) which may need some testing before pushing into the mainline; Sanket agreed - Finally John suggested we should wait until Josh gets back so that we can make an informed decision for the upcoming release **FYIs:** - Sanket is in Berlin, Germany to present Zarr @ PyCon Germany 2023; check
[here](https://pretalx.com/pyconde-pydata-berlin-2023/talk/JY3R3Z/) 📣 - John is taking an online management course from Stanford and he's learning a lot! 🎓 Tabled - Norman is asking for a seat at the ZIC (Webknossos) ## 2023-03-30 Tabled (as no one showed up) - Norman is asking for a seat at the ZIC (Webknossos) ## 2023-03-16 - Google (Ryan talked to Hoyer) - optimized loading zarr into tensorflow - releasing tensorstore backend for xarray - Abstractions - Ryan put something very well. ("design document") - ZEP1 - Adding Norman as author vs. motivating Jeremy? - Other options? - Steering - Does funding eliminate? Managing conflicts of interest (e.g. recusing on funding decisions) - Note: council == org. admin rights - Now or after ZEP1 (re: veto rights) - Ask about vision for the project? (e.g. stability vs. shipping code) - RA: What should the ZSC be doing? - Roadmap - After V3: Sharding, Conventions, features, performance (on these dimensions), integrate with these libraries - Helps motivate contributors, stakeholders (incl. time/money contributions / grants and sponsorships for more funding) - JK: useful for companies so they can see what will be done with their resources - i.e. deal with uncertainty - RA: committed to doing that work. "Why does Zarr exist?" - JK: conda proposals are pre-agreed by core - RA: pro-active ZSC tasks - JK: roadmap in public and see who shows up - RA: yeah, e.g. NASA! (but think about the criteria) - RA: cf. Apache system, try to copy it. - ZarrCon - remote vs. in-person (possibly 2) - few 10K USD donations - work on the roadmap and other strategic work - more or less a week - RA: will write emails about gauging interest - TODOS: - add ZarrCon prep issues - add issue for the roadmap (and/or vision) - Misc - logos & code of conduct: merged :rocket: - John and Sanket plan the scientific python meeting ## 2023-03-10 (impromptu) - JM/JK discuss: - scientific python (funding!)
- ZEP1 and the write abstractions - community management in general ## 2023-03-02 - SV/JM discuss scientific python meeting ## 2023-02-16 Cancelled due to Special ZEP Meeting. See: https://github.com/zarr-developers/zeps/issues/29 ## 2023-02-02 Cancelled due to Special ZEP Meeting. See: https://github.com/zarr-developers/zeps/issues/29 ## 2023-01-19 - GeoZarr: lots of people, organizational meeting (RA) - OGC: Zarr accepted. Big question: use their process or make up own - Point of ZEP4. SV: merge? No, need to respond and make example. - Meeting w/ NASA POWER group - SV: email to ZSC - Data in Zarr on AWS. real-time. - We're looking for examples. (RA: just not comprehensive list) - RA: Why we are meeting with these people? Why don't they come to our meetings? - SV: reached out to them via a mutual contact - RA: energy community is interested in POWER dataset - RA: want to connect this group with other people at NASA - SV: Jennifer Wei - JM: logo - RA: strategic engagement (concretely, ) - https://science.gsfc.nasa.gov/sed/bio/jennifer.c.wei - SV: they are looking for a helping hand. - RA: support has a cost. we have trackers / meetings - don't answer things one-on-one. - no personal support (without payment) - Josh: how do we structure our communication? - RA: clear policy on how ZSC and therefore comm-mgr with - ... large government industries - ... - they may think: - it's a company - they can contract us - we should - have our goals clear - not be re-active - know how to choose which ones (+1 on logos) - i.e. recognize what we're doing. That's OGC (NASA is a member) - OGC is the international body. - goals - NASA: funding - ... - RA: trying to protect us - RA: too high-level goals - logo: Columbia, NASA, Google, Microsoft, ESA ... - gallery: examples of zarr usage. (RA PR) - DOMAIN - PROJECT - EXAMPLE - POWER ... - OME ... - - SciPy: - https://twitter.com/SciPyConf/status/1606319038258462725 - start tweeting? 
- SV: workshop - JK/JM: trying to attend - RA: 50% - but ZSC high-level talk on what's happening. - not technical. but the community process. - BDFL --> community. Hiccups. (Stakeholders) - Good story for SciPy. - Interesting story to tell. - July: V3 out. Sharding out. - We'll be full circle. - RA: to help write talk. - open source governance is hard - this is harder. specification and multiple implementations. - RA: tutorial as well. - people interact differently. - but always give tutorial with xarray (cloud-native / ARD) - Matt McCormick - "cloud native big zarrs with xarray" (multi-domain) - what are the principles? - some from the HPC world - Ryan's example: https://gallery.pangeo.io/repos/pangeo-gallery/physical-oceanography/01_sea-surface-height.html - SV: pure zarr for 30 minutes? - compressors, chunking, v2/v3, sharding, ... - RA: agreed, but best if it helps with full workflow - zarr as part of the workflow - JK: doc fixes are perfect ![](https://i.imgur.com/okHWo4i.png) - holoviews side note - optimizing - caching (hard with dask) - "graph chain": https://github.com/radix-ai/graphchain - dask graph that works distributed - Weekly ZEP meetings - Cancel ZSC meetings until March - More later: - Logos - Migrate ZSC meeting to zarrdevelopers@gmail.com ## 2023-01-05 - Happy New Year! 🥂 - Josh: - Rolling updates - JM: trying to track /**outputs** - JK: scrum channel in slack for RAPIDS - per day (bullet point list) ... "bunch of meetings" - JK: depends on the purpose - SV: tracking process - Sidenote about the Histochemistry paper - JM: inviting OME-Zarr tool creators to update the community on the status of Zarr support in bioimaging - JK: kvikio is a store. Works with FSStore? No, but good question.
- with DirectoryStore you know you have local data - multiple reads / writes but agnostic of GPU - Zarr indexing funding - "federated" - JK: example from NumFOCUS people with many such FITS files - kerchunk turns that into a zarr - this would graduate that into a zarr spec - e.g. sunpy developer or astropy (Stuart?) - relationship to HDF5 - https://github.com/joshmoore/zarr-utils/issues - relationship to NetCDF?? - Tabled: Revisiting SC membership every January as per the [governance doc](https://github.com/zarr-developers/governance/blob/main/GOVERNANCE.md#zarr-steering-council) - https://github.com/joshmoore/zarr-utils/issues - Deadline February 22 - SV: propose talk and tutorial - JK: sprint? - see Sanket's notes. - JK: invite people as success stories - JK: BOF? Helps to get in touch with users. - or "Hallway track" (unconference) - 2021 chaired a track around imaging - almost always a geospatial track. - sometimes particle physics - NB: emeriti: https://conda-forge.org/docs/orga/governance.html#teams-roles ## 2022-12-22 - Talk about outreachy, hiring, Java, etc. - but then HOLIDAYS! ## 2022-12-08 - Moving forward with ZEP0001 (V3) - Implementations to consider apart from `zarr-python`? - Timeline? - Any major concerns? - JM: https://github.com/zarr-developers/zeps/discussions/24 went out - RA: if we go through with all the changes, V3 going back to V2 in many ways - JK: have a summary of the changes? to v3 - JM: 25% of Jonathan covered. would hope there will be capacity. - JK: chose to break things in V3, why is Jonathan interested in re-adding V2 compatibility - RA: Path is the big one re: tech. debt. That's most of the lines of the code. 
- https://github.com/zarr-developers/zarr-specs/issues/177 - Want us all to have an opinion - If Ryan joins - his benchmarks can be viewed [here](https://github.com/zarr-developers/zarr-specs/issues/177#issuecomment-1341808637) - (narrow) what are performance trade-offs - JS was defending the original V3 layout - don't think we need to support a non-recursive approach - JK: bigger picture of reverting or not - timeline issue - RAPIDS release early February. (plus new conda things) - RA: trying to get people engaged - building metastore for v3 in earthmover. good to force. - JM: e.g. the transpose codec raises after we said we're keeping F and C order - JK: added `F` for how it goes into the compressor - so maybe a codec is correct - RA: named dimensions solve a lot - JK: important for what you can traverse most efficiently - RA: a disagreement with JBMS. - Discord/Gitter (JK) - SV: Use a ZEP? Process since it's about how they engage ## 2022-11-24 - Some repeats for Alistair - JM: zarr-java good news - SV: ZEP0001/V3 chat - lots of excitement at CZI & NumFOCUS - frustration on Jeremy's part - Jonathan driving things - https://github.com/orgs/zarr-developers/projects/2 - ZSC available for discussions - AM: good to have JS & JBMS in the lead - hesitated posting the responses since wouldn't be able to follow them up - SV: sharding (ZEP2) will likely open up new contributors - Sanger? - lots of internal/external politics - stress of walking a very thin line - SV: sounds similar to running a conference - ease up if funded for multiple years - excited to get back to interesting sci/tech. - SV: switch funders (CZI?) very (biomedical) data focused. - potentially. Gates have been major partner for some time, and still expressing support.
- SV: https://twitter.com/KirstyGarson (CZI funding) - AM: doing lots of training (pydata stack) - Zarr Sprint and talk coming up next week at [PyData Global 2022](https://pydata.org/global2022/) - hopefully new contributors - Zarr signal/noise - email inbox - so much going on - AM: numfocus.slack.com - Governance - ZSC on longevity - ZIC on technical - community experience (OME, conda-forge, pydata) - struggled with v3 on _how to reach a solution_ - ZSC rotations? - perhaps less-daunting - good for CVs - fixed seats? - SV: CSCCE "growing" to "stable" community (also funded by CZI) - AM: have community engagement team of 4 people - SV: [Thinning zarr-python](https://github.com/zarr-developers/zarr-python/issues/1274) - Separate stores into new repos - AM: additional points - found it useful to run the tests on all stores - imagined eventually having specs for each store (even if "so obvious") - include: how to name tables, etc. - a ZEP for each? too heavy weight? (see example in original v3 PR) - https://zarr-specs.readthedocs.io/en/latest/stores/filesystem/v1.0.html - JM: work on zip store coming - Dennis' suggestion of splitting the tutorial - "non-normative"? "experimental"? - AM: "played around" - does the tutorial move to v3 only? probably. be aggressive. - health warning, table of version support. - JM: experimental templates moving forward? add `needs-purl: true` metadata? - AM: clearer examples (protocol extension) - different grid, metadata encoding (e.g. ProtoBuf) - JM: review each PR in terms of "does this need an extension?" - AM: if architecture is right, then someone can just plug in - AM: avoiding if/thens - *Tabled*: Two outreachies (will be managed partially on Discord) ## 2022-11-09 - **zarr-java** (Josh): Propose to start a new repository with a new maintainer list to develop a core Java library.
(Thumbs up) - RA: still good to share native stores - Joe & RA have written a new store backend at earthmover - some re-implementation of fsspec - async as all the rage - unclear what language for max. interop - ZEP0001 (Sanket) - mtgs with John, Josh, Alistair - hackmd prepared about all the issues - asked Alistair when those will be posted - feedback from him says he won't be actively leading the discussion - might need to select someone from ZSC or elsewhere to lead with jonathan (@Action) - open issues for all of the hackmd bullet point items - when all of those are handled, then we merge the ZEP1 PR - last ZEP meeting, Jonathan & Jeremy happy with how things are going - RA: some ambiguity about the process. author has withdrawn. don't need a ZSC "leader" (but a champion) - JM: Update on the contract: 25% position for the rest of the grant. - RA: been thinking about governance. meritocracy. decision making is proportional. demonstrated to contribution. doesn't matter _who_ you are (if you're paid, etc.) - think excluding him isn't doing us any good - currently ZSC is underpowered - RW: could always recuse himself - RW: could also drop off - JM: is anyone here stepping up? - JK: was asked, would be a bad cop (cut the cruft) - SV: will be leading on ZEP1 (mtgs, etc.) so don't mind to be an editor (most important: get work done ASAP) - RA: thinking of doing it, but struggling to engage (trying to free up more time). added lots of comments. have set self up as counter-point. good to have discussion. JK is more neutral. - JK: could see JM, regularly talking with JS - RA: how do we avoid scope creep? JK: I'd be tempted to go to bug fix only at this point. (would need to time box it) - RA: main feature are extensions. - PyData Global 2022 Sprints (Sanket): who is joining? - 1-3.December. SV to make the schedule. - JK: need to think about it? within a week please. - xarray sprint too... - RA: will think about it with Joe. on specific features. 
## 2022-10-27 ### Cancelled! ## 2022-10-13 * Meeting time (not-Mon-or-Thurs) ... til Nov. 3rd - RA: not Fridays (refreshingly open) - JK: Wed. works for a week, but then a UTC slot moves - TODO: Sanket to organize a doodle * Outreachy! Good tasks for starters? :warning: - JK: lower expectations. thrice with conda-forge - **Binder** for future Outreachies. - RA: replace something on the webpage? - Sure - https://pangeo-forge.org/catalog - https://idr.github.io/ome-ngff-samples/ - JM: more logos? - Yes. - JK: https://github.com/zarr-developers/zarr-python/issues/917 - JK: easier to load test data (see scikit-image, pangeo tutorial documentation) - https://docs.xarray.dev/en/stable/api.html?highlight=tutorial#tutorial - https://github.com/pydata/xarray-data - RA: more on cloud. Examples in s3, azure, gcs (ADVANCED) * ZEP0001 - From Alistair: https://hackmd.io/@alimanfoo/SJ1DNVRxo - Sanket to review - RA: looks great - JM: let's get it out * Java - test suites - RA: Rust? Sure if we find people. Have a dream. Stores in Rust? - cf. hard vs. language specific - JK: no rust for netcdf - JM: possibly for netcdf-c, but netcdf-java would probably be happy with it - RA: fsspec is fragile to be building on - performance, reliability, shareability - async ... - RA: trying to understand async in community. in fsspec it's bolted on - JK: Ahmet & his Java developers? Sanket to follow-up. * RA: Be more helpful? - JM: Review PRs? parallel access - RA: highest priority? - JK: https://github.com/zarr-developers/zarr-python/pull/1131 `getitems` (part of async problem) - RA: want to have meta_array SPARSE support (scipy.sparse). related to GPU - JK: "contexts" feels scary - RA: key problem in architecture - love that it's layered (storage/codec/array API) - but there are pathways that go all the way down (hard to optimize) - one critique of fsspec: not typed!
* JK: Kerchunk - talking about more usage - same problem at NumFOCUS summit - using many formats together (astronomy, medical imaging) - sharding is an implementation that is general. - could we move away from that and more like "read" - JM: have pitched this before - RA: "kerchunk manifest" (list of chunks) - email thread with Martin - Central to what EarthMover is building (Apache Iceberg) - Trying to create open data lake format specification - Metastore (doc db) with chunks in cloud - RA: "Kerchunk community"... - JM: need cross-language community - JK: keeping people engaged - carrot you get to decide * Extended ZSC conversation (JK/JM) ## 2022-09-29 * Summit post-chat * JK: saw Francesc. unconference time / BoFs * Formats discussion * Storage / nibabel (Matthew Brett) * astropy with FITS data are interested * nilearn... * Turned into a Q&A about Zarr * econark https://econ-ark.org/ (simulations) * most had some kind of format (designed for single file) * went towards kerchunk (especially astro folks) * constraint of archiving forever * apply to MRI? * one-on-one with Francesc * blosc2 / revisit for sharding * JM: and Python to libc library? * JK: don't need footer/header, but in JSON file? * how far from Jonathan's format? differences? * Also mentioned using Zarr for some of their IronArray work * i.e. maybe not misunderstanding but just miscommunicating * JM: library like numtransforms in parallel to numcodecs * JK: doing it all at the C implementation * Rust ... * libzarr * A little bit about Earthmover * API vs. Format * cross language * xarray support (and Java...) * NetCDF * own format * JK: numpy structured arrays for typed attributes (always a wart) * Base64 .... * Nvidia & CPU / Python stack. * Heterogeneous computing * Josh & AI/Benchmarking paper idea * https://github.com/microsoft/planetary-computer-containers/issues/51 * async * works for all coordinate requests * for setitems as well?
* thread pool * Talk to Greg and/or Jeremy * Get asked all the time about benchmarks - "which implementation to use?" - "how performant"? (tiledb, hdf5) - ... tiledb to look at spec? :red_circle: * Sanket :tada: * JK: good to see how he works a room * JM: need to look into more funding for him * JK: NumFOCUS funding/DEI etc. * write funding proposal / to a lobbyist for a long OSS grant * Michelle / TOPS / NASA. Trying to get govt involved. * how big can you go?? * core contributors numpy/scipy (other languages?) - collaborating with non-core (e.g. sunpy) - come up with a list of things they want to see - "here are our NumPy pain points" - "this will take 6 months" - aggregate that over many projects which describes the pieces for funding - hire grant writers for NSF - cutting down overhead - filling the middle spaces - being more proscriptive... * JM: report on NIH meeting * JK: writing for senators - costs if you don't do anything * https://scientific-python.org/ - TBD :red_circle: * JK: Chatting with Dario - getting more funders / sharing lessons learned - having a meeting on the East coast in a couple of weeks - "matching funds" - longer , bigger , more , ... - NumFOCUS over wine... * Sanket joins at 19:35 CEST - Talk to Jeremy - ok to do heavy lifting for ZEP0001 - gave him stickers - baby in November * Misc * Josh: https://github.com/zarr_dev is open ## 2022-09-15 * ZEP0001 - RA: talked to Alistair. responding with consensus "within a month" - Number of things that are changing - JM: discussion with Jeremy, etc. * ZEP for conventions (as extension) - https://cfconventions.org/ - https://github.com/ome/ngff/issues/84 - https://github.com/bogovicj/ngff/wiki/Transforms-notes,-examples,-proposals - composites of those? - xarray coordinates in Zarr (defined in cf/netcdf) - named dimensions + array with that dimension ("lat", "lon") - xarray turns those into usable coordinates - try to keep zarr simple since already have xarray - JM: don't have that in Java...
- RA: added package in Java on top - RA: stick to the HDF5 model (i.e. don't use the conventions) - JM: pyramids is still interesting... - RA: too many ways of generating them. convention? - RA: want to explore multi-chunking - JM: agreed. in my multiscales. * --> earthmover.io ## 2022-09-01 * Agenda: - Josh: zenodo updates (mostly for Alistair) - Looking to update all the zarr-python entries - Teams are coming - Owner swaps possible - Either need to transfer to use or get "teams" functionality - (replaces official zarr paper for the moment) - @@Josh to create account and send to Alistair - Josh: https://github.com/zarr-developers/zarr-illustrations-falk-2022 - Any other drawings? Let me know - Josh: zarr-python maintenance / releases - https://zarr.readthedocs.io/en/stable/contributing.html#release-procedure - More people? - Require releases.rst to be updated? - Perhaps. - Deal with conflicts manually now. - Eventually: https://github.com/conda/conda/blob/fdbedc49d4a61fb2121614de22795cfccc562b0b/docs/source/dev-guide/releasing.md - JK: Adding release action step ( https://github.com/zarr-developers/zarr-python/issues/1118 ) - Sanket to do releases as well

```
#!/usr/bin/env bash
# Create a GitHub release whose notes link to the matching docs section.
VERSION="$1"
# The docs anchor uses dashes instead of dots, e.g. 2.13.0 -> release-2-13-0
URL="https://zarr.readthedocs.io/en/stable/release.html#release-${VERSION//./-}"
exec gh release create "v${VERSION}" --notes "See release notes ${URL}."
```

- SV: NumFOCUS session on managing releases (plus benchmarking, etc.) - RA: new startup adventure disclosure - announcement coming next week - wanted to do more on software and infrastructure (focus of career) - going on leave from Columbia to start a company with Joe Hamman - data infra focused on scientific data in the cloud building out pangeo to work with large scale data - interoperable, open standards - first product is a data lake like databricks and expose through single data model (Zarr with kerchunk) - raised some money and hiring two people = time to focus on open source stuff - "Earth Mover" - JK: logo?
:) - SV: fantastic. soft spot for entrepreneurs. - Speak at PyData Global? (focused on solutions & services) - 2400+ people. RA: tentative yes. - CFP is closing in 12 days. - https://pydata.org/global2022/ - JM: EOSS5 funding, sounds similar - RA: discussing can & can't build through open-source software - pangeo experience of deploying infra (with DIY OSS model) - only gets most people so far. they can't follow it. - companies that want to run the tech just want a contract with someone. - no specific data engine in mind (not databricks==spark) - building: - great website to understand your zarr holdings (catalog; beyond an S3 bucket) - help with virtualization of zarr stores on existing file formats (differentiator from tiledb) - transforming datasets (ETL) - Alistair joins at 38 minutes after - RA shares startup info - AM: "Thank God!" - Spec? - SV: meeting on Monday. Invite sent. 3pm BST (Monday is a holiday) - SV: issue opened about regular meetings. - https://github.com/zarr-developers/zarr-specs/issues/156 - @@Josh to setup calendar entry - RA: - going _pretty well_ with feedback from community. Jeremy is strong. - need more strong voices in the conversation - nothing carries more authority than Alistair - goal of having the spec converge. (JMS is creating scope creep) - feel like only one arguing against them - AM: good to know. was hoping SV could help catch me up with an overview - what are the set of amendments that are proposed? - we can make changes but need to identify the sweet spot to make people as happy as possible - again, like responding to reviewers - Ryan's summary of what needs to be resolved is here: https://github.com/zarr-developers/zarr-specs/pull/149#pullrequestreview-1067763694 - AM: protocol extensions can override the core spec (i.e. breaking) - JM: how many of JMS' proposals can be an extension? - AM: think most of them. May need more examples.
- Adopt even "micro-extensions" - AM: will get to something Monday and will surface that - JM: give Zarr V4 to JMS? task him to try V3 extensions - AM: need to show an actual _implementation_ of an extension (multiple languages) - RA: _conventions_ - defining ZEP soon. "Zarr metadata conventions" - AM: agreed. that's great as long as it wouldn't break vanilla application - JK: someone sit with Jeremy and try to implement one? - SV: zarr-specs 153 and 155 ok (from JMS)? he didn't open ZEPs? - AM: changes are in scope since ZEP0001 is open? - JM: anything that can help? - AM: really just the total (unprioritized) list - JK off for 2 weeks - SV: slides? JK: possibly on plane - SV: do-a-thon? JK: haven't thought about it yet ## 2022-08-18 Regrets: Josh, Alistair, Ryan A. and Ryan W. Updates: - EuroSciPy 2022 poster finalized, check [here](https://drive.google.com/file/d/19q_vaMvnpS8zRoAJEZAWJr1g7YCB6DWp/view?usp=sharing) Agenda: - ZEP Meetings - Wait for a couple of days for votes to come and then schedule the meetings - CZI & NumFOCUS Summit preparation - When is everyone arriving and leaving? - John is arriving on 18th September and leaving on 25th September - Sanket is arriving on 15th September and leaving on --- - Ryan --- - See this: https://github.com/zarr-developers/tracker/issues/10 - Check with CZI for poster orientation - List of things we've achieved - Inspired by the CZI EOSS1 grants - our achievements - A notebook would be great - General walkthrough of Zarr - Start with microscopy data - Coordinate system, Sparse arrays - Anything you'd like to see/discussed at NF Summit 2022? - Interoperability - Zarr and Xarray - Zarr and Dask - Grant writers for projects - What's the process look like? - What do they need from projects? 
- Involve funding from other agents like NSF, Gates Foundation, HHMI, Allen Institute - Zarr User Survey 2022 - Dask Survey (for inspiration) - https://blog.dask.org/2021/09/15/user-survey - https://docs.google.com/forms/d/e/1FAIpQLSfio2RIQGIQsX1QTJh4JXmTFK8s-7BbsR0VnfmsWXu1Ccb2Yw/viewform - https://github.com/dask/community/issues/148 ## 2022-08-04 Agenda: - FYI - Community Calls notes and info at new URL: https://zarr.dev/community-calls/ 🎉 - EOSS1 & EOSS4 report submitted. 💯 - Also check out the new blog: https://zarr.dev/eoss4-roadmap - Ryan A's travel! Who's gonna cover it? - SV: talked to NumFOCUS committee. Still figuring things out (e.g. which topics) - JK: had funding in EOSS1 (JM: and EOSS4) for travel. - Makes sense that we're starting to use it. - Assuming ~ 200 USD a night. - [ZEP0002](https://github.com/zarr-developers/zeps/pull/13) opened by [Jonathan Striebel](https://github.com/jstriebel), [Norman Rzepka](https://github.com/normanrz) & [Philipp Otto](https://github.com/philippotto) - `zarr-specs` PR - https://github.com/zarr-developers/zarr-specs/pull/152 - SV: started reading it. Looks fine. Appreciate having more eyes on it. Can then merge as a draft. Couple of commits coming every day. - Other - JM: Who's making commits on ZEP0001? No one yet. SV: reaching out to AM - SV: talking to scalableminds tomorrow. - SV: thoughts on https://zarr.dev/community-calls/ ? - JM: was wondering about summarizing them (even for ZSC) - JK: just bullet point list of topics? - JK: make them collapsible?? - JM: FYI I feel behind on zarr-python merges & releases - JK: v3 changes are important. merge?
- JM: I feel pretty good about it being behind the EXPERIMENTAL_API flag - SV: can get a blog post out (no opposing thoughts) - see https://github.com/zarr-developers/zarr-python/pull/1096/files/3a9f7ccfd08dc68c2f2d148e44d1ccffd4b840ad..efa4e07cff4c2e9ef0bd1e4656ec80739167949e - JK: also https://github.com/zarr-developers/zarr-python/pull/934 (from mads) - JM: 2.13? - JK: numpy version bumped (>= 1.20) - JK: OME using v3? Not yet. Need Java - JM: looking for Java contractors ? - JM: need to choose between: - n5 - jzarr (v2, no ZIC, didn't want money) - netcdf-java - fork - clean-room implementation - get one submitted to zarr-developers - JM: B-Open working on sparse arrays ? - SV: talking to EBI (via CSCCE) - https://www.ebi.ac.uk/people/person/mariia-levchenko/ - JM: EBI has an issue with their object storage ("FIRE") that they know they need to fix - JK: youtube videos? to get more content that people can engage with. - SV: live coding? JK: Sure. - SV: see Ryan's video from geospatial - https://www.youtube.com/watch?v=unGL07trSjA&feature=youtu.be - brain blog? Talk to Greg - bids, dandi, mne, ## 2022-07-21 Agenda: - SciPy - Somewhere colder? - SV: Anyone discussing Zarr? - Organizers asked for an update from John - JK: talked to a bunch of people - met with astronomers ASDF (RA: "non-proliferation treaty") - good mtg. interested in each other. - James Webb would be a major win. - single file format. (zip backends, etc. etc.) - they are using YAML, novel. see the object references - https://blog.daemonl.com/2016/02/yaml.html - interesting hierarchical formats - JM: some of what I'd like to get out of JSON-LD - JK: might be willing to move to Zarr - JM: do some conversion? - JK: just show it to them first - SV: person-of-contact? Yes. - "Seeing is believing" - sharding came up in multiple conversations - talked to Isaac (genomics/anndata) - cares a bit more about some extension formats - unicode support (beyond numpys UTF-32) - ragged arrays. Jim was there. 
- Australian group on molecular dynamics - custom compressors (image codecs, etc) - nice to see new use cases - gridded arrays, some tabular - Jim mentioned possibly some value in storing arrow data in chunks - helps Isaac as well - hopefully not too big of an overhaul - lower barrier to entry - fix the objectcodec issues since arrow is cross-platform - RAPIDS would also be interested - more discussion on z-indexing (spatial) ((recontinued)) - not covered by sharding - grid option in v3: replace rectilinear by z-index - Alistair (shattered from heat) - SV: last 3.5 months - https://www.sanger.ac.uk/collaboration/genomic-surveillance-unit/ - COVID and malaria - CZI & NF Summits - John, Sanket, Ryan W all going - Alistair not available - Josh doesn't want to travel - EOSS1 & EOSS4 reports - JK: other funders? - JM: CZI single-cell and CZI community are coming but have backed off due to NFDI - SV: Sloan, Moore, Gates, (NSF) - AM: Gates likely paid for early Zarr work. Would like to surface more software needs. But it's a big entity. - Grant for statistical genomics tools - sgkit, https://pystatgen.github.io/sgkit/latest/ - spark community - xarray + zarr, started a year ago - were looking at cloud computing in Africa (Jeremy) - spoke to 2i2c - SV: ZEP0001 - https://zarr.dev/zeps/draft/ZEP0001.html - Previous meeting's discussion about getting ZIC to vote - Open a PR against spec repo ## 2022-07-07 Agenda: - Josh: IBM Discussions (largely FYI) - https://opensource.science - https://medium.com/@notjustmoore/why-i-zarr-ee64eb7ffbf8 - SV _in absentia_: ZEP1 vs merged PRs? - AM: ZEP came after the PRs so out of normal process (just needs to be understood) - Open question in the ZEP. - Main focus of the PRs was the v3 core spec. - But extensions exist (extensions, datetime, codecs, sharding, etc.) - Does ZEP cover those? Even if they are placeholders? - tldr: ZEP1 just cover core or some of the extensions. (Process question) - if just core, then we could add more ZEPs.
leads to clear zep per spec mapping - AM: could imagine deleting everything then adding via the process but maybe no way to get work done. - JM: Special marker on the document for the interim period. "UNAPPROVED" Banner of "provisional" - AM: Spec is complete. ZEP is complete. Can go to the next stage of the process. - We're to review? - Whatever the next stage is. - Consultation with the ZIC. - JK: scipy - small talk and plenary session - slides hopefully next week - "working on v3, ZIC structure, greg's work on support in zarr-python" - JM: talking to an illustrator about some Zarr images. ## 2022-06-23 Agenda: - JM: largely caught up, got the various releases done, :+1: - SV: finishing up some sections of ZEP001 ([link](https://github.com/MSanKeys963/alistair_zeps/blob/zep-1-2022-05-03/zep-1.md)) - RA: Zarr V2 is officially an OGC Community Standard! 🎉 - V3 as soon as possible, hopefully with help from Sanket. - GeoZarr is also becoming another OGC standard. (Convention) - JM: also good to have those logos as "Used by" on the website - SV: moving toward a better design - SV: people are also asking about how we can cite (CFF) - RA: run a contest? Prize money? Stickers? Other NumFOCUS swag - RA: aggregate a catalog of data. So people can just look at data. Make it fun. - JM: https://idr.github.io/ome-ngff-samples/ - carbonplan maps! - RA: zero-origin - possibly approaching an agreement - still some questions about extensions - need someone with authority to communicate what is the scope of the core spec and what can be in extensions - JK: scipy 2022 - mentioning v3 and evolving - and the community restructuring with ZIC - (they were asking about tools and how they changed) - JM: sharding as a use-case - RA: vote on ZEP001 within a few weeks - SV: felt that it's complete - RA: starting to think about sponsorships ## 2022-06-09 Agenda: - ZEP website done!
Check here: https://zarr.dev/zeps
- ZEP1
  - Discussion on Jeremy's (JMS) [PR](https://github.com/zarr-developers/zarr-specs/pull/144)
    - RA: A natural way of defining arrays is out of scope and goes against what a Zarr array stands for
    - RA: Xarray provides an extra layer for latitude and longitude info
    - JMS's change is not what RA wants
    - AM: Haven’t come across what JMS has proposed
    - AM: Not in a position to cast a vote here
    - AM: If the ZIC has a veto then this cannot fly
    - RA: Breaks cross-language compatibility (between Python and Julia)
    - Suggestions have come up before
    - JMS's motivation for the offset and non-zero origin comes from [#122](https://github.com/zarr-developers/zarr-specs/issues/122)
    - RA: What Jeremy proposes can be handled by adding an additional model/API level/translational layer above Zarr. For example: Xarray
    - Reasonable petition could be to take what Ryan explained about Xarray
    - JK: Where did the original request for a non-zero origin come from?
      - Appending data to arrays, negative chunk indexes
      - https://github.com/zarr-developers/zarr-specs/issues/9
    - JK: Negative indexing is important in microscopy
    - RA: Geospatial community: the value of a metadata format
    - RA: A metadata standard could solve JMS's problem
    - AM: Apart from core and spec there are standard documents/schemas, like ways of using Zarr to achieve things
    - How conventions could be applied to Zarr, and how you could tell everyone how I’m using Zarr for my use case
    - AM: Schema framework for Zarr spec could mitigate JMS's issue
    - `Josh’s take on schema!?`
    - RA: NetCDF-C community would be doing a lot of work without anything to gain for their community
    - JMS's 3 options:
      - V4 for core spec
      - Extension for V3
      - Could be achieved by schema/documents as mentioned by AM
- NumFOCUS newsletter updates
  - [here](https://hackmd.io/_2HkRVZ6RLSNZJKMkmHcxQ?view#June-2022)
  - Anything else?
    - Nope
- SciPy 2022 & NumFOCUS Summit 2022
  - JK is doing a presentation @ SciPy 2022
    - What do we want to include in the presentation?
      - V3?
        - Yes! Should be announced; the more people read it, the more it gets used and becomes popular!
      - Other possible things to share: the ZIC, how it works and what it looks like
  - JK going to NumFOCUS Summit along with SV
    - Will work on slides and presentations soon!

## 2022-05-26

Agenda:

- Misc
  - RA: optimistic about our technology
  - JM: playing with jupyterlite. need async!
  - RA: fsspec needs work. needs threads. chat with Martin.
- Critical
  - ZEP1
    - RA: thrilled
    - AM: hopefully in the right ballpark. Steers welcome.
    - SV: see website
      - ![](https://i.imgur.com/A9oGkgg.png)
      - searchable https://zarr.dev/zeps
    - RA: propose trying to leave zep PRs open for review
    - RA: and zeps in zarr-spec repo? No.
    - RA: goal of review? AM: more completeness and then deeper review happens once merged.
    - SV: more a hybrid. draft is "complete" (all sections filled). then community can see what is being said. --> Josh: publicity.
    - RA: have buttons on the website, "comment on this ZEP" which opens an issue or PR.
    - RA: worried that no one is going to give any feedback.
    - RA: just need to be clear what the process is. how does someone give feedback.
    - AM: numpy has an established mailing list.
    - JK: what's the goal with merging a _draft_ ZEP? AM: makes it visible.
    - RA: that's the step of publishing a draft. So we need to be clear about what publishing means.
    - RA: feels too heavy
    - AM: want the experience of working on the spec to be like working on the code, but it's tricky because it comes down to multiple docs/versions/etc.
    - RA: slight-walk-back / much of the discussion will take place on the zarr-specs repo. happy to move forward.
    - AM: comment - v3 is bigger than most, but in practice useful to have multiple PRs building up the change (see also NEPs)
    - JK: change state of draft. "Proposed"?
    - JM: strong-yes, weak-yes, weak-no, strong-no
    - JM: what triggers a core change?
  - Verdict on [zarr-python ZIC member](https://github.com/zarr-developers/governance/issues/19) :question:
    - Otherwise: barring zarr-js and jzarr, complete! :tada:
    - JM: asking Greg? or does John/Alistair *want* to do it?
    - AM: out of touch personally
    - JK: could step up.
    - JM: personally don't want to.
    - JK: agree that it's not great, but practically, we don't have the people.
- FYI
  - Sanket: https://twitter.com/zarr_dev/status/1528712019477598209
    - also benchmarking kerchunk
  - Josh: second contract with scalableminds is moving (slowly) forward
- Time-permitting
  - Changelog https://github.com/zarr-developers/zarr-python/issues/829 ?
    - problems
      - conflicts
      - people not doing it
      - difficulty of formatting
    - JK: ci check for PR? (but the conflicts are also an issue)

## 2022-05-12

- ZEP 1 Update
  - Sanket & Alistair to start working next week
- NumFOCUS Summit (18-20.09; Austin) and CZI Summit (19.09; San Francisco)
  - RyanA: good question. will check.
  - SV: Also PyData New York is coming!
  - RA: SciPy in Austin in July (best conf. ever)
    - presenting? xarray tutorial. pangeo forge pres. 8 people.
- Spec status?
  - JM: all PRs merged, Jonathan & Jeremy editing (as they said they would)
  - SV to work on ZEP webpage
  - RA: great to have ZEP0 published to show how the next one will come in
- SV currently working on a couple of blogs
- GSOC SV: excluded person for .zmetadata
- Updates for NumFOCUS and CZI Newsletter
  - SV: talked to Arliss. Have some time.
- Involvement with Scientific Python (https://scientific-python.org/specs/) ?

## 2022-04-28

- :sunglasses:
- ZEP V3, 20min focused discussion
  - RA: right direction?
  - AM: seems like a good direction. clarity of spec ZEPs.
  - AM: extensions removed? SV: to be added back later.
  - SV: also specific for codecs & stores/storages
  - RA: like specifying the storage since we're reliant on the python implementations
  - JM: question of are stores/codecs core protocol or extensions?
  - AM: all active people should comment on core
    - everything else (extensions and possibly even storage) could be optional
  - AM: ZIC sounds great.
  - JM: think we can merge 17 & 16 and focus on ZEP1 mission of v3
  - RA: would have ZEP1 be a process spec to discuss the rationale of breaking changes
    - but generalization won't help us move fast
  - JK: mostly got v2 of the ZEP but - tl;dr - slimming down of ZEP and it seems like that's been captured
  - RA: ask Jeremy for a clarification
  - AM: could see writing for ZEP1 to capture motivation of original V3
    - could respond to Jeremy
    - but engaging / debating will struggle with time
  - RA: if too flexible it will fragment. lack of interoperability will hurt.
    - would be good to clarify what is core zarr.
  - _RA leaves_
  - JM: think would be useful to address sharding/translation in ZEP1
  - AM: storage was originally in
  - JM: terms overloaded. also don't know the extension interfaces
  - AM: could see removing from V3 current draft. Perhaps add in Filters.
  - JM: will sharding drive a quick v3.1 or v4
  - AM: storage translators is a MUST understand thing (or at least know and raise an error)
  - AM: ordering -- write ZEP1 for _why_ (motivations) and some evidence
    - and let that be reviewed and tested by the community
    - purpose of a ZEP, stating your objectives (pointing to the spec concretely)
    - Ask community: "has v3 spec met its objectives?"
  - TODOs
    - SV: set up ZEP repo
    - JM:
      - [x] comment on and merge 16
      - [ ] comment on and merge 17
      - [x] contact Jeremy
      - [x] contact ZICs
    - AM: ZEP001
      - 4 objectives
      - link to sharding PR (possibly proactively merge into dev branch)
- ZEP & ZIC
  - Done
    - teams created
    - invite drafted
    - javascript contacted
  - Needs doing
    - merge [zic](https://github.com/zarr-developers/governance/pull/17)?
    - merge [zep](https://github.com/zarr-developers/governance/pull/16)?
    - discuss zsc members
- logo:
  - better? https://github.com/zarr-developers/zarr-logo/commit/4666a0b8412e4a10dfb838f5e5adee02b5547f4e
  - all agreed
- trademark: see https://gitter.im/zarr-developers/community?at=6268e424c6e3665bad9b799b
  - JK: works pretty well
  - JM: to get started
- lengthen meeting to an hour: all agree

## 2022-04-14

- cloud-native
  - SV: starting slides tomorrow(ish)
  - 800 registrations
- governance
  - impl. council: https://github.com/zarr-developers/governance/pull/17
  - zep: https://github.com/zarr-developers/governance/pull/16
  - JM: next step?
  - RA: wanted to finalize ZEP by the end of this week!
  - issues:
    - scope of spec? (part of "what is zarr and what are we governing?")
      - RA: example of ZEP that doesn't touch the spec?
      - JM: biggest example I can think of is zarr-python async-only
      - SV: sharding, ZEP would document also the change to zarr-python
      - RA: ZEP for governance/process (leave opportunity to extend for per-language later)
      - RA: move for a vote:
        - Josh: vote for limited scope
        - Ryan: vote for limited scope
      - ZEP000 by tomorrow (meta-ZEP)
      - ZEP001 on v3 (possibly sharding)
      - ZEP002 on sharding if necessary
    - who approves the zep?
    - (impl.) which repo(s)?
      - JK: struggle with time. as stream-lined as possible. e.g. one step.
      - RA: ZEPS basically as a changelog.
      - JK: Even before the change?
        - cf. draft state of NEPs and PEPs
      - RA: two PRs to same repo and the ZEP gets merged even if rejected
      - JK: that sounds like no draft status, only the PR.
      - SV: separate repo was to have all the narrative in one place
    - namespace of zarr websites
      - zarr.dev
      - zarr.dev/blog
      - zarr.dev/zep
      - zarr.dev/specs ## Need PR reviews! (netlify)
      - zarr-specs.readthedocs.io ## migrate or not?
      - zarr.readthedocs.io
    - steering council:
      - possibly norman
    - implementation council
      - ward (netcdf-c)
      - jeremy (tensorstore)
      - norman (webknossos)
      - trevor (javascript)
      - fabian (julia)
      - constantin (z5)
      - david (xtensor)
  - RA: can move repo but ok to not do that.
  - SV: add requirement to check the gitter council :smile:
  - RA: GH tagging should be the only reliable way
  - JK: teams will help!
  - TODOs
    - SV: update PR 16
    - RA: update PR 17
    - JM: draft IC invite
- v3 (TABLED)
- xarray & dask
  - e.g. https://github.com/pydata/xarray/pull/6475
- quick bits
  - https://github.com/zarr-developers/zarr-python/pull/933: ok to merge, John?
  - Ryan's spec PR.
- FYI (SV & JM noisily adding to gitter)
  - youtube playlist now on https://zarr.dev
  - https://github.com/orgs/zarr-developers/discussions activated
- Follow on points after RA left are captured in https://github.com/zarr-developers/governance/pull/17

## 2022-04-07

- write_empty_chunks debacle
  - JK: flag added (d-v-b). default switched (jni). others writing weirder data with unknown fill_value failed. "auto" PR coming.
- Release 2.11.x
  - JM: background info about v3
  - JK: any open from Greg? No.
  - RA: would love to get V3 out. why not?
  - JM: in case the v3 spec breaks
  - RA: would love to start playing with it.
  - JM: that's what greg's PR should get us. (pretty straight-forward)
  - RA: config system like xarray?
  - JM: yes! reusable? RA: No.
    - https://docs.xarray.dev/en/latest/generated/xarray.set_options.html
    - https://github.com/pydata/xarray/blob/main/xarray/core/options.py
    - https://github.com/spacetx/starfish/blob/master/starfish/core/config/__init__.py (Josh)
  - JK: dask has one too.
    - https://github.com/dask/dask/blob/fc911b6d481e602f7549eecb180c3f7622260001/dask/config.py
  - JM: also something like https://dynaconf.readthedocs.io/en/docs_223/
- Governance-ish (Sharding, etc.)
  - Sharding
    - Discussion on having an editors' group.
      - Jonathan
      - Jeremy
      - Tom (if he wants)
    - Process to _nominate_ editors?
  - Implementors
    - define a team
    - threshold of majority for the implementors
    - e.g. fortran order
  - ZEP
    - PEP has statement about steering council to give team the responsibility for publishing. Could do the same.
    - how well-specified are extensions? can an extension later be submitted to core? what's the interface? etc.
    - RA: let's have ZSC finish ZEP
      - get implementor council
      - then go back to the open discussions
      - "sorry we were sorry, without our BDFL"
    - JM: so PR of zep & governance
    - RA: want to just tag a PR as a ZEP. (then capture as documentation)
    - JK: "changelog"? PR is ok for review.
- More core devs? e.g. [PR 992 (redis)](https://github.com/zarr-developers/zarr-python/pull/992#pullrequestreview-920452763) (Tabled) ![](https://i.imgur.com/BhZRhxq.png)
  - Josh: also proposed greg. :+1:

## 2022-03-17

- Attending: Alistair, Josh, Sanket, John
- Regrets:

Agenda:

- [ZEP](https://github.com/zarr-developers/governance/pull/16)!
  - AM: for spec or zarr-python? Both, but keeping all implementations in mind.
  - JM: how do we decide who is voting?
  - SV: how do we build consensus? who are the contributors?
    - target is anyone who would be affected (as in Python)
    - accept/reject/defer
    - if you don't have anything (strongly) against
    - Martha's rules: https://third-bit.com/files/2020/08/marthas/
      - would require meeting time
  - AM: different cases.
    - e.g. someone opens proposal to core protocol. all maintainers should be willing. (high-bar)
    - different flavors of protocol extension, new codecs, ... (can be more relaxed)
    - want to maintain a balance between a stable core and allowing exploration
    - how to remedy these situations?
  - classification of zeps
    - specs
      - core (v3)
      - extension (awkward arrays)
      - codecs
      - ...
    - impl?
  - AM: ZEP as gateway to suggest a body of work.
    - JM: "completion" status with 100% being something special
    - AM: including *intention* ("under-review", "planned", "in-progress")
    - step away from central approval process
- [Sharding](https://github.com/zarr-developers/zarr-specs/pull/134) (as an example)
  - AM: still under active development.
  - JM: note the need for that overarching v3 document (ZEP?)
  - AM: v3 meets all the original goals:
    - extension framework
    - improve performance of accessing metadata (fewer accesses)
    - simplify and remove features (for cross-language impls.)
  - sharding is a new (but liked) goal.
  - but other questions have come up recently as well.
    - "ZSC governs zarr-specs & zarr-python"
    - xarray / unidata talks (cf. https://github.com/pydata/xarray/issues/6374)
  - in W3C land, wg would meet to resolve list of questions for version
    - implementors as "standing working group"
  - JM: struggling that no one has taken responsibility to respond (sync or async)
  - AM: tension of interoperability versus innovation
    - JK: ... versus developer time!
  - SV: ask each language whether or not they want to adopt a ZEP
  - JK: https://peps.python.org/pep-0604/#objections-and-responses as an example
- FYI:
  - first EOSS4 payment went out to scalable minds (Josh)

## 2022-03-03

Attending: Josh, Sanket, Ryan W.

Regrets: Ryan A., John K.

Agenda:

* Josh: have a need for a remote-symlink feature
* Ryan: need for documentation on how to write v3 extension
  - Josh: also came up during the community meeting
  - Sanket: talked to Alistair for a few hours; goal of having minimal number of things in the core protocol. Reduces burden on implementors and on spec proposers since they don't have to go through the whole process
  - Josh: https://github.com/zarr-developers/governance/issues/15 moving mainline to dev
* Sanket: working on community feedback process
  - https://github.com/zarr-developers/governance/issues/14
  - similar to NEP, PEP, ...
  - publishing first draft this week
* Sanket: GSOC project list
  - https://github.com/zarr-developers/gsoc/blob/main/2022/ideas-list.md

## 2022-02-16

Attending: Ryan W., Alistair, Josh, Sanket, Ryan A., John K.

Agenda:

- *Nothing pressing*
- FYI/Josh: timezone change (`CET -> CST`)
- Sanket: Chris Holmes on cloud-native outreach day
  - Tentative date: 5th & 6th of April
  - Aiming to have session between Day 1 & 2 on European side
  - RA: use it to highlight geo-spatial data in Zarr
    - Debate about COGs v. Zarr
    - COGs as single-image container
  - Talk (20-30 minutes) / workshop (60-90 minutes)
    - Talk: discussing specs and zarr overview
      - new features and the roadmap (what & how)
      - RA: include geospatial content. can give content & even present those parts.
        - Petabytes of climate model data in the cloud.
        - have time.
        - separate meetings.
        - can also get other people involved.
    - WS: hands-on
- Sanket: GSOC
  - https://github.com/zarr-developers/community/issues/39
  - https://github.com/zarr-developers/gsoc
  - 19th of Feb for deadline
  - Josh: sparse arrays
    - post-sharding community-wide discussion
    - AM: does V3 spec have the necessary hooks for sparse (anything we can find out now)
    - RW: seems like a relatively clean case (currently just 3 one-dimensional arrays)
      - how deeply does awareness get baked in.
    - JM: or alternatively do we completely hide the 3 1-D
    - AM: another idea would be a sparse layout for chunks (non-contiguous). different kind of thing.
    - JM: cf. also see non-uniform chunks
    - AM: v3 introduced a (regular) chunk grid with the intention of allowing something to introduce irregular rectilinear grids
    - JK: i.e. we don't have a clear definition of how to do it. Doing it for GSOC might be to give different options for the layout, then a twitter poll for the next step, ...
    - AM: :+1: for some way of structuring the requirements.
- Josh: FYI / https://zarr.dev
  - PR template for taking new data.
  - @@Sanket to
- Process
  - Josh wrote to Ryan A.
about next steps
  - Alistair & Sanket meeting next week
  - We're waiting on a PR from Jonathan

## 2022-02-02

Attending: RyanA, RyanW, Josh, John

Agenda:

- https://zarr.dev/blog repo split. ok? Yes.
- gave Sanket access via tweetdeck. ok? Yes.
- Use the `zarr-internal` numfocus slack channel? Yes. DONE
- Invite Sanket to these meetings? Yes. DONE
- Use of gmail account? Youtube channel? No public email. DNS forwarding.
- Specs
  - STAC time with Alistair
    - https://github.com/zarr-developers/governance/issues/14 ("ZEP")
    - Norman waiting
    - N5-Java community
  - Ryan A: can bootstrap quickly. draft of ZEP ASAP. copy-n-paste from STAC.
  - Do we want the process to involve specific numbered proposals? or just PRs on github? zarr-specs? somewhere else? spec==repo. (semver)
  - RyanA: STAC as opposed to static v2 or slow v3.
  - Josh: but how do we get there?? John: sharding as carrots for v3.
  - RyanA: do things more incrementally in V3? Evolving towards v4?
  - Josh: extensions as the step towards evolution
  - RyanA: everyone to read STAC as homework (...or... someone to impose as BDFL)
  - Note: *RyanA needed to leave*
- C++/FYI:
  - Various discussions around this
  - xtensor-zarr stuck?
    - no support for n5 (Constantin can help)
    - cmake issues (Matt can possibly help)
  - z5 ready to deprecate.
  - tensorstore in the lead?
    - John: pushing GCS? S3 should be supported.
  - No takers for libzarr
  - John: use regular meeting to get xtensor-zarr users/developers together?
    - Matt, David
    - commit rights?
- Tabled
  - "cloud-native days" talk/workshop (content)
  - misc/technical/detailed (time-permitting)
    - [build hangs](https://github.com/zarr-developers/zarr-python/pull/952)
    - Any ideas?

## Other notes

- B-Open: https://hackmd.io/ORFQrwMDQMWivz_IbJi0CQ

<!-- Previous meetings -->

[2021-08-30]: /IRjXZ8lRRzCyQD76R9wyhw
[2021-09-27]: /zRYgVxgMQZegcEqqNY3tMQ
[2021-11-30]: /9DCqDuZXRVCCYA7AxYT9sA