# Archived Bi-weekly Xarray Community Developers Meeting
https://us02web.zoom.us/j/87503265754?pwd=cEFJMzFqdTFaS3BMdkx4UkNZRk1QZz09
New notes: https://hackmd.io/fx7KNO2vTKutZeUysE-SIA?both
## September 28, 2022
### Attendees
- Joe Hamman / @jhamman
- Tom Nicholas / @TomNicholas
- Justus Magin / @keewis
- Stephan Hoyer / @shoyer
- Deepak Cherian / @dcherian
- Benoît Bovy / @benbovy
- Jarrod Millman / @jarrodmillman
### 60 second updates
- Joe: represented Xarray at the CZI and NumFOCUS meetings. Talking to NumFOCUS about getting help designing a new logo/brand for Xarray. Talking to NASA about Xarray developer meeting.
- Tom:
- Draft structure for "ChunkManager" abstraction for dask/cubed
- Some docs improvements
- Benoît: explicit indexes (PR comments, issue triage, etc.)
- https://github.com/pydata/xarray/issues/7099
- https://github.com/pydata/xarray/issues/6392#issuecomment-1260618693
- feedback appreciated!
- Justus: reviews, pint-xarray
- Stephan:
- Should Xarray stop doing automatic index-based alignment?
https://github.com/pydata/xarray/issues/7045
### Agenda
- index sprint: https://github.com/pydata/xarray/discussions/7084
- Expose Variable?
## September 14, 2022
### Attendees
- Justus Magin / @keewis
- Ryan Abernathey / @rabernat
- Anderson Banihirwe / @andersy005
- Stephan Hoyer
- Benoît Bovy / @benbovy
- Jessica Scheick / @jessicas11
- Mathias Hauser / @mathause
- Tom Nicholas / @TomNicholas
### 60 second updates
- Justus: nothing, I've been on holiday for two weeks
- Anderson: nothing much, Joe + Anderson going to CZI / NumFOCUS
- Benoît: working on indexes (bug fixes, public API, documentation)
- Stephan: pushing more on xarray-beam
- Ryan: new meeting about alternative distributed computing systems, like cubed: https://github.com/tomwhite/cubed/
- Jessica: working on backend docs
- Mathias: reviews
- Tom: hypothesis testing strategies; trying to get cubed into xarray; met with Jim Bednar on panhelio; maybe fit ragged arrays into xarray; tried out making a PeriodicBoundaryIndex
### Agenda
- Jarrod: talk about spec process https://scientific-python.org/specs/core-projects/
- Looking for volunteers for SPEC steering committee
- Process: https://github.com/scientific-python/specs/pull/135
- Questions:
- Benoît: what about the array API spec? Is this part of it?
- Data Array API - Ralph Gommers - orthogonal to Xarray DataArray
- Indexes
- docs on custom indexes
- set_xindexes method so you can actually use custom indexes now!
- create an issue with a list of work-in-progress index implementations
- indexes sprint sometime in September / October
- https://twitter.com/mdsumner/status/1568820392114143232
- Listing backends nicely (Jessica?)
- https://github.com/pydata/xarray/pull/7000
- list_engines exists but not public API
- how to make a show_engines function that is prettier
- Supporting cubed / parallel backends in a general way
- tl;dr: special-case just dask & cubed, or make a new entrypoint for any library
- https://github.com/pydata/xarray/pull/7019#discussion_r968815406
- Slightly more complicated than a duck-array because of map_blocks
- Stephan: factor it all out into a single module, don't try to solve the general problem
- What about a SPEC for this? Jarrod says it would fit
- Tom: cubed depends on Dask so not clear that they are truly independent implementations
- How to discover the extra methods (e.g. `apply_gufunc` from the appropriate module)
- issue about another library (ramba): https://github.com/pydata/xarray/issues/5970
- Arkouda: https://bears-r-us.github.io/arkouda/
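
The "single module" idea from the dask/cubed discussion above could look roughly like the following. This is only a sketch, not Xarray's actual design: the `ChunkManager` base class, the `register` / `manager_for` helpers, and the toy `ListChunks` array type are all hypothetical names made up for illustration. It shows one registry that maps a chunked array type to an object exposing the extra methods (here `map_blocks`) that a plain duck array does not have.

```python
from abc import ABC, abstractmethod


class ChunkManager(ABC):
    """Hypothetical interface wrapping one parallel array library."""

    array_type: type

    @abstractmethod
    def chunks(self, arr):
        """Return the chunk sizes of ``arr``."""

    @abstractmethod
    def map_blocks(self, func, arr):
        """Apply ``func`` to every block of ``arr``."""


_MANAGERS = []  # hypothetical module-level registry


def register(manager):
    _MANAGERS.append(manager)
    return manager


def manager_for(arr):
    """Find the registered manager that understands this array type."""
    for manager in _MANAGERS:
        if isinstance(arr, manager.array_type):
            return manager
    raise TypeError(f"no chunk manager registered for {type(arr).__name__}")


class ListChunks:
    """Toy chunked array (a list of blocks), standing in for dask or cubed."""

    def __init__(self, blocks):
        self.blocks = [list(block) for block in blocks]


class ListChunkManager(ChunkManager):
    array_type = ListChunks

    def chunks(self, arr):
        return tuple(len(block) for block in arr.blocks)

    def map_blocks(self, func, arr):
        return ListChunks(func(block) for block in arr.blocks)


register(ListChunkManager())

arr = ListChunks([[1, 2], [3, 4, 5]])
manager = manager_for(arr)
doubled = manager.map_blocks(lambda block: [2 * x for x in block], arr)
```

Calling code only ever talks to `manager_for(arr)`, so special-casing two libraries or discovering managers through an entrypoint becomes a detail of how the registry is filled.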
## August 31, 2022
### Attendees
- Anderson Banihirwe / @andersy005
- Tom Nicholas / @TomNicholas
- Stephan Hoyer / shoyer
### 60 second updates
- Anderson
- Clean up work for xarray.dev
- Tom
- Blog posts
- Published datatree & CMIP blog post
- Publish pint-xarray blog post
- Strategies
- Working on writing hypothesis strategies for xarray objects (https://github.com/pydata/xarray/pull/6908)
### Agenda
- Investigate RTD docs build
## August 17, 2022
### Attendees
- Tom Nicholas / @TomNicholas
- Justus Magin / @keewis
- Ryan Abernathey / @rabernat
- Stephan Hoyer / @shoyer
- Deepak Cherian / @dcherian
- Mathias Hauser / @mathause
### 60 second updates
- Tom
- PR to add hypothesis testing strategies that generate xarray objects (https://github.com/pydata/xarray/pull/6908)
- As part of duck array testing strategies
- Added xarray calendar to Scientific-Python website (https://github.com/scientific-python/scientific-python.org/pull/294)
- Bot to automatically label PRs with topic tags (https://github.com/pydata/xarray/pull/6912)
- Realised Alessandro and Aureliana were missing from core team page in docs! (https://github.com/pydata/xarray/pull/6913)
- Justus:
- Blog post on pint, hypothesis
- Ryan
- Looking for feedback on Zarr spec / ZEPs
- https://github.com/zarr-developers/zarr-specs/pull/149
- https://github.com/zarr-developers/zarr-specs/pull/152
- https://github.com/zarr-developers/zarr-specs/issues/126
- Stephan
- Deepak
- Could use input on [PR](https://github.com/pydata/xarray/pull/6874)
- Alan Snow (RioXarray)
### Agenda
- Scientific-Python "core" project
- https://scientific-python.org/
- SPECs: Scientific Python Ecosystem Coordination (https://scientific-python.org/specs/); very relevant to Xarray
- Stephan - SPEC around units could be useful
- Jarrod Millman and Stéfan van der Walt have been involved in SciPy for a long time
- We were expecting Jarrod to be here
- Tom will email Jarrod to communicate our unanimous support for Xarray becoming a core project
- xarray-spatial blogpost
- Would go on the Xarray blog
- Are we comfortable with this?
- Yes, but we need to review and provide editorial feedback
- We should have a policy for what can go on the blog
- office hours in september
- NASA project is supporting Xarray office hours. Supposed to be focused on remote sensing but thinking about making it more general.
- Deepak, Scott, and Jessica are funded to do this
- Justus may be able to join
- Signup via Eventbrite
- Be clear about audience, open issue; guidance on how to frame question; clear interaction pro
- PRs needing feedback:
- [decorator to deprecate positional args](https://github.com/pydata/xarray/pull/6910)
- Stephan will review; doesn't sound risky
- Should we do another sprint on indexes
- YES
- How to organize? Deepak can lead; nominally October
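
For context on the "decorator to deprecate positional args" PR above, here is a rough sketch of how such a decorator can work. It is not the code from the PR: the name `deprecate_positional_args`, the warning text, and the `summarize` example function are assumptions for illustration, and functions that take `*args` are not handled. Arguments declared keyword-only are still accepted positionally for now, but trigger a `FutureWarning`.

```python
import functools
import inspect
import warnings


def deprecate_positional_args(func):
    """Warn when keyword-only arguments are passed positionally (sketch)."""
    params = list(inspect.signature(func).parameters.values())
    positional = [
        p.name
        for p in params
        if p.kind
        in (inspect.Parameter.POSITIONAL_ONLY, inspect.Parameter.POSITIONAL_OR_KEYWORD)
    ]
    keyword_only = [
        p.name for p in params if p.kind is inspect.Parameter.KEYWORD_ONLY
    ]

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        extra = args[len(positional):]
        if extra:
            if len(extra) > len(keyword_only):
                raise TypeError(
                    f"{func.__name__}() got too many positional arguments"
                )
            # Map the surplus positional values onto the keyword-only names.
            names = keyword_only[: len(extra)]
            warnings.warn(
                f"Passing {names} positionally is deprecated; use keywords instead.",
                FutureWarning,
                stacklevel=2,
            )
            kwargs.update(zip(names, extra))
            args = args[: len(positional)]
        return func(*args, **kwargs)

    return wrapper


@deprecate_positional_args
def summarize(data, *, skipna=True, keep_attrs=False):
    # Hypothetical function whose options became keyword-only.
    return data, skipna, keep_attrs


result = summarize([1, 2], skipna=False)  # preferred spelling, no warning
```

An old-style call such as `summarize([1, 2], False)` keeps working but emits the warning, which gives downstream code a release or two to migrate.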
## August 03, 2022
### Attendees
- Joe Hamman / @jhamman
- Justus Magin / @keewis
- Max / @max-sixty
- Stephan Hoyer
- Deepak
### 60 second updates
- Justus - duck array testing
- Max — not much! Hope it's OK I've been doing less recently. Always here if I can help with anything specific. FYI my colleague is building a `xarray-beam`-like interface for Spark.
- Have you seen this? https://ncar.github.io/PySpark4Climate/sparkxarray/overview/
- Stephan - not a ton
- Deepak - been hacking on a few things
- Experimenting with cupy arrays, wrote some docs, new direct-to-gpu zarr backend: https://github.com/xarray-contrib/cupy-xarray/pull/10
- CRS index: https://github.com/dcherian/crsindex/blob/main/crsindex.ipynb
### Agenda
- pint xarray blogpost
- pint-xarray was released
- just need copy edits on blog post and some formatting issues
- np.asarray on backend arrays
- https://docs.rapids.ai/api/kvikio/stable/api.html#zarr
- https://github.com/pydata/xarray/pull/6874
- https://github.com/xarray-contrib/cupy-xarray/pull/10
## July 20, 2022
### Attendees
- Deepak Cherian
- Joe Hamman / @jhamman
- Stephan Hoyer
- Ryan Abernathey / @rabernat
### 60 second updates
- Deepak
- material for scipy tutorial
- fixing release blocker
- Stephan - ran google's test suite against xarray, just one issue
- Ryan - helped coordinate the flexible indexes sprint at scipy
- Joe - at Scipy, ready to write the user survey summary post
- Yann - Masters student in Finland, GSoC work on PyMC, interested in auto-diff in Xarray
### Agenda
- Release?
- after https://github.com/pydata/xarray/pull/6798
- Pymc use case discussion
- PyTorch: https://github.com/pydata/xarray/issues/3232
- Upstream issue: https://github.com/pytorch/pytorch/issues/58743
- Stephan: array API appraisal: https://data-apis.org/array-api/latest/
- Use case from Yann: looking to use Xarray DataArray (Groups) as optimization variables in a large optimization loop (Kullback-Leibler minimization)
- https://github.com/yannmclatchie/kulprit
- Alternative parallel array backends
- https://github.com/pydata/xarray/issues/6807
## July 06, 2022
### Attendees
- Tom Nicholas / @TomNicholas
- Justus Magin / @keewis
- Deepak Cherian
- Stephan Hoyer
- Max / @max-sixty
### 60 second updates
- Tom
- AWOL as travelling unexpectedly
- Now trying to get my part of xarray tutorial for scipy done asap
- Justus
- pint-xarray is close to the release, unblocking the pint-xarray blog post
- Max
- Was trying to get numbagg to work with corr, but I think it would need a decent amount of work
### Agenda
- pint-xarray blog post
- Basically no clear ToDos remaining, so should be able to release?
- issues needing feedback:
- https://github.com/pydata/xarray/issues/6749
- count for variables without the reduction dimension
- Any from: https://github.com/pydata/xarray/labels/needs%20discussion
- should we try to organize a sprint to add examples to most of our functions?
- pandas had a pretty successful "docs day", maybe we can take some inspiration from that?
## June 22, 2022
### Attendees
- Deepak Cherian
- Justus Magin / @keewis
- Ryan Abernathey / @rabernat
- Max Roos / @max-sixty
- Mathias Hauser / @mathause
- Stephan Hoyer
### 60 second updates
- Deepak : xarray-tutorial for scipy (WIP https://tutorial.xarray.dev/)
- Justus : pre-release, CI
- Max : Some PR reviews, lots of typing, we have a new prolific contributor @headtr1ck I've been iterating with. I haven't been attending this often because of my schedule, I'm going to try and come by audio more often.
### Agenda
- Release?
- We are not ready; need more feedback on pre-release.
- Regressions found
- https://github.com/pydata/xarray/issues/6607
- Stephan: breaking a few weird edge cases is ok
- PRs needing [discussion](https://github.com/pydata/xarray/issues?q=is%3Aissue+is%3Aopen+label%3A%22needs+discussion%22):
- https://github.com/pydata/xarray/pull/6702 moves .groupby etc from DataWithCoords to DataArray and Dataset
- Stephan seems to approve; will review
- https://github.com/pydata/xarray/issues/6704
- https://github.com/pydata/xarray/issues/6646
- pyvista-xarray: https://github.com/pyvista/pyvista-xarray
## June 8, 2022
### Attendees
- Justus Magin / @keewis
- Tom Nicholas / @TomNicholas
- Ryan Abernathey / @rabernat
- Joe Hamman / @jhamman
### 60 second updates
- Justus: upload wheels to TestPyPI, micromamba in CI, matplotlib in upstream-dev, some work on pint-xarray
- Tom: datatree development and documentation, currently using xarray main
- Ryan:
- Extremely deep dive on fsspec xarray interaction: https://github.com/fsspec/filesystem_spec/issues/579
- Joe:
- first xarray blog is ready to go (https://github.com/xarray-contrib/xarray.dev/pull/220), waiting on pre-release
- user survey
- xcollection: https://xcollection.readthedocs.io/en/latest/
### Agenda
- Release progress?
- plan: to make pre-release today, release notes will mention release blockers
- Scipy?
- NumFOCUS summit?
- PEP654: exceptiongroup (backport: https://github.com/agronholm/exceptiongroup)
## May 25, 2022
### Attendees
- Justus Magin / @keewis
- Ryan Abernathey / @rabernat
- Joe Hamman / @jhamman
- Tom Nicholas / @TomNicholas
- Anderson Banihirwe / @andersy005
### 60 second updates
- Justus: answering questions
- Stephan: got contract signed with numfocus & Benoit for more work on indexes
- Tom: Looked at Pint-xarray; work on datatree with B-open folks; Talked about Datatree with Joe
- Joe: opened PR in xarray yesterday about `to_dict` to add encoding - looking for feedback; PR sitting for refactor of docs index - stuck on sphinx warnings
- Anderson: trying to wrap up landing page
- Ryan:
- Zarr ZEP 1: https://github.com/zarr-developers/zeps/pull/1
- Are we testing against Zarr V3 currently? https://github.com/pydata/xarray/pull/6475
- Stephan will open a Zarr issue about group path
- Xarray-beam: email dev mailing list
### Agenda
- Release progress?
- What's blocking?
- Breaking changes in index refactor that will cause turmoil
link to issue?
- Flox?
- cf-time
- Can Benoit own the release checklist? Or on standby?
- A formal pre-release
- Who will do the pre-release? Justus will lead with support from Tom
- Tom: another Xarray paper
- Ryan: xhistogram into xarray
- Next time: Scipy!
## May 11, 2022
### Attendees
- Ryan Abernathey / @rabernat
- Tom Nicholas / @TomNicholas
- Deepak Cherian / @dcherian
- Justus Magin / @keewis
- Anderson Banihirwe / @andersy005
- Stephan
### 60 second updates
- Tom
- Inline array PR
- Been to meeting with Josh Moore / B-Open about DataTree
- Deepak
- General maintenance
- Justus
- Tried to implement array namespace support (related to array interoperability / pytorch)
- Needs to find a working implementation
- Stephan has a reference implementation (numpy.array_api in next numpy release)
- Ryan
- Rediscovered the need for lazy concatenation: https://github.com/pydata/xarray/issues/4628
- Anderson
- Wrapping up landing page work
- Now have a blog
- Joe writing about CZI grant
- Pint blog post
- Stephan
### Agenda
- Anderson talks about website
- The blog is live: https://xarray.dev/blog
- Looking for other packages from other fields
- plasma physics modelling framework that uses xarray: https://omfit.io/
- issues/PRs needing feedback:
- Raise deprecationwarning when dropping multiindex: https://github.com/pydata/xarray/pull/6592
- rename auto creates index: https://github.com/pydata/xarray/issues/6229
- inline_array open_dataset: https://github.com/pydata/xarray/pull/6566
- default to False for now
- expose as top-level argument
- test by counting number of tasks in graph, and make sure it's 1 smaller
- groupby/flox PR is ready to merge: https://github.com/pydata/xarray/pull/5734
## April 27, 2022
https://us02web.zoom.us/j/88251613296?pwd=azZsSkU1UWJZTVFKNnhIUVdZcENUZz09
### Attendees
- Joe Hamman / @jhamman
- Deepak Cherian / NCAR
- Anderson Banihirwe / @andersy005
- Stephan Hoyer / @shoyer
- Tom Nicholas / @TomNicholas
### 60 second updates
- Joe : mostly vacation, short doc site update
- Deepak : poking around issues, NASA grant is set up
- Tom
- DataTree
- Nearly finished near-total refactor (https://github.com/xarray-contrib/datatree/pull/76)
- Prototype html repr (https://github.com/xarray-contrib/datatree/pull/78)
- xGCM
- not much on xarray itself
### Agenda
- What to do with material in xarray-tutorial repo?
- https://github.com/xarray-contrib/xarray-tutorial/issues/53
- https://tutorial.xarray.dev ??
- Future of multiindex blocking release.
- https://github.com/pydata/xarray/issues/6505
- Quick license question
- ans: datatree just needs to retain any headers mentioning anytree's Apache license
## April 13, 2022
Ryan's link: https://columbiauniversity.zoom.us/j/6902819781
### Attendees
- Stephan Hoyer / shoyer
- Ryan Abernathey / @rabernat
- Tom Nicholas / @TomNicholas
- Justus Magin / @keewis
- Max
- Oriol
### 60-second updates
- Stephan
- Reviewed PR about adding nczarr support
- Ryan
- https://www.ogc.org/ogcevents/cloud-native-geospatial-outreach-event
- Tom
- Contributed to Xarray
- Justus
- Did a few things on pint
- Working on pint / xarray blog post
- Max
- Working with Zarr
- doing Xarray maintenance
### Agenda
- Zarr stuff
- GDAL vs. Xarray CRS conventions
- https://github.com/pydata/xarray/issues/6448
- Related zarr encoding issue https://github.com/pydata/xarray/pull/6476
- Zarr V3: https://github.com/pydata/xarray/pull/6475
- Datatree
- Map operations downwards through the tree (e.g. mean); user question: can I get variables this way
- Oriol from Arviz
- Work with groups, want to keep everything together
- Bayesian data structures
- Priors
- Posteriors
- ??
- "draw dimension"
- They have their own tree-like data structure
- https://github.com/arviz-devs/arviz/blob/main/arviz/data/inference_data.py
- https://github.com/arviz-devs/arviz/issues/2015
- Justus will be on holiday for two weeks
## Mar 30, 2022
### Attendees
- Stephan Hoyer / shoyer
- Ryan Abernathey / @rabernat
- Anderson Banihirwe / @andersy005
- Benoit Bovy / @benbovy
- Justus Magin / @keewis
- Mathias Hauser / @mathause
- Joe Hamman / @jhamman
- Tom Nicholas / @TomNicholas
- Guido Imperiale / @crusaderky
- Max Roos
### 60-second updates
- Ryan:
- Regression in opening unlistable Zarr stores: https://github.com/zarr-developers/zarr-python/issues/993
- Anderson:
- Worked on Xarray landing page. Added jupyterlite
- Benoit
- Fixing issue
- Justus
- Working on pint release
- Working on unit support in indexes
- Docs
- Mathias
- Working on a proposal for a climate emulator
- Joe
- Doc updates for main splash page
- Wants a release and big announcement about indexes
- Wants to help release datatree
- Guido
- Coiled OSS team has been split between dask/dask and dask/distributed;
I'm in the latter so I'm one further step removed from xarray now
### Agenda
- Passing indexes to Xarray constructors:
- https://github.com/pydata/xarray/issues/6392
- Probably not the way users will create indexes the most
- Custom indexes not quite there yet
- NCZarr backend support
- https://github.com/pydata/xarray/pull/6420
- When should we release
- A month since we last released
- Release indexes refactor but not announce until users can create custom indexes?
- Conclusion: Release now but don't announce CZI / indexes yet (who will do release?)
- Writable backend entrypoint?
- https://github.com/pydata/xarray/issues/5954
- https://github.com/TileDB-Inc/TileDB-CF-Py/issues/112
- https://github.com/corteva/rioxarray/issues/433
## Mar 16, 2022
### Attendees
- Joe Hamman / @jhamman
- Tom Nicholas / @TomNicholas
- Mathias Hauser / @mathause
- Justus Magin / @keewis
### 60-second updates
- Joe - Final CZI report submitted
- Tom - not a lot
- Mathias: not much
- Justus: docs maintenance, nothing else
(short meeting, many folks were out this week)
## Mar 02, 2022
### Attendees
- Justus Magin / @keewis
- Tom Nicholas / @TomNicholas
- Mathias Hauser / @mathause
- Joe Hamman / @jhamman
- Benoît Bovy / @benbovy
- Ryan Abernathey / @rabernat
- Stephan Hoyer / shoyer
### 60-second updates
- Justus: worked on pint / pint-xarray
- Tom:
- Some PRs to xarray (drop_duplicates)
- Released new version just now (forgot to do it last week)
- Generally do think increased cadence is working as intended
- Otherwise xGCM (Justus I see your work but won't get to it quite yet!)
- Tentative agreement on working on DataTree (rather than Datagroups model)
- Mathias: some work concerning weighted quantile
- https://github.com/pydata/xarray/pull/6059
- Joe: NumFOCUS report due yesterday
- Benoit: indexes related issues and PRs
- Ryan: we sped up cftime a bunch! https://github.com/pydata/xarray/discussions/6284
- Stephan: NumFOCUS biannual meeting
### Agenda
- Pint-xarray blog post
- Just punt on expects decorator
- Where to publish? xarray.dev/blog?
- Will want a follow-up post anyway for index selection
- Launch xarray.dev?
- Do we need a link back from docs.xarray.dev?
- Rework docs index page (Joe will do)
- Merging explicit index refactor?
- drop encoding sooner
- https://github.com/fsspec/kerchunk/issues/130
- This was the crux of the discussion before: https://github.com/pydata/xarray/pull/5065#issuecomment-806154872
- Anderson's REPL demo:
- https://github.com/xarray-contrib/xarray.dev/pull/148
- https://xarray-dev-git-interactive-repl-xarray.vercel.app/
- Interesting post on Pangeo Forum about Discrete Global Grid Systems:
https://discourse.pangeo.io/t/discrete-global-grid-systems-dggs-use-with-pangeo/2274
## Feb 16, 2022
### Attendees
- Deepak Cherian
- Justus Magin
- Alessandro Amici / @alexamici
- Joe Hamman / @jhamman
- Mathias Hauser / @mathause
- Ryan Abernathey / @rabernat
- Benoît Bovy / @benbovy
- Tom Nicholas / @TomNicholas
- Stephan Hoyer / shoyer
### 60-second updates
- Deepak : Scipy 2022 tutorial proposal is ready to go.
- Justus: tried implementing a custom index
- https://github.com/keewis/xarray-custom-indexes
- Alessandro : discussed DataTree / DataGroup options for hierarchical data structure
- Joe: made new docs page/domain live, a bunch of doc/url fixes
- Mathias: a bit of maintenance
- Benoît: https://github.com/pydata/xarray/pull/5692 should be ready by the end of this week
- Ryan: heads down on Pangeo Forge
- Tom: Discussed DataTree
- First with Josh Moore of Zarr + others on file formats
- Then with Alessandro / Stephan on DataTree vs DataGroup
- Now have choice between [two directions - input welcome](https://github.com/pydata/xarray/issues/4118#issuecomment-1039572760)
- Otherwise just answered a few discussion Q's
- Stephan:
### Agenda
- https://xarray.dev is live
- docs.xarray.dev has taken over xarray.pydata.org (1:1 link, no dead urls)
- Vercel gave us a pro account for free (Open Source)
- Release on Friday, Tom will do this. First CALVER release.
- Now have choice between [two directions - input welcome](https://github.com/pydata/xarray/issues/4118#issuecomment-1039572760)
- Summarized 2 options
- DataGroup approach can be merged into Datasets fairly easily
- Could instead add an `open_variable` function to allow low-level opening of unusual files
- Could also imagine
## Feb 2, 2022
### Attendees
- Tom Nicholas / @TomNicholas
- Joe Hamman / @jhamman
- Stephan Hoyer / @shoyer
- Justus Magin / @keewis
- Ryan Abernathey / @rabernat
- Mathias Hauser / @mathause
- Maximilian Roos
- Guido Imperiale / @crusaderky
### 60-second updates
- Tom
- still thinking about DataTree
- finalizing pint-xarray PR (pre blog post)
- Joe
- helped with packaging dependency fix
- working on xarray.dev splash page / domain name setup
- Stephan - no updates
- Justus
- reviewing PRs, answering issues / discussions
- Ryan - working on Zarr/FSSpec mapping objects
- https://github.com/zarr-developers/zarr-python/pull/911
- Mathias
- Maintenance stuff (warnings)
- Max
- made release, removed stable branch, making releases easier
- Guido
- Working on dask gen_cluster decorator, fixing failing tests in xarray.
There is something (not dask) in the xarray test suite that's leaking
subprocesses that don't respond or are slow to respond to SIGTERM.
I have yet to find out what.
### Agenda
- xarray.dev
- splash page being deployed to https://xarray.dev
- status: fighting with dns
- sphinx doc site deployed to https://docs.xarray.dev
- status: redirecting to https://xarray.pydata.org/
- Future sub-sites (e.g. blog) can be deployed to {foo}.xarray.dev
- Future/related projects docs can be deployed to {bar}.docs.xarray.dev
- Reading / writing from multiple netCDF groups at once [issue 6174](https://github.com/pydata/xarray/issues/6174)
- DataTree (Tom gives an overview of current discussions)
- https://cfconventions.org/cf-conventions/cf-conventions.html#groups
- Joe to build some weird datasets for testing/discussion purposes
- Tom to mess about with allowing `__getitem__` to point up the tree
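
To make the "`__getitem__` pointing up the tree" idea concrete, here is a toy sketch. The `Node` class is a hypothetical stand-in, not the datatree API, and the `..` path syntax is assumed by analogy with filesystem paths. A relative path can climb to a parent before descending again.

```python
class Node:
    """Hypothetical stand-in for a tree node; not the datatree API."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self

    def __getitem__(self, path):
        node = self
        for part in path.split("/"):
            if part in ("", "."):
                continue  # ignore empty segments and "current node"
            if part == "..":
                if node.parent is None:
                    raise KeyError("already at the root")
                node = node.parent  # step up the tree
            else:
                node = node.children[part]  # step down the tree
        return node


root = Node("root")
obs = Node("obs", parent=root)
model = Node("model", parent=root)
run1 = Node("run1", parent=model)

sibling = run1["../../obs"]  # up two levels, then down into "obs"
```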
## Jan 19, 2022
### Attendees
- Max Roos / @max-sixty
- Joe Hamman / @jhamman
- Mathias Hauser / @mathause
- Stephan Hoyer
- Justus Magin / @keewis
- Deepak Cherian
- Tom Nicholas / @TomNicholas
### 60-second updates
- Max — some reviews & comments & typing. Need to merge a few hanging PRs
- Joe - a bit of work on the new doc site
- Justus - answers / reviews, pint
- Deepak - no updates, cf-xarray talk at ams meeting next week
- Mathias: not much (some reviews)
- Tom
- [DataTree design doc](https://docs.google.com/document/d/19jVW5lL2jwhS0dgj9XqPBrcvIa13cpWrnDsnVLqZkfc/edit?usp=sharing)
- Helpful comments from Stephan
- Need to actually make changes
- Then will make a public issue and tag a load of people for feedback
- Little bit of pint-xarray
### Agenda
- keep_attrs
- Switch default to `keep_attrs='drop_conflicts'`?
- here's what Iris does: https://scitools-iris.readthedocs.io/en/stable/further_topics/lenient_metadata.html
- prs to review:
- keep_attrs for xr.where: https://github.com/pydata/xarray/pull/4687
- Release
- Regular releases?
- get rid of stable branch? https://docs.readthedocs.io/en/stable/versions.html: "we also create a stable version, tracking your most recent release"
- maybe switch to CalVer? Joe will open an issue
- split up the index refactor PR?
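
On the `keep_attrs='drop_conflicts'` question above: assuming it follows the same rule as the existing `combine_attrs="drop_conflicts"` option (attributes from all inputs are combined, and any name that shows up with different values is dropped), the behaviour can be sketched in plain Python. The `drop_conflicts` helper below is illustrative only and is not Xarray's implementation.

```python
def drop_conflicts(*attrs_dicts):
    """Combine attrs, dropping every key whose values disagree (sketch)."""
    result = {}
    dropped = set()
    for attrs in attrs_dicts:
        for key, value in attrs.items():
            if key in dropped:
                continue  # already known to conflict
            if key not in result:
                result[key] = value
            elif result[key] != value:
                del result[key]
                dropped.add(key)
    return result


a = {"units": "K", "long_name": "air temperature"}
b = {"units": "degC", "long_name": "air temperature", "source": "model"}

combined = drop_conflicts(a, b)
# "units" disagrees and is dropped; matching and unique keys survive.
```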
## Jan 5, 2022
### Attendees
- Max Roos / @max-sixty
- Tom Nicholas / @TomNicholas
- Mathias Hauser / @mathause
- Stephan Hoyer
- Justus Magin / @keewis
- Deepak Cherian
### 60-second updates
- Max — fixed main / dask issues I think. Not much else.
- Somewhat related to https://github.com/pydata/xarray/pull/6077
- Tom
- [DataTree design doc](https://docs.google.com/document/d/19jVW5lL2jwhS0dgj9XqPBrcvIa13cpWrnDsnVLqZkfc/edit)
- Would like any interested xarray devs to have a look first
- Then I want to get feedback from outside stakeholders
- Mathias: some upstream fixes
- Justus: some reviews / answers, keep_attrs for xr.where, metadata guide
### Agenda
- drop python 3.7
- https://github.com/pydata/xarray/issues/6138
- https://github.com/pydata/xarray/pull/5892
- DataTree review
- Deprecate `__bool__`? https://github.com/pydata/xarray/issues/6124
- Deepak: Code to generate reduction methods needs another review (thanks mathias): https://github.com/pydata/xarray/pull/5950
- Mathias: `Hashable` vs. `dims=("x", "y")`
- Maybe normalize `str | Sequence[Hashable]` to `Sequence[Hashable]`?
- Mathias to make an issue
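
The `str | Sequence[Hashable]` problem above comes from a string being itself a sequence of characters. A minimal sketch of the normalization being proposed (the helper name `normalize_dims` is made up here, not an Xarray function):

```python
from collections.abc import Hashable, Iterable
from typing import Tuple, Union


def normalize_dims(dims: Union[str, Iterable[Hashable]]) -> Tuple[Hashable, ...]:
    """Turn a single dimension name or several names into a tuple of names."""
    if isinstance(dims, str):
        # Without this branch, "time" would become ("t", "i", "m", "e").
        return (dims,)
    return tuple(dims)


single = normalize_dims("time")
several = normalize_dims(["x", "y"])
```

A non-string hashable that is also iterable (a tuple used as one dimension name, for example) stays ambiguous under this rule, which is presumably part of what the issue needs to settle.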
## Dec 7, 2021
### Attendees
- Joe Hamman / @jhamman
- Justus Magin / @keewis
- Deepak Cherian / @dcherian
- Ryan Abernathey / @rabernat
- Stephan Hoyer
- Benoit Bovy / @benbovy
- Andy Sweet / @andy-sweet
- Tom Nicholas / @TomNicholas
### 60-second updates
- Joe
- Mostly coordination stuff
- Justus
- Pint-xarray review
- Deepak
- Ryan
- Zarr bitinformation codec - https://github.com/zarr-developers/numcodecs/issues/298
- Stephan
- Benoit
- Flexible indexes
- Tom
- Pint-xarray decorator
- Working on data tree
- Andy Sweet (CZI)
- Working on Napari
- Largish refactor around how Napari stores data
- Want to use Xarray as a core data model in Napari
### Agenda
- Xarray-lite (https://github.com/pydata/xarray/issues/3981)
- Intelligent dimension-name-based broadcasting
- Extract Variable from xarray
- Also could refactor out the indexing wrappers
- Max: release
- upstream-dev failures?
- PRs needing review:
- script to generate reduction functions https://github.com/pydata/xarray/pull/5734
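
As a reminder of what "dimension-name-based broadcasting" in the Xarray-lite item means, here is a rough pure-Python sketch. The helpers `broadcast_dims` and `aligned_shape` are hypothetical and only compute dimension names and shapes. Operands are matched by dimension name rather than by axis position, and a missing dimension becomes a length-1 axis.

```python
def broadcast_dims(*operands):
    """Merge {dim name: size} mappings into one ordered mapping (sketch)."""
    out = {}
    for sizes in operands:
        for dim, size in sizes.items():
            if dim in out and out[dim] != size:
                raise ValueError(
                    f"size mismatch along {dim!r}: {out[dim]} vs {size}"
                )
            out.setdefault(dim, size)
    return out


def aligned_shape(sizes, all_dims):
    """Shape an operand would be reshaped to before elementwise math.

    Assumes the operand's own dims already appear in the same relative
    order as in ``all_dims``; otherwise a transpose is needed first.
    """
    return tuple(sizes.get(dim, 1) for dim in all_dims)


temperature = {"time": 3, "x": 4}
weights = {"x": 4, "y": 2}

result_dims = broadcast_dims(temperature, weights)
temperature_shape = aligned_shape(temperature, result_dims)
weights_shape = aligned_shape(weights, result_dims)
```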
## Nov 24, 2021
### Attendees
- Tom Nicholas / @TomNicholas
- Joe Hamman / @jhamman
- Justus Magin / @keewis
- Deepak Cherian / @dcherian
- Benoît Bovy / @benbovy
- Guido Imperiale / @crusaderky
- Stephan Hoyer
- Anderson Banihirwe / @andersy005
- Alan Snow / @snowman2
- Todd Anderson
### 60-second updates
- Tom
- been out the past two weeks
- Joe
- working with Anderson on new web splash page
- progress on xarray-schema
- Justus
- pint
- Deepak
- numpy-groupies is working!
- Benoît
- took a break from indexes refactor but working on it again
- expecting refactor to be done by the end of the year
- will require some user facing work after
- Guido
- working on dask.distributed
- Stephan
- not much
- Anderson
- splash page is ready to publish
- Need to sort out the routing
- Max — sorry I've been out of it for so long! Happy to see my absence
had such little impact :). I'm back and can help on things (e.g. Deepak
put out the bat signal for typing help...)
- Alan - here to listen in to the conversation (roadmap, backend entrypoint write support, custom indexes)
- Todd - working on Ramba integration, curious how backend loading works
### Agenda
- Splash page update (Anderson)
- https://tex.stackexchange.com/questions/51757/how-can-i-use-tikz-to-make-standalone-svg-graphics
- https://xarray.vercel.app/
- https://github.com/andersy005/xarray-website
- Should release (backend bugfix merged)
- +1
- PRs need review:
- script to generate reductions (deepak)
- https://github.com/pydata/xarray/pull/5950
- numpy_groupies (but needs 5950^ merged first)
- https://github.com/pydata/xarray/pull/5734
- More prescriptive issue? (Assuming we take @TomNicholas' addition) https://github.com/pydata/xarray/pull/5787
- Ask people to type their tests? https://github.com/pydata/xarray/pull/5694
- @stephan can I (Tom) pick your brains about datatree at some point?
## Nov 10, 2021
### Attendees
- Ryan Abernathey / @rabernat
- Tom Nicholas / @TomNicholas
- Mathias Hauser / @mathause
- Joe Hamman / @jhamman
- Stephan Hoyer / shoyer
- Todd Anderson
### 60-second updates
- Tom
- Released v0.20.0
- pint-xarray blog post
- (Only thing that isn't finished is an @expects decorator for wrapping non-pint functions)
- more messing about with DataTree
- Tried [altering Dataset to include a "manifest"](https://github.com/pydata/xarray/pull/5961)
- Hopefully a better way that doesn't involve adding any new (private) attributes to Dataset
- Ryan
- Trying to engage with TileDB
- https://github.com/TileDB-Inc/TileDB-CF-Py/issues/112
- https://github.com/pydata/xarray/issues/5954
- Lecture Notes
- https://earth-env-data-science.github.io/lectures/xarray/xarray.html
- Joe
- Playing with a little schema validation lib for xarray, very early prototype: https://github.com/carbonplan/xarray-schema
- (@joe you know this exists right? https://github.com/astropenguin/xarray-dataclasses Just in case it's relevant for schema stuff)
- CITATION.cff
- Stephan
- explicit indexes is coming along, but the user facing parts won't be finished
- simplest index: indexing across the date-line
- Mathias
- not much/ some test fixes
### Agenda
- Todd (ramba)
- Todd is numba core dev
- Ramba = Ray + Numba
- https://github.com/Python-for-HPC/ramba
- Removed Ray requirement
- Lazy evaluation - elementwise operations; certain operations trigger evaluation -> form a numba function dynamically
- Release cadence / process
- Grid_ufunc
- DataTree q's (if there's time)
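
The lazy-evaluation model described for Ramba above can be pictured with a toy example. This is not Ramba code: the `Lazy` class is invented for illustration, and a plain Python closure stands in for the dynamically built Numba function. Elementwise operations are only recorded, and an operation that needs real values (here `sum` or `compute`) fuses them into one kernel and runs it.

```python
class Lazy:
    """Toy deferred array: records elementwise ops instead of running them."""

    def __init__(self, data, ops=()):
        self._data = list(data)
        self._ops = tuple(ops)

    def map(self, fn):
        # Record the operation; nothing is computed yet.
        return Lazy(self._data, self._ops + (fn,))

    def __add__(self, scalar):
        return self.map(lambda x: x + scalar)

    def __mul__(self, scalar):
        return self.map(lambda x: x * scalar)

    def _fused(self):
        ops = self._ops

        def kernel(x):
            # One pass per element through all recorded operations.
            for fn in ops:
                x = fn(x)
            return x

        return kernel

    def compute(self):
        kernel = self._fused()
        return [kernel(x) for x in self._data]

    def sum(self):
        # A reduction needs concrete values, so it triggers evaluation.
        return sum(self.compute())


expr = (Lazy([1, 2, 3]) + 1) * 10  # two operations recorded, none executed
total = expr.sum()
```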
## Oct 27, 2021
### Attendees
- Tom Nicholas / @TomNicholas
- Justus Magin / @keewis
- Guido Imperiale / @crusaderky
- Stephan Hoyer / @shoyer
- Ryan Abernathey / @rabernat
- Joe Hamman / @jhamman
### 60-second updates
- Tom
- Wrote most of pint-xarray blog post
- Feedback welcome https://github.com/xarray-contrib/pint-xarray/pull/142
- Fixes in pint-xarray, xarray
- Requires a new xarray release
- Where should we put the blog post?
- Some other PRs to xarray (including adding .chunksizes)
- Still pontificating on DataTree problems
- Justus
- reviewing PRs but no code changes lately
- importlib PR needs attention
- https://github.com/pydata/xarray/pull/5845
- depend on importlib_metadata in conda until we drop 3.7
- Guido
- working on dask
- Stephan: working with Benoit on indexes refactor
- Joe: working on CZI grant reporting
- Ryan: busy with other things, lots of Pangeo Forge
### Agenda
- Where should we put the blog post?
- dev.to?
- Release v0.20?
- https://github.com/pydata/xarray/issues/5889
- DataTree update
- Interest via Zarr's CZI grant
- Hopefully get BOpen people to work on it?
## Oct 13, 2021
is anyone able to get in? nope
We could use my personal zoom room instead? (Tom)
https://columbiauniversity.zoom.us/j/93373916479?pwd=UnZQT0tQSWNFbC9HbGFtc1FMOVFJdz09
yes please
### Attendees
- Tom Nicholas / @TomNicholas
- Benoît Bovy / @benbovy
- Stephan Hoyer / @shoyer
- Justus Magin / @keewis
### 60-second updates
- Tom
- Fixed bug with combine_by_coords and DataArrays (https://github.com/pydata/xarray/pull/5834)
- Would like to discuss inconsistency of .chunks for DataArrays and Datasets (https://github.com/pydata/xarray/issues/5843)
- Benoît: flexible indexes (update alignment)
- Justus: pint-xarray, reviews
### Agenda
- representation of vector quantities (https://github.com/pydata/xarray/discussions/5775)
- usually: data variables in a Dataset or stacked in a DataArray
- disadvantage of DataArray: lost attributes (may be fixable)
- used in the upcoming xr.cross
- represent as a DataArray / separate package?
- changes in `stack`/`unstack` behavior (explicit indexes refactor): https://github.com/pydata/xarray/issues/5202#issuecomment-935683056
- inconsistency of .chunks
- Tom will add a .chunksizes attribute that returns a dict
- pint-xarray blog post (now that https://github.com/pydata/xarray/pull/5571 is merged)
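The `.chunks` inconsistency discussed above is that a DataArray reports a tuple of per-axis chunk tuples, while a Dataset reports a mapping keyed by dimension name. Below is a minimal pure-Python sketch of the dict form a `.chunksizes` attribute could return. `chunksizes_from` and `merge_chunksizes` are hypothetical helpers for illustration, not xarray's implementation.

```python
def chunksizes_from(dims, chunks):
    """Map dimension names to chunk tuples (hypothetical helper, not xarray code).

    dims   -- dimension names, e.g. ("time", "x")
    chunks -- tuple of per-axis chunk tuples, e.g. ((2, 2), (3,)),
              or None for an unchunked array
    """
    if chunks is None:
        return {}
    if len(dims) != len(chunks):
        raise ValueError("need exactly one chunk tuple per dimension")
    return dict(zip(dims, chunks))


def merge_chunksizes(*per_variable):
    """Combine per-variable mappings, rejecting conflicting chunking."""
    merged = {}
    for sizes in per_variable:
        for dim, chunk in sizes.items():
            if dim in merged and merged[dim] != chunk:
                raise ValueError(f"inconsistent chunks along dimension {dim!r}")
            merged[dim] = chunk
    return merged


# Two variables sharing the "time" dimension with identical chunking.
temperature = chunksizes_from(("time", "x"), ((2, 2), (3,)))
pressure = chunksizes_from(("time",), ((2, 2),))
combined = merge_chunksizes(temperature, pressure)
```

Returning a mapping for both object types means calling code does not need to know whether it holds a single array or a collection of them.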
## Sept 29, 2021
### Attendees
- Joe Hamman / @jhamman
- Justus Magin / @keewis
- Alessandro Amici / @alexamici
- Tom Nicholas / @TomNicholas
- Stephan Hoyer / @shoyer
- Julia Signell / @jsignell
- Deepak Cherian / @dcherian
- Alan Snow / @snowman2
### 60-second updates
- Joe - not much, talking with people about DataTree
- Justus - reviews
- Alessandro - discussions about Zarr/datatree contributions
- Tom
- DataTree discussions
- Held big meeting of different duck array libraries
- Made this https://github.com/pydata/duck-array-discussion
- Stephan - attended duck array meeting, following along with Benoit's indexes work
- Julia - no update, here to talk about plotting
- Deepak - NASA proposal was awarded to support Xarray
- Alan - maintaining rioxarray
### Agenda
- deprecate open_rasterio (https://github.com/pydata/xarray/pull/5808)
- merge, but make sure to update the version number if we skip 0.19.1
- plotting entrypoints (https://github.com/pydata/xarray/pull/3640)
- revisit the conversation with Anderson/Philip
- Pandas has an entrypoint pattern, we can probably just copy this
- Q for Stephan about his cryptic comment on DataTree (https://github.com/TomNicholas/datatree/issues/2#issuecomment-925349259)
- Basically have both a DataTree node and its wrapped Dataset point to the same dict of variables, so that they can't disagree
- examples in docstrings and linking to narrative documentation
- https://github.com/pydata/xarray/issues/5816
- there are a few methods that don't have examples in docstrings, but there are some in the docs
- make sure links are readable in both text and HTML
- see https://github.com/pydata/xarray/blob/main/xarray/core/dataset.py#L2022 for an example
- (already posted in the issue)
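The DataTree idea from the agenda above (a node and its wrapped Dataset pointing to the same dict of variables) can be sketched with toy classes. `TreeNode` and `WrappedDataset` are hypothetical stand-ins, not the datatree or xarray classes. The only point is that both objects hold a reference to one dict, so neither can go stale.

```python
class WrappedDataset:
    """Toy stand-in for a Dataset: a view over a dict of variables."""

    def __init__(self, variables):
        # Keep a reference to the caller's dict; do not copy it.
        self._variables = variables

    def __getitem__(self, name):
        return self._variables[name]

    def names(self):
        return sorted(self._variables)


class TreeNode:
    """Toy stand-in for a DataTree node that owns the variables."""

    def __init__(self, name, variables=None):
        self.name = name
        self._variables = dict(variables or {})
        # The wrapped dataset sees the very same dict object.
        self.ds = WrappedDataset(self._variables)

    def __setitem__(self, name, value):
        self._variables[name] = value


node = TreeNode("root", {"temperature": [280.0, 281.5]})
# Written through the node, visible through the wrapped dataset.
node["pressure"] = [1013.0, 1009.0]
```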
## Sept 14, 2021
### Attendees
- Joe Hamman / @jhamman
- Justus Magin / @keewis
- Alessandro Amici / @alexamici
- Tom Nicholas / @TomNicholas
- Benoît Bovy / @benbovy
- Stephan Hoyer / @shoyer
- Deepak Cherian / @dcherian
- Anderson Banihirwe / @andersy005
### 60-second
- Joe: not much, a bit of datatree
- Alessandro: been away with a lot of work and then holiday. See the zarr CZI contact in the agenda
- Benoît: https://github.com/pydata/xarray/pull/5692
- Tom
- Stuck on this design question in DataTree - how to stop users creating name collisions? (https://github.com/TomNicholas/datatree/issues/38)
- Trying to organise cross-library duck array ops meeting - Justus perhaps we should think in more detail about agenda?
- I (Justus) will try to add the other issues I think we have
- Otherwise mostly working on other libraries
- Justus: pint
- Stephan
- Deepak: NASA grant was funded!
- Anderson: working on Xarray home page update: https://github.com/andersy005/xarray-website
### Agenda
- B-Open contacted by the zarr team that was awarded a CZI grant
- B-Open also in contact with ESA, which is asking for ideas and best practices for data distribution
- [xpublish](https://github.com/xarray-contrib/xpublish) might be a nice & scalable approach to serve data dynamically, complementary to data catalogs created with, e.g., pangeo-forge. It needs work, though. It would be great to support more standardized REST APIs (WMTS, STAC, etc.) as well as rapidly evolving data collections.
- Regarding numfocus: we are missing the last payment, with no response!
## Sept 01, 2021
### Attendees
- Tom Nicholas / @TomNicholas
- Joe Hamman / @jhamman
- Stephan Hoyer / @shoyer
- Deepak Cherian / @dcherian
- Benoît Bovy / @benbovy
- Justus Magin / @keewis
- Tom
- DataTree
- Joe helped by adding .to_netcdf and .to_zarr - can now roundtrip!
- Joe also set up some basic CI for me
- Currently a little stuck on problem of mapping over multiple trees side-by-side https://github.com/TomNicholas/datatree/issues/29
- Stephan: CZI docs need signing
- Deepak: numpy groupies, painful edge cases
- Benoit: Explicit indexes, messy PR
- Justus: moved to France, update to docs and pint things
- Joe: datatree, czi proposal, and anderson's web refactor
### Agenda
- transfer xarray-leaflet to xarray-contrib?
- publish [copy of our CZI proposal](https://docs.google.com/document/d/11zX1ghJTD6pX9TC4Ib4lmP_k1tAFUV39XAdUbeB8b98/edit?usp=sharing)?
- mypy errors? https://github.com/pydata/xarray/issues/5755
- Anderson has a draft of the new website:
- https://xarray.vercel.app/
- https://github.com/andersy005/xarray-website
- add new maintainer?
## Aug 18, 2021
To join: meet.google.com/sfs-jfqq-rtd
### Attendees
### 60-second
- Stephan:
- NumFOCUS wants 15% overhead (from 10%)
- Working on Xarray-Beam (it has docs now!): https://github.com/google/xarray-beam
- Ryan's tweet: https://twitter.com/rabernat/status/1427897395271061505
- Max: many annotations are not tested. Can we enable mypy in the test suite?
- Justus: duckarray testing, pint
- Tom: playing around with dataset trees!
## Aug 04, 2021
### Attendees
- Joe Hamman / @jhamman (on mute, get rolling without me)
- Max Roos / @max-sixty
- Justus Magin / @keewis
- Benoit Bovy / @benbovy
- Deepak Cherian / @dcherian
### 60-second
- Max: nothing much; triaging + reviews + typing
- Justus: use_bottleneck, context object for merge_attrs functions, user survey
- Benoit: indexes...
- Deepak: not much
- Joe: heard no from CZI on EOSS4 application
### Agenda
- user survey blog posts (dedicated meeting?)
- max is happy to help
- joe too. just need to schedule something
- CZI EOSS4 News :(
- upload proposal somewhere
- reach out to companies
- PRs to review:
- cross product: https://github.com/pydata/xarray/pull/5365
- API decision required; make simpler; accept only DataArrays (ask users to do )
- storage options for zarr: https://github.com/pydata/xarray/pull/5615
- complex dtypes for rasterio: https://github.com/pydata/xarray/pull/5501
- use_bottleneck: https://github.com/pydata/xarray/pull/5560
- try to mock using pytest's monkeypatch
- accessors documentation: https://github.com/pydata/xarray/pull/3960
- flexible indexes: https://github.com/pydata/xarray/pull/5636
## Jul 21, 2021
### Attendees
- Tom Nicholas / @TomNicholas
- Deepak Cherian / @dcherian
- Justus Magin / @keewis
- Joe Hamman / @jhamman
- Mathias Hauser / @mathause
- Stephan Hoyer / @shoyer
- Anderson Banihirwe / @andersy005
- Max Roos / @max-sixty
### 60-second updates:
- Tom: been away for the last week, almost ready for the release (today?) Also gave Scipy Tools talk!
- Deepak: reviewing rolling padding PR (see agenda)
- Justus: answering issues / discussions, investigating deprecation cycles
- Joe: working with Anderson on the splash page
- Mathias: not super active, just a few reviews
- Stephan: noticed new dependency addition
- Anderson: working on the splash page
### Agenda
- Release
- to_numpy is done, should be merged https://github.com/pydata/xarray/pull/5568
- roll back addition of typing_extensions? https://github.com/pydata/xarray/pull/5624
- complete deprecation cycles?
- open PR to discuss
- pin fsspec in CI
- syntax for controlling [padding with rolling](https://github.com/pydata/xarray/pull/5603#issuecomment-883330196)
- Boolean turns padding on/off, call `pad` explicitly for full control
- `.pad(x=3, mode="wrap").rolling(time=5, x=3, pad={"x": False})`
- Full control of padding:
- Overload `pad`: `.rolling(time=5, x=3, pad={"x": {"mode": "wrap"}, "time": False})`
- `pad` and `pad_kwargs`: `rolling(time=5, x=3, pad_kwargs={"x": {"mode": "wrap"}}, pad={"x": True, "time": False})`
- bottleneck deactivation: use_bottleneck vs. accelerate_with (also for numbagg)
- https://github.com/pydata/xarray/pull/5560
- use_bottleneck for now
- splash page redesign concept: https://www.figma.com/file/olCObSTsPg5cMmSpBEramU/landing-site?node-id=2%3A2
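To make the first rolling option above concrete (pad explicitly, then roll with padding switched off), here is a pure-Python sketch on a plain list. `pad_wrap` and `rolling_mean` are illustrative helpers, not the proposed xarray API. Wrap padding is assumed to behave like a periodic boundary.

```python
def pad_wrap(values, width):
    """Pad `width` items on each side by wrapping around (periodic boundary)."""
    if width == 0:
        return list(values)
    if width > len(values):
        raise ValueError("width must not exceed the number of values")
    values = list(values)
    return values[-width:] + values + values[:width]


def rolling_mean(values, window):
    """Mean over every full window; no implicit padding is applied."""
    values = list(values)
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


data = [1.0, 2.0, 3.0, 4.0]
padded = pad_wrap(data, 1)               # one element borrowed from each end
wrapped_means = rolling_mean(padded, 3)  # same length as `data`
unpadded_means = rolling_mean(data, 3)   # shorter: only full windows
```

With a window of 3 and one wrapped element per side, the output keeps the input length, which is the behavior the boolean `pad` flag would otherwise have to provide implicitly.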
## Jul 7, 2021
Video call link: https://meet.google.com/cbm-sope-ipd
Or dial: (US) +1 617-675-4444 PIN: 863 675 265 4912#
More phone numbers: https://tel.meet/cbm-sope-ipd?pin=8636752654912
### Attendees
- Deepak Cherian / @dcherian
- Justus Magin / @keewis
- Guido Imperiale / @crusaderky
### 60-second updates:
- Deepak : unstacking to [sparse](https://github.com/pydata/xarray/pull/5577), [dask](https://github.com/pydata/xarray/issues/5582); [weighted groupby](https://github.com/pydata/xarray/pull/5480)
- Justus: rename branch, pint
- Tom: little pint integration things, duck_array_ops
- Max: A couple of tweaky updates, mostly lots of triaging and some reviews
### Agenda:
- new release?
- Tom
- pint imports
- meeting for solving the duck array hierarchy issue?
- Tom will open an issue
- Do we need a deprecation cycle for kwarg changes which we don't expect many people to use?
- How could I (@max) have handled this better? https://github.com/pydata/xarray/issues/5545#issuecomment-871808557. Generally xarray is a place of
- automatic labelling / assigning
- labels for issues (needs triage?) using the template, PRs might be more difficult
- assigning: @max will open an issue
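On the question of deprecation cycles for rarely used kwargs: one lightweight pattern is a decorator that keeps accepting the old name while warning. This is a generic sketch, not xarray's internal helper. `deprecate_kwarg`, `summarize`, and the kwarg names are all hypothetical.

```python
import functools
import warnings


def deprecate_kwarg(old, new):
    """Accept `old` as an alias for `new` while emitting a FutureWarning."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old in kwargs:
                if new in kwargs:
                    raise TypeError(f"pass either {old!r} or {new!r}, not both")
                warnings.warn(
                    f"{old!r} is deprecated; use {new!r} instead",
                    FutureWarning,
                    stacklevel=2,
                )
                # Forward the value under the new name.
                kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecate_kwarg(old="keep", new="keep_attrs")
def summarize(values, keep_attrs=False):
    # Hypothetical function; only the keyword handling matters here.
    return sum(values), keep_attrs
```

Callers using the old name keep working for one release cycle and see a warning that names the replacement.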
## Jun 23, 2021
### Attendees
- Deepak Cherian / @dcherian
- Justus Magin / @keewis
- Benoît Bovy / @benbovy
- Joe Hamman / @jhamman
- Stephan Hoyer / @shoyer
- Alessandro Amici / @alexamici
- Tom Nicholas / @TomNicholas
- Max Roos / @max-sixty
### 60-second updates:
- Deepak: coarsen.construct, PR for weighted groupby (needs feedback)
- Justus: a few reviews
- Benoît: interesting indexes meeting yesterday
- Joe: no real update
- Stephan: no real update (master to main topic below)
- Alessandro: overloaded at BOpen so busy, just finished backend error message PR
- Tom:
- Tried numpy_groupies approach for histograms but not any better
- Some small reviews / PRs
- xarray-pint blog post suggestion below
- Max: usual reviews & discussions. Couple of tiny PRs. I have been trying to get xarray & dask to replace some spark workloads, with mixed success and confusion!
### Agenda:
- issues/PRs needing attention:
- control CF encoding in to_zarr
- https://github.com/pydata/xarray/issues/5405
- combine_by_coords accepting dataarrays should be merged
- https://github.com/pydata/xarray/pull/4696
- weighted groupby could use some comments (incomplete)
- https://github.com/pydata/xarray/pull/5480
- error bars in scatters
- https://github.com/pydata/xarray/pull/4857
- cumsum groupby
- https://github.com/pydata/xarray/pull/3417
- rename master to main?
- should be pretty simple
- need to double check CI
- https://github.com/pydata/xarray/issues/5516
- blogs
- xarray-pint
- xarray user survey
- new backends: xarray-sentinel, xarray-prisma
- we should link to these backends here:
- https://xarray.pydata.org/en/stable/user-guide/io.html#third-party-libraries
- or in the backends section
## Jun 09, 2021
Google Meet link (this week only):
meet.google.com/mvr-ccjv-acu
### Attendees
- Max Roos / @max-sixty
- Stephan Hoyer / @shoyer
- Deepak Cherian / @dcherian
- Justus Magin / @keewis
- Benoit Bovy / @benbovy
- Aureliana / @aurghs
- Guido Imperiale / @crusaderky
- Tom Nicholas / @TomNicholas
- Joe Hamman / @jhamman
### 60 second updates
- Max: Usual set of reviews, added "norms" for discussions (can't do a template), trying to be more assertive towards encouraging, PRs around test changes (nothing monumental)
- Stephan: discussion about primitives for parallel computation: https://github.com/spencerkclark/xpartition/issues/13
- Justus: reviews / answers, pint integration bugs, combine_attrs + callable PR
- Deepak: nothing, was on vacation
- Aureliana: nothing, also I will probably not be available next month.
- Benoit: Flexible indexes (+ discussions Geo/Astro/Bio domains)
- Guido: nothing
- Joe: not really
- Tom: histogram
### Agenda
- meeting link: starting next week->https://us02web.zoom.us/j/88251613296?pwd=azZsSkU1UWJZTVFKNnhIUVdZcENUZz09
- automatic PR labeller:
- https://github.com/dask/dask/blob/main/.github/workflows/labeler.yml
- https://github.com/dask/dask/blob/main/.github/labeler.yml
- Yes, great, I saw these. I can add them, it's not what I was hoping for around assigning people, or workflow stage, but they are net good
- Canonical test fixture: https://github.com/pydata/xarray/pull/5411
- Max to check whether ds has everything we need, then if that works we can have a norm that new tests use that fixture, unless the test requires something specific.
- Histogram:
- Stephan's implementation: https://github.com/google/jax-cfd/blob/a2b4e66eca5d09f909f33071845a5bef4c95cc61/jax_cfd/data/xarray_utils.py#L226
## May 26, 2021
### Attendees
- Max Roos / @max-sixty
- Alessandro Amici / @alexamici
- Benoit Bovy / @benbovy
- Joe Hamman / @jhamman
- Tom Nicholas / @TomNicholas
- Justus Magin / @keewis
- Stephan Hoyer / @shoyer
- Deepak Cherian / @dcherian
- David Huard / @huard
- Mathias Hauser / @mathause
### 60 second updates
- Max:
- Two releases, lots of reviews
- Added a test fixture which abstracts over dask / numpy arrays
- numbagg CI
- Hawking my warez — pytest-accept — https://github.com/max-sixty/pytest-accept
- Tweaky PRs — dicts in ds construction
- Tom:
- Couple of reviews (#4696 ready to merge)
- Attended most of dask summit
- Added a suggestion for new section to our roadmap (on flexible high-level data structures)
- Alessandro
- CZI proposal
- reviewed a couple of issues / PR on backends
- Benoît
- Submitted #5322
- Prepared / given talk on Xarray flexible indexes @ Dask summit
- Slides: https://speakerdeck.com/benbovy/xarray-flexible-indexes
- Joe
- CZI proposal
- Roadmap update
- Justus
- worked on passing functions to merge_attrs
- prepared to enable velin
- user survey update
- dask summit
- Stephan
- xarray-beam
- watched dask-summit talks
- Deepak
- xarray session at the dask-summit
- helped on proposal
- David
- Looking into provenance tracking in xarray
- Mathias
- not much
### Agenda
- provenance tracking xarray
- https://github.com/xarray-contrib/cf-xarray/issues/228
- https://discourse.pangeo.io/t/tracking-provenance-in-xarray/1510
- http://metaclip.org/
- Can of worms - xarray should provide hooks instead of hardcoded support
- https://github.com/pydata/xarray/pull/4896
- Gold standard to run provenance in a separate parallel process
- Roadmap?
- Scipy 3-min progress
- Tom as prime, Alessandro as backup
- Numbagg — 0.2.0 release? https://github.com/shoyer/numbagg/issues/37
- NoMatchingEngineError https://github.com/pydata/xarray/pull/5351
- From @max — I was about to restart numpy groupies — but @dcherian have you got this now?
- @deepak: See https://github.com/dcherian/dask_groupby . I think if you stick `groupby_reduce` in an `apply_ufunc` call with `dask="allowed"`, things should mostly work. We could do it slowly and split the load.
- @max: Amazing!
- @deepak: there's a `chunk_reduce` function that basically gets the inputs in a form that can be provided to `numpy_groupies`. It could use some code reviewing :)
- @deepak: also `xarray_reduce` actually does do the apply_ufunc thing.
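The chunk-then-combine idea behind the groupby discussion above can be sketched in pure Python. The function names below are illustrative and are not the `dask_groupby` or `numpy_groupies` APIs. Each chunk is reduced to per-label intermediates, the intermediates are merged, and the final statistic is computed once at the end.

```python
from collections import defaultdict


def partial_group_sums(labels, values):
    """Reduce one chunk to {label: (sum, count)} intermediates."""
    partial = defaultdict(lambda: (0.0, 0))
    for label, value in zip(labels, values):
        total, count = partial[label]
        partial[label] = (total + value, count + 1)
    return dict(partial)


def combine_partials(partials):
    """Merge per-chunk intermediates; this is the combine step of the tree."""
    merged = defaultdict(lambda: (0.0, 0))
    for partial in partials:
        for label, (total, count) in partial.items():
            merged_total, merged_count = merged[label]
            merged[label] = (merged_total + total, merged_count + count)
    return dict(merged)


def finalize_mean(merged):
    """Turn (sum, count) pairs into group means."""
    return {label: total / count for label, (total, count) in merged.items()}


# Two "chunks" of the same array, each with its own group labels.
chunks = [
    (["a", "b", "a"], [1.0, 10.0, 3.0]),
    (["b", "b", "c"], [20.0, 30.0, 5.0]),
]
partials = [partial_group_sums(labels, values) for labels, values in chunks]
group_means = finalize_mean(combine_partials(partials))
```

Carrying (sum, count) rather than per-chunk means is what keeps the combine step exact when groups are spread unevenly across chunks.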
## May 12, 2021
### Attendees
- Justus Magin / @keewis
- Deepak Cherian / @dcherian
- Ryan Abernathey / @rabernat
- Benoit Bovy / @benbovy
- Max Roos / @max-sixty
- Tom Nicholas / @TomNicholas
- Joe Hamman / @jhamman
- Anderson Banihirwe / @andersy005
- Guido Imperiale / @crusaderky
- Jim Pivarski / @jpivarski
- Stephan Hoyer / @shoyer
### 60 second updates
- Justus: Keep attrs and duck array testing and new pint-xarray release
- Deepak: Dask summit next week, Anderson and Deepak are leading Xarray session
- Ryan: Released Pangeo Forge 0.3.2
- Benoit: Flexible index refactoring, new PR coming soon
- Max: Not much, had child :baby:
- Tom:
- Released 0.18!
- Tried to improve the guide for releasing
- Bit of reviewing
- Small docs additions
- Planning how to get histograms into xarray (xhistogram)
- Joe: Just the CZI proposal (agenda item below)
- Anderson: Helped with release automation
- Guido: Lots of focus on dask distributed
- Jim: Starting the "Awkward Array in the sciences" project September 1 (3 years), getting up to speed on ways we'll be able to contribute to scientific use-cases by interoperating with other data analysis tools.
- Stephan:
- just released https://github.com/google/xarray-beam (still WIP -- needs docs!)
- mode='r+' for zarr: https://github.com/pydata/xarray/pull/5252
### Agenda
- CZI EOSS proposal update (request for comments)
- https://docs.google.com/document/d/1oiXY8_4yqL2llT7e9Su3C2rFU4QQoo6JGfL5W58iPV4/edit?usp=sharing
- process and criteria for migrating projects to xarray-contrib
- https://github.com/pydata/xarray/discussions/5167
- scikit-learn has a set of rules https://github.com/scikit-learn-contrib/scikit-learn-contrib/blob/master/workflow.md
- let's add a README
- Who is owning this?
- new repository? I think github has a "user readme" feature
- Pangeo Integration Testing Project
- https://github.com/pangeo-data/pangeo-integration-tests/issues/1
- Pyvista Xarray Integration
- https://github.com/pydata/xarray/issues/4470
- https://github.com/pyvista/pyvista-xarray-accessor/issues
- Blocked PRs:
- https://github.com/pydata/xarray/pull/5008 — interpolation with different datatypes — waiting for 23 days
- https://github.com/pydata/xarray/pull/5239 — question around dropping over multiple dimensions?
- https://github.com/pydata/xarray/pull/4909 — could someone review a plotting PR? @lllvijan has a good track record with PRs.
- Dask summit: open new discussions in https://github.com/pydata/xarray/discussions (similarly to https://github.com/geopandas/community/issues/4)?
## Apr 28, 2021
### Attendees
- Deepak Cherian / @dcherian
- Justus Magin / @keewis
- Tom Nicholas / @TomNicholas
- Ryan Abernathey / @rabernat
- Alessandro Amici / @alexamici
- Max Roos / @max-sixty
- Joe Hamman / @jhamman
- Benoit Bovy / @benbovy
- Guido Imperiale / @crusaderky
### 60 second updates
- Deepak: dask summit + dask_groupby (numpy_groupies + tree reduction) progress
- Justus: pint-xarray (close to release), keep attrs, duckarray testing
- Tom: found a little bug in open_mfdataset just now
- Benoit: continue index refactor (fixing many places where pd.Index objects are used) + _FillValue issue
- Joe: just back from vacation, CZI invitation to submit EOSS proposal
- Max: Clip, rolling_exp.sum, lots of reviews, big architectural changes like replacing `raises_regex`.
- thanks max!
- Stephan: starting on Xarray-Beam
- Ryan: PR on safe_chunks merged, still problems with `encoding['chunks']`
- Alessandro: Xarray-Sentinel project
- Guido: No Xarray updates, focused on dask
### Agenda
- Xarray User Forum @ Dask summit : Need 3 15min user stories from non-geoscience domains
- google form?
- look through user survey responses?
- invite speakers? if so, who?
- Xarray paper citations
- (Tom: I or my old groupmate could maybe speak on a plasma use case?)
- Lily wang? Mike McCarthy?
- Plans for the next release?
- Deepak & Alessandro & Tom kindly volunteer
- Deepak's groupby work
- https://github.com/dcherian/dask_groupby
- duck array testing
- PR review management
- Repo access to core team members (write vs. admin role)
## Apr 14, 2021
### Attendees
- Justus Magin / @keewis
- Deepak Cherian / @dcherian
- Mathias Hauser / @mathause
- Aureliana Barghini / @aurghs
- Stephan Hoyer / @shoyer
- Max Roos / @max-sixty
- Anderson Banihirwe / @andersy005
### 60 second updates
- Justus: duck array testing module, from_dask_dataframe
- Deepak : not much
- Tom: Tiny bit on pint-xarray
- Mathias : nothing to report
- Stephan: Python PEP with context on array typing: https://github.com/python/peps/pull/1904
- Anderson : nothing to report
### Agenda
- backend PR, Allow using a custom engine class directly in xr.open_dataset:
- https://github.com/pydata/xarray/pull/5033
- Master branch broken?
- Q from Tom: how far is the duck-array stuff from being able to integrate JAX / pytorch?
- https://github.com/pydata/xarray/issues/3232
- `drop_duplicates` — <https://github.com/pydata/xarray/pull/5089>
- How to handle community-given examples: https://github.com/pydata/xarray/pull/5129 (from https://github.com/pydata/xarray/issues/5085)
- rendered version: https://nbviewer.jupyter.org/github/pydata/xarray/blob/0b57ef50148545125c929d4d865bf29faa9ba2e2/doc/examples/Wang_AirStagnationIndex.ipynb
- Stephan: anyone want to talk to RTD with me?
- anderson
- from_dask_dataframe: https://github.com/pydata/xarray/pull/4659
- Zarr chunking fixes:
https://github.com/pydata/xarray/pull/5065
- {zeros,ones, empty}_like API
- https://github.com/pydata/xarray/issues/5144
- In addition to `chunks` argument, should we consider adding the `shape` arguments so as to mirror dask/numpy's _like functions?
- typing for unary, binary ops
- https://github.com/pydata/xarray/pull/4904
--
## Mar 31, 2021
### Attendees
- Deepak Cherian / @dcherian
- Tom Nicholas / @TomNicholas
- Max Roos / @max-sixty
- Joe Hamman / @jhamman
- Mathias Hauser / @mathause
- Guido Imperiale / @crusaderky
- Benoit Bovy / @benbovy
- Ryan Abernathey / @rabernat
- Justus Magin / @keewis
- Alessandro Amici / @alexamici
- Stephan Hoyer / @shoyer
- Aureliana Barghini
### 60 second updates
- Deepak
- Groupby with dask arrays; plugged into tree reductions
- https://github.com/dcherian/dask_groupby
- (almost works; uses numpy_groupies)
- Tom
- Tracking down combine issues
- Looking at combining overlapping GIS rasters ("compositing")
- see discussion here: https://docs.google.com/document/d/1IXnk2fvpjWaRZHAwZ1i_d6l0B7iIvV_6Kmjy2IXK5Vo/edit
- Max
- Responding to issues and PRs
- Joe
- Led EOSS LOI submission
- Dask Summit
- Mathias
- Small PR on typing
- Upstream bugfix for cftime
- Guido
- None
- Benoit
- Design notes for flexible indexes
- Ryan
- https://github.com/pydata/xarray/pull/5065
- Justus
- `keep_attrs` propagation
- Tutorials
- Alessandro
- Helped with CZI proposal
- Small stuff around backend refactor
- Stephan
- Move encoding from xarray.Variable to duck arrays? https://github.com/pydata/xarray/issues/5082
- Lazy indexing arrays as a stand-alone package https://github.com/pydata/xarray/issues/5081
- Aureliana
- Moving pynio out
------
### Agenda
- tolerance on alignment / combine (submitted in September / October)
- https://github.com/pydata/xarray/pull/4467
- https://github.com/pydata/xarray/pull/4489
- new external example?
- https://github.com/pydata/xarray/pull/5086
- https://github.com/xarray-contrib/xarray-tutorial/issues/
- combine to handle raster tiles?
- https://github.com/pydata/xarray/issues/4213#issuecomment-781553004
- unique & drop_duplicates methods
- https://github.com/pydata/xarray/pull/5091
- https://github.com/pydata/xarray/pull/5089
- da.mean(("x", "y")) vs Hashable
- boolean indexing
- https://github.com/pydata/xarray/issues/4892
- Index refactor: index data <-> variable(s) (coordinates) data?
## Mar 17, 2021
### Attendees
- Ryan Abernathey / @rabernat
- Deepak Cherian / @dcherian
- Jim Pivarski / @jpivarski (just staying up-to-date; nothing to report)
- Anderson Banihirwe / @andersy005
- Mathias Hauser / @mathause
- Alessandro Amici / @alexamici
- Aureliana Barghini / @aurghs
- Max Roos / @max-sixty
### 60 second updates
- Ryan
- Thomas Nicholas starting soon
- Deepak
- Distributed Groupby
- Anderson
- The new theme is live: https://xarray.pydata.org/en/latest/
- Mathias
- Not much
- Alessandro
- Finished backend API entry points
- There are already some implementations of entry points in the wild
- Aureliana
- Did most of the work!
- Max
- Some PR reviews
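For context on the backend entry points mentioned in Alessandro's update: third-party packages register a backend under an entry-point group, and the host library discovers them at runtime. This is a minimal stdlib sketch, assuming the group is named `xarray.backends`. `discover` is a hypothetical helper, not xarray's own loader.

```python
from importlib.metadata import entry_points


def discover(group):
    """Return {name: entry point} for every plugin registered under `group`."""
    all_entry_points = entry_points()
    if hasattr(all_entry_points, "select"):
        # Python 3.10+ selection API.
        selected = all_entry_points.select(group=group)
    else:
        # Python 3.8/3.9 return a plain dict keyed by group name.
        selected = all_entry_points.get(group, [])
    return {entry_point.name: entry_point for entry_point in selected}


# Which names appear depends entirely on what is installed.
backends = discover("xarray.backends")
nothing = discover("example.group.that.does.not.exist")
```

Calling `.load()` on one of the returned entry points imports the object the package registered, so nothing is imported until a backend is actually requested.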
### Agenda
- Query method
- Does boolean indexing (like `where`, converts to `isel`)
- Uses numexpr
- google season of docs?
- Due next week (March 26)
- O($5000) for a designer / tech writer
- Application is short
- new round of CZI we (B-Open) are looking for ideas to propose
- Pre-proposal only
- Opportunity to pay down technical debt
- What about more features?
- xrft / xhistogram
- hist package
- Xarray lite
- Hierarchical data structure: https://github.com/pydata/xarray/issues/4118
- suitable for a grant proposal
- Collect use cases to determine the minimal set of features that are common to most use cases.
- TileDB backend entry point
- (if time) dask distributed groupby
## Mar 3, 2021
### Attendees
- Deepak Cherian / @dcherian
- Justus Magin / @keewis
- Joe Hamman / @jhamman
- Stephan Hoyer / @shoyer
- Benoit Bovy / @benbovy
- Anderson Banihirwe / @andersy005
- Guido Imperiale / @crusaderky
### 60 second updates
- Joe: not much hands-on stuff; sending CZI grant to NCAR for anderson docs work
- Deepak: some work on rolling
- Justus: combine_attrs, CI, lots of pint-xarray, duckarray test generator function
- Benoit: indexing refactor docs
- Anderson: not much; will finish docs PR
- Guido: support for new dask graph manipulation.
### Agenda
- Backends refactor
- basically done with read support
- draft documentation: https://github.com/pydata/xarray/pull/4810
- PR adding TileDB read-only support using apiv2: https://github.com/pydata/xarray/pull/4988/files
- Decision: we'll ask them to keep it outside xarray (but we'll happily document it)
- Indexes refactor
- see [design doc](https://github.com/pydata/xarray/pull/4979)
- CZI grant opportunities
- EOSS round 4: https://chanzuckerberg.com/rfa/essential-open-source-software-for-science/
- xarray-lite: Variable + dict of variables (e.g., for machine learning)
- array protocol: data-apis.org
- duck arrays: refactor duck-array-ops + inject_reduce_methods etc.
- Diversity & inclusion grant
- Anderson, Joe & Deepak will lead
### PRs to review
- https://github.com/pydata/xarray/pull/4986: saving np.bool_ attributes to h5netcdf
- merge new docs theme?
## Feb 17, 2021
### Attendees
- Deepak Cherian / @dcherian
- Justus Magin / @keewis
- Anderson Banihirwe / @andersy005
- Max Roos / @max-sixty
- Guido Imperiale / @crusaderky
- Mathias Hauser / @mathause
- Ryan Abernathey / @rabernat
### 60 second updates
- Deepak: working on rolling, sliding_window_view for dask: https://github.com/dask/dask/pull/7234
- Justus: combine_attrs (Dataset.merge, on variables, as a function), CI, pint-xarray
- Anderson: still working on rearrangement of the docs: https://github.com/pydata/xarray/pull/4835
- Max: Have been super busy on non-xarray work recently, now have some more time
- Guido: Busy with Dask. A new PR coming soon about ``__dask_postpersist__`` to implement https://github.com/dask/dask/issues/7203
- Mathias: Not much last 2 weeks, reviewed PRs.
- Ryan: Pangeo Forge
### Agenda
- Release?
- https://github.com/pydata/xarray/issues/4894
- Max will do it!
- Demo from Anderson
- additional tutorial datasets?
- https://github.com/pydata/xarray-data
## Feb 3, 2021
### Attendees
- Joe Hamman / @jhamman
- Justus Magin / @keewis
- Deepak Cherian / @dcherian
- Stephan Hoyer / @shoyer
- Anderson Banihirwe / @andersy005
- Mathias Hauser / @mathause
### 60 second updates
- Joe: not actually here today, but added two documentation PRs to the agenda. Both are related to the CZI grant and could use some extra eyes.
- Justus: combine_attrs, help with pytorch support
- Deepak: not much
- Stephan: not much
- Anderson: new theme for the documentation
- Mathias: a bit more work on the check_dtype PR
- Guido: new change in Dask API that will have implications for xarray
### Agenda
- doc prs that need some high-level review:
- @andersy005 has started working on a new doc site layout / theme: https://github.com/pydata/xarray/pull/4835
- BOpen is nearing completion of their work on the backends. Current open PR is documentation: https://github.com/pydata/xarray/pull/4810
- combine_attrs instead of keep_attrs?
- https://github.com/pydata/xarray/issues/3891
- no renaming / removing of current parameters
- keep_attrs should accept a bool, the str options from merge_attrs or a function
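A rough sketch of what "accept a bool, a str option, or a function" could look like. It is illustrative only: the mapping of `True`/`False` to `"override"`/`"drop"` and the single-argument callable signature are assumptions made for this example, and only a subset of string options is handled.

```python
def merge_attrs(attrs_list, combine_attrs="override"):
    """Combine a list of attribute dicts according to `combine_attrs`.

    Illustrative sketch only; the callable signature here is an assumption.
    """
    if callable(combine_attrs):
        return combine_attrs(attrs_list)
    # Assumed mapping of the boolean form onto string options.
    if combine_attrs is True:
        combine_attrs = "override"
    elif combine_attrs is False:
        combine_attrs = "drop"

    if combine_attrs == "drop" or not attrs_list:
        return {}
    if combine_attrs == "override":
        # Copy the attributes of the first object.
        return dict(attrs_list[0])
    if combine_attrs == "no_conflicts":
        result = {}
        for attrs in attrs_list:
            for key, value in attrs.items():
                if key in result and result[key] != value:
                    raise ValueError(f"conflicting values for attribute {key!r}")
                result[key] = value
        return result
    if combine_attrs == "identical":
        first = attrs_list[0]
        if any(attrs != first for attrs in attrs_list[1:]):
            raise ValueError("attributes are not identical")
        return dict(first)
    raise ValueError(f"unknown combine_attrs option: {combine_attrs!r}")


a = {"units": "K", "source": "model"}
b = {"units": "K", "history": "regridded"}
merged = merge_attrs([a, b], "no_conflicts")
```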
## Jan 20, 2021
### Attendees
- Joe Hamman / @jhamman
- Justus Magin / @keewis
- Anderson Banihirwe / @andersy005
- Mathias Hauser / @mathause
- Stephan Hoyer / @shoyer
- Deepak Cherian / @dcherian
### 60 seconds updates
- Joe: User survey PR
- Justus: CI review, tutorial downloads, merge attrs
- Anderson: looking at documentation
- Mathias: trying to get the typing of weighted correct; dtype check in assert_*
- Stephan: pass
- Deepak: reviewing PR on coord variables
### Agenda
- outreachy internship?
- https://www.outreachy.org/communities/cfp/
- xarray-adjacent project in xarray-contrib?
- minimum version policy
- https://github.com/pydata/xarray/issues/4179
- bump numpy to >= 1.17?
- `decode_coords="all"` to decode extra variables as coordinates?
- https://github.com/pydata/xarray/pull/2844
- stuff moved from .attrs to .encoding is backwards-incompatible
- caching & jupyter repr
- https://github.com/pydata/xarray/issues/4833
- https://github.com/pydata/xarray/issues/4240
## Jan 06, 2021
### Attendees
- Joe Hamman / @jhamman
- Justus Magin / @keewis
- Anderson Banihirwe / @andersy005
- Mathias Hauser / @mathause
- Deepak Cherian / @dcherian
- Max Roos / @max-sixty
- Stephan Hoyer / @shoyer
- Ryan Abernathey / @rabernat
### 60 seconds updates
- Joe - sent out user survey [link](https://docs.google.com/forms/d/e/1FAIpQLSfhVUao634zgpWP3BdrMPwzCd3WUqRbZZ4Baq_l2shoMhcIlQ/viewform); will share results with core devs
- Justus: reviews and CI
- Max — did a bunch of PRs, mostly small — one material one around unstack performance
- Deepak : NASA proposal, small performance PRs
- Mathias: str dtype of coords, dtype check in `equals`/ `identical`
- Anderson
- Ryan: playing with cupy for diffusion-based smoothing ([example](https://gist.github.com/rabernat/429238a9db40fb3dbfc74b84f64a4b36))
- Stephan: no update
### Agenda
- For awareness: dask `__setitem__` + cf-python [issue](https://github.com/dask/dask/issues/7029) [PR](https://github.com/dask/dask/pull/7033)
- awesome!
- Any final feedback on the [unstack PR](https://github.com/pydata/xarray/pull/4746)
- Decision: ship it!
- Merge this old [PR](https://github.com/pydata/xarray/pull/2844)
- sets variables referred to in CF attributes as "coordinate variables"
- breaking change... will break set_coords
- Decision: ship it! (in a breaking release)
- CI changes are looking great — anything the wider group can contribute?
- A couple of minor refactors — totally fine if they're not worthwhile — do we think they are?
- `coord_names` as list, so they're ordered? https://github.com/pydata/xarray/pull/4755 Decision: try using an OrderedSet.
- Refactor `SortedKeysDict` as dict? https://github.com/pydata/xarray/pull/4753 Decision: looks good for a breaking release!
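For the `coord_names` decision above, an insertion-ordered set can be built on a plain dict, since dicts preserve insertion order. This is a minimal sketch and not necessarily the `OrderedSet` that ends up in xarray.

```python
from collections.abc import MutableSet


class OrderedSet(MutableSet):
    """Set that remembers insertion order, backed by a dict."""

    def __init__(self, iterable=()):
        # dict.fromkeys drops duplicates while keeping first-seen order.
        self._items = dict.fromkeys(iterable)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def add(self, item):
        self._items[item] = None

    def discard(self, item):
        self._items.pop(item, None)

    def __repr__(self):
        return f"OrderedSet({list(self._items)!r})"


coord_names = OrderedSet(["time", "lat", "lon", "lat"])
coord_names.add("height")
coord_names.discard("lat")
```

Subclassing `MutableSet` supplies the usual set operators (`|`, `&`, `-`) for free while keeping membership tests O(1).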
## Dec 23, 2020
### Attendees
- Alessandro Amici / @alexamici
- Joe Hamman / @jhamman
- Justus Magin / @keewis
- Anderson Banihirwe / @andersy005
- Mathias Hauser / @mathause
### 60 seconds updates
- Alessandro:
- very close to completing read support for backend refactor
- Joe: no updates
- Justus: reviewing PRs / issues. Drop py3.6?
- Anderson:
- looking at backlog of issues/PRs.
- some interest in working on the xarray documentation
- Mathias:
- working on speeding up the CI
- comparison of dtypes
### Agenda
- drop python 3.6
- https://github.com/pydata/xarray/pull/4720
- CI discussion
- move all CI to Github Actions?
- can we have separate jobs for upstream-dev PRs and scheduled builds?
- https://github.com/pydata/xarray/issues/4670
- remove `autoclose`
- https://github.com/pydata/xarray/pull/4725
- allow passing `pathlib.Path` to backends:
- https://github.com/pydata/xarray/pull/4707
- cleaning up warnings
## Dec 9, 2020
### Attendees
- Justus Magin / @keewis
- Joe Hamman / @jhamman
- Stephan Hoyer / @shoyer
- Ryan Abernathey / @rabernat
- Tom Nicholas / @TomNicholas
- Alessandro Amici / @alexamici
- Mathias Hauser / @mathause
- Deepak Cherian / @dcherian
### 60 seconds updates
- Joe:
- at CZI EOSS meeting this week. Giving a talk on Xarray later this morning: https://docs.google.com/presentation/d/1vaXkAxHc-t1sP9aMubpSK4J8QbckCLHYMgGiRNZJs6M/edit?usp=sharing
- Justus: reviews, fixing documentation issues, CI
- Tom: Not much, some discussion of combining DAs on 3248
- Stephan
- Working with numfocus to get contractors paid--they are not very responsive
- NumFocus doesn't like the idea of indefinite / employment-like contracts; need a specific SOW
- Deepak : not much
- Mathias : not much
- Ryan: Pangeo Forge
- https://github.com/pangeo-forge/roadmap
- https://pangeo-for.ge/
- Alessandro:
- Working on new API; PR merged today (link?)
- New API activated via env var
- Code style questions
### Agenda
- Justus -- how is it going? Great!
- NASA OSS call:
- [benchmarking](https://github.com/pydata/xarray/issues/4648)
- Must be done by Jan 10
- 4 months / year to NCAR
- Tutorial / education stuff to UW
- [fsspec / zarr / mfdataset PR](https://github.com/pydata/xarray/pull/4461)
- Anderson as core dev
- delete pipelines upstream-dev CI
## Nov 25, 2020
### Attendees
- Justus Magin / @keewis
- Joe Hamman / @jhamman
- Mathias Hauser / @mathause
- Stephan Hoyer / shoyer
- Deepak Cherian / dcherian
- Max Roos / @max-sixty
### 60 seconds updates
- Justus: reviewing, duck array status page: https://xarray.pydata.org/en/latest/duckarrays.html
- Deepak: NASA OSS call
- Stephan: thinking about factoring out lazy arrays (note: the lazyarray and lazynumpy names are already taken! maybe xlazy?)
- Joe: reviewing backend PRs, a bit of coordination on NASA/NumFOCUS, tried out xarray-pint
- Mathias: not much; some maintenance
- Max: not much
### Agenda
- patch release? v0.16.2
- https://xarray.pydata.org/en/latest/whats-new.html
- should we merge the `copy` PR https://github.com/pydata/xarray/pull/4453?
- possible implication: https://github.com/pydata/xarray/issues/4524
- NASA proposal
- items we didn't discuss last time:
- (time permitting) numpy groupies / groupby bins / xhistogram roadmap
- location of list of related projects (I/O, duck array compat)
- 0.16.1 seems to have broken some apply_ufunc applications (https://github.com/pangeo-data/xESMF/issues/36)
- should be fixed: https://github.com/pydata/xarray/pull/4576
## Nov 11, 2020
### Attendees
- Justus Magin / @keewis
- Joe Hamman / @jhamman
- Ryan Abernathey / @rabernat
- Mathias Hauser / @mathause
- Stephan Hoyer / shoyer
- Aureliana Barghini / @aurghs
- Monica Rossetti
- Alessandro Amici
- Max Roos / @max-sixty
### 60 seconds updates
- Justus: not much (I/O libraries list)
- Joe: nothing at all :)
- Mathias: some maintenance stuff; keep_attrs for rolling
- Tom: A little bit on pint-xarray, including using it in anger
- Stephan: got money from NVIDIA via NumFocus--what to use it for?
- Ryan: helping coordinate zarr / async work related to fsspec
- Aureliana - working on API for open_dataset, chunking
- Monica -
- Alessandro - helping the B-open team; working on plugin architecture for new API
- Max - a bit of work on numpy groupies
### Agenda
- dependabot? https://github.com/pydata/xarray/issues/4313
- Would help us track changes in upstream dependencies
- But does it work with conda?
- Most failures happen because of unreleased versions--dependabot wouldn't help with this.
- Should we stop testing PRs against unreleased upstream libs and instead use a cron job to check master? **yes**
- GitHub Discussions?
- Would help move usage questions away from mailing list / SO / issues
- Easier than replying to mailing list
- It is now ON! https://github.com/pydata/xarray/discussions
- How to use NVIDIA funding
- $50K unrestricted funds in our bucket at NumFocus
- Should we talk to someone at NVIDIA?
- Should we try to focus on something GPU related? maybe a duck array tutorial?
- Pay someone for day-to-day maintenance (CI, issue tracker etc.)
- Ideally fund devs who were already doing these tasks - they have the knowhow of whole library
- (time permitting) numpy groupies / groupby bins / xhistogram roadmap
- location of list of related projects (I/O, duck array compat)
- 0.16.1 seems to have broken some apply_ufunc applications (https://github.com/pangeo-data/xESMF/issues/36)
## Oct 28, 2020
where: https://columbiauniversity.zoom.us/j/953527251?status=success
### Attendees
- Justus Magin / @keewis
- Stephan Hoyer / @shoyer
- Max Roos / @max-sixty
### 60 seconds updates
- Justus released pint-xarray. Justus will send announcement to mailing list, Max will tweet it!
- Max worked on NumPy-Groupies and optimizing unstacking (NumPy indexing rather than pandas reindexing).
- Stephan worked on his parallel writes with Zarr PR: https://github.com/pydata/xarray/pull/4035
### Agenda
- NumPy-Groupies PR (very rough draft) https://github.com/pydata/xarray/pull/4540
- See "groupby should not squeeze out dimensions": https://github.com/pydata/xarray/issues/2157
- Issues around this line: https://github.com/pydata/xarray/pull/4540/files#diff-c98815a5dca2dda7cad2e8165c510cc49e2581e2a9a80253a19cedac881c3cd4R836
## Oct 14, 2020
where: https://columbiauniversity.zoom.us/j/953527251?status=success
### Attendees:
- Justus Magin / @keewis
- Tom Nicholas / @TomNicholas
- Deepak Cherian / @dcherian
- Max Roos / @max-sixty
- Alessandro Amici / @alexamici
- Stephan Hoyer / @shoyer
### 60 seconds updates
- Justus: not much
- Tom: Thesis writing (not much)
- Max: not much, I hopefully have some spare time coming up and thinking of doing something - any thoughts on what? Stack / unstack performance? Numpy groupies? Attrs by default?
- B-Open: refactor of decoding options
- Deepak : not much
- Stephan: not much
### Agenda
- indexing refactor update?
- da[time='2020-01-01':] PEP637 but https://twitter.com/raymondh/status/1315855141183004672 (deepak: yes!)
- ZCI first payment
- alignment with tolerance?
- https://github.com/pydata/xarray/pull/4489
- if time: numpy-groupies / numbagg plugin
- Numpy-groupies: https://github.com/pydata/xarray/issues/4473
- https://github.com/ml31415/numpy-groupies
- see https://github.com/pydata/xarray/issues/2139 for a to_dataframe mem problem
## Sep 30, 2020
where: https://columbiauniversity.zoom.us/j/953527251?status=success
### Attendees:
- Justus Magin / @keewis
- Deepak Cherian / @dcherian
- Tom Nicholas / @TomNicholas
- Mathias Hauser / @mathause
- Joe Hamman / @jhamman / NCAR & CarbonPlan
- Monica Rossetti / @TheRed86 / B-Open
- Anderson Banihirwe / @andersy005 / NCAR
- Aureliana Barghini / @aurghs / B-Open
### 60 seconds updates
- Justus: edit-on-github
- Deepak: xarray tutorial on Friday (led by Anderson)
- Tom: not much
- Mathias: not much
- Joe: reviewing backend PRs
- Stephan: playing around with xarray+dask+zarr, noticing some memory issues in dask: https://github.com/dask/dask/issues/6668
- deepak is excited about what might come out of this.
- Monica: helping with backend refactor
- Aureliana: PR coming today with alpha version of new backend
- Anderson: Xarray tutorial coming on Friday. First of a few. https://www.eventbrite.com/e/free-online-tutorial-for-xarray-tickets-122003528839
### Agenda
- Happy birthday Xarray: https://twitter.com/xarray_dev/status/1311322583522918401?s=20
- Anderson/Deepak: xarray in other domains video series
- Xarray use case in neuroscience: https://predictablynoisy.com/posts/2019/2019-10-22-xarray-neuro/
- Xarray use case in plasma simulation: https://github.com/boutproject/xBOUT
- Edit on Github: branch fixed to master, line numbers? https://github.com/pydata/xarray/pull/4460
- 0.16.2 patch release? Relevant fixes:
- Deep copy behavior with backend arrays: https://github.com/pydata/xarray/pull/4453
- Decision: we'll keep it, no need for a patch release. Maybe should document it more clearly?
## Sep 16, 2020
where: https://columbiauniversity.zoom.us/j/953527251?status=success
### Attendees:
- Justus Magin / @keewis
- Deepak Cherian / @dcherian
- Joe Hamman / @jhamman / NCAR & CarbonPlan
- Stephan Hoyer / @shoyer
- Alessandro Amici / @alexamici / B-Open
- Monica Rossetti / @TheRed86 / B-Open
- Max Roos / @max-sixty
- Tom Nicholas / @TomNicholas
### 60 seconds updates
- Deepak - PR pending for quiver plots, Coiled live stream tomorrow (3p MT tomorrow; https://www.eventbrite.com/e/scalable-computing-in-oceanography-tickets-120899294043)
- Joe - Not much
- Stephan - Not much
- Alessandro - Backends weekend refactor meeting last week was productive. Setting up tooling to allow parallel code path for new and old backends
- Monica - working on backends refactor
- Max - little things related to cov, corr, missing values.
- Tom - Things in satellite packages (xhistogram)
- Justus - worked on doctests to make sure they work, and thinking about nested duck arrays
### Agenda
- review requests:
- open_mfdataset with engine="zarr": https://github.com/pydata/xarray/pull/4187
- tricky because of differences in netcdf / zarr backend apis
- discussion around optimal chunk discovery feature
- survey update
- for awareness: https://github.com/dask/dask/issues/6646 which went in to fix https://github.com/pydata/xarray/issues/4112
- nested duck array design: https://github.com/dask/dask/issues/5329
- Patch release?
- Let's merge https://github.com/pydata/xarray/pull/4426 first
- Appropriate response to issues like https://github.com/pydata/xarray/issues/4430 - somewhat reasonable question but no issue template.
- question about expected thread concurrency performance for backends
- who should review and merge backend refactors?
- @jhamman as reviewer
## Sep 2, 2020
where: https://columbiauniversity.zoom.us/j/953527251?status=success
### Attendees:
- Justus Magin / @keewis
- Deepak Cherian / @dcherian
- Joe Hamman / @jhamman / NCAR & CarbonPlan
- Ryan Abernathey / @rabernat / Columbia LDEO
- Stephan Hoyer / @shoyer
- Aureliana Barghini / @aurghs / B-Open
- Alessandro Amici / @alexamici / B-Open
### 60 seconds updates
- Justus - pint-xarray
- Deepak - reviewing PRs
- Joe - user survey, new NASA proposal funded
- Xbatcher: https://github.com/rabernat/xbatcher
- Ryan - working on Copernicus report, hired Julius Busecke
- Stephan - parallel writing of Zarr files PR is still in progress
- regular backend meeting starting tomorrow
- Aureliana - starting to work on backend refactor
- Alessandro - also working on backend refactor, specifically looking at indexing wrappers
### Agenda
- review requests:
- open_mfdataset with engine="zarr": https://github.com/pydata/xarray/pull/4187
- tricky because of differences in netcdf / zarr backend apis
- discussion around optimal chunk discovery feature
- survey update
## August 19, 2020
where: https://columbiauniversity.zoom.us/j/953527251?status=success
### Attendees:
- Justus Magin / @keewis
- Deepak Cherian / @dcherian
- Joe Hamman / @jhamman / NCAR & CarbonPlan
- Ryan Abernathey / @rabernat / Columbia LDEO
- Tom Nicholas / @TomNicholas / Culham Centre for Fusion Energy
- Max Roos / @max-sixty
- Stephan Hoyer / @shoyer
- Mathias Hauser / @mathause
### 60 seconds updates
- Justus: working on RTD / docs
- Deepak: did 45 minute xarray tutorial at oceanhackweek 2020:
- will upload to xarray-contrib/xarray-tutorials
- Joe: user survey
- Max: did a small PR on cov / corr
- Stephan: [data API consortium](https://data-apis.org/blog/announcing_the_consortium/), [rechunker](https://github.com/pangeo-data/rechunker)
- Mathias: some work on apply_ufunc (exclude_dims and vectorize)
- Tom: Just messing about with general curve-fitting
### Agenda
- pandas 1.1 compatibility: https://github.com/pydata/xarray/pull/4292
- user survey (Joe)
- https://examples.dask.org/surveys/2019.html
- Discuss quansight API unification effort
- https://data-apis.org/blog/announcing_the_consortium/
- experimental NumPy/JAX support: https://github.com/seberg/numpy-dispatch
- Stephan wants a review on his Zarr parallel writes PR: https://github.com/pydata/xarray/pull/4035
- nested inline reprs of duck arrays: https://github.com/pydata/xarray/issues/4324
- Stephan: is this something worth taking a step back on, e.g., to write some more general design doc working through the overall issues with deeply wrapped arrays?
- (if time) — I noticed numpy warnings (e.g. `mean of empty slice`) are only filtered in tests. Should we filter these in code?
- We should definitely fix the warnings that aren't coming from dask
- In particular, warnings coming from NaN as a mask value shouldn't issue warnings
- regular backend meetings with BOpen starting next week!
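
Filtering a specific warning in library code, rather than only in tests, is usually done with a scoped `warnings.catch_warnings()` block. The sketch below uses only the standard library; `mean_skipping_nan` is a hypothetical stand-in for a NaN-aware reduction that emits the same `Mean of empty slice` message, and none of this is xarray's actual code.

```python
import warnings


def mean_skipping_nan(values):
    """Stand-in for a NaN-aware mean that warns when nothing is left to average."""
    valid = [v for v in values if v == v]  # NaN is the only value not equal to itself
    if not valid:
        warnings.warn("Mean of empty slice", RuntimeWarning)
        return float("nan")
    return sum(valid) / len(valid)


def quiet_mean(values):
    """Same result, but the expected warning is suppressed locally."""
    with warnings.catch_warnings():
        # The filter applies only inside this block and only to this message.
        warnings.filterwarnings(
            "ignore", message="Mean of empty slice", category=RuntimeWarning
        )
        return mean_skipping_nan(values)


print(quiet_mean([1.0, float("nan"), 3.0]))  # 2.0
print(quiet_mean([float("nan")]))            # nan, with no warning shown
```

Scoping the filter this way keeps unrelated warnings visible to users, which matches the point above that warnings caused by NaN used as a mask value should not surface.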
## August 5, 2020
where: https://columbiauniversity.zoom.us/j/953527251?status=success
### Attendees:
- Tom Nicholas / @TomNicholas / Culham Centre for Fusion Energy
- Stephan Hoyer / @shoyer
- Justus Magin / @keewis
- Jon Thielen / @jthielen / Iowa State University
- Jacob Tomlinson / @jacobtomlinson / NVIDIA
- Max Roos / max-sixty
- Chris Markiewicz / @effigies / Stanford University
- Alessandro Amici / @alexamici / B-Open
- Aureliana Barghini / @aurghs / B-Open
- Monica Rossetti / @TheRed86 / B-Open
- Matthew Brett / @matthew-brett / University of Birmingham
### 60 seconds updates
- Tom: Working on general curve-fitting method
- Justus: fixing links / sphinx errors in the documentation
- Chris: fMRI researcher interested in using xarray or xarray-lite
- Aureliana: worked on the flexible backend and made a PR cleaning up AbstractDataStore
### Agenda
- Would it be possible to use the Xarray data model without a pandas dependency? What's the best design for that? https://github.com/pydata/xarray/issues/3981
- Neuroimaging wants xarray but not pandas dependency
- This will be hard for indexing
- Considering protocol development before refactor
- RTD often broken? Anything we can do to improve reliability?
- Dependabot integration? Yes, this would be nice. See https://github.com/pydata/xarray/issues/4313
- Duck array inline reprs? https://github.com/pydata/xarray/issues/2773 / https://github.com/pydata/xarray/pull/4248
- yes
- move sphinx-autosummary-accessors to xarray-contrib?
- yes
- Backend proposal: possibly choose one of the 3 options to start the design work: https://github.com/pydata/xarray/issues/4309
- start with proposal 3 for reading
- start looking into writing and BackendArray
## July 28, 2020 [xarray backends kickoff]
where: https://carbonplan.whereby.com/xarray
### Attendees:
- Joe Hamman / @jhamman / NCAR & CarbonPlan
- Ryan Abernathey / @rabernat / Columbia & LDEO
- Stephan Hoyer / @shoyer & xarray
- Alessandro Amici / @alexamici / B-Open
- Aureliana Barghini / @aurghs / B-Open
- Monica Rossetti / @TheRed86 / B-Open
- Max Roos / @max-sixty
### Agenda
- https://github.com/pydata/xarray/issues/1970
- https://github.com/pydata/xarray/pull/3166
- https://github.com/pydata/xarray/projects/3
### Design issues to solve
- Current design is very focused on netCDF -- needs encoding
- Clarifying interface
- Current interface is not complete / not open for customization
- `open_dataset()` does not need to be specific to netCDF
- Zarr is currently being added as an engine to open_dataset: https://github.com/pydata/xarray/pull/4187
- Ale: after opening a store, certain filters (decoding) are always done. For more complicated backends, would be nice to specify which filters need to be applied.
- https://github.com/pydata/xarray/pull/1087
- Why is Zarr treated differently? Needs different encoding layers.
- Ale: currently we have
1. On-disk format (e.g. netcdf, zarr, tiledb)
2. Some intermediate object (various `Store` implementations)
3. The in-memory `Dataset`
- Why don't we allow a backend to produce a full xarray Dataset object? Because we need some messy logic to decide e.g. what are coordinates vs. data vars.
- Problems in backend are trickier to identify
- Interface is done with `Variable` objects
- `Dataset._file_obj` needs to be a public API
- `encoding` is attributes that are hidden by default, which indicate how data is stored on disk
- Small refactor: just change `Variable` class to include encodings
- Should we create a new `Backend` class?
- `AbstractDataStore` has both ? and implementation in same class
- Other important interfaces:
- FileManager
- Should either be broken out or exposed as public API in xarray
- Lazy arrays
- Should either be broken out or exposed as public API in xarray
- Explicit indexing
- Three indexing modes: basic (integers, slices), orthogonal (separately along each axis) and vectorized indexing
- Backend testing interface
- Reading and write support
- `DatasetIOBase` does both
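
The three indexing modes listed under "Explicit indexing" differ in how indexers along different axes are combined. The following dependency-free sketch uses nested lists to show the difference; the function names are hypothetical and do not correspond to xarray's internal indexer classes.

```python
def basic_index(data, row, cols):
    """Basic indexing: an integer and a slice, like data[row, cols]."""
    return data[row][cols]


def orthogonal_index(data, rows, cols):
    """Orthogonal (outer) indexing: each axis is indexed independently,
    so the result has shape (len(rows), len(cols))."""
    return [[data[r][c] for c in cols] for r in rows]


def vectorized_index(data, rows, cols):
    """Vectorized (pointwise) indexing: indexers are paired element-wise,
    so the result has one value per (row, col) pair."""
    return [data[r][c] for r, c in zip(rows, cols)]


data = [
    [0, 1, 2, 3],
    [10, 11, 12, 13],
    [20, 21, 22, 23],
]

print(basic_index(data, 1, slice(1, 3)))       # [11, 12]
print(orthogonal_index(data, [0, 2], [1, 3]))  # [[1, 3], [21, 23]]
print(vectorized_index(data, [0, 2], [1, 3]))  # [1, 23]
```

In NumPy terms, the orthogonal case corresponds to `arr[np.ix_(rows, cols)]`, while passing two integer arrays directly, `arr[rows, cols]`, gives the vectorized result. A backend that supports only some of these modes needs to know which one is being requested.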
## July 22, 2020
where: https://columbiauniversity.zoom.us/j/953527251?status=success
### Attendees:
- Tom Nicholas / @TomNicholas / Culham Centre for Fusion Energy
- Deepak Cherian / @dcherian
- Justus Magin / @keewis
- Joe Hamman / @jhamman / NCAR & CarbonPlan
- Jacob Tomlinson / @jacobtomlinson / NVIDIA
- Stephan Hoyer / @shoyer
- Jon Thielen / @jthielen / Iowa State University
- Ryan Abernathey / @rabernat / Columbia & LDEO
- Alessandro Amici / @alexamici / B-Open
- Aureliana Barghini / @aurghs / B-Open
- Monica Rossetti / @TheRed86 / B-Open
### 60 seconds
- Tom: moved pint-xarray to xarray-contrib
- Deepak: going through issues and PRs
- Justus: working on pint-xarray and documentation
- Joe: did the [SciPy tutorial](https://xarray-contrib.github.io/xarray-tutorial/), generated lots of material, 400 participants, many lessons learned. [Recording on YouTube](https://www.youtube.com/watch?v=mecN-Ph_-78&list=PLYx7XA2nY5Gde-6QO98KUJ9iL_WW4rgYf&index=4)
- Jacob: working on cupy / xarray integration stuff, lots of conversations; integration utility for cupy / numpy: https://github.com/jacobtomlinson/cupy-xarray
- Stephan: helped out with tutorial
- Jon: working through a few pint and dask interop PRs
- Ryan: SciPy tutorial, Released Rechunker ([blog post](https://medium.com/pangeo/rechunker-the-missing-link-for-chunked-array-analytics-5b2359e9dc11), [docs](https://rechunker.readthedocs.io/), [github](https://github.com/pangeo-data/rechunker))
- Alessandro: starting to work on backends refactor, B-Open does a bunch of earth science data, developed the cfgrib backend (for ECMWF)
- Aureliana: going to be primarily working on the backends refactor
- Monica: new dev at B-Open, going to be helping Aureliana a bit on the backends refactor
### Agenda
- CZI update?
- duck array things
- duck dask checks: https://github.com/pydata/xarray/pull/4221
- "first" vs "third"-party: https://github.com/pydata/xarray/issues/4212#issuecomment-660210572
- other cupy things: https://github.com/pydata/xarray/pull/4232
- inline repr: https://github.com/pydata/xarray/pull/4248 / https://github.com/pydata/xarray/issues/2773
- keyword indexing in Python (e.g., `array[x=0]`) has come up on python-ideas again
- if you care, dive in! https://mail.python.org/archives/list/python-ideas@python.org/thread/6OGAFDWCXT5QVV23OZWKBY4TXGZBVYZS/
- Design backend refactor discussion (kickstart and / or plan a technical KO call)
## July 8, 2020
where: https://columbiauniversity.zoom.us/j/953527251?status=success
### Attendees:
- Stephan Hoyer / @shoyer
- Deepak Cherian / dcherian
- Max Roos / @max-sixty
- Tom Nicholas / @TomNicholas / Culham Centre for Fusion Energy
- Jon Thielen / @jthielen / Iowa State University
- Joe Hamman / @jhamman / NCAR & CarbonPlan
- Jacob Tomlinson / @jacobtomlinson / NVIDIA working on RAPIDS & Dask
- Justus Magin / @keewis
### 60 seconds
- Deepak: not doing much, merged a few PRs, scipy tutorial
- Thomas: ...
- Max: can contribute! will do next release
- Joe: xarray tutorial, CZI admin
- Jon: working on xarray + metpy
- Jacob: working for Nvidia/dask, working on CuPy + xarray
- Justus: documentation accessors for sphinx, pint-xarray?
### Agenda
- Tom: combine + merge named DataArray:
- https://github.com/pydata/xarray/issues/3312
- Combining named DataArrays raises question of if a named da == ds with single var
- Tom thinking they should be treated as equivalent always
- How to propagate attrs if they conflict:
- https://github.com/pydata/xarray/issues/3891
- Currently non-flexive
- But fixing that would require some sort of comparison between attr values (This would be a breaking change)
- Would be nice to have clear use-cases to guide discussion
- Cupy/sparse/NEP18 support
- [notes partly from @max so not perfect]
- Initial examples work well by using dask as a layer between xarray and cupy
- But then something will e.g. call `np.asarray` and it won't work
- it would be nice to open issues
- Xarray is happy to add small hacks to ease compatibility, particularly as temporary fixes
- @jacob plans to add some tests that test for cupy compat (and skip if cupy isn't installed)
- sparse & cf have similar implementations
- CI would live within NVIDIA given GPU support
- Announcement - scipy tutorial: https://xarray-contrib.github.io/xarray-tutorial/
- New doc framework: https://myst-parser.readthedocs.io/en/latest/
- Let's start a new issue to discuss
- Quick question: Does anyone mind if we move pint_xarray into xarray-contrib?
- Yes please!
- Done: https://github.com/xarray-contrib/pint-xarray
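
On the question of propagating conflicting attrs raised earlier in this agenda, the sketch below contrasts two possible rules on plain dicts. It assumes a "first wins" rule as the order-dependent case and a "drop conflicts" rule as the symmetric alternative; both functions are hypothetical illustrations, not xarray's behavior at the time of the meeting.

```python
def merge_attrs_first_wins(attrs_list):
    """Earlier dicts take priority, so the result depends on argument order."""
    result = {}
    for attrs in attrs_list:
        for key, value in attrs.items():
            result.setdefault(key, value)
    return result


def merge_attrs_drop_conflicts(attrs_list):
    """Keep a key only if every dict that defines it agrees on the value."""
    result = {}
    conflicting = set()
    for attrs in attrs_list:
        for key, value in attrs.items():
            if key in conflicting:
                continue
            if key in result and result[key] != value:
                # Values disagree: drop the key and remember the conflict.
                del result[key]
                conflicting.add(key)
            else:
                result[key] = value
    return result


a = {"units": "K", "source": "model A"}
b = {"units": "K", "source": "model B"}

print(merge_attrs_first_wins([a, b]))      # {'units': 'K', 'source': 'model A'}
print(merge_attrs_first_wins([b, a]))      # {'units': 'K', 'source': 'model B'}
print(merge_attrs_drop_conflicts([a, b]))  # {'units': 'K'}
print(merge_attrs_drop_conflicts([b, a]))  # {'units': 'K'}
```

The symmetric rule relies on comparing attr values with `!=`, which is the comparison step noted above. That comparison is not always straightforward: array-valued attrs, for example, do not reduce to a single boolean.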
## June 24, 2020
where: https://columbiauniversity.zoom.us/j/953527251?status=success
### Attendees:
- name / @github / affiliation
- Deepak Cherian / @dcherian / NCAR
- Tom Nicholas / @TomNicholas / Culham Centre for Fusion Energy
- Justus Magin / @keewis
- Stephan Hoyer
- Max / max-sixty
### 60 second updates
- name: ...
- Stephan: someone should review zarr regions PR
- Deepak: started on cf-xarray: https://github.com/xarray-contrib/cf-xarray
- Thomas: not much to report
- Justus: pint support is very close! Needs documentation, maybe xarray-pint package?
- Max: working through the approval process :)
### Agenda
- List of urgent bugs / issues to be tackled?
- List of PRs that should be merged / close to merging
- fix facecolor plot (#4020)
-
- Release 0.16.0
- https://github.com/pydata/xarray/issues/4031
- Max volunteers!
- More formal words on joining the core dev team. Something in between:
- https://devguide.python.org/coredev/
- https://trio.readthedocs.io/en/stable/contributing.html#joining-the-team
- Markdown sphinx parser:
https://myst-parser.readthedocs.io/
## June 10, 2020
### Attendees:
- name / @github / affiliation
- Deepak Cherian / @dcherian / NCAR
- Justus Magin
- Stephan Hoyer
- Ryan Abernathey / @rabernat / Columbia
- Joe Hamman / jhamman / CarbonPlan & NCAR
### Agenda:
- GitHub pull request template
- Scipy tutorial materials due tomorrow!
- https://github.com/xarray-contrib/xarray-tutorial
- https://xarray-contrib.github.io/xarray-tutorial/
- https://docs.google.com/document/d/1YOI11ClsYxTbOwCxFcj3KeZ8b22AgGtyOXQ_wJ_CSM0/edit
- https://earth-env-data-science.github.io/lectures/xarray/xarray_intro.html
- https://earth-env-data-science.github.io/lectures/dask/dask_arrays.html
- Do we want to give a 3 minute talk for the maintainers session?
- Deepak has been volunteered
- close PRs to review:
- assert_allclose (#3847)
- built-in accessor documentation (#3988)
- tutorial rasterio file caching (#4102)
## May 28, 2020
Zoom isn't working, instead use: https://carbonplan.whereby.com/xarray
### Attendees:
- name / @github / affiliation
- Deepak Cherian / @dcherian / NCAR
- Joe Hamman / jhamman / CarbonPlan & NCAR
- Tom Nicholas / @TomNicholas / Culham Centre for Fusion Energy
- Justus Magin
- Stephan Hoyer
- Kai Muehlbauer
### 60 second updates
- Deepak : finishing map_blocks dask args PR, reviewing
- Joe: reviewed map_blocks PR. scikit-downscale is using this PR/branch.
- Justus: pint tests, accessors documentation, tutorial dataset downloading (https://github.com/pydata/xarray/issues/3986)
- Tom: not much, looking into overwriting vars in open_mfdataset
- Stephan: following along with CZI stuff
- Kai, new here, xarray user. Weather radar. Working on [wradlib](https://docs.wradlib.org/en/stable/).
### Agenda:
- Does map_blocks need a `join` kwarg? `apply_ufunc` uses `join="exact"`.
- Stephan: I regret ever picking something other than `join="exact"` by default
- `combine_attrs` breaks existing `open_mfdataset` calls
- Open PR? https://github.com/pydata/xarray/pull/4017
- `dask.array.apply_gufunc` and `apply_ufunc`
- https://github.com/pydata/xarray/pull/4060
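
The `join` options discussed for `map_blocks` control how index labels from several inputs are reconciled before an operation runs. As a rough sketch on plain lists of labels, assuming only three of the available modes, the hypothetical function below shows why `join="exact"` is the strictest choice: it refuses to align silently.

```python
def join_indexes(left, right, how="exact"):
    """Combine two lists of index labels (hypothetical sketch, not xarray code)."""
    if how == "exact":
        # Require identical indexes instead of silently aligning them.
        if list(left) != list(right):
            raise ValueError("indexes are not equal; cannot align with join='exact'")
        return list(left)
    if how == "inner":
        right_labels = set(right)
        return [label for label in left if label in right_labels]
    if how == "outer":
        return sorted(set(left) | set(right))
    raise ValueError(f"unknown join: {how!r}")


x = [0, 1, 2]
y = [1, 2, 3]

print(join_indexes(x, y, how="inner"))  # [1, 2]
print(join_indexes(x, y, how="outer"))  # [0, 1, 2, 3]
try:
    join_indexes(x, y, how="exact")
except ValueError as err:
    print(err)  # indexes are not equal; cannot align with join='exact'
```

With `inner` or `outer`, mismatched inputs produce a result of a different size without any error, which can hide bugs in user functions.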
## May 13, 2020
where: https://columbiauniversity.zoom.us/j/953527251?status=success
### Attendees:
- name / @github / affiliation
- Deepak Cherian / @dcherian / NCAR
- Joe Hamman / jhamman / CarbonPlan & NCAR
- Tom Nicholas / @TomNicholas / Culham Centre for Fusion Energy
- Stephan Hoyer
- Max / max-sixty
- Justus Magin
### 60 second updates
- name: ...
- Deepak: not much. made a list of issues to be tackled for next release. Updated map_blocks + dask args PR.
- Joe: a bit of work on project management / CZI grant
- Justus: assert_allclose formatting + issue with older dask version
- Tom Nicholas
- Stephan: parallel zarr writes, talking to Jeff
- Max
### Agenda:
- how to structure these meetings?
- Quick 60 second updates from attendees
- Agenda items, processed in order
- List of urgent bugs / issues to be tackled?
- Sparse + Multiindex + indexing: https://github.com/pydata/xarray/issues/4019
- Owner: Stephan
- error summary for allclose (fails for old dask versions): https://github.com/pydata/xarray/pull/3847
- Next release : https://github.com/pydata/xarray/issues/4031
- List of PRs that should be merged / close to merging
- Give these a new issue label
- See "next release" issue: https://github.com/pydata/xarray/issues/4031
## April 29, 2020
where: https://columbiauniversity.zoom.us/j/953527251?status=success
### Attendees:
- name / @github / affiliation
- Deepak Cherian / @dcherian / NCAR
- Tom Nicholas / @TomNicholas / Culham Centre for Fusion Energy
- Justus Magin / @keewis
- Joe Hamman / jhamman / CarbonPlan & NCAR
- Mathias / @mathause / ETH
- Stephan Hoyer
- Max / max-sixty
### 60 second updates
- name: ...
- Deepak - busy on other things lately
- Tom - Project boards from a few weeks ago. Looking to talk about road blocks to specific technical issues, would it make sense to think about specific assignments?
- Justus - working on documentation and pint integration, thinking about integration wrappers for scipy/bottleneck/numbagg and pint's dask compatibility
- Joe -
- Mathias - working on plotting MultiIndex levels
- Stephan - not much of an update
- Max - has Zoom issues, goes by Max
### Agenda:
- how to structure these meetings?
- Quick 60 second updates from attendees
- Agenda items, processed in order
- List of urgent bugs / issues to be tackled?
- List of PRs that should be merged / close to merging
- Give these a new issue label
- MultiIndex plotting: https://github.com/pydata/xarray/pull/3938
- Map Blocks: https://github.com/pydata/xarray/pull/3816
- Uses a dask backed "template"
- Documentation PRs:
- Parameterized accessors: https://github.com/pydata/xarray/pull/3960
- Blackdoc: https://github.com/pydata/xarray/pull/4012
- built-in accessors: https://github.com/pydata/xarray/pull/3988
- pint integration tests on DataArray: https://github.com/pydata/xarray/pull/3643
## Xarray Core Dev Meeting 2020-04-17
Below are the notes from Tom's email.
### Project boards
I'd like to suggest we start using [github project boards](https://help.github.com/en/github/managing-your-work-on-github/managing-project-boards) to keep track of major development efforts. I just spent like 2 hours reading the discussion on refactoring indexes on #1604, and I think it would be much easier to see the structure of our web of issues if we tracked them on project boards.
For example, the indexes board would look something like:
#### 1) Explicit indexes refactor
**To do**:
- Explicit indexes refactor (overview/discussion) ([#1603](https://github.com/pydata/xarray/issues/1603))
**In progress**:
- Explicit indexes ([#2195](https://github.com/pydata/xarray/pull/2195))
**Done**:
- Switch Dataset and DataArray to use explicit indexes ([#2639](https://github.com/pydata/xarray/pull/2639))
- Explicitly keep track of indexes with merging ([#3234](https://github.com/pydata/xarray/pull/3234))
**Would allow**:
- Wrap periodic indexes ([pangeo/#670](https://github.com/pangeo-data/pangeo/issues/670))
- Wrap kdtree indexer ([#475](https://github.com/pydata/xarray/issues/475))
- Wrap `dask.dataframe.index` ([#1650](https://github.com/pydata/xarray/issues/1650), [#3852](https://github.com/pydata/xarray/issues/3852))
- Slice using non-index coordinates ([#2028](https://github.com/pydata/xarray/issues/2028))
- How to add a custom indexer ([#2986](https://github.com/pydata/xarray/issues/2986))
- Interp on curvilinear grids ([#2281](https://github.com/pydata/xarray/issues/2281))
### Meeting?
I imagine some people now have more free time while others have less (probably depending on the proximity of children)
### Notes
We now have [project boards](https://github.com/pydata/xarray/projects), set up by Thomas Nicholas.
- example from scikit-learn: https://github.com/scikit-learn/scikit-learn/projects
- How to summarize all the deep technical issues (e.g. NEPs, flexible array types). _Do we need some new developer docs?_
- Split documentation into three parts?
- Users
- Domain specific introductions to xarray
- Something like [dask-stories](https://stories.dask.org/en/latest/) (aka marketing)
- Advanced users (extend xarray)
- Developers
- Automatic parallelization board? (JH)
- Funds from NASA grant
- PRs from Deepak that need review: [#3816](https://github.com/pydata/xarray/pull/3816), [#3965](https://github.com/pydata/xarray/issues/3965)
- Logistics:
- Haven't heard anything from CZI
- SciPy tutorials
- Maybe a good opportunity to refactor our tutorial
- Who wants to help make screencasts?
- Regular developer meeting?
- Tentative proposal: 30 min every 2 weeks?
- New technical issues:
- Add units by default from netCDF files?
- Factor our CF conventions into a separate repo?
- Technical issue worth discussing:
- Switch to keep_attrs by default?
- xarray/dask/zarr integration
- How do we help users who get stuck?
- Some sort of system-wide integration tests?
- Lots of work on zarr currently
- Figuring out spec v3
- Features like image pyramids
- Other languages are implementing zarr, e.g., Javascript and Julia
- What would it mean to implement xarray in other languages?
- xarray + zarr + cloud storage gives us a serverless API for accessing data
- What minimum set of functionality would we need?
- Should zarr be the first-class storage backend for xarray?
- JH: maybe we shouldn't have any first-class storage in xarray proper
- SH: there is value in having a few first-class options
- Maybe we just need to add Zarr integration into `open_dataset`?
- JH: concern that work in this space could conflict with backend refactor
- Can we make `open_mfdataset` generic?
- e.g., for zarr and/or rasterio
- domain specific support:
- Can we store state on accessors?
- What _should_ we support?
- Subclass API?
- We could add some way to override the Dataset/DataArray operation
- Doesn't seem super technically difficult
- Can we make a more minimalist version of xarray?
- xarray.Variable API is very minimal
- sklearn only wants labeled axes, not coordinate labels
- Could we make pandas an optional dependency?
- Decided against xarray because of pandas hard dependency: [SLEP-8](https://github.com/scikit-learn/enhancement_proposals/pull/18)
- Stephan:
- Thanks for picking up slack on github issues / PRs