Deepak Cherian
# New Xarray meeting notes

https://us02web.zoom.us/j/87503265754?pwd=cEFJMzFqdTFaS3BMdkx4UkNZRk1QZz09

Archive: https://hackmd.io/Vv6g2ABzTPKbe2MWBQqS1w

## August 27, 2025

### Attendees

- Tom
- Justus Magin / @keewis
- Ian
- Eni
- Deepak
- Stephan
- Rahul Bhatia

### 60-second updates

- Tom
  - nada
- Stephan
  - to_netcdf/to_zarr refactoring
- Deepak
  - reviewed indexes blog post: https://github.com/xarray-contrib/xarray.dev/pull/795
- Justus
  - deepcopy for marray (for sortby on xarray objects wrapping marray)
- Eni
  - Curious about HDF4-EOS support

### Agenda

- Support for HDF4-EOS?
- Rahul's use case for xarray
  - Quantitative finance
  - Data higher-dimensional than dataframes
- PRs to review:
  - propagate coord attrs in weighted ops: https://github.com/pydata/xarray/pull/10602
  - Refactoring writing to zarr/netcdf: https://github.com/pydata/xarray/pull/10656
  - DataTree.from_dict: https://github.com/pydata/xarray/pull/10658
  - HTML reprs for attrs? https://github.com/pydata/xarray/pull/10663
- Slicing (if we have time)
  - with multiple slices: https://github.com/pydata/xarray/pull/10573
  - vectorized slicing with slice objects
  - custom slice objects (and where they would live): right-exclusive, right-inclusive

## August 13, 2025

### Attendees

- Kai Mühlbauer / @kmuehlbauer
- Stephan
- Ian Hunt-Isaak / @ianhi
- Deepak
- Alfonso / @aladinor
- Tom

### 60-second updates

- Stephan: refactor/improvements related to writing netCDF/Zarr files, e.g.,
  - https://github.com/pydata/xarray/pull/10623
  - https://github.com/pydata/xarray/pull/10625
  - https://github.com/pydata/xarray/pull/10624
- Kai: issue triaging, small doc changes plus unlimited_dims, some h5netcdf/pyfive things
- Ian: Various Bio people continue to be interested, helping them. Trying to spend some time fixing Xarray bugs
- Deepak: https://github.com/pydata/xarray/pull/10630
- Alfonso: .prune method to remove empty nodes in datatrees.
  - https://github.com/pydata/xarray/pull/10598
- Tom
  - `.load_async`
  - async in general

### Agenda

- pypi org? https://github.com/xarray-contrib/xarray-contrib/pull/17
- backwards compatibility break for `to_netcdf()` -> `memoryview` (instead of `bytes`)? xref [diskless](https://docs.unidata.ucar.edu/netcdf/documentation/4.8.0/md__home_wfisher_Desktop_v4_88_80_netcdf-c_docs_inmemory.html)
- Tom wants to discuss async: https://github.com/pydata/xarray/issues/10622
- Need 2nd reviewer to merge https://github.com/pydata/xarray/pull/10598

## July 30, 2025

### Attendees

- Justus Magin / @keewis
- Ian / @ianhi
- Stephan
- Alfonso / @aladinor
- Tom
- Deepak

### 60-second updates

- Justus:
  - multiple slices
    - https://github.com/pydata/xarray/pull/10573
  - outer indexing by slice followed by array without expanding the slice
    - https://github.com/pydata/xarray/pull/10580
  - coordinates methods
    - https://github.com/pydata/xarray/pull/10318
- Deepak
  - reviewing
- Stephan
  - File-like objects and bytes for h5netcdf and DataTree
    - https://github.com/pydata/xarray/pull/10571
- Ian
  - Helping out people with downstream projects
  - Xarray Contrib
- Alfonso
  - Check open_datatree performance with @TomNicholas
    - https://github.com/pydata/xarray/issues/10579
- Tom
  - Raised a few issues
    - Performance issue Alfonso mentioned
    - We don't display indexes in DataTree
  - Gave a talk just now to Research Software Engineers in the UK
    - Much more widespread use of xarray in other fields of science than I expected
    - "My impression is that xarray is becoming a standard tool to reach for like pandas"
    - "Basically every new research project at the conference used xarray somewhere"

### Agenda

- scipy tutorial stipend
  - Tom will ask the participants
  - re-gift it to xarray
- New defaults for combine/concat/merge ([PR](https://github.com/pydata/xarray/pull/10062))
  - begins deprecation cycle
  - error when stacking with "minimal" everywhere
  - concat/combine:
    - data_vars = None
      - means "all" for "stack" and "minimal" for "concatenate".
      - reasoning [here](https://github.com/pydata/xarray/pull/10062#discussion_r2233597697)
    - coords = "minimal"
  - combine/concat/merge:
    - join = "exact"
    - compat = "override"
  - What happens with dataset setitem and update?
    - This is doing something else:
      - https://github.com/pydata/xarray/blob/e71c34174572b03e46e9d10bbea4528aef382021/xarray/structure/merge.py#L639-L649
  - What about arithmetic?
    - Let's consider this in the future
- multiple slices for positional indexing
  - https://github.com/pydata/xarray/pull/10573
  - is `MultipleSlices` a `BasicIndexer` or just an `OuterIndexer`?
- Xarray Contrib more structure
  - https://github.com/xarray-contrib/xarray-contrib/pull/17

## July 16, 2025

### Attendees

- Justus Magin / @keewis
- Tom
- Stephan
- Ilan
- Deepak

### 60-second updates

- Justus
  - moved the issue-from-pytest-log action to scientific python
- Stephan
  - no update
- Tom
  - SciPy tutorial
  - I'm suspicious there is a regression in xarray causing things to be dask arrays that weren't before
- Deepak
  - SciPy talk: https://xarray-indexes.readthedocs.io/
- Ilan
  - nullable pandas type promotion
    - https://github.com/pydata/xarray/pull/10423/

### Agenda

- Extension array:
  - https://github.com/pydata/xarray/pull/10423/
- Lazy coordinate variables
  - https://github.com/pydata/xarray/issues/10535
- Recommended dependencies
  - astropy did the switch recently on conda-forge (there's machinery for this on conda-forge)
  - need to be similar between PyPI and conda-forge
  - start publishing xarray-core? There's also a PEP on default dependencies

## July 02, 2025

### 60-second update

- Deepak:
  - indexes stuff
- Kai:
  - issue triaging, trying to catch up with Deepak and Benoit development, very interesting stuff :heart:
  - combining CF kwargs into one single argument
- Justus:
  - indexing with a set of (disconnected) slices: https://github.com/pydata/xarray/issues/10479
  - continued work on allowing skipping the creation of indexes: https://github.com/pydata/xarray/pull/8051
  - indexing dask array with a very large dimension with numpy array: https://github.com/dask/dask/pull/11998
- Tom:
  - Fixing xarray for breaking changes in zarr
- Stephan:
  - With Spencer Clark, identifying/fixing timedelta64 bugs

### Agenda

- (Tom) Release xarray to not break with most recent zarr (3.0.9)
  - also unbreaks timedelta encoding
  - Tom will handle this one.
- (deepak) move rasterix to xarray-contrib?
  - https://rasterix.readthedocs.io/en/latest/
  - :+1:
- KDTreeIndex in Xarray
  - https://github.com/pydata/xarray/pull/10478/
  - :+1:
- (deepak) IntervalIndex
  - [Proposal](https://github.com/pydata/xarray/issues/8005#issuecomment-3011641252)
  - "CF" IntervalIndex in Xarray or cf-xarray?
    - https://github.com/pydata/xarray/pull/10296
  - PandasIndex(pd.IntervalIndex) encode/decode
    - https://github.com/pydata/xarray/pull/10483
- Group decoding options into single argument (Kai)
  - [PR10429](https://github.com/pydata/xarray/pull/10429) as a first step towards [#10377](https://github.com/pydata/xarray/issues/10377)/[#10422](https://github.com/pydata/xarray/pull/10422)

## June 18, 2025

### 60-second update

- Kai (will not make it today, family business :birthday:)
  - worked on exploring grouping keyword arguments in backends, see agenda for some first ideas
  - full support wrt NumFOCUS CoC (Stephan's Email)
- Stephan
  - NumFOCUS Code of Conduct
- Justus
  - thinking about indexing with a list of slices
- Ilan
  - Struggling with mypy
- Deepak
  - https://xarray.dev/blog/season-grouping

### Agenda

- Adopting the NumFOCUS Code of Conduct (Stephan)
  - https://github.com/pydata/xarray/pull/10432
- IntervalIndex (Deepak)
  - Benoit PR: https://github.com/pydata/xarray/pull/10296
  - pandas IntervalIndex + accessor in xarray
  - CF extension in a separate package?
- Group decoding options into single argument (Kai)
  - [PR10429](https://github.com/pydata/xarray/pull/10429) as a first step towards [#10377](https://github.com/pydata/xarray/issues/10377)/[#10422](https://github.com/pydata/xarray/pull/10422)
  - discuss next time
- xarray tutorial at scipy
  - Proposal https://docs.google.com/document/d/1MijDYJCWlyJpkAwhPIN17Dvbe6gCmrEGOZxzz-vKwq8/edit?usp=sharing
  - Tom will organize a meeting / check who can attend / will help

## June 04, 2025

### Attendees

- Justus Magin / @keewis
- Kai Mühlbauer / @kmuehlbauer
- Ilan Gold / @ilan-gold
- Deepak Cherian / @dcherian
- Eni Awowale / @eni-awowale

### 60-second update

- Kai: issue triaging, getting rid of / solving old issues
- Deepak:
  - async things
  - concat for multi-variable indexes: https://github.com/pydata/xarray/pull/10371
  - Seasonal aggregation blogpost: https://github.com/xarray-contrib/xarray.dev/pull/777
  - fixed interp perf regression: https://github.com/pydata/xarray/pull/10370
  - trying to keep up with PRs
- Ilan: extension arrays, getting things merged with dask (?)
- Ian:
  - https://xarray-contrib.github.io/xarray-for-bio/intro.html

### Agenda

- what's up with the GH Actions, required status checks
  - Justus will have a look
- Extension arrays
  - https://github.com/pydata/xarray/pull/10334
- PRs needing review
  - dtypes and casting: https://github.com/pydata/xarray/pull/10380
  - new concat/merge options https://github.com/pydata/xarray/pull/10062
- Forwarding of backend keyword arguments https://github.com/pydata/xarray/issues/10377
  - Kai will open a PR

## May 21, 2025

### Attendees

- Ian
- Davis
- Alfonso
- Stephan
- Tom
- Justus
- Nick Hodgskin

### 60-second update

- Ian: test for Zarr dtypes
- Davis: working on Zarr dtypes, seeing what breaks in xarray
- Alfonso: async and concurrent write in DataTree
- Tom:
  - Working on adding a `.load_async` method
- Justus:
  - indexing sprint with Benoit and Scott
    - https://github.com/pydata/xarray/pull/10318
    - https://github.com/pydata/xarray/pull/10323
  - jupyterlite for API examples
    - https://github.com/pydata/xarray/pull/10299

### Agenda

- Dependencies in CI
  - `pyproject.toml` not being used by CI
  - historical reasons for this design (create conda environment from files in ci/requirements, then install xarray without dependencies from the packaging metadata)
  - python version could be bumped by now (our policy allows that now)
  - could revisit this, but will need some work and coordination
- async and concurrent loading of zarr-backed variables
  - https://github.com/pydata/xarray/pull/10327
  - Should xarray synchronous code manage its own threadpool? (see https://github.com/pydata/xarray/pull/10327#discussion_r2099263100)
    - Decision: Yes that would be nice to do within synchronous `.load()`
  - There are several other places that we could use async, especially inside DataTree
  - discussion
    - could zarr make this easier somehow?
      - not really, because it doesn't have enough information
      - also even if you could do `zarr.load_arrays(*arrs)` that doesn't help if you need to load many different xarray objects concurrently

## May 07, 2025

### Attendees

- Ian
- Joe
- Stephan
- Justus
- Deepak
- Tom

### 60-second updates

- Stephan
  - No xarray update
- Ian
  - Blog post: https://github.com/xarray-contrib/xarray.dev/pull/775
  - xarray-contrib keys to castle of projects
    - https://scientific-python.org/specs/spec-0006/
    - jupyterlab-contrib has a policy for this
- Joe
  - Back from CNG -> https://earthmover.io/blog/zarr-takes-cloud-native-geospatial-by-storm
- Deepak
  - pushed a release out
  - more extensionarray stuff
  - going to merge SeasonGrouper/SeasonResampler: https://github.com/pydata/xarray/pull/9524
- Tom
  - Nothing much on Xarray itself
  - Reviewed Ian's blog post
  - CNG made me curious what the current status of (geo-)indexes in xarray is?
  - Tutorial was accepted to SciPy
- Justus
  - meeting up with Scott and Benoit for an in-person sprint on xarray indexes

### Agenda

- One gnarly alignment question: https://github.com/pydata/xarray/issues/10243
  - https://github.com/dcherian/rasterix/issues/20
- PyPI rights for xarray-contrib
- Deprecation warnings for moving Errors into `xarray.errors`: https://github.com/pydata/xarray/pull/10285
- Move flox into xarray?
  - move ahead with the restructuring of the conda-forge packages
  - default extras (PEP 771): https://discuss.python.org/t/pep-771-default-extras-for-python-software-packages/79706
- PRs needing review:
  - timedelta encoding: https://github.com/pydata/xarray/pull/10101

## Apr 23, 2025

### Attendees

- Tom Nicholas / @TomNicholas
- Eni
- Stephan
- Ian
- Justus Magin / @keewis

### 60-second updates

- not much
- Stephan:
  - collapsing datatree
    - https://github.com/pydata/xarray/issues/9349
  - also list getitem on DataTree

### Agenda

- Ian's DataTree for asexual cell lineages idea
- zcollection
  - example of serializing non string names: https://github.com/CNES/zcollection
- SciPy plans
  - Lots of stuff accepted!
    - Xarray tutorial
    - Deepak and Benoit on indexes (talk?)
    - Ian on Xarray in biology (talk)
    - Justus
      - marray (poster, mainly Matt Haber)
      - xdggs (talk)
      - grid-weights (poster)
    - Tom
      - on VirtualiZarr (talk)
      - on Cubed (talk, Tom White submitted but can't attend in-person)
    - Joe on Icechunk (poster)

## Apr 09, 2025

### Attendees

- Deepak Cherian / @dcherian
- Tom Nicholas / @TomNicholas
- Nick Hodgskin / @VeckoTheGecko
- Justus Magin / @keewis
- Kai Mühlbauer / @kmuehlbauer
- Alfonso Ladino / @aladinor

### 60-second updates

- Nick
  - Low hanging fruit in xarray codebase (https://github.com/pydata/xarray/pull/10201)
  - Looking for stuff to contribute to as an xarray noob contributor. Interested in triaging issues
- Deepak
  - understanding raster world
  - https://github.com/benbovy/xproj/issues/22
- Kai
  - not much xarray related

### Agenda

- SeasonGrouper & SeasonResampler?
  - https://github.com/pydata/xarray/issues/10198
  - https://github.com/pydata/xarray/pull/9524
- Xarray in biology
  - DataTree for cell lineages?
  - New dedicated docs section for different domains
  - New dedicated docs page for biology (Ian is working on this)
- pandas extension array support
  - https://github.com/pydata/xarray/pull/9671
  - dtypes need looking at
  - we are now exposing our internal `PandasExtensionArray` wrapper as DataArray.data

## Mar 26, 2025

### Attendees

- Tom Nicholas / @TomNicholas
- Alfonso Ladino / @aladinor
- Eni Awowale / @eni-awowale
- Justus Magin / @keewis
- Ian Hunt-Isaak / @ianhi

### 60-second updates

- Tom
  - Fixed issues with DataTree and zarr-python v3 (PR [#10020](https://github.com/pydata/xarray/pull/10020))
  - Thanks Alfonso for a lot of work there
  - Still some things to do (e.g. `append` kwarg)
- Eni
  - Can't come to SciPy :(
- Alfonso
  - Consider changing default consolidated=None to False [issue #10122](https://github.com/pydata/xarray/issues/10122)
  - Can't come to SciPy either
- Davis
  - Final stretch of making data types in zarr-python 3 extensible

### Agenda

- Subsetting method for DataTree
  - https://github.com/pydata/xarray/issues/9346
  - Similar to `.match()` https://github.com/pydata/xarray/blob/66f6c17fa3c9eaaa8d1bb2b90f34d826b194fb60/xarray/core/datatree.py#L1484

## Mar 12, 2025

### Attendees

- Eni Awowale / @eni-awowale
- Justus Magin / @keewis
- Julia Signell / @jsignell
- Alfonso Ladino / @aladinor
- Davis Bennett / @d-v-b

### 60-second updates

- Eni
  - xr.tutorial.open_datatree (https://github.com/pydata/xarray/pull/10082)
- Alfonso:
  - Xarray DataTree compatibility issue with Zarr-Python V3 (PR https://github.com/pydata/xarray/pull/10020) almost ready to go (pending review from Tom and Joe)
- Julia: Have been working on changing the default kwargs for concat/merge/open_mf_dataset (PR: https://github.com/pydata/xarray/pull/10062)

### Agenda

- array api:
  - array-api-strict requires the condition passed to `where` to have a boolean dtype, so we need to cast
  - array-api-strict's way to specify dtype is from the namespace
- changed defaults for compat in concat / merge / combine
- PRs to review:
  - https://github.com/pydata/xarray/pull/10020
  - https://github.com/pydata/xarray/pull/10062
  - https://github.com/pydata/xarray/pull/10082

## Feb 26, 2025

### Attendees

- Justus Magin / @keewis
- Alfonso Ladino / @aladinor
- Matt Savoie / @flamingbear
- Tom Nicholas / @TomNicholas
- Eni Awowale / @eni-awowale
- Ian Hunt-Isaak / @ianhi
- Joe Hamman / @jhamman
- Davis Bennett / @d-v-b

### 60-second updates

- Davis: New Zarr API for creating objects in parallel.
- Alfonso: Xarray DataTree compatibility issue with Zarr-Python V3 (PR https://github.com/pydata/xarray/pull/10020)
  - Waiting for issue [Zarr-Python#2830](https://github.com/zarr-developers/zarr-python/issues/2830) and [Zarr-Python#2821](https://github.com/zarr-developers/zarr-python/issues/2821)
- Eni: DataTree tutorial PR. Pooch not working. [PR](https://github.com/xarray-contrib/xarray-tutorial/pull/307)

### Agenda

- Xarray SciPy tutorial submission
  - Submission deadline is 1 week away (and has already been postponed once)
  - Tom will email people to get going
- gap filling PR: https://github.com/pydata/xarray/pull/9402
  - https://github.com/pydata/xarray/issues/7665#issuecomment-1994899282
  - Stephan will contact Max
- How Xarray could be more useful to the biosciences
  - Functional indexes, i.e., an affine transformation, would be very useful for bio-imaging
- DataTree tutorial dataset:
  - Move datatree files to the top-level directory
  - Add `xr.tutorial.open_datatree`

## Feb 12, 2025

### Attendees

- Deepak Cherian / @dcherian
- Justus Magin / @keewis
- Alfonso Ladino / @aladinor
- Eni Awowale / @eni-awowale
- Tom Nicholas / @TomNicholas
- Max Jones / @maxrjones
- Ian Hunt-Isaak / @ianhi
- Stephan

### Updates

- Deepak
  - welcoming Ian!
- Alfonso
  - Attempt to solve incompatibilities in Zarr V3 and DataTrees issue: https://github.com/pydata/xarray/issues/9960, PR: https://github.com/pydata/xarray/pull/10020
- Justus:
  - archived `xarray-datatree` on PyPI
- Tom:
  - Mostly messing around with FROST
    - giving talk in a couple of hours (https://discourse.pangeo.io/t/pangeo-showcase-frost-federated-registry-of-scientific-things-feb-12-2025/4861)
- Max:
  - looking to try out https://github.com/pydata/xarray/pull/10000 and https://github.com/pydata/xarray/pull/9543 for GeoZarr, here to find out about any known gotchas
  - working on Xarray on GPUs with Wei Ji and Negin at NCAR hackathon, with Tom and Akshay from NVIDIA as mentors
- Eni:
  - ESIP IT&I talk [Thursday](https://www.esipfed.org/event/iti-information-technology-interoperability-3-11/)
- Stephan
  - Duck typing conversion

### Agenda

- SciPy proposals?
  - Email same list of people who ran tutorial last time
- Pandas extension arrays in Xarray
  - bug about how we've stopped eagerly converting to numpy: https://github.com/pydata/xarray/issues/9742
  - more extension array support: https://github.com/pydata/xarray/pull/9671
  - Possible solutions:
    1. Preserve them as pandas extension types -- but a lot of operations break
       - nice for categoricalarray, intervalarray, datetimearray with timezone
       - add support for N-d data into PandasExtensionWrapper, by adding a `.shape` attribute
    2. Convert into corresponding NumPy dtypes -- but this is lossy
       - option to control which dtypes are converted
    3. Wrap them in masked duck arrays, using marray
       - increased memory usage (additional bool mask)
       - somewhat surprising?
    4. Somehow make it easier to write custom dtypes in NumPy
- Array conversion methods (e.g., https://github.com/pydata/xarray/pull/9823)

```python
ds.as_array_type(cp.asarray)
ds.as_array_type(jnp.from_dlpack)
ds.as_array_type(jnp.asarray, device=jax.devices("gpu")[0])
ds.as_array_type(pint.Quantity, units="m/s")
ds.is_array_type(cp.ndarray)  # -> True
```

  - Could also map over all array nodes ala JAX-Xarray?
    - `jax.tree.map(cp.asarray, ds)` or `isinstance(array.data, cp.ndarray)`
  - Conceptually this is a typing issue -- could Dataset be a generic array mapping over a value type?
    - Kind of like dict / TypedDict

## Jan 29, 2025

### Attendees

- Tom Nicholas / @TomNicholas
- Kai Mühlbauer / @kmuehlbauer
- Justus Magin / @keewis
- Alfonso Ladino / @aladinor
- Matt Savoie / @flamingbear
- Eni Awowale / @eni-awowale
- Joe Hamman / @jhamman
- Josh Moore / @joshmoore

### 60 seconds updates

- Kai & Spencer: non-nanosecond finalization
  - https://github.com/pydata/xarray/pull/9966 and
  - https://github.com/pydata/xarray/pull/9999
- Justus:
  - attempts to get autocompletion for accessors to work (may not actually work)
    - https://github.com/pydata/xarray/pull/9985
- Tom
  - VirtualiZarr stuff
- Stephan -- no updates
- Matt
  - just coming back to work
- Davis
  - where could we speed up dataset creation with writing to Zarr? Blog on speeding up Zarr workloads

    ![image](https://hackmd.io/_uploads/B1PUYRPdJx.png)
- Eni: working on tutorial demo
- Deepak: no update
- Alfonso: DataTree implementation on NEXRAD data
- Eni: working on tutorials
- Joe: would love to discuss DataTree & Zarr v3

### Agenda

- Executive order?
  - NumFOCUS is technically a NASA subcontractor through xarray funding...
  - Let's just wait for NumFOCUS to advise
- Flexible indexes PR status?
  - https://github.com/pydata/xarray/pull/9543
  - Needs either review or someone else to finish it
- Multiple DataTree bugs surfaced
  - Using a forward slash in a node name of a DataTree causes a RecursionError: https://github.com/pydata/xarray/issues/9978
  - DataTree roundtrip fails on None group lookup: https://github.com/pydata/xarray/issues/9960
  - Especially this one: https://github.com/pydata/xarray/issues/9912
- The name of the root group in Zarr / DataTree
  - Compatibility with zarr v3
    - https://github.com/pydata/xarray/issues/9984
    - `root.create_array(..., name='/foo/bar')` vs `root.create_array(..., name='foo/bar')`
- Release including non-nanosecond in January (tomorrow or Friday) or postpone to February
  - just release? Kai will create a final release tracking issue [#10002](https://github.com/pydata/xarray/issues/10002) and then prepare the release tomorrow/Friday
- [marray](https://github.com/mdhaber/marray) appears to be ready for trying with `xarray` (might be possible to interface with `fillna`, `bfill` / `ffill`, `where`, `notnull` / `isnull`, `interpolate_na`):

```python
import marray
import numpy as np
import xarray as xr

data = marray.numpy.asarray(np.array([1, 2, 3], dtype="int32"), mask=np.array([True, False, True]))
arr = xr.DataArray(data, dims="x")
assert arr.dtype == "int64"
```

## Jan 15, 2025

### Attendees

- Stephan Hoyer
- Kai Mühlbauer / @kmuehlbauer
- Eni Awowale / @eni-awowale
- Davis Bennett

### 60 seconds updates

- Eni:
  - Considering datasets to use for DataTree tutorial:
    - [GPM_3IMERGHH](https://disc.gsfc.nasa.gov/datasets/GPM_3IMERGHH_07/summary)
    - [GPM_2ADPR](https://disc.gsfc.nasa.gov/datasets/GPM_2ADPR_07/summary)
    - [OMSO2e](https://disc.gsfc.nasa.gov/datasets/OMSO2e_003/summary?keywords=OMSO2e_003)
  - Decided to go with GPM_3IMERGHH
- Kai: non-nanosecond cf time decoding https://github.com/pydata/xarray/pull/9618

### Agenda

- non-nanosecond stuff (how to move forward, next steps)
  - merge now, let Stephan do some google testing
  - continue work on separating/splitting out coders
- concurrent zarr creation
  - example https://github.com/zarr-developers/zarr-python/pull/2665

## Dec 18, 2024

### Attendees

- Benoît Bovy / @benbovy (cannot attend, unfortunately)
- Kai Mühlbauer / @kmuehlbauer
- Tom Nicholas / @TomNicholas
- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Eni Awowale / @eni-awowale
- Deepak Cherian / @dcherian
- Stephan Hoyer

### 60 seconds updates

- Benoît
  - continued working on https://github.com/benbovy/xproj
- Kai
  - [Relax nanosecond datetime restriction in CF time decoding](https://github.com/pydata/xarray/pull/9618)
  - Now splitting out backwards compatible code parts for better review experience. Reviews appreciated on the pending PRs.
    - [x] [Add "unit"-parameter to date_range, enhance iso time parser to us](https://github.com/pydata/xarray/pull/9885)
    - [x] [move scalar-handling logic into possibly_convert_objects](https://github.com/pydata/xarray/pull/9900)
    - [ ] [Enhance and move ISO-8601 parser to coding.times](https://github.com/pydata/xarray/pull/9899)
    - [ ] [split out CFDatetimeCoder, deprecate use_cftime as kwarg ](https://github.com/pydata/xarray/pull/9901)
    - [ ] [time coding refactor](https://github.com/pydata/xarray/pull/9906)
    - [ ] use iso-parser when reference time out-of-bounds (needs iso-parser and time-coding refactor)
- Tom
  - Blog post on DataTree collaboration
    - Hoping to finish this today, and release before the end of the year
  - AGU
  - VirtualiZarr
    - Serverless parallelization of opening files in `open_mfdataset`
    - See https://github.com/zarr-developers/VirtualiZarr/pull/349#discussion_r1885979222
- Eni
  - xarray.DataTree poster for AGU last week
  - Working/thinking about a good tutorial notebook for DataTree
  - `DataTree.to_zarr(append_dim)` bug (https://github.com/pydata/xarray/issues/9858)
- Deepak
  - working on rewriting linear interp to use indexing + averaging
  - rewrote interp to use apply-ufunc
    - https://github.com/pydata/xarray/pull/9881
  - using shuffle for groupby binary ops
    - https://github.com/pydata/xarray/pull/9896

### Agenda

- NumFOCUS SDG idea: Xarray objects reprs (Benoît)
  - https://numfocus.org/programs/small-development-grants (we should use it more)
  - HTML (interactive) repr:
    - Well-defined scope, well suited for NumFOCUS SDG application
    - Impactful! (https://matthewrocklin.com/blog/2019/07/04/html-repr)
    - Fix html repr rendering issues in sphinx-based documents (dark-mode)
    - DataTree repr (https://github.com/pydata/xarray/issues/9350, https://github.com/pydata/xarray/issues/9350)
    - Embed nd-array visualization (https://github.com/pydata/xarray/issues/9324)
    - https://github.com/benbovy/xarray-fancy-repr (the whole JavaScript ecosystem at our fingertips!)
  - I'm (Benoît) happy to draft an SDG proposal. Who else is interested in contributing? Find a front-end / JavaScript / Viz expert?
- Moving CF-related codecs outside of xarray

## Dec 04, 2024

### Attendees

- Deepak Cherian / @dcherian
- Justus Magin / @keewis
- Scott Henderson / @scottyhq
- Tom Nicholas / @TomNicholas
- Nick Hodgskin / @VeckoTheGecko

### 60 second updates

- Deepak:
  - played around with better vectorized interp with dask
  - better idxmin, idxmax with dask
    - https://github.com/pydata/xarray/pull/9800
  - pushed anderson's namedarray/backends refactor quite close
    - https://github.com/pydata/xarray/pull/9273
- Scott:
  - worked w/ Benoit last week to rekindle CRSIndex
    - prototype https://github.com/benbovy/xproj
  - sent email about possible NSF Grant, any takers?
    - https://new.nsf.gov/funding/opportunities/safe-ose-safety-security-privacy-open-source-ecosystems
- Justus
  - some progress on [marray](https://github.com/mdhaber/marray/)
- Nick:
  - New to xarray!
  - Bio
    - Research Software Engineer at Utrecht University working on OceanParcels and other oceanography projects
    - Looking to depend more on Xarray, and contribute upstream for my own professional development.
  - Low hanging fruit (https://github.com/pydata/xarray/pull/9821, https://github.com/pydata/xarray/pull/9840). Still need feedback on 9821
- Tom
  - Mostly VirtualiZarr stuff for AGU
  - Small xarray things
  - Writing blog post announcing `xarray.DataTree`
    - Ideas https://github.com/xarray-contrib/xarray.dev/issues/708
- Stephan
  - Writing yet-another implementation of labeled arrays on top of JAX: https://github.com/neuralgcm/neuralgcm/tree/main/neuralgcm/experimental/coordax

### Agenda

- NSF security grant
- anything to include in datatree blog announcement?
  - Want to include thoughts about collaboration with NASA
    - Including the in-kind dev time contributions that ESDIS made
    - Ideal in the sense of literally zero overhead
    - Also core dev spending 10% time spent directing someone with more time is efficient use of relative expertise
    - Less ideal that Tom/Justus/Stephan didn't get paid for the work
    - In future better to have one of the paid people at the contributing org already be a core dev
- pushed anderson's namedarray/backends refactor quite close. ready for prelim review.
  - https://github.com/pydata/xarray/pull/9273

## Nov 20, 2024

### Attendees

- Matt Savoie / @flamingbear
- Deepak Cherian / @dcherian
- Justus Magin / @keewis
- Stephan Hoyer
- Eni Awowale / @eni-awowale
- Kai Mühlbauer / @kmuehlbauer
- Tom Nicholas / @TomNicholas
- Alfonso Ladino / @aladinor

### 60 second updates

- Matt: mostly just watching the repo for datatree issues and using it constantly in my day to day.
- Deepak:
  - lots of dask stuff
  - zarr v3 compatibility
  - icechunk distributed writes
- Justus:
  - rewrite of the min-deps check script
  - creation of a separate github action
- Kai:
  - datetime64 decoding (non nanosecond relaxation, https://github.com/pydata/xarray/pull/9618)
- Tom:
  - Not much direct xarray stuff
- Eni:
  - Testing xarray.DataTree internally ran into some issues with numpy 2.0 :-/
  - Working on DataTree poster for AGU, will share accordingly with folks!

### Agenda
- fsspec utility PR: https://github.com/pydata/xarray/pull/9797
- icechunk & to_zarr
    - https://github.com/earth-mover/icechunk/issues/383
    - add the notion of closing a store?
    - maybe make the store readable after pickle?
    - upstream issue: https://github.com/earth-mover/icechunk/issues/185
- duck array / array api PR: https://github.com/pydata/xarray/pull/9798
- ImportError / ValueError when chunkmanagers not installed?
    - https://github.com/pydata/xarray/pull/9676
    - Fine to maintain explicit list of "expected" chunkmanagers
        - This would help us improve error messages by pointing to packages like cubed-xarray
    - Separate question of whether or not the entire entrypoint system was overkill
        - But we can punt on that for later

## Nov 06, 2024

### Attendees
- Tom Nicholas / @TomNicholas
- Deepak Cherian / @dcherian
- Kai Mühlbauer / @kmuehlbauer
- Owen Littlejohns / @owenlittlejohns
- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Stephan Hoyer / @shoyer
- Eni Awowale / @eni-awowale

### 60 second updates
- Tom
    - Was ill
    - Now crying about election
    - Working on virtualizarr
    - Datatree release seems to have gone okay?
- Deepak
    - shuffle:
        - Groupby.shuffle: https://github.com/pydata/xarray/pull/9320
        - GroupBy.map(..., shuffle=True) https://github.com/pydata/xarray/pull/9706
- Joe
    - zarr3
    - zarr3+xarray concurrency
- Kai:
    - non-nanosecond time decoding (https://github.com/pydata/xarray/pull/9618)
    - old issue treatment
- Justus:
    - astropy / numpy subclasses: https://github.com/pydata/xarray/pull/9705

### Agenda
- fsspec by default in all backends:
    - https://github.com/pydata/xarray/issues/9723
    - perhaps with only basic fsspec functionality
    - will ask to open PR.
- PRs needing review:
    - Shuffle: https://github.com/pydata/xarray/pull/9320
        - GroupBy.shuffle() -> GroupBy #
        - GroupBy.sort() -> Dataset
        - Dataset.shuffle_by(Groupers) -> Dataset
        - GroupBy.map(.., shuffle=True) -> uses shuffle + map_blocks (useful for quantile)
    - Pandas extensionarray: https://github.com/pydata/xarray/pull/9671
        - kai + someone else
    - unlock setup-micromamba https://github.com/pydata/xarray/pull/9732

## Oct 23, 2024

### Attendees
- Justus Magin / @keewis
- Joe Hamman / @jhamman
- Tom Nicholas / @TomNicholas
- Eni Awowale / @eni-awowale
- Deepak Cherian / @dcherian

### 60 second updates
- Stephan:
    - Lots of DataTree refinements
    - Added xarray.group_subtrees(): https://github.com/pydata/xarray/pull/9636
- Justus:
    - open_datatree + chunks
    - missing value support for numpy (marray / dtypes)
- Tom
    - Reviewing Stephan's DataTree PRs
    - Some small DataTree PRs myself, including updating the HTML repr to match new inheritance model
    - Otherwise mostly VirtualiZarr stuff
- Joe
    - Zarr v3
    - Icechunk
- Deepak
    - just back from vacation.

### Agenda
- Release?
    - What to do with `xarray-contrib/datatree`?
        - Yank it from PyPI?
            - No - instead release one more time with a warning on import
            - Maybe yank in future...
        - link to the migration guide in the readme
        - retire the old datatree repository
    - Tom volunteered to do the release
- DataTree stuff to finish up?
    - Support chunks in open_datatree()
    - compute, load, chunk, persist?
    - Re-write coordinates in each group when writing to Zarr?
- Zarr V3 PR
    - stops interpreting Zarr `.fill_value` as CF `_FillValue` only for new V3 stores
- are we affected by the RTD add-ons deprecation?
    - https://about.readthedocs.com/blog/2024/07/addons-by-default/#how-does-it-affect-my-projects

## Oct 9, 2024

### Attendees
- Tom Nicholas / @TomNicholas
- Justus Magin / @keewis
- Joe Hamman / @jhamman
- Owen Littlejohns / @owenlittlejohns
- Spencer Clark / @spencerkclark
- Mathias Hauser / @mathause

### 60 second updates
- Tom
    - Reviewing Stephan's datatree PRs around coordinate inheritance
    - Migration guide for users of old datatree repo https://github.com/pydata/xarray/pull/9598
- Justus
    - PR to avoid truncating fixed-width strings: https://github.com/pydata/xarray/pull/9586
- Joe
    - Xarray <-> Zarr-python V3 integration is close but not in `main`
- Mathias
    - Issue with reducing non-numeric scalars
- Spencer
    - Just wanted to thank Kai for looking at datetime precision issue
- Owen
    - Planning to review Tom's PR on datatree alignment docs

### Agenda
- Zarr-python v3 status update
    - Consolidated metadata is on by default in xarray but not part of v3 spec
        - But Tom A has made that work on a branch
    - FillValue issues
        - https://github.com/pydata/xarray/issues/5475
    - Strings
        - Added a variable-length string codec in zarr
    - working branches

```
pip install git+https://github.com/TomAugspurger/zarr-python@xarray-compat git+https://github.com/TomAugspurger/xarray/@fix/zarr-v3 git+https://github.com/jhamman/dask@fix/zarr-array-construction-2
```

## Sep 25, 2024

### Attendees
- Kai Mühlbauer / @kmuehlbauer
- Justus Magin / @keewis
- Deepak Cherian / @dcherian
- Matt Savoie / @flamingbear
- Tom Nicholas / @TomNicholas
- Eni Awowale / @eni-awowale
- Paul Ockenfuß / @Ockenfuss
- Spencer Clark / @spencerkclark

### 60 second updates
- Kai
    - [ERAD2024](https://openradarscience.org/erad2024/) short course on open source software for weather radar processing
    - h5netcdf new release soon with additional capabilities
        - preparing xarray for that change
- Justus
    - nested duck array introspection issue: https://github.com/data-apis/array-api/discussions/843#discussioncomment-10714668
        -
          feedback: instead of a new protocol, get nested namespaces
    - swapping the order of preference to `__array_namespace__` over `__array_function__`
        - https://github.com/pydata/xarray/pull/9530
- Deepak
    - groupby things (chunked array, shuffle)
- Stephan
    - DataTree inheritance issues, related to https://github.com/pydata/xarray/issues/9475
- Tom
    - DataTree inheritance model discussions
    - Wrote some documentation on the new data model
        - https://xray--9501.org.readthedocs.build/en/9501/user-guide/hierarchical-data.html#alignment-and-coordinate-inheritance
        - Would be a good thing for others to take a look at
        - But bear in mind it is affected by the unsolved issue https://github.com/pydata/xarray/issues/9475
- Spencer
    - Addressing various issues arising from changes made to enable lazy encoding of chunked arrays of datetimes. https://github.com/pydata/xarray/pull/9498 should hopefully more robustly address most of these.
        - See discussion here for more background: https://github.com/pydata/xarray/issues/9488#issuecomment-2351149546.
        - Defer cast to different dtype to its usual place in the encoding pipeline.
        - More safely allow different default choice of datetime64[ns] encoding units: https://github.com/pydata/xarray/issues/9154.

### Agenda
1. Need decision on xarray, xarray-core on conda-forge
    - https://github.com/conda-forge/xarray-feedstock/pull/113#issuecomment-2265819231
    - Decision: xarray, xarray-core on conda; xarray & xarray[recommended] on PyPI?
    - 6 month deprecation period.
2. PRs needing review:
    - GroupBy(chunked array): https://github.com/pydata/xarray/pull/9522
    - netcdf4/h5netcdf: complex numbers and enums https://github.com/pydata/xarray/pull/9509
    - API naming for improved gap filling. See summary of open questions [here](https://github.com/pydata/xarray/pull/9402#issuecomment-2341844048) and [here](https://github.com/pydata/xarray/pull/9402#issuecomment-2344171177)
3.
   Grouped Shuffle
    - general issue: https://github.com/pydata/xarray/issues/9546
    - PR: https://github.com/pydata/xarray/pull/9320
4. DataTree inheritance issue, related to https://github.com/pydata/xarray/issues/9475
    - separate discussion meeting? (Tom: Yes, we could also just stay on the call after? Stephan: unfortunately I cannot today)

## Sep 11th, 2024

### Attendees
- Matt Savoie / @flamingbear
- Tom Nicholas / @TomNicholas
- Owen Littlejohns / @owenlittlejohns
- Eni Awowale / @eni-awowale
- Stephan Hoyer / @shoyer

### 60 second updates
- Tom
    - Lots of DataTree stuff
    - We are very close to releasing!
- Matt
    - (datatree) keeping up with main changes in docs. Need to fix current doc errors.

### Agenda
- Release
- DataTree q's (maybe answered now)
    - Hashable vs str
        - https://github.com/pydata/xarray/issues/8836#issuecomment-2341963401
    - `DataTree.subtree.<method>` namespace?
        - https://github.com/pydata/xarray/issues/9472#issuecomment-2341590576
    - Avoid duplicated variables by design
    - Eni's `open_groups` typing Q

## Aug 28, 2024

### Attendees
- Justus Magin / @keewis
- Matt Savoie / @flamingbear
- Tom Nicholas /
- Deepak Cherian /
- Daniel Kaufman / @danielfromearth

### 60 second updates
- Justus: sprint on xarray + duckarrays testing framework with Tom (https://github.com/xarray-contrib/xarray-array-testing/)
- Tom:
    - duckarrays testing
    - using `conventions.decode_cf_variables` without decoding actual values
        - https://github.com/zarr-developers/VirtualiZarr/pull/224
    - going to NumFOCUS summit next week
- Deepak:
    - groupby multiple arrays
    - providing input to dask things: shuffling, blockwise reshape, auto rechunking, reshaping
        - https://github.com/dask/dask/pull/11350/files
- Matt: Nada

### Agenda
- Anyone else want to come and represent Xarray at the NumFOCUS summit next week in Boston?
    - Eni?
- decode_cf
- PRs needing review:
    - speed up docs build: https://github.com/pydata/xarray/pull/9395
- Shuffling API:
    - https://github.com/pydata/xarray/pull/9320
    - `Dataset.shuffle_by() -> Dataset`
    - `DatasetGroupBy.shuffle() -> DatasetGroupBy`.

## Aug 14, 2024

### Attendees
- Justus Magin / @keewis
- Owen Littlejohns / @owenlittlejohns
- Eni Awowale / @eni-awowale
- Tom Nicholas / @TomNicholas
- Deepak Cherian / @dcherian

### 60 second updates
- Moving old datatree issues
- Tom
    - Working on chunkmanager PR
- Deepak
    - groupby shuffle
        - https://github.com/pydata/xarray/pull/9320
    - engaging with dask
        - https://github.com/dask/dask/issues/11314
        - https://github.com/dask/dask/pull/11303
        - https://github.com/dask/dask/pull/11273
- Justus
    - merged the python 3.9 dropping PR

### Agenda
- VirtualiZarr ideas?
- Eni's open_groups PR
    - https://github.com/pydata/xarray/pull/9243

## July 31, 2024

### Attendees
- Matt Savoie / @flamingbear
- Tom Nicholas / @TomNicholas
- Justus Magin / @keewis
- Owen Littlejohns / @owenlittlejohns

### 60 second updates
- Matt
    - is trying to wrap head around copying trees. [#9285](https://github.com/pydata/xarray/issues/9285) should not be as hard as I'm making it.
- Tom
    - Trying to coordinate to push datatree over the finish (well, first release) line
    - Fixes for a bunch of small datatree bugs
    - PR for allowing chunked arrays that aren't dask/cubed through xarray
        - https://github.com/pydata/xarray/pull/9286
        - rename chunkmanagers vs "ComputeManagers"
- Justus:
    - released 2024.07.0 yesterday (new script to extract contributors from git commits)
    - pint-xarray: accessor entrypoints / PintIndex

### Agenda
- DataTree should avoid any in-place modification
    - Auto-copy on setting parent?
    - Remove the ability to assign .parent entirely?
        - Need to keep .parent accessible in order to walk up through tree
- Who is submitting to AGU today?
    - (Tom is, on VirtualiZarr)
    - (Ryan is)
    - Owen is
    - Stephan maybe
- ChunkManager vs ComputeManager https://github.com/pydata/xarray/pull/9286
- Justus tell us about the PintIndex (postponed to next time)

## July 17, 2024

Cancelled -- only Stephan Hoyer and Justus Magin showed up.

## July 3, 2024

### Attendees
- Justus Magin / @keewis
- Tom Nicholas / @TomNicholas
- Matt Savoie / @flamingbear
- Joe Hamman / @jhamman

### 60 second updates
- Tom
    - Reviewed datatree coordinate inheritance PR properly (https://github.com/pydata/xarray/pull/9063)
        - Now unblocked for releasing datatree in `main`
    - Mostly actually wrote code for virtualizarr
- Justus
    - lots of fixes for numpy2 (for the dependencies we couldn't test before)
        - https://github.com/pydata/xarray/pull/9136 should be ready for merging
    - other bug fixes (hypothesis test for datetime ExtensionArrays, arrays as attributes)
    - nested duck arrays: finding `cupy` underneath arbitrary layers (especially dask)
- Matt
    - Watching [#9063](https://github.com/pydata/xarray/pull/9063#) get merged.
- Joe
    - Just working on zarr-python

### Agenda
- Codecs separate from xarray?
    - Keeps coming up in virtualizarr
        - https://github.com/zarr-developers/VirtualiZarr/issues/68#issuecomment-2197682388
    - Can we get zarr-python to open a netCDF file by using chunk manifests + defining enough new codecs?
        - One difference is that zarr codecs take arrays -> arrays but xarray decoding takes Variables -> Variables
    - action: open a new issue to consolidate the discussion
- Interesting question of subclassing xarray.Dataset in virtualizarr
    - https://github.com/zarr-developers/VirtualiZarr/issues/171
- Cupy + dask
    - https://github.com/pydata/xarray/issues/7721 (discussion of the issue)
    - https://github.com/keewis/nested-duck-arrays

## June 19, 2024

### Attendees
- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Tom Nicholas / @TomNicholas
- Stephan Hoyer / @shoyer
- Owen Littlejohns / @owenlittlejohns
- Eni Awowale / @eni-awowale
- David Auty / @autydp
    - From NASA EED-3, knows Matt and Owen

### 60 second updates
- Matt
    - Hope to continue datatree inheritance discussion.
- Justus: numpy2-compatible release last week

### Agenda
- DataTree coordinate inheritance question
    - Release timeline
        - Can we release by the time of Eni and Tom's SciPy talk about DataTree? (~July 10th)
    - How much feedback from community do we need?
        - Stephan: Got plenty already
        - David: Has "quirky" data at NASA
            - Would probably prefer more lenient data model
        - Stephan: Prefer not to have "fallback mechanisms"
        - David: Wants to use datatree for analysis, ideally changing the structure as little as possible
    - Tom: What do we think about this `open_as_dict_of_datasets` idea? Would that help?
        - Tom: Solves problem of interrogating data / displaying groups
        - Stephan: Makes some sense - analogous to how `open_mfdataset` works for 90% of cases
            - As if we had made an `open_mf_as_grid_of_datasets` function to create an interrogatable intermediate structure
        - Stephan: Function to write a messy dataset too? (lower priority)
        - Matt: In favour
        - David: Can you open just a subtree of a file?
          Tom: Yes, if we add a group kwarg to `open_datatree`
        - Eni: It would be useful if, when `open_datatree` fails on alignment, it gave a very clear report of what should be fixed
        - Justus: `preprocess` arg could be useful for "massaging"
        - Tom: Could use python's new Exception Groups feature for showing all errors at once
        - Stephan: Should also think about saving out a "crooked datatree"
    - Consensus?!
    - Plan going forward
        - Everyone who is interested look in detail at Stephan's PR (https://github.com/pydata/xarray/pull/9063)
            - Likely to spawn smaller issues / PRs about reprs and so on
        - Need separate PR for `open_as_dict_of_datasets` (or `open_datatree_as_dict`?) (Tom can raise issue for this)
            - Orthogonal to Stephan's PR
        - Tutorial for tidying up a messy nested netCDF file into a nice sane aligned DataTree (similar to the "Tidy Xarray" idea)

## June 5, 2024

### Attendees
- Justus Magin / @keewis
- Tom Nicholas / @TomNicholas
- Kai Mühlbauer / @kmuehlbauer
- Stephan Hoyer / @shoyer
- Joe Hamman / @jhamman
- Deepak Cherian / @dcherian
- Matt Savoie / @flamingbear
- Mathias Hauser / @mathause

### 60 second updates
- Justus:
    - numpy 2: more progress, still not done (lots of edge cases): https://github.com/pydata/xarray/pull/8946
- Joe
    - still heads down on zarr 3, alpha release coming this week
    - I will be at the CZI annual meeting next week, kicking off our latest grant
    - We're hiring (https://jobs.gusto.com/postings/earthmover-xarray-community-developer-498dca94-335e-4d5c-a6c7-83ca19772512)
- Tom:
    - Owen, Matt, and Eni accepted our invitation to join the core team!
    - Mostly just this discussion about datatree coordinate inheritance behaviour (https://github.com/pydata/xarray/issues/9056)
- Kai:
    - a bit of issue clearance
    - jumped now on the open_datatree-stuff
- Deepak:
    - job posting: https://jobs.gusto.com/postings/earthmover-xarray-community-developer-498dca94-335e-4d5c-a6c7-83ca19772512
        - @deepak (https://discourse.pangeo.io/t/potential-for-adapting-pythia-foundations-for-different-disciplines-e-g-neuro/4239/3?u=tomnicholas)
    - user survey: https://docs.google.com/forms/u/2/d/1x9bOIelnUsDMyI1tF4bN7TWK0v4nBDiwhpxh9mi6PaI/edit
        - last call for comments.
- Stephan:
    - DataTree inheritance model: https://github.com/pydata/xarray/pull/9063

### Agenda
- Numpy 2: dtype casting in where
    - separate meeting to discuss in detail
- Miscellaneous PRs
    - https://github.com/pydata/xarray/pull/5704

## May 22nd, 2024

### Attendees
- Matt / @flamingbear
- Justus / @keewis
- Tom Nicholas / @TomNicholas
- Mathias Hauser / @mathause

### 60 second updates
- Matt
    - no update
- Justus
    - numpy 2: array api fixes (ready for a final review!) https://github.com/pydata/xarray/pull/8854
    - numpy 2: where dtype casting. Stephan helped me figure out a clean way to implement this, but I haven't had time to do this yet.
- Tom
    - No real updates on xarray itself
    - Q about ChunkManager https://github.com/pydata/xarray/issues/8733#issuecomment-2111146588
- Joe
    - Still cranking on zarr v3
    - Deepak is heads down this week
- Mathias
    - no update

### Agenda

## May 8th, 2024

### Attendees
- Deepak
- Justus
- Matt
- Tom
- Mathias
- Ryan

### 60 second updates
- Deepak
    - optimizing zarr region writes / appends
    - iterating on Xarray User Survey
- Justus
    - more numpy 2 compat...
      we're now down to failures with just the array api and the casting changes due to NEP 50
- Tom
    - Trying to start some deprecation cycles
    - PR to concat without creating indexes
- Ryan
    - Trying to nerd-snipe someone into making some useful indexes
        - https://github.com/pydata/xarray/discussions/8955#discussioncomment-9226372
        - https://discourse.pangeo.io/t/example-which-highlights-the-limitations-of-netcdf-style-coordinates-for-large-geospatial-rasters/4140/26
    - Tom: even simpler case - no indexes!
        - PR for concat is all that is needed
        - Immediate next case is pandas index that is disconnected from variable data
            - https://github.com/TomNicholas/VirtualiZarr/issues/18#issuecomment-2025423042
- Mathias
    - no news
- Matt
    - Also no update

### Agenda
- NamedArray update
    - Stalled, Anderson has run out of time
    - 80% of the way there for decoupling the backends from lazy indexing.
    - Action item: make a todo list of what's left and needed.
- PRs to review/merge
    - Tom: Please someone merge this indexes PR https://github.com/pydata/xarray/pull/8872
        - I can't release v0.1 of VirtualiZarr until it's in xarray main...
    - Deepak: https://github.com/pydata/xarray/pull/8998
- Numpy 2 compat:
    - should we switch casting behavior to NEP 50?
      https://github.com/pydata/xarray/pull/8946
        - https://numpy.org/neps/nep-0050-scalar-promotion.html
    - array api is close: https://github.com/pydata/xarray/pull/8854
- Release plan:
    - a release before that which doesn't fully support numpy 2 yet
    - release one version with numpy 2 and py3.9
    - then drop python 3.9

## April 24th, 2024

### Attendees
- Justus Magin / @keewis
- Matt Savoie / @flamingbear
- Kai Mühlbauer / @kmuehlbauer
- Tom Nicholas / @TomNicholas
- Joe Hamman / @jhamman
- Owen Littlejohns / @owenlittlejohns
- Deepak Cherian / @dcherian
- Stephan

### 60 second updates
- Justus: upstream-dev CI / numpy 2 compat
- Tom:
    - On a train, probably can't call in
    - Looking at changing backends.NetCDFDataStore to only open file once when reading many groups
        - Kai, are you or others planning to take this on?
    - I want to change internal invariants to stop checking for default pandas indexes
        - https://github.com/pydata/xarray/pull/8960#discussion_r1573306634
        - @deepak what option are you referring to? I don't see a kwarg to `assert_equal`... https://github.com/pydata/xarray/blob/b0036749542145794244dee4c4869f3750ff2dee/xarray/testing/assertions.py#L88-L120
            - https://github.com/pydata/xarray/blob/b0036749542145794244dee4c4869f3750ff2dee/xarray/testing/assertions.py#L385-L387
            - +We should plumb it through.+ Wrong: look here: https://github.com/pydata/xarray/blob/b0036749542145794244dee4c4869f3750ff2dee/xarray/tests/__init__.py#L286
            - (we don't use the public API directly in tests)
            - Cool.
        - This might be a good time to write an "Assertions" section into the [docs page on testing](https://docs.xarray.dev/en/stable/user-guide/testing.html#testing-your-code)
- Kai: not much, considering helping with datatree backend stuff together with @aladinor and @mgrover1, need to check which way to go (from the xarray side, or from the external backend side)
- Matt: also not much recently. Always datatree. Listening.
- Owen: Continued datatree migration, current PR open: https://github.com/pydata/xarray/pull/8967
- Joe:
    - Foo proposal was funded! Deepak and I are hoping to hire a near-full time dev with bio experience to come work with us
    - Going to try zarr-python-3 in xarray next week.
- Deepak
    - not much, pushed on public grouper api

### Agenda
- Break behaviour of dataset constructor?
    - https://github.com/pydata/xarray/issues/8959
    - `ds = xr.Dataset(data_vars={'x': ('x', [0])})`
        - promotes to coordinate
    - Start with `PendingDeprecationWarning`
    - Add a separate more explicit construction method/kwarg? Or use the new behavior in case a `Coordinates` object is passed
- numpy 2:
    - array api: https://github.com/pydata/xarray/pull/8854
        - main concerns: dispatching between numpy issubdtype and arrayapi isdtype <- this is kinda hairy
        - stephan will take a look
    - copy parameter to `__array__` (typing, mostly): https://github.com/pydata/xarray/pull/8939/files
    - dtype casting rules: https://github.com/pydata/xarray/pull/8946#issuecomment-2068949796
        - general rule: determine dtype without python scalars (which are "weak dtypes" in jax), then cast python scalars to array using that dtype. If that doesn't work, either raise or determine a fallback
        - implementation `as_shared_dtype`
        - `concatenate` and `stack` shouldn't really allow python scalars (?)
        - may be specific to `where`, in which case that code could also go there

## April 10, 2024

### Attendees
- Matt Savoie / @flamingbear
- Kai Mühlbauer / @kmuehlbauer
- Tom Nicholas / @TomNicholas
- Justus Magin / @keewis
- Deepak Cherian / @dcherian
- Owen Littlejohns / @owenlittlejohns

### 60 second updates
- Matt: Good meeting for Datatree yesterday. [PR](https://github.com/flamingbear/xarray/pull/11) to [existing PR](https://github.com/pydata/xarray/pull/8879) for simplifying iterators is ready. Owen will ping Tom later today when he merges.
- Justus:
    - "source" encoding from `fsspec` objects
    - h5netcdf + character sets
- Tom
    - Mostly thinking about the virtualizarr stuff (i.e. not propagating xarray indexes and dealing with encoding)
    - Chance of me being able to think about datatree inheritance has gone up since NCAR machines are all down...
- Kai: not much xarray related (beside some h5netcdf char encoding ;-)
- Owen: [Open PR for iterators.py](https://github.com/pydata/xarray/pull/8879)
    - Will update based on recent feedback.
- Deepak:
    - merged in stateful tests (https://github.com/pydata/xarray/pull/8658)
    - Explanation of hypothesis testing strategies https://docs.xarray.dev/en/stable/user-guide/testing.html#hypothesis-testing

### Agenda
- upstream tests:
    - https://github.com/pydata/xarray/issues/8844
    - string dtypes (needs volunteer)
    - array API tests
        - https://github.com/pydata/xarray/pull/8854

```
xarray/tests/test_duck_array_ops.py::TestOps::test_where_type_promotion: AssertionError: assert dtype('float64') == <class 'numpy.float32'>
 + where dtype('float64') = array([ 1., nan]).dtype
 + and <class 'numpy.float32'> = np.float32
xarray/tests/test_duck_array_ops.py::TestDaskOps::test_where_type_promotion: AssertionError: assert dtype('float64') == <class 'numpy.float32'>
 + where dtype('float64') = array([ 1., nan]).dtype
 + and <class 'numpy.float32'> = np.float32
```

- encoding and virtualizarr
    - https://github.com/TomNicholas/VirtualiZarr/issues/68
    - https://github.com/fsspec/kerchunk/blob/a0c4f3b828d37f6d07995925b324595af68c4a19/docs/source/tutorial.rst

## March 27, 2024

### Attendees
- Deepak Cherian
- Alex Ford / @asford
- Tom Nicholas / @TomNicholas
- Matt Savoie / @flamingbear
- Stephan Hoyer

### 60 second updates
- Deepak: upstream-dev fixes
- Tom:
    - Datatree meetings
    - Xarray without indexes
        - With a few un-merged PRs I can actually xr.concat Datasets without indexes at all
            - See [example in
              VirtualiZarr](https://github.com/TomNicholas/VirtualiZarr/blob/main/docs/usage.md#manual-concatenation-ordering)
- Alex F
    - First time attending.
    - Question on possible wrapping of torch-tensors in xarray
    - We have a working internal fork, interested in upstreaming
- Matt
    - good meeting yesterday with agreement to move forward faster not smarter. Basically move most code without improvements and identify places we want to work on later.
- Stephan
    - Benoit might be working on indexes again in a couple of months, funding from NASA grant at UW.

### Agenda
- torch inside xarray
    - Relevant issues
        - https://github.com/pydata/xarray/issues/3232
        - https://github.com/data-apis/array-api-compat
        - as a comparison point: [JAX-Xarray](https://github.com/google-deepmind/graphcast/blob/main/graphcast/xarray_jax.py)
        - https://github.com/pytorch/pytorch/issues/58743
    - Why xarray?
        - "problem of dimension tracking", sequence information, hypercubes, align into canonical coordinate frame, write gradient aware calculations inside that coordinate system
        - like named dims, tried NamedTensor, switched to Xarray, have many coordinate variables
    - Pain points?
        - pytorch isn't compliant with array API standard
            - can be mostly solved using the array-api-compat shim library
        - non-numpy dtypes
            - Not really covered in the array API standard
            - Might need special-casing within xarray
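
For readers unfamiliar with the array API pain point above, here is a rough sketch of what "compliant with the array API standard" buys a wrapping library: code asks the array for its namespace and calls only standard functions on it. The helper names `get_namespace` and `standardize` are hypothetical and are not xarray or array-api-compat functions; NumPy stands in for the wrapped library, and the `isinstance` fallback for older NumPy versions is an assumption of this sketch.

```python
import numpy as np


def get_namespace(x):
    """Return the array API namespace that owns ``x`` (hypothetical helper)."""
    # Arrays that implement the standard expose this hook directly.
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    # Older NumPy arrays lack the hook; treat NumPy itself as the namespace.
    if isinstance(x, np.ndarray):
        return np
    # Anything else would need a shim layer to supply a namespace.
    raise TypeError(f"no array API namespace found for {type(x).__name__}")


def standardize(x):
    """Subtract the mean and divide by the standard deviation."""
    xp = get_namespace(x)  # library-agnostic from here on
    return (x - xp.mean(x)) / xp.std(x)


result = standardize(np.asarray([1.0, 2.0, 3.0, 4.0]))
```

An array type that does not provide the `__array_namespace__` hook hits the `TypeError` branch here, which is the gap the notes expect the array-api-compat shim to mostly cover.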
