Tom Nicholas
# Weekly Xarray-DataTree design meeting

[Zoom link](https://us02web.zoom.us/j/87503265754?pwd=cEFJMzFqdTFaS3BMdkx4UkNZRk1QZz09)

[Meetings issue (#8747)](https://github.com/pydata/xarray/issues/8747) - includes list of design questions

[Tracking issue (#8572)](https://github.com/pydata/xarray/issues/8572) - includes checklist of what's been done so far

## Oct 22nd, 2024

### Attendees

- Justus Magin / @keewis
- Alfonso Ladino / @aladinor
- Tom Nicholas / @TomNicholas
- Owen Littlejohns / @owenlittlejohns

### Updates

- Justus
- Tom

### Agenda

- Forbid slashes in coordinate names https://github.com/pydata/xarray/pull/9492
- `group` arg to `open_datatree`
    - remove empty parents on top of the selected node
    - add ancestor path to `encoding['source']`?
    - `chunks` support?
- Before release:
    - https://github.com/pydata/xarray/issues/9634
    - get `open_datatree` and `open_groups` to support `chunks`
    - implement `chunk`, `compute`, `load` and `persist`
    - Justus will look into `open_datatree` / `open_groups` with `chunks`
    - dask specific methods can be added after the release (https://github.com/pydata/xarray/issues/9355)

## Oct 15th, 2024

### Attendees

- Justus Magin / @keewis
- Alfonso Ladino / @aladinor
- Eni Awowale / @eni-awowale
- Matt Savoie / @flamingbear

### Agenda

- how do we test the `group` argument of `open_datatree`?

## Oct 8th, 2024

### Attendees

- Tom Nicholas / @TomNicholas
- Justus Magin / @keewis
- Alfonso Ladino / @aladinor
- Eni Awowale / @eni-awowale

### Agenda

- Close last issues on xarray-contrib repo?
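
One possible reading of the Oct 22nd agenda item "remove empty parents on top of the selected node" is sketched below. This is only an illustration on a plain dict keyed by group path, not xarray's implementation; `select_group` is a hypothetical helper name, and the re-rooting behaviour is an assumption about what the agenda item means.

```python
def select_group(paths: dict[str, object], group: str) -> dict[str, object]:
    """Keep only the nodes at or below `group`, re-rooted so that `group` becomes "/".

    `paths` maps absolute group paths (e.g. "/a/b") to arbitrary node payloads.
    Hypothetical helper for illustration only; not part of xarray.
    """
    prefix = "/" + group.strip("/")
    if prefix == "/":
        # Selecting the root keeps the whole tree unchanged.
        return dict(paths)

    selected: dict[str, object] = {}
    for path, node in paths.items():
        if path == prefix:
            # The selected node becomes the new root, so no empty parents remain above it.
            selected["/"] = node
        elif path.startswith(prefix + "/"):
            # Descendants keep their path relative to the selected node.
            selected[path[len(prefix):]] = node
    return selected


tree = {"/": "root", "/a": "A", "/a/b": "B", "/c": "C"}
subtree = select_group(tree, "/a")
```

Under this assumption, `subtree` contains only the selected node (now at `"/"`) and its descendant at `"/b"`; the sibling group `/c` and the original root are dropped.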
## Oct 4th, 2024

### Attendees

- Tom Nicholas / @TomNicholas
- Justus Magin / @keewis
- Matt Savoie / @flamingbear
- Gui(lherme) Castelao / @castelao
- Kai Mühlbauer / @kmuehlbauer

### Agenda

- inheritance for map_over_subtree, to_dict, and `to_<file_format>`

## Oct 1st, 2024

### Attendees

- Tom Nicholas / @TomNicholas
- Justus Magin / @keewis
- Matt Savoie / @flamingbear
- Gui(lherme) Castelao / @castelao
- Kai Mühlbauer / @kmuehlbauer
- Alfonso Ladino / @aladinor

### Updates

- Alfonso
    - https://github.com/pydata/xarray/pull/9428 ready to go.

### Agenda

- Performance issue with [StoreBackendEntrypoint](https://github.com/pydata/xarray/blob/095d47fcb036441532bf6f5aed907a6c4cfdfe0d/xarray/backends/zarr.py#L1352) when opening datatree in zarr. It is taking too long compared with using [open_dataset](https://github.com/pydata/xarray/blob/095d47fcb036441532bf6f5aed907a6c4cfdfe0d/xarray/backends/zarr.py#L1225)
    - Open a new issue showing the unexpected behavior

## Sept 24th, 2024

### Attendees

- Tom Nicholas / @TomNicholas
- Justus Magin / @keewis
- Matt Savoie / @flamingbear
- Owen Littlejohns / @owenlittlejohns
- Gui(lherme) Castelao / @castelao
- Kai Mühlbauer / @kmuehlbauer
- Alfonso Ladino / @aladinor
- Eni Awowale / @eni-awowale

### Updates

- Tom
    - https://xray--9501.org.readthedocs.build/en/9501/user-guide/hierarchical-data.html#alignment-and-coordinate-inheritance
- Matt: Merged docs and ghosted

### Agenda

- Problem of duplicating inherited coordinates across nodes
    - https://github.com/pydata/xarray/issues/9475
    - Coordinates backed by indexes can be cheaply (eagerly) compared, and therefore de-duplicated on assignment
        - This seems fine, Stephan has a PR to add this
        - https://github.com/pydata/xarray/pull/9531
    - Problem is this doesn't work for non-indexed coordinates, because any comparison could eagerly load an arbitrarily large variable into memory
- Suggestion 1: pass inherited coordinates separately in `map_over_subtree`
    - two arguments go into `map_over_subtree` calls
    - downside: can't apply functions that work on datasets anymore
        - `def func(ds: Dataset) -> Dataset: ...` followed by `dt.map_over_subtree(func)`
    - variant: mark inherited coords with a temporary attribute, and people can duplicate by removing that
- Suggestion 2: Don't allow access to inherited non-indexed coordinates
    - Specifically for `.dataset` inside `map_over_subtree`?
    - Restricts use cases to not be able to even access non-indexed coordinates
        - e.g. want to make decision based on scalar `ds.coords['cloud_coverage']`
- Suggestion 3: Disallow overwriting any inherited coordinates inside `map_over_subtree`
    - Should we raise an error or warn if user tries to overwrite inherited coords?
        - e.g. `map_over_subtree(lambda ds: ds.isel(...))`
    - Add kwarg `replace_duplicated_inherited`
- Suggestion 4: Forbid overriding coordinates in child nodes completely
    - Very restrictive, breaks netCDF model
    - Stronger version of suggestion 3
- https://github.com/pydata/xarray/pull/9428 might be ready?

## Sept 17th, 2024

### Attendees

- Tom Nicholas / @TomNicholas
- Guilherme Castelao / @castelao
- Stephan Hoyer / @shoyer

### Updates

- Tom
    - Wrote some docs on DataTree alignment and coordinate inheritance
        - https://github.com/pydata/xarray/pull/9501
    - Been refactoring to use a new `._walk_to` method
- Stephan
    - Deduplicated coordinates
        - https://github.com/pydata/xarray/pull/9510
        - Issue with passing state to the `._post_attach` method
            - But just an internal detail
    - Can't have conflicting coordinates on descendants ("no overriding")
    - What to do about non-indexed coordinates?
        - Indexed coordinates are in memory so easy to check for duplication
        - But non
    - Current design might be slow
        - Lots of internal method calls
        - Some methods have performance that scales poorly with tree depth
            - e.g. `__init__` constructor has quadratic performance
        - Let's raise an issue for this
    - Want to complete some traversing refactors

## Sept 10th, 2024

### Attendees

- Tom Nicholas / @TomNicholas
- Matt Savoie / @flamingbear
- Owen Littlejohns / @owenlittlejohns
- Eni Awowale / @eni-awowale

### Updates

- Tom
    - Sprinted with Eni at NumFOCUS summit on Saturday
    - Moved / closed a bunch of issues
    - PRs
        - https://github.com/pydata/xarray/pull/9465
        - https://github.com/pydata/xarray/pull/9453
        - https://github.com/pydata/xarray/pull/9451
        - https://github.com/pydata/xarray/pull/9470
    - Reviewed several of Stephan's PRs
- Eni
    - Sprint with Tom at NumFOCUS summit
    - PR for `open_groups` with zarr https://github.com/pydata/xarray/pull/9469

### Agenda

- Should the docs be in a separate branch?
- Documenting coordinate inheritance and alignment rules
    - Deserves its own PR...
- Names of things
    - https://github.com/pydata/xarray/issues/9458
    - `DataTree(data=...)` or `DataTree(node=...)` or ?
    - `DataTree.ds` or `DataTree.node` or ?
- Migration guide
- Blog post https://github.com/xarray-contrib/xarray.dev/issues/708
- Issue-moving spree
- Eni's `open_groups` for Zarr PR
    - https://github.com/pydata/xarray/pull/9469
    - Bear in mind Stephan is about to change the meaning of "identical" slightly https://github.com/pydata/xarray/pull/9473

## Sept 3rd, 2024

### Attendees

- Matt Savoie / @flamingbear
- Eni Awowale / @eni-awowale
- Justus Magin / @keewis
- Tom Nicholas / @TomNicholas
- Alfonso Ladino Rincon / @aladinor

### Updates

- Matt: Pushed changes to fix Tom's [PR#9297](https://github.com/pydata/xarray/pull/9297) for shallow copy. Added more to remove parent from constructor keywords on my [branch](https://github.com/flamingbear/xarray/tree/datatree_init_dont_modify_inplace). I pushed to Tom's repo.
- Alfonso working on `open_zarr` #9198

### Agenda

## Aug 27th, 2024

### Attendees

- Justus Magin / @keewis
- Tom Nicholas / @TomNicholas
- Alfonso Ladino Rincon / @aladinor
- Matt Savoie / @flamingbear
- Owen Littlejohns / @owenlittlejohns
- Eni Awowale / @eni-awowale

### Updates

- Tom
    - Worked more on https://github.com/pydata/xarray/pull/9297
    - Failing doctest: https://github.com/pydata/xarray/actions/runs/10476674854/job/29016117743?pr=9297
- Matt
    - Added Eni's open_groups to the [Documentation PR.](https://github.com/pydata/xarray/pull/9033)
    - Just rescanned the issues from Aug 13 triage session.
- Justus: nothing (but I do remember wanting to post a review comment on [#9378](https://github.com/pydata/xarray/pull/9378))
- Alfonso: Nothing
    - (Still looking at [#9198](https://github.com/pydata/xarray/pull/9198))
- Owen:
    - Moved some issues over

### Agenda

- Merge some more PRs?
- Go through more old issues?

## Aug 20th, 2024

### Attendees

- Justus Magin / @keewis
- Tom Nicholas / @TomNicholas
- Alfonso Ladino Rincon / @aladinor

### Updates

- Tom
    - Working on https://github.com/pydata/xarray/pull/9297

### Agenda

- Alfonso's PR on opening zarr stores with consolidated group
    - https://github.com/pydata/xarray/pull/9377
    - merged
- Etienne's PR on disallowing paths with slashes
    - https://github.com/pydata/xarray/pull/9378
    - further modify the error message to mention that `/` in variable names is only illegal when creating datatree nodes
- Stephan's PR on improving error message
    - https://github.com/pydata/xarray/pull/9222
    - unsure why the error message is now less explicit
- Cloud storage credentials
    - https://github.com/pydata/xarray/pull/9198
    - partially fixed by Alfonso's PR, the rest can be fixed by further optimizing zarr's open_datatree to use the pre-opened store
- AoB
    - Continue moving old issues / working on PRs

## Aug 13, 2024

### Attendees

- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Tom Nicholas / @TomNicholas
- Eni Awowale / @eni-awowale
- Owen Littlejohns / @owenlittlejohns
- Gui Castelao
- Alfonso Ladino Rincon / @aladinor

### Updates

- Tom
    - Might have use cases for DataTree at CWorthy
- Eni: still working on #9243. Will try the suggested mypy fix

### Triaging session

<details><summary>Issues and PRs to triage</summary>

```markdown
_Originally posted by @<user> in <link>_
```

Please add the `topic-datatree` label!

Open issues:

- [x] #5 - (Tom) Moved upstream
- [x] #9 - moved [9347](https://github.com/pydata/xarray/issues/9347) Tom
- [x] #47 - moved [9348](https://github.com/pydata/xarray/issues/9348) Eni
- [x] #51 (Justus) - moved to xarray
- [x] #55 - recommend closing / asked @maxgrover1 and @kmuehlbauer if we can close
- [x] #58 (Justus) - moved to xarray
- [x] #61 - closed: PR was merged for issue and issue is accounted for in #8572 (Eni)
- [x] #67 (Tom) - closed in favor of existing xarray issue
- [x] #77 (Tom) - moved over
- [x] #79 (Tom) - moved over
- [x] #80 (Tom) - closed as arguably already solved
- [x] #93 - (Owen) migrated [9337](https://github.com/pydata/xarray/issues/9337) - closing of file using open_datatree in context manager
- [x] #97 (Tom) migrated upstream
- [x] #100 (Eni) Closed and moved to [9437](https://github.com/pydata/xarray/issues/9437)
- [x] #124 (Justus) - closed
- [x] #134 (Eni) Closed and moved to [9438](https://github.com/pydata/xarray/issues/9438)
- [x] #145 - (Owen) migrated [9343](https://github.com/pydata/xarray/issues/9343)
- [x] #146 - (Owen) migrated [9365](https://github.com/pydata/xarray/issues/9365)
- [x] #152 - Eni - moved upstream
- [ ] #168 - Eni - closed
- [x] #184 - (Tom) closed as same ideas implemented by Stephan in https://github.com/pydata/xarray/pull/9064
- [x] #186 - (Tom) moved upstream
- [x] #189 - Eni: moved to https://github.com/pydata/xarray/issues/9440
- [ ] #191
- [X] #192 - migrated https://github.com/pydata/xarray/issues/9349
- [ ] #193
- [ ] #195
- [ ] #199
- [x] #200 - migrated [#9335](https://github.com/pydata/xarray/issues/9335)
- [x] #203 - migrated [#9345](https://github.com/pydata/xarray/issues/9345)
- [x] #204 - Close in favor of #192
- [X] #206 - migrated to pydata/xarray#9350
- [X] #207 - migrated [#9338](https://github.com/pydata/xarray/issues/9338)
- [ ] #210
- [ ] #230
- [ ] #232
- [ ] #235
- [x] #240 - (Tom) moved to xarray
- [ ] #242
- [ ] #244
- [x] #250 - (Justus) closed
- [x] #252 - Eni - closed and moved upstream https://github.com/pydata/xarray/issues/9502
- [x] #254 - (Justus) moved to xarray
- [ ] #258
- [ ] #266
- [ ] #270
- [x] #276 - Eni (working on)
- [ ] #277
- [ ] #281
- [ ] #283
- [x] #290 (Justus) - moved to xarray
- [x] #292 - Eni - moved upstream https://github.com/pydata/xarray/issues/9503
- [x] #297 (Justus) - closed in favor of existing xarray issue (#9056)
- [ ] #309
- [x] #311 (Tom) - moved to xarray
- [ ] #312
- [ ] #313
- [x] #316 (Justus) - moved to xarray
- [x] #320 (Eni) - moved https://github.com/pydata/xarray/issues/9539. Thought this was an interesting feature request.
- [x] #322 (Justus) - closed in favor of the existing xarray issue (#9197)
- [ ] #323
- [x] #325 (Tom) - closed with link to upstream replacement
- [x] #331 (Tom) - closed with comment
- [ ] #337

Open PRs:

- [x] #114 - (Owen) linked to from xarray issue (9335).
- [ ] #142
- [ ] #147
- [x] #155 - (Owen) linked to from xarray issue (9343).
- [ ] #196
- [ ] #198
- [ ] #217
- [ ] #220
- [ ] #221
- [ ] #238
- [ ] #253
- [ ] #265
- [ ] #271
- [ ] #282
- [ ] #307
- [ ] #310
- [x] #314 (Tom) - linked to from new issue on xarray
- [ ] #319
- [ ] #338

</details>

## Aug 6, 2024

### Attendees

- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Tom Nicholas / @TomNicholas
- Eni Awowale / @eni-awowale

### 60 Second Updates.

- Matt: waiting for PRs before re-reviewing the Documentation. If you want to see the diff for the docs updated for inheritance: [here](https://github.com/pydata/xarray/pull/9033/files/b303d6255d762f0a82188ff6446b25a7bc82aadb..421c404c59c18ab36bfb2ab9fe1db016a154d9ad). And the [current PR docs](https://xray--9033.org.readthedocs.build/en/9033/)
- Eni: still working on #9243

### Agenda

- issues on the old repository:
    - block about an hour separately to go through the issues
    - Justus will organize / create a poll to find a good time
- Special dask methods
    - https://github.com/pydata/xarray/blob/c508cc6a2e3000a9d87d2f8c611aae8733be07bf/xarray/core/dataset.py#L879
    - https://github.com/xarray-contrib/datatree/pull/196
- tutorial files for datatree: possibly synthetic, neuro-imagery, or a geoscience (weather?) image pyramid
- check for isomorphic trees (for `map_over_subtree`): also compare names to avoid relying on the order of the nodes

## Jul 30, 2024

### Attendees

- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Tom Nicholas / @TomNicholas
- Stephan Hoyer / @shoyer
- Eni Awowale / @eni-awowale
- Owen Littlejohns / @owenlittlejohns

### 60 Second Updates.

- Matt: Almost completed the update for [Doc PR](https://github.com/pydata/xarray/pull/9033)
- Tom:
    - Looked at fixing several bugs
        - https://github.com/pydata/xarray/issues/9285
        - https://github.com/pydata/xarray/issues/9196
        - https://github.com/pydata/xarray/pull/9292
- Eni:
    - PR [#9243](https://github.com/pydata/xarray/pull/9243)

### Agenda

## Jul 23, 2024

### Attendees

- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Stephan Hoyer / @shoyer
- Eni Awowale / @eni-awowale
- Alfonso Ladino Rincon / @aladinor
- Etienne Schalk / @eschalkargans
- Tom Nicholas / @TomNicholas

### 60 Second Updates.

- Tom:
    - Was at SciPy then PTO
- Matt: still nothing. Looking at Eni's draft [PR #9243](https://github.com/pydata/xarray/pull/9243/files)
- Etienne: convert datatree to dict [PR #9080](https://github.com/pydata/xarray/pull/9080) (note: with coordinate inheritance, inherited coords are duplicated; disadvantage: denormalization of data; advantage: self-sufficient leaf groups)
- Eni: Back from SciPy and PTO, working on draft PR #9243
    - Will add tests to new file

### Agenda

- SciPy report
- We should move old issues
    - Best to do manually as then a human will check
- Eni has issue with openDAP for trees
- Latest [tasks](https://github.com/pydata/xarray/issues/8572#issuecomment-2218020742) to get datatree released and original set [#8572](https://github.com/pydata/xarray/issues/8572)

## Jul 16, 2024

### Attendees

- Matt Savoie / @flamingbear
- Stephan Hoyer / @shoyer
- Justus Magin / @keewis
- Alfonso Ladino / @aladinor

### 60 Second Updates.

- Matt has barely been even following issues.

### Agenda

- Not much but Alfonso had two PRs to discuss
    - Options for credentials for s3 when opening zarr stores https://github.com/pydata/xarray/pull/9198/files
    - Addresses backend kwargs that were removed (addresses [#9135](https://github.com/pydata/xarray/issues/9135)) https://github.com/pydata/xarray/pull/9199/files
- Early adjournment

## Jul 9, 2024

### Attendees

- Justus Magin / @keewis
- Stephan Hoyer
- Tom Nicholas / @TomNicholas

### Agenda

- checklist for releasing datatree
    - https://github.com/pydata/xarray/issues/8572#issuecomment-2218020742

## Jul 2, 2024

### Attendees

- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Owen Littlejohns / @owenlittlejohns
- Stephan Hoyer
- Alfonso Ladino / @aladinor

### 60 second updates

- Tom: Reviewed coordinate inheritance PR properly
- Matt: Also viewed the inheritance PR, understood most.
- Owen: Also reviewed PR 9063 (inheritance)
- Stephan: Inheritance PR
- Alfonso: Got both PRs for keywords and benchmarks ready.
    - https://github.com/pydata/xarray/pull/9158
    - https://github.com/pydata/xarray/pull/9199

### Agenda

- Are we happy to merge Stephan's PR?
    - Outstanding Q's?
- A couple of other things to merge
    - Constructor parent not mutating
    - What does that unblock?
- Release schedule
    - release
    - what's required
        - docs PR
        - open_as_dict_of_datasets
        - blog

## Jun 25, 2024

### Attendees

- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Owen Littlejohns / @owenlittlejohns
- Stephan Hoyer

### 60 second updates

- Matt: Reviewed / following the inherited coordinate PR [#9063](https://github.com/pydata/xarray/pull/9063/files)
- Tom: Also reviewed the PR
- Owen: Also partially reviewed Stephan's PR [#9063](https://github.com/pydata/xarray/pull/9063/files)

### Agenda

- Benchmark for open_datatree: https://github.com/pydata/xarray/pull/9158
    - Probably should close the files
    - DataTree should be a context manager (like how you can already do `with open_dataset(path) as ds:`)
        - raise an issue for this!
- Backend kwargs are not forwarded: https://github.com/pydata/xarray/issues/9135
- Review of coordinate inheritance PR [#9063](https://github.com/pydata/xarray/pull/9063/files)
    - Tom: Main question is what should the internal structure be?
- DataTree repr: https://github.com/pydata/xarray/pull/9064
- SciPy talk
    - Practice talk for NASA 2nd July 12pm EDT
        - Everyone welcome on teams (https://teams.microsoft.com/l/meetup-join/19%3ameeting_NDc3ZWRiOGUtOTdhNS00ZDkyLWI2ZGQ[…]2c%22Oid%22%3a%2275a4b9ac-327c-4e32-9aeb-1eab36528186%22%7d)
        - Tom and Eni will give half of the talk each
            - Tom on general datatree idea, Eni on NASA's use case
- Top-level functions like `xr.concat` accepting DataTree objects?
    - https://github.com/pydata/xarray/issues/9106

## Jun 18, 2024

### Attendees

- Matt Savoie / @flamingbear
- Tom Nicholas
- Eni Awowale / @eni-awowale
- Owen Littlejohns / @owenlittlejohns
- Alfonso Ladino Rincon

### 60 second updates

- Trying hard to wrap my head around the current discussion [#9077](https://github.com/pydata/xarray/issues/9077)

### Agenda

- Inherited coordinates -- allow overrides or not?
    - The case for forbidding overrides
        - If non-alignment is allowed, we would need a way to tell update/setitem methods whether or not we want them to check alignment in this particular case
        - Alignment will have to be checked between variables on the same node anyway
- Discuss #9077 some more?
    - Particularly this `open_as_dict_of_datasets` idea
        - Could even point to this function from within the alignment failure in `open_datatree`
    - Is the value in having `open_datatree` work on everything or having some xarray function work on everything?
    - Optional vs forbidden overriding of dimensions in child nodes
    - How much feedback do we actually need from the community?
- Mapping top-level functions like concat over trees https://github.com/pydata/xarray/issues/9106
- Eni's SciPy talk?

## Jun 11, 2024

### Attendees

- Matt Savoie / @flamingbear
- Eni Awowale / @eni-awowale
- Owen Littlejohns / @owenlittlejohns
- Tom Nicholas
- Justus Magin / @keewis

### 60 second updates

- Matt
    - Following discussions at most.
- Tom
    - Mostly just following other people's issues / PRs
- Justus
    - nothing datatree-related, but I'll try releasing numpy 2 later today
- Eni
    - dropped a bug report #9093 about segmentation faults with `open_datatree()`

### Agenda

- Let's merge some things?
    - open_datatree speedup PR
        - Matt will add commits to remove unneeded kwargs then we can merge
    - Tom reply to Etienne's PR about to_dict
    - Owen self-merge common.py PR
- Coordinate inheritance issue
    - Stephan summarized it nicely
    - We should use his description to ask around
        - Pangeo discourse
        - Twitter
        - ESDIS metadata manager people?
    - Point out on issue
        - that one can still open invalid files using group/root kwarg
        - becomes hard to list the groups in a file
    - New function?:
        - `list_groups`
        - `open_datasets_dict`
- Numpy release status?
    - basically done, one PR missing
    - will release today or tomorrow morning

## Jun 4, 2024

### Attendees

- Matt Savoie / @flamingbear
- Owen Littlejohns / @owenlittlejohns
- Justus Magin / @keewis
- Eni Awowale / @eni-awowale
- Tom Nicholas

### 60 second updates

- Matt
    - have only read [proposal](https://github.com/pydata/xarray/issues/9056#) and PRs.
- Owen
    - have open PRs for migration https://github.com/pydata/xarray/issues/9011, https://github.com/pydata/xarray/issues/9033 (latter probably needs to wait for numpy 2.0 support)
- Stephan
    - sketch of hierarchical coordinates: https://github.com/pydata/xarray/pull/9063
- Tom
    - Also messed with hierarchical coordinates: https://github.com/pydata/xarray/pull/9065/files

### Agenda

- Owen's TreeAttrAccessMixin PR
    - Decision to not worry about slots/dict stuff too much and move forward
- Alfonso's [open_datatree PR](https://github.com/pydata/xarray/pull/9014)
    - Review
- Stephan's hierarchical coordinates PR

## May 28, 2024

### Attendees

- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Eni Awowale / @eni-awowale
- Tom Nicholas
- Stephan Hoyer

### 60 second updates

- Matt
    - still nothing.

### Agenda

- decision on variable inheritance:
    - should we change behavior now? Or should we have a separate API instead?
    - Way to defer the decision?
- Proposal
    - Keep `.ds`, `__getitem__` as-is
    - Define "compatible variables" for inheritance
        - Same-named dimensions have to be the same
        - Alignable
        - (Compare with what it says in the CF conventions)
    - Additional API which allows access to inherited variables
        - `dt.ds` will never give access to inherited vars
        - But `dt.inherited.ds` would allow `__getitem__` access to inherited vars
            - `dt.inherited[...].ds`?
        - `dt.inherited.to_dataset()` -> `xr.Dataset` containing inherited vars
    - Don't change `map_over_subtree` (again for backwards compatibility)
        - `map_over_inherited_subtree` isolates the concept of mapping over a tree with inherited variables
        - issues: e.g. map over and see the same variable multiple times (in its "local" group and in all its child groups)
    - Explicit API for propagating / shallow-copying variables to child nodes?
        - `dt.inherit()`? -> DataTree
- Either way: this will be a new feature, to be done in a separate release (i.e. no blocker right now)

## May 21, 2024

### Attendees

- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Owen Littlejohns / @owenlittlejohns
- Eni Awowale / @eni-awowale
- Tom Nicholas

### 60 sec updates.

- Matt: Reviewed Alfonso's open_datatree PR. No ticket work.
- Owen: Submitted PR for documentation and exposing DataTree in public API (https://github.com/pydata/xarray/pull/9033)

### Agenda

- Announcements
- Write a blog post
    - Doesn't need to be long
    - https://medium.com/pangeo/easy-ipcc-part-1-multi-model-datatree-469b87cf9114

## May 14, 2024

### Attendees

- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Tom Nicholas
- Alfonso Ladino
- Owen Littlejohns / @owenlittlejohns
- Stephan Hoyer
- Eni Awowale

### 60 sec updates.

- Matt slacking on other work and time off.
- Owen responding to feedback for [PR](https://github.com/pydata/xarray/pull/9011) migrating `io.py` and `common.py`
- Tom prepping for virtualizarr talk tomorrow

### Agenda

- Alfonso's `open_datatree` performance PR
    - https://github.com/pydata/xarray/pull/9014
- Coordinate inheritance discussion
    - Implementation isn't that hard, difficulty is a clear model and behaviour, especially wrt mapping
    - Need to keep the Dataset invariant that all shared dims in one group have the same length
    - Option (1): Explicit API separation of group with inherited variables
        - e.g. `dt.inherited.ds`
    - The check: `xarray.align(*[node.ds, node.parent.ds, node.parent.parent.ds, ...], join='exact')`
    - Tom to make an issue to write out thoughts/options

## May 7, 2024

### Attendees

- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Tom Nicholas
- Alfonso Ladino
- Owen Littlejohns / @owenlittlejohns

### 60 sec updates.

- Owen: [PR migrating last pieces of datatree code into xarray.core](https://github.com/pydata/xarray/pull/9011)

### Agenda

- Alfonso shows us his work on opening stuff efficiently
    - 1-2 orders of magnitude speedup with <= 1000 groups on netcdf4!
    - Separate PRs would be great
- important things left in the merge
    - docs
    - formalize the backend
    - moving to_netcdf and AttrAccessMixin
        - issue with slots
        - split up into 2 PRs to separate out the potential rabbit hole

## Apr 30th, 2024

### Attendees

- Matt Savoie / @flamingbear
- Tom Nicholas
- Eni
- Ty
- Justus

### 60 sec updates.

- Matt: PR for [ops.py](https://github.com/pydata/xarray/pull/8976)

### Agenda

- Progress / priorities
    - Good progress on merging core modules
    - Still need also docs, expose API, backends optimization
    - Should docs be added on same release as API is made public?
        - Each docs page is intended to be merged into the existing xarray docs page of the same name
        - With the exception of "Hierarchical Data", which is its own new page in the user guide
- inherited variables:
    - maybe have a separate namespace (for example, `dt.cf["/path/to/inherited/variable"]` does inherited access as defined by the CF conventions)
    - or `dt.ia[]` for inherited access.
    - the advantage would be that we would be able to release, then add this feature later

## Apr 23rd, 2024

### Attendees

- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Tom Nicholas
- Eni Awowale / @eni-awowale
- Owen Littlejohns / @owenlittlejohns

### 60 sec updates.

- Matt: I'm just returning my attention. ops.py.
- Owen: Working on migrating most of remaining modules.

### Agenda

- Merge tarball PR (merged)
- SciPy talk?
    - Ideally be able to say DataTree is in xarray main by then (July)
- Integrating backends
    - https://github.com/xarray-contrib/datatree/issues/330
    - Currently we create a new `CachingFileManager` for each group
    - Want to only create one per file
        - two options:
            - Modify netcdfdatastore object to iterate over groups
            - allow creating the datastore given a file manager object
    - How do we test the performance of this?
        - Benchmark
            - Create datatree object with many nodes (but doesn't need actual data)
            - Write to disk, then benchmark opening it up.
    - Action items
        - Tom: Dedicated issue for this? (on xarray)
        - Write that benchmark first (goes with the other airspeed velocity tests)
        - Modify netcdfdatastore to only create one FileManager
        - Publicly expose the top-level `open_datatree` function (plus docs on datatree backends)
        - Tom: Ask Kai and Max etc. if they are actually planning to do this
- Quick questions on xarray.core.common.py and testing.py.
    - `from_root` kwargs to `assert_equal` → add `**options` to `assert_*`

## Apr 16th, 2024

### Attendees

- Matt Savoie / @flamingbear
- Tom Nicholas
- Stephan Hoyer
- Owen Littlejohns / @owenlittlejohns
- Eni Awowale

### 60 sec updates.

- Matt: working other side.
- Owen: looking at `mapping.py`
- Eni: HTML repr
    - https://github.com/pydata/xarray/pull/8930

### Agenda

- Justus (can't join but would like to bring this up):
    - type checking of xarray apparently fails because of the typing import of `DataTree`: https://github.com/pydata/xarray/issues/8768
    - should we remove that for now / replace with `"DataTree"` (not sure if that works)?
    - action: Matt will change tarball to stop stripping out datatree

## Apr 9th, 2024

### Attendees

- Tom Nicholas
- Matt Savoie / @flamingbear
- Ty Schlichenmeyer

### Agenda

- Discussed the original Xarray [Tracking issue (#8572)](https://github.com/pydata/xarray/issues/8572). Tom will update where we are.
- Matt will see if we can add planned work for getting the documentation another pair of eyes before the merge as well as to get a short (no pressure) blog post for both NASA and Xarray to celebrate :tada: completion.
- Talked through the depth first (PreOrderIter) and breadth first (LevelOrderIter) and discussed if there was any benefit to having both in the code base. We are going to try to replace and simplify by using LevelOrderIter only. We could not determine a performance reason for having depth first considering all of the intermediate nodes have to be created.

## Apr 2nd, 2024

### Attendees

- Tom Nicholas
- Justus Magin / @keewis
- Eni Awowale / @eni-awowale

## Mar 26th, 2024

### Attendees

- Matt Savoie / @flamingbear
- Tom Nicholas
- Owen Littlejohns / @owenlittlejohns
- Stephan Hoyer

### 60 Second updates

- Matt: Looking at mapping.py
- Owen: Resolve last few mypy issues with datatree.py PR (thanks to Matt for help there). PR is pretty much ready to go.

### Agenda
- Current [datatree.py PR](https://github.com/pydata/xarray/pull/8789). [Should we pull everything that is imported from `datatree_` out of this one?](https://github.com/pydata/xarray/pull/8789#discussion_r1538584210)
- `ops.py` should go into xarray's `generate_aggregations`? [no for now, can be cleaned up later, add an issue?]
- Priorities?
- `from xarray import datatree`

## Mar 19th, 2024 (special time)

### Attendees
- Matt Savoie / @flamingbear
- Tom Nicholas
- Owen Littlejohns / @owenlittlejohns
- Justus Magin / @keewis

### Agenda

Discussed "DataTree handles Hashables"
- The use cases seemed very infrequent.
- zarr groups are limited to strings. netCDF4 doesn't have types, but you can't create a group from an int: `TypeError: expected str, bytes or os.PathLike object, not int`
- To move forward, allow the getter to have a Hashable type, but be clear that we only use str and raise errors on non-str in DataTrees. Hopefully this solves problems with traversing and finding data, while sparing us terrible typing conflicts between Dataset, DataArray and DataTree.

Discussed issues with wrapping a Dataset in a "FrozenDataset" as a replacement for DatasetView, which problematically inherits from Dataset.
- First suggested solution for FrozenDataset was failing because special methods aren't caught by `__getattr__`.
- Owen was looking into a metaclass solution that seemed really complicated.
- Tom, Matt and Owen decided that we should move on if Owen's next stab also failed (using a mixin).

Tom showed Matt the metaprogramming in [generate_aggregations.py](https://github.com/pydata/xarray/blob/main/xarray/util/generate_aggregations.py) and the resulting [_aggregations.py](https://github.com/pydata/xarray/blob/main/xarray/core/_aggregations.py) and sounded like he convinced himself that we might use that instead of the code currently in ops.py to apply the map_over_subtree decorator.
This solution wasn't available before as the datatree repo was separate from xarray when implemented. This would also allow us to fix up some of the documentation for datatree that is "good enough". Probably a good thing for Tom and Stephan to discuss before we migrate that code.

## Mar 12th, 2024

### Attendees
- Matt Savoie / @flamingbear
- Tom Nicholas
- Owen Littlejohns / @owenlittlejohns
- Eni Awowale / @eni-awowale
- Justus Magin / @keewis
- Stephan

### 60 second updates
- Matt: No progress last week.
- Have PR up for datatree.py migration. Working on FrozenDataset.

### Agenda
- Slow week with not much to report.
- Some discussion about missing API pieces to Datatree. For merging or filtering in particular.
    - It was mostly agreed that maybe an advanced usage documentation with recipes for how to do common operations could be useful, but keep an eye open for opportunities to improve if obvious, repeating use cases appear.

## Mar 5th, 2024

### Attendees
- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Tom Nicholas
- Stephan Hoyer
- Eni Awowale / @eni-awowale

### 60 second updates
- Matt
    - Struggling to rectify the mypy errors in [#8789](https://github.com/pydata/xarray/pull/8789). Looking for advice on which way to proceed.
    - Same story for implementing Hashable for Datatree.

### Agenda
- Continue Discussion around Datatree following CF model for [scoping variables](https://cfconventions.org/cf-conventions/cf-conventions.html#_scope).
    + Justus would like a flag for behavior switching; Tom thinks that would overcomplicate things, including docs and support.
    + Tom will go back to thinking and see if he can prototype something.
- Questions for implementing Hashable for Datatree led to discussion
    + Should backslash "\", slash "/", dot "." and dotdot ".." be allowed in variable names (I think this was the discussion).
    + Seemed like Hashable should work except for the Paths. Maybe it was a bad idea in Xarray?
      Don't think we had a decision on how to move here, but Matt will continue to think about it. Overall generally inconsequential.
- Matt will replace DatasetView with a Frozen style wrapper to Dataset.

## Feb 27th, 2024

### Attendees
- Matt Savoie / @flamingbear
- Stephan Hoyer
- Tom Nicholas
- Eni Awowale / @eni-awowale
- Etienne Schalk / @etienneschalk

### 60 second updates
- Tom
    - Not much - at conference
- Matt
    - Waiting on first PR, have a few others behind. https://github.com/pydata/xarray/pull/8757

### Agenda
- Recap of previous meeting
- Updates / Q's
- Deep dive?
    - Data model for inherited nodes
        - e.g.,
            - Entirely independent?
            - Shared coordinates from parent nodes?
        - CF conventions: https://cfconventions.org/cf-conventions/cf-conventions.html#groups
            - Key clause: "If any dimension of an out-of-group variable has the same name as a dimension of the referring variable, the two must be the same dimension (i.e. they must have the same netCDF dimension ID)."
        - design questions:
            - Should we be able to open any netCDF file?
            - Dict contents are ambiguous when there is fallback look-up
                - Could maybe use ChainMap for inheritance
                - Example in h5netcdf https://github.com/h5netcdf/h5netcdf/blob/b19d4a03a4bb553312d77135c23f3eedba243899/h5netcdf/core.py#L697
            - are we excluding any use-cases by adopting a netCDF data model?
            - do we allow conflicts in inherited variables?
                - CF conventions do not allow conflicting dimensions
                - Do we want to allow conflicting coordinates/data variables?
        - EDIT: Tom commented a summary of this https://github.com/xarray-contrib/datatree/issues/297#issuecomment-1967328385

## Feb 20th, 2024

### Attendees
- Matt Savoie / @flamingbear
- Justus Magin / @keewis
- Owen Littlejohns / @owenlittlejohns
- Stephan Hoyer

### 60 second updates
- datatree tests are not skipped in the new release

### Agenda
- Intro to the purpose of these meetings
- Update from Matt?
- High-level explanation of datatree's overall design from Tom
    - One group, one `Dataset`
    - Nested dictionary
    - Independent nodes
    - Store `Variable` objects instead of `Dataset`s
    - Map API downwards
- Deep-dive into one decision / part of code (if time)
    - pathlib: non-pure paths on datatree?

### Actions
- [X] Track down reason for exploding Dataset into pieces in datatree in issues. https://github.com/pydata/xarray/issues/8747#issuecomment-1955051183
- [X] Make migrations flat, i.e. no datatree subdir in xarray.

### Ideas
- Ideas from Stephan:
    - Switched OrderedDict -> dict
    - Move Dataset-like hidden properties onto a dedicated object?
- idea: subtree mapping: returns the full tree with just the specified nodes (and maybe their children)

```python
dt.subtree(["/a", "/b/c"]).isel(...)
```
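
To make the subtree-mapping idea a bit more concrete, here is a minimal sketch that uses a flat `dict` of path to node as a stand-in for a tree. Everything in it is an assumption for illustration only: the helper name `select_subtree`, the `include_children` flag and the dict-of-paths representation are hypothetical and not existing datatree or xarray API.

```python
def select_subtree(tree, paths, include_children=True):
    """Return a new mapping holding only the requested nodes.

    `tree` is a hypothetical flat mapping of absolute path -> node payload,
    standing in for a real tree object.
    """
    # Normalise "/a/" to "/a", but keep the root path "/" intact.
    wanted = [p.rstrip("/") or "/" for p in paths]
    selected = {}
    for path, node in tree.items():
        for root in wanted:
            is_match = path == root
            # A child lives strictly below the requested node.
            is_child = path.startswith(root.rstrip("/") + "/")
            if is_match or (include_children and is_child):
                selected[path] = node
                break
    return selected


tree = {
    "/": "root",
    "/a": "node a",
    "/a/x": "node a/x",
    "/b": "node b",
    "/b/c": "node b/c",
    "/b/d": "node b/d",
}

subset = select_subtree(tree, ["/a", "/b/c"])
print(sorted(subset))
```

Note that this sketch drops intermediate nodes such as `/b`. A real implementation would presumably need to keep (possibly empty) ancestors so that the result is still a connected tree; that is one of the open design questions behind the "and maybe their children" remark.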
