Alex Kerney
# Xpublish community meeting

## 2026-01-02

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Matthew Iannucci / Earthmover / @mpiannucci / mpiannucci@gmail.com
- Joe Smith / Great Lakes Observing System / joe@glos.org

__Agenda and notes__:

- Punt Zarr to a separate plugin at some point if accessing private details or needing to vendor Xarray internals
  - https://github.com/xpublish-community/xpublish/pull/317#pullrequestreview-3623538989
- multi-spatial/datatrees
  - https://github.com/earth-mover/xpublish-tiles/issues/172
- OGC conformance
  - https://github.com/xpublish-community/xpublish-ogc-core
  - https://github.com/xpublish-community/xpublish-edr/pull/97
- Cache strategy for xpublish (more for brains to chew on)
  - XREDS: https://github.com/asascience-open/xreds/blob/01cfa89fee500a63a43f718d5c56ad22a7454999/xreds/dataset_provider.py#L66
  - Xpublish uses Cachey https://github.com/dask/cachey internally
- Dataset Provider
  - check XREDS implementation

## 2025-12-05

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Matthew Iannucci / Earthmover / @mpiannucci / mpiannucci@gmail.com

__Agenda and notes:__

- Earthmover
  - Might roll EDR into Tiles as Tiles manages

## 2025-11-07

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Jonathan Joyce / RPS Ocean Science / @jonmjoyce
- Matthew Iannucci / Earthmover / @mpiannucci / mpiannucci@gmail.com

__Agenda and notes:__

- Tetratech
  - Getting ROMS grids working in Icechunk
  - Chunk alignment error in FVCOM
    - Likely to be a VirtualiZarr issue rather than Icechunk specific
    - https://github.com/zarr-developers/VirtualiZarr/issues/815
- XREDS
  - Updating Icechunk support
  - How do we extract virtual variables and other plugins to be more generally useful
  - Still getting Axiom integrated but they are working on some Xpublish as well
- Forecasts?
  - https://github.com/abkfenris/xarray_fmrc/

## 2025-10-03

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Matthew Iannucci / Earthmover / @mpiannucci / mpiannucci@gmail.com
- Sean Arms - Unidata

__Agenda and notes:__

- Earthmover
  - Xpublish-Tiles
    - https://github.com/earth-mover/xpublish-tiles
    - OGC Tiles √
    - WMS
  - Need to do some general updates to Xpublish
  - Folks asking about Datatrees
  - Xpublish-EDR
    - Need to make async now that Xpublish can do async
- Thredds
  - https://thredds-dev.unidata.ucar.edu/thredds/catalog/icechunk.html Using Xarray and Icechunk via gRPC (Python feeding Java)
  - But could also be set up where Java feeds Python
  - Thinking of how to swap out services/making them pluggable
- How to scale
  - Datashader does over a second of JIT on a cold start

## 2025-09-05

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Matthew Iannucci / Earthmover / @mpiannucci / mpiannucci@gmail.com
- Jonathan Joyce / RPS Ocean Science / @jonmjoyce

__Agenda and notes:__

- Earthmover
  - Async Xarray and Zarr
    - New async rendering for tiles and WMS
    - Deepak magic
  - Xpublish-tiles
    - https://pypi.org/project/xpublish-tiles/
    - 1-400 ms
    - Any CRS
    - Renderers as plugins
    - Numpy tiles...
    - https://docs.earthmover.io/flux/tiles
  - Learning
    - Want to add the async loading to EDR
    - People want to do weird things with EDR, hard to tell what will be slow
    - Query planner, cutting off at 1 GB
    - Not using Dask anymore, except for EDR but want to move off it as well
  - Dataset validation at a per-plugin level
    - Maybe use XRLint
  - Dataset provider will eventually need to be async
  - Icechunk 2.0 https://github.com/earth-mover/icechunk/pull/1154

## 2025-08-01

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Matthew Iannucci / Earthmover / @mpiannucci / mpiannucci@gmail.com
- Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@tetratech.com

__Agenda and notes:__

- Lots of interest in an ERDDAP MCP server during an IOOS webinar. It might be worth exploring what an MCP plugin might look like for Xpublish.
  - I think we would have to expose a new hook point.
  - https://gofastmcp.com/integrations/fastapi#offering-an-llm-friendly-api
  - https://github.com/tadata-org/fastapi_mcp/tree/main
- Earthmover
  - Strange coordinates
  - Numpy tiles
  - Public datasets
    - Dynamical
    - ERA5 and AIFS are available: https://app.earthmover.io/public/repositories
    - Using Cloudflare for public data as there is no egress
      - ~2/3 of the speed of S3
- TetraTech/RPS/Axiom
  - CoDMAC
    - Working on barb tiles
  - Virtual variables
    - https://github.com/orgs/xpublish-community/discussions/25
- Paths
  - OGC Collections API
    - TiTiler also ignores it
    - Most EDR clients need the collections path
  - Datatrees
  - https://fastapi.tiangolo.com/tutorial/path-params/#path-convertor

## 2025-06-06

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Matthew Iannucci / Earthmover / @mpiannucci / mpiannucci@gmail.com
- Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@tetratech.com
- Sean Arms

__Agenda and Notes__:

- XREDS
  - works with Zarr 2 and 3
  - Starting to use Icechunk more
- Async Xarray Updates
  - Thomas Nicolas has been adding an async loading method for Xarray
  - Trying to unblock various Xpublish plugins at scale
  - Had a little hackathon to try to speed things up, but really need to push async and caching down the stack
  - Icechunk currently caches in memory
    - Want to build remote/disk caching
    - Currently Deepak is working on manifest splitting so that the core metadata can be separate files
- EDR + WMS Updates
  - GeoTIFF support
  - Thomas is probably going to refactor WMS at some point
- Obs data
  - Shane going to start prototyping w obs data, using parquet
  - Can we convert from db or other sources directly to xarray?
  - Can we carry cf metadata with parquet/iceberg?
  - CF aggregations: https://github.com/cf-convention/cf-conventions/issues/508
    - `actual_range` attribute with a `[min, max]`

## 2025-04-04

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Matthew Iannucci / Earthmover / @mpiannucci / mpiannucci@gmail.com
- Nicholas Delli Carpini / RPS Ocean Science

__Agenda and notes__:

- Matt is cleaning up errors, or at least
- Earthmover Xpublish service (flux) demo from Matt
- XREDS - Nicholas
  - Working with Zarr 3 - no major performance changes
  - Got rid of the redis caching system.
    - Deserialization time might take longer than the cache time
    - Probably should be run at the Nginx layer
    - Can pickle the datasets themselves and share them across the workers
    - Max number of datasets to cache in memory
  - Moved from Kerchunk engine to Zarr with reference file system
  - Fixed some logger changes
    - Was getting overridden by Uvicorn
- WMS
  - Adding a triangular grid type
- Zarr 3 transition
  - Matt will come back to the Xpublish PR
- Adding tracing
  - Does it get added to `Deps`?
- Can we make WMS data loading async?
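The XREDS caching notes in the 2025-04-04 meeting mention capping the number of datasets held in memory. One way to picture that is a small least-recently-used cache, sketched below. This is not the XREDS implementation: the class name `DatasetLRUCache`, the `loader` callable, and the `max_datasets` parameter are all hypothetical, and a plain dict stands in for an Xarray dataset so the example runs with only the standard library.

```python
from collections import OrderedDict
from typing import Any, Callable, List


class DatasetLRUCache:
    """Hypothetical in-memory cache that keeps at most `max_datasets` entries."""

    def __init__(self, loader: Callable[[str], Any], max_datasets: int = 4) -> None:
        if max_datasets < 1:
            raise ValueError("max_datasets must be at least 1")
        self._loader = loader  # assumption: e.g. a function that opens a dataset by ID
        self._max = max_datasets
        self._items: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, dataset_id: str) -> Any:
        if dataset_id in self._items:
            # Cache hit: mark the entry as most recently used.
            self._items.move_to_end(dataset_id)
            return self._items[dataset_id]
        # Cache miss: load the dataset, then evict the oldest entry if over the cap.
        value = self._loader(dataset_id)
        self._items[dataset_id] = value
        if len(self._items) > self._max:
            self._items.popitem(last=False)
        return value

    def cached_ids(self) -> List[str]:
        """Return cached IDs ordered from least to most recently used."""
        return list(self._items)


if __name__ == "__main__":
    loads: List[str] = []

    def fake_loader(dataset_id: str) -> dict:
        loads.append(dataset_id)  # record every real load
        return {"id": dataset_id}

    cache = DatasetLRUCache(fake_loader, max_datasets=2)
    for dataset_id in ["a", "b", "a", "c"]:
        cache.get(dataset_id)
    print(cache.cached_ids())  # ['a', 'c']
    print(loads)  # ['a', 'b', 'c']
```

In the demo, `b` is evicted because `a` was touched more recently, so only three real loads happen for four requests. Sharing such a cache across worker processes, as the notes suggest with pickled datasets, would need a different mechanism than this per-process dict.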
## 2025-03-07

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Matthew Iannucci / Earthmover / @mpiannucci / mpiannucci@gmail.com
- Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@tetratech.com

__Agenda and notes__:

- zarr 3 support in xpublish
  - Earthmover is using Xpublish with Zarr 3 now and it works, just dependency install order creativity
  - split out zarr plugin to its own repo?
  - dataset plugin uses zmetadata, how do we work around that?
    - https://github.com/xpublish-community/xpublish/blob/c64d6bc1d751c125af4686a70b192f9b9f53f9bd/xpublish/plugins/included/dataset_info.py#L50-L78
  - Updating plugin to zarr 3 should be easier than supporting v2 api
- EDR collections API
  - We are not compliant with edr spec
  - Doesn't play well with QGIS
  - Currently only works with gridded data, what about vector data in xarray?
- WMS roadmap
  - Written for an RPS specific use case, doesn't adapt as well for other use cases
  - Performance concerns
    - No async in Xarray
    - Requests stack up
    - Can't use Dask for this, too many chunks as the task graph gets too big. 15-ish seconds with Dask, 1 second without
    - https://github.com/xpublish-community/xpublish-wms/pull/98
- CO-DMAC roadmap
  - XREDS vs Xpublish
    - XREDS: reasonable defaults, but able to override (replace or add dataset providers)
    - https://github.com/abkfenris/xpublish-config
  - Enterprise scaling / load testing
    - Recording metrics, open-source performance testing
    - Need standardized datasets bc chunking is biggest impact
    - Underlying architecture
    - Spread out request management
  - Simple deployment & instructions for NOAA
  - Test CORA visualization
  - Try icechunk / zarr 3 (2-3x requests)
    - Kerchunk supports zarr 3 / virtualizarr
    - Should be able to read existing datasets w/ zarr 3
  - ERDDAP and THREDDS converter
    - https://github.com/gulfofmaine/dataset_catalog
    - Vector data, can we re-serve data from THREDDS? OpenDAP xpublish vs opendap THREDDS.
    - Re-serving ERDDAP: https://xpublish.onrender.com/docs https://github.com/xpublish-experiments/xpublish-erddap?tab=readme-ov-file

## 2025-02-07

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@tetratech.com
- Kristen Thyng / Axiom Data Science / @kthyng / kristen@axds.co
- Nicholas Delli Carpini / RPS Ocean Science

__Agenda and notes__:

- Chunks?

## 2025-01-03

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Matthew Iannucci / Earthmover

__Agenda and Notes__:

- How did the AGU talk go?
- Find a time to prep for Pangeo Showcase talk
- Earthmover
  - Vendored Zarr dependencies PR https://github.com/xpublish-community/xpublish/pull/285
  - Issue with how to represent tree structures without always making/opening datatrees due to non-similar datasets in the tree
    - Maybe figure out the core `DataTree` accessor methods that Xpublish might need, then allow plugins to return `DataTree`s or `DataTree`-like duck objects that we can traverse.
- Scheme for next few meetings
  - Use the showcase to try to get more folks using it to show up, find out how they are using and what they need
  - Maybe April we have a roadmap planning meeting to figure out what are the big common wishlist items (datatree, dask/cache...) and figure out the next step in evolution

## 2024-12-06

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Joe Hamman / Earthmover / @jhamman / joe@earthmover.io
- Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@tetratech.com
- Kristen Thyng / Axiom Data Science / @kthyng / kristen@axds.co
- Matthew Iannucci / Earthmover

__Agenda and Notes__:

- Joe previewing his AGU talk
  - Scaling Xpublish OpenDap to 2k requests per second built on top of IceChunk
- Earthmover
  - has put a lot of effort into EDR over the last month or so
    - Schemed on how to make things more OGC compliant
    - Can draw random polygons in the area request
    - Animated GeoJSON over time and click to get a timeseries with a point request
  - Focusing on OpenDAP now
  - Next is trying to get the dataset provider to work with a full path to work with hierarchy
    - Swapped `dataset_id: str` to `dataset_id: path`
  - Mixing `SingleDatasetRest` and caching dataset loading
  - Tuning dask for better concurrent loading and not always useful in a single query
    - Often dask ends up being a 100 ms-ish overhead
    - When is dask faster? Zarr 3 changes the concurrency
    - Zarr 3 introduces its own I/O concurrency pool (Dask does too, but it has additional overhead)
    - https://github.com/oceanhackweek/ohw24_proj_xarray_load_by_step_us
- Icechunk
  - VirtualiZarr just landed appending https://github.com/zarr-developers/VirtualiZarr/blob/main/examples/append/noaa-cdr-sst.ipynb
  - More work on references in progress
- Zarr v3 ready?
  - Beta 3 is out today
  - Final is supposed to be out in a month
- XREDS
  - RPS and Axiom are using more
  - Good thing the requirements are pinned as EarthMover is making a lot of changes to plugins at the moment https://github.com/asascience-open/xreds/blob/2271f7899f2b280ee19ced26f7bc8a2dc8ae892e/requirements.txt#L32-L41
  - CO-OPS is having slow requests, probably need to rethink how WMS is structured
  - GMRI/NERACOOS hasn't gotten to move over
- OpenDAP Protocol
  - Let's vendor into XPublish-OpenDAP so we can experiment faster, add note to protocol repo
  - https://github.com/xpublish-community/opendap-protocol/pull/10
  - Not actually using Dask

## 2024-11-01

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Joe Hamman / Earthmover / @jhamman / joe@earthmover.io
- Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@tetratech.com
- Kristen Thyng / Axiom Data Science / @kthyng / kristen@axds.co

__Agenda and Notes__:

- Datatree - Joe
  - We should change Xpublish to use datatree as the core abstraction rather than dataset
  - Allows the individual protocols to handle the nesting
  - Routers should be datatree specific
  - Potentially could treat the datasets at init as a default tree
- Dataset transformer plugins
  - Lazy
- Dask
  - Earthmover is using a local dask scheduler not distributed
  - Zarr 3 will change the dynamic some as it runs its own threadpool
- Extensions:
  - https://github.com/asascience-open/xreds/blob/main/xreds/extensions/vdatum.py
  - https://github.com/asascience-open/xreds/pull/21
  - Should the dataset/datatree providers
- Pangeo showcase - Xpublish at Scale
  - Who else is using it?
    - Co-Ops running their own operational Xpublish instance (moved from RPS).
    - NOAA NCOFS (Axiom) - cranky mfdataset
      - https://coastalscience.noaa.gov/
    - IOOS Model Viewer uses some xpublish layers: https://eds.ioos.us/

## 2024-10-04

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@tetratech.com
- Kristen Thyng / Axiom Data Science / @kthyng / kristen@axds.co
- Matthew Iannucci / Earthmover

__Agenda and Notes__:

- Axiom is using more Xpublish
- CO-OPs is using some
- NERACOOS can rip out their model viewer and move to XREDs
- Earthmover is using EDR, WMS, OpenDAP
- Xarray PR for flexible coordinate transforms https://github.com/pydata/xarray/pull/9543

## 2024-09-06

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Kristen Thyng / Axiom Data Science / @kthyng / kristen@axds.co
- Joe Hamman / Earthmover / @jhamman / joe@earthmover.io

__Agenda and Notes__:

- Xpublish & datatree
  - Some protocols are natively capable of being hierarchy aware
    - OpenDAP
    - Zarr
    - EDR (through collections?)
  - What about making datatree aware plugins opt in?
    - Other plugins (WMS) can have a general path-based access
  - How do datatrees interact with catalogs
    - Datatrees don't need to have all contained datasets have the same dimensions/coordinates/variables, so catalogs most likely could be represented as trees
    - Might need a catalog as a datatree-like model, but allow lazy-er access
- OpenDAP
  - Vendor vs take over package
  - https://github.com/MeteoSwiss/opendap-protocol/issues/9#issuecomment-2334517096

## 2024-07-05

- Alex
- Kristen

## 2024-06-07

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Matthew Iannucci / RPS Ocean Science / @mpiannucci / matthew.iannucci@tetratech.com
- Kristen Thyng / Axiom Data Science / @kthyng / kristen@axds.co
- Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@tetratech.com

__Agenda and Notes__:

- Alex:
  - OpenDAP is still problematic
- Matt
  - Some issues with downloading 500-ish MB downloads, but it may be something with the AWS networking
  - Looking at running Xpublish serverless
    - Caching the Xarray Dataset representation without the data

## 2024-05-03

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Matthew Iannucci / RPS Ocean Science / @mpiannucci / matthew.iannucci@tetratech.com
- Kristen Thyng / Axiom Data Science / @kthyng / kristen@axds.co

__Agenda and Notes__:

- OpenDAP coords/dims help needed https://github.com/xpublish-community/xpublish/discussions/246
- `__` as catalog path separators?
- ## 2024-04-05 - Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org - Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@tetratech.com - Kristen Thyng / Axiom Data Science / @kthyng / kristen@axds.co - Anthony Aufdenkampe / LimnoTech / @aufdenkampe / aaufdenkampe@limno.com __Agenda and Notes__: - Some scheeming at the IOOS DMAC meeting - Xavier's work at LimnoTech is now deployed for USGS - USGS needs to continue to support old clients using OpenDAP, but they would rather folks access them via Xarray native/STAC catalog - Also exposing OGC processing API (PyGeoAPI) so they can set their super computer to work - Looking for it to be a hands off/low maintenance THREDDS replacement - Deployed to AWS Fargate by packaging into a Docker container - Might be worth digging into their deployment in the future - IOOS Code Sprint ideas - Jonathan suggested two topics - https://github.com/ioos/ioos-code-sprint/issues - Xavier has an experiment in: https://github.com/xpublish-experiments/Catalog-To-Xpublish - https://labs-beta.waterdata.usgs.gov/api/xpublish/catalogs - Axiom is using their two longest hindcasts via Xpublish - Not expecting it to work really well - `xr.openmf_dataset` and `kerchunk` aren't working great - Not a lot of usage right now, hoping to get back to it with some of the IRA funding - ## 2024-03-01 - Joe Hamman / Earthmover / @jhamman / joe@earthmover.io - Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@tetratech.com - Matthew Iannucci / RPS Ocean Science / @mpiannucci / matthew.iannucci@tetratech.com - Kristen Thyng / Axiom Data Science / @kthyng / kristen@axds.co __Agenda and Notes__: - Jonathan presenting at https://sea.ucar.edu/conference/2024 - Discussed future of Zarr v3 and Kerchunk ## 2024-02-02 __Attendees__: - Alex Kerney / Gulf of Maine Research Institute / @abkenris / akerney@gmri.org - Matthew Iannucci / RPS Ocean Science / @mpiannucci / matthewiannucci@tetratech.com __Agenda and 
Notes__: - Overlapping meetings ## 2024-01-05 __Attendees__: - Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org - Joe Hamman / Earthmover / @jhamman / joe@earthmover.io - Matthew Iannucci / RPS Ocean Science / @mpiannucci / matthewiannucci@tetratech.com - Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@tetratech.com - Kristen Thyng / Axiom Data Science / @kthyng / kristen@axds.co __Agenda and Notes__: - https://github.com/mpiannucci/redis-fsspec-cache - Works well but very much a prototype - Nerd sniping success - Go add to Jonathan's wishlist - https://github.com/orgs/xpublish-community/discussions/19 - Do we want to support datatrees natively? - Paths are still a pain, and something we will have to figure out, but we probably need to figure them out anyways for catalogs - Zarr Python 3 - Aiming for a spring beta release - Async support - Store interface is going to be entirely async - Building off of a Zarrita branch - Already has sharding support - Trying to figure out how breaking it's gonna be - https://github.com/zarr-developers/zarr-python/pull/1583 - Benchmarking and Performance: https://github.com/zarr-developers/zarr-python/discussions/1479 - Matt's very simple rust zarr 3 impl https://github.com/mpiannucci/charizarr - Most folks will use the sync api, but zarr will handle the `asyncio.run` concurrency . - Matt should comment on this: https://github.com/zarr-developers/zarr-python/discussions/1603 - Multidimensional coordinates getting lost through OpenDAP https://github.com/xpublish-community/xpublish/discussions/246 ## 2023-12-01 __Attendees__: - Joe Hamman / Earthmover / @jhamman / joe@earthmover.io - Note: I'm going to be late or may not make it at all. I am curious if anyone from this group will be at AGU. 
- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org - Matthew Iannucci / RPS Ocean Science / @mpiannucci / matthewiannucci@tetratech.com - Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@tetratech.com - Kristen Thyng - Shane St Savage - Xavier Nogueira / LimnoTech / @xaviernogueira - - __Agenda and Notes__: - Who will be at AGU? - Nope - Jonathan will present at AMS remotely. Jan 30 at 5pm EST. - Xpublish OpenDAP - Bug in opendap plugin related to datatypes - Xavier going to look into it - https://github.com/xpublish-community/xpublish-opendap/issues/45 - WMS - done with the implementation part - Making the GetFeatureInfo more accurate based on grid cell and interpolating rather than just finding the nearest node - Tested with a bunch of unstructured models - examples via https://nextgen-dev.ioos.us/xreds/ - Working reasonably well, if there are any datasets that don't work, let Matt know - He's mainly working from the NOMADS bucket, not CO-OPs - Except for the St John's river model as it's a hot mess - Support NcWMS extensions? 
- GetTimeseries is supported - Some others aren't that RPS isn't using - [wms capabilities](https://github.com/xpublish-community/xpublish-wms/blob/36e44b4dc695d96576752b85832fffec2380fb33/xpublish_wms/wms/__init__.py#L23-L49) - OceansMap uses a seperate catalog that aggregates multiple servers rather than hitting `/datasets/` - Room for optimization if a stateful cache can be used somehow - Async - Eventually want to figure out async workflows with xpublish routes - Alex started dask provider workflow - Caching - Possibility of using redis to cache zarr values (metadata, chunks) which can be shared among instances of xpublish when used in a cluster - Not sure if fsspec, zarr/xarray, or xpublish is the right level for this functionality - If you want to provide predictable caching, you may want to turn Xarray's caching off - Not sure there is much benefit of redis over dask persist beyond TTL functionality - ### High-level ideas for improving xpublish capabilities: https://github.com/orgs/xpublish-community/discussions/19 ## 2023-11-03 __Attendees__: - Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org - Joe Hamman / Earthmover / @jhamman / joe@earthmover.io - Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@rpsgroup.com - Matthew Iannucci / RPS Ocean Science / @mpiannucci / matt.iannucci@rpsgroup.com - Kristen Thyng / Axiom Data Science / @kthyng / kthyng@gmail.com - Shane St Savage / Axiom Data Science / @srstsavage / shane@axds.co __Agenda and Notes__: - GMRI and RPS (NERACOOS and MARACOOS) looking for Xpublish (XREDS) funding - Showed NOAA CO-OPs how Xpublish could speed up their infrastructure, currently stuck on THREDDS - Would be interested in working with Axiom too - XREDS now has an OS license - https://github.com/asascience-open/xreds - Demo https://nextgen-dev.ioos.us/xreds/ - Next phase is towards defining catalog loading behavior: https://github.com/xpublish-community/xpublish-intake-provider - RPS has a 
junior developer working on WMS - ROMS, Regular grids, - FVCOM in progress - New updates to Xarray fixed a bunch of issues - SELFE, ADCIRC, SCHISM up next - Joe's questions: - Does anyone know the folks maintaining this: https://github.com/MeteoSwiss/opendap-protocol - If they aren't managing it, should we adopt it - Looking for an example using a custom dataset provider with nested paths (e.g. `dataset_id = "foo/bar/abc"`) - https://xpublish.readthedocs.io/en/0.3.3/getting-started/tutorial/dataset-provider-plugin.html - TODO: will need to re-mount a special router than can handle this - Do people use the zarr router for xpublish? - yes Axiom does - Joe wonders about splitting it out separate from the xpublish package - this is related to timing with v3 of zarr python - Axiom is going to do all sorts of ERDDAP related things hopefully to include Xpublish backends - NODD - Each NOAA line office has a quota, and most aren't even close to using all of it - It can take a lot of meetings to get the logistics for access figured out, but many offices are happy to push as much data up as possible - They just don't want to be paying for any compute ## 2023-10-06 __Attendees__: - Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org - Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@rpsgroup.com - Matthew Iannucci / RPS Ocean Science / @mpiannucci / matt.iannucci@rpsgroup.com - Joe Hamman / Earthmover / @jhamman / joe@earthmover.io - Xavier Nogueira / LimnoTech / @xrnogueira / xavier.rojas.nogueira@gmail.com - Kristen Thyng / Axiom Data Science / @kthyng / kthyng@gmail.com __Agenda and Notes__: - Axiom did some hacking on ERDDAP publishing - RPS is working on HFRadar, currently serving data with ERDDAP - Alex Mocked out Pandas DataFrame router implementation ([Discussion 16](https://github.com/orgs/xpublish-community/discussions/16)) - Kerchunk metadata only aggregations - USGS NHD and NOAA NWS is working to deprecate THREDDS - 
OpenDAP endpoints will be Xpublish - Zarr to OpenDAP interopertability layer - USGS would like to deprecate OpenDAP - PyGeoAPI - Zarr, STAC, S3 - RPS is working with CO-OPs - NOAA DAARWG? - NESDIS common cloud framework - Currently NESDIS only - Model and satelite data pipelines into products and services - Shared compute & data services - Possibility of broadening reach into rest of NOAA - https://nesdis-prod.s3.amazonaws.com/migrated/SessionVI_CaseyKenneth_0.pdf - OGC APIs in Xpublish - EDR - WMS - Xpublish is much faster to first unchached byte than THREDDS - RPS is still trying to figure out various scaling problems - We could implement more-compliant routes for OGC APIs in addition to having them below `/datasets` - Zarr - Python has a beta implementation of V3, but it's a little too polymorphic - Still trying to figure out how implement the internals - V2 can be mapped to V3 - List of all NODD datasets: https://www.noaa.gov/nodd/datasets ## 2023-09-01 __Attendees__: - Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org - Jonathan Joyce / RPS Ocean Science / @jonmjoyce / jonathan.joyce@rpsgroup.com - Matthew Iannucci / RPS Ocean Science / @mpiannucci / matt.iannucci@rpsgroup.com - Dan Allan / Brookhaven National Laboratory / @danielballan / dallan@bnl.gov - Garrett Bischof / Brookhaven National Lab - NSLS2 / @gwbischof / gbischof@bnl.gov - Padraic Shafer / Brookhaven National Lab - NSLS2 / @padraic-shafer / pshafer@bnl.gov - Xavier Nogueira / LimnoTech / @xaviernogueira / xavier.rojas.nogueira@gmail.com __Agenda and Notes__: - Meet and greet with Tiled: https://github.com/bluesky/tiled/issues/523#issuecomment-1703015152 - Indivudal intros - Xpublish intro - Tiled intro - Pain point - IO was baked too deeply into scientific code - Made a Intake like library before Intake - Realized that a service was an interesting thing - Needed to bring data to more than just Python users - Started by working within intake - Intake had a prototype 
HTTP server, intake-server - Client-side xarrays pull chunks from server-side xarrays -- cool idea! - Contributed 40-45 PRs to intake over a year or two to try to develop this into something more production - Found that it was too Python focused to be language agnostic - Don't have a lot of existing services that exist in the space - Centralized on structures - Array - Table (arrow-like) - Awkward - Containers... - Three serving modes - File based, a read-only view (walk a directory) `tiled serve directory files/` - As the source of truth, clients write data into it `tiled serve catalog catalog.db --writable-storage data/` - Proxying from another service with custom Python code - Some combination of the above `tiled server config config.yml` - Focused on data access, slicing, not general-purpose data processing endpoints - Concerned about resource requirements for processing - Put it in a separate microservice? - Concerned about frequently-evolving or growing list of processing needs from our diverse user groups ## 2023-08-04: __Attendees__: - Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org - Joe Hamman / Earthmover / @jhamman / joe@earthmover.io - Xavier Nogueira / LimnoTech / @xaviernogueira / xavier.rojas.nogueira@gmail.com - Kristen Thyng / Axios Data Science / kthyng@gmail.com __Agenda and Notes__: - Kristen is from Axiom and replacing Kyle in a few cases - Joe - Zarr v3 is finalized - Leading a working group to get the python implementation up to speed - https://github.com/zarr-developers/zarr-python/discussions/1480 - Pretty major refactor - Implications for the Zarr Router and Kerchunk - Store layout has changed - Built a converter - The chunk binaries are the same - Metadata changes - Xavier: TLDR on Zarr V3 changes? - Not as huge as the initial proposed spec - More flexibility around metadata and chunk extensions - ZEP002? 
Martin is working on a variable chunk size implementation extension
                - Ex: a dataset with an initial bulk update and then daily appends
- Alex
    - Tinkering with Xpublish-config
- Xavier
    - Want to build a STAC-Catalog plugin
        - Catalog router/provider plugin
- Kristen
    - Reading what packages Kyle built
        - Xpublish host

## 2023-07-07

__Attendees__:

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Joe Hamman / Earthmover / @jhamman / joe@earthmover.io
- Xavier Nogueira / LimnoTech / @xaviernogueira
- Matthew Iannucci / RPS Group / @mpiannucci

__Agenda and Notes__:

- Dask
    - Experimental Dask client/cluster plugins https://github.com/xpublish-community/xpublish/pull/208
        - Hard to make the worker environments match the server
    - How do we asynchronously compute an Xarray Dask graph?
        - We should be able to `await client.submit(da.sel(time=0).data)`
        - https://distributed.dask.org/en/stable/asynchronous.html
    - Do we want to require Dask?
        - General consensus is No
    - Non-async requests don't block when there are multiple threads, whereas async requests block
- Share catalog plugin
    - Collections can share item IDs
    - Have we tried the `:path` parameter type? E.g. `/foo/{p:path}/zarr`
        - `/catalogs/{catalog}/{p:path}/_routes_/zarr`
        - TODO: open a ticket on FastAPI discussions?
- Dataset metadata
    - `/datasets/{dataset_id}/files`...
    - We don't want to get too far from being able to access Xarray datasets directly, but there are circumstances when it would be nice to be able to direct users straight to storage instead of hitting Xpublish and having it, say, translate Zarr to Zarr
        - Maybe `ds.encoding['source']` might have some of that info
        - Wouldn't want to try to direct users to local file paths
        - Same data may be stored in multiple places (local, S3, GCS, OSN...)
- Zarr chunks
    - https://github.com/xpublish-community/xpublish/issues/207
    - something to try: `ds.reset_encoding()` before serving these datasets

## 2023-06-02

__Attendees__:

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Joe Hamman / Earthmover / @jhamman / joe@earthmover.io
- Xavier Nogueira / LimnoTech / @xaviernogueira / xavier.rojas.nogueira@gmail.com
- Jonathan Joyce / RPS Group / @jonmjoyce / jonathan.joyce@rpsgroup.com
- Kyle Wilcox / Axiom Data Science / @kwilcox / kyle@axds.co

__Agenda and Notes__:

- OpenDAP weirdness depending on the client
    - https://github.com/xpublish-community/xpublish-opendap/issues/18
- Dask
    - Links:
        - https://examples.dask.org/applications/async-web-server.html
        - https://examples.dask.org/applications/async-await.html
    - Jonathan: Hard/slow to connect to a remote cluster and keep the dependencies in sync
    - Kyle
        - Spin up an Xpublish server with a sidecar cluster per dataset
        - Throwing big servers at the problem
        - We'll need to figure out how to work with KubeCluster and other ways
    - Joe: Has anyone tried async computation? We should be able to delegate it
    - Xavier: Async is the next thing he's trying
    - Joe: FastAPI and Dask can be async, but there isn't much info on how to use Xarray async
    - Jonathan: How do you manage remote state
    - Xavier: Mixing sync/async
        - https://anyio.readthedocs.io/en/stable/threads.html#calling-synchronous-code-from-a-worker-thread
        - https://simonwillison.net/2020/Sep/2/await-me-maybe/
    - Joe:
        - Could await Dask tasks as coroutines. Xarray has no native async code for
        - It would be great to make a blog post about using Dask with Xpublish
            - Describing the deployment methods
                - Single server with Dask sidecar
                - Autoscaling servers and Dask clusters
                    - Hard to keep the cache alive with ephemeral Dask clusters
    - Jonathan
        - Are there enough calculations for caching in Dask, or should it be on the IO path?
    - Kyle
        - Two main access patterns
            - Maps
            - Timeseries
        - Dynamic dataset that invalidates after 10 min for map data
    - Xavier
        - Experimenting with chunk sizes in other servers, not efficient with non-coordinate queries
    - Kyle
        - Keep a tree in memory for querying against
            - https://github.com/xarray-contrib/xoak
    - Jonathan
        - Starting to work more with irregular grids
    - Kyle
        - Not a ton of different grid types
        - Can make some performant code that can target each type of grid
        - Try to make everything work with cf-xarray
    - Joe
        - Haven't benefitted yet from the Xarray flexible indexes refactor
        - Should be able to write indexes that can be more performant and map to multiple variables at the NetCDF level
        - Benoit Bovy did the flexible index refactor and wrote xoak

## 2023-05-05

__Attendees__:

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Jonathan Joyce / RPS Group / @jonmjoyce / jonathan.joyce@rpsgroup.com
- Joe Hamman / Earthmover / @jhamman / joe@earthmover.io
- Xavier Nogueira / LimnoTech / @xaviernogueira / xavier.rojas.nogueira@gmail.com
- Kyle Wilcox / Axiom Data Science / @kwilcox / kyle@axds.co
- Shane St Savage / Axiom Data Science / @srstsavage / shane@axds.co

__Agenda and Notes__:

- Xavier demo
    - Built partially as a learning exercise, trying to figure out how things work
    - Take an Intake catalog or a static STAC catalog
    - Geared towards working with Zarr data, so it requires the cloud storage STAC extension
    - Working on USGS projects
        - Modernizing and standardizing infrastructure
    - Findings
        - Transforming dataset provider
- Working towards releasing 0.3 [#183](https://github.com/xpublish-community/xpublish/issues/183)
    - Docs: https://github.com/xpublish-community/xpublish/pull/180
    - Release cadence?
When stuff happens, release on little changes
- Axiom update
    - DatasetConfig plugin, loaders, metrics
    - [SFBOFS - Full CO-OPS OFS model](https://xpublish-sfbofs.srv.axds.co/datasets/sfbofs_all)
    - [SFBOFS - Last 24 hours of CO-OPS OFS model for mobile phone app](https://xpublish-sfbofs.srv.axds.co/datasets/sfbofs_latest)
    - [CIOFS - Actively running forecast model (auto updates with new data)](http://xpublish-ciofs.srv.axds.co/datasets/ciofs_hindcast/)
    - https://github.com/axiom-data-science/xpublish-host
        - Includes lots of metrics work
    - Put it into production for two APIs
        - Serving models to a mobile app
            - Last 24 hours of data
            - Returning binary parquet
            - Getting 400 Mb/s
    - How do we configure servers in a standard way?
        - DatasetConfig plugin
            - Auto reloading datasets
            - Avoiding reloading in the Dask cluster
            - Cache invalidation
            - Scales to multiple workers
    - Currently running a 30-year hindcast, but researchers wanted data as soon as possible
- Volunteers
    - Answer Q&As, review PRs
        - Joe
    - Work on documentation/examples
        - Jonathan
        - Xavier (OpenDAP)
        - ...
    - both on Xpublish and across the ecosystem
- Bigger issues to wrangle
    - Caching
        - RPS trying to cache tiles for WMS
    - Dask
        - How are async and Dask requests working?
            - Axiom: all sync endpoints
            - In theory that should help with concurrent reads or requests
                - Most useful in Zarr for high latency stores 'listing all the keys in an S3 bucket'
    - Catalogs

__Action Items__:

## 2023-04-07

__Attendees__:

- Alex Kerney / Gulf of Maine Research Institute / @abkfenris / akerney@gmri.org
- Xavier Nogueira / LimnoTech / [@xaviernogueira](https://github.com/xaviernogueira) / xrnogueira@limno.com
- Joe Hamman / Earthmover / @jhamman / joe@earthmover.io
- Kyle Wilcox / Axiom Data Science / @kwilcox / kyle@axds.co
- Josh Rhoades / Axiom Data Science / josh@axds.co
- Shane St Savage / Axiom Data Science / shane@axds.co (Lurking for updates)
- Jonathan Joyce / RPS Group / @jonmjoyce / jonathan.joyce@rpsgroup.com
- Rich Signell / USGS / @rsignell-usgs
- Anthony Aufdenkampe / LimnoTech / @aufdenkampe / aaufdenkampe@limno.com
- Micah Wengren / NOAA IOOS / @mwengren / micah.wengren@gmail.com
- Matthew Iannucci / RPS / @mpiannucci

__Agenda and Notes__:

- Previous
    - [2022-12-09 Xpublish & ZarrDAP meeting notes](https://github.com/xarray-contrib/xpublish/issues/138)
- What have folks been up to
    - Alex
        - Plugins
        - Working on documentation updates
            - https://github.com/xarray-contrib/xpublish/issues/159
            - Eventually working towards the Getting Started/User Guide/API Ref/Contributing structure that other PyData projects have
        - Answering some lingering issues with plugin-based solutions, I'll probably convert those to discussions. Once I'm happy with docs updates and we have a new release, I'll start closing some issues that are no longer really relevant
    - Rich
        - Set up Xpublish Slack channel on ESIP
    - Kyle
        - Have a bunch of FastAPI services already
        - Intake plugin
            - https://github.com/axiom-data-science/xpublish-intake
        - A host project so that they can standardize their deployments
            - https://github.com/axiom-data-science/xpublish-host
        - Deployment?
            - Not at scale yet
            - Dask clusters on the backend
            - THREDDS and OpenDAP aren't keeping up
            - Docker image with volume mounts
            - On prem
    - Anthony
        - LimnoTech for USGS National Hydrological Geospatial Fabric (NHGF, hydrofabric) team
        - Primarily working on PyGeoAPI OGC EDR deployment and async optimization
        - Also tasked with Xpublish OpenDAP endpoints for same datasets
            - Plan to start with Alex's https://github.com/gulfofmaine/xpublish-opendap
            - Alex thinking about looking at ZarrDAP to get relevant
            - Rich will touch base with the ZarrDAP folks
            - Axiom also needs OpenDAP
        - USGS moving a lot of things into the cloud/Zarr
    - Matt
        - Trying to replicate what THREDDS can do for model data
        - Kerchunk harvesting from NODD and then served with Xpublish to serve aggregations
        - [XREDS](https://github.com/asascience-open/xreds) - XaRray Environmental Data Server
            - Serving direct from cloud storage
        - Working on
            - WMS
                - Trying to standardize for different grids
                - Both it and EDR rely on some knowledge of the grids
            - Subsetting
            - Productionizing, only really funded for a prototype, how to scale out access '45 tiles at the same time'
        - Want to be able to deal with FVCOM and ROMS grids
        - Figure out how to make Dask deal with the subsetting
        - Kyle: a higher level interface for wrapping a dataset
- Plugin Discussion
    - Alex's vision is that a server plugin can provide capabilities to other plugins
- A GitHub org for Xpublish
    - Do we want multiple orgs: 'xpublish-community' and 'xpublish-experiments' or similar?
    - Can we move/abstract the zarr plugin to a new repo or different default prefix (i.e. `/zarr`)
- Next release
    - 0.3
        - Plugin support
    - What do we need to make a release?
        - I think if we make a tag then create a GitHub release this workflow will run: https://github.com/xarray-contrib/xpublish/blob/main/.github/workflows/pypipublish.yaml
        - I'm not sure if it will run successfully if I don't have access to the secrets (and will they make the leap to a new org?)
        - Should run when a GitHub release is created, and should already have secrets tied to Joe's account
    - Get zarr prefix
    - Release from the new org
- How often should we meet?
    - Monthly, first Friday of the month, Joe will set up a Google Meet
- Micah: Code sprint?
    - Find some time to virtually collaborate?
    - Post-ESIP summer meeting? Week of July 24 - 28.
- Caching
    - Redis with cachey, multiple operators could work on the same key
    - Maybe get rid of cachey
    - Dask is using cachey under the hood, but isn't
- Catalogs
    - How should endpoints be advertised
- Dask
    - Kyle and Matt have experimented with a few different cluster

__Action Items__:

- Alex: Continue working on docs
- Joe to set up the xpublish-community org and move Xpublish repo there
- Alex will set up community resources
- Alex to set up xpublish-experiments org
- Add /zarr prefix to the Zarr router
- Create a 0.3.0 release
- Create threads discussing caching, dask, and catalogs

__Links to share/explore__:
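
As a follow-up to the 2023-06-02 discussion on mixing sync and async code, here is a minimal sketch of the two ideas in the linked posts: running a blocking function in a worker thread so it does not stall the event loop, and an "await me maybe" helper that accepts either a plain value or an awaitable. It uses the standard library's `asyncio.to_thread` rather than AnyIO, and the helper is a simplified variant of the one in the blog post. `blocking_mean` and `handle_request` are hypothetical names, not part of Xpublish, and the sleep stands in for real IO or compute.

```python
import asyncio
import inspect
import time


def blocking_mean(values):
    """Hypothetical stand-in for a synchronous, blocking computation."""
    time.sleep(0.01)  # simulate blocking IO or compute
    return sum(values) / len(values)


async def await_me_maybe(value):
    """Await `value` if it is awaitable, otherwise return it unchanged."""
    if inspect.isawaitable(value):
        return await value
    return value


async def handle_request(values):
    """Hypothetical async handler that keeps the event loop free."""
    # Run the blocking call in a worker thread instead of on the event loop.
    return await asyncio.to_thread(blocking_mean, values)


async def main():
    from_thread = await handle_request([1, 2, 3])
    # The helper accepts a plain (already computed) value...
    from_sync = await await_me_maybe(blocking_mean([4, 5, 6]))
    # ...or a coroutine that still needs to be awaited.
    from_async = await await_me_maybe(handle_request([7, 8, 9]))
    return from_thread, from_sync, from_async


results = asyncio.run(main())
```

A plugin that has to support both sync and async dataset loaders could use a helper like `await_me_maybe` at the call site, so that it does not need to know which kind it was handed.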

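
Kyle's 2023-06-02 note about a dynamic dataset that invalidates after 10 minutes can be illustrated with a small time-based cache. This is only a sketch under assumptions: `ExpiringDataset`, `fake_loader`, and the injectable `clock` are hypothetical names for illustration, and it does not reflect how xpublish-host actually implements reloading.

```python
import time


class ExpiringDataset:
    """Cache a loaded dataset and reload it once it is older than the TTL."""

    def __init__(self, loader, ttl_seconds=600.0, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._loaded_at = None  # None means nothing has been loaded yet

    def get(self):
        now = self._clock()
        expired = self._loaded_at is None or (now - self._loaded_at) >= self._ttl
        if expired:
            # Reload from the source and remember when we did it.
            self._value = self._loader()
            self._loaded_at = now
        return self._value


# Usage with a fake clock so the example runs instantly.
fake_now = [0.0]
load_count = [0]


def fake_loader():
    """Hypothetical loader; a real one might open a Zarr or NetCDF store."""
    load_count[0] += 1
    return f"dataset v{load_count[0]}"


cache = ExpiringDataset(fake_loader, ttl_seconds=600.0, clock=lambda: fake_now[0])
first = cache.get()   # first access triggers a load
fake_now[0] = 599.0
second = cache.get()  # still within 10 minutes, served from the cache
fake_now[0] = 600.0
third = cache.get()   # TTL reached, the loader runs again
```

Passing the clock in as a parameter keeps the expiry logic testable without sleeping. With multiple server workers, each worker would hold its own copy, which is one reason the notes list cache invalidation as an open issue.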