# GeoZarr Spec Steering Working Group
Note that a [meeting summary](#Meetings-Summary) is provided at the end of the document.
## Meetings Summary
GeoZarr Bi-weekly Call
Next: Wednesday, May 14
Time: EDT: 11:00AM - PDT: 8:00AM - CEST: 3:00PM - [[more](https://www.worldtimebuddy.com/?qm=1&lid=100,12,5128581,5368361&h=100&date=2024-4-17&sln=15-16&hf=1)]
Video call link: https://meet.google.com/jth-rstn-fwb
Or dial: (US) +1 413-350-0808 PIN: 246 812 926#
More phone numbers: https://tel.meet/jth-rstn-fwb?pin=9845739928663
## May 14th, 2024
### Attendees
### Agenda
- Calendar invites? Some folks are receiving them, some are not. I'm inclined to cancel and have folks use this page to know how/when to join. -Brianna
## April 17th, 2024
> `CNL`: not available today (still plan to bootstrap the OGC template with core definitions - during April)
### Attendees
Brianna Pagan (NASA GES DISC)
Felix Cremer (MPI BGC Julia programmer)
Christine Smit (NASA GES DISC)
Martin Durant (Anaconda)
Kevin Sampson ()
Doug Newman (NASA ESDIS)
Ryan Abernathey
Colby Fisher
### Agenda
- Ethan and Brianna tag-up on compression algorithm support
- Had some discussion before that "default zarr compression (blosc) is not a standard compression that netcdf has used so it's not available with NCZarr" but we found documentation showing it should: https://www.unidata.ucar.edu/blogs/developer/entry/nczarr-support-for-zarr-filters Ethan is following up
- Martin: Two major versions of blosc with different codecs, so could not be full support
- Brianna and Christophe tag-up on branch for refactoring existing write-up to OGC template
- Any updates on discussion from Ryan's demo/blog from last meeting: https://discourse.pangeo.io/t/example-which-highlights-the-limitations-of-netcdf-style-coordinates-for-large-geospatial-rasters/4140
- Martin: GRIB <1000 bytes, but always adds coordinates; load with xarray and it will be over 100MB in memory.
- Felix: is there anything julia ecosystem can assist/learn from this
- Ryan: We need a concrete proposal. Every time something simple is suggested, there are many responses as to why it's more complicated, and we feel stuck. https://gdal.org/user/raster_data_model.html
- Martin: implement something like affine transform, and this is the implementation and a way from going from standard tags in geotiff to your explicit implementation would go a long way.
- Ryan: what would success look like, write code that we want to work, then ask where should it be implemented. Ultimately new index type in xarray, but need to define success within the group. Keep analytical not float info. Preserve analytic coordinates
- load data from geotiff save to zarr and load it again and save to geotiff and the coordinates should be the same
- Felix: idea is to save it as GDAL saves it
- Martin: how gdal defines attributes is fine
- Ryan: forget geo, just 1-D, defined analytically, A->B, shouldn't need to save every coordinate. Don't have to treat it as data.
- Felix: in Julia you can use whatever array as a dimension, not sure how these are read/saved
- Martin: not language issue, a library problem, astronomers don't have this problem although they have analytic. Need POC
- Christine: dimensions matter when you're trying to query. when would a tool use this information, you need a function to decide when you need indexes
- Martin: Yes, Logical to analytical indexes function is needed
- Christine: 1) i just want to open the file, i want xarray to do the right thing. 2) actual implementation people who want to get more in the weeds
- Brianna: it's easy enough A->B, but when we add the geospatial, that's where the convo gets blocked
- Ryan: PROJ does this, the difficult part is with serialization, how can we tell that the coordinate is present, how do we identify and is that interoperable for non-python, non-xarray softwares. Xarray developers need to show that I can create an xarray dataset that has this type of analytic coordinate system and query it. After that we can tackle with encoding
- Felix: we have this in julia, if we save to zarr just an array as integer, just a vector, why we need to talk about serialization
- Ryan: can we create a 1D xarray, save to zarr, open in julia, get a properly encoded, save in zarr, pass it back and forth.
- Felix: how would you save it.
- Ryan: for range it's start, stop, # of points, metadata variance you want to save, is it for the center of pixel. Start, stop, or offset/scale, you need to know how many points, already known if it's describing another array. Encode these floating point numbers in a lossless way, in zarr you can put them in metadata or another array, in another array probably not needed, but would encode in an optimal way, putting it into metadata as json, you want to put full bytes rather than txt based rep as number.
- Christine: advantage of separate array, can take CF approach for describing in the metadata with additional attributes
- Ryan: push as much encoding as possible in zarr, virtualizarr. If xarray sees a variable already opened by zarr that has units of days since some day, it recognizes that as time and decodes it.
- Christine: time is more of a pain.
- Ryan: motivated here to figure out decode index, first step is in xarray dev supporters
- Pangeo/NASA funding discussion scheduled for later this afternoon: https://discourse.pangeo.io/t/nasa-funding-and-the-pangeo-ecosystem/4136
- https://github.com/zarr-developers/geozarr-spec/pull/44
- Felix: trying to build and save to disc, not exactly sure how tile matrix set is going to work, if we have some dataset, would we be able to add
- Set up dedicated agenda item
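
Ryan's description of a range coordinate (start, step or stop, number of points, stored losslessly in metadata) can be sketched as follows. This is a minimal illustration, not a proposed encoding: the key names `start_hex`, `step_hex` and `size` are hypothetical, and `float.hex()` is only one possible way to keep every bit of a double in JSON without depending on how a reader parses decimal text.

```python
import json


def encode_range(start: float, step: float, size: int) -> str:
    """Serialize an analytic 1-D coordinate as JSON metadata.

    The key names are hypothetical and chosen for illustration only.
    float.hex() writes the exact bits of the double as text.
    """
    return json.dumps({
        "start_hex": float(start).hex(),
        "step_hex": float(step).hex(),
        "size": size,
    })


def decode_range(doc: str) -> tuple[float, float, int]:
    """Recover (start, step, size) exactly as they were encoded."""
    meta = json.loads(doc)
    return (float.fromhex(meta["start_hex"]),
            float.fromhex(meta["step_hex"]),
            meta["size"])


def coordinate_at(start: float, step: float, index: int) -> float:
    """Compute one coordinate value on demand instead of storing the array."""
    return start + index * step


# Round trip: three numbers describe a 3600-point axis.
doc = encode_range(-179.95, 0.1, 3600)
start, step, size = decode_range(doc)
```

Only three values are stored regardless of the axis length, and any single coordinate is computed when asked for rather than read from an array.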
## April 3rd, 2024
### Attendees
- Brianna Pagán
- Ryan Abernathey
- Ethan Davis
- Tadd Bindas
- Max Jones
- Anthony Cak
### Agenda
- New branch for conforming issue #34 with OGC template (Brianna)
- Ethan: for CF just need groups, arrays and attributes. An extension of Zarr changes the encoding, whereas a convention is just an extra metadata that is visible to any zarr. CF is completely visible to anything that understands netcdf, whereas an extension you need not just zarr, but you need the zarr extension
- Ryan: still in the process of figuring it out for zarr, we can define some conventions, extensions are things that require changes or augmentations to the core data model. If an extension is not understood, you cannot decode the data. An example that needs to be an extension is variable-size chunks: you try to read zarr data and your implementation doesn't know how to decode it. Conventions should still be operable under vanilla zarr, multi-scale is an example, people use this. xarray came up with its own convention for putting convention names. This group should try to do everything through conventions, cross post with OGC. Don't get into the data model layer, like how jsons are structured.
- https://zarr.dev/zeps/draft/ZEP0004.html
- This needs to be linked to the PR for the zarr spec that implements the ZEP https://github.com/zarr-developers/zarr-specs/pull/262
- https://github.com/zarr-developers/zarr-specs/pull/262/files#diff-cacd72e8200bb6b7fb7e9ee8709abb11ecd292bb6c462f0fe402fdc46bb77927
- Describes the xarray-zarr convention
- Ryan: there are aspects where some domains want to adopt units without adopting all CF
- Still would like an example zarr file for https://github.com/zarr-developers/geozarr-spec/pull/44
- Demo from Ryan and blog: https://discourse.pangeo.io/t/example-which-highlights-the-limitations-of-netcdf-style-coordinates-for-large-geospatial-rasters/4140
- In memory information and serialized information, needs to go somewhere in metadata
- Could create custom xarray index that understands
- Ryan will make a post on pangeo discourse, discuss possible solutions, implementing an xarray custom index that supports this type of coordinate system
- Tadd: lazy coordinate system?
- Ryan: maybe implicit, lazy implies there is data just not loading yet. Lazy concept is useful but slightly different
- Tony: This is my exact use case, i have errors trying to create a zarr store using xarray from a large
- Developing tool for checking compliance
- Max: interested in a checker beyond just the convention but also looking at the data, https://discourse.pangeo.io/t/nasa-funding-and-the-pangeo-ecosystem/4136/3
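
The "implicit" coordinates from Ryan's demo can be illustrated with an affine geotransform, the six-coefficient form used in the GDAL raster data model. The sketch below assumes that coefficient order; the function names and the sample grid are hypothetical and are not part of any GeoZarr proposal.

```python
def pixel_to_world(gt, col, row):
    """Map pixel (col, row) to world (x, y) with a GDAL-style geotransform.

    gt = (x_origin, pixel_width, row_rotation,
          y_origin, column_rotation, pixel_height)
    """
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def world_to_pixel(gt, x, y):
    """Invert the transform by solving the 2x2 linear system."""
    det = gt[1] * gt[5] - gt[2] * gt[4]
    dx, dy = x - gt[0], y - gt[3]
    col = (gt[5] * dx - gt[2] * dy) / det
    row = (-gt[4] * dx + gt[1] * dy) / det
    return col, row


# Hypothetical north-up grid with 30 m pixels.
gt = (500000.0, 30.0, 0.0, 4100000.0, 0.0, -30.0)
corner = pixel_to_world(gt, 10, 20)
back = world_to_pixel(gt, *corner)
```

Six numbers stand in for two full coordinate arrays, which is the in-memory saving the demo highlights; how those numbers are serialized is the open question.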
## March 20th, 2024
### Attendees
- Brianna Pagán
- Christine Smit
- Colby Fisher
- Kevin Sampson
- Max Jones
- Christophe Noel
- Steve Olding
### Agenda
- Still haven't heard back on sub-group creation from Scott
- Have folks requesting 'observer' status, this will only come into play when we're voting on items
- Presentation at NOAA Enterprise Data Management Workshop in May
- Can the summary from Christophe: https://github.com/zarr-developers/geozarr-spec/issues/34 be submitted as PR?
- Adapt definitions, in a more agnostic way and map to zarr model, will be good start to adapt
- Writing this with OGC template
- Check that OGC templates are auto converting to pdfs successfully
- Ethan and Kevin as reviewers
- CN: Maximize interoperability of format, tradeoff of datasets encoded in zarr and maximizing the tools, recommendation versus requirement classes, do we allow a zarr to have multiple datasets, dataset with children datasets, this complicates how tools read the files. Have a requirement class which says this geozarr is complex, or the opposite, this geozarr is a simple dataset, maps to native-format
- CS: CF didn't accept group structure, recently added
- Compressor lit review (open action item, still need to tag up)
- From Christine last meeting "default zarr compression (blosc) is not a standard compression that netcdf has used so it's not available with NCZarr," however I am seeing [NCZarr Filter Support](https://docs.unidata.ucar.edu/netcdf-c/current/filters.html#filters_nczarr) a reference to blosc. Can Ethan confirm?
- Will merge: https://github.com/zarr-developers/geozarr-spec/pull/44
- Developing tool for checking compliance
- any lessons learned from CF checkers or COG checkers?
- https://cogeotiff.github.io/rio-cogeo/CLI/
### Action Items
- [ ] Brianna to try to adapt issue-34 as PR using OGC template, Ethan and Kevin (kmsampson) to review
- [ ] Brianna scheduling alternative bi-weekly coworking session to develop a tool that checks if a zarr store is compliant with existing specification
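
As a starting point for the compliance tool, here is a minimal sketch of a single check. The group has not yet defined what "compliant" means, so the rule is an assumption: every array in a Zarr v2 directory store should carry the `_ARRAY_DIMENSIONS` attribute that xarray's Zarr encoding uses. The function name is hypothetical.

```python
import json
import os
import tempfile


def arrays_missing_dimensions(store: str) -> list[str]:
    """List arrays in a Zarr v2 directory store without _ARRAY_DIMENSIONS."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(store):
        if ".zarray" not in filenames:
            continue  # this directory is not an array node
        attrs = {}
        if ".zattrs" in filenames:
            with open(os.path.join(dirpath, ".zattrs")) as f:
                attrs = json.load(f)
        if "_ARRAY_DIMENSIONS" not in attrs:
            missing.append(os.path.relpath(dirpath, store))
    return sorted(missing)


# Build a tiny throwaway store: "good" has the attribute, "bad" does not.
with tempfile.TemporaryDirectory() as store:
    for name, attrs in [("good", {"_ARRAY_DIMENSIONS": ["y", "x"]}), ("bad", {})]:
        os.makedirs(os.path.join(store, name))
        with open(os.path.join(store, name, ".zarray"), "w") as f:
            json.dump({"zarr_format": 2}, f)
        with open(os.path.join(store, name, ".zattrs"), "w") as f:
            json.dump(attrs, f)
    report = arrays_missing_dimensions(store)
```

A real checker would add one such function per requirement and report them together, much as the CF and COG checkers do.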
## March 6th, 2024
### Attendees
Max Jones
Michelle Roby
Ethan Davis
Brianna Pagán
Christophe Noël
Tadd Bindas
Ryan Abernathey
Felix Cremer
Lars Barring
Sean Harkins
### Agenda
- Presentation to OGC netCDF SWG last week (Ethan)
- Specifying the Organizational Structure of GeoZarr (title edited) [#34](https://github.com/zarr-developers/geozarr-spec/issues/34) (Christophe)
- Brianna: this is the same point i bring up below in agenda for how to handle zarr_format
- Christophe: never added the mapping between geozarr and zarr, always has been implied, but we can follow NCZarr approach of how this is mapped.
- Ethan: I found the current spec confusing, spelling out dataset, data array etc. There are some parts that are CF, and other parts that seem to replace CF. If you look at the CF data model, it has a lot of details on how CF works with these kinds of things. Would be good to have netCDF OGC SWG to have more people from CF world to look at this and how to clarify how much CF is used. In terms of NCZarr and xarray, wondering if GeoZarr shouldn't be too specific, if it can handle xarray dimensions, allow for NCZarr construct that will represent same. Big differences between NCZarr zarr-v2 implementation and zarr-v3
- Christophe: it's opinionated if we say that GeoZarr should use all same approaches as CF, until now just using CF for not reinventing the wheel, but minimize the size of the specification
- Ethan: CF doesn't have a lot of requirements for metadata, advantage for allowing whatever CF is in the file and building on top of that, and having the pieces that are making it geozarr compliant. Lots of existing profiles of CF, WMO- for sounding data some example,
- Christophe: if we define something and its aligned with CF, that's a good approach
- Ryan: between v2 and v3 data model is not hugely changing, we shouldn't get too hung up on that. GeoZarr should specify the zarr model, how that is encoded, that's zarr job to manage. Doesn't need to go under the hood.
- Ethan: Not that CF has to be the end all, GeoZarr can reference CF data model and build off of that. Where does CF line up where does GeoZarr need to diverge.
- Tile Matrix https://github.com/zarr-developers/geozarr-spec/pull/44
- Compressor lit review
- From Christine last meeting "default zarr compression (blosc) is not a standard compression that netcdf has used so it's not available with NCZarr," however I am seeing [NCZarr Filter Support](https://docs.unidata.ucar.edu/netcdf-c/current/filters.html#filters_nczarr) a reference to blosc. Can Ethan confirm?
- https://ui.adsabs.harvard.edu/abs/2021AGUFMIN35D0418H/abstract
- Do we have to explicitly add zarr version specs to GeoZarr specs?
- i.e. [zarr-v2 arrays](https://zarr-specs.readthedocs.io/en/latest/v2/v2.0.html#arrays)
- This came to my mind when thinking of how to handle for example consolidated metadata, so we need to make some statement on compatibility of GeoZarr with zarr v2 and v3
- CN: I think so, OGC extension must align specific version (in particular on breaking changes). Zarr v3 is still under development, we should target v2 before v3 is released.
- Some test zarr stores so far have been using v2 consolidated metadata, need some examples of zarr v3 generated zarr stores
- CN: .zmetadata only concatenates metadata of all children, so even in v2, we can specify all out of it (and possibly already add indexes to speed up coordinate and variable discovery).
- Tuesday March 12 Coworking Hour! EST Time: 11:00 AM (UTC-5)
- AOB: consider shifting for a more worldwide time slot in April (EDT: 10AM, CEST: 4PM, UTC: 2PM) ?
- BRP: That is fine, I am working US West hours,
![image](https://hackmd.io/_uploads/rk1VkGIap.png)
- Interoperability issues with opening the example zarr stores in Julia
- What would we want to see from zarr sparse array support https://github.com/zarr-developers/zarr-specs/issues/245
- Tadd: You have a sparse array, and a command to write to zarr, and some codec would save it and read out of, workaround right now is a wrapper that would break up your sparse array in different parts. that's individually stored in zarr. Question: if anyone in this meet uses sparse arrays, what an interface of what they are looking for would be
- Ryan: zarr specifies on disk format, so we can imagine how to store sparse arrays, but to use sparse effectively you need an in memory representation that allows you to query, most programming languages have a sparse array type, and once you have that sparse array you can use it for useful things. Hash regridding between two different grids. Would like to compute this once and save it, and open it quickly. The stumbling block is figuring out what in memory would look like. Can agree on serialization but what are implementations going to do when seeing a sparse
- Sean: What is your primary analysis env? Deepak's prior art here https://ncar.github.io/esds/posts/2022/sparse-PFT-gridding/
- Tadd: A combo of zarr, xarray, dask, depending on how big problem. Smaller problems xarray, the hypersparse matrixes we use, similar to Ryan, some mapping matrix used for calculations, or using sparse.COO
- Sean: with EO data, issues with sparse data cube problem. You have a storage problem as well
- Ryan: Proposal to czi to implement sparse encoding in zarr
- https://github.com/ivirshup/binsparse-python/ ( the binsparse spec proposed in python)
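
The workaround Tadd describes, splitting a sparse array into parts that are each stored as ordinary arrays, can be sketched with a coordinate (COO) layout. This is an illustration in plain Python lists; the key names `shape`, `row`, `col` and `data` are hypothetical and do not follow the binsparse spec or any Zarr codec.

```python
def dense_to_coo(dense):
    """Split a 2-D matrix into the dense parts of a COO representation."""
    rows, cols, values = [], [], []
    for i, row in enumerate(dense):
        for j, value in enumerate(row):
            if value != 0:
                rows.append(i)
                cols.append(j)
                values.append(value)
    return {
        "shape": [len(dense), len(dense[0])],
        "row": rows,     # each of these lists could be written
        "col": cols,     # as its own 1-D array in a store
        "data": values,
    }


def coo_to_dense(parts):
    """Rebuild the full matrix from the stored parts."""
    n_rows, n_cols = parts["shape"]
    dense = [[0] * n_cols for _ in range(n_rows)]
    for i, j, value in zip(parts["row"], parts["col"], parts["data"]):
        dense[i][j] = value
    return dense


matrix = [[0, 0, 3], [0, 0, 0], [7, 0, 0]]
parts = dense_to_coo(matrix)
restored = coo_to_dense(parts)
```

Serialization of this kind is the easy half; as Ryan notes, the open question is which in-memory sparse type each implementation should hand back when it reads the parts.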
### Action Items
- [ ] Felix opening new issue for Julia compatibility
- [ ] Ethan and Brianna tag-up on compression algorithm support
- [ ] Ethan and Brianna tag up on more explicit open questions for CF community
## Feb 21st, 2024
### Attendees
- Brianna Pagán (NASA)
- Ryan Abernathey (EarthMover)
- Ethan Davis (UCAR/NCAR Unidata)
- Amit Kapadia (Planet)
- Michelle Roby (Radiant Earth)
- Tadd Bindas (Penn State PhD Candidate)
- Christophe Noel (Spacebel)
- Kevin Sampson (NCAR, WRF-Hydro, WRF)
- Christine Smit (NASA)
- Colby Fisher
### Agenda
- Co-chairs for OGC sub group: Christophe and Brianna
- Repo updated with OGC formatting
- Zarr Sprint summaries (https://github.com/zarr-developers/geozarr-spec/issues/33)
- HTTP extension, traverzarr mock-up. File browsing. Kevin Booth, same time tomorrow if folks want to join that conversation via Radiant Earth/Source
- Rust object store to be able to query data, replacing fsspec.
- Chunk manifest/virtual concat
- Ryan: chunk-manifest, referencing/pointing to existing chunks from the zarr metadata. virtual-concat of zarr arrays, stacked zarr arrays exposed, similar to ncml, combining into one larger virtual object.
- Ethan: would love to share best practices with ncml.
- Ryan: folks are already kerchunking PBs of data and opening with zarr, but no spec.
- https://github.com/zarr-developers/zarr-specs/issues/288
- Amit: do people want improvements for kerchunking tiffs?
- Ryan: the issue with normal tiffs not COGs, is too many files, sharding can assist with this.
https://github.com/fsspec/kerchunk/issues/325
- Amit: high error rate in storage at a specific scale. More worried about cloud service provider to keep up with rate request.
- Ethan: errors coming from servers, opendap & co. have dealt a lot with this.
- GeoZarr Interoperability https://github.com/zarr-developers/geozarr-spec/blob/main/geozarr-interop-table.md
- Christine: a few open issues, compression, default zarr compression (blosc) is not a standard compression that netcdf has used so it's not available with NCZarr, that impacts NCO, netcdf-python and panoply. For NCO, cannot access things from S3. Panoply has a dev branch that can read zarr stores.
- Ryan: no inherent or default compression for zarr, there is one for python-zarr. This is just a downside of how pluggable zarrs are, there are no standards profile. If in geozarr, we state the min set of compression options that aimed to support. Make narrower recs.
- Christine: would be nice to target the default one in the zarr-python library.
- Ryan: Make a recommendation of min set of compression options that make it compliant.
- Brianna: it would be easier to get a list from netcdf/NCO/NCZarr etc of compressions that work and have that for recommendations, rather than waiting for those tools for blosc. But looks like there needs to be some lit review over current.
- Ryan: would look at conda forge netcdf
- GeoTiff -> GeoZarr PR! https://github.com/zarr-developers/geozarr-spec/pull/42
- Overviews need some coordination with Max/the tile matrix PR below?
- Tile Matrix PR: https://github.com/zarr-developers/geozarr-spec/pull/44
- Move away from consolidated metadata in the spec itself, so no zattrs
- Ryan: just want whatever is chosen to be in the spec, can keep it if folks want, wouldn't get rid of it. Before making a rec, understanding how it could impact existing tools. Push it at the spec level.
- Ethan: linked hierarchies can be fragile
- Ryan: both can be fragile, maybe not one better or worse
- Working session? Monday March 4, 10am-noon EST
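
For the compression discussion above, a small sketch of how a tool could read the compressor recorded for a Zarr v2 array and compare it with a recommended minimum set. The `.zarray` layout (a `compressor` object with an `id` key, or `null`) comes from the Zarr v2 spec; the contents of `RECOMMENDED` are a placeholder only, since the group has not agreed on any list, and the function names are hypothetical.

```python
import json

# Placeholder only: the group has not decided on a recommended set.
RECOMMENDED = {"zlib", "zstd"}


def compressor_id(zarray_json: str):
    """Return the compressor id from .zarray metadata, or None if uncompressed."""
    compressor = json.loads(zarray_json).get("compressor")
    return None if compressor is None else compressor["id"]


def uses_recommended_compressor(zarray_json: str) -> bool:
    """Treat uncompressed arrays as acceptable; otherwise check the set."""
    cid = compressor_id(zarray_json)
    return cid is None or cid in RECOMMENDED


blosc_array = json.dumps({
    "zarr_format": 2,
    "shape": [10000, 10000],
    "chunks": [1000, 1000],
    "dtype": "<f8",
    "compressor": {"id": "blosc", "cname": "lz4", "clevel": 5, "shuffle": 1},
    "fill_value": "NaN",
    "filters": None,
    "order": "C",
})
raw_array = json.dumps({"zarr_format": 2, "compressor": None})
```

Running the same lookup over the example stores would give the lit review a concrete list of codecs to test against NCZarr, NCO and Panoply.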
### Action Items
- [ ] Ethan and Brianna to tag-up about compression issues.
- [X] Brianna schedule a follow up chat with Wietze and Max [Brianna made comments to open PR]
- [ ] Colby/Amit add input into PR-42
## Jan 24th, 2024
### Attendees
- Brianna Pagán
- Matt Hanson
- Michelle Roby
- Amit Kapadia
- Forrest Williams
- Kevin Booth
- Sean Harkins
- Ryan Abernathey
- Christophe Noel
- Ethan Davis
- Kevin Sampson
- Patricia Fricke
- Wietze Suijker
### Notes
- Charter approved November 2023, now an official SWG
- Currently waiting for the OGC to create a subgroup work environment which would allow us to nominate and elect chairs.
- Scott recommended using this meeting to get the list of nominees, then will be voted on
- Tentative; Christophe and Brianna as co-chairs
- Upcoming Zarr sprint with GeoZarr focus: https://lu.ma/Zarr-NYC
- Logistics
- Should we support virtual?
- Virtual as second class citizens?
- Matt H: don't waste time trying to combine in-person and virtual
- Brianna setting up a hub with example code/zarr stores
- What are the real blockers for some open issues?
- Understanding concerns with CF encoding of CRS https://github.com/zarr-developers/geozarr-spec/issues/20
- How to encode typical origin / offset coordinate variables in ZARR? https://github.com/zarr-developers/geozarr-spec/issues/17
- Ryan: Concrete outcome: be able to write raster data in python, read it back in gdal, and write it from gdal, read back in python. Round trip for CRS. Not possible today because gdal and xarray/python world have chosen a different way to represent CRS info.
- Matt: what about case in zarr where you don't have a valid crs, can still read it in gdal, we have gcp for every pixel, interesting exercise how to read zarr data that doesn't have crs assigned and reproject data.
- Ethan: Huge software ecosystem based around netcdf/cf making sure it works with netcdf implementation of zarr support would be a big win. Ethan is chair of OGC netcdf group organizing a meeting end of Feb, one agenda item is looking at geozarr. Dave Blodgett and Brianna will join for this discussion, can be a follow-up to zarr sprint.
- Christophe: GDAL 3.9 per OSGeo/gdal#9108 will be able to infer CRS in a Zarr dataset using a CF-1 grid_mapping variable (basically raw conversion of netCDF CF-1 to Zarr)
- Wietze: Played around with zarr in QGIS which works with latest gdal, quite slow because data is large, doesn't load the data efficiently, no pyramid concept in zarr like geotiffs
- Max: Happy to get together virtually for half day to work on pyramiding component, core decision is everything in the geozarr stac or separate ZEP. Meeting with Sanket next week
- Single entry point, can have POC by end of sprint
- Ryan: We don't want to be spec-first. Focus on making demos - something that didn't work day one that now works on day two. There is a convention, biologists also use it heavily.
- Christophe: Yes focus on POC, but to balance not reinvent the wheel with COG, usual webmap viewer, open layers typically provide BMPs for those formats, if people experienced in this can share their knowledge here
Some good starting points for standard "conventions" for pyramids that are well supported by map viewers:
- COG (GeoTiff Overviews: https://docs.ogc.org/is/21-026/21-026.html#_conformance_class_geotiff_overviews)
- Zoom Levels https://wiki.openstreetmap.org/wiki/Zoom_levels
- OGC WMTS : align with tiling service (typically what is implemented with COG) https://www.ogc.org/standard/wmts/
- Ryan: need to be realistic, we won't accomplish as much as we want only a few hours to actually code and we should be very targeted. How shareable are some of these projects. Wanting to make pyramiding work in QGIS? Can more than one person work on that at a time? Get a candidate list of projects, what level of difficulty, what skills, rank and select 3-4.
- Ryan: do we need to pay someone with gdal expertise? Contact Evan? My inclination is to leave gdal out just based on who has RSVPed so far.
- Sean: writing CF compliant metadata, later verify it works with gdal
- Suggest divide into focus groups at the sprint to address
- Going back to the template:
```
As a [type of User], I need to [do something] with Zarr using [tool X]
```
- https://hackmd.io/t2DWpX1iQEWMKx1Fi4Px7A?both#Let%E2%80%99s-brainstorm
- Sprint focus groups:
- Michelle/Brianna go through the use cases
- Max (virtual): pyramiding
- Ryan/Kevin: http browsable zarr
- Joe (virtual): v3 for zarr-python
- Bidirectional gdal?
- probably after sprint
- Integration of Zarr with STAC Catalogs https://github.com/zarr-developers/geozarr-spec/issues/32
- Ryan: would be awesome to have translator between ZARR and STAC, would have to populate some required attributes in zarr metadata but easily
- Christophe: The data store POC created they typically hold level-1 and 2 products which includes hierarchy like STAC, but without the product you may have different assets
- Ryan: in zarr there is no hidden metadata, all in json
- Sean: Curious from use case perspective, that use existing STAC cube metadata to configure zarr stores, I am against a full STAC based hierarchy.
- Ryan: Something we could do at the sprint, the idea of zarr-http browsable extension, can't list the directories, solved this with consolidated metadata, but not scalable, probably won't propagate to zarr v3, instead we want links between nodes.
- Sean: what would transition look like for this? what would happen to older stores?
- Ryan: consolidated metadata is v2 feature, once we start writing v3 in production, new extension. Consolidated metadata was originally a workaround for slow reads from cloud storage, but major improvements have been made since; now its only remaining use is for unlistable stores. But now there are PBs of data with this hack, fortunately migrating zarr data doesn't involve rewriting chunks, option to migrate data or same data having it exposed via v2 and v3 metadata. Might be a pain, but not fundamentally expensive to rewrite jsons.
- zarr-python is in flux, not put into zarr-v3. Might be released by the sprint or working off v3 feature branch where new things are living.
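
Ryan's point that consolidated metadata involves only JSON, never chunks, can be shown with a short sketch that builds the v2 `.zmetadata` document for a directory store. The `zarr_consolidated_format` and `metadata` keys match what zarr-python writes for v2 stores; the `consolidate` function and the throwaway store below are hypothetical.

```python
import json
import os
import tempfile

METADATA_NAMES = {".zgroup", ".zarray", ".zattrs"}


def consolidate(store: str) -> dict:
    """Gather every metadata JSON file in a v2 directory store into one document."""
    metadata = {}
    for dirpath, _dirnames, filenames in os.walk(store):
        for name in filenames:
            if name not in METADATA_NAMES:
                continue  # chunk files are never opened
            full = os.path.join(dirpath, name)
            key = os.path.relpath(full, store).replace(os.sep, "/")
            with open(full) as f:
                metadata[key] = json.load(f)
    return {
        "zarr_consolidated_format": 1,
        "metadata": dict(sorted(metadata.items())),
    }


# Throwaway store: one root group, one array "temp" with a single chunk file.
with tempfile.TemporaryDirectory() as store:
    os.makedirs(os.path.join(store, "temp"))
    files = {
        ".zgroup": {"zarr_format": 2},
        "temp/.zarray": {"zarr_format": 2, "shape": [2, 2], "chunks": [2, 2]},
        "temp/.zattrs": {"units": "K"},
    }
    for rel, content in files.items():
        with open(os.path.join(store, rel), "w") as f:
            json.dump(content, f)
    with open(os.path.join(store, "temp", "0.0"), "wb") as f:
        f.write(b"\x00" * 32)  # stand-in chunk bytes
    consolidated = consolidate(store)
```

Because the walk touches only small JSON files, regenerating or migrating this metadata is cheap relative to the data volume, which is why rewriting it is "not fundamentally expensive".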
### Action Items
- [ ] Open new issue on github with proposed tasks, get community feedback by Wednesday Jan 31, reach out to list of RSVPs for zarr sprint for which task people want to join
- [ ] Create a template for what the task plan would be, assign leader, write AC
- [ ] Create a template for the structure of the spec, those templates to collect after the sprint, will happen in these discussions
## August 16th, 2023
CANCELED
### Updates
- BP: Scott Simmons to schedule public comment on charter the week of August 28th in the AM for US time zones. Will post details to this thread once confirmed day/time.
- CN: from TC-Announce "The SWG proposers will hold a webinar to highlight the planned activities and answer any questions. The webinar is scheduled for 30 August 2023 at 1400 UTC / 1000 EDT"
## August 2nd, 2023
Time: 11h-12h EDT
Meeting Link: https://meet.google.com/qdi-uwrs-pmu
Or dial: (US) +1 318-702-0039 PIN: 589 902 807#
More phone numbers: https://tel.meet/qdi-uwrs-pmu?pin=8010028568068
Attendees
- Christophe Noel (Spacebel)
- Ethan Davis (UCAR)
- Tyler Erickson (VorGeo)
- Brianna Pagán (NASA GES DISC)
### Agenda
- AGU abstract submission: https://agu.confex.com/agu/fm23/prelim.cgi/Session/192249
- The GeoZarr Standard Working Group (SWG) is chartered to develop a Zarr encoding for geospatial gridded data in the form of Zarr conventions (based on the approach described in the draft Zarr Enhancement Proposal 4). Zarr specifies a protocol and format used for storing Zarr arrays, while GeoZarr defines conventions and recommendations for storing multidimensional georeferenced grid of geospatial observations (including rasters). The GeoZarr SWG will also work on improving Climate and Forecast (CF) metadata conventions if necessary, particularly for alternative coordinate reference system encoding if relevant. Since January 2023, a community effort has been convening bi-weekly to take the next steps in the OGC process, with a draft charter submitted for consideration in July 2023. This presentation will provide a status of the charter, walk through core aspects of the specification, as well as optional conformance classes and finally demonstrate interoperability with common geospatial tools with example zarr stores.
- Hackathon for implementation examples
## July 19th, 2023
CANCELED
Brianna: Many of us are attending ESIP in Vermont. I provided Scott from OGC the link to the merged charter PR. His reply:
> This is a really good charter - thanks for the excellent work!
>
> The next step is to go to Member and public comment. I can get the document to the right place and kick off that process. The comment period last 3 weeks after which the proposed SWG needs to be presented to Membership and a vote initiated. For these last steps, the presentation could occur in our Closing Plenary at our next Member Meeting in Singapore on 28 September or via webinar before that date.
>
> Please let me know if you would prefer to have this presented in Singapore or via webinar… either would happen after the member/public comment ends.
I personally have the preference to have it presented via webinar rather than the next member meeting in Singapore - please leave a comment if you have a different preference.
## July 5th, 2023
CANCELED
Please provide last feedback/approval for charter PR by Friday July 7th.
### Agenda
- CN: I would like to address a critical point concerning ESA and Spacebel: the primary/original goal of GeoZarr is providing native geospatial functionalities (serverless) in Zarr, such as **visualization** (akin to COG), optimized access for analysis (like OGC API coverage), etc. While it is highly beneficial to add a range of other objectives to this foundation (encoding refinements/improvements are not needed by our own customers), those joining the initiative should be aware of the original objective so it can be pursued and extended, as it remains a fundamental aspect for some participants (otherwise, why not start another format project from scratch?). Note also that some specific/advanced aspects might be explored in a later version of the spec (e.g. symbology is not a primary aspect but I will elaborate during the project how it is the ideal companion of multiscales for visualisation). Finally, remember that OGC SWG philosophy is based on inclusion...
- BP: Christophe - we discussed some of these points extensively in the last call, I tried to capture some notes below. I think it's important for us within the SWG to identify what parts of the specification are part of the core, and what are part of conformance classes. For example, the visualization piece, that is centered around compatibility with tools. But as there are many use cases for even viz itself, we wouldn't want to require very specific specifications, rather have optional conformance classes that are available for guidance. The example of pyramiding/multi-scaling, not every user of zarr needs this, but there should be guidance of how to link this when a zarr convention is released. See my last comment on the PR, I suggested canceling this weeks meeting, for preference of any last async discussions needed. I think we are going around without decision in the Charter, and we need to have some of these discussions in the SWG itself.
- CN: Yes, the comment is fine and I adhere to conformance classes, as it's something I have mentioned from the start (although it is generally not reported in the charter). For information, as per OGC API common, core of geozarr should typically be one/multiple conformance classes also (and personally, I don't see any topic which is needed by every user of Zarr). I believe the charter is in a very good shape now.
## June 21st, 2023
Time: 11h-12h EDT
Meeting Link: https://meet.google.com/qdi-uwrs-pmu
Or dial: (US) +1 318-702-0039 PIN: 589 902 807#
More phone numbers: https://tel.meet/qdi-uwrs-pmu?pin=8010028568068
Attendees:
- Brianna Pagán (NASA GES DISC)
- Amit Kapadia (Planet)
- Matthew Hanson (Element84)
- Alexey Shiklomanov (NASA ESDIS)
- Ethan Davis (UCAR)
- Ryan Abernathey (Earth Mover)
### Agenda
- Attendance of OGC meeting, have not received much feedback.
- BP: FWIW, I do not think we need to resolve every commit in the draft charter, I would assume some of these points can/should be discussed as part of the formalized SWG. Scott did say we should submit a charter with enough consensus, but feels like there should be path forward to discuss points.
- MH: Would like to include a slide on GeoZarr for FOSS4G conference.
- [ ] Brianna send pitch deck
- Open PR for draft OGC Charter: https://github.com/zarr-developers/geozarr-spec/pull/23. Some points I would like to discuss in person
- [Visualization in spec: yes/no.](https://github.com/zarr-developers/geozarr-spec/pull/23#discussion_r1204361714)
- Alexey/Tyler: remove
- David/Christophe: keep
- Solution: leave it worded 'as possibly' and discuss as part of charter meetings?
- Brianna will remove "(2) Visualisation: Simplifying the creation and display of geospatial data in web browsers without the need for complex workarounds, making geospatial information more accessible to users." and add to the (1) compatibility point that the spec should be compatible with viz software.
- [Multi-resolution and other 'upstream' zarr non-geospatial specific functionalities](https://github.com/zarr-developers/geozarr-spec/pull/23#discussion_r1204385728)
- We know of existing work, this work needs champions, Brianna brought it up in bi-weekly zarr meeting, need to have follow-up with Josh Moore, regardless, do we table these points? wait for zarr conventions then adopt as needed? if so what do we do in the meantime?
- CNL: I think we don't need to wait. However, I would encourage each member to tackle aspects of interest one by one, then see how to manage and how to consider the work in a consolidated specification, with potentially categories or "conformance classes" focused on domains.
- ED: Does not have to be the only group pushing multi-scale, but geozarr can participate, can reference other documents that will be worked on.
- MH: Aligned with OGC API features. Somewhat two-way relationship
- Same as above for 'upstream' - optimized rechunking?
- RA: Would be ideally another convention, could also have multiple chunking schemes, already implemented, creating rechunked versions. This is domain independent. Stand alone convention in zarr, fine to say this is something we want, don't think geozarr needs to specify.
- MH: Sounds like a **conformance class** - called a requirements class. In COG, you have core conformance, all of the properties that make it a COG, like table of contents, image file directory, tile chunks. Having overviews at reduced resolution is not necessary, this is optional, a separate conformance class.
- RA: Working group decides what is in conformance class versus core.
- ED: Multi-res and multiple chunking is interesting for many communities. If there is a way to make it abstract that would be awesome. Conformance class is within spec, or you can have core or higher level spec documents.
- [Inclusion of symbology](https://github.com/zarr-developers/geozarr-spec/pull/23#discussion_r1204595286)
- MH: These have been stored in STAC, the classification extension of STAC can be used.
- AS: This is way too specific. Goal should be the fewest people have to care/adjust for a special case GeoZarr. Maybe netcdf API needs to understand geozarr, but everything downstream has to know/care.
### To Do
- []
## June 7th, 2023
CANCELED
Due to attendance at the OGC meeting
Will provide updates async
Thank you for everyone who is actively working on the charter PR.
## May 24th, 2023
Time: 11h-12h EDT
Meeting Link: https://meet.google.com/qdi-uwrs-pmu
Or dial: (US) +1 318-702-0039 PIN: 589 902 807#
More phone numbers: https://tel.meet/qdi-uwrs-pmu?pin=8010028568068
### Agenda
- Open PR for draft OGC Charter: https://github.com/zarr-developers/geozarr-spec/pull/23
- Please add your name as you see fit under 8.5. Supporters of this Charter
- **Please submit feedback no later than next Wednesday May 31st.**
- Met with Radiant Earth Foundation re: Cloud-Native Geospatial Foundation https://cloudnativegeo.org/
- Asked to fill out a survey and help draft a blog post
- Circle back with Amit on providing a file to demo for https://github.com/zarr-developers/geozarr-spec/pull/19
- Meeting with Google Earth Engine Ingest team last week, more related to pangeo-forge, but they had interest as well in figuring out ways to work with zarrs with GEE.
- Alex Merose to try and demo the ability of working with a zarr store directly in GEE
## May 10th, 2023
Time: 11h-12h EDT
Meeting Link: https://meet.google.com/qdi-uwrs-pmu
Or dial: (US) +1 318-702-0039 PIN: 589 902 807#
More phone numbers: https://tel.meet/qdi-uwrs-pmu?pin=8010028568068
Attendees:
### Agenda
- OGC SWG Update
- June 05 - next OGC meeting, Scott is happy to set up time for discussion and socialize session, should be options to join remotely. Would need some draft in any state before then. Next Member meeting is Sept 25.
- Group can use June 05 as deadline for draft
- https://github.com/zarr-developers/geozarr-spec/pull/19
## April 26th, 2023
Time: 11h-12h EDT
Meeting Link: https://meet.google.com/qdi-uwrs-pmu
Or dial: (US) +1 318-702-0039 PIN: 589 902 807#
More phone numbers: https://tel.meet/qdi-uwrs-pmu?pin=8010028568068
Attendees:
- Amit Kapadia / Planet
- Sean Harkins / Dev Seed
- Brianna Pagán / NASA / @briannapagan
- Ethan Davis / UCAR
- David Blodgett / USGS
### Agenda
- Add GeoTransform as implemented by GDAL https://github.com/zarr-developers/geozarr-spec/pull/19
- Added suggested reviewers
- Handling requests moving forward
- Need an example zarr file
- Sean: Sean Gillies, maintainer of rasterio, suggest to add to PR @sgillies
- Amit can take the lead on encoding an example zarr
- https://github.com/zarr-developers/geozarr-spec/issues/10 Might be good enough, some slight changes needed to the following metadata below:
Metadata from example zarr file
```
<xarray.DataArray 'sr' (datetime: 365, band: 4, row: 256, col: 256)>
dask.array<open_dataset-e9e64732163986119bc61b23a21b10a4sr, shape=(365, 4, 256, 256), dtype=int16, chunksize=(365, 4, 32, 32), chunktype=numpy.ndarray>
Coordinates:
* band (band) <U5 'blue' 'green' 'red' 'nir'
* datetime (datetime) datetime64[ns] 2021-01-01 2021-01-02 ... 2021-12-31
Dimensions without coordinates: row, col
Attributes:
_CRS: PROJCS["WGS 84 / UTM zone 14N",GEOGCS["WGS 84",DATUM["WGS_19...
_TRANSFORM: [3.0, 0.0, 696000.0, 0.0, -3.0, 4536000.0]
```
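For reference, a minimal sketch of how the `_TRANSFORM` attribute above could be read. It assumes the six values follow the `(a, b, c, d, e, f)` ordering used by the `affine`/rasterio libraries, i.e. `x = a*col + b*row + c` and `y = d*col + e*row + f`; the notes do not state the ordering, so treat this as an assumption. The helper names are hypothetical.

```python
# Values copied from the example metadata above.
TRANSFORM = [3.0, 0.0, 696000.0, 0.0, -3.0, 4536000.0]
SHAPE = (256, 256)  # (row, col) sizes from the DataArray repr


def pixel_to_world(transform, col, row):
    """Map a pixel corner (col, row) to projected (x, y).

    Assumes affine-library ordering (a, b, c, d, e, f); this is an
    assumption, not something the metadata itself declares.
    """
    a, b, c, d, e, f = transform
    return a * col + b * row + c, d * col + e * row + f


def bounds(transform, shape):
    """Return (left, bottom, right, top) for a north-up grid of this shape."""
    rows, cols = shape
    left, top = pixel_to_world(transform, 0, 0)
    right, bottom = pixel_to_world(transform, cols, rows)
    return left, bottom, right, top


print(bounds(TRANSFORM, SHAPE))
```

Under that assumption the 256 x 256 grid has 3-unit pixels and spans 768 units in each direction from the upper-left corner at (696000, 4536000).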
- Standards Working Group OGC Draft: https://github.com/zarr-developers/geozarr-spec/blob/ogc-charter/CHARTER.adoc
- Timelines
- Tentatively meeting Scott May 3rd 09:30EDT
- David: Fits as a community standard better than SWG. Zarr v2 is a community standard. CF-baseline is not governed in OGC. CF is its own community and has its own governance structure. This is building up from CF and zarr which feels more like a community standard. Also if we want to go with SWG we have to write a charter, get it approved and only then we can start working, so could also delay things. We need operators.
- Ethan: Community can be easier and faster, on the other hand getting something like this going and governance issues come up. NetCDF came into existence before the community standard track.
### Action Items
- [ ] Amit creating example zarr with geotransform PR from David
- [ ] Brianna can lead interoperability testing with example zarr posted from Amit
- [ ] David adding Sean Gillies to thread, might have to explicitly ask Sean + Alan to submit a review.
## April 12th, 2023
Time: 11h-12h EDT
Meeting Link: https://meet.google.com/qdi-uwrs-pmu
Or dial: (US) +1 318-702-0039 PIN: 589 902 807#
More phone numbers: https://tel.meet/qdi-uwrs-pmu?pin=8010028568068
Attendees:
- Brianna Pagán / NASA GES DISC / @briannapagan
- Matthew Hanson / Element84
- Amit Kapadia / Planet
- David Blodgett / USGS
- Ryan Abernathey / Earthmover
- Christophe Noël / Spacebel
- Ethan Davis / Unidata
### Agenda
- Scott from OGC - will not be able to make it
- Questions Christophe posed to Scott. We are unsure if this aligns with the expectations for an OGC Implementation Standard and whether it is suitable for an SWG. We would greatly appreciate your input on the following:
- Are the objectives of the GeoZarr working group compliant with the requirements for an OGC SWG?
- Is the Zarr ZEP-4 document suitable for developing an OGC Implementation Standard?
- Scott's response:
- In short related to Christophe’s question: yes, the objectives are definitely in line with OGC intent and the ZEP-4 document includes information that is suitable for an OGC Standard. BUT, OGC would still need a formal Standard document to reference that looks quite different from the ZEP-4 document.
- Here is how I can see this working. The content of a ZEP would need to be described as clearly-defined requirements that are testable. The OGC template for a Standard [1] would then be populated from the ZEP text and requirement(s). The Standard could be very short - no need to write hundreds of pages if very few are needed. Finally, the enhancement (which I suppose would be extended, but optional functionality for Zarr) could also be described as is best for the Zarr community as an included Annex in the Standard so that Zarr users see what they are used to and OGC Standard readers also see what they expect.
- Any open questions/comments from: https://github.com/zarr-developers/geozarr-spec/issues/14#issuecomment-1503548600
### Discussions
- Matt: last thing people want is to be handed a pdf with the spec and asked to implement, but if you give people tools, that's more successful, which is why STAC was successful.
- Come up with something like a basic spec, then make it work.
- David: Build on CF conceptual model that lends itself to a nice zarr implementation.
- Ryan: If we can say zarr is using CF convention for CRS then gdal would be able to decode. Whereas if geozarr is something separate it would have to be implemented at a different layer. Use case in practice is data cubes.
- David: building on top of CF convention instead of netcdf, a core assumption we need to build on. Wouldn't go through full standard track, we would already come with something working. What has emerged is a desire to write a zarr convention using OGC process to write convention. I think where we left off - we would write a charter for a SWG - what done looks like and kickoff a SWG spec dev process through OGC where result would be presented as a zarr convention through ZEP4 process.
- Matt: Advantage of community standard requires the right people and we don't have all of it. Maybe we need more outreach to get the right people.
- Matt: what is the role of STAC in all of this? Is it orthogonal to using CF?
- Ryan: netCDF, COG, zarr, all different assets in a STAC collection. STAC is for searching. STAC can be useful for many data cubes. Zarr is a catalogue. Zarr <> STAC same conceptual level.
- Matt: seems a lot of people are using zarr for smaller datasets that aren't global. I still want to do that geospatial query, is the answer there that it's not a good use case?
- Christophe: cloud native data store using zarrs from ESA. For each directory, there will be a zarr file including the metadata. Also a STAC file describing.
- David: Something that looks/feels like nczarr, an incremental add on, bring on WKT, this is geozarr: it builds on CF, it breaks some netcdf
- Amit: gdal is missing the transform when using zarrs. Band shuffling needed, CRS needed. Time series remote sensing.
- Ryan: gdal writes a zarr, xarray tries to open, cannot because xarray using netcdf model. Add origin offset/transform... why are we doing it in zarr? This is a netcdf issue. xarray built on netcdf model not CF.
- David: wouldn't break anything in xarray, just xarray wouldn't understand.
- Ryan: we have that with rioxarray.
- Ethan: So many tools that already read netcdf, haven't been written for the cloud. The netcdf model is just arrays/attributes. CF is the convention that tells you what attributes to put in there to identify things. Not sure where on the netcdf/CF continuum this lies. The basic stuff like coordinate variable, 2d lat/lon, there's a lot of stuff, anything that's from geoworld is going to know how to deal with that. CF + CRS... problem is not enough people pushing, that's what it takes, a concerted effort and to be willing to work.
- Ryan: xarray cherry picked, can decode time, but nothing else lat/lon. cf-python, implementing xarray like thing.
- David: can we ask Evan...
- Ryan: solve gdal <> xarray problem.
- Christophe: very specific, intent of ESA is to migrate all datasets to standard format, I believe our expectations was not only to have a data format, but to have an alternative to geodatacubes, geozarr is alternative to datacubes, all functionalities wouldn't be available, but this means holding multiple projections or scales of the data.
- Ryan: we want to fix interoperability with zarr, Christophe has an opinionated idea of what is in a geodatacube to be serverless. geozarr is how to put crs in zarr.
### Action Items
- [ ] write a PR to describe how to put in origin offset metadata in zarr, can use what gdal, can we prototype interoperability, just around crs
- [ ] Divide crs problem vs geodatacube standardization
- Christophe + Brianna to focus on geodatacube + OGC
## March 29th, 2023
Time: 11h-12h EDT
Meeting Link: https://meet.google.com/qdi-uwrs-pmu
Or dial: (US) +1 318-702-0039 PIN: 589 902 807#
More phone numbers: https://tel.meet/qdi-uwrs-pmu?pin=8010028568068
Attendees:
- Brianna Pagán / NASA GES DISC / @briannapagan
- Ryan Abernathey / LDEO / Earthmover / @rabernat
- Christophe Noel / Spacebel (for ESA) @christophenoel
- Sean Harkins / Development Seed / @sharkinsspatial
- David Blodgett (USGS)
- ... ?
### Summary
Participants debated whether to develop a spec extension or a convention, with Ryan suggesting the latter. They also discussed the potential of going through the OGC SWG process, with Christophe advocating for its formation and Brianna and Sean acknowledging the benefits of a parallel effort. They also discussed the roadmap, existing conventions, and encoding differences between GDAL and Xarray. The conversation touched upon the relationship between GeoZarr and other standards, as well as the implications of incorporating CF conventions and CRS.
### Agenda
- Discuss upcoming CEOS WGISS #54 meeting on April 19th, 2023.
- https://github.com/zarr-developers/geozarr-spec/issues/11
- ZEP Process - Confirmed?
- Christophe: Is our objective to adhere to the ZEP process confirmed?
- Ryan: I believe our work is better suited as a convention. It doesn't require any core changes or extensions to Zarr; it's just a set of guidelines for storing metadata and organizing data within the existing Zarr framework. We can post this proposed convention on the Zarr website and follow the process. It doesn't need a ZEP (Zarr enhancement proposal), in my opinion.
- OGC Process - Confirmed?
- Christophe: Is our objective to adhere to the OGC process confirmed?
- Brianna: I was under the impression that going through the OGC process is a good idea since it can be done in parallel. Whatever we decide as the convention can be presented to OGC. I'm curious about others' viewpoints. From a NASA perspective, yes.
- Sean: You're right, Brianna. As long as it's a parallel effort that doesn't distract from our main work, it's a fine approach. However, it may be a slow, friction-filled process. Having more existing traction and widespread adoption could act as a forcing function for acceptance.
- Ryan: I believe we should discuss a roadmap. We have engaging conversations every two weeks, but I don't see a clear path for converging, aligning, and implementing our spec.
- David: I'd like to point out that there's an HDF SWG, so having a Zarr SWG wouldn't be out of line with current practices. I'm not advocating for it, but it could be a way to discuss encoding data in Zarr within the OGC sphere. There's also a NetCDF SWG for encoding the semantics of NetCDF data, which is separate from the binary encoding. Additionally, there's a GeoTIFF SWG, etc.
- Brianna: I agree with Ryan that we don't have much to show yet. My perspective is that while I want to pursue the OGC route and don't mind if it takes a year, I'm more focused on what we can start referencing. I've added to the agenda and invited Denis from NCZarr, but I'm not sure he received my invitation.
- Christophe: With regard to Sean's comment, writing conventions on your own to impose adoption seems to me to be completely contrary to a standardization process. Indeed, we should gather all experts (including from the OGC community) as soon as possible in order to better represent all use cases and gather all the skills in our team. OGC SWG provides the opportunity to gather other ideas and be supported by research projects (such as OGC Testbed), mailing, etc. So why not start the creation process immediately?
- Ryan: Christophe has a point – what's the harm in starting the SWG creation process? It will take time, and some people will be pleased to know we've begun setting up the SWG. It won't happen quickly, and since we're already moving slowly without a clear process, maybe the OGC could provide the structure we need. I guess I'm in favor of that.
- Brianna: I'm mindful of potential issues with OGC, which may create barriers for some people to participate. If we have someone actively contributing who's not an official member associated with a company, I'd want to help facilitate a fairer process, but only if we have someone in that situation.
- Ryan: A counter proposal worth considering is that we could develop a convention ourselves and present it to OGC as a community standard, essentially saying, "Here it is. Take it or leave it."
- David: https://gisandscience.com/2014/06/11/ogc-seeks-comment-on-charter-for-new-netcdf-standards-working-group/
- Official votes are only by OGC orgs
- We want this to be a community standard
- Roadmap
- Ryan: Perhaps we should discuss the roadmap and the specific work that needs to be done first, and then revisit this question later on in the meeting.
- Ryan: Do we piggyback off CF or do we create a new, separate standard? Existing zarr community standard doc says, we put netcdf data into Zarr. The issue is that we put NetCDF data into Zarr, but complications arose when GDAL did something differently, creating a new way of incorporating geospatial data into Zarr. Now, we have two competing conventions as a result of our discussions. Christophe's original is aligned with CF approach.
- Brianna: how is this related to nczarr if at all.
- Ryan: another standard, not zarr compliant
- David: binary encoding of CF, but also include geotiff use cases.
- Ryan: We already have on file at OGC how to put netcdf data into zarr, so if the question is how to encode imagery, maybe we look at CF conventions, not OGC. We would need a very coordinated proposal that is signed off on by heavy hitters, which says this is what we want in CF, if not put in, we will fork CF.
- Christophe: current draft of geozarr essentially reuses standard names from CF (other stuff is optional) so I don't know if we really want to apply all of the very substantial CF conventions. For concerns about SWG: It's up to the chair to decide what we do with requests from external people.
- Ryan: Transform concept in CF
- Sean: If specify a transform based, will it break things? can we keep parallel representations?
- Ryan: We will always have zarr data encoding netcdf into zarr, do we want another route there, where you don't care about full netcdf compliant dataset.
- Sean: don't we want to push CRS in CF?
- Invited Dennis from nczarr: https://docs.unidata.ucar.edu/nug/current/nczarr_head.html
- https://github.com/zarr-developers/zarr-specs/issues/41
- Notes from convo:
- https://github.com/ome/ngff/issues/174
- https://portal.ogc.org/files/100727
- http://www.opengis.net/doc/CS/zarr/2.0
- https://www.ogc.org/standards/community/
- https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in
- Moving Forward
- Update existing zarr community guidance about how to store geospatial data in zarr
- In parallel, improve CF conventions
- Leverage zarr v3 for our work
## Canceled: ~~March 15th, 2023~~
Time: 11h-12h EST
Meeting Link: https://meet.google.com/qdi-uwrs-pmu
Or dial: (US) +1 318-702-0039 PIN: 589 902 807#
More phone numbers: https://tel.meet/qdi-uwrs-pmu?pin=8010028568068
**Please provide any updates/requests on https://github.com/zarr-developers/geozarr-spec/**
For those interested in a co-working session I am blocking off time next Monday and Tuesday afternoon (EST) to make progress on the numerous use cases we've defined
## March 1st, 2023
Time: 11h-12h EST
Meeting Link: https://meet.google.com/qdi-uwrs-pmu
Or dial: (US) +1 318-702-0039 PIN: 589 902 807#
More phone numbers: https://tel.meet/qdi-uwrs-pmu?pin=8010028568068
Attendees:
- Brianna Pagán / NASA GES DISC / @briannapagan
- Ryan Abernathey / LDEO / Earthmover / @rabernat
- Amit Kapadia / Planet
- Aaron Friesz / NASA LP DAAC
- Sean Harkins / Development Seed / @sharkinsspatial
- Matthew Hanson / Element 84 / @matthewhanson
### Summary
Participants discussed their progress and shared updates on tasks from the previous week. Brianna provided a small Zarr store example, and the group acknowledged that they felt stuck. Sean shared a use case focusing on browser-based visualization, while Ryan and Brianna suggested working with example Zarr stores to identify any issues. The group also discussed the GeoZarr spec, example workflows, and the need for support for rasterio's CRS model in Zarr. The participants agreed to work with the provided example Zarr stores and to build example notebooks based on these datasets. The next steps include Amit sharing sample data and all members continuing to develop example notebooks.
### Agenda
- Updates to last week's to-dos
- Brianna: Example small zarr store, taking 6 time slices from https://disc.gsfc.nasa.gov/datasets/GLDAS_NOAH025_3H_2.1/summary. Can download here: https://tinyurl.com/small-zarr-example
- Ryan: Feels like we're stuck
- Sean: sketching out a use case, browser based viz
- Ryan: viz can bring in complicated issues. There is software that understands geospatial info and some that does not. For example netcdf and xarray hold geospatial info and do not do anything with it. GDAL must understand it. Achieve interoperability between the two chains.
- Sean: biggest use case for rioxr is writing out external netcdf files from xarray dataset created in analysis env. Focused on writing netcdf from source dataset.
- Brianna: I prefer to send out a zarr store, people trying to use it, see what breaks.
- Ryan: the geozarr spec is written down, but not implemented.
- Brianna: Provided a netcdf based zarr let's get a tiff based zarr out there. Have people try to work with it.
- Ryan: example workflow
- Can I open data with Xarray / Zarr and then pass it to rioxarray? Can we generate the rioxarray `spatial_ref` variable from a generic dataset?
- In memory rep of geospatial and then serialization
- Could we make a zipped geozarr that is functionally identical to a single COG?
- Let's try to actively work with example zarr posted.
- https://stackoverflow.com/questions/69228924/how-to-convert-zarr-data-to-geotiff
- for opening as zip ```group = zarr.open_group(zarr.ZipStore('GLDAS_NOAH025_3H.zarr.zip', mode='r'), path="GLDAS_NOAH025_3H.zarr", mode="r")```
- Next step for this dataset clipping by shapefile
- Amit: This rioxarray method claims to write CRS back to the xarray dataset attributes. I don't see that it works. https://corteva.github.io/rioxarray/stable/rioxarray.html#rioxarray.rioxarray.XRasterBase.write_coordinate_system
- Can GDAL understand non-evenly spaced lat / lon coordinate data?
- Matt: no. You have to create GCPs
- Amit: we need support for rasterio CRS model for Zarr
- Ryan: GDAL has a zarr driver, it's not interoperable with the netcdf style tool chain. What is the use case for wanting to use zarr in this context?
- Amit: working with time series data, temporal queries quickly. Zarr is interesting for in analysis, not archival.
- GDAL Zarr driver: https://gdal.org/drivers/raster/zarr.html
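On the evenly-spaced question above: a regular grid can be collapsed into a six-parameter affine transform, while irregular coordinates cannot (hence Matt's point about GCPs). The following is a rough sketch of that check, assuming 1-D coordinates that mark pixel centres (as is common for CF-style data) and an arbitrary relative tolerance; the function name and the tolerance are hypothetical, not part of any library.

```python
def transform_from_coords(x, y, rel_tol=1e-6):
    """Derive (a, b, c, d, e, f) from 1-D pixel-centre coordinates.

    Returns None when either axis is not evenly spaced, in which case a
    plain affine transform cannot describe the grid.
    """

    def step(values):
        # Constant spacing of a coordinate axis, or None if irregular.
        if len(values) < 2:
            return None
        diffs = [v1 - v0 for v0, v1 in zip(values, values[1:])]
        first = diffs[0]
        if first == 0:
            return None
        if any(abs(d - first) > rel_tol * abs(first) for d in diffs):
            return None
        return first

    dx, dy = step(x), step(y)
    if dx is None or dy is None:
        return None
    # Shift by half a pixel: centres -> outer corner of the first pixel.
    return (dx, 0.0, x[0] - dx / 2, 0.0, dy, y[0] - dy / 2)


print(transform_from_coords([0.5, 1.5, 2.5], [9.5, 8.5, 7.5]))
print(transform_from_coords([0.0, 1.0, 3.0], [9.5, 8.5, 7.5]))
```

The first call describes a regular north-up grid; the second has an irregular x axis, so no transform is returned.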
TODO:
- [ ] Amit shares some sample data
- [ ] All continue building up example notebooks from posted datasets
## February 15th, 2023
Time: 11h-12h EST
Meeting Link: https://meet.google.com/qdi-uwrs-pmu
Or dial: (US) +1 318-702-0039 PIN: 589 902 807#
More phone numbers: https://tel.meet/qdi-uwrs-pmu?pin=8010028568068
Attendees:
Brianna R. Pagán (NASA GES DISC)
Aaron Friesz (NASA LP DAAC)
Anderson Banihirwe (CarbonPlan)
David Blodgett (U.S. Geological Survey)
Alexey Shiklomanov (NASA GSFC ESDIS)
Christophe Noel (Spacebel)
Scott (OGC)
### Summary
The team discussed the OGC presentation, the dependency mapping, and collaborations with other ecosystems. Scott provided insights on the process of bringing a spec into OGC as a standard and estimated the time required. The team exchanged ideas on the challenges of transforming source formats into Zarr, including encoding, data models, and implementation in various software. They also discussed the aspects that GeoZarr should address or recommend, such as multiple resolutions, projections, and dimensional optimizations. They identified individuals to explore QGIS/Python, R (stars), and other ecosystems, and shared links to demo Zarr catalogs. To-dos include compiling use cases, revisiting a GitHub issue, and exploring QGIS/Python and R ecosystems.
### Agenda
- OGC presentation/chat by Scott
- OGC model is we have two ways of bringing a spec in as a standard.
- Standard working group process, use OGC resources, githubs etc
- Spec fully developed in community, then apply to become an OGC community standard. External community still owns it. This works well for existing efforts.
- The SWG path is faster in that there is no need to have the spec done, or widely implemented.
- Three OGC members have to support the charter, charter is written and goes through approval process.
- For something already cooking, the whole process could be done in 8-9 months.
- For community standard about the same time of lead time.
- Can be done in parallel, publicly in GitHub. Once everyone feels comfortable enough, can submit. Scott endorses this approach
- David: domain alignment? Scott: discussions about open geo data cube
- Scott has spent a lot of time writing these up, willing to help us with discussions on writing charters
- Dependency mapping: https://github.com/zarr-developers/geozarr-spec/issues/6
- Is this useful?
- Christophe: What is the challenge of bringing the original format to what zarr adds as a functionality which is a n-dimension array. And what features we want from those applications. When you use existing tools you can easily transform source formats into zarr. If you do a simple conversion, you do not get more than the original format.
- Anderson: In Python ecosystem, ongoing effort kerchunk. What can we pull from this?
- Not exclusive to this effort
- We can add this to the mapping
- Alexey: how do we encode? Unidata/netcdf versus GDAL. We need a spec to fit in both paradigms. Is any major use case we're missing?
- David: push back on GDAL, what about level-2 swath?
- Alexey: Let's punt on L2 swath data (irregular pixel sizes); that's not a problem that _anyone_ has solved. But, GeoZarr absolutely should support just linear affine transforms ("rotations") in the way that most GDAL drivers do.
- Since Xarray is so popular, maybe we start by prototyping an Xarray extension for "virtual variables" that parse CRS information and a 6-parameter affine transform (stored as a parameter), just to have something to play with.
- Implementation in GDAL seems a lot harder...but maybe a GDAL person
- GDAL has an [SRS encoding for Zarr](https://gdal.org/drivers/raster/zarr.html#srs-encoding) that might be relevant
- David: yes, two data models. Common data structure. Real dichotomy from CRS vs WKT
- Even: binary numbers is geotiff, netcdf it's text. You can define mapping to EPSG and names. Do we invent mapping
- David: you need to be able to map these parameters to the software. The EPSG registry is a registry of projection models, some are support CRS
- Alexey: how to read netcdf into a raster data model. You can start with opening dataset in xarray, everything is an additional attribute, applying it in
- David: Spec might support more than one implementation, it's about supporting certain functions. If you have a dataset with too many cells for CF, you rep it as GDAL style original offset. Risk of accepting 'optional' fields. Do we expect people who have geotiffs for naming variables?
- Even: More an issue of conversion tool. That is going to be specific to each dataset. Some might have geotiff bands for time, or each time step in a separate file. Not the goal of this spec, you should support n-dimensional array. Two different sides of GDAL, one historical 2-d raster but now we have more recently multi-dimensional, strongly modeled from netcdf and hdf5, few gdal drivers that implement both. QGIS/rasterio, both use 2-d classic version, not multi-dimensional. 3-d gdal n-dim info and n-dim translate.
- Christophe: geozarr recommend at least I can find multiple resolutions or multiple projections. For the bands you need to map multiple files into dimensions of bands, or different convention but this would help needing to know what to do when accessing the data.
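As a starting point for the "virtual variables" idea Alexey raised above, here is a library-free sketch of deriving pixel-centre coordinate values from a stored six-parameter affine transform. It assumes the `(a, b, c, d, e, f)` ordering with `x = a*col + b*row + c` and `y = d*col + e*row + f`, and it only handles unrotated grids; both function names are hypothetical. A real prototype would presumably wrap something like this in an Xarray accessor.

```python
def gdal_to_affine(gt):
    """Reorder a GDAL geotransform (c, a, b, f, d, e) into (a, b, c, d, e, f)."""
    c, a, b, f, d, e = gt
    return (a, b, c, d, e, f)


def virtual_coords(transform, shape):
    """Compute 1-D pixel-centre x and y coordinates from an affine transform.

    transform: (a, b, c, d, e, f), assumed affine-library ordering.
    shape: (rows, cols) of the raster.
    Rotated or sheared grids (b or d non-zero) need 2-D coordinates, so
    they are rejected here.
    """
    a, b, c, d, e, f = transform
    if b != 0 or d != 0:
        raise ValueError("rotated or sheared grids need 2-D coordinates")
    rows, cols = shape
    x = [c + a * (col + 0.5) for col in range(cols)]
    y = [f + e * (row + 0.5) for row in range(rows)]
    return x, y


# Illustrative values only: a 2 x 2 grid with 3-unit pixels.
x, y = virtual_coords(gdal_to_affine((696000.0, 3.0, 0.0, 4536000.0, 0.0, -3.0)), (2, 2))
print(x, y)
```

Keeping only the six parameters in metadata and generating coordinates on demand avoids materializing large coordinate arrays, which is the trade-off between the GDAL-style and CF-style encodings discussed in this meeting.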
NOTE: I would expect GeoZarr to address / recommend such aspects:
* How to describe/access multiple related variables, with heterogeneous coordinates (e.g. children Datasets)
* How to describe/access multiple resolutions of the data (multiscales draft may help )
* How to encode/describe for optimised Map Tiling support
* How to describe/access subsets only available in some resolutions (e.g. an index of the dimensions / resolution)
* How to describe/access multiple projections (index ?)
* How to describe/access multiple dimensional optimisations (rechunking)
* How to describe/access typical EO products (e.g. multispectral band recommended as a dimension of the array)
* How to describe/access time series that have not been normalized (e.g. footprints not aligned)
- David: multi-member ensembles
- Who can steer other ecosystems (R, QGIS, Julia, Javascript)
- Even: If you have 2-d zarr array this should work with gdal/QGIS
- What happens with 3-d, a zarr with time, too many and it would explode.
- (Alexey) NOTE from internal EOSDIS discussion:
- EOSDIS is inter-conversion between Proj strings and CF specification
- HDF_EOS-2 (built on HDF-4 format) is based on GCTP projection specifications
### To Do
- [ ] Christophe to compile some concrete use cases he has access to, optimized on some dimensions.
- [ ] Looking at again https://github.com/christophenoel/geozarr-spec/issues/3
- [ ] Brianna will spend some time looking at the QGIS/python
- [ ] David can take the lead for R (stars)
- [x] Post a small demo zarr? Brianna has been using catalogs from: https://pangeo-forge.org/catalog. Suggesting:
- https://pangeo-forge.org/dashboard/feedstock/81
## February 1st, 2023
Time: 11h-12h EST
Meeting Link: https://meet.google.com/qdi-uwrs-pmu
Or dial: (US) +1 318-702-0039 PIN: 589 902 807#
More phone numbers: https://tel.meet/qdi-uwrs-pmu?pin=8010028568068
Attendees:
- Brianna R. Pagán (NASA GES DISC)
- Ryan Abernathey (LDEO / Earthmover)
- Aaron Friesz (NASA LP DAAC)
- Anderson Banihirwe (CarbonPlan)
- David Blodgett (U.S. Geological Survey)
- Josh Moore (Zarr, very late)
- Alexey Shiklomanov (NASA GSFC ESDIS)
- Sean Harkins (Dev Seed)
- Christophe Noel (Spacebel)
### Agenda
* Time adjustment?
- Ryan: How can we communicate async? Using GitHub
- Will keep it at this day/time
* Will make a more explicit request and formalize some presentations via the suggestions of Ryan:
- Scott or someone from OGC could explain the SWG process (currently RSVP'ed as maybe)
- Matthew Hanson could explain how the STAC process has worked (declined this AM)
- Ryan could explain how the Zarr spec and convention process works
- Christophe could give an overview of where GeoZarr stands today and what are some of the challenges / open questions that have to be addressed (requested for this to be at a later date)
* High level overview of zarr (Ryan):
* Zarr was created in 2015; Alistair created the format and its Python implementation. Both Zarr and N5 (not HDF5) arose from the desire for a simpler, "hackable" data format
* Many implementations of Zarr in different languages, including native implementations
* Shifted to a community model after Alistair exited; Sanket is the funded community director
* Currently working on a [V3 spec](https://zarr-specs.readthedocs.io/en/latest/core/v3.0.html): creates a formal mechanism to extend the spec
* Also working on [conventions](https://github.com/zarr-developers/zeps/pull/28), don't require changes to spec, more for changing variables, downstream applications, more lightweight than a spec
* Is geozarr going to be a convention or spec? Right now it's akin to a convention
* High level overview of existing geozarr spec repo (Christophe):
* Initially involved with a young company Constellar, in contract to provide a cloud native database
* Originally working with COGs, needed extra dimensions for light spectrum and time; did some work to extend COGs with time, but it didn't have the speed (as it was not really an N-D array but rather a series of arrays)
* Zarr fulfilled the capabilities looking for, but noticed the libraries like xarray, rasterio, gdal etc all needed some geospatial metadata.
* Based on xarray conventions, Christophe based conventions for geozarr, extended with additional features, because wanted integration of symbology
* Adopted CF conventions for the standard names, allows client to know exactly what coordinates refer to
* Christophe created conventions very close to netcdf
* Geozarr interest is to be extended to include geotiff capabilities
* Thinking this should not be restricted to CF
* What are the high level objectives of this committee?
* Focus on use cases that we want enabled by this work
* Not going to convince communities how to encode data; the best thing we can do is work out interoperability between tools
* If we land on a convention here, how do we get community consensus? Two primary conventions:
* GDAL/WKT
* Grid mapping/CF conventions
* Here is how geozarr will use this convention so that all software reading zarr can identify if that convention is present
* Very few people have implemented CRS math; in practice it always goes through PROJ/WKT, which is not supported by CF
- Affine transforms / polynomial affine transforms are very complex, hard to implement
* Important to solve these workflow issues, but not the focus of this group
### Let's brainstorm!
* As a data scientist I want to open zarr in gdal and get crs
* As a data scientist I want to write data in xarray then open in gdal and get crs
* Right now there are conflicting standards: xarray uses CF conventions, gdal uses an ad hoc approach
* Global models where cell geometry becomes important, bounds concept from CF
* Remote sensing, or DEMs with high res, storing individual cell coordinates that become larger than the dataset itself, and you need an origin offset
* I want to be able to subset swath/L2/irregular grids
* Cross domain interoperability?
* Geospatial viz in the browser. With v2 had to build custom workaround, v3 could benefit from extensions or the spec of geozarr
* As a GIS analyst I want to be able to read a zarr store into ArcGIS/QGIS with correct spatial representation
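
The point above about high-resolution rasters, where per-cell coordinates can outgrow the data itself, can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names and the example grid size are assumptions, not part of any GeoZarr draft, and the origin is taken to be the first cell centre.

```python
def data_bytes(n_rows, n_cols, bytes_per_value=2):
    """Size of the raster itself, e.g. a DEM stored as 16-bit integers."""
    return n_rows * n_cols * bytes_per_value


def explicit_coordinate_bytes(n_rows, n_cols, bytes_per_value=8):
    """Size of two 2-D coordinate arrays (x and y), one float64 value per cell."""
    return 2 * n_rows * n_cols * bytes_per_value


def coordinates_from_origin(origin, step, count):
    """Regenerate a 1-D coordinate axis from an origin and a constant step."""
    return [origin + i * step for i in range(count)]


# Hypothetical 10,000 x 10,000 DEM.
rows, cols = 10_000, 10_000
print(data_bytes(rows, cols))                 # size of the int16 data
print(explicit_coordinate_bytes(rows, cols))  # size of per-cell x/y coordinates
print(coordinates_from_origin(100.0, 0.5, 3)) # axis rebuilt from origin + step
```

With these assumed sizes the explicit coordinates are eight times larger than the data, whereas an origin plus step needs only a handful of numbers per axis.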
Proposal: let's do 5 minutes of silent writing of use cases
Template:
```
- As a [type of User], I need to [do something] with Zarr using [tool X]
```
- As a geospatial analyst, I would like to have support for rectilinear affine transforms (already supported by GDAL)
- As a geospatial analyst, I would like to have support for ground control points (already supported by GDAL)
- As a climate scientist, I need to open CMIP6 data from AWS stored in Zarr format with Xarray, reproject the data to web Mercator, and export a COG for visualization purposes. The Zarr data were transcoded directly from NetCDF using CF conventions with a `grid_mapping` variable and no WKT.
- As a data scientist at a remote sensing company, I want to build harmonized datacubes of Sentinel / MODIS / Landsat data and store them in Zarr on S3. I need all my tools to understand the CRS of the data cube.
- As a publisher of integrated climate and landscape data products, I need one set of conventions to house both 2D coordinate variable low granularity (e.g. climate) data and highly granular (e.g. elevation) data, so my client software and the infrastructure we use to work with both can be less complex and more understandable for all involved.
- As a client/tool, I want to discover dimensions, coordinates, and variables. Dimensions shall include (if relevant) the spectrum band or wavelength and provide an unambiguous description (e.g. standard name) to interpret the coordinates.
- As a GeoTiff provider, I want to be able to encode in GeoZarr my set of GeoTiff (e.g one file per resolution) and encode in a standard way the various resolution/band arrays
- As a client/tool, I want to discover if data downscales are available
- As a client/tool, I want to discover if rechunked (dimension-optimised) instances of the data are available (e.g., a time-series-optimised rechunked array)
- As a client/tool, I want to discover a composition of arrays (e.g. subarrays being temporal instances or adjacent regions)
- As a user/client I want to be able to retrieve a subset of the data.
- As a client/tool, I want to discover a set of visual portrayals of the geospatial data and the relevant symbology.
- As a Map Viewer, I want to be able to discover the GeoZarr product and display the data on a map with the right projection and be able to browse the other dimensions (time, elevation, bands, wavelengths)
- As a Catalogue, I want to be able to provide the necessary information about the GeoZarr product so it can be displayed on a map
- As a tiler or frontend developer, I can access a zarr archive with reduced resolution overviews stored with a standard CRS and level convention.
- As an xarray user I'd like flexible CRS enabled indexes to be able to optimally request sharded data with spatial operations.
- As a frontend developer, I would like to be able to develop browser-based tools to visualize data stored in a zarr store by taking full advantage of the zarr geospatial/CF conventions
- As a geospatial analyst, I want to analyze remote sensing / climate datasets (that follow the NetCDF/Xarray data model) alongside "traditional" raster and vector datasets in a variety of projections in my desktop GIS client (QGIS, ArcGIS Pro).
- QGIS is almost entirely based on GDAL
- As a climate scientist working in climate impacts, I want to aggregate gridded climate/remote sensing data (that follow the NetCDF/Xarray data model) to political units (e.g., counties, states, countries) distributed as spatial polygons.
- If I'm coding in Python: Geopandas, Xarray, rasterio
- If I'm coding in R: ncdf4/RNetCDF; stars/terra (bindings for GDAL); sf (bindings for OGR)
- As a scientist using remote sensing data, I want to be able to use the latest, most frequent, and highest-resolution satellite data (which are only distributed as L2 swaths) in my spatial analyses (that also involve "normal" raster data in GeoTIFF and vector data for my site).
- As a GIS specialist, I can open an S3 / http URL pointing to a Zarr dataset in the cloud and interact with it the same way I would with a COG
Software/Repos needed for this interoperability
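
Several of the use cases above ask for affine transform support as GDAL already provides it. As a minimal sketch, assuming the six-coefficient GDAL geotransform ordering and a hypothetical 30 m north-up grid (the example values are made up), the pixel-to-coordinate mapping and its inverse look like this:

```python
def pixel_to_world(gt, col, row):
    """Map (col, row) to (x, y) using a GDAL-style geotransform.

    gt = (x_origin, pixel_width, row_rotation, y_origin, col_rotation, pixel_height)
    Integer (col, row) refers to the top-left corner of that pixel.
    """
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def world_to_pixel(gt, x, y):
    """Invert the affine transform; returns fractional (col, row)."""
    det = gt[1] * gt[5] - gt[2] * gt[4]
    if det == 0:
        raise ValueError("geotransform is not invertible")
    dx = x - gt[0]
    dy = y - gt[3]
    col = (gt[5] * dx - gt[2] * dy) / det
    row = (-gt[4] * dx + gt[1] * dy) / det
    return col, row


# Hypothetical north-up grid: 30 m pixels, origin at the top-left corner.
gt = (500000.0, 30.0, 0.0, 4500000.0, 0.0, -30.0)
x, y = pixel_to_world(gt, 10, 20)
print(x, y)
print(world_to_pixel(gt, x, y))
```

Six numbers are enough to describe the whole grid, which is why this representation is attractive for the high-resolution cases where explicit coordinate arrays become impractical.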
* gdal
* netcdf-java
* xarray
* rasterio / rioxarray (are these subsumed by GDAL?)
* [ODC-GEO](https://odc-geo.readthedocs.io/)
* [pyresample](https://pyresample.readthedocs.io/en/latest/)
* panoply
* ENVI
* Javascript/Typescript ecosystem
* [zarrita.js](https://github.com/manzt/zarrita.js)
* [zarr.js](https://github.com/gzuidhof/zarr.js)
* [maps](https://github.com/carbonplan/maps)
* [mini-maps](https://github.com/carbonplan/minimaps)
* [JuliaGeo](https://juliageo.org/)
* [ArchGDAL](https://yeesian.com/ArchGDAL.jl/latest/)
* [GeoRust](https://georust.org/)
* QGIS
* R spatial ecosystem:
* [terra](https://github.com/rspatial/terra)
* [sf](https://r-spatial.github.io/sf/)
* [stars](https://github.com/r-spatial/stars)
* [ggspatial](https://paleolimbot.github.io/ggspatial/) -- High-level package for visualization
* [ncmeta](https://hypertidy.github.io/ncmeta/)
* [ncdf4](https://cran.r-project.org/web/packages/ncdf4/index.html)
* [RNetCDF](https://cran.r-project.org/web/packages/RNetCDF/index.html)
* [hdf5r](https://cran.r-project.org/web/packages/hdf5r/index.html)
* [wk](https://github.com/paleolimbot/wk) -- Low-level interface to WKT, WKB, etc. used by higher-level packages
* [exactextractr](https://github.com/isciences/exactextractr)
### Action Items
- [ ] Brianna extend invitation to those below:
- David Hoese
- [ ] As a community, let's diagram the geospatial Zarr stack and dependency chain for each ecosystem (Python, R, Julia, Rust, QGIS) - Ryan will kick this off
- [x] ~~Brianna~~ Christophe (Thanks!) ask for a presentation from OGC
- [ ] PR to geozarr repo that would propose what is needed to encompass more of the above use cases (David + Sean, @ Evan in the PR)
- https://github.com/christophenoel/geozarr-spec/issues/3
## January 19th, 2023
Time: 11h-12h EST
Meeting Link:
https://teams.microsoft.com/l/meetup-join/19%3ameeting_MmZiMjg2MjQtMDcxYy00ZmUwLWFlZGUtNmRlMjFiZjYwYzhm%40thread.v2/0?context=%7b%22Tid%22%3a%227005d458-45be-48ae-8140-d43da96dd17b%22%2c%22Oid%22%3a%228c084ade-238f-4df1-a52f-7242fbe14037%22%7d
Attendees:
- Brianna R. Pagán (NASA GES DISC)
- Christine Smit (NASA GES DISC)
- Hailiang Zhang (NASA GES DISC)
- Dieu My Nguyen (NASA GES DISC)
- Ryan Abernathey (Columbia / LDEO & Earthmover)
- Sean Harkins (DevSeed)
- Alexandra Kirk (DevSeed) - VEDA
- Amit Kapadia (Planet) - Playing with temporal stacks - sent by Chris
- Aaron Friesz (NASA Land Processes DAAC) - Looking for next gen cloud optimized fmt
- Matt Hanson (Element 84) - interested in Zarr
- Christophe Noel (SpaceBel) - introduced GeoZarr, support ESA
- Max Jones (CarbonPlan) - use Zarr for [geospatial viz in the browser](https://github.com/carbonplan/maps)
- David Blodgett (USGS) - helped with netCDF, help on Unidata
- Lucas Sterzinger (NASA)
- Alexey (NASA Goddard) - science advisor to EOSDIS - do we need to transform all our data?
- Raphael Hagen (CarbonPlan)
### Purpose
Discussions for moving forward a community led geozarr spec.
https://github.com/christophenoel/geozarr-spec
Clear playbook we could follow inspired by Chris's work on GeoParquet: https://cholmes.medium.com/geoparquet-1-0-0-beta-1-released-6390ecb4c6d0
### Agenda
* Roundtable Intros: names, roles, motivation for joining the call
* Finding a home for the geozarr-spec repo:
- [ ] Leave as is
- [x] Move to community org, Ryan offered zarr-developers
- [ ] Other ideas?
- *Discussion*
- Christophe: ESA is on board with this collaboration
- David: bringing in front of netcdf standards working group
- Alexey: if NASA leads we will just slow things down
- Ryan: cannot assume the spec we all agree on will be what it is today. We should not assume that what we align on will be CF conventions. There is a vocal community that doesn't want that, coming from the GIS raster world. Should get the netcdf working group involved from OGC; we will confront the culture clash between raster GIS vs netcdf climate communities.
- David: Spatial-first encoding for multi-dimensions, momentum has taken over. Coverage implementation folks harder to get on board.
- Ryan: Align with OGC somehow, don't need to follow the full OGC playbook. Another standards group, which is zarr itself, has done a lot of work with ZEPs (see link below). Part of this was developing a process for extensions, implementors etc.
- Hailiang: We have ZEP3, version 3 has lots of new features, irregular chunking, sharding etc.
- Ryan: We can do this orthogonal to the zarr version. Ryan introduced ZEP as a convention, for how a domain will store metadata. It doesn't need to be a zarr extension, just need to say how we are going to store metadata in the container, this would be applied to ZEP2 or ZEP3 etc. It can be separated from that process.
- Hailiang: any existing examples?
- Ryan: many ad-hoc conventions out there, xarray, microscopy community, but no formal process
- Alexey: That said, whatever we do here should probably be (in some way) connected to the in-development OGC GeoDataCube standard: https://www.ogc.org/pressroom/pressreleases/4829
- David: hesitant to bring it to a SWG that's not the netcdf SWG. Zarr as a binary carrier is a standalone spec, same as TIFF and HDF5, that we build on top of with conventions.
- Ryan: This is the proposal around Zarr Metadata Conventions https://github.com/zarr-developers/zeps/pull/28
- Alexey: I'm on the GeoDataCube SWG; they've only had a few meetings right now, so it's in pretty early stages. They are definitely aware of the NetCDF data model; I'm pretty sure not listing that here is just a minor oversight, but I'll ensure it's there during the next meeting. The idea is to not have the overhead of another SWG; this could be an agile way and point to an existing SWG
* Ultimately intent is to be an OGC spec and can be moved to the opengeospatial repo
* Following the ZEP process
* https://zarr.dev/zeps/active/ZEP0000.html
* Chartering a new Steering Working Group (SWG) under OGC
* What does this entail?
* Timeline?
* Who is involved?
* *Discussion:*
* David: It doesn't have to be slow, it can be rapid. Recommend drafting something in zarr community, get ready to roll.
* Ryan: Zarr + OGC, question - is the community standard process? It's a lot more lightweight, standard is developed outside of OGC. What are the pros/cons of the community standard vs OGC standard working group process?
* Alexey: Nothing formally at NASA keeping us from using this, we already are
* Christine: having OGC stamp on it helps.
* Brianna: zarr is not an official NASA approved data format, but we're still moving forward with use
* Matt: proponent of community standard approach, his own personal experience with STAC. Adoption is the most important part, have people use whatever we come up with. We do this by supporting open source implementations that can utilize this. More important than where it lives.
* David: why not community standard? Because you want to be in OGC architecture.
* Ryan: clear political advantages of becoming an OGC standard, consensus that this is our long term goal, but should not let progress be blocked by this. Convene implementors; we as a group move forward with doing the hard work, which is discussing what is correct from a technical point of view.
* David: concur
* Christophe: we support the approach of SWG
* Relevant and timely thread for active contributors to consider, "adding geospatial conventions to zarr": https://twitter.com/EvenRouault/status/1614036054088056832?s=20&t=spYyeANdHwKBvL5dMiDEpg
* Other technical open questions to revisit:
* Alexey: To clarify - If the focus here is on the metadata and not the internal Zarr storage, we could use whatever we want for the storage backend, right, via fsspec-reference/Kerchunk-like workflows? I.e., We are specifically targeting the JSON, not the underlying storage? (I think this is similar to what Ryan just said).
* Christophe: Yes but I think the extension must consider the aspects specific for S3 backend.
* Who are the implementors for this being a success:
* NASA
* Brianna, Christine
* Unidata / NetCDF
* Ethan Davis and Dennis Heimbigner
* Zarr:
* Ryan
* GDAL:
* Even Rouault - invite? Planet contracts with him.
* Planetary Computer:
* Tom - invite, Matt will invite.
* Open Geospatial Data Cube:
* Kirill
* STAC:
* Panoply?
* Robert Schmunk, ncZarr
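
On the open question above about targeting the JSON metadata rather than the storage backend: the sketch below shows what "just the JSON" could look like for a Zarr v2 store, where attributes live in `.zattrs` documents. It assumes a CF-style `grid_mapping` variable and xarray's `_ARRAY_DIMENSIONS` attribute. The variable names (`temperature`, `crs`) and the helper function are hypothetical, and this is not a layout agreed by the group.

```python
import json


def build_zattrs_documents(data_var, dims, crs_var, crs_attrs):
    """Return {store key: JSON text} for the attributes of two Zarr v2 arrays."""
    data_attrs = {
        "_ARRAY_DIMENSIONS": list(dims),  # xarray's dimension-name convention
        "grid_mapping": crs_var,          # CF pointer to the CRS variable
    }
    return {
        f"{data_var}/.zattrs": json.dumps(data_attrs, indent=2),
        f"{crs_var}/.zattrs": json.dumps(crs_attrs, indent=2),
    }


# CF grid mapping attributes for geographic coordinates on the WGS 84 ellipsoid.
wgs84_attrs = {
    "grid_mapping_name": "latitude_longitude",
    "semi_major_axis": 6378137.0,
    "inverse_flattening": 298.257223563,
}

docs = build_zattrs_documents("temperature", ("time", "lat", "lon"), "crs", wgs84_attrs)
for key, text in docs.items():
    print(key)
    print(text)
```

Because these documents are plain key/value entries, the same metadata could sit in front of native Zarr chunks or a Kerchunk-style reference to existing files, which is the separation the question was probing.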
### Action items
- [x] Ryan will coordinate with Christophe to transfer the repo
- [x] Brianna will send invitations to the implementers listed above to get involved
- [x] Brianna will schedule a bi-weekly call, but sometimes what's really needed is to all get in a room together.
- [ ] Brianna can organize these sprints.
- [ ] In person meet-up at summer ESIP otherwise maybe NASA could sponsor a specific : https://www.esipfed.org/meetings
- [ ] Channel of communication: gitter?
## Meetings Summary
* 21 June 2023: The feedback on the OGC meeting attendance was limited. There was a debate on whether it was necessary to resolve every commit in the draft charter before moving ahead. Discussion on the draft OGC Charter focused on whether to include visualization in the spec, etc.
* 24 May 2023: There was a discussion on the open PR for the draft OGC Charter, and participants were encouraged to provide feedback. The meeting with Radiant Earth Foundation regarding the Cloud-Native Geospatial Foundation was also discussed, as was the upcoming meeting with the Google Earth Engine Ingest team.
* 26 April 2023: The addition of GeoTransform as implemented by GDAL was discussed, and the need for an example Zarr file was highlighted. The discussion then moved to the Standards Working Group OGC Draft, with debates on whether to move forward with a community standard or an SWG. The importance of interoperability and community involvement was highlighted.
* 12 April 2023: The questions posed to Scott from OGC regarding the alignment of GeoZarr objectives with the requirements for an OGC SWG were reviewed. Discussions ensued on how to create the OGC standard from the ZEP-4 document, with Scott suggesting a process for translating the ZEP into a Standard. The role of STAC in relation to the use of CF was also discussed, along with the challenges associated with interoperability between GDAL and xarray.
* 29 March 2023: Debated developing a spec extension or a convention for GeoZarr, and discussed the potential of going through the OGC SWG process. Also touched upon a roadmap, and encoding differences between GDAL and Xarray.
* 1 March 2023: Discussed progress and shared updates on example Zarr stores to identify any issues, focusing on browser-based visualization and the need for support for rasterio's CRS model in Zarr. Agreed to continue developing example notebooks based on the provided datasets.
* 15 February 2023: Scott Simmons presented the OGC standardisation processes. Debated challenges of transforming source formats into Zarr, including encoding, data models, and implementation in various software. Explored aspects that GeoZarr should address or recommend.
* 1 February 2023: Explored high-level objectives for the committee, such as use cases, compatibility, and community consensus; brainstormed use cases and software/repos needed for interoperability.
* 19 January 2023: Discussions for moving forward a community-led geozarr spec, transferring the repo, and organizing bi-weekly calls and in-person meet-ups.