# Discussion on grid
Do not separate climate versus EO
This is more about grids in general: GIS grids versus global model grids.
Data cubes: base structure of xarray
One assumption is co-gridding: the samples are aligned (co-sampled).
Xarray can drive scope.
Examples with some basic DGGS lib data manipulation, coordinate conversions, selecting etc: https://github.com/allixender/dggs_t1
Data tree in Xarray.
Datasets can only interoperate if they share the same discretisation.
We can resample or interpolate.
But we should not do it too often, so that we do not accumulate too much uncertainty in the data.
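
A minimal sketch of what "sharing the same discretisation" buys us in xarray. The variable names and cell ids are made up for illustration. Two arrays indexed by the same kind of cell id can be aligned by label, without any resampling:

```python
import numpy as np
import xarray as xr

# Two variables on the same (hypothetical) cell ids, covering
# partly different sets of cells.
sst = xr.DataArray(
    np.array([10.0, 11.0, 12.0]),
    coords={"cell_id": [100, 101, 102]},
    dims=["cell_id"],
    name="sst",
)
wind = xr.DataArray(
    np.array([1.0, 2.0, 3.0]),
    coords={"cell_id": [101, 102, 103]},
    dims=["cell_id"],
    name="wind",
)

# Explicit alignment: keep only the cells present in both arrays.
sst_common, wind_common = xr.align(sst, wind, join="inner")

# Arithmetic aligns on the shared cells by default (inner join),
# so no interpolation error is introduced.
combined = sst + wind
```

If the two arrays were on different grids, this label-based alignment would silently return little or nothing, which is why a resampling step would be needed first.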
## What problem do we want to solve?
- How do we put such a grid in xarray?
Time dimension in xarray is well handled.
Can we have a data cube which is not xyz? E.g. keep this time dimension.
DGGS part 1 in ISO 19170 abstract specification
quantization of time
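
A minimal sketch of such a cube, assuming a single `cell_id` dimension stands in for the spatial dimensions while `time` stays a regular xarray dimension. The ids and values here are placeholders, not real DGGS data:

```python
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2023-10-01", periods=3, freq="D")
cell_id = np.arange(12)  # placeholder ids for 12 cells

rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"temperature": (("time", "cell_id"), rng.normal(15.0, 2.0, size=(3, 12)))},
    coords={"time": time, "cell_id": cell_id},
)

# Time keeps xarray's label-based selection ...
snapshot = ds.sel(time="2023-10-02")
# ... and space is addressed through the cell id.
series = ds.temperature.sel(cell_id=5)
```

The open question from the discussion is then how to attach grid knowledge (cell geometry, neighbours, resolution) to that `cell_id` dimension.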
## How do we discretize. MTSIC (time)
Get sample dataset in DGGS
Let's forget about multi-resolution and we focus on mono-resolution.
Let's try to solve it for one resolution.
## What are our requirements
- Do we expect the grid system to exist already, i.e. do we re-use an existing one?
- We do not invent a new grid system, we re-use an existing grid system.
- This is not specific to DGGS: we want to explore it with DGGS, but it could be implemented with any other grid system.
## Scope for the sprint
Original Pangeo DGGS code sprint repo:
https://github.com/pangeo-data/bids2023_codesprint
Benoit nicely explaining stuff:
https://github.com/pangeo-data/bids2023_codesprint/issues/3
We want to work on the cell id.
We can have orthogonal dimensions: time and z.
Repositories of examples:
- Tina/Ifremer, Justus/Ifremer Healpix/healpy with notebook (regridding):
- https://github.com/iaocea/xarray-healpy
- https://github.com/IAOCEA/xarray-healpy/blob/main/example/healpix_regrid.ipynb
- Example data [10.5281/zenodo.10074922](http://doi.org/10.5281/zenodo.10074922).
- Alex, various basic DGGS operations with H3, rHealpix, DGGRID
- https://github.com/allixender/dggs_t1
- Uber H3 intro super simple: https://github.com/allixender/dggs_t1/blob/master/h3_intro.ipynb
- some H3 data in CSV: https://github.com/allixender/dggs_t1/blob/master/h3_agg.csv
- Some agg and visualisations (from Uber tut): https://github.com/allixender/dggs_t1/blob/master/h3_unified_data_layers.ipynb
- Some (old, informative at least): https://github.com/allixender/dggs_t1/blob/master/more_grids.ipynb
- Ryan: Added an example creating H3:
- https://github.com/pangeo-data/bids2023_codesprint/blob/main/xr_air_temperature_h3.ipynb
- H3pandas lib: https://github.com/DahnJ/H3-Pandas/blob/master/h3pandas/h3pandas.py
Base operations:
- Selection (.sel, isel), where (.where)
- interpolation (.interp)
- regridding (going between different kinds of grids)
### Regridding
- projection:
- map source (latlon) to target (cell id / zone id), then aggregate
-
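
A rough sketch of the "map to cell id, then aggregate" idea in plain Python. `toy_cell_id` is a hypothetical stand-in for a real DGGS library call (H3, HEALPix, ...). It just bins lat/lon into fixed-size boxes and is not equal-area:

```python
import math
from collections import defaultdict


def toy_cell_id(lat, lon, cell_size_deg=10.0):
    """Hypothetical cell id: row-major index of a regular lat/lon box."""
    n_rows = int(180 / cell_size_deg)
    n_cols = int(360 / cell_size_deg)
    row = min(int(math.floor((lat + 90.0) / cell_size_deg)), n_rows - 1)
    col = int(math.floor((lon + 180.0) / cell_size_deg)) % n_cols
    return row * n_cols + col


def regrid_mean(samples, cell_size_deg=10.0):
    """Map (lat, lon, value) samples to cells, then average per cell."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, value in samples:
        cid = toy_cell_id(lat, lon, cell_size_deg)
        sums[cid] += value
        counts[cid] += 1
    return {cid: sums[cid] / counts[cid] for cid in sums}


samples = [
    (1.0, 1.0, 10.0),
    (2.0, 3.0, 20.0),   # falls in the same cell as the first sample
    (15.0, 1.0, 30.0),  # falls in a different cell
]
cell_means = regrid_mean(samples)
```

With a real grid, only `toy_cell_id` would change. The aggregation step (mean here, but it could be sum, count, ...) stays the same.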
### Xarray DGGS extension
repository (code + examples): https://github.com/benbovy/xdggs
### Example Code
```python
import shapely
import xarray as xr
ds = xr.open_dataset("something_healpix.nc")
# ds.temperature.dims == ('time', 'cell_id')
# how to decode the grid?
# if for each lat
ds = ds.dggs.decode()
# select by coordinates
# Q: do we use a custom index?
ds.sel(lon=45, lat=30, time="2023-10-01", method='nearest')
# or an accessor?
ds.dggs.sel(lon=45, lat=30, time="2023-10-01", method='nearest')
# coarsen (to absolute zoom level)
ds.dggs.coarsen(level=3).mean()
# select by bounding box
ds.dggs.bbox((ll_lon, ll_lat, ur_lon, ur_lat))
# shapely.box(xmin, ymin, xmax, ymax) builds the rectangle polygon
ds.dggs.query(shapely.box(ll_lon, ll_lat, ur_lon, ur_lat))
ds.dggs.query(shapely.Polygon([[lon, lat], ...]))
# visualization?
ds.isel(time=0).temperature.dggs.plot()
```
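
For the coarsening step, a small sketch of what could happen underneath for HEALPix in the nested scheme, where the four children of cell `p` are `4p .. 4p+3`, so the parent is an integer shift away. The function names are illustrative, and unlike the `coarsen(level=3)` idea above, this version takes a relative number of levels:

```python
from collections import defaultdict


def parent_cell(cell_id, levels=1):
    """Parent of a nested-scheme HEALPix cell, `levels` levels coarser."""
    return cell_id >> (2 * levels)


def coarsen_mean(values_by_cell, levels=1):
    """Group fine cells by their parent and average the values."""
    groups = defaultdict(list)
    for cell_id, value in values_by_cell.items():
        groups[parent_cell(cell_id, levels)].append(value)
    return {parent: sum(vals) / len(vals) for parent, vals in groups.items()}


fine = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0, 4: 10.0, 5: 20.0, 6: 30.0, 7: 40.0}
coarse = coarsen_mean(fine, levels=1)
```

Other grids (e.g. H3) have their own parent/child relations, so this part would have to be grid-specific behind a common interface.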
### Additional discussion on Thursday
(please extend / correct, my memory of our lively discussion is already somewhat hazy)
- to progress, we need to collect the features we need to be able to work with DGGS using xarray (→ roadmap / design document)
- conversion / reconstruction of coordinates from / to cell ids
- selection of cells
- encoding / decoding for storage
- interpolation
- plotting
- multi-resolution datasets
- very useful for merging datasets with the same dggs but different resolutions
- grid itself still has to be unchanging over time
- STAC / catalogs?
- use bounding box / envelope?
- main area of work so far: selection of data
- not all of this has to live in xdggs, for example conversion of existing dataset with lat / lon gridding to DGGS:
- interpolation / resampling to a DGGS of roughly the same resolution
- implementation can live in pyresample / xesmf / any other resampling / regridding library
- not all of the code should be written in python, some of this should be implemented in a lower-level, high-performance language
- grant proposal together with openeo: yes, but we will start working on this before the grant starts
- meeting with the OGC working group on DGGS for more feedback / exchange (Peter will help set that up)
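
As a starting point for the "selection of cells" and "bounding box / envelope" items, a minimal sketch that assumes the cell centres have already been reconstructed from the cell ids and stored as `longitude` / `latitude` coordinates. The data, the coordinate names and `select_bbox` are all made up for illustration, and the antimeridian is not handled:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"temperature": ("cell_id", np.linspace(10.0, 15.0, 6))},
    coords={
        "cell_id": np.arange(6),
        "longitude": ("cell_id", np.array([-20.0, 0.0, 10.0, 25.0, 40.0, 60.0])),
        "latitude": ("cell_id", np.array([10.0, 35.0, 45.0, 50.0, 30.0, -5.0])),
    },
)


def select_bbox(ds, ll_lon, ll_lat, ur_lon, ur_lat):
    """Keep the cells whose centre lies inside the bounding box."""
    inside = (
        (ds.longitude >= ll_lon)
        & (ds.longitude <= ur_lon)
        & (ds.latitude >= ll_lat)
        & (ds.latitude <= ur_lat)
    )
    return ds.isel(cell_id=np.flatnonzero(inside.values))


subset = select_bbox(ds, 0.0, 30.0, 30.0, 55.0)
```

Testing cell centres only is the simplest choice. Cells that merely overlap the box are missed, which is one of the things a grid-aware index could do better.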