GPU-native Zarr: Optimizing data throughput for large-scale geospatial machine learning workflows

## GPU-native Zarr 🧊

### Optimizing data throughput for <br> large-scale geospatial ML workflows 🧰

<small>[FOSS4G 2025 lightning talk<br> Wednesday 19 Nov 2025, 15:45–15:50 (NZDT)](https://talks.osgeo.org/foss4g-2025/talk/review/T3KBTZLMJASRG7TLZEAASQRESFUJHEML)</small>

_by **[Wei Ji Leong](https://github.com/weiji14)** @ [Development Seed](https://developmentseed.org/team/weiji-leong)_

<small>P.S. Slides are at https://hackmd.io/@weiji14/foss4g2025</small>

<img src="https://user-images.githubusercontent.com/88113/194071025-c0af6172-0c8f-4daa-9de0-84667f998301.png" alt="Zarr logo" width="15%">
<img src="https://www.archives.ucar.edu/sites/default/files/images/NSF-NCAR_Lockup-UCAR-Dark_102523%20%282%29.png" alt="NCAR logo" width="20%">
<img src="https://www.nvidia.com/content/dam/en-zz/Solutions/about-nvidia/logo-and-brand/01-nvidia-logo-vert-500x200-2c50-d@2x.png" alt="NVIDIA logo" width="15%">
<img src="https://devseed.com/aiaia-docs/assets/graphics/content/dev-seed-logo-test.png" alt="Development Seed logo" width="25%">

---

<style>
.two-cols{
  display: flex;
  padding: 0.5rem 1rem;
}
.col {
  flex: 1;
  overflow: auto;
}
</style>

### Step 1: Optimize chunking and compression 🧱
#### **using Xarray** 🍱

<div class="two-cols">
<div class="col">

Match chunks to your data access patterns 😇

- Rule of thumb: keep <br> $1\,\text{MB} < \text{chunk size} < 100\,\text{MB}$
- Apply suitable compression (e.g. ZSTD) if reading over the network
- Pro tip: consider sharded Zarr to save on cloud storage costs

</div>
<div class="col">

```python
import xarray as xr

# Open the source Zarr store lazily
ds = xr.open_mfdataset("ERA5.zarr", engine="zarr")

# Rechunk the data
ds = ds.chunk({
    "time": 1,
    "level": 1,
    "latitude": 640,
    "longitude": 1280,
})

# Save to Zarr v3
ds.to_zarr(
    store="rechunked_ERA5.zarr",
    zarr_format=3,
)
```

</div>
</div>

---

### Step 2: Direct to GPU ⏩
#### **with Zarr-Python 3 (+ KvikIO)**

<div class="two-cols">
<div class="col">

Decompress in CPU memory, then read into CUDA memory using NVIDIA GPUDirect Storage (GDS) 😎

![Flowchart: technically, decompression is still done on CPUs](https://xarray.dev/posts/gpu-pipline/flowchart_2.png)

</div>
<div class="col">

```python
import cupy as cp
import kvikio.zarr
import xarray as xr
import zarr

# Write a small tutorial dataset to a local Zarr store
airt = xr.tutorial.open_dataset(
    name="air_temperature"
)
airt.to_zarr("/tmp/air-temp.zarr", mode="w")

# Read it back with GPU-backed (CuPy) arrays
with zarr.config.enable_gpu():
    store = kvikio.zarr.GDSStore(
        root="/tmp/air-temp.zarr"
    )
    ds = xr.open_dataset(
        filename_or_obj=store,
        engine="zarr"
    )
    assert isinstance(ds.air.data, cp.ndarray)
```

</div>
</div>

---

### Step 3: GPU-based decompression 🚀
#### **using nvCOMP**

<div class="two-cols">
<div class="col">

Send the (smaller) compressed data to the GPU, then let the GPU handle parallel decompression and compute 😮

![GPU native decompression](https://xarray.dev/posts/gpu-pipline/flowchart_3.png)

</div>
<div class="col">

![nvcomp Zstd performance benchmark](https://xarray.dev/posts/gpu-pipline/zstd_benchmark.png)

<small>However, this still requires <br> https://github.com/zarr-developers/zarr-python/pull/2863 🫠</small>

</div>
</div>

---

### Step 4: Overlap CPU and GPU compute 🔀
#### **with NVIDIA DALI**

<div class="two-cols">
<div class="col">

Concurrent data pre-processing on CPU and GPU 🤹

![NVIDIA DALI overview](https://docs.nvidia.com/deeplearning/dali/user-guide/docs/_images/dali.png)

</div>
<div class="col">

![Profiling screenshot of the DALI pipeline](https://xarray.dev/posts/gpu-pipline/profiling_screenshot_dali.png)
<small>Example code at <br> https://github.com/pangeo-data/ncar-hackathon-xarray-on-gpus 🤗</small>

</div>
</div>

---

## Thank you! :sheep:

<div class="two-cols">
<div class="col">

🛝 Slides https://hackmd.io/@weiji14/foss4g2025
📝 Blog post https://xarray.dev/blog/gpu-pipeline
🧑‍💻 Code https://github.com/pangeo-data/ncar-hackathon-xarray-on-gpus
<br>
<br>
👾 GitHub: @weiji14<br>
🐘 Mastodon: @weiji14@mastodon.nz<br>
✉️ Email: weiji@developmentseed.org

</div>
<div class="col">

<img src="https://hackmd.io/_uploads/HJovkj-xWg.png" alt="Xarray on GPUs team" width="75%">
<img src="https://hackmd.io/_uploads/SJF3KAmeZx.png" alt="QR code to https://hackmd.io/@weiji14/foss4g2025" width="45%">

</div>
</div>
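
---

### Appendix: Sanity-checking chunk size 🧮

As a rough check on the Step 1 rule of thumb, the uncompressed size of one chunk is the product of its dimensions times the bytes per value. A minimal sketch, assuming 4-byte `float32` values (the dtype is not stated on the slide) and using a hypothetical helper `chunk_nbytes`:

```python
import math


def chunk_nbytes(chunk_shape, itemsize):
    """Uncompressed size in bytes of one chunk (hypothetical helper)."""
    return math.prod(chunk_shape) * itemsize


# Chunk shape from Step 1: (time, level, latitude, longitude)
chunk_shape = (1, 1, 640, 1280)
itemsize = 4  # assumption: float32 values

size_mb = chunk_nbytes(chunk_shape, itemsize) / 1e6
print(f"{size_mb:.2f} MB per chunk")  # → 3.28 MB per chunk

# Rule of thumb from Step 1: 1 MB < chunk size < 100 MB
assert 1 < size_mb < 100
```

Under that assumption, the Step 1 chunking falls inside the suggested range. Compressed chunks on disk will usually be smaller than this figure.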