<!--
Docs for making Markdown slide deck on HackMD using Revealjs
https://hackmd.io/s/how-to-create-slide-deck
https://hackmd.io/c/codimd-documentation/%2F%40codimd%2Fmarkdown-syntax
https://revealjs.com
-->

<!-- .slide: style="font-size: 0.85em;" -->

## GPU-native Zarr 🧊

### Optimizing data throughput for <br> large-scale geospatial ML workflows 🧰

<small>[FOSS4G 2025 lightning talk<br> Wednesday 19 Nov 2025, 15:45–15:50 (NZDT)](https://talks.osgeo.org/foss4g-2025/talk/review/T3KBTZLMJASRG7TLZEAASQRESFUJHEML)</small>

_by **[Wei Ji Leong](https://github.com/weiji14)** @ [Development Seed](https://developmentseed.org/team/weiji-leong)_

<!-- Put the link to this slide here so people can follow -->
<small>P.S. Slides are at https://hackmd.io/@weiji14/foss4g2025</small>

<img src="https://user-images.githubusercontent.com/88113/194071025-c0af6172-0c8f-4daa-9de0-84667f998301.png" alt="Zarr logo" width="15%">
<img src="https://www.archives.ucar.edu/sites/default/files/images/NSF-NCAR_Lockup-UCAR-Dark_102523%20%282%29.png" alt="NCAR logo" width="20%">
<img src="https://www.nvidia.com/content/dam/en-zz/Solutions/about-nvidia/logo-and-brand/01-nvidia-logo-vert-500x200-2c50-d@2x.png" alt="NVIDIA logo" width="15%">
<img src="https://devseed.com/aiaia-docs/assets/graphics/content/dev-seed-logo-test.png" alt="Development Seed logo" width="25%">

---

<style>
.two-cols{
    display: flex;
    padding: 0.5rem 1rem;
}
.col {
    flex: 1;
    overflow: auto;
}
</style>

<!-- .slide: style="font-size: 0.85em;" -->

### Step 1: Optimize chunking and compression 🧱

#### **using Xarray**

<div class="two-cols">
<div class="col">

Follow your data access patterns 😇

- Rule of thumb: use <br> $1\text{MB} < \text{chunk_size} < 100\text{MB}$
- Apply suitable compression (e.g. ZSTD) if reading over a network
- Pro tip: consider sharded Zarr to save on cloud storage costs

</div>
<div class="col">

```python
import xarray as xr

ds = xr.open_mfdataset("ERA5.zarr")
# Rechunk the data
ds = ds.chunk({
    "time": 1,
    "level": 1,
    "latitude": 640,
    "longitude": 1280
})
# Save to Zarr v3
ds.to_zarr(
    store="rechunked_ERA5.zarr",
    zarr_format=3
)
```

</div>
</div>

---

<!-- .slide: style="font-size: 0.85em;" -->

### Step 2: Direct to GPU ⏩

#### **with Zarr-Python 3 (+ KvikIO)**

<div class="two-cols">
<div class="col">

Decompress in CPU memory, then read to CUDA memory using NVIDIA GPUDirect Storage (GDS) 😎

![Flowchart-technically decompression is still done on CPUs](https://xarray.dev/posts/gpu-pipline/flowchart_2.png)

</div>
<div class="col">

```python
import cupy as cp
import kvikio.zarr
import xarray as xr
import zarr

airt = xr.tutorial.open_dataset(
    name="air_temperature"
)
airt.to_zarr("/tmp/air-temp.zarr", mode="w")

with zarr.config.enable_gpu():
    store = kvikio.zarr.GDSStore(
        root="/tmp/air-temp.zarr"
    )
    ds = xr.open_dataset(
        filename_or_obj=store,
        engine="zarr"
    )
    # Data should now live in GPU memory
    assert isinstance(ds.air.data, cp.ndarray)
```

</div>
</div>

---

<!-- .slide: style="font-size: 0.85em;" -->

### Step 3: GPU-based decompression 🚀

#### **using nvCOMP**

<div class="two-cols">
<div class="col">

Send (small) compressed data to the GPU, let the GPU do parallel decompression and compute 😮

![GPU native decompression](https://xarray.dev/posts/gpu-pipline/flowchart_3.png)

</div>
<div class="col">

![nvcomp Zstd performance benchmark](https://xarray.dev/posts/gpu-pipline/zstd_benchmark.png)

<small>However, this still requires <br> https://github.com/zarr-developers/zarr-python/pull/2863 🫠</small>

</div>
</div>

---

<!-- .slide: style="font-size: 0.85em;" -->

### Step 4: Overlap CPU and GPU compute 🔀

#### **with NVIDIA DALI**

<div class="two-cols">
<div class="col">

Concurrent data pre-processing on CPU and GPU 🤹

![NVIDIA DALI overview](https://docs.nvidia.com/deeplearning/dali/user-guide/docs/_images/dali.png)

</div>
<div class="col">

![DALI pipeline profiling screenshot](https://xarray.dev/posts/gpu-pipline/profiling_screenshot_dali.png)

<small>Example code at <br> https://github.com/pangeo-data/ncar-hackathon-xarray-on-gpus 🤗</small>

</div>
</div>

---

<!-- .slide: style="font-size: 0.65em;" -->

## Thank you! :sheep:

<div class="two-cols">
<div class="col">

🛝 Slides https://hackmd.io/@weiji14/foss4g2025

📝 Blog post https://xarray.dev/blog/gpu-pipeline

🧑‍💻 Code https://github.com/pangeo-data/ncar-hackathon-xarray-on-gpus

<br>
<br>

👾 GitHub: @weiji14<br>
🐘 Mastodon: @weiji14@mastodon.nz<br>
✉️ Email: weiji@developmentseed.org

</div>
<div class="col">

<img src="https://hackmd.io/_uploads/HJovkj-xWg.png" alt="Xarray on GPUs team" width="75%">
<img src="https://hackmd.io/_uploads/SJF3KAmeZx.png" alt="QR code to https://hackmd.io/@weiji14/foss4g2025" width="45%">

</div>
</div>
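
---

<!-- .slide: style="font-size: 0.85em;" -->

### Backup: checking the chunk size rule of thumb

A minimal sketch of the arithmetic behind the Step 1 rule of thumb, applied to the chunk shape used there. It assumes values are stored as `float32` (4 bytes each) and that 1 MB = 10^6 bytes; neither is stated in the talk. The helper names `chunk_size_mb` and `within_rule_of_thumb` are hypothetical, made up for illustration.

```python
from math import prod

# Chunk shape from Step 1: (time, level, latitude, longitude)
chunk_shape = (1, 1, 640, 1280)
# Assumption: each value is a float32, i.e. 4 bytes
BYTES_PER_VALUE = 4


def chunk_size_mb(shape, itemsize):
    """Uncompressed size of one chunk in megabytes (1 MB = 1e6 bytes)."""
    return prod(shape) * itemsize / 1e6


def within_rule_of_thumb(size_mb, lower=1.0, upper=100.0):
    """Check the 1 MB < chunk size < 100 MB rule of thumb."""
    return lower < size_mb < upper


size_mb = chunk_size_mb(chunk_shape, BYTES_PER_VALUE)
print(f"{size_mb:.2f} MB, within range: {within_rule_of_thumb(size_mb)}")
# prints "3.28 MB, within range: True"
```

Under these assumptions, one time step on one level comes to about 3.3 MB uncompressed, near the lower end of the range. Grouping several time steps or levels per chunk would move it up, if the access pattern reads them together.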