<!--
Docs for making Markdown slide deck on HackMD using Revealjs
https://hackmd.io/s/how-to-create-slide-deck
https://hackmd.io/c/codimd-documentation/%2F%40codimd%2Fmarkdown-syntax
https://revealjs.com
-->
<!-- .slide: style="font-size: 0.85em;" -->
## GPU-native Zarr
### Optimizing data throughput for <br> large-scale geospatial ML workflows
<small>[FOSS4G 2025 lightning talk<br> Wednesday 19 Nov 2025, 15:45–15:50 (NZDT)](https://talks.osgeo.org/foss4g-2025/talk/review/T3KBTZLMJASRG7TLZEAASQRESFUJHEML)</small>
_by **[Wei Ji Leong](https://github.com/weiji14)** @ [Development Seed](https://developmentseed.org/team/weiji-leong)_
<!-- Put the link to this slide here so people can follow -->
<small>P.S. Slides are at https://hackmd.io/@weiji14/foss4g2025</small>
<img src="https://user-images.githubusercontent.com/88113/194071025-c0af6172-0c8f-4daa-9de0-84667f998301.png" alt="Zarr logo" width="15%">
<img src="https://www.archives.ucar.edu/sites/default/files/images/NSF-NCAR_Lockup-UCAR-Dark_102523%20%282%29.png" alt="NCAR logo" width="20%">
<img src="https://www.nvidia.com/content/dam/en-zz/Solutions/about-nvidia/logo-and-brand/01-nvidia-logo-vert-500x200-2c50-d@2x.png" alt="NVIDIA logo" width="15%">
<img src="https://devseed.com/aiaia-docs/assets/graphics/content/dev-seed-logo-test.png" alt="Development Seed logo" width="25%">
---
<style>
.two-cols{
display: flex;
padding: 0.5rem 1rem;
}
.col {
flex: 1;
overflow: auto;
}
</style>
<!-- .slide: style="font-size: 0.85em;" -->
### Step 1: Optimize chunking and compression
#### **using Xarray**
<div class="two-cols">
<div class="col">

Match chunk shapes to your data access patterns:

- Rule of thumb: use <br> $1\text{MB} < \text{chunk_size} < 100\text{MB}$
- Apply suitable compression (e.g. ZSTD) if reading over a network
- Pro tip: consider sharded Zarr to save on cloud storage costs

</div>
<div class="col">

```python
import xarray as xr
# Open the Zarr store lazily (Dask-backed)
ds = xr.open_zarr("ERA5.zarr")
# Rechunk the data
ds = ds.chunk({
    "time": 1,
    "level": 1,
    "latitude": 640,
    "longitude": 1280,
})
# Save to Zarr v3
ds.to_zarr(
    store="rechunked_ERA5.zarr",
    zarr_format=3,  # replaces the deprecated `zarr_version`
)
```
</div>
</div>
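---
<!-- .slide: style="font-size: 0.85em;" -->
### Step 1 (cont.): Check the chunk size

A quick way to check the rechunking example against the 1 MB to 100 MB rule of thumb is to multiply out the chunk shape. This is only a sketch: `chunk_nbytes` is a hypothetical helper, and the 4-byte (float32) item size is an assumption, since the slides do not state the data type.

```python
from math import prod

def chunk_nbytes(chunk_shape, itemsize):
    """Uncompressed size in bytes of a single chunk."""
    return prod(chunk_shape) * itemsize

# Chunk shape from the rechunking example: (time, level, latitude, longitude)
chunk_shape = (1, 1, 640, 1280)
# Assumption: float32 values, i.e. 4 bytes per element
size_mb = chunk_nbytes(chunk_shape, itemsize=4) / 1e6

print(f"{size_mb:.2f} MB per chunk")  # 3.28 MB per chunk
assert 1 < size_mb < 100  # inside the rule-of-thumb range
```

Under that assumption each chunk is about 3.28 MB before compression, which sits near the lower end of the suggested range.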
---
<!-- .slide: style="font-size: 0.85em;" -->
### Step 2: Direct to GPU
#### **with Zarr-Python 3 (+ KvikIO)**
<div class="two-cols">
<div class="col">
Decompress in CPU memory, then read to CUDA memory using NVIDIA GPUDirect Storage (GDS)

</div>
<div class="col">

```python
import cupy as cp
import kvikio.zarr
import xarray as xr
import zarr
airt = xr.tutorial.open_dataset(
    name="air_temperature"
)
airt.to_zarr("/tmp/air-temp.zarr", mode="w")
# Have Zarr return arrays in GPU memory
with zarr.config.enable_gpu():
    store = kvikio.zarr.GDSStore(
        root="/tmp/air-temp.zarr"
    )
    ds = xr.open_dataset(
        filename_or_obj=store, engine="zarr"
    )
    # The data now lives on the GPU as a CuPy array
    assert isinstance(ds.air.data, cp.ndarray)
```
</div>
</div>
---
<!-- .slide: style="font-size: 0.85em;" -->
### Step 3: GPU-based decompression
#### **using nvCOMP**
<div class="two-cols">
<div class="col">
Send the (smaller) compressed data to the GPU, then let the GPU decompress it in parallel and run the compute

</div>
<div class="col">

<small>However, this still requires <br> https://github.com/zarr-developers/zarr-python/pull/2863</small>
</div>
</div>
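---
<!-- .slide: style="font-size: 0.85em;" -->
### Step 3 (cont.): Why chunks decompress in parallel

Parallel decompression works because each chunk is compressed on its own. The sketch below is a CPU-only analogy, not nvCOMP: it assumes `zlib` as a stand-in for ZSTD (it ships with Python) and uses threads in place of GPU workers. The chunk contents are made up for illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a chunked array: four independent 1 MB chunks
raw_chunks = [bytes([i]) * 1_000_000 for i in range(4)]
# Each chunk is compressed separately, as in a Zarr store
compressed = [zlib.compress(chunk) for chunk in raw_chunks]

# No chunk depends on another, so all of them can be decompressed
# at the same time (threads here play the role of GPU workers)
with ThreadPoolExecutor(max_workers=4) as pool:
    decompressed = list(pool.map(zlib.decompress, compressed))

assert decompressed == raw_chunks
# Only the compressed bytes would need to travel to the device
print(sum(map(len, compressed)), "compressed bytes vs",
      sum(map(len, decompressed)), "decompressed bytes")
```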
---
<!-- .slide: style="font-size: 0.85em;" -->
### Step 4: Overlap CPU and GPU compute
#### **with NVIDIA DALI**
<div class="two-cols">
<div class="col">
Concurrent data pre-processing on CPU and GPU

</div>
<div class="col">

<small>Example code at <br> https://github.com/pangeo-data/ncar-hackathon-xarray-on-gpus</small>
</div>
</div>
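---
<!-- .slide: style="font-size: 0.85em;" -->
### Step 4 (cont.): The overlap idea in plain Python

The overlap itself can be pictured as a producer that prepares the next batch while a consumer works on the current one. This is a minimal standard-library sketch of that idea, not DALI's API: `load_batches` is a hypothetical helper, and the batches and the "compute" step are placeholders.

```python
import queue
import threading

def load_batches(n_batches, out_queue):
    """Producer (CPU side): prepare batches and hand them over."""
    for i in range(n_batches):
        batch = [i] * 4  # placeholder for reading + pre-processing
        out_queue.put(batch)
    out_queue.put(None)  # sentinel: no more batches

# A small bounded queue lets the producer run a few batches ahead
batches = queue.Queue(maxsize=2)
producer = threading.Thread(target=load_batches, args=(5, batches))
producer.start()

results = []
# Consumer (stand-in for the GPU step): handles one batch while
# the producer thread is already preparing the next one
while (batch := batches.get()) is not None:
    results.append(sum(batch))
producer.join()

print(results)  # [0, 4, 8, 12, 16]
```

The bounded queue is the design choice that matters here: it caps how far the producer can run ahead, so memory use stays fixed while the consumer is kept supplied.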
---
<!-- .slide: style="font-size: 0.65em;" -->
## Thank you! :sheep:
<div class="two-cols">
<div class="col">
Slides
https://hackmd.io/@weiji14/foss4g2025
Blog post
https://xarray.dev/blog/gpu-pipeline
Code
https://github.com/pangeo-data/ncar-hackathon-xarray-on-gpus
<br>
<br>
GitHub: @weiji14<br>
Mastodon: @weiji14@mastodon.nz<br>
Email: weiji@developmentseed.org
</div>
<div class="col">
<img src="https://hackmd.io/_uploads/HJovkj-xWg.png" alt="Xarray on GPUs team" width="75%">
<img src="https://hackmd.io/_uploads/SJF3KAmeZx.png" alt="QR code to https://hackmd.io/@weiji14/foss4g2025" width="45%">
</div>
</div>