OVERVIEW - Cycle 29 05/25
- Cycle:
- Betting table: 13.05.2025
- Review meeting: 03.06.2025 (slides)
- Participants:
- CSCS: Edoardo, Enrique, Will
- EXCLAIM: Anurag, Magdalena, Mauro
- MCH: Christoph
Available people:
- GridTools:
- Christos: 50% or a bit less (rest on PASC projects; push a bit more at the beginning)
- Edoardo: 100%
- Enrique: 100%
- Hannes: 0%
- Philip: 60% (because of LUMI)
- Rico: 80-90% (anticipate sporadic SwissTwins work)
- Sara (ECMWF): 0% (working on distributed PMAP-GO + PASC poster)
- Till: 66% (12.05-23.05 only, rest vacation)
- Ioannis: 50% (because of cornerstone)
- CSCS:
- Mikael (Simberg): 50%
- Giacomo: 50% (rest on Flang investigation and Tabascal)
- Will: 50% (rest: performance of GT4Py muphys, EXCLAIM Symposium)
- Prashanth: 40% (mainly for I/O bits of ICON4Py)
- EXCLAIM:
- Chia Rui:
- Jacopo: 5-10% (helping out with Hannes' open PRs)
- Magdalena: Full Cycle
- Nikki: Full Cycle
- Yilu: almost full cycle (will be away for 4 days)
- MCH:
- Christoph: 100%
- Daniel: 50%
Cycle Goal
Meaningful performance baseline for the Dycore granule
Same as Cycle 28, but also improving performance beyond what we see right now: ICON-EXCLAIM running with the ICON4Py and GT4Py-DaCe main branches, with minimal Python overhead and meaningful performance benchmarks. This also requires finishing the combined programs, merging all the half-finished features such as concat_where, and completing the benchmarking infrastructure to support writing more optimizations in the next cycles.
Tasks
Project | Appetite | Developers | Support |
[DaCe] Optimization VIII | full cycle | Philip, Ioannis | Edoardo |
[DaCe] Toolchain Runtime Support | full cycle | Edoardo, Giacomo | Enrique |
[GT4Py] Concat where cleanup & merge | full cycle | Till | Enrique |
[ICON4Py] Benchmarking infrastructure | full cycle | Enrique, Christos, (Magdalena?) | Chia Rui |
[ICON4Py] Combined solve non_hydro stencils | 1 week | Christoph (until done) | Nikki |
[ICON4Py] Hannes' fixes 1: Static variants of programs | 1 week | Nikki | Edoardo |
[ICON4Py] Hannes' fixes 2: Fix skip values | 1 week | Magdalena | |
[ICON4Py] Hannes' fixes 3: Fix validation errors | 1 week | Jacopo, Daniel | Christoph |
[ICON4Py] Small CI improvements | 1 week | Magdalena | Enrique |
[ICON4Py] Standalone greenline | full cycle | Mikael, Yilu | Magdalena |
[ICON4Py] Improving performance for CFL condition in dycore | 2 weeks | Christoph, Chia Rui | Magdalena |
[Blueline] ICON with Granule deployment | full cycle | Rico, Will | Christoph |
Data Compression Project | 1 week | Nikki, Christos | |
Planning Action points
- @egparedes sync with @mluz about the icon4py tasks
  - combined stencils
  - RBF & standalone grid status
- @egparedes ask for possible dates for the group brainstorming meeting (if needed) and betting table and send the invites
- @egparedes send betting table invite
- sync and discuss with all people involved on different spack+venv projects (@ricoh @muellch @Will Sawyer, …)
Brainstorming
- benchmarking infrastructure:
- StencilTests with larger grids (either standalone or with serialized data)
- validate with small grid
- benchmark with large grid
- accept static args
- green-line JW
- intermediate step: serialize APE with R02B07
- switch to standalone grid when ready
- MCH experiment: ICON-CH2-small for validation, Christoph's ICON-CH2 for benchmark
- add to bencher benchmark diffusion, dycore
- blue-line + granule (ape large grid?)
- profiling & identifying the low performance in the blueline + granule
- gt4py
- merge performance counters as soon as they are merged in gt4py
- finish the concat_where
- dace
- reduce overhead of the program calls (dace)
- toolchain instrumentation
- continue with DaCe optimizations
- parallel runs with DaCe
- icon4py
- take over and merge PRs from Hannes handover document
- finish the combined stencils (one week?)
- consolidate the serialization points in icon-EXCLAIM (required for the serialization of the larger grids needed for benchmarking) (few days?)
- standalone: (1 cycle, both tasks could go in parallel)
- finish RBF validation
- glue code to start using them and merge and integrate z_ifc branch.
- Simple CI improvements: integrate nox utils to skip unneeded test execution (few days?)
- fix CFL reduction in icon4py dycore
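
Sketch for the "validate with small grid, benchmark with large grid" idea from the benchmarking infrastructure notes above. This is only an illustration in plain Python, not the StencilTest API: the names stencil_under_test, reference, validate and benchmark, the toy 1D three-point stencil, and the grid sizes are all assumptions made up for the example.

```python
import time
from typing import Callable, List

Field = List[float]


def stencil_under_test(field: Field) -> Field:
    """Hypothetical stand-in for a stencil program: periodic 3-point average."""
    n = len(field)
    return [(field[(i - 1) % n] + field[i] + field[(i + 1) % n]) / 3.0 for i in range(n)]


def reference(field: Field) -> Field:
    """Independent reference implementation, used only for validation."""
    n = len(field)
    out = []
    for i in range(n):
        left = field[i - 1]  # index -1 wraps to the last element
        right = field[(i + 1) % n]
        out.append((left + field[i] + right) / 3.0)
    return out


def validate(stencil: Callable[[Field], Field], size: int = 16, tol: float = 1e-12) -> bool:
    """Check correctness on a small grid, where a reference run is cheap."""
    field = [float((i * i) % 7) for i in range(size)]
    return all(abs(a - b) <= tol for a, b in zip(stencil(field), reference(field)))


def benchmark(stencil: Callable[[Field], Field], size: int = 100_000, repeats: int = 5) -> float:
    """Time the stencil on a large grid; return the best wall-clock time in seconds."""
    field = [float(i % 13) for i in range(size)]
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        stencil(field)
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    # Only benchmark what has been validated first.
    if validate(stencil_under_test):
        print(f"best of 5: {benchmark(stencil_under_test):.4f} s")
```

The point of the split is that the large-grid run never needs reference data, so serialized data is only required for the small validation grid.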
PASC Poster
- ICON uenvs -> needs larger discussion ??
- missing experiments:
- is it possible to run spack-installed uv in the install phase of a dependent spack package in the context of a uenv build? (locally -> uv does not run because spack does not expose libbz2 to it during staging of the dependent package, why?). Possible fix: depend on bzip2 as a build dependency in the icon4py spack package?
- IF YES -> only needed to add a venv with icon4py for runtime in uenv post-install
- IF NO -> consider alternatives: pip install instead of uv sync, install icon4py and then ICON both in post-install
- uenv deployment ?
- ICON CI (Will, Daniel) ?
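
Sketch for the missing uv/libbz2 experiment above: a small probe that could be run inside the build environment to see whether uv and libbz2 are discoverable at all. The function name probe_build_env is made up for this example, and the probe only reports what Python's standard lookup finds; it does not reproduce how Spack sets up the staging environment. In a Spack recipe, the proposed fix would presumably take the form depends_on("bzip2", type="build"), which is untested here.

```python
import ctypes.util
import os
import shutil
from typing import Dict, Optional


def probe_build_env() -> Dict[str, Optional[str]]:
    """Report whether uv and libbz2 are discoverable in the current environment.

    Each value is None when the item is not found or the variable is unset.
    """
    return {
        # Absolute path of the uv executable, if it is on PATH.
        "uv": shutil.which("uv"),
        # Library name or path as resolved by Python's standard library lookup.
        "libbz2": ctypes.util.find_library("bz2"),
        # Shown for context: the build environment may or may not set it.
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH"),
    }


if __name__ == "__main__":
    for name, value in probe_build_env().items():
        print(f"{name}: {value if value else 'NOT FOUND'}")
```

Running it once in a plain shell and once in the install phase of the dependent package would show whether the two environments actually differ in what they expose.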