# Aquaplanet with ICON4Py dycore at scale
###### tags: `functional cycle 12`
Developers: Abishek, David, Jonas (C2SM)
Appetite: full cycle
## Goals
- An `icon-exclaim` codebase with an ICON4Py dycore<sup>1</sup> that runs the EXCLAIM aquaplanet experiments at scale<sup>2</sup>, and matches Fortran + OpenACC in accuracy and performance.
- Add a panel showing dycore with GT4Py stencils to the following plot:

- Provide basic scaling plots for plain OpenACC vs. GT4Py-enabled runs.
<sup>1</sup><small> A Fortran+ACC dycore that also incorporates ICON4Py dycore stencils</small>
<sup>2</sup><small> Run `exclaim_ape_R2B08` on Daint with 2000 nodes</small>
## Tasks
- [x] Merge Aquaplanet GPU features into `dkrz/icon-nwp`. [Done](https://gitlab.dkrz.de/icon/icon-nwp/-/commit/512ae6cfec1de8ccec1d170d66f77d4460c20601) (David L.)
- [ ] Move `icon-dsl` to Daint
- [x] Get `icon-dsl` to build on Daint (Abishek)
- [ ] Generate probtest reference for Daint
- [ ] Merge EXCLAIM `icon-dsl` with `aquaplanet_gpu_features`
- [x] Sync EXCLAIM `aquaplanet_gpu_features` with a specific tag of `dkrz/icon-nwp` (Abishek)
- [x] ec_rad: maybe from `dkrz/icon-nwp`, else pick Daniel's branch. Small task. (Abishek)
- [x] Sync `aquaplanet_gpu_features` with `icon-nwp/master` after `icon-nwp/ecrad_acc` is merged into `icon-nwp/master`
- [ ] Merge EXCLAIM `tmp-merge` into `exclaim_gt4py_dycore` (Christoph, Abishek)
- `tmp-merge` is the working branch for the merge; `exclaim_gt4py_dycore` is a copy of `aquaplanet_gpu_features`
- [ ] LAM dycore -> Global dycore
- [ ] Handle `skip_values=True` in icon4py
- [x] Changes to gt4py to support `skip_values=True` ([PR](https://github.com/GridTools/gt4py/pull/1058)) (Hannes)
- [x] Remove hard-coded `nproma` from `dsl/CMakeLists.txt` and `dsl/icon_setup.cpp` ([PR](https://github.com/C2SM/icon4py/pull/123)) (Christoph)
- [x] Fix JSON multinode issue ([PR](https://github.com/C2SM/icon4py/pull/123)) (Christoph)
- [x] Fix output crashes (David L., Anurag D.)
- [ ] Merge with `exclaim_gt4py_dycore`
- [ ] Scale experiments on more nodes (Abishek)
- [ ] Discuss overall + micro benchmarking strategies with MCH
- [ ] Run hard scaling tests
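
For the scaling tasks above, a minimal sketch of how speedup and parallel efficiency could be derived from measured wall times. It assumes "hard scaling" means strong scaling (fixed problem size, increasing node count). The `ScalingPoint` type, the `strong_scaling` helper and the timings are hypothetical and are not measurements from Daint.

```python
from dataclasses import dataclass


@dataclass
class ScalingPoint:
    """One benchmark run: node count and measured wall time (hypothetical type)."""
    nodes: int
    wall_time_s: float


def strong_scaling(points):
    """Return (nodes, speedup, efficiency) tuples relative to the smallest run.

    speedup    = t_base / t_n
    efficiency = speedup / (n / n_base)
    """
    pts = sorted(points, key=lambda p: p.nodes)
    base = pts[0]
    rows = []
    for p in pts:
        speedup = base.wall_time_s / p.wall_time_s
        ideal = p.nodes / base.nodes
        rows.append((p.nodes, speedup, speedup / ideal))
    return rows


if __name__ == "__main__":
    # Placeholder timings for illustration only.
    runs = [ScalingPoint(4, 100.0), ScalingPoint(8, 55.0), ScalingPoint(16, 32.0)]
    for nodes, speedup, eff in strong_scaling(runs):
        print(f"{nodes:5d} nodes  speedup {speedup:5.2f}  efficiency {eff:5.2f}")
```

Running the same helper on the plain OpenACC and the GT4Py-enabled timings would give two curves that can go on one plot.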
## Verification, Validation & UQ
1. **mch_ch_r04b09_dsl**
- [x] Verf run
- [x] Subst run
- [x] Probtest
2. **atm_ape_test** (Global)
- [x] Verf run
- [x] Subst run
- [x] Probtest
3. **exclaim_ape_R2B05** (Global)
- [x] Verf run
- [x] Subst run
- [ ] Probtest
4. **mch_opr_r04b07**
- [ ] Verf run
- [ ] Subst run
- [ ] Probtest
## Nice to have:
- Port additional stencils for global mode
- [ ] [Stencil 1](https://github.com/C2SM/icon-exclaim/blob/aa450d1778ef7c56a1f1d88f3ecf719a8294cdb1/src/atm_dyn_iconam/mo_solve_nonhydro.f90#L2319)
- [ ] [Stencil 2](https://github.com/C2SM/icon-exclaim/blob/aa450d1778ef7c56a1f1d88f3ecf719a8294cdb1/src/atm_dyn_iconam/mo_nh_diffusion.f90#L1205)
- New codebase capable of running in three modes (this feature is planned for the DSL pre-processor)
- with ICON4Py dycore in verification mode
- with ICON4Py dycore in substitution mode
- with Fortran + OpenACC dycore
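
The three modes listed above can be illustrated with a small dispatch sketch. It assumes that verification mode runs both implementations and compares their output, and that substitution mode uses only the ICON4Py result. All names (`DycoreMode`, `run_stencil`) are hypothetical, and the actual DSL pre-processor operates on Fortran sources rather than Python callables.

```python
import math
from enum import Enum


class DycoreMode(Enum):
    """Hypothetical run modes mirroring the list above."""
    FORTRAN_OPENACC = "fortran_openacc"
    SUBSTITUTION = "substitution"
    VERIFICATION = "verification"


def run_stencil(mode, fortran_impl, dsl_impl, field, rel_tol=1e-12):
    """Run one stencil according to the selected mode.

    fortran_impl / dsl_impl are stand-in callables taking and returning
    a list of floats.
    """
    if mode is DycoreMode.FORTRAN_OPENACC:
        return fortran_impl(field)
    if mode is DycoreMode.SUBSTITUTION:
        return dsl_impl(field)
    # Verification (assumed semantics): run both and compare element-wise.
    ref = fortran_impl(field)
    out = dsl_impl(field)
    for i, (r, o) in enumerate(zip(ref, out)):
        if not math.isclose(r, o, rel_tol=rel_tol, abs_tol=0.0):
            raise AssertionError(f"mismatch at index {i}: {r} vs {o}")
    return ref


if __name__ == "__main__":
    fortran = lambda xs: [2.0 * x for x in xs]  # stand-in for the Fortran stencil
    dsl = lambda xs: [x + x for x in xs]        # stand-in for the ICON4Py stencil
    print(run_stencil(DycoreMode.VERIFICATION, fortran, dsl, [1.0, 2.0, 3.0]))
```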
## Future goals:
- [ ] GT4Py microphysics & satad
## Non Goals
- A full GT4Py dycore, since that would require halo exchange