[ICON4Py] Improving performance for CFL condition in dycore

  • Shaped by: Christoph
  • Appetite (FTEs, weeks): 1 cycle
  • Developers:

Problem

The CFL condition computation and the follow-up additional diffusion stencils in velocity advection need to be as efficient as the available tools allow. At the same time, they must produce the same results as ICON OpenACC.

Appetite

Full cycle, but as a side project.

Solution

  • Combine all stencils that act on cells (resulting in 8_to_18). The CFL condition is computed for each cell, and the cell-based additional diffusion stencil can be inlined, reusing that condition.
  • Directly after the cell computations, reduce the per-cell CFL values to a scalar (the maximum CFL), which also tells us whether the CFL limit was exceeded anywhere. This scalar has to be computed anyway to stay interface compatible.
  • The same scalar can be used to decide whether the 2nd additional diffusion stencil, for the edge computation, has to be called at all. Inside that stencil we still use the per-edge CFL condition, same as in the ICON code, to ensure the same results.
  • The 'levelmask' array will be deleted everywhere.
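
The control flow proposed above could look roughly like the following NumPy sketch. It is only an illustration of the idea, not the GT4Py stencils and not the ICON formulas: every name here (`cell_step`, `edge_step`, `diff_c`, `diff_e`, a dimensionless `cfl_limit`) is hypothetical, and the additional diffusion is reduced to a precomputed placeholder term.

```python
import numpy as np


def cell_step(w_con, ddqz, tend_c, diff_c, cfl_limit, dtime):
    """Fused cell computation (hypothetical stand-in for the combined cell stencil)."""
    # Per-cell vertical CFL number (assumed form: |w| * dt / dz).
    cfl = np.abs(w_con) * dtime / ddqz
    # Per-cell condition, computed once and reused by the inlined diffusion.
    exceeded = cfl > cfl_limit
    # Inlined cell-based additional diffusion; diff_c is a placeholder term.
    new_tend_c = np.where(exceeded, tend_c + diff_c, tend_c)
    # Reduce to a scalar directly after the cell computations.
    max_cfl = float(cfl.max())
    return new_tend_c, max_cfl


def edge_step(w_con_e, ddqz_e, tend_e, diff_e, cfl_limit, dtime, max_cfl):
    """2nd additional diffusion on edges, guarded by the scalar maximum CFL."""
    # Scalar guard replaces the per-level 'levelmask': skip the stencil entirely
    # if the limit was not exceeded in any cell.
    if max_cfl <= cfl_limit:
        return tend_e
    # Inside the stencil, the per-edge condition is still evaluated.
    exceeded_e = np.abs(w_con_e) * dtime / ddqz_e > cfl_limit
    return np.where(exceeded_e, tend_e + diff_e, tend_e)


if __name__ == "__main__":
    # Tiny made-up fields, shape (cells_or_edges, levels).
    w_con = np.array([[0.1, 2.0], [0.2, 0.3]])
    tend_c, max_cfl = cell_step(
        w_con, np.ones((2, 2)), np.zeros((2, 2)), np.ones((2, 2)), 0.65, 1.0
    )
    tend_e = edge_step(
        np.array([[1.0, 0.1]]), np.ones((1, 2)), np.zeros((1, 2)),
        np.ones((1, 2)), 0.65, 1.0, max_cfl,
    )
    print(max_cfl, tend_c, tend_e)
```

In this sketch the host-side `if` on `max_cfl` is what makes the edge stencil call optional, while the per-edge mask inside `edge_step` still decides where diffusion is actually applied.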

No-Go

Rabbit holes

Progress