Translate and integrate tracer advection stencils

# Translate and integrate tracer advection stencils ###### tags: `functional cycle 11` Developers: Magdalena, Abishek, Christoph? Appetite: half-cycle ## Problem The tracer advection itself contains many horizontal and vertical indirection stencils (horizontal upwind schemes and vertical Semi-Lagrangian schemes), so translating this component might require additional features in gt4py. A part of the horizontal tracer advection was already translated to dusk by Erwan Cossevin at MeteoSwiss. ## Background The ICON dycore consists of the dry dycore plus the tracer advection. In real weather or climate runs the runtime is dominated by the dycore (50-70%), especially considering how little code the dycore is compared to the rest of a full model. So for performance reasons one should translate and optimize the dycore first. ## Integration issues ## Translation blocked #### hflx_limiter_mo_stencil_01 (Part II) - AG - [ ] Translated to `gt4py` - [ ] Merged into `icon4py` - [ ] Ported and verified in `icon-exclaim` [Branch](https://github.com/C2SM/icon4py/tree/hflx_limiter_mo_stencil_01_2). *Requires built-ins (min) to be able to handle sparse dimensions* Depends on gt4py PRs [PR947](https://github.com/GridTools/gt4py/pull/947) (essential) and [PR946](https://github.com/GridTools/gt4py/pull/946) (optional) #### hflx_limiter_mo_stencil_01 - AG Part I - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [ ] Ported and verified in `icon-exclaim` will merge with hflx_limiter_mo_stencil_01 #### face_val_ppm_stencil_03 - [ ] Translated to `gt4py` - [ ] Merged into `icon4py` - [ ] Ported and verified in `icon-exclaim` *same as face_val_ppm_stencil_04* including the line ```advection/mo_advection_vflux.f90 function=compute_face_values_ppm line=2604 ``` ```fortran p_face(jc,elevp1) = p_cc(jc,elev) ``` #### face_val_ppm_stencil_01 - [ ] Translated to `gt4py` - [ ] Merged into ` icon4py` - [ ] Ported and verified in `icon-exclaim` *same as face_val_ppm_stencil_02* absolute domain or where. Implemented in [PR](https://github.com/C2SM/icon4py/pull/99/files). icon4pygen needs to be fixed before they can be integrated. #### advection/mo_advection_vflux.f90 function=upwind_vflux_ppm4gpu line=1581 ```fortran ! total mass crossing jk'th edge during \Delta t z_mass = p_dtime*p_mflx_contra_v(jc,jk,jb) ! Case: w > 0 IF (z_mass > 0._wp) THEN jks = jk ! initialize shifted index DO WHILE( (z_mass > p_cellmass_now(jc,jks,jb)) .AND. (jks <= nlev-1) ) z_mass = z_mass - p_cellmass_now(jc,jks,jb) jks = jks+1 ! update Courant number z_cfl(jc,jk,jb) = z_cfl(jc,jk,jb) + 1._wp ENDDO ! now we add the fractional Courant number ! The MIN function is required here for the case that ! we approach the lower boundary and exit the above loop ! because of jks > nlev-1. z_cfl(jc,jk,jb) = z_cfl(jc,jk,jb) + MIN(1._wp,z_mass/p_cellmass_now(jc,jks,jb)) ELSE ! Case w < 0 jks = jk-1 ! initialize shifted index DO WHILE( (ABS(z_mass) > p_cellmass_now(jc,jks,jb)) .AND. & & (jks >= slevp1_ti) ) z_mass = z_mass + p_cellmass_now(jc,jks,jb) jks = jks-1 ! update Courant number z_cfl(jc,jk,jb) = z_cfl(jc,jk,jb) - 1._wp ENDDO ! now we add the fractional Courant number ! The MAX function is required here for the case that ! we approach the upper boundary and exit the above loop ! because of jks < slevp1_ti. z_cfl(jc,jk,jb) = z_cfl(jc,jk,jb) + MAX(-1._wp,z_mass/p_cellmass_now(jc,jks,jb)) ENDIF ``` --- -> nested scan, "xxx Langrangian", while #### advection/mo_advection_vflux.f90 function=upwind_vflux_ppm4gpu line=1820 ```fortran ! get integer shift (always non-negative) js = FLOOR(ABS(z_cfl(jc,jk,jb))) IF (js == 0) CYCLE ! no work to do z_iflx = 0._wp ! case w > 0 IF (z_cfl(jc,jk,jb) > 0._wp) THEN DO n = 1, js jk_shift = jk-1 + n ! Integer flux (division by p_dtime is done at the end) z_iflx = z_iflx + p_cc(jc,jk_shift,jb) * p_cellmass_now(jc,jk_shift,jb) ENDDO ! case w <= 0 ELSE DO n = 1, js jk_shift = jk - n ! cycle if the source model level is in a region where advection is ! turned off for the present variable IF (jk_shift < slevp1) CYCLE ! Integer flux (division by p_dtime is done at the end) z_iflx = z_iflx - p_cc(jc,jk_shift,jb) * p_cellmass_now(jc,jk_shift,jb) ENDDO ENDIF ! compute full (integer- plus high order fractional) flux p_upflux(jc,jk,jb) = p_upflux(jc,jk,jb) + z_iflx*rdtime ``` -> nested scan, "xxx Langrangian" #### advection/mo_advection_vflux.f90 function=upwind_vflux_ppm4gpu line=1776 ```fortran ikm1 = jk-1 ! get integer shift (always non-negative) js = FLOOR(ABS(z_cfl(jc,jk,jb))) ! get fractional part of Courant number (always non-negative) z_cflfrac = ABS(z_cfl(jc,jk,jb)) - REAL(js,wp) ! compute shifted cell index IF (z_cfl(jc,jk,jb) > 0._wp) THEN jks = MIN(jk,nlev)+js wsign = 1._wp ELSE jks = ikm1-js wsign = -1._wp ENDIF ! this is needed in addition in order to avoid accessing non-existing (uninitalized) ! source levels for tracers that are not advected on all model levels IF (jks < slev) CYCLE ! compute flux ! ! flux formula differs between w>0 and w<0. ! By using the coefficient 'wsign' we are able to merge ! the two formula into one. z_q_int = p_cc(jc,jks,jb) & & + wsign*(z_delta_q(jc,jks) * (1._wp - z_cflfrac)) & & - z_a1(jc,jks)*(1._wp - 3._wp*z_cflfrac + 2._wp*z_cflfrac*z_cflfrac) p_upflux(jc,jk,jb) = wsign * p_cellmass_now(jc,jks,jb) & & * z_cflfrac * z_q_int * rdtime ``` needs type casts for js = floor(), as floor returns floats in python. and also a cast for real(js,wp) ## In progress ### From Dusk #### hflx_limiter_mo_stencil_02 - ML - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` #### vert_adv_stencil_01 - AG - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` (same as vert_adv_stencil_02) #### hflx_limiter_pd_stencil_01 - ML - [x] Translated to `gt4py` - [x] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` #### step_advection_stencil_01 - ML - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` #### step_advection_stencil_02 - ML - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` #### step_advection_stencil_03 - AG - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` #### hor_adv_stencil_01 - AG - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` #### upwind_hflux_miura_stencil_01 - AG - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` #### upwind_hflux_miura_stencil_02 - ML - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` #### hflx_limiter_mo_stencil_03 - ML - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` #### hflx_limiter_mo_stencil_04 - ML - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` #### upwind_vflux_ppm_stencil_01 - ML - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` same as upwind_vflux_ppm_stencil_02 #### v_limit_prbl_sm_stencil_01 - AG - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` where. *Same as `v_limit_prbl_sm_stencil_02`* #### face_val_ppm_stencil_05 - AG - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` *same as face_val_ppm_stencil_06* ### Duplicates #### vert_adv_stencil_02 -> vert_adv_stencil_01 - [x] Ported and verified in `icon-exclaim` #### face_val_ppm_stencil_06 -> face_val_ppm_stencil_05 - [x] Ported and verified in `icon-exclaim` #### face_val_ppm_stencil_04 -> face_val_ppm_stencil_03 - [ ] Ported and verified in `icon-exclaim` #### face_val_ppm_stencil_02 -> face_val_ppm_stencil_01 - [ ] Ported and verified in `icon-exclaim` #### v_limit_prbl_sm_stencil_02 -> v_limit_prbl_sm_stencil_01 - [x] Ported and verified in `icon-exclaim` #### upwind_vflux_ppm_stencil_02 -> upwind_vflux_ppm_stencil_01 - [ ] Ported and verified in `icon-exclaim` #### hflx_limiter_mo_stencil_05 -> `hflx_limiter_mo_stencil_01 + hflx_limiter_mo_stencil_02 + hflx_limiter_mo_stencil_03` - [ ] Ported and verified in `icon-exclaim` In mo_advection_hlimit it is called at the end of a large #ifdef __DSL_VERIFY (200 lines) that calls the other three stencils. ### From Fortran *Note:* **advection scheme 52 (FFSL hybrid miura 3 horizontal)** (->Nina Burgdorfer (NB)) #### btraj_dreg_stencil NB called by hflux_ffsl_hybrid. - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` #### divide_flux_area_list_stencil NB called by hflux_ffsl_hybrid. - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` #### prep_gauss_quadrature_l_stencil_01 (Integration of a linear 2D polynomial) NB Called by hflux_ffsl_hybrid. - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [ ] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` [Update 01.11] ICON doesn't run with lsq_high_ord = 1 and 2->Translate for case lsq_high_ord = 3 #### prep_gauss_quadrature_c_stencil (integration of cubic tracer subgrid distribution, order 3) NB Called by hflux_ffsl_hybrid. - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` #### prep_gauss_quadrature_c_list_stencil (integration of cubic tracer subgrid distribution, order 3) NB Called by hflux_ffsl_hybrid. - [ ] Translated to `gt4py` - [ ] Merged into `icon4py` - [ ] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` #### recon_lsq_cell_l_stencil_01 NB Called by hflux_ffsl_hybrid. - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` [Update 01.11] ICON doesn't run with lsq_high_ord = 1 and 2->Translate for case lsq_high_ord = 3 #### recon_lsq_cell_c_stencil NB Called by hflux_ffsl_hybrid. - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [ ] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` [Update 14.11] Inliner problem prevents compilation #### recon_lsq_cell_c_svd_stencil NB Called by hflux_ffsl_hybrid. - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [ ] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` [Update 14.11] Inliner problem prevents compilation #### hflux_ffsl_hybrid_stencil_01 (Calculate flux at cell edge, lsq_high_ord = 3) NB - [ ] Translated to `gt4py` - [ ] Merged into `icon4py` - [ ] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` [In Stand-by] Solve patch0_cell_idx(je,jk,jb) that is an input param. of the stencil that itself takes je, jk and jb as index. #### hflux_ffsl_hybrid_stencil_02 (Compute total flux) NB - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` #### upwind_hflux_up_stencil (first order Godunov scheme) NB - [ ] Translated to `gt4py` - [ ] Merged into `icon4py` - [ ] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim` #### upwind_hflux_miura #### grad_fe_cell Called by upwind_hflux_muira_cycle #### grad_green_gauss_cell Called by upwind_hflux_muira_cycle #### advection/mo_advection_hlimit.f90 function=hflx_limiter_pd line=1312 just a copy before gt4py stencil wrap_run_hflx_limiter_pd_stencil_02 #### zero stencil - ml ##### dimension: CellDim * advection/mo_advection_vflux.f90 function=upwind_vflux_ppm4gpu line=1574: ```fortran z_cfl(i_startidx:i_endidx,slev_ti:nlevp1,jb) = 0._wp ``` * advection/mo_advection_vflux.f90 function=compute_face_values_ppm line=2465 ```fortran z_slope(i_startidx:i_endidx,slev) = 0._wp ``` - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [x] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim * advection/mo_advection_vflux.f90 function=set_bc_vadv line=2203 ```fortran upflx_top(i_start:i_end) = 0._wp ``` * advection/mo_advection_vflux.f90 function=set_bc_vadv line=2225 ```fortran upflx_bottom(i_start:i_end) = 0._wp ``` - [x] Translated to `gt4py` - [ ] Merged into `icon4py` - [ ] Ported and verified in `icon-exclaim` - [ ] merged into `icon-exclaim #### advection/mo_advection_vlimit.f90 function=v_limit_slope_sm line=803 ```fortran ! index of top half level ikm1 = jk - 1 ! index of bottom half level ikp1 = MIN( jk+1, elev ) ! equivalent formulation of Colella and Woodward (1984) slope limiter ! following Lin et al (1994). p_cc_min = MIN(p_cc(jc,ikm1),p_cc(jc,jk),p_cc(jc,ikp1)) slope(jc,jk) = SIGN( & & MIN( ABS(slope(jc,jk)), 2._wp*(p_cc(jc,jk)-p_cc_min) ), & & slope(jc,jk) ) ``` ## Things to keep in mind To test all of the different advection schemes in the tracer advection one could design an experiment with many tracers in the namelist, each covering one of the schemes. In general making sure that the test coverage for the tracer advection is high is as difficult as it is important.