<!-- Abstract:
The United Kingdom Chemistry and Aerosols (UKCA) model is a community atmospheric chemistry and aerosol microphysics model, which forms part of the Met Office's weather forecasting system. It is also a key part of the UK Earth System Model (UKESM), whose outputs feed into IPCC reports. UKCA is a particularly expensive part of the Met Office's model, so significant reductions to its computational cost would be welcome. The chemistry component of UKCA (its most expensive part) uses an implicit timestepping scheme in which each iteration starts from a default, large timestep size. In each iteration, an attempt is made to solve the nonlinear system over several grid-boxes in the spatial domain, but in many cases this fails at the default timestep size. If so, the timestep size is repeatedly halved and the solver re-run until convergence is achieved. In this talk, we propose a method for predicting the required timestep in advance, thereby avoiding wasted computation in attempts to run the chemistry model with timestep sizes that are too large. We use a simple machine learning (ML) based approach, coupled into UKCA using FTorch - a Fortran interface for the popular Python-based ML tool, PyTorch. Further, we make use of FTorch's recently added online training functionality, in order to avoid archiving training data that would have no other clear purpose.
-->
<style>
.reveal {
font-size: 27px;
}
</style>
<style>
.green {color: green;}
</style>
<style>
.red {color: red;}
</style>
## Accelerating UKCA by predicting timesteps with FTorch
<u>Joe Wallwork</u><sup>1</sup>, Luke Abraham<sup>2,3</sup>, Jack Atkinson<sup>1</sup>
<sup>1</sup>Institute of Computing for Climate Science, University of Cambridge, U.K.
<sup>2</sup>Department of Chemistry, University of Cambridge, U.K.
<sup>3</sup>National Centre for Atmospheric Science, U.K.
<!---->
<img src="https://hackmd.io/_uploads/SyNht0cpyg.png" alt="drawing" width="400"/>
<img src="https://hackmd.io/_uploads/ryH0C69a1l.png" alt="drawing" width="300"/>
Slides: https://hackmd.io/@jwallwork/2025-durham-hpc-days?type=slide
<!-- 15 + 5 minute slot -->
---
## Funding
* The [Institute of Computing for Climate Science (ICCS)](https://iccs.cam.ac.uk) acknowledges funding from [Schmidt Sciences](https://www.schmidtsciences.org).
* This project also received funding from a [C2D3-Accelerate grant](https://science.ai.cam.ac.uk/news/2024-12-09-exploring-novel-applications-of-ai-for-research-and-innovation-%E2%80%93-announcing-our-2024-funded-projects.html) for novel applications of AI in research and innovation.
---
## UKCA: United Kingdom Chemistry \& Aerosols
* Atmospheric composition model used in UKESM and at the Met Office. <!--* From long climate simulations to regional air quality forecasts.-->
* ~85-200 tracers and ~300-750 reactions.
* UKCA takes ~25% of UM runtime (depending on configuration).
* UKCA's chemical solver takes ~40% of UKCA runtime (~10% of overall UM runtime).

---
## UKCA chunking approach
* Structured latitude-longitude grid.
* *Chemistry solver is spatially independent.*
* Chunking options:
  1. Horizontal levels.
  2. Vertical columns (slower).
  3. Full domain (intended for GPU).
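
As a rough illustration of how the three options divide the work, the sketch below counts chunks and grid-boxes per chunk for a small grid. The function `chunk_layouts` and the grid dimensions are hypothetical, chosen only for illustration; the argument names follow the variables that appear in the Fortran snippets later in these slides.

```python
from typing import Dict, Tuple


def chunk_layouts(rows: int, row_length: int, model_levels: int) -> Dict[str, Tuple[int, int]]:
    """Return (number of chunks, grid-boxes per chunk) for each chunking option."""
    columns = rows * row_length          # number of vertical columns
    n_points = columns * model_levels    # total number of grid-boxes
    return {
        "levels": (model_levels, columns),   # one chunk per horizontal level
        "columns": (columns, model_levels),  # one chunk per vertical column
        "full": (1, n_points),               # whole domain in a single chunk
    }


# Hypothetical small grid: 4 rows, 6 points per row, 3 levels.
for name, (n_chunks, chunk_size) in chunk_layouts(4, 6, 3).items():
    print(name, n_chunks, chunk_size)
```

Each option covers the same total number of grid-boxes; only the number and size of the solver calls differ.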
---
## UKCA timestepping approach
* Implicit timestepping, quasi-Newton, full LU decomposition.
* For each time subinterval to be integrated...
<!--
* Start with $\Delta t=3600$.
* Try to integrate with the current timestep size.
* If *any grid-box* fails, halve the step and try again.
-->
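
The retry logic (start from the default timestep, and halve it whenever the solve fails for any grid-box in the chunk) can be sketched as follows. This is a minimal illustration, not UKCA's implementation: `count_halvings` and `make_chunk_solver` are hypothetical names, and the toy convergence rule (a grid-box converges only when the timestep is at or below a box-specific limit) is an assumption made for the example.

```python
from typing import Callable, Sequence, Tuple


def count_halvings(
    try_solve: Callable[[float], bool],
    default_dt: float = 3600.0,
    max_halvings: int = 20,
) -> Tuple[float, int]:
    """Halve the timestep until try_solve(dt) reports success."""
    dt = default_dt
    for n_halvings in range(max_halvings + 1):
        if try_solve(dt):
            return dt, n_halvings
        dt /= 2.0  # failed attempt: its work is discarded, retry with half the step
    raise RuntimeError("no convergence within max_halvings")


def make_chunk_solver(dt_limits: Sequence[float]) -> Callable[[float], bool]:
    """Toy solver (assumption): the chunk converges only if every grid-box does."""
    return lambda dt: all(dt <= limit for limit in dt_limits)


# Three grid-boxes; the most restrictive one (500 s) sets the step for the chunk.
dt, n = count_halvings(make_chunk_solver([3600.0, 1000.0, 500.0]))
print(dt, n)  # 450.0 3
```

In this toy setting the three failed attempts at 3600 s, 1800 s and 900 s are wasted work, which is what predicting the number of halvings in advance aims to avoid.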

---
## Numbers of halving steps
Low-resolution "N48" UKCA job with 10 timesteps.

---
## ML halving steps

* Idea: for each grid-box, map input variables to the number of halving steps.
* https://github.com/Cambridge-ICCS/mlstep
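
To make the idea concrete, the sketch below shows how a predicted number of halving steps could translate into a starting timestep, and how many failed solver attempts remain. The helper names are hypothetical, and the sketch assumes the solver succeeds at any timestep at or below the required one.

```python
def starting_timestep(predicted_halvings: int, default_dt: float = 3600.0) -> float:
    """Timestep obtained by halving the default step the predicted number of times."""
    return default_dt / 2 ** predicted_halvings


def failed_attempts(required_halvings: int, predicted_halvings: int) -> int:
    """Failed solves left when starting from the predicted step (assumed rule)."""
    return max(required_halvings - predicted_halvings, 0)


required = 3  # halvings this grid-box actually needs (illustrative value)
for predicted in (0, 2, 3, 4):
    print(predicted, starting_timestep(predicted), failed_attempts(required, predicted))
```

Under these assumptions an underestimate still costs some failed attempts, while an overestimate avoids failures but starts from a smaller step than necessary.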
---
## FTorch - overview
Fortran interface for PyTorch, https://cambridge-iccs.github.io/FTorch.

* Open source (MIT license) and open development.
* Designed to be familiar to both Fortran programmers and PyTorch users.
* Uses `iso_c_binding` to interface with the Torch C++ backend (no data copying).
* Couples directly to `libtorch` $\implies$ no need for Python runtime.
---
## FTorch - offline training workflow

---
## Offline approach
1. **Generate training data (Fortran)**
Run a UKCA test case, writing the input arrays and an output array of halving-step counts to NetCDF files.
2. **Data processing (Python)**
The raw training data are unsuitable as they stand: they are mostly zeros, so they need rebalancing.
3. **Training and scripting (Python)**
Load training data, use it to train ML model, and save in TorchScript format.
4. **Inference (Fortran)**
Load the trained model and use it to predict the timestep.
---
## Offline approach - 1. Generate training data (Fortran)
```fortran
USE ncutils, ONLY: write_nc_real_1d, write_nc_integer_1d, ...
! [Setup]
! Open training data files for writing
iteration = iteration + 1
IF (training) THEN
  tot_n_points = theta_field_size * model_levels
  ! Write temperature data to file
  ALLOCATE(zt_full(tot_n_points))
  zt_full(:) = PACK(temp,.TRUE.)
  CALL write_nc_real_1d(iteration, "zt", zt_full)
  DEALLOCATE(zt_full)
  ! [Other inputs]
END IF
! [Run solver with a chunk size of 1]
IF (training) THEN
  ! Write numbers of chemistry timesteps to file
  ALLOCATE(ncsteps_full(tot_n_points))
  ! [Fill ncsteps_full with the step counts recorded by the solver]
  CALL write_nc_integer_1d(iteration, "ncsteps", ncsteps_full)
  DEALLOCATE(ncsteps_full)
END IF
! [Cleanup]
```
---
## Offline approach - 2. Data processing (Python)
* 10 timesteps $\times$ 3D domain $\implies$ 2.6M data points!
* Mostly zeros $\implies$ massive bias.
* Take nonzero points, plus `zero_factor=3` times as many zero points $\implies$ just 4,904.
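
The rebalancing step can be sketched as below: keep every nonzero target and a random sample of zero targets that is `zero_factor` times as large. This is an illustrative sketch only; `balance_indices` is a hypothetical helper and is not claimed to match what the `mlstep` data loader does internally.

```python
import random
from typing import List, Sequence


def balance_indices(targets: Sequence[int], zero_factor: int = 3, seed: int = 0) -> List[int]:
    """Indices of all nonzero targets plus zero_factor times as many zero targets."""
    nonzero = [i for i, t in enumerate(targets) if t != 0]
    zero = [i for i, t in enumerate(targets) if t == 0]
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    n_keep = min(len(zero), zero_factor * len(nonzero))
    return sorted(nonzero + rng.sample(zero, n_keep))


# Toy data: 95 zero targets and 5 nonzero ones.
targets = [0] * 95 + [1, 2, 1, 3, 1]
kept = balance_indices(targets)
print(len(kept))  # 20
```

With `zero_factor=3` the retained set is four times the number of nonzero points, so if the selection works as sketched here, 4,904 retained points would correspond to 1,226 nonzero points plus 3,678 zero points.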
---
## Offline approach - 3. Training (Python)
```python
import torch
from mlstep.data_utils import NetCDFDataLoader
from mlstep.net import FCNN
from mlstep.propagate import propagate
# [Setup]
# Load the target and feature data from file
features_1d = ["stratflag", "zp", "zt", "zq", "cldf", "cldl"]
features_2d = ["prt", "dryrt", "wetrt", "ftr"]
ncloader = NetCDFDataLoader(
    features_1d, features_2d, num_timesteps, zero_factor=zero_factor
)
target_data = ncloader.load_target_data()
max_nhsteps = ncloader.max_nhsteps
# Setup model, optimiser, and loss function
nn = FCNN(input_size, max_nhsteps=max_nhsteps, hidden_size=hidden_size)
nn = nn.to(device, dtype=torch.float)
optimizer = torch.optim.Adam(nn.parameters(), lr=lr)
criterion = torch.nn.CrossEntropyLoss(reduction="sum")
# [Training loop]
# Save model in TorchScript format
scripted_model = torch.jit.script(nn)
scripted_model.save("model.pt")
```
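
The use of `CrossEntropyLoss` suggests that the number of halving steps is treated as a class label. The pure-Python sketch below shows what a summed cross-entropy over raw scores computes, assuming the network outputs one score per possible halving count and the targets are integer class indices. The functions `cross_entropy_sum` and `predict_classes` are hypothetical helpers written for illustration, not part of `mlstep` or PyTorch.

```python
import math
from typing import List, Sequence


def cross_entropy_sum(logits: Sequence[Sequence[float]], targets: Sequence[int]) -> float:
    """Sum over samples of -log(softmax(scores)[target])."""
    total = 0.0
    for scores, target in zip(logits, targets):
        peak = max(scores)  # subtract the maximum for numerical stability
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[target]
    return total


def predict_classes(logits: Sequence[Sequence[float]]) -> List[int]:
    """Predicted halving count: index of the largest score for each sample."""
    return [max(range(len(scores)), key=scores.__getitem__) for scores in logits]


# Two samples, three classes (0, 1 or 2 halving steps); values are illustrative.
logits = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]
targets = [0, 2]
print(cross_entropy_sum(logits, targets))
print(predict_classes(logits))  # [0, 0]
```

Summing rather than averaging means the loss scales with the number of samples in a batch.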
---
## Offline approach - 4. Inference (Fortran)
```fortran
use ftorch
! [Setup]
IF (.NOT. training) THEN
  CALL torch_tensor_from_array(in_tensors, ..., torch_kCPU)
  CALL torch_tensor_from_array(out_tensor, out_data, torch_kCPU)
  CALL torch_model_load(mlp, "model.pt", torch_kCPU)
END IF
DO i=1,rows
  DO j=1,row_length
    ! [Array chunking, fill out_data]
    IF (.NOT. training) THEN
      CALL torch_model_forward(mlp, in_tensors, out_tensor)
      ncsteps_full(kcs:kce) = out_data
    END IF
    ! [Run solver on chunk]
  END DO
END DO
! [Cleanup]
```
---
## Offline results

Validation results: ~0% overestimation! ...and ~10% underestimation.
<!-- Unfortunately every column chunk -->
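
One way to compute such rates is sketched below. It assumes that "overestimation" means the predicted number of halving steps exceeds the required one and "underestimation" means it falls short; `estimation_rates` is a hypothetical helper and the data are made up for illustration.

```python
from typing import Sequence, Tuple


def estimation_rates(predicted: Sequence[int], actual: Sequence[int]) -> Tuple[float, float]:
    """Fractions of samples where the prediction is above / below the actual value."""
    n = len(actual)
    over = sum(p > a for p, a in zip(predicted, actual))
    under = sum(p < a for p, a in zip(predicted, actual))
    return over / n, under / n


# Illustrative values only: one of ten predictions falls short by one halving.
predicted = [0, 1, 2, 0, 0, 3, 0, 0, 1, 0]
actual = [0, 1, 2, 0, 0, 3, 0, 0, 2, 0]
print(estimation_rates(predicted, actual))  # (0.0, 0.1)
```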
---
## Summary and conclusions
* UKCA timestepping algorithm needs reworking to improve performance.
* FTorch can be used to integrate a PyTorch emulator.
* Preliminary work on offline training (training in Python).
* Need to generate more training data!
* Including more zero data points provides inbuilt regularisation.
* Might be able to drop some input variables.
---
## Future work: online training
* We recently exposed automatic differentiation and optimisers in FTorch.
* This means the ML model can be defined in PyTorch but trained in Fortran.
* This avoids saving large volumes of training data and makes it possible to extend the loss function to include model errors from Fortran.
---
## Resources
* FTorch webpage: https://cambridge-iccs.github.io/FTorch.
* Atkinson et al. (2025). FTorch: a library for coupling PyTorch models to Fortran. Journal of Open Source Software, 10(107), 7602, https://doi.org/10.21105/joss.07602.

* [ICCS ML coupling workshop](https://cambridge-iccs.github.io/ml-coupling-workshop) - 3-4 September, Cambridge, U.K.
<!-- Talk at Durham HPC Days 2025 -->