# PASC 24 Minisymposium Proposal

https://pasc24.pasc-conference.org/submission/guidelines-for-minisymposia/

## Title

Architectures for Hybrid Next-Generation Weather and Climate Models

## Scientific domain(s)

- Climate, Weather, and Earth Sciences (incl. solid earth dynamics)
- Computational Methods and Applied Mathematics

## Organizer details

- Hannes Vogt (ETH - CSCS)
- Enrique Gonzalez (ETH - CSCS)
- Mauro Bianco (ETH - CSCS)

## Abstract

Climate and weather models have traditionally been implemented in low-level languages like Fortran for performance and legacy reasons. However, this approach is no longer sustainable given the diverse and evolving landscape of hardware architectures and the rapid improvement of machine learning approaches for numerical weather prediction. ML models are catching up with state-of-the-art physics-based models and could start replacing them within the next decade. Future weather models are thus poised to be hybrid, where classic numerical methods could be used to train and augment ML models, improving the effectiveness and precision of the final system.

The weather and climate community has actively explored alternative tools and methodologies to deal with performance portability issues, and significant progress has been made in recent years with tools like GT4Py, PSyclone, DaCe, Kokkos, PyTorch, JAX, and Julia. However, there has been less work on the architectural challenges that appear when mixing classic physics-based model components with high-performance GPU and ML frameworks in Python.

In this minisymposium, we aim to delve into the architectural design of next-generation production-quality weather and climate models, which will need to accommodate the diverse tools used nowadays by scientists, including classic numerical methods, performance-portable frameworks for heterogeneous accelerators, automatic differentiation toolkits, and machine learning libraries.
The architectural challenges involved in integrating those diverse tools are often overlooked when developing new scientific software projects. The planned talks will touch on those challenges, with a special emphasis on the following key topics:

- Automatic differentiation: adjoint versions of weather and climate models are important in data assimilation and optimization processes. We would like to discuss strategies for supporting automatic differentiation tools, to enhance their versatility and adaptability as an essential ingredient for machine learning methods.
- Integration and substitution of specific model parts with numerical, data-driven or ML components. The definition of clear, documented and efficient interfaces is key to achieving interoperability and composition of diverse components.
- Data handling and workflow infrastructure to handle multiple model runs for training or large ensemble simulations enabled by ML models. Efficient data handling pipelines, with support for standardized formats, are key to dealing with model data movement through the archiving infrastructure, where ML-based compression tools are likely to play a role in the future.

## Tentative speakers

- Jeremy McGibbon (AI2) for his work on "ACE: A fast, skillful learned global atmospheric model for climate prediction" (https://arxiv.org/abs/2310.02074)
- Stephan Hoyer (Google) on his work with "JAX-CFD: Computational Fluid Dynamics in JAX" for weather and climate applications (https://github.com/google/jax-cfd, https://indico.dkrz.de/event/45/contributions/203/attachments/59/120/ESiWACE2%20talk_Hoyer%20virtual%20workshop%20on%20weather%20_%20climate%20modeling%20-%207%20October%202022.pdf)
- Peter Dueben (ECMWF) for his key role in the ECMWF ML strategy, on the challenges of interaction between conventional models and machine learning (https://gmd.copernicus.org/articles/11/3999/2018/).
- Daniel Holdaway (NASA) on his work on automatic differentiation for the adjoint of the FV3 dynamical core in FV3-JEDI.
- Thorsten Kurth (NVIDIA) on FourCastNet (https://github.com/NVlabs/FourCastNet)
- Stefano Ubbiali (ETH) on his work on Tasmania for process coupling in atmospheric models (https://www.research-collection.ethz.ch/handle/20.500.11850/539521)
- Langwen Huang (ETH) on compressing multidimensional weather and climate data into neural networks (https://arxiv.org/abs/2210.12538)

## Diversity and Inclusivity

In the mini symposium we will involve speakers from different geographical locations, both from industry and academic institutions. The tentative list of speakers already contains people at different stages in their career path. We will work with the candidates to encourage them to propose alternative speakers from their groups to further diversify the mini symposium.

---

## Sources

### Projects

- climt (Climate Modelling and Diagnostics Toolkit) [https://climt.readthedocs.io/en/latest/introduction.html]
- Sympl: A System for Modelling Planets [https://sympl.readthedocs.io/en/latest/]
  climt builds on top of Sympl and Jeremy was involved in both (here is a publication presenting them: https://gmd.copernicus.org/articles/11/3781/2018/)
- Tasmania (developed by Stefano Ubbiali during his PhD) also builds on top of Sympl [https://github.com/eth-cscs/] [https://www.research-collection.ethz.ch/handle/20.500.11850/539521]
- Pace: a Python-based performance-portable atmospheric model [https://gmd.copernicus.org/articles/16/2719/2023/] [https://ai2cm.github.io/pace/]. This is an implementation of the FV3GFS / SHiELD atmospheric model developed by NOAA/GFDL using the GT4Py domain-specific language in Python.
  Interestingly, even though Jeremy was involved, they didn't use climt/Sympl here.
- Veros: Versatile Ocean Simulation in Pure Python uses JAX as backend and is fully written in Python [https://veros.readthedocs.io/en/latest/] and https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2021MS002717

Other interesting but slightly less related candidates:

- XCast: A python climate forecasting toolkit [https://github.com/kjhall01/xcast] [https://www.frontiersin.org/articles/10.3389/fclim.2022.953262/full]
  XCast is a Python Climate Forecasting toolkit - a set of flexible functions and classes that let you implement any forecasting workflow you can think of.
- Pangeo project [https://pangeo.io/index.html]
  Pangeo is first and foremost a community promoting open, reproducible, and scalable science. This community provides documentation, develops and maintains software, and deploys computing infrastructure to make scientific research and programming easier. The Pangeo software ecosystem involves open source tools such as xarray, iris, dask, jupyter, and many other packages.
- ClimateLearn is a Python library for accessing state-of-the-art climate data and machine learning models in a standardized, straightforward way.
  [https://aditya-grover.github.io/blog/2023/climate-learn/] [https://github.com/aditya-grover/climate-learn]
  This could be interesting since I think AI models in the climate and weather domain are becoming just another common tool for users and scientists
- ECMWF
  - [MAELSTROM team (ECMWF project)](https://www.maelstrom-eurohpc.eu/team)
  - https://www.ecmwf.int/en/about/media-centre/science-blog/2021/large-scale-machine-learning-applications-weather-and
  - https://www.atmorep.org/
  - [ECMWF - The rise of data-driven weather forecasting](https://arxiv.org/abs/2307.10128)

### People

- [Victoria Bennett (ECMWF)](https://de.linkedin.com/in/victoria-bennett-40aab434):
  - https://www.ecmwf.int/en/about/media-centre/news/2023/growing-role-machine-learning-ecmwf
  - https://www.umr-cnrm.fr/accord/IMG/pdf/chantry_accordml.pdf
- [Rossella Arcucci (Imperial College London)](https://www.imperial.ac.uk/people/r.arcucci)
  - https://www.nature.com/articles/s41612-023-00387-2
- [Rochelle Schneider (European Space Agency)](https://philab.esa.int/team/rochelle-schneider/)
  - https://www.nature.com/articles/s41612-023-00387-2

## Titles

Dr. Oliver Elbert (oliver.elbert@noaa.gov)
- Insights from Integrating a Python DSL-based Dynamical Core into an Atmospheric Model

Dr. Karthik Kashinath (kkashinath@nvidia.com)
- Tools/Pipelines for Software Defined, AI-Enabled and Scalable XXX
- Tools for AI-Enabled, Scalable Earth System Models
- Technologies for AI-Enabled, Scalable Earth System Models
- Towards AI-Enabled, Scalable workflows for Earth System Models
- Towards

Dr. Stephan Hoyer (shoyer@gmail.com)
- Lessons from NeuralGCM for building differentiable hybrid models at scale

Dr. Stefano Ubbiali (subbiali@phys.ethz.ch)
- Can We Build Composable Atmospheric Models Without Sacrificing Performance?
Institute for Atmospheric and Climate Science (IAC)

## 200 words abstract

Climate and weather models, traditionally built using low-level languages like Fortran for performance, face sustainability challenges due to evolving hardware architectures and advances in machine learning (ML). ML models are rapidly approaching the effectiveness of physics-based models, suggesting a future shift towards hybrid systems that blend classic numerical methods with ML. This evolution necessitates exploring new tools and methodologies to address performance portability issues and the integration of physics-based models with high-performance GPU and ML frameworks in Python. Significant progress has been made with domain-specific languages and general-purpose software libraries, but integrating these with traditional model components remains a challenge. The next-generation weather and climate models must accommodate a range of tools, including numerical methods, performance-portable frameworks, automatic differentiation toolkits, and ML libraries. However, the architectural complexities of integrating these diverse tools are often overlooked in scientific software development. The upcoming minisymposium will focus on architectural design for scalable weather and climate models, addressing key topics like automatic differentiation, optimization, integration of various model components, and efficient data handling for ML-enabled simulations.

## Diversity and Inclusivity

In the mini-symposium we involve speakers from different geographical locations, from both industry and academic institutions, at different stages of their career paths. Since submitting our expression of interest, we have successfully included a speaker from an underrepresented ethnic group, further enriching the diversity of our event. Efforts have also been made to engage female experts in the field of modern software engineering for weather and climate models.
Regrettably, we were unable to secure their participation in time for this proposal. Additionally, we reached out to our initially intended speakers who were unavailable, seeking their recommendations for speakers with diverse backgrounds. However, we did not receive any further suggestions that closely aligned with the specific focus of our symposium.