MetPy CSSI 2021 roadmap
===
:::info
- **Time**: 2:00 pm MDT
- **Participants**:
    - John Allen
    - Lydia Bunting
    - Drew Camron
    - Connor Cozad
    - Kevin Goebbert
    - Russell Manser
    - Ryan May
:::
- Re issue [#1655](https://github.com/Unidata/MetPy/issues/1655): once this roadmap is completed, possibly open a new discussion to close the issue and formalize the roadmap.
- **existing roadmap** on [docs](https://unidata.github.io/MetPy/latest/devel/roadmap.html)
- promises in **upcoming grant** ([award/abstract](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2103682)):
    - speed and scalability of CAPE and other calculations
    - creating benchmarking tools to establish baseline and continually evaluate improvements
    - enable and integrate with [dask](https://dask.org/)
    - BUFR/GRIB polishing
    - create resources to train for these improvements
    - exploratory/stretch:
        - GPU support
    - expansion thoughts from discussion:
        - upstream contributions to pangeo and other relevant efforts ([Pangeo Forge](https://pangeo-forge.readthedocs.io/en/latest/), [rechunker](https://rechunker.readthedocs.io/en/latest/))
        - [XSEDE 2.0](https://www.xsede.org/), [JetStream 2](https://jetstream-cloud.org/)
- **previous grant** no-cost extension deliverables:
    - solver, solver, solver
- **other** roadmap key markers:
    - separation of siphon (TDS) and MetPy (domain-specific) remote access functionality
- New grant Gantt chart
```mermaid
gantt
title MetPy CSSI Timeline (DRAFT)
axisFormat %Y-%m
section Performance
Create Performance Suite: 2021-05-01, 720d
Test Performance Tools : 2021-08-01, 270d
Optimize MetPy Functions : 2022-02-01, 480d
Improve Scalability : 2022-11-01, 540d
section Docs and Training
Documentation/Example Work : 2022-02-01, 810d
Training Workshop : 2023-02-01, 90d
Training Workshop : 2024-02-01, 90d
section Others
File Format Support : 2023-02-01, 270d
Explore GPU Support : 2023-11-01, 180d
```
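
As a rough illustration of what the "Create Performance Suite" item could produce, here is a minimal sketch of an asv-style benchmark. asv discovers plain classes whose methods start with `time_` and calls `setup` before timing. The function `saturation_vapor_pressure` below is a hypothetical NumPy stand-in (Bolton 1980 approximation), not MetPy's implementation, and the class name and array sizes are assumptions chosen for the example.

```python
import numpy as np


def saturation_vapor_pressure(temperature_c):
    """Hypothetical stand-in for a MetPy calculation.

    Bolton (1980) approximation: saturation vapor pressure in hPa
    for a temperature given in degrees Celsius.
    """
    return 6.112 * np.exp(17.67 * temperature_c / (temperature_c + 243.5))


class TimeSaturationVaporPressure:
    """asv-style benchmark: time_* methods are timed, setup is not."""

    # asv runs the benchmark once per parameter value (assumed sizes)
    params = [10_000, 1_000_000]
    param_names = ["n_points"]

    def setup(self, n_points):
        # Build the input outside the timed region
        rng = np.random.default_rng(0)
        self.temperature_c = rng.uniform(-40.0, 40.0, n_points)

    def time_saturation_vapor_pressure(self, n_points):
        saturation_vapor_pressure(self.temperature_c)
```

In a real suite the stand-in would be replaced by the MetPy call under test, and `asv run` would record timings per commit so that later changes can be compared against the baseline.
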
- Performance/evaluation tools
    - [Numba](http://numba.pydata.org/)
    - [Cython](https://cython.org/)
    - [gt4py](https://github.com/GridTools/gt4py)
    - [asv](https://asv.readthedocs.io/en/stable/)
- Other ideas:
    - encompassing `get_layer` functionality in xarray `.sel()`
    - Some functions, like DCAPE, may become feasible once an optimization solution is in place
    - More resources for units, including a guide that goes through the relevant uses and explains the rationale for having units
    - Need to go back through issues and see if there are any nuggets lurking in there
    - Likely avoid employing dask internally, given the challenges of setting things up in a way that works well in general; just ensure interoperability with dask
    - Need to test across a variety of dask configurations (local cluster, HPC, cloud cluster)
    - Numba sounds like the best initial solution; it avoids us needing to do all the platform builds ourselves, but instead makes that an externality. It's not perfect, though, since that makes us dependent on Numba for support of additional platforms (e.g., macOS M1) and future Python versions. On the plus side, might give us PyPy support?
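
To make the Numba discussion concrete, here is a minimal sketch of an optional-dependency pattern: compile an explicit loop with `numba.njit` when Numba is installed, and fall back to plain Python otherwise. The function `log_pressure_integral` is hypothetical and is not part of MetPy; it only mimics the shape of the layer integrals found in CAPE-like calculations.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumed fallback: without Numba, run the same code uncompiled
    def njit(func):
        return func


@njit
def log_pressure_integral(pressure, values):
    """Trapezoidal integral of `values` with respect to ln(pressure).

    `pressure` is expected to decrease with index (surface first).
    """
    total = 0.0
    for i in range(pressure.size - 1):
        dlnp = np.log(pressure[i]) - np.log(pressure[i + 1])
        total += 0.5 * (values[i] + values[i + 1]) * dlnp
    return total


# Example call with made-up values
pressure = np.array([1000.0, 850.0, 700.0, 500.0])
values = np.array([1.0, 2.0, 1.5, 0.5])
result = log_pressure_integral(pressure, values)
```

The fallback keeps the package importable on platforms or Python versions that Numba does not yet support, at the cost of losing the speedup there.
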