# The Road to NumPy 2.0 (informational NEP version)

:author: Sebastian Berg
:author: Ralf Gommers
:author: Inessa Pawson
:author: don't be shy :)

*Actually, maybe we can even agree on adding "Steering council" or "on behalf of the steering council"*

> **_Note:_** This is a living document. We plan to modify it through continued dialogue with the community. Its acceptance indicates consensus on the process and timelines.

Abstract
========

NumPy has long avoided a major release in order not to break downstream packages, and it generally tries to be conservative about changes while moving forward slowly. While we do not strictly follow semantic versioning, certain types of changes are larger than, or different from, what is done in a typical NumPy release.

There are two kinds of changes that require a major release of NumPy. First, changes to the C-API that break backwards compatibility and thus require recompiling downstream libraries like SciPy. Second, larger changes to the Python API that are more complex or cannot go through our typical deprecation cycle.

A number of changes of these kinds are in the pipeline. As a result, NumPy 2.0 is in preparation, with a planned release in January 2024. Our aim is for NumPy 2.0 to be a smooth ride that a majority of users can easily upgrade to (many of them should not even need to be aware of the major release at all). Still, we accept that larger changes mean that some users may have a more difficult transition.
We believe that this release will fix some long-standing issues and will set us up for future development in the core of NumPy:

* C API changes to enable it to continue to evolve,
* making the Python API more consistent and easier to learn,
* improving type casting rules by eliminating value-based casting,
* improving interoperability with other array libraries and Python compilers.

Finally, we expect that doing this first major release in well over 15 years will enable us to learn some valuable lessons regarding the pros and cons of larger API changes.

Motivation and impact
=====================

The NumPy 2.0 release is required for fixing old bugs and modernizing NumPy’s code base. It is not planned to be a "break the world" release. This means:

* It must be possible to compile downstream packages to be compatible with both new and old NumPy versions. However, the C-API is expected to be broken. The path to achieve this compatibility will be defined as a *high priority* project.
* The *majority* of users should not require code updates, or such updates should be very easy to do. Expert users are likely to notice changes, though.
* We accept that some NumPy users will not be able to adopt NumPy 2.0 immediately, or will have to wait until following releases for adoption.

One should keep in mind that even bug fixes can break the code of a small number of users. The NumPy 2.0 release will encompass larger changes, which are listed below.

Scope of work
=============

High priority projects
----------------------

The projects in this section are considered high impact from a compatibility point of view, and also key to the need for or benefits of NumPy 2.0. Unless otherwise noted, these are currently proposals; most of these changes have their own NEPs, which should be accepted.

Enable breaking the C-API
-------------------------

The NumPy C-API has many definitions which are unused or very rarely used.
We wish to break the C-API in small ways to allow further improvements in NumPy, including ABI updates. More details can be found in `NEP 53 <NEP53>`_. The NEP does not list every anticipated change.

* **Status**: Implementation
* **Champion**: Sebastian Berg, Matti Picus
* **Severity**: Severe (for maintainers without a plan), typical for users
* **Affects**: Library maintainers, some users
* **Notes**:

  * Many users may have issues if they upgrade to the latest NumPy version without updating other packages that depend on NumPy. We assume that this isn't a common scenario and will mostly result in clear errors.
  * All libraries will have to be recompiled. The transition plan will ensure that libraries adhering to best practices will have an easy transition.

Remove value-based casting
--------------------------

The design described in :ref:`NEP50` changes the promotion behavior of NumPy scalars by removing any value-based casting. In NumPy 2.0, we propose to use this behavior for all relevant functionality, and hence remove support for value-based casting. Details for this change are discussed in :ref:`NEP50`.

* **Status**: Largely implemented, but open for discussions and open questions to be addressed.
* **Champion**: Sebastian Berg, Mike McCarthy, (I really could use at least one more)
* **Severity**: High in rare cases, some results can change or memory can bloat.
* **Affects**: Many users, but hopefully not most as one needs to use smaller than default precision types to be affected.

A thorough cleanup of the Python API
------------------------------------

The NumPy API is quite messy, with many functions and aliases that are not recommended for use, namespaces that are private but missing underscores in their names, inconsistencies in argument names, and more. Changes will include removing aliases and outdated functionality (including many things that have been doc-deprecated already), making namespaces private, and making function signatures more consistent.
Furthermore, the NumPy reference documentation will be reorganized to reflect the different types of remaining namespaces: "regular" (recommended for general usage), "special-purpose" (for a small subset of users and a quite specific purpose) and "legacy" (kept for backwards compatibility, but not recommended for new code). More details on this can be found in `NEP 52 <NEP 52>`_.

* **Status**: NEP 52 is mostly complete, needs some more deprecations in 1.25.0, and more detailed triaging of the full NumPy API surface.
* **Champion**: Ralf Gommers, Stefan van der Walt, ...
* **Severity**: Medium. It is expected that a lot of projects and users will see some breakage, but also that code changes to more idiomatic usage will be straightforward and compatible with both NumPy 1.X and 2.0.
* **Affects**: Many users and downstream projects

Add array API standard support to the main namespace
----------------------------------------------------

The main reason [NEP 47](https://numpy.org/neps/nep-0047-array-api-standard.html#backward-compatibility) aimed for a separate `numpy.array_api` submodule rather than the main namespace is that casting rules differed too much. With value-based casting being removed (see above and :ref:`NEP50`), that will be resolved in NumPy 2.0. Having NumPy be a superset of the array API standard will be a significant improvement for code portability to other libraries (CuPy, JAX, PyTorch, etc.) and thereby address one of the top user requests from the [2020 NumPy user survey](https://numpy.org/user-survey-2020/) (GPU support).

See [the `numpy.array_api` API docs](https://numpy.org/devdocs/reference/array_api.html#table-of-differences-between-numpy-array-api-and-numpy) for an overview of differences between it and the main namespace (the "strictness" ones are not applicable).
Some of the key design rules from the array API standard (e.g., output dtypes predictable from input dtypes, no polymorphic APIs with varying number of returns controlled by keywords, using positional-only and keyword-only arguments) will also be applied to NumPy functions that are not part of the array API standard.

* **Status**: separate NEP to be written.
* **Champion**: Aaron Meurer, Ralf Gommers
* **Severity**: Medium. Most of the impact of breaking changes is likely concentrated in a few widely used APIs (e.g., changing the semantics of the `copy=False` keyword to actually mean "don't copy" rather than "copy if needed").
* **Affects**: most users and downstream projects

**Other projects**
------------------

We anticipate that new ideas/projects that are appropriate only for a major release will come up. Rather than trying to capture a full list, we give a few examples here and then outline the process for proposing a change to NumPy for the 2.0 release:

- Improve NumPy-Numba compatibility (see `here <https://mail.python.org/archives/list/numpy-discussion@python.org/message/QL6BTNYZC3UXBUAWMCMO7KZJTDWBBPCO/>`__)
- Change the default integer type on Windows to `int64`
- Deprecate the `libnpymath` and `libnpyrandom` static libraries

A list of all "projects", or items on the 2.0 roadmap, will be maintained on the [NumPy 2.0 Project Board](https://github.com/orgs/numpy/projects/9). See the *Project selection process* section below for how to go about adding something to that board.

Please note that limited developer bandwidth and the complexity of moving forward a widely used package like NumPy will inevitably mean that many changes that we would like to see cannot be included. We believe that doing a NumPy 2.0 release is very much worthwhile, primarily for the high-priority projects listed above.

Release strategy
================

NumPy 2.0 is planned to be released in January 2024.
The beta and release candidate frequency leading up to the final 2.0 release will be similar to that for a minor release.

Importantly, we do *not* anticipate continuing to release new 1.2X.0 minor releases in parallel with 2.0. The last pre-2.0 minor release will continue to receive bug fixes and have bugfix releases as needed for some time, just like we do for other minor releases (typical is 3-5 bugfix releases for up to a year after the ``.0`` release).

An important part of the C API/ABI changes in NumPy 2.0 is to make it possible to compile a downstream package against NumPy 2.0 and have it work with both 2.0 and 1.2X.0. As a result, there should be no need for a long-term-support strategy for pre-2.0 versions.

Process for proposing features or changes for NumPy 2.0
=======================================================

The process below outlines how we try to make decisions on what to include in NumPy 2.0.

Project selection process
-------------------------

To determine the scope of work for the NumPy 2.0 release, we suggest introducing three categories of projects/proposals:

1. *high*: the proposal requires high visibility or may be critical for the NumPy 2.0 release,
2. *normal*: the proposal is either something that could fit in any minor release, or it has some impact that we wouldn't normally find acceptable for a minor release but has a small enough blast radius that it does not need a NEP,
3. *candidate*: changes which are in an early planning stage, and are not yet approved or do not have a high likelihood of landing in 2.0.

High priority proposals will be listed explicitly in this NEP. A [project board](https://github.com/orgs/numpy/projects/9) will track all projects proposed for NumPy 2.0, with their categories, owners and current state of progress.

Proposing a project for NumPy 2.0 release
-----------------------------------------

To start a project, there is one important thing: believe that your change makes NumPy better and commit to trying to make it happen.

To have a proposal listed on the NumPy 2.0 project board, we require the following:

* At least two champions for each proposal, one of whom must be a NumPy core developer or similar to one in standing.
* A brief assessment of the anticipated impact on downstream and end-users. This means assessing how many users/what groups of users are affected and in what way.
* Support by the NumPy community or Steering Council (ideally both). Positive feedback to your proposal on the NumPy mailing list is a strong indicator of community support.

If *any* of the above requirements are not met, proposals will be listed as "candidate". NumPy maintainers will review "candidate" projects on a case-by-case basis.

We suggest including a brief header in every proposal (issue or PR):

```
* **Champions**:
* **Severity**: How does it affect users?
* **Affects**: Who/how many users does it affect?
```

Any further details or adjustments shall be added on request. Large changes may require their own NEP when requested by a maintainer.

As a *suggestion*, "affects" could be roughly guided by the number of users: *rare*, *limited*, *common*, and *ubiquitous*. "Severity" could be *minor*, *typical* (code update needed), *severe* (e.g. large change/difficult to find), or *critical* (incorrect results or no clear path for fixing things). The two together can then be used as a basis for decision making and discussion.
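
As an illustration only, the listing requirements and the suggested header above could be modelled roughly as follows. This is a minimal sketch, not project tooling: the names `Proposal`, `Severity`, `Affects`, `board_status` and the `"listed"` label are hypothetical and do not exist anywhere in NumPy. The only rules encoded are the ones stated above (two champions, one of core-developer standing, an impact assessment, and community or Steering Council support).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Severity(Enum):
    # Suggested scale from the text above.
    MINOR = "minor"
    TYPICAL = "typical"    # code update needed
    SEVERE = "severe"      # large change / difficult to find
    CRITICAL = "critical"  # incorrect results or no clear fix


class Affects(Enum):
    # Rough guide to the number of affected users.
    RARE = "rare"
    LIMITED = "limited"
    COMMON = "common"
    UBIQUITOUS = "ubiquitous"


@dataclass
class Proposal:
    """Hypothetical record of a proposal for the 2.0 project board."""
    title: str
    champions: List[str]
    has_core_dev_champion: bool
    severity: Optional[Severity]  # None means no impact assessment yet
    affects: Optional[Affects]    # None means no impact assessment yet
    has_support: bool             # community or Steering Council support

    def header(self) -> str:
        """Render the brief header suggested for issues and PRs."""
        sev = self.severity.value if self.severity else ""
        aff = self.affects.value if self.affects else ""
        return "\n".join([
            "* **Champions**: " + ", ".join(self.champions),
            "* **Severity**: " + sev,
            "* **Affects**: " + aff,
        ])


def board_status(p: Proposal) -> str:
    """Return "candidate" if any listing requirement is unmet.

    "listed" is a placeholder label chosen for this sketch.
    """
    requirements_met = (
        len(p.champions) >= 2
        and p.has_core_dev_champion
        and p.severity is not None
        and p.affects is not None
        and p.has_support
    )
    return "listed" if requirements_met else "candidate"


if __name__ == "__main__":
    example = Proposal(
        title="Example change",
        champions=["Champion A", "Champion B"],
        has_core_dev_champion=True,
        severity=Severity.TYPICAL,
        affects=Affects.LIMITED,
        has_support=True,
    )
    print(board_status(example))
    print(example.header())
```

Keeping "severity" and "affects" as two separate fields mirrors the suggestion that the two together, rather than a single score, form the basis for discussion.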