# Dispatching discussion - 12/15/21

## Intros - Ralf

- Background: discussion here:
  * https://labs.quansight.org/blog/2021/11/pydata-extensibility-vision
  * https://discuss.scientific-python.org/t/a-proposed-design-for-supporting-multiple-array-types-across-scipy-scikit-learn-scikit-image-and-beyond/131
- Coordinated approach, a break from the new-package-per-hardware paradigm
  * Decision-making, timeframes - coordinated across projects

## Intros - Stefan

- Focus the discussion: support for other array objects vs. selective dispatching

## Discussion

- Error vs. conversion between hardware devices? What if there's a CPU step in the middle of a GPU processing chain?
  * Ralf: this should error: no automagic conversion between array types
  * Strictly-typed dispatch, no implicit conversions
  * Implicit conversions are difficult, both for users and developers
- Can't pass arbitrary array objects into a foreign array library
  * How does this work with NEP 18/NEP 35 & type-based dispatching?
    - The `__array_function__` and other array protocols are numpy-specific for the purposes of this discussion; not part of the array API standard
    - Sebastian: could have a duck-array fallback for this
- High-level question: do we want to do this? Is this useful?
  * Having talked to HW folks, e.g. AMD
  * What are the options when there is a more performant algorithm that relies on new hardware? Where do they plug in?
    - Copy the API and create a new library
    - Monkeypatch functionality in the existing library
  * These options are not particularly robust
  * Monkeypatching makes the system opaque - if you get strange results, it's difficult to trace the failures back for users
  * Explicit import switching: difficult to support in packages that are built e.g. on top of scipy
- What's the scope for this proposal? Multi-GPU libraries?
  * The assumptions change (?) when working with clusters, colocated devices
  * A: It seems like this should be okay as long as the distributed nature is entirely encapsulated in the backend
  * Take dask as an example - it currently adheres closely to the numpy API, but when you start really using distributed-specific features (e.g. blocksize) then things may not be as performant/may have new challenges
  * What about lazy systems?
    - Could identify the subset of features that don't work in a lazy implementation and mark them as such
- Is there any CI/CD infrastructure in place for testing on specialized hardware?
  * Public CI systems for HW is (should be?) a requirement
  * This is a bigger problem than just this project (see Apple M1)
  * Every feature should be tested somewhere, but who's responsible? Hardware-specific code will not live in the consumer libraries (e.g. scikits)
    - Functions, including dispatch, should be tested by the backend implementer
  * Dask is kind of a special case, more of a meta-array than a backend implementer
- What about `uarray`?
  * In scipy.fft, cupy has the test for dispatching on their end (see the sketch below)
  * It'd be nice if backend implementers could find a way to make a consumer library's test suite runnable w/ their backend
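For concreteness, a sketch of the existing `uarray`-based backend mechanism in `scipy.fft` referenced above. It assumes CuPy and a CUDA-capable GPU are available, with `cupyx.scipy.fft` acting as the backend:

```python
import cupy as cp
import cupyx.scipy.fft as cufft  # CuPy's uarray-compatible scipy.fft backend
import scipy.fft

# Inside this context, scipy.fft functions dispatch to CuPy's
# implementation when given CuPy arrays - no implicit host/device copies.
with scipy.fft.set_backend(cufft):
    y = scipy.fft.fft(cp.ones(8))
```

`scipy.fft.register_backend` can be used instead of the context manager when a backend should be registered process-wide rather than per-block.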
- Is there a way of defining the interface/design patterns so we can look at the design in a more abstract way?
  * Maybe abstract design principles should be better documented
- Numerical precision questions in testing: there should be a mechanism for users to specify what precision they need from backends for testing
  * One opinion: this should be up to the backend. The backend implementer documents what their precision is; upstream authors need to be aware
  * Gets trickier when you run the original test suite on a new backend
- Clarifying the distinction between what's been done in NumPy (e.g. NEPs) and what's being proposed for scipy
  * A SPEC (NEP-like document, cross-project) for documenting/decision-making
  * Re: numpy vs. scipy. The dispatching machinery in NumPy was not made public and wasn't designed to handle backend dispatching
  * NumPy dispatching only does type-dispatching, not backend-selection dispatching
- `__array_function__` and `__array_ufunc__` seem to work well in the RAPIDS ecosystem; can wait-and-see about adopting any new dispatching mechanism in numpy
- Backend registration, discovery, versioning, etc. How will this work?
  * Register on import?
  * Plugin-style, where the consumer looks for backends?
  * Backend versioning?
- Keeping backends and interfaces in sync
  * Re: compatibility - have the library dictate to backend implementers that they should follow the library's interface? Backends accept any signature and only use what they understand, ignoring the rest?
  * Related problem: rollout. Not everyone is publishing their libraries (or backends) synchronously.
    - +1, having an abstract API might help because the API can be versioned
  * Re: changing APIs - the majority of current API changes add new kwargs rather than breaking backward compatibility. This is handleable
  * Extremely important - needs to be laid out more concretely
- Have some way for backend implementers to indicate back to users how much of a given API they support. This usually isn't immediately obvious to users. Whose job is it to inform users how much of any given API is supported by a backend? Could be addressed via tooling
  * `numba` has a similar problem (currently handled manually), but some tooling exists for this
  * Similar problem in `dask` via `__array_function__`
  * Libraries may be able to reject backend registration if it's known to be incompatible - depends on how much information libraries have about what backends support
  * Nuances in compatibility tables: e.g. in `cupy`, there may be instances where the signature looks the same as `numpy`, but only a subset of options for a particular kwarg are supported
- Is it possible for library/API authors to make their tests importable/usable by backends?
  * Important for the previous point: how do implementers/libraries stay in sync?
- Who are the arbiters for deciding what's supported?
  * No arbiters - hope that people are honest/correct about it, but no formal role/procedure for verifying support
  * Library authors can't reasonably put requirements on backend implementers in terms of what fraction of the API *must* be supported
  * Opinion: it's best for backend implementers to have the freedom to decide what they provide or not
- In the future, libraries may be written where the user interfaces and backend interfaces are decoupled from the start by design
- Dispatching between libraries, e.g. `scipy.ndimage` and `scikit-image`
  * Another example: `scipy.optimize.minimize` w/in `scikit-learn`
- Two distinct concerns: duck-typing vs. type-based dispatching, and backend selection
  * Supporting various array types would be more work for library maintainers
  * Minimize duplication of pure-Python code by backend implementers
  * Combine `array_api` with a function-level dispatcher to support most use-cases (see the sketch below)
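To illustrate the `array_api` half of that combination: a minimal sketch of an array-agnostic function, relying only on inputs exposing `__array_namespace__` per the array API standard. The function name `softmax` and the commented usage via NumPy's experimental `numpy.array_api` namespace are illustrative assumptions, not part of the proposal:

```python
def softmax(x):
    # Fetch the array API namespace from the input itself; any compliant
    # array type works, and every subsequent operation stays within that
    # type's own library - no cross-library conversion happens.
    xp = x.__array_namespace__()
    e = xp.exp(x - xp.max(x))
    return e / xp.sum(e)

# Hypothetical usage with NumPy's experimental standard-compliant namespace:
# import numpy.array_api as xp
# print(softmax(xp.asarray([1.0, 2.0, 3.0])))
```

A function-level dispatcher would then cover what the standard cannot express, e.g. compiled code paths or backend-specific kernels.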
- Having `scikit-learn` be the API is viewed as a positive, +1 for dispatching
- What about a backend system for computational kernels shared by multiple GPU libs (e.g. CUDA solvers)?
  * Seems not high priority right now
  * There could be a "glue" layer for this internally

## Next steps

- Good ideas from the discussion above:
  * Sphinx(?) tooling for support tables
  * Making test suites reusable
- Things to document/describe better
- Collaborative long-term plans between libraries and potential backend implementers
- Get a SPEC started from a distillation of the discussion on discuss.scientific-python.org
- Come up with a minimal set of principles that will guide the effort
  * a, b, c need to be implemented before approaching technology decisions
    - Docs w/ tables
    - Allow function overrides, etc.
  * Should have this in place before discussion of specific technological approaches (e.g. `uarray`)
- A single SPEC has a lot less info than the blog posts + discussion + this meeting
  * Dynamic summarization of ongoing discussions would be useful
- Where to start with the concrete implementation? A formal SPEC
- Explicitly lay out the target audience and the potential (hoped-for?) impact
- Organize another call - open to the public

## Afternotes

- single-dispatch (stdlib) vs. multi-dispatch
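To make the afternote concrete: Python's standard library offers only single dispatch, which selects an implementation from the type of the first argument alone; multiple dispatch (as in e.g. the third-party `multipledispatch` package) considers the types of all arguments. A minimal sketch using `functools.singledispatch`, with a hypothetical `magnitude` function:

```python
from functools import singledispatch

@singledispatch
def magnitude(x):
    # Fallback for types with no registered implementation.
    raise NotImplementedError(f"no implementation for {type(x).__name__}")

@magnitude.register
def _(x: int):
    return abs(x)

@magnitude.register
def _(x: list):
    return sum(v * v for v in x) ** 0.5

print(magnitude(-3))      # 3   -- dispatched on int
print(magnitude([3, 4]))  # 5.0 -- dispatched on list
```

Dispatch here is decided by the first argument only, which is why NEP 18's `__array_function__` instead collects all relevant array arguments before deciding which implementation handles a call.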