# Dispatching discussion - 12/15/21
## Intros - Ralf
- Background: discussion here:
* https://labs.quansight.org/blog/2021/11/pydata-extensibility-vision
* https://discuss.scientific-python.org/t/a-proposed-design-for-supporting-multiple-array-types-across-scipy-scikit-learn-scikit-image-and-beyond/131
- Coordinated approach, break from the new-package-per-hardware paradigm
* Decision-making, timeframes - coordinated across projects
## Intros - Stefan
- Focus the discussion: support for other array objects vs. selective dispatching
## Discussion
- How should conversion between hardware devices be handled? What if there's a CPU step in the middle of a GPU processing chain?
* Ralf: this should error: no automagic conversion between array types
* Strictly-typed dispatch, no implicit conversions
* Implicit conversions are difficult, both for users and developers
- Can't pass arbitrary array objects into a foreign array library
* How does this work with NEP 18/NEP 35 & type-based dispatching?
- The `__array_function__` and other array protocols are numpy-specific for the purposes of this discussion; not part of the array API standard (minimal sketch of the protocol below)
- Sebastian: could have a duck-array fallback for this
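For context, a minimal sketch of the NEP 18 protocol mentioned above; `MyArray` is a hypothetical duck array, and a real implementation would delegate many more functions:

```python
import numpy as np

class MyArray:
    """Hypothetical duck array implementing the NEP 18 protocol."""
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # NumPy calls this hook instead of its own implementation
        # whenever a MyArray appears among the arguments.
        if func is np.mean:
            return MyArray(np.mean(self.data, **kwargs))
        return NotImplemented  # NumPy then raises TypeError

np.mean(MyArray([1.0, 2.0, 3.0]))  # routed through __array_function__
```

Note this dispatches purely on argument types; there is no way to select between two backends for the same array type, which is the gap the backend-selection discussion below is about.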
- High-level question: do we want to do this? Is this useful?
* Having talked to hardware folks (e.g. AMD): what are the options when there is a more performant algorithm that relies on new hardware? Where do they plug in?
- Copy API and have a new library
- Monkeypatch functionality in existing library
* These options are not particularly robust
* Monkeypatching makes the system opaque - if you get strange results, it's difficult to get useful failure information back to users (see the sketch after this list).
* Explicit import switching: difficult to support in packages that are built e.g. on top of scipy
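For context, the monkeypatching option criticized above looks roughly like this; `my_fast_fft` is a hypothetical stand-in for a vendor-optimized routine:

```python
import scipy.fft

_original_fft = scipy.fft.fft

def my_fast_fft(x, *args, **kwargs):
    # Stand-in for a hardware-vendor-optimized implementation.
    return _original_fft(x, *args, **kwargs)

# Every downstream caller of scipy.fft.fft is now silently affected;
# nothing at any call site reveals that the behavior changed.
scipy.fft.fft = my_fast_fft
```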
- What's the scope for this proposal? Multi-GPU libraries?
* The assumptions change (?) when working with clusters, colocated devices
* A: It seems like this should be okay as long as the distributed nature is entirely encapsulated in the backend
* Take dask as an example - currently adheres closely to numpy API, but when you start really using distributed-specific features (e.g. blocksize) then things may not be as performant/have new challenges
* What about lazy systems?
- Could identify the subset of features that don't work in a lazy implementation and mark them as such
- Is there any CI/CD infrastructure in place for testing on specialized hardware?
* Public CI systems for hardware are (should be?) a requirement
* This is a bigger problem than just this project (see Apple M1)
* Every feature should be tested somewhere, but who's responsible? Hardware-specific code will not live in the consumer libraries (e.g. scikits)
- Functions, including dispatch, should be tested by the backend implementer
* Dask is kind of a special case, more of a meta-array than a backend implementer.
- What about `uarray`?
* In `scipy.fft`, CuPy has the test for dispatching on their end (see the example after this list)
* It'd be nice if backend implementers could find a way to make a consumer library's test suite runnable w/ their backend
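For reference, the existing `uarray`-based mechanism in `scipy.fft` works roughly like this (following the pattern documented by CuPy; requires CuPy and a GPU):

```python
import cupy as cp
import cupyx.scipy.fft as cufft  # CuPy's uarray-compatible fft backend
import scipy.fft

x = cp.random.random(1024).astype(cp.complex64)
with scipy.fft.set_backend(cufft):
    y = scipy.fft.fft(x)  # dispatched to CuPy's GPU implementation
```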
- Is there a way of defining the interface/design patterns so the design can be examined in a more abstract way?
* Maybe abstract design principles should be better-documented
- Numerical precision questions in testing: there should be a mechanism for users to specify what precision they need from backends for testing (see the sketch after this list)
* One opinion: this should be up to the backend. Backend implementer documents what their precision is, upstream authors need to be aware
- Gets more tricky when you run the original test suite on a new backend
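One possible shape for that mechanism, with all names hypothetical: the backend declares its achievable precision, and tests read it instead of hard-coding a tolerance.

```python
import numpy as np

# Hypothetical: declared by the backend, e.g. in its registration metadata.
BACKEND_RTOL = 1e-5

def test_mean_matches_reference():
    x = np.linspace(0.0, 1.0, 101)
    # Compare against the reference result at the backend's declared tolerance.
    np.testing.assert_allclose(np.mean(x), 0.5, rtol=BACKEND_RTOL)
```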
- Clarifying the distinction between what's been done in NumPy (e.g. NEPS) and what's being proposed for scipy
* A SPEC (NEP-like document, cross-project) for documenting/decision-making
* Re: numpy vs. scipy. The dispatching machinery in NumPy was not made public and wasn't designed to handle backend dispatching
* NumPy dispatching only does type-dispatching, not backend-selection dispatching
- `__array_function__` and `__array_ufunc__` seem to work well in the RAPIDS ecosystem; can wait-and-see about adopting any new dispatching mechanism in numpy
- Backend registration, discovery, versioning etc. How will this work?
* Register on import?
* Plugin-style, where the consumer library looks for backends (see the sketch after this list)
* Backend versioning?
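A plugin-style discovery sketch using standard package entry points; the group name `"mylib.backends"` is hypothetical:

```python
from importlib.metadata import entry_points

def discover_backends():
    """Collect backends registered under a well-known entry-point group."""
    backends = {}
    for ep in entry_points(group="mylib.backends"):  # Python 3.10+ signature
        backends[ep.name] = ep.load()  # actual import deferred until here
    return backends
```

A backend package would advertise itself through an entry-point declaration in its own packaging metadata, so the consumer library can discover it without importing it eagerly; version information could ride along in that same metadata.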
- Keeping backends and interfaces in sync
* Re: compatibility - have the library dictate to backend implementers that they should follow the library's interface? Backends accept any signature and only use what they understand, ignoring the rest (see the sketch after this list)?
* Related problem: rollout. Not everyone publishes their libraries (or backends) synchronously.
- +1, having an abstract API might help because the API can be versioned
* Re: changing APIs
- The majority of current API changes add new kwargs and are not backward-incompatible; this is manageable
* Extremely important - needs to be laid out more concretely
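A sketch of the "accept any signature, only use what you understand" idea; `my_backend_mean` is hypothetical:

```python
import warnings
import numpy as np

def my_backend_mean(x, axis=None, **unknown_kwargs):
    # Tolerate keywords added by newer versions of the consumer library
    # rather than erroring; warn so users notice the mismatch.
    if unknown_kwargs:
        warnings.warn(f"ignoring unsupported arguments: {sorted(unknown_kwargs)}")
    return np.mean(x, axis=axis)

my_backend_mean(np.arange(6.0), keepdims=False)  # newer kwarg tolerated
```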
- Have some way for backend implementers to indicate to users how much of a given API they support; this usually isn't immediately obvious to users. Whose job is it to inform users how much of any given API a backend supports? Can be addressed via tooling.
* `numba` has a similar problem (its tables are currently manually updated), but there is some tooling for this
* Similar problem in `dask` via `__array_function__`
* Libraries may be able to reject backend registration if it's known to be incompatible - depends on how much information libraries have about what backends support
* Nuances in compatibility tables: e.g. in `cupy`, there may be instances where the signature looks the same as `numpy`, but only a subset of options for a particular kwarg are supported.
- Is it possible for library/API authors to make their tests importable/usable by backends?
* Important for the previous point: how do implementers/libraries stay in sync
- Who are the arbiters for deciding what's supported?
* No arbiters - hope that people are honest/correct about it, but no formal role/procedure for verifying support
* Library authors can't reasonably put requirements on backend implementers in terms of what fraction of the API *must* be supported
* Opinion: it's best for backend implementers to have freedom to decide what they provide or not
- In the future, libraries may be written where the user interfaces and backend interfaces are decoupled from the start by design
- Dispatching between libraries, e.g. `scipy.ndimage` and `scikit-image`
* Another example: `scipy.optimize.minimize` w/in `scikit-learn`
- Two distinct concerns: duck-typing vs. type-based dispatching and backend selection
* Supporting various array types would be more work for library maintainers
* Minimize duplication of pure-Python code by backend implementers
* Combine `array_api` with a function-level dispatcher to support most use cases (see the sketch after this list)
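The `array_api` half of that combination could look like the following: the pure-Python part of a library is written once against whatever namespace the input array provides via the standard `__array_namespace__` hook (a sketch, assuming a standard-compliant input):

```python
def standardize(x):
    """Zero-mean/unit-variance scaling for any array API compliant type."""
    xp = x.__array_namespace__()  # e.g. numpy.array_api, or a GPU library
    return (x - xp.mean(x)) / xp.std(x)
```

The function-level dispatcher would then only be needed where a backend wants to replace a whole routine, e.g. with a compiled kernel.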
- Having `scikit-learn` be the API is viewed as a positive, +1 for dispatching
- What about a backend system for computational kernels shared by multiple GPU libs? (E.g. CUDA solvers)
* Seems not to be a high priority right now.
* There could be a "glue" layer for this internally
## Next steps
- Good ideas from the discussion above:
* Sphinx(?) tooling for support tables
* Making test suites reusable
- Things to document/describe better
- Collaborative long-term plans between libraries and potential backend-implementers
- Get a SPEC started from a distillation of the discussion on discuss.scientific-python.org
- Come up with a minimal set of principles that will guide the effort
* a, b, c need to be implemented before approaching technology decisions
- Docs w/ tables
- Allow function overrides, etc.
* Should have this in place before discussion about specific technological approaches (e.g. `uarray`)
- A single SPEC has a lot less info than the blog posts + discussion + this meeting
* Dynamic summarization of ongoing discussions would be useful
- Where to start with the concrete implementation? A formal SPEC
- Explicitly lay out the target audience and potential (hoped-for?) impact
- Organize another call - open to the public
## Afternotes
- Single dispatch (stdlib) vs. multiple dispatch (see the sketch below)
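For reference, the stdlib mechanism selects an implementation from the type of the first argument only, whereas multiple dispatch would consider every argument's type. A minimal single-dispatch example:

```python
from functools import singledispatch

@singledispatch
def describe(x):
    return f"no special handling for {type(x).__name__}"

@describe.register
def _(x: list):
    return f"a list of {len(x)} items"

@describe.register
def _(x: dict):
    return f"a dict with keys {sorted(x)}"

print(describe([1, 2, 3]))   # dispatches on list
print(describe({"a": 1}))    # dispatches on dict
print(describe(3.14))        # falls back to the generic implementation
```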