# Dispatching in scikit-image: Summary and future developments
The goal of backend dispatching in `scikit-image` is to provide compatibility with various backends, written in different languages and targeting different hardware and architectures, without breaking any existing code or the current user-facing API. This is crucial for extending the library’s capabilities and ensuring it runs well in diverse environments.
## Current Implementation in [scikit-image #7520](https://github.com/scikit-image/scikit-image/pull/7520)
- Uses `importlib.metadata.entry_points()` to list and filter available backends via the `skimage_backends` and `skimage_backend_infos` entry point groups. NetworkX has a similar [Python `entry_point`](https://packaging.python.org/en/latest/specifications/entry-points/)-based dispatching implementation ([see here](https://github.com/networkx/networkx/blob/main/networkx/utils/backends.py)).
- Dispatching can be disabled by setting the environment variable `SKIMAGE_NO_DISPATCHING=1`. If no backends are installed, all dispatching-related code is a no-op (the decorator removes itself).
- Functions are marked as dispatchable using a `@dispatchable` decorator, which checks for backend implementations of the function when it is called. (The decorator would become a class in the future to enhance the dispatching capabilities and to better manage dispatchable algorithms.)
- `can_has` is the mechanism that allows a backend to accept or reject a function call based on the function name and the arguments passed in. If it returns `False`, dispatch moves on to the next backend.
- Currently, users cannot explicitly specify the backend they wish to use. Instead, all installed backends are sorted alphabetically, and the dispatch mechanism selects the first backend that both implements the function and whose `can_has` method doesn't return `False`. This backend's implementation is then used for the function call.
- The `BackendInformation` class allows backends to specify which functions they implement as well as additional information about the functions they support.
- A `DispatchNotification` warning is issued when a function is dispatched to a backend, informing the user about the dispatch.
- Backend discovery is cached using `functools.cache` to avoid repeated lookups.
- If no backend implements the function or dispatching is disabled, the original scikit-image function is used.
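The flow described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the names `skimage_backends`, `can_has`, `DispatchNotification`, `SKIMAGE_NO_DISPATCHING`, and `@dispatchable` come from the description above, while `discover_backends`, `dispatch_call`, the `"module:qualname"` function key, and the assumption that a loaded backend exposes `get_implementation(name)` are hypothetical choices made for this example.

```python
import functools
import os
import warnings
from importlib.metadata import entry_points


class DispatchNotification(RuntimeWarning):
    """Tells the user that a call was handled by a backend."""


@functools.cache
def discover_backends():
    """Load installed backends once; the `group=` keyword needs Python 3.10+."""
    return {ep.name: ep.load() for ep in entry_points(group="skimage_backends")}


def dispatch_call(func, backends, *args, **kwargs):
    """Run `func` on the first willing backend, else on the original function."""
    # Hypothetical naming scheme used to identify a function to backends.
    name = f"{func.__module__}:{func.__qualname__}"
    for backend_name in sorted(backends):  # alphabetical order
        backend = backends[backend_name]
        # Assumed backend interface: get_implementation() and can_has().
        impl = backend.get_implementation(name)
        if impl is None or backend.can_has(name, *args, **kwargs) is False:
            continue  # try the next backend
        warnings.warn(
            f"Call to '{name}' was dispatched to the '{backend_name}' backend.",
            DispatchNotification,
            stacklevel=2,
        )
        return impl(*args, **kwargs)
    # No backend took the call: use the original implementation.
    return func(*args, **kwargs)


def dispatchable(func):
    """Mark `func` as dispatchable; a no-op when dispatching cannot happen."""
    if os.environ.get("SKIMAGE_NO_DISPATCHING") == "1" or not discover_backends():
        return func  # the decorator removes itself

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return dispatch_call(func, discover_backends(), *args, **kwargs)

    return wrapper
```

Keeping the selection loop in a plain function that takes the backend mapping as an argument makes it possible to exercise the logic with stand-in backend objects, without installing any entry points.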
## Future Enhancements
To make the system more flexible and robust, several improvements are proposed:
1. Enabling dispatching mechanisms other than the environment-variable-based one, such as:
    - Kwarg-based dispatching: check the function signature for a `backend=` kwarg and dispatch based on its value.
    - Type-based dispatching: dispatch based on the input array types (e.g., `cupy -> cupy`).
2. Backend-Specific Arguments: If necessary, introduce support for backend-specific arguments, with clear guidelines and documentation for their usage.
    - A few more far-reaching extensions (ideas) of this:
        - Support for non-`scikit-image` algorithms in the backend.
        - Allow dispatching from one backend to another if supported. A single backend could support multiple array types. Ensure that different array types (e.g., `cupy`, `numpy`) work seamlessly with the backends.
3. **Fallback**: Instead of falling back to scikit-image's implementation when the selected backend does not implement the function or cannot handle the call, fall back to other backend(s) according to a backend priority list provided by the user.
    - Compatibility among multiple array types.
4. Testing the dispatching system should focus on:
    - General tests applicable to all backends, such as simple unit tests ensuring that backends are correctly discovered through `scikit-image`'s entry points ([relevant PR](https://github.com/networkx/nx-parallel/pull/89)).
    - Running scikit-image tests for backends to test their algorithms (opt-in).
        - Implemented in NetworkX; it might or might not be a feature that scikit-image backends want.
5. Dispatching Docs:
    - Displaying which backend implementations exist for an algorithm on scikit-image's user-facing docs website.
    - Key points from the related [issue #7550](https://github.com/scikit-image/scikit-image/issues/7550):
        - Using JSON documents to track backend-supported functions and version compatibility, allowing real-time updates without new scikit-image releases. (But is this possible with Sphinx?)
        - Suggests dynamically updating function docstrings to show available backends, with concerns about potential side effects.
        - Explores handling new backends after release and providing more tools for inspecting backend states.
    - Docs for backend developers (how to create a backend, etc.), docs for backend users, and general docs (how dispatching is set up and how it works, etc.; this could be an enhancement proposal doc).
6. Better Introspection
7. **Version Compatibility**: Define guidelines for version compatibility between `scikit-image` and its backends, including how new versions of backends are supported. Additionally, some general guidelines for backend developers would be beneficial.
8. Challenges:
    - Functions that don't fit in this dispatching machinery?
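As one way to picture how items 1 and 3 above could combine, here is a rough sketch of ordering the backends to try for a single call. Nothing here exists in scikit-image: the function `candidate_backends`, the `installed` mapping from backend name to accepted array libraries, and the backend names used in the usage note are all hypothetical.

```python
def candidate_backends(installed, args, backend=None, priority=()):
    """Order the backends to try for one call (all names are hypothetical).

    installed: {backend_name: set of array-library names it accepts}
    backend:   value of an explicit `backend=` kwarg, if any
    priority:  user-provided fallback order of backend names
    """
    if backend is not None:
        # Kwarg-based: an explicit request is honored or rejected loudly.
        if backend not in installed:
            raise ValueError(f"Unknown backend: {backend!r}")
        first = [backend]
    else:
        # Type-based: match the top-level module of each argument's type,
        # e.g. "cupy" for an array whose type lives in a cupy submodule.
        libs = {type(arg).__module__.split(".")[0] for arg in args}
        first = sorted(
            name for name, accepted in installed.items() if libs & accepted
        )
    # Fallback: append the user's priority list, skipping unknown or
    # already-selected backends.
    rest = [name for name in priority if name in installed and name not in first]
    return first + rest
```

For example, with `installed = {"gpu_a": {"cupy"}, "cpu_fast": {"numpy"}}` and a CuPy array as input, the type-based step would select `gpu_a`, and `priority=("cpu_fast",)` would add `cpu_fast` as the next backend to try. Whether the scikit-image implementation is still tried last is a policy decision this sketch leaves to the caller.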
## Any thoughts on the broader scientific Python ecosystem and `entry-point`-based dispatching? (`spatch`, SPEC2, etc.)
- What kind of projects should adopt it?
- Which parts of this `entry-point`-based dispatching would be common to most libraries (like backend discovery, etc.), and which parts would be specific to a particular library?