Implementing backend plugins is currently hard, especially for data scientists and ML engineers who do not have working knowledge of Golang. Meeting performance requirements, and maintaining and developing the plugins, is also cumbersome.
This document proposes a path to make it possible to write plugins rapidly, while decoupling them from the core flytepropeller engine.
Flytekit Backend Plugin is a plugin registry written with FastAPI. Both users and Propeller can send requests to it to run jobs (BigQuery, Databricks). Moreover, this registry should be stateless, so that we can easily scale it out to multiple machines.
FlytePropeller Backend System Plugin: a new web API plugin that sends requests to the flytekit backend service to create a job on an external platform.
Users should be able to write the plugin using a REST-like interface. In this proposal we recommend using a simplified form of WebAPI plugins. The goal of the plugins is to enable the following functions:
It should be possible to implement this using a web service framework like FastAPI. A raw implementation for a FastAPI-like server can be as follows.
Note: It should be possible to implement multiple resource plugins per Python library. Thus each resource should be delimited.
Ideally, we should provide a simplified interface to this in flytekit. This would mean that a flytekit plugin can simply use this interface to create both a local plugin and a backend plugin.
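One way such a simplified interface might look is sketched below. `BackendPlugin`, `PluginRegistry` and `EchoPlugin` are hypothetical names introduced for illustration and are not existing flytekit APIs. The registry is keyed by task type, which is one possible way to keep several resources from the same library delimited.

```python
from abc import ABC, abstractmethod
from itertools import count
from typing import Dict


class BackendPlugin(ABC):
    """Hypothetical base class; not an existing flytekit API."""

    def __init__(self, task_type: str) -> None:
        self.task_type = task_type

    @abstractmethod
    def create(self, inputs: Dict[str, str]) -> str:
        """Start a job and return its identifier."""

    @abstractmethod
    def get(self, job_id: str) -> str:
        """Return the job state."""

    @abstractmethod
    def delete(self, job_id: str) -> None:
        """Cancel the job."""


class PluginRegistry:
    """Maps a task type to exactly one plugin instance."""

    def __init__(self) -> None:
        self._plugins: Dict[str, BackendPlugin] = {}

    def register(self, plugin: BackendPlugin) -> None:
        if plugin.task_type in self._plugins:
            raise ValueError(f"duplicate plugin for {plugin.task_type!r}")
        self._plugins[plugin.task_type] = plugin

    def get(self, task_type: str) -> BackendPlugin:
        return self._plugins[task_type]  # raises KeyError if unknown


class EchoPlugin(BackendPlugin):
    """Toy plugin that pretends every job succeeds immediately."""

    def __init__(self, task_type: str) -> None:
        super().__init__(task_type)
        self._ids = count()
        self._jobs: Dict[str, str] = {}

    def create(self, inputs: Dict[str, str]) -> str:
        job_id = f"{self.task_type}-{next(self._ids)}"
        self._jobs[job_id] = "SUCCEEDED"
        return job_id

    def get(self, job_id: str) -> str:
        return self._jobs[job_id]

    def delete(self, job_id: str) -> None:
        self._jobs.pop(job_id, None)


# Two resources shipped by the same library, kept apart by task type.
registry = PluginRegistry()
registry.register(EchoPlugin("bigquery"))
registry.register(EchoPlugin("snowflake"))
```

The same plugin object could be called directly for local execution or wrapped by a web service for backend execution, which is the dual use the paragraph above describes.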
The workflow code that runs Snowflake or Databricks tasks should not need to change. We can add a new config, enable_backend_flytekit_plugin_system. If it is true, the job will be handled by the backend plugin system. If the plugin cannot be found in the registry, Propeller falls back to its default web API plugin.
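The decision order described above can be sketched as follows. The real dispatch would live in FlytePropeller (Go); this Python sketch only illustrates the logic, and `dispatch`, `default_web_api_plugin` and the handler signature are hypothetical names, not part of Flyte.

```python
from typing import Callable, Dict

# Hypothetical handler type: takes a task type, returns a description of
# which component handled the job.
Handler = Callable[[str], str]


def default_web_api_plugin(task_type: str) -> str:
    """Stand-in for Propeller's built-in web API plugin."""
    return f"propeller-default:{task_type}"


def dispatch(
    task_type: str,
    registry: Dict[str, Handler],
    enable_backend_flytekit_plugin_system: bool,
) -> str:
    """Route a job to the backend plugin system or fall back to the default."""
    if enable_backend_flytekit_plugin_system and task_type in registry:
        return registry[task_type](task_type)
    # Either the flag is off or no plugin is registered for this task type.
    return default_web_api_plugin(task_type)


backend_registry: Dict[str, Handler] = {
    "bigquery": lambda task_type: f"backend-plugin:{task_type}",
}
```

With this ordering, turning the flag on never breaks a task type that has no backend plugin yet, since it keeps using the default path.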
Secrets management should not be forced to follow Flyte’s convention, though should we provide a simplified secrets reader built on flytekit’s secret system?
There are two options for deploying the backend plugin system.
Pros:
Cons:
Pros:
Cons:
The backend plugin system can run as an independent deployment. Because it is stateless, we can simply increase the number of replicas as request volume grows.
This YAML file can be added to the Flyte Helm chart (flyte-core).
Compare the memory and CPU consumption of the FastAPI server with that of the current Propeller server.
The round-trip latency between gRPC and the FastAPI server.
The round-trip latency between gRPC and the current Propeller.
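A minimal sketch of how the latency comparisons might be collected, assuming each path under test can be wrapped in a zero-argument callable. `measure_round_trip` is a hypothetical helper, and the choice of mean, median and maximum as summary statistics is an assumption rather than part of this proposal.

```python
import statistics
import time
from typing import Callable, Dict


def measure_round_trip(call: Callable[[], object], repeats: int = 100) -> Dict[str, float]:
    """Time repeated invocations of `call` and summarise them in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()  # e.g. one request to the FastAPI server or to Propeller
        samples.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(samples),
        "p50_s": statistics.median(samples),
        "max_s": max(samples),
    }
```

Running the same helper against both paths with identical payloads and repeat counts keeps the two numbers comparable.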