
<2023-03-02>

## Getting feedback

Think not only about end users but also about ambassadors, since they can bring in new users.

## In Potential Scope

- Model Cards
  - has the potential to be adopted by upstream sklearn
  - the app to create one, easier inspection of the models added to the model cards
  - there are annoying issues with model cards that make them a bit tricky to create and work with
  - we think this is somewhat high priority: the annoyances need to be fixed, otherwise users will be deterred from using it
- Persistence
  - skops
    - it's in good shape, but needs many improvements, such as better inspection of the file, better support for C-extension types, extensibility, etc.
    - there is a lot of optimization potential in the format's speed and size
    - this is high priority if we want to push it more aggressively. For instance, it would need to be more stable for a place like SageMaker to potentially support it.
  - ONNX
    - it kinda works, as long as you have a single sklearn estimator
    - todo: better tools to check whether a user's model is supported, and better support for complex estimators such as pipelines and column transformers
    - it would also need much better docs and a workflow for people to implement what's needed for a custom estimator
    - we haven't started this; it has potential to bring users, but it'd be a bit of work
- Serving
  - we do serving right now, but it's very slow, and half the time it times out
  - has many issues with different dtypes, etc.
  - could support a better way than sending/receiving JSON
  - improve inference performance
  - pretty much all of the work here would be in the api-inference-community repo
  - priority is not clear since there isn't much usage on the backend side. But it's more of a chicken-and-egg problem. Some improvements would be nice.
  - documenting on the skops side how to serve using FastAPI would be nice

## Usage

Downloads from PyPI have been pretty stable at around 50-100 a day.
We've been having continuous engagement from community members on the repo. The scikit-learn core team is excited about the project.

Ideas on increasing usage:

- outreach: we're giving talks
  - PyCon Italy is the next scheduled one
  - did one at the PyData Berlin meetup
  - did another talk with the Linux Foundation's ML Security Committee
- usage/conversion:
  - the model card app would probably make people happy
  - we need clearer communication on why people should use the skops format; people are still confused about what the difference between `.skops` and `.onnx` is, for example
  - many people know of ONNX; convenience methods for that could bring them here
  - people always look for how to deploy their models online; if we have something related to that, it'd bring more people here. But this is a very large topic, and a whole bunch of companies/teams work on it.
  - we could try to convince a place like SageMaker to support the `.skops` format (the format would need to be more mature for that)
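
As a discussion aid for the Serving item about finding a better way than sending/receiving JSON, here is a minimal sketch of a binary request body that carries its dtype and shape explicitly. The wire layout, `HEADER`, `encode_rows`, and `decode_rows` are all hypothetical names made up for this note; nothing here reflects how api-inference-community currently works, and it uses only the Python standard library.

```python
import math
import struct

# Hypothetical wire layout (an assumption for illustration only):
#   1 byte  : struct type code, b"d" (float64) or b"f" (float32)
#   4 bytes : number of rows, little-endian uint32
#   4 bytes : number of columns, little-endian uint32
#   rest    : values in row-major order, little-endian
HEADER = struct.Struct("<cII")
SUPPORTED_CODES = ("d", "f")


def encode_rows(rows, type_code="d"):
    """Pack a rectangular list of feature rows into bytes."""
    if type_code not in SUPPORTED_CODES:
        raise ValueError(f"unsupported type code: {type_code!r}")
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    if any(len(row) != n_cols for row in rows):
        raise ValueError("all rows must have the same number of columns")
    flat = [float(value) for row in rows for value in row]
    header = HEADER.pack(type_code.encode("ascii"), n_rows, n_cols)
    return header + struct.pack(f"<{len(flat)}{type_code}", *flat)


def decode_rows(payload):
    """Return (type_code, rows) recovered from an encode_rows payload."""
    code, n_rows, n_cols = HEADER.unpack_from(payload, 0)
    type_code = code.decode("ascii")
    if type_code not in SUPPORTED_CODES:
        raise ValueError(f"unsupported type code: {type_code!r}")
    flat = struct.unpack_from(
        f"<{n_rows * n_cols}{type_code}", payload, HEADER.size
    )
    rows = [list(flat[i * n_cols:(i + 1) * n_cols]) for i in range(n_rows)]
    return type_code, rows


if __name__ == "__main__":
    rows = [[5.1, 3.5], [float("nan"), 1.4]]
    payload = encode_rows(rows)
    type_code, decoded = decode_rows(payload)
    # The NaN survives the round trip and the dtype is stated in the header.
    print(type_code, len(payload), math.isnan(decoded[1][0]))
```

The point of the header is that the server no longer has to guess the dtype from the parsed values, and special values such as NaN round-trip unchanged, whereas strict JSON has no NaN literal (Python's `json.dumps` raises `ValueError` for it when `allow_nan=False`). Whether this would actually help with the timeouts is an open question; it only addresses the dtype ambiguity.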