# Lecture 2: What is a Machine Learning Platform
Part of the mini-course [Apache Submarine: Design and Implementation of a Machine Learning Platform](https://hackmd.io/@submarine/B17x8LhAH). Day 1, [Lecture 2](https://cloudera2008-my.sharepoint.com/:p:/g/personal/weichiu_cloudera2008_onmicrosoft_com/EWewCTOgLDhMrypPSKRxb-sBe5n85XGqequJfqWcHor2wg?e=D5fOrX)
* 1.5 hr
* Also known as machine learning infrastructure, AI infrastructure, Machine Learning Operations (MLOps)
* Why do you need a system at all?
* Reduce time/effort to develop ML product, simplify workflow
* Repeatable process
* Support a variety of ML frameworks, users
* The field is booming: there are many ML algorithms and frameworks, and data scientists have not standardized on any of them.
* Easier to evaluate models
* Easy to scale up/down, push to production
* Industrialization of AI / Productive ML / Productionizing ML...
* Worth noting: this is a new domain, and people are still figuring it out. That makes it an exciting area, but best practices are yet to be established, and what I talk about today may become outdated really soon.
* Why not just a notebook, like Jupyter?
* Collaboration
* [Interesting read: “I Don't Like Notebooks”](https://docs.google.com/presentation/d/1n2RlMdmv1p25Xy5thJUhkKGvjtV-dkAIsUXP-AL4ffI/edit?usp=sharing)
* To summarize: hidden state, bad coding habits, poor modularity, poor reproducibility, hard to share across media
* Reproducibility: Python library version
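
To make the library-version point concrete, here is a minimal sketch of recording the environment next to an experiment so that a later rerun can detect version drift. It uses only the standard library (`importlib.metadata`); the function names `snapshot_environment` and `diff_environments` are hypothetical helpers made up for this illustration and do not come from any platform discussed in this lecture.

```python
import sys
from importlib import metadata


def snapshot_environment(packages):
    """Return the Python version plus the installed version of each named package.

    Packages that are not installed map to None instead of raising.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "packages": versions,
    }


def diff_environments(recorded, current):
    """List the package names whose versions differ between two snapshots."""
    names = set(recorded["packages"]) | set(current["packages"])
    return sorted(
        name
        for name in names
        if recorded["packages"].get(name) != current["packages"].get(name)
    )
```

A notebook could store the result of `snapshot_environment(["numpy", "pandas"])` alongside its outputs; on a rerun, a non-empty `diff_environments(old, new)` is a hint that differing results may come from the environment rather than from the code.
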
* Why not a data visualization/BI tool?
* Why not an ML framework?
* ML Platform → ML Framework → ML Algorithms
* ML Platform = supports one or more ML frameworks + a toolkit to support the ML workflow
* ML Framework / ML library = an implementation of one or more ML algorithms, in one or more languages
* What is there in the market
* [13 frameworks for mastering machine learning](https://www.idginsiderpro.com/article/3026262/13-frameworks-for-mastering-machine-learning.html)
* Open source: Submarine, MLflow, Kubeflow, TFX
* MLflow
* Open source platform for managing the end-to-end machine learning lifecycle.
* Tracking experiments to record and compare parameters and results ([MLflow Tracking](https://mlflow.org/docs/latest/tracking.html#tracking)).
* Packaging ML code in a reusable, reproducible form in order to share with other data scientists or transfer to production ([MLflow Projects](https://mlflow.org/docs/latest/projects.html#projects)).
* Managing and deploying models from a variety of ML libraries to a variety of model serving and inference platforms ([MLflow Models](https://mlflow.org/docs/latest/models.html#models)).
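
The idea behind experiment tracking (record parameters and results per run, then compare runs) can be sketched in a few lines. The following is a toy stand-in written with the standard library only; it is not MLflow's API, and the class `RunTracker` and its methods are hypothetical names invented for this example. It assumes one JSON file per run is an acceptable storage format.

```python
import json
import tempfile
import uuid
from pathlib import Path


class RunTracker:
    """Toy experiment tracker: one JSON file per run under a root directory."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def log_run(self, params, metrics):
        """Persist one run's parameters and metrics; return its generated id."""
        run_id = uuid.uuid4().hex
        record = {"run_id": run_id, "params": params, "metrics": metrics}
        (self.root / f"{run_id}.json").write_text(json.dumps(record))
        return run_id

    def load_runs(self):
        """Read every recorded run back from disk."""
        return [json.loads(p.read_text()) for p in self.root.glob("*.json")]

    def best_run(self, metric, higher_is_better=True):
        """Return the run with the best value of `metric`, or None if no run logged it."""
        runs = [r for r in self.load_runs() if metric in r["metrics"]]
        if not runs:
            return None
        pick = max if higher_is_better else min
        return pick(runs, key=lambda r: r["metrics"][metric])


# Usage: log two runs with different learning rates, then compare them.
tracker = RunTracker(tempfile.mkdtemp())
tracker.log_run({"learning_rate": 0.1}, {"accuracy": 0.81})
tracker.log_run({"learning_rate": 0.01}, {"accuracy": 0.87})
best = tracker.best_run("accuracy")
```

A real tracking service adds what this sketch leaves out, such as a shared server, artifact storage, and a UI for comparing runs, which is part of the argument for a platform over ad hoc scripts.
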
* Kubeflow
* TBD
* TFX
* machine learning platform based on TensorFlow
* Paper: [https://ai.google/research/pubs/pub46484](https://ai.google/research/pubs/pub46484)
* [Compare to other ML pipeline](https://docs.google.com/document/d/1KK1aNsivo6Eyji4r71bLIdwS-hdLeVX0_HCkgQZYoFM/edit#heading=h.wdicv4gxymrz)
* Others:
* Apache Singa, Apache Marvin
* Lyft Flyte
* [https://flyte.org/](https://flyte.org/)
* Combines machine learning and data processing into a platform.
* Netflix Metaflow
* [Open-Sourcing Metaflow, a Human-Centric Framework for Data Science](https://medium.com/netflix-techblog/open-sourcing-metaflow-a-human-centric-framework-for-data-science-fa72e04a5d9)
* Commercial: Cloudera Data Science Workbench, SageMaker, Azure Machine Learning Studio, H2O.ai, SAS, RapidMiner, Dataiku, DataRobot, IBM DSX...
* Apache Submarine
* Big data, large scale, distributed GPU training
* [Submarine project spin-off to TLP proposal](https://docs.google.com/document/d/1kE_f-r-ANh9qOeapdPwQPHhaJTS7IMiqDQAS8ESi4TA/edit)
* Covers algorithm development, model batch training, model incremental training, model online serving, and model management
* Why Submarine: the data is already stored in a Hadoop cluster, so ML/DL jobs naturally run on that cluster.
* Big tech companies have already developed ML platforms (MLPs), but they are generally not open source. Submarine intends to be the open standard for machine learning platforms.
### Related conferences
* USENIX Symposium on Networked Systems Design and Implementation (NSDI)
* [https://www.usenix.org/conference/nsdi20](https://www.usenix.org/conference/nsdi20)
* ACM Symposium on Operating Systems Principles (SOSP)
* [https://sosp19.rcs.uwaterloo.ca/](https://sosp19.rcs.uwaterloo.ca/)
* Machine Learning and Systems (MLSys)
* [https://mlsys.org/](https://mlsys.org/)
* USENIX Conference on Operational Machine Learning (OpML)
* [https://www.usenix.org/conference/opml19](https://www.usenix.org/conference/opml19)

### Submarine Architecture

### TFX

###### tags: `2019-minicourse-submarine` `Machine Learning`