# ML Platform Research Note
- [HackMD – LaTeX Syntax and Examples](https://hackmd.io/@sysprog/B1RwlM85Z?type=view)
## Paper
- [Clipper: A Low-Latency Online Prediction Serving System](https://www.usenix.org/system/files/conference/nsdi17/nsdi17-crankshaw.pdf)
- [TensorFlow-Serving: Flexible, High-Performance ML Serving](http://learningsys.org/nips17/assets/papers/paper_1.pdf)
- [A Case for Serverless Machine Learning](http://learningsys.org/nips18/assets/papers/101CameraReadySubmissioncirrus_nips_final2.pdf)
- [TicTac: Accelerating Distributed Deep Learning with Communication Scheduling](https://mlsys.org/Conferences/2019/doc/2019/199.pdf)
- [PipeMare: Asynchronous Pipeline Parallel DNN Training](https://arxiv.org/pdf/1910.05124.pdf)
- [TFX: A TensorFlow-Based Production-Scale Machine Learning Platform](https://storage.googleapis.com/pub-tools-public-publication-data/pdf/b500d77bc4f518a1165c0ab43c8fac5d2948bc14.pdf)
- [Kubebench: A Benchmarking Platform for ML Workloads](https://alln-extcloud-storage.cisco.com/ciscoblogs/5c0fda3a560b9.pdf)
- [Hidden Technical Debt in Machine Learning Systems](https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf)
- [Deep CTR Prediction in Display Advertising](https://arxiv.org/pdf/1609.06018.pdf)
- [Large-Scale Machine Learning at Twitter](http://users.umiacs.umd.edu/~jimmylin/publications/Lin_Kolcz_SIGMOD2012.pdf)
- [Practical Lessons from Predicting Clicks on Ads at Facebook](https://research.fb.com/wp-content/uploads/2016/11/practical-lessons-from-predicting-clicks-on-ads-at-facebook.pdf)
- [On Challenges in Machine Learning Model Management](http://sites.computer.org/debull/A18dec/p5.pdf)
- [An Experimentation and Analytics Framework for Large-Scale AI Operations Platforms](https://www.usenix.org/system/files/opml20-paper-rausch_0.pdf)
## PPT
- [Accelerated Computing for AI](http://learningsys.org/nips18/assets/slides/Catanzaro_AI_Systems_Workshop_2018.pdf)
- [PipeDream: Generalized Pipeline Parallelism for DNN Training](https://sosp19.rcs.uwaterloo.ca/slides/narayanan.pdf)
- [MLOp Lifecycle Scheme for Vision-based Inspection Process in Manufacturing](https://www.usenix.org/sites/default/files/conference/protected-files/opml19_slides_lim.pdf)
- [[Lecture] SOFA Quick Start](https://docs.google.com/presentation/d/1fyNnLlU-0WMIddkI8hgYn0Tg1vbP9i7VuXSPIsXB2L4/edit#slide=id.g5c0adbf077_0_422)
- [Bighead: Airbnb’s End-to-End Machine Learning Infrastructure](https://static1.squarespace.com/static/53629df3e4b02e2dc6655a87/t/5d7bb27e4e663e641fc69c15/1568387721126/B147+-+Hoh%2C+Andrew.pdf)
- [Apache Hadoop Machine Learning Engine Submarine and Its Ecosystem, Liu Xun](https://myslide.cn/slides/18398)
- [Scaling Deep Learning on Hadoop at LinkedIn](https://www.slideshare.net/ssuser72f42a/scaling-deep-learning-on-hadoop-at-linkedin)
- [MLeap: Productionize Data Science Workflows Using Spark](https://www.slideshare.net/JenAman/mleap-productionize-data-science-workflows-using-spark?next_slideshow=1)
- [Bighead: Airbnb’s End-to-End Machine Learning Infrastructure](https://cdn.oreillystatic.com/en/assets/1/event/278/Bighead_%20Airbnb_s%20end-to-end%20machine%20learning%20platform%20Presentation.pdf)
- [Machine Learning as Code and Kubernetes with Kubeflow](https://myslide.cn/slides/13049)
- [Kubebench: Benchmarking ML Workloads on Kubernetes](https://schd.ws/hosted_files/kccncchina2018english/17/Kubebench_KubeCon2018China.pdf)
- [ML Ops and Kubeflow Pipelines](https://www.usenix.org/sites/default/files/conference/protected-files/srecon19apac_slides_sato.pdf)
- [Building AI Platform Based on Kubernetes and TensorFlow](http://bos.itdks.com/a1d52ddb24d34f19a194c83a30ff6f43.pdf)
- [Apache Spark Model Deployment](https://www.slideshare.net/databricks/apache-spark-model-deployment)
- [What are the Unique Challenges and Opportunities in Systems for ML](https://www.slideshare.net/matei/what-are-the-unique-challenges-and-opportunities-in-systems-for-ml)
- [Kubeflow++ Building an Open Source Data Science Platform](https://events19.linuxfoundation.org/wp-content/uploads/2017/12/Kubeflow-Building-and-Operating-a-OSS-Data-Science-Platform-J%C3%B6rg-Schad-Mesosphere.pdf)
- [Zipline—Airbnb’s Declarative Feature Engineering Framework](https://www.slideshare.net/databricks/ziplineairbnbs-declarative-feature-engineering-framework)
## GitHub
- https://github.com/kanonjz/paper
- https://github.com/cortexlabs/cortex
- https://github.com/ucbrise/clipper
- https://github.com/tensorflow/serving
- https://github.com/Angel-ML/serving
- https://github.com/tensorflow/tensorrt
- https://github.com/tensorflow/tfx
- https://github.com/kubeflow/examples
- https://github.com/microsoft/nni
- https://github.com/apache/submarine
- https://github.com/tensorflow/cloud
- https://github.com/Netflix/metaflow
- https://github.com/quantumblacklabs/kedro
- https://github.com/HDI-Project/AutoBazaar
- https://github.com/awslabs/djl
- https://github.com/bentoml/BentoML
## Conference
- [List of Machine Learning and Deep Learning conferences in 2020](https://tryolabs.com/blog/machine-learning-deep-learning-conferences/)
- [Systems for ML 2018](http://learningsys.org/nips18/acceptedpapers.html)
- [SysML Conference 2019](https://mlsys.org/Conferences/2019/index.html#schedule)
- [aiconference](https://aiconference.london/)
- [SOSP 2019 Program](https://sosp19.rcs.uwaterloo.ca/program.html)
- [OpML '19 Conference Program](https://www.usenix.org/conference/opml19/program)
- [ScaledML 2019](http://scaledml.org/2019/index.html)
- [Workshop on AI Systems at SOSP 2019](http://learningsys.org/sosp19/)
## Talk
- [Bighead: Airbnb’s End-to-End Machine Learning Platform-1](https://databricks.com/session/bighead-airbnbs-end-to-end-machine-learning-platform)
- [Bighead: Airbnb’s End-to-End Machine Learning Platform-2](https://www.youtube.com/watch?v=UvcnoOrgyhE)
- [Zipline: Airbnb’s Machine Learning Data Management Platform](https://databricks.com/session/zipline-airbnbs-machine-learning-data-management-platform)
- [Machine Learning with TensorFlow and PyTorch on Apache Hadoop using Cloud Dataproc (Cloud Next '19)](https://www.youtube.com/watch?v=hr7_pG3yEOQ)
- [Benchmarking Machine Learning Workloads on Kubeflow - Xinyuan Huang, Cisco Systems, Inc. & Ce Gao](https://www.youtube.com/watch?v=9sLRIBYYUlQ)
- [Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda Tan, Hortonworks](https://www.youtube.com/watch?v=RlLfxa81hgo)
- [SFBigAnalytics_20200825: Apache Submarine: State of the union](https://www.youtube.com/watch?v=Zu_YxxmL6LU&ab_channel=SFBigAnalytics)
- [TensorFlow On Spark: Scalable TensorFlow Learning on Spark Clusters](https://databricks.com/session/tensorflow-on-spark-scalable-tensorflow-learning-on-spark-clusters)
- [Simplifying Model Management with MLflow - Matei Zaharia (Databricks) Corey Zumar (Databricks)](https://www.youtube.com/watch?v=MSUTaCBhD7A&ab_channel=Databricks)
- [Platform for Complete Machine Learning Lifecycle (mlflow)](https://pyvideo.org/pydata-miami-2019/platform-for-complete-machine-learning-lifecycle.html)
- [Building and Managing a Centralized Kubeflow Platform at Spotify - Keshi Dai & Ryan Clough, Spotify](https://www.youtube.com/watch?v=m9XhsnNSMAI&ab_channel=CNCF%5BCloudNativeComputingFoundation%5D)
- [Human-Centric Machine Learning Infrastructure @Netflix](https://www.youtube.com/watch?v=XV5VGddmP24&ab_channel=InfoQ)
## Blog
- [Meet Michelangelo: Uber’s Machine Learning Platform](https://eng.uber.com/michelangelo-machine-learning-platform/)
- [Twitter meets TensorFlow](https://blog.twitter.com/engineering/en_us/topics/insights/2018/twittertensorflow.html)
- [Using Deep Learning at Scale in Twitter’s Timelines](https://blog.twitter.com/engineering/en_us/topics/insights/2017/using-deep-learning-at-scale-in-twitters-timelines.html)
- [Introducing FBLearner Flow: Facebook’s AI backbone](https://engineering.fb.com/core-data/introducing-fblearner-flow-facebook-s-ai-backbone/)
- [A Tour of End-to-End Machine Learning Platforms](https://www.kdnuggets.com/2020/07/tour-end-to-end-machine-learning-platforms.html)
- [3 Common Technical Debts in Machine Learning and How to Avoid Them](https://towardsdatascience.com/3-common-technical-debts-in-machine-learning-and-how-to-avoid-them-17f1d7e8a428)
- [Implementing Apache Submarine — a unified AI Platform](https://medium.com/analytics-vidhya/implementing-apache-submarine-a-unified-ai-platform-459c9edd541e)
###### tags: `Research` `Note`