Introducing FBLearner Flow: Facebook’s AI backbone

# Introducing FBLearner Flow: Facebook’s AI backbone ## Preface Facebook 的許多體驗和交互都是依靠AI實現的。例如使用機器學來給你提供獨特的、個人化的體驗、個人化新聞流、過濾了可能得罪人的內容、突顯流行熱點話題並將搜索結果排名。但是以前，工程師必須有很強機器學習背景，不然沒法好好利用公司的機器學習基礎設施。在 2014 年底，Facebook 重新定義了機器學習平台，將 AI 和 ML 算法送到 Facebook 每一位工程師的手中。 #### 建造一個全新平台：FBLearner Flow * 每個機器學習算法應該可以一次性設置好，可以重複使用 * 工程師可以寫一個訓練流水線，在許多機器上平行運行，可以被很多工程師所用 * 無論工程師在機器學習領域的背景深淺，都可以很簡單地訓練模型，而且，其中幾乎所有的步驟都可以完全自動化 * 人人都能很方便地搜索過往試驗、查看結果、與他人分享，並在某一個試驗中開啟新的變量 ## Core concepts and components * Workflows * 在 FBLearner Flow 中定義的一個 pipeline，所有機器學習任務的入口。每個 workflow 作為一個具體的任務並根據 operator 來定義且可以平行運作，例如訓練和評估某個具體的模型。 * Operators * Operator 是 workflow 的建造模塊。從概念上，你可以將 operator 想像為一個程序裡的一個功能。在 FBLearner Flow 中，operator 是執行的最小單位，可以在單一機器上運作。 * Channels * 代表輸入和輸出，在一個 workflow 的各個 operator 之間流轉。所有 channel 都用一個 FBLearner 定義的定制類別系統輸入。此外平台包括這三個核心組成部分： * Authorship and execution environment * Experimentation management UI * Machine learning library ## Authorship and execution environment > 平台上所有的工作流和操作員都定義為 Python 功能，使用自行實作的裝飾器來整合進入平台讓我們來看一個具體的情景，我們想用經典的Iris數據集訓練一個決策樹，基於花朵的花瓣和萼片大小來預測花的品種。 ``` python= # The typed schema of the Hive table containing the input data feature_columns = ( ('petal_width', types.INTEGER), ('petal_height', types.INTEGER), ('sepal_width', types.INTEGER), ('sepal_height', types.INTEGER), ) label_column = ('species', types.TEXT) all_columns = feature_columns + (label_column,) # This decorator annotates that the following function is a workflow within # FBLearner Flow @workflow( # Workflows have typed inputs and outputs declared using the FBLearner type # system input_schema=types.Schema( labeled_data=types.DATASET(schema=all_columns), unlabeled_data=types.DATASET(schema=feature_columns), ), returns=types.Schema( model=types.MODEL, mse=types.DOUBLE, predictions=types.DATASET(schema=all_columns), ), ) def iris(labeled_data, unlabeled_data): # Divide the dataset into separate training and evaluation dataset by random # sampling. split = SplitDatasetOperator(labeled_data, train=0.8, evaluation=0.2) # Train a decision tree with the default settings then evaluate it on the # labeled evaluation dataset. dt = TrainDecisionTreeOperator( dataset=split.train, features=[name for name, type in feature_columns], label=label_column[0], ) metrics = ComputeMetricsOperator( dataset=split.evaluation, model=dt.model, label=label_column[0], metrics=[Metrics.LOGLOSS], ) # Perform predictions on the unlabeled dataset and produce a new dataset predictions = PredictOperator( dataset=unlabeled_data, model=dt.model, output_column=label_column[0], ) # Return the outputs of the workflow from the individual operators return Output( model=dt.model, logloss=metrics.logloss, predictions=predictions, ) ``` * `@workflow` 裝飾器告訴 FBLearner Flow，`iris()`不是一個普通Python function，而是一個 workflow * `input_schema` 和 `returns` 參值說明了工作流的輸入類型及輸出類型。 * 框架會自動確認這些類型，確保 workflow 收到的數據是符合其預期的。這個例子中，`labeled_data` 輸入標記為有四個欄目的數據庫輸入。如果在數據庫中有一個欄目缺失，那麼就會提出一個 `TypeError` 異常，因為數據庫與這個 workflow 不兼容。工作流不是線性執行，而是分兩個步驟： 1. DAG 編譯步驟 2. Operator 執行步驟 DAG 編譯階段完成時，FBLearner Flow 將打造一個 Operator DAG，可以預定何時進行執行，每個操作員只要上一級成功完成就可以開始執行。在這個例子中，`ComputeMetricsOperator` 和 `PredictOperator` 之間沒有數據相關性，因此這兩個操作員可以同時平行運行。 ![](https://i.imgur.com/Ln06SFJ.png) 在 operator 執行階段，每個操作員有自己的 CPU、GPU 和存儲要求。FBLearner Flow 會分配一個匹配操作員任務要求的機器部分。平台自動將相關的代碼分配給機器，在操作員之間傳送輸入和輸出。 ## Experimentation management UI 在幾百個不同的工作流，進行著無數個機器學習任務。面臨的一個挑戰是打造一個通用的UI界面，可以匹配多元的工作流使用。利用定制類別系統，打造了一個可以不需要理解每個 workflow 的實施細節、就能夠解讀輸入和輸出的 UI。為了進一步定制化，平台UI提供了一個插件系統，可以用來為具體的團隊和整合 Facebook 系統提供定制化體驗。 FBLearner Flow UI還提供了一些額外功能： * 發布 workflow；將輸出視覺化並進行比較；管理試驗。 ![](https://i.imgur.com/SHQ4Kv4.png) * 視覺化輸出並比較 ![](https://i.imgur.com/tU1C0Iu.png) * 管理試驗 ![](https://i.imgur.com/eM6Pewr.png) ## Machine learning library FBLearner Flow 平台的一個核心原則是，它不與任何具體的算法綁定。因而，平台可以支持無數的ML算法以及這些算法的創新組合。平台也很容易延展——任何工程師都可以寫一個新的 workflow，讓他或她最愛的算法可供全公司使用。算法的開源實施很容易在一個 workflow 中完成，並整合進入Facebook的基礎設施。 Facebook的應用機器學習團隊維護的工作流可以為常用算法提供可擴展的實施，包括： * Neural networks * Gradient boosted decision trees * LambdaMART * Stochastic gradient descent * Logistic regression --- #### Reference * [Introducing FBLearner Flow: Facebook’s AI backbone](https://engineering.fb.com/2016/05/09/core-data/introducing-fblearner-flow-facebook-s-ai-backbone/)