
[TOC]
# MLflow
MLflow is an open source platform for managing the **end-to-end** machine learning lifecycle.

## Why use MLflow?
1. Keep track of experiments
2. Reuse or reproduce code
3. Package and deploy models
4. Manage models
## MLflow Tracking
The MLflow Tracking component is an **API** and **UI** for logging parameters, code versions, metrics, and output files when running your machine learning code, and for later **visualizing** the results. MLflow Tracking lets you log and query experiments using the [Python](https://mlflow.org/docs/latest/python_api/index.html), [R](https://mlflow.org/docs/latest/R-api.html), [Java](https://mlflow.org/docs/latest/java_api/index.html), and [REST](https://mlflow.org/docs/latest/rest-api.html) APIs.

MLflow Tracking is organized around the concept of **runs**, which are executions of some piece of data science code.

Example:
```python
import mlflow
# Each run records its own parameters, metrics, and artifacts
with mlflow.start_run() as run:
    mlflow.log_param("my", "param")
    mlflow.log_metric("score", 100)
```
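After logging a few runs, you can browse and compare them in the tracking UI. A minimal way to launch it locally (it serves http://localhost:5000 by default):
```bash
# Start the tracking UI from the directory that contains ./mlruns
mlflow ui
```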
## MLflow Projects
An MLflow Project is a format for packaging data science code in a **reusable** and **reproducible** way, based primarily on conventions. In addition, the Projects component includes an **API** and **command-line** tools for running projects, making it possible to chain together projects into workflows.

Example:
```python
import mlflow
project_uri = "https://github.com/mlflow/mlflow-example"
params = {"alpha": 0.5, "l1_ratio": 0.01}
# Run MLflow project and create a reproducible conda environment
# on a local host
mlflow.run(project_uri, parameters=params)
```
```bash
# Run a project from a remote Git repository at a specific commit
mlflow run https://github.com/mlflow/mlflow-example -v 3c0711f8868232f17a9adbb69fb1186ec8a3c0c7 -b local -P alpha=0.5 -P l1_ratio=0.01
# Run a project from a local directory
mlflow run /home/u1353162/project -b local -P alpha=0.5 -P l1_ratio=0.01
```
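The conventions a project relies on are typically spelled out in an MLproject file at the repository root. A rough sketch of such a file, loosely modeled on the mlflow-example project (the entry-point command and parameter defaults here are illustrative):
```yaml
name: mlflow-example
conda_env: conda.yaml
entry_points:
  main:
    parameters:
      alpha: {type: float, default: 0.5}
      l1_ratio: {type: float, default: 0.1}
    command: "python train.py {alpha} {l1_ratio}"
```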
## MLflow Models
An MLflow Model is a standard format for packaging machine learning models that can be used in a variety of downstream tools, for example real-time serving through a REST API or batch inference.

### Storage Format
* All of the flavors that a particular model supports are defined in its MLmodel file in YAML format
* mlflow.sklearn outputs models as follows:
```
model/
├── MLmodel
└── model.pkl
```
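For illustration, the MLmodel file of such a scikit-learn model might look roughly like this (the exact fields and versions vary with your environment):
```yaml
artifact_path: model
flavors:
  python_function:
    loader_module: mlflow.sklearn
    model_path: model.pkl
    python_version: 3.8.10
  sklearn:
    pickled_model: model.pkl
    serialization_format: cloudpickle
    sklearn_version: 0.24.2
```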
### Built-In Model Flavors
* Python Function (python_function)
* R Function (crate)
* H2O (h2o)
* Keras (keras)
* MLeap (mleap)
* PyTorch (pytorch)
* Scikit-learn (sklearn)
* Spark MLlib (spark)
* TensorFlow (tensorflow)
* ONNX (onnx)
* MXNet Gluon (gluon)
* XGBoost (xgboost)
* LightGBM (lightgbm)
* spaCy (spacy)
* fastai (fastai)
* Statsmodels (statsmodels)
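As a minimal sketch of how flavors are used, the snippet below logs a scikit-learn model and loads it back through the generic python_function flavor (the toy data is purely illustrative):
```python
import mlflow
import mlflow.pyfunc
import mlflow.sklearn
import numpy as np
from sklearn.linear_model import LinearRegression

# Fit a trivial model on toy data
model = LinearRegression().fit(np.array([[0.0], [1.0]]), np.array([0.0, 1.0]))

# Logging writes the MLmodel file with both sklearn and python_function flavors
with mlflow.start_run() as run:
    mlflow.sklearn.log_model(model, "model")

# Any model with a python_function flavor can be loaded this way
loaded = mlflow.pyfunc.load_model(f"runs:/{run.info.run_id}/model")
print(loaded.predict(np.array([[2.0]])))
```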
### Deploy MLflow models
You can deploy an MLflow model locally, or generate a Docker image for it, using the CLI interface to the mlflow.models module.
#### Commands
* **serve** deploys the model as a local REST API server.
* **build_docker** packages a REST API endpoint serving the model as a Docker image.
* **predict** uses the model to generate a prediction for a local CSV or JSON file.
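For illustration, typical invocations might look like this (the run ID, image name, port, and file names are placeholders):
```bash
# Serve the model as a local REST API on port 5001
mlflow models serve -m runs:/<run-id>/model -p 5001

# Package the model as a Docker image that serves it over REST
mlflow models build-docker -m runs:/<run-id>/model -n my-model-image

# Generate predictions for a local CSV file
mlflow models predict -m runs:/<run-id>/model -i input.csv -t csv
```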
# MLflow User Guide
## Storing artifacts in S3
1. Set the credentials and endpoints as environment variables
```bash
export MLFLOW_S3_ENDPOINT_URL=https://cos.twcc.ai
export AWS_ACCESS_KEY_ID=<your-access-key-id>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export MLFLOW_TRACKING_URI=https://ai-mkt.nchc.org.tw/mlflow/9a0737e7ee387f06338f6addd7b9ab51/
```
2. Install mlflow and boto3
```bash
pip install mlflow
pip install boto3
```
3. Start MLflow and configure the artifact store

Method 1: start an MLflow tracking server (the endpoint variables above must be set before starting the tracking server)
```bash
mlflow server --default-artifact-root s3://bucket --host IP --port PORT
```
Method 2: create an experiment (CLI)
```bash
mlflow experiments create -n experiment-name -l s3://bucket/path
```
Method 3: create an experiment (Python SDK)
```python
from mlflow.tracking import MlflowClient
# Create an experiment with a unique, case-sensitive name
# and an S3 artifact location
client = MlflowClient()
experiment_id = client.create_experiment("experiment name", "s3://bucket/path")
```
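Once the experiment exists, any run logged to it stores its artifacts under the configured S3 path; a minimal sketch (the parameter and metric values are illustrative):
```python
import mlflow

# Runs started with this experiment_id write artifacts to s3://bucket/path
with mlflow.start_run(experiment_id=experiment_id):
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("rmse", 0.78)
```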
## Deploying a model
```bash
mlflow models serve -m s3://bucket/your/model/path --host IP --port PORT --no-conda
```
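Once the server is up, you can score it over REST. A sketch for MLflow 1.x, whose scoring server accepts pandas-split JSON at /invocations (the column names and values are placeholders, and the request schema changed in MLflow 2.x):
```bash
curl -X POST http://IP:PORT/invocations \
  -H 'Content-Type: application/json; format=pandas-split' \
  -d '{"columns": ["x1", "x2"], "data": [[0.5, 0.01]]}'
```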
{"metaMigratedAt":"2023-06-15T19:01:23.862Z","metaMigratedFrom":"YAML","title":"Mlflow","breaks":true,"contributors":"[{\"id\":\"0ef7395c-32c4-41e1-b0fe-911cf0fdf7a1\",\"add\":7521,\"del\":3360}]"}