# AB InBev MLOps Challenge

Jimmy Alexander Pulido Arias

> [name=Jimmy Alexander Pulido Arias]

You can read this doc online at https://hackmd.io/@GM7_PYyKTYimjfqKBc0E2w/Sy2g9xuVF

## Part A: Model selection

I selected the following Kaggle solution:

Iris - https://www.kaggle.com/caesarmario/iris-classification-using-various-ml-models

## Setup:

1. Clone the following git repository: https://github.com/jiapulidoar/abinbev-mlops.git
2. Build the image: `docker build -t mlops-airflow:latest .`
3. Run the docker container: `docker run -d -p 8080:8080 -p 8008:8008 mlops-airflow`
4. Open the Airflow panel: http://localhost:8080/admin/
5. Once the pipeline has run, the FastAPI service can be accessed at: http://localhost:8008/

## Part B: Orchestrating the model

For this challenge, the following setup was used:

* **Orchestrating tool:** Airflow
* **Language**: Python
* **DB**: SQLite
* **Classifier**: SVM
* **API**: FastAPI

### Airflow Pipeline:

1. Graph view:
![](https://i.imgur.com/exwzJYt.png)
2. Tree view:
![](https://i.imgur.com/iqX57Ar.png)

The pipeline has four tasks:

* **preprocess_data**: Split the data into training and test sets and save them in the database
* **train**: Train the model and store it
* **evaludate_model**: Evaluate the model using accuracy_score and classification_report and save the results in the DB
* **Serve**: Deploy and start the API

## DB tables:

* **Iris**

| SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm | Species |
| -------- | -------- | -------- | -------- | -------- |
| number | number | number | number | Text |

* **iris_split**

| SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm | Species | Train |
| -------- | -------- | -------- | -------- | -------- | -------- |
| number | number | number | number | Text | Boolean |

* **model_evaluation**

| model | accuracy | report |
| -------- | -------- | -------- |
| model name | number | classification_report |

* **model_evaluation**

| SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm | Species_prediction |
| -------- | -------- | -------- | -------- | -------- |
| number | number | number | number | Text |

## Part C: Exposing the model as an API

The API was implemented using [FastAPI](https://fastapi.tiangolo.com/).

### FastAPI

Once the pipeline has run and the **serve** task is running, the API can be accessed at: `http://localhost:8008/docs`

### Endpoint:

```
POST http://localhost:8008/evaluate
```

* Request sample:

```json=
{
  "items": [
    {
      "SepalLengthCm": 150,
      "SepalWidthCm": 23,
      "PetalLengthCm": 10,
      "PetalWidthCm": 8
    },
    {
      "SepalLengthCm": 0,
      "SepalWidthCm": 0,
      "PetalLengthCm": 0,
      "PetalWidthCm": 0
    }
  ]
}
```

* Response sample:

```json=
{
  "0": {
    "SepalLengthCm": 150,
    "SepalWidthCm": 23,
    "PetalLengthCm": 10,
    "PetalWidthCm": 8,
    "Species_prediction": "Iris-virginica"
  },
  "1": {
    "SepalLengthCm": 0,
    "SepalWidthCm": 0,
    "PetalLengthCm": 0,
    "PetalWidthCm": 0,
    "Species_prediction": "Iris-setosa"
  }
}
```

### Sample:

* Request
![](https://i.imgur.com/DEe1Cm3.png)
* Response
![](https://i.imgur.com/3ESv1JY.png)

## References:
* [IRIS Dataset](https://www.kaggle.com/caesarmario/iris-classification-using-various-ml-models/data), from the Kaggle notebook "Iris Classification using Various ML Models"
* [10 Minutes to Building a Machine Learning Pipeline with Apache Airflow](https://towardsdatascience.com/10-minutes-to-building-a-machine-learning-pipeline-with-apache-airflow-53cd09268977)
* [Machine Learning Pipelines with Kubeflow](https://towardsdatascience.com/machine-learning-pipelines-with-kubeflow-4c59ad05522)
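
## Example client:

The `POST /evaluate` endpoint from Part C can also be called from a script instead of the docs page. The following is a minimal sketch using only the Python standard library. The helper names `build_payload` and `evaluate` are hypothetical and not part of the repository, and the URL assumes the port mapping from the Setup section; the field names are taken from the request sample.

```python
import json
import urllib.request

# Assumed URL: host and port follow the `docker run` port mapping in Setup.
API_URL = "http://localhost:8008/evaluate"

# Field names as they appear in the request sample.
FIELDS = ("SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm")


def build_payload(rows):
    """Wrap (sepal_length, sepal_width, petal_length, petal_width) tuples
    in the {"items": [...]} body shown in the request sample."""
    return {"items": [dict(zip(FIELDS, row)) for row in rows]}


def evaluate(rows, url=API_URL, timeout=10):
    """POST the rows as JSON and return the decoded JSON response.

    Hypothetical helper: it needs the container to be up and the
    serve task to be running, otherwise urlopen raises an error.
    """
    body = json.dumps(build_payload(rows)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the container from Setup running and the **serve** task started, a call such as `evaluate([(5.1, 3.5, 1.4, 0.2)])` should return a dictionary keyed by row index, in the shape of the response sample in Part C.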