# AB InBev MLOps Challenge
Jimmy Alexander Pulido Arias
You can read this doc online at https://hackmd.io/@GM7_PYyKTYimjfqKBc0E2w/Sy2g9xuVF
## Part A: Model selection
I selected the following Kaggle challenge solution: Iris - https://www.kaggle.com/caesarmario/iris-classification-using-various-ml-models
## Setup:
1. Clone the following git repository: https://github.com/jiapulidoar/abinbev-mlops.git
2. Build the image: `docker build -t mlops-airflow:latest .`
3. Run the docker container: `docker run -d -p 8080:8080 -p 8008:8008 mlops-airflow`
4. Open the Airflow panel: http://localhost:8080/admin/
5. Once the pipeline has run, the FastAPI service can be accessed at: http://localhost:8008/
## Part B: Orchestrating the model
For this challenge, the following setup was used:
* **Orchestrating tool:** Airflow
* **Language**: Python
* **DB**: SQLite
* **Classifier**: SVM
* **API**: FastAPI
### Airflow Pipeline:
1. Graph view:

2. Tree view

* **preprocess_data**: Split the data into training and test sets and save them in the database.
* **train**: Train the model and store it.
* **evaludate_model**: Evaluate the model using `accuracy_score` and `classification_report`, and save the results in the database.
* **serve**: Deploy and start the API.
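
As an illustration of the **preprocess_data** step, here is a minimal sketch that flags rows as train or test and writes them to an `iris_split` table. It is not the repository's code: the function name `split_and_store`, the 80/20 ratio, the seed, the column types, and the in-memory SQLite database are all assumptions made for this example, and the input rows are only illustrative values in the Iris format.

```python
import random
import sqlite3


def split_and_store(conn, rows, test_size=0.2, seed=42):
    """Flag each row as train (1) or test (0) and write it to iris_split.

    Hypothetical helper: the ratio, seed, and column types are assumptions.
    """
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_test = round(len(rows) * test_size)
    test_idx = set(indices[:n_test])

    # SQLite has no boolean type, so the Train flag is stored as 0/1.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS iris_split ("
        "SepalLengthCm REAL, SepalWidthCm REAL, PetalLengthCm REAL, "
        "PetalWidthCm REAL, Species TEXT, Train INTEGER)"
    )
    conn.executemany(
        "INSERT INTO iris_split VALUES (?, ?, ?, ?, ?, ?)",
        [(*row, 0 if i in test_idx else 1) for i, row in enumerate(rows)],
    )
    conn.commit()
    return len(rows) - n_test, n_test


# Illustrative rows in the Iris format, repeated to get 10 input rows.
sample = [
    (5.1, 3.5, 1.4, 0.2, "Iris-setosa"),
    (4.9, 3.0, 1.4, 0.2, "Iris-setosa"),
    (7.0, 3.2, 4.7, 1.4, "Iris-versicolor"),
    (6.4, 3.2, 4.5, 1.5, "Iris-versicolor"),
    (6.3, 3.3, 6.0, 2.5, "Iris-virginica"),
]
conn = sqlite3.connect(":memory:")  # assumption: throwaway in-memory database
n_train, n_test = split_and_store(conn, sample * 2)
```

Keeping the split as a flag column means later steps can select either subset with a simple `WHERE Train = 1` or `WHERE Train = 0` filter.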
## DB tables:
* **Iris**
| SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm |Species |
| -------- | -------- | -------- | -------- |-------- |
| number | number | number | number |Text |
* **iris_split**
| SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm |Species |Train |
| -------- | -------- | -------- | -------- |-------- |-------- |
| number | number | number | number |Text |Boolean |
* **model_evaluation**
| model | accuracy | report |
| -------- | -------- | -------- |
| model name | number | classification_report |
* **model_evaluation**
| SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm |Species_prediction|
| -------- | -------- | -------- | -------- |-------- |
| number | number | number | number |Text |
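
To make the `model_evaluation` layout concrete, the sketch below computes an accuracy value and stores one row with the `model`, `accuracy`, and `report` columns. It is only an assumption-laden illustration: the pipeline itself uses scikit-learn's `accuracy_score` and `classification_report`, while this example uses a hand-written `accuracy` function (the same fraction of correct predictions) to stay dependency-free. The helper names, the column types, the labels, and the placeholder report text are all hypothetical.

```python
import sqlite3


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def save_evaluation(conn, model_name, acc, report):
    """Insert one evaluation row; column types are an assumption."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS model_evaluation "
        "(model TEXT, accuracy REAL, report TEXT)"
    )
    conn.execute(
        "INSERT INTO model_evaluation VALUES (?, ?, ?)",
        (model_name, acc, report),
    )
    conn.commit()


# Made-up labels for illustration: three of the four predictions are correct.
y_true = ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-virginica"]
y_pred = ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-versicolor"]
acc = accuracy(y_true, y_pred)

conn = sqlite3.connect(":memory:")  # assumption: throwaway in-memory database
save_evaluation(conn, "SVM", acc, "placeholder for the classification_report text")
stored = conn.execute("SELECT model, accuracy FROM model_evaluation").fetchone()
```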
## Part C: Exposing the model as an API
The API was implemented using [FastAPI](https://fastapi.tiangolo.com/).
### FastAPI
Once the pipeline has run and the **serve** task is running, the interactive API documentation is available at: `http://localhost:8008/docs`
### Endpoint:
```
POST http://localhost:8008/evaluate
```
* Request sample:
```json=
{
  "items": [
    {
      "SepalLengthCm": 150,
      "SepalWidthCm": 23,
      "PetalLengthCm": 10,
      "PetalWidthCm": 8
    },
    {
      "SepalLengthCm": 0,
      "SepalWidthCm": 0,
      "PetalLengthCm": 0,
      "PetalWidthCm": 0
    }
  ]
}
```
* Response sample:
```json=
{
  "0": {
    "SepalLengthCm": 150,
    "SepalWidthCm": 23,
    "PetalLengthCm": 10,
    "PetalWidthCm": 8,
    "Species_prediction": "Iris-virginica"
  },
  "1": {
    "SepalLengthCm": 0,
    "SepalWidthCm": 0,
    "PetalLengthCm": 0,
    "PetalWidthCm": 0,
    "Species_prediction": "Iris-setosa"
  }
}
```
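
The request and response samples above can be exercised from Python with a small client. This is a sketch using only the standard library, not part of the repository: the helper names `build_payload`, `extract_predictions`, and `evaluate` are hypothetical, and the code assumes the container from the setup steps is running and that the API accepts and returns exactly the JSON shapes shown above.

```python
import json
import urllib.request

FEATURES = ("SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm")


def build_payload(measurements):
    """Wrap rows of four measurements in the {"items": [...]} request body."""
    items = []
    for row in measurements:
        if len(row) != len(FEATURES):
            raise ValueError(f"expected {len(FEATURES)} values, got {len(row)}")
        items.append(dict(zip(FEATURES, row)))
    return {"items": items}


def extract_predictions(response_body):
    """Return predicted species in request order from the index-keyed response."""
    ordered_keys = sorted(response_body, key=int)
    return [response_body[key]["Species_prediction"] for key in ordered_keys]


def evaluate(measurements, url="http://localhost:8008/evaluate"):
    """POST the measurements to the running API and return the predictions."""
    data = json.dumps(build_payload(measurements)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return extract_predictions(json.load(response))


# Builds the same body as the request sample; no network call is made here.
payload = build_payload([(150, 23, 10, 8), (0, 0, 0, 0)])
```

With the service up, `evaluate([(150, 23, 10, 8), (0, 0, 0, 0)])` would send that payload and return one predicted species per item.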
### Sample:
* Request

* Response

## References:
* [Iris Classification using Various ML Models (Iris dataset)](https://www.kaggle.com/caesarmario/iris-classification-using-various-ml-models/data)
* [10 Minutes to Building a Machine Learning Pipeline with Apache Airflow](https://towardsdatascience.com/10-minutes-to-building-a-machine-learning-pipeline-with-apache-airflow-53cd09268977)
* [Machine Learning Pipelines with Kubeflow](https://towardsdatascience.com/machine-learning-pipelines-with-kubeflow-4c59ad05522)