# Machine Learning System Architecture
### Components:
- infrastructure
- applications
- data
- documentation
- configuration
___
# Research environment
### ML Pipeline:
- Gathering Data Sources
- Data Analysis
- Feature Engineering: filling missing values, etc. (d)
- Feature Selection: select the most predictive fields (d)
- Model Building: build many ML models and test their performance (d)
- Model Evaluation: assess the business uplift of the model
```mermaid
graph LR
A[Gather Data] --> B[Data Analysis] --> C[Data Pre-processing] --> D[Variable Selection] --> E[Machine Learning Model Building] --> F[Evaluation]
```
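As a rough sketch, and assuming scikit-learn, these stages could be chained into a single `Pipeline` object; the transformers and estimator below are illustrative choices, not prescribed by these notes.
```python
# Illustrative scikit-learn pipeline mirroring the stages above.
# The imputation strategy, number of selected features and estimator are assumptions.
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),   # data pre-processing
    ("scale", StandardScaler()),                    # variable transformation
    ("select", SelectKBest(f_classif, k=10)),       # variable selection
    ("model", LogisticRegression(max_iter=1000)),   # model building
])

# pipeline.fit(X_train, y_train)     # train on historical data
# pipeline.score(X_test, y_test)     # evaluation
```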
### Feature Engineering: Variable Transformation
- missing data
- labels of categorical variables (need encoding)
- distribution: a better spread of values may benefit performance
- outliers: values that are extremely low or high relative to the rest of the dataset (see the sketch below)
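A minimal sketch of these transformations with pandas; the DataFrame, column names and thresholds are made up for illustration.
```python
# Illustrative variable transformations with pandas; the data is invented.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [25, np.nan, 40, 120],                    # has a missing value and an outlier
    "city": ["london", "paris", None, "london"],     # categorical labels with a missing value
    "income": [30_000, 45_000, 52_000, 1_000_000],   # heavily skewed distribution
})

# missing data: impute numeric columns with the median, categorical with a placeholder
df["age"] = df["age"].fillna(df["age"].median())
df["city"] = df["city"].fillna("missing")

# labels: turn categories into numbers (one-hot encoding is another option)
df["city_encoded"] = df["city"].astype("category").cat.codes

# distribution: log-transform a skewed variable for a better spread of values
df["income_log"] = np.log(df["income"])

# outliers: cap extreme values at chosen percentiles
lower, upper = df["age"].quantile(0.05), df["age"].quantile(0.95)
df["age"] = df["age"].clip(lower, upper)
```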
### Feature Selection:
algorithms to find the best subset of features (the most predictive ones)
- enhances generalization by reducing overfitting
- models with fewer features are easier to deploy (a brief example follows below)
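One common approach, used here only as an illustration, is embedded selection with a regularised model, e.g. Lasso wrapped in scikit-learn's `SelectFromModel`; the synthetic data and penalty value are assumptions.
```python
# Illustrative feature selection: Lasso shrinks the coefficients of weak features
# to zero, and SelectFromModel keeps only the features with non-zero coefficients.
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=30, n_informative=5, random_state=0)

selector = SelectFromModel(Lasso(alpha=1.0))
selector.fit(X, y)

X_reduced = selector.transform(X)
print(f"kept {X_reduced.shape[1]} of {X.shape[1]} features")
```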
___
# Model Deployment
**ML systems challenge** -> reproducibility
**Reproducibility** -> returning the same results given the same data, across systems
### Deployment of ML Pipelines
```mermaid
graph LR
A[Raw Data] --> B(Feature Engineering) --> C(Model Training) --> D(Scoring) --> E(Prediction)
```
- **research environment** -> develop ML models (Jupyter, NumPy, pandas, etc.)
- **production environment** -> deploy ML models to serve predictions (Python, Docker, etc.)
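A minimal sketch of how a pipeline built in the research environment can be handed to production as a serialised artifact; the estimators and the artifact file name are illustrative assumptions.
```python
# Illustrative hand-off between environments: fit and serialise the pipeline in
# research, then load the same artifact in production. File name is an assumption.
import joblib
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# --- research environment: build/fit the pipeline on historical data ---
pipeline = Pipeline([("scale", StandardScaler()), ("model", LogisticRegression())])
# pipeline.fit(X_train, y_train)

joblib.dump(pipeline, "model_pipeline_v1.joblib")   # versioned artifact

# --- production environment: load the artifact and score live data ---
loaded_pipeline = joblib.load("model_pipeline_v1.joblib")
# predictions = loaded_pipeline.predict(live_data)
```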
##### Deployment of ML models:
In research env:
```mermaid
flowchart LR
Database[(Historical Data)] --> ML_Model[Machine Learning Model]
```
In production env:
```mermaid
flowchart LR
Live_Data[Live Data] --> ML_Model[Machine Learning Model]
```
##### ML Pipeline:
- A series of steps that needs to occur from the moment we receive data to the moment we output a prediction
- Created in the research environment
- Must be made reproducible in the production environment
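One small ingredient of reproducibility is pinning the random seeds used across the pipeline; a minimal sketch, assuming Python with NumPy and scikit-learn, with an arbitrary seed value.
```python
# Illustrative seed pinning so that repeated training runs on the same data
# produce the same results; the seed value is arbitrary.
import os
import random

import numpy as np

SEED = 42
os.environ["PYTHONHASHSEED"] = str(SEED)
random.seed(SEED)
np.random.seed(SEED)

# scikit-learn estimators also take an explicit random_state, e.g.
# RandomForestClassifier(random_state=SEED)
```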
### Key principles for ML systems:
- automate all stages of the ML workflow
- training is reproducible
- use version control
- testing ML models:
    - the full ML pipeline is integration tested
    - all input feature code is tested
    - model specification code is unit tested
    - model quality is validated before attempting to serve it
    - models are released via a shadow / canary process
- monitor model performance
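As one example of validating model quality before serving, a minimal pytest-style check; the dataset, split and accuracy threshold are illustrative assumptions, not part of these notes.
```python
# Illustrative pre-serving quality gate written as a pytest test.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

ACCURACY_THRESHOLD = 0.9  # hypothetical minimum quality bar


def test_model_meets_quality_bar():
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
    assert model.score(X_test, y_test) >= ACCURACY_THRESHOLD
```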
### Architecture Approaches for ML Systems:
- formats for serving ML models:
    - embedded (predict on the fly)
    - dedicated model API (see the sketch below)
    - model published as data
    - offline predictions (outdated)
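A minimal sketch of the dedicated model API format using Flask; the framework choice, endpoint name and artifact file name are assumptions.
```python
# Illustrative dedicated model API with Flask; endpoint and artifact name are assumptions.
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model_pipeline_v1.joblib")   # artifact produced in research


@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]     # e.g. a list of numeric feature values
    prediction = model.predict([features])[0]
    return jsonify({"prediction": float(prediction)})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```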
___
# Reading resources:
- Feature Engineering:
- [Feature Engineering for Machine Learning: A Comprehensive Overview](https://www.blog.trainindata.com/feature-engineering-for-machine-learning/)
- [Best Resources to Learn about Feature Engineering for Machine Learning](https://trainindata.medium.com/best-resources-to-learn-feature-engineering-for-machine-learning-6b4af690bae7)
- [Practical Code Implementation of Feature Engineering Techniques with Python](https://towardsdatascience.com/practical-code-implementations-of-feature-engineering-for-machine-learning-with-python-f13b953d4bcd)
- [Resources to learn more about Machine Learning](https://trainindata.medium.com/find-out-the-best-resources-to-learn-machine-learning-cd560beec2b7)
- [Technical debt in ML systems](https://papers.nips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf)
- [Monitoring ML models](https://christophergs.com/machine%20learning/2020/03/14/how-to-monitor-machine-learning-models/)
- [Integration testing](https://martinfowler.com/bliki/IntegrationTest.html)
- [Testing guide](https://www.martinfowler.com/testing/)
- [Shadow deployment](https://christophergs.com/machine%20learning/2019/03/30/deploying-machine-learning-applications-in-shadow-mode/)
- [Rubric for ML production systems](https://storage.googleapis.com/pub-tools-public-publication-data/pdf/45742.pdf)
- [Netflix architecture for recommendation systems](https://netflixtechblog.com/system-architectures-for-personalization-and-recommendation-e081aa94b5d8)
- [Site Reliability Engineering](https://sre.google/sre-book/table-of-contents/)
- [Repo: trainindata/deploying-machine-learning-models](https://github.com/trainindata/deploying-machine-learning-models)