# Chapter 2. Introduction to Machine Learning Systems Design

## Requirements for ML Systems

- Reliability
- Scalability
- Maintainability
- Adaptability

## Iterative Process

- Step 1. Project scoping
- Step 2. Data engineering
- Step 3. ML model development
- Step 4. Deployment
- Step 5. Monitoring and continual learning
- Step 6. Business analysis

## Framing ML Problems

### Classification versus regression

- A regression model can easily be framed as a classification model, for example by bucketing its continuous output into ranges.
- A classification model can become a regression model if we make it output values between 0 and 1.

### Binary versus multiclass classification

- A binary classification problem has 2 classes. If there are more than 2 classes, it becomes multiclass classification. __ML models typically need at least 100 examples for each class to learn to classify that class__.
- When the number of classes is __large__, __hierarchical classification__ might be useful.

### Multiclass versus multilabel classification

- __Multiclass classification__: each example belongs to exactly one class.
- __Multilabel classification__: each example belongs to one or more classes.

### Multiple ways to frame a problem

Example: Predicting what app a phone user wants to use next

__Input:__ User demographic information, time, location, previous apps used

__Output:__ A probability distribution over the apps on the user’s phone.

__Solution 1:__ __Classification__: The output is a vector of size N for N recommended apps.

=> _**A bad approach**: whenever a new app is added, you might have to retrain your model from scratch, or at least retrain all the components of your model whose number of parameters depends on N._

__Solution 2:__ __Regression__: The output is a single value between 0 and 1. The higher the value, the more likely the user will open the app given the context.
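
The regression framing in Solution 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the chapter: it assumes each app is described by its own feature vector that is fed to the model together with the context, and the names `score_app` and `rank_apps`, the feature values, and the random untrained weights are all hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def score_app(weights, context_features, app_features):
    """Regression framing: return one score in (0, 1) for a (context, app) pair."""
    features = context_features + app_features
    z = sum(w * f for w, f in zip(weights, features))
    return sigmoid(z)


def rank_apps(weights, context_features, apps):
    """Score each candidate app separately and sort by score, highest first."""
    scored = [
        (name, score_app(weights, context_features, feats))
        for name, feats in apps.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical, untrained weights: their count depends only on the feature
# dimensions, not on the number of apps N.
random.seed(0)
CONTEXT_DIM, APP_DIM = 3, 2
weights = [random.uniform(-1, 1) for _ in range(CONTEXT_DIM + APP_DIM)]

# Made-up encodings of the context (e.g. time, location, previous app)
# and of each app (assumption: apps have their own features).
context = [0.2, 0.7, 1.0]
apps = {"mail": [1.0, 0.0], "maps": [0.0, 1.0]}

ranking = rank_apps(weights, context, apps)

# Adding a new app only adds one more (context, app) pair to score;
# the weight vector keeps the same size.
apps["camera"] = [0.5, 0.5]
ranking_with_new_app = rank_apps(weights, context, apps)
```

In this sketch the number of parameters stays at `CONTEXT_DIM + APP_DIM` no matter how many apps are installed, which is the property that Solution 1 lacks: there the output vector, and every component sized by it, grows with N.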