
# Machine Learning in Production - Model Deployment (ML Ops)
## Intro
> [A Chat with Andrew on MLOps: From Model-centric to Data-centric AI](https://www.youtube.com/watch?v=06-AZXmwHjo)
### Life Cycle
In the chapter dedicated to Machine Learning Deployment, we embark on a structured journey through the intricate process of bringing ML models to life. This systematic approach to the Machine Learning project lifecycle proves indispensable, helping you meticulously plan each step while minimizing unexpected hurdles.
The journey commences with scoping, where project objectives are clearly defined, setting the stage for what Machine Learning can solve and establishing crucial variables. Data collection follows, involving the acquisition of essential data, baseline metric establishment, and meticulous dataset organization. This phase often reveals non-intuitive best practices, enriching your understanding of data preparation.
Transitioning to the model phase, you select, train, and rigorously analyze your ML models. This iterative process demands attention to detail, as feedback loops may trigger model updates or a return to data collection. The deployment phase, often misconceived as the end, is merely the midpoint. It involves technical deployment, continuous monitoring, data tracking, and dynamic system maintenance to ensure model adaptability. This framework is a versatile compass for diverse Machine Learning projects, spanning computer vision, audio data, structured data, and beyond. Capture it for reference as we embark on this enlightening journey through the realm of Machine Learning deployment.

**Example (speech recognition)**
In the context of a Machine Learning project centered on speech recognition, the process unfolds in four distinct stages: scoping, data, modeling, and deployment. Initially, the scoping phase sets the project's foundation by defining objectives, such as voice search accuracy, latency, throughput, and resource requirements. This phase serves to estimate key metrics and establish project parameters.
Following scoping, the data stage comes into play. Here, the focus shifts to data collection, baseline establishment, and data labeling and organization. One significant challenge is maintaining data consistency, as diverse transcription conventions can impact the learning algorithm's performance. Effective data definition encompasses aspects like silence duration and volume normalization, laying the groundwork for robust speech recognition.
With a well-defined dataset in hand, the modeling phase emerges. It involves model selection, training, and error analysis. While traditionally, the code or algorithm took precedence in academic settings, practical deployments often prioritize data and hyperparameter optimization, yielding high-performing models. Systematic error analysis informs data and code improvements, a strategy efficient for building accurate models.
The final stage, deployment, brings the speech recognition system to life. Whether on mobile phones or edge devices, deployment incorporates voice activity detection and prediction servers, often residing in the cloud. Monitoring and maintenance post-deployment are critical, addressing challenges like concept drift and data distribution changes, ensuring the system continues to deliver the desired value. This chapter provides a comprehensive view of the Machine Learning project life cycle, with speech recognition as an illustrative example.
### Proof of Concept
A **Proof of Concept (POC)** in a machine learning (ML) project is a preliminary trial to assess the feasibility of an ML solution before full-scale deployment. It involves defining the problem, data collection and preparation, model selection, training, and evaluation. In the deployment phase, the POC helps inform crucial decisions.
For deployment considerations, the POC aids in addressing scalability and choosing between cloud or edge deployment. A scaled-down model is deployed to a limited user subset to gather real-world feedback, influencing the decision on whether to proceed with full-scale deployment.
In summary, the POC serves as a vital bridge between ML model development and deployment, offering insights and validation before committing to a larger-scale implementation. The POC represents only about 5-10% of the overall project; a full Machine Learning infrastructure involves many more components, and we will look at some of them in this chapter.

### Concept and Data Drift
In MLOps (Machine Learning Operations), concept drift and data drift are two important concepts related to the changing nature of data in machine learning models:
- **Concept Drift**: Concept drift refers to the phenomenon where the *statistical properties of the target variable (the concept) change over time*. In other words, the relationship between the input features and the target variable that your machine learning model has learned no longer holds true. This can happen due to various reasons, such as changes in user behavior, external factors, or shifts in the data-generating process. Concept drift can lead to a drop in model performance, as the model's assumptions become outdated.
- **Data Drift**: Data drift, on the other hand, refers to changes in the input *data distribution* over time. It occurs *when the characteristics of the data used to train a machine learning model differ from the data the model encounters in the real world during deployment*. Data drift can lead to a degradation in model performance because the model may not generalize well to the new data distribution.
Addressing concept drift and data drift is crucial in MLOps to ensure that machine learning models maintain their accuracy and effectiveness over time. Continuous monitoring, model retraining, and adaptation strategies are typically employed to mitigate the impact of these drifts and keep models performing optimally in production.
> [More info](https://towardsdatascience.com/machine-learning-in-production-why-you-should-care-about-data-and-concept-drift-d96d0bc907fb)
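As a minimal sketch of how data drift can be flagged in practice, one common approach is to compare the distribution of a feature at training time with a recent window of production data using a two-sample Kolmogorov-Smirnov test. The feature values and alert threshold below are illustrative assumptions, not a prescribed setup.
```python
# Hedged sketch: flag possible data drift by comparing a training-time feature
# distribution with recent production values using a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)  # distribution seen during training
prod_feature = rng.normal(loc=0.3, scale=1.2, size=1_000)   # recent production window (shifted)

statistic, p_value = ks_2samp(train_feature, prod_feature)

ALERT_P_VALUE = 0.01  # assumption: tune per application and per feature
if p_value < ALERT_P_VALUE:
    print(f"Possible data drift: KS statistic={statistic:.3f}, p-value={p_value:.4f}")
else:
    print("No significant drift detected for this feature")
```
Concept drift is harder to detect directly from inputs alone; it usually shows up as a drop in live performance metrics once delayed ground-truth labels become available.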
### Baseline
In the early stages of a machine learning project, it's crucial to create a baseline for performance. This baseline serves as your starting point for what is expected of the ML model, allowing you to track improvements efficiently. Take the example of speech recognition.
In this scenario, there are four speech categories with varying accuracy: clear speech, speech with car noise, speech with people noise, and low bandwidth audio. It might seem logical to focus on low bandwidth audio due to its lower accuracy, but it's essential to establish baselines for all categories. To do this, you can have human transcriptionists label your data and measure their accuracy, defining "human level performance" (HLP). This analysis reveals which category offers the most potential for improvement.
The approach to setting baselines differs for structured and unstructured data. Unstructured data, like images or text, aligns with human interpretation, making "human level performance" a suitable baseline. For structured data, this approach is less useful.
To establish a baseline, you can conduct a literature search, explore state-of-the-art results, or create a simple model. The performance of an existing system can also serve as a reference point.
The baseline acts as a critical reference point, indicating the best achievable performance or "Bayes error." Setting ambitious accuracy goals before establishing a baseline can add undue pressure to your machine learning team. It's wise to prioritize establishing a rough baseline first for project success.
## Deployment
### Edge vs Cloud
Edge deployment involves running a machine learning (ML) model on local devices or at the network's periphery, closer to where data is generated. This approach minimizes data transfer and reduces latency, making it ideal for real-time or offline applications where immediate decision-making is essential, such as IoT devices, autonomous vehicles, or mobile apps.
Cloud deployment, on the other hand, entails hosting ML models on remote servers in data centers or the cloud. It offers scalability, vast computing resources, and centralized management. Cloud deployment suits scenarios where data processing can be done remotely and doesn't require real-time responses, such as web applications, big data analytics, or services accessed over the internet.
Deploying a machine learning (ML) model in the cloud or at the edge involves trade-offs based on specific use cases:
- **Cloud Deployment**: In cloud deployment, the ML model runs on powerful remote servers. It offers scalability, easy management, and access to vast computing resources. This is suitable when:
- Scalability: You need to handle large datasets or a high number of concurrent users.
- Resource-Intensive Models: Your model requires substantial computational power, memory, or specialized hardware (e.g., GPUs).
- Centralized Management: You want centralized control, updates, and maintenance.
- **Edge Deployment**: Edge deployment involves running the ML model on local devices or at the network's edge. It's advantageous when:
- Low Latency: Real-time or near-real-time inference is crucial, minimizing communication delays.
- Privacy/Security: Data stays on the device, reducing privacy concerns and data transfer risks.
- Offline Usage: The application needs to work without a continuous internet connection.
The choice depends on your specific requirements. For example, deploying a real-time facial recognition model on a security camera might be more suitable at the edge for low latency and privacy reasons, while a recommendation system that processes vast amounts of data might work better in the cloud for scalability and resource availability.
### Realtime vs Batch
Real-time and batch processing are two common approaches in MLOps for handling data and model deployments, each suited to different use cases:
- **Real-Time Processing**
- Low Latency: Real-time processing is designed for applications that require immediate or near-immediate responses. It's ideal for use cases where low latency is crucial, such as fraud detection, recommendation systems, or autonomous vehicles.
- Continuous Data: Real-time processing deals with streaming data, where information is processed and predictions are made as new data arrives. This allows for real-time decision-making based on the most current information.
- Scalability Challenges: Implementing real-time processing can be more complex and resource-intensive than batch processing, especially when dealing with high data volumes or complex models.
- Examples: Real-time processing is used in applications like chatbots that provide instant responses, real-time monitoring of social media sentiment, and stock trading algorithms.
- **Batch Processing**
- High Throughput: Batch processing, on the other hand, focuses on processing large volumes of data in batches or chunks. It is suitable for use cases where low latency is not critical, and high throughput is more important.
- Periodic Updates: Batch processing typically involves periodic updates or retraining of machine learning models based on accumulated data. This is well-suited for scenarios like monthly financial reporting, recommendation model updates, or data analysis.
- Resource Efficiency: Batch processing can be more resource-efficient for handling large-scale data processing tasks because it doesn't require constant, real-time attention to incoming data.
- Examples: Batch processing is used for tasks like nightly data warehousing, monthly billing, or weekly model retraining.
In many MLOps workflows, a combination of both real-time and batch processing is employed. For example, data may be collected in real-time, processed in batches to train or update models, and then the trained models are deployed for real-time inference. The choice between real-time and batch processing depends on the specific requirements of the application, including latency, data volume, and the need for up-to-date information.
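To make the contrast concrete, here is a hedged sketch of the same (dummy) model exposed both ways: behind a real-time HTTP endpoint using FastAPI (which appears later in the tools list) and as a batch scoring job over a file of accumulated records. The model, endpoint path, and file columns are illustrative assumptions.
```python
from typing import List

import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

# Stand-in model so the sketch is self-contained; swap in your trained model.
class DummyModel:
    def predict(self, rows):
        return [sum(row) for row in rows]  # toy prediction: sum of the features

model = DummyModel()
app = FastAPI()

class Features(BaseModel):
    inputs: List[float]

# Real-time serving: one request in, one prediction out (low latency).
@app.post("/predict")
def predict(features: Features):
    return {"prediction": model.predict([features.inputs])[0]}

# Batch scoring: process a whole file of accumulated records at once (high throughput).
def batch_score(input_path: str, output_path: str) -> None:
    df = pd.read_csv(input_path)                           # e.g., yesterday's accumulated records
    df["prediction"] = model.predict(df.values.tolist())   # score the whole batch in one pass
    df.to_csv(output_path, index=False)
```
In practice the real-time endpoint might be containerized and autoscaled, while the batch job might be scheduled by an orchestrator such as Airflow.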
### Decision-Making Checklist
Decision-Making for Software Engineering in MLOps can be quite tricky, so I developed a checklist of things to take into consideration:
- [ ] **Real-Time vs. Batch Predictions**
- Determine whether your application requires real-time predictions (e.g., speech recognition with sub-second response times) or batch predictions (e.g., overnight processing of patient records).
- Consider the nature of your task to decide if you need software that responds quickly or if batch processing is acceptable.
- [ ] **Cloud, Edge, or Web Browser Deployment**
- Choose the deployment environment for your prediction service: cloud, edge (local devices), or web browser.
- Opt for cloud deployments for resource-intensive tasks like speech recognition. Edge deployments are suitable for scenarios with intermittent internet connectivity, such as in-car speech systems.
- Modern web browsers also offer tools for deploying machine learning models.
- [ ] **Resource Availability**
- Assess the available CPU, GPU, and memory resources for your prediction service.
- Ensure that your chosen software architecture aligns with the available hardware resources to avoid over-optimization issues.
- [ ] **Latency and Throughput**
- Determine the latency requirements for real-time applications (e.g., responding within 500 milliseconds).
- Calculate the required throughput, measured in queries per second (QPS), based on your application's demand and available resources.
- [ ] **Logging**
- Consider implementing comprehensive data logging to facilitate analysis, review, and future model retraining.
- Logging can provide valuable insights into system performance and data quality.
- [ ] **Security and Privacy**
- Tailor the level of security and privacy in your software based on the sensitivity of your data and regulatory requirements.
- High-security standards may be necessary for handling sensitive information like patient records.
- [ ] **Continuous Monitoring and Maintenance**
- Recognize that deploying a system is just the beginning; ongoing monitoring and maintenance are crucial, especially for addressing concept drift and data drift.
- Different practices apply to the initial deployment compared to maintaining and updating a deployed system.
This checklist serves as a guide for making informed software engineering decisions when implementing a prediction service. It helps ensure that your software aligns with the specific needs and requirements of your machine learning application, whether in terms of speed, resource utilization, security, or adaptability to changing data conditions.
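As a small illustration of the latency and logging items in the checklist above, a prediction function can be wrapped so that every call is timed and its inputs and outputs are logged for later analysis or retraining. The 500 ms budget, log format, and dummy model are assumptions.
```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("prediction-service")

LATENCY_BUDGET_MS = 500  # assumption: adjust to your application's requirement

def predict(features):
    return sum(features)  # stand-in for real model inference

def predict_with_logging(features):
    start = time.perf_counter()
    prediction = predict(features)
    latency_ms = (time.perf_counter() - start) * 1000

    # Log input, output, and latency so they can be analyzed or reused for retraining later.
    logger.info(json.dumps({"features": features, "prediction": prediction,
                            "latency_ms": round(latency_ms, 2)}))
    if latency_ms > LATENCY_BUDGET_MS:
        logger.warning("Latency budget exceeded: %.1f ms", latency_ms)
    return prediction

predict_with_logging([1.0, 2.0, 3.0])
```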
### List of tools
Here's a list of tools commonly used in MLOps, covering various stages of the machine learning lifecycle:
- **Version Control**
- **Git**
- **GitHub, GitLab, or Bitbucket** for hosting repositories
- **Git LFS** for handling large files
- **Data Management**
- Data storage solutions (e.g., **Amazon S3, Google Cloud Storage**)
- Data versioning tools (e.g., **DVC, Pachyderm**)
- Data preprocessing libraries (e.g., **pandas, NumPy**)
- **Environment Management**
- Containerization tools (e.g., **Docker**)
- Container orchestration (e.g., **Kubernetes**)
- Virtual environments (e.g., **Anaconda, virtualenv**)
- **Model Development**
- **Jupyter Notebooks** for interactive model development
- ML libraries (e.g., **TensorFlow, PyTorch, scikit-learn**)
- IDEs for coding and debugging (e.g., **PyCharm, VSCode**)
- **Continuous Integration/Continuous Deployment (CI/CD)**
- CI/CD pipelines (e.g., **Jenkins, Travis CI, CircleCI**)
- Automated testing frameworks (e.g., **pytest**)
- Artifact repositories (e.g., **Nexus, Artifactory**)
- **Model Training and Experiment Tracking**
- ML experiment tracking (e.g., **MLflow, TensorBoard**)
- Hyperparameter tuning (e.g., **Optuna, Ray Tune**)
- Distributed computing (e.g., **Apache Spark**)
- **Model Deployment**
- Model serving platforms (e.g., **TensorFlow Serving, FastAPI**)
- Container registries (e.g., **Docker Hub, AWS ECR**)
- API gateways (e.g., **Kong, Ambassador**)
- **Monitoring and Logging**
- Application performance monitoring (e.g., **Prometheus, Datadog**)
- Log management (e.g., **ELK Stack, Splunk**)
- Error tracking (e.g., **Sentry, Rollbar**)
- **Scaling and Auto-Scaling**
- Infrastructure as Code (IaC) tools (e.g., **Terraform, AWS CloudFormation**)
- Auto-scaling solutions (e.g., **AWS Auto Scaling, Kubernetes Horizontal Pod Autoscaling**)
- **Security and Access Control**
- Identity and Access Management (IAM) solutions (e.g., **AWS IAM, Google Cloud IAM**)
- Encryption tools (e.g., **HashiCorp Vault**)
- Security scanning (e.g., **Nessus, Clair**)
- **Data Pipelines**
- ETL tools (e.g., **Apache Airflow, Apache NiFi**)
- Data streaming platforms (e.g., **Apache Kafka, AWS Kinesis**)
- **Collaboration and Documentation**
- Documentation platforms (e.g., **Confluence, Sphinx**)
- Communication tools (e.g., **Slack, Microsoft Teams**)
- **Experimentation and A/B Testing**
- Experimentation frameworks (e.g., **PlanOut, Google Optimize**)
- Feature flags and toggles (e.g., **LaunchDarkly**)
- **Model Monitoring and Retraining**
- Model monitoring tools (e.g., **Seldon Alibi, ModelDB**)
- Continuous data validation (e.g., **Great Expectations**)
- **Backup and Disaster Recovery**
- Data backups and snapshots
- Disaster recovery plans
- **Cost Management**
- Cost tracking and optimization tools (e.g., **AWS Cost Explorer, CloudHealth**)
- **Compliance and Governance**
- Compliance monitoring tools (e.g., **AWS Config, Azure Policy**)
- Model explainability tools (e.g., **IBM Watson OpenScale**)
- **Knowledge Sharing and Training**
- Internal knowledge-sharing platforms
- Online courses and resources (e.g., **Coursera, Udacity**)
- **Continuous Learning**
- Stay updated with industry trends and advancements in AI/ML using **Blogs and Newsletters, Online Courses and Tutorials, Books and Publications & MOOCs** (Massive Open Online Courses)
Remember that the specific software tools you choose may vary based on your organization's needs, infrastructure, and technology stack. This checklist provides a comprehensive overview of the typical tools and areas to consider in an MLOps environment.
### Deployment Patterns
In the realm of machine learning deployment, various strategies and patterns emerge to ensure a smooth and controlled transition from model development to real-world application. These deployment patterns serve as blueprints for managing the introduction of machine learning algorithms into production environments. In this context, we explore several key deployment types, each tailored to specific use cases and scenarios. From shadow mode deployments for cautious testing to canary and blue-green deployments for gradual rollouts, and considerations regarding the degree of automation, these patterns provide invaluable guidance for optimizing the deployment process.
- **Shadow Mode Deployment**: In this pattern, a learning algorithm runs parallel to human decision-making but does not influence real-world decisions initially. Its output is observed and compared to human judgment to gather data and verify the algorithm's performance.

- **Canary Deployment**: This strategy involves gradually rolling out a new machine learning algorithm to a small fraction of the overall traffic. This allows for monitoring and testing of the algorithm's performance with a limited impact on the system. If issues arise, they affect only a small portion of the traffic.


- **Blue-Green Deployment**: Blue-Green deployment is used when switching between two versions of a software system, with the old version referred to as "blue" and the new version as "green." The deployment initially routes traffic to the old version and then switches it to the new version when ready. This approach enables quick rollback to the previous version if problems occur.


- **Degree of Automation**: choosing the appropriate degree of automation in deployment, ranging from full human decision-making to full automation by the learning algorithm. Intermediate levels of automation, such as AI assistance or partial automation, are also considered. The choice of automation depends on the application and system performance.

These deployment patterns help organizations ensure the successful deployment and monitoring of machine learning models.
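The sketch below illustrates shadow mode and a canary rollout in simplified form: in shadow mode both models run but only the current model's answer is returned, while in canary mode a small, configurable fraction of traffic is served by the new model. The two model functions and the 5% fraction are illustrative assumptions.
```python
import random

def current_model(x):
    return x * 2        # stand-in for the existing production model

def new_model(x):
    return x * 2 + 1    # stand-in for the candidate model

def shadow_mode(x):
    """Run both models, return only the current model's output, log the candidate's."""
    served = current_model(x)
    shadow = new_model(x)
    print(f"shadow log: current={served}, candidate={shadow}")  # compare offline later
    return served

CANARY_FRACTION = 0.05  # assumption: start with ~5% of traffic

def canary_mode(x):
    """Route a small fraction of traffic to the new model."""
    if random.random() < CANARY_FRACTION:
        return new_model(x)
    return current_model(x)
```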
## Error Analysis
The heart of the machine learning development process is error analysis, which helps efficiently improve algorithm performance. For example, in speech recognition, you can use tags to categorize errors and understand their origins. Tags, like "car noise" or "people noise," allow you to identify patterns in data that may be the source of errors. Error analysis is an iterative process, where you continually examine, tag, and classify examples. **By analyzing tags, you can determine which data categories require improvement**. Key metrics to consider include the fraction of errors with a specific tag, the misclassification rate for that tag, the prevalence of the tag in the dataset, and the potential for improvement, often measured against human-level performance.
- **What fraction of errors has that tag?**
- **Of all data with that tag, what fraction is misclassified?**
- **What fraction of all the data has that tag?**
- **How much room for improvement is there on data with that tag?**
This process is valuable for improving machine learning models and can be applied to various applications, from speech recognition to defect detection and product recommendations.
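The per-tag questions above map directly onto a few group-by computations. Here is a minimal pandas sketch over a hypothetical table of tagged examples; the column names and tags are assumptions.
```python
import pandas as pd

# Hypothetical error-analysis table: one row per example, with a tag and a correctness flag.
df = pd.DataFrame({
    "tag":     ["car noise", "car noise", "people noise", "clean", "clean", "low bandwidth"],
    "correct": [False,        True,        False,           True,    True,    False],
})

errors = df[~df["correct"]]
summary = pd.DataFrame({
    "fraction_of_errors_with_tag": errors["tag"].value_counts(normalize=True),
    "misclassification_rate_within_tag": 1 - df.groupby("tag")["correct"].mean(),
    "fraction_of_all_data_with_tag": df["tag"].value_counts(normalize=True),
}).fillna(0)

print(summary)
```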
### Prioritize what you work on
To prioritize your focus when improving your learning algorithm, consider factors like:
- **How much room for improvement there is relative to human-level performance**
- **How frequently the category appears in your dataset**
- **How easy it is to enhance accuracy in that category**
- **How important improving performance in that category is**
You can use these criteria to make fruitful decisions about what to work on. Once you identify the categories for improvement, a productive approach is to add more data or improve the data quality in those specific categories. Collecting more data can be time-consuming and expensive, but by analyzing the data, you can be more targeted in data collection or use data augmentation to enhance performance. This approach helps efficiently boost your learning algorithm's performance, making it more effective for building production-ready machine learning systems.
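One simple way to combine the first two criteria is to weight each category's gap to human-level performance by how much of the data it covers. The accuracies and frequencies below are illustrative numbers, not measurements.
```python
# Illustrative prioritization: gap to human-level performance weighted by data coverage.
categories = {
    # category: (model_accuracy, human_level_performance, fraction_of_data)
    "clean speech":  (0.94, 0.95, 0.60),
    "car noise":     (0.89, 0.93, 0.04),
    "people noise":  (0.87, 0.89, 0.30),
    "low bandwidth": (0.70, 0.70, 0.06),
}

for name, (acc, hlp, freq) in categories.items():
    potential_gain = (hlp - acc) * freq  # rough estimate of overall accuracy to be gained
    print(f"{name:14s} gap={hlp - acc:.2f} freq={freq:.2f} potential_gain={potential_gain:.3f}")
```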
### Skewed datasets
Skewed datasets, where the ratio of positive to negative examples is far from 50-50, require special handling.
In these cases, raw accuracy isn't a very informative metric. Instead, it's more useful to build something called the confusion matrix, which becomes crucial in understanding the algorithm's performance, especially when dealing with skewed data. A confusion matrix classifies predictions into four categories: true positives, false positives, true negatives, and false negatives.
Let's illustrate this with a manufacturing example. If, for instance, only a small fraction of smartphones are defective (1% labeled as "y equals 1") while the vast majority are not (99% labeled as "y equals 0"), a simplistic algorithm that always predicts "0" would still achieve 99% accuracy. However, for skewed datasets like this, precision and recall are more valuable metrics. Precision measures how accurate the algorithm's positive predictions are, while recall gauges how effectively it captures positive examples.
So, for this skewed manufacturing example, the confusion matrix would look like this:

These metrics are more informative and reliable when assessing algorithm performance in cases of skewed datasets.
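In terms of the confusion-matrix counts (true positives TP, false positives FP, false negatives FN), precision and recall are defined as:

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}$$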
In multi-class classification problems, such as defect detection in manufacturing, evaluating individual categories is important. For instance, if you're detecting different defects like scratches, dents, pit marks, and discolorations, each of which can be quite rare, you might want to assess precision and recall for each defect type.
> In manufacturing, it's common for factories to prioritize high recall as they aim to avoid allowing defective phones to leave the production line. If an algorithm exhibits slightly lower precision, it's generally acceptable because human inspection can be relied upon to verify the phone's condition. Therefore, many factories emphasize achieving high recall to ensure the detection of potential defects.
Combining precision and recall using the F1 score provides a single evaluation metric for the algorithm's performance across all defect types, enabling benchmarking to human-level performance and prioritization of improvements.

For example, in a factory setting, one model might achieve an F1 score of 90% for scratches but only 50% for dents, while another model scores 80% for scratches and 80% for dents. In this case, the first model is superior for detecting scratches, whereas the second model performs better in recognizing dents. The F1 score provides a balanced assessment of overall performance when dealing with rare classes.
In summary, **when dealing with skewed data, precision and recall, along with the confusion matrix and the F1 score, are crucial tools for understanding, evaluating, and prioritizing the performance of learning algorithms.**
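As a hedged sketch, scikit-learn (already listed among the model-development tools) can compute these per-class metrics directly; the labels and predictions below are purely illustrative.
```python
from sklearn.metrics import precision_recall_fscore_support

# Illustrative ground-truth and predicted defect labels for a batch of inspected phones.
y_true = ["none", "none", "scratch", "dent", "none", "scratch", "none", "dent"]
y_pred = ["none", "scratch", "scratch", "none", "none", "scratch", "none", "dent"]

precision, recall, f1, support = precision_recall_fscore_support(
    y_true, y_pred, labels=["scratch", "dent"], zero_division=0
)

for label, p, r, f in zip(["scratch", "dent"], precision, recall, f1):
    print(f"{label:8s} precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```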
### Performance Auditing
In the fast-paced world of machine learning and AI, ensuring the reliability and accuracy of your learning algorithms is paramount. Even when your models perform well on metrics like accuracy or F1 score, conducting a final performance audit before deploying them in production is a crucial step. This last check can help you uncover hidden issues and prevent potential post-deployment problems.

Let's explore the framework for this essential process:
**Step 1: Brainstorm Potential Pitfalls**
Begin by brainstorming all the ways your system might go wrong. Consider various aspects, such as its performance across different subsets of the data, including different demographics like genders or ethnicities, and its ability to avoid specific errors like false positives and false negatives, which are particularly significant in skewed datasets. You'll want to evaluate how your algorithm handles rare and important classes, as we discussed in the key challenges video earlier.
**Step 2: Define Metrics**
After identifying areas of concern, establish relevant metrics to assess your algorithm's performance regarding these potential issues. Common practice involves evaluating the system's performance on subsets or slices of the data, focusing on specific demographics or error types, rather than assessing the entire development dataset.
**Step 3: Leverage MLOps Tools**
MLOps tools, such as TensorFlow Model Analysis (TFMA), can automate the evaluation process, providing detailed metrics for different slices of data. These tools streamline the auditing of your models and enhance efficiency. These tools are particularly handy when you're dealing with complex data sets or large-scale machine learning operations.
**Step 4: Gain Buy-In**
To ensure that you're targeting the most relevant problems and using the right metrics, it's crucial to secure buy-in from your business or product stakeholders. Collaboratively decide on the most appropriate problems to address and the metrics to use when assessing your system's performance.
**Step 5: Problem Discovery and Resolution**
By following this framework, you'll have a systematic approach to discover any issues before they impact your production system. If any problems are identified, this proactive approach allows you to address them before deployment, saving you from unexpected complications in the future.
Let's illustrate this framework with an example involving speech recognition. Imagine you're building a speech recognition system, and you brainstorm various ways it might go wrong. For instance, you're concerned about the accuracy on different genders, ethnicities, or accents. You're also worried about the performance on various devices with distinct microphones. Additionally, you aim to avoid offensive or rude transcriptions. For each of these concerns, establish the relevant metrics, apply them to appropriate data slices, and assess the system's performance.
Standards for fairness and bias in AI continue to evolve, so it's essential to stay updated with industry-specific norms and best practices. Building a team or seeking external advice for brainstorming potential issues can significantly reduce the risk of overlooking critical considerations. By taking these steps and conducting thorough performance audits, you can deploy learning algorithms with confidence, knowing they've been rigorously evaluated and prepared for the demands of production environments.
## Testing Performance
In this chapter the focus is on effectively monitoring deployed machine learning systems to ensure they meet performance expectations. Monitoring is a crucial aspect of maintaining and improving machine learning systems in a real-world setting. This process involves setting up monitoring dashboards that track relevant metrics specific to the application. Depending on the system's nature, these metrics could encompass server load, null outputs, or the fraction of missing input values.
### Monitoring Dashboard
When deciding what to monitor, it's advisable to brainstorm potential issues and define metrics that would detect these problems. The goal is to ensure the system's robustness and responsiveness to anomalies. The monitoring dashboards may start with a wide array of metrics and gradually narrow down to the most relevant ones.

Key categories of metrics include **software metrics**, which assess the health of the software implementation, and **input and output metrics**, which track how the machine learning algorithm is performing statistically and whether the data distribution has changed. These metrics are tailored to the specific application and need to be configured for tracking using MLOps tools or custom implementations.

Machine learning deployment is an iterative process, and monitoring evolves over time. Real user data and real traffic enable performance analysis, which, in turn, informs updates to the deployment. It's common to adjust the metrics and their alarm thresholds as you gain a deeper understanding of the system's behavior. Automatic retraining of models may also be part of the process, depending on the application.


Ultimately, monitoring and maintaining machine learning systems are critical to ensuring they continue to deliver accurate and reliable results, while also adapting to changing data distributions or other issues that may arise in a production environment.
> [More info](https://christophergs.com/machine%20learning/2020/03/14/how-to-monitor-machine-learning-models/)
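As a toy example of an input metric with an alarm threshold, the fraction of missing input values in a recent window can be checked against a configurable limit. The window, the 20% threshold, and the print-based alert are assumptions; a real system would push such metrics into a monitoring tool like Prometheus or Datadog.
```python
# Toy monitoring check: fraction of missing input values vs. an alarm threshold.
import math

recent_inputs = [0.4, None, 1.2, None, 0.9, float("nan"), 2.1, 0.3]  # illustrative window

missing = sum(1 for v in recent_inputs if v is None or (isinstance(v, float) and math.isnan(v)))
missing_fraction = missing / len(recent_inputs)

MISSING_ALARM_THRESHOLD = 0.2  # assumption: alarm if more than 20% of inputs are missing

if missing_fraction > MISSING_ALARM_THRESHOLD:
    print(f"ALERT: {missing_fraction:.0%} of recent inputs are missing")
else:
    print(f"OK: {missing_fraction:.0%} of recent inputs are missing")
```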
### Monitoring Pipeline
Here the focus is on why it is crucial to effectively monitor machine learning pipelines, which consist of multiple steps or components. These complex systems often involve cascading effects, where changes in one component can affect the performance of subsequent models.
Specific examples, such as a speech recognition pipeline and user profile generation, illustrate how changes in data distribution or components can lead to issues in downstream processes. For instance, altering the way audio clips are processed by the voice activity detection module (VAD) can impact the quality of the speech recognition system. Similarly, changes in user behavior or data attributes can influence the performance of a recommendation system.


Therefore we have to monitor metrics at various stages of the pipeline, including software metrics, input metrics, and output metrics, so that data drift and concept drift are detected early. It is also good advice to consider the rate of data change, which can vary from slow changes over months to rapid shifts in minutes, depending on the application.

As with single-model monitoring, the process involves brainstorming potential issues, identifying relevant metrics, and configuring monitoring systems to ensure that the machine learning pipeline continues to deliver reliable results amid evolving data and components. The ability to detect issues early and take appropriate actions is vital for maintaining the quality of machine learning pipelines.
### Load Testing
Load testing is a crucial process in software development that involves testing the performance and behavior of a system under various load conditions. It helps identify potential bottlenecks, assess system capacity, and ensure optimal performance. Load testing is typically conducted in multiple stages, including unit testing, integration testing, UI testing, and acceptance testing. Let's delve into each step:

1. **Unit Testing:**
Unit testing focuses on individual components or units of code to ensure their proper functionality in isolation. In the context of load testing, unit testing involves assessing the performance and behavior of each component under load. This step usually includes testing individual functions, methods, or modules with different load scenarios to identify any performance issues or bottlenecks at a granular level.
2. **Integration Testing:**
Integration testing involves testing the interaction between different components or modules of a system. In load testing, integration testing verifies the performance and behavior of the system when multiple components are combined and subjected to load. The goal is to identify any performance issues that may arise due to the integration of different modules and to ensure that they can handle the expected load collectively.
3. **UI Testing:**
UI (User Interface) testing focuses on testing the performance and responsiveness of the system's graphical user interface under load. This step ensures that the system can handle the expected load while maintaining an acceptable level of usability and responsiveness for end users. UI testing often involves simulating user interactions, such as clicking buttons, filling out forms, or navigating through the system, to assess the system's behavior and performance under different load conditions.
4. **Acceptance Testing:**
Acceptance testing is the final step in load testing and involves validating the system's performance against predefined acceptance criteria. This step ensures that the system meets the performance requirements specified by stakeholders and performs adequately under the expected load. Acceptance testing may involve simulating realistic load scenarios or conducting tests with real-world data to assess the system's performance, stability, and response times.
Overall, these four steps of load testing (unit, integration, UI, and acceptance) collectively help identify performance issues, bottlenecks, and areas of improvement in a system to ensure it can handle the expected load and deliver optimal performance to end users.
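As a minimal, hypothetical example of the unit-testing stage applied to a prediction service, a pytest test can check both correctness and a rough per-call latency budget. The `predict_sum` function mirrors the `/sum` endpoint used in the Locust example below, and the 100 ms budget is an assumption.
```python
# test_predict.py -- run with: pytest test_predict.py
import time

def predict_sum(num1: int, num2: int) -> int:
    return num1 + num2  # stand-in for the real prediction logic behind /sum

def test_predict_sum_is_correct():
    assert predict_sum(1, 2) == 3

def test_predict_sum_meets_latency_budget():
    start = time.perf_counter()
    predict_sum(1, 2)
    elapsed_ms = (time.perf_counter() - start) * 1000
    assert elapsed_ms < 100  # assumption: 100 ms budget for a single prediction
```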
#### LOCUST
**Locust is an open-source, Python-based, distributed load testing tool**. It allows users to simulate large-scale concurrent user traffic on a system to measure its performance and identify potential bottlenecks. With a simple and intuitive syntax, Locust enables developers to define user behavior using Python code, making it easy to create realistic load scenarios. It supports distributed testing, provides real-time monitoring and reporting, and offers flexibility for customization. Locust is widely used for load testing web applications and APIs due to its scalability, ease of use, and powerful features.

**Locustfile**
`locustfile.py`:
```python
from locust import HttpUser, task, between

class MyUser(HttpUser):
    wait_time = between(1, 3)  # Wait time between tasks in seconds

    @task
    def my_task(self):
        # Send an HTTP GET request
        self.client.get("/")

    @task
    def my_task2(self):
        # Send an HTTP POST request
        self.client.post("/sum", name="sum", json={'num1': 1, 'num2': 2})
```
In this example, we define a Locust user class called MyUser that inherits from HttpUser, which is a base class provided by Locust for HTTP load testing. We set the wait_time attribute to specify the range of time the user waits between executing tasks. Inside the MyUser class, we define two tasks using the @task decorator.
We can run this Locustfile using the Locust command-line tool, specifying the host, the number of users, and the spawn rate. For example:
```
locust -f locustfile.py --host=http://localhost:8000 --users=10 --spawn-rate=2
```
This command will start the load test with 10 users and a spawn rate of 2 users per second, targeting the specified host (in this case, "http://localhost:8000").
You can then access Locust's web-based interface (usually at http://localhost:8089) to start the load test, monitor requests, response times, and other metrics in real time, and analyze the performance of your system under load.



## Microservices Architecture using Docker
### Microservices
Microservices are an architectural style for designing and developing software applications. In a microservices architecture, an application is built as a collection of small, loosely coupled, and independently deployable services that work together to provide the overall functionality of the application.
Traditionally, applications were built as monolithic systems, where all the components of the application were tightly integrated and deployed as a single unit. This approach can lead to challenges in terms of scalability, maintainability, and flexibility. Microservices offer an alternative approach to building applications by breaking them down into smaller, more manageable services.
Each microservice in a microservices architecture is designed to perform a specific business function and can be developed, deployed, and scaled independently of other services. These services communicate with each other through well-defined APIs (Application Programming Interfaces), typically using lightweight protocols such as HTTP/REST or messaging systems like RabbitMQ or Apache Kafka.
There are several key characteristics of microservices:
- **Decentralization**: Each microservice is developed and deployed independently, allowing teams to work on different services simultaneously and adopt different technologies and programming languages based on the requirements of each service.
- **Scalability**: Microservices enable horizontal scalability, meaning that individual services can be scaled independently based on their specific needs. This allows for better resource utilization and improved performance.
- **Flexibility and Agility**: Since services are decoupled, it is easier to modify, update, and replace individual services without affecting the entire system. This flexibility enables organizations to adapt to changing business requirements more rapidly.
- **Resilience and Fault Isolation**: Failures in one microservice do not propagate to other services, thanks to their loose coupling. This isolation improves fault tolerance and makes it easier to identify and fix issues.
- **Independent Deployment**: Microservices can be deployed independently, allowing for faster release cycles and enabling continuous deployment and integration practices.
- **Autonomous Teams**: Each microservice can be developed and maintained by a separate cross-functional team, giving them autonomy and accountability for their specific service. This decentralization promotes faster development cycles and fosters innovation.
However, microservices also introduce some challenges. They require additional effort in terms of service discovery, inter-service communication, data consistency, and managing the overall system complexity. Properly designing, deploying, and monitoring microservices architectures requires careful consideration and planning.
Overall, microservices provide a way to build scalable, flexible, and loosely coupled applications that can adapt to changing requirements and technological advancements.

While microservices offer several advantages, they also come with a set of challenges and potential disadvantages. Here are some common disadvantages of microservices:
- **Potential for Spiraling Out of Control**: In some cases, the number of microservices can proliferate rapidly, leading to an overwhelming number of services to manage. As the system grows, it becomes challenging to keep track of all the services, their dependencies, and their interactions. This can result in increased complexity, difficulty in understanding the overall system behavior, and a higher maintenance burden. Without proper governance and guidelines, the microservices architecture can become unwieldy and difficult to maintain.
- **Increased Complexity**: Microservices introduce a higher level of complexity compared to monolithic architectures. Managing a distributed system with multiple services requires additional effort in terms of service discovery, load balancing, fault tolerance, and monitoring. The complexity of inter-service communication can also be a challenge to address.
- **Network Latency and Overhead**: Microservices rely on network communication for inter-service interactions. This introduces potential latency and overhead compared to in-process communication in monolithic systems. Network failures or performance issues can affect the overall system performance.
- **Data Consistency**: Maintaining data consistency across multiple services can be challenging. Each microservice typically has its own database or data store, and ensuring data consistency and synchronization across services can become complex. Implementing transactions spanning multiple services is difficult and often requires adopting additional techniques like distributed transactions or event-driven architectures.
- **Service Coordination**: In a microservices architecture, services need to coordinate with each other to achieve complex business workflows. Implementing coordination and choreography mechanisms, such as event-driven architectures or workflow engines, can add additional complexity and introduce points of failure.
- **Operational Overhead**: Managing and deploying multiple services requires additional operational effort compared to a monolithic architecture. Each service needs to be individually deployed, monitored, and scaled. DevOps practices and infrastructure automation become crucial to handle the increased operational complexity.
- **Distributed System Challenges**: Microservices are distributed systems, which means they are susceptible to issues like network failures, partial failures, and eventual consistency. Handling distributed transactions, ensuring fault tolerance, and managing service discovery and configuration become important aspects to consider.
- **Testing and Debugging**: Testing and debugging microservices can be more complex compared to monolithic systems. Testing scenarios involving multiple services, managing test data consistency, and debugging issues that span across services can be challenging.
- **Learning Curve and Team Skills**: Adopting microservices may require a learning curve for development teams. The shift from a monolithic mindset to a microservices mindset involves understanding new architectural principles, implementing distributed systems best practices, and adopting new tools and technologies. Additionally, cross-functional teams need to have expertise in specific technologies used by each service.
It's important to note that while these disadvantages exist, they can be mitigated with proper architecture design, best practices, and the right tooling. Microservices are not a one-size-fits-all solution and should be carefully evaluated based on the specific needs and requirements of the application and organization.
Here is an example from Uber, which had too many APIs:

When a microservices architecture spirals out of control, it often manifests in the following ways:
- **Service Explosion**: The number of microservices grows exponentially as different teams and developers create services to fulfill specific needs. This can lead to a large number of services, making it harder to manage, deploy, and monitor them effectively.
- **Dependency Hell**: Microservices may have dependencies on other services, and as the number of services increases, so does the complexity of managing those dependencies. It becomes challenging to understand the impact of changes and updates in one service on the others, potentially leading to compatibility issues and versioning problems.
- **Operational Overload**: With a large number of services, the operational overhead increases significantly. It becomes more challenging to manage deployments, monitor performance, ensure scalability, and handle updates or rollbacks across numerous services.
- **Inconsistent Practices**: With multiple teams working on different services, there is a risk of inconsistent practices and varying levels of quality. Each team may have its own approach to development, deployment, testing, and documentation, leading to a lack of standardization and increased maintenance efforts.
To mitigate the risk of spiraling out of control, it is crucial to have clear guidelines, standards, and governance in place. This includes establishing a well-defined service discovery mechanism, implementing monitoring and observability practices, enforcing consistent development and testing standards, and regularly reviewing the service landscape to identify opportunities for consolidation or refactoring.
### Message Broker
A message broker is a component or middleware that facilitates the exchange of messages between different software applications or components. It acts as an intermediary, enabling communication and coordination between distributed systems by providing a reliable and asynchronous messaging infrastructure.
In a messaging system, producers are responsible for sending messages, while consumers receive and process those messages. The message broker sits in between, ensuring that messages are delivered to the appropriate recipients efficiently and reliably.
- **Producer**: A producer is an application or component that is responsible for sending messages to the message broker. It establishes a connection with the broker and publishes messages to the broker for further processing and delivery.
- **Consumer**: A consumer is an endpoint or application that receives and consumes messages from the message broker. It establishes a connection with the broker and subscribes to specific queues or topics to receive relevant messages. Consumers process these messages based on their specific requirements or business logic.
- **Queue**: A queue is a fundamental concept in message brokers. It is a data structure used by the message broker to store messages until they are consumed by a consumer. Messages sent to a queue are typically processed in a first-in, first-out (FIFO) manner, meaning that the order in which they are received is preserved.
**Point-to-point**:
Point-to-point messaging is a messaging pattern where messages are sent from a sender to a specific receiver, ensuring that only one receiver consumes each message.

**Publish/Subscribe**:
Publish/Subscribe messaging is a pattern where messages are published to topics, and multiple subscribers receive those messages without the need for senders and receivers to have direct knowledge of each other.
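As a hedged sketch of the point-to-point pattern using RabbitMQ's Python client `pika`, one process publishes messages to a named queue and another consumes them. The queue name, host, and message body are assumptions, and in practice the producer and consumer run as separate services.
```python
import pika

# Producer: publish a message to a named queue (point-to-point).
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="predictions")
channel.basic_publish(exchange="", routing_key="predictions", body=b'{"user_id": 42}')
connection.close()

# Consumer: read messages from the same queue, one at a time.
def handle_message(ch, method, properties, body):
    print("received:", body)

consumer_connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
consumer_channel = consumer_connection.channel()
consumer_channel.queue_declare(queue="predictions")
consumer_channel.basic_consume(queue="predictions", on_message_callback=handle_message, auto_ack=True)
consumer_channel.start_consuming()  # blocks, waiting for incoming messages
```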

**Here are some key characteristics and functionalities of a message broker:**
- **Message Routing**: The message broker receives messages from producers and routes them to the appropriate consumers based on predefined rules or patterns. This allows decoupling between message producers and consumers, as they don't need to know each other's details or location.
- **Message Queuing**: The broker often implements a message queue, where messages are stored until they can be delivered to the consumers. This allows for asynchronous communication, where producers can send messages without requiring immediate consumption by consumers.
- **Message Transformation**: The broker can perform message transformation or enrichment by modifying the content or format of messages. This can include converting between different data formats, aggregating multiple messages, or adding metadata to messages.
- **Message Persistence**: Some message brokers provide persistence capabilities, allowing messages to be stored even if the broker or consumer is temporarily unavailable. This ensures message durability and prevents message loss.
- **Delivery Guarantees**: Message brokers often support various delivery guarantees to ensure reliable message delivery. This includes options such as at-most-once delivery (messages may be lost), at-least-once delivery (messages may be duplicated but not lost), and exactly-once delivery (messages are delivered exactly once).
- **Pub/Sub and Point-to-Point**: Message brokers typically support both publish/subscribe (pub/sub) and point-to-point messaging patterns. In pub/sub, messages are published to topics, and multiple subscribers can receive the messages. In point-to-point, messages are sent to specific queues, and only one consumer can receive each message.
- **Scalability and Load Balancing**: Message brokers can be designed to handle high message volumes and provide mechanisms for load balancing and scalability. They can distribute messages across multiple broker instances or nodes to ensure efficient message processing.
Popular message broker implementations include Apache Kafka, RabbitMQ, ActiveMQ, and AWS Simple Queue Service (SQS). These brokers are used in various domains and scenarios, such as event-driven architectures, real-time data streaming, job processing, and distributed systems communication.
Overall, a message broker plays a vital role in enabling asynchronous, reliable, and decoupled communication between different components or systems, facilitating the exchange of messages and ensuring efficient message delivery.
**Drawbacks**
While message brokers offer several benefits, there are also some drawbacks to consider:
- **Single Point of Failure**: A message broker can become a single point of failure. If the broker goes down, it can disrupt the entire messaging system, impacting message delivery and communication between producers and consumers. Implementing high availability and redundancy measures is crucial to mitigate this risk.
- **Increased Complexity**: Introducing a message broker adds complexity to the overall system architecture. Developers need to understand and implement the messaging protocol, handle message serialization and deserialization, manage message routing and subscriptions, and ensure proper error handling and recovery mechanisms.
- **Potential Performance Impact**: The use of a message broker can introduce additional network latency and processing overhead compared to direct communication between components. The broker needs to handle message routing, persistence, and potential transformations, which can impact the overall system performance, especially in high-throughput scenarios.
- **Operational Overhead**: Running and managing a message broker requires additional operational effort. It involves setting up and maintaining the broker infrastructure, monitoring its health and performance, ensuring proper configuration and scalability, and handling upgrades or migrations. This adds complexity to the overall system operations.
- **Message Ordering Challenges**: While message brokers offer message delivery guarantees, ensuring strict ordering of messages can be challenging. In certain scenarios, messages may arrive out of order, requiring additional logic and synchronization mechanisms to handle sequence-dependent operations correctly.
- **Potential Data Loss**: Message brokers that do not provide built-in persistence mechanisms may risk data loss if the broker or consumer fails before messages are consumed. In such cases, the messages in transit or waiting in the broker may be lost, potentially impacting data integrity and consistency.
- **Vendor Lock-in**: Some message brokers may introduce vendor lock-in, as they come with proprietary protocols and APIs. Switching to a different message broker implementation may require significant effort and changes to the codebase and infrastructure.
- **Learning Curve and Complexity**: Working with message brokers introduces a learning curve for developers. They need to understand the concepts, protocols, and best practices associated with messaging systems. This can require additional training and expertise, especially for teams new to message brokers.
It's essential to weigh these drawbacks against the benefits and specific requirements of your system before deciding to adopt a message broker. Careful evaluation and architectural planning are necessary to ensure that the benefits outweigh the potential drawbacks in your particular use case.
**Most common Message Brokers**
- **Apache Kafka**: Apache Kafka is a distributed streaming platform that provides high-throughput, fault-tolerant, and scalable messaging capabilities. It is widely used for real-time data streaming, event sourcing, and building event-driven architectures.
- **RabbitMQ**: RabbitMQ is an open-source message broker that implements the Advanced Message Queuing Protocol (AMQP). It supports various messaging patterns, including pub/sub and point-to-point, and provides robust message queuing and routing capabilities.
- **ActiveMQ**: Apache ActiveMQ is a popular open-source message broker that supports multiple messaging protocols, including AMQP, MQTT, and STOMP. It offers features like message persistence, high availability, and flexible message routing.
- **Amazon Simple Queue Service (SQS)**: SQS is a managed message queuing service provided by Amazon Web Services (AWS). It offers reliable and scalable message queuing with features like message retention, delayed delivery, and dead-letter queues.
- **Redis**: Redis is an in-memory data structure store, but it also provides messaging capabilities through its Pub/Sub feature. While not a dedicated message broker, Redis can be used as a lightweight messaging system for simple pub/sub scenarios or as a component in a larger messaging infrastructure.
These message brokers are widely adopted and have robust ecosystems, providing a range of features and integration options to support different messaging needs.
### **Docker**
Docker is an open-source platform that allows you to automate the deployment, scaling, and management of applications using containerization. It provides an efficient and consistent way to package applications, along with their dependencies and runtime environment, into containers. These containers are isolated environments that can run on any system that supports Docker, providing consistency and portability across different computing environments.
**Virtual Machine vs Docker container:**
A virtual machine (VM) is a software emulation of a physical computer system. It allows multiple operating systems to run on a single physical machine by abstracting the underlying hardware and providing each virtual machine with its own virtualized hardware resources, including CPU, memory, storage, and network interfaces. Each VM runs its own complete operating system, and applications within the VM are isolated from the host system and other VMs.

Docker, on the other hand, uses containerization, which enables applications to run in lightweight and isolated environments. Docker containers share the host machine's OS kernel, resulting in faster startup times, less resource overhead, and better utilization of system resources. Docker is more efficient and provides greater flexibility for deploying and scaling applications compared to VMs.

*Advantages of a Virtual Machine (VM):*
- **Easier to use, user-friendly interface**: VMs often come with user-friendly interfaces and management tools that make them easier to set up and use, especially for users who are less familiar with command-line interfaces or containerization concepts.
- **Can be the starting point for using Docker**: VMs can serve as a starting point for using Docker. By running a VM and then installing Docker within the VM, you can leverage the benefits of both technologies. This approach allows you to isolate and manage your containers within the VM environment.
- **Can emulate architectures**: VMs provide the ability to emulate different hardware architectures. This is useful when you need to run software on a specific architecture that differs from the host machine, such as running an ARM-based operating system on an x86 machine.
*Advantages of a Docker Container:*
- **Consumes fewer resources**: Docker containers are more lightweight and efficient compared to virtual machines. Containers share the host system's kernel and utilize a minimal runtime environment, resulting in reduced resource consumption and optimized resource utilization.
- **Faster boot times**: Containers start significantly faster than virtual machines because they don't need to boot an entire operating system. Docker containers leverage the host system's kernel and only load the necessary dependencies, enabling near-instant startup times.
- **Better portability**: Docker containers are highly portable and can run on any system that has Docker installed, regardless of the underlying infrastructure. Containers encapsulate the application and its dependencies into a single container image, ensuring consistency and portability across different environments.
In summary, virtual machines provide user-friendly interfaces, support emulating different architectures, and can serve as a foundation for using Docker. On the other hand, Docker containers offer resource efficiency, faster boot times, and superior portability. Choosing between VMs and Docker containers depends on the specific requirements, use case, and trade-offs you need to consider.
**Key components:**
Here are the key components and concepts in Docker:
- **Docker Engine**: This is the core component of Docker that runs and manages containers. It consists of a client-server architecture, where the Docker client interacts with the Docker daemon (server) to build, run, and manage containers.
- **Docker Image**: An image is a lightweight, standalone, and executable software package that includes everything needed to run a piece of software, including the code, runtime, system tools, libraries, and dependencies. Images are built based on instructions defined in a Dockerfile, which specifies the steps to create the image.
- **Container**: A container is a runnable instance of an image. It provides a consistent and isolated environment for an application to run, ensuring that it has all the necessary dependencies and configurations. Containers are lightweight, portable, and can be easily replicated or moved across different environments.
- **Docker Registry**: A registry is a centralized repository that stores Docker images. The default public registry is Docker Hub, but you can also set up private registries to store and share your own images within your organization.
- **Dockerfile**: It is a text file that contains a set of instructions for building a Docker image. These instructions define the base image, environment variables, dependencies, commands to run, and other configurations needed for the application. Dockerfiles are highly customizable and allow for automating the image creation process.
- **Docker Compose**: Docker Compose is a tool for defining and running multi-container Docker applications. It uses a YAML file to configure and manage multiple services, allowing you to define the relationships and dependencies between containers. This simplifies the process of orchestrating complex applications.
- **Orchestration**: Docker Swarm and Kubernetes are popular orchestration tools that help manage a cluster of Docker hosts. They provide features for scaling, load balancing, service discovery, and high availability. Orchestration tools are used in production environments to manage and deploy containers across multiple machines or cloud instances.
Docker has revolutionized software development and deployment by providing a consistent and reproducible environment across different platforms. It simplifies the process of packaging and distributing applications, making it easier to collaborate, deploy, and scale software systems.

#### **Dockerfile:**
Let's break down the Dockerfile instructions used in the Node.js example shown below:
- **FROM**: Specifies the base image for the Docker image. In this example, we're using the official Node.js 14 image as the starting point.
- **WORKDIR**: Sets the working directory inside the container where subsequent instructions will be executed. In this case, it sets the working directory to /app.
- **COPY**: Copies the package.json and package-lock.json files from the host machine to the working directory (.) inside the container.
- **RUN**: Runs a command during the image build process. Here, it executes 'npm install' inside the container to install the application dependencies based on the package.json files.
- **COPY**: Copies the rest of the application code from the host machine to the working directory inside the container.
- **EXPOSE**: Specifies the port that the container will listen on. In this example, it exposes port 3000.
- **CMD**: Defines the command to start the application when the container is launched. Here, it uses npm start to start the Node.js application.
Let's see some examples:
- [ ] *Here's an example of a simple Dockerfile for a Node.js application:*
```
# Specify the base image
FROM node:14
# Set the working directory inside the container
WORKDIR /app
# Copy package.json and package-lock.json to the working directory
COPY package*.json ./
# Install application dependencies
RUN npm install
# Copy the rest of the application code to the working directory
COPY . .
# Expose a port that the container will listen on
EXPOSE 3000
# Define the command to start the application
CMD [ "npm", "start" ]
```
With this Dockerfile, you can build an image by running the following command in the directory containing the Dockerfile:
```
docker build -t my-node-app .
```
The resulting image can then be used to run containers, allowing you to deploy and run your Node.js application in a consistent and isolated environment.
- [ ] *Here's an example of a Dockerfile for setting up a container with a PostgreSQL database:*
```
# Specify the base image
FROM postgres:13
# Set environment variables
ENV POSTGRES_USER myuser
ENV POSTGRES_PASSWORD mypassword
ENV POSTGRES_DB mydatabase
# Copy initialization scripts to the Docker image
COPY init.sql /docker-entrypoint-initdb.d/
# Expose the default PostgreSQL port
EXPOSE 5432
```
In this example:
- **FROM**: We're using the official PostgreSQL 13 image as the base image.
- **ENV**: Environment variables are set to configure the PostgreSQL database. Replace myuser, mypassword, and mydatabase with the desired values. These environment variables set the default username, password, and database name during container initialization.
- **COPY**: The init.sql script is copied to the /docker-entrypoint-initdb.d/ directory inside the container. This directory is automatically executed during the initialization of the PostgreSQL container, allowing you to run custom SQL scripts on database creation.
- **EXPOSE**: The default PostgreSQL port 5432 is exposed, allowing other containers or host systems to access the PostgreSQL database.
To build the image, you can run the following command in the directory containing the Dockerfile:
```
docker build -t my-postgres .
```
After building the image, you can run a container from it:
```
docker run -d -p 5432:5432 my-postgres
```
This command creates and starts a container from the image, mapping the container's port 5432 to the host system's port 5432. The PostgreSQL database will now be accessible on the host system using the specified port.
Remember to create the init.sql script and place it in the same directory as the Dockerfile. The script can contain the necessary SQL statements to initialize the database schema or perform any other required setup.
This example demonstrates how to set up a PostgreSQL database using Docker, but you can adapt the Dockerfile to set up other SQL databases, such as MySQL or SQLite, by changing the base image and modifying the initialization scripts accordingly.
#### Creating images with Dockerfiles

A Dockerfile always starts from another image, specified using the FROM instruction.
For example, in our API Dockerfile:
`FROM python:3.8.13 as base`
> 'base' is the alias used inside the Dockerfile
In this case we are using the official Python 3.8.13 image and giving it the name `base`, which lets us reference this stage later (e.g. in multi-stage builds).
1. Creating the image: build it from the Dockerfile with `docker build -t imagename .`, run from the directory that contains the Dockerfile.

2. Customise images:
- RUN
```
# Syntax: RUN <valid-shell-command>
FROM ubuntu
RUN apt-get update
RUN apt-get install -y python3
```
Use the -y flag to avoid any prompts

- COPY
The COPY instruction copies files from our local machine into the image we are building:
```
COPY src-path-on-host dest-path-on-image
```
If the source path is a folder rather than a single file, all of its contents are copied:
```
COPY src-folder dest-folder
```

- CMD
The CMD instruction has three forms:
- `CMD ["executable","param1","param2"]` (exec form, this is the preferred form)
- `CMD ["param1","param2"]` (as default parameters to ENTRYPOINT)
- `CMD command param1 param2` (shell form)
There can only be one CMD instruction in a Dockerfile. If you list more than one CMD then only the last CMD will take effect.
The main purpose of a CMD is to provide defaults for an executing container.
- ENTRYPOINT
ENTRYPOINT has two forms:
The exec form, which is the preferred form:
`ENTRYPOINT ["executable", "param1", "param2"]`
The shell form:
`ENTRYPOINT command param1 param2`
An ENTRYPOINT allows you to configure a container that will run as an executable.
3. Run image:
`docker run -it imagename /bin/bash`
## Deployment with APIs
An Application Programming Interface (API) is a type of service with a set of rules (code) and specifications that applications can follow to communicate with each other. It serves as an interface between programs, in the same way a user interface makes human-software interaction easier. Based on their purpose, APIs can be categorized as:
- Open: public.
- Partner: public, but access is limited to authorized clients.
- Internal: used internally by other services.
- Composite: can handle multiple tasks in a single request.
### **REST**
Representational State Transfer (REST) is a software architecture style that imposes constraints on how an API should work; implementing those constraints results in a RESTful API. A RESTful API works over HTTP, so the communication happens through HTTP methods.

**REST - Constraints:**
- Client - Server: The client (that makes requests) and server (that makes responses) stay separated and are independent of each other.
- Uniform interface: Implementation follows standardized rules (HTTP) to be able to work in other APIs effortlessly.
- Stateless: The server doesn’t store state, which means each request is treated independently of the previous request. The client is responsible for managing the state of the application.
- Cacheable: when applicable, responses must declare themselves cacheable so that caching can be applied to those resources. Caching can be implemented on the server or client side.
- Layered system: REST allows you to use a layered system architecture, in this way a client cannot ordinarily tell whether it is connected directly to the end server or an intermediary along the way.
- Code on demand (optional): you are free to return executable code instead of static data
**REST - URIs**
REST APIs use Uniform Resource Identifiers (URIs) to address resources. URI designs range from masterpieces that clearly communicate the API's resource model, like:
`http://api.example.com/louvre/leonardo-da-vinci/mona-lisa`
To those that are much harder for people to understand, such as:
`http://api.example.com/68dd0-a9d3-11e0-9f1c-0800200c9a66`
RFC 3986 defines the generic URI syntax as shown below:
`URI = scheme "://" authority "/" path [ "?" query ] [ "#" fragment ]`
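As a quick illustration of that syntax, the standard library's urllib.parse can split a URI into those components (the URI below is hypothetical):
```
from urllib.parse import urlparse

parts = urlparse('http://api.example.com/louvre/paintings?artist=da-vinci#top')
print(parts.scheme)    # 'http'
print(parts.netloc)    # 'api.example.com'  (the authority)
print(parts.path)      # '/louvre/paintings'
print(parts.query)     # 'artist=da-vinci'
print(parts.fragment)  # 'top'
```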
**REST - 7 rules for URI**
1. A trailing forward slash should not be included in URIs.
- http://api.canvas.com/shapes instead of this http://api.canvas.com/shapes/
2. Forward slash separator must be used to indicate a hierarchical relationship.
- http://api.canvas.com/shapes/polygons/quadrilaterals/squares
3. Hyphens should be used to improve the readability of URIs.
- http://api.example.com/blogs/jane-doe/posts/this-is-my-first-post
4. Underscores (_) should not be used in URIs.
5. Lowercase letters should be preferred in URI paths.
- This http://api.example.com/my-folder/my-doc is preferred over HTTP://API.EXAMPLE.COM/my-folder/my-doc
6. File extensions should not be included in URIs.
- This http://api.college.com/students/3248234/courses/2005/fall is preferred over this http://api.college.com/students/3248234/courses/2005/fall.json
7. Should the endpoint name be singular or plural? Keep the URI format consistent and always use a plural. Example:
- http://api.college.com/students/3248234/courses retrieves a list of all courses that are learned by a student with id 3248234
- http://api.college.com/students/3248234/courses/physics retrieves course physics for a student with id 3248234.
**REST - HTTP methods**
The most common methods used in communication via REST are:
- **GET**: retrieve a resource.
- **POST**: create a new resource.
- **PUT**: replace an existing resource.
- **PATCH**: partially update a resource.
- **DELETE**: remove a resource.

**REST - HTTP status codes**
Status codes in the response indicate the outcome of a request. The main classes are:
- **2xx** success (e.g. 200 OK, 201 Created, 204 No Content)
- **3xx** redirection (e.g. 301 Moved Permanently)
- **4xx** client error (e.g. 400 Bad Request, 401 Unauthorized, 404 Not Found)
- **5xx** server error (e.g. 500 Internal Server Error, 503 Service Unavailable)

### **REST APIs with Flask**
Flask is a popular web framework that provides developers with the necessary tools, libraries, and technologies to build web applications. It offers a flexible and lightweight approach to web development, allowing developers to create various types of applications ranging from simple web pages and blogs to more complex projects such as wikis, calendars, and commercial websites.
One of the key features of Flask is its integration with Jinja2 templates, a powerful templating engine that allows developers to separate the presentation layer from the application logic. This enables the creation of dynamic and interactive web pages.
Flask is compliant with WSGI 1.0 (Web Server Gateway Interface), which is a standard interface between web servers and web applications in Python. This ensures compatibility and interoperability with different web servers and deployment environments.
Another advantage of Flask is its built-in support for unit testing. This means developers can easily write tests to ensure the correctness and robustness of their web applications, facilitating the development of reliable and maintainable code.
Furthermore, Flask benefits from a vast ecosystem of extensions. These extensions provide additional functionalities that can be integrated seamlessly into Flask applications, enhancing their capabilities. These extensions cover a wide range of features, including database integration, authentication and authorization mechanisms, API development, and more.
Overall, **Flask's simplicity, flexibility, and extensibility make it a popular choice among developers for building web applications of varying complexity.**
#### **Minimal flask APP**
```
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello():
    return 'Hello, World!'

if __name__ == '__main__':
    app.run()
```
In this minimal Flask app, we first import the Flask class from the flask module. We then create an instance of the Flask class and assign it to the variable app.
Next, we define a route using the @app.route decorator. In this case, the route is '/', which represents the root URL of the application. The decorated function, hello(), will be called when a request is made to the root URL.
Inside the hello() function, we simply return the string 'Hello, World!', which will be displayed in the browser when accessing the root URL.
Finally, we use the if __name__ == '__main__': conditional statement to ensure that the app only runs if the script is executed directly (not imported as a module). We call the app.run() method to start the Flask development server.
To run this app, save the code in a Python file (e.g., app.py) and execute it using a Python interpreter. You should see a message indicating that the Flask server is running. You can then access the app by visiting http://localhost:5000 in your web browser, and you'll see the "Hello, World!" message displayed.
#### **Routing**
Routing in Flask refers to the process of mapping URLs to specific functions or views in your application. Each route is associated with a specific URL pattern, and when a request is made to that URL, Flask invokes the corresponding view function.
Here's an example of a Flask application with a route that returns a JSON response:
```
from flask import Flask, request

app = Flask(__name__)

@app.route('/sum', methods=['POST'])
def sum_numbers():
    data = request.get_json()
    num1 = data['number1']
    num2 = data['number2']
    result = num1 + num2
    return {'result': result}

if __name__ == '__main__':
    app.run()
```
To test this example, send a POST request to http://localhost:5000/sum with a JSON body containing the keys 'number1' and 'number2' (the route reads them with request.get_json(), so query parameters will not work). Sending {"number1": 5, "number2": 3} returns the following JSON body:
```
{
  "result": 8
}
```
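For example, with the app running locally, the endpoint can be exercised from Python using the requests library (a sketch; the port assumes Flask's default 5000):
```
import requests

response = requests.post(
    'http://localhost:5000/sum',
    json={'number1': 5, 'number2': 3},  # sent as a JSON body, matching request.get_json()
)
print(response.json())  # {'result': 8}
```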
#### **HTTP Methods**
HTTP (Hypertext Transfer Protocol) methods, also known as HTTP verbs, are the actions that can be performed on resources identified by URLs (Uniform Resource Locators). They define the type of operation to be carried out on a particular resource when a client makes a request to a server.

Here are some commonly used HTTP methods and their purposes:
- **GET**: retrieve a representation of a resource from the server, primarily used for fetching data. The server sends back the requested resource in the response.
  - **request.args**: the key/value pairs in the URL query string.
- **POST**: submit data to the server to create a new resource. It is often used when submitting forms or uploading files. The data sent with a POST request is typically included in the body of the request.
  - **request.form**: the key/value pairs in the body, from an HTML POST form, or a JavaScript request that isn't JSON encoded.
  - **request.files**: the files in the body, which Flask keeps separate from form. HTML forms must use enctype=multipart/form-data or files will not be uploaded.
  - **request.json**: parsed JSON data. The request must have the application/json content type, or use request.get_json(force=True) to ignore the content type.
- **PUT**: to update an existing resource on the server. It replaces the entire representation of the resource with the new data provided in the request. If the resource does not exist, it may be created.
- **DELETE**: to delete a specified resource on the server identified by the URL.
- **PATCH**: to partially update a resource.
- **HEAD**: similar to the GET method, but it retrieves only the response headers and does not include the actual resource body. Often used to check the metadata or headers of a resource without downloading the entire content.
When building web applications with Flask, you can define routes that handle requests with different HTTP methods. For example, you can have a route that handles GET requests to display a webpage and another route that handles POST requests to process form submissions.
Here's an example of a Flask route that handles both **GET** and **POST** requests:
```
from flask import Flask, request

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'POST':
        # Handle form submission
        name = request.form.get('name')
        # Process the submitted data
        return 'Hello, ' + name + '!'
    else:
        # Handle GET request
        return 'Welcome to the homepage.'

if __name__ == '__main__':
    app.run()
```
In this example, the route '/' is configured to accept both **GET** and **POST** requests. Inside the function, we check the **'request.method'** attribute to determine the type of request. If it's a **POST** request, we retrieve the submitted data from the form using request.form.get() and process it accordingly. If it's a GET request, we return a simple welcome message.
By combining Flask with the appropriate HTTP methods, you can create versatile web applications that handle different types of requests and perform various operations on resources.
#### **Template rendering**
Template rendering is a process where dynamic data is combined with pre-defined templates to generate a final output, such as a web page. The templates contain placeholders that are replaced with actual values, resulting in a customized and dynamic presentation of the data.
Templates can be used to generate any type of text file. For web applications, you'll primarily be generating HTML pages, but you can also generate markdown, plain text for emails, and anything else. To render a template we can use the render_template() function:

- Example:
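A minimal sketch, assuming a hypothetical templates/greeting.html file containing `<h1>Hello, {{ name }}!</h1>`:
```
from flask import Flask, render_template

app = Flask(__name__)

@app.route('/greet/<name>')
def greet(name):
    # Renders templates/greeting.html, replacing the {{ name }} placeholder
    return render_template('greeting.html', name=name)

if __name__ == '__main__':
    app.run()
```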



When you render templates, the framework will search for them in a directory called "templates". If your Flask application is structured as a module, the "templates" folder should be placed alongside that module in the directory structure. Flask will automatically find and use the templates from this folder when rendering dynamic content.

> *Module: In Python, a module is simply a file that contains Python definitions, functions, classes, etc. that can be imported and used in other Python code. In the context of a Flask application, structuring your code as a module means that your application is defined in a Python module, with a `__init__.py` file that sets up the Flask application instance and contains the application code. This allows you to more easily organize and modularize your code, and also makes it easier to reuse code in other applications or modules.*
**Looping**
Template rendering with a loop allows for repetitive content generation by iterating over a collection of data and dynamically generating output based on each item in the collection.
Using Jinja we can iterate over a list to repeat a block of HTML:


The `{% for %}` loop iterates over the items list. For each item, it generates an `<li>` element containing the value of the item. The loop repeats this process until all items in the list have been processed, resulting in a list of `<li>` elements in the final HTML output.
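A minimal sketch of such a loop, using render_template_string so the template can be shown inline (the items list is hypothetical):
```
from flask import Flask, render_template_string

app = Flask(__name__)

ITEMS_TEMPLATE = """
<ul>
{% for item in items %}
  <li>{{ item }}</li>
{% endfor %}
</ul>
"""

@app.route('/items')
def show_items():
    # Each element of 'items' becomes one <li> in the rendered HTML
    return render_template_string(ITEMS_TEMPLATE, items=['apples', 'bananas', 'cherries'])
```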

**Condition**
Template rendering with conditions allows for dynamic content generation based on certain conditions. It enables the rendering of different blocks of content or applying different logic based on the values of variables or other conditions.
We can also add conditions to what we show in our HTML files, reusing the same template but producing different outputs based on the logic we implement.
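A small sketch of a conditional template, again shown inline with render_template_string (the user variable is hypothetical):
```
from flask import Flask, render_template_string

app = Flask(__name__)

PROFILE_TEMPLATE = """
{% if user %}
  <h1>Welcome back, {{ user }}!</h1>
{% else %}
  <h1>Please log in.</h1>
{% endif %}
"""

@app.route('/profile')
@app.route('/profile/<user>')
def profile(user=None):
    # The same template produces different HTML depending on whether 'user' is set
    return render_template_string(PROFILE_TEMPLATE, user=user)
```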


**File uploads**
When files are uploaded in web applications, they are temporarily stored in memory or a temporary location on the server's filesystem. These files can be accessed through the "files" attribute of the request object, which contains a dictionary of uploaded files. Each file behaves like a regular Python file object, but with an additional "save()" method that allows you to permanently store the file on the server's filesystem.

> *In a microservice application with a machine learning model, file uploads can play a crucial role in providing input data for the model's predictions. The application may receive files containing data or images that need to be processed by the machine learning model. These uploaded files can be temporarily stored in memory or a temporary location on the server's filesystem. The microservice can access these files through the request object's "files" attribute. The machine learning model can then load and process the uploaded files as needed. After processing, the model's predictions or results can be returned to the client or stored for further analysis. The "save()" method can be utilized to store the processed files on the server's filesystem for future reference or auditing purposes.*
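A minimal sketch of an upload endpoint (the form field name 'file' and the uploads/ directory are assumptions):
```
import os
from flask import Flask, request
from werkzeug.utils import secure_filename

app = Flask(__name__)
UPLOAD_DIR = 'uploads'
os.makedirs(UPLOAD_DIR, exist_ok=True)

@app.route('/upload', methods=['POST'])
def upload():
    # The uploaded file arrives in request.files under its form field name
    uploaded = request.files['file']
    # secure_filename() strips path components and unsafe characters
    filename = secure_filename(uploaded.filename)
    uploaded.save(os.path.join(UPLOAD_DIR, filename))
    return {'filename': filename, 'status': 'saved'}
```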
#### **Blueprint**
Blueprint in Flask is a way to organize and group related routes, templates, and static files into reusable components. It allows for modular application design by defining a collection of routes and associated functionality within a blueprint object. Blueprints can be registered with an application and used to create a structured and scalable Flask application with distinct features or sections.

- Example
Create blueprint:

Import:

Instantiate:
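A minimal sketch of those three steps, assuming a suma endpoint that adds two numbers (module and variable names are hypothetical):
```
# api.py - create the blueprint
from flask import Blueprint, request

api = Blueprint('api', __name__)

@api.route('/suma', methods=['POST'])
def suma():
    data = request.get_json()
    return {'result': data['number1'] + data['number2']}
```
```
# app.py - import the blueprint and register (instantiate) it with a URL prefix
from flask import Flask
from api import api

app = Flask(__name__)
app.register_blueprint(api, url_prefix='/api')

if __name__ == '__main__':
    app.run()
```
Because of the url_prefix, the endpoint is only reachable under /api/suma, which matches the test results below.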

Testing: `/suma` doesn't work

Testing: `/api/suma` works fine

### **Architecture**
#### **Simplest Architecture**
In a simple architecture for a Flask API with a machine learning model, you would use Flask as the web framework to handle HTTP requests. Routes would be defined to handle different API requests, such as predictions or data uploads. The machine learning model would be incorporated within the Flask application to process the requests and generate responses. Data handling logic would preprocess and format the incoming data, and the model's output would be processed to construct appropriate responses. Finally, the Flask API and the machine learning model would be deployed on a server or cloud platform for user accessibility.
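A minimal sketch of this single-service approach, assuming a scikit-learn model serialized with joblib at model.joblib (both the library and the path are assumptions):
```
from flask import Flask, request
import joblib

app = Flask(__name__)
model = joblib.load('model.joblib')  # loaded once, at startup

@app.route('/predict', methods=['POST'])
def predict():
    payload = request.get_json()
    features = payload['features']            # e.g. [[5.1, 3.5, 1.4, 0.2]]
    prediction = model.predict(features)
    return {'prediction': prediction.tolist()}

if __name__ == '__main__':
    app.run()
```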

Pros
- Easy to implement
- Useful to quickly deploy POCs
Cons
- Coupled API and ML model (we cannot scale only one of the two)
- What happens with heavy payloads (like images)?
#### **Microservice**
In this architecture, we adopt a microservice approach by separating the components into different containers. The web server is developed using Flask, the message queue is implemented with Redis, and the ML model server runs in its own container.

By distributing the components into separate containers, we gain several advantages over a single-container approach:
- **Scalability**: Each component can be scaled independently based on its specific requirements. For example, if the web server experiences high traffic, we can horizontally scale it by adding more instances without impacting the ML model server. Similarly, if the ML model server needs additional computational resources to handle increased prediction requests, we can scale it separately.
- **Maintainability**: Separating the components allows for individual updates and modifications without affecting others. For instance, we can update the ML model server to use a newer ML framework version without impacting the web server or the message queue. This modular approach simplifies maintenance and reduces the risk of unintended consequences.
- **Flexibility**: With separate containers, we have the flexibility to choose the most suitable technologies for each component. For example, we can use Flask for the web server, Redis for the message queue, and select the ML framework or server that best suits our ML model requirements. This flexibility enables us to leverage the strengths of different tools and technologies for each specific task.
- **Fault isolation**: By separating components, we achieve fault isolation. If one container encounters issues or crashes, it doesn't affect the other containers. For example, if the ML model server experiences errors, the web server and message queue can continue functioning independently. This isolation ensures that failures are contained and do not disrupt the entire system.
- **Scalability of development**: The separation of components into different containers promotes a modular development approach. Developers can work on individual services independently, leading to faster development cycles and better collaboration. It also allows specialized teams to focus on specific areas, such as frontend development, ML model development, or infrastructure management.
Pros
- Now we can scale each component
- We can update one of the components without affecting others
- Allow us to easily add new components to our initial pipeline
Cons
- A bad configuration may end up causing worse performance
- Harder to debug; we need robust test suites (unit, integration, and stress/load)
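As a rough sketch of how the web server and the ML model server could talk through Redis used as a work queue (queue and key names are hypothetical, and Redis is assumed to be reachable at localhost:6379):
```
import json
import uuid
import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def enqueue_prediction(features):
    """Called inside the Flask web server: push a job onto the queue."""
    job_id = str(uuid.uuid4())
    r.lpush('prediction_jobs', json.dumps({'id': job_id, 'features': features}))
    return job_id

def worker_loop(model):
    """Runs inside the ML model container: consume jobs and store results."""
    while True:
        _, raw = r.brpop('prediction_jobs')   # blocks until a job is available
        job = json.loads(raw)
        prediction = model.predict([job['features']])
        r.set(f"result:{job['id']}", json.dumps(prediction.tolist()))
```
The web server can then poll the corresponding result key to return the prediction to the client.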
#### **Redis localhost on a docker container**
For local development, Redis can run in its own container with the default port mapped to the host, e.g. `docker run -d -p 6379:6379 redis`; the web server and the model worker can then reach it at localhost:6379.
## [OverTheWire](http://overthewire.org)
OverTheWire is an online project that provides a series of challenges and cybersecurity games aimed at helping people learn and practice skills in the field of cybersecurity. The challenges are designed to teach concepts such as cryptography, forensic analysis, vulnerability exploitation, and more.
The project offers various games like "Bandit," "Natas," "Leviathan," and "Narnia," each with its own set of challenges and increasing levels of difficulty. Participants have to solve the challenges using ethical hacking techniques and creative thinking to progress to the next level.
OverTheWire is an excellent platform for cybersecurity enthusiasts and aspiring professionals to enhance their skills and gain hands-on experience in realistic security situations. Remember, it is always important to act ethically and legally when using these skills and to respect the terms and conditions set by OverTheWire.
## [Cookiecutter](https://drivendata.github.io/cookiecutter-data-science/)
Cookiecutter Data Science is a project template that helps data scientists set up their projects effectively. It provides a standardized structure and workflow, ensuring reproducibility and organization. By using this template, data scientists can quickly create a project with essential directories and files. It includes folders for data, notebooks, scripts, and documentation. Additionally, it offers pre-configured files like README templates, Dockerfiles, and environment configurations. This template saves time on project setup, allowing data scientists to focus on analysis and modeling. It promotes consistency and collaboration among team members. Cookiecutter Data Science is customizable, adapting to specific needs and preferences. Overall, it simplifies project management, making data science projects more organized, reproducible, and efficient.
### Directory structure
```
├── LICENSE
├── Makefile           <- Makefile with commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.py           <- Make this project pip installable with `pip install -e`
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   └── make_dataset.py
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └── build_features.py
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   ├── predict_model.py
│   │   └── train_model.py
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └── visualize.py
│
└── tox.ini            <- tox file with settings for running tox; see tox.readthedocs.io
```