# Cloud architecture and workflow

Moving forward to production:

- **Continuous integration**
  Build and test applications using CI pipelines. Every change made to any code repository is automatically built and tested, which increases product quality.
- **Continuous deployment**
  Automatically deploy application changes to a specific cloud environment, increasing the agility of the team. Multiple environments: dev, pre, prod.
- **High availability**
  Ensure applications run with zero downtime. With every new application deployment, the system takes care of gracefully shutting down previous deployments and uses a load-balancing strategy to redirect traffic to the new instances. The system should automatically react to application failures by creating new replicas to replace the failed ones.
- **High scalability**
  To ensure high availability and react to high traffic demand, the system will be able to acquire more computing resources and scale applications up or down by adding or removing replicas.
- **Observability: metrics and alerting**
  Every application should expose metrics (e.g. compute resources, latencies, number of requests, healthy streams). All these metrics can be monitored as time series to check system and application behaviour, and to alert if any error or failure occurs.
- **Security**
  Put services in a VPC to prevent access from outside. Apply the principle of least privilege so that access to resources is limited to specific roles with specific permissions.

## CI/CD

### CI/CD services

There are plenty of options and services out there; these are some of the most popular and the ones we have the most experience with.

- **GitHub Actions**
  GitHub's CI service, already integrated with existing GitHub repos. Supports Linux, macOS and Windows environments. You can run the pipelines on self-hosted runners using your own compute provider. The pricing model is based on per-minute rates; with self-hosted runners, no per-minute rates apply.
- **Jenkins**
  Open-source CI tool written in Java. The UI is quite old-fashioned, but the CI system is very powerful and customizable. The main drawback of Jenkins is that you have to self-host it and take care of its maintenance.
- **CircleCI**
  One of the most widely used CI services in the industry. Very well integrated with GitHub. Because it is widely used, it has good support and community resources. Supports macOS images. Pricing is based on the number of users.

## Docker containers

Quoted from the Docker website:

> A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings.

Containers are like virtual machines, with similar resource isolation and allocation benefits, but they virtualize the OS instead of the hardware. Containers are more portable and efficient than virtual machines.

Running applications in Docker containers will allow us to isolate them, making them more portable and decoupled from the hardware and from other apps.

## Container orchestration

Container orchestration tools help us manage, scale and automate the deployment of containerized applications. Container orchestration is the key to high availability and high scalability.

- **Kubernetes (k8s)**
  Open-source project and the de facto standard container orchestration tool. Big ecosystem, widely used, very powerful and customizable. It is provider agnostic, so it can run on Google Cloud, AWS, Azure, etc.
- **ECS**
  AWS product that tries to simplify container orchestration. It is a simpler solution than k8s, although less customizable.

### Kubernetes services

Cloud providers offer managed Kubernetes services. These services include automatic cluster updates and maintenance of the master node where the Kubernetes API server runs.

- EKS (Elastic Kubernetes Service)
  An AWS product. It has a fixed cost of 72$/month for cluster management alone, plus the cost of the compute resources (EC2 instances).
  Provided with an SLA.
- GKE (Google Kubernetes Engine)
  There is no management cost for the first cluster; you only pay for computing resources. More robust, and better integrated into its provider's ecosystem (GCP) than AKS or EKS. Doesn't come with an SLA.

## Monitoring and alerting

- **Prometheus**
  An open-source monitoring system with a flexible and powerful query language, PromQL. Widely used and very powerful. It comes with an alerting system and is very well integrated with Kubernetes environments.
- **Grafana**
  Tool for visualizing the metrics collected by Prometheus. Very customizable; it comes with many kinds of graphs and other useful features.
- **Pingdom**
  Service for health checks.
- **Sentry**
  Service to store and monitor application errors.

## Notification services

Services that can be used to route Prometheus alerts to the configured notification channels (Slack, email, calls, etc.).

- Opsgenie
- PagerDuty

# Proposal

- **Kubernetes (EKS)**
  Set up a k8s cluster for container orchestration using the AWS EKS service. It is more customizable than ECS, and it will be much easier to find people with Kubernetes expertise than with ECS expertise.

  We could deploy all applications on the cluster: backend, Python lambdas as jobs, dashboard apps, static websites, etc. We could use the same infrastructure to deploy machine learning algorithms, prediction engines or new applications.

  As worker nodes we can set up some EC2 spot instances (t3) and some on-demand EC2 instances (t3). The final configuration depends on the final workload of the cluster.

  Initial pricing estimation:

  - 3 EC2 **t3.medium spot** nodes running 24/7: 30$/month
  - 2 EC2 **t3.medium** on-demand nodes running 24/7: 60$/month
  - EKS cluster management: 72$/month
  - Total (approx.): *162$/month*
- **VPC**
  The cluster will run in a private VPC, along with the other AWS services. We still have to decide which VPC model to use, but it has to block incoming traffic from the internet and allow service-to-service communication inside the VPC.
- **CI/CD**
  GitHub Actions will be the easiest integration, as the application code is already hosted on GitHub. We can self-host GitHub Actions runners in the Kubernetes cluster, so no extra cost would be added. Every app should be deployed to the k8s cluster through a git push to its GitHub repository. GitHub supports macOS runners for building iOS apps.
- **Monitoring**
  Set up Prometheus and Grafana on the k8s cluster to collect metrics from the EC2 nodes and from every application deployed on the cluster.
- **Pingdom**
  Set up Pingdom as the health check service. It is important to have at least one alerting system outside the k8s cluster, so that we are still alerted if something happens to the cluster itself. Cost: 13$/month for 10 health checks.
- **Opsgenie**
  Free plan for up to 5 users.

#### CI/CD Workflow

![](https://i.imgur.com/medamuj.png)

#### Cloud architecture

![](https://i.imgur.com/LWLA3r5.png)
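
#### Pricing estimate sketch

The monthly figures quoted in the proposal can be tallied with a short script, which makes it easier to revisit the estimate when the node count or services change. This is only a minimal sketch: the names `LineItem`, `CLUSTER_ITEMS`, `EXTERNAL_ITEMS`, `monthly_total` and `report` are hypothetical, the prices are the proposal's own estimates rather than live provider quotes, and adding the Pingdom fee on top of the cluster total is an assumption about how the costs would be aggregated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:
    """One recurring cost; the amount is an estimate taken from the proposal."""
    name: str
    monthly_usd: int


# Cluster costs, copied from the proposal's initial pricing estimation.
CLUSTER_ITEMS = [
    LineItem("3 x t3.medium spot nodes (24/7)", 30),
    LineItem("2 x t3.medium non-spot nodes (24/7)", 60),
    LineItem("EKS cluster management", 72),
]

# Services outside the cluster, also quoted in the proposal.
EXTERNAL_ITEMS = [
    LineItem("Pingdom (10 health checks)", 13),
    LineItem("Opsgenie (free plan, up to 5 users)", 0),
]


def monthly_total(items):
    """Sum the monthly cost of a list of line items."""
    return sum(item.monthly_usd for item in items)


def report(items):
    """Return a plain-text breakdown followed by the approximate total."""
    lines = [f"{item.name}: {item.monthly_usd}$/month" for item in items]
    lines.append(f"Total (approx.): {monthly_total(items)}$/month")
    return "\n".join(lines)


if __name__ == "__main__":
    print(report(CLUSTER_ITEMS))
    print()
    print(report(CLUSTER_ITEMS + EXTERNAL_ITEMS))
```

The cluster-only list reproduces the 162$/month total from the proposal. Keeping the external services in a separate list makes it clear which costs sit outside the k8s cluster.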