# Deploying a Google Kubernetes Engine cluster in Google Cloud Platform using Terraform and deploying an application

###### tags: `GKE` `GCP` `Terraform` `kubectl` `service` `Ingress` `LB`

## Introduction

As enterprise applications iterate and grow, a single release may involve several teams, such as the desktop, mobile, and mini-program teams. Releasing software becomes a high-risk, high-pressure exercise, coordination between development iterations gets harder, and testing costs become difficult to control. DevOps practices address this: CI/CD and deployment automation make deployments repeatable and reduce the chance of deployment errors.

Google Kubernetes Engine (GKE) is a managed Kubernetes service, which means that Google Cloud Platform (GCP) is fully responsible for managing the cluster's control plane. In particular, GCP:

- Manages the Kubernetes API servers and the etcd database.
- Runs the Kubernetes control plane across single or multiple availability zones.
- Scales the control plane as you add more nodes to your cluster.
- Provides a mechanism to upgrade your control plane and nodes to a newer version.
- Rotates certificates and keys.

There are three popular options to run and deploy a GKE cluster:

- Create a cluster from the GCP web interface.
- Use the `gcloud` command-line utility.
- Define the cluster with an Infrastructure as Code (IaC) tool such as Terraform.

(Copied from [learnk8s.io](https://learnk8s.io/terraform-gke))

**HashiCorp Terraform** is an open-source tool for building, changing, and versioning infrastructure safely and efficiently. From upper-layer software configuration down to network and system configuration, Terraform can be used for unified management. It helps perform several tasks, such as:

+ Create, manage, and update infrastructure resources such as physical machines, VMs, network switches, containers, and more.
+ Deploy an application to the Heroku platform.
+ Dynamically schedule and request resources.
+ Deploy across multiple clouds, etc.

This project focuses on provisioning a cluster in GKE using Terraform and deploying an application.

## Methodology

### Test Environment

![](https://i.imgur.com/8E9gQ7c.png)

### Prerequisites

Before starting we need to have the following in place:

+ A Google Cloud Platform (GCP) account, logged in
+ A project in the Google Cloud Console

Once the project is created, enable the Kubernetes Engine API as shown below:

![](https://i.imgur.com/WQoojdL.png)

Then, install the following:

- Terraform
- gcloud CLI
- kubectl

## Provisioning a single cluster with 3 nodes using Terraform and deploying a Docker Nginx application

## 1. Creating GCP project credentials

We need a file with the credentials that Terraform uses to interact with the Google Cloud API to create the cluster and related networking components.

![](https://i.imgur.com/l4MyTH0.png)

Then:

+ Create a service account key
+ Select JSON as the key type
+ Download and save the key locally (the same steps can be done from the command line, as sketched below).
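The console steps above can also be done with the gcloud CLI. The sketch below is only an illustration: the service account name `terraform-gke`, the key file name `key.json`, and the broad `roles/editor` role are assumptions, not values taken from the original setup.

```
# Create a service account for Terraform (name is illustrative)
gcloud iam service-accounts create terraform-gke --display-name="Terraform GKE"

# Grant it permissions on the project (roles/editor is broad; narrower roles are preferable)
gcloud projects add-iam-policy-binding calm-seeker \
  --member="serviceAccount:terraform-gke@calm-seeker.iam.gserviceaccount.com" \
  --role="roles/editor"

# Create and download a JSON key, referenced later by the provider's credentials argument
gcloud iam service-accounts keys create key.json \
  --iam-account="terraform-gke@calm-seeker.iam.gserviceaccount.com"
```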
## 2. Provisioning the cluster with Terraform

The following folder contains the files needed to provision the cluster and deploy the containerized app:

```
.
├── main.tf
├── nginx-app
│   ├── deployment.yaml
│   ├── ingress.yaml
│   └── loadbalancer.yaml
├── outputs.tf
├── provider.tf
└── variables.tf
```

### Understanding Terraform code

Terraform code is kept in a directory of configuration files that describe the GCP resources to be created. Terraform reads every `*.tf` and `*.tfvars` file in the working directory, so there is no need to write everything in a single file; here the configuration is split into files by responsibility.

#### Breakdown: provider.tf

```
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">=4.6.0"
    }
  }
}

provider "google" {
  credentials = file("")
  project     = "calm-seeker"
}
```

> The provider.tf file indicates which cloud provider will be used. A provider is a plugin that Terraform uses to interact with the cloud provider's API and services.

#### Breakdown: variables.tf

```
variable "project_id" {
  description = "The project ID to host the cluster in"
  default     = "calm-seeker"
}

variable "cluster_name" {
  description = "The name for the GKE cluster"
  default     = "rp1-gke-cluster"
}

variable "env_name" {
  description = "The environment for the GKE cluster"
  default     = "prod"
}

variable "region" {
  description = "The region to host the cluster in"
  default     = "europe-west1"
}

variable "network" {
  description = "The VPC network created to host the cluster in"
  default     = "gke-network"
}

variable "subnetwork" {
  description = "The subnetwork created to host the cluster in"
  default     = "gke-subnet"
}

variable "ip_range_pods_name" {
  description = "The secondary ip range to use for pods"
  default     = "ip-range-pods"
}

variable "ip_range_services_name" {
  description = "The secondary ip range to use for services"
  default     = "ip-range-services"
}
```

The variables file declares the parameters used by the resources, such as the cluster region, the network details, the project ID, and so on, which Terraform substitutes into the rest of the configuration.

#### Breakdown: outputs.tf

```
output "cluster_name" {
  description = "The GKE cluster name"
  value       = module.gke.name
}

output "host" {
  value     = module.gke.endpoint
  sensitive = true
}
```

The outputs.tf file defines the cluster data we want to inspect after the infrastructure has been provisioned.

#### Breakdown: main.tf

```
module "gke_auth" {
  source       = "terraform-google-modules/kubernetes-engine/google//modules/auth"
  depends_on   = [module.gke]
  project_id   = var.project_id
  location     = module.gke.location
  cluster_name = module.gke.name
}

resource "local_file" "kubeconfig" {
  content  = module.gke_auth.kubeconfig_raw
  filename = "kubeconfig-${var.env_name}"
}

module "gcp-network" {
  source       = "terraform-google-modules/network/google"
  version      = "4.0.1"
  project_id   = var.project_id
  network_name = "${var.network}-${var.env_name}"

  subnets = [
    {
      subnet_name   = "${var.subnetwork}-${var.env_name}"
      subnet_ip     = "10.10.0.0/16"
      subnet_region = var.region
    },
  ]

  secondary_ranges = {
    "${var.subnetwork}-${var.env_name}" = [
      {
        range_name    = var.ip_range_pods_name
        ip_cidr_range = "10.20.0.0/16"
      },
      {
        range_name    = var.ip_range_services_name
        ip_cidr_range = "10.30.0.0/16"
      },
    ]
  }
}

module "gke" {
  source             = "terraform-google-modules/kubernetes-engine/google//modules/private-cluster"
  project_id         = var.project_id
  name               = "${var.cluster_name}-${var.env_name}"
  regional           = true
  region             = var.region
  network            = module.gcp-network.network_name
  subnetwork         = module.gcp-network.subnets_names[0]
  initial_node_count = 3
  ip_range_pods      = var.ip_range_pods_name
  ip_range_services  = var.ip_range_services_name

  node_pools = [
    {
      name           = "node-pool1"
      machine_type   = "e2-medium"
      node_locations = "europe-west1-b"
      node_count     = 3
      min_count      = 1
      max_count      = 6
      disk_size_gb   = 30
    },
  ]
}
```

The main.tf file declares which resources to create: the VPC network and its subnets, the GKE cluster, and the local kubeconfig file used to reach it. The GKE module carries all the specifications for the cluster and its node pool, such as the name, the machine type, the region, and the node counts.
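One optional note before moving on: all of the variables above ship with defaults, so Terraform needs no extra input. If you want to target another project or environment without editing variables.tf, defaults can be overridden on the command line or through environment variables; the values below are purely illustrative.

```
# Override selected variable defaults at plan/apply time (values are illustrative)
terraform apply -var="project_id=my-gcp-project" -var="env_name=prod1"

# Alternatively, export variables with the TF_VAR_ prefix before running Terraform
export TF_VAR_region="europe-west1"
export TF_VAR_env_name="prod1"
```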
That's all — the Terraform files are ready to go.

### Initializing Terraform

The next step is to initialize Terraform by running `terraform init`. Terraform generates a directory named `.terraform` and downloads each module source declared in main.tf; the initialization also pulls in the google provider.

```
st6@pacman:~/Documents/rp1-prod$ terraform init
Initializing modules...
Downloading registry.terraform.io/terraform-google-modules/network/google 4.0.1 for gcp-network...
- gcp-network in .terraform/modules/gcp-network
- gcp-network.firewall_rules in .terraform/modules/gcp-network/modules/firewall-rules
- gcp-network.routes in .terraform/modules/gcp-network/modules/routes
- gcp-network.subnets in .terraform/modules/gcp-network/modules/subnets
- gcp-network.vpc in .terraform/modules/gcp-network/modules/vpc
Downloading registry.terraform.io/terraform-google-modules/kubernetes-engine/google 18.0.0 for gke...
- gke in .terraform/modules/gke/modules/private-cluster
Downloading registry.terraform.io/terraform-google-modules/gcloud/google 2.1.0 for gke.gcloud_delete_default_kube_dns_configmap...
- gke.gcloud_delete_default_kube_dns_configmap in .terraform/modules/gke.gcloud_delete_default_kube_dns_configmap/modules/kubectl-wrapper
- gke.gcloud_delete_default_kube_dns_configmap.gcloud_kubectl in .terraform/modules/gke.gcloud_delete_default_kube_dns_configmap
Downloading registry.terraform.io/terraform-google-modules/kubernetes-engine/google 18.0.0 for gke_auth...
- gke_auth in .terraform/modules/gke_auth/modules/auth

Initializing the backend...

Initializing provider plugins...
- Finding latest version of hashicorp/local...
[... . . ...]
- Installing hashicorp/google v3.90.1...
- Installed hashicorp/google v3.90.1 (signed by HashiCorp)

Terraform has created a lock file .terraform.lock.hcl to record the provider selections it made above. Include this file in your version control repository so that Terraform can guarantee to make the same selections by default when you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see any changes that are required for your infrastructure. All Terraform commands should now work.

If you ever set or change modules or backend configuration for Terraform, rerun this command to reinitialize your working directory. If you forget, other commands will detect it and remind you to do so if necessary.
```
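Before previewing the plan, two optional sanity checks can be run. These were not part of the original session; they only confirm that the configuration and its dependencies look right.

```
# Check that the configuration is syntactically valid and internally consistent
terraform validate

# List the providers required by the configuration and its modules
terraform providers
```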
### Previewing with Terraform

Running `terraform apply` first prints the execution plan and waits for confirmation, so it doubles as a preview (`terraform plan` shows the same information without applying anything):

```
st6@pacman:~/Documents/rp1-prod$ terraform apply

Terraform used the selected providers to generate the following execution plan.
Resource actions are indicated with the following symbols:
  + create
 <= read (data resources)

Terraform will perform the following actions:

  # local_file.kubeconfig will be created
  + resource "local_file" "kubeconfig" {
      + content              = (sensitive)
      + directory_permission = "0777"
      + file_permission      = "0777"
      + filename             = "kubeconfig-prod1"
      + id                   = (known after apply)
    }

[....... .......... .........]

  # module.gcp-network.module.vpc.google_compute_network.network will be created
  + resource "google_compute_network" "network" {
      + auto_create_subnetworks         = false
      + delete_default_routes_on_create = false
      + gateway_ipv4                    = (known after apply)
      + id                              = (known after apply)
      + mtu                             = 0
      + name                            = "gke-network-prod1"
      + project                         = "calm-seeker-337410"
      + routing_mode                    = "GLOBAL"
      + self_link                       = (known after apply)
    }

  # module.gke.module.gcloud_delete_default_kube_dns_configmap.module.gcloud_kubectl.null_resource.module_depends_on[0] will be created
  + resource "null_resource" "module_depends_on" {
      + id       = (known after apply)
      + triggers = {
          + "value" = "2"
        }
    }

Plan: 13 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + cluster_name = "rp1-gke-cluster-prod1"
```

### Applying Configuration

```
st6@pacman:~/Documents/rp1-prod$ terraform apply

Terraform used the selected providers to generate the following execution plan.
Resource actions are indicated with the following symbols:
  + create
 <= read (data resources)

[......]

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

module.gke.random_string.cluster_service_account_suffix: Creating...
module.gke.random_string.cluster_service_account_suffix: Creation complete after 0s [id=tdl7]
module.gke.google_service_account.cluster_service_account[0]: Creating...
module.gcp-network.module.vpc.google_compute_network.network: Creating...
module.gke.random_shuffle.available_zones: Creating...
module.gke.random_shuffle.available_zones: Creation complete after 0s [id=-]
module.gke.google_service_account.cluster_service_account[0]: Creation complete after 2s [id=projects/calm-seeker-337410/serviceAccounts/tf-gke-rp1-gke-cluster-tdl7@calm-seeker-337410.iam.gserviceaccount.com]
module.gke.google_project_iam_member.cluster_service_account-log_writer[0]: Creating...
module.gcp-network.module.vpc.google_compute_network.network: Still creating... [10s elapsed]
[...... ...... ....]
module.gke.module.gcloud_delete_default_kube_dns_configmap.module.gcloud_kubectl.null_resource.module_depends_on[0]: Creation complete after 0s [id=8752609360790339243]
module.gke_auth.data.google_client_config.provider: Reading...
module.gke_auth.data.google_container_cluster.gke_cluster: Reading...
module.gke_auth.data.google_client_config.provider: Read complete after 0s [id=projects/calm-seeker-337410/regions//zones/]
module.gke_auth.data.google_container_cluster.gke_cluster: Read complete after 3s [id=projects/calm-seeker-337410/locations/europe-west1/clusters/rp1-gke-cluster-prod1]
module.gke_auth.data.template_file.kubeconfig: Reading...
module.gke_auth.data.template_file.kubeconfig: Read complete after 0s [id=db3be5ba8b47a86386594eb621d87925c567b8725cb07ed1dcc18e4e588a6611]
local_file.kubeconfig: Creating...
local_file.kubeconfig: Creation complete after 0s [id=5bee35713af85c82599b953dd5bd85a6b0585b3a]

Apply complete! Resources: 13 added, 0 changed, 0 destroyed.

Outputs:

cluster_name = "rp1-gke-cluster-prod1"
```

Then check on the GKE dashboard:

![](https://i.imgur.com/zrd5msU.png)

## 3. Deploying the application and services

### Build and push the app on Docker Hub

#### Dockerfile

```
FROM nginx:latest
COPY ./index.html /usr/share/nginx/html/index.html
COPY ./images /usr/share/nginx/html/images
COPY ./assets /usr/share/nginx/html/assets
RUN chmod 777 /usr/share/nginx/html/
```
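The build itself is not shown in the original session; a minimal sketch of building and pushing the image is given below. The `themuntu/rp1-app:latest` tag is taken from the deployment manifest in the next section, and the commands assume you are already logged in to Docker Hub.

```
# Build the image from the Dockerfile above and push it to Docker Hub
docker build -t themuntu/rp1-app:latest .
docker push themuntu/rp1-app:latest
```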
### Deploy into the cluster

#### Breakdown: deployment.yaml

Here we specify the apiVersion and that we want a Deployment:

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rp1-app
spec:
  selector:
    matchLabels:
      name: rp1-app
  template:
    metadata:
      labels:
        name: rp1-app
    spec:
      containers:
      - name: appclear
        image: themuntu/rp1-app:latest
        ports:
        - containerPort: 80
```

The Deployment fields are described as follows:

- The `.metadata.name` field sets the Deployment name, `rp1-app`.
- The `.spec.selector` field defines how the Deployment finds the Pods it manages.
- The `template` field contains the following sub-fields:
  + The Pods are labeled `name: rp1-app` using the `.metadata.labels` field.
  + The `.template.spec` field indicates that each Pod runs one container, nginx, based on the `themuntu/rp1-app:latest` image.

Now we connect to the cluster and apply deployment.yaml:

```
st6@pacman:~/Documents/rp1-prod$ gcloud container clusters get-credentials rp1-gke-cluster-prod1 --region europe-west1 --project calm-seeker
Fetching cluster endpoint and auth data.
kubeconfig entry generated for rp1-gke-cluster-prod1.
st6@pacman:~/Documents/rp1-prod$ cd nginx-app/
st6@pacman:~/Documents/rp1-prod/nginx-app$ kubectl apply -f deployment.yaml
deployment.apps/rp1-app created
```

### Configure port-forwarding for local testing

Here we retrieve the Pod name and forward a local port to the container port:

```
st6@pacman:~/Documents/rp1-prod/nginx-app$ kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
rp1-app-74674bd47-txhch   1/1     Running   0          9m
st6@pacman:~/Documents/rp1-prod/nginx-app$ kubectl port-forward rp1-app-74674bd47-txhch 8080:80
Forwarding from 127.0.0.1:8080 -> 80
Forwarding from [::1]:8080 -> 80
Handling connection for 8080
Handling connection for 8080
```

And check:

![](https://i.imgur.com/gDL3j8G.jpg)

Since port-forwarding is only for local testing, we next configure a LoadBalancer service as a permanent way to route traffic to the Pods.

### Configure Load Balancer service to expose the pods

#### Breakdown: loadbalancer.yaml

```
apiVersion: v1
kind: Service
metadata:
  name: rp1-app
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 80
  selector:
    name: rp1-app
```

The Service fields are described as follows:

- The `.metadata.name` field sets the Service name.
- The `.spec.type` field defines which kind of Service we want.
- The `.spec.selector` field defines which Pods the Service routes traffic to.
- The `.spec.ports.port` field indicates which port the Service exposes.
- The `.spec.ports.targetPort` field indicates which container port it maps to.

We then create the Service and test:

```
st6@pacman:~/Documents/rp1-prod/nginx-app$ kubectl apply -f loadbalancer.yaml
service/rp1-app created
st6@pacman:~/Documents/rp1-prod/nginx-app$ kubectl get services
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP      10.30.0.1      <none>        443/TCP        40h
rp1-app      LoadBalancer   10.30.42.133   <pending>     80:30877/TCP   17s
st6@pacman:~/Documents/rp1-prod/nginx-app$ kubectl get services
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)        AGE
kubernetes   ClusterIP      10.30.0.1      <none>         443/TCP        40h
rp1-app      LoadBalancer   10.30.42.133   34.79.198.64   80:30877/TCP   3m33s
st6@pacman:~/Documents/rp1-prod/nginx-app$
```

The capture below shows that the load balancer is serving the application correctly.

![](https://i.imgur.com/45qfc9Z.jpg)
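Besides the browser check in the capture above, the external IP from the `kubectl get services` output can also be probed from the terminal. A quick sketch; the IP is the one assigned in this particular run and will differ in yours.

```
# Send a HEAD request to the LoadBalancer's external IP; a 200 response from nginx
# confirms the Service is routing traffic to the Pods
curl -I http://34.79.198.64
```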
Using a load balancer to expose the service works, but it is not the ultimate solution: if we add more services to the application, we also need to set up the same number of load balancers to serve them. Kubernetes provides a resource to solve this issue: the Ingress.

### Configure Ingress to expose the cluster

#### Breakdown: ingress.yaml

In Google Cloud Platform, when a GKE cluster is deployed the Ingress controller is deployed automatically, so we only have to write the Ingress manifest and define the rules and paths used to reach the application.

```
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rp1-app
  annotations:
    cloud.google.com/load-balancer-type: "External"
    kubernetes.io/ingress.class: "gce"
spec:
  rules:
  - http:
      paths:
      - path: /*
        pathType: ImplementationSpecific
        backend:
          service:
            name: rp1-app
            port:
              number: 80
```

- The Ingress defined here routes all traffic matching `/*` to the targeted Pods.
- `kubernetes.io/ingress.class: "gce"` selects the right Ingress controller in the cluster.
- `cloud.google.com/load-balancer-type: "External"` specifies a public-facing load balancer.

```
st6@pacman:~/Documents/rp1-prod/nginx-app$ kubectl apply -f ingress.yaml
ingress.networking.k8s.io/rp1-app created
st6@pacman:~/Documents/rp1-prod/nginx-app$ kubectl get services
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)        AGE
kubernetes   ClusterIP      10.30.0.1      <none>         443/TCP        42h
rp1-app      LoadBalancer   10.30.42.133   34.79.198.64   80:30877/TCP   113m
st6@pacman:~/Documents/rp1-prod/nginx-app$ kubectl get ingress
NAME      CLASS    HOSTS   ADDRESS         PORTS   AGE
rp1-app   <none>   *       34.149.224.65   80      6m28s
```

![](https://i.imgur.com/571oSJD.jpg)

We can then get an overview of the workloads and deployed services of our cluster in the GCP console:

![](https://i.imgur.com/KxAj8mp.png)

## Conclusion

As IT infrastructure moves to the cloud, manual operation and maintenance become unsustainable: environments are hard to manage and hard to keep consistent across DevOps workflows. Terraform is built on the concept of Infrastructure as Code (IaC): infrastructure is defined in templates, and the whole deployment process is standardized and automated, with support for change sets, drift detection, and more. In this project it let us provision our Google Kubernetes Engine cluster easily, so we could focus on deploying the application and its services rather than on building the entire environment.

_____

#### Resources:

Mostly taken from:

- [Terraform Kubernetes Engine Module](https://registry.terraform.io/modules/terraform-google-modules/kubernetes-engine/google/latest)
- [Learnk8s.io](https://learnk8s.io/terraform-gke)
- [Learn IaC part 2](https://circleci.com/blog/learn-iac-part02/)
- [Kubernetes tutorial for beginners - GKE - Google Cloud](https://www.youtube.com/watch?v=jW_-KZCjsm0&list=WL&index=12)
- [Services and networking](https://kubernetes.io/docs/concepts/services-networking/)
- [Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/)
- [Kubernetes Deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/)
- [Configuration management tools](https://www.youtube.com/watch?v=OmRxKQHtDbY&list=WL&index=13)