---
SECTION: ARCHITECTURE, INSTALL AND MAINTENANCE
### A pod called elastic-app-cka02-arch is running in the default namespace. The YAML file for this pod is available at /root/elastic-app-cka02-arch.yaml on the student-node. The single application container in this pod writes logs to the file /var/log/elastic-app.log.
### One of our logging mechanisms needs to read these logs to send them to an upstream logging server, but we don't want to increase the read overhead for our main application container. Recreate this pod with an additional sidecar container that runs alongside the application container and prints to STDOUT by running the command tail -f /var/log/elastic-app.log. You can use the busybox image for this sidecar container.
### sidecar container running as expected?
### YAML file updated with the new container?
Answer
Recreate the pod with a new container called sidecar. Update the /root/elastic-app-cka02-arch.yaml YAML file as shown below:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: elastic-app-cka02-arch
spec:
  containers:
  - name: elastic-app
    image: busybox:1.28
    args:
    - /bin/sh
    - -c
    - >
      mkdir /var/log;
      i=0;
      while true;
      do
        echo "$(date) INFO $i" >> /var/log/elastic-app.log;
        i=$((i+1));
        sleep 1;
      done
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: sidecar
    image: busybox:1.28
    args: [/bin/sh, -c, 'tail -f /var/log/elastic-app.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    emptyDir: {}
```
Next, recreate the pod:
```
student-node ~ ➜ kubectl replace -f /root/elastic-app-cka02-arch.yaml --force --context cluster3
pod "elastic-app-cka02-arch" deleted
pod/elastic-app-cka02-arch replaced
```
---
### Run a pod called alpine-sleeper-cka15-arch using the alpine image in the default namespace that will sleep for 7200 seconds.
### alpine pod created?
Answer:
Create the pod definition, e.g. with vi alpine-sleeper-cka15-arch.yaml:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: alpine-sleeper-cka15-arch
spec:
  containers:
  - name: alpine
    image: alpine
    command: ["/bin/sh", "-c", "sleep 7200"]
```
Then apply it:
```
student-node ~ ➜ kubectl apply -f alpine-sleeper-cka15-arch.yaml --context cluster3
```
---
A pod called logger-complete-cka04-arch has been created in the default namespace. Inspect this pod and save ALL the logs to the file /root/logger-complete-cka04-arch on the student-node.
Task completed?
Answer:
Run the following command on the student-node:
```
student-node ~ ➜ kubectl logs logger-complete-cka04-arch --context cluster3 > /root/logger-complete-cka04-arch
student-node ~ ➜ head /root/logger-complete-cka04-arch
INFO: Wed Oct 19 10:50:54 UTC 2022 Logger is running
CRITICAL: Wed Oct 19 10:50:54 UTC 2022 Logger encountered errors!
SUCCESS: Wed Oct 19 10:50:54 UTC 2022 Logger re-started!
INFO: Wed Oct 19 10:50:54 UTC 2022 Logger is running
CRITICAL: Wed Oct 19 10:50:54 UTC 2022 Logger encountered errors!
SUCCESS: Wed Oct 19 10:50:54 UTC 2022 Logger re-started!
INFO: Wed Oct 19 10:50:54 UTC 2022 Logger is running
student-node ~ ➜
```
---
An etcd backup is already stored at the path /opt/cluster1_backup_to_restore.db on the cluster1-controlplane node. Use /root/default.etcd as the --data-dir and restore it on the cluster1-controlplane node itself.
You can ssh to the controlplane node by running ssh root@cluster1-controlplane from the student-node.
etcd backup restored?
Answer:
SSH into the cluster1-controlplane node, install the etcd utility (if not already installed), and restore the backup:
```
student-node ~ ➜ ssh root@cluster1-controlplane
cluster1-controlplane ~ ➜ cd /tmp
cluster1-controlplane ~ ➜ export RELEASE=$(curl -s https://api.github.com/repos/etcd-io/etcd/releases/latest | grep tag_name | cut -d '"' -f 4)
cluster1-controlplane ~ ➜ wget https://github.com/etcd-io/etcd/releases/download/${RELEASE}/etcd-${RELEASE}-linux-amd64.tar.gz
cluster1-controlplane ~ ➜ tar xvf etcd-${RELEASE}-linux-amd64.tar.gz ; cd etcd-${RELEASE}-linux-amd64
cluster1-controlplane ~ ➜ mv etcd etcdctl /usr/local/bin/
cluster1-controlplane ~ ➜ etcdctl snapshot restore --data-dir /root/default.etcd /opt/cluster1_backup_to_restore.db
```
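Note: `etcdctl snapshot restore` only writes the new data directory; it does not make a running cluster use it. If the restored data were meant to be served (not strictly required by this task), the etcd static pod manifest would also need to point at the new directory. A sketch of the fields that would change in /etc/kubernetes/manifests/etcd.yaml, with paths assumed from this task:

```yaml
# Hypothetical fragment of /etc/kubernetes/manifests/etcd.yaml after a restore;
# only needed if etcd should actually serve the restored data.
spec:
  containers:
  - command:
    - etcd
    - --data-dir=/root/default.etcd      # was /var/lib/etcd
  volumes:
  - name: etcd-data
    hostPath:
      path: /root/default.etcd           # was /var/lib/etcd
      type: DirectoryOrCreate
```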
---
Find the node across all clusters that consumes the most CPU and store the result to the file /opt/high_cpu_node in the following format cluster_name,node_name.
The node could be in any of the clusters currently configured on the student-node.
NOTE: It's recommended to wait for a few minutes to allow deployed objects to become fully operational and start consuming resources.
data stored in /opt/high_cpu_node?
Answer:
Check the metrics for all nodes across all clusters:
```
student-node ~ ➜ kubectl top node --context cluster1 --no-headers | sort -nr -k2 | head -1
cluster1-controlplane 127m 1% 703Mi 1%
student-node ~ ➜ kubectl top node --context cluster2 --no-headers | sort -nr -k2 | head -1
cluster2-controlplane 126m 1% 675Mi 1%
student-node ~ ➜ kubectl top node --context cluster3 --no-headers | sort -nr -k2 | head -1
cluster3-controlplane 577m 7% 1081Mi 1%
student-node ~ ➜ kubectl top node --context cluster4 --no-headers | sort -nr -k2 | head -1
cluster4-controlplane 130m 1% 679Mi 1%
```
From this output, the node that uses the most CPU is cluster3-controlplane on cluster3.
Save the result in the required format to the file:
```
student-node ~ ➜ echo cluster3,cluster3-controlplane > /opt/high_cpu_node
```
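The sort pipeline used above can be sanity-checked locally with made-up data (the node names and numbers below are illustrative, not from the lab): `sort -n` parses the leading digits of values like `127m`, so a descending numeric sort on column 2 surfaces the busiest node.

```shell
# Fake `kubectl top node --no-headers` output, for illustration only.
top_output='cluster1-node01 84m 1% 500Mi 1%
cluster1-controlplane 127m 1% 703Mi 1%
cluster1-node02 96m 1% 610Mi 1%'

# Descending numeric sort on the CPU column (field 2); sort -n reads the
# numeric prefix of "127m", so the "m" suffix does not break the comparison.
busiest=$(printf '%s\n' "$top_output" | sort -nr -k2 | head -1 | awk '{print $1}')
echo "$busiest"   # cluster1-controlplane
```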
---
The demo-pod-cka29-trb pod is stuck in a Pending state. Look into the issue and fix it. Make sure the pod is in a Running state and stable.
Fixed the issues?
You can SSH in using the ssh cluster1-controlplane command. Pod is in running state?
Answer:
```
Look into the pod events:
kubectl get event --field-selector involvedObject.name=demo-pod-cka29-trb
You will see some warnings like:
Warning FailedScheduling pod/demo-pod-cka29-trb 0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
This seems to be related to PersistentVolumeClaims, so let's check those:
kubectl get pvc
You will notice that demo-pvc-cka29-trb is stuck in a Pending state. Let's dig into it:
kubectl get event --field-selector involvedObject.name=demo-pvc-cka29-trb
You will notice this error:
Warning VolumeMismatch persistentvolumeclaim/demo-pvc-cka29-trb Cannot bind to requested volume "demo-pv-cka29-trb": incompatible accessMode
This means the PVC is using an incompatible access mode, so let's check it out:
kubectl get pvc demo-pvc-cka29-trb -o yaml
kubectl get pv demo-pv-cka29-trb -o yaml
Let's re-create the PVC with the correct access mode, i.e. ReadWriteMany:
kubectl get pvc demo-pvc-cka29-trb -o yaml > /tmp/pvc.yaml
vi /tmp/pvc.yaml
Under spec:, change accessModes: from ReadWriteOnce to ReadWriteMany.
Delete the old PVC and create a new one:
kubectl delete pvc demo-pvc-cka29-trb
kubectl apply -f /tmp/pvc.yaml
Check the pod now:
kubectl get pod demo-pod-cka29-trb
It should be good now.
```
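After the edit, the relevant part of /tmp/pvc.yaml should look like the sketch below (all other fields from the dumped PVC stay untouched):

```yaml
spec:
  accessModes:
  - ReadWriteMany   # was ReadWriteOnce; must be compatible with the PV's access mode
```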
---
A pod called nginx-cka01-trb is running in the default namespace. There is a container called nginx-container running inside this pod that uses the image nginx:latest. There is another sidecar container called logs-container that runs in this pod.
For some reason, this pod is continuously crashing. Identify the issue and fix it. Make sure that the pod is in a running state and you are able to access the website using the curl http://kodekloud-exam.app:30001 command on the controlplane node of cluster1.
pod running?
website accessible?
```
Check the container logs:
kubectl logs -f nginx-cka01-trb -c nginx-container
You can see that it's not able to pull the image.
Edit the pod:
kubectl edit pod nginx-cka01-trb -o yaml
Change the image tag from nginx:latst to nginx:latest.
Let's check now if the pod is in a Running state:
kubectl get pod
You will notice that it's still crashing, so check the logs again:
kubectl logs -f nginx-cka01-trb -c nginx-container
From the logs you will notice that nginx-container looks good now, so it might be the sidecar container that is causing issues. Let's check its logs:
kubectl logs -f nginx-cka01-trb -c logs-container
You will see some logs as below:
cat: can't open '/var/log/httpd/access.log': No such file or directory
cat: can't open '/var/log/httpd/error.log': No such file or directory
Now, let's look into the sidecar container:
kubectl get pod nginx-cka01-trb -o yaml
Under containers:, check the command: section; this is the command that is failing. Notice that it is looking for the logs under the /var/log/httpd/ directory, but the mounted volume for logs is /var/log/nginx (under volumeMounts:). So we need to fix this path:
kubectl get pod nginx-cka01-trb -o yaml > /tmp/test.yaml
vi /tmp/test.yaml
Under command:, change /var/log/httpd/access.log and /var/log/httpd/error.log to /var/log/nginx/access.log and /var/log/nginx/error.log respectively.
Delete the existing pod:
kubectl delete pod nginx-cka01-trb
Create a new one from the template:
kubectl apply -f /tmp/test.yaml
Let's check now if the pod is in a Running state:
kubectl get pod
It should be good now. So let's try to access the app:
curl http://kodekloud-exam.app:30001
You will see an error:
curl: (7) Failed to connect to kodekloud-exam.app port 30001: Connection refused
So you are not able to access the website. Let's look into the service configuration.
Edit the service:
kubectl edit svc nginx-service-cka01-trb -o yaml
Change the app label under selector: from httpd-app-cka01-trb to nginx-app-cka01-trb.
You should be able to access the website now:
curl http://kodekloud-exam.app:30001
```
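For reference, the fixed sidecar section of /tmp/test.yaml would look roughly like the sketch below; the exact command wording in the lab may differ, the point is only that the paths match the /var/log/nginx mount:

```yaml
- name: logs-container
  # Hypothetical reconstruction of the sidecar command with corrected paths.
  command:
  - sh
  - -c
  - cat /var/log/nginx/access.log /var/log/nginx/error.log
```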
---
There is an existing persistent volume called orange-pv-cka13-trb. A persistent volume claim called orange-pvc-cka13-trb is created to claim storage from orange-pv-cka13-trb.
However, this PVC is stuck in a Pending state. As of now, there is no data in the volume.
Troubleshoot and fix this issue, making sure that orange-pvc-cka13-trb PVC is in Bound state.
Fixed the PVC?
```
List the PVCs to check the status:
kubectl get pvc
We can see the orange-pvc-cka13-trb PVC is in a Pending state and it's requesting 150Mi of storage. Let's look into the events:
kubectl get events --sort-by='.metadata.creationTimestamp' -A
You will see some errors as below:
Warning VolumeMismatch persistentvolumeclaim/orange-pvc-cka13-trb Cannot bind to requested volume "orange-pv-cka13-trb": requested PV is too small
Let's look into the orange-pv-cka13-trb volume:
kubectl get pv
We can see that the orange-pv-cka13-trb volume has a capacity of only 100Mi, which is too small for the requested 150Mi of storage.
Let's edit the orange-pvc-cka13-trb PVC to adjust the storage requested:
kubectl get pvc orange-pvc-cka13-trb -o yaml > /tmp/orange-pvc-cka13-trb.yaml
vi /tmp/orange-pvc-cka13-trb.yaml
Under resources: -> requests: -> storage:, change 150Mi to 100Mi and save.
Delete the old PVC and apply the change:
kubectl delete pvc orange-pvc-cka13-trb
kubectl apply -f /tmp/orange-pvc-cka13-trb.yaml
```
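After the edit, the storage request in /tmp/orange-pvc-cka13-trb.yaml should look like this (a sketch; everything else stays as dumped):

```yaml
spec:
  resources:
    requests:
      storage: 100Mi   # was 150Mi; a PVC cannot request more than the PV's 100Mi capacity
```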
---
The green-deployment-cka15-trb deployment is having some issues since the corresponding POD is crashing and restarting multiple times continuously.
Investigate the issue and fix it. Make sure the pod is in a Running state and stable (i.e. NO RESTARTS!).
POD is stable now?
```
List the pods to check the status:
kubectl get pod
It must have crashed already, so let's look into the logs:
kubectl logs -f green-deployment-cka15-trb-xxxx
You will see some logs like these:
2022-09-18 17:13:25 98 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2022-09-18 17:13:25 98 [Note] InnoDB: Memory barrier is not used
2022-09-18 17:13:25 98 [Note] InnoDB: Compressed tables use zlib 1.2.11
2022-09-18 17:13:25 98 [Note] InnoDB: Using Linux native AIO
2022-09-18 17:13:25 98 [Note] InnoDB: Using CPU crc32 instructions
2022-09-18 17:13:25 98 [Note] InnoDB: Initializing buffer pool, size = 128.0M
Killed
This might be a resources issue, especially memory, so let's recreate the pod to see if that helps:
kubectl delete pod green-deployment-cka15-trb-xxxx
Now watch the pod status closely:
kubectl get pod
Pretty soon you will see the pod status change to OOMKilled, which confirms it's a memory issue. So let's look into the resources assigned to this deployment:
kubectl get deploy
kubectl edit deploy green-deployment-cka15-trb
Under resources: -> limits:, change memory from 256Mi to 512Mi and save the changes.
Now watch the pod status closely again:
kubectl get pod
It should be stable now.
```
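The edited resources section of the deployment's pod template should end up looking like the sketch below (request values, if present, are left as they were):

```yaml
resources:
  limits:
    memory: 512Mi   # was 256Mi; the InnoDB buffer pool init was exceeding the old limit
```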
---
The pink-depl-cka14-trb Deployment was scaled to 2 replicas however, the current replicas is still 1.
Troubleshoot and fix this issue. Make sure the CURRENT count is equal to the DESIRED count.
You can SSH into the cluster4 using ssh cluster4-controlplane command.
CURRENT count is equal to the DESIRED count?
```
List the deployments to check the status:
kubectl get deployment
We can see the DESIRED count for pink-depl-cka14-trb is 2 but the CURRENT count is still 1.
As we know, the Kube Controller Manager is responsible for monitoring the status of replica sets/deployments and ensuring that the desired number of pods is available, so let's check whether it's running fine:
kubectl get pod -n kube-system
So kube-controller-manager-cluster4-controlplane is crashing. Let's check the events to figure out what's happening:
student-node ~ ✖ kubectl get event --field-selector involvedObject.name=kube-controller-manager-cluster4-controlplane -n kube-system
LAST SEEN TYPE REASON OBJECT MESSAGE
10m Warning NodeNotReady pod/kube-controller-manager-cluster4-controlplane Node is not ready
3m25s Normal Killing pod/kube-controller-manager-cluster4-controlplane Stopping container kube-controller-manager
2m18s Normal Pulled pod/kube-controller-manager-cluster4-controlplane Container image "k8s.gcr.io/kube-controller-manager:v1.24.0" already present on machine
2m18s Normal Created pod/kube-controller-manager-cluster4-controlplane Created container kube-controller-manager
2m18s Warning Failed pod/kube-controller-manager-cluster4-controlplane Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "kube-controller-manage": executable file not found in $PATH: unknown
108s Warning BackOff pod/kube-controller-manager-cluster4-controlplane Back-off restarting failed container
student-node ~ ➜
The Failed event shows that it is trying to run the kube-controller-manage command, but it is supposed to run the kube-controller-manager command. So let's look into the kube-controller-manager manifest, which is present at /etc/kubernetes/manifests/kube-controller-manager.yaml on the cluster4-controlplane node. SSH into cluster4-controlplane:
ssh cluster4-controlplane
vi /etc/kubernetes/manifests/kube-controller-manager.yaml
Under containers: -> - command:, change kube-controller-manage to kube-controller-manager. The kubelet will recreate the static pod when the manifest changes; you can also delete the pod to speed things up:
kubectl delete pod kube-controller-manager-cluster4-controlplane -n kube-system
Check the deployment again:
kubectl get deployment
The CURRENT count should now be equal to the DESIRED count for pink-depl-cka14-trb.
```
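For reference, the fixed part of /etc/kubernetes/manifests/kube-controller-manager.yaml looks like the sketch below; all other flags in the command list stay unchanged:

```yaml
spec:
  containers:
  - command:
    - kube-controller-manager   # was the misspelled "kube-controller-manage"
    # ...remaining flags unchanged...
```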
---
The cat-cka22-trb pod is stuck in a Pending state. Look into the issue and fix it. Make sure that the pod is in a Running state and stable (i.e. not restarting or crashing).
Note: Do not make any changes to the pod (no changes to the pod config, but you may destroy and re-create it).
Pod is running?
Pod is running on the desired node?
Pod wasn't modified?
```
Let's check the pod status:
kubectl get pod
You will see that the cat-cka22-trb pod is stuck in a Pending state. So let's look into the events:
kubectl --context cluster2 get event --field-selector involvedObject.name=cat-cka22-trb
You will see some logs as below:
Warning FailedScheduling pod/cat-cka22-trb 0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 2 node(s) didn't match Pod's node affinity/selector. preemption: 0/2 nodes are available: 3 Preemption is not helpful for scheduling.
So it seems this pod is using node affinity. Let's look into the pod to understand the node affinity it's using:
kubectl --context cluster2 get pod cat-cka22-trb -o yaml
Under affinity: you will see it's looking for key: node and values: cluster2-node01, so let's verify whether cluster2-node01 has this label applied:
kubectl --context cluster2 get node cluster2-node01 -o yaml
Look under labels: and you will not find any such label, so let's add it to the node:
kubectl --context cluster2 label node cluster2-node01 node=cluster2-node01
Check the node details again:
kubectl --context cluster2 get node cluster2-node01 -o yaml
The new label should be there. Let's see if the pod is scheduled on this node now:
kubectl --context cluster2 get pod
It is, but it must be crashing or restarting, so let's look into the pod logs:
kubectl --context cluster2 logs -f cat-cka22-trb
You will see logs as below:
The HOST variable seems incorrect, it must be set to kodekloud
Let's look into the pod env variables to see whether there is a HOST env variable:
kubectl --context cluster2 get pod cat-cka22-trb -o yaml
Under env: you will see this:
env:
- name: HOST
  valueFrom:
    secretKeyRef:
      key: hostname
      name: cat-cka22-trb
So we can see that the HOST variable is defined and its value is retrieved from a secret called "cat-cka22-trb". Let's look into this secret:
kubectl --context cluster2 get secret
kubectl --context cluster2 get secret cat-cka22-trb -o yaml
You will find a key/value pair under data:. Let's decode it to see its value:
echo "<the encoded value you see for hostname>" | base64 -d
The value is set to kodekloude, which is incorrect as it should be set to kodekloud. So let's update the secret:
echo "kodekloud" | base64
kubectl --context cluster2 edit secret cat-cka22-trb
Under data:, replace the hostname: value with the newly encoded string, e.g. change hostname: a29kZWtsb3VkZQ== to hostname: a29kZWtsb3VkCg== (values may vary).
The pod should be good now.
```
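The encoding step is easy to get wrong because `echo` appends a trailing newline, which changes the base64 value. A quick local check of the mechanics, using the kodekloud string from this task:

```shell
# `echo` adds "\n", so the encoded value carries one extra byte; `echo -n` does not.
with_newline=$(echo "kodekloud" | base64)
without_newline=$(echo -n "kodekloud" | base64)

echo "$with_newline"     # a29kZWtsb3VkCg==  (encodes "kodekloud\n")
echo "$without_newline"  # a29kZWtsb3Vk      (encodes "kodekloud")

# Decoding either value recovers the original string.
echo "$with_newline" | base64 -d   # kodekloud
```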
---
In the dev-wl07 namespace, one of the developers has performed a rolling update and upgraded the application to a newer version. But somehow, application pods are not being created.
To get back to the working state, roll back the application to the previous version.
After rolling the deployment back, on the controlplane node, save the image currently in use to the /root/rolling-back-record.txt file and increase the replica count to 5.
You can SSH into the cluster1 using ssh cluster1-controlplane command.
rolling back successful?
image saved to the file?
Replica set to 5?
```
Run the command to change the context:
kubectl config use-context cluster1
Check the status of the pods:
kubectl get pods -n dev-wl07
One of the pods is in an error state. As a quick fix, we need to roll back to the previous revision:
kubectl rollout undo -n dev-wl07 deploy webapp-wl07
After successfully rolling back, inspect the updated image:
kubectl describe deploy -n dev-wl07 webapp-wl07 | grep -i image
On the controlplane node, save the image name to the given path /root/rolling-back-record.txt:
ssh cluster1-controlplane
echo "kodekloud/webapp-color" > /root/rolling-back-record.txt
And increase the replica count to 5 with the help of the kubectl scale command:
kubectl scale deploy -n dev-wl07 webapp-wl07 --replicas=5
Verify it by running: kubectl get deploy -n dev-wl07
```
---
A manifest file is available at /root/app-wl03/ on the student-node. There are some issues with the file; hence, the pod couldn't be deployed on the cluster3-controlplane node.
After fixing the issues, deploy the pod, and it should be in a running state.
NOTE: - Ensure that the existing limits are unchanged.
Pod app-wl03 is created with given limits?
Pod is running?
```
Set the correct context:
kubectl config use-context cluster3
Use the cd command to move to the given directory:
cd /root/app-wl03/
While creating the resource, you will see an error output as follows:
kubectl create -f app-wl03.yaml
The Pod "app-wl03" is invalid: spec.containers[0].resources.requests: Invalid value: "1Gi": must be less than or equal to memory limit
The spec.containers[0].resources.requests.memory value (1Gi) exceeds the configured memory limit.
As a fix, open the manifest file with a text editor such as vim or nano and set the request to 100Mi or less.
It should look like this:
resources:
  requests:
    memory: 100Mi
  limits:
    memory: 100Mi
Finally, create the resource with the kubectl create command:
kubectl create -f app-wl03.yaml
pod/app-wl03 created
```
---
One of our applications runs on the cluster3-controlplane node. Due to the possibility of a traffic increase, we want to scale the application pods to load-balance the traffic and provide a smoother user experience.
cluster3-controlplane node has enough resources to deploy more application pods. Scale the deployment called essports-wl02 to 5.
Deployment scaled to 5?
```
Set the correct context:
kubectl config use-context cluster3
Now, get the details of the nodes: -
kubectl get nodes -owide
then SSH to the given node by the following command: -
ssh cluster3-controlplane
And run the kubectl scale command as follows: -
kubectl scale deploy essports-wl02 --replicas=5
OR
You can run the kubectl scale command from the student node as well: -
kubectl scale deploy essports-wl02 --replicas=5
Verify the scaled-up pods by kubectl get command: -
kubectl get pods
The number of pods should be 1 to 5.
```
---
Create a storage class called orange-stc-cka07-str as per the properties given below:
- Provisioner should be kubernetes.io/no-provisioner.
- Volume binding mode should be WaitForFirstConsumer.
Next, create a persistent volume called orange-pv-cka07-str as per the properties given below:
- Capacity should be 150Mi.
- Access mode should be ReadWriteOnce.
- Reclaim policy should be Retain.
- It should use storage class orange-stc-cka07-str.
- Local path should be /opt/orange-data-cka07-str.
- Also add node affinity to bind this volume to the cluster1-controlplane node.
Finally, create a persistent volume claim called orange-pvc-cka07-str as per the properties given below:
- Access mode should be ReadWriteOnce.
- It should use storage class orange-stc-cka07-str.
- Storage request should be 128Mi.
- The volume should be orange-pv-cka07-str.
orange-stc-cka07-str storage class created correctly?
orange-pv-cka07-str PV created correctly?
orange-pvc-cka07-str PVC created correctly?
Create a YAML file as below:
```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: orange-stc-cka07-str
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: orange-pv-cka07-str
spec:
  capacity:
    storage: 150Mi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: orange-stc-cka07-str
  local:
    path: /opt/orange-data-cka07-str
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - cluster1-controlplane
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: orange-pvc-cka07-str
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: orange-stc-cka07-str
  volumeName: orange-pv-cka07-str
  resources:
    requests:
      storage: 128Mi
```
Apply the template:
```
kubectl apply -f <template-file-name>.yaml
```
---
There is a persistent volume named apple-pv-cka04-str. Create a persistent volume claim named apple-pvc-cka04-str and request a 40Mi of storage from apple-pv-cka04-str PV.
The access mode should be ReadWriteOnce and storage class should be manual.
Task Completed?
Set the context to cluster1, then create a YAML template as below:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: apple-pvc-cka04-str
spec:
  volumeName: apple-pv-cka04-str
  storageClassName: manual
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 40Mi
```
Apply the template:
```
kubectl apply -f <template-file-name>.yaml
```
---
Part I:
Create a ClusterIP service, i.e. service-3421-svcn, in the spectra-1267 namespace which should expose the pods pod-23 and pod-21 with port set to 8080 and targetPort set to 80.
Part II:
Store the pod names and their IP addresses from the spectra-1267 ns at /root/pod_ips_cka05_svcn, with the output sorted by IP.
Please ensure the format as shown below:
```
POD_NAME IP_ADDR
pod-1 ip-1
pod-3 ip-2
pod-2 ip-3
...
```
"service-3421-svcn" exists?
service-3421-svcn is of "type: ClusterIP"?
port: 8080?
targetPort: 80?
Service only exposes pod "pod-23" and "pod-21"?
correct and sorted output stored in "/root/pod_ips_cka05_svcn"?
```
Switching to cluster3:
kubectl config use-context cluster3
The easiest way to route traffic to a specific pod is by the use of labels and selectors. List the pods along with their labels:
student-node ~ ➜ kubectl get pods --show-labels -n spectra-1267
NAME READY STATUS RESTARTS AGE LABELS
pod-12 1/1 Running 0 5m21s env=dev,mode=standard,type=external
pod-34 1/1 Running 0 5m20s env=dev,mode=standard,type=internal
pod-43 1/1 Running 0 5m20s env=prod,mode=exam,type=internal
pod-23 1/1 Running 0 5m21s env=dev,mode=exam,type=external
pod-32 1/1 Running 0 5m20s env=prod,mode=standard,type=internal
pod-21 1/1 Running 0 5m20s env=prod,mode=exam,type=external
Looks like there are a lot of pods created to confuse us, but we are only concerned with the labels of pod-23 and pod-21.
As we can see, both of the required pods have the labels mode=exam,type=external in common. Let's confirm that using kubectl:
student-node ~ ➜ kubectl get pod -l mode=exam,type=external -n spectra-1267
NAME READY STATUS RESTARTS AGE
pod-23 1/1 Running 0 9m18s
pod-21 1/1 Running 0 9m17s
Now that we have figured out the labels, we can proceed with the creation of the service:
student-node ~ ➜ kubectl create service clusterip service-3421-svcn -n spectra-1267 --tcp=8080:80 --dry-run=client -o yaml > service-3421-svcn.yaml
Now modify the service definition with the required selectors before applying it to the cluster:
student-node ~ ➜ cat service-3421-svcn.yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: null
  labels:
    app: service-3421-svcn
  name: service-3421-svcn
  namespace: spectra-1267
spec:
  ports:
  - name: 8080-80
    port: 8080
    protocol: TCP
    targetPort: 80
  selector:
    app: service-3421-svcn # delete
    mode: exam             # add
    type: external         # add
  type: ClusterIP
status:
  loadBalancer: {}
Finally, let's apply the service definition:
student-node ~ ➜ kubectl apply -f service-3421-svcn.yaml
service/service-3421-svcn created
student-node ~ ➜ k get ep service-3421-svcn -n spectra-1267
NAME ENDPOINTS AGE
service-3421-svcn 10.42.0.15:80,10.42.0.17:80 52s
To store all the pod names along with their IPs, we can use an imperative command as shown below:
student-node ~ ➜ kubectl get pods -n spectra-1267 -o=custom-columns='POD_NAME:metadata.name,IP_ADDR:status.podIP' --sort-by=.status.podIP
POD_NAME IP_ADDR
pod-12 10.42.0.18
pod-23 10.42.0.19
pod-34 10.42.0.20
pod-21 10.42.0.21
...
# store the output to /root/pod_ips_cka05_svcn
student-node ~ ➜ kubectl get pods -n spectra-1267 -o=custom-columns='POD_NAME:metadata.name,IP_ADDR:status.podIP' --sort-by=.status.podIP > /root/pod_ips_cka05_svcn
```
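One caveat worth knowing: --sort-by compares the JSONPath values as strings, so dotted IPs can come out in lexicographic rather than numeric order (e.g. 10.42.0.15 before 10.42.0.9). If true numeric order is needed, a version sort fixes it, as this local demonstration with sample IPs shows:

```shell
# Sample IPs, deliberately chosen so string order differs from numeric order.
ips='10.42.0.15
10.42.0.9
10.42.0.20'

printf '%s\n' "$ips" | sort     # string order:  10.42.0.15, 10.42.0.20, 10.42.0.9
printf '%s\n' "$ips" | sort -V  # numeric order: 10.42.0.9, 10.42.0.15, 10.42.0.20
```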
---
Create a pod named tester-cka02-svcn in the dev-cka02-svcn namespace with the image registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3. Make sure to use the command sleep 3600 with the restart policy set to Always.
Once the tester-cka02-svcn pod is running, store the output of the command nslookup kubernetes.default from tester pod into the file /root/dns_output on student-node.
'dev-cka02-svcn' namespace exists?
'tester-cka02-svcn' pod exists in dev-cka02-svcn namespace?
correct image used?
Restart policy set to "Always"?
Command "sleep 3600" specified ?
Correct dns output stored in '/root/dns_output'?
```
Change to the cluster1 context before attempting the task:
kubectl config use-context cluster1
Since the "dev-cka02-svcn" namespace doesn't exist, let's create it first:
kubectl create ns dev-cka02-svcn
Create the pod as per the requirements:
kubectl apply -f - << EOF
apiVersion: v1
kind: Pod
metadata:
  name: tester-cka02-svcn
  namespace: dev-cka02-svcn
spec:
  containers:
  - name: tester-cka02-svcn
    image: registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3
    command:
    - sleep
    - "3600"
  restartPolicy: Always
EOF
Now let's test whether the nslookup command is working:
student-node ~ ➜ kubectl exec -n dev-cka02-svcn -i -t tester-cka02-svcn -- nslookup kubernetes.default
;; connection timed out; no servers could be reached
command terminated with exit code 1
Looks like something is broken at the moment. If we look at the kube-system namespace, we will see that no coredns pods are running, which is causing the problem. Let's scale them up for the nslookup command to work:
kubectl scale deployment -n kube-system coredns --replicas=2
Now let's store the correct output in /root/dns_output on the student-node:
kubectl exec -n dev-cka02-svcn -i -t tester-cka02-svcn -- nslookup kubernetes.default > /root/dns_output
We should see something similar to the output below:
student-node ~ ➜ cat /root/dns_output
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
```
---
Create a LoadBalancer service named wear-service-cka09-svcn to expose the webapp-wear-cka09-svcn deployment in the app-space namespace.
"wear-service-cka09-svcn" created in "app-space" namespace ?
Type: LoadBalancer ?
Deployment 'webapp-wear' exposed ?
```
Switch to cluster3:
kubectl config use-context cluster3
On the student-node, run the command:
student-node ~ ➜ kubectl expose -n app-space deployment webapp-wear-cka09-svcn --type=LoadBalancer --name=wear-service-cka09-svcn --port=8080
service/wear-service-cka09-svcn exposed
student-node ~ ➜ k get svc -n app-space
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
wear-service-cka09-svcn LoadBalancer 10.43.68.233 172.25.0.14 8080:32109/TCP 14s
```
---
Deploy a messaging-cka07-svcn pod using the redis:alpine image with the labels set to tier=msg.
Now create a service messaging-service-cka07-svcn to expose the messaging-cka07-svcn application within the cluster on port 6379.
TIP: Use imperative commands.
Pod Name: messaging-cka07-svcn
Image: redis:alpine
Labels: tier=msg
Service: messaging-service-cka07-svcn
Port: 6379
Type: ClusterIP
Use the right labels
```
Switch to cluster3:
kubectl config use-context cluster3
On the student-node, run the command: kubectl run messaging-cka07-svcn --image=redis:alpine -l tier=msg
Now run the command: kubectl expose pod messaging-cka07-svcn --port=6379 --name messaging-service-cka07-svcn
```
---
For this question, please set the context to cluster1 by running:
kubectl config use-context cluster1
There is a deployment called nginx-dp-cka04-trb which has been used to deploy a static website. The access to this website can be tested by running: curl http://kodekloud-exam.app:30002. However, it is not working at the moment.
Troubleshoot and fix it.
```
First list the available pods:
kubectl get pod
Look into the nginx-dp-xxxx POD logs
kubectl logs -f <pod-name>
You may not see any logs so look into the kubernetes events for <pod-name> POD
Look into the POD events
kubectl get event --field-selector involvedObject.name=<pod-name>
You will see an error something like:
70s Warning FailedMount pod/nginx-dp-cka04-trb-767b767dc-6c5wk Unable to attach or mount volumes: unmounted volumes=[nginx-config-volume-cka04-trb], unattached volumes=[index-volume-cka04-trb kube-api-access-4fbrb nginx-config-volume-cka04-trb]: timed out waiting for the condition
From the error we can see that its not able to mount nginx-config-volume-cka04-trb volume
Check the nginx-dp-cka04-trb deployment
kubectl get deploy nginx-dp-cka04-trb -o=yaml
Under volumes: look for the configMap: name which is nginx-configuration-cka04-trb. Now lets look into this configmap.
kubectl get configmap nginx-configuration-cka04-trb
The above command will fail as there is no configmap with this name, so now list all the configmaps.
kubectl get configmap
You will see a configmap named nginx-config-cka04-trb, which seems to be the correct one.
Edit the nginx-dp-cka04-trb deployment now
kubectl edit deploy nginx-dp-cka04-trb
Under configMap:, change nginx-configuration-cka04-trb to nginx-config-cka04-trb. Once done, wait for the POD to come up.
Try to access the website now:
curl http://kodekloud-exam.app:30002
```
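To spot this kind of mismatch quickly, you can grep the dumped deployment YAML for the configMap reference instead of scanning it by eye. A minimal sketch on a hypothetical snippet of the deployment output (the file path and indentation are illustrative):

```shell
# Hypothetical snippet of 'kubectl get deploy nginx-dp-cka04-trb -o yaml' output
cat > /tmp/dp-volumes.yaml <<'EOF'
      volumes:
      - configMap:
          name: nginx-configuration-cka04-trb
        name: nginx-config-volume-cka04-trb
EOF

# Print the configMap name referenced by the volume
grep -A1 'configMap:' /tmp/dp-volumes.yaml | awk '/name:/ {print $2; exit}'
```

Comparing the printed name against the `kubectl get configmap` listing makes the typo obvious.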
---
For this question, please set the context to cluster1 by running:
kubectl config use-context cluster1
A template to create a Kubernetes pod is stored at /root/red-probe-cka12-trb.yaml on the student-node. However, using this template as-is is resulting in an error.
Fix the issue with this template and use it to create the pod. Once created, watch the pod for a minute or two to make sure it is stable, i.e., it's not crashing or restarting.
Make sure you do not update the args: section of the template.
```
Try to apply the template
kubectl apply -f red-probe-cka12-trb.yaml
You will see error:
error: error validating "red-probe-cka12-trb.yaml": error validating data: [ValidationError(Pod.spec.containers[0].livenessProbe.httpGet): unknown field "command" in io.k8s.api.core.v1.HTTPGetAction, ValidationError(Pod.spec.containers[0].livenessProbe.httpGet): missing required field "port" in io.k8s.api.core.v1.HTTPGetAction]; if you choose to ignore these errors, turn validation off with --validate=false
From the error you can see that the issue is with the liveness probe, so let's open the template to find out:
vi red-probe-cka12-trb.yaml
Under livenessProbe: you will see the type is httpGet; however, the rest of the options are command based, so this probe should be of exec type.
Change httpGet to exec
Try to apply the template now
kubectl apply -f red-probe-cka12-trb.yaml
Cool, it worked. Now let's watch the POD status; after a few seconds you will notice that the POD is restarting. So let's check the logs/events:
kubectl get event --field-selector involvedObject.name=red-probe-cka12-trb
You will see an error like:
21s Warning Unhealthy pod/red-probe-cka12-trb Liveness probe failed: cat: can't open '/healthcheck': No such file or directory
So it seems like the liveness probe is failing; let's look into it:
vi red-probe-cka12-trb.yaml
Notice the command - sleep 3 ; touch /healthcheck; sleep 30;sleep 30000. It creates the /healthcheck file only after a delay of 3 seconds, but the liveness probe's initialDelaySeconds is set to 1 and failureThreshold is also 1. This means the POD will fail right after the first liveness check, which happens just 1 second after the pod starts. To make it stable, we must increase initialDelaySeconds to at least 5.
vi red-probe-cka12-trb.yaml
Change initialDelaySeconds from 1 to 5, save, and apply the changes.
Delete old pod:
kubectl delete pod red-probe-cka12-trb
Apply changes:
kubectl apply -f red-probe-cka12-trb.yaml
```
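The timing fix above can be sanity-checked with plain shell arithmetic. This is a sketch with illustrative names: the first probe fires at initialDelaySeconds, and with failureThreshold: 1 a single miss restarts the container, so the first probe must not fire before /healthcheck exists.

```shell
file_ready=3      # 'sleep 3; touch /healthcheck' creates the file at ~3s

check_probe() {
  # Stable only if the first probe fires at or after the file appears
  initial_delay=$1
  if [ "$initial_delay" -ge "$file_ready" ]; then
    echo stable
  else
    echo restarting
  fi
}

check_probe 1   # original template: prints "restarting"
check_probe 5   # fixed template:    prints "stable"
```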
---
For this question, please set the context to cluster1 by running:
kubectl config use-context cluster1
There is an existing persistent volume called orange-pv-cka13-trb. A persistent volume claim called orange-pvc-cka13-trb is created to claim storage from orange-pv-cka13-trb.
However, this PVC is stuck in a Pending state. As of now, there is no data in the volume.
Troubleshoot and fix this issue, making sure that orange-pvc-cka13-trb PVC is in Bound state.
```
List the PVC to check its status
kubectl get pvc
So we can see the orange-pvc-cka13-trb PVC is in Pending state and it's requesting 150Mi of storage. Let's look into the events:
kubectl get events --sort-by='.metadata.creationTimestamp' -A
You will see some errors as below:
Warning VolumeMismatch persistentvolumeclaim/orange-pvc-cka13-trb Cannot bind to requested volume "orange-pv-cka13-trb": requested PV is too small
Let's look into orange-pv-cka13-trb volume
kubectl get pv
We can see that the orange-pv-cka13-trb volume has a capacity of 100Mi, which is too small for a 150Mi storage request.
Let's edit orange-pvc-cka13-trb PVC to adjust the storage requested.
kubectl get pvc orange-pvc-cka13-trb -o yaml > /tmp/orange-pvc-cka13-trb.yaml
vi /tmp/orange-pvc-cka13-trb.yaml
Under resources: -> requests: -> storage: change 150Mi to 100Mi and save.
Delete old PVC and apply the change:
kubectl delete pvc orange-pvc-cka13-trb
kubectl apply -f /tmp/orange-pvc-cka13-trb.yaml
```
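The capacity check the control plane performs here can be mimicked locally. A sketch with hardcoded sizes (both quantities are in Mi, so a plain numeric compare works after stripping the suffix):

```shell
pv_capacity=100Mi    # size of orange-pv-cka13-trb
pvc_request=150Mi    # the original (broken) PVC request

cap=${pv_capacity%Mi}   # strip the Mi suffix -> 100
req=${pvc_request%Mi}   # strip the Mi suffix -> 150

if [ "$req" -gt "$cap" ]; then
  echo "Pending: requested PV is too small"
else
  echo "can bind"
fi
```

Lowering the request to 100Mi (as done above) flips the comparison and allows the bind.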
---
For this question, please set the context to cluster1 by running:
kubectl config use-context cluster1
We deployed an app using a deployment called web-dp-cka06-trb. It's using the httpd:latest image. There is a corresponding service called web-service-cka06-trb that exposes this app on node port 30005. However, the app is not accessible!
Troubleshoot and fix this issue. Make sure you are able to access the app using curl http://kodekloud-exam.app:30005 command.
```
List the deployments to see if all PODs under web-dp-cka06-trb deployment are up and running.
kubectl get deploy
You will notice that 0 out of 1 PODs are up, so let's look into the POD now.
kubectl get pod
You will notice that the web-dp-cka06-trb-xxx pod is in Pending state, so let's check out the relevant events.
kubectl get event --field-selector involvedObject.name=web-dp-cka06-trb-xxx
You should see some error/warning like this:
Warning FailedScheduling pod/web-dp-cka06-trb-76b697c6df-h78x4 0/1 nodes are available: 1 persistentvolumeclaim "web-cka06-trb" not found. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
Let's look into the PVCs
kubectl get pvc
You should see web-pvc-cka06-trb in the output but as per logs the POD was looking for web-cka06-trb PVC. Let's update the deployment to fix this.
kubectl edit deploy web-dp-cka06-trb
Under volumes: -> name: web-str-cka06-trb -> persistentVolumeClaim: -> claimName change web-cka06-trb to web-pvc-cka06-trb and save the changes.
Look into the POD again to make sure it's running now:
kubectl get pod
You will find that it's still failing, most probably with an ErrImagePull or ImagePullBackOff error. Now let's update the deployment again to make sure it's using the correct image.
kubectl edit deploy web-dp-cka06-trb
Under spec: -> containers: -> change image from httpd:letest to httpd:latest and save the changes.
Look into the POD again to make sure it's running now:
kubectl get pod
You will notice that the POD is still crashing, so let's look into the POD logs.
kubectl logs web-dp-cka06-trb-xxxx
If there are no useful logs then look into the events
kubectl get event --field-selector involvedObject.name=web-dp-cka06-trb-xxxx --sort-by='.lastTimestamp'
You should see some errors/warnings as below
Warning FailedPostStartHook pod/web-dp-cka06-trb-67dccb7487-2bjgf Exec lifecycle hook ([/bin -c echo 'Test Page' > /usr/local/apache2/htdocs/index.html]) for Container "web-container" in Pod "web-dp-cka06-trb-67dccb7487-2bjgf_default(4dd6565e-7f1a-4407-b3d9-ca595e6d4e95)" failed - error: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "c980799567c8176db5931daa2fd56de09e84977ecd527a1d1f723a862604bd7c": OCI runtime exec failed: exec failed: unable to start container process: exec: "/bin": permission denied: unknown, message: ""
Let's look into the lifecycle hook of the pod
kubectl edit deploy web-dp-cka06-trb
Under containers: -> lifecycle: -> postStart: -> exec: -> command: change /bin to /bin/sh
Look into the POD again to make sure it's running now:
kubectl get pod
Finally, the pod should be in Running state. Let's try to access the webapp now.
curl http://kodekloud-exam.app:30005
You will see error curl: (7) Failed to connect to kodekloud-exam.app port 30005: Connection refused
Let's look into the service
kubectl edit svc web-service-cka06-trb
Let's verify that the selector labels and ports are correct. You will note that the service is using selector: -> app: web-cka06-trb
Now, let's verify the app labels:
kubectl get deploy web-dp-cka06-trb -o yaml
Under labels you will see labels: -> deploy: web-app-cka06-trb
So we can see that the service is using the wrong selector label. Let's edit the service and change the selector from app: web-cka06-trb to deploy: web-app-cka06-trb:
kubectl edit svc web-service-cka06-trb
Let's try to access the webapp now.
curl http://kodekloud-exam.app:30005
Boom! app should be accessible now.
```
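The root cause of the FailedPostStartHook error above is easy to reproduce outside Kubernetes: /bin is a directory, not an executable, so the hook's command array must start with a real interpreter such as /bin/sh. A local sketch (the /tmp file path is illustrative):

```shell
# What the fixed hook effectively runs: a shell interprets the redirect
/bin/sh -c "echo 'Test Page' > /tmp/index.html"
cat /tmp/index.html

# What the broken hook tried to run; a directory cannot be executed
# (exact error text varies by shell, e.g. "Is a directory" or "Permission denied")
/bin -c "echo 'Test Page'" 2>&1 || true
```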
---
SECTION: TROUBLESHOOTING
For this question, please set the context to cluster4 by running:
kubectl config use-context cluster4
There is a pod called pink-pod-cka16-trb created in the default namespace in cluster4. This app runs on port tcp/5000 and it is exposed to end-users using an ingress resource called pink-ing-cka16-trb in such a way that it is supposed to be accessible using the command: curl http://kodekloud-pink.app on cluster4-controlplane host.
However, this is not working. Troubleshoot and fix this issue, making any necessary changes to the objects.
Note: You should be able to ssh into the cluster4-controlplane using ssh cluster4-controlplane command.
```
SSH into the cluster4-controlplane host and try to access the app.
ssh cluster4-controlplane
curl kodekloud-pink.app
You must be getting a 503 Service Temporarily Unavailable error.
Let's look into the service:
kubectl edit svc pink-svc-cka16-trb
Under ports: change protocol: UDP to protocol: TCP
Try to access the app again
curl kodekloud-pink.app
You must be getting a curl: (6) Could not resolve host: example.com error. From the error we can see that it's not able to resolve the host, which indicates there may be a DNS issue. As we know, CoreDNS is a DNS server that can serve as the Kubernetes cluster DNS, so this could be related to CoreDNS.
Let's check if we have CoreDNS deployment running:
kubectl get deploy -n kube-system
You will see that for coredns all replicas are down; you will see 0/0 ready pods. So let's scale up this deployment.
kubectl scale --replicas=2 deployment coredns -n kube-system
Once CoreDNS is up, let's try to access the app again.
curl kodekloud-pink.app
It should work now.
```
---
SECTION: TROUBLESHOOTING
For this question, please set the context to cluster2 by running:
kubectl config use-context cluster2
The cat-cka22-trb pod is stuck in Pending state. Look into the issue to fix the same. Make sure that the pod is in running state and its stable (i.e not restarting or crashing).
Note: Do not make any changes to the pod (no changes to pod config, but you may destroy and re-create it).
```
Let's check the POD status
kubectl get pod
You will see that cat-cka22-trb pod is stuck in Pending state. So let's try to look into the events
kubectl --context cluster2 get event --field-selector involvedObject.name=cat-cka22-trb
You will see some logs as below
Warning FailedScheduling pod/cat-cka22-trb 0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 2 node(s) didn't match Pod's node affinity/selector. preemption: 0/2 nodes are available: 3 Preemption is not helpful for scheduling.
So it seems like this POD is using node affinity; let's look into the POD to understand the node affinity it's using.
kubectl --context cluster2 get pod cat-cka22-trb -o yaml
Under affinity: you will see it's looking for key: node and values: cluster2-node02, so let's verify if cluster2-node01 has this label applied.
kubectl --context cluster2 get node cluster2-node01 -o yaml
Look under labels: and you will not find any such label, so let's add this label to this node.
kubectl --context cluster2 label node cluster2-node01 node=cluster2-node02
Check again the node details
kubectl get node cluster2-node01 -o yaml
The new label should be there, let's see if POD is scheduled now on this node
kubectl --context cluster2 get pod
It is, but it must be crashing or restarting, so let's look into the pod logs.
kubectl --context cluster2 logs -f cat-cka22-trb
You will see logs as below:
The HOST variable seems incorrect, it must be set to kodekloud
Let's look into the POD env variables to see if there is any HOST env variable
kubectl --context cluster2 get pod cat-cka22-trb -o yaml
Under env: you will see this
env:
- name: HOST
valueFrom:
secretKeyRef:
key: hostname
name: cat-cka22-trb
So we can see that HOST variable is defined and its value is being retrieved from a secret called "cat-cka22-trb". Let's look into this secret.
kubectl --context cluster2 get secret
kubectl --context cluster2 get secret cat-cka22-trb -o yaml
You will find a key/value pair under data:, let's try to decode it to see its value:
echo "<the encoded value you see for hostname>" | base64 -d
OK, so the value is set to kodekloude, which is incorrect; it should be set to kodekloud. So let's update the secret:
echo "kodekloud" | base64
kubectl edit secret cat-cka22-trb
Under data:, change hostname: a29kZWtsb3Vkdg== to hostname: a29kZWtsb3VkCg== (values may vary).
POD should be good now.
```
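One subtlety worth noting when re-encoding the secret value: echo appends a trailing newline, which gets encoded into the secret (that is why the value above ends in Cg==, the encoding of a newline). A quick local check:

```shell
printf '%s' kodekloud | base64    # a29kZWtsb3Vk     (no trailing newline)
echo kodekloud | base64           # a29kZWtsb3VkCg== (newline encoded too)

# Decoding confirms the round trip
echo a29kZWtsb3VkCg== | base64 -d # prints: kodekloud
```

Whether the trailing newline matters depends on how the application reads the variable; printf (or echo -n) is the safer habit.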
---
SECTION: SCHEDULING
For this question, please set the context to cluster1 by running:
kubectl config use-context cluster1
We have deployed a simple web application called frontend-wl04 on cluster1. This version of the application has some issues from a security point of view and needs to be updated to version 2.
Update the image and wait for the application to fully deploy.
You can verify the running application using the curl command on the terminal:
student-node ~ ➜ curl http://cluster1-node01:30080
<!doctype html>
<title>Hello from Flask</title>
<body style="background: #2980b9;"></body>
<div style="color: #e4e4e4;
text-align: center;
height: 90px;
vertical-align: middle;">
<h1>Hello from frontend-wl04-84fc69bd96-p7rbl!</h1>
<h2>
Application Version: v1
</h2>
</div>
student-node ~ ➜
Version 2 Image details as follows:
1. Current version of the image is `v1`, we need to update with the image to kodekloud/webapp-color:v2.
2. Use the imperative command to update the image.
```
Set the context: -
kubectl config use-context cluster1
Now, test the current version of the application as follows:
student-node ~ ➜ curl http://cluster1-node01:30080
<!doctype html>
<title>Hello from Flask</title>
<body style="background: #2980b9;"></body>
<div style="color: #e4e4e4;
text-align: center;
height: 90px;
vertical-align: middle;">
<h1>Hello from frontend-wl04-84fc69bd96-p7rbl!</h1>
<h2>
Application Version: v1
</h2>
</div>
student-node ~ ➜
Let's update the image, First, run the below command to check the existing image: -
kubectl get deploy frontend-wl04 -oyaml | grep -i image
After checking the existing image, we have to use the imperative command (It will take less than a minute) to update the image: -
kubectl set image deploy frontend-wl04 simple-webapp=kodekloud/webapp-color:v2
Finally, run the below command to check the updated image: -
kubectl get deploy frontend-wl04 -oyaml | grep -i image
It should be the kodekloud/webapp-color:v2 image and the same should be visible when you run the curl command again:
student-node ~ ➜ curl http://cluster1-node01:30080
<!doctype html>
<title>Hello from Flask</title>
<body style="background: #16a085;"></body>
<div style="color: #e4e4e4;
text-align: center;
height: 90px;
vertical-align: middle;">
<h1>Hello from frontend-wl04-6c54f479df-5tddd!</h1>
<h2>
Application Version: v2
</h2>
</div>
```
---
SECTION: SCHEDULING
For this question, please set the context to cluster3 by running:
kubectl config use-context cluster3
A manifest file is available at /root/app-wl03/ on the student-node. There are some issues with the file; hence, a pod couldn't be deployed on the cluster3-controlplane node.
After fixing the issues, deploy the pod, and it should be in a running state.
NOTE: - Ensure that the existing limits are unchanged.
```
Set the correct context: -
kubectl config use-context cluster3
Use the cd command to move to the given directory: -
cd /root/app-wl03/
While creating the resource, you will see the error output as follows: -
kubectl create -f app-wl03.yaml
The Pod "app-wl03" is invalid: spec.containers[0].resources.requests: Invalid value: "1Gi": must be less than or equal to memory limit
The spec.containers[0].resources.requests.memory value is configured higher than the memory limit.
As a fix, open the manifest file with a text editor such as vim or nano and set the request value to 100Mi or less.
It should look as follows: -
resources:
requests:
memory: 100Mi
limits:
memory: 100Mi
Finally, create the resource with the kubectl create command: -
kubectl create -f app-wl03.yaml
pod/app-wl03 created
```
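The validation rule the API server applied here (requests must not exceed limits) can be sketched in shell, including the unit conversion needed to compare mixed Mi/Gi quantities. The helper name to_mi is illustrative:

```shell
to_mi() {
  # Convert a memory quantity like 1Gi or 100Mi to a plain number of Mi
  case $1 in
    *Gi) echo $(( ${1%Gi} * 1024 )) ;;
    *Mi) echo "${1%Mi}" ;;
  esac
}

req=$(to_mi 1Gi)     # the broken request from the error message
lim=$(to_mi 100Mi)   # the existing limit, which must stay unchanged

if [ "$req" -le "$lim" ]; then
  echo "valid"
else
  echo "invalid: request (${req}Mi) exceeds limit (${lim}Mi)"
fi
```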
---
SECTION: SCHEDULING
For this question, please set the context to cluster3 by running:
kubectl config use-context cluster3
We have deployed a 2-tier web application on the cluster3 nodes in the canara-wl05 namespace. However, at the moment, the web app pod cannot establish a connection with the MySQL pod successfully.
You can check the status of the application from the terminal by running the curl command with the following syntax:
curl http://cluster3-controlplane:NODE-PORT
To make the application work, create a new secret called db-secret-wl05 with the following key values: -
1. DB_Host=mysql-svc-wl05
2. DB_User=root
3. DB_Password=password123
Next, configure the web application pod to load the new environment variables from the newly created secret.
Note: Check the web application again using the curl command, and the status of the application should be success.
You can SSH into the cluster3 using ssh cluster3-controlplane command.
```
Set the correct context: -
kubectl config use-context cluster3
List the nodes: -
kubectl get nodes -o wide
Run the curl command to know the status of the application as follows: -
ssh cluster3-controlplane
curl http://10.17.63.11:31020
<!doctype html>
<title>Hello from Flask</title>
...
<img src="/static/img/failed.png">
<h3> Failed connecting to the MySQL database. </h3>
<h2> Environment Variables: DB_Host=Not Set; DB_Database=Not Set; DB_User=Not Set; DB_Password=Not Set; 2003: Can't connect to MySQL server on 'localhost:3306' (111 Connection refused) </h2>
As you can see, the status of the application is failed.
NOTE: - In your lab, IP addresses could be different.
Let's create a new secret called db-secret-wl05 as follows: -
kubectl create secret generic db-secret-wl05 -n canara-wl05 --from-literal=DB_Host=mysql-svc-wl05 --from-literal=DB_User=root --from-literal=DB_Password=password123
After that, configure the newly created secret to the web application pod as follows: -
---
apiVersion: v1
kind: Pod
metadata:
labels:
run: webapp-pod-wl05
name: webapp-pod-wl05
namespace: canara-wl05
spec:
containers:
- image: kodekloud/simple-webapp-mysql
name: webapp-pod-wl05
envFrom:
- secretRef:
name: db-secret-wl05
then use the kubectl replace command: -
kubectl replace -f <FILE-NAME> --force
In the end, make use of the curl command to check the status of the application pod. The status of the application should be success.
curl http://10.17.63.11:31020
<!doctype html>
<title>Hello from Flask</title>
<body style="background: #39b54b;"></body>
<div style="color: #e4e4e4;
text-align: center;
height: 90px;
vertical-align: middle;">
<img src="/static/img/success.jpg">
<h3> Successfully connected to the MySQL database.</h3>
```
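Under the hood, each --from-literal pair is stored base64-encoded under data: in the Secret object. The encoding can be previewed locally before creating the object; a sketch using the same key/value pairs as the question:

```shell
for kv in DB_Host=mysql-svc-wl05 DB_User=root DB_Password=password123; do
  key=${kv%%=*}   # text before the first '='
  val=${kv#*=}    # text after the first '='
  # printf avoids encoding a trailing newline into the value
  printf '%s: %s\n' "$key" "$(printf '%s' "$val" | base64)"
done
```

Each printed line matches what `kubectl get secret db-secret-wl05 -o yaml` would show under data:.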
SECTION: STORAGE
For this question, please set the context to cluster1 by running:
kubectl config use-context cluster1
A persistent volume called papaya-pv-cka09-str is already created with a storage capacity of 150Mi. It's using the papaya-stc-cka09-str storage class with the path /opt/papaya-stc-cka09-str.
Also, a persistent volume claim named papaya-pvc-cka09-str has also been created on this cluster. This PVC has requested 50Mi of storage from papaya-pv-cka09-str volume.
Resize the PVC to 80Mi and make sure the PVC is in Bound state.
```
Edit papaya-pv-cka09-str PV:
kubectl get pv papaya-pv-cka09-str -o yaml > /tmp/papaya-pv-cka09-str.yaml
Edit the template:
vi /tmp/papaya-pv-cka09-str.yaml
Delete all entries for uid:, annotations, status:, claimRef: from the template.
Edit papaya-pvc-cka09-str PVC:
kubectl get pvc papaya-pvc-cka09-str -o yaml > /tmp/papaya-pvc-cka09-str.yaml
Edit the template:
vi /tmp/papaya-pvc-cka09-str.yaml
Under resources: -> requests: change storage: 50Mi to storage: 80Mi and save the template.
Delete the existing PVC:
kubectl delete pvc papaya-pvc-cka09-str
Delete the existing PV and recreate it using the template:
kubectl delete pv papaya-pv-cka09-str
kubectl apply -f /tmp/papaya-pv-cka09-str.yaml
Create the PVC using template:
kubectl apply -f /tmp/papaya-pvc-cka09-str.yaml
```
---
SECTION: STORAGE
For this question, please set the context to cluster1 by running:
kubectl config use-context cluster1
There is a persistent volume named apple-pv-cka04-str. Create a persistent volume claim named apple-pvc-cka04-str and request a 40Mi of storage from apple-pv-cka04-str PV.
The access mode should be ReadWriteOnce and storage class should be manual.
```
Set context to cluster1:
Create a yaml template as below:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: apple-pvc-cka04-str
spec:
volumeName: apple-pv-cka04-str
storageClassName: manual
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 40Mi
Apply the template:
kubectl apply -f <template-file-name>.yaml
```
---
SECTION: SERVICE NETWORKING
For this question, please set the context to cluster1 by running:
kubectl config use-context cluster1
John is setting up a two tier application stack that is supposed to be accessible using the service curlme-cka01-svcn. To test that the service is accessible, he is using a pod called curlpod-cka01-svcn. However, at the moment, he is unable to get any response from the application.
Troubleshoot and fix this issue so the application stack is accessible.
While you may delete and recreate the service curlme-cka01-svcn, please do not alter it in any way.
```
Test if the service curlme-cka01-svcn is accessible from pod curlpod-cka01-svcn or not.
kubectl exec curlpod-cka01-svcn -- curl curlme-cka01-svcn
.....
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:10 --:--:-- 0
We did not get any response. Check if the service is properly configured or not.
kubectl describe svc curlme-cka01-svcn
....
Name: curlme-cka01-svcn
Namespace: default
Labels: <none>
Annotations: <none>
Selector: run=curlme-ckaO1-svcn
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.109.45.180
IPs: 10.109.45.180
Port: <unset> 80/TCP
TargetPort: 80/TCP
Endpoints: <none>
Session Affinity: None
Events: <none>
The service has no endpoints configured. Since we are allowed to delete the resource, let's delete the service and create it again.
To delete the service, use the command kubectl delete svc curlme-cka01-svcn.
You can create the service using imperative way or declarative way.
Using imperative command:
kubectl expose pod curlme-cka01-svcn --port=80
Using declarative manifest:
apiVersion: v1
kind: Service
metadata:
labels:
run: curlme-cka01-svcn
name: curlme-cka01-svcn
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
run: curlme-cka01-svcn
type: ClusterIP
You can test the connection from curlpod-cka01-svcn using the following:
kubectl exec curlpod-cka01-svcn -- curl curlme-cka01-svcn
```
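Note the likely culprit in the describe output above: the selector reads run=curlme-ckaO1-svcn with a capital letter O where the pod's label uses the digit 0. Endpoints stay empty because selector matching is an exact string comparison, which a quick local check makes visible:

```shell
svc_selector='run=curlme-ckaO1-svcn'   # capital O, as shown by 'kubectl describe svc'
pod_label='run=curlme-cka01-svcn'      # digit zero, the pod's actual label

if [ "$svc_selector" = "$pod_label" ]; then
  echo "match: endpoints populated"
else
  echo "no match: endpoints stay empty"
fi
```

Recreating the service with kubectl expose sidesteps the typo entirely, since the selector is copied from the pod's own labels.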
---
For this question, please set the context to cluster3 by running:
kubectl config use-context cluster3
Create a ReplicaSet with name checker-cka10-svcn in ns-12345-svcn namespace with image registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3.
Make sure to specify the below specs as well:
command sleep 3600
replicas set to 2
container name: dns-image
Once the checker pods are up and running, store the output of the command nslookup kubernetes.default from any one of the checker pod into the file /root/dns-output-12345-cka10-svcn on student-node.
```
Change to the cluster3 context before attempting the task:
kubectl config use-context cluster3
Create the ReplicaSet as per the requirements:
kubectl apply -f - << EOF
---
apiVersion: v1
kind: Namespace
metadata:
creationTimestamp: null
name: ns-12345-svcn
spec: {}
status: {}
---
apiVersion: apps/v1
kind: ReplicaSet
metadata:
name: checker-cka10-svcn
namespace: ns-12345-svcn
labels:
app: dns
tier: testing
spec:
replicas: 2
selector:
matchLabels:
tier: testing
template:
metadata:
labels:
tier: testing
spec:
containers:
- name: dns-image
image: registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3
command:
- sleep
- "3600"
EOF
Now let's test if the nslookup command is working :
student-node ~ ➜ k get pods -n ns-12345-svcn
NAME READY STATUS RESTARTS AGE
checker-cka10-svcn-d2cd2 1/1 Running 0 12s
checker-cka10-svcn-qj8rc 1/1 Running 0 12s
student-node ~ ➜ POD_NAME=`k get pods -n ns-12345-svcn --no-headers | head -1 | awk '{print $1}'`
student-node ~ ➜ kubectl exec -n ns-12345-svcn -i -t $POD_NAME -- nslookup kubernetes.default
;; connection timed out; no servers could be reached
command terminated with exit code 1
There seems to be a problem with the name resolution. Let's check if our coredns pods are up and if any service exists to reach them:
student-node ~ ➜ k get pods -n kube-system | grep coredns
coredns-6d4b75cb6d-cprjz 1/1 Running 0 42m
coredns-6d4b75cb6d-fdrhv 1/1 Running 0 42m
student-node ~ ➜ k get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 62m
Everything looks okay here but the name resolution problem exists, let's see if the kube-dns service have any active endpoints:
student-node ~ ➜ kubectl get ep -n kube-system kube-dns
NAME ENDPOINTS AGE
kube-dns <none> 63m
Finally, we have our culprit.
If we dig a little deeper, we will see it is using the wrong labels and selector:
student-node ~ ➜ kubectl describe svc -n kube-system kube-dns
Name: kube-dns
Namespace: kube-system
....
Selector: k8s-app=core-dns
Type: ClusterIP
...
student-node ~ ➜ kubectl get deploy -n kube-system --show-labels | grep coredns
coredns 2/2 2 2 66m k8s-app=kube-dns
Let's update the kube-dns service to point to the correct set of pods:
student-node ~ ➜ kubectl patch service -n kube-system kube-dns -p '{"spec":{"selector":{"k8s-app": "kube-dns"}}}'
service/kube-dns patched
student-node ~ ➜ kubectl get ep -n kube-system kube-dns
NAME ENDPOINTS AGE
kube-dns 10.50.0.2:53,10.50.192.1:53,10.50.0.2:53 + 3 more... 69m
NOTE: We can use any method to update kube-dns service. In our case, we have used kubectl patch command.
Now let's store the correct output to /root/dns-output-12345-cka10-svcn:
student-node ~ ➜ kubectl exec -n ns-12345-svcn -i -t $POD_NAME -- nslookup kubernetes.default
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
student-node ~ ➜ kubectl exec -n ns-12345-svcn -i -t $POD_NAME -- nslookup kubernetes.default > /root/dns-output-12345-cka10-svcn
```
---
SECTION: SERVICE NETWORKING
For this question, please set the context to cluster3 by running:
kubectl config use-context cluster3
There is a deployment nginx-deployment-cka04-svcn in cluster3 which is exposed using service nginx-service-cka04-svcn.
Create an ingress resource nginx-ingress-cka04-svcn to load balance the incoming traffic with the following specifications:
pathType: Prefix and path: /
Backend Service Name: nginx-service-cka04-svcn
Backend Service Port: 80
ssl-redirect is set to false
```
First change the context to "cluster3":
student-node ~ ➜ kubectl config use-context cluster3
Switched to context "cluster3".
Now apply the ingress resource with the given requirements:
kubectl apply -f - << EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: nginx-ingress-cka04-svcn
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "false"
spec:
rules:
- http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: nginx-service-cka04-svcn
port:
number: 80
EOF
Check if the ingress resource was successfully created:
student-node ~ ➜ kubectl get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
nginx-ingress-cka04-svcn <none> * 172.25.0.10 80 13s
As the ingress controller is exposed on cluster3-controlplane using traefik service, we need to ssh to cluster3-controlplane first to check if the ingress resource works properly:
student-node ~ ➜ ssh cluster3-controlplane
cluster3-controlplane:~# curl -I 172.25.0.11
HTTP/1.1 200 OK
...
```
SECTION: SERVICE NETWORKING
For this question, please set the context to cluster3 by running:
kubectl config use-context cluster3
Create a loadbalancer service with name wear-service-cka09-svcn to expose the deployment webapp-wear-cka09-svcn application in app-space namespace.
```
Switch to cluster3 :
kubectl config use-context cluster3
On student node run the command:
student-node ~ ➜ kubectl expose -n app-space deployment webapp-wear-cka09-svcn --type=LoadBalancer --name=wear-service-cka09-svcn --port=8080
service/wear-service-cka09-svcn exposed
student-node ~ ➜ k get svc -n app-space
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
wear-service-cka09-svcn LoadBalancer 10.43.68.233 172.25.0.14 8080:32109/TCP 14s
```
---
SECTION: ARCHITECTURE, INSTALL AND MAINTENANCE
Find the node across all clusters that consumes the most memory and store the result to the file /opt/high_memory_node in the following format cluster_name,node_name.
The node could be in any clusters that are currently configured on the student-node.
```
Check out the metrics for all node across all clusters:
student-node ~ ➜ kubectl top node --context cluster1 --no-headers | sort -nr -k4 | head -1
cluster1-controlplane 124m 1% 768Mi 1%
student-node ~ ➜ kubectl top node --context cluster2 --no-headers | sort -nr -k4 | head -1
cluster2-controlplane 79m 0% 873Mi 1%
student-node ~ ➜ kubectl top node --context cluster3 --no-headers | sort -nr -k4 | head -1
cluster3-controlplane 78m 0% 902Mi 1%
student-node ~ ➜ kubectl top node --context cluster4 --no-headers | sort -nr -k4 | head -1
cluster4-controlplane 78m 0% 901Mi 1%
student-node ~ ➜
Using this, find the node that uses the most memory. In this case, it is cluster3-controlplane on cluster3.
Save the result in the correct format to the file:
student-node ~ ➜ echo cluster3,cluster3-controlplane > /opt/high_memory_node
```
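The sort -nr -k4 | head -1 pipeline above picks the line with the largest value in the fourth column (memory). It can be verified on a canned sample of kubectl top node style output (the values below are made up):

```shell
# Columns: NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
printf '%s\n' \
  'cluster1-controlplane 124m 1% 768Mi 1%' \
  'cluster1-node01 80m 0% 512Mi 1%' \
  'cluster1-node02 90m 0% 640Mi 1%' |
  sort -nr -k4 | head -1 | awk '{print $1}'
```

sort -n parses the leading digits of 768Mi as 768, so the Mi suffix does not interfere; this only holds while all nodes report in the same unit.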
---
SECTION: ARCHITECTURE, INSTALL AND MAINTENANCE
For this question, please set the context to cluster1 by running:
kubectl config use-context cluster1
Create a generic secret called db-user-pass-cka17-arch in the default namespace on cluster1 using the contents of the file /opt/db-user-pass on the student-node
```
Create the required secret:
student-node ~ ➜ kubectl create secret generic db-user-pass-cka17-arch --from-file=/opt/db-user-pass
```
---
SECTION: SERVICE NETWORKING
For this question, please set the context to cluster1 by running:
kubectl config use-context cluster1
John is setting up a two tier application stack that is supposed to be accessible using the service curlme-cka01-svcn. To test that the service is accessible, he is using a pod called curlpod-cka01-svcn. However, at the moment, he is unable to get any response from the application.
Troubleshoot and fix this issue so the application stack is accessible.
While you may delete and recreate the service curlme-cka01-svcn, please do not alter it in anyway.
```
info_outline
Solution
Test if the service curlme-cka01-svcn is accessible from pod curlpod-cka01-svcn or not.
kubectl exec curlpod-cka01-svcn -- curl curlme-cka01-svcn
.....
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:10 --:--:-- 0
We did not get any response. Check if the service is properly configured or not.
kubectl describe svc curlme-cka01-svcn ''
....
Name: curlme-cka01-svcn
Namespace: default
Labels: <none>
Annotations: <none>
Selector: run=curlme-ckaO1-svcn
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.109.45.180
IPs: 10.109.45.180
Port: <unset> 80/TCP
TargetPort: 80/TCP
Endpoints: <none>
Session Affinity: None
Events: <none>
The service has no endpoints configured. Look closely at the selector above: it reads run=curlme-ckaO1-svcn (with a capital letter O) while the pod is labelled run=curlme-cka01-svcn (with a zero), so the service matches nothing. Since we are allowed to delete and recreate the service, delete it:
kubectl delete svc curlme-cka01-svcn
You can then recreate the service imperatively or declaratively.
Using imperative command:
kubectl expose pod curlme-cka01-svcn --port=80
Using declarative manifest:
apiVersion: v1
kind: Service
metadata:
  labels:
    run: curlme-cka01-svcn
  name: curlme-cka01-svcn
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: curlme-cka01-svcn
  type: ClusterIP
You can test the connection from curlpod-cka01-svcn using the following:
kubectl exec curlpod-cka01-svcn -- curl curlme-cka01-svcn
```
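After recreating the service, a quick sanity check (my addition, not part of the original solution) is to confirm that the endpoints are now populated and that the selector really matches the pod's labels:
```shell
# The Endpoints object should now list the pod IP instead of <none>
kubectl get endpoints curlme-cka01-svcn
# Cross-check the pod's labels against the service selector
kubectl get pod curlme-cka01-svcn --show-labels
```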
---
SECTION: STORAGE
For this question, please set the context to cluster1 by running:
kubectl config use-context cluster1
We want to deploy a python based application on the cluster using a template located at /root/olive-app-cka10-str.yaml on student-node. However, before you proceed we need to make some modifications to the YAML file as per details given below:
The YAML should also contain a persistent volume claim with name olive-pvc-cka10-str to claim a 100Mi of storage from olive-pv-cka10-str PV.
Update the deployment to add a sidecar container, which can use the busybox image (you might need to add a sleep command for this container to keep it running).
Share the python-data volume with this container and mount the same at path /usr/src. Make sure this container only has read permissions on this volume.
Finally, create a pod using this YAML and make sure the POD is in Running state.
```
Update the olive-app-cka10-str.yaml template so that it looks like this:
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: olive-pvc-cka10-str
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: olive-stc-cka10-str
  volumeName: olive-pv-cka10-str
  resources:
    requests:
      storage: 100Mi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: olive-app-cka10-str
spec:
  replicas: 1
  selector:
    matchLabels:
      app: olive-app-cka10-str
  template:
    metadata:
      labels:
        app: olive-app-cka10-str
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - cluster1-node01
      containers:
      - name: python
        image: poroko/flask-demo-app
        ports:
        - containerPort: 5000
        volumeMounts:
        - name: python-data
          mountPath: /usr/share/
      - name: busybox
        image: busybox
        command:
          - "/bin/sh"
          - "-c"
          - "sleep 10000"
        volumeMounts:
        - name: python-data
          mountPath: "/usr/src"
          readOnly: true
      volumes:
      - name: python-data
        persistentVolumeClaim:
          claimName: olive-pvc-cka10-str
---
apiVersion: v1
kind: Service
metadata:
  name: olive-svc-cka10-str
spec:
  type: NodePort
  ports:
    - port: 5000
      nodePort: 32006
  selector:
    app: olive-app-cka10-str
Apply the template:
kubectl apply -f olive-app-cka10-str.yaml
```
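Then verify the result (an optional check I have added; the resource names come from the task, and the read-only probe simply expects `touch` to fail on the sidecar's mount):
```shell
# The PVC should show Bound against olive-pv-cka10-str, and the pod should be 2/2 Running
kubectl get pvc olive-pvc-cka10-str
kubectl get pod -l app=olive-app-cka10-str
# Prove the sidecar's volume is read-only: this touch should fail with "Read-only file system"
kubectl exec deploy/olive-app-cka10-str -c busybox -- touch /usr/src/test-file
```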
---
Create a static pod named static-busybox on the controlplane node that uses the busybox image and the command sleep 1000.
```
Create a pod definition file directly in the kubelet's static pod manifests directory:
kubectl run static-busybox --restart=Never --image=busybox --dry-run=client -o yaml --command -- sleep 1000 > /etc/kubernetes/manifests/static-busybox.yaml
```
---