# Kubernetes Deep-Dive Activities

## Activity 1: Understanding Pod Lifecycle and Container Runtime

**Reading Resources:**

- https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/
- https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
- https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#pod-v1-core

**Context:** Pods are the atomic unit in Kubernetes, but their lifecycle is more complex than just "running" or "stopped". Understanding the phases, conditions, and container states is crucial for debugging and designing resilient applications.

**Questions:**

1. What are the five possible phases of a Pod, and what does each phase indicate?
2. Run `kubectl get pods -o wide` in your cluster. What additional information does the READY column show compared to STATUS?
3. What's the difference between Pod phase and Pod conditions? List at least three Pod conditions.
4. Create a Pod manifest that uses both `postStart` and `preStop` lifecycle hooks. What happens if `postStart` fails?
5. How does the `restartPolicy` affect container behavior? Write three different Pod specs demonstrating each policy.
6. What is the difference between `livenessProbe` and `readinessProbe`? When would you use `startupProbe`?
7. Design a probe configuration for a slow-starting Java application that takes 2 minutes to initialize. Include all three probe types.
8. What happens to a Pod when a node becomes NotReady? How does the eviction timeout work?
9. Investigate init containers: Create a Pod with 2 init containers and 1 main container. Use `kubectl describe` to observe the initialization sequence.
10. Advanced: How does the container runtime (containerd/CRI-O) interact with kubelet during Pod creation? What are the CRI calls involved?
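
As a possible starting point for questions 4, 6, and 7 above, the manifest below sketches a single Pod that combines lifecycle hooks with all three probe types. The Pod name, the `nginx:1.25` image, the probe paths, and every timing value are assumptions chosen for illustration. nginx stands in for a slow-starting application, so the thresholds would need tuning for a real workload.

```yaml
# Illustrative sketch only: names, image, paths, and timings are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo
spec:
  restartPolicy: Always
  containers:
    - name: web
      image: nginx:1.25
      ports:
        - containerPort: 80
      lifecycle:
        postStart:
          exec:
            command: ["/bin/sh", "-c", "echo started > /tmp/started"]
        preStop:
          exec:
            command: ["/bin/sh", "-c", "nginx -s quit; sleep 5"]
      # Startup budget: failureThreshold x periodSeconds = 30 x 5s = 150s.
      # Liveness and readiness probes are held off until this probe succeeds.
      startupProbe:
        httpGet:
          path: /
          port: 80
        periodSeconds: 5
        failureThreshold: 30
      # Restarts the container after 3 consecutive failures.
      livenessProbe:
        httpGet:
          path: /
          port: 80
        periodSeconds: 10
        failureThreshold: 3
      # Controls whether the Pod receives Service traffic.
      readinessProbe:
        httpGet:
          path: /
          port: 80
        periodSeconds: 5
```

After applying it, `kubectl describe pod lifecycle-demo` shows the hook and probe events alongside the Pod conditions.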

---

## Activity 2: Deployment Strategies and Rolling Updates

**Reading Resources:**

- https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
- https://kubernetes.io/docs/tutorials/kubernetes-basics/update/update-intro/
- https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/

**Context:** Deployments manage ReplicaSets, which in turn manage Pods. This abstraction enables sophisticated update strategies, but understanding the mechanics is essential for production operations.

**Questions:**

1. What is the relationship between Deployments, ReplicaSets, and Pods? Draw a diagram showing ownership.
2. Run `kubectl rollout history deployment/<name>`. What determines the revision numbers?
3. What are the two update strategies available for Deployments? When would you use one over the other?
4. Explain the purpose of `maxSurge` and `maxUnavailable`. What happens with `maxSurge=1` and `maxUnavailable=0`?
5. Create a Deployment with 10 replicas. Perform a rolling update and use `kubectl rollout status` to monitor. What do you observe?
6. How does `progressDeadlineSeconds` work? Trigger a failed deployment by using a non-existent image and observe the behavior.
7. Write a Deployment manifest that ensures zero downtime during updates (consider probes, PodDisruptionBudget, and update strategy).
8. What is the purpose of the `deployment.kubernetes.io/revision` annotation on ReplicaSets? How many old ReplicaSets are kept by default?
9. Implement a canary deployment manually using two Deployments sharing the same Service. How do you control traffic distribution?
10. Advanced: How does the Deployment controller decide when to scale down old ReplicaSets? What is the reconciliation loop logic?
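
To make the rolling-update settings from questions 4 to 6 concrete, here is a minimal Deployment sketch. The name `web`, the `app: web` label, and the `nginx:1.25` image are assumptions. The `revisionHistoryLimit` and `progressDeadlineSeconds` values shown are the documented defaults, written out only so they are easy to change while experimenting.

```yaml
# Illustrative sketch only: name, labels, and image are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 10
  revisionHistoryLimit: 10       # default, shown explicitly
  progressDeadlineSeconds: 600   # default, shown explicitly
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # values from question 4
      maxUnavailable: 0
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25
          # New Pods count as available only once this probe passes.
          readinessProbe:
            httpGet:
              path: /
              port: 80
```

Changing the image tag and running `kubectl rollout status deployment/web` is one way to watch how the two ReplicaSets are scaled against each other.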

---

## Activity 3: Service Discovery and Networking Deep Dive

**Reading Resources:**

- https://kubernetes.io/docs/concepts/services-networking/service/
- https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
- https://kubernetes.io/docs/concepts/cluster-administration/networking/

**Context:** Kubernetes networking follows specific rules: all Pods can communicate with each other, and each Pod gets its own IP. Services provide stable endpoints, but the implementation involves iptables/IPVS rules, DNS, and kube-proxy.

**Questions:**

1. What are the four types of Services? For each type, explain what ClusterIP value it gets.
2. Create a headless Service. How does DNS resolution differ from a regular ClusterIP Service?
3. Use `nslookup` from within a Pod to query a Service. What DNS records are created for a Service?
4. What is the purpose of EndpointSlices? How do they improve upon Endpoints?
5. Create a Service without a selector and manually create Endpoints. When would this pattern be useful?
6. Investigate kube-proxy modes: iptables vs IPVS. Run `iptables-save | grep <service-name>` on a node to see the rules.
7. What is `sessionAffinity` and how does it work? Create a Service with `sessionAffinity: ClientIP` and test the behavior.
8. Explain the difference between `externalTrafficPolicy: Cluster` and `externalTrafficPolicy: Local`. What are the trade-offs?
9. Create an ExternalName Service pointing to an external database. How does DNS resolution work in this case?
10. Advanced: How does the EndpointSlice controller maintain the correlation between Services and EndpointSlices? What happens during a network partition?
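
For questions 2 and 3, it can help to have a regular and a headless Service side by side that select the same Pods. The sketch below assumes Pods labeled `app: web` listening on port 80. The Service names are hypothetical.

```yaml
# Illustrative sketch only: names, labels, and ports are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  # type defaults to ClusterIP; a virtual IP is allocated.
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: web-headless
spec:
  clusterIP: None   # headless: no virtual IP is allocated
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
```

From a Pod in the same namespace, comparing `nslookup web` with `nslookup web-headless` shows the difference in the returned records.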

---

## Activity 4: StatefulSets and Persistent Storage Patterns

**Reading Resources:**

- https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
- https://kubernetes.io/docs/concepts/storage/persistent-volumes/
- https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/

**Context:** StatefulSets provide guarantees about ordering and uniqueness of Pods. Combined with PersistentVolumes, they enable running stateful workloads, but require understanding of storage classes, volume binding, and Pod identity.

**Questions:**

1. What are the three guarantees that StatefulSets provide that Deployments don't?
2. What is the naming pattern for StatefulSet Pods and their PVCs? Create a 3-replica StatefulSet and verify.
3. Explain the difference between `Parallel` and `OrderedReady` pod management policies.
4. How does persistent volume claiming work in StatefulSets? What is `volumeClaimTemplates`?
5. Create a StatefulSet with a `volumeClaimTemplate`. Delete a Pod. What happens to its PVC? What about when you delete the StatefulSet?
6. What is the purpose of the governing Service in a StatefulSet? How does it enable direct Pod addressing?
7. Scale a StatefulSet from 3 to 5 replicas, then back to 3. What happens to the PVCs of Pods 3 and 4?
8. Implement a master-slave database pattern using StatefulSet. How do you identify the master (ordinal 0)?
9. What are the update strategies for StatefulSet? Demonstrate a partition-based canary update.
10. Advanced: How do you handle split-brain scenarios in a StatefulSet-based cluster? Design a solution using init containers and readiness probes.
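
A minimal StatefulSet with its governing headless Service, usable as a base for questions 2, 5, and 7, might look like the sketch below. The names, the `nginx:1.25` image, and the 1Gi volume size are assumptions. It also assumes the cluster has a default StorageClass that can provision `ReadWriteOnce` volumes.

```yaml
# Illustrative sketch only: names, image, and sizes are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  clusterIP: None   # governing (headless) Service
  selector:
    app: nginx
  ports:
    - port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: nginx                 # must reference the governing Service
  replicas: 3
  podManagementPolicy: OrderedReady  # default, shown explicitly
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

Running `kubectl get pods,pvc -l app=nginx` after each scale operation makes the Pod and PVC naming easy to compare.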

---

## Activity 5: RBAC and Security Policies

**Reading Resources:**

- https://kubernetes.io/docs/reference/access-authn-authz/rbac/
- https://kubernetes.io/docs/concepts/security/pod-security-standards/
- https://kubernetes.io/docs/tasks/configure-pod-container/security-context/

**Context:** Kubernetes RBAC is built on four object kinds: Role, ClusterRole, RoleBinding, and ClusterRoleBinding. Bindings grant the permissions in a role to subjects: users, groups, or ServiceAccounts. Understanding the principle of least privilege and how permissions aggregate is crucial for security.

**Questions:**

1. What is the difference between Role/RoleBinding and ClusterRole/ClusterRoleBinding?
2. List the default ClusterRoles in Kubernetes. What permissions does the `edit` role have that `view` doesn't?
3. Create a ServiceAccount that can only list and get Pods in a specific namespace. Write the Role and RoleBinding.
4. What are aggregated ClusterRoles? Find an example in the default roles using label selectors.
5. How do you grant a ServiceAccount permission to create Deployments but not delete them? Write the Role rules.
6. What is the difference between `resourceNames` and `resources` in a Role rule? When would you use `resourceNames`?
7. Create a Pod with a non-root `securityContext`. What capabilities would you drop for a web server Pod?
8. Implement Pod Security Standards at the namespace level. What are the three policy levels?
9. How does impersonation work in Kubernetes? When would an admin use `kubectl --as` or `--as-group`?
10. Advanced: Design an RBAC structure for a multi-tenant cluster with three teams, each with dev/staging/prod namespaces. Include namespace-admin and view-only roles.
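
The relationship between a subject, a Role, and a RoleBinding can be sketched with the read-only case from question 3. The namespace `demo` and the name `pod-reader` are hypothetical, and the namespace is assumed to exist already.

```yaml
# Illustrative sketch only: namespace and names are assumptions.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pod-reader
  namespace: demo
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: demo
rules:
  - apiGroups: [""]        # "" is the core API group, where Pods live
    resources: ["pods"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader
  namespace: demo
subjects:
  - kind: ServiceAccount
    name: pod-reader
    namespace: demo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader
```

One way to check the result is `kubectl auth can-i list pods -n demo --as=system:serviceaccount:demo:pod-reader`, followed by the same command with `delete` in place of `list`.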

---

## Activity 6: ConfigMaps, Secrets, and Configuration Management

**Reading Resources:**

- https://kubernetes.io/docs/concepts/configuration/configmap/
- https://kubernetes.io/docs/concepts/configuration/secret/
- https://kubernetes.io/docs/tasks/configmap-secret/managing-secret-using-kubectl/

**Context:** Externalizing configuration from container images is a cloud-native principle. ConfigMaps and Secrets provide mechanisms for this, but understanding mounting behaviors, updates, and security implications is important.

**Questions:**

1. What are the three ways to consume a ConfigMap in a Pod? Demonstrate each with examples.
2. What happens when you update a ConfigMap that's mounted as a volume? How long does it take to propagate?
3. Create a Secret from literal values, from a file, and from a TLS certificate. How are they stored differently?
4. What is the maximum size of a ConfigMap or Secret? What happens if you exceed it?
5. How are Secrets stored in etcd by default, and how do you enable encryption at rest? What additional security measures can you implement?
6. Create a Pod that uses both `envFrom` and specific `env` variables from the same ConfigMap. What happens with conflicts?
7. What is the difference between `subPath` and `items` when mounting ConfigMaps as volumes?
8. Implement a configuration hot-reload pattern: create a Pod that watches a mounted ConfigMap for changes.
9. How do immutable ConfigMaps and Secrets work? What are the benefits and limitations?
10. Advanced: Design a GitOps-friendly configuration management strategy using ConfigMaps, Secrets, and Kustomize for a microservices application with 5 services across 3 environments.
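
As a base for questions 1, 2, and 6, the sketch below consumes one ConfigMap in two ways: a single key as an environment variable and the whole ConfigMap as a mounted volume. The names, keys, and the `busybox:1.36` image are assumptions.

```yaml
# Illustrative sketch only: names, keys, and image are assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: info
  app.properties: |
    feature.enabled=true
---
apiVersion: v1
kind: Pod
metadata:
  name: config-demo
spec:
  restartPolicy: Never
  containers:
    - name: app
      image: busybox:1.36
      # Prints the env var and the mounted file, then idles for inspection.
      command: ["sh", "-c", "echo $LOG_LEVEL; cat /etc/app/app.properties; sleep 3600"]
      env:
        - name: LOG_LEVEL
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: LOG_LEVEL
      volumeMounts:
        - name: config
          mountPath: /etc/app
  volumes:
    - name: config
      configMap:
        name: app-config
```

Editing the ConfigMap and then re-reading both the variable and the file with `kubectl exec` is a simple way to compare how each consumption method reacts to updates.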

---

## Activity 7: Resource Management and Autoscaling

**Reading Resources:**

- https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
- https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
- https://kubernetes.io/docs/concepts/policy/resource-quotas/

**Context:** Kubernetes resource management operates at multiple levels: container requests/limits, Pod QoS classes, namespace quotas, and node capacity. Understanding how the scheduler and kubelet use these is essential for stability.

**Questions:**

1. What is the difference between requests and limits? How does each affect scheduling and runtime behavior?
2. What are the three QoS classes? Create Pods demonstrating each class and verify with `kubectl describe`.
3. Set up a ResourceQuota in a namespace limiting total CPU to 4 cores and memory to 8Gi. What happens when you exceed it?
4. What is the relationship between HPA, VPA, and Cluster Autoscaler? When would you use each?
5. Create an HPA that scales based on CPU. Generate load and observe the scaling behavior. What is the default scale-down stabilization?
6. How do LimitRanges work? Create one that sets default requests/limits and maximum limits for a namespace.
7. What happens when a container exceeds its memory limit vs its CPU limit? Demonstrate both scenarios.
8. Implement a custom metrics HPA using metrics from your application (e.g., queue length or request rate).
9. How does the OOM killer work in Kubernetes? What determines which container gets killed first?
10. Advanced: Design a resource allocation strategy for a namespace hosting both batch jobs and web services, ensuring batch jobs don't starve web services but can use spare capacity.
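
The container-level and namespace-level controls mentioned in the Context can be tried together with a sketch like the one below. The namespace `demo`, the names, and the specific quantities are assumptions. The quota is expressed on requests here, which is only one reading of question 3, since a quota can also be placed on `limits.cpu` and `limits.memory`.

```yaml
# Illustrative sketch only: namespace, names, and quantities are assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: demo
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: resources-demo
  namespace: demo
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:          # used by the scheduler for placement
          cpu: 250m
          memory: 128Mi
        limits:            # enforced at runtime on the node
          cpu: 500m
          memory: 256Mi
```

`kubectl describe pod resources-demo -n demo` reports the QoS class, and `kubectl describe resourcequota compute-quota -n demo` shows usage against the quota.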

---

## Activity 8: Network Policies and Segmentation

**Reading Resources:**

- https://kubernetes.io/docs/concepts/services-networking/network-policies/
- https://kubernetes.io/docs/tasks/administer-cluster/declare-network-policy/
- https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#networkpolicy-v1-networking-k8s-io

**Context:** NetworkPolicies are implemented by the CNI plugin and provide L3/L4 filtering. Understanding how policies combine, default behaviors, and the differences between ingress/egress rules is crucial for network segmentation.

**Questions:**

1. What is the default network policy behavior in Kubernetes? What changes when you add the first NetworkPolicy?
2. Create a NetworkPolicy that denies all ingress traffic to a namespace. How do you allow specific exceptions?
3. What is the difference between `podSelector` and `namespaceSelector` in policy rules?
4. Write a NetworkPolicy that allows ingress only from Pods with label `app=frontend` OR from namespace `monitoring`.
5. How do you implement egress filtering? Create a policy that allows Pods to only reach specific external IPs.
6. What happens when multiple NetworkPolicies select the same Pod? Do they override or combine?
7. Implement a three-tier application isolation: frontend can talk to backend, backend can talk to database, but frontend cannot reach database directly.
8. How do you debug NetworkPolicies? Use `kubectl describe netpol` and test with curl/wget from different Pods.
9. Create a policy that allows traffic on specific ports. How do you handle both TCP and UDP?
10. Advanced: Design a zero-trust network model for a microservices application where each service can only communicate with its declared dependencies. Include monitoring and DNS exceptions.
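
A common pattern for questions 2 and 7 is a namespace-wide default deny plus narrow allow rules. The sketch below assumes a namespace `demo`, Pods labeled `app: frontend` and `app: backend`, and a backend port of 8080. It also assumes the cluster's CNI plugin enforces NetworkPolicy, because otherwise the objects are accepted but have no effect.

```yaml
# Illustrative sketch only: namespace, labels, and port are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: demo
spec:
  podSelector: {}      # empty selector: applies to every Pod in the namespace
  policyTypes:
    - Ingress          # no ingress rules listed, so all ingress is denied
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: demo
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```

Testing with `wget` from a frontend Pod and from an unlabeled Pod should show which connections the second policy admits.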

---

## Activity 9: Observability and Debugging

**Reading Resources:**

- https://kubernetes.io/docs/tasks/debug/debug-application/
- https://kubernetes.io/docs/tasks/debug/debug-cluster/resource-usage-monitoring/
- https://kubernetes.io/docs/reference/kubectl/cheatsheet/#interacting-with-running-pods

**Context:** Debugging in Kubernetes requires understanding various information sources: events, logs, metrics, and traces. Knowing which tool to use and how to correlate information across sources is essential for troubleshooting.

**Questions:**

1. What are Kubernetes Events? Use `kubectl get events --sort-by='.lastTimestamp'` to find recent issues.
2. How do you debug a Pod stuck in Pending? List five possible causes and how to investigate each.
3. What's the difference between `kubectl logs` and `kubectl logs --previous`? When is each useful?
4. Create a Pod with multiple containers. How do you view logs from a specific container or all containers?
5. What is ephemeral debugging? Use `kubectl debug` to add a debugging container to a running Pod.
6. How do you investigate high memory usage in a Pod? What metrics and commands would you use?
7. Implement distributed tracing context propagation between two services. How do correlation IDs work?
8. What is the purpose of `kubectl port-forward` vs `kubectl proxy`? Demonstrate debugging a service using both.
9. Create a failing Pod (CrashLoopBackOff). Use various kubectl commands to determine the root cause.
10. Advanced: Design a comprehensive observability strategy including metrics (what to measure), logs (what to log), traces (what to trace), and alerts (what to alert on) for a microservices application.
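
For questions 3 and 9, a deliberately failing Pod gives the events and logs something to report. The sketch below is one simple way to produce it. The Pod name, the `busybox:1.36` image, and the error message are made up for the exercise.

```yaml
# Illustrative sketch only: name, image, and message are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: crash-demo
spec:
  restartPolicy: Always   # kubelet keeps restarting the container with back-off
  containers:
    - name: app
      image: busybox:1.36
      # Writes an error to stderr and exits non-zero on every start.
      command: ["sh", "-c", "echo 'fatal: missing config' >&2; exit 1"]
```

Once the Pod has restarted a few times, `kubectl describe pod crash-demo` and `kubectl logs crash-demo --previous` can be compared to see which source exposes the exit code and which exposes the message.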

---

## Activity 10: Kubernetes Architecture and Control Plane

**Reading Resources:**

- https://kubernetes.io/docs/concepts/overview/components/
- https://kubernetes.io/docs/concepts/architecture/control-plane-node-communication/
- https://kubernetes.io/docs/concepts/extend-kubernetes/

**Context:** Understanding the Kubernetes architecture (API server, scheduler, controller manager, etcd, kubelet, and kube-proxy) helps in troubleshooting, capacity planning, and extending Kubernetes.

**Questions:**

1. What are the five main control plane components? What would happen if each one failed?
2. How does the watch mechanism work in Kubernetes? Use `kubectl get pods --watch` and explain the event stream.
3. What is the role of etcd? How do you determine what's stored for a specific resource?
4. Explain the scheduler's decision process. What are predicates and priorities (now implemented as Filter and Score plugins in the scheduling framework)?
5. How does kubelet register a node? What information does it report in the node status?
6. What is the controller pattern? List five built-in controllers and what resources they manage.
7. How does admission control work? What's the difference between validating and mutating webhooks?
8. Create a simple Custom Resource Definition. How does it extend the Kubernetes API?
9. What happens during a `kubectl apply`? Trace the request from client to etcd storage.
10. Advanced: Design a highly available Kubernetes control plane. How do you handle etcd leader election, API server load balancing, and split-brain scenarios?
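
For question 8, a minimal CustomResourceDefinition and one instance of it could look like the sketch below. The group `example.com`, the kind `Widget`, and the `size` field are all hypothetical.

```yaml
# Illustrative sketch only: group, kind, and schema are assumptions.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com   # must be <plural>.<group>
spec:
  group: example.com
  scope: Namespaced
  names:
    plural: widgets
    singular: widget
    kind: Widget
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                size:
                  type: integer
---
# A custom resource using the new API; apply it after the CRD is established.
apiVersion: example.com/v1
kind: Widget
metadata:
  name: sample
spec:
  size: 3
```

Applying the two documents in separate steps avoids a race in which the API server has not yet begun serving the new type. Afterwards `kubectl get widgets` and `kubectl api-resources | grep widgets` show how the API surface has grown.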