# Kubernetes issue - ContainerStatusUnknown

###### tags: `kubernetes`

I have a virtual Kubernetes environment running version 1.23, and one of its pods is stuck in the status `ContainerStatusUnknown`. Let's examine the issue to understand what happened and why.

Update: this has happened twice on the same node; I suspect memory pressure (OOM) is involved.

```
root@worker2:~# free -m
              total        used        free      shared  buff/cache   available
Mem:           1983        1147         113           3         722         646
Swap:             0           0           0
```

#### ContainerStatusUnknown

A pod in the "demo" namespace is in `ContainerStatusUnknown`.

```
ubuntu@master:~$ kubectl get pod ubuntu2 -n demo -o wide
NAME      READY   STATUS                   RESTARTS   AGE   IP             NODE      NOMINATED NODE   READINESS GATES
ubuntu2   0/1     ContainerStatusUnknown   1          23h   10.85.189.70   worker2   <none>           <none>
```

```
ubuntu@master:~$ kubectl get pod -n demo ubuntu2 -o yaml | grep -i phase
  phase: Failed
```

Get the pod UID:

```
ubuntu@master:~$ kubectl get pod -n demo ubuntu2 -o jsonpath='{.metadata.uid}'
798371fc-09a8-4740-b39a-8263da43d881
```

Check the pod details:

```
ubuntu@master:~$ kubectl describe pod ubuntu2 -n demo
Name:         ubuntu2
Namespace:    demo
Priority:     0
Node:         worker2/192.168.122.12
Start Time:   Thu, 26 May 2022 11:17:33 +1000
Labels:       app=ubuntu2
Annotations:  cni.projectcalico.org/containerID: bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da
              cni.projectcalico.org/podIP:
              cni.projectcalico.org/podIPs:
Status:       Failed
Reason:       Evicted
Message:      The node was low on resource: ephemeral-storage. Container ubuntu was using 35684Ki, which exceeds its request of 0.
IP:           10.85.189.70
IPs:
  IP:  10.85.189.70
Containers:
  ubuntu:
    Container ID:
    Image:         ubuntu:latest
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sleep
      3650d
    State:          Terminated
      Reason:       ContainerStatusUnknown
      Message:      The container could not be located when the pod was terminated
      Exit Code:    137
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Last State:     Terminated
      Reason:       ContainerStatusUnknown
      Message:      The container could not be located when the pod was deleted. The container used to be Running
      Exit Code:    137
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:          False
    Restart Count:  1
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9g5wv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-9g5wv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>
```

> In the describe output, the container ID, image ID, etc. are empty.
> This is probably because the container has already been deleted in CRI-O. But when and why was the pod itself not deleted?
> The last-state message even says the container could not be located "when the pod was deleted", so was the pod deleted at some point?
> The pod status message says the node was low on ephemeral-storage, so the kubelet on that node may have stopped (evicted) the container.
> However, there are no events left on the pod, so the events alone cannot prove that the container was stopped by the kubelet.
> Exit code 137 means the process was killed with SIGKILL (128 + 9), which is what both an eviction and an out-of-memory kill produce. Maybe that is why the container terminated?
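The pod's `Events` list is most likely empty because the API server's default event TTL is one hour and the pod is already 23h old, so the eviction events have expired. Before logging in to the node, it is worth checking what the API side still shows. A quick sketch of those checks (the grep patterns are just convenience filters over the `kubectl describe node` output, not exact field names):

```
# Pod events expire (default event TTL is 1h), so check what is still recorded:
kubectl get events -n demo --field-selector involvedObject.name=ubuntu2

# Check the node's pressure conditions and ephemeral-storage capacity from the API side:
kubectl describe node worker2 | grep -i -E -A1 "DiskPressure|MemoryPressure"
kubectl describe node worker2 | grep -i "ephemeral-storage"
```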
Check the logs on the worker2 node (both CRI-O and the kubelet log to syslog there).

```
root@worker2:/var/log# grep -re "10.85.189.70" ./syslog.1 | wc -l
6
root@worker2:/var/log# grep -re "10.85.189.70" ./syslog.1 | grep "ContainerID" | head -n 1
May 26 11:17:35 worker2 crio[589]: 2022-05-26 11:17:35.477 [INFO][1075037] ipam_plugin.go 284: Calico CNI IPAM assigned addresses IPv4=[10.85.189.70/26] IPv6=[] ContainerID="bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" HandleID="k8s-pod-network.bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" Workload="worker2-k8s-ubuntu2-eth0"
```

> Got container ID "bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da".
> It matches the cni.projectcalico.org/containerID annotation in the describe output, so this is the pod sandbox ID.

Check the logs again to find out when the container was deleted.

```
May 26 11:17:59 worker2 crio[589]: time="2022-05-26 11:17:59.790680442+10:00" level=info msg="Started container" PID=1075244 containerID=3f46b70cb2d355b558c7dc64f42cbfcd2329d808a946f2067c200571c53abc41 description=demo/ubuntu2/ubuntu id=a84ec1d4-5eef-403a-8535-07dc6e6163de name=/runtime.v1.RuntimeService/StartContainer sandboxID=bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da
```

> The container (3f46b70c...) was started at 11:17 inside that sandbox.

```
May 26 21:01:28 worker2 crio[589]: time="2022-05-26 21:01:28.760061929+10:00" level=info msg="Stopping pod sandbox: bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" id=afd4efd8-57e5-4e31-8280-c23b60a4908e name=/runtime.v1.RuntimeService/StopPodSandbox
May 26 21:01:28 worker2 crio[589]: time="2022-05-26 21:01:28.890135955+10:00" level=info msg="Got pod network &{Name:ubuntu2 Namespace:demo ID:bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da UID:798371fc-09a8-4740-b39a-8263da43d881 NetNS:/var/run/netns/7be644af-bdcc-443b-afbd-624c85d47d53 Networks:[{Name:k8s-pod-network Ifname:eth0}] RuntimeConfig:map[k8s-pod-network:{IP: MAC: PortMappings:[] Bandwidth:<nil> IpRanges:[]}] Aliases:map[]}"
May 26 21:01:30 worker2 crio[589]: 2022-05-26 21:01:29.744 [INFO][1372241] k8s.go 572: Cleaning up netns ContainerID="bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da"
May 26 21:01:30 worker2 crio[589]: 2022-05-26 21:01:29.758 [INFO][1372241] dataplane_linux.go 506: Calico CNI deleting device in netns /var/run/netns/7be644af-bdcc-443b-afbd-624c85d47d53 ContainerID="bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da"
May 26 21:01:30 worker2 crio[589]: 2022-05-26 21:01:29.784 [INFO][1372241] dataplane_linux.go 523: Calico CNI deleted device in netns /var/run/netns/7be644af-bdcc-443b-afbd-624c85d47d53 ContainerID="bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da"
May 26 21:01:30 worker2 crio[589]: 2022-05-26 21:01:29.784 [INFO][1372241] k8s.go 579: Releasing IP address(es) ContainerID="bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da"
May 26 21:01:30 worker2 crio[589]: 2022-05-26 21:01:29.784 [INFO][1372241] utils.go 196: Calico CNI releasing IP address ContainerID="bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da"
May 26 21:01:30 worker2 crio[589]: 2022-05-26 21:01:29.870 [INFO][1372255] ipam_plugin.go 415: Releasing address using handleID ContainerID="bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" HandleID="k8s-pod-network.bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" Workload="worker2-k8s-ubuntu2-eth0"
May 26 21:01:30 worker2 crio[589]: 2022-05-26 21:01:30.360 [INFO][1372255] ipam_plugin.go 434: Released address using handleID ContainerID="bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" HandleID="k8s-pod-network.bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" Workload="worker2-k8s-ubuntu2-eth0"
May 26 21:01:30 worker2 crio[589]: 2022-05-26 21:01:30.360 [INFO][1372255] ipam_plugin.go 443: Releasing address using workloadID ContainerID="bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" HandleID="k8s-pod-network.bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" Workload="worker2-k8s-ubuntu2-eth0"
May 26 21:01:30 worker2 crio[589]: 2022-05-26 21:01:30.365 [INFO][1372241] k8s.go 585: Teardown processing complete. ContainerID="bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da"
May 26 21:01:30 worker2 systemd[3528991]: run-containers-storage-overlay\x2dcontainers-bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da-userdata-shm.mount: Succeeded.
May 26 21:01:30 worker2 systemd[1]: run-containers-storage-overlay\x2dcontainers-bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da-userdata-shm.mount: Succeeded.
May 26 21:01:30 worker2 crio[589]: time="2022-05-26 21:01:30.469889194+10:00" level=info msg="Stopped pod sandbox: bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" id=afd4efd8-57e5-4e31-8280-c23b60a4908e name=/runtime.v1.RuntimeService/StopPodSandbox
May 26 21:01:31 worker2 crio[589]: time="2022-05-26 21:01:31.302373199+10:00" level=info msg="Stopping pod sandbox: bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" id=dfcf3e2a-aba6-4521-b065-0a03103ef7e0 name=/runtime.v1.RuntimeService/StopPodSandbox
May 26 21:01:31 worker2 crio[589]: time="2022-05-26 21:01:31.302408765+10:00" level=info msg="Stopped pod sandbox (already stopped): bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" id=dfcf3e2a-aba6-4521-b065-0a03103ef7e0 name=/runtime.v1.RuntimeService/StopPodSandbox
May 26 21:01:31 worker2 crio[589]: time="2022-05-26 21:01:31.302700347+10:00" level=info msg="Removing pod sandbox: bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" id=f611266c-55e0-4f1c-9471-4ab3622a973e name=/runtime.v1.RuntimeService/RemovePodSandbox
May 26 21:01:31 worker2 crio[589]: time="2022-05-26 21:01:31.454474319+10:00" level=info msg="Removed pod sandbox: bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da" id=f611266c-55e0-4f1c-9471-4ab3622a973e name=/runtime.v1.RuntimeService/RemovePodSandbox
```

> The pod sandbox was stopped and removed at 21:01. Nothing looks abnormal in the CRI-O logs themselves.
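To tell whether the container was killed by the kubelet's eviction manager or by the kernel OOM killer (the update at the top suspects OOM, and exit code 137 fits either case), the node itself can be checked directly. A sketch of those checks on worker2, assuming the kubelet logs to syslog like CRI-O does on this node (exact message wording varies between versions):

```
# Kernel OOM kills leave a trace in the kernel log; evictions do not.
dmesg -T | grep -i -E "out of memory|killed process"

# Kubelet evictions show up as eviction_manager entries in the kubelet log.
grep -i "eviction_manager" /var/log/syslog /var/log/syslog.1 | grep ubuntu2

# Confirm the sandbox and container really are gone from CRI-O's point of view.
crictl pods --name ubuntu2
crictl ps -a | grep -i ubuntu
```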
Next, the kubelet log entries for the pod. There is a warning about failing to watch the container's cgroup shortly after it started, but everything looks normal at pod-creation time.

```
May 26 11:17:33 worker2 kubelet[1499524]: I0526 11:17:33.129861 1499524 topology_manager.go:200] "Topology Admit Handler"
May 26 11:17:33 worker2 kubelet[1499524]: I0526 11:17:33.503315 1499524 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-9g5wv\" (UniqueName: \"kubernetes.io/projected/798371fc-09a8-4740-b39a-8263da43d881-kube-api-access-9g5wv\") pod \"ubuntu2\" (UID: \"798371fc-09a8-4740-b39a-8263da43d881\") " pod="demo/ubuntu2"
May 26 11:17:36 worker2 kubelet[1499524]: W0526 11:17:36.115343 1499524 manager.go:1176] Failed to process watch event {EventType:0 Name:/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod798371fc_09a8_4740_b39a_8263da43d881.slice/crio-bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da.scope WatchSource:0}: Error finding container bf6758f68f1862d383bff8a1707d739cca7deff4aedd5953cb40e63df19054da: Status 404 returned error &{%!s(*http.body=&{0xc00012d4b8 <nil> <nil> false false {0 0} false false false <nil>}) {%!s(int32=0) %!s(uint32=0)} %!s(bool=false) <nil> %!s(func(error) error=0x840ee0) %!s(func() error=0x840fe0)}
```

Check the kubelet logs in more detail.

![](https://i.imgur.com/LCWTWWQ.png)

> The eviction manager logs show the pod at first failing to evict, then being reported as evicted and waiting for cleanup.
> The logs on the master node show nothing special.
> Because a bare Pod is the lowest-level workload object, it is not recreated automatically after a failure like this; it has to be cleaned up and recreated manually (see the sketch after the references).

#### Reference

https://bugzilla.redhat.com/show_bug.cgi?id=1734524
https://www.cnblogs.com/chuanzhang053/p/16048622.html
https://morethink.xyz/i/0899009936416
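Since an evicted bare pod stays in phase `Failed` until someone deletes it, and the root cause here was ephemeral-storage pressure on a BestEffort pod, a minimal follow-up sketch: delete the failed pod and recreate it with explicit resource requests and limits. The manifest mirrors the original pod from the describe output (ubuntu image, `/bin/sleep 3650d`); the request/limit sizes are made-up example values, not measured ones.

```
# The Failed pod is not restarted by any controller; remove it explicitly.
kubectl delete pod ubuntu2 -n demo
# Or clean up every Failed pod in the namespace in one go (use with care).
kubectl delete pod -n demo --field-selector=status.phase=Failed

# Recreate it with memory and ephemeral-storage requests so it is no longer
# a BestEffort pod whose storage usage "exceeds its request of 0".
kubectl apply -n demo -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu2
  labels:
    app: ubuntu2
spec:
  containers:
  - name: ubuntu
    image: ubuntu:latest
    command: ["/bin/sleep", "3650d"]
    resources:
      requests:
        memory: "64Mi"              # example value
        ephemeral-storage: "100Mi"  # example value
      limits:
        ephemeral-storage: "500Mi"  # kubelet evicts the pod if it writes more than this
EOF
```

Managing the workload with a Deployment instead of a bare Pod would also let a controller recreate it automatically after an eviction.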