# OCPBUGS-7615 persisting pending install state for operator
## Attempted timeline visualization (mermaid)
Sequence dump taken from:
```shell=
./omc logs -n openshift-operator-lifecycle-manager catalog-operator-76c4c9dd94-dt25z
```
```mermaid
gantt
title events
dateFormat HH:mm:ss.SSS
axisFormat %H:%M:%S
section operators installation (to complete phase)
%% 07:33:56.803546 to 07:36:49.748259216
vran-accleration-operators : vran, 07:33:56.803, 173s
%% 07:33:52.794738 to 07:36:27.545406586
openshift-ptp : optp, 07:33:52.794, 155s
%% 07:33:47.196941 to 07:36:49.748259216
openshift-local-storage : local-storage, 07:33:47.196, 182s
%% 07:33:31.596321515 to 07:38:14.666338065
openshift-storage : storage, 07:33:31.596, 283s
%% 07:33:46.620067 to 07:36:41.145978467
openshift-sriov-network-operator : sriov, 07:33:46.620, 175s
section community-operator-index
%% ends 07:34:52.194173860
service not available in DNS : active, c1, 07:33:31.596, 81s
%% 07:35:56.833239417 to 07:36:15.333047883
api not available : c2, 07:35:56.833, 19s
%% 07:36:25.393697240 to 07:36:54.748680918
object has been modified : c3, 07:36:25.393, 29s
%% starts 07:36:54.748680918
catalog servicing requests : c4, 07:36:54.748, 100s
section catalog-pods
redhat-operator-index-stmfx : 07:34:43.057, 10s
community-operator-index-scf6t : 07:34:42.779, 10s
certified-operator-index-wwrgp : 07:34:42.714, 10s
```
The diagram still doesn't render quite right, but the key observation stands: the catalog pods come into service (info logs announcing availability of the gRPC interface) roughly 71 seconds after the first installation attempt starts at 07:33:31.596.
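As a cross-check on the durations in the chart, here is a minimal Python sketch that recomputes them from the start/end timestamps recorded in the chart source. The helper names (`parse_clock`, `elapsed_seconds`) are made up for this note, and the fractional seconds are truncated to microseconds because Python's `datetime` does not carry nanoseconds.

```python
from datetime import datetime


def parse_clock(ts: str) -> datetime:
    """Parse HH:MM:SS.fraction, truncating the fraction to microseconds."""
    hms, _, frac = ts.partition(".")
    frac = (frac + "000000")[:6]
    return datetime.strptime(f"{hms}.{frac}", "%H:%M:%S.%f")


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two same-day wall-clock timestamps."""
    return (parse_clock(end) - parse_clock(start)).total_seconds()


# Start/end pairs copied from the timestamps noted in the gantt source.
INSTALLS = {
    "vran-accleration-operators": ("07:33:56.803546", "07:36:49.748259216"),
    "openshift-ptp": ("07:33:52.794738", "07:36:27.545406586"),
    "openshift-local-storage": ("07:33:47.196941", "07:36:49.748259216"),
    "openshift-storage": ("07:33:31.596321515", "07:38:14.666338065"),
    "openshift-sriov-network-operator": ("07:33:46.620067", "07:36:41.145978467"),
}

DURATIONS = {name: elapsed_seconds(*span) for name, span in INSTALLS.items()}

if __name__ == "__main__":
    for name, secs in DURATIONS.items():
        print(f"{name}: {secs:.1f}s")
```

With these inputs, `openshift-storage` comes out at about 283 s (4 min 43 s), and the gap between the earliest install start and the `redhat-operator-index-stmfx` entry is about 71.5 s.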
## Attempted timeline visualization (Google Sheets)
Spreadsheet version of the same timeline: https://docs.google.com/spreadsheets/d/1PW8vlmaoOzGByTQRwa0SWB_5UXbtEUe1rBfBt6lFR9A/
## Diagnostics
`k get csv -A`
shows that `sriov-network-operator.v4.12.0-202301062016` is the only CSV in the `Pending` phase.
`k describe csv -n openshift-sriov-network-operator sriov-network-operator.v4.12.0-202301062016`
shows a `RequirementsNotMet` reason under `Status.Conditions`, and the Requirement Status section reports that none of the 6 CRDs is installed and none of the RBAC has been configured.
`k get clusterroles -A` shows that the SA for the sriov operator has not been created.
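The same check can be scripted against JSON output. The sketch below is only an illustration: it assumes the input has the shape of `kubectl get csv -A -o json` (an `items` list whose entries carry `status.phase` and `status.requirementStatus`), the function name `unmet_requirements` is made up for this note, and the sample requirement names (`example-crd`, `example-sa`) are placeholders, not values taken from the cluster.

```python
import json


def unmet_requirements(csv_list: dict) -> dict:
    """Map 'namespace/name' of each non-Succeeded CSV to its requirements that are not Present."""
    report = {}
    for item in csv_list.get("items", []):
        status = item.get("status", {})
        if status.get("phase") == "Succeeded":
            continue
        key = f'{item["metadata"]["namespace"]}/{item["metadata"]["name"]}'
        report[key] = [
            f'{req.get("kind")}/{req.get("name")}: {req.get("status")}'
            for req in status.get("requirementStatus", [])
            if req.get("status") != "Present"
        ]
    return report


# Hypothetical, trimmed input: only the CSV name and namespace come from this bug.
SAMPLE = {"items": [
    {"metadata": {"namespace": "openshift-sriov-network-operator",
                  "name": "sriov-network-operator.v4.12.0-202301062016"},
     "status": {"phase": "Pending", "requirementStatus": [
         {"kind": "CustomResourceDefinition", "name": "example-crd", "status": "NotPresent"},
         {"kind": "ServiceAccount", "name": "example-sa", "status": "NotPresent"},
     ]}},
    {"metadata": {"namespace": "example-ns", "name": "example-operator.v1"},
     "status": {"phase": "Succeeded"}},
]}

REPORT = unmet_requirements(SAMPLE)

if __name__ == "__main__":
    print(json.dumps(REPORT, indent=2))
```

Against a live cluster, the dictionary would instead be loaded with `json.load(sys.stdin)` from the piped command output.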
# Kube API server status
From: tshort@redhat.com
It appears that the kube-apiserver terminated itself while installation was still in progress. Looking at the pods in the `openshift-kube-apiserver` namespace:
```
$ omc get pods -n openshift-kube-apiserver
NAME                                                                     READY   STATUS      RESTARTS   AGE
installer-2-master0.b11oe13sl1.dyn.onebts.espoo.nsn-rdnet.net           0/1     Failed      0          5h42m
installer-2-retry-1-master0.b11oe13sl1.dyn.onebts.espoo.nsn-rdnet.net   0/1     Failed      0          5h38m
installer-2-retry-2-master0.b11oe13sl1.dyn.onebts.espoo.nsn-rdnet.net   0/1     Completed   0          5h35m
installer-5-master0.b11oe13sl1.dyn.onebts.espoo.nsn-rdnet.net           0/1     Completed   0          5h33m
installer-7-master0.b11oe13sl1.dyn.onebts.espoo.nsn-rdnet.net           0/1     Completed   0          5h31m
installer-8-master0.b11oe13sl1.dyn.onebts.espoo.nsn-rdnet.net           0/1     Completed   0          5h29m
kube-apiserver-master0.b11oe13sl1.dyn.onebts.espoo.nsn-rdnet.net        5/5     Running     0          5h28m
```
The CSV stuck in `Pending` has the following status timestamps:
```
conditions:
- lastTransitionTime: "2023-02-14T07:36:24Z"
  lastUpdateTime: "2023-02-14T07:36:24Z"
  message: requirements not yet checked
  phase: Pending
  reason: RequirementsUnknown
- lastTransitionTime: "2023-02-14T07:36:24Z"
  lastUpdateTime: "2023-02-14T07:36:25Z"
  message: one or more requirements couldn't be found
  phase: Pending
  reason: RequirementsNotMet
lastTransitionTime: "2023-02-14T07:36:24Z"
lastUpdateTime: "2023-02-14T07:36:25Z"
```
But the kube-apiserver log only starts at 07:39:46, more than three minutes after the CSV's last update at 07:36:25:
```
2023-02-14T07:39:46.547370907Z flock: getting lock took 0.000004 seconds
2023-02-14T07:39:46.547430375Z Copying system trust bundle ...
2023-02-14T07:39:46.558270420Z I0214 07:39:46.558141 1 loader.go:374] Config loaded from file: /etc/kubernetes/static-pod-resources/configmaps/kube-apiserver-cert-syncer-kubeconfig/kubeconfig
2023-02-14T07:39:46.558614084Z Copying termination logs to "/var/log/kube-apiserver/termination.log"
2023-02-14T07:39:46.558614084Z I0214 07:39:46.558574 1 main.go:161] Touching termination lock file "/var/log/kube-apiserver/.terminating"
```
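To put a number on that gap, here is a small Python sketch comparing the CSV's last `lastUpdateTime` with the first kube-apiserver log line. The helper `parse_rfc3339` is made up for this note and truncates nanoseconds to microseconds; both timestamps are copied from the output above.

```python
from datetime import datetime, timezone


def parse_rfc3339(ts: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp, truncating nanoseconds to microseconds."""
    date_time, _, frac = ts.rstrip("Z").partition(".")
    frac = (frac + "000000")[:6]
    parsed = datetime.strptime(f"{date_time}.{frac}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


CSV_LAST_UPDATE = "2023-02-14T07:36:25Z"                # lastUpdateTime of the Pending CSV
APISERVER_FIRST_LOG = "2023-02-14T07:39:46.547370907Z"  # first kube-apiserver log line

GAP_SECONDS = (
    parse_rfc3339(APISERVER_FIRST_LOG) - parse_rfc3339(CSV_LAST_UPDATE)
).total_seconds()

if __name__ == "__main__":
    print(f"gap: {GAP_SECONDS:.1f}s")
```

The gap works out to about 201.5 s, so the CSV status was last written roughly 3 min 21 s before this kube-apiserver instance began logging.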
Looking at the `openshift-kube-apiserver-operator` logs via:
```
omc logs -p -n openshift-kube-apiserver-operator kube-apiserver-operator-67fd98d7b4-p2nsn
```
We see:
```
2023-02-14T07:39:34.648803684Z I0214 07:39:34.648683 1 termination_observer.go:236] Observed event "TerminationPreShutdownHooksFinished" for API server pod "kube-apiserver-master0.b11oe13sl1.dyn.onebts.espoo.nsn-rdnet.net" (last termination at 2023-02-14 07:37:59 +0000 UTC) at 0001-01-01 00:00:00 +0000 UTC
2023-02-14T07:39:34.653234752Z I0214 07:39:34.653047 1 installer_controller.go:512] "master0.b11oe13sl1.dyn.onebts.espoo.nsn-rdnet.net" is in transition to 8, but has not made progress because installer is not finished, but in Running phase
2023-02-14T07:39:36.652396780Z I0214 07:39:36.652285 1 termination_observer.go:236] Observed event "TerminationGracefulTerminationFinished" for API server pod "kube-apiserver-master0.b11oe13sl1.dyn.onebts.espoo.nsn-rdnet.net" (last termination at 2023-02-14 07:37:59 +0000 UTC) at 0001-01-01 00:00:00 +0000 UTC
2023-02-14T07:39:41.274601418Z E0214 07:39:41.274508 1 base_controller.go:272] auditPolicyController reconciliation failed: Get "https://172.22.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/kube-apiserver-audit-policies": dial tcp 172.22.0.1:443: connect: connection refused
2023-02-14T07:39:41.281138759Z E0214 07:39:41.281038 1 base_controller.go:272] auditPolicyController reconciliation failed: Get "https://172.22.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/kube-apiserver-audit-policies": dial tcp 172.22.0.1:443: connect: connection refused
2023-02-14T07:39:41.292627662Z E0214 07:39:41.292546 1 base_controller.go:272] auditPolicyController reconciliation failed: Get "https://172.22.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/kube-apiserver-audit-policies": dial tcp 172.22.0.1:443: connect: connection refused
2023-02-14T07:39:41.314055901Z E0214 07:39:41.313974 1 base_controller.go:272] auditPolicyController reconciliation failed: Get "https://172.22.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/kube-apiserver-audit-policies": dial tcp 172.22.0.1:443: connect: connection refused
2023-02-14T07:39:41.355383433Z E0214 07:39:41.355281 1 base_controller.go:272] auditPolicyController reconciliation failed: Get "https://172.22.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/kube-apiserver-audit-policies": dial tcp 172.22.0.1:443: connect: connection refused
2023-02-14T07:39:41.436803783Z E0214 07:39:41.436724 1 base_controller.go:272] auditPolicyController reconciliation failed: Get "https://172.22.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/kube-apiserver-audit-policies": dial tcp 172.22.0.1:443: connect: connection refused
2023-02-14T07:39:41.598429197Z E0214 07:39:41.598346 1 base_controller.go:272] auditPolicyController reconciliation failed: Get "https://172.22.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/kube-apiserver-audit-policies": dial tcp 172.22.0.1:443: connect: connection refused
2023-02-14T07:39:41.920612011Z E0214 07:39:41.920522 1 base_controller.go:272] auditPolicyController reconciliation failed: Get "https://172.22.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/kube-apiserver-audit-policies": dial tcp 172.22.0.1:443: connect: connection refused
2023-02-14T07:39:42.562202646Z E0214 07:39:42.562113 1 base_controller.go:272] auditPolicyController reconciliation failed: Get "https://172.22.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/kube-apiserver-audit-policies": dial tcp 172.22.0.1:443: connect: connection refused
2023-02-14T07:39:43.844649871Z E0214 07:39:43.844554 1 base_controller.go:272] auditPolicyController reconciliation failed: Get "https://172.22.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/kube-apiserver-audit-policies": dial tcp 172.22.0.1:443: connect: connection refused
2023-02-14T07:39:46.406592192Z E0214 07:39:46.406493 1 base_controller.go:272] auditPolicyController reconciliation failed: Get "https://172.22.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/kube-apiserver-audit-policies": dial tcp 172.22.0.1:443: connect: connection refused
2023-02-14T07:40:09.854387143Z I0214 07:40:09.854261 1 servicehostname.go:40] syncing servicenetwork hostnames: [172.22.0.1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local openshift openshift.default openshift.default.svc openshift.default.svc.cluster.local]
```
The operator log shows the API server terminating at `2023-02-14T07:39:34.648803684Z`:
```
termination_observer.go:236] Observed event "TerminationPreShutdownHooksFinished" for API server pod "kube-apiserver-master0.b11oe13sl1.dyn.onebts.espoo.nsn-rdnet.net" (last termination at 2023-02-14 07:37:59 +0000 UTC) at 00
```
And connections are refused:
```
base_controller.go:272] auditPolicyController reconciliation failed: Get "https://172.22.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/kube-apiserver-audit-policies": dial tcp 172.22.0.1:443: connect: connection refused
```
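The spacing of the `connection refused` errors is itself informative. The Python sketch below pulls the klog wall-clock time out of each matching line and reports the gap between consecutive errors. The function name `retry_gaps_us` is made up for this note, and the sample lines are abbreviated (only the klog times are copied from the operator log above).

```python
import re
from datetime import datetime, timedelta

# klog header: severity letter, MMDD, then wall-clock time with microseconds.
KLOG_TIME = re.compile(r"\b[IWEF]\d{4} (\d{2}:\d{2}:\d{2}\.\d{6})\b")


def retry_gaps_us(lines, needle="connection refused"):
    """Return the gaps, in whole microseconds, between consecutive matching log lines."""
    times = []
    for line in lines:
        match = KLOG_TIME.search(line)
        if match and needle in line:
            times.append(datetime.strptime(match.group(1), "%H:%M:%S.%f"))
    return [(b - a) // timedelta(microseconds=1) for a, b in zip(times, times[1:])]


# klog times of the eleven auditPolicyController errors; message text abbreviated.
ERROR_TIMES = [
    "07:39:41.274508", "07:39:41.281038", "07:39:41.292546", "07:39:41.313974",
    "07:39:41.355281", "07:39:41.436724", "07:39:41.598346", "07:39:41.920522",
    "07:39:42.562113", "07:39:43.844554", "07:39:46.406493",
]
SAMPLE_LINES = [
    f"E0214 {t} 1 base_controller.go:272] ... connect: connection refused"
    for t in ERROR_TIMES
]

GAPS_US = retry_gaps_us(SAMPLE_LINES)

if __name__ == "__main__":
    print([f"{gap / 1000:.1f}ms" for gap in GAPS_US])
```

The gaps grow from about 6.5 ms to about 2.56 s, roughly doubling each time, which looks like a controller retrying with exponential backoff. The errors span 07:39:41.274 to 07:39:46.406, so the API endpoint was refusing connections for at least those five seconds.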