#### Prow Status
* [Operators - Check/Gate failures](https://prow.ci.openshift.org/?job=pull-ci-openstack-k8s-operators*&state=failure)
* For now, we only investigate a test when it fails repeatedly across different PRs
* [Operators - Check/Gate status](https://prow.ci.openshift.org/?job=pull-ci-openstack-k8s-operators*)
* [pre-commit jobs failures](https://prow.ci.openshift.org/?job=pull-ci-openstack-k8s-operators*-main-precommit-check&state=failure)
* [pre-commit jobs status](https://prow.ci.openshift.org/?job=pull-ci-openstack-k8s-operators*-main-precommit-check)
* [unit tests jobs failures](https://prow.ci.openshift.org/?job=pull-ci-openstack-k8s-operators*-operator-main-unit&state=failure)
* [unit test jobs status](https://prow.ci.openshift.org/?job=pull-ci-openstack-k8s-operators*-operator-main-unit)
* [kuttl tests jobs failures](https://prow.ci.openshift.org/?job=pull-ci-openstack-k8s-operators*-operator-build-deploy-kuttl&state=failure)
* [kuttl tests jobs status](https://prow.ci.openshift.org/?job=pull-ci-openstack-k8s-operators*-operator-build-deploy-kuttl)
* [Tempest jobs failures](https://prow.ci.openshift.org/?job=pull-ci-openstack-k8s-operators*-operator-build-deploy-tempest&state=failure)
* [Tempest jobs status](https://prow.ci.openshift.org/?job=pull-ci-openstack-k8s-operators*-operator-build-deploy-tempest)
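
The links above all follow one pattern: a `job` glob plus an optional `state` filter. As a minimal sketch, assuming only that pattern, the helper below rebuilds them for any job suffix. The names `prow_url`, `JOB_SUFFIXES`, and `JOB_PREFIX` are illustrative and not part of Prow.

```python
from urllib.parse import urlencode

PROW_BASE = "https://prow.ci.openshift.org/"
JOB_PREFIX = "pull-ci-openstack-k8s-operators*"

# Job name suffixes taken from the links in this section.
JOB_SUFFIXES = {
    "pre-commit": "-main-precommit-check",
    "unit": "-operator-main-unit",
    "kuttl": "-operator-build-deploy-kuttl",
    "tempest": "-operator-build-deploy-tempest",
}


def prow_url(job_glob, state=None):
    """Build a Prow dashboard URL for a job glob and optional state filter."""
    params = {"job": job_glob}
    if state is not None:
        params["state"] = state
    # safe="*" keeps the glob wildcard unescaped, as in the links above.
    return PROW_BASE + "?" + urlencode(params, safe="*")


if __name__ == "__main__":
    for name, suffix in JOB_SUFFIXES.items():
        print(name, prow_url(JOB_PREFIX + suffix, state="failure"))
```

Calling `prow_url(JOB_PREFIX, "failure")` reproduces the "Check/Gate failures" link; leaving out `state` gives the corresponding status link.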
### How to debug a Prow job failure
* precommit
  * artifacts -> build.log
  * usually linter-type errors in make-operator-lint or golangci-lint
* build/deploy
  * artifacts -> build.log
  * search for ...
* kuttl
  * check whether it is actually a kuttl failure; the build/deploy step may have failed instead
  * look in artifacts -> -operator-build-deploy-kuttl and check whether a kuttl log exists at all
  * [example of real kuttl failure](https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/40096/rehearse-40096-pull-ci-openstack-k8s-operators-openstack-operator-main-openstack-operator-build-deploy-kuttl/1673755615094116352)
  * look for the error in openstack-k8s-operators-kuttl/build-log.txt
  * look for the job definition in https://github.com/openshift/release
    * https://github.com/openshift/release/tree/master/ci-operator/step-registry/openstack-k8s-operators/kuttl
    * example: https://github.com/openshift/release/blob/master/ci-operator/jobs/openstack-k8s-operators/glance-operator/openstack-k8s-operators-glance-operator-main-presubmits.yaml#L118 (all in the precommit definition, in the operator folder)
    * also in the config: https://github.com/openshift/release/blob/master/ci-operator/config/openstack-k8s-operators/glance-operator/openstack-k8s-operators-glance-operator-main.yaml#L96
  * to find the per-operator test definition, look in the operator repo, for example: https://github.com/openstack-k8s-operators/keystone-operator/tree/main/tests/kuttl/tests/change_keystone_config
* tempest
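
The kuttl triage above can be sketched as a small script. It assumes you already have a list of artifact paths and the text of a build log on hand. The function names and sample paths are hypothetical; only the `openstack-k8s-operators-kuttl/build-log.txt` location comes from the steps above. Finding the kuttl log only shows that the kuttl step ran, so its content still needs reading.

```python
KUTTL_LOG_SUFFIX = "openstack-k8s-operators-kuttl/build-log.txt"


def kuttl_step_ran(artifact_paths):
    """Return True if any artifact path is the kuttl step's build log."""
    return any(path.endswith(KUTTL_LOG_SUFFIX) for path in artifact_paths)


def find_error_lines(log_text):
    """Return (line_number, line) pairs whose text mentions 'error'."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if "error" in line.lower()
    ]


if __name__ == "__main__":
    # Hypothetical artifact listing, for illustration only.
    paths = [
        "artifacts/build-logs/src.log",
        "artifacts/example-operator-build-deploy-kuttl/" + KUTTL_LOG_SUFFIX,
    ]
    if kuttl_step_ran(paths):
        print("kuttl log found: check it for errors")
    else:
        print("no kuttl log: the build/deploy step probably failed first")
```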
##### Hive status
[Hive metrics to watch cluster pools](https://grafana-route-ci-grafana.apps.ci.l2s4.p1.openshiftapps.com/d/22491886c1e19dde8d2984bca82154c1/cluster-pool-dashboard?orgId=1&from=now-7d&to=now)
What to look for in Hive metrics:
* https://docs.ci.openshift.org/docs/how-tos/cluster-claim/#existing-cluster-pools
* Search for openstack
* Check the "Ready" count: it shows how many clusters are available and whether we are short on clusters for scheduled jobs
##### Operator Image Build status
[Dashboard for monitoring operator Container builds](https://github.com/gibizer/openstack-k8s-status)
#### Podified Container Build Lines
* [openstack-periodic-container-master-centos9](https://review.rdoproject.org/zuul/buildsets?pipeline=openstack-periodic-container-master-centos9)
* [openstack-periodic-container-antelope-centos9](https://review.rdoproject.org/zuul/buildsets?pipeline=openstack-periodic-container-antelope-centos9&skip=0)
#### EDPM Tests
CI-framework
* https://review.rdoproject.org/zuul/builds?job_name=cifmw-end-to-end&skip=0
* https://review.rdoproject.org/zuul/builds?job_name=cifmw-end-to-end-nobuild-tagged&skip=0
* https://review.rdoproject.org/zuul/builds?job_name=ci-framework-crc-podified-edpm-deployment&skip=0
* https://review.rdoproject.org/zuul/builds?job_name=ci-framework-crc-podified-edpm-baremetal&skip=0
Edpm-ansible
* https://review.rdoproject.org/zuul/builds?job_name=edpm-ansible-crc-podified-edpm-deployment&skip=0
* https://review.rdoproject.org/zuul/builds?job_name=edpm-ansible-crc-podified-edpm-baremetal&skip=0
Dataplane-operator
* https://review.rdoproject.org/zuul/builds?job_name=dataplane-operator-crc-podified-edpm-baremetal&skip=0
* https://review.rdoproject.org/zuul/builds?job_name=dataplane-operator-crc-podified-edpm-deployment&skip=0
Openstack-operator
* https://review.rdoproject.org/zuul/builds?job_name=openstack-operator-crc-podified-edpm-baremetal&skip=0
* https://review.rdoproject.org/zuul/builds?job_name=openstack-operator-crc-podified-edpm-deployment&skip=0
Openstack-ansibleee-operator
* https://review.rdoproject.org/zuul/builds?job_name=ansibleee-operator-crc-podified-edpm-deployment&skip=0
Openstack-baremetal-operator
* https://review.rdoproject.org/zuul/builds?job_name=openstack-baremetal-operator-crc-podified-edpm-baremetal&skip=0
Periodic Jobs
* https://review.rdoproject.org/zuul/builds?job_name=periodic-podified-edpm-deployment-antelope-ocp-crc-1cs9
* https://review.rdoproject.org/zuul/builds?job_name=periodic-podified-edpm-baremetal-antelope-ocp-crct
### How to debug an EDPM job failure
Example log:
* https://review.rdoproject.org/zuul/build/62a2f4396564439490779bab749bd377 shows the top-level error
* "View log" -> job-output.txt to confirm the error
* ci-framework-data/logs/: ansible.log-**date** shows the Ansible failure
  * example `"stderr_lines": ["error: timed out waiting for the condition on openstackcontrolplanes/openstack-network-isolation"]`
* edpm folder:
  * events.log - all events that happened on the cluster
  * operator_pods.txt (any failed operators - copy the pod name)
  * openstack_pods.txt (openstack service pods created during deployment)
  * pods (search for pod name.txt)
  * pv.log (useful for any storage-related failure)
  * crs - custom resources generated for each service and operator during deployment
* crc folder:
  * contains all the logs related to the crc cluster
* system-config/libvirt/:
  * logs related to libvirt VMs
* Look for the latest changes to impacted operators in https://github.com/openstack-k8s-operators
Common errors:
* ` *** [Makefile:393: namespace] Error 1\n", "stderr_lines": ["error: You must be logged in to the server (Unauthorized)"`
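
When the Ansible log is long, pulling out every `stderr_lines` entry is a quick first pass. The sketch below assumes each list sits on a single log line as a JSON array, as in the example above. The function name `extract_stderr_lines` is illustrative. Truncated arrays, or messages that themselves contain `]`, are skipped or cut short by this simple pattern.

```python
import json
import re

# Matches the JSON array that follows a "stderr_lines" key on one line.
STDERR_LINES_RE = re.compile(r'"stderr_lines":\s*(\[[^\]]*\])')


def extract_stderr_lines(log_text):
    """Collect all messages found in "stderr_lines" arrays in the log text."""
    messages = []
    for match in STDERR_LINES_RE.finditer(log_text):
        try:
            messages.extend(json.loads(match.group(1)))
        except json.JSONDecodeError:
            # Not a clean JSON array (for example escaped or truncated); skip.
            continue
    return messages


if __name__ == "__main__":
    sample = (
        '"stderr_lines": ["error: timed out waiting for the condition '
        'on openstackcontrolplanes/openstack-network-isolation"]'
    )
    for message in extract_stderr_lines(sample):
        print(message)
```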
### Common Errors
#### Memory
```could not run steps: step unit failed: failed to create or restart test pod: unable to create pod: Pod "unit" is invalid: spec.containers[0].resources.requests: Invalid value: "2171836734": must be less than or equal to memory limit of 2Gi```
The memory limit that triggers this error is set here:
https://github.com/openshift/release/blob/master/ci-operator/config/openstack-k8s-operators/ironic-operator/openstack-k8s-operators-ironic-operator-main.yaml#L54-L56
```
resources:
  '*':
    limits:
      memory: 2Gi
    requests:
      cpu: 100m
      memory: 200Mi
```
The memory limit needs to be bumped to 4Gi.
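
To see why 2Gi is rejected and 4Gi is enough, compare the byte values directly. This is a minimal sketch: `to_bytes` is an illustrative helper that handles only plain integers and the binary suffixes Ki, Mi, Gi, and Ti, not the full Kubernetes quantity syntax.

```python
BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def to_bytes(quantity):
    """Convert a quantity such as '2Gi' or '2171836734' to bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


if __name__ == "__main__":
    requested = to_bytes("2171836734")  # value from the error message
    for limit in ("2Gi", "4Gi"):
        verdict = "fits" if requested <= to_bytes(limit) else "exceeds the limit"
        print(f"request of {requested} bytes vs {limit}: {verdict}")
```

The request in the error message is 24353086 bytes (about 23 MiB) above 2Gi, which is 2147483648 bytes, so a 4Gi limit leaves ample headroom.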