# Ruck Rover - Mar 11th - 17th 2022
###### tags: `ruck_rover`
## Previous RR notes: https://hackmd.io/uA4i9wDhQfiolHeVOIHbdg
## Notes:
### Mar-17th:
* Gate failure https://bugs.launchpad.net/tripleo/+bug/1965426
* happens only in ussuri or older branches
* affects validations jobs
* ping jpovidin or matbu tomorrow, I got no answers today
* Train C8 promoted, some components promoted
* Chasing promotion of a few old components, chasing master and wallaby c9
* Issues with mirrors still happening:
* <dviroel|ruck> guilhermesp_: hey o/ - today we start to face mirrors issues again, less then before the migration, but we still have jobs failing to download content
* <dviroel|ruck> guilhermesp_: "Operation too slow. Less than 1000 bytes/sec transferred the last 30 seconds"
* <guilhermesp_> yeah, i actually migrated to another old compute ( i was facing issues to live migrate it between those old systems to the new ones ) -- you can re-open the ticket and we can try to work on get a permanent solution. Maybe even taking a snapshot and relaunching it to a new hv if that not a lot of work on your side -- we'll see
* nhicher will re-open the ticket.
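
The mirror failures quoted above share a recognizable dnf/curl signature. A minimal sketch, assuming job logs are available as plain text, of how such lines might be flagged when triaging reruns. The function name `find_mirror_errors` and the pattern list are illustrative only and are not part of any existing CI tooling:

```python
import re

# Illustrative patterns, taken from the error strings quoted in these notes.
MIRROR_ERROR_PATTERNS = [
    re.compile(
        r"Operation too slow\. Less than \d+ bytes/sec "
        r"transferred the last \d+ seconds"
    ),
    re.compile(r"No more mirrors to try"),
]


def find_mirror_errors(log_text):
    """Return (line_number, line) pairs for lines matching a mirror error pattern."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in MIRROR_ERROR_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

Running this over a downloaded `job-output.txt` would give a quick hint on whether a failure looks mirror-related before deciding to rerun; it does not replace reading the log.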
### Mar-16th
* (doug): C8 wallaby promoted by skipping fs020 this time.
* It seems that vexxhost fixed the issues with mirrors by migrating their VMs (reopen 362150 if needed).
* There are some components that are too old - I triggered some testprojects to see how it goes. I noticed that fs001 was passing on the component line, so we should try to promote as many components as possible. Retriggered failed jobs: https://review.rdoproject.org/r/c/testproject/+/40642
* Container build failure on 16.2 needs investigation. (it is fixed now, mirror issue, https://code.engineering.redhat.com/gerrit/c/testproject/+/317875)
* Trying to promote Train C8 with 59c817e5bcaebfb5aeee50225b2dc5f2 (missing ovbs) - Already promoted
* https://bugs.launchpad.net/tripleo/+bug/1965124 mirror issues
* stable/wallaby pep8 broken? commented https://bugs.launchpad.net/tripleo/+bug/1964935/comments/3 & posted cherrypick https://review.opendev.org/c/openstack/tripleo-heat-templates/+/834012 (is merged now)
### Mar-15th
* (doug): criteria change was needed to promote wallaby-c9 (fs001). Still missing wallaby-c8 and train-c8, but unfortunately vexxhost issues are back: mirror and connectivity issues with nodes caused a lot of jobs to fail. (╯°□°)╯︵ ┻━┻. @chkumar we need to update the program call doc tomorrow.
### Mar-14th
* (doug): Lots of failures related to mirrors. Nothing promoted so far; featureset035 is blocking wallaby and master C9 in previous hashes, and rerunning those jobs throws different errors every time. Last thing on the 14th: rerunning master and wallaby c9 failed jobs. Some jobs are still failing due to mirrors, others due to multiple tempest failures. featureset001 needs attention in almost all releases.
## Upstream
### Status:
* Gate:
* C9 main:
* C9 wallaby:
* C8 wallaby:
* C8 victoria:
* C8 ussuri:
* C8 train: Promoted Today
* C7 train: All jobs have passed, hope it promotes.
### Upstream Issues:
* CS9 master component jobs are failing due to missing containers from Registry
* https://bugs.launchpad.net/tripleo/+bug/1964457/ - https://review.opendev.org/c/openstack/tripleo-quickstart-extras/+/833027
* FS01 re-run for wallaby/victoria/ussuri - https://review.rdoproject.org/r/c/testproject/+/40525
* **Master failure**
* fs035 testing: https://review.rdoproject.org/r/c/testproject/+/40461
* Updated Depends-On with
* https://review.opendev.org/c/openstack/tripleo-heat-templates/+/833708
* https://review.opendev.org/c/openstack/puppet-tripleo/+/833711
* with 831608: Make rescue, volume attachment compute tests to create SSH-able server | https://review.opendev.org/c/openstack/tempest/+/831608
* Failures: https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset035-master
* No pattern found - different failures every run
* As per takashi, https://bugs.launchpad.net/tripleo/+bug/1964824 might be related (rough guess)
* Added the logs there https://bugs.launchpad.net/tripleo/+bug/1964824/comments/7
* **Mirror Issue**
* https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-train/1fd6839/job-output.txt
* re-running cs8 train skipped/failed jobs here: https://review.rdoproject.org/r/c/testproject/+/40521
* More Jobs
* https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-standalone-master/b5b47fa/logs/undercloud/home/zuul/install_packages.sh.log.txt.gz
* **RETRY_LIMIT in ovb check jobs**
* Comments
* Asked on #rhos-ops with the details below
* https://support.vexxhost.com/hc/en-us/requests/362100
* https://support.vexxhost.com/hc/en-us/requests/362141
* node_failures in component line https://review.rdoproject.org/zuul/builds?result=NODE_FAILURE timestamp from 04:31:31 to 06:55:41 UTC
* Jobs with retry_limit
* `stack_status_reason: 'Resource CREATE failed: OperationalError: resources.baremetal_env.resources.openstack_baremetal_servers.resources[2].resources.baremetal_ports:` is the error
* Below is the stack id and job logs
* https://logserver.rdoproject.org/32/831932/6/openstack-check/tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001/9f37b22/job-output.txt
stack id: 177ce44b-f404-46bd-9869-888b7ce19ff6
* https://logserver.rdoproject.org/96/833196/1/openstack-check/tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039/149d4c6/job-output.txt
stack id: 0f0d76cd-1923-4574-a199-002632bc998d
* https://logserver.rdoproject.org/22/832722/1/openstack-check/tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039/c36da1c/job-output.txt
stack id: b9431de7-975c-4401-8181-b28708dd87f2
* **Gate Failures**
* Already rechecked (if it still fails, we need to log a bug)
* https://review.opendev.org/c/openstack/tripleo-heat-templates/+/832722/
* https://review.opendev.org/c/openstack/tripleo-heat-templates/+/833556/
* https://review.opendev.org/c/openstack/tripleo-heat-templates/+/818637/
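
For the NODE_FAILURE window noted under RETRY_LIMIT above (04:31:31 to 06:55:41 UTC), a small helper can narrow a list of build records to that window when reporting to the provider. This is a sketch under stated assumptions: it expects build records as dictionaries with `result` and `start_time` keys and ISO-like timestamps (`YYYY-MM-DDTHH:MM:SS`), which should be checked against what the Zuul builds listing actually returns. The name `builds_in_window` is hypothetical.

```python
from datetime import datetime, time


def builds_in_window(builds, result, start, end, time_key="start_time"):
    """Keep builds with the given result whose timestamp falls in [start, end].

    `start` and `end` are times of day; the record layout is an assumption.
    """
    selected = []
    for build in builds:
        if build.get("result") != result:
            continue
        stamp = build.get(time_key)
        if not stamp:
            continue  # records without a timestamp cannot be placed in the window
        stamped_at = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")
        if start <= stamped_at.time() <= end:
            selected.append(build)
    return selected
```

For example, `builds_in_window(builds, "NODE_FAILURE", time(4, 31, 31), time(6, 55, 41))` would return only the failures inside the reported window; if the timestamp lives under a different key, pass it via `time_key`.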
---
## Downstream
### Status:
* RHOS 17:
* RHOS 16.2:
### Downstream Issues:
* Container build on 16-2 failing:
* https://sf.hosted.upshift.rdu2.redhat.com/logs/84/195884/144/check/periodic-tripleo-build-containers-ubi-8-internal-rhel-8-build-push-upload-rhos-16.2/594fe71/logs/build.log
---
## RDO
* sent a message to amoralej and jcapitao about this job failing - no responses yet
* https://jenkins-cloudsig-ci.apps.ocp.ci.centos.org/job/weirdo-victoria-centos8-promote-puppet-openstack-scenario003/
---
## Fixed:
* ~~tripleo-ansible-centos-8-molecule-tripleo_cephadm failure~~
* ~~Status: waiting for reviews~~
* ~~Patches: https://review.opendev.org/c/openstack/tripleo-ansible/+/833178~~
* ~~In gates~~
* container build is passing now
* https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-build-containers-centos-9-push-master/bbc6eb7/logs/build.log
* An SSL cert update happened at that time
* re-running here https://review.rdoproject.org/r/c/testproject/+/36356/26#message-c89594e16604bb56277d91f2f61dd9b50a9ecb99
* periodic-tripleo-ci-build-containers-centos-9-push-master https://review.rdoproject.org/zuul/build/7eb5a6ac4cd94012a9b8017720f0558c : SUCCESS in 1h 05m 28s
* https://review.rdoproject.org/r/c/testproject/+/36356/26#message-bbe021d515d0c112bd69360b1927fb67396982b8
* **openstackclient>=5.2.0 conflicting pip dependencies**
* Status: under investigation
* LP: https://bugs.launchpad.net/tripleo/+bug/1964468
* CIX:
* Jobs:
* periodic-tripleo-ci-build-containers-centos-9-push-master
* Patches
* Comments:
* https://bugs.launchpad.net/tripleo/+bug/1964477 - Tengu is working on a fix that may be useful
* https://review.opendev.org/c/openstack/tripleo-quickstart-extras/+/833091
* dviroel: unable to reproduce in my c9 env
* better now on testproject (https://review.rdoproject.org/r/c/testproject/+/36356/26)
* at least the piece that was failing
* Upstream gate failure
* https://520c46833a3e12074454-e521a4dbcb573778e4bab95c0cd81671.ssl.cf2.rackcdn.com/831932/6/gate/tripleo-ci-centos-9-content-provider-wallaby/b8490fd/logs/delorean_logs/component/validation/91/63/9163d8bcbcf364db9c04323436de35fc3946b380_dev/rpmbuild.log.txt.gz
* `DEBUG: =========================
DEBUG: Failures during discovery
DEBUG: =========================
DEBUG: --- import errors ---
DEBUG: Failed to import test module: validations_libs.tests.callback_plugins.test_vf_fail_if_no_hosts
DEBUG: Traceback (most recent call last):
DEBUG: File "/usr/lib64/python3.9/unittest/loader.py", line 436, in _find_test_path
DEBUG: module = self._get_module_from_name(name)
DEBUG: File "/usr/lib64/python3.9/unittest/loader.py", line 377, in _get_module_from_name
DEBUG: __import__(name)
DEBUG: File "/builddir/build/BUILD/validations-libs-1.7.0.dev8/validations_libs/tests/callback_plugins/test_vf_fail_if_no_hosts.py", line 27, in <module>
DEBUG: from oslotest import base
DEBUG: ModuleNotFoundError: No module named 'oslotest'`
* might be caused by this https://github.com/openstack/validations-libs/commit/9163d8bcbcf364db9c04323436de35fc3946b380#diff-fac4c6890301d4de5c3f4266837803d5240c84a3d8b6c735bbc6a64c39d2f94eR16
* Recent run got cleared in check
* Based on this https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-9-content-provider-wallaby
* if it comes up again, please file a bug and add python-oslotest as a BuildRequires (BR) in https://github.com/rdo-packages/validations-libs-distgit/blob/wallaby-rdo/python-validations-libs.spec
* It seems this job is passing https://zuul.openstack.org/builds?job_name=tripleo-ci-centos-9-content-provider-wallaby&skip=0
* **Check Failures**
* https://bugs.launchpad.net/tripleo/+bug/1964595
* Blocking some patches like: https://review.opendev.org/c/openstack/tripleo-ansible/+/833182/
* https://bugs.launchpad.net/tripleo/+bug/1964530
* Comments:
* fultonj working on a fix
* https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-9-scenario001-standalone
* https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-9-scenario004-standalone
* Patch: https://review.opendev.org/c/openstack/tripleo-ansible/+/833182
* Merged, waiting for more runs
* **CS9 wallaby**
* Comments
* ~~re-running full-tempest-api and scenario her~~ ~~https://review.rdoproject.org/r/c/testproject/+/36255~~
* CS9 Tempest tests failing with "Host 'standalone.localdomain' is not mapped to any cell"
* https://bugs.launchpad.net/tripleo/+bug/1964269
* Affected jobs:
* periodic-tripleo-ci-centos-9-scenario007-multinode-oooq-container-master
* fs35
* Re-running failed jobs: https://review.rdoproject.org/r/c/testproject/+/40445
* fs035 passed
* centos-9-scenario001-standalone tripleo_ansible_inventory or deployed_metalsmith parameter is required [Fixed]
* https://bugs.launchpad.net/tripleo/+bug/1964530 (fulton working on fix)
* testing it here: https://review.opendev.org/c/openstack/tripleo-ci/+/833337
* CIX: https://trello.com/c/GLMsaCid/2401-cixlp1964530tripleociproa-centos-9-scenario001-standalone-tripleoansibleinventory-or-deployedmetalsmith-parameter-is-required
* affects all jobs that deploy ceph (scn001, scn004, scn010)
* Comments:
* we need tqe change that updates ceph container to 6.0.7 - but depends on wallaby tripleo-common too (see https://review.opendev.org/c/openstack/tripleo-ci/+/833337)
* ~~we depends on https://bugs.launchpad.net/tripleo/+bug/1964595 fix too~~
* **ERROR** state in a few jobs
* https://review.rdoproject.org/zuul/builds?result=ERROR
* https://review.rdoproject.org/zuul/build/7241289f87cf466caad03d39c48d6fee
* Error: Failed to update project ansible-collections/ansible.netcommon
* Need investigation
* No longer seen; removing it from here
* **Wallaby Failure**
* c8s DLRN stuck: now it is fixed
* Failed due to mirror issue; re-running here: https://review.rdoproject.org/r/c/testproject/+/40440
* https://logserver.rdoproject.org/40/40440/2/check/periodic-tripleo-ci-centos-9-standalone-on-multinode-ipa-wallaby/8c11aa9/job-output.txt
* Still failing
* `Timeout was reached for http://mirror.regionone.vexxhost-nodepool-tripleo.rdoproject.org/centos-stream/9-stream/BaseOS/x86_64/os/Packages/crypto-policies-20220203-1.gitf03e75e.el9.noarch.rpm [Operation too slow. Less than 1000 bytes/sec transferred the last 30 seconds]", "[FAILED] crypto-policies-20220203-1.gitf03e75e.el9.noarch.rpm: No more mirrors to try - All mirrors were already tried without success", "", "The downloaded packages were saved in cache until the next successful transaction.", "You can remove cached packages by executing 'dnf clean packages'`
* Network issue on vexxhost
* **Gate Failure**
* tripleo-ci-centos-8-standalone-upgrade-victoria (1x)
* Tempest test failure (test_minimum_basic_scenario):
* "None matches Is(None): Failed to find floating IP '192.168.24.157' in server addresses: {'tempest-TestMinimumBasicScenario-1018522959-network'"
* https://22a8c80ab9bbcd6728f0-31bf408f42c27b0363a3847bc2706cfc.ssl.cf1.rackcdn.com/833331/1/gate/tripleo-ci-centos-8-standalone-upgrade-victoria/c3ec607/logs/undercloud/var/log/tempest/stestr_results.html
* Rechecked the job - fixed now
* https://review.opendev.org/c/openstack/tripleo-heat-templates/+/833331/1#message-d11b2fe589391abc292d67162be07786ee0c2b46