# Ruck and Rover notes #33
###### tags: `ruck_rover`
:::info
Important links for ruck/rovers: [ruck/rover links to help](https://hackmd.io/07z0xroHTFi2IbX93P5ZfQ)
**Ruck Rover - Unified Sprint #33**
Dates: Sep 9 - Sep 29
Tripleo CI team ruck|rover: Folco/Bhageshris
OSP CI team ruck|rover: wznoinsk|ruck, TalmoR|ruck
:::
[TOC]
---
## on-going issues
:::danger
## TripleO
### gate
* Bug: standalone upgrade train failing: https://bugs.launchpad.net/tripleo/+bug/1896595
* Note that the job is non-voting, so it is not very high priority.
### periodic / 3rd party
- master
- Bug: https://bugs.launchpad.net/tripleo/+bug/1897863 (not a promotion blocker)
- https://review.opendev.org/#/c/755252/ (Add support for rdo openvswitch layered upgrade special treatment.)
- All jobs are **GREEN** except the standalone upgrade master job
- train
- we are getting NODE_FAILURE on periodic-tripleo-centos-8-train-promote-promoted-components-to-tripleo-ci-testing, waiting for next run.
- ussuri
- All jobs are **GREEN** except the standalone upgrade ussuri job
- stein
@bhagyashris @rfolco please focus on stein upgrade jobs.
* UPGRADE JOBS voting and failing
- undercloud upgrade: https://bugs.launchpad.net/tripleo/+bug/1896248
- standalone upgrade: https://bugs.launchpad.net/tripleo/+bug/1896537
- https://zuul.openstack.org/builds?job_name=tripleo-ci-centos-7-standalone-upgrade-stein
- rocky
## OSP
:::
---
## Sept 30th
### TripleO
* gate
* periodic / 3rd party
**master**
- Bug: https://bugs.launchpad.net/tripleo/+bug/1897863 (not a promotion blocker)
- https://review.opendev.org/#/c/755252/ (Add support for rdo openvswitch layered upgrade special treatment.)
- All jobs are **GREEN** except the standalone upgrade master job
**ussuri**
- All jobs are **GREEN** except the standalone upgrade ussuri job
**train**
- we are getting NODE_FAILURE on periodic-tripleo-centos-8-train-promote-promoted-components-to-tripleo-ci-testing, waiting for next run.
* component pipeline:
master: https://bugs.launchpad.net/tripleo/+bug/1897947
### OSP
## Sept 29th
### TripleO
* gate
* periodic / 3rd party
**master**
- ~~https://review.opendev.org/#/c/754783/~~
- Note: Jobs will pass once we get the baremetal component promotion.
**ussuri**
* fs039 is failing because of a tempest test failure
- https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-ussuri/d5b47d0/logs/undercloud/var/log/tempest/stestr_results.html.gz
- Note: Will wait for the next run as it's not a consistent failure.
**train**
* **GREEN**
### OSP
## Sept 28th
### TripleO
* gate
* periodic / 3rd party
**master**
* Bug: https://bugs.launchpad.net/tripleo/+bug/1897505 (overcloud-prep-images fails at Introspect overcloud images (fs001, fs002 and fs039 are failing))
- [DNM] Increase ironic conductor conf values https://review.opendev.org/754690
- https://review.opendev.org/#/c/754783/
- Testproject: https://review.rdoproject.org/r/29351
```
<bhagyashri|rover> hjensas, hi need one help : few CI jobs are failing on master because of https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master/0ca0cb9/logs/undercloud/var/log/containers/ironic-inspector/ironic-inspector.log.txt.gz
<bhagyashri|rover> it's basically failing at overcloud image introspection
<bhagyashri|rover> do you have any idea?hoonetorg hrybacki
<bhagyashri|rover> hjensas, https://bugs.launchpad.net/tripleo/+bug/1897505
<hjensas> bhagyashri|rover: hm, wonder if it could be that it is slow to transition state in the host cloud? Can you try playing with ironic.conf [conductor]power_state_change_timeout = 60 and [conductor]sync_power_state_interval = 60 ? Maby bump to 90 or 120 ?
```
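The suggestion in the chat above amounts to raising two `[conductor]` options in ironic.conf. A minimal sketch of that change in Python, assuming the two options are `power_state_change_timeout` and `sync_power_state_interval` and using an in-memory stand-in for the file. The helper name `bump_conductor_timeouts` and the sample text are hypothetical, and on a deployed undercloud the file is generated, so a real change would likely have to go through the deployment's own configuration overrides rather than a hand edit.

```python
import configparser
import io

# Hypothetical stand-in for the [conductor] section of ironic.conf;
# the value 60 is the one quoted in the chat.
SAMPLE_IRONIC_CONF = """\
[conductor]
power_state_change_timeout = 60
sync_power_state_interval = 60
"""


def bump_conductor_timeouts(conf_text, seconds):
    """Return conf_text with both [conductor] options set to `seconds`."""
    # Interpolation is disabled so literal '%' characters in a real
    # config file are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("conductor"):
        parser.add_section("conductor")
    for option in ("power_state_change_timeout", "sync_power_state_interval"):
        parser.set("conductor", option, str(seconds))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    # Try the upper value mentioned in the chat (90 is the other candidate).
    print(bump_conductor_timeouts(SAMPLE_IRONIC_CONF, 120))
```

Running the sketch with 90 first and 120 second mirrors the "bump to 90 or 120" idea; whether either value clears the introspection failure is exactly what the DNM patch is meant to find out.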
**ussuri**
* **GREEN** except sc010-ovn
- skiplist: https://review.rdoproject.org/r/29709
**train**
* **GREEN**
**stein**
* featureset002-stein-upload failing with (Connection to the 10.0.0.111 via SSH timed out.)
- https://logserver.rdoproject.org/openstack-periodic-integration-stable3/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-stein-upload/f2d0f0c/logs/undercloud/home/zuul/tempest/tempest.html.gz
### OSP
## Sept 25th
### TripleO
* gate
* periodic / 3rd party
**master**
* **In the current run master is GREEN**
* Bug: https://bugs.launchpad.net/tripleo/+bug/1897213 ([inconsistent] scenario010-standalone-master is getting TIMED_OUT at Execute tempest test task)
* Fix: https://review.rdoproject.org/r/29634
**ussuri**
**train**
* Bug: https://bugs.launchpad.net/tripleo/+bug/1897228
* Fix: https://review.opendev.org/#/c/754281/
* Testproject: https://review.rdoproject.org/r/#/c/29252/
### OSP
## Sept 24th
### TripleO
* Bug: https://bugs.launchpad.net/tripleo/+bug/1896917 ( problem with some vexxhost hypervisor)
* gate
* periodic / 3rd party
**master**
- periodic-tripleo-ci-centos-8-scenario010-ovn-provider-standalone-master >> https://review.rdoproject.org/r/29619 should fix
- periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-master (will open a bug if it happens again)
**ussuri**
- https://bugs.launchpad.net/tripleo/+bug/1895705 scen10-ovn >> skiplist >> https://review.opendev.org/#/c/754198/
**train**
- ~~https://bugs.launchpad.net/tripleo/+bug/1895705 scen10-ovn~~ skiplist
- https://review.rdoproject.org/zuul/builds?job_name=tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039&branch=stable%2Ftrain# known?
~~- <ykarel> rfolco|ruck, https://review.rdoproject.org/r/29617~~
**stein**
- several jobs failing tempest tests example: https://logserver.rdoproject.org/openstack-periodic-integration-stable3/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset010-stein/f8673fb/logs/undercloud/home/zuul/tempest/tempest.html.gz
(no action taken yet)
### OSP
## Sept 23rd
### TripleO
* gate
* Bug: https://bugs.launchpad.net/tripleo/+bug/1896738
* ~~Fix: https://review.opendev.org/#/c/753564/~~
* Fix: https://review.opendev.org/#/q/topic:linters-fix+(status:open+OR+status:merged)
* ~~Most of the failures are POST_FAILUREs caused by an infra issue that has since been fixed. https://review.opendev.org/#/c/753498/1~~
* periodic / 3rd party
**master**
**ussuri**
**train**
* **GREEN**
**stein**
### OSP
## Sept 22nd
### TripleO
* gate
* Bug: https://bugs.launchpad.net/tripleo/+bug/1896595 ([c7 train] ConflictException: 409: Client Error for url: http://192.168.24.1:9696/v2.0/networks.json, Unable to create the flat network. Physical network datacentre is in use failing standalone upgrade train )
* periodic / 3rd party
**master**
**ussuri**
**train**
* **GREEN** :)
**stein**
* Bug: https://bugs.launchpad.net/tripleo/+bug/1896537
* Summary:
1. It might be similar to https://bugs.launchpad.net/tripleo/+bug/1895822 (standalone-upgrade ussuri)
2. in both cases it passes deployment & upgrade OK but it seems there is something happening during the upgrade that kills connectivity
3. we have a cix on that one so a few folks have looked into it but we don't have a root cause yet
### OSP
## Sept 21st
### TripleO
* gate
* ~~Bug: https://bugs.launchpad.net/tripleo/+bug/1896439~~
* ~~Fix: https://review.opendev.org/#/c/752881/~~
* ~~Fix: https://review.opendev.org/#/c/752902/~~
* Bug: https://bugs.launchpad.net/tripleo/+bug/1896469
* Fix: https://review.opendev.org/#/c/752908
* periodic / 3rd party
**master**
* ~~Bug: https://bugs.launchpad.net/tripleo/+bug/1896469~~
* ~~master pipeline got hit by RETRY_LIMIT because of this change: https://review.rdoproject.org/r/#/c/29009/ (Update rdo-openvswitch to support ovs/ovn2.13 NFV SIG builds)~~
* ~~Fix: https://review.rdoproject.org/r/29542 (Revert "Update rdo-openvswitch to support ovs/ovn2.13 NFV SIG builds")~~
~~~
<bhagyashri|rover> ykarel, hi master jobs are getting into retry limit https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-master/8864213/job-output.txt
<bhagyashri|rover> ykarel, looks like this is https://review.rdoproject.org/r/#/c/29009/ hitting retry limit
<bhagyashri|rover> amoralej, ^^
<amoralej> bhagyashri|rover, where?
<bhagyashri|rover> amoralej, https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-master/8864213/job-output.txt
<amoralej> mmm, it's missing nfv repo
<amoralej> ykarel, ^
<amoralej> oh, the multinode part
<amoralej> grrr
<amoralej> but we tested it in multinode, iirc
<ykarel> amoralej, bhagyashri|rover looking
<ykarel> amoralej, uhh so we tested it but at that point of job master deps repo was used
<amoralej> ykarel, let's revert
~~~
**ussuri**
* Same issue as master
**c8 train**
* :green_heart::green_heart::green_heart:
**stein**
* Bug: https://bugs.launchpad.net/tripleo/+bug/1896220
* Bug: https://bugs.launchpad.net/tripleo/+bug/1896537
## Sept 18th
### TripleO
* gate
* [LP: 1896178](https://bugs.launchpad.net/tripleo/+bug/1896178) ( [master][train] nothing provides libmysqlclient.so.21()(64bit) needed by collectd-mysql-5.11.0-2.el8.x86_64)
Fix: https://review.opendev.org/#/c/752621/ (Temporarily drop mysql)
Note: Please check the details in the bug's comment section.
* periodic / 3rd party
**master**
* Blocker: [LP: 1896178](https://bugs.launchpad.net/tripleo/+bug/1896178)
~~Fix: https://review.rdoproject.org/r/#/c/29472/ (Add temporary hotfix repo for CentOS 8 issue with mysql module)~~
Related patches:
* ~~https://review.rdoproject.org/r/#/c/29458/ (Use mariadb connector instead of mysql)~~
* ~~Note: Test patch also passed https://review.rdoproject.org/r/#/c/29227/~~
**ussuri**
* **GREEN** :) :green_heart::green_heart::green_heart:
**c8 train**
* Blocker: [LP: 1896178](https://bugs.launchpad.net/tripleo/+bug/1896178) Same as master one.
* ~~test_listener_CRUD (scen10-ovn)~~
* ~~BUG: https://bugs.launchpad.net/tripleo/+bug/1895705~~ (bug still open, tests on skiplist)
* ~~skiplist: https://review.opendev.org/751162 [stein|train] Add octavia tests to skip list **(Merged)**~~
**stein**
* Bug: https://bugs.launchpad.net/tripleo/+bug/1896220
* fs002, fs010 and fs030 are failing because of a memory issue. They should pass in the next run.
- ~~fs037 prepare-parameter:~~
- ~~FIX: https://review.opendev.org/#/c/751340/ **(Merged)**~~
- ~~testproject: https://review.rdoproject.org/r/#/c/29297/ (Passing)~~
- ~~fs038 tempest octavia:~~
- ~~BUG: https://bugs.launchpad.net/tripleo/+bug/1895248~~
- ~~Fix: https://review.opendev.org/#/c/751162/ **(Merged)**~~
### OSP
## Sept 17th
### TripleO
https://bugs.launchpad.net/tripleo/+bug/1896126 >> tempest workspace
* gate
* ~~https://review.opendev.org/#/c/752187/ (Increase configure_swap_size to 4096)~~
* ~~https://review.opendev.org/#/c/752129/ (Inject both paths for validations roles location) >> ussuri~~
* ~~https://review.opendev.org/#/c/752197/ (Inject both paths for validations roles location) >> train~~
* periodic / 3rd party
**master**
* ~~https://bugs.launchpad.net/tripleo/+bug/1895792 (public endpoint for compute service not found is getting TIMED_OUT on fs002, fs020 and fs035)~~
* ~~fix : https://review.opendev.org/#/c/752199/1 (Run some more featuresets with baremetal provisioning)~~
* ~~https://bugs.launchpad.net/tripleo/+bug/1874418 ((inconsistent) periodic centos8 fs39 fails sometimes with Error, some other host (FA:16:3E:B5:60:2A) already uses address 10.0.0.1)~~
* ~~https://review.opendev.org/#/c/752370/ (Enable undercloud nova for fs020) got +w on it~~. Testproject patch : https://review.rdoproject.org/r/#/c/29351/ (GREEN)
* ~~in the recent run build-containers-ubi-8-push job failed here https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-build-containers-ubi-8-push/4f35982/logs/build.log~~
* ~~**Note:** on the testproject patch it passed https://review.rdoproject.org/r/#/c/29227/ , so waiting for next run.~~
**ussuri**
* featureset039-ussuri is failing because of https://bugs.launchpad.net/tripleo/+bug/1874418 ((inconsistent) periodic centos8 fs39 fails sometimes with Error, some other host (FA:16:3E:B5:60:2A) already uses address 10.0.0.1), though this is not a promotion blocker. **All the rest are GREEN** :)
**c8 train**
* **GREEN** :)
## Sept 16th
### TripleO
* gate
* https://review.opendev.org/#/c/752187/ (Increase configure_swap_size to 4096)
* https://review.opendev.org/#/c/752129/ (Inject both paths for validations roles location) >> ussuri
* https://review.opendev.org/#/c/752197/ (Inject both paths for validations roles location) >> train
* ~~https://review.opendev.org/#/c/751653/ (Inject both paths for validations roles location)~~
* ~~https://review.opendev.org/#/c/751828/ (Increase configure_swap_size to 4096)~~
* ~~https://review.opendev.org/#/c/751930/ (skip test_snapshot_pattern in master)~~
* ~~https://review.opendev.org/#/c/750967/ (Fix duplicated declaration when the deprecated parameter is used) This will fix sc12 issue~~
* periodic / 3rd party
**master**
* https://bugs.launchpad.net/tripleo/+bug/1895792 (public endpoint for compute service not found is getting TIMED_OUT on fs002, fs020 and fs035)
* here is the fix : https://review.opendev.org/#/c/752199/1 (Run some more featuresets with baremetal provisioning)
* https://bugs.launchpad.net/tripleo/+bug/1874418 ((inconsistent) periodic centos8 fs39 fails sometimes with Error, some other host (FA:16:3E:B5:60:2A) already uses address 10.0.0.1)
~~* https://bugs.launchpad.net/tripleo/+bug/1895828 (
tempest.lib.exceptions.UnexpectedResponseCode: Unexpected response code received is failing on scenario010-standalone-master)~~
* Note: this issue is not seen on testproject run https://review.rdoproject.org/r/#/c/29351/ will wait for next run
- ~~https://bugs.launchpad.net/tripleo/+bug/1895580 'TagFloatingIpTestJSON' object has no attribute 'assertItemsEqual'~~
- ~~https://review.opendev.org/#/c/750369/~~
- ~~https://review.opendev.org/#/c/747760/~~
- ~~https://review.opendev.org/#/c/751244/ (Revert "Change path for validation Ansible files")~~
**ussuri**
* https://review.opendev.org/#/c/752187/ (Increase configure_swap_size to 4096)
* https://review.opendev.org/#/c/752129/ (Inject both paths for validations roles location)
**Note: ussuri promotion needs the two patches above to merge soon.**
**c8 train**
* ~~privilege escalation (undercloud-containers)~~
* ~~BUG: https://bugs.launchpad.net/tripleo/+bug/1895700
**Note:** In the recent run, didn't see this issue and job is GREEN. Will wait for next run result to reconfirm this issue.~~
* fs039 is in RETRY_LIMIT: Quota exceeded, too many key pairs.
~~https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train/381071f/job-output.txt~~
~~Note: @weshayutin will need to delete stale ports on rdo-cloud and vexxhost~~
**stein**
### OSP
## Sept 15th
### TripleO
* gate
* https://bugs.launchpad.net/tripleo/+bug/1895601 (tripleo standalone centos-8 tempest fail: TestSnapshotPattern:_run_cleanups): 500 DELETE http://192.168.24.3:9292/v2/images)
~~https://review.opendev.org/#/c/751930/ (skip test_snapshot_pattern in master)~~
* **List of patches that need to be merged to clear gate failures**:
~~https://review.opendev.org/#/c/751653/ (Inject both paths for validations roles location)~~
~~https://review.opendev.org/#/c/751828/ (Increase configure_swap_size to 4096)~~
~~https://review.opendev.org/#/c/751930/ (skip test_snapshot_pattern in master)~~
https://review.opendev.org/#/c/751771/ (Add the variable list_skipped_job)
https://review.opendev.org/#/c/750967/ (Fix duplicated declaration when the deprecated parameter is used)
* periodic / 3rd party
**master**
* tempest neutron plugin
* ~~BUG: https://bugs.launchpad.net/tripleo/+bug/1895580 ‘TagFloatingIpTestJSON’ object has no attribute ‘assertItemsEqual’~~
**ussuri**
* "couldn't resolve module/action 'warn'"
* BUG: https://bugs.launchpad.net/tripleo/+bug/1895507
* FIX: https://review.opendev.org/#/c/751653/
**c8 train**
* test_listener_CRUD (scen10-ovn)
* BUG: https://bugs.launchpad.net/tripleo/+bug/1895705
~~* privilege escalation (undercloud-containers)
* BUG: https://bugs.launchpad.net/tripleo/+bug/1895700~~
* memory (fs001, fs002, fs020, fs035, fs039)
* FIX: https://review.rdoproject.org/r/#/c/29335/ (Increase configure_swap_size to 4096 for ovb jobs)
* Related-Bug: #1895290
* Related-Bug: #1895288
**stein**
* container-prepare-parameter.yaml
* FIX: https://review.opendev.org/#/c/751340/ (**STILL UNRESOLVED**)
* testproject: https://review.rdoproject.org/r/#/c/29297/
### OSP
## Sept 14th
### TripleO
* gate
(clear)
- ~~https://bugs.launchpad.net/tripleo/+bug/1895027 scen12~~
- ~~https://review.opendev.org/#/c/750980 lower constraints~~
- ~~https://bugs.launchpad.net/tripleo/+bug/1895056~~
* periodic / 3rd party
**master**
* ~~**NEW!!** https://bugs.launchpad.net/tripleo/+bug/1895580 'TagFloatingIpTestJSON' object has no attribute 'assertItemsEqual'~~
* https://bugs.launchpad.net/tripleo/+bug/1895507 (Standalone jobs failing with "couldn't resolve module/action 'warn'" )
https://review.opendev.org/#/c/751653/ (Inject both paths for validations roles location)
https://review.opendev.org/#/c/751766/ (
Revert "Remove objects migrated to validations-common")
**c8 train**
* https://bugs.launchpad.net/tripleo/+bug/1895290(
[train] ERROR root [ ] Image prepare failed: A process in the process pool was terminated abruptly while the future was running or pending failing on few jobs)
here is the fix: https://review.rdoproject.org/r/#/c/29313/ (Increase configure_swap_size to 4096)
testing here: https://review.rdoproject.org/r/#/c/28452/ , https://review.rdoproject.org/r/#/c/29254/
### OSP
## Sept 11th
### TripleO
* gate
* periodic / 3rd party
**master**
* master jobs are still failing with https://bugs.launchpad.net/tripleo/+bug/1895056 (package conflict, openstack-tripleo-validations-12.4.1-0.20200909044505.ae20b37.el8.noarch and validations-common-1.1.1-0.20200908154415.946e3a8.el8.noarch)
https://review.opendev.org/#/c/751244/ (Revert "Change path for validation Ansible files") Hope this will solve the problem
**ussuri**
* standalone-on-multinode-ipa-ussuri: failed at Execute tempest test: https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-on-multinode-ipa-ussuri/315bb39/job-output.txt
https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-on-multinode-ipa-ussuri/33d2ced/logs/undercloud/var/log/tempest/stestr_results.html.gz
* scenario010-standalone-ussuri: failed at Execute tempest : https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario010-standalone-ussuri/d843988/logs/undercloud/var/log/tempest/stestr_results.html.gz
* **Note:** Ran both jobs in testproject: https://review.rdoproject.org/r/#/c/29250/ (standalone-on-multinode-ipa is still failing; scenario010-standalone passed)
**c8 train**
* https://bugs.launchpad.net/tripleo/+bug/1895288 (
[train] MemoryError: [tripleotraincentos8/centos-binary-nova-libvirt] Memory Error failing on few jobs)
* https://bugs.launchpad.net/tripleo/+bug/1895290 (
[train] ERROR root [ ] Image prepare failed: A process in the process pool was terminated abruptly while the future was running or pending failing on few jobs)
**stein**
* https://bugs.launchpad.net/tripleo/+bug/1895248 ([stein] (LoadBalancerScenarioTest:test_load_balancer_ipv4_CRUD) show_loadbalancer provisioning_status updated to an invalid state of ERROR is failing on fs038)
https://review.opendev.org/#/c/751162/ (Add octavia test to skip list)
* fs037 is blocking promotion: https://bugs.launchpad.net/tripleo/+bug/1895314
### OSP
## Sept 10th
### TripleO
(rfolco) cix cards updated, just do a quick pass before monday cix call
* gate
https://bugs.launchpad.net/tripleo/+bug/1895005 (Tempest failure: cloud overcloud is not found stable/train)
https://bugs.launchpad.net/tripleo/+bug/1895138 (
centos-8 standalone-upgrade-ussuri fails build-test-packages issue creating /root/DLRN) marios is working on it.
* RDO CI
**master**
* master promotion has been blocked because of this issue: https://bugs.launchpad.net/tripleo/+bug/1895056 (
package conflict, openstack-tripleo-validations-12.4.1-0.20200909044505.ae20b37.el8.noarch and validations-common-1.1.1-0.20200908154415.946e3a8.el8.noarch)
The patches below got merged and triggered this issue:
https://review.opendev.org/#/c/750369/1
https://review.opendev.org/#/c/747760/
~~~
<Tengu> bhagyashri|rover: there's a patch on-going, but it has some issues in CI https://review.opendev.org/713204
~~~
* recently 'periodic-tripleo-ci-build-containers-ubi-8-push' failed to push the containers because of a **connection refused** issue, but it passed on testproject: https://review.rdoproject.org/r/#/c/29227/
~~~
noop SUCCESS in 0s
periodic-tripleo-ci-build-containers-ubi-8-push SUCCESS in 47m 53s
~~~
**c8 train** : is good today :) - passed all jobs except scenario010-ovn-provider-standalone-train
### OSP
## Sept 9th
### TripleO
- gate is bad (rfolco looking)
### OSP