# Ruck and Rover notes #33
###### tags: `ruck_rover`

:::info
Important links for ruck rovers:
[ruck/rover links to help](https://hackmd.io/07z0xroHTFi2IbX93P5ZfQ)

**Ruck Rover - Unified Sprint #33**
Dates: Sep 9 - Sep 29
TripleO CI team ruck|rover: Folco/Bhageshris
OSP CI team ruck|rover: wznoinsk|ruck, TalmoR|ruck
:::

[TOC]

---

## on-going issues

:::danger
## TripleO
### gate
* Bug: standalone upgrade train failing: https://bugs.launchpad.net/tripleo/+bug/1896595
    * Note: the job is non-voting, so this is not a high priority.

### periodic / 3rd party
- master
    - Bug: https://bugs.launchpad.net/tripleo/+bug/1897863 (not a promotion blocker)
        - https://review.opendev.org/#/c/755252/ (Add support for rdo openvswitch layered upgrade special treatment.)
    - All jobs are **GREEN** except the standalone upgrade master job.
- train
    - We are getting NODE_FAILURE on periodic-tripleo-centos-8-train-promote-promoted-components-to-tripleo-ci-testing; waiting for the next run.
- ussuri
    - All jobs are **GREEN** except the standalone upgrade ussuri job.
- stein: @bhagyashris @rfolco please focus on stein upgrade jobs.
    * UPGRADE JOBS voting and failing
        - undercloud upgrade: https://bugs.launchpad.net/tripleo/+bug/1896248
        - standalone upgrade: https://bugs.launchpad.net/tripleo/+bug/1896537
            - https://zuul.openstack.org/builds?job_name=tripleo-ci-centos-7-standalone-upgrade-stein
- rocky

## OSP
:::

---

## Sept 30th
### TripleO
* gate
* periodic / 3rd party

**master**
- Bug: https://bugs.launchpad.net/tripleo/+bug/1897863 (not a promotion blocker)
    - https://review.opendev.org/#/c/755252/ (Add support for rdo openvswitch layered upgrade special treatment.)
- All jobs are **GREEN** except standalone upgrade master.

**ussuri**
- All jobs are **GREEN** except the standalone upgrade ussuri job.

**train**
- We are getting NODE_FAILURE on periodic-tripleo-centos-8-train-promote-promoted-components-to-tripleo-ci-testing; waiting for the next run.
* component pipeline: master: https://bugs.launchpad.net/tripleo/+bug/1897947

### OSP

## Sept 29th
### TripleO
* gate
* periodic / 3rd party

**master**
- ~~https://review.opendev.org/#/c/754783/~~
- Note: Jobs will pass once we get the baremetal component promotion.

**ussuri**
* fs039 is failing because of a tempest test failure
    - https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-ussuri/d5b47d0/logs/undercloud/var/log/tempest/stestr_results.html.gz
    - Note: Will wait for the next run, as the failure is not consistent.

**train**
* **GREEN**

### OSP

## Sept 28th
### TripleO
* gate
* periodic / 3rd party

**master**
* Bug: https://bugs.launchpad.net/tripleo/+bug/1897505 (overcloud-prep-images fails at Introspect overcloud images (fs001, fs002 and fs039 are failing))
    - [DNM] Increase ironic conductor conf values https://review.opendev.org/754690
    - https://review.opendev.org/#/c/754783/
    - Testproject: https://review.rdoproject.org/r/29351

```
<bhagyashri|rover> hjensas, hi need one help : few CI jobs are failing on master because of https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master/0ca0cb9/logs/undercloud/var/log/containers/ironic-inspector/ironic-inspector.log.txt.gz
<bhagyashri|rover> it's basically failing at overcloud image introspection
<bhagyashri|rover> do you have any idea?hoonetorg hrybacki
<bhagyashri|rover> hjensas, https://bugs.launchpad.net/tripleo/+bug/1897505
<hjensas> bhagyashri|rover: hm, wonder if it could be that it is slow to transition state in the host cloud? Can you try playing with ironic.conf [conductor]power_state_change_timeout = 60 and [conductor]sync_power_state_interval = 60 ? Maby bump to 90 or 120 ?
```

**ussuri**
* **GREEN** except sc010-ovn
    - skiplist: https://review.rdoproject.org/r/29709

**train**
* **GREEN**

**stein**
* featureset002-stein-upload failing with (Connection to the 10.0.0.111 via SSH timed out.)
    - https://logserver.rdoproject.org/openstack-periodic-integration-stable3/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-stein-upload/f2d0f0c/logs/undercloud/home/zuul/tempest/tempest.html.gz

### OSP

## Sept 25th
### TripleO
* gate
* periodic / 3rd party

**master**
* **In the current run master is GREEN**
* Bug: https://bugs.launchpad.net/tripleo/+bug/1897213 ([inconsistent] scenario010-standalone-master is getting TIMED_OUT at Execute tempest test task)
    * Fix: https://review.rdoproject.org/r/29634

**ussuri**

**train**
* Bug: https://bugs.launchpad.net/tripleo/+bug/1897228
    * Fix: https://review.opendev.org/#/c/754281/
    * Testproject: https://review.rdoproject.org/r/#/c/29252/

### OSP

## Sept 24th
### TripleO
* Bug: https://bugs.launchpad.net/tripleo/+bug/1896917 (problem with some vexxhost hypervisor)
* gate
* periodic / 3rd party

**master**
- periodic-tripleo-ci-centos-8-scenario010-ovn-provider-standalone-master >> https://review.rdoproject.org/r/29619 should fix
- periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-master (will open a bug if it happens again)

**ussuri**
- https://bugs.launchpad.net/tripleo/+bug/1895705 scen10-ovn >> skiplist >> https://review.opendev.org/#/c/754198/

**train**
- ~~https://bugs.launchpad.net/tripleo/+bug/1895705 scen10-ovn~~ skiplist
- https://review.rdoproject.org/zuul/builds?job_name=tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039&branch=stable%2Ftrain# known?
- ~~<ykarel> rfolco|ruck, https://review.rdoproject.org/r/29617~~

**stein**
- several jobs are failing tempest tests, example: https://logserver.rdoproject.org/openstack-periodic-integration-stable3/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset010-stein/f8673fb/logs/undercloud/home/zuul/tempest/tempest.html.gz (no action taken yet)

### OSP

## Sept 23rd
### TripleO
* gate
    * Bug: https://bugs.launchpad.net/tripleo/+bug/1896738
        * ~~Fix: https://review.opendev.org/#/c/753564/~~
        * Fix: https://review.opendev.org/#/q/topic:linters-fix+(status:open+OR+status:merged)
        * ~~Most of the failures were POST_FAILURE, caused by an infra issue that has been fixed. https://review.opendev.org/#/c/753498/1~~
* periodic / 3rd party

**master**

**ussuri**

**train**
* **GREEN**

**stein**

### OSP

## Sept 22nd
### TripleO
* gate
    * Bug: https://bugs.launchpad.net/tripleo/+bug/1896595 ([c7 train] ConflictException: 409: Client Error for url: http://192.168.24.1:9696/v2.0/networks.json, Unable to create the flat network. Physical network datacentre is in use; failing standalone upgrade train)
* periodic / 3rd party

**master**

**ussuri**

**train**
* **GREEN** :)

**stein**
* Bug: https://bugs.launchpad.net/tripleo/+bug/1896537
* Summary:
    1. It might be similar to https://bugs.launchpad.net/tripleo/+bug/1895822 (standalone-upgrade ussuri)
    2. In both cases it passes deployment & upgrade OK, but something seems to happen during the upgrade that kills connectivity
    3. We have a cix on that one, so a few folks have looked into it, but we don't have a root cause yet

### OSP

## Sept 21st
### TripleO
* gate
    * ~~Bug: https://bugs.launchpad.net/tripleo/+bug/1896439~~
        * ~~Fix: https://review.opendev.org/#/c/752881/~~
        * ~~Fix: https://review.opendev.org/#/c/752902/~~
    * Bug: https://bugs.launchpad.net/tripleo/+bug/1896469
        * Fix: https://review.opendev.org/#/c/752908
* periodic / 3rd party

**master**
* ~~Bug: https://bugs.launchpad.net/tripleo/+bug/1896469~~
    * ~~master pipeline got hit by RETRY_LIMIT because of this change: https://review.rdoproject.org/r/#/c/29009/ (Update rdo-openvswitch to support ovs/ovn2.13 NFV SIG builds)~~
    * ~~Fix: https://review.rdoproject.org/r/29542 (Revert "Update rdo-openvswitch to support ovs/ovn2.13 NFV SIG builds")~~

~~~
<bhagyashri|rover> ykarel, hi master jobs are getting into retry limit https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-master/8864213/job-output.txt
<bhagyashri|rover> ykarel, looks like this is https://review.rdoproject.org/r/#/c/29009/ hitting retry limit
<bhagyashri|rover> amoralej, ^^
<amoralej> bhagyashri|rover, where?
<bhagyashri|rover> amoralej, https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-master/8864213/job-output.txt
<amoralej> mmm, it's missing nfv repo
<amoralej> ykarel, ^
<amoralej> oh, the multinode part
<amoralej> grrr
<amoralej> but we tested it in multinode, iirc
<ykarel> amoralej, bhagyashri|rover looking
<ykarel> amoralej, uhh so we tested it but at that point of job master deps repo was used
<amoralej> ykarel, let's revert
~~~

**ussuri**
* Same issue as master

**c8 train**
* :green_heart::green_heart::green_heart:

**stein**
* Bug: https://bugs.launchpad.net/tripleo/+bug/1896220
* Bug: https://bugs.launchpad.net/tripleo/+bug/1896537

## Sept 18th
### TripleO
* gate
    * [LP: 1896178](https://bugs.launchpad.net/tripleo/+bug/1896178) ([master][train] nothing provides libmysqlclient.so.21()(64bit) needed by collectd-mysql-5.11.0-2.el8.x86_64)
      Fix: https://review.opendev.org/#/c/752621/ (Temporarily drop mysql)
      Note: Please check the details in the bug's comment section.
* periodic / 3rd party

**master**
* Blocker: [LP: 1896178](https://bugs.launchpad.net/tripleo/+bug/1896178)
  ~~Fix: https://review.rdoproject.org/r/#/c/29472/ (Add temporary hotfix repo for CentOS 8 issue with myslq module)~~
  Related patches:
    * ~~https://review.rdoproject.org/r/#/c/29458/ (Use mariadb connector instead of mysql)~~
    * ~~Note: Test patch also passed https://review.rdoproject.org/r/#/c/29227/~~

**ussuri**
* **GREEN** :) :green_heart::green_heart::green_heart:

**c8 train**
* Blocker: [LP: 1896178](https://bugs.launchpad.net/tripleo/+bug/1896178) Same as the master one.
* ~~test_listener_CRUD (scen10-ovn)~~
    * ~~BUG: https://bugs.launchpad.net/tripleo/+bug/1895705~~ (bug still open, tests on skiplist)
    * ~~skiplist: https://review.opendev.org/751162 [stein|train] Add octavia tests to skip list **(Merged)**~~

**stein**
* Bug: https://bugs.launchpad.net/tripleo/+bug/1896220
* fs002, fs010 and fs030 are failing because of a memory issue. They will pass in the next run.
- ~~fs037 prepare-parameter:~~
    - ~~FIX: https://review.opendev.org/#/c/751340/ **(Merged)**~~
    - ~~testproject: https://review.rdoproject.org/r/#/c/29297/ (Passing)~~
- ~~fs038 tempest octavia:~~
    - ~~BUG: https://bugs.launchpad.net/tripleo/+bug/1895248~~
    - ~~Fix: https://review.opendev.org/#/c/751162/ **(Merged)**~~

### OSP

## Sept 17th
### TripleO
https://bugs.launchpad.net/tripleo/+bug/1896126 >> tempest workspace

* gate
    * ~~https://review.opendev.org/#/c/752187/ (Increase configure_swap_size to 4096)~~
    * ~~https://review.opendev.org/#/c/752129/ (Inject both paths for validations roles location) >> ussuri~~
    * ~~https://review.opendev.org/#/c/752197/ (Inject both paths for validations roles location) >> train~~
* periodic / 3rd party

**master**
* ~~https://bugs.launchpad.net/tripleo/+bug/1895792 (public endpoint for compute service not found is getting TIMED_OUT on fs002, fs020 and fs035)~~
* ~~fix: https://review.opendev.org/#/c/752199/1 (Run some more featuresets with baremetal provisioning)~~
* ~~https://bugs.launchpad.net/tripleo/+bug/1874418 ((inconsistent) periodic centos8 fs39 fails sometimes with Error, some other host (FA:16:3E:B5:60:2A) already uses address 10.0.0.1)~~
* ~~https://review.opendev.org/#/c/752370/ (Enable undercloud nova for fs020) got +w on it~~.
  Testproject patch: https://review.rdoproject.org/r/#/c/29351/ (GREEN)
* ~~in the recent run the build-containers-ubi-8-push job failed here: https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-build-containers-ubi-8-push/4f35982/logs/build.log~~
* ~~**Note:** on the testproject patch it passed https://review.rdoproject.org/r/#/c/29227/, so waiting for the next run.~~

**ussuri**
* featureset039-ussuri is failing because of https://bugs.launchpad.net/tripleo/+bug/1874418 ((inconsistent) periodic centos8 fs39 fails sometimes with Error, some other host (FA:16:3E:B5:60:2A) already uses address 10.0.0.1), though this is not a promotion blocker.

**Rest all GREEN** :)

**c8 train**
* **GREEN** :)

## Sept 16th
### TripleO
* gate
    * https://review.opendev.org/#/c/752187/ (Increase configure_swap_size to 4096)
    * https://review.opendev.org/#/c/752129/ (Inject both paths for validations roles location) >> ussuri
    * https://review.opendev.org/#/c/752197/ (Inject both paths for validations roles location) >> train
    * ~~https://review.opendev.org/#/c/751653/ (Inject both paths for validations roles location)~~
    * ~~https://review.opendev.org/#/c/751828/ (Increase configure_swap_size to 4096)~~
    * ~~https://review.opendev.org/#/c/751930/ (skip test_snapshot_pattern in master)~~
    * ~~https://review.opendev.org/#/c/750967/ (Fix duplicated declaration when the deprecated parameter is used) This will fix the sc12 issue~~
* periodic / 3rd party

**master**
* https://bugs.launchpad.net/tripleo/+bug/1895792 (public endpoint for compute service not found is getting TIMED_OUT on fs002, fs020 and fs035)
    * here is the fix: https://review.opendev.org/#/c/752199/1 (Run some more featuresets with baremetal provisioning)
* https://bugs.launchpad.net/tripleo/+bug/1874418 ((inconsistent) periodic centos8 fs39 fails sometimes with Error, some other host (FA:16:3E:B5:60:2A) already uses address 10.0.0.1)
* ~~https://bugs.launchpad.net/tripleo/+bug/1895828 (tempest.lib.exceptions.UnexpectedResponseCode: Unexpected response code received is failing on scenario010-standalone-master)~~
    * Note: this issue is not seen on the testproject run https://review.rdoproject.org/r/#/c/29351/; will wait for the next run
- ~~https://bugs.launchpad.net/tripleo/+bug/1895580 'TagFloatingIpTestJSON' object has no attribute 'assertItemsEqual'~~
- ~~https://review.opendev.org/#/c/750369/~~
- ~~https://review.opendev.org/#/c/747760/~~
- ~~https://review.opendev.org/#/c/751244/ (Revert "Change path for validation Ansible files")~~

**ussuri**
* https://review.opendev.org/#/c/752187/ (Increase configure_swap_size to 4096)
* https://review.opendev.org/#/c/752129/ (Inject both paths for validations roles location)

**Note: ussuri promotion needs the above two patches to merge soon.**

**c8 train**
* ~~privilege escalation (undercloud-containers)~~
    * ~~BUG: https://bugs.launchpad.net/tripleo/+bug/1895700 **Note:** In the recent run this issue did not appear and the job is GREEN. Will wait for the next run result to reconfirm.~~
* fs039 is in RETRY_LIMIT: Quota exceeded, too many key pairs.
  ~~https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train/381071f/job-output.txt~~
  ~~Note: @weshayutin will need to delete stale ports on rdo-cloud and vexxhost~~

**stein**

### OSP

## Sept 15th
### TripleO
* gate
    * https://bugs.launchpad.net/tripleo/+bug/1895601 (tripleo standalone centos-8 tempest fail: estSnapshotPattern:_run_cleanups): 500 DELETE http://192.168.24.3:9292/v2/images)
      ~~https://review.opendev.org/#/c/751930/ (skip test_snapshot_pattern in master)~~
    * **List of patches that need to merge to clear gate failures**:
        * ~~https://review.opendev.org/#/c/751653/ (Inject both paths for validations roles location)~~
        * ~~https://review.opendev.org/#/c/751828/ (Increase configure_swap_size to 4096)~~
        * ~~https://review.opendev.org/#/c/751930/ (skip test_snapshot_pattern in master)~~
        * https://review.opendev.org/#/c/751771/ (Add the variable list_skipped_job)
        * https://review.opendev.org/#/c/750967/ (Fix duplicated declaration when the deprecated parameter is used)
* periodic / 3rd party

**master**
* tempest neutron plugin
    * ~~BUG: https://bugs.launchpad.net/tripleo/+bug/1895580 'TagFloatingIpTestJSON' object has no attribute 'assertItemsEqual'~~

**ussuri**
* "couldn't resolve module/action 'warn'"
    * BUG: https://bugs.launchpad.net/tripleo/+bug/1895507
    * FIX: https://review.opendev.org/#/c/751653/

**c8 train**
* test_listener_CRUD (scen10-ovn)
    * BUG: https://bugs.launchpad.net/tripleo/+bug/1895705
* ~~privilege escalation (undercloud-containers)~~
    * ~~BUG: https://bugs.launchpad.net/tripleo/+bug/1895700~~
* memory (fs001, fs002, fs020, fs035, fs039)
    * FIX: https://review.rdoproject.org/r/#/c/29335/ (Increase configure_swap_size to 4096 for ovb jobs)
        * Related-Bug: #1895290
        * Related-Bug: #1895288

**stein**
* container-prepare-parameter.yaml
    * FIX: https://review.opendev.org/#/c/751340/ (**STILL UNRESOLVED**)
    * testproject: https://review.rdoproject.org/r/#/c/29297/

### OSP

## Sept 14th
### TripleO
* gate (clear)
    - ~~https://bugs.launchpad.net/tripleo/+bug/1895027 scen12~~
    - ~~https://review.opendev.org/#/c/750980 lower constraints~~
    - ~~https://bugs.launchpad.net/tripleo/+bug/1895056~~
* periodic / 3rd party

**master**
* ~~**NEW!!** https://bugs.launchpad.net/tripleo/+bug/1895580 'TagFloatingIpTestJSON' object has no attribute 'assertItemsEqual'~~
* https://bugs.launchpad.net/tripleo/+bug/1895507 (Standalone jobs failing with "couldn't resolve module/action 'warn'")
    * https://review.opendev.org/#/c/751653/ (Inject both paths for validations roles location)
    * https://review.opendev.org/#/c/751766/ (Revert "Remove objects migrated to validations-common")

**c8 train**
* https://bugs.launchpad.net/tripleo/+bug/1895290 ([train] ERROR root [ ] Image prepare failed: A process in the process pool was terminated abruptly while the future was running or pending failing on few jobs)
    * here is the fix: https://review.rdoproject.org/r/#/c/29313/ (Increase configure_swap_size to 4096)
    * testing here: https://review.rdoproject.org/r/#/c/28452/ , https://review.rdoproject.org/r/#/c/29254/

### OSP

## Sept 11th
### TripleO
* gate
* periodic / 3rd party

**master**
* master jobs are still failing with https://bugs.launchpad.net/tripleo/+bug/1895056 (package conflict, openstack-tripleo-validations-12.4.1-0.20200909044505.ae20b37.el8.noarch and validations-common-1.1.1-0.20200908154415.946e3a8.el8.noarch)
    * https://review.opendev.org/#/c/751244/ (Revert "Change path for validation Ansible files") Hopefully this will solve the problem.

**ussuri**
* standalone-on-multinode-ipa-ussuri: failed at Execute tempest test:
  https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-on-multinode-ipa-ussuri/315bb39/job-output.txt
  https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-on-multinode-ipa-ussuri/33d2ced/logs/undercloud/var/log/tempest/stestr_results.html.gz
* scenario010-standalone-ussuri: failed at Execute tempest: https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario010-standalone-ussuri/d843988/logs/undercloud/var/log/tempest/stestr_results.html.gz
* **Note:** Ran both jobs in testproject: https://review.rdoproject.org/r/#/c/29250/ (standalone-on-multinode-ipa is still failing and scenario010-standalone passed)

**c8 train**
* https://bugs.launchpad.net/tripleo/+bug/1895288 ([train] MemoryError: [tripleotraincentos8/centos-binary-nova-libvirt] Memory Error failing on few jobs)
* https://bugs.launchpad.net/tripleo/+bug/1895290 ([train] ERROR root [ ] Image prepare failed: A process in the process pool was terminated abruptly while the future was running or pending failing on few jobs)

**stein**
* https://bugs.launchpad.net/tripleo/+bug/1895248 ([stein] (LoadBalancerScenarioTest:test_load_balancer_ipv4_CRUD) show_loadbalancer provisioning_status updated to an invalid state of ERROR is failing on fs038)
    * https://review.opendev.org/#/c/751162/ (Add octavia test to skip list)
* fs037 is blocking promotion: https://bugs.launchpad.net/tripleo/+bug/1895314

### OSP

## Sept 10th
### TripleO
(rfolco) cix cards updated, just do a quick pass before Monday cix call

* gate
    * https://bugs.launchpad.net/tripleo/+bug/1895005 (Tempest failure: cloud overcloud is not found stable/train)
    * https://bugs.launchpad.net/tripleo/+bug/1895138 (centos-8 standalone-upgrade-ussuri fails build-test-packages issue creating /root/DLRN); marios is working on it.
* RDO CI

**master**
* master promotion has been blocked by this issue: https://bugs.launchpad.net/tripleo/+bug/1895056 (package conflict, openstack-tripleo-validations-12.4.1-0.20200909044505.ae20b37.el8.noarch and validations-common-1.1.1-0.20200908154415.946e3a8.el8.noarch)
  The issue appeared after the patches below merged:
    * https://review.opendev.org/#/c/750369/1
    * https://review.opendev.org/#/c/747760/

~~~
<Tengu> bhagyashri|rover: there's a patch on-going, but it has some issues in CI https://review.opendev.org/713204
~~~

* recently 'periodic-tripleo-ci-build-containers-ubi-8-push' failed to push the containers because of a **connection refused** issue, but it passed on testproject: https://review.rdoproject.org/r/#/c/29227/

~~~
noop SUCCESS in 0s
periodic-tripleo-ci-build-containers-ubi-8-push SUCCESS in 47m 53s
~~~

**c8 train**: is good today :)
- passed all jobs except scenario010-ovn-provider-standalone-train

### OSP

## Sept 9th
### TripleO
- gate is bad (rfolco looking)

### OSP