# Ruck and rover notes #28
###### tags: `ruck_rover`

:::info
Important links for ruck/rovers:
[ruck/rover links to help](https://hackmd.io/07z0xroHTFi2IbX93P5ZfQ)

**Ruck Rover - Unified Sprint 28**
Dates: May 28 - June 17
Tripleo CI team ruck|rover: Folco (rfolco) / Pooja (pojadhav)
OSP CI team ruck|rover: Vadim (vgriner), Waldemar (wznoinsk)
Previous notes: https://hackmd.io/2MdkNAUuT7aBcM0Yck4xnw
**Next #29 notes**: https://hackmd.io/XcuH2OIVTMiuxyrqSF6ocw
:::

[TOC]

---

## on-going issues

:::danger
## TripleO
https://bugs.launchpad.net/tripleo/+bug/1883430
https://bugs.launchpad.net/tripleo/+bug/1883439

### gate
* ~~scen003: (designate) https://launchpad.net/bugs/1883692~~
* ussuri: Disable Designate service for scenario 03 https://review.opendev.org/736018

https://bugs.launchpad.net/tripleo/+bug/1883909
https://review.opendev.org/#/c/736183/
https://bugs.launchpad.net/tripleo/+bug/1883910

### RDO CI
* ussuri:
    * full-tempest-scenario: failing in different tests
    * all ovb: timeout/node failure
* master keeps failing same jobs, see history

## OSP
* OSP17 still without attention (!) because of fires in OSP<16
* outage of tlv labs is over
    * jenkins **back online**
    * queue is big again after not running since yesterday (expected)
* there are **issues with jenkins failing to connect to tlv located slaves**
    * affects **qe-generic-tlv-01..03** slaves
    * affects seal slaves (eg seal47 used as extra hw for phase1, non-critical)
    * it shows abort/cancellation + ioexceptions in log as http://pastebin.test.redhat.com/876490
    * seems that the issue is network related, although our tests (ping/mtu/stability) have so far come up empty (everything seems to be working as usual)
    * wiping jar cache, as reconfiguring slaves in jenkins had no effect
:::

---

:::info
add dates in descending order so the latest date is at the top. Break out TripleO and OSP sections.
:::

### Reviews / Fixes
::: spoiler PATCHES
1. ~~https://review.opendev.org/734112 Fix image_sanity check~~
1. 
   ~~https://review.opendev.org/733699 Fix periodic condition - sanity~~
1. ~~https://review.opendev.org/#/c/730763/ train image build nv~~
1. https://review.opendev.org/733676 cirros 0.5.1 by default
2. https://review.opendev.org/#/c/733170 enable networksecgrouptest
3. ~~https://review.opendev.org/732420 ipv6 skip list~~
4. ~~https://review.opendev.org/#/c/733114 pin dib~~
5. ~~https://review.opendev.org/732618 fix c8 image builds~~
6. ~~https://review.rdoproject.org/r/27724 fix fs035 train timeouts~~
7. ~~https://review.rdoproject.org/r/#/c/27901 scen10 == fs062~~
8. ~~https://review.opendev.org/#/c/732464 scenario010 nv~~
9. ~~https://review.rdoproject.org/r/#/c/27845/ fix image sanity in ~~
1. https://review.opendev.org/#/c/733659/ py3 c7
:::

### Launchpad Bugs Reported
:::spoiler BUGS
| Launchpad bug | Name | Status | Review |
| ------------- | ---- | ------ | ------ |
| [1878190](https://bugs.launchpad.net/tripleo/+bug/1878190) | periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-master job is consistently failing because some tempest tests are failing | Triaged | [727192](https://review.opendev.org/#/c/727192/) |

Bugs w/ CI tags (ci, alert, promotion-blocker) https://tinyurl.com/ycnkznfh
:::

## June 15th
### TripleO
* train
    * scen10
    * fs020
* master
    * scen10-ovn
    * tempest-skipped
    * fs020

## June 12th
### OSP
* rhos-qe-jenkins queue is too big (>200 jobs)
* OSP16
    * RHOS-16.1-RHEL-8-20200610.n.0 promoted phase2, phase3 started
    * there is new RHOS-16.1-RHEL-8-20200611.n.0
        * two DFG-octavia jobs failed, tvignaud already retriggered them
        * they failed somewhere in OC deploy (not investigated, not doing so now/today-friday)

## June 11th
### Tripleo
* centos-8 build push failed with container build failed (octavia-base) https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-master-containers-build-push/91e8f44/logs/build.log
* 
  periodic-tripleo-ci-centos-8-scenario000-multinode-oooq-container-updates-master https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario000-multinode-oooq-container-updates-master/3fa8685/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
* Load balancer issue still exists - sc10 train job https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-24hr&job_name=periodic-tripleo-ci-centos-7-scenario010-standalone-train
* fs20 train job failing consistently with inconsistent results
  https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-24hr&job_name=periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-train
  https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-train/092767e/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
  https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-train/08d95fb/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
* Rocky-container-build-push job inconsistent https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-centos-7-rocky-containers-build-push
* Rocky-fs02-upload also inconsistent https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-rocky-upload

### OSP
* OSP13z12 some p3 still in progress (seems some reruns too)
* OSP16.1 20200610.n.0 compose passed p1, has multiple failures in p2 (single job so far)
    * https://projects.engineering.redhat.com/browse/RHOSINFRA-3315 (rarely happening flaky, likely we understand it now, will attempt a fix)
    * https://projects.engineering.redhat.com/browse/RHOSINFRA-3266 (long standing flaky, expected to be solved by the rhel-8.2 upgrade of slaves)
* psedlak: all phase2 jobs passed
    * after individual rerun due to the issues
      above
    * so manual rerun of phase2-multijob with REEVALUATE+PROMOTE option is needed to promote/trigger p3
    * but holding back with promotion:
        * p3 multijobs atm have throttling limit 36 hours (can run again ~1am friday utc, also this will be dropped/changed in future)
        * a lot of p3 is still in progress for previous compose (and at least some rely on passed_phase2 symlink atm, to be fixed by improving how UMB triggering works RHOSINFRA-3485)
        * also there are still 150 jobs in the queue (it currently affects gates and such too)
    * i plan to trigger promoting+reevaluation on friday morning brq time

## June 10th
### TripleO
* https://review.rdoproject.org/r/#/c/28041 merged (image issues)

## June 9th
### Tripleo
* (rfolco) focusing on **overcloud image missing files**
    * https://review.rdoproject.org/r/#/c/27986/
* container build push consistently failing : https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-master&job_name=periodic-tripleo-centos-8-master-containers-build-push
* ansible-pacemaker failing (promotion-blocker) : reported bug : https://bugs.launchpad.net/tripleo/+bug/1882664
  https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/a3b0702/build.log

### OSP
* new composes for OSP13 and 16.1 - p1/2 in progress, check results on wednesday

## June 8th
### Tripleo
* reported bug for not reporting to dlrnapi : https://bugs.launchpad.net/tripleo/+bug/1882534
  Fix is merged : https://review.rdoproject.org/r/#/c/27977/

## June 5th
### Tripleo
~~c7 py2 jobs broken >> https://review.opendev.org/#/c/726579~~ REVERTED
ussuri container build >> https://review.opendev.org/#/c/733790

#### master
* scen10-ovn
* full-tempest-scenario
    * ipv6 skip list: https://review.opendev.org/#/c/732420/
* full-tempest-api
    * tempest.api.compute.servers.test_delete_server.DeleteServersTestJSON
    * JUST ONCE, watching more runs: https://review.rdoproject.org/r/27964
* fs039:
    * 
      https://bugs.launchpad.net/tripleo/+bug/1875353
* fs001:
    * /etc/pki/tls/private does not exist
    * https://bugs.launchpad.net/tripleo/+bug/1879766
* fs020:
    * pacemaker https://bugs.launchpad.net/tripleo/+bug/1867602
    * tempest failures: ipv6, TestNetworkAdvancedServerOps
* fs030:
    * timeouts (deploy)
* fs035:
    * INCONSISTENT
    * timeout
    * /etc/pki/tls/private does not exist
    * https://bugs.launchpad.net/tripleo/+bug/1879766

#### ussuri
* image build
* container build

##### details
* tripleo-buildimage-overcloud-full-centos-8-ussuri and tripleo-buildimage-overcloud-full-centos-8 need attention.
  https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-buildimage-overcloud-full-centos-8-ussuri
  https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-buildimage-overcloud-full-centos-8
  https://logserver.rdoproject.org/openstack-periodic-latest-released/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-ussuri/4a764e7/build.log
    * a ghost of https://bugs.launchpad.net/tripleo/+bug/1879767 ???? ``` Failed to open connection to "system" message bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory .... 
2020-06-05 09:28:47.339 | Installing : ansible-pacemaker-1.0.4-0.20200526160932.5847167 304/317Error unpacking rpm package ansible-pacemaker-1.0.4-0.20200526160932.5847167.el8.noarch 2020-06-05 09:28:47.345 | 2020-06-05 09:28:47.346 | Installing : crudini-0.9.3-1.el8.noarch 305/317 2020-06-05 09:28:47.346 | error: unpacking of archive failed on file /usr/share/ansible/plugins/modules/pacemaker_cluster.py;5eda101c: cpio: open failed - Inappropriate ioctl for device 2020-06-05 09:28:47.346 | error: ansible-pacemaker-1.0.4-0.20200526160932.5847167.el8.noarch: install failed 2020-06-05 09:28:47.346 | ```
* periodic-tripleo-centos-8-ussuri-containers-build-push job consistently failing with "failed to build containers"
  https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-centos-8-ussuri-containers-build-push&pipeline=%09openstack-periodic-latest-released
  https://logserver.rdoproject.org/openstack-periodic-latest-released/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-ussuri-containers-build-push/ccc1d41/logs/build.log
  Reported this issue : https://bugs.launchpad.net/tripleo/+bug/1882246
  ~~Investigate and upload a fix : https://review.opendev.org/#/c/733820/ Testing the job here : https://review.rdoproject.org/r/#/c/27963/~~
  Fix: https://review.opendev.org/#/c/733790
  testproject: https://review.rdoproject.org/r/27966

### OSP
* OSP17 still without attention (!) because of fires in OSP<16
* foreign jobs still invading p1/p2 views
    * not yet tested https://code.engineering.redhat.com/gerrit/#/c/198375
    * psedlak: need to get info about this from fhubik (as there is new cleaned up proposal of octavia jobs, do we still need this?)
* OSP16.1
    * two octavia jobs failed on 'infrared cloud-config' plugin not recognizing 16.1 as version of choice
        * fasttracked https://review.gerrithub.io/c/rhos-infra/cloud-config/+/494993
        * now in tempest stage so cloud-config issue resolved
* from yesterday, still to follow up:
    * **osp13 two red** (one packstack one ospd)
        * packstack cleared
        * one ospd timeouts in OC deploy, likely already cixed https://trello.com/c/aL9jyT9A
    * **osp15 RED phase1, RED/Yellow octavia** in phase2
        * latest 15 build seems old RHOS_TRUNK-15.0-RHEL-8-20200520.n.0
        * so maybe not new issues, but i do not see these in CIX board
        * job status is 15 or 6 days old, so just safety reruns exposing infra issue and not a product one (but i do not see them passed for this puddle in history)
        * investigation/rerun definitely needed (but priority of other osp?)

## June 4th
### Tripleo
* **Upstream Gate**: SC10-standalone failed consistently : https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-8-scenario010-standalone
  inconsistent failures : latest failure while deploy and last 2 failures are related to "async task did not complete within the requested time - 5700s"
* **RDO CI Failures**:

#### master
* fs001 (promotion blocker for master): /etc/pki/tls/private does not exist https://bugs.launchpad.net/tripleo/+bug/1879766
  https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-master&job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master

### OSP
* OSP17 still without attention (!) because of fires in OSP<16
* foreign jobs still invading p1/p2 views
    * not yet tested https://code.engineering.redhat.com/gerrit/#/c/198375
* psedlak: what is the overall status? (once we sync up keep just the one with issues)
    * osp10 all blue
    * osp12 tab is empty (should be removed?)
    * *tkorol*: **osp13 two red** (one packstack one ospd)
    * osp14 empty p1/p2 section
    * *psedlak*: **osp15 RED phase1, RED/Yellow octavia** in phase2
        * latest 15 build seems old RHOS_TRUNK-15.0-RHEL-8-20200520.n.0
        * so maybe not new issues, but i do not see these in CIX board
        * job status is 15 or 6 days old, so just safety reruns exposing infra issue and not a product one (but i do not see them passed for this puddle in history)
        * investigation/rerun definitely needed (but priority of other osp?)
    * osp16.0 all blue (>2weeks)
    * osp16.1 blue
    * osp17 phase1 is RED, p2 not run yet
* [puddle-status](https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/util-monitor-puddle-symlinks/41023/console) indicates only 16.1-p2 and 17 not promoted?
* [infra-monitor-job](https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/util-monitor-infra/97107/console) is having quite a few issues, but they are not new and seem to have been there for some time (maybe related to pre-testing tlv2 slaves move?)

## June 3rd
### Tripleo
* Fix is up for the issue reported https://bugs.launchpad.net/tripleo/+bug/1881732
  Fix Patch is here : https://review.opendev.org/#/c/733114/
  Test Patch for the same : https://review.rdoproject.org/r/#/c/27914/

#### RDO
promoted train and ussuri

##### master
* scenario010-ovn-provider-standalone-master
    * **ignoring**
* full-tempest master:
    * ipv6 tests https://bugs.launchpad.net/tripleo/+bug/1881624
    * **rechecked** https://review.opendev.org/732420 add ipv6 hotplug tests to skip list
* fs039:
    * FileNotFoundError: [Errno 2] No such file or directory: '/etc/sssd/sssd.conf'
    * FileNotFoundError: [Errno 2] No such file or directory: '/var/lib/ipa-client/sysrestore/...'
    * https://bugs.launchpad.net/tripleo/+bug/1875353
* fs001:
    * /etc/pki/tls/private does not exist
    * https://bugs.launchpad.net/tripleo/+bug/1879766
* fs020:
    * pacemaker https://bugs.launchpad.net/tripleo/+bug/1867602
    * tempest failures: ipv6, TestNetworkAdvancedServerOps
* fs030:
    * timeouts (deploy)
* fs035:
    * /etc/pki/tls/private does not exist
    * https://bugs.launchpad.net/tripleo/+bug/1879766

## June 2nd
### Tripleo
* tripleo-buildimage-centos7 jobs failing due to python version, as python2 support was removed from diskimage-builder. Reported a bug, https://bugs.launchpad.net/tripleo/+bug/1881732
* featureset001 failing on master due to this issue (https://bugs.launchpad.net/tripleo/+bug/1879766)
  Job periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master is in the promotion criteria and consistently failing.
  https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master
* periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039-stein failing consistently with all nodes unreachable.
  https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-24hr%20&job_name=periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039-stein
  https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039-stein/3cef6f2/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

## June 1st
### Tripleo
scenario004 / 001 master busted by https://bugs.launchpad.net/tripleo/+bug/1881670 - revert up

scenario010 https://review.opendev.org/#/c/732464/1 - moving to non-voting until fixed

manually promoted ussuri.. -> taking it out of loop on promoter server as it's busted.
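
The build-history links throughout these notes share one shape: the RDO Zuul `builds` page filtered by `job_name` and, optionally, `pipeline`. A minimal sketch, assuming only that URL shape (the helper name `zuul_builds_url` is hypothetical and not part of any Zuul tooling), builds such a link and trims whitespace that can sneak in when a pipeline name is pasted by hand:

```python
from urllib.parse import urlencode

# Base of the RDO Zuul build-history page, copied from the links in these notes.
RDO_ZUUL_BUILDS = "https://review.rdoproject.org/zuul/builds"


def zuul_builds_url(job_name, pipeline=None, base=RDO_ZUUL_BUILDS):
    """Return a build-history URL for one job, optionally filtered by pipeline.

    Hypothetical helper: it only assembles the query string seen in the notes.
    """
    params = []
    if pipeline:
        # Drop pasted tabs/spaces so they are not encoded as %09 or %20.
        params.append(("pipeline", pipeline.strip()))
    params.append(("job_name", job_name.strip()))
    return base + "?" + urlencode(params)
```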
--- rfolco notes:

testproject
* testing image_sanity on https://review.rdoproject.org/r/#/c/27845/
* timeout increase for fs035 on https://review.rdoproject.org/r/27724
* fs020 train w/ new cirros on https://review.rdoproject.org/r/#/c/27878/
* scen010 ovn w/ new cirros on https://review.rdoproject.org/r/#/c/27880/

master
* full-tempest master:
    * https://bugs.launchpad.net/tripleo/+bug/1881624
    * https://review.opendev.org/732420 add ipv6 hotplug tests to skip list
* fs039, fs001:
    * watch (inconsistent results)
* fs020:
    * inconsistent results, tempest failures
    * hit pcsd bug https://bugs.launchpad.net/tripleo/+bug/1867602
* fs030:
    * timeout on last run (mostly green)
* fs035:
    * /etc/pki/tls/private does not exist (seen more than once)
    * tempest (ipv6) - https://bugs.launchpad.net/tripleo/+bug/1881624
    * https://github.com/cirros-dev/cirros/issues/58
* scen10 ovn:
    * https://review.rdoproject.org/r/27880 Test scen10 ovn master
    * consistently failing on https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario010-ovn-provider-standalone-master/fb1373f/logs/undercloud/var/log/tempest/stestr_results.html.gz
    * while scen10 test is green https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario010-standalone-master/56fb6b8/logs/undercloud/var/log/tempest/stestr_results.html.gz

train
* scen10 train: https://bugs.launchpad.net/tripleo/+bug/1881584
* fs001, fs002, fs035: timeout
    * https://tinyurl.com/ybcv8wp9
* scen004:
    * still failing all the time - https://bugs.launchpad.net/tripleo/+bug/1879292

ussuri *(mostly green except for)*
* fs020: https://bugs.launchpad.net/tripleo/+bug/1881642

---

* RDO CI Failures
    * **periodic-tripleo-ci-centos-8-scenario010-ovn-provider-standalone-master** failing consistently with tempest.
      https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-master&job_name=periodic-tripleo-ci-centos-8-scenario010-ovn-provider-standalone-master
      Job is under promotion criteria but commented in [1].
      [1] https://github.com/rdo-infra/ci-config/blob/master/ci-scripts/dlrnapi_promoter/config/CentOS-8/master.ini#L33
    * **periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-master** failing consistently with tempest tests.
      https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-master&job_name=periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-master
      https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-master/8bd10fb/logs/undercloud/var/log/tempest/tempest_run.log.txt.gz
    * **periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-ussuri** recently failed with 1 tempest test failure. The same issue was reported for FS20 earlier. This job is in promotion criteria.
      https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-ussuri&pipeline=openstack-periodic-latest-released
      https://bugs.launchpad.net/tripleo/+bug/1744907
    * **periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train** is failing consistently with "Overcloud configuration failed".
      https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-24hr&job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train
    * **periodic-tripleo-ci-centos-7-scenario010-standalone-train** failing consistently. ~~~ 2020-06-01 05:53:08.194423 | primary | TASK [os_tempest : Ensure private network exists] ****************************** 2020-06-01 05:53:08.194501 | primary | Monday 01 June 2020 05:53:08 +0000 (0:00:00.100) 0:39:46.330 *********** 2020-06-01 05:53:11.920072 | primary | FAILED - RETRYING: Ensure private network exists (5 retries left). 
2020-06-01 05:53:24.624402 | primary | FAILED - RETRYING: Ensure private network exists (4 retries left). 2020-06-01 05:53:37.223961 | primary | FAILED - RETRYING: Ensure private network exists (3 retries left). 2020-06-01 05:53:49.816758 | primary | FAILED - RETRYING: Ensure private network exists (2 retries left). 2020-06-01 05:54:02.321538 | primary | FAILED - RETRYING: Ensure private network exists (1 retries left). 2020-06-01 05:54:14.724326 | primary | fatal: [undercloud -> 127.0.0.2]: FAILED! => { 2020-06-01 05:54:14.724473 | primary | "attempts": 5, 2020-06-01 05:54:14.724539 | primary | "changed": false 2020-06-01 05:54:14.724566 | primary | } 2020-06-01 05:54:14.724608 | primary | 2020-06-01 05:54:14.724635 | primary | MSG: 2020-06-01 05:54:14.724711 | primary | 2020-06-01 05:54:14.724871 | primary | ConflictException: 409: Client Error for url: http://192.168.24.1:9696/v2.0/networks.json, Unable to create the network. The tunnel ID 1 is in use. ~~~ also see https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-scenario010-standalone-train/7b60b68/logs/undercloud/var/log/containers/neutron/server.log.txt.gz ``` -06-01 05:53:11.663 32 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=4): PgAclAddCommand(direction=to-lport, log=False, name=[], may_exist=False, entity=pg_d8d3c626_ef73_4b41_9f6a_4503278c5312, priority=1002, action=allow-related, external_ids={'neutron:security_group_rule_id': 'c7029504-a1a9-42d4-9247-a460fdfdb4cf'}, match=outport == @pg_d8d3c626_ef73_4b41_9f6a_4503278c5312 && ip4 && ip4.src == $pg_d8d3c626_ef73_4b41_9f6a_4503278c5312_ip4, severity=[]) do_commit /usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84 2020-06-01 05:53:11.668 36 DEBUG networking_ovn.ovsdb.ovsdb_monitor [-] Hash Ring: Node 8689956c-4f66-404a-ad4a-11ec99f1fcd5 (host: standalone.localdomain) handling event "create" for row 
ac4a1a42-ae6d-4708-9a8b-7e9655ff3000 (table: ACL) notify /usr/lib/python2.7/site-packages/networking_ovn/ovsdb/ovsdb_monitor.py:462 2020-06-01 05:53:11.669 36 DEBUG networking_ovn.ovsdb.ovsdb_monitor [-] Hash Ring: Node 8689956c-4f66-404a-ad4a-11ec99f1fcd5 (host: standalone.localdomain) handling event "create" for row f9302631-b1c2-4473-9f52-e98bc5660ace (table: ACL) notify /usr/lib/python2.7/site-packages/networking_ovn/ovsdb/ovsdb_monitor.py:462 2020-06-01 05:53:11.669 34 DEBUG networking_ovn.ovsdb.ovsdb_monitor [-] Hash Ring: Node e5210224-234a-4070-a5a3-282594bdc96e (host: standalone.localdomain) handling event "create" for row 8a9c4091-4aff-439f-8aa9-fc32d9d28cf7 (table: ACL) notify /usr/lib/python2.7/site-packages/networking_ovn/ovsdb/ovsdb_monitor.py:462 2020-06-01 05:53:11.670 36 DEBUG networking_ovn.ovsdb.ovsdb_monitor [-] Hash Ring: Node 8689956c-4f66-404a-ad4a-11ec99f1fcd5 (host: standalone.localdomain) handling event "create" for row 10004d08-787e-4e30-a623-74e8a5c2394d (table: ACL) notify /usr/lib/python2.7/site-packages/networking_ovn/ovsdb/ovsdb_monitor.py:462 2020-06-01 05:53:11.671 36 DEBUG networking_ovn.ovsdb.ovsdb_monitor [-] Hash Ring: Node 8689956c-4f66-404a-ad4a-11ec99f1fcd5 (host: standalone.localdomain) handling event "create" for row 7ca26059-bbc4-4f57-9b0e-e8e6c257466c (table: Port_Group) notify /usr/lib/python2.7/site-packages/networking_ovn/ovsdb/ovsdb_monitor.py:462 2020-06-01 05:53:11.690 32 INFO networking_ovn.db.revision [req-40f346c8-bfeb-4e3f-b42f-96540da554f3 3c550cf5718d489e899d2b974d076c59 c3bac0775f3f4f709b305f72cf217853 - default default] Successfully bumped revision number for resource d8d3c626-ef73-4b41-9f6a-4503278c5312 (type: security_groups) to 1 2020-06-01 05:53:11.704 32 DEBUG oslo_concurrency.lockutils [req-5ef76d30-94e8-46aa-82d3-08631918685e 3c550cf5718d489e899d2b974d076c59 c3bac0775f3f4f709b305f72cf217853 - - -] Lock "event-dispatch" acquired by "neutron.plugins.ml2.ovo_rpc.dispatch_events" :: waited 0.000s inner 
/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:327 2020-06-01 05:53:11.746 32 INFO neutron.pecan_wsgi.hooks.translation [req-40f346c8-bfeb-4e3f-b42f-96540da554f3 3c550cf5718d489e899d2b974d076c59 c3bac0775f3f4f709b305f72cf217853 - default default] POST failed (client error): There was a conflict when trying to complete your request. ```
source: https://opendev.org/openstack/openstack-ansible-os_tempest/src/branch/master/tasks/tempest_resources.yml#L146
https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-scenario010-standalone-train&pipeline=openstack-periodic-24hr
Issue reported in launchpad : https://bugs.launchpad.net/tripleo/+bug/1881584

## May 30th
### Tripleo
tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053 failing continuously
https://review.rdoproject.org/zuul/builds?job_name=tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053 ~~~ 2020-05-30 06:22:04.768735 | primary | TASK [repo-setup : Get DLRN hash - passed tag - component-based] *************** 2020-05-30 06:22:04.768795 | primary | Saturday 30 May 2020 06:22:04 +0000 (0:00:00.083) 0:00:19.015 ********** 2020-05-30 06:22:05.556357 | primary | fatal: [undercloud]: FAILED! 
=> { 2020-05-30 06:22:05.558103 | primary | "changed": true, 2020-05-30 06:22:05.558392 | primary | "cmd": "set -euo pipefail\ndlrn_base=https://trunk.rdoproject.org/centos7-master\nif [ -e /etc/ci/mirror_info.sh ]; then\n source /etc/ci/mirror_info.sh\n NODEPOOL_RDO_PROXY=${NODEPOOL_RDO_PROXY:-https://trunk.rdoproject.org}\n dlrn_base=${dlrn_base/https:\\/\\/trunk.rdoproject.org/$NODEPOOL_RDO_PROXY}\nfi\ncurl -s --fail --show-error ${dlrn_base}/current-tripleo/delorean.repo.md5\n", 2020-05-30 06:22:05.558433 | primary | "delta": "0:00:00.318829", 2020-05-30 06:22:05.558497 | primary | "end": "2020-05-30 06:22:05.541197", 2020-05-30 06:22:05.558536 | primary | "rc": 22, 2020-05-30 06:22:05.558579 | primary | "start": "2020-05-30 06:22:05.222368" 2020-05-30 06:22:05.558589 | primary | } 2020-05-30 06:22:05.558599 | primary | 2020-05-30 06:22:05.558613 | primary | STDERR: 2020-05-30 06:22:05.558622 | primary | 2020-05-30 06:22:05.558663 | primary | curl: (22) The requested URL returned error: 404 Not Found 2020-05-30 06:22:05.558673 | primary | 2020-05-30 06:22:05.558682 | primary | 2020-05-30 06:22:05.558693 | primary | MSG: 2020-05-30 06:22:05.558703 | primary | 2020-05-30 06:22:05.558723 | primary | non-zero return code ~~~
~~gate issue solved~~
* ~~https://bugs.launchpad.net/tripleo/+bug/1881090 (virt-customize: error: libguestfs error: overcloud-full.qcow2: No such file failing ooci-build-images) here is the fix:~~
  ~~https://review.opendev.org/731498 (Added image sanity check condition)~~
  ~~https://review.opendev.org/#/c/731587 and https://review.opendev.org/#/c/731498~~

## May 29th
### Tripleo
build image issue (fs002):
fix https://review.opendev.org/#/c/731823
test https://review.rdoproject.org/r/27845 Test 731823 fs035 ussuri
3rd party: https://review.rdoproject.org/r/27846 Add fs035 (ussuri) 3rd party job to layout

* Gate:
    - tripleo-ci-centos-7-standalone-upgrade-train failed two times with the same error: ~~~ 2020-05-29 05:26:40 | 2020-05-29 05:26:40.206 139137 
INFO osc_lib.shell [-] command: tripleo upgrade -> tripleoclient.v1.tripleo_upgrade.Upgrade (auth=False) 2020-05-29 05:26:40 | 2020-05-29 05:26:40.209 139137 ERROR tripleoclient.v1.tripleo_upgrade.Upgrade [-] User interaction required, cannot confirm. 2020-05-29 05:26:40 | 2020-05-29 05:26:40.210 139137 ERROR openstack [-] User did not confirm upgrade, so exiting. Consider using the --yes parameter if you prefer to skip this warning in the future: UndercloudUpgradeNotConfirmed: User did not confirm upgrade, so exiting. Consider using the --yes parameter if you prefer to skip this warning in the future 2020-05-29 05:26:40 | 2020-05-29 05:26:40.210 139137 INFO osc_lib.shell [-] END return value: 1 ~~~
      https://zuul.openstack.org/builds?pipeline=gate&job_name=tripleo-ci-centos-7-standalone-upgrade-train
      https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_597/727889/1/gate/tripleo-ci-centos-7-standalone-upgrade-train/597b47b/logs/undercloud/home/zuul/standalone_upgrade.log
      https://bugs.launchpad.net/tripleo/+bug/1881306 reported here.
      https://review.opendev.org/#/c/731782/ here is the fix.
* RDO CI Failures:
    - **Ussuri**
        - periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-ussuri consistently failing with the error below

~~~
2020-05-28 17:26:34.276657 | primary | libguestfs: trace: set_verbose true
2020-05-28 17:26:34.276695 | primary | libguestfs: trace: set_verbose = 0
2020-05-28 17:26:34.276733 | primary | libguestfs: trace: set_memsize 2048
2020-05-28 17:26:34.276770 | primary | libguestfs: trace: set_memsize = 0
2020-05-28 17:26:34.276808 | primary | libguestfs: trace: set_smp 2
2020-05-28 17:26:34.276844 | primary | libguestfs: trace: set_smp = 0
2020-05-28 17:26:34.277414 | primary | libguestfs: trace: set_network true
2020-05-28 17:26:34.277479 | primary | libguestfs: trace: set_network = 0
2020-05-28 17:26:34.277564 | primary | libguestfs: trace: add_drive "overcloud-full.qcow2" "readonly:false" "protocol:file" "discard:besteffort"
2020-05-28 17:26:34.277618 | primary | libguestfs: trace: add_drive = -1 (error)
2020-05-28 17:26:34.278384 | primary | virt-customize: error: libguestfs error: overcloud-full.qcow2: No such file
2020-05-28 17:26:34.278417 | primary | or directory
2020-05-28 17:26:34.278430 | primary | libguestfs: trace: close
2020-05-28 17:26:34.278904 | primary | libguestfs: closing guestfs handle 0x55d953ea8070 (state 0)
2020-05-28 17:26:34.278920 | primary | /bin/virt-copy-out: access: overcloud-full.qcow2: No such file or directory
~~~

https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-latest-released&job_name=periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-ussuri
https://review.opendev.org/#/c/731498/ this fix is up for the issue

### OSP

## May 28th (handoff)
### Tripleo
* Train and Stein got promoted recently.
* Master promotion is blocked because of fs001 failure.
* RDO CI Failures:
    - https://bugs.launchpad.net/tripleo/+bug/1881090 (virt-customize: error: libguestfs error: overcloud-full.qcow2: No such file failing ooci-build-images)
      https://review.opendev.org/731498 (Added image sanity check condition)
    - **Master** : fs001 is failing on master and blocking promotion.
      Issue : https://bugs.launchpad.net/tripleo/+bug/1879766 (master ovb jobs failing on Destination directory /etc/pki/tls/private does not exist)
    - **Ussuri**:
      "periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-ussuri" is failing because of tempest test failures : https://logserver.rdoproject.org/openstack-periodic-latest-released/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-ussuri/903029a/logs/undercloud/var/log/tempest/stestr_results.html.gz
      "periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-ussuri" is timing out consistently at the Execute tempest test task. Submitted a patch to increase the timeout here : https://review.rdoproject.org/r/#/c/27811/ and testing the jobs: https://review.rdoproject.org/r/#/c/27789/
    - **Train C7**: Currently most of the jobs are failing because of https://bugs.launchpad.net/tripleo/+bug/1881090 (virt-customize: error: libguestfs error: overcloud-full.qcow2: No such file failing ooci-build-images); once the patch https://review.opendev.org/731498 (Added image sanity check condition) merges, they will go green.

### OSP

### Completed Items
- [ ] fix master
    - [x] pooja - dlrnapi not reporting https://review.rdoproject.org/r/#/c/27977/
      update - the issue started after these patches merged: https://review.opendev.org/#/c/724147/ and https://review.opendev.org/#/c/722486/ Standalone jobs pick up the new version of urllib3, but container build and image build jobs do not.
      New fix to ensure the latest urllib3 version is picked up : https://review.rdoproject.org/r/#/c/27997/
      some investigation here : https://bugs.launchpad.net/tripleo/+bug/1882534
    - [ ] folco - ensure image-sanity is running and failing overcloud image builds when missing files are found
        - [x] https://review.opendev.org/734112 Fix image_sanity check
        - [ ] https://review.rdoproject.org/r/27986 Test image_sanity fix
        - [ ] actually testing yatin's fix https://review.rdoproject.org/r/#/c/28041/2/playbooks/tripleo-ci-periodic-base-upload/tmpfiles.yaml
    - [ ] pooja - work w/ chandan we need to get a working overcloud image
- [ ] fix ussuri
    - [x] ~~folco - epel not found, tc patch https://review.opendev.org/#/c/733790~~ MERGED :heavy_check_mark:
      ~~update - tested the fix - https://review.rdoproject.org/r/#/c/28002/ working fine and fix got workflow +1, soon it will merge.~~
    - [ ] pooja - work w/ chandan to get a working overcloud-image build
- [ ] wes, look at promoting train
    - [ ] https://trunk.rdoproject.org/api-centos-train/api/civotes_detail.html?commit_hash=9d933a3e31003a8e250625c8d97da6a346b15571&distro_hash=2e42c32283c467496fe59c7ac67f0adffeb1d2d8
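
Several failures in these notes recur with a recognizable log message that already has a Launchpad bug. A minimal sketch of a triage helper follows, assuming the message fragments below are stable enough to match on. The table and the function name `match_known_bugs` are hypothetical; the fragments and bug links are copied from the entries above, and matching any tunnel ID (not just `1`) is an added assumption.

```python
import re

# (log signature regex, Launchpad bug) pairs taken from entries in these notes.
KNOWN_SIGNATURES = [
    (r"overcloud-full\.qcow2: No such file",
     "https://bugs.launchpad.net/tripleo/+bug/1881090"),
    (r"/etc/pki/tls/private does not exist",
     "https://bugs.launchpad.net/tripleo/+bug/1879766"),
    (r"User did not confirm upgrade",
     "https://bugs.launchpad.net/tripleo/+bug/1881306"),
    # Assumption: the notes show "tunnel ID 1"; any numeric ID is accepted here.
    (r"The tunnel ID \d+ is in use",
     "https://bugs.launchpad.net/tripleo/+bug/1881584"),
]


def match_known_bugs(log_text):
    """Return the known bug links whose signature appears in log_text.

    Results keep the order of KNOWN_SIGNATURES and contain no duplicates.
    """
    hits = []
    for pattern, bug in KNOWN_SIGNATURES:
        if re.search(pattern, log_text) and bug not in hits:
            hits.append(bug)
    return hits
```

A match only suggests that a failure looks like a known bug; the job log still needs a look before a run is written off as a duplicate.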