# Ruck and rover notes #28
###### tags: `ruck_rover`
:::info
Important links for ruck rover's [ruck/rover links to help](https://hackmd.io/07z0xroHTFi2IbX93P5ZfQ)
**Ruck Rover - Unified Sprint 28**
Dates: May 28 - June 17
Tripleo CI team ruck|rover: Folco (rfolco) / Pooja (pojadhav)
OSP CI team ruck|rover: Vadim (vgriner), Waldemar (wznoinsk)
Previous notes: https://hackmd.io/2MdkNAUuT7aBcM0Yck4xnw
**Next #29 notes**: https://hackmd.io/XcuH2OIVTMiuxyrqSF6ocw
:::
[TOC]
---
## on-going issues
:::danger
## TripleO
https://bugs.launchpad.net/tripleo/+bug/1883430
https://bugs.launchpad.net/tripleo/+bug/1883439
### gate
* ~~scen003: (designate) https://launchpad.net/bugs/1883692~~
* ussuri: Disable Designate service for scenario 03 https://review.opendev.org/736018
https://bugs.launchpad.net/tripleo/+bug/1883909
https://review.opendev.org/#/c/736183/
https://bugs.launchpad.net/tripleo/+bug/1883910
### RDO CI
* ussuri:
* full-tempest-scenario: failing in different tests
* all ovb: timeout/node failure
* master keeps failing same jobs, see history
## OSP
* OSP17 still without attention (!) because of fires in OSP<16
* outage of tlv labs is over
* jenkins **back online**
* queue is big again after not running since yesterday (expected)
* there are **issues with jenkins failing to connect to tlv located slaves**
* affects **qe-generic-tlv-01..03** slaves
* affects seal slaves (eg seal47 used as extra hw for phase1, non-critical)
* it shows abort/cancellation + ioexceptions in log as http://pastebin.test.redhat.com/876490
* the issue seems to be network related, although our tests (ping/mtu/stability) have so far come up empty (all seems to be working as usual)
* wiping jar cache, as reconfiguring slaves in jenkins had no effect
:::
---
:::info
Add dates in descending order so the latest date is at the top. Break out TripleO and OSP sections.
:::
### Reviews / Fixes
::: spoiler PATCHES
1. ~~https://review.opendev.org/734112 Fix image_sanity check~~
1. ~~https://review.opendev.org/733699 Fix periodic condition - sanity~~
1. ~~https://review.opendev.org/#/c/730763/ train image build nv~~
1. https://review.opendev.org/733676 cirros 0.5.1 by default
2. https://review.opendev.org/#/c/733170 enable networksecgrouptest
3. ~~https://review.opendev.org/732420 ipv6 skip list~~
4. ~~https://review.opendev.org/#/c/733114 pin dib~~
5. ~~https://review.opendev.org/732618 fix c8 image builds~~
6. ~~https://review.rdoproject.org/r/27724 fix fs035 train timeouts~~
7. ~~https://review.rdoproject.org/r/#/c/27901 scen10 == fs062~~
8. ~~https://review.opendev.org/#/c/732464 scenario010 nv~~
9. ~~https://review.rdoproject.org/r/#/c/27845/ fix image sanity in ~~
1. https://review.opendev.org/#/c/733659/ py3 c7
:::
### Launchpad Bugs Reported
:::spoiler BUGS
| Launchpad bug | Name | Status | Review |
| -------- | ---- |------- | ------ |
| [1878190](https://bugs.launchpad.net/tripleo/+bug/1878190) | periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-master job is consistently failing because some tempest tests are failing | Triaged | [727192](https://review.opendev.org/#/c/727192/) |
Bugs w/ CI tags (ci, alert, promotion-blocker)
https://tinyurl.com/ycnkznfh
:::
## June 15th
### TripleO
* train
* scen10
* fs020
* master
* scen10-ovn
* tempest-skipped
* fs020
## June 12th
### OSP
* rhos-qe-jenkins queue is too big (>200 jobs)
* OSP16
* RHOS-16.1-RHEL-8-20200610.n.0 promoted phase2, phase3 started
* there is new RHOS-16.1-RHEL-8-20200611.n.0
* two DFG-octavia jobs failed, tvignaud already retriggered them
* they failed somewhere in OC deploy (not investigated, not doing so now/today-friday)
## June 11th
### Tripleo
* centos-8 build push failed with container build failed (octavia-base)
https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-master-containers-build-push/91e8f44/logs/build.log
* periodic-tripleo-ci-centos-8-scenario000-multinode-oooq-container-updates-master
https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario000-multinode-oooq-container-updates-master/3fa8685/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
* Load balancer issue still exists - sc10 train job
https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-24hr&job_name=periodic-tripleo-ci-centos-7-scenario010-standalone-train
* fs20 train job failing consistently with inconsistent results
https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-24hr&job_name=periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-train
https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-train/092767e/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-train/08d95fb/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
* Rocky-container-build-push job inconsistent
https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-centos-7-rocky-containers-build-push
* Rocky-fs02-upload also inconsistent
https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-rocky-upload
### OSP
* OSP13z12 some p3 still in progress (seems some reruns too)
* OSP16.1 20200610.n.0 compose passed p1, has multiple failures in p2 (single job so far)
* https://projects.engineering.redhat.com/browse/RHOSINFRA-3315 (rarely happening flaky, likely we understand it now, will attempt at fix)
* https://projects.engineering.redhat.com/browse/RHOSINFRA-3266 (long standing flaky, expected to be solved rhel-8.2 upgrade of slaves)
* psedlak: all phase2 jobs passed
* after individual rerun due to the issues above
* so manual rerun of phase2-multijob with REEVALUATE+PROMOTE option is needed to promote/trigger p3
* but holding back with promotion:
* p3 multijobs atm have throttling limit 36 hours (can run again ~1am friday utc, also this will be dropped/changed in future)
* lot of p3 is still in progress for previous compose (and at least some rely on passed_phase2 symlink atm, to be fixed by improving how UMB triggering works RHOSINFRA-3485)
* also there are still 150 jobs in the queue (it currently affects gates and such too)
* i plan to trigger promoting+reevaluation on friday morning brq time
## June 10th
### TripleO
* https://review.rdoproject.org/r/#/c/28041 merged (image issues)
## June 9th
### Tripleo
* (rfolco) focusing on **overcloud image missing files**
* https://review.rdoproject.org/r/#/c/27986/
* container build push consistently failing :
https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-master&job_name=periodic-tripleo-centos-8-master-containers-build-push
* ansible-pacemaker failing (promotion-blocker ) :
reported bug : https://bugs.launchpad.net/tripleo/+bug/1882664
https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/a3b0702/build.log
### OSP
* new composes for OSP13 and 16.1 - p1/2 in progress, check results on Wednesday
## June 8th
### Tripleo
* reported bug for not reporting to dlrnapi : https://bugs.launchpad.net/tripleo/+bug/1882534
Fix is merged : https://review.rdoproject.org/r/#/c/27977/
## June 5th
### Tripleo
~~c7 py2 jobs broken >> https://review.opendev.org/#/c/726579~~ REVERTED
ussuri container build >> https://review.opendev.org/#/c/733790
#### master
* scen10-ovn
* full-tempest-scenario
* ipv6 skip list: https://review.opendev.org/#/c/732420/
* full-tempest-api
* tempest.api.compute.servers.test_delete_server.DeleteServersTestJSON
* JUST ONCE, watching more runs: https://review.rdoproject.org/r/27964
* fs039:
* https://bugs.launchpad.net/tripleo/+bug/1875353
* fs001:
* /etc/pki/tls/private does not exist
* https://bugs.launchpad.net/tripleo/+bug/1879766
* fs020:
* pacemaker https://bugs.launchpad.net/tripleo/+bug/1867602
* tempest failures: ipv6, TestNetworkAdvancedServerOps
* fs030:
* timeouts (deploy)
* fs035:
* INCONSISTENT
* timeout
* /etc/pki/tls/private does not exist
* https://bugs.launchpad.net/tripleo/+bug/1879766
#### ussuri
* image build
* container build
##### details
* tripleo-buildimage-overcloud-full-centos-8-ussuri and tripleo-buildimage-overcloud-full-centos-8 need attention.
https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-buildimage-overcloud-full-centos-8-ussuri
https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-buildimage-overcloud-full-centos-8
https://logserver.rdoproject.org/openstack-periodic-latest-released/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-ussuri/4a764e7/build.log
* a ghost of https://bugs.launchpad.net/tripleo/+bug/1879767 ????
```
Failed to open connection to "system" message bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
....
2020-06-05 09:28:47.339 | Installing : ansible-pacemaker-1.0.4-0.20200526160932.5847167 304/317Error unpacking rpm package ansible-pacemaker-1.0.4-0.20200526160932.5847167.el8.noarch
2020-06-05 09:28:47.345 |
2020-06-05 09:28:47.346 | Installing : crudini-0.9.3-1.el8.noarch 305/317
2020-06-05 09:28:47.346 | error: unpacking of archive failed on file /usr/share/ansible/plugins/modules/pacemaker_cluster.py;5eda101c: cpio: open failed - Inappropriate ioctl for device
2020-06-05 09:28:47.346 | error: ansible-pacemaker-1.0.4-0.20200526160932.5847167.el8.noarch: install failed
2020-06-05 09:28:47.346 |
```
* periodic-tripleo-centos-8-ussuri-containers-build-push job is consistently failing with "failed to build containers"
https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-centos-8-ussuri-containers-build-push&pipeline=%09openstack-periodic-latest-released
https://logserver.rdoproject.org/openstack-periodic-latest-released/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-ussuri-containers-build-push/ccc1d41/logs/build.log
Reported this issue : https://bugs.launchpad.net/tripleo/+bug/1882246
~~Investigate and upload a fix : https://review.opendev.org/#/c/733820/
Testing the job here : https://review.rdoproject.org/r/#/c/27963/~~
Fix: https://review.opendev.org/#/c/733790
testproject: https://review.rdoproject.org/r/27966
### OSP
* OSP17 still without attention (!) because of fires in OSP<16
* foreign jobs still invading p1/p2 views
* not yet tested https://code.engineering.redhat.com/gerrit/#/c/198375
* psedlak: need to get info about this from fhubik (as there is new cleaned up proposal of octavia jobs, do we still need this?)
* OSP16.1
* two octavia jobs failed on 'infrared cloud-config' plugin not recognizing 16.1 as version of choice
* fasttracked https://review.gerrithub.io/c/rhos-infra/cloud-config/+/494993
* now in tempest stage so cloud-config issue resolved
* from yesterday to still followup:
* **osp13 two red** (one packstack one ospd)
* packstack cleared
* one ospd timeouts in OC deploy, likely already cixed https://trello.com/c/aL9jyT9A
* **osp15 RED phase1, RED/Yellow octavia** in phase2
* latest 15 build seems old RHOS_TRUNK-15.0-RHEL-8-20200520.n.0
* so maybe not new issues, but i do not see these in CIX board
* job status is from 15 days or 6 days old, so just safety reruns exposing infra issue and not a product one (but i do not see them passed for this puddle in history)
* investigation/rerun definitely needed (but priority of other osp?)
## June 4th
### Tripleo
* **Upstream Gate**:
SC10-standalone failed consistently :
https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-8-scenario010-standalone
inconsistent failures : latest failure during deploy, and the last 2 failures are related to "async task did not complete within the requested time - 5700s"
* **RDO CI Failures**:
#### master
* fs001 (promotion blocker for master):
/etc/pki/tls/private does not exist
https://bugs.launchpad.net/tripleo/+bug/1879766
https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-master&job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master
### OSP
* OSP17 still without attention (!) because of fires in OSP<16
* foreign jobs still invading p1/p2 views
* not yet tested https://code.engineering.redhat.com/gerrit/#/c/198375
* psedlak: what is the overall status? (once we sync up keep just the one with issues)
* osp10 all blue
* osp12 tab is empty (should be removed?)
* *tkorol*: **osp13 two red** (one packstack one ospd)
* osp14 empty p1/p2 section
* *psedlak*: **osp15 RED phase1, RED/Yellow octavia** in phase2
* latest 15 build seems old RHOS_TRUNK-15.0-RHEL-8-20200520.n.0
* so maybe not new issues, but i do not see these in CIX board
* job status is from 15 days or 6 days old, so just safety reruns exposing infra issue and not a product one (but i do not see them passed for this puddle in history)
* investigation/rerun definitely needed (but priority of other osp?)
* osp16.0 all blue (>2weeks)
* osp16.1 blue
* osp17 phase1 is RED, p2 not run yet
* [puddle-status](https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/util-monitor-puddle-symlinks/41023/console) indicates only 16.1-p2 and 17 not promoted?
* [infra-monitor-job](https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/util-monitor-infra/97107/console) is having quite few issues - but not new seems already for some time (maybe related to pre-testing tlv2 slaves move?)
## June 3rd
### Tripleo
* Fix is up for the issue reported https://bugs.launchpad.net/tripleo/+bug/1881732
Fix Patch is here : https://review.opendev.org/#/c/733114/
Test Patch for the same : https://review.rdoproject.org/r/#/c/27914/
#### RDO
promoted train and ussuri
##### master
* scenario010-ovn-provider-standalone-master
* **ignoring**
* full-tempest master:
* ipv6 tests https://bugs.launchpad.net/tripleo/+bug/1881624
* **rechecked** https://review.opendev.org/732420 add ipv6 hotplug tests to skip list
* fs039:
* FileNotFoundError: [Errno 2] No such file or directory: '/etc/sssd/sssd.conf'
* FileNotFoundError: [Errno 2] No such file or directory: '/var/lib/ipa-client/sysrestore/...'
* https://bugs.launchpad.net/tripleo/+bug/1875353
* fs001:
* /etc/pki/tls/private does not exist
* https://bugs.launchpad.net/tripleo/+bug/1879766
* fs020:
* pacemaker https://bugs.launchpad.net/tripleo/+bug/1867602
* tempest failures: ipv6, TestNetworkAdvancedServerOps
* fs030:
* timeouts (deploy)
* fs035:
* /etc/pki/tls/private does not exist
* https://bugs.launchpad.net/tripleo/+bug/1879766
## June 2nd
### Tripleo
* tripleo-buildimage-centos7 jobs are failing due to the Python version, as python2 support was removed from diskimage-builder.
Reported a bug, https://bugs.launchpad.net/tripleo/+bug/1881732
* featureset001 failing on master due to this issue (https://bugs.launchpad.net/tripleo/+bug/1879766)
Job periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master is in promotion criteria and is consistently failing.
https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master
* periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039-stein failing consistently with all nodes unreachable.
https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-24hr%20&job_name=periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039-stein
https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039-stein/3cef6f2/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
## June 1st
### Tripleo
scenario004 / 001 master busted by
https://bugs.launchpad.net/tripleo/+bug/1881670
- revert up
scenario010 https://review.opendev.org/#/c/732464/1
- moving to non-voting until fixed
manually promoted ussuri.. -> taking it out of loop on promoter server as it's busted.
---
rfolco notes:
testproject
* testing image_sanity on https://review.rdoproject.org/r/#/c/27845/
* timeout increase for fs035 on https://review.rdoproject.org/r/27724
* fs020 train w/ new cirros on https://review.rdoproject.org/r/#/c/27878/
* scen010 ovn w/ new cirros on https://review.rdoproject.org/r/#/c/27880/
master
* full-tempest master:
* https://bugs.launchpad.net/tripleo/+bug/1881624
* https://review.opendev.org/732420 add ipv6 hotplug tests to skip list
* fs039, fs001:
* watch (inconsistent results)
* fs020:
* inconsistent results, tempest failures
* hit pcsd bug https://bugs.launchpad.net/tripleo/+bug/1867602
* fs030:
* timeout on last run (mostly green)
* fs035:
* /etc/pki/tls/private does not exist (seen more than once)
* tempest (ipv6) - https://bugs.launchpad.net/tripleo/+bug/1881624
* https://github.com/cirros-dev/cirros/issues/58
* scen10 ovn:
* https://review.rdoproject.org/r/27880 Test scen10 ovn master
* consistently failing on https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario010-ovn-provider-standalone-master/fb1373f/logs/undercloud/var/log/tempest/stestr_results.html.gz
* while scen10 test is green https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario010-standalone-master/56fb6b8/logs/undercloud/var/log/tempest/stestr_results.html.gz
train
* scen10 train: https://bugs.launchpad.net/tripleo/+bug/1881584
* fs001, fs002, fs035: timeout
* https://tinyurl.com/ybcv8wp9
* scen004:
* still failing all the time - https://bugs.launchpad.net/tripleo/+bug/1879292
ussuri
*(mostly green except for)*
* fs020: https://bugs.launchpad.net/tripleo/+bug/1881642
---
* RDO CI Failures
* **periodic-tripleo-ci-centos-8-scenario010-ovn-provider-standalone-master**
failing consistently with tempest.
https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-master&job_name=periodic-tripleo-ci-centos-8-scenario010-ovn-provider-standalone-master
Job is under promotion criteria but commented in [1].
[1] https://github.com/rdo-infra/ci-config/blob/master/ci-scripts/dlrnapi_promoter/config/CentOS-8/master.ini#L33
* **periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-master**
failing consistently with tempest tests.
https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-master&job_name=periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-master
https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-master/8bd10fb/logs/undercloud/var/log/tempest/tempest_run.log.txt.gz
* **periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-ussuri** recently failed with 1 tempest test failure. The same issue was reported for FS20 earlier. This job is in promotion criteria.
https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-ussuri&pipeline=openstack-periodic-latest-released
https://bugs.launchpad.net/tripleo/+bug/1744907
* **periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train** is failing consistently with "Overcloud configuration failed".
https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-24hr&job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train
* **periodic-tripleo-ci-centos-7-scenario010-standalone-train** failing consistently.
~~~
2020-06-01 05:53:08.194423 | primary | TASK [os_tempest : Ensure private network exists] ******************************
2020-06-01 05:53:08.194501 | primary | Monday 01 June 2020 05:53:08 +0000 (0:00:00.100) 0:39:46.330 ***********
2020-06-01 05:53:11.920072 | primary | FAILED - RETRYING: Ensure private network exists (5 retries left).
2020-06-01 05:53:24.624402 | primary | FAILED - RETRYING: Ensure private network exists (4 retries left).
2020-06-01 05:53:37.223961 | primary | FAILED - RETRYING: Ensure private network exists (3 retries left).
2020-06-01 05:53:49.816758 | primary | FAILED - RETRYING: Ensure private network exists (2 retries left).
2020-06-01 05:54:02.321538 | primary | FAILED - RETRYING: Ensure private network exists (1 retries left).
2020-06-01 05:54:14.724326 | primary | fatal: [undercloud -> 127.0.0.2]: FAILED! => {
2020-06-01 05:54:14.724473 | primary | "attempts": 5,
2020-06-01 05:54:14.724539 | primary | "changed": false
2020-06-01 05:54:14.724566 | primary | }
2020-06-01 05:54:14.724608 | primary |
2020-06-01 05:54:14.724635 | primary | MSG:
2020-06-01 05:54:14.724711 | primary |
2020-06-01 05:54:14.724871 | primary | ConflictException: 409: Client Error for url: http://192.168.24.1:9696/v2.0/networks.json, Unable to create the network. The tunnel ID 1 is in use.
~~~
also see
https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-scenario010-standalone-train/7b60b68/logs/undercloud/var/log/containers/neutron/server.log.txt.gz
```
-06-01 05:53:11.663 32 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=4): PgAclAddCommand(direction=to-lport, log=False, name=[], may_exist=False, entity=pg_d8d3c626_ef73_4b41_9f6a_4503278c5312, priority=1002, action=allow-related, external_ids={'neutron:security_group_rule_id': 'c7029504-a1a9-42d4-9247-a460fdfdb4cf'}, match=outport == @pg_d8d3c626_ef73_4b41_9f6a_4503278c5312 && ip4 && ip4.src == $pg_d8d3c626_ef73_4b41_9f6a_4503278c5312_ip4, severity=[]) do_commit /usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84
2020-06-01 05:53:11.668 36 DEBUG networking_ovn.ovsdb.ovsdb_monitor [-] Hash Ring: Node 8689956c-4f66-404a-ad4a-11ec99f1fcd5 (host: standalone.localdomain) handling event "create" for row ac4a1a42-ae6d-4708-9a8b-7e9655ff3000 (table: ACL) notify /usr/lib/python2.7/site-packages/networking_ovn/ovsdb/ovsdb_monitor.py:462
2020-06-01 05:53:11.669 36 DEBUG networking_ovn.ovsdb.ovsdb_monitor [-] Hash Ring: Node 8689956c-4f66-404a-ad4a-11ec99f1fcd5 (host: standalone.localdomain) handling event "create" for row f9302631-b1c2-4473-9f52-e98bc5660ace (table: ACL) notify /usr/lib/python2.7/site-packages/networking_ovn/ovsdb/ovsdb_monitor.py:462
2020-06-01 05:53:11.669 34 DEBUG networking_ovn.ovsdb.ovsdb_monitor [-] Hash Ring: Node e5210224-234a-4070-a5a3-282594bdc96e (host: standalone.localdomain) handling event "create" for row 8a9c4091-4aff-439f-8aa9-fc32d9d28cf7 (table: ACL) notify /usr/lib/python2.7/site-packages/networking_ovn/ovsdb/ovsdb_monitor.py:462
2020-06-01 05:53:11.670 36 DEBUG networking_ovn.ovsdb.ovsdb_monitor [-] Hash Ring: Node 8689956c-4f66-404a-ad4a-11ec99f1fcd5 (host: standalone.localdomain) handling event "create" for row 10004d08-787e-4e30-a623-74e8a5c2394d (table: ACL) notify /usr/lib/python2.7/site-packages/networking_ovn/ovsdb/ovsdb_monitor.py:462
2020-06-01 05:53:11.671 36 DEBUG networking_ovn.ovsdb.ovsdb_monitor [-] Hash Ring: Node 8689956c-4f66-404a-ad4a-11ec99f1fcd5 (host: standalone.localdomain) handling event "create" for row 7ca26059-bbc4-4f57-9b0e-e8e6c257466c (table: Port_Group) notify /usr/lib/python2.7/site-packages/networking_ovn/ovsdb/ovsdb_monitor.py:462
2020-06-01 05:53:11.690 32 INFO networking_ovn.db.revision [req-40f346c8-bfeb-4e3f-b42f-96540da554f3 3c550cf5718d489e899d2b974d076c59 c3bac0775f3f4f709b305f72cf217853 - default default] Successfully bumped revision number for resource d8d3c626-ef73-4b41-9f6a-4503278c5312 (type: security_groups) to 1
2020-06-01 05:53:11.704 32 DEBUG oslo_concurrency.lockutils [req-5ef76d30-94e8-46aa-82d3-08631918685e 3c550cf5718d489e899d2b974d076c59 c3bac0775f3f4f709b305f72cf217853 - - -] Lock "event-dispatch" acquired by "neutron.plugins.ml2.ovo_rpc.dispatch_events" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:327
2020-06-01 05:53:11.746 32 INFO neutron.pecan_wsgi.hooks.translation [req-40f346c8-bfeb-4e3f-b42f-96540da554f3 3c550cf5718d489e899d2b974d076c59 c3bac0775f3f4f709b305f72cf217853 - default default] POST failed (client error): There was a conflict when trying to complete your request.
```
source: https://opendev.org/openstack/openstack-ansible-os_tempest/src/branch/master/tasks/tempest_resources.yml#L146
https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-scenario010-standalone-train&pipeline=openstack-periodic-24hr
Issue reported in launchpad : https://bugs.launchpad.net/tripleo/+bug/1881584
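
The "in promotion criteria but commented" note for the scen10-ovn job above can be illustrated with a small sketch. It assumes the promoter criteria file is an INI file with one job name per line under a section named for the promotion target; the section name `current-tripleo` and the sample contents are assumptions for illustration, not a copy of the real `master.ini`.

```python
import configparser

# Hypothetical sample in the assumed layout; the real master.ini may differ.
SAMPLE_INI = """
[current-tripleo]
periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-master
periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master
#periodic-tripleo-ci-centos-8-scenario010-ovn-provider-standalone-master
"""


def promotion_jobs(ini_text, section):
    """Return the job names listed (and not commented out) in a section."""
    parser = configparser.ConfigParser(allow_no_value=True)
    # Keep job names exactly as written instead of lower-casing them.
    parser.optionxform = str
    parser.read_string(ini_text)
    return set(parser[section])


jobs = promotion_jobs(SAMPLE_INI, "current-tripleo")
scen10_ovn = "periodic-tripleo-ci-centos-8-scenario010-ovn-provider-standalone-master"
# A commented-out job is not part of the criteria, so its failures
# would not block promotion under this reading of the file.
print(scen10_ovn in jobs)
```

Under these assumptions a job that keeps failing but is commented out, like scen10-ovn here, still shows red in the builds list without holding back the promotion.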
## May 30th
### Tripleo
tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053 failing continuously
https://review.rdoproject.org/zuul/builds?job_name=tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053
~~~
2020-05-30 06:22:04.768735 | primary | TASK [repo-setup : Get DLRN hash - passed tag - component-based] ***************
2020-05-30 06:22:04.768795 | primary | Saturday 30 May 2020 06:22:04 +0000 (0:00:00.083) 0:00:19.015 **********
2020-05-30 06:22:05.556357 | primary | fatal: [undercloud]: FAILED! => {
2020-05-30 06:22:05.558103 | primary | "changed": true,
2020-05-30 06:22:05.558392 | primary | "cmd": "set -euo pipefail\ndlrn_base=https://trunk.rdoproject.org/centos7-master\nif [ -e /etc/ci/mirror_info.sh ]; then\n source /etc/ci/mirror_info.sh\n NODEPOOL_RDO_PROXY=${NODEPOOL_RDO_PROXY:-https://trunk.rdoproject.org}\n dlrn_base=${dlrn_base/https:\\/\\/trunk.rdoproject.org/$NODEPOOL_RDO_PROXY}\nfi\ncurl -s --fail --show-error ${dlrn_base}/current-tripleo/delorean.repo.md5\n",
2020-05-30 06:22:05.558433 | primary | "delta": "0:00:00.318829",
2020-05-30 06:22:05.558497 | primary | "end": "2020-05-30 06:22:05.541197",
2020-05-30 06:22:05.558536 | primary | "rc": 22,
2020-05-30 06:22:05.558579 | primary | "start": "2020-05-30 06:22:05.222368"
2020-05-30 06:22:05.558589 | primary | }
2020-05-30 06:22:05.558599 | primary |
2020-05-30 06:22:05.558613 | primary | STDERR:
2020-05-30 06:22:05.558622 | primary |
2020-05-30 06:22:05.558663 | primary | curl: (22) The requested URL returned error: 404 Not Found
2020-05-30 06:22:05.558673 | primary |
2020-05-30 06:22:05.558682 | primary |
2020-05-30 06:22:05.558693 | primary | MSG:
2020-05-30 06:22:05.558703 | primary |
2020-05-30 06:22:05.558723 | primary | non-zero return code
~~~
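
The failing task builds the URL it curls from a DLRN base plus an optional mirror proxy. A minimal Python sketch of that same substitution, mirroring the shell in the `cmd` field above; the function name and the example proxy host are hypothetical.

```python
DEFAULT_TRUNK = "https://trunk.rdoproject.org"


def dlrn_md5_url(dlrn_base, proxy=None, tag="current-tripleo"):
    """Build the delorean.repo.md5 URL that the repo-setup task requests.

    Mirrors the shell logic from the log: if a mirror proxy is set, the
    first occurrence of the default trunk host in dlrn_base is replaced.
    """
    proxy = proxy or DEFAULT_TRUNK
    base = dlrn_base.replace(DEFAULT_TRUNK, proxy, 1)
    return f"{base}/{tag}/delorean.repo.md5"


# Base taken from the log above; the proxy host is a made-up example.
print(dlrn_md5_url("https://trunk.rdoproject.org/centos7-master"))
print(dlrn_md5_url("https://trunk.rdoproject.org/centos7-master",
                   proxy="http://mirror.example.org:8080/rdo"))
```

Printing the URL this way makes it easy to retry the request by hand and see whether the 404 comes from the trunk server itself or only from a mirror.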
~~gate issue solved~~
* ~~https://bugs.launchpad.net/tripleo/+bug/1881090 (virt-customize: error: libguestfs error: overcloud-full.qcow2: No such file failing ooci-build-images) here is the fix:~~
~~https://review.opendev.org/731498 (Added image sanity check condition)~~
~~https://review.opendev.org/#/c/731587 and https://review.opendev.org/#/c/731498~~
## May 29th
### Tripleo
build image issue (fs002):
fix https://review.opendev.org/#/c/731823
test https://review.rdoproject.org/r/27845 Test 731823
fs035 ussuri 3rd party:
https://review.rdoproject.org/r/27846 Add fs035 (ussuri) 3rd party job to layout
* Gate:
- tripleo-ci-centos-7-standalone-upgrade-train failed twice with the same error:
~~~
2020-05-29 05:26:40 | 2020-05-29 05:26:40.206 139137 INFO osc_lib.shell [-] command: tripleo upgrade -> tripleoclient.v1.tripleo_upgrade.Upgrade (auth=False)
2020-05-29 05:26:40 | 2020-05-29 05:26:40.209 139137 ERROR tripleoclient.v1.tripleo_upgrade.Upgrade [-] User interaction required, cannot confirm.
2020-05-29 05:26:40 | 2020-05-29 05:26:40.210 139137 ERROR openstack [-] User did not confirm upgrade, so exiting. Consider using the --yes parameter if you prefer to skip this warning in the future: UndercloudUpgradeNotConfirmed: User did not confirm upgrade, so exiting. Consider using the --yes parameter if you prefer to skip this warning in the future
2020-05-29 05:26:40 | 2020-05-29 05:26:40.210 139137 INFO osc_lib.shell [-] END return value: 1
~~~
https://zuul.openstack.org/builds?pipeline=gate&job_name=tripleo-ci-centos-7-standalone-upgrade-train
https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_597/727889/1/gate/tripleo-ci-centos-7-standalone-upgrade-train/597b47b/logs/undercloud/home/zuul/standalone_upgrade.log
https://bugs.launchpad.net/tripleo/+bug/1881306 reported here.
https://review.opendev.org/#/c/731782/ here is the fix.
* RDO CI Failures:
- **Ussuri** - periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-ussuri consistently failing with the error below
~~~
2020-05-28 17:26:34.276657 | primary | libguestfs: trace: set_verbose true
2020-05-28 17:26:34.276695 | primary | libguestfs: trace: set_verbose = 0
2020-05-28 17:26:34.276733 | primary | libguestfs: trace: set_memsize 2048
2020-05-28 17:26:34.276770 | primary | libguestfs: trace: set_memsize = 0
2020-05-28 17:26:34.276808 | primary | libguestfs: trace: set_smp 2
2020-05-28 17:26:34.276844 | primary | libguestfs: trace: set_smp = 0
2020-05-28 17:26:34.277414 | primary | libguestfs: trace: set_network true
2020-05-28 17:26:34.277479 | primary | libguestfs: trace: set_network = 0
2020-05-28 17:26:34.277564 | primary | libguestfs: trace: add_drive "overcloud-full.qcow2" "readonly:false" "protocol:file" "discard:besteffort"
2020-05-28 17:26:34.277618 | primary | libguestfs: trace: add_drive = -1 (error)
2020-05-28 17:26:34.278384 | primary | virt-customize: error: libguestfs error: overcloud-full.qcow2: No such file
2020-05-28 17:26:34.278417 | primary | or directory
2020-05-28 17:26:34.278430 | primary | libguestfs: trace: close
2020-05-28 17:26:34.278904 | primary | libguestfs: closing guestfs handle 0x55d953ea8070 (state 0)
2020-05-28 17:26:34.278920 | primary | /bin/virt-copy-out: access: overcloud-full.qcow2: No such file or directory
~~~
https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-latest-released&job_name=periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-ussuri
https://review.opendev.org/#/c/731498/ this fix is up for the issue
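
The libguestfs error above comes from running `virt-customize` against an image that was never produced. A minimal sketch of the kind of pre-check an image sanity condition could perform; this is an illustration only, not the content of review 731498, and the function name and the list of required files are hypothetical.

```python
import os
import tempfile


def missing_images(workdir, required=("overcloud-full.qcow2",)):
    """Return the required image files that are absent or empty in workdir."""
    missing = []
    for name in required:
        path = os.path.join(workdir, name)
        # Treat a zero-byte file as missing: a truncated build is not usable.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing


# Demonstration in a throwaway directory rather than a real build workspace.
with tempfile.TemporaryDirectory() as workdir:
    print(missing_images(workdir))  # nothing has been built yet
    with open(os.path.join(workdir, "overcloud-full.qcow2"), "wb") as image:
        image.write(b"placeholder")
    print(missing_images(workdir))  # image file is now present
```

Failing the job as soon as this list is non-empty would surface the missing image at build time instead of as a later "No such file or directory" from libguestfs.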
### OSP
## May 28th (handoff)
### Tripleo
* Train and Stein were promoted recently.
* Master promotion is blocked because of fs001 failure.
* RDO CI Failures:
- https://bugs.launchpad.net/tripleo/+bug/1881090 (virt-customize: error: libguestfs error: overcloud-full.qcow2: No such file failing ooci-build-images) https://review.opendev.org/731498 (Added image sanity check condition)
- **Master** : fs001 is failing on master and blocking promotion.
Issue : https://bugs.launchpad.net/tripleo/+bug/1879766 (master ovb jobs failing on Destination directory /etc/pki/tls/private does not exist)
- **Ussuri**: "periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-ussuri" is failing because of tempest test failures : https://logserver.rdoproject.org/openstack-periodic-latest-released/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-ussuri/903029a/logs/undercloud/var/log/tempest/stestr_results.html.gz
"periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-ussuri" is consistently timing out at the Execute tempest test task. Submitted a patch to increase the timeout here : https://review.rdoproject.org/r/#/c/27811/ and testing the jobs here : https://review.rdoproject.org/r/#/c/27789/
- **Train C7**: Currently most of the jobs are failing because of https://bugs.launchpad.net/tripleo/+bug/1881090 (virt-customize: error: libguestfs error: overcloud-full.qcow2: No such file failing ooci-build-images). Once the patch https://review.opendev.org/731498 (Added image sanity check condition) merges, they should go green.
### OSP
### Completed Items
- [ ] fix master
- [x] pooja -dlrnapi not reporting https://review.rdoproject.org/r/#/c/27977/
update -
https://review.opendev.org/#/c/724147/ and https://review.opendev.org/#/c/722486/ : the issue started after these patches merged. Standalone jobs pick up the new version of urllib3, but the container build and image build jobs do not.
New fix to ensure the latest urllib3 version is picked up : https://review.rdoproject.org/r/#/c/27997/
some investigation here : https://bugs.launchpad.net/tripleo/+bug/1882534
- [ ] folco - ensure image-sanity is running and failing overcloud image builds when missing files are found
- [x] https://review.opendev.org/734112 Fix image_sanity check
- [ ] https://review.rdoproject.org/r/27986 Test image_sanity fix
- [ ] actually testing yatin's fix https://review.rdoproject.org/r/#/c/28041/2/playbooks/tripleo-ci-periodic-base-upload/tmpfiles.yaml
- [ ] pooja - work w/ chandan we need to get a working overcloud image
- [ ] fix ussuri
- [x] ~~folco - epel not found, tc patch https://review.opendev.org/#/c/733790~~ MERGED :heavy_check_mark:
~~update - tested the fix - https://review.rdoproject.org/r/#/c/28002/ working fine
and fix got workflow +1, soon it will merge.~~
- [ ] pooja - work w/ chandan to get a working overcloud-image build
- [ ] wes, look at promoting train
- [ ] https://trunk.rdoproject.org/api-centos-train/api/civotes_detail.html?commit_hash=9d933a3e31003a8e250625c8d97da6a346b15571&distro_hash=2e42c32283c467496fe59c7ac67f0adffeb1d2d8