# Ruck and Rover notes #31

###### tags: `ruck_rover`

:::info
Important links for ruck|rovers: [ruck/rover links to help](https://hackmd.io/07z0xroHTFi2IbX93P5ZfQ)

**Ruck Rover - Unified Sprint #31**
Dates: July 30th - August 19

Tripleo CI team ruck|rover:
Marios Andreou (marios)
Chandan Kumar (chandankumar)

Downstream CI team ruck|rover:
Filip Hubik (fhubik)
Tuvya Korol (tkorol)

Vexx/rdo-cloud introspection:
vexx: https://bit.ly/3kfhIOb
rdo-cloud: https://bit.ly/2Xu4jIG

Previous notes (sprint #30): https://hackmd.io/6Bx0FXwlRNCc75l39NSKvg
Next notes (sprint #32): none - sprint 31 is the current one
:::

[TOC]

#### on-going issues

:::danger
#### OSP Jenkins TLV2 migration polishing
* UMB doesn't work
* https://projects.engineering.redhat.com/browse/RHOSINFRA-3635

LIBVIRT LEASE (RHEL8.x, x in 0-2) BUG:
Escalation: https://trello.com/c/I0ix688S will affect OSP16.1 for some time going forward (https://bugzilla.redhat.com/show_bug.cgi?id=1840307 [libvirt]) - see [4th Aug](#Tue-04-Aug)
* other OSPs also reported it (13), as did many various p3 jobs
* https://bugzilla.redhat.com/show_bug.cgi?id=1868271 (8.2.1)
* first scratch build available

The remaining OSP13 escalation has not been looked at yet

Upgrade DFG p3 job is not getting SWAP_PUDDLENO* param from the layer above
:::

------------------------

## Thu 20 Aug

### tripleo

#### New/Transient/Nobug yet:

:::spoiler Rerunning full tempest ussuri - https://review.rdoproject.org/r/29018
https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-ussuri/736c131/logs/undercloud/var/log/tempest/tempest_run.log.txt.gz
```
{2} neutron_tempest_plugin.scenario.test_floatingip.FloatingIpMultipleRoutersTest.test_reuse_ip_address_with_other_fip_on_other_router [147.500962s] ... 
FAILED

Captured traceback:
~~~~~~~~~~~~~~~~~~~
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/scenario/test_floatingip.py", line 563, in test_reuse_ip_address_with_other_fip_on_other_router
    servers_num=1, fip_addresses=[ip_address])
  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/scenario/test_floatingip.py", line 477, in _create_network_and_servers
    network=network, fip_address=fip))
  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/scenario/test_floatingip.py", line 498, in _create_server_and_fip
    port=port)
  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/api/base.py", line 634, in create_floatingip
    **kwargs)['floatingip']
  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/services/network/json/network_client.py", line 972, in create_floatingip
    resp, body = self.post(uri, body)
  File "/usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py", line 283, in post
    return self.request('POST', url, extra_headers, headers, body, chunked)
  File "/usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py", line 687, in request
    self._error_checker(resp, resp_body)
  File "/usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py", line 808, in _error_checker
    raise exceptions.Conflict(resp_body, resp=resp)
tempest.lib.exceptions.Conflict: Conflict with state of target resource
Details: {'type': 'IpAddressAlreadyAllocated', 'message': 'IP address 192.168.24.133 already allocated in subnet 356b93e1-6377-4055-9a6e-ce4a93b213cb', 'detail': ''}
```
:::

* Re-running fs030/fs035 job - https://review.rdoproject.org/r/29019
* Jobs failing with RETRY_LIMIT with primary | /bin/sh: line 1: git: command not found at prepare-workspace-git : Clone cached repo to workspace https://bugs.launchpad.net/tripleo/+bug/1892326

### osp

R&R transitioning and knowledge transfer
Libvirt fix testing (fhubik)
UMB debug (fhubik)

## Wed 19 Aug \o/

### tripleo

#### New/Transient/Nobug 
yet: :::spoiler ~~Train standalone jobs are failing check/gate - unable to resolve tripleo_deploy_control_virtual_ip~~ - https://bugs.launchpad.net/tripleo/+bug/1892078 ::: :::spoiler ERROR: Package 'pymod2pkg' requires a different Python: 2.7.5 not in '>=3.6' on tripleo-ci-centos-7-containers-multinode-train due to mirror issues * https://f05144106d047809c3b3-8b5cc018a02fed487ca2a6d59a4ee9d5.ssl.cf2.rackcdn.com/746592/1/gate/tripleo-ci-centos-7-containers-multinode-train/12933b7/job-output.txt ``` 2020-08-19 00:08:43.959611 | primary | TASK [build-test-packages : Pip install rdopkg] ******************************** 2020-08-19 00:08:43.959806 | primary | Wednesday 19 August 2020 00:08:43 +0000 (0:00:06.832) 0:05:19.651 ****** 2020-08-19 00:08:48.608093 | primary | fatal: [undercloud]: FAILED! => { 2020-08-19 00:08:48.608221 | primary | "changed": false, 2020-08-19 00:08:48.608288 | primary | "cmd": [ 2020-08-19 00:08:48.608407 | primary | "/home/zuul/dlrn-venv/bin/pip2", 2020-08-19 00:08:48.608491 | primary | "install", 2020-08-19 00:08:48.608558 | primary | "-U", 2020-08-19 00:08:48.608637 | primary | "rdopkg" 2020-08-19 00:08:48.608689 | primary | ] 2020-08-19 00:08:48.608729 | primary | } 2020-08-19 00:08:48.608778 | primary | 2020-08-19 00:08:48.608824 | primary | MSG: 2020-08-19 00:08:48.608861 | primary | 2020-08-19 00:08:48.609182 | primary | stdout: Looking in indexes: https://mirror.ord.rax.opendev.org/pypi/simple, https://mirror.ord.rax.opendev.org/wheel/centos-7.8-x86_64 2020-08-19 00:08:48.609257 | primary | Collecting rdopkg 2020-08-19 00:08:48.609637 | primary | Downloading https://mirror.ord.rax.opendev.org/pypifiles/packages/db/e0/3102e985c43b9fc6aeddb3279a7d338a745fcdacfc3cf3dd56d15d5ba1cf/rdopkg-1.2.0-py2-none-any.whl (71 kB) 2020-08-19 00:08:48.609743 | primary | Collecting distroinfo>=0.3.0 2020-08-19 00:08:48.610146 | primary | Downloading 
https://mirror.ord.rax.opendev.org/pypifiles/packages/88/e5/e3f6a502251476966273a2065befb38d37687a58c50a3afd9dd92895f4a1/distroinfo-0.3.2-py2-none-any.whl (18 kB) 2020-08-19 00:08:48.610227 | primary | Collecting blessings 2020-08-19 00:08:48.610599 | primary | Downloading https://mirror.ord.rax.opendev.org/pypifiles/packages/8d/b1/a3fe6fd8a012e6d019bafd671c2fee0597ea97ff2e76c25aadfa4545fc32/blessings-1.7-py2-none-any.whl (26 kB) 2020-08-19 00:08:48.610681 | primary | Collecting munch 2020-08-19 00:08:48.611083 | primary | Downloading https://mirror.ord.rax.opendev.org/pypifiles/packages/cc/ab/85d8da5c9a45e072301beb37ad7f833cd344e04c817d97e0cc75681d248f/munch-2.5.0-py2.py3-none-any.whl (10 kB) 2020-08-19 00:08:48.611163 | primary | Collecting requests 2020-08-19 00:08:48.611545 | primary | Downloading https://mirror.ord.rax.opendev.org/pypifiles/packages/45/1e/0c169c6a5381e241ba7404532c16a21d86ab872c9bed8bdcd4c423954103/requests-2.24.0-py2.py3-none-any.whl (61 kB) 2020-08-19 00:08:48.611633 | primary | Collecting pbr>=0.5.6 2020-08-19 00:08:48.612027 | primary | Downloading https://mirror.ord.rax.opendev.org/pypifiles/packages/96/ba/aa953a11ec014b23df057ecdbc922fdb40ca8463466b1193f3367d2711a6/pbr-5.4.5-py2.py3-none-any.whl (110 kB) 2020-08-19 00:08:48.612108 | primary | Collecting pymod2pkg 2020-08-19 00:08:48.612465 | primary | Downloading https://mirror.ord.rax.opendev.org/pypifiles/packages/c9/04/d59f150b9e0f9c377dc3efe71f737ddebfb34173d3ca1dc94cef09f20aa1/pymod2pkg-0.25.0.tar.gz (17 kB) 2020-08-19 00:08:48.612503 | primary | 2020-08-19 00:08:48.613225 | primary | :stderr: DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. 
More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
2020-08-19 00:08:48.613423 | primary | ERROR: Package 'pymod2pkg' requires a different Python: 2.7.5 not in '>=3.6'
```
:::

#### https://bugs.launchpad.net/tripleo/+bug/1892078 Train standalone jobs are failing check/gate - unable to resolve tripleo_deploy_control_virtual_ip

#### **NOT A BLOCKER** https://bugs.launchpad.net/tripleo/+bug/1892169 periodic centos 8 train FS20 tempest fails

### osp

First libvirt scratch build done https://bugzilla.redhat.com/show_bug.cgi?id=1868271#c3
* testing has begun

## Tue 18 Aug

### tripleo

ongoing issues duplicated here - no need to check previous days

#### https://bugs.launchpad.net/tripleo/+bug/1892008 periodic OVB train centos 7 failing overcloud configuration (config download) Invalid cross-device link

#### https://bugs.launchpad.net/puppet-openstack-integration/+bug/1891992 [Master][scenario002][ec2api] Tempest test(test_create_delete_bucket) failing

#### https://bugs.launchpad.net/tripleo/+bug/1891372 rocky periodic jobs are failing with " Error: image tripleorocky/centos-binary-tempest:9801dc7461cbd6cbd73868e72e74d21d586c6708_fbb4de96-updated-20200812131025 not found"

#### https://bugs.launchpad.net/tripleo/+bug/1890798 periodic centos8 Ussuri multinode minor update job fails: tderr": "Error: resource 'ip-192.168.24.16' is not running on any node

#### https://bugs.launchpad.net/tripleo/+bug/1890389 [Master] tripleoclient does not play well with cliff-3.4.0

#### https://bugs.launchpad.net/tripleo/+bug/1889122 mirror timeouts in upstream causing undercloud and standalone failures

#### ~~https://bugs.launchpad.net/tripleo/+bug/1891971 ERROR: No matching distribution found for pprint (from -r /home/zuul/src/opendev.org/openstack/tripleo-ci/test-requirements.txt (line 6))~~

#### New/Transient/Nobug yet:

* ~~C8 Ussuri promotion (re-running the failed jobs in testproject- 
https://review.rdoproject.org/r/28994)~~ :::spoiler [/etc/sysconfig/network-scripts/ifup-eth] Error, some other host (FA:16:3E:CB:52:88) already uses address 10.0.0.1. * https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-ussuri/37e391c/job-output.txt ``` 2020-08-17 22:06:20.314877 | primary | [2020/08/17 10:06:20 PM] [ERROR] stdout: ERROR : [/etc/sysconfig/network-scripts/ifup-eth] Error, some other host (FA:16:3E:CB:52:88) already uses address 10.0.0.1. 2020-08-17 22:06:20.314884 | primary | , stderr: WARN : [ifup] You are using 'ifup' script provided by 'network-scripts', which are now deprecated. 2020-08-17 22:06:20.314889 | primary | WARN : [ifup] 'network-scripts' will be removed in one of the next major releases of RHEL. 2020-08-17 22:06:20.314895 | primary | WARN : [ifup] It is advised to switch to 'NetworkManager' instead - it provides 'ifup/ifdown' scripts as well. 
2020-08-17 22:06:20.314901 | primary | 2020-08-17 22:06:20.314907 | primary | Traceback (most recent call last): 2020-08-17 22:06:20.314912 | primary | File "/bin/os-net-config", line 10, in <module> 2020-08-17 22:06:20.314918 | primary | sys.exit(main()) 2020-08-17 22:06:20.314924 | primary | File "/usr/lib/python3.6/site-packages/os_net_config/cli.py", line 349, in main 2020-08-17 22:06:20.314935 | primary | activate=not opts.no_activate) 2020-08-17 22:06:20.314941 | primary | File "/usr/lib/python3.6/site-packages/os_net_config/impl_ifcfg.py", line 1881, in apply 2020-08-17 22:06:20.314963 | primary | raise os_net_config.ConfigurationError(message) 2020-08-17 22:06:20.314968 | primary | os_net_config.ConfigurationError: Failure(s) occurred when applying configuration ```~~~~ ::: :::spoiler ~~FATAL | Gather podman infos | overcloud-controller-1~~ green at test https://review.rdoproject.org/r/#/c/28994/ * https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-ussuri/d670534/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz ``` ATAL | Gather podman infos | overcloud-controller-1 | error={"changed": false, "msg": "Unable to gather info for ['1d80a14d7bed', '7cc5974a59bf', '5c448e9a9b79', '47516ce77307', '469b8894a78c', '9884d1e53afa', '8b3d2d5201eb', '9a8ba7c83a01', '5f304de29125', 'b7c321e8a0ca', 'd342acc6aa41', 'cd858ff84a36', 'c33350c68cbc', '4df3404e0f3a', '632c45236f23', '844896406c71', 'dc9f95bf777b', '793fde778317', 'a6c517024d9d', 'fe3b087c1ade', '1732b1f73e0c', '75bfbfbd8e16', '5a24946172c6', '09dcb0c8e7e9', 'a78f95a57861', '9ab385dc1e1e', '89d1f7f0a79d', 'e7288b4cb64d', '732098f889aa', 'b84405fd0e40', '38071fd3ea74', '08db348b3e0f', '060cab6e76d3', 'f5377e4e694f', 'e518ba88968a', '91841060c926', '971cc8c40ee2', '19b381d57ba5']: Error: error looking up container \"1d80a14d7bed\": no container with name or ID 1d80a14d7bed found: no such 
container\n"} ``` ::: :::spoiler ~~centos7 periodic ovb new bug~~ https://bugs.launchpad.net/tripleo/+bug/1892008 * https://review.rdoproject.org/zuul/buildset/435b17a1c46d44b19362bf0c36d2f25a * https://logserver.rdoproject.org/openstack-periodic-integration-stable2-centos7/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-train/fa33575/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz ::: :::spoiler heat stacks doing better today (fixed typo http://pastebin.test.redhat.com/893941) http://dashboard-ci.tripleo.org/d/wb8HBhrWk/cockpit?orgId=1&fullscreen&panelId=231 ::: :::spoiler 15:11 < Tengu> marios|ruck: hello there! already seen this in stable/ussuri? ERROR: Could not find a version that satisfies the requirement python-glanceclient===3.1.2 15:12 < Tengu> marios|ruck: example job: https://review.opendev.org/#/c/746635/1 * https://24e738e84b2178e7adaa-7d59354a7b453d58914b817150958c54.ssl.cf2.rackcdn.com/746635/1/check/openstack-tox-pep8/9204071/job-output.txt * 15:14 < marios|ruck> Tengu: doesn't seem to be a widespread thing yet but maybe 'incoming' https://zuul.opendev.org/t/openstack/builds?job_name=openstack-tox-pep8# ::: ### osp Phase3 DFG:DF retrospective mtg today UMB still occurs * UMB debugged by 3 people on all layers * looking into Upgrade jobs also ------------- ## Mon 17 Aug ### ~~tripleo~~ #### https://bugs.launchpad.net/tripleo/+bug/1891372 rocky periodic jobs are failing with " Error: image tripleorocky/centos-binary-tempest:9801dc7461cbd6cbd73868e72e74d21d586c6708_fbb4de96-updated-20200812131025 not found" #### https://bugs.launchpad.net/tripleo/+bug/1890798 periodic centos8 Ussuri multinode minor update job fails: tderr": "Error: resource 'ip-192.168.24.16' is not running on any node #### https://bugs.launchpad.net/tripleo/+bug/1890389 [Master] tripleoclient does not play well with cliff-3.4.0 #### https://bugs.launchpad.net/tripleo/+bug/1889122 mirror timeouts in upstream causing undercloud 
and standalone failures #### New/Transient/Nobug yet: ~~* Running c8 ussuri fs030 timed out job: https://review.rdoproject.org/r/28913~~ :::spoiler prefetch_image”, “changed”: false, “msg”: “Failed to pull image * https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-containers-undercloud-minion-train/fcf23ed/logs/subnode-1/home/zuul/minion_install.log.txt.gz * Pre-fetch all the container * [minion] (item=192.168.24.1:8787/tripleotraincentos8/centos-binary-heat-engine:ce647a91d4e57f55df3fbb28bef1cb13-updated-20200816045800) => {"ansible_loop_var": "prefetch_image", "changed": false, "msg": "Failed to pull image 192.168.24.1:8787/tripleotraincentos8/centos-binary-heat-engine:ce647a91d4e57f55df3fbb28bef1cb13-updated-20200816045800", "prefetch_image": "192.168.24.1:8787/tripleotraincentos8/centos-binary-heat-engine:ce647a91d4e57f55df3fbb28bef1cb13-updated-20200816045800"} * Not in centos8 train criteria (http://38.145.34.55/config/CentOS-8/train.ini) ::: :::spoiler Ussuri current-tripleo-rdo promoter error missing containers? * 2020-08-17 04:12:09,289 22066 ERROR promoter Containers promote 'aggregate: 294c650b84e58da9874de4ba41d2a08a, commit: 5609d73554db9dd1d7c6b3c9535801f32ea0bc33, distro: d29c936057b81e6083628a28e6eb3ac94cb40b00, component: tripleo, timestamp: 1596454669' to current-tripleo-rdo: Failed promotionstart logs ----------------------------- * 2020-08-17 04:12:09,292 22066 ERROR promoter TASK [containers-promote : Generate list of containers to push] **************** 2020-08-17 04:12:09,292 22066 ERROR promoter failed: [localhost] (item=base) => {"ansible_loop_var": "item", "changed": true, "cmd": "echo -n \"Checking trunk.registry.rdoproject.org/tripleoussuri/centos-binary-base:294c650b84e58da9874de4ba41d2a08a ... 
\";\ndocker manifest inspect --insecure \"trunk.registry.rdoproject.org/tripleoussuri/centos-binary-base:294c650b84e58da9874de4ba41d2a08a\"\nif [[ \"$?\" == \"0\" ]]; then\n echo base >> /tmp/parsed_containers-centos-8-ussuri.txt;\n echo \"OK\";\nelse\n echo \"FAIL\";\n echo \"ERROR ========== centos-binary-base IS NOT BUILT! FIX THIS ASAP! ==========\";\n exit 1\nfi\n", "delta": "0:00:02.190775", "end": "2020-08-17 04:08:37.987377", "item": "base", "msg": "non-zero return code", "rc": 1, "start": * http://38.145.34.55/logs/centos8_ussuri.log * Failures are legit -> 294c650b84e58da9874de4ba41d2a08a is from 04th Aug, https://trunk.rdoproject.org/centos8-ussuri/tripleo-ci-testing/29/4c/294c650b84e58da9874de4ba41d2a08a/ 13:44 < chkumar|rover> those containers does not exists now in rdo registry so failure is legit 13:44 < chkumar|rover> may be we can improve the messaging part 13:44 < chkumar|rover> let me propose a patch for the same ::: :::spoiler need to clean RDO heat stacks, ports etc.. http://dashboard-ci.tripleo.org/d/wb8HBhrWk/cockpit?orgId=1&fullscreen&panelId=231 ::: :::spoiler rhos-17 component pipeline is all red 2020-08-17 04:40:55.141521 | primary | - Status code: 403 for http://download.devel.redhat.com/rcm-guest/puddles/OpenStack/17.0-RHEL-8/latest-RHOS_TRUNK-17-RHEL-8/compose/OpenStack/x86_64/os/repodata/repomd.xml (IP: 10.0.14.183) https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-component-compute/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-standalone-compute-rhos-17/c87a988/job-output.txt ::: ### osp Investigation of weekend's 16.1 p3 results * hitting titan80 issue - whole p3 broken initially * retriggering also with issues (abandoned mjobs, previous were running) * email explaining it sent * majority of results should be now complete * UMB still not working in automation * reconsidering p2->p3 triggering to be only manual? 
* plan to check OSP13 ntp escalation once new OSP13 is done

## Thu 13 Aug

### ~~tripleo~~

#### ~~https://bugs.launchpad.net/tripleo/+bug/1891317 openstack-tox-tht ci failing with sudo: a password is required\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error~~

#### https://bugs.launchpad.net/tripleo/+bug/1891372 rocky periodic jobs are failing with " Error: image tripleorocky/centos-binary-tempest:9801dc7461cbd6cbd73868e72e74d21d586c6708_fbb4de96-updated-20200812131025 not found"

#### ~~https://bugs.launchpad.net/tripleo/+bug/1891179 periodic OVB train centos 7 failing ovb-manage No server with a name or ID~~

#### https://bugs.launchpad.net/tripleo/+bug/1890798 periodic centos8 Ussuri multinode minor update job fails: tderr": "Error: resource 'ip-192.168.24.16' is not running on any node

#### https://bugs.launchpad.net/tripleo/+bug/1890389 [Master] tripleoclient does not play well with cliff-3.4.0

#### https://bugs.launchpad.net/tripleo/+bug/1889122 mirror timeouts in upstream causing undercloud and standalone failures

#### New/Transient/Nobug yet:

:::spoiler ~~master integration pipeline containers failed (all else skip)~~
* NEXT RUN WAS OK: https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-build-containers-ubi-8-push/933e53f/
* ~~https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-build-containers-ubi-8-push/5c6e3de/logs/build.log~~
* ~~2020-08-13 00:41:06 | 2020-08-13 00:41:06.824 38830 ERROR openstack Stderr: 'error building at STEP "RUN dnf -y install libseccomp podman && dnf clean all && rm -rf /var/cache/dnf": error while running runtime: exit status 1\n'~~
:::

:::spoiler ~~ussuri/train integration pipe ipa image build fail~~
* 
https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-ironic-python-agent-ussuri/a791756/build.log * 2020-08-13 09:27:30.057 | [MIRROR] librados2-14.2.10-1.el8.x86_64.rpm: Status code: 404 for http://mirror.centos.org/centos/8/storage/x86_64/ceph-nautilus/Packages/l/librados2-14.2.10-1.el8.x86_64.rpm (IP: 64.150.179.24) * also hitting gates * https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_434/745924/2/gate/tripleo-ci-centos-8-scenario001-standalone/4342702/job-output.txt 2020-08-13 08:36:45.580573 | primary | Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='opendev.org', port=443): Read timed out. (read timeout=60.0)",)': /openstack/requirements/raw/branch/stable/train/upper-constraints.txt * also hitting train integration pipeline ReadTimeoutError... 'opendev.org'... 
/openstack/requirements/raw/branch/master/upper-constraints.txt
* https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-multinode-1ctlr-featureset030-train/278f19e/job-output.txt
* https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-multinode-1ctlr-featureset010-train/08d200f/job-output.txt
:::

### osp

Fighting to get 16.1 phase3 through
* will likely also attempt during the weekend
* UMB doesn't work, it has to be run manually
* https://projects.engineering.redhat.com/browse/RHOSINFRA-3635

The OSP13 on RHEL-7.9 situation is not clear yet
* https://projects.engineering.redhat.com/browse/RHOSINFRA-3631

## Wed 12 Aug

### ~~tripleo~~

#### ~~https://bugs.launchpad.net/tripleo/+bug/1891293 periodic centos-8 scen10 standalone master fails tempest - octavia_tempest_plugin scenario.v2~~

#### https://bugs.launchpad.net/tripleo/+bug/1891317 openstack-tox-tht ci failing with sudo: a password is required\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error

#### https://bugs.launchpad.net/tripleo/+bug/1891372 rocky periodic jobs are failing with " Error: image tripleorocky/centos-binary-tempest:9801dc7461cbd6cbd73868e72e74d21d586c6708_fbb4de96-updated-20200812131025 not found"

#### https://bugs.launchpad.net/tripleo/+bug/1891179 periodic OVB train centos 7 failing ovb-manage No server with a name or ID

#### ~~https://bugs.launchpad.net/tripleo/+bug/1891000 master network component failing tempest - neutron containers unexpected keyword argument 'libc'~~

#### https://bugs.launchpad.net/tripleo/+bug/1890798 periodic centos8 Ussuri multinode minor update job fails: tderr": "Error: resource 'ip-192.168.24.16' is not running on any node

#### https://bugs.launchpad.net/tripleo/+bug/1890389 [Master] tripleoclient does not play well with cliff-3.4.0

#### 
https://bugs.launchpad.net/tripleo/+bug/1889122 mirror timeouts in upstream causing undercloud and standalone failures #### ~~https://bugs.launchpad.net/tripleo/+bug/1891287 periodic integration pipelines POST_FAIL promoted-components-to-tripleo-ci-testing~~ #### New/Transient/Nobug yet: ::: spoiler ~~POST_FAIL bad buildsets periodic integration & component master/ussuri/train8/stein~~https://bugs.launchpad.net/tripleo/+bug/1891287 * master * https://review.rdoproject.org/zuul/buildset/bbe3ed78d934459e93efd925a157146f * ussuri * https://review.rdoproject.org/zuul/buildset/da5f9d4dfc22433082ea25531f47e49c * train * https://review.rdoproject.org/zuul/buildset/ca0801b6f60d4488b727669eef8669b7 * stein * https://review.rdoproject.org/zuul/buildset/a3eddccced824dda90ab9cbca5cd4f66 * component tripleo https://review.rdoproject.org/zuul/buildset/14a0be8480174a619415ec309bbb735b * FIX Turn off ara-report https://review.rdoproject.org/r/28938 * 11:07 < marios|ruck> ykarel: how did you see that ara was the problem i mean since there were no logs 11:07 < ykarel> marios|ruck, on console for running job ::: :::spoiler ~~11:30 < ykarel> marios|ruck, is scenario010 tempest failures known?~~ https://bugs.launchpad.net/tripleo/+bug/1891293 * 11:30 < ykarel> https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-scenario010-standalone-master 11:31 < marios|ruck> ykarel: no seems new (yesterday) haven't seen it yet noting to dig * https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario010-standalone-master/21e15f7/logs/undercloud/var/log/tempest/stestr_results.html.gz ::: ### osp Jenkins migration to https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/ done by most part * p1/p2/p3 resuming normal operation, but we are monitoring issues closely and expecting/experiencing minor issues (buildmarks, reevaluating reports, UMB issue(s), stuck threads, …) * * 
post-jenkins-jjb-update job - can cause problems with merging patches (updating production jenkins job state), not yet resolved https://projects.engineering.redhat.com/browse/RHOSINFRA-3625
* * * Jenkins restart did not help clear this

CI quality might be degraded because of the libvirt issue (https://trello.com/c/I0ix688S)
* workaround merged and already active in InfraRed https://projects.engineering.redhat.com/browse/RHOSINFRA-3266
* we are expecting a RHEL backport for 8.2.1 (https://bugzilla.redhat.com/show_bug.cgi?id=1868271) and an upgrade of CI nodes to 8.2 soon (https://projects.engineering.redhat.com/browse/RHOSINFRA-3426) to meet the backport and also the needs of the DFGs pushing for it

OSP13 - started moving CI jobs to RHEL-7.9 as default https://projects.engineering.redhat.com/browse/RHOSINFRA-3631

OSP16.2 - first phase1 job added, early development stage https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/QE/view/OSP16.2/

## Tue 11 Aug

### ~~tripleo~~

#### https://bugs.launchpad.net/tripleo/+bug/1891179 periodic OVB train centos 7 failing ovb-manage No server with a name or ID

#### https://bugs.launchpad.net/tripleo/+bug/1891000 master network component failing tempest - neutron containers unexpected keyword argument 'libc'

#### https://bugs.launchpad.net/tripleo/+bug/1890798 periodic centos8 Ussuri multinode minor update job fails: tderr": "Error: resource 'ip-192.168.24.16' is not running on any node

#### https://bugs.launchpad.net/tripleo/+bug/1890389 [Master] tripleoclient does not play well with cliff-3.4.0

#### ~~https://bugs.launchpad.net/tripleo/+bug/1890266 centos 8 security component + integration pipeline master - Failed container(s): ['nova_wait_for_api_service~~

#### https://bugs.launchpad.net/tripleo/+bug/1889122 mirror timeouts in upstream causing undercloud and standalone failures

#### New/Transient/Nobug yet:

:::spoiler ~~train 8 missed promotion for fs1 (tempest)~~
* https://review.rdoproject.org/zuul/buildset/2e0b165400e4422a99545fbef70bfec2
* 
posted testproject https://review.rdoproject.org/r/28927 \o/ promoted ::: :::spoiler ~~train 7 ovb-manage attach instance to network fails~~ https://bugs.launchpad.net/tripleo/+bug/1891179 * https://logserver.rdoproject.org/openstack-periodic-integration-stable2-centos7/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-train-upload/a2ed9b0/job-output.txt * https://logserver.rdoproject.org/openstack-periodic-integration-stable2-centos7/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-train/b972c84/job-output.txt * https://logserver.rdoproject.org/openstack-periodic-integration-stable2-centos7/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-train/5479602/job-output.txt ::: :::spoiler why is master cloudops component not promoting? * green http://dashboard-ci.tripleo.org/d/FE8Hf29Wz/component-pipeline?orgId=1&fullscreen&panelId=426&from=now-7d&to=now * criteria there https://github.com/rdo-infra/ci-config/blob/ede8102125fe32e44972c8730818481915ef367f/ci-scripts/dlrnapi_promoter/config/CentOS-8/component/master.yaml#L27 * logs there https://logserver.rdoproject.org/openstack-promote-component/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-train-component-cloudops-promote-to-promoted-components/c79fb8c/job-output.txt * https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-centos-8-train-component-cloudops-promote-to-promoted-components ::: :::spoiler ~~why train common component isn't promoting?~~ https://review.rdoproject.org/r/28928 * https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-centos-8-train-component-common-promote-to-promoted-components * https://logserver.rdoproject.org/openstack-promote-component/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-train-component-common-promote-to-promoted-components/71bcdc3/job-output.txt * 2020-08-11 
12:05:34.906572 | DEBUG: criteria job --->periodic-tripleo-ci-centos-8-standalone-common-ussuri<---
:::

:::spoiler http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016438.html
* "Ansible 2.8.14 and 2.9.12 change the default mode, that created files will get, from 0666 (with umask; which would usually produce 0644) to 0600. [1]"
* 18:30 < ykarel|away> marios|ruck, ack i fired https://review.rdoproject.org/r/#/c/28929/ to test tripleo deploys with ansible-2.9.12,
:::

### osp

* libvirt bug hit again, typo in my patch, hopefully fixed now by part 2
* https://projects.engineering.redhat.com/browse/RHOSINFRA-3266
* tested with dnsmasq being killed as part of the preceding tasks
* repeats as it should now (part 2)
* unclear yet whether it solves the shiftstack occurrences mentioned in JIRA
* Jenkins restart
* jjb not updating, 504 by Jenkins, threads deadlocked, related to UMB
* possibly can explain the p3 UMB delay? unclear
* CI slaves to RHEL-8.2 upgrade ongoing
* People requesting advanced. 
virt's qemu-kvm should hopefully get it
* once it is done

## Mon 10 Aug

### ~~tripleo~~

#### ~~https://bugs.launchpad.net/tripleo/+bug/1890997 USSURI periodic ovb fs35 fails tempest "computeFault": {"code": 500, "message": "Unexpected API Error~~

#### ~~https://bugs.launchpad.net/tripleo/+bug/1891000 master network component failing tempest - neutron containers unexpected keyword argument 'libc'~~

#### ~~https://bugs.launchpad.net/tripleo/+bug/1890798 periodic centos8 Ussuri multinode minor update job fails: tderr": "Error: resource 'ip-192.168.24.16' is not running on any node~~

#### ~~https://bugs.launchpad.net/tripleo/+bug/1890389 [Master] tripleoclient does not play well with cliff-3.4.0~~

#### ~~https://bugs.launchpad.net/tripleo/+bug/1890266 centos 8 security component + integration pipeline master - Failed container(s): ['nova_wait_for_api_service~~

#### ~~https://bugs.launchpad.net/tripleo/+bug/1889122 mirror timeouts in upstream causing undercloud and standalone failures~~

#### New/Transient/Nobug yet:

:::spoiler 10:40 < ykarel> chkumar|rover, marios|ruck is issue with network component known? 
10:40 < ykarel> we seeing in puppet promotion, so similar should be hitting in network component 10:41 < ykarel> https://logserver.rdoproject.org/ci.centos.org/weirdo-generic-puppet-openstack-scenario001/16211/weirdo-project/logs/neutron/l3-agent.txt.gz 10:46 < marios|ruck> ykarel: haven't seen it and the fails on the component are tempest https://logserver.rdoproject.org/openstack-component-network/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-multinode-1ctlr-featureset010-network-master/b996fdb/job-output.txt 10:46 < marios|ruck> ykarel: noting on the hackmd for now 10:47 -!- tosky [~tosky@dynamic-adsl-78-13-252-77.clienti.tiscali.it] has joined #oooq 10:47 < ykarel> yes tempest failure is likely due to error i shared above 10:48 < ykarel> marios|ruck, and for info it's caused after https://review.opendev.org/#/c/722254/ filed https://bugs.launchpad.net/tripleo/+bug/1891000 ::: :::spoiler new ussuri fs35 blocker tempest fails https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-ussuri/0a45ef9/logs/undercloud/var/log/tempest/stestr_results.html.gz https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-ussuri/45d7fc9/logs/undercloud/var/log/tempest/stestr_results.html.gz filed https://bugs.launchpad.net/tripleo/+bug/1890997 ::: :::spoiler master security component not promoting * https://logserver.rdoproject.org/openstack-promote-component/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-ussuri-component-security-promote-to-promoted-components/05a4bf2/job-output.txt * nit there will fixup https://github.com/rdo-infra/ci-config/commit/73f814994df5ab476f0601fc6b16e1cbf9b16f17 (tripleo component job into security criteria) * https://review.rdoproject.org/r/#/c/28917/ ::: ### osp * Mainly going thru 
p3 results and cross-checking against the libvirt issue
* OSP13 passed 2 shows issues, possibly related to the open NTP escalation
* Attempt to "debug" UMB, since the last p3 was not triggered correctly
* 4hrs delay, but might be caused by the remains of the TLV outage
* we'll keep monitoring, since we don't have any tools to debug it (afaik)

## Fri 07 Aug
### ~~tripleo~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1890798 periodic centos8 Ussuri multinode minor update job fails: tderr": "Error: resource 'ip-192.168.24.16' is not running on any node~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1890389 [Master] tripleoclient does not play well with cliff-3.4.0~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1890266 centos 8 security component + integration pipeline master - Failed container(s)['nova_wait_for_api_service~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1885314 vexx: OVB master job running on vexxhost show some nodes failing introspection step~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889122 mirror timeouts in upstream causing undercloud and standalone failures~~
#### ~~New/Transient/Nobug yet:~~
* ~~periodic-tripleo-ci-centos-8-multinode-1ctlr-featureset030-ussuri - timeout re-running in testproject - https://review.rdoproject.org/r/28889~~

:::spoiler ~~periodic-tripleo-ci-centos-8-scenario000-multinode-oooq-container-updates-ussuri - rerunning in testproject~~ https://review.rdoproject.org/r/28890
* https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario000-multinode-oooq-container-updates-ussuri/dba0e70/logs/undercloud/home/zuul/overcloud_update_run_Controller.log.txt.gz
* "Failed container(s): ['mysql_init_bundle'], check logs in /var/log/containers/stdouts/"
*
https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario000-multinode-oooq-container-updates-ussuri/dba0e70/logs/subnode-1/var/log/extra/podman/containers/mysql_init_bundle/stdout.log.txt.gz
:::

:::spoiler ~~train 8 can promote today \o/ (hope)~~
* https://review.rdoproject.org/zuul/buildset/8e9bfa494c934cd681b7f6e450c530c9
* ruck|rover tmux systemctl restart service to kick @ 0905UTC
:::

### osp
* CI resurrection in TLV2 lab, focus on 0804 p3 16.1 to be the only one that needs p3 run
* build-marks broken, mjob not reevaluating
* fixed by psedlak, not clear how yet, still TBI
* libvirt bug fix testing done and merged
* https://projects.engineering.redhat.com/browse/RHOSINFRA-3266
* p3 triggered by UMB, but triggering issue found?!
* delay in UMB reaction 4hrs? What the heck?!
* TBI
* but it ran in the end, results are good (relatively)
* a lot of jobs reached yellow/blue
* from a quick view couldn't find any libvirt bug being hit

## Thu 06 Aug
### ~~tripleo~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1890571 periodic integration/component jobs failing "[Zuul] Log Stream did not terminate"~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1890389 [Master] tripleoclient does not play well with cliff-3.4.0~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1890266 centos 8 security component + integration pipeline master - Failed container(s): ['nova_wait_for_api_service~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1885314 vexx: OVB master job running on vexxhost show some nodes failing introspection step~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889122 mirror timeouts in upstream causing undercloud and standalone failures~~
#### ~~New/Transient/Nobug yet:~~
:::spoiler ~~train gate too many requests tripleo/+bug/1889122 https://review.opendev.org/#/c/744955/~~
* https://bugs.launchpad.net/tripleo/+bug/1889122/comments/31
:::

:::spoiler ~~integration pipelines
failing:~~
* master NODE_FAILURE periodic-tripleo-centos-8-master-promote-promoted-components-to-tripleo-ci-testing everything else skips
* ussuri dlrn error failed https://review.rdoproject.org/zuul/buildset/2415ed1dde0146159a927472a25ade7f for https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-ussuri-promote-promoted-components-to-tripleo-ci-testing/33f9395/job-output.txt 2020-08-06 02:04:09.946585 | primary | dlrnapi_client.rest.ApiException: (500)
* train https://review.rdoproject.org/zuul/buildset/4194b87620d94559becd39ed94490697 - different things going on strange http://paste.openstack.org/raw/796624/
:::

### osp
* outage continues, prolonged by IT by 1 day
* build-marks broken, mjob not reevaluating, libvirt bug workaround not ready
* p3 is not ready, maybe tomorrow
* libvirt bug workaround testing continues
* increasing loop count and checking results in chunks of 30-100 runs to get data
* https://projects.engineering.redhat.com/browse/RHOSINFRA-3266

## Wed 05 Aug
### ~~tripleo~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1890389 [Master] tripleoclient does not play well with cliff-3.4.0~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1890266 centos 8 security component + integration pipeline master - Failed container(s): ['nova_wait_for_api_service~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889764 /sbin/pcs cluster setup tripleo_cluster standalone addr=192.168.24.1 --token 10000 --encryption 1' returned 1 instead of one of [0]~~
train https://review.opendev.org/#/c/744192/ merged
#### ~~https://bugs.launchpad.net/tripleo/+bug/1885314 vexx: OVB master job running on vexxhost show some nodes failing introspection step~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889122 mirror timeouts in upstream causing undercloud and standalone failures~~
#### New/Transient/Nobug yet:
:::spoiler NODE_FAIL periodic integration train first run today & skipped all - also
master/component
* train: https://review.rdoproject.org/zuul/buildset/63414913629b4829afe5a3ad8818e91c
* master: https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-integration-main
* stein: https://review.rdoproject.org/zuul/buildset/65c20e14386c4ac79678fd97d38cde55
* 11:16 < chandankumar> jpena|on_duty, https://review.rdoproject.org/zuul/builds?result=NODE_FAILURE
* another ticket to Vexxhost: #UAO-863218. It all points to some issue in Neutron
:::

:::spoiler various gate timeout & fails including 'too many requests'
* https://bugs.launchpad.net/tripleo/+bug/1889122/comments/16
* actual timeouts ... a tempest fail a 'too many requests' http://paste.openstack.org/raw/796590/
:::

### osp
* Planned TLV outage + Jenkins to TLV2 migration
* http://post-office.corp.redhat.com/archives/rhos-qe-dept/2020-August/msg00214.html
* ~100 p3 jobs cancelled manually

## Tue 04 Aug
### ~~tripleo~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1890266 centos 8 security component + integration pipeline master - Failed container(s): ['nova_wait_for_api_service~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889764 /sbin/pcs cluster setup tripleo_cluster standalone addr=192.168.24.1 --token 10000 --encryption 1' returned 1 instead of one of [0]~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1885314 vexx: OVB master job running on vexxhost show some nodes failing introspection step~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889122 mirror timeouts in upstream causing undercloud and standalone failures~~
#### New/Transient/Nobug yet:
:::spoiler train promotion blocked https://bugs.launchpad.net/tripleo/+bug/1889764/comments/10
* waiting for https://review.opendev.org/#/c/744192/ to go through gates ...
it is currently blocked by https://review.opendev.org/#/c/744499/2 to fix the puppet-openstack-lint-ubuntu-bionic FAILURE in 5m 06s (thanks chkumar|rover)
* latest buildset https://review.rdoproject.org/zuul/buildset/69d381f19f414a6493fc3fecb7b2e51d
* https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario004-standalone-train/2d7e1ed/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz
* https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-multinode-1ctlr-featureset030-train/77af4b6/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
* https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-train/f0b29d6/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
:::

:::spoiler vexx introspection/resource issues continue - examples periodic/traincentos8
* https://bugs.launchpad.net/tripleo/+bug/1885314/comments/3
* https://bugs.launchpad.net/tripleo/+bug/1885314/comments/4
* 13:24 < sshnaidm_> marios|ruck, chkumar|rover just fyi, vexx has problems with heat stacks in ovb again
13:24
13:26 < sshnaidm_> marios|ruck, I have from yesterday, https://logserver.rdoproject.org/87/28387/3/check/periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset021-train/b22d09f/job-output.txt
:::

::: spoiler Config update on promoter https://review.rdoproject.org/r/28842
* Add ubi-8 container build job to ussuri promotion criteria - https://review.rdoproject.org/r/28842
* Update it on Aug 5 once we get few runs
:::

### osp
phase2 OSP16.1 new puddle (RHOS-16.1-RHEL-8-20200803.n.0)
* hitting libvirt flake https://bugzilla.redhat.com/show_bug.cgi?id=1840307
* JIRA well discussed: https://projects.engineering.redhat.com/browse/RHOSINFRA-3266
* escalation: https://trello.com/c/I0ix688S
*
http://post-office.corp.redhat.com/archives/rhos-qe-dept/2020-August/msg00154.html

## Mon 03 Aug
### ~~tripleo~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889764 /sbin/pcs cluster setup tripleo_cluster standalone addr=192.168.24.1 --token 10000 --encryption 1' returned 1 instead of one of [0]~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889122 mirror timeouts in upstream causing undercloud and standalone failures~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1885314 vexx: OVB master job running on vexxhost show some nodes failing introspection step~~
#### ~~New/Transient/Nobug yet:~~
* ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. , Code: 500

::: spoiler EXPAND for links
* CREATE_FAILED ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 0409b287-7500-43dd-a6bc-a0dea4a62ab4., Code: 500
* https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train/d7cf899/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
* https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train/302fa7e/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
* https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-train/48bfa84/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
* Re-running failed ovb jobs here: https://review.rdoproject.org/r/#/c/28822/
* Notes:
* After running the above failed jobs on rdocloud, during overcloud deployment it is hitting https://bugs.launchpad.net/tripleo/+bug/1889764 (pcs issue)
:::

* ~~marios|ruck cannot
access RDO promoter~~

### osp
* Amnon suspects something wrong with OSP16.1 p3 trigger as phase2 passed and p3 was not triggered?
* Decided to postpone p3, waiting for BootHole fixed puddle later today
* http://post-office.corp.redhat.com/archives/rhos-qe-dept/2020-August/msg00031.html
* fhubik&tkorol's sync on R&R basics, knowledge transfer

## Fri 31 Jul
### ~~tripleo~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889764 /sbin/pcs cluster setup tripleo_cluster standalone addr=192.168.24.1 --token 10000 --encryption 1' returned 1 instead of one of [0]~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889122 mirror timeouts in upstream causing undercloud and standalone failures~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889357 Centos7 Check/Gate jobs failing with UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 2: ordinal not in range(128)~~
#### ~~Periodic promotion~~
* ~~rekicking stein and queens promotion: https://review.rdoproject.org/r/#/c/28797/~~
#### New/Transient/Nobug yet:
:::spoiler gate - vexx introspection issues
### gate
* https://9a2c06ca6797eeff848d-08206d950263e10ee10a8fbd7831cbcc.ssl.cf1.rackcdn.com/740302/13/gate/tripleo-ci-centos-8-scenario003-standalone/606f802/logs/undercloud/var/log/tripleo-container-image-prepare.log ``` TIMING | tripleo-modify-image : Write Dockerfile to {{ modify_dir_path }} | 0:00:22.924 | 0.04s\n\x1b[Ke30=\x1b[4D\x1b[K' Stderr: '[WARNING]: provided hosts list is empty, only localhost is available.
Note that\nthe implicit localhost does not match \'all\'\nFatal Python error: GC object already tracked\n\nCurrent thread 0x00007f75f2a69700 (most recent call first):\n File "/usr/lib64/python3.6/multiprocessing/connection.py", line 390 in _recv\n File "/usr/lib64/python3.6/multiprocessing/connection.py", line 411 in _recv_bytes\n File "/usr/lib64/python3.6/multiprocessing/connection.py", line 220 in recv_bytes\n File "/usr/lib64/python3.6/multiprocessing/queues.py", line 94 in get\n File "/usr/lib/python3.6/site-packages/ansible/plugins/strategy/__init__.py", line 84 in results_thread_main\n File "/usr/lib64/python3.6/threading.py", line 864 in run\n File "/usr/lib64/python3.6/threading.py", line 916 in _bootstrap_inner\n File "/usr/lib64/python3.6/threading.py", line 884 in _bootstrap\n\nThread 0x00007f7602bca740 (most recent call first):\n File "/usr/lib/python3.6/site-packages/ansible/plugins/strategy/__init__.py", line 788 in _wait_on_pending_results\n File "/usr/lib/python3.6/site-packages/ansible/plugins/strategy/linear.py", line 325 in run\n File "/usr/lib/python3.6/site-packages/ansible/executor/task_queue_manager.py", line 244 in run\n File "/usr/lib/python3.6/site-packages/ansible/executor/playbook_executor.py", line 169 in run\n File "/usr/lib/python3.6/site-packages/ansible/cli/playbook.py", line 127 in run\n File "/bin/ansible-pl ```
* https://4f473903d10afdc4be34-fffc383f4fef4c1ab3e9fce9c63696ff.ssl.cf1.rackcdn.com/743629/12/gate/tripleo-build-containers-centos-8/f1d633f/logs/containers-build-errors.log
```
grep: /tmp/container-*/docker: No such file or directory
```
* introspection issues on vexxhost continue: testproject rechecked the train/stein/rocky/queens jobs but still a lot failing introspection. Note that there was a vexxhost update at the end of last week
* run patches on rdo-cloud if needed e.g. https://review.rdoproject.org/r/#/c/28705/
* To run periodic jobs in rdo-cloud vs. vexxhost (introspection errors) e.g.
https://review.rdoproject.org/r/#/c/28704/2/zuul.d/integration-pipeline-stable2-centos7.yaml
#### Investigation continues..
* https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-multinode-1ctlr-featureset037-updates-train/9762bb2/logs/undercloud/var/log/paunch.log.txt.gz
* https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train/aa76fed/logs/undercloud/var/log/extra/errors.txt.txt.gz [Need to confirm as per paunch log but hitting introspection failure]
* https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-train/8e08f50/logs/undercloud/var/log/extra/errors.txt.txt.gz
```
2020-07-31 05:19:52.344 84425 ERROR paunch [ ] Error executing ['podman', 'container', 'exists', 'rabbitmq_init_logs']: returned 1
2020-07-31 05:19:52.435 84425 WARNING paunch [ ] Did not find container with "['podman', 'ps', '-a', '--filter', 'label=container_name=rabbitmq_init_logs', '--filter', 'label=config_id=tripleo_step1', '--format', '{{.Names}}']" - retrying without config_id
2020-07-31 05:19:52.556 84425 WARNING paunch [ ] Did not find container with "['podman', 'ps', '-a', '--filter', 'label=container_name=rabbitmq_init_logs', '--format', '{{.Names}}']"
```
:::

::: spoiler ~~Best~~ worst case (for wed) centos-7 train candidate
https://trunk.rdoproject.org/api-centos-train/api/civotes_detail.html?commit_hash=6f44509dcb4faa4bc0340ae138c86d77ef2e2c84&distro_hash=14a932b45720ecb4c2cc5a8811fc6c59ba6255d5
:::

### osp
fhubik/tkorol taking over from previous rucks
* knowledge transfer
* few CI p2 retriggers, but no new fires so far

## Thu 30 Jul
### ~~tripleo~~
#### ~~New/Transient/Nobug yet:~~
* ~~ruck|rover need access to promoters - e.g.
http://38.145.34.55/centos8_master.log-20200529 --> missing jobs [u'periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-master BUT https://github.com/rdo-infra/ci-config/blob/722d298ca4baee00f79ad91968d9fd3ff801fbb6/ci-scripts/dlrnapi_promoter/config/CentOS-8/master.ini#L40~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889524 periodic centos8 ovb featureset 1 baremetal master fails ironic unexpected keyword argument 'hash_function~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889529 periodic centos8 scenario10 network component standalone master timeout tempest conflicting state~~
#### ~~https://bugs.launchpad.net/tripleo/+bug/1889553 centos8 periodic master jobs failing tempest 'no more ip addresses'~~

~~FYI.. tripleo centos7 ovb jobs on rdocloud https://review.rdoproject.org/r/#/c/28705/~~

~~Best centos-7 train candidate https://trunk.rdoproject.org/api-centos-train/api/civotes_detail.html?commit_hash=6f44509dcb4faa4bc0340ae138c86d77ef2e2c84&distro_hash=14a932b45720ecb4c2cc5a8811fc6c59ba6255d5~~

~~triggered centos8 train~~

### osp
R&R transfer, sprint planning, shifting duties, finishing prev. tasks