ruck_rover
ruck/rover primer: https://docs.openstack.org/tripleo-docs/latest/ci/ruck_rover_primer.html
Infrared gerrit: https://review.gerrithub.io/q/project:redhat-openstack/infrared
Infrared doc: https://infrared.readthedocs.io/en/latest/
Cockpit: http://tripleo-cockpit.usersys.redhat.com/d/9DmvErfZz/cockpit?orgId=1
Internal Cockpit (WIP) http://tripleo-cockpit.usersys.redhat.com/?orgId=1
http://cistatus.tripleo.org/
https://trello.com/b/j4IcIomh/production-chain-escalation
http://rhos-release.virt.bos.redhat.com:3030/rhosp
Debugging Tools https://docs.google.com/document/d/1VZhje7ZN9sk4E31fYVrPxpqMJGz5ZhHRfhte_RYMXxg/edit#
Review.rdoproject.org dashboard: https://review.rdoproject.org/grafana/?orgId=1&var-datasource=default&var-server=registry.rdoproject.org.rdocloud&var-inter=$__auto_interval_inter
CentOS pre-release rpm updates for minor releases http://mirror.centos.org/centos/7/cr/x86_64/Packages/
hackmd.io rh-openstack-dev
https://hackmd.io/team/rh-openstack-ci?nav=overview
Internal software factory: https://sf.hosted.upshift.rdu2.redhat.com
upstream rsync mirror logs: files.openstack.org/mirror/logs/rsync-mirrors/centos.log
TRELLO RETROSPECTIVE https://trello.com/b/0VFswmht/rdo-infra-retrospective?menu=filter&filter=label:UniSprint21
Internal Dashboard - https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/QE/view/OSP16/ OSP-10 - OSP-16
RHOS INFRA INFRARED ISSUES https://projects.engineering.redhat.com/issues/?filter=34183
CIX escalation https://mojo.redhat.com/docs/DOC-1098748#jive_content_id_CIX_Escalation_Automation_and_email_format
CIX board https://trello.com/b/j4IcIomh/production-chain-escalation
Nodepool image logs: https://softwarefactory-project.io/nodepool-log/
We may want to move this etherpad to something internal at this point
please add your (colored) name here: time to move to hackmd WDYT? +1 (either now - start of the sprint/rr - or in 3 weeks)
marios (baby blue) fhubik ("green lantern") wznoinsk (orange) Amnon (Marrooned)
Dates: April 16 - May 7
Tripleo CI team ruck|rover: Gabriele (panda) && Amol (akahat)
OSP CI team ruck|rover (April 24 - May 15): Filip (fhubik), Vadim (vgriner)
Previous notes: link
put these issues in the spoiler.
@akahat FYI: @arxcruz is investigating the tempest failures in stein.
@TheG Please work with the networking team to bring https://zuul.openstack.org/builds?job_name=tripleo-ci-centos-8-scenario010-ovn-provider-standalone online.
CentOS-7 OVB jobs are RED fs001
https://bugs.launchpad.net/tripleo/+bug/1875731
https://bugs.launchpad.net/tripleo/+bug/1876972
TRAIN: GREEN
STEIN: Tempest fail ( arx is looking at it )
ROCKY: Tempest fail ( @arxcruz FYI)
QUEENS: Tempest fail ( @arxcruz FYI)
Thank you!
Launchpad bug | Name | Status | Review |
---|---|---|---|
1873770 | OVB fs001 in centos8 master fails to push certificates contents to controllers | Incomplete | |
1873892 | Non root login prevented on overcloud machines | Fixed Release | |
1874019 | scenario009-multinode.yaml and openshift.yaml is missing | In Progress | |
1875352 | keystone container failed to start in scenario000 | Triaged | |
1875871 | periodic rocky jobs failing with missing --name argument for pcs | Triaged | |
1875846 | Overcloud stack creation failed because of failed dependencies. | Closed | |
1875833 | The WebSocket timed out before the Workflow completed in rocky/stein jobs | New | |
1876087 | Queens, tempest.scenario.test_network_basic_ops.TestNetworkBasicOps failing. Timeout | Triaged | |
1876096 | Queens: tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern tests failed | Triaged | |
1876672 | Python 2 - AttributeError: 'module' object has no attribute 'get_makefile_name' | Fixed Release | |
1876893 | Error: error removing container - device or resource busy | In Progress | |
1877031 | queens tripleo-ci-centos-7-undercloud-upgrades broken for ansible version | | |
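
To turn the bug table above into clickable links, a minimal sketch follows. It assumes the IDs are Launchpad bugs under the tripleo project (the same URL pattern as the bugs.launchpad.net links elsewhere in these notes) and that "Fixed Release" and "Closed" are the only statuses that need no further attention. The names `parse_bug_table`, `open_bugs` and `DONE_STATUSES` are made up for illustration, not part of any existing tool.

```python
# Assumption: bug IDs are Launchpad IDs under the tripleo project,
# matching the bugs.launchpad.net/tripleo/+bug/<id> links in these notes.
LAUNCHPAD_BUG_URL = "https://bugs.launchpad.net/tripleo/+bug/{bug_id}"

# Assumption: these two statuses mean the bug needs no further attention.
DONE_STATUSES = {"Fixed Release", "Closed"}

COLUMNS = ("bug_id", "name", "status", "review")


def parse_bug_table(text):
    """Parse pipe-separated rows into dicts, skipping header and separator rows."""
    rows = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if not cells[0].isdigit():
            continue  # header, separator, or blank line
        # Pad short rows so every row has the same four fields.
        cells += [""] * (len(COLUMNS) - len(cells))
        rows.append(dict(zip(COLUMNS, cells)))
    return rows


def open_bugs(rows):
    """Return (bug_id, url) pairs for bugs whose status is not in DONE_STATUSES."""
    return [
        (row["bug_id"], LAUNCHPAD_BUG_URL.format(bug_id=row["bug_id"]))
        for row in rows
        if row["status"] not in DONE_STATUSES
    ]


SAMPLE_TABLE = """\
Bug | Name | Status | Review |
---|---|---|---|
1873892 | Non root login prevented on overcloud machines | Fixed Release | |
1876893 | Error: error removing container - device or resource busy | In Progress | |
1877031 | queens tripleo-ci-centos-7-undercloud-upgrades broken for ansible version |
"""

if __name__ == "__main__":
    for bug_id, url in open_bugs(parse_bug_table(SAMPLE_TABLE)):
        print(bug_id, url)
```

Rows with a missing status (like 1877031) are treated as still open, so they stay on the watch list until someone fills the status in.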
holiday in CZ (fhubik), off day in TLV (vgriner)
tripleo-ci-centos-7-containerized-undercloud-upgrades
Need to watch:
Pacemaker patch for queens, rocky
https://review.rdoproject.org/r/#/c/27035/
Latest patch for ssh deployment failures:
https://review.opendev.org/#/c/723824/
containers-multinode Featureset010 is failing in queens, stein and rocky jobs. There is only one tempest test configured in that job.
Since there is only one tempest test, instead of skipping it I have to turn tempest off completely.
Arx, Soniya: I need your help to diagnose and file bugs for queens, rocky and stein for the test failure if it appears to be unique per release.
We have queens: https://bugs.launchpad.net/tripleo/+bug/1876087
We need bugs on:
Stein: https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset010-stein/05b9be9/logs/tempest.html.gz
Rocky: https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset010-rocky/e5f9368/logs/tempest.html.gz
These all could be related, but we need to confirm that of course. Thank you for interrupting your schedule to help check these out.
Investigating periodic queens failures
reviewing to unblock queens
https://review.opendev.org/724703 (still failing, network hiccups)
holiday in CZ (fhubik), off day in TLV (vgriner)
Openstack periodic latest release pipeline
List of issues found:
fhubik dealing with foreign jobs invading p2 for no reason
periodic-tripleo-ceph-integration-centos-8-scenario004-standalone
Job name
WATCH tripleo-ci-centos-7-containerized-undercloud-upgrades
master centos8 failing to build containers because of a network issue while downloading repo info for openvswitch
tempest component still failing on the IP allocation error. Tempest guys are aware of the CIX bug there but are joining the investigation
Promoter raises an error during promotion, but the promotion is not affected. Investigating promotion code
Stein is 2 days behind. looking at logs.
Need to watch
Jobs:
Replaced with
Jobs need to watch:
Bugs:
2020-04-22 17:14:47,863 25243 ERROR promoter Candidate hash 'aggregate: b3720367a6a0349abcfb06939bed3101, commit: 50837618bdbc4ee18ba25da00a4d98cae9744d68, distro: 99ace58fa85ff53a3de0c282131df46336f81d66, component: ui, timestamp: None': client dlrn_client FAILED promotion attempt to current-tripleo
2020-04-22 17:14:47,863 25243 ERROR promoter API returned different promoted hash
Traceback (most recent call last):
  File "/home/centos/ci-config-refactored/ci-scripts/dlrnapi_promoter/logic.py", line 140, in promote
    candidate_label=candidate_label)
  File "/home/centos/ci-config-refactored/ci-scripts/dlrnapi_promoter/dlrn_client.py", line 359, in promote
    candidate_label=candidate_label)
  File "/home/centos/ci-config-refactored/ci-scripts/dlrnapi_promoter/dlrn_client.py", line 550, in promote_hash
    raise PromotionError("API returned different promoted hash")
PromotionError: API returned different promoted hash
2020-04-22 17:14:47,866 25243 ERROR promoter Error while trying to promote tripleo-ci-testing to current-tripleo
2020-04-22 17:14:47,866 25243 WARNING promoter Candidate label 'tripleo-ci-testing': NO candidate hash promoted to current-tripleo
2020-04-22 17:14:47,866 25243 INFO promoter Candidate label 'current-tripleo': Attempting promotion to 'current-tripleo-rdo'
2020-04-22 17:14:48,810 25243 INFO promoter Candidate label 'current-tripleo': Fetched 10 hashes
2020-04-22 17:14:49,527 25243 WARNING promoter Target label 'current-tripleo-rdo': No hashes fetched. This could mean that the target label is new or it's the wrong label
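
For digging through longer promoter logs, here is a rough sketch that pulls out which promotions failed. It only assumes the line shape seen in the excerpt above (timestamp, pid, level, the word "promoter", then the message). The helper names `parse_promoter_log` and `failed_promotions` are hypothetical and not part of the promoter code itself.

```python
import re

# Line shape taken from the log excerpt in these notes:
# "<date> <time>,<ms> <pid> <LEVEL> promoter <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) promoter (?P<msg>.*)$"
)

# Message format copied from the "Error while trying to promote" line above.
FAILED_PROMOTION = re.compile(r"Error while trying to promote (\S+) to (\S+)")


def parse_promoter_log(text):
    """Return one dict per log line; traceback lines do not match and are skipped."""
    entries = []
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def failed_promotions(entries):
    """Return (candidate_label, target_label) pairs for failed promotion attempts."""
    pairs = []
    for entry in entries:
        if entry["level"] != "ERROR":
            continue
        match = FAILED_PROMOTION.search(entry["msg"])
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


SAMPLE_LOG = """\
2020-04-22 17:14:47,863 25243 ERROR promoter API returned different promoted hash
Traceback (most recent call last):
PromotionError: API returned different promoted hash
2020-04-22 17:14:47,866 25243 ERROR promoter Error while trying to promote tripleo-ci-testing to current-tripleo
2020-04-22 17:14:48,810 25243 INFO promoter Candidate label 'current-tripleo': Fetched 10 hashes
"""

if __name__ == "__main__":
    print(failed_promotions(parse_promoter_log(SAMPLE_LOG)))
```

This only reports what the promoter itself logged as an error. As noted above, the promotion may still have gone through, so the result is a list of things to check rather than a list of confirmed failures.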
Gate jobs failing:
tripleo-ci-centos-7-undercloud-containers
tripleo-ci-centos-7-scenario009-multinode-oooq-container
openshift.yaml and scenario009-multinode.yaml files are missing.
tripleo-ci-centos-8-containers-multinode
Periodic jobs:
periodic-tripleo-centos-8-master-containers-build-push
Need to watch:
periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master fails, opened https://bugs.launchpad.net/tripleo/+bug/1873770
Need to watch jobs:
ansible_user: tripleo-admin. Primary guess: this job is not ported properly in Train.
Noticing that the latest patches for glance https://review.opendev.org/#/c/712533/ are not consistently resolving the previous scenario01/02 issues. Watching.
@TheG tripleo-ci-centos-7-containerized-undercloud-upgrades should be voting for everything that is not master … let's take a look
@akahat Amol, please watch periodic-openstack-master and openstack-periodic-latest-released in review.rdoproject.org