Ruck Rover 2022-08-05 to 2022-08-11

tags: ruck_rover
Previous RR notes: https://hackmd.io/H9CSoXvlTm6nTZ4bsJkeRg

Cockpit

Downstream cockpit

OpenStack Program Meeting 2022

Downstream promoter


2022-08-11

New/Transient no bug yet

New bugs today

https://bugs.launchpad.net/tripleo/+bug/1985031 tripleo-tox-molecule consistently failing with failed: [localhost] (item=https://images.rdoproject.org/centos8/train/rdo_trunk/current-tripleo/ironic-python-agent.tar)

Gate ongoing blocker https://bugs.launchpad.net/tripleo/+bug/1984175/comments/9

Check Jobs

master blocked cix https://bugs.launchpad.net/bugs/1984184

wallaby c9 blocked cix https://bugs.launchpad.net/tripleo/+bug/1984453

wallaby c8 chasing https://review.rdoproject.org/r/c/testproject/+/44517 Run missing fs1 job for wallaby/8 c36ce5aec4d97093b683c04f3ee56212

train chasing https://review.rdoproject.org/r/c/testproject/+/44516 Run missing train jobs for 29018450b497197a282c6d4463b4c745

rhos17.1 on rhel9

https://bugzilla.redhat.com/show_bug.cgi?id=2116287 - major blocker downstream at the moment, across all releases. We haven't promoted a single line due to this infra issue.

Promotions:

(https://docs.google.com/document/d/1n6ArkMh68R9zivjlyGbpedkggk1wMwEIcrMZSN2uIjc/edit#heading=h.hqhtw5tvhd63)

  • OSP 16.2 RHEL-8 promoted on Aug 4, 2022
  • OSP 17 RHEL-8 promoted on Aug 5, 2022
  • OSP 17 RHEL-9 promoted on Aug 7, 2022
  • OSP 17.1 RHEL-9 promoted on July 27, 2022

rhos17 on rhel9

rhos16.2

Upstream components

master

wallaby

wallaby c8

train

Downstream components

rhos17 on rhel9

rhos16.2


2022-08-10

New/Transient no bug yet

New bugs today

https://bugs.launchpad.net/tripleo/+bug/1984175 tripleo-ci-centos-9-undercloud-upgrade - cannot install both NetworkManager-1:1.39.5-1.el9.x86_64 and NetworkManager-1:1.39.12

https://bugs.launchpad.net/tripleo/+bug/1984453 periodic-tripleo-ci-centos-9-ovb-1ctlr_1comp-featureset002-master is failing image build - dependency fence-agents-common = 4.10.0-28.el9

https://bugs.launchpad.net/tripleo/+bug/1984184 fs001 and fs035 OVB jobs failing to set up private network for os_tempest (patch under test: https://review.rdoproject.org/r/c/testproject/+/36254)

(NON VOTING) https://bugs.launchpad.net/tripleo/+bug/1984237 [FIPS] Standalone deploy failing with: "Error in GnuTLS initialization: Error while performing self checks"

Gate green

  • looks like a new issue on undercloud-upgrade

Check Jobs

master ongoing upstream RETRY bug https://bugs.launchpad.net/tripleo/+bug/1983817/comments/10

wallaby c9

wallaby c8

train

rhos17.1 on rhel9

rhos17 on rhel9

rhos17 on rhel8

rhos16.2

Upstream components

master

wallaby c9

wallaby c8

train

Downstream components

rhos17 on rhel9

rhos17 on rhel8

rhos16.2


2022-08-09

New bugs today

https://bugs.launchpad.net/tripleo/+bug/1984035 seen on cs8 wallaby while uploading the image

Gate green

Check Jobs

master ongoing issue all branches https://bugs.launchpad.net/tripleo/+bug/1983817/comments/5

https://opendev.org/openstack/openstack-ansible-os_tempest/commit/f8c8a1ed6c59dcbf1fbe66d137d715e53af2ff51 looks like the latest commit taken by openstack-os_tempest, marked as the master branch as of 07/13. https://opendev.org/openstack/openstack-ansible-os_tempest/graph shows other available commits. Maybe check with Arx on the impact of this change, or on moving the branch. Tried these jobs on IBM cloud; they fail at the same point.

wallaby c9

wallaby c8

train

rhos17.1 on rhel9

rhos17 on rhel9

rhos17 on rhel8

rhos16.2

Upstream components

master

wallaby c9

wallaby c8

train

Downstream components

rhos17 on rhel9

rhos17 on rhel8

rhos16.2

  • periodic-tripleo-ci-rhel-8-scenario010-standalone-network-rhos-16.2: only the latest build failed and the previous history is green, so rechecking.

  • periodic-tripleo-ci-rhel-8-scenario007-standalone-network-rhos-16.2: build history is good apart from one retry_limit and one node_failure, so rechecking.

  • periodic-tripleo-ci-rhel-8-standalone-network-rhos-16.2: build history is good except for one retry_limit and one node_failure, so rechecking.

  • periodic-tripleo-ci-rhel-8-ovb-3ctlr_1comp-featureset001-internal-network-rhos-16.2: currently failing with retry_limit, which is the real issue we are facing now. https://bugzilla.redhat.com/show_bug.cgi?id=2116287

  • periodic-tripleo-ci-rhel-8-ovb-3ctlr_1comp-featureset001-internal-network-rhos-16.2 (currently failing with node_failures/retry_limits)

2022-08-05 15:20:16 | 2022-08-05 15:20:01Z [overcloud.Controller.2.Controller]: CREATE_FAILED  ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. , Code: 500"
2022-08-05 15:20:16 | 2022-08-05 15:20:01Z [overcloud.Controller.2]: CREATE_FAILED  Resource CREATE failed: ResourceInError: resources.Controller: Went to status ERROR due to "Message: Build of instance 278b9381-5621-435c-9b8e-0fd6e83e4898 aborted: Failed to provision instance 278b9381-5621-435c-9b8e-0fd6e83e4898: Failed to prepare to de
2022-08-05 15:20:16 | 2022-08-05 15:20:01Z [overcloud.ComputeIpListMap.EnabledServicesValue]: CREATE_IN_PROGRESS  state changed
2022-08-05 15:20:16 | 2022-08-05 15:20:01Z [overcloud.ComputeIpListMap.EnabledServicesValue]: CREATE_COMPLETE  state changed
2022-08-05 15:20:16 | 2022-08-05 15:20:01Z [overcloud.ComputeIpListMap]: CREATE_COMPLETE  Stack CREATE completed successfully
2022-08-05 15:20:16 | 2022-08-05 15:20:01Z [overcloud.Controller.2]: CREATE_FAILED  ResourceInError: resources[2].resources.Controller: Went to status ERROR due to "Message: Build of instance 278b9381-5621-435c-9b8e-0fd6e83e4898 aborted: Failed to provision instance 278b9381-5621-435c-9b8e-0fd6e83e4898: Failed to prepare to deploy: IPMI 
2022-08-05 15:20:16 | 2022-08-05 15:20:01Z [overcloud.Controller]: UPDATE_FAILED  Resource CREATE failed: ResourceInError: resources[2].resources.Controller: Went to status ERROR due to "Message: Build of instance 278b9381-5621-435c-9b8e-0fd6e83e4898 aborted: Failed to provision instance 278b9381-5621-435c-9b8e-0fd6e83e4898: Failed to 
2022-08-05 15:20:16 | 2022-08-05 15:20:01Z [overcloud.ComputeIpListMap]: CREATE_COMPLETE  state changed
2022-08-05 15:20:16 | 2022-08-05 15:20:02Z [overcloud.Controller]: CREATE_FAILED  resources.Controller: Resource CREATE failed: ResourceInError: resources[2].resources.Controller: Went to status ERROR due to "Message: Build of instance 278b9381-5621-435c-9b8e-0fd6e83e4898 aborted: Failed to provision instance 278b9381-5621-435c-9b8e-0f
2022-08-05 15:20:16 | 2022-08-05 15:20:02Z [overcloud]: CREATE_FAILED  Resource CREATE failed: resources.Controller: Resource CREATE failed: ResourceInError: resources[2].resources.Controller: Went to status ERROR due to "Message: Build of instance 278b9381-5621-435c-9b8e-0fd6e83e4898 aborted: Failed to provision instance 27
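When triaging excerpts like the one above, it can help to filter out everything except the failed events. The following is a minimal sketch, not an existing tool: the line layout is only inferred from this excerpt, and `EVENT_RE` and `failed_events` are hypothetical names introduced for illustration.

```python
import re

# Assumed line shape, inferred from the excerpt above (not from a spec):
#   "<log date> <log time> | <event date> <event time>Z [<resource>]: <STATUS>  <reason>"
EVENT_RE = re.compile(
    r"^\S+ \S+ \| (?P<when>\S+ \S+Z) \[(?P<resource>[^\]]+)\]: "
    r"(?P<status>[A-Z_]+)\s+(?P<reason>.*)$"
)


def failed_events(lines):
    """Return (when, resource, status, reason) for every *_FAILED event.

    Lines that do not match the assumed layout are silently skipped.
    """
    failures = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match and match.group("status").endswith("_FAILED"):
            failures.append(
                (
                    match.group("when"),
                    match.group("resource"),
                    match.group("status"),
                    match.group("reason"),
                )
            )
    return failures
```

Feeding it the lines of a deploy log would leave only the CREATE_FAILED / UPDATE_FAILED entries, which makes the first failing resource (here overcloud.Controller.2.Controller) easier to spot.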

New/Transient no bug yet


2022-08-08

New Bugs today:

https://bugs.launchpad.net/tripleo/+bug/1983817 periodic integration all branches RETRY Could not resolve host: mirror.regionone.vexxhost-nodepool-tripleo.rdoproject.org

https://bugzilla.redhat.com/show_bug.cgi?id=2116287 No promotions are occurring downstream due to NODE_FAILURE, since the weekend (7th August)
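For the "Could not resolve host" RETRY bug above, a quick way to confirm whether a given node can resolve the mirror is a plain DNS lookup. This is only a sketch under assumptions: `can_resolve` is a hypothetical helper, and the `resolver` parameter exists purely so the check can be exercised without network access.

```python
import socket

# Hostname copied from the bug title above.
MIRROR_HOST = "mirror.regionone.vexxhost-nodepool-tripleo.rdoproject.org"


def can_resolve(host, resolver=socket.getaddrinfo):
    """Return True if `host` resolves, False on a name-resolution error.

    `resolver` defaults to socket.getaddrinfo; it is injectable only so the
    function can be tested offline (an assumption of this sketch).
    """
    try:
        resolver(host, None)
    except socket.gaierror:
        return False
    return True
```

Running `can_resolve(MIRROR_HOST)` from a node in the affected region should return False while the bug is reproducing; a True result from elsewhere would suggest the problem is local to that nodepool's DNS.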

Gate green

Check Jobs

master chasing master 3acadba0c3986b5a074088073042b411 https://review.rdoproject.org/r/c/testproject/+/44460 https://code.engineering.redhat.com/gerrit/c/testproject/+/423882

looks like a real issue on fs001 and fs035: TASK [os_tempest : Ensure private network exists] fails with 503 Service Unavailable (example logs in testproject) https://logserver.rdoproject.org/60/44460/2/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset035-master/b2eee76/job-output.txt

wallaby c9

Chasing failures at: https://review.rdoproject.org/r/c/testproject/+/36255 f40013fd0fe770cbca616a1b53663936 is missing: internal-kvm (rerun in https://code.engineering.redhat.com/gerrit/c/testproject/+/423882), fs020 and fs039 (two reruns in testproject above so can compare results)

wallaby c8

Hash_under_test=https://trunk.rdoproject.org/api-centos8-wallaby/api/civotes_agg_detail.html?ref_hash=31a61a906cdcd8fccc322d597aa1dd5f All tests passed; the promoter is pushing containers but failing to find images: http://promoter.rdoproject.org/promoter_logs/centos8_wallaby.log-20220807 Pinged Amol and Chandan, please follow up.

train chasing train

  • recheck https://review.rdoproject.org/r/c/testproject/+/44442/4#message-14062031d9b8e0b9bebbd95150f76ce3f09f2a74 (just fs035)
  • chase second hash posted https://review.rdoproject.org/r/c/testproject/+/44462 e04837c814032381ae4e9d817c276a22 - rerunning with fs001 and scenario007

For hash e04837c814032381ae4e9d817c276a22, only fs001 is missing. Comparing two logs: https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train/1f0a584/logs/undercloud/var/log/tempest/stestr_results.html.gz and https://logserver.rdoproject.org/62/44462/1/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train/75ded79/logs/undercloud/var/log/tempest/stestr_results.html.gz. The test failures are inconsistent between the two runs - can skip promote this hash (latest one to run in train)
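The comparison of two runs described above boils down to checking which failing tests overlap. A minimal sketch, assuming the failing test names have already been copied out of each stestr report by hand; `compare_failures` is a hypothetical helper and the names in the usage note are placeholders, not taken from the linked logs.

```python
def compare_failures(run_a, run_b):
    """Split failing test names into consistent and run-specific groups.

    run_a / run_b: iterables of failing test names from two runs of the
    same job. Tests failing in both runs point at a real issue; tests
    failing in only one run suggest flakiness.
    """
    a, b = set(run_a), set(run_b)
    return {
        "both": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }
```

For example, `compare_failures(["test_x", "test_y"], ["test_z"])` (placeholder names) gives an empty "both" list, which is the "inconsistent failures" situation noted for this hash.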

rhos17.1 on rhel9

rhos17 on rhel9

  • promoted yesterday (7th August)

rhos17 on rhel8

rhos16.2

Upstream components

master

wallaby c9

wallaby c8

train

Downstream components

rhos17 on rhel9

  • no blockers

rhos17 on rhel8

rhos16.2

New/Transient no bug yet


2022-08-05

New Bugs today:

https://bugzilla.redhat.com/show_bug.cgi?id=2115778

https://bugs.launchpad.net/tripleo/+bug/1983718 periodic master scen1 standalone fails/timeout 'manage firewall rules'

Component: validations - 12 days out https://review.rdoproject.org/r/c/testproject/+/44413 Real issue: https://logserver.rdoproject.org/13/44413/1/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-component-master-validation/a965b91/job-output.txt Pinged Jiri.

Gate green https://lists.openstack.org/pipermail/openstack-discuss/2022-August/029865.html

Check Jobs

master - blocked new bug https://bugs.launchpad.net/tripleo/+bug/1983718

wallaby c9 - promoted today

wallaby c8

train - chasing https://review.rdoproject.org/r/c/testproject/+/44442 Run missing jobs train 8cf307aefe47066dfa2b89be39b174f8 && https://code.engineering.redhat.com/gerrit/c/testproject/+/423713 Run periodic-tripleo-ci-centos-8-scenario010-kvm-internal-standalone-train

rhos17.1 on rhel9

rhos17 on rhel9

rhos17 on rhel8

rhos16.2

  • No blockers

Upstream components

master

wallaby c9

wallaby c8

train

Downstream components

rhos17 on rhel9

  • network component lagging
    • periodic-tripleo-ci-rhel-9-ovb-3ctlr_1comp-featureset001-internal-network-rhos-17
    • periodic-tripleo-ci-rhel-9-scenario010-standalone-network-rhos-17

rhos17 on rhel8

rhos16.2

2022-08-04

previous ruck|rover shift - notes @ https://hackmd.io/H9CSoXvlTm6nTZ4bsJkeRg
