RHOSP17 on RHEL9

tags: Design

Mini Integration line Status

  • Pipeline
  • Container build
  • Image build
  • Vanilla Standalone

Pipeline

Container build (Green run)

Patches:

* Release file: https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/300678

* 822503: Introduce iptables_package var | https://review.opendev.org/c/zuul/zuul-jobs/+/822503

* Only add temp repos on el8 - https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/302181

Image Build(Green run):

Patches:

* Release file: https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/300678

* 822503: Introduce iptables_package var | https://review.opendev.org/c/zuul/zuul-jobs/+/822503

* 824660: Fixes for centos-9-stream efi behaviour | https://review.opendev.org/c/openstack/diskimage-builder/+/824660

Standalone(Green run):

* Release file: https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/300678

* 822503: Introduce iptables_package var | https://review.opendev.org/c/zuul/zuul-jobs/+/822503

Issues -

Bug#1: https://bugzilla.redhat.com/show_bug.cgi?id=2043678 (openstack-network-scripts and network-script conflict)

Bug#2 https://bugzilla.redhat.com/show_bug.cgi?id=2043922 - Standalone deployment on RHEL9 failing Error: Evaluation Error: Error while evaluating a Virtual Query, Could not autoload puppet/type/pcmk_resource: Could not autoload puppet/provider/pcmk_resource/default: cannot load such file rexml/document

  • Solved with new puppet-headless

Bug#3 https://bugzilla.redhat.com/show_bug.cgi?id=2044776 - RHOSP17 Standalone deployment failing with Error: Evaluation Error: Error while evaluating a Function Call, undefined method `escape' for URI:Module (file: /etc/puppet/modules/nova/manifests/migration/libvirt.pp, line: 201, column: 49) on node standalone

Bug#4 https://bugzilla.redhat.com/show_bug.cgi?id=2048488 - RHOSP17 Standalone deployment failing to start rabbitmq_wait_bundle container, RabbitMQ version cannot run on Erlang 23.0.4: minimum required version is 23.2 (erts 11.1

  • fixed with new erlang
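The version check behind Bug#4 can be reproduced with a small sketch. The helper names below are hypothetical (they are not part of RabbitMQ or any CI tooling); the version numbers come from the error message above.

```python
def parse_version(text):
    """Turn a dotted version such as '23.0.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Return True when `installed` is at least `minimum`.

    Shorter versions are padded with zeros, so '23.2' compares as (23, 2, 0).
    """
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Values from the Bug#4 error: Erlang 23.0.4 against the required 23.2.
print(meets_minimum("23.0.4", "23.2"))
```

A check like this only handles purely numeric dotted versions; RPM release suffixes would need separate handling.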

OVB fs001(Sandeep)

Job def

Issues -

1: The fetch-zuul-cloner-fork role fails because virtualenv is not found on the node

https://sf.hosted.upshift.rdu2.redhat.com/logs/69/305869/2/check/periodic-tripleo-ci-rhel-9-ovb-3ctlr_1comp-featureset001-internal-rhos-17/5b28867/job-output.txt

2022-01-22 03:53:15.085655 | TASK [fetch-zuul-cloner-fork : Install zuul-cloner shim dependencies]
2022-01-22 03:53:15.889468 | primary | ERROR
2022-01-22 03:53:15.889907 | primary | {
2022-01-22 03:53:15.889957 | primary |   "msg": "Failed to find required executable virtualenv in paths: /sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin"
2022-01-22 03:53:15.889988 | primary | }

Fix: https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/305898
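To confirm locally whether a node would hit the same failure, the lookup can be approximated as below. This is a rough sketch, not the Ansible implementation: `find_executable` and `missing_executables` are hypothetical helpers, and the search path is copied from the error message above.

```python
import shutil

# Search path copied from the error message in the job output.
SEARCH_PATH = "/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin"


def find_executable(name, search_path=SEARCH_PATH):
    """Return the full path of `name` if found on `search_path`, else None."""
    return shutil.which(name, path=search_path)


def missing_executables(names, search_path=SEARCH_PATH):
    """Return the subset of `names` that cannot be found on `search_path`."""
    return [n for n in names if find_executable(n, search_path) is None]


# An empty list means virtualenv is available on the given path.
print(missing_executables(["virtualenv"]))
```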

Full Integration pipeline Status

  • sc001
  • sc002
  • sc004
  • sc007
  • sc010
  • sc012
  • sc full-tempest-api
  • sc full-tempest-scenario
  • Container-multinode
  • Multinode IPA
  • OVB fs001
  • OVB fs020
  • OVB fs035
  • Baremetal fs001
  • Jenkins trigger job
  • Integration line criteria file
  • Promoter for RHOSP17 RHEL9
  • First promotion to current-tripleo
  • ruck_rover.py support for rhosp17 on rhel9
  • Dashboard for rhel-9 17 line vs wallaby C9

Ongoing issues

<jboyer[m]> we're doing monthly updates of the RHEL 9 Beta channels
<jboyer[m]> another update will come in April

Standalone SC01(Sandeep):

Periodic: https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/308483

Standalone SC02(bhagyashri):

https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/309437/3

Standalone SC04(bhagyashri):

https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/309439/3

Standalone SC07(Pooja):

https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/308664

Bug #1: Standalone deployment on sc07 fails to start the rabbitmq_wait_bundle container; rabbitmq crashes with error:{badmatch,{error,{{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}} - https://bugzilla.redhat.com/2050624

Fix: https://code.engineering.redhat.com/gerrit/c/openstack/sf-config/+/309499 and https://code.engineering.redhat.com/gerrit/c/openstack/sf-config/+/309607

Bug #2: Hitting the same bug we hit on wallaby: https://bugs.launchpad.net/tripleo/+bug/1959582

Awaiting new tripleo component

Standalone SC010(Pooja/Sandeep):

Standalone SC012(Pooja/Sandeep)

https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/316278: python3-virtualbmc dependency issues, resolved by https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/317391

sc012 was also not being triggered on rhel-9: https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/318369. The rhel-8 criteria need updating as well.

Standalone full-tempest-api(bhagyashri):

https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/309447

Standalone full-tempest-scenario(Pooja):

https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/309635

Container-multinode(Pooja/Sandeep)

https://sf.hosted.upshift.rdu2.redhat.com/logs/95/311995/1/check/periodic-tripleo-ci-rhel-9-containers-multinode-rhos-17/d15a216/logs/undercloud/home/zuul/undercloud_install.log

2022-02-17 06:09:20.623820 | fa163e41-308d-b41c-929e-00000000075a |      FATAL | Ensure system is NTP time synced | undercloud | error={"changed": true, "cmd": ["chronyc", "waitsync", "30"], "delta": "0:04:50.219164", "end": "2022-02-17 06:09:20.594276", "msg": "non-zero return code", "rc": 1, "start": "2022-02-17 06:04:30.375112", "stderr": "", "stderr_lines": [], "stdout": "try: 1, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 2, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 3, refid: 00000000, correction:

Testing with fix: https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/312419
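The `chronyc waitsync` stdout in the failure above can be inspected with a short parser. This is a sketch under assumptions: the function names are hypothetical, and it treats a refid of 00000000 on every attempt as "chronyd never selected a time source", which matches how the log reads but should be checked against the chrony documentation.

```python
import re

# One line of `chronyc waitsync` output, as seen in the log above.
_LINE = re.compile(
    r"try:\s*(?P<attempt>\d+),\s*refid:\s*(?P<refid>[0-9A-Fa-f]{8}),"
    r"\s*correction:\s*(?P<correction>-?[\d.]+),\s*skew:\s*(?P<skew>-?[\d.]+)"
)


def parse_waitsync(stdout):
    """Parse waitsync output into a list of per-attempt dicts."""
    attempts = []
    for match in _LINE.finditer(stdout):
        attempts.append({
            "attempt": int(match["attempt"]),
            "refid": match["refid"].upper(),
            "correction": float(match["correction"]),
            "skew": float(match["skew"]),
        })
    return attempts


def never_selected_source(stdout):
    """True when at least one attempt was logged and all have refid 00000000."""
    attempts = parse_waitsync(stdout)
    return bool(attempts) and all(a["refid"] == "00000000" for a in attempts)


sample = (
    "try: 1, refid: 00000000, correction: 0.000000000, skew: 0.000\n"
    "try: 2, refid: 00000000, correction: 0.000000000, skew: 0.000\n"
)
print(never_selected_source(sample))
```

If this returns True for a job, the problem is more likely the NTP server configuration or reachability than clock drift.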

Multinode IPA(Pooja/Sandeep)

OVB fs020

OVB fs035

Baremetal fs001(Sandeep)

Total 5 environments, split across the integration lines: 16.2 (rhel8), 17 (rhel8), 17 (rhel-9), 17 (rhel-9 with uefi), and 1 for c9

Jenkins trigger job (Sandeep)

https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/318572 and https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/318574

Create Integration line criteria file for RHOSP17 RHEL9 (Sandeep)

308787: RHEL9 RHOSP17 Integration line criteria file | https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/308787

Set promoter for RHOSP17 RHEL9

https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/39012 Enable the rhos17 promotion on rhel-9

Dashboard for rhel-9 17 line vs wallaby C9.

Move 17 jobs in a separate file (standalone)

Tech debt

1 trunk-candidate

https://issues.redhat.com/browse/CLOUDBLD-8692

2 Multi-node-bridge-ovs.repo to install openvswitch.
  • We will be testing openvswitch from nfvsig, not from fdp builds

https://sf.hosted.upshift.rdu2.redhat.com/logs/77/305077/5/check/periodic-tripleo-ci-rhel-9-standalone-rhos-17/2f94e53/job-output.txt

2022-01-19 12:47:03.482550 | TASK [multi-node-bridge : Install openvswitch]
2022-01-19 12:47:06.573718 | primary | ERROR
2022-01-19 12:47:06.574145 | primary | {
2022-01-19 12:47:06.574192 | primary |   "failures": [],
2022-01-19 12:47:06.574245 | primary |   "msg": "Depsolve Error occured: \n Problem: package rdo-openvswitch-1:2.15-2.el9s.noarch requires network-scripts-openvswitch2.15, but none of the providers can be installed\n  - package network-scripts-openvswitch2.15-2.15.0-35.el9s.x86_64 requires network-scripts, but none of the providers can be installed\n  - package network-scripts-openvswitch2.15-2.15.0-51.el9s.x86_64 requires network-scripts, but none of the providers can be installed\n  - package network-scripts-openvswitch2.15-2.15.0-56.el9s.x86_64 requires network-scripts, but none of the providers can be installed\n  - cannot install the best candidate for the job\n  - nothing provides initscripts(x86-64) >= 10.11.1 needed by openstack-network-scripts-10.11.1-1.el9s.x86_64",
2022-01-19 12:47:06.574280 | primary |   "rc": 1,
2022-01-19 12:47:06.574307 | primary |   "results": []
2022-01-19 12:47:06.574333 | primary | }

Workaround: Add rhos_release_args: "17-nightly ceph-5.1 -r 9.0" in var

https://bugzilla.redhat.com/show_bug.cgi?id=2060027#c1
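The root cause in a long depsolve message is usually the "nothing provides" line. A small sketch for pulling those out of a job log follows; the function name is hypothetical and the pattern is based only on the wording of the message shown above.

```python
import re

# Matches "nothing provides <capability> needed by <package>" as printed
# in the depsolve error above. Quotes and commas end the package name so
# the pattern also works on the raw JSON-style log line.
_NOTHING_PROVIDES = re.compile(
    r"nothing provides (?P<capability>.+?) needed by (?P<package>[^\s\",]+)"
)


def unmet_requirements(message):
    """Return (capability, package) pairs for every unmet requirement."""
    return [
        (m["capability"], m["package"])
        for m in _NOTHING_PROVIDES.finditer(message)
    ]


msg = (
    "  - cannot install the best candidate for the job\n"
    "  - nothing provides initscripts(x86-64) >= 10.11.1 needed by "
    "openstack-network-scripts-10.11.1-1.el9s.x86_64"
)
for capability, package in unmet_requirements(msg):
    print(f"{package} needs {capability}")
```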

3 Need FTBFS monitoring for rhel9 osp17 - jjyoce is working on it.
4 Move RHEL8 release file to have the same hierarchy as RHEL9 or try suggestion from https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/305077/15#message-a3173baa6971f15d935f4e75e039f0440d279b87
5 Disable osp-trunk-deps.

https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/314091

6 Test if osp-trunk-deps get enabled during rpm build in build-test-package role.
6 Remove selinux workaround.

https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/318504

7 Reduce 16.2 line run from twice to once.

8 Move rhel9 integration jobs to separate file.
9 Move rhel9 jobs parent from rhel8 jobs