# RHOSP17 on RHEL9
###### tags: `Design`
:::info
Important links for engineers working downstream:
* [Dashboard](http://tripleo-cockpit.usersys.redhat.com/d/KyHCwLHMk/rhos-16-2-full-component-pipeline)
* [Zuul](https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/status)
* Important Git repos:-
* [tripleo-ci-config](http://git.app.eng.bos.redhat.com/git/openstack/tripleo-ci-internal-config.git)
* [tripleo-ci-internal-jobs](http://git.app.eng.bos.redhat.com/git/openstack/tripleo-ci-internal-jobs.git)
* [tripleo-environment](http://git.app.eng.bos.redhat.com/git/tripleo-environments.git/)
* [Gerrit](https://code.engineering.redhat.com/gerrit/)
* [Downstream Container Registry (project: tripleorhos-17-rhel-9)](https://registry-console-default.cloud.registry.upshift.redhat.com/registry)
* [DLRN Repos](http://osp-trunk.hosted.upshift.rdu2.redhat.com/rhel9-osp17/)
* [Codesearch](https://sf.hosted.upshift.rdu2.redhat.com/codesearch/?)
* [promotion-trunk](https://osp-trunk.hosted.upshift.rdu2.redhat.com/rhel9-osp17/)
:::
:::info
# Mini Integration line Status
- [x] Pipeline
- [x] Container build
- [x] Image build
- [x] Vanilla Standalone
:::
::: danger
## Pipeline
* Pipeline def
* ~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/305840~~
* Add pipeline
* ~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/306035~~
## ~~Container build (Green run)~~
### Patches:
* ~~testproject:~~ ~~https://code.engineering.redhat.com/gerrit/c/testproject/+/299390~~
* ~~Add exclude for network-scripts in pre for bug https://bugzilla.redhat.com/show_bug.cgi?id=2043678~~
* Need to move away from using nightly.
* ~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/307838~~
* ~~https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/307840~~
* ~~307913: Define downstream_cert_install_command release file | https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/307913~~
* ~~Job def: https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/299521~~
* ~~Release file: https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/300678~~
* ~~822503: Introduce iptables_package var | https://review.opendev.org/c/zuul/zuul-jobs/+/822503~~
* ~~Only add temp repos on el8 - https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/302181~~
## ~~Image Build(Green run)~~:
### Patches:
* ~~Main patch with job def: https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/302630~~
* ~~307913: Define downstream_cert_install_command release file | https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/307913~~
* ~~Add exclude for network-scripts in pre for bug https://bugzilla.redhat.com/show_bug.cgi?id=2043678~~
* ~~823365: Preliminary work to support CentOS 9 Stream | https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/823365~~
* ~~Release file: https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/300678~~
* ~~822503: Introduce iptables_package var | https://review.opendev.org/c/zuul/zuul-jobs/+/822503~~
* ~~824598: Add configs for RHEL 9 image | https://review.opendev.org/c/openstack/tripleo-common/+/824598 (Merged)~~
* ~~Above cherry-pick: https://review.opendev.org/c/openstack/tripleo-common/+/825227~~
* ~~824660: Fixes for centos-9-stream efi behaviour | https://review.opendev.org/c/openstack/diskimage-builder/+/824660~~
* ~~openstack/tripleo-puppet-elements master: Add config for RHEL9 in overcloud-base element https://review.opendev.org/c/openstack/tripleo-puppet-elements/+/825259 (Merged)~~
* ~~cherry-pick https://review.opendev.org/c/openstack/tripleo-puppet-elements/+/825850~~
* ~~openstack/tripleo-puppet-elements master: Add config for EL9 in overcloud-agent element https://review.opendev.org/c/openstack/tripleo-puppet-elements/+/825263 (Merged)~~
* ~~https://review.opendev.org/c/openstack/tripleo-puppet-elements/+/825748 (cherry-pick)~~
* ~~825696: delorean-repo remove yum_plugin_priorities_package | https://review.opendev.org/c/openstack/tripleo-image-elements/+/825696 Merged~~
* ~~https://review.opendev.org/c/openstack/tripleo-image-elements/+/825749 cherry-pick~~
:::danger
### Bugs:
~~https://bugs.launchpad.net/tripleo/+bug/1957856 - Seems to be solved by https://review.opendev.org/c/openstack/diskimage-builder/+/824660~~
:::
## ~~Standalone(Green run):~~
* Add in pipeline: ~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/308443~~
* ~~Job def: https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/305077~~
* ~~Release file: https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/300678~~
* ~~822503: Introduce iptables_package var | https://review.opendev.org/c/zuul/zuul-jobs/+/822503~~
:::danger
### Issues -
### ~~Bug#1: https://bugzilla.redhat.com/show_bug.cgi?id=2043678 (openstack-network-scripts and network-script conflict)~~
* fix: ~~https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/305867~~
* ~~Also need to add same workaround in container and image build pre.~~
### ~~Bug#2 https://bugzilla.redhat.com/show_bug.cgi?id=2043922 - Standalone deployment on RHEL9 failing Error: Evaluation Error: Error while evaluating a Virtual Query, Could not autoload puppet/type/pcmk_resource: Could not autoload puppet/provider/pcmk_resource/default: cannot load such file -- rexml/document~~
* ~~Solved with new puppet-headless~~
### ~~Bug#3 https://bugzilla.redhat.com/show_bug.cgi?id=2044776 - RHOSP17 Standalone deployment failing with Error: Evaluation Error: Error while evaluating a Function Call, undefined method `escape' for URI:Module (file: /etc/puppet/modules/nova/manifests/migration/libvirt.pp, line: 201, column: 49) on node standalone~~
* ~~Need new puppet-stdlib~~
* ~~https://code.engineering.redhat.com/gerrit/c/puppet-stdlib/+/306409 - Add .gitreview~~
* ~~https://code.engineering.redhat.com/gerrit/c/puppet-stdlib/+/306411 -[DOWNSTREAM ONLY] Replacing URI.escape with URI-DEFAULT_PARSER~~
### ~~Bug#4 https://bugzilla.redhat.com/show_bug.cgi?id=2048488 - RHOSP17 Standalone deployment failing to start rabbitmq_wait_bundle container, RabbitMQ version cannot run on Erlang 23.0.4: minimum required version is 23.2 (erts 11.1~~
* ~~fixed with new erlang~~
:::
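
Bug#4 above comes down to a version floor: the error reports Erlang 23.0.4 against a required minimum of 23.2. A minimal sketch of that comparison, assuming plain dotted numeric version strings. `parse_version` and `meets_minimum` are hypothetical helper names for illustration; they are not part of RabbitMQ or of any CI role.

```python
def parse_version(text):
    """Turn a dotted version string such as '23.0.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Return True when `installed` is at least `minimum`.

    The shorter version is padded with zeros, so '23.2' compares as 23.2.0.
    """
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    # Values taken from the Bug#4 error message.
    print(meets_minimum("23.0.4", "23.2"))
```

A check like this could run in a pre-deploy step to flag a container image whose Erlang is too old before the bundle tries to start.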
## ~~OVB fs001(Sandeep)~~
Job def
* ~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/305869~~
:::danger
### Issues -
##### 1 : ~~Fetch-zuul-cloner-fork causing issue~~
~~https://sf.hosted.upshift.rdu2.redhat.com/logs/69/305869/2/check/periodic-tripleo-ci-rhel-9-ovb-3ctlr_1comp-featureset001-internal-rhos-17/5b28867/job-output.txt~~
~~~
2022-01-22 03:53:15.085655 | TASK [fetch-zuul-cloner-fork : Install zuul-cloner shim dependencies]
2022-01-22 03:53:15.889468 | primary | ERROR
2022-01-22 03:53:15.889907 | primary | {
2022-01-22 03:53:15.889957 | primary | "msg": "Failed to find required executable virtualenv in paths: /sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin"
2022-01-22 03:53:15.889988 | primary | }
~~~
Fix: ~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/305898~~
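
The failure above is a missing executable on the node's search path. A small sketch of the same lookup using the standard library's `shutil.which`, with the search path copied from the error message. `missing_executables` is a hypothetical helper name, not something the `fetch-zuul-cloner-fork` role provides.

```python
import shutil


def missing_executables(names, search_path=None):
    """Return the names that cannot be resolved to an executable.

    `search_path` is an os.pathsep-separated string; None means use $PATH.
    """
    return [name for name in names if shutil.which(name, path=search_path) is None]


if __name__ == "__main__":
    # Search path copied from the error message above.
    job_path = "/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin"
    print(missing_executables(["virtualenv"], search_path=job_path))
```

Running this on a freshly provisioned node shows whether `virtualenv` needs to be installed before the role runs.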
* "No package epel-release available."
~~fix: https://review.rdoproject.org/r/c/config/+/39141~~
* ~~[Node provisioning failure](https://bugzilla.redhat.com/show_bug.cgi?id=2053527) - Fixed~~
* [Overcloud deploy failing with "Error: /Stage[main]/Pacemaker::Corosync/User[hacluster]/password: change from [redacted] to [redacted] failed: chpasswd said chpasswd: cannot execute /usr/sbin/sss_cache: Permission denied"](https://bugzilla.redhat.com/show_bug.cgi?id=2057261) - We have a workaround: selinux-policy in the override repo
* ~~Issue - build_image.sh taking wrong release:~~
* ~~312889: Add dib_release in release file. | https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/312889~~
:::
:::info
# Full Integration pipeline Status
- [x] sc001
- [x] sc002
- [x] sc004
- [x] sc007
- [x] sc010
- [x] sc012
- [x] sc full-tempest-api
- [x] sc full-tempest-scenario
- [x] Container-multinode
- [x] Multinode IPA
- [x] OVB fs001
- [x] OVB fs020
- [x] OVB fs035
- [ ] Baremetal fs001
- [x] Jenkins trigger job
- [x] Integration line criteria file
- [x] Promoter for RHOSP17 RHEL9
- [x] First promotion to current-tripleo
- [x] ruck_rover.py support for rhosp17 on rhel9
- [x] Dashboard for rhel-9 17 line vs wallaby C9
:::
::: danger
#### Ongoing issues
* Podman container network issue: http://pastebin.test.redhat.com/1034041
~~~
<jboyer[m]> we're doing monthly updates of the RHEL 9 Beta channels
<jboyer[m]> another update will come in April
~~~
* ~~https://bugzilla.redhat.com/show_bug.cgi?id=2060047 - RHOSP17 overcloud-full image build failing with Error: "Problem: The operation would result in removing the following protected packages: grub2-pc"~~
* ~~https://bugzilla.redhat.com/show_bug.cgi?id=2060027 - RHOSP17 on RHEL-9 jobs failure with Error: nothing provides chkconfig and bc needed by openstack-network-scripts-10.11-1.el9osttrunk.x86_64~~
* ~~Looks like auto-fixed - this morning~~.
:::
## ~~Standalone SC01(Sandeep):~~
~~Periodic: https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/308483~~
## ~~Standalone SC02(bhagyashri):~~
~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/309437/3~~
## ~~Standalone SC04(bhagyashri):~~
~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/309439/3~~
## ~~Standalone SC07(Pooja):~~
~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/308664~~
::: danger
Bug #1:-
~~Standalone deployment on sc07 failing to start rabbitmq_wait_bundle container, rabbitmq crash with error:{badmatch,{error,{{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}} - https://bugzilla.redhat.com/2050624~~
Fix:
~~https://code.engineering.redhat.com/gerrit/c/openstack/sf-config/+/309499~~
~~https://code.engineering.redhat.com/gerrit/c/openstack/sf-config/+/309607~~
~~Bug #2:- Hitting same bug which we hit for wallaby: https://bugs.launchpad.net/tripleo/+bug/1959582~~
~~Awaiting new tripleo component~~
:::
## Standalone SC010(Pooja/Sandeep):
* ~~https://review.opendev.org/c/openstack/octavia/+/829539 (needs cherry-pick)
octavia-tempest-plugin needs promotion through to downstream.~~
* ~~https://review.opendev.org/c/openstack/octavia/+/830205~~
* ~~https://review.opendev.org/c/openstack/octavia/+/830206~~
## Standalone SC012(Pooja/Sandeep)
~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/316278~~
python3-virtualbmc dependency issues, resolved by the patch below:
~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/317391~~
sc012 was also not being triggered on rhel-9:
~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/318369~~
We need to update rhel-8 criteria as well.
## ~~Standalone full-tempest-api(bhagyashri):~~
~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/309447~~
## ~~Standalone full-tempest-scenario(Pooja)~~:
~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/309635~~
## ~~Container-multinode(Pooja/Sandeep)~~
* ~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/311995~~
::: danger
~~https://sf.hosted.upshift.rdu2.redhat.com/logs/95/311995/1/check/periodic-tripleo-ci-rhel-9-containers-multinode-rhos-17/d15a216/logs/undercloud/home/zuul/undercloud_install.log~~
~~~
2022-02-17 06:09:20.623820 | fa163e41-308d-b41c-929e-00000000075a | FATAL | Ensure system is NTP time synced | undercloud | error={"changed": true, "cmd": ["chronyc", "waitsync", "30"], "delta": "0:04:50.219164", "end": "2022-02-17 06:09:20.594276", "msg": "non-zero return code", "rc": 1, "start": "2022-02-17 06:04:30.375112", "stderr": "", "stderr_lines": [], "stdout": "try: 1, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 2, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 3, refid: 00000000, correction:
~~~
~~Testing with fix: https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/312419~~
:::
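
In the log above every `chronyc waitsync` attempt reports `refid: 00000000`, which suggests chronyd never selected a time source before the task gave up. A minimal sketch for spotting that pattern in captured stdout, assuming the line shape shown in the log. `parse_waitsync` and `never_selected_source` are hypothetical helper names, not part of chrony or tripleo.

```python
import re

# One progress line printed by `chronyc waitsync`, in the shape seen in the log above.
TRY_LINE = re.compile(
    r"try:\s*(?P<attempt>\d+),\s*refid:\s*(?P<refid>[0-9A-Fa-f]+),"
    r"\s*correction:\s*(?P<correction>-?[\d.]+),\s*skew:\s*(?P<skew>-?[\d.]+)"
)


def parse_waitsync(stdout):
    """Return a list of (attempt, refid) pairs found in waitsync output."""
    return [(int(m["attempt"]), m["refid"]) for m in TRY_LINE.finditer(stdout)]


def never_selected_source(stdout):
    """True when at least one attempt was parsed and all report an all-zero refid."""
    attempts = parse_waitsync(stdout)
    return bool(attempts) and all(set(refid) == {"0"} for _, refid in attempts)


if __name__ == "__main__":
    sample = (
        "try: 1, refid: 00000000, correction: 0.000000000, skew: 0.000\n"
        "try: 2, refid: 00000000, correction: 0.000000000, skew: 0.000\n"
    )
    print(never_selected_source(sample))
```

Truncated lines, like the last one in the log excerpt, simply fail to match and are skipped.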
## ~~Multinode IPA(Pooja/Sandeep)~~
* ~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/311972~~
* ~~https://code.engineering.redhat.com/gerrit/c/openstack/sf-config/+/312872~~
* ~~RHEL-8 multinode fix:~~ ~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/314086~~
## OVB fs020
## OVB fs035
## Baremetal fs001(Sandeep)
* https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/310326
* Need to discuss which machine we can move here
Total 5 envs, split across the integration lines: 16.2 (rhel8), 17 (rhel8), 17 (rhel-9), 17 (rhel-9 with uefi), and 1 for c9
* Remove baremetal jobs from 16.2 component line
* ~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/310303~~
## ~~Jenkins trigger job (Sandeep)~~
~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/318572~~ and ~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/318574~~
## ~~Create Integration line criteria file for RHOSP17 RHEL9 (Sandeep)~~
~~308787: RHEL9 RHOSP17 Integration line criteria file | https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/308787~~
## ~~Set promoter for RHOSP17 RHEL9~~
~~https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/39012 Enable the rhos17 promotion on rhel-9~~
## Dashboard for rhel-9 17 line vs wallaby C9.
## Move 17 jobs in a separate file (standalone)
:::info
# Tech debt
##### 1 ~~trunk-candidate~~
~~https://issues.redhat.com/browse/CLOUDBLD-8692~~
##### 2 Multi-node-bridge-ovs.repo to install openvswitch.
- We will be testing openvswitch from nfvsig, not from fdp builds
https://sf.hosted.upshift.rdu2.redhat.com/logs/77/305077/5/check/periodic-tripleo-ci-rhel-9-standalone-rhos-17/2f94e53/job-output.txt
~~~
2022-01-19 12:47:03.482550 | TASK [multi-node-bridge : Install openvswitch]
2022-01-19 12:47:06.573718 | primary | ERROR
2022-01-19 12:47:06.574145 | primary | {
2022-01-19 12:47:06.574192 | primary | "failures": [],
2022-01-19 12:47:06.574245 | primary | "msg": "Depsolve Error occured: \n Problem: package rdo-openvswitch-1:2.15-2.el9s.noarch requires network-scripts-openvswitch2.15, but none of the providers can be installed\n - package network-scripts-openvswitch2.15-2.15.0-35.el9s.x86_64 requires network-scripts, but none of the providers can be installed\n - package network-scripts-openvswitch2.15-2.15.0-51.el9s.x86_64 requires network-scripts, but none of the providers can be installed\n - package network-scripts-openvswitch2.15-2.15.0-56.el9s.x86_64 requires network-scripts, but none of the providers can be installed\n - cannot install the best candidate for the job\n - nothing provides initscripts(x86-64) >= 10.11.1 needed by openstack-network-scripts-10.11.1-1.el9s.x86_64",
2022-01-19 12:47:06.574280 | primary | "rc": 1,
2022-01-19 12:47:06.574307 | primary | "results": []
2022-01-19 12:47:06.574333 | primary | }
~~~
Workaround: Add `rhos_release_args: "17-nightly ceph-5.1 -r 9.0"` in var
https://bugzilla.redhat.com/show_bug.cgi?id=2060027#c1
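
The depsolve message above ends with a "nothing provides ... needed by ..." clause, which names the requirement that is actually missing. A small sketch for pulling those pairs out of a job log, assuming the clause wording seen in the log. `missing_requirements` is a hypothetical helper name, not a dnf or Zuul API.

```python
import re

# "nothing provides <requirement> needed by <package>" as it appears in the dnf message above.
NOTHING_PROVIDES = re.compile(
    r'nothing provides (?P<req>.+?) needed by (?P<pkg>[^\s,"]+)'
)


def missing_requirements(message):
    """Return (requirement, package) pairs for every unsatisfied dependency."""
    return [(m["req"], m["pkg"]) for m in NOTHING_PROVIDES.finditer(message)]


if __name__ == "__main__":
    # Final clause of the depsolve error quoted above.
    msg = (
        " - nothing provides initscripts(x86-64) >= 10.11.1 needed by "
        'openstack-network-scripts-10.11.1-1.el9s.x86_64",'
    )
    for requirement, package in missing_requirements(msg):
        print(f"{package} is blocked on: {requirement}")
```

Grouping the output by requirement across several job logs makes it easier to see whether one missing package explains multiple failures.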
##### 3 ~~Need FTBFS monitoring for rhel9 osp17 - jjyoce is working on it.~~
##### 4 Move RHEL8 release file to have the same hierarchy as RHEL9 or try suggestion from https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/305077/15#message-a3173baa6971f15d935f4e75e039f0440d279b87
##### 5 ~~Disable osp-trunk-deps.~~
~~https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/314091~~
##### 6 Test if osp-trunk-deps get enabled during rpm build in build-test-package role.
##### ~~6 Remove selinux workaround.~~
~~https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/318504~~
##### ~~7 Reduce 16.2 line run from twice to once.~~
##### 8 Move rhel9 integration jobs to separate file.
##### 9 Move rhel9 jobs parent from rhel8 jobs
:::