# Dependency Pipeline Design and Planning
###### tags: `Design`
This document describes how the various dependency pipelines are being designed.
[TOC]
## Background Info
:::spoiler Third Party Dependencies and the Dependency Pipelines (click to expand/collapse)
### What are Dependencies?
Dependencies are any rpms installed as part of an RDO/OSP deployment that are not built by DLRN. Examples of dependencies are:
* openvswitch
* container-tools module
* ubi-8 container image (not rpm-based)
* CentOS/RHEL (OS-related)
* ansible
### What are the Dependency Pipelines?
A dependency pipeline provides a way to test upcoming changes to third-party rpms installed with RDO/OSP product code, while the production CI keeps running on known stable dependency versions so that product development keeps moving forward.
A failure in the dependency pipelines gives infra/DFG teams advance warning of issues that will arrive when the dependency version under test is included in the integration lines.
The infra/DFG teams can choose to alert the third party dependency provider and/or patch RDO/OSP to ensure it is compatible with the upcoming dependency change.
### Motivation
Traditionally, product teams discovered that a dependency had been updated with an incompatible change only when the dependency update was released and incorporated into the production CI system, impacting all jobs.
This incompatible change would cause all the integration lines and check jobs to fail, immediately blocking all product development until the incompatibility was resolved.
The dependency pipelines aim to provide a "canary in the coal mine", allowing teams to take preventative action to address upcoming dependency changes. At a minimum, teams can pin dependency versions to keep tests running while the incompatibilities are resolved by third-party groups.
### Glossary
| Term | Description |
| ------ | -------- |
| Dependency | rpms/modules/container images that are installed as part of an RDO/OSP deployment and are not built by DLRN |
| Dependency Pipeline | A new pipeline that validates upcoming changes to dependencies against stable product code |
| Dependency pipeline jobs | A set of CI jobs that validate an upcoming change to a dependency that will be integrated into RDO/OSP in the future |
:::
### Goal/Non-Goals statement
* To give product teams a heads up on upcoming changes so that they can take preventative action and/or plan accordingly (early detection of issues introduced by dependencies).
* De-risk RHEL9 and related content integration with OSP.
* Provide feedback to OSP leadership associated with https://trello.com/b/UxQnPO2Y/rhel9-openstack-changes (OSP on rhel9 project planning/monitoring)
### Overview
Comparing the old workflow to the dependency pipeline workflow:
#### old workflow:
```graphviz
digraph hierarchy {
nodesep=1.0
node [color=Green,fontname=Courier,shape=box]
edge [color=Blue, style=dashed]
{dependency_v1 [color=Black]} ->{component integration check}
}
```
***THEN DEPENDENCY UPDATES from v1 to v2...***
```graphviz
digraph hierarchy {
nodesep=1.0
node [color=Red,fontname=Courier,shape=box]
edge [color=Blue, style=dashed]
{dependency_v2 [color=Black]}->{component_FAIL integration_FAIL check_FAIL}
}
```
#### dependency pipeline workflow (allows for continued product development):
```graphviz
digraph hierarchy {
nodesep=1.0
node [color=Red,fontname=Courier,shape=box]
edge [color=Blue, style=dashed]
{dependency_v2 [color=Black]}->{ "dependency_pipeline" }->{"dependency_pipeline/test1"}->{"pin dependency to v1" "report failure in LP or BZ" "add compatibility fix"} ->{"integrate fix" [color=black,fontcolor=black,shape=box3d]}->{"unpin dependency" [color=green]}
nodesep=1.0
node [color=Green,fontname=Courier,shape=box]
edge [color=Blue, style=dashed]
{dependency_v1 [color=Black]}->{component integration check}
}
```
```graphviz
digraph hierarchy {
nodesep=1.0
node [color=Green,fontname=Courier,shape=box]
edge [color=Blue, style=dashed]
{dependency_v3 [color=Black]}->{ "dependency_pipeline" }->{"dependency_pipeline/test1"}
nodesep=1.0
node [color=Green,fontname=Courier,shape=box]
edge [color=Blue, style=dashed]
{dependency_v2 [color=Black]}->{component integration check}
}
```
### Stakeholders
* Ronelle, Phil, Wes, Kevin Carter, Alan Pevec, Mike Burns, Chris Jones
### Milestones
* "design review"
* 8.3 - 8.4 dep pipelines
* centos/rhel 8 pipelines
* Use centos/rhel 8 as a proof of concept for rhel 9
* CI
* procedures re: bugs, cix,
* holding failing rpms back from production builds
* CIX visibility to dependencies being "pinned"
* The pinning of failing deps needs to be an automated process
* 9.x design
* OSP17 components build on 8.4 passing Integration Gate prior to introducing 9.0
* RHEL 9
* new os
* work on packstack / puppet deployment
* standalone rhel9
* https://trello.com/b/UxQnPO2Y/rhel9-openstack-changes
### Organizational dependencies
* dependency job ownership
* DFGs, including delivery, when the issue is NOT OSP content (Phil testing this assertion, true?)
### Decision points
* what pipelines to build
* **centos stream (upstream)**
* test rpms that are included in the upcoming OS version.
* **ansible (tricky no rpm built for 2.10 ->2.11, 2.10 available upstream)**
* test the latest upcoming ansible rpm. In the case of ansible 2.10, there is no plan to create a downstream rpm, but upstream developers want to prepare for the expected changes. This is a case where we need both an upstream and a downstream line.
* **container tools modules (upstream/downstream/both)?**
* test buildah, podman and related rpms. Another case where we need parallel upstream/downstream lines. Changes to the rpms may hit downstream first but the change to make the product compatible will hit upstream first
* **openvswitch/ovn**
* test openvswitch/ovn - note that this dependency is installed in the zuul playbooks for multinode jobs - which can cause issues when a version is installed that is later than the one the product expects
* **pacemaker**
* (not yet designed) test HA-related rpms (parallel upstream/downstream lines are required)
* **adv-virt**
* test virtualization module rpms. Is the suggestion to start with downstream here first?
* in what order
* who owns the debug/CIX cards
* what corrective action can be taken?
* sometimes the fix is on the product side
* often third party reliant
* do we have a point of contact on each of these dependencies? Alan Pevec is leading the OSP on RHEL9 project.
* where does the pipeline run
* (OS being downstream first and product being upstream first)
* duplicate upstream/downstream pipelines
* psi public vs. vexxhost
* diff scripts running within jobs
## Team sync meeting points
* timelines
* basic status
* share dependencies
* steps required before next sync call
## Design and steps
### POC setup
(initial design) https://hackmd.io/I_CFKPXHTza-5i2kFDIs5Q#Dependency-Pipeline
* Container tools:
https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/builds?pipeline=openstack-dependencies-container_tools
* Latest ansible
https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/builds?pipeline=openstack-dependencies-latest_ansible
* CentOS stream
https://review.rdoproject.org/zuul/builds?pipeline=openstack-dependencies-centos8stream
* Pipelines consist of:
* one standalone - no container update
* one standalone with container build
---- from https://hackmd.io/14KFQiyWSBCRsfJmBZNARw#Dependency-change-workflow ----
### Initial design plan
* baseos 8.2 -> 8.3 -> 8.4
* rhel-u-next (8.4) https://code.engineering.redhat.com/gerrit/#/c/222741/
* ansible
* http://download-node-02.eng.bos.redhat.com/rhel-8/nightly/ANSIBLE/latest-ANSIBLE-2.9-RHEL-8/compose/Base/x86_64/os/
* libpod,podman,buildah
* latest 8.2. compose -> app-stream
* module:
* container-tools:8.2 --> 8.3
* both are contained in the repo below, need adjust the config.
* http://download-node-02.eng.bos.redhat.com/rhel-8/rel-eng/RHEL-8/latest-RHEL-8.3/compose/AppStream/x86_64/os/
* adv-virt
* http://download-node-02.eng.bos.redhat.com/rhel-8/nightly/ADVANCED-VIRT-8/latest-ADVANCED-VIRT-8.1.1-RHEL-8/compose/Advanced-virt/x86_64/os/
* pacemaker
* http://download-node-02.eng.bos.redhat.com/rhel-8/rel-eng/RHEL-8/latest-RHEL-8.2/compose/HighAvailability/x86_64/os/
* openvswitch / ovn
* http://download-node-02.eng.bos.redhat.com/rhel-8/nightly/FDP/latest-FDP-8-RHEL-8/compose/Server/x86_64/os/
* the dependency-repos
* puppet?
#### TASKS - remove any 3rd party dependency jobs
* Once container-tools jobs are running, we no longer need buildah/podman 3rd party jobs.
* Remove these 3rd party jobs as the last stage of setting up the dependency lines.
#### Diff script design
* Where does the script run and where is it displayed
Suggestion: run the diff within the job and store the output in the logs, but keep the script runnable independently of the job.
Output the diff file link in the cockpit.
* What does the diff script display?
* The difference between the two lists ONLY
* Not a combined list of the two outputs
* Report a package if it exists in only one list OR if it exists in both lists at different versions
* Keep the output file naming consistent
* control list
* dependency list
* Deciding if there is a difference in the versions:
* rpm output looks as follows: {name}-{number/letter sequence}
* pattern split:
```
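import re
i = "podman-1.6.4-10.el8"  # example rpm string; the version shown is illustrative
# Split at the first "-<digits>"; everything before it is the package name
package_name = re.split(r'(-[0-9]+)', i)[0]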
```
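The diff rules above (report only the differences, covering packages present in one list only and packages whose version differs) can be sketched as follows. This is a minimal illustration, not the script that runs in the jobs: the function names `split_rpm` and `diff_rpm_lists` and the sample package versions are hypothetical, and the name/version split reuses the `-<digits>` pattern shown above.

```python
import re


def split_rpm(entry):
    """Split an rpm string into (name, version-release) at the first '-<digits>'."""
    parts = re.split(r'(-[0-9]+)', entry.strip(), maxsplit=1)
    name = parts[0]
    # Re-join the remainder and drop the leading '-' that separated it from the name.
    version = "".join(parts[1:])[1:]
    return name, version


def diff_rpm_lists(control, dependency):
    """Return {name: (control_version, dependency_version)} for differing packages only."""
    control_map = dict(split_rpm(e) for e in control if e.strip())
    dependency_map = dict(split_rpm(e) for e in dependency if e.strip())
    diff = {}
    for name in sorted(set(control_map) | set(dependency_map)):
        old = control_map.get(name)      # None when missing from the control list
        new = dependency_map.get(name)   # None when missing from the dependency list
        if old != new:
            diff[name] = (old, new)
    return diff


# Illustrative input only; these are not real job results.
control = ["podman-1.6.4-10.el8", "buildah-1.11.6-7.el8", "runc-1.0.0-65.el8"]
dependency = ["podman-2.0.5-5.el8", "buildah-1.11.6-7.el8", "skopeo-1.1.1-3.el8"]
for name, (old, new) in diff_rpm_lists(control, dependency).items():
    print(f"{name}: control={old} dependency={new}")
```

Two limitations of this sketch are worth noting: a package name that itself contains `-<digits>` would be split too early, and a package installed at several versions at once keeps only the last entry seen.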
#### Diff within the job logs - questions
* same job so the only rpms that should change should be related to the updated stream/module/repo
* if we use the job that builds containers, why would we have to diff the container content? Do we want a per container diff list as well as per node?
* what should be compared within the context of the dependency jobs
* what, if anything, we should extend to add to all jobs/upgrade jobs/component jobs/jenkins jobs
* if repoquery is still a good way to go
* if we take the repoquery - and then build containers - what additional benefit we get from scraping the container rpms
* what are we diff'ing here? just displaying rpms and versions. Do we require the highest epoch thing? If the versions are the same and the releases are different?
* Let's consider the three cases:
* rpm
* module
* stream - possibility of multiple versions
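One way to frame the epoch/version/release question is to report which field changed between the control and dependency runs, so that a release-only bump can be told apart from a version or epoch change. The sketch below is an assumption-laden illustration: `Evr`, `parse_evr` and `classify_change` are hypothetical names, the input is assumed to already be in `[epoch:]version-release` form, and a missing epoch is treated as 0. It only checks equality; deciding which of two versions is newer would need rpm's own comparison rules.

```python
from typing import NamedTuple


class Evr(NamedTuple):
    epoch: int
    version: str
    release: str


def parse_evr(text):
    """Parse '[epoch:]version-release'; a missing epoch is assumed to be 0."""
    epoch, sep, rest = text.partition(":")
    if not sep:
        epoch, rest = "0", text
    version, sep, release = rest.rpartition("-")
    if not sep:
        # No '-' present: treat the whole string as the version.
        version, release = rest, ""
    return Evr(int(epoch), version, release)


def classify_change(old, new):
    """Return the first differing field ('epoch', 'version', 'release') or None."""
    a, b = parse_evr(old), parse_evr(new)
    for field in Evr._fields:
        if getattr(a, field) != getattr(b, field):
            return field
    return None


# Illustrative strings only.
print(classify_change("1.6.4-10.el8", "1.6.4-12.el8"))
print(classify_change("1.6.4-10.el8", "1:1.6.4-10.el8"))
```

For the module and stream cases, the same classification could be applied per package once the module or stream has been expanded to its rpm list.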
## Setting dependency versions
GOAL: Single source of truth for where dependencies are locked to a particular version
* current-idea: tripleo-versions rpm
#### Where are we locking dependency packages
* quickstart release config https://opendev.org/openstack/tripleo-quickstart/src/branch/master/config/release/tripleo-ci/CentOS-8/master.yml#L194
* tripleo-ansible https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_podman/vars/redhat-8.2.yml#L19
* disk-image-builder https://opendev.org/openstack/diskimage-builder/src/branch/master/diskimage_builder/elements/rhel-common/README.rst#L267
#### Dependency injection for forward looking tests
* http://git.app.eng.bos.redhat.com/git/tripleo-environments.git/tree/config/release/dependency_ci/container_tools/repo_config.yaml
* https://opendev.org/openstack/tripleo-quickstart/src/branch/master/config/release/dependency_ci/centos8stream/repo_config.yaml
## Monitoring updates
After a discussion with the RDO folks, it was deemed a good idea to include the rdoinfo and nfvinfo projects, which test dependencies, in our cockpit monitoring tool.
* [ rdoinfo](https://review.rdoproject.org/r/#/q/project:rdoinfo)
* [ nfvinfo](https://review.rdoproject.org/r/#/q/project:nfvinfo)
rdoinfo and nfvinfo take proposed changes to dependencies and test them against packstack, puppet and tripleo jobs. This does help ensure the quality of dependency changes. However, it does not speak to the inconsistent errors that often arise from dependency changes. The dependency pipeline allows the CI team to vet changes over time and highlight consistency issues.
* change to add package logging
* https://review.rdoproject.org/r/c/rdo-jobs/+/32364
* query can be something like
```
SELECT "branch", "change", "dep_change", "log_link", "duration", "result", "result_num" FROM "build" WHERE ("project" = 'rdoinfo' and "type"='rdo' and "dep_change"!='')
```
* Mike Burns requested that the following Jenkins jobs be added to the openvswitch dep lines
* https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-network-ml2ovs-fdp-trigger/
* https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-network-ml2ovs-fdp-trigger/lastSuccessfulBuild/artifact/report/report.html
* https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-network-ml2ovn-fdp-trigger/52/
## rhel_u_next redesign
* build new rhos-release rpm (change in modules, rpms possible)
* use provider job to build new containers
* build new images
* run jobs depending on ^^
* standalone
* ovb
* multinode
* upgrade
* possible updated pacemaker etc.
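The ordering implied by the list above can be sketched as a small dependency graph. This is only an illustration of the stage ordering, not a Zuul configuration: the stage labels are taken from the bullets, and the assumption that both the container build and the image build depend on the new rhos-release rpm, with every test job depending on both, is ours.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumed prerequisites for the test jobs (see the note above).
artifacts = {"provider job: build containers", "build images"}

# Each stage maps to the set of stages that must finish before it starts.
stages = {
    "build rhos-release rpm": set(),
    "provider job: build containers": {"build rhos-release rpm"},
    "build images": {"build rhos-release rpm"},
    "standalone": artifacts,
    "ovb": artifacts,
    "multinode": artifacts,
    "upgrade": artifacts,
}

# static_order() yields every stage after all of its prerequisites.
order = list(TopologicalSorter(stages).static_order())
for position, stage in enumerate(order, start=1):
    print(position, stage)
```

The relative order of stages that do not depend on each other (for example the four test jobs) is not significant; in practice they could run in parallel.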