Dependency Pipeline Design and Planning

tags: Design

This document describes how the various dependency pipelines are being designed.

Background Info

Third Party Dependencies and the Dependency Pipelines

What are Dependencies?

Dependencies are any rpms installed as part of an RDO/OSP deployment that are not built by DLRN. Examples of dependencies are:

  • openvswitch
  • container-tools module
  • ubi-8 container image (not rpm-based)
  • CentOS/RHEL (OS-related)
  • ansible

What are the Dependency Pipelines?

A dependency pipeline provides the means to test upcoming changes to third-party rpms installed with RDO/OSP product code - while keeping the production CI running on known stable dependency versions so that product development can keep moving forward.

A failure in the dependency pipelines gives infra/DFG teams advance warning of issues that will arrive when the dependency version under test is included in the integration lines.

The infra/DFG teams can choose to alert the third party dependency provider and/or patch RDO/OSP to ensure it is compatible with the upcoming dependency change.

Motivation

Traditionally, product teams discovered that a dependency had been updated with an incompatible change only when the dependency update was released and incorporated into the production CI system - impacting all jobs.

This incompatible change would cause all the integration lines and check jobs to fail - immediately blocking all product development until the incompatibility was resolved.

The dependency pipelines aim to provide a "canary in the coal mine" - allowing teams to take preventative action to address upcoming dependency changes. At a minimum, teams can pin dependency versions to keep tests running while the incompatibilities are resolved by third party groups.

Glossary

Dependency: rpms/modules/container images that are installed as part of an RDO/OSP deployment and are not built by DLRN
Dependency pipeline: A new pipeline that validates upcoming changes to dependencies against stable product code
Dependency pipeline jobs: A set of CI jobs that validate an upcoming change to a dependency that will be integrated into RDO/OSP in the future

Goal/Non-Goals statement

  • To give product teams a heads-up on upcoming changes so that they can take preventative action and/or plan accordingly (early detection of issues introduced by dependencies).
  • De-risk RHEL9 and related content integration with OSP.
  • Provide feedback to OSP leadership associated with https://trello.com/b/UxQnPO2Y/rhel9-openstack-changes (OSP on rhel9 project planning/monitoring)

Overview

Comparing the old workflow to the dependency pipeline workflow:

old workflow:

digraph hierarchy {

nodesep=1.0
node [color=Green,fontname=Courier,shape=box]
edge [color=Blue, style=dashed]

    {dependency_v1 [color=Black]} ->{component integration check}
    
}

Then the dependency updates from v1 to v2:

digraph hierarchy {

nodesep=1.0
node [color=Red,fontname=Courier,shape=box]
edge [color=Blue, style=dashed]

    {dependency_v2 [color=Black]}->{component_FAIL integration_FAIL check_FAIL}
    
}

dependency pipeline workflow (allows for continued product development):

digraph hierarchy {

nodesep=1.0
node [color=Red,fontname=Courier,shape=box]
edge [color=Blue, style=dashed]

     {dependency_v2 [color=Black]}->{ "dependency_pipeline" }->{"dependency_pipeline/test1"}->{"pin dependency to v1" "report failure in LP or BZ" "add compatibility fix"} ->{"integrate fix" [color=black,fontcolor=black,shape=box3d]}->{"unpin dependency" [color=green]}
     
nodesep=1.0
node [color=Green,fontname=Courier,shape=box]
edge [color=Blue, style=dashed]

    {dependency_v1 [color=Black]}->{component integration check}
}
digraph hierarchy {

nodesep=1.0
node [color=Green,fontname=Courier,shape=box]
edge [color=Blue, style=dashed]

     {dependency_v3 [color=Black]}->{ "dependency_pipeline" }->{"dependency_pipeline/test1"}

nodesep=1.0
node [color=Green,fontname=Courier,shape=box]
edge [color=Blue, style=dashed]

    {dependency_v2 [color=Black]}->{component integration check}
    
}

Stakeholders

  • Ronelle, Phil, Wes, Kevin Carter, Alan Pevec, Mike Burns, Chris Jones

Milestones

  • "design review"

    • 8.3 - 8.4 dep pipelines
      • centos/rhel 8 pipelines
    • Use centos/rhel 8 as a proof of concept for rhel 9
      • CI
      • procedures re: bugs, CIX
        • holding failing rpms back from production builds
        • CIX visibility to dependencies being "pinned"
        • The pinning of failing deps needs to be an automated process
    • 9.x design
  • OSP17 components build on 8.4 passing Integration Gate prior to introducing 9.0

  • RHEL 9

Organizational dependencies

  • dependency job ownership
    • DFGs, including Delivery when the issue is NOT OSP content (Phil is testing this assertion - true?)

Decision points

  • what pipelines to build
    • centos stream (upstream)
      • test rpms that are included in the upcoming OS version.
    • ansible (tricky: no rpm built for 2.10 -> 2.11; 2.10 available upstream)
      • test the latest upcoming ansible rpm. In the case of ansible 2.10, there is no plan to create a downstream rpm, but upstream developers want to prepare for the expected changes. This is a case where we need both an upstream and a downstream line.
    • container tools modules (upstream/downstream/both)?
      • test buildah, podman and related rpms. Another case where we need parallel upstream/downstream lines: changes to the rpms may hit downstream first, but the change to make the product compatible will hit upstream first.
    • openvswitch/ovn
      • test openvswitch/ovn. Note that this dependency is installed in the zuul playbooks for multinode jobs - which can cause issues when a version later than the one the product expects is installed.
    • pacemaker
      • (not yet designed) test HA-related rpms (parallel upstream/downstream lines are required)
    • adv-virt
      • test virtualization module rpms. Suggestion: start with downstream first?
  • in what order
  • who owns the debug/CIX cards
  • what corrective action can be taken?
    • sometimes the fix is on the product side
    • often third party reliant
    • do we have a point of contact on each of these dependencies? Alan Pevec is leading the OSP on RHEL9 project.
  • where does the pipeline run
    • (OS being downstream first and product being upstream first)
    • duplicate upstream/downstream pipelines
    • psi public vs. vexxhost
  • diff scripts running within jobs

Team sync meeting points

  • timelines
  • basic status
  • share dependencies
  • steps required before next sync call

Design and steps

POC setup

(initial design) https://hackmd.io/I_CFKPXHTza-5i2kFDIs5Q#Dependency-Pipeline

from https://hackmd.io/14KFQiyWSBCRsfJmBZNARw#Dependency-change-workflow

Initial design plan

TASKS - remove any 3rd party dependency jobs

  • Once container-tools jobs are running, we no longer need buildah/podman 3rd party jobs.
  • Remove these 3rd party jobs as the last stage of setting up the dependency lines.

Diff script design

  • Where does the script run and where is it displayed?
    • Suggestion: run the diff within the job and store the output in the logs, but also have a script that can be run independently
    • Output the diff file link in the cockpit
  • What does the diff script display?
    • The difference between the two lists ONLY
    • Not a combined list of the two outputs
    • Report a package if it exists in only one list OR if it exists at a different version
  • Keep the output file naming consistent
    • control list
    • dependency list
  • Deciding if there is a difference in the versions:
    • rpm output looks as follows: {name}-{number/letter sequence}
    • pattern split:
import re
# Split the package name off at the first '-<digit>' boundary,
# e.g. 'openvswitch-2.15.0-1.el8' -> 'openvswitch'
package_name = re.split('(-[0-9]+)', 'openvswitch-2.15.0-1.el8')[0]
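A minimal sketch of the diff behavior described above (report only the differences: a package present in just one list, or present at a different version); the package strings and list names are illustrative:

```python
import re

def parse(lines):
    """Map package name -> full rpm string, splitting at the first '-<digit>'."""
    return {re.split('(-[0-9]+)', line)[0]: line for line in lines}

def diff_lists(control, dependency):
    """Return only the differences between the control and dependency rpm lists."""
    c, d = parse(control), parse(dependency)
    out = []
    for name in sorted(set(c) | set(d)):
        if name not in d:
            out.append(f"only in control: {c[name]}")
        elif name not in c:
            out.append(f"only in dependency: {d[name]}")
        elif c[name] != d[name]:
            out.append(f"version changed: {c[name]} -> {d[name]}")
    return out

control = ["openvswitch-2.13.0-1.el8", "podman-1.6.4-4.el8"]
dependency = ["openvswitch-2.15.0-1.el8", "buildah-1.19.3-1.el8"]
print("\n".join(diff_lists(control, dependency)))
```

Note this deliberately emits no combined list - packages identical in both lists produce no output, matching the "difference ONLY" requirement above.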

Diff within the job logs - questions

  • same job, so the only rpms that change should be those related to the updated stream/module/repo
  • if we use the job that builds containers, why would we have to diff the container content? Do we want a per container diff list as well as per node?
  • what should be compared within the context of the dependency jobs
  • what, if anything, we should extend to add to all jobs/upgrade jobs/component jobs/jenkins jobs
  • if repoquery is still a good way to go
  • if we take the repoquery - and then build containers - what additional benefit we get from scraping the container rpms
  • what are we diffing here? Just displaying rpms and versions? Do we need the highest-epoch handling? What if the versions are the same and the releases are different?
  • Let's consider the three cases:
    • rpm
    • module
    • stream - possibility of multiple versions
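On the epoch question above: under rpm semantics the epoch dominates the version, which dominates the release, so two packages with the same version but different releases do compare as different. A simplified sketch of that ordering (the numeric-only segment handling is an assumption; a production job should use rpm.labelCompare from the rpm Python bindings instead):

```python
def evr(pkg):
    """Split 'epoch:version-release' into a comparable (epoch, version, release) tuple.
    A missing epoch defaults to 0, matching rpm semantics."""
    epoch, _, rest = pkg.partition(':') if ':' in pkg else ('0', '', pkg)
    version, _, release = rest.partition('-')

    def segs(s):
        # Simplified: keep only the numeric dot-separated segments,
        # so 'el8' suffixes are ignored when comparing.
        return tuple(int(x) for x in s.split('.') if x.isdigit())

    return (int(epoch), segs(version), segs(release))

# Epoch dominates version, which dominates release:
print(evr('1:2.13.0-79.el8') > evr('2.15.0-1.el8'))
```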

Setting dependency versions

GOAL: Single source of truth for where dependencies are locked to a particular version

  • current-idea: tripleo-versions rpm

Where are we locking dependency packages

dependency injection for forward-looking tests
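The "single source of truth" idea could be expressed as a small pin map that jobs render into versionlock entries; the map, package names, and output format here are purely illustrative (the real dnf versionlock entry format differs slightly):

```python
# Hypothetical pin map: dependency name -> locked version (None = unpinned).
PINS = {
    "openvswitch": "2.13.0-79.el8",
    "container-tools": None,
}

def versionlock_lines(pins):
    """Render only the pinned entries in a rough name-version lock format."""
    return [f"{name}-{version}" for name, version in sorted(pins.items()) if version]

print("\n".join(versionlock_lines(PINS)))
```

Keeping the map in one rpm (the tripleo-versions idea above) would let every pipeline consume the same pins, and would make unpinning a single, auditable change.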

Monitoring updates

After a discussion with the RDO folks, it was deemed a good idea to include the rdoinfo and nfvinfo projects that test dependencies in our cockpit monitoring tool.

rdoinfo and nfvinfo take proposed changes to dependencies and test them against packstack, puppet and tripleo jobs. This helps ensure the quality of dependency changes. However, it does not address the intermittent errors that often arise from dependency changes. The dependency pipeline allows the CI team to vet changes over time and highlight consistency issues.

SELECT "branch", "change", "dep_change", "log_link", "duration", "result", "result_num" FROM "build" WHERE ("project" = 'rdoinfo' and "type"='rdo' and "dep_change"!='')

rhel_u_next redesign

  • build new rhos-release rpm (change in modules, rpms possible)
  • use provider job to build new containers
  • build new images
  • run jobs depending on ^^
    • standalone
    • ovb
    • multinode
    • upgrade
  • possible updated pacemaker etc.
  • select a repo