Enabling Cluster-API-based Installations via openshift-install

Summary

This enhancement discusses how openshift-install can use
cluster-api (CAPI) infrastructure providers to provision infrastructure for clusters,
without requiring access to an external management cluster or a local container runtime.
By running a Kubernetes control plane and CAPI-provider controllers as
subprocesses on the installer host, openshift-install can use CAPI and its
providers in much the same way it currently uses Terraform and its providers.

Motivation

There are two primary motivations:

  1. OpenShift Alignment with CAPI: CAPI offers numerous potential benefits,
    such as day-2 infrastructure management, an API for users to edit cluster
    infrastructure, and upstream collaboration. Installer support for CAPI would
    be foundational for adopting these benefits.

  2. Terraform BSL License Change: due to Terraform's move to the restrictive
    Business Source License (BSL), openshift-install needs a framework to replace the primary
    tool it used to provision cluster infrastructure. In addition to the benefits
    listed above, CAPI provides solutions for the biggest gaps left by Terraform:
    a common API for specifying infrastructure across cloud providers and robust
    infrastructure error handling.

User Stories

  • As an existing user/client of the installer, I want backwards compatibility so that I can continue to use the installer (e.g. create cluster) in the same manner and with existing automation.
  • As a security analyst, I want the installer image to be free of Terraform and related dependencies to decrease surface area for vulnerabilities.
  • As an advanced user or cluster administrator, I want to be able to edit the CAPI infrastructure manifests so that I can customize control-plane infrastructure.

Goals

  • To provide a common user and developer experience when installing and developing across cloud platforms
  • To be backwards compatible and fully satisfy the requirements of install-config type APIs.
  • To keep the user experience for day-zero operations unchanged or improved.
  • To not require any new runtime dependencies.
  • To provide an extensible framework to plug-in new infrastructure cloud providers.

Non-Goals / Future work

  • To retain full and strict backward compatibility with the infrastructure previously created with Terraform
  • To optimize build processes or binary size
  • To use an existing management cluster to install OpenShift
  • To pivot the CAPI manifests to the newly-installed cluster to enable day-2 infrastructure management within the cluster.

Proposal

The Installer will create CAPI infrastructure manifests based on user
input from the install config; then, to provision cluster infrastructure, it will
apply those manifests to CAPI controllers running against a local Kubernetes control plane
set up by envtest.

Workflow Description

The cluster creator is a human user responsible for deploying a
cluster. Note that the workflow does not change for this user.

openshift-install is the Installer binary.

  1. The cluster creator provides an install-config and credentials
  2. (optional) The cluster creator runs openshift-install create manifests
  3. (optional) The cluster creator edits the newly created CAPI manifests.
  4. The cluster creator runs openshift-install create cluster
  5. openshift-install extracts kube-apiserver, etcd, the core CAPI controller, and the cloud-specific CAPI infrastructure provider to the install dir
  6. openshift-install, using envtest, initializes a control plane locally on the Installer host
  7. openshift-install execs the core CAPI controller and the cloud infrastructure provider as subprocesses, pointing them to the local control plane
  8. openshift-install applies the CAPI manifests to the control plane
  9. The CAPI controllers provision cluster infrastructure based on the manifests
  10. openshift-install monitors the status of the local manifests as they are applied
  11. If the statuses are as expected, infrastructure has been provisioned and installation continues with the normal flow.

In the case of an error in the final step, the Installer will bubble up the resources with unexpected statuses.
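For illustration, here is a minimal Go sketch of what steps 8-10 could look like: applying a CAPI Cluster manifest to the local control plane with a controller-runtime client and polling its status until the infrastructure reports ready. The function name, polling interval, and timeout are illustrative assumptions, not the Installer's actual code.

package capiprovision

import (
    "context"
    "time"

    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/apimachinery/pkg/util/wait"
    "k8s.io/client-go/rest"
    clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// applyAndWait creates a CAPI Cluster object on the local control plane and
// polls its status until the infrastructure is reported ready or the timeout
// expires. Illustrative sketch only.
func applyAndWait(ctx context.Context, cfg *rest.Config, cluster *clusterv1.Cluster) error {
    scheme := runtime.NewScheme()
    if err := clusterv1.AddToScheme(scheme); err != nil {
        return err
    }
    cl, err := client.New(cfg, client.Options{Scheme: scheme})
    if err != nil {
        return err
    }
    if err := cl.Create(ctx, cluster); err != nil {
        return err
    }
    key := client.ObjectKeyFromObject(cluster)
    return wait.PollUntilContextTimeout(ctx, 15*time.Second, 30*time.Minute, true,
        func(ctx context.Context) (bool, error) {
            c := &clusterv1.Cluster{}
            if err := cl.Get(ctx, key, c); err != nil {
                return false, nil // transient API errors: keep polling
            }
            return c.Status.InfrastructureReady, nil
        })
}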

Variation and form factor considerations [optional]

This proposal only changes how openshift-install provisions infrastructure for
standalone OCP clusters. HyperShift and MicroShift do not provision cluster
infrastructure through openshift-install, so they are not affected by this change.

API Extensions

This enhancement does not add or modify any API extensions (CRDs, admission or
conversion webhooks, aggregated API servers, or finalizers) in the installed cluster.
The CAPI CRDs and webhooks used during provisioning exist only on the ephemeral local
control plane running on the Installer host and are cleaned up when openshift-install exits.

Implementation Details/Notes/Constraints [optional]

Overview

In a typical CAPI installation, manifests indicating the desired cluster configuration are applied to a
management cluster. In order to keep openshift-install free of any new external runtime dependencies,
the dependencies will be embedded into the openshift-install binary,
extracted at runtime, and cleaned up afterward. This approach is similar to what we have been using
for Terraform.

With Terraform, the Installer has been embedding the Terraform and cloud-specific provider binaries
within the Installer binary and extracting them at runtime. The Installer produces the Terraform
configuration files and invokes Terraform using the tf-exec library.
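For context, the general tf-exec pattern looks roughly like the following Go sketch (illustrative only; this is not the Installer's exact invocation):

package tfinvoke

import (
    "context"

    "github.com/hashicorp/terraform-exec/tfexec"
)

// runTerraform illustrates the tf-exec pattern: point tfexec at the extracted
// terraform binary and the generated configuration directory, then run init
// and apply. Options and error handling are simplified.
func runTerraform(ctx context.Context, workingDir, terraformBin string) error {
    tf, err := tfexec.NewTerraform(workingDir, terraformBin)
    if err != nil {
        return err
    }
    if err := tf.Init(ctx); err != nil {
        return err
    }
    return tf.Apply(ctx)
}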

[Diagram: Terraform-based provisioning flow]

We can follow a similar pattern to run CAPI controllers locally on the Installer host. In addition
to the CAPI controller binaries, kube-apiserver and etcd are embedded in order to run a local
control plane, orchestrated with envtest.

[Diagram: CAPI-based provisioning flow]
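As a rough Go sketch of that embed-and-extract pattern (the bin/ directory name and file layout are illustrative assumptions, not the Installer's actual layout):

package embedbins

import (
    "embed"
    "os"
    "path"
    "path/filepath"
)

// The control-plane and provider binaries are embedded at build time.
//
//go:embed bin/*
var embeddedBinaries embed.FS

// extractBinaries writes the embedded binaries into destDir (for example, the
// install dir) so they can be exec'd at runtime and cleaned up afterward.
func extractBinaries(destDir string) error {
    entries, err := embeddedBinaries.ReadDir("bin")
    if err != nil {
        return err
    }
    for _, entry := range entries {
        data, err := embeddedBinaries.ReadFile(path.Join("bin", entry.Name()))
        if err != nil {
            return err
        }
        // The extracted files must be executable.
        if err := os.WriteFile(filepath.Join(destDir, entry.Name()), data, 0o755); err != nil {
            return err
        }
    }
    return nil
}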

Local control plane

The local control plane is set up using prior work done in Controller Runtime: envtest.
Envtest was born out of the need to run integration tests for controllers against a real API server, register webhooks
(conversion, admission, validation), and manage the lifecycle of Custom Resource Definitions.

Over time, envtest has matured to the point that it can be used to run controllers in a local environment,
reducing or eliminating the need for a full Kubernetes cluster to run controllers.

At a high level, the local control plane is responsible for:

  • Setting up certificates for the apiserver and etcd.
  • Running (and cleaning up, on shutdown) the local control plane components.
  • Installing any required components, like Custom Resource Definitions (CRDs)
    • For Cluster API core, the CRDs are stored in data/data/cluster-api/core-components.yaml.
    • Infrastructure providers are expected to store their components in data/data/cluster-api/<name>-infrastructure-components.yaml
  • Upon install, the local control plane takes care of modifying any webhook (conversion, admission, validation) to point to the host:port combination assigned.
    • Each controller manager will have its own host:port combination assigned.
    • Certificates are generated and injected into the server, and the client certificates into the api-server webhook configuration.
  • For each process that the local control plane manages, a health check (a ping to /healthz) is required to pass, similar to how a health probe is configured when running in a Deployment.
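As a hedged sketch, starting such a local control plane with envtest could look like the following Go snippet. The binary directory is wherever the embedded binaries were extracted; the aws-infrastructure-components.yaml file name simply follows the pattern above and is an assumption; webhook rewiring and error handling are omitted.

package localcp

import (
    "k8s.io/client-go/rest"
    "sigs.k8s.io/controller-runtime/pkg/envtest"
)

// startLocalControlPlane points envtest at the extracted kube-apiserver and
// etcd binaries and installs the CAPI CRDs from the component manifests.
// Illustrative sketch only.
func startLocalControlPlane(binDir string) (*envtest.Environment, *rest.Config, error) {
    env := &envtest.Environment{
        // Directory containing the extracted kube-apiserver and etcd binaries.
        BinaryAssetsDirectory: binDir,
        // CRDs for Cluster API core and the cloud infrastructure provider.
        CRDDirectoryPaths: []string{
            "data/data/cluster-api/core-components.yaml",
            "data/data/cluster-api/aws-infrastructure-components.yaml",
        },
    }
    cfg, err := env.Start()
    if err != nil {
        return nil, nil, err
    }
    return env, cfg, nil
}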

Manifests

The Installer will produce the CAPI manifests as part of the manifests target, writing them to a new
cluster-api directory alongside the existing manifests and openshift directories:

$ ./openshift-install create manifests --dir install-dir
INFO Credentials loaded from the "default" profile in file "~/.aws/credentials"
INFO Consuming Install Config from target directory
INFO Manifests created in: install-dir/cluster-api, install-dir/manifests and install-dir/openshift
$ tree install-dir/cluster-api/
install-dir/cluster-api/
├── 00_capi-namespace.yaml
├── 01_aws-cluster-controller-identity-default.yaml
├── 01_capi-cluster.yaml
├── 02_infra-cluster.yaml
├── 10_inframachine_mycluster-6lxqp-master-0.yaml
├── 10_inframachine_mycluster-6lxqp-master-1.yaml
├── 10_inframachine_mycluster-6lxqp-master-2.yaml
├── 10_inframachine_mycluster-6lxqp-master-bootstrap.yaml
├── 10_machine_mycluster-6lxqp-master-0.yaml
├── 10_machine_mycluster-6lxqp-master-1.yaml
├── 10_machine_mycluster-6lxqp-master-2.yaml
└── 10_machine_mycluster-6lxqp-master-bootstrap.yaml

1 directory, 12 files

The manifests within this cluster-api directory will not be written to the cluster or included in bootstrap ignition.
In future work, we expect these manifests to be pivoted to the cluster to enable the target cluster to take over managing
its own infrastructure.
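To illustrate what a file such as 10_machine_mycluster-6lxqp-master-0.yaml contains, here is a sketch using the upstream CAPI Go types the Installer could render it from. The names come from the example output above; the bootstrap secret name and the AWSMachine API version are illustrative assumptions.

package capimanifests

import (
    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// masterMachine sketches a control-plane Machine manifest: a CAPI Machine that
// references a provider-specific AWSMachine for its infrastructure.
func masterMachine() *clusterv1.Machine {
    dataSecret := "mycluster-6lxqp-master" // illustrative bootstrap data secret name
    return &clusterv1.Machine{
        ObjectMeta: metav1.ObjectMeta{
            Name: "mycluster-6lxqp-master-0",
        },
        Spec: clusterv1.MachineSpec{
            ClusterName: "mycluster-6lxqp",
            Bootstrap:   clusterv1.Bootstrap{DataSecretName: &dataSecret},
            InfrastructureRef: corev1.ObjectReference{
                APIVersion: "infrastructure.cluster.x-k8s.io/v1beta2", // AWS provider API version (assumption)
                Kind:       "AWSMachine",
                Name:       "mycluster-6lxqp-master-0",
            },
        },
    }
}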

Risks and Mitigations

While we do not expect these changes to introduce a significant security risk, we are working with product security teams
to ensure they are aware of the changes and are able to review.

Drawbacks

Because the CAPI providers' codebases live in repositories external to the Installer,
the process for developing features and delivering fixes is more complex. While we had a
similar situation with Terraform, the CAPI providers will be more actively developed than their
Terraform counterparts. Furthermore, it will be necessary to ensure that the CAPI providers
used by the Installer match the versions of those in the payload.

While this external dependency is a significant drawback, it is not unique to this design
and is common throughout OpenShift (e.g. any time the API or library-go must be updated
before being vendored into a component). To minimize developer-experience friction, we will focus
on documenting a workflow for developing providers while working with the Installer. If
the problem becomes significant, we could consider automation to bump the Installer's providers
when merges happen upstream or in our forks.

Design Details

Open Questions [optional]

  1. UX design during the install process as well as during failures (log collection). The Installer will dump
    (potentially prettified) controller logs. Once we reach a certain level of stability, it may be worthwhile
    to implement a UI.

Test Plan

As this is replacing existing functionality in the Installer, we can rely on existing
testing infrastructure.

Graduation Criteria

Dev Preview -> Tech Preview

  • Ability to utilize the enhancement end to end
  • End user documentation, relative API stability

Tech Preview -> GA

  • More testing (upgrade, downgrade, scale)
  • Sufficient time for feedback
  • Available by default
  • User facing documentation created in openshift-docs

Removing a deprecated feature

  • Announce deprecation and support policy of the existing feature
  • Deprecate the feature

Upgrade / Downgrade Strategy

As this enhancement only concerns the Installation process and affects only the underlying cluster
infrastructure, this change should not affect existing cluster upgrades.

Version Skew Strategy

N/A

Operational Aspects of API Extensions

N/A

Failure Modes

During a failed install, the controller logs (displayed on stdout and collected in .openshift_install.log)
will contain useful information. The status of the CAPI manifests may also contain useful information,
in which case it would be important to display it to users and collect it for bugs and support cases. There
is an open question about the best way to handle this UX, and we expect the answer to become clearer during
development.

As the infrastructure will be reconciled by a controller, it will be possible to resolve issues during an ongoing
installation, although this would not necessarily be a feature we would call attention to for documented use cases.

Finally, the Installer will need to be able to identify when infrastructure provisioning has failed during an installation.
Initially this will be achieved through a timeout.
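As a hedged Go sketch of that timeout-based detection (the phases checked, interval, and timeout are illustrative assumptions):

package capiprovision

import (
    "context"
    "fmt"
    "time"

    "k8s.io/apimachinery/pkg/util/wait"
    clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// waitForMachines polls the Machines on the local control plane and fails the
// install if they do not reach the Provisioned or Running phase within the
// timeout, surfacing any provider-reported failure message. Illustrative only.
func waitForMachines(ctx context.Context, cl client.Client, namespace string) error {
    return wait.PollUntilContextTimeout(ctx, 30*time.Second, 20*time.Minute, true,
        func(ctx context.Context) (bool, error) {
            machines := &clusterv1.MachineList{}
            if err := cl.List(ctx, machines, client.InNamespace(namespace)); err != nil {
                return false, nil // transient API errors: keep polling
            }
            for _, m := range machines.Items {
                if m.Status.FailureMessage != nil {
                    // Bubble the provider-reported failure up to the user.
                    return false, fmt.Errorf("machine %s failed: %s", m.Name, *m.Status.FailureMessage)
                }
                switch m.Status.Phase {
                case string(clusterv1.MachinePhaseProvisioned), string(clusterv1.MachinePhaseRunning):
                    continue
                default:
                    return false, nil // not ready yet
                }
            }
            return len(machines.Items) > 0, nil
        })
}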

Support Procedures

Failures during CAPI-based provisioning surface on the Installer host rather than in a running cluster:
symptoms appear on stdout and in .openshift_install.log, which collects the CAPI controller logs, and in
the statuses of the CAPI manifests applied to the local control plane (see Failure Modes above).

Because the local control plane and its controllers are torn down when openshift-install exits, no API
extensions, webhooks, or controllers from this enhancement remain in the installed cluster, so there is
nothing to disable and no impact on existing or newly created workloads.

Implementation History

Major milestones in the life cycle of a proposal should be tracked in Implementation History.

Alternatives

Other infrastructure-as-code alternatives, such as Pulumi, Ansible, or OpenTofu,
all have their own drawbacks. We prefer the CAPI solution over
these alternatives because it:

  • streamlines Installer development (we do not need to re-implement features for the control plane)
  • lays the foundation for OpenShift to implement future CAPI features
  • requires less development effort, as CAPI providers are already set up to provision infrastructure for a cluster

It would also be possible to implement the installation using direct SDK calls for each cloud provider. In addition
to the reasons stated above, individual SDK implementations would not provide a common framework across
cloud platforms.

