<!--
 This work is licensed under a Creative Commons Attribution 3.0
 Unported License.

 http://creativecommons.org/licenses/by/3.0/legalcode
-->

# DHCP-less

Discuss options and outline a proposal to enable DHCP-less deployments by leveraging existing Ironic support for this functionality, without dependencies on downstream customizations.

## Status

implementable

## Summary

Metal<sup>3</sup> provides an IPAM controller which can be used to enable deployment with static IPs instead of DHCP. However, it is currently not possible to use this functionality in a fully DHCP-less environment without downstream customizations.

This proposal outlines the outstanding issues and potential solutions to enable an improved DHCP-less solution for Metal<sup>3</sup> users.

## Motivation

Infrastructure management via Metal<sup>3</sup> in DHCP-less environments is common, but today our upstream features only partially solve this use case.

Since several groups in the community require this functionality, it makes sense to collaborate and ensure we can use upstream components where possible, and only depend on downstream customizations where absolutely required.

### Goals

Provide a method to support DHCP-less deployments without any downstream customizations (except perhaps a different IPA ramdisk image?).

### Non-Goals

Existing methods used to solve this with downstream customizations (such as a custom PreprovisioningImageController) are valid and will still sometimes be required. This proposal doesn't aim to replace such methods, only to provide a simpler path for those using only the upstream components.

This proposal will focus on the Metal<sup>3</sup> components only - there are also OS dependencies and potential related areas of work in Ironic; these will be mentioned in the Dependencies section but not covered in detail here.

This proposal will only consider the Metal<sup>3</sup> IPAM controller - there are other options but none are currently integrated via CAPM3.

## Proposal

Implement a new CAPM3 controller to handle setting the BareMetalHost `preprovisioningNetworkDataName`.

### User Stories

#### Static network configuration (no IPAM)

As a user I want to manage my network configuration statically as part of my BareMetalHost inventory.

In this case the network configuration is provided via a Secret which is either manually created or templated outside the scope of Metal<sup>3</sup>.

The BareMetalHost API already supports two interfaces for passing network configuration:

* `networkData` - this data is passed to the deployed OS through Ironic via a configuration drive partition. It is then typically read on firstboot by a tool such as `cloud-init` which supports the OpenStack network data format.
* `preprovisioningNetworkDataName` - this data is designed to allow passing data during the preprovisioning phase, e.g. to configure networking for the IPA deploy ramdisk.

The `preprovisioningNetworkDataName` API was added initially to enable [image building workflows](https://github.com/metal3-io/baremetal-operator/blob/main/docs/api.md#preprovisioningimage), and a [recent BMO change](https://github.com/metal3-io/baremetal-operator/pull/1380) landed to enable this flow without any custom PreprovisioningImage controller.

#### IPAM configuration

As a user I wish to make use of the Metal<sup>3</sup> IPAM solution in a DHCP-less environment.

Metal<sup>3</sup> provides an [IPAM controller](https://github.com/metal3-io/ip-address-manager) which can be used to allocate IPs used as part of the Metal3Machine lifecycle.

Some gaps exist which prevent realizing this flow, so the main focus of the proposal will be how to solve for this use case.

## Design Details

### PreprovisioningNetworkData Controller

Currently CAPM3 uses Metal3DataTemplate to template the BareMetalHost configuration derived from the IPAM resources, and by default this workflow is coupled to the Metal3Machine/Metal3MachineTemplate.
This is an issue for the pre-provisioning case, since at the point of BareMetalHost inspection no Machine is associated with the BareMetalHost.

We can potentially resolve this with a new controller that handles the pre-provisioning actions prior to associating a BareMetalHost resource with a Metal3Machine. It will do the following:

- Read each Metal3MachineTemplate, matching BareMetalHost resources by `hostSelector`
- Use the referenced Metal3DataTemplate `networkData` to create a secret which sets the BareMetalHost `preprovisioningNetworkDataName`

#### Assumptions and Open Questions

- If we want to support different network configurations for pre-provisioning and provisioning, we'll need to add preprovisioningNetworkData support to Metal3DataTemplate (this could also serve as a way to opt in to this new behavior if we don't want a controller-level flag?)
- It is assumed that the BareMetalHost label used by the hostSelector does not change between pre-provisioning and provisioning. If we want to allow different IP pools for pre-provisioning and deployed cluster usage, we'll need a new resource containing a `hostSelector` which is not aligned with the Machine lifecycle.
- There is a risk of BareMetalHost resource labels matching more than one Metal3MachineTemplate. How do we handle that?

### Inspection on initial registration

On initial registration of a host, inspection is triggered immediately, but in a DHCP-less environment this process cannot complete without preprovisioning network configuration (because the IPA ramdisk can't connect back to the Ironic API).

This can be resolved if the BareMetalHost resources are created with the existing [paused annotation](https://github.com/metal3-io/baremetal-operator/blob/main/docs/api.md#pausing-reconciliation), set to a pre-determined value (e.g. `metal3.io/preprovisioning`). The new controller can then remove the annotation after `preprovisioningNetworkDataName` has been set, at which point inspection will be able to succeed.

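
The controller flow described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the proposed implementation: the `Host` and `Template` classes are hypothetical stand-ins for BareMetalHost and Metal3MachineTemplate, only `matchLabels`-style selection is modelled, the secret naming scheme is invented, and leaving an ambiguously matched host paused is just one possible answer to the open question above. The annotation key `baremetalhost.metal3.io/paused` is the existing BMO paused annotation.

```python
from dataclasses import dataclass, field

# Existing BMO annotation key used to pause reconciliation.
PAUSED_ANNOTATION = "baremetalhost.metal3.io/paused"
# Pre-determined annotation value suggested in this proposal.
PREPROVISIONING_PAUSE_VALUE = "metal3.io/preprovisioning"


@dataclass
class Host:
    """Hypothetical, simplified stand-in for a BareMetalHost."""
    name: str
    labels: dict = field(default_factory=dict)
    annotations: dict = field(default_factory=dict)
    preprovisioning_network_data_name: str = ""


@dataclass
class Template:
    """Hypothetical stand-in for a Metal3MachineTemplate (matchLabels only)."""
    name: str
    match_labels: dict = field(default_factory=dict)


def matching_templates(host, templates):
    """Return templates whose matchLabels are all present on the host."""
    return [
        t for t in templates
        if all(host.labels.get(k) == v for k, v in t.match_labels.items())
    ]


def reconcile(host, templates):
    """Set the network data reference and unpause, or report why not."""
    matches = matching_templates(host, templates)
    if len(matches) != 1:
        # Zero or several matches: leave the host untouched (and paused).
        return "no-match" if not matches else "ambiguous"
    # Hypothetical secret name; the real secret would be rendered from the
    # networkData of the Metal3DataTemplate referenced by the template.
    host.preprovisioning_network_data_name = f"{host.name}-preprov-networkdata"
    # Only remove a pause that carries the pre-determined value.
    if host.annotations.get(PAUSED_ANNOTATION) == PREPROVISIONING_PAUSE_VALUE:
        del host.annotations[PAUSED_ANNOTATION]
    return "configured"
```

Checking the annotation value before removing it means a pause set by a user or by another controller is left in place, which is the reason for using a pre-determined value rather than an arbitrary one.
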
### Implementation Details/Notes/Constraints

#### IP Reuse

A related issue has been previously addressed via the [IP Reuse](https://github.com/metal3-io/cluster-api-provider-metal3/blob/main/docs/ip_reuse.md) functionality - this means we can couple IPClaims to the BareMetalHost resources, which will enable consistent IP allocations for pre-provisioning and subsequent provisioning operations (provided the same IPPool is used for both steps).

### Risks and Mitigations

- TODO

### Work Items

TODO

### Dependencies

#### Firstboot agent support

An agent in the IPA ramdisk image is required to consume the network data provided via the processes outlined above.

The Ironic DHCP-less documentation describes using glean (a minimal Python-based cloud-init alternative), but we don't currently have any community-supported IPA ramdisk image containing this tool.

#### Potential config-drive conflict on redeployment

### Test Plan

TODO

### Upgrade / Downgrade Strategy

TODO

### Version Skew Strategy

N/A

## Drawbacks

TODO

## Alternatives

### Kanod

One possibility is to manage the lifecycle of `preprovisioningNetworkDataName` outside of the Metal<sup>3</sup> core components - such an approach has been successfully demonstrated in the [Kanod community](https://gitlab.com/Orange-OpenSource/kanod/), which is related to the [Sylva](https://sylvaproject.org) project.
The design proposal here has been directly inspired by this work, but I think directly integrating this functionality into CAPM3 has the following advantages:

* We can close a functional gap which potentially impacts many Metal<sup>3</sup> users, not only those involved with Kanod/Sylva
* We can avoid requiring a separate IPAM controller and IP address pool (and potentially a reserved IP range), and instead leverage the IPAM resources provided by Metal<sup>3</sup>
* Directly integrating into CAPM3 means we can use a common approach for `networkData` and `preprovisioningNetworkData` - this should be easier to understand and maintain, and potentially reduces the number of host-specific resources required (which can be important at very large scale)

## References

TODO