<!--
This work is licensed under a Creative Commons Attribution 3.0
Unported License.
http://creativecommons.org/licenses/by/3.0/legalcode
-->
# DHCP-less
Discuss options and outline a proposal to enable DHCP-less deployments by leveraging existing
Ironic support for this functionality, without depending on downstream customizations.
## Status
implementable
## Summary
Metal<sup>3</sup> provides an IPAM controller which can be used to enable
deployment with static IPs instead of DHCP. However, it is not currently
possible to use this functionality in a fully DHCP-less environment without
downstream customizations.
This proposal outlines the outstanding issues and potential solutions to
enable an improved DHCP-less solution for Metal<sup>3</sup> users.
## Motivation
Infrastructure management via Metal<sup>3</sup> in DHCP-less environments
is common, but today our upstream features only partially address this use case.
Since there are several groups in the community who require this functionality,
it makes sense to collaborate and ensure we can use upstream components where
possible and only depend on downstream customizations where absolutely required.
### Goals
Provide a method to support DHCP-less deployments without any downstream
customizations (except perhaps a different IPA ramdisk image?).
### Non-Goals
Existing methods used to solve this with downstream customizations (such
as a custom PreprovisioningImageController) are valid and will still sometimes
be required. This proposal doesn't aim to replace such methods, only to provide a simpler
path for those using only the upstream components.
This proposal will focus on the Metal<sup>3</sup> components only. There are
also OS dependencies and potential related areas of work in Ironic; these will
be mentioned in the Dependencies section but not covered in detail here.
This proposal will only consider the Metal<sup>3</sup> IPAM controller -
there are other options but none are currently integrated via CAPM3.
## Proposal
Implement a new CAPM3 controller to handle setting the BareMetalHost `preprovisioningNetworkDataName`.
### User Stories
#### Static network configuration (no IPAM)
As a user I want to manage my network configuration statically as part of my
BareMetalHost inventory.
In this case the network configuration is provided via a Secret which is
either manually created or templated outside the scope of Metal<sup>3</sup>.
The BareMetalHost API already supports two interfaces for passing network configuration:
* `networkData` - this data is passed to the deployed OS by Ironic via a
configuration drive partition. It is then typically read on first boot by
a tool such as `cloud-init` which supports the OpenStack network data format.
* `preprovisioningNetworkDataName` - this references data to be passed
during the preprovisioning phase, e.g. to configure networking for the IPA deploy
ramdisk.
The `preprovisioningNetworkDataName` API was added initially to enable [image
building workflows](https://github.com/metal3-io/baremetal-operator/blob/main/docs/api.md#preprovisioningimage), and a [recent BMO change](https://github.com/metal3-io/baremetal-operator/pull/1380) landed to enable this flow without any custom PreprovisioningImage controller.
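To make the static case concrete, here is a minimal sketch of a Secret carrying network data in the OpenStack network data format for a single statically addressed interface. The function names, the Secret key `networkData`, and all addresses are assumptions chosen for illustration; check the BareMetalHost API documentation for the exact Secret layout expected by your BMO version.

```python
import json


def static_network_data(mac, ip, netmask, gateway, dns_servers, link_id="eth0"):
    """Build an OpenStack-style network data document for one static IPv4 interface."""
    return {
        # One physical link, identified by its MAC address.
        "links": [
            {"id": link_id, "type": "phy", "ethernet_mac_address": mac},
        ],
        # One static IPv4 network bound to that link, with a default route.
        "networks": [
            {
                "id": "network0",
                "type": "ipv4",
                "link": link_id,
                "ip_address": ip,
                "netmask": netmask,
                "routes": [
                    {"network": "0.0.0.0", "netmask": "0.0.0.0", "gateway": gateway},
                ],
            },
        ],
        "services": [{"type": "dns", "address": addr} for addr in dns_servers],
    }


def network_data_secret(name, namespace, network_data):
    """Wrap the document in a Secret manifest (as a dict).

    The key name `networkData` is an assumption of this sketch.
    """
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "stringData": {"networkData": json.dumps(network_data, indent=2)},
    }
```

The resulting Secret name would then be referenced from the BareMetalHost `preprovisioningNetworkDataName` field (and, if the same configuration applies after deployment, from `networkData`).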
#### IPAM configuration
As a user I wish to make use of the Metal<sup>3</sup> IPAM solution, in a
DHCP-less environment.
Metal<sup>3</sup> provides an [IPAM controller](https://github.com/metal3-io/ip-address-manager)
which can be used to allocate IPs used as part of the Metal3Machine lifecycle.
Some gaps exist which prevent realizing this flow, so the main focus of the
proposal will be how to address this use case.
## Design Details
### PreprovisioningNetworkData Controller
Currently CAPM3 uses Metal3DataTemplate to template the BareMetalHost configuration derived from the IPAM resources, and by default this workflow is coupled to the Metal3Machine/Metal3MachineTemplate. This is an issue for the pre-provisioning case since at the point of BareMetalHost inspection no Machine is associated with the BareMetalHost.
We can potentially resolve this with a new controller that handles the pre-provisioning
actions prior to associating a BareMetalHost resource with a Metal3Machine. It will do the following:
- Read each Metal3MachineTemplate, matching BareMetalHost resources by `hostSelector`
- Use the referenced Metal3DataTemplate `networkData` to create a Secret, then set the BareMetalHost `preprovisioningNetworkDataName` to reference it
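The matching step above can be sketched as follows. This is an illustrative model only, not CAPM3 code: resources are plain dicts, only `matchLabels` is evaluated (`matchExpressions` is ignored), and the function names, the simplified template shape, and the generated Secret name suffix are all hypothetical.

```python
def matches_host_selector(host_labels, host_selector):
    """Return True if the host labels satisfy every matchLabels entry.

    An empty selector matches every host in this sketch.
    """
    match_labels = host_selector.get("matchLabels") or {}
    return all(host_labels.get(key) == value for key, value in match_labels.items())


def plan_preprovisioning(templates, hosts):
    """Decide which template should drive each host's preprovisioning data.

    Returns (plan, ambiguous):
      plan      maps host name -> chosen template and a hypothetical Secret name
      ambiguous maps host name -> names of all matching templates (more than one)
    """
    plan, ambiguous = {}, {}
    for host in hosts:
        name = host["metadata"]["name"]
        labels = host["metadata"].get("labels") or {}
        spec = host.get("spec") or {}
        if spec.get("preprovisioningNetworkDataName"):
            continue  # already configured, nothing to do
        matched = [
            t["name"]
            for t in templates
            if matches_host_selector(labels, t.get("hostSelector") or {})
        ]
        if len(matched) == 1:
            plan[name] = {
                "template": matched[0],
                # Hypothetical naming scheme for the generated Secret.
                "secretName": f"{name}-preprov-networkdata",
            }
        elif len(matched) > 1:
            # Surface the conflict rather than guessing which template wins.
            ambiguous[name] = matched
    return plan, ambiguous
```

Hosts matching more than one template are reported separately rather than resolved, since the proposal leaves that policy open.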
#### Assumptions and Open Questions
- If we want to support different network configurations for pre-provisioning and provisioning, we'll need to add preprovisioningNetworkData support to Metal3DataTemplate (this could also serve as a way to opt in to this new behavior if we don't want a controller-level flag?)
- It is assumed that the BareMetalHost label used by the hostSelector does not change between pre-provisioning and provisioning. If we want to allow different IP pools for pre-provisioning and deployed cluster usage, we'll need a new resource containing a `hostSelector` which is not aligned with the Machine lifecycle.
- There is a risk of BareMetalHost resource labels matching more than one Metal3MachineTemplate. How do we handle that?
### Inspection on initial registration
On initial registration of a host, inspection is triggered immediately, but in a DHCP-less environment this process cannot complete without preprovisioning network configuration (because the IPA ramdisk can't connect back to the Ironic API).
This can be resolved if the BareMetalHost resources are created with the existing [paused annotation](https://github.com/metal3-io/baremetal-operator/blob/main/docs/api.md#pausing-reconciliation) set to a pre-determined value (e.g. `metal3.io/preprovisioning`). The new controller can then remove the annotation after `preprovisioningNetworkDataName` has been set, at which point inspection will be able to succeed.
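A minimal sketch of the unpause decision, modelling a BareMetalHost as a plain dict. The annotation key `baremetalhost.metal3.io/paused` is assumed here to be the pause annotation described in the linked BMO documentation, the value is the example value from the text above, and the function name is hypothetical.

```python
# Assumed BMO pause annotation key; verify against the linked API docs.
PAUSED_ANNOTATION = "baremetalhost.metal3.io/paused"
# Example marker value taken from the text above.
PREPROVISIONING_PAUSE_VALUE = "metal3.io/preprovisioning"


def unpause_if_ready(host):
    """Return (annotations, changed) for a BareMetalHost dict.

    The pause annotation is removed only when both conditions hold:
      * it carries the pre-determined value, so a pause set by a user or
        another controller for a different reason is left alone
      * spec.preprovisioningNetworkDataName has been set
    """
    metadata = host.get("metadata") or {}
    annotations = dict(metadata.get("annotations") or {})  # copy, do not mutate input
    spec = host.get("spec") or {}

    owned = annotations.get(PAUSED_ANNOTATION) == PREPROVISIONING_PAUSE_VALUE
    ready = bool(spec.get("preprovisioningNetworkDataName"))

    if owned and ready:
        del annotations[PAUSED_ANNOTATION]
        return annotations, True
    return annotations, False
```

Checking the annotation value before removing it is what lets this flow coexist with other uses of the pause annotation.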
### Implementation Details/Notes/Constraints
#### IP Reuse
A related issue has previously been addressed via the [IP Reuse](https://github.com/metal3-io/cluster-api-provider-metal3/blob/main/docs/ip_reuse.md) functionality. This means we can couple IPClaims to the BareMetalHost resources, which will enable consistent IP allocations for pre-provisioning and subsequent provisioning operations (provided the same IPPool is used for both steps).
### Risks and Mitigations
- TODO
### Work Items
TODO
### Dependencies
#### Firstboot agent support
An agent in the IPA ramdisk image is required to consume the network data provided via the processes outlined above.
The Ironic DHCP-less documentation describes using glean (a minimal Python-based cloud-init alternative), but we don't
currently have any community-supported IPA ramdisk image containing this tool.
#### Potential config-drive conflict on redeployment
### Test Plan
TODO
### Upgrade / Downgrade Strategy
TODO
### Version Skew Strategy
N/A
## Drawbacks
TODO
## Alternatives
### Kanod
One possibility is to manage the lifecycle of `preprovisioningNetworkDataName` outside of
the Metal<sup>3</sup> core components - such an approach has been successfully demonstrated
in the [Kanod community](https://gitlab.com/Orange-OpenSource/kanod/) which is related to
the [Sylva](https://sylvaproject.org) project.
The design proposal here has been directly inspired by this work, but I think directly integrating
this functionality into CAPM3 has the following advantages:
* We can close a functional gap which potentially impacts many Metal<sup>3</sup> users, not only those involved with Kanod/Sylva
* We can avoid requiring a separate IPAM controller and IP address pool (and potentially reserved IP range), and instead leverage the IPAM resources provided by Metal<sup>3</sup>
* Directly integrating into CAPM3 means we can use a common approach for `networkData` and `preprovisioningNetworkData` - this should be easier to understand/maintain and potentially reduces the number of host-specific resources required (which can be important at very large scale)
## References
TODO