# Metamorph - Redfish-based Provisioner for Metal^3^
[TOC]
## Summary
This proposal expands the scope of Metal³ by introducing a multi-provisioner mechanism.
### Status
Implementable
## Motivation
Metal^3^ follows the paradigm of Kubernetes Native Infrastructure (KNI), which is an approach to use Kubernetes to manage underlying infrastructure. Managing the configuration of some physical devices is closely related to managing physical hosts.
We have observed a few issues with Ironic as the current provisioner, such as managing bare metal servers over the WAN. This is a critical issue that needs to be resolved.
### Goals
* Ease the management of bare metal servers over the WAN.
* Introduce an in-house Redfish-based provisioner in Metal3.
* Have the provisioner help detect the cause of failures on errored hosts.
* Introduce a multi-provisioner mechanism in Metal3 so that support for more bare metal (BM) provisioners can be added in the future.
* Leave existing Ironic functionality unaffected while making it possible to address further bare metal issues.
### Non-Goals
* Adding support for other deployment interfaces.
## Proposal
This document proposes a new mechanism that automatically performs physical device configuration with the in-house Redfish provisioner before provisioning a BareMetalHost. It involves modifications to the BareMetalHost (BMH) objects, the existing provisioner, the ErrorDetection object, and related components.
### User Stories
1. As a User, I want a fast provisioning mechanism based on K8s with minimal configuration to deploy hosts at edge sites.
2. As a User, I should be able to configure hosts in fewer steps, without Ironic, using only Redfish Virtual Media to avoid DHCP- and PXE-related runtime issues.
3. As a User, I should be able to validate hosts and detect errors, with details on what went wrong and what caused the failure.
4. As a User, I should be able to have an in-house provisioner that makes future enhancements easier.
5. As a User, I need to onboard nodes at edge sites where PXE-based installation does not work over the WAN.
## Design Details
Resources and controllers scheduled to be added and modified in the current design:
### Multi Provisioner Plugin Mechanism
The current BMO architecture implements the Ironic APIs using Gophercloud. This logic needs to be modified to support multiple provisioning systems in the future, so we have to introduce a multi-provisioner mechanism that allows Metal³ to support many provisioners.
The multi-provisioner mechanism can be extended by adding more provisioners in the future. It will be implemented using the Factory pattern, which promotes loose coupling and eases future enhancements and modifications of provisioners.
The mechanism will let users choose the right provisioner for their needs.
* Changes are needed in the baremetal-controller.go file.
* Introduce ProvManager.go to handle provisioners.
* Generalize the provisioner interface so that it is easy to extend, add, or modify features.
The provisioner pattern can be extended to other resources in the future, e.g. `Storage, Network, Acceleration`.
People may come with new requirements for managing storage, network, and different accelerators with their own libraries, and this change will help with such enhancements.
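
The Factory-based registry described above could look roughly like the following. This is a minimal sketch, not the BMO implementation: the `Provisioner` interface is deliberately trimmed down, and `ProvManager`, `Factory`, `Register`, `New`, and the two stub provisioners are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// Provisioner is a hypothetical, trimmed-down interface. The provisioner
// interface in BMO has more methods; two are enough to show the pattern.
type Provisioner interface {
	Name() string
	Provision(hostName string) (string, error)
}

// Factory builds a Provisioner instance on demand.
type Factory func() (Provisioner, error)

// ErrUnknownProvisioner is returned when no factory is registered under a name.
var ErrUnknownProvisioner = errors.New("unknown provisioner")

// ProvManager is a registry of provisioner factories (hypothetical type).
type ProvManager struct {
	mu        sync.RWMutex
	factories map[string]Factory
}

// NewProvManager returns an empty registry.
func NewProvManager() *ProvManager {
	return &ProvManager{factories: make(map[string]Factory)}
}

// Register adds a factory and rejects duplicate names.
func (m *ProvManager) Register(name string, f Factory) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, exists := m.factories[name]; exists {
		return fmt.Errorf("provisioner %q is already registered", name)
	}
	m.factories[name] = f
	return nil
}

// New builds the provisioner registered under name.
func (m *ProvManager) New(name string) (Provisioner, error) {
	m.mu.RLock()
	f, ok := m.factories[name]
	m.mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("%w: %q", ErrUnknownProvisioner, name)
	}
	return f()
}

// Names lists the registered provisioners in sorted order.
func (m *ProvManager) Names() []string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	names := make([]string, 0, len(m.factories))
	for n := range m.factories {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// Stub provisioners standing in for the real Ironic and Metamorph plugins.
type ironicProvisioner struct{}

func (ironicProvisioner) Name() string { return "ironic" }
func (ironicProvisioner) Provision(host string) (string, error) {
	return "ironic accepted " + host, nil
}

type metamorphProvisioner struct{}

func (metamorphProvisioner) Name() string { return "metamorph" }
func (metamorphProvisioner) Provision(host string) (string, error) {
	return "metamorph accepted " + host, nil
}

func main() {
	m := NewProvManager()
	_ = m.Register("ironic", func() (Provisioner, error) { return ironicProvisioner{}, nil })
	_ = m.Register("metamorph", func() (Provisioner, error) { return metamorphProvisioner{}, nil })

	// The controller would pick the name from configuration; it is hard-coded here.
	p, err := m.New("metamorph")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	status, _ := p.Provision("edge-node-0")
	fmt.Println(m.Names(), status)
}
```

With a registry like this, the controller depends only on the interface and the manager, so adding a provisioner means registering one more factory rather than editing the controller.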
### Metamorph
We are introducing Metamorph as a separate provisioner project under the Metal^3^ community.
* Metamorph is already in the development phase.
* We are going to redesign parts of the Metamorph project.
* The plan is to call only specific Redfish APIs using Go’s REST client library.
* This provisioner tackles the critical issues highlighted in the Goals section and will support the Error Detection Mechanism.
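
As an illustration of calling a single Redfish API from Go, the sketch below reads the `PowerState` and `Status` properties of a Redfish `ComputerSystem` resource using only the standard `net/http` package. It is not Metamorph code: `RedfishClient`, `SystemSummary`, `GetSystem`, and `newFakeBMC` are hypothetical names, the credentials are placeholders, and the BMC is simulated with `httptest` so the example runs without hardware.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// SystemSummary holds a few properties of a Redfish ComputerSystem resource.
type SystemSummary struct {
	PowerState string `json:"PowerState"`
	Status     struct {
		State  string `json:"State"`
		Health string `json:"Health"`
	} `json:"Status"`
}

// RedfishClient is a hypothetical minimal client for one BMC.
type RedfishClient struct {
	BaseURL  string
	Username string
	Password string
	HTTP     *http.Client
}

// GetSystem fetches /redfish/v1/Systems/{systemID} and decodes the summary.
func (c *RedfishClient) GetSystem(systemID string) (*SystemSummary, error) {
	url := c.BaseURL + "/redfish/v1/Systems/" + systemID
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(c.Username, c.Password)
	req.Header.Set("Accept", "application/json")

	resp, err := c.HTTP.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("redfish: GET %s returned %s", url, resp.Status)
	}

	var s SystemSummary
	if err := json.NewDecoder(resp.Body).Decode(&s); err != nil {
		return nil, fmt.Errorf("redfish: decoding response: %w", err)
	}
	return &s, nil
}

// newFakeBMC starts a local server that answers for system "1" only.
// It stands in for a real BMC so the sketch is runnable anywhere.
func newFakeBMC() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/redfish/v1/Systems/1", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"PowerState":"On","Status":{"State":"Enabled","Health":"OK"}}`)
	})
	return httptest.NewServer(mux)
}

func main() {
	bmc := newFakeBMC()
	defer bmc.Close()

	c := &RedfishClient{
		BaseURL:  bmc.URL,
		Username: "admin",    // placeholder credentials
		Password: "password", // placeholder credentials
		HTTP:     &http.Client{Timeout: 5 * time.Second},
	}
	sys, err := c.GetSystem("1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sys.PowerState, sys.Status.State, sys.Status.Health)
}
```

A real BMC is normally reached over HTTPS and may require session-based authentication; both are left out here to keep the sketch short.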
### Changes in existing provisioner
* Ironic, the existing provisioner, requires modification and needs to be integrated into the new plugin mechanism.
* We have to move the entire tightly coupled Ironic (Gophercloud) implementation into a plugin-based implementation.
* We don’t have to change any calls made to Ironic or any association with other dependencies such as Gophercloud; only baremetal-controller.go changes.
* A new directory structure will be introduced for the new approach, so module-path-level changes are expected.
* The tightly coupled Ironic integration dependencies need to be changed.
#### Modify the Metal^3^-dev-env scripts
* Add plugin support to the Metal^3^-dev-env configuration and installation scripts.
* This requires some effort.
* Plugin-specific ENV params need to be introduced.
* Some changes are needed in the installation scripts.
* Metamorph will become a built-in, K8s-based provisioner that helps deal with edge-level Redfish-specific operations.
### Error Detection Mechanism
* This is a major requirement from the Airship community: “how to detect the cause behind the failures”.
* With Metamorph in the picture, we can diagnose a failure by making separate calls to the Redfish APIs to get details of the host state.
* Here we can use the HWCC classification capabilities to find hosts in a failed or error state; we then rerun the validation module to find the cause of the failure.
* This error detection mechanism will bring more changes to HWCC, and some Jobs and interface calls will need to be introduced in BMO.
* In the HWCC CR, we have to introduce error classification labels that fetch hosts in an error state. Based on the error state, HWCC will call the error detection mechanism, which triggers a call for the particular request.
* The Error Detection Mechanism will be responsible for getting the details of the hosts based on the defined error states.
* The plan is to introduce only the four error states supported by Metal3.
* The Error Detection Mechanism will introduce more APIs in redfish.go (Metamorph) for getting more information about the actual errors, by implementing interface- and Job-specific APIs.
* Once the information is received from these APIs, the Error Detection Mechanism adds annotations with the detailed errors to the BMH object.
* Users can get the errored hosts from the classification CR status; to learn more about an error, they can open the BMH CR and look at the annotations section.
* So two labels will be introduced in the CR file:
    * A label for the type of error on the host: RegistrationError, InspectionError, ProvisioningError, etc.
    * A label, error-detect=true/false, that controls whether the Error Detection Mechanism runs.
* This way, we can implement a useful mechanism to both classify and detect errors for the defined error states.
* The Error Detection Mechanism will be implemented in Metamorph, with future enhancements in mind and to reduce the dependency on BMO.
* Metamorph will become the key Redfish module for Metal3.
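
The label-driven flow above can be sketched as follows. This is only an illustration under stated assumptions: `Host` is a stand-in for BareMetalHost metadata, `DetailFetcher` abstracts the Redfish query, the `error-detect` label comes from the text, and the `error-type` label key, the `error-detail` annotation key, and the error type string values are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// ErrorType names the error categories listed in the proposal.
// The string values are placeholders, not the values used by BMO.
type ErrorType string

const (
	RegistrationError ErrorType = "RegistrationError"
	InspectionError   ErrorType = "InspectionError"
	ProvisioningError ErrorType = "ProvisioningError"
)

const (
	errorDetectLabel      = "error-detect" // label named in the proposal
	errorTypeLabel        = "error-type"   // hypothetical label key
	errorDetailAnnotation = "error-detail" // hypothetical annotation key
)

// Host is a stand-in for the metadata of a BareMetalHost object.
type Host struct {
	Name        string
	Labels      map[string]string
	Annotations map[string]string
}

// DetailFetcher abstracts the Redfish call that explains a failure.
type DetailFetcher func(hostName string, et ErrorType) (string, error)

// Detect runs error detection for one host. It returns true when an
// annotation was written and false when detection is disabled for the host.
func Detect(h *Host, fetch DetailFetcher) (bool, error) {
	if !strings.EqualFold(h.Labels[errorDetectLabel], "true") {
		return false, nil
	}

	et := ErrorType(h.Labels[errorTypeLabel])
	switch et {
	case RegistrationError, InspectionError, ProvisioningError:
		// Supported error type; continue.
	default:
		return false, fmt.Errorf("host %s: unsupported error type %q", h.Name, et)
	}

	detail, err := fetch(h.Name, et)
	if err != nil {
		return false, fmt.Errorf("host %s: fetching error detail: %w", h.Name, err)
	}
	if h.Annotations == nil {
		h.Annotations = make(map[string]string)
	}
	h.Annotations[errorDetailAnnotation] = detail
	return true, nil
}

func main() {
	host := &Host{
		Name: "edge-node-0",
		Labels: map[string]string{
			errorDetectLabel: "true",
			errorTypeLabel:   string(ProvisioningError),
		},
	}
	// A canned fetcher replaces the real Redfish query in this sketch.
	fetch := func(name string, et ErrorType) (string, error) {
		return "virtual media mount failed", nil
	}
	annotated, err := Detect(host, fetch)
	fmt.Println(annotated, err, host.Annotations[errorDetailAnnotation])
}
```

Keeping the Redfish query behind a function type lets the classification logic be unit-tested without a BMC.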
### Block Diagram
```mermaid
graph TB
id1[BMH] --> id2[Plugin Manager ] --> id3[Ironic] & id4[Metamorph]
```
### Sequence Diagram
```mermaid
sequenceDiagram
BMH->>+BM_Controller.go: Request for BM Provisioning
BM_Controller.go->>+provPluginManager.go: Redirect request to Metamorph
provPluginManager.go->>+MetamorphDriver: Extracts the request and calls the API
MetamorphDriver->>+Metamorph: Request Obj
Metamorph->>-MetamorphDriver: Return Response
MetamorphDriver->>-provPluginManager.go: Returns Host Provisioned Status
provPluginManager.go->>-BM_Controller.go: Returns the Host Status
BM_Controller.go-->>-BMH: Host
participant BMH
note right of BMH: Metamorph Workflow
```
## Implementation Details/Notes/Constraints
The detailed approach is explained in the Design Details section.
### Risks and Mitigations
* The existing Ironic implementation needs to be integrated and tested properly.
* The new generic provisioning mechanism should not break any existing functionality.
#### Work Items
* Define a common provisioner plugin mechanism.
* Implement Metamorph in the plugin-based design.
* Move the existing provisioner into the new plugin architecture.
* Implement a metadata collection module for Metamorph.
* Redefine the Redfish APIs in Metamorph.
* Introduce a new error-state API in Metamorph.
* Implement the Metamorph Error-State controller.
* Implement Redfish calls for collecting failure/error logs from interfaces.
* Introduce extraction and conversion logic for the Redfish APIs in the Error-State controller.
* Define a proper format for exceptions/errors for error classification.
* Write unit tests for PluginManager and the drivers.
* Write unit tests for the Metamorph API and controller.
* Write unit tests for the Redfish API library.
### Dependencies
* CAPM3
* Baremetal-Operator
### Test Plan
* Unit tests will be implemented.
* Functional testing will be performed against the implemented Metamorph provisioner.
* Deployment & integration testing will be done.
## References
* [https://metamorph.readthedocs.io/en/latest/](https://metamorph.readthedocs.io/en/latest/)
* [https://github.com/bm-metamorph/MetaMorph](https://github.com/bm-metamorph/MetaMorph)