# Metal3 proposal for CNCF incubation

## Table of Contents

* [About Metal3](#about-metal3)
* [Documentation](#documentation)
* [Socials](#socials)
* [Talks](#talks)
* [Sandbox Acceptance](#sandbox-acceptance)
* [Progress Since Sandbox](#progress-since-sandbox)
* [Technical advances](#technical-advances)
* [Incubation Stage Requirements](#incubation-stage-requirements)
* [Statement on Alignment with the CNCF Mission](#statement-on-alignment-with-the-cncf-mission)
* [Alignment with Other CNCF Projects](#alignment-with-other-cncf-projects)
* [Future Plan](#future-plan)

## About Metal3

The Metal³ project (pronounced "Metal Kubed") provides components for bare metal host management with Kubernetes. You can enroll your bare metal machines, provision operating system images, and then, if you like, deploy Kubernetes clusters to them. From there, operating and upgrading your Kubernetes clusters can be handled by Metal³. Moreover, Metal³ is itself a Kubernetes application: it runs on Kubernetes and uses Kubernetes resources and APIs as its interface.

Metal³ is one of the providers for the Kubernetes sub-project Cluster API. Cluster API provides infrastructure-agnostic Kubernetes lifecycle management, and Metal³ brings the bare metal implementation.

Furthermore, Metal³ utilizes a component from the OpenStack ecosystem called Ironic. Ironic is used for booting and installing machines. Metal³ handles the installation of Ironic as a standalone component (there is no need to bring along the rest of OpenStack). Ironic is supported by a mature community of hardware vendors and supports a wide range of bare metal management protocols, which are continuously tested on a variety of hardware. Backed by Ironic, Metal³ can provision machines regardless of the brand of hardware.

In summary, you can write Kubernetes manifests representing your hardware and your desired Kubernetes cluster layout.
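As a minimal sketch of such a manifest, a `BareMetalHost` resource enrolls a machine by pointing Metal³ at its baseboard management controller (BMC) and the image to provision; all names, addresses, credentials, and URLs below are illustrative placeholders, and the BMC address could equally use another supported protocol such as Redfish:

```yaml
# Secret holding the BMC credentials for the host (values are base64 encoded).
apiVersion: v1
kind: Secret
metadata:
  name: worker-0-bmc-secret
type: Opaque
data:
  username: YWRtaW4=       # "admin"
  password: cGFzc3dvcmQ=   # "password"
---
# The BareMetalHost resource enrolls the machine with Metal3.
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: worker-0
spec:
  online: true                           # power the host on
  bootMACAddress: 00:11:22:33:44:55      # MAC used for network booting
  bmc:
    address: ipmi://192.168.111.1:6230   # management protocol and endpoint
    credentialsName: worker-0-bmc-secret
  image:
    url: http://example.com/images/my-os-image.qcow2
    checksum: http://example.com/images/my-os-image.qcow2.sha256sum
```

Once applied, the baremetal-operator registers the host with Ironic, inspects it, and provisions the referenced image.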
Then Metal³ can:

- Discover your hardware inventory
- Configure BIOS and RAID settings on your hosts
- Optionally clean a host's disks as part of provisioning
- Install and boot an operating system image of your choice
- Deploy Kubernetes
- Upgrade Kubernetes or the operating system in your clusters with a non-disruptive rolling strategy
- Automatically remediate failed nodes by rebooting them and removing them from the cluster if necessary

You can even deploy Metal³ to your clusters so that they can manage other clusters using Metal³…

### Documentation

- [User guide](https://book.metal3.io/)
- [Quick start guide](https://metal3.io/try-it.html)
- [In depth BMO docs](https://github.com/metal3-io/baremetal-operator/tree/main/docs)
- [Design documents](https://github.com/metal3-io/metal3-docs/tree/main/design)

### Socials

- [Twitter](https://twitter.com/metal3_io)
- [Youtube](https://www.youtube.com/channel/UC_xneeYbo-Dl4g-U78xW15g)
- [Website](https://metal3.io/)
- [GitHub](https://github.com/metal3-io)
- [Slack](https://kubernetes.slack.com/messages/CHD49TLE7)
- [Mailing list/Google group](https://groups.google.com/g/metal3-dev)

### Talks

- [Kubernetes meetup, Helsinki, May 2022](https://youtu.be/LTzIudvLs9A?t=8332)
- [Open Infra Summit (2022)](https://www.youtube.com/watch?v=KOmEjmRbMWM)
- [OpenStack Bare Metal SIG, Dec 2021](https://youtu.be/rjSC6cJ9YY8)
- [CNCF Webinar on Metal3, Dec 2020](https://youtu.be/-BvQWz2cfCg)
- [Openshift Commons talk on Metal3, Oct 2020](https://youtu.be/HVKwWAE1nUE)

### Sandbox Acceptance

The Metal3 project was accepted into the CNCF sandbox on 8 September 2020 ([cncf/toc#408](https://github.com/cncf/toc/pull/408)).

## Progress Since Sandbox

The Metal3-io project has gone through many changes since its acceptance into the sandbox. Repositories that proved to be unmaintainable or lost their relevance have been archived after extensive community discussions.
As it stands today, the project revolves around four core components plus a development environment, a documentation repository, a website repository, and an infrastructure-as-code (IaC) repository that handles the CI/CD processes for the project.

The four core components are:

- [baremetal-operator](https://github.com/metal3-io/baremetal-operator)
- [cluster-api-provider-metal3](https://github.com/metal3-io/cluster-api-provider-metal3)
- [ip-address-manager](https://github.com/metal3-io/ip-address-manager)
- [ironic-image](https://github.com/metal3-io/ironic-image)

The development environment can be found at [metal3-dev-env](https://github.com/metal3-io/metal3-dev-env). The design documents and the user guide are located at [metal3-docs](https://github.com/metal3-io/metal3-docs). The website's source is located at [metal3-io.github.io](https://github.com/metal3-io/metal3-io.github.io). The CI/CD IaC repository is [project-infra](https://github.com/metal3-io/project-infra).

### Technical Advances

**Annual sandbox reports**

- [2022](https://github.com/cncf/toc/pull/958)
- [2021](https://github.com/cncf/toc/pull/734)

**Aggregated list of technical advances in sandbox**

- **Cluster-api-provider-metal3**
  - Node reuse feature ([#169](https://github.com/metal3-io/cluster-api-provider-metal3/pull/169))
  - Remediation controller ([#157](https://github.com/metal3-io/cluster-api-provider-metal3/pull/157))
  - Add v1beta1 types and related changes ([#342](https://github.com/metal3-io/cluster-api-provider-metal3/pull/342))
  - Support IP reuse for BMHs using preallocations ([#656](https://github.com/metal3-io/cluster-api-provider-metal3/pull/656))
  - Introduce additional providerID format ([#563](https://github.com/metal3-io/cluster-api-provider-metal3/pull/563))
  - Adoption of CAPI e2e framework ([initial PR](https://github.com/metal3-io/cluster-api-provider-metal3/pull/194))
  - Add live-iso support to the CAPI Metal3 provider ([#189](https://github.com/metal3-io/cluster-api-provider-metal3/pull/189))
- **Baremetal-operator**
  - Authenticating to Ironic ([#601](https://github.com/metal3-io/baremetal-operator/pull/601))
  - Added RAID configuration support ([#134](https://github.com/metal3-io/metal3-docs/pull/134))
  - Added select vendor-independent BIOS configuration options ([#63](https://github.com/metal3-io/metal3-docs/pull/63))
  - Support automatic secure boot ([#161](https://github.com/metal3-io/metal3-docs/pull/161))
  - Add boot-iso API to BareMetalHost ([#150](https://github.com/metal3-io/metal3-docs/pull/150))
  - Optimised concurrent provisioning performance and tested at scale up to 1000 hosts ([#725](https://github.com/metal3-io/baremetal-operator/pull/725), [#905](https://github.com/metal3-io/baremetal-operator/issues/905))
  - Added custom agent image controller ([#183](https://github.com/metal3-io/metal3-docs/pull/183))
  - Release process for BMO ([#1150](https://github.com/metal3-io/baremetal-operator/pull/1150))
  - HardwareData custom resource for host inspection data ([#1099](https://github.com/metal3-io/baremetal-operator/pull/1099))
  - PreprovisioningImage API and integration ([#936](https://github.com/metal3-io/baremetal-operator/pull/936))
  - New API and integration of hostFirmwareSettings and firmwareSchema resources ([#901](https://github.com/metal3-io/baremetal-operator/pull/901/), [#938](https://github.com/metal3-io/baremetal-operator/pull/938))
  - Enable RAID for Redfish-based iDRAC driver flavor ([#1095](https://github.com/metal3-io/baremetal-operator/pull/1095))
  - Reboot API implementation ([#424](https://github.com/metal3-io/baremetal-operator/pull/424))
  - Add physicalDisks and controller parameters to HardwareRAID ([#1062](https://github.com/metal3-io/baremetal-operator/pull/1062))
  - Implement explicit reboot mode options ([#795](https://github.com/metal3-io/baremetal-operator/pull/795))
  - Add live-iso support ([#759](https://github.com/metal3-io/baremetal-operator/pull/759))
  - Add support for the detached annotation ([#827](https://github.com/metal3-io/baremetal-operator/pull/827))
  - Add support for the Ironic custom deploy interface ([#884](https://github.com/metal3-io/baremetal-operator/pull/884))
- **Ip-address-manager**
  - A comprehensive tool to manage static IP address allocations in Cluster API Provider Metal3. It has its own controller, and quite a few new features have been added in this repository.
- **Improve end user documentation**
  - Added a user guide in the form of the [Metal3 Book](https://book.metal3.io/)

## Incubation Stage Requirements

**_Document that it is being used successfully in production by at least three independent direct adopters which, in the TOC's judgement, are of adequate quality and scope._**

The project's adopters list can be found [here](https://github.com/metal3-io/metal3-docs/blob/main/ADOPTERS.md).

**Companies that are using Metal3 in production:**

`IKEA IT AB`: "IKEA IT AB uses Metal3 to handle Bare Metal provisioning and lifecycle management in its CAPI-based bare metal cloud infrastructure."

`Ericsson`: "We have chosen Metal3 as a bare metal provisioner for Ericsson's Cloud Container Distribution since Metal3 is a forerunner when it comes to Kubernetes on top of bare metal servers. Besides the robust features of Metal3, a very involved community and a clearly defined roadmap have been the key factors for us to choose Metal3 as a core component."

`Red Hat`: Red Hat's OpenShift distribution includes Metal3 as part of its solution for automating the deployment of bare metal clusters.

`Fujitsu`: "As a server vendor, we are developing and using Metal3 to provide Fujitsu servers as Kubernetes bare metal nodes."

`Deutsche Telekom`: "We have various Telco applications that require bare metal infrastructure. We rely on Metal3 for provisioning since it supports layer 3 only deployments, which makes it easier to use in our complex networking setup than other options."
**Other open source projects that are utilizing Metal3:**

- [Airship](https://www.airshipit.org)
- [DT Technik "Das SCHIFF"](https://github.com/telekom/das-schiff)
- [OpenShift](https://github.com/openshift/)

In addition to the official adopters, there are projects like [Medik8s](https://www.medik8s.io/) that provide options for users to use Metal3-specific features, such as Metal3-based [remediation](https://www.medik8s.io/remediation/metal3/metal3/). Furthermore, the maintainers have noticed that many more companies are actively using Metal3, as representatives of these companies open issues, write proposals, and in general contribute to the project, but have not registered themselves as official adopters.

**_Have a healthy number of committers. A committer is defined as someone with the commit bit; i.e., someone who can accept contributions to some or all of the project._**

The following table contains both the project's approvers and reviewers combined.

| Reviewer/Approver  | GitHub ID      | Affiliation                  |
| ------------------ | -------------- | ---------------------------- |
| Andrea Fasano      | andfasano      | Red Hat                      |
| Bob Fournier       | bfournie       | Red Hat                      |
| Derek Higgins      | derekhiggins   | Red Hat                      |
| Dmitry Tantsur     | dtantsur       | Red Hat                      |
| Riccardo Pittau    | elfosardo      | Red Hat                      |
| Honza Pokorny      | honza          | Red Hat                      |
| Himanshu Roy       | hroyrh         | Red Hat                      |
| Zane Bitter        | zaneb          | Red Hat                      |
| Iury Gregory       | iurygregory    | Red Hat                      |
| Kashif Khan        | kashifest      | Ericsson Software Technology |
| Furkat Gofurov     | furkatgofurov7 | Ericsson Software Technology |
| Mohammed Boukhalfa | mboukhalfa     | Ericsson Software Technology |
| Lennart Jern       | lentzi90       | Ericsson Software Technology |
| Adil Ghaffar       | adilGhaffarDev | Ericsson Software Technology |
| Moshiur Rahman     | smoshiur1237   | Ericsson Software Technology |
| Sunnatillo Samadov | Sunnatillo     | Ericsson Software Technology |
| Adam Rozman        | Rozzii         | Ericsson Software Technology |
| Tuomo Tanskanen    | tuminoid       | Ericsson Software Technology |

The complete list of approvers across the different repositories can be found [here](https://github.com/metal3-io/metal3-docs/blob/master/maintainers/ALL-OWNERS). We have introduced the option of moving people who have become less active in the review/maintenance process to an emeritus approvers and reviewers list.

**_Demonstrate a substantial ongoing flow of commits and merged contributions._**

This [DevStats](https://metal3.devstats.cncf.io/d/8/dashboards?orgId=1&from=now-1y&to=now-1h&editPanel=2) graph shows that we have a fairly consistent number of contributions throughout the year.

In addition, we have a very high number of downloads of our container images from [quay.io](https://quay.io/). Since the download statistics are not publicly visible, we are not adding a link. As an example of our traction, the aggregated number of container image downloads from Metal3's container repository from `10-26-2022 to 11-2-2022` averaged `12193` per day, with a peak of `15755` in a single day.

![](https://raw.githubusercontent.com/Nordix/metal3-clusterapi-docs/master/2022-metal3-io-annual-review/2022-metal3-io-container-downloads.png)

A further example of the project's traction is the GitHub download statistics of Metal3-io components:

- [Cluster API Provider Metal3](https://hanadigital.github.io/grev/?user=metal3-io&repo=cluster-api-provider-metal3)
- [IP Address Manager](https://hanadigital.github.io/grev/?user=metal3-io&repo=ip-address-manager)

**_A clear versioning scheme._**

[Versioning information](https://metal3io.netlify.app/version_support.html)

TODO? Should we add here something extra?

**_Clearly documented security processes explaining how to report security issues to the project, and describing how the project provides updated releases or patches to resolve security vulnerabilities_**

The Metal3-io project's general security policy is located [here](https://metal3io.netlify.app/security_policy.html).
In addition to the project's security policy, all active repositories should have a `SECURITY_CONTACTS` file. The `SECURITY_CONTACTS` file informs contributors about:

- the project's security policy
- basic rules of security vulnerability disclosure

Optionally, the `SECURITY_CONTACTS` file may contain:

- repository-specific security fix backporting information
- repository-specific security fix release information

In addition to the general security policy and the repository-specific `SECURITY_CONTACTS` files, the project also utilises a set of automated security analysis tools, e.g. TODO

## Statement on Alignment with the CNCF Mission

As cloud native technologies become the norm for organizations to deploy and manage their applications, Metal³ aims to apply similar patterns, such as declarative APIs, to allow administrators to manage their underlying infrastructure. The Metal³ project believes in CNCF's mission to make cloud native technologies accessible for all; that is why the Metal3 project is trying to make physical infrastructure management more accessible to the cloud native ecosystem and to the worldwide open source software community.

### Alignment with Other CNCF Projects

The Metal3 stack integrates into [Kubernetes Cluster API](https://github.com/kubernetes-sigs/cluster-api), which is a CNCF project. Metal3-io's IPAM is also used by [cluster-api-provider-vsphere's](https://github.com/kubernetes-sigs/cluster-api-provider-vsphere) CI. cluster-api-provider-vsphere is a sub-project of sig-cluster-lifecycle and is a CNCF-supported project.

TODO maybe kiosk appearances should be listed?

## Future Plan

TODO

## Notes

7.9.2023 ESJ discussion:

What is the CI/CD status? Do explicit code coverage metrics exist? If not, what is the subjective adequacy of automated testing? Do different levels of tests exist (e.g. unit, integration, interface, end-to-end), or is there only partial coverage in this regard? Why?
- We have a lot of coverage and a big CI, but how is it perceived from an OSS point of view?
- We know we lack in some areas (let's list them), but we aim to improve on them (so list those as well).

TODO: we have to provide a link to our governance document (who is a maintainer, approver, reviewer, how are they selected, etc.)

TODO: we need a membership policy to be linked; example: https://github.com/kubevirt/community/blob/main/membership_policy.md

NOTE: we have no SIGs, so we can skip that part

NOTE: the release process is partially complete; we need a clear release mechanism for BMO and Ironic-image

NOTE: we have to sort out the cadence situation; we have conflicting documentation related to the cadence.

TLDR:

BMO and Ironic releasing:

- We have 2 core repos that have no clear release process (still WIP)

"Small repos":

- What should we do with the ironic-client and the ipa-downloader? Do those really need their own repo?
- How should we release ipa-downloader, mariaDB, ironic-client? [Adam] These could easily be released together with BMO or the ironic-image, as utilities with the same version number, and could be hosted in the same repo
- Maybe we should go into details about the small repos, but we should deal with them regardless
- Does anyone use ironic-agent-image??? Can we get rid of it???
Governance:

- We have conflicting cadence documentation; it has to be in one place and cover all sub-projects, even if a project deviates from the others

```
baremetal-operator/CONTRIBUTING.md
47:Baremetal Operator doesn't follow release cadence and versioning of upstream

metal3-docs/design/community/cncf-sandbox-application.adoc
46:* link:https://github.com/metal3-io/cluster-api-provider-metal3/blob/main/docs/releasing.md[cluster-api-provider-metal3] - Following the same release cadence as the cluster-api project of the Kubernetes cluster lifecycle SIG

cluster-api-provider-metal3/CONTRIBUTING.md
53:> Cluster API release cadence and versioning which follows upstream Kubernetes
```

- We have to clean up the governance and membership policy; these have to be in one place and apply to all sub-projects equally
- We suggest not expanding our social media presence; let's focus on the website (blogging) and Slack/e-mail discussions

Security:

- Let's improve our security process documentation (minor improvements)
- We can't provide a security patch release process until BMO, Ironic-image and the small repos have clear releasing and cadence processes
- No need to discuss individual security scanners and checks

Quality of code:

- Let's document code quality processes (we have linters, we use language-provided refactoring tools)
- We have TODO integrations with SonarQube and others

For graduation to consider (can go under Future Items section):

- Should we have a top-level Metal3 release? We could have this discussion (but this is more for graduation); each component should be releasable on its own, but we should provide a full stack release