This KEP proposes enhancing kubeadm to use etcd's learner mode, which was introduced in etcd 3.4. [The release notes for etcd 3.4](https://etcd.io/docs/v3.3/learning/learner/#features-in-v34) suggest a number of benefits of using this method. The proposal aims to add the new mode behind a standard kubeadm / Kubernetes feature gate that graduates over a period of one year or more, while collecting feedback from all kubeadm users.

## Motivation

Kubeadm currently adds all members in the "old way" that etcd supported, that is, it adds them as voting members from the beginning. If added as learners instead, such members would not disrupt the cluster quorum if they end up being faulty. The "old way" has proven problematic in cases where kubeadm attempts to add an etcd cluster member from a control plane node running on slower infrastructure. In such cases users have to intervene manually and remove the faulty member, using tools such as etcdctl.

### Goals

- Add a new code path in kubeadm that can be used to deploy etcd with learner mode enabled.
- Use a new feature gate EtcdLearnerMode that can be used to toggle the feature until graduation to GA.
- Deprecate and remove the "old way" of adding members

### Non-Goals

- Support both the "old way" and "learner mode" in kubeadm as a toggle in the kubeadm API. Ideally we should support only a single, stable, community approved code path.

## Proposal

### User Stories (Optional)

#### Story 1

As a kubeadm user, I wish that my HA cluster is more resilient to etcd member failures during addition of new members at cluster bring up time due to slow infrastructure.

#### Story 2

As a kubeadm user, I wish that my HA cluster is constructed following the recommendation by etcd maintainers and using the latest features - i.e. to use learner mode instead of adding all new members as voting.

### Notes/Constraints/Caveats (Optional)

### Risks and Mitigations

#### Risk: insufficient testing by kubeadm users

Once the new code path is added and the logic is controlled by a feature gate, the feature gate will be in Alpha state, i.e. disabled by default. Even if e2e tests are added, we need to notify users that we are making this important change to etcd and ask them to start testing it as soon as possible during Alpha, but not in production.

##### Mitigation

Notify users on all possible communication channels: Slack, mailing list, Reddit, Twitter, etc. Keep an umbrella issue as a place for discussion and user feedback. Attempt to gather feedback from parties that build products on top of kubeadm.

#### Risk: unstable implementation of learner mode

Once the new feature is added we need to test the stability of the new code path.

## Design Details

Currently most of the logic of stacked etcd member support in kubeadm is centralized around a couple of files in the source code. These files contain the logic that wraps the etcd client and the logic for maintaining the static pod manifest of the etcd server instance. With the introduction of the new feature gate EtcdLearnerMode, a new code path must be created. Preferably, the number of "if EtcdLearnerMode" branches in the code should be minimized.

Kubeadm currently has some sensitive timeouts while adding etcd members the "old way". Waiting for learners to become voting members would require some modifications in kubeadm in terms of how we wait for a member to be added. Some details can be found in the [official etcd documentation](https://etcd.io/docs/v3.3/learning/learner/#features-in-v34).

### Test Plan

[x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.

##### Prerequisite testing updates

##### Unit tests

New unit tests must be added for all code paths that use the EtcdLearnerMode feature gate. Once the feature graduates to GA, these unit tests must be merged as part of the default unit tests for testing the kubeadm "stacked etcd" logic.

##### Integration tests

N/A

##### e2e tests

A new e2e test must be added as part of the [kubeadm dashboard](https://k8s-testgrid.appspot.com/sig-cluster-lifecycle-kubeadm). All tests in this dashboard use the [kinder](https://github.com/kubernetes/kubeadm/tree/main/kinder) tool.

- During Alpha (disabled by default): add a new e2e test that enables the feature gate EtcdLearnerMode
- During Beta (enabled by default): modify the e2e test to test the feature gate EtcdLearnerMode as disabled
- During GA (locked to enabled): remove the e2e test as the logic will be exercised in all existing kubeadm e2e tests

### Graduation Criteria

#### Alpha

- Feature implemented behind the feature gate EtcdLearnerMode
- Initial unit and e2e tests completed and enabled
- [Document the feature gate](https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/#feature-gates).

#### Beta

- Gather feedback from developers and surveys
- Make unit and e2e test changes
- Update the feature gate documentation

#### GA

- Gather feedback from developers and surveys
- Update unit tests
- Remove e2e tests as this will be the only code path for adding etcd members and it will be tested by all existing kubeadm e2e tests
- Update the feature gate documentation

### Upgrade / Downgrade Strategy

- N/A -> Alpha: users can patch their `ClusterConfiguration` in the `kube-system/kubeadm-config` ConfigMap to enable the feature gate before calling `kubeadm upgrade apply`. This will allow them to enable learner mode in case they wish to add more etcd members to this cluster. This scenario is expected to be rare, because users usually maintain a stable control plane with 3 or more members before upgrading it. But it is still plausible and can be documented in the feature gate documentation.

- Alpha -> Beta: similarly to the previous stage users can modify the `ClusterConfiguration` to disable the feature gate during upgrade. This will allow them to use the "old way", in case they wish to add more etcd members to the cluster while the feature gate is enabled by default.

- Beta -> GA: users can no longer patch the `ClusterConfiguration` to opt out of the feature; the feature gate is locked to its default (enabled).

### Version Skew Strategy

One important point is that kubeadm must handle the case where the user has locked their etcd server version to a version < 3.4. In that case they must get a sensible error along the lines of "etcd learner mode is not supported by this etcd version", and the initialization of the control plane with stacked etcd should fail.

All etcd versions that are >= 3.4 should be treated as supported by the EtcdLearnerMode feature gate.
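A minimal sketch of such a version check follows. It assumes that 3.4 itself, the release that introduced learner mode, counts as supported, and that comparing only the major and minor components is sufficient. `supportsLearnerMode` and `checkLearnerMode` are hypothetical helper names, not existing kubeadm functions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// supportsLearnerMode reports whether an etcd version string such as
// "3.5.4" or "v3.4.0" is at least 3.4. Only major and minor are inspected.
func supportsLearnerMode(version string) (bool, error) {
	parts := strings.SplitN(strings.TrimPrefix(version, "v"), ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("cannot parse etcd version %q", version)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("cannot parse etcd version %q: %w", version, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return false, fmt.Errorf("cannot parse etcd version %q: %w", version, err)
	}
	return major > 3 || (major == 3 && minor >= 4), nil
}

// checkLearnerMode is a hypothetical check: it fails only when the feature
// gate is enabled and the pinned etcd version predates learner mode.
func checkLearnerMode(gateEnabled bool, etcdVersion string) error {
	if !gateEnabled {
		return nil
	}
	ok, err := supportsLearnerMode(etcdVersion)
	if err != nil {
		return err
	}
	if !ok {
		return fmt.Errorf("etcd learner mode is not supported by this etcd version (%s)", etcdVersion)
	}
	return nil
}

func main() {
	for _, v := range []string{"3.3.17", "3.4.0", "3.5.4"} {
		fmt.Printf("%s: %v\n", v, checkLearnerMode(true, v))
	}
}
```

Running the check before any member is added would let control plane initialization fail early with the intended message instead of failing partway through the join.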

If EtcdLearnerMode goes GA, but the user prefers to stay on etcd version < 3.4, their existing cluster will continue to work but they will not be able to add new stacked etcd members. For new clusters the combination of EtcdLearnerMode (GA) and etcd version < 3.4 will not be supported.

## Production Readiness Review Questionnaire

kubeadm is considered an "out of tree" component and PRR is out of scope.

## Implementation History

- 2022-05-10: KEP draft created

## Drawbacks

Implementing EtcdLearnerMode and enabling it by default carries a number of stability risks. The "old way" has been tested for years and used by many users. By modifying this code path we introduce the potential for user complaints about HA cluster creation and maintenance with kubeadm. Sufficient testing and gathering of user feedback are mandatory.

## Alternatives

N/A

## Infrastructure Needed (Optional)

N/A