# KEP-NNNN DNSClass: Configurable DNS Settings for Pods
<!-- toc -->
- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Proposal](#proposal)
- [User Stories (Optional)](#user-stories-optional)
- [Story 1](#story-1)
- [Story 2](#story-2)
- [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
- [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
- [Test Plan](#test-plan)
- [Prerequisite testing updates](#prerequisite-testing-updates)
- [Unit tests](#unit-tests)
- [Integration tests](#integration-tests)
- [e2e tests](#e2e-tests)
- [Graduation Criteria](#graduation-criteria)
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
- [Version Skew Strategy](#version-skew-strategy)
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
- [Feature Enablement and Rollback](#feature-enablement-and-rollback)
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
- [Monitoring Requirements](#monitoring-requirements)
- [Dependencies](#dependencies)
- [Scalability](#scalability)
- [Troubleshooting](#troubleshooting)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
<!-- /toc -->
## Release Signoff Checklist
Items marked with (R) are required *prior to targeting to a milestone / release*.
- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
- [ ] (R) Design details are appropriately documented
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- [ ] e2e Tests for all Beta API Operations (endpoints)
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
- [ ] (R) Graduation criteria is in place
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [ ] (R) Production readiness review completed
- [ ] (R) Production readiness review approved
- [ ] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
## Summary
DNSClass is a cluster-scoped resource that lets users configure the DNS settings for Pods in a Kubernetes cluster, overriding the hard-coded defaults applied by the ClusterFirst and ClusterFirstWithHostNet DNS policies. It provides a way to change those defaults without requiring additional resources such as MutatingAdmissionWebhooks or modifying Pod manifest files.
## Motivation
The current default DNS settings for Pods in Kubernetes are hard-coded and not configurable at the cluster level. This limits cluster administrators who need to customize Pod DNS settings for their environment. By introducing DNSClass as a configurable, cluster-wide resource, administrators gain the flexibility to configure DNS settings for their Pods without resorting to workarounds.
### Goals
- Provide a cluster-wide mechanism to change the default values of ClusterFirst and ClusterFirstWithHostNet.
- Reduce the complexity and extra effort for users who need to customize DNS settings for Pods.
### Non-Goals
- Improving DNS performance.
- Modifying the existing behavior of DNSConfig in the Pod manifest when the `None` policy is selected.
## Proposal
The proposal involves introducing a new API resource called DNSClass in the node.k8s.io API group. DNSClass will have a Spec field called DNSConfig, which will contain the desired DNS configuration for Pods. Pods can reference a DNSClass by specifying the DNSClassName in their PodSpec.
Example DNSClass resource definition:
```yaml
kind: DNSClass
apiVersion: node.k8s.io/v1alpha1
metadata:
  annotations:
    dnsclass.kubernetes.io/is-default-class: "true"
  name: standard
spec:
  dnsConfig:
    nameservers:
      - 192.0.2.1
    searches:
      - ns1.svc.cluster-domain.example
      - my.dns.search.suffix
    options:
      - name: ndots
        value: "2"
      - name: edns0
```
Example usage for DNSClassName in Pod manifest:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      dnsClassName: standard
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
```
When the Pod is created, the DNS settings specified in the referenced DNSClass will be applied to the Pod's `/etc/resolv.conf` file.
```bash
nameserver 192.0.2.1
search ns1.svc.cluster-domain.example my.dns.search.suffix
options ndots:2 edns0
```
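To make the mapping from `dnsConfig` to the rendered file concrete, here is a minimal sketch of how a DNSClass's `dnsConfig` could be turned into `/etc/resolv.conf` content. This is not the actual kubelet implementation; `renderResolvConf` and the local type definitions are illustrative stand-ins that mirror the types proposed later in Design Details.
```go
package main

import (
	"fmt"
	"strings"
)

// DNSConfigOption and DNSConfig mirror the types proposed in Design Details.
type DNSConfigOption struct {
	Name  string
	Value string
}

type DNSConfig struct {
	Nameservers []string
	Searches    []string
	Options     []DNSConfigOption
}

// renderResolvConf is a hypothetical helper showing how an implementation
// might turn a DNSClass's dnsConfig into resolv.conf content.
func renderResolvConf(c DNSConfig) string {
	var b strings.Builder
	for _, ns := range c.Nameservers {
		fmt.Fprintf(&b, "nameserver %s\n", ns)
	}
	if len(c.Searches) > 0 {
		fmt.Fprintf(&b, "search %s\n", strings.Join(c.Searches, " "))
	}
	if len(c.Options) > 0 {
		opts := make([]string, 0, len(c.Options))
		for _, o := range c.Options {
			if o.Value != "" {
				opts = append(opts, o.Name+":"+o.Value)
			} else {
				opts = append(opts, o.Name)
			}
		}
		fmt.Fprintf(&b, "options %s\n", strings.Join(opts, " "))
	}
	return b.String()
}

func main() {
	// The "standard" DNSClass from the example above.
	cfg := DNSConfig{
		Nameservers: []string{"192.0.2.1"},
		Searches:    []string{"ns1.svc.cluster-domain.example", "my.dns.search.suffix"},
		Options:     []DNSConfigOption{{Name: "ndots", Value: "2"}, {Name: "edns0"}},
	}
	fmt.Print(renderResolvConf(cfg))
}
```
Running this against the `standard` DNSClass above produces exactly the three lines shown.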
Users can also mark a DNSClass as the default class, or remove that marking, using annotations, which makes it easy to change the cluster-wide default DNS settings.
The DNSClass example above carries the annotation that marks it as the default. Example Pod manifest without a dnsClassName:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
```
The resulting `/etc/resolv.conf` will be the same as in the example above.
```bash
nameserver 192.0.2.1
search ns1.svc.cluster-domain.example my.dns.search.suffix
options ndots:2 edns0
```
Mark the default DNSClass as non-default:
```bash
kubectl patch dnsclass standard -p '{"metadata": {"annotations":{"dnsclass.kubernetes.io/is-default-class":"false"}}}'
```
Mark a DNSClass as default:
```bash
kubectl patch dnsclass gold -p '{"metadata": {"annotations":{"dnsclass.kubernetes.io/is-default-class":"true"}}}'
```
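The default-class annotation is expected to behave much like the analogous `storageclass.kubernetes.io/is-default-class` annotation on StorageClass. Below is a minimal sketch of the selection logic, assuming that at most one DNSClass should be marked as default; the exact semantics when several are marked would be settled during implementation, and the types and helper here are illustrative stand-ins rather than proposed code.
```go
package main

import "fmt"

// IsDefaultDNSClassAnnotation is the annotation proposed for marking a DNSClass as default.
const IsDefaultDNSClassAnnotation = "dnsclass.kubernetes.io/is-default-class"

// DNSClass is a trimmed-down stand-in for the proposed API type.
type DNSClass struct {
	Name        string
	Annotations map[string]string
}

// defaultDNSClass returns the DNSClass annotated as default, if exactly one exists.
func defaultDNSClass(classes []DNSClass) (*DNSClass, error) {
	var defaults []*DNSClass
	for i := range classes {
		if classes[i].Annotations[IsDefaultDNSClassAnnotation] == "true" {
			defaults = append(defaults, &classes[i])
		}
	}
	switch len(defaults) {
	case 0:
		return nil, nil // no default; Pods keep the built-in ClusterFirst behavior
	case 1:
		return defaults[0], nil
	default:
		return nil, fmt.Errorf("%d DNSClasses are marked as default; expected at most one", len(defaults))
	}
}

func main() {
	classes := []DNSClass{
		{Name: "standard", Annotations: map[string]string{IsDefaultDNSClassAnnotation: "false"}},
		{Name: "gold", Annotations: map[string]string{IsDefaultDNSClassAnnotation: "true"}},
	}
	if def, err := defaultDNSClass(classes); err == nil && def != nil {
		fmt.Println("default DNSClass:", def.Name) // prints "gold" after the patches above
	}
}
```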
### User Stories (Optional)
#### Story 1
As a cluster administrator, I want to change the cluster's DNS configuration without deploying additional resources such as a MutatingAdmissionWebhook and without expecting users to add DNS configuration to their manifest files. The process of changing the DNS configuration should be straightforward, allowing administrators to make the necessary updates with minimal effort.
#### Story 2
As a cluster administrator, I want to optimize DNS performance, for example by redefining the search list and ndots. Today I am limited by the fact that the configuration behind the ClusterFirst and ClusterFirstWithHostNet DNS policies can only be overridden within each Pod manifest.
### Notes/Constraints/Caveats (Optional)
- DNSClass will only be effective for ClusterFirst and ClusterFirstWithHostNet policies, as other policies already support user-based configuration.
- Changing a DNSClass will not update the DNS settings of existing Pods until they are rescheduled; automatically restarting hundreds or thousands of Pods is deliberately avoided.
- DNSClass is a cluster-wide resource and is not namespaced. It can be used to configure DNS settings for all Pods in the cluster.
- A DNSClass can be marked as the default class, or unmarked, via annotation, which lets users control which DNSClass is applied to Pods that do not specify a DNSClassName.
### Risks and Mitigations
1. Risk: Changing the default DNSClass may impact existing Pods that do not specify a DNSClassName, as they will pick up the new default DNSClass configuration when they are recreated.
   Mitigation: Cluster administrators should carefully review and test changes to the default DNSClass before rolling them out to production clusters. They should also communicate any changes to users and document how to update Pods that rely on the default DNSClass.
2. Risk: Pods are not rescheduled after a DNSClass change, so the new DNS configuration takes effect only gradually.
   Mitigation: Cluster administrators should plan Pod rescheduling and communicate the expected delay to users. They should also monitor the cluster for Pods that are not yet using the updated DNS configuration and reschedule them as needed.
3. Risk: Incorrect DNSClass configuration may cause DNS resolution issues or other networking problems in the cluster.
   Mitigation: Cluster administrators should thoroughly test a DNSClass configuration before applying it to production clusters, closely monitor the cluster after applying changes, and be prepared to roll back or adjust if issues arise.
4. Risk: Misconfigured DNS settings may break DNS resolution for Pods.
   Mitigation: Documentation and guidelines should be provided on how to configure DNS settings correctly.
5. Risk: Introducing a new API resource may require changes to existing tools and libraries that interact with the Kubernetes API.
   Mitigation: DNSClass follows standard Kubernetes API conventions, so generic clients and tooling can discover it through the usual mechanisms; only tools that want to act on DNSClass need updates, and the feature is introduced behind a feature gate so adoption can be gradual.
## Design Details
The DNSClass API will be introduced as a new resource in the node.k8s.io API group. The API will consist of the following fields:
- `metadata`: Metadata for the DNSClass, including name and labels.
- `spec`: The DNS configuration, consisting of the following fields:
- `dnsConfig`: The DNS settings to be applied to Pods, including nameservers, search domains, and options.
Proposed Go type definitions:
```go
type DNSClassSpec struct {
	DNSConfig DNSConfig `json:"dnsConfig"`
}

type DNSClass struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec DNSClassSpec `json:"spec,omitempty"`
}

type DNSConfig struct {
	Nameservers []string          `json:"nameservers,omitempty"`
	Searches    []string          `json:"searches,omitempty"`
	Options     []DNSConfigOption `json:"options,omitempty"`
}

type DNSConfigOption struct {
	Name  string `json:"name,omitempty"`
	Value string `json:"value,omitempty"`
}
```
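The KEP does not yet spell out validation rules, but a sketch helps illustrate the kinds of constraints an implementation might enforce on `spec.dnsConfig`. The limits and checks below are assumptions for illustration only (the core API applies similar bounds to `PodDNSConfig`, e.g. at most 3 nameservers); the actual rules would be settled during API review.
```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// Assumed limits for illustration; the exact values for DNSClass would be
// decided during API review.
const (
	maxNameservers = 3
	maxSearchPaths = 32
)

type DNSConfigOption struct {
	Name  string
	Value string
}

type DNSConfig struct {
	Nameservers []string
	Searches    []string
	Options     []DNSConfigOption
}

// validateDNSConfig sketches the kind of validation an implementation might
// perform on a DNSClass's spec.dnsConfig.
func validateDNSConfig(c DNSConfig) []error {
	var errs []error
	if len(c.Nameservers) > maxNameservers {
		errs = append(errs, fmt.Errorf("spec.dnsConfig.nameservers: must not exceed %d entries", maxNameservers))
	}
	for _, ns := range c.Nameservers {
		if net.ParseIP(ns) == nil {
			errs = append(errs, fmt.Errorf("spec.dnsConfig.nameservers: %q is not a valid IP address", ns))
		}
	}
	if len(c.Searches) > maxSearchPaths {
		errs = append(errs, fmt.Errorf("spec.dnsConfig.searches: must not exceed %d entries", maxSearchPaths))
	}
	for _, o := range c.Options {
		if strings.TrimSpace(o.Name) == "" {
			errs = append(errs, fmt.Errorf("spec.dnsConfig.options: option name must not be empty"))
		}
	}
	return errs
}

func main() {
	bad := DNSConfig{Nameservers: []string{"not-an-ip"}}
	for _, err := range validateDNSConfig(bad) {
		fmt.Println(err)
	}
}
```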
Proposed addition to `PodSpec`:
```go
// PodSpec is a description of a Pod.
type PodSpec struct {
	// ...

	// DNSClassName references the DNSClass whose DNS configuration is applied
	// to this Pod when its DNS policy is ClusterFirst or ClusterFirstWithHostNet.
	DNSClassName string `json:"dnsClassName,omitempty"`
	// ...
}
```
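To make the selection order concrete, below is a minimal sketch of how the effective DNSClass could be chosen for a Pod, following the rules stated in this proposal: only ClusterFirst and ClusterFirstWithHostNet Pods are affected, an explicit `dnsClassName` wins, and otherwise the cluster default (if any) applies. The types and the `resolveDNSClass` helper are illustrative stand-ins, not the actual kubelet or apiserver code.
```go
package main

import "fmt"

type DNSPolicy string

const (
	DNSClusterFirst            DNSPolicy = "ClusterFirst"
	DNSClusterFirstWithHostNet DNSPolicy = "ClusterFirstWithHostNet"
)

// DNSClass is a trimmed-down stand-in for the proposed API type.
type DNSClass struct {
	Name      string
	IsDefault bool
}

// podDNS holds the only PodSpec fields this sketch needs.
type podDNS struct {
	DNSPolicy    DNSPolicy
	DNSClassName string
}

// resolveDNSClass picks the DNSClass whose dnsConfig should be applied to a Pod.
// A nil result means the Pod keeps today's built-in behavior.
func resolveDNSClass(pod podDNS, byName map[string]*DNSClass, clusterDefault *DNSClass) (*DNSClass, error) {
	// Other policies (Default, None) already support user-based configuration
	// and are out of scope for DNSClass.
	if pod.DNSPolicy != DNSClusterFirst && pod.DNSPolicy != DNSClusterFirstWithHostNet {
		return nil, nil
	}
	// An explicit reference always wins.
	if pod.DNSClassName != "" {
		c, ok := byName[pod.DNSClassName]
		if !ok {
			return nil, fmt.Errorf("DNSClass %q not found", pod.DNSClassName)
		}
		return c, nil
	}
	// Fall back to the cluster default, if one is marked.
	return clusterDefault, nil
}

func main() {
	standard := &DNSClass{Name: "standard", IsDefault: true}
	gold := &DNSClass{Name: "gold"}
	byName := map[string]*DNSClass{"standard": standard, "gold": gold}

	explicit, _ := resolveDNSClass(podDNS{DNSPolicy: DNSClusterFirst, DNSClassName: "gold"}, byName, standard)
	implicit, _ := resolveDNSClass(podDNS{DNSPolicy: DNSClusterFirst}, byName, standard)
	fmt.Println(explicit.Name, implicit.Name) // gold standard
}
```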
### Test Plan
- [x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough
prior to committing the changes necessary to implement this enhancement.
##### Prerequisite testing updates
N/A
##### Unit tests
- Add unit tests for the DNSClass API resource.
- Add unit tests to verify the behavior of the default DNSClass.
- Add unit tests to verify the behavior of DNSClassName when specified in the PodSpec.
- Add unit tests to verify the behavior of DNSClassName when changed in the PodSpec.
##### Integration tests
N/A
##### e2e tests
- Add e2e tests to verify the behavior of the default DNSClass.
- Add e2e tests to verify the behavior of DNSClassName with different PodDNSConfig configurations.
- Add e2e tests to verify the behavior of DNSClassName when changed in the PodSpec.
### Graduation Criteria
The DNSClass feature can follow the standard Kubernetes graduation criteria for new features, including alpha, beta, and stable phases. The graduation criteria can be defined as follows:
- Alpha:
- DNSClass API is introduced and available for testing.
- Basic functionality, including DNS configuration for Pods and support for default/non-default DNS classes, is implemented.
- Documentation and examples are provided.
- Unit tests for DNSClass API are implemented and passing.
- Beta:
- DNSClass feature is thoroughly tested and validated in real-world scenarios.
- Feedback from users and the community is gathered and addressed.
- Comprehensive documentation, including user guides and best practices, is provided.
- Integration tests for DNSClass feature in a Kubernetes cluster are implemented and passing.
- Stable:
- DNSClass feature is considered stable and ready for production use.
- All known issues and bugs are addressed.
- Extensive testing, including performance and scalability tests, is performed.
- The feature has been used in production environments successfully for a significant period of time.
- The feature is well-documented, including examples, troubleshooting guides, and upgrade instructions.
### Upgrade / Downgrade Strategy
This feature is gated behind a feature gate, so explicit opt-in is necessary during upgrades and explicit opt-out during downgrades.
When upgrading to a Kubernetes version that includes the DNSClass feature, existing Pods continue to use the built-in default DNS settings unless a dnsClassName is specified in the PodSpec or a default DNSClass is defined. If a dnsClassName is specified, the Pod picks up the updated DNS settings once it is rescheduled.
When downgrading to a Kubernetes version that does not include the DNSClass feature, Pods that reference a DNSClass in their PodSpec keep their custom DNS settings until they are restarted. New Pods created after the downgrade will not have the dnsClassName field available in the PodSpec.
### Version Skew Strategy
N/A
## Production Readiness Review Questionnaire
### Feature Enablement and Rollback
###### How can this feature be enabled / disabled in a live cluster?
- [ ] Feature gate (also fill in values in `kep.yaml`)
- Feature gate name:
- Components depending on the feature gate:
- [ ] Other
- Describe the mechanism:
- Will enabling / disabling the feature require downtime of the control
plane?
- Will enabling / disabling the feature require downtime or reprovisioning
of a node?
###### Does enabling the feature change any default behavior?
No
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
Yes, through feature gates.
###### What happens if we reenable the feature if it was previously rolled back?
###### Are there any tests for feature enablement/disablement?
### Rollout, Upgrade and Rollback Planning
<!--
This section must be completed when targeting beta to a release.
-->
###### How can a rollout or rollback fail? Can it impact already running workloads?
###### What specific metrics should inform a rollback?
<!--
What signals should users be paying attention to when the feature is young
that might indicate a serious problem?
-->
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
<!--
Describe manual testing that was done and the outcomes.
Longer term, we may want to require automated upgrade/rollback tests, but we
are missing a bunch of machinery and tooling and can't do that now.
-->
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
<!--
Even if applying deprecation policies, they may still surprise some users.
-->
### Monitoring Requirements
<!--
This section must be completed when targeting beta to a release.
For GA, this section is required: approvers should be able to confirm the
previous answers based on experience in the field.
-->
###### How can an operator determine if the feature is in use by workloads?
<!--
Ideally, this should be a metric. Operations against the Kubernetes API (e.g.,
checking if there are objects with field X set) may be a last resort. Avoid
logs or events for this purpose.
-->
###### How can someone using this feature know that it is working for their instance?
<!--
For instance, if this is a pod-related feature, it should be possible to determine if the feature is functioning properly
for each individual pod.
Pick one more of these and delete the rest.
Please describe all items visible to end users below with sufficient detail so that they can verify correct enablement
and operation of this feature.
Recall that end users cannot usually observe component logs or access metrics.
-->
- [ ] Events
- Event Reason:
- [ ] API .status
- Condition name:
- Other field:
- [ ] Other (treat as last resort)
- Details:
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
<!--
This is your opportunity to define what "normal" quality of service looks like
for a feature.
It's impossible to provide comprehensive guidance, but at the very
high level (needs more precise definitions) those may be things like:
- per-day percentage of API calls finishing with 5XX errors <= 1%
- 99% percentile over day of absolute value from (job creation time minus expected
job creation time) for cron job <= 10%
- 99.9% of /health requests per day finish with 200 code
These goals will help you determine what you need to measure (SLIs) in the next
question.
-->
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
<!--
Pick one more of these and delete the rest.
-->
- [ ] Metrics
- Metric name:
- [Optional] Aggregation method:
- Components exposing the metric:
- [ ] Other (treat as last resort)
- Details:
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
<!--
Describe the metrics themselves and the reasons why they weren't added (e.g., cost,
implementation difficulties, etc.).
-->
### Dependencies
<!--
This section must be completed when targeting beta to a release.
-->
###### Does this feature depend on any specific services running in the cluster?
<!--
Think about both cluster-level services (e.g. metrics-server) as well
as node-level agents (e.g. specific version of CRI). Focus on external or
optional services that are needed. For example, if this feature depends on
a cloud provider API, or upon an external software-defined storage or network
control plane.
For each of these, fill in the following—thinking about running existing user workloads
and creating new ones, as well as about cluster-level services (e.g. DNS):
- [Dependency name]
- Usage description:
- Impact of its outage on the feature:
- Impact of its degraded performance or high-error rates on the feature:
-->
### Scalability
<!--
For alpha, this section is encouraged: reviewers should consider these questions
and attempt to answer them.
For beta, this section is required: reviewers must answer these questions.
For GA, this section is required: approvers should be able to confirm the
previous answers based on experience in the field.
-->
###### Will enabling / using this feature result in any new API calls?
<!--
Describe them, providing:
- API call type (e.g. PATCH pods)
- estimated throughput
- originating component(s) (e.g. Kubelet, Feature-X-controller)
Focusing mostly on:
- components listing and/or watching resources they didn't before
- API calls that may be triggered by changes of some Kubernetes resources
(e.g. update of object X triggers new updates of object Y)
- periodic API calls to reconcile state (e.g. periodic fetching state,
heartbeats, leader election, etc.)
-->
###### Will enabling / using this feature result in introducing new API types?
<!--
Describe them, providing:
- API type
- Supported number of objects per cluster
- Supported number of objects per namespace (for namespace-scoped objects)
-->
###### Will enabling / using this feature result in any new calls to the cloud provider?
<!--
Describe them, providing:
- Which API(s):
- Estimated increase:
-->
###### Will enabling / using this feature result in increasing size or count of the existing API objects?
<!--
Describe them, providing:
- API type(s):
- Estimated increase in size: (e.g., new annotation of size 32B)
- Estimated amount of new objects: (e.g., new Object X for every existing Pod)
-->
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
<!--
Look at the [existing SLIs/SLOs].
Think about adding additional work or introducing new steps in between
(e.g. need to do X to start a container), etc. Please describe the details.
[existing SLIs/SLOs]: https://git.k8s.io/community/sig-scalability/slos/slos.md#kubernetes-slisslos
-->
###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
<!--
Things to keep in mind include: additional in-memory state, additional
non-trivial computations, excessive access to disks (including increased log
volume), significant amount of data sent and/or received over network, etc.
Think through this both in small and large cases, again with respect to the
[supported limits].
[supported limits]: https://git.k8s.io/community//sig-scalability/configs-and-limits/thresholds.md
-->
###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
<!--
Focus not just on happy cases, but primarily on more pathological cases
(e.g. probes taking a minute instead of milliseconds, failed pods consuming resources, etc.).
If any of the resources can be exhausted, how this is mitigated with the existing limits
(e.g. pods per node) or new limits added by this KEP?
Are there any tests that were run/should be run to understand performance characteristics better
and validate the declared limits?
-->
### Troubleshooting
<!--
This section must be completed when targeting beta to a release.
For GA, this section is required: approvers should be able to confirm the
previous answers based on experience in the field.
The Troubleshooting section currently serves the `Playbook` role. We may consider
splitting it into a dedicated `Playbook` document (potentially with some monitoring
details). For now, we leave it here.
-->
###### How does this feature react if the API server and/or etcd is unavailable?
###### What are other known failure modes?
<!--
For each of them, fill in the following information by copying the below template:
- [Failure mode brief description]
- Detection: How can it be detected via metrics? Stated another way:
how can an operator troubleshoot without logging into a master or worker node?
- Mitigations: What can be done to stop the bleeding, especially for already
running user workloads?
- Diagnostics: What are the useful log messages and their required logging
levels that could help debug the issue?
Not required until feature graduated to beta.
- Testing: Are there any tests for failure mode? If not, describe why.
-->
###### What steps should be taken if SLOs are not being met to determine the problem?
## Implementation History
## Drawbacks
## Alternatives
Using a webhook that mutates the [DNS Config](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-dns-config) of existing Pods at admission time.
## Infrastructure Needed (Optional)