# KEP-NNNN DNSClass: Configurable DNS Settings for Pods
<!-- toc -->
- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Proposal](#proposal)
- [User Stories (Optional)](#user-stories-optional)
- [Story 1](#story-1)
- [Story 2](#story-2)
- [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
- [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
- [Test Plan](#test-plan)
- [Prerequisite testing updates](#prerequisite-testing-updates)
- [Unit tests](#unit-tests)
- [Integration tests](#integration-tests)
- [e2e tests](#e2e-tests)
- [Graduation Criteria](#graduation-criteria)
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
- [Version Skew Strategy](#version-skew-strategy)
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
- [Feature Enablement and Rollback](#feature-enablement-and-rollback)
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
- [Monitoring Requirements](#monitoring-requirements)
- [Dependencies](#dependencies)
- [Scalability](#scalability)
- [Troubleshooting](#troubleshooting)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
<!-- /toc -->
## Release Signoff Checklist
Items marked with (R) are required *prior to targeting to a milestone / release*.
- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
- [ ] (R) Design details are appropriately documented
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- [ ] e2e Tests for all Beta API Operations (endpoints)
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
- [ ] (R) Graduation criteria is in place
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [ ] (R) Production readiness review completed
- [ ] (R) Production readiness review approved
- [ ] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
## Summary
DNSClass is a cluster-scoped resource that lets users configure the DNS settings for Pods in a Kubernetes cluster, overriding the hard-coded defaults applied by the ClusterFirst and ClusterFirstWithHostNet DNS policies. It provides a way to change those defaults without requiring additional resources such as MutatingAdmissionWebhooks or modifying Pod manifest files.
## Motivation
The current default DNS settings for Pods in Kubernetes are hard-coded and not configurable at the cluster level. This limits cluster administrators who need to customize Pod DNS settings for their environment. By introducing DNSClass as a configurable, cluster-wide resource, administrators gain the flexibility to configure DNS settings for their Pods without resorting to workarounds.
### Goals
- Provide a cluster-wide mechanism to change the default values of ClusterFirst and ClusterFirstWithHostNet.
- Reduce the complexity and extra effort for users who need to customize DNS settings for Pods.
### Non-Goals
- Improving DNS performance.
- Modifying the existing behavior of DNSConfig in the Pod manifest when the `None` policy is selected.
## Proposal
The proposal involves introducing a new API resource called DNSClass in the node.k8s.io API group. DNSClass will have a Spec field called DNSConfig, which will contain the desired DNS configuration for Pods. Pods can reference a DNSClass by specifying the DNSClassName in their PodSpec.
Example DNSClass resource definition:
```yaml
kind: DNSClass
apiVersion: node.k8s.io/v1alpha1
metadata:
  annotations:
    dnsclass.kubernetes.io/is-default-class: "true"
  name: standard
spec:
  dnsConfig:
    nameservers:
      - 192.0.2.1
    searches:
      - ns1.svc.cluster-domain.example
      - my.dns.search.suffix
    options:
      - name: ndots
        value: "2"
      - name: edns0
```
Example usage for DNSClassName in Pod manifest:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      dnsClassName: standard
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
```
When the Pod is created, the DNS settings specified in the referenced DNSClass will be applied to the Pod's `/etc/resolv.conf` file.
```bash
nameserver 192.0.2.1
search ns1.svc.cluster-domain.example my.dns.search.suffix
options ndots:2 edns0
```
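To make the mapping from `dnsConfig` to the rendered file concrete, here is a minimal sketch of how a DNSClass's `dnsConfig` could be turned into `/etc/resolv.conf` content. This is not the actual kubelet implementation; `renderResolvConf` and the local type definitions are illustrative stand-ins that mirror the types proposed later in Design Details.
```go
package main

import (
	"fmt"
	"strings"
)

// DNSConfigOption and DNSConfig mirror the types proposed in Design Details.
type DNSConfigOption struct {
	Name  string
	Value string
}

type DNSConfig struct {
	Nameservers []string
	Searches    []string
	Options     []DNSConfigOption
}

// renderResolvConf is a hypothetical helper showing how an implementation
// might turn a DNSClass's dnsConfig into resolv.conf content.
func renderResolvConf(c DNSConfig) string {
	var b strings.Builder
	for _, ns := range c.Nameservers {
		fmt.Fprintf(&b, "nameserver %s\n", ns)
	}
	if len(c.Searches) > 0 {
		fmt.Fprintf(&b, "search %s\n", strings.Join(c.Searches, " "))
	}
	if len(c.Options) > 0 {
		opts := make([]string, 0, len(c.Options))
		for _, o := range c.Options {
			if o.Value != "" {
				opts = append(opts, o.Name+":"+o.Value)
			} else {
				opts = append(opts, o.Name)
			}
		}
		fmt.Fprintf(&b, "options %s\n", strings.Join(opts, " "))
	}
	return b.String()
}

func main() {
	// The "standard" DNSClass from the example above.
	cfg := DNSConfig{
		Nameservers: []string{"192.0.2.1"},
		Searches:    []string{"ns1.svc.cluster-domain.example", "my.dns.search.suffix"},
		Options:     []DNSConfigOption{{Name: "ndots", Value: "2"}, {Name: "edns0"}},
	}
	fmt.Print(renderResolvConf(cfg))
}
```
Running this against the `standard` DNSClass above produces exactly the three lines shown.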
Users can also mark a DNSClass as the default class, or remove that marking, using annotations, which makes it easy to change the cluster-wide default DNS settings.
The DNSClass example above carries the annotation that marks it as the default. Example Pod manifest without a dnsClassName:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
```
The resulting `/etc/resolv.conf` will be the same as in the example above.
```bash
nameserver 192.0.2.1
search ns1.svc.cluster-domain.example my.dns.search.suffix
options ndots:2 edns0
```
Mark the default DNSClass as non-default:
```bash
kubectl patch dnsclass standard -p '{"metadata": {"annotations":{"dnsclass.kubernetes.io/is-default-class":"false"}}}'
```
Mark a DNSClass as default:
```bash
kubectl patch dnsclass gold -p '{"metadata": {"annotations":{"dnsclass.kubernetes.io/is-default-class":"true"}}}'
```
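The default-class annotation is expected to behave much like the analogous `storageclass.kubernetes.io/is-default-class` annotation on StorageClass. Below is a minimal sketch of the selection logic, assuming that at most one DNSClass should be marked as default; the exact semantics when several are marked would be settled during implementation, and the types and helper here are illustrative stand-ins rather than proposed code.
```go
package main

import "fmt"

// IsDefaultDNSClassAnnotation is the annotation proposed for marking a DNSClass as default.
const IsDefaultDNSClassAnnotation = "dnsclass.kubernetes.io/is-default-class"

// DNSClass is a trimmed-down stand-in for the proposed API type.
type DNSClass struct {
	Name        string
	Annotations map[string]string
}

// defaultDNSClass returns the DNSClass annotated as default, if exactly one exists.
func defaultDNSClass(classes []DNSClass) (*DNSClass, error) {
	var defaults []*DNSClass
	for i := range classes {
		if classes[i].Annotations[IsDefaultDNSClassAnnotation] == "true" {
			defaults = append(defaults, &classes[i])
		}
	}
	switch len(defaults) {
	case 0:
		return nil, nil // no default; Pods keep the built-in ClusterFirst behavior
	case 1:
		return defaults[0], nil
	default:
		return nil, fmt.Errorf("%d DNSClasses are marked as default; expected at most one", len(defaults))
	}
}

func main() {
	classes := []DNSClass{
		{Name: "standard", Annotations: map[string]string{IsDefaultDNSClassAnnotation: "false"}},
		{Name: "gold", Annotations: map[string]string{IsDefaultDNSClassAnnotation: "true"}},
	}
	if def, err := defaultDNSClass(classes); err == nil && def != nil {
		fmt.Println("default DNSClass:", def.Name) // prints "gold" after the patches above
	}
}
```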
### User Stories (Optional)
#### Story 1
As a cluster administrator, I want to change the cluster's DNS configuration without deploying additional resources such as a MutatingAdmissionWebhook and without expecting users to add DNS configuration to their manifest files. The process of changing the DNS configuration should be straightforward, allowing administrators to make the necessary updates with minimal effort.
#### Story 2
As a cluster administrator, I want to optimize DNS performance, for example by redefining the search list and ndots. Today I am limited by the fact that the configuration behind the ClusterFirst and ClusterFirstWithHostNet DNS policies can only be overridden within each Pod manifest.
### Notes/Constraints/Caveats (Optional)
- DNSClass will only be effective for ClusterFirst and ClusterFirstWithHostNet policies, as other policies already support user-based configuration.
- Changing a DNSClass will not update the DNS settings of existing Pods until they are rescheduled; automatically restarting hundreds or thousands of Pods is deliberately avoided.
- DNSClass is a cluster-wide resource and is not namespaced. It can be used to configure DNS settings for all Pods in the cluster.
- A DNSClass can be marked as the default class, or unmarked, via annotation, which lets users control which DNSClass is applied to Pods that do not specify a DNSClassName.
### Risks and Mitigations
1. Risk: Changing the default DNSClass may impact existing Pods that do not specify a DNSClassName, as they will pick up the new default DNSClass configuration when they are recreated.
   Mitigation: Cluster administrators should carefully review and test changes to the default DNSClass before rolling them out to production clusters. They should also communicate any changes to users and document how to update Pods that rely on the default DNSClass.
2. Risk: Pods are not rescheduled after a DNSClass change, so the new DNS configuration takes effect only gradually.
   Mitigation: Cluster administrators should plan Pod rescheduling and communicate the expected delay to users. They should also monitor the cluster for Pods that are not yet using the updated DNS configuration and reschedule them as needed.
3. Risk: Incorrect DNSClass configuration may cause DNS resolution issues or other networking problems in the cluster.
   Mitigation: Cluster administrators should thoroughly test a DNSClass configuration before applying it to production clusters, closely monitor the cluster after applying changes, and be prepared to roll back or adjust if issues arise.
4. Risk: Misconfigured DNS settings may break DNS resolution for Pods.
   Mitigation: Documentation and guidelines should be provided on how to configure DNS settings correctly.
5. Risk: Introducing a new API resource may require changes to existing tools and libraries that interact with the Kubernetes API.
   Mitigation: DNSClass follows standard Kubernetes API conventions, so generic clients and tooling can discover it through the usual mechanisms; only tools that want to act on DNSClass need updates, and the feature is introduced behind a feature gate so adoption can be gradual.
## Design Details
The DNSClass API will be introduced as a new resource in the node.k8s.io API group. The API will consist of the following fields:
- `metadata`: Metadata for the DNSClass, including name and labels.
- `spec`: The DNS configuration, consisting of the following fields:
- `dnsConfig`: The DNS settings to be applied to Pods, including nameservers, search domains, and options.
Proposed Go type definitions:
```go
type DNSClassSpec struct {
	DNSConfig DNSConfig `json:"dnsConfig"`
}

type DNSClass struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec DNSClassSpec `json:"spec,omitempty"`
}

type DNSConfig struct {
	Nameservers []string          `json:"nameservers,omitempty"`
	Searches    []string          `json:"searches,omitempty"`
	Options     []DNSConfigOption `json:"options,omitempty"`
}

type DNSConfigOption struct {
	Name  string `json:"name,omitempty"`
	Value string `json:"value,omitempty"`
}
```
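The KEP does not yet spell out validation rules, but a sketch helps illustrate the kinds of constraints an implementation might enforce on `spec.dnsConfig`. The limits and checks below are assumptions for illustration only (the core API applies similar bounds to `PodDNSConfig`, e.g. at most 3 nameservers); the actual rules would be settled during API review.
```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// Assumed limits for illustration; the exact values for DNSClass would be
// decided during API review.
const (
	maxNameservers = 3
	maxSearchPaths = 32
)

type DNSConfigOption struct {
	Name  string
	Value string
}

type DNSConfig struct {
	Nameservers []string
	Searches    []string
	Options     []DNSConfigOption
}

// validateDNSConfig sketches the kind of validation an implementation might
// perform on a DNSClass's spec.dnsConfig.
func validateDNSConfig(c DNSConfig) []error {
	var errs []error
	if len(c.Nameservers) > maxNameservers {
		errs = append(errs, fmt.Errorf("spec.dnsConfig.nameservers: must not exceed %d entries", maxNameservers))
	}
	for _, ns := range c.Nameservers {
		if net.ParseIP(ns) == nil {
			errs = append(errs, fmt.Errorf("spec.dnsConfig.nameservers: %q is not a valid IP address", ns))
		}
	}
	if len(c.Searches) > maxSearchPaths {
		errs = append(errs, fmt.Errorf("spec.dnsConfig.searches: must not exceed %d entries", maxSearchPaths))
	}
	for _, o := range c.Options {
		if strings.TrimSpace(o.Name) == "" {
			errs = append(errs, fmt.Errorf("spec.dnsConfig.options: option name must not be empty"))
		}
	}
	return errs
}

func main() {
	bad := DNSConfig{Nameservers: []string{"not-an-ip"}}
	for _, err := range validateDNSConfig(bad) {
		fmt.Println(err)
	}
}
```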
Proposed addition to `PodSpec`:
```go
// PodSpec is a description of a Pod.
type PodSpec struct {
	// ...

	// DNSClassName references the DNSClass whose DNS configuration is applied
	// to this Pod when its DNS policy is ClusterFirst or ClusterFirstWithHostNet.
	DNSClassName string `json:"dnsClassName,omitempty"`
	// ...
}
```
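To make the selection order concrete, below is a minimal sketch of how the effective DNSClass could be chosen for a Pod, following the rules stated in this proposal: only ClusterFirst and ClusterFirstWithHostNet Pods are affected, an explicit `dnsClassName` wins, and otherwise the cluster default (if any) applies. The types and the `resolveDNSClass` helper are illustrative stand-ins, not the actual kubelet or apiserver code.
```go
package main

import "fmt"

type DNSPolicy string

const (
	DNSClusterFirst            DNSPolicy = "ClusterFirst"
	DNSClusterFirstWithHostNet DNSPolicy = "ClusterFirstWithHostNet"
)

// DNSClass is a trimmed-down stand-in for the proposed API type.
type DNSClass struct {
	Name      string
	IsDefault bool
}

// podDNS holds the only PodSpec fields this sketch needs.
type podDNS struct {
	DNSPolicy    DNSPolicy
	DNSClassName string
}

// resolveDNSClass picks the DNSClass whose dnsConfig should be applied to a Pod.
// A nil result means the Pod keeps today's built-in behavior.
func resolveDNSClass(pod podDNS, byName map[string]*DNSClass, clusterDefault *DNSClass) (*DNSClass, error) {
	// Other policies (Default, None) already support user-based configuration
	// and are out of scope for DNSClass.
	if pod.DNSPolicy != DNSClusterFirst && pod.DNSPolicy != DNSClusterFirstWithHostNet {
		return nil, nil
	}
	// An explicit reference always wins.
	if pod.DNSClassName != "" {
		c, ok := byName[pod.DNSClassName]
		if !ok {
			return nil, fmt.Errorf("DNSClass %q not found", pod.DNSClassName)
		}
		return c, nil
	}
	// Fall back to the cluster default, if one is marked.
	return clusterDefault, nil
}

func main() {
	standard := &DNSClass{Name: "standard", IsDefault: true}
	gold := &DNSClass{Name: "gold"}
	byName := map[string]*DNSClass{"standard": standard, "gold": gold}

	explicit, _ := resolveDNSClass(podDNS{DNSPolicy: DNSClusterFirst, DNSClassName: "gold"}, byName, standard)
	implicit, _ := resolveDNSClass(podDNS{DNSPolicy: DNSClusterFirst}, byName, standard)
	fmt.Println(explicit.Name, implicit.Name) // gold standard
}
```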
### Test Plan
- [x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough
prior to committing the changes necessary to implement this enhancement.
##### Prerequisite testing updates
N/A
##### Unit tests
- Add unit tests for the DNSClass API resource.
- Add unit tests to verify the behavior of the default DNSClass.
- Add unit tests to verify the behavior of DNSClassName when specified in the PodSpec.
- Add unit tests to verify the behavior of DNSClassName when changed in the PodSpec.
##### Integration tests
N/A
##### e2e tests
- Add e2e tests to verify the behavior of the default DNSClass.
- Add e2e tests to verify the behavior of DNSClassName with different PodDNSConfig configurations.
- Add e2e tests to verify the behavior of DNSClassName when changed in the PodSpec.
### Graduation Criteria
The DNSClass feature can follow the standard Kubernetes graduation criteria for new features, including alpha, beta, and stable phases. The graduation criteria can be defined as follows:
- Alpha:
- DNSClass API is introduced and available for testing.
- Basic functionality, including DNS configuration for Pods and support for default/non-default DNS classes, is implemented.
- Documentation and examples are provided.
- Unit tests for DNSClass API are implemented and passing.
- Beta:
- DNSClass feature is thoroughly tested and validated in real-world scenarios.
- Feedback from users and the community is gathered and addressed.
- Comprehensive documentation, including user guides and best practices, is provided.
- Integration tests for DNSClass feature in a Kubernetes cluster are implemented and passing.
- Stable:
- DNSClass feature is considered stable and ready for production use.
- All known issues and bugs are addressed.
- Extensive testing, including performance and scalability tests, is performed.
- The feature has been used in production environments successfully for a significant period of time.
- The feature is well-documented, including examples, troubleshooting guides, and upgrade instructions.
### Upgrade / Downgrade Strategy
This feature is gated behind a feature gate, so explicit opt-in is necessary during upgrades and explicit opt-out during downgrades.
When upgrading to a Kubernetes version that includes the DNSClass feature, existing Pods continue to use the built-in default DNS settings unless a dnsClassName is specified in the PodSpec or a default DNSClass is defined. If a dnsClassName is specified, the Pod picks up the updated DNS settings once it is rescheduled.
When downgrading to a Kubernetes version that does not include the DNSClass feature, Pods that reference a DNSClass in their PodSpec keep their custom DNS settings until they are restarted. New Pods created after the downgrade will not have the dnsClassName field available in the PodSpec.
### Version Skew Strategy
N/A
## Production Readiness Review Questionnaire
### Feature Enablement and Rollback
###### How can this feature be enabled / disabled in a live cluster?
- [ ] Feature gate (also fill in values in `kep.yaml`)
- Feature gate name:
- Components depending on the feature gate:
- [ ] Other
- Describe the mechanism:
- Will enabling / disabling the feature require downtime of the control
plane?
- Will enabling / disabling the feature require downtime or reprovisioning
of a node?
###### Does enabling the feature change any default behavior?
No
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
Yes, through feature gates.
###### What happens if we reenable the feature if it was previously rolled back?
###### Are there any tests for feature enablement/disablement?
### Rollout, Upgrade and Rollback Planning
<!--
This section must be completed when targeting beta to a release.
-->
###### How can a rollout or rollback fail? Can it impact already running workloads?
###### What specific metrics should inform a rollback?
<!--
What signals should users be paying attention to when the feature is young
that might indicate a serious problem?
-->
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
<!--
Describe manual testing that was done and the outcomes.
Longer term, we may want to require automated upgrade/rollback tests, but we
are missing a bunch of machinery and tooling and can't do that now.
-->
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
<!--
Even if applying deprecation policies, they may still surprise some users.
-->
### Monitoring Requirements
<!--
This section must be completed when targeting beta to a release.
For GA, this section is required: approvers should be able to confirm the
previous answers based on experience in the field.
-->
###### How can an operator determine if the feature is in use by workloads?
<!--
Ideally, this should be a metric. Operations against the Kubernetes API (e.g.,
checking if there are objects with field X set) may be a last resort. Avoid
logs or events for this purpose.
-->
###### How can someone using this feature know that it is working for their instance?
<!--
For instance, if this is a pod-related feature, it should be possible to determine if the feature is functioning properly
for each individual pod.
Pick one more of these and delete the rest.
Please describe all items visible to end users below with sufficient detail so that they can verify correct enablement
and operation of this feature.
Recall that end users cannot usually observe component logs or access metrics.
-->
- [ ] Events
- Event Reason:
- [ ] API .status
- Condition name:
- Other field:
- [ ] Other (treat as last resort)
- Details:
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
<!--
This is your opportunity to define what "normal" quality of service looks like
for a feature.
It's impossible to provide comprehensive guidance, but at the very
high level (needs more precise definitions) those may be things like:
- per-day percentage of API calls finishing with 5XX errors <= 1%
- 99% percentile over day of absolute value from (job creation time minus expected
job creation time) for cron job <= 10%
- 99.9% of /health requests per day finish with 200 code
These goals will help you determine what you need to measure (SLIs) in the next
question.
-->
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
<!--
Pick one more of these and delete the rest.
-->
- [ ] Metrics
- Metric name:
- [Optional] Aggregation method:
- Components exposing the metric:
- [ ] Other (treat as last resort)
- Details:
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
<!--
Describe the metrics themselves and the reasons why they weren't added (e.g., cost,
implementation difficulties, etc.).
-->
### Dependencies
<!--
This section must be completed when targeting beta to a release.
-->
###### Does this feature depend on any specific services running in the cluster?
<!--
Think about both cluster-level services (e.g. metrics-server) as well
as node-level agents (e.g. specific version of CRI). Focus on external or
optional services that are needed. For example, if this feature depends on
a cloud provider API, or upon an external software-defined storage or network
control plane.
For each of these, fill in the following—thinking about running existing user workloads
and creating new ones, as well as about cluster-level services (e.g. DNS):
- [Dependency name]
- Usage description:
- Impact of its outage on the feature:
- Impact of its degraded performance or high-error rates on the feature:
-->
### Scalability
<!--
For alpha, this section is encouraged: reviewers should consider these questions
and attempt to answer them.
For beta, this section is required: reviewers must answer these questions.
For GA, this section is required: approvers should be able to confirm the
previous answers based on experience in the field.
-->
###### Will enabling / using this feature result in any new API calls?
<!--
Describe them, providing:
- API call type (e.g. PATCH pods)
- estimated throughput
- originating component(s) (e.g. Kubelet, Feature-X-controller)
Focusing mostly on:
- components listing and/or watching resources they didn't before
- API calls that may be triggered by changes of some Kubernetes resources
(e.g. update of object X triggers new updates of object Y)
- periodic API calls to reconcile state (e.g. periodic fetching state,
heartbeats, leader election, etc.)
-->
###### Will enabling / using this feature result in introducing new API types?
<!--
Describe them, providing:
- API type
- Supported number of objects per cluster
- Supported number of objects per namespace (for namespace-scoped objects)
-->
###### Will enabling / using this feature result in any new calls to the cloud provider?
<!--
Describe them, providing:
- Which API(s):
- Estimated increase:
-->
###### Will enabling / using this feature result in increasing size or count of the existing API objects?
<!--
Describe them, providing:
- API type(s):
- Estimated increase in size: (e.g., new annotation of size 32B)
- Estimated amount of new objects: (e.g., new Object X for every existing Pod)
-->
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
<!--
Look at the [existing SLIs/SLOs].
Think about adding additional work or introducing new steps in between
(e.g. need to do X to start a container), etc. Please describe the details.
[existing SLIs/SLOs]: https://git.k8s.io/community/sig-scalability/slos/slos.md#kubernetes-slisslos
-->
###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
<!--
Things to keep in mind include: additional in-memory state, additional
non-trivial computations, excessive access to disks (including increased log
volume), significant amount of data sent and/or received over network, etc.
Think through this both in small and large cases, again with respect to the
[supported limits].
[supported limits]: https://git.k8s.io/community//sig-scalability/configs-and-limits/thresholds.md
-->
###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
<!--
Focus not just on happy cases, but primarily on more pathological cases
(e.g. probes taking a minute instead of milliseconds, failed pods consuming resources, etc.).
If any of the resources can be exhausted, how this is mitigated with the existing limits
(e.g. pods per node) or new limits added by this KEP?
Are there any tests that were run/should be run to understand performance characteristics better
and validate the declared limits?
-->
### Troubleshooting
<!--
This section must be completed when targeting beta to a release.
For GA, this section is required: approvers should be able to confirm the
previous answers based on experience in the field.
The Troubleshooting section currently serves the `Playbook` role. We may consider
splitting it into a dedicated `Playbook` document (potentially with some monitoring
details). For now, we leave it here.
-->
###### How does this feature react if the API server and/or etcd is unavailable?
###### What are other known failure modes?
<!--
For each of them, fill in the following information by copying the below template:
- [Failure mode brief description]
- Detection: How can it be detected via metrics? Stated another way:
how can an operator troubleshoot without logging into a master or worker node?
- Mitigations: What can be done to stop the bleeding, especially for already
running user workloads?
- Diagnostics: What are the useful log messages and their required logging
levels that could help debug the issue?
Not required until feature graduated to beta.
- Testing: Are there any tests for failure mode? If not, describe why.
-->
###### What steps should be taken if SLOs are not being met to determine the problem?
## Implementation History
## Drawbacks
## Alternatives
Using a webhook that mutates the [DNS Config](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-dns-config) of existing Pods at admission time.
## Infrastructure Needed (Optional)