# CRD Versioning Brainstorming
## PoC development branch:
https://github.com/MadhavJivrajani/kubernetes/tree/crd-field-level-versioning
## Option 1 - Introduce feature gate field in CRD
**CRD**
```yaml=
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
# name must match the spec fields below, and be in the form: <plural>.<group>
name: crontabs.stable.example.com
spec:
# group name to use for REST API: /apis/<group>/<version>
group: stable.example.com
# list of versions supported by this CustomResourceDefinition
versions:
- name: v1
# Each version can be enabled/disabled by Served flag.
served: true
# One and only one version must be marked as the storage version.
storage: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
image:
type: string
replicas:
type: integer
# either Namespaced or Cluster
scope: Namespaced
names:
# plural name to be used in the URL: /apis/<group>/<version>/<plural>
plural: crontabs
# singular name to be used as an alias on the CLI and for display
singular: crontab
# kind is normally the CamelCased singular type. Your resource manifests use this.
kind: CronTab
# shortNames allow shorter string to match your resource on the CLI
shortNames:
- ct
featureGates:
component: istio # optional
- name: ReplicasFeatureGate
enabled: false
default: false
prerelease: alpha
fieldPaths:
- .spec.replicas
- name: AnotherFeatureGate
enabled: true
default: true
prerelease: beta
fieldPaths:
- .spec.foo
- .spec.bar
```
**CR**
```yaml
apiVersion: "stable.example.com/v1"
kind: CronTab
metadata:
name: my-new-cron-object
spec:
cronSpec: "* * * * */5"
image: my-awesome-cron-image
replicas: 3
```
## Behavior
| Existing persisted object has field (.spec.replicas) | Feature Gate (ReplicasFeatuerGate) | Action |
| -------- | ----------- | ------------ |
| no | disabled | Drop Field |
| no | enabled | Set field |
| yes | disabled | Do nothing - don't drop and don't update the field |
| yes | enabled | Update field |
## Brainstorming
**17 Sept**
- What fields do we need to consider in the featureGate field?
- We could also create a command called “kubectl feature-gates” (similar to “kubectl api-resources”) to list the feature gates. To start with, we could only list the feature gates used in CRDs (and not include the feature gates for the built-in types).
- Can we use field pruning for dropping fields that have their feature gate disabled?
- Are there any challenges with conversion?
- Do we introduce feature gates during conversation?
- We should be able to graduate feature gates from alpha to beta within an apiversion
### Example
- Prerequisite - you create a CRD
- CR (v1beta1) - .spec.replicas =3 - by feature gate ReplicasFeatureGate (alpha, default - false)
- If you create the CR with feature gate disabled => .spec.replicas is dropped
- CR (v1) - spec.replicas = 3 => feature gate in beta, default: true
- If you create the CR with feature gate, .spec.replicas is not dropped
**27 Sept**
- [Nikhita] Today for built-in types:
- **Feature Gates gate the same fields in multiple apiVersions** - Say we have two apiVersions `v1` (storage version) and `v2beta1`. Since v1 is stable, it is the storage version. If a new field `.spec.replicas` is added in `v2beta1`, it must also be added to `v1` since `v1` is the storage version. ([ref](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api_changes.md#multiple-api-versions)). So in this case, the feature gate would gate the `.spec.replicas` field in both `v1` and `v2beta1` versions.
- **Feature Gate has same behavior for all apiVersions** - We do NOT set different alpha/beta/stable feature gate status for different apiVersion. This also ensures that the fields that the feature gate _and_ field is tied to have a consistent status (alpha, beta, etc) in all apiVersions.
- **Feature Gate status is only checked (for the storage version) before persisting to etcd** - feature gate status is NOT checked during conversion and only when persisting to etcd. In the above example, if someone creates a `v2beta1` CR with `.spec.replicas=3` when the feature gate is disabled:
- `v2beta1` will first get converted to `v1` based on the conversion webhook. So we will now have a `v1` resource with `.spec.repliacas=3`.
- When the `v1` resource is being stored in etcd, the apiserver will check the feature gate status. It notices that the feature gate is disabled so it drops the `.spec.replicas=3` field from the resource and the peristed object does not have the field.
- [Nikhita] Following the same principles for custom resources:
- Feature Gates should have the same configuration for all CR versions
- Each feature gate is mapped to a set of fields, for all CR versions
- Conversion does not need to care about feature gates
- Feature gate status is only checked in apiextensions-apiserver after the CR has been converted to the storage version but _before_ persisting to etcd. This check will be done in the `PrepareForCreate` and `PrepareForUpdate` methods in the registry. [link to code](https://github.com/kubernetes/kubernetes/blob/d385d0602a1075837bf8713b9f56964c154aede7/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/strategy.go#L92)
- Validation
- No two feature gates must have the same `fieldPath`
- Do we need to fail validation or default `default: true` if `prerelease` is `beta`?
- Expose feature gate info in OpenAPI
- Should we return warnings for `deprecated` feature gates?
- `prelease` is one of alpha, beta or deprecated
- Relevant [code](https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresourcedefinition/strategy.go#L116)
- `LockToDefault` is not needed here because we don't need to remove any if/then logic in code when a feature gate goes GA
- What happens when the storage version is updated? Nothing! Because k8s automatically handles storage version conversions...
- Do we need to worry about deprecation policy here? Nope, it's on the component maintainers.
- What if you create a feature gate for an already existing field?
- Kubernetes will handle this to avoid data loss
- Does it make sense to "lock" the status of a feature gate?
- No, the maintainer should be careful when updating this
- Add clear docs that this is a risky operation!
- In the POC, we should validate `addtionalPrinterColumns` returns the feature gated value of the field.
**8th Oct**
- [Madhav] a small change to the structure of feature gates due to `component` field not being consistent with a yaml list type.
- Not too sure of the naming of the `yaml` nodes yet
```yaml=
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
# name must match the spec fields below, and be in the form: <plural>.<group>
name: crontabs.stable.example.com
spec:
# group name to use for REST API: /apis/<group>/<version>
group: stable.example.com
# list of versions supported by this CustomResourceDefinition
versions:
- name: v1
# Each version can be enabled/disabled by Served flag.
served: true
# One and only one version must be marked as the storage version.
storage: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
image:
type: string
replicas:
type: integer
# either Namespaced or Cluster
scope: Namespaced
names:
# plural name to be used in the URL: /apis/<group>/<version>/<plural>
plural: crontabs
# singular name to be used as an alias on the CLI and for display
singular: crontab
# kind is normally the CamelCased singular type. Your resource manifests use this.
kind: CronTab
# shortNames allow shorter string to match your resource on the CLI
shortNames:
- ct
customFeatureGates:
component: istio # optional
featureGates:
- name: ReplicasFeatureGate
enabled: false
default: false
prerelease: alpha
fieldPaths:
- .spec.replicas
- name: AnotherFeatureGate
enabled: true
default: true
prerelease: beta
fieldPaths:
- .spec.foo
- .spec.bar
```
**20th October**
- [Madhav] Further validation specific implementation:
- We need to enforce the format of `fieldPaths`, i.e. it being a JSON path.
- For enforcement, we need to write validation.
- Inspiration for this can be taken from CRD `scale` and `status` subresource validation that [currently exists](https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/validator.go#L144).
- However, unlike the `scale` and `status` subresource, the validation will not just be for paths under `.spec` and `.status`, it can be either.
- Adding to this, do we want to allow arbitrary JSON paths being specified?
- My understranding is that even if arbitrary paths are provided, if the exact path does not exist in the CR being created/updated, no change will occur and things should proceed as if the `fieldPath` did not exist in the first place.
- *Need more feedback here!*
~~- [Madhav] According ot the current iteration, if the `featureGate` is enabled, and the `fieldPath` does not exist in the storage version, we set the field.
- TODO: How do we set the field?
- Do we get to know the type from the OpenAPI schema? If so, are there are a bounded number of types that are allowed?~~
**27th Jan**
- [madhav] Do we allow fieldpaths that have common paths? Ex: `.spec.replicas` and `.spec.replicas.something`?
- If so, what's the behaviour here?
- If not allowed, need validation.
- [madhav] What should the generation behaviour be like?
- During update, if a feature gate is disabled, and we drop the field, and if the change that was being made was _just_ to the field that was gated, old == new, how does generation get effected here?
- Before that: learn about what generation is and its intent
- https://stackoverflow.com/questions/47100389/what-is-the-difference-between-a-resourceversion-and-a-generation/47101418
- [madhav] Something to think about: even if a feature gate _is_ disabled, if the CR config is applied through `kubectl`, the field we wanted to gate is still visibile.
- Is this okay?
- Are there any security related concerns maybe if this happens? i doubt it?
- Behaviour wise this should not matter because everything happens as before, just at the last stage of either `PrepareToCreate` or `PrepareToUpdate`, the field is dropped.
**31st Jan**
- [madhav] re: above point about generation:
- We don't need to consider changing it with any additional logic.
- We change generation iff things other than the metadata change.
- If a feature gate is enabled, it may cause changes on update, which would lead to a non-metadata change and that is handled by the current logic.
- If a feature gate _is_ disabled, then no need to worry because it won't cause any change on updates.
- The only change it might cause is in the `last-applied-configuration` annotation but that's in metadata, so its fine.
**2nd Feb**
> Do we allow fieldpaths that have common paths?
[nikhita] *Assumption*: If feature gates exist for a parent fieldPath (`.spec.foo`) & it's children (`.spec.foo.qux`), then the feature gate of the child path will only be considered if the feature gate of the parent path is enabled.
Consider that we are applying a config:
```
.spec
.foo # <- controlled by FooFeatureGate
.baz = 2
.qux = 3 # <- controlled by QuxFeatureGate
```
and if the object was persisted, the persisted object was (for cases 5-8)
```
.spec
.foo # <- controlled by FooFeatureGate
.qux = 1 # <- controlled by QuxFeatureGate
```
Sr. No | Existing persisted object has field (.spec.foo.qux) | Feature Gate (FooFeatureGate) | Feature Gate (QuxFeatureGate) | Action | Object |
| --------| -------- | ----------- | ------------ | --------- | --------- |
| 1 | no | disabled | disabled | Drop foo & children | `.spec` |
| 2 | no | disabled | enabled | Drop foo & children | `.spec` |
| 3 | no | enabled | disabled | Update foo, drop qux | `.spec.foo.baz = 2` |
| 4 | no | enabled | enabled | Update both foo and qux | `.spec.foo.baz =2`, `.spec.foo.qux = 3` |
| 5 | yes | disabled | disabled | Do nothing | `.spec.foo.qux = 1` |
| 6 | yes | disabled | enabled | Do nothing | `.spec.foo.qux = 1` |
| 7 | yes | enabled | disabled | Update foo, don't update qux | `.spec.foo.baz =2`, `.spec.foo.qux = 1` |
| 8 | yes | enabled | enabled | Update both foo and qux | `.spec.foo.baz =2`, `.spec.foo.qux = 3` |
> Something to think about: even if a feature gate _is_ disabled, if the CR config is applied through `kubectl`, the field we wanted to gate is still visibile.
[nikhita] The field is still visible in the `last-applied-configuration` annotation and this is fine because `last-applied-configuration` is calculated at the client-side and the feature gate calculation is done at the server side. The `last-applied-configuration` annotation is kubectl's client-side view of what it sent to the server so this is fine.
> What should the generation behaviour be like?
[nikhita] Generation is updated only when spec changes. If spec does not change, generation is *not* updated.
Generation is used in conjuction with `status.ObservedGeneration`. The `observedGeneration` field is typically used by a controller.
Generation is increased by the api server on spec changes, observedGeneration is meant to be set to generation by the controller when it processes the spec (change). This is helpful for clients to track whether their changes have been observed by the controller or to verify that they are looking into an up to date resource (metadata.generation == status.observedgeneration).
So to process this change, we should drop the fields first [1] and then check if the specs are equal before updating the generation [2]
[1] https://github.com/kubernetes/kubernetes/blob/48da959dbff18bfef6e801bd8c8ab3c88b7a7650/pkg/registry/apps/daemonset/strategy.go#L94
[2] https://github.com/kubernetes/kubernetes/blob/48da959dbff18bfef6e801bd8c8ab3c88b7a7650/pkg/registry/apps/daemonset/strategy.go#L119-L121
There are still cases of API resources not following this convention of updating generation. For this, see https://kubernetes.slack.com/archives/C0EG7JC6T/p1519927395000092
> maybe we make enabled an optional field, nil value of which falls back to prerelease and default fields?
[nikhita] Yes, we can make both `enabled` and `default` optional fields.
If `enabled` is nil, we fallback to the `default` values for `alpha` and `beta` features.
`stable` features are considered on (by default) irrespective of the value of `enabled`. Note: we might want to implement something like `LockToDefault` in the future but I don't think we need to care about it right now.
If `default` is not specified, we apply the following defaults:
- If prerelease = alpha, default = false
- If prerelease = stable, default = true
We also **validate** that:
- If prerelease = alpha, default = false
- If prerelease = stable, default = true
- If prerelease = beta, default value is specified
**7th Feb**
- [madhav] Assumption: for dealing with common paths, we assume that validation takes care of the fact that any given path will be guarded by no more than one feature gate.
**8th Feb**
[madhav] Trying to brainstorm and think of a simple way to implement handling common field paths.
Important assumption: validation takes care of the fact that any given path will be guarded by no more than one feature gate.
```go
// One more cleanup step before strings.Split, but ignoring here.
numComponents := len(strings.Split(fieldPath, "."))
```
> How do we define a parent-of relationship?
Given 2 field paths A and B, A is parent of B iff, no. of components of A < no. of components of B AND A is a substring of B. (No need to mention "proper" substring on account of the first condition).
> What is the goal?
The goal here is to have a way to efficiently lookup parent field paths, given a field path.
> How do we know if a feature gate should be considered?
A feature gate should be considered iff all parents of each field path of the feature gate is enabled.
for parent-child relation - use: https://github.com/kubernetes/utils/tree/master/field
*Note to self: this can very very likely be done in a simpler and more efficient manner, think about it again in a couple of days when the brain fog lifts. If things still seem convoluted then get concerned about if the current policy of handling common paths is good enough or not. If the implementation itself takes this much brain power then is it simple and intuitive enough to be comfortably used by an end user?*
CRD
```yaml=
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
# name must match the spec fields below, and be in the form: <plural>.<group>
name: crontabs.stable.example.com
spec:
# group name to use for REST API: /apis/<group>/<version>
group: stable.example.com
# list of versions supported by this CustomResourceDefinition
versions:
- name: v1
# Each version can be enabled/disabled by Served flag.
served: true
# One and only one version must be marked as the storage version.
storage: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
image:
type: string
replicas:
type: object
properties:
maxAvailable:
type: integer
maxUnavailable:
type: integer
# either Namespaced or Cluster
scope: Namespaced
names:
# plural name to be used in the URL: /apis/<group>/<version>/<plural>
plural: crontabs
# singular name to be used as an alias on the CLI and for display
singular: crontab
# kind is normally the CamelCased singular type. Your resource manifests use this.
kind: CronTab
# shortNames allow shorter string to match your resource on the CLI
shortNames:
- ct
customFeatureGates:
component: istio # optional
featureGates:
- name: ReplicasFeatureGate
enabled: true
default: false
preRelease: alpha
fieldPaths:
- .spec.replicas
- name: MaxAvailableFeatureGate
preRelease: alpha # should be considered disabled
fieldPaths:
- .spec.replicas.maxAvailable
- name: MaxUnavailableFeatureGate
enabled: true
default: false
preRelease: alpha
fieldPaths:
- .spec.replicas.maxUnavailable
```
CR
```yaml=
apiVersion: "stable.example.com/v1"
kind: CronTab
metadata:
name: my-new-cron-object
spec:
cronSpec: "* * * * */5"
image: my-awesome-cron-image
replicas:
maxAvailable: 100
maxUnavailable: 5
```
---
TODO before writing validation:
- [x] Make `enabled` and `default` fields as optional
- [x] Write logic for determining enabled state of a feature gate based on values of `enabled`, `default` and `preRelease`.
- [x] Incorporate `beta` off by default: https://github.com/kubernetes/enhancements/pull/3137
- [x] Implement logic for common paths in feature gates.
---
KEP TODO
- [ ] Introduce `deprecatedWarning`
- [ ] Figure out what the enabled state is going to be for deprecated gates
- [ ] Update KEP with details of `deprecatedWarning`
- [ ] Improve documentation comments
- [ ] Re-check things to do in v