--- title: catalogsource-spec-update authors: - "@perdasilva" reviewers: - TBD approvers: - TBD creation-date: 2021-11-1 last-updated: 2021-12-30 status: provisional tags: enhancement proposal, catalogsource --- # catalogsource-spec-update ## Release Signoff Checklist - [ ] Enhancement is `implementable` - [ ] Design details are appropriately documented from clear requirements - [ ] Test plan is defined - [ ] Graduation criteria for dev preview, tech preview, GA ## Summary `CatalogSource`s can be backed by catalog data from one of three different sources: a docker image, a `ConfigMap`, or from a service that implements the registry's grpc api. The current `CatalogSource` spec specifies many top-level optional attributes that are used or ignored with different levels precedence by the operator depending on the value of `spec.sourceType` (`grpc` or `configMap`) and of other optional settings. This schema format is undesirable as it requires special logic in the backend to pick and choose which attributes to use. It can be a source of bugs, it increases the documentation burden, and it doesn't offer the best usability. For instance, `CatalogSource` CRs can carry superfluous information that won't be used by the catalog operator, which, under certain circumstances could lead to user confusion and/or error (and a bad rep for OLM). The overall quality of the `CatalogSource` spec can be further improved by grouping the configuration parameters for each catalog types into their own structure and removing these attributes from the top-level of the schema. Additional schema validation tags can be added, e.g. [oneOf](https://swagger.io/docs/specification/data-models/oneof-anyof-allof-not/) keyword supported by OpenAPI V3 schemas, to ensure that only a single catalog source type is configured per CR. ## Motivation The motivation for this enhancement proposal comes from issue [#2452](https://github.com/operator-framework/operator-lifecycle-manager/issues/2452) and the work done in [PR#](https://github.com/operator-framework/operator-lifecycle-manager/pull/2512) to mitigate it. Essentially, a way was needed to configure scheduling-pod-spec attributes of the pod created by the catalog operator in response to `CatalogSource`s where `spec.sourceType` = `grpc` and `spec.image` is set. Pragmatically, due to time constraints, the chosen solution was to add another optional top-level attribute (`spec.grpcPodConfig`) that, if set, overrides the values in the catalog pod's `podSpec`. This work highlighted the need for a better `CatalogSource` schema. This work could also be considered tech debt, or a general improvement. I think it could be a better way to design the `CatalogSource` schema and facilitate its growth and change. ### Goals * Redefine the `CatalogSource` by deprecating top-level, scattered, configuration attributes for the different catalog source types (i.e. `grpc-image`, `grpc-service` and `grpc-configMap`) * Rely on the api-server for validation against the CRDs OpenAPI v3 schema * Users should no longer be able to input values that will be ignored by the operator anyway * Make no addition of new features or controls ### Non-Goals * Rethink the whole `CatalogSource` as a concept * Change current catalog source types in number or behavior (i.e. no additional fields will be added/deprecated/removed other than what is strictly necessary to meet the stated goals) ### Open Questions 1. How far does kubebuilder support union-types in schemas? 2. Does kubebuilder add the required validation syntax to the output schema (or will this have to be done in post?)? 3. Is this enough of an improvement to bother? ## Proposal ### `CatalogSource` Spec Changes The following `CatalogSource` spec attributes will be deprecated: 1. `spec.sourceType` (and made optional) 2. `spec.image` 3. `spec.address` 4. `spec.configMap` 5. `spec.grpcPodConfig` The following `optional` attributes will be added (described in `go`): #### `catalogImageSource` A `CatalogSource` that is backed by a registry image. This will also house the recently added configuration knobs for overriding certain registry pod spec attributes. ```go= // Contains all configuration knobs for a registry-image-based catalog type CatalogImageSource struct { // CatalogImage is an operator-registry container image to instantiate a registry-server with. CatalogImage string `json:"catalogImage,omitempty"` // PodSpecOverrides optional registry pod spec overrides. // +optional PodSpecOverrides *CatalogImageSourcePodSpecOverrides `json:"podSpecOverrides,omitempty"` } // Replaces GrpcPodConfig type CatalogImageSourcePodSpecOverrides struct { // NodeSelector is a selector which must be true for the pod to fit on a node. // Selector which must match a node's labels for the pod to be scheduled on that node. // +optional NodeSelector map[string]string `json:"nodeSelector,omitempty"` // Tolerations are the catalog source's pod's tolerations. // +optional Tolerations []corev1.Toleration `json:"tolerations,omitempty"` // If specified, indicates the pod's priority. // If not specified, the pod priority will be default or zero if there is no // default. // +optional PriorityClassName string `json:"priorityClassName,omitempty" protobuf:"bytes,24,opt,name=priorityClassName"` } ``` Example: ```yaml= apiVersion: operators.coreos.com/v1alpha1 kind: CatalogSource metadata: name: operatorhubio-catalog namespace: olm spec: displayName: Community Operators publisher: OperatorHub.io catalogImageSource: catalogImage: quay.io/operatorhubio/catalog:latest podSpecOverrides: nodeSelector: label: value ``` #### `catalogServiceSource` A `CatalogSource` that is backed by a `Service` that implements the registry's grpc api. ```go= // Contains all pertinent knobs to configure a registry service backed catalog source type CatalogServiceSource struct { // Address is a host that OLM can use to connect to a pre-existing registry. // Format: <registry-host or ip>:<port> Address string `json:"address,omitempty"` } ``` Example: ```yaml= apiVersion: operators.coreos.com/v1alpha1 kind: CatalogSource metadata: name: my-catalog namespace: default spec: displayName: My Operators catalogServiceSource: address: some.namespace.svc ``` #### `configMapSource` A `CatalogSource` that is backed by a `ConfigMap`. ```go= // Contains all knobs required for a ConfigMap-backed catalog source type ConfigMapSource struct { // ConfigMap is the name of the ConfigMap to be used to back a configmap-server registry ConfigMap string `json:"configMap,omitempty"` } ``` Example: ```yaml= apiVersion: operators.coreos.com/v1alpha1 kind: CatalogSource metadata: name: my-catalog namespace: default spec: displayName: My Operators configMapSource: configMap: somemap ``` ### Spec Validation CRD OpenAPI v3 schemas [support](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#specifying-a-structural-schema) `allOf`, `oneOf`, `anyOf`, and `not` that allow the schema do be validated in different ways. For instance, ```yaml spec: type: object properties: a: type: string b: type: string oneOf: - required: ["a"] - required: ["b"] ``` In the schema above, either only `a`, or only `b`, or both would be valid. Anything else would be rejected. Similarly, the following could be added to the schema to ensure at least one valid way of configuring a catalog source is used, i.e. ```yaml= ... oneOf: - required: [ "catalogServiceSource" ] - required: [ "catalogImageSource" ] - required: [ "configMapSource" ] # during the deprecation period, we'll need legacy support - required: [ "sourceType", "image" ] - required: [ "sourceType", "address" ] - required: [ "sourceType", "configMap" ] ``` See the full spec in the [appendix](#Appendix-I-New-Catalog-Source-Schema). ### Risks and Mitigations These spec changes are breaking/deprecating and would require a phased change to manage the transition to the new version and avoid user frustration. In order to mitigate any undesired user experience the plan is: 1. maintain the spec backwards compatible for the duration of the deprecation period 2. new features are added to the new structure (as a forcing function to move to the new schema) 3. rely on api-server CRD schema validation to ensure only valid `CatalogSource` resources can be created 4. once the deprecation period is over, bump the CRD version, remove the deprecated fields, and supply a conversion hook The deprecation will be announced on all pertinent channels: source, website, documentation, etc. ### Test Plan 1. Add unit tests (based on the current unit tests for the legacy schema) 2. Add validation tests (to the api repo?) 3. Add e2e tests (based on the current e2e `CatalogSource` tests for the legacy schema) ### Graduation Criteria N/A? ### Upgrade / Downgrade Strategy As previously mentioned, throughout the deprecation period the legacy schema will be supported and new features would be added to the incoming structure to promote the adoption of the new schema. After the requisite deprecation period is over, a new version of the `CatalogSource` CRD will be cut. The first release would also need to provide a conversion hook to convert the CRDs to their new version. ### Version Skew Strategy N/A? ## Implementation History Dec 2021: Draft Enhancement Proposal ## Drawbacks This approach has the following drawbacks: Firstly, the proposed changes will break the API contract and the transition to the new version will need to be managed, including providing a conversion hook. Secondly, there is no support for `oneOf` on the `operator-sdk`/`kubebuilder` tag (as far as I could find). Therefore, this validation syntax may need to be added to the CRD out-of-band (i.e. script called from the right make target). This could be error prone and/or a maintenance burden. However, the validation tests should ensure the validation syntax was correctly added, which should mitigate the the risk of giving a broken build to our users. Finally, CRD validation doesn't seem to return particularly useful messages in the context of `oneOf`. For instance, if the CRD does not validate against any of the schemas in the `oneOf`, the message is: ``` The CatalogSource "catalog" is invalid: * <nil>: Invalid value: "": "spec" must validate one and only one schema (oneOf). Found none valid * spec.catalogServiceSource: Required value ``` ## Alternatives [Union-types](https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/1027-api-unions/README.md) are making some in-roads in Kubernetes. But, I don't think they are available yet. Schematically, they allow an attribute to be of a set of different types. This could have the potential to improve validation messages. From what I could gather from the documentation, I don't think there would be a huge difference in the implementation with or without union-types (maybe the addition of a `discriminator` attribute and a few kubebuilder tags). But I have not figured out a way to really improve the error messages. ## Review Log ### 2022-01-18 Action Items - [x] Double-check "a", "b", or both. Should only be only "a" or "b" - [ ] Review links Alex attached to understand the union-types vs oneOf and where the community is going. So, we don't go down a bad path. - [ ] Add CEL validation as a possible alternative - [ ] Consider also deprecating configMap (this has knock-on affects) - [ ] Consider introducing the next version along side the current one - [ ] Upgrade story needs more cowbell: OCP doesn't like new CRD versions due to downgrades - [ ] Investigate introducing the new version but keeping the old version as the stored version. Then in a next release migrate it as the stored version. Pending on what OCP promises regarding downgrades (how many minor versions can I go back?) ## Appendix I. New Catalog Source Schema ```yaml= apiVersion: apiextensions.k8s.io/v1 kind: CustomResourceDefinition metadata: annotations: controller-gen.kubebuilder.io/version: v0.6.2 creationTimestamp: null name: catalogsources.operators.coreos.com spec: group: operators.coreos.com names: categories: - olm kind: CatalogSource listKind: CatalogSourceList plural: catalogsources shortNames: - catsrc singular: catalogsource scope: Namespaced versions: - additionalPrinterColumns: - description: The pretty name of the catalog jsonPath: .spec.displayName name: Display type: string - description: The type of the catalog jsonPath: .spec.sourceType name: Type type: string - description: The publisher of the catalog jsonPath: .spec.publisher name: Publisher type: string - jsonPath: .metadata.creationTimestamp name: Age type: date name: v1alpha1 schema: openAPIV3Schema: description: CatalogSource is a repository of CSVs, CRDs, and operator packages. type: object required: - metadata - spec properties: apiVersion: description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources' type: string kind: description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds' type: string metadata: type: object spec: type: object properties: address: description: 'Address (deprecated) is a host that OLM can use to connect to a pre-existing registry. Format: <registry-host or ip>:<port> Only used when SourceType = SourceTypeGrpc. Ignored when the Image field is set.' type: string catalogImageSource: type: object properties: catalogImage: description: CatalogImage is an operator-registry container image to instantiate a registry-server with. type: string podSpecOverrides: description: PodSpecOverrides optional registry pod spec overrides. type: object properties: nodeSelector: description: NodeSelector is a selector which must be true for the pod to fit on a node. Selector which must match a node's labels for the pod to be scheduled on that node. type: object additionalProperties: type: string priorityClassName: description: If specified, indicates the pod's priority. If not specified, the pod priority will be default or zero if there is no default. type: string tolerations: description: Tolerations are the catalog source's pod's tolerations. type: array items: description: The pod this Toleration is attached to tolerates any taint that matches the triple <key,value,effect> using the matching operator <operator>. type: object properties: effect: description: Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute. type: string key: description: Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys. type: string operator: description: Operator represents a key's relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category. type: string tolerationSeconds: description: TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system. type: integer format: int64 value: description: Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string. type: string catalogServiceSource: type: object properties: address: description: 'Address is a host that OLM can use to connect to a pre-existing registry. Format: <registry-host or ip>:<port>' type: string configMap: description: ConfigMap (deprecated) is the name of the ConfigMap to be used to back a configmap-server registry. Only used when SourceType = SourceTypeConfigmap or SourceTypeInternal. type: string configMapSource: type: object properties: configMap: description: ConfigMap is the name of the ConfigMap to be used to back a configmap-server registry type: string description: type: string displayName: description: Metadata type: string grpcPodConfig: description: GrpcPodConfig (deprecated) exposes different overrides for the pod spec of the CatalogSource Pod. Only used when SourceType = SourceTypeGrpc and Image is set. type: object properties: nodeSelector: description: NodeSelector is a selector which must be true for the pod to fit on a node. Selector which must match a node's labels for the pod to be scheduled on that node. type: object additionalProperties: type: string priorityClassName: description: If specified, indicates the pod's priority. If not specified, the pod priority will be default or zero if there is no default. type: string tolerations: description: Tolerations are the catalog source's pod's tolerations. type: array items: description: The pod this Toleration is attached to tolerates any taint that matches the triple <key,value,effect> using the matching operator <operator>. type: object properties: effect: description: Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute. type: string key: description: Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys. type: string operator: description: Operator represents a key's relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category. type: string tolerationSeconds: description: TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system. type: integer format: int64 value: description: Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string. type: string icon: type: object required: - base64data - mediatype properties: base64data: type: string mediatype: type: string image: description: Image (deprecated) is an operator-registry container image to instantiate a registry-server with. Only used when SourceType = SourceTypeGrpc. If present, the address field is ignored. type: string priority: description: 'Priority field assigns a weight to the catalog source to prioritize them so that it can be consumed by the dependency resolver. Usage: Higher weight indicates that this catalog source is preferred over lower weighted catalog sources during dependency resolution. The range of the priority value can go from positive to negative in the range of int32. The default value to a catalog source with unassigned priority would be 0. The catalog source with the same priority values will be ranked lexicographically based on its name.' type: integer publisher: type: string secrets: description: Secrets represent set of secrets that can be used to access the contents of the catalog. It is best to keep this list small, since each will need to be tried for every catalog entry. type: array items: type: string sourceType: description: SourceType (deprecated) is the type of source type: string updateStrategy: description: UpdateStrategy defines how updated catalog source images can be discovered Consists of an interval that defines polling duration and an embedded strategy type type: object properties: registryPoll: type: object properties: interval: description: Interval is used to determine the time interval between checks of the latest catalog source version. The catalog operator polls to see if a new version of the catalog source is available. If available, the latest image is pulled and gRPC traffic is directed to the latest catalog source. type: string oneOf: - required: [ "catalogServiceSource" ] - required: [ "catalogImageSource" ] - required: [ "configMapSource" ] - required: [ "sourceType", "image" ] - required: [ "sourceType", "address" ] - required: [ "sourceType", "configMap" ] status: type: object properties: conditions: description: Represents the state of a CatalogSource. Note that Message and Reason represent the original status information, which may be migrated to be conditions based in the future. Any new features introduced will use conditions. type: array items: description: "Condition contains details for one aspect of the current state of this API Resource. --- This struct is intended for direct use as an array at the field path .status.conditions. For example, type FooStatus struct{ // Represents the observations of a foo's current state. // Known .status.conditions.type are: \"Available\", \"Progressing\", and \"Degraded\" // +patchMergeKey=type // +patchStrategy=merge // +listType=map // +listMapKey=type Conditions []metav1.Condition `json:\"conditions,omitempty\" patchStrategy:\"merge\" patchMergeKey:\"type\" protobuf:\"bytes,1,rep,name=conditions\"` \n // other fields }" type: object required: - lastTransitionTime - message - reason - status - type properties: lastTransitionTime: description: lastTransitionTime is the last time the condition transitioned from one status to another. This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable. type: string format: date-time message: description: message is a human readable message indicating details about the transition. This may be an empty string. type: string maxLength: 32768 observedGeneration: description: observedGeneration represents the .metadata.generation that the condition was set based upon. For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date with respect to the current state of the instance. type: integer format: int64 minimum: 0 reason: description: reason contains a programmatic identifier indicating the reason for the condition's last transition. Producers of specific condition types may define expected values and meanings for this field, and whether the values are considered a guaranteed API. The value should be a CamelCase string. This field may not be empty. type: string maxLength: 1024 minLength: 1 pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$ status: description: status of the condition, one of True, False, Unknown. type: string enum: - "True" - "False" - Unknown type: description: type of condition in CamelCase or in foo.example.com/CamelCase. --- Many .condition.type values are consistent across resources like Available, but because arbitrary conditions can be useful (see .node.status.conditions), the ability to deconflict is important. The regex it matches is (dns1123SubdomainFmt/)?(qualifiedNameFmt) type: string maxLength: 316 pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$ x-kubernetes-list-map-keys: - type x-kubernetes-list-type: map configMapReference: type: object required: - name - namespace properties: lastUpdateTime: type: string format: date-time name: type: string namespace: type: string resourceVersion: type: string uid: description: UID is a type that holds unique ID values, including UUIDs. Because we don't ONLY use UUIDs, this is an alias to string. Being a type captures intent and helps make sure that UIDs and names do not get conflated. type: string connectionState: type: object required: - lastObservedState properties: address: type: string lastConnect: type: string format: date-time lastObservedState: type: string latestImageRegistryPoll: description: The last time the CatalogSource image registry has been polled to ensure the image is up-to-date type: string format: date-time message: description: A human readable message indicating details about why the CatalogSource is in this condition. type: string reason: description: Reason is the reason the CatalogSource was transitioned to its current state. type: string registryService: type: object properties: createdAt: type: string format: date-time port: type: string protocol: type: string serviceName: type: string serviceNamespace: type: string served: true storage: true subresources: status: {} ```