---
title: catalogsource-spec-update
authors:
- "@perdasilva"
reviewers:
- TBD
approvers:
- TBD
creation-date: 2021-11-1
last-updated: 2021-12-30
status: provisional
tags: enhancement proposal, catalogsource
---
# catalogsource-spec-update
## Release Signoff Checklist
- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
## Summary
`CatalogSource`s can be backed by catalog data from one of three different sources: a docker image, a `ConfigMap`, or from a service that implements the registry's grpc api. The current `CatalogSource` spec specifies many top-level optional attributes that are used or ignored with different levels precedence by the operator depending on the value of `spec.sourceType` (`grpc` or `configMap`) and of other optional settings. This schema format is undesirable as it requires special logic in the backend to pick and choose which attributes to use. It can be a source of bugs, it increases the documentation burden, and it doesn't offer the best usability. For instance, `CatalogSource` CRs can carry superfluous information that won't be used by the catalog operator, which, under certain circumstances could lead to user confusion and/or error (and a bad rep for OLM).
The overall quality of the `CatalogSource` spec can be further improved by grouping the configuration parameters for each catalog types into their own structure and removing these attributes from the top-level of the schema.
Additional schema validation tags can be added, e.g. [oneOf](https://swagger.io/docs/specification/data-models/oneof-anyof-allof-not/) keyword supported by OpenAPI V3 schemas, to ensure that only a single catalog source type is configured per CR.
## Motivation
The motivation for this enhancement proposal comes from issue [#2452](https://github.com/operator-framework/operator-lifecycle-manager/issues/2452) and the work done in [PR#](https://github.com/operator-framework/operator-lifecycle-manager/pull/2512) to mitigate it. Essentially, a way was needed to configure scheduling-pod-spec attributes of the pod created by the catalog operator in response to `CatalogSource`s where `spec.sourceType` = `grpc` and `spec.image` is set. Pragmatically, due to time constraints, the chosen solution was to add another optional top-level attribute (`spec.grpcPodConfig`) that, if set, overrides the values in the catalog pod's `podSpec`. This work highlighted the need for a better `CatalogSource` schema.
This work could also be considered tech debt, or a general improvement. I think it could be a better way to design the `CatalogSource` schema and facilitate its growth and change.
### Goals
* Redefine the `CatalogSource` by deprecating top-level, scattered, configuration attributes for the different catalog source types (i.e. `grpc-image`, `grpc-service` and `grpc-configMap`)
* Rely on the api-server for validation against the CRDs OpenAPI v3 schema
* Users should no longer be able to input values that will be ignored by the operator anyway
* Make no addition of new features or controls
### Non-Goals
* Rethink the whole `CatalogSource` as a concept
* Change current catalog source types in number or behavior (i.e. no additional fields will be added/deprecated/removed other than what is strictly necessary to meet the stated goals)
### Open Questions
1. How far does kubebuilder support union-types in schemas?
2. Does kubebuilder add the required validation syntax to the output schema (or will this have to be done in post?)?
3. Is this enough of an improvement to bother?
## Proposal
### `CatalogSource` Spec Changes
The following `CatalogSource` spec attributes will be deprecated:
1. `spec.sourceType` (and made optional)
2. `spec.image`
3. `spec.address`
4. `spec.configMap`
5. `spec.grpcPodConfig`
The following `optional` attributes will be added (described in `go`):
#### `catalogImageSource`
A `CatalogSource` that is backed by a registry image. This will also house the recently added configuration knobs for overriding certain registry pod spec attributes.
```go=
// Contains all configuration knobs for a registry-image-based catalog
type CatalogImageSource struct {
// CatalogImage is an operator-registry container image to instantiate a registry-server with.
CatalogImage string `json:"catalogImage,omitempty"`
// PodSpecOverrides optional registry pod spec overrides.
// +optional
PodSpecOverrides *CatalogImageSourcePodSpecOverrides `json:"podSpecOverrides,omitempty"`
}
// Replaces GrpcPodConfig
type CatalogImageSourcePodSpecOverrides struct {
// NodeSelector is a selector which must be true for the pod to fit on a node.
// Selector which must match a node's labels for the pod to be scheduled on that node.
// +optional
NodeSelector map[string]string `json:"nodeSelector,omitempty"`
// Tolerations are the catalog source's pod's tolerations.
// +optional
Tolerations []corev1.Toleration `json:"tolerations,omitempty"`
// If specified, indicates the pod's priority.
// If not specified, the pod priority will be default or zero if there is no
// default.
// +optional
PriorityClassName string `json:"priorityClassName,omitempty" protobuf:"bytes,24,opt,name=priorityClassName"`
}
```
Example:
```yaml=
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: operatorhubio-catalog
namespace: olm
spec:
displayName: Community Operators
publisher: OperatorHub.io
catalogImageSource:
catalogImage: quay.io/operatorhubio/catalog:latest
podSpecOverrides:
nodeSelector:
label: value
```
#### `catalogServiceSource`
A `CatalogSource` that is backed by a `Service` that implements the registry's grpc api.
```go=
// Contains all pertinent knobs to configure a registry service backed catalog source
type CatalogServiceSource struct {
// Address is a host that OLM can use to connect to a pre-existing registry.
// Format: <registry-host or ip>:<port>
Address string `json:"address,omitempty"`
}
```
Example:
```yaml=
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: my-catalog
namespace: default
spec:
displayName: My Operators
catalogServiceSource:
address: some.namespace.svc
```
#### `configMapSource`
A `CatalogSource` that is backed by a `ConfigMap`.
```go=
// Contains all knobs required for a ConfigMap-backed catalog source
type ConfigMapSource struct {
// ConfigMap is the name of the ConfigMap to be used to back a configmap-server registry
ConfigMap string `json:"configMap,omitempty"`
}
```
Example:
```yaml=
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: my-catalog
namespace: default
spec:
displayName: My Operators
configMapSource:
configMap: somemap
```
### Spec Validation
CRD OpenAPI v3 schemas [support](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#specifying-a-structural-schema) `allOf`, `oneOf`, `anyOf`, and `not` that allow the schema do be validated in different ways. For instance,
```yaml
spec:
type: object
properties:
a:
type: string
b:
type: string
oneOf:
- required: ["a"]
- required: ["b"]
```
In the schema above, either only `a`, or only `b`, or both would be valid. Anything else would be rejected.
Similarly, the following could be added to the schema to ensure at least one valid way of configuring a catalog source is used, i.e.
```yaml=
...
oneOf:
- required: [ "catalogServiceSource" ]
- required: [ "catalogImageSource" ]
- required: [ "configMapSource" ]
# during the deprecation period, we'll need legacy support
- required: [ "sourceType", "image" ]
- required: [ "sourceType", "address" ]
- required: [ "sourceType", "configMap" ]
```
See the full spec in the [appendix](#Appendix-I-New-Catalog-Source-Schema).
### Risks and Mitigations
These spec changes are breaking/deprecating and would require a phased change to manage the transition to the new version and avoid user frustration. In order to mitigate any undesired user experience the plan is:
1. maintain the spec backwards compatible for the duration of the deprecation period
2. new features are added to the new structure (as a forcing function to move to the new schema)
3. rely on api-server CRD schema validation to ensure only valid `CatalogSource` resources can be created
4. once the deprecation period is over, bump the CRD version, remove the deprecated fields, and supply a conversion hook
The deprecation will be announced on all pertinent channels: source, website, documentation, etc.
### Test Plan
1. Add unit tests (based on the current unit tests for the legacy schema)
2. Add validation tests (to the api repo?)
3. Add e2e tests (based on the current e2e `CatalogSource` tests for the legacy schema)
### Graduation Criteria
N/A?
### Upgrade / Downgrade Strategy
As previously mentioned, throughout the deprecation period the legacy schema will be supported and new features would be added to the incoming structure to promote the adoption of the new schema. After the requisite deprecation period is over, a new version of the `CatalogSource` CRD will be cut. The first release would also need to provide a conversion hook to convert the CRDs to their new version.
### Version Skew Strategy
N/A?
## Implementation History
Dec 2021: Draft Enhancement Proposal
## Drawbacks
This approach has the following drawbacks:
Firstly, the proposed changes will break the API contract and the transition to the new version will need to be managed, including providing a conversion hook.
Secondly, there is no support for `oneOf` on the `operator-sdk`/`kubebuilder` tag (as far as I could find). Therefore, this validation syntax may need to be added to the CRD out-of-band (i.e. script called from the right make target). This could be error prone and/or a maintenance burden. However, the validation tests should ensure the validation syntax was correctly added, which should mitigate the the risk of giving a broken build to our users.
Finally, CRD validation doesn't seem to return particularly useful messages in the context of `oneOf`. For instance, if the CRD does not validate against any of the schemas in the `oneOf`, the message is:
```
The CatalogSource "catalog" is invalid:
* <nil>: Invalid value: "": "spec" must validate one and only one schema (oneOf). Found none valid
* spec.catalogServiceSource: Required value
```
## Alternatives
[Union-types](https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/1027-api-unions/README.md) are making some in-roads in Kubernetes. But, I don't think they are available yet. Schematically, they allow an attribute to be of a set of different types. This could have the potential to improve validation messages. From what I could gather from the documentation, I don't think there would be a huge difference in the implementation with or without union-types (maybe the addition of a `discriminator` attribute and a few kubebuilder tags). But I have not figured out a way to really improve the error messages.
## Review Log
### 2022-01-18 Action Items
- [x] Double-check "a", "b", or both. Should only be only "a" or "b"
- [ ] Review links Alex attached to understand the union-types vs oneOf and where the community is going. So, we don't go down a bad path.
- [ ] Add CEL validation as a possible alternative
- [ ] Consider also deprecating configMap (this has knock-on affects)
- [ ] Consider introducing the next version along side the current one
- [ ] Upgrade story needs more cowbell: OCP doesn't like new CRD versions due to downgrades
- [ ] Investigate introducing the new version but keeping the old version as the stored version. Then in a next release migrate it as the stored version. Pending on what OCP promises regarding downgrades (how many minor versions can I go back?)
## Appendix I. New Catalog Source Schema
```yaml=
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.6.2
creationTimestamp: null
name: catalogsources.operators.coreos.com
spec:
group: operators.coreos.com
names:
categories:
- olm
kind: CatalogSource
listKind: CatalogSourceList
plural: catalogsources
shortNames:
- catsrc
singular: catalogsource
scope: Namespaced
versions:
- additionalPrinterColumns:
- description: The pretty name of the catalog
jsonPath: .spec.displayName
name: Display
type: string
- description: The type of the catalog
jsonPath: .spec.sourceType
name: Type
type: string
- description: The publisher of the catalog
jsonPath: .spec.publisher
name: Publisher
type: string
- jsonPath: .metadata.creationTimestamp
name: Age
type: date
name: v1alpha1
schema:
openAPIV3Schema:
description: CatalogSource is a repository of CSVs, CRDs, and operator packages.
type: object
required:
- metadata
- spec
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
type: object
properties:
address:
description: 'Address (deprecated) is a host that OLM can use to connect to a pre-existing registry. Format: <registry-host or ip>:<port> Only used when SourceType = SourceTypeGrpc. Ignored when the Image field is set.'
type: string
catalogImageSource:
type: object
properties:
catalogImage:
description: CatalogImage is an operator-registry container image to instantiate a registry-server with.
type: string
podSpecOverrides:
description: PodSpecOverrides optional registry pod spec overrides.
type: object
properties:
nodeSelector:
description: NodeSelector is a selector which must be true for the pod to fit on a node. Selector which must match a node's labels for the pod to be scheduled on that node.
type: object
additionalProperties:
type: string
priorityClassName:
description: If specified, indicates the pod's priority. If not specified, the pod priority will be default or zero if there is no default.
type: string
tolerations:
description: Tolerations are the catalog source's pod's tolerations.
type: array
items:
description: The pod this Toleration is attached to tolerates any taint that matches the triple <key,value,effect> using the matching operator <operator>.
type: object
properties:
effect:
description: Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
type: string
key:
description: Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys.
type: string
operator:
description: Operator represents a key's relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category.
type: string
tolerationSeconds:
description: TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system.
type: integer
format: int64
value:
description: Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string.
type: string
catalogServiceSource:
type: object
properties:
address:
description: 'Address is a host that OLM can use to connect to a pre-existing registry. Format: <registry-host or ip>:<port>'
type: string
configMap:
description: ConfigMap (deprecated) is the name of the ConfigMap to be used to back a configmap-server registry. Only used when SourceType = SourceTypeConfigmap or SourceTypeInternal.
type: string
configMapSource:
type: object
properties:
configMap:
description: ConfigMap is the name of the ConfigMap to be used to back a configmap-server registry
type: string
description:
type: string
displayName:
description: Metadata
type: string
grpcPodConfig:
description: GrpcPodConfig (deprecated) exposes different overrides for the pod spec of the CatalogSource Pod. Only used when SourceType = SourceTypeGrpc and Image is set.
type: object
properties:
nodeSelector:
description: NodeSelector is a selector which must be true for the pod to fit on a node. Selector which must match a node's labels for the pod to be scheduled on that node.
type: object
additionalProperties:
type: string
priorityClassName:
description: If specified, indicates the pod's priority. If not specified, the pod priority will be default or zero if there is no default.
type: string
tolerations:
description: Tolerations are the catalog source's pod's tolerations.
type: array
items:
description: The pod this Toleration is attached to tolerates any taint that matches the triple <key,value,effect> using the matching operator <operator>.
type: object
properties:
effect:
description: Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
type: string
key:
description: Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys.
type: string
operator:
description: Operator represents a key's relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category.
type: string
tolerationSeconds:
description: TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system.
type: integer
format: int64
value:
description: Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string.
type: string
icon:
type: object
required:
- base64data
- mediatype
properties:
base64data:
type: string
mediatype:
type: string
image:
description: Image (deprecated) is an operator-registry container image to instantiate a registry-server with. Only used when SourceType = SourceTypeGrpc. If present, the address field is ignored.
type: string
priority:
description: 'Priority field assigns a weight to the catalog source to prioritize them so that it can be consumed by the dependency resolver. Usage: Higher weight indicates that this catalog source is preferred over lower weighted catalog sources during dependency resolution. The range of the priority value can go from positive to negative in the range of int32. The default value to a catalog source with unassigned priority would be 0. The catalog source with the same priority values will be ranked lexicographically based on its name.'
type: integer
publisher:
type: string
secrets:
description: Secrets represent set of secrets that can be used to access the contents of the catalog. It is best to keep this list small, since each will need to be tried for every catalog entry.
type: array
items:
type: string
sourceType:
description: SourceType (deprecated) is the type of source
type: string
updateStrategy:
description: UpdateStrategy defines how updated catalog source images can be discovered Consists of an interval that defines polling duration and an embedded strategy type
type: object
properties:
registryPoll:
type: object
properties:
interval:
description: Interval is used to determine the time interval between checks of the latest catalog source version. The catalog operator polls to see if a new version of the catalog source is available. If available, the latest image is pulled and gRPC traffic is directed to the latest catalog source.
type: string
oneOf:
- required: [ "catalogServiceSource" ]
- required: [ "catalogImageSource" ]
- required: [ "configMapSource" ]
- required: [ "sourceType", "image" ]
- required: [ "sourceType", "address" ]
- required: [ "sourceType", "configMap" ]
status:
type: object
properties:
conditions:
description: Represents the state of a CatalogSource. Note that Message and Reason represent the original status information, which may be migrated to be conditions based in the future. Any new features introduced will use conditions.
type: array
items:
description: "Condition contains details for one aspect of the current state of this API Resource. --- This struct is intended for direct use as an array at the field path .status.conditions. For example, type FooStatus struct{ // Represents the observations of a foo's current state. // Known .status.conditions.type are: \"Available\", \"Progressing\", and \"Degraded\" // +patchMergeKey=type // +patchStrategy=merge // +listType=map // +listMapKey=type Conditions []metav1.Condition `json:\"conditions,omitempty\" patchStrategy:\"merge\" patchMergeKey:\"type\" protobuf:\"bytes,1,rep,name=conditions\"` \n // other fields }"
type: object
required:
- lastTransitionTime
- message
- reason
- status
- type
properties:
lastTransitionTime:
description: lastTransitionTime is the last time the condition transitioned from one status to another. This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
type: string
format: date-time
message:
description: message is a human readable message indicating details about the transition. This may be an empty string.
type: string
maxLength: 32768
observedGeneration:
description: observedGeneration represents the .metadata.generation that the condition was set based upon. For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date with respect to the current state of the instance.
type: integer
format: int64
minimum: 0
reason:
description: reason contains a programmatic identifier indicating the reason for the condition's last transition. Producers of specific condition types may define expected values and meanings for this field, and whether the values are considered a guaranteed API. The value should be a CamelCase string. This field may not be empty.
type: string
maxLength: 1024
minLength: 1
pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
status:
description: status of the condition, one of True, False, Unknown.
type: string
enum:
- "True"
- "False"
- Unknown
type:
description: type of condition in CamelCase or in foo.example.com/CamelCase. --- Many .condition.type values are consistent across resources like Available, but because arbitrary conditions can be useful (see .node.status.conditions), the ability to deconflict is important. The regex it matches is (dns1123SubdomainFmt/)?(qualifiedNameFmt)
type: string
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
x-kubernetes-list-map-keys:
- type
x-kubernetes-list-type: map
configMapReference:
type: object
required:
- name
- namespace
properties:
lastUpdateTime:
type: string
format: date-time
name:
type: string
namespace:
type: string
resourceVersion:
type: string
uid:
description: UID is a type that holds unique ID values, including UUIDs. Because we don't ONLY use UUIDs, this is an alias to string. Being a type captures intent and helps make sure that UIDs and names do not get conflated.
type: string
connectionState:
type: object
required:
- lastObservedState
properties:
address:
type: string
lastConnect:
type: string
format: date-time
lastObservedState:
type: string
latestImageRegistryPoll:
description: The last time the CatalogSource image registry has been polled to ensure the image is up-to-date
type: string
format: date-time
message:
description: A human readable message indicating details about why the CatalogSource is in this condition.
type: string
reason:
description: Reason is the reason the CatalogSource was transitioned to its current state.
type: string
registryService:
type: object
properties:
createdAt:
type: string
format: date-time
port:
type: string
protocol:
type: string
serviceName:
type: string
serviceNamespace:
type: string
served: true
storage: true
subresources:
status: {}
```