# ScopedCache Updated Design Thoughts

The current PoC relies heavily on the controller-runtime `cache.Cache` implementation for generating informers. The problem with this is that users are forced to work within the limitations of controller-runtime's method of generating informers. Further design discussions proposed creating this dynamic scoped cache with client-go informers instead, which allows for more flexibility in the way informers are created and configured.

To accomplish this, some changes need to be made to the current PoC's design. In this design "thought", I am proposing a new breakdown of "layered" APIs.

## Proposed Breakdown

- Cache Components (think "what do I need to *build* a cache")
- Caches
  - Non-Cluster-Scoped Cache
  - Cluster-Scoped Cache
- Wrapper around the Caches to implement controller-runtime `cache.Cache`
- Higher Level API

### Cache Components

This layer is meant to consist of the components that can be used to make up a Cache. There are a few components that I believe would help with building a Cache:

- A type for tracking key things for an informer such as:
  - `context.CancelFunc` - This will allow us to cancel the `context.Context` that an informer was started with. This is important to keep track of so that we can properly shut down an informer when asked or when it loses permissions.
  - Dependents - Resources that depend on this informer to exist. This is essentially used as a reference counter to determine if an informer is still needed.
  - Potentially something like:
    ```go
    type ScopeInformerData struct {
    	Cancel context.CancelFunc
    	// This could be a different mapping or
    	// type. I haven't quite thought about
    	// the *best* way of tracking the dependents.
    	Dependents map[types.UID]metav1.Object
    }
    ```
- A type for mapping an `informers.GenericInformer` to its corresponding `ScopeInformerData`. This would allow us to quickly check if an informer already exists and update its data as necessary.
  - Potentially something like:
    ```go
    type Informers map[informers.GenericInformer]ScopeInformerData
    ```
- A type for mapping a GVK to informers. In this dynamic cache there could be multiple informers for any given GVK.
  - Potentially something like:
    ```go
    type GvkToInformers map[schema.GroupVersionKind]Informers
    ```

As an implementation detail, it may be nice to consider adding functions to help with some of the common operations that may need to be done with these types.

### Caches

This layer adopts the Unix philosophy of "do one thing and do it well". The idea is to take the cache components from the previous layer and develop two caches:

- A Non-Cluster-Scoped cache
- A Cluster-Scoped cache

These caches do not necessarily have to implement the controller-runtime `cache.Cache` interface.

#### Non-Cluster-Scoped cache

The goal behind this Non-Cluster-Scoped cache is to be able to handle all the caching logic for watches at a namespace level and not be concerned with informers created to watch at a cluster level.

The main type for this might look something like:
```go
type NamespaceScopedCache struct {
	Namespaces map[string]GvkToInformers
}
```

It should be able to:
- Add/Remove namespaces as needed
- Add/Remove GVKs from a namespace as needed
- Add/Remove an informer from a namespace-GVK pair as needed
- Use informers to `GET` a given resource
- Use informers to `LIST` resources
  - For all namespaces currently tracked
  - For a particular namespace

#### Cluster-Scoped cache

The goal behind the Cluster-Scoped cache is to be able to handle all the caching logic for watches at a cluster level and not be concerned with informers created to watch at a namespace level.
The main type for this might look something like:
```go
type ClusterScopedCache struct {
	GvkInformers GvkToInformers
}
```

It should be able to:
- Add/Remove GVKs as needed
- Add/Remove an informer from a GVK as needed
- Use informers to `GET` a given resource
- Use informers to `LIST` resources across the cluster

### Wrapper

This layer is meant to wrap the caches in the previous layer into an implementation that satisfies the controller-runtime `cache.Cache` interface.

The main type for this might look something like:
```go
type ScopedCache struct {
	ClusterCache   ClusterScopedCache
	NamespaceCache NamespaceScopedCache
	// This would be used for tracking which GVKs are
	// being watched at the cluster level versus the
	// namespace level. Ideally they won't be watched
	// at both levels.
	clusterGVKs map[schema.GroupVersionKind]struct{}
}
```

### Higher Level API

This layer is to create an API that helps operator authors more easily adopt the use of the `ScopedCache`. Implementation is TBD.

# Implementation/Design Proposal (WIP)

## Cache Components

```go
// ScopeInformer is a wrapper around a client-go
// informer that is meant to store information
// needed for the dynamic scoped cache
type ScopeInformer struct {
	// The actual client-go informer itself
	Informer informers.GenericInformer
	// The context.CancelFunc to terminate
	// the informer
	Cancel context.CancelFunc
	// The resources that are dependent
	// on this informer
	Dependents map[types.UID]metav1.Object
}

// TODO: add helpful functions to ^ (if any)

// Informers is a mapping of a string
// (meant to be a unique string) to a ScopeInformer
type Informers map[string]ScopeInformer

// TODO: add helpful functions to ^ (if any)

// GvkToInformers is a mapping of GVK to Informers
type GvkToInformers map[schema.GroupVersionKind]Informers

// TODO: add helpful functions to ^ (if any)

// InformerOptions are meant to set options
// when adding an informer to a cache
type InformerOptions struct {
	// Namespace the informer is watching
	Namespace string
	// GVK the informer is watching
	Gvk schema.GroupVersionKind
	// Unique identifier for the informer
	Key string
	// The informer itself
	Informer informers.GenericInformer
	// The resource dependent on the informer
	Dependent metav1.Object
}
```

## Caches

```go
type DynamicScopedCache interface {
	// Get will return a runtime.Object for a given key and GVK.
	// If an error is encountered, the error will be returned.
	// It will only attempt to use the existing informers to
	// find the requested resource and will NOT automatically
	// create a new informer if it cannot be found.
	Get(key types.NamespacedName, gvk schema.GroupVersionKind) (runtime.Object, error)

	// List will return a list of runtime.Objects for the given list options and GVK.
	// If an error is encountered, the error will be returned.
	// It will only attempt to use the existing informers to
	// list and will NOT automatically create a new informer.
	List(listOpts client.ListOptions, gvk schema.GroupVersionKind) ([]runtime.Object, error)

	// AddInformer will take in InformerOptions as a parameter
	// and will use these options to add a new informer to the cache.
	// If the informer already exists, a new dependent will be added
	// to the existing informer.
	AddInformer(infOpts InformerOptions)

	// RemoveInformer will take in InformerOptions as a parameter
	// and will use these options to remove an existing informer from the cache.
	// An informer will only be fully removed from the cache if the
	// informer has no more dependents.
	RemoveInformer(infOpts InformerOptions)

	// Start will start the cache
	Start()

	// IsStarted returns a boolean that represents whether
	// or not the cache has been started
	IsStarted() bool
}
```

### Cluster Cache

```go
// ClusterScopedCache is a dynamic cache
// with a focus on only tracking informers
// with watches at a cluster level
type ClusterScopedCache struct {
	// GvkInformers is to store each GVK
	// to a set of informers
	GvkInformers GvkToInformers
	// started represents whether or not
	// the cache has been started
	started bool
}

// NewClusterScopedCache will create and return a new ClusterScopedCache
func NewClusterScopedCache() *ClusterScopedCache {}

// Implementation of the DynamicScopedCache interface
// --------------------------------------------------
func (csc *ClusterScopedCache) Get(key types.NamespacedName, gvk schema.GroupVersionKind) (runtime.Object, error) {}
func (csc *ClusterScopedCache) List(listOpts client.ListOptions, gvk schema.GroupVersionKind) ([]runtime.Object, error) {}
func (csc *ClusterScopedCache) AddInformer(infOpts InformerOptions) {}
func (csc *ClusterScopedCache) RemoveInformer(infOpts InformerOptions) {}
func (csc *ClusterScopedCache) Start() {}
func (csc *ClusterScopedCache) IsStarted() bool {}
// --------------------------------------------------
```

### Namespace Cache

```go
// NamespaceScopedCache is a dynamic cache
// with a focus on only tracking informers
// with watches at a namespace level
type NamespaceScopedCache struct {
	// Namespaces is a mapping of namespace
	// to GVK to informers
	Namespaces map[string]GvkToInformers
	// started represents whether or not
	// the cache has been started
	started bool
}

// NewNamespaceScopedCache will create and return a new NamespaceScopedCache
func NewNamespaceScopedCache() *NamespaceScopedCache {}

// Implementation of the DynamicScopedCache interface
// --------------------------------------------------
func (nsc *NamespaceScopedCache) Get(key types.NamespacedName, gvk schema.GroupVersionKind) (runtime.Object, error) {}
func (nsc *NamespaceScopedCache) List(listOpts client.ListOptions, gvk schema.GroupVersionKind) ([]runtime.Object, error) {}
func (nsc *NamespaceScopedCache) AddInformer(infOpts InformerOptions) {}
func (nsc *NamespaceScopedCache) RemoveInformer(infOpts InformerOptions) {}
func (nsc *NamespaceScopedCache) Start() {}
func (nsc *NamespaceScopedCache) IsStarted() bool {}
// --------------------------------------------------
```

## Wrapper

```go
// ScopedInformerFactory is a function that is used to create a
// ScopeInformer for the provided GVR and SharedInformerOptions.
// It is up to the function to properly implement any error
// handling logic for the informer losing permissions.
type ScopedInformerFactory func(gvr schema.GroupVersionResource, options ...informers.SharedInformerOption) (ScopeInformer, error)

// ScopedCache is a wrapper around the
// NamespaceScopedCache and ClusterScopedCache
// that implements the controller-runtime
// cache.Cache interface. Wrapping both scoped
// caches enables this cache to dynamically handle
// informers that establish watches at both the cluster
// and namespace level
type ScopedCache struct {
	// nsCache is the NamespaceScopedCache
	// being wrapped by the ScopedCache
	nsCache *NamespaceScopedCache
	// clusterCache is the ClusterScopedCache
	// being wrapped by the ScopedCache
	clusterCache *ClusterScopedCache
	// RESTMapper is used when determining
	// if an API is namespaced or not
	RESTMapper apimeta.RESTMapper
	// Scheme is used when determining
	// if an API is namespaced or not
	Scheme *runtime.Scheme
	// started represents whether or not
	// the ScopedCache has been started
	started bool
	// scopedInformerFactory is used
	// to create ScopeInformers whenever
	// one needs to be created by the
	// ScopedCache.
	scopedInformerFactory ScopedInformerFactory
}

// ScopedCacheOption is a function to set values on the ScopedCache
type ScopedCacheOption func(*ScopedCache)

// WithScopedInformerFactory is an option that can be used
// to set the ScopedCache.scopedInformerFactory field when
// creating a new ScopedCache.
func WithScopedInformerFactory(sif ScopedInformerFactory) ScopedCacheOption {}

// ScopedCacheBuilder is a builder function that
// can be used to return a controller-runtime
// cache.NewCacheFunc. This function enables controller-runtime
// to properly create a new ScopedCache
func ScopedCacheBuilder(opts ...ScopedCacheOption) cache.NewCacheFunc {}

// client.Reader implementation
// ----------------------

// Get will attempt to get the requested resource from the appropriate
// cache. If no informers exist that would allow for finding the requested resource,
// it will attempt to create the informer to allow for finding the requested resource
// and add it to the appropriate cache.
func (sc *ScopedCache) Get(ctx context.Context, key client.ObjectKey, obj client.Object) error {}

// List will attempt to get the requested list of resources from the appropriate
// cache. If no informers exist that would allow for retrieving a list,
// it will attempt to create the informer to allow for retrieving a list
// and add it to the appropriate cache.
func (sc *ScopedCache) List(ctx context.Context, list client.ObjectList, opts ...client.ListOption) error {}

// ----------------------

// cache.Informers implementation
// ----------------------

// GetInformer will search the caches for the informer that meets the request.
// If none is found, it will attempt to create the requested informer and add it
// to the appropriate cache. Upon success, the informer will be returned.
func (sc *ScopedCache) GetInformer(ctx context.Context, obj client.Object) (cache.Informer, error) {}

// GetInformerForKind will search the caches for the informer that meets the request.
// If none is found, it will attempt to create the requested informer and add it
// to the appropriate cache. Upon success, the informer will be returned.
func (sc *ScopedCache) GetInformerForKind(ctx context.Context, gvk schema.GroupVersionKind) (cache.Informer, error) {}

// Start will start the ScopedCache. This involves starting both
// the ClusterScopedCache and NamespaceScopedCache and all their informers
func (sc *ScopedCache) Start(ctx context.Context) error {}

// WaitForCacheSync will block until all the caches have been synced
func (sc *ScopedCache) WaitForCacheSync(ctx context.Context) bool {}

// IndexField will add an index field to the appropriate informers
func (sc *ScopedCache) IndexField(ctx context.Context, obj client.Object, field string, extractValue client.IndexerFunc) error {}

// ----------------------

// AddInformer will add an informer to the appropriate cache based
// on the provided InformerOptions. If InformerOptions.Namespace == ""
// the informer will be added to the ClusterScopedCache, otherwise
// it will be added to the NamespaceScopedCache.
func (sc *ScopedCache) AddInformer(infOpts InformerOptions) {}

// RemoveInformer will remove an informer from the appropriate cache based
// on the provided InformerOptions. If InformerOptions.Namespace == ""
// the informer will be removed from the ClusterScopedCache, otherwise
// it will be removed from the NamespaceScopedCache.
func (sc *ScopedCache) RemoveInformer(infOpts InformerOptions) {}
```

## Higher Level API

Currently there are no implementation/design ideas, as they could change based on how the implementation of the core library changes. We will need to implement sane defaults for things like the `ScopedInformerFactory` to make adoption of the `ScopedCache` easier.
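
To make the dependent-based reference counting behind `AddInformer` and `RemoveInformer` more concrete, here is a minimal sketch that uses only the standard library. It is not the Telescopia implementation: `scopeInformer`, `informers`, `add`, `remove`, and `noopStart` are hypothetical stand-ins, the client-go informer is left out entirely, and dependents are tracked by a plain string ID rather than `types.UID`.

```go
package main

import (
	"context"
	"fmt"
)

// scopeInformer is a simplified, hypothetical stand-in for ScopeInformer.
// It keeps only the CancelFunc and the set of dependents.
type scopeInformer struct {
	cancel     context.CancelFunc
	dependents map[string]struct{}
}

// informers maps a unique informer key to its bookkeeping data.
type informers map[string]*scopeInformer

// add registers dependent on the informer stored under key. If no informer
// exists yet, start is called with a cancellable context and the CancelFunc
// is stored. It reports whether a new informer was created.
func (i informers) add(key, dependent string, start func(context.Context)) bool {
	si, ok := i[key]
	if !ok {
		ctx, cancel := context.WithCancel(context.Background())
		si = &scopeInformer{cancel: cancel, dependents: map[string]struct{}{}}
		i[key] = si
		start(ctx)
	}
	si.dependents[dependent] = struct{}{}
	return !ok
}

// remove drops dependent from the informer stored under key. The informer is
// cancelled and deleted only once its last dependent is gone. It reports
// whether the informer was fully removed.
func (i informers) remove(key, dependent string) bool {
	si, ok := i[key]
	if !ok {
		return false
	}
	delete(si.dependents, dependent)
	if len(si.dependents) > 0 {
		return false
	}
	si.cancel()
	delete(i, key)
	return true
}

// noopStart stands in for launching a real informer with the given context.
func noopStart(context.Context) {}

func main() {
	infs := informers{}
	fmt.Println(infs.add("pods/default", "uid-a", noopStart)) // true: informer created
	fmt.Println(infs.add("pods/default", "uid-b", noopStart)) // false: existing informer reused
	fmt.Println(infs.remove("pods/default", "uid-a"))         // false: uid-b still depends on it
	fmt.Println(infs.remove("pods/default", "uid-b"))         // true: last dependent gone
}
```

Storing the `context.CancelFunc` next to the dependents means removal and shutdown happen in one place, which is the property the `ScopeInformer` type is meant to provide. A real implementation would also need locking around the map, which this sketch omits.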
# Notes

Dynamic Cache Repo: https://github.com/everettraven/telescopia

Demo Operator Repo: https://github.com/everettraven/scoped-operator-poc/tree/poc/telescopia

## Open Questions/Things to do:

### Higher Level Concepts

### Implementation Details

- What do we need to do so we can evaluate all potential list options for informers?
  - [`metav1.ListOptions` details](https://github.com/kubernetes/apimachinery/blob/v0.25.3/pkg/apis/meta/v1/types.go#L322)
  - Should we make this a user defined function?
- Iron out custom errors and when to return what error type
  - Should we use a custom error when a resource is "Not Found" in case the resource isn't found due to an informer not yet existing that could find this resource?
  - What other cases are there where we should return a custom error?
- Unit Testing
- E2E Testing
  - Move the demo operator into the dynamic cache repo as testdata
- What do we need to do to ensure that we are able to accept the controller-runtime cache configuration options appropriately?
  - [`cache.Options` details](https://github.com/kubernetes-sigs/controller-runtime/blob/v0.13.1/pkg/cache/cache.go#L98)
- When should we automatically kill informers? Should this be configurable?
- Should we make the key used when adding generated informers to the cache configurable?
  - Maybe another user defined function?
- Evaluate and remove unnecessary helper functions and interfaces
- Higher Level API
  - What default functions need to be implemented? (TBD based on other work)
  - What else needs to be added to make the on-ramp/adoption easier?

### Other

- Documentation
- CI

## Potential Epic/User-Story Breakdown

- Epic: Transition Telescopia from PoC to more stable implementation
  - As an operator author, I want to be able to define the method used to generate list options that are used when creating an informer, so that I can customize the configuration of informers.
  - As an operator author, I want to be able to determine which errors come from the dynamic cache and which errors come from the Kubernetes API server, so that I can handle the errors appropriately.
  - As an operator author, I want to use the controller-runtime `cache.Options` to configure the cache, so that I can configure the dynamic cache in a way that I am familiar with.
  - As an operator author, I want to configure whether or not informers should be automatically terminated, so that I can be more in control of how informers are used.
  - As an operator author, I want to be able to define the method used for generating the informer key used when adding an informer to the cache, so that I can reliably recreate the key used for a given informer.
  - As a maintainer, I want unit tests, so that we can ensure the logic of the dynamic cache components is working as expected.
    - TBD - Can we break this down further into unit tests for particular components?
  - As a maintainer, I want end-to-end tests, so that we can ensure the logic of the dynamic cache works in a realistic scenario.
  - As a maintainer, I want to remove unnecessary helper functions and interfaces, so that the dynamic cache is easier to maintain.
- Epic: Transition Telescopia to an Operator Framework maintained project
  - As a Bryce, transfer ownership of everettraven/telescopia to operator-framework/telescopia, so it can be maintained by the Operator Framework
- Epic: Initial release of the Telescopia library
  - As a maintainer, I want automated CI, so that we can easily verify changes before they are accepted.
    - TBD - Break this down further
  - As a maintainer, I want a mostly automated release process, so that we can keep the release process simple.
  - As an operator author, I want to be able to read through documentation, so that I can learn how to use the dynamic cache.
    - TBD - Break this down further
- Epic: Create a higher level API to ease the adoption of the Telescopia caching layer
  - As an operator author, I want to be able to use default configurations, so that I can easily transition my operator to utilize the dynamic cache.
    - TBD - Break this down further
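
Returning to the open question about custom errors and the user story on telling cache errors apart from Kubernetes API server errors: one option is a typed error that callers can detect with `errors.As`. The sketch below is only an illustration of that idea under stated assumptions. `InformerNotFoundError`, `IsInformerNotFound`, and `get` are hypothetical names, and kinds and namespaces are plain strings rather than `schema.GroupVersionKind` values.

```go
package main

import (
	"errors"
	"fmt"
)

// InformerNotFoundError is a hypothetical error type the cache could return
// when no existing informer is able to serve a request.
type InformerNotFoundError struct {
	Kind      string
	Namespace string
}

func (e *InformerNotFoundError) Error() string {
	return fmt.Sprintf("no informer found for kind %q in namespace %q", e.Kind, e.Namespace)
}

// IsInformerNotFound reports whether err, or any error it wraps,
// is an InformerNotFoundError.
func IsInformerNotFound(err error) bool {
	var target *InformerNotFoundError
	return errors.As(err, &target)
}

// get is a stand-in for a cache lookup. It fails with a wrapped
// InformerNotFoundError when no informer is registered for kind/namespace.
func get(known map[string]bool, kind, namespace string) error {
	if !known[kind+"/"+namespace] {
		return fmt.Errorf("get failed: %w", &InformerNotFoundError{Kind: kind, Namespace: namespace})
	}
	return nil
}

func main() {
	known := map[string]bool{"Pod/default": true}
	fmt.Println(IsInformerNotFound(get(known, "Pod", "default")))    // false
	fmt.Println(IsInformerNotFound(get(known, "Secret", "default"))) // true
}
```

With an approach like this, an operator author could treat a missing informer as a signal to create one, while handling a genuine "Not Found" from the API server separately.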