
Informers Based on RBAC

Release Signoff Checklist

  • Enhancement is implementable
  • Design details are appropriately documented from clear requirements
  • Test plan is defined
  • Graduation criteria for dev preview, tech preview, GA

Background

To learn more about the motivation behind Descoping Operators, please review:

Summary

As part of the Operator Framework's effort to move towards Descoped Operators, we must identify how an operator could be configured to reconcile events in specific namespaces based on the available Role Based Access Controls (RBAC). The Operator Framework ultimately hopes to fulfill the following problem statement:

As an operator author I want to develop an operator that can handle changing permissions so that cluster admins can use Role Based Access Controls (RBAC) to scope the permissions given to my operator.

Motivation

In an effort to support the descoped operator model, the Operator Framework will need to identify how a descoped operator can:

  • Manage its informers for resources it cares about.
  • Manage its cache as the operator gains/loses RBAC.

Goals

  • Document how an operator is "scoped" today
  • Document limitations with the existing approaches
  • Propose how we would like operators to be configured to resolve these limitations.

Non-Goals

  • Describe how the proposed solution could be backwards compatible with OLM.
  • Capture how Cluster Administrators could generate the operator's RBAC in a specific set of namespaces. Please review the oria operator documentation if you'd like to learn more about this problem space.

Proposal

Background

Operators built with Kubebuilder and the Operator SDK rely on Controller Runtime to build the operator's cache. Controller Runtime currently supports two forms of caches.

The cache used by the operator is selected via the NewCache option (of type cache.NewCacheFunc) passed to the NewManager function:

  • The Informer Cache is the default cache and can be configured to watch all namespaces or a single namespace. To configure it to watch a single namespace, set the options.Namespace parameter when creating a new Manager.
  • The Multi Namespace Cache is used when the NewCache option is set to the cache.MultiNamespacedCacheBuilder function. This cache can be configured to only set watches in a specific set of namespaces (see the configuration sketch below).
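
For reference, here is a minimal sketch of how these two caches are typically configured. It assumes controller-runtime APIs from roughly the v0.12 timeframe; the Namespace and NewCache option names may differ in other releases.

package main

import (
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/cache"
)

func newScopedManagers() error {
    cfg := ctrl.GetConfigOrDie()

    // Informer Cache (the default) restricted to a single namespace.
    if _, err := ctrl.NewManager(cfg, ctrl.Options{
        Namespace: "foo",
    }); err != nil {
        return err
    }

    // Multi Namespace Cache restricted to a fixed set of namespaces.
    if _, err := ctrl.NewManager(cfg, ctrl.Options{
        NewCache: cache.MultiNamespacedCacheBuilder([]string{"foo", "bar"}),
    }); err != nil {
        return err
    }

    return nil
}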

Caches represent an operator's view of the resources available on the cluster. When an operator attempts to interact with a resource, the cache will:

  • Populate itself with a list call.
  • Establish a watch to mirror changes made to the resources by outside entities.

This means that an operator always needs appropriate list and watch permissions when configuring the cache. "Appropriate" list and watch permissions are defined by each Cache.

In the case of the Informer Cache, the operator must have:

  • list and watch permissions for the GVK at the cluster level if not constrained to a single namespace
  • list and watch permissions for the GVK at the namespace level if configured to watch a single namespace

In the case of the Multi Namespace Cache, the operator must have list and watch permissions for the GVK of the resource in every namespace it is configured to watch.

This introduces a number of issues concerning the Operator Framework's Descoping Plan in which:

  • An operator reconciles events for an API it introduces across the cluster.
  • An operator is scoped to namespaces based on available RBAC. (Example: My Pod Operator watches its FooPod CRD across the cluster but can only get, list, watch, create, delete, update pods in the bar namespace).

Let's discuss the key issues below.

Key Issues with existing Caches

Scalability of creating namespaced watches

Many customers are uncomfortable with granting operators cluster wide list and watch permissions. As discussed earlier, the existing cache implementation relies on certain list and watch permissions to populate the cache.

The existing Multi Namespace Cache does not scale well. For each resource in the cache, a watch stream is established in the namespaces given to the Manager function at startup. This cache was designed to limit an operator to a small number of namespaces. The more namespaces that are used with this cache, the less performant it becomes.

POSSIBLE SOLUTION:

The number of watches we need to create could be limited by creating them only as needed. In this scenario, watches would be established when processing the CR; the operator could provide context to the cache regarding how to configure the watch. For example, if my CR specified that I needed to configure the cache for secrets in a specific namespace, the watch could be established when the CR is processed. Likewise, the watch could be removed when all CRs that rely on it are deleted.

Caches for Individual Resources

Neither cache supports watching specific resources within a namespace. Consider the scenario in which a CR allows a user to specify a single secret that the operator must watch or interact with. There are many instances where an operator may not need to watch all events related to a resource type in a namespace. For example, the ingress controller only establishes a watch on secrets specified in a CR; it does not receive events for other secrets in the namespace.

Scopes of Watched Resources must be consistent

Neither of Controller Runtime's cache implementations supports populating GVKs at different scopes. The caches are populated for each GVK either at the cluster level OR within the set of provided namespaces.

New Cache Proposal

In an effort to address the shortcomings surrounding the existing cache implementations, this document proposes the introduction of a Dynamic Cache.

Key features delivered by the new Cache include:

  • Establishing a watch for a GVK at the cluster level.
  • Establishing a watch for a GVK in specific namespaces.
  • Establishing a watch for a specific resource.
  • Removal of watches if no longer required.

These key features deviate heavily from the existing cache implementations, which assume that watches are set either at the cluster level or at the namespace level. If the new cache is designed to handle multiple ways to establish an informer, the onus must be placed on the Operator Author to specify how each watch should be configured.

Design Details

Removal of Watches

There is a need to remove watches when they are no longer necessary in order to reduce the load that each operator places on the API server. Consider a cluster with hundreds of operators whose watch requests the API server must fulfill: the strain placed on the API server is unacceptable when using the Multi Namespace Cache.

In instances where a watch is established by the Cache when reconciling a CR, there is a need to track when the watch can be discontinued. Today, this is done by placing a finalizer on the CR and having the operator clean it up.
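
Illustrative only (not the PoC's exact code): a common controller-runtime finalizer pattern for tearing down the watches created for a CR before the CR is allowed to disappear. The finalizer name and the stopWatches callback are hypothetical.

package controllers

import (
    "context"

    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// Hypothetical finalizer name used only for this sketch.
const watchCleanupFinalizer = "example.com/cleanup-watches"

func reconcileWatchFinalizer(ctx context.Context, c client.Client, obj client.Object, stopWatches func()) (ctrl.Result, error) {
    if obj.GetDeletionTimestamp().IsZero() {
        // The CR is live: ensure the finalizer is present so we get a chance to
        // clean up the watches created on its behalf.
        if !controllerutil.ContainsFinalizer(obj, watchCleanupFinalizer) {
            controllerutil.AddFinalizer(obj, watchCleanupFinalizer)
            return ctrl.Result{}, c.Update(ctx, obj)
        }
        return ctrl.Result{}, nil
    }

    // The CR is being deleted: stop the informers created for it, then remove
    // the finalizer so the deletion can complete.
    stopWatches()
    controllerutil.RemoveFinalizer(obj, watchCleanupFinalizer)
    return ctrl.Result{}, c.Update(ctx, obj)
}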

Establishing Informers while Reconciling a CR

Many of the benefits of establishing a watch based on the existence of a CR that requires it are consistent whether the watch is established at the cluster level, at the namespace level, or for a specific resource. These benefits include:

  • Establishing a watch with the API server only if needed.
  • The ability to remove a watch if no longer needed.

Establishing cluster wide informers

TODO

Establishing namespaced informers

Consider the following workflow with the Bar CR:

# The Bar CR
apiVersion: operators.io.operator-framework/v1
kind: Bar
metadata:
  name: sample
spec:
  secretsInNamespace: namespaceBar

  • Operator Foo reconciles the Bar CR
  • Operator Foo attempts to list the secrets that exist in the namespaceBar namespace.
  • Operator Foo's cache is not populated with secrets in the namespaceBar namespace.
  • The Dynamic Cache establishes a watch for all secrets in the namespaceBar namespace and populates the cache with a list.
  • Operator Foo is returned the result of the list request for secrets in the namespaceBar namespace.

The above scenario would only require list, watch, and get permissions for secrets in the namespaceBar namespace, so these permissions would not be required at the cluster level. The user could specify this in code like:

<INSERT EXAMPLE CODE HERE
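
Until that example is filled in, here is a purely hypothetical sketch of the intent. The DynamicCache interface and its WatchNamespace method are invented for illustration; they are not part of controller-runtime or the PoC today.

package controllers

import (
    "context"

    corev1 "k8s.io/api/core/v1"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// DynamicCache is a hypothetical extension of client.Reader that can lazily
// establish namespaced informers on request.
type DynamicCache interface {
    client.Reader

    // WatchNamespace starts (or reuses) informers for the given object type,
    // restricted to a single namespace and backed by namespaced list/watch RBAC.
    WatchNamespace(ctx context.Context, obj client.Object, namespace string) error
}

func reconcileSecretsInNamespace(ctx context.Context, c DynamicCache, namespace string) (*corev1.SecretList, error) {
    // Ask the cache to establish a namespaced watch for Secrets before reading.
    if err := c.WatchNamespace(ctx, &corev1.Secret{}, namespace); err != nil {
        return nil, err
    }

    // Subsequent reads are served from the newly populated namespaced informer.
    secrets := &corev1.SecretList{}
    if err := c.List(ctx, secrets, client.InNamespace(namespace)); err != nil {
        return nil, err
    }
    return secrets, nil
}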

NOTE: The current PoC implementation would allow for this, but it only allows informers to be created in one namespace per CR. For example, for a given CR you would only be able to create informers in either namespace foo or namespace bar, but not both.

Establishing an Informer for a Specific Resource

Consider the following workflow with the Bar CR:

# The Bar CR
apiVersion: operators.io.operator-framework/v1
kind: Bar
metadata:
  name: sample
spec:
  secret:
    name: secretBar
    namespace: namespaceBar

  • Operator Foo reconciles the Bar CR
  • Operator Foo attempts to get the namespaceBar/secretBar secret
  • Operator Foo's cache is not populated for the namespaceBar/secretBar secret.
  • The Dynamic Cache establishes a watch for just the namespaceBar/secretBar secret and populates the cache with a list.
  • Operator Foo is returned the result from the get request for the namespaceBar/secretBar secret

The above scenario would only require list, watch, and get permissions on the namespaceBar/secretBar secret, so these permissions would not be required at the cluster or namespace level. The user could specify this in code like:

<INSERT EXAMPLE CODE HERE
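
Again, a purely hypothetical sketch: the SingleObjectCache interface and its WatchObject method are invented for illustration to show a watch scoped to one named secret rather than a whole namespace.

package controllers

import (
    "context"

    corev1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/types"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// SingleObjectCache is a hypothetical cache that can restrict an informer to a
// single named object so only that object's events reach the cache.
type SingleObjectCache interface {
    client.Reader

    // WatchObject starts (or reuses) an informer restricted to the named object.
    WatchObject(ctx context.Context, obj client.Object, key types.NamespacedName) error
}

func getReferencedSecret(ctx context.Context, c SingleObjectCache) (*corev1.Secret, error) {
    key := types.NamespacedName{Namespace: "namespaceBar", Name: "secretBar"}

    // Establish the single-object watch before reading from the cache.
    if err := c.WatchObject(ctx, &corev1.Secret{}, key); err != nil {
        return nil, err
    }

    secret := &corev1.Secret{}
    if err := c.Get(ctx, key, secret); err != nil {
        return nil, err
    }
    return secret, nil
}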

NOTE: The current PoC implementation only has a way to create watches on a specific resource in the same namespace a CR is in.

A major benefit to this approach is that an operator can be granted RBAC on specific resources.

Scoping List/Watch RBAC for an API provided by the operator

TODO


[Bryce Palmer] - Proposed Changes (Add PoC details and demo from presentation document):

PoC Details as of 08/26/2022

The scoped-cache-poc is a Go library that has a cache implementation (ScopedCache) that satisfies the controller-runtime cache.Cache interface. This ScopedCache provides operator authors a dynamic caching layer that can be used to handle dynamically changing permissions.

As of now, this library is ONLY the caching layer. It is up to the operator author to implement the logic to update the cache and handle the permission changes appropriately.

How does it work?

The idea behind the ScopedCache is to create informers for resources as they are needed. This means:

  • Informers are only created when a CR is reconciled
  • Informers are only created for resources that are related to the CR
  • Informers only live as long as the corresponding CR is around. If the CR is deleted the corresponding informers should be stopped.

One assumption is that Operators will always need to watch for CRs they reconcile at the cluster level.

In order to accomplish this, the ScopedCache is composed of a couple of different caches (a rough sketch of these types follows the list):

  • A cache.Cache that is used for everything that should be cluster scoped
  • A NamespacedResourceCache that is a mapping of Namespace -> ResourceCache
    • A ResourceCache is a mapping of types.UID -> cache.Cache
      • The types.UID is the unique identifier of a given Kubernetes Resource
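
A rough sketch of this structure in Go (illustrative only; the actual type definitions in scoped-cache-poc may differ):

package scopedcache

import (
    "k8s.io/apimachinery/pkg/types"
    "sigs.k8s.io/controller-runtime/pkg/cache"
)

// ResourceCache maps the UID of the CR that triggered the watches to the
// cache.Cache holding the informers created on its behalf.
type ResourceCache map[types.UID]cache.Cache

// NamespacedResourceCache maps a namespace to the ResourceCaches within it.
type NamespacedResourceCache map[string]ResourceCache

// ScopedCache combines a cluster-scoped cache with per-CR namespaced caches.
type ScopedCache struct {
    // ClusterCache serves everything that should be cluster scoped,
    // e.g. the CRs the operator reconciles.
    ClusterCache cache.Cache

    // NamespacedCaches serve resources whose informers were created while
    // reconciling a specific CR.
    NamespacedCaches NamespacedResourceCache
}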

To properly use the ScopedCache, when reconciling a CR you would need to create the corresponding watches. The workflow for creating these watches would look like:

  • Create a new cache.Cache with options that scope the cache's view to only objects created or referenced when reconciling the given CR
  • Add the cache.Cache above to the ScopedCache:
    • Internally, the ScopedCache will create the correct mapping of Namespace -> Resource -> cache.Cache for a given CR and cache.Cache
  • Get/Create necessary informers for the CR from the ScopedCache
  • Configure informers to handle changed permissions
  • Start the cache.Cache that corresponds to the CR being reconciled
  • Use controller-runtime utility functions to create watches with the informers from the ScopedCache

Because adding caches for a CR to the ScopedCache is a deliberate process, any request made to the ScopedCache before any ResourceCaches have been created is assumed to be intended for the cluster-scoped cache.Cache.

Demonstration

This demonstration is an adapted version of the Operator SDK Memcached Operator Tutorial. It utilizes the scoped-cache-poc library and the ScopedCache cache implementation, enabling the Memcached operator to handle dynamically changing permissions.

The source code for this operator can be found on GitHub here: https://github.com/everettraven/scoped-operator-poc/tree/test/scoped-cache

NOTE: When using the scoped-operator-poc repository, make sure to use the test/scoped-cache branch.

Logic Flow of Operator

[Flowchart of the operator's reconcile logic; only the node labels are recoverable from this export: Request; Get Memcached CR; Deleted?; Remove Cache for Memcached CR; Remove Finalizer; Cache for Memcached CR exists?; Create Cache & Informers for Memcached CR; Get Deployment for Memcached CR; Error?; NotFound?; Forbidden?; Create Deployment for Memcached; Stop Informers && Remove Cache; Log Error; Ensure Deployment Replicas == Memcached Size; List Pods for Memcached CR; Update Memcached Status.Nodes with Pods.]

Step 1

Run setup.sh to:

  • Delete existing KinD cluster
  • Create a new KinD cluster
  • Apply RBAC to give the following cluster level permissions:
    • create, delete, list, watch, patch, get, and update permissions for memcacheds resources
    • update permissions for memcacheds/finalizers sub-resource
    • get, patch, update permissions for memcacheds/status sub-resource
  • Create namespaces allowed-one, allowed-two, denied
  • Apply RBAC to give create, delete, get, list, patch, update, and watch permissions for deployments resources in the allowed-one and allowed-two namespaces
  • Apply RBAC to give get, list, and watch permissions for pods resources in the allowed-one and allowed-two namespaces

Step 2

Run redeploy.sh to:

  • Remove any existing deployments of the operator from the cluster
  • Build the image for the operator
  • Load the built image to the KinD cluster
  • Deploy the operator on the cluster
  • List the pods in the scoped-memcached-operator-system namespace so we can easily copy the pod name when we look at the operator logs

Step 3

Get the logs by running:

kubectl -n scoped-memcached-operator-system logs <pod-name>

We should see that the operator has started successfully:

1.6608304693692005e+09  INFO    controller-runtime.metrics      Metrics server is starting to listen    {"addr": "127.0.0.1:8080"}
1.660830469369386e+09   INFO    setup   starting manager
1.6608304693696425e+09  INFO    Starting server {"kind": "health probe", "addr": "[::]:8081"}
1.6608304693696718e+09  INFO    Starting server {"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:8080"}
I0818 13:47:49.369663       1 leaderelection.go:248] attempting to acquire leader lease scoped-memcached-operator-system/86f835c3.example.com...
I0818 13:47:49.373329       1 leaderelection.go:258] successfully acquired lease scoped-memcached-operator-system/86f835c3.example.com
1.6608304693733532e+09  DEBUG   events  Normal  {"object": {"kind":"Lease","namespace":"scoped-memcached-operator-system","name":"86f835c3.example.com","uid":"ac82b4f7-a193-4d52-864b-315e1fc80ce1","apiVersion":"coordination.k8s.io/v1","resourceVersion":"564"}, "reason": "LeaderElection", "message": "scoped-memcached-operator-controller-manager-7b4c9bb485-7jlkh_666ab6d0-e194-43c8-8348-724608e1521e became leader"}
1.6608304693735454e+09  INFO    Starting EventSource    {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "source": "kind source: *v1alpha1.Memcached"}
1.6608304693736055e+09  INFO    Starting Controller     {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached"}
1.660830469473995e+09   INFO    Starting workers        {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "worker count": 1}

Step 4

Create some Memcached resources in the allowed-one and allowed-two namespaces by running:

kubectl apply -f config/samples/cache_v1alpha1_memcached.yaml

Step 5

Get the logs again:

kubectl -n scoped-memcached-operator-system logs <pod-name>

For each CR, we should see logs signifying that:

  • A cache has been created for the Memcached CR
  • Two event sources (watches) have been started (one for Deployments created by the controller and one for Pods created from those Deployments)
  • An attempt was made to get a Deployment
  • A Deployment was created

The logs should look similar to:

1.6608306977001324e+09  INFO    Creating cache for memcached CR {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-allowed-one","namespace":"allowed-one"}, "namespace": "allowed-one", "name": "memcached-sample-allowed-one", "reconcileID": "188d52ee-593c-4716-980f-b4fe500bdb6c", "CR UID:": "b2427753-f092-4e22-a633-4d39bea7a0c4"}
1.660830697700437e+09   INFO    Starting EventSource    {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "source": "informer source: 0xc0000a4640"}
1.6608306978010592e+09  INFO    Starting EventSource    {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "source": "informer source: 0xc0000a4780"}
1.6608306978010962e+09  INFO    Getting Deployment      {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-allowed-one","namespace":"allowed-one"}, "namespace": "allowed-one", "name": "memcached-sample-allowed-one", "reconcileID": "188d52ee-593c-4716-980f-b4fe500bdb6c"}
1.660830697801139e+09   INFO    Creating a new Deployment       {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-allowed-one","namespace":"allowed-one"}, "namespace": "allowed-one", "name": "memcached-sample-allowed-one", "reconcileID": "188d52ee-593c-4716-980f-b4fe500bdb6c", "Deployment.Namespace": "allowed-one", "Deployment.Name": "memcached-sample-allowed-one"}
1.6608306978104348e+09  INFO    Creating cache for memcached CR {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-allowed-two","namespace":"allowed-two"}, "namespace": "allowed-two", "name": "memcached-sample-allowed-two", "reconcileID": "9bdb2035-db34-412b-bf2f-1496df56c134", "CR UID:": "690a5864-af57-4e6a-a75a-efce861d09cf"}
1.660830697810686e+09   INFO    Starting EventSource    {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "source": "informer source: 0xc0001b0e60"}
1.6608306978107378e+09  INFO    Starting EventSource    {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "source": "informer source: 0xc0001b10e0"}
1.6608306978107502e+09  INFO    Getting Deployment      {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-allowed-two","namespace":"allowed-two"}, "namespace": "allowed-two", "name": "memcached-sample-allowed-two", "reconcileID": "9bdb2035-db34-412b-bf2f-1496df56c134"}
1.6608306978107738e+09  INFO    Creating a new Deployment       {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-allowed-two","namespace":"allowed-two"}, "namespace": "allowed-two", "name": "memcached-sample-allowed-two", "reconcileID": "9bdb2035-db34-412b-bf2f-1496df56c134", "Deployment.Namespace": "allowed-two", "Deployment.Name": "memcached-sample-allowed-two"}

As the deployments are spun up and reconciled, they may be modified. This operator sets ownership on the deployments it creates and will reconcile the parent Memcached CR whenever a child deployment is modified (the ownership wiring is sketched after these example logs). You may see a chunk of logs similar to the following (truncated to a couple of log lines for brevity):

1.6608307072214928e+09  INFO    Getting Deployment      {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-allowed-one","namespace":"allowed-one"}, "namespace": "allowed-one", "name": "memcached-sample-allowed-one", "reconcileID": "fa336c14-f699-4f70-89d0-37631770441f"}
1.660830707233768e+09   INFO    Getting Deployment      {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-allowed-two","namespace":"allowed-two"}, "namespace": "allowed-two", "name": "memcached-sample-allowed-two", "reconcileID": "644fd96b-346b-47b8-8c36-784e1741bbbb"}
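
For context, this parent/child behavior relies on the standard Kubebuilder ownership wiring from the Memcached tutorial, sketched below. The import path for the Memcached API types is hypothetical and the reconcile logic is elided.

package controllers

import (
    "context"

    appsv1 "k8s.io/api/apps/v1"
    "k8s.io/apimachinery/pkg/runtime"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"

    cachev1alpha1 "example.com/scoped-memcached-operator/api/v1alpha1" // hypothetical module path
)

type MemcachedReconciler struct {
    client.Client
    Scheme *runtime.Scheme
}

func (r *MemcachedReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    // Reconcile logic elided; see the tutorial and the PoC repository.
    return ctrl.Result{}, nil
}

// Marking the Memcached CR as the controller owner of the Deployment it creates...
func (r *MemcachedReconciler) setOwnership(memcached *cachev1alpha1.Memcached, dep *appsv1.Deployment) error {
    return ctrl.SetControllerReference(memcached, dep, r.Scheme)
}

// ...combined with Owns(&appsv1.Deployment{}) means any change to an owned
// Deployment enqueues a reconcile request for its parent Memcached CR.
func (r *MemcachedReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&cachev1alpha1.Memcached{}).
        Owns(&appsv1.Deployment{}).
        Complete(r)
}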

Step 6

Check the namespaces to see that the proper deployments are created:

kubectl -n allowed-one get deploy

Output should look like:

NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
memcached-sample-allowed-one   2/2     2            2           13m
kubectl -n allowed-two get deploy

Output should look like:

NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
memcached-sample-allowed-two   3/3     3            3           14m

Step 7

Let's see what happens when we create a Memcached CR in a namespace where the operator does not have the proper permissions.

Create a Memcached CR in the namespace denied by running:

cat << EOF | kubectl apply -f -
apiVersion: cache.example.com/v1alpha1
kind: Memcached
metadata:
  name: memcached-sample-denied
  namespace: denied
spec:
  size: 1
EOF

Check the logs; we should see:

1.6608487955810938e+09	INFO	Creating cache for memcached CR	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "865240aa-1eac-48d0-9a64-56c2eec66b88", "CR UID:": "b49142b4-cc50-4465-969c-7257049247b6"}
1.660848795581366e+09	INFO	Starting EventSource	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "source": {}}
1.6608487955813868e+09	INFO	Starting EventSource	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "source": {}}
1.660848795581394e+09	INFO	Getting Deployment	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "865240aa-1eac-48d0-9a64-56c2eec66b88"}
1.6608487971761699e+09	INFO	Not permitted to get Deployment	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "865240aa-1eac-48d0-9a64-56c2eec66b88"}
1.6612814011258633e+09  INFO    Removing cache for memcached CR due to invalid permissions      {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "ba671951-e2da-4b2b-87d9-9b0667f0c608"}
1.6612814011259036e+09  INFO    Removing ResourceCache  {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "ba671951-e2da-4b2b-87d9-9b0667f0c608", "CR UID:": "9f60a7b1-dee6-4b6c-a729-420ec651c0dc", "ResourceCache": {"9f60a7b1-dee6-4b6c-a729-420ec651c0dc":{"Scheme":{}}}}
1.6612814011259542e+09  INFO    ResourceCache successfully removed      {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "ba671951-e2da-4b2b-87d9-9b0667f0c608", "CR UID:": "9f60a7b1-dee6-4b6c-a729-420ec651c0dc", "ResourceCache": {}}

We can see we are also removing any caches that have been created for this Memcached CR to prevent unnecessary informers from hanging around.

Checking the Memcached CR with kubectl -n denied describe memcached, we can see the status:

Status:
  State:
    Message:  Not permitted to get Deployment: deployments.apps "memcached-sample-denied" is forbidden: Not permitted based on RBAC
    Status:   Failed

Step 8

Update the RBAC to give permissions to the denied namespace by running:

cat << EOF | kubectl apply -f - 
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: op-rolebinding-default
  namespace: denied
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: scoped-operator-needs
subjects:
- kind: ServiceAccount
  name: scoped-memcached-operator-controller-manager
  namespace: scoped-memcached-operator-system
EOF

After a little bit of time we should see in the logs:

1.66085439100725e+09	INFO	Creating a new Deployment	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "8bfc654a-e372-47c8-9be5-2cf89f654c34", "Deployment.Namespace": "denied", "Deployment.Name": "memcached-sample-denied"}
1.66085439102921e+09	INFO	Getting Deployment	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "0bbc7b20-f392-45dc-a210-0d10ec58ff34"}
1.660854392647686e+09	INFO	Getting Deployment	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "610f80fc-9d7e-46b1-ac25-5e5286fa97d2"}

We can see in the Memcached CR status that it has been successfully reconciled:

Status:
  Nodes:
    memcached-sample-denied-7685b99f49-tv2b8
  State:
    Message:  Deployment memcached-sample-denied successfully created
    Status:   Succeeded

We can also see that the deployment is up and running by running kubectl -n denied get deploy:

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
memcached-sample-denied   1/1     1            1           71s

Step 9

Now let's restrict access again by deleting the RBAC we applied to give permissions in the denied namespace:

kubectl -n denied delete rolebinding op-rolebinding-default

This change won't affect the existing Memcached CR since it has already been reconciled, but if we edit the existing Memcached CR or create a new one in the denied namespace we will see these logs start to pop up again:

1.6608546666716454e+09	INFO	Getting Deployment	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "b88c4f4f-4885-4bd4-a706-dc6b888dbca7"}
1.6608546666883416e+09	INFO	Not permitted to get Deployment	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "b88c4f4f-4885-4bd4-a706-dc6b888dbca7"}

The Memcached CR status will again look like:

Status:
  Nodes:
    memcached-sample-denied-7685b99f49-tv2b8
  State:
    Message:  Not permitted to get Deployment: deployments.apps "memcached-sample-denied" is forbidden: Not permitted based on RBAC
    Status:   Failed

In this example, I edited the existing Memcached CR to kick off the reconciliation loop, which is why the Status.Nodes field is still populated.

Another thing to note in this case: if there is no reason for the reconciliation loop to run in the denied namespace, the existing watches won't be cleaned up. Eventually the watches will attempt to refresh and will encounter a WatchError because the permissions have been revoked. If not handled properly, this will cause the operator to enter a blocking loop where it continuously attempts to reconnect the watch.

In this Operator, when creating informers we inject our own WatchErrorHandler that will close the channel used by the informers to stop them. We then remove the ResourceCache that did not have the proper permissions so that when we reconcile a CR in that namespace again, we attempt to recreate the informers in the event RBAC has changed. This handling of the WatchError prevents the blocking loop of continuously attempting to reconnect the watch.

In the Operator logs, this process looks like:

W0823 19:34:02.520559       1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.24.3/tools/cache/reflector.go:167: failed to list *v1.Deployment: deployments.apps is forbidden: User "system:serviceaccount:scoped-memcached-operator-system:scoped-memcached-operator-controller-manager" cannot list resource "deployments" in API group "apps" in the namespace "denied"
1.661283242520624e+09   INFO    Removing resource cache for memcached resource due to invalid permissions     {"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "967b9ba2-14f8-4a1b-bc14-a2802940d4a4", "memcached": {"apiVersion": "cache.example.com/v1alpha1", "kind": "Memcached", "namespace": "denied", "name": "memcached-sample-denied"}}
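
A minimal sketch of that kind of WatchErrorHandler (not the PoC's exact code; it assumes the stop channel is the same one used to run the informer, and client-go's SetWatchErrorHandler must be called before the informer is started):

package scopedcache

import (
    apierrors "k8s.io/apimachinery/pkg/api/errors"
    toolscache "k8s.io/client-go/tools/cache"
)

// forbiddenWatchErrorHandler returns a client-go WatchErrorHandler that shuts an
// informer down (by closing its stop channel) when its list/watch permissions
// have been revoked, instead of letting it retry forever.
func forbiddenWatchErrorHandler(stopCh chan struct{}) toolscache.WatchErrorHandler {
    return func(r *toolscache.Reflector, err error) {
        if apierrors.IsForbidden(err) {
            select {
            case <-stopCh:
                // Already closed; nothing to do.
            default:
                close(stopCh)
            }
        }
    }
}

// Usage (sketch): register the handler before starting the informer; the cache's
// ResourceCache entry would then be removed so a later reconcile can recreate it.
//   _ = informer.SetWatchErrorHandler(forbiddenWatchErrorHandler(stopCh))
//   go informer.Run(stopCh)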

Step 10

Now let's delete the Memcached CR from the denied namespace entirely by running:

kubectl -n denied delete memcached memcached-sample-denied

Because the operator utilizes finalizers, our resource should not be deleted until the finalizer is removed. As part of the finalizer logic, we remove the cache for the Memcached CR that is being deleted. We should see in the logs:

1.6608559969812284e+09	INFO	Memcached is being deleted	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "a186a050-8ad7-4c92-855c-de59d7b371ea"}
1.6608559969812474e+09	INFO	Removing ResourceCache	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "a186a050-8ad7-4c92-855c-de59d7b371ea", "CR UID:": "eda9cac4-c3c6-4da1-b920-f374748d40cb", "ResourceCache": {"eda9cac4-c3c6-4da1-b920-f374748d40cb":{"Scheme":{}}}}
1.6608559969813137e+09	INFO	ResourceCache successfully removed	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "a186a050-8ad7-4c92-855c-de59d7b371ea", "CR UID:": "eda9cac4-c3c6-4da1-b920-f374748d40cb", "ResourceCache": {}}
1.660855996986491e+09	INFO	Memcached resource not found. Ignoring since object must be deleted	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "aa1eef8d-90ac-428c-b434-d76abdcf167b"}
1.6608560039957695e+09	INFO	Memcached resource not found. Ignoring since object must be deleted	{"controller": "memcached", "controllerGroup": "cache.example.com", "controllerKind": "Memcached", "memcached": {"name":"memcached-sample-denied","namespace":"denied"}, "namespace": "denied", "name": "memcached-sample-denied", "reconcileID": "f51f0c62-0237-468b-8e55-7a6d03a0d400"}


Drawbacks

PoC Limitations as of 08/26/2022

  • Limitation: controller-runtime is not designed in a way that easily enables this functionality. Workarounds had to be created to properly implement this logic.
    Potential Solution: Work with controller-runtime to implement changes that limit the need for workarounds and improve the user experience when it comes to creating scopeable operators.
  • Limitation: The library is only a caching layer. It is up to Operator Authors to implement the more complex logic needed to properly update the cache and handle changing permissions.
    Potential Solution: Provide a higher-level library that gives Operator Authors a better user experience when developing operators that need to handle changing permissions.
  • Limitation: Currently, when using the scoped-cache-poc library, watches are recreated multiple times for the same resources due to the way informers are created. The informers returned by the ScopedCache are currently an aggregate of all informers available to the operator within the NamespacedResourceCache.
    Potential Solution: Provide a way to get informers from only a specific ResourceCache, enabling watches to be created only for the informers from that ResourceCache.

Alternatives

Dynamically configuring watches based on RBAC at startup

The Operator Framework team explored a solution in which watches were established for the Multi Namespace Cache based on available RBAC, but this approach was abandoned due to performance issues and guidance from David Eads.

In this approach, an operator would perform a series of SelfSubjectAccessReview requests to understand:

  • Which GVKs the operator could establish watches for at the cluster level
  • Which GVKs the operator could establish watches for at the namespace level (a sketch of a single check follows this list)
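
A minimal sketch of a single such check using client-go's AuthorizationV1 API (the clientset is assumed to already exist; an empty namespace checks the permission at the cluster level):

package main

import (
    "context"

    authorizationv1 "k8s.io/api/authorization/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// canWatch reports whether the operator's service account may watch the given
// resource in the given namespace (or cluster-wide when namespace is empty).
func canWatch(ctx context.Context, cs kubernetes.Interface, group, resource, namespace string) (bool, error) {
    ssar := &authorizationv1.SelfSubjectAccessReview{
        Spec: authorizationv1.SelfSubjectAccessReviewSpec{
            ResourceAttributes: &authorizationv1.ResourceAttributes{
                Group:     group,
                Resource:  resource,
                Verb:      "watch",
                Namespace: namespace,
            },
        },
    }

    resp, err := cs.AuthorizationV1().SelfSubjectAccessReviews().Create(ctx, ssar, metav1.CreateOptions{})
    if err != nil {
        return false, err
    }
    return resp.Status.Allowed, nil
}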

The operator would use this information to configure watches for the cache accordingly. Some drawbacks to this approach included:

  • On startup, the operator would spam the cluster with SelfSubjectAccessReview requests for each GVK that needed to be watched. This presented a performance hit on large clusters with thousands of namespaces.
  • The operator would establish a watch for each GVK in each namespace where it had permissions. This presented the same scalability issue seen in the Multi Namespace Cache.
  • The operator would need to be restarted in order to re-establish the watches in the cache whenever the operator's RBAC changed.

Ultimately, David Eads concluded that there was no feasible way for the API to support scoped operators at scale, and encouraged us to investigate an approach where:

  • Watches are driven by CRs when possible.
    • Configured "just in time".
    • Deleted when no longer needed.