**Pull Design**
- Admins "register" their clusters with Quay by providing a `ServiceAccount` token with the ability to read/watch all `Pods` on their cluster
**Questions**
- WebSocket endpoint in Quay to get realtime updating stream of live images on clusters?
**Getting live images out of the cluster:**
- Fetching all pods and searching their `spec.containers` requires downloading _all the JSON_ (no server-side filtering), which is unbounded in size and can easily run into the MBs
- Should `quay-bridge-operator` just label `Pods` with the image(s) they are running? (see the sketch after this list)
- Advantage: no new CRD needed
- Disadvantages: requires write access to `Pods`, and users could delete the labels
- `kubectl get pods --all-namespaces -l <quay-host>/<namespace>/<repo>/<tag>`
- Should `quay-bridge-operator` watch `Pods` and aggregate into CRs for easier querying by Quay?
- Could also be made read-only by using an aggregated apiserver (more work)
- Disadvantage: just duplicating info that is already in the k8s API (on `Pods`, but not easily queryable due to large JSON response)
- Could also use labels for server-side filtering
- `kubectl get liveimages --all-namespaces -l <quay-host>/<namespace>/<repo>/<tag>`
- Should `quay-bridge-operator` watch `Pods` and expose its own API (behind a `Service`) for easier querying by Quay?
- Advantage: could be a generic endpoint that other orchestration tools implement to get the same features (do we care?)
- Disadvantage: requires setting up a `Service` and `Ingress` (or using the Kubernetes API server's service proxy)
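As a rough illustration of the labeling option above, here is a minimal sketch using the official `kubernetes` Python client. The `quay.example.com` label prefix and the `sanitize` helper are made up for this sketch; note that a real label key allows only a single `/` (prefix/name) and a 63-character name, so the full `<quay-host>/<namespace>/<repo>/<tag>` reference would need to be encoded into something shorter.

```python
from kubernetes import client, config, watch

LABEL_PREFIX = "quay.example.com"  # hypothetical label prefix, just for this sketch


def sanitize(image_ref):
    """Squash an image reference into a legal label name (alphanumerics, '-', '_', '.')."""
    return image_ref.replace("/", ".").replace(":", "_").replace("@", "_")[-63:]


def label_pods_with_images():
    config.load_incluster_config()  # quay-bridge-operator runs in-cluster
    core = client.CoreV1Api()
    for event in watch.Watch().stream(core.list_pod_for_all_namespaces):
        pod = event["object"]
        labels = {f"{LABEL_PREFIX}/{sanitize(c.image)}": "live" for c in pod.spec.containers}
        # Strategic-merge patch: only adds/updates the image labels on the Pod.
        core.patch_namespaced_pod(
            pod.metadata.name,
            pod.metadata.namespace,
            {"metadata": {"labels": labels}},
        )
```

Quay-side querying then reduces to a server-side label selector, i.e. the `kubectl get pods --all-namespaces -l ...` form above.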
**Preventing deletion of live tags**
- Create a decorator on the `DELETE /v1/repository/<apirepopath:repository>/tag/<tag>` endpoint which checks, via Kubernetes API calls, whether the given tag is running on any of the registered clusters (see the sketch below)
- Concern: this could significantly slow down the operation, but tag deletion is infrequent and not latency-critical
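A hedged sketch of that decorator, assuming a Flask-style handler that receives `repository` and `tag` as in the route above; `list_registered_clusters` and `tag_is_running_on` are hypothetical helpers, not existing Quay code.

```python
from functools import wraps

from flask import abort


def block_deletion_of_live_tags(handler):
    @wraps(handler)
    def wrapper(self, repository, tag, **kwargs):
        for cluster in list_registered_clusters():  # hypothetical: all registered clusters
            # One Kubernetes API round trip per cluster -- the latency concern noted above.
            if tag_is_running_on(cluster, repository, tag):  # hypothetical helper
                abort(409, "Tag '%s' is running on cluster '%s'" % (tag, cluster.name))
        return handler(self, repository, tag, **kwargs)

    return wrapper
```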
**Viewing clusters which have a given tag running on them**
- Add a query parameter to `GET /v1/repository/<apirepopath:repository>/tag/<tag>` to include this data in an extra field?
- Add new endpoint for retrieving this data?
- Ideally this data should update in real time (e.g. over WebSockets); one possible response shape is sketched below
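Purely illustrative, one possible shape for the query-parameter option; `get_tag_details` and `clusters_running_tag` stand in for Quay internals that do not exist under these names.

```python
from flask import jsonify, request


def get_tag(repository, tag):
    payload = get_tag_details(repository, tag)  # assumed helper returning the usual tag data
    if request.args.get("include_clusters") == "true":
        # Extra field listing registered clusters currently running this tag.
        payload["clusters"] = [
            {"name": c.name, "api_endpoint": c.api_endpoint}
            for c in clusters_running_tag(repository, tag)
        ]
    return jsonify(payload)
```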
### Registering Clusters
- Start with adding k8s clusters at the entire registry level (Quay instance)
- Later individual organizations, users, and teams can add clusters using the same UX
- Create `KubernetesClusterAccess` database model
- Joey: We might want to use a slightly different name, since if these are going to have permissions on them, `KubernetesClusterAccessPermission` reads oddly.
- Contains cluster name (and other metadata), API endpoint, and auth token (later encrypted)
- Initially, restrict one instance per cluster using `UNIQUE` index on `api_endpoint`
- Question: Performance implications of a migration which _removes_ a `UNIQUE` index?
- Joey: Probably okay, but unless we require it early on, let's just check it client side
- Question: How does access to a cluster relate to individual Quay users?
- `KubernetesClusterAccess` has a foreign key field to the specific user who created it?
- New table (`UserKubernetesClusterAccess` or something) which links a `User` to a `KubernetesClusterAccess`
- Joey: I recommend we have every `KubernetesClusterAccess` have an `owner` field pointing to the namespace that owns the cluster (and `NULL` for registry-wide). Then we define a new table called `KubernetesClusterPermission` with columns `cluster_id`, `team_id` and `user_id` (similar to how we do for repositories), with the `team_id` or `user_id` indicating permission for that team/user. Registry-wide ones would have no such entries. (See the model sketch after this list.)
- New internal Quay CRUD API for managing `KubernetesClusterAccess` objects
- Eventually UI for adding clusters
- Even more eventuallier OAuth flow with refresh tokens
- All Kubernetes calls happen in the browser client app using encrypted token(s)
- Add new endpoint to NGINX (like `/k8s/<cluster-endpoint>/<some-k8s-api-call>`)
- `proxy_pass` to the actual cluster API endpoint
- Use `auth_request` to decrypt token (same as `/_storage_proxy/` endpoint)
- For more security, maybe whitelist only certain k8s resources in the NGINX proxy
- Joey: :+1:
- Send along a JWT to the browser containing all the tokens the user has permission to use
- Super useful NGINX module --> http://nginx.org/en/docs/http/ngx_http_auth_jwt_module.html
- Low expiration time and whitelisted k8s paths
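A sketch of the two tables Joey describes above, in the peewee style Quay's models use; field names and the `data.database` imports are assumptions, not final definitions.

```python
from peewee import CharField, ForeignKeyField

from data.database import BaseModel, Team, User  # assumed existing Quay models


class KubernetesClusterAccess(BaseModel):
    name = CharField()
    api_endpoint = CharField(unique=True)  # one registration per cluster, enforced here initially
    encrypted_token = CharField()  # ServiceAccount token; encrypted at rest later
    # NULL owner == registered registry-wide (by a superuser)
    owner = ForeignKeyField(User, null=True)


class KubernetesClusterPermission(BaseModel):
    cluster = ForeignKeyField(KubernetesClusterAccess)
    # Exactly one of team/user is set, mirroring how repository permissions work
    team = ForeignKeyField(Team, null=True)
    user = ForeignKeyField(User, null=True)
```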
# Brainstorming Notes
### 5/7/2020
#### Goals
1. Determine how k8s clusters get registered with Quay
- Should `KubernetesClusterAccess` have a `user` field?
- Yes, they need an `owner`
- No, have a separate table that maps users<->cluster access
- Superuser-defined whitelist of clusters (feature flag)
- Two modes: open (register access at namespace level), and closed (superuser-defined whitelist)
2. Determine first feature that takes advantage of cluster access
- When looking at a tags list, show all clusters running this tag
- When deleting a tag, warn me if it is running on clusters
#### Action Items
- [x] Close https://github.com/quay/quay/pull/300
- [ ] Start a spike branch off `python3` working on the NGINX proxy to identify roadblocks early
# Thought
What if we made a dedicated section in the OpenShift console for the Quay registry?
### Pros
- Reflected by some CRD instance (like `QuayIntegration`)
- No more need to deal with Quay/k8s auth or proxy (just use Quay API from k8s)
- Can use Quay webhook notifications to some cluster `Service` for some things (would need to add more supported events to Quay)
- Integration of OpenShift user management with Quay RBAC
- Easily add users/teams/orgs to Quay from existing OpenShift users
- "Single pane of glass" (ugh)
### Cons
- Doesn't help with "warn when deleting tag"
- Potentially a lot of hitting the Quay API
- How does RBAC work?
- In multicluster, which UI do you use to interact with a single Quay?
- Could interfere with hopes of building shiny new Quay UI
# User Stories
## Personas
**Cluster Admin**: In charge of one or more Kubernetes clusters with root access.
**OpenShift User**: Access to one or more namespaces on a cluster.
**Quay Admin**: In charge of one Quay registry, with superuser access.
**Quay User**: Member of one or more teams in a Quay registry.
## Stories
**Initiative**: As a cluster admin, I want more *insight*, *control*, and *security* regarding the container images running on my clusters in order to enhance the experience for my developers and ensure the reliability of the workloads on my clusters.
**Story**: As a Quay user, I need to know if deleting a tag will affect any running pods in order to prevent breaking my Kubernetes workloads.
**Story**: As an OpenShift admin, I want to prevent container images with vulnerabilities from running on my clusters in order to ensure the security of my cluster workloads.
**Story**: As an OpenShift user, I want to see if a newer tag with no vulnerabilities exists in order to easily update my pod's image and keep my workloads secure.
**Story**: As an OpenShift user, I want to be able to revert my deployments to use an older version of a tag in order to have more confidence when pulling tags in production.
**Story**: As an OpenShift user, I need to know if the tag my pod is running will expire soon in order to prevent downtime from failing to pull the tag.
**Story**: As an OpenShift admin, I want to know what container images are running on my clusters in order to have a better understanding of the workloads on my clusters.
---
**Epic**: As an OpenShift user, I want the build system to push and pull my container images to a registry with a graphical user interface in order to have a delightful, container-native development experience.
**Story**: As an OpenShift user, I want to know when my teammate pushes a new tag to our repository in order to speed up my development cycle.
**Story**: As an OpenShift user, I want to prevent creating a pod which runs a tag that does not exist, in order to reduce headaches from dumb mistakes.
**Story**: As an OpenShift user, I want to know if the tag my container is running has changed or been deleted in order to know that I need to update it or restore the tag.
**Story**: As an OpenShift user, I want to know whether the container image I am about to use has any vulnerabilities _before_ creating a pod in order to keep SecOps from yelling at me.
**Story**: As an OpenShift user, I want to see exactly which environment variables, entrypoints, ports, and mountable volumes a container image has _before_ creating a pod in order to avoid misconfiguration.
**Story**: As an OpenShift user, I want to be able to specify that a tag should be seeded into the cluster as soon as it's pushed in order to speed up my development experience.
**Story**: As an OpenShift user, I want a link to the container image repository when looking at a pod in order to understand more about the container image.
---
**Epic**: As an OpenShift admin, I want unified role-based access control between users of my clusters and my container registry in order to easily manage the security of my cluster workloads.
**Story**: As an OpenShift user, I want to authenticate with a Quay registry using my OpenShift credentials in order to not have to remember another password.
**Story**: As an OpenShift user, I want to know that a pull secret is present for the container image I am about to use _before_ creating a pod in order to reduce headaches from dumb mistakes.
**Story**: As an OpenShift user, I want to be able to choose a robot account to use as a pull secret when creating a pod in order to avoid the extra step of creating one manually.
**Story**: As a Quay user, I need to know if deleting a robot account will affect any of my running pods in order to prevent breaking my Kubernetes workloads.