# Design: bounded service account token support for cert-manager Vault issuers

**Status:** this is a design draft created by Maël Valais <mael@vls.dev> on 24 January 2022 and updated on 27 July 2022. A consensus around "solution 1" was reached during [the biweekly meeting on 27 July 2022](https://docs.google.com/document/d/1Tc5t6ylY9dhXAan1OjOoldeaoys1Yh4Ir710ATfBa5U/edit#bookmark=kix.21vhju66r6wp). In September 2022, this page will be adapted to fit the [cert-manager proposal template](https://github.com/cert-manager/cert-manager/blob/master/design/template.md) and then copied over to the [`design/`](https://github.com/cert-manager/cert-manager/blob/master/design/) folder.

Issues:

- [Make it possible to use a projected service account token to the Vault Issuer instead of a service account Secret](https://github.com/jetstack/cert-manager/issues/4144) which is implemented in [The Vault issuer can now be given a serviceAccountRef (PR 5502)](https://github.com/jetstack/cert-manager/pull/5502).
- [Vault Service Account auth via ambient credentials](https://github.com/jetstack/cert-manager/issues/4733) which is implemented in [Add possibility to use ambient credentials for login to vault](https://github.com/jetstack/cert-manager/pull/4734).

Affected users:

- Neil Witts (Morgan Stanley)

## The Problems

<div id="problem-1"></div>

### Problem 1: Security risk of static tokens

The Vault issuer relies on a static token. That static token is the token automatically created and stored in a Secret resource for each Kubernetes service account. By "static", we mean that the token does not have an expiry time.

For example, let us imagine that you have created the service account `vault-sa` for use with cert-manager.
Kubernetes has automatically created the Secret resource `vault-sa-token-hvwsb`, which we use in the issuer manifest:

```yaml
apiVersion: cert-manager.io/v1
kind: Issuer
spec:
  vault:
    auth:
      kubernetes:
        secretRef:
          name: vault-sa-token-hvwsb # ❌
          key: token
```

This token is static because its JWT payload does not contain the `exp` field:

```yaml
# kubectl get secret vault-sa-token-hvwsb -ojsonpath='{.data.token}' | base64 -d | step crypto jwt inspect --insecure
{
  "payload": {
    "iss": "kubernetes/serviceaccount",
    "kubernetes.io/serviceaccount/namespace": "default",
    "kubernetes.io/serviceaccount/secret.name": "vault-sa-token-hvwsb",
    "kubernetes.io/serviceaccount/service-account.name": "vault-sa",
    "kubernetes.io/serviceaccount/service-account.uid": "ca3227e0-a5fc-4ac6-9167-2bb665be4af3",
    "sub": "system:serviceaccount:default:vault-sa"
  }
}
```

Using this static token is insecure for two reasons: (1) the token never expires, and (2) because it is stored in a Secret resource, an RBAC mistake may reveal the token to anyone in the cluster.

<div id="problem-2"></div>

### ~~Problem 2: Vault issuer breaking with Kubernetes 1.24~~ (demoted)

People upgrading their cluster to Kubernetes 1.24 will see their Vault issuer break. In Kubernetes 1.24, the "token controller" no longer auto-creates a Secret resource for each service account by default. The same problem may have appeared earlier if your company has disabled the "token controller" by giving the flag `--controllers=-serviceaccount-token` to `kube-controller-manager`.

This problem was demoted because an easy fix was found. When migrating to 1.24, people can fix the issue in two different ways:

1. *(global workaround)* People who don't want to change their existing Vault issuer manifests may re-enable the auto-creation of token Secrets by giving the following flag to `kube-controller-manager`:

   ```text
   --feature-gates=LegacyServiceAccountTokenNoAutoGeneration=false
   ```

2. *(local workaround)* [As explained by Peter](https://github.com/cert-manager/cert-manager/issues/4144#issuecomment-1181815631), people can create a Secret with a name of their choosing and annotate it with the service account name; Kubernetes then stores a static token in that Secret. This requires changing the existing Vault issuer manifests, which will surprise people and needs to be well documented in the cert-manager upgrade notes.

   ```yaml
   apiVersion: v1
   kind: Secret
   metadata:
     name: vault-token
     annotations:
       kubernetes.io/service-account.name: vault
   type: kubernetes.io/service-account-token
   ```

   One positive aspect of this change is that instead of having to find the name of the generated Secret resource in the ServiceAccount object (e.g., `vault-token-ae90df`), people will use a static name.

### ~~Problem 3: Incoherent `iss`~~ (demoted)

Users of the Vault Issuer have noticed a discrepancy between cert-manager and other applications that rely on the Vault API, such as [kubernetes-external-secrets](https://github.com/external-secrets/kubernetes-external-secrets). Being able to use the same Kubernetes auth configuration (more specifically, the same `iss`) across applications would simplify using the Kubernetes auth with cert-manager for people using kubernetes-external-secrets.

This problem was "demoted" because HashiCorp has disabled and deprecated the `iss` validation in Vault 1.9, and HashiCorp also encourages people running lower versions of Vault (1.8 and below) to disable the `iss` validation entirely. Turning off `iss` validation can be done with the following command (the command depends on your Kubernetes Auth configuration settings):

```sh
# Repeat your existing settings on this command; kubernetes_host is shown as
# an example, with $KUBERNETES_HOST as a placeholder for your own value.
vault write auth/kubernetes/config \
    kubernetes_host="$KUBERNETES_HOST" \
    disable_iss_validation=true
```

The reason HashiCorp decided it is OK to disable this validation for all versions of Vault is that the Kubernetes apiserver already validates the `iss` field.
For more information, you can read the page [Kubernetes Auth And Kubernetes 1.21](https://www.vaultproject.io/docs/auth/kubernetes#kubernetes-1-21) on the official Vault website.

<div id="solution-1"></div>

## Proposed solution: cert-manager requests a token using the Token Request API

> **tl;dr:** This is the preferred solution. It was partially implemented in [PR 4524](https://github.com/jetstack/cert-manager/pull/4524). A weak consensus was reached among the cert-manager maintainers during [the biweekly dev meeting](https://docs.google.com/document/d/1Tc5t6ylY9dhXAan1OjOoldeaoys1Yh4Ir710ATfBa5U/edit#bookmark=id.grs6j5f0a8i0) on 28 July 2022.

A new field is available to the user on the Issuer: `serviceAccountRef`. With this field, the user lets cert-manager know which service account it should request a token for.

```yaml
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: vault-issuer
  namespace: my-namespace
spec:
  vault:
    auth:
      kubernetes:
        role: vault-sa
        mountPath: /v1/auth/kubernetes
        serviceAccountRef:
          name: vault-sa # ✨
          audience: vault
```

The "bounded" token that cert-manager receives looks like this:

```yaml
{
  "payload": {
    "aud": [
      "vault"
    ],
    "exp": 1659030197,
    "iat": 1627494197,
    "iss": "https://kubernetes.default.svc.cluster.local",
    "kubernetes.io": {
      "namespace": "cert-manager",
      "pod": {
        "name": "busybox",
        "uid": "f6d92780-e660-4eee-83d3-1c43dfbe28db"
      },
      "serviceaccount": {
        "name": "vault-sa",
        "uid": "c41885be-9641-4cb5-95c6-95121082d954"
      },
      "warnafter": 1627497804
    },
    "nbf": 1627494197,
    "sub": "system:serviceaccount:cert-manager:cert-manager" # 🚧
  }
}
```

Initially, it was thought that letting cert-manager request tokens for any service account on the cluster wasn't a good idea. We then realized that we could have users create an RBAC role specific to the service account name.
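
To make the Token Request call more concrete, here is a minimal Python sketch of the request that would be sent to kube-apiserver for the `vault-sa` service account. It only builds the HTTP path and JSON body and does not talk to a cluster. The helper name `token_request` and the 600-second expiry are assumptions made for illustration, not what cert-manager implements.

```python
import json


def token_request(namespace: str, service_account: str, audience: str,
                  expiration_seconds: int = 600):
    """Return the (HTTP path, JSON body) pair for a TokenRequest call.

    Hypothetical helper: the 600-second default is an illustrative value.
    """
    # The token is requested on the "token" subresource of the service account.
    path = f"/api/v1/namespaces/{namespace}/serviceaccounts/{service_account}/token"
    body = {
        "apiVersion": "authentication.k8s.io/v1",
        "kind": "TokenRequest",
        "spec": {
            # Becomes the `aud` claim of the returned token.
            "audiences": [audience],
            # Drives the `exp` claim of the returned token.
            "expirationSeconds": expiration_seconds,
        },
    }
    return path, body


path, body = token_request("my-namespace", "vault-sa", "vault")
print("POST", path)
print(json.dumps(body, indent=2))
```

The request is a `create` on the `serviceaccounts/token` subresource of one named service account, which is why an RBAC rule can be scoped to a single service account name.
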

After installing cert-manager with Helm, and after having created the above Vault issuer in Kubernetes, a permission error will show on the Issuer, letting the user know that they need to do an extra step:

```console
$ kubectl describe issuer vault-issuer -n my-namespace
Events:
  Type     Reason        Age   Message
  ----     ------        ---   -------
  Warning  TokenRequest  1s    Cannot request token, did you add an RBAC rule?
```

The missing step, not handled by Helm, is to apply a "custom" Role containing the service account name `vault-sa`, and to bind that Role to cert-manager's own service account so that cert-manager is allowed to request the token:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: vault-sa-for-cert-manager
  namespace: my-namespace
rules:
  - apiGroups: [""]
    resources: ["serviceaccounts/token"]
    resourceNames: ["vault-sa"]
    verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: vault-sa-for-cert-manager
  namespace: my-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: vault-sa-for-cert-manager
subjects:
  # The subject is cert-manager's service account (the one that requests the
  # token), assuming cert-manager is installed in the cert-manager namespace.
  - kind: ServiceAccount
    name: cert-manager
    namespace: cert-manager
```

Although this solution requires an extra manual step, the cert-manager maintainers agree that it is a good solution from a security and usability standpoint.

> Note about the "audience" (`aud` in the token): the audience is meant

## Alternative solutions

<div id="ambiant-credentials"></div>

### Alternative solution: Ambient Credentials By Reusing The cert-manager Token ("solution 2")

> **tl;dr:** this solution is weakly rejected because we give Vault a token that has far more permissions than required.

This idea was presented in the issue [4733](https://github.com/cert-manager/cert-manager/issues/4733). With this solution, the user doesn't have to set either `secretRef` or `serviceAccountRef`. Instead, cert-manager uses its own projected token in order to authenticate to Vault.
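
As a rough illustration of what "ambient" authentication amounts to, the Python sketch below reads a token from disk and builds the login request for Vault's Kubernetes auth method. It makes no network call. The helper name `vault_login_request` is hypothetical, the role `vault-sa` and the mount path are borrowed from the Issuer examples in this document, and a temporary file stands in for the token mounted in the pod.

```python
import json
import tempfile
from pathlib import Path

# Default location of the auto-mounted service account token inside a pod.
DEFAULT_TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def vault_login_request(role: str, mount_path: str = "/v1/auth/kubernetes",
                        token_path: str = DEFAULT_TOKEN_PATH):
    """Return the (HTTP path, JSON body) pair for a Vault Kubernetes login.

    Hypothetical helper written for this illustration only.
    """
    jwt = Path(token_path).read_text().strip()
    # Vault receives the whole token, hence the concern about its permissions.
    return f"{mount_path.rstrip('/')}/login", {"role": role, "jwt": jwt}


# Demo with a throwaway file standing in for the mounted token.
with tempfile.TemporaryDirectory() as tmp:
    fake_token = Path(tmp) / "token"
    fake_token.write_text("header.payload.signature\n")
    path, body = vault_login_request("vault-sa", token_path=str(fake_token))
    print("POST", path)
    print(json.dumps(body))
```

The point of the sketch is that the token leaves the cluster verbatim in the `jwt` field, so whatever that token is allowed to do in Kubernetes is handed to the Vault server.
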

Since Kubernetes 1.21, the auto-mounted service account token is "projected" into the pod, meaning that it now has an expiry time (`exp`) and is bound to a specific pod. For example, the following JWT payload corresponds to the auto-mounted token that cert-manager uses to talk to kube-apiserver:

```yaml
# kubectl run -it --rm -q busybox -n cert-manager --image=busybox --restart=Never --overrides='{"spec":{"serviceAccountName": "cert-manager"}}' -- cat /var/run/secrets/kubernetes.io/serviceaccount/token | step crypto jwt inspect --insecure
{
  "payload": {
    "aud": [
      "https://kubernetes.default.svc.cluster.local",
      "k3s"
    ],
    "exp": 1659030197,
    "iat": 1627494197,
    "iss": "https://kubernetes.default.svc.cluster.local",
    "kubernetes.io": {
      "namespace": "cert-manager",
      "pod": {
        "name": "busybox",
        "uid": "f6d92780-e660-4eee-83d3-1c43dfbe28db"
      },
      "serviceaccount": {
        "name": "cert-manager",
        "uid": "c41885be-9641-4cb5-95c6-95121082d954"
      },
      "warnafter": 1627497804
    },
    "nbf": 1627494197,
    "sub": "system:serviceaccount:cert-manager:cert-manager"
  }
}
```

Although this solution solves problems 1 and 2, we do not consider it acceptable to "leak" cert-manager's main token (which has lots of god-like roles attached) to an external service. The token handed to Vault does not need any role for the TokenReview API call to work.

> Note that this is an assumption that the cert-manager maintainers make: we think that it is not safe to hand to Vault the same token as cert-manager's "god-mode" token. It is possible that HashiCorp Vault is seen as a "safe place", and that the risk taken by re-using cert-manager's token when authenticating with Vault isn't that high.

<div id="solution-3"></div>

### Alternative solution: Volume Projected Token Reusing The cert-manager Service Account ("solution 3")

> **tl;dr:** this solution is rejected because it is hard to implement (Helm hacks) on top of having the same downsides as [solution 2](#ambiant-credentials).


Another idea is to be able to configure the Vault issuer to load a token from disk, letting us make use of a "volume projected token". The "volume projected token" allows you to create a secondary projected token on top of the default one, but with a different audience, expiry, and path on disk. Both projected tokens still share the same service account.

The Helm chart would be changed so that it is now possible to have a "volume projected token", as follows:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cert-manager
spec:
  template:
    spec:
      serviceAccountName: cert-manager
      containers:
        - image: quay.io/jetstack/cert-manager-controller:v1.8.0
          args:
            - --v=2
            - --cluster-resource-namespace=$(POD_NAMESPACE)
            - --leader-election-namespace=kube-system
          volumeMounts:
            - mountPath: /var/run/secrets/tokens
              name: vault-token
      volumes:
        - name: vault-token
          projected:
            sources:
              - serviceAccountToken:
                  path: vault-token
                  expirationSeconds: 7200
                  audience: vault
```

The token itself is very similar to the one we saw in solution 2, except for the `aud` field. Notice that `sub` is the same as with cert-manager's auto-mounted token, because the same service account (the one specified in the Pod's `spec.serviceAccountName`) is used.

```yaml
{
  "payload": {
    "aud": [
      "vault"
    ],
    "exp": 1659030197,
    "iat": 1627494197,
    "iss": "https://kubernetes.default.svc.cluster.local",
    "kubernetes.io": {
      "namespace": "cert-manager",
      "pod": {
        "name": "busybox",
        "uid": "f6d92780-e660-4eee-83d3-1c43dfbe28db"
      },
      "serviceaccount": {
        "name": "cert-manager",
        "uid": "c41885be-9641-4cb5-95c6-95121082d954"
      },
      "warnafter": 1627497804
    },
    "nbf": 1627494197,
    "sub": "system:serviceaccount:cert-manager:cert-manager" # 🚧
  }
}
```

In order to use this projected token, the Vault issuer manifest contains a path to the token file present on disk:

```yaml
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: vault-issuer
  namespace: my-namespace
spec:
  vault:
    auth:
      kubernetes:
        role: vault-sa
        mountPath: /v1/auth/kubernetes
        tokenPath: /var/run/secrets/tokens/vault-token # ✨
```

The `audience` field can be set to any value, as long as it matches the `audience` that is configured in the Kubernetes auth. For example, to match the above example, the `audience` has to be set to `vault`:

```sh
vault write auth/kubernetes/config audience=vault
```

In order for this solution to work, the field `audience` in the projected token volume must match:

1. At least one of the audiences configured in kube-apiserver (`--api-audiences`, which defaults to the value of `--service-account-issuer`, which defaults to `https://kubernetes.default.svc.cluster.local`).
2. At least one of the audiences configured in Vault's Kubernetes auth config `audience`, for example:

   ```sh
   vault write auth/kubernetes/config audience=vault
   ```
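
The audience rule above can be checked by hand on any token. The following Python sketch decodes a JWT payload without verifying its signature and tests whether an expected audience is present in `aud`. The helper names (`jwt_payload`, `audience_matches`, `unsigned_jwt`) are hypothetical, and the two sample tokens are unsigned stand-ins built from the `aud` values shown in this document.

```python
import base64
import json


def jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT (no signature verification)."""
    segment = token.split(".")[1]
    # JWT segments are base64url-encoded without padding; restore it.
    segment += "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(segment))


def audience_matches(payload: dict, expected: str) -> bool:
    """Return True if `expected` is one of the token's audiences."""
    aud = payload.get("aud", [])
    # The `aud` claim may be a single string or a list of strings.
    if isinstance(aud, str):
        aud = [aud]
    return expected in aud


def unsigned_jwt(payload: dict) -> str:
    """Build an illustration-only JWT; the signature segment is a dummy."""
    def b64(obj) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{b64({'alg': 'none'})}.{b64(payload)}.signature"


projected = unsigned_jwt({"aud": ["vault"], "exp": 1659030197})
auto_mounted = unsigned_jwt(
    {"aud": ["https://kubernetes.default.svc.cluster.local", "k3s"]}
)

print(audience_matches(jwt_payload(projected), "vault"))
print(audience_matches(jwt_payload(auto_mounted), "vault"))
```

In this sketch only the token projected with `audience: vault` passes the check; the auto-mounted token carries the apiserver's audiences instead.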