# Dex OIDC Upgrade/Configuration Change in Existing Cluster (Brownfield)
## Abstract
Supporting the Dex IdP (Identity Provider) in an Airship2 cluster requires the use of Cluster API resources to configure the target cluster's API server with OIDC flags, as well as deploying Dex as a workload.
A brownfield upgrade may target the API server and/or the Dex service.
# Upgrading OIDC flags - API Server
This type of upgrade is not possible, as most of the OIDC flags are immutable. The affected fields on the KubeadmControlPlane resource are listed below, with an illustrative snippet after the list.
* apiServer
  * certSANs, e.g., dex.target-cluster.capi.io
  * extraArgs
    * oidc-issuer-url, e.g., https://dex.target-cluster.capi.io:32556/dex
    * oidc-client-id, e.g., dex-capi-kubernetes
    * oidc-ca-file, e.g., /etc/kubernetes/certs/dex-cert
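The snippet below is a minimal sketch of where these flags sit in a KubeadmControlPlane manifest; the resource name and apiVersion are assumptions (they depend on the CAPI release in use), and the values are taken from the examples above.
```yaml=
# Sketch only: resource name and apiVersion are assumptions; values are from the examples above.
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
metadata:
  name: target-cluster-control-plane
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        certSANs:
        - dex.target-cluster.capi.io
        extraArgs:
          oidc-issuer-url: https://dex.target-cluster.capi.io:32556/dex
          oidc-client-id: dex-capi-kubernetes
          oidc-ca-file: /etc/kubernetes/certs/dex-cert
```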
# Upgrading Dex service
Upgrading the Dex service is a non-service-affecting operation, as this service is only required when users sign in to obtain the personalized kubeconfig file generated by dex-aio.
## Use Case Scenario
First, deploy **dex-aio** with the AT&T Labs LDAP connector and generate a kubeconfig file for a given user, e.g., sx3394.
![](https://i.imgur.com/8DNNPGH.png)
Once the personalized **kubeconfig** has been generated, the user may use it to access the cluster, but without any role bindings he/she will not be able to do anything.
See below for an example of the roles and role bindings used for this use case.
```yaml=
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  # "namespace" omitted since ClusterRoles are not namespaced
  name: secret-reader
rules:
- apiGroups: [""]
  # at the HTTP level, the name of the resource for accessing Secret
  # objects is "secrets"
  resources: ["secrets"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
# This role binding allows "sx3394" to read pods in the "default" namespace.
# You need to already have a Role named "pod-reader" in that namespace.
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
# You can specify more than one "subject"
- kind: User
  name: sx3394 # "name" is case sensitive
  apiGroup: rbac.authorization.k8s.io
roleRef:
  # "roleRef" specifies the binding to a Role / ClusterRole
  kind: Role # this must be Role or ClusterRole
  name: pod-reader # this must match the name of the Role or ClusterRole you wish to bind to
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
# This cluster role binding allows anyone in the "AP-NC_Test_Users" group to read secrets in any namespace.
kind: ClusterRoleBinding
metadata:
  name: read-secrets-global
subjects:
- kind: Group
  name: AP-NC_Test_Users # Name is case sensitive
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io
```
Apply this manifest to the target cluster as the **admin** user.
```bash=
$ kubectl --kubeconfig ~/.airship/kubeconfig --context target-cluster apply -f dex-role-rolebinding-sx3394.yaml
role.rbac.authorization.k8s.io/pod-reader created
clusterrole.rbac.authorization.k8s.io/secret-reader created
rolebinding.rbac.authorization.k8s.io/read-pods created
clusterrolebinding.rbac.authorization.k8s.io/read-secrets-global created
$ kubectl --kubeconfig ~/.kube/config --context sx3394-target-cluster get po
NAME READY STATUS RESTARTS AGE
airship-host-config-84cb99597c-wxzwn 1/1 Running 0 7h51m
$ kubectl --kubeconfig ~/.kube/config --context sx3394-target-cluster get secret
NAME TYPE DATA AGE
airship-host-config-token-5trgg kubernetes.io/service-account-token 3 7h52m
default-token-7xr8v kubernetes.io/service-account-token 3 7h54m
hco-ssh-auth kubernetes.io/ssh-auth 2 7h51m
```
This **kubeconfig** is good for 24 hours, and the **id_token** will be refreshed when it expires. Under normal circumstances the refresh cycle takes place and a new **id_token** and **refresh_token** are automatically retrieved.
In this use case, we will replace the identity provider; if all goes well, the **id_token** for user sx3394 will expire and he will no longer be able to access the target cluster.
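For context, the user entry in the personalized **kubeconfig** generated by **dex-aio** typically looks like the sketch below, assuming the standard *kubectl* OIDC auth-provider stanza; the issuer URL and client id are taken from the flags above, and the secrets and tokens are elided.
```yaml=
# Sketch only: standard kubectl oidc auth-provider stanza; secrets and tokens elided.
users:
- name: sx3394
  user:
    auth-provider:
      name: oidc
      config:
        idp-issuer-url: https://dex.target-cluster.capi.io:32556/dex
        client-id: dex-capi-kubernetes
        client-secret: <client secret>
        idp-certificate-authority-data: <base64-encoded Dex CA>
        id-token: <JWT issued by Dex, valid for 24h>
        refresh-token: <opaque refresh token>
```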
## Replacing Dex Identity Provider
Next, update **dex-aio** with a new LDAP instance, e.g., OpenLDAP. See the *helm* chart override example below (manifest dex-chart-override-open-ldap.yaml).
```yaml=
# Dex Helm Kustomization
params:
  site:
    name: target-cluster
  endpoints:
    hostname: dex.utility.local
    port:
      https: 30556
      http: 30554
      k8s: 6443
    tls:
      cert_manager: true
      issuer:
        name: workload-cluster-ca-issuer
        kind: Issuer
  oidc:
    client_id: utility-kubernetes
    client_secret: pUBnBOY80SnXgjibTYM9ZWNzY2xreNGQok
  ldap:
    bind_password: "dex4airship2"
    name: "OPEN LDAP TEST SERVICES"
    config:
      host: 104.43.139.232
      port: 389
      bind_dn: admin
      bind_pw_env: LDAP_BIND_PW
      username_prompt: MSFT UID
      user_search:
        base_dn: dc=test,dc=airship,dc=local
        filter: "(objectClass=person)"
        username: cn
        idAttr: cn
        emailAttr: name
        nameAttr: name
      group_search:
        base_dn: ou=groups,dc=test,dc=airship,dc=local
        filter: "(objectClass=group)"
        userMatchers:
          userAttr: DN
          groupAttr: member
        nameAttr: name
config:
  dex.yaml:
    expiry:
      signingKeys: 6h
      idTokens: 24h
    connectors:
    - type: ldap
      name: OpenLDAP
      id: ldap
      config:
        # Plain LDAP without TLS:
        host: 104.43.139.232:389
        insecureNoSSL: true
        insecureSkipVerify: false
        bindDN: cn=admin,dc=test,dc=airship,dc=local
        bindPW: dex4airship2
        # usernamePrompt: Email Address
        usernamePrompt: Open LDAP UUID
        userSearch:
          # The directory directly above the user entry.
          baseDN: ou=People,dc=test,dc=airship,dc=local
          filter: "(objectClass=Person)"
          # Expect the user to enter their mail address when logging in.
          username: mail
          # idAttr: cn
          idAttr: DN
          # When an email address is not available, use another value unique to the user, like name.
          emailAttr: mail
          nameAttr: cn
        groupSearch:
          # The directory directly above the group entry.
          # baseDN: cn=groups,cn=compat,dc=example,dc=org
          baseDN: ou=Groups,dc=test,dc=airship,dc=local
          filter: "(objectClass=groupOfNames)"
          # The group search needs to match the "cn" attribute on
          # the user with the "member" attribute on the group.
          userMatchers:
          - userAttr: uid
            groupAttr: memberUid
          # Unique name of the group.
          nameAttr: cn
```
Then apply the updated chart to **dex-aio**, which will roll out the new **dex-aio** service.
```bash=
$ helm --kubeconfig ~/.airship/kubeconfig upgrade dex-aio "./charts/dex-aio" --namespace target-infra -f $HOME/projects/oidc-dex/dex-chart-override-open-ldap.yaml
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /home/sx3394/.airship/kubeconfig
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /home/sx3394/.airship/kubeconfig
Release "dex-aio" has been upgraded. Happy Helming!
NAME: dex-aio
LAST DEPLOYED: Tue Sep 14 16:02:19 2021
NAMESPACE: target-infra
STATUS: deployed
REVISION: 3
TEST SUITE: None
$ kubectl --context target-cluster get po -n target-infra
NAME READY STATUS RESTARTS AGE
dex-aio-59976f7c78-jqmzt 3/3 Running 0 46s
dex-aio-6775968b4d-nwztk 3/3 Terminating 0 4h56m
```
Once the **dex-aio** service is back up with the new identity provider, in this case OpenLDAP, any new user will be able to get his/her personalized *kubeconfig* file.
![](https://i.imgur.com/ITGbgFe.png)
In this case, a new user id was used with the new identity provider.
As in the previous case, role bindings need to be created for this new user id. See the example below.
```yaml=
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ericsson-pod-reader
  # namespace: default
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
# This cluster role binding allows "sidney.shiba@ericsson.com" to read pods in any namespace.
# You need to already have a ClusterRole named "ericsson-pod-reader".
kind: ClusterRoleBinding
metadata:
  name: ericsson-read-pods
  # namespace: default
subjects:
# You can specify more than one "subject"
- kind: User
  name: sidney.shiba@ericsson.com # "name" is case sensitive
  apiGroup: rbac.authorization.k8s.io
roleRef:
  # "roleRef" specifies the binding to a Role / ClusterRole
  kind: ClusterRole # this must be Role or ClusterRole
  name: ericsson-pod-reader # this must match the name of the Role or ClusterRole you wish to bind to
  apiGroup: rbac.authorization.k8s.io
```
Once the role bindings are created for the user id, the user can start issuing the *kubectl* commands he/she is allowed to.
```bash=
$ kubectl --kubeconfig ~/.airship/kubeconfig --context target-cluster apply -f dex-role-rolebinding-sidney-ericsson.yaml
clusterrole.rbac.authorization.k8s.io/ericsson-pod-reader created
clusterrolebinding.rbac.authorization.k8s.io/ericsson-read-pods created
$ kubectl --kubeconfig ~/.kube/config --context sidney.shiba-target-cluster get po -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-66bff467f8-d27mn 1/1 Running 0 8h
coredns-66bff467f8-phkcb 1/1 Running 0 8h
etcd-node01 1/1 Running 0 8h
kube-apiserver-node01 1/1 Running 0 8h
kube-controller-manager-node01 1/1 Running 0 8h
kube-proxy-fjzn2 1/1 Running 0 7h36m
kube-proxy-z4tvs 1/1 Running 0 8h
kube-scheduler-node01 1/1 Running 0 8h
```
## What Happens When the Token Expires
The use case is to replace the identity provider; when the *id_token* from the previous provider expires, the respective **kubeconfig** MUST also expire and deny access to the cluster.
The *id_token* is currently set to expire every 24 hours, and that is what has been observed.
```bash=
sx3394@ubuntu:~$ date
Wed 15 Sep 2021 04:24:07 PM CST
sx3394@ubuntu:~$ kubectl --kubeconfig ~/.kube/config --context sx3394-target-cluster get po
Unable to connect to the server: failed to refresh token: oauth2: cannot fetch token: 500 Internal Server Error
Response: {"error":"server_error"}
sx3394@ubuntu:~$ kubectl --kubeconfig ~/.kube/config --context sidney.shiba-target-cluster get po
NAME READY STATUS RESTARTS AGE
airship-host-config-84cb99597c-wxzwn 1/1 Running 0 17h
```
## Observations
The use case scenario above was based on imperative **helm** commands instead of the declarative approach using the helm operator deployed on the target cluster.
### Upgrades based on Helm Operator
Upgrading **dex-aio** through the helm operator may require updating the *dex-aio* chart in the *airshipit/charts/charts/dex-aio* repository, which in turn requires building a new helm-chart-collator docker image. In this case, the helm-chart-collator MUST be redeployed first (a rough sketch of this flow follows).
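The sketch below assumes the collator image can be rebuilt with a plain docker build from a clone of the charts repo; the registry, tag, and build context are assumptions, and the actual repo may drive this through a Makefile or CI job instead.
```bash=
# Illustrative only: registry, tag and build context are assumptions.
$ cd ~/projects/charts   # clone of the airshipit/charts repo with the updated dex-aio chart
$ docker build -t registry.example.com/helm-chart-collator:dex-upgrade .
$ docker push registry.example.com/helm-chart-collator:dex-upgrade
# Update the helm-chart-collator image reference in the site manifests and redeploy it,
# then let the helm operator reconcile the dex-aio HelmRelease.
```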
In some cases, the upgrade can be done through kustomize, when the customization can be achieved via the *values* field of the **dex-aio** HelmRelease, as in the example below.
```yaml=
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: dex-aio
spec:
  ...
  values:
    images:
      applications:
        dex:
          tag: v2.28.1
          name: dexidp/dex
          repo: quay.io
        nginx:
          tag: 1.17.10-alpine
          name: nginx
          repo: docker.io
        authenticator:
          tag: 1.2.0
          name: mintel/dex-k8s-authenticator
          repo: docker.io
    node_labels:
      dex:
        key: node-role.kubernetes.io/worker
        value: ""
    params:
      site:
        name: dex-virtual-airship-core
      endpoints:
        hostname: dex.function.local
        port:
          https: 30556
          http: 30554
          k8s: 6443
        tls:
          cert_manager: true
          issuer:
            name: workload-cluster-ca-issuer
            kind: Issuer
      oidc:
        client_id: function-kubernetes
        client_secret: pUBnBOY80SnXgjibTYM9ZWNzY2xreNGQok
      ldap:
        bind_password: "your LDAP bind password"
        name: "LDAP TEST SERVICES"
        config:
          host: "your LDAP FQDN"
          port: 636
          bind_dn: "your LDAP bind username"
          bind_pw_env: LDAP_BIND_PW
          username_prompt: SSO Username
          user_search:
            base_dn: dc=testservices,dc=test,dc=com
            filter: "(objectClass=person)"
            username: cn
            idAttr: cn
            emailAttr: name
            nameAttr: name
          group_search:
            base_dn: ou=groups,dc=testservices,dc=test,dc=com
            filter: "(objectClass=group)"
            userMatchers:
              userAttr: DN
              groupAttr: member
            nameAttr: cn
```
>NOTE: in case an attribute is not available under *values*, it is preferable to update the **dex-aio** charts with new variables, making the chart more customizable.
### Supporting Upgrades with Airshipctl
***dex-aio*** is considered a workload, and **airshipctl phase run workload-target** is used to deploy it among other services.
If the desired result is to upgrade all workload services whose manifests have been updated, then **airshipctl phase run workload-target** will do the job.
In order to support a "blue/green" deployment, it is suggested to create a new site repo containing the workload updates. Update $HOME/.airship/config to refer to the new site and invoke the airshipctl phase run command (see the sketch below). In case the new dex-aio service is not behaving correctly, the old service can be redeployed by pointing back to the original site repo.
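A minimal sketch of that flow with *airshipctl config* is shown below; the manifest name, repository URL, and branch are assumptions for illustration.
```bash=
# Illustrative only: manifest name, repo URL and branch are assumptions.
# Point the current manifest at the new ("green") site repository...
$ airshipctl config set-manifest target-cluster --repo primary \
    --url https://example.com/site-repos/target-cluster-green.git --branch main
# ...then re-apply the workload phase.
$ airshipctl phase run workload-target --debug
# To roll back, point the manifest back at the original ("blue") repository and re-run the phase.
```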
To validate this scenario, all three dex-aio container images were updated, as well as the IdP password. This kustomization was provided at the site level. See the trace below:
```bash=
$ airshipctl phase run workload-target --debug
[airshipctl] 2021/09/16 17:53:08 opendev.org/airship/airshipctl/pkg/phase/executors/k8s_applier.go:110: Filtering out documents that shouldn't be applied to kubernetes from document bundle
[airshipctl] 2021/09/16 17:53:08 opendev.org/airship/airshipctl/pkg/phase/executors/k8s_applier.go:115: Getting kubeconfig context name from cluster map
[airshipctl] 2021/09/16 17:53:08 opendev.org/airship/airshipctl/pkg/phase/executors/k8s_applier.go:120: Getting kubeconfig file information from kubeconfig provider
[airshipctl] 2021/09/16 17:53:08 opendev.org/airship/airshipctl/pkg/k8s/kubeconfig/builder.go:257: Received error when extracting context, ignoring kubeconfig. Error: failed merging kubeconfig: source context 'target-cluster' does not exist in source kubeconfig
[airshipctl] 2021/09/16 17:53:08 opendev.org/airship/airshipctl/pkg/k8s/kubeconfig/builder.go:167: Merging kubecontext for cluster 'target-cluster', into site kubeconfig
[airshipctl] 2021/09/16 17:53:08 opendev.org/airship/airshipctl/pkg/phase/executors/k8s_applier.go:127: Using kubeconfig at '/home/sx3394/.airship/kubeconfig-194686359' and context 'target-cluster'
[airshipctl] 2021/09/16 17:53:08 opendev.org/airship/airshipctl/pkg/phase/executors/k8s_applier.go:99: WaitTimeout: 43m20s
[airshipctl] 2021/09/16 17:53:08 opendev.org/airship/airshipctl/pkg/k8s/applier/applier.go:78: Getting infos for bundle, inventory id is workload-target
[airshipctl] 2021/09/16 17:53:08 opendev.org/airship/airshipctl/pkg/k8s/applier/applier.go:112: Inventory Object config Map not found, auto generating Inventory object
[airshipctl] 2021/09/16 17:53:08 opendev.org/airship/airshipctl/pkg/k8s/applier/applier.go:119: Injecting Inventory Object: {"apiVersion":"v1","kind":"ConfigMap","metadata":{"creationTimestamp":null,"labels":{"cli-utils.sigs.k8s.io/inventory-id":"workload-target"},"name":"airshipit-workload-target","namespace":"airshipit"}}{nsfx:false,beh:unspecified} into bundle
[airshipctl] 2021/09/16 17:53:08 opendev.org/airship/airshipctl/pkg/k8s/applier/applier.go:125: Making sure that inventory object namespace airshipit exists
namespace/ingress unchanged
namespace/lma unchanged
namespace/local-storage unchanged
helmrelease.helm.toolkit.fluxcd.io/ingress unchanged
helmrelease.helm.toolkit.fluxcd.io/elasticsearch-data unchanged
helmrelease.helm.toolkit.fluxcd.io/elasticsearch-ingest unchanged
helmrelease.helm.toolkit.fluxcd.io/grafana unchanged
helmrelease.helm.toolkit.fluxcd.io/kibana unchanged
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack unchanged
helmrelease.helm.toolkit.fluxcd.io/logging-operator unchanged
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging unchanged
helmrelease.helm.toolkit.fluxcd.io/prometheus-elasticsearch-exporter unchanged
helmrelease.helm.toolkit.fluxcd.io/thanos-operator configured
helmrelease.helm.toolkit.fluxcd.io/provisioner unchanged
helmrelease.helm.toolkit.fluxcd.io/dex-aio configured
helmrepository.source.toolkit.fluxcd.io/collator unchanged
issuer.cert-manager.io/workload-cluster-ca-issuer unchanged
17 resource(s) applied. 0 created, 15 unchanged, 2 configured
helmrelease.helm.toolkit.fluxcd.io/elasticsearch-data is Current: Resource is Ready
helmrelease.helm.toolkit.fluxcd.io/provisioner is Current: Resource is Ready
helmrepository.source.toolkit.fluxcd.io/collator is Current: Resource is Ready
helmrelease.helm.toolkit.fluxcd.io/elasticsearch-ingest is Current: Resource is Ready
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack is Current: Resource is Ready
helmrelease.helm.toolkit.fluxcd.io/thanos-operator is Current: Resource is Ready
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging is Current: Resource is Ready
helmrelease.helm.toolkit.fluxcd.io/prometheus-elasticsearch-exporter is Current: Resource is Ready
issuer.cert-manager.io/workload-cluster-ca-issuer is Current: Resource is Ready
namespace/ingress is Current: Resource is current
namespace/lma is Current: Resource is current
helmrelease.helm.toolkit.fluxcd.io/grafana is Current: Resource is Ready
helmrelease.helm.toolkit.fluxcd.io/logging-operator is Current: Resource is Ready
helmrelease.helm.toolkit.fluxcd.io/dex-aio is Current: Resource is Ready
namespace/local-storage is Current: Resource is current
helmrelease.helm.toolkit.fluxcd.io/ingress is Current: Resource is Ready
helmrelease.helm.toolkit.fluxcd.io/kibana is Current: Resource is Ready
helmrelease.helm.toolkit.fluxcd.io/dex-aio is InProgress: Reconciliation in progress
[airshipctl] 2021/09/16 17:53:49 opendev.org/airship/airshipctl/pkg/k8s/applier/applier.go:93: applier channel closed
helmrelease.helm.toolkit.fluxcd.io/dex-aio is Current: Resource is Ready
```
After the new dex-aio service was up and running, the process of generating the personalized kubeconfig was executed successfully.
# Recap
## Updating API Server OIDC flags
Updating the API server OIDC flags on the KubeadmControlPlane resource is not possible, as these flags are immutable. See below:
* apiServer
  * certSANs, e.g., dex.target-cluster.capi.io
  * extraArgs
    * oidc-issuer-url, e.g., https://dex.target-cluster.capi.io:32556/dex
    * oidc-client-id, e.g., dex-capi-kubernetes
    * oidc-ca-file, e.g., /etc/kubernetes/certs/dex-cert
The most likely candidate for an OIDC flag change would be the Dex FQDN, which in the example above is **dex.target-cluster.capi.io**. Instead of updating the corresponding flags, we can rely on the DNS alias mechanism, i.e., keep the FQDN baked into the flags and alias it (e.g., via a CNAME record) to wherever the Dex service actually runs.
## Replacing Dex's Identity Provider
As a reminder, Dex is used to generate a personalized *kubeconfig* file for a given user, and it delegates user authentication to an external identity provider (IdP) such as LDAP.
When replacing Dex's IdP, the kubeconfig generated with the previous IdP will be invalidated once the associated id_token expires (i.e., a 24-hour period). Therefore, users will still be able to access the cluster for up to a day; past that time, access will be denied.
When the new IdP becomes operational, all users MUST "sign in" with Dex and get their new **kubeconfig** generated.
>NOTE: If there is a need to revoke access for some users during this transition period and a 24-hour wait is not acceptable, then the cluster administrator MUST revoke the RBAC authorization by deleting their respective [Cluster]RoleBinding resources, as in the example below.
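For example, revoking the bindings created earlier in this note would look like this (names and contexts are the ones used above):
```bash=
$ kubectl --kubeconfig ~/.airship/kubeconfig --context target-cluster delete rolebinding read-pods -n default
$ kubectl --kubeconfig ~/.airship/kubeconfig --context target-cluster delete clusterrolebinding read-secrets-global
```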
## Changing Identity Provider Password
The API server relies on the id_token from the kubeconfig file to authenticate a user; it does not relay the request to the IdP for authentication.
When the id_token expires, the **kubectl** CLI triggers the id_token refresh procedure to retrieve a new id_token and refresh_token. Only at that point is the IdP contacted and the password verified. Therefore, service is not affected by the password change.
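If the password in question is the LDAP bind password that Dex uses, the change amounts to a small values override on the **dex-aio** HelmRelease, sketched below based on the values layout shown earlier (the new password is a placeholder):
```yaml=
# Sketch only: carries just the field that changes for a bind-password rotation.
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: dex-aio
spec:
  values:
    params:
      ldap:
        bind_password: "new-ldap-bind-password"   # placeholder
```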
## Upgrade through Airshipctl
It is possible to upgrade the **dex-aio** service through the **airshipctl phase run workload-target** command. The recommended location to provide the necessary updates to the **dex-aio** service is at the site level.
>NOTE: due to the idempotent nature of this *phase run* command, any manifest change detected by K8S will redeploy the corresponding service, in our case **dex-aio**. It goes without saying that the updated attribute(s) MUST be mutable, otherwise the service will not be redeployed.