The Pinniped Supervisor is an OIDC issuer. It supports the OIDC `offline_access` scope with the `refresh_token` grant type as a way to refresh a user's tokens without requiring user interaction. Upon logging in to Pinniped, users are granted a refresh token that is valid for 9 hours.
Currently, Pinniped users' initial authentication happens against the external identity provider, but the Supervisor does not verify user information with the external IDP when refreshing tokens. It only checks whether it recognizes the token and whether the token is still within its 9-hour lifetime.
When the refresh flow occurs, the Supervisor should verify that the user still exists in the upstream identity provider and check for any changes in group membership.
The method of checking the upstream IDP differs depending on the IDP type. Currently, we support three upstream identity provider types: `OIDCIdentityProvider`, `LDAPIdentityProvider`, and `ActiveDirectoryIdentityProvider`.
During the initial token request flow to the upstream IDP, we should request the `offline_access` scope so that we get an upstream refresh token. Then we can use that refresh token to get a new access token, ID token, and refresh token when the downstream refresh happens.
We should also be able to use the new upstream access token to check for group information from the UserInfo endpoint.
We need to start sending `prompt=consent` (used by Google and possibly other IDPs) along with `access_type=offline` to always get a refresh token from the upstream OIDC IDP (or at least, to signal clearly that we want one). In the absence of a refresh token, we would just use the access token to gate the session lifetime (which may be very short).
We had some good conversation about the `prompt` parameter:
https://github.com/vmware-tanzu/pinniped/pull/850
https://kubernetes.slack.com/archives/C01BW364RJA/p1632242482109100
The TL;DR is that we need to make it clear that the Pinniped Supervisor does not support the user or OAuth client specifying a `prompt` value, because we need control over it in the backend to get refresh tokens issued.
Our current approach of only sending `access_type=offline` results in us only getting a refresh token on the very first login for a particular user instead of on every login. Setting both `prompt` and `access_type` gives us a refresh token on each login, as expected. I confirmed that with https://accounts.google.com set as an OIDC IDP, we can have multiple refresh tokens in flight and use each independently to get access tokens.
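To make this concrete, here is a minimal Go sketch of an upstream authorization request that includes all three signals. This is illustrative only, not Pinniped's actual implementation; the endpoint, client ID, redirect URI, and state values are placeholders.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildAuthorizeURL assembles an upstream authorization request that always
// asks for a refresh token: prompt=consent plus the Google-specific
// access_type=offline, and a scope list that includes offline_access.
// All endpoint, client, and redirect values here are illustrative placeholders.
func buildAuthorizeURL(endpoint, clientID, redirectURI, state string) string {
	v := url.Values{}
	v.Set("response_type", "code")
	v.Set("client_id", clientID)
	v.Set("redirect_uri", redirectURI)
	v.Set("state", state)
	v.Set("scope", "openid offline_access")
	v.Set("prompt", "consent")      // force re-consent so Google issues a refresh token on every login
	v.Set("access_type", "offline") // Google's non-standard way of requesting a refresh token
	return endpoint + "?" + v.Encode()
}

// queryParam extracts a single query parameter, for inspection.
func queryParam(rawURL, key string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return ""
	}
	return u.Query().Get(key)
}

func main() {
	u := buildAuthorizeURL("https://accounts.google.com/o/oauth2/v2/auth",
		"example-client-id", "https://supervisor.example.com/callback", "some-state")
	fmt.Println(u)
}
```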
Another aspect to consider: if we ask for refresh tokens from the IDP, we should ideally "log out" those sessions (via the `revocation_endpoint`, see https://datatracker.ietf.org/doc/html/rfc7009) once the Pinniped session expires (refresh tokens from Google are valid for 6 months). This should happen even if the user never does a refresh flow with Pinniped. This would imply that we need to hold the refresh tokens in plaintext on the server side.
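A revocation call per RFC 7009 is just a form-encoded POST to the IDP's `revocation_endpoint`. A sketch of the request body follows (client authentication and the actual HTTP call are omitted):

```go
package main

import (
	"fmt"
	"net/url"
)

// revocationBody builds the application/x-www-form-urlencoded body for an
// RFC 7009 token revocation request. The Supervisor would POST this to the
// IDP's revocation_endpoint with client authentication (not shown here).
func revocationBody(refreshToken string) string {
	v := url.Values{}
	v.Set("token", refreshToken)
	v.Set("token_type_hint", "refresh_token") // hint defined in RFC 7009 section 2.1
	return v.Encode()
}

func main() {
	fmt.Println(revocationBody("example-upstream-refresh-token"))
}
```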
Google requires the `access_type` param to be set to `offline` to get a refresh token, unlike the more common `offline_access` scope. Then you can request new tokens using the `refresh_token` grant type (making sure we request the `openid` scope so we get a new ID token). Note that there are limits on the number of refresh tokens that will be issued: one limit per client/user combination, and another per user across all clients. You should save refresh tokens in long-term storage and continue to use them as long as they remain valid. If your application requests too many refresh tokens, it may run into these limits, in which case older refresh tokens will stop working.
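The refresh grant itself boils down to a small form body. A sketch, with an illustrative token value:

```go
package main

import (
	"fmt"
	"net/url"
)

// refreshGrantBody builds the form body for redeeming an upstream refresh
// token. Requesting the openid scope again asks the IDP to return a fresh
// ID token along with the new access token.
func refreshGrantBody(refreshToken string) string {
	v := url.Values{}
	v.Set("grant_type", "refresh_token")
	v.Set("refresh_token", refreshToken)
	v.Set("scope", "openid")
	return v.Encode()
}

func main() {
	fmt.Println(refreshGrantBody("example-refresh-token"))
}
```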
The limit in question seems to be 50 refresh tokens per Google account per client ID, according to Google's documentation, so we should be fine. Google does not appear to invalidate the existing refresh token and grant a new one when you use it; a sample refresh response contained no new refresh token.
Okta uses the default OIDC behavior of requesting the `offline_access` scope to get a refresh token, which you can refresh by sending a token request with the `refresh_token` grant type. You have to request the `openid` scope to get a new ID token.
Valid responses will always return a refresh token; whether it's a new one or the old one depends on Okta configuration.
Okta also has some weird, complicated behavior related to the `prompt` param and consent dialogs.
GitLab uses a pretty standard OIDC flow. The old refresh token is always revoked and a new one issued during the refresh flow.
Does requesting the `access_type=offline` param break integrations with other IDPs? Does requesting the `offline_access` scope break integration with Google? Hopefully not: according to section 2.4 of the OIDC spec, "Scope values used that are not understood by an implementation SHOULD be ignored", so even if it's not used it shouldn't hurt.
LDAP doesn't have a built-in refresh flow. We could find out whether the groups changed with a query, but for the case where the user changed their password, the only way to find out that the bind credentials they gave us no longer work is by binding, which requires passing the username and password back to the external LDAP identity provider.
It's hard to find out whether an account has been disabled by other means, because common advice is to just change the user's password to a different value without changing any other attributes [1][2]. Some LDAP providers have attributes such as `pwdAccountLockedTime` or `nsAccountLock`, but there's no standard.
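To illustrate why this is awkward, a best-effort heuristic over those provider-specific attributes might look like the sketch below. The attribute names and expected values (from OpenLDAP's ppolicy overlay and 389 Directory Server, respectively) are assumptions, not a complete or authoritative list.

```go
package main

import "fmt"

// accountAppearsLocked inspects provider-specific LDAP attributes that some
// servers use to mark a disabled account. There is no standard attribute,
// so this is a best-effort heuristic: pwdAccountLockedTime comes from the
// OpenLDAP ppolicy overlay, nsAccountLock from 389 Directory Server.
func accountAppearsLocked(attrs map[string][]string) bool {
	if v := attrs["pwdAccountLockedTime"]; len(v) > 0 && v[0] != "" {
		return true // any value means the account is administratively locked
	}
	if v := attrs["nsAccountLock"]; len(v) > 0 && v[0] == "true" {
		return true
	}
	return false
}

func main() {
	fmt.Println(accountAppearsLocked(map[string][]string{"nsAccountLock": {"true"}}))
}
```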
Since Active Directory communicates using LDAP, the same considerations that apply to LDAP apply to Active Directory. However, we could perform additional checks because, unlike generic LDAP, the schema of Active Directory is well known. For example, we could see whether `pwdChangedTime` is after the time the user logged in, or whether the `userAccountControl` attribute indicates that the account is disabled.
Each of the IDP types requires storing some extra information about the upstream IDP in order to validate user information upon refresh: a refresh token for OIDC, and a username and password for LDAP. Refresh tokens are not as sensitive as LDAP credentials.
We already store downstream tokens as Kubernetes Secrets on the Supervisor. But storing LDAP credentials would have a bigger potential downside than storing just Pinniped tokens, because of privilege escalation: if a malicious user had access to read Secrets on the Supervisor, they would be able to get the upstream credentials of all currently logged-in users. We could encrypt the secrets, but where would we store the encryption key? If it were stored as a Secret, it may as well not be encrypted.
Storing OIDC refresh tokens on the Supervisor cluster may be necessary so that we can revoke them after some time.
We could keep the credentials on the user's own machine rather than on the supervisor, so each client only stores their own credentials. This requires that we pass the information back to the client, get them to store it, and pass it back to us upon refresh, all in an OIDC compliant way. It would be nice if it was fairly standard so that other clients could use it in the future. This would be done by making the downstream refresh token be based on upstream refresh token/bind information. We could encrypt the credentials (using some ephemeral key), pass them back to the user as a refresh token, and decrypt them to check against upstream when they pass them back. We would have to store the encryption key on the supervisor, probably as a secret so that each Supervisor pod can access it. Alternatively, we could do the opposite, where we pass the key back to the user and keep the encrypted credentials.
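The encrypt-and-hand-back idea could be sketched with standard AES-GCM, as below. This is illustrative only; key management (generation, rotation, storage in a Secret) is the hard part that the sketch glosses over, and the credential string is a placeholder.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// seal encrypts the user's upstream credentials with AES-GCM so that only
// the ciphertext is handed back to the client inside the downstream refresh
// token, while the key stays on the Supervisor (or vice versa).
func seal(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Prepend the nonce so open can recover it later.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal when the client presents the token at refresh time.
func open(key, sealed []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	key := make([]byte, 32) // in practice, a randomly generated key held in a Secret
	sealed, _ := seal(key, []byte("cn=jane,dc=example,dc=com:hunter2"))
	creds, _ := open(key, sealed)
	fmt.Println(string(creds))
}
```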
OIDC has a built-in concept of a session length: if the upstream refresh token works, you should be able to refresh your Supervisor tokens.
However, it's possible that the upstream session length isn't desirable. For example, if the upstream IDP session is valid for a week, but the Pinniped admin wants sessions to last only a day, that should be configurable. In that case, refresh should succeed only if a) the upstream refresh works and b) the user-configured refresh time has not elapsed. We could add a field `OIDCIdentityProvider.spec.refresh.sessionLength` to allow this to be configured.
What if you want longer tokens than your upstream IDP allows? We won't be able to validate the upstream IDP session length, so we can't necessarily disallow this, but we should make it clear in the docs that this is not intended behavior.
For LDAP, there's no concept of a session length, so it would have to be set in Pinniped, whether as a user configuration option or a preset default.
The default could remain 9 hours, since that's what we have right now.
We could add a field `LDAPIdentityProvider.spec.refresh.sessionLength` to allow this to be configured.
If a user logs in using OIDC and then doesn't run any kubectl commands, they won't trigger the downstream refresh flow, so we won't trigger the upstream refresh flow and learn whether their session is still valid. If we only tie session length to upstream refresh, some tokens could just lie around forever. We should implement an idle timeout to delete these Secrets. This isn't security critical (once the user tries to use their credentials, we will check upstream and find out whether it's allowed), so the timeout could be long, like a day. It could also be user configurable.
Current Pinniped users may expect that they can have a 9 hour session, regardless of their upstream IDP settings. It's possible that users would want to continue using Pinniped this way.
It's also possible that even with the precautions we take to encrypt them, users will be skeeved out by us storing user LDAP credentials.
Opting out would require an additional field (`spec.refresh.disableUpstreamRefresh`) on the identity provider. It's probably better to make upstream refresh opt-out rather than opt-in, to keep the default safe. Opt-out would change behavior for upgrading users, but that seems like a good thing.
We would likely need to encode information about which IDP the refresh token was issued for. However, even if we don't do that now, we should still be fine in the future: the worst that could happen is that old refresh tokens stop working and we prompt the user to log in again after upgrade. We should cross that bridge when we come to it.
For `OIDCIdentityProvider`:
- `spec.refresh.sessionLength`, to allow users to set a shorter session length than their upstream. Default is to defer to the upstream.
- `spec.refresh.idleTimeout`, to allow users to choose how long to keep tokens around as Secrets in the Supervisor before garbage collecting them. Default is 24 hours.
- Send `prompt=consent` and `access_type=offline`, and ensure `scope` includes `offline_access`.
- If `prompt=none`, `login`, or `select_account` are requested, do not ignore them.
- After the `idleTimeout`, make a revocation request to the upstream to delete the refresh token, then garbage collect the associated Secrets.

For `LDAPIdentityProvider`:
- `spec.refresh.sessionLength`, to allow users to set the session length. Default is 9 hours.
- `spec.refresh.disableUpstreamRefresh`, to allow users to opt out of upstream refresh.
- Upon refresh, check whether the session has exceeded the configured `sessionLength`; if not, get the key out of storage and use it to decrypt the refresh token, then attempt a bind.

For `ActiveDirectoryIdentityProvider`:
- `spec.refresh.sessionLength`, to allow users to set the session length. Default is 9 hours.
- Additionally check for a) `userAccountControl=2` (the code that represents a disabled account) and b) `pwdChangedTime` > login time (meaning the password has changed since we last logged in).