# Openscapes & Authentication
This is treated as an ongoing issue of the technical and human infrastructure of the Openscapes hub, and this document is written in the style of a [blameless postmortem](https://sre.google/sre-book/postmortem-culture/).
## Summary
(to be written at the very end)
## Timeline
### Apr 2022
There is a desire to [delete a list of users in the JupyterHub admin panel](https://github.com/2i2c-org/infrastructure/issues/1213) in one go. This would require upstream work in the JupyterHub project - particularly, *frontend* upstream work. We don't have the capacity to do that at the moment, so we instead [suggest](https://github.com/2i2c-org/infrastructure/issues/1213#issuecomment-1101856176) switching to using GitHub teams for authentication. This would 'outsource' the handling of user accounts to GitHub, and many other communities have been pretty happy with it. Using organizations and teams as two levels of providing access has mapped very well to other communities, and it is thought it would work well here too. This suggestion [is welcomed](https://github.com/2i2c-org/infrastructure/issues/1213#issuecomment-1101882459), and a [new issue](https://github.com/2i2c-org/infrastructure/issues/1222) is created to track progress here.
### Jul 2022 - Jan 2023
This actually triggers a bunch of work that is broadly useful to everyone using JupyterHub.
We [validate](https://github.com/2i2c-org/infrastructure/issues/1222#issuecomment-1111955667) that removing a user from a GitHub team actually denies them access, even if they are present in the JupyterHub admin interface, *but only after they also log out*. A [new config](https://github.com/jupyterhub/oauthenticator/pull/631) is added upstream to help prevent this. A [big refactor](https://github.com/jupyterhub/oauthenticator/pull/526) of the OAuthenticator project is performed to help with this. A [lot of work](https://github.com/2i2c-org/infrastructure/issues/1222#issuecomment-1396763850) is identified as things that need to be done to move this forward.
### Jul 2023
A year of work eventually results in a big new [16.0 release](https://github.com/jupyterhub/oauthenticator/pull/641) with a lot of features and better maintainability.
We now feel much better about moving Openscapes to using GitHub auth with teams, as overall both 2i2c and the upstream JupyterHub community now have a better understanding of, and more trust in, how GitHub authentication works for people who are removed from having access. Yay for broad improvements that help everyone!
### Oct 5, 2023
In the Openscapes Slack, there is a [request](https://openscapes.slack.com/archives/C02NC3Y62J1/p1696528660956039) to help clean up the hub access list. In particular, there is a desire to answer the question of "who is *active*?" There were 889 users; how many are actually still using the hub? The JupyterHub admin panel has a 'last active' date, but this is not clear enough.
Thankfully, since another project (LEAP) also had similar questions, we had built and deployed [prometheus-dirsize-exporter](https://github.com/yuvipanda/prometheus-dirsize-exporter) to monitor both the size and the 'last updated' timestamp of user home directories, and this gives a pretty accurate view of when a user was last active. The JupyterHub API *also* has a decent view of that *in this case*, because we have kept the same JupyterHub database throughout. However, since the Prometheus API also allows us to know *how big* each user's home directory is, we can also help cut costs by getting rid of inactive users' home directories (or dealing with them in some way - tbd). Adding a new Grafana graph or providing raw data from Prometheus is also easier than writing something that talks to the JupyterHub API, so both these points led us to use Prometheus to determine this.
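As a rough illustration of how the raw Prometheus data could answer "who is inactive, and how much storage do they hold?", here is a minimal sketch. It assumes two instant-query responses in the standard Prometheus HTTP API JSON shape, one for a 'latest modification time' metric and one for a 'total size' metric. The `directory` label name, the 180-day cutoff, and the function names are assumptions made for illustration; check the exporter's README for the actual metric and label names.

```python
from datetime import datetime, timedelta, timezone


def index_by_directory(response, label="directory"):
    """Map each directory label to its sample value from an instant-query response."""
    if response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    return {
        sample["metric"][label]: float(sample["value"][1])
        for sample in response["data"]["result"]
    }


def inactive_directories(mtime_response, size_response, now, max_idle_days=180):
    """Return (directory, idle_days, size_bytes) for directories idle past the cutoff."""
    mtimes = index_by_directory(mtime_response)
    sizes = index_by_directory(size_response)
    cutoff = now - timedelta(days=max_idle_days)
    rows = []
    for directory, mtime in mtimes.items():
        last_modified = datetime.fromtimestamp(mtime, tz=timezone.utc)
        if last_modified < cutoff:
            idle_days = (now - last_modified).days
            rows.append((directory, idle_days, int(sizes.get(directory, 0))))
    # Largest directories first, since they cost the most to keep.
    return sorted(rows, key=lambda row: row[2], reverse=True)


def make_response(values):
    """Build a response in the Prometheus instant-query shape (sample data only)."""
    return {
        "status": "success",
        "data": {
            "resultType": "vector",
            "result": [
                {"metric": {"directory": name}, "value": [0, str(value)]}
                for name, value in values.items()
            ],
        },
    }


# Hypothetical sample data: "alice" last wrote on 2023-01-01, "bob" on 2023-10-01.
sample_mtimes = make_response({"alice": 1672531200, "bob": 1696118400})
sample_sizes = make_response({"alice": 5000000, "bob": 1000})
report = inactive_directories(
    sample_mtimes, sample_sizes, now=datetime(2023, 10, 5, tzinfo=timezone.utc)
)
```

With this sample data, only `alice` is reported, because `bob` wrote to their home directory within the cutoff window.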
As part of this, the original suggestion of moving to GitHub teams is [brought up](https://openscapes.slack.com/archives/C02NC3Y62J1/p1696529418979569?thread_ts=1696528660.956039&cid=C02NC3Y62J1) again. This time, the proposed benefit is that different teams could get access to different profiles. Currently this is instead managed via the *staging* hub having different profiles for mentors to test, but this could unlock more potential by allowing that for the production hub.
### Oct 6, 2023
[An issue](https://github.com/2i2c-org/infrastructure/issues/3240) is opened to help plan moving them to GitHub teams authentication. This may also be the first time it's more clearly [expressed](https://github.com/2i2c-org/infrastructure/issues/3240#issuecomment-1751218146) how much money storage is costing, so there's more determination to get this done.
### Nov 1, 2023
Openscapes staging hub [is moved to GitHubOAuthenticator](https://github.com/2i2c-org/infrastructure/pull/3357). A set of GitHub teams [is created](https://github.com/2i2c-org/infrastructure/pull/3357#issuecomment-1789111871) under the main NASA-Openscapes organization, and testing [confirms](https://openscapes.slack.com/archives/C02NC3Y62J1/p1698854330997959) this works as intended.
After confirmation, the openscapes production hub is [also switched over](https://github.com/2i2c-org/infrastructure/pull/3360) and [confirmed to work](https://openscapes.slack.com/archives/C02NC3Y62J1/p1698863924422269).
**TODO:** Add information here about *how* these users were added to the GitHub teams, and issues around that.
### Nov 3, 2023
There are still users in the JupyterHub admin interface who can no longer log in because they aren't part of any appropriate team, either because they have not been invited or because they have not accepted the invite. It [is decided](https://openscapes.slack.com/archives/C02NC3Y62J1/p1699014134050549) that they will be kept on until we clean up home directories.
### Nov 9, 2023
A user trying to log in [gets a 403](https://openscapes.slack.com/archives/C02NC3Y62J1/p1699470873548449) because they had not already been added to the GitHub teams. It is discovered [and confirmed](https://openscapes.slack.com/archives/C02NC3Y62J1/p1699472879497269?thread_ts=1699472492.281759&cid=C02NC3Y62J1) that GitHub only allows *organization admins* to invite people, and so it is harder to delegate this to more people. At the time of this writing, there are 12 admins on the Openscapes JupyterHub, but only [3 owners](https://github.com/orgs/NASA-Openscapes/people?query=role%3Aowner) of the GitHub org.
There is some confusion about the access policies, because the 403 simply says 'contact administrators'. The policies are written [here](https://github.com/NASA-Openscapes/2i2cAccessPolicies), and sent to the user via slack.
### Nov 10, 2023
It is reported that [workshop invitations](https://github.com/2i2c-org/infrastructure/issues/3240#issuecomment-1804496293) are a struggle. The primary problems [identified](https://github.com/2i2c-org/infrastructure/issues/3240#issuecomment-1805935580) are:
1. Usernames of people coming to a workshop are not known in advance, and it is hard to add them to the GitHub team at the appropriate time.
2. Users are often new to GitHub, and don't always know to 'accept' the invitation.
An unstated issue here is that only 3 people can now grant access to the hub, down from 12!
There is a [suggestion](https://github.com/2i2c-org/infrastructure/issues/3240#issuecomment-1805945980) to allow adding people either via the JupyterHub interface *or* GitHub teams to solve this, and this is [accepted](https://github.com/2i2c-org/infrastructure/issues/3240#issuecomment-1805953451).
A [new issue](https://github.com/2i2c-org/infrastructure/issues/3413) is opened to figure out how to accomplish this.
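The accepted "either / or" policy can be sketched as a small decision function. This is only an illustration of the intended rule, not how JupyterHub or OAuthenticator implement it. The function name, parameter names, and sample usernames are all hypothetical. The optional third set models the idea, also raised in this timeline, of admitting users whose GitHub invite is still pending.

```python
def may_log_in(username, hub_allowed_users, team_members, pending_invitees=frozenset()):
    """Return True if any one of the access sources admits the user."""
    # GitHub usernames are case-insensitive, so compare in lowercase.
    name = username.lower()
    sources = (hub_allowed_users, team_members, pending_invitees)
    return any(name in {member.lower() for member in source} for source in sources)


# Hypothetical sample data for illustration only.
hub_allowed = {"workshop-attendee"}  # added by a hub admin in the JupyterHub interface
team = {"Mentor-A"}                  # accepted member of an allowed GitHub team
pending = {"new-to-github"}          # invited, but has not accepted yet

decisions = {
    user: may_log_in(user, hub_allowed, team, pending)
    for user in ("workshop-attendee", "mentor-a", "new-to-github", "stranger")
}
```

Under this rule, any of the 12 hub admins could grant access through the JupyterHub interface, while GitHub team membership keeps working as a second, independent path.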
### Nov 13, 2023
A few different solutions are proposed on the issue, but involving GitHub invites as a primary source of truth [is deemed](https://github.com/2i2c-org/infrastructure/issues/3413#issuecomment-1808763642) to add a lot of complexity to the process.
This [triggers](https://github.com/2i2c-org/infrastructure/issues/3413#issuecomment-1808885308) the writing of this document, so we can approach this problem with more structure and clarity. Taking account of the *human* part of this infrastructure as well as the technical parts is considered a primary desire.
The next workshop is on Dec 10, and the goal is to make sure everything works as it should by then.
### Nov 16, 2023
There was a workshop today, and it mostly went well! There were just 10-12 people in the workshop, and they were added one by one manually, via the GitHub UI, 1 hour before the workshop started. A few of those folks had ignored the GitHub email, so their invites had to be deleted and re-sent, but it mostly went ok. The AGU workshop (on Dec 10) has more people, so it may be more problematic.
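For a larger workshop, adding people one by one in the UI could in principle be scripted. The sketch below only *builds* the requests, using GitHub's REST endpoint for adding or updating a team membership (`PUT /orgs/{org}/teams/{team_slug}/memberships/{username}`), and sends nothing. The team slug, usernames, and token are placeholders, and the function name is hypothetical. As noted earlier in this timeline, inviting people is restricted to organization admins, so the token would need to belong to one of them, and invitees who are new to GitHub would still need to accept the invitation.

```python
import json
import urllib.request
from urllib.parse import quote

API_ROOT = "https://api.github.com"


def build_membership_requests(org, team_slug, usernames, token):
    """Build one PUT request per username; nothing is sent here."""
    prepared = []
    for username in usernames:
        url = (
            f"{API_ROOT}/orgs/{quote(org)}/teams/{quote(team_slug)}"
            f"/memberships/{quote(username)}"
        )
        prepared.append(
            urllib.request.Request(
                url,
                data=json.dumps({"role": "member"}).encode(),
                headers={
                    "Accept": "application/vnd.github+json",
                    "Authorization": f"Bearer {token}",
                },
                method="PUT",
            )
        )
    return prepared


# Placeholder team slug, usernames, and token, for illustration only.
workshop_requests = build_membership_requests(
    "NASA-Openscapes",
    "workshop-attendees",
    ["attendee-one", "attendee-two"],
    token="<token of an organization owner>",
)

# To actually send them (left commented out in this sketch):
# for request in workshop_requests:
#     urllib.request.urlopen(request)
```

This does not address the first problem identified on Nov 10: the usernames still have to be collected before the workshop.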
The idea floated earlier, that people with *pending* invites are also allowed to log in, may be enough.
There is hope yet for GitHub teams.
## What went well
## Where we got lucky
## What went wrong
## Action items