This is treated as an ongoing issue of the technical and human infrastructure of the Openscapes hub, and this document is written in the style of a blameless postmortem.
(to be written at the very end)
There is a desire to delete a list of users in the JupyterHub admin panel in one go. This would require upstream work in the JupyterHub project - particularly, frontend upstream work. We don't have the capacity to do that at the moment, so we instead suggest switching to using GitHub teams for authentication. This would 'outsource' the handling of user accounts to GitHub, and many other communities have been pretty happy with it. Using organizations and teams as two levels of providing access has mapped very well to other communities, and it is thought it would work well here too. This suggestion is welcomed, and a new issue is created to track progress here.
This actually triggers a bunch of work that is broadly useful to everyone using JupyterHub.
We validate that removing a user from a GitHub team actually denies them access, even if they are still present in the JupyterHub admin interface, but only after they also log out. A new config is added upstream to help prevent this. A big refactor of the OAuthenticator project is performed to help with this. A lot of further work is identified as needed to move this forward.
A year of work eventually results in a big new 16.0 release with a lot of features and better maintainability.
We now feel much better about moving Openscapes to using GitHub auth with teams, as overall both 2i2c and the upstream JupyterHub community now have a better understanding of, and more trust in, how GitHub authentication works for people who are removed from having access. Yay for broad improvements that help everyone!
In the Openscapes Slack, there is a request to help clean up the hub access list. In particular, there is a desire to answer the question of "who is active?" There were 889 users; how many are actually still using it? The JupyterHub admin panel has a 'last active' date, but this is not clear enough.
Thankfully, since another project (LEAP) also had similar questions, we had built and deployed prometheus-dirsize-exporter to monitor both the size and the 'last updated' timestamp of user home directories, and this is a pretty accurate view of when a user was last active. The JupyterHub API also has a decent view of that in this case, because we have kept the same JupyterHub database throughout. However, since Prometheus also tells us how big each user's home directory is, we can also help cut costs by getting rid of the home directories of inactive users (or dealing with them in some way - tbd). Adding a new Grafana graph or providing raw data from Prometheus is also easier than writing something that talks to the JupyterHub API, so both these points led us to using Prometheus to determine this.
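As a rough illustration of how the 'last updated' data could answer "who is active?", here is a minimal sketch in Python. It assumes the exporter's data has been fetched as a standard Prometheus instant-query JSON response for a last-modified-time metric (something like `dirsize_latest_mtime`; the metric name and the `directory` label are assumptions here, not confirmed by this document). The function name `inactive_users`, the 180-day threshold, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def inactive_users(prom_response, now, max_idle_days=180):
    """Return (directory, last_modified) pairs idle for longer than max_idle_days.

    prom_response is the parsed JSON of a Prometheus instant query whose
    sample values are Unix timestamps (seconds) of the latest modification.
    """
    cutoff = now - timedelta(days=max_idle_days)
    stale = []
    for sample in prom_response["data"]["result"]:
        # Instant-vector samples look like {"metric": {...}, "value": [ts, "string"]}.
        # The "directory" label name is an assumption for this sketch.
        directory = sample["metric"].get("directory", "<unknown>")
        mtime = datetime.fromtimestamp(float(sample["value"][1]), tz=timezone.utc)
        if mtime < cutoff:
            stale.append((directory, mtime))
    # Oldest first, so the longest-idle home directories come out on top.
    return sorted(stale, key=lambda pair: pair[1])


# Hand-written sample data in the shape of a Prometheus instant-query response.
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"directory": "alice"}, "value": [1700000000, "1699990000"]},
            {"metric": {"directory": "bob"}, "value": [1700000000, "1600000000"]},
        ],
    },
}

now = datetime.fromtimestamp(1700000000, tz=timezone.utc)
stale = inactive_users(sample_response, now)
for directory, mtime in stale:
    print(directory, mtime.date().isoformat())
```

The same response shape could also carry directory sizes, so a similar loop could rank inactive home directories by how much storage they use.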
As part of this, the original suggestion of moving to GitHub teams is brought up again. This time, the proposed benefit is that different teams could get access to different profiles. Currently this is instead managed via the staging hub having different profiles for mentors to test, but this could unlock more potential by allowing that for the production hub.
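To make the idea of per-team profiles concrete, here is a minimal sketch in Python of filtering a profile list by team membership. This is an illustration only, not how the hub actually implements it: the function `profiles_for_user`, the `allowed_teams` key, and the team and profile names are all hypothetical.

```python
def profiles_for_user(profiles, user_teams):
    """Keep profiles that are unrestricted or that share a team with the user."""
    user_teams = set(user_teams)
    visible = []
    for profile in profiles:
        allowed = profile.get("allowed_teams")  # hypothetical key for this sketch
        # No restriction means everyone sees the profile.
        if not allowed or user_teams & set(allowed):
            visible.append(profile)
    return visible


# Hypothetical profiles: one open to everyone, one restricted to a mentors team.
PROFILES = [
    {"display_name": "Small"},
    {"display_name": "Large (mentors only)", "allowed_teams": ["NASA-Openscapes:mentors"]},
]

participant_view = profiles_for_user(PROFILES, ["NASA-Openscapes:workshop"])
mentor_view = profiles_for_user(PROFILES, ["NASA-Openscapes:mentors"])
print([p["display_name"] for p in participant_view])
print([p["display_name"] for p in mentor_view])
```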
An issue is opened to help plan moving them to GitHub teams authentication. This may also be the first time it's more clearly expressed how much money storage is costing, so there's more determination to get this done.
Openscapes staging hub is moved to GitHubOAuthenticator. A set of GitHub teams is created under the main NASA-Openscapes organization, and testing confirms this works as intended.
After confirmation, the openscapes production hub is also switched over and confirmed to work.
There are still users in the JupyterHub admin interface who can no longer log in because they aren't part of any appropriate team, either because they have not been invited or because they have not accepted the invite. It is decided that they will be kept on until we clean up home directories.
A user trying to log in gets a 403 because they had not already been added to the GitHub teams. It is discovered and confirmed that GitHub only allows organization admins to invite people, and so it is harder to delegate this to more people. At the time of this writing, there are 12 admins on the Openscapes JupyterHub, but only 3 owners of the GitHub org.
There is some confusion about the access policies, because the 403 simply says 'contact administrators'. The policies are written here, and sent to the user via Slack.
It is reported that workshop invitations are a struggle. The primary problems identified are:
An unstated issue here is that only 3 people can now grant access to the hub, down from 12!
There is a suggestion to solve this by allowing people to be added either via the JupyterHub interface or via GitHub teams, and this is accepted.
A new issue is opened to figure out how to accomplish this.
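The accepted suggestion amounts to treating the two sources of access as a union: a user is let in if either the JupyterHub-managed list or a GitHub team grants it. A minimal sketch of that decision in Python follows; the function `may_log_in` and the usernames are hypothetical, and this models the policy only, not the authenticator's actual implementation.

```python
def may_log_in(username, hub_allowed_users, team_members):
    """Allow access if either source grants it (a union, not an intersection)."""
    name = username.lower()  # GitHub usernames are case-insensitive
    hub = {u.lower() for u in hub_allowed_users}
    team = {u.lower() for u in team_members}
    return name in hub or name in team


# Hypothetical data: one person added by a hub admin, one added to a GitHub team.
hub_allowed_users = ["workshop-attendee"]
team_members = ["Long-Term-Mentor"]

print(may_log_in("workshop-attendee", hub_allowed_users, team_members))
print(may_log_in("long-term-mentor", hub_allowed_users, team_members))
print(may_log_in("stranger", hub_allowed_users, team_members))
```

With a union, any of the 12 hub admins could grant access through the JupyterHub interface, while the GitHub teams remain available for longer-lived membership.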
A few different solutions are proposed on the issue, but involving GitHub invites as a primary source of truth is deemed to add a lot of complexity to the process.
This triggers the writing of this document, so we can approach this problem with more structure and clarity. Taking account of the human part of this infrastructure as well as the technical parts is considered a primary goal.
The next workshop is on Dec 10, and the goal is to make sure everything works as it should by then.
There was a workshop today, and it mostly went well! There were just 10-12 people in the workshop, and they were added one by one manually, via the GitHub UI, 1 hour before the workshop started. A few of those folks had ignored the GitHub email, so their invites had to be deleted and re-sent, but mostly ok. The AGU workshop (on Dec 10) has more people, so it may be more problematic.
The idea floated earlier, that people with pending invites are also allowed to log in, may be enough.
There is hope yet for GitHub teams.