> Draft text on what a "scalable hub" (aka "simple hub") could refer to.

# Scalable service

The scalable hub I've envisioned is the most feature-complete service that we could still, _at the largest scale conceivable_: communicate, demo, sell, set up, onboard, maintain, support, offboard, and transition from or decommission.

By imposing notable restrictions focused on improving scalability, we can acquire a _clearly distinct_ service. This distinctiveness can lead to great value for us and for the communities using it. I think the scalable service is a meaningfully distinct service, but other services risk not being so.

## Imposed constraints

By imposing constraints, we can get a scalable hub.

- _Shared clusters only_

  By constraining the scalable hubs to deploy to shared clusters, we avoid cloud account setups, support chart deployments, etc., and can instead just do a hub deployment.
- _AWS only_

  The managed NFS service by AWS (EFS) is superior to those of other cloud providers because it is pay-as-you-go without upper size limits, and because it features "Intelligent-Tiering" to transition files to a cheaper infrequent access storage class.
- (?) _User nodes in one zone_
  - Would reduce the cost of EFS storage
  - Would increase the low but non-zero risk that a node is unavailable when we want to scale up
- Only OAuthenticator-based login
- no public access hubs (not just any Google account, for example)
- no user-group-specific profile list entries or options
- no GPU access
- no dask gateway
- no binderhub-service (fancy profiles to build images or binderhub UI)

## User features

- shared folder
  - admins can read/write, non-admin users can only read
- server options:
  - exposes a selection of pre-defined resource requests
  - exposes a selection of externally managed images
  - exposes up to X custom images
- server shutdown system to save cloud cost
  - max idle time 60 minutes
  - max time 7 days
- (?) community-specified domain

## Meta features

- easy to co-locate next to data
- right to replicate
- cryptomining abuse protection
- security consideration
- upstream contributions
- (?) allusers folder
- (?) access to a scratch bucket (7 days retention)

  A reason not to
  - while it's scalable to set up, it's non-trivial to describe how, when, and why to use object storage, and we would then also need to make assumptions about the software in the user image (is `aws` installed, etc.)
- (?) (-) access to requester-pays-based object storage

  A reason not to
  - the hub could possibly be set up co-located with the data as well.

## Initial setup info requirements

- A narrower selection of resource allocation choices?
- A narrower selection of pre-defined images?
- Any community-specified images?
- (TODO: list not complete)

## Documentation needs

- Pre-sale docs to help drive sales
- Pre-sale demo to help drive sales
- Onboarding-related docs
- What, if anything, is covered by support
- (this list is incomplete!)

## Cloud cost compensation

In shared clusters we will be invoiced by the cloud provider and we need to be compensated, but how isn't set in stone. We now have practices for attributing cloud costs to communities and passing them on, but we could have other ways to get compensated. Here are choices to consider.

- _Full passthrough_

  With this option, all cloud costs are attributed and passed through to the communities. This is what we currently do in shared clusters.
- (?) _Partial passthrough_

  With this option, we could absorb some costs into the service cost while still passing through others. We could, for example, absorb the cost of core nodes, which could then make the passthrough cost zero if the hub isn't used.
- (?) _Fixed cost_

  With this option, we would invoice communities a fixed amount. It requires us to protect ourselves from paying more than we get compensated for, which could for example be done through various constraints.

### (??) Potential of cloud credits

With us paying the cloud costs and then invoicing communities to get compensated, we could practically convert credits into cash. There are big caveats related to this that would need to be resolved though, such as CS&S currently taking no cut of the cloud costs.

## New feature ideas

- (?) On-demand transition to a new cloud region among a selection
- (?) Monthly usage activity report
- (?) User session reports
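
To make the three options under "Cloud cost compensation" a bit more concrete, here is a minimal Python sketch of what a community would be invoiced under each of them. It is only an illustration under stated assumptions: the split of a hub's monthly cloud cost into `core_nodes` and `usage_based`, the function names, and all dollar figures are hypothetical and not taken from our actual billing practices.

```python
from dataclasses import dataclass


@dataclass
class MonthlyCloudCost:
    """Cloud cost attributed to one hub for one month (hypothetical split)."""

    core_nodes: float   # assumed: the hub's share of always-on core nodes
    usage_based: float  # assumed: user nodes, storage, anything that scales with use


def full_passthrough(cost: MonthlyCloudCost) -> float:
    """All attributed cloud costs are invoiced to the community."""
    return cost.core_nodes + cost.usage_based


def partial_passthrough(cost: MonthlyCloudCost) -> float:
    """Core node cost is absorbed into the service cost; the rest is passed on."""
    return cost.usage_based


def fixed_cost_exposure(cost: MonthlyCloudCost, fixed_invoice: float) -> float:
    """Cloud cost we pay beyond what a fixed invoice covers (0 if fully covered)."""
    return max(0.0, full_passthrough(cost) - fixed_invoice)


# Hypothetical figures: an unused hub and a busy hub.
idle = MonthlyCloudCost(core_nodes=40.0, usage_based=0.0)
busy = MonthlyCloudCost(core_nodes=40.0, usage_based=260.0)

print(full_passthrough(idle))                          # 40.0
print(partial_passthrough(idle))                       # 0.0
print(fixed_cost_exposure(busy, fixed_invoice=200.0))  # 100.0
```

With these made-up numbers, an unused hub is invoiced nothing under partial passthrough, while a busy hub under a fixed cost leaves us covering the difference. That difference is the exposure the fixed cost option would need constraints to bound.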