# Tutorials
## Set up your machine with the right environment and access
- Full local setup to use everything
- Install tools
  - sops
  - kubectl
  - gcloud
  - aws
  - az
  - terraform
  - python3
  - docker
  - repo2docker
- Local Python environment setup
  - How to set up your environment to run `deploy.py`
- Authenticate
  - sops auth (see the sketch below)
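A minimal sketch of what this setup might look like. The requirements file name, the secret path, and the use of Google Cloud KMS for sops keys are all assumptions here:

```bash
# Create and activate a local Python environment
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt  # filename is an assumption

# Authenticate with the cloud provider; if the sops keys live in
# Google Cloud KMS (an assumption), gcloud credentials are enough
gcloud auth login
gcloud auth application-default login

# Smoke-test that sops can decrypt (path is hypothetical)
sops --decrypt secrets/example.secret.yaml
```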
# Topic guides
## Logging
What are logs?
Where do they come from?
- JupyterHub
- Notebook server
- Proxy
- nginx-ingress
- Kubernetes control plane
- GitHub action logs (image builds and deploys)
- Autoscaler
- Console of the cloud provider
- dask-gateway controller, api server, schedulers, workers
### Links to how-tos about accessing /using logs
*(these might be how-tos)*
Logs of things that are currently running: `kubectl logs`
Logs of things that are no longer running: cloud-provider-specific tooling
Looking at *events* (`kubectl describe`; see the sketch below)
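A sketch of the `kubectl` side of this; pod and namespace names are placeholders:

```bash
# Logs from a pod that is currently running
kubectl logs --namespace <namespace> <pod-name>

# Follow live output, or read the previous (crashed) container's logs
kubectl logs --namespace <namespace> <pod-name> --follow
kubectl logs --namespace <namespace> <pod-name> --previous

# Events for one pod: scheduling failures, image pull errors, etc.
kubectl describe pod --namespace <namespace> <pod-name>

# All recent events in a namespace, oldest first
kubectl get events --namespace <namespace> --sort-by=.metadata.creationTimestamp
```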
## Configuration
### What kinds of configuration do we use and where is it located?
Config cascade and hierarchy
- upstream software (jupyterhub, authenticator, kubespawner)
- z2jh
- our helm-templates (basehub, daskhub)
- deployer customizations (which we are reducing)
- per-hub overrides (viewable in cluster.yaml)
- configurator
- admin panel
- notebook-specific user config on user home directories
Start at the most specific level and look upwards
*(this might be a how-to)* Common locations where config lives (see the sketch below for inspecting deployed config):
- where admin users are configured
- where resource usage limits are configured
- other common settings and where they live
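One way to see the merged configuration a hub was actually deployed with; a sketch, assuming each hub is a Helm release (release and namespace names are placeholders):

```bash
# Values the release was deployed with (overrides only)
helm get values <release-name> --namespace <namespace>

# Include the chart's defaults too
helm get values <release-name> --namespace <namespace> --all

# The rendered config objects that end up inside the cluster
kubectl get configmap --namespace <namespace>
```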
## Support charts
What gets deployed on each cluster?
- prometheus (prometheus-server, kube-state-metrics, node-exporter), grafana, nginx-ingress
- (autoscaler on AWS)
- List their functions, with links to upstream docs on what they are (see the sketch below)
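A quick way to see what's actually running; the `support` namespace name is an assumption:

```bash
# Pods for prometheus, grafana, nginx-ingress, etc.
kubectl get pods --namespace support

# Chart versions currently deployed
helm list --namespace support
```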
## Dask and `dask-gateway`
- What is Dask? (link out to upstream docs)
- parts of dask (distributed, dask-gateway, (alt: dask-kubernetes), dask-labextension)
- What our deployment of dask is like (dask-gateway)
- Our node labeling conventions for dask
- Different parts of dask-gateway and what each does (see the sketch below)
- Note that you don't have to be a Dask user to help debug Dask, the same way you don't have to be a NumPy user to help debug Jupyter!
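A rough sketch for finding the dask-gateway pieces in a cluster. Label conventions vary by chart version, so plain grep is used here as the blunt fallback:

```bash
# Find dask-gateway components (controller, api server, schedulers, workers)
kubectl get pods --namespace <namespace> | grep dask

# Tail one component's logs; the pod name is a placeholder
kubectl logs --namespace <namespace> <dask-gateway-pod> --follow
```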
## Cluster design
Node pools we have and why we have them
- core nodes
- user nodes (notebook pods; dask schedulers also land here)
- dask worker nodes (ephemeral; see the sketch below for listing pools)
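Listing nodes per pool with `kubectl`; the GKE label key is shown as an example, other providers use different keys:

```bash
# All nodes with their labels (the pool label identifies the node pool)
kubectl get nodes --show-labels

# Only nodes from one pool; this label key is GKE-specific
kubectl get nodes -l cloud.google.com/gke-nodepool=<pool-name>
```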
Considerations for *sizing* node pools on CPU & RAM
Considerations for type of disk on node pools
- cost vs speed of image pulls
Differences in Cloud Providers
Autoscaling
- GKE
- EKS + autoscaler
- Azure
Network policy and why we have it
*Where* to put the cluster (location)
Highly available cluster control plane (when, why, cost)
## Image building and management process
- What's required inside the image?
- repo2docker docs (see the sketch below)
- ???
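A sketch of building an image locally with repo2docker, to check that a repo produces a usable image; the repo URL and image name are placeholders:

```bash
# Build, but don't launch, an image from a repository
jupyter-repo2docker --no-run \
  --image-name <image-name>:latest \
  https://github.com/<org>/<repo>
```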
## Home directory storage
- Cloud-provider specific
  - EFS on AWS <3
  - Filestore on GCP
  - Azure Files on Azure
  - custom NFS VM in some places, why?
- How Unix permissions work
  - They don't, haha!
  - uid 1000 and gid 1000 on everything
  - how is access restricted to just the user's own home directory? (see the sketch below)
- What are the shared/ and shared-readwrite/ directories?
- What is nfs-share-creator and why do we have it?
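A sketch for checking uid/gid and home directory ownership from inside a running user server. The `jupyter-<username>` pod name follows kubespawner's default pattern and `/home/jovyan` is the usual image home directory; both are assumptions here:

```bash
# Which uid/gid the user server runs as (expect uid=1000, gid=1000)
kubectl exec --namespace <namespace> jupyter-<username> -- id

# Ownership and permissions on the mounted home directory
kubectl exec --namespace <namespace> jupyter-<username> -- ls -ld /home/jovyan
```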
# How-tos
## Shared home directory
- Enable the shared/ and shared-readwrite/ directories
## Image building
- Set up an image to be built on quay.io from a new GitHub repo (manual push sketch below)
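For reference, a manual build-and-push sketch; a real setup would likely automate this in CI (e.g. a GitHub Actions workflow), and the org/image/tag names are placeholders:

```bash
docker build -t quay.io/<org>/<image>:<tag> .
docker login quay.io
docker push quay.io/<org>/<image>:<tag>
```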
## Autoscaling
- Manually scale a node pool up (and down) on each cloud provider (see the sketches below)
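Sketches per provider; cluster, pool, and size values are placeholders, and the EKS example assumes the cluster was created with eksctl:

```bash
# GKE: resize a node pool to a fixed size
gcloud container clusters resize <cluster> \
  --node-pool <pool> --num-nodes <n> --zone <zone>

# EKS (eksctl-managed nodegroups)
eksctl scale nodegroup --cluster <cluster> --name <nodegroup> --nodes <n>

# AKS
az aks nodepool scale --resource-group <resource-group> \
  --cluster-name <cluster> --name <pool> --node-count <n>
```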
## Support charts
- How to deploy (helm sketch below)
- How to decommission
- How to look at their logs (link to our logs how-to section)
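A sketch, assuming the support components are packaged as a local Helm chart; the chart path, release name, values file, and namespace are all assumptions:

```bash
# Deploy (or upgrade) the support chart
helm upgrade --install support ./support \
  --namespace support --create-namespace \
  --values support-values.yaml

# Decommission it
helm uninstall support --namespace support
```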
## Debugging
### How to access logs
- Things that are currently running
  - `kubectl logs`
  - Table of options to try, with quick pointers to different things (see the sketch below)
- Things that may not be running anymore
  - Cloud provider UIs
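A starting set of `kubectl logs` variations to try; names are placeholders:

```bash
kubectl logs <pod>                              # current container output
kubectl logs <pod> -c <container>               # a specific container in the pod
kubectl logs <pod> --previous                   # the last (crashed) container
kubectl logs <pod> --since=1h                   # only the last hour
kubectl logs -l <key>=<value> --all-containers  # across pods, by label
```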