# Production readiness review - Day 1+2 21-22/04/2020

## Participants

### Jetstack

* [Jon Tutcher](mailto:jon.tutcher@jetstack.io)
* [Christian Simon](mailto:christian.simon@jetstack.io)

## Video call

* Hangout: **meet.google.com/nro-svpn-gkm** +44 20 3910 5845 PIN: 326 194 948#

## Questions

*Just ask here or in the video chat*

### Links

- This site [https://hackmd.io/ECvgiORATEqc6ePEnBCIFw](https://hackmd.io/ECvgiORATEqc6ePEnBCIFw)

## Agenda

### Schedule (Both Days)

|                 | (BST)       |
|-----------------|-------------|
| Start           | 09:30       |
| Lunch           | 12:00-13:00 |
| Finish          | 17:00       |

### Day 1

* Introduction
* Establish where to focus in the next two days (e.g. break-out sessions)
* High-level overview of the cluster architecture, including the applications running on it
* HA
    * What are the availability requirements?
    * Outages?
    * Load balancing control plane via DNS
    * Looking into etcd topology
* Ingress in depth
    * MetalLB in Layer 2 mode (two IP ranges, one of which is allocated from)
    * Nginx-Ingress (from NGINX Inc.)
    * Uses wildcard cert
* Readiness/Liveness probes
* Resource management of containers
* Dynamic scaling
    * Workload autoscaling (VPA, HPA)
    * Cluster autoscaling
* Stateful applications?
    * NFS
    * Block storage?
* Workloads/applications
    * Jupyterhub
* Helm best practices
    * Tiller security
* CNI
    * Choice of Weave (multicast)
* Cluster API
    * https://cluster-api.sigs.k8s.io/
    * Potentially the future for the lower levels of Kubernetes deployments
    * Manage Kubernetes through Kubernetes
    * Bare metal provider: http://metal3.io/

### Day 2

* Monitoring
    * Great PromQL introduction: https://promcon.io/2019-munich/slides/promql-for-mere-mortals.pdf
    * Thanos for global view, deduplication: https://improbable.io/blog/thanos-architecture-at-improbable
    * Cortex: https://cortexmetrics.io/docs/production/running-in-production/
* RBAC
    * User auth using certificates (potential alternatives)
    * Dynamic spin up of Namespaces including namespaces
    * Every fedid has a namespace (fedid == user)
    * Projects with multiple users are necessary
* PSP
    * Custom policies
    * NFS uid-gid remapping / hostPath approach
* Going over check list
    * General pitfalls of cluster setup
    * Gather kubeadm configs
    * RBAC + PSP setup

## Required

- From control-plane nodes `/etc/kubernetes/manifests/*.yaml`
- Kubeadm config generated from Ansible
- `kubectl get configmap kubelet-config-x.y kubeadm-config`
- Templated RBAC and PSP rules
- JWT OpenID token from a user (without signature)
- `kubectl get nodes -o yaml`
- `docker info`
- `systemd-cgls`