# Building and Deploying Apps with Carvel
## "Learning Paths"
1. [Iterations through K8s-based deployments](#Iterations-Through-K8s-based-Deployments)
2. [Use Case: Deploy and keep up-to-date a self-hosted github runner using carvel](#Use-Case-Deploy-and-Keep-up-to-date-a-Self-Hosted-GitHub-Runner)
...
### Iterations Through K8s-based Deployments
From a monolith app, evolve along two lines:
1. decompose into a set of micro-services
2. increase the level of abstraction on top of "virtualization" tech
so that you can have first-hand experience of...
- what added complexities distributed systems bring
- what specific leverage each "virtualization" technology layer provides
#### Iteration 0: Monolithic App
Create an app that performs three or four actions.
Ex: a web app that:
- serves a single page
- selects a template
- selects a message to populate that template
- simulates heavy processing by sleeping
Put this source in git to track the evolution.
#### Iteration 0.1: Containerize the App
Get the app running in a Docker container, on your local machine (capture the steps in script(s)).
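A minimal sketch of what those scripts might contain (image name and port are placeholders):
```bash=
# build an image from the app's Dockerfile and run it locally
docker build -t hello-monolith:v0.0.1 .
docker run --rm -d -p 8080:8080 hello-monolith:v0.0.1
curl http://localhost:8080
```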
#### Iteration 0.2: Orchestrate the Container
Get that Docker container to deploy on a local Kubernetes cluster (capture in other script(s)).
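A sketch of those scripts, assuming `kind` for the local cluster and a `deploy/` directory of manifests:
```bash=
# create a throwaway local cluster, load the locally built image into it, and deploy
kind create cluster --name hello
kind load docker-image hello-monolith:v0.0.1 --name hello
kubectl apply -f deploy/
kubectl get pods
```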
#### Iteration 0.3: On the Web
Deploy the same app on a managed Kubernetes cluster (e.g. GKE)
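A hedged sketch of the GKE path; the cluster name, zone, and node count are arbitrary choices:
```bash=
# create a managed cluster, fetch kubeconfig credentials, and apply the same manifests
gcloud container clusters create hello-cluster --zone us-central1-a --num-nodes 1
gcloud container clusters get-credentials hello-cluster --zone us-central1-a
kubectl apply -f deploy/
```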
#### Iteration 1: Micro-Services
**What?** Decompose the app from [Iteration 0](#Iteration-0-Monolithic-App) into a small collection of services.\
**Why?** Highlight the challenges that arise from declaring and coordinating multiple containers.
Ex:
1. a main web app that will service requests
2. a micro-service that given the time of day, returns a template
3. another micro-service that selects a message of the day at random
4. a third micro-service that sleeps between a configurable minimum and maximum number of seconds.
#### Iteration 2: Second Party Software
**What?** Extract one (or more) of the services into a separate project/module. \
**Why?** Make clearer the work required to stitch a distributed system back together.
Ex:
1. Write a greeting web service from which the gateway web server obtains its greeting message.
1. Separate the web service into its own git repo (including all configuration)
1. Support multiple environments: dev, ci, stage, prod (a `ytt` sketch of this step follows the list)
1. Upgrade to the next version
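One way the multi-environment step might look with `ytt`; the directory names and Data Values files here are hypothetical:
```bash=
# config/ holds the templates; each env file is a ytt Data Values overlay (annotated with #@data/values)
ytt -f config/ -f envs/dev.yaml   | kubectl apply -f -
ytt -f config/ -f envs/stage.yaml | kubectl apply -f -
ytt -f config/ -f envs/prod.yaml  | kubectl apply -f -
```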
### Use Case: Deploy and Keep up-to-date a Self-Hosted GitHub Runner
Motivation:
- GitHub Actions outages may occur in the future.
- we depend on GitHub to test / release our software.
...
### References / Resources
- Tanzu Developer Center article on using kapp-controller: https://hackmd.io/@Ym0e3B30SCaypZ4mW4h3AQ/SJWBbscbt
- Kubernetes Tutorials: https://kubernetes.io/docs/tutorials/
- GCP Console > Kubernetes Clusters: https://console.cloud.google.com/kubernetes/list/overview?project=cf-k8s-lifecycle-tooling-klt
- Path to production: https://miro.com/app/board/o9J_kg86zEk=/
---
:::info
:information_source: **Exploring the Why** \
This section is an attempt to connect the dots from market motivation(s) to the work that Carvel tools do. This is useful to the extent that knowing so is motivating and/or engenders cognitive empathy for those using our tools.
:::
## The Big Why
Why would anyone want to use Kubernetes in the first place?\
If they are using Kubernetes, why would they use Carvel tools? \
And who _are_ these people anyway?
**Goal:** I want to provide SaaS on the web, in a way that makes it quick and easy to ...
- decompose the system into many parts with minimal additional overhead
- scale up (and down) capacity — so that the service can meet exponential growth _at minimal cost_
- create (nearly) identical environments — to gain confidence new software is working
- ensure that only the software I intend to run is what's deployed
- update the deployment(s) with no interruption to users
**Strategy:** build or buy a Kubernetes-based PaaS
- decompose the system into "microservices"
- write each microservice in the language/platform that suits its tasks best
- deploy each microservice as a collection of Kubernetes `Pods` and their attendant parts
- `ConfigMap`s and `Secret`s for "injecting" configuration data
- `Volume`s for long-term storage
- `Service`s for networking
- `ServiceAccounts` for permissions control
- `Namespaces` for service isolation and quotas
- ... and much much more!
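As a concrete (if minimal) sketch of a microservice and its "attendant parts", assuming a hypothetical `greeter` service and a placeholder image name:
```bash=
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: greeter
spec:
  replicas: 2
  selector:
    matchLabels: {app: greeter}
  template:
    metadata:
      labels: {app: greeter}
    spec:
      containers:
      - name: greeter
        image: registry.example.com/greeter:v0.0.1   # placeholder image
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: greeter
spec:
  selector: {app: greeter}
  ports:
  - port: 80
    targetPort: 8080
EOF
```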
**Problems:** Kubernetes is not a PaaS... it's a PaaS construction kit and batteries are not included
- there is a lot of redundant config in Kubernetes manifests
- there are many essential jobs that are out-of-scope for Kubernetes:
- the supply chain of software into the cluster (having one and securing it)
- building container images
- distributing container images
- collecting container images that will run together into groups
- verifying that the container images running are the same ones produced by the developers
- verifying that the container images running were vetted by
- see also: https://cloud.google.com/artifact-registry/docs/secure#public-sources
- keeping the stuff deployed on the cluster organized (there is no Kubernetes Object that represents our entire deployed system)
**Solution:** Build the remaining parts of a PaaS, in part with Carvel tools
- capture boilerplate configuration in `ytt` templates
- extract variables as "Data Values"
- extract repeated YAML fragments as functions
- extract collections of templates as libraries
- allow operators to make configuration changes through `ytt` overlays
- build container images with `kbld`
- scrape the Kubernetes config for references to container images
- for first-party software, configure a builder to produce the image from source
- for second/third party software, locate the image in some registry
- resolve references to their cryptographic digest to establish immutability/identity of the software
- distribute collections of images (along with their K8s config) with `imgpkg`
- center around the emerging "filesystem of the net" — Docker/OCI registries
- combine K8s configuration with the images it references into a single unit: a bundle
- deploy sets of services/components as "App"s using `kapp`
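Putting those tools together, a sketch of one possible end-to-end flow (the `config/` directory, app name, and registry path are hypothetical):
```bash=
# template, resolve image references to digests, and deploy as one named app
ytt -f config/ | kbld -f - | kapp deploy -a hello -f - --yes

# distribute the config plus the images it references as a single imgpkg bundle
imgpkg push -b registry.example.com/apps/hello-bundle:v0.1.0 -f .
imgpkg pull -b registry.example.com/apps/hello-bundle:v0.1.0 -o ./hello-bundle
```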
TBD:
- secure "chain of custody" for deployed software through signatures
- signatures "prove" that a certain process was performed on that software (e.g. vulnerability scan)
### See Also
- https://vmblog.com/archive/2021/09/24/improving-the-lifecycles-of-kubernetes.aspx
> Reflecting on the experience for authors of Kubernetes-based workloads, there are large gaps in packaging and lifecycle management.
>
> First, developers and operators have to install, manage, and update packaged software manually through a set of imperative commands and without being able to use standard Kubernetes APIs. The above approach gets even more cumbersome, complex to learn, and error prone if the packaged software being installed has dependencies. An imperative approach is easy to get started with, but poses challenges for Day 2 operations, such as updates.
>
> Next, developers and operators have a hard time knowing what is running on a cluster. It is difficult to inspect various Kubernetes objects, and this gets more complex when software has dependencies. Creating and managing clones of a cluster for dev, staging, and prod is hard. Auditing software to ensure that what is running on the cluster is up to date and patched and matches the desired configuration is equally a manual and error-prone process.
>
> Last, the user experience for a developer who is writing and running their own software is the same as that for software that developers are consuming that are written by someone else. This often leads to developers needing to learn a lot more about the packaged software than they want to.
>
> Inspired by the "small, sharp tool" philosophy of unix, Carvel provides a set of reliable, single-purpose, composable tools that aid in application building, configuration, and deployment to Kubernetes. It offers declarative APIs to enable easy updates to software by updating configuration files and letting Kubernetes do what it does best in reconciling state. Building on that, it provides immutable bundles of software distributed using OCI registries so that you know exactly what is running on your cluster and can reproduce the state of a cluster at will. Lastly, it uses a layered approach with appropriate abstractions to provide a UX that is most suited to what you are doing, since operators can use these tools separately or in concert.
---
# John's First Exploration
## Iteration 0: Simple App Deployed Manually
- I just happened to open VS Code and was prompted with a bunch of Docker/K8s-related plugins.
- One such plugin was "Cloud Code — Kubernetes".
- That plugin had a "Create a Kubernetes Sample" button which spewed out a simple web app.
- Incidentally, they integrate with Skaffold to deploy the app.
- this got me thinking about the development experience: should `kbld` serve developers? If so, how?
- when I simply did a `docker build` and `docker run`, I couldn't access the app at first
- I learned that there are two levels of port opening in docker: expose and publish.
- EXPOSE is a way of documenting that the contained application(s) are listening on the named port.
- publishing a port means actually configuring IP tables to flow traffic from the host to the container on the named port.
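- for example, the difference looks like this (image name is a placeholder):
```bash=
# EXPOSE 8080 in the Dockerfile only documents the port; nothing reaches the container
docker run -d my-app:v0.0.1
# publishing the port wires host port 8080 to container port 8080
docker run -d -p 8080:8080 my-app:v0.0.1
```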
- I deployed the app, locally, :fireworks:
- Next is to get it in a cluster.
- I created a new Kubernetes Cluster in GCP
- To log in, I need to use gcloud to authenticate
- gcloud CLI on my machine is broken (some incompatibility with python?)
- I installed python 3.8 (brew install python@3.8) ==> gcloud now working
```bash=
$ gcloud auth login
$ gcloud container clusters get-credentials hello-cluster
```
- I then simply deployed the app to that cluster
```bash=
$ kubectl apply -f src/deploy
deployment.apps/go-hello-world created
service/go-hello-world-external created
```
- But I didn't know if my app was really up (nor did I know what IP address was serving my app)
- I had to run `kubectl get all` and locate the service to see the external ip address
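- a more targeted check is to ask for the Service by name (from the `kubectl apply` output above) and read the EXTERNAL-IP column:
```bash=
$ kubectl get service go-hello-world-external
```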
- I deleted
```bash=
$ kubectl delete -f src/deploy/
deployment.apps "go-hello-world" deleted
service "go-hello-world-external" deleted
```
- which seemed to hang for quite a while (no additional output) and then just returned.
- Now deploying with `kapp`
```bash=
$ kapp deploy -a hello -f src/deploy
(lots of informative output!)
```
- And when I delete the app...
```bash=
$ kapp delete -a hello
```
- I don't need the original files, and I see output that shows the progress of the delete itself.
## Iteration 1: The "Babble" Microservice
- We decided that both coming up with a greeting _and_ delivering that greeting is much too much responsibility for one team.
- We spawned a new product and a team around it: Babble.
- This new team uses Node.js
```bash=
$ docker build -f babble/src/build/Dockerfile babble/src/app/ -t jsr-babble:v0.0.1
$ docker run -d jsr-babble:v0.0.1
$ docker network inspect bridge
# look for the container's IP address; here, it was 172.17.0.2
$ docker build -f hello-world/src/build/Dockerfile hello-world/src/app/ -t jsr-hello:v0.0.1
$ docker run -p 8080:8080 --env BABBLE_HOSTNAME=172.17.0.2:3000 jsr-hello:v0.0.1
$ curl http://localhost:8080
```
- as I was developing, even with just two containers, I started losing track of whether the current published image had my latest changes.
- Turning to Kubernetes, I wrote the essential Deployment and Service manifests.
- I wired up the "babble" service to "hello" by simply referencing it by its DNS name
(conveniently, if the two are in the same namespace, it's just the service name!)
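- a sketch of that wiring, assuming the Deployment is still named `go-hello-world` and a Service named `babble` listens on port 3000 in the same namespace:
```bash=
# point the hello deployment at the babble Service by its DNS name
$ kubectl set env deployment/go-hello-world BABBLE_HOSTNAME=babble:3000
# from another namespace, the fully qualified name would be babble.<namespace>.svc.cluster.local
```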
**Pain points so far...**
At this juncture, I have identified some pain points as a developer:
- for every change I make, I need to compile, test, build an image, tag and push that image, and then redeploy everything (neatly illustrated by Skaffold: https://skaffold.dev/docs/pipeline-stages/). This has gotten old, quick.
- I want to move off of DockerHub. The rate limiting, the lack of paths... it's just irritating.
**Switching to Google Artifact Registry**
- Artifact Registry is the next generation of Google Container Registry (GCR)
- now handles half-a-dozen artifact types including Docker images
- now there are per-repository permissions
- I needed to enable its API for our project (done).
- https://cloud.google.com/artifact-registry/docs/docker/quickstart
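A sketch of pointing `docker` at Artifact Registry (region, repository, and project names are placeholders):
```bash=
# teach the local docker client to authenticate against the Artifact Registry host
gcloud auth configure-docker us-central1-docker.pkg.dev
# re-tag and push an image into a repository there
docker tag jsr-hello:v0.0.1 us-central1-docker.pkg.dev/my-project/jsr/jsr-hello:v0.0.1
docker push us-central1-docker.pkg.dev/my-project/jsr/jsr-hello:v0.0.1
```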
---
# Side-Journey: Container Security
At the time of writing, our current image had 17 vulnerabilities, including two medium level CVEs.
```bash=
$ docker run -d --name db arminc/clair-db
$ docker run -p 6060:6060 --link db:postgres -d --name clair arminc/clair-local-scan
$ curl -L https://github.com/arminc/clair-scanner/releases/download/v12/clair-scanner_darwin_amd64 >clair-scanner
$ chmod +x clair-scanner
$ ./clair-scanner --ip docker.for.mac.localhost k14s/image@sha256:5d1b60d46c6503a0c44
2021/10/01 13:37:15 [INFO] ▶ Start clair-scanner
2021/10/01 13:37:20 [INFO] ▶ Server listening on port 9279
2021/10/01 13:37:20 [INFO] ▶ Analyzing f532767635e716995568102223bd6b2a9e87a5119d105387f38f0a8ddee7a035
2021/10/01 13:37:21 [INFO] ▶ Analyzing 65b2b33140c78dd43baf05b4e417b6991c102378ff9b6992dcfb5e61feb518a3
2021/10/01 13:37:22 [INFO] ▶ Analyzing 7f4dba77b882fec706c1643c7b1ecd36dd45833be6d897a513e2e43726242211
2021/10/01 13:37:24 [WARN] ▶ Image [k14s/image@sha256:5d1b60d46c6503a0c44abb5c0ccce93ac1c39fd1241671f79df985688a2859c7] contains 17 total vulnerabilities
2021/10/01 13:37:24 [ERRO] ▶ Image [k14s/image@sha256:5d1b60d46c6503a0c44abb5c0ccce93ac1c39fd1241671f79df985688a2859c7] contains 17 unapproved vulnerabilities
```
It's unlikely that they are _actual_ vulnerabilities in typical use, but why not have a cleaner image?
Installing fresh CA certificates:
:::spoiler
From https://manuals.gfi.com/en/kerio/connect/content/server-configuration/ssl-certificates/adding-trusted-root-certificates-to-the-server-1605.html:
1. Install the ca-certificates package: `yum install ca-certificates`
2. Enable the dynamic CA configuration feature: `update-ca-trust force-enable`
3. Add it as a new file to /etc/pki/ca-trust/source/anchors/: `cp foo.crt /etc/pki/ca-trust/source/anchors/`
4. Use command: `update-ca-trust extract`
:::
---
# Details
## Building Containers
Todo:
- [ ] Multi-stage Builds https://docs.docker.com/develop/develop-images/multistage-build/
- [ ] Container image formats (docker schema 1, docker schema 2, OCI Image spec)
- [ ] Distroless container images https://hackernoon.com/distroless-containers-hype-or-true-value-2rfl3wat
- [ ] or minimized images: https://github.com/grycap/minicon
- [ ] Scanning containers for vulnerabilities
- https://github.com/arminc/clair-scanner (client)
- https://github.com/arminc/clair-local-scan (server)
**References:**
- https://docs.docker.com/develop/develop-images/dockerfile_best-practices/
- https://sysdig.com/blog/image-scanning-best-practices/