# FCCI Tech: Kubernetes deployment of simple web services

:::info
* Video: https://youtu.be/WnkGkCoRGnk
* The example deployment is found here: https://github.com/AaltoSciComp/scicomp-docs/tree/master/_meta/search
:::

Talk meta:
- Open this document
- Help take collaborative notes
- You can write in questions or comments anytime (inline, right in the section where we are discussing)
- Big-picture questions can be written at the very bottom
- This talk needs your contributions.

Recording was done in a way that prevented audience speech from being recorded. This is the reason for some silences and also for repeating the questions.

## Roadmap

What's the easiest way to make some web service visible online? What if you need management, like some way of scaling, and don't want to do sysadmin? We'll see how kubernetes can handle some of these.

**Container glossary (part 1):**
- **image**: the packed filesystem of a container.
- **container**: a running copy of that filesystem + any runtime data.

## Step 1: Have your webservice

**This is not a main part of the talk.**

I assume you already have a simple webservice. As an example, we'll use the scicomp-docs-search service (a quick demo hack I did). Its purpose is a backend search database to go along with AaltoGPT.

- Source: https://github.com/AaltoSciComp/scicomp-docs/blob/master/_meta/search.py
- Test it: https://scicomp-docs-search.k8s-test.cs.aalto.fi/?q=pytorch%20on%20triton

How to run it:

```console
$ git clone https://github.com/AaltoSciComp/scicomp-docs.git
$ pip install -r requirements.txt   # in a venv...
$ pip install beautifulsoup4 markdownify
$ make dirhtml
$ python _meta/search.py serve
Creating database
You must `make dirhtml` first.
Done: creating database
Server running on ('', 8000)
```

Test:

```console
$ curl localhost:8000?q=pytorch%20on%20triton
[response]
```

We see it starts a web server on port 8000.
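The internals of search.py aren't the point of this talk, but for a self-contained toy, a service of roughly this shape can be sketched with just the Python standard library. This is an illustrative stand-in, not the real search.py (handler name, response format, and behavior are made up):

```python
# A hedged sketch of a tiny search-style web service, standing in for
# search.py. Everything here is illustrative, not the real code.
from http.server import HTTPServer, BaseHTTPRequestHandler
from urllib.parse import urlparse, parse_qs
import json

class SearchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Pull the ?q=... query parameter out of the request path
        query = parse_qs(urlparse(self.path).query).get("q", [""])[0]
        # A real service would consult its database here
        body = json.dumps({"query": query, "results": []}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Bind to all interfaces on port 8000, like the example service does
    HTTPServer(("", 8000), SearchHandler).serve_forever()
```

Run it and `curl localhost:8000?q=anything` returns a small JSON document; that's all the container and kubernetes layers below need to know about it.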
I assume you can do this however is appropriate for you (if not, ask for help and/or find one of the good online resources).

What we want next: we want this to appear online
- with SSL
- with some domain name
- reasonably securely
- with minimum effort

## Step 2: Containerize

**This is not a main part of the talk.**

The second step is to be able to run this in a container. Here's the Dockerfile for our example service:

- Dockerfile: https://github.com/AaltoSciComp/scicomp-docs/blob/master/_meta/search/Dockerfile
- Build script: https://github.com/AaltoSciComp/scicomp-docs/blob/master/_meta/search/build.sh

Notes:
- The container contains the whole OS and search.py: the Dockerfile is basically a script to set it up.
- The bottom defines what runs and how it interacts with the world:
  ```
  CMD ["python", "search.py", "serve", "--db=search/search.db", "--bind=0.0.0.0:8000"]
  EXPOSE 8000/tcp
  ```
- The `_meta/search/build.sh` script builds the image (and tags it as `harbor.cs.aalto.fi/aaltorse/scicomp-docs-search:latest` - note the name includes the location).
- Run the container and see that it works locally:
  ```console
  $ docker run --publish 8000:8000 -it --rm harbor.cs.aalto.fi/aaltorse/scicomp-docs-search:latest
  ```

With this, we can easily move the code and all dependencies around: you can test locally, and you can easily move to other computers.

Notes:
- What to do if it needs a lot of non-code data (model weights)? Let's discuss later.
- I've seen containers designed so that most config happens with environment variables.
- Another option is setting command line options and/or mounting a config file inside.

## Step 3: Share container on a registry

**Glossary:**
- **Registry**: a service which hosts container images.

We need to share the image to the kubernetes cluster where it will run.
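The image name itself already says where it will be shared: a full reference like the one above splits into a registry host, a repository, and a tag. This toy parser illustrates the idea (it is a sketch only; real Docker reference parsing has more rules, such as a default registry, sha256 digests, and registry ports):

```python
def parse_image_ref(ref):
    """Split an image reference into (registry, repository, tag).
    Illustrative only: real reference parsing has more rules
    (default registry, sha256 digests, registry ports, ...)."""
    # The tag is whatever follows the last ":" *after* the last "/"
    name, tag = ref, "latest"
    if ":" in ref.rsplit("/", 1)[-1]:
        name, _, tag = ref.rpartition(":")
    registry, _, repository = name.partition("/")
    return registry, repository, tag

print(parse_image_ref("harbor.cs.aalto.fi/aaltorse/scicomp-docs-search:latest"))
# -> ('harbor.cs.aalto.fi', 'aaltorse/scicomp-docs-search', 'latest')
```

This is why pushing, pulling, and kubernetes' own image fetching all work from the name alone.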
Since this is declarative, we don't directly push it there:
- We put it on a **registry**.
- We define the location from where it can be pulled.

Background on how docker works:
- The image is **tagged** as `harbor.cs.aalto.fi/aaltorse/scicomp-docs-search:latest`. Note how the location is part of the tag: this is the universal locator.
- `docker push NAME` pushes that to the registry.
- `docker pull NAME` pulls from that registry.
- `docker login URL` handles authentication.

Locally, we run:

```console
$ docker push harbor.cs.aalto.fi/aaltorse/scicomp-docs-search:latest
```

Kubernetes magically handles pulling the image when it's needed. It tries to be smart about things like reusing nodes that already have the image pulled. If the image isn't public, kubernetes will require authentication to pull it.

So what we have now: given the name (which is the address), any computer in the world can get this code and run it.

## Glossary (part 2):

- **kubernetes**: a container orchestration system (runs containers and provides a high-level management interface; it's declarative).
- **ingress**: stuff that handles incoming web requests.
- **pod**: the unit of scheduling. It can have multiple containers working together, but we can look at it as "the container that's running."
- **service**: the way we'll use it is like a pointer at a service.
- **YAML**: a human-readable config language.
- **kubectl**: the locally-running command line interface.
- **API server**: kubectl always communicates with the "kubernetes cluster" over a web API. You need to link the kubectl on your own computer to your API server.
- **namespace**: separates different uses. Can provide permissions, resource limits, etc.

## Step 4: Create the kubernetes deployment YAML

This is a *declarative* definition of our service: we can create, update, delete, recreate, etc. the service.
It defines the image, what services run, what ingresses there are from the internet, the domain names it's listening on, etc.

https://github.com/AaltoSciComp/scicomp-docs/blob/master/_meta/search/k8s.yaml

Let's go over the parts:
- Ingress
  - Defines how to accept incoming HTTP
  - Perhaps cluster-dependent?
  - Forwards to a specific service named `scicomp-docs-search`
- Service
  - Gives a name to a set of pods (containers) that can accept incoming connections under that name
  - Has a `selector` to decide where to send the traffic
- Deployment
  - Defines the actual service itself:
    - name
    - container images (and secrets to pull them)
    - ports
    - environment variables
    - security info, resource limits, etc.
  - Defines how to match the running service so it can be stopped and restarted.

What features do we see?
- An environment variable from a secret
- Automatic SSL (`cert-manager.io/cluster-issuer`)
- Automatic hostname via wildcard DNS: `scicomp-docs-search.k8s-test.cs.aalto.fi`

What else could we do?
- Multiple replicas
- Load-balancing

## Step 5: Do the deployment

Generically you **apply** the yaml: it says "take this yaml, update your server state to match this". The generic command would be:

```console
$ kubectl apply -f DEFINITIONS.yaml
```

I made a script `redeploy.sh` that will re-deploy as needed:
- https://github.com/AaltoSciComp/scicomp-docs/blob/master/_meta/search/redeploy.sh
- rebuild the image (`build.sh`)
- push the image (`docker push`)
- update the service definition (`kubectl apply -f`)
- roll out the new service (restart the containers) (`kubectl rollout restart DEPLOYMENT`)

How it works:
- `kubectl` is the command that controls everything.
- It always communicates with some k8s API server that does the actual work.
- This deploys to a namespace named `rse` (`-n rse` on every kubectl command).
- I've already defined the respective secrets that are needed (as seen in the yaml file).

Notes:
- Secrets: things like passwords/tokens have special support and can be kept out of the images.
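On the consuming side, a secret usually reaches the container as an environment variable or a mounted file, so the application code doesn't need to know about kubernetes at all. A hedged sketch (the variable name and path below are made up for illustration, not from the real deployment):

```python
import os
from pathlib import Path

def read_secret(env_name="SEARCH_API_TOKEN",
                file_path="/run/secrets/api-token"):
    """Fetch a secret from an environment variable, falling back to a
    mounted file. Both the variable name and the path here are
    illustrative, not from the real deployment."""
    value = os.environ.get(env_name)
    if value is None:
        p = Path(file_path)
        if p.is_file():
            value = p.read_text().strip()
    return value
```

Either delivery mechanism then maps onto the same few lines of application code.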
  I made the secrets manually.

What we get now: our whole service is declaratively defined and idempotently deployed. Run `redeploy.sh` and, no matter what the current state is, it'll be updated.

## Step 6: Managing the service

Different kubernetes commands let you examine things; the syntax is usually `kubectl VERB OBJECT`:

```console
$ kubectl -n rse get pods                       # list pods
$ kubectl -n rse describe pod PODNAME           # pod detail
$ kubectl -n rse delete pod PODNAME             # delete pod (might be re-created by deployment)
$ kubectl -n rse logs PODNAME [-f]              # logs of pod
$ kubectl -n rse logs deploy/DEPLOYNAME         # logs of a pod in a deployment
$ kubectl -n rse get deployments                # list deployments
$ kubectl -n rse describe deployment NAME
$ kubectl -n rse exec -it PODNAME COMMAND ...   # exec in container
```

- `kubectl apply -f` updates existing deployments (automatically).
- `kubectl -n rse create secret [OPTIONS]` creates secrets.
- ... and many more. Web search for what you need.

What we have now: a bunch of tools which can manage our deployments (from any computer...). Many of these will be familiar if you use docker, but there is much, much more.

## Summary: what else can kubernetes do?

- Scale
  - Add more replicas to a deployment: it automatically gets larger
- Resource requests and limits
  - Memory/CPU
  - Requests: how much use is expected
  - Limits: cgroup-enforced
- Move around
  - In theory, the k8s API is the same across different providers: CS, Azure, etc.
- Run locally
  - e.g. minikube, microk8s. I don't know much about these.
- Everything is *declarative*, so it's easy to re-create from scratch
- Shareable clusters (namespaces with different limits and access)
- Deploy from git

Limitations:
- Good to have someone who knows this well to consult. I have read a fair amount about kubernetes, and it's still hard to make a whole service by myself.
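Scripts like `redeploy.sh` chain these `kubectl VERB OBJECT` calls together with a shell; for more involved automation the same pattern is easy to wrap from Python. A sketch (assumes `kubectl` is installed and pointed at your cluster; `rse` is the namespace from this talk):

```python
import subprocess

def kubectl_cmd(*args, namespace="rse"):
    """Build the argument list for a kubectl call in a fixed namespace."""
    return ["kubectl", "-n", namespace, *args]

def kubectl(*args, namespace="rse"):
    """Run kubectl and return its stdout (raises on non-zero exit).
    Assumes kubectl is installed and configured for the right cluster."""
    result = subprocess.run(kubectl_cmd(*args, namespace=namespace),
                            capture_output=True, text=True, check=True)
    return result.stdout

# Examples (not run here):
# kubectl("get", "pods")
# kubectl("rollout", "restart", "deployment/scicomp-docs-search")
```

This kind of wrapper is how "deploy from git" pipelines typically end up invoking the cluster.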
### My proposals

- We develop containers for our services.
- We try to make our initial kubernetes definitions (based on templates).
- We work with a kubernetes expert to do the actual deployment.

## Q&A

(add your questions here, either in advance or live)

* Is CS kubernetes production-ready?
  * RD: it's used for production educational software.
* Is it OK to use for some somewhat heavy calculations?
  * RD: yes, plenty of resources (not HPC-cluster size, but multi-CPU size).
* Would (CS) kubernetes be a solution for providing runners for version.aalto.fi?
  * RD: I suppose so.
* What provides security / isolation in kubernetes?
  * RD: the container runtime.
* When is this the right model for a web service?
  * Well... if it matches what you hear above. Good for microservices which can work separately as small services.
  * Check out definitions of "cloud native": https://en.wikipedia.org/wiki/Cloud-native_computing
  * But I think that research software testing can have some applicability too.
* For AI models, loading weights can be the slowest part. How does this work with containers/kubernetes?
  * Could the model weights be located elsewhere? e.g. NFS or a separate container mounted in?
  * How much work would it be to put all the model data onto an NFS that's mounted on the k8s cluster, and who would need to do that?
  * (this remains an open discussion)
* How different are the various kubernetes implementations (local kubernetes, Azure, AWS, Google Cloud Platform, etc.)?
  * Remains an open discussion.
* Isn't it heavy to use kubernetes for just a small web page or even a small web app?
  * Running k8s "just because" is probably not good. But if we have a k8s cluster and it is easy to use, then there isn't much difference to running on a VM/own machine, plus automating updates should be easier on k8s.
  * RD: yes, I probably wouldn't go setting up kubernetes just for one thing. But if it's there and you have support, it's probably easier to use it than not.
* How would you suggest to put secrets into the container?
  Via ENV variables or mounted files?
  * A k8s secret (a type of cluster config) -> mounted as a file / ENV variable.
* Building your own image vs using generic images from public repositories?
  * The question was "should we build our own images, or should we take a generic image and mount in the code we need?"
  * This is something good to think about.
* Best practices for auto-updating containers?
  * It is easy to neglect this and just leave old container images running.
* When people complain about the complexity of kubernetes (there are enough jokes and memes on the internet), do they refer to the maintenance of the server (which we here assume is done by somebody else and is not our concern, if I understood correctly)? Or do they refer to the yaml part? Or API changes?
  * Learning what all the k8s yaml can do is not exactly easy.
  * Having someone to ask helps a lot.
  * Separating the cluster administration from container development probably helps (?)
  * RD: my opinion is that specialization of labor is definitely useful here.
* Is it easy to run a service in k8s but use an external db server for data? Does the (CS) cluster have a single IP address you can use to allow connections from on the DB server side?
  * Couldn't you simply set up authentication on the db and have your service auth against the db server?
    * Probably not if it's a server run by ITS. You need a list of allowed IPs as well.
  * Do they do authentication by IP address only? That's something that can easily be faked. While you can't get data that way, you can compromise integrity and alter data...
    * No, but if it's set up on a firewall there's not much you can do.
  * RD: I think there are different configurations here. There are plenty of options for networking. We should ask our local expert.