Talk meta:
The recording was done in a way that prevented the audience's speech from being captured. This explains some of the silences, and why questions are repeated before being answered.
What's the easiest way to make a web service visible online?
What if you need management, like some way of scaling, and don't want to do sysadmin work?
We'll see how kubernetes can handle some of these.
Container glossary (part 1):
This is not a main part of the talk
I assume you already have a simple webservice. We'll use, as an example, the scicomp-docs-search service. (It's a quick demo hack I did)
The purpose is a backend search database to go along with AaltoGPT.
How to run it:
$ git clone https://github.com/AaltoSciComp/scicomp-docs.git
$ pip install -r requirements.txt # in venv...
$ pip install beautifulsoup4 markdownify
$ make dirhtml
$ python _meta/search.py serve
Creating database
You must `make dirhtml` first.
Done: creating database
Server running on ('', 8000)
Test:
$ curl 'localhost:8000?q=pytorch%20on%20triton'
[response]
We see it starts a web server on port 8000. I assume you can do this however is appropriate for you (if not, ask for help and/or look for the other good online resources).
What we want next: we want this to appear online
This is not a main part of the talk
Second step is to be able to run this in a container. Here's the Dockerfile for our example service:
Notes:
CMD ["python", "search.py", "serve", "--db=search/search.db", "--bind=0.0.0.0:8000"]
EXPOSE 8000/tcp
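The talk only quotes two lines of the Dockerfile. A minimal sketch of what the whole file might look like; only the CMD and EXPOSE lines are from the real file, while the base image, file layout, and dependency handling are assumptions:

```dockerfile
# Hypothetical sketch: only CMD and EXPOSE are taken from the actual file.
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY search.py ./
COPY search/ ./search/   # assumed location of the prebuilt search database

CMD ["python", "search.py", "serve", "--db=search/search.db", "--bind=0.0.0.0:8000"]
EXPOSE 8000/tcp
```

The exec-form CMD (a JSON array) is the usual choice here: the process runs as PID 1 without a shell in between, so signals from docker/kubernetes reach it directly.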
_meta/search/build.sh
  This script builds the image and tags it as harbor.cs.aalto.fi/aaltorse/scicomp-docs-search:latest (note the name includes the location).
$ docker run --publish 8000:8000 -it --rm harbor.cs.aalto.fi/aaltorse/scicomp-docs-search:latest
With this, we can easily move the code and all dependencies around: you can test locally, you can easily move to other computers.
We need to share the image with the kubernetes cluster where it will run. Since this is declarative, we don't directly push it there:
Background: how docker works:
  harbor.cs.aalto.fi/aaltorse/scicomp-docs-search:latest
    Note how the location is part of the tag. This is the universal locator.
  docker push NAME
    pushes that image to the registry
  docker pull NAME
    pulls it from that registry
  docker login URL
    (or something like that) handles authentication
Locally, we run:
$ docker push harbor.cs.aalto.fi/aaltorse/scicomp-docs-search:latest
Kubernetes magically handles pulling the image when it's needed. It tries to be smart about things like scheduling onto nodes that already have the image pulled. If the image isn't public, kubernetes needs credentials (an image pull secret) to pull it.
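For a private image, the usual mechanism is to store the registry credentials in the cluster as a secret. A sketch of how that might look; the secret name `regcred` and the credential values are placeholders, while `create secret docker-registry` itself is a standard kubectl subcommand:

```shell
# Store registry credentials in the cluster (name and values are placeholders).
kubectl -n rse create secret docker-registry regcred \
    --docker-server=harbor.cs.aalto.fi \
    --docker-username=USERNAME \
    --docker-password=PASSWORD
# The deployment then references it in its pod spec:
#   imagePullSecrets:
#   - name: regcred
```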
So what we have now: given the name (which is the address), any computer in the world can get this code and run it.
This is a declarative definition of our service: we can create, update, delete, and recreate it at will. It defines the image, what services run, what ingresses there are from the internet, the domain names it listens on, and so on.
https://github.com/AaltoSciComp/scicomp-docs/blob/master/_meta/search/k8s.yaml
Let's go over the parts:
  scicomp-docs-search
    the name used for the objects
  selector
    to decide where to throw it
What features do we see?
  Automatic HTTPS certificates (the cert-manager.io/cluster-issuer annotation)
What else could we do:
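The actual k8s.yaml is linked above. As orientation, a minimal sketch of the usual Deployment + Service + Ingress trio is below; the replica count, ports, hostname, issuer name, and TLS secret name are all assumptions, not values from the real file:

```yaml
# Hypothetical sketch; see the linked k8s.yaml for the real definition.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scicomp-docs-search
spec:
  replicas: 1
  selector:
    matchLabels: {app: scicomp-docs-search}
  template:
    metadata:
      labels: {app: scicomp-docs-search}
    spec:
      containers:
      - name: search
        image: harbor.cs.aalto.fi/aaltorse/scicomp-docs-search:latest
        ports: [{containerPort: 8000}]
---
apiVersion: v1
kind: Service
metadata:
  name: scicomp-docs-search
spec:
  selector: {app: scicomp-docs-search}   # decides where to send traffic
  ports: [{port: 80, targetPort: 8000}]
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: scicomp-docs-search
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt   # issuer name is a guess
spec:
  rules:
  - host: example.cs.aalto.fi                     # hostname is a placeholder
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service: {name: scicomp-docs-search, port: {number: 80}}
  tls:
  - hosts: [example.cs.aalto.fi]
    secretName: scicomp-docs-search-tls
```

The Service's selector matches the pod labels from the Deployment; the Ingress routes a hostname to the Service, and the cert-manager annotation triggers automatic certificate issuance.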
Generically you apply the yaml: it says "take this yaml, update your server state to match this". The generic command would be:
$ kubectl apply -f DEFINITIONS.yaml
I made a script redeploy.sh that will re-deploy as needed:
  build.sh (builds the image)
  docker push (uploads it to the registry)
  kubectl apply -f (applies the yaml)
  kubectl rollout restart DEPLOYMENT (restarts the pods so they pick up the new image)
How it works:
  kubectl is the command that controls everything
  our namespace is rse (hence -n rse on every kubectl command)
Notes:
What we get now: our whole service is declaratively defined and idempotently deployed. Run redeploy.sh and no matter what the current state is, it'll be updated.
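The actual redeploy.sh is in the repo; a reconstruction of the four steps it was described as performing, with the image and deployment names assumed from earlier in the notes, might look like:

```shell
#!/bin/sh
# Hypothetical reconstruction of redeploy.sh from the steps described above.
set -e
IMAGE=harbor.cs.aalto.fi/aaltorse/scicomp-docs-search:latest

_meta/search/build.sh                                 # build and tag the image
docker push "$IMAGE"                                  # upload to the registry
kubectl -n rse apply -f _meta/search/k8s.yaml         # sync cluster state to the yaml
kubectl -n rse rollout restart \
    deployment/scicomp-docs-search                    # force pods to re-pull :latest
```

The explicit rollout restart is needed because the tag `:latest` doesn't change between builds, so `apply` alone sees no difference in the yaml and wouldn't restart anything.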
Different kubernetes commands let you examine things: the syntax is usually kubectl VERB OBJECT
$ kubectl -n rse get pods # list pods
$ kubectl -n rse describe pod PODNAME # pod detail
$ kubectl -n rse delete pod PODNAME # delete pod (might be re-created by deployment)
$ kubectl -n rse logs PODNAME [-f] # logs of pod
$ kubectl -n rse logs deploy/DEPLOYNAME # logs of pod in deployment
$ kubectl -n rse get deployment # list deployments
$ kubectl -n rse describe deployment NAME
$ kubectl -n rse exec -it PODNAME -- COMMAND ... # exec in container
kubectl apply -f
  updates existing deployments (automatically)
kubectl -n rse create secret [OPTIONS]
  creates secrets

What we have now: a bunch of tools which can manage our deployments (from any computer…). Many of these will be familiar if you use docker, but there is much, much more.
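As an illustration of the secrets workflow: the secret and key names below are placeholders, while `create secret generic` and the `secretKeyRef` mechanism are standard kubernetes.

```shell
# Create a secret from a literal value (placeholder name and value).
kubectl -n rse create secret generic search-api-key \
    --from-literal=API_KEY=changeme
# A container can then consume it as an environment variable:
#   env:
#   - name: API_KEY
#     valueFrom: {secretKeyRef: {name: search-api-key, key: API_KEY}}
# or mount it as a file using a `secret` volume, so the value never
# appears in the yaml committed to git.
```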
Limitations:
(add your questions here, either in advance or live)
Is CS kubernetes
Would (CS) kubernetes be a solution to providing runners for version.aalto.fi ?
What provides security / isolation in kubernetes?
When is this the right model for a web service?
For AI models, loading weights can be the slowest part. How does this work with containers/kubernetes?
How different are various kubernetes implementations (local kubernetes, Azure, AWS, Google cloud platform, etc)?
Isn't it heavy to use Kubernetes for just a small web page or even small web app?
How would you suggest to put secrets into the container? Via ENV variables or mounted files?
Building your own image vs using generic images from public repositories?
Best practices for auto-updating containers?
When people complain about the complexity of kubernetes (there are enough jokes and memes on the internet), do they mean the maintenance of the server itself (which we assume here is done by somebody else and isn't our concern, if I understood correctly), the yaml part, or API changes?
Is it easy to run a service in k8s, but use an external db server for data? So does the (CS) cluster have a single IP address you can use to allow connections from on the DB server side?