Try to execute:
An error occurs, and `juju status` shows that the charm download failed.
Check the unit log at `/var/log/juju/unit_kata.log` to see what went wrong:
The log shows that a permission-denied error occurred in the `lib/charms/layer/basic.py` file.
The problematic code is as follows:
The error appears to be caused by insufficient permissions when `check_output` executes the command. Since I don't know how to grant root permissions during `add-relation`, I decided to modify the code directly:
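As a hedged illustration of that kind of direct edit (the exact call site in `lib/charms/layer/basic.py` is not reproduced here), one workaround is to prefix the command passed to `check_output` with `sudo`. The demo below applies the pattern to a throwaway copy rather than the real charm file.

```shell
# Hypothetical sketch: prefix the command given to check_output with sudo.
# We edit a disposable demo file, not the actual charm source.
cat > /tmp/basic_demo.py <<'EOF'
output = check_output(['apt-get', 'update'])
EOF
sed -i "s/check_output(\[/check_output(['sudo', /" /tmp/basic_demo.py
cat /tmp/basic_demo.py
```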
After the modification, the subordinate kata charm installs normally.
Also, executing
sometimes hits the same error, which can be solved in the same way.
This is the screen after the deployment is complete:
The tutorial *Ensuring security and isolation in Charmed Kubernetes with Kata Containers* creates the pod with Kata Containers via a RuntimeClass, but actually following it produces an error. The error message is as follows:
However, the article *How to use Kata Containers and Containerd* in the official Kata Containers GitHub repository states that pods can be created using an untrusted-workload annotation.
The pod can be successfully created using the following yaml file.
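The YAML in question is not reproduced here, but a minimal pod of this shape, using the containerd CRI annotation key `io.kubernetes.cri.untrusted-workload`, would look like:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-untrusted
  annotations:
    io.kubernetes.cri.untrusted-workload: "true"
spec:
  containers:
  - name: nginx
    image: nginx
```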
Once it is running, you can use the `ctr` command to confirm whether it runs in a Kata container.
The RUNTIME column shows `io.containerd.kata.v2` for the pod we created, confirming it was started with Kata Containers.
The next question is why the RuntimeClass cannot start normally.
First, take a look at the containerd configuration file `/etc/containerd/config.toml`.
Here you can see containerd's RuntimeClass settings: there are two runtime classes in total, `runc` and `kata`. Nothing looks obviously wrong, and running the `ctr` command below directly also works fine, so the Kata configuration itself can be ruled out for now.
Finally, I decided to comment out the kata runtime's options block to see whether it was causing the error.
The toml file ends up like this:
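A hedged reconstruction of what the edited section might look like (field names follow the containerd v1 CRI plugin layout; the exact ConfigPath is an assumption):

```toml
[plugins.cri.containerd.runtimes.kata]
  runtime_type = "io.containerd.kata.v2"
  # Commented out to work around the RuntimeClass error described above:
  # [plugins.cri.containerd.runtimes.kata.options]
  #   ConfigPath = "/etc/kata-containers/configuration.toml"
```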
Then restart containerd:
Then try to create the pod again:
This time it worked.
Verify with the `ctr` command:
This confirms that the Kata container was successfully deployed using a RuntimeClass.
You can also execute:
to check whether a QEMU VM is running, verifying that Kata Containers really does wrap the container in a QEMU VM.
But according to GitHub:
From Containerd v1.2.4 and Kata v1.6.0, there is a new runtime option supported, which allows you to specify a specific Kata configuration file as follows:
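The snippet the upstream document gives for this option looks roughly like the following (treat the exact path as an assumption):

```toml
[plugins.cri.containerd.runtimes.kata.options]
  ConfigPath = "/opt/kata/share/defaults/kata-containers/configuration.toml"
```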
So the runtime options block should be supported here as well. I suspected the config path was wrong, but kata-runtime still starts normally when given that config path directly and restarted, so the path itself seems fine. I don't yet know what causes the problem.
Finally, I tried again with the runtime options block present but with no settings inside it. According to the GitHub description, if no config path is set the default path is used, but in the end the pod still could not be created. It seems the runtime options simply cannot be set here, though I have not yet found the specific reason.
Currently, there are two ways to create a pod with Kata Containers:
To create a pod with the annotation method, add an `untrusted_workload_runtime` entry under `[plugins.cri.containerd]` in `/etc/containerd/config.toml`, so that containerd knows which runtime to use when it encounters the untrusted-workload annotation.
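A sketch of that entry, assuming the v2 shim (`io.containerd.kata.v2`); older setups instead point `runtime_engine` at the `kata-runtime` binary:

```toml
[plugins.cri.containerd]
  [plugins.cri.containerd.untrusted_workload_runtime]
    runtime_type = "io.containerd.kata.v2"
```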
In the Kubernetes YAML file, just add the annotation under `metadata`, as shown below:
To create a pod with a RuntimeClass, you must first configure `/etc/containerd/config.toml`: each runtime must be defined under `[plugins.cri.containerd.runtimes]`, so that containerd knows which runtime each RuntimeClass maps to.
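For example, a minimal runtimes section mapping the two handlers (a sketch; option names follow the containerd v1 CRI plugin layout):

```toml
[plugins.cri.containerd.runtimes]
  [plugins.cri.containerd.runtimes.runc]
    runtime_type = "io.containerd.runc.v1"
  [plugins.cri.containerd.runtimes.kata]
    runtime_type = "io.containerd.kata.v2"
```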
In addition, on the Kubernetes side you must first create a RuntimeClass and then reference it in the pod's YAML file. The RuntimeClass YAML looks like this:
The important field here is `handler`: it must name the runtime configured in containerd. For example, `plugins.cri.containerd.runtimes.runc` is `runc`, and `plugins.cri.containerd.runtimes.kata` is `kata`.
After that, just add the `runtimeClassName` field to the pod spec.
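Put together, a RuntimeClass and a pod that uses it would look something like this (the apiVersion depends on your Kubernetes version; `node.k8s.io/v1beta1` is assumed here):

```yaml
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: kata
handler: kata        # must match plugins.cri.containerd.runtimes.kata
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-kata
spec:
  runtimeClassName: kata
  containers:
  - name: nginx
    image: nginx
```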
First of all, we can check that vsock is running and how it is configured by executing:
Here you can observe the `use_vsock` flag and the guest CID.
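On the host, the QEMU command line (visible via `ps aux | grep qemu`) carries the vsock device and guest CID. The snippet below just demonstrates pulling the CID out of such a line; the sample string stands in for real output.

```shell
# Sample fragment of a QEMU command line from a Kata VM (illustrative):
sample='-device vhost-vsock-pci,disable-modern=false,vhost=off,guest-cid=3'
# Extract the guest CID:
echo "$sample" | grep -o 'guest-cid=[0-9]*'
```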
Then we can use the `brctl` command to check whether the bridge on the host side is connected to the veth device.
Comparing before and after, when we create a pod with Kata Containers, the `cni0` bridge is connected to the veth device `vethfaabff91`, which matches the description on GitHub.
Next, we can list the network namespaces and then inspect the network architecture and tc rules.
This command lists the current network namespaces; when more new pods are opened, the entries higher up correspond to the newer pods.
Next, you can use `ip addr` to observe the interface settings and addresses.
You can see the tap device `tap0_kata` and the veth device `eth0`.
Finally, we can use `tc filter show dev [name] [ingress|egress]` to view the tc rules.
You can see that packets matching the tc rule are mirrored to the `eth0` end, and the reverse direction is filtered and forwarded according to the same kind of rule.
To precisely match the corresponding container namespace and veth name, you can use the `ctr` command to look up the relevant values.
First, describe the pod to find the corresponding containerd ID:
Here we determine that the containerd ID is `fd1cf62c3f55d0d3d004c8c44e7916b92b6afe3f84e06ff37efb9ea7638e5d05`, and with this ID we can find the corresponding VM ID.
After checking the container info, we determine that the VM ID is `27fc6c87feadff5e0a11b618f5d6a67032568c543fea9c70705028faea275e01`.
Then we can use the `ctr` command to look up the namespace data:
From this we know that the container's network namespace is `cni-42d0400b-c2c9-f365-0497-3f170429a55f`, and we can enter this namespace to see the devices inside:
The number after `eth0` is the index of this device's peer, here `1806`. Searching the default namespace for that index finds the matching veth device.
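The matching itself is mechanical: the `@ifN` suffix on an interface name is the ifindex of its peer in the other namespace. A self-contained sketch with sample `ip addr` lines (device names taken from this walkthrough):

```shell
# Inside the container namespace, eth0 reports its peer's index:
printf '3: eth0@if1806: <BROADCAST,MULTICAST,UP>\n' > /tmp/ns_side
# In the default namespace, one veth device has that index:
printf '1806: vethf95422f9@if3: <BROADCAST,MULTICAST,UP>\n' > /tmp/host_side
# Extract the peer index and find the matching host-side device:
peer=$(grep -o '@if[0-9]*' /tmp/ns_side | tr -dc '0-9')
grep "^${peer}:" /tmp/host_side
```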
From this we know that `vethf95422f9@if3` is the corresponding veth device, and we can also confirm its existence with `brctl show`:
In this way, we have successfully matched the container ID with its network settings.
We can use debug mode to connect to the VM and check its status; the setup steps can be found in [enable debug console](https://github.com/kata-containers/documentation/blob/master/Developer-Guide.md#enabling-debug-console-for-qemu).
After the setup is done, we can use the following steps to connect to the VM.
After entering, basic commands such as `ls` and `cat` are unavailable, so you can only use `tab` completion to explore the process status in `/proc`.
First of all, you can confirm that `kata-agent` really does run inside the VM: its process can be found via `/proc/66/comm`.
In addition, since `cat` cannot be used, files can only be printed line by line with the shell's `read` command in a loop.
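The read loop can be sketched as follows; it is demonstrated here on `/proc/self/status`, since any `/proc` file works the same way.

```shell
# Print a file line by line without cat, using the shell's read builtin:
while IFS= read -r line; do
  echo "$line"
done < /proc/self/status
```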
To confirm whether `nginx` is running, you can likewise search under `/proc`.
You can also see some of the VM's boot settings in `/proc/cmdline`.
Comparing with the settings shown by `ps aux | grep qemu` on the host side, you can see that many of them are passed into the VM, such as the `use_vsock` flag and some agent settings.
Finally, to check the network settings: every network interface is listed in `/proc/net/dev`, which we can use to verify that `eth0` exists in the VM.
There is indeed an `eth0` interface, which again matches the official description on GitHub.
For IP information, you can read `/proc/net/fib_trie`.
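Both files are ordinary reads; for a machine-independent demonstration, the loopback device serves the same purpose here as `eth0` does inside the VM.

```shell
# Interfaces appear one per row in /proc/net/dev:
grep 'lo:' /proc/net/dev
# Local addresses can be found in the fib_trie dump:
grep -c '127.0.0.1' /proc/net/fib_trie
```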
Here we use `kubectl` to get the pod's IP, and then look for the corresponding address there.
As expected, the address 10.1.39.160 is indeed present, and `/proc/net/dev` shows that only the `eth0` interface is handling packet traffic, so we can confirm that, as described on GitHub, data is indeed delivered into the VM via `eth0`.
In addition, route information can be found in `/proc/net/route`.
Finally, the MAC address can be read from `/sys/class/net/eth0/address`.
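Both reads can be demonstrated on any Linux host; the loopback device stands in for `eth0` here (its MAC is all zeros by definition).

```shell
# Kernel routing table (first line is the column header):
head -n 1 /proc/net/route
# MAC address of an interface, from sysfs:
cat /sys/class/net/lo/address
```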
According to *Using Nvidia GPU device with Kata Containers*, we can use an Nvidia GPU in a Kata container in two modes:
The comparison between the two is as follows:
Technology | Description | Behavior | Detail |
---|---|---|---|
Nvidia GPU pass-through mode | GPU passthrough | Physical GPU assigned to a single VM | Direct GPU assignment to VM without limitation |
Nvidia vGPU mode | GPU sharing | Physical GPU shared by multiple VMs | Mediated passthrough |
Nvidia GPUs Recommended for Virtualization:
Some hardware, such as the Nvidia Tesla P100 and K40m, needs a larger PCI BAR window.
If a larger BAR MMIO mapping is required, decoding above 4G must be enabled in the BIOS's PCI configuration.
Different vendors may label this setting differently:
The following flags must be set on the host kernel:
CONFIG_VFIO
CONFIG_VFIO_IOMMU_TYPE1
CONFIG_VFIO_MDEV
CONFIG_VFIO_MDEV_DEVICE
CONFIG_VFIO_PCI
Also set `intel_iommu=on` on the kernel command line at boot time.
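A quick way to verify the flags on a given host is to grep the running kernel's config (the config path varies by distro; `/boot/config-$(uname -r)` is assumed here):

```shell
# Report whether each required VFIO option is enabled in the kernel config:
cfg="/boot/config-$(uname -r)"
for opt in CONFIG_VFIO CONFIG_VFIO_IOMMU_TYPE1 CONFIG_VFIO_MDEV \
           CONFIG_VFIO_MDEV_DEVICE CONFIG_VFIO_PCI; do
  if grep -qs "^${opt}=[ym]" "$cfg"; then
    echo "${opt}: enabled"
  else
    echo "${opt}: not set (or config file missing)"
  fi
done
```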
We also need to set the related items in `/etc/kata-containers/configuration.toml`. For non-large-BAR devices, Kata 1.3.0 or later is recommended; for large-BAR devices, Kata 1.11.0 or later is required.
Hotplug for PCI devices by shpchp (Linux's SHPC PCI Hotplug driver):
Hotplug for PCIe devices by pciehp (Linux's PCIe Hotplug driver):
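The corresponding entries in `/etc/kata-containers/configuration.toml` look roughly like this (a hedged fragment; field availability depends on the Kata version):

```toml
machine_type = "q35"

# Hotplug for PCI devices (shpchp) / non-large-BAR devices:
hotplug_vfio_on_root_bus = true

# Hotplug for PCIe devices (pciehp) / large-BAR devices (Kata >= 1.11.0):
pcie_root_port = 2
```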
Next, we need to build a kernel that supports the GPU and replace the default guest kernel with it. Building a guest kernel is explained in detail in [Build Kata Containers Kernel](https://github.com/kata-containers/kata-containers/tree/main/tools/packaging/kernel).
The environment and packages needed to build the kernel are also listed in that article.
Here, first use `go` to fetch the scripts that build the kernel:
Then set the following kernel config option:
You also need to disable `CONFIG_DRM_NOUVEAU`:
Then you can build the kernel:
Then generate the `kernel-devel` rpm package:
Finally, set the kernel path in `/etc/kata-containers/configuration.toml` and you are done.
First find the Bus-Device-Function (BDF) of the GPU on the host:
Then find the IOMMU group of the GPU:
Check the IOMMU number under `/dev/vfio`:
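The lookups themselves use standard sysfs paths. A defensive sketch (the BDF `0000:41:00.0` is a placeholder; no GPU is assumed on the machine running this):

```shell
# Resolve a PCI device's IOMMU group via sysfs; fall back gracefully if
# the placeholder device is not present on this host.
BDF="0000:41:00.0"
if [ -e "/sys/bus/pci/devices/${BDF}/iommu_group" ]; then
  readlink -e "/sys/bus/pci/devices/${BDF}/iommu_group"
  ls /dev/vfio
else
  echo "device ${BDF} not present on this host"
fi
```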
Then you can create a container to use the GPU:
The example here uses docker to start the container directly; with kubernetes or containerd it should be similar.
We can enter the container to confirm that the GPU device appears in the PCI device list:
You can also confirm the size of the PCI BARs in the container:
Nvidia vGPU is a licensed product on all supported GPU boards. A software license is required to enable all vGPU features within the guest VM.
The point of vGPU mode is that multiple VMs can use the host's GPU device at the same time; the driver installation is performed on the host, while a license is required to use these features in the guest VM.
First download the official driver from Nvidia, for example `NVIDIA-Linux-x86_64-418.87.01.run`.
Then fetch the `kernel-devel` rpm file that was built earlier.
Then you can follow the official steps to extract, compile, and install the Nvidia driver:
or
View installer logs:
Then load the Nvidia driver module:
Finally, you can view the status:
Once installed, different VMs can use this GPU at the same time.