FreeBSD Container Engine

Introduction

xc is a work-in-progress container engine for FreeBSD that can run both Linux[1] and FreeBSD containers. It can use OCI-compatible image registries[2] for image distribution, such as DockerHub or Azure Container Registry.

Some highlights of features unique to xc include:

  • pre-instantiation sanity checks, including detection of missing environment variables on supported images
  • better DTrace support: by default xc exposes /dev/dtrace/helper to the containers, which allows applications to register USDT probes

xc targets FreeBSD 14; however, most features still work on older versions, except VNET containers, which depend on this patch and this patch, both of which resulted from the development of this project.

Although xc utilizes OCI registries for image distribution, it uses a different image format, which may be subject to unannounced breaking changes until the first stable version is released.

Requirements

Building

You will need Rust, Cargo, and cmake to build this project; the easiest way is to either use rustup or install them from pkg. With cargo installed, you can build the project with

cargo build

Note: If you want to push images, it is much better to build the project with the release configuration (cargo build --release), due to much better sha2 performance.

Running

Supported CPU architecture

xc should support every architecture FreeBSD supports. xc is mostly developed on an aarch64 machine and quite a bit on amd64.

File system

xc currently supports only ZFS. There are plans to make it work on non-ZFS systems, but that depends on the availability of overlayfs in the base system.
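
You can check which ZFS pools are available on your system with:

zpool list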

Networking

xc relies on pf (ipfw support is planned but not yet developed) for port exposure (via rdr).

You may need to add NAT-related rules to your pf configuration if you wish to allow internet access for the containers.
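
For example, a minimal NAT rule might look like the following (a sketch assuming igb0 is your external interface and 192.168.17.0/24 is the subnet you assign to containers; the Quick start below shows a fuller /etc/pf.conf that uses xc's pf tables instead):

# /etc/pf.conf: NAT container traffic out through the external interface
nat on igb0 from 192.168.17.0/24 to any -> (igb0)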

Quick start

Building

# clone the project
git clone https://github.com/michael-yuji/xc.git
# build the project
cd xc
cargo build --release

Installing

# The ocitar utility must be in $PATH; any directory in $PATH works, but we pick /usr/local/bin here
cp target/release/ocitar /usr/local/bin
# xcd, the daemon, must run as root
cp target/release/xcd /usr/local/sbin
# copy xc, the client utility, to somewhere in $PATH; it can be run by a normal user as well
cp target/release/xc /usr/local/bin

Configuration

Create ZFS datasets for hosting images and the rootfs of containers, assuming zroot is the name of the ZFS pool:
zfs create -p -o atime=off zroot/xc/datasets
zfs create -p -o atime=off zroot/xc/run
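
You can verify that the datasets were created with:
zfs list -r zroot/xc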

Create a JSON configuration file at /usr/local/etc/xc.conf:

{
  "ext_ifs": [
    "igb0"
  ],
  "image_dataset": "zroot/xc/datasets",
  "container_dataset": "zroot/xc/run",
  "layers_dir": "/var/cache",
  "devfs_id_offset": 1000,
  "image_database_store": "/var/db/xc.image.sqlite",
  "database_store": "/var/db/xc.sqlite",
  "socket_path": "/var/run/xc.sock",
  "networks": {},
  "registries": "/var/db/xc.registries.json"
}
The configuration keys are as follows:

  • ext_ifs: The external network interfaces on which the port forwarding rules are applied by default.
  • image_dataset: The dataset where the rootfs of container images will be stored. This dataset must exist.
  • container_dataset: The dataset used to store the root datasets of running containers. This dataset must exist.
  • layers_dir: The directory where image file system layers will be stored.
  • devfs_id_offset: xc takes care of devfs ruleset generation; this value tells xc the offset to use when generating ruleset ids.
  • image_database_store: The SQLite database in which xc stores image manifests. This file will be created automatically if it does not exist.
  • database_store: The SQLite database in which xc stores address allocations and network definitions. This file will be created automatically if it does not exist.
  • socket_path: The UNIX socket the daemon listens on and accepts connections from.
  • networks: Mapping between host network interfaces and xc networks; leave it empty for now, as it can be managed via the CLI.
  • registries: A JSON file, which should be kept secure, that stores credentials for different container registries. This file will be created automatically if it does not exist.

Run

Now you are ready to run xc. From this point on through the rest of this section, we assume you are running as root, for the sake of keeping things a bit simpler. Running containers as non-privileged users is supported by xc, but we are not going to get into that in the quick start guide.

In a terminal, start the daemon in foreground:
# xcd

By default, DockerHub is set as the default registry.

Run a pre-built FreeBSD image

The following command pulls this image from DockerHub:
# xc pull freebsdxc/freebsd:13.2

Now you can run the image
# xc run freebsdxc/freebsd:13.2 /bin/sh

By default, xc containers do not attach to any network; see the Networking section for more information.

The image in this example, freebsdxc/freebsd:13.2, runs /etc/rc on a stock FreeBSD base, which means sendmail is enabled. If you attach an unusable network to it, it might get stuck at initialization for a while, so play around with it without attaching a network first, and don't panic if it seems stuck (check the xcd log!).

Run a Linux image

First, load the Linux kernel module and set the 64-bit ELF fallback brand to Linux:
# kldload linux64
# sysctl kern.elf64.fallback_brand=3
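
To make these settings persist across reboots, you can use the standard FreeBSD configuration files (a sketch, not specific to xc):

# /etc/rc.conf: load the Linux 64-bit kernel module at boot
kld_list="linux64"

# /etc/sysctl.conf: treat unbranded 64-bit ELF binaries as Linux binaries
kern.elf64.fallback_brand=3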

The following command pulls this image:
# xc pull library/mariadb:10.9 mariadb:10.9

Now you can run the image
# xc run library/mariadb:10.9 -e MARIADB_ROOT_PASSWORD=password

By default, xc containers do not attach to any network.

Running an image with a network

We are now going to create a managed network. Additionally, we are going to make the containers able to access the internet.

We want xc to automatically assign addresses from the range 192.168.17.0/24 to the containers.

You can pick any range you want; we just pick 192.168.17.0/24 as an example.

Create an example network
  1. Create the interface we are going to use on the host. We will call it xc0

# ifconfig bridge create inet 192.168.17.254/24 name xc0

Here we create a bridge interface named xc0 with the IP address 192.168.17.254 and subnet mask 255.255.255.0 (because of the /24).
We use a bridge interface because it allows us to serve both VNET and non-VNET Jails.

  2. Create the xc network; let's name it example

# xc network create --alias xc0 --bridge xc0 --default-router 192.168.17.254 example 192.168.17.0/24

Essentially it means: when we attach a container to this example network, find an IP in the 192.168.17.0/24 range that other containers attached to the same network are not using. xc also adds the allocated address to the pf table xc:network:example. (Similarly, if a network is called foo, addresses allocated from its pool will be added to xc:network:foo.)

If the container is non-VNET, the runtime adds the address as an alias on xc0 (because of --alias xc0), as if we ran ifconfig xc0 inet 192.168.17.x/24 alias.

If the container is VNET, the runtime creates an epair interface pair (epairXa, epairXb), moves epairXb into the container, assigns it the allocated 192.168.17.x/24 address, and adds epairXa to xc0 (because of --bridge xc0).
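
Once a container has been attached, you can inspect the addresses xc has placed into the xc:network:example table (assuming pf is enabled on the host):

# pfctl -t xc:network:example -T show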

  3. Let's say we want our containers to access the internet via NAT, so we need to configure the pf firewall. Assume the network interface connected to our default gateway is igb0.

A minimal /etc/pf.conf is going to look like:

ext_if="igb0"

# This rule creates a NAT to the $ext_if when the source address
# is an address in the 'xc:network:example' table
nat on $ext_if from <xc:network:example> to any -> ($ext_if)

# In case we need to perform port redirection (-p rules), add our
# rdr anchor here
rdr-anchor xc-rdr
  4. Start the firewall before starting any container that requires internet access
    # service pf start
  5. Now we can run a container with the extra --network <name> flag
    # xc run --network example freebsdxc/freebsd:13.2 --name test
  6. Type fetch -o- https://google.com to verify internet access is working. Don't kill this container yet, as we are going to use it to test our VNET container in a later step.

    Hint: in the test container console, you can type <Ctrl-p>-q (Control-P followed by q) to detach from the console, and run xc attach test to reattach it.

  7. Now let's try out a VNET container.
    # xc run --vnet --network example freebsdxc/freebsd:13.2 --name test2 /bin/sh
  8. Verify everything is working
    8.1. Test that we can ping the internet
    # ping 1.1.1.1
    8.2. Test that we can ping the other container
    # ping <address of test>
    8.3. Test that DNS is working
    # ping google.com

If you wonder why DNS magically works in these example containers: by default xc copies /etc/resolv.conf from the host into the containers. You can override this behaviour by providing DNS nameservers with one or more --dns <dns ip> arguments.
For example, --dns 8.8.8.8 --dns 8.8.4.4 generates a resolv.conf that looks like

nameserver 8.8.8.8
nameserver 8.8.4.4

Usages

Show running containers

xc ps

Pull an image from a remote registry and name it foo:bar

xc pull example.io/my-image:bar foo:bar

Push a local image foo:bar to a registry example.io as foo1:bar1

xc push foo:bar example.io/foo1:bar1

Kill a container named foo

xc kill foo

Run a container with image freebsdxc/freebsd:13.2 and name the container "example"

xc run freebsdxc/freebsd:13.2 --name example /bin/sh

Run a container and add the IP address 192.168.8.8 on igb0 to it

xc run freebsdxc/freebsd:13.2 --ip 'igb0|192.168.8.8' /bin/sh

Run a container and add the IP addresses 192.168.8.7 and 192.168.8.8 on igb0 to it

xc run freebsdxc/freebsd:13.2 --ip 'igb0|192.168.8.7/24,192.168.8.8/24' /bin/sh

Run a VNET container and move igb0 to the Jail, with the IP addresses 192.168.8.7 and 192.168.8.8

xc run freebsdxc/freebsd:13.2 --vnet --ip 'igb0|192.168.8.7/24,192.168.8.8/24' /bin/sh

Run a container using the example network

xc run --network example freebsdxc/freebsd:13.2 /bin/sh

Link a container named foo. The command blocks until the container is killed; killing the command kills the container as well.

xc link foo

Pull Images and OCI Registry

You can pull images from public registries using the pull command:

xc pull <server>/<repo>:<tag>

For example, the command xc pull index.docker.io/freebsdxc/freebsd:13.2 pulls the image from the repo freebsdxc/freebsd with the tag 13.2, and the image will be accessible locally as freebsdxc/freebsd:13.2.

By default, xc uses DockerHub as the registry; that means if the server component is not given, xc will try to pull from DockerHub instead. For example, xc pull freebsdxc/freebsd:13.2 is treated like xc pull index.docker.io/freebsdxc/freebsd:13.2.

If you are trying to pull an "official image" from DockerHub, remember to add the library/ prefix to the repo. For example, the official mariadb image can be pulled with xc pull library/mariadb:10.9 or xc pull index.docker.io/library/mariadb:10.9.

If your registry requires credentials, you can use xc login --username <username> --password <password> <server> to add credentials for a registry.
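
For example, with hypothetical credentials:

xc login --username myuser --password my_access_token index.docker.io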

If you prefer to deal with the registries.json file directly, here's an example of the registry file (/var/db/xc.registries.json in the example configuration shown above):

{
    "default": "index.docker.io",
    "registries": {
        "index.docker.io": {
            "base_url": "https://index.docker.io",
        }
    }
}

If you have credentials for some of the registries:

{
    "default": "index.docker.io",
    "registries": {
        "index.docker.io": {
            "base_url": "https://index.docker.io",
            "basic_auth": {
                "username": "my_docker_hub_username",
                "password": "my_docker_hub_access_token"
            }
        },
        "my_azure_cr.azurecr.io": {
            "base_url": "https://my_azure_cr.azurecr.io",
            "basic_auth": {
                "username": "my_username",
                "password": "my_token"
            }
        }
    }
}

Networking

There are many ways to configure networking for xc containers. You can assign IP addresses to containers just like with normal Jails, but you can also let xc handle address allocation for you via network objects; in fact, you can even mix both!

Managed address allocation

To have xc allocate addresses to containers, you first need to create a network object. For example, the following command creates a network named example with an address space of 172.17.0.0/24. If a container is attached to a network, xc allocates an address within the address space and assigns it to the container. It is also possible to request an explicit address from an xc network; see the Request explicit address sub-section for more.

xc network create --alias igb0 --bridge bridge0 example 172.17.0.0/24

The alias interface is the interface that will be used to create IP aliases for non-VNET containers. The bridge interface is the interface that the container's interface will be bridged to (to be exact, the epairXa end of the epair pair is added to the bridge, while epairXb is the interface moved into the container).

xc does not guarantee connectivity between containers, even on the same network; all xc does is create aliases and bridge interfaces. It is the responsibility of the administrator to oversee the network topology. This also means xc tries hard to stay out of your way in terms of network engineering.

To run a container attached to a network, use the --network <network> argument.

For example, xc run --network example freebsdxc/freebsd:13.2 creates a container attached to the network named example.

Request explicit address

You can request an explicit address from the address pool of a network.

For example, xc run --network 'example|172.17.0.200' freebsdxc/freebsd:13.2.

In the case where the address is not available, you'll get an error like this:

Err(
    ErrResponse {
        errno: 2,
        value: Object {
            "error": String("address 172.17.0.200 already consumed"),
        },
    },
)

Unmanaged address allocation

Use --ip '<iface>|x.x.x.x/m' to assign an IP address to the container manually, where <iface> is the name of the network interface and x.x.x.x/m is the CIDR of the IP address.

For example, if you want to allocate 192.168.13.31/24 to the container, on interface igb0, you should add --ip 'igb0|192.168.13.31/24' to the run command.

If the container uses VNET, the interface will be moved into the container.

Tip: you can allocate multiple addresses on the interface; separate each CIDR block with a comma. For example, the argument --ip 'igb0|192.168.13.31/24,192.168.8.8/24' allocates both 192.168.13.31/24 and 192.168.8.8/24 to the container.

Mixing managed and unmanaged address allocation

You can mix managed and unmanaged address allocation at any time, for example:

xc run --vnet --network example --ip 'igb0|192.168.1.111/24,[dead:beef::1]/24' freebsdxc/freebsd:13.2

Tracing Containers

You can trace system calls and much more in your container using the xc trace <container> command. By default, without any extra arguments, xc trace <container> launches dwatch(1) with -F syscall and traces the entry/exit of all system calls.

Containers created by xc have /dev/dtrace/helper exposed by default; this allows applications running in the container to register their own USDT probes, which can then be traced from the host system.

xc trace is merely a wrapper around dwatch; xc trace <container> -- <args>... is translated to dwatch -j <jid> <args>.... Check out the man page of dwatch(1) for amazing things you can do with it.
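
For example, to trace the container named test from the quick start (a sketch; everything after -- is handed to dwatch verbatim, so consult dwatch(1) for what it accepts):

xc trace test
xc trace test -- -F syscall

The first command uses the default -F syscall behaviour; the second passes -F syscall to dwatch explicitly and is equivalent to dwatch -j <jid> -F syscall for the container's jail.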


  1. As long as supported by the FreeBSD Linuxulator ↩︎

  2. Tested: DockerHub, Microsoft Azure CR ↩︎
