OpenShift Installer for CodeReady Containers
=============================================
Introduction
------------
We will first investigate and document the process on `libvirt`, to find a way to eliminate the use of Terraform.
Process flow
------------
The process currently looks roughly as follows. It still has some gaps, which we need to raise with the installer team.
- The installer can generate the Ignition (`.ign`) config files. They do not contain any platform-specific details (libvirt, AWS, etc.), so any infrastructure should be able to consume them.
- During boot of the base system (RHCOS), Ignition runs once as a pre-boot step and sets up the disks. This can take quite long (2 minutes is not unheard of).
- Do we really require a local DNS for communication between the different components? Is this because we have no control over the IP addresses of the created VMs?
- Is the `192.168.126.x` range reserved for component IP assignment? According to the libvirt doc, this cannot be modified.
- https://github.com/openshift/installer/blob/master/pkg/asset/installconfig/platform.go#L48
- With the current libvirt provider, who handles worker node creation: the master or Terraform?
- There seems to be a callback reporting service on the bootstrap node. How does this tie in to the overall process?
- The installer creates several resources inside the default storage pool:
```bash
$ sudo virsh vol-list default
Name Path
------------------------------------------------------------------------------
bootstrap /var/lib/libvirt/images/bootstrap
bootstrap.ign /var/lib/libvirt/images/bootstrap.ign
coreos_base /var/lib/libvirt/images/coreos_base
master.ign /var/lib/libvirt/images/master.ign
master0 /var/lib/libvirt/images/master0
worker.ign /var/lib/libvirt/images/worker.ign
```
Files ending in `.ign` are the Ignition configuration files that are provided to the virtual machines. The `qemu-kvm` process is started with `-fw_cfg name=opt/com.coreos/config,file=/var/lib/libvirt/images/bootstrap.ign` to refer to this file.
**Note:** the `-fw_cfg` option is not available in the stock qemu-kvm for CentOS. It needs to be updated with `yum install -y centos-release-qemu-ev`.
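As a rough illustration of how the Ignition file is handed to the guest, the sketch below assembles the `-fw_cfg` argument quoted above for a given Ignition path. The helper name `fw_cfg_args` is hypothetical; only the `opt/com.coreos/config` key and the `name=...,file=...` form are taken from the command line above.

```python
from pathlib import PurePosixPath

# Key under which the qemu-kvm command line above exposes the Ignition config.
FW_CFG_KEY = "opt/com.coreos/config"


def fw_cfg_args(ignition_path):
    """Return the qemu arguments that expose an Ignition file via fw_cfg.

    Hypothetical helper for illustration; it only formats the arguments
    and does not start qemu.
    """
    path = PurePosixPath(ignition_path)
    if not path.is_absolute():
        raise ValueError(f"expected an absolute path, got {ignition_path!r}")
    if "," in ignition_path:
        # qemu separates option fields with ',', so a comma in the path
        # would need escaping; this sketch simply rejects it.
        raise ValueError("commas in the path are not handled by this sketch")
    return ["-fw_cfg", f"name={FW_CFG_KEY},file={path}"]


print(" ".join(fw_cfg_args("/var/lib/libvirt/images/bootstrap.ign")))
```

The same helper could be pointed at `master.ign` or `worker.ign` from the volume list above.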
What is the `coreos_base` volume used for? We can see that it is mounted on all the nodes (master/worker/bootstrap).
```bash
[core@test1-master-0 ~]$ sudo df -h
Filesystem Size Used Avail Use% Mounted on
/dev/vda2 16G 6.0G 9.8G 38% /
devtmpfs 1.5G 0 1.5G 0% /dev
tmpfs 1.5G 0 1.5G 0% /dev/shm
tmpfs 1.5G 57M 1.5G 4% /run
tmpfs 1.5G 0 1.5G 0% /sys/fs/cgroup
tmpfs 1.5G 8.0K 1.5G 1% /tmp
/dev/vda1 297M 90M 207M 31% /boot ==> this is coming from the coreos_base.
```
**Comment (GB):** the base is likely the immutable part of the OS, while `bootstrap` and `master0` refer to the actual persistent disks attached to the VMs.
Resource definition created by Terraform
-----------------------------------------
The network bridge created by Terraform does not use a DHCP range; instead, it assigns a fixed IP address to each generated MAC address.
The `etcd-server-ssl` service is initially handled by the bootstrap node and is then delegated to the master.
```bash
$ virsh -c qemu+tcp://192.168.100.1/system net-dumpxml test1
<network>
<name>test1</name>
<uuid>268ee1c8-86ae-41ce-bf36-fe24fc2563fc</uuid>
<forward mode='nat'>
<nat>
<port start='1024' end='65535'/>
</nat>
</forward>
<bridge name='tt0' stp='on' delay='0'/>
<mac address='52:54:00:da:6d:ae'/>
<domain name='tt.testing' localOnly='yes'/>
<dns>
<srv service='etcd-server-ssl' protocol='tcp' domain='test1.tt.testing' target='test1-etcd-0.tt.testing' port='2380' weight='10'/>
<host ip='192.168.126.11'>
<hostname>test1-api</hostname>
<hostname>test1-etcd-0</hostname>
</host>
<host ip='192.168.126.50'>
<hostname>test1</hostname>
</host>
</dns>
<ip family='ipv4' address='192.168.126.1' prefix='24'>
<dhcp>
<host mac='46:19:e0:2f:8e:97' name='test1-master-0' ip='192.168.126.11'/>
<host mac='76:7c:1e:35:9d:d5' name='test1-bootstrap' ip='192.168.126.10'/>
<host mac='aa:11:07:14:d7:a7' name='test1-worker-0-8sk59' ip='192.168.126.51'/>
</dhcp>
</ip>
</network>
```
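To check the "no DHCP range, only fixed reservations" observation without Terraform, the network definition can be inspected programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the XML is a trimmed copy of the dump above, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the `net-dumpxml` output above.
NETWORK_XML = """
<network>
  <name>test1</name>
  <dns>
    <host ip='192.168.126.11'>
      <hostname>test1-api</hostname>
      <hostname>test1-etcd-0</hostname>
    </host>
    <host ip='192.168.126.50'>
      <hostname>test1</hostname>
    </host>
  </dns>
  <ip family='ipv4' address='192.168.126.1' prefix='24'>
    <dhcp>
      <host mac='46:19:e0:2f:8e:97' name='test1-master-0' ip='192.168.126.11'/>
      <host mac='76:7c:1e:35:9d:d5' name='test1-bootstrap' ip='192.168.126.10'/>
      <host mac='aa:11:07:14:d7:a7' name='test1-worker-0-8sk59' ip='192.168.126.51'/>
    </dhcp>
  </ip>
</network>
"""


def has_dhcp_range(xml_text):
    """Return True if the network defines a dynamic <range> inside <dhcp>."""
    root = ET.fromstring(xml_text.strip())
    return root.find("./ip/dhcp/range") is not None


def dhcp_reservations(xml_text):
    """Map each reserved MAC address to its (name, ip) pair."""
    root = ET.fromstring(xml_text.strip())
    return {
        host.get("mac"): (host.get("name"), host.get("ip"))
        for host in root.findall("./ip/dhcp/host")
    }


def dns_hosts(xml_text):
    """Map each hostname in the <dns> section to its IP address."""
    root = ET.fromstring(xml_text.strip())
    return {
        name.text: host.get("ip")
        for host in root.findall("./dns/host")
        for name in host.findall("hostname")
    }


print("dynamic range defined:", has_dhcp_range(NETWORK_XML))
for mac, (name, ip) in dhcp_reservations(NETWORK_XML).items():
    print(mac, name, ip)
```

In a real setup the XML text would come from `virsh net-dumpxml` rather than an embedded string.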
Some of the resources created during the installation process:
```bash
DEBUG Running &exec.Cmd{Path:"/home/prkumar/work/github/practice/go/src/github.com/openshift/installer/bin/terraform", Args:[]string{"/home/prkumar/work/github/practice/go/src/github.com/openshift/installer/bin/terraform", "apply", "-auto-approve", "-input=false", "-no-color", "-state=terraform.tfstate"}}
data.libvirt_network_dns_host_template.bootstrap: Refreshing state...
data.libvirt_network_dns_host_template.workers: Refreshing state...
data.libvirt_network_dns_srv_template.etcd_cluster: Refreshing state...
data.libvirt_network_dns_host_template.masters: Refreshing state...
data.libvirt_network_dns_host_template.etcds: Refreshing state...
module.volume.libvirt_volume.coreos_base: Creating...
format: "" => "<computed>"
name: "" => "test1-base"
pool: "" => "default"
size: "" => "<computed>"
source: "" => "file:///home/prkumar/.cache/openshift-install/libvirt/image/8d2cb1f8b4e6e4cf754d05f5c742e8ae"
libvirt_ignition.worker: Creating...
content: "" => "{\"ignition\":"
name: "" => "test1-worker.ign"
pool: "" => "default"
libvirt_ignition.master: Creating...
content: "" => "{\"ignition\":"
name: "" => "test1-master.ign"
pool: "" => "default"
libvirt_network.tectonic_net: Creating...
addresses.#: "" => "1"
addresses.0: "" => "192.168.126.0/24"
autostart: "" => "true"
bridge: "" => "tt0"
dns.#: "" => "1"
dns.0.hosts.#: "" => "4"
dns.0.hosts.0.hostname: "" => "test1-api"
dns.0.hosts.0.ip: "" => "192.168.126.10"
dns.0.hosts.1.hostname: "" => "test1-api"
dns.0.hosts.1.ip: "" => "192.168.126.11"
dns.0.hosts.2.hostname: "" => "test1-etcd-0"
dns.0.hosts.2.ip: "" => "192.168.126.11"
dns.0.hosts.3.hostname: "" => "test1"
dns.0.hosts.3.ip: "" => "192.168.126.50"
dns.0.local_only: "" => "true"
dns.0.srvs.#: "" => "1"
dns.0.srvs.0.domain: "" => "test1.tt.testing"
dns.0.srvs.0.port: "" => "2380"
dns.0.srvs.0.protocol: "" => "tcp"
dns.0.srvs.0.service: "" => "etcd-server-ssl"
dns.0.srvs.0.target: "" => "test1-etcd-0.tt.testing"
dns.0.srvs.0.weight: "" => "10"
domain: "" => "tt.testing"
mode: "" => "nat"
name: "" => "test1"
module.bootstrap.libvirt_ignition.bootstrap: Creating...
content: "" => "{\"ignition\":".
name: "" => "test1-bootstrap.ign"
pool: "" => "default"
libvirt_network.tectonic_net: Creation complete after 6s (ID: 268ee1c8-86ae-41ce-bf36-fe24fc2563fc)
module.volume.libvirt_volume.coreos_base: Creation complete after 9s (ID: /var/lib/libvirt/images/test1-base)
libvirt_volume.master: Creating...
base_volume_id: "" => "/var/lib/libvirt/images/test1-base"
format: "" => "<computed>"
name: "" => "test1-master-0"
pool: "" => "default"
size: "" => "<computed>"
module.bootstrap.libvirt_volume.bootstrap: Creating...
base_volume_id: "" => "/var/lib/libvirt/images/test1-base"
format: "" => "<computed>"
name: "" => "test1-bootstrap"
pool: "" => "default"
size: "" => "<computed>"
libvirt_ignition.worker: Creation complete after 9s (ID: /var/lib/libvirt/images/test1-worker.ign;5bec0fb9-e4d3-1536-eb0f-a0d1fe0fc7a3)
libvirt_ignition.master: Creation complete after 9s (ID: /var/lib/libvirt/images/test1-master.ign;5bec0fb9-254c-693b-9847-d6ad5cb21016)
module.bootstrap.libvirt_ignition.bootstrap: Creation complete after 8s (ID: /var/lib/libvirt/images/test1-bootstrap.ign;5bec0fb9-8bd0-b619-76b3-32781e14f82b)
module.bootstrap.libvirt_volume.bootstrap: Creation complete after 0s (ID: /var/lib/libvirt/images/test1-bootstrap)
module.bootstrap.libvirt_domain.bootstrap: Creating...
arch: "" => "<computed>"
console.#: "" => "1"
console.0.target_port: "" => "0"
console.0.type: "" => "pty"
coreos_ignition: "" => "/var/lib/libvirt/images/test1-bootstrap.ign;5bec0fb9-8bd0-b619-76b3-32781e14f82b"
disk.#: "" => "1"
disk.0.scsi: "" => "false"
disk.0.volume_id: "" => "/var/lib/libvirt/images/test1-bootstrap"
emulator: "" => "<computed>"
machine: "" => "<computed>"
memory: "" => "2048"
name: "" => "test1-bootstrap"
network_interface.#: "" => "1"
network_interface.0.addresses.#: "" => "1"
network_interface.0.addresses.0: "" => "192.168.126.10"
network_interface.0.hostname: "" => "test1-bootstrap"
network_interface.0.mac: "" => "<computed>"
network_interface.0.network_id: "" => "268ee1c8-86ae-41ce-bf36-fe24fc2563fc"
network_interface.0.network_name: "" => "<computed>"
qemu_agent: "" => "false"
running: "" => "true"
vcpu: "" => "2"
libvirt_volume.master: Creation complete after 0s (ID: /var/lib/libvirt/images/test1-master-0)
libvirt_domain.master: Creating...
arch: "" => "<computed>"
console.#: "" => "1"
console.0.target_port: "" => "0"
console.0.type: "" => "pty"
coreos_ignition: "" => "/var/lib/libvirt/images/test1-master.ign;5bec0fb9-254c-693b-9847-d6ad5cb21016"
disk.#: "" => "1"
disk.0.scsi: "" => "false"
disk.0.volume_id: "" => "/var/lib/libvirt/images/test1-master-0"
emulator: "" => "<computed>"
machine: "" => "<computed>"
memory: "" => "3072"
name: "" => "test1-master-0"
network_interface.#: "" => "1"
network_interface.0.addresses.#: "" => "1"
network_interface.0.addresses.0: "" => "192.168.126.11"
network_interface.0.hostname: "" => "test1-master-0"
network_interface.0.mac: "" => "<computed>"
network_interface.0.network_id: "" => "268ee1c8-86ae-41ce-bf36-fe24fc2563fc"
network_interface.0.network_name: "" => "<computed>"
qemu_agent: "" => "false"
running: "" => "true"
vcpu: "" => "2"
module.bootstrap.libvirt_domain.bootstrap: Creation complete after 3s (ID: 4ed8607e-606a-4f82-8161-1a7b198d0bf6)
libvirt_domain.master: Creation complete after 3s (ID: e4cc8e4b-35ab-4b87-9ee3-fa0b4c89c587)
Apply complete! Resources: 9 added, 0 changed, 0 destroyed.
```
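When comparing installer runs, it can help to pull the completed resources out of the Terraform log. The following is a minimal sketch, assuming the `Creation complete after Ns (ID: ...)` line format shown above; `created_resources` is a hypothetical helper and `LOG_LINES` is a small sample copied from the log.

```python
import re

# A few lines copied from the Terraform log above.
LOG_LINES = [
    "libvirt_network.tectonic_net: Creation complete after 6s (ID: 268ee1c8-86ae-41ce-bf36-fe24fc2563fc)",
    "module.volume.libvirt_volume.coreos_base: Creation complete after 9s (ID: /var/lib/libvirt/images/test1-base)",
    "libvirt_volume.master: Creating...",
    "libvirt_domain.master: Creation complete after 3s (ID: e4cc8e4b-35ab-4b87-9ee3-fa0b4c89c587)",
]

# Matches "<resource>: Creation complete after <N>s (ID: <id>)".
CREATED = re.compile(
    r"^(?P<resource>\S+): Creation complete after (?P<seconds>\d+)s \(ID: (?P<id>.+)\)$"
)


def created_resources(lines):
    """Return (resource, seconds, id) for every 'Creation complete' line."""
    results = []
    for line in lines:
        match = CREATED.match(line.strip())
        if match:
            results.append(
                (match.group("resource"), int(match.group("seconds")), match.group("id"))
            )
    return results


for resource, seconds, resource_id in created_resources(LOG_LINES):
    print(f"{resource}: {seconds}s -> {resource_id}")
```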
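Once the bootstrap node completes, Terraform is applied again. This run drops the bootstrap address (`192.168.126.10`) from the `test1-api` DNS host entries: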
```bash
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
DEBUG Running &exec.Cmd{Path:"/home/prkumar/work/github/practice/go/src/github.com/openshift/installer/bin/terraform", Args:[]string{"/home/prkumar/work/github/practice/go/src/github.com/openshift/installer/bin/terraform", "apply", "-auto-approve", "-input=false", "-no-color", "-state=terraform.tfstate"}, Env:[]string(nil), Dir:"/tmp/openshift-install-681679557", Stdin:io.Reader(nil), Stdout:io.Writer(nil), Stderr:io.Writer(nil), ExtraFiles:[]*os.File(nil), SysProcAttr:(*syscall.SysProcAttr)(nil), Process:(*os.Process)(nil), ProcessState:(*os.ProcessState)(nil), ctx:context.Context(nil), lookPathErr:error(nil), finished:false, childFiles:[]*os.File(nil), closeAfterStart:[]io.Closer(nil), closeAfterWait:[]io.Closer(nil), goroutine:[]func() error(nil), errch:(chan error)(nil), waitDone:(chan struct {})(nil)}...
data.libvirt_network_dns_host_template.masters: Refreshing state...
data.libvirt_network_dns_host_template.workers: Refreshing state...
data.libvirt_network_dns_srv_template.etcd_cluster: Refreshing state...
libvirt_ignition.worker: Refreshing state... (ID: /var/lib/libvirt/images/test1-worker.ign;5bec0fb9-e4d3-1536-eb0f-a0d1fe0fc7a3)
libvirt_volume.coreos_base: Refreshing state... (ID: /var/lib/libvirt/images/test1-base)
libvirt_ignition.master: Refreshing state... (ID: /var/lib/libvirt/images/test1-master.ign;5bec0fb9-254c-693b-9847-d6ad5cb21016)
libvirt_ignition.bootstrap: Refreshing state... (ID: /var/lib/libvirt/images/test1-bootstrap.ign;5bec0fb9-8bd0-b619-76b3-32781e14f82b)
libvirt_volume.bootstrap: Refreshing state... (ID: /var/lib/libvirt/images/test1-bootstrap)
data.libvirt_network_dns_host_template.etcds: Refreshing state...
libvirt_volume.master: Refreshing state... (ID: /var/lib/libvirt/images/test1-master-0)
libvirt_network.tectonic_net: Refreshing state... (ID: 268ee1c8-86ae-41ce-bf36-fe24fc2563fc)
libvirt_domain.bootstrap: Refreshing state... (ID: 4ed8607e-606a-4f82-8161-1a7b198d0bf6)
libvirt_domain.master: Refreshing state... (ID: e4cc8e4b-35ab-4b87-9ee3-fa0b4c89c587)
libvirt_network.tectonic_net: Modifying... (ID: 268ee1c8-86ae-41ce-bf36-fe24fc2563fc)
dns.0.hosts.#: "4" => "3"
dns.0.hosts.0.ip: "192.168.126.10" => "192.168.126.11"
dns.0.hosts.1.hostname: "test1-api" => "test1-etcd-0"
dns.0.hosts.2.hostname: "test1-etcd-0" => "test1"
dns.0.hosts.2.ip: "192.168.126.11" => "192.168.126.50"
dns.0.hosts.3.hostname: "test1" => ""
dns.0.hosts.3.ip: "192.168.126.50" => ""
libvirt_network.tectonic_net: Modifications complete after 0s (ID: 268ee1c8-86ae-41ce-bf36-fe24fc2563fc)
Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
DEBUG Running &exec.Cmd{Path:"/home/prkumar/work/github/practice/go/src/github.com/openshift/installer/bin/terraform", Args:[]string{"/home/prkumar/work/github/practice/go/src/github.com/openshift/installer/bin/terraform", "init", "-input=false", "-no-color"}, Env:[]string(nil), Dir:"/tmp/openshift-install-681679557", Stdin:io.Reader(nil), Stdout:io.Writer(nil), Stderr:io.Writer(nil), ExtraFiles:[]*os.File(nil), SysProcAttr:(*syscall.SysProcAttr)(nil), Process:(*os.Process)(nil), ProcessState:(*os.ProcessState)(nil), ctx:context.Context(nil), lookPathErr:error(nil), finished:false, childFiles:[]*os.File(nil), closeAfterStart:[]io.Closer(nil), closeAfterWait:[]io.Closer(nil), goroutine:[]func() error(nil), errch:(chan error)(nil), waitDone:(chan struct {})(nil)}...
Initializing modules...
- module.volume
- module.bootstrap
Initializing provider plugins...
Terraform has been successfully initialized!
```
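The `Modifying...` section of the log above is easier to read as a set difference of DNS host entries. The sketch below transcribes the before and after lists from the log and computes what changed; the helper `dns_diff` is hypothetical.

```python
# DNS host entries before the second apply (dns.0.hosts.0 to dns.0.hosts.3).
BEFORE = [
    ("test1-api", "192.168.126.10"),
    ("test1-api", "192.168.126.11"),
    ("test1-etcd-0", "192.168.126.11"),
    ("test1", "192.168.126.50"),
]

# DNS host entries after the modification (dns.0.hosts.# went from 4 to 3).
AFTER = [
    ("test1-api", "192.168.126.11"),
    ("test1-etcd-0", "192.168.126.11"),
    ("test1", "192.168.126.50"),
]


def dns_diff(before, after):
    """Return (removed, added) host entries, each as a sorted list."""
    before_set, after_set = set(before), set(after)
    return sorted(before_set - after_set), sorted(after_set - before_set)


removed, added = dns_diff(BEFORE, AFTER)
print("removed:", removed)
print("added:", added)
```

Read this way, the only effective change is that `test1-api` no longer resolves to the bootstrap node's address.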
Installer challenges
--------------------
* Communication inside the cluster relies on hostnames, which means DNS needs to be available. The current solution relies on the DNS overlay of libvirt.
With CoreDNS we can provide a cross-platform solution to handle this,
using either the integrated `kubernetes` support or a simple zone file.
Alternatively, we can populate the `/etc/hosts` file inside the VM using Ignition.
* Providing the Ignition config on startup is currently not possible on Hyper-V.
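To make the `/etc/hosts` alternative more concrete, the following sketch generates an Ignition fragment that writes a hosts file into the VM. It is only a sketch under stated assumptions: it assumes the Ignition 2.2.0 config layout (`storage.files` with a `data:` URL as source), reuses the host names and addresses from the libvirt network shown earlier, and the function names are hypothetical. Whether the installer's generated configs can be merged with such a fragment has not been verified here.

```python
import json
from urllib.parse import quote

# Host entries taken from the <dns> section of the libvirt network above.
CLUSTER_HOSTS = {
    "192.168.126.11": ["test1-api", "test1-etcd-0"],
    "192.168.126.50": ["test1"],
}


def render_hosts(entries, domain="tt.testing"):
    """Render /etc/hosts content with fully qualified and short names."""
    lines = ["127.0.0.1 localhost", "::1 localhost"]
    for ip, names in entries.items():
        fqdns = [f"{name}.{domain}" for name in names]
        lines.append(" ".join([ip] + fqdns + names))
    return "\n".join(lines) + "\n"


def hosts_ignition(entries):
    """Build an Ignition fragment (assumed spec 2.2.0) that writes /etc/hosts."""
    content = render_hosts(entries)
    return {
        "ignition": {"version": "2.2.0"},
        "storage": {
            "files": [
                {
                    "filesystem": "root",
                    "path": "/etc/hosts",
                    "mode": 0o644,  # serialized as decimal 420 in JSON
                    "contents": {"source": "data:," + quote(content)},
                }
            ]
        },
    }


print(json.dumps(hosts_ignition(CLUSTER_HOSTS), indent=2))
```

A static hosts file avoids running a DNS service, but every address change then requires regenerating the Ignition config, which is where a CoreDNS zone file may be easier to maintain.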