# Installing OKD with Assisted Installer
## Getting started
Assisted Installer is a web service that helps install OCP/OKD using the UPI (user-provisioned infrastructure) method. The service generates a discovery ISO; hosts booted from it register with the centralized service and accept commands from it.
This method prevents the most common mistakes (e.g. network misconfiguration, missing DNS records) and visualizes the installation process. Unlike the IPI (installer-provisioned infrastructure) method, Assisted Installer relies on user-provided infrastructure, which gives more freedom in architecture choices.
## Running Assisted Service
```
git clone https://github.com/openshift/assisted-service/
```
Customize service URLs:
```
[vrutkovs@centos8stream assisted-service]$ git diff
diff --git a/deploy/podman/okd-configmap.yml b/deploy/podman/okd-configmap.yml
index 0fc523cf..206ab5a5 100644
--- a/deploy/podman/okd-configmap.yml
+++ b/deploy/podman/okd-configmap.yml
@@ -3,7 +3,7 @@ kind: ConfigMap
metadata:
name: config
data:
- ASSISTED_SERVICE_HOST: 127.0.0.1:8090
+ ASSISTED_SERVICE_HOST: 34.118.84.65:8090
ASSISTED_SERVICE_SCHEME: http
AUTH_TYPE: none
DB_HOST: 127.0.0.1
@@ -16,7 +16,7 @@ data:
DUMMY_IGNITION: "false"
ENABLE_SINGLE_NODE_DNSMASQ: "false"
HW_VALIDATOR_REQUIREMENTS: '[{"version":"default","master":{"cpu_cores":4,"ram_mib":16384,"disk_size_gb":120,"installation_disk_speed_threshold_ms":10,"network_latency_threshold_ms":100,"packet_loss_percentage":0},"worker":{"cpu_cores":2,"ram_mib":8192,"disk_size_gb":120,"installation_disk_speed_threshold_ms":10,"network_latency_threshold_ms":1000,"packet_loss_percentage":10},"sno":{"cpu_cores":8,"ram_mib":16384,"disk_size_gb":120,"installation_disk_speed_threshold_ms":10}}]'
- IMAGE_SERVICE_BASE_URL: http://127.0.0.1:8888
+ IMAGE_SERVICE_BASE_URL: http://34.118.84.65:8888
IPV6_SUPPORT: "true"
LISTEN_PORT: "8888"
NTP_DEFAULT_SERVER: ""
@@ -24,8 +24,8 @@ data:
POSTGRESQL_PASSWORD: admin
POSTGRESQL_USER: admin
PUBLIC_CONTAINER_REGISTRIES: 'quay.io'
- SERVICE_BASE_URL: http://127.0.0.1:8090
+ SERVICE_BASE_URL: http://34.118.84.65:8090
```
These URLs must be reachable from the nodes so that the agent can report status and fetch commands from the server.
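The substitution shown in the diff can also be scripted. The following is a minimal sketch, assuming the three service URL keys still carry their default `127.0.0.1` values; `set_service_host` is a hypothetical helper name and not part of the assisted-service repository, and it relies on GNU `sed -i`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: point the three service URL keys at a reachable host.
# Only ASSISTED_SERVICE_HOST, IMAGE_SERVICE_BASE_URL and SERVICE_BASE_URL are
# rewritten; DB_HOST and any other 127.0.0.1 values are left untouched.
set_service_host() {
  local file="$1" host="$2"
  sed -i -E \
    "/^[[:space:]]*(ASSISTED_SERVICE_HOST|IMAGE_SERVICE_BASE_URL|SERVICE_BASE_URL):/ s/127\.0\.0\.1/${host}/" \
    "$file"
}
```

For example, `set_service_host deploy/podman/okd-configmap.yml 34.118.84.65` should produce the same three changes as the diff above.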
If you're running in a disconnected environment, update the FCOS URLs in `OS_IMAGES` and the image pullspecs in `RELEASE_IMAGES` and `OKD_RPMS_IMAGES`.
Let's run it:
```
[vrutkovs@centos8stream assisted-service]$ make deploy-onprem OKD=true
podman play kube --configmap deploy/podman/okd-configmap.yml deploy/podman/pod.yml
Pod:
3e7ae70150dd78758e59131516c356ee63797d06addce81e51ce84904918573c
Containers:
8ec9753fa785b890cfa64f4f0430ee1f0638e1f97cad0cb28caf5a935af57a70
bc9d7aed1dd98116ba4879d27fee00a8e15257274a594681caba90fabaace9eb
17e97906412439f51167afa55c56f95bea2378bcbca196a799078b3d503e3da1
caa66751a213ed0a35bb578838fcee7d59392a48bbcb66d3aaf9ed2fbd5eaf6d
./hack/retry.sh 90 2 "curl -f http://127.0.0.1:8090/ready"
curl -f http://127.0.0.1:8090/ready
curl: (56) Recv failure: Connection reset by peer
> failed with exit code 56, waiting 2 seconds to retry...
```
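The deploy target keeps polling the `/ready` endpoint until the service answers. If you need to wait for readiness from your own scripts, a loop along these lines may help. It is a sketch in the spirit of `./hack/retry.sh`, not a copy of it, and the `retry` function name is an assumption.

```shell
#!/usr/bin/env bash
# Sketch of a generic retry loop (hypothetical helper, not hack/retry.sh itself).
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # Run the command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed, waiting $delay seconds to retry..." >&2
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example (assumes the service listens on 127.0.0.1:8090 as in the configmap):
# retry 90 2 curl -fsS http://127.0.0.1:8090/ready
```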
Once the health checks pass, open the `ASSISTED_SERVICE_HOST` address in a browser:
---
![](https://i.imgur.com/DYCFnqE.png)
## Discovering hosts
Click "Create new cluster" and fill in the form:
![](https://i.imgur.com/ibEuR2t.png)
---
Click "Next" at the bottom, then click the "Add Hosts" button:
![](https://i.imgur.com/5MrScMt.png)
We're using the "minimal" image, which pulls its rootfs from builds.fedoraproject.org and has the SSH key embedded.
Click "Generate Discovery ISO".
---
![](https://i.imgur.com/1MEJXPa.png)
The dialog provides the discovery ISO that is used to boot every node. Note that no Ignition config is needed to boot it.
Let's download it and create a VM with 16 GB of RAM and at least 8 cores (the command below allocates 16 vCPUs):
```
$ wget -O discovery_image_okd.iso 'http://34.118.84.65:8888/images/7ebf7402-e5c2-4332-a90c-5a007034b322?arch=x86_64&type=minimal-iso&version=4.9'
--2022-02-12 15:58:50-- http://34.118.84.65:8888/images/7ebf7402-e5c2-4332-a90c-5a007034b322?arch=x86_64&type=minimal-iso&version=4.9
Connecting to 34.118.84.65:8888... connected.
HTTP request sent, awaiting response... 200 OK
Length: 94828544 (90M) [application/octet-stream]
Saving to: ‘discovery_image_okd.iso’
discovery_image_okd.iso 100%[======================================================================================================>] 90.44M 375MB/s in 0.2s
2022-02-12 15:58:50 (375 MB/s) - ‘discovery_image_okd.iso’ saved [94828544/94828544]
$ sudo mv discovery_image_okd.iso /var/lib/libvirt/images
$ sudo virt-install --autostart --virt-type=kvm --name master --memory 16500 --vcpus=16 --cdrom=/var/lib/libvirt/images/discovery_image_okd.iso --disk path=/var/lib/libvirt/images/master.qcow2,size=150,bus=virtio,format=qcow2 --events on_reboot=restart --boot hd,cdrom --noautoconsole
WARNING No operating system detected, VM performance may suffer. Specify an OS with --os-variant for optimal results.
Starting install...
Domain is still running. Installation may be in progress.
You can reconnect to the console to complete the installation process.
```
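Before creating the VM it can be worth checking the planned size against the validator minimums. The sketch below assumes the single-node (`sno`) entry of `HW_VALIDATOR_REQUIREMENTS` from the configmap applies; `meets_sno_minimums` is a hypothetical helper, and it treats the GiB passed to `virt-install` as GB for simplicity.

```shell
#!/usr/bin/env bash
# Minimums copied from the "sno" entry of HW_VALIDATOR_REQUIREMENTS above.
SNO_MIN_CORES=8
SNO_MIN_RAM_MIB=16384
SNO_MIN_DISK_GB=120

# Hypothetical helper: succeed only if the planned VM meets every minimum.
# Usage: meets_sno_minimums <vcpus> <memory_mib> <disk_gb>
meets_sno_minimums() {
  [ "$1" -ge "$SNO_MIN_CORES" ] &&
    [ "$2" -ge "$SNO_MIN_RAM_MIB" ] &&
    [ "$3" -ge "$SNO_MIN_DISK_GB" ]
}
```

With the values from the `virt-install` command above, `meets_sno_minimums 16 16500 150` succeeds, while a VM with only 4 vCPUs would be rejected.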
Wait for the machine to request an address:
```
$ sudo journalctl -b -f -u libvirtd
-- Logs begin at Fri 2022-01-28 10:33:32 UTC. --
Feb 12 15:59:33 centos8stream dnsmasq-dhcp[29740]: DHCPOFFER(virbr0) 192.168.122.97 52:54:00:85:af:dd
Feb 12 15:59:33 centos8stream dnsmasq-dhcp[29740]: DHCPDISCOVER(virbr0) 52:54:00:85:af:dd
Feb 12 15:59:33 centos8stream dnsmasq-dhcp[29740]: DHCPOFFER(virbr0) 192.168.122.97 52:54:00:85:af:dd
Feb 12 15:59:33 centos8stream dnsmasq-dhcp[29740]: DHCPREQUEST(virbr0) 192.168.122.97 52:54:00:85:af:dd
Feb 12 15:59:33 centos8stream dnsmasq-dhcp[29740]: DHCPACK(virbr0) 192.168.122.97 52:54:00:85:af:dd
```
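The leased address can be picked out of the log instead of being read by eye. A minimal sketch, assuming the dnsmasq log format shown above (`DHCPACK(<bridge>) <ip> <mac>`); `last_dhcpack_ip` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the address from the last DHCPACK line on stdin.
# Assumes the dnsmasq log format shown above: "... DHCPACK(virbr0) <ip> <mac>".
last_dhcpack_ip() {
  awk '{
    # The IP address is the field right after the DHCPACK(...) token.
    for (i = 1; i < NF; i++)
      if ($i ~ /^DHCPACK\(/) ip = $(i + 1)
  } END { if (ip != "") print ip }'
}

# Example: sudo journalctl -b -u libvirtd | last_dhcpack_ip
```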
Update the hosts entries, signal dnsmasq to reload them, and SSH into the node:
```
$ sudo vi /etc/hosts && sudo pkill -SIGHUP dnsmasq && ssh core@master.okd.this.host
** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** **
This is a host being installed by the OpenShift Assisted Installer.
It will be installed from scratch during the installation.
The primary service is agent.service. To watch its status, run:
sudo journalctl -u agent.service
To view the agent log, run:
sudo journalctl TAG=agent
** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** **
Fedora CoreOS 34.20210626.3.1
Tracker: https://github.com/coreos/fedora-coreos-tracker
Discuss: https://discussion.fedoraproject.org/c/server/coreos/
[systemd]
Failed Units: 1
selinux.service
[core@localhost ~]$
```
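The manual `/etc/hosts` edit above can be replaced with an idempotent append. This is a sketch only: `add_hosts_entry` is a hypothetical helper, the name-to-address mapping is whatever you would have typed in the editor, and the presence check is a simple whole-word match that may be too loose for hosts files with similar names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "<ip> <name>" to a hosts file unless the name
# is already present. Defaults to /etc/hosts; pass another file to try it out.
add_hosts_entry() {
  local ip="$1" name="$2" file="${3:-/etc/hosts}"
  # -F: treat the name literally, -w: match it as a whole word.
  if grep -qwF -- "$name" "$file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

Writing to `/etc/hosts` needs a root shell; afterwards `pkill -SIGHUP dnsmasq` makes dnsmasq pick up the new entry, as in the command above.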
This is a plain FCOS host with kubelet and CRI-O installed in an overlayFS:
```
[core@master ~]$ rpm-ostree status
State: idle
Deployments:
● ostree://fedora:fedora/x86_64/coreos/stable
Version: 34.20210626.3.1 (2021-07-14T14:49:01Z)
Commit: 252fffde6f56d183a3c51c05a0c602b61011f6cb4de23a58313ba3b0023dc360
GPGSignature: Valid signature by 8C5BA6990BDB26E19F2A1A801161AE6945719A39
$ ls -l /usr/bin/kubelet /usr/bin/crio
-rwxr-xr-x. 1 root root 73593408 Feb 7 06:58 /usr/bin/crio
-rwxr-xr-x. 1 root root 121561792 Feb 3 16:10 /usr/bin/kubelet
```
These binaries are extracted from the `okd-rpms` image; see `journalctl -b -u okd-overlay.service` for details.
The VM now runs the Assisted Installer agent:
```
[core@master ~]$ sudo systemctl status agent
● agent.service
Loaded: loaded (/etc/systemd/system/agent.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/agent.service.d
└─wait-for-okd.conf
Active: active (running) since Sat 2022-02-12 16:02:11 UTC; 4min 50s ago
Process: 1559 ExecStartPre=/usr/local/bin/agent-fix-bz1964591 quay.io/ocpmetal/assisted-installer-agent:latest (code=exited, status=0/SUCCESS)
Process: 1658 ExecStartPre=podman run --privileged --rm -v /usr/local/bin:/hostbin quay.io/ocpmetal/assisted-installer-agent:latest cp /usr/bin/agent /hostbin (code=exited, status=0/SUCC>
Main PID: 2123 (agent)
...
```
The Assisted Installer UI now shows the host details:
![](https://i.imgur.com/AOiW4nn.png)
Click `Next` to continue.
---
On the "Networking" page, configure the cluster networking settings:
![](https://i.imgur.com/7cOfqY3.png)
Click `Next` to proceed to the review page, then click `Install cluster`.
## Bootstrap in place
Assisted Installer fetches the selected OKD release, extracts the installer and runs it to generate the Ignition files. It also ensures that the registered hosts can pull images for the selected release.
![](https://i.imgur.com/hhooN1c.png)
---
After the bootstrap Ignition is generated, the service passes it to the agent, which applies it on the host (without a reboot) and runs `bootkube.service`:
```
[core@master ~]$ sudo journalctl -b -u bootkube | head -n10
-- Journal begins at Sat 2022-02-12 15:59:26 UTC, ends at Sat 2022-02-12 16:16:24 UTC. --
Feb 12 16:14:28 random-hostname-74890eca-63ef-4d07-baa3-6ffe3b6e6d97 systemd[1]: Started Bootstrap a Kubernetes cluster.
Feb 12 16:14:29 random-hostname-74890eca-63ef-4d07-baa3-6ffe3b6e6d97 podman[11656]: 2022-02-12 16:14:29.187467435 +0000 UTC m=+0.145509404 container create 1eb62c1c42337c05f452a0ce5edbc219608221e37ab83a3c8b007916ad1b5305 (image=quay.io/openshift/okd@sha256:ce42e3e42c19b2d97f51221a65e1f97191b6f51f6e8552ba339dace501e29d1a, name=charming_margulis, io.openshift.release=4.9.0-0.okd-2022-01-29-035536, io.openshift.release.base-image-digest=sha256:45828d66e36c763d63c851c9c037a16c3bb2df50b26f86e5da461d8f0af225df)
Feb 12 16:14:29 random-hostname-74890eca-63ef-4d07-baa3-6ffe3b6e6d97 podman[11656]: 2022-02-12 16:14:29.259999082 +0000 UTC m=+0.218041023 container init 1eb62c1c42337c05f452a0ce5edbc219608221e37ab83a3c8b007916ad1b5305 (image=quay.io/openshift/okd@sha256:ce42e3e42c19b2d97f51221a65e1f97191b6f51f6e8552ba339dace501e29d1a, name=charming_margulis, io.openshift.release=4.9.0-0.okd-2022-01-29-035536, io.openshift.release.base-image-digest=sha256:45828d66e36c763d63c851c9c037a16c3bb2df50b26f86e5da461d8f0af225df)
...
```
![](https://i.imgur.com/briztF0.png)
---
When `bootkube.service` finishes, the agent packs the etcd database and manifests into Ignition, patches `master.ign`, and writes the FCOS image to disk, applying the modified `master.ign` to the host.
This allows the host to come up as the first master without a separate running bootstrap host (the so-called "bootstrap-in-place" method).
![](https://i.imgur.com/E8KGD3L.png)
---
The VM now stops and waits for the user to detach the discovery ISO and boot from disk. On boot, the host starts `machine-config-daemon-firstboot` to pivot into the expected OS (unlike the overlay used in the discovery ISO phase, this change has to be persisted):
```
$ sudo virsh start master
Domain 'master' started
$ ssh core@master.okd.this.host
Fedora CoreOS 34.20210626.3.1
Tracker: https://github.com/coreos/fedora-coreos-tracker
Discuss: https://discussion.fedoraproject.org/c/server/coreos/
[core@master ~]$ sudo journalctl -b -f -u machine-config-daemon-firstboot
-- Journal begins at Sat 2022-02-12 16:23:06 UTC, ends at Sat 2022-02-12 16:24:39 UTC. --
Feb 12 16:24:08 master systemd[1]: Starting Machine Config Daemon Firstboot...
Feb 12 16:24:08 master machine-config-daemon[1543]: I0212 16:24:08.415726 1543 update.go:1897] Running: systemctl start rpm-ostreed
Feb 12 16:24:09 master machine-config-daemon[1543]: I0212 16:24:09.860076 1543 rpm-ostree.go:325] Running captured: rpm-ostree status --json
...
```
The Assisted Installer UI shows the host in the "Rebooting" stage, which is expected to last until the Assisted agent is running as a Kubernetes static pod.
## Finalizing the installation
After another reboot, the OKD machine-os is applied:
```
$ ssh core@master.okd.this.host
Fedora CoreOS 34
Tracker: https://github.com/coreos/fedora-coreos-tracker
Discuss: https://discussion.fedoraproject.org/c/server/coreos/
Last login: Sat Feb 12 16:24:12 2022 from 192.168.122.1
[core@master ~]$ sudo rpm-ostree status
State: idle
Deployments:
● pivot://quay.io/openshift/okd-content@sha256:7755b7626fe2316173e4ccc7723eeac2438212e98ff16388e57d10380c4a319d
CustomOrigin: Managed by machine-config-operator
Version: 49.34.202201282225-0 (2022-01-28T22:29:16Z)
fedora:fedora/x86_64/coreos/stable
Version: 34.20210626.3.1 (2021-07-14T14:49:01Z)
Commit: 252fffde6f56d183a3c51c05a0c602b61011f6cb4de23a58313ba3b0023dc360
GPGSignature: Valid signature by 8C5BA6990BDB26E19F2A1A801161AE6945719A39
```
Check that `kubelet.service` is running and wait for `assisted-installer-controller` to become active so that the installation proceeds to the next phase:
```
[core@master ~]$ sudo su
[root@master core]# export KUBECONFIG=/etc/kubernetes/bootstrap-secrets/kubeconfig
[root@master core]# oc get pods -n assisted-installer -w
NAME READY STATUS RESTARTS AGE
assisted-installer-controller--1-ghfrj 0/1 Pending 0 12m
assisted-installer-controller--1-ghfrj 0/1 Pending 0 14m
assisted-installer-controller--1-ghfrj 0/1 ContainerCreating 0 14m
assisted-installer-controller--1-ghfrj 1/1 Running 0 14m
```
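Instead of watching the output manually, the wait can be scripted by checking the `READY` and `STATUS` columns. A minimal sketch, assuming the column layout shown above; `controller_running` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if any line of `oc get pods` output on stdin
# shows an assisted-installer-controller pod as 1/1 Running.
controller_running() {
  awk '$1 ~ /^assisted-installer-controller/ && $2 == "1/1" && $3 == "Running" { ok = 1 }
       END { exit (ok ? 0 : 1) }'
}

# Example (needs KUBECONFIG set as in the session above):
# until oc get pods -n assisted-installer --no-headers | controller_running; do
#   sleep 10
# done
```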
## Voila
Shortly afterwards the UI updates the status:
![](https://i.imgur.com/UZO2UYw.png)
Once all operators have rolled out, the UI shows the console URL and credentials:
![](https://i.imgur.com/O3tPjcw.png)