# Lumen LABS

[TOC]

## Infra Details

This is a shared colo LUMEN lab in Dallas; [Christoph is the POC](mailto:cdoerbec@redhat.com).

[More info](https://docs.google.com/document/d/1ZDQ4fdGXP19LXkV3rYToqvLpkZOA-AWLSrTPFA0gwsg/edit?usp=sharing)

## How to Access

### General Details

1. You need an account; check with [Christoph](mailto:cdoerbec@redhat.com).
2. Once you have an account, there are two methods of access:
    * individual user-based SSH, key only (no passwords!):
      `ssh -p '8022' <username>@64.156.74.226`
    * username/password via Cockpit:
      `https://64.156.74.226:9090/`
        * First time only, you need to add your SSH public key to your account in the Cockpit console:
            * Login to [Cockpit](https://4.71.168.210:9090/)
            * browse to your account
            * from the left nav, go to `Accounts > vkanumal` and click `+` to add an authorized key

### SSH-Config (Recommended)

* Create a `~/.ssh/config` with the following content on your local box:

```
### copy-paste this content and change 'vkanumal'
### to your user-id
Host tiger-jumpbox
    HostName 4.71.168.210
    ## change-me
    User vkanumal
    Port 8022

Host tiger-bastion
    HostName 192.168.122.100
    User root
    ## change-me
    ProxyCommand ssh -W %h:%p vkanumal@tiger-jumpbox
```

* after this you should be able to log in to the lab with:
    * `ssh tiger-bastion`

### sshuttle

This utility eases access to the remote nodes ([more info](https://github.com/sshuttle/sshuttle)).

Create a simple script like the one below and run it at the start:

```
nohup sshuttle --ns-hosts 192.168.116.101,192.168.116.102,192.168.116.103,192.168.116.104,192.168.116.105,192.168.116.106 -vNHr vkanumal6757@tiger-jumpbox > telco-lab.log 2>&1 &
```

#### You can use [systemd as well](https://sshuttle.readthedocs.io/en/stable/requirements.html#additional-suggested-software)

# Quickstart : OpenShift AI Installation

This is a quickstart guide to OpenShift installation using the Assisted Installer method; [read more here.](https://docs.openshift.com/container-platform/4.7/installing/installing_platform_agnostic/installing-platform-agnostic.html#prerequisites)

## HelperNode

For the OCP install you need system services such as DNS, DHCP, and NTP, in addition to HTTPD & HA-Proxy. If you do not already have these in your infrastructure, you can use the [official HelperNode](https://github.com/RedHatOfficial/ocp4-helpernode) automation for the setup.

![Services](https://i.imgur.com/jj4zjuQ.png "Services Needed for OCP Install")
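If you go the HelperNode route, the services are configured by an Ansible playbook from that repository. A minimal sketch of the invocation is shown below; it follows the upstream README, and the `vars.yaml` values (cluster domain, node MACs/IPs, DHCP range, and so on) are assumptions you must fill in for this lab:

```
# clone the official HelperNode automation
git clone https://github.com/RedHatOfficial/ocp4-helpernode
cd ocp4-helpernode

# start from the example vars and edit them for this lab
# (path per the upstream README at time of writing; verify it in the repo)
cp docs/examples/vars.yaml .
vi vars.yaml

# run the playbook on the helper node
ansible-playbook -e @vars.yaml tasks/main.yml
```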
### DHCP Config

```
### /etc/dhcp/dhcpd.conf
authoritative;
ddns-update-style interim;
default-lease-time 14400;
max-lease-time 14400;

option routers 192.168.116.1;
option broadcast-address 192.168.116.255;
option subnet-mask 255.255.255.0;
option domain-name-servers 192.168.116.99;
option domain-name "ocp-tigertelco.example.com";
option domain-search "ocp-tigertelco.example.com", "example.com";

subnet 192.168.116.0 netmask 255.255.255.0 {
    interface eth0;
    # class tigernodes {
    #     match if substring(hardware,1,4) = 24:6E;
    # }
    pool {
        # allow members of tigernodes;
        range 192.168.116.113 192.168.116.114;

        # Static entries
        host bootstrap   { hardware ethernet aa:bb:cc:11:42:40; fixed-address 192.168.116.161; }
        host master0     { hardware ethernet 24:6e:96:33:37:5d; fixed-address 192.168.116.101; }
        host master1     { hardware ethernet 24:6E:96:08:43:75; fixed-address 192.168.116.102; }
        host master2     { hardware ethernet 24:6E:96:66:F9:D5; fixed-address 192.168.116.103; }
        host worker0     { hardware ethernet 24:6e:96:19:f3:65; fixed-address 192.168.116.104; }
        host worker1     { hardware ethernet 24:6E:96:31:A6:7D; fixed-address 192.168.116.105; }
        host worker2     { hardware ethernet 24:6E:96:08:64:C5; fixed-address 192.168.116.106; }
        host provisioner { hardware ethernet 52:54:00:a6:19:a4; fixed-address 192.168.116.110; }

        # this will not give out addresses to hosts not listed above
        deny unknown-clients;

        # this is PXE specific
        #filename "pxelinux.0";
        #next-server 192.168.116.99;
    }
}
```

## Cloud Portal Login

The first step is to create a configuration that builds an automated discovery ISO for your installation.

Login to [cloud.redhat.com](https://cloud.redhat.com/openshift/assisted-installer/clusters)

### Create new-cluster

![](https://i.imgur.com/HdBTnjQ.png)

### Discovery Image

![](https://i.imgur.com/pe4iiFi.png)

* Needs an `SSH-Key`; this key will also be used to log in to all the nodes
* Download the generated `ISO image` and copy it to the `NFS location` that is accessible from the `OOB` network (a hypothetical example of staging the image follows below)
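One hypothetical way to stage the discovery image, assuming the NFS export used in the RACADM section below (`192.168.110.10:/home/iso`) is reachable from `tiger-bastion` (from the SSH config above); host names and mount paths here are illustrative only:

```
# copy the downloaded discovery ISO from your laptop to the bastion
scp ~/Downloads/discovery_image_lumen-venu-cluster.iso tiger-bastion:/tmp/

# on the bastion, mount the NFS export and drop the image into it
# (/mnt/iso is an assumed temporary mount point)
ssh tiger-bastion 'mkdir -p /mnt/iso && mount 192.168.110.10:/home/iso /mnt/iso'
ssh tiger-bastion 'cp /tmp/discovery_image_lumen-venu-cluster.iso /mnt/iso/vkanumal/'
ssh tiger-bastion 'umount /mnt/iso'
```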
## Prepare Nodes

Your nodes should all be running the same firmware. Since all machines in the lab have iDRAC8, it is recommended to run the following versions:

* `BIOS`: 2.12.1
* `FIRMWARE`: 2.75.100.75

There are several ways you can handle firmware and virtual media:

* RACADM
* Ipmitool
* OpenManage
* Redfish

### Option-1: RACADM

```
### rachelp.sh

## Detach image
racadm -r $1 -u root -p <passwd> remoteimage -d

## Attach the new discovery image
racadm -r $1 -u root -p <passwd> remoteimage -c -l 192.168.110.10:/home/iso/vkanumal/discovery_image_lumen-venu-cluster.iso

## Set boot to VCD
racadm -r $1 -u root -p <passwd> set iDRAC.VirtualMedia.BootOnce 1
racadm -r $1 -u root -p <passwd> set iDRAC.ServerBoot.FirstBootDevice VCD-DVD

## Power cycle the node
racadm -r $1 -u root -p <passwd> serveraction powercycle
```

A simple `for` loop will do the trick:

```
for i in $(seq 101 106); do echo $i; ./rachelp.sh 192.168.116.$i; done
```

### Option-2: Ipmitool

```
ipmitool -I lanplus -U root -P <passwd> -H 192.168.110.102 chassis bootparam set bootflag force_bios
ipmitool -I lanplus -U root -P <passwd> -H 192.168.110.102 power on
ipmitool -I lanplus -U root -P <passwd> -H 192.168.110.102 chassis bootdev pxe
ipmitool -I lanplus -U root -P <passwd> -H 192.168.110.102 power cycle
```

### Option-3: Ansible OpenManage

Lab inventory:

```
all:
  hosts:
    example.com:
      idrac:
  children:
    masters:
      hosts:
        master0:
          idrac_ip: 192.168.110.101
        master1:
          idrac_ip: 192.168.110.102
        master2:
          idrac_ip: 192.168.110.103
    workers:
      hosts:
        worker0:
          idrac_ip: 192.168.110.104
        worker1:
          idrac_ip: 192.168.110.105
        worker2:
          idrac_ip: 192.168.110.106
  vars:
    idrac_user: root
    idrac_password: <passwd>
```

Example:

```
ansible-playbook -i inventory.yaml idrac/idrac_firmware_info.yml -v
```

### Option-4: Redfish - Python-Client

Recommended for use with iDRAC9, firmware 3.30.30.30 and above.

```
python3 GetSystemHWInventoryREDFISH.py -ip 192.168.110.102 -u root -p <passwd> -n N | grep AssociatedNetworkAddresses
```

## Install OpenShift

At a high level, these are the steps of the default install process:

- Upload the discovery ISO to the NFS location
- Using the OOB tools, attach the ISO as virtual media to all nodes that are part of the recommended cluster configuration (3 master nodes, 3 worker nodes)
- Trigger a node power-cycle with the boot device set to virtual media

After a successful reboot you should see the nodes boot into RHCOS and auto-register to the cloud console.

![](https://i.imgur.com/27KzExs.png)

Set each node's role to either 'Master' or 'Worker', or let the installer choose automatically.

![](https://i.imgur.com/mtdM30a.png)

Choose the domain `ocp-tigertelco.example.com`, set the API virtual IP to `192.168.116.224` and the Ingress virtual IP to `192.168.116.225`, and click on Install Cluster.

![API & Ingress Virtual IPs](https://i.imgur.com/50uuXaa.png "API & Ingress Virtual IPs")

The install will start; it takes around 10-15 minutes to finish.

![](https://i.imgur.com/1kvW61b.png)

### Accessing OpenShift Console

To access the newly installed cluster in the lab setup, you need to add entries to your `/etc/hosts`:

![](https://i.imgur.com/W032sgo.png)

The console should be accessible at [demo-console](https://console-openshift-console.apps.ai-demo-lumen.ocp-tigertelco.example.com/dashboards)

![](https://i.imgur.com/cfDMMsY.png)

### CLI Access

Download the `kubeconfig` from the console and set your bash `KUBECONFIG` variable; then you should be able to access the cluster using `oc`:

```
[root@bastion-centos assist]# oc cluster-info
Kubernetes control plane is running at https://api.ai-demo-lumen.ocp-tigertelco.example.com:6443

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
```

```
[root@bastion-centos assist]# oc get nodes
NAME                                 STATUS   ROLES    AGE     VERSION
master0.ocp-tigertelco.example.com   Ready    master   3d23h   v1.20.0+ba45583
master1.ocp-tigertelco.example.com   Ready    master   3d23h   v1.20.0+ba45583
master2.ocp-tigertelco.example.com   Ready    master   3d23h   v1.20.0+ba45583
worker0.ocp-tigertelco.example.com   Ready    worker   3d23h   v1.20.0+ba45583
worker1.ocp-tigertelco.example.com   Ready    worker   3d23h   v1.20.0+ba45583
```

## Additional Tasks

### Multiple Networks

TBD

### SSH to the Control Nodes

TBD

### Add/Remove nodes from Cluster

TBD
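As a placeholder until this section is written, a minimal sketch of removing a worker node with `oc` is shown below (node name taken from the `oc get nodes` output above; adding nodes back is not covered here, and the exact procedure should be confirmed against the official documentation for your OpenShift version):

```
# cordon the node so no new pods are scheduled on it
oc adm cordon worker1.ocp-tigertelco.example.com

# drain the workloads off the node
oc adm drain worker1.ocp-tigertelco.example.com --ignore-daemonsets --delete-emptydir-data

# remove the node object from the cluster
oc delete node worker1.ocp-tigertelco.example.com
```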