# OpenShift SNO Setup 2024-03-28
:::success
:memo: **NOTE**
This is from a home lab collab session on 2024-03-28.
We used Red Hat Assisted Installer option to install Single Node OpenShift.
This document then covers the post-install setup.
:::
---
This is how we set up a SNO (Single Node OpenShift) system and then configured storage, networking, and OpenShift Virtualization (KubeVirt, CNV) so it can run not only containers but also Virtual Machines.
### Terminology
SNO
: Single Node OpenShift
OCP
: Red Hat OpenShift Container Platform, or just OpenShift for short
CNV
: Container Native Virtualization
kubevirt
: Kubernetes Virtualization. See more at ==https://kubevirt.io==
# SNO install using Assisted Installer
# Getting Started
I took one of my mini 4x4 PC systems and installed SNO onto it.
Basic install, nothing special. I used the Red Hat assisted installer option.
>**My Mini 4x4 PC has these attributes:**
AMD Ryzen CPU
64GB memory
1x SATA SSD – 1TB /dev/sda (used for OCP installation)
1x NVMe SSD – 1TB /dev/nvme0n1 (space for other uses like VM image storage, etc.)
1x 2.5GbE network interface port
1x 1GbE network interface port
Pre-requisites:
>1. If you have multiple storage devices, decide which one will be the install/boot storage device, as you'll need to identify it during the Assisted Installer process.
>2. You need to have a single IP address assigned in DNS with both forward and reverse lookup tested and working correctly (a quick way to check this is sketched right after this section). This single address is shared for 3 things: node IP, API IP, and Ingress IP.
>:::spoiler
>Any OpenShift cluster with more than one node (a 3-node compact cluster, or larger clusters such as the enterprise-standard 5 nodes of 3 control plane and 2 worker nodes) must have multiple unique IP addresses assigned and ready in DNS, each tested for both forward and reverse lookup. Those unique IP addresses break down as follows:
>A. You need one unique IP address for each node
>B. You need one unique IP address for the API IP, which will be able to float or be used by any control plane node in the cluster.
>C. You need one unique IP address for the Ingress IP, which will be able to float or be used by any control plane node in the cluster.
>So for a compact 3-node cluster home lab, for instance, you'd need 5 unique IP addresses:
>1) node 1 2) node 2 3) node 3 4) API IP 5) Ingress IP.
>But in a SNO or single node OpenShift, you can use one single unique IP address for all 3 uses: node IP, API IP, Ingress IP.
>:::
>The IP address(es) will be asked for and required during your initial installation, including with the use of the Assisted Installer, which we'll be using for the procedure today.
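A quick way to sanity-check the DNS prerequisite before you start the Assisted Installer is to run a forward and a reverse lookup from another machine on the same network. This is just a minimal sketch; the hostname and IP address below are placeholders for your own values.
```
# Forward lookup: the name should resolve to your chosen IP
dig +short api.sno.example.com

# Reverse lookup: the IP should resolve back to that same name
dig +short -x 192.168.1.50

# nslookup works too if dig isn't installed
nslookup api.sno.example.com
```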
After installation, I took these steps:
>1. Set up an htpasswd authentication method with users and passwords so I could remove the temporary kubeadmin user created during installation.
>2. I installed these operators:
A) LVM Storage: to use the local storage on that unused NVMe SSD, for instance
B) OpenShift Virtualization: we know we want to run Virtual Machines, don't we!
C) One of those also pulled in a pre-req/co-req operator called Package Server
>:::spoiler
>$ htpasswd -c -B -b htpasswd.users ocpadmin "supersecret%1234"   # -c creates the file on the first run
>$ htpasswd -B -b htpasswd.users developer1 "supersecret%1234"
>
>$ oc create secret generic htpass-secret --from-file=htpasswd=htpasswd.users -n openshift-config
>
>$ cat htpasswd_cr.yaml
>```
>---
>apiVersion: config.openshift.io/v1
>kind: OAuth
>metadata:
>  name: cluster
>spec:
>  identityProviders:
>  - name: Local Logins
>    mappingMethod: claim
>    type: HTPasswd
>    htpasswd:
>      fileData:
>        name: htpass-secret
>```
>$ oc apply -f htpasswd_cr.yaml
>$ oc adm policy add-cluster-role-to-user cluster-admin ocpadmin
>:::
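To confirm the new htpasswd identity provider actually works, you can log in as the new admin user from any terminal with `oc` and then, only once that works, remove the temporary kubeadmin user. A minimal sketch, assuming the user names from the commands above and a placeholder API URL:
```
# Log in as the new cluster admin (replace the API URL with your own)
oc login -u ocpadmin -p 'supersecret%1234' https://api.sno.example.com:6443

# Verify identity and cluster-admin rights
oc whoami
oc get nodes

# Only after the above succeeds, remove the temporary kubeadmin credentials
oc delete secret kubeadmin -n kube-system
```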
# First Things First
## Cluster Console / Web Terminal
### Terminal Operator
There is a very useful tool for working in the OpenShift console GUI called the Web Terminal operator. It gives you an easy-to-find button that opens a terminal window right in the console, so you get a Linux shell prompt and can run OCP CLI commands with the `oc` utility. In other words, it gives you a quick, easy `oc` CLI terminal window to work in.
So let's do this now, right up front, first things first! Just follow these steps:
1. On your OCP console go to the left nav frame and open up **Operators** --> **OperatorHub**.
2. Go to the **All items** search bar and enter "web terminal".
3. You should get 2 results, and we want what is usually the 2nd one listed called "**Web Terminal**". Click on that one.
4. You should get a right side pop-up window with information and an **Install** button at the top. Just click on that **Install** button as there are no changes we need to make and the defaults are all okay.
5. You're presented a new page with some operator details to set before installing. All of the defaults are usually okay, so just click the blue **Install** button at the bottom.
6. It will start the install, but there isn't much to see there, so click back on the left nav frame **Operators** --> **Installed Operators**. You should now see the Web Terminal operator listed, and you can watch the "Status" column change as it progresses until you see "Succeeded".
7. You may well be asked to refresh your console window. If asked, please do that.
8. Now you should see a new **>_** icon in the upper right of your console window, next to the existing icons for Red Hat Applications, Notifications, the **+** to quickly add new YAML, and the **?** Help button.
Try it now. Click on the **>_**. It opens up a small height terminal at the bottom part of your current window. You can grab the divider line and pull that up or down to expand or shrink the height. Or you are given an action button in the upper right to close the terminal window, or open it in another browser tab or window.
The very first time you run this, it will ask you to initialize the terminal, and all it asks for is a project name. It's not really asking, though: it looks like you have a choice, but you don't, it's essentially hard coded. That's trivial, so we're not concerned. Since the project name is already filled in with the default for this operator, project "**openshift-terminal**", all you have to do is click the blue **Start** button.
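Once the terminal has started, a couple of quick commands confirm that the `oc` client in the web terminal is talking to your cluster:
```
# The web terminal is logged in with your console user's credentials
oc whoami

# On a SNO cluster this should list exactly one node
oc get nodes
```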
# Storage
Staying in Administrator mode on our cluster, we looked at the LVM Storage operator. When we go to its LVMCluster tab, it shows that we already have an LVM cluster set up for us, called ```lvmcluster-sample```. This happened because we used the Assisted Installer and checked the box to add the LVM Storage operator during installation, so this was done automatically for us.
>NOTE: There is another storage operator called "Local Storage". Why didn't we use this one? Why use the LVM Storage operator? The answer is because the Local Storage operator is a throwback to the early days of OpenShift before we had ODF (OpenShift Data Foundation) and we needed some way to provide storage to the containers. If you have SNO and internal local storage available, then LVM storage is absolutely the way to go like we're setting up here. You can also add NFS or iSCSI storage methods for various use cases and with trade-offs.
>If you are setting up a compact 3-node cluster or more than 3 nodes, then ODF is the way to go unless you have something else available like IBM Fusion or other storage operators.
So that assisted installer option setup this sample lvmcluster for us.
If you install the LVM Storage operator on your own, separately, after OCP installation, then you will typically have to create your own LVM cluster as the first step after the operator install.
When we dive into that ```lvmcluster-sample``` resource and look at the YAML, then hit "Reload" to collapse all those "managedFields" lines (about 40 lines collapsed), we see most of the important info; in our case this is lines 53-61.
```
spec:
  storage:
    deviceClasses:
    - fstype: xfs
      name: vg1
      thinPoolConfig:
        name: thin-pool-1
        overprovisionRatio: 10
        sizePercent: 90
```
This tells us a few things. It is using the correct device: the operator scanned our storage devices after it was installed and found this device clean (no partitions, not in use by anything else, and no apparent data on it currently), so it automatically set it up for us to use. The LVM Storage operator is basically looking at it and saying "okay, if you're not going to use that, then I am".
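If you prefer the CLI, you can inspect that same LVMCluster resource from a terminal. A small sketch, assuming the operator's usual `openshift-storage` namespace (adjust if yours differs):
```
# List LVM clusters created by the LVM Storage operator
oc get lvmcluster -n openshift-storage

# Dump the full YAML, including the deviceClasses section shown above
oc get lvmcluster lvmcluster-sample -n openshift-storage -o yaml
```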
## Logical Volume Management (LVM)
First off, here are some good sources to learn more about LVM in Linux.
https://www.digitalocean.com/community/tutorials/an-introduction-to-lvm-concepts-terminology-and-operations
https://linuxhandbook.com/lvm-guide/
Some explanation here about LVM, i.e. Logical Volume Management that will help those unfamiliar with this technology.
There are three main components to LVM:
1. Physical Volumes
2. Volume Groups
3. Logical Volumes
### Why use LVM?
The main advantage of LVM is how easy it is to resize a logical volume or volume group. It abstracts away all the ugly parts (partitions, raw disks) and leaves us with a central storage pool to work with.
### Physical Volumes
The very first thing you need to know about LVM is physical volumes. Physical volumes (PVs) are the raw materials or building blocks used to achieve the abstraction that is logical volumes. In simpler words, physical volumes are the base units of an LVM system.
A physical volume can be anything: a raw disk, or a disk partition. Creating and initializing a physical volume are the same thing. Both mean you're just preparing the building blocks (i.e. partitions, disks) for further operations. This will become clearer in a moment.
Utilities: All Linux utilities that manage physical volumes start with the letters pv for Physical Volume. E.g. pvcreate, pvchange, pvs, pvdisplay etc.
### Volume Groups
Volume groups (VGs) are collections of physical volumes. They are the next level of abstraction in LVM. A volume group is the storage pool that combines the storage capacity of multiple raw storage devices.
Utilities: All volume group Linux utility names start with vg, which stands for Volume Group, e.g. vgcreate, vgs, vgrename etc.
A volume group is a logical group of one or more disk storage devices, such as /dev/sda or /dev/nvme0n1. It doesn't have to be a whole disk device; it could be just a single partition, such as /dev/sda5 or /dev/nvme3n1p8, or multiple partitions. Volume groups can thus span multiple disks and/or partitions of various types and sizes, yet give you one logical device to work with. Another benefit is that you can easily add and remove physical volumes from a volume group, so it can grow quickly, easily, and flexibly.
### Logical Volumes
A logical volume (LV) is then a logical storage device construct that exists within that volume group. This is what you're going to mostly work with. A logical volume is like a partition, but instead of sitting on top of a raw disk, it sits on top of a volume group.
>You can:
>* Format a logical volume with whichever filesystem you want.
>* Mount it anywhere in the filesystem you want.
Utilities: All Linux logical volume utility names start with lv, which stands for Logical Volume, e.g. lvcreate, lvs, lvreduce etc.
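To make those three layers concrete, here is a hedged sketch of how you would build them by hand on a plain RHEL box. You don't need to do any of this on OpenShift, since the LVM Storage operator does it for you; the device name is just a placeholder.
```
# 1. Initialize a raw disk (or partition) as a Physical Volume
pvcreate /dev/nvme0n1

# 2. Pool one or more PVs into a Volume Group
vgcreate vg1 /dev/nvme0n1

# 3. Carve a Logical Volume out of the VG, put a filesystem on it, and mount it
lvcreate -L 10G -n my-lv vg1
mkfs.xfs /dev/vg1/my-lv
mount /dev/vg1/my-lv /mnt
```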
## OpenShift Storage Classes
What StorageClasses bring to OpenShift is dynamic, self-service storage. When claims are created by admins, developers, etc. (basically whatever is requesting storage out of Kubernetes), they just put the request out there, and the StorageClass will dynamically allocate PersistentVolumes to satisfy those claims.
If a claim comes in for 10GB, then LVM Storage operator is like "oh heck, I've got something to do! I'm going to make a 10GB logical volume!" and then it will glue the request for 10GB together with the actual LV that is 10GB (that they just created) and provide that dynamic provisioning.
The claim is something that lives inside of a namespace (a.k.a. project in OCP, though a project is more than a kubernetes namespace).
>OpenShift Project and Kubernetes Namespace are basically the same: a Project is a Kubernetes namespace with additional annotations (and functionality) to provide multi-tenancy:
>* A cluster admin can inject a template for project creation and other customizations
>* More granular permissions. For example, you can configure it so that a given user can only see a subset of all the projects (with RBAC on Namespaces you can only limit the ability to list all the namespaces or none of them)
Let's go to the **Storage** section on the left nav frame, then go to StorageClasses. We can see here that the LVM Storage operator also created a StorageClass for us, called `lvms-vg1` . You will see the middle column of that StorageClass as listed there shows Provisioner to be `topolvm.io`. topolvm.io is the upstream open source project for LVM Storage, so this assures us this is connected to that.
Now click into that StorageClass `lvms-vg1` and you'll see various info on the default page called **Details**.
You'll notice that the Volume binding mode shows as `WaitForFirstConsumer`. To test this, you can't simply create a PVC (PersistentVolumeClaim) for storage. You actually have to create a pod that attempts to mount/use it, and that pod would be the first consumer. So right now the StorageClass is waiting for something to actually want the storage, instead of preallocating it when you don't even have a pod or VM that's going to use it.
The alternative is to set it for **Immediate** binding. You can change it to Immediate if you'd like to test it out and see what happens, how it's a bit different. But it's not a complicated concept, so you likely get the idea already.
To make this change you can't simply edit the YAML of the existing StorageClass. But you can essentially make a copy and edit the copy, that's the easiest way to often perform actions like this. So copy the entire YAML from line 1 all the way to the end, thus putting it in your OS cut/copy/paste buffer.
Then in the upper right hand top corner next to your OCP console login name, you'll see a dark **+** sign inside a small white circle (in dark mode, that is). Click on that, it's the easy shortcut method to paste and edit some YAML to create new stuff in OCP.
So we clicked on the **+** and pasted that YAML text we copied from the other StorageClass. Then we need to make 2 small strategic changes.
1. First change the name as you cannot have 2 resources with the same name, and also we want to give it a more meaningful name. So edit the name from `lvms-vg1` to something like `lvms-immediate`.
2. You can leave all of the uid and other labels and managedFields and other data from the other StorageClass, it will discard all of that anyway. So again, this is what helps make this an easy method, by copy/pasting and making small changes to YAML code to create new resources in OCP.
3. Change what for us shows as line 50, the very last line, from `volumeBindingMode: WaitForFirstConsumer` to `volumeBindingMode: Immediate`. It must be a capital letter "I".
4. Hit blue CREATE button at the bottom to create this new StorageClass.
Click back on StorageClasses in your left nav frame and you should now see 2 storage classes: `lvms-immediate` and `lvms-vg1`.
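For reference, the trimmed-down copy ends up looking roughly like the sketch below. The name and binding mode are the only fields we deliberately changed; treat the parameters as illustrative examples, since in practice you paste whatever your own `lvms-vg1` YAML contains.
```
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: lvms-immediate
provisioner: topolvm.io
parameters:
  # keep whatever parameters your generated lvms-vg1 class shows here
  csi.storage.k8s.io/fstype: xfs
  topolvm.io/device-class: vg1
reclaimPolicy: Delete
volumeBindingMode: Immediate
```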
## Persistent Volume Claims
Let's now create and discuss PersistentVolumeClaims (PVCs). On your left-hand nav frame click on Storage --> PersistentVolumeClaims section. You may well not see any PVCs listed yet as this is a new system. Let's create one to demonstrate those 2 StorageClasses we discussed and worked with in the previous section.
First, we've been working in the OCP Project of "All Projects", i.e. we're not in any specific project, we're looking at or working with **ALL PROJECTS**. Let's create a new test project to work with and keep out of any important existing project.
To create a new project:
1. Click on the Project: All Projects pull down and at the bottom of that pull down, click on Create Project.
2. Give it name like `test-001` or `my-first-test-project` or whatever you'd like.
You should see the Project name changed to your newly created Project, but also that you stayed on the page listing PersistentVolumeClaims, and this list is empty because the Project is new and thus empty.
Click on the Blue button for **Create PersistentVolumeClaim**.
You'll see 2 choices given to you, **With Form** and **With Data upload form**.
:::info
>**With Form**: lets you define a new blank/empty PVC, with these fields
> * which StorageClass
> * the PVC name
> * Access mode : RWO or RWX
> * Volume Mode : Filesystem or Block
>**With Data upload form**: this gives you a form where you can upload a qcow2 cloud image file that will be copied/loaded into a new PVC, setting the StorageClass and size of your new claim.
:::
We will choose the **With Form** option.
1. Choose the StorageClass to be our new `lvms-immediate` one created moments ago.
2. Give it some name, just a test name like `my-test-pvc-01`.
3. Give it a size, say 5 GiB. Keep in mind if you're using a very small StorageClass to test with, or you've got a lot of space used in your StorageClass already, don't put a bigger size here than is available!
4. Leave default settings for **Access Mode** (RWO) and **Volume mode** (Filesystem).
5. Click CREATE.
It will create the new PVC, and it will almost instantly change the status to **Bound** in 2 places: in the Details view on the right side under **Status**, and near the top left of your view showing **PVC** **my-test-pvc-01** **Bound**.
It transitions very very quickly from "Pending" to "Bound", which is due to the fact that LVs are created very quickly, and then filesystems are created very quickly on those LVs.
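If you'd rather create the same claim with YAML (via the **+** editor or `oc apply -f`), a minimal sketch of the equivalent PVC looks like this, assuming the test project name we suggested earlier:
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-test-pvc-01
  namespace: test-001
spec:
  storageClassName: lvms-immediate
  accessModes:
    - ReadWriteOnce       # RWO, the form's default
  volumeMode: Filesystem
  resources:
    requests:
      storage: 5Gi
```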
You can actually go to the OCP node, open its Terminal tab, log into the node (which is running RHEL CoreOS), and look at the Linux LVM storage info, and you'll see your new LV there.
To do this follow these steps:
1. Go to Compute --> Nodes, then click on the link for the node you want to log into. In our case this is a SNO, so there is only one to choose from.
2. You're taken to Details view by default, so then click on the Terminal tab near the top center of the page.
3. Enter `chroot /host` to change to typical root access. (you'll see the hint to do this right above the terminal window shown!)
4. Try the command `lvs` to list LVs, and then `pvs` to list PVs.
We see something like this on our SNO system:
```
sh-4.4# chroot /host
sh-5.1# pvs
PV VG Fmt Attr PSize PFree
/dev/nvme0n1 vg1 lvm2 a-- 931.51g 93.15g
sh-5.1# vgs
VG #PV #LV #SN Attr VSize VFree
vg1 1 17 0 wz--n- 931.51g 93.15g
sh-5.1# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
f324d501-b6b0-4581-a1c9-265768b5f1aa vg1 Vwi-a-tz-- 10.00g thin-pool-1 0.00
thin-pool-1 vg1 twi-aotz-- <837.54g 0.00 4.22
sh-5.1#
```
It is using what is called thin-provisioning in LVM Storage by default, so that is why you see that **thin-pool** shown there.
Next, let's look at our block IDs, to give you some further details and understanding about the underlying storage use in RHEL CoreOS on the node. Run the commands shown below, with our results.
```
sh-5.1# find /dev | grep f1aa
/dev/vg1/f324d501-b6b0-4581-a1c9-265768b5f1aa
/dev/disk/by-id/dm-name-vg1-f324d501--b6b0--4581--a1c9--265768b5f1aa
/dev/mapper/vg1-f324d501--b6b0--4581--a1c9--265768b5f1aa
sh-5.1#
```
Those three returned files are symlinks to the same thing, so you can use any of these to look at info like the block ID.
```
sh-5.1# blkid /dev/mapper/vg1-f324d501--b6b0--4581--a1c9--265768b5f1aa
/dev/mapper/vg1-f324d501--b6b0--4581--a1c9--265768b5f1aa: PTUUID="90a1c058" PTTYPE="gpt"
sh-5.1#
```
Or you may not see any block ID at all yet, since we haven't put a filesystem on this LV or otherwise touched it; it's created but blank. In that case your results would look more like this below.
```
sh-5.1# blkid /dev/mapper/vg1-f324d501--b6b0--4581--a1c9--265768b5f1aa
sh-5.1#
```
Let's clean up after our tests we've been doing.
We're going to delete the PVC and then the StorageClass we created, in that order.
Go to Storage --> PersistentVolumeClaims. Find our `my-test-pvc-01`, click the three vertical dots at the far right of that PVC, and from the pull-down menu choose **Delete PersistentVolumeClaim**.
Now go to Storage --> StorageClasses. Find our `lvms-immediate`, click the three vertical dots at the far right of that StorageClass, and from the pull-down menu choose **Delete StorageClass**.
>NOTE! You cannot delete a StorageClass that has PVCs using it, i.e. requesting storage from it.
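The same cleanup can be done from a terminal; a quick sketch, assuming the test project name used earlier:
```
# Delete the test claim first, then the StorageClass it used
oc delete pvc my-test-pvc-01 -n test-001
oc delete storageclass lvms-immediate
```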
# OpenShift Virtualization
Now our system has only the one StorageClass again, the way it was when we started our work here today. You might think that you should have an automatic default StorageClass, or that because there is only one, and it was generated by the installation of the LVM Storage operator, it would be set as the default.
But this is not the case!
>NOTE! Just having a single StorageClass does not mean it will automatically be the default or used as the default.
So let's set a default StorageClass now. This has implications for the OpenShift Virtualization operator, since it will want to download cloud images for several operating systems like Fedora, CentOS Stream, and RHEL. But it can't download those if it doesn't have a place to store them!
Follow these steps:
1. Go to Storage --> StorageClasses.
2. Click on your single StorageClass, which is likely called `lvms-vg1`. You will be taken to the Details page.
3. Down the list of information shown on that Details page, you'll see the Annotations section.
4. There will likely be at least 1 annotation already. And then you'll see a small pencil icon.
5. Click on that pencil icon and we're going to add an additional annotation.
6. You'll get a small pop-up window to let you edit/add/remove annotations.
7. These are key:value based
8. This key seems a little ridiculous, but it is `storageclass.kubernetes.io/is-default-class`
9. Enter that Key then. And then the value will just be `true`.
10. Click on the blue SAVE button in the lower right of that pop up window.
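If you prefer the CLI, the standard one-liner for marking a default StorageClass does the same thing as the annotation edit above:
```
# Add the is-default-class annotation to the lvms-vg1 StorageClass
oc patch storageclass lvms-vg1 \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
```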
Nearly the instant we do this, the OpenShift Virtualization operator will detect that it now has a default StorageClass and start downloading those VM template images. It checks on a regular schedule, so it picks up the change quickly.
If you want to see this in action, do this:
1. Click on the left nav frame for Storage --> PersistentVolumeClaims.
2. Make sure your Project is set to "All Projects".
3. You should now see many PVCs for stuff like CentOS 7, CentOS Stream 8, CentOS Stream 9, Fedora, RHEL 7, RHEL 8 and RHEL 9 images, which are cloud-init QCOW2 image files.
## OC CLI and Terminal Operator
If you have the OpenShift CLI tools installed on your local desktop operating system, you can open a terminal and login to the cluster as an admin user and run `oc` commands there.
>NOTE: a handy shortcut to login is given by going to your user name in the far upper right of the OCP console GUI, click your name, and choose "Copy login command".
>That will open a new browser tab, you will have to provide your cluster user id and password credentials again, then you will be given 2 choices to use. Generally we use the first one which has the `oc login --token=xxxxxxxxxxxxxx --server=https://xxxxxxxx:6443` command laid out for you, just copy and paste that into your local OS terminal with the `oc` command in your path, and you should be all logged in, easy peasy.
Run this command in your terminal.
`oc get datavolume -A`
It will look something like this if you catch it in time to still see downloads happening for those OCP Virt guest OS template image files.
```
[my-system]$ oc get datavolume -A
NAMESPACE NAME PHASE PROGRESS RESTARTS AGE
openshift-virtualization-os-images centos-stream8-068c47daa8db ImportInProgress 72.84% 17d
openshift-virtualization-os-images centos-stream9-1f444afc5668 Succeeded 100.0% 14d
openshift-virtualization-os-images centos7-8ea5aa5fcbf1 Succeeded 100.0% 46d
openshift-virtualization-os-images fedora-722ac1d6b4f1 ImportInProgress 92.16% 17d
openshift-virtualization-os-images rhel8-14313b7f990e Succeeded 100.0% 46d
openshift-virtualization-os-images rhel9-b2c52ac49e20 ImportInProgress 78.44% 46d
```
You can also see this by going to the left nav frame Virtualization --> Catalog. Then click on the tab in the upper part of the page to select "Template catalog". You should now see all of the Linux ones with a sort of blue badge that says "Source available". You won't see that for any Windows Server or Desktop OS images, as those are not downloaded or made available by default or from Red Hat. You have to set those up yourself.
Let's test things out by creating a test VM now!
While still in that VM Template catalog, choose Fedora. You will get a pop-up window to create your VM. Choose these settings.
1. **Storage --> Disk source**: leave this as Template default. This means it will use those default template images we just saw were downloading and should now be finished.
2. **Storage --> Disk size**: change this if you want larger, but don't make it smaller than the 30GiB default.
3. If you expand the little twisty for **Optional parameters** then you can set the **CLOUD_USER_PASSWORD** to log into the system once installed.
4. Down towards the bottom you can set the **VirtualMachine name**. By default it will generate something like this, assuming our Fedora template catalog item was chosen: `fedora-blue-iguana-39`.
5. **Quick create VirtualMachine --> Project**: You can't change the project here, so if you're still in the **All Projects** project, then you'll see Project set to `default`.
The moment you click on the blue **Quick create VirtualMachine** button, you will be taken to a new view which is your new VM page and Overview tab. You can watch the Status on the Overview --> Details section. At the start it will say "Provisioning".
>NOTE: While you see "Provisioning", what is happening under the hood is that the system is doing an LVM clone operation of the VM template image into its own new LV.
Then after "Provisioning" you should see the status "Starting" and then "Running" if all goes well.
## Some Nice Features for VMs
Here are some cool things you can do with your virtual machines.
### Quick Action Buttons
While looking at the VirtualMachine details of any VM, you will see in the upper right some blue buttons that represent one-click actions: STOP, RESTART, PAUSE. You can also find these actions on the Actions pull-down menu.
### Console
From the main **Overview** tab you see by default when looking at the VirtualMachine details of any VM, you will see a small view of the live console. Under that you will see a link to "**Open web console**". This will open a new browser tab with just a console terminal session, or open it up in a new browser window, depending on your browser and settings.
You can also reach the terminal by going to the **Console** tab in VM details. This way you don't have to change browser tabs or windows if you don't wish to.
When using either console terminal method, you will see a small twisty option near the top just above the console screen that says "**Guest login credentials**". This is nice because you don't have to remember some random user login and password that were system generated. Just click on that twisty, and it shows you the **User name** and **Password**. Then even nicer is it has the option just to the right of each to copy them to the clipboard. Then below that is the option to Paste them into the terminal for you.
>NOTE: The first time you do this, pay attention to your browser as you should get a small pop-up asking for permission to **Allow** this copy/paste for your clipboard buffer access. You should only have to give that access once per browser app you use.
So follow these steps:
1. Once you're in the console view, click on the **Guest login credentials**.
2. Click on the **Copy to clipboard** icon to the right of **User name**. Then click on the blue **Paste** link/button a little lower down.
3. Repeat step 2 but with the **Password**. Copy it then paste it.
4. That's it! You should be logged in.
If you type `ping google.com` you should see ping responses. In my system, my small mini 4x4 PC has only a single 2.5GbE network port in use. By default, OCP sets up a NAT'd (masqueraded) network that VMs use. So you'll probably see a 10.0.x.y IP address, or something else if you customized your networking with different CIDR blocks in the private 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16 ranges. This is fine if you only need to get out to your other networks or to the Internet.
But you may well want to be able to communicate with your VM from outside the cluster from somewhere else on your lab/home/office network. So wouldn't it be handy or useful to be able to put your VMs right on your main network in your lab, home office, demo center, etc.? Or have options to add one vnet interface for your VMs that is a private host only type network, and then an additional vnet interface that is bridged to your main outside network? Let's see how to do that next in our **NETWORKING** section.
# Networking for VMs
With the defaults we went with in our Assisted Installer, or with what you may choose through your OCP install method, we end up with VMs attached only to that NAT'd (masqueraded) pod network.
Getting to the network config you want for OCP is a flow chart of decisions, starting with: are you using the right kind of OCP network? There are 2 choices for the OCP cluster network: 1) the older **OpenShift SDN**, and 2) the newer **OVN Kubernetes**. To get to the networking setup we want, we have to make sure we're using **OVN Kubernetes**, which is the new default option when installing OCP. The next question is: are you trying to use a different (secondary) network interface on your server, or the primary network interface, meaning the main interface that OCP itself is using? A secondary interface is no big deal. But sharing the single interface that OCP uses for all of its own network communication, which is the situation we're in here, is what we'll show you how to do. It requires a little bit of what we might call "smooth talking OpenShift into doing what we need it to do" to let us do this network configuration.
>NOTE: You may notice if you look at your VM details page that there is an IP address shown there, but it doesn't match the IP address you see when you're logged into your VM and show your network details. To explain why this is, go to your VM details page, then click on the upper tab called **Configuration**. And then it may be hard to see, but in that smaller details info window below, there is also now a set of vertical tabs towards the left side, showing Details, Storage, Network, Scheduling, etc.
>Click on the Network tab.
>You will see Network interfaces info. And in the info listed, you see a column called **TYPE** and the value shown there for our VM network interface is "**Masquerade**". This is why you are shown a different IP address within the VM and outside of it in the VM details page.
## Add a New Bridged Network
So let's add this new network!
First thing we'll do is use YAML to add some additional OCP resource setup.
Here is the YAML template we'll be using. This came from a RH SA.
```
# You must make 2 or 3 changes to this file
# 1 - optional: change the name from "bridge-labnet" to something else if you wish
# 2 - set the namespace
# 3 - update the "netAttachDefName" line to be namespace/name (e.g. my-namespace/bridge-labnet)
---
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: bridge-labnet
  namespace: my-namespace
spec:
  config: |2
    {
      "name": "physnet",
      "topology": "localnet",
      "netAttachDefName": "my-namespace/bridge-labnet",
      "type": "ovn-k8s-cni-overlay",
      "cniVersion": "0.4.0"
    }
```
1. Open up the YAML window by clicking on the "**+**" icon in the upper right.
2. Copy that YAML above and paste it into your YAML window. The comments on lines 1-4 are okay as long as they start with "#".
3. We made these changes:
```
name: openshift-vm-bridge
namespace: test-project01
"netAttachDefName": "test-project01/openshift-vm-bridge",
```
So our resulting YAML looks like this:
```
---
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: openshift-vm-bridge
  namespace: test-project01
spec:
  config: |2
    {
      "name": "physnet",
      "topology": "localnet",
      "netAttachDefName": "test-project01/openshift-vm-bridge",
      "type": "ovn-k8s-cni-overlay",
      "cniVersion": "0.4.0"
    }
```
4. Click on the blue **Create** button at the bottom.
5. That's it. Really!
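A quick CLI check confirms the new NetworkAttachmentDefinition exists in the namespace you chose:
```
# net-attach-def is the short name for NetworkAttachmentDefinition
oc get net-attach-def -n test-project01
```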
Let's go back to the VM we created earlier.
1. Go to Virtualization --> Virtual Machines. Choose your VM.
2. Go to the **Configuration** tab. Go the left side nav for **Network**.
3. Now click the blue button to "**Add network interface**". You get a new pop-up window.
4. First off, choose a name for this network interface. OCP will typically generate a new resource name in a format roughly like this: ` <resource type>-<random word>-<another random word>-<number> `. You can change this to something more meaningful if you like. Or not; it may not matter at all for your lab/demo systems.
5. Leave the model as "**virtio**".
6. Now for the network, click on the pull-down and you should see our new network listed. In our case we see our new network already chosen (as it's the only other network available), which is "**test-project01/openshift-vm-bridge**".
7. If you don't see the new network, it may be that you created this VM in a different project than the one you created the new network in. In my case the project is "test-project01".
8. Leave the "**Type**" as "**Bridge**". Leave the MAC Address blank as it will auto create and assign an unused one.
9. Click "**Save**".
10. If your VM is currently running, then unless you changed a recently added feature flag that allows hot-add of devices like this, the change won't take effect until you restart the VM. You should see your new network interface added, but if the VM is running you should see an orange flag next to the name that says "**Pending**".
11. If your VM is running, then either Stop/Start it or use the Restart button near the upper right.
If you go into your VM now, go to the Console, use the copy/paste login credentials, and then you should be able to run this command and see 2 network interfaces. `ip a` is shorthand for `ip address` or `ip address show`.
```
[cloud-user@rhel9-some-words ~]$ ip a
```
You should now see an `eth0` and `eth1` network interface. `eth0` is the original and is the default NAT'd network. `eth1` is our new bridged network and should have a DHCP address from your actual lab/home/office network.
## Network Gotchas
### Double Default Routes
Go to the VM we've been working with up until now, the one we just added the new bridged network interface to. Go to the console and log in.
Run this command `ip route list` . You may well see something like this:
```
[cloud-user@rhel9-some-words ~]$ ip route list
default via 10.0.2.1 dev eth0 proto dhcp src 10.0.2.2 metric 100
default via 172.16.16.254 dev eth1 proto dhcp src 172.16.16.176 metric 101
10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.2 metric 100
172.16.16.0/24 dev eth1 proto kernel scope link src 172.16.16.176 metric 101
```
Having two, or frankly any more than one, default routes and gateways is problematic for any system, and people hit this problem and get stuck trying to figure it out. The solution here is to modify eth0, in our case, so it does not provide the default route or use its default gateway.
If you had 2 or more default routes on purpose, for a reason, the way systems handle it is that there has to be a tie-breaker, and that is the number you see above on the far right: the `metric 100` or `metric 101` value. It basically goes with golf scoring for who wins, which means the lowest number (score) wins the tie-breaker.
There are a few ways to fix this, but here's what I used. I was still logged into the console window of my VM.
1. I used the command `nmtui`
2. This brings up a Network Manager text UI.
3. I chose `Edit a connection`
4. I chose the `eth0` device
5. I went to the **IPv4 Configuration** section, down to toggle on the option that says "**Never use this network for default route**"
6. That's all, so go down to the "**OK**" to save and quit that edit mode.
7. Then go to **Back**, then **Quit** and you should be back at the command prompt.
8. To take effect you can either use the command `nmcli con down` and the name of the network connection (default is often something like "System eth0"), or just reboot the VM.
9. Now you can log back in to the console if you rebooted the VM, and re-run that command `ip route list`.
10. You should see only one default route and gateway now, and for the `eth1` network interface.
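If you prefer a non-interactive fix, the same change can be made with `nmcli` inside the VM, assuming the default connection name "System eth0" mentioned above:
```
# Tell NetworkManager never to install a default route via this connection
nmcli connection modify "System eth0" ipv4.never-default yes

# Bounce the connection (or reboot) so the change takes effect
nmcli connection down "System eth0" && nmcli connection up "System eth0"
```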
# Back to Talk About Storage
## ODF Lite for S3 Storage
There is a storage option for ODF where you can choose not to install the whole ODF stack but instead install just the Multicloud Object Gateway. Some folks call this "ODF Lite". What this gives you essentially is a way to create AWS S3-type storage, i.e. object storage. An example use case for this would be installing Quay for a demo or your home lab, since Quay needs S3-type object storage.
## NFS
Let's go over how to setup NFS storage for your cluster to access.
==Source for notes on this section: https://hackmd.io/JoGMdMJlQ_2H4vUuJpu0cw==
Now, this next method uses a helm chart for setup, and the easiest way to use helm here is the Web Terminal we installed earlier. So go ahead and click on the **>_** icon at the upper right to open the OCP terminal.
You should see something like this:
```
Welcome to the OpenShift Web Terminal. Type "help" for a list of installed CLI tools.
bash-4.4 ~ $
```
Let's make sure we have helm available.
```
bash-4.4 ~ $ helm version
version.BuildInfo{Version:"v3.11", GitCommit:"", GitTreeState:"", GoVersion:"go1.21rc3"}
```
Okay, good, helm is installed and recent/latest version. Let's do some pre-req checks now.
```
# Tell helm where to find the csi-driver-nfs chart repository
helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
# Ask helm to list the repo contents. You should see v4.6.0 and several other versions.
helm search repo -l csi-driver-nfs
```
You will see the chart and app versions available listed for you. Let's go ahead and follow option 2 in our source guide (referenced just above) to install the 4.6.0 version.
```
# Option 2 - Install on a Single Node Openshift (SNO) cluster.
# When there is only one node in the cluster, we don't need multiple replicas of the controller pod
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs --version v4.6.0 \
--create-namespace \
--namespace csi-driver-nfs \
--set controller.runOnControlPlane=true \
--set externalSnapshotter.enabled=true \
--set externalSnapshotter.customResourceDefinitions.enabled=false
```
Now we need to grant some additional permissions to the ServiceAccounts.
```
oc adm policy add-scc-to-user privileged -z csi-nfs-node-sa -n csi-driver-nfs
oc adm policy add-scc-to-user privileged -z csi-nfs-controller-sa -n csi-driver-nfs
```
Just to verify things so far, let's go back to your OCP web GUI. Go to **Workloads** --> **Pods**. Change your project to `csi-driver-nfs`. You should now see 3 pods running and ready, and hopefully none are in the CrashLoopBackOff state.
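Or run the same check from the web terminal:
```
# All pods in the csi-driver-nfs namespace should be Running and Ready
oc get pods -n csi-driver-nfs
```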
Next, we're ready for some YAML. Click on the **+** icon in the upper right and let's add some YAML!
Paste the YAML below into the YAML editor.
There are potentially 4 lines you'd need to change for your NFS server, share, and folder names.
`name: nfs-csi` : You can change this to whatever StorageClass name you'd like, but this name is good enough for us, so we're keeping it.
`server: nfs-server.example.com` : this will be your NFS server's IP/FQDN
`share: /nfs-share/example-dir` : this will be your NFS server's exported directory such as /nfs-export or /volume/share
`subDir: 'mysubdir/mycluster/${pvc.metadata.namespace}-${pvc.metadata.name}-${pv.metadata.name}'` :
this is the Folder/Subdir name template we're going to create as a subdirectory path to keep our stuff in its own folder.
It's following this template: **`pvcnamespace-pvcname-pvname`**
```
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-csi
provisioner: nfs.csi.k8s.io
parameters:
  server: nfs-server.example.com   ### NFS server's IP/FQDN
  share: /nfs-share/example-dir    ### NFS server's exported directory
  subDir: 'mysubdir/mycluster/${pvc.metadata.namespace}-${pvc.metadata.name}-${pv.metadata.name}'   ### Folder/subdir name template
reclaimPolicy: Delete
volumeBindingMode: Immediate
```
> :warning: **NOTE**
> On my Synology system where I have an NFS export, I had to use these values: I have a storage volume called **volume1** and a big shared storage space called **share1**, and I wanted to store stuff in its own subdirectory for this cluster, /ocp_nfs/ocp4a/, followed by the rest of the template (pvcnamespace-pvcname-pvname). So I had these values for my StorageClass YAML below.
**`share: /volume1/share1`**
**`subDir: 'ocp_nfs/ocp4a/${pvc.metadata.namespace}-${pvc.metadata.name}-${pv.metadata.name}'`**
>
> But if you were hosting NFS from a Linux server, then it'd be a little different like this. You'd only need the exported directory for share, such as `/mynfsexport` but then you can add additional subdirectories here that you wish to such as `/mynfsexport/subdir01/subdir02`
**`share: /mynfsexport/subdir01/subdir02`**
> And then for subDir you could have just the main template part:
**`subDir: '${pvc.metadata.namespace}-${pvc.metadata.name}-${pv.metadata.name}'`**
Then once done with editing, you'd click on the blue **Create** button to create this new StorageClass.
Now let's test it out by creating a **PersistentVolumeClaim**. First change your project to an appropriate one by selecting the project from the Project selector in the upper left. In our case our project is **`test-project01`**.
Now go to the left nav frame and choose **Storage** --> **PersistentVolumeClaims**.
Click on the blue **Create PersistentVolumeClaim** button.
Set the fields as follows:
```
StorageClass: nfs-csi
PersistentVolumeClaim name: whatever name you wish, maybe "my-test-nfs-pvc01".
Access Mode: Shared access (RWX)
```
The size is largely irrelevant because there is nothing that will actually enforce how much space this can take. But we set ours to 5GiB.
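The equivalent YAML for this claim would look roughly like the sketch below, using the project and names from our example:
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-test-nfs-pvc01
  namespace: test-project01
spec:
  storageClassName: nfs-csi
  accessModes:
    - ReadWriteMany       # RWX, i.e. "Shared access"
  resources:
    requests:
      storage: 5Gi
```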
> :warning: **NOTE**
> We generally like to set the Access Mode to **Shared access (RWX)** because it's very common for use cases like 3 or 10 web servers all pointing at the same directory for their HTML files and other common data. It's not a big deal though; it's not really enforced, so don't sweat it too much, and if you don't set this you'll probably still be okay with your storage. We like to set it here so we can see that this is the mode we need and want it to be (yes, even though it's not really enforceable).
>
>>But take care: because these things aren't really enforced, other folks could come along and mess with the files and data you're storing there. What we're saying is that there aren't a lot of safeguards with NFS storage access like this; it's fairly wide open.
If you find that your new test PVC stays at **Pending** for more than a few seconds, you may have some mistakes in the YAML for your NFS server or export/share or subdirectory name(s).
1. In that case you must first delete the test PVC you just created.
2. Then go to the `nfs-csi` StorageClass, copy the YAML for it so it's in your clipboard buffer, then delete this `nfs-csi` StorageClass.
3. Now go verify your NFS server IP/FQDN, the export/share name, and that the subdirectory(ies) you want to use exist and are all spelled correctly.
4. Next click on the **+** YAML creator/editor, paste your copied YAML from before, and make the necessary corrections.
5. Then once done with editing, click on the blue **Create** button to create the new StorageClass again.
Repeat the steps above then to create a new test PVC. If it works you should see your PVC almost immediately show the status of **Bound**.
Did it work now? Cool! If it didn't, repeat the steps to delete the PVC, delete the StorageClass and then recreate the StorageClass and test PVC.
If you go to your NFS server mount point on another system and look in the export/share directory and subdir folder, you'll see something like this directory there now.
`/mysubdir/mycluster/test-project01-my-test-nfs-pvc01-pvc-0b123665-5831-4c60-9aa7-75738dd9f4ac`
This is because:
our PVC namespace was `test-project01`
our PVC was `my-test-nfs-pvc01`
our PV the system then created for us is `pvc-0b123665-5831-4c60-9aa7-75738dd9f4ac`
This all follows that template we defined in that YAML.
This one `subDir: 'mysubdir/mycluster/${pvc.metadata.namespace}-${pvc.metadata.name}-${pv.metadata.name}'` .
Now that you can see more clearly the results, you may wish to modify those according to your needs in the future when using NFS storage for your projects, making it shorter and more terse, or longer and more verbose, as desired.
Let's try something else now, to take a snapshot of our test VM we created earlier.
1. Go to the left side nav frame **Virtualization** --> **VirtualMachines**.
2. Select any VM you have, you should be taken to the VirtualMachine details page.
3. There are a couple ways to quickly take a snapshot of your VM.
1. On the main Overview tab, you'll see on the middle right side of your screen a blue link for Snapshots (0). Yours may show some number greater than zero in the parentheses if you've already taken any snapshots. Click on that blue link.
2. Or go the Snapshots tab on the horizontal menu bar starting with **Overview**.
4. Click on the blue button near upper left that says **Take snapshot**.
If you want you can change any of the fields provided there, but for a quick test don't bother changing the random name the system generated for us, or any other field, and just click on **Save**. The snapshot will start right away.
> :memo: **NOTEWORTHY** You might wonder why we weren't asked where to save the snapshot we just took; there was no field to enter or select a storage location. To have the ability to create snapshots at all, your storage must be CSI storage. CSI is the newest type of storage interface, but it has been around for years now, so it's not bleeding edge, it's very stable. But there are some storage types out there that are not CSI, and those will not help us here to create snapshots.
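Under the hood each snapshot shows up as a `VirtualMachineSnapshot` resource, backed by a CSI `VolumeSnapshot`, so you can also watch them from the CLI. A sketch assuming our test project:
```
# KubeVirt-level snapshot objects for VMs in this project
oc get virtualmachinesnapshot -n test-project01

# The CSI volume snapshots backing them
oc get volumesnapshot -n test-project01
```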
If you'd like to try another experiment, how about we try to create a new VM and use our NFS storage for storing that image? Let's do it!
1. Make sure you are in the right Project if you have special networks setup, etc.
2. Go to left side nav frame **Virtualization** --> **Catalog**.
3. Choose one of your **Source Available** templates. Perhaps Fedora. Let's create a Fedora VM! So click on the Fedora VM template.
4. You could just click the blue **Quick create VirtualMachine**, but don't. We need to customize our VM configuration first, so instead click on the **Customize VirtualMachine**.
5. Go to the Disks horizontal tab. On the disk in the list called **rootdisk**, click on the 3 dots to the far right, and select **Edit**.
6. Go towards the bottom to the field **StorageClass**, and from the pull-down choose the NFS storage class we setup just recently.
7. Now click on the blue **Save** button. Then click on the blue **Create VirtualMachine** button.
8. Now, let's watch the status. You have to be a bit quick to catch this, but with NFS going over the network you should have a little time as it's much slower than using local storage like SATA or NVMe SSD flash storage, or even spinning rust hard disk drives (HDDs).
9. Click on the **>_** icon in the upper right to open our web terminal, unless you already still have it open at the bottom of your window. In either case, go to the terminal.
10. Enter this command `oc get datavolume -A` to list all current data volumes in all namespaces.
11. You should see a new one that shows a **Phase** status of **`CloneInProgress`**.
12. And to the right of that you should see a progress percentage value, and if you did this quickly enough it should be maybe only in the 10-50% value range.
13. On the far right you'll see an **Age** value that should be very recent, probably only two minutes or less in age.
14. You can also track the progress of long running processes like this particular VM creation, by entering the CLI command `oc get pod -A | grep cdi`.
15. And if the process is long enough and you want to see what's happening from the logs, you can also use the CLI command `oc logs -f` followed by the name of the relevant CDI pod.
> :bulb: **TIP** If you go back to your list of VirtualMachines, you'll see that they have a blue badge next to the name that says **VM**.
> And if you go to the left side nav frame and select **Workloads** --> **Pods**, then you will see your VM pods listed there, and in the middle column of **Owner** they will show a blue badge of **VMI**. This stands for Virtual Machine Instance.
> So the difference between a **VM** and a **VMI**, if you're familiar with VMware vSphere systems and their VMs, it's like the difference between a .vmx config file and the virtual machine that is up and running. In OpenShift, a VMI means the virtual machine is running, and the VM is the definition of what the virtual machine is configured to be.