# Hands-on Install BeeGFS on Virtual Machines
>[name=Cheng-Chin Chiang] [time= June 20, 2024]<chcchiang@asiaa.sinica.edu.tw>
Introduction
---
Before installing BeeGFS on real storage hardware and servers, I tested the installation on virtual machines. There are several virtual-machine applications we can use, such as [UTM](https://mac.getutm.app/) (designed for macOS), [VirtualBox](https://www.virtualbox.org/wiki/Downloads) (free and best for hobbyists), [Parallels](https://www.parallels.com/products/desktop/) (best for Apple silicon devices), or [VMware](https://www.vmware.com/products/desktop-hypervisor.html) (best for corporate users and IT professionals).
Install VirtualBox
---
For macOS, download and install VirtualBox from [the download page](https://www.virtualbox.org/wiki/Downloads). For Ubuntu, install it via the commands:
```
$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install virtualbox
```
Check the VirtualBox version:
```
$ vboxmanage --version
```
Start the VirtualBox:
```
$ virtualbox
```
Install Vagrant
---
Vagrant is a tool for building and distributing development environments. It makes building an OS environment on VirtualBox much more efficient, and one can interact with the virtual environment easily, without extra latency or additional settings. On macOS, we can install Vagrant with Homebrew:
```
$ brew install vagrant
```
For Ubuntu, install it with the command:
```
$ sudo apt-get -y install vagrant
```
Note that Vagrant cannot work with VirtualBox on Mac `M1/M2/M3` (Apple silicon) CPUs; it only works on Mac `Intel` CPUs. One can browse the [Vagrant Cloud](https://app.vagrantup.com/boxes/search), choose a prebuilt OS environment, and then download and install it on VirtualBox.
Create/Configure Virtual Machines
---
In this experiment, we install `AlmaLinux8` on four virtual machines. According to the quick start guide in the [BeeGFS Documentation](https://doc.beegfs.io/latest/quick_start_guide/quick_start_guide.html#step-1-package-download-and-installation), we need four kinds of host services: Management Server `(node01)`, Metadata Server `(node02)`, Storage Server `(node03)`, and Client `(node04)`, as shown in the following architectural overview:

First, we browse the [Vagrant Cloud](https://app.vagrantup.com/boxes/search), choose the [AlmaLinux 8 box](https://app.vagrantup.com/almalinux/boxes/8), and initialize it via the command:
```
$ vagrant init almalinux/8
```
Second, we modify the `Vagrantfile`, assign the private IP address, and enable the public (bridged) network, with the following settings:
:::info
:::spoiler `Vagrantfile`
```ruby=
# -*- mode: ruby -*-
# vi: set ft=ruby :
# All Vagrant configuration is done below. The "2" in Vagrant.configure
# configures the configuration version (we support older styles for
# backwards compatibility). Please don't change it unless you know what
# you're doing.
Vagrant.configure("2") do |config|
# The most common configuration options are documented and commented below.
# For a complete reference, please see the online documentation at
# https://docs.vagrantup.com.
# Every Vagrant development environment requires a box. You can search for
# boxes at https://vagrantcloud.com/search.
config.vm.box = "almalinux/8"
# Disable automatic box update checking. If you disable this, then
# boxes will only be checked for updates when the user runs
# `vagrant box outdated`. This is not recommended.
# config.vm.box_check_update = false
# Create a forwarded port mapping which allows access to a specific port
# within the machine from a port on the host machine. In the example below,
# accessing "localhost:8080" will access port 80 on the guest machine.
# NOTE: This will enable public access to the opened port
# config.vm.network "forwarded_port", guest: 80, host: 8080
# Create a forwarded port mapping which allows access to a specific port
# within the machine from a port on the host machine and only allow access
# via 127.0.0.1 to disable public access
# config.vm.network "forwarded_port", guest: 80, host: 8080, host_ip: "127.0.0.1"
# Create a private network, which allows host-only access to the machine
# using a specific IP.
config.vm.network "private_network", ip: "192.168.56.11"
# Create a public network, which generally matched to bridged network.
# Bridged networks make the machine appear as another physical device on
# your network.
config.vm.network "public_network"
# Share an additional folder to the guest VM. The first argument is
# the path on the host to the actual folder. The second argument is
# the path on the guest to mount the folder. And the optional third
# argument is a set of non-required options.
# config.vm.synced_folder "../data", "/vagrant_data"
# Provider-specific configuration so you can fine-tune various
# backing providers for Vagrant. These expose provider-specific options.
# Example for VirtualBox:
#
# config.vm.provider "virtualbox" do |vb|
# # Display the VirtualBox GUI when booting the machine
# vb.gui = true
#
# # Customize the amount of memory on the VM:
# vb.memory = "1024"
# end
#
# View the documentation for the provider you are using for more
# information on available options.
# Enable provisioning with a shell script. Additional provisioners such as
# Ansible, Chef, Docker, Puppet and Salt are also available. Please see the
# documentation for more information about their specific syntax and use.
# config.vm.provision "shell", inline: <<-SHELL
# apt-get update
# apt-get install -y apache2
# SHELL
end
```
:::
Finally, we start VirtualBox and then bring the machine up from the `Vagrantfile` with the command:
```
$ vagrant up
```

When prompted, choose the bridged network interface by typing `1`, i.e., `enp4s0`, and then wait for the build process to finish. For more Vagrant usage, see the [Vagrant Cheat Sheet](https://gist.github.com/wpscholar/a49594e2e2b918f4d0c4). Once the build process is finished, we can see the machine already running in VirtualBox, as in the following snapshot:

In the same way, we build and run the other three virtual machines in VirtualBox, with a different private IP address for each machine.
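One convenient way to do this (a hypothetical layout, not something Vagrant requires) is to keep one directory per node, each with its own `Vagrantfile` that differs only in the private IP address:
```
# Hypothetical layout: ~/vms/node01 holds the Vagrantfile edited above; repeat with
# 192.168.56.13 for node03 and 192.168.56.14 for node04
$ mkdir -p ~/vms/node02 && cd ~/vms/node02
$ cp ~/vms/node01/Vagrantfile .
$ sed -i 's/192\.168\.56\.11/192.168.56.12/' Vagrantfile   # on macOS use: sed -i ''
$ vagrant up
```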
Now we SSH into the virtual machine:
```
$ vagrant ssh
```
If we type the command
```
$ ip addr
```
we get the following message:
:::info
```bash
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:16:38:c6 brd ff:ff:ff:ff:ff:ff
altname enp0s3
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute eth0
valid_lft 86383sec preferred_lft 86383sec
inet6 fe80::a00:27ff:fe16:38c6/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:60:b1:4e brd ff:ff:ff:ff:ff:ff
altname enp0s8
inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute eth1
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fe60:b14e/64 scope link
valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:0b:78:43 brd ff:ff:ff:ff:ff:ff
altname enp0s9
inet 172.17.22.135/24 brd 172.17.22.255 scope global dynamic noprefixroute eth2
valid_lft 86383sec preferred_lft 86383sec
inet6 fe80::a00:27ff:fe0b:7843/64 scope link
valid_lft forever preferred_lft forever
```
:::
Note that the `node01` server IP is `192.168.56.11`. We set the hostname with the following commands (remember to run them as root):
```
$ su -              # password: vagrant
$ hostnamectl set-hostname node01.asiaa.sinica.edu.tw
$ hostname
node01.asiaa.sinica.edu.tw
```
Next, we map the node names to their IP addresses by editing the file `/etc/hosts`
```
$ vi /etc/hosts
```
adding the following lines
```
192.168.56.11 node01.asiaa.sinica.edu.tw node01
192.168.56.12 node02.asiaa.sinica.edu.tw node02
192.168.56.13 node03.asiaa.sinica.edu.tw node03
192.168.56.14 node04.asiaa.sinica.edu.tw node04
```
and saving the file. We have now established a simple four-node cluster. We can verify it by connecting to each node with the commands
```
$ ssh node01
$ ssh node02
$ ssh node03
$ ssh node04
```
on each node. Note that you should clean up the file `/root/.ssh/known_hosts` if an old SSH host key exists, for example after a node's machine has been rebuilt or replaced.
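A stale host key can be removed with `ssh-keygen` (a sketch; run it on whichever node reports the key mismatch):
```
$ ssh-keygen -R node02           # remove the stale entry by hostname
$ ssh-keygen -R 192.168.56.12    # and by IP address, if present
```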
>[!Caution]
On AlmaLinux9, we need extra steps to enable root login via SSH: edit the file `/etc/ssh/sshd_config`, set `PermitRootLogin yes`, and then run `systemctl restart sshd`.
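A minimal non-interactive sketch of those steps (assuming the stock `sshd_config`, where the directive may be present but commented out):
```
$ sed -i 's/^#\?PermitRootLogin .*/PermitRootLogin yes/' /etc/ssh/sshd_config
$ systemctl restart sshd
```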
Install BeeGFS
---
Following the [BeeGFS Quick Start Guide](https://doc.beegfs.io/latest/quick_start_guide/quick_start_guide.html#step-3-basic-configuration), we install BeeGFS on these four nodes step by step.
### Step 1: Package Download and Installation
Download [this repository file](https://www.beegfs.io/release/beegfs_7.4.3/dists/beegfs-rhel8.repo) and store it in the directory `/etc/yum.repos.d` on **all nodes**:
```
$ yum install wget -y
$ wget -O /etc/yum.repos.d/beegfs_rhel8.repo https://www.beegfs.io/release/beegfs_7.4.3/dists/beegfs-rhel8.repo
```
For AlmaLinux9:
```
$ wget -O /etc/yum.repos.d/beegfs_rhel9.repo https://www.beegfs.io/release/beegfs_7.4.3/dists/beegfs-rhel9.repo
```
Now we can install the packages from the repository:
```
$ ssh root@node01 yum install beegfs-mgmtd # management service
$ ssh root@node02 yum install beegfs-meta libbeegfs-ib # metadata service; libbeegfs-ib is only required for RDMA
$ ssh root@node03 yum install beegfs-storage libbeegfs-ib # storage service; libbeegfs-ib is only required for RDMA
$ ssh root@node04 yum install beegfs-client beegfs-helperd beegfs-utils # client and command-line utils
```
### Step 2: Basic Configuration
#### Management Service
The management service needs to know where it can store its data. It will only store node information like connectivity data, so it will not require much storage space, and its data access is not performance critical.
```
$ ssh root@node01
$ /opt/beegfs/sbin/beegfs-setup-mgmtd -p /data/beegfs/beegfs_mgmtd
```
#### Metadata Service
The metadata service needs to know where it can store its data and where the management service is running. Typically, you will have multiple metadata services running on different machines.
Optionally, you can also define a custom numeric `metadata service ID` (range `1..65535`). As this service is running on a server with the name `node02` in our example, we will also pick number `2` as the metadata service ID here.
```
$ ssh root@node02
$ /opt/beegfs/sbin/beegfs-setup-meta -p /data/beegfs/beegfs_meta -s 2 -m node01
```
#### Storage Service
The storage service needs to know where it can store its data and how to reach the management server. Typically, you will have multiple storage services running on different machines and/or multiple storage targets (e.g., multiple RAID volumes) per storage service.
Optionally, you can also define a custom numeric `storage service ID` and numeric storage target ID (both in range `1..65535`). As this service is running on a server with the name node03 in our example, we will pick number `3` as the ID for this storage service and we will use `301` as the `storage target ID` to show that this is the first target (`01`) of storage service 3.
```
$ ssh root@node03
$ /opt/beegfs/sbin/beegfs-setup-storage -p /mnt/myraid1/beegfs_storage -s 3 -i 301 -m node01
```
#### Client
The client needs to know where the management service is running.
```
$ ssh root@node04
$ /opt/beegfs/sbin/beegfs-setup-client -m node01
```
The client mount directory is defined in a separate configuration file, which is used by the beegfs-client service startup script. By default, BeeGFS will be mounted to `/mnt/beegfs`.
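That separate file is `/etc/beegfs/beegfs-mounts.conf`; after running the setup script it should contain roughly the following line (mount point first, then the client configuration file):
```
$ cat /etc/beegfs/beegfs-mounts.conf
/mnt/beegfs /etc/beegfs/beegfs-client.conf
```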
#### Connection authentication
We create a file that contains a shared secret:
```
$ ssh root@node01
$ dd if=/dev/random of=/etc/beegfs/connauthfile bs=128 count=1
```
Ensure the file is only readable by the root user
```
$ chown root:root /etc/beegfs/connauthfile
$ chmod 400 /etc/beegfs/connauthfile
```
Then copy the file to all hosts in the cluster (node01~node04).
```
$ cd /etc/beegfs
$ scp connauthfile root@node02:/etc/beegfs
$ scp connauthfile root@node03:/etc/beegfs
$ scp connauthfile root@node04:/etc/beegfs
```
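Equivalently, a small loop could push the file to the remaining nodes (a sketch):
```
$ for n in node02 node03 node04; do scp /etc/beegfs/connauthfile root@$n:/etc/beegfs/; done
```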
Finally, we edit the configuration files of all services currently in use (including helperd/mon) on all hosts in the cluster and set `connAuthFile = /etc/beegfs/connauthfile`, i.e., the absolute path to the file that contains the shared secret.
```
$ ssh root@node01 vi /etc/beegfs/beegfs-mgmtd.conf
$ ssh root@node02 vi /etc/beegfs/beegfs-meta.conf
$ ssh root@node03 vi /etc/beegfs/beegfs-storage.conf
$ ssh root@node04 vi /etc/beegfs/beegfs-helperd.conf
$ ssh root@node04 vi /etc/beegfs/beegfs-client.conf
```
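Instead of editing each file interactively, a non-interactive sketch (assuming the `connAuthFile` key is already present, commented or not, in each stock config file) could look like this:
```
# Example for node01; repeat with the corresponding config file(s) on each node
$ sed -i 's|^#\?connAuthFile.*|connAuthFile = /etc/beegfs/connauthfile|' /etc/beegfs/beegfs-mgmtd.conf
$ grep connAuthFile /etc/beegfs/beegfs-mgmtd.conf   # verify the setting
```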
### Step 3: Service Startup
BeeGFS services can be started in arbitrary order by using the corresponding `init.d` scripts or `systemctl` service units. All services create log files (`/var/log/beegfs-xxxx`).
```
$ ssh root@node01 systemctl start beegfs-mgmtd
$ ssh root@node02 systemctl start beegfs-meta
$ ssh root@node03 systemctl start beegfs-storage
$ ssh root@node04 systemctl start beegfs-helperd
```
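It is worth verifying each service after startup, for example on the management node (the log file names follow the `/var/log/beegfs-xxxx` pattern mentioned above):
```
$ ssh root@node01 systemctl status beegfs-mgmtd
$ ssh root@node01 tail -n 20 /var/log/beegfs-mgmtd.log
```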
>[!Caution]
Note that you need to check whether the Linux ==kernel-devel== package is installed and available. Also, the current kernel-devel version of AlmaLinux9 is not compatible with BeeGFS; see the [discussion on this webpage](https://groups.google.com/g/fhgfs-user/c/PeXrEL_WkLw). In other words, ++we can only install the ==BeeGFS client== on an ==AlmaLinux8== server++!
```
$ ssh root@node04
$ yum install kernel-devel
# or, to match the running kernel version exactly:
$ yum install kernel-devel-$(uname -r)
$ rpm -qa | grep kernel # Check current kernel packages and version
$ uname -msr # Check current kernel version
$ cd /lib/modules/4.18.0-553.el8_10.x86_64
$ ls -l build
$ rm -f build # If the build symbolic link is invalid, delete and reset it
$ ln -s /usr/src/kernels/4.18.0-553.5.1.el8_10.x86_64 /lib/modules/4.18.0-553.el8_10.x86_64/build
$ /etc/init.d/beegfs-client rebuild # Test if the build process is successful
```
>[!Caution]
Also, we need to disable `SELinux` by editing the file `/etc/selinux/config` and setting `SELINUX=disabled`.
```
$ ssh root@node04
$ vi /etc/selinux/config
```
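A minimal non-interactive sketch of the same change (the config change only takes effect after a reboot; `setenforce 0` switches the current session to permissive mode in the meantime):
```
$ sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
$ setenforce 0    # permissive for the current session; disabled applies after reboot
```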
Finally, we can start the `beegfs-client` service
```
$ ssh root@node04
$ systemctl start beegfs-client
$ systemctl status beegfs-client
```
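If the service starts cleanly, the kernel module should be loaded and the file system mounted:
```
$ lsmod | grep beegfs     # the beegfs kernel module should be listed
$ mount | grep beegfs     # /mnt/beegfs should appear with type beegfs
```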
### Step 4: Check Connectivity
Check the detected network interfaces and transport protocols from a client node with the following commands:
```
$ ssh node04
$ beegfs-ctl --listnodes --nodetype=meta --nicdetails
```
:::info
```
node02.asiaa.sinica.edu.tw [ID: 2]
Ports: UDP: 8005; TCP: 8005
Interfaces:
+ eth1[ip addr: 192.168.56.12; type: TCP]
+ eth2[ip addr: 172.17.22.120; type: TCP]
+ eth0[ip addr: 10.0.2.15; type: TCP]
Number of nodes: 1
Root: 2
```
:::
```
$ beegfs-ctl --listnodes --nodetype=storage --nicdetails
```
:::info
```
node03.asiaa.sinica.edu.tw [ID: 3]
Ports: UDP: 8003; TCP: 8003
Interfaces:
+ eth1[ip addr: 192.168.56.13; type: TCP]
+ eth2[ip addr: 172.17.22.145; type: TCP]
+ eth0[ip addr: 10.0.2.15; type: TCP]
Number of nodes: 1
```
:::
```
$ beegfs-ctl --listnodes --nodetype=client --nicdetails
```
:::info
```
26AD-666B1825-node04.asiaa.sinica.edu.tw [ID: 2]
Ports: UDP: 8004; TCP: 0
Interfaces:
+ eth1[ip addr: 192.168.56.14; type: TCP]
+ eth2[ip addr: 172.17.22.212; type: TCP]
+ eth0[ip addr: 10.0.2.15; type: TCP]
Number of nodes: 1
```
:::
```
$ beegfs-net
```
:::info
```
mgmt_nodes
=============
node01.asiaa.sinica.edu.tw [ID: 1]
Connections: TCP: 1 (192.168.56.11:8008);
meta_nodes
=============
node02.asiaa.sinica.edu.tw [ID: 2]
Connections: TCP: 1 (192.168.56.12:8005);
storage_nodes
=============
node03.asiaa.sinica.edu.tw [ID: 3]
Connections: TCP: 1 (192.168.56.13:8003);
```
:::
```
$ beegfs-check-servers
```
:::info
```
Management
==========
node01.asiaa.sinica.edu.tw [ID: 1]: reachable at 192.168.56.11:18463 (protocol: TCP)
Metadata
==========
node02.asiaa.sinica.edu.tw [ID: 2]: reachable at 192.168.56.12:17695 (protocol: TCP)
Storage
==========
node03.asiaa.sinica.edu.tw [ID: 3]: reachable at 192.168.56.13:17183 (protocol: TCP)
```
:::
```
$ beegfs-df
```
:::info
```
METADATA SERVERS:
TargetID   Cap. Pool        Total         Free    %        ITotal       IFree    %
========   =========        =====         ====    =        ======       =====    =
       2         low      18.3GiB      16.3GiB   89%          9.6M        9.5M   99%

STORAGE TARGETS:
TargetID   Cap. Pool        Total         Free    %        ITotal       IFree    %
========   =========        =====         ====    =        ======       =====    =
     301   emergency      18.3GiB      16.3GiB   89%          9.6M        9.6M  100%
```
:::
If we run `df -h` on the client side (node04):
:::info
```
Filesystem Size Used Avail Use% Mounted on
devtmpfs 458M 0 458M 0% /dev
tmpfs 476M 0 476M 0% /dev/shm
tmpfs 476M 13M 463M 3% /run
tmpfs 476M 0 476M 0% /sys/fs/cgroup
/dev/sda4 19G 2.4G 16G 13% /
/dev/sda3 1014M 166M 849M 17% /boot
/dev/sda2 200M 5.9M 194M 3% /boot/efi
tmpfs 96M 0 96M 0% /run/user/1000
vagrant 916G 151G 766G 17% /vagrant
beegfs_nodev 19G 2.1G 17G 12% /mnt/beegfs
```
:::
We can see that `/mnt/beegfs` is mounted, which is the BeeGFS storage mount point.
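As a quick sanity check, we can write a file into the mount and inspect its entry information (the file name here is just an example):
```
$ echo "hello beegfs" > /mnt/beegfs/testfile
$ beegfs-ctl --getentryinfo /mnt/beegfs/testfile   # shows the owning metadata node and stripe pattern
```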
Benchmarking a BeeGFS System
---
According to the [BeeGFS document about benchmark measuring](https://doc.beegfs.io/latest/advanced_topics/benchmark.html), the storage targets benchmark is intended to determine the maximum theoretical performance of BeeGFS on the storage targets or to detect defective or misconfigured storage targets. The storage benchmark is started and monitored with the `beegfs-ctl` tool.
The following example starts a write benchmark on all targets of all BeeGFS storage servers with an IO block size of `512 KB`, using `8 threads` per target, each of which will write `10 GB` of data to its own file.
```
$ ssh root@node04
$ cd /mnt/beegfs
$ beegfs-ctl --storagebench --alltargets --write --blocksize=512K --size=10G --threads=8
```
To query the benchmark status/result of all targets, execute the command below.
```
$ beegfs-ctl --storagebench --alltargets --status
```
:::info
```
Server benchmark status:
Running: 1
Write benchmark results:
Min throughput: 1779703 KiB/s nodeID: node03.asiaa.sinica.edu.tw [ID: 3], targetID: 301
Max throughput: 1779703 KiB/s nodeID: node03.asiaa.sinica.edu.tw [ID: 3], targetID: 301
Avg throughput: 1779703 KiB/s
Aggregate throughput: 1779703 KiB/s
```
:::
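A corresponding read benchmark over the files just written can be started in the same way (a sketch with the same parameters):
```
$ beegfs-ctl --storagebench --alltargets --read --blocksize=512K --size=10G --threads=8
```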
The generated files will not be automatically deleted when a benchmark is complete. You can delete them by using the following command.
```
$ beegfs-ctl --storagebench --alltargets --cleanup
```
To exit the Vagrant virtual machine, we type
```
$ exit
```
in the virtual machine's terminal. To shut down the virtual machine, we type
```
$ vagrant halt
```
The virtual machine will then be shut down.