:::success
# LS Lab 6 - Fault Tolerant Storage & Backup
:::

# Fault Tolerant Storage

## Task 1 - Take a pick

:::info
Before choosing, briefly describe and explain the difference between block storage, file storage and object storage.
:::

These are storage formats that save, organize, and present data in different ways. File storage organizes and presents data as a hierarchy of files in folders. Block storage chunks data into evenly sized, arbitrarily organized volumes (blocks). Object storage manages data as objects, each linked to its associated metadata.

:::info
Take a pick: **DRBD**
:::

## Task 2 - Fault tolerant setup

:::info
Create and configure a necessary amount of VMs for your option. Install necessary packets and then configure a distributed block device.
:::

To begin with, let's create two virtual machines (I use plain Ubuntu 20.04 images left over from my first lab). The second VM is a clone of the first, so I changed the machine-id on one of them so that the two machines get different IPs. For each machine, additional storage (vdb) needs to be created in the Virtual Manager settings. I do this to make life easier for myself: no need to fiddle with the `dd` command or touch the machines' existing file systems.
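Since a clone shares its machine-id with the original, the id must be regenerated on the clone. The standard systemd recipe is two commands (shown only in the comments below, since they require root and are destructive); the runnable part is just a hedged, non-destructive demonstration of what a valid id looks like:

```shell
# On the cloned VM (as root) the usual recipe is:
#   rm /etc/machine-id && systemd-machine-id-setup
# A valid machine-id is 32 lowercase hex characters; build a sample one:
new_id=$(echo "vm2-clone" | md5sum | cut -c1-32)
echo "$new_id" | grep -Eq '^[a-f0-9]{32}$' && echo "looks like a valid machine-id"
```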
```
#install the packages and skip setting up additional options
sudo apt-get update
sudo apt install drbd-utils
```

Next, you need to change the configuration on the first host (vm1) in **/etc/drbd.conf**:

```
global { usage-count no; }
common { syncer { rate 100M; } }
resource r0 {
        protocol C;
        startup {
                wfc-timeout 15;
                degr-wfc-timeout 60;
        }
        net {
                cram-hmac-alg sha1;
                shared-secret "secret";
        }
        on vm1 {
                device /dev/drbd0;
                disk /dev/vdb;
                address 10.1.1.61:7788;
                meta-disk internal;
        }
        on vm2 {
                device /dev/drbd0;
                disk /dev/vdb;
                address 10.1.1.92:7788;
                meta-disk internal;
        }
}
```

```
#via ssh, copy this config to vm2
scp /etc/drbd.conf vm2:~
#move it to /etc
sudo mv drbd.conf /etc/
#on both VMs, initialize the metadata storage
sudo drbdadm create-md r0
#and start the drbd daemon, also on both VMs
sudo systemctl start drbd.service
```

<center>

![](https://i.imgur.com/giIaoiP.png)
Figure 1 - drbd.service

</center>

```
#on vm1, to make it the primary node and
#start synchronization between the hosts
sudo drbdadm -- --overwrite-data-of-peer primary all
#to watch the progress on vm2
watch -n1 cat /proc/drbd
```

<center>

![](https://i.imgur.com/4F07Awr.png)
Figure 2 - Synchronization process

</center>

:::info
Check (for all VMs) that a new block device appeared in the system and format it with usual filesystem like EXT4 or XFS and then mount. Make sure that each VM can recognize a filesystem on distributed block device
:::

```
#create a file system and mount it
sudo mkfs -t ext4 /dev/drbd0
sudo mount /dev/drbd0 /srv
```

<center>

![](https://i.imgur.com/cWi4qBQ.png)
Figure 3 - EXT4 (the first screenshot was lost)

![](https://i.imgur.com/ivlj7OL.png)
Figure 4 - New block device appeared in the system

</center>

To check that the file system is recognized, I borrowed a few steps from the test subtask: unmounted the block device, demoted the first node to the secondary role, promoted the second node to the primary role, and then mounted the device on it again.
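As an aside, the states displayed by `watch -n1 cat /proc/drbd` can also be extracted programmatically, which is handy for monitoring scripts. A minimal sketch (the status line below is a made-up sample, but `cs:`, `ro:` and `ds:` are the real /proc/drbd field names for connection state, roles, and disk states):

```shell
# Made-up sample of a /proc/drbd resource line
sample=' 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----'

# Extract connection state, roles (local/peer) and disk states (local/peer)
echo "$sample" | grep -o 'cs:[^ ]*'
echo "$sample" | grep -o 'ro:[^ ]*'
echo "$sample" | grep -o 'ds:[^ ]*'
```

In a real script one would read `/proc/drbd` instead of the sample variable and, for example, raise an alert whenever the connection state is not `Connected`.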
I didn't create a file system on the second node; I just mounted and unmounted the device there, so you can see below that both nodes (at different times) recognize the file system as EXT4.

<center>

![](https://i.imgur.com/2aXpRso.png)
Figure 5 - File system

</center>

:::info
Validate that storage on your distributed block device is fault tolerant (create some data, destroy one node, check storage status, etc.).
:::

To check fault tolerance, we need to put something into the drbd storage; my mount point is the /srv directory.

```
#move to a random directory
cd /etc/default
#create some data in it
mkdir test_dir
#copy the data to the mount point
sudo cp -r /etc/default /srv
#then unmount /srv
sudo umount /srv
#swap the node roles
sudo drbdadm secondary r0
sudo drbdadm primary r0
#mount the partition
sudo mount /dev/drbd0 /srv
```

<center>

![](https://i.imgur.com/s9NHKmA.png)
Figure 6 - New roles

![](https://i.imgur.com/Pul8Aum.png)
Figure 7 - Result

</center>

:::info
Have you lost your data after destroying one node? Was it necessary to do something on another nodes to get the data?
:::

No, because I used protocol C:

> Protocol C. Synchronous replication protocol. Local write operations on the primary node are considered completed only after both the local and the remote disk write(s) have been confirmed. As a result, loss of a single node is guaranteed not to lead to any data loss. Data loss is, of course, inevitable even with this replication protocol if all nodes (resp. their storage subsystems) are irreversibly destroyed at the same time.

I can also add that when the machines are shut down abruptly (a hard reset of the workstation), the storage does not disappear: after powering back on, it is enough to check the drbd daemon, pick a primary node again, and mount the device on it.

A different situation is also possible. Quirks of synchronization timing can lead to a state where a write operation has completed locally, but its transmission over the network has not yet taken place, or vice versa.
If at this moment the active node fails and failover is initiated, this data block is left unsynchronized between the nodes: it was written to the failed node before the failure, but replication had not yet completed. When the failed node is eventually restored, this block must be discarded from its dataset during the subsequent synchronization. Otherwise, the failed node would be "one write ahead" of the surviving node, which would violate the "all or nothing" principle of replicated storage. All of this applies to my version, drbd 8.4.11; in version 9 things work a little differently.

## Task 3 - High Available setup

:::info
Modify your configuration to make storage high available. Besides it, you will need to format your block device with a clustered filesystem (e.g., **OCFS2**, GlusterFS, MooseFS). Again validate the storage on your distributed block device. Was it necessary to do something on another nodes to get the data in this setup?
:::

First I need to change the configuration (**/etc/drbd.conf**): I want to add options governing what a node does when a split-brain is detected, raise the bandwidth of the synchronization channel, and set both nodes as primary (this allows simultaneous data access, and this mode usually requires a shared cluster file system).
```
global { usage-count no; }
common { syncer { rate 1000M; } }
resource r0 {
        protocol C;
        startup {
                become-primary-on both;
        }
        net {
                allow-two-primaries yes;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
        }
        on vm1 {
                device /dev/drbd0;
                disk /dev/vdb;
                address 10.1.1.61:7788;
                meta-disk internal;
        }
        on vm2 {
                device /dev/drbd0;
                disk /dev/vdb;
                address 10.1.1.92:7788;
                meta-disk internal;
        }
}
```

```
#via ssh, copy this config to vm2
scp /etc/drbd.conf vm2:~
#move it to /etc
sudo mv drbd.conf /etc/
```

:::spoiler
While using two nodes as primaries for simultaneous access to the storage, it is recommended to use fencing policies; however, I did not quite understand what the advantage of freezing I/O operations is. Can you please comment on this?

```
fencing resource-and-stonith;
}
handlers {
        fence-peer "...";
        unfence-peer "...";
}
```
:::

Configurations are not applied immediately, so first do a dry run to see which commands drbdadm would execute (instead of actually executing them), and then adjust the kernel module's configuration so that it matches the configuration files.

<center>

![](https://i.imgur.com/hqLMXDg.png)

![](https://i.imgur.com/q6U0JZP.png)
Figures 8, 9 - Update config

</center>

Now we start the synchronization. To change the roles, run on both nodes:

```
drbdadm primary r0
```

<center>

![](https://i.imgur.com/vfPzve8.png)
Figure 10 - Two primary nodes

</center>

Now let's change the storage file system; this requires the **ocfs2** utility package and a new configuration.

```
sudo apt-get update
sudo apt-get install ocfs2-tools
```

The following basic configuration must be added to the **/etc/ocfs2/cluster.conf** file on both nodes.
```
cluster:
        node_count = 2
        name = ocfs2cluster

node:
        number = 1
        cluster = ocfs2cluster
        ip_port = 7777
        ip_address = 10.1.1.61
        name = vm1

node:
        number = 2
        cluster = ocfs2cluster
        ip_port = 7777
        ip_address = 10.1.1.92
        name = vm2
```

After that, you need to recreate the storage file system.

```
mkfs.ocfs2 -L "test" /dev/drbd0
```

<center>

![](https://i.imgur.com/ppuIE5F.png)
Figure 11 - New FS

</center>

The following parameters must be set in **/etc/default/o2cb**:

<center>

![](https://i.imgur.com/QQlmfZ8.png)
Figure 12 - o2cb service

</center>

```
#register the cluster specified in the configuration
o2cb register-cluster ocfs2cluster
#add the necessary services to autostart
systemctl enable drbd o2cb ocfs2
systemctl start drbd o2cb ocfs2
```

We add mount points to **fstab** on both nodes: the noauto option means the FS will not be mounted at boot (it is mounted via systemd instead), and heartbeat=local means using the heartbeat service on each node. Local heartbeat refers to disk heartbeating on each shared device; in this mode the heartbeat starts on mount and stops on unmount, which is just right for small storages.

```
/dev/drbd0 /srv ocfs2 defaults,noauto,heartbeat=local 0 0
```

<center>

![](https://i.imgur.com/tIPCNMu.png)
Figure 13 - Result

![](https://i.imgur.com/9ER1nu0.png)
Figure 14 - Simple example

</center>

In general, it is categorically not recommended to run two primary nodes without a quorum (with a quorum there is a chance to resolve some split-brain scenarios automatically; with exactly two primaries, manual intervention will be required). On the other hand, this setup significantly increases the availability of the storage by allowing two nodes to use it at the same time.
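Returning briefly to the o2cb parameters from Figure 12: for reference, a typical **/etc/default/o2cb** for this setup might look as follows. The cluster name matches cluster.conf; the timeout values are the package defaults and are assumptions on my part, since the exact numbers live only in the screenshot.

```
# /etc/default/o2cb (sketch; only O2CB_BOOTCLUSTER is specific to this lab)
O2CB_ENABLED=true
O2CB_BOOTCLUSTER=ocfs2cluster
O2CB_HEARTBEAT_THRESHOLD=31
O2CB_IDLE_TIMEOUT_MS=30000
O2CB_KEEPALIVE_DELAY_MS=2000
O2CB_RECONNECT_DELAY_MS=2000
```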
And the risk that a split brain will happen in such a small storage is realistic mainly in the event of a [network failure](https://serverfault.com/questions/485545/dual-primary-ocfs2-drbd-encountered-split-brain-is-recovery-always-going-to-be), which can be mitigated with several independent network connections.

## Task 4 - Questions

:::info
Explain what is fault tolerance and high availability
:::

Fault tolerance is an environment in which service is not interrupted at all, but at a much higher cost than usual systems. High availability is an environment with minimal service interruption. Many services are willing to absorb a small amount of downtime with high availability rather than pay the much higher cost of providing fault tolerance.

:::info
What is a split-brain situation?
:::

Split brain is a situation where, due to a temporary failure of all network links between cluster nodes, and possibly due to intervention by cluster management software or human error, both nodes switched to the Primary role while disconnected. This is a potentially harmful state, as it implies that modifications to the data might have been made on either node without being replicated to the peer. It is therefore likely that two diverging sets of data have been created that cannot be trivially merged.

:::info
What are the advantages and disadvantages of clustered filesystems?
:::

Advantages:

* Clustered file systems provide increased resource availability: if one of the nodes fails, other nodes remain available and can provide access to the necessary data without downtime.
* A clustered file system can distribute projects and data across different nodes to create the desired configuration.
* Performance increases because multiple server nodes provide more computing power.
* The load on the hardware can be balanced by sharing the work of serving multiple application clients.
Disadvantages:

* Concurrency control becomes a problem when multiple people or clients access the same file or block and want to update it.
* Remote access incurs additional costs due to the distributed structure.
* Failure of disk hardware or of a given storage node in a cluster can create a single point of failure, which can lead to data loss or unavailability (solved, for example, by data replication).

# Backup

## Task 1 - Take a pick

:::info
Choose any tool from the list below: **Borg**
:::

## Task 2 - Configuration & Testing

:::info
Create 2 VMs (server and client), install and configure OS (please check the list of supported OS for your solution). Configure your solution: install necessary packets, edit the configuration. Create a repo on the backup server which will be used to store your backups. *Bonus:* configure an encrypted repo
:::

I will continue working with my two machines; now VM1 is the server and VM2 is the client. First I need to install the new package.

```
apt install borgbackup
```

On the server, I created a simple user for backups (with a /home directory):

```
sudo useradd -m borg
```

After that, I need to link the client to the server:

```
#on the client machine, generate keys
ssh-keygen
#on the server, add the client's key
mkdir ~borg/.ssh
echo 'command="/usr/local/bin/borg serve" ssh-rsa <key here>' > ~borg/.ssh/authorized_keys
chown -R borg:borg ~borg/.ssh
```

<center>

![](https://i.imgur.com/xHNBkUU.png)
Figure 15 - SSH-key

</center>

```
#initialize the borg repo on the server from the client
# repokey: stores the (encrypted) key in BackupRepo/config
borg init -e repokey borg@10.1.1.61:BackupRepo
```

<center>

![](https://i.imgur.com/OJpODSA.png)
Figure 16 - Initializing a remote repository

</center>

For repositories using repokey encryption, the key is stored in the repository's configuration file. A key backup is therefore not strictly necessary, but it protects the repository from becoming unavailable if that file is damaged for some reason.
```
borg key export --paper /home/borg/BackupRepo > encrypted-key-backup.txt
```

<center>

![](https://i.imgur.com/O8jyVCV.png)
Figure 17 - Example of exporting a key to a text file

</center>

A peculiarity of borg is that a repository must be encrypted at initialization time; after a repository has been created, there is no way to encrypt it, although the passphrase can be changed.

:::info
Make a backup of `/home` directory (create some files and directories before backuping) of your client. Don't forget to make a verification of backup (some solutions provide an embedded option to verify backups). If there is no embedded option to verify backups try to make a verification on your own.
:::

Now let's create the first backup of the `/home` directory (in it I created a test folder and a small text file with random text). Every time the encrypted repository is accessed, borg asks for the passphrase.

```
#--stats and --list print statistics on the backup and the files that went into it
borg create --stats --list borg@10.1.1.61:BackupRepo::"TestBackup2" /home
```

<center>

![](https://i.imgur.com/cQYcBpB.png)
Figure 18 - First backup

</center>

The **--verify-data** option performs a full integrity check of the data (as opposed to the CRC32 segment check), which means reading the data from the repository, decrypting and unpacking it. This is a cryptographic check that will detect (accidental) corruption.

<center>

![](https://i.imgur.com/ucQh4u2.png)
Figure 19 - Check the backup

![](https://i.imgur.com/jn1qPTt.png)
Figure 20 - List of archives and a list of files inside one of the archives

</center>

:::info
Then damage your client's `/home` directory (encrypt it or forcibly remove) and restore from backup. Has anything changed with the files after restoring? Can you see and edit them?
:::

```
sudo rm -rf home
```

<center>

![](https://i.imgur.com/fwFzlRR.png)
Figure 21 - Non-home

</center>

```
borg extract borg@10.1.1.61:BackupRepo::TestBackup2
```

<center>

![](https://i.imgur.com/SHS3orD.png)
Figure 22 - New home

</center>

In essence, I simply requested the archive with the right directory from the server and extracted the necessary data from it. Borg restores files into the current working directory. The only difference I noticed was, naturally, a change in the access and modification times of the directory. But the creation time changed as well, as a comparison of the crtime of the /home and /lib directories shows. It is also interesting that the modification time of the /home directory (from when I created a new folder in it) and its creation time differ by an hour; in other words, the directory appears to have been modified before it was created. Although the contents were recreated from an archive, the file system considers the directory a brand-new object. And, yes, I can see the files and edit them.

<center>

![](https://i.imgur.com/z96pzWR.png)
Figure 23 - What's new

</center>

## Task 3 - Questions

:::info
When and how often do you need to verify your backups?
:::

Here it would be appropriate to answer with the words of professor Alma: "I have no idea what kind of system you have, what you are doing in it, what size it is, and what you want from it. As soon as you tell me, then we'll talk." In other words, everything depends on the requirements of the business and the technologies used. Since the question asks about my backups, I would answer like this: my virtual machines are very simple and little data passes through them; however, because they are experimental and I try different things on them (like resizing the root storage), it would be wise to make a backup before each such experiment so as not to break the virtual machine image.
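Returning for a moment to backup verification: if a tool had no embedded check like `borg check --verify-data`, a restore could still be validated by hand with a checksum manifest recorded at backup time. A small sketch of the idea, with illustrative paths:

```shell
# Stand-in for the data being backed up (illustrative path)
mkdir -p /tmp/demo_src
printf 'some important data\n' > /tmp/demo_src/file.txt

# At backup time: record a checksum of every file
(cd /tmp/demo_src && find . -type f -exec sha256sum {} +) > /tmp/manifest.txt

# After restoring: every file must still match the manifest
(cd /tmp/demo_src && sha256sum -c /tmp/manifest.txt) && echo "restore verified"
```

`sha256sum -c` prints a per-file verdict, and the final echo fires only if every checksum matches, so any silent corruption during restore is caught.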
It would also be sensible to store these backups outside the virtual machines and to check them roughly once every two to three weeks (not all labs need the virtual machines).

:::info
Backup rotations schemes, what are they? Are they available in your solution?
:::

Changing the working set of media during copying is called rotation. A rotation scheme determines how and when each piece of removable storage is used for a backup job and how long it is retained once it holds backup data. In borg, specifically, the prune command is available: it cleans up the repository, deleting all archives that do not match any of the specified retention parameters (without freeing disk space by itself). This automates backup scenarios that keep a certain number of historical backups, i.e., the GFS (Grandfather-Father-Son) scheme.

## References:

1. [Ubuntu HA DRBD](https://ubuntu.com/server/docs/ubuntu-ha-drbd)
2. [DRBD Guide](https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/)
3. [Man: o2cb](https://www.mankier.com/7/o2cb#Synopsis)
4. [HA vs fault tolerance](https://www.ibm.com/docs/en/powerha-aix/7.2?topic=aix-high-availability-versus-fault-tolerance)
5. [Clustered FS: Pros & Cons](https://docs.oracle.com/cd/E56047_01/html/E56080/goori.html)
6. [Borg Documentation: encrypt repo](https://borgbackup.readthedocs.io/en/stable/usage/init.html)