Simple OpenNMS Environment using ScyllaDB

This lab starts an OpenNMS instance, a 3-node ScyllaDB cluster, and optionally an instance for the Scylla Monitoring Stack, all in Azure, for learning purposes.

To monitor a network, it is advisable to enable ActiveMQ or Kafka and use Minions. For simplicity, the embedded ActiveMQ will be enabled.

Follow this guide for general information about how to configure Cassandra for OpenNMS/Newts; most of that knowledge is transferable to ScyllaDB.

Requirements

The scripts used throughout this tutorial rely on envsubst; make sure it is installed.
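
If you're not sure whether it's available, a quick check like the following helps (the package names are just common defaults and vary by OS):

command -v envsubst >/dev/null || echo "envsubst not found: install gettext (e.g., gettext-base on Debian/Ubuntu, or 'brew install gettext' on macOS)"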

Make sure to log into Azure using az login prior to creating the VMs.
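
For example:

az login                  # opens a browser to authenticate
az account show -o table  # confirm which subscription the CLI will use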

If you have a restricted account in Azure, make sure your Azure AD account has the Network Contributor and Virtual Machine Contributor roles for the resource group where you want to create the VMs. Of course, either Owner or Contributor at the resource group level also works.

Create common Environment Variables

# Main
export RG_NAME="OpenNMS" # Change it to use a shared one
export LOCATION="eastus" # Azure Region
export VNET_CIDR="13.0.0.0/16"
export VNET_SUBNET="13.0.1.0/24"
export VNET_NAME="$USER-scylla-vnet"
export VNET_SUBNET_NAME="subnet1"
export SCYLLA_VM_SIZE="Standard_D2s_v3" # 2 VCPU, 8 GB of RAM
export SCYLLA_DISK_SIZE="512" # Disk size in GB
export SCYLLA_CLUSTER_NAME="opennms-cluster"
export SCYLLA_DATACENTER="DC1"
export SCYLLA_REPLICATION_FACTOR="2" # Less than total nodes in cluster
export SCYLLA_SEED="$USER-scylla1"
export NEWTS_KEYSPACE_NAME="newts"
export ONMS_HEAP="4096" # Expressed in MB and must fit ONMS_VM_SIZE
export ONMS_VM_SIZE="Standard_D2s_v3" # 2 VCPU, 8 GB of RAM
export ONMS_VM_NAME="$USER-scylla-onms"
export SCYLLA_MON_VM_SIZE="Standard_D2s_v3" # 2 VCPU, 8 GB of RAM
export SCYLLA_MON_VM_NAME="$USER-scylla-monitor"

# Generated
export SUBNET_BASE=${VNET_SUBNET/\.0\/24/}

Feel free to change the content, and keep in mind that $USER is used throughout this tutorial to uniquely identify all the resources we will create in Azure.

Do not confuse the Azure Location or Region with the Minion Location; the two are unrelated.

We're going to leverage the Azure DNS service to avoid having to remember and use public IP addresses.

In Azure, the default public DNS names follow this pattern:

<vm-name>.<location>.cloudapp.azure.com

To make the VM FQDNs unique, we're going to include the username in the VM name. For instance, the OpenNMS FQDN would be:

agalue-scylla-onms.eastus.cloudapp.azure.com

The above is what we can use to access the VM via SSH and to configure Minions.
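
For example, assuming the environment variables defined above are exported:

ssh $USER@$ONMS_VM_NAME.$LOCATION.cloudapp.azure.com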

Create the Azure Resource Group

This is a necessary step, as every resource in Azure must belong to a resource group and a location.

However, you can omit the following command and use an existing one if you prefer. In that case, make sure to adjust the environment variable RG_NAME so the subsequent commands will target the correct group.

az group create -n $RG_NAME -l $LOCATION
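
You can confirm the group exists (whether you just created it or reuse an existing one) with:

az group show -n $RG_NAME -o table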

Create the Virtual Network

az network vnet create -g $RG_NAME \
  --name $VNET_NAME \
  --address-prefix $VNET_CIDR \
  --subnet-name $VNET_SUBNET_NAME \
  --subnet-prefix $VNET_SUBNET \
  --tags Owner=$USER \
  --output table

The reason for creating the VNet is that we need static IP addresses for each ScyllaDB instance (a given Scylla node identifies itself within the cluster through its IP address).

Create VMs for the ScyllaDB cluster

We will name each VM as follows, based on the chosen username and the CIDR of the VNet and its subnet:

  • agalue-scylla1 (13.0.1.11)
  • agalue-scylla2 (13.0.1.12)
  • agalue-scylla3 (13.0.1.13)

Note that the hostnames include the chosen username to make them unique, which is mandatory for shared resource groups and for the default Azure public DNS domain in the chosen region.

Remember that, within the same VNet, each VM in Azure can reach any other VM through its hostname; the following steps take advantage of this when configuring the seed node and Newts access from OpenNMS.

It is assumed that each VM has at least 8 GB of RAM (a Standard_D2s_v3 instance should be sufficient), but feel free to adjust. The key element is that Scylla requires dedicated disks configured as RAID0 and formatted as XFS, so an additional disk is added for this purpose (the Scylla setup tool takes care of configuring and formatting it).

It is also assumed that agalue-scylla1 will be the seed node.

A network topology will be used, as that's a typical production scenario. For this, we're going to use rack awareness within a single DC. The DC name will be DC1, and the rack will be derived from the hostname of each ScyllaDB instance.

Create the cloud-init YAML file as /tmp/scylla-template.yaml with the following content for ScyllaDB 4.5, to be installed on RHEL/CentOS 8:

#cloud-config
package_upgrade: false

write_files:
  - owner: root:root
    path: /etc/scylla/fix-schema.cql
    content: |
      ALTER KEYSPACE system_auth WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', '$SCYLLA_DATACENTER' : 3 };
      ALTER KEYSPACE system_distributed WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', '$SCYLLA_DATACENTER' : 3 };
      ALTER KEYSPACE system_traces WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', '$SCYLLA_DATACENTER' : 2 };

  - owner: root:root
    path: /etc/scylla/newts.cql
    content: |
      CREATE KEYSPACE IF NOT EXISTS $NEWTS_KEYSPACE_NAME WITH replication = {
        'class' : 'NetworkTopologyStrategy',
        '$SCYLLA_DATACENTER' : $SCYLLA_REPLICATION_FACTOR
      };
      CREATE TABLE IF NOT EXISTS $NEWTS_KEYSPACE_NAME.samples (
        context text,
        partition int,
        resource text,
        collected_at timestamp,
        metric_name text,
        value blob,
        attributes map<text, text>,
        PRIMARY KEY((context, partition, resource), collected_at, metric_name)
      ) WITH compaction = {
        'compaction_window_size': '7',
        'compaction_window_unit': 'DAYS',
        'expired_sstable_check_frequency_seconds': '86400',
        'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy'
      } AND gc_grace_seconds = 604800;
      CREATE TABLE IF NOT EXISTS $NEWTS_KEYSPACE_NAME.terms (
        context text,
        field text,
        value text,
        resource text,
        PRIMARY KEY((context, field, value), resource)
      );
      CREATE TABLE IF NOT EXISTS $NEWTS_KEYSPACE_NAME.resource_attributes (
        context text,
        resource text,
        attribute text,
        value text,
        PRIMARY KEY((context, resource), attribute)
      );
      CREATE TABLE IF NOT EXISTS $NEWTS_KEYSPACE_NAME.resource_metrics (
        context text,
        resource text,
        metric_name text,
        PRIMARY KEY((context, resource), metric_name)
      );

  - owner: root:root
    permissions: '0750'
    path: /etc/scylla/bootstrap.sh
    content: |
      #!/bin/bash
      function wait_for_seed {
        echo "Waiting for $SCYLLA_SEED..."
        until echo -n > /dev/tcp/$SCYLLA_SEED/9042; do
          printf '.'
          sleep 5
        done
      }
      function start_scylla {
        echo "Starting ScyllaDB..."
        systemctl enable scylla-server
        systemctl start scylla-server
      }
      if [ ! -f "/etc/scylla/.configured" ]; then
        echo "Scylla is not configured."
        exit
      fi
      echo "Bootstrapping instance $(hostname)..."
      if [[ "$SCYLLA_SEED" == "$(hostname)" ]]; then
        start_scylla
        wait_for_seed
        echo "Configuring keyspaces..."
        cqlsh -f /etc/scylla/fix-schema.cql $(hostname)
        cqlsh -f /etc/scylla/newts.cql $(hostname)
      else
        wait_for_seed
        start_scylla
      fi

  - owner: root:root
    permissions: '0750'
    path: /etc/scylla/configure.sh
    content: |
      #!/bin/bash
      set -e
      if [ "$(id -u -n)" != "root" ]; then
        echo "Error: you must run this script as root" >&2
        exit 4 # According to LSB: 4 - user had insufficient privileges
      fi
      if [ -f "/etc/scylla/.configured" ]; then
        echo "Scylla node already configured."
        exit
      fi
      ipaddr=$(ifconfig eth0 | grep 'inet[^6]' | awk '{print $2}')
      # Basic Configuration
      cfg="/etc/scylla/scylla.yaml"
      sed -r -i "s/[#]?cluster_name:.*/cluster_name: '$SCYLLA_CLUSTER_NAME'/" $cfg
      sed -r -i "/seeds:/s/127.0.0.1/$SCYLLA_SEED/" $cfg
      sed -r -i "/^endpoint_snitch:/s/SimpleSnitch/GossipingPropertyFileSnitch/" $cfg
      sed -r -i "/^endpoint_snitch:/a dynamic_snitch: false" $cfg
      for field in listen_address rpc_address api_address; do
        sed -r -i "s/^$field:.*/$field: $ipaddr/" $cfg
      done
      # Performance Improvement
      sed -r -i "/num_tokens:/s/256/16/" $cfg
      # Network Topology
      cfg="/etc/scylla/cassandra-rackdc.properties"
      index=$(hostname | awk '{ print substr($0,length,1) }')
      echo "dc=$SCYLLA_DATACENTER" >> $cfg
      echo "rack=Rack$index" >> $cfg
      # Enable JMX Access
      cfg="/etc/scylla/cassandra/cassandra-env.sh"
      sed -r -i "/rmi.server.hostname/s/.public name./$ipaddr/" $cfg
      sed -r -i "/rmi.server.hostname/s/^#//" $cfg
      sed -r -i "/jmxremote.access/s/#//" $cfg
      sed -r -i "/LOCAL_JMX=/s/yes/no/" $cfg
      # Configure Environment
      disk=$(readlink -f /dev/disk/azure/scsi1/lun0)
      echo "Waiting on disk $disk"
      while [ ! -e $disk ]; do
        printf '.'
        sleep 10
      done
      scylla_setup --setup-nic-and-disks --disks $disk --no-version-check
      chown scylla:scylla /etc/scylla/jmxremote.*
      touch /etc/scylla/.configured

  - owner: root:root
    permissions: '0400'
    path: /etc/scylla/jmxremote.password
    content: |
      monitorRole QED
      controlRole R&D
      cassandra cassandra

  - owner: root:root
    permissions: '0400'
    path: /etc/scylla/jmxremote.access
    content: |
      monitorRole readonly
      cassandra readwrite
      controlRole readwrite \
        create javax.management.monitor.*,javax.management.timer.* \
        unregister

  - owner: root:root
    permissions: '0400'
    path: /etc/snmp/snmpd.conf
    content: |
      rocommunity public default
      syslocation Azure
      syscontact IT
      dontLogTCPWrappersConnects yes
      disk /
      disk /var/lib/scylla

packages:
  - epel-release
  - net-snmp
  - net-snmp-utils

runcmd:
  - systemctl enable --now snmpd
  - curl -o /etc/yum.repos.d/scylla.repo -L http://downloads.scylladb.com/rpm/centos/scylla-4.5.repo
  - dnf install -y scylla
  - /etc/scylla/configure.sh
  - /etc/scylla/bootstrap.sh

For simplicity, I'm extracting the last digit from the hostname and using it as part of the rack name; for instance, agalue-scylla2 lands in Rack2. You can apply other rules if needed.

Create the Scylla cluster on CentOS 8 VMs (each with an additional 512 GB disk for data, as Scylla requires):

envsubst "$(env | cut -d= -f1 | sed -e 's/^/$/')" < /tmp/scylla-template.yaml > /tmp/scylla.yaml

for i in {1..3}; do
  VM_NAME="$USER-scylla$i"
  VM_IP="$SUBNET_BASE.1$i"
  echo "Creating VM $VM_NAME ($VM_IP)..."
  az vm create --resource-group $RG_NAME --name $VM_NAME \
    --size $SCYLLA_VM_SIZE \
    --image OpenLogic:CentOS:8_4:latest \
    --admin-username $USER \
    --ssh-key-values ~/.ssh/id_rsa.pub \
    --vnet-name $VNET_NAME \
    --subnet $VNET_SUBNET_NAME \
    --private-ip-address $VM_IP \
    --public-ip-address-dns-name $VM_NAME \
    --public-ip-sku Standard \
    --data-disk-sizes-gb $SCYLLA_DISK_SIZE \
    --custom-data /tmp/scylla.yaml \
    --tags Owner=$USER \
    --no-wait
done

There is no need to open ports, as the VMs can reach each other on any port by default within the VNet, and the cluster won't be exposed to the internet (except for SSH via the public FQDNs).

Also note that static IPs are used because that's a Cassandra requirement (and Scylla behaves the same way).

All the VMs are created simultaneously, but the bootstrap script ensures that only one node joins the cluster at a time.
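
Once cloud-init finishes on the seed node (this can take several minutes), a quick way to confirm the cluster formed is to run nodetool over SSH against the seed's public FQDN; all three nodes should report UN (Up/Normal):

ssh $USER@$SCYLLA_SEED.$LOCATION.cloudapp.azure.com nodetool status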

Create a VM for OpenNMS

Create a cloud-init script to deploy OpenNMS and PostgreSQL on Ubuntu with the following content, and save it at /tmp/opennms-template.yaml:

#cloud-config
package_upgrade: false

write_files:
  - owner: root:root
    path: /etc/opennms-overlay/opennms.properties.d/newts.properties
    content: |
      org.opennms.timeseries.strategy=newts
      org.opennms.newts.config.hostname=$SCYLLA_SEED
      org.opennms.newts.config.keyspace=$NEWTS_KEYSPACE_NAME
      org.opennms.newts.config.port=9042
      org.opennms.newts.config.read_consistency=ONE
      org.opennms.newts.config.write_consistency=ANY
      org.opennms.newts.config.resource_shard=604800
      org.opennms.newts.config.ttl=31540000
      org.opennms.newts.config.cache.priming.enable=true
      org.opennms.newts.config.cache.priming.block_ms=60000
      # The following settings must be tuned in production
      org.opennms.newts.config.writer_threads=2
      org.opennms.newts.config.ring_buffer_size=131072
      org.opennms.newts.config.cache.max_entries=131072
      org.opennms.newts.config.max-connections-per-host=8192
      # The following must be a factor of the number of Cores on a Scylla node
      org.opennms.newts.config.core-connections-per-host=2
      org.opennms.newts.config.max-connections-per-host=2

  - owner: root:root
    permissions: '0750'
    path: /etc/opennms-overlay/bootstrap.sh
    content: |
      #!/bin/bash
      set -e
      systemctl --now enable postgresql
      sudo -u postgres createuser opennms
      sudo -u postgres psql -c "ALTER USER postgres WITH PASSWORD 'postgres';"
      sudo -u postgres psql -c "ALTER USER opennms WITH PASSWORD 'opennms';"
      sed -r -i 's/password=""/password="postgres"/' /etc/opennms/opennms-datasources.xml
      sed -r -i '/0.0.0.0:61616/s/([<][!]--|--[>])//g' /etc/opennms/opennms-activemq.xml
      sed -r -i '/enabled="false"/{$!{N;s/ enabled="false"[>]\n(.*OpenNMS:Name=Syslogd.*)/>\n\1/}}' /etc/opennms/service-configuration.xml
      rsync -avr /etc/opennms-overlay/ /etc/opennms/
      echo 'JAVA_HEAP_SIZE=$ONMS_HEAP' > /etc/opennms/opennms.conf
      /usr/share/opennms/bin/runjava -s
      /usr/share/opennms/bin/fix-permissions
      /usr/share/opennms/bin/install -dis
      echo "Waiting for $SCYLLA_SEED..."
      until echo -n > /dev/tcp/$SCYLLA_SEED/9042; do
        printf '.'
        sleep 5
      done
      systemctl --now enable opennms

  - owner: root:root
    permissions: '0400'
    path: /etc/snmp/snmpd.conf
    content: |
      rocommunity public default
      syslocation Azure
      syscontact IT
      dontLogTCPWrappersConnects yes
      disk /

apt:
  preserve_sources_list: true
  sources:
    opennms:
      source: deb https://debian.opennms.org stable main

packages:
  - snmp
  - snmpd
  - opennms
  - opennms-webapp-hawtio

bootcmd:
  - curl -s https://debian.opennms.org/OPENNMS-GPG-KEY | apt-key add -

runcmd:
  - /etc/opennms-overlay/bootstrap.sh

The above installs the latest OpenJDK 11, the latest PostgreSQL, and the latest OpenNMS Horizon. I added the most basic configuration for PostgreSQL to work with authentication. The embedded ActiveMQ is enabled, as well as Syslogd.

Create an Ubuntu VM for OpenNMS:

envsubst < /tmp/opennms-template.yaml > /tmp/opennms.yaml

az vm create --resource-group $RG_NAME --name $ONMS_VM_NAME \
  --size $ONMS_VM_SIZE \
  --image canonical:0001-com-ubuntu-server-focal:20_04-lts:latest \
  --admin-username $USER \
  --ssh-key-values ~/.ssh/id_rsa.pub \
  --vnet-name $VNET_NAME \
  --subnet $VNET_SUBNET_NAME \
  --private-ip-address "$SUBNET_BASE.100" \
  --public-ip-address-dns-name $ONMS_VM_NAME \
  --public-ip-sku Standard \
  --custom-data /tmp/opennms.yaml \
  --tags Owner=$USER \
  --output table

az vm open-port -g $RG_NAME -n $ONMS_VM_NAME \
  --port 61616 --priority 100 -o table

az vm open-port -g $RG_NAME -n $ONMS_VM_NAME \
  --port 8980 --priority 200 -o table

Keep in mind that the cloud-init process starts once the VM is running, so expect to wait about 5 minutes after az vm create finishes before OpenNMS is up and running.
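
If you prefer to wait from the command line, a simple sketch is to poll the web UI until it answers (login.jsp is the standard OpenNMS login page):

ONMS_FQDN="$ONMS_VM_NAME.$LOCATION.cloudapp.azure.com"
until curl -sf -o /dev/null "http://$ONMS_FQDN:8980/opennms/login.jsp"; do printf '.'; sleep 10; done
echo "OpenNMS web UI is up"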

In case there is a problem, SSH into the VM using the public IP and the provided credentials and check /var/log/cloud-init-output.log to verify the progress and the status of the cloud-init execution.

You can also reach the Scylla VMs over SSH from the OpenNMS VM using their private hostnames.
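
For example, using SSH's ProxyJump option from your workstation (assuming the same SSH key pair was used for all the VMs):

ssh -J $USER@$ONMS_VM_NAME.$LOCATION.cloudapp.azure.com $USER@$USER-scylla1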

Create a VM for the Scylla Monitoring Stack

Work in progress (not verified).

This is an optional step, not required to use Scylla. Note that its configuration requires listing all the cluster nodes explicitly, so if you're initializing more than 3 nodes, update scylla_servers.yml accordingly.

Create a cloud-init script to deploy the Monitoring Stack on CentOS 8 with the following content, and save it at /tmp/scylla-monitoring-template.yaml:

#cloud-config
package_upgrade: false

write_files:
  - owner: root:root
    permissions: '0400'
    path: /etc/snmp/snmpd.conf
    content: |
      rocommunity public default
      syslocation Azure
      syscontact IT
      dontLogTCPWrappersConnects yes
      disk /

  - owner: root:root
    permissions: '0644'
    path: /etc/scylla-monitoring/scylla_manager_servers.example.yml
    content: |
      # List Scylla Manager end points
      - targets:
        - 127.0.0.1:56090

  - owner: root:root
    permissions: '0644'
    path: /etc/scylla-monitoring/scylla_servers.yml
    content: |
      # ScyllaDB Statistics
      - targets:
        - $USER-scylla1:9180
        - $USER-scylla2:9180
        - $USER-scylla3:9180
        labels:
          cluster: $SCYLLA_CLUSTER_NAME
          dc: $SCYLLA_DATACENTER
      # Node Exporter
      - targets:
        - $USER-scylla1:9100
        - $USER-scylla2:9100
        - $USER-scylla3:9100
        labels:
          cluster: $SCYLLA_CLUSTER_NAME
          dc: $SCYLLA_DATACENTER

  - owner: root:root
    permissions: '0750'
    path: /etc/scylla-monitoring/bootstrap.sh
    content: |
      #!/bin/bash
      set -e
      if [ "$(id -u -n)" != "root" ]; then
        echo "Error: you must run this script as root" >&2
        exit 4 # According to LSB: 4 - user had insufficient privileges
      fi
      systemctl enable --now docker
      usermod -aG docker $USER
      if [ ! -e /opt/scylla-monitoring ]; then
        git clone https://github.com/scylladb/scylla-monitoring.git /opt/scylla-monitoring
      fi
      src="/home/$USER/prometheus"
      data="$src/data"
      mkdir -p $data
      cp -f /etc/scylla-monitoring/*.yml $src/
      runuser -l $USER -c "cd /opt/scylla-monitoring && ./start-all.sh -d $data"

packages:
  - epel-release
  - net-snmp
  - net-snmp-utils
  - git

runcmd:
  - curl -s https://download.docker.com/linux/centos/docker-ce.repo > /etc/yum.repos.d/docker-ce.repo
  - yum install -y docker-ce docker-ce-cli containerd.io
  - /etc/scylla-monitoring/bootstrap.sh

Then,

envsubst "$(env | cut -d= -f1 | sed -e 's/^/$/')" < /tmp/scylla-monitoring-template.yaml > /tmp/scylla-monitoring.yaml

az vm create --resource-group $RG_NAME --name $SCYLLA_MON_VM_NAME \
  --size $SCYLLA_MON_VM_SIZE \
  --image OpenLogic:CentOS:8_4:latest \
  --admin-username $USER \
  --ssh-key-values ~/.ssh/id_rsa.pub \
  --vnet-name $VNET_NAME \
  --subnet $VNET_SUBNET_NAME \
  --private-ip-address "$SUBNET_BASE.200" \
  --public-ip-address-dns-name $SCYLLA_MON_VM_NAME \
  --public-ip-sku Standard \
  --custom-data /tmp/scylla-monitoring.yaml \
  --tags Owner=$USER \
  --output table

az vm open-port -g $RG_NAME -n $SCYLLA_MON_VM_NAME --port 3000 --priority 200 -o table
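
Once cloud-init finishes on that VM, Grafana from the monitoring stack should answer on port 3000; a quick check from your workstation (keep in mind this section is not fully verified):

curl -sI http://$SCYLLA_MON_VM_NAME.$LOCATION.cloudapp.azure.com:3000 | head -n 1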

Monitor the Cluster with OpenNMS

Even though Scylla offers a comprehensive monitoring stack based on Prometheus, and even though the JMX statistics available in Cassandra are not exposed by ScyllaDB, it is worth monitoring the Scylla servers with OpenNMS like any other server.

Wait until OpenNMS is up and running and then execute the following:

ONMS_FQDN="$ONMS_VM_NAME.$LOCATION.cloudapp.azure.com"

cat <<EOF >/tmp/OpenNMS.xml
<?xml version="1.0"?>
<model-import date-stamp="$(date +"%Y-%m-%dT%T.000Z")" foreign-source="OpenNMS">
EOF

for vm in $(az vm list -g $RG_NAME --query "[?contains(name,'$USER-')].name" -o tsv); do
  ipaddr=$(az vm show -g $RG_NAME -n $vm -d --query privateIps -o tsv)
  cat <<EOF >>/tmp/OpenNMS.xml
<node foreign-id="$vm" node-label="$vm">
  <interface ip-addr="$ipaddr" status="1" snmp-primary="P"/>
</node>
EOF
done

cat <<EOF >>/tmp/OpenNMS.xml
</model-import>
EOF

curl -v -u admin:admin \
  -H 'Content-Type: application/xml' -d @/tmp/OpenNMS.xml \
  http://$ONMS_FQDN:8980/opennms/rest/requisitions

curl -v -u admin:admin -X PUT \
  http://$ONMS_FQDN:8980/opennms/rest/requisitions/OpenNMS/import
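
After the import completes, you can confirm the nodes were created through the REST API (or via the web UI):

curl -u admin:admin "http://$ONMS_FQDN:8980/opennms/rest/nodes?limit=0"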

Create a Minion VM on your network

Create the following cloud-init template for a Minion (assuming the embedded ActiveMQ within OpenNMS is in place) and save it as /tmp/minion-template.yaml:

#cloud-config
package_upgrade: true

write_files:
  - owner: root:root
    path: /tmp/org.opennms.minion.controller.cfg
    content: |
      location=$MINION_LOCATION
      id=$MINION_ID
      http-url=http://$ONMS_FQDN:8980/opennms
      broker-url=failover:tcp://$ONMS_FQDN:61616

apt:
  preserve_sources_list: true
  sources:
    opennms:
      source: deb https://debian.opennms.org stable main

packages:
  - opennms-minion

bootcmd:
  - curl -s https://debian.opennms.org/OPENNMS-GPG-KEY | apt-key add -

runcmd:
  - mv -f /tmp/org.opennms.minion.controller.cfg /etc/minion/
  - sed -i -r 's/# export JAVA_MIN_MEM=.*/export JAVA_MIN_MEM="$MINION_HEAP_SIZE"/' /etc/default/minion
  - sed -i -r 's/# export JAVA_MAX_MEM=.*/export JAVA_MAX_MEM="$MINION_HEAP_SIZE"/' /etc/default/minion
  - /usr/share/minion/bin/scvcli set opennms.http admin admin
  - /usr/share/minion/bin/scvcli set opennms.broker admin admin
  - systemctl --now enable minion

Note the usage of environment variables within the YAML template. We will substitute them before creating the VM.

Then, create the runtime template:

export MINION_ID="minion01"
export MINION_LOCATION="Durham"
export MINION_HEAP_SIZE="1g"
export ONMS_FQDN="$ONMS_VM_NAME.$LOCATION.cloudapp.azure.com"

envsubst < /tmp/minion-template.yaml > /tmp/$MINION_ID.yaml

Then, start the new Minion via multipass with one core and 2GB of RAM:

multipass launch -c 1 -m 2G -n $MINION_ID --cloud-init /tmp/$MINION_ID.yaml
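
To confirm the Minion registered against OpenNMS, you can run the health check from inside the instance; the Karaf client path and the opennms:health-check command are assumptions for recent Horizon releases, so adjust them for your version:

multipass exec $MINION_ID -- sudo /usr/share/minion/bin/client "opennms:health-check"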

Clean Up

When you're done, make sure to delete the cloud resources.

If you created the resource group just for this exercise, you can remove all the resources with the following command:

az group delete -g $RG_NAME

If you're using an existing resource group that you cannot remove, make sure to remove only the resources created in this tutorial. All of them are easy to identify, as their names contain the username and the VM name. The easiest way is to use the Azure Portal. Alternatively,

IDS=($(az resource list \
  --resource-group $RG_NAME \
  --query "[?contains(name,'$USER-') && type!='Microsoft.Compute/disks']".id \
  --output tsv | tr '\n' ' '))

for id in "${IDS[@]}"; do
  echo "Removing $id"
  az resource delete --ids "$id" --verbose
done

DISKS=($(az resource list \
  --resource-group $RG_NAME \
  --query "[?contains(name,'$USER-') && type=='Microsoft.Compute/disks']".id \
  --output tsv | tr '\n' ' '))

for id in "${DISKS[@]}"; do
  echo "Removing $id"
  az resource delete --ids "$id" --verbose
done

The reason for the two deletion passes is that the resource list includes disks, which cannot be removed before the VMs that use them. For this reason, the first pass excludes the disks, and the second pass removes them.

Then clean the local resources:

multipass delete minion01
multipass purge