# Simple OpenNMS Environment using Cassandra

This lab starts an OpenNMS instance, a 3-node Cassandra cluster, and optionally an instance of Cassandra Reaper in the cloud, for learning purposes.

To monitor a network, it is advised to enable [ActiveMQ](https://hackmd.io/@agalue/HkY4Vhtbd) or [Kafka](https://hackmd.io/@agalue/SJ1X4vKz_) and use Minions. For simplicity, the embedded ActiveMQ will be enabled.

:::success
Follow [this](https://hackmd.io/@agalue/Bku_AgeXd) guide for general information about how to configure Cassandra for OpenNMS/Newts.
:::

## Requirements

* Have an [Azure Subscription](https://azure.microsoft.com/en-us/free/) ready.
* Install the [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli).
* Install [Multipass](https://multipass.run/), for the remote Minions.

The scripts used throughout this tutorial rely on [envsubst](https://www.gnu.org/software/gettext/manual/html_node/envsubst-Invocation.html); make sure to have it installed.

:::info
Make sure to log into Azure using `az login` prior to creating the VMs.
:::

:::warning
If you have a restricted account in Azure, make sure you have the `Network Contributor` role and the `Virtual Machine Contributor` role associated with your Azure AD account for the resource group where you want to create the VMs. Of course, either `Owner` or `Contributor` at the resource group level also works.
:::

:::danger
There are problems with `cqlsh` in `3.11.11` (see [CASSANDRA-16822](https://issues.apache.org/jira/browse/CASSANDRA-16822) for more details). Use [this](https://archive.apache.org/dist/cassandra/redhat/311x/) repository to install an older version, by adjusting the `baseurl` under `yum_repos` in the `cloud-init` template for Cassandra.
:::

## Create common Environment Variables

```bash=
# Main
export RG_NAME="OpenNMS" # Change it to use a shared one
export LOCATION="eastus" # Azure Region
export VNET_CIDR="13.0.0.0/16"
export VNET_SUBNET="13.0.1.0/24"
export VNET_NAME="$USER-cassandra-vnet"
export VNET_SUBNET_NAME="subnet1"
export CASSANDRA_VERSION="40x" # Either 311x or 40x
export CASSANDRA_VM_SIZE="Standard_D2s_v3" # 2 VCPU, 8 GB of RAM
export CASSANDRA_DISK_SIZE="512" # Disk size in GB
export CASSANDRA_HEAP="4g" # Must fit CASSANDRA_VM_SIZE
export CASSANDRA_CLUSTER_NAME="OpenNMS-Cluster"
export CASSANDRA_DATACENTER="DC1"
export CASSANDRA_REPLICATION_FACTOR="2"
export CASSANDRA_SEED="$USER-cassandra1"
export NEWTS_KEYSPACE_NAME="newts"
export ONMS_HEAP="4096" # Expressed in MB and must fit ONMS_VM_SIZE
export ONMS_VM_SIZE="Standard_D2s_v3" # 2 VCPU, 8 GB of RAM
export ONMS_VM_NAME="$USER-cassandra-onms"
export REAPER_VM_NAME="$USER-cassandra-reaper"

# Generated
export SUBNET_BASE=${VNET_SUBNET/\.0\/24/}
```

:::success
Feel free to change the content, and keep in mind that `$USER` is what we will use throughout this tutorial to uniquely identify all the resources we will create in Azure.
:::

:::warning
Do not confuse the Azure Location or Region with the Minion Location; they are unrelated.
:::

We're going to leverage the Azure DNS services to avoid the need to remember and use public IP addresses. In Azure, the default public DNS names follow the same pattern:

```
<vm-name>.<location>.cloudapp.azure.com
```

To make each VM's FQDN unique, we're going to add the username to the VM name. For instance, the OpenNMS FQDN would be:

```
agalue-cassandra-onms.eastus.cloudapp.azure.com
```

The above is what we can use to access the VM via SSH and to configure Minions.
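As a quick sanity check, you can derive the public FQDNs of the main VMs from the variables defined above. This is just an illustrative sketch, assuming the previous block has been sourced in your shell:

```bash=
# Illustrative only: print the public FQDNs derived from the variables above
for vm in "$CASSANDRA_SEED" "$ONMS_VM_NAME" "$REAPER_VM_NAME"; do
  echo "$vm.$LOCATION.cloudapp.azure.com"
done
```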
## Create the Azure Resource Group

This is a necessary step, as every resource in Azure must belong to a resource group and a location. However, you can omit the following command and use an existing group if you prefer. In that case, make sure to adjust the environment variable `RG_NAME` so the subsequent commands target the correct group.

```bash=
az group create -n $RG_NAME -l $LOCATION
```

## Create the Virtual Network

```bash=
az network vnet create -g $RG_NAME \
  --name $VNET_NAME \
  --address-prefix $VNET_CIDR \
  --subnet-name $VNET_SUBNET_NAME \
  --subnet-prefix $VNET_SUBNET \
  --tags Owner=$USER \
  --output table
```

The reason for creating the VNet ourselves is that we need static IP addresses for each Cassandra instance.

## Create VMs for the Cassandra cluster

We will name each VM as follows, based on the chosen username and the CIDR of the VNet and the subnet within it:

- `agalue-cassandra1` (`13.0.1.11`)
- `agalue-cassandra2` (`13.0.1.12`)
- `agalue-cassandra3` (`13.0.1.13`)

Note that the hostnames include the chosen username to make them unique, which is mandatory for shared resource groups and for the default Azure public DNS domain in the chosen region.

:::success
Remember that each VM in Azure is reachable from any other VM within the same VNet through its hostname; the following takes advantage of this fact when configuring seed nodes and Newts access from OpenNMS.
:::

Each VM should have at least 8 GB of RAM, due to the amount of heap that will be reserved (a [Standard_D2s_v3](https://docs.microsoft.com/en-us/azure/virtual-machines/dv3-dsv3-series) instance should be sufficient). It is also assumed that `agalue-cassandra1` will be the seed node.

A network topology will be used, as that's a typical scenario in production: rack awareness within a single datacenter. The DC name will be `DC1`, and the rack name will be derived from the hostname of each Cassandra instance.
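Before building the `cloud-init` template, you can optionally confirm that the subnet was created with the expected prefix. A quick check, reusing the same variables:

```bash=
# Optional: verify the subnet created above has the expected address prefix
az network vnet subnet show -g $RG_NAME \
  --vnet-name $VNET_NAME \
  --name $VNET_SUBNET_NAME \
  --query "{name:name, prefix:addressPrefix}" \
  --output table
```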
Create the [cloud-init](https://cloudinit.readthedocs.io/en/latest/) YAML file as `/tmp/cassandra-template.yaml` with the following content, targeting Cassandra 3.11.x on OpenJDK 8 or Cassandra 4.x on OpenJDK 11, to be installed on RHEL/CentOS 8 with a dedicated disk for data:

```yaml=
#cloud-config
package_upgrade: false

write_files:
- owner: root:root
  path: /etc/sysctl.d/99-cassandra.conf
  content: |
    net.ipv4.tcp_keepalive_time=60
    net.ipv4.tcp_keepalive_probes=3
    net.ipv4.tcp_keepalive_intvl=10
    net.core.rmem_max=16777216
    net.core.wmem_max=16777216
    net.core.rmem_default=16777216
    net.core.wmem_default=16777216
    net.core.optmem_max=40960
    net.ipv4.tcp_rmem=4096 87380 16777216
    net.ipv4.tcp_wmem=4096 65536 16777216
    net.ipv4.tcp_window_scaling=1
    net.core.netdev_max_backlog=2500
    net.core.somaxconn=65000
    vm.swappiness=1
    vm.zone_reclaim_mode=0
    vm.max_map_count=1048575

- owner: root:root
  path: /etc/cassandra/fix-schema.cql
  content: |
    ALTER KEYSPACE system_auth WITH REPLICATION = {
      'class' : 'NetworkTopologyStrategy',
      '$CASSANDRA_DATACENTER' : 3
    };
    ALTER KEYSPACE system_distributed WITH REPLICATION = {
      'class' : 'NetworkTopologyStrategy',
      '$CASSANDRA_DATACENTER' : 3
    };
    ALTER KEYSPACE system_traces WITH REPLICATION = {
      'class' : 'NetworkTopologyStrategy',
      '$CASSANDRA_DATACENTER' : 2
    };

- owner: root:root
  path: /etc/cassandra/reaper.cql
  content: |
    CREATE KEYSPACE IF NOT EXISTS reaper_db WITH replication = {
      'class' : 'NetworkTopologyStrategy',
      '$CASSANDRA_DATACENTER' : $CASSANDRA_REPLICATION_FACTOR
    };

- owner: root:root
  path: /etc/cassandra/newts.cql
  content: |
    CREATE KEYSPACE IF NOT EXISTS $NEWTS_KEYSPACE_NAME WITH replication = {
      'class' : 'NetworkTopologyStrategy',
      '$CASSANDRA_DATACENTER' : $CASSANDRA_REPLICATION_FACTOR
    };
    CREATE TABLE IF NOT EXISTS $NEWTS_KEYSPACE_NAME.samples (
      context text,
      partition int,
      resource text,
      collected_at timestamp,
      metric_name text,
      value blob,
      attributes map<text, text>,
      PRIMARY KEY((context, partition, resource), collected_at, metric_name)
    ) WITH compaction = {
      'compaction_window_size': '7',
      'compaction_window_unit': 'DAYS',
      'expired_sstable_check_frequency_seconds': '86400',
      'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy'
    } AND gc_grace_seconds = 604800
      AND read_repair_chance = 0;
    CREATE TABLE IF NOT EXISTS $NEWTS_KEYSPACE_NAME.terms (
      context text,
      field text,
      value text,
      resource text,
      PRIMARY KEY((context, field, value), resource)
    );
    CREATE TABLE IF NOT EXISTS $NEWTS_KEYSPACE_NAME.resource_attributes (
      context text,
      resource text,
      attribute text,
      value text,
      PRIMARY KEY((context, resource), attribute)
    );
    CREATE TABLE IF NOT EXISTS $NEWTS_KEYSPACE_NAME.resource_metrics (
      context text,
      resource text,
      metric_name text,
      PRIMARY KEY((context, resource), metric_name)
    );

- owner: root:root
  permissions: '0750'
  path: /etc/cassandra/bootstrap.sh
  content: |
    #!/bin/bash

    function wait_for_seed {
      until echo -n > /dev/tcp/$CASSANDRA_SEED/9042; do
        echo "### seed unavailable - sleeping"
        sleep 5
      done
    }

    function start_cassandra {
      echo "Starting Cassandra..."
      systemctl enable cassandra
      systemctl start cassandra
    }

    if [ ! -f "/etc/cassandra/.configured" ]; then
      echo "Cassandra is not configured."
      exit
    fi

    index=$(hostname | awk '{ print substr($0,length,1) }')
    echo "Bootstrapping instance $index..."

    if [[ "$index" == "1" ]]; then
      start_cassandra
      wait_for_seed
      echo "Configuring keyspaces..."
      cqlsh -f /etc/cassandra/fix-schema.cql $(hostname)
      cqlsh -f /etc/cassandra/newts.cql $(hostname)
      cqlsh -f /etc/cassandra/reaper.cql $(hostname)
    else
      # Stagger the non-seed nodes so only one joins the cluster at a time
      wait_for_seed
      start_delay=$((60*($index-1)))
      sleep $start_delay
      start_cassandra
    fi

- owner: root:root
  permissions: '0750'
  path: /etc/cassandra/configure.sh
  content: |
    #!/bin/bash

    if [ "$(id -u -n)" != "root" ]; then
      echo "Error: you must run this script as root" >&2
      exit 4 # According to LSB: 4 - user had insufficient privileges
    fi

    if [ -f "/etc/cassandra/.configured" ]; then
      echo "Cassandra node already configured."
      exit
    fi

    if [ "$CASSANDRA_VERSION" == "40x" ]; then
      dnf install -y java-11-openjdk
      echo 2 | alternatives --config java
      echo 2 | alternatives --config python
      sed -i -r "s/AND read_repair_chance = 0//" /etc/cassandra/newts.cql
    else
      dnf install -y python2
      echo 3 | alternatives --config python
    fi

    intf="eth0"

    # Basic Configuration
    cfg="/etc/cassandra/conf/cassandra.yaml"
    sed -r -i "/cluster_name:/s/Test Cluster/$CASSANDRA_CLUSTER_NAME/" $cfg
    sed -r -i "/seeds:/s/127.0.0.1/$CASSANDRA_SEED/" $cfg
    sed -r -i "s/^listen_address/#listen_address/" $cfg
    sed -r -i "s/^rpc_address/#rpc_address/" $cfg
    sed -r -i "s/^# listen_interface: .*/listen_interface: $intf/" $cfg
    sed -r -i "s/^# rpc_interface: .*/rpc_interface: $intf/" $cfg
    sed -r -i "/^endpoint_snitch:/s/SimpleSnitch/GossipingPropertyFileSnitch/" $cfg
    sed -r -i "/^endpoint_snitch:/a dynamic_snitch: false" $cfg

    # Performance Tuning
    cores=$(cat /proc/cpuinfo | grep "^processor" | wc -l)
    sed -r -i "/num_tokens:/s/256/16/" $cfg
    sed -r -i "/enable_materialized_views:/s/true/false/" $cfg
    sed -r -i "s/#concurrent_compactors: .*/concurrent_compactors: $cores/" $cfg

    # Network Topology
    cfg="/etc/cassandra/conf/cassandra-rackdc.properties"
    index=$(hostname | awk '{ print substr($0,length,1) }')
    sed -r -i "/^dc/s/=.*/=$CASSANDRA_DATACENTER/" $cfg
    sed -r -i "/^rack/s/=.*/=Rack$index/" $cfg
    rm -f /etc/cassandra/conf/cassandra-topology.properties

    # Enable JMX Access
    cfg="/etc/cassandra/conf/cassandra-env.sh"
    sed -r -i "/rmi.server.hostname/s/.public name./$(hostname)/" $cfg
    sed -r -i "/rmi.server.hostname/s/^#//" $cfg
    sed -r -i "/jmxremote.access/s/#//" $cfg
    sed -r -i "/LOCAL_JMX=/s/yes/no/" $cfg

    # Configure Heap (make sure it is consistent with the available RAM)
    cfg="/etc/cassandra/conf/jvm.options"
    if [ "$CASSANDRA_VERSION" == "40x" ]; then
      cfg="/etc/cassandra/conf/jvm-server.options"
    fi
    sed -r -i "s/#-Xms4G/-Xms$CASSANDRA_HEAP/" $cfg
    sed -r -i "s/#-Xmx4G/-Xmx$CASSANDRA_HEAP/" $cfg

    # Disable CMS GC and enable G1 GC
    if [ "$CASSANDRA_VERSION" == "40x" ]; then
      cfg="/etc/cassandra/conf/jvm11-server.options"
    fi
    ToDisable=(UseParNewGC UseConcMarkSweepGC CMSParallelRemarkEnabled SurvivorRatio MaxTenuringThreshold CMSInitiatingOccupancyFraction UseCMSInitiatingOccupancyOnly CMSWaitDuration CMSParallelInitialMarkEnabled CMSEdenChunksRecordAlways CMSClassUnloadingEnabled)
    for entry in "${ToDisable[@]}"; do
      sed -r -i "/$entry/s/-XX/#-XX/" $cfg
    done
    ToEnable=(UseG1GC G1RSetUpdatingPauseTimePercent MaxGCPauseMillis InitiatingHeapOccupancyPercent ParallelGCThreads)
    for entry in "${ToEnable[@]}"; do
      sed -r -i "/$entry/s/#-XX/-XX/" $cfg
    done

    # Configure Data Disk
    disk=/dev/disk/azure/scsi1/lun0
    dev=$disk-part1
    label=LUN0
    echo "Waiting on $disk"
    while [ ! -e $(readlink -f $disk) ]; do
      printf '.'
      sleep 10
    done
    echo ';' | sfdisk $disk
    echo "Waiting on $dev"
    while [ ! -e $(readlink -f $dev) ]; do
      printf '.'
      sleep 1
    done
    mkfs -t xfs -L $label $dev
    mkdir /tmp/__cassandra
    mv /var/lib/cassandra/* /tmp/__cassandra/
    mount -L $label /var/lib/cassandra
    mv /tmp/__cassandra/* /var/lib/cassandra/
    echo "LABEL=$label /var/lib/cassandra auto defaults,noatime 0 0" >> /etc/fstab

    # Fix Permissions
    chown cassandra:cassandra /etc/cassandra/jmxremote.*

    touch /etc/cassandra/.configured

- owner: root:root
  permissions: '0400'
  path: /etc/cassandra/jmxremote.password
  content: |
    monitorRole QED
    controlRole R&D
    cassandra cassandra

- owner: root:root
  permissions: '0400'
  path: /etc/cassandra/jmxremote.access
  content: |
    monitorRole readonly
    cassandra readwrite
    controlRole readwrite \
      create javax.management.monitor.*,javax.management.timer.* \
      unregister

- owner: root:root
  permissions: '0400'
  path: /etc/snmp/snmpd.conf
  content: |
    rocommunity public default
    syslocation Azure
    syscontact IT
    dontLogTCPWrappersConnects yes
    disk /var/lib/cassandra

yum_repos:
  cassandra:
    name: Apache cassandra
    baseurl: https://downloads.apache.org/cassandra/redhat/$CASSANDRA_VERSION/
    gpgkey: https://downloads.apache.org/cassandra/KEYS
    enabled: true
    gpgcheck: true

packages:
- net-snmp
- net-snmp-utils
- cassandra
- cassandra-tools

runcmd:
- sysctl --system
- systemctl enable --now snmpd
- /etc/cassandra/configure.sh
- /etc/cassandra/bootstrap.sh
```

:::danger
Even though Cassandra 4 supports OpenJDK 11, that support is considered experimental, as mentioned [here](https://cassandra.apache.org/doc/latest/cassandra/new/java11.html). However, some people report great results with OpenJDK 16 and ZGC instead of G1GC.
:::

:::warning
For simplicity, I'm extracting the last digit from the `hostname` and using it as part of the `rack` name. You can apply other rules if needed; for instance, if you're deploying multiple instances of Cassandra on the same physical machine, you could use the full `hostname` as the `rack`, or whatever makes sense to you. Please keep in mind that this affects how data replicates across the cluster.
:::

Create the Cassandra cluster on CentOS 8 VMs:

```bash=
envsubst "$(env | cut -d= -f1 | sed -e 's/^/$/')" < /tmp/cassandra-template.yaml > /tmp/cassandra.yaml

for i in {1..3}; do
  VM_NAME="$USER-cassandra$i"
  VM_IP="$SUBNET_BASE.1$i"
  echo "Creating VM $VM_NAME ($VM_IP)..."
  az vm create --resource-group $RG_NAME --name $VM_NAME \
    --size $CASSANDRA_VM_SIZE \
    --image OpenLogic:CentOS:8_4:latest \
    --admin-username $USER \
    --ssh-key-values ~/.ssh/id_rsa.pub \
    --vnet-name $VNET_NAME \
    --subnet $VNET_SUBNET_NAME \
    --private-ip-address $VM_IP \
    --public-ip-address-dns-name $VM_NAME \
    --public-ip-sku Standard \
    --data-disk-sizes-gb $CASSANDRA_DISK_SIZE \
    --custom-data /tmp/cassandra.yaml \
    --tags Owner=$USER \
    --no-wait
done
```

There is no need to open ports, as any VM can reach any other VM on any port by default, and the cluster won't be exposed to the internet (except for SSH via the public FQDNs). Also note that static IPs are used because that's a Cassandra requirement.

All the VMs will be created simultaneously, although the bootstrap script ensures that only one node joins the cluster at a time.
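Once the instances finish their `cloud-init` run, you can confirm that all three nodes joined the ring by querying the seed node. A quick check (a sketch; it assumes your SSH key was accepted and uses the public FQDN pattern described earlier):

```bash=
# All three nodes should eventually report UN (Up/Normal)
ssh $CASSANDRA_SEED.$LOCATION.cloudapp.azure.com nodetool status
```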
## Create a VM for OpenNMS

Create a [cloud-init](https://cloudinit.readthedocs.io/en/latest/) script to deploy OpenNMS and PostgreSQL on Ubuntu with the following content, and save it at `/tmp/opennms-template.yaml`:

```yaml=
#cloud-config
package_upgrade: true

write_files:
- owner: root:root
  path: /etc/opennms-overlay/default-foreign-source.xml
  content: |
    <foreign-source xmlns="http://xmlns.opennms.org/xsd/config/foreign-source" name="default" date-stamp="2021-03-31T00:00:00.000Z">
      <scan-interval>1d</scan-interval>
      <detectors>
        <detector name="ICMP" class="org.opennms.netmgt.provision.detector.icmp.IcmpDetector"/>
        <detector name="SNMP" class="org.opennms.netmgt.provision.detector.snmp.SnmpDetector"/>
      </detectors>
      <policies/>
    </foreign-source>

- owner: root:root
  path: /etc/opennms-overlay/opennms.properties.d/newts.properties
  content: |
    org.opennms.timeseries.strategy=newts
    org.opennms.newts.config.hostname=$CASSANDRA_SEED
    org.opennms.newts.config.keyspace=$NEWTS_KEYSPACE_NAME
    org.opennms.newts.config.port=9042
    org.opennms.newts.config.read_consistency=ONE
    org.opennms.newts.config.write_consistency=ANY
    org.opennms.newts.config.resource_shard=604800
    org.opennms.newts.config.ttl=31540000
    org.opennms.newts.config.cache.priming.enable=true
    org.opennms.newts.config.cache.priming.block_ms=60000
    # The following settings must be tuned in production
    org.opennms.newts.config.writer_threads=2
    org.opennms.newts.config.ring_buffer_size=131072
    org.opennms.newts.config.cache.max_entries=131072

- owner: root:root
  permissions: '0750'
  path: /etc/opennms-overlay/bootstrap.sh
  content: |
    #!/bin/bash

    systemctl --now enable postgresql

    sudo -u postgres createuser opennms
    sudo -u postgres psql -c "ALTER USER postgres WITH PASSWORD 'postgres';"
    sudo -u postgres psql -c "ALTER USER opennms WITH PASSWORD 'opennms';"

    sed -r -i 's/password=""/password="postgres"/' /etc/opennms/opennms-datasources.xml
    sed -r -i '/0.0.0.0:61616/s/([<][!]--|--[>])//g' /etc/opennms/opennms-activemq.xml
    sed -r -i '/enabled="false"/{$!{N;s/ enabled="false"[>]\n(.*OpenNMS:Name=Syslogd.*)/>\n\1/}}' /etc/opennms/service-configuration.xml

    rsync -avr /etc/opennms-overlay/ /etc/opennms/
    echo 'JAVA_HEAP_SIZE=$ONMS_HEAP' > /etc/opennms/opennms.conf

    sed -r -i "s/cassandra-username/cassandra/" /etc/opennms/poller-configuration.xml
    sed -r -i "s/cassandra-password/cassandra/" /etc/opennms/poller-configuration.xml
    sed -r -i "s/cassandra-username/cassandra/" /etc/opennms/collectd-configuration.xml
    sed -r -i "s/cassandra-password/cassandra/" /etc/opennms/collectd-configuration.xml

    /usr/share/opennms/bin/runjava -s
    /usr/share/opennms/bin/fix-permissions
    /usr/share/opennms/bin/install -dis

    until echo -n > /dev/tcp/$CASSANDRA_SEED/9042; do
      echo "### Cassandra seed unavailable - sleeping"
      sleep 5
    done

    systemctl --now enable opennms

- owner: root:root
  permissions: '0400'
  path: /etc/snmp/snmpd.conf
  content: |
    rocommunity public default
    syslocation Azure
    syscontact IT
    dontLogTCPWrappersConnects yes
    disk /

apt:
  preserve_sources_list: true
  sources:
    opennms:
      source: deb https://debian.opennms.org stable main

packages:
- snmp
- snmpd
- opennms
- opennms-webapp-hawtio

bootcmd:
- curl -s https://debian.opennms.org/OPENNMS-GPG-KEY | apt-key add -

runcmd:
- /etc/opennms-overlay/bootstrap.sh
```

The above installs the latest OpenJDK 11, the latest PostgreSQL, and the latest OpenNMS Horizon. I added the most basic configuration for PostgreSQL to work with authentication. The embedded ActiveMQ is enabled, as well as Syslogd.
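Before creating the VM, it can help to render the template locally and confirm that `envsubst` resolves the variables as expected. A quick illustrative check:

```bash=
# Illustrative: confirm the Newts settings point at the Cassandra seed
envsubst < /tmp/opennms-template.yaml | grep "org.opennms.newts.config"
```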
Create an Ubuntu VM for OpenNMS:

```bash=
envsubst < /tmp/opennms-template.yaml > /tmp/opennms.yaml

az vm create --resource-group $RG_NAME --name $ONMS_VM_NAME \
  --size $ONMS_VM_SIZE \
  --image UbuntuLTS \
  --admin-username $USER \
  --ssh-key-values ~/.ssh/id_rsa.pub \
  --vnet-name $VNET_NAME \
  --subnet $VNET_SUBNET_NAME \
  --private-ip-address "$SUBNET_BASE.100" \
  --public-ip-address-dns-name $ONMS_VM_NAME \
  --public-ip-sku Standard \
  --custom-data /tmp/opennms.yaml \
  --tags Owner=$USER \
  --output table

az vm open-port -g $RG_NAME -n $ONMS_VM_NAME \
  --port 61616 --priority 100 -o table

az vm open-port -g $RG_NAME -n $ONMS_VM_NAME \
  --port 8980 --priority 200 -o table
```

Keep in mind that the `cloud-init` process starts once the VM is running, meaning you should wait about 5 minutes after `az vm create` finishes to see OpenNMS up and running.

:::warning
In case there is a problem, SSH into the VM using the public FQDN and the provided credentials, and check `/var/log/cloud-init-output.log` to verify the progress and status of the `cloud-init` execution.
:::

:::success
You can also SSH into the Cassandra VMs from the OpenNMS VM through their private hostnames (for instance, `agalue-cassandra1`), without going through the public FQDNs.
:::

## Create a VM for Cassandra Reaper

:::warning
This is an optional step; it is not required to use Cassandra.
:::

Create a [cloud-init](https://cloudinit.readthedocs.io/en/latest/) script to deploy Cassandra Reaper on Ubuntu with the following content, and save it at `/tmp/reaper-template.yaml`:

```yaml=
#cloud-config
package_upgrade: true

write_files:
- owner: root:root
  permissions: '0400'
  path: /etc/snmp/snmpd.conf
  content: |
    rocommunity public default
    syslocation Azure
    syscontact IT
    dontLogTCPWrappersConnects yes
    disk /

- owner: root:root
  permissions: '0644'
  path: /etc/cassandra-reaper/cassandra-reaper.yaml
  content: |
    segmentCountPerNode: 64
    repairParallelism: SEQUENTIAL
    repairIntensity: 0.9
    scheduleDaysBetween: 7
    repairRunThreadCount: 15
    hangingRepairTimeoutMins: 30
    storageType: cassandra
    enableCrossOrigin: true
    incrementalRepair: false
    blacklistTwcsTables: true
    enableDynamicSeedList: true
    repairManagerSchedulingIntervalSeconds: 10
    jmxConnectionTimeoutInSeconds: 5
    useAddressTranslator: false
    maxParallelRepairs: 10
    purgeRecordsAfterInDays: 30
    numberOfRunsToKeepPerUnit: 10
    datacenterAvailability: ALL

    jmxAuth:
      username: cassandra
      password: cassandra

    logging:
      level: INFO
      loggers:
        io.dropwizard: WARN
        org.eclipse.jetty: WARN
      appenders:
        - type: console
          logFormat: "%-6level [%d] [%t] %logger{5} - %msg %n"
          threshold: WARN
        - type: file
          logFormat: "%-6level [%d] [%t] %logger{5} - %msg %n"
          currentLogFilename: /var/log/cassandra-reaper/reaper.log
          archivedLogFilenamePattern: /var/log/cassandra-reaper/reaper-%d.log.gz
          archivedFileCount: 20

    server:
      type: default
      applicationConnectors:
        - type: http
          port: 8080
          bindHost: 0.0.0.0
      adminConnectors:
        - type: http
          port: 8081
          bindHost: 0.0.0.0
      requestLog:
        appenders: []

    cassandra:
      clusterName: "$CASSANDRA_CLUSTER_NAME"
      contactPoints: ["$CASSANDRA_SEED"]
      keyspace: reaper_db

    autoScheduling:
      enabled: false
      initialDelayPeriod: PT15S
      periodBetweenPolls: PT10M
      timeBeforeFirstSchedule: PT5M
      scheduleSpreadPeriod: PT6H

    metrics:
      frequency: 1 minute
      reporters:
        - type: log
          logger: metrics

    accessControl:
      sessionTimeout: PT10M
      shiro:
        iniConfigs: ["classpath:shiro.ini"]

apt:
  preserve_sources_list: true
  sources:
    cassandra-reaper:
      source: deb https://dl.bintray.com/thelastpickle/reaper-deb wheezy main

packages:
- snmp
- snmpd
- openjdk-8-jdk
- reaper

bootcmd:
- apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 2895100917357435

runcmd:
- systemctl enable cassandra-reaper
- systemctl start cassandra-reaper
```

Then,

```bash=
envsubst < /tmp/reaper-template.yaml > /tmp/reaper.yaml

az vm create --resource-group $RG_NAME --name $REAPER_VM_NAME \
  --size Standard_D2s_v3 \
  --image UbuntuLTS \
  --admin-username $USER \
  --ssh-key-values ~/.ssh/id_rsa.pub \
  --vnet-name $VNET_NAME \
  --subnet $VNET_SUBNET_NAME \
  --private-ip-address "$SUBNET_BASE.101" \
  --public-ip-address-dns-name $REAPER_VM_NAME \
  --public-ip-sku Standard \
  --custom-data /tmp/reaper.yaml \
  --tags Owner=$USER \
  --output table

az vm open-port -g $RG_NAME -n $REAPER_VM_NAME --port 8080 --priority 200 -o table
```

## Monitor the infrastructure

Wait until OpenNMS is up and running, then execute the following to start monitoring all the Cassandra servers, the OpenNMS server, and the Reaper server via SNMP and JMX.

```bash=
cat <<EOF >/tmp/OpenNMS.xml
<?xml version="1.0"?>
<model-import date-stamp="$(date +"%Y-%m-%dT%T.000Z")" foreign-source="OpenNMS">
  <node foreign-id="onms-server" node-label="onms-server">
    <interface ip-addr="$SUBNET_BASE.100" status="1" snmp-primary="P"/>
    <interface ip-addr="127.0.0.1" status="1" snmp-primary="N">
      <monitored-service service-name="OpenNMS-JVM"/>
      <monitored-service service-name="Postgres"/>
    </interface>
  </node>
  <node foreign-id="reaper" node-label="reaper">
    <interface ip-addr="$SUBNET_BASE.101" status="1" snmp-primary="P">
      <monitored-service service-name="HTTP-8080"/>
    </interface>
  </node>
EOF

for i in {1..3}; do
cat <<EOF >>/tmp/OpenNMS.xml
  <node foreign-id="cassandra$i" node-label="cassandra$i">
    <interface ip-addr="$SUBNET_BASE.1$i" status="1" snmp-primary="P">
      <monitored-service service-name="JMX-Cassandra"/>
      <monitored-service service-name="JMX-Cassandra-Newts"/>
    </interface>
  </node>
EOF
done

cat <<EOF >>/tmp/OpenNMS.xml
</model-import>
EOF

ONMS_FQDN="$ONMS_VM_NAME.$LOCATION.cloudapp.azure.com"

curl -v -u admin:admin \
  -H 'Content-Type: application/xml' -d @/tmp/OpenNMS.xml \
  http://$ONMS_FQDN:8980/opennms/rest/requisitions

curl -v -u admin:admin -X PUT \
  http://$ONMS_FQDN:8980/opennms/rest/requisitions/OpenNMS/import
```

## Create a Minion VM on your network

Create the following [cloud-init](https://cloudinit.readthedocs.io/en/latest/) template to create a Minion (assuming the embedded ActiveMQ within OpenNMS is in place), and save it as `/tmp/minion-template.yaml`:

```yaml=
#cloud-config
package_upgrade: true

write_files:
- owner: root:root
  path: /tmp/org.opennms.minion.controller.cfg
  content: |
    location=$MINION_LOCATION
    id=$MINION_ID
    http-url=http://$ONMS_FQDN:8980/opennms
    broker-url=failover:tcp://$ONMS_FQDN:61616

apt:
  preserve_sources_list: true
  sources:
    opennms:
      source: deb https://debian.opennms.org stable main

packages:
- opennms-minion

bootcmd:
- curl -s https://debian.opennms.org/OPENNMS-GPG-KEY | apt-key add -

runcmd:
- mv -f /tmp/org.opennms.minion.controller.cfg /etc/minion/
- sed -i -r 's/# export JAVA_MIN_MEM=.*/export JAVA_MIN_MEM="$MINION_HEAP_SIZE"/' /etc/default/minion
- sed -i -r 's/# export JAVA_MAX_MEM=.*/export JAVA_MAX_MEM="$MINION_HEAP_SIZE"/' /etc/default/minion
- /usr/share/minion/bin/scvcli set opennms.http admin admin
- /usr/share/minion/bin/scvcli set opennms.broker admin admin
- systemctl --now enable minion
```

:::info
Note the usage of environment variables within the YAML template. We will substitute them before creating the VM.
:::

Then, create the runtime template:

```bash=
export MINION_ID="minion01"
export MINION_LOCATION="Durham"
export MINION_HEAP_SIZE="1g"
export ONMS_FQDN="$ONMS_VM_NAME.$LOCATION.cloudapp.azure.com"

envsubst < /tmp/minion-template.yaml > /tmp/$MINION_ID.yaml
```

Then, start the new Minion via `multipass` with one core and 2 GB of RAM:

```bash=
multipass launch -c 1 -m 2G -n $MINION_ID --cloud-init /tmp/$MINION_ID.yaml
```

## Clean Up

When you're done, make sure to delete the cloud resources. If you created the resource group just for this exercise, you can remove all the resources with the following command:

```bash=
az group delete -g $RG_NAME
```

If you're using an existing resource group that you cannot remove, make sure to remove only the resources created in this tutorial. All of them should be easy to identify, as they contain the username and the VM name as part of the resource name. The easiest way is to use the Azure Portal for this operation. Alternatively,

```bash=
IDS=($(az resource list \
  --resource-group $RG_NAME \
  --query "[?contains(name,'$USER-') && type!='Microsoft.Compute/disks']".id \
  --output tsv | tr '\n' ' '))
for id in "${IDS[@]}"; do
  echo "Removing $id"
  az resource delete --ids "$id" --verbose
done

DISKS=($(az resource list \
  --resource-group $RG_NAME \
  --query "[?contains(name,'$USER-') && type=='Microsoft.Compute/disks']".id \
  --output tsv | tr '\n' ' '))
for id in "${DISKS[@]}"; do
  echo "Removing $id"
  az resource delete --ids "$id" --verbose
done
```

The deletion happens in two passes because disks cannot be removed while they are still attached to VMs. For this reason, the first pass excludes the disks, and the second pass removes them once the VMs are gone.

Then clean up the local resources:

```bash=
multipass delete $MINION_ID
multipass purge
```
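If you used a shared resource group, you can verify that nothing was left behind by reusing the same name filter as above. A quick check:

```bash=
# Should return an empty table once everything has been removed
az resource list --resource-group $RG_NAME \
  --query "[?contains(name,'$USER-')].name" \
  --output table
```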