# Simple OpenNMS Environment using Elasticsearch for Flows
This lab starts an OpenNMS instance and a three-node Elasticsearch cluster in the cloud, for learning purposes.
To monitor a network, it is advised to enable [ActiveMQ](https://hackmd.io/igsj5WstQROskqq2AtqdIg) or [Kafka](https://hackmd.io/A4IVlzaSSLe-RXLq2kkbrg) and use Minions. For simplicity, the embedded ActiveMQ will be enabled, and a single Minion will be started to test flows.
## Requirements
* Have an [Azure Subscription](https://azure.microsoft.com/en-us/free/) ready.
* Install [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli)
* Install [Multipass](https://multipass.run/)
The scripts used throughout this tutorial rely on [envsubst](https://www.gnu.org/software/gettext/manual/html_node/envsubst-Invocation.html), so make sure to have it installed.
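If `envsubst` is not available, it usually ships with the gettext tooling; for example (package names are assumptions and may vary per platform):
```bash=
# Debian/Ubuntu
sudo apt-get install -y gettext-base
# macOS with Homebrew
brew install gettext
```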
:::info
Make sure to log into Azure using `az login` prior to creating the VMs.
:::
:::danger
If you have a restricted account in Azure, make sure you have the `Network Contributor` role and the `Virtual Machine Contributor` role associated with your Azure AD account for the resource group on which you would like to create the VMs. Of course, `Owner` or `Contributor` at the resource group level also works.
:::
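To confirm which account and subscription the CLI is currently using, you can run:
```bash=
az account show -o table
```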
## Create common Environment Variables
```bash=
export RG_NAME="OpenNMS" # Change it to use a shared one
export LOCATION="eastus" # Azure Region
export VNET_CIDR="13.0.0.0/16"
export VNET_SUBNET="13.0.1.0/24"
export VNET_NAME="$USER-vnet"
export VNET_SUBNET_NAME="subnet1"
export ONMS_HEAP="4096" # Expressed in MB and must fit ONMS_VM_SIZE
export ONMS_VM_SIZE="Standard_D2s_v3" # 2 VCPU, 8 GB of RAM
export ONMS_VM_NAME="$USER-opennms"
export ELASTIC_VERSION="7.6.2" # Must match OpenNMS Drift Plugin
export ELASTIC_VM_SIZE="Standard_D2s_v3" # 2 VCPU, 8 GB of RAM
export ELASTIC_HEAP="4g" # Must fit ELASTIC_VM_SIZE
export ELASTIC_SHARDS="9"
export ELASTIC_REPLICAS="2" # Must be less than number of data nodes
export KIBANA_VM_NAME="$USER-kibana"
export KIBANA_VM_SIZE="Standard_D2s_v3" # 2 VCPU, 8 GB of RAM
```
:::info
Feel free to change the content if needed, but if you're planning to change the VNet settings, make sure to adjust the VM creation commands.
:::
We're going to leverage Azure DNS services to avoid having to remember and use public IP addresses.
In Azure, the default public DNS names follow this pattern:
```
<vm-name>.<location>.cloudapp.azure.com
```
To make each VM's FQDN unique, we're going to prepend the username to the VM name. For instance, the OpenNMS FQDN would be:
```
agalue-opennms.eastus.cloudapp.azure.com
```
This FQDN is what we'll use to access the VM via SSH and to configure Minions.
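With the environment variables defined earlier, you can print the resulting FQDNs at any time; for example:
```bash=
echo "OpenNMS: $ONMS_VM_NAME.$LOCATION.cloudapp.azure.com"
echo "Kibana:  $KIBANA_VM_NAME.$LOCATION.cloudapp.azure.com"
```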
## Create the Azure Resource Group
This is a necessary step, as every resource in Azure must belong to a resource group and a location.
However, you can omit the following command and use an existing one if you prefer. In that case, make sure to adjust the environment variable `RG_NAME` so the subsequent commands will target the correct group.
```bash=
az group create -n $RG_NAME -l $LOCATION
```
## Create the Virtual Network
```bash=
az network vnet create -g $RG_NAME \
--name $VNET_NAME \
--address-prefix $VNET_CIDR \
--subnet-name $VNET_SUBNET_NAME \
--subnet-prefix $VNET_SUBNET
```
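If you want to double-check the result, you can inspect the VNet and its subnet:
```bash=
az network vnet show -g $RG_NAME -n $VNET_NAME -o table
az network vnet subnet list -g $RG_NAME --vnet-name $VNET_NAME -o table
```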
## Create the Elasticsearch cluster
Create the [cloud-init](https://cloudinit.readthedocs.io/en/latest/) YAML file `/tmp/elasticsearch-template.yaml` with the following content to build an Ubuntu VM with Elasticsearch 7.6.2, OpenJDK 11, and the [OpenNMS Drift Plugin](https://github.com/OpenNMS/elasticsearch-drift-plugin), whose version must match the Elasticsearch version (a mandatory requirement).
For simplicity, each instance will have all roles (i.e., master, data, coordinator). In production, it is advised to have three dedicated master nodes, two coordinator nodes, and as many data nodes as needed. That requires defining the roles in each case, but it falls outside the scope of this guide.
```yaml=
#cloud-config
package_upgrade: false
write_files:
- owner: root:root
path: /etc/systemd/system/elasticsearch.service.d/override.conf
content: |
[Service]
LimitMEMLOCK=infinity
- owner: root:root
permissions: '0750'
path: /etc/elasticsearch/configure.sh
content: |
#!/bin/bash
if [ -f "/etc/elasticsearch/.configured" ]; then
echo "Elasticsearch node already configured."
exit
fi
cat <<EOF >>/etc/elasticsearch/elasticsearch.yml
# Basic Configuration
cluster.name: OpenNMS
node.name: $(hostname)
network.host: $(ifconfig eth0 | grep 'inet[^6]' | awk '{print $2}')
xpack.monitoring.collection.enabled: true
bootstrap.memory_lock: true
search.max_buckets: 50000
discovery.seed_hosts: ["$USER-elastic1"]
cluster.initial_master_nodes: $USER-elastic1,$USER-elastic2,$USER-elastic3
EOF
sed -i -r 's/^(-Xm[xs])1g/\1$ELASTIC_HEAP/' /etc/elasticsearch/jvm.options
systemctl --now enable elasticsearch
touch /etc/elasticsearch/.configured
packages:
- net-tools
- apt-transport-https
- openjdk-11-jre-headless
runcmd:
- wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-$ELASTIC_VERSION-amd64.deb
- dpkg -i elasticsearch-$ELASTIC_VERSION-amd64.deb
- wget https://github.com/OpenNMS/elasticsearch-drift-plugin/releases/download/v$ELASTIC_VERSION/elasticsearch-drift-plugin_$ELASTIC_VERSION-1_all.deb
- dpkg -i elasticsearch-drift-plugin_$ELASTIC_VERSION-1_all.deb
- /etc/elasticsearch/configure.sh
```
If you want to start a single-node cluster, make sure to update the `configure.sh` script from the above template to include:
```yaml
discovery.type: single-node
#discovery.seed_hosts: ...
#cluster.initial_master_nodes: ...
```
Create the Elasticsearch cluster:
```bash=
envsubst < /tmp/elasticsearch-template.yaml > /tmp/elasticsearch.yaml
for i in {1..3}; do
VM_NAME="$USER-elastic$i"
echo "Creating VM $VM_NAME..."
az vm create --resource-group $RG_NAME --name $VM_NAME \
--size $ELASTIC_VM_SIZE \
--image UbuntuLTS \
--admin-username $USER \
--ssh-key-values ~/.ssh/id_rsa.pub \
--vnet-name $VNET_NAME \
--subnet $VNET_SUBNET_NAME \
--public-ip-address "" \
--custom-data /tmp/elasticsearch.yaml \
--no-wait
done
```
There is no need to open ports, as VMs within the same VNet can reach each other on any port by default, and the cluster won't be exposed to the internet.
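Once cloud-init finishes on the three VMs, you can verify the cluster from any VM inside the VNet (for instance, from the OpenNMS VM created later); a minimal check against the default HTTP port:
```bash=
curl -s "http://$USER-elastic1:9200/_cluster/health?pretty"
```
A healthy three-node cluster should report `"number_of_nodes" : 3` and a `green` status.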
## Create a VM for Kibana
This is an optional but useful step, as Kibana not only helps to visualize the data in Elasticsearch but also to check the health of the cluster. It has to run in Azure to avoid exposing Elasticsearch to the Internet.
Create the template with the following content and save it at `/tmp/kibana-template.yaml`:
```yaml=
#cloud-config
package_upgrade: false
write_files:
- owner: root:root
permissions: '0750'
path: /etc/kibana/configure.sh
content: |
#!/bin/bash
cat <<EOF >>/etc/kibana/kibana.yml
# Basic Configuration
server.host: $(ifconfig eth0 | grep 'inet[^6]' | awk '{print $2}')
server.name: $(hostname)
elasticsearch.hosts: ["http://$USER-elastic1:9200"]
EOF
systemctl --now enable kibana
packages:
- net-tools
- apt-transport-https
runcmd:
- wget https://artifacts.elastic.co/downloads/kibana/kibana-$ELASTIC_VERSION-amd64.deb
- dpkg -i kibana-$ELASTIC_VERSION-amd64.deb
- /etc/kibana/configure.sh
```
Create the VM:
```bash=
envsubst < /tmp/kibana-template.yaml > /tmp/kibana.yaml
az vm create --resource-group $RG_NAME --name $KIBANA_VM_NAME \
--size $KIBANA_VM_SIZE \
--image UbuntuLTS \
--admin-username $USER \
--ssh-key-values ~/.ssh/id_rsa.pub \
--vnet-name $VNET_NAME \
--subnet $VNET_SUBNET_NAME \
--public-ip-address-dns-name $KIBANA_VM_NAME \
--custom-data /tmp/kibana.yaml \
--output table
az vm open-port -g $RG_NAME -n $KIBANA_VM_NAME --port 5601 --priority 100 -o table
```
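Once cloud-init finishes, Kibana should answer on port 5601; a quick sanity check from your workstation, assuming the `/api/status` endpoint of this Kibana version:
```bash=
curl -s "http://$KIBANA_VM_NAME.$LOCATION.cloudapp.azure.com:5601/api/status" | head -c 300
```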
## Create a VM for OpenNMS
Create a [cloud-init](https://cloudinit.readthedocs.io/en/latest/) script to deploy OpenNMS in Ubuntu with the following content and store it at `/tmp/opennms-template.yaml`:
```yaml=
#cloud-config
package_upgrade: true
write_files:
- owner: root:root
path: /etc/opennms-overlay/opennms.properties.d/rrd.properties
content: |
org.opennms.rrd.storeByGroup=true
org.opennms.rrd.storeByForeignSource=true
org.opennms.rrd.strategyClass=org.opennms.netmgt.rrd.rrdtool.MultithreadedJniRrdStrategy
org.opennms.rrd.interfaceJar=/usr/share/java/jrrd2.jar
opennms.library.jrrd2=/usr/lib/jni/libjrrd2.so
- owner: root:root
path: /etc/opennms-overlay/org.opennms.features.flows.persistence.elastic.cfg
content: |
elasticUrl=http://$USER-elastic1:9200
globalElasticUser=elastic
globalElasticPassword=elastic
connTimeout=30000
readTimeout=300000
retries=1
elasticIndexStrategy=daily
# The following settings should be consistent with your ES cluster
settings.index.number_of_shards=$ELASTIC_SHARDS
settings.index.number_of_replicas=$ELASTIC_REPLICAS
apt:
preserve_sources_list: true
sources:
opennms:
source: deb https://debian.opennms.org stable main
packages:
- opennms
- opennms-webapp-hawtio
- opennms-helm
- jrrd2
bootcmd:
- curl -s https://debian.opennms.org/OPENNMS-GPG-KEY | apt-key add -
runcmd:
# Configure PostgreSQL
- systemctl --now enable postgresql
- sudo -u postgres createuser opennms
- sudo -u postgres psql -c "ALTER USER postgres WITH PASSWORD 'postgres';"
- sudo -u postgres psql -c "ALTER USER opennms WITH PASSWORD 'opennms';"
- sed -r -i 's/password=""/password="postgres"/' /etc/opennms/opennms-datasources.xml
# Configure ActiveMQ
- sed -r -i '/0.0.0.0:61616/s/([<][!]--|--[>])//g' /etc/opennms/opennms-activemq.xml
# Enable Syslogd
- sed -r -i '/enabled="false"/{$!{N;s/ enabled="false"[>]\n(.*OpenNMS:Name=Syslogd.*)/>\n\1/}}' /etc/opennms/service-configuration.xml
# Initialize and start OpenNMS
- rsync -avr /etc/opennms-overlay/ /etc/opennms/
- /usr/share/opennms/bin/runjava -s
- /usr/share/opennms/bin/install -dis
- echo 'JAVA_HEAP_SIZE=$ONMS_HEAP' > /etc/opennms/opennms.conf
- systemctl --now enable opennms
- systemctl --now enable grafana-server
```
:::warning
If you're using a single-node Elasticsearch cluster, make sure to set `settings.index.number_of_replicas=0`, and perhaps `settings.index.number_of_shards=1`.
:::
The above installs the latest OpenJDK 11, the latest PostgreSQL, and the latest OpenNMS Horizon. I added the most basic configuration for PostgreSQL to work with authentication. The embedded ActiveMQ is enabled, as well as Syslogd.
Create an Ubuntu VM for OpenNMS:
```bash=
envsubst < /tmp/opennms-template.yaml > /tmp/opennms.yaml
az vm create --resource-group $RG_NAME --name $ONMS_VM_NAME \
--size $ONMS_VM_SIZE \
--image UbuntuLTS \
--admin-username $USER \
--ssh-key-values ~/.ssh/id_rsa.pub \
--vnet-name $VNET_NAME \
--subnet $VNET_SUBNET_NAME \
--public-ip-address-dns-name $ONMS_VM_NAME \
--custom-data /tmp/opennms.yaml \
--output table
az vm open-port -g $RG_NAME -n $ONMS_VM_NAME --port 8980 --priority 200 -o table
az vm open-port -g $RG_NAME -n $ONMS_VM_NAME --port 61616 --priority 300 -o table
az vm open-port -g $RG_NAME -n $ONMS_VM_NAME --port 3000 --priority 400 -o table
```
Keep in mind that the `cloud-init` process starts once the VM is running, meaning you should wait about 5 minutes after `az vm create` finishes to see OpenNMS up and running.
:::warning
In case there is a problem, SSH into the VM using the public IP and the provided credentials and check `/var/log/cloud-init-output.log` to verify the progress and the status of the cloud-init execution.
:::
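For instance, a sketch of how to follow the progress on the OpenNMS VM (assuming `cloud-init status` is available on the chosen Ubuntu image):
```bash=
ssh $USER@$ONMS_VM_NAME.$LOCATION.cloudapp.azure.com
# Inside the VM:
cloud-init status --wait
tail -f /var/log/cloud-init-output.log
```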
:::success
You can SSH into the Elasticsearch VMs from the OpenNMS VM, as those don't have public IP addresses.
:::
## Create a Minion VM on your network
Create the following [cloud-init](https://cloudinit.readthedocs.io/en/latest/) template for a Minion (assuming the embedded ActiveMQ within OpenNMS is in place) and save it as `/tmp/minion-template.yaml`:
```yaml=
#cloud-config
package_upgrade: true
write_files:
- owner: root:root
path: /etc/minion-overlay/org.opennms.minion.controller.cfg
content: |
location=$MINION_LOCATION
id=$MINION_ID
http-url=http://$ONMS_FQDN:8980/opennms
broker-url=failover:tcp://$ONMS_FQDN:61616
- owner: root:root
path: /etc/minion-overlay/org.opennms.features.telemetry.listeners-udp-9999.cfg
content: |
name = Flows
class-name = org.opennms.netmgt.telemetry.listeners.UdpListener
parameters.port = 9999
parsers.0.name = Netflow-5
parsers.0.class-name = org.opennms.netmgt.telemetry.protocols.netflow.parser.Netflow5UdpParser
parsers.0.parameters.dnsLookupsEnabled=false
parsers.1.name = Netflow-9
parsers.1.class-name = org.opennms.netmgt.telemetry.protocols.netflow.parser.Netflow9UdpParser
parsers.1.parameters.dnsLookupsEnabled=false
parsers.2.name = SFlow
parsers.2.class-name = org.opennms.netmgt.telemetry.protocols.sflow.parser.SFlowUdpParser
parsers.2.parameters.dnsLookupsEnabled=false
apt:
preserve_sources_list: true
sources:
opennms:
source: deb https://debian.opennms.org stable main
packages:
- opennms-minion
bootcmd:
- curl -s https://debian.opennms.org/OPENNMS-GPG-KEY | apt-key add -
runcmd:
- rsync -avr /etc/minion-overlay/ /etc/minion/
- rm -f /etc/minion/org.opennms.features.telemetry.listeners.flows.cfg
- sed -i -r 's/# export JAVA_MIN_MEM=.*/export JAVA_MIN_MEM="$MINION_HEAP_SIZE"/' /etc/default/minion
- sed -i -r 's/# export JAVA_MAX_MEM=.*/export JAVA_MAX_MEM="$MINION_HEAP_SIZE"/' /etc/default/minion
- /usr/share/minion/bin/scvcli set opennms.http admin admin
- /usr/share/minion/bin/scvcli set opennms.broker admin admin
- systemctl --now enable minion
```
:::info
Note the usage of environment variables within the YAML template. We will substitute them before creating the VM.
:::
Then, create the runtime template:
```bash=
export MINION_ID="minion01"
export MINION_LOCATION="Durham"
export MINION_HEAP_SIZE="1g"
export ONMS_FQDN="$ONMS_VM_NAME.$LOCATION.cloudapp.azure.com"
envsubst < /tmp/minion-template.yaml > /tmp/$MINION_ID.yaml
```
Then, start the new Minion via `multipass` with one core and 2GB of RAM:
```bash=
multipass launch -c 1 -m 2G -n $MINION_ID --cloud-init /tmp/$MINION_ID.yaml
```
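Once the VM is up, you can verify that the Minion can reach OpenNMS through its Karaf shell; for example (the command is `health:check` on older Minion releases):
```bash=
multipass exec $MINION_ID -- sudo /usr/share/minion/bin/client "opennms:health-check"
```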
:::danger
The content you add for `parsers.X.name` will be part of the Sink API's Telemetry Topic. That has to match the Queue name for the Adapter in `telemetryd-configuration.xml` in OpenNMS, or the `name` attribute inside the `org.opennms.features.telemetry.adapters-XXXX.cfg` file when using Sentinel.
:::
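If in doubt, you can list the queue names configured on the OpenNMS side (assuming the default configuration layout):
```bash=
grep 'queue name=' /etc/opennms/telemetryd-configuration.xml
```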
## Monitor a flow-capable machine
As having a flow-capable router can be complicated, we're going to use [udpgen](https://github.com/OpenNMS/udpgen). This tool can generate Netflow packets, Syslog messages, and SNMP traps, and send them through the Minion to the OpenNMS server running in Azure.
The machine that will be running `udpgen` must be part of the OpenNMS inventory. Assuming the machine IP is `192.168.0.40`, do the following from the OpenNMS instance:
```bash=
/usr/share/opennms/bin/provision.pl requisition add Test
/usr/share/opennms/bin/provision.pl node add Test srv01 srv01
/usr/share/opennms/bin/provision.pl node set Test srv01 location Durham
/usr/share/opennms/bin/provision.pl interface add Test srv01 192.168.0.40
/usr/share/opennms/bin/provision.pl interface set Test srv01 192.168.0.40 snmp-primary P
/usr/share/opennms/bin/provision.pl requisition import Test
```
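To confirm the node was imported, you can query the OpenNMS REST API from the same instance (default credentials assumed):
```bash=
curl -u admin:admin "http://localhost:8980/opennms/rest/nodes?label=srv01"
```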
Find the IP of the Minion using `multipass list`, then execute the following from the machine added as a node above (the examples assume the Minion's IP is `192.168.75.16`):
To send SNMP Traps:
```bash=
udpgen -h 192.168.75.16 -x snmp -r 1 -p 1162
```
To send Syslog Messages:
```bash=
udpgen -h 192.168.75.16 -x syslog -r 1 -p 1514
```
To send Netflow 5 Packets:
```bash=
udpgen -h 192.168.75.16 -x netflow5 -r 1 -p 9999
```
:::success
The C++ version of `udpgen` only works on Linux. If you're on macOS, you can use the [Go](https://github.com/agalue/udpgen) version of it. Unfortunately, Windows is not supported.
:::
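After generating Netflow traffic for a minute or two, you can confirm that flow documents are being persisted by listing the flow indices from a VM inside the VNet (e.g., the OpenNMS VM); `netflow-` is the default index prefix used by OpenNMS:
```bash=
curl -s "http://$USER-elastic1:9200/_cat/indices/netflow-*?v"
```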
## Clean Up
When you're done, make sure to delete the cloud resources.
If you created the resource group for this exercise, you can remove all the resources with the following command:
```bash=
az group delete -g $RG_NAME
```
If you're using an existing resource group that you cannot remove, make sure to remove only the resources created in this tutorial. All of them should be easy to identify, as they contain the username and the VM name as part of the resource name. The easiest way is to use the Azure Portal for this operation. Alternatively,
```bash=
IDS=($(az resource list \
--resource-group $RG_NAME \
--query "[?contains(name,'$USER-') && type!='Microsoft.Compute/disks']".id \
--output tsv | tr '\n' ' '))
for id in "${IDS[@]}"; do
echo "Removing $id"
az resource delete --ids "$id" --verbose
done
DISKS=($(az resource list \
--resource-group $RG_NAME \
--query "[?contains(name,'$USER-') && type=='Microsoft.Compute/disks']".id \
--output tsv | tr '\n' ' '))
for id in "${DISKS[@]}"; do
echo "Removing $id"
az resource delete --ids "$id" --verbose
done
```
The deletion runs in two passes because disks cannot be removed before the VMs that use them. For this reason, we exclude the disks in the first pass and remove them in the second.
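To double-check that nothing created by this tutorial is left behind:
```bash=
az resource list -g $RG_NAME --query "[?contains(name,'$USER-')].name" -o table
```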
Then clean the local resources:
```bash=
multipass delete $MINION_ID
multipass purge
```