# On-Call

tl;dr: Self-hosted vs. SaaS should be carefully considered. When a self-hosted on-call solution fails, we get no alerts. The only viable options for on-call are OpsGenie or iLert.

## Self-hosted options

### Grafana OnCall OSS

https://grafana.com/docs/grafana-cloud/oncall/open-source/#production-environment

![](https://grafana.com/static/img/docs/oncall/oncall-alertworkflow.png)

The [helm chart](https://github.com/grafana/oncall/blob/dev/helm/oncall/values.yaml) requires Redis, MariaDB/PostgreSQL, and RabbitMQ (at least all of them are bundled with the chart).

Pro:
- nice integrations with Alertmanager, Loki and so on

Con:
- Self-hosted, when it fails we have no alerts
- no SMS or phone calls
- the mobile app, which can override the phone's DnD mode, only works after you connect your Grafana OnCall OSS to Grafana Cloud
- most of the features are Cloud-bound

### harpia

https://docs.harpia.io/

GitHub: https://github.com/harpia-io/harpia

Pro:
- self-hosted

Con:
- Self-hosted, when it fails we have no alerts
- needs Kafka and a bunch of other dependencies
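
Both self-hosted options share the same drawback: if the on-call system itself goes down, nothing pages us. One common way to reason about this is an external heartbeat check (a "dead man's switch") that runs outside the infrastructure it watches. The following is only a minimal sketch in Python of the staleness check such a watchdog would perform. The function name, the timestamps, and the five-minute threshold are hypothetical and not part of Grafana OnCall or harpia.

```python
from datetime import datetime, timedelta, timezone


def heartbeat_is_stale(last_seen: datetime, now: datetime, max_silence: timedelta) -> bool:
    """Return True when the last heartbeat is older than the allowed silence window."""
    return now - last_seen > max_silence


# Hypothetical values for illustration only.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_seen = now - timedelta(minutes=7)   # on-call instance last reported 7 minutes ago
max_silence = timedelta(minutes=5)       # assumed tolerance before we consider it down

if heartbeat_is_stale(last_seen, now, max_silence):
    # A real watchdog would notify through a channel that does not depend
    # on the self-hosted on-call system (assumption: e.g. a SaaS service).
    print("on-call system heartbeat missing")
```

The check is only useful if it runs somewhere that does not fail together with the self-hosted stack, which is part of why the SaaS options below are attractive.
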
## SaaS options

### Atlassian OpsGenie

Opsgenie: https://www.atlassian.com/software/opsgenie

Pro:
- Easy Prometheus integration: https://support.atlassian.com/opsgenie/docs/integrate-opsgenie-with-prometheus/
- ChatOps Slack integration: https://www.atlassian.com/software/opsgenie/slack
- no in-house maintenance
- statuspage.io integration: https://support.atlassian.com/opsgenie/docs/integrate-opsgenie-with-statuspage/

Con:
- ![](https://i.imgur.com/UupcG8R.png)

### iLert

Seems fairly straightforward.

https://www.ilert.com/

Pro:
- tons of integrations ([Prometheus](https://docs.ilert.com/integrations/prometheus), [Sentry](https://docs.ilert.com/integrations/sentry))
- [Google Cloud Functions](https://docs.ilert.com/integrations/gcf) outbound integration
- [ilert app](https://play.google.com/store/apps/details?id=de.ilert.client.iphone)
- can be configured via Terraform
- the free tier would get us quite far

Con:
- No log enrichment

![](https://i.imgur.com/Ne0nKh4.png)

### PagerDuty

Pro:
- Integration with Alertmanager
- ChatOps with Slack & MS Teams
- Statuspage.io integration

Con:
- Pricing: ![](https://i.imgur.com/z1dTkHL.png)

### Splunk On-Call (formerly VictorOps)

https://www.splunk.com/en_us/products/pricing/faqs/observability.html

Pro:
- Prometheus integration

Con:
- no pricing available
- Splunk is known to be expensive

# Status Pages

https://github.com/ivbeg/awesome-status-pages

https://enqueuezero.com/architecture/status-site.html

## Atlassian Status Page

[Pricing](https://www.atlassian.com/software/statuspage/pricing)

Example: https://status.dropbox.com/