How to Manage and Optimise Alerts

With so many alerting methods included in StatusCake, it can be hard to find the right balance of alerting method and frequency. In this article we look at alerting in more detail and at some of the ways to make monitoring alerts more efficient and constructive.

Types of alerting and things to consider:

Email – Email alerts are a reliable and relatively swift way to get your notifications, but no single alerting method is foolproof, so there are some recommended steps to make sure your email alerts always come through successfully.

First off, it’s a great idea to whitelist our sender address “[email protected]” in your inbox. Many of our messages are similar in content due to the nature of our service, so it’s not uncommon for them to be filtered into spam. By the same token, you can filter your StatusCake alerts into a dedicated folder so that they do not clog a mailbox that is also used for other purposes.

SMS – SMS alerting is considered slightly less reliable than email, because delivery depends almost entirely on the mobile network and on your phone’s reception. These factors mean you can see larger variations in delivery time with this type of alerting.

If you are consistently receiving a large number of downtime alerts through SMS, it’s a good idea to check regularly that your phone or network has not placed an automatic block on the messages. We find that most users opt for Mobile Push alerts instead, owing to their increased reliability and the lack of associated cost.

Mobile Push – On its own this alerting method can suffer from some of the same issues as SMS, because it needs a stable internet connection, but it is generally much more reliable. Paired with one or more other alerting methods in the same contact group, it makes for a very reliable alerting combination.

We offer direct integrations for Mobile push alerts with the following services:
Pushover
Slack
PagerDuty
Pushbullet
VictorOps

Making use of these integrations is a great choice, as it can both reduce the cost of alerting (if you are using SMS) and bring in new functionality. With PagerDuty, for example, phone calls can be sent in place of the normal alerts, giving you even more flexibility in how you are notified.

Alerting Frequency and Management

Alerting frequency is a big factor in how fast problems are picked up and handled. Below we review some important settings and considerations to take into account when setting up alerts, on both the test and the contact group side:

Alert Delay Rate – The Alert Delay Rate defines how long we wait before notifying you of a downtime. If the setting is longer than a given downtime, you will not receive an alert for it, so it’s important to adjust this value to suit each test. If you really aren’t worried about 1-2 minute downtimes, you can set this value to 2 and you won’t be notified of them, but for crucial alerts that need immediate attention you’d want a setting of 0 for instant alerts.
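To make the effect of this setting concrete, here is a minimal sketch of the rule described above. It is only a model of the behaviour, not StatusCake code: the function name `should_alert` is made up for illustration, durations are assumed to be whole minutes, and the handling of a downtime that lasts exactly as long as the delay is an assumption.

```python
def should_alert(downtime_minutes: int, alert_delay_minutes: int) -> bool:
    """Return True if a downtime lasts long enough for an alert to be sent.

    Hypothetical model: the alert goes out once the site has been down for
    the configured delay. A downtime exactly equal to the delay is treated
    as alerting here, which is an assumption about the boundary case.
    """
    if downtime_minutes < 0 or alert_delay_minutes < 0:
        raise ValueError("durations must be non-negative")
    # If the delay is longer than the downtime, the site recovered
    # before the alert was due, so nothing is sent.
    return downtime_minutes >= alert_delay_minutes


# Compare an instant setting (0) with a 2 minute delay.
for downtime in (1, 5):
    for delay in (0, 2):
        result = should_alert(downtime, delay)
        print(f"downtime={downtime}m delay={delay}m alert={result}")
```

With a delay of 0 every downtime alerts, while a delay of 2 silently drops the 1 minute blip and still reports the 5 minute outage.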

Repeat Alert – This setting is found in the Contact Group, and it defines how often the alert is repeated while the issue has not been detected as resolved. Again, some balancing is needed here: too many alerts can be a distraction for the team, but there may be cases where a reminder is essential. For this reason we allow you to configure repeats at intervals from 1 minute to 1 hour. Email and Mobile Push alerts will repeat; SMS alerts will not.
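When choosing a repeat interval, it can help to count how many notifications a long outage would generate. The sketch below models the rules stated above (repeats between 1 and 60 minutes, no repeats for SMS). The function and channel names are hypothetical, the first alert is assumed to go out at minute 0 with no alert delay, and the exact timing of repeats in StatusCake may differ.

```python
# Which channels repeat, as described in the article. The key names are
# illustrative labels, not StatusCake identifiers.
CHANNEL_REPEATS = {"email": True, "push": True, "sms": False}


def alert_minutes(downtime_minutes: int, repeat_every: int, channel: str) -> list[int]:
    """Return the minute offsets at which alerts would be sent for one outage."""
    if downtime_minutes <= 0:
        raise ValueError("downtime_minutes must be positive")
    if not 1 <= repeat_every <= 60:
        raise ValueError("repeat_every must be between 1 and 60 minutes")
    if channel not in CHANNEL_REPEATS:
        raise ValueError(f"unknown channel: {channel}")
    if not CHANNEL_REPEATS[channel]:
        # Non-repeating channels only receive the initial alert.
        return [0]
    # Initial alert at minute 0, then one repeat per interval while still down.
    return list(range(0, downtime_minutes, repeat_every))


print(alert_minutes(35, 15, "email"))  # initial alert plus repeats
print(alert_minutes(35, 15, "sms"))    # initial alert only
print(len(alert_minutes(60, 1, "push")), "push alerts in an hour at a 1 minute repeat")
```

Under these assumptions, a 1 minute repeat produces 60 notifications per hour of downtime, which is the kind of volume worth weighing against how essential the reminder is.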

Separation of alerting types – Especially in larger businesses, it can be very confusing for staff to have all alert types arriving in the same place, which is why it’s a good idea to use our Contact Groups to separate which department or contact chain each category of alerts is sent to. This both reduces pressure on “all-rounder” staff and makes things tidier and easier to manage. An effective example would be to have your uptime tests report to a different group from your SSL tests, which could be managed by a different team.
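One way to plan this separation before creating the Contact Groups is to write the mapping down explicitly. The sketch below is only a planning aid, not a StatusCake configuration format: the group names, addresses, and the `contact_group_for` helper are all hypothetical.

```python
# Hypothetical plan: each category of test reports to its own contact group.
CONTACT_GROUPS = {
    "uptime": {"group": "Operations", "recipients": ["ops@example.com"]},
    "ssl": {"group": "Security", "recipients": ["security@example.com"]},
}


def contact_group_for(test_type: str) -> str:
    """Return the name of the contact group that handles a category of test."""
    try:
        return CONTACT_GROUPS[test_type]["group"]
    except KeyError:
        # An unmapped category is a gap in the plan, so fail loudly.
        raise ValueError(f"no contact group planned for test type: {test_type}") from None


for test_type in CONTACT_GROUPS:
    print(f"{test_type} alerts go to {contact_group_for(test_type)}")
```

Raising an error for an unmapped category makes it obvious when a new kind of test has been added without deciding who should hear about it.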

Using Dashboards for a live overview – Even when you are covered by our alerts, you can gain extra peace of mind by running one of our status Dashboards (Business plan +) on a screen at the office. These give you a live display of your vital operations and an extra layer of safety: you can see problems as they are actually happening, which means an even faster reaction time.

 
