StatusCake

Alert Rate Improvements for Test Monitoring


Here at StatusCake we’ve always maintained that in website monitoring every second counts: reputation, revenue and trust can all be lost during an outage. With this in mind, we have made some changes to our test systems today.

Previously, our system ran confirmation tests only after a test had failed. This meant that if your site was stuck loading (for example, because it was under heavy load), each confirmation could take up to your configured timeout. If you had set 4 confirmation servers and each one took thirty seconds, you wouldn’t get alerted for over 2 minutes!
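As a rough illustration of that arithmetic, here is a small sketch. It assumes the confirmations ran one after another and that each one hit the full timeout; the function name is ours for illustration and is not part of any StatusCake API.

```python
def worst_case_confirmation_delay(timeout_s: float, confirmations: int) -> float:
    """Seconds spent on confirmations alone when each one runs in turn
    and every one of them waits for the full timeout (assumed worst case)."""
    return timeout_s * confirmations


# 4 confirmation servers, 30 seconds each: 120 seconds of confirmations,
# on top of however long the original failing test took.
delay = worst_case_confirmation_delay(30, 4)
print(delay)
```

The time taken by the original test comes on top of this figure, which is why the alert arrives after more than 2 minutes.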

What’s changed?

From today, when a test is taking longer than its average load time, the confirmation servers will start work right away, all at once. If the original test completes without a problem, just a bit slower than normal, the confirmation data is dropped. However, if the original signs of a problem end up becoming a downtime trigger, all confirmation servers will already have started (if not completed) the workload they had been set.
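The flow described above could be sketched roughly as follows. This is not our production code: the `Check` type, `make_check`, `run_test` and the returned status strings are hypothetical, the checks are simulated with sleeps, and the rule that every confirmation must agree before downtime is declared is an assumption made for the example.

```python
import asyncio
from typing import Awaitable, Callable, List

# Hypothetical check type: resolves to True when the site responded OK.
Check = Callable[[], Awaitable[bool]]


def make_check(delay_s: float, ok: bool) -> Check:
    """Build a simulated check that takes delay_s seconds and reports ok."""
    async def check() -> bool:
        await asyncio.sleep(delay_s)
        return ok
    return check


async def run_test(primary: Check, confirmers: List[Check], average_load_s: float) -> str:
    primary_task = asyncio.ensure_future(primary())
    # Give the original test its usual load time before doing anything extra.
    done, _ = await asyncio.wait({primary_task}, timeout=average_load_s)
    if done and primary_task.result():
        return "up"

    # Slower than average (or already failed): start every confirmation at once.
    confirm_tasks = [asyncio.ensure_future(c()) for c in confirmers]

    if await primary_task:
        # Just slow, not down: drop the speculative confirmation data.
        for task in confirm_tasks:
            task.cancel()
        await asyncio.gather(*confirm_tasks, return_exceptions=True)
        return "up"

    results = await asyncio.gather(*confirm_tasks)
    # Assumption: downtime is confirmed only if no confirmation saw the site up.
    return "down" if not any(results) else "unconfirmed"
```

Calling `asyncio.run(run_test(make_check(0.1, False), [make_check(0.1, False)] * 4, 0.02))` in this sketch runs the four simulated confirmations alongside the slow original test rather than after it, so the total wait is close to that of a single check instead of five in a row.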

This means we’re shaving seconds off the time it takes for you to first know about an issue affecting your website, without introducing the possibility of false downtime alerts.

Along with these changes, we’ve changed how uptime is calculated. Previously, your uptime only took into account downtime beyond the trigger rate you had set, so if you had 4 minutes of downtime with a 5 minute trigger rate, your uptime would remain at 100%. Now uptime is calculated from the first validated confirmation. We made this change because we believe it better reflects real-world data.
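To make the difference concrete, here is a simplified sketch of the two calculations. The function names and the 24-hour reporting window are illustrative assumptions. We also assume, for the old rule, that only the portion of an outage beyond the trigger rate counted, and, for the new rule, that the first validated confirmation arrives at the start of the outage.

```python
def uptime_old(period_min: float, outage_min: float, trigger_min: float) -> float:
    """Old rule (as assumed here): only downtime beyond the trigger rate counts."""
    counted = max(0.0, outage_min - trigger_min)
    return 100.0 * (period_min - counted) / period_min


def uptime_new(period_min: float, outage_min: float) -> float:
    """New rule (as assumed here): downtime counts from the first validated
    confirmation, taken to be the start of the outage."""
    return 100.0 * (period_min - outage_min) / period_min


DAY_MIN = 24 * 60  # illustrative 24-hour reporting window

# The example from the text: 4 minutes of downtime, 5 minute trigger rate.
old = uptime_old(DAY_MIN, outage_min=4, trigger_min=5)
new = uptime_new(DAY_MIN, outage_min=4)
print(old, round(new, 2))
```

Under these assumptions the old calculation reports 100% for the day, while the new one reports roughly 99.72%.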

Both of these changes are part of an ongoing improvement program we have here at StatusCake. Stay tuned for more improvements (and new features!).
