StatusCake

Alert Rate Improvements for Test Monitoring


Here at StatusCake we’ve always pushed the idea that with website monitoring every second counts: reputation, revenue, and trust can all be lost in moments. It’s with this in mind that we’ve made some changes to our test systems today.

Previously our system would only run confirmation tests after a test had already failed, which means that if your site is stuck loading (for example, it’s under heavy load) each confirmation can take up to your configured timeout. If you’ve set four confirmation servers and each one takes thirty seconds, you won’t be alerted for over two minutes!
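The arithmetic behind that delay is simple: sequential confirmations add up, while parallel ones are bounded by a single timeout. A quick sketch (the server count and timeout are just the example figures above, not fixed StatusCake settings):

```python
CONFIRMATION_SERVERS = 4
TIMEOUT_SECONDS = 30  # per-server timeout, as configured on the test

# Sequential confirmations (previous behaviour): delays add up.
sequential_delay = CONFIRMATION_SERVERS * TIMEOUT_SECONDS  # 120 seconds

# Parallel confirmations (new behaviour): worst case is one timeout.
parallel_delay = TIMEOUT_SECONDS  # 30 seconds

print(sequential_delay, parallel_delay)  # → 120 30
```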

What’s changed?

From today, when a test is taking longer than its average load time, all of the confirmation servers will start work right away. If the original test completes without a problem, just a bit slower than normal, the confirmation data is simply dropped. If, however, the original signs of a problem turn into a downtime trigger, all of the confirmation servers will already have started (if not finished) the workload they were given.
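This speculative-confirmation pattern can be sketched in a few lines. This is a minimal illustration, not StatusCake’s actual implementation; `check_site` is a hypothetical placeholder for a real HTTP check from a given server:

```python
import concurrent.futures

def check_site(server: str, url: str, timeout: float = 30.0) -> bool:
    """Hypothetical check: returns True if the site responds in time.

    A real check would issue an HTTP request from `server`; this
    placeholder always reports the site as up.
    """
    return True

def speculative_confirm(url: str, servers: list, original_ok: bool) -> bool:
    """Start all confirmation checks at once rather than one after another.

    If the original (slow) test eventually succeeds, the confirmation
    results are simply discarded; if it fails, the confirmations have
    already run, so an alert can fire without waiting for them to start.
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = [pool.submit(check_site, s, url) for s in servers]
        results = [f.result() for f in futures]
    if original_ok:
        return True          # original test recovered; drop confirmation data
    return any(results)      # down only if every confirmation server agrees
```

The key design point is that the confirmation work is started before the failure is certain, trading a little redundant effort for a faster alert when the failure is real.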

This means we’re shaving seconds off the time it takes for you to first learn about an issue affecting your website, all without introducing the possibility of false downtime alerts.

Along with these changes we’ve revisited a design decision. Previously, your uptime was only affected by downtime that exceeded the trigger rate you had set, so if you had four minutes of downtime with a five-minute trigger rate, your uptime would remain at 100%. Now, uptime is measured from the first validated confirmation. We made this change because we believe it better reflects real-world data.
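A worked example of the difference, using the four-minutes-down, five-minute-trigger scenario above over a 24-hour window (the window length is an assumption for illustration):

```python
def uptime_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Uptime as a percentage of the monitoring period."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes

TRIGGER_RATE = 5   # minutes before an alert fires
downtime = 4       # minutes the site was actually down
window = 24 * 60   # a one-day monitoring window, in minutes

# Old behaviour: downtime below the trigger rate was ignored entirely.
old_uptime = 100.0 if downtime < TRIGGER_RATE else uptime_percent(window, downtime)

# New behaviour: downtime counts from the first validated confirmation.
new_uptime = uptime_percent(window, downtime)

print(old_uptime, round(new_uptime, 2))  # → 100.0 99.72
```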

Both of these changes are part of an ongoing improvement program here at StatusCake. Stay tuned for more improvements (and new features!)
