Want to know how much website downtime costs, and the impact it can have on your business?
Find out everything you need to know in our new uptime monitoring whitepaper 2021



“In the midst of chaos, there is also opportunity”
–Sun Tzu, The Art of War
If there’s one universal constant in the world of business, it’s that things will go wrong, probably at the most inconvenient of times and in the most inconvenient of ways. It’s Murphy’s law or, if you’re from England, the much more fun “Sod’s law”. These moments can define your business more than any other and, unfortunately, far more than the usual day-to-day ever will. After all, you’ll never have more eyes on your business than when you’re in a crisis (especially if you have something like unscheduled website downtime).
The decisions you make when things go wrong can actually create strong brand loyalty and customer trust – if you do it right. Here are my tips on managing your way through a crisis and coming out stronger:
The first step to managing a crisis is to understand what the crisis is, where it starts and ends, and how much of it is in your control. That doesn’t mean working out every last detail, but it helps to get ahead of the problem early.
Pre-emptive measures like website monitoring can help mitigate problems and make you aware of issues before the first report comes in. Nothing shows a company is on the ball like going to report a problem only to see the company is already working on the solution.
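To make the idea of a pre-emptive check concrete, here is a minimal sketch of an uptime probe using only Python’s standard library. It is an illustration, not how StatusCake or any other monitoring service works internally: the function names `classify` and `check`, the “up / degraded / down” labels and the ten-second timeout are all assumptions made for this example.

```python
from urllib.request import urlopen
from urllib.error import HTTPError


def classify(status_code):
    """Map an HTTP status code (or None for no response) to a coarse label."""
    if status_code is None:
        return "down"          # no response at all: timeout, DNS failure, refused connection
    if 200 <= status_code < 400:
        return "up"            # success and redirects
    if status_code >= 500:
        return "down"          # server-side errors, such as a 500 on a login page
    return "degraded"          # anything else (mostly 4xx): the site answers, but not as expected


def check(url, timeout=10):
    """Fetch the URL once and return (status_code, label). Hypothetical helper."""
    try:
        with urlopen(url, timeout=timeout) as response:
            code = response.status
    except HTTPError as err:
        code = err.code        # the server replied, but with an error status
    except OSError:
        code = None            # covers URLError and socket timeouts
    return code, classify(code)
```

Run on a schedule and wired to an alert, a check like this is what lets you know about an outage before the first customer report arrives. A real monitoring service adds the parts that matter most in practice, such as checking from several locations and confirming a failure before alerting.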
It’s also important to get into the mindset that no matter where the problem really lies, as far as your customers are concerned, it lies with you.
No, you can’t prepare for everything, but the good news is the majority of customers don’t expect you to. When we talk about ownership, what we’re really looking for is accountability. And that means taking responsibility for all problems. Not just the ones caused by something you’ve done, but those caused by external factors too.
Let’s say, for example, AWS goes down.
Was it your fault? No, but it is your problem. Customers want the problem fixed, but they may also want to vent, and they want to know someone is listening and empathises with them. It’s human nature to want to complain, but it’s also human nature to be defused by a sympathetic voice.
One of the biggest mistakes companies make during a crisis is “information paralysis”, where they get into a cycle of holding off on communicating until they have every last piece of information about a problem.
Or, worse, they won’t post until after they’ve solved the problem in some big flashy solution. This works for Tony Stark, but chances are, it isn’t going to work for you. Sorry to ruin your Iron Man fantasies!
Perhaps companies feel that updating people with information that may itself change is confusing. Or perhaps that coming to the table with a problem and no solution looks ill-prepared. In reality, though, nothing makes you look less prepared than burying your head in the sand. This sort of logic is nearly always doing your customers a disservice, anyway.
We live in a world of 24-hour live online news where information is drip-fed constantly, and reports, Tweets and stories are often updated and changed around us. In that environment, it’s a little naive to think your customers won’t be able to handle a little live update about your product.
Ultimately, establishing a dialogue is more important than getting everything perfect straight away. If you make a mistake or need to change something, it’s not the end of the world. It makes you seem more human, and that builds empathy.
Where and how you talk to your customers during ‘peacetime’ may not be the same as where you talk to them during a crisis. No matter what’s gone wrong, it’s important to remember the method of communication can be just as important as the words themselves.
One common mistake people make is leaving the crisis talk solely to Twitter or Facebook. Social media, which is ephemeral by its very nature, is an easy fall-back for quick updating. However, its limited word-count can make apologies and communication seem terse and insincere. Nobody wants to see an apology with a hashtag in it.
One middle-ground comms method that you may not have considered is a status service, like StatusCake Pages. You may already be sharing public data about your website’s performance and uptime, but tools like StatusCake Pages allow you to communicate issues outside of your control.
Status Page alerts can provide a great middle-ground because they’re more visible and quicker to write than a blog, but they’re more sincere and flexible than a Tweet. You can write what you need to write without worrying about character limits, and by doing so you will come across as more genuine and involved in the situation.
Allowing your customers to subscribe to these status updates also presents an image of transparency. Rather than your company looking like it’s attempting to hide failures out of sight, it becomes one that acknowledges, reacts and solves issues.
Plus, you can avoid information paralysis because, unlike heftier methods of communication such as a newsletter, there’s no pressure to get everything perfect in a single post. The more status posts you create, the more actively it looks like your team is trying to solve the problem.
While we’re on the subject of messaging, don’t forget to turn off all scheduled marketing communication that may have been planned during the crisis.
Not only is it a waste of money and effort on your part, but there is nothing worse for a customer than a flashy push notification about your “biggest discounts ever” while they’re getting an error 500 on your login page!
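If your scheduled messages go out through something you control, the pause can be automated rather than left to memory. The sketch below assumes a hypothetical in-house send queue: `ScheduledMessage`, `messages_to_send` and the `incident_open` flag are made-up names for illustration and do not correspond to any particular marketing tool.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ScheduledMessage:
    name: str
    send_at: datetime
    kind: str  # "marketing" or "status" (hypothetical categories)


def messages_to_send(queue, now, incident_open):
    """Return the messages that are due, holding back marketing during an incident."""
    due = [m for m in queue if m.send_at <= now]
    if incident_open:
        # Status updates still go out; promotions wait until the crisis is over.
        return [m for m in due if m.kind != "marketing"]
    return due
```

With a guard like this in place, the “biggest discounts ever” notification simply stays in the queue while the incident is open, and status updates continue as normal.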
Honesty and ownership are what we’re aiming for here, so avoid making promises that you can’t keep. Sometimes the hardest part of a crisis is when the answer to a user’s question is very simply “I don’t know.” It’s easy to want to follow that up with timelines or solutions, but sometimes communication is about knowing when not to speak.
Your customers are only human, and we humans are a simple species. We’re far more annoyed by a promise that isn’t kept than by some random software outage. You’re more likely to lose a customer by lying to them than by ignoring them (but it’s best to avoid doing both!).
Following through isn’t just about keeping promises, though. It’s also about ending your crisis with care. When the main problem has been resolved, it’s very easy to want to sweep it all under the rug and try to move on as quickly as possible, but that can be dangerous.
Customers and even staff appreciate closure around these issues. This can be as simple as reaching out individually to customers affected to check if the problem has been solved, even if you know it has.
Dedicated changelog services like ChangeCrab are also a great place to talk about the particular problems that you faced in greater detail, and with the benefit of hindsight.
Changelogs can be a more effective method of communication for this sort of post-mortem because they tend to be where your most engaged customers will come to learn about the more in-depth parts of your business, while also being a place new users will stop by to see how you handled past problems and how often you update.
No matter how bad the problem is, all things pass and the world will move on. What will be left online is the record of how you dealt with the crisis, which will help new customers understand how you’ll handle problems in the future.
I hope this helps! While you may not be looking forward to your next big problem, take comfort that if you communicate well, some good can always come out of even the most trying situations!