Lessons Learned From Major Website Outages

Unfortunately, website outages are common and can occur at very inopportune times. In the US, the National Hurricane Data Center’s website went down in October 2016 due to DNS errors just as Hurricane Matthew approached the coast of Florida. Amazon, BT, BBC, Google and Microsoft have all had website crashes in recent years, and the October 2016 DDoS attack on Dyn caused website outages at many large international companies, including PayPal and Twitter. The question is not “if” your website will go down but “when.” Here are a few lessons learned from major outages that can help you reduce the likelihood of an outage and cope when your site does go down.

Anticipate potential problems

It may be a cliché, but it’s good advice: “The best offense is a good defense.” Be proactive, and defend against potential website crashes by evaluating your network and systems before problems occur. Determine what could cause failure at critical points and where you need to build in system redundancy. Practice restoring critical systems before they go down so you’ll be ready to take quick action if a real failure occurs.

Monitor your website traffic. If it’s steadily growing, be sure you have the capacity to handle future growth. Traffic growth can be unpredictable. A favorable review of your company and its services could cause a spike in traffic, and your site could go down if you’re not equipped to handle it.
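To make the capacity question concrete, here is a minimal sketch that estimates how many months of steady growth it would take for peak traffic to reach your capacity. The function name, the requests-per-second unit and the assumption of a constant monthly growth rate are all illustrative; real traffic is rarely this tidy, so treat the result as a rough planning figure rather than a forecast.

```python
import math
from typing import Optional


def months_until_capacity(current_peak_rps: float,
                          capacity_rps: float,
                          monthly_growth_rate: float) -> Optional[int]:
    """Estimate the months until projected peak traffic reaches capacity.

    Assumes (hypothetically) that peak traffic compounds at a constant
    monthly rate, e.g. 0.10 for 10% growth per month.
    Returns 0 if you are already at or over capacity, and None if
    traffic is flat or shrinking (capacity is never reached).
    """
    if current_peak_rps <= 0 or capacity_rps <= 0:
        raise ValueError("traffic and capacity must be positive")
    if current_peak_rps >= capacity_rps:
        return 0
    if monthly_growth_rate <= 0:
        return None
    # Solve current * (1 + rate) ** n >= capacity for the smallest whole n.
    months = math.log(capacity_rps / current_peak_rps) / math.log(1 + monthly_growth_rate)
    return math.ceil(months)


if __name__ == "__main__":
    # Example figures are made up for illustration.
    print(months_until_capacity(400, 1000, 0.10))
```

A steady-growth estimate like this says nothing about sudden spikes, so it is worth keeping extra headroom on top of whatever the projection suggests.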

Communicate with your customers

If your website is down for any length of time, use social media and email to keep your customers informed. Be realistic and honest with them – they will appreciate it, and you’ll get some goodwill out of a difficult situation. Let them know why the site crashed, what steps you are taking to get it back online and how long you think it’s going to take. If it’s taking longer than you anticipated, give your customers an updated report. An uninformed customer becomes an unhappy customer, and an unhappy customer becomes an ex-customer.

Let your customers know when your site is back online and thank them for their patience. If possible, offer your customers something of value to compensate them for their inconvenience. For example, if you provide a paid subscription service, offer your customers the service for free for a short time. Again, your objective is to make your customers happy and retain them.

Don’t try to fix problems on the fly

When your site goes down, you want to get back online as quickly as possible. However, a quick fix may not be a stable fix, and your site may crash again. Roll back to a previous, trusted version of your site while you diagnose what caused your website outage, and take the time to test a fix properly before you implement it.

Monitor your website performance

Always remember that everything is impermanent. Despite your best planning efforts, problems can occur and cause your website to go down. That is why you should monitor your website’s performance: you will know promptly when your site is down and can take immediate corrective action to get back online.
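As a rough illustration of what a basic availability check does, the sketch below requests a page and records whether it responded, with what status code and how quickly. It uses only Python's standard library. The names `check_site` and `CheckResult` and the example URL are hypothetical, and a real monitoring service does considerably more (checks from multiple locations, alerting, history).

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    up: bool                 # True if the site answered with a 2xx/3xx status
    status: Optional[int]    # HTTP status code, or None if no response arrived
    seconds: float           # how long the check took
    error: Optional[str] = None


def check_site(url: str, timeout: float = 10.0,
               fetch: Callable = urllib.request.urlopen) -> CheckResult:
    """Fetch the URL once and report whether the site looks up.

    `fetch` defaults to urllib.request.urlopen; it is a parameter only
    so the check can be exercised without a network connection.
    """
    start = time.monotonic()
    try:
        with fetch(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx or 5xx).
        return CheckResult(False, exc.code, time.monotonic() - start, str(exc))
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout and similar.
        return CheckResult(False, None, time.monotonic() - start, str(exc))
    return CheckResult(200 <= status < 400, status, time.monotonic() - start)


if __name__ == "__main__":
    # Placeholder URL; substitute your own site.
    print(check_site("https://example.com"))
```

Run on a schedule, a check like this tells you when the site stops answering; the response time it records can also hint at trouble before a full outage.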

Perhaps the best lesson is the most obvious – have a plan and react quickly to implement it when you have a website outage.

StatusCake Team
