Fortnite, AWS, and the Importance of Monitoring

The Battle Royale game Fortnite has become a sensation amongst online gamers in no time at all. To explain it in simple terms, 100 players are simultaneously dropped into a battleground measuring several (in-game) square kilometers, and must proceed alone or as part of a team towards a random central point on the map whilst avoiding or confronting the other players. The last player or team standing takes the top spot and wins the game. It all adds up to an intense and at times hilarious experience that can last anywhere from 1 to 20 minutes.

The growth in popularity of the game has been epic: from 60,000 players at launch last July to 3,200,000 players in under nine months. Suddenly, keeping the game up and running was going to require some pretty serious infrastructure.

From day one Epic, the publisher behind Fortnite, has, like so many other large businesses such as Airbnb, Unilever, and Netflix, relied on Amazon Web Services (AWS) to keep it online.
AWS gives Epic the ability to cope when player numbers spike; the infrastructure workload at the peaks might be up to ten times that of the troughs.

Epic also takes advantage of AWS’s “availability zones”. These 55 zones are designed so that a problem in any one zone does not take web services down with it. Where one zone fails, another simply takes up the baton. Fortnite currently runs across 24 of these zones.

This isn’t to say that AWS and the use of availability zones are infallible. In February of this year, Fortnite experienced multiple outages which even AWS’s availability zones couldn’t prevent.

It’s also worth remembering that whilst many companies such as Epic rely on AWS for its reliability and stability, Amazon itself can still have problems.

Just last month, on Amazon’s Prime Day, the rush for bargains not only brought Amazon down but also impacted AWS. Whilst the AWS service itself continued to operate normally, AWS customers were unable to log in to their accounts.

More serious, however, was the four-hour outage in AWS’s US-East-1 region in February this year, which saw over half of the top 100 internet retailers impacted. Many websites saw their performance severely affected (Disney’s store took over 1000% longer to load than normal), and many other sites went down completely. The same region had similar issues again in May.

All of this highlights that even if you’re using cloud service providers such as AWS or Google Cloud, monitoring your website is as important as ever.
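To make that concrete, here is a minimal sketch in Python of the kind of check a monitor performs. It is purely illustrative and is not how StatusCake or any particular product works: the function names, the baseline load time, and the `slow_factor` threshold are all assumptions made up for this example. It treats a site as "down" when no response or an error status comes back, and as "slow" when the load time is well above a known baseline, which is the situation the Disney store figure above describes.

```python
import time
import urllib.error
import urllib.request


def classify(status_code, elapsed_s, baseline_s, slow_factor=3.0):
    """Return "down", "slow", or "ok" for a single check result.

    status_code is None when no HTTP response arrived at all.
    slow_factor is an assumed threshold, not an industry standard.
    """
    if status_code is None or status_code >= 400:
        return "down"
    if elapsed_s > baseline_s * slow_factor:
        return "slow"
    return "ok"


def percent_longer(elapsed_s, baseline_s):
    """How much longer than the baseline a load took, as a percentage."""
    return (elapsed_s - baseline_s) / baseline_s * 100.0


def check(url, baseline_s, timeout_s=10.0):
    """Fetch url once and return (verdict, elapsed seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            response.read()
            status = response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 503.
        status = err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout: no response at all.
        status = None
    elapsed = time.monotonic() - start
    return classify(status, elapsed, baseline_s), elapsed
```

Calling `check("https://example.com", baseline_s=0.5)` from a scheduler every minute or so would give a crude uptime and performance signal. For scale, a page that normally loads in 0.5 seconds and takes 5.5 seconds is loading 1000% longer than normal, which this sketch would report as "slow" rather than "ok" even though the server still returns a 200. A real monitor would also check from several locations so that a problem in one region is not mistaken for a global outage.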

StatusCake Team
