Last updated: 4 February 2026
HTTP error codes are more than just messages shown to users; they're signals from distributed systems. For engineers, SREs, and platform teams, these codes are often the first indicator of degraded reliability, misconfiguration, or upstream failure.
This guide focuses on how HTTP error codes behave in real production environments: behind CDNs, load balancers, reverse proxies, APIs, and microservices. Rather than just defining error codes, we explain why they happen, what they usually mean operationally, and what to check first when you encounter one.
| Code | Class | Client or Server | Typical production meaning |
|---|---|---|---|
| 400 | 4xx | Client | Malformed or invalid request |
| 401 | 4xx | Client/Auth | Missing or invalid authentication |
| 403 | 4xx | Client/Auth | Permission denied |
| 404 | 4xx | Client/App | Route or resource not found |
| 408 | 4xx | Client/Network | Client timed out waiting for response |
| 500 | 5xx | Server | Unhandled server-side failure |
| 502 | 5xx | Server/Infra | Bad gateway or upstream failure |
| 503 | 5xx | Server/Infra | Service unavailable (overload or maintenance) |
| 504 | 5xx | Server/Infra | Gateway timeout |
At a high level:

- **4xx** codes are client errors: the request itself is invalid, unauthenticated, or targets something that does not exist.
- **5xx** codes are server errors: the request was valid, but the server or its infrastructure failed to handle it.

Operationally, this distinction is critical:

- 4xx spikes usually point to client bugs, broken links, or a bad consumer deploy, and retrying rarely helps.
- 5xx spikes usually indicate a failing service, overloaded infrastructure, or an unhealthy upstream, and often warrant paging someone.

Misclassifying these errors can hide real incidents, trigger unnecessary alerts, and cause retries that amplify failures. Understanding the differences allows teams to design better alerts, retries, and incident responses.
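The retry distinction above can be sketched in a few lines. This is a minimal Python illustration (the status set, attempt count, and delays are illustrative assumptions, not recommendations from this guide): it retries only statuses that suggest a transient failure and gives up immediately on other client errors.

```python
import time
import random

# Statuses worth retrying: 5xx and 408 suggest a transient server or
# network problem. Most other 4xx mean the request itself is wrong and
# will fail identically on every attempt, so retrying only adds load.
RETRYABLE = {408, 500, 502, 503, 504}

def call_with_retries(do_request, max_attempts=4, base_delay=0.5):
    """Call do_request() (returns an HTTP status code), retrying only
    retryable statuses with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        status = do_request()
        if status < 400:
            return status  # success
        if status not in RETRYABLE or attempt == max_attempts:
            return status  # non-retryable error, or out of attempts
        # Jittered exponential backoff avoids synchronized retry storms
        # when many clients fail at once.
        time.sleep(base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.0))
```

A client using this pattern will hammer a struggling service far less than one that blindly retries every failure, and will not waste attempts on a 400 that can never succeed.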
## 400 Bad Request

The server could not process the request because it was malformed or invalid.
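As an illustration, a handler might return 400 when the request body fails to parse or is missing a required field. A minimal sketch (the `user_id` field and the response shapes are hypothetical, not from this guide):

```python
import json

def handle_body(raw):
    """Return (status, payload) for a raw request body.

    Malformed or incomplete input is the client's fault: respond 400
    rather than letting the parse error surface as a 500.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return 400, {"error": f"invalid JSON: {exc.msg}"}
    if "user_id" not in data:  # hypothetical required field
        return 400, {"error": "missing required field: user_id"}
    return 200, {"ok": True}
```

The useful habit here is validating early and reporting *why* the request was rejected; a bare 400 with no detail forces clients to guess.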
## 401 Unauthorized

Authentication is required, but the request lacks valid credentials.
## 403 Forbidden

The server understood the request but refuses to authorise it.
## 404 Not Found

The requested resource does not exist or cannot be located.
## 408 Request Timeout

The server timed out waiting for the client to send the request.
## 500 Internal Server Error

The server encountered an unexpected condition that prevented it from fulfilling the request.
## 502 Bad Gateway

A server acting as a gateway or proxy received an invalid response from an upstream server.

In modern architectures, this usually means one service could not successfully communicate with another.

This error is most commonly generated by:

- Reverse proxies and load balancers when the upstream connection fails or returns a malformed response
- CDNs that cannot obtain a valid response from the origin
- API gateways whose backend services crash, restart, or are scaled down mid-request
## 503 Service Unavailable

The server is currently unable to handle the request.
## 504 Gateway Timeout

A gateway did not receive a timely response from an upstream server.
## Frequently asked questions

**What is the difference between a 502 and a 503?**
A 502 indicates an invalid response from an upstream service, while a 503 indicates the service is unavailable (often due to overload or maintenance).
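One practical consequence of this difference: a 503 may carry a `Retry-After` header telling clients when to come back, while 502 and 504 usually give no such hint. A small sketch of how a client might pick a backoff (the 2-second default is an assumed value to tune per service, not a recommendation from this guide):

```python
def retry_after_seconds(status, headers):
    """Decide how long to wait before retrying a gateway-layer 5xx.

    A 503 often includes a Retry-After header; honour it when it is a
    plain integer of seconds. (Retry-After can also be an HTTP-date,
    which this sketch deliberately ignores.) For 502/504 there is no
    hint, so fall back to a fixed default. Returns None for statuses
    that should not be retried at all.
    """
    default = 2.0  # assumed default backoff, tune per service
    if status == 503:
        value = headers.get("Retry-After")
        if value and value.isdigit():
            return float(value)
        return default
    if status in (502, 504):
        return default
    return None
```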
**Do 404 errors hurt SEO?**
Generally no. Search engines expect some 4xx responses. Persistent 404s on important pages, however, should be addressed.
**Should you alert on every 5xx error?**
Most teams alert on sustained increases in 5xx error rates, not individual errors.
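That policy can be sketched as a rolling-window check. The thresholds below (5% over 5 minutes, with a minimum request count so a single error during quiet traffic does not page anyone) are illustrative defaults, not values from this guide:

```python
from collections import deque
import time

class ErrorRateAlert:
    """Alert on a sustained 5xx rate over a rolling window,
    rather than on individual errors."""

    def __init__(self, window_seconds=300, threshold=0.05,
                 min_requests=100, clock=time.monotonic):
        self.window = window_seconds
        self.threshold = threshold
        self.min_requests = min_requests
        self.clock = clock
        self.events = deque()  # (timestamp, is_5xx) pairs

    def record(self, status):
        now = self.clock()
        self.events.append((now, status >= 500))
        # Drop events that have aged out of the window.
        while self.events and self.events[0][0] < now - self.window:
            self.events.popleft()

    def should_alert(self):
        total = len(self.events)
        if total < self.min_requests:
            return False  # too little traffic to judge a rate
        errors = sum(1 for _, is_5xx in self.events if is_5xx)
        return errors / total >= self.threshold
```

The `min_requests` floor is the important design choice: rate-based alerts misfire badly at low traffic, where one error can be "100% of requests".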
**Can a CDN return its own 5xx errors?**
Yes. CDNs often generate their own 5xx responses when origin servers fail to respond correctly.
HTTP error codes are most useful when combined with uptime monitoring, error rate alerting, and regional checks. Tools like StatusCake help teams detect, classify, and respond to these failures before users notice.