StatusCake

HTTP Error Codes Explained: Common 4xx & 5xx Errors, Root Causes, and Fixes

Last updated: 4 February 2026

HTTP error codes are more than just messages shown to users. They're signals from distributed systems. For engineers, SREs, and platform teams, these codes are often the first indicator of degraded reliability, misconfiguration, or upstream failure.

This guide focuses on how HTTP error codes behave in real production environments: behind CDNs, load balancers, reverse proxies, APIs, and microservices. Rather than just defining error codes, we explain why they happen, what they usually mean operationally, and what to check first when you encounter one.

Quick reference: common HTTP error codes

| Code | Class | Client or Server | Typical production meaning |
|------|-------|------------------|----------------------------|
| 400 | 4xx | Client | Malformed or invalid request |
| 401 | 4xx | Client/Auth | Missing or invalid authentication |
| 403 | 4xx | Client/Auth | Permission denied |
| 404 | 4xx | Client/App | Route or resource not found |
| 408 | 4xx | Client/Network | Server timed out waiting for the client's request |
| 500 | 5xx | Server | Unhandled server-side failure |
| 502 | 5xx | Server/Infra | Bad gateway or upstream failure |
| 503 | 5xx | Server/Infra | Service unavailable (overload or maintenance) |
| 504 | 5xx | Server/Infra | Gateway timeout |

Client errors vs server errors & why this matters operationally

At a high level:

  • 4xx errors indicate requests the server declined to fulfil because of a problem with the request itself
  • 5xx errors indicate apparently valid requests the server failed to fulfil

Operationally, this distinction is critical:

  • A spike in 4xx errors usually does not indicate an outage.
  • A spike in 5xx errors often does indicate SLO risk.

Misclassifying these errors can hide real incidents, trigger unnecessary alerts, and cause retries that amplify failures. Understanding the differences allows teams to design better alerts, retries, and incident responses.
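
As a rough illustration of the distinction above, the following Python sketch separates classification, retry decisions, and SLO accounting. The function names and the set of retryable codes are assumptions made for this example, not a standard policy; real retry rules also depend on whether the request is idempotent.

```python
# Assumption: codes this sketch treats as worth retrying.
# 4xx errors other than 408 are excluded because resending the same
# invalid request normally produces the same failure.
RETRYABLE_CODES = frozenset({408, 502, 503, 504})


def classify(status: int) -> str:
    """Return a coarse class for an HTTP status code."""
    if 400 <= status <= 499:
        return "client_error"
    if 500 <= status <= 599:
        return "server_error"
    return "other"


def should_retry(status: int) -> bool:
    """Retry only codes that usually describe a temporary condition."""
    return status in RETRYABLE_CODES


def counts_against_slo(status: int) -> bool:
    """In this sketch only 5xx responses consume error budget."""
    return classify(status) == "server_error"
```

Keeping the three decisions in separate functions makes it harder for a 4xx spike to page someone or for a 400 to be retried in a loop.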

400 Bad Request

What this error means

The server could not process the request because it was malformed or invalid.

Common real-world causes

  • Invalid JSON or request body
  • Missing required parameters
  • Incorrect headers (e.g. Content-Type)

What to check first

  • Request validation logs
  • API schema mismatches
  • Client-side serialization
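
The causes listed above map onto a small validation routine. This is a minimal sketch in Python, assuming a hypothetical JSON API whose schema requires `name` and `email`; the function name and field names are illustrative only. Some APIs return 415 rather than 400 for an unexpected Content-Type.

```python
import json

# Hypothetical schema for illustration: fields the API requires.
REQUIRED_FIELDS = ("name", "email")


def validate_request(content_type: str, body: bytes) -> tuple[int, str]:
    """Return (status, reason) for an incoming JSON request."""
    # Content-Type may carry parameters such as "; charset=utf-8".
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "application/json":
        return 400, "unexpected Content-Type"
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers both invalid JSON and undecodable bytes.
        return 400, "body is not valid JSON"
    if not isinstance(payload, dict):
        return 400, "JSON body must be an object"
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        return 400, "missing required fields: " + ", ".join(missing)
    return 200, "ok"
```

Returning a specific reason for each failure makes request validation logs far easier to search when a client release starts producing 400s.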

401 Unauthorized

What this error means

Authentication is required, but the request lacks valid credentials.

Common real-world causes

  • Expired tokens
  • Missing Authorization headers
  • Clock skew affecting token validity

What to check first

  • Identity provider health
  • Token expiry and refresh logic
  • Authentication middleware
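
Clock skew is easy to overlook, so a small sketch may help. It assumes tokens carry expiry and not-before timestamps in seconds since the epoch (as the `exp` and `nbf` claims do in JWTs); the function name and the 30-second leeway are assumptions for illustration.

```python
import time
from typing import Optional


def token_is_current(
    exp: float,
    nbf: float = 0.0,
    leeway: float = 30.0,
    now: Optional[float] = None,
) -> bool:
    """Check token validity while tolerating a small clock difference.

    `leeway` is the number of seconds of skew accepted between the
    issuer's clock and this server's clock.
    """
    current = time.time() if now is None else now
    if current > exp + leeway:
        return False  # expired, even after allowing for skew
    if current < nbf - leeway:
        return False  # not valid yet, even after allowing for skew
    return True
```

With `leeway=0`, a server whose clock runs ten seconds fast would reject tokens ten seconds early, which tends to show up as intermittent 401s on otherwise healthy clients.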

403 Forbidden

What this error means

The server understood the request but refuses to authorise it.

Common real-world causes

  • Incorrect IAM or RBAC rules
  • IP allowlists or geo-blocking
  • CDN or WAF rules

What to check first

  • Permission policies
  • Security logs
  • Recent access control changes
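
When an IP allowlist is the suspect, it can help to test the client address against the configured ranges directly. The sketch below uses Python's standard `ipaddress` module; the function name and the example ranges are hypothetical.

```python
import ipaddress
from typing import Iterable


def ip_allowed(client_ip: str, allowlist: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any CIDR range in allowlist."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowlist)


# Hypothetical allowlist for illustration.
ALLOWLIST = ["10.0.0.0/8", "192.168.1.0/24"]
print(ip_allowed("10.1.2.3", ALLOWLIST))     # inside 10.0.0.0/8
print(ip_allowed("203.0.113.7", ALLOWLIST))  # outside both ranges
```

If the request passes through a proxy or CDN, the address the application sees may be the proxy's rather than the client's, so check which address the rule is actually evaluated against.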

404 Not Found

What this error means

The requested resource does not exist or cannot be located.

Common real-world causes

  • Broken links or outdated routes
  • Deployment drift between environments
  • Misconfigured rewrite rules

What to check first

  • Application routing
  • CDN cache behaviour
  • Deployment artifacts

408 Request Timeout

What this error means

The server timed out waiting for the client to send the request.

Common real-world causes

  • Slow or unstable client connections
  • Large payload uploads
  • Network congestion

What to check first

  • Client network metrics
  • Load balancer idle timeout settings
  • Request size limits

500 Internal Server Error

What this error means

The server encountered an unexpected condition that prevented it from fulfilling the request.

Common real-world causes

  • Unhandled exceptions
  • Dependency failures
  • Misconfigured environment variables

What to check first

  • Application error logs
  • Recent deployments
  • Dependency health

502 Bad Gateway

What this error means

A server acting as a gateway or proxy received an invalid response from an upstream server.

In modern architectures, this usually means one service could not successfully communicate with another.

What it usually means in production

  • The upstream service is down or unreachable
  • The upstream returned a malformed response
  • The connection was reset mid-request

This error is most commonly generated by:

  • Load balancers
  • Reverse proxies (Nginx, Envoy)
  • CDNs

Common real-world causes

  • Crashed backend containers or VMs
  • DNS resolution failures between services
  • TLS handshake failures
  • Timeout mismatches between proxy layers

How to diagnose

  • Check upstream service health and error rates
  • Inspect proxy and load balancer logs
  • Compare timeout configurations across layers
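
One way to compare timeouts across layers is to write them down in request order and check each adjacent pair. The Python sketch below covers a single common pattern, offered as an assumption rather than a universal rule: if a backend closes idle keep-alive connections sooner than the proxy in front of it, the proxy may reuse a connection that is already closed. The layer names and values are hypothetical.

```python
from typing import List, Tuple


def find_idle_timeout_mismatches(layers: List[Tuple[str, float]]) -> List[str]:
    """Flag adjacent layers where the inner idle timeout is too short.

    `layers` is ordered from the edge to the backend; each entry is
    (name, keep-alive idle timeout in seconds).
    """
    problems = []
    for (outer_name, outer_idle), (inner_name, inner_idle) in zip(layers, layers[1:]):
        if inner_idle <= outer_idle:
            problems.append(
                f"{inner_name} idle timeout ({inner_idle}s) "
                f"should exceed {outer_name} ({outer_idle}s)"
            )
    return problems


# Hypothetical configuration for illustration.
for problem in find_idle_timeout_mismatches(
    [("cdn", 30), ("load_balancer", 60), ("app", 5)]
):
    print(problem)
```

Request and response timeouts follow different rules from idle timeouts, so they are best checked separately.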

Is this transient or serious?

  • Transient: brief spikes during deploys or autoscaling
  • Serious: sustained error rate increase across regions

Prevention and monitoring

  • Health checks on upstream services
  • Synthetic monitoring from multiple regions
  • Alerting on error rate and duration

503 Service Unavailable

What this error means

The server is currently unable to handle the request.

Common real-world causes

  • Planned maintenance
  • Autoscaling lag
  • Resource exhaustion

What to check first

  • Capacity metrics
  • Deployment status
  • Maintenance windows
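
Clients that retry a 503 immediately can make overload worse. As a sketch of a gentler approach in Python: honour a `Retry-After` header when it is given in seconds, and otherwise fall back to exponential backoff with jitter. The function name, base delay, and cap are assumptions for illustration, and the HTTP-date form of `Retry-After` is not handled here.

```python
import random
from typing import Optional


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 0.5,
    cap: float = 30.0,
) -> float:
    """Return how many seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        value = retry_after.strip()
        # Only the delta-seconds form of Retry-After is handled in this sketch.
        if value.isascii() and value.isdigit():
            return float(value)
    # "Full jitter" backoff: a random delay up to the capped exponential bound.
    upper_bound = min(cap, base * (2 ** attempt))
    return random.uniform(0.0, upper_bound)
```

The jitter spreads retries out so that many clients recovering from the same outage do not all return at the same instant.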

504 Gateway Timeout

What this error means

A gateway did not receive a timely response from an upstream server.

Common real-world causes

  • Slow backend services
  • Network latency
  • Database query bottlenecks

What to check first

  • Upstream response times
  • Timeout thresholds
  • Slow query logs
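
When reviewing timeout thresholds, it can help to think in terms of a single time budget shared by every hop, so that an inner call never waits longer than the gateway is still willing to wait. The Python sketch below shows that idea; the `Deadline` class is hypothetical and not part of any particular framework.

```python
import time
from typing import Callable


class Deadline:
    """Tracks how much of a request's overall time budget is left."""

    def __init__(self, budget_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._expires_at = clock() + budget_seconds

    def remaining(self) -> float:
        """Seconds left in the budget, never negative."""
        return max(0.0, self._expires_at - self._clock())

    def timeout_for(self, hop_max: float) -> float:
        """Timeout for the next call: its own limit or what is left, whichever is smaller."""
        return min(hop_max, self.remaining())


# Usage sketch: a gateway allows 10 seconds overall, and a hypothetical
# database call would normally be allowed up to 8 seconds on its own.
deadline = Deadline(10.0)
db_timeout = deadline.timeout_for(8.0)
```

If the remaining budget reaches zero, the service can fail fast instead of starting work that the gateway has already given up on.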

Frequently asked questions (FAQ)

What is the difference between 502 and 503?

A 502 indicates an invalid response from an upstream service, while a 503 indicates the service is unavailable (often due to overload or maintenance).

Are 4xx errors bad for SEO?

Generally no. Search engines expect some 4xx responses. Persistent 404s on important pages, however, should be addressed.

Which error codes should trigger alerts?

Most teams alert on sustained increases in 5xx error rates, not individual errors.
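
A minimal Python sketch of that idea, assuming request counts are already aggregated into fixed intervals (for example, one bucket per minute). The function name, the 5% threshold, and the three-interval window are illustrative choices, not recommendations.

```python
from typing import List, Tuple


def sustained_5xx_alert(
    buckets: List[Tuple[int, int]],
    threshold: float = 0.05,
    min_consecutive: int = 3,
) -> bool:
    """Return True if the most recent intervals all exceed the 5xx threshold.

    `buckets` holds (total_requests, server_errors) per interval, oldest first.
    """
    recent = buckets[-min_consecutive:]
    if len(recent) < min_consecutive:
        return False  # not enough data to call the increase sustained
    return all(
        total > 0 and errors / total > threshold
        for total, errors in recent
    )
```

A single bad interval does not fire the alert, while three in a row do, which filters out brief spikes during deploys or autoscaling.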

Can CDNs change error codes?

Yes. CDNs often generate their own 5xx responses when origin servers fail to respond correctly.

How monitoring tools help

HTTP error codes are most useful when combined with uptime monitoring, error rate alerting, and regional checks. Tools like StatusCake help teams detect, classify, and respond to these failures before users notice.
