In a recent post, I argued that AI doesn’t fix weak engineering processes; rather it amplifies them. Strong review practices, clear ownership, and solid fundamentals still matter just as much when code is AI-assisted as when it’s not.
That post sparked a follow-up question in the comments that’s worth sitting with:
With AI speeding things up, how do teams realise something’s gone wrong before users do?
It’s the right question to ask next. Because once change velocity increases, prevention alone stops being enough.
AI shortens the distance between an idea and production. That’s the upside.
But it also shortens the distance between the moment something goes wrong and the moment users feel it.
This isn't new. What's new is how little time there is between those two moments.
Historically, slower release cycles acted as a buffer. Problems often surfaced during long staging phases, manual QA, or extended rollouts. AI-assisted development compresses those buffers.
The result isn’t more mistakes. It’s less time to notice them.
Good review processes still matter. Tests still matter. None of that goes away.
But once teams are shipping more frequently, the question quietly shifts:
If something does go wrong, how fast will we know — and who will know first?
This isn’t a tooling question. It’s a systems question. And it’s one many teams haven’t fully revisited yet.
In day-to-day engineering terms, this usually looks something like:
If the first signal of trouble is a support ticket, a social post, or a customer escalation, then customers have effectively become your canary in the coal mine.
That’s a risky place to be once change velocity increases.
Playbook Summary: Designing for “Knowing First”
As AI increases change velocity, resilient engineering teams:
- attach clear expectations to every change, so failure is noticeable
- treat validation as part of the release, not a phase before it
- understand the blast radius of what they ship, and stay honest about uncertainty
- size work by deployment risk as well as build effort
- rely on independent, outside-in signals of what users actually experience
- confirm that rollbacks and fixes have genuinely worked
The goal isn’t zero failure.
It’s early awareness, smaller blast radius, and faster recovery.
This isn’t about adopting specific tooling. It’s about designing deliberately for early signal.
Here’s how those principles show up in practice.
Fast teams don’t deploy and hope. They deploy and observe.
Every change should come with clear expectations about how the system should behave once it's live. That might mean availability, latency, error rates, or reachability from outside your network. If you can't articulate those expectations, you can't notice failure quickly.
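As a rough sketch, those expectations can live somewhere explicit and checkable rather than only in people's heads. The names and thresholds below are illustrative assumptions, not recommendations:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseExpectations:
    """Explicit, checkable expectations for a single change (illustrative values only)."""
    max_error_rate: float = 0.01        # no more than 1% of requests failing
    max_p95_latency_ms: float = 500     # 95th percentile latency stays under 500 ms
    min_availability: float = 0.999     # the endpoint answers 99.9% of probes
    externally_reachable: bool = True   # verified from outside our own network


# A riskier change can tighten (or at least restate) what "healthy" means for it.
checkout_release = ReleaseExpectations(max_p95_latency_ms=300)
print(checkout_release)
```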
Historically, UAT was something we did before shipping.
As deployment frequency increases, validation moves closer to, and beyond, production. Post-deploy checks, real-world validation, and continuous verification become part of the release itself.
Shipping is no longer the end of testing. It’s the beginning of observation.
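In pipeline terms, that shift looks roughly like the sketch below. The `deploy`, `verify_in_production`, and `rollback` functions are stand-ins for whatever your tooling actually does; the point is the shape, not the API:

```python
def deploy(version: str) -> None:
    """Stand-in: push the new version out (your CD tooling does the real work)."""
    print(f"deploying {version}")


def verify_in_production(version: str) -> bool:
    """Stand-in: post-deploy checks against real traffic and outside-in probes."""
    print(f"verifying {version} against live signals")
    return True  # replace with real checks: error rate, latency, external reachability


def rollback(version: str) -> None:
    """Stand-in: return to the last known-good version."""
    print(f"rolling back {version}")


def release(version: str) -> bool:
    """The release isn't finished at deploy; it's finished when verification passes."""
    deploy(version)
    if verify_in_production(version):
        return True
    rollback(version)
    return False


release("2024.06.1")
```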
Fast teams don’t just ask “can we deploy this?” They ask “what breaks if this goes wrong?”
That means understanding which services and users a change can touch, what depends on the code paths involved, and how a failure would spread. It also means being honest about how confident you are in those answers.
In older or more tightly coupled systems, the true blast radius is often wider than expected. Legacy code paths, implicit dependencies, and infrastructure that’s grown organically make outcomes harder to predict.
The less certain you are about how a system behaves, the more you need to pause, reduce scope, and increase validation.
In short, deployment risk isn’t just about what you’re changing. It’s about how well you understand the system you’re changing.
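One small way to make that understanding explicit is to keep even a rough dependency map as data instead of tribal knowledge. The services below are hypothetical:

```python
# Hypothetical, hand-maintained map: service -> services that depend on it.
DEPENDENTS = {
    "payments-api": ["checkout", "invoicing"],
    "checkout": ["web-frontend", "mobile-app"],
    "invoicing": ["admin-portal"],
}


def blast_radius(service: str, seen: set | None = None) -> set:
    """Everything that could be affected, directly or transitively, by changing `service`."""
    seen = seen if seen is not None else set()
    for dependent in DEPENDENTS.get(service, []):
        if dependent not in seen:
            seen.add(dependent)
            blast_radius(dependent, seen)
    return seen


print(sorted(blast_radius("payments-api")))
# ['admin-portal', 'checkout', 'invoicing', 'mobile-app', 'web-frontend']
```

The map will always be incomplete; the gaps it exposes are exactly where extra validation belongs.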
Many teams estimate work based on how long it takes to build.
But as change velocity increases, the cost of deploying a change matters just as much. Riskier changes demand more attention, more validation, and stronger signals.
If story sizing ignores deployment risk, teams are incentivised to move quickly without accounting for operational impact. That gap tends to surface later, usually under pressure.
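One hedged way to make that visible at sizing time is to score deployment risk alongside build effort. The factors and weights here are purely illustrative:

```python
def deployment_risk(blast_radius: int, confidence: float, rollback_minutes: int) -> str:
    """Illustrative heuristic: fold operational risk into sizing, not just build time.

    blast_radius      -- how many services or user journeys the change can plausibly affect
    confidence        -- 0.0 (unknown behaviour) to 1.0 (well understood, well tested)
    rollback_minutes  -- realistic time back to a known-good state
    """
    score = blast_radius * (1.0 - confidence) + rollback_minutes / 15
    if score < 1:
        return "low: ship with standard checks"
    if score < 3:
        return "medium: add post-deploy validation and watch external signals"
    return "high: reduce scope, stage the rollout, monitor from outside in"


print(deployment_risk(blast_radius=4, confidence=0.6, rollback_minutes=30))
# 4 * 0.4 + 30 / 15 = 3.6 -> "high: ..."
```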
Internal dashboards tell you how the system thinks it’s behaving.
Users experience how it’s actually behaving.
Independent, external signals answer a simple question:
Can someone use this right now?
As change velocity increases, that outside-in view becomes more important, not less.
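A minimal sketch of that question in code, assuming some hypothetical user-facing URLs; what matters is that the probe runs from outside your own infrastructure:

```python
import time
import urllib.error
import urllib.request

# Hypothetical entry points that real users actually hit.
USER_FACING_URLS = [
    "https://example.com/",
    "https://example.com/login",
    "https://example.com/api/health",
]


def can_someone_use_this(urls) -> bool:
    """Outside-in check: do the things users depend on respond, right now?"""
    all_ok = True
    for url in urls:
        start = time.monotonic()
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                ok = 200 <= response.status < 400
        except (urllib.error.URLError, TimeoutError):
            ok = False
        elapsed_ms = (time.monotonic() - start) * 1000
        print(f"{url}: {'ok' if ok else 'FAILED'} ({elapsed_ms:.0f} ms)")
        all_ok = all_ok and ok
    return all_ok


can_someone_use_this(USER_FACING_URLS)
```

Run it from somewhere that isn't your own network: a different region, a monitoring provider, anywhere that sees what users see.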
Detection doesn’t stop when you act. It stops when you know the action worked.
After a rollback or fix, the same signals that flagged the problem should confirm it's actually resolved.
Fast feedback here matters as much as fast detection. Otherwise, velocity just turns into anxious waiting.
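Closing that loop can be as simple as re-running the signal that fired until it passes again, or escalating loudly if it doesn't. A sketch, with hypothetical endpoint and timings:

```python
import time
import urllib.error
import urllib.request


def wait_for_recovery(url: str, timeout_s: int = 300, interval_s: int = 15) -> bool:
    """Re-probe the signal that fired until it looks healthy again, or time out."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if 200 <= response.status < 300:
                    print("recovery confirmed")
                    return True
        except (urllib.error.URLError, TimeoutError):
            pass  # still unhealthy; keep waiting
        time.sleep(interval_s)
    print("recovery NOT confirmed within the window; escalate")
    return False


# After a rollback or fix, don't declare victory until this returns True.
wait_for_recovery("https://example.com/api/health", timeout_s=120)
```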
AI increases throughput. It lowers the cost of making changes.
That's a good thing, provided awareness keeps up.
When it doesn't, teams slow down, not because AI failed, but because trust in their systems did.
This is one reason external monitoring still matters. Independent availability and performance signals give teams a clear, unbiased view of user experience. They help teams spot issues early and confirm when fixes have actually worked, especially as change velocity increases.
Tools like StatusCake provide that outside-in signal. Not as a replacement for good engineering, but as a complement to it.
Across teams, industries, and stacks, the same pattern shows up: the ability to ship improvements grows faster than the ability to notice when something is wrong.
AI doesn't create this gap. It just reveals it.
Realising something’s gone wrong before users do is only the first step.
Once signals fire, humans still have to interpret them, make decisions under pressure, and act using the tools and processes available to them.
That raises the next question:
Are our systems designed to help humans make good decisions when things go wrong — or to get in their way?
That’s the layer worth exploring next.
So what does this mean in practice?
If AI is an amplifier, awareness is what keeps amplification from turning into instability.
Teams that can see problems early don’t just recover faster. They’re able to ship with more confidence. They take appropriate risks because they understand their systems and trust their signals.
As change velocity increases, the teams that thrive won’t be the ones that try to eliminate failure. They’ll be the ones that design for awareness, act quickly when reality diverges from intent, and learn continuously.
That’s what makes speed sustainable.