
Mean time to detect (MTTD) is the average time between when an incident begins and when your team becomes aware of it. It's one of the four key incident metrics, which also include mean time to recovery (MTTR).
MTTD matters because every minute between an incident starting and your team knowing about it is a minute of user-facing impact that compounds. See "What is mean time to recovery" for how MTTD fits into the broader recovery timeline.
MTTD = Total detection time across all incidents / Number of incidents
For a given incident:
Detection time = Time of alert - Time incident started
"Time incident started" is the hard part — it requires accurate monitoring timestamps to determine when the service first became unavailable, not just when someone noticed.
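As a minimal sketch of the two formulas above, the snippet below averages detection time over a small incident log. The function name `mean_time_to_detect` and the timestamps are hypothetical, chosen only for illustration; in practice the start times would come from your monitoring data.

```python
from datetime import datetime, timedelta


def mean_time_to_detect(incidents):
    """Return the mean detection time for (started_at, alerted_at) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    # Detection time = time of alert - time incident started, summed over all incidents.
    total = sum((alerted - started for started, alerted in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: (time incident started, time of alert).
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 4)),      # 4 minutes
    (datetime(2024, 2, 11, 22, 30), datetime(2024, 2, 11, 22, 32)),  # 2 minutes
    (datetime(2024, 3, 2, 3, 15), datetime(2024, 3, 2, 3, 24)),      # 9 minutes
]

mttd = mean_time_to_detect(incidents)
print(mttd)  # 0:05:00
```

The result is only as trustworthy as the start timestamps: if "started" is recorded as the moment someone noticed, every detection time collapses toward zero.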
Your MTTD is bounded below by your monitoring check frequency.
If you check your service every 5 minutes, the earliest you can detect a failure is when the first failed check completes. The worst case is that a failure occurs immediately after a successful check, meaning you won't know for up to 5 minutes.
MTTD by check interval (worst case):
| Check interval | Worst-case MTTD |
|---|---|
| 1 minute | 1 minute |
| 5 minutes | 5 minutes |
| 15 minutes | 15 minutes |
| 1 hour | 1 hour |
Average detection time is roughly half the check interval, assuming failures are equally likely at any point within it. A 5-minute check interval therefore gives an average MTTD of around 2.5 minutes from monitoring alone.
This is why choosing check frequency is a reliability decision, not just a cost decision.
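The relationship between check interval and detection delay can be checked with a small simulation. This is a sketch under simplifying assumptions: failures start at a uniformly random point between two checks, the very next check catches them, and there are no retries or alert delays. The function name `simulate_detection_delay` is hypothetical.

```python
import random


def simulate_detection_delay(check_interval_min, trials=100_000, seed=0):
    """Estimate average and worst detection delay (minutes) for a fixed check interval.

    Assumes each failure starts at a uniformly random offset after the last
    successful check and is detected by the next scheduled check.
    """
    rng = random.Random(seed)
    delays = [
        check_interval_min - rng.uniform(0, check_interval_min)
        for _ in range(trials)
    ]
    return sum(delays) / trials, max(delays)


for interval in (1, 5, 15, 60):
    avg, worst = simulate_detection_delay(interval)
    print(f"{interval:>2} min checks: average ~{avg:.2f} min, worst ~{worst:.2f} min")
```

Under these assumptions the average lands near half the interval and the worst case approaches the full interval, which matches the table above.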
Alert delivery delays
Even with 1-minute checks, if alerts are delivered by email and no one reads email immediately, your effective MTTD can be 30+ minutes. SMS and push notifications reduce this lag significantly.
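A rough way to reason about this is to add the time it takes someone to notice the alert to the average delay from the check interval. The sketch below does that arithmetic; the function name is hypothetical, and the 1-minute SMS figure is an assumption for comparison rather than a measured value.

```python
def effective_mttd_minutes(check_interval_min, notice_delay_min):
    """Average detection time: half the check interval plus time to notice the alert."""
    return check_interval_min / 2 + notice_delay_min


# Illustrative numbers only: the 30-minute email delay comes from the paragraph above;
# the 1-minute SMS delay is an assumption made for this comparison.
print(effective_mttd_minutes(1, 30))  # 30.5
print(effective_mttd_minutes(1, 1))   # 1.5
```

With 1-minute checks, the delivery channel dominates the total, so tightening the check interval further would change little until the notification lag is addressed.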
Alert fatigue leading to alert dismissal
If your team receives too many false positives or low-priority alerts, they start dismissing or ignoring alerts — including real ones. See how to reduce alert fatigue for how to tune alerting to reduce false positives.
Discovering downtime from users rather than monitoring
If your first signal of an incident is a support ticket or a tweet, your MTTD is measured in customer complaints. This is avoidable with monitoring.
Not monitoring the right endpoints
Monitoring your homepage while your API is down means your monitoring never detects the actual incident. MTTD for the API failure is effectively unbounded: it lasts until a customer reports it.
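One way to picture the fix is to probe every endpoint users depend on, not just the homepage. The sketch below is a simplified illustration, not how any particular monitoring product works: `http_probe` and `failing_endpoints` are hypothetical helpers, the URLs are placeholders, and the usage example swaps in a stand-in probe so it runs without network access.

```python
from urllib.error import URLError
from urllib.request import urlopen


def http_probe(url, timeout=5):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (URLError, OSError, ValueError):
        # HTTP errors, connection failures, timeouts, and malformed URLs count as down.
        return False


def failing_endpoints(urls, probe=http_probe):
    """Return the subset of URLs whose probe reports a failure."""
    return [url for url in urls if not probe(url)]


# Hypothetical endpoints; the stand-in probe simulates the API being down
# while the homepage stays up.
endpoints = ["https://example.com/", "https://example.com/api/health"]
simulated_down = {"https://example.com/api/health"}
print(failing_endpoints(endpoints, probe=lambda url: url not in simulated_down))
# ['https://example.com/api/health']
```

A homepage-only check would have reported nothing wrong in this scenario, which is exactly the gap described above.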
There's no universal industry standard for MTTD, but useful reference points:
For SLA commitments, tracking MTTD alongside MTTR gives you an honest picture of your incident response capability to share with customers and regulators.
Domain Monitor Pro plans support 1-minute check intervals with immediate SMS alerts — the configuration that delivers the lowest achievable MTTD. Create a free account.