
What Is Mean Time to Detect (MTTD)?

Mean time to detect (MTTD) is the average time between when an incident begins and when your team becomes aware of it. It's one of the four key incident metrics, alongside:

  • MTTD — mean time to detect
  • MTTA — mean time to acknowledge
  • MTTR — mean time to respond/resolve
  • MTBF — mean time between failures

MTTD matters because every minute between an incident starting and your team knowing about it is another minute of user-facing impact, and that impact compounds. See "What Is Mean Time to Recovery" for how MTTD fits into the broader recovery timeline.


The MTTD Formula

MTTD = Total detection time across all incidents / Number of incidents

For a given incident:

Detection time = Time of alert - Time incident started

"Time incident started" is the hard part — it requires accurate monitoring timestamps to determine when the service first became unavailable, not just when someone noticed.
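As a minimal sketch of the two formulas above, the following Python computes per-incident detection time and the resulting MTTD. The incident timestamps and the function names (`detection_time`, `mean_time_to_detect`) are hypothetical, chosen only for illustration; in practice the start times would come from your monitoring data.

```python
from datetime import datetime, timedelta


def detection_time(started_at: datetime, alerted_at: datetime) -> timedelta:
    """Detection time = time of alert - time incident started."""
    if alerted_at < started_at:
        raise ValueError("alert time precedes incident start")
    return alerted_at - started_at


def mean_time_to_detect(incidents: list[tuple[datetime, datetime]]) -> timedelta:
    """MTTD = total detection time across all incidents / number of incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((detection_time(start, alert) for start, alert in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: (time incident started, time of alert)
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 3)),      # 3 minutes
    (datetime(2024, 2, 11, 22, 15), datetime(2024, 2, 11, 22, 16)),  # 1 minute
    (datetime(2024, 3, 2, 4, 30), datetime(2024, 3, 2, 4, 38)),      # 8 minutes
]

mttd = mean_time_to_detect(incidents)
print(mttd)  # 0:04:00
```

Here the three detection times sum to 12 minutes, so the MTTD is 4 minutes.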


Why Monitoring Frequency Directly Affects MTTD

Your monitoring check interval puts a floor under your MTTD: a failure cannot be detected before the next check runs.

If you check your service every 5 minutes, the earliest you can detect a failure is when the first failed check completes. The worst case is that a failure occurs immediately after a successful check, meaning you won't know for up to 5 minutes.

MTTD by check interval (worst case):

Check interval    Worst-case MTTD
1 minute          1 minute
5 minutes         5 minutes
15 minutes        15 minutes
1 hour            1 hour

Average MTTD is roughly half the check interval, because failures are equally likely to start at any point between two checks. A 5-minute check interval therefore has an average MTTD of around 2.5 minutes from monitoring alone.

This is why choosing check frequency is a reliability decision, not just a cost decision.
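The relationship between check interval and detection delay can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not from any particular monitoring product: checks complete instantly, the first failed check raises the alert, and failures start at a uniformly random point between two checks. The function names are hypothetical.

```python
import random


def detection_delay(failure_offset: float, check_interval: float) -> float:
    """Minutes until the next check, for a failure that starts
    `failure_offset` minutes after the previous (successful) check."""
    if not 0 <= failure_offset <= check_interval:
        raise ValueError("failure_offset must lie within one check interval")
    return check_interval - failure_offset


def simulate_mean_delay(check_interval: float, trials: int = 100_000, seed: int = 0) -> float:
    """Average detection delay when failures start at a uniformly random
    point inside the check interval (an assumption of this sketch)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        offset = rng.random() * check_interval
        total += detection_delay(offset, check_interval)
    return total / trials


# Worst case: the failure starts right after a successful check.
worst_case = detection_delay(0.0, 5.0)

# Average case: should land close to half the interval.
average = simulate_mean_delay(5.0)

print(f"worst case: {worst_case} min, simulated average: {average:.2f} min")
```

For a 5-minute interval the worst case is the full 5 minutes, and the simulated average comes out close to 2.5 minutes, matching the half-interval rule of thumb.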


Factors That Increase MTTD

Alert delivery delays

Even with 1-minute checks, if alerts are delivered by email and no one reads email immediately, your effective MTTD can be 30+ minutes. SMS and push notifications reduce this lag significantly.
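To make the effect of notification lag concrete, here is a rough back-of-the-envelope model: effective time to awareness is the average wait for the next check (half the interval) plus the time until someone actually sees the alert. The additive model, the function name, and the 0.5-minute SMS lag are assumptions for illustration; the 30-minute email figure is the example from the paragraph above.

```python
def expected_time_to_awareness(check_interval_min: float, notification_lag_min: float) -> float:
    """Rough estimate: average wait for the next check plus the lag
    between the alert being sent and a human seeing it."""
    if check_interval_min <= 0 or notification_lag_min < 0:
        raise ValueError("interval must be positive and lag non-negative")
    return check_interval_min / 2 + notification_lag_min


# Hypothetical notification lags in minutes, per channel.
channel_lag = {"email": 30.0, "sms": 0.5}

estimates = {
    channel: expected_time_to_awareness(check_interval_min=1.0, notification_lag_min=lag)
    for channel, lag in channel_lag.items()
}

for channel, minutes in estimates.items():
    print(f"{channel}: about {minutes} minutes")
```

With 1-minute checks in both cases, the estimate is dominated by the notification channel rather than by the check interval.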

Alert fatigue leading to alert dismissal

If your team receives too many false positives or low-priority alerts, they start dismissing or ignoring alerts — including real ones. See how to reduce alert fatigue for how to tune alerting to reduce false positives.

Discovering downtime from users rather than monitoring

If your first signal of an incident is a support ticket or a tweet, your MTTD is measured in customer complaints. This is avoidable with monitoring.

Not monitoring the right endpoints

Monitoring your homepage while your API is down means your monitoring doesn't detect the actual incident. MTTD for the API failure is essentially infinite until a customer reports it.


Benchmarks

There's no universal industry standard for MTTD, but useful reference points:

  • Best-in-class (high-criticality services): < 2 minutes — achieved with 1-minute checks and immediate SMS/push alerts
  • Good (production SaaS/e-commerce): < 5 minutes — achieved with 1–5 minute check intervals and fast notification channels
  • Acceptable (informational sites): < 15 minutes
  • Discovering via user reports: typically 30+ minutes — an indicator that monitoring is absent or misconfigured
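If you track MTTD over time, the reference points above can be turned into a simple lookup. This is only a sketch: the thresholds mirror the list above, while the function name and the label for values of 15 minutes or more are assumptions, since the list gives no named tier for that range.

```python
def mttd_tier(mttd_minutes: float) -> str:
    """Map a measured MTTD (in minutes) to the reference points listed above."""
    if mttd_minutes < 0:
        raise ValueError("MTTD cannot be negative")
    if mttd_minutes < 2:
        return "best-in-class"
    if mttd_minutes < 5:
        return "good"
    if mttd_minutes < 15:
        return "acceptable"
    # Label chosen for this sketch; the list above names no tier here.
    return "outside reference ranges"


# Hypothetical measured values, in minutes.
tiers = [mttd_tier(minutes) for minutes in (1.5, 4.0, 12.0, 35.0)]
print(tiers)
```

A value that lands in the last bucket is a prompt to check whether incidents are being found by monitoring or by user reports.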

For SLA commitments, tracking MTTD alongside MTTR gives you an honest picture of your incident response capability to share with customers and regulators.


How to Reduce Your MTTD

  1. Increase check frequency — move from 5-minute to 1-minute intervals for critical endpoints
  2. Use SMS and push alerts, not just email — reduce notification lag to seconds
  3. Monitor the right endpoints — don't monitor a proxy for your service; monitor the actual user-facing endpoints that matter
  4. Remove alert noise — teams that trust their alerts respond faster; teams that distrust alerts respond slower
  5. Use multi-location monitoring — catch regional failures faster by checking from multiple locations simultaneously

Domain Monitor Pro plans support 1-minute check intervals with immediate SMS alerts — the configuration that delivers the lowest achievable MTTD. Create a free account.

