The Ultimate Guide to Website Uptime Monitoring

Your website can go down at any moment. A server failure, a botched deployment, a certificate expiry, a database crash, a DDoS attack — any of these can take your site offline, and most happen without warning.

The question isn't whether your site will ever go down. It will. The question is: how quickly will you know about it, and how quickly can you respond?

Website uptime monitoring is the answer. This guide covers everything — what it is, how it works, what to monitor, how to configure alerts, how to respond to incidents, and what good monitoring looks like in practice.


What Is Website Uptime Monitoring?

Uptime monitoring is the practice of automatically and continuously checking whether your website or application is accessible and responding correctly. A monitoring service makes periodic HTTP requests to your URLs from one or more locations and alerts you when something goes wrong.

"Uptime" refers to the percentage of time your service is available. A service that's down for an hour per month has roughly 99.86% uptime. A service that's down for four hours per month is closer to 99.45%. These numbers sound similar, but the difference matters when users or customers are unable to reach your service.

The goal of monitoring isn't to achieve perfect uptime — that's rarely achievable. The goal is to minimise the time between when downtime starts and when you know about it, so you can respond quickly.
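The uptime percentages quoted in this section are simple arithmetic, and a minimal sketch makes them easy to reproduce. It assumes an average month of 730 hours (365 days × 24 hours ÷ 12). The function names are illustrative and not part of any monitoring tool's API.

```python
HOURS_PER_MONTH = 730  # assumption: average month, 365 * 24 / 12


def uptime_percent(downtime_hours: float, period_hours: float = HOURS_PER_MONTH) -> float:
    """Return availability over the period as a percentage."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must be between 0 and period_hours")
    return 100.0 * (1 - downtime_hours / period_hours)


def allowed_downtime_minutes(target_percent: float, period_hours: float = HOURS_PER_MONTH) -> float:
    """Return how many minutes of downtime a given uptime target permits."""
    return period_hours * 60 * (1 - target_percent / 100)


print(f"{uptime_percent(1):.2f}%")                    # 99.86%
print(f"{uptime_percent(4):.2f}%")                    # 99.45%
print(f"{allowed_downtime_minutes(99.9):.1f} min")    # 43.8 min
```

Running the calculation in the other direction is often more useful: a 99.9% monthly target leaves under 44 minutes of downtime, which is easy to use up in a single undetected incident.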


Why Uptime Monitoring Matters

You Won't Know Without It

The most important reason to monitor is the simplest: without monitoring, you rely on users to tell you when your site is down. That means downtime can go undetected for hours — or longer over a weekend or holiday period.

Every minute of undetected downtime is a minute where:

  • Users are failing to access your service
  • Potential customers are bouncing from an error page
  • Transactions aren't completing
  • Background jobs may be failing silently
  • Your reputation is being damaged

The Cost of Downtime

Downtime has direct and indirect costs. For e-commerce sites, it's directly measurable — every minute the checkout doesn't work is lost revenue. For SaaS products, it risks churn. For any business, prolonged downtime damages trust and reputation.

The cost of monitoring is tiny compared to the cost of hours of undetected downtime.

SLA Compliance

If you have service level agreements with customers — contractual commitments to a certain availability percentage — you need monitoring data to track and demonstrate compliance. See uptime SLA guide for how SLAs are structured and measured.


How Uptime Monitoring Works

A monitoring service operates by repeatedly sending HTTP requests to your URLs from one or more locations around the world:

  1. The check — The monitor makes an HTTP GET request to your URL
  2. The response — Your server responds with a status code
  3. The evaluation — The monitoring service checks: did I get a response? Was the status code successful (2xx)? Was the response time within acceptable limits? Did the response body contain expected content?
  4. The result — The check is logged as up or down based on the evaluation criteria
  5. The alert — If the result changes from up to down (or stays down for a configurable period), an alert is sent

This cycle repeats on a configurable interval — typically every minute for critical applications, every five minutes for less critical ones.
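The cycle above can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not how any particular monitoring service is implemented: the `CheckResult` type, the `evaluate` and `check` function names, and the 5-second response-time limit are all hypothetical.

```python
import http.client
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    up: bool
    status: Optional[int]
    elapsed_s: float
    reason: str


def evaluate(status: Optional[int], elapsed_s: float, body: str,
             max_elapsed_s: float = 5.0,
             expected_text: Optional[str] = None) -> CheckResult:
    """Step 3: apply the evaluation criteria to a single response."""
    if status is None:
        return CheckResult(False, None, elapsed_s, "no response")
    if not 200 <= status < 300:
        return CheckResult(False, status, elapsed_s, f"unexpected status {status}")
    if elapsed_s > max_elapsed_s:
        return CheckResult(False, status, elapsed_s, "too slow")
    if expected_text is not None and expected_text not in body:
        return CheckResult(False, status, elapsed_s, "expected content missing")
    return CheckResult(True, status, elapsed_s, "ok")


def check(url: str, timeout_s: float = 10.0, **criteria) -> CheckResult:
    """Steps 1, 2 and 4: make the GET request, time it, and record up or down."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            status = resp.status
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx or 5xx status code.
        status, body = exc.code, ""
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # DNS failure, refused connection, timeout, or a broken response.
        status, body = None, ""
    return evaluate(status, time.monotonic() - start, body, **criteria)
```

A scheduler would call something like `check("https://example.com/", expected_text="Welcome")` once per interval and pass each result to the alerting logic (step 5).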

Multi-Location Checking

A single monitoring location can give false readings. If the monitor's own server has a network hiccup, or if there's a regional routing issue, a single-location check might report your site as down when most users can still access it.

Multi-location monitoring solves this by checking from multiple geographic locations simultaneously. If your site is unreachable from several locations at once, it's genuinely down. If it's only unreachable from one location, it may be a regional issue or a problem with that monitoring location itself.
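One simple way to apply this is a quorum rule over a single round of results. The sketch below is an assumption about how such a rule could look, not a description of any specific product: the `classify` name, the location labels, and the threshold of two failing locations are all hypothetical.

```python
from typing import Mapping


def classify(results_by_location: Mapping[str, bool], min_failing: int = 2) -> str:
    """Return "up", "regional", or "down" for one round of per-location checks.

    Each value is True when that location reached the site successfully.
    """
    failing = sum(1 for up in results_by_location.values() if not up)
    if failing == 0:
        return "up"
    if failing >= min_failing:
        return "down"
    # Some locations failed, but fewer than the quorum: worth watching,
    # not yet worth waking someone up for.
    return "regional"


print(classify({"london": True, "new-york": False, "singapore": True}))   # regional
print(classify({"london": False, "new-york": False, "singapore": True}))  # down
```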

Domain Monitor checks from multiple global locations on every check, giving you accurate, false-positive-resistant results.


What to Monitor

Your Homepage

The minimum viable monitor. Checks that your server is responding and your web application is serving content. A homepage check catches the most obvious failures: server down, DNS failure, web server crash.

Key User Journey Pages

For most sites, there are a small number of pages that matter most to users and business outcomes:

  • Login page
  • Signup / account creation page (e.g. /account/create/)
  • Checkout or conversion page
  • Key product or service pages

Monitoring these specifically ensures you catch failures that affect your most critical user flows, even if the homepage is still responding.

API Endpoints

If your product has a public API or internal APIs, monitor them separately. An API going down might not affect your website's homepage but will break integrations and mobile apps.

A dedicated health check endpoint is the standard approach:

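from datetime import datetime, timezone
from flask import Flask

app = Flask(__name__)  # assumes Flask, which the route decorator suggests

@app.route('/health')
def health_check():
    return {'status': 'ok', 'timestamp': datetime.now(timezone.utc).isoformat()}, 200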

A more thorough health check tests dependencies too:

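@app.route('/health/deep')
def deep_health():
    # `db` stands in for your application's database connection or session
    try:
        db.execute('SELECT 1')  # cheapest possible round trip to the database
        db_status = 'ok'
    except Exception:  # a bare except would also trap SystemExit and KeyboardInterrupt
        db_status = 'error'

    status = 'ok' if db_status == 'ok' else 'degraded'
    # Return 503 when degraded so status-code-based monitors register the failure
    return {'status': status, 'database': db_status}, (200 if status == 'ok' else 503)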

Point monitors at both — the surface check for user experience and the deep check for internal health.

SSL Certificate Expiry

SSL certificate expiry is one of the most common causes of preventable downtime. Certificates typically expire after 90 days (Let's Encrypt) or after about 13 months (publicly trusted paid certificates are limited to 398 days). When they expire, browsers show full-page security warnings and most users can't or won't continue to your site.

SSL monitoring alerts you weeks before expiry so you have time to renew. See complete guide to SSL certificates for SSL monitoring setup.
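If you want to see what an expiry check involves, here is a minimal sketch using Python's standard `ssl` and `socket` modules. The function names and the 21-day warning window are assumptions chosen for illustration; a monitoring service will have its own thresholds.

```python
import socket
import ssl
import time
from typing import Optional

SECONDS_PER_DAY = 86400


def fetch_not_after(hostname: str, port: int = 443, timeout_s: float = 10.0) -> str:
    """Return the certificate's notAfter string, e.g. 'Jan  5 09:34:43 2018 GMT'."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout_s) as sock:
        # The default context verifies the chain, so a certificate that has
        # already expired raises ssl.SSLCertVerificationError at this point.
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Convert a notAfter string into days remaining (negative if expired)."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / SECONDS_PER_DAY


def needs_renewal(not_after: str, warn_days: int = 21, now: Optional[float] = None) -> bool:
    """True when the certificate expires within the warning window."""
    return days_until_expiry(not_after, now) <= warn_days
```

A daily job could call `needs_renewal(fetch_not_after("example.com"))` and raise an alert when it returns `True`, treating a verification error as an immediate failure.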

DNS Records

If your DNS records change unexpectedly, whether through a misconfiguration, human error, or a provider issue, your domain can stop resolving correctly. DNS monitoring detects these changes and alerts you, ideally before most users are affected. See "DNS monitoring is here" for Domain Monitor's DNS monitoring features.
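At its simplest, DNS change detection means comparing what a hostname resolves to against what you expect. The sketch below uses only the standard library, so it covers address records as seen by the local resolver; checking MX, NS, or TXT records would need a dedicated DNS library. The function names and the example addresses (taken from the reserved documentation ranges) are hypothetical.

```python
import socket
from typing import Dict, List, Set


def resolve_addresses(hostname: str) -> Set[str]:
    """Return the set of IP addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def dns_drift(expected: Set[str], observed: Set[str]) -> Dict[str, List[str]]:
    """Describe how the observed addresses differ from the expected ones."""
    return {
        "missing": sorted(expected - observed),
        "unexpected": sorted(observed - expected),
    }


expected = {"192.0.2.10", "192.0.2.11"}
observed = {"192.0.2.10", "203.0.113.7"}
print(dns_drift(expected, observed))
# {'missing': ['192.0.2.11'], 'unexpected': ['203.0.113.7']}
```

In practice `observed` would come from `resolve_addresses("example.com")`, and any non-empty `missing` or `unexpected` list would trigger an alert.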

Domain Expiry

A domain that expires takes your entire website down instantly. Domain expiry monitoring gives you advance warning so you can renew before expiry. See guide to checking domain expiry date.


Configuring Check Frequency

How often your monitors check affects how quickly you detect downtime. The trade-off:

  • More frequent checks (every minute) — Faster downtime detection, more accurate uptime statistics, higher cost
  • Less frequent checks (every 5 minutes) — Slower detection, lower cost, acceptable for non-critical sites

For production applications that affect revenue or user experience, check every minute. For staging environments or lower-stakes sites, every 5 minutes is reasonable.
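The trade-off is easy to quantify. As a rough sketch, the worst-case detection delay is the check interval multiplied by the number of consecutive failures required before an alert fires, ignoring request timeouts and notification delivery time. The function names here are illustrative assumptions.

```python
SECONDS_PER_DAY = 86400


def worst_case_detection_seconds(interval_s: int, failures_before_alert: int = 1) -> int:
    """Longest gap between an outage starting and the alert firing.

    Worst case: the outage begins just after a successful check, so the
    first failure is seen one full interval later.
    """
    if interval_s <= 0 or failures_before_alert < 1:
        raise ValueError("interval_s must be positive and failures_before_alert at least 1")
    return interval_s * failures_before_alert


def checks_per_day(interval_s: int) -> int:
    """Number of checks one monitor runs per day from a single location."""
    return SECONDS_PER_DAY // interval_s


for interval_s in (60, 300):
    print(interval_s, worst_case_detection_seconds(interval_s), checks_per_day(interval_s))
# 60 60 1440
# 300 300 288
```

A five-minute interval combined with a three-failure threshold can leave an outage unreported for up to 15 minutes, which is worth knowing before you choose the cheaper option.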

See how to choose monitoring check frequency for a detailed breakdown.


Alert Configuration

Monitoring without alerting is useless. When a monitor detects downtime, you need to be notified immediately through a channel you'll actually see.

Alert Channels

Email — Reliable and appropriate for most teams. Ensure it goes to an inbox that's actively checked, not a group alias that gets ignored.

SMS — For critical applications, SMS is harder to miss than email. Good for on-call engineers who need to respond quickly.

Slack / Teams — Works well for dev teams that live in messaging tools. Post to a dedicated #alerts channel rather than a general channel to reduce noise.

PagerDuty / OpsGenie — For complex on-call rotations, integrate with dedicated on-call management tools that handle escalation and rotation.
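Chat alerts are usually delivered through an incoming webhook: an HTTP POST with a small JSON body. The sketch below builds such a request with the standard library. Slack's incoming webhooks accept a JSON object with a `text` field; other tools may expect a different shape. The function names, message format, and the webhook URL in the usage note are placeholders, not real endpoints.

```python
import json
import urllib.request


def build_alert_request(webhook_url: str, monitor_name: str, state: str) -> urllib.request.Request:
    """Build (but do not send) a webhook POST announcing a state change."""
    payload = {"text": f"[{state.upper()}] {monitor_name}"}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_alert(request: urllib.request.Request, timeout_s: float = 10.0) -> int:
    """Send the request and return the HTTP status code of the response."""
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return resp.status
```

Usage would look like `send_alert(build_alert_request("https://hooks.example.com/alerts", "homepage", "down"))`. Keeping the build step separate from the send step makes the message format easy to test without touching the network.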

Alert Sensitivity

Most monitoring tools let you configure how many failed checks trigger an alert. Options:

  • Alert immediately on first failure — Maximum sensitivity, possible false positives from transient issues
  • Alert after 2-3 consecutive failures — Reduces false positives from momentary glitches, slightly delays alert

For critical production applications, immediate alerting is usually preferable — a false positive costs a few minutes of investigation; a missed genuine outage costs much more.
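Both options come down to the same small piece of state: a counter of consecutive failures and a flag for whether an alert is already active. The following sketch shows one way to express that. The `AlertPolicy` class and its event names are hypothetical, not any tool's actual configuration model.

```python
from typing import Optional


class AlertPolicy:
    """Decide when to notify, given a stream of up/down check results."""

    def __init__(self, failures_before_alert: int = 1) -> None:
        if failures_before_alert < 1:
            raise ValueError("failures_before_alert must be at least 1")
        self.failures_before_alert = failures_before_alert
        self.consecutive_failures = 0
        self.alerting = False

    def record(self, up: bool) -> Optional[str]:
        """Return "down" when an alert should fire, "recovered" when it clears, else None."""
        if up:
            self.consecutive_failures = 0
            if self.alerting:
                self.alerting = False
                return "recovered"
            return None
        self.consecutive_failures += 1
        if not self.alerting and self.consecutive_failures >= self.failures_before_alert:
            self.alerting = True
            return "down"
        # Either still below the threshold, or the alert has already fired.
        return None


policy = AlertPolicy(failures_before_alert=3)
results = [True, False, False, True, False, False, False, False, True]
print([policy.record(up) for up in results])
# [None, None, None, None, None, None, 'down', None, 'recovered']
```

With `failures_before_alert=1` the same class gives immediate alerting, so the choice between the two options is a single parameter. Note that the two-failure blip early in the example never triggers an alert under the stricter policy.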

See how to set up downtime alerts for configuration guidance.


Uptime Statistics and Reporting

Good monitoring tools provide historical uptime data — the percentage of time your service was available over a given period, with charts showing when incidents occurred and how long they lasted.

This data is useful for:

  • Demonstrating SLA compliance to customers
  • Identifying recurring issues or patterns
  • Making informed decisions about infrastructure investment
  • Post-incident review and documentation
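Under the hood, an uptime report is an aggregation over the log of individual check results. Here is a minimal sketch of that aggregation, assuming checks are recorded as `(timestamp, up)` pairs in time order at a fixed interval. The `summarize` function and its output shape are illustrative only.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def summarize(checks: List[Tuple[datetime, bool]], interval: timedelta) -> Dict[str, object]:
    """Return the uptime percentage and a list of (start, duration) incidents."""
    if not checks:
        return {"uptime_percent": None, "incidents": []}

    up_count = sum(1 for _, up in checks if up)
    incidents = []
    incident_start = None
    for timestamp, up in checks:
        if not up and incident_start is None:
            incident_start = timestamp                      # first failing check
        elif up and incident_start is not None:
            incidents.append((incident_start, timestamp - incident_start))
            incident_start = None                           # recovered
    if incident_start is not None:
        # Still down at the end of the log: count through the last interval.
        incidents.append((incident_start, checks[-1][0] + interval - incident_start))

    return {
        "uptime_percent": 100.0 * up_count / len(checks),
        "incidents": incidents,
    }
```

Because each check stands in for a whole interval, one-minute checks give noticeably more accurate percentages and incident durations than five-minute checks.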

See how to interpret uptime reports for how to read and use uptime data effectively.


Responding to Downtime

Monitoring detects downtime. What you do next determines how quickly service is restored and how much trust you lose.

Immediate Response

When an alert fires:

  1. Acknowledge the alert — Let your team know someone is investigating
  2. Verify the issue — Check your monitoring dashboard, try accessing the site yourself
  3. Communicate — Post an initial update to your status page within minutes of detecting the issue
  4. Diagnose — Check server logs, error messages, recent deployments, resource usage
  5. Fix or roll back — Address the root cause or revert a recent change if deployment-related
  6. Verify resolution — Confirm monitoring shows the site back up, check it manually
  7. Post-incident update — Update your status page with resolution details

Communication During Incidents

How you communicate during downtime significantly affects user trust. The key principles:

  • Acknowledge before users ask — your status page update should come before users email support
  • Post updates every 15-30 minutes even if you have nothing new to report
  • Be specific about what happened and what you're doing
  • Don't go silent

See how to communicate website downtime for full guidance including templates.


Status Pages

A public status page gives users a place to check your service status during an incident, reducing support ticket volume and demonstrating transparency.

A good status page shows:

  • Current operational status for each major component
  • Active incident timeline with timestamps
  • Historical uptime record

Host it on separate infrastructure so it stays up when your main application goes down. Domain Monitor includes an automatic public status page from your monitoring data. See how to create a public status page.


Monitoring Best Practices

Start before launch — Set up monitors before your site goes live. The goal is to know about problems before users do, which requires monitoring to be in place first.

Monitor what users experience — Check real URLs with real HTTP requests, not just whether a server process is running. A server that's running but returning 500 errors is down from the user's perspective.

Check from multiple locations — Regional issues and routing problems require multi-location monitoring to detect accurately.

Test your alerts — After setting up alerts, test them. Confirm that notifications actually arrive in the right places and that the people who receive them know what to do.

Review and update monitors — As your application changes, update your monitors. New critical paths should get monitors. Retired URLs should have monitors removed or updated.

Don't monitor too many things — Monitor the things that matter. A hundred monitors for pages nobody uses creates noise. Focus on your critical user journey.

For a comprehensive checklist, see website monitoring checklist for developers and uptime monitoring best practices.


Getting Started With Domain Monitor

Domain Monitor checks your websites and applications every minute from multiple global locations, with instant alerts via email, SMS, or Slack when anything goes wrong.

Create a free account and add your first monitor in minutes. You'll get:

  • Minute-by-minute uptime checks from multiple locations
  • Instant alerts when your site goes down or recovers
  • SSL certificate expiry monitoring
  • DNS monitoring
  • Automatic public status page
  • Historical uptime reports and charts

The setup takes under five minutes. The peace of mind is permanent.
