
Your website can go down at any moment. A server failure, a botched deployment, a certificate expiry, a database crash, a DDoS attack — any of these can take your site offline, and most happen without warning.
The question isn't whether your site will ever go down. It will. The question is: how quickly will you know about it, and how quickly can you respond?
Website uptime monitoring is the answer. This guide covers everything — what it is, how it works, what to monitor, how to configure alerts, how to respond to incidents, and what good monitoring looks like in practice.
Uptime monitoring is the practice of automatically and continuously checking whether your website or application is accessible and responding correctly. A monitoring service makes periodic HTTP requests to your URLs from one or more locations and alerts you when something goes wrong.
"Uptime" refers to the percentage of time your service is available. A service that's down for an hour per month has roughly 99.86% uptime. A service that's down for four hours per month is closer to 99.45%. These numbers sound similar, but the difference matters when users or customers are unable to reach your service.
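These percentages are straightforward to compute. A minimal sketch, using the same ~730-hour average month as the figures above:

```python
# Uptime percentage given minutes of downtime in a period.
def uptime_percent(downtime_minutes: float, period_minutes: float) -> float:
    return 100.0 * (1 - downtime_minutes / period_minutes)

MONTH_MINUTES = 730 * 60  # average month: ~730 hours

print(round(uptime_percent(60, MONTH_MINUTES), 2))   # one hour down per month
print(round(uptime_percent(240, MONTH_MINUTES), 2))  # four hours down per month
```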
The goal of monitoring isn't to achieve perfect uptime — that's rarely achievable. The goal is to minimise the time between when downtime starts and when you know about it, so you can respond quickly.
The most important reason to monitor is the simplest: without monitoring, you rely on users to tell you when your site is down. That means downtime can go undetected for hours — or longer over a weekend or holiday period.
Every minute of undetected downtime is a minute where users hit errors, potential customers turn away, and nobody on your team knows there is a problem to fix.
Downtime has direct and indirect costs. For e-commerce sites, it's directly measurable — every minute the checkout doesn't work is lost revenue. For SaaS products, it risks churn. For any business, prolonged downtime damages trust and reputation.
The cost of monitoring is tiny compared to the cost of hours of undetected downtime.
If you have service level agreements with customers — contractual commitments to a certain availability percentage — you need monitoring data to track and demonstrate compliance. See our uptime SLA guide for how SLAs are structured and measured.
A monitoring service operates by repeatedly sending HTTP requests to your URLs from one or more locations around the world: it sends a request, evaluates the response (status code, response time, and optionally page content), records the result, and triggers an alert if the check fails.
This cycle repeats on a configurable interval — typically every minute for critical applications, every five minutes for less critical ones.
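One iteration of that cycle can be sketched in a few lines of Python. The URL handling, ten-second timeout, and the 2xx-only success rule here are illustrative assumptions, not a prescription:

```python
import time
import urllib.request
import urllib.error

def evaluate(status_code: int, elapsed_s: float, timeout_s: float = 10.0) -> str:
    # A check passes only if the server answered with 2xx within the timeout;
    # a very slow response is down from the user's perspective.
    if elapsed_s > timeout_s:
        return 'down'
    return 'up' if 200 <= status_code < 300 else 'down'

def run_check(url: str) -> str:
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return evaluate(resp.status, time.monotonic() - start)
    except (urllib.error.URLError, OSError):
        return 'down'  # DNS failure, refused connection, timeout, ...
```

A real monitoring service runs this on a schedule, records every result, and alerts on failures.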
A single monitoring location can give false readings. If the monitor's own server has a network hiccup, or if there's a regional routing issue, a single-location check might report your site as down when most users can still access it.
Multi-location monitoring solves this by checking from multiple geographic locations simultaneously. If your site is down in multiple locations, it's genuinely down. If it's only down from one location, it may be a regional issue.
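That decision logic is simple to express. A sketch, where the location names and the majority rule are assumptions that real services tune:

```python
def overall_status(results: dict[str, bool]) -> str:
    # results maps a location name to whether the check passed there.
    up_votes = sum(results.values())
    if up_votes == len(results):
        return 'up'
    if up_votes == 0:
        return 'down'
    # Mixed results: failing in a majority of locations suggests a genuine
    # outage; failing in a minority is more likely a regional network issue.
    return 'down' if up_votes < len(results) / 2 else 'degraded'

print(overall_status({'london': True, 'virginia': True, 'sydney': False}))
```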
Domain Monitor checks from multiple global locations on every check, giving you accurate, false-positive-resistant results.
The minimum viable monitor. Checks that your server is responding and your web application is serving content. A homepage check catches the most obvious failures: server down, DNS failure, web server crash.
For most sites, a small number of pages matter most to users and business outcomes: typically the homepage, the login page, the checkout or payment flow, and signup (e.g. /account/create/). Monitoring these specifically ensures you catch failures that affect your most critical user flows, even if the homepage is still responding.
If your product has a public API or internal APIs, monitor them separately. An API going down might not affect your website's homepage but will break integrations and mobile apps.
A dedicated health check endpoint is the standard approach:

```python
from datetime import datetime, timezone
from flask import Flask

app = Flask(__name__)

@app.route('/health')
def health_check():
    return {'status': 'ok', 'timestamp': datetime.now(timezone.utc).isoformat()}, 200
```
A more thorough health check tests dependencies too:

```python
@app.route('/health/deep')
def deep_health():
    try:
        db.execute('SELECT 1')  # cheapest possible query to verify connectivity
        db_status = 'ok'
    except Exception:
        db_status = 'error'
    status = 'ok' if db_status == 'ok' else 'degraded'
    return {'status': status, 'database': db_status}, 200 if status == 'ok' else 503
```
Point monitors at both — the surface check for user experience and the deep check for internal health.
SSL certificate expiry is one of the most common causes of preventable downtime. Certificates typically expire after 90 days (Let's Encrypt) or up to about 13 months (the 398-day maximum for publicly trusted certificates). When they expire, browsers show security warnings and users can't access your site.
SSL monitoring alerts you weeks before expiry so you have time to renew. See our complete guide to SSL certificates for SSL monitoring setup.
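For a quick manual check, the expiry date can be read with the Python standard library. A sketch, where the port and timeout are assumptions:

```python
import socket
import ssl
from datetime import datetime, timezone

def parse_not_after(not_after: str) -> datetime:
    # getpeercert() returns notAfter as e.g. 'Jun 15 12:00:00 2026 GMT'
    dt = datetime.strptime(not_after, '%b %d %H:%M:%S %Y %Z')
    return dt.replace(tzinfo=timezone.utc)

def days_until_expiry(hostname: str, port: int = 443) -> int:
    ctx = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
            cert = tls.getpeercert()
    expires = parse_not_after(cert['notAfter'])
    return (expires - datetime.now(timezone.utc)).days
```

Alert when the returned day count drops below your renewal window, e.g. 21 days.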
If your DNS records change unexpectedly — through a misconfiguration, an error, or a provider issue — your domain will stop resolving correctly. DNS monitoring detects these changes and alerts you before users are affected. See Domain Monitor's DNS monitoring features for details.
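A basic version of this check resolves the hostname and compares the result against a known-good set of addresses. A sketch, where the expected set is an illustrative placeholder for your own records:

```python
import socket

def unexpected_ips(hostname: str, expected: set[str]) -> set[str]:
    # Resolve the hostname's IPv4 addresses and return any that are not in
    # the known-good set; a non-empty result suggests the records changed.
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    resolved = {info[4][0] for info in infos}
    return resolved - expected
```

An empty set means everything resolved as expected; anything else is worth an alert.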
A domain that expires takes your entire website down instantly. Domain expiry monitoring gives you advance warning so you can renew before expiry. See guide to checking domain expiry date.
How often your monitors check affects how quickly you detect downtime. The trade-off: shorter intervals detect outages faster but generate more requests and more data; longer intervals are lighter but can leave an outage undetected for several minutes.
For production applications that affect revenue or user experience, check every minute. For staging environments or lower-stakes sites, every 5 minutes is reasonable.
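The arithmetic behind that recommendation: in the worst case an outage begins just after a successful check, so detection takes one full interval for each failed check required before alerting.

```python
def worst_case_detection_minutes(interval_min: int, failures_required: int = 1) -> int:
    # The outage starts just after a passing check, so each of the
    # required consecutive failures costs one full check interval.
    return interval_min * failures_required

print(worst_case_detection_minutes(1, 1))  # 1-minute checks, alert on first failure
print(worst_case_detection_minutes(5, 2))  # 5-minute checks, two failures required
```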
See how to choose monitoring check frequency for a detailed breakdown.
Monitoring without alerting is useless. When a monitor detects downtime, you need to be notified immediately through a channel you'll actually see.
Email — Reliable and appropriate for most teams. Ensure it goes to an inbox that's actively checked, not a group alias that gets ignored.
SMS — For critical applications, SMS is harder to miss than email. Good for on-call engineers who need to respond quickly.
Slack / Teams — Works well for dev teams that live in messaging tools. Post to a dedicated #alerts channel rather than a general channel to reduce noise.
PagerDuty / OpsGenie — For complex on-call rotations, integrate with dedicated on-call management tools that handle escalation and rotation.
Most monitoring tools let you configure how many failed checks trigger an alert: notify on the very first failure for fastest detection, or require two or three consecutive failures to filter out transient blips at the cost of slower detection.
For critical production applications, immediate alerting is usually preferable — a false positive costs a few minutes of investigation; a missed genuine outage costs much more.
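The consecutive-failure rule itself is a few lines of state. A sketch; the threshold of 2 in the usage note below is arbitrary:

```python
class FailureCounter:
    """Fire an alert only after `threshold` consecutive failed checks."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.consecutive = 0

    def record(self, check_passed: bool) -> bool:
        """Return True exactly when an alert should fire."""
        if check_passed:
            self.consecutive = 0
            return False
        self.consecutive += 1
        return self.consecutive == self.threshold  # fire once, not on every failure
```

With threshold=2, a single blip stays silent; the second consecutive failure fires the alert, and later failures in the same outage don't re-fire it.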
See how to set up downtime alerts for configuration guidance.
Good monitoring tools provide historical uptime data — the percentage of time your service was available over a given period, with charts showing when incidents occurred and how long they lasted.
This data is useful for demonstrating SLA compliance, spotting reliability trends, and deciding where to invest in infrastructure.
See how to interpret uptime reports for how to read and use uptime data effectively.
Monitoring detects downtime. What you do next determines how quickly service is restored and how much trust you lose.
When an alert fires: acknowledge it, confirm the outage is real, assess who and what is affected, communicate with users, fix the underlying problem, and review the incident afterwards to prevent a repeat.
How you communicate during downtime significantly affects user trust. The key principles: acknowledge the problem quickly, be honest about what you know and what you don't, post regular updates, and follow up with a clear explanation once it's resolved.
See how to communicate website downtime for full guidance including templates.
A public status page gives users a place to check your service status during an incident, reducing support ticket volume and demonstrating transparency.
A good status page shows the current status of each major component, updates on any active incident, and historical uptime.
Host it on separate infrastructure so it stays up when your main application goes down. Domain Monitor includes an automatic public status page from your monitoring data. See how to create a public status page.
Start before launch — Set up monitors before your site goes live. The goal is to know about problems before users do, which requires monitoring to be in place first.
Monitor what users experience — Check real URLs with real HTTP requests, not just whether a server process is running. A server that's running but returning 500 errors is down from the user's perspective.
Check from multiple locations — Regional issues and routing problems require multi-location monitoring to detect accurately.
Test your alerts — After setting up alerts, test them. Confirm that notifications actually arrive in the right places and that the people who receive them know what to do.
Review and update monitors — As your application changes, update your monitors. New critical paths should get monitors. Retired URLs should have monitors removed or updated.
Don't monitor too many things — Monitor the things that matter. A hundred monitors for pages nobody uses creates noise. Focus on your critical user journey.
For a comprehensive checklist, see website monitoring checklist for developers and uptime monitoring best practices.
Domain Monitor checks your websites and applications every minute from multiple global locations, with instant alerts via email, SMS, or Slack when anything goes wrong.
Create a free account and add your first monitor in minutes. You'll get checks every minute from multiple global locations, SSL, DNS, and domain expiry monitoring, instant alerts via email, SMS, or Slack, and a public status page built from your monitoring data.
The setup takes under five minutes. The peace of mind is permanent.