SaaS Monitoring Checklist

A SaaS product has more monitoring surfaces than a simple website — and more ways for monitoring to be incomplete. This checklist covers every layer of a SaaS product that should be monitored, the configuration for each, and the alerting structure to support a production engineering team.

Use this to audit an existing monitoring setup or build a new one from scratch.

Marketing Site

Homepage — https://yourapp.com, expected 200, content check for product name
Pricing page — revenue-critical; prospects who can't see pricing will not convert
Signup / account creation page — https://yourapp.com/account/create or equivalent; a broken signup flow is invisible lost revenue
Key landing pages — any pages receiving significant paid or organic traffic
Docs or help centre — customers in trouble who can't access documentation abandon before contacting support
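
The page checks above pair an expected status code with a content check. A minimal sketch of that logic follows, using only the Python standard library. The `PageCheck` structure, the `example-monitor/0.1` user agent, and the product name `YourApp` are assumptions for illustration, not part of any particular monitoring tool.

```python
from dataclasses import dataclass
from urllib.error import HTTPError
from urllib.request import Request, urlopen


@dataclass
class PageCheck:
    name: str
    url: str
    expected_status: int = 200
    must_contain: str = ""  # empty string disables the content check


def evaluate(check: PageCheck, status: int, body: str) -> list[str]:
    """Return a list of problems; an empty list means the page passed."""
    problems = []
    if status != check.expected_status:
        problems.append(f"{check.name}: expected {check.expected_status}, got {status}")
    if check.must_contain and check.must_contain not in body:
        problems.append(f"{check.name}: missing text {check.must_contain!r}")
    return problems


def run(check: PageCheck, timeout: float = 10.0) -> list[str]:
    """Fetch the page over HTTP and evaluate the response."""
    request = Request(check.url, headers={"User-Agent": "example-monitor/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return evaluate(check, response.status, body)
    except HTTPError as error:
        # urlopen raises for 4xx/5xx responses; treat them as a normal result.
        return evaluate(check, error.code, "")
    except OSError as error:
        # DNS failures, refused connections, and timeouts end up here.
        return [f"{check.name}: request failed ({error})"]


# Hypothetical usage: run(PageCheck("Homepage", "https://yourapp.com", 200, "YourApp"))
```

The content check matters because a page can return 200 while serving an error template or an empty shell; matching on text that only the real page contains catches that case.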

Application

Application health check endpoint — /health or /status, returning 200 with a meaningful status payload
Deep health check (if implemented) — checks database, cache, and critical dependencies from within the app. See how to set up uptime monitoring
Login / authentication page — https://app.yourapp.com/login; if customers can't log in, they're locked out of your product
Core application dashboard — the main page authenticated users see
Key feature endpoints — the 2–3 most-used features in your application
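
As an illustration of the deep health check item above, the sketch below runs one probe per dependency and builds a status payload. It is framework-agnostic: the probe names are hypothetical, and returning 503 when any probe fails is an assumed convention that lets a plain status-code monitor detect the failure.

```python
import json
from typing import Callable


def deep_health(probes: dict[str, Callable[[], None]]) -> tuple[int, str]:
    """Run each probe and return an HTTP status code plus a JSON body."""
    results = {}
    for name, probe in probes.items():
        try:
            probe()  # a probe signals failure by raising an exception
            results[name] = "ok"
        except Exception as error:
            results[name] = f"error: {error}"
    healthy = all(value == "ok" for value in results.values())
    body = json.dumps({"status": "ok" if healthy else "degraded", "checks": results})
    return (200 if healthy else 503), body


# Hypothetical wiring inside a web handler for /health:
#   code, body = deep_health({"database": ping_database, "cache": ping_cache})
# where ping_database and ping_cache are functions you provide.
```

Keeping each probe cheap (for example a `SELECT 1` against the database) avoids the health endpoint itself becoming a source of load when it is polled every minute.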

API

API health check — https://api.yourapp.com/health
Authentication endpoint — /api/v1/auth or equivalent; if API auth is down, all integrations break
Core read endpoint — your most-called API endpoint
Webhook receiving endpoint (if you receive webhooks from third parties) — verify it returns the expected acknowledgment. See how to monitor webhooks
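
For API endpoints, a status code alone says little: an endpoint can return 200 with an HTML error page from a proxy. One possible validation step is sketched below. The required `status` key and the 2000 ms latency budget are assumptions to adjust for your own API.

```python
import json


def check_api_response(status, body, elapsed_ms, required_keys=("status",), max_ms=2000):
    """Return a list of problems found in a single API response."""
    problems = []
    if status != 200:
        problems.append(f"unexpected status {status}")
    try:
        payload = json.loads(body)
    except ValueError:
        return problems + ["body is not valid JSON"]
    if not isinstance(payload, dict):
        return problems + ["body is not a JSON object"]
    missing = [key for key in required_keys if key not in payload]
    if missing:
        problems.append(f"missing keys: {', '.join(missing)}")
    if elapsed_ms > max_ms:
        problems.append(f"slow response: {elapsed_ms} ms > {max_ms} ms")
    return problems
```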

Background Jobs and Scheduled Tasks

Identify all scheduled jobs — billing runs, email sends, report generation, data syncs, reconciliation
Add heartbeat monitors for each critical job — the job pings the monitor on completion; if no ping arrives within the expected window, the alert fires. See what is heartbeat monitoring
Set appropriate intervals — a daily job should have a heartbeat window of 25 hours (to allow for minor timing variation)
Test heartbeat monitors — deliberately skip a job run and confirm the alert fires. See how to monitor cron jobs
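
The heartbeat pattern above has two halves: the job pings only after it completes, and the monitor alerts when the last ping is older than the window. A minimal sketch of both halves follows. The function names are illustrative, and `ping` stands for whatever call reaches your monitor's heartbeat URL.

```python
from datetime import datetime, timedelta, timezone

# A daily job gets a 25-hour window to absorb minor timing variation.
DAILY_WINDOW = timedelta(hours=25)


def run_with_heartbeat(job, ping):
    """Run job(); call ping() only if the job finishes without raising."""
    job()
    ping()


def is_overdue(last_ping: datetime, window: timedelta, now: datetime) -> bool:
    """Monitor side: True when no ping has arrived within the window."""
    return now - last_ping > window


# Hypothetical usage on the monitor side:
#   is_overdue(last_ping, DAILY_WINDOW, datetime.now(timezone.utc))
```

Because the ping sits after the job call, a crash or an unhandled exception means no ping is sent, which is the condition the monitor alerts on.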

Third-Party Dependencies

Payment processor — Stripe, Braintree, etc.: monitor your integration endpoint and the provider's status page. See how to monitor Stripe webhooks
Email delivery — SendGrid, Postmark, Mailgun: monitor your email sending endpoint or transactional email API
Authentication providers — Auth0, Clerk, Cognito: monitor your auth integration endpoint
Key external APIs — any third-party API that your product's core functionality depends on. See how to monitor third-party API dependencies

SSL Certificates

Marketing site — yourapp.com
Application — app.yourapp.com
API — api.yourapp.com
Status page — status.yourapp.com
Any other subdomains serving user traffic
Alert threshold of at least 30 days before expiry; 60 days for production SaaS
Verify no certificates expire within 30 days right now. See what is SSL certificate monitoring
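
To verify expiry dates by hand, a short script using Python's standard `ssl` module is enough. This is a sketch rather than a replacement for continuous monitoring. The two-tier mapping (warning at 60 days, critical at 30) is one interpretation of the thresholds above, and the host list is taken from this checklist's placeholder domains.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_expiry(hostname: str, port: int = 443, timeout: float = 10.0) -> datetime:
    """Connect over TLS and return the certificate's notAfter time in UTC."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            cert = tls.getpeercert()
    seconds = ssl.cert_time_to_seconds(cert["notAfter"])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def days_left(expiry: datetime, now: datetime) -> int:
    """Whole days until expiry; negative once the certificate has expired."""
    return (expiry - now).days


def classify(days: int, warn_days: int = 60, critical_days: int = 30) -> str:
    if days < 0:
        return "expired"
    if days <= critical_days:
        return "critical"
    if days <= warn_days:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # Placeholder hosts from the checklist; replace with your own subdomains.
    for host in ["yourapp.com", "app.yourapp.com", "api.yourapp.com", "status.yourapp.com"]:
        remaining = days_left(fetch_expiry(host), datetime.now(timezone.utc))
        print(host, remaining, classify(remaining))
```

Note that `fetch_expiry` raises if the certificate is already invalid or the host is unreachable, which is itself a result worth alerting on.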

Domain

Primary domain expiry monitoring, with the first alert at least 60 days before expiry
Auto-renew enabled and payment card current at registrar. See why domain auto-renew fails
Nameserver change monitoring — immediate alerts on NS record changes
DNS record change monitoring — alerts on any A, MX, CNAME, or TXT record changes
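
DNS change monitoring comes down to comparing the current record set against the last known-good snapshot. The sketch below shows only the comparison step. How the records are fetched is left out, since the standard library cannot query NS, MX, or TXT records directly, and the snapshot format (record type mapped to a list of values) is an assumption.

```python
def diff_records(previous: dict[str, list[str]], current: dict[str, list[str]]) -> list[str]:
    """Describe every record added or removed between two DNS snapshots."""
    changes = []
    for rtype in sorted(set(previous) | set(current)):
        before = set(previous.get(rtype, ()))
        after = set(current.get(rtype, ()))
        for value in sorted(before - after):
            changes.append(f"{rtype} removed: {value}")
        for value in sorted(after - before):
            changes.append(f"{rtype} added: {value}")
    return changes


# Illustrative snapshots using documentation-reserved names and addresses.
old = {"A": ["203.0.113.10"], "NS": ["ns1.example.net", "ns2.example.net"]}
new = {"A": ["203.0.113.10"], "NS": ["ns1.example.net", "ns1.attacker.example"]}
print(diff_records(old, new))
```

Any non-empty result for NS records would warrant an immediate alert, while changes to other record types can be routed at a lower priority if they are expected during deployments.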

Status Page

Public status page exists at status.yourapp.com or equivalent
Status page is independently hosted — stays up when your app goes down
Status page reflects real-time monitor status — not manually maintained
Incident update process defined — who updates the status page during an incident and how quickly. See how to create a public status page

Alerting

All critical monitors have SMS alerts — not just email. See SMS alerts
Slack notifications routing to a #monitoring or #incidents channel
Escalation path defined — if the first contact doesn't acknowledge within N minutes, who receives the next alert?
On-call rotation documented — who is on call each week, including nights and weekends
All alert contacts are current — no stale phone numbers or departed team members
Recovery alerts enabled — notified when services come back up
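
An escalation path can be written down as data, which makes it easy to review and test. In the sketch below, the contacts, channels, and the 10 and 30 minute delays are placeholders for your own values of N.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    after_minutes: int          # minutes without acknowledgment before this step fires
    contact: str
    channels: tuple[str, ...]


# Placeholder policy: adjust contacts, channels, and delays to your team.
POLICY = [
    Step(0, "primary on-call", ("sms", "slack")),
    Step(10, "secondary on-call", ("sms",)),
    Step(30, "engineering manager", ("sms", "email")),
]


def due_steps(policy: list[Step], minutes_unacknowledged: int) -> list[Step]:
    """Return every step that should have fired by now for an open alert."""
    return [step for step in policy if step.after_minutes <= minutes_unacknowledged]
```

For example, `due_steps(POLICY, 12)` returns the first two steps, meaning the secondary on-call has been paged because the primary did not acknowledge within 10 minutes.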

Check Frequency

Endpoint	Recommended Interval
Login / authentication	1 minute
Core application	1 minute
API health check	1 minute
Marketing site	5 minutes
Background job heartbeats	Per job schedule

See how to choose monitoring check frequency.
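
If you run your own checker, the intervals in the table translate directly into configuration. A small sketch follows, with intervals in seconds and timestamps as plain numbers. The check names are illustrative, and heartbeats are omitted because they follow each job's own schedule.

```python
# Intervals from the table above, expressed in seconds.
INTERVAL_SECONDS = {
    "login": 60,
    "core_application": 60,
    "api_health": 60,
    "marketing_site": 300,
}


def due_checks(last_run: dict[str, float], now: float) -> list[str]:
    """Return the names of checks whose interval has elapsed, sorted by name."""
    return sorted(
        name
        for name, interval in INTERVAL_SECONDS.items()
        # A check that has never run is treated as due immediately.
        if now - last_run.get(name, float("-inf")) >= interval
    )
```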

Multi-Location

Critical endpoints are checked from multiple locations — catch CDN or regional failures that single-location checks miss. See how to use multi-location monitoring
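
With results from several locations, the same check can distinguish a regional failure from a full outage. The sketch below assumes each location reports a simple pass or fail; the location names are illustrative.

```python
def classify_outage(results: dict[str, bool]) -> str:
    """Summarize per-location results, where True means the check passed."""
    failed = sorted(location for location, ok in results.items() if not ok)
    if not failed:
        return "up"
    if len(failed) == len(results):
        return "down everywhere"
    return "regional failure: " + ", ".join(failed)


print(classify_outage({"london": True, "new-york": True, "singapore": False}))
```

A single-location checker sitting in London or New York would have reported this endpoint as healthy, which is the gap multi-location checks close.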

SLA Reporting

Uptime reports are enabled and accessible to the team
SLA targets are documented — what uptime percentage are you committed to?
Monitoring data is used to calculate SLA compliance — not just server-level metrics. See uptime SLA guide
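
To make an SLA target concrete, it helps to convert the percentage into a downtime budget and to compute measured uptime from check results. A minimal sketch follows, assuming evenly spaced checks and a 30-day reporting period.

```python
def uptime_percent(total_checks: int, failed_checks: int) -> float:
    """Measured uptime as the share of checks that passed."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    return 100.0 * (total_checks - failed_checks) / total_checks


def allowed_downtime_minutes(target_percent: float, period_days: int = 30) -> float:
    """Downtime budget implied by an uptime target over the period."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (100.0 - target_percent) / 100.0


# One-minute checks over 30 days give 43,200 checks in total.
measured = uptime_percent(total_checks=43_200, failed_checks=50)
print(f"measured uptime: {measured:.3f}%")
print(f"budget at 99.9%: {allowed_downtime_minutes(99.9):.1f} minutes")
```

A 99.9% target over 30 days leaves a budget of about 43 minutes, so 50 failed one-minute checks in that period would put the service below target.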

Domain Monitor covers HTTP uptime, API monitoring, SSL, domain expiry, DNS, and heartbeat monitoring for SaaS products from a single dashboard.
