
Cron jobs are the hidden workhorses of most web applications — sending emails, processing payments, generating reports, syncing data, running backups. And they fail silently.
When a web page breaks, users see an error. When a cron job stops running, nothing visible happens. Your backups quietly stop being made. Your nightly invoice emails stop going out. Your data sync falls 3 days behind. You discover this when something catastrophic happens — and by then the damage is done.
Cron job monitoring (also called heartbeat monitoring for scheduled tasks) solves this by detecting when a job stops running and alerting you as soon as an expected run is missed.
Unlike web requests that return errors and trigger application logging, a failed cron job typically produces:
Common reasons cron jobs stop:
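Heartbeat monitoring flips the detection model: the job checks in with the monitoring service each time it completes, and the service alerts you when an expected check-in does not arrive.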
This is fundamentally different from HTTP uptime monitoring — instead of the monitoring service calling your application, your application calls the monitoring service.
# Example: Add this to the end of your cron job script
curl -s "https://your-heartbeat-url/ping/your-monitor-token" > /dev/null
If the job fails before reaching this line, or never runs at all, no ping is sent, and you get alerted.
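On the monitoring side, the check reduces to comparing the time since the last ping against the expected interval plus a grace period. The following is a minimal sketch of that logic, not Domain Monitor's actual implementation; the function name `is_overdue`, its parameters, and the example values are all hypothetical.

```python
from datetime import datetime, timedelta


def is_overdue(last_ping: datetime, now: datetime,
               interval: timedelta, grace: timedelta) -> bool:
    """Return True when the next expected ping has not arrived in time.

    Hypothetical helper: a real monitoring service tracks this for you.
    """
    deadline = last_ping + interval + grace
    return now > deadline


# A job scheduled every 5 minutes, with 2 minutes of grace (assumed values).
last_ping = datetime(2024, 1, 1, 2, 0)
interval = timedelta(minutes=5)
grace = timedelta(minutes=2)

print(is_overdue(last_ping, datetime(2024, 1, 1, 2, 6), interval, grace))  # False
print(is_overdue(last_ping, datetime(2024, 1, 1, 2, 8), interval, grace))  # True
```

The job itself never reports a failure here. Silence past the deadline is the signal, which is why a job that crashes and a job that never starts look the same to the monitor.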
In Domain Monitor, create a heartbeat monitor and configure:
For a shell script:
#!/bin/bash
# Your cron job logic here
/usr/bin/python3 /path/to/your/script.py
# Ping heartbeat monitor on success
if [ $? -eq 0 ]; then
    curl -s "https://monitoring-url/ping/TOKEN" > /dev/null
fi
For a Python script:
import requests

def run_job():
    # Your job logic
    process_invoices()
    sync_data()

if __name__ == "__main__":
    run_job()
    # Signal successful completion
    requests.get("https://monitoring-url/ping/TOKEN", timeout=5)
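One possible refinement is to keep the ping from affecting the job's own outcome: if the monitoring endpoint is unreachable, the job has still succeeded. The sketch below assumes that behavior is what you want. It uses the standard library's `urllib.request` rather than `requests`, and the names `send_ping` and `run_with_heartbeat` are hypothetical; the URL is the same placeholder used above.

```python
import sys
import urllib.request
from typing import Callable

PING_URL = "https://monitoring-url/ping/TOKEN"  # placeholder, replace with your own


def send_ping(url: str = PING_URL, timeout: float = 5.0) -> bool:
    """Send the heartbeat; return False instead of raising if the ping fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except OSError as exc:  # URLError and socket timeouts are OSError subclasses
        print(f"heartbeat ping failed: {exc}", file=sys.stderr)
        return False


def run_with_heartbeat(job: Callable[[], None],
                       ping: Callable[[], bool] = send_ping) -> None:
    """Run the job, then ping only if it finished without raising."""
    job()   # an exception here propagates, so no ping is sent
    ping()  # reached only after the job completed
```

Called as `run_with_heartbeat(run_job)`, this keeps the rule that a failed job sends no ping. A failed ping is logged to stderr but does not turn a successful run into a failed one.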
For Node.js:
const https = require('https');

async function runJob() {
  // Your job logic
  await processQueue();
  // Signal success; handle 'error' so a failed ping cannot crash the process
  https.get('https://monitoring-url/ping/TOKEN').on('error', console.error);
}

runJob().catch(console.error);
# Example: run every 5 minutes
*/5 * * * * /path/to/your/job.sh
# Example: run daily at 2am
0 2 * * * /path/to/daily-report.py
The monitoring interval should match your cron schedule, with a grace period accounting for execution time.
High priority — monitor immediately:
Medium priority:
Lower priority (still worth monitoring):
In addition to detecting failed jobs, heartbeat monitoring can detect when jobs run too slowly. If a job that normally completes in 30 seconds starts taking 5 minutes, this indicates a problem (database performance, external API slowness) even if the job ultimately succeeds.
Configure the grace period to match expected execution time. A job that usually takes 1 minute should have a grace period of 3-4 minutes — giving it reasonable time to complete while still alerting on significant delays.
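To make the arithmetic concrete, here is a small sketch of how a grace period separates a normal run from a slow one. It assumes the monitor measures the grace period from the scheduled start time, which may differ from how a given service counts it; `check_in_status` is a hypothetical helper, and the times are illustrative.

```python
from datetime import datetime, timedelta


def check_in_status(scheduled_at: datetime, pinged_at: datetime,
                    grace: timedelta) -> str:
    """Classify a ping relative to its scheduled start time (hypothetical helper)."""
    if pinged_at <= scheduled_at + grace:
        return "on time"
    return "late"


scheduled_at = datetime(2024, 1, 1, 2, 0)  # a daily 2am job
grace = timedelta(minutes=4)               # for a job that usually takes ~1 minute

# Normal run: finishes and pings after 1 minute.
print(check_in_status(scheduled_at, scheduled_at + timedelta(minutes=1), grace))  # on time
# Slow run: succeeds, but only pings after 5 minutes.
print(check_in_status(scheduled_at, scheduled_at + timedelta(minutes=5), grace))  # late
```

In the second case the job still completes, but the alert fires at the 4-minute mark, before the ping arrives. That is how a successful but degraded run is surfaced.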
After every deployment, verify your cron jobs are still running correctly. Common deployment mistakes include overwriting crontab entries, changing file paths, and breaking environment variable access.
Add a post-deployment check to your runbook: "Verify last heartbeat received from critical jobs within the past [schedule interval]."
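If the last-heartbeat timestamps can be read from your monitoring dashboard or an export, the runbook check can be scripted. This is a rough sketch under that assumption: the job names, timestamps, and the `stale_jobs` helper are hypothetical, and how you obtain the timestamps depends on your monitoring service.

```python
from datetime import datetime, timedelta


def stale_jobs(last_heartbeat: dict[str, datetime],
               schedule_interval: dict[str, timedelta],
               now: datetime) -> list[str]:
    """Return the jobs whose last heartbeat is older than one schedule interval."""
    return sorted(
        name
        for name, seen in last_heartbeat.items()
        if now - seen > schedule_interval[name]
    )


now = datetime(2024, 1, 1, 12, 0)
last_heartbeat = {
    "queue-worker": datetime(2024, 1, 1, 11, 57),  # 3 minutes ago
    "daily-report": datetime(2023, 12, 30, 2, 0),  # more than a day ago
}
schedule_interval = {
    "queue-worker": timedelta(minutes=5),
    "daily-report": timedelta(days=1),
}

print(stale_jobs(last_heartbeat, schedule_interval, now))  # ['daily-report']
```

An empty list means every critical job has checked in within its schedule interval. Anything else names the jobs to investigate before closing out the deployment.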
Heartbeat monitoring detects when a job doesn't run. Process monitoring detects when a process crashes.
For long-running background workers (as opposed to scheduled cron jobs), combine:
The heartbeat monitoring guide covers this broader context.
Set up heartbeat monitoring for your cron jobs at Domain Monitor — know the moment a scheduled task stops running.