[Image: Cloudflare Workers analytics dashboard showing request volume, error rates and edge locations on a global map]

How to Monitor Cloudflare Workers in Production

Cloudflare Workers run your code at Cloudflare's edge — distributed across hundreds of data centres globally, with sub-millisecond cold starts and no servers to manage. When they work, they're fast and reliable. When they fail, the errors are often subtle: a Worker that throws an uncaught exception returns a 500 to users with nothing in your logs unless you've set up explicit error handling.

Monitoring Workers requires a different approach from monitoring a traditional application server.

What Cloudflare Analytics Provides

Cloudflare's dashboard shows you analytics for your Workers:

  • Request count and success rate
  • CPU time per request
  • Error rates (4xx and 5xx)
  • Subrequest metrics

These are useful but reactive — you see what happened, not what's happening right now. And analytics won't tell you which specific routes are failing, or alert you when error rates spike.

Cloudflare Workers Observability

Cloudflare's Workers observability features let you capture and search real-time logs from your Workers. Enable Workers Logs in your wrangler.toml:

[observability]
enabled = true

Or use a Tail Worker to process logs programmatically:

[[tail_consumers]]
service = "my-log-worker"

Your Tail Worker receives every request event and can forward errors to your logging service, Slack, or alerting system.
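A minimal Tail Worker might look like the sketch below. It filters for failed invocations and forwards a compact payload to a webhook; the `LOG_WEBHOOK_URL` binding name is a hypothetical choice, and the exact shape of tail events is simplified here.

```javascript
// Build a compact payload from a tail event. Tail events carry the
// invocation outcome ("ok", "exception", "exceededCpu", ...), any
// console logs, and any uncaught exceptions from the producer Worker.
function buildErrorPayload(event) {
    return {
        outcome: event.outcome,
        url: event.event && event.event.request ? event.event.request.url : null,
        exceptions: (event.exceptions || []).map(e => e.message),
        logs: (event.logs || []).map(l => l.message),
    };
}

export default {
    async tail(events, env, ctx) {
        // Forward only events that represent failures.
        const failures = events.filter(e => e.outcome !== 'ok');
        if (failures.length === 0) return;

        // LOG_WEBHOOK_URL is a hypothetical binding for your log sink.
        ctx.waitUntil(fetch(env.LOG_WEBHOOK_URL, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify(failures.map(buildErrorPayload)),
        }));
    }
};
```

Because the Tail Worker runs out of band, forwarding failures here adds no latency to the producer Worker's responses.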

Error Handling in Workers

Workers that throw unhandled exceptions return a generic 500 error. Wrap your handler to catch and report errors:

export default {
    async fetch(request, env, ctx) {
        try {
            return await handleRequest(request, env, ctx);
        } catch (error) {
            // Log the error
            ctx.waitUntil(reportError(error, request, env));

            // Return a proper error response
            return new Response(JSON.stringify({
                error: 'Internal server error',
                message: env.ENVIRONMENT === 'development' ? error.message : undefined
            }), {
                status: 500,
                headers: { 'Content-Type': 'application/json' }
            });
        }
    }
};

async function reportError(error, request, env) {
    // Send to your logging/alerting service
    await fetch(env.ERROR_WEBHOOK_URL, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            error: error.message,
            stack: error.stack,
            url: request.url,
            timestamp: new Date().toISOString()
        })
    });
}
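One caveat: the error reporter should itself be failure-tolerant. If the webhook is down, reporting must not throw a second error inside waitUntil(). A small wrapper (a sketch, not part of any Workers API) makes any reporter safe:

```javascript
// Wrap a reporter so a failure in reporting is swallowed rather than
// propagated; console.error output is still captured by Workers Logs.
function safeReport(reportFn) {
    return async (...args) => {
        try {
            await reportFn(...args);
        } catch (e) {
            console.error('error reporting failed:', e);
        }
    };
}

// Usage: ctx.waitUntil(safeReport(reportError)(error, request, env));
```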

Health Check Endpoint

Add a dedicated health check route to your Worker that external monitoring can check:

async function handleRequest(request, env, ctx) {
    const url = new URL(request.url);

    if (url.pathname === '/health') {
        return handleHealth(request, env);
    }

    // ... rest of your routing
}

async function handleHealth(request, env) {
    const checks = {};

    // Check KV availability if you use it
    if (env.MY_KV) {
        try {
            await env.MY_KV.get('health-check-key');
            checks.kv = 'ok';
        } catch (e) {
            checks.kv = 'error';
        }
    }

    // Check D1 database if you use it
    if (env.DB) {
        try {
            await env.DB.prepare('SELECT 1').first();
            checks.database = 'ok';
        } catch (e) {
            checks.database = 'error';
        }
    }

    const allOk = Object.values(checks).every(v => v === 'ok');

    return new Response(JSON.stringify({
        status: allOk ? 'ok' : 'degraded',
        ...checks,
        timestamp: new Date().toISOString()
    }), {
        status: allOk ? 200 : 503,
        headers: { 'Content-Type': 'application/json' }
    });
}

Point your uptime monitor at /health and you get a real signal about Worker functionality, not just whether Cloudflare's edge is responding.

Monitoring Workers That Proxy to an Origin

Many Workers act as a middleware layer — handling authentication, rate limiting, A/B testing, or caching before forwarding requests to an origin server. In this architecture, you have two layers to monitor:

  1. The Worker itself — Is it responding? Is it returning errors?
  2. The origin — Is the underlying server the Worker forwards to healthy?

A Worker can be running perfectly while the origin it proxies to is down. The Worker will return errors (502 or whatever you configure), but the failure is actually in the origin.

Monitor both the Worker endpoint and your origin's health check endpoint directly:

// In your Worker health check
async function handleHealth(request, env) {
    // Check origin health
    let originStatus = 'unknown';
    try {
        const originResponse = await fetch(`${env.ORIGIN_URL}/health`, {
            cf: { cacheEverything: false }
        });
        originStatus = originResponse.ok ? 'ok' : 'error';
    } catch (e) {
        originStatus = 'unreachable';
    }

    return new Response(JSON.stringify({
        status: originStatus === 'ok' ? 'ok' : 'degraded',
        worker: 'ok',
        origin: originStatus
    }), {
        status: originStatus === 'ok' ? 200 : 503,
        headers: { 'Content-Type': 'application/json' }
    });
}

KV, D1, and Durable Objects Failures

Workers bind to Cloudflare's storage primitives — KV, D1, R2, Durable Objects. These can experience availability issues independently of the Worker runtime itself. A Worker that reads from KV will fail if KV has a regional issue.

Build awareness of these dependencies into your health check, and watch Cloudflare's own status page at cloudflarestatus.com for platform-level incidents.
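A degraded storage binding can also hang rather than fail fast, which would stall the health endpoint itself. One approach (a sketch; the timeout budget is an arbitrary choice) is to race each dependency check against a timeout:

```javascript
// Race a dependency check against a timeout so a slow binding
// resolves to 'timeout' instead of stalling the health endpoint.
function withTimeout(promise, ms) {
    const timeout = new Promise(resolve =>
        setTimeout(() => resolve('timeout'), ms)
    );
    return Promise.race([
        promise.then(() => 'ok').catch(() => 'error'),
        timeout,
    ]);
}

// Usage inside a health handler, with a 2-second budget per check:
// checks.kv = await withTimeout(env.MY_KV.get('health-check-key'), 2000);
// checks.database = await withTimeout(env.DB.prepare('SELECT 1').first(), 2000);
```

Reporting 'timeout' as its own state is deliberate: a binding that is slow but alive is a different signal from one that errors outright.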

Rate Limits and CPU Limits

Workers have CPU time limits (10 ms per request on the free plan, with higher configurable limits on paid plans). A Worker that consistently hits its CPU limit will throw errors. Cloudflare's analytics will show this as CPU time spikes.

Optimise expensive operations, use waitUntil() for non-critical work after the response is sent, and monitor average CPU time in your analytics.
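Deferring non-critical work with waitUntil() keeps it off the response path. The sketch below times each request and reports metrics after responding; the `recordMetrics` helper, the `METRICS_URL` binding, and the placeholder `handleRequest` are all hypothetical names for illustration.

```javascript
// Placeholder router so the sketch is self-contained; real routing
// logic would go here.
async function handleRequest(request, env, ctx) {
    return new Response('ok');
}

async function recordMetrics(env, data) {
    // METRICS_URL is a hypothetical binding for your metrics collector.
    await fetch(env.METRICS_URL, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(data),
    });
}

const worker = {
    async fetch(request, env, ctx) {
        const start = Date.now();
        const response = await handleRequest(request, env, ctx);

        // waitUntil() lets this run after the response is sent: it no
        // longer adds to user-visible latency, though it still counts
        // toward the Worker's resource limits.
        ctx.waitUntil(recordMetrics(env, {
            path: new URL(request.url).pathname,
            status: response.status,
            durationMs: Date.now() - start,
        }));

        return response;
    }
};

export default worker;
```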

External Uptime Monitoring for Workers

Cloudflare Workers run on Cloudflare's own infrastructure, so monitoring them from within Cloudflare misses platform-level issues. External monitoring from infrastructure outside Cloudflare's network verifies that your Worker is reachable even when Cloudflare itself has problems.

Domain Monitor monitors your Workers from multiple global locations outside Cloudflare's network. If your Worker fails in a specific region — or globally — you're alerted immediately. Create a free account and point a monitor at your Worker's health check endpoint.

Combined with Cloudflare's own analytics and Tail Workers for detailed logs, external uptime monitoring gives you the complete picture.

Also in This Series

For more context on edge and platform monitoring, see uptime monitoring best practices and multi-location uptime monitoring.

