[Image: Cloudflare Workers analytics dashboard showing request volume, error rates and edge locations on a global map]

How to Monitor Cloudflare Workers in Production

Cloudflare Workers run your code at Cloudflare's edge — distributed across hundreds of data centres globally, with sub-millisecond cold starts and no servers to manage. When they work, they're fast and reliable. When they fail, the errors are often subtle: a Worker that throws an uncaught exception returns a 500 to users with nothing in your logs unless you've set up explicit error handling.

Monitoring Workers requires a different approach from monitoring a traditional application server.

What Cloudflare Analytics Provides

Cloudflare's dashboard shows you analytics for your Workers:

  • Request count and success rate
  • CPU time per request
  • Error rates (4xx and 5xx)
  • Subrequest metrics

These are useful but reactive — you see what happened, not what's happening right now. And analytics won't tell you which specific routes are failing, or alert you when error rates spike.

Cloudflare Workers Observability

Cloudflare's Workers observability features let you capture and search real-time logs from your Workers. Enable Workers Logs in your wrangler.toml:

[observability]
enabled = true

Or use a Tail Worker to process logs programmatically:

[[tail_consumers]]
service = "my-log-worker"

Your Tail Worker receives every request event and can forward errors to your logging service, Slack, or alerting system.
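A minimal Tail Worker might look like the sketch below. It filters for failed invocations and forwards a compact payload to a webhook; the `LOG_WEBHOOK_URL` binding name is a hypothetical choice, and the exact shape of tail events is simplified here.

```javascript
// Build a compact payload from a tail event. Tail events carry the
// invocation outcome ("ok", "exception", "exceededCpu", ...), any
// console logs, and any uncaught exceptions from the producer Worker.
function buildErrorPayload(event) {
    return {
        outcome: event.outcome,
        url: event.event && event.event.request ? event.event.request.url : null,
        exceptions: (event.exceptions || []).map(e => e.message),
        logs: (event.logs || []).map(l => l.message),
    };
}

export default {
    async tail(events, env, ctx) {
        // Forward only events that represent failures.
        const failures = events.filter(e => e.outcome !== 'ok');
        if (failures.length === 0) return;

        // LOG_WEBHOOK_URL is a hypothetical binding for your log sink.
        ctx.waitUntil(fetch(env.LOG_WEBHOOK_URL, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify(failures.map(buildErrorPayload)),
        }));
    }
};
```

Because the Tail Worker runs out of band, forwarding failures here adds no latency to the producer Worker's responses.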

Error Handling in Workers

Workers that throw unhandled exceptions return a generic 500 error. Wrap your handler to catch and report errors:

export default {
    async fetch(request, env, ctx) {
        try {
            return await handleRequest(request, env, ctx);
        } catch (error) {
            // Log the error
            ctx.waitUntil(reportError(error, request, env));

            // Return a proper error response
            return new Response(JSON.stringify({
                error: 'Internal server error',
                message: env.ENVIRONMENT === 'development' ? error.message : undefined
            }), {
                status: 500,
                headers: { 'Content-Type': 'application/json' }
            });
        }
    }
};

async function reportError(error, request, env) {
    // Send to your logging/alerting service
    await fetch(env.ERROR_WEBHOOK_URL, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            error: error.message,
            stack: error.stack,
            url: request.url,
            timestamp: new Date().toISOString()
        })
    });
}
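One caveat: the error reporter should itself be failure-tolerant. If the webhook is down, reporting must not throw a second error inside waitUntil(). A small wrapper (a sketch, not part of any Workers API) makes any reporter safe:

```javascript
// Wrap a reporter so a failure in reporting is swallowed rather than
// propagated; console.error output is still captured by Workers Logs.
function safeReport(reportFn) {
    return async (...args) => {
        try {
            await reportFn(...args);
        } catch (e) {
            console.error('error reporting failed:', e);
        }
    };
}

// Usage: ctx.waitUntil(safeReport(reportError)(error, request, env));
```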

Health Check Endpoint

Add a dedicated health check route to your Worker that external monitoring can check:

async function handleRequest(request, env, ctx) {
    const url = new URL(request.url);

    if (url.pathname === '/health') {
        return handleHealth(request, env);
    }

    // ... rest of your routing
}

async function handleHealth(request, env) {
    const checks = {};

    // Check KV availability if you use it
    if (env.MY_KV) {
        try {
            await env.MY_KV.get('health-check-key');
            checks.kv = 'ok';
        } catch (e) {
            checks.kv = 'error';
        }
    }

    // Check D1 database if you use it
    if (env.DB) {
        try {
            await env.DB.prepare('SELECT 1').first();
            checks.database = 'ok';
        } catch (e) {
            checks.database = 'error';
        }
    }

    const allOk = Object.values(checks).every(v => v === 'ok');

    return new Response(JSON.stringify({
        status: allOk ? 'ok' : 'degraded',
        ...checks,
        timestamp: new Date().toISOString()
    }), {
        status: allOk ? 200 : 503,
        headers: { 'Content-Type': 'application/json' }
    });
}

Point your uptime monitor at /health and you get a real signal about Worker functionality, not just whether Cloudflare's edge is responding.

Monitoring Workers That Proxy to an Origin

Many Workers act as a middleware layer — handling authentication, rate limiting, A/B testing, or caching before forwarding requests to an origin server. In this architecture, you have two layers to monitor:

  1. The Worker itself — Is it responding? Is it returning errors?
  2. The origin — Is the underlying server the Worker forwards to healthy?

A Worker can be running perfectly while the origin it proxies to is down. The Worker will return errors (502 or whatever you configure), but the failure is actually in the origin.

Monitor both the Worker endpoint and your origin's health check endpoint directly:

// In your Worker health check
async function handleHealth(request, env) {
    // Check origin health
    let originStatus = 'unknown';
    try {
        const originResponse = await fetch(`${env.ORIGIN_URL}/health`, {
            cf: { cacheEverything: false }
        });
        originStatus = originResponse.ok ? 'ok' : 'error';
    } catch (e) {
        originStatus = 'unreachable';
    }

    return new Response(JSON.stringify({
        status: originStatus === 'ok' ? 'ok' : 'degraded',
        worker: 'ok',
        origin: originStatus
    }), {
        status: originStatus === 'ok' ? 200 : 503,
        headers: { 'Content-Type': 'application/json' }
    });
}

KV, D1, and Durable Objects Failures

Workers bind to Cloudflare's storage primitives — KV, D1, R2, Durable Objects. These can experience availability issues independently of the Worker runtime itself. A Worker that reads from KV will fail if KV has a regional issue.

Build awareness of these dependencies into your health check, and watch Cloudflare's own status page at cloudflarestatus.com for platform-level incidents.
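A degraded storage binding can also hang rather than fail fast, which would stall the health endpoint itself. One approach (a sketch; the timeout budget is an arbitrary choice) is to race each dependency check against a timeout:

```javascript
// Race a dependency check against a timeout so a slow binding
// resolves to 'timeout' instead of stalling the health endpoint.
function withTimeout(promise, ms) {
    const timeout = new Promise(resolve =>
        setTimeout(() => resolve('timeout'), ms)
    );
    return Promise.race([
        promise.then(() => 'ok').catch(() => 'error'),
        timeout,
    ]);
}

// Usage inside a health handler, with a 2-second budget per check:
// checks.kv = await withTimeout(env.MY_KV.get('health-check-key'), 2000);
// checks.database = await withTimeout(env.DB.prepare('SELECT 1').first(), 2000);
```

Reporting 'timeout' as its own state is deliberate: a binding that is slow but alive is a different signal from one that errors outright.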

Rate Limits and CPU Limits

Workers have CPU time limits (10 ms per request on the free plan, with higher configurable limits on paid plans). A Worker that consistently hits its CPU limit will throw errors. Cloudflare's analytics will show this as CPU time spikes.

Optimise expensive operations, use waitUntil() for non-critical work after the response is sent, and monitor average CPU time in your analytics.
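Deferring non-critical work with waitUntil() keeps it off the response path. The sketch below times each request and reports metrics after responding; the `recordMetrics` helper, the `METRICS_URL` binding, and the placeholder `handleRequest` are all hypothetical names for illustration.

```javascript
// Placeholder router so the sketch is self-contained; real routing
// logic would go here.
async function handleRequest(request, env, ctx) {
    return new Response('ok');
}

async function recordMetrics(env, data) {
    // METRICS_URL is a hypothetical binding for your metrics collector.
    await fetch(env.METRICS_URL, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(data),
    });
}

const worker = {
    async fetch(request, env, ctx) {
        const start = Date.now();
        const response = await handleRequest(request, env, ctx);

        // waitUntil() lets this run after the response is sent: it no
        // longer adds to user-visible latency, though it still counts
        // toward the Worker's resource limits.
        ctx.waitUntil(recordMetrics(env, {
            path: new URL(request.url).pathname,
            status: response.status,
            durationMs: Date.now() - start,
        }));

        return response;
    }
};

export default worker;
```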

External Uptime Monitoring for Workers

Cloudflare Workers run on Cloudflare's own infrastructure, so monitoring them from within Cloudflare misses platform-level issues. External monitoring from infrastructure outside Cloudflare's network verifies that your Worker is reachable even when Cloudflare itself has problems.

Domain Monitor monitors your Workers from multiple global locations outside Cloudflare's network. If your Worker fails in a specific region — or globally — you're alerted immediately. Create a free account and point a monitor at your Worker's health check endpoint.

Combined with Cloudflare's own analytics and Tail Workers for detailed logs, external uptime monitoring gives you the complete picture.

Also in This Series

For more context on edge and platform monitoring, see uptime monitoring best practices and multi-location uptime monitoring.

