
Queue workers are silent infrastructure. When they're healthy, nobody notices. When they fail, jobs accumulate silently, emails stop sending, reports stop generating, and webhooks stop processing — and nobody knows until a user complains or you happen to check.
Unlike a crashed web server (which immediately returns errors), a dead queue worker leaves your application appearing healthy from the outside while background work quietly piles up. This is what makes queue monitoring different from uptime monitoring — and why it needs a separate approach.
Queue worker monitoring has three distinct concerns:

- **Worker liveness** — Are workers running at all? A worker process that has crashed, been OOM-killed, or failed to restart after a deploy means no jobs are being processed.
- **Queue depth** — How many jobs are waiting? A growing queue indicates workers can't keep up with inflow, even if workers are technically running.
- **Job failure rate** — Are jobs completing successfully? Workers can be running while most jobs are failing, which is just as bad as no workers.

All three need monitoring. Any one of them can be the failure point.
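As a rough sketch, the three signals reduce to a single health status. The function name, signature, and thresholds here are illustrative assumptions, not from any framework:

```python
# Hypothetical sketch: combining the three signals into one status.
# The thresholds (1000 jobs, 5% failures) are placeholders to tune.

def queue_status(workers_alive: bool, queue_depth: int, failure_rate: float) -> str:
    """Reduce the three monitoring signals to a single health status."""
    if not workers_alive:
        return "down"        # no liveness: nothing is being processed at all
    if queue_depth > 1000 or failure_rate > 0.05:
        return "degraded"    # workers run, but can't keep up or jobs are failing
    return "ok"

print(queue_status(True, 12, 0.01))   # ok
print(queue_status(True, 5000, 0.0))  # degraded
print(queue_status(False, 0, 0.0))    # down
```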
Laravel's Horizon dashboard gives you a UI overview, but it's not enough on its own for production alerting.
Expose a dedicated health check for your queue system:
```php
// routes/api.php
Route::get('/health/queue', function () {
    $failedJobs = DB::table('failed_jobs')->count();
    $horizonStatus = Cache::get('horizon:status', 'inactive');

    $health = [
        'status' => $horizonStatus === 'running' ? 'ok' : 'degraded',
        'horizon' => $horizonStatus,
        'failed_jobs' => $failedJobs,
    ];

    $statusCode = $horizonStatus === 'running' ? 200 : 503;

    return response()->json($health, $statusCode);
});
```
Horizon writes its status to the cache — `horizon:status` will be `running`, `paused`, or `inactive`. Your uptime monitor can check this endpoint every minute and alert when the status isn't `running`.
Queue depth deserves its own check, since Horizon can report `running` while a backlog quietly grows:

```php
Route::get('/health/queue', function () {
    $queues = ['default', 'emails', 'reports'];
    $depths = [];

    foreach ($queues as $queue) {
        $depths[$queue] = Queue::size($queue);
    }

    $maxDepth = max($depths);
    $status = $maxDepth > 1000 ? 'degraded' : 'ok';

    return response()->json([
        'status' => $status,
        'queues' => $depths,
        'horizon' => Cache::get('horizon:status'),
    ], $status === 'ok' ? 200 : 503);
});
```
In `config/horizon.php`:

```php
'environments' => [
    'production' => [
        'supervisor-1' => [
            'connection' => 'redis',
            'queue' => ['default'],
            'balance' => 'auto',
            'maxProcesses' => 10,
            'minProcesses' => 3,
            'tries' => 3,
            'timeout' => 60,
        ],
    ],
],

'metrics' => [
    'trim_snapshots' => [
        'job' => 24,
        'queue' => 24,
    ],
],

'waits' => [
    'redis:default' => 60, // Alert if jobs wait longer than 60 seconds
],
```
For an end-to-end liveness signal, use a heartbeat pattern: dispatch a lightweight job through the queue on a schedule, and alert when the value it writes goes stale. Because the cache key is only refreshed when a worker actually processes the job, a stale heartbeat means the pipeline — scheduler, queue, or workers — has stopped moving:

```php
// app/Jobs/QueueHeartbeat.php
class QueueHeartbeat implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable;

    public function handle()
    {
        // Only written when a worker picks up and runs the job
        Cache::put('queue:heartbeat', now()->timestamp, 300);
    }
}

// app/Console/Kernel.php — dispatch every 5 minutes
$schedule->job(new QueueHeartbeat)->everyFiveMinutes();
```

```php
// Health check
Route::get('/health/queue', function () {
    $heartbeat = Cache::get('queue:heartbeat');
    $age = $heartbeat ? now()->timestamp - $heartbeat : null;

    // Degraded if there is no heartbeat, or the last one is over 10 minutes old
    $status = (!$heartbeat || $age > 600) ? 'degraded' : 'ok';

    return response()->json([
        'status' => $status,
        'heartbeat_age_seconds' => $age,
    ], $status === 'ok' ? 200 : 503);
});
```
BullMQ is the standard choice for Node.js queue processing. Its built-in metrics make monitoring straightforward.
```javascript
const { Queue } = require('bullmq');
const IORedis = require('ioredis');

// BullMQ requires an ioredis connection (it does not work with node-redis)
const connection = new IORedis(process.env.REDIS_URL, {
  maxRetriesPerRequest: null,
});

const emailQueue = new Queue('emails', { connection });
const reportQueue = new Queue('reports', { connection });

app.get('/health/queue', async (req, res) => {
  try {
    const [emailCounts, reportCounts] = await Promise.all([
      emailQueue.getJobCounts('waiting', 'active', 'failed', 'completed'),
      reportQueue.getJobCounts('waiting', 'active', 'failed', 'completed'),
    ]);

    const totalFailed = emailCounts.failed + reportCounts.failed;
    const totalWaiting = emailCounts.waiting + reportCounts.waiting;
    const degraded = totalFailed > 50 || totalWaiting > 1000;

    res.status(degraded ? 503 : 200).json({
      status: degraded ? 'degraded' : 'ok',
      queues: {
        emails: emailCounts,
        reports: reportCounts,
      },
    });
  } catch (err) {
    res.status(503).json({ status: 'error', message: err.message });
  }
});
```
BullMQ workers expose an isRunning() method, but if your worker runs in a separate process from your web server, you need another approach:
```javascript
// In your worker process — write a heartbeat to Redis
const { Worker } = require('bullmq');

const worker = new Worker('emails', processEmailJob, { connection });

worker.on('ready', () => {
  console.log('Worker ready');

  // Refresh the heartbeat every 30 seconds; the key expires after 120
  setInterval(async () => {
    await connection.set('worker:emails:heartbeat', Date.now(), 'EX', 120);
  }, 30000);
});

worker.on('error', (err) => {
  console.error('Worker error:', err);
});
```
```javascript
// In your health check — read the heartbeat
app.get('/health/queue', async (req, res) => {
  const heartbeat = await connection.get('worker:emails:heartbeat');
  const age = heartbeat ? Date.now() - parseInt(heartbeat, 10) : null;
  const workerAlive = age !== null && age < 90000; // 90-second threshold

  res.status(workerAlive ? 200 : 503).json({
    status: workerAlive ? 'ok' : 'degraded',
    worker_age_ms: age,
  });
});
```
BullMQ automatically marks jobs as stalled if a worker dies mid-processing:
```javascript
const { QueueEvents } = require('bullmq');

const queueEvents = new QueueEvents('emails', { connection });

queueEvents.on('stalled', ({ jobId }) => {
  console.error(`Job ${jobId} stalled — worker may have died`);
  // Send alert to your monitoring system
});

queueEvents.on('failed', ({ jobId, failedReason }) => {
  console.error(`Job ${jobId} failed: ${failedReason}`);
});
```
Celery is the standard queue processing library for Python. Monitoring requires a combination of the Celery inspection API and external health checks.
```python
import os

from celery import Celery
from flask import Flask, jsonify

app = Flask(__name__)
celery = Celery('tasks', broker=os.environ['REDIS_URL'])

@app.route('/health/queue')
def queue_health():
    try:
        # Check if any workers are responding
        inspect = celery.control.inspect(timeout=2)
        active = inspect.active()

        if not active:
            return jsonify({
                'status': 'degraded',
                'error': 'No workers responding'
            }), 503

        worker_count = len(active)
        total_active_jobs = sum(len(jobs) for jobs in active.values())

        return jsonify({
            'status': 'ok',
            'workers': worker_count,
            'active_jobs': total_active_jobs,
        })
    except Exception as e:
        return jsonify({'status': 'error', 'error': str(e)}), 503
```
You can also check queue depth directly — with the Redis broker, each Celery queue is stored as a Redis list:

```python
import os

import redis

r = redis.from_url(os.environ['REDIS_URL'])

@app.route('/health/queue')
def queue_health():
    queues = ['celery', 'emails', 'reports']
    depths = {}

    for queue in queues:
        depths[queue] = r.llen(queue)

    max_depth = max(depths.values()) if depths else 0
    status = 'degraded' if max_depth > 1000 else 'ok'

    return jsonify({
        'status': status,
        'queues': depths,
    }), 200 if status == 'ok' else 503
```
The heartbeat pattern works here too: schedule a task through Celery beat and alert when the value it writes goes stale.

```python
# tasks.py
import os
import time

import redis

@celery.task
def queue_heartbeat():
    r = redis.from_url(os.environ['REDIS_URL'])
    r.setex('queue:heartbeat', 300, int(time.time()))

# Celery beat schedule — runs the heartbeat task every 5 minutes
CELERYBEAT_SCHEDULE = {
    'queue-heartbeat': {
        'task': 'tasks.queue_heartbeat',
        'schedule': 300.0,
    },
}
```

```python
@app.route('/health/queue')
def queue_health():
    heartbeat = r.get('queue:heartbeat')
    if not heartbeat:
        return jsonify({'status': 'degraded', 'error': 'No heartbeat'}), 503

    age = int(time.time()) - int(heartbeat)
    if age > 600:  # 10 minutes
        return jsonify({'status': 'degraded', 'heartbeat_age': age}), 503

    return jsonify({'status': 'ok', 'heartbeat_age': age})
```
| Signal | Threshold | Severity |
|---|---|---|
| Worker not running | Immediate | P1 |
| Health endpoint returning 503 | Immediate | P1 |
| Queue depth growing | >1000 jobs | P2 |
| Job failure rate | >5% failure rate | P2 |
| Heartbeat missed | >10 minutes | P1 |
| Jobs stalled | Any | P2 |
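As an illustration, the table above can be encoded as data that your alerting glue evaluates on each check. The metric names and rule shape here are assumptions for the sketch, not any monitoring product's API:

```python
# Hypothetical rule table mirroring the alerting thresholds above.
# Each rule: (signal name, predicate over a metrics dict, severity).
ALERT_RULES = [
    ("worker_not_running", lambda m: not m["worker_running"], "P1"),
    ("health_endpoint_503", lambda m: m["health_status_code"] == 503, "P1"),
    ("queue_depth", lambda m: m["queue_depth"] > 1000, "P2"),
    ("failure_rate", lambda m: m["failure_rate"] > 0.05, "P2"),
    ("heartbeat_missed", lambda m: m["heartbeat_age_seconds"] > 600, "P1"),
    ("jobs_stalled", lambda m: m["stalled_jobs"] > 0, "P2"),
]

def triggered_alerts(metrics: dict) -> list:
    """Return (signal, severity) pairs for every rule that fires."""
    return [(name, sev) for name, pred, sev in ALERT_RULES if pred(metrics)]

metrics = {
    "worker_running": True,
    "health_status_code": 200,
    "queue_depth": 4200,       # over the 1000-job threshold
    "failure_rate": 0.02,
    "heartbeat_age_seconds": 45,
    "stalled_jobs": 0,
}
print(triggered_alerts(metrics))  # [('queue_depth', 'P2')]
```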
The most reliable approach is a dedicated /health/queue endpoint that your uptime monitor checks every minute. It encapsulates all the queue-specific logic, and you get the same alerting path as your main application uptime.
Domain Monitor monitors your health check endpoints — including queue-specific ones — from multiple global locations every minute. Point it at /health/queue alongside your main health check and you'll know immediately when workers go down, not when a user reports that their email never arrived. Create a free account.