
Node.js powers millions of production applications, from APIs and real-time services to full-stack apps built with frameworks like Express, Fastify, Next.js, and NestJS. Monitoring a Node.js application effectively means combining external HTTP checks, which confirm availability, with process management, which handles recovery.
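Node.js applications have some distinctive failure modes, including crashes from unhandled promise rejections, memory leaks, and a blocked event loop, each discussed later in this guide.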
External HTTP monitoring catches most of these — if the Node process crashes or becomes unresponsive, HTTP checks fail.
Every Node.js application should expose a /health endpoint:
// Express
app.get('/health', (req, res) => {
  res.status(200).json({
    status: 'ok',
    uptime: process.uptime(),
    timestamp: new Date().toISOString()
  });
});

// Fastify
fastify.get('/health', async (request, reply) => {
  return {
    status: 'ok',
    uptime: process.uptime()
  };
});
A more thorough health endpoint checks that dependencies are available:
app.get('/health', async (req, res) => {
  try {
    // Test database connection
    await db.query('SELECT 1');
    res.status(200).json({ status: 'ok' });
  } catch (error) {
    res.status(503).json({
      status: 'error',
      detail: 'Database unreachable'
    });
  }
});
Keep health endpoints lightweight — they're called frequently by monitoring. Avoid expensive operations.
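One way to keep a dependency-checking endpoint cheap is to bound the check with a timeout and reuse its result for a few seconds. The sketch below is only an illustration: `createCachedCheck`, `ttlMs`, and `timeoutMs` are hypothetical names, not part of Express or any database client, and the default values are arbitrary.

```javascript
// Hypothetical helper: wraps an async dependency check so that it is bounded by
// a timeout and its result is reused for `ttlMs` milliseconds.
function createCachedCheck(checkFn, { ttlMs = 5000, timeoutMs = 2000 } = {}) {
  let cached = null; // { ok: boolean, checkedAt: number }

  return async function cachedCheck(now = Date.now()) {
    // Serve the cached result while it is still fresh
    if (cached && now - cached.checkedAt < ttlMs) {
      return cached.ok;
    }

    let timer;
    const timeout = new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error('health check timed out')), timeoutMs);
    });

    let ok;
    try {
      // Whichever settles first wins: the real check or the timeout
      await Promise.race([checkFn(), timeout]);
      ok = true;
    } catch {
      ok = false;
    } finally {
      clearTimeout(timer);
    }

    cached = { ok, checkedAt: now };
    return ok;
  };
}

// Possible usage with the Express endpoint above (db is the app's own client):
// const dbHealthy = createCachedCheck(() => db.query('SELECT 1'));
// app.get('/health', async (req, res) => {
//   const ok = await dbHealthy();
//   res.status(ok ? 200 : 503).json({ status: ok ? 'ok' : 'error' });
// });
```

With a 5 second cache, a monitor polling every minute still sees fresh results, while a burst of checks from several monitoring locations triggers at most one database query per cache window.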
PM2 is the most popular Node.js process manager for production:
# Install PM2
npm install -g pm2
# Start your application
pm2 start app.js --name "my-app"
# Configure auto-restart
pm2 startup # Generate startup script
pm2 save # Save current process list
PM2 automatically restarts your application if it crashes. Key PM2 monitoring features:
# Monitor real-time metrics
pm2 monit
# View logs
pm2 logs my-app
# View status
pm2 status
Configure restart policy in your ecosystem file:
// ecosystem.config.js
module.exports = {
  apps: [{
    name: 'my-app',
    script: 'app.js',
    instances: 'max', // Use all CPU cores
    exec_mode: 'cluster', // Cluster mode for zero-downtime reloads
    max_memory_restart: '500M', // Restart if memory exceeds 500MB
    restart_delay: 3000, // Wait 3 seconds before restarting
    max_restarts: 10, // Stop restarting after 10 consecutive unstable restarts (app is marked errored)
    error_file: '/var/log/pm2/error.log',
    out_file: '/var/log/pm2/out.log'
  }]
};
max_memory_restart is especially useful for catching memory leaks before they cause total process failure.
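To see memory growth before PM2 has to step in, the process can log its own usage periodically. This is a minimal sketch using Node's built-in `process.memoryUsage()`; the `memorySnapshot` helper and the 60 second interval are illustrative choices, not PM2 features.

```javascript
// Convert process.memoryUsage() byte counts into megabytes (one decimal place).
function memorySnapshot(usage = process.memoryUsage()) {
  const toMb = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;
  return {
    rssMb: toMb(usage.rss),
    heapUsedMb: toMb(usage.heapUsed),
    heapTotalMb: toMb(usage.heapTotal),
  };
}

// Log a snapshot every 60 seconds (interval chosen for illustration).
const memoryTimer = setInterval(() => {
  console.log('memory', JSON.stringify(memorySnapshot()));
}, 60_000);

// unref() stops this timer from keeping the process alive on its own.
memoryTimer.unref();
```

A steadily climbing `heapUsedMb` in the logs between restarts is a hint that the restarts are masking a leak rather than a one-off spike.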
External monitoring confirms your Node.js application is accessible to users:
Monitor: https://yourdomain.com/health
Method: GET
Expected status: 200
Content check: "ok"
Interval: 1 minute
Configure in Domain Monitor or any uptime monitoring service. This tests the full stack: DNS → load balancer/proxy → Node.js process → response.
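To make the check configuration above concrete, here is a rough sketch of what an external monitor does on each run. It is not how Domain Monitor or any particular service is implemented; `evaluateCheck` and `runCheck` are hypothetical names, and the code assumes Node 18+ for the global `fetch` and `AbortSignal.timeout`.

```javascript
// Decide whether a response satisfies the configured expectations.
function evaluateCheck(result, expected) {
  if (result.error) {
    return { up: false, reason: `request failed: ${result.error}` };
  }
  if (result.status !== expected.status) {
    return { up: false, reason: `unexpected status ${result.status}` };
  }
  if (!result.body.includes(expected.contains)) {
    return { up: false, reason: 'content check failed' };
  }
  return { up: true, reason: 'ok' };
}

// Perform the GET request with a timeout, then evaluate the response.
async function runCheck(url, expected, timeoutMs = 10_000) {
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return evaluateCheck({ status: res.status, body: await res.text() }, expected);
  } catch (err) {
    // Network errors and timeouts both count as "down"
    return evaluateCheck({ error: err.name }, expected);
  }
}

// Example call matching the configuration above:
// runCheck('https://yourdomain.com/health', { status: 200, contains: 'ok' })
//   .then((outcome) => console.log(outcome));
```

The timeout matters as much as the status code: a Node process with a blocked event loop often accepts the connection but never responds.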
Set alert channels to SMS (for immediate notification) and Slack (for team visibility). See how to set up downtime alerts for complete alert configuration.
If your Node.js application handles HTTPS directly (without a reverse proxy), the certificate is loaded by the Node process itself, as in the setup below, so you need to monitor it for expiry:
const https = require('https');
const fs = require('fs');
const options = {
  key: fs.readFileSync('/path/to/key.pem'),
  cert: fs.readFileSync('/path/to/cert.pem')
};
https.createServer(options, app).listen(443);
Most production Node.js deployments use Nginx as a reverse proxy for SSL termination. Either way, SSL certificate monitoring with advance expiry warnings is essential.
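An expiry check can be sketched with Node's built-in `tls` module: connect, read the peer certificate, and count the days left. The function names and the 10 second timeout below are assumptions for illustration, and a hosted monitoring service would normally run this for you from outside your network.

```javascript
const tls = require('node:tls');

// Whole days between `now` and the certificate's expiry date (negative if expired).
function daysUntilExpiry(validTo, now = new Date()) {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((new Date(validTo).getTime() - now.getTime()) / msPerDay);
}

// Connect to host:port over TLS and report when its certificate expires.
// With default TLS settings, an already expired or untrusted certificate
// rejects the promise instead of resolving.
function checkCertificate(host, port = 443) {
  return new Promise((resolve, reject) => {
    const socket = tls.connect({ host, port, servername: host }, () => {
      const cert = socket.getPeerCertificate();
      socket.end();
      resolve({ validTo: cert.valid_to, daysLeft: daysUntilExpiry(cert.valid_to) });
    });
    socket.setTimeout(10_000, () => socket.destroy(new Error('TLS connection timed out')));
    socket.on('error', reject);
  });
}

// Example: warn when fewer than 14 days remain (threshold is illustrative).
// checkCertificate('yourdomain.com').then(({ daysLeft }) => {
//   if (daysLeft < 14) console.warn(`Certificate expires in ${daysLeft} days`);
// });
```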
Node.js is commonly used for background processing — job queues (Bull, BullMQ), scheduled tasks, and event processors. These don't expose HTTP endpoints but need monitoring.
For a Bull/BullMQ worker:
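const { Worker } = require('bullmq');
const worker = new Worker('myQueue', async (job) => {
  await processJob(job);
  // Ping heartbeat monitor after successful processing (global fetch needs Node 18+).
  // A failed ping is logged rather than thrown, so it cannot fail a completed job.
  await fetch('https://monitoring-url/ping/YOUR_TOKEN')
    .catch((err) => console.error('Heartbeat ping failed:', err.message));
}, {
  // Assumption: Redis runs on localhost; adjust for your environment
  connection: { host: '127.0.0.1', port: 6379 }
});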
See heartbeat monitoring for the complete setup.
For Next.js applications (whether on Vercel or self-hosted):
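// app/api/health/route.js (App Router)
export async function GET() {
  return Response.json({ status: 'ok' });
}
// With the Pages Router (pages/api/health.js), export a default handler instead:
// export default function handler(req, res) { res.status(200).json({ status: 'ok' }); }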
For Vercel-deployed Next.js, see monitoring Vercel deployments. For self-hosted Next.js, use the same PM2 + external monitoring approach described above.
Not handling unhandled rejections: Without a handler, an unhandled promise rejection crashes the process on Node.js 15 and later. Add:
process.on('unhandledRejection', (reason, promise) => {
  console.error('Unhandled rejection:', reason);
  // Don't exit here; let PM2 handle serious failures
});
Ignoring memory leaks: If PM2 keeps restarting the application because of max_memory_restart, investigate the leak rather than just raising the limit.
Blocking the event loop: CPU-intensive operations should be offloaded to worker threads. A blocked event loop causes all requests to time out, which external monitoring detects as timeout failures.
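As a rough illustration of offloading, the sketch below runs a deliberately slow Fibonacci calculation in a worker thread using Node's built-in `worker_threads` module. The `fibInWorker` helper is hypothetical, and the worker source is inlined with `eval: true` only to keep the example in one file; a real application would more likely point the worker at a separate script.

```javascript
const { Worker } = require('node:worker_threads');

// Worker code, inlined so the example is self-contained.
const fibWorkerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  function fib(n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }
  parentPort.postMessage(fib(workerData.n));
`;

// Run the CPU-heavy calculation off the main thread and resolve with its result.
function fibInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(fibWorkerSource, { eval: true, workerData: { n } });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

// Possible usage in an Express route; /health keeps responding while this runs:
// app.get('/fib/:n', async (req, res) => {
//   res.json({ result: await fibInWorker(Number(req.params.n)) });
// });
```

Starting a worker per request has its own overhead, so a pool of long-lived workers is usually the better fit for sustained load.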
Monitor your Node.js applications from outside your infrastructure at Domain Monitor.