
Serverless functions promise zero infrastructure management — but they still fail. Cold starts add latency. Functions time out. Memory limits get hit. Deployment errors break handlers. Without monitoring, these failures are invisible until users complain.
Serverless failure modes differ from those of traditional servers, but external HTTP monitoring catches most of them: if your function-backed endpoint returns an error or times out, the monitor fails.
The foundation of serverless monitoring is the same as any web application: external HTTP checks on your production endpoints.
For a serverless API:
Monitor: https://api.yourdomain.com/v1/health
Expected status: 200
Content check: {"status":"ok"}
Interval: 1 minute
For a Vercel-deployed application:
Monitor: https://yourapp.vercel.app/api/health
Expected status: 200
This validates the complete path: DNS → CDN edge → serverless platform → function → response.
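A single external check of this kind can be sketched in a few lines. This is an illustration only, not how any particular monitoring service implements it: the function name `checkEndpoint`, its option names, and the 10 second default timeout are all assumptions, and the global `fetch` requires Node.js 18 or later.

```javascript
// Sketch of one external health check. `fetchImpl` is injectable so the
// logic can be exercised without a network; it defaults to the global fetch.
async function checkEndpoint(
  url,
  { expectedStatus = 200, mustContain = null, timeoutMs = 10000, fetchImpl = fetch } = {}
) {
  const started = Date.now();
  try {
    // Abort the request if the endpoint does not answer within timeoutMs.
    const res = await fetchImpl(url, { signal: AbortSignal.timeout(timeoutMs) });
    const body = await res.text();
    const elapsedMs = Date.now() - started;
    if (res.status !== expectedStatus) {
      return { ok: false, reason: `status ${res.status}`, elapsedMs };
    }
    if (mustContain !== null && !body.includes(mustContain)) {
      return { ok: false, reason: 'content check failed', elapsedMs };
    }
    return { ok: true, reason: null, elapsedMs };
  } catch (err) {
    // Network errors and timeouts count as failed checks.
    return { ok: false, reason: err.name || 'error', elapsedMs: Date.now() - started };
  }
}
```

Calling `checkEndpoint('https://api.yourdomain.com/v1/health', { mustContain: '{"status":"ok"}' })` once a minute would approximate the first configuration above. A hosted monitor adds what this sketch lacks: multiple locations, retries, and alert delivery.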
A dedicated /health endpoint in your function should be lightweight — just return 200. Don't call the database in health checks for serverless (it adds cost and latency).
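As a rough sketch, a health handler along these lines does no I/O at all. The response shape assumes a Lambda function behind an API Gateway proxy integration, and the name `healthHandler` is hypothetical; other platforms expect a different return value.

```javascript
// Minimal health handler in the Lambda proxy-response shape.
// It makes no database call and no downstream request.
const healthHandler = async () => ({
  statusCode: 200,
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ status: 'ok' }),
});
```

The body serialises to `{"status":"ok"}`, which matches the content check in the example monitor configuration.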
Cold starts are a performance issue characteristic of serverless platforms. Response time data from your uptime checks shows when they are affecting user experience.
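One way to read that response time data is to compare each check against a baseline. The sketch below is a simple heuristic, not a definitive cold start detector: the function names and the 3x multiplier are assumptions, and a slow check can have causes other than a cold start.

```javascript
// Median of a list of numbers (does not mutate the input).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Flag checks that took more than `multiplier` times the median response time.
function findSlowChecks(responseTimesMs, multiplier = 3) {
  if (responseTimesMs.length === 0) return { baselineMs: null, slow: [] };
  const baselineMs = median(responseTimesMs);
  const slow = responseTimesMs.filter((ms) => ms > baselineMs * multiplier);
  return { baselineMs, slow };
}

// Example: five warm responses and one suspiciously slow one.
const sample = findSlowChecks([120, 130, 110, 125, 900, 115]);
console.log(sample); // { baselineMs: 122.5, slow: [ 900 ] }
```

If the slow checks cluster after quiet periods or deployments, cold starts are a likely explanation.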
For latency-sensitive functions, consider:
Beyond external HTTP monitoring, AWS provides Lambda-specific tools:
CloudWatch Metrics (built-in):
- Duration: execution time per invocation
- Errors: invocations that threw an exception
- Throttles: invocations rejected due to concurrency limits
- ConcurrentExecutions: simultaneous invocations
CloudWatch Alarms on these metrics complement external monitoring.
External monitoring tells you about user-facing availability. CloudWatch tells you about internal Lambda behaviour.
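As an illustration, an alarm on the Errors metric could be described with a parameter object like the one below. The field names follow the CloudWatch PutMetricAlarm API. The helper name `lambdaErrorAlarmParams`, the alarm naming scheme, and the default threshold and period are assumptions to adjust for your workload. No alarm actions (for example an SNS topic) are included, so this alarm would change state without notifying anyone.

```javascript
// Build PutMetricAlarm parameters for a Lambda function's Errors metric.
// This only constructs the object; it does not call AWS.
function lambdaErrorAlarmParams(
  functionName,
  { threshold = 1, periodSeconds = 60, evaluationPeriods = 1 } = {}
) {
  return {
    AlarmName: `${functionName}-errors`, // hypothetical naming scheme
    Namespace: 'AWS/Lambda',
    MetricName: 'Errors',
    Dimensions: [{ Name: 'FunctionName', Value: functionName }],
    Statistic: 'Sum',
    Period: periodSeconds,
    EvaluationPeriods: evaluationPeriods,
    Threshold: threshold,
    ComparisonOperator: 'GreaterThanOrEqualToThreshold',
    // No invocations means no data points; treat that as healthy.
    TreatMissingData: 'notBreaching',
  };
}
```

With the AWS SDK for JavaScript v3, an object like this would typically be passed to `PutMetricAlarmCommand` from `@aws-sdk/client-cloudwatch`. Swapping `MetricName` to `Throttles` gives the equivalent throttling alarm.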
Vercel Functions (including Next.js API routes deployed on Vercel) are monitored through:
External HTTP monitoring: Point your uptime monitor at your function's endpoint. Vercel deployment monitoring covers the full approach.
Vercel Analytics: Vercel provides built-in Web Vitals and function execution metrics in their dashboard.
Function timeout awareness: Vercel Functions have execution limits (Hobby: 10s, Pro: 60s). Functions that regularly approach these limits need optimisation or migration to longer-running solutions.
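To make "regularly approach these limits" measurable, you could compute the share of invocations that run close to the limit. This is a minimal sketch: the function name and the 80% cutoff are assumptions, and the durations would come from your platform's logs or metrics.

```javascript
// Share of invocations whose duration is at or above `fraction` of the limit.
function nearTimeoutShare(durationsMs, limitMs, fraction = 0.8) {
  if (durationsMs.length === 0) return 0;
  const cutoff = limitMs * fraction;
  const near = durationsMs.filter((ms) => ms >= cutoff).length;
  return near / durationsMs.length;
}

// Example: a 10 second limit, with two of four invocations above 8 seconds.
console.log(nearTimeoutShare([1000, 8500, 9000, 2000], 10000)); // 0.5
```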
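Cloudflare Workers run at the edge (150+ global locations) with near-zero cold starts. Because a Worker can fail in one region while staying healthy in others, a check from a single location can miss the problem.
Use multi-location uptime monitoring to detect regional Worker failures.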
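The value of multi-location checks is in how the results are combined. A sketch, assuming each check location reports a result of the hypothetical shape `{ region, ok }`:

```javascript
// Classify a set of per-region check results as healthy, regional, or global.
function summarizeRegions(results) {
  const failing = results.filter((r) => !r.ok).map((r) => r.region);
  let scope = 'healthy';
  if (results.length > 0 && failing.length === results.length) {
    scope = 'global';
  } else if (failing.length > 0) {
    scope = 'regional';
  }
  return { scope, failing };
}

// Example: only the Asia-Pacific check fails.
console.log(summarizeRegions([
  { region: 'eu-west', ok: true },
  { region: 'us-east', ok: true },
  { region: 'ap-southeast', ok: false },
])); // { scope: 'regional', failing: [ 'ap-southeast' ] }
```

The region names are placeholders. A "regional" result points at the edge location or the route to it, while a "global" result points at the Worker code or its origin.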
Cloudflare Analytics: Workers analytics shows requests, errors, and CPU time per worker.
Serverless background processing (SQS + Lambda, Cloudflare Queue Workers) is invisible to HTTP monitoring. Use heartbeat monitoring to verify background processing is running:
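// Lambda handler for background processing
export const handler = async (event) => {
  // Process each SQS message in the batch; a thrown error fails the invocation
  for (const record of event.Records) {
    await processMessage(record); // processMessage is your application's own function
  }
  // Signal successful processing. A failed ping is logged, not rethrown,
  // so a monitoring outage does not cause SQS to redeliver processed messages.
  try {
    await fetch('https://monitoring-url/ping/YOUR_TOKEN');
  } catch (err) {
    console.error('Heartbeat ping failed', err);
  }
};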
Configure the heartbeat monitor with a grace period matching your expected processing interval.
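The check a heartbeat monitor performs can be sketched as a single comparison. This models one common definition, where the job is overdue once the expected interval plus the grace period has passed since the last ping. Monitoring tools differ in how they define these settings, so the function and parameter names here are assumptions.

```javascript
// True when no ping has arrived within the expected interval plus grace.
function isHeartbeatOverdue(lastPingAtMs, expectedIntervalMs, graceMs, nowMs = Date.now()) {
  return nowMs - lastPingAtMs > expectedIntervalMs + graceMs;
}

const MINUTE = 60 * 1000;
// A job expected every 5 minutes with a 5 minute grace period:
console.log(isHeartbeatOverdue(0, 5 * MINUTE, 5 * MINUTE, 9 * MINUTE));  // false
console.log(isHeartbeatOverdue(0, 5 * MINUTE, 5 * MINUTE, 11 * MINUTE)); // true
```

For queue-driven functions, note that a quiet queue produces no pings, so the interval should reflect how often messages are actually expected.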
| Failure Type | Detection | Alert |
|---|---|---|
| Endpoint returning errors | External HTTP monitor | SMS + Slack |
| High error rate | CloudWatch alarm | Slack |
| Throttling | CloudWatch alarm | Slack (investigate concurrency limits) |
| Slow response times | External monitor threshold | Slack |
| Background job stopped | Heartbeat monitor | SMS |
| SSL certificate expiry | SSL monitor | Email (30 days in advance) |
Serverless architectures benefit from the same downtime alert configuration as traditional applications — the delivery mechanisms are identical.
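If alerts are routed in code rather than in a monitoring dashboard, the table above could be expressed as a lookup. The keys, channel identifiers, and function name below are hypothetical; only the mapping itself comes from the table.

```javascript
// Alert routing mirroring the failure-type table. Keys are hypothetical slugs.
const ALERT_ROUTES = {
  'endpoint-errors': { detection: 'External HTTP monitor', channels: ['sms', 'slack'] },
  'high-error-rate': { detection: 'CloudWatch alarm', channels: ['slack'] },
  'throttling': { detection: 'CloudWatch alarm', channels: ['slack'] },
  'slow-response': { detection: 'External monitor threshold', channels: ['slack'] },
  'background-job-stopped': { detection: 'Heartbeat monitor', channels: ['sms'] },
  'ssl-expiry': { detection: 'SSL monitor', channels: ['email'] },
};

// Return the channels for a failure type, or an empty list if it is unknown.
function channelsFor(failureType) {
  return Object.hasOwn(ALERT_ROUTES, failureType)
    ? ALERT_ROUTES[failureType].channels
    : [];
}

console.log(channelsFor('endpoint-errors')); // [ 'sms', 'slack' ]
```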
Monitor serverless function endpoints from outside your cloud infrastructure at Domain Monitor.