
Docker has transformed how applications are deployed — but containerised workloads have their own failure modes. Containers crash, Docker daemons restart, networks misconfigure, and images change in ways that break running applications.
Effective container monitoring combines Docker's built-in health checking with external uptime monitoring to give you full visibility.
Understanding what can go wrong helps you monitor the right things:
External HTTP monitoring catches most of these — if the application isn't responding to HTTP requests, the monitor fails regardless of cause.
Docker supports native health checks in your Dockerfile or compose configuration:
# Dockerfile (curl must be installed in the image)
HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1
Or in docker-compose.yml:
services:
  app:
    image: your-app:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
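Health check states:
starting: the container has just started and has not yet passed a check
healthy: the most recent check succeeded
unhealthy: the check has failed the configured number of consecutive retries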
An unhealthy container is still running. A standalone Docker Engine won't stop or restart it automatically (Swarm services do replace unhealthy tasks), so you need additional tooling to respond to unhealthy containers.
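One way to respond to unhealthy containers is a small watchdog script run from cron or a systemd timer. The sketch below is an illustration, not a Docker feature: it uses the `health` filter of `docker ps` to list containers marked unhealthy, then restarts each one. The function names are hypothetical, and it assumes the `docker` CLI is on the PATH of the user running it.

```python
import subprocess


def parse_container_names(ps_output: str) -> list[str]:
    """Turn `docker ps --format '{{.Names}}'` output into a list of names."""
    return [line.strip() for line in ps_output.splitlines() if line.strip()]


def unhealthy_containers() -> list[str]:
    """Ask the Docker CLI for running containers currently marked unhealthy."""
    result = subprocess.run(
        ["docker", "ps", "--filter", "health=unhealthy", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_container_names(result.stdout)


def restart_unhealthy() -> list[str]:
    """Restart every unhealthy container and return the names that were restarted."""
    names = unhealthy_containers()
    for name in names:
        subprocess.run(["docker", "restart", name], check=True)
    return names
```

Calling `restart_unhealthy()` once a minute is usually enough. Log the returned names so that repeated restarts of the same container are visible rather than silently masked.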
While Docker health checks monitor internally, external HTTP monitoring confirms reachability from outside the container network:
Monitor: https://yourdomain.com/health
Expected status: 200
Interval: 1 minute
This validates the full path: DNS → load balancer/proxy → Docker network → container → application.
A container can be healthy internally while being unreachable externally due to:
External monitoring from Domain Monitor catches what Docker health checks miss.
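To make the external check described above concrete, here is a minimal sketch of what a single probe does. It is not how any particular monitoring service is implemented. The function names are hypothetical, and the URL you pass in would be your own public health endpoint.

```python
import urllib.error
import urllib.request


def evaluate_status(status: int, expected_status: int = 200) -> tuple[bool, str]:
    """Compare the status code received with the one the monitor expects."""
    if status != expected_status:
        return False, f"unexpected status {status}"
    return True, "ok"


def check_endpoint(url: str, expected_status: int = 200, timeout: float = 10.0) -> tuple[bool, str]:
    """Request the URL once from outside and report (is_up, reason)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 502 or 503.
        status = exc.code
    except OSError as exc:
        # DNS failure, refused connection, TLS error, or timeout.
        return False, f"unreachable: {exc}"
    return evaluate_status(status, expected_status)
```

Run it from a machine outside the Docker host, for example `check_endpoint("https://yourdomain.com/health")`. A probe run on the host itself skips the DNS and proxy hops that this check is meant to cover.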
Configure restart policies so Docker automatically recovers from container failures:
# docker-compose.yml
services:
  app:
    image: your-app:latest
    restart: unless-stopped  # or: always, on-failure:5
Restart policies:
no - never restart (default)
always - always restart, including on Docker daemon startup
unless-stopped - restart unless manually stopped
on-failure:N - restart up to N times on non-zero exit code

With restart: unless-stopped, a crashed container restarts automatically. External monitoring detects if the restart loop is failing (container keeps crashing), evidenced by repeated brief outages.
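A restart loop shows up in external check history as several short outages rather than one long one. The sketch below counts separate outages in a chronological list of check results. The function names, the result format (True for a passing check) and the threshold of three are illustrative assumptions, not part of any monitoring product.

```python
def count_outages(results: list[bool]) -> int:
    """Count up-to-down transitions in a chronological list of check results."""
    outages = 0
    previous_up = True  # assume the service was up before the window started
    for is_up in results:
        if previous_up and not is_up:
            outages += 1
        previous_up = is_up
    return outages


def looks_like_restart_loop(results: list[bool], threshold: int = 3) -> bool:
    """Flag a window of checks that contains several separate outages."""
    return count_outages(results) >= threshold
```

A single long outage counts once, so this only flags the flapping pattern that a crashing and restarting container produces.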
For multi-service applications, monitor the endpoints of each critical service:
services:
  web:
    ports: ["80:80"]
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"]
  api:
    ports: ["8080:8080"]
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
  worker:
    # No HTTP port: use heartbeat monitoring instead
For each exposed service, create an external monitor. For background workers without HTTP endpoints, use heartbeat monitoring — the worker pings a URL on each successful job completion.
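The heartbeat pattern can be sketched as a wrapper around each job: ping on success, stay silent on failure, and let the monitor alert when pings stop arriving. The URL below is a placeholder and the function names are hypothetical. Your monitoring provider supplies the real heartbeat URL.

```python
import urllib.request
from typing import Callable

HEARTBEAT_URL = "https://example.com/heartbeat/your-worker-id"  # placeholder URL


def send_heartbeat(url: str = HEARTBEAT_URL, timeout: float = 10.0) -> None:
    """Ping the heartbeat URL; the monitor alerts when pings stop arriving."""
    with urllib.request.urlopen(url, timeout=timeout):
        pass


def run_job(job: Callable[[], None], ping: Callable[[], None] = send_heartbeat) -> bool:
    """Run one job and ping only if it finished without raising."""
    try:
        job()
    except Exception as exc:
        # No ping on failure, so a run of failed jobs trips the heartbeat alert.
        print(f"job failed: {exc}")
        return False
    try:
        ping()
    except OSError as exc:
        # A failed ping should not crash the worker itself.
        print(f"heartbeat ping failed: {exc}")
    return True
```

Set the monitor's expected interval a little longer than the slowest normal gap between jobs, so an idle queue does not raise false alarms.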
If Nginx or another proxy handles SSL termination in your Docker setup, monitor the certificate independently of the container health.
An expired certificate causes the application to be unreachable despite all containers being healthy. SSL certificate monitoring with 30-day advance warnings prevents this.
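For illustration, a certificate expiry check can be written with Python's standard `ssl` module. This is a minimal sketch rather than a replacement for a monitoring service. The function names are hypothetical, and the 30-day threshold mirrors the warning window mentioned above.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between `now` and a certificate's notAfter timestamp."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).days


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 10.0) -> str:
    """Open a TLS connection and return the certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        # An already-expired certificate fails verification here and raises
        # ssl.SSLCertVerificationError, which is itself worth alerting on.
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def needs_renewal(hostname: str, warn_days: int = 30) -> bool:
    """True when the certificate expires within the warning window."""
    remaining = days_until_expiry(fetch_not_after(hostname), datetime.now(timezone.utc))
    return remaining <= warn_days
```

Because the check connects to the public hostname, it tests the certificate actually served by the proxy, not a file on disk that may never have been reloaded.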
For Docker Swarm (multi-node Docker):
For Kubernetes, see the dedicated guide on monitoring Kubernetes pods — the concepts are similar but Kubernetes has richer health probe options.
Combine external monitoring alerts with container event logging:
External HTTP monitoring (via Domain Monitor):
Container-level monitoring (via Docker events or your infrastructure monitoring):
unhealthy status

The two layers cover different failure modes: external monitoring catches user-visible failures, container monitoring catches failures that haven't yet caused user impact.
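Container-level health changes can be collected from the `docker events` stream. The sketch below filters for `health_status` events and extracts the container name and new state. It assumes the JSON layout commonly emitted by `docker events --format '{{json .}}'` (an `Action` such as `health_status: unhealthy`, with the name under `Actor.Attributes.name`). Verify that against your Docker version. The function names are hypothetical.

```python
import json
import subprocess
from typing import Iterator, Optional


def parse_health_event(line: str) -> Optional[tuple[str, str]]:
    """Return (container_name, health) for a health_status event, else None."""
    event = json.loads(line)
    action = event.get("Action", "")
    if not action.startswith("health_status"):
        return None
    health = action.partition(":")[2].strip()
    name = event.get("Actor", {}).get("Attributes", {}).get("name", "unknown")
    return name, health


def watch_health_events() -> Iterator[tuple[str, str]]:
    """Stream health changes from `docker events` until the process is stopped."""
    process = subprocess.Popen(
        ["docker", "events", "--filter", "event=health_status", "--format", "{{json .}}"],
        stdout=subprocess.PIPE,
        text=True,
    )
    assert process.stdout is not None
    for line in process.stdout:
        parsed = parse_health_event(line)
        if parsed is not None:
            yield parsed
```

Iterating over `watch_health_events()` and forwarding each `(name, health)` pair to your logging or alerting system gives the second layer described above.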
Only using Docker health checks: Internal health checks don't verify external reachability — always add external HTTP monitoring.
No restart policy: Containers that exit don't restart by default. Set restart: unless-stopped for production services.
Health check endpoint doing too much: A health check that calls the database, external APIs, and runs business logic will fail for many reasons unrelated to container health. Keep health checks lightweight — just verify the process is alive and basic connectivity is intact.
Ignoring start_period: Without a start period, health checks run immediately on container startup. A slow-starting application fails its first checks and gets marked unhealthy before it's ready. Always set start_period to cover the application's startup time.
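As an illustration of the lightweight health check recommended above, here is a minimal endpoint built on Python's standard library. It confirms only that the process is alive and able to answer HTTP. The handler and function names are hypothetical, and in a real application you would add the same route to your existing web framework instead of running a separate server.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def health_response(path: str) -> tuple[int, bytes]:
    """Map a request path to (status, body) without touching external systems."""
    if path == "/health":
        return 200, b"ok"
    return 404, b"not found"


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, body = health_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Serve the health endpoint on all interfaces; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Deeper dependency checks, such as database connectivity, are better exposed on a separate endpoint so that a slow dependency does not get the container marked unhealthy.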
Monitor your Docker-based services externally at Domain Monitor.