
Django powers everything from small API backends to high-traffic platforms. Like any production application, Django needs monitoring — but the framework's conventions and ecosystem create specific patterns for health checking, worker monitoring, and performance visibility.
This guide covers how to monitor a production Django application from the outside in: external uptime monitoring, health endpoints, Celery worker monitoring, and database performance tracking.
Before configuring monitoring, understand what commonly goes wrong in Django applications:
- **Database connection pool exhaustion:** Django's default database configuration opens and closes a connection for every request; persistent connections require setting `CONN_MAX_AGE`, and true pooling requires a layer such as django-db-geventpool or PgBouncer. Under sustained concurrent load, the database's connection limit is hit.
- **Celery worker crashes:** Background task processing stops without causing an HTTP error. Users experience silent failures — emails not sent, images not processed, reports not generated.
- **Gunicorn/uWSGI worker timeouts:** Long-running requests can exhaust worker processes, leaving the application unable to serve new requests. The symptom is a sudden spike in 502/504 errors.
- **Static files misconfiguration:** After a deployment, Django's `collectstatic` may not have run, resulting in broken CSS, JavaScript, or images while the application itself returns HTTP 200.
- **Cache invalidation failures:** If Redis or Memcached becomes unavailable and Django is not configured to handle cache failures gracefully, cached views may raise unhandled exceptions.
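One defensive pattern for the cache failure mode above is to wrap cache reads so a broken backend degrades to a cache miss instead of an exception. A minimal sketch (`cache_get_safe` is a hypothetical helper, not a Django API; in practice you would pass it `django.core.cache.cache`):

```python
import logging

logger = logging.getLogger(__name__)


def cache_get_safe(cache, key, default=None):
    """Read from a cache backend, treating any backend error as a miss.

    `cache` is any object with a .get() method. When Redis/Memcached is
    down, callers fall back to uncached behaviour instead of raising a 500.
    """
    try:
        return cache.get(key, default)
    except Exception:
        logger.warning("cache unavailable; treating %r as a miss", key)
        return default
```

The same wrapper idea applies to writes: log and move on, since a cold cache is a performance problem while an unhandled cache exception is an outage.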
Django does not include a health endpoint by default. Add one using django-health-check or a custom view.
```bash
pip install django-health-check
```

```python
# settings.py
INSTALLED_APPS = [
    # ...
    'health_check',
    'health_check.db',
    'health_check.cache',
    'health_check.storage',
    'health_check.contrib.celery',  # if using Celery
    'health_check.contrib.redis',   # if using Redis
]
```
```python
# urls.py
from django.urls import path

from health_check.views import MainView

urlpatterns = [
    # ...
    path('health/', MainView.as_view()),
]
```
The /health/ endpoint returns HTTP 200 when all checks pass and HTTP 500 when any check fails. By default the response is an HTML status page; request JSON (for example with an Accept: application/json header) to get a machine-readable list of which checks passed and which failed.
For a simpler approach without the full package:
```python
# views.py
from django.core.cache import cache
from django.db import connection
from django.http import JsonResponse


def health_check(request):
    checks = {}

    # Database check: a trivial query proves the connection is usable
    try:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
        checks["database"] = "ok"
    except Exception as e:
        checks["database"] = f"error: {e}"

    # Cache check: round-trip a short-lived key
    try:
        cache.set("health_check", "ok", 5)
        checks["cache"] = "ok" if cache.get("health_check") == "ok" else "error"
    except Exception as e:
        checks["cache"] = f"error: {e}"

    status_code = 200 if all(v == "ok" for v in checks.values()) else 503
    return JsonResponse(
        {"status": "ok" if status_code == 200 else "error", "checks": checks},
        status=status_code,
    )
```

```python
# urls.py
from django.urls import path

from . import views

urlpatterns = [
    # ...
    path("health/", views.health_check, name="health_check"),
]
```
Monitor this endpoint at Domain Monitor with content verification checking for "status": "ok" in the response body.
Celery background workers are the most common source of silent failures in Django applications. Monitor them through multiple layers:
Include Celery status in your health endpoint:
```python
from celery import current_app


def get_celery_health():
    try:
        # inspect() needs the Celery app context; a short timeout keeps
        # the health endpoint responsive when no workers are running
        active = current_app.control.inspect(timeout=2).active()
        if active is None:  # no workers replied to the broadcast
            return "no_workers"
        return "ok"
    except Exception:
        return "error"
```
Celery Beat schedules periodic tasks. Add a heartbeat ping to your critical periodic tasks:
```python
# tasks.py
import requests

from celery import shared_task


@shared_task
def daily_reconciliation():
    # ... task logic ...

    # Ping heartbeat monitor on success
    requests.get(
        "https://domain-monitor.io/heartbeat/daily-reconciliation", timeout=5
    )
```
Configure a heartbeat monitor with a 25-hour interval and a 1-hour grace period. If the task misses its daily schedule, the alert fires within a couple of hours of the expected run.
See how to monitor cron jobs for a detailed heartbeat monitoring guide.
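The inline ping in daily_reconciliation can be factored into a reusable decorator so every critical task reports the same way. A sketch using only the standard library (the decorator name and base URL constant are assumptions; swap in `requests` as in the task above if you prefer):

```python
import functools
import logging
import urllib.request

logger = logging.getLogger(__name__)

HEARTBEAT_BASE = "https://domain-monitor.io/heartbeat"  # assumed base URL


def with_heartbeat(slug, base_url=HEARTBEAT_BASE):
    """Ping a heartbeat monitor after the wrapped task returns successfully.

    The ping is best-effort: a monitoring outage must never fail the task,
    and a task failure must skip the ping so the missed heartbeat alerts.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)  # an exception here skips the ping
            try:
                urllib.request.urlopen(f"{base_url}/{slug}", timeout=5)
            except OSError:
                logger.warning("heartbeat ping failed for %s", slug)
            return result
        return wrapper
    return decorator
```

Stack it under `@shared_task` so the ping fires only when the task body completes without raising.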
Use Flower — a real-time Celery monitoring tool — for queue depth visibility. Expose queue metrics through your health endpoint:
```python
from django.conf import settings
from kombu import Connection


def get_queue_depth(queue_name="default"):
    with Connection(settings.CELERY_BROKER_URL) as conn:
        with conn.channel() as channel:
            # passive=True inspects the queue without creating it
            name, message_count, consumer_count = channel.queue_declare(
                queue=queue_name, passive=True
            )
            return message_count
```
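To act on that number, convert the raw depth into a health-check entry. A small sketch (`queue_check` and its threshold are illustrative, not part of any library; pass it a depth callable such as the one above):

```python
def queue_check(depth_fn, queue="default", max_backlog=500):
    """Build a health-check entry from a queue-depth callable.

    max_backlog is an illustrative threshold -- tune it so that a full
    queue drains well before your slowest acceptable task latency.
    """
    try:
        depth = depth_fn(queue)
    except Exception as exc:
        # Broker unreachable is itself a failure worth surfacing
        return {"status": "error", "detail": str(exc)}
    status = "ok" if depth <= max_backlog else "backlog"
    return {"status": status, "depth": depth}
```

Merging the result into the health view's `checks` dict gives your external monitor visibility into queue backlog, not just broker reachability.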
Gunicorn is the most common WSGI server for Django in production. Configure it to expose worker health:
```python
# gunicorn.conf.py
workers = 4
timeout = 30
keepalive = 5
max_requests = 1000
max_requests_jitter = 100
```
The max_requests setting recycles each worker after a set number of requests, preventing slow memory leaks from accumulating; max_requests_jitter staggers the restarts so all workers are not recycled at the same time.
Monitor Gunicorn via your health endpoint — if workers are exhausted, the health endpoint will time out or fail, triggering your external monitor alert.
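Gunicorn's server hooks can also surface timeouts directly in your logs. A sketch for gunicorn.conf.py (the `worker_abort` hook name comes from Gunicorn's documented server-hook API; the log wording is ours):

```python
# gunicorn.conf.py (hooks section)
import logging

logger = logging.getLogger("gunicorn.error")


def worker_abort(worker):
    # Gunicorn calls this hook when a worker is killed for exceeding
    # `timeout` (SIGABRT). Logging here turns silent 502s into a
    # searchable, alertable signal.
    logger.error("gunicorn worker %s aborted after timeout", worker.pid)
```

Pair the log line with a log-based alert so repeated aborts page someone before workers are fully exhausted.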
Database performance is the most common cause of Django slowness. In development, Django Debug Toolbar shows the queries each request runs, their timings, and N+1 problems. In production, django-silk can profile live requests to identify slow queries.
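A minimal django-silk setup might look like the following; the option names are taken from silk's settings and should be verified against the version you install:

```python
# settings.py (assumes INSTALLED_APPS and MIDDLEWARE are defined above)
INSTALLED_APPS += ['silk']
MIDDLEWARE += ['silk.middleware.SilkyMiddleware']

SILKY_PYTHON_PROFILER = True     # also profile Python time, not just SQL
SILKY_INTERCEPT_PERCENT = 10     # sample 10% of requests to limit overhead
```

The recorded requests are then browsable under a URL you wire up with `path('silk/', include('silk.urls', namespace='silk'))` in urls.py; keep that route behind authentication in production.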
External response time monitoring from Domain Monitor gives you the user-perspective view of performance degradation. Set alerts when p95 response times exceed your threshold.
Expose detailed metrics via Prometheus for Grafana dashboards:
```bash
pip install django-prometheus
```

```python
# settings.py
INSTALLED_APPS = ['django_prometheus'] + INSTALLED_APPS

MIDDLEWARE = (
    ['django_prometheus.middleware.PrometheusBeforeMiddleware']
    + MIDDLEWARE
    + ['django_prometheus.middleware.PrometheusAfterMiddleware']
)
```

```python
# urls.py
from django.urls import include, path

urlpatterns += [
    path('', include('django_prometheus.urls')),
]
```
This exposes /metrics with request counts, response times, database query counts, and more.
Django applications often run on multiple domains — main site, API subdomain, admin subdomain. Monitor all of them.
See SSL certificate monitoring and what is domain hijacking for setup guidance.
Add a post-deployment check to your CI/CD pipeline:
```bash
# After deploying Django to production
sleep 30  # wait for gunicorn to restart

# Check the health endpoint
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" https://yourdomain.com/health/)

if [ "$HTTP_STATUS" != "200" ]; then
    echo "Deployment health check failed: HTTP $HTTP_STATUS"
    # Trigger rollback
    exit 1
fi

echo "Deployment healthy"
```
See monitoring CI/CD pipelines for a broader approach to deployment health verification.
| Layer | Tool | Coverage |
|---|---|---|
| External uptime | Domain Monitor | HTTP, SSL, domain, response time |
| Application health | django-health-check | DB, cache, Celery, storage |
| Background workers | Heartbeat monitors | Celery Beat task completion |
| Performance metrics | django-prometheus + Grafana | Request rates, query times |
| Error tracking | Sentry | Exception capture with context |
| Alerting | PagerDuty or Opsgenie | On-call escalation |
Start with external uptime monitoring and the health endpoint — these give you immediate visibility with minimal setup. Add Celery heartbeat monitoring for any critical background tasks. Layer in Prometheus metrics and error tracking as your application scales.
Monitor your Django application's health endpoint and SSL certificate at Domain Monitor — external uptime monitoring that works with any Python web framework.