
PostgreSQL is stable, capable, and widely trusted — but it has specific failure modes that catch developers off guard. Connection limits hit silently, table bloat accumulates, long-running transactions block other queries, and autovacuum falls behind under heavy write loads. None of these announce themselves loudly until something breaks.
Here's how to monitor the things that matter in a production Postgres setup.
Start with the fundamentals — can your application connect to Postgres?
# Python / Flask + SQLAlchemy (assumes `app` and `db` from Flask-SQLAlchemy)
from flask import jsonify
from sqlalchemy import text

@app.route('/health')
def health():
    try:
        with db.engine.connect() as conn:
            conn.execute(text('SELECT 1'))
        return jsonify({'status': 'ok', 'database': 'ok'}), 200
    except Exception as e:
        return jsonify({'status': 'degraded', 'database': str(e)}), 503
// Node.js / pg (assumes an Express `app` and a pg `Pool` configured elsewhere)
const express = require('express');
const { Pool } = require('pg');

const app = express();
const pool = new Pool(); // reads PG* environment variables by default

app.get('/health', async (req, res) => {
  try {
    await pool.query('SELECT 1');
    res.json({ status: 'ok', database: 'ok' });
  } catch (err) {
    res.status(503).json({ status: 'degraded', database: err.message });
  }
});
Point your uptime monitor at this endpoint. When Postgres is down or unreachable, the health check returns 503 and you're alerted immediately.
PostgreSQL has a hard max_connections limit (default 100). When it's hit, new connections are rejected with FATAL: sorry, too many clients already — and anything that needs a fresh connection fails immediately.
Check current connection status:
-- Current vs maximum connections
SELECT
  max_conn,
  used,
  ROUND(used::numeric / max_conn * 100, 1) AS usage_pct
FROM
  (SELECT setting::int AS max_conn FROM pg_settings WHERE name = 'max_connections') mc,
  (SELECT COUNT(*) AS used FROM pg_stat_activity) ua;

-- Breakdown by state
SELECT state, COUNT(*) FROM pg_stat_activity GROUP BY state;

-- Long-running queries
SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state
FROM pg_stat_activity
WHERE state != 'idle' AND query_start < NOW() - INTERVAL '5 minutes'
ORDER BY duration DESC;
Alert when connection usage exceeds 80%.
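That 80% check is easy to script. Below is a minimal sketch, assuming psycopg2 is installed and that `check_connection_usage`, the DSN string, and the print-based alert are all placeholders for your own monitoring agent:

```python
USAGE_THRESHOLD_PCT = 80.0

def usage_pct(used: int, max_conn: int) -> float:
    """Connection usage as a percentage of max_connections."""
    return round(used / max_conn * 100, 1)

def check_connection_usage(dsn: str) -> bool:
    """Query current usage and return True when it crosses the threshold."""
    import psycopg2  # assumption: psycopg2 installed; any Postgres driver works
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute("SELECT setting::int FROM pg_settings WHERE name = 'max_connections'")
        max_conn = cur.fetchone()[0]
        cur.execute("SELECT COUNT(*) FROM pg_stat_activity")
        used = cur.fetchone()[0]
    pct = usage_pct(used, max_conn)
    if pct >= USAGE_THRESHOLD_PCT:
        print(f"ALERT: connection usage at {pct}% ({used}/{max_conn})")
    return pct >= USAGE_THRESHOLD_PCT

# Run from cron or a monitoring agent, e.g.:
#   check_connection_usage("dbname=yourdb user=monitor")
```

Note that pg_stat_activity includes background workers, so the count is slightly conservative — which is the right direction for an alert.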
If you're hitting connection limits, PgBouncer is the standard solution. It sits between your application and Postgres, pooling connections so hundreds of application threads share a smaller number of actual Postgres connections.
# pgbouncer.ini
[databases]
yourdb = host=127.0.0.1 port=5432 dbname=yourdb
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
Monitor PgBouncer alongside Postgres:
psql -h 127.0.0.1 -p 6432 -U pgbouncer pgbouncer -c "SHOW POOLS;"
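The signal to watch in SHOW POOLS output is clients queuing for a server connection. A sketch of that check — the row dicts below stub out real SHOW POOLS columns (cl_waiting, maxwait); fetching them over port 6432 with your driver of choice is left out:

```python
def saturated_pools(rows):
    """Return pools where clients are waiting, i.e. the pool is exhausted."""
    return [r for r in rows if r.get("cl_waiting", 0) > 0 or r.get("maxwait", 0) > 0]

# Stubbed SHOW POOLS rows for illustration:
pools = [
    {"database": "yourdb", "cl_active": 25, "cl_waiting": 0, "maxwait": 0},
    {"database": "yourdb", "cl_active": 25, "cl_waiting": 7, "maxwait": 3},
]
```

A persistently nonzero cl_waiting or maxwait means default_pool_size is too small for your workload, or queries are holding server connections too long.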
Enable pg_stat_statements for query-level performance tracking:
-- Enable pg_stat_statements (in postgresql.conf)
-- shared_preload_libraries = 'pg_stat_statements'
-- After restart, create extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Top 10 slowest queries by total execution time
SELECT
  round(total_exec_time::numeric, 2) AS total_ms,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  calls,
  LEFT(query, 80) AS query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
Also enable the slow query log:
# postgresql.conf
log_min_duration_statement = 1000 # Log queries over 1 second
log_checkpoints = on
log_connections = on
log_lock_waits = on
PostgreSQL uses MVCC (Multi-Version Concurrency Control) — updates and deletes leave dead tuples in tables. Autovacuum reclaims this space, but under heavy load it can fall behind, causing table bloat that degrades query performance.
Check for tables with excessive dead tuples:
SELECT
  schemaname,
  relname,
  n_live_tup,
  n_dead_tup,
  ROUND(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 1) AS dead_pct,
  last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;
A dead_pct over 20% on frequently accessed tables indicates autovacuum isn't keeping up. You may need to tune autovacuum settings or run VACUUM ANALYZE manually.
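For an automated check, the same ratio and thresholds can be expressed as a small helper — a sketch, with the 10,000-tuple floor and 20% cutoff taken from the query and guidance above:

```python
def dead_pct(n_live_tup: int, n_dead_tup: int) -> float:
    """Dead tuples as a percentage of all tuples (mirrors the SQL's NULLIF guard)."""
    total = n_live_tup + n_dead_tup
    return round(n_dead_tup / total * 100, 1) if total else 0.0

def needs_vacuum(n_live_tup: int, n_dead_tup: int, threshold_pct: float = 20.0) -> bool:
    """Flag a table when it has a meaningful number of dead tuples
    and the dead fraction suggests autovacuum is falling behind."""
    return n_dead_tup > 10000 and dead_pct(n_live_tup, n_dead_tup) >= threshold_pct
```

Feed it rows from pg_stat_user_tables and page on any table that stays flagged across consecutive checks, rather than on a single reading.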
A transaction that's been open for hours can block autovacuum, hold locks, and prevent other queries from completing.
-- Transactions open for over 5 minutes
SELECT
  pid,
  now() - xact_start AS duration,
  state,
  LEFT(query, 100) AS current_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > INTERVAL '5 minutes'
ORDER BY duration DESC;
Alert on transactions older than 30 minutes. Kill them if necessary:
-- Terminate a specific backend (use a pid from the query above)
SELECT pg_terminate_backend(12345);  -- replace 12345 with the offending pid
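If stale transactions are a recurring problem, the find-and-kill loop can be automated. A sketch, assuming psycopg2 and enough privilege to signal other backends (superuser or pg_signal_backend membership); `kill_long_transactions` and its dry_run default are illustrative names, not a standard tool:

```python
from datetime import timedelta

def over_cutoff(xact_age: timedelta, max_minutes: int = 30) -> bool:
    """True when an open transaction has outlived the cutoff."""
    return xact_age > timedelta(minutes=max_minutes)

def kill_long_transactions(dsn: str, max_minutes: int = 30, dry_run: bool = True):
    """Report (or, with dry_run=False, terminate) transactions older than the cutoff."""
    import psycopg2  # assumption: psycopg2 installed
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(
            "SELECT pid, now() - xact_start AS age FROM pg_stat_activity "
            "WHERE xact_start IS NOT NULL"
        )
        for pid, age in cur.fetchall():
            if not over_cutoff(age, max_minutes):
                continue
            print(f"{'would terminate' if dry_run else 'terminating'} pid {pid}, open {age}")
            if not dry_run:
                cur.execute("SELECT pg_terminate_backend(%s)", (pid,))
```

Keep dry_run=True until you've watched the output for a while — terminating a backend rolls back its work, and a long transaction is sometimes a legitimate migration or batch job.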
If you run streaming replication, monitor lag from both the primary and the replica:
-- On the primary: check replication status
SELECT
  client_addr,
  state,
  sent_lsn,
  write_lsn,
  flush_lsn,
  replay_lsn,
  write_lag,
  flush_lag,
  replay_lag
FROM pg_stat_replication;

-- On the replica: check lag
SELECT
  now() - pg_last_xact_replay_timestamp() AS replication_delay;
Alert when replication_delay exceeds your threshold — typically 30 seconds to a few minutes depending on your application's tolerance for stale reads.
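The replica-side query returns an interval, which psycopg2 maps to a datetime.timedelta. A sketch of the threshold check — the 30-second default and the function name are illustrative:

```python
from datetime import timedelta

REPLICATION_DELAY_MAX = timedelta(seconds=30)  # tune to your staleness tolerance

def lag_exceeded(delay, threshold=REPLICATION_DELAY_MAX):
    """True when replay delay is beyond the acceptable threshold.

    delay is None when pg_last_xact_replay_timestamp() returns NULL,
    i.e. no transaction has been replayed yet — treat that as a separate
    alert condition rather than lag.
    """
    return delay is not None and delay > threshold
```

Note the delay also grows when the primary is simply idle (no new transactions to replay), so pair this with a write-activity check before paging anyone.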
PostgreSQL continuously writes table data, indexes, write-ahead log (WAL) segments, and temporary files to disk. A full disk causes immediate, hard failures.
# Check disk space
df -h /var/lib/postgresql
# Check Postgres data directory size
du -sh /var/lib/postgresql/*/main/
Alert at 80% disk usage. Pay particular attention to WAL accumulation — if a replication consumer falls behind, WAL files accumulate until they fill the disk.
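The df check translates directly to the standard library, so a monitoring script needs no extra dependencies. A sketch using the 80% threshold above; the path is whatever mount holds your data directory:

```python
import shutil

def disk_usage_pct(path: str) -> float:
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return round(usage.used / usage.total * 100, 1)

def disk_alert(path: str, threshold_pct: float = 80.0) -> bool:
    """True when the mount holding `path` is at or above the threshold."""
    return disk_usage_pct(path) >= threshold_pct

# Usage: disk_alert("/var/lib/postgresql")
```

Note that shutil.disk_usage reports the whole filesystem, not the directory's own size — which is what you want for a full-disk alert.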
Database-specific monitoring catches the database problems. Application uptime monitoring catches what users actually experience.
Domain Monitor monitors your application health check endpoint from multiple global locations every minute. When PostgreSQL issues cause user-facing failures, you're alerted immediately — not five minutes later. Create a free account.
See how to monitor MySQL availability for comparison with MySQL monitoring approaches, and uptime monitoring best practices for the full monitoring picture.