
How to Monitor File Upload Flows Without False Alerts

File upload flows are harder to monitor than most endpoints. A standard uptime check against your upload endpoint will almost always return 200 — the endpoint exists and responds. But the actual upload process involves storage providers, processing queues, and size/timeout constraints that break in ways the endpoint status doesn't reveal.

At the same time, file upload endpoints are prone to false alerts if you monitor them naively — timeout thresholds set for normal HTTP requests will fire on legitimate large uploads.

Here's how to monitor uploads correctly.


What Can Break in a File Upload Flow

A file upload typically involves:

  1. Client → server — the file is received by your web server
  2. Server → storage — the file is written to S3, GCS, Cloudflare R2, or local disk
  3. Queue trigger — an event fires to process the file (resize images, scan for viruses, extract text)
  4. Queue processing — a worker processes the file and updates the database record
  5. Response to client — the upload URL or record ID is returned

Monitoring the upload endpoint only covers step 1. Steps 2–5 are where most real failures occur.
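One way to cover steps 2–4 end to end is a periodic synthetic upload: write a marker object to storage and wait for the processing pipeline to report it done. This is a minimal sketch, not a drop-in implementation: the storage client is injected (with boto3 it would be `boto3.client('s3')`), and `poll_status` stands in for however your app exposes a file's processing state.

```python
import time

def synthetic_upload_probe(s3, bucket, poll_status, timeout_s=60, interval_s=5):
    """Write a marker object (step 2), wait for processing (steps 3-4), clean up.

    s3          -- an S3-style client exposing put_object/delete_object
    poll_status -- app-specific callable: key -> 'done' | 'pending' | 'failed'
    """
    key = f'health-check/synthetic-{int(time.time())}.txt'
    s3.put_object(Bucket=bucket, Key=key, Body=b'synthetic probe')
    try:
        deadline = time.time() + timeout_s
        while time.time() < deadline:
            if poll_status(key) == 'done':
                return True
            time.sleep(interval_s)
        return False  # processing never completed within the window
    finally:
        # Always remove the marker so probes don't accumulate in the bucket
        s3.delete_object(Bucket=bucket, Key=key)
```

Run this from a scheduler every few minutes rather than inside a request handler; it is deliberately slow because it waits on the real queue.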


Build a Health Check That Tests Storage and Processing

# Flask
import os

import boto3
from flask import jsonify

# Assumes an existing Flask `app`, a `db` handle, and a queue-depth
# helper get_queue_size() from your application.

@app.route('/health/uploads')
def upload_health():
    checks = {}

    # 1. Test storage provider connectivity (S3/compatible)
    try:
        s3 = boto3.client('s3',
            aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
            aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'],
        )
        # Write a tiny test object and delete it
        bucket = os.environ['S3_BUCKET']
        s3.put_object(Bucket=bucket, Key='health-check/test.txt', Body=b'ok')
        s3.delete_object(Bucket=bucket, Key='health-check/test.txt')
        checks['storage'] = 'ok'
    except Exception as e:
        checks['storage'] = f'error: {str(e)}'

    # 2. Test processing queue depth
    queue_depth = get_queue_size('file-processing')
    checks['processing_queue_depth'] = queue_depth
    checks['processing_queue'] = 'ok' if queue_depth < 100 else 'degraded'

    # 3. Test database write for upload records
    try:
        db.execute('SELECT COUNT(*) FROM uploads')
        checks['uploads_db'] = 'ok'
    except Exception as e:
        checks['uploads_db'] = f'error: {str(e)}'

    all_ok = all(v == 'ok' for v in [
        checks.get('storage'),
        checks.get('processing_queue'),
        checks.get('uploads_db'),
    ])

    return jsonify({'status': 'ok' if all_ok else 'degraded', **checks}), \
           200 if all_ok else 503

// Node.js / AWS SDK v3
import { S3Client, PutObjectCommand, DeleteObjectCommand } from '@aws-sdk/client-s3';
import { Queue } from 'bullmq';

// Assumes an existing Express `app`, a Redis `connection` for BullMQ,
// and a `db` client from your application.

app.get('/health/uploads', async (req, res) => {
    const checks = {};

    // Test S3 connectivity with a tiny write. Each check gets its own
    // try/catch so a storage failure doesn't mask the queue and DB results.
    try {
        const s3 = new S3Client({ region: process.env.AWS_REGION });
        await s3.send(new PutObjectCommand({
            Bucket: process.env.S3_BUCKET,
            Key: 'health-check/test.txt',
            Body: Buffer.from('ok'),
        }));
        await s3.send(new DeleteObjectCommand({
            Bucket: process.env.S3_BUCKET,
            Key: 'health-check/test.txt',
        }));
        checks.storage = 'ok';
    } catch (err) {
        checks.storage = `error: ${err.message}`;
    }

    // Test processing queue depth
    try {
        const queue = new Queue('file-processing', { connection });
        const counts = await queue.getJobCounts('waiting', 'active');
        checks.processingQueueDepth = counts.waiting;
        checks.processingQueue = counts.waiting < 100 ? 'ok' : 'degraded';
    } catch (err) {
        checks.processingQueue = `error: ${err.message}`;
    }

    // Test DB read on the uploads table
    try {
        await db.query('SELECT COUNT(*) FROM uploads');
        checks.uploadsDb = 'ok';
    } catch (err) {
        checks.uploadsDb = `error: ${err.message}`;
    }

    const allOk = ['storage', 'processingQueue', 'uploadsDb']
        .every(k => checks[k] === 'ok');

    res.status(allOk ? 200 : 503).json({ status: allOk ? 'ok' : 'degraded', ...checks });
});
});
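The Flask example above calls a `get_queue_size()` helper that your app has to supply. A minimal sketch, assuming the processing queue is backed by a Redis list (as with RQ or BullMQ); if you use SQS, Celery, or another backend, substitute its own depth API:

```python
def get_queue_size(queue_name, conn=None):
    """Pending-job count for a Redis-list-backed queue.

    conn is any object exposing llen(); by default a redis-py client is
    built from REDIS_URL. LLEN returns the number of items in the list,
    i.e. jobs waiting to be picked up by a worker.
    """
    if conn is None:
        import os
        import redis  # assumes the redis-py package is installed
        conn = redis.Redis.from_url(
            os.environ.get('REDIS_URL', 'redis://localhost:6379'))
    return int(conn.llen(queue_name))
```

Injecting `conn` keeps the helper testable without a live Redis instance.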

Avoiding False Alerts on Upload Endpoints

Don't Monitor the Upload Endpoint Directly

Monitoring POST /api/upload directly will either:

  • Always pass (a simple GET probe gets a response, often 200 or 405, even when the actual upload path is broken)
  • Timeout on large uploads, triggering false alerts

Instead, monitor /health/uploads which tests each component independently without doing an actual upload.

Set Appropriate Timeouts

If you do monitor an upload endpoint, set the timeout threshold to match your expected upload duration. A 30-second timeout threshold for an endpoint that legitimately handles 200MB files will fire false alerts constantly.

Better: separate the timeout thresholds by endpoint type:

  • Static pages: 3 seconds
  • API endpoints: 5 seconds
  • Health checks: 10 seconds
  • Upload endpoints: monitor the health check, not the upload itself
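The tiers above are easy to encode if your monitoring tooling lets you script checks. A small sketch, with the prefix rules and thresholds purely illustrative; tune them to your own p99 latencies:

```python
# Per-endpoint-type timeout thresholds, in seconds (illustrative values)
TIMEOUTS = {
    'static': 3,
    'api': 5,
    'health': 10,
}

def timeout_for(path: str) -> int:
    """Pick a timeout class from the request path via simple prefix rules."""
    if path.startswith('/health'):
        return TIMEOUTS['health']
    if path.startswith('/api'):
        return TIMEOUTS['api']
    return TIMEOUTS['static']
```

Upload endpoints deliberately have no entry: per the list above, you probe `/health/uploads` with the health-check threshold instead of timing the upload itself.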

Signed URL Patterns

If your architecture generates signed S3 URLs and uploads go directly to S3 (not through your server), your upload endpoint just generates a URL — it's fast and easily monitored. The actual upload bypasses your server entirely.

In this case, monitor:

  1. The URL generation endpoint (fast, testable)
  2. S3 connectivity via your health check
  3. The post-upload processing queue
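For step 1, the URL-generation check can piggyback on the health endpoint. A hedged sketch: the client is injected so it can be faked in tests (with boto3 it would be `boto3.client('s3')`), and the bucket and key names are illustrative.

```python
def check_presign(s3, bucket: str) -> dict:
    """Verify the signed-URL generation step of a direct-to-S3 upload flow."""
    try:
        url = s3.generate_presigned_url(
            'put_object',
            Params={'Bucket': bucket, 'Key': 'health-check/presign-test'},
            ExpiresIn=60,
        )
        # Signing happens locally, so success proves credentials are loadable
        # and well-formed -- not that the bucket currently accepts writes.
        # Pair this with the tiny-write storage check for full coverage.
        return {'presign': 'ok' if url.startswith('https://') else 'degraded'}
    except Exception as e:
        return {'presign': f'error: {e}'}
```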

What Breaks Upload Flows

  • S3 credentials expired or rotated — uploads return 403; the endpoint still responds with 200 at the URL generation step
  • Bucket permissions changed — similar to above; credentials are valid but bucket policy denies writes
  • Storage quota exceeded — writes begin failing once the bucket or account hits a limit
  • Processing queue backed up — files upload successfully but processing never completes; users see "processing" indefinitely
  • Worker crashed — queue depth grows, no files processed (see how to monitor queue workers)
  • Disk full on local storage — writes fail; upload endpoint may still return 200 if error handling is poor
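The last failure mode is the easiest to catch before it bites: check free space with the standard library and report "degraded" before writes start failing. The path and 10% threshold below are illustrative defaults.

```python
import shutil

def disk_check(path: str = '/var/uploads', min_free_ratio: float = 0.10) -> str:
    """Return 'ok' if the volume holding `path` has enough free space."""
    usage = shutil.disk_usage(path)
    free_ratio = usage.free / usage.total
    return 'ok' if free_ratio >= min_free_ratio else 'degraded'
```

Add the result as another key in your `/health/uploads` response when uploads land on local disk.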

Monitoring Upload Flows with Domain Monitor

Domain Monitor monitors your /health/uploads endpoint every minute. When your S3 credentials expire, your processing queue backs up, or your storage provider has an incident, you know within a minute — before users start complaining that their uploads aren't processing. Create a free account.

