
How to Monitor Google Cloud Run Services

Google Cloud Run is a fully managed serverless platform that runs containers on demand. It scales to zero when not in use and scales automatically to handle traffic. While Cloud Run manages the infrastructure, your containerised application can still fail — and monitoring is essential for production services.

Cloud Run Failure Modes

  • Container startup failure: the container exits on startup (configuration error, missing dependency)
  • Cold start latency: container initialisation takes too long, causing request timeouts
  • Request timeout: a request runs longer than the service's configured request timeout and is terminated
  • Out of memory: the container exceeds its configured memory limit and the instance is terminated
  • Unhealthy deployment: a new revision is deployed with broken code
  • Custom domain misconfiguration: DNS records do not point to Cloud Run correctly
  • SSL certificate issues: the certificate for a custom domain fails to provision or renew

External HTTP monitoring catches most of these from the user's perspective.

External HTTP Monitoring

Configure an uptime monitor on your Cloud Run service URL:

Monitor: https://your-service-name-xxxxx-uc.a.run.app
(or your custom domain: https://yourapp.com)
Expected status: 200
Interval: 1 minute

Monitoring the custom domain is preferable — it tests the complete path including DNS, load balancing, and SSL termination, not just the Cloud Run service itself.
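To make the check concrete, here is a minimal sketch of what an external uptime monitor does on each interval, using only the Python standard library. The function names, the 5-second latency threshold, and the "up" / "slow" / "down" labels are illustrative assumptions, not the behaviour of any particular monitoring product.

```python
import time
import urllib.error
import urllib.request


def classify(status, elapsed_s, expected_status=200, max_latency_s=5.0):
    """Turn one probe result into 'up', 'slow', or 'down'."""
    if status != expected_status:
        return "down"
    if elapsed_s > max_latency_s:
        return "slow"
    return "up"


def probe(url, timeout_s=10.0):
    """Send one GET request and return (status_code_or_None, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code  # 4xx/5xx responses still carry a status code
    except (urllib.error.URLError, OSError):
        status = None  # DNS failure, TLS error, refused connection, timeout
    return status, time.monotonic() - start


def run_forever(url, interval_s=60):
    """Probe the URL once per interval and print the outcome."""
    while True:
        status, elapsed = probe(url)
        print(url, status, f"{elapsed:.2f}s", classify(status, elapsed))
        time.sleep(interval_s)
```

Calling `run_forever("https://yourapp.com")` would mirror the 1-minute configuration above. A real monitor should run outside the project it is checking, so that a regional or account-level outage does not silence the monitor as well.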

Adding a Health Endpoint to Your Container

# FastAPI example
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
async def health():
    return {"status": "ok"}

// Express example
const express = require('express');
const app = express();

app.get('/health', (req, res) => {
    res.json({ status: 'ok' });
});

// Go example (inside main, with "net/http" imported)
http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "application/json")
    w.Write([]byte(`{"status":"ok"}`))
})
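For services that do not use a web framework, the same endpoint can be sketched with the Python standard library alone. This version assumes the container should listen on the port given in the `PORT` environment variable, falling back to 8080 to match the probe configuration used in this article. The `health_response` and `make_server` names are hypothetical helpers for this example.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def health_response(path):
    """Return (status_code, body_bytes) for a request path."""
    if path == "/health":
        return 200, json.dumps({"status": "ok"}).encode()
    return 404, json.dumps({"error": "not found"}).encode()


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = health_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep frequent probe requests out of the logs


def make_server(port=None):
    """Bind to PORT from the environment (default 8080) on all interfaces."""
    if port is None:
        port = int(os.environ.get("PORT", "8080"))
    return ThreadingHTTPServer(("0.0.0.0", port), HealthHandler)
```

Starting it is a single call: `make_server().serve_forever()`. Keeping the handler free of slow dependency checks is a deliberate choice here, since a health endpoint that blocks on a database call can itself cause probe failures.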

Cloud Run Health Checks

Cloud Run supports startup probes and liveness probes for containers:

# service.yaml (apply with: gcloud run services replace service.yaml)
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    spec:
      containers:
        - image: gcr.io/my-project/my-app
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 0
            periodSeconds: 10
          startupProbe:
            httpGet:
              path: /health
              port: 8080
            failureThreshold: 3
            periodSeconds: 10

A failed liveness probe restarts the container, and a container that never passes its startup probe is shut down. External monitoring then confirms that the service is actually reachable by users once the container has recovered.
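When tuning the startup probe, it helps to know roughly how long a container has before it is given up on. The sketch below uses the common approximation of initial delay plus failure threshold times period. It ignores the per-probe timeout, so treat the result as an estimate rather than an exact platform guarantee. The function name is hypothetical.

```python
def startup_budget_seconds(failure_threshold, period_seconds,
                           initial_delay_seconds=0):
    """Approximate time a container has to pass its startup probe."""
    return initial_delay_seconds + failure_threshold * period_seconds


# Values taken from the startupProbe in the configuration above.
budget = startup_budget_seconds(failure_threshold=3, period_seconds=10)
print(f"Container has roughly {budget} seconds to answer /health")
```

If the application regularly needs longer than this to initialise, raising `failureThreshold` or `periodSeconds` is usually safer than letting deployments fail intermittently.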

Google Cloud Monitoring

Cloud Run integrates with Google Cloud Monitoring for native metrics:

Key Cloud Run metrics:

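  • run.googleapis.com/request_count: number of requests reaching the service, labelled by response code class
  • run.googleapis.com/request_latencies: distribution of request latency in milliseconds
  • run.googleapis.com/container/instance_count: number of container instances, by state (active or idle)
  • run.googleapis.com/container/cpu/utilizations: distribution of CPU utilisation across container instances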

Set up alerting policies in Cloud Monitoring:

gcloud alpha monitoring policies create \
    --policy-from-file=uptime-policy.json
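The policy file itself is not shown above. As one possible example, the following sketch generates a policy that fires when a service returns too many 5xx responses. The field names follow the Cloud Monitoring AlertPolicy resource as I understand it, and the threshold, duration, and display names are assumptions to adjust. It does not attach any notification channels. Verify the structure against the current API reference before using it.

```python
import json


def build_5xx_policy(service_name, threshold_per_second=1.0, duration_s=300):
    """Build an alert policy dict for sustained 5xx responses on one service."""
    metric_filter = (
        'resource.type = "cloud_run_revision" '
        f'AND resource.labels.service_name = "{service_name}" '
        'AND metric.type = "run.googleapis.com/request_count" '
        'AND metric.labels.response_code_class = "5xx"'
    )
    return {
        "displayName": f"{service_name}: elevated 5xx rate",
        "combiner": "OR",
        "conditions": [
            {
                "displayName": "5xx responses per second above threshold",
                "conditionThreshold": {
                    "filter": metric_filter,
                    "comparison": "COMPARISON_GT",
                    "thresholdValue": threshold_per_second,
                    "duration": f"{duration_s}s",
                    "aggregations": [
                        {
                            # Convert the request counter to a per-second
                            # rate, then sum across revisions.
                            "alignmentPeriod": "60s",
                            "perSeriesAligner": "ALIGN_RATE",
                            "crossSeriesReducer": "REDUCE_SUM",
                        }
                    ],
                },
            }
        ],
    }


def write_policy(path, policy):
    """Write the policy as JSON so it can be passed to --policy-from-file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(policy, f, indent=2)
```

For example, `write_policy("uptime-policy.json", build_5xx_policy("my-service"))` produces a file with the name used in the command above.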

Minimum Instances for Latency

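Cloud Run scales to zero by default, so the first request after an idle period has to wait for a new container to start. This cold start can add a few seconds to that request, depending on image size and how long the application takes to initialise. For latency-sensitive applications, configure minimum instances: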

gcloud run services update my-service \
    --min-instances=1 \
    --region=europe-west1

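One minimum instance keeps a container warm, which removes the cold start that follows an idle period. Instances added during a traffic spike still start cold, so monitor response time in your uptime monitoring tool to detect any remaining cold start impact.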
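One rough way to spot cold starts in response-time data is to look for samples far above the typical latency. The sketch below flags any sample that is both several times the median and above an absolute floor. The `factor` and `floor_s` defaults are arbitrary assumptions for illustration, and a slow request is not always a cold start.

```python
from statistics import median


def cold_start_suspects(latencies_s, factor=3.0, floor_s=1.0):
    """Return the indexes of samples that look like cold starts."""
    if not latencies_s:
        return []
    baseline = median(latencies_s)
    # Require both a relative and an absolute jump to limit false positives.
    cutoff = max(baseline * factor, floor_s)
    return [i for i, value in enumerate(latencies_s) if value > cutoff]
```

Given the samples `[0.12, 0.15, 2.4, 0.11, 0.13]`, only the third one (index 2) is flagged. If flagged samples cluster after quiet periods, that points to scale-from-zero cold starts, which a minimum instance should remove.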

SSL and Domain Monitoring

Cloud Run provides SSL automatically for the .run.app domain. For custom domains (mapped via Cloud Run Domain Mappings or Cloud Load Balancing), add a certificate monitor:

SSL monitor: yourapp.com
Alert at: 30 days remaining

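Cloud Run manages certificates for mapped domains, but provisioning or renewal can still fail, for example when DNS records are changed and no longer point at Google. SSL certificate monitoring gives you advance warning before the certificate expires.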
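A certificate expiry check of this kind can be sketched with Python's standard `ssl` module. The helper names and the 30-day default are assumptions matching the configuration above. The check inspects only the expiry date of the certificate the server presents. It does not cover revocation.

```python
import socket
import ssl
import time


def days_remaining(not_after, now=None):
    """Whole days until the certificate's notAfter time (negative if expired)."""
    if now is None:
        now = time.time()
    expires = ssl.cert_time_to_seconds(not_after)
    return int((expires - now) // 86400)


def fetch_not_after(hostname, port=443, timeout_s=10.0):
    """Open a verified TLS connection and return the certificate's notAfter."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout_s) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def should_alert(hostname, threshold_days=30):
    """True when the certificate expires within the threshold."""
    return days_remaining(fetch_not_after(hostname)) <= threshold_days
```

Because `fetch_not_after` uses the default verification context, an already expired or mismatched certificate raises an `ssl.SSLError` instead of returning a date. A monitor should treat that exception as an alert too.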

Monitoring Traffic Splitting and Revisions

Cloud Run supports traffic splitting between revisions (blue-green / canary deployments). After deploying a new revision:

  1. Set maintenance window in monitoring tool
  2. Update traffic split (e.g., 10% to new revision)
  3. Watch error rates and response times in Cloud Monitoring
  4. Gradually increase traffic to new revision
  5. If issues detected: route 100% back to old revision
  6. Close maintenance window once stable

External monitoring on your service URL validates the combined experience across all revisions receiving traffic.
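The decision made at each stage of the rollout steps above can be sketched as a small function: roll back if the error rate is too high, otherwise move to the next traffic percentage. The step sizes and the 1% error limit are illustrative assumptions. Applying the returned percentage, for instance with `gcloud run services update-traffic`, is left outside the sketch.

```python
def next_traffic_percent(current, error_rate, max_error_rate=0.01,
                         steps=(10, 25, 50, 100)):
    """Return the next traffic percentage for the new revision.

    Returns 0 (route everything back to the old revision) when the observed
    error rate exceeds the limit, otherwise the first step above the
    current percentage.
    """
    if error_rate > max_error_rate:
        return 0
    for step in steps:
        if step > current:
            return step
    return 100  # already fully rolled out
```

For instance, a revision at 10% with a 0.2% error rate moves to 25%, while a revision at 50% with a 5% error rate is rolled back to 0%. The error rate should be measured over a long enough window at each step that a handful of requests cannot trigger a rollback on their own.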


