High Latency

⚠️ Severity: Medium-High | Alert Threshold: P95 latency > 2 seconds for 5+ minutes

Overview

This alert fires when the 95th percentile (P95) response time stays above 2 seconds for at least 5 minutes.
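
As a rough illustration of the "sustained" part of the condition, the sketch below assumes one P95 sample per minute and treats the alert as firing only when the five most recent samples all exceed 2 seconds. The helper name `alert_fires` and the one-sample-per-minute input are assumptions for illustration; the real alerting backend may evaluate its window differently.

```shell
#!/usr/bin/env bash
# alert_fires: hypothetical helper, not part of any existing tooling.
# Reads one P95 sample per line on stdin (seconds, one sample per minute,
# oldest first). Exits 0 when the last 5 samples are all above 2 seconds.
alert_fires() {
  tail -n 5 | awk '
    $1 > 2 { over++ }                       # count samples above the threshold
    END { exit !(NR == 5 && over == 5) }    # need a full window, all above 2s
  '
}
```

A single sample at or below 2 seconds inside the window, or fewer than five samples, makes the helper exit non-zero, which mirrors the idea that a brief spike alone should not page anyone.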

Impact Assessment

P95 Latency | User Impact                   | Priority
2-5s        | Noticeable slowness           | Medium
5-10s       | Poor user experience          | High
> 10s       | Unusable, potential timeouts  | Critical
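
The bands in the table above can be turned into a small triage helper. This is a minimal sketch: the function name `latency_priority` is hypothetical, and the handling of boundary values (exactly 5s counts as High, exactly 10s as High, anything at or below the 2s alert threshold as OK) is an assumption, since the table does not say which band owns the edges.

```shell
#!/usr/bin/env bash
# latency_priority: hypothetical helper, not part of any existing tooling.
# Takes a P95 latency in seconds (decimals allowed) and prints the priority
# band from the Impact Assessment table. Boundary handling is an assumption.
latency_priority() {
  awk -v s="$1" 'BEGIN {
    if (s > 10)
      print "Critical"
    else if (s >= 5)
      print "High"
    else if (s > 2)
      print "Medium"
    else
      print "OK"
  }'
}
```

For example, `latency_priority 7.5` prints `High`.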

Diagnostic Steps

Identify Slow Endpoints

# List recent Cloud Run request logs with latency above 2 seconds, showing URL and latency
gcloud logging read 'resource.type="cloud_run_revision" AND httpRequest.latency>"2s"' \
  --limit 50 \
  --format "table(httpRequest.requestUrl, httpRequest.latency)"
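
To see which paths account for most of the slow requests, the URLs from a query like the one above can be grouped and counted. The sketch below assumes the URLs arrive one per line on stdin (for instance by switching the output to `--format "value(httpRequest.requestUrl)"`); the helper name `count_slow_paths` is hypothetical.

```shell
#!/usr/bin/env bash
# count_slow_paths: hypothetical helper, not part of any existing tooling.
# Reads one request URL per line on stdin, drops the query string and
# fragment, and prints "count url" lines sorted by count, highest first.
count_slow_paths() {
  sed -e 's/[?#].*$//' \
    | sort \
    | uniq -c \
    | sort -k1,1nr -k2 \
    | awk '{ print $1, $2 }'
}

# Assumed usage (requires gcloud and suitable permissions):
#   gcloud logging read 'resource.type="cloud_run_revision" AND httpRequest.latency>"2s"' \
#     --limit 50 --format "value(httpRequest.requestUrl)" | count_slow_paths
```

Stripping the query string keeps requests to the same endpoint with different parameters in one bucket, which is usually what matters when looking for a slow handler.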

Check Database Performance

# Look for slow queries in logs
gcloud logging read 'textPayload=~"slow query"' --limit 20

# Check active connections in the Neon dashboard: https://console.neon.tech

Check External API Latency

Common external dependencies:

  • OpenAI API calls (LLM operations)
  • Auth0 (authentication)
  • Pinecone (vector search)

# Look for timeouts or slow calls to external services
gcloud logging read 'textPayload=~"timeout" OR textPayload=~"ETIMEDOUT"' --limit 20

Review Resource Utilization

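# Show the service status (conditions, URL, latest revision).
# Note: this does not report CPU utilization; check CPU and memory usage
# on the service's Metrics tab in the Cloud Console.
gcloud run services describe thinkhive-demo \
  --region us-central1 \
  --format "value(status)"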

Common Causes & Remediation

Slow Database Queries

Symptoms: Queries taking longer than 1 second

Diagnostic:

-- List existing indexes (run in Neon console); compare them against the
-- columns that slow queries filter or join on to spot missing indexes
SELECT schemaname, tablename, indexname
FROM pg_indexes
WHERE schemaname = 'thinkhive';

Fix:

  • Add missing indexes
  • Optimize N+1 queries
  • Enable query caching

Quick Mitigations

Increase Resources

gcloud run services update thinkhive-demo \
  --region us-central1 \
  --memory 1Gi \
  --cpu 2 \
  --concurrency 80

Enable Minimum Instances

gcloud run services update thinkhive-demo \
  --region us-central1 \
  --min-instances 2

Scale Out

gcloud run services update thinkhive-demo \
  --region us-central1 \
  --max-instances 20

Prevention

  • Implement request timeouts
  • Add caching for expensive operations
  • Use connection pooling for the database
  • Set up performance budgets
  • Run regular load tests
  • Monitor P50/P95/P99 latencies
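
For ad hoc checks during load testing, percentiles can be computed directly from a list of measured latencies. The following is a minimal sketch using the nearest-rank method; the helper name `percentile` is hypothetical, the input is assumed to be one numeric latency per line, and the percentile argument is assumed to be an integer from 1 to 100. A monitoring system may use a different interpolation method and report slightly different values.

```shell
#!/usr/bin/env bash
# percentile: hypothetical helper, not part of any existing tooling.
# Usage: percentile P   (P is an integer from 1 to 100)
# Reads one numeric latency per line on stdin and prints the nearest-rank
# Pth percentile. Exits non-zero when there is no input.
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((p * NR + 99) / 100)   # ceil(p * NR / 100) for integer p
      if (rank < 1) rank = 1
      print v[rank]
    }
  '
}

# Assumed usage, with latencies.txt holding one latency per line:
#   for p in 50 95 99; do echo "P$p: $(percentile "$p" < latencies.txt)"; done
```

Comparing P50 with P95 and P99 shows whether slowness affects most requests or only a tail of them, which helps decide between scaling out and hunting for a specific slow path.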