High Latency

Severity: Medium-High | Alert Threshold: P95 latency > 2 seconds for 5+ minutes

Overview

This alert triggers when the 95th percentile (P95) response time stays above 2 seconds for at least 5 consecutive minutes.
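
The "sustained" part of the condition can be sketched as a small shell helper. This is only an illustration, not the alerting system's actual rule: `alert_state` is a hypothetical name, and it assumes one P95 sample (in seconds) per line, one line per minute.

```shell
# Hypothetical helper: reads one P95 sample (seconds) per line, one line per
# minute (assumption), and prints FIRING if the threshold is exceeded for
# WINDOW consecutive samples, otherwise OK.
# Usage: alert_state [THRESHOLD_SECONDS] [WINDOW_MINUTES]
alert_state() {
  awk -v t="${1:-2}" -v w="${2:-5}" '
    {
      if ($1 + 0 > t) run++     # extend the current streak above threshold
      else run = 0              # any sample at or below threshold resets it
      if (run >= w) fired = 1
    }
    END { print (fired ? "FIRING" : "OK") }
  '
}
```

A single sample back under 2 seconds resets the streak, which is why short spikes do not page.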

Impact Assessment

P95 Latency	User Impact	Priority
2-5s	Noticeable slowness	Medium
5-10s	Poor user experience	High
> 10s	Unusable, potential timeouts	Critical
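
For scripted triage, the table above can be expressed as a lookup. This is a sketch: `latency_priority` is a hypothetical name, and the handling of the exact boundaries (5s and 10s both map to High) is an assumption, since the table leaves them ambiguous.

```shell
# Hypothetical helper: map a P95 latency in seconds to a priority label.
# Boundary handling is an assumption: exactly 5s and exactly 10s are High.
latency_priority() {
  awk -v p95="$1" 'BEGIN {
    if (p95 > 10)      print "Critical"
    else if (p95 >= 5) print "High"
    else if (p95 > 2)  print "Medium"   # alert threshold is P95 > 2s
    else               print "OK"       # below the alert threshold
  }'
}
```

For example, `latency_priority 3.4` prints `Medium`.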

Diagnostic Steps

Identify Slow Endpoints

# List recent Cloud Run requests slower than 2 seconds, with URL and latency
gcloud logging read 'resource.type="cloud_run_revision" AND
  httpRequest.latency>"2s"' \
  --limit 50 \
  --format "table(httpRequest.requestUrl, httpRequest.latency)"
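
With 50 rows of raw output it can be hard to see which path dominates. A possible way to group the results is sketched below. `summarize_slow` is a hypothetical helper that assumes `url,latency` lines on stdin, with latency written like `3.2s` and no commas inside the URL.

```shell
# Hypothetical helper: reads "url,latency" lines (latency like "3.2s") and
# prints "path count max_latency_seconds" per path, slowest first.
summarize_slow() {
  awk -F, '
    {
      url = $1; sub(/\?.*/, "", url)   # drop query string so paths group together
      lat = $2; sub(/s$/, "", lat)     # "3.2s" -> 3.2
      lat += 0
      count[url]++
      if (lat > max[url]) max[url] = lat
    }
    END { for (u in count) printf "%s %d %.3f\n", u, count[u], max[u] }
  ' | sort -k3,3nr
}
```

One way to feed it is to switch the query's output to `--format "csv[no-heading](httpRequest.requestUrl, httpRequest.latency)"` and pipe the result into `summarize_slow`. Check the raw CSV first, since quoted fields would need extra handling.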

Check Database Performance

# Look for slow queries in logs
gcloud logging read 'textPayload=~"slow query"' --limit 20

# Check active connections in the Neon dashboard: https://console.neon.tech

Check External API Latency

Common external dependencies:

OpenAI API calls (LLM operations)
Auth0 (authentication)
Pinecone (vector search)

# Look for timeout/slow external calls
gcloud logging read 'textPayload=~"timeout" OR textPayload=~"ETIMEDOUT"' --limit 20
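
To see which dependency the timeouts cluster around, the matching log lines can be tallied. The sketch below is illustrative only: `count_by_dependency` is a hypothetical helper, and the match strings (`openai`, `auth0`, `pinecone`) are assumptions about what the application writes to its logs.

```shell
# Hypothetical helper: reads log lines on stdin and prints how many mention
# each external dependency (case-insensitive match).
# The match strings are assumptions about the log contents.
count_by_dependency() {
  awk '
    { line = tolower($0) }
    line ~ /openai/   { n["openai"]++ }
    line ~ /auth0/    { n["auth0"]++ }
    line ~ /pinecone/ { n["pinecone"]++ }
    END {
      printf "openai=%d auth0=%d pinecone=%d\n", n["openai"], n["auth0"], n["pinecone"]
    }
  '
}
```

The same timeout query with `--format "value(textPayload)"` gives one payload per line, which can be piped into `count_by_dependency`.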

Review Resource Utilization

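# Show service status (conditions, latest ready revision, URL).
# This does not report CPU usage; CPU and memory utilization are charted
# on the service's Metrics tab in the Cloud Console.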
gcloud run services describe thinkhive-demo \
  --region us-central1 \
  --format "value(status)"

Common Causes & Remediation

Slow Database Queries

Symptoms: Queries taking > 1 second

Diagnostic:

-- List existing indexes to spot tables that lack one (run in Neon console)
SELECT schemaname, tablename, indexname
FROM pg_indexes
WHERE schemaname = 'thinkhive';

Fix:

Add missing indexes
Optimize N+1 queries
Enable query caching

Memory Pressure

Symptoms: Gradual latency increase, GC pauses

Fix:

# Increase memory
gcloud run services update thinkhive-demo \
  --region us-central1 \
  --memory 1Gi

Cold Starts

Symptoms: Intermittent latency spikes, especially after idle

Fix:

# Set minimum instances
gcloud run services update thinkhive-demo \
  --region us-central1 \
  --min-instances 1

Quick Mitigations

Increase Resources

gcloud run services update thinkhive-demo \
  --region us-central1 \
  --memory 1Gi \
  --cpu 2 \
  --concurrency 80

Enable Minimum Instances

gcloud run services update thinkhive-demo \
  --region us-central1 \
  --min-instances 2

Scale Out

gcloud run services update thinkhive-demo \
  --region us-central1 \
  --max-instances 20
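
A rough way to sanity-check `--max-instances` against `--concurrency` is Little's law: requests in flight ≈ request rate × latency. The sketch below is back-of-the-envelope arithmetic only. `instances_needed` is a hypothetical helper, and it ignores autoscaler behavior and CPU limits.

```shell
# Hypothetical helper: estimate instances needed via Little's law.
# $1 = requests per second, $2 = average latency in seconds,
# $3 = per-instance concurrency (the --concurrency value)
instances_needed() {
  awk -v rps="$1" -v lat="$2" -v conc="$3" 'BEGIN {
    inflight = rps * lat                # requests in flight at any moment
    n = int(inflight / conc)
    if (n * conc < inflight) n++        # round up to a whole instance
    if (n < 1) n = 1
    print n
  }'
}
```

For example, `instances_needed 400 2 80` prints `10`, while `instances_needed 500 3.5 80` prints `22`, which would exceed a cap of 20 instances.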

Prevention

Implement request timeouts
Add caching for expensive operations
Use connection pooling for the database
Set up performance budgets
Run regular load tests
Monitor P50/P95/P99 latencies
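
For ad hoc checks outside the monitoring stack, percentiles can be computed from a file of latencies. This is a minimal sketch using the nearest-rank method. `percentile` is a hypothetical helper that expects an integer percentile and one latency value per line on stdin.

```shell
# Hypothetical helper: nearest-rank percentile.
# $1 = integer percentile (e.g. 50, 95, 99); stdin = one latency per line.
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1                  # no samples: fail without output
      rank = int((p * NR + 99) / 100)      # ceil(p/100 * NR) for integer p
      if (rank < 1) rank = 1
      print v[rank]
    }
  '
}
```

Comparing P50 with P95 and P99 shows whether slowness affects most requests or only a tail.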

Related Runbooks

Service Down
Database Slow
