RunbooksDatabase Connections

Database Connections

⚠️

Severity: High | Alert Threshold: Connection pool usage > 80% OR connection errors

Overview

This alert triggers when the database connection pool is exhausted or connection errors are occurring. This can cause request failures and service degradation.

Connection Limits

EnvironmentMax ConnectionsPool SizeNotes
Neon Free10020Shared compute
Neon Pro50050Dedicated compute
Cloud RunPer instance20Multiply by instances

Diagnostic Steps

Check Current Connections

-- Run in Neon SQL Editor
SELECT count(*) as total_connections,
       state,
       usename,
       application_name
FROM pg_stat_activity
GROUP BY state, usename, application_name
ORDER BY total_connections DESC;

Identify Connection Leaks

-- Find idle connections that should be closed
SELECT pid, usename, application_name, state,
       query_start, state_change,
       now() - state_change as idle_duration
FROM pg_stat_activity
WHERE state = 'idle'
ORDER BY idle_duration DESC
LIMIT 20;

Check for Waiting Queries

-- Find queries waiting for connections
SELECT * FROM pg_stat_activity
WHERE wait_event_type = 'Client'
  AND state = 'active';

Review Application Logs

# Look for connection errors
gcloud logging read 'textPayload=~"connection" AND severity>=ERROR' \
  --limit 30 \
  --freshness=1h

Common Causes & Remediation

Symptoms: Connections grow over time, never released

Diagnostic:

-- Find long-idle connections
SELECT pid, usename, state, query_start
FROM pg_stat_activity
WHERE state = 'idle'
  AND query_start < now() - interval '10 minutes';

Fix:

  1. Identify leaking code (transactions not committed/rolled back)
  2. Add connection timeouts
  3. Restart service as temporary fix
gcloud run services update thinkhive-demo \
  --region us-central1 \
  --update-env-vars DB_RESTART=$(date +%s)

Emergency Actions

Kill Idle Connections

-- Terminate idle connections older than 10 minutes
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND query_start < now() - interval '10 minutes'
  AND usename = 'your_app_user';

Restart Service

gcloud run services update thinkhive-demo \
  --region us-central1 \
  --update-env-vars CONNECTION_RESET=$(date +%s)

Best Practices

Connection Pool Settings

const pool = new Pool({
  max: 20,                    // Max connections in pool
  min: 2,                     // Min connections to maintain
  idleTimeoutMillis: 30000,   // Close idle connections after 30s
  connectionTimeoutMillis: 10000, // Fail if can't connect in 10s
  maxUses: 7500,              // Close connection after N uses
});

Health Check Query

// Use for connection health checks
async function checkDbHealth() {
  const client = await pool.connect();
  try {
    await client.query('SELECT 1');
    return true;
  } finally {
    client.release();
  }
}

Prevention

  • Set appropriate pool sizes for your workload
  • Implement connection timeouts
  • Use connection pooler for serverless
  • Monitor connection metrics
  • Regular connection pool tuning
  • Implement circuit breaker for DB