Rate Limit Exceeded

⚠️ Severity: Medium | Alert Threshold: > 100 rate-limited requests in 5 minutes

Overview

This alert fires when clients receive rate-limit (HTTP 429) responses at a high rate. It usually indicates one of the following:

  • Legitimate high traffic requiring limit adjustment
  • Potential abuse or misconfigured clients
  • Need for client-side optimization
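To make the threshold concrete, here is a minimal sketch of the alert condition (more than 100 rate-limited requests in 5 minutes) evaluated over a list of 429 timestamps. It assumes a sliding window. The real alerting policy may use fixed, aligned windows instead, and the function and constant names are illustrative only.

```python
from collections import deque

WINDOW_SECONDS = 5 * 60   # alert window from the runbook header
THRESHOLD = 100           # alert fires on MORE than this many 429s


def alert_fires(timestamps, window=WINDOW_SECONDS, threshold=THRESHOLD):
    """Return True if more than `threshold` events fall within any
    sliding window of `window` seconds.

    `timestamps` are event times in seconds (for example, epoch seconds
    of each 429 response). Sliding-window evaluation is an assumption.
    """
    recent = deque()
    for ts in sorted(timestamps):
        recent.append(ts)
        # Drop events that are a full window (or more) older than this one.
        while ts - recent[0] >= window:
            recent.popleft()
        if len(recent) > threshold:
            return True
    return False
```

For example, 101 rate-limited requests spread over 100 seconds trip the alert, while one rate-limited request every 10 seconds never does.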

Rate Limit Tiers

Tier          Requests/Min  Burst  Use Case
Free          10            20     Development/testing
Standard      100           200    Small production
Professional  300           500    Medium production
Enterprise    1,000         2,000  Large scale
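The sketch below illustrates how a sustained rate and a burst allowance can interact, using a token bucket. This is an assumption for illustration: the runbook does not state which algorithm the service uses, and `TokenBucket` and `TIERS` are hypothetical names. Only the numbers come from the tier table.

```python
# Tier values copied from the table above: (requests per minute, burst).
TIERS = {
    "free": (10, 20),
    "standard": (100, 200),
    "professional": (300, 500),
    "enterprise": (1000, 2000),
}


class TokenBucket:
    """Token bucket: refills at `rate_per_min`, holds at most `burst` tokens."""

    def __init__(self, rate_per_min, burst, now=0.0):
        self.rate_per_min = rate_per_min
        self.burst = burst
        self.tokens = float(burst)  # start full, so a burst is allowed at once
        self.last = now

    def allow(self, now):
        """Return True if a request at time `now` (seconds) is admitted."""
        elapsed = max(0.0, now - self.last)
        refill = elapsed * self.rate_per_min / 60.0
        self.tokens = min(float(self.burst), self.tokens + refill)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the caller would respond with HTTP 429
```

Under this model, a Free-tier client can send 20 requests back to back, after which it is admitted at roughly 10 requests per minute until the bucket refills.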

Diagnostic Steps

Identify Rate-Limited Clients

# Find which API keys are hitting limits
gcloud logging read 'httpRequest.status=429' \
  --limit 50 \
  --format "table(timestamp, httpRequest.requestUrl, labels.api_key_prefix)"

Check Traffic Patterns

# View request volume over time (log entries per minute)
gcloud logging read 'resource.type="cloud_run_revision"' \
  --limit 1000 \
  --format "value(timestamp)" | \
  cut -c1-16 | sort | uniq -c

Identify Specific Endpoints

# Which endpoints are rate-limited most often
gcloud logging read 'httpRequest.status=429' \
  --limit 100 \
  --format "value(httpRequest.requestUrl)" | \
  sort | uniq -c | sort -rn

Check for Abuse Patterns

Signs of abuse:

  • Single IP making thousands of requests
  • Invalid API keys with many attempts
  • Requests to non-existent endpoints
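The first sign above can be checked with a short script over exported log entries. This is a minimal sketch that assumes the entries were exported as JSON and carry the standard `httpRequest.remoteIp` field. The helper names and the 1,000-request cutoff are illustrative assumptions, not values from this runbook.

```python
import json
from collections import Counter


def load_entries(path):
    """Load a JSON array of log entries from a file."""
    with open(path) as f:
        return json.load(f)


def requests_per_ip(entries):
    """Count log entries per client IP (httpRequest.remoteIp)."""
    return Counter(
        entry.get("httpRequest", {}).get("remoteIp", "unknown")
        for entry in entries
    )


def flag_heavy_ips(entries, min_requests=1000):
    """Return (ip, count) pairs at or above `min_requests`, busiest first.

    The default cutoff is an assumed value; tune it to your traffic.
    """
    counts = requests_per_ip(entries)
    return [(ip, n) for ip, n in counts.most_common() if n >= min_requests]
```

One way to produce the input is to rerun the 429 query with a higher `--limit` and `--format json`, redirect the output to a file, and pass that file to `load_entries`.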

Response Actions

Symptoms: Valid customers hitting limits

Actions:

  1. Contact customer about usage patterns
  2. Consider upgrading their tier
  3. Help optimize their integration

Temporary Relief:

-- Increase rate limit for specific key
UPDATE explainer_api_keys
SET rate_limit = 500
WHERE key_prefix = 'thk_abc123';

Adjusting Rate Limits

Global Adjustment

# Update environment variables
gcloud run services update thinkhive-demo \
  --region us-central1 \
  --update-env-vars RATE_LIMIT_DEFAULT=150

Per-Key Adjustment

-- Set a per-key limit in the database ('key_id' is a placeholder)
UPDATE api_keys
SET rate_limit = 500
WHERE id = 'key_id';

Client Guidance

Send the following guidance to rate-limited clients:

**Rate Limit Best Practices**
 
1. **Implement exponential backoff**
   - On a 429 response, wait 1s before retrying, then 2s, 4s, and 8s
 
2. **Cache responses**
   - Cache explainability results for 5-15 minutes
 
3. **Batch requests**
   - Use batch endpoints where available
 
4. **Monitor usage**
   - Track your API calls to stay under limits
 
5. **Upgrade if needed**
   - Contact sales@thinkhive.ai for higher limits
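If a client asks what exponential backoff looks like in practice, a minimal sketch such as the one below can accompany the guidance. It follows the 1s, 2s, 4s, 8s schedule from the first item. The `send` callable and the function name are hypothetical placeholders for the client's own HTTP call, and production clients often add random jitter to each delay.

```python
import time


def call_with_backoff(send, max_retries=4, base_delay=1.0, sleep=time.sleep):
    """Call `send()` and retry on HTTP 429 with exponential backoff.

    `send` is a caller-supplied function returning (status_code, body).
    With the defaults, the waits between attempts are 1s, 2s, 4s, 8s.
    Returns the last (status_code, body), which is still a 429 if every
    retry was rate-limited.
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break  # out of retries; give up without sleeping again
        sleep(base_delay * 2 ** attempt)
    return status, body
```

The `sleep` parameter exists so that the delay can be replaced in tests. Real callers can leave it at the default.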

Prevention

  • Set up usage alerts for high-volume customers
  • Implement gradual limit increases
  • Provide usage dashboards to customers
  • Document rate limits clearly in API docs
  • Consider burst allowances for legitimate spikes