Guardrail Policies Guide

Guardrail policies let you define scanning rules that run against AI inputs and outputs in real time. Each policy can combine multiple scanners (PII detection, keyword filtering, regex matching, topic detection, and tool call validation) into a single reusable configuration.

How Policies Work

A guardrail policy is a named collection of scanners. When you call guardrails.scan(), you reference a policy by ID, and all scanners in that policy run against the content. Scanners can block (reject the content) or redact (modify the content) depending on your configuration.
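To make the relationship between per-scanner results and the overall scan result concrete, here is a minimal local sketch. It does not call the SDK. The `ScannerResult` type and the `overallAction` helper are hypothetical names, and the precedence rule (any block wins, then any redact, otherwise pass) is an assumption that matches the sample outputs in this guide rather than documented behavior.

```typescript
type ScannerAction = 'pass' | 'redact' | 'block';

// Hypothetical shape, modeled on the per-scanner entries shown in this guide.
interface ScannerResult {
  scanner: string;
  action: ScannerAction;
}

// Assumed precedence: any 'block' wins, otherwise any 'redact', otherwise 'pass'.
function overallAction(results: ScannerResult[]): ScannerAction {
  if (results.some((r) => r.action === 'block')) return 'block';
  if (results.some((r) => r.action === 'redact')) return 'redact';
  return 'pass';
}
```

Under this assumption, a single blocking scanner is enough to reject content even when other scanners only redact.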

You can set one policy as the default policy. Any scan request that does not specify a policyId will use the default policy automatically.
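The fallback rule can be pictured as a small lookup. The sketch below is illustrative only: `PolicySummary` and `resolvePolicyId` are hypothetical names, and the real resolution happens on the service side.

```typescript
// Hypothetical summary shape; only the fields needed for resolution.
interface PolicySummary {
  id: string;
  isDefault: boolean;
}

// Returns the explicitly requested policy ID, or the default policy's ID
// when none was requested. Returns undefined when there is no default.
function resolvePolicyId(policies: PolicySummary[], requestedId?: string): string | undefined {
  if (requestedId !== undefined) return requestedId;
  const fallback = policies.find((p) => p.isDefault);
  return fallback?.id;
}
```

An `undefined` result here stands for the case where a scan request names no policy and no default policy has been set.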

Creating a Policy

Define your policy and scanners

import { guardrails } from '@thinkhive/sdk';
 
const policy = await guardrails.createPolicy({
  name: 'Customer Support Policy',
  description: 'Standard guardrails for customer-facing AI agents',
  isDefault: true,
  scanners: [
    {
      type: 'pii',
      action: 'redact',
      config: {
        entities: ['email', 'phone', 'ssn', 'credit_card'],
        redactionStyle: 'mask', // 'mask', 'hash', or 'replace'
      },
    },
    {
      type: 'keyword',
      action: 'block',
      config: {
        keywords: ['competitor_name', 'internal_only', 'confidential'],
        caseSensitive: false,
        matchWholeWord: true,
      },
    },
    {
      type: 'regex',
      action: 'redact',
      config: {
        patterns: [
          { name: 'api_key', pattern: '(sk-[a-zA-Z0-9]{32,})' },
          { name: 'internal_url', pattern: '(https?://internal\\.company\\.com[^\\s]*)' },
        ],
        redactionReplacement: '[REDACTED]',
      },
    },
    {
      type: 'topic',
      action: 'block',
      config: {
        blockedTopics: ['politics', 'religion', 'violence'],
        threshold: 0.8,
      },
    },
    {
      type: 'tool_call',
      action: 'block',
      config: {
        allowedTools: ['search_knowledge_base', 'get_order_status', 'create_ticket'],
        blockUnknownTools: true,
        parameterValidation: true,
      },
    },
  ],
});
 
console.log(policy.id); // 'policy_abc123'

Scan content against the policy

const result = await guardrails.scan({
  policyId: policy.id,
  output: 'Please contact john@example.com or call 555-123-4567 for help.',
});
 
console.log(result);
// {
//   action: 'redact',
//   redactedOutput: 'Please contact [EMAIL REDACTED] or call [PHONE REDACTED] for help.',
//   results: [
//     {
//       scanner: 'pii',
//       action: 'redact',
//       findings: [
//         { entity: 'email', original: 'john@example.com', redacted: '[EMAIL REDACTED]' },
//         { entity: 'phone', original: '555-123-4567', redacted: '[PHONE REDACTED]' }
//       ]
//     },
//     { scanner: 'keyword', action: 'pass' },
//     { scanner: 'regex', action: 'pass' },
//     { scanner: 'topic', action: 'pass' },
//     { scanner: 'tool_call', action: 'pass' }
//   ]
// }

Handle blocked content

When a scanner configured with action: 'block' triggers, the content is rejected: the scan result has action: 'block' and redactedOutput is null.

const blockedResult = await guardrails.scan({
  policyId: policy.id,
  output: 'Our competitor_name has a similar product but ours is better.',
});
 
console.log(blockedResult);
// {
//   action: 'block',
//   redactedOutput: null,
//   results: [
//     {
//       scanner: 'keyword',
//       action: 'block',
//       findings: [{ keyword: 'competitor_name', position: 4 }]
//     }
//   ]
// }
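When content is blocked, it is often useful to log which scanners were responsible. The helper below is a small sketch that works on a result object shaped like the samples above. The `ScanOutcome` and `ScannerOutcome` types and the `blockingScanners` function are hypothetical, not SDK exports.

```typescript
type Action = 'pass' | 'redact' | 'block';

// Hypothetical types modeled on the sample scan results in this guide.
interface ScannerOutcome {
  scanner: string;
  action: Action;
  findings?: unknown[];
}

interface ScanOutcome {
  action: Action;
  redactedOutput: string | null;
  results: ScannerOutcome[];
}

// Lists the names of the scanners that blocked the content.
function blockingScanners(result: ScanOutcome): string[] {
  return result.results
    .filter((r) => r.action === 'block')
    .map((r) => r.scanner);
}
```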

Scanner Configuration Reference

PII Scanner

Detects and redacts personally identifiable information.

| Setting | Type | Description |
| --- | --- | --- |
| entities | string[] | PII types to detect: email, phone, ssn, credit_card, address, name, dob |
| redactionStyle | string | How to redact: mask (partial), hash (SHA-256), replace (placeholder) |
| allowList | string[] | Values to exclude from detection (e.g., company email domains) |
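As a rough illustration of how allowList and redactionStyle interact, the sketch below redacts email addresses locally. It is not the scanner's implementation: the email pattern is deliberately simple, the suffix-based allowList matching and the exact mask format are assumptions, and the hash style is left out to keep the example short.

```typescript
type RedactionStyle = 'mask' | 'replace';

// Deliberately simple email pattern for illustration; real PII detection is broader.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function redactEmails(text: string, style: RedactionStyle, allowList: string[] = []): string {
  return text.replace(EMAIL_PATTERN, (email) => {
    // Assumption: an allowList entry matches when the address ends with it,
    // e.g. '@thinkhive.example' for a company email domain.
    const lower = email.toLowerCase();
    if (allowList.some((allowed) => lower.endsWith(allowed.toLowerCase()))) {
      return email;
    }
    if (style === 'replace') return '[EMAIL REDACTED]';
    // 'mask' (partial): keep the first character and the domain, star out the rest.
    const at = email.indexOf('@');
    return email.slice(0, 1) + '*'.repeat(Math.max(at - 1, 1)) + email.slice(at);
  });
}
```

With the replace style, an address outside the allow list becomes a placeholder, while an address whose suffix is on the allow list passes through unchanged.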

Keyword Scanner

Blocks or redacts content containing specific keywords.

| Setting | Type | Description |
| --- | --- | --- |
| keywords | string[] | List of keywords to match |
| caseSensitive | boolean | Whether matching is case-sensitive (default: false) |
| matchWholeWord | boolean | Only match whole words, not substrings (default: true) |
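The effect of caseSensitive and matchWholeWord can be approximated locally with regular expressions. This is a sketch under assumptions, not the scanner's actual matching logic: `findKeywords` and its types are hypothetical, and whole-word matching is modeled with `\b` word boundaries, which only behave as expected for keywords that start and end with word characters.

```typescript
interface KeywordConfig {
  keywords: string[];
  caseSensitive?: boolean;  // default false, as in the table above
  matchWholeWord?: boolean; // default true, as in the table above
}

interface KeywordFinding {
  keyword: string;
  position: number; // zero-based character offset of the match
}

// Escapes regex metacharacters so a keyword is matched literally.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function findKeywords(text: string, config: KeywordConfig): KeywordFinding[] {
  const { keywords, caseSensitive = false, matchWholeWord = true } = config;
  const findings: KeywordFinding[] = [];
  for (const keyword of keywords) {
    if (keyword.length === 0) continue; // an empty keyword would match everywhere
    const body = escapeRegExp(keyword);
    const source = matchWholeWord ? `\\b${body}\\b` : body;
    const pattern = new RegExp(source, caseSensitive ? 'g' : 'gi');
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
      findings.push({ keyword, position: match.index });
    }
  }
  return findings;
}
```

With matchWholeWord left at its default, a keyword such as confidential does not match inside confidentially. Setting it to false turns that into a hit.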

Regex Scanner

Applies custom regex patterns for detection.

| Setting | Type | Description |
| --- | --- | --- |
| patterns | object[] | Array of { name, pattern } objects |
| redactionReplacement | string | Text to replace matches with (default: [REDACTED]) |
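Before adding a pattern to a policy, it can help to try it against sample text locally. The sketch below applies { name, pattern } entries the way the table describes, assuming patterns are interpreted as JavaScript regular expressions and applied globally. `applyRegexRedaction` is a hypothetical helper, not part of the SDK.

```typescript
interface NamedPattern {
  name: string;
  pattern: string; // regex source, assumed to use JavaScript syntax
}

interface RegexRedactionResult {
  redacted: string;
  matched: string[]; // names of the patterns that matched at least once
}

function applyRegexRedaction(
  text: string,
  patterns: NamedPattern[],
  redactionReplacement: string = '[REDACTED]',
): RegexRedactionResult {
  let redacted = text;
  const matched: string[] = [];
  for (const entry of patterns) {
    const regex = new RegExp(entry.pattern, 'g');
    if (redacted.match(regex) !== null) {
      matched.push(entry.name);
      // A replacer function keeps '$' in the replacement text literal.
      redacted = redacted.replace(regex, () => redactionReplacement);
    }
  }
  return { redacted, matched };
}
```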

Topic Scanner

Detects off-limits topics using semantic classification.

| Setting | Type | Description |
| --- | --- | --- |
| blockedTopics | string[] | Topics to block (e.g., politics, religion, violence) |
| threshold | number | Confidence threshold for topic detection (0.0 to 1.0) |
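The role of threshold is easiest to see with numbers. In the sketch below, the classifier scores are made up, `blockedTopicHits` is a hypothetical helper, and the inclusive comparison (a score equal to the threshold counts as a hit) is an assumption.

```typescript
interface TopicConfig {
  blockedTopics: string[];
  threshold: number; // 0.0 to 1.0
}

// Returns the blocked topics whose confidence score reaches the threshold.
// Topics that are not in blockedTopics are ignored regardless of score.
function blockedTopicHits(scores: Record<string, number>, config: TopicConfig): string[] {
  return config.blockedTopics.filter((topic) => (scores[topic] ?? 0) >= config.threshold);
}
```

Raising the threshold makes the scanner more conservative about blocking. Lowering it catches more borderline content at the cost of more false positives.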

Tool Call Scanner

Validates tool/function calls made by the AI agent.

| Setting | Type | Description |
| --- | --- | --- |
| allowedTools | string[] | List of permitted tool names |
| blockUnknownTools | boolean | Block calls to tools not in the allowed list |
| parameterValidation | boolean | Validate tool parameters against expected schemas |
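The allow-list part of this scanner amounts to a membership check. The following sketch shows that logic in isolation. `ToolCall`, `ToolCallConfig`, and `checkToolCall` are hypothetical names, and parameter validation is omitted because the expected schema format is not described here.

```typescript
// Hypothetical shape of a tool call made by the agent.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

interface ToolCallConfig {
  allowedTools: string[];
  blockUnknownTools: boolean;
}

// Blocks a call only when the tool is not allowed and blockUnknownTools is on.
function checkToolCall(call: ToolCall, config: ToolCallConfig): 'pass' | 'block' {
  const isAllowed = config.allowedTools.some((tool) => tool === call.name);
  return !isAllowed && config.blockUnknownTools ? 'block' : 'pass';
}
```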

Managing Policies

List policies

const policies = await guardrails.listPolicies();
// [
//   { id: 'policy_abc', name: 'Customer Support Policy', isDefault: true, scannerCount: 5 },
//   { id: 'policy_def', name: 'Internal Tools Policy', isDefault: false, scannerCount: 2 },
// ]

Update a policy

await guardrails.updatePolicy(policy.id, {
  scanners: [
    ...policy.scanners,
    {
      type: 'keyword',
      action: 'block',
      config: {
        keywords: ['new_blocked_term'],
        caseSensitive: false,
      },
    },
  ],
});

Set a default policy

await guardrails.setDefaultPolicy('policy_def');

Delete a policy

await guardrails.deletePolicy('policy_abc');

Warning: Deleting a policy that is referenced by active agents will cause those agents to fall back to the default policy. If no default policy exists, scans will be skipped.
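Because a missing default means scans are skipped, it may be worth checking that a default policy will still exist before deleting one. The helper below is a sketch that works on a list shaped like the listPolicies() sample above. `defaultRemainsAfterDelete` is a hypothetical name, and it assumes the service does not promote another policy to default automatically.

```typescript
// Hypothetical item shape, modeled on the listPolicies() sample output.
interface PolicyListItem {
  id: string;
  name: string;
  isDefault: boolean;
}

// True when some policy other than the one being deleted is marked as default.
function defaultRemainsAfterDelete(policies: PolicyListItem[], idToDelete: string): boolean {
  return policies.some((p) => p.isDefault && p.id !== idToDelete);
}
```

If this returns false, setting a new default first avoids leaving agents without guardrails.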

Guardrail Analytics

View how your policies are performing.

const analytics = await guardrails.getAnalytics({
  policyId: policy.id,
  period: '30d',
});
 
console.log(analytics);
// {
//   totalScans: 48230,
//   blocked: 342,
//   redacted: 1876,
//   passed: 46012,
//   blockRate: 0.007,
//   topTriggers: [
//     { scanner: 'pii', entity: 'email', count: 1204 },
//     { scanner: 'pii', entity: 'phone', count: 672 },
//     { scanner: 'keyword', keyword: 'internal_only', count: 198 },
//     { scanner: 'topic', topic: 'politics', count: 144 },
//   ],
//   falsePositiveRate: 0.02,
// }
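The rate fields can be cross-checked against the raw counts. The sketch below assumes that blockRate is blocked divided by totalScans, which is consistent with the sample numbers above (342 / 48230 is roughly 0.007). `ScanCounts` and `computeRates` are hypothetical names, and falsePositiveRate is not derived here because it depends on review data that the counts do not contain.

```typescript
interface ScanCounts {
  totalScans: number;
  blocked: number;
  redacted: number;
  passed: number;
}

interface ScanRates {
  blockRate: number;
  redactRate: number;
  passRate: number;
}

// Divides each count by the total, returning 0 for an empty period.
function computeRates(counts: ScanCounts): ScanRates {
  const rate = (count: number): number =>
    counts.totalScans === 0 ? 0 : count / counts.totalScans;
  return {
    blockRate: rate(counts.blocked),
    redactRate: rate(counts.redacted),
    passRate: rate(counts.passed),
  };
}
```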

Dashboard Setup

You can also create and manage policies from the ThinkHive dashboard:

  1. Navigate to Settings > Guardrail Policies
  2. Click Create Policy and give it a name
  3. Add scanners using the visual editor — each scanner has a configuration panel
  4. Toggle Set as Default to make it the fallback policy
  5. Use the Test panel to scan sample content before activating

Integration Pattern

A typical integration scans both inputs and outputs.

import { guardrails } from '@thinkhive/sdk';
 
async function safeAgentResponse(userMessage: string, agentResponse: string) {
  // Scan both input and output in a single call
  // Uses default policy when policyId is omitted
  const scan = await guardrails.scan({
    input: userMessage,
    output: agentResponse,
  });
 
  if (scan.action === 'block') {
    return { error: 'Content blocked by guardrail policy.' };
  }
 
  // Return the (potentially redacted) response
  return { response: scan.redactedOutput ?? agentResponse };
}
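The function above does not say what happens if the scan call itself fails, for example on a network error. One option is to fail closed and withhold the response. The sketch below shows that variation with the scan function passed in as a parameter so it can run without the SDK. `ScanFn`, the request and result types, and `guardedResponse` are hypothetical, and whether to fail closed or open is a design decision for your application.

```typescript
type ScanAction = 'pass' | 'redact' | 'block';

interface ScanRequest {
  input?: string;
  output?: string;
  policyId?: string;
}

interface ScanResult {
  action: ScanAction;
  redactedOutput: string | null;
}

// Stand-in for guardrails.scan so the example is self-contained.
type ScanFn = (request: ScanRequest) => Promise<ScanResult>;

type GuardedReply = { response: string } | { error: string };

async function guardedResponse(
  scan: ScanFn,
  userMessage: string,
  agentResponse: string,
): Promise<GuardedReply> {
  let result: ScanResult;
  try {
    result = await scan({ input: userMessage, output: agentResponse });
  } catch {
    // Fail closed: if the scan cannot run, do not return unscanned content.
    return { error: 'Guardrail scan unavailable.' };
  }
  if (result.action === 'block') {
    return { error: 'Content blocked by guardrail policy.' };
  }
  // Use the redacted text when present, otherwise the original response.
  return { response: result.redactedOutput ?? agentResponse };
}
```

In production, the first argument would be a thin wrapper around guardrails.scan. In tests, it can be a stub that returns a fixed result or throws.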

Best Practices

Policy Design by Use Case

| Use Case | Recommended Scanners | Notes |
| --- | --- | --- |
| Customer support | PII, keyword, topic | Redact PII, block competitor mentions |
| Healthcare | PII (all entities), keyword, regex | Strict PII redaction for HIPAA |
| Internal tools | Tool call, regex | Validate tool usage, block secrets |
| Sales | Topic, keyword, PII | Block off-topic, protect customer data |

  1. Start with PII scanning on all customer-facing agents — it has the highest compliance impact
  2. Use redact over block when possible to preserve conversation flow
  3. Test policies with sample content before deploying to production
  4. Monitor false positive rates and refine keyword lists and thresholds regularly
  5. Use separate policies for different agent types rather than one policy with many exceptions
  6. Set a default policy as a safety net so no agent runs without guardrails

Next Steps