Guardrail Policies Guide
Guardrail policies let you define scanning rules that run against AI inputs and outputs in real time. Each policy can combine multiple scanners — PII detection, keyword filtering, regex matching, topic detection, and tool call validation — into a single reusable configuration.
How Policies Work
A guardrail policy is a named collection of scanners. When you call guardrails.scan(), you reference a policy by ID, and all scanners in that policy run against the content. Scanners can block (reject the content) or redact (modify the content) depending on your configuration.
You can set one policy as the default policy. Any scan request that does not specify a policyId will use the default policy automatically.
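The way a policy turns individual scanner results into one overall action can be sketched as follows. This is a minimal illustration, not the SDK's implementation: the names `Scanner`, `ScannerResult`, and `runPolicy` are hypothetical, and both the precedence (a block wins over a redaction, which wins over a pass) and the early exit on the first block are assumptions inferred from the sample responses in this guide.

```typescript
// Hypothetical types for illustration; these are not part of @thinkhive/sdk.
type Action = 'pass' | 'redact' | 'block';

interface ScannerResult {
  scanner: string;
  action: Action;
  content: string; // content after this scanner ran (unchanged unless redacted)
}

type Scanner = (content: string) => ScannerResult;

// Run every scanner in order. Assumption: the first block rejects the content
// outright, and redactions accumulate across scanners.
function runPolicy(
  scanners: Scanner[],
  content: string,
): { action: Action; output: string | null } {
  let current = content;
  let action: Action = 'pass';
  for (const scan of scanners) {
    const result = scan(current);
    if (result.action === 'block') {
      return { action: 'block', output: null };
    }
    if (result.action === 'redact') {
      action = 'redact';
      current = result.content;
    }
  }
  return { action, output: current };
}

// Two toy scanners used only to exercise runPolicy.
const redactSecret: Scanner = (c) =>
  c.includes('secret')
    ? { scanner: 'demo_redact', action: 'redact', content: c.split('secret').join('[REDACTED]') }
    : { scanner: 'demo_redact', action: 'pass', content: c };

const blockForbidden: Scanner = (c) => ({
  scanner: 'demo_block',
  action: c.includes('forbidden') ? 'block' : 'pass',
  content: c,
});
```

With these toy scanners, content containing only "secret" comes back redacted, while content containing "forbidden" is rejected regardless of any redaction that ran earlier.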
Creating a Policy
Define your policy and scanners
import { guardrails } from '@thinkhive/sdk';
const policy = await guardrails.createPolicy({
  name: 'Customer Support Policy',
  description: 'Standard guardrails for customer-facing AI agents',
  isDefault: true,
  scanners: [
    {
      type: 'pii',
      action: 'redact',
      config: {
        entities: ['email', 'phone', 'ssn', 'credit_card'],
        redactionStyle: 'mask', // 'mask', 'hash', or 'replace'
      },
    },
    {
      type: 'keyword',
      action: 'block',
      config: {
        keywords: ['competitor_name', 'internal_only', 'confidential'],
        caseSensitive: false,
        matchWholeWord: true,
      },
    },
    {
      type: 'regex',
      action: 'redact',
      config: {
        patterns: [
          { name: 'api_key', pattern: '(sk-[a-zA-Z0-9]{32,})' },
          { name: 'internal_url', pattern: '(https?://internal\\.company\\.com[^\\s]*)' },
        ],
        redactionReplacement: '[REDACTED]',
      },
    },
    {
      type: 'topic',
      action: 'block',
      config: {
        blockedTopics: ['politics', 'religion', 'violence'],
        threshold: 0.8,
      },
    },
    {
      type: 'tool_call',
      action: 'block',
      config: {
        allowedTools: ['search_knowledge_base', 'get_order_status', 'create_ticket'],
        blockUnknownTools: true,
        parameterValidation: true,
      },
    },
  ],
});
console.log(policy.id); // 'policy_abc123'
Scan content against the policy
const result = await guardrails.scan({
  policyId: policy.id,
  output: 'Please contact john@example.com or call 555-123-4567 for help.',
});
console.log(result);
// {
//   action: 'redact',
//   redactedOutput: 'Please contact [EMAIL REDACTED] or call [PHONE REDACTED] for help.',
//   results: [
//     {
//       scanner: 'pii',
//       action: 'redact',
//       findings: [
//         { entity: 'email', original: 'john@example.com', redacted: '[EMAIL REDACTED]' },
//         { entity: 'phone', original: '555-123-4567', redacted: '[PHONE REDACTED]' }
//       ]
//     },
//     { scanner: 'keyword', action: 'pass' },
//     { scanner: 'regex', action: 'pass' },
//     { scanner: 'topic', action: 'pass' },
//     { scanner: 'tool_call', action: 'pass' }
//   ]
// }
Handle blocked content
When a scanner with action: 'block' triggers, the content is rejected: the scan result has action: 'block' and redactedOutput is null.
const blockedResult = await guardrails.scan({
  policyId: policy.id,
  output: 'Our competitor_name has a similar product but ours is better.',
});
console.log(blockedResult);
// {
//   action: 'block',
//   redactedOutput: null,
//   results: [
//     {
//       scanner: 'keyword',
//       action: 'block',
//       findings: [{ keyword: 'competitor_name', position: 4 }]
//     }
//   ]
// }
Scanner Configuration Reference
PII Scanner
Detects and redacts personally identifiable information.
| Setting | Type | Description |
|---|---|---|
| entities | string[] | PII types to detect: email, phone, ssn, credit_card, address, name, dob |
| redactionStyle | string | How to redact: mask (partial), hash (SHA-256), replace (placeholder) |
| allowList | string[] | Values to exclude from detection (e.g., company email domains) |
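To illustrate how redactionStyle and allowList might interact, here is a minimal sketch restricted to email addresses. Everything in it is an assumption made for illustration: the function name `redactEmails`, the simplified email pattern, the exact shape of a "partial" mask (first character of the local part kept), and the treatment of allowList entries as either full addresses or domains. The hash style is left out because it would need a crypto library.

```typescript
// Hypothetical helper; the real scanner's detection and masking rules may differ.
type RedactionStyle = 'mask' | 'replace';

// Simplified email pattern for illustration only.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function redactEmails(
  text: string,
  style: RedactionStyle,
  allowList: string[] = [],
): string {
  return text.replace(EMAIL_PATTERN, (match) => {
    // Allow-listed values (a full address or a domain) are left untouched.
    const allowed = allowList.some(
      (entry) => match === entry || match.endsWith('@' + entry),
    );
    if (allowed) return match;

    // 'replace': swap the whole value for a placeholder.
    if (style === 'replace') return '[EMAIL REDACTED]';

    // 'mask': keep the first character of the local part and the domain.
    const at = match.indexOf('@');
    const local = match.slice(0, at);
    const domain = match.slice(at + 1);
    return local.charAt(0) + '*'.repeat(local.length - 1) + '@' + domain;
  });
}
```

Under these assumptions, `redactEmails('Contact john@example.com', 'mask')` yields `Contact j***@example.com`, and passing `['example.com']` as the allow list leaves the address unchanged.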
Keyword Scanner
Blocks or redacts content containing specific keywords.
| Setting | Type | Description |
|---|---|---|
| keywords | string[] | List of keywords to match |
| caseSensitive | boolean | Whether matching is case-sensitive (default: false) |
| matchWholeWord | boolean | Only match whole words, not substrings (default: true) |
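The effect of caseSensitive and matchWholeWord can be sketched with plain regular expressions. This is an illustrative approximation, not the scanner's actual matching logic: `findKeywords` and `escapeRegExp` are hypothetical names, and "whole word" is assumed to mean a regex `\b` word boundary on both sides.

```typescript
interface KeywordConfig {
  keywords: string[];
  caseSensitive?: boolean; // default: false
  matchWholeWord?: boolean; // default: true
}

interface KeywordFinding {
  keyword: string;
  position: number; // zero-based index of the match in the text
}

// Escape regex metacharacters so keywords are matched literally.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function findKeywords(text: string, config: KeywordConfig): KeywordFinding[] {
  const { keywords, caseSensitive = false, matchWholeWord = true } = config;
  const findings: KeywordFinding[] = [];
  for (const keyword of keywords) {
    if (keyword.length === 0) continue; // an empty keyword would match everywhere
    const body = escapeRegExp(keyword);
    const source = matchWholeWord ? `\\b${body}\\b` : body;
    const pattern = new RegExp(source, caseSensitive ? 'g' : 'gi');
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
      findings.push({ keyword, position: match.index });
    }
  }
  return findings;
}
```

With the defaults, "confidentiality" does not trigger the keyword "confidential"; setting matchWholeWord to false makes it match as a substring.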
Regex Scanner
Applies custom regex patterns for detection.
| Setting | Type | Description |
|---|---|---|
| patterns | object[] | Array of { name, pattern } objects |
| redactionReplacement | string | Text to replace matches with (default: [REDACTED]) |
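As a rough sketch of what a redact action does with these settings, the function below applies each named pattern in turn and substitutes the replacement text. The name `redactWithPatterns` and the returned `matched` list are hypothetical; applying every pattern globally across the whole text is an assumption.

```typescript
interface NamedPattern {
  name: string;
  pattern: string; // regex source, as in the policy config
}

interface RegexConfig {
  patterns: NamedPattern[];
  redactionReplacement?: string; // default: '[REDACTED]'
}

function redactWithPatterns(
  text: string,
  config: RegexConfig,
): { output: string; matched: string[] } {
  const replacement = config.redactionReplacement ?? '[REDACTED]';
  const matched: string[] = [];
  let output = text;
  for (const { name, pattern } of config.patterns) {
    const regex = new RegExp(pattern, 'g');
    let hit = false;
    // A replacer function keeps any '$' in the replacement text literal.
    output = output.replace(regex, () => {
      hit = true;
      return replacement;
    });
    if (hit) matched.push(name);
  }
  return { output, matched };
}
```

Testing patterns locally like this before adding them to a policy can catch escaping mistakes, since the pattern strings need doubled backslashes when written as JavaScript string literals.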
Topic Scanner
Detects off-limits topics using semantic classification.
| Setting | Type | Description |
|---|---|---|
| blockedTopics | string[] | Topics to block (e.g., politics, religion, violence) |
| threshold | number | Confidence threshold for topic detection (0.0 to 1.0) |
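The role of threshold can be illustrated with a small sketch. The semantic classifier itself is out of scope here, so its output is modeled as a hypothetical map of topic names to confidence scores; `blockedTopicHits` is an illustrative name, and treating the comparison as inclusive (score at or above the threshold) is an assumption.

```typescript
// Hypothetical classifier output: topic name -> confidence in [0, 1].
type TopicScores = Record<string, number>;

// Return the blocked topics whose confidence reaches the threshold.
function blockedTopicHits(
  scores: TopicScores,
  blockedTopics: string[],
  threshold: number,
): string[] {
  return blockedTopics.filter((topic) => (scores[topic] ?? 0) >= threshold);
}
```

Raising the threshold makes the scanner trigger less often, which trades missed detections for fewer false positives; lowering it does the opposite.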
Tool Call Scanner
Validates tool/function calls made by the AI agent.
| Setting | Type | Description |
|---|---|---|
| allowedTools | string[] | List of permitted tool names |
| blockUnknownTools | boolean | Block calls to tools not in the allowed list |
| parameterValidation | boolean | Validate tool parameters against expected schemas |
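How the three settings could combine is sketched below. This is not the scanner's real logic: `validateToolCall`, the `ToolCall` shape, and the schema format (a map from tool name to required parameter names) are all hypothetical, since this guide does not describe how expected schemas are defined.

```typescript
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface ToolCallConfig {
  allowedTools: string[];
  blockUnknownTools: boolean;
  parameterValidation: boolean;
}

// Hypothetical schema format: tool name -> required parameter names.
type RequiredParams = Record<string, string[]>;

function validateToolCall(
  call: ToolCall,
  config: ToolCallConfig,
  schemas: RequiredParams,
): { action: 'pass' | 'block'; reason?: string } {
  // Tools outside the allow list are blocked only if blockUnknownTools is set.
  if (!config.allowedTools.includes(call.name)) {
    return config.blockUnknownTools
      ? { action: 'block', reason: `tool not allowed: ${call.name}` }
      : { action: 'pass' };
  }
  // Assumed validation: every required parameter must be present.
  if (config.parameterValidation) {
    const required = schemas[call.name] ?? [];
    const missing = required.filter((param) => !(param in call.args));
    if (missing.length > 0) {
      return { action: 'block', reason: `missing parameters: ${missing.join(', ')}` };
    }
  }
  return { action: 'pass' };
}
```

In this sketch an allowed tool with a missing required parameter is blocked just like an unknown tool, which keeps malformed calls from reaching downstream systems.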
Managing Policies
List policies
const policies = await guardrails.listPolicies();
// [
//   { id: 'policy_abc', name: 'Customer Support Policy', isDefault: true, scannerCount: 5 },
//   { id: 'policy_def', name: 'Internal Tools Policy', isDefault: false, scannerCount: 2 },
// ]
Update a policy
await guardrails.updatePolicy(policy.id, {
  scanners: [
    ...policy.scanners,
    {
      type: 'keyword',
      action: 'block',
      config: {
        keywords: ['new_blocked_term'],
        caseSensitive: false,
      },
    },
  ],
});
Set a default policy
await guardrails.setDefaultPolicy('policy_def');
Delete a policy
await guardrails.deletePolicy('policy_abc');
Deleting a policy that is referenced by active agents will cause those agents to fall back to the default policy. If no default policy exists, scans will be skipped.
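The fallback rule can be expressed as a small lookup. This is a sketch of the behavior described above, not SDK code: `effectivePolicyId` and `PolicySummary` are hypothetical names, and a null return stands in for "the scan is skipped".

```typescript
interface PolicySummary {
  id: string;
  isDefault: boolean;
}

// Returns the policy ID an agent would be scanned with, or null when no
// policy applies and the scan would be skipped.
function effectivePolicyId(
  agentPolicyId: string | undefined,
  policies: PolicySummary[],
): string | null {
  // The agent's own policy is used as long as it still exists.
  if (agentPolicyId && policies.some((p) => p.id === agentPolicyId)) {
    return agentPolicyId;
  }
  // Otherwise fall back to the default policy, if there is one.
  const fallback = policies.find((p) => p.isDefault);
  return fallback ? fallback.id : null;
}
```

A null result means content goes out unscanned, so it is worth confirming that a default policy exists before deleting a policy that agents still reference.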
Guardrail Analytics
View how your policies are performing.
const analytics = await guardrails.getAnalytics({
  policyId: policy.id,
  period: '30d',
});
console.log(analytics);
// {
//   totalScans: 48230,
//   blocked: 342,
//   redacted: 1876,
//   passed: 46012,
//   blockRate: 0.007,
//   topTriggers: [
//     { scanner: 'pii', entity: 'email', count: 1204 },
//     { scanner: 'pii', entity: 'phone', count: 672 },
//     { scanner: 'keyword', keyword: 'internal_only', count: 198 },
//     { scanner: 'topic', topic: 'politics', count: 144 },
//   ],
//   falsePositiveRate: 0.02,
// }
Dashboard Setup
You can also create and manage policies from the ThinkHive dashboard:
- Navigate to Settings > Guardrail Policies
- Click Create Policy and give it a name
- Add scanners using the visual editor — each scanner has a configuration panel
- Toggle Set as Default to make it the fallback policy
- Use the Test panel to scan sample content before activating
Integration Pattern
A typical integration scans both inputs and outputs.
import { guardrails } from '@thinkhive/sdk';
async function safeAgentResponse(userMessage: string, agentResponse: string) {
  // Scan both input and output in a single call
  // Uses default policy when policyId is omitted
  const scan = await guardrails.scan({
    input: userMessage,
    output: agentResponse,
  });
  if (scan.action === 'block') {
    return { error: 'Content blocked by guardrail policy.' };
  }
  // Return the (potentially redacted) response
  return { response: scan.redactedOutput ?? agentResponse };
}
Best Practices
Policy Design by Use Case
| Use Case | Recommended Scanners | Notes |
|---|---|---|
| Customer support | PII, keyword, topic | Redact PII, block competitor mentions |
| Healthcare | PII (all entities), keyword, regex | Strict PII redaction for HIPAA |
| Internal tools | Tool call, regex | Validate tool usage, block secrets |
| Sales | Topic, keyword, PII | Block off-topic, protect customer data |
- Start with PII scanning on all customer-facing agents — it has the highest compliance impact
- Use redact over block when possible to preserve conversation flow
- Test policies with sample content before deploying to production
- Monitor false positive rates and refine keyword lists and thresholds regularly
- Use separate policies for different agent types rather than one policy with many exceptions
- Set a default policy as a safety net so no agent runs without guardrails
Next Steps
- Compliance — PII handling and regulatory compliance
- Transcript Analysis — Post-hoc conversation analysis
- API Reference — Full guardrails API documentation