
Hallucination Detection Guide

Learn how to detect, categorize, and prevent hallucinations in your AI agent responses.

What are Hallucinations?

Hallucinations occur when an AI model generates information that is:

  • Fabricated - Made up entirely
  • Incorrect - Contradicts known facts
  • Unsupported - Not backed by provided context

9 Types of Hallucinations

ThinkHive detects these hallucination types:

1. Unsupported Claims

Statements not present in the source material.

Context: "Our product costs $99/month"
Response: "The product costs $99/month with a 20% annual discount"
Issue: The discount was never mentioned

2. Factual Errors

Incorrect facts that contradict reality.

Response: "Python was created by Guido van Rossum in 1995"
Issue: Python was created in 1991

3. Contradictions

Statements that conflict with each other.

Response: "The meeting is at 3pm... Please join us at 4pm"
Issue: Conflicting times in same response

4. Out-of-Scope Responses

Answering questions beyond available information.

Context: Product documentation
Response: "Based on user reviews, customers love..."
Issue: No user reviews in context

5. Fabricated References

Made-up citations or sources.

Response: "According to the 2024 AI Report by Gartner..."
Issue: No such report exists or was provided

6. Missing Context

Omitting critical caveats or conditions.

Context: "Free trial for 14 days. Credit card required."
Response: "We offer a free trial"
Issue: Credit card requirement omitted

7. Logical Fallacies

Invalid reasoning or conclusions.

Response: "Since our product is popular, it must be the best"
Issue: Popularity doesn't equal quality (an appeal-to-popularity fallacy)

8. Semantic Drift

Subtle changes in meaning.

Context: "May improve performance"
Response: "Will improve performance"
Issue: Changed possibility to certainty

9. Attribution Errors

Incorrectly attributing information.

Context: Document A says X, Document B says Y
Response: "Document A says Y"
Issue: Wrong source attribution
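For reference, the nine categories above can be written down as a type union. The identifier names below are illustrative (only `fabricated_reference` appears verbatim in the SDK output shown later), so treat them as a sketch rather than ThinkHive's official schema:

```typescript
// Illustrative names -- not necessarily ThinkHive's official identifiers.
type HallucinationType =
  | 'unsupported_claim'
  | 'factual_error'
  | 'contradiction'
  | 'out_of_scope'
  | 'fabricated_reference'
  | 'missing_context'
  | 'logical_fallacy'
  | 'semantic_drift'
  | 'attribution_error';

// Shape matching the hallucinationReport entries shown below.
interface Hallucination {
  type: HallucinationType;
  text: string;        // the offending span from the response
  confidence: number;  // 0..1
  severity: 'low' | 'medium' | 'high';
  suggestion?: string;
}

const example: Hallucination = {
  type: 'fabricated_reference',
  text: 'According to our 2024 report...',
  confidence: 0.92,
  severity: 'high',
};
```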

Setting Up Detection

import { runs } from '@thinkhive/sdk';
 
// Analyze with hallucination detection
const analysis = await runs.analyze(runId, {
  includeHallucinationDetection: true,
});
 
console.log(analysis.hallucinationReport);
// {
//   detected: true,
//   hallucinations: [
//     {
//       type: 'fabricated_reference',
//       text: 'According to our 2024 report...',
//       confidence: 0.92,
//       severity: 'high',
//       suggestion: 'Remove or verify the reference'
//     }
//   ]
// }

Prevention Strategies

1. Improve Prompts

Answer ONLY using information from the provided context.
If the context doesn't contain the answer, say "I don't have that information."
Never invent facts, statistics, or references.

Context: {context}
Question: {question}
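Filling that template is a one-liner in application code. The helper below is my own illustration (not part of any SDK); it just interpolates the context and question into the grounded-answer instructions above:

```typescript
// Hypothetical helper: fills the grounded-answer prompt template above.
function buildGroundedPrompt(context: string, question: string): string {
  return [
    'Answer ONLY using information from the provided context.',
    `If the context doesn't contain the answer, say "I don't have that information."`,
    'Never invent facts, statistics, or references.',
    '',
    `Context: ${context}`,
    `Question: ${question}`,
  ].join('\n');
}
```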

2. Add Verification Layer

async function verifiedResponse(query: string) {
  // generateResponse() is your application's own LLM call
  const response = await generateResponse(query);
 
  // Run hallucination check against the retrieved context
  const check = await thinkHive.analyze({
    response: response,
    context: retrievedDocs,
    includeHallucinationDetection: true,
  });
 
  if (check.hallucinationReport.detected) {
    // Regenerate with a stricter, hallucination-aware prompt
    return regenerateWithWarning(query, check.hallucinationReport.hallucinations);
  }
 
  return response;
}

3. Use Temperature Control

Lower sampling temperatures tend to reduce hallucinations on factual queries:

const completion = await openai.chat.completions.create({
  model: 'gpt-4',
  temperature: 0.3,  // Lower for factual responses
  // ...
});

4. Implement Guardrails

function validateResponse(response: string, context: string) {
  // Check for common hallucination patterns
  const patterns = [
    /according to .* report/i,
    /\d+% of (users|customers)/i,
    /studies show/i,
  ];
 
  for (const pattern of patterns) {
    if (pattern.test(response) && !pattern.test(context)) {
      return { valid: false, reason: 'Potential fabricated claim' };
    }
  }
 
  return { valid: true };
}

Monitoring & Alerts

// Set up alert for hallucinations
await webhooks.create({
  url: 'https://your-app.com/alerts',
  events: ['trace.failure'],
  filters: {
    'hallucinationReport.detected': true,
    'hallucinationReport.severity': 'high'
  }
});

Best Practices

Hallucination Risk by Use Case

Use Case          | Risk Level | Recommended Actions
Casual chat       | Low        | Monitor trends
Customer support  | Medium     | Real-time detection
Legal/Medical     | High       | Human review required
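That routing can be encoded directly. The risk levels and actions come from the table above; the function itself is illustrative, not an SDK API:

```typescript
type RiskLevel = 'low' | 'medium' | 'high';

// Mirrors the risk table above -- illustrative routing, not an SDK API.
function recommendedAction(risk: RiskLevel): string {
  switch (risk) {
    case 'low':
      return 'Monitor trends';
    case 'medium':
      return 'Real-time detection';
    case 'high':
      return 'Human review required';
  }
}
```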
  1. Always provide context - Responses without context are more likely to hallucinate
  2. Use lower temperatures for factual queries
  3. Implement post-generation checks for high-stakes responses
  4. Monitor hallucination trends over time
  5. Train on verified data when fine-tuning
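Point 4 (monitoring trends) can start as simply as tracking the share of flagged runs in a recent window. A minimal sketch, assuming you store one boolean per analyzed run:

```typescript
// Rolling hallucination rate over the most recent `window` runs.
// `detectedFlags` holds one boolean per analyzed run, oldest first.
function hallucinationRate(detectedFlags: boolean[], window = 100): number {
  const recent = detectedFlags.slice(-window);
  if (recent.length === 0) return 0;
  return recent.filter(Boolean).length / recent.length;
}
```

Plot this over time and alert when it crosses a threshold appropriate to your use case's risk level.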

Next Steps