
Core Concepts

ThinkHive uses a run-centric architecture to model AI agent behavior. Understanding these concepts will help you get the most out of the platform.

The Run-Centric Model

Core Entities

Run

A Run is a complete agent execution from input to final output. It represents one unit of work your AI agent performs.

```javascript
const run = {
  id: "run_abc123",
  name: "customer-support-chat",
  input: "How do I reset my password?",
  output: "To reset your password, go to Settings > Security...",
  outcome: "success",
  startTime: "2024-01-15T10:30:00Z",
  endTime: "2024-01-15T10:30:02Z",
  metadata: {
    customerId: "cust_xyz",
    channel: "chat"
  }
};
```

Trace

A Trace is the collection of operations (spans) that occurred during a run. It provides the detailed timeline of what happened.
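To make the relationship concrete, a trace can be sketched as the ordered spans of one run. The field names below are illustrative, not ThinkHive's exact schema:

```javascript
// Hypothetical trace shape: one run's spans in chronological order.
const trace = {
  runId: "run_abc123",
  spans: [
    { id: "span_ret_001", type: "retrieval", name: "kb-search" },
    { id: "span_def456", type: "llm", name: "gpt-4-response" }
  ]
};

// Ordering the spans by start time yields the run's timeline;
// the earliest start and latest end bound its total duration.
```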

Span

A Span represents a single operation within a trace:

| Span Type | Description | Example |
|---|---|---|
| LLM | A call to a language model | GPT-4 completion |
| Retrieval | A search or retrieval operation | Vector database query |
| Tool | An external tool invocation | Web search, calculator |
| Chain | A workflow or orchestration step | LangChain chain execution |

```javascript
const span = {
  id: "span_def456",
  type: "llm",
  name: "gpt-4-response",
  startTime: "2024-01-15T10:30:00.500Z",
  endTime: "2024-01-15T10:30:01.800Z",
  attributes: {
    "llm.model": "gpt-4",
    "llm.provider": "openai",
    "llm.input_tokens": 150,
    "llm.output_tokens": 200
  }
};
```

Policies

A Policy defines behavioral rules for your AI agent without requiring code changes. Policies let you configure guardrails, routing logic, and response constraints through the ThinkHive dashboard.

```javascript
const policy = {
  id: "policy_001",
  name: "No Financial Advice",
  type: "guardrail",
  status: "active",
  rules: [
    {
      condition: "topic_contains",
      value: ["investment", "stock", "financial advice"],
      action: "block",
      fallbackResponse: "I'm not able to provide financial advice. Please consult a licensed advisor."
    }
  ]
};
```

Policy Types

| Type | Description | Example |
|---|---|---|
| Guardrail | Block or modify responses matching criteria | Prevent financial/medical advice |
| Routing | Direct requests to specific agent configurations | Route VIP customers to premium model |
| Constraint | Enforce response format or content rules | Require citations, limit length |
| Fallback | Define behavior when primary agent fails | Graceful degradation responses |

Policies are evaluated at runtime and can be updated without redeploying your agent. Changes take effect immediately.

Quality Concepts

Claims

Claims are assertions extracted from agent responses. They are categorized as:

Facts are statements that can be verified against source material or known truths.

Inferences are conclusions drawn from available information.

```javascript
const claim = {
  id: "claim_789",
  content: "The password reset link expires after 24 hours",
  type: "fact",           // or "inference"
  confidence: 0.95,
  source: "retrieved_doc_1",
  supported: true
};
```

Cases

A Case is a cluster of similar failures or issues. ThinkHive automatically groups related problems to help you prioritize improvements.

```javascript
// "case" is a reserved word in JavaScript, hence the trailing underscore.
const case_ = {
  id: "case_001",
  title: "Password reset instructions incomplete",
  severity: "medium",
  status: "open",
  traceCount: 47,
  pattern: "Missing 2FA reset step when user has MFA enabled",
  suggestedFix: "Add conditional step for MFA users"
};
```

Fixes

A Fix is a proposed improvement for a case. ThinkHive can generate fix suggestions automatically.
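As an illustrative sketch (the field names here are hypothetical, not ThinkHive's exact schema), a fix record might look like:

```javascript
// Hypothetical fix shape, tied back to the case it addresses.
const fix = {
  id: "fix_001",
  caseId: "case_001",
  type: "prompt_update",
  description: "Add conditional 2FA reset step for MFA-enabled users",
  status: "proposed"      // e.g. "proposed", "shadow_testing", "deployed"
};
```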

Shadow Tests

A Shadow Test validates a fix by comparing new behavior against the baseline without affecting production.
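A shadow-test result could be sketched roughly as follows (illustrative shape and field names, not the exact API):

```javascript
// Hypothetical shadow-test result: candidate behavior vs. baseline,
// evaluated on replayed traffic so production is unaffected.
const shadowTest = {
  id: "shadow_001",
  fixId: "fix_001",
  sampleSize: 500,
  baselineSuccessRate: 0.82,
  candidateSuccessRate: 0.91,
  verdict: "improvement"
};
```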

Evaluation Metrics

RAG Quality Metrics

| Metric | Description | Range |
|---|---|---|
| Groundedness | Response supported by retrieved context | 0-1 |
| Faithfulness | Response consistent with context | 0-1 |
| Citation Accuracy | Proper attribution to sources | 0-1 |
| Context Relevance | Retrieved docs match query | 0-1 |
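As a sketch, these scores might appear on a run like this (illustrative shape, not the exact API):

```javascript
// Illustrative per-run RAG quality scores; each metric ranges 0-1.
const ragMetrics = {
  groundedness: 0.93,
  faithfulness: 0.97,
  citationAccuracy: 0.88,
  contextRelevance: 0.91
};

// One simple aggregate for dashboards: the minimum score is a
// conservative summary (a run is only as good as its weakest metric).
const worst = Math.min(...Object.values(ragMetrics));
```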

Hallucination Types

ThinkHive detects 9 types of hallucinations:

  1. Unsupported Claims - Statements not in source material
  2. Factual Errors - Incorrect facts
  3. Contradictions - Conflicting statements
  4. Out-of-Scope - Answering beyond available info
  5. Fabricated References - Made-up citations
  6. Missing Context - Omitting critical information
  7. Logical Fallacies - Invalid reasoning
  8. Semantic Drift - Subtle meaning changes
  9. Attribution Errors - Incorrect source attribution
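A detected hallucination might be recorded roughly like this (illustrative shape; the claim and evidence shown are hypothetical):

```javascript
// Hypothetical hallucination detection record, referencing the span
// where the problem occurred and one of the 9 types above.
const hallucination = {
  spanId: "span_def456",
  type: "unsupported_claim",
  claim: "The reset link expires after 48 hours",
  evidence: "No retrieved document mentions an expiry window",
  severity: "high"
};
```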

Business Concepts

Customer Context

Customer Context captures the state of a customer at a point in time, enabling time-series analysis.

```javascript
const context = {
  customerId: "cust_xyz",
  timestamp: "2024-01-15T10:30:00Z",
  metrics: {
    subscription_tier: "premium",
    arr: 5000,
    health_score: 0.85,
    last_support_contact: "2024-01-10"
  }
};
```

Calibration

Calibration measures how well confidence scores predict actual outcomes. A well-calibrated model's confidence scores match observed success rates: responses assigned 0.8 confidence should turn out correct about 80% of the time.

```javascript
const calibration = {
  brierScore: 0.12,          // Lower is better (0-1)
  expectedCalibrationError: 0.08,
  status: "well_calibrated"  // or "under_confident", "over_confident"
};
```
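The Brier score is the mean squared difference between predicted confidence and observed outcome (1 for success, 0 for failure). A minimal sketch:

```javascript
// Brier score: mean of (prediction - outcome)^2 over all runs.
// outcomes are 1 (success) or 0 (failure); lower scores are better.
function brierScore(predictions, outcomes) {
  let sum = 0;
  for (let i = 0; i < predictions.length; i++) {
    sum += (predictions[i] - outcomes[i]) ** 2;
  }
  return sum / predictions.length;
}

// A model that says 0.9 for events that mostly happen, and 0.1 for
// events that mostly don't, scores close to 0.
brierScore([0.9, 0.9, 0.1], [1, 1, 0]);
```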

Ticket Linking

ThinkHive supports 7 methods for linking runs to support tickets, ranging from fully deterministic (explicit IDs) to heuristic (semantic similarity):

  1. Explicit run ID in ticket
  2. Zendesk marker embedding
  3. Timestamp correlation
  4. Conversation thread ID
  5. Customer session ID
  6. Semantic similarity
  7. Manual linking

API Tiers

| Tier | Rate Limit | Features |
|---|---|---|
| Free | 10/min | Basic tracing, explainability |
| Starter | 60/min | + Search, patterns, quality metrics |
| Professional | 300/min | + RAG eval, hallucinations, ROI |
| Enterprise | 1,000/min | All features + dedicated support |

Next Steps