Core Concepts

ThinkHive uses a run-centric architecture to model AI agent behavior. Understanding these concepts will help you get the most out of the platform.

The Run-Centric Model

Core Entities

Run

A Run is a complete agent execution from input to final output. It represents one unit of work your AI agent performs.

const run = {
  id: "run_abc123",
  name: "customer-support-chat",
  input: "How do I reset my password?",
  output: "To reset your password, go to Settings > Security...",
  outcome: "success",
  startTime: "2024-01-15T10:30:00Z",
  endTime: "2024-01-15T10:30:02Z",
  metadata: {
    customerId: "cust_xyz",
    channel: "chat"
  }
};
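
Because startTime and endTime are ISO 8601 strings, a run's duration can be derived directly from the record. The sketch below assumes the field names shown above; runDurationMs is a hypothetical helper, not part of the ThinkHive SDK.

```javascript
// Hypothetical helper (not a ThinkHive SDK function): derive a run's
// wall-clock duration from its ISO 8601 startTime and endTime fields.
function runDurationMs(run) {
  const start = Date.parse(run.startTime);
  const end = Date.parse(run.endTime);
  if (Number.isNaN(start) || Number.isNaN(end)) {
    throw new Error("run has an invalid startTime or endTime");
  }
  return end - start;
}

const exampleRun = {
  id: "run_abc123",
  startTime: "2024-01-15T10:30:00Z",
  endTime: "2024-01-15T10:30:02Z",
};

console.log(runDurationMs(exampleRun)); // 2000
```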

Trace

A Trace is the collection of operations (spans) that occurred during a run. It provides the detailed timeline of what happened.

Span

A Span represents a single operation within a trace:

Span Type	Description	Example
LLM	A call to a language model	GPT-4 completion
Retrieval	A search or retrieval operation	Vector database query
Tool	An external tool invocation	Web search, calculator
Chain	A workflow or orchestration step	LangChain chain execution

const span = {
  id: "span_def456",
  type: "llm",
  name: "gpt-4-response",
  startTime: "2024-01-15T10:30:00.500Z",
  endTime: "2024-01-15T10:30:01.800Z",
  attributes: {
    "llm.model": "gpt-4",
    "llm.provider": "openai",
    "llm.input_tokens": 150,
    "llm.output_tokens": 200,
  }
};
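
To make the relationship between a trace and its spans concrete, the sketch below models a trace as a plain array of spans and totals duration and token usage per span type. The array shape, the extra retrieval span, and the summarizeSpans helper are assumptions for illustration; the platform's actual trace object may differ.

```javascript
// Assumed shape: a trace modeled as a plain array of span objects.
const traceSpans = [
  {
    id: "span_aaa111", // illustrative span, not from the docs
    type: "retrieval",
    name: "kb-search",
    startTime: "2024-01-15T10:30:00.100Z",
    endTime: "2024-01-15T10:30:00.400Z",
    attributes: {},
  },
  {
    id: "span_def456",
    type: "llm",
    name: "gpt-4-response",
    startTime: "2024-01-15T10:30:00.500Z",
    endTime: "2024-01-15T10:30:01.800Z",
    attributes: { "llm.input_tokens": 150, "llm.output_tokens": 200 },
  },
];

// Hypothetical helper: count, total duration, and total tokens per span type.
function summarizeSpans(spans) {
  const summary = {};
  for (const span of spans) {
    if (!summary[span.type]) {
      summary[span.type] = { count: 0, durationMs: 0, tokens: 0 };
    }
    const entry = summary[span.type];
    entry.count += 1;
    entry.durationMs += Date.parse(span.endTime) - Date.parse(span.startTime);
    // Spans without token attributes contribute zero tokens.
    entry.tokens +=
      (span.attributes["llm.input_tokens"] ?? 0) +
      (span.attributes["llm.output_tokens"] ?? 0);
  }
  return summary;
}

console.log(summarizeSpans(traceSpans));
```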

Policies

A Policy defines behavioral rules for your AI agent without requiring code changes. Policies let you configure guardrails, routing logic, and response constraints through the ThinkHive dashboard.

const policy = {
  id: "policy_001",
  name: "No Financial Advice",
  type: "guardrail",
  status: "active",
  rules: [
    {
      condition: "topic_contains",
      value: ["investment", "stock", "financial advice"],
      action: "block",
      fallbackResponse: "I'm not able to provide financial advice. Please consult a licensed advisor."
    }
  ]
};

Policy Types

Type	Description	Example
Guardrail	Block or modify responses matching criteria	Prevent financial/medical advice
Routing	Direct requests to specific agent configurations	Route VIP customers to premium model
Constraint	Enforce response format or content rules	Require citations, limit length
Fallback	Define behavior when primary agent fails	Graceful degradation responses

Policies are evaluated at runtime and can be updated without redeploying your agent. Changes take effect immediately.
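
As a rough mental model of runtime evaluation, the sketch below checks a guardrail policy against incoming text. It assumes that topic_contains means a case-insensitive substring match and that an unmatched request yields an "allow" result; both are assumptions, and evaluateGuardrail is a hypothetical function rather than ThinkHive's actual evaluator.

```javascript
const guardrailPolicy = {
  id: "policy_001",
  name: "No Financial Advice",
  type: "guardrail",
  status: "active",
  rules: [
    {
      condition: "topic_contains",
      value: ["investment", "stock", "financial advice"],
      action: "block",
      fallbackResponse:
        "I'm not able to provide financial advice. Please consult a licensed advisor.",
    },
  ],
};

// Hypothetical evaluator. Assumes "topic_contains" is a case-insensitive
// substring match and that "allow" is the result when no rule matches.
function evaluateGuardrail(policy, text) {
  if (policy.status !== "active") return { action: "allow" };
  const haystack = text.toLowerCase();
  for (const rule of policy.rules) {
    if (rule.condition !== "topic_contains") continue;
    const matched = rule.value.find((term) =>
      haystack.includes(term.toLowerCase())
    );
    if (matched !== undefined) {
      return { action: rule.action, matched, response: rule.fallbackResponse };
    }
  }
  return { action: "allow" };
}

console.log(evaluateGuardrail(guardrailPolicy, "Which stock should I buy?").action); // "block"
console.log(evaluateGuardrail(guardrailPolicy, "How do I reset my password?").action); // "allow"
```

Keeping the rules as data is what allows a policy to change without a redeploy: the evaluator stays fixed while the rule set is swapped out.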

Quality Concepts

Claims

Claims are assertions extracted from agent responses. They are categorized as:

Facts are statements that can be verified against source material or known truths.

Inferences are conclusions drawn from available information.

const claim = {
  id: "claim_789",
  content: "The password reset link expires after 24 hours",
  type: "fact",           // or "inference"
  confidence: 0.95,
  source: "retrieved_doc_1",
  supported: true,
};
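
Given a list of claims in the shape above, a simple summary shows how much of a response is backed by sources. The sample claims other than claim_789 and the summarizeClaims helper are illustrative assumptions, not ThinkHive output or API.

```javascript
// Illustrative claims; only claim_789 comes from the example above.
const claims = [
  { id: "claim_789", type: "fact", confidence: 0.95, supported: true },
  { id: "claim_790", type: "inference", confidence: 0.6, supported: true },
  { id: "claim_791", type: "fact", confidence: 0.8, supported: false },
  { id: "claim_792", type: "fact", confidence: 0.9, supported: true },
];

// Hypothetical helper: share of supported claims, plus the IDs of facts
// that lack support and probably deserve a closer look.
function summarizeClaims(claimList) {
  const total = claimList.length;
  const unsupported = claimList.filter((c) => !c.supported);
  return {
    total,
    supportedRate: total === 0 ? null : (total - unsupported.length) / total,
    unsupportedFactIds: unsupported
      .filter((c) => c.type === "fact")
      .map((c) => c.id),
  };
}

console.log(summarizeClaims(claims));
```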

Cases

A Case is a cluster of similar failures or issues. ThinkHive automatically groups related problems to help you prioritize improvements.

const case_ = {
  id: "case_001",
  title: "Password reset instructions incomplete",
  severity: "medium",
  status: "open",
  traceCount: 47,
  pattern: "Missing 2FA reset step when user has MFA enabled",
  suggestedFix: "Add conditional step for MFA users"
};
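
One way to use cases for prioritization is to rank open cases by severity and then by how many traces they affect. The sketch below assumes severity levels beyond "medium" ("low", "high", "critical") and a "resolved" status, neither of which appears in the example above; prioritizeCases is a hypothetical helper.

```javascript
// Assumed severity scale; only "medium" appears in the example above.
const SEVERITY_RANK = { critical: 3, high: 2, medium: 1, low: 0 };

// Illustrative cases; the "resolved" status is an assumption.
const cases = [
  { id: "case_001", severity: "medium", status: "open", traceCount: 47 },
  { id: "case_002", severity: "high", status: "open", traceCount: 12 },
  { id: "case_003", severity: "medium", status: "open", traceCount: 80 },
  { id: "case_004", severity: "high", status: "resolved", traceCount: 5 },
];

// Hypothetical helper: open cases, most severe first, ties broken by the
// number of affected traces. The input array is left untouched.
function prioritizeCases(caseList) {
  return caseList
    .filter((c) => c.status === "open")
    .sort(
      (a, b) =>
        (SEVERITY_RANK[b.severity] ?? -1) - (SEVERITY_RANK[a.severity] ?? -1) ||
        b.traceCount - a.traceCount
    );
}

console.log(prioritizeCases(cases).map((c) => c.id));
// [ 'case_002', 'case_003', 'case_001' ]
```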

Fixes

A Fix is a proposed improvement for a case. ThinkHive can generate fix suggestions automatically.

Shadow Tests

A Shadow Test validates a fix by comparing new behavior against the baseline without affecting production.

Evaluation Metrics

RAG Quality Metrics

Metric	Description	Range
Groundedness	Response supported by retrieved context	0-1
Faithfulness	Response consistent with context	0-1
Citation Accuracy	Proper attribution to sources	0-1
Context Relevance	Retrieved docs match query	0-1

Hallucination Types

ThinkHive detects 9 types of hallucinations:

Unsupported Claims - Statements not in source material
Factual Errors - Incorrect facts
Contradictions - Conflicting statements
Out-of-Scope - Answering beyond available info
Fabricated References - Made-up citations
Missing Context - Omitting critical information
Logical Fallacies - Invalid reasoning
Semantic Drift - Subtle meaning changes
Attribution Errors - Incorrect source attribution

Business Concepts

Customer Context

Customer Context captures the state of a customer at a point in time, enabling time-series analysis.

const context = {
  customerId: "cust_xyz",
  timestamp: "2024-01-15T10:30:00Z",
  metrics: {
    subscription_tier: "premium",
    arr: 5000,
    health_score: 0.85,
    last_support_contact: "2024-01-10"
  }
};
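
Point-in-time snapshots make it possible to ask what a customer looked like when a given run happened. The sketch below picks the most recent snapshot at or before a timestamp; the snapshot list and the contextAsOf helper are illustrative assumptions, not a ThinkHive API.

```javascript
// Illustrative snapshots for one customer, in the shape shown above.
const snapshots = [
  { customerId: "cust_xyz", timestamp: "2024-01-01T00:00:00Z", metrics: { health_score: 0.7 } },
  { customerId: "cust_xyz", timestamp: "2024-01-15T10:30:00Z", metrics: { health_score: 0.85 } },
  { customerId: "cust_xyz", timestamp: "2024-02-01T00:00:00Z", metrics: { health_score: 0.9 } },
];

// Hypothetical helper: latest snapshot taken at or before isoTime,
// or null when no snapshot is that old.
function contextAsOf(snapshotList, isoTime) {
  const cutoff = Date.parse(isoTime);
  let best = null;
  for (const snapshot of snapshotList) {
    const taken = Date.parse(snapshot.timestamp);
    if (taken <= cutoff && (best === null || taken > Date.parse(best.timestamp))) {
      best = snapshot;
    }
  }
  return best;
}

console.log(contextAsOf(snapshots, "2024-01-20T00:00:00Z").metrics.health_score); // 0.85
```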

Calibration

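Calibration measures how well confidence scores predict actual outcomes. A model is well calibrated when its stated confidence matches the observed rate of correct outcomes: of the predictions made with 0.8 confidence, about 80% should turn out to be correct.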

const calibration = {
  brierScore: 0.12,          // Lower is better (0-1)
  expectedCalibrationError: 0.08,  // Lower is better (0-1)
  status: "well_calibrated"  // or "under_confident", "over_confident"
};
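
Both scores can be computed from pairs of predicted confidence and observed outcome. The sketch below uses the standard definitions: the Brier score is the mean squared difference between confidence and outcome, and expected calibration error is the weighted average gap between mean confidence and accuracy across equal-width confidence bins. The { confidence, correct } record shape and the choice of 10 bins are assumptions; how ThinkHive computes these values internally is not stated here.

```javascript
// Assumed input: one record per prediction with a confidence in [0, 1]
// and whether the prediction turned out to be correct.
const predictions = [
  { confidence: 0.95, correct: true },
  { confidence: 0.92, correct: true },
  { confidence: 0.85, correct: false },
  { confidence: 0.65, correct: true },
];

// Brier score: mean squared error between confidence and the 0/1 outcome.
function brierScore(records) {
  const total = records.reduce(
    (sum, { confidence, correct }) => sum + (confidence - (correct ? 1 : 0)) ** 2,
    0
  );
  return total / records.length;
}

// Expected calibration error with equal-width bins: for each bin, take
// |accuracy - mean confidence| and weight it by the bin's share of records.
function expectedCalibrationError(records, numBins = 10) {
  const bins = Array.from({ length: numBins }, () => ({
    n: 0,
    confidenceSum: 0,
    correctSum: 0,
  }));
  for (const { confidence, correct } of records) {
    // A confidence of exactly 1 goes into the last bin.
    const index = Math.min(numBins - 1, Math.floor(confidence * numBins));
    bins[index].n += 1;
    bins[index].confidenceSum += confidence;
    bins[index].correctSum += correct ? 1 : 0;
  }
  let ece = 0;
  for (const bin of bins) {
    if (bin.n === 0) continue;
    const gap = Math.abs(bin.correctSum / bin.n - bin.confidenceSum / bin.n);
    ece += (bin.n / records.length) * gap;
  }
  return ece;
}

console.log(brierScore(predictions).toFixed(4));
console.log(expectedCalibrationError(predictions).toFixed(4));
```

With only four records the numbers are not meaningful in themselves; both metrics need a reasonably large sample before they say much about a model.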

Ticket Linking

ThinkHive supports 7 methods for deterministically linking runs to support tickets:

Explicit run ID in ticket
Zendesk marker embedding
Timestamp correlation
Conversation thread ID
Customer session ID
Semantic similarity
Manual linking

API Tiers

Tier	Rate Limit	Features
Free	10/min	Basic tracing, explainability
Starter	60/min	+ Search, patterns, quality metrics
Professional	300/min	+ RAG eval, hallucinations, ROI
Enterprise	1,000/min	All features + dedicated support
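
A client that paces its calls evenly can derive a safe delay between requests from its tier's limit. The sketch below copies the limits from the table and assumes they count requests per minute; minIntervalMs is a hypothetical helper, and the actual enforcement window is not specified here.

```javascript
// Limits copied from the tier table; treating them as requests per minute
// is an assumption.
const RATE_LIMITS_PER_MINUTE = {
  free: 10,
  starter: 60,
  professional: 300,
  enterprise: 1000,
};

// Hypothetical helper: minimum delay between requests that keeps an evenly
// paced client under its tier's limit.
function minIntervalMs(tier) {
  const limit = RATE_LIMITS_PER_MINUTE[tier];
  if (limit === undefined) {
    throw new Error(`unknown tier: ${tier}`);
  }
  return 60000 / limit;
}

console.log(minIntervalMs("free")); // 6000
console.log(minIntervalMs("professional")); // 200
```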

Next Steps

Your First Trace - Build a complete traced application
JavaScript SDK - Detailed SDK documentation
API Reference - REST API endpoints
